.. vco1593176327490
.. _upgrading-the-systemcontroller-using-the-cli:

===========================================
Upgrade the System Controller Using the CLI
===========================================

You can upload and apply a software upgrade (deploy a major release or patched
major release) to the system controller using the CLI. The software upgrade
not only upgrades the system controller software but also updates the software
in the system controller's |prod-dc| vault and the central container image
repository, in support of subsequent subcloud upgrades.

The system controller can be upgraded using either a :ref:`manual software
upgrade <manual-host-software-deployment-ee17ec6f71a4>` or the
standalone cloud :ref:`orchestrated software upgrade procedure
<orchestrated-deployment-host-software-deployment-d234754c7d20>` with
:command:`sw-manager`.

.. rubric:: |context|

Follow the steps below to manually upgrade the system controller.

.. rubric:: |prereq|

.. only:: starlingx

   - Transfer the ISO and signature files for the new major release (or new
     patched major release) from the |prod-long| mirror
     https://mirror.starlingx.cengn.ca/mirror/starlingx/release/latest_release/debian/monolithic/outputs/iso/
     to controller-0 (active controller).

   - Upgrade to a patched major release (patched ISO).

.. only:: partner

   .. include:: /_includes/upgrading-the-systemcontroller-using-the-cli.rest
      :start-after: manualupgrade1-begin
      :end-before: manualupgrade1-end

.. only:: starlingx

   - If you are using a private registry (see the ``docker / *-registry``
     sections of :command:`system service-parameter-list`), transfer the
     container image versions associated with the new major release (or new
     patched major release) from docker.io to the private registry, using the
     list from the |prod-long| mirror
     https://mirror.starlingx.cengn.ca/mirror/starlingx/release/latest_release/debian/monolithic/outputs/docker-images/

.. only:: partner

   .. include:: /_includes/upgrading-the-systemcontroller-using-the-cli.rest
      :start-after: manualupgrade2-begin
      :end-before: manualupgrade2-end

- The platform issuer (``system-local-ca``) must have an RSA
  certificate/private key pair before upgrading. If ``system-local-ca`` was
  configured with a different type of certificate/private key, the deploy
  precheck fails with an informative message. In this case, run the
  :ref:`migrate-platform-certificates-to-use-cert-manager-c0b1727e4e5d`
  procedure to reconfigure ``system-local-ca`` with an RSA
  certificate/private key, targeting the ``SystemController`` and all
  subclouds.

- If software updates for your current |prod| software release are required
  in order to upgrade to the new software release, apply these
  patches/updates in a separate software deploy of the patch release(s) on
  the system controller (see
  :ref:`manual-host-software-deployment-ee17ec6f71a4`). Also apply them in
  an orchestrated software deploy of the subclouds (see
  :ref:`orchestrated-deployment-host-software-deployment-d234754c7d20`), so
  that all systems are patch current before you start the upgrade to the new
  major release on the |prod-dc| system.

.. rubric:: |proc|

.. _upgrading-the-systemcontroller-using-the-cli-steps-oq4-dgm-cmb:

#. Source the platform environment.

   .. code-block:: none

      $ source /etc/platform/openrc
      ~(keystone_admin)]$

   .. only:: partner

      .. include:: /_includes/upgrading-the-systemcontroller-using-the-cli.rest
         :start-after: license-begin
         :end-before: license-end

#. Upload the load.

   .. only:: starlingx

      .. parsed-literal::

         ~(keystone_admin)]$ software upload --local /full_path/<bootimage>.iso /full_path/<bootimage>.sig
         +-------------------------------+--------------------------+
         | Uploaded File                 | Release                  |
         +-------------------------------+--------------------------+
         | starlingx-intel-x86-64-cd.iso | stx-10.0.0               |
         +-------------------------------+--------------------------+

   .. only:: partner

      .. include:: /_includes/software-upload-output.rest
         :start-after: software-upload-begin
         :end-before: software-upload-end

   .. note::

      Do not use the ``--os-region-name SystemController`` proxy at this
      point. It is used to upload the load for subcloud deployment, which is
      done after the system controller deploy is complete.

   .. note::

      If the upload fails, examine the error messages in
      ``/var/log/software.log``.

   .. only:: partner

      .. include:: /_includes/upgrading-the-systemcontroller-using-the-cli.rest
         :start-after: wrsbegin
         :end-before: wrsend

#. Confirm that the system is healthy.

   Check the current system health status, resolve any alarms and other issues
   reported by the :command:`software deploy precheck <release-id>` command,
   and then recheck the system health status to confirm that all
   **System Health** fields are set to **OK**.

   .. code-block:: none

      ~(keystone_admin)]$ software deploy precheck <release-id>
      System Health:
      All hosts are provisioned: [OK]
      All hosts are unlocked/enabled: [OK]
      All hosts have current configurations: [OK]
      Ceph Storage Healthy: [OK]
      No alarms: [OK]
      All kubernetes nodes are ready: [OK]
      All kubernetes control plane pods are ready: [OK]
      All kubernetes applications are in a valid state: [OK]
      All hosts are patch current: [OK]
      Active kubernetes version [vX.XX.X] is a valid supported version: [OK]
      Active controller is controller-0: [OK]
      Installed license is valid: [OK]
      Valid upgrade path from release 22.12 to 24.09: [OK]
      Required patches are applied: [OK]

   .. only:: starlingx

      Where ``<release-id>`` is ``stx-10.0.0`` in the software upload example
      above. You can also find it by running :command:`software list`.

   .. only:: partner

      .. include:: /_includes/software-upload-output.rest
         :start-after: software-upload-precheck-begin
         :end-before: software-upload-precheck-end

   By default, the deploy process cannot run while alarms are active, and
   running it with active alarms is not recommended. It is strongly
   recommended that you clear all alarms from the system before starting a
   deploy.

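   As a rough sketch, the precheck output can be filtered so that only the
   checks that did not report ``[OK]`` remain. The helper name
   ``failed_checks`` is hypothetical, and the filter assumes that each check is
   printed on one line ending in a bracketed status, as in the example above.

   .. code-block:: bash

      #!/usr/bin/env bash
      # Hypothetical helper: print precheck lines whose final bracketed
      # status is anything other than [OK].
      # Assumes one check per line ending in "[<status>]".
      failed_checks() {
          grep -E '\[[^]]*\][[:space:]]*$' | grep -v -E '\[OK\][[:space:]]*$'
      }

      # Usage (the release ID is a placeholder):
      #   software deploy precheck <release-id> | failed_checks

   An empty result means every check line ended in ``[OK]``.
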
#. Begin the deploy from controller-0.

   Make sure that controller-0 is the active controller, that you are logged
   in to controller-0 as **sysadmin**, and that your present working directory
   is your home directory.

   .. code-block:: none

      ~(keystone_admin)]$ software deploy start <release-id>
      +--------------+------------+------+--------------+
      | From Release | To Release | RR   | State        |
      +--------------+------------+------+--------------+
      | 22.12.0      | 24.09.100  | True | deploy-start |
      +--------------+------------+------+--------------+

   .. note::

      It is recommended to run the :command:`software deploy precheck`
      command before running :command:`software deploy start`. However,
      :command:`software deploy start` runs the precheck automatically even
      if it has not been run beforehand.

   Wait for :command:`software deploy start <release-id>` to complete by
   monitoring the status of the deploy.

   .. code-block:: none

      ~(keystone_admin)]$ software deploy show
      +--------------+------------+------+-------------------+
      | From Release | To Release | RR   | State             |
      +--------------+------------+------+-------------------+
      | 22.12.0      | 24.09.100  | True | deploy-start-done |
      +--------------+------------+------+-------------------+

   :command:`software deploy start <release-id>` migrates configuration data
   to the new release's data model. Do not change the configuration after
   this point until the deploy is completed.

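   The monitoring described above could be scripted along the following
   lines. This is a minimal sketch: the function names, the attempt count, and
   the delay are illustrative assumptions, and the parser relies on the
   four-column table layout shown in the :command:`software deploy show`
   examples. It does not detect failure states; it only times out.

   .. code-block:: bash

      #!/usr/bin/env bash
      # Print the State column of the data row read from stdin.
      # Assumes the table layout shown in the examples above.
      deploy_state() {
          awk -F'|' 'NF >= 6 && $2 !~ /From Release/ { gsub(/ /, "", $5); print $5 }'
      }

      # Poll `software deploy show` until the wanted state is reported.
      # Arguments: wanted state, max attempts (default 60), delay in
      # seconds (default 30). The defaults are arbitrary.
      wait_for_deploy_state() {
          local wanted=$1 attempts=${2:-60} delay=${3:-30} state i
          for ((i = 0; i < attempts; i++)); do
              state=$(software deploy show | deploy_state)
              [ "$state" = "$wanted" ] && return 0
              sleep "$delay"
          done
          echo "deploy state is '${state}', expected '${wanted}'" >&2
          return 1
      }

      # Usage:
      #   wait_for_deploy_state deploy-start-done
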
#. Deploy the software to controller-1.

   #. Lock controller-1.

      .. code-block:: none

         ~(keystone_admin)]$ system host-lock controller-1

   #. Begin the deploy on controller-1.

      .. code-block:: none

         ~(keystone_admin)]$ software deploy host controller-1
         Running major release deployment, major_release=24.09, force=False, async_req=False, commit_id=<commit-id>
         Host installation was successful on controller-1

   #. Unlock controller-1.

      .. code-block:: none

         ~(keystone_admin)]$ system host-unlock controller-1

      Wait for controller-1 to enter the ``unlocked-enabled`` state. Wait until
      the DRBD sync **400.001** Services-related alarm has been raised and then
      cleared.

      When the first :command:`software deploy host <hostname>` command is
      issued after the deploy state becomes ``deploy-start-done``, the state
      reported by :command:`software deploy show` changes to ``deploy-host``.
      When the software has been deployed to all the hosts, that is, when
      :command:`software deploy host <hostname>` completes successfully
      against the last host, the state changes to ``deploy-host-done``.

      If controller-1 transitions to the ``unlocked-disabled-failed`` state,
      investigate the issue before proceeding to the next step. The alarms
      may indicate a configuration error. Check the configuration logs on
      controller-1 (for example, the error logs in ``/var/log/puppet``).

   #. Run the :command:`system application-list` and
      :command:`software deploy host-list` commands to view the current
      progress.

   After controller-1 is unlocked/enabled/available, run the following command
   to check that controller-1 is running the new release:

   .. code-block:: none

      ~(keystone_admin)]$ system host-show controller-1

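   The unlocked/enabled/available check could be scripted roughly as follows.
   This sketch assumes that :command:`system host-show` prints a
   ``Property | Value`` table containing ``administrative``, ``operational``,
   and ``availability`` rows. The function names are hypothetical.

   .. code-block:: bash

      #!/usr/bin/env bash
      # Print the value of one property from host-show output on stdin.
      # Assumes a two-column "| Property | Value |" table.
      host_property() {
          awk -F'|' -v name="$1" '
              { key = $2; gsub(/ /, "", key) }
              key == name { val = $3; gsub(/^ +| +$/, "", val); print val }'
      }

      # Succeed if the named host is unlocked, enabled, and available.
      host_is_ready() {
          local out
          out=$(system host-show "$1") || return 1
          [ "$(host_property administrative <<<"$out")" = "unlocked" ] &&
          [ "$(host_property operational <<<"$out")" = "enabled" ] &&
          [ "$(host_property availability <<<"$out")" = "available" ]
      }

      # Usage:
      #   host_is_ready controller-1 && echo "controller-1 is ready"
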
#. Set controller-1 as the active controller. Swact away from controller-0.

   .. code-block:: none

      ~(keystone_admin)]$ system host-swact controller-0

   Wait until services are active on the new active controller, controller-1,
   before proceeding to the next step. The swact is complete when all services
   on controller-1 are enabled-active.

#. Deploy the software to controller-0.

   For more information, see
   :ref:`introduction-platform-software-updates-upgrades-06d6de90bbd0`.

   #. Lock controller-0.

      .. code-block:: none

         ~(keystone_admin)]$ system host-lock controller-0

   #. Begin the deploy on controller-0.

      .. code-block:: none

         ~(keystone_admin)]$ software deploy host controller-0
         Running major release deployment, major_release=24.09, force=False, async_req=False, commit_id=<commit-id>

   #. Unlock controller-0.

      .. code-block:: none

         ~(keystone_admin)]$ system host-unlock controller-0

#. Check the system health to ensure that there are no unexpected alarms.

   .. code-block:: none

      ~(keystone_admin)]$ fm alarm-list

   Clear all alarms unrelated to the deploy process.

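   Several steps in this procedure wait for a specific alarm, such as
   **400.001**, to clear. A minimal sketch of such a wait is shown below. It
   assumes that the alarm ID is the first column of the
   :command:`fm alarm-list` table. The function names and the default attempt
   count and delay are illustrative only.

   .. code-block:: bash

      #!/usr/bin/env bash
      # Succeed if the given alarm ID appears in the first column of the
      # alarm table read from stdin.
      alarm_listed() {
          awk -F'|' -v id="$1" '
              { col = $2; gsub(/ /, "", col) }
              col == id { found = 1 }
              END { exit !found }'
      }

      # Poll `fm alarm-list` until the alarm is no longer listed.
      # Arguments: alarm ID, max attempts (default 40), delay in seconds
      # (default 30). The defaults are arbitrary.
      wait_for_alarm_clear() {
          local id=$1 attempts=${2:-40} delay=${3:-30} i
          for ((i = 0; i < attempts; i++)); do
              fm alarm-list | alarm_listed "$id" || return 0
              sleep "$delay"
          done
          return 1
      }

      # Usage:
      #   wait_for_alarm_clear 400.001
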
#. If you are using a Ceph storage backend, deploy the storage nodes one at a
   time.

   A storage node must be locked and all of its |OSDs| must be down before it
   can be upgraded.

   #. Lock storage-0.

      .. code-block:: none

         ~(keystone_admin)]$ system host-lock storage-0

   #. Verify that the |OSDs| are down after the storage node is locked.

      .. code-block:: none

         ~(keystone_admin)]$ ceph osd tree
         +----+---------+------------+---------+-------------------+-------------+------------------+-------------+
         | ID | CLASS   | WEIGHT     | TYPE    | NAME              | STATUS      | REWEIGHT         | PRI-AFF     |
         +----+---------+------------+---------+-------------------+-------------+------------------+-------------+
         | -1 |         | 0.01700    | root    | storage-tier      |             |                  |             |
         +----+---------+------------+---------+-------------------+-------------+------------------+-------------+
         | -2 |         | 0.01700    | chassis | group-0           |             |                  |             |
         +----+---------+------------+---------+-------------------+-------------+------------------+-------------+
         | -4 |         | 0.00850    | host    | controller-0      |             |                  |             |
         +----+---------+------------+---------+-------------------+-------------+------------------+-------------+
         | 0  | hdd     | 0.00850    |         | osd.0             | up          | 1.00000          | 1.00000     |
         +----+---------+------------+---------+-------------------+-------------+------------------+-------------+
         | -3 |         | 0.00850    | host    | controller-1      |             |                  |             |
         +----+---------+------------+---------+-------------------+-------------+------------------+-------------+
         | 1  | hdd     | 0.00850    |         | osd.1             | down        | 1.00000          | 1.00000     |
         +----+---------+------------+---------+-------------------+-------------+------------------+-------------+

   #. Begin the deploy on storage-0.

      .. code-block:: none

         ~(keystone_admin)]$ software deploy host storage-0

      The deploy is complete when the node comes online. At that point, you
      can safely unlock the node.

      After a storage node is upgraded, but before it is unlocked, Ceph
      synchronization alarms are present (they appear to be making progress
      in syncing), along with infrastructure network interface alarms. The
      interface alarms occur because the infrastructure network interface
      configuration is not applied to the storage node until it is unlocked.

      Unlock the node as soon as the deployed storage node comes online.

   #. Unlock storage-0.

      .. code-block:: none

         ~(keystone_admin)]$ system host-unlock storage-0

      Wait for all alarms to clear after the unlock before proceeding to
      deploy the next storage host.

   #. Repeat the above steps for each storage host.

      .. note::

         After deploying the first storage node, you can expect alarm
         **800.003**. The alarm is cleared after all storage nodes are
         deployed.

#. If worker nodes are present, deploy the worker hosts, serially or in
   parallel.

   #. Lock worker-0.

      .. code-block:: none

         ~(keystone_admin)]$ system host-lock worker-0

   #. Deploy worker-0.

      .. code-block:: none

         ~(keystone_admin)]$ software deploy host worker-0

      Wait for the host to run the installer, reboot, and go online before
      unlocking it in the next step.

   #. Unlock worker-0.

      .. code-block:: none

         ~(keystone_admin)]$ system host-unlock worker-0

      Wait for all alarms to clear after the unlock before proceeding to the
      next worker host.

   #. Repeat the above steps for each worker host.

#. Set controller-0 as the active controller. Swact away from controller-1.

   .. code-block:: none

      ~(keystone_admin)]$ system host-swact controller-1

   Wait until services are active on the new active controller, controller-0,
   before proceeding to the next step. The swact is complete when all services
   on controller-0 are enabled-active.

#. Activate the deploy.

   .. code-block:: none

      ~(keystone_admin)]$ software deploy activate
      Deploy activate has started

   Check the deploy state:

   .. code-block:: none

      ~(keystone_admin)]$ software deploy show
      +--------------+------------+------+-----------------+
      | From Release | To Release | RR   | State           |
      +--------------+------------+------+-----------------+
      | 22.12.0      | 24.09.100  | True | deploy-activate |
      +--------------+------------+------+-----------------+

   Wait for :command:`software deploy activate` to complete by monitoring the
   status of the deploy.

   .. code-block:: none

      ~(keystone_admin)]$ software deploy show
      +--------------+------------+------+----------------------+
      | From Release | To Release | RR   | State                |
      +--------------+------------+------+----------------------+
      | 22.12.0      | 24.09.100  | True | deploy-activate-done |
      +--------------+------------+------+----------------------+

   While the :command:`software deploy activate` command runs, new
   configurations are applied to the controller. 250.001 (**hostname
   Configuration is out-of-date**) alarms are raised and then cleared as the
   configuration is applied. The deploy state changes from
   ``deploy-activate`` to ``deploy-activate-done`` once this is done.

   .. only:: partner

      .. include:: /_includes/upgrading-the-systemcontroller-using-the-cli.rest
         :start-after: deploymentmanager-begin
         :end-before: deploymentmanager-end

   The following states apply when this command is executed.

   **deploy-activate**
      State entered when the deploy is being activated.

   **deploy-activate-done**
      State entered when the deploy activation completes successfully.

   .. note::

      This can take more than 15 minutes to complete.

   .. note::

      Alarms are raised because the subcloud software ``sync_status`` is
      ``out-of-sync``.

#. Complete the upgrade.

   .. code-block:: none

      ~(keystone_admin)]$ software deploy complete
      Deployment has been completed

   Verify the deploy state:

   .. code-block:: none

      ~(keystone_admin)]$ software deploy show
      +--------------+------------+------+-----------------------+
      | From Release | To Release | RR   | State                 |
      +--------------+------------+------+-----------------------+
      | 22.12.0      | 24.09.100  | True | deploy-completed      |
      +--------------+------------+------+-----------------------+

#. After the platform deploy is completed, upgrade Kubernetes. To upgrade
   Kubernetes on a standalone system, see
   :ref:`index-updates-kub-03d4d10fa0be`.

#. When the Kubernetes upgrade completes, conclude the platform deploy by
   deleting it.

   .. code-block:: none

      ~(keystone_admin)]$ software deploy delete
      Deploy deleted with success

   Verify the deploy state:

   .. code-block:: none

      ~(keystone_admin)]$ software deploy show
      No deploy in progress

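   The deploy states reported in the steps above can be summarized as a
   lookup from the current state to the next action in this procedure. The
   sketch below is only a reading aid. The function name ``next_action`` is
   hypothetical, and only the states shown on this page are covered.

   .. code-block:: bash

      #!/usr/bin/env bash
      # Map a deploy state, as reported by `software deploy show`, to the
      # next action in this procedure. Covers only the states shown here.
      next_action() {
          case "$1" in
              deploy-start|deploy-activate)
                  echo "wait, then re-run 'software deploy show'" ;;
              deploy-start-done|deploy-host)
                  echo "lock, 'software deploy host <hostname>', and unlock each remaining host" ;;
              deploy-host-done)
                  echo "swact to controller-0, then 'software deploy activate'" ;;
              deploy-activate-done)
                  echo "software deploy complete" ;;
              deploy-completed)
                  echo "upgrade Kubernetes, then 'software deploy delete'" ;;
              *)
                  echo "unrecognized state: $1" >&2
                  return 1 ;;
          esac
      }

      # Usage:
      #   next_action deploy-host-done
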
#. Upload the load for subcloud deployment.

   .. only:: starlingx

      .. parsed-literal::

         ~(keystone_admin)]$ software --os-region-name SystemController upload --local /full_path/<bootimage>.iso /full_path/<bootimage>.sig
         +-------------------------------+--------------------------+
         | Uploaded File                 | Release                  |
         +-------------------------------+--------------------------+
         | starlingx-intel-x86-64-cd.iso | stx-10.0.0               |
         +-------------------------------+--------------------------+

   .. only:: partner

      .. include:: /_includes/software-upload-output.rest
         :start-after: software-load-begin
         :end-before: software-load-end

   .. note::

      This can take a few minutes. After the system controller is successfully
      deployed, do not delete the old load (which is in the imported state)
      from the load list. This load is required for managing the subclouds
      that are still running the previous load.

   .. only:: partner

      .. include:: /_includes/upgrading-the-systemcontroller-using-the-cli.rest
         :start-after: DMupgrades-begin
         :end-before: DMupgrades-end

.. rubric:: |postreq|

After the upgrade to the major release, apply any patches separately.