
.. Greg updates required for -High Security Vulnerability Document Updates

.. vco1593176327490
.. _upgrading-the-systemcontroller-using-the-cli:

===========================================
Upgrade the System Controller Using the CLI
===========================================

From the CLI, you can upload and apply upgrades to the system controller in
order to upgrade the central repository. The system controller can be upgraded
using either the manual software upgrade procedure or the non-distributed
systems :command:`sw-manager` orchestration procedure.

.. rubric:: |context|

Follow the steps below to manually upgrade the system controller.

.. rubric:: |prereq|

- Validate the list of new images with the target release. If you are using a
  private registry for installs or upgrades, you must populate your private
  registry with the new images before bootstrap or patch application.

.. rubric:: |proc|

.. _upgrading-the-systemcontroller-using-the-cli-steps-oq4-dgm-cmb:

#. Source the platform environment.

   .. code-block:: none

      $ source /etc/platform/openrc
      ~(keystone_admin)]$

   .. only:: partner

      .. include:: /_includes/upgrading-the-systemcontroller-using-the-cli.rest
         :start-after: license-begin
         :end-before: license-end

#. Transfer the ISO and signature files to controller-0 (the active
   controller) and import the load.

   .. code-block:: none

      ~(keystone_admin)]$ software --os-region-name SystemController upload --local <bootimage>.iso <bootimage>.sig
      +-------------------------------+-------------------+
      | Uploaded File                 | Release           |
      +-------------------------------+-------------------+
      | starlingx-intel-x86-64-cd.iso | starlingx-24.09.0 |
      +-------------------------------+-------------------+

   .. note::

      If you encounter an issue while importing the load, examine the error
      messages in ``/var/log/software.log``.

   .. note::

      This can take several minutes. After the system controller is
      successfully upgraded, do not delete the old load (which is in the
      imported state) from the load list; otherwise, the subcloud upgrade
      orchestration fails with an error.

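
   The value in the Release column of the upload output is the release ID
   that the later :command:`software deploy` commands take. As a minimal
   sketch, assuming the two-column table layout shown in the example above, the
   release ID could be pulled out of saved output as follows. The function name
   ``release_id_from_upload`` is hypothetical and is not part of the |prod|
   CLI.

   .. code-block:: shell

      #!/bin/sh
      # Sketch: extract the Release column from saved upload output.
      # Assumes the two-column table layout shown in the example above.

      release_id_from_upload() {
          # Rows are split on "|": $2 is "Uploaded File", $3 is "Release".
          # Border lines (starting with "+") and the header row are skipped.
          awk -F'|' '
              /^[ \t]*[|]/ && $2 !~ /Uploaded File/ {
                  gsub(/^[ \t]+|[ \t]+$/, "", $3)   # trim column padding
                  print $3
              }'
      }

      # Sample input copied from the example output above.
      sample='+-------------------------------+-------------------+
      | Uploaded File                 | Release           |
      +-------------------------------+-------------------+
      | starlingx-intel-x86-64-cd.iso | starlingx-24.09.0 |
      +-------------------------------+-------------------+'

      release_id=$(printf '%s\n' "$sample" | release_id_from_upload)
      echo "$release_id"

   The sketch only parses text, so it can be tried on a workstation before it
   is pointed at real command output.
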
#. Apply any required software updates. After the update is installed, ensure
   that controller-0 is active.

   The system controller and the subclouds must be 'patch current'. All
   software updates related to your current |prod| software release must be
   uploaded, applied, and installed.

   Software updates for the new |prod| release only need to be uploaded and
   applied. These updates are installed automatically during the software
   upgrade procedure, as the hosts are reset to load the new release of
   software.

   .. only:: partner

      .. include:: /_includes/upgrading-the-systemcontroller-using-the-cli.rest
         :start-after: wrsbegin
         :end-before: wrsend

#. Confirm that the system is healthy.

   Check the current system health status and resolve any alarms and other
   issues reported by the :command:`software deploy precheck <release-id>`
   command. Then recheck the system health status to confirm that all
   **System Health** fields are set to **OK**. This includes the 'Boot Device
   and Root file system Device' check, if the upgrade health query reports it
   as failed. The following example shows the output for a healthy system:

   .. code-block:: none

      ~(keystone_admin)]$ software deploy precheck <release-id>
      System Health:
      All hosts are provisioned: [OK]
      All hosts are unlocked/enabled: [OK]
      All hosts have current configurations: [OK]
      Ceph Storage Healthy: [OK]
      No alarms: [OK]
      All kubernetes nodes are ready: [OK]
      All kubernetes control plane pods are ready: [OK]
      All kubernetes applications are in a valid state: [OK]
      All hosts are patch current: [OK]
      Valid upgrade path from release 22.12 to 24.09: [OK]
      Required patches are applied: [OK]

   Here, ``<release-id>`` is ``starlingx-24.09.0`` for the software upload
   example above. You can also find it by running :command:`software list`.

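
   On a long report it is easy to miss a single failing field. As a minimal
   sketch, assuming every check is printed as ``<description>: [<status>]`` as
   in the example above, saved precheck output can be filtered down to the
   checks that are not **OK**. The function name ``failed_checks`` and the
   ``[Fail]`` status in the sample are illustrative assumptions, not text taken
   from the |prod| CLI.

   .. code-block:: shell

      #!/bin/sh
      # Sketch: list precheck lines whose status is anything other than [OK].
      # Assumes each check is printed as "<description>: [<status>]".

      failed_checks() {
          # Keep lines that end in a bracketed status, then drop the [OK] ones.
          grep -E ': \[[^]]*\][[:space:]]*$' | grep -v -E ': \[OK\][[:space:]]*$'
      }

      # Hypothetical sample: part of the example output with one check failing.
      sample='System Health:
      All hosts are provisioned: [OK]
      No alarms: [Fail]
      All hosts are patch current: [OK]'

      printf '%s\n' "$sample" | failed_checks

   Empty output from the filter means that every reported field was **OK**.
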
   The platform issuer (``system-local-ca``) must have an RSA
   certificate/private key pair before upgrading. If ``system-local-ca`` was
   configured with a different type of certificate/private key, the upgrade
   precheck fails with an informative message. In this case, run the
   :ref:`migrate-platform-certificates-to-use-cert-manager-c0b1727e4e5d`
   procedure to reconfigure ``system-local-ca`` with an RSA
   certificate/private key, targeting the ``SystemController`` and all
   subclouds.

   By default, the upgrade process cannot run with active alarms present, and
   running it in that condition is not recommended. It is strongly recommended
   that you clear all alarms from your system before doing an upgrade.

   .. note::

      Use the command :command:`system upgrade-start --force` to force the
      upgrade process to start and ignore non-management-affecting alarms.
      This should ONLY be done if these alarms do not cause an issue for the
      upgrade process.

#. Start the upgrade from controller-0.

   Make sure that controller-0 is the active controller, that you are logged
   in to controller-0 as **sysadmin**, and that your present working
   directory is your home directory.

   .. code-block:: none

      ~(keystone_admin)]$ software deploy start <release-id>
      +--------------+------------+------+--------------+
      | From Release | To Release | RR   | State        |
      +--------------+------------+------+--------------+
      | 22.12.0      | 24.09.0    | True | deploy-start |
      +--------------+------------+------+--------------+

   When ``deploy start`` is complete:

   .. code-block:: none

      +--------------+------------+------+-------------------+
      | From Release | To Release | RR   | State             |
      +--------------+------------+------+-------------------+
      | 22.12.0      | 24.09.0    | True | deploy-start-done |
      +--------------+------------+------+-------------------+

   This makes a copy of the system data to be used in the upgrade. Do not
   make configuration changes after this point until the upgrade is
   completed.

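
   The State column is what tells you whether ``deploy start`` has finished.
   As a minimal sketch, assuming the four-column table layout shown in the
   examples above, the state can be read from saved output and compared with
   ``deploy-start-done``. The function name ``deploy_state`` is hypothetical.

   .. code-block:: shell

      #!/bin/sh
      # Sketch: read the State column from a saved deploy status table.
      # Assumes the four-column table layout shown in the examples above.

      deploy_state() {
          # Splitting on "|": $2 From Release, $3 To Release, $4 RR, $5 State.
          # Border lines (starting with "+") and the header row are skipped.
          awk -F'|' '
              /^[ \t]*[|]/ && $2 !~ /From Release/ {
                  gsub(/^[ \t]+|[ \t]+$/, "", $5)   # trim column padding
                  print $5
              }'
      }

      # Sample input copied from the example output above.
      sample='+--------------+------------+------+-------------------+
      | From Release | To Release | RR   | State             |
      +--------------+------------+------+-------------------+
      | 22.12.0      | 24.09.0    | True | deploy-start-done |
      +--------------+------------+------+-------------------+'

      state=$(printf '%s\n' "$sample" | deploy_state)
      if [ "$state" = "deploy-start-done" ]; then
          echo "deploy start finished"
      else
          echo "still waiting, current state: $state"
      fi

   The same comparison works for the later states in this procedure, such as
   ``deploy-activate-done``, by changing the expected value.
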
   The following upgrade state applies once this command is executed. Run the
   :command:`system upgrade-show` command to verify the status of the upgrade.

   - started:

     - State entered after :command:`system upgrade-start` completes.

     - Release <nn.nn> system data (for example, postgres databases) has
       been exported to be used in the upgrade.

   As part of the upgrade, the upgrade process checks the health of the system
   and validates that the system is ready for an upgrade.

   The upgrade process checks that no alarms are active before starting an
   upgrade.

   .. note::

      Use the command :command:`system upgrade-start --force` to force the
      upgrade process to start and to ignore non-management-affecting alarms.
      This should only be done if these alarms do not cause an issue for the
      upgrade process.

   The ``fm alarm-list --mgmt_affecting`` option identifies specific alarms
   that may be blocking an orchestrated upgrade.

   On systems with Ceph storage, the upgrade process also checks that the Ceph
   cluster is healthy.

#. Upgrade controller-1.

   #. Lock controller-1.

      .. code-block:: none

         ~(keystone_admin)]$ system host-lock controller-1

   #. Start the upgrade on controller-1.

      Controller-1 installs the update and reboots, then performs data
      migration.

      .. code-block:: none

         ~(keystone_admin)]$ software deploy host controller-1
         Running major release deployment, major_release=24.09, force=False, async_req=False, commit_id=<commit-id>
         Host installation was successful on controller-1

   #. Unlock controller-1.

      .. code-block:: none

         ~(keystone_admin)]$ system host-unlock controller-1

      Wait for controller-1 to enter the ``unlocked-enabled`` state. Wait until
      the DRBD sync **400.001** Services-related alarm has been raised and then
      cleared.

      The **upgrading-controllers** state applies when this command is
      run. This state is entered after controller-1 has been upgraded to
      release *nn.nn* and data migration has completed successfully, where
      *nn.nn* is the |prod| release number.

      If controller-1 transitions to **unlocked-disabled-failed**, investigate
      the issue before proceeding to the next step. The alarms may indicate a
      configuration error. Check the configuration logs on controller-1 (for
      example, the error logs in ``/var/log/puppet`` on controller-1).

   #. Run the :command:`system application-list` and
      :command:`software deploy host-list` commands to view the current
      progress.

      After controller-1 is unlocked/enabled/available, check that
      controller-1 is running the new release:

      .. code-block:: none

         ~(keystone_admin)]$ system host-show controller-1

#. Set controller-1 as the active controller. Swact to controller-1.

   .. code-block:: none

      ~(keystone_admin)]$ system host-swact controller-0

   Wait until services have gone active on the newly active controller-1
   before proceeding to the next step. When all services on controller-1 are
   enabled-active, the swact is complete.

   .. note::

      Continue with the remaining steps below to upgrade the remaining nodes
      manually, or use upgrade orchestration to upgrade them.

#. Upgrade controller-0.

   For more information, see
   :ref:`introduction-platform-software-updates-upgrades-06d6de90bbd0`.

   #. Lock controller-0.

      .. code-block:: none

         ~(keystone_admin)]$ system host-lock controller-0

   #. Upgrade controller-0.

      .. code-block:: none

         ~(keystone_admin)]$ software deploy host controller-0

      .. note::

         controller-0 must PXE boot over the management network, and its load
         must be served from controller-1, not from any external PXE boot
         server attached to the |OAM| network. To ensure this, check that the
         network boot list/order of the BIOS |NIC| is correct.

   #. Unlock controller-0.

      .. code-block:: none

         ~(keystone_admin)]$ system host-unlock controller-0

      You may encounter the following error message:

      .. code-block:: none

         Expecting number of interface sriov_numvfs=16. Please wait a few
         minutes for inventory update and retry host-unlock.

      If you see this error message, wait 5 minutes and then retry the
      unlock.

      Wait until the DRBD sync **400.001** Services-related alarm has been
      raised and then cleared before proceeding to the next step.

      - upgrading-hosts:

        - State entered when both controllers are running release <nn.nn>
          software.

#. Check the system health to ensure that there are no unexpected alarms.

   .. code-block:: none

      ~(keystone_admin)]$ fm alarm-list

   Clear all alarms unrelated to the upgrade process.

#. If using a Ceph storage backend, upgrade the storage nodes one at a time.

   The storage node must be locked and all |OSDs| must be down in order to do
   the upgrade.

   #. Lock storage-0.

      .. code-block:: none

         ~(keystone_admin)]$ system host-lock storage-0

   #. Verify that the |OSDs| are down after the storage node is locked.

      In the Horizon interface, navigate to **Admin** \> **Platform** \>
      **Storage Overview** to view the status of the |OSDs|.

   #. Upgrade storage-0.

      .. code-block:: none

         ~(keystone_admin)]$ software deploy host storage-0

      The upgrade is complete when the node comes online. At that point,
      you can safely unlock the node.

      After a storage node is upgraded, but before it is unlocked, Ceph
      synchronization alarms are present (they appear to be making progress
      in syncing), along with infrastructure network interface alarms (the
      infrastructure network interface configuration is not applied to the
      storage node until it is unlocked).

      Unlock the node as soon as the upgraded storage node comes online.

   #. Unlock storage-0.

      .. code-block:: none

         ~(keystone_admin)]$ system host-unlock storage-0

      Wait for all alarms to clear after the unlock before proceeding to
      upgrade the next storage host.

   #. Repeat the above steps for each storage host.

      .. note::

         After upgrading the first storage node, you can expect alarm
         **800.003**. The alarm is cleared after all storage nodes are
         upgraded.

#. If worker nodes are present, upgrade the worker hosts, serially or in
   parallel.

   #. Lock worker-0.

      .. code-block:: none

         ~(keystone_admin)]$ system host-lock worker-0

   #. Upgrade worker-0.

      .. code-block:: none

         ~(keystone_admin)]$ software deploy host worker-0

      Wait for the host to run the installer, reboot, and go online before
      unlocking it in the next step.

   #. Unlock worker-0.

      .. code-block:: none

         ~(keystone_admin)]$ system host-unlock worker-0

      Wait for all alarms to clear after the unlock before proceeding to the
      next worker host.

   #. Repeat the above steps for each worker host.

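
   Because the same three commands repeat for every worker host, it can help
   to write the sequence out before running anything. The following is a
   minimal sketch that only prints the commands used in the steps above, in
   order, for each host name it is given. The function name
   ``host_upgrade_commands`` and the host name ``worker-1`` are illustrative
   assumptions.

   .. code-block:: shell

      #!/bin/sh
      # Sketch: print the lock / deploy / unlock sequence for a list of hosts.
      # The commands are the ones used in the steps above; this helper only
      # echoes them so the order can be reviewed. It does not run them.

      host_upgrade_commands() {
          for host in "$@"; do
              echo "system host-lock $host"
              echo "software deploy host $host"
              # The host must run the installer, reboot, and go online
              # before the unlock is issued.
              echo "system host-unlock $host"
          done
      }

      host_upgrade_commands worker-0 worker-1

   The printed list is a checklist only. The waits described in the steps
   above, for the host to go online and for alarms to clear, still have to be
   observed between commands.
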
#. Set controller-0 as the active controller. Swact to controller-0.

   .. code-block:: none

      ~(keystone_admin)]$ system host-swact controller-1

   Wait until services have gone active on the newly active controller-0
   before proceeding to the next step. When all services on controller-0 are
   enabled-active, the swact is complete.

#. Activate the upgrade.

   .. code-block:: none

      ~(keystone_admin)]$ software deploy activate
      Deploy activate has started

   Check the deploy state:

   .. code-block:: none

      ~(keystone_admin)]$ software deploy show
      +--------------+------------+------+-----------------+
      | From Release | To Release | RR   | State           |
      +--------------+------------+------+-----------------+
      | 22.12.0      | 24.09.0    | True | deploy-activate |
      +--------------+------------+------+-----------------+

   When activation is complete:

   .. code-block:: none

      +--------------+------------+------+----------------------+
      | From Release | To Release | RR   | State                |
      +--------------+------------+------+----------------------+
      | 22.12.0      | 24.09.0    | True | deploy-activate-done |
      +--------------+------------+------+----------------------+

   While the :command:`upgrade-activate` command runs, new configurations
   are applied to the controller. 250.001 (**hostname Configuration is
   out-of-date**) alarms are raised and then cleared as the configuration is
   applied. The upgrade state goes from **activating** to
   **activation-complete** once this is done.

   .. only:: partner

      .. include:: /_includes/upgrading-the-systemcontroller-using-the-cli.rest
         :start-after: deploymentmanager-begin
         :end-before: deploymentmanager-end

   The following states apply when this command is executed.

   **activation-requested**
      State entered when :command:`system upgrade-activate` is executed.

   **activating**
      State entered when activation of the upgrade has started by applying
      new configurations to the controller and compute hosts.

   **activating-hosts**
      State entered when applying host-specific configurations. This state is
      entered only if needed.

   **activation-complete**
      State entered when new configurations have been applied to all
      controller and compute hosts.

#. Check the status of the upgrade again to confirm that it has reached
   **activation-complete**. For example:

   .. code-block:: none

      ~(keystone_admin)]$ system upgrade-show
      +--------------+--------------------------------------+
      | Property     | Value                                |
      +--------------+--------------------------------------+
      | uuid         | 61e5fcd7-a38d-40b0-ab83-8be55b87fee2 |
      | state        | activation-complete                  |
      | from_release | nn.nn                                |
      | to_release   | nn.nn                                |
      +--------------+--------------------------------------+

   .. note::

      This can take more than half an hour to complete.

   .. note::

      Alarms are generated because the subcloud load sync_status is
      "out-of-sync".

#. Complete the upgrade.

   .. code-block:: none

      ~(keystone_admin)]$ software deploy complete
      Deployment has been completed

   Verify the deploy state:

   .. code-block:: none

      ~(keystone_admin)]$ software deploy show
      +--------------+------------+------+------------------+
      | From Release | To Release | RR   | State            |
      +--------------+------------+------+------------------+
      | 22.12.0      | 24.09.0    | True | deploy-completed |
      +--------------+------------+------+------------------+

   Run the :command:`system upgrade-show` command. The status displays
   "no upgrade in progress". The subclouds will be out-of-sync.

#. After the deploy is completed, upgrade Kubernetes. When the Kubernetes
   upgrade completes, conclude the deploy by deleting it.

   .. code-block:: none

      ~(keystone_admin)]$ software deploy delete
      Deploy deleted with success

   Verify the deploy state:

   .. code-block:: none

      ~(keystone_admin)]$ software deploy show
      No deploy in progress

   .. only:: partner

      .. include:: /_includes/upgrading-the-systemcontroller-using-the-cli.rest
         :start-after: DMupgrades-begin
         :end-before: DMupgrades-end