Merge "Upgrade procedure"

This commit is contained in:
Zuul 2021-06-11 11:46:34 +00:00 committed by Gerrit Code Review
commit 13b74004fd
6 changed files with 168 additions and 99 deletions


@ -37,6 +37,12 @@ follows:
.. rubric:: |prereq|
For |AIO-SX| subclouds, end user container images in `registry.local` will be
backed up during the upgrade process. This only includes images other than
|prod| system and application images. These images are limited to 5 GBytes in
total size. If the system contains more than 5 GBytes of these images, the
upgrade start will fail.
The following prerequisites apply to a |prod-dc| upgrade management service.
.. _upgrade-management-overview-ul-smx-y2m-cmb:
@ -66,7 +72,7 @@ The following prerequisites apply to a |prod-dc| upgrade management service.
- Ensure **controller-0** is the active controller.
- The subclouds must all be |AIO-DX|, and using the Redfish
- The subclouds must all be |AIO-SX|, and must use the Redfish
platform management service.
- **Remove Non GA Applications**:


@ -42,6 +42,9 @@ Follow the steps below to manually upgrade the System Controller:
~(keystone_admin)]$ system --os-region-name SystemController load-import <bootimage>.iso <bootimage>.sig
.. note::
This can take several minutes.
#. Apply any required software updates. After the update is installed ensure
controller-0 is active.
@ -56,7 +59,7 @@ Follow the steps below to manually upgrade the System Controller:
To find and download applicable updates, visit the `Wind River Support
Network <https://docs.windriver.com>`__.
.. xbooklink For more information, see |updates-doc|: :ref:`Managing Software Updates <managing-software-updates>`.
For more information, see |updates-doc|: :ref:`Managing Software Updates <managing-software-updates>`.
#. Confirm that the system is healthy.
@ -171,13 +174,12 @@ Follow the steps below to manually upgrade the System Controller:
- data-migration:
- State entered when :command:`system host-upgrade controller-1`
is executed.
- System data is being migrated from release N to release N+1.
- data-migration-complete:
- data-migration-complete or upgrading-controllers:
- State entered when controller-1 upgrade is complete.
@ -262,10 +264,9 @@ Follow the steps below to manually upgrade the System Controller:
Continue the remaining steps below to manually upgrade or use upgrade
orchestration to upgrade the remaining nodes.
#. Upgrade **controller-0**. For more information, see
.. xbooklink :ref:`|updates-doc| <software-updates-and-upgrades-software-updates>`.
#. Upgrade **controller-0**.
.. xbooklink For more information, see :ref:`|updates-doc| <software-updates-and-upgrades-software-updates>` guide.
#. Lock **controller-0**.
@ -419,19 +420,20 @@ Follow the steps below to manually upgrade the System Controller:
The following states apply when this command is executed.
- activation-requested:
**activation-requested**
State entered when :command:`system upgrade-activate` is executed.
- State entered when :command:`system upgrade-activate` is executed.
**activating**
State entered when we have started activating the upgrade by
applying new configurations to the controller and compute hosts.
- activating:
**activating-hosts**
State entered when applying host-specific configurations. This state is
entered only if needed.
- State entered when we have started activating the upgrade by
applying new configurations to the controller and compute hosts.
- activation-complete:
- State entered when new configurations have been applied to all
controller and compute hosts.
**activation-complete**
State entered when new configurations have been applied to all
controller and compute hosts.
#. Check the status of the upgrade again to see it has reached
**activation-complete**, for example.
@ -448,6 +450,8 @@ Follow the steps below to manually upgrade the System Controller:
| to_release | nn.nn |
+--------------+--------------------------------------+
.. note::
This can take more than half an hour to complete.
.. note::
Alarms are generated as the subcloud load sync\_status is "out-of-sync".


@ -7,6 +7,13 @@ Abort Simplex System Upgrades
=============================
You can abort a Simplex System upgrade before or after upgrading controller-0.
The upgrade abort procedure can only be applied before the
:command:`upgrade-complete` command is issued. Once this command is issued the
upgrade cannot be aborted. If a return to the previous release is required,
then restore the system using the backup data taken prior to the upgrade.
Before starting, verify the upgrade data under `/opt/platform-backup`. This data
must be present to perform the abort process.
.. _aborting-simplex-system-upgrades-section-N10025-N1001B-N10001:
@ -50,32 +57,33 @@ upgrade. This involves performing a system restore with the previous release.
.. _aborting-simplex-system-upgrades-ol-jmw-kcp-xdb:
#. Abort the upgrade with the :command:`upgrade-abort` command.
.. code-block:: none
$ system upgrade-abort
The upgrade state is set to aborting. Once this is executed, there is no
canceling; the upgrade must be completely aborted.
#. Lock and downgrade controller-0
.. code-block:: none
$ system host-lock controller-0
$ system host-downgrade controller-0
The data is stored in /opt/platform-backup. Ensure the data is present and
preserved through the downgrade.
#. Install the previous release of |prod-long| Simplex software via network or
USB.
#. Verify and configure IP connectivity. External connectivity is required to
run the Ansible restore playbook. The |prod-long| boot image sends DHCP requests
on all interfaces, so the server may already have an IP address and external IP
connectivity if a DHCP server is present in your environment. Verify this using
the :command:`ip addr` command. Otherwise, manually configure an IP address and
default IP route.
#. Restore the system data. The restore data is preserved in /opt/platform-backup.
For more information, see :ref:`Upgrading All-in-One Simplex
<upgrading-all-in-one-simplex>`.
The system will be restored to the state when the :command:`upgrade-start`
command was issued. Follow the process in :ref:`Run Restore Playbook Locally on the
Controller <running-restore-playbook-locally-on-the-controller>`.
Specify the upgrade data filename as `backup_filename` and the `initial_backup_dir`
as `/opt/platform-backup`.
The user images will also need to be restored as described in the Postrequisites section.
#. Unlock controller-0
.. code-block:: none
$ system host-unlock controller-0
#. Abort the upgrade with the :command:`upgrade-abort` command.
@ -83,10 +91,6 @@ upgrade. This involves performing a system restore with the previous release.
$ system upgrade-abort
The system will be restored to the state when the :command:`upgrade-start`
command was issued. The :command:`upgrade-abort` command must be issued at
this time.
The upgrade state is set to aborting. Once this is executed, there is no
canceling; the upgrade must be completely aborted.
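The abort procedure above depends on upgrade data being present under `/opt/platform-backup`. As a minimal sketch of that pre-check, the helper below only tests that the directory exists and is not empty. The function name `has_upgrade_data` and the file name `upgrade_data.tgz` are hypothetical, chosen for illustration; the source does not state the actual name of the upgrade data file, so this does not validate its contents.

```python
import os
import tempfile


def has_upgrade_data(backup_dir="/opt/platform-backup"):
    """Return True if backup_dir exists and holds at least one entry.

    This is only a presence check; it does not inspect the data itself.
    """
    try:
        with os.scandir(backup_dir) as entries:
            return any(True for _ in entries)
    except (FileNotFoundError, NotADirectoryError):
        return False


# Demonstration against a temporary directory rather than the real path.
with tempfile.TemporaryDirectory() as demo_dir:
    empty_result = has_upgrade_data(demo_dir)
    # Hypothetical file name, used here only as a stand-in.
    open(os.path.join(demo_dir, "upgrade_data.tgz"), "w").close()
    filled_result = has_upgrade_data(demo_dir)

print(empty_result, filled_result)
```

On a real system the function would be called with its default argument before issuing :command:`upgrade-abort`.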


@ -9,6 +9,14 @@ Roll Back a Software Upgrade After the Second Controller Upgrade
After the second controller is upgraded, you can still roll back a software
upgrade; however, the rollback will impact the hosting of applications.
The upgrade abort procedure can only be applied before the
:command:`upgrade-complete` command is issued. Once this command is issued
the upgrade cannot be aborted. If a return to the previous release is required,
then restore the system using the backup data taken prior to the upgrade.
In some scenarios additional actions will be required to complete the upgrade
abort. It may be necessary to restore the system from a backup.
.. rubric:: |proc|
#. Run the :command:`upgrade-abort` command to abort the upgrade.
@ -74,6 +82,10 @@ upgrade, however, the rollback will impact the hosting of applications.
$ system host-unlock controller-0
.. note::
Wait for controller-0 to become unlocked-enabled. Wait for the
|DRBD| sync 400.001 Services-related alarm to be raised and then cleared.
#. Swact to controller-0.
.. code-block:: none
@ -86,6 +98,10 @@ upgrade, however, the rollback will impact the hosting of applications.
#. Lock and downgrade controller-1.
.. code-block:: none
$ system host-lock controller-1
.. code-block:: none
$ system host-downgrade controller-1
@ -97,10 +113,10 @@ upgrade, however, the rollback will impact the hosting of applications.
.. code-block:: none
$ system host-unlock controller-1
#. Power up and unlock the storage hosts one at a time \(if using a Ceph
storage backend\). The hosts are re-installed with the release N load.
storage backend\). The hosts are re-installed with the previous release load.
.. note::
Skip this step if doing this procedure on a |prod| Duplex system.


@ -46,7 +46,7 @@ of |prod| software.
#. Ensure that controller-0 is the active controller.
#. Install the license file for the release you are upgrading to, for example,
20.06.
21.05.
.. code-block:: none
@ -78,14 +78,17 @@ of |prod| software.
+--------------------+-----------+
| id | 2 |
| state | importing |
| software_version | 20.06 |
| compatible_version | 20.04 |
| software_version | 21.05 |
| compatible_version | 20.06 |
| required_patches | |
+--------------------+-----------+
The :command:`load-import` must be done on **controller-0** and accepts
relative paths.
.. note::
This can take a few minutes to complete.
#. Check to ensure the load was successfully imported.
.. code-block:: none
@ -95,8 +98,8 @@ of |prod| software.
+----+----------+------------------+
| id | state | software_version |
+----+----------+------------------+
| 1 | active | 20.04 |
| 2 | imported | 20.06 |
| 1 | active | 20.06 |
| 2 | imported | 21.05 |
+----+----------+------------------+
@ -162,8 +165,8 @@ of |prod| software.
+--------------+--------------------------------------+
| uuid | 61e5fcd7-a38d-40b0-ab83-8be55b87fee2 |
| state | starting |
| from_release | 20.04 |
| to_release | 20.06 |
| from_release | 20.06 |
| to_release | 21.05 |
+--------------+--------------------------------------+
This will make a copy of the system data to be used in the upgrade.
@ -176,7 +179,7 @@ of |prod| software.
- State entered after :command:`system upgrade-start` completes.
- Release 20.04 system data \(for example, postgres databases\) has
- Release 20.06 system data \(for example, postgres databases\) has
been exported to be used in the upgrade.
- Configuration changes must not be made after this point, until the
@ -218,7 +221,7 @@ of |prod| software.
**locked-disabled-online** state.
The following data migration states apply when this command is
executed.
executed:
- data-migration:
@ -227,12 +230,12 @@ of |prod| software.
- System data is being migrated from release N to release N+1.
- data-migration-complete:
- data-migration-complete or upgrading-controllers:
- State entered when controller-1 upgrade is complete.
- System data has been successfully migrated from release 20.04
to release 20.06.
- System data has been successfully migrated from release 20.06
to release 21.05.
- data-migration-failed:
@ -251,8 +254,8 @@ of |prod| software.
+--------------+--------------------------------------+
| uuid | e7c8f6bc-518c-46d4-ab81-7a59f8f8e64b |
| state | data-migration-complete |
| from_release | 20.04 |
| to_release | 20.06 |
| from_release | 20.06 |
| to_release | 21.05 |
+--------------+--------------------------------------+
If the :command:`upgrade-show` status indicates
@ -273,7 +276,7 @@ of |prod| software.
- upgrading-controllers:
- State entered when controller-1 has been unlocked and is
running release 20.06 software.
running release 21.05 software.
If it transitions to **unlocked-disabled-failed**, check the issue
before proceeding to the next step. The alarms may indicate a
@ -317,7 +320,7 @@ of |prod| software.
- upgrading-hosts:
- State entered when both controllers are running release 20.06
- State entered when both controllers are running release 21.05
software.
#. Check the system health to ensure that there are no unexpected alarms.
@ -425,8 +428,8 @@ of |prod| software.
+--------------+--------------------------------------+
| uuid | 61e5fcd7-a38d-40b0-ab83-8be55b87fee2 |
| state | activating |
| from_release | 20.04 |
| to_release | 20.06 |
| from_release | 20.06 |
| to_release | 21.05 |
+--------------+--------------------------------------+
During the running of the :command:`upgrade-activate` command, new
@ -444,6 +447,10 @@ of |prod| software.
State entered when we have started activating the upgrade by applying
new configurations to the controller and compute hosts.
**activating-hosts**
State entered when applying host-specific configurations. This state is
entered only if needed.
**activation-complete**
State entered when new configurations have been applied to all
controller and compute hosts.
@ -459,10 +466,13 @@ of |prod| software.
+--------------+--------------------------------------+
| uuid | 61e5fcd7-a38d-40b0-ab83-8be55b87fee2 |
| state | activation-complete |
| from_release | 20.04 |
| to_release | 20.06 |
| from_release | 20.06 |
| to_release | 21.05 |
+--------------+--------------------------------------+
.. note::
This can take more than half an hour to complete.
#. Complete the upgrade.
.. code-block:: none
@ -473,8 +483,8 @@ of |prod| software.
+--------------+--------------------------------------+
| uuid | 61e5fcd7-a38d-40b0-ab83-8be55b87fee2 |
| state | completing |
| from_release | 20.04 |
| to_release | 20.06 |
| from_release | 20.06 |
| to_release | 21.05 |
+--------------+--------------------------------------+
#. Delete the imported load.
@ -485,8 +495,8 @@ of |prod| software.
+----+----------+------------------+
| id | state | software_version |
+----+----------+------------------+
| 1 | imported | 20.04 |
| 2 | active | 20.06 |
| 1 | imported | 20.06 |
| 2 | active | 21.05 |
+----+----------+------------------+
~(keystone_admin)]$ system load-delete 1
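Several steps in this procedure amount to re-running :command:`system upgrade-show` until a given state is reached (for example **activation-complete**) or a failure state appears (for example **data-migration-failed**). The sketch below illustrates that polling pattern only. The function `wait_for_state` and its parameters are hypothetical, and the state source is injected as a callable so the example runs without a live system.

```python
import time


def wait_for_state(get_state, target, failed=(), timeout=3600.0,
                   interval=30.0, sleep=time.sleep, clock=time.monotonic):
    """Poll get_state() until it returns target.

    Raises RuntimeError if a state listed in `failed` is seen, and
    TimeoutError if `timeout` seconds pass without reaching `target`.
    """
    deadline = clock() + timeout
    while True:
        state = get_state()
        if state == target:
            return state
        if state in failed:
            raise RuntimeError(f"upgrade entered failure state: {state}")
        if clock() >= deadline:
            raise TimeoutError(f"timed out waiting for state: {target}")
        sleep(interval)


# Canned sequence using the activation states described in this procedure.
states = iter([
    "activation-requested",
    "activating",
    "activating-hosts",
    "activation-complete",
])
final_state = wait_for_state(lambda: next(states), "activation-complete",
                             sleep=lambda seconds: None)
print(final_state)
```

In practice `get_state` would wrap whatever mechanism you use to read the upgrade state; the default timeout and interval here are arbitrary placeholders, not values taken from the product.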


@ -11,6 +11,7 @@ software.
.. rubric:: |prereq|
.. _upgrading-all-in-one-simplex-ul-ezb-b11-cx:
- Perform a full backup to allow recovery.
@ -22,12 +23,12 @@ software.
- The system must be 'patch current'. All upgrades available for the current
release running on the system must be applied. To find and download
applicable upgrades, visit the |dnload-loc| site.
applicable upgrades, visit |dnload-loc| site.
- Transfer the new release software load to controller-0 \(or onto a USB
stick\); controller-0 must be active.
- Transfer the new release software license file to controller-0, \(or onto a
- Transfer the new release software license file to controller-0 \(or onto a
USB stick\).
- Transfer the new release software signature to controller-0 \(or onto a USB
@ -36,6 +37,11 @@ software.
.. note::
The upgrade procedure includes steps to resolve system health issues.
End user container images in `registry.local` will be backed up during the
upgrade process. This only includes images other than |prod| system and
application images. These images are limited to 5 GBytes in total size. If
the system contains more than 5 GBytes of these images, the upgrade start will fail.
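To make the 5 GByte limit concrete, the following sketch sums a set of image sizes and compares the total against the limit. It is illustrative only: the image names and sizes are invented, the function names are hypothetical, and reading "5 GBytes" as 5 * 1024**3 bytes is an assumption, since the text does not say whether a binary or decimal unit is meant.

```python
# Assumption: "5 GBytes" is interpreted as 5 * 1024**3 bytes.
LIMIT_BYTES = 5 * 1024**3


def total_image_bytes(image_sizes):
    """Sum the sizes (in bytes) of the end user images to be backed up."""
    return sum(image_sizes.values())


def fits_backup_limit(image_sizes, limit_bytes=LIMIT_BYTES):
    """Return True when the combined size does not exceed the limit."""
    return total_image_bytes(image_sizes) <= limit_bytes


# Hypothetical image names and sizes, for illustration only.
sample_images = {
    "registry.local/demo/app-a:1.0": 2 * 1024**3,
    "registry.local/demo/app-b:2.3": 1 * 1024**3,
}
print(fits_backup_limit(sample_images))
```

How the per-image sizes are obtained is left open here; the sketch only shows the comparison that decides whether the upgrade start would be expected to fail.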
.. rubric:: |proc|
#. Source the platform environment.
@ -78,8 +84,8 @@ software.
+--------------------+-----------+
| id | 2 |
| state | importing |
| software_version | 20.06 |
| compatible_version | 20.04 |
| software_version | 21.05 |
| compatible_version | 20.06 |
| required_patches | |
+--------------------+-----------+
@ -95,10 +101,13 @@ software.
+----+----------+------------------+
| id | state | software_version |
+----+----------+------------------+
| 1 | active | 20.04 |
| 2 | imported | 20.06 |
| 1 | active | 20.06 |
| 2 | imported | 21.05 |
+----+----------+------------------+
.. note::
This will take a few minutes.
#. Apply any required software updates.
The system must be 'patch current'. All software updates related to your
@ -157,8 +166,8 @@ software.
+--------------+--------------------------------------+
| uuid | 61e5fcd7-a38d-40b0-ab83-8be55b87fee2 |
| state | starting |
| from_release | 20.04 |
| to_release | 20.06 |
| from_release | 20.06 |
| to_release | 21.05 |
+--------------+--------------------------------------+
This will back up the system data and images to /opt/platform-backup.
@ -206,10 +215,13 @@ software.
+--------------+--------------------------------------+
| uuid | 61e5fcd7-a38d-40b0-ab83-8be55b87fee2 |
| state | started |
| from_release | 20.04 |
| to_release | 20.06 |
| from_release | 20.06 |
| to_release | 21.05 |
+--------------+--------------------------------------+
Ensure the upgrade state is **started**. It will take several minutes to
transition to the started state.
#. \(Optional\) Copy the upgrade data from the system to an alternate safe
location \(such as a USB drive or remote server\).
@ -228,7 +240,7 @@ software.
~(keystone_admin)]$ system host-lock controller-0
#. Start Upgrade controller-0.
#. Upgrade controller-0.
This is the point of no return. All data except /opt/platform-backup/ will
be erased from the system. This will wipe the **rootfs** and reboot the
@ -244,6 +256,13 @@ software.
#. Install the new release of |prod-long| Simplex software via network or USB.
#. Verify and configure IP connectivity. External connectivity is required to
run the Ansible upgrade playbook. The |prod-long| boot image sends DHCP requests
on all interfaces, so the server may already have an IP address and external IP
connectivity if a DHCP server is present in your environment. Verify this using
the :command:`ip addr` command. Otherwise, manually configure an IP address and
default IP route.
#. Restore the upgrade data.
.. code-block:: none
@ -269,6 +288,9 @@ software.
.. note::
This playbook does not support replay.
.. note::
This can take more than one hour to complete.
Once the data restoration is complete the upgrade state will be set to
**upgrading-hosts**.
@ -282,8 +304,8 @@ software.
+--------------+--------------------------------------+
| uuid | 61e5fcd7-a38d-40b0-ab83-8be55b87fee2 |
| state | upgrading-hosts |
| from_release | 20.04 |
| to_release | 20.06 |
| from_release | 20.06 |
| to_release | 21.05 |
+--------------+--------------------------------------+
#. Unlock controller-0.
@ -310,8 +332,8 @@ software.
+--------------+--------------------------------------+
| uuid | 61e5fcd7-a38d-40b0-ab83-8be55b87fee2 |
| state | activating |
| from_release | 20.04 |
| to_release | 20.06 |
| from_release | 20.06 |
| to_release | 21.05 |
+--------------+--------------------------------------+
The following states apply when this command is executed.
@ -323,24 +345,31 @@ software.
State entered when we have started activating the upgrade by applying
new configurations to the controller and compute hosts.
**activating-hosts**
State entered when applying host-specific configurations. This state is
entered only if needed.
**activation-complete**
State entered when new configurations have been applied to all
controller and compute hosts.
#. Check the status of the upgrade again to see it has reached
**activation-complete**
Check the status of the upgrade again to see it has reached
**activation-complete**
.. code-block:: none
.. code-block:: none
~(keystone_admin)]$ system upgrade-show
+--------------+--------------------------------------+
| Property | Value |
+--------------+--------------------------------------+
| uuid | 61e5fcd7-a38d-40b0-ab83-8be55b87fee2 |
| state | activation-complete |
| from_release | 20.04 |
| to_release | 20.06 |
+--------------+--------------------------------------+
~(keystone_admin)]$ system upgrade-show
+--------------+--------------------------------------+
| Property | Value |
+--------------+--------------------------------------+
| uuid | 61e5fcd7-a38d-40b0-ab83-8be55b87fee2 |
| state | activation-complete |
| from_release | 20.06 |
| to_release | 21.05 |
+--------------+--------------------------------------+
.. note::
This can take more than half an hour to complete.
#. Complete the upgrade.
@ -352,8 +381,8 @@ software.
+--------------+--------------------------------------+
| uuid | 61e5fcd7-a38d-40b0-ab83-8be55b87fee2 |
| state | completing |
| from_release | 20.04 |
| to_release | 20.06 |
| from_release | 20.06 |
| to_release | 21.05 |
+--------------+--------------------------------------+
#. Delete the imported load.
@ -364,8 +393,8 @@ software.
+----+----------+------------------+
| id | state | software_version |
+----+----------+------------------+
| 1 | imported | 20.04 |
| 2 | active | 20.06 |
| 1 | imported | 20.06 |
| 2 | active | 21.05 |
+----+----------+------------------+
~(keystone_admin)]$ system load-delete 1
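The :command:`system upgrade-show` output shown throughout this procedure is a two-column Property/Value table. If you want to read the state from a script rather than by eye, a small parser such as the sketch below could be used. The function name `parse_property_table` is hypothetical, and the parser assumes the table layout matches the examples in this document.

```python
def parse_property_table(text):
    """Parse a two-column '| Property | Value |' table into a dict."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip border rows (+----+) and anything outside the table
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) != 2 or cells[0] == "Property":
            continue  # skip the header row and unexpected shapes
        result[cells[0]] = cells[1]
    return result


# Sample copied from the upgrade-show output in this procedure.
SAMPLE = """
+--------------+--------------------------------------+
| Property     | Value                                |
+--------------+--------------------------------------+
| uuid         | 61e5fcd7-a38d-40b0-ab83-8be55b87fee2 |
| state        | activation-complete                  |
| from_release | 20.06                                |
| to_release   | 21.05                                |
+--------------+--------------------------------------+
"""

upgrade = parse_property_table(SAMPLE)
print(upgrade["state"], upgrade["from_release"], upgrade["to_release"])
```

Parsing the human-readable table is fragile if the layout changes, so treat this as a convenience for spot checks rather than a stable interface.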