Merge "Add updates and upgrades content"

Zuul 2021-05-18 17:08:32 +00:00 committed by Gerrit Code Review
commit 7230189e63
34 changed files with 3722 additions and 1 deletions

View File

@ -0,0 +1 @@
#. Install, configure and unlock nodes.

View File

@ -33,6 +33,7 @@
.. |admintasks-doc| replace:: :title:`StarlingX Administrator Tasks`
.. |datanet-doc| replace:: :title:`StarlingX Data Networks`
.. |os-intro-doc| replace:: :title:`OpenStack Introduction`
.. |updates-doc| replace:: :title:`StarlingX Updates and Upgrades`
.. Name of downloads location
@ -54,4 +55,12 @@
.. |max-workers| replace:: 99
.. |release-caveat| replace:: This is a pre-release feature and may not function as described in |prod| 5.
.. Product name used in patch file names
.. |pn| replace:: STLX
.. Product version used in patch file names
.. |pvr| replace:: 00004
.. |release-caveat| replace:: This is a pre-release feature and may not function as described in |prod| 5 documentation.

View File

@ -0,0 +1,97 @@
.. syj1592947192958
.. _aborting-simplex-system-upgrades:
=============================
Abort Simplex System Upgrades
=============================
You can abort a Simplex System upgrade before or after upgrading controller-0.
.. _aborting-simplex-system-upgrades-section-N10025-N1001B-N10001:
.. contents:: |minitoc|
:local:
:depth: 1
-----------------------------
Before upgrading controller-0
-----------------------------
.. _aborting-simplex-system-upgrades-ol-nlw-zbp-xdb:
#. Abort the upgrade with the :command:`upgrade-abort` command.
.. code-block:: none
$ system upgrade-abort
The upgrade state is set to aborting. Once this is executed, there is no
canceling; the upgrade must be completely aborted.
#. Complete the upgrade.
.. code-block:: none
$ system upgrade-complete
At this time any upgrade data generated as part of the upgrade-start
command will be deleted. This includes the upgrade data in
/opt/platform-backup.
.. _aborting-simplex-system-upgrades-section-N10063-N1001B-N10001:
----------------------------
After upgrading controller-0
----------------------------
After controller-0 has been upgraded, it is possible to roll back the software
upgrade. This involves performing a system restore with the previous release.
.. _aborting-simplex-system-upgrades-ol-jmw-kcp-xdb:
#. Abort the upgrade with the :command:`upgrade-abort` command.
.. code-block:: none
$ system upgrade-abort
The upgrade state is set to aborting. Once this is executed, there is no
canceling; the upgrade must be completely aborted.
#. Lock and downgrade controller-0
.. code-block:: none
$ system host-lock controller-0
$ system host-downgrade controller-0
The data is stored in /opt/platform-backup. Ensure the data is present, and
preserved through the downgrade.
#. Install the previous release of |prod-long| Simplex software via network or
USB.
#. Restore the system data. The data to restore is preserved in /opt/platform-backup.
For more information, see :ref:`Upgrading All-in-One Simplex
<upgrading-all-in-one-simplex>`.
#. Abort the upgrade with the :command:`upgrade-abort` command.
.. code-block:: none
$ system upgrade-abort
The system will be restored to the state when the :command:`upgrade-start`
command was issued. The :command:`upgrade-abort` command must be issued at
this time.
The upgrade state is set to aborting. Once this is executed, there is no
canceling; the upgrade must be completely aborted.
#. Complete the upgrade.
.. code-block:: none
$ system upgrade-complete
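The two abort paths above can be summarized as an ordered checklist. The following is a minimal sketch, not part of the product: the function name ``simplex_abort_steps`` is hypothetical, entries that start with ``system`` are the commands shown in this procedure, and entries in parentheses stand for manual actions that have no single command.

```python
def simplex_abort_steps(controller_upgraded):
    """Return the ordered actions for aborting a Simplex upgrade.

    controller_upgraded: True if controller-0 has already been upgraded.
    Entries starting with "system" are CLI commands from the procedure;
    entries in parentheses are manual actions.
    """
    if not controller_upgraded:
        # Before upgrading controller-0: abort, then complete.
        return ["system upgrade-abort", "system upgrade-complete"]
    # After upgrading controller-0: roll back through a restore.
    return [
        "system upgrade-abort",
        "system host-lock controller-0",
        "system host-downgrade controller-0",
        "(install the previous release via network or USB)",
        "(restore the system data from /opt/platform-backup)",
        "system upgrade-abort",
        "system upgrade-complete",
    ]


if __name__ == "__main__":
    for number, step in enumerate(simplex_abort_steps(True), start=1):
        print(f"{number}. {step}")
```

The sketch only prints a checklist; it does not run any command, which keeps it safe to try on a workstation.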

View File

@ -0,0 +1,192 @@
.. gep1552920534437
.. _configuring-update-orchestration:
==============================
Configure Update Orchestration
==============================
You can configure update orchestration using the Horizon Web interface.
.. rubric:: |context|
The update orchestration interface is found in Horizon on the Patch
Orchestration tab, available from **Admin** \> **Platform** \> **Software
Management** in the left-hand pane.
.. note::
Management-affecting alarms cannot be ignored at the indicated severity
level or higher by using relaxed alarm rules during an orchestrated update
operation. For a list of management-affecting alarms, see |fault-doc|:
:ref:`Alarm Messages <100-series-alarm-messages>`. To display
management-affecting active alarms, use the following command:
.. code-block:: none
~(keystone_admin)]$ fm alarm-list --mgmt_affecting
During an orchestrated update operation, the following alarms are ignored
even when strict restrictions are selected:
- 200.001, Maintenance host lock alarm
- 900.001, Patch in progress
- 900.005, Upgrade in progress
- 900.101, Software patch auto apply in progress
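The interaction between this ignore list and the Strict and Relaxed options described under **Alarm Restrictions** can be sketched as follows. This is an illustrative model only, not the orchestrator's code: the function name, the ``(alarm_id, mgmt_affecting)`` tuple shape, and the alarm ID ``999.001`` are all hypothetical.

```python
# Alarm IDs that the text lists as ignored even under strict restrictions.
ALWAYS_IGNORED = {"200.001", "900.001", "900.005", "900.101"}


def blocking_alarms(alarms, restriction="strict"):
    """Return the IDs of alarms that would stop orchestration.

    alarms: list of (alarm_id, mgmt_affecting) tuples (hypothetical shape).
    restriction: "strict" or "relaxed".
    """
    if restriction not in ("strict", "relaxed"):
        raise ValueError("restriction must be 'strict' or 'relaxed'")
    blocking = []
    for alarm_id, mgmt_affecting in alarms:
        if alarm_id in ALWAYS_IGNORED:
            continue  # ignored regardless of the restriction level
        # Strict: any remaining alarm blocks.
        # Relaxed: only management-affecting alarms block.
        if restriction == "strict" or mgmt_affecting:
            blocking.append(alarm_id)
    return blocking


if __name__ == "__main__":
    active = [("900.001", True), ("999.001", False)]  # 999.001 is made up
    print(blocking_alarms(active, "strict"))
    print(blocking_alarms(active, "relaxed"))
```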
.. _configuring-update-orchestration-ul-qhy-q1p-v1b:
.. rubric:: |prereq|
You cannot successfully create an update \(patch\) strategy if any hosts show
**Patch Current** = **Pending**, indicating that the update status of these
hosts has not yet been updated. The creation attempt fails, and you must try
again. You can use :command:`sw-patch query-hosts` to review the current update
status before creating an update strategy.
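The prerequisite check can be scripted by scanning the text printed by :command:`sw-patch query-hosts`. The following is a rough sketch that assumes the six-column layout shown elsewhere in this guide; the function names are hypothetical and the sample text is illustrative, with one row edited to show **Pending**.

```python
FIELDS = ("hostname", "ip_address", "patch_current",
          "reboot_required", "release", "state")

# Illustrative sample; the Pending row was edited in for the example.
SAMPLE = """\
Hostname      IP Address      Patch Current  Reboot Required  Release  State
============  ==============  =============  ===============  =======  =====
worker-0      192.168.204.24  Yes            No               20.04    idle
controller-0  192.168.204.3   Yes            No               20.04    idle
controller-1  192.168.204.4   Pending        No               20.04    idle
"""


def parse_query_hosts(output):
    """Parse `sw-patch query-hosts` text into a list of dicts."""
    hosts = []
    for line in output.splitlines():
        parts = line.split()
        # Data rows have exactly six whitespace-separated values. The header
        # splits into more tokens, and the ruler row starts with "=".
        if len(parts) != len(FIELDS) or parts[0].startswith("="):
            continue
        hosts.append(dict(zip(FIELDS, parts)))
    return hosts


def pending_hosts(output):
    """Return host names whose Patch Current column reads Pending."""
    return [h["hostname"] for h in parse_query_hosts(output)
            if h["patch_current"] == "Pending"]


if __name__ == "__main__":
    print(pending_hosts(SAMPLE))
```

An empty result suggests that strategy creation will not fail for this reason; a non-empty result means waiting and querying again.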
.. rubric:: |proc|
#. Upload and apply your updates as described in :ref:`Manage Software Updates
<managing-software-updates>` \(do not lock any hosts or use
:command:`host-install` to install the updates on any hosts\).
#. Select **Platform** \> **Software Management**, then select the **Patch
Orchestration** tab.
#. Click the **Create Strategy** button.
The Create Strategy dialog appears.
.. image:: figures/zcj1567178380908.png
#. Create an update strategy by specifying settings for the parameters in the
Create Strategy dialog box.
**Description** field
Provides information about current alarms, including whether an alarm
is Management Affecting.
**Controller Apply Type**
- Serial \(default\): controllers will be updated one at a time
\(standby controller first\)
- Ignore: controllers will not be updated
**Storage Apply Type**
- Serial \(default\): storage hosts will be updated one at a time
- Parallel: storage hosts will be updated in parallel, ensuring that
only one storage node in each replication group is updated at a
time.
- Ignore: storage hosts will not be updated
**Worker Apply Type**
- Serial \(default\): worker hosts will be updated one at a time
- Parallel: worker hosts will be updated in parallel
- At most, the configured maximum number of parallel worker hosts will be updated at the same time.
- For a reboot parallel update only, worker hosts with no pods
are updated before worker hosts with pods.
- Parallel: specify the maximum worker hosts to update in parallel
\(minimum: 2, maximum: 100\)
- Ignore: Worker hosts will not be updated
**Default Instance Action**
This parameter only applies for systems with the stx-openstack
application.
- Stop-Start \(default\): hosted application VMs will be stopped
before a host is updated \(applies to reboot updates only\)
- Migrate: hosted application VMs will be migrated off a host before
it is updated \(applies to reboot updates only\).
**Alarm Restrictions**
This option lets you specify how update orchestration behaves when
alarms are present.
You can use the CLI command :command:`fm alarm-list --mgmt_affecting`
to view the alarms that are management affecting.
**Strict**
The default strict option will result in update orchestration
failing if there are any alarms present in the system \(except for a
small list of alarms\).
**Relaxed**
This option allows orchestration to proceed if alarms are present,
as long as none of these alarms are management affecting.
#. Click **Create Strategy** to save the update orchestration strategy.
.. note::
The update orchestration process ensures that no hosts are reported as
**Patch Status** = **Pending**. If any hosts have this status, the
creation attempt fails with an error message. Wait a few minutes and
try again. You can also use :command:`sw-patch query-hosts` to review
the current update status.
Examine the update strategy. Pay careful attention to:
- The sets of hosts that will be updated together in each stage.
- The sets of hosted application pods that will be impacted in each stage.
The update strategy has one or more stages, with each stage consisting of
one or more hosts to be updated at the same time. Each stage is split into
steps \(for example, :command:`query-alarms`, :command:`lock-hosts`,
:command:`sw-patch-hosts`\). Note the following about stages:
.. note::
- Controller hosts are updated first, followed by storage hosts and
then worker hosts.
- Worker hosts with no hosted application pods are updated before
worker hosts with hosted application pods.
- The final step in each stage is "system-stabilize," which waits for
a period of time \(up to several minutes\) and ensures that the
system is free of alarms. This ensures that the update orchestrator
does not continue to update more hosts if the update application
has caused an issue resulting in an alarm.
#. Click the **Apply Strategy** button to apply the update strategy. You can
optionally apply a single stage at a time by clicking the **Apply Stage**
button.
When applying a single stage, you can only apply the next stage; you cannot
skip stages.
#. To abort the update, click the **Abort Strategy** button.
- While an update strategy is being applied, it can be aborted. This
results in:
- The current step being allowed to complete.
- If necessary, an abort phase will be created and applied, which will
attempt to unlock any hosts that were locked.
.. note::
If an update strategy is aborted after hosts were locked, but before
they were updated, the hosts will not be unlocked, as this would result
in the updates being installed. You must either install the updates on
the hosts or remove the updates before unlocking the hosts.
#. Delete the update strategy.
After an update strategy has been applied \(or aborted\) it must be deleted
before another update strategy can be created. If an update strategy
application fails, you must address the issue that caused the failure, then
delete and re-create the strategy before attempting to apply it again.
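The stage ordering described in this procedure can be illustrated with a small planner. This is a simplified sketch under stated assumptions, not the orchestrator's algorithm: the function names and the ``(name, personality, has_pods)`` tuple shape are hypothetical, controllers and storage hosts are always treated as serial, and the standby-first rule and storage replication groups are not modeled.

```python
def _chunks(items, size):
    """Split a list into consecutive groups of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def plan_stages(hosts, worker_apply="serial", max_parallel=2):
    """Return a list of stages; each stage is a list of host names.

    hosts: list of (name, personality, has_pods) tuples (hypothetical shape).
    worker_apply: "serial" or "parallel".
    max_parallel: worker hosts per stage when worker_apply is "parallel".
    """
    if worker_apply == "parallel" and not 2 <= max_parallel <= 100:
        raise ValueError("max_parallel must be between 2 and 100")

    controllers = [n for n, p, _ in hosts if p == "controller"]
    storage = [n for n, p, _ in hosts if p == "storage"]
    # Worker hosts without pods go before worker hosts with pods.
    idle_workers = [n for n, p, pods in hosts if p == "worker" and not pods]
    busy_workers = [n for n, p, pods in hosts if p == "worker" and pods]

    size = max_parallel if worker_apply == "parallel" else 1
    # Controllers first, then storage, then workers.
    stages = [[name] for name in controllers + storage]
    stages += _chunks(idle_workers, size) + _chunks(busy_workers, size)
    return stages


if __name__ == "__main__":
    inventory = [
        ("worker-0", "worker", True),
        ("controller-0", "controller", False),
        ("storage-0", "storage", False),
        ("worker-1", "worker", False),
        ("worker-2", "worker", False),
        ("controller-1", "controller", False),
    ]
    for stage in plan_stages(inventory, "parallel", 2):
        print(stage)
```

Keeping pod-free and pod-hosting workers in separate batches is a design choice of this sketch; the real strategy shown in Horizon remains the authority on which hosts share a stage.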

Binary file not shown.


Binary file not shown.


View File

@ -0,0 +1,35 @@
.. kiv1552920729184
.. _identifying-the-software-version-and-update-level-using-horizon:
============================================================
Identify the Software Version and Update Level Using Horizon
============================================================
You can view the current software version and update level from the Horizon Web
interface. The system type is also shown.
.. rubric:: |proc|
#. In |prod| Horizon, open the System Configuration page.
The System Configuration page is available from **Admin** \> **Platform**
\> **System Configuration** in the left-hand pane.
#. Select the **Systems** tab to view the software version.
The software version is shown in the **Version** field.
The type of system selected at installation \(Standard or All-in-one\) is
shown in the **System Type** field. The mode \(**simplex**, **duplex**, or
**standard**\) is shown in the **System Mode** field.
#. In the |prod| Horizon interface, open the Software Management page.
The Software Management page is available from **Admin** \> **Platform** \>
**Software Management** in the left-hand pane.
#. Select the **Patches** tab to view update information.
The **Patches** tab shows the Patch ID, a Summary description, the Status
of the Patch, and an Actions button to use to select an appropriate action.

View File

@ -0,0 +1,58 @@
.. lob1552920716157
.. _identifying-the-software-version-and-update-level-using-the-cli:
============================================================
Identify the Software Version and Update Level Using the CLI
============================================================
You can view the current software version and update level from the CLI. The
system type is also shown.
.. rubric:: |context|
For more about working with software updates, see :ref:`Manage Software Updates
<managing-software-updates>`.
.. rubric:: |proc|
.. _identifying-the-software-version-and-update-level-using-the-cli-steps-smg-b4r-hkb:
- To find the software version from the CLI, use the :command:`system show`
command.
.. code-block:: none
~(keystone_admin)]$ system show
+----------------------+----------------------------------------------------+
| Property | Value |
+----------------------+----------------------------------------------------+
| contact | None |
| created_at | 2020-02-27T15:29:26.140606+00:00 |
| description | yow-cgcs-ironpass-1_4 |
| https_enabled | False |
| location | None |
| name | yow-cgcs-ironpass-1-4 |
| region_name | RegionOne |
| sdn_enabled | False |
| security_feature | spectre_meltdown_v1 |
| service_project_name | services |
| software_version | 20.06 |
| system_mode | duplex |
| system_type | Standard |
| timezone | UTC |
| updated_at | 2020-02-28T16:19:56.987581+00:00 |
| uuid | 90212c98-7e27-4a14-8981-b8f5b777b26b |
| vswitch_type | none |
+----------------------+----------------------------------------------------+
.. note::
The **system\_mode** field is shown only for a |prod| Simplex or Duplex
system.
- To list applied software updates from the CLI, use the :command:`sw-patch
query` command.
.. code-block:: none
~(keystone_admin)]$ sudo sw-patch query
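When the version is needed in a script, the table printed by :command:`system show` can be parsed as plain text. The following is a minimal sketch that assumes the two-column ``| Property | Value |`` layout shown above; the function name is hypothetical and the sample is an abbreviated copy of that output.

```python
# Abbreviated copy of the `system show` output shown in this section.
SAMPLE = """\
+----------------------+----------+
| Property             | Value    |
+----------------------+----------+
| software_version     | 20.06    |
| system_mode          | duplex   |
| system_type          | Standard |
+----------------------+----------+
"""


def parse_system_show(output):
    """Return the Property/Value table as a dict of strings."""
    props = {}
    for line in output.splitlines():
        cells = [cell.strip() for cell in line.split("|")]
        # A data row "| key | value |" splits into ["", key, value, ""].
        # Border rows contain no "|" and the header row is skipped by name.
        if len(cells) == 4 and cells[1] and cells[1] != "Property":
            props[cells[1]] = cells[2]
    return props


if __name__ == "__main__":
    info = parse_system_show(SAMPLE)
    print(info["software_version"], info.get("system_mode"))
```

Using ``dict.get`` for ``system_mode`` reflects the note above: that field is not present on every system type.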

View File

@ -0,0 +1,24 @@
.. gwe1552920505159
.. _in-service-versus-reboot-required-software-updates:
==================================================
In-Service Versus Reboot-Required Software Updates
==================================================
In-Service \(Reboot-not-Required\) and Reboot-Required software updates are
available depending on the nature of the update to be performed.
In-Service software updates provide a mechanism to issue updates that do not
require a reboot, allowing the update to be installed on in-service nodes and
restarting affected processes as needed.
Depending on the area of software being updated and the type of software
change, installation of the update may or may not require the |prod| hosts to
be rebooted. For example, a software update to the kernel would require the
host to be rebooted in order to apply the update. Software updates are
classified as reboot-required or reboot-not-required \(also referred to as
in-service\) type updates to indicate this. For reboot-required updates, the
hosted application pods are automatically relocated to an alternate host as
part of the update procedure, prior to applying the update and rebooting the
host.
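The classification can be read from the **RR** column printed by :command:`sw-patch query`. The sketch below assumes that column holds ``N`` for in-service updates, as in the sample output elsewhere in this guide, and ``Y`` for reboot-required updates; the ``Y`` value, the function name, and the patch ID ``EXAMPLE_RR_PATCH`` are assumptions made for illustration.

```python
# Illustrative sample; EXAMPLE_RR_PATCH is a made-up patch ID.
SAMPLE = """\
Patch ID              RR  Release  Patch State
====================  ==  =======  ===========
INSVC_HORIZON_SYSINV  N   20.04    Available
EXAMPLE_RR_PATCH      Y   20.04    Available
"""


def classify_updates(output):
    """Map each patch ID to 'in-service' or 'reboot-required'."""
    result = {}
    for line in output.splitlines():
        parts = line.split()
        # Data rows have four values with Y or N in the second column;
        # this also skips the header, the ruler, and any echoed command.
        if len(parts) != 4 or parts[1] not in ("Y", "N"):
            continue
        patch_id, rr = parts[0], parts[1]
        result[patch_id] = "reboot-required" if rr == "Y" else "in-service"
    return result


if __name__ == "__main__":
    for patch_id, kind in classify_updates(SAMPLE).items():
        print(patch_id, kind)
```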

View File

@ -3,6 +3,49 @@
Kubernetes
==========
------------
Introduction
------------
.. toctree::
:maxdepth: 1
software-updates-and-upgrades-software-updates
software-upgrades
-----------------------
Manual software updates
-----------------------
.. toctree::
:maxdepth: 1
managing-software-updates
in-service-versus-reboot-required-software-updates
identifying-the-software-version-and-update-level-using-horizon
identifying-the-software-version-and-update-level-using-the-cli
populating-the-storage-area
update-status-and-lifecycle
installing-software-updates-before-initial-commissioning
installing-reboot-required-software-updates-using-horizon
installing-reboot-required-software-updates-using-the-cli
installing-in-service-software-update-using-horizon
installing-in-service-software-updates-using-the-cli
removing-reboot-required-software-updates
software-update-space-reclamation
reclaiming-disk-space
----------------------------
Orchestrated Software Update
----------------------------
.. toctree::
:maxdepth: 1
update-orchestration-overview
configuring-update-orchestration
update-orchestration-cli
---------------------------------
Manual Kubernetes Version Upgrade
---------------------------------
@ -27,3 +70,52 @@ Kubernetes Version Upgrade Cloud Orchestration
configuring-kubernetes-update-orchestration
handling-kubernetes-update-orchestration-failures
----------------------------------
Manual Platform components upgrade
----------------------------------
.. toctree::
:maxdepth: 1
manual-upgrade-overview
******************
All-in-one Simplex
******************
.. toctree::
:maxdepth: 1
upgrading-all-in-one-simplex
aborting-simplex-system-upgrades
******************
All-in-one Duplex
******************
.. toctree::
:maxdepth: 1
upgrading-all-in-one-duplex-or-standard
overview-of-upgrade-abort-procedure
******************
Roll back upgrades
******************
.. toctree::
:maxdepth: 1
rolling-back-a-software-upgrade-before-the-second-controller-upgrade
rolling-back-a-software-upgrade-after-the-second-controller-upgrade
---------------------------------------
Orchestrated Platform component upgrade
---------------------------------------
.. toctree::
:maxdepth: 1
orchestration-upgrade-overview
performing-an-orchestrated-upgrade
performing-an-orchestrated-upgrade-using-the-cli

View File

@ -0,0 +1,81 @@
.. jfc1552920636790
.. _installing-in-service-software-update-using-horizon:
================================================
Install In-Service Software Update Using Horizon
================================================
The procedure for applying an in-service update is similar to that of a
reboot-required update, except that the host does not need to be locked and
unlocked as part of applying the update.
.. rubric:: |proc|
.. _installing-in-service-software-update-using-horizon-steps-x1b-qnv-vw:
#. Log in to the Horizon Web interface as the **admin** user.
#. In |prod| Horizon, open the Software Management page.
The Software Management page is available from **Admin** \> **Platform** \>
**Software Management** in the left-hand pane.
#. Select the Patches tab to see the current update status.
The Patches page shows the current status of all updates uploaded to the
system. If there are no updates, an empty Patch Table is displayed.
#. Upload the update \(patch\) file to the update storage area.
Click the **Upload Patch** button to display an upload window from which
you can browse your workstation's file system to select the update file.
Click the **Upload Patch** button once the selection is done.
The update file is transferred to the Active Controller and is copied to
the update storage area, but it has yet to be applied to the cluster. This
is reflected in the Patches page.
#. Apply the update.
Click the **Apply Patch** button associated with the update. Alternatively,
select the update first using the selection boxes on the left, and then
click the **Apply Patches** button at the top. You can use this selection
process to apply all updates, or a selected subset, in a single operation.
The Patches page is updated to report the update to be in the
*Partial-Apply* state.
#. Install the update on **controller-0**.
#. Select the **Hosts** tab.
The **Hosts** tab on the Host Inventory page reflects the new status of
the hosts with respect to the new update state. In this example, the
update only applies to controller software, as can be seen by the
worker host's status field being empty, indicating that it is 'patch
current'.
.. image:: figures/ekn1453233538504.png
#. Next, select the Install Patches option from the **Edit Host** button
associated with **controller-0** to install the update.
A confirmation window is presented giving you a last opportunity to
cancel the operation before proceeding.
#. Repeat steps 6a and 6b above with **controller-1** to install the update
on **controller-1**.
#. Repeat steps 6a and 6b above for the worker and/or storage hosts \(if
present\).
This step does not apply for |prod| Simplex or Duplex systems.
#. Verify the state of the update.
Visit the Patches page again. The update is now in the *Applied* state.
.. rubric:: |result|
The update is now applied, and all affected hosts have been updated.

View File

@ -0,0 +1,131 @@
.. hfj1552920618138
.. _installing-in-service-software-updates-using-the-cli:
=================================================
Install In-Service Software Updates Using the CLI
=================================================
The procedure for applying an in-service update is similar to that of a
reboot-required update, except that the host does not need to be locked and
unlocked as part of applying the update.
.. rubric:: |proc|
#. Upload the update \(patch\).
.. code-block:: none
$ sudo sw-patch upload INSVC_HORIZON_SYSINV.patch
INSVC_HORIZON_SYSINV is now available
#. Confirm that the update is available.
.. code-block:: none
$ sudo sw-patch query
Patch ID RR Release Patch State
==================== == ======= ===========
INSVC_HORIZON_SYSINV N 20.04 Available
#. Check the status of the hosts.
.. code-block:: none
$ sudo sw-patch query-hosts
Hostname IP Address Patch Current Reboot Required Release State
============ ============== ============= =============== ======= =====
worker-0 192.168.204.24 Yes No 20.01 idle
controller-0 192.168.204.3 Yes No 20.01 idle
controller-1 192.168.204.4 Yes No 20.01 idle
#. Ensure the original update files have been deleted from the root drive.
After they are uploaded to the storage area, the original files are no
longer required. You must use the command-line interface to delete them, in
order to ensure enough disk space to complete the installation.
.. code-block:: none
$ rm </path/patchfile>
.. caution::
If the original files are not deleted before the updates are applied,
the installation may fail due to a full disk.
#. Apply the update \(patch\).
.. code-block:: none
$ sudo sw-patch apply INSVC_HORIZON_SYSINV
INSVC_HORIZON_SYSINV is now in the repo
The update state transitions to Partial-Apply:
.. code-block:: none
$ sudo sw-patch query
Patch ID RR Release Patch State
==================== == ======= =============
INSVC_HORIZON_SYSINV N 20.04 Partial-Apply
As it is an in-service update, the hosts report that they are not 'patch
current', but they do not require a reboot.
.. code-block:: none
$ sudo sw-patch query-hosts
Hostname IP Address Patch Current Reboot Required Release State
============ ============== ============= =============== ======= =====
worker-0 192.168.204.24 No No 20.04 idle
controller-0 192.168.204.3 No No 20.04 idle
controller-1 192.168.204.4 No No 20.04 idle
#. Install the update on controller-0.
.. code-block:: none
$ sudo sw-patch host-install controller-0
.............
Installation was successful.
#. Query the hosts to check status.
.. code-block:: none
$ sudo sw-patch query-hosts
Hostname IP Address Patch Current Reboot Required Release State
============ ============== ============= =============== ======= =====
worker-0 192.168.204.24 No No 20.01 idle
controller-0 192.168.204.3 Yes No 20.01 idle
controller-1 192.168.204.4 No No 20.01 idle
The controller-0 host reports it is now 'patch current' and does not
require a reboot, without having been locked or rebooted.
#. Install the update on worker-0 \(and other worker nodes and storage nodes,
if present\).
.. code-block:: none
$ sudo sw-patch host-install worker-0
....
Installation was successful.
You can query the hosts to confirm that all nodes are now 'patch current',
and that the update has transitioned to the Applied state.
.. code-block:: none
$ sudo sw-patch query-hosts
Hostname IP Address Patch Current Reboot Required Release State
============ ============== ============= =============== ======= =====
worker-0 192.168.204.24 Yes No 20.04 idle
controller-0 192.168.204.3 Yes No 20.04 idle
controller-1 192.168.204.4 Yes No 20.04 idle
$ sudo sw-patch query
Patch ID RR Release Patch State
==================== == ======= ===========
INSVC_HORIZON_SYSINV N 20.04 Applied
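The main commands of this procedure can be collected into an ordered list for review before running them by hand. This is a hedged sketch: the function name is hypothetical, the file path in the example is made up, and deriving the patch ID from the file name by dropping the extension is an assumption based on the ``INSVC_HORIZON_SYSINV.patch`` example above.

```python
import os


def in_service_commands(patch_file, hosts):
    """Return the main CLI commands for an in-service update, in order.

    patch_file: path to the update file (hypothetical in the example).
    hosts: host names to install on, in the order to install them.
    """
    # Assumption: the patch ID is the file name without its extension.
    patch_id = os.path.splitext(os.path.basename(patch_file))[0]
    commands = [
        f"sudo sw-patch upload {patch_file}",
        f"rm {patch_file}",  # free disk space once the upload is done
        f"sudo sw-patch apply {patch_id}",
    ]
    commands += [f"sudo sw-patch host-install {host}" for host in hosts]
    # Final verification: hosts are patch current, update is Applied.
    commands += ["sudo sw-patch query-hosts", "sudo sw-patch query"]
    return commands


if __name__ == "__main__":
    plan = in_service_commands(
        "/home/sysadmin/INSVC_HORIZON_SYSINV.patch",  # hypothetical path
        ["controller-0", "controller-1", "worker-0"],
    )
    print("\n".join(plan))
```

The list omits the intermediate status queries shown in the procedure; those remain useful between steps when working interactively.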

View File

@ -0,0 +1,126 @@
.. phg1552920664442
.. _installing-reboot-required-software-updates-using-horizon:
======================================================
Install Reboot-Required Software Updates Using Horizon
======================================================
You can use the Horizon Web interface to upload, delete, apply, and remove
software updates.
.. rubric:: |context|
This section presents an example of a software update workflow using a single
update. The main steps of the procedure are:
.. _installing-reboot-required-software-updates-using-horizon-ul-mbr-wsr-s5:
- Upload the updates.
- Lock the host\(s\).
- Install updates; any unlocked nodes will reject the request.
- Unlock the host\(s\). Unlocking the host\(s\) automatically triggers a
reboot.
.. rubric:: |proc|
.. _installing-reboot-required-software-updates-using-horizon-steps-lnt-14y-hjb:
#. Log in to the Horizon Web interface as the **admin** user.
#. In Horizon, open the Software Management page.
The Software Management page is available from **Admin** \> **Platform** \>
**Software Management** in the left-hand pane.
#. Select the Patches tab to see the current status.
The Patches page shows the current status of all updates uploaded to the
system. If there are no updates, an empty Patch Table is displayed.
#. Upload the update \(patch\) file to the update storage area.
Click the **Upload Patches** button to display an upload window from which
you can browse your workstation's file system to select the update file.
Click the **Upload Patches** button once the selection is done.
The update file is transferred to the Active Controller and is copied to
the storage area, but it has yet to be applied to the cluster. This is
reflected in the Patches page.
#. Apply the update.
Click the **Apply Patch** button associated with the update. Alternatively,
select the update first using the selection boxes on the left, and then
click the **Apply Patches** button at the top. You can use this selection
process to apply all updates, or a selected subset, in a single operation.
The Patches page is updated to report the update to be in the
*Partial-Apply* state.
#. Install the update on **controller-0**.
.. _installing-reboot-required-software-updates-using-horizon-step-N10107-N10028-N1001C-N10001:
#. Select the **Hosts** tab.
The **Hosts** tab on the Host Inventory page reflects the new status of
the hosts with respect to the new update state. As shown below, both
controllers are now reported as not 'patch current' and requiring
reboot.
.. image:: figures/ekn1453233538504.png
#. Transfer active services to the standby controller by selecting the
**Swact Host** option from the **Edit Host** button associated with the
active controller host.
.. note::
Access to Horizon may be lost briefly during the active controller
transition. You may have to log in again.
#. Select the Lock Host option from the **Edit Host** button associated
with **controller-0**.
#. Select the Install Patches option from the **Edit Host** button
associated with **controller-0** to install the update.
A confirmation window is presented giving you a last opportunity to
cancel the operation before proceeding.
Wait for the update install to complete.
#. Select the Unlock Host option from the **Edit Host** button associated
with controller-0.
#. Repeat steps :ref:`6
<installing-reboot-required-software-updates-using-horizon-step-N10107-N10028-N1001C-N10001>`
a to e, with **controller-1** to install the update on **controller-1**.
.. note::
For |prod| Simplex systems, this step does not apply.
#. Repeat steps :ref:`6
<installing-reboot-required-software-updates-using-horizon-step-N10107-N10028-N1001C-N10001>`
a to e, for the worker and/or storage hosts.
.. note::
For |prod| Simplex or Duplex systems, this step does not apply.
#. Verify the state of the update.
Visit the Patches page. The update is now in the Applied state.
.. rubric:: |result|
The update is applied now, and all affected hosts have been updated.
Updates can be removed using the **Remove Patches** button from the Patches
page. The workflow is similar to the one presented in this section, with the
exception that updates are being removed from each host instead of being
applied.

View File

@ -0,0 +1,295 @@
.. ffh1552920650754
.. _installing-reboot-required-software-updates-using-the-cli:
======================================================
Install Reboot-Required Software Updates Using the CLI
======================================================
You can install reboot-required software updates using the CLI.
.. rubric:: |proc|
.. _installing-reboot-required-software-updates-using-the-cli-steps-v1q-vlv-vw:
#. Log in as user **sysadmin** to the active controller and source the script
/etc/platform/openrc to obtain administrative privileges.
#. Verify that the updates are available using the :command:`sw-patch query`
command.
.. parsed-literal::
~(keystone_admin)]$ sudo sw-patch query
Patch ID Patch State
===================== ===========
|pn|-nn.nn_PATCH_0001 Available
|pn|-nn.nn_PATCH_0002 Available
|pn|-nn.nn_PATCH_0003 Available
where *nn.nn* in the update \(patch\) filename is the |prod| release number.
#. Ensure the original update files have been deleted from the root drive.
After the updates are uploaded to the storage area, the original files are
no longer required. You must use the command-line interface to delete them,
in order to ensure enough disk space to complete the installation.
.. code-block:: none
$ rm </path/patchfile>
.. caution::
If the original files are not deleted before the updates are applied,
the installation may fail due to a full disk.
#. Apply the update.
.. parsed-literal::
~(keystone_admin)]$ sudo sw-patch apply |pn|-nn.nn_PATCH_0001
|pn|-nn.nn_PATCH_0001 is now in the repo
where *nn.nn* in the update filename is the |prod-long| release number.
The update is now in the Partial-Apply state, ready for installation from
the software updates repository on the impacted hosts.
#. Apply all available updates in a single operation, for example:
.. parsed-literal::
~(keystone_admin)]$ sudo sw-patch apply --all
|pn|-|pvr|-PATCH_0001 is now in the repo
|pn|-|pvr|-PATCH_0002 is now in the repo
|pn|-|pvr|-PATCH_0003 is now in the repo
In this example, there are three updates ready for installation from the
software updates repository.
#. Query the updating status of all hosts in the cluster.
You can query the updating status of all hosts at any time as illustrated
below.
.. note::
The reported status is the accumulated result of all applied and
removed updates in the software updates repository, and not just the
status due to a particular update.
.. code-block:: none
~(keystone_admin)]$ sudo sw-patch query-hosts
Hostname IP Address Patch Current Reboot Required Release State
============ ============== ============= =============== ======= =====
worker-0 192.168.204.12 Yes No 20.04 idle
controller-0 192.168.204.3 No Yes 20.04 idle
controller-1 192.168.204.4 No Yes 20.04 idle
For each host in the cluster, the following status fields are displayed:
**Patch Current**
Indicates whether there are updates pending for installation or removal
on the host or not. If *Yes*, then all relevant updates in the software
updates repository have been installed on, or removed from, the host
already. If *No*, then there is at least one update in either the
Partial-Apply or Partial-Remove state that has not been applied to the
host.
The **Patch Current** field of the :command:`query-hosts` command will
briefly report “Pending” after you apply or remove an update, until
that host has checked against the repository to see if it is impacted
by the patching operation.
**Reboot Required**
Indicates whether the host must be rebooted or not as a result of one
or more updates that have been either applied or removed, or because it
is not 'patch current'.
**Release**
Indicates the running software release version.
**State**
There are four possible states:
**idle**
In a wait state.
**installing**
Installing \(or removing\) updates.
**install-failed**
The operation failed, either due to an update error or something
killed the process. Check the patching.log on the node in question.
**install-rejected**
The node is unlocked, therefore the request to install has been
rejected. This state persists until there is another install
request, or the node is reset.
Once the state has gone back to idle, the install operation is complete
and you can safely unlock the node.
In this example, **worker-0** is up to date, no updates need to be
installed and no reboot is required. By contrast, the controllers are not
'patch current', and therefore a reboot is required as part of installing
the update.
#. Install all pending updates on **controller-0**.
#. Switch the active controller services.
.. code-block:: none
~(keystone_admin)]$ system host-swact controller-0
Before updating a controller node, you must transfer any active
services running on the host to the other controller. Only then is it
safe to lock the host.
#. Lock the host.
You must lock the target host, whether it is a controller, worker, or
storage host, before installing updates.
.. code-block:: none
~(keystone_admin)]$ system host-lock controller-0
#. Install the update.
.. code-block:: none
~(keystone_admin)]$ sudo sw-patch host-install controller-0
.. note::
You can use the :command:`sudo sw-patch host-install-async`
<hostname> command if you are launching multiple installs in
parallel.
#. Unlock the host.
.. code-block:: none
~(keystone_admin)]$ system host-unlock controller-0
Unlocking the host forces a reset of the host followed by a reboot.
This ensures that the host is restarted in a known state.
All updates are now installed on **controller-0**. Querying the current
update status displays the following information:
.. code-block:: none
~(keystone_admin)]$ sudo sw-patch query-hosts
Hostname IP Address Patch Current Reboot Required Release State
============ ============== ============= =============== ======= =====
compute-0 192.168.204.95 Yes No 20.04 idle
compute-1 192.168.204.63 Yes No 20.04 idle
compute-2 192.168.204.99 Yes No 20.04 idle
compute-3 192.168.204.49 Yes No 20.04 idle
controller-0 192.168.204.3 Yes No 20.04 idle
controller-1 192.168.204.4 Yes No 20.04 idle
storage-0 192.168.204.37 Yes No 20.04 idle
storage-1 192.168.204.90 Yes No 20.04 idle
#. Install all pending updates on **controller-1**.
.. note::
For |prod| Simplex systems, this step does not apply.
Repeat the previous step targeting **controller-1**.
All updates are now installed on **controller-1** as well. Querying the
current updating status displays the following information:
.. code-block:: none
~(keystone_admin)]$ sudo sw-patch query-hosts
Hostname IP Address Patch Current Reboot Required Release State
============ ============== ============= =============== ======= =====
compute-0 192.168.204.95 Yes No 20.04 idle
compute-1 192.168.204.63 Yes No 20.04 idle
compute-2 192.168.204.99 Yes No 20.04 idle
compute-3 192.168.204.49 Yes No 20.04 idle
controller-0 192.168.204.3 Yes No 20.04 idle
controller-1 192.168.204.4 Yes No 20.04 idle
storage-0 192.168.204.37 Yes No 20.04 idle
storage-1 192.168.204.90 Yes No 20.04 idle
#. Install any pending updates for the worker or storage hosts.
.. note::
For |prod| Simplex or Duplex systems, this step does not apply.
When a worker host is locked, all hosted application pods currently
running on it are relocated to another host.
If the **Patch Current** status for a worker or storage host is **No**,
apply the pending updates using the following commands:
.. code-block:: none
~(keystone_admin)]$ system host-lock <hostname>
.. code-block:: none
~(keystone_admin)]$ sudo sw-patch host-install-async <hostname>
.. code-block:: none
~(keystone_admin)]$ system host-unlock <hostname>
where <hostname> is the name of the host \(for example, **worker-0**\).
.. note::
Update installations can be triggered in parallel.
The :command:`sw-patch host-install-async` command \(**install
patches** on the Horizon Web interface\) can be run on all locked
nodes, without waiting for one node to complete the install before
triggering the install on the next. If you can lock the nodes at the
same time, without impacting hosted application services, you can
update them at the same time.
Likewise, you can install an update to the standby controller and a
worker node at the same time. The only restrictions are those of the
lock: You cannot lock both controllers, and you cannot lock a worker
node if you do not have enough free resources to relocate the hosted
applications from it. Also, in a Ceph configuration \(with storage
nodes\), you cannot lock more than one of
controller-0/controller-1/storage-0 at the same time, as these nodes
are running Ceph monitors and you must have at least two in service at
all times.
#. Confirm that all updates are installed and the |prod| is up-to-date.
Use the :command:`sw-patch query` command to verify that all updates are
**Applied**.
.. parsed-literal::
~(keystone_admin)]$ sudo sw-patch query
Patch ID Patch State
========================= ===========
|pn|-nn.nn_PATCH_0001 Applied
where *nn.nn* in the update filename is the |prod| release number.
If the **Patch State** for any update is still shown as **Available** or
**Partial-Apply**, use the **sw-patch query-hosts** command to identify
which hosts are not **Patch Current**, and then apply updates to them as
described in the preceding steps.
.. rubric:: |result|
The |prod| is up to date now. All updates are installed.
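
The host status check used throughout this procedure can also be scripted. The
following is a minimal sketch, not part of |prod|. It assumes the six-column
layout of the :command:`sw-patch query-hosts` output shown above; the function
names and the sample text are illustrative only.

.. code-block:: python

   def parse_query_hosts(text):
       """Parse `sw-patch query-hosts` output into a list of dicts.

       Assumes six whitespace-separated columns per host row.
       """
       fields = ("hostname", "ip", "patch_current", "reboot_required",
                 "release", "state")
       rows = []
       for line in text.splitlines():
           parts = line.split()
           # Skip the prompt, the header, the ==== ruler, and blank lines.
           if len(parts) != len(fields) or parts[0].startswith("="):
               continue
           rows.append(dict(zip(fields, parts)))
       return rows


   def hosts_needing_install(rows):
       """Return the hostnames whose Patch Current column is not 'Yes'."""
       return [r["hostname"] for r in rows if r["patch_current"] != "Yes"]


   # Illustrative sample in which the controllers are not patch current.
   SAMPLE = """\
   Hostname      IP Address      Patch Current  Reboot Required  Release  State
   ============  ==============  =============  ===============  =======  =====
   worker-0      192.168.204.12  Yes            No               20.04    idle
   controller-0  192.168.204.3   No             Yes              20.04    idle
   controller-1  192.168.204.4   No             Yes              20.04    idle
   """

   rows = parse_query_hosts(SAMPLE)
   print(hosts_needing_install(rows))  # ['controller-0', 'controller-1']

An empty result corresponds to every host reporting **Patch Current** as
**Yes**. A host that still reports *Pending* is treated as needing attention.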

View File

@ -0,0 +1,105 @@
.. tla1552920677022
.. _installing-software-updates-before-initial-commissioning:
=====================================================
Install Software Updates Before Initial Commissioning
=====================================================
This section describes installing software updates before you can commission
|prod-long|.
.. rubric:: |context|
This procedure assumes that the software updates to install are available on a
USB flash drive, or from a server reachable by **controller-0**.
.. rubric:: |prereq|
When initially installing the |prod-long| software, you must install the
latest available updates on **controller-0** before running the Ansible
Bootstrap Playbook, and before installing the software on other hosts. This
ensures that:
.. _installing-software-updates-before-initial-commissioning-ul-gsq-1ht-vp:
- The software on **controller-0**, and all other hosts, is up to date when
the cluster comes alive.
- You reduce installation time by avoiding updating the system right after an
out-of-date software installation is complete.
.. rubric:: |proc|
#. Install software on **controller-0**.
Use the |prod-long| bootable ISO image to initialize **controller-0**.
This step takes you to the point where you use the console port to log in
to **controller-0** as user **sysadmin**.
#. Populate the storage area.
Upload the updates from the USB flash drive using the command
:command:`sw-patch upload` or :command:`sw-patch upload-dir` as described
in :ref:`Populating the Storage Area <populating-the-storage-area>`.
#. Delete the update files from the root drive.
After the updates are uploaded to the storage area, the original files are
no longer required. You must delete them to ensure enough disk space to
complete the installation.
.. caution::
If the original files are not deleted before the updates are applied,
the installation may fail due to a full disk.
#. Apply the updates.
Apply the updates using the command :command:`sw-patch apply --all`.
The updates are now in the repository, ready to be installed.
#. Install the updates on the controller.
.. code-block:: none
$ sudo sw-patch install-local
Patch installation is complete.
Please reboot before continuing with configuration.
This command installs all applied updates on **controller-0**.
#. Reboot **controller-0**.
You must reboot the controller to ensure that it is running with the
software fully updated.
.. code-block:: none
$ sudo reboot
#. Bootstrap the system on controller-0.
#. Configure an IP interface.
.. note::
The |prod| software will automatically enable all interfaces and
send out a |DHCP| request, so this may happen automatically if a
|DHCP| Server is present on the network. Otherwise, you must
manually configure an IP interface.
#. Run the Ansible Bootstrap Playbook. This can be run remotely or locally
on controller-0.
.. include:: /_includes/installing-software-updates-before-initial-commissioning.rest
.. rubric:: |result|
Once all hosts in the cluster are initialized, they are all running fully
updated software, and the |prod-long| cluster is up to date.
.. xbooklink From step 1
For details, see :ref:`Install Software on controller-0
<installing-software-on-controller-0>` for your system.

View File

@ -0,0 +1,108 @@
.. kol1552920779041
.. _managing-software-updates:
=======================
Manage Software Updates
=======================
Updates \(also known as patches\) to the system software become available as
needed to address issues associated with a current |prod-long| software
release. Software updates must be uploaded to the active controller and applied
to all required hosts in the cluster.
.. note::
Updating |prod-dc| is distinct from updating other |prod| configurations.
.. xbooklink For information on updating |prod-dc|, see |distcloud-doc|: :ref:`Update
Management for Distributed Cloud
<update-management-for-distributed-cloud>`.
The following elements form part of the software update environment:
**Reboot-Required Software Updates**
Reboot-required updates are typically major updates that require hosts to
be locked during the update process and rebooted to complete the process.
.. note::
When a |prod| host is locked and rebooted for updates, the hosted
application pods are re-located to an alternate host in order to
minimize the impact to the hosted application service.
**In-Service Software Updates**
In-service \(reboot-not-required\) software updates are updates that do
not require the locking and rebooting of hosts. The required |prod|
software is updated and any required |prod| processes are restarted.
Hosted application pods and services are completely unaffected.
**Software Update Commands**
The :command:`sw-patch` command is available on both controllers. It
must be run as root using :command:`sudo`. It provides the user interface
to process the updates, including querying the state of an update, listing
affected hosts, and applying, installing, and removing updates.
**Software Update Storage Area**
A central storage area maintained by the update controller. Software
updates are initially uploaded to the storage area and remain there until
they are deleted.
**Software Update Repository**
A central repository of software updates associated with any updates
applied to the system. This repository is used by all hosts in the cluster
to identify the software updates and rollbacks required on each host.
**Software Update Logs**
The following logs are used to record software update activity:
**patching.log**
This records software update agent activity on each host.
**patching-api.log**
This records user actions that involve software updates, performed
using either the CLI or the REST API.
The overall flow for installing a software update from the command line
interface on a working |prod| cluster is the following:
.. _managing-software-updates-ol-vgf-yzz-jp:
#. Consult the |org| support personnel for details on the availability of new
software updates.
#. Download the software update from the |org| servers to a workstation that
can reach the active controller through the |OAM| network.
#. Copy the software update to the active controller using the cluster's |OAM|
floating IP address as the destination point.
You can use a command such as :command:`scp` to copy the software update.
The software update workflows presented in this document assume that this
step is complete already, that is, they assume that the software update is
already available on the file system of the active controller.
#. Upload the new software update to the storage area.
This step makes the new software update available within the system, but
does not install it to the cluster yet. For all purposes, the software
update is dormant.
#. Apply the software update.
This step adds the updates to the repository, making them visible to all
hosts in the cluster.
#. Install the software updates on each of the affected hosts in the cluster.
This can be done manually or by using update orchestration. For more
information, see :ref:`Update Orchestration Overview
<update-orchestration-overview>`.
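
The flow above can be pictured as a small state machine over the update states
reported by :command:`sw-patch query`. The following sketch is a simplified
model and not the tool's implementation: the action names are hypothetical
labels, the transition out of *Partial-Remove* is an assumption, and the real
tool may permit transitions that this table omits.

.. code-block:: python

   # (current state, action) -> next state.
   # None stands for "not present in the storage area".
   TRANSITIONS = {
       (None, "upload"): "Available",
       ("Available", "apply"): "Partial-Apply",
       ("Partial-Apply", "install-on-all-hosts"): "Applied",
       ("Applied", "remove"): "Partial-Remove",
       # Assumption: once every host has processed the removal, the update
       # returns to the storage area as Available.
       ("Partial-Remove", "install-on-all-hosts"): "Available",
       ("Available", "delete"): None,
   }


   def next_state(state, action):
       """Return the state that follows `action`, or raise ValueError."""
       try:
           return TRANSITIONS[(state, action)]
       except KeyError:
           raise ValueError(
               f"cannot {action!r} an update in state {state!r}") from None


   # Walk the upload, apply, install flow described above.
   state = None
   for action in ("upload", "apply", "install-on-all-hosts"):
       state = next_state(state, action)
       print(action, "->", state)

In this model an update can be deleted only while it is *Available*, which
matches the guidance to delete updates that are no longer required before they
are applied.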
Updating software in the system can be done using the Horizon Web interface or
the command line interface on the active controller. When using Horizon you
upload the software update directly from your workstation using a file browser
window provided by the software update upload facility.
A special case occurs during the initial provisioning of a cluster when you
want to update **controller-0** before the system software is configured. This
can only be done from the command line interface. See :ref:`Install Software
Updates Before Initial Commissioning
<installing-software-updates-before-initial-commissioning>` for details.

View File

@ -0,0 +1,39 @@
.. mzg1592854560344
.. _manual-upgrade-overview:
=======================
Manual Upgrade Overview
=======================
|prod-long| enables you to upgrade the software across your Simplex, Duplex,
Standard, |prod-dc|, and subcloud deployments.
.. note::
Upgrading |prod-dc| is distinct from upgrading other |prod| configurations.
.. xbooklink For information on updating |prod-dc|, see |distcloud-doc|: :ref:`Upgrade
Management <upgrade-management-overview>`.
An upgrade can be performed manually or by the Upgrade Orchestrator, which
automates a rolling install of an update across all of the |prod-long| hosts.
This section describes the manual upgrade procedures.
.. xbooklink For the orchestrated
procedure, see |distcloud-doc|: :ref:`Orchestration Upgrade Overview
<orchestration-upgrade-overview>`.
Before starting the upgrade process, the system must be “patch current,” there
must be no management-affecting alarms present on the system, the new software
load must be imported, and a valid license file for the upgrade must be
installed.
The upgrade procedure for the All-in-One Simplex configuration differs from
the procedure for the All-in-One Duplex and Standard configurations. For more
information, see:
.. _manual-upgrade-overview-ul-bcp-ght-cmb:
- :ref:`Upgrading All-in-One Simplex <upgrading-all-in-one-simplex>`
- :ref:`Upgrading All-in-One Duplex / Standard <upgrading-all-in-one-duplex-or-standard>`

View File

@ -0,0 +1,135 @@
.. bla1593031188931
.. _orchestration-upgrade-overview:
==============================
Upgrade Orchestration Overview
==============================
Upgrade Orchestration automates much of the upgrade procedure, leaving a few
manual steps for operator oversight.
.. contents:: |minitoc|
:local:
:depth: 1
.. note::
Upgrading |prod-dc| is distinct from upgrading other |prod|
configurations.
.. xbooklink For information on updating |prod-dc|, see |distcloud-doc|:
:ref:`Upgrade Management <upgrade-management-overview>`.
.. note::
The upgrade orchestration CLI is :command:`sw-manager`. To use upgrade
orchestration commands, you need administrator privileges. You must log in
to the active controller as user **sysadmin** and source the
/etc/platform/openrc script to obtain administrator privileges. Do not use
**sudo**.
.. code-block:: none
~(keystone_admin)]$ sw-manager upgrade-strategy --help
usage: sw-manager upgrade-strategy [-h] ...
optional arguments:
-h, --help show this help message and exit
Software Upgrade Commands:
create Create a strategy
delete Delete a strategy
apply Apply a strategy
abort Abort a strategy
show Show a strategy
.. _orchestration-upgrade-overview-section-N10029-N10026-N10001:
----------------------------------
Upgrade Orchestration Requirements
----------------------------------
Upgrade orchestration can only be done on a system that meets the following
conditions:
.. _orchestration-upgrade-overview-ul-blp-gcx-ry:
- The system is clear of alarms \(with the exception of the *upgrade in
progress* alarm\).
- All hosts must be unlocked, enabled, and available.
- The system is fully redundant \(two controller nodes available, at least
one complete storage replication group available for systems with Ceph
backend\).
- An upgrade has been started, and controller-1 has been upgraded and is
active.
- No update orchestration strategy exists. An upgrade cannot be orchestrated
while update orchestration is in progress.
- Sufficient free capacity or unused worker resources must be available
across the cluster. A rough calculation is: Required spare capacity \(%\)
= \(Number of hosts to upgrade in parallel divided by the total number of
hosts\) times 100.
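
The rough spare-capacity calculation in the last requirement can be written
out as follows. This is only a worked form of that formula; the function name
is illustrative.

.. code-block:: python

   def required_spare_capacity_percent(parallel_hosts, total_hosts):
       """Spare capacity (%) needed to upgrade `parallel_hosts` at once."""
       if total_hosts <= 0:
           raise ValueError("total_hosts must be positive")
       if not 0 <= parallel_hosts <= total_hosts:
           raise ValueError("parallel_hosts must be between 0 and total_hosts")
       return 100 * parallel_hosts / total_hosts


   # Upgrading 2 of 10 hosts in parallel needs roughly 20% spare capacity.
   print(required_spare_capacity_percent(2, 10))  # 20.0

For example, upgrading hosts one at a time on a four-host system calls for
roughly 25% spare capacity.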
.. _orchestration-upgrade-overview-section-N10081-N10026-N10001:
---------------------------------
The Upgrade Orchestration Process
---------------------------------
Upgrade orchestration can be initiated after the initial controller host has
been manually upgraded and is stable. Upgrade orchestration automatically iterates
through the remaining hosts, installing the new software load on each one:
first the other controller host, then the storage hosts, and finally the worker
hosts. During worker host upgrades, pods are moved to alternate worker hosts
automatically.
The user first creates an upgrade orchestration strategy, or plan, for the
automated upgrade procedure. This customizes the upgrade orchestration, using
parameters to specify:
.. _orchestration-upgrade-overview-ul-eyw-fyr-31b:
- the host types to be upgraded
- whether to upgrade hosts serially or in parallel
Based on these parameters, and the state of the hosts, upgrade orchestration
creates a number of stages for the overall upgrade strategy. Each stage
generally consists of moving pods, locking hosts, installing upgrades, and
unlocking hosts for a subset of the hosts on the system.
After creating the upgrade orchestration strategy, the user can either apply
the entire strategy automatically, or apply individual stages to control and
monitor its progress manually.
Update and upgrade orchestration are mutually exclusive; they perform
conflicting operations. Only a single strategy \(sw-patch or sw-upgrade\) is
allowed to exist at a time. If you need to update during an upgrade, you can
abort/delete the sw-upgrade strategy, and then create and apply a sw-patch
strategy before going back to the upgrade.
Some stages of the upgrade could take a significant amount of time \(hours\).
For example, after upgrading a storage host, re-syncing the OSD data could take
about 30 minutes per TB \(assuming a 500 MB/s sync rate, which is about half
of a 10G infrastructure link\).
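
The estimate above follows from simple arithmetic, sketched below. The sketch
assumes decimal units \(1 TB = 1,000,000 MB\) and a constant sync rate; the
function name is illustrative.

.. code-block:: python

   def resync_minutes(data_tb, rate_mb_per_s=500):
       """Estimate OSD re-sync time in minutes for `data_tb` terabytes."""
       seconds = data_tb * 1_000_000 / rate_mb_per_s
       return seconds / 60


   print(round(resync_minutes(1)))           # 33 (minutes for 1 TB)
   print(round(resync_minutes(4) / 60, 1))   # 2.2 (hours for 4 TB)

At 500 MB/s, one terabyte takes 2000 seconds, or a little over half an hour,
which is where the per-TB figure comes from.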
.. _orchestration-upgrade-overview-section-N10101-N10026-N10001:
------------------------------
Upgrade Orchestration Workflow
------------------------------
The Upgrade Orchestration procedure has several major parts:
.. _orchestration-upgrade-overview-ul-r1k-wzj-wy:
- Manually upgrade controller-1.
- Orchestrate the automatic upgrade of the remaining controller, all the
storage nodes, and all the worker nodes.
- Manually complete the upgrade.

View File

@ -0,0 +1,29 @@
.. yim1593277634652
.. _overview-of-upgrade-abort-procedure:
===================================
Overview of Upgrade Abort Procedure
===================================
You can abort an upgrade procedure if necessary.
There are two cases for aborting an upgrade:
.. _overview-of-upgrade-abort-procedure-ul-q5f-vmz-bx:
- Before controller-0 has been upgraded \(that is, only controller-1 has been
upgraded\): In this case the upgrade can be aborted and the system will
remain in service during the abort.
- After controller-0 has been upgraded \(that is, both controllers have been
upgraded\): In this case the upgrade can only be aborted with a complete
outage and a re-install of all hosts. This would only be done as a last
resort, if there was absolutely no other way to recover the system.
- :ref:`Rolling Back a Software Upgrade Before the Second Controller Upgrade
<rolling-back-a-software-upgrade-before-the-second-controller-upgrade>`
- :ref:`Rolling Back a Software Upgrade After the Second Controller Upgrade
<rolling-back-a-software-upgrade-after-the-second-controller-upgrade>`

View File

@ -0,0 +1,185 @@
.. kad1593196868935
.. _performing-an-orchestrated-upgrade-using-the-cli:
=============================================
Perform an Orchestrated Upgrade Using the CLI
=============================================
The upgrade orchestration CLI is :command:`sw-manager`.
.. rubric:: |context|
.. note::
To use upgrade orchestration commands, you need administrator privileges.
You must log in to the active controller as user **sysadmin** and source the
/etc/platform/openrc script to obtain administrator privileges. Do not use
**sudo**.
The upgrade strategy options are shown in the following output:
.. code-block:: none
~(keystone_admin)]$ sw-manager upgrade-strategy --help
usage: sw-manager upgrade-strategy [-h] ...
optional arguments:
-h, --help show this help message and exit
Software Upgrade Commands:
create Create a strategy
delete Delete a strategy
apply Apply a strategy
abort Abort a strategy
show Show a strategy
You can perform a partially orchestrated upgrade using the CLI. The initial
controller node must be upgraded manually and must be stable before you use
upgrade orchestration to orchestrate the remaining nodes of the |prod|.
.. note::
Management-affecting alarms cannot be ignored at the indicated severity
level or higher by using relaxed alarm rules during an orchestrated upgrade
operation. For a list of management-affecting alarms, see |fault-doc|:
:ref:`Alarm Messages <alarm-messages-overview>`. To display
management-affecting active alarms, use the following command:
.. code-block:: none
~(keystone_admin)]$ fm alarm-list --mgmt_affecting
During an orchestrated upgrade, the following alarms are ignored even when
strict restrictions are selected:
- 900.005, Upgrade in progress
- 900.201, Software upgrade auto apply in progress
.. _performing-an-orchestrated-upgrade-using-the-cli-ul-qhy-q1p-v1b:
.. rubric:: |prereq|
See :ref:`Upgrading All-in-One Duplex / Standard
<upgrading-all-in-one-duplex-or-standard>` to manually upgrade the initial
controller node before doing the upgrade orchestration described below to
upgrade the remaining nodes of the |prod|.
.. rubric:: |proc|
.. _performing-an-orchestrated-upgrade-using-the-cli-steps-e45-kh5-sy:
#. Create an upgrade strategy using the :command:`sw-manager upgrade-strategy`
command.
.. code-block:: none
~(keystone_admin)]$ sw-manager upgrade-strategy create
Create an upgrade strategy, specifying the following parameters:
- storage-apply-type:
- serial \(default\): storage hosts will be upgraded one at a time
- parallel: storage hosts will be upgraded in parallel, ensuring that
only one storage node in each replication group is upgraded at a
time.
- ignore: storage hosts will not be upgraded
- worker-apply-type:
**serial** \(default\)
Worker hosts will be upgraded one at a time.
**ignore**
Worker hosts will not be upgraded.
- Alarm Restrictions
This option lets you determine how to handle alarm restrictions based
on the management affecting statuses of any existing alarms, which
takes into account the alarm type as well as the alarm's current
severity. If set to relaxed, orchestration will be allowed to proceed
if there are no management affecting alarms present.
Performing management actions without specifically relaxing the alarm
checks will still fail if there are any alarms present in the system
\(except for a small list of basic alarms for the orchestration actions
such as an upgrade operation in progress alarm not impeding upgrade
orchestration\).
You can use the CLI command :command:`fm alarm-list --mgmt_affecting`
to view the alarms that are management affecting.
**Strict**
Maintains alarm restrictions.
**Relaxed**
Relaxes the usual alarm restrictions and allows the action to
proceed if there are no alarms present in the system with a severity
equal to or greater than its management affecting severity.
The upgrade strategy consists of one or more stages, which consist of one
or more hosts to be upgraded at the same time. Each stage will be split
into steps \(for example, query-alarms, lock-hosts, upgrade-hosts\).
Following are some notes about stages:
- Controller-0 is upgraded first, followed by storage hosts and then
worker hosts.
- Worker hosts with no application pods are upgraded before worker hosts
with application pods.
- Pods will be relocated off each worker host before it is upgraded.
- The final step in each stage is one of:
**system-stabilize**
This waits for a period of time \(up to several minutes\) and
ensures that the system is free of alarms. This ensures that we do
not continue to upgrade more hosts if the upgrade has caused an
issue resulting in an alarm.
**wait-data-sync**
This waits for a period of time \(up to many hours\) and ensures
that data synchronization has completed after the upgrade of a
controller or storage node.
Examine the upgrade strategy. Pay careful attention to:
- The sets of hosts that will be upgraded together in each stage.
- The sets of pods that will be impacted in each stage.
.. note::
It is likely that as each stage is applied, pods will be relocated
to worker hosts that have not yet been upgraded. That means that
later stages will be relocating more pods than those originally
listed in the upgrade strategy. The upgrade strategy is NOT
updated, but any additional pods on each worker host will be
relocated before it is upgraded.
#. Apply the upgrade-strategy. You can optionally apply a single stage at a time.
.. code-block:: none
~(keystone_admin)]$ sw-manager upgrade-strategy apply
While an upgrade-strategy is being applied, it can be aborted. This results
in:
- The current step will be allowed to complete.
- If necessary an abort phase will be created and applied, which will
attempt to unlock any hosts that were locked.
After an upgrade-strategy has been applied \(or aborted\) it must be
deleted before another upgrade-strategy can be created. If an
upgrade-strategy application fails, you must address the issue that caused
the failure, then delete/re-create the strategy before attempting to apply
it again.
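
The effect of the alarm restrictions setting described in this procedure can be
summarized in a few lines. The following is a sketch of the documented
behavior, not the orchestrator's code: alarms are modeled as hypothetical
``(alarm_id, management_affecting)`` pairs, and the alarm IDs other than
900.005 and 900.201 are placeholders.

.. code-block:: python

   # Alarm IDs listed above as ignored even under strict restrictions.
   ALWAYS_IGNORED = {"900.005", "900.201"}


   def may_proceed(alarms, alarm_restrictions="strict"):
       """Decide whether orchestration may proceed with the given alarms.

       alarms: iterable of (alarm_id, management_affecting) pairs.
       """
       relevant = [(alarm_id, affecting) for alarm_id, affecting in alarms
                   if alarm_id not in ALWAYS_IGNORED]
       if alarm_restrictions == "strict":
           # Any remaining alarm blocks orchestration.
           return not relevant
       if alarm_restrictions == "relaxed":
           # Only management-affecting alarms block orchestration.
           return not any(affecting for _, affecting in relevant)
       raise ValueError(f"unknown alarm restriction: {alarm_restrictions!r}")


   active = [("900.005", True), ("example.minor", False)]
   print(may_proceed(active, "strict"))   # False
   print(may_proceed(active, "relaxed"))  # True

In practice, use :command:`fm alarm-list --mgmt_affecting` to see which active
alarms would block a relaxed strategy.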

View File

@ -0,0 +1,169 @@
.. sab1593196680415
.. _performing-an-orchestrated-upgrade:
===============================
Perform an Orchestrated Upgrade
===============================
You can perform a partially orchestrated upgrade of a |prod| system using the
CLI and Horizon Web interface. The initial controller node must be upgraded
manually and must be stable before you use upgrade orchestration to orchestrate
the remaining nodes of the |prod|.
.. rubric:: |context|
.. note::
Management-affecting alarms cannot be ignored at the indicated severity
level or higher by using relaxed alarm rules during an orchestrated upgrade
operation. For a list of management-affecting alarms, see |fault-doc|:
:ref:`Alarm Messages <alarm-messages-overview>`. To display
management-affecting active alarms, use the following command:
.. code-block:: none
~(keystone_admin)]$ fm alarm-list --mgmt_affecting
During an orchestrated upgrade, the following alarms are ignored even when
strict restrictions are selected:
- 750.006, Automatic application re-apply is pending
- 900.005, Upgrade in progress
- 900.201, Software upgrade auto apply in progress
.. _performing-an-orchestrated-upgrade-ul-qhy-q1p-v1b:
.. rubric:: |prereq|
See :ref:`Upgrading All-in-One Duplex / Standard
<upgrading-all-in-one-duplex-or-standard>` to manually upgrade the initial
controller node before doing the upgrade orchestration described below to
upgrade the remaining nodes of the |prod| system.
.. rubric:: |proc|
.. _performing-an-orchestrated-upgrade-steps-e45-kh5-sy:
#. Select **Platform** \> **Software Management**, then select the **Upgrade
Orchestration** tab.
#. Click the **Create Strategy** button.
The Create Strategy dialog appears.
#. Create an upgrade strategy by specifying settings for the parameters in the
Create Strategy dialog box.
Create an upgrade strategy, specifying the following parameters:
- storage-apply-type:
**serial** \(default\)
Storage hosts will be upgraded one at a time.
**parallel**
Storage hosts will be upgraded in parallel, ensuring that only one
storage node in each replication group is upgraded at a time.
**ignore**
Storage hosts will not be upgraded.
- worker-apply-type:
**serial** \(default\)
Worker hosts will be upgraded one at a time.
**parallel**
Worker hosts will be upgraded in parallel, ensuring that:
- At most max-parallel-worker-hosts \(see below\) worker hosts
will be upgraded at the same time.
- At most half of the hosts in a host aggregate will be upgraded
at the same time.
- Worker hosts with no application pods are upgraded before
worker hosts with application pods.
**ignore**
Worker hosts will not be upgraded.
**max-parallel-worker-hosts**
Specify the maximum number of worker hosts to upgrade in parallel
\(minimum: 2, maximum: 10\).
**alarm-restrictions**
This option lets you specify how upgrade orchestration behaves when
alarms are present.
You can use the CLI command :command:`fm alarm-list
--mgmt_affecting` to view the alarms that are management affecting.
**Strict**
The default strict option will result in upgrade orchestration
failing if there are any alarms present in the system \(except
for a small list of alarms\).
**Relaxed**
This option allows orchestration to proceed if alarms are
present, as long as none of these alarms are management
affecting.
#. Click **Create Strategy** to save the upgrade orchestration strategy.
The upgrade strategy consists of one or more stages, which consist of one
or more hosts to be upgraded at the same time. Each stage will be split
into steps \(for example, query-alarms, lock-hosts, upgrade-hosts\).
Following are some notes about stages:
- Controller-0 is upgraded first, followed by storage hosts and then
worker hosts.
- Worker hosts with no application pods are upgraded before worker hosts
with application pods.
- Pods will be moved off each worker host before it is upgraded.
- The final step in each stage is one of:
**system-stabilize**
This waits for a period of time \(up to several minutes\) and
ensures that the system is free of alarms. This ensures that we do
not continue to upgrade more hosts if the upgrade has caused an
issue resulting in an alarm.
**wait-data-sync**
This waits for a period of time \(up to many hours\) and ensures
that data synchronization has completed after the upgrade of a
controller or storage node.
Examine the upgrade strategy. Pay careful attention to:
- The sets of hosts that will be upgraded together in each stage.
- The sets of pods that will be impacted in each stage.
.. note::
It is likely that as each stage is applied, application pods will
be relocated to worker hosts that have not yet been upgraded. That
means that later stages will be migrating more pods than those
originally listed in the upgrade strategy. The upgrade strategy is
NOT updated, but any additional pods on each worker host will be
relocated before it is upgraded.
#. Apply the upgrade-strategy. You can optionally apply a single stage at a
time.
While an upgrade-strategy is being applied, it can be aborted. This results
in:
- The current step will be allowed to complete.
- If necessary an abort phase will be created and applied, which will
attempt to unlock any hosts that were locked.
After an upgrade-strategy has been applied \(or aborted\) it must be
deleted before another upgrade-strategy can be created. If an
upgrade-strategy application fails, you must address the issue that caused
the failure, then delete/re-create the strategy before attempting to apply
it again.
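
To build intuition for how the parallel worker constraints described in this
procedure shape the stages of a strategy, consider the following sketch. It is
an illustrative greedy grouping under stated assumptions, not the algorithm
that upgrade orchestration actually uses: hosts are hypothetical dictionaries,
each host belongs to at most one host aggregate, and an aggregate always
allows at least one host per stage.

.. code-block:: python

   from collections import Counter


   def _batch(hosts, max_parallel, aggregate_sizes):
       """Greedily group hosts into stages that respect both limits."""
       stages, pending = [], list(hosts)
       while pending:
           stage, in_stage, deferred = [], Counter(), []
           for host in pending:
               agg = host["aggregate"]
               # At most half of an aggregate per stage (assumed minimum: 1).
               limit = max(1, aggregate_sizes[agg] // 2) if agg else max_parallel
               if len(stage) < max_parallel and in_stage[agg] < limit:
                   stage.append(host["name"])
                   in_stage[agg] += 1
               else:
                   deferred.append(host)
           stages.append(stage)
           pending = deferred
       return stages


   def plan_worker_stages(hosts, max_parallel):
       """Plan stages: hosts without pods first, then hosts with pods."""
       if not 2 <= max_parallel <= 10:
           raise ValueError("max-parallel-worker-hosts must be between 2 and 10")
       sizes = Counter(h["aggregate"] for h in hosts if h["aggregate"])
       empty = [h for h in hosts if h["pods"] == 0]
       busy = [h for h in hosts if h["pods"] > 0]
       return _batch(empty, max_parallel, sizes) + _batch(busy, max_parallel, sizes)


   workers = [
       {"name": "worker-0", "aggregate": "agg-a", "pods": 3},
       {"name": "worker-1", "aggregate": "agg-a", "pods": 1},
       {"name": "worker-2", "aggregate": None, "pods": 0},
       {"name": "worker-3", "aggregate": None, "pods": 2},
   ]
   print(plan_worker_stages(workers, max_parallel=2))
   # [['worker-2'], ['worker-0', 'worker-3'], ['worker-1']]

In the example, **worker-2** goes first because it hosts no pods, and
**worker-0** and **worker-1** land in different stages because they share a
two-host aggregate.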

View File

@ -0,0 +1,74 @@
.. fek1552920702618
.. _populating-the-storage-area:
=========================
Populate the Storage Area
=========================
Software updates \(patches\) have to be uploaded to the |prod| storage area
before they can be applied.
.. rubric:: |proc|
#. Log in as **sysadmin** to the active controller.
#. Upload the update file to the storage area.
.. parsed-literal::
$ sudo sw-patch upload /home/sysadmin/patches/|pn|-CONTROLLER_<nn.nn>_PATCH_0001.patch
|pn|-CONTROLLER_<nn.nn>_PATCH_0001 is now available
where *nn.nn* in the update file name is the |prod| release number.
This example uploads a single update to the storage area. You can specify
multiple update files on the same command line, separating their names with
spaces.
Alternatively, you can upload all update files stored in a directory using
a single command, as illustrated in the following example:
.. code-block:: none
$ sudo sw-patch upload-dir /home/sysadmin/patches
The update is now available in the storage area, but has not been applied
to the update repository or installed to the nodes in the cluster.
#. Verify the status of the update.
.. code-block:: none
$ sudo sw-patch query
The update state is now *Available*, indicating that it is included in the
storage area. Further details about an update can be retrieved by passing
its update ID to the :command:`sw-patch show` command.
#. Delete the update files from the root drive.
After the updates are uploaded to the storage area, the original files are
no longer required. You must delete them to ensure there is enough disk
space to complete the installation.
.. code-block:: none
$ rm /home/sysadmin/patches/*
.. caution::
If the original files are not deleted before the updates are applied,
the installation may fail due to a full disk.
.. rubric:: |postreq|
When an update in the *Available* state is no longer required, you can delete
it using the following command:
.. parsed-literal::
$ sudo sw-patch delete |pn|-|pvr|-PATCH_0001
The update to delete from the storage area is identified by the update
\(patch\) ID reported by the :command:`sw-patch query` command. You can provide
multiple patch IDs to the delete command, separating their names by spaces.
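The *Patch State* column reported by :command:`sw-patch query` can also be
checked from a script before deleting anything. The following is a minimal
sketch and not part of |prod|: it assumes the plain-text table layout shown in
this guide, and the helper names ``parse_patch_states`` and ``deletable`` as
well as the sample patch IDs are hypothetical.

.. code-block:: python

   def parse_patch_states(query_output):
       """Map each patch ID to its state from `sw-patch query` text output."""
       states = {}
       in_table = False
       for line in query_output.splitlines():
           fields = line.split()
           if not fields:
               continue
           # The row of '=' characters separates the header from the data rows.
           if set(line.strip()) <= {"=", " "}:
               in_table = True
               continue
           if in_table:
               # First column is the patch ID, last column is the patch state.
               states[fields[0]] = fields[-1]
       return states


   def deletable(states):
       """Return the patch IDs that are in the Available state."""
       return sorted(pid for pid, state in states.items() if state == "Available")


   # Hypothetical sample output, for illustration only.
   SAMPLE_QUERY_OUTPUT = """\
   Patch ID               Patch State
   =====================  ===========
   STLX-00004-PATCH_0001  Available
   STLX-00004-PATCH_0002  Applied
   """

   print(deletable(parse_patch_states(SAMPLE_QUERY_OUTPUT)))

Only updates reported as *Available* are returned, which matches the condition
stated above for using :command:`sw-patch delete`.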

View File

@ -0,0 +1,95 @@
.. ngk1552920570137
.. _reclaiming-disk-space:
==================
Reclaim Disk Space
==================
You can free up and reclaim disk space taken by previous updates once a newer
version of an update has been committed to the system.
.. rubric:: |proc|
#. Run the :command:`query-dependencies` command to show a list of updates
that are required by the specified update \(patch\), including itself.
.. code-block:: none
sw-patch query-dependencies [ --recursive ] <patch-id>
The **--recursive** option crawls through those dependencies to return a
list of all the updates in the specified update's dependency tree. This
query is used by the :command:`commit` command to calculate the set of
updates to be committed. For example:
.. parsed-literal::
controller-0:/home/sysadmin# sw-patch query-dependencies |pn|-|pvr|-PATCH_0004
|pn|-|pvr|-PATCH_0002
|pn|-|pvr|-PATCH_0003
|pn|-|pvr|-PATCH_0004
controller-0:/home/sysadmin# sw-patch query-dependencies |pn|-|pvr|-PATCH_0004 --recursive
|pn|-|pvr|-PATCH_0001
|pn|-|pvr|-PATCH_0002
|pn|-|pvr|-PATCH_0003
|pn|-|pvr|-PATCH_0004
#. Run the :command:`sw-patch commit` command.
.. code-block:: none
sw-patch commit [ --dry-run ] [ --all ] [ --release ] [ <patch-id> … ]
The :command:`sw-patch commit` command allows you to specify a set of
updates to be committed. The commit set is calculated by querying the
dependencies of each specified update.
The **--all** option, without the **--release** option, commits all updates
of the currently running release. When two releases are on the system, use
the **--release** option to select a particular release's updates when
committing all updates for the non-running release. The **--dry-run**
option shows the list of updates to be committed and how much disk space
will be freed up. This information is also shown without the **--dry-run**
option, before prompting to continue with the operation. An update can only
be committed once it has been fully applied to the system, and it cannot be
removed afterward.
Following are examples that show the command usage.
The following command lists the status of all updates that are in an
APPLIED state.
.. code-block:: none
controller-0:/home/sysadmin# sw-patch query
The following command commits the updates.
.. parsed-literal::
controller-0:/home/sysadmin# sw-patch commit |pvr|-PATCH_0001 |pvr|-PATCH_0002
The following patches will be committed:
|pvr|-PATCH_0001
|pvr|-PATCH_0002
This commit operation would free 2186.31 MiB
WARNING: Committing a patch is an irreversible operation. Committed patches
cannot be removed.
Would you like to continue? [y/N]: y
The patches have been committed.
The following command shows the updates now in the COMMITTED state.
.. parsed-literal::
controller-0:/home/sysadmin# sw-patch query
Patch ID          RR     Release   Patch State
================  =====  ========  ===========
|pvr|-PATCH_0001  N      |pvr|     Committed
|pvr|-PATCH_0002  Y      |pvr|     Committed
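The difference between the plain and **--recursive** dependency queries can be
pictured as a graph walk. The following sketch is illustrative only and does
not reflect how :command:`sw-patch` is implemented. The ``REQUIRES`` map is a
hypothetical dependency table chosen to reproduce the example output in this
procedure (the dependency of PATCH_0002 on PATCH_0001 is an assumption), and
``commit_set`` assumes that the commit set is the union of the recursive
dependencies of each specified update.

.. code-block:: python

   def query_dependencies(requires, patch_id, recursive=False):
       """Return the updates required by patch_id, including itself."""
       result = {patch_id}
       pending = list(requires.get(patch_id, ()))
       while pending:
           dep = pending.pop()
           if dep in result:
               continue
           result.add(dep)
           if recursive:
               # Follow the dependencies of each dependency as well.
               pending.extend(requires.get(dep, ()))
       return sorted(result)


   def commit_set(requires, patch_ids):
       """Union of the recursive dependencies of every specified update."""
       selected = set()
       for patch_id in patch_ids:
           selected.update(query_dependencies(requires, patch_id, recursive=True))
       return sorted(selected)


   # Hypothetical dependency table: patch ID -> directly required patch IDs.
   REQUIRES = {
       "PATCH_0004": ["PATCH_0002", "PATCH_0003"],
       "PATCH_0002": ["PATCH_0001"],
   }

   print(query_dependencies(REQUIRES, "PATCH_0004"))
   print(query_dependencies(REQUIRES, "PATCH_0004", recursive=True))

Under these assumptions, the plain query returns PATCH_0002 through PATCH_0004,
and the recursive query additionally returns PATCH_0001.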

View File

@ -0,0 +1,117 @@
.. scm1552920603294
.. _removing-reboot-required-software-updates:
=======================================
Remove Reboot-Required Software Updates
=======================================
Updates in the *Applied* or *Partial-Apply* states can be removed if necessary,
for example, when they trigger undesired or unplanned effects on the cluster.
.. rubric:: |context|
Rolling back updates is conceptually identical to installing updates. A
roll-back operation can be commanded for an update in either the *Applied* or
the *Partial-Apply* states. As the update is removed, it goes through the
following state transitions:
**Applied or Partial-Apply to Partial-Remove**
An update in the *Partial-Remove* state indicates that it has been removed
from zero or more, but not from all, the applicable hosts.
Use the command :command:`sw-patch remove` to trigger this transition.
**Partial-Remove to Available**
Use the command :command:`sudo sw-patch host-install-async` <hostname>
repeatedly, targeting each of the applicable hosts in the cluster. The
transition to the *Available* state is complete when the update is removed
from all target hosts. The update remains in the update storage area as if
it had just been uploaded.
.. note::
The command :command:`sudo sw-patch host-install-async` <hostname> both
installs and removes updates as necessary.
The following example describes removing an update that applies only to the
controllers. Updates can also be removed using the Horizon Web interface,
as discussed in :ref:`Install Reboot-Required Software Updates Using
Horizon <installing-reboot-required-software-updates-using-horizon>`.
.. rubric:: |proc|
#. Log in as Keystone user **admin** to the active controller.
#. Verify the state of the update.
.. parsed-literal::
~(keystone_admin)]$ sudo sw-patch query
Patch ID                   Patch State
=========================  ===========
|pn|-|pvr|-PATCH_0001      Applied
In this example the update is listed in the *Applied* state, but it could
be in the *Partial-Apply* state as well.
#. Remove the update.
.. parsed-literal::
~(keystone_admin)]$ sudo sw-patch remove |pn|-|pvr|-PATCH_0001
|pn|-|pvr|-PATCH_0001 has been removed from the repo
The update is now in the *Partial-Remove* state, ready to be removed from
the impacted hosts where it was already installed.
#. Query the updating status of all hosts in the cluster.
.. code-block:: none
~(keystone_admin)]$ sudo sw-patch query-hosts
Hostname      IP Address       Patch Current  Reboot Required  Release  State
============  ===============  =============  ===============  =======  =====
compute-0     192.168.204.179  Yes            No               20.04    idle
compute-1     192.168.204.173  Yes            No               20.04    idle
controller-0  192.168.204.3    No             No               20.04    idle
controller-1  192.168.204.4    No             No               20.04    idle
storage-0     192.168.204.213  Yes            No               20.04    idle
storage-1     192.168.204.181  Yes            No               20.04    idle
In this example, the controllers have updates ready to be removed, and
therefore must be rebooted.
#. Remove all pending-for-removal updates from **controller-0**.
#. Swact controller services away from controller-0.
#. Lock controller-0.
#. Run the updating \(patching\) sequence.
.. code-block:: none
~(keystone_admin)]$ sudo sw-patch host-install-async <controller-0>
#. Unlock controller-0.
#. Remove all pending-for-removal updates from controller-1.
#. Swact controller services away from controller-1.
#. Lock controller-1.
#. Run the updating sequence.
.. code-block:: none
~(keystone_admin)]$ sudo sw-patch host-install-async <controller-1>
#. Unlock controller-1.
.. rubric:: |result|
The cluster is up to date now. All updates have been removed, and the update
|pn|-|pvr|-PATCH_0001 can be deleted from the storage area if necessary.
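The state transitions described in this section can be summarized as a small
state machine. This is a sketch for illustration, assuming only the
transitions named above; the ``PatchState`` enumeration and both function
names are hypothetical and are not part of the :command:`sw-patch` tooling.

.. code-block:: python

   from enum import Enum


   class PatchState(Enum):
       AVAILABLE = "Available"
       PARTIAL_APPLY = "Partial-Apply"
       APPLIED = "Applied"
       PARTIAL_REMOVE = "Partial-Remove"


   def after_remove(state):
       """State after `sw-patch remove`; valid from Applied or Partial-Apply."""
       if state in (PatchState.APPLIED, PatchState.PARTIAL_APPLY):
           return PatchState.PARTIAL_REMOVE
       raise ValueError(f"cannot remove an update in state {state.value}")


   def after_host_install(state, hosts_still_installed):
       """State after one host-install-async run during a removal.

       hosts_still_installed is the number of applicable hosts that still
       carry the update once this run has finished.
       """
       if state is not PatchState.PARTIAL_REMOVE:
           raise ValueError("no removal in progress")
       if hosts_still_installed == 0:
           return PatchState.AVAILABLE
       return PatchState.PARTIAL_REMOVE


   state = after_remove(PatchState.APPLIED)
   state = after_host_install(state, hosts_still_installed=1)  # controller-0 done
   state = after_host_install(state, hosts_still_installed=0)  # controller-1 done
   print(state.value)

In the two-controller example of this procedure, the update stays in
*Partial-Remove* after controller-0 is processed and becomes *Available* only
after controller-1 is processed as well.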

View File

@ -0,0 +1,129 @@
.. eiu1593277809293
.. _rolling-back-a-software-upgrade-after-the-second-controller-upgrade:
================================================================
Roll Back a Software Upgrade After the Second Controller Upgrade
================================================================
After the second controller is upgraded, you can still roll back a software
upgrade; however, the rollback will impact the hosting of applications.
.. rubric:: |proc|
#. Run the :command:`upgrade-abort` command to abort the upgrade.
.. code-block:: none
$ system upgrade-abort
Once this is done there is no going back; the upgrade must be completely
aborted.
The following state applies when you run this command.
- aborting-reinstall:
- State entered when :command:`system upgrade-abort` is executed
after upgrading controller-0.
- Remain in this state until the abort is completed.
#. Make controller-1 active.
.. code-block:: none
$ system host-swact controller-0
#. Lock controller-0.
.. code-block:: none
$ system host-lock controller-0
#. Wipe the disk and power down all storage \(if applicable\) and worker hosts.
.. note::
Skip this step if doing this procedure on a |prod| Duplex system.
#. Execute :command:`wipedisk` from the shell on each storage or worker
host.
#. Power down each host.
#. Lock all storage \(if applicable\) and worker hosts.
.. note::
Skip this step if doing this procedure on a |prod| Duplex system.
.. code-block:: none
$ system host-lock <hostID>
#. Downgrade controller-0.
.. code-block:: none
$ system host-downgrade controller-0
The host is re-installed with the previous release load.
#. Unlock controller-0.
.. code-block:: none
$ system host-unlock controller-0
#. Swact to controller-0.
.. code-block:: none
$ system host-swact controller-1
Swacting back to controller-0 will switch back to using the previous
release databases, which were frozen at the time of the swact to
controller-1. This is essentially the same result as a system restore.
#. Lock and downgrade controller-1.
.. code-block:: none
$ system host-lock controller-1
$ system host-downgrade controller-1
The host is re-installed with the previous release load.
#. Unlock controller-1.
.. code-block:: none
$ system host-unlock controller-1
#. Power up and unlock the storage hosts one at a time \(if using a Ceph
storage backend\). The hosts are re-installed with the previous release load.
.. note::
Skip this step if doing this procedure on a |prod| Duplex system.
#. Power up and unlock the worker hosts one at a time.
.. note::
Skip this step if doing this procedure on a |prod| Duplex system.
The hosts are re-installed with the previous release load. As each worker
host goes online, application pods will be automatically recovered by the
system.
#. Complete the upgrade.
.. code-block:: none
$ system upgrade-complete
This cleans up the upgrade release, configuration, databases, and so forth.
#. Delete the upgrade release load.
.. code-block:: none
$ system load-delete <loadID>
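The order of the :command:`system` commands in this procedure, and the steps
that are skipped on a Duplex system, can be laid out as follows. This is a
sketch for planning purposes only, and it does not execute anything. The
function name ``rollback_commands`` is hypothetical, the use of
:command:`system host-unlock` for storage and worker hosts is an assumption,
and the manual :command:`wipedisk`, power-down, and power-up actions appear
only as comments.

.. code-block:: python

   def rollback_commands(other_hosts=(), load_id="<loadID>"):
       """Return the ordered `system` commands for this rollback.

       other_hosts lists storage hosts first, then worker hosts. Leave it
       empty for a Duplex system, where those steps are skipped.
       """
       commands = [
           "system upgrade-abort",
           "system host-swact controller-0",  # makes controller-1 active
           "system host-lock controller-0",
       ]
       # Manual step: run wipedisk on each storage/worker host and power it down.
       commands += [f"system host-lock {host}" for host in other_hosts]
       commands += [
           "system host-downgrade controller-0",
           "system host-unlock controller-0",
           "system host-swact controller-1",  # makes controller-0 active again
           "system host-lock controller-1",
           "system host-downgrade controller-1",
           "system host-unlock controller-1",
       ]
       # Manual step: power up each host, then unlock it, one host at a time.
       commands += [f"system host-unlock {host}" for host in other_hosts]
       commands += ["system upgrade-complete", f"system load-delete {load_id}"]
       return commands


   for command in rollback_commands(["storage-0", "compute-0"]):
       print(command)

Calling ``rollback_commands()`` with no arguments yields the shorter Duplex
sequence.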

View File

@ -0,0 +1,68 @@
.. wyr1593277734184
.. _rolling-back-a-software-upgrade-before-the-second-controller-upgrade:
=================================================================
Roll Back a Software Upgrade Before the Second Controller Upgrade
=================================================================
You can perform an in-service abort of an upgrade before the second controller
\(controller-0 in the examples of this procedure\) has been upgraded.
.. rubric:: |proc|
#. Abort the upgrade with the :command:`upgrade-abort` command.
.. code-block:: none
$ system upgrade-abort
The upgrade state is set to aborting. Once this is executed, there is no
canceling; the upgrade must be completely aborted.
The following state applies when you execute this command.
- aborting:
- State entered when :command:`system upgrade-abort` is executed
before upgrading controller-0.
- Remain in this state until the abort is completed.
#. Make controller-0 active.
.. code-block:: none
$ system host-swact controller-1
If controller-1 was active with the new upgrade release, swacting back to
controller-0 will switch back to using the previous release databases,
which were frozen at the time of the swact to controller-1. Any changes to
the system that were made while controller-1 was active will be lost.
#. Lock and downgrade controller-1.
.. code-block:: none
$ system host-lock controller-1
$ system host-downgrade controller-1
The host is re-installed with the previous release load.
#. Unlock controller-1.
.. code-block:: none
$ system host-unlock controller-1
#. Complete the upgrade.
.. code-block:: none
$ system upgrade-complete
#. Delete the newer upgrade release that has been aborted.
.. code-block:: none
$ system load-delete <loadID>

View File

@ -0,0 +1,19 @@
.. qbz1552920585263
.. _software-update-space-reclamation:
=================================
Software Update Space Reclamation
=================================
|prod-long| provides functionality for reclaiming disk space used by older
versions of software updates once newer versions have been committed.
The :command:`sw-patch commit` command allows you to “commit” a set of software
updates, which effectively locks down those updates and makes them unremovable.
In doing so, |prod-long| is then able to delete package files with lower
versions from the storage and repo, keeping only the highest version of each
package in the committed software update set.
.. caution::
This action is irreversible.
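The effect of keeping only the highest version of each package can be sketched
as follows. This is a simplified illustration, not the logic used by
:command:`sw-patch commit`: it assumes purely numeric dotted version strings,
whereas real package version comparison has more rules, and the package names
and the ``keep_highest`` helper are hypothetical.

.. code-block:: python

   def keep_highest(packages):
       """Return {name: highest version} from (name, version) pairs."""

       def version_key(version):
           # Compare "1.10.0" numerically so that it sorts above "1.9.3".
           return tuple(int(part) for part in version.split("."))

       best = {}
       for name, version in packages:
           if name not in best or version_key(version) > version_key(best[name]):
               best[name] = version
       return best


   # Hypothetical package files delivered by a committed set of updates.
   committed = [
       ("pkg-a", "1.2.0"),
       ("pkg-a", "1.10.0"),
       ("pkg-b", "2.0.1"),
       ("pkg-a", "1.9.3"),
   ]
   print(keep_highest(committed))

Every package file that is not selected here corresponds to a lower version
that could be deleted to reclaim space.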

View File

@ -0,0 +1,88 @@
.. lei1552920487053
.. _software-updates-and-upgrades-software-updates:
================
Software Updates
================
|prod-long| software updates \(also known as patches\) must be applied to the
system in order to keep it current with feature enhancements and free of
known bugs and security vulnerabilities.
|org| provides software updates that are cryptographically signed to ensure
integrity and authenticity. The |prod-long| REST APIs, CLIs and GUI validate
the signature of each software update before loading it into the system.
An update typically modifies a small portion of your system to address the
following items:
.. _software-updates-and-upgrades-software-updates-ul-gcd-smn-xw:
- bugs
- security vulnerabilities
- feature enhancements
Software updates can be installed manually or by the Update Orchestrator which
automates a rolling install of an update across all of the |prod-long| hosts.
For more information on manual updates, see :ref:`Manage Software Updates
<managing-software-updates>`. For more information on update orchestration,
see :ref:`Orchestrated Software Update <update-orchestration-overview>`.
.. warning::
Do NOT use the |updates-doc| guide for |prod-dc| orchestrated
software updates. The |prod-dc| Update Orchestrator automates a
recursive rolling install of an update across all subclouds and all hosts
within the subclouds.
.. xbooklink For more information, see, |distcloud-doc|: :ref:`Update Management for
Distributed Cloud <update-management-for-distributed-cloud>`.
The |prod| handles multiple updates being applied and removed at once. Software
updates can modify and update any area of |prod| software, including the kernel
itself. For information on populating, installing and removing software
updates, see :ref:`Manage Software Updates <managing-software-updates>`.
There are two different kinds of software updates that you can use to update
the |prod| software:
.. _software-updates-and-upgrades-software-updates-ol-kxm-wgv-njb:
#. **RPM Software Updates**
These software updates deliver |prod| software updates containing RPMs for
updating the |prod| software running directly on the hosts.
Software updates can be installed manually or by the Update Orchestrator
which automates a rolling install of an update across all of the
|prod-long| hosts.
The |prod| handles multiple updates being applied and removed at once.
Software updates can modify and update any area of |prod| software,
including the kernel itself.
For information on populating, installing and removing software updates,
see :ref:`Manage Software Updates <managing-software-updates>`.
.. note::
A 10 GB internal management network is required for reboot-required
software update operations.
#. **Application Software Updates**
These software updates apply to software being managed through the
StarlingX Application Package Manager, that is, ':command:`system
application-upload/apply/remove/delete`'. |prod| delivers some software
through this mechanism, for example, **platform-integ-apps**.
For software updates for these applications, download the updated
application tarball, containing the updated Armada manifest, and updated
Helm charts for the application, and apply the updates using the
:command:`system application-update` command.
.. xbooklink For more information, see,
:ref:`Cloud Platform Kubernetes Admin Tutorials
<about-the-admin-tutorials>`: :ref:`StarlingX Application Package Manager
<kubernetes-admin-tutorials-tarlingx-application-package-manager>`.

View File

@ -0,0 +1,108 @@
.. upe1593016272562
.. _software-upgrades:
=================
Software Upgrades
=================
|prod-long| upgrades enable you to move |prod| software from one release of
|prod| to the next release of |prod|.
.. contents:: |minitoc|
:local:
:depth: 1
|prod| software upgrade is a multi-step rolling-upgrade process, where |prod|
hosts are upgraded one at a time while the system continues to provide hosting
services to its hosted applications. An upgrade can be performed manually or using
Upgrade Orchestration, which automates much of the upgrade procedure, leaving a
few manual steps to prevent operator oversight. For more information on manual
upgrades, see :ref:`Manual |prod| Components Upgrade
<manual-upgrade-overview>`. For more information on upgrade orchestration, see
:ref:`Orchestrated |prod| Component Upgrade <orchestration-upgrade-overview>`.
.. warning::
Do NOT use information in the |updates-doc| guide for |prod-dc|
orchestrated software upgrades. If information in this document is used for
a |prod-dc| orchestrated upgrade, the upgrade will fail resulting
in an outage. The |prod-dc| Upgrade Orchestrator automates a
recursive rolling upgrade of all subclouds and all hosts within the
subclouds.
.. xbooklink For more information on the |prod-dc| Upgrade Orchestrator, see,
|distcloud-doc|: :ref:`Upgrade Orchestration for Distributed Cloud
Subclouds Using CLI
<upgrade-orchestration-for-distributed-cloud-subclouds-using-the-cli>`.
Before starting the upgrade process:
.. _software-upgrades-ul-ant-vgq-gmb:
- the system must be “patch current”
- there must be no management-affecting alarms present on the system
- the new software load must be imported, and
- a valid license file for the new software release must be installed
The upgrade process starts by upgrading the controllers. The standby controller
is upgraded first and involves loading the standby controller with the new
release of software and migrating all the controller services' databases for
the new release of software. Activity is switched to the upgraded controller,
running in a 'compatibility' mode where all inter-node messages are using
message formats from the old release of software. The point just before
upgrading the second controller is the "point of no return for an in-service
abort" of the upgrade process. The second controller is loaded with the new
release of software and becomes the new standby controller. For more
information on manual upgrades, see :ref:`Manual |prod| Components Upgrade
<manual-upgrade-overview>`.
If present, storage nodes are locked, upgraded and unlocked one at a time in
order to respect the redundancy model of |prod| storage nodes. Storage nodes
can be upgraded in parallel if using upgrade orchestration.
Upgrade of worker nodes is the next step in the process. When a worker node is
locked, it is tainted so that Kubernetes shuts down any pods on this
worker node and restarts the pods on another worker node. When upgrading the
worker node, the worker node network boots/installs the new software from the
active controller. After unlocking the worker node, the worker services are
running in a 'compatibility' mode where all inter-node messages are using
message formats from the old release of software. Note that the worker nodes
can only be upgraded in parallel if using upgrade orchestration.
The final step of the upgrade process is to activate and complete the upgrade.
This involves disabling 'compatibility' modes on all hosts and clearing the
Upgrade Alarm.
.. _software-upgrades-section-N1002F-N1001F-N10001:
----------------------------------
Rolling Back / Aborting an Upgrade
----------------------------------
In general, any issues encountered during an upgrade should be addressed during
the upgrade with the intention of completing the upgrade after the issues are
resolved. Issues specific to a storage or worker host can be addressed by
temporarily downgrading the host, addressing the issues and then upgrading the
host again, or in some cases by replacing the node.
In extremely rare cases, it may be necessary to abort an upgrade. This is a
last resort and should only be done if there is no other way to address the
issue within the context of the upgrade. There are two cases for doing such an
abort:
.. _software-upgrades-ul-dqp-brt-cx:
- Before controller-0 has been upgraded \(that is, only controller-1 has been
upgraded\): In this case the upgrade can be aborted and the system will
remain in service during the abort. See :ref:`Roll Back a Software
Upgrade Before the Second Controller Upgrade
<rolling-back-a-software-upgrade-before-the-second-controller-upgrade>`.
- After controller-0 has been upgraded \(that is, both controllers have been
upgraded\): In this case the upgrade can only be aborted with a complete
outage and a reinstall of all hosts. This would only be done as a last
resort, if there was absolutely no other way to recover the system. See
:ref:`Roll Back a Software Upgrade After the Second Controller Upgrade
<rolling-back-a-software-upgrade-after-the-second-controller-upgrade>`.
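The choice between the two abort cases depends only on whether controller-0
has already been upgraded. The following sketch captures that decision. The
``AbortPlan`` type and the ``plan_abort`` function are hypothetical names
introduced for illustration; the upgrade states and procedure labels are the
ones used in the two rollback procedures referenced above.

.. code-block:: python

   from typing import NamedTuple


   class AbortPlan(NamedTuple):
       upgrade_state: str  # state entered after `system upgrade-abort`
       in_service: bool    # True if the system stays in service during the abort
       procedure: str      # label of the rollback procedure to follow


   def plan_abort(controller_0_upgraded):
       """Select the abort case from the upgrade progress of controller-0."""
       if controller_0_upgraded:
           # Both controllers upgraded: full outage and reinstall of all hosts.
           return AbortPlan(
               "aborting-reinstall",
               False,
               "rolling-back-a-software-upgrade-after-the-second-controller-upgrade",
           )
       # Only controller-1 upgraded: in-service abort is still possible.
       return AbortPlan(
           "aborting",
           True,
           "rolling-back-a-software-upgrade-before-the-second-controller-upgrade",
       )


   print(plan_abort(controller_0_upgraded=False))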

View File

@ -0,0 +1,69 @@
.. agv1552920520258
.. _update-orchestration-cli:
========================
Update Orchestration CLI
========================
The update orchestration CLI is :command:`sw-manager`. Use this to create your
update strategy.
The commands and options map directly to the parameter descriptions in the web
interface dialog, described in :ref:`Configuring Update Orchestration
<configuring-update-orchestration>`.
.. note::
To use update orchestration commands, you need administrator privileges.
You must log in to the active controller as user **sysadmin** and source
the /etc/platform/openrc script to obtain administrator privileges. Do not
use **sudo**.
.. note::
Management-affecting alarms at the indicated severity level or higher
cannot be ignored during an orchestrated update operation, even when relaxed
alarm rules are used. For a list of management-affecting alarms, see
|fault-doc|: :ref:`Alarm Messages <100-series-alarm-messages>`. To display
management-affecting active alarms, use the following command:
.. code-block:: none
~(keystone_admin)]$ fm alarm-list --mgmt_affecting
During an orchestrated update operation, the following alarms are ignored
even when strict restrictions are selected:
- 200.001, Maintenance host lock alarm
- 900.001, Patch in progress
- 900.005, Upgrade in progress
- 900.101, Software patch auto apply in progress
.. _update-orchestration-cli-ul-qhy-q1p-v1b:
Help is available for the overall command and also for each sub-command. For
example:
.. code-block:: none
~(keystone_admin)]$ sw-manager patch-strategy --help
usage: sw-manager patch-strategy [-h] ...
optional arguments:
-h, --help show this help message and exit
Update orchestration commands include:
.. _update-orchestration-cli-ul-cvv-gdd-nx:
- :command:`create` - Create a strategy
- :command:`delete` - Delete a strategy
- :command:`apply` - Apply a strategy
- :command:`abort` - Abort a strategy
- :command:`show` - Show a strategy

View File

@ -0,0 +1,95 @@
.. kzb1552920557323
.. _update-orchestration-overview:
=============================
Update Orchestration Overview
=============================
Update orchestration allows an entire |prod| system to be updated with a single
operation.
.. contents:: |minitoc|
:local:
:depth: 1
You can configure and run update orchestration using the CLI, the Horizon Web
interface, or the stx-nfv REST API.
.. note::
Updating of |prod-dc| is distinct from updating of other |prod|
configurations.
.. xbooklink For information on updating |prod-dc|, see |distcloud-doc|:
:ref:`Update Management for Distributed Cloud
<update-management-for-distributed-cloud>`.
.. _update-orchestration-overview-section-N10031-N10023-N10001:
---------------------------------
Update Orchestration Requirements
---------------------------------
Update orchestration can only be done on a system that meets the following
conditions:
.. _update-orchestration-overview-ul-e1y-t4c-nx:
- The system is clear of alarms \(with the exception of alarms for locked
hosts, and update applications in progress\).
.. note::
When configuring update orchestration, you have the option to ignore
alarms with a severity less than management-affecting severity. For
more information, see :ref:`Configuring Update Orchestration
<configuring-update-orchestration>`.
- All hosts must be unlocked-enabled-available.
- Two controller hosts must be available.
- All storage hosts must be available.
- When installing reboot-required updates, there must be spare worker
capacity to move hosted application pods off the worker host\(s\) being
updated such that hosted application services are not impacted.
.. _update-orchestration-overview-section-N1009D-N10023-N10001:
--------------------------------
The Update Orchestration Process
--------------------------------
Update orchestration automatically iterates through all hosts on the system and
installs the applied updates to each host: first the controller hosts, then the
storage hosts, and finally the worker hosts. During the worker host updating,
hosted application pod re-locations are managed automatically. The controller
hosts are always updated serially. The storage hosts and worker hosts can be
configured to be updated in parallel in order to reduce the overall update
installation time.
Update orchestration can install one or more applied updates at the same time.
It can also install reboot-required updates or in-service updates or both at
the same time. Update orchestration only locks and unlocks \(that is, reboots\)
a host to install an update if at least one reboot-required update has been
applied.
The user first creates an update orchestration strategy, or plan, for the
automated updating procedure. This customizes the update orchestration, using
parameters to specify:
.. _update-orchestration-overview-ul-eyw-fyr-31b:
- the host types to be updated
- whether to update hosts serially or in parallel
Based on these parameters, and the state of the hosts, update orchestration
creates a number of stages for the overall update strategy. Each stage
generally consists of re-locating hosted application pods, locking hosts,
installing updates, and unlocking hosts for a subset of the hosts on the
system.
After creating the update orchestration strategy, the user can either apply the
entire strategy automatically, or manually apply individual stages to control
and monitor the update progress.
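The way a strategy breaks down into stages can be approximated with a short
sketch. This is a deliberately simplified model and not the algorithm used by
the orchestrator, which also takes host state and hosted application pods into
account. The function ``plan_stages``, its parameters, and the fixed
``max_parallel`` batch size are assumptions made for illustration.

.. code-block:: python

   def plan_stages(hosts, parallel_storage=False, parallel_workers=False,
                   max_parallel=2):
       """Group hosts into ordered update stages.

       hosts maps each hostname to "controller", "storage", or "worker".
       Controllers are always handled one per stage; storage and worker
       hosts are batched only when the matching parallel flag is set.
       """
       order = (
           ("controller", False),
           ("storage", parallel_storage),
           ("worker", parallel_workers),
       )
       stages = []
       for personality, parallel in order:
           names = sorted(n for n, p in hosts.items() if p == personality)
           size = max_parallel if parallel else 1
           for start in range(0, len(names), size):
               stages.append(names[start:start + size])
       return stages


   # Hypothetical system layout.
   HOSTS = {
       "controller-0": "controller",
       "controller-1": "controller",
       "storage-0": "storage",
       "storage-1": "storage",
       "compute-0": "worker",
       "compute-1": "worker",
       "compute-2": "worker",
   }

   for number, stage in enumerate(plan_stages(HOSTS, parallel_workers=True), 1):
       print(number, stage)

With serial storage and parallel workers, this layout produces six stages: one
per controller, one per storage host, and two worker stages.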

View File

@ -0,0 +1,76 @@
.. utq1552920689344
.. _update-status-and-lifecycle:
===========================
Update Status and Lifecycle
===========================
|prod| software updates move through different status levels as the updates are
being applied.
.. rubric:: |context|
After adding an update \(patch\) to the storage area, you must move it to the
repository, which manages distribution for the cluster. From there, you can
install the updates to the hosts that require them.
Some of the available updates may be required on controller hosts only, while
others may be required on worker or storage hosts. Use :command:`sw-patch
query-hosts` to see which hosts are impacted by the newly applied \(or
removed\) updates. You can then use :command:`sw-patch host-install` to update
the software on individual hosts.
To keep track of software update installation, you can use the
:command:`sw-patch query` command.
.. parsed-literal::
~(keystone_admin)]$ sudo sw-patch query
Patch ID Patch State
=========== ============
|pvr|-nn.nn_PATCH_0001 Applied
where *nn.nn* in the update filename is the |prod| release number.
This shows the **Patch State** for each of the updates in the storage area:
**Available**
An update in the *Available* state has been added to the storage area, but
is not currently in the repository or installed on the hosts.
**Partial-Apply**
An update in the *Partial-Apply* state has been added to the software
updates repository using the :command:`sw-patch apply` command, but has not
been installed on all hosts that require it. It may have been installed on
some but not others, or it may not have been installed on any hosts. If any
reboot-required update is in a partial state \(Partial-Apply or
Partial-Remove\), you cannot update the software on any given host without
first locking it. If, for example, you had one reboot-required update and
one in-service update, both in a Partial-Apply state and both applicable to
node X, you cannot just install the non-reboot-required update to the
unlocked node X.
**Applied**
An update in the *Applied* state has been installed on all hosts that
require it.
You can use the :command:`sw-patch query-hosts` command to see which hosts are
fully updated \(**Patch Current**\). This also shows which hosts require
reboot, either because they are not fully updated, or because they are fully
updated but not yet rebooted.
.. code-block:: none
~(keystone_admin)]$ sudo sw-patch query-hosts
Hostname      IP Address      Patch Current  Reboot Required  Release  State
============  ==============  =============  ===============  =======  =====
compute-0     192.168.204.95  Yes            No               20.06    idle
compute-1     192.168.204.63  Yes            No               20.06    idle
compute-2     192.168.204.99  Yes            No               20.06    idle
compute-3     192.168.204.49  Yes            No               20.06    idle
controller-0  192.168.204.3   Yes            No               20.06    idle
controller-1  192.168.204.4   Yes            No               20.06    idle
storage-0     192.168.204.37  Yes            No               20.06    idle
storage-1     192.168.204.90  Yes            No               20.06    idle
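When many hosts are involved, the :command:`sw-patch query-hosts` table can be
filtered with a script. The following is a minimal sketch that assumes the
six-column text layout shown above; the ``HostPatchStatus`` type, the function
names, and the sample data are hypothetical and not part of |prod|.

.. code-block:: python

   from typing import NamedTuple


   class HostPatchStatus(NamedTuple):
       hostname: str
       ip_address: str
       patch_current: bool
       reboot_required: bool
       release: str
       state: str


   def parse_query_hosts(output):
       """Parse `sw-patch query-hosts` text output into HostPatchStatus rows."""
       hosts = []
       in_table = False
       for line in output.splitlines():
           fields = line.split()
           if not fields:
               continue
           # The row of '=' characters separates the header from the data rows.
           if set(line.strip()) <= {"=", " "}:
               in_table = True
               continue
           if in_table and len(fields) == 6:
               name, ip, current, reboot, release, state = fields
               hosts.append(HostPatchStatus(
                   name, ip, current == "Yes", reboot == "Yes", release, state))
       return hosts


   def hosts_needing_install(hosts):
       """Return the names of hosts that are not Patch Current."""
       return [host.hostname for host in hosts if not host.patch_current]


   # Hypothetical sample output, for illustration only.
   SAMPLE_QUERY_HOSTS = """\
   Hostname      IP Address       Patch Current  Reboot Required  Release  State
   ============  ===============  =============  ===============  =======  =====
   compute-0     192.168.204.179  Yes            No               20.04    idle
   controller-0  192.168.204.3    No             No               20.04    idle
   controller-1  192.168.204.4    No             No               20.04    idle
   storage-0     192.168.204.213  Yes            No               20.04    idle
   """

   print(hosts_needing_install(parse_query_hosts(SAMPLE_QUERY_HOSTS)))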

View File

@ -0,0 +1,495 @@
.. btn1592861794542
.. _upgrading-all-in-one-duplex-or-standard:
======================================
Upgrade All-in-One Duplex / Standard
======================================
You can upgrade the |prod| Duplex or Standard configurations with a new release
of |prod| software.
.. rubric:: |prereq|
.. _upgrading-all-in-one-duplex-or-standard-ul-ezb-b11-cx:
- Perform a full backup to allow recovery.
.. note::
Back up files in the /home/sysadmin and /root directories prior
to doing an upgrade. Home directories are not preserved during backup or
restore operations, blade replacement, or upgrades.
- The system must be "patch current". All updates available for the current
release running on the system must be applied. To find and download
applicable updates, visit the |dnload-loc|.
- Transfer the new release software load to controller-0 \(or onto a USB
stick\); controller-0 must be active.
- Transfer the new release software license file to controller-0 \(or onto a
USB stick\).
- Transfer the new release software signature to controller-0 \(or onto a USB
stick\).
- Unlock all hosts.
- All nodes must be unlocked. The upgrade cannot be started when there
are locked nodes \(the health check prevents it\).
.. note::
The upgrade procedure includes steps to resolve system health issues.
.. rubric:: |proc|
#. Ensure that controller-0 is the active controller.
#. Install the license file for the release you are upgrading to, for example,
20.06.
.. code-block:: none
~(keystone_admin)]$ system license-install <license_file>
For example,
.. code-block:: none
~(keystone_admin)]$ system license-install license.lic
#. Import the new release.
#. Run the :command:`load-import` command on **controller-0** to import
the new release.
First, source /etc/platform/openrc. Also, you must specify an exact
path to the \*.iso bootimage file and to the \*.sig bootimage signature
file.
.. code-block:: none
$ source /etc/platform/openrc
~(keystone_admin)]$ system load-import /home/sysadmin/<bootimage>.iso \
<bootimage>.sig
+--------------------+-----------+
| Property | Value |
+--------------------+-----------+
| id | 2 |
| state | importing |
| software_version | 20.06 |
| compatible_version | 20.04 |
| required_patches | |
+--------------------+-----------+
The :command:`load-import` must be done on **controller-0** and accepts
relative paths.
#. Check to ensure the load was successfully imported.
.. code-block:: none
~(keystone_admin)]$ system load-list
+----+----------+------------------+
| id | state | software_version |
+----+----------+------------------+
| 1 | active | 20.04 |
| 2 | imported | 20.06 |
+----+----------+------------------+
#. Apply any required software updates.
The system must be 'patch current'. All software updates related to your
current |prod| software release must be uploaded, applied, and installed.
Software updates for the new |prod| release only need to be uploaded and
applied. These updates are installed automatically during the software
upgrade procedure as the hosts are reset to load the new release of
software.
To find and download applicable updates, visit the |dnload-loc|.
For more information, see :ref:`Manage Software Updates
<managing-software-updates>`.
#. Confirm that the system is healthy.
Check the current system health status, resolve any alarms and other issues
reported by the :command:`health-query-upgrade` command, then recheck the
system health status to confirm that all **System Health** fields are set
to **OK**.
.. code-block:: none
~(keystone_admin)]$ system health-query-upgrade
System Health:
All hosts are provisioned: [OK]
All hosts are unlocked/enabled: [OK]
All hosts have current configurations: [OK]
All hosts are patch current: [OK]
Ceph Storage Healthy: [OK]
No alarms: [OK]
All kubernetes nodes are ready: [OK]
All kubernetes control plane pods are ready: [OK]
Required patches are applied: [OK]
License valid for upgrade: [OK]
By default, the upgrade process cannot be started while active alarms are
present, and running it in that condition is not recommended. However,
management-affecting alarms can be ignored by using the :command:`--force`
option with the :command:`system upgrade-start` command to force the upgrade
process to start.
.. note::
It is strongly recommended that you clear your system of any and all
alarms before doing an upgrade. While the :command:`--force` option is
available to run the upgrade, it is a best practice to clear any
alarms.
#. Start the upgrade from controller-0.
Make sure that controller-0 is the active controller, and you are logged
into controller-0 as **sysadmin** and your present working directory is
your home directory.
.. code-block:: none
~(keystone_admin)]$ system upgrade-start
+--------------+--------------------------------------+
| Property | Value |
+--------------+--------------------------------------+
| uuid | 61e5fcd7-a38d-40b0-ab83-8be55b87fee2 |
| state | starting |
| from_release | 20.04 |
| to_release | 20.06 |
+--------------+--------------------------------------+
This will make a copy of the system data to be used in the upgrade.
Configuration changes are not allowed after this point until the swact to
controller-1 is completed.
The following upgrade state applies once this command is executed:
- started:
- State entered after :command:`system upgrade-start` completes.
- Release 20.04 system data \(for example, postgres databases\) has
been exported to be used in the upgrade.
- Configuration changes must not be made after this point, until the
upgrade is completed.
As part of the upgrade, the upgrade process checks the health of the system
and validates that the system is ready for an upgrade.
The upgrade process checks that no alarms are active before starting an
upgrade.
.. note::
Use the command :command:`system upgrade-start --force` to force the
upgrade process to start and to ignore management-affecting alarms.
This should ONLY be done if you are confident that these alarms will not
cause problems during the upgrade process.
On systems with Ceph storage, it also checks that the Ceph cluster is
healthy.
#. Upgrade controller-1.
#. Lock controller-1.
.. code-block:: none
~(keystone_admin)]$ system host-lock controller-1
#. Upgrade controller-1.
Controller-1 installs the update and reboots, then performs data
migration.
.. code-block:: none
~(keystone_admin)]$ system host-upgrade controller-1
Wait for controller-1 to reinstall with load N+1 and reach the
**locked-disabled-online** state.
The following data migration states apply when this command is
executed.
- data-migration:
- State entered when :command:`system host-upgrade controller-1`
is executed.
- System data is being migrated from release N to release N+1.
- data-migration-complete:
- State entered when controller-1 upgrade is complete.
- System data has been successfully migrated from release 20.04
to release 20.06.
- data-migration-failed:
- State entered if data migration on controller-1 fails.
- Upgrade must be aborted.
#. Check the upgrade state.
.. code-block:: none
~(keystone_admin)]$ system upgrade-show
+--------------+--------------------------------------+
| Property | Value |
+--------------+--------------------------------------+
| uuid | e7c8f6bc-518c-46d4-ab81-7a59f8f8e64b |
| state | data-migration-complete |
| from_release | 20.04 |
| to_release | 20.06 |
+--------------+--------------------------------------+
If the :command:`upgrade-show` status indicates
'data-migration-failed', then there is an issue with the data
migration. Check the issue before proceeding to the next step.
#. Unlock controller-1.
.. code-block:: none
~(keystone_admin)]$ system host-unlock controller-1
Wait for controller-1 to become **unlocked-enabled**. Wait until the DRBD
sync **400.001** Services-related alarm is raised and then cleared.
The following states apply when this command is executed.
- upgrading-controllers:
- State entered when controller-1 has been unlocked and is
running release 20.06 software.
If it transitions to **unlocked-disabled-failed**, investigate the issue
before proceeding to the next step. The alarms may indicate a
configuration error. Check the configuration logs on controller-1
\(for example, error logs in controller-1:/var/log/puppet\).
#. Set controller-1 as the active controller. Swact to controller-1.
.. code-block:: none
~(keystone_admin)]$ system host-swact controller-0
Wait until services have gone active on the new active controller-1 before
proceeding to the next step. When all services on controller-1 are
enabled-active, the swact is complete.
#. Upgrade **controller-0**.
#. Lock **controller-0**.
.. code-block:: none
~(keystone_admin)]$ system host-lock controller-0
#. Upgrade **controller-0**.
.. code-block:: none
~(keystone_admin)]$ system host-upgrade controller-0
#. Unlock **controller-0**.
.. code-block:: none
~(keystone_admin)]$ system host-unlock controller-0
Wait until the DRBD sync **400.001** Services-related alarm is raised
and then cleared before proceeding to the next step.
- upgrading-hosts:
- State entered when both controllers are running release 20.06
software.
#. Check the system health to ensure that there are no unexpected alarms.
.. code-block:: none
~(keystone_admin)]$ fm alarm-list
Clear all alarms unrelated to the upgrade process.
#. If using a Ceph storage backend, upgrade the storage nodes one at a time.
The storage node must be locked and all OSDs must be down in order to do
the upgrade.
#. Lock storage-0.
.. code-block:: none
~(keystone_admin)]$ system host-lock storage-0
#. Verify that the OSDs are down after the storage node is locked.
In the Horizon interface, navigate to **Admin** \> **Platform** \>
**Storage Overview** to view the status of the OSDs.
#. Upgrade storage-0.
.. code-block:: none
~(keystone_admin)]$ system host-upgrade storage-0
The upgrade is complete when the node comes online, and at that point,
you can safely unlock the node.
After upgrading a storage node, but before unlocking, there are Ceph
synchronization alarms \(that appear to be making progress in
synching\), and there are infrastructure network interface alarms
\(since the infrastructure network interface configuration has not been
applied to the storage node yet, as it has not been unlocked\).
Unlock the node as soon as the upgraded storage node comes online.
#. Unlock storage-0.
.. code-block:: none
~(keystone_admin)]$ system host-unlock storage-0
Wait for all alarms to clear after the unlock before proceeding to
upgrade the next storage host.
#. Repeat the above steps for each storage host.
.. note::
After upgrading the first storage node you can expect alarm
**800.003**. The alarm is cleared after all storage nodes are
upgraded.
#. Upgrade worker hosts, one at a time, if any.
#. Lock worker-0.
.. code-block:: none
~(keystone_admin)]$ system host-lock worker-0
#. Upgrade worker-0.
.. code-block:: none
~(keystone_admin)]$ system host-upgrade worker-0
Wait for the host to run the installer, reboot, and go online before
unlocking it in the next step.
#. Unlock worker-0.
.. code-block:: none
~(keystone_admin)]$ system host-unlock worker-0
Wait for all alarms to clear after the unlock before proceeding to the
next worker host.
#. Repeat the above steps for each worker host.
#. Set controller-0 as the active controller. Swact to controller-0.
.. code-block:: none
~(keystone_admin)]$ system host-swact controller-1
Wait until services have gone active on the active controller-0 before
proceeding to the next step. When all services on controller-0 are
enabled-active, the swact is complete.
#. Activate the upgrade.
.. code-block:: none
~(keystone_admin)]$ system upgrade-activate
+--------------+--------------------------------------+
| Property | Value |
+--------------+--------------------------------------+
| uuid | 61e5fcd7-a38d-40b0-ab83-8be55b87fee2 |
| state | activating |
| from_release | 20.04 |
| to_release | 20.06 |
+--------------+--------------------------------------+
During the running of the :command:`upgrade-activate` command, new
configurations are applied to the controller. 250.001 \(**hostname
Configuration is out-of-date**\) alarms are raised and are cleared as the
configuration is applied. The upgrade state goes from **activating** to
**activation-complete** once this is done.
The following states apply when this command is executed.
**activation-requested**
State entered when :command:`system upgrade-activate` is executed.
**activating**
State entered when we have started activating the upgrade by applying
new configurations to the controller and compute hosts.
**activation-complete**
State entered when new configurations have been applied to all
controller and compute hosts.
#. Check the status of the upgrade again to confirm that it has reached
**activation-complete**.
.. code-block:: none
~(keystone_admin)]$ system upgrade-show
+--------------+--------------------------------------+
| Property | Value |
+--------------+--------------------------------------+
| uuid | 61e5fcd7-a38d-40b0-ab83-8be55b87fee2 |
| state | activation-complete |
| from_release | 20.04 |
| to_release | 20.06 |
+--------------+--------------------------------------+
#. Complete the upgrade.
.. code-block:: none
~(keystone_admin)]$ system upgrade-complete
+--------------+--------------------------------------+
| Property | Value |
+--------------+--------------------------------------+
| uuid | 61e5fcd7-a38d-40b0-ab83-8be55b87fee2 |
| state | completing |
| from_release | 20.04 |
| to_release | 20.06 |
+--------------+--------------------------------------+
#. Delete the imported load.
.. code-block:: none
~(keystone_admin)]$ system load-list
+----+----------+------------------+
| id | state | software_version |
+----+----------+------------------+
| 1 | imported | 20.04 |
| 2 | active | 20.06 |
+----+----------+------------------+
~(keystone_admin)]$ system load-delete 1
Deleted load: load 1
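
The upgrade states named in the steps above can be collected into one ordered sequence, which makes it easier to see where a partially completed upgrade stands. The following Python sketch is a reading aid derived only from this procedure, not the state machine implemented by |prod|. The names ``UPGRADE_STATES``, ``NEXT_ACTION``, ``next_action``, and ``is_before`` are hypothetical.

```python
# Upgrade states in the order this procedure passes through them.
UPGRADE_STATES = [
    "starting",
    "started",
    "data-migration",
    "data-migration-complete",
    "upgrading-controllers",
    "upgrading-hosts",
    "activation-requested",
    "activating",
    "activation-complete",
    "completing",
]

# Operator action suggested by the procedure for each reported state.
NEXT_ACTION = {
    "starting": "wait for the state to become started",
    "started": "lock controller-1, then run: system host-upgrade controller-1",
    "data-migration": "wait for data migration on controller-1 to finish",
    "data-migration-complete": "run: system host-unlock controller-1",
    "data-migration-failed": "abort the upgrade",
    "upgrading-controllers": (
        "swact to controller-1, then lock, upgrade, and unlock controller-0"
    ),
    "upgrading-hosts": (
        "upgrade storage and worker hosts, swact to controller-0, "
        "then run: system upgrade-activate"
    ),
    "activation-requested": "wait for the state to become activating",
    "activating": "wait for the state to become activation-complete",
    "activation-complete": "run: system upgrade-complete",
    "completing": "list the loads, then delete the old load",
}


def next_action(state):
    """Return the action this procedure suggests for an upgrade state."""
    try:
        return NEXT_ACTION[state]
    except KeyError:
        raise ValueError(f"unknown upgrade state: {state!r}") from None


def is_before(first, second):
    """True if `first` comes earlier than `second` in the normal sequence."""
    return UPGRADE_STATES.index(first) < UPGRADE_STATES.index(second)


if __name__ == "__main__":
    for state in UPGRADE_STATES:
        print(f"{state}: {next_action(state)}")
```

The failure state **data-migration-failed** is kept out of the ordered list because it is a branch off the normal sequence, not a step in it.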

View File

@ -0,0 +1,377 @@
.. nfq1592854955302
.. _upgrading-all-in-one-simplex:
==========================
Upgrade All-in-One Simplex
==========================
You can upgrade a |prod| Simplex configuration with a new release of |prod|
software.
.. rubric:: |prereq|
.. _upgrading-all-in-one-simplex-ul-ezb-b11-cx:
- Perform a full backup to allow recovery.
.. note::
Back up files in the /home/sysadmin and /root directories prior to doing
an upgrade. Home directories are not preserved during backup or restore
operations, blade replacement, or upgrades.
- The system must be 'patch current'. All updates available for the current
release running on the system must be applied. To find and download
applicable updates, visit the |dnload-loc| site.
- Transfer the new release software load to controller-0 \(or onto a USB
stick\); controller-0 must be active.
- Transfer the new release software license file to controller-0 \(or onto a
USB stick\).
- Transfer the new release software signature to controller-0 \(or onto a USB
stick\).
- Unlock all hosts.
- All nodes must be unlocked. The upgrade cannot be started when there
are locked nodes \(the health check prevents it\).
.. note::
The upgrade procedure includes steps to resolve system health issues.
.. rubric:: |proc|
#. Source the platform environment.
.. code-block:: none
$ source /etc/platform/openrc
~(keystone_admin)]$
#. Install the license file for the release you are upgrading to, for example,
20.06.
.. code-block:: none
~(keystone_admin)]$ system license-install <license_file>
For example,
.. code-block:: none
~(keystone_admin)]$ system license-install license.lic
#. Import the new release.
#. Run the :command:`load-import` command on **controller-0** to import
the new release.
First, source /etc/platform/openrc.
You must specify an exact path to the \*.iso bootimage file and to the
\*.sig bootimage signature file.
.. code-block:: none
$ source /etc/platform/openrc
~(keystone_admin)]$ system load-import /home/sysadmin/<bootimage>.iso \
<bootimage>.sig
+--------------------+-----------+
| Property | Value |
+--------------------+-----------+
| id | 2 |
| state | importing |
| software_version | 20.06 |
| compatible_version | 20.04 |
| required_patches | |
+--------------------+-----------+
The :command:`load-import` must be done on **controller-0** and accepts
relative paths.
#. Check to ensure the load was successfully imported.
.. code-block:: none
~(keystone_admin)]$ system load-list
+----+----------+------------------+
| id | state | software_version |
+----+----------+------------------+
| 1 | active | 20.04 |
| 2 | imported | 20.06 |
+----+----------+------------------+
#. Apply any required software updates.
The system must be 'patch current'. All software updates related to your
current |prod| software release must be uploaded, applied, and installed.
Software updates for the new |prod| release only need to be uploaded and
applied. These updates are installed automatically during the software
upgrade procedure as the hosts are reset to load the new release of
software.
To find and download applicable updates, visit the |dnload-loc|.
For more information, see :ref:`Manage Software Updates
<managing-software-updates>`.
#. Confirm that the system is healthy.
Check the current system health status, resolve any alarms and other issues
reported by the :command:`health-query-upgrade` command, then recheck the
system health status to confirm that all **System Health** fields are set
to **OK**.
.. code-block:: none
~(keystone_admin)]$ system health-query-upgrade
System Health:
All hosts are provisioned: [OK]
All hosts are unlocked/enabled: [OK]
All hosts have current configurations: [OK]
All hosts are patch current: [OK]
Ceph Storage Healthy: [OK]
No alarms: [OK]
All kubernetes nodes are ready: [OK]
All kubernetes control plane pods are ready: [OK]
Required patches are applied: [OK]
License valid for upgrade: [OK]
By default, the upgrade process cannot be started while active alarms are
present, and running it in that condition is not recommended. However,
management-affecting alarms can be ignored by using the :command:`--force`
option with the :command:`system upgrade-start` command to force the upgrade
process to start.
.. note::
It is strongly recommended that you clear your system of any and all
alarms before doing an upgrade. While the :command:`--force` option is
available to run the upgrade, it is a best practice to clear any
alarms.
#. Start the upgrade.
.. code-block:: none
~(keystone_admin)]$ system upgrade-start
+--------------+--------------------------------------+
| Property | Value |
+--------------+--------------------------------------+
| uuid | 61e5fcd7-a38d-40b0-ab83-8be55b87fee2 |
| state | starting |
| from_release | 20.04 |
| to_release | 20.06 |
+--------------+--------------------------------------+
This will back up the system data and images to /opt/platform-backup.
/opt/platform-backup is preserved when the host is reinstalled. With the
platform backup, the size of /home/sysadmin must be less than 2GB.
This process may take several minutes.
When the upgrade state changes to **started**, the process is complete.
Any changes made to the system after this point will be lost when the data
is restored.
The following upgrade state applies once this command is executed:
- started:
- State entered after :command:`system upgrade-start` completes.
- Release 20.04 system data \(for example, postgres databases\) has
been exported to be used in the upgrade.
- Configuration changes must not be made after this point, until the
upgrade is completed.
As part of the upgrade, the upgrade process checks the health of the system
and validates that the system is ready for an upgrade.
The upgrade process checks that no alarms are active before starting an
upgrade.
.. note::
Use the command :command:`system upgrade-start --force` to force the
upgrade process to start and to ignore management-affecting alarms.
This should ONLY be done if you are confident that these alarms will not
cause problems during the upgrade process.
#. Check the upgrade state.
.. code-block:: none
~(keystone_admin)]$ system upgrade-show
+--------------+--------------------------------------+
| Property | Value |
+--------------+--------------------------------------+
| uuid | 61e5fcd7-a38d-40b0-ab83-8be55b87fee2 |
| state | started |
| from_release | 20.04 |
| to_release | 20.06 |
+--------------+--------------------------------------+
#. \(Optional\) Copy the upgrade data from the system to an alternate safe
location \(such as a USB drive or remote server\).
The upgrade data is located under /opt/platform-backup. Example file names
are:
**lost+found upgrade\_data\_2020-06-23T033950\_61e5fcd7-a38d-40b0-ab83-8be55b87fee2.tgz**
.. code-block:: none
~(keystone_admin)]$ ls /opt/platform-backup/
#. Lock controller-0.
.. code-block:: none
~(keystone_admin)]$ system host-lock controller-0
#. Start the upgrade of controller-0.
This is the point of no return. All data except /opt/platform-backup/ will
be erased from the system. This will wipe the **rootfs** and reboot the
host. The new release must then be manually installed \(via network or
USB\).
.. code-block:: none
~(keystone_admin)]$ system host-upgrade controller-0
WARNING: THIS OPERATION WILL COMPLETELY ERASE ALL DATA FROM THE SYSTEM.
Only proceed once the system data has been copied to another system.
Are you absolutely sure you want to continue? [yes/N]: yes
#. Install the new release of |prod-long| Simplex software via network or USB.
#. Restore the upgrade data.
.. code-block:: none
~(keystone_admin)]$ ansible-playbook /usr/share/ansible/stx-ansible/playbooks/upgrade_platform.yml
Once the host has installed the new load, this will restore the upgrade
data and migrate it to the new load.
The playbook can be run locally or remotely and must be provided with the
following parameter:
``ansible_become_pass``
The Ansible playbook checks /home/sysadmin/<hostname\>.yml for user
configuration override files for each host. For example, when running
Ansible locally, it checks /home/sysadmin/localhost.yml.
By default the playbook will search for the upgrade data file under
/opt/platform-backup. If required, use the **upgrade\_data\_file**
parameter to specify the path to the **upgrade\_data**.
.. note::
This playbook does not support replay.
Once the data restoration is complete the upgrade state will be set to
**upgrading-hosts**.
#. Check the status of the upgrade.
.. code-block:: none
~(keystone_admin)]$ system upgrade-show
+--------------+--------------------------------------+
| Property | Value |
+--------------+--------------------------------------+
| uuid | 61e5fcd7-a38d-40b0-ab83-8be55b87fee2 |
| state | upgrading-hosts |
| from_release | 20.04 |
| to_release | 20.06 |
+--------------+--------------------------------------+
#. Unlock controller-0.
.. code-block:: none
~(keystone_admin)]$ system host-unlock controller-0
This step is required only for Simplex systems that are not a subcloud.
#. Activate the upgrade.
During the running of the :command:`upgrade-activate` command, new
configurations are applied to the controller. 250.001 \(**hostname
Configuration is out-of-date**\) alarms are raised and are cleared as the
configuration is applied. The upgrade state goes from **activating** to
**activation-complete** once this is done.
.. code-block:: none
~(keystone_admin)]$ system upgrade-activate
+--------------+--------------------------------------+
| Property | Value |
+--------------+--------------------------------------+
| uuid | 61e5fcd7-a38d-40b0-ab83-8be55b87fee2 |
| state | activating |
| from_release | 20.04 |
| to_release | 20.06 |
+--------------+--------------------------------------+
The following states apply when this command is executed.
**activation-requested**
State entered when :command:`system upgrade-activate` is executed.
**activating**
State entered when we have started activating the upgrade by applying
new configurations to the controller and compute hosts.
**activation-complete**
State entered when new configurations have been applied to all
controller and compute hosts.
#. Check the status of the upgrade again to confirm that it has reached
**activation-complete**.
.. code-block:: none
~(keystone_admin)]$ system upgrade-show
+--------------+--------------------------------------+
| Property | Value |
+--------------+--------------------------------------+
| uuid | 61e5fcd7-a38d-40b0-ab83-8be55b87fee2 |
| state | activation-complete |
| from_release | 20.04 |
| to_release | 20.06 |
+--------------+--------------------------------------+
#. Complete the upgrade.
.. code-block:: none
~(keystone_admin)]$ system upgrade-complete
+--------------+--------------------------------------+
| Property | Value |
+--------------+--------------------------------------+
| uuid | 61e5fcd7-a38d-40b0-ab83-8be55b87fee2 |
| state | completing |
| from_release | 20.04 |
| to_release | 20.06 |
+--------------+--------------------------------------+
#. Delete the imported load.
.. code-block:: none
~(keystone_admin)]$ system load-list
+----+----------+------------------+
| id | state | software_version |
+----+----------+------------------+
| 1 | imported | 20.04 |
| 2 | active | 20.06 |
+----+----------+------------------+
~(keystone_admin)]$ system load-delete 1
Deleted load: load 1
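
Several steps above depend on reading the **state** row of the :command:`system upgrade-show` table. The following is a minimal Python sketch, assuming the two-column ``Property``/``Value`` layout shown in this procedure, that turns captured output into a dictionary so a script can confirm the state before the next command is run. The function names ``parse_property_table`` and ``ready_to_complete`` are hypothetical and are not part of the |prod| CLI.

```python
# Text in the layout printed by "system upgrade-show" in the steps above.
SAMPLE = """\
+--------------+--------------------------------------+
| Property     | Value                                |
+--------------+--------------------------------------+
| uuid         | 61e5fcd7-a38d-40b0-ab83-8be55b87fee2 |
| state        | activation-complete                  |
| from_release | 20.04                                |
| to_release   | 20.06                                |
+--------------+--------------------------------------+
"""


def parse_property_table(text):
    """Parse a two-column Property/Value table into a dict."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the "+----+" rulers.
        if not line.startswith("|"):
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        # Skip the header row and anything that is not two columns.
        if len(cells) != 2 or cells[0] == "Property":
            continue
        result[cells[0]] = cells[1]
    return result


def ready_to_complete(info):
    """True once the upgrade has reached the activation-complete state."""
    return info.get("state") == "activation-complete"


if __name__ == "__main__":
    info = parse_property_table(SAMPLE)
    print(info["state"], ready_to_complete(info))
```

The same parser also works on the output of :command:`system upgrade-start` and :command:`system upgrade-activate`, because both print the same table layout.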