Merge "Upgrades edits (r6,r7,dsR6,dsR7)"

This commit is contained in:
Zuul 2022-11-09 21:14:20 +00:00 committed by Gerrit Code Review
commit e73f762df6
37 changed files with 668 additions and 661 deletions

View File

@ -9,11 +9,11 @@ Abort Simplex System Upgrades
You can abort a Simplex System upgrade before or after upgrading controller-0.
The upgrade abort procedure can only be applied before the
:command:`upgrade-complete` command is issued. Once this command is issued, the
upgrade cannot be aborted. If you must return to the previous release, then
restore the system using the backup data taken prior to the upgrade.
Before starting, verify the upgrade data under ``/opt/platform-backup``. This
data must be present to perform the abort process.
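For example, a quick check that the upgrade data is in place might look like
this \(a sketch; the exact file names vary by release\):

.. code-block:: none

~(keystone_admin)$ ls /opt/platform-backup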
.. _aborting-simplex-system-upgrades-section-N10025-N1001B-N10001:
@ -31,20 +31,20 @@ Before upgrading controller-0
.. code-block:: none
~(keystone_admin)$ system upgrade-abort
The upgrade state is set to ``aborting``. Once this is executed, it cannot
be cancelled; the upgrade must be completely aborted.
#. Complete the upgrade.
.. code-block:: none
~(keystone_admin)$ system upgrade-complete
At this time any upgrade data generated as part of the :command:`upgrade-start`
command will be deleted. This includes the upgrade data in
``/opt/platform-backup``.
.. _aborting-simplex-system-upgrades-section-N10063-N1001B-N10001:
@ -52,7 +52,7 @@ Before upgrading controller-0
After upgrading controller-0
----------------------------
After controller-0 has been upgraded, it is possible to roll back the software
upgrade. This involves performing a system restore with the previous release.
.. _aborting-simplex-system-upgrades-ol-jmw-kcp-xdb:
@ -61,41 +61,42 @@ upgrade. This involves performing a system restore with the previous release.
USB.
#. Verify and configure IP connectivity. External connectivity is required to
run the Ansible restore playbook. The |prod-long| boot image will |DHCP| out
all interfaces so the server may have obtained an IP address and have
external IP connectivity if a |DHCP| server is present in your environment.
Verify this using the :command:`ip addr` command. Otherwise, manually
configure an IP address and default IP route.
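For example, to check connectivity and, if needed, set a static address and
default route \(the interface name and addresses below are illustrative\):

.. code-block:: none

$ ip addr
$ sudo ip addr add 10.10.10.3/24 dev enp0s3
$ sudo ip route add default via 10.10.10.1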
#. Restore the system data. The restore is preserved in ``/opt/platform-backup``.
The system will be restored to the state when the :command:`upgrade-start`
command was issued. Follow the process in :ref:`Run Restore Playbook Locally
on the Controller <running-restore-playbook-locally-on-the-controller>`.
Specify the upgrade data filename as `backup_filename` and the
`initial_backup_dir` as ``/opt/platform-backup``.
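A restore invocation might then look like the following sketch, where the
playbook path follows the restore procedure linked above and the backup file
name is illustrative:

.. code-block:: none

$ ansible-playbook /usr/share/ansible/stx-ansible/playbooks/restore_platform.yml \
-e "initial_backup_dir=/opt/platform-backup backup_filename=<upgrade_data_file>.tgz"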
The user images will also need to be restored as described in the
Postrequisites section.
#. Unlock controller-0.
.. code-block:: none
~(keystone_admin)$ system host-unlock controller-0
#. Abort the upgrade with the :command:`upgrade-abort` command.
.. code-block:: none
~(keystone_admin)$ system upgrade-abort
The upgrade state is set to ``aborting``. Once this is executed, it cannot
be cancelled; the upgrade must be completely aborted.
#. Complete the upgrade.
.. code-block:: none
~(keystone_admin)$ system upgrade-complete

View File

@ -7,12 +7,12 @@ Configure Firmware Update Orchestration
=======================================
You can configure *Firmware Update Orchestration Strategy* using the
:command:`sw-manager` |CLI|.
.. note::
Management-affecting alarms cannot be ignored using relaxed alarm rules
during an orchestrated firmware update operation. For a list of
management-affecting alarms, see |fault-doc|:
:ref:`Alarm Messages <100-series-alarm-messages>`. To display
management-affecting active alarms, use the following command:
@ -37,9 +37,9 @@ ignored even when the default strict restrictions are selected:
.. _noc1590162360081-ul-ls2-pxs-tlb:
- Hosts that need to be updated must be in the ``unlocked-enabled`` state.
- The firmware update image must be in the ``applied`` state. For more
information, see :ref:`Managing Software Updates <managing-software-updates>`.
.. rubric:: |proc|
@ -69,7 +69,7 @@ ignored even when the default strict restrictions are selected:
state: building
inprogress: true
#. |optional| Display the strategy in summary, if required. The firmware update
strategy :command:`show` command displays the strategy in a summary.
.. code-block:: none
@ -87,7 +87,7 @@ ignored even when the default strict restrictions are selected:
state: ready-to-apply
build-result: success
The strategy steps and stages are displayed using the ``--details`` option.
#. Apply the strategy.
@ -96,7 +96,7 @@ ignored even when the default strict restrictions are selected:
all the hosts in the strategy is complete.
- Use the ``-stage-id`` option to specify a specific stage to apply; one
at a time.
.. note::
@ -106,7 +106,7 @@ ignored even when the default strict restrictions are selected:
.. code-block:: none
~(keystone_admin)$ sw-manager fw-update-strategy apply
Strategy Firmware Update Strategy:
strategy-uuid: 3e43c018-9c75-4ba8-a276-472c3bcbb268
controller-apply-type: ignore
@ -125,7 +125,7 @@ ignored even when the default strict restrictions are selected:
.. code-block:: none
~(keystone_admin)$ sw-manager fw-update-strategy show
Strategy Firmware Update Strategy:
strategy-uuid: 3e43c018-9c75-4ba8-a276-472c3bcbb268
controller-apply-type: ignore
@ -138,7 +138,7 @@ ignored even when the default strict restrictions are selected:
state: applying
inprogress: true
#. |optional| Abort the strategy, if required. This is only used to stop, and
abort the entire strategy.
The firmware update strategy :command:`abort` command can be used to abort
@ -157,7 +157,7 @@ ignored even when the default strict restrictions are selected:
.. code-block:: none
~(keystone_admin)$ sw-manager fw-update-strategy delete
Strategy deleted.
For more information, see :ref:`Firmware Update Orchestration Using the CLI

View File

@ -12,7 +12,7 @@ You can configure *Kubernetes Version Upgrade Orchestration Strategy* using the
.. note::
You require administrator privileges to use :command:`sw-manager`. You must
log in to the active controller as **user sysadmin** and source the script
by using the command ``source /etc/platform/openrc`` to obtain administrator
privileges. Do not use :command:`sudo`.
.. note::
@ -75,9 +75,10 @@ For example:
- Hosts that need to be upgraded must be in the ``unlocked-enabled`` state.
- If you are using NetApp Trident, ensure that your NetApp version is
compatible with Trident 22.01 before upgrading Kubernetes to version
|kube-ver| and after updating |prod| to version |prod-ver|. For more
information, see :ref:`Upgrade the NetApp Trident Software
<upgrade-the-netapp-trident-software-c5ec64d213d3>`.
.. only:: partner
@ -104,7 +105,7 @@ For example:
#. Confirm that the system is healthy.
Check the current system health status, resolve any alarms and other issues
reported by the :command:`system health-query-kube-upgrade` command, then
recheck the system health status to confirm that all **System Health**
fields are set to **OK**.
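For reference, the health check named above can be rerun at any point:

.. code-block:: none

~(keystone_admin)$ system health-query-kube-upgrade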
@ -272,7 +273,7 @@ For example:
defaults to strict
#. |optional| Display the strategy in summary, if required. The Kubernetes
upgrade strategy :command:`show` command displays the strategy in a summary.
.. code-block:: none
@ -350,7 +351,7 @@ For example:
``downloading-images``, ``downloaded-images``, ``upgrading-first-master``,
``upgraded-first-master``, etc.
#. |optional| Abort the strategy, if required. This is only used to stop, and
abort the entire strategy.
The Kubernetes version upgrade strategy :command:`abort` command can be

View File

@ -156,14 +156,14 @@ status before creating a update strategy.
- Worker hosts with no hosted application pods are updated before
worker hosts with hosted application pods.
- The final step in each stage is "system-stabilize," which waits for
a period of time \(up to several minutes\) and ensures that the
- The final step in each stage is ``system-stabilize``, which waits
for a period of time \(up to several minutes\) and ensures that the
system is free of alarms. This ensures that the update orchestrator
does not continue to update more hosts if the update application has
caused an issue resulting in an alarm.
#. Click the **Apply Strategy** button to apply the update strategy. You can
optionally apply a single stage at a time by clicking the **Apply Stage**
button.
@ -181,7 +181,7 @@ status before creating a update strategy.
attempt to unlock any hosts that were locked.
.. note::
If an update strategy is aborted after hosts were locked, but before
they were updated, the hosts will not be unlocked, as this would result
in the updates being installed. You must either install the updates on
the hosts or remove the updates before unlocking the hosts.

View File

@ -15,13 +15,13 @@ Do the following to manage the instance re-location manually:
.. _rbp1590431075472-ul-mgr-kvs-tlb:
- Manually firmware-update at least one openstack-compute worker host. This
assumes that at least one openstack-compute worker host does not have any
instances, or has instances that can be migrated. For more information on
manually updating a host, see the :ref:`Display Worker Host Information
<displaying-worker-host-information>`.
- If the migration is prevented by limitations in the |VNF| or virtual
application, perform the following:
@ -35,7 +35,7 @@ Do the following to manage the instance re-location manually:
- Terminate the old instances.
- If the migration is prevented by the size of the instances' local disks:
- For each openstack-compute worker host that has instances that cannot
be migrated, manually move the instances using the CLI. For more

View File

@ -7,29 +7,30 @@ Firmware Update Orchestration Using the CLI
===========================================
You can configure the *Firmware Update Orchestration Strategy* using the
:command:`sw-manager` |CLI| commands.
---------------
About this task
---------------
.. note::
You require administrator privileges to use :command:`sw-manager` commands.
You must log in to the active controller as **user sysadmin** and source the
script by using the command ``source /etc/platform/openrc`` to obtain
administrator privileges. Do not use :command:`sudo`.
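A minimal login sequence might look like this; note the prompt change once
the script is sourced:

.. code-block:: none

$ source /etc/platform/openrc
~(keystone_admin)$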
.. note::
Management-affecting alarms cannot be ignored at the indicated severity
level or higher by using relaxed alarm rules during an orchestrated
firmware update operation. For a list of management-affecting alarms, see
|fault-doc|: :ref:`Alarm Messages
<100-series-alarm-messages>`. To display management-affecting active
alarms, use the following command:
.. code-block:: none
~(keystone_admin)$ fm alarm-list --mgmt_affecting
During an orchestrated firmware update operation, the following alarms are
ignored even when strict restrictions are selected:
@ -72,112 +73,108 @@ be created with override worker apply type concurrency with a max host
parallelism, instance action, and alarm restrictions.
``--controller-apply-type`` and ``--storage-apply-type``
These options cannot be changed from ``ignore`` because firmware update is
only supported for worker hosts.
.. note::
Firmware update is currently only supported for hosts with worker
function. Any attempt to modify the controller or storage apply type is
rejected.
``--worker-apply-type``
This option specifies the host concurrency of the firmware update strategy:
- ``serial`` \(default\): worker hosts will be patched one at a time
- ``parallel``: worker hosts will be updated in parallel
- At most, ``parallel`` will be updated at the same time
- At most, half of the hosts in a host aggregate will be updated at the
same time
- ``ignore``: worker hosts will not be updated; strategy create will fail
Worker hosts with no instances are updated before worker hosts with instances.
``--max-parallel-worker-hosts``
This option applies to the parallel worker apply type selection to specify
the maximum worker hosts to update in parallel \(minimum: 2, maximum: 10\).
``--instance-action``
This option only has significance when the |prefix|-openstack application is
loaded and there are instances running on worker hosts. It specifies how the
strategy deals with worker host instances over the strategy execution.
- ``stop-start`` (default)
Instances will be stopped before the host lock operation following the
update and then started again following the host unlock.
.. warning::
Using the ``stop-start`` option will result in an outage for each
instance, as it is stopped while the worker host is locked/unlocked. In
order to ensure this does not impact service, instances MUST be grouped
into anti-affinity \(or anti-affinity best effort\) server groups,
which will ensure that only a single instance in each server group is
stopped at a time.
- ``migrate``
Instances will be migrated off a host before it is patched \(this applies
to reboot patching only\).
``--alarm-restrictions``
This option sets how the firmware update orchestration behaves when alarms
are present.
To display management-affecting active alarms, use the following command:
.. code-block:: none
~(keystone_admin)$ fm alarm-list --mgmt_affecting
- ``strict`` (default)
The default strict option will result in patch orchestration failing if
there are any alarms present in the system \(except for a small list of
alarms\).
- ``relaxed``
This option allows orchestration to proceed if alarms are present, as long
as none of these alarms are management affecting.
.. code-block:: none
~(keystone_admin)]$ sw-manager fw-update-strategy create --help
usage: sw-manager fw-update-strategy create [-h]
[--controller-apply-type {ignore}]
[--storage-apply-type {ignore}]
[--worker-apply-type
{serial,parallel,ignore}]
[--max-parallel-worker-hosts
{2,3,4,5,6,7,8,9,10}]
[--instance-action {migrate,stop-start}]
[--alarm-restrictions {strict,relaxed}]
optional arguments:
-h, --help show this help message and exit
--controller-apply-type {ignore}
defaults to ignore
--storage-apply-type {ignore}
defaults to ignore
--worker-apply-type {serial,parallel,ignore}
defaults to serial
--max-parallel-worker-hosts {2,3,4,5,6,7,8,9,10}
maximum worker hosts to update in parallel
--instance-action {migrate,stop-start}
defaults to stop-start
--alarm-restrictions {strict,relaxed}
defaults to strict
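Putting these options together, a strategy that updates worker hosts two at a
time, migrates instances, and relaxes alarm checking might be created as
follows \(the values are illustrative\):

.. code-block:: none

~(keystone_admin)]$ sw-manager fw-update-strategy create --worker-apply-type parallel --max-parallel-worker-hosts 2 --instance-action migrate --alarm-restrictions relaxed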
.. _tsr1590164474201-section-l3x-wr5-tlb:
-------------------------------------------
@ -203,11 +200,10 @@ of the strategy. A complete view of the strategy can be shown using the
Firmware update orchestration strategy apply
--------------------------------------------
The ``apply`` strategy subcommand with no options executes the firmware update
strategy from current state to the end. The apply strategy operation can be
called with the ``stage-id`` option to execute the next stage of the strategy.
The ``stage-id`` option cannot be used to execute the strategy out of order.
.. code-block:: none
@ -226,9 +222,9 @@ Firmware update orchestration strategy abort
The ``abort`` strategy subcommand with no options sets the strategy to abort
after the current applying stage is complete. The abort strategy operation can
be called with the ``stage-id`` option to specify that the strategy abort before
executing the next stage of the strategy. The ``stage-id`` option cannot be used
to execute the strategy out of order.
.. code-block:: none

View File

@ -6,9 +6,8 @@
Handle Firmware Update Orchestration Failures
=============================================
The creation or application of a strategy could fail for any of the reasons
listed below. Follow the suggested actions in each case to resolve the issue.
-------------------------
Strategy creation failure
@ -20,31 +19,31 @@ Strategy creation failure
- **Action**:
- Verify that the ``--worker-apply-type`` was not set to ``ignore``.
- Check recent logs added to ``/var/log/nfv-vim.log``.
- **Reason**: Alarms from platform are present.
- **Action**:
- Query for management-affecting alarms and take actions to clear them.
.. code-block:: none
~(keystone_admin)$ fm alarm-list --mgmt_affecting
- If there are no management-affecting alarms present, take actions to
clear other reported alarms or try creating the strategy with the
``relaxed`` alarms restrictions option ``--alarm-restrictions
relaxed``.
- **Reason**: No firmware update required.
- **Action**:
- Verify that the firmware device image has been applied for the
worker hosts that require updating.
.. note::
If the strategy create failed, after resolving the strategy
@ -57,52 +56,53 @@ Strategy apply failure
.. _jkf1590184623714-ul-rdf-4pq-5lb:
- **Reason**: Alarms from platform are present.
- **Action**: This suggests that an alarm has been raised since the
creation of the strategy. Address the cause of the new alarm, delete the
strategy and try creating and applying a new strategy.
- **Reason**: Unable to migrate instances.
- **Action**: See :ref:`Firmware Update Operations Requiring Manual
Migration <firmware-update-operations-requiring-manual-migration>` for
steps to resolve migration issues.
- **Reason**: Firmware update failed. Suggests that the firmware update for
the specified host has failed.
- **Action**: For more information, see |prod| Node Management:
:ref:`Display Worker Host Information <displaying-worker-host-information>`
- **Reason**: Lock host failed.
- **Action**:
- Investigate the ``/var/log/sysinv.log`` and
``/var/log/nfv-vim.log`` files.
- Address the underlying issue.
- Manually lock and unlock the host.
- Try recreating and re-applying the firmware update strategy to
automatically finish the update process.
- **Reason**: Unlock host failed.
- **Action**:
- Investigate the ``/var/log/mtcAgent.log`` file for the cause.
- Address the underlying issue.
- Manually lock and unlock the host to recover.
- Try recreating and re-applying the firmware update strategy to
automatically finish the update process.
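For the lock and unlock failures above, a minimal manual recovery sequence
might look like this \(the hostname is illustrative\):

.. code-block:: none

~(keystone_admin)$ system host-lock worker-0
~(keystone_admin)$ system host-unlock worker-0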
.. note::
If the strategy :command:`apply` fails, you must resolve the
strategy :command:`apply` failure and delete the failed strategy before
trying to create and apply another strategy.

View File

@ -18,7 +18,7 @@ Strategy creation failure
.. _jkf1590184623714-ul-fvs-vnq-5lb:
- **Reason**: Build failed with no reason.
- **Action**:
@ -27,7 +27,7 @@ Strategy creation failure
- Check recent logs added to ``/var/log/nfv-vim.log``.
- **Reason**: Alarms from platform are present.
- **Action**:
@ -43,7 +43,7 @@ Strategy creation failure
the ``relaxed`` alarms restrictions option ``--alarm-restrictions
relaxed``.
- **Reason**: No Kubernetes version upgrade required.
- **Action**:
@ -65,14 +65,14 @@ Strategy Apply Failure
.. _jkf1590184623714-ul-rdf-4pq-5lb:
- **Reason**: Alarms from platform are present.
- **Action**: This suggests that an alarm has been raised since the
creation of the strategy. Address the cause of the new alarm, delete the
strategy and try creating and applying a new strategy.
- **Reason**: Unable to migrate instances.
- **Action**: See :ref:`Kubernetes Version Upgrade Operations Requiring
Manual Migration
@ -91,11 +91,11 @@ Strategy Apply Failure
.. include:: /_includes/handling-kubernetes-update-orchestration-failures.rest
- **Reason**: Lock host failed.
- **Action**:
- Investigate the ``/var/log/sysinv.log`` and ``/var/log/nfv-vim.log``
files.
- Address the underlying issue.
@ -106,11 +106,11 @@ Strategy Apply Failure
strategy to automatically finish the upgrade process.
- **Reason**: Unlock host failed.
- **Action**:
- Investigate the ``/var/log/mtcAgent.log`` file for the cause.
- Address the underlying issue.

View File

@ -11,10 +11,10 @@ interface. The system type is also shown.
.. rubric:: |proc|
#. In the |prod| Horizon, open the **System Configuration** page.
The **System Configuration** page is available from **Admin** \>
**Platform** \> **System Configuration** in the left-hand pane.
#. Select the **Systems** tab to view the software version.
@ -24,10 +24,10 @@ interface. The system type is also shown.
shown in the **System Type** field. The mode \(**simplex**, **duplex**, or
**standard**\) is shown in the **System Mode** field.
#. In the |prod| Horizon interface, open the **Software Management** page.
The **Software Management** page is available from **Admin** \> **Platform**
\> **Software Management** in the left-hand pane.
#. Select the **Patches** tab to view update information.

View File

@ -47,7 +47,7 @@ For more about working with software updates, see :ref:`Manage Software Updates
+----------------------+----------------------------------------------------+
.. note::
The **system_mode** field is shown only for a |prod| Simplex or Duplex
system.
- To list applied software updates from the CLI, use the :command:`sw-patch

View File

@ -9,16 +9,15 @@ In-Service Versus Reboot-Required Software Updates
In-Service \(Reboot-not-Required\) and Reboot-Required software updates are
available depending on the nature of the update to be performed.
In-Service software updates provide a mechanism to issue updates that do not
require a reboot, allowing the update to be installed on in-service nodes and
restarting affected processes as needed.
Depending on the area of software being updated and the type of software change,
installation of the update may or may not require the |prod| hosts to be
rebooted. For example, a software update to the kernel would require the host to
be rebooted in order to apply the update. Software updates are classified as
reboot-required or reboot-not-required \(also referred to as
in-service\) type updates to indicate this. For reboot-required updates, the
hosted application pods are automatically relocated to an alternate host as part
of the update procedure, prior to applying the update and rebooting the host.

View File

@ -18,13 +18,13 @@ unlocked as part of applying the update.
#. In |prod| Horizon, open the **Software Management** page.
The **Software Management** page is available from **Admin** \> **Platform**
\> **Software Management** in the left-hand pane.
#. Select the **Patches** tab to see the current update status.
The **Patches** tab shows the current status of all updates uploaded to the
system. If there are no updates, an empty **Patch Table** is displayed.
#. Upload the update \(patch\) file to the update storage area.
@ -34,7 +34,7 @@ unlocked as part of applying the update.
The update file is transferred to the Active Controller and is copied to
the update storage area, but it has yet to be applied to the cluster. This
is reflected in the **Patches** tab.
#. Apply the update.
@ -43,29 +43,29 @@ unlocked as part of applying the update.
click the **Apply Patches** button at the top. You can use this selection
process to apply all updates, or a selected subset, in a single operation.
The **Patches** tab is updated to report the update to be in the
*Partial-Apply* state.
#. Install the update on controller-0.
#. Select the **Hosts** tab.
The **Hosts** tab on the **Host Inventory** page reflects the new status
of the hosts with respect to the new update state. In this example, the
update only applies to controller software, as can be seen by the
worker host's status field being empty, indicating that it is 'patch
current'.
.. image:: figures/ekn1453233538504.png
#. Select the **Install Patches** option from the **Edit Host** button
associated with controller-0 to install the update.
A confirmation window is presented giving you a last opportunity to
cancel the operation before proceeding.
#. Repeat steps 6 a and b above with controller-1 to install the update
on controller-1.
#. Repeat steps 6 a and b above for the worker and/or storage hosts \(if
present\).
@ -74,7 +74,7 @@ unlocked as part of applying the update.
#. Verify the state of the update.
Visit the **Patches** tab again. The update is now in the *Applied* state.
.. rubric:: |result|

View File

@ -39,7 +39,7 @@ unlocked as part of applying the update.
controller-0 192.168.204.3 Yes No nn.nn idle
controller-1 192.168.204.4 Yes No nn.nn idle
#. Ensure that the original update files have been deleted from the root drive.
After they are uploaded to the storage area, the original files are no
longer required. You must use the command-line interface to delete them, in

View File

@ -32,15 +32,15 @@ update. The main steps of the procedure are:
#. Log in to the Horizon Web interface as the **admin** user.
#. In Horizon, open the **Software Management** page.
The **Software Management** page is available from **Admin** \> **Platform**
\> **Software Management** in the left-hand pane.
#. Select the **Patches** tab to see the current status.
The **Patches** tab shows the current status of all updates uploaded to the
system. If there are no updates, an empty **Patch Table** is displayed.
#. Upload the update \(patch\) file to the update storage area.
@ -50,7 +50,7 @@ update. The main steps of the procedure are:
The update file is transferred to the Active Controller and is copied to
the storage area, but it has yet to be applied to the cluster. This is
reflected on the **Patches** tab.
#. Apply the update.
@ -62,14 +62,14 @@ update. The main steps of the procedure are:
The **Patches** tab is updated to report the update to be in the
*Partial-Apply* state.
#. Install the update on controller-0.
.. _installing-reboot-required-software-updates-using-horizon-step-N10107-N10028-N1001C-N10001:
#. Select the **Hosts** tab.
The **Hosts** tab on the **Host Inventory** page reflects the new status
of the hosts with respect to the new update state. As shown below, both
controllers are now reported as not 'patch current' and requiring
reboot.
@ -83,10 +83,10 @@ update. The main steps of the procedure are:
Access to Horizon may be lost briefly during the active controller
transition. You may have to log in again.
#. Select the **Lock Host** option from the **Edit Host** button associated
with **controller-0**.
#. Select the **Install Patches** option from the **Edit Host** button
associated with **controller-0** to install the update.
A confirmation window is presented giving you a last opportunity to
@ -94,12 +94,12 @@ update. The main steps of the procedure are:
Wait for the update install to complete.
#. Select the **Unlock Host** option from the **Edit Host** button
associated with controller-0.
#. Repeat steps :ref:`6
<installing-reboot-required-software-updates-using-horizon-step-N10107-N10028-N1001C-N10001>`
a to e, with **controller-1** to install the update on controller-1.
.. note::
For |prod| Simplex systems, this step does not apply.
@ -113,14 +113,14 @@ update. The main steps of the procedure are:
#. Verify the state of the update.
Visit the **Patches** page. The update is now in the *Applied* state.
.. rubric:: |result|
The update is now applied, and all affected hosts have been updated.
Updates can be removed using the **Remove Patches** button from the **Patches**
tab. The workflow is similar to the one presented in this section, with the
exception that updates are being removed from each host instead of being
applied.

View File

@ -14,7 +14,7 @@ You can install reboot-required software updates using the CLI.
.. _installing-reboot-required-software-updates-using-the-cli-steps-v1q-vlv-vw:
#. Log in as user **sysadmin** to the active controller and source the script
``/etc/platform/openrc`` to obtain administrative privileges.
#. Verify that the updates are available using the :command:`sw-patch query`
command.
@ -49,10 +49,10 @@ You can install reboot-required software updates using the CLI.
.. parsed-literal::
~(keystone_admin)]$ sudo sw-patch apply |pn|-<nn>.<nn>_PATCH_0001
|pn|-<nn>.<nn>_PATCH_0001 is now in the repo
where <nn>.<nn> in the update filename is the |prod-long| release number.
The update is now in the Partial-Apply state, ready for installation from
the software updates repository on the impacted hosts.
@ -101,7 +101,7 @@ You can install reboot-required software updates using the CLI.
host.
The **Patch Current** field of the :command:`query-hosts` command will
briefly report *Pending* after you apply or remove an update, until
that host has checked against the repository to see if it is impacted
by the patching operation.
@ -124,7 +124,8 @@ You can install reboot-required software updates using the CLI.
**install-failed**
The operation failed, either due to an update error or something
killed the process. Check the ``patching.log`` on the node in
question.
**install-rejected**
The node is unlocked, therefore the request to install has been
@ -168,8 +169,8 @@ You can install reboot-required software updates using the CLI.
~(keystone_admin)]$ sudo sw-patch host-install <controller-0>
.. note::
You can use the :command:`sudo sw-patch host-install-async <hostname>`
command if you are launching multiple installs in
parallel.
#. Unlock the host.
@ -181,7 +182,7 @@ You can install reboot-required software updates using the CLI.
Unlocking the host forces a reset of the host followed by a reboot.
This ensures that the host is restarted in a known state.
All updates are now installed on controller-0. Querying the current
update status displays the following information:
.. code-block:: none
@ -199,14 +200,14 @@ You can install reboot-required software updates using the CLI.
storage-0 192.168.204.37 Yes No nn.nn idle
storage-1 192.168.204.90 Yes No nn.nn idle
#. Install all pending updates on controller-1.
.. note::
For |prod| Simplex systems, this step does not apply.
Repeat the previous step targeting controller-1.
All updates are now installed on controller-1 as well. Querying the
current updating status displays the following information:
.. code-block:: none
@ -227,12 +228,12 @@ You can install reboot-required software updates using the CLI.
#. Install any pending updates for the worker or storage hosts.
.. note::
This step does not apply for |prod| Simplex or Duplex systems.
All hosted application pods currently running on a worker host are
re-located to another host.
If the **Patch Current** status for a worker or storage host is *No*,
apply the pending updates using the following commands:
.. code-block:: none
@ -247,32 +248,36 @@ You can install reboot-required software updates using the CLI.
~(keystone_admin)]$ system host-unlock <hostname>
where <hostname> is the name of the host \(for example, ``worker-0``\).
.. note::
Update installations can be triggered in parallel.
The :command:`sw-patch host-install-async` command \(corresponding to
**install patches** on the Horizon Web interface\) can be run on all
locked nodes, without waiting for one node to complete the install
before triggering the install on the next. If you can lock the nodes at
the same time, without impacting hosted application services, you can
update them at the same time.
Likewise, you can install an update to the standby controller and a
worker node at the same time. The only restrictions are those of the
lock:
* You cannot lock both controllers.
* You cannot lock a worker node if you do not have enough free resources
to relocate the hosted applications from it.
Also, in a Ceph configuration \(with storage nodes\), you cannot lock
more than one of controller-0/controller-1/storage-0 at the same time,
as these nodes are running Ceph monitors and you must have at least two
in service at all times.
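For example, two locked worker nodes could be updated at the same time as
follows \(hostnames are illustrative\):

.. code-block:: none

~(keystone_admin)]$ sudo sw-patch host-install-async worker-0
~(keystone_admin)]$ sudo sw-patch host-install-async worker-1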
#. Confirm that all updates are installed and the |prod| is up-to-date.
Use the :command:`sw-patch query` command to verify that all updates are
*Applied*.
.. parsed-literal::
@ -280,13 +285,13 @@ You can install reboot-required software updates using the CLI.
Patch ID Patch State
========================= ===========
|pn|-<nn>.<nn>_PATCH_0001 Applied
where <nn>.<nn> in the update filename is the |prod| release number.
If the **Patch State** for any update is still shown as *Available* or
*Partial-Apply*, use the :command:`sw-patch query-hosts` command to identify
which hosts are not *Patch Current*, and then apply updates to them as
described in the preceding steps.

View File

@ -12,18 +12,18 @@ This section describes installing software updates before you can commission
.. rubric:: |context|
This procedure assumes that the software updates to install are available on a
USB flash drive, or from a server reachable by controller-0.
.. rubric:: |prereq|
When initially installing the |prod-long| software, it is required that you
install the latest available updates on controller-0 before running Ansible
Bootstrap Playbook, and before installing the software on other hosts. This
ensures that:
.. _installing-software-updates-before-initial-commissioning-ul-gsq-1ht-vp:
- The software on controller-0, and all other hosts, is up to date when
the cluster comes alive.
- You reduce installation time by avoiding updating the system right after an
@ -31,12 +31,12 @@ ensures that:
.. rubric:: |proc|
#. Install software on controller-0.
Use the |prod-long| bootable ISO image to initialize controller-0.
This step takes you to the point where you use the console port to log in
to controller-0 as user **sysadmin**.
#. Populate the storage area.
@ -68,9 +68,9 @@ ensures that:
Patch installation is complete.
Please reboot before continuing with configuration.
This command installs all applied updates on controller-0.
#. Reboot controller-0.
You must reboot the controller to ensure that it is running with the
software fully updated.

View File

@ -7,9 +7,9 @@ Manage Software Updates
=======================
Updates \(also known as patches\) to the system software become available as
needed to address issues associated with a current |prod-long| software release.
Software updates must be uploaded to the active controller and applied to all
required hosts in the cluster.
.. note::
Updating |prod-dc| is distinct from updating other |prod| configurations.
@ -21,8 +21,8 @@ to all required hosts in the cluster.
The following elements form part of the software update environment:
**Reboot-Required Software Updates**
Reboot-required updates are typically major updates that require hosts to be
locked during the update process and rebooted to complete the process.
.. note::
When a |prod| host is locked and rebooted for updates, the hosted
@ -30,26 +30,26 @@ The following elements form part of the software update environment:
minimize the impact to the hosted application service.
**In-Service Software Updates**
In-service \(reboot-not-required\) software updates are updates that do not
require the locking and rebooting of hosts. The required |prod| software is
updated and any required |prod| processes are re-started. Hosted
applications pods and services are completely unaffected.
**Software Update Commands**
The :command:`sw-patch` command is available on both active controllers. It
must be run as root using :command:`sudo`. It provides the user interface to
process the updates, including querying the state of an update, listing
affected hosts, and applying, installing, and removing updates.
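Typical invocations, all run with :command:`sudo`, include the following
\(a sketch; the patch file, patch ID, and hostname are illustrative\):

.. code-block:: none

~(keystone_admin)$ sudo sw-patch query
~(keystone_admin)$ sudo sw-patch upload /home/sysadmin/PATCH_0001.patch
~(keystone_admin)$ sudo sw-patch apply PATCH_0001
~(keystone_admin)$ sudo sw-patch host-install controller-0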
**Software Update Storage Area**
A central storage area maintained by the update controller. Software updates
are initially uploaded to the storage area and remain there until they are
deleted.
**Software Update Repository**
A central repository of software updates associated with any updates applied
to the system. This repository is used by all hosts in the cluster to
identify the software updates and rollbacks required on each host.
**Software Update Logs**
The following logs are used to record software update activity:
@ -102,7 +102,7 @@ upload the software update directly from your workstation using a file browser
window provided by the software update upload facility.
A special case occurs during the initial provisioning of a cluster when you
want to update controller-0 before the system software is configured. This
can only be done from the command line interface. See :ref:`Install Software
Updates Before Initial Commissioning
<installing-software-updates-before-initial-commissioning>` for details.

View File

@ -6,8 +6,8 @@
Manual Kubernetes Version Upgrade
=================================
You can upgrade the Kubernetes version on a running system from one supported
version to another.
.. rubric:: |context|
@ -102,26 +102,26 @@ and upgrade various systems.
**State**
Can be one of:
*active*
The version is running everywhere.
*partial*
The version is running somewhere.
*available*
The version can be upgraded to.
*unavailable*
The version is not available for upgrading. Either it is a downgrade
or it requires an intermediate upgrade first. Kubernetes can only be
upgraded one version at a time.
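The per-version state can be inspected at any time from the version listing,
for example:

.. code-block:: none

~(keystone_admin)$ system kube-version-list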
#. Confirm that the system is healthy.
Check the current system health status, resolve any alarms and other issues
reported by the :command:`system health-query-kube-upgrade` command, then
recheck the system health status to confirm that all **System Health**
fields are set to *OK*.
.. code-block:: none
@ -156,8 +156,8 @@ and upgrade various systems.
| state | upgrade-started |
+-------------------+-------------------+
The upgrade process checks the *applied*/*available* updates, the upgrade
path, the health of the system, the installed applications' compatibility, and
validates the system is ready for an upgrade.
.. warning::
@ -218,7 +218,7 @@ and upgrade various systems.
| updated_at | 2020-02-20T16:18:11.459736+00:00 |
+--------------+--------------------------------------+
The state *upgraded-networking* will be entered when the networking
upgrade has completed.
#. Upgrade the control plane on the first controller.
@ -241,7 +241,7 @@ and upgrade various systems.
You can upgrade either controller first.
The state *upgraded-first-master* will be entered when the first control
plane upgrade has completed.
#. Upgrade the control plane on the second controller.
@ -261,7 +261,7 @@ and upgrade various systems.
| target_version | v1.19.13 |
+-----------------------+-------------------------+
The state *upgraded-second-master* will be entered when the upgrade has
completed.
#. Show the Kubernetes upgrade status for all hosts.
@ -298,7 +298,7 @@ and upgrade various systems.
~(keystone_admin)]$ system host-lock controller-1
.. warning::
For All-In-One Simplex systems, the controller must **not** be
locked.

View File

@ -15,7 +15,7 @@ Standard, |prod-dc|, and subcloud deployments.
.. xbooklink For information on updating |prod-dc|, see |distcloud-doc|: :ref:`Upgrade
Management <upgrade-management-overview>`.
An upgrade can be performed manually or using the Upgrade Orchestrator, which
automates a rolling install of an update across all of the |prod-long| hosts.
This section describes the manual upgrade procedures.
@ -28,8 +28,8 @@ met:
- The system is patch current.
- There are no management-affecting alarms and the "system
health-query-upgrade" check passes.
- There are no management-affecting alarms and the :command:`system
health-query-upgrade` check passes.
- The new software load has been imported.

View File

@ -21,11 +21,12 @@ manual steps for operator oversight.
:ref:`Upgrade Management <upgrade-management-overview>`.
.. note::
The upgrade orchestration commands are prefixed with :command:`sw-manager`.
To use upgrade orchestration commands, you need administrator privileges.
You must log in to the active controller as user **sysadmin** and source the
``/etc/platform/openrc`` script to obtain administrator privileges. Do not use
:command:`sudo`.
.. code-block:: none
@ -71,9 +72,9 @@ conditions:
orchestrated while another orchestration is in progress.
- Sufficient free capacity or unused worker resources must be available
across the cluster. A rough calculation is:
``Required spare capacity (%) = (<Number-of-hosts-to-upgrade-in-parallel> / <total-number-of-hosts>) * 100``
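For example, with hypothetical numbers, upgrading 2 worker hosts in parallel
out of 10 total hosts requires roughly:

.. code-block:: none

   Required spare capacity (%) = (2 / 10) * 100 = 20%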
.. _orchestration-upgrade-overview-section-N10081-N10026-N10001:
@ -81,16 +82,16 @@ conditions:
The Upgrade Orchestration Process
---------------------------------
Upgrade orchestration can be initiated after the manual upgrade and stability
of the initial controller host. Upgrade orchestration automatically iterates
through the remaining hosts, installing the new software load on each one:
first the other controller host, then the storage hosts, and finally the worker
hosts. During worker host upgrades, pods are moved to alternate worker hosts
automatically.
Upgrade orchestration can be initiated after the initial controller host has
been manually upgraded and returned to a stable state. Upgrade orchestration
automatically iterates through the remaining hosts, installing the new software
load on each one: first the other controller host, then the storage hosts, and
finally the worker hosts. During worker host upgrades, pods are automatically
moved to alternate worker hosts.
The user first creates an upgrade orchestration strategy, or plan, for the
automated upgrade procedure. This customizes the upgrade orchestration, using
parameters to specify:
You first create an upgrade orchestration strategy, or plan, for the automated
upgrade procedure. This customizes the upgrade orchestration, using parameters
to specify:
.. _orchestration-upgrade-overview-ul-eyw-fyr-31b:
@ -103,9 +104,9 @@ creates a number of stages for the overall upgrade strategy. Each stage
generally consists of moving pods, locking hosts, installing upgrades, and
unlocking hosts for a subset of the hosts on the system.
After creating the upgrade orchestration strategy, the user can either apply
the entire strategy automatically, or apply individual stages to control and
monitor its progress manually.
After creating the upgrade orchestration strategy, you can either apply the
entire strategy automatically, or apply individual stages to control and monitor
their progress manually.
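For example, assuming a strategy has already been created, you can apply it in
its entirety:

.. code-block:: none

   ~(keystone_admin)]$ sw-manager upgrade-strategy apply

or one stage at a time \(the ``--stage-id`` value here is illustrative\):

.. code-block:: none

   ~(keystone_admin)]$ sw-manager upgrade-strategy apply --stage-id 1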
Update and upgrade orchestration are mutually exclusive; they perform
conflicting operations. Only a single strategy \(sw-patch or sw-upgrade\) is
@ -115,7 +116,7 @@ strategy before going back to the upgrade.
Some stages of the upgrade could take a significant amount of time \(hours\).
For example, after upgrading a storage host, re-syncing the OSD data could take
30m per TB \(assuming 500MB/s sync rate, which is about half of a 10G
30 minutes per TB \(assuming 500MB/s sync rate, which is about half of a 10G
infrastructure link\).
.. _orchestration-upgrade-overview-section-N10101-N10026-N10001:

View File

@ -10,7 +10,7 @@ Firmware update orchestration allows the firmware on the hosts of an entire
|prod-long| system to be updated with a single operation.
You can configure and run firmware update orchestration using the |CLI|, or the
stx-nfv VIM REST API.
``stx-nfv`` VIM REST API.
.. note::
Firmware update is currently not supported on the Horizon Web interface.
@ -28,7 +28,7 @@ following conditions:
.. note::
When configuring firmware update orchestration, you have the option to
ignore alarms that are not management-affecting severity. For more
ignore alarms that are not of management-affecting severity. For more
information, see :ref:`Kubernetes Version Upgrade Cloud Orchestration
<configuring-kubernetes-update-orchestration>`.
@ -36,7 +36,7 @@ following conditions:
requires firmware update. The *Firmware Update Orchestration Strategy*
creation step will fail if there are no qualified hosts detected.
- Firmware update is a reboot required operation. Therefore, in systems that
- Firmware update is a reboot-required operation. Therefore, in systems that
have the |prefix|-openstack application applied with running instances, if
the migrate option is selected there must be spare openstack-compute \
(worker\) capacity to move instances off the openstack-compute \

View File

@ -13,8 +13,8 @@ The upgrade orchestration CLI is :command:`sw-manager`.
.. note::
To use upgrade orchestration commands, you need administrator privileges.
You must log in to the active controller as user **sysadmin** and source the
/etc/platform/openrc script to obtain administrator privileges. Do not use
**sudo**.
``/etc/platform/openrc`` script to obtain administrator privileges. Do not use
:command:`sudo`.
The upgrade strategy options are shown in the following output:
@ -34,9 +34,9 @@ The upgrade strategy options are shown in the following output:
abort Abort a strategy
show Show a strategy
You can perform a partially orchestrated upgrade using the CLI. Upgrade and
stability of the initial controller node must be done manually before using
upgrade orchestration to orchestrate the remaining nodes of the |prod|.
You can perform a partially orchestrated upgrade using the |CLI|. Upgrade
orchestration of other |prod| nodes can be initiated after the initial
controller host has been manually upgraded and returned to a stable state.
.. note::
Management-affecting alarms cannot be ignored at the indicated severity
@ -65,9 +65,11 @@ See :ref:`Upgrading All-in-One Duplex / Standard
upgrade the initial controller node before doing the upgrade orchestration
described below to upgrade the remaining nodes of the |prod|.
- The subclouds must use the Redfish platform management service if it is an All-in-one Simplex subcloud.
- A subcloud must use the Redfish platform management service if it is an
All-in-one Simplex subcloud.
- Duplex \(AIODX/Standard\) upgrades are supported, and they do not require remote install via Redfish.
- Duplex \(AIODX/Standard\) upgrades are supported, and they do not require
remote install via Redfish.
.. rubric:: |proc|
@ -95,20 +97,20 @@ described below to upgrade the remaining nodes of the |prod|.
- storage-apply-type:
- serial \(default\): storage hosts will be upgraded one at a time
- ``serial`` \(default\): storage hosts will be upgraded one at a time
- parallel: storage hosts will be upgraded in parallel, ensuring that
- ``parallel``: storage hosts will be upgraded in parallel, ensuring that
only one storage node in each replication group is patched at a
time.
- ignore: storage hosts will not be upgraded
- ``ignore``: storage hosts will not be upgraded
- worker-apply-type:
**serial** \(default\)
``serial`` \(default\)
Worker hosts will be upgraded one at a time.
**ignore**
``ignore``
Worker hosts will not be upgraded.
- Alarm Restrictions
@ -177,8 +179,8 @@ described below to upgrade the remaining nodes of the |prod|.
relocated before it is upgraded.
#. Run the :command:`sw-manager upgrade-strategy show` command to display the
current-phase-completion displaying the field goes from 0% to 100% in
various increments. Once at 100%, it returns:
current-phase-completion percentage progress indicator in various
increments. Once at 100%, it returns:
.. code-block:: none
@ -196,7 +198,7 @@ described below to upgrade the remaining nodes of the |prod|.
build-result: success
build-reason:
#. Apply the upgrade-strategy. You can optionally apply a single stage at a
#. Apply the upgrade strategy. You can optionally apply a single stage at a
time.
.. code-block:: none
@ -214,7 +216,7 @@ described below to upgrade the remaining nodes of the |prod|.
state: applying
inprogress: true
While an upgrade-strategy is being applied, it can be aborted. This results
While an upgrade strategy is being applied, it can be aborted. This results
in:
- The current step will be allowed to complete.
@ -222,9 +224,9 @@ described below to upgrade the remaining nodes of the |prod|.
- If necessary an abort phase will be created and applied, which will
attempt to unlock any hosts that were locked.
After an upgrade-strategy has been applied \(or aborted\) it must be
deleted before another upgrade-strategy can be created. If an
upgrade-strategy application fails, you must address the issue that caused
After an upgrade strategy has been applied \(or aborted\) it must be
deleted before another upgrade strategy can be created. If an
upgrade strategy application fails, you must address the issue that caused
the failure, then delete/re-create the strategy before attempting to apply
it again.
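For example, a minimal recovery sequence under default strategy options
\(illustrative; address the root cause of the failure first\):

.. code-block:: none

   ~(keystone_admin)]$ sw-manager upgrade-strategy delete
   ~(keystone_admin)]$ sw-manager upgrade-strategy create
   ~(keystone_admin)]$ sw-manager upgrade-strategy apply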

View File

@ -6,9 +6,10 @@
Perform an Orchestrated Upgrade
===============================
You can perform a partially-Orchestrated Upgrade of a |prod| system using the CLI and Horizon
Web interface. Upgrade and stability of the initial controller node must be done manually
before using upgrade orchestration to orchestrate the remaining nodes of the |prod|.
You can perform a partially orchestrated upgrade of a |prod| system using the
CLI and Horizon Web interface. The initial controller node must be manually
upgraded and returned to a stable state before using upgrade orchestration to
upgrade the remaining nodes of the |prod|.
.. rubric:: |context|
@ -50,31 +51,31 @@ described below to upgrade the remaining nodes of the |prod| system.
#. Click the **Create Strategy** button.
The Create Strategy dialog appears.
The **Create Strategy** dialog appears.
#. Create an upgrade strategy by specifying settings for the parameters in the
Create Strategy dialog box.
**Create Strategy** dialog box.
Create an upgrade strategy, specifying the following parameters:
- storage-apply-type:
**serial** \(default\)
``serial`` \(default\)
Storage hosts will be upgraded one at a time.
**parallel**
``parallel``
Storage hosts will be upgraded in parallel, ensuring that only one
storage node in each replication group is upgraded at a time.
**ignore**
``ignore``
Storage hosts will not be upgraded.
- worker-apply-type:
**serial** \(default\):
``serial`` \(default\):
Worker hosts will be upgraded one at a time.
**parallel**
``parallel``
Worker hosts will be upgraded in parallel, ensuring that:
- At most max-parallel-worker-hosts \(see below\) worker hosts
@ -86,10 +87,10 @@ described below to upgrade the remaining nodes of the |prod| system.
- Worker hosts with no application pods are upgraded before
worker hosts with application pods.
**ignore**
``ignore``
Worker hosts will not be upgraded.
**max-parallel-worker-hosts**
``max-parallel-worker-hosts``
Specify the maximum worker hosts to upgrade in parallel \(minimum:
2, maximum: 10\).
@ -98,19 +99,19 @@ described below to upgrade the remaining nodes of the |prod| system.
(50), the value shall be at the maximum 2, which represents the
minimum value.
**alarm-restrictions**
``alarm-restrictions``
This option lets you specify how upgrade orchestration behaves when
alarms are present.
You can use the CLI command :command:`fm alarm-list
--mgmt_affecting` to view the alarms that are management affecting.
**Strict**
``Strict``
The default strict option will result in upgrade orchestration
failing if there are any alarms present in the system \(except
for a small list of alarms\).
**Relaxed**
``Relaxed``
This option allows orchestration to proceed if alarms are
present, as long as none of these alarms are management
affecting.
@ -157,10 +158,10 @@ described below to upgrade the remaining nodes of the |prod| system.
NOT updated, but any additional pods on each worker host will be
relocated before it is upgraded.
#. Apply the upgrade-strategy. You can optionally apply a single stage at a
#. Apply the upgrade strategy. You can optionally apply a single stage at a
time.
While an upgrade-strategy is being applied, it can be aborted. This results
While an upgrade strategy is being applied, it can be aborted. This results
in:
- The current step will be allowed to complete.
@ -168,9 +169,9 @@ described below to upgrade the remaining nodes of the |prod| system.
- If necessary an abort phase will be created and applied, which will
attempt to unlock any hosts that were locked.
After an upgrade-strategy has been applied \(or aborted\) it must be
deleted before another upgrade-strategy can be created. If an
upgrade-strategy application fails, you must address the issue that caused
After an upgrade strategy has been applied \(or aborted\) it must be
deleted before another upgrade strategy can be created. If an
upgrade strategy application fails, you must address the issue that caused
the failure, then delete/re-create the strategy before attempting to apply
it again.

View File

@ -18,9 +18,9 @@ before they can be applied.
.. parsed-literal::
$ sudo sw-patch upload /home/sysadmin/patches/|pn|-CONTROLLER_<nn.nn>_PATCH_0001.patch
Cloud_Platform__CONTROLLER_nn.nn_PATCH_0001 is now available
Cloud_Platform__CONTROLLER_<nn.nn>_PATCH_0001 is now available
where *nn.nn* in the update file name is the |prod| release number.
where <nn.nn> in the update file name is the |prod| release number.
This example uploads a single update to the storage area. You can specify
multiple update files on the same command separating their names with
@ -42,7 +42,7 @@ before they can be applied.
$ sudo sw-patch query
The update state is *Available* now, indicating that it is included in the
The update state displays *Available*, indicating that it is included in the
storage area. Further details about the updates can be retrieved as
follows:

View File

@ -20,10 +20,10 @@ version of an update has been committed to the system.
The :command:`query-dependencies` command will show a list of updates that
are required by the specified update \(including itself\). The
**--recursive** option will crawl through those dependencies to return a
``--recursive`` option will crawl through those dependencies to return a
list of all the updates in the specified update's dependency tree. This
query is used by the “commit” command in calculating the set of updates to
be committed.For example,
query is used by the :command:`commit` command in calculating the set of
updates to be committed. For example,
.. parsed-literal::
@ -48,12 +48,12 @@ version of an update has been committed to the system.
updates to be committed. The commit set is calculated by querying the
dependencies of each specified update.
The **--all** option, without the **--release** option, commits all updates
The ``--all`` option, without the ``--release`` option, commits all updates
of the currently running release. When two releases are on the system use
the **--release** option to specify a particular release's updates if
committing all updates for the non-running release. The **--dry-run**
the ``--release`` option to specify a particular release's updates if
committing all updates for the non-running release. The ``--dry-run``
option shows the list of updates to be committed and how much disk space
will be freed up. This information is also shown without the **--dry-run**
will be freed up. This information is also shown without the ``--dry-run``
option, before prompting to continue with the operation. An update can only
be committed once it has been fully applied to the system, and cannot be
removed after.
@ -61,7 +61,7 @@ version of an update has been committed to the system.
Following are examples that show the command usage.
The following command lists the status of all updates that are in an
APPLIED state.
*Applied* state.
.. code-block:: none
@ -84,7 +84,7 @@ version of an update has been committed to the system.
Would you like to continue? [y/N]: y
The patches have been committed.
The following command shows the updates now in the COMMITTED state.
The following command shows the updates now in the *Committed* state.
.. parsed-literal::

View File

@ -23,20 +23,20 @@ following state transitions:
Use the command :command:`sw-patch remove` to trigger this transition.
**Partial-Remove to Available**
Use the command :command:`sudo sw-patch host-install-async` <hostname>
Use the command :command:`sudo sw-patch host-install-async <hostname>`
repeatedly targeting each one of the applicable hosts in the cluster. The
transition to the *Available* state is complete when the update is removed
from all target hosts. The update remains in the update storage area as if
it had just been uploaded.
.. note::
The command :command:`sudo sw-patch host-install-async` <hostname> both
The command :command:`sudo sw-patch host-install-async <hostname>` both
installs and removes updates as necessary.
The following example describes removing an update that applies only to the
controllers. Removing updates can be done using the Horizon Web interface,
also, as discussed in :ref:`Install Reboot-Required Software Updates Using
Horizon <installing-reboot-required-software-updates-using-horizon>`.
controllers. Update removal can be done using the Horizon Web interface as
discussed in :ref:`Install Reboot-Required Software Updates Using Horizon
<installing-reboot-required-software-updates-using-horizon>`.
.. rubric:: |proc|
@ -52,7 +52,7 @@ Horizon <installing-reboot-required-software-updates-using-horizon>`.
|pn|-|pvr|-PATCH_0001 Applied
In this example the update is listed in the *Applied* state, but it could
be in the *Partial-Apply* state as well.
also be in the *Partial-Apply* state.
#. Remove the update.
@ -62,7 +62,7 @@ Horizon <installing-reboot-required-software-updates-using-horizon>`.
|pn|-|pvr|-PATCH_0001 has been removed from the repo
The update is now in the *Partial-Remove* state, ready to be removed from
the impacted hosts where it was already installed.
the impacted hosts where it is currently installed.
#. Query the updating status of all hosts in the cluster.
@ -83,7 +83,7 @@ Horizon <installing-reboot-required-software-updates-using-horizon>`.
In this example, the controllers have updates ready to be removed, and
therefore must be rebooted.
#. Remove all pending-for-removal updates from **controller-0**.
#. Remove all pending-for-removal updates from controller-0.
#. Swact controller services away from controller-0.
@ -93,7 +93,7 @@ Horizon <installing-reboot-required-software-updates-using-horizon>`.
.. code-block:: none
~(keystone_admin)]$ sudo sw-patch host-install-async <controller-0>
~(keystone_admin)]$ sudo sw-patch host-install-async controller-0
#. Unlock controller-0.
@ -109,7 +109,7 @@ Horizon <installing-reboot-required-software-updates-using-horizon>`.
.. code-block:: none
~(keystone_admin)]$ sudo sw-patch host-install-async <controller-1>
~(keystone_admin)]$ sudo sw-patch host-install-async controller-1
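Once all hosts have installed the removal, you can confirm that the update
has returned to the *Available* state:

.. code-block:: none

   ~(keystone_admin)]$ sudo sw-patch query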
.. rubric:: |result|

View File

@ -11,7 +11,7 @@ upgrade, however, the rollback will impact the hosting of applications.
The upgrade abort procedure can only be applied before the
:command:`upgrade-complete` command is issued. Once this command is issued
the upgrade can not be aborted. If the return to the previous release is required,
the upgrade cannot be aborted. If you must revert to the previous release,
then restore the system using the backup data taken prior to the upgrade.
In some scenarios additional actions will be required to complete the upgrade
@ -23,7 +23,7 @@ abort. It may be necessary to restore the system from a backup.
.. code-block:: none
$ system upgrade-abort
~(keystone_admin)]$ system upgrade-abort
Once this is done there is no going back; the upgrade must be completely
aborted.
@ -41,13 +41,13 @@ abort. It may be necessary to restore the system from a backup.
.. code-block:: none
$ system host-swact controller-0
~(keystone_admin)]$ system host-swact controller-0
#. Lock controller-0.
.. code-block:: none
$ system host-lock controller-0
~(keystone_admin)]$ system host-lock controller-0
#. Wipe the disk and power down all storage \(if applicable\) and worker hosts.
@ -66,13 +66,13 @@ abort. It may be necessary to restore the system from a backup.
.. code-block:: none
$ system host-lock <hostID>
~(keystone_admin)]$ system host-lock <hostID>
#. Downgrade controller-0.
.. code-block:: none
$ system host-downgrade controller-0
~(keystone_admin)]$ system host-downgrade controller-0
The host is re-installed with the previous release load.
@ -80,7 +80,7 @@ abort. It may be necessary to restore the system from a backup.
.. code-block:: none
$ system host-unlock controller-0
~(keystone_admin)]$ system host-unlock controller-0
.. note::
Wait for controller-0 to become unlocked-enabled. Wait for the
@ -90,7 +90,7 @@ abort. It may be necessary to restore the system from a backup.
.. code-block:: none
$ system host-swact controller-1
~(keystone_admin)]$ system host-swact controller-1
Swacting back to controller-0 will switch back to using the previous
release databases, which were frozen at the time of the swact to
@ -100,11 +100,11 @@ abort. It may be necessary to restore the system from a backup.
.. code-block:: none
$ system host-lock controller-1
~(keystone_admin)]$ system host-lock controller-1
.. code-block:: none
$ system host-downgrade controller-1
~(keystone_admin)]$ system host-downgrade controller-1
The host is re-installed with the previous release load.
@ -112,7 +112,7 @@ abort. It may be necessary to restore the system from a backup.
.. code-block:: none
$ system host-unlock controller-1
~(keystone_admin)]$ system host-unlock controller-1
#. Power up and unlock the storage hosts one at a time \(if using a Ceph
@ -134,7 +134,7 @@ abort. It may be necessary to restore the system from a backup.
.. code-block:: none
$ system upgrade-complete
~(keystone_admin)]$ system upgrade-complete
This cleans up the upgrade release, configuration, databases, and so forth.
@ -142,4 +142,4 @@ abort. It may be necessary to restore the system from a backup.
.. code-block:: none
$ system load-delete
~(keystone_admin)]$ system load-delete

View File

@ -18,7 +18,7 @@ has upgraded successfully.
.. code-block:: none
$ system upgrade-abort
~(keystone_admin)]$ system upgrade-abort
The upgrade state is set to ``aborting``. Once this is executed, it cannot
be cancelled; the upgrade must be completely aborted.
@ -36,7 +36,7 @@ has upgraded successfully.
.. code-block:: none
$ system host-swact controller-1
~(keystone_admin)]$ system host-swact controller-1
If controller-1 was active with the new upgrade release, swacting back to
controller-0 will switch back to using the previous release databases,
@ -47,8 +47,8 @@ has upgraded successfully.
.. code-block:: none
$ system host-lock controller-1
$ system host-downgrade controller-1
~(keystone_admin)]$ system host-lock controller-1
~(keystone_admin)]$ system host-downgrade controller-1
The host is re-installed with the previous release load.
@ -63,16 +63,16 @@ has upgraded successfully.
.. code-block:: none
$ system host-unlock controller-1
~(keystone_admin)]$ system host-unlock controller-1
#. Complete the upgrade.
.. code-block:: none
$ system upgrade-complete
~(keystone_admin)]$ system upgrade-complete
#. Delete the newer upgrade release that has been aborted.
.. code-block:: none
$ system load-delete <loadID>
~(keystone_admin)]$ system load-delete <loadID>
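If you need to look up the <loadID> of the aborted release first, a load
listing helps; for example:

.. code-block:: none

   ~(keystone_admin)]$ system load-list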

View File

@ -25,7 +25,7 @@ following items:
- feature enhancements
Software updates can be installed manually or by the Update Orchestrator which
Software updates can be installed manually or by the Update Orchestrator, which
automates a rolling install of an update across all of the |prod-long| hosts.
For more information on manual updates, see :ref:`Manage Software Updates
<managing-software-updates>`. For more information on upgrade orchestration,
@ -40,7 +40,7 @@ see :ref:`Orchestrated Software Update <update-orchestration-overview>`.
.. xbooklink For more information, see, |distcloud-doc|: :ref:`Update Management for
Distributed Cloud <update-management-for-distributed-cloud>`.
The |prod| handles multiple updates being applied and removed at once. Software
|prod| handles multiple updates being applied and removed at once. Software
updates can modify and update any area of |prod| software, including the kernel
itself. For information on populating, installing and removing software
updates, see :ref:`Manage Software Updates <managing-software-updates>`.
@ -73,9 +73,9 @@ the |prod| software:
#. **Application Software Updates**
These software updates apply to software being managed through the
StarlingX Application Package Manager, that is, ':command:`system
application-upload/apply/remove/delete`'. |prod| delivers some software
through this mechanism, for example, **platform-integ-apps**.
StarlingX Application Package Manager, that is, :command:`system
application-upload/apply/remove/delete`. |prod| delivers some software
through this mechanism, for example, ``platform-integ-apps``.
For software updates for these applications, download the updated
application tarball, containing the updated FluxCD manifest, and updated

View File

@ -18,7 +18,7 @@ hosts are upgraded one at time while continuing to provide its hosting services
to its hosted applications. An upgrade can be performed manually or using
Upgrade Orchestration, which automates much of the upgrade procedure, leaving a
few manual steps to prevent operator oversight. For more information on manual
upgrades, see :ref:`Manual PLatform Components Upgrade
upgrades, see :ref:`Manual Platform Components Upgrade
<manual-upgrade-overview>`. For more information on upgrade orchestration, see
:ref:`Orchestrated Platform Component Upgrade
<orchestration-upgrade-overview>`.
@ -26,7 +26,7 @@ upgrades, see :ref:`Manual PLatform Components Upgrade
.. warning::
Do NOT use information in the |updates-doc| guide for |prod-dc|
orchestrated software upgrades. If information in this document is used for
a |prod-dc| orchestrated upgrade, the upgrade will fail resulting
a |prod-dc| orchestrated upgrade, the upgrade will fail, resulting
in an outage. The |prod-dc| Upgrade Orchestrator automates a
recursive rolling upgrade of all subclouds and all hosts within the
subclouds.
@ -40,40 +40,41 @@ Before starting the upgrades process:
.. _software-upgrades-ul-ant-vgq-gmb:
- the system must be “patch current”
- The system must be 'patch current'.
- there must be no management-affecting alarms present on the system
- There must be no management-affecting alarms present on the system.
- ensure any certificates managed by cert manager will not be renewed during
the upgrade process
- Ensure that any certificates managed by cert manager will not be renewed
during the upgrade process.
- the new software load must be imported, and
- The new software load must be imported.
- a valid license file for the new software release must be installed
- A valid license file for the new software release must be installed.
The upgrade process starts by upgrading the controllers. The standby controller
is upgraded first and involves loading the standby controller with the new
release of software and migrating all the controller services' databases for
the new release of software. Activity is switched to the upgraded controller,
release of software and migrating all the controller services' databases for the
new release of software. Activity is switched to the upgraded controller,
running in a 'compatibility' mode where all inter-node messages are using
message formats from the old release of software. Before upgrading the second
controller, is the "point-of-no-return for an in-service abort" of the upgrades
process. The second controller is loaded with the new release of software and
becomes the new Standby controller. For more information on manual upgrades,
see :ref:`Manual Platform Components Upgrade <manual-upgrade-overview>` .
message formats from the old release of software. Prior to upgrading the second
controller, you reach a "point-of-no-return for an in-service abort" of the
upgrade process. The second controller is loaded with the new release of
software and becomes the new Standby controller. For more information on manual
upgrades, see :ref:`Manual Platform Components Upgrade
<manual-upgrade-overview>`.
If present, storage nodes are locked, upgraded and unlocked one at a time in
order to respect the redundancy model of |prod| storage nodes. Storage nodes
can be upgraded in parallel if using upgrade orchestration.
Upgrade of worker nodes is the next step in the process. When locking a worker
node the node is tainted, such that Kubernetes shuts down any pods on this
worker node and restarts the pods on another worker node. When upgrading the
worker node, the worker node network boots/installs the new software from the
active controller. After unlocking the worker node, the worker services are
running in a 'compatibility' mode where all inter-node messages are using
message formats from the old release of software. Note that the worker nodes
can only be upgraded in parallel if using upgrade orchestration.
Worker nodes are then upgraded. Worker nodes are tainted when locked, such that
Kubernetes shuts down any pods on this worker node and restarts the pods on
another worker node. When upgrading the worker node, the worker node network
boots/installs the new software from the active controller. After unlocking the
worker node, the worker services are running in a 'compatibility' mode where all
inter-node messages are using message formats from the old release of software.
Note that the worker nodes can only be upgraded in parallel if using upgrade
orchestration.
The final step of the upgrade process is to activate and complete the upgrade.
This involves disabling 'compatibility' modes on all hosts and clearing the
@ -97,9 +98,9 @@ resolved. Issues specific to a storage or worker host can be addressed by
temporarily downgrading the host, addressing the issues and then upgrading the
host again, or in some cases by replacing the node.
In extremely rare cases, it may be necessary to abort an upgrade. This is a
last resort and should only be done if there is no other way to address the
issue within the context of the upgrade. There are two cases for doing such an
In extremely rare cases, it may be necessary to abort an upgrade. This is a last
resort and should only be done if there is no other way to address the issue
within the context of the upgrade. There are two scenarios for doing such an
abort:
.. _software-upgrades-ul-dqp-brt-cx:

View File

@ -32,13 +32,14 @@ instances and since the firmware update is a reboot required operation for a
host, the strategy offers **stop/start** or **migrate** options for managing
instances over the **lock/unlock** \(reboot\) steps in the update process.
You must use the **sw-manager** |CLI| tool to **create**, and then **apply** the
update strategy. A created strategy can be monitored with the **show** command.
For more information, see :ref:`Firmware Update Orchestration Using the CLI
You must use the :command:`sw-manager` |CLI| commands to create, and then apply
the update strategy. A created strategy can be monitored with the
:command:`sw-manager show` command. For more information, see :ref:`Firmware
Update Orchestration Using the CLI
<firmware-update-orchestration-using-the-cli>`.
Firmware update orchestration automatically iterates through all
**unlocked-enabled** hosts on the system looking for hosts with the worker
*unlocked-enabled* hosts on the system looking for hosts with the worker
function that need firmware update and then proceeds to update them on the
strategy :command:`apply` action.
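A minimal sketch of driving this from the |CLI|, assuming default strategy
options:

.. code-block:: none

   ~(keystone_admin)]$ sw-manager fw-update-strategy create
   ~(keystone_admin)]$ sw-manager fw-update-strategy apply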
@ -52,13 +53,13 @@ After creating the *Firmware Update Orchestration Strategy*, you can either
apply the entire strategy automatically, or manually apply individual stages to
control and monitor the firmware update progress one stage at a time.
When the firmware update strategy is **applied**, if the system is All-in-one,
When the firmware update strategy is applied, if the system is All-in-one,
the controllers are updated first, one after the other with a swact in between,
followed by the remaining worker hosts according to the selected worker apply
concurrency \(**serial** or **parallel**\) method.
The strategy creation default is to update the worker hosts serially unless the
**parallel** worker apply type option is specified which configures the
``parallel`` worker apply type option is specified which configures the
firmware update process for worker hosts to be in parallel \(up to a maximum
parallel number\) to reduce the overall firmware update installation time.
@ -73,7 +74,8 @@ steps involved in a firmware update for a single or group of hosts include:
#. Alarm Query is an update pre-check.
#. Firmware update non-service affecting update that can take over 45 minutes.
#. Firmware update is a non-service affecting update that can take over 45
minutes.
#. Lock Host.
@ -89,11 +91,11 @@ Strategy* considers any configured server groups and host aggregates when
creating the stages to reduce the impact to running instances. The *Firmware
Update Orchestration Strategy* automatically manages the instances during the
strategy application process. The instance management options include
**start-stop** or **migrate**.
``start-stop`` or ``migrate``.
.. _htb1590431033292-ul-vcp-dvs-tlb:
- **start-stop**: where instances are stopped following the actual firmware
- ``start-stop``: where instances are stopped following the actual firmware
update but before the lock operation and then automatically started again
after the unlock completes. This is typically used for instances that do
not support migration or for cases where migration takes too long. To
@ -101,6 +103,6 @@ strategy application process. The instance management options include
instance, the instance\(s\) should be protected and grouped into an
anti-affinity server group\(s\) with its standby instance.
- **migrate**: where instances are moved off a host following the firmware
- ``migrate``: where instances are moved off a host following the firmware
update but before the host is locked. Instances with **Live Migration**
support are **Live Migrated**. Otherwise, they are **Cold Migrated**.

View File

@ -35,35 +35,33 @@ operation for a host, the strategy offers **stop/start** or **migrate** options
for managing instances over the **lock/unlock** \(reboot\) steps in the upgrade
process.
You must use the **sw-manager** CLI tool to **create**, and then **apply** the
upgrade strategy. A created strategy can be monitored with the **show**
command.
You must use the :command:`sw-manager` CLI tool to create, and then apply the
upgrade strategy. A created strategy can be monitored with the :command:`show` command.
Kubernetes version upgrade orchestration automatically iterates through all
**unlocked-enabled** hosts on the system looking for hosts with the worker
function that need Kubernetes version upgrade and then proceeds to upgrade them
*unlocked-enabled* hosts on the system looking for hosts with the worker
function that need Kubernetes version upgrades and then proceeds to upgrade them
on the strategy :command:`apply` action.
.. note::
Controllers (including |AIO| controllers) are upgraded before worker only
hosts. Storage hosts do not run Kubernetes so Kubernetes is not upgraded
on them, although they still may be patched.
Controllers (including |AIO| controllers) are upgraded before worker-only
hosts. Since storage hosts do not run Kubernetes, no upgrade is performed,
although they may still be patched.
After creating the *Kubernetes Version Upgrade Orchestration Strategy*, you can
either apply the entire strategy automatically, or manually apply individual
stages to control and monitor the Kubernetes version upgrade progress one stage
at a time.
When the Kubernetes version upgrade strategy is **applied**, if the system is
When the Kubernetes version upgrade strategy is applied, if the system is
All-in-one, the controllers are upgraded first, one after the other with a
swact in between, followed by the remaining worker hosts according to the
selected worker apply concurrency \(**serial** or **parallel**\) method.
The strategy creation default is to upgrade the worker hosts serially unless
the **parallel** worker apply type option is specified which configures the
Kubernetes version upgrade process for worker hosts to be in parallel \(up to a
maximum parallel number\) to reduce the overall Kubernetes version upgrade
installation time.
By default, strategies upgrade the worker hosts serially unless the **parallel**
worker apply type option is specified, which configures the Kubernetes version
upgrade process for worker hosts to be in parallel \(up to a maximum parallel
number\). This reduces the overall Kubernetes version upgrade installation time.
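A minimal sketch, assuming default options \(any release-specific required
options, such as the target Kubernetes version, are omitted here\):

.. code-block:: none

   ~(keystone_admin)]$ sw-manager kube-upgrade-strategy create
   ~(keystone_admin)]$ sw-manager kube-upgrade-strategy apply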
The upgrade takes place in two phases. The first phase upgrades the patches
(controllers, storage and then workers), and the second phase upgrades
@ -113,25 +111,25 @@ Upgrade Operations Requiring Manual Migration
On systems with |prefix|-openstack application, the *Kubernetes Version Upgrade
Orchestration Strategy* considers any configured server groups and host
aggregates when creating the stages to reduce the impact to running instances.
The *Kubernetes Version Upgrade Orchestration Strategy* automatically manages
the instances during the strategy application process. The instance management
options include **start-stop** or **migrate**.
aggregates when creating the stages in order to reduce the impact to running
instances. The *Kubernetes Version Upgrade Orchestration Strategy* automatically
manages the instances during the strategy application process. The instance
management options include **start-stop** or **migrate**.
.. _htb1590431033292-ul-vcp-dvs-tlb:
- **start-stop**: where instances are stopped following the actual Kubernetes
- **start-stop**: Instances are stopped following the actual Kubernetes
upgrade but before the lock operation and then automatically started again
after the unlock completes. This is typically used for instances that do
not support migration or for cases where migration takes too long. To
ensure this does not impact the high-level service being provided by the
instance, the instance\(s\) should be protected and grouped into an
anti-affinity server group\(s\) with its standby instance.
after the unlock completes. This is typically used for instances that do not
support migration or for cases where migration takes too long. To ensure
this does not impact the high-level service being provided by the instance,
the instance\(s\) should be protected and grouped into an anti-affinity
server group\(s\) with its standby instance.
- **migrate**: where instances are moved off a host following the Kubernetes
upgrade but before the host is locked. Instances with **Live Migration**
support are **Live Migrated**. Otherwise, they are **Cold Migrated**.
- **migrate**: Instances are moved off a host following the Kubernetes upgrade
but before the host is locked. Instances with **Live Migration** support are
**Live Migrated**. Otherwise, they are **Cold Migrated**.
.. _kubernetes-update-operations-requiring-manual-migration:
@ -149,34 +147,33 @@ Do the following to manage the instance re-location manually:
.. _rbp1590431075472-ul-mgr-kvs-tlb:
- Manually perform Kubernetes version upgrade at least one openstack-compute worker host. This
assumes that at least one openstack-compute worker host does not have any
instances, or has instances that can be migrated. For more information on
manually updating a host, see :ref:`Manual Kubernetes Version Upgrade
- Manually perform a Kubernetes version upgrade of at least one
openstack-compute worker host. This assumes that at least one
openstack-compute worker host does not have any instances, or has instances
that can be migrated. For more information on manually updating a host, see
:ref:`Manual Kubernetes Version Upgrade
<manual-kubernetes-components-upgrade>`.
- If the migration is prevented by limitations in the VNF or virtual
application, perform the following:
- Create new instances on an already upgraded openstack-compute worker
#. Create new instances on an already upgraded openstack-compute worker
host.
- Manually migrate the data from the old instances to the new instances.
#. Manually migrate the data from the old instances to the new instances.
.. note::
This is specific to your environment and depends on the virtual
application running in the instance.
- Terminate the old instances.
#. Terminate the old instances.
- If the migration is prevented by the size of the instances local disks:
- For each openstack-compute worker host that has instances that cannot
be migrated, manually move the instances using the CLI.
- If the migration is prevented by the size of the instances' local disks, then
for each openstack-compute worker host that has instances that cannot
be migrated, manually move the instances using the CLI.
Once all openstack-compute worker hosts containing instances that cannot be
migrated have been Kubernetes version upgraded, Kubernetes version upgrade
orchestration can then be used to upgrade the remaining worker hosts.
migrated have been Kubernetes-version upgraded, Kubernetes version upgrade
orchestration can be used to upgrade the remaining worker hosts.

View File

@ -16,8 +16,8 @@ interface dialog, described in :ref:`Configuring Update Orchestration
.. note::
To use update orchestration commands, you need administrator privileges.
You must log in to the active controller as user **sysadmin** and source
the /etc/platform/openrc script to obtain administrator privileges. Do not
use **sudo**.
the ``/etc/platform/openrc`` script to obtain administrator privileges. Do not
use :command:`sudo`.
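For example, where <oam-floating-ip-address> is a placeholder for your
system's OAM floating IP address:

.. code-block:: none

   $ ssh sysadmin@<oam-floating-ip-address>
   $ source /etc/platform/openrc
   ~(keystone_admin)]$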
.. note::
Management-affecting alarms cannot be ignored at the indicated severity

View File

@ -14,7 +14,7 @@ operation.
:depth: 1
You can configure and run update orchestration using the CLI, the Horizon Web
interface, or the stx-nfv REST API.
interface, or the ``stx-nfv`` REST API.
.. note::
Updating of |prod-dc| is distinct from updating of other |prod|
@ -74,9 +74,9 @@ the same time. Update orchestration only locks and unlocks \(that is, reboots\)
a host to install an update if at least one reboot-required update has been
applied.
The user first creates an update orchestration strategy, or plan, for the
automated updating procedure. This customizes the update orchestration, using
parameters to specify:
You first create an update orchestration strategy, or plan, for the automated
updating procedure. This customizes the update orchestration, using parameters
to specify:
.. _update-orchestration-overview-ul-eyw-fyr-31b:

View File

@ -27,19 +27,19 @@ To keep track of software update installation, you can use the
.. parsed-literal::
~(keystone_admin)]$ sudo sw-patch query
Patch ID Patch State
=========== ============
|pvr|-nn.nn_PATCH_0001 Applied
Patch ID Patch State
=========== ============
|pvr|-<nn>.<nn>_PATCH_0001 Applied
where *nn.nn* in the update filename is the |prod| release number.
where <nn>.<nn> in the update filename is the |prod| release number.
This shows the **Patch State** for each of the updates in the storage area:
This shows the 'Patch State' for each of the updates in the storage area:
**Available**
``Available``
An update in the *Available* state has been added to the storage area, but
is not currently in the repository or installed on the hosts.
**Partial-Apply**
``Partial-Apply``
An update in the *Partial-Apply* state has been added to the software
updates repository using the :command:`sw-patch apply` command, but has not
been installed on all hosts that require it. It may have been installed on
@ -51,12 +51,12 @@ This shows the **Patch State** for each of the updates in the storage area:
node X, you cannot just install the non-reboot-required update to the
unlocked node X.
**Applied**
``Applied``
An update in the *Applied* state has been installed on all hosts that
require it.
You can use the :command:`sw-patch query-hosts` command to see which hosts are
fully updated \(**Patch Current**\). This also shows which hosts require
fully updated \(Patch Current\). This also shows which hosts require
reboot, either because they are not fully updated, or because they are fully
updated but not yet rebooted.
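For example:

.. code-block:: none

   ~(keystone_admin)]$ sudo sw-patch query-hosts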

View File

@ -16,11 +16,11 @@ of |prod| software.
- Perform a full backup to allow recovery.
.. note::
Back up files in the /home/sysadmin and /root directories prior
Back up files in the ``/home/sysadmin`` and ``/root`` directories prior
to doing an upgrade. Home directories are not preserved during backup or
restore operations, blade replacement, or upgrades.
- The system must be "patch current". All updates available for the current
- The system must be 'patch current'. All updates available for the current
release running on the system must be applied, and all patches must be
committed. To find and download applicable updates, visit the |dnload-loc|.
@ -29,9 +29,9 @@ of |prod| software.
.. note::
Make sure that the ``/home/sysadmin`` directory has enough space
(at least 2GB of free space), otherwise the upgrade may fail once it
starts. If more space is needed, it is recommended to delete the
``.iso bootimage`` previously imported after the `load-import` command.
(at least 2GB of free space), otherwise the upgrade may fail.
If more space is needed, it is recommended to delete the
``.iso`` bootimage previously imported after the :command:`load-import` command.
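A quick way to check the available space, using standard Linux tooling:

.. code-block:: none

   $ df -h /home/sysadmin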
- Transfer the new release software license file to controller-0, \(or onto a
USB stick\).
@ -41,8 +41,8 @@ of |prod| software.
- Unlock all hosts.
- All nodes must be unlocked. The upgrade cannot be started when there
are locked nodes \(the health check prevents it\).
- All nodes must be unlocked, as the health check prevents the upgrade
from starting if there are locked nodes.
.. note::
The upgrade procedure includes steps to resolve system health issues.
@ -51,8 +51,7 @@ of |prod| software.
#. Ensure that controller-0 is the active controller.
#. Install the license file for the release you are upgrading to, for example,
nn.nn.
#. Install the license file for the release you are upgrading to.
.. code-block:: none
@ -66,13 +65,12 @@ of |prod| software.
#. Import the new release.
#. Run the :command:`load-import` command on **controller-0** to import
#. Run the :command:`load-import` command on controller-0 to import
the new release.
First, source /etc/platform/openrc. Also, you must specify an exact
path to the \*.iso bootimage file and to the \*.sig bootimage signature
file.
Source ``/etc/platform/openrc``. Also, you must specify an exact
path to the ``*.iso`` bootimage file and to the ``*.sig`` bootimage
signature file.
.. code-block:: none
@ -89,7 +87,7 @@ of |prod| software.
| required_patches | |
+--------------------+-----------+
The :command:`load-import` must be done on **controller-0** and accepts
The :command:`load-import` must be done on controller-0 and accepts
relative paths.
.. note::
@ -112,7 +110,7 @@ of |prod| software.
The system must be 'patch current'. All software updates related to your
current |prod| software release must be uploaded, applied, and installed.
All software updates to the new |prod| release, only need to be uploaded
All software updates to the new |prod| release only need to be uploaded
and applied. The install of these software updates will occur automatically
during the software upgrade procedure as the hosts are reset to load the
new release of software.
@ -127,7 +125,7 @@ of |prod| software.
Check the current system health status, resolve any alarms and other issues
reported by the :command:`system health-query-upgrade` command, then
recheck the system health status to confirm that all **System Health**
fields are set to **OK**. For example:
fields are set to *OK*. For example:
.. code-block:: none
@ -148,10 +146,9 @@ of |prod| software.
All kubernetes applications are in a valid state: [OK]
Active controller is controller-0: [OK]
By default, the upgrade process cannot be run and is not recommended to be
run with active alarms present. Use the command :command:`system upgrade-start --force`
to force the upgrade process to start and ignore non-management-affecting
alarms.
By default, the upgrade process cannot be run with active alarms present.
Use the command :command:`system upgrade-start --force` to force the upgrade
process to start and ignore non-management-affecting alarms.
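For example:

.. code-block:: none

   ~(keystone_admin)]$ system upgrade-start --force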
.. note::
It is strongly recommended that you clear your system of any and all
@ -177,17 +174,18 @@ of |prod| software.
| to_release | nn.nn |
+--------------+--------------------------------------+
This will make a copy of the upgrade data onto a DRBD file system to be
This will make a copy of the upgrade data onto a |DRBD| file system to be
used in the upgrade. Configuration changes are not allowed after this point
until the swact to controller-1 is completed.
The following upgrade state applies once this command is executed:
- started:
- ``started``:
- State entered after :command:`system upgrade-start` completes.
- Release nn.nn system data \(for example, postgres databases\) has
- Release <nn>.<nn> system data \(for example, postgres databases\) has
been exported to be used in the upgrade.
- Configuration changes must not be made after this point, until the
@ -200,13 +198,14 @@ of |prod| software.
upgrade.
.. note::
Use the command :command:`system upgrade-start --force` to force the
upgrade process to start and ignore non-management-affecting alarms.
This should ONLY be done if you feel these alarms will not be an issue
over the upgrades process.
This should **ONLY** be done if you ascertain that these alarms will not
interfere with the upgrade process.
On systems with Ceph storage, it also checks that the Ceph cluster is
healthy.
On systems with Ceph storage, the process also checks that the Ceph cluster
is healthy.
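If you want to inspect the Ceph cluster state yourself, the standard Ceph
status command can be used; for example:

.. code-block:: none

   ~(keystone_admin)]$ ceph -s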
#. Upgrade controller-1.
@ -231,7 +230,7 @@ of |prod| software.
The following data migration states apply when this command is
executed:
- data-migration:
- ``data-migration``:
- State entered when :command:`system host-upgrade controller-1`
is executed.
@ -245,21 +244,21 @@ of |prod| software.
You can view the upgrade progress on controller-1 using the
serial console.
- data-migration-complete or upgrading-controllers:
- ``data-migration-complete`` or ``upgrading-controllers``:
- State entered when controller-1 upgrade is complete.
- System data has been successfully migrated from release nn.nn
to release nn.nn.
- System data has been successfully migrated from release <nn>.<nn>
to the newer version.
- data-migration-failed:
- ``data-migration-failed``:
- State entered if data migration on controller-1 fails.
- Upgrade must be aborted.
.. note::
Review the /var/log/sysinv.log on the active controller for
Review the ``/var/log/sysinv.log`` on the active controller for
more details on data migration failure.
#. Check the upgrade state.
@ -277,7 +276,7 @@ of |prod| software.
+--------------+--------------------------------------+
If the :command:`upgrade-show` status indicates
'data-migration-failed', then there is an issue with the data
*data-migration-failed*, then there is an issue with the data
migration. Check the issue before proceeding to the next step.
#. Unlock controller-1.
@ -286,20 +285,22 @@ of |prod| software.
~(keystone_admin)]$ system host-unlock controller-1
Wait for controller-1 to become **unlocked-enabled**. Wait for the DRBD
sync **400.001** Services-related alarm is raised and then cleared.
Wait for controller-1 to enter the state *unlocked-enabled*. Wait for
the |DRBD| sync **400.001** Services-related alarm to be raised and then
cleared.
The following states apply when this command is executed.
- upgrading-controllers:
- ``upgrading-controllers``:
- State entered when controller-1 has been unlocked and is
running release nn.nn software.
If it transitions to **unlocked-disabled-failed**, check the issue
before proceeding to the next step. The alarms may indicate a
If the controller transitions to **unlocked-disabled-failed**, check the
issue before proceeding to the next step. The alarms may indicate a
configuration error. Check the configuration logs on
controller-1, \(for example, Error logs in controller1:/var/log/puppet\).
controller-1, \(for example, Error logs in
controller1:``/var/log/puppet``\).
#. Set controller-1 as the active controller. Swact to controller-1.
@ -307,36 +308,33 @@ of |prod| software.
~(keystone_admin)]$ system host-swact controller-0
Wait until all services are enabled / active and the swact is complete
on controller-0 before proceeding to the next step. Use the following
command below:
Wait until services have become active on the new active controller-1 before
proceeding to the next step. The swact is complete when all services on
controller-1 are in the state ``enabled-active``. Use the
:command:`system servicegroup-list` command to monitor progress.
.. code-block:: none
~(keystone_admin)]$ system servicegroup-list
#. Upgrade **controller-0**.
#. Upgrade controller-0.
#. Lock **controller-0**.
#. Lock controller-0.
.. code-block:: none
~(keystone_admin)]$ system host-lock controller-0
#. Upgrade **controller-0**.
#. Upgrade controller-0.
.. code-block:: none
~(keystone_admin)]$ system host-upgrade controller-0
#. Unlock **controller-0**.
#. Unlock controller-0.
.. code-block:: none
~(keystone_admin)]$ system host-unlock controller-0
Wait until the DRBD sync **400.001** Services-related alarm is raised
Wait until the |DRBD| sync **400.001** Services-related alarm is raised
and then cleared before proceeding to the next step.
- upgrading-hosts:
@ -356,7 +354,7 @@ of |prod| software.
Clear all alarms unrelated to the upgrade process.
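You can review outstanding alarms with, for example:

.. code-block:: none

   ~(keystone_admin)]$ fm alarm-list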
#. If using Ceph storage backend, upgrade the storage nodes one at a time.
#. If using a Ceph storage backend, upgrade the storage nodes one at a time.
.. note::
Proceed to step 13 if no storage/worker node is present.
@ -370,10 +368,10 @@ of |prod| software.
~(keystone_admin)]$ system host-lock storage-0
#. Verify that the OSDs are down after the storage node is locked.
#. Verify that the |OSDs| are down after the storage node is locked.
In the Horizon interface, navigate to **Admin** \> **Platform** \>
**Storage Overview** to view the status of the OSDs.
**Storage Overview** to view the status of the |OSDs|.
#. Upgrade storage-0.
@ -381,7 +379,7 @@ of |prod| software.
~(keystone_admin)]$ system host-upgrade storage-0
The upgrade is complete when the node comes online, and at that point,
The upgrade is complete when the node comes online. At that point
you can safely unlock the node.
After upgrading a storage node, but before unlocking, there are Ceph
@ -408,7 +406,7 @@ of |prod| software.
**800.003**. The alarm is cleared after all storage nodes are
upgraded.
#. Upgrade worker hosts, one at a time, if any.
#. Upgrade worker hosts, if any, one at a time.
#. Lock worker-0.
@ -431,7 +429,7 @@ of |prod| software.
~(keystone_admin)]$ system host-unlock worker-0
Wait for all alarms to clear after the unlock before proceeding to the
After the unlock, wait for all alarms to clear before proceeding to the
next worker host.
#. Repeat the above steps for each worker host.
@ -442,9 +440,9 @@ of |prod| software.
~(keystone_admin)]$ system host-swact controller-1
Wait until services have gone active on the active controller-0 before
proceeding to the next step. When all services on controller-0 are
enabled-active, the swact is complete.
Wait until services have become available on the active controller-0 before
proceeding to the next step. When all services on controller-0 are in the
``enabled-active`` state, the swact is complete.
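As before, you can monitor the swact progress; for example:

.. code-block:: none

   ~(keystone_admin)]$ system servicegroup-list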
#. Activate the upgrade.
@ -460,31 +458,31 @@ of |prod| software.
| to_release | nn.nn |
+--------------+--------------------------------------+
During the running of the :command:`upgrade-activate` command, new
When running the :command:`upgrade-activate` command, new
configurations are applied to the controller. 250.001 \(**hostname
Configuration is out-of-date**\) alarms are raised and are cleared as the
configuration is applied. The upgrade state goes from **activating** to
**activation-complete** once this is done.
configuration is applied. The upgrade state goes from ``activating`` to
``activation-complete`` once this is done.
The following states apply when this command is executed.
**activation-requested**
``activation-requested``
State entered when :command:`system upgrade-activate` is executed.
**activating**
State entered when we have started activating the upgrade by applying
``activating``
State entered when the system has started activating the upgrade by applying
new configurations to the controller and compute hosts.
**activating-hosts**
``activating-hosts``
State entered when applying host-specific configurations. This state is
entered only if needed.
**activation-complete**
``activation-complete``
State entered when new configurations have been applied to all
controller and compute hosts.
#. Check the status of the upgrade again to see it has reached
**activation-complete**.
``activation-complete``.
.. code-block:: none

View File

@ -30,8 +30,8 @@ software.
- The system is patch current.
- There should be sufficient free space in /opt/platform-backup. Remove
any unused files if necessary.
- There should be sufficient free space in ``/opt/platform-backup``.
Remove any unused files if necessary (see the space check after this
list).
- The new software load has been imported.
@ -41,10 +41,12 @@ software.
stick\); controller-0 must be active.
.. note::
Make sure that the ``/home/sysadmin`` directory has enough space
(at least 2GB of free space), otherwise the upgrade may fail once it
starts. If more space is needed, it is recommended to delete the
``.iso bootimage`` previously imported after the `load-import` command.
Make sure that the ``/home/sysadmin`` directory has enough space (at
least 2GB free); otherwise, the upgrade may fail once it starts. If more
space is needed, it is recommended to delete the ``.iso`` bootimage
previously imported after the :command:`load-import` command (see the
space check after this list).
- Transfer the new release software license file to controller-0 \(or onto a
USB stick\).
@ -55,7 +57,7 @@ software.
.. note::
The upgrade procedure includes steps to resolve system health issues.
End user container images in`registry.local` will be backed up during the
End user container images in ``registry.local`` will be backed up during the
upgrade process. This only includes images other than |prod| system and
application images. These images are limited to 5 GB in total size. If the
system contains more than 5 GB of these images, the upgrade start will fail.
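A quick way to check the free-space prerequisites above is a plain
:command:`df` on the paths named in this list:

.. code-block:: none

$ df -h /opt/platform-backup /home/sysadmin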
@ -71,8 +73,7 @@ For more details, see :ref:`Detailed contents of a system backup
$ source /etc/platform/openrc
~(keystone_admin)]$
#. Install the license file for the release you are upgrading to, for example,
nn.nn.
#. Install the license file for the release you are upgrading to.
.. code-block:: none
@ -86,13 +87,13 @@ For more details, see :ref:`Detailed contents of a system backup
#. Import the new release.
#. Run the :command:`load-import` command on **controller-0** to import
#. Run the :command:`load-import` command on controller-0 to import
the new release.
First, source /etc/platform/openrc.
First, source ``/etc/platform/openrc``.
You must specify an exact path to the \*.iso bootimage file and to the
\*.sig bootimage signature file.
You must specify an exact path to the ``*.iso`` bootimage file and to the
``*.sig`` bootimage signature file.
.. code-block:: none
@ -109,7 +110,7 @@ For more details, see :ref:`Detailed contents of a system backup
| required_patches | |
+--------------------+-----------+
The :command:`load-import` must be done on **controller-0** and accepts
The :command:`load-import` must be done on controller-0 and accepts
relative paths.
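For illustration, a typical invocation looks like the following; the
bootimage file names here are hypothetical:

.. code-block:: none

~(keystone_admin)]$ system load-import /home/sysadmin/bootimage.iso /home/sysadmin/bootimage.sig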
.. note::
@ -130,9 +131,9 @@ For more details, see :ref:`Detailed contents of a system backup
#. Apply any required software updates.
The system must be 'patch current'. All software updates related to your
current |prod| software release must be, uploaded, applied, and installed.
current |prod| software release must be uploaded, applied, and installed.
All software updates to the new |prod| release, only need to be uploaded
All software updates to the new |prod| release only need to be uploaded
and applied. The install of these software updates will occur automatically
during the software upgrade procedure as the hosts are reset to load the
new release of software.
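As a sketch, the current patching state can be reviewed with the
``sw-patch`` CLI (assuming the patching tooling shipped with your
release):

.. code-block:: none

$ sudo sw-patch query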
@ -147,7 +148,7 @@ For more details, see :ref:`Detailed contents of a system backup
Check the current system health status, resolve any alarms and other issues
reported by the :command:`system health-query-upgrade` command, then
recheck the system health status to confirm that all **System Health**
fields are set to **OK**.
fields are set to *OK*.
.. code-block:: none
@ -167,13 +168,13 @@ For more details, see :ref:`Detailed contents of a system backup
All kubernetes applications are in a valid state: [OK]
Active controller is controller-0: [OK]
By default, the upgrade process cannot be run and is not recommended to be
run with Active Alarms present. However, management affecting alarms can be
ignored with the :command:`--force` option with the :command:`system
upgrade-start` command to force the upgrade process to start.
By default, the upgrade process cannot be run with active alarms present.
However, management-affecting alarms can be ignored with the
:command:`--force` option of the :command:`system upgrade-start` command
to force the upgrade process to start.
.. note::
It is strongly recommended that you clear your system of any and all
It is strongly recommended that you clear your system of all
alarms before doing an upgrade. While the :command:`--force` option is
available to run the upgrade, it is a best practice to clear any
alarms.
@ -192,40 +193,41 @@ For more details, see :ref:`Detailed contents of a system backup
| to_release | nn.nn |
+--------------+--------------------------------------+
This will back up the system data and images to /opt/platform-backup.
/opt/platform-backup is preserved when the host is reinstalled. With the
platform backup, the size of /home/sysadmin must be less than 2GB.
This will back up the system data and images to ``/opt/platform-backup``.
``/opt/platform-backup`` is preserved when the host is reinstalled. With the
platform backup, the size of ``/home/sysadmin`` must be less than 2GB.
This process may take several minutes.
When the upgrade state is upgraded to **started** the process is complete.
When the upgrade state changes to *started*, the process is complete.
Any changes made to the system after this point will be lost when the data
is restored.
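You can confirm that the upgrade data was generated by listing the
backup directory:

.. code-block:: none

$ ls -lh /opt/platform-backup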
The following upgrade state applies once this command is executed:
- started:
- ``started``:
- State entered after :command:`system upgrade-start` completes.
- Release nn.nn system data \(for example, postgres databases\) has
- Release <nn>.<nn> system data \(for example, postgres databases\) has
been exported to be used in the upgrade.
- Configuration changes must not be made after this point, until the
upgrade is completed.
As part of the upgrade, the upgrade process checks the health of the system
and validates that the system is ready for an upgrade.
The upgrade process checks the health of the system and validates that the
system is ready for an upgrade.
The upgrade process checks that no alarms are active before starting an
upgrade.
.. note::
Use the command :command:`system upgrade-start --force` to force the
upgrade process to start and to ignore management-affecting alarms.
This should ONLY be done if you feel these alarms will not be an issue
over the upgrades process.
This should **ONLY** be done if you have ascertained that these alarms
will not interfere with the upgrade process.
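For example, to start the upgrade while ignoring management-affecting
alarms:

.. code-block:: none

~(keystone_admin)]$ system upgrade-start --force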
#. Check the upgrade state.
@ -241,13 +243,13 @@ For more details, see :ref:`Detailed contents of a system backup
| to_release | nn.nn |
+--------------+--------------------------------------+
Ensure the upgrade state is **started**. It will take several minutes to
transition to the started state.
Ensure the upgrade state is *started*. It will take several minutes to
transition to the *started* state.
#. \(Optional\) Copy the upgrade data from the system to an alternate safe
location \(such as a USB drive or remote server\).
The upgrade data is located under /opt/platform-backup. Example file names
The upgrade data is located under ``/opt/platform-backup``. Example file names
are:
**lost+found upgrade\_data\_2020-06-23T033950\_61e5fcd7-a38d-40b0-ab83-8be55b87fee2.tgz**
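A minimal sketch of such a copy; the USB mount point ``/media/usb`` is
a placeholder for your environment:

.. code-block:: none

$ cp /opt/platform-backup/upgrade_data_*.tgz /media/usb/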
@ -264,8 +266,8 @@ For more details, see :ref:`Detailed contents of a system backup
#. Upgrade controller-0.
This is the point of no return. All data except /opt/platform-backup/ will
be erased from the system. This will wipe the **rootfs** and reboot the
This is the point of no return. All data except ``/opt/platform-backup/``
will be erased from the system. This will wipe the ``rootfs`` and reboot the
host. The new release must then be manually installed \(via network or
USB\).
@ -279,11 +281,11 @@ For more details, see :ref:`Detailed contents of a system backup
#. Install the new release of |prod-long| Simplex software via network or USB.
#. Verify and configure IP connectivity. External connectivity is required to
run the Ansible upgrade playbook. The |prod-long| boot image will DHCP out all
interfaces so the server may have obtained an IP address and have external IP
connectivity if a DHCP server is present in your environment. Verify this using
the :command:`ip addr` command. Otherwise, manually configure an IP address and default IP
route.
run the Ansible upgrade playbook. The |prod-long| boot image will |DHCP| out
all interfaces so the server may have obtained an IP address and have
external IP connectivity if a |DHCP| server is present in your environment.
Verify this using the :command:`ip addr` command. Otherwise, manually
configure an IP address and default IP route.
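For example, to configure an address and default route by hand (the
interface name and addresses below are placeholders):

.. code-block:: none

$ sudo ip addr add 10.10.10.3/24 dev enp0s3
$ sudo ip link set up dev enp0s3
$ sudo ip route add default via 10.10.10.1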
#. Restore the upgrade data.
@ -298,14 +300,13 @@ For more details, see :ref:`Detailed contents of a system backup
following parameter:
``ansible_become_pass``
The ansible playbook will check /home/sysadmin/<hostname\>.yml for these
user configuration override files for hosts. For example, if running
ansible locally, /home/sysadmin/localhost.yml.
The Ansible playbook will check ``/home/sysadmin/<hostname>.yml`` for
these user configuration override files for hosts. For example, if
running Ansible locally, ``/home/sysadmin/localhost.yml``.
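A minimal override file might look like the following; the value is a
placeholder:

.. code-block:: none

# /home/sysadmin/localhost.yml
ansible_become_pass: <sysadmin-password>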
By default, the playbook will search for the upgrade data file under
/opt/platform-backup. If required, use the **upgrade\_data\_file**
parameter to specify the path to the **upgrade\_data**.
``/opt/platform-backup``. If required, use the ``upgrade_data_file``
parameter to specify the path to the ``upgrade_data``.
.. note::
This playbook does not support replay.
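An illustrative invocation follows; the playbook path assumes the
stx-ansible layout, so verify it for your release:

.. code-block:: none

~(keystone_admin)]$ ansible-playbook /usr/share/ansible/stx-ansible/playbooks/upgrade_platform.yml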
@ -314,7 +315,7 @@ For more details, see :ref:`Detailed contents of a system backup
This can take more than one hour to complete.
Once the data restoration is complete, the upgrade state will be set to
**upgrading-hosts**.
*upgrading-hosts*.
#. Check the status of the upgrade.
@ -342,9 +343,9 @@ For more details, see :ref:`Detailed contents of a system backup
During the running of the :command:`upgrade-activate` command, new
configurations are applied to the controller. 250.001 \(**hostname
Configuration is out-of-date**\) alarms are raised and are cleared as the
configuration is applied. The upgrade state goes from **activating** to
**activation-complete** once this is done.
Configuration is out-of-date**\) alarms are raised and then cleared as the
configuration is applied. The upgrade state goes from *activating* to
*activation-complete* once this is done.
.. code-block:: none
@ -360,23 +361,25 @@ For more details, see :ref:`Detailed contents of a system backup
The following states apply when this command is executed.
**activation-requested**
``activation-requested``
State entered when :command:`system upgrade-activate` is executed.
**activating**
``activating``
State entered when the system has started activating the upgrade by applying
new configurations to the controller and compute hosts.
**activating-hosts**
``activating-hosts``
State entered when applying host-specific configurations. This state is
entered only if needed.
**activation-complete**
``activation-complete``
State entered when new configurations have been applied to all
controller and compute hosts.
#. Check the status of the upgrade again to confirm it has reached
**activation-complete**.
``activation-complete``.
.. code-block:: none