Merge "Upgrades edits (r6,r7,dsR6,dsR7)"
This commit is contained in commit e73f762df6.
@@ -9,11 +9,11 @@ Abort Simplex System Upgrades

You can abort a Simplex System upgrade before or after upgrading controller-0.
The upgrade abort procedure can only be applied before the
:command:`upgrade-complete` command is issued. Once this command is issued, the
upgrade cannot be aborted. If you must return to the previous release, then
restore the system using the backup data taken prior to the upgrade.

Before starting, verify the upgrade data under ``/opt/platform-backup``. This
data must be present to perform the abort process.

.. _aborting-simplex-system-upgrades-section-N10025-N1001B-N10001:
@@ -31,20 +31,20 @@ Before upgrading controller-0

   .. code-block:: none

      ~(keystone_admin)$ system upgrade-abort

   The upgrade state is set to ``aborting``. Once this is executed, it cannot
   be cancelled; the upgrade must be completely aborted.

#. Complete the upgrade.

   .. code-block:: none

      ~(keystone_admin)$ system upgrade-complete

   At this time any upgrade data generated as part of the upgrade-start
   command will be deleted. This includes the upgrade data in
   ``/opt/platform-backup``.

.. _aborting-simplex-system-upgrades-section-N10063-N1001B-N10001:
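Taken together, the two steps above reduce to a short console session. This is an illustrative sketch only; the prompt and the absence of options match the examples in this procedure:

.. code-block:: none

   ~(keystone_admin)$ system upgrade-abort
   ~(keystone_admin)$ system upgrade-complete

Remember that once :command:`upgrade-abort` is issued there is no cancelling, and :command:`upgrade-complete` deletes the upgrade data under ``/opt/platform-backup``.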
@@ -52,7 +52,7 @@ Before upgrading controller-0

After upgrading controller-0
----------------------------

After controller-0 has been upgraded, it is possible to roll back the software
upgrade. This involves performing a system restore with the previous release.

.. _aborting-simplex-system-upgrades-ol-jmw-kcp-xdb:
@@ -61,41 +61,42 @@ upgrade. This involves performing a system restore with the previous release.

   USB.

#. Verify and configure IP connectivity. External connectivity is required to
   run the Ansible restore playbook. The |prod-long| boot image will |DHCP| out
   all interfaces so the server may have obtained an IP address and have
   external IP connectivity if a |DHCP| server is present in your environment.
   Verify this using the :command:`ip addr` command. Otherwise, manually
   configure an IP address and default IP route.

#. Restore the system data. The restore is preserved in ``/opt/platform-backup``.

   The system will be restored to the state when the :command:`upgrade-start`
   command was issued. Follow the process in :ref:`Run Restore Playbook Locally
   on the Controller <running-restore-playbook-locally-on-the-controller>`.

   Specify the upgrade data filename as `backup_filename` and the
   `initial_backup_dir` as ``/opt/platform-backup``.

   The user images will also need to be restored as described in the
   Postrequisites section.

#. Unlock controller-0.

   .. code-block:: none

      ~(keystone_admin)$ system host-unlock controller-0

#. Abort the upgrade with the :command:`upgrade-abort` command.

   .. code-block:: none

      ~(keystone_admin)$ system upgrade-abort

   The upgrade state is set to ``aborting``. Once this is executed, it cannot
   be cancelled; the upgrade must be completely aborted.

#. Complete the upgrade.

   .. code-block:: none

      ~(keystone_admin)$ system upgrade-complete
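For orientation, the command portion of the rollback steps above can be sketched as a single session. This is illustrative only; the restore playbook step is omitted here because it is covered by the referenced restore procedure:

.. code-block:: none

   ~(keystone_admin)$ system host-unlock controller-0
   ~(keystone_admin)$ system upgrade-abort
   ~(keystone_admin)$ system upgrade-complete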
@@ -7,12 +7,12 @@ Configure Firmware Update Orchestration
=======================================

You can configure *Firmware Update Orchestration Strategy* using the
:command:`sw-manager` |CLI|.

.. note::
   Management-affecting alarms cannot be ignored using relaxed alarm rules
   during an orchestrated firmware update operation. For a list of
   management-affecting alarms, see |fault-doc|:
   :ref:`Alarm Messages <100-series-alarm-messages>`. To display
   management-affecting active alarms, use the following command:
@@ -37,9 +37,9 @@ ignored even when the default strict restrictions are selected:

.. _noc1590162360081-ul-ls2-pxs-tlb:

- Hosts that need to be updated must be in the ``unlocked-enabled`` state.

- The firmware update image must be in the ``applied`` state. For more
  information, see :ref:`Managing Software Updates <managing-software-updates>`.

.. rubric:: |proc|
@@ -69,7 +69,7 @@ ignored even when the default strict restrictions are selected:

      state: building
      inprogress: true

#. |Optional| Display the strategy in summary, if required. The firmware update
   strategy :command:`show` command displays the strategy in a summary.

   .. code-block:: none
@@ -87,7 +87,7 @@ ignored even when the default strict restrictions are selected:

      state: ready-to-apply
      build-result: success

   The strategy steps and stages are displayed using the ``--details`` option.

#. Apply the strategy.
@@ -96,7 +96,7 @@ ignored even when the default strict restrictions are selected:

   all the hosts in the strategy is complete.

   - Use the ``-stage-id`` option to specify a specific stage to apply, one
     at a time.

   .. note::
@@ -106,7 +106,7 @@ ignored even when the default strict restrictions are selected:

   .. code-block:: none

      ~(keystone_admin)$ sw-manager fw-update-strategy apply
      Strategy Firmware Update Strategy:
        strategy-uuid: 3e43c018-9c75-4ba8-a276-472c3bcbb268
        controller-apply-type: ignore
@@ -125,7 +125,7 @@ ignored even when the default strict restrictions are selected:

   .. code-block:: none

      ~(keystone_admin)$ sw-manager fw-update-strategy show
      Strategy Firmware Update Strategy:
        strategy-uuid: 3e43c018-9c75-4ba8-a276-472c3bcbb268
        controller-apply-type: ignore
@@ -138,7 +138,7 @@ ignored even when the default strict restrictions are selected:

      state: applying
      inprogress: true

#. |optional| Abort the strategy, if required. This is only used to stop and
   abort the entire strategy.

   The firmware update strategy :command:`abort` command can be used to abort
@@ -157,7 +157,7 @@ ignored even when the default strict restrictions are selected:

   .. code-block:: none

      ~(keystone_admin)$ sw-manager fw-update-strategy delete
      Strategy deleted.
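The full orchestration lifecycle used in this procedure can be sketched end to end. This is an illustrative session using only the subcommands shown above, with all output omitted:

.. code-block:: none

   ~(keystone_admin)$ sw-manager fw-update-strategy create
   ~(keystone_admin)$ sw-manager fw-update-strategy show --details
   ~(keystone_admin)$ sw-manager fw-update-strategy apply
   ~(keystone_admin)$ sw-manager fw-update-strategy show
   ~(keystone_admin)$ sw-manager fw-update-strategy delete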
For more information, see :ref:`Firmware Update Orchestration Using the CLI
@@ -12,7 +12,7 @@ You can configure *Kubernetes Version Upgrade Orchestration Strategy* using the

.. note::
   You require administrator privileges to use :command:`sw-manager`. You must
   log in to the active controller as **user sysadmin** and source the script
   using the command ``source /etc/platform/openrc`` to obtain administrator
   privileges. Do not use :command:`sudo`.
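For orientation, sourcing the script looks like the following sketch; the prompt change to ``~(keystone_admin)$`` is illustrative of the acquired administrator context:

.. code-block:: none

   $ source /etc/platform/openrc
   ~(keystone_admin)$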
.. note::
@@ -75,9 +75,10 @@ For example:

- Hosts that need to be upgraded must be in the ``unlocked-enabled`` state.

- If you are using NetApp Trident, ensure that your NetApp version is
  compatible with Trident 22.01 before upgrading Kubernetes to version
  |kube-ver| and after updating |prod| to version |prod-ver|. For more
  information, see :ref:`Upgrade the NetApp Trident Software
  <upgrade-the-netapp-trident-software-c5ec64d213d3>`.


.. only:: partner
@@ -104,7 +105,7 @@ For example:

#. Confirm that the system is healthy.

   Check the current system health status, resolve any alarms and other issues
   reported by the :command:`system health-query-kube-upgrade` command, then
   recheck the system health status to confirm that all **System Health**
   fields are set to **OK**.
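For example, the health check described above is run as follows (output omitted; rerun it until all **System Health** fields report **OK**):

.. code-block:: none

   ~(keystone_admin)$ system health-query-kube-upgrade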
@@ -272,7 +273,7 @@ For example:

      defaults to strict


#. |optional| Display the strategy in summary, if required. The Kubernetes
   upgrade strategy :command:`show` command displays the strategy in a summary.

   .. code-block:: none
@@ -350,7 +351,7 @@ For example:

   ``downloading-images``, ``downloaded-images``, ``upgrading-first-master``,
   ``upgraded-first-master``, etc.

#. |optional| Abort the strategy, if required. This is only used to stop and
   abort the entire strategy.

   The Kubernetes version upgrade strategy :command:`abort` command can be
@@ -156,14 +156,14 @@ status before creating an update strategy.

   - Worker hosts with no hosted application pods are updated before
     worker hosts with hosted application pods.

   - The final step in each stage is ``system-stabilize``, which waits
     for a period of time \(up to several minutes\) and ensures that the
     system is free of alarms. This ensures that the update orchestrator
     does not continue to update more hosts if the update application has
     caused an issue resulting in an alarm.


#. Click the **Apply Strategy** button to apply the update strategy. You can
   optionally apply a single stage at a time by clicking the **Apply Stage**
   button.
@@ -181,7 +181,7 @@ status before creating an update strategy.

   attempt to unlock any hosts that were locked.

   .. note::
      If an update strategy is aborted after hosts were locked, but before
      they were updated, the hosts will not be unlocked, as this would result
      in the updates being installed. You must either install the updates on
      the hosts or remove the updates before unlocking the hosts.
@@ -15,13 +15,13 @@ Do the following to manage the instance re-location manually:

.. _rbp1590431075472-ul-mgr-kvs-tlb:

- Manually firmware-update at least one openstack-compute worker host. This
  assumes that at least one openstack-compute worker host does not have any
  instances, or has instances that can be migrated. For more information on
  manually updating a host, see :ref:`Display Worker Host Information
  <displaying-worker-host-information>`.

- If the migration is prevented by limitations in the |VNF| or virtual
  application, perform the following:
@@ -35,7 +35,7 @@ Do the following to manage the instance re-location manually:

   - Terminate the old instances.

- If the migration is prevented by the size of the instances' local disks:

  - For each openstack-compute worker host that has instances that cannot
    be migrated, manually move the instances using the CLI. For more
@@ -7,29 +7,30 @@ Firmware Update Orchestration Using the CLI
===========================================

You can configure the *Firmware Update Orchestration Strategy* using the
:command:`sw-manager` |CLI| commands.

---------------
About this task
---------------

.. note::
   You require administrator privileges to use :command:`sw-manager` commands.
   You must log in to the active controller as **user sysadmin** and source the
   script using the command ``source /etc/platform/openrc`` to obtain
   administrator privileges. Do not use :command:`sudo`.

.. note::
   Management-affecting alarms cannot be ignored at the indicated severity
   level or higher by using relaxed alarm rules during an orchestrated
   firmware update operation. For a list of management-affecting alarms, see
   |fault-doc|: :ref:`Alarm Messages
   <100-series-alarm-messages>`. To display management-affecting active
   alarms, use the following command:

   .. code-block:: none

      ~(keystone_admin)$ fm alarm-list --mgmt_affecting

During an orchestrated firmware update operation, the following alarms are
ignored even when strict restrictions are selected:
@@ -72,112 +73,108 @@ be created with override worker apply type concurrency with a max host

parallelism, instance action, and alarm restrictions.

``--controller-apply-type`` and ``--storage-apply-type``
   These options cannot be changed from ``ignore`` because firmware update is
   only supported for worker hosts.

   .. note::
      Firmware update is currently only supported for hosts with worker
      function. Any attempt to modify the controller or storage apply type is
      rejected.

``--worker-apply-type``
   This option specifies the host concurrency of the firmware update strategy:

   - ``serial`` \(default\): worker hosts will be patched one at a time

   - ``parallel``: worker hosts will be updated in parallel

     - At most, ``parallel`` will be updated at the same time

     - At most, half of the hosts in a host aggregate will be updated at the
       same time

   - ``ignore``: worker hosts will not be updated; strategy create will fail

   Worker hosts with no instances are updated before worker hosts with
   instances.

``--max-parallel-worker-hosts``
   This option applies to the parallel worker apply type selection to specify
   the maximum worker hosts to update in parallel \(minimum: 2, maximum: 10\).

``--instance-action``
   This option only has significance when the |prefix|-openstack application is
   loaded and there are instances running on worker hosts. It specifies how
   the strategy deals with worker host instances over the strategy execution.

   - ``stop-start`` (default)

     Instances will be stopped before the host lock operation following the
     update and then started again following the host unlock.

     .. warning::
        Using the ``stop-start`` option will result in an outage for each
        instance, as it is stopped while the worker host is locked/unlocked. In
        order to ensure this does not impact service, instances MUST be grouped
        into anti-affinity \(or anti-affinity best effort\) server groups,
        which will ensure that only a single instance in each server group is
        stopped at a time.

   - ``migrate``

     Instances will be migrated off a host before it is patched \(this applies
     to reboot patching only\).

``--alarm-restrictions``
   This option sets how the firmware update orchestration behaves when
   alarms are present.

   To display management-affecting active alarms, use the following command:

   .. code-block:: none

      ~(keystone_admin)$ fm alarm-list --mgmt_affecting

   - ``strict`` (default)

     The default strict option will result in patch orchestration failing if
     there are any alarms present in the system \(except for a small list of
     alarms\).

   - ``relaxed``

     This option allows orchestration to proceed if alarms are present, as long
     as none of these alarms are management affecting.

.. code-block:: none

   ~(keystone_admin)$ sw-manager fw-update-strategy create --help
   usage: sw-manager fw-update-strategy create [-h]
                                               [--controller-apply-type {ignore}]
                                               [--storage-apply-type {ignore}]
                                               [--worker-apply-type
                                               {serial,parallel,ignore}]
                                               [--max-parallel-worker-hosts
                                               {2,3,4,5,6,7,8,9,10}]
                                               [--instance-action {migrate,stop-start}]
                                               [--alarm-restrictions {strict,relaxed}]

   optional arguments:
     -h, --help            show this help message and exit
     --controller-apply-type {ignore}
                           defaults to ignore
     --storage-apply-type {ignore}
                           defaults to ignore
     --worker-apply-type {serial,parallel,ignore}
                           defaults to serial
     --max-parallel-worker-hosts {2,3,4,5,6,7,8,9,10}
                           maximum worker hosts to update in parallel
     --instance-action {migrate,stop-start}
                           defaults to stop-start
     --alarm-restrictions {strict,relaxed}
                           defaults to strict
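As an illustrative example only, the options described above can be combined in a single :command:`create` call; the particular values here are assumptions chosen from the allowed ranges shown in the help output, not recommendations:

.. code-block:: none

   ~(keystone_admin)$ sw-manager fw-update-strategy create \
       --worker-apply-type parallel \
       --max-parallel-worker-hosts 2 \
       --instance-action migrate \
       --alarm-restrictions relaxed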
.. _tsr1590164474201-section-l3x-wr5-tlb:
-------------------------------------------
@@ -203,11 +200,10 @@ of the strategy. A complete view of the strategy can be shown using the

Firmware update orchestration strategy apply
--------------------------------------------

The ``apply`` strategy subcommand with no options executes the firmware update
strategy from current state to the end. The apply strategy operation can be
called with the ``stage-id`` option to execute the next stage of the strategy.
The ``stage-id`` option cannot be used to execute the strategy out of order.

.. code-block:: none
@@ -226,9 +222,9 @@ Firmware update orchestration strategy abort

The ``abort`` strategy subcommand with no options sets the strategy to abort
after the current applying stage is complete. The abort strategy operation can
be called with the ``stage-id`` option to specify that the strategy abort before
executing the next stage of the strategy. The ``stage-id`` option cannot be used
to execute the strategy out of order.

.. code-block:: none
@@ -6,9 +6,8 @@

Handle Firmware Update Orchestration Failures
=============================================

The creation or application of a strategy could fail for any of the reasons
listed below. Follow the suggested actions in each case to resolve the issue.

-------------------------
Strategy creation failure
@@ -20,31 +19,31 @@ Strategy creation failure

  - **Action**:

    - Verify that the ``--worker-apply-type`` was not set to ``ignore``.

    - Check recent logs added to ``/var/log/nfv-vim.log``.

- **Reason**: Alarms from platform are present.

  - **Action**:

    - Query for management affecting alarms and take actions to clear them.

      .. code-block:: none

         ~(keystone_admin)$ fm alarm-list --mgmt_affecting

    - If there are no management affecting alarms present, take actions to
      clear other reported alarms or try creating the strategy with the
      ``relaxed`` alarms restrictions option ``--alarm-restrictions
      relaxed``.

- **Reason**: No firmware update required.

  - **Action**:

    - Verify that the firmware device image has been applied for the
      worker hosts that require updating.

.. note::
   If the strategy create failed. After resolving the strategy
@@ -57,52 +56,53 @@ Strategy apply failure

.. _jkf1590184623714-ul-rdf-4pq-5lb:

- **Reason**: Alarms from platform are present.

  - **Action**: This suggests that an alarm has been raised since the
    creation of the strategy. Address the cause of the new alarm, delete the
    strategy and try creating and applying a new strategy.

- **Reason**: Unable to migrate instances.

  - **Action**: See :ref:`Firmware Update Operations Requiring Manual
    Migration <firmware-update-operations-requiring-manual-migration>` for
    steps to resolve migration issues.

- **Reason**: Firmware update failed. Suggests that the firmware update for
  the specified host has failed.

  - **Action**: For more information, see |prod| Node Management:
    :ref:`Display Worker Host Information <displaying-worker-host-information>`.

- **Reason**: Lock host failed.

  - **Action**:

    - Investigate the ``/var/log/sysinv.log`` and
      ``/var/log/nfv-vim.log`` files.

    - Address the underlying issue.

    - Manually lock and unlock the host.

    - Try recreating and re-applying the firmware update strategy to
      automatically finish the update process.

- **Reason**: Unlock host failed.

  - **Action**:

    - Investigate the ``/var/log/mtcAgent.log`` file for cause logs.

    - Address the underlying issue.

    - Manually lock and unlock the host to recover.

    - Try recreating and re-applying the firmware update strategy to
      automatically finish the update process.

.. note::
   If the strategy :command:`apply` fails, you must resolve the
   strategy :command:`apply` failure and delete the failed strategy before
   trying to create and apply another strategy.
@@ -18,7 +18,7 @@ Strategy creation failure

.. _jkf1590184623714-ul-fvs-vnq-5lb:

- **Reason**: Build failed with no reason.

  - **Action**:
@@ -27,7 +27,7 @@ Strategy creation failure

    - Check recent logs added to ``/var/log/nfv-vim.log``.


- **Reason**: Alarms from platform are present.

  - **Action**:
@ -43,7 +43,7 @@ Strategy creation failure
|
||||
the ``relaxed`` alarms restrictions option ``--alarm-restrictions
|
||||
relaxed``.
|
||||
|
||||
- **Reason**: no Kubernetes version upgrade required.
|
||||
- **Reason**: No Kubernetes version upgrade required.
|
||||
|
||||
- **Action**:
|
||||
|
||||
@ -65,14 +65,14 @@ Strategy Apply Failure
|
||||
|
||||
.. _jkf1590184623714-ul-rdf-4pq-5lb:
|
||||
|
||||
- **Reason**: alarms from platform are present.
|
||||
- **Reason**: Alarms from platform are present.
|
||||
|
||||
- **Action**: suggests that an alarm has been raised since the creation
|
||||
of the strategy. Address the cause of the new alarm, delete the
|
||||
- **Action**: This suggests that an alarm has been raised since the
|
||||
creation of the strategy. Address the cause of the new alarm, delete the
|
||||
strategy and try creating and applying a new strategy.
|
||||
|
||||
|
||||
- **Reason**: unable to migrate instances.
|
||||
- **Reason**: Unable to migrate instances.
|
||||
|
||||
- **Action**: See :ref:`Kubernetes Version Upgrade Operations Requiring
|
||||
Manual Migration
|
||||
@ -91,11 +91,11 @@ Strategy Apply Failure
|
||||
|
||||
.. include:: /_includes/handling-kubernetes-update-orchestration-failures.rest
|
||||
|
||||
- **Reason**: lock host failed.
|
||||
- **Reason**: Lock host failed.
|
||||
|
||||
- **Action**:
|
||||
|
||||
- Investigate the /var/log/sysinv.log, and /var/log/nfv-vim.log
|
||||
- Investigate the ``/var/log/sysinv.log``, and ``/var/log/nfv-vim.log``
|
||||
files.
|
||||
|
||||
- Address the underlying issue.
|
||||
@ -106,11 +106,11 @@ Strategy Apply Failure
|
||||
strategy to automatically finish the upgrade process.
|
||||
|
||||
|
||||
- **Reason**: unlock host failed.
|
||||
- **Reason**: Unlock host failed.
|
||||
|
||||
- **Action**:
|
||||
|
||||
- Investigate /var/log/mtcAgent.log file for cause logs files.
|
||||
- Investigate ``/var/log/mtcAgent.log`` file for cause logs files.
|
||||
|
||||
- Address the underlying issue.
|
||||
|
||||
|
@ -11,10 +11,10 @@ interface. The system type is also shown.

.. rubric:: |proc|

#. In the |prod| Horizon, open the System Configuration page.
#. In the |prod| Horizon, open the **System Configuration** page.

   The System Configuration page is available from **Admin** \> **Platform**
   \> **System Configuration** in the left-hand pane.
   The **System Configuration** page is available from **Admin** \>
   **Platform** \> **System Configuration** in the left-hand pane.

#. Select the **Systems** tab to view the software version.

@ -24,10 +24,10 @@ interface. The system type is also shown.

   shown in the **System Type** field. The mode \(**simplex**, **duplex**, or
   **standard**\) is shown in the **System Mode** field.

#. In the |prod| Horizon interface, open the Software Management page.
#. In the |prod| Horizon interface, open the **Software Management** page.

   The Software Management page is available from **Admin** \> **Platform** \>
   **Software Management** in the left-hand pane.
   The **Software Management** page is available from **Admin** \> **Platform**
   \> **Software Management** in the left-hand pane.

#. Select the **Patches** tab to view update information.
@ -47,7 +47,7 @@ For more about working with software updates, see :ref:`Manage Software Updates

   +----------------------+----------------------------------------------------+

.. note::
   The **system\_mode** field is shown only for a |prod| Simplex or Duplex
   The **system_mode** field is shown only for a |prod| Simplex or Duplex
   system.

- To list applied software updates from the CLI, use the :command:`sw-patch

@ -9,16 +9,15 @@ In-Service Versus Reboot-Required Software Updates

In-Service \(Reboot-not-Required\) and Reboot-Required software updates are
available depending on the nature of the update to be performed.

In-Service software updates provides a mechanism to issue updates that do not
In-Service software updates provide a mechanism to issue updates that do not
require a reboot, allowing the update to be installed on in-service nodes and
restarting affected processes as needed.

Depending on the area of software being updated and the type of software
change, installation of the update may or may not require the |prod| hosts to
be rebooted. For example, a software update to the kernel would require the
host to be rebooted in order to apply the update. Software updates are
classified as reboot-required or reboot-not-required \(also referred to as
Depending on the area of software being updated and the type of software change,
installation of the update may or may not require the |prod| hosts to be
rebooted. For example, a software update to the kernel would require the host to
be rebooted in order to apply the update. Software updates are classified as
reboot-required or reboot-not-required \(also referred to as
in-service\) type updates to indicate this. For reboot-required updates, the
hosted application pods are automatically relocated to an alternate host as
part of the update procedure, prior to applying the update and rebooting the
host.
hosted application pods are automatically relocated to an alternate host as part
of the update procedure, prior to applying the update and rebooting the host.

@ -18,13 +18,13 @@ unlocked as part of applying the update.

#. In |prod| Horizon, open the Software Management page.

   The Software Management page is available from **Admin** \> **Platform** \>
   **Software Management** in the left-hand pane.
   The **Software Management** page is available from **Admin** \> **Platform**
   \> **Software Management** in the left-hand pane.

#. Select the Patches tab to see the current update status.
#. Select the **Patches** tab to see the current update status.

   The Patches page shows the current status of all updates uploaded to the
   system. If there are no updates, an empty Patch Table is displayed.
   The **Patches** tab shows the current status of all updates uploaded to the
   system. If there are no updates, an empty **Patch Table** is displayed.

#. Upload the update \(patch\) file to the update storage area.

@ -34,7 +34,7 @@ unlocked as part of applying the update.

   The update file is transferred to the Active Controller and is copied to
   the update storage area, but it has yet to be applied to the cluster. This
   is reflected in the Patches page.
   is reflected in the **Patches** tab.

#. Apply the update.

@ -43,29 +43,29 @@ unlocked as part of applying the update.

   click the **Apply Patches** button at the top. You can use this selection
   process to apply all updates, or a selected subset, in a single operation.

   The Patches page is updated to report the update to be in the
   The **Patches** tab is updated to report the update to be in the
   *Partial-Apply* state.

#. Install the update on **controller-0**.
#. Install the update on controller-0.

   #. Select the **Hosts** tab.

      The **Hosts** tab on the Host Inventory page reflects the new status of
      the hosts with respect to the new update state. In this example, the
      The **Hosts** tab on the **Host Inventory** page reflects the new status
      of the hosts with respect to the new update state. In this example, the
      update only applies to controller software, as can be seen by the
      worker host's status field being empty, indicating that it is 'patch
      current'.

      .. image:: figures/ekn1453233538504.png

   #. Next, select the Install Patches option from the **Edit Host** button
      associated with **controller-0** to install the update.
   #. Select the **Install Patches** option from the **Edit Host** button
      associated with controller-0 to install the update.

      A confirmation window is presented giving you a last opportunity to
      cancel the operation before proceeding.

#. Repeat the steps 6 a,b, above with **controller-1** to install the update
   on **controller-1**.
#. Repeat steps 6 a and b above with controller-1 to install the update
   on controller-1.

#. Repeat the steps 6 a,b above for the worker and/or storage hosts \(if
   present\).

@ -74,7 +74,7 @@ unlocked as part of applying the update.

#. Verify the state of the update.

   Visit the Patches page again. The update is now in the *Applied* state.
   Visit the **Patches** tab again. The update is now in the *Applied* state.

.. rubric:: |result|
@ -39,7 +39,7 @@ unlocked as part of applying the update.

      controller-0   192.168.204.3   Yes   No   nn.nn   idle
      controller-1   192.168.204.4   Yes   No   nn.nn   idle

#. Ensure the original update files have been deleted from the root drive.
#. Ensure that the original update files have been deleted from the root drive.

   After they are uploaded to the storage area, the original files are no
   longer required. You must use the command-line interface to delete them, in

@ -32,15 +32,15 @@ update. The main steps of the procedure are:

#. Log in to the Horizon Web interface as the **admin** user.

#. In Horizon, open the Software Management page.
#. In Horizon, open the **Software Management** page.

   The Software Management page is available from **Admin** \> **Platform** \>
   **Software Management** in the left-hand pane.
   The **Software Management** page is available from **Admin** \> **Platform**
   \> **Software Management** in the left-hand pane.

#. Select the Patches tab to see the current status.
#. Select the **Patches** tab to see the current status.

   The Patches page shows the current status of all updates uploaded to the
   system. If there are no updates, an empty Patch Table is displayed.
   The **Patches** tab shows the current status of all updates uploaded to the
   system. If there are no updates, an empty **Patch Table** is displayed.

#. Upload the update \(patch\) file to the update storage area.

@ -50,7 +50,7 @@ update. The main steps of the procedure are:

   The update file is transferred to the Active Controller and is copied to
   the storage area, but it has yet to be applied to the cluster. This is
   reflected in the Patches page.
   reflected on the **Patches** tab.

#. Apply the update.
@ -62,14 +62,14 @@ update. The main steps of the procedure are:

   The Patches page is updated to report the update to be in the
   *Partial-Apply* state.

#. Install the update on **controller-0**.
#. Install the update on controller-0.

   .. _installing-reboot-required-software-updates-using-horizon-step-N10107-N10028-N1001C-N10001:

   #. Select the **Hosts** tab.

      The **Hosts** tab on the Host Inventory page reflects the new status of
      the hosts with respect to the new update state. As shown below, both
      The **Hosts** tab on the **Host Inventory** page reflects the new status
      of the hosts with respect to the new update state. As shown below, both
      controllers are now reported as not 'patch current' and requiring
      reboot.

@ -83,10 +83,10 @@ update. The main steps of the procedure are:

      Access to Horizon may be lost briefly during the active controller
      transition. You may have to log in again.

   #. Select the Lock Host option from the **Edit Host** button associated
   #. Select the **Lock Host** option from the **Edit Host** button associated
      with **controller-0**.

   #. Select the Install Patches option from the **Edit Host** button
   #. Select the **Install Patches** option from the **Edit Host** button
      associated with **controller-0** to install the update.

      A confirmation window is presented giving you a last opportunity to

@ -94,12 +94,12 @@ update. The main steps of the procedure are:

      Wait for the update install to complete.

   #. Select the Unlock Host option from the **Edit Host** button associated
      with controller-0.
   #. Select the **Unlock Host** option from the **Edit Host** button
      associated with controller-0.

#. Repeat steps :ref:`6
   <installing-reboot-required-software-updates-using-horizon-step-N10107-N10028-N1001C-N10001>`
   a to e, with **controller-1** to install the update on **controller-1**.
   a to e, with **controller-1** to install the update on controller-1.

   .. note::
      For |prod| Simplex systems, this step does not apply.

@ -113,14 +113,14 @@ update. The main steps of the procedure are:

#. Verify the state of the update.

   Visit the Patches page. The update is now in the Applied state.
   Visit the **Patches** page. The update is now in the *Applied* state.

.. rubric:: |result|

The update is applied now, and all affected hosts have been updated.
The update is now applied, and all affected hosts have been updated.

Updates can be removed using the **Remove Patches** button from the Patches
page. The workflow is similar to the one presented in this section, with the
Updates can be removed using the **Remove Patches** button from the **Patches**
tab. The workflow is similar to the one presented in this section, with the
exception that updates are being removed from each host instead of being
applied.
@ -14,7 +14,7 @@ You can install reboot-required software updates using the CLI.

.. _installing-reboot-required-software-updates-using-the-cli-steps-v1q-vlv-vw:

#. Log in as user **sysadmin** to the active controller and source the script
   /etc/platform/openrc to obtain administrative privileges.
   ``/etc/platform/openrc`` to obtain administrative privileges.

#. Verify that the updates are available using the :command:`sw-patch query`
   command.

@ -49,10 +49,10 @@ You can install reboot-required software updates using the CLI.

   .. parsed-literal::

      ~(keystone_admin)]$ sudo sw-patch apply |pn|-nn.nn_PATCH_0001
      |pn|-nn.nn_PATCH_0001 is now in the repo
      ~(keystone_admin)]$ sudo sw-patch apply |pn|-<nn>.<nn>_PATCH_0001
      |pn|-<nn>.<nn>_PATCH_0001 is now in the repo

   where nn.nn in the update filename is the |prod-long| release number.
   where <nn>.<nn> in the update filename is the |prod-long| release number.

   The update is now in the Partial-Apply state, ready for installation from
   the software updates repository on the impacted hosts.

@ -101,7 +101,7 @@ You can install reboot-required software updates using the CLI.

      host.

      The **Patch Current** field of the :command:`query-hosts` command will
      briefly report “Pending” after you apply or remove an update, until
      briefly report *Pending* after you apply or remove an update, until
      that host has checked against the repository to see if it is impacted
      by the patching operation.
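The **Patch Current** column described above can also be checked mechanically. A minimal sketch, assuming the tabular layout shown in this guide's :command:`sw-patch query-hosts` samples; the sample text and host names below are illustrative, and on a live system you would pipe the real command output instead:

```shell
# Sketch: print hosts whose "Patch Current" column is not "Yes".
# The sample text below is an assumed stand-in for real
# `sudo sw-patch query-hosts` output.
sample_output='
  Hostname      IP Address      Patch Current  Reboot Required  Release  State
  ============  ==============  =============  ===============  =======  =====
  controller-0  192.168.204.3   Yes            No               nn.nn    idle
  controller-1  192.168.204.4   No             No               nn.nn    idle
  worker-0      192.168.204.12  Pending        No               nn.nn    idle
'
# Skip the header and separator rows, keep rows where column 3 != Yes.
not_current=$(printf '%s\n' "$sample_output" \
    | awk 'NR > 3 && NF >= 3 && $3 != "Yes" {print $1}')
echo "$not_current"
```

A host reporting *Pending* is included here too, which matches the transient state described above.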
@ -124,7 +124,8 @@ You can install reboot-required software updates using the CLI.

   **install-failed**
      The operation failed, either due to an update error or something
      killed the process. Check the patching.log on the node in question.
      killed the process. Check the ``patching.log`` on the node in
      question.

   **install-rejected**
      The node is unlocked, therefore the request to install has been

@ -168,8 +169,8 @@ You can install reboot-required software updates using the CLI.

      ~(keystone_admin)]$ sudo sw-patch host-install <controller-0>

   .. note::
      You can use the :command:`sudo sw-patch host-install-async`
      <hostname> command if you are launching multiple installs in
      You can use the :command:`sudo sw-patch host-install-async <hostname>`
      command if you are launching multiple installs in
      parallel.
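As a dry-run illustration of queuing installs on several locked hosts without waiting on each one, here is a sketch; the host list is hypothetical and the `run` wrapper only echoes, so nothing is executed against a real system:

```shell
# Dry-run sketch: launch host-install-async on each locked host back to
# back. `run` is an echo stand-in so the sketch is safe to run anywhere;
# drop it on a live system to execute the real commands.
run() { echo "$@"; }
locked_hosts="worker-0 worker-1"   # hypothetical host list
for h in $locked_hosts; do
    run sudo sw-patch host-install-async "$h"
done
```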
#. Unlock the host.

@ -181,7 +182,7 @@ You can install reboot-required software updates using the CLI.

      Unlocking the host forces a reset of the host followed by a reboot.
      This ensures that the host is restarted in a known state.

   All updates are now installed on **controller-0**. Querying the current
   All updates are now installed on controller-0. Querying the current
   update status displays the following information:

   .. code-block:: none

@ -199,14 +200,14 @@ You can install reboot-required software updates using the CLI.

      storage-0     192.168.204.37   Yes   No   nn.nn   idle
      storage-1     192.168.204.90   Yes   No   nn.nn   idle

#. Install all pending updates on **controller-1**.
#. Install all pending updates on controller-1.

   .. note::
      For |prod| Simplex systems, this step does not apply.

   Repeat the previous step targeting **controller-1**.
   Repeat the previous step targeting controller-1.

   All updates are now installed on **controller-1** as well. Querying the
   All updates are now installed on controller-1 as well. Querying the
   current update status displays the following information:

   .. code-block:: none

@ -227,12 +228,12 @@ You can install reboot-required software updates using the CLI.

#. Install any pending updates for the worker or storage hosts.

   .. note::
      For |prod| Simplex or Duplex systems, this step does not apply.
      This step does not apply for |prod| Simplex or Duplex systems.

   All hosted application pods currently running on a worker host are
   re-located to another host.

   If the **Patch Current** status for a worker or storage host is **No**,
   If the **Patch Current** status for a worker or storage host is *No*,
   apply the pending updates using the following commands:

   .. code-block:: none

@ -247,32 +248,36 @@ You can install reboot-required software updates using the CLI.

      ~(keystone_admin)]$ system host-unlock <hostname>

   where <hostname> is the name of the host \(for example, **worker-0**\).
   where <hostname> is the name of the host \(for example, ``worker-0``\).

   .. note::
      Update installations can be triggered in parallel.

      The :command:`sw-patch host-install-async` command \(**install
      patches** on the Horizon Web interface\) can be run on all locked
      nodes, without waiting for one node to complete the install before
      triggering the install on the next. If you can lock the nodes at the
      same time, without impacting hosted application services, you can
      The :command:`sw-patch host-install-async` command \(corresponding to
      **Install Patches** on the Horizon Web interface\) can be run on all
      locked nodes, without waiting for one node to complete the install
      before triggering the install on the next. If you can lock the nodes at
      the same time, without impacting hosted application services, you can
      update them at the same time.

      Likewise, you can install an update to the standby controller and a
      worker node at the same time. The only restrictions are those of the
      lock: You cannot lock both controllers, and you cannot lock a worker
      node if you do not have enough free resources to relocate the hosted
      applications from it. Also, in a Ceph configuration \(with storage
      nodes\), you cannot lock more than one of
      controller-0/controller-1/storage-0 at the same time, as these nodes
      are running Ceph monitors and you must have at least two in service at
      all times.
      lock:

      * You cannot lock both controllers.

      * You cannot lock a worker node if you do not have enough free resources
        to relocate the hosted applications from it.

      Also, in a Ceph configuration \(with storage nodes\), you cannot lock
      more than one of controller-0/controller-1/storage-0 at the same time,
      as these nodes are running Ceph monitors and you must have at least two
      in service at all times.
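The Ceph monitor restriction above lends itself to a simple pre-check. A sketch, with a hypothetical lock plan, that counts how many of the monitor hosts a proposed lock set would touch:

```shell
# Sketch: refuse a lock plan that touches more than one Ceph monitor
# host (controller-0, controller-1, storage-0). The proposed lock list
# is hypothetical.
proposed_locks="controller-1 worker-0 worker-1"
monitors="controller-0 controller-1 storage-0"
monitor_locks=0
for h in $proposed_locks; do
    case " $monitors " in
        *" $h "*) monitor_locks=$((monitor_locks + 1)) ;;
    esac
done
if [ "$monitor_locks" -gt 1 ]; then
    echo "refusing: plan locks $monitor_locks Ceph monitor hosts"
else
    echo "ok: plan locks $monitor_locks Ceph monitor host(s)"
fi
```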
#. Confirm that all updates are installed and the |prod| is up-to-date.

   Use the :command:`sw-patch query` command to verify that all updates are
   **Applied**.
   *Applied*.

   .. parsed-literal::

@ -280,13 +285,13 @@ You can install reboot-required software updates using the CLI.

      Patch ID                   Patch State
      =========================  ===========
      |pn|-nn.nn_PATCH_0001      Applied
      |pn|-<nn>.<nn>_PATCH_0001  Applied

   where *nn.nn* in the update filename is the |prod| release number.
   where <nn>.<nn> in the update filename is the |prod| release number.

   If the **Patch State** for any update is still shown as **Available** or
   **Partial-Apply**, use the **sw-patch query-hosts** command to identify
   which hosts are not **Patch Current**, and then apply updates to them as
   If the **Patch State** for any update is still shown as *Available* or
   *Partial-Apply*, use the :command:`sw-patch query-hosts` command to identify
   which hosts are not *Patch Current*, and then apply updates to them as
   described in the preceding steps.
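The final check can be scripted against the two-column layout shown above. A sketch; the sample text is an assumed stand-in (with `PRODUCT` as a placeholder for the |pn| substitution) rather than real :command:`sw-patch query` output:

```shell
# Sketch: list patches whose state is not Applied, from text in the
# two-column format shown above. On a live system, pipe
# `sudo sw-patch query` instead of the sample.
sample_query='
Patch ID                  Patch State
========================  ===========
PRODUCT-nn.nn_PATCH_0001  Applied
PRODUCT-nn.nn_PATCH_0002  Partial-Apply
'
pending=$(printf '%s\n' "$sample_query" \
    | awk 'NR > 3 && NF == 2 && $2 != "Applied" {print $1}')
echo "$pending"
```

Any patch printed here points at hosts that still need the lock, install, and unlock steps.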
@ -12,18 +12,18 @@ This section describes installing software updates before you can commission

.. rubric:: |context|

This procedure assumes that the software updates to install are available on a
USB flash drive, or from a server reachable by **controller-0**.
USB flash drive, or from a server reachable by controller-0.

.. rubric:: |prereq|

When initially installing the |prod-long| software, it is required that you
install the latest available updates on **controller-0** before running Ansible
install the latest available updates on controller-0 before running the Ansible
Bootstrap Playbook, and before installing the software on other hosts. This
ensures that:

.. _installing-software-updates-before-initial-commissioning-ul-gsq-1ht-vp:

- The software on **controller-0**, and all other hosts, is up to date when
- The software on controller-0, and all other hosts, is up to date when
  the cluster comes alive.

- You reduce installation time by avoiding updating the system right after an

@ -31,12 +31,12 @@ ensures that:

.. rubric:: |proc|

#. Install software on **controller-0**.
#. Install software on controller-0.

   Use the |prod-long| bootable ISO image to initialize **controller-0**.
   Use the |prod-long| bootable ISO image to initialize controller-0.

   This step takes you to the point where you use the console port to log in
   to **controller-0** as user **sysadmin**.
   to controller-0 as user **sysadmin**.

#. Populate the storage area.

@ -68,9 +68,9 @@ ensures that:

      Patch installation is complete.
      Please reboot before continuing with configuration.

   This command installs all applied updates on **controller-0**.
   This command installs all applied updates on controller-0.

#. Reboot **controller-0**.
#. Reboot controller-0.

   You must reboot the controller to ensure that it is running with the
   software fully updated.
@ -7,9 +7,9 @@ Manage Software Updates

=======================

Updates \(also known as patches\) to the system software become available as
needed to address issues associated with a current |prod-long| software
release. Software updates must be uploaded to the active controller and applied
to all required hosts in the cluster.
needed to address issues associated with a current |prod-long| software release.
Software updates must be uploaded to the active controller and applied to all
required hosts in the cluster.

.. note::
   Updating |prod-dc| is distinct from updating other |prod| configurations.

@ -21,8 +21,8 @@ to all required hosts in the cluster.

The following elements form part of the software update environment:

**Reboot-Required Software Updates**
   Reboot-required updates are typically major updates that require hosts to
   be locked during the update process and rebooted to complete the process.
   Reboot-required updates are typically major updates that require hosts to be
   locked during the update process and rebooted to complete the process.

   .. note::
      When a |prod| host is locked and rebooted for updates, the hosted

@ -30,26 +30,26 @@ The following elements form part of the software update environment:

      minimize the impact to the hosted application service.

**In-Service Software Updates**
   In-service \(reboot-not-required\), software updates are updates that do
   not require the locking and rebooting of hosts. The required |prod|
   software is updated and any required |prod| processes are re-started.
   Hosted applications pods and services are completely unaffected.
   In-service \(reboot-not-required\) software updates are updates that do not
   require the locking and rebooting of hosts. The required |prod| software is
   updated and any required |prod| processes are re-started. Hosted
   application pods and services are completely unaffected.

**Software Update Commands**
   The :command:`sw-patch` command is available on both active controllers. It
   must be run as root using :command:`sudo`. It provides the user interface
   to process the updates, including querying the state of an update, listing
   must be run as root using :command:`sudo`. It provides the user interface to
   process the updates, including querying the state of an update, listing
   affected hosts, and applying, installing, and removing updates.

**Software Update Storage Area**
   A central storage area maintained by the update controller. Software
   updates are initially uploaded to the storage area and remains there until
   they are deleted.
   A central storage area maintained by the update controller. Software updates
   are initially uploaded to the storage area and remain there until they are
   deleted.

**Software Update Repository**
   A central repository of software updates associated with any updates
   applied to the system. This repository is used by all hosts in the cluster
   to identify the software updates and rollbacks required on each host.
   A central repository of software updates associated with any updates applied
   to the system. This repository is used by all hosts in the cluster to
   identify the software updates and rollbacks required on each host.

**Software Update Logs**
   The following logs are used to record software update activity:

@ -102,7 +102,7 @@ upload the software update directly from your workstation using a file browser

window provided by the software update upload facility.

A special case occurs during the initial provisioning of a cluster when you
want to update **controller-0** before the system software is configured. This
want to update controller-0 before the system software is configured. This
can only be done from the command line interface. See :ref:`Install Software
Updates Before Initial Commissioning
<installing-software-updates-before-initial-commissioning>` for details.
@ -6,8 +6,8 @@

Manual Kubernetes Version Upgrade
=================================

You can upgrade the Kubernetes version on a running system from one
supported version to another.
You can upgrade the Kubernetes version on a running system from one supported
version to another.

.. rubric:: |context|

@ -102,26 +102,26 @@ and upgrade various systems.

   **State**
      Can be one of:

      **active**
      *active*
         The version is running everywhere.

      **partial**
      *partial*
         The version is running somewhere.

      **available**
      *available*
         The version can be upgraded to.

      **unavailable**
         The version is not available for upgrading. Either it is a
         downgrade or it requires an intermediate upgrade first. Kubernetes
         can be only upgraded one version at a time.
      *unavailable*
         The version is not available for upgrading. Either it is a downgrade
         or it requires an intermediate upgrade first. Kubernetes can only be
         upgraded one version at a time.

#. Confirm that the system is healthy.

   Check the current system health status, resolve any alarms and other issues
   reported by the :command:`system health-query-kube-upgrade` command, then
   recheck the system health status to confirm that all **System Health**
   fields are set to **OK**.
   fields are set to *OK*.
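Spot-checking the health report can be automated. A sketch; the report text below is an assumed illustration shaped loosely like :command:`system health-query-kube-upgrade` output, not real output, and the `[Fail]` marker is part of that assumption:

```shell
# Sketch: count health checks not marked [OK] in an assumed report
# format; on a live system, pipe the real health-query output instead.
health_report='
All hosts are provisioned: [OK]
All hosts are unlocked/enabled: [OK]
No alarms: [Fail]
'
failing=$(printf '%s\n' "$health_report" | grep -c '\[Fail\]')
echo "failing checks: $failing"
```

A non-zero count means alarms or other issues must be resolved before starting the upgrade.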
   .. code-block:: none

@ -156,8 +156,8 @@ and upgrade various systems.

      | state | upgrade-started |
      +-------------------+-------------------+

   The upgrade process checks the applied/available updates, the upgrade path,
   the health of the system, the installed applications compatibility and
   The upgrade process checks the *applied*/*available* updates, the upgrade
   path, the health of the system, the installed applications' compatibility and
   validates the system is ready for an upgrade.

   .. warning::

@ -218,7 +218,7 @@ and upgrade various systems.

      | updated_at | 2020-02-20T16:18:11.459736+00:00 |
      +--------------+--------------------------------------+

   The state **upgraded-networking** will be entered when the networking
   The state *upgraded-networking* will be entered when the networking
   upgrade has completed.
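The state value can be scraped from the tabular output when polling for a transition. A sketch, using an assumed sample table in the `system kube-upgrade-show` style shown above:

```shell
# Sketch: extract the upgrade state from a kube-upgrade-show-style
# table. The sample table is assumed for illustration; pipe the real
# command on a live system.
upgrade_show='
+--------------+--------------------------------------+
| Property     | Value                                |
+--------------+--------------------------------------+
| state        | upgraded-networking                  |
+--------------+--------------------------------------+
'
state=$(printf '%s\n' "$upgrade_show" \
    | awk -F'|' '$2 ~ /^ *state/ {gsub(/ /, "", $3); print $3}')
echo "state: $state"
```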
|
||||
|
||||
#. Upgrade the control plane on the first controller.
|
||||
@ -241,7 +241,7 @@ and upgrade various systems.
|
||||
|
||||
You can upgrade either controller first.
|
||||
|
||||
The state **upgraded-first-master** will be entered when the first control
|
||||
The state *upgraded-first-master* will be entered when the first control
|
||||
plane upgrade has completed.
|
||||
|
||||
#. Upgrade the control plane on the second controller.
|
||||
@ -261,7 +261,7 @@ and upgrade various systems.
|
||||
| target_version | v1.19.13 |
|
||||
+-----------------------+-------------------------+
|
||||
|
||||
The state **upgraded-second-master** will be entered when the upgrade has
|
||||
The state *upgraded-second-master* will be entered when the upgrade has
|
||||
completed.
|
||||
|
||||
#. Show the Kubernetes upgrade status for all hosts.
|
||||
@ -298,7 +298,7 @@ and upgrade various systems.
|
||||
|
||||
~(keystone_admin)]$ system host-lock controller-1
|
||||
|
||||
.. note::
|
||||
.. warning::
|
||||
For All-In-One Simplex systems, the controller must **not** be
|
||||
locked.
|
||||
|
||||
|
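The "one version at a time" rule above can be sketched as a simple version-step check. This is an illustrative helper only (`is_valid_k8s_step` is not a platform command); it assumes `vMAJOR.MINOR.PATCH` version strings like the `v1.19.13` shown in the hunk above.

```python
# Hypothetical helper, not part of the platform CLI: Kubernetes can only be
# upgraded one minor version at a time, so the target must be exactly one
# minor release above the current version.
def is_valid_k8s_step(current: str, target: str) -> bool:
    """Return True if target is exactly one minor release above current."""
    cur_major, cur_minor = (int(p) for p in current.lstrip("v").split(".")[:2])
    tgt_major, tgt_minor = (int(p) for p in target.lstrip("v").split(".")[:2])
    return tgt_major == cur_major and tgt_minor == cur_minor + 1

print(is_valid_k8s_step("v1.18.1", "v1.19.13"))  # True: a single minor step
print(is_valid_k8s_step("v1.18.1", "v1.20.0"))   # False: needs an intermediate upgrade first
```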
@ -15,7 +15,7 @@ Standard, |prod-dc|, and subcloud deployments.
.. xbooklink For information on updating |prod-dc|, see |distcloud-doc|: :ref:`Upgrade
Management <upgrade-management-overview>`.

An upgrade can be performed manually or by the Upgrade Orchestrator which
An upgrade can be performed manually or using the Upgrade Orchestrator, which
automates a rolling install of an update across all of the |prod-long| hosts.
This section describes the manual upgrade procedures.

@ -28,8 +28,8 @@ met:

- The system is patch current.

- There are no management-affecting alarms and the "system
health-query-upgrade" check passes.
- There are no management-affecting alarms and the :command:`system
health-query-upgrade` check passes.

- The new software load has been imported.
@ -21,11 +21,12 @@ manual steps for operator oversight.
:ref:`Upgrade Management <upgrade-management-overview>`.

.. note::
The upgrade orchestration CLI is :command:`sw-manager`.To use upgrade
orchestration commands, you need administrator privileges. You must log in
to the active controller as user **sysadmin** and source the
/etc/platform/openrc script to obtain administrator privileges. Do not use
**sudo**.

The upgrade orchestration commands are prefixed with :command:`sw-manager`.
To use upgrade orchestration commands, you need administrator privileges.
You must log in to the active controller as user **sysadmin** and source the
``/etc/platform/openrc`` script to obtain administrator privileges. Do not use
:command:`sudo`.

.. code-block:: none

@ -71,9 +72,9 @@ conditions:
orchestrated while another orchestration is in progress.

- Sufficient free capacity or unused worker resources must be available
across the cluster. A rough calculation is: Required spare capacity \( %\)
= \(Number of hosts to upgrade in parallel divided by the total number of
hosts\) times 100.
across the cluster. A rough calculation is:

``Required spare capacity ( %) = (<Number-of-hosts-to-upgrade-in-parallel> / <total-number-of-hosts>) * 100``
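The spare-capacity formula in the new text can be expressed directly; this is just the rough calculation above as a function (the function name is illustrative, not part of any platform tooling):

```python
def required_spare_capacity_pct(parallel_hosts: int, total_hosts: int) -> float:
    """Rough spare-capacity estimate: hosts upgraded in parallel as a
    percentage of all hosts in the cluster."""
    return parallel_hosts / total_hosts * 100

# e.g. upgrading 2 of 10 worker hosts in parallel needs roughly 20% spare capacity
print(required_spare_capacity_pct(2, 10))  # 20.0
```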

.. _orchestration-upgrade-overview-section-N10081-N10026-N10001:

@ -81,16 +82,16 @@ conditions:
The Upgrade Orchestration Process
---------------------------------

Upgrade orchestration can be initiated after the manual upgrade and stability
of the initial controller host. Upgrade orchestration automatically iterates
through the remaining hosts, installing the new software load on each one:
first the other controller host, then the storage hosts, and finally the worker
hosts. During worker host upgrades, pods are moved to alternate worker hosts
automatically.
Upgrade orchestration can be initiated after the initial controller host has
been manually upgraded and returned to a stable state. Upgrade orchestration
automatically iterates through the remaining hosts, installing the new software
load on each one: first the other controller host, then the storage hosts, and
finally the worker hosts. During worker host upgrades, pods are automatically
moved to alternate worker hosts.

The user first creates an upgrade orchestration strategy, or plan, for the
automated upgrade procedure. This customizes the upgrade orchestration, using
parameters to specify:
You first create an upgrade orchestration strategy, or plan, for the automated
upgrade procedure. This customizes the upgrade orchestration, using parameters
to specify:

.. _orchestration-upgrade-overview-ul-eyw-fyr-31b:

@ -103,9 +104,9 @@ creates a number of stages for the overall upgrade strategy. Each stage
generally consists of moving pods, locking hosts, installing upgrades, and
unlocking hosts for a subset of the hosts on the system.

After creating the upgrade orchestration strategy, the user can either apply
the entire strategy automatically, or apply individual stages to control and
monitor its progress manually.
After creating the upgrade orchestration strategy, you can either apply the
entire strategy automatically, or apply individual stages to control and monitor
their progress manually.

Update and upgrade orchestration are mutually exclusive; they perform
conflicting operations. Only a single strategy \(sw-patch or sw-upgrade\) is
@ -115,7 +116,7 @@ strategy before going back to the upgrade.

Some stages of the upgrade could take a significant amount of time \(hours\).
For example, after upgrading a storage host, re-syncing the OSD data could take
30m per TB \(assuming 500MB/s sync rate, which is about half of a 10G
30 minutes per TB \(assuming 500MB/s sync rate, which is about half of a 10G
infrastructure link\).
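The OSD resync estimate above is straightforward arithmetic (1 TB at 500 MB/s is about 2000 seconds, i.e. roughly half an hour); a minimal sketch, with the function name being illustrative only:

```python
def resync_minutes(terabytes: float, sync_rate_mb_s: float = 500.0) -> float:
    """Estimated Ceph OSD resync time at the stated sync rate."""
    megabytes = terabytes * 1_000_000  # 1 TB ~ 10^6 MB
    return megabytes / sync_rate_mb_s / 60

print(round(resync_minutes(1)))  # 33 -- roughly the "30 minutes per TB" quoted above
```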

.. _orchestration-upgrade-overview-section-N10101-N10026-N10001:
@ -10,7 +10,7 @@ Firmware update orchestration allows the firmware on the hosts of an entire
|prod-long| system to be updated with a single operation.

You can configure and run firmware update orchestration using the |CLI|, or the
stx-nfv VIM REST API.
``stx-nfv`` VIM REST API.

.. note::
Firmware update is currently not supported on the Horizon Web interface.
@ -28,7 +28,7 @@ following conditions:

.. note::
When configuring firmware update orchestration, you have the option to
ignore alarms that are not management-affecting severity. For more
ignore alarms that are not of management-affecting severity. For more
information, see :ref:`Kubernetes Version Upgrade Cloud Orchestration
<configuring-kubernetes-update-orchestration>`.

@ -36,7 +36,7 @@ following conditions:
requires firmware update. The *Firmware Update Orchestration Strategy*
creation step will fail if there are no qualified hosts detected.

- Firmware update is a reboot required operation. Therefore, in systems that
- Firmware update is a reboot-required operation. Therefore, in systems that
have the |prefix|-openstack application applied with running instances, if
the migrate option is selected there must be spare openstack-compute \
(worker\) capacity to move instances off the openstack-compute \
@ -13,8 +13,8 @@ The upgrade orchestration CLI is :command:`sw-manager`.
.. note::
To use upgrade orchestration commands, you need administrator privileges.
You must log in to the active controller as user **sysadmin** and source the
/etc/platform/openrc script to obtain administrator privileges. Do not use
**sudo**.
``/etc/platform/openrc`` script to obtain administrator privileges. Do not use
:command:`sudo`.

The upgrade strategy options are shown in the following output:

@ -34,9 +34,9 @@ The upgrade strategy options are shown in the following output:
abort Abort a strategy
show Show a strategy

You can perform a partially orchestrated upgrade using the CLI. Upgrade and
stability of the initial controller node must be done manually before using
upgrade orchestration to orchestrate the remaining nodes of the |prod|.
You can perform a partially orchestrated upgrade using the |CLI|. Upgrade
orchestration of other |prod| nodes can be initiated after the initial
controller host has been manually upgraded and returned to a stable state.

.. note::
Management-affecting alarms cannot be ignored at the indicated severity
@ -65,9 +65,11 @@ See :ref:`Upgrading All-in-One Duplex / Standard
upgrade the initial controller node before doing the upgrade orchestration
described below to upgrade the remaining nodes of the |prod|.

- The subclouds must use the Redfish platform management service if it is an All-in-one Simplex subcloud.
- The subclouds must use the Redfish platform management service if it is an
All-in-one Simplex subcloud.

- Duplex \(AIODX/Standard\) upgrades are supported, and they do not require remote install via Redfish.
- Duplex \(AIODX/Standard\) upgrades are supported, and they do not require
remote install via Redfish.

.. rubric:: |proc|
@ -95,20 +97,20 @@ described below to upgrade the remaining nodes of the |prod|.

- storage-apply-type:

- serial \(default\): storage hosts will be upgraded one at a time
- ``serial`` \(default\): storage hosts will be upgraded one at a time

- parallel: storage hosts will be upgraded in parallel, ensuring that
- ``parallel``: storage hosts will be upgraded in parallel, ensuring that
only one storage node in each replication group is patched at a
time.

- ignore: storage hosts will not be upgraded
- ``ignore``: storage hosts will not be upgraded

- worker-apply-type:

**serial** \(default\)
``serial`` \(default\)
Worker hosts will be upgraded one at a time.

**ignore**
``ignore``
Worker hosts will not be upgraded.

- Alarm Restrictions
@ -177,8 +179,8 @@ described below to upgrade the remaining nodes of the |prod|.
relocated before it is upgraded.

#. Run :command:`sw-manager upgrade-strategy show` command, to display the
current-phase-completion displaying the field goes from 0% to 100% in
various increments. Once at 100%, it returns:
current-phase-completion percentage progress indicator in various
increments. Once at 100%, it returns:

.. code-block:: none

@ -196,7 +198,7 @@ described below to upgrade the remaining nodes of the |prod|.
build-result: success
build-reason:

#. Apply the upgrade-strategy. You can optionally apply a single stage at a
#. Apply the upgrade strategy. You can optionally apply a single stage at a
time.

.. code-block:: none
@ -214,7 +216,7 @@ described below to upgrade the remaining nodes of the |prod|.
state: applying
inprogress: true

While an upgrade-strategy is being applied, it can be aborted. This results
While an upgrade strategy is being applied, it can be aborted. This results
in:

- The current step will be allowed to complete.
@ -222,9 +224,9 @@ described below to upgrade the remaining nodes of the |prod|.
- If necessary an abort phase will be created and applied, which will
attempt to unlock any hosts that were locked.

After an upgrade-strategy has been applied \(or aborted\) it must be
deleted before another upgrade-strategy can be created. If an
upgrade-strategy application fails, you must address the issue that caused
After an upgrade strategy has been applied \(or aborted\) it must be
deleted before another upgrade strategy can be created. If an
upgrade strategy application fails, you must address the issue that caused
the failure, then delete/re-create the strategy before attempting to apply
it again.
@ -6,9 +6,10 @@
Perform an Orchestrated Upgrade
===============================

You can perform a partially-Orchestrated Upgrade of a |prod| system using the CLI and Horizon
Web interface. Upgrade and stability of the initial controller node must be done manually
before using upgrade orchestration to orchestrate the remaining nodes of the |prod|.
You can perform a partially orchestrated upgrade of a |prod| system using the
CLI and Horizon Web interface. Upgrade and stability of the initial controller
node must be done manually before using upgrade orchestration to orchestrate the
remaining nodes of the |prod|.

.. rubric:: |context|

@ -50,31 +51,31 @@ described below to upgrade the remaining nodes of the |prod| system.

#. Click the **Create Strategy** button.

The Create Strategy dialog appears.
The **Create Strategy** dialog appears.

#. Create an upgrade strategy by specifying settings for the parameters in the
Create Strategy dialog box.
**Create Strategy** dialog box.

Create an upgrade strategy, specifying the following parameters:

- storage-apply-type:

**serial** \(default\)
``serial`` \(default\)
Storage hosts will be upgraded one at a time.

**parallel**
``parallel``
Storage hosts will be upgraded in parallel, ensuring that only one
storage node in each replication group is upgraded at a time.

**ignore**
``ignore``
Storage hosts will not be upgraded.

- worker-apply-type:

**serial** \(default\):
``serial`` \(default\):
Worker hosts will be upgraded one at a time.

**parallel**
``parallel``
Worker hosts will be upgraded in parallel, ensuring that:

- At most max-parallel-worker-hosts \(see below\) worker hosts
@ -86,10 +87,10 @@ described below to upgrade the remaining nodes of the |prod| system.
- Worker hosts with no application pods are upgraded before
worker hosts with application pods.

**ignore**
``ignore``
Worker hosts will not be upgraded.

**max-parallel-worker-hosts**
``max-parallel-worker-hosts``
Specify the maximum worker hosts to upgrade in parallel \(minimum:
2, maximum: 10\).

@ -98,19 +99,19 @@ described below to upgrade the remaining nodes of the |prod| system.
(50), the value shall be at the maximum 2, which represents the
minimum value.

**alarm-restrictions**
``alarm-restrictions``
This option lets you specify how upgrade orchestration behaves when
alarms are present.

You can use the CLI command :command:`fm alarm-list
--mgmt_affecting` to view the alarms that are management affecting.

**Strict**
``Strict``
The default strict option will result in upgrade orchestration
failing if there are any alarms present in the system \(except
for a small list of alarms\).

**Relaxed**
``Relaxed``
This option allows orchestration to proceed if alarms are
present, as long as none of these alarms are management
affecting.
@ -157,10 +158,10 @@ described below to upgrade the remaining nodes of the |prod| system.
NOT updated, but any additional pods on each worker host will be
relocated before it is upgraded.

#. Apply the upgrade-strategy. You can optionally apply a single stage at a
#. Apply the upgrade strategy. You can optionally apply a single stage at a
time.

While an upgrade-strategy is being applied, it can be aborted. This results
While an upgrade strategy is being applied, it can be aborted. This results
in:

- The current step will be allowed to complete.
@ -168,9 +169,9 @@ described below to upgrade the remaining nodes of the |prod| system.
- If necessary an abort phase will be created and applied, which will
attempt to unlock any hosts that were locked.

After an upgrade-strategy has been applied \(or aborted\) it must be
deleted before another upgrade-strategy can be created. If an
upgrade-strategy application fails, you must address the issue that caused
After an upgrade strategy has been applied \(or aborted\) it must be
deleted before another upgrade strategy can be created. If an
upgrade strategy application fails, you must address the issue that caused
the failure, then delete/re-create the strategy before attempting to apply
it again.
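The apply-type semantics described in this hunk (serial, parallel with ``max-parallel-worker-hosts`` and pod-free hosts first, ignore) can be sketched as a stage builder. This is a toy model of the behavior the text describes, not the actual orchestrator code; all names are illustrative.

```python
from typing import Dict, List

def build_worker_stages(hosts_pods: Dict[str, int], apply_type: str,
                        max_parallel: int = 2) -> List[List[str]]:
    """Group worker hosts into upgrade stages:
    - serial: one host per stage
    - parallel: up to max_parallel hosts per stage, hosts with no
      application pods upgraded before hosts with application pods
    - ignore: no stages (workers are not upgraded)"""
    if apply_type == "ignore":
        return []
    if apply_type == "serial":
        return [[h] for h in hosts_pods]
    # parallel: pod-free hosts first, then chunk by max_parallel
    ordered = sorted(hosts_pods, key=lambda h: (hosts_pods[h] > 0, h))
    return [ordered[i:i + max_parallel]
            for i in range(0, len(ordered), max_parallel)]

# worker-1 hosts application pods, so it lands in a later stage
print(build_worker_stages({"worker-0": 0, "worker-1": 3, "worker-2": 0},
                          "parallel", max_parallel=2))
```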
@ -18,9 +18,9 @@ before they can be applied.
.. parsed-literal::

$ sudo sw-patch upload /home/sysadmin/patches/|pn|-CONTROLLER_<nn.nn>_PATCH_0001.patch
Cloud_Platform__CONTROLLER_nn.nn_PATCH_0001 is now available
Cloud_Platform__CONTROLLER_<nn.nn>_PATCH_0001 is now available

where *nn.nn* in the update file name is the |prod| release number.
where <nn.nn> in the update file name is the |prod| release number.

This example uploads a single update to the storage area. You can specify
multiple update files on the same command separating their names with
@ -42,7 +42,7 @@ before they can be applied.

$ sudo sw-patch query

The update state is *Available* now, indicating that it is included in the
The update state displays *Available*, indicating that it is included in the
storage area. Further details about the updates can be retrieved as
follows:
@ -20,10 +20,10 @@ version of an update has been committed to the system.

The :command:`query-dependencies` command will show a list of updates that
are required by the specified update \(including itself\). The
**--recursive** option will crawl through those dependencies to return a
``--recursive`` option will crawl through those dependencies to return a
list of all the updates in the specified update's dependency tree. This
query is used by the “commit” command in calculating the set of updates to
be committed.For example,
query is used by the :command:`commit` command in calculating the set of
updates to be committed. For example,

.. parsed-literal::

@ -48,12 +48,12 @@ version of an update has been committed to the system.
updates to be committed. The commit set is calculated by querying the
dependencies of each specified update.

The **--all** option, without the **--release** option, commits all updates
The ``--all`` option, without the ``--release`` option, commits all updates
of the currently running release. When two releases are on the system use
the **--release** option to specify a particular release's updates if
committing all updates for the non-running release. The **--dry-run**
the ``--release`` option to specify a particular release's updates if
committing all updates for the non-running release. The ``--dry-run``
option shows the list of updates to be committed and how much disk space
will be freed up. This information is also shown without the **--dry-run**
will be freed up. This information is also shown without the ``--dry-run``
option, before prompting to continue with the operation. An update can only
be committed once it has been fully applied to the system, and cannot be
removed after.
@ -61,7 +61,7 @@ version of an update has been committed to the system.
Following are examples that show the command usage.

The following command lists the status of all updates that are in an
APPLIED state.
*Applied* state.

.. code-block:: none

@ -84,7 +84,7 @@ version of an update has been committed to the system.
Would you like to continue? [y/N]: y
The patches have been committed.

The following command shows the updates now in the COMMITTED state.
The following command shows the updates now in the *Committed* state.

.. parsed-literal::
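The dependency crawl described above (each update's requirements, including itself, followed recursively) is a transitive closure. A minimal sketch of that calculation, assuming a simple mapping of update names to their direct dependencies; the patch names are hypothetical:

```python
def commit_set(deps, update):
    """Transitive closure of an update's dependencies, including itself --
    the set the text says the commit command calculates."""
    result, stack = set(), [update]
    while stack:
        u = stack.pop()
        if u not in result:
            result.add(u)
            stack.extend(deps.get(u, []))
    return result

deps = {"PATCH_0003": ["PATCH_0002"], "PATCH_0002": ["PATCH_0001"], "PATCH_0001": []}
print(sorted(commit_set(deps, "PATCH_0003")))  # ['PATCH_0001', 'PATCH_0002', 'PATCH_0003']
```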
@ -23,20 +23,20 @@ following state transitions:
Use the command :command:`sw-patch remove` to trigger this transition.

**Partial-Remove to Available**
Use the command :command:`sudo sw-patch host-install-async` <hostname>
Use the command :command:`sudo sw-patch host-install-async <hostname>`
repeatedly targeting each one of the applicable hosts in the cluster. The
transition to the *Available* state is complete when the update is removed
from all target hosts. The update remains in the update storage area as if
it had just been uploaded.

.. note::
The command :command:`sudo sw-patch host-install-async` <hostname> both
The command :command:`sudo sw-patch host-install-async <hostname>` both
installs and removes updates as necessary.

The following example describes removing an update that applies only to the
controllers. Removing updates can be done using the Horizon Web interface,
also, as discussed in :ref:`Install Reboot-Required Software Updates Using
Horizon <installing-reboot-required-software-updates-using-horizon>`.
controllers. Update removal can be done using the Horizon Web interface as
discussed in :ref:`Install Reboot-Required Software Updates Using Horizon
<installing-reboot-required-software-updates-using-horizon>`.

.. rubric:: |proc|

@ -52,7 +52,7 @@ Horizon <installing-reboot-required-software-updates-using-horizon>`.
|pn|-|pvr|-PATCH_0001 Applied

In this example the update is listed in the *Applied* state, but it could
be in the *Partial-Apply* state as well.
also be in the *Partial-Apply* state.

#. Remove the update.

@ -62,7 +62,7 @@ Horizon <installing-reboot-required-software-updates-using-horizon>`.
|pn|-|pvr|-PATCH_0001 has been removed from the repo

The update is now in the *Partial-Remove* state, ready to be removed from
the impacted hosts where it was already installed.
the impacted hosts where it is currently installed.

#. Query the updating status of all hosts in the cluster.

@ -83,7 +83,7 @@ Horizon <installing-reboot-required-software-updates-using-horizon>`.
In this example, the controllers have updates ready to be removed, and
therefore must be rebooted.

#. Remove all pending-for-removal updates from **controller-0**.
#. Remove all pending-for-removal updates from controller-0.

#. Swact controller services away from controller-0.

@ -93,7 +93,7 @@ Horizon <installing-reboot-required-software-updates-using-horizon>`.

.. code-block:: none

~(keystone_admin)]$ sudo sw-patch host-install-async <controller-0>
~(keystone_admin)]$ sudo sw-patch host-install-async controller-0

#. Unlock controller-0.

@ -109,7 +109,7 @@ Horizon <installing-reboot-required-software-updates-using-horizon>`.

.. code-block:: none

~(keystone_admin)]$ sudo sw-patch host-install-async <controller-1>
~(keystone_admin)]$ sudo sw-patch host-install-async controller-1

.. rubric:: |result|
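The update lifecycle this hunk describes (*Available*, *Partial-Apply*, *Applied*, *Partial-Remove*, plus *Committed* from the commit section) can be summarized as a small state table. This is an illustrative model of the documented transitions, not the `sw-patch` implementation; the action names are shorthand (e.g. `"host-install-all"` stands for running `sw-patch host-install-async` on every applicable host).

```python
# Documented transitions: apply moves an Available update toward Applied,
# remove moves it back toward Available, commit is terminal.
TRANSITIONS = {
    ("Available", "apply"): "Partial-Apply",
    ("Partial-Apply", "host-install-all"): "Applied",
    ("Applied", "remove"): "Partial-Remove",
    ("Partial-Remove", "host-install-all"): "Available",
    ("Applied", "commit"): "Committed",
}

def next_state(state: str, action: str) -> str:
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(f"no transition from {state} via {action}")

print(next_state("Applied", "remove"))  # Partial-Remove
```

Note that there is deliberately no transition out of *Committed*: as the commit section states, a committed update cannot be removed afterward.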
@ -11,7 +11,7 @@ upgrade, however, the rollback will impact the hosting of applications.

The upgrade abort procedure can only be applied before the
:command:`upgrade-complete` command is issued. Once this command is issued
the upgrade can not be aborted. If the return to the previous release is required,
the upgrade cannot be aborted. If you must revert to the previous release,
then restore the system using the backup data taken prior to the upgrade.

In some scenarios additional actions will be required to complete the upgrade
@ -23,7 +23,7 @@ abort. It may be necessary to restore the system from a backup.

.. code-block:: none

$ system upgrade-abort
~(keystone_admin)]$ system upgrade-abort

Once this is done there is no going back; the upgrade must be completely
aborted.
@ -41,13 +41,13 @@ abort. It may be necessary to restore the system from a backup.

.. code-block:: none

$ system host-swact controller-0
~(keystone_admin)]$ system host-swact controller-0

#. Lock controller-0.

.. code-block:: none

$ system host-lock controller-0
~(keystone_admin)]$ system host-lock controller-0

#. Wipe the disk and power down all storage \(if applicable\) and worker hosts.

@ -66,13 +66,13 @@ abort. It may be necessary to restore the system from a backup.

.. code-block:: none

$ system host-lock <hostID>
~(keystone_admin)]$ system host-lock <hostID>

#. Downgrade controller-0.

.. code-block:: none

$ system host-downgrade controller-0
~(keystone_admin)]$ system host-downgrade controller-0

The host is re-installed with the previous release load.
@ -80,7 +80,7 @@ abort. It may be necessary to restore the system from a backup.

.. code-block:: none

$ system host-unlock controller-0
~(keystone_admin)]$ system host-unlock controller-0

.. note::
Wait for controller-0 to become unlocked-enabled. Wait for the
@ -90,7 +90,7 @@ abort. It may be necessary to restore the system from a backup.

.. code-block:: none

$ system host-swact controller-1
~(keystone_admin)]$ system host-swact controller-1

Swacting back to controller-0 will switch back to using the previous
release databases, which were frozen at the time of the swact to
@ -100,11 +100,11 @@ abort. It may be necessary to restore the system from a backup.

.. code-block:: none

$ system host-lock controller-1
~(keystone_admin)]$ system host-lock controller-1

.. code-block:: none

$ system host-downgrade controller-1
~(keystone_admin)]$ system host-downgrade controller-1

The host is re-installed with the previous release load.

@ -112,7 +112,7 @@ abort. It may be necessary to restore the system from a backup.

.. code-block:: none

$ system host-unlock controller-1
~(keystone_admin)]$ system host-unlock controller-1

#. Power up and unlock the storage hosts one at a time \(if using a Ceph
@ -134,7 +134,7 @@ abort. It may be necessary to restore the system from a backup.

.. code-block:: none

$ system upgrade-complete
~(keystone_admin)]$ system upgrade-complete

This cleans up the upgrade release, configuration, databases, and so forth.

@ -142,4 +142,4 @@ abort. It may be necessary to restore the system from a backup.

.. code-block:: none

$ system load-delete
~(keystone_admin)]$ system load-delete
@ -18,7 +18,7 @@ has upgraded successfully.

.. code-block:: none

$ system upgrade-abort
~(keystone_admin)]$ system upgrade-abort

The upgrade state is set to aborting. Once this is executed, there is no
canceling; the upgrade must be completely aborted.
@ -36,7 +36,7 @@ has upgraded successfully.

.. code-block:: none

$ system host-swact controller-1
~(keystone_admin)]$ system host-swact controller-1

If controller-1 was active with the new upgrade release, swacting back to
controller-0 will switch back to using the previous release databases,
@ -47,8 +47,8 @@ has upgraded successfully.

.. code-block:: none

$ system host-lock controller-1
$ system host-downgrade controller-1
~(keystone_admin)]$ system host-lock controller-1
~(keystone_admin)]$ system host-downgrade controller-1

The host is re-installed with the previous release load.

@ -63,16 +63,16 @@ has upgraded successfully.

.. code-block:: none

$ system host-unlock controller-1
~(keystone_admin)]$ system host-unlock controller-1

#. Complete the upgrade.

.. code-block:: none

$ system upgrade-complete
~(keystone_admin)]$ system upgrade-complete

#. Delete the newer upgrade release that has been aborted.

.. code-block:: none

$ system load-delete <loadID>
~(keystone_admin)]$ system load-delete <loadID>
@ -25,7 +25,7 @@ following items:

- feature enhancements

Software updates can be installed manually or by the Update Orchestrator which
Software updates can be installed manually or by the Update Orchestrator, which
automates a rolling install of an update across all of the |prod-long| hosts.
For more information on manual updates, see :ref:`Manage Software Updates
<managing-software-updates>`. For more information on upgrade orchestration,
@ -40,7 +40,7 @@ see :ref:`Orchestrated Software Update <update-orchestration-overview>`.
.. xbooklink For more information, see, |distcloud-doc|: :ref:`Update Management for
Distributed Cloud <update-management-for-distributed-cloud>`.

The |prod| handles multiple updates being applied and removed at once. Software
|prod| handles multiple updates being applied and removed at once. Software
updates can modify and update any area of |prod| software, including the kernel
itself. For information on populating, installing and removing software
updates, see :ref:`Manage Software Updates <managing-software-updates>`.
@ -73,9 +73,9 @@ the |prod| software:
#. **Application Software Updates**

These software updates apply to software being managed through the
StarlingX Application Package Manager, that is, ':command:`system
application-upload/apply/remove/delete`'. |prod| delivers some software
through this mechanism, for example, **platform-integ-apps**.
StarlingX Application Package Manager, that is, :command:`system
application-upload/apply/remove/delete`. |prod| delivers some software
through this mechanism, for example, ``platform-integ-apps``.

For software updates for these applications, download the updated
application tarball, containing the updated FluxCD manifest, and updated
@ -18,7 +18,7 @@ hosts are upgraded one at time while continuing to provide its hosting services
to its hosted applications. An upgrade can be performed manually or using
Upgrade Orchestration, which automates much of the upgrade procedure, leaving a
few manual steps to prevent operator oversight. For more information on manual
upgrades, see :ref:`Manual PLatform Components Upgrade
upgrades, see :ref:`Manual Platform Components Upgrade
<manual-upgrade-overview>`. For more information on upgrade orchestration, see
:ref:`Orchestrated Platform Component Upgrade
<orchestration-upgrade-overview>`.
@ -26,7 +26,7 @@ upgrades, see :ref:`Manual PLatform Components Upgrade
.. warning::
   Do NOT use information in the |updates-doc| guide for |prod-dc|
   orchestrated software upgrades. If information in this document is used for
   a |prod-dc| orchestrated upgrade, the upgrade will fail resulting
   a |prod-dc| orchestrated upgrade, the upgrade will fail, resulting
   in an outage. The |prod-dc| Upgrade Orchestrator automates a
   recursive rolling upgrade of all subclouds and all hosts within the
   subclouds.
@ -40,40 +40,41 @@ Before starting the upgrades process:

.. _software-upgrades-ul-ant-vgq-gmb:

- the system must be “patch current”
- The system must be 'patch current'.

- there must be no management-affecting alarms present on the system
- There must be no management-affecting alarms present on the system.

- ensure any certificates managed by cert manager will not be renewed during
  the upgrade process
- Ensure that any certificates managed by cert manager will not be renewed
  during the upgrade process.

- the new software load must be imported, and
- The new software load must be imported.

- a valid license file for the new software release must be installed
- A valid license file for the new software release must be installed.

The upgrade process starts by upgrading the controllers. The standby controller
is upgraded first and involves loading the standby controller with the new
release of software and migrating all the controller services' databases for
the new release of software. Activity is switched to the upgraded controller,
release of software and migrating all the controller services' databases for the
new release of software. Activity is switched to the upgraded controller,
running in a 'compatibility' mode where all inter-node messages are using
message formats from the old release of software. Before upgrading the second
controller, is the "point-of-no-return for an in-service abort" of the upgrades
process. The second controller is loaded with the new release of software and
becomes the new Standby controller. For more information on manual upgrades,
see :ref:`Manual Platform Components Upgrade <manual-upgrade-overview>` .
message formats from the old release of software. Prior to upgrading the second
controller, you reach a "point-of-no-return for an in-service abort" of the
upgrades process. The second controller is loaded with the new release of
software and becomes the new Standby controller. For more information on manual
upgrades, see :ref:`Manual Platform Components Upgrade
<manual-upgrade-overview>`.

If present, storage nodes are locked, upgraded and unlocked one at a time in
order to respect the redundancy model of |prod| storage nodes. Storage nodes
can be upgraded in parallel if using upgrade orchestration.

Upgrade of worker nodes is the next step in the process. When locking a worker
node the node is tainted, such that Kubernetes shuts down any pods on this
worker node and restarts the pods on another worker node. When upgrading the
worker node, the worker node network boots/installs the new software from the
active controller. After unlocking the worker node, the worker services are
running in a 'compatibility' mode where all inter-node messages are using
message formats from the old release of software. Note that the worker nodes
can only be upgraded in parallel if using upgrade orchestration.
Worker nodes are then upgraded. Worker nodes are tainted when locked, such that
Kubernetes shuts down any pods on this worker node and restarts the pods on
another worker node. When upgrading the worker node, the worker node network
boots/installs the new software from the active controller. After unlocking the
worker node, the worker services are running in a 'compatibility' mode where all
inter-node messages are using message formats from the old release of software.
Note that the worker nodes can only be upgraded in parallel if using upgrade
orchestration.

The final step of the upgrade process is to activate and complete the upgrade.
This involves disabling 'compatibility' modes on all hosts and clearing the
@ -97,9 +98,9 @@ resolved. Issues specific to a storage or worker host can be addressed by
temporarily downgrading the host, addressing the issues and then upgrading the
host again, or in some cases by replacing the node.

In extremely rare cases, it may be necessary to abort an upgrade. This is a
last resort and should only be done if there is no other way to address the
issue within the context of the upgrade. There are two cases for doing such an
In extremely rare cases, it may be necessary to abort an upgrade. This is a last
resort and should only be done if there is no other way to address the issue
within the context of the upgrade. There are two scenarios for doing such an
abort:

.. _software-upgrades-ul-dqp-brt-cx:

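The preconditions listed earlier (patch current, no management-affecting alarms, load imported, valid license) amount to a gating check before :command:`system upgrade-start`. A minimal illustrative sketch of that gating logic — the class and field names here are hypothetical, not the actual sysinv implementation:

```python
# Illustrative sketch of the pre-upgrade gating checks; the class and
# field names are hypothetical, not the actual sysinv code.
from dataclasses import dataclass, field

@dataclass
class SystemState:
    patch_current: bool = False
    mgmt_affecting_alarms: list = field(default_factory=list)
    load_imported: bool = False
    license_valid: bool = False
    locked_hosts: list = field(default_factory=list)

def upgrade_start_allowed(state):
    """Return (allowed, reasons) for a `system upgrade-start` attempt."""
    reasons = []
    if not state.patch_current:
        reasons.append("system is not patch current")
    if state.mgmt_affecting_alarms:
        reasons.append("management-affecting alarms present")
    if not state.load_imported:
        reasons.append("new software load not imported")
    if not state.license_valid:
        reasons.append("no valid license for the new release")
    if state.locked_hosts:
        reasons.append("locked hosts: " + ", ".join(state.locked_hosts))
    return (not reasons, reasons)
```

The real health check reports each failed condition, which is why the sketch accumulates reasons rather than returning on the first failure.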
@ -32,13 +32,14 @@ instances and since the firmware update is a reboot required operation for a
host, the strategy offers **stop/start** or **migrate** options for managing
instances over the **lock/unlock** \(reboot\) steps in the update process.

You must use the **sw-manager** |CLI| tool to **create**, and then **apply** the
update strategy. A created strategy can be monitored with the **show** command.
For more information, see :ref:`Firmware Update Orchestration Using the CLI
You must use the :command:`sw-manager` |CLI| commands to create, and then apply
the update strategy. A created strategy can be monitored with the
:command:`sw-manager show` command. For more information, see :ref:`Firmware
Update Orchestration Using the CLI
<firmware-update-orchestration-using-the-cli>`.

Firmware update orchestration automatically iterates through all
**unlocked-enabled** hosts on the system looking for hosts with the worker
*unlocked-enabled* hosts on the system looking for hosts with the worker
function that need firmware update and then proceeds to update them on the
strategy :command:`apply` action.

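As described above, the strategy only considers *unlocked-enabled* hosts with the worker function that still need a firmware update. That selection can be sketched as follows — the dictionary field names are illustrative, not the actual VIM data model:

```python
# Sketch of host selection for a firmware-update strategy: only
# unlocked-enabled hosts with the worker function that still need an
# update are candidates. Field names are illustrative only.
def hosts_needing_firmware_update(hosts):
    return [
        h["name"]
        for h in hosts
        if h["admin"] == "unlocked"
        and h["oper"] == "enabled"
        and "worker" in h["functions"]
        and h["firmware_out_of_date"]
    ]
```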
@ -52,13 +53,13 @@ After creating the *Firmware Update Orchestration Strategy*, you can either
apply the entire strategy automatically, or manually apply individual stages to
control and monitor the firmware update progress one stage at a time.

When the firmware update strategy is **applied**, if the system is All-in-one,
When the firmware update strategy is applied, if the system is All-in-one,
the controllers are updated first, one after the other with a swact in between,
followed by the remaining worker hosts according to the selected worker apply
concurrency \(**serial** or **parallel**\) method.

The strategy creation default is to update the worker hosts serially unless the
**parallel** worker apply type option is specified which configures the
``parallel`` worker apply type option is specified which configures the
firmware update process for worker hosts to be in parallel \(up to a maximum
parallel number\) to reduce the overall firmware update installation time.

@ -73,7 +74,8 @@ steps involved in a firmware update for a single or group of hosts include:

#. Alarm Query – is an update pre-check.

#. Firmware update – non-service affecting update that can take over 45 minutes.
#. Firmware update – is a non-service affecting update that can take over 45
   minutes.

#. Lock Host.

@ -89,11 +91,11 @@ Strategy* considers any configured server groups and host aggregates when
creating the stages to reduce the impact to running instances. The *Firmware
Update Orchestration Strategy* automatically manages the instances during the
strategy application process. The instance management options include
**start-stop** or **migrate**.
``start-stop`` or ``migrate``.

.. _htb1590431033292-ul-vcp-dvs-tlb:

- **start-stop**: where instances are stopped following the actual firmware
- ``start-stop``: where instances are stopped following the actual firmware
  update but before the lock operation and then automatically started again
  after the unlock completes. This is typically used for instances that do
  not support migration or for cases where migration takes too long. To
@ -101,6 +103,6 @@ strategy application process. The instance management options include
  instance, the instance\(s\) should be protected and grouped into an
  anti-affinity server group\(s\) with its standby instance.

- **migrate**: where instances are moved off a host following the firmware
- ``migrate``: where instances are moved off a host following the firmware
  update but before the host is locked. Instances with **Live Migration**
  support are **Live Migrated**. Otherwise, they are **Cold Migrated**.

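The serial-versus-parallel worker apply type described above determines how hosts are grouped into stages. A minimal sketch of that staging, under the assumption (stated here, not taken from the source) that the real VIM strategy builder also honors server groups and host aggregates:

```python
# Sketch of worker staging: "serial" puts one host per stage;
# "parallel" batches hosts up to max_parallel per stage. Illustrative
# only -- the real strategy builder also considers host aggregates.
def build_worker_stages(workers, apply_type="serial", max_parallel=2):
    if apply_type == "serial":
        return [[w] for w in workers]
    return [workers[i:i + max_parallel]
            for i in range(0, len(workers), max_parallel)]
```

Each stage is locked, updated, and unlocked before the next stage starts, which is why fewer stages (parallel apply) shortens the overall installation time.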
@ -35,35 +35,33 @@ operation for a host, the strategy offers **stop/start** or **migrate** options
for managing instances over the **lock/unlock** \(reboot\) steps in the upgrade
process.

You must use the **sw-manager** CLI tool to **create**, and then **apply** the
upgrade strategy. A created strategy can be monitored with the **show**
command.
You must use the :command:`sw-manager` CLI tool to create, and then apply the
upgrade strategy. A created strategy can be monitored with the **show** command.

Kubernetes version upgrade orchestration automatically iterates through all
**unlocked-enabled** hosts on the system looking for hosts with the worker
function that need Kubernetes version upgrade and then proceeds to upgrade them
*unlocked-enabled* hosts on the system looking for hosts with the worker
function that need Kubernetes version upgrades and then proceeds to upgrade them
on the strategy :command:`apply` action.

.. note::
   Controllers (including |AIO| controllers) are upgraded before worker only
   hosts. Storage hosts do not run Kubernetes so Kubernetes is not upgraded
   on them, although they still may be patched.
   Controllers (including |AIO| controllers) are upgraded before worker-only
   hosts. Since storage hosts do not run Kubernetes, no upgrade is performed,
   although they may still be patched.

After creating the *Kubernetes Version Upgrade Orchestration Strategy*, you can
either apply the entire strategy automatically, or manually apply individual
stages to control and monitor the Kubernetes version upgrade progress one stage
at a time.

When the Kubernetes version upgrade strategy is **applied**, if the system is
When the Kubernetes version upgrade strategy is applied, if the system is
All-in-one, the controllers are upgraded first, one after the other with a
swact in between, followed by the remaining worker hosts according to the
selected worker apply concurrency \(**serial** or **parallel**\) method.

The strategy creation default is to upgrade the worker hosts serially unless
the **parallel** worker apply type option is specified which configures the
Kubernetes version upgrade process for worker hosts to be in parallel \(up to a
maximum parallel number\) to reduce the overall Kubernetes version upgrade
installation time.
By default, strategies upgrade the worker hosts serially unless the **parallel**
worker apply type option is specified, which configures the Kubernetes version
upgrade process for worker hosts to be in parallel \(up to a maximum parallel
number\). This reduces the overall Kubernetes version upgrade installation time.

The upgrade takes place in two phases. The first phase upgrades the patches
(controllers, storage and then workers), and the second phase upgrades
@ -113,25 +111,25 @@ Upgrade Operations Requiring Manual Migration

On systems with |prefix|-openstack application, the *Kubernetes Version Upgrade
Orchestration Strategy* considers any configured server groups and host
aggregates when creating the stages to reduce the impact to running instances.
The *Kubernetes Version Upgrade Orchestration Strategy* automatically manages
the instances during the strategy application process. The instance management
options include **start-stop** or **migrate**.
aggregates when creating the stages in order to reduce the impact to running
instances. The *Kubernetes Version Upgrade Orchestration Strategy* automatically
manages the instances during the strategy application process. The instance
management options include **start-stop** or **migrate**.


.. _htb1590431033292-ul-vcp-dvs-tlb:

- **start-stop**: where instances are stopped following the actual Kubernetes
- **start-stop**: Instances are stopped following the actual Kubernetes
  upgrade but before the lock operation and then automatically started again
  after the unlock completes. This is typically used for instances that do
  not support migration or for cases where migration takes too long. To
  ensure this does not impact the high-level service being provided by the
  instance, the instance\(s\) should be protected and grouped into an
  anti-affinity server group\(s\) with its standby instance.
  after the unlock completes. This is typically used for instances that do not
  support migration or for cases where migration takes too long. To ensure
  this does not impact the high-level service being provided by the instance,
  the instance\(s\) should be protected and grouped into an anti-affinity
  server group\(s\) with its standby instance.

- **migrate**: where instances are moved off a host following the Kubernetes
  upgrade but before the host is locked. Instances with **Live Migration**
  support are **Live Migrated**. Otherwise, they are **Cold Migrated**.
- **migrate**: Instances are moved off a host following the Kubernetes upgrade
  but before the host is locked. Instances with **Live Migration** support are
  **Live Migrated**. Otherwise, they are **Cold Migrated**.


.. _kubernetes-update-operations-requiring-manual-migration:
@ -149,34 +147,33 @@ Do the following to manage the instance re-location manually:

.. _rbp1590431075472-ul-mgr-kvs-tlb:

- Manually perform Kubernetes version upgrade at least one openstack-compute worker host. This
  assumes that at least one openstack-compute worker host does not have any
  instances, or has instances that can be migrated. For more information on
  manually updating a host, see :ref:`Manual Kubernetes Version Upgrade
- Manually perform a Kubernetes version upgrade of at least one
  openstack-compute worker host. This assumes that at least one
  openstack-compute worker host does not have any instances, or has instances
  that can be migrated. For more information on manually updating a host, see
  :ref:`Manual Kubernetes Version Upgrade
  <manual-kubernetes-components-upgrade>`.

- If the migration is prevented by limitations in the VNF or virtual
  application, perform the following:


  - Create new instances on an already upgraded openstack-compute worker
  #. Create new instances on an already upgraded openstack-compute worker
     host.

  - Manually migrate the data from the old instances to the new instances.
  #. Manually migrate the data from the old instances to the new instances.

     .. note::
        This is specific to your environment and depends on the virtual
        application running in the instance.

  - Terminate the old instances.
  #. Terminate the old instances.


- If the migration is prevented by the size of the instances local disks:


  - For each openstack-compute worker host that has instances that cannot
    be migrated, manually move the instances using the CLI.
- If the migration is prevented by the size of the instances' local disks, then
  for each openstack-compute worker host that has instances that cannot
  be migrated, manually move the instances using the CLI.

  Once all openstack-compute worker hosts containing instances that cannot be
  migrated have been Kubernetes version upgraded, Kubernetes version upgrade
  orchestration can then be used to upgrade the remaining worker hosts.
  migrated have been Kubernetes-version upgraded, Kubernetes version upgrade
  orchestration can be used to upgrade the remaining worker hosts.

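The host-ordering rule described earlier in this section — controllers (including |AIO| controllers) first, then worker-only hosts, with storage hosts skipped because they do not run Kubernetes — can be sketched as follows (field names illustrative, not the actual VIM data model):

```python
# Sketch of Kubernetes-version-upgrade host ordering: controllers
# first (AIO controllers carry both functions), then worker-only
# hosts; storage hosts are excluded. Field names are illustrative.
def k8s_upgrade_order(hosts):
    controllers = [h for h in hosts if "controller" in h["functions"]]
    workers_only = [h for h in hosts
                    if "worker" in h["functions"]
                    and "controller" not in h["functions"]]
    return [h["name"] for h in controllers + workers_only]
```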
@ -16,8 +16,8 @@ interface dialog, described in :ref:`Configuring Update Orchestration
.. note::
   To use update orchestration commands, you need administrator privileges.
   You must log in to the active controller as user **sysadmin** and source
   the /etc/platform/openrc script to obtain administrator privileges. Do not
   use **sudo**.
   the ``/etc/platform/openrc`` script to obtain administrator privileges. Do not
   use :command:`sudo`.

.. note::
   Management-affecting alarms cannot be ignored at the indicated severity

@ -14,7 +14,7 @@ operation.
   :depth: 1

You can configure and run update orchestration using the CLI, the Horizon Web
interface, or the stx-nfv REST API.
interface, or the ``stx-nfv`` REST API.

.. note::
   Updating of |prod-dc| is distinct from updating of other |prod|
@ -74,9 +74,9 @@ the same time. Update orchestration only locks and unlocks \(that is, reboots\)
a host to install an update if at least one reboot-required update has been
applied.

The user first creates an update orchestration strategy, or plan, for the
automated updating procedure. This customizes the update orchestration, using
parameters to specify:
You first create an update orchestration strategy, or plan, for the automated
updating procedure. This customizes the update orchestration, using parameters
to specify:

.. _update-orchestration-overview-ul-eyw-fyr-31b:


@ -27,19 +27,19 @@ To keep track of software update installation, you can use the
.. parsed-literal::

   ~(keystone_admin)]$ sudo sw-patch query
   Patch ID                 Patch State
   ===========              ============
   |pvr|-nn.nn_PATCH_0001   Applied
   Patch ID                     Patch State
   ===========                  ============
   |pvr|-<nn>.<nn>_PATCH_0001   Applied

where *nn.nn* in the update filename is the |prod| release number.
where <nn>.<nn> in the update filename is the |prod| release number.

This shows the **Patch State** for each of the updates in the storage area:
This shows the 'Patch State' for each of the updates in the storage area:

**Available**
``Available``
   An update in the *Available* state has been added to the storage area, but
   is not currently in the repository or installed on the hosts.

**Partial-Apply**
``Partial-Apply``
   An update in the *Partial-Apply* state has been added to the software
   updates repository using the :command:`sw-patch apply` command, but has not
   been installed on all hosts that require it. It may have been installed on
@ -51,12 +51,12 @@ This shows the **Patch State** for each of the updates in the storage area:
   node X, you cannot just install the non-reboot-required update to the
   unlocked node X.

**Applied**
``Applied``
   An update in the *Applied* state has been installed on all hosts that
   require it.

You can use the :command:`sw-patch query-hosts` command to see which hosts are
fully updated \(**Patch Current**\). This also shows which hosts require
fully updated \(Patch Current\). This also shows which hosts require
reboot, either because they are not fully updated, or because they are fully
updated but not yet rebooted.


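A script that needs the patch states shown above can parse the two-column `sw-patch query` layout. An illustrative sketch, assuming only the layout documented here:

```python
# Sketch of parsing `sw-patch query` output into {patch_id: state}.
# Assumes the two-column layout shown above; header and separator
# rows are skipped because their second field is not a known state.
def parse_patch_query(output):
    states = {}
    for line in output.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1] in ("Available", "Partial-Apply", "Applied"):
            states[parts[0]] = parts[1]
    return states
```

A system is 'patch current' for its running release only when every update it requires reports `Applied`.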
@ -16,11 +16,11 @@ of |prod| software.
- Perform a full backup to allow recovery.

  .. note::
     Back up files in the /home/sysadmin and /root directories prior
     Back up files in the ``/home/sysadmin`` and ``/root`` directories prior
     to doing an upgrade. Home directories are not preserved during backup or
     restore operations, blade replacement, or upgrades.

- The system must be "patch current". All updates available for the current
- The system must be 'patch current'. All updates available for the current
  release running on the system must be applied, and all patches must be
  committed. To find and download applicable updates, visit the |dnload-loc|.

@ -29,9 +29,9 @@ of |prod| software.

  .. note::
     Make sure that the ``/home/sysadmin`` directory has enough space
     (at least 2GB of free space), otherwise the upgrade may fail once it
     starts. If more space is needed, it is recommended to delete the
     ``.iso bootimage`` previously imported after the `load-import` command.
     (at least 2GB of free space), otherwise the upgrade may fail.
     If more space is needed, it is recommended to delete the
     ``.iso`` bootimage previously imported after the :command:`load-import` command.

- Transfer the new release software license file to controller-0, \(or onto a
  USB stick\).
@ -41,8 +41,8 @@ of |prod| software.

- Unlock all hosts.

  - All nodes must be unlocked. The upgrade cannot be started when there
    are locked nodes \(the health check prevents it\).
  - All nodes must be unlocked, as the health check prevents the upgrade
    from starting if there are locked nodes.

  .. note::
     The upgrade procedure includes steps to resolve system health issues.
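The 2 GB free-space pre-check on ``/home/sysadmin`` described above is easy to script before starting the upgrade. A minimal sketch using the standard library (the threshold mirrors the documented guidance; the function name is ours, not a platform API):

```python
# Sketch of the documented pre-upgrade free-space check: at least
# 2 GB must be free under /home/sysadmin or the upgrade may fail.
import shutil

def has_upgrade_headroom(path="/home/sysadmin", min_free_bytes=2 * 1024**3):
    """Return True if `path` has at least `min_free_bytes` free."""
    return shutil.disk_usage(path).free >= min_free_bytes
```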
@ -51,8 +51,7 @@ of |prod| software.

#. Ensure that controller-0 is the active controller.

#. Install the license file for the release you are upgrading to, for example,
   nn.nn.
#. Install the license file for the release you are upgrading to.

   .. code-block:: none

@ -66,13 +65,12 @@ of |prod| software.

#. Import the new release.


   #. Run the :command:`load-import` command on **controller-0** to import
   #. Run the :command:`load-import` command on controller-0 to import
      the new release.

      First, source /etc/platform/openrc. Also, you must specify an exact
      path to the \*.iso bootimage file and to the \*.sig bootimage signature
      file.
      Source ``/etc/platform/openrc``. Also, you must specify an exact
      path to the ``*.iso`` bootimage file and to the ``*.sig`` bootimage
      signature file.

      .. code-block:: none

@ -89,7 +87,7 @@ of |prod| software.
      | required_patches   |           |
      +--------------------+-----------+

      The :command:`load-import` must be done on **controller-0** and accepts
      The :command:`load-import` must be done on controller-0 and accepts
      relative paths.

      .. note::
@ -112,7 +110,7 @@ of |prod| software.
   The system must be 'patch current'. All software updates related to your
   current |prod| software release must be uploaded, applied, and installed.

   All software updates to the new |prod| release, only need to be uploaded
   All software updates to the new |prod| release only need to be uploaded
   and applied. The install of these software updates will occur automatically
   during the software upgrade procedure as the hosts are reset to load the
   new release of software.
@ -127,7 +125,7 @@ of |prod| software.
   Check the current system health status, resolve any alarms and other issues
   reported by the :command:`system health-query-upgrade` command, then
   recheck the system health status to confirm that all **System Health**
   fields are set to **OK**. For example:
   fields are set to *OK*. For example:

   .. code-block:: none

@ -148,10 +146,9 @@ of |prod| software.
      All kubernetes applications are in a valid state: [OK]
      Active controller is controller-0: [OK]

   By default, the upgrade process cannot be run and is not recommended to be
   run with active alarms present. Use the command :command:`system upgrade-start --force`
   to force the upgrade process to start and ignore non-management-affecting
   alarms.
   By default, the upgrade process cannot be run with active alarms present.
   Use the command :command:`system upgrade-start --force` to force the upgrade
   process to start and ignore non-management-affecting alarms.

   .. note::
      It is strongly recommended that you clear your system of any and all
@ -177,17 +174,18 @@ of |prod| software.
      | to_release   | nn.nn                                |
      +--------------+--------------------------------------+

   This will make a copy of the upgrade data onto a DRBD file system to be

   This will make a copy of the upgrade data onto a |DRBD| file system to be
   used in the upgrade. Configuration changes are not allowed after this point
   until the swact to controller-1 is completed.

   The following upgrade state applies once this command is executed:

   - started:
   - ``started``:

     - State entered after :command:`system upgrade-start` completes.

     - Release nn.nn system data \(for example, postgres databases\) has
     - Release <nn>.<nn> system data \(for example, postgres databases\) has
       been exported to be used in the upgrade.

     - Configuration changes must not be made after this point, until the
@ -200,13 +198,14 @@ of |prod| software.
     upgrade.

   .. note::

      Use the command :command:`system upgrade-start --force` to force the
      upgrade process to start and ignore non-management-affecting alarms.
      This should ONLY be done if you feel these alarms will not be an issue
      over the upgrades process.
      This should **ONLY** be done if you ascertain that these alarms will
      not interfere with the upgrades process.

   On systems with Ceph storage, it also checks that the Ceph cluster is
   healthy.
   On systems with Ceph storage, the process also checks that the Ceph cluster
   is healthy.

#. Upgrade controller-1.

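The health check output shown earlier can be validated mechanically: every reported field must end in ``[OK]`` before :command:`system upgrade-start` is run without ``--force``. An illustrative sketch of that check (the parsing assumes only the output layout shown in this procedure):

```python
# Sketch of validating `system health-query-upgrade` output: every
# "...: [state]" line must report [OK]. Purely illustrative parsing.
def all_health_ok(output):
    checks = [line for line in output.splitlines() if "[" in line]
    return bool(checks) and all(line.rstrip().endswith("[OK]") for line in checks)
```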
@ -231,7 +230,7 @@ of |prod| software.
      The following data migration states apply when this command is
      executed:

      - data-migration:
      - ``data-migration``:

        - State entered when :command:`system host-upgrade controller-1`
          is executed.
@ -245,21 +244,21 @@ of |prod| software.
          You can view the upgrade progress on controller-1 using the
          serial console.

      - data-migration-complete or upgrading-controllers:
      - ``data-migration-complete`` or ``upgrading-controllers``:

        - State entered when controller-1 upgrade is complete.

        - System data has been successfully migrated from release nn.nn
          to release nn.nn.
        - System data has been successfully migrated from release <nn>.<nn>
          to the newer version.

      - data-migration-failed:
      - ``data-migration-failed``:

        - State entered if data migration on controller-1 fails.

        - Upgrade must be aborted.

          .. note::
             Review the /var/log/sysinv.log on the active controller for
             Review the ``/var/log/sysinv.log`` on the active controller for
             more details on data migration failure.

#. Check the upgrade state.
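The upgrade states referenced in these steps form a small state machine: ``started`` leads into ``data-migration``, which either completes or fails, and so on. A simplified sketch of the transitions named in this procedure (the real sysinv state machine has more states; this is illustrative only):

```python
# Simplified sketch of the upgrade-state transitions named in this
# procedure. The actual sysinv state machine has additional states.
NEXT_STATES = {
    "started": {"data-migration"},
    "data-migration": {"data-migration-complete", "data-migration-failed"},
    "data-migration-complete": {"upgrading-controllers"},
    "upgrading-controllers": {"upgrading-hosts"},
    "data-migration-failed": {"aborting"},
}

def can_transition(current, target):
    return target in NEXT_STATES.get(current, set())
```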
@ -277,7 +276,7 @@ of |prod| software.
      +--------------+--------------------------------------+

   If the :command:`upgrade-show` status indicates
   'data-migration-failed', then there is an issue with the data
   *data-migration-failed*, then there is an issue with the data
   migration. Check the issue before proceeding to the next step.

#. Unlock controller-1.
@ -286,20 +285,22 @@ of |prod| software.

      ~(keystone_admin)]$ system host-unlock controller-1

   Wait for controller-1 to become **unlocked-enabled**. Wait for the DRBD
   sync **400.001** Services-related alarm is raised and then cleared.
   Wait for controller-1 to enter the state *unlocked-enabled*. Wait for
   the |DRBD| sync **400.001** Services-related alarm to be raised and then
   cleared.

   The following states apply when this command is executed.

   - upgrading-controllers:
   - ``upgrading-controllers``:

     - State entered when controller-1 has been unlocked and is
       running release nn.nn software.

   If it transitions to **unlocked-disabled-failed**, check the issue
   before proceeding to the next step. The alarms may indicate a
   If the controller transitions to **unlocked-disabled-failed**, check the
   issue before proceeding to the next step. The alarms may indicate a
   configuration error. Check the result of the configuration logs on
   controller-1, \(for example, Error logs in controller1:/var/log/puppet\).
   controller-1, \(for example, Error logs in
   controller1:``/var/log/puppet``\).

#. Set controller-1 as the active controller. Swact to controller-1.

@ -307,36 +308,33 @@ of |prod| software.
|
||||
|
||||
~(keystone_admin)]$ system host-swact controller-0
|
||||
|
||||
Wait until all services are enabled / active and the swact is complete
|
||||
on controller-0 before proceeding to the next step. Use the following
|
||||
command below:
|
||||
Wait until services have become active on the new active controller-1 before
|
||||
proceeding to the next step. The swact is complete when all services on
|
||||
controller-1 are in the state ``enabled-active``. Use the command ``system
|
||||
servicegroup-list`` to monitor progress.
|
||||
|
||||
.. code-block:: none
|
||||
#. Upgrade controller-0.
|
||||
|
||||
~(keystone_admin)]$ system servicegroup-list
|
||||
|
||||
#. Upgrade **controller-0**.
#. Lock **controller-0**.
#. Lock controller-0.

.. code-block:: none

~(keystone_admin)]$ system host-lock controller-0

#. Upgrade **controller-0**.
#. Upgrade controller-0.

.. code-block:: none

~(keystone_admin)]$ system host-upgrade controller-0

#. Unlock **controller-0**.
#. Unlock controller-0.

.. code-block:: none

~(keystone_admin)]$ system host-unlock controller-0

Wait until the DRBD sync **400.001** Services-related alarm is raised
Wait until the |DRBD| sync **400.001** Services-related alarm is raised
and then cleared before proceeding to the next step.
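The wait for the **400.001** alarm to clear can be automated. A sketch under assumptions: it only waits for the alarm ID to be absent from the alarm listing (the raise-then-clear sequencing is left out for brevity), and the listing command is passed as arguments so it can be stubbed; on a live system it would be invoked as ``wait_alarm_cleared 400.001 fm alarm-list``:

```shell
# Sketch only: poll until the given alarm ID no longer appears in the
# output of the listing command passed as the remaining arguments.
wait_alarm_cleared() {
    alarm_id=$1; shift
    while "$@" | grep -q "$alarm_id"; do
        sleep 30   # poll interval; adjust as needed
    done
}
```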
- upgrading-hosts:

@ -356,7 +354,7 @@ of |prod| software.

Clear all alarms unrelated to the upgrade process.

#. If using Ceph storage backend, upgrade the storage nodes one at a time.
#. If using a Ceph storage backend, upgrade the storage nodes one at a time.

.. note::
Proceed to step 13 if no storage/worker node is present.
@ -370,10 +368,10 @@ of |prod| software.
~(keystone_admin)]$ system host-lock storage-0

#. Verify that the OSDs are down after the storage node is locked.
#. Verify that the |OSDs| are down after the storage node is locked.

In the Horizon interface, navigate to **Admin** \> **Platform** \>
**Storage Overview** to view the status of the OSDs.
**Storage Overview** to view the status of the |OSDs|.

#. Upgrade storage-0.

@ -381,7 +379,7 @@ of |prod| software.

~(keystone_admin)]$ system host-upgrade storage-0

The upgrade is complete when the node comes online, and at that point,
The upgrade is complete when the node comes online. At that point
you can safely unlock the node.

After upgrading a storage node, but before unlocking, there are Ceph

@ -408,7 +406,7 @@ of |prod| software.

**800.003**. The alarm is cleared after all storage nodes are
upgraded.

#. Upgrade worker hosts, one at a time, if any.
#. Upgrade worker hosts, if any, one at a time.

#. Lock worker-0.

@ -431,7 +429,7 @@ of |prod| software.

~(keystone_admin)]$ system host-unlock worker-0

Wait for all alarms to clear after the unlock before proceeding to the
After the unlock wait for all alarms to clear before proceeding to the
next worker host.

#. Repeat the above steps for each worker host.

@ -442,9 +440,9 @@ of |prod| software.

~(keystone_admin)]$ system host-swact controller-1

Wait until services have gone active on the active controller-0 before
proceeding to the next step. When all services on controller-0 are
enabled-active, the swact is complete.
Wait until services have become available on the active controller-0 before
proceeding to the next step. When all services on controller-0 are in the
``enabled-active`` state, the swact is complete.

#. Activate the upgrade.
@ -460,31 +458,31 @@ of |prod| software.
| to_release | nn.nn |
+--------------+--------------------------------------+

During the running of the :command:`upgrade-activate` command, new
When running the :command:`upgrade-activate` command, new
configurations are applied to the controller. 250.001 \(**hostname
Configuration is out-of-date**\) alarms are raised and are cleared as the
configuration is applied. The upgrade state goes from **activating** to
**activation-complete** once this is done.
configuration is applied. The upgrade state goes from ``activating`` to
``activation-complete`` once this is done.

The following states apply when this command is executed.

**activation-requested**
``activation-requested``
State entered when :command:`system upgrade-activate` is executed.

**activating**
State entered when we have started activating the upgrade by applying
``activating``
State entered when the system has started activating the upgrade by applying
new configurations to the controller and compute hosts.

**activating-hosts**
``activating-hosts``
State entered when applying host-specific configurations. This state is
entered only if needed.

**activation-complete**
``activation-complete``
State entered when new configurations have been applied to all
controller and compute hosts.

#. Check the status of the upgrade again to see it has reached
**activation-complete**.
``activation-complete``.

.. code-block:: none
@ -30,8 +30,8 @@ software.
- The system is patch current.

- There should be sufficient free space in /opt/platform-backup. Remove
any unused files if necessary.
- There should be sufficient free space in ``/opt/platform-backup``.
Remove any unused files if necessary.

- The new software load has been imported.
@ -41,10 +41,12 @@ software.
stick\); controller-0 must be active.

.. note::
Make sure that the ``/home/sysadmin`` directory has enough space
(at least 2GB of free space), otherwise the upgrade may fail once it
starts. If more space is needed, it is recommended to delete the
``.iso bootimage`` previously imported after the `load-import` command.
Make sure that the ``/home/sysadmin`` directory has enough space (at
least 2GB of free space), otherwise the upgrade may fail once it starts.
If more space is needed, it is recommended to delete the ``.iso``
bootimage previously imported after the :command:`load-import`
command.

- Transfer the new release software license file to controller-0 \(or onto a
USB stick\).
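The 2GB free-space precondition on ``/home/sysadmin`` mentioned in the note above can be checked before starting. A minimal sketch using POSIX ``df -P`` (column 4 of the second output line is available space in 1K blocks):

```shell
# Sketch only: succeed when the filesystem holding the given directory
# has at least N gigabytes available.
has_free_gb() {
    # $1: directory, $2: required gigabytes
    avail_kb=$(df -Pk "$1" | awk 'NR==2 {print $4}')
    [ "$avail_kb" -ge $(( $2 * 1024 * 1024 )) ]
}
```

For example: ``has_free_gb /home/sysadmin 2 || echo "free up space before starting the upgrade"``.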
@ -55,7 +57,7 @@ software.
.. note::
The upgrade procedure includes steps to resolve system health issues.

End user container images in `registry.local` will be backed up during the
End user container images in ``registry.local`` will be backed up during the
upgrade process. This only includes images other than |prod| system and
application images. These images are limited to 5 GB in total size. If the
system contains more than 5 GB of these images, the upgrade start will fail.

@ -71,8 +73,7 @@ For more details, see :ref:`Detailed contents of a system backup

$ source /etc/platform/openrc
~(keystone_admin)]$

#. Install the license file for the release you are upgrading to, for example,
nn.nn.
#. Install the license file for the release you are upgrading to.

.. code-block:: none

@ -86,13 +87,13 @@ For more details, see :ref:`Detailed contents of a system backup

#. Import the new release.

#. Run the :command:`load-import` command on **controller-0** to import
#. Run the :command:`load-import` command on controller-0 to import
the new release.

First, source /etc/platform/openrc.
First, source ``/etc/platform/openrc``.

You must specify an exact path to the \*.iso bootimage file and to the
\*.sig bootimage signature file.
You must specify an exact path to the ``*.iso`` bootimage file and to the
``*.sig`` bootimage signature file.

.. code-block:: none
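Since :command:`load-import` needs both the ``*.iso`` and ``*.sig`` files, a pre-check that both paths exist avoids a failed import. A sketch only; the paths in the usage example are hypothetical:

```shell
# Sketch only: verify both bootimage files exist before attempting
# the load-import.
check_bootimage_files() {
    # $1: path to the .iso, $2: path to the .sig
    for f in "$1" "$2"; do
        [ -f "$f" ] || { echo "missing: $f" >&2; return 1; }
    done
}
```

For example: ``check_bootimage_files /home/sysadmin/bootimage.iso /home/sysadmin/bootimage.sig`` before running the import.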
@ -109,7 +110,7 @@ For more details, see :ref:`Detailed contents of a system backup
| required_patches | |
+--------------------+-----------+

The :command:`load-import` must be done on **controller-0** and accepts
The :command:`load-import` must be done on controller-0 and accepts
relative paths.

.. note::

@ -130,9 +131,9 @@ For more details, see :ref:`Detailed contents of a system backup

#. Apply any required software updates.

The system must be 'patch current'. All software updates related to your
current |prod| software release must be, uploaded, applied, and installed.
current |prod| software release must be uploaded, applied, and installed.

All software updates to the new |prod| release, only need to be uploaded
All software updates to the new |prod| release only need to be uploaded
and applied. The install of these software updates will occur automatically
during the software upgrade procedure as the hosts are reset to load the
new release of software.
@ -147,7 +148,7 @@ For more details, see :ref:`Detailed contents of a system backup
Check the current system health status, resolve any alarms and other issues
reported by the :command:`system health-query-upgrade` command, then
recheck the system health status to confirm that all **System Health**
fields are set to **OK**.
fields are set to *OK*.

.. code-block:: none

@ -167,13 +168,13 @@ For more details, see :ref:`Detailed contents of a system backup

All kubernetes applications are in a valid state: [OK]
Active controller is controller-0: [OK]

By default, the upgrade process cannot be run and is not recommended to be
run with Active Alarms present. However, management affecting alarms can be
ignored with the :command:`--force` option with the :command:`system
upgrade-start` command to force the upgrade process to start.
By default, the upgrade process cannot be run with Active Alarms present.
However, management affecting alarms can be ignored with the
:command:`--force` option with the :command:`system upgrade-start` command
to force the upgrade process to start.

.. note::
It is strongly recommended that you clear your system of any and all
It is strongly recommended that you clear your system of all
alarms before doing an upgrade. While the :command:`--force` option is
available to run the upgrade, it is a best practice to clear any
alarms.
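The "all health fields are *OK*" verification can be scripted against captured :command:`system health-query-upgrade` output. A sketch only, assuming each health check line ends with a bracketed result such as ``[OK]``:

```shell
# Sketch only: fail if any "check: [result]" line in the captured
# health report shows a result other than [OK].
health_ok() {
    # $1: captured output of "system health-query-upgrade"
    ! printf '%s\n' "$1" | grep ':' | grep -qv '\[OK\]'
}
```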
@ -192,40 +193,41 @@ For more details, see :ref:`Detailed contents of a system backup
| to_release | nn.nn |
+--------------+--------------------------------------+

This will back up the system data and images to /opt/platform-backup.
/opt/platform-backup is preserved when the host is reinstalled. With the
platform backup, the size of /home/sysadmin must be less than 2GB.
This will back up the system data and images to ``/opt/platform-backup``.
``/opt/platform-backup`` is preserved when the host is reinstalled. With the
platform backup, the size of ``/home/sysadmin`` must be less than 2GB.

This process may take several minutes.

When the upgrade state is upgraded to **started** the process is complete.
When the upgrade state is upgraded to *started* the process is complete.

Any changes made to the system after this point will be lost when the data
is restored.

The following upgrade state applies once this command is executed:

- started:
- ``started``:

- State entered after :command:`system upgrade-start` completes.

- Release nn.nn system data \(for example, postgres databases\) has
- Release <nn>.<nn> system data \(for example, postgres databases\) has
been exported to be used in the upgrade.

- Configuration changes must not be made after this point, until the
upgrade is completed.

As part of the upgrade, the upgrade process checks the health of the system
and validates that the system is ready for an upgrade.
The upgrade process checks the health of the system and validates that the
system is ready for an upgrade.

The upgrade process checks that no alarms are active before starting an
upgrade.

.. note::

Use the command :command:`system upgrade-start --force` to force the
upgrades process to start and to ignore management affecting alarms.
This should ONLY be done if you feel these alarms will not be an issue
over the upgrades process.
This should **ONLY** be done if you have ascertained that these alarms
will not interfere with the upgrades process.

#. Check the upgrade state.
@ -241,13 +243,13 @@ For more details, see :ref:`Detailed contents of a system backup
| to_release | nn.nn |
+--------------+--------------------------------------+

Ensure the upgrade state is **started**. It will take several minutes to
transition to the started state.
Ensure the upgrade state is *started*. It will take several minutes to
transition to the *started* state.

#. \(Optional\) Copy the upgrade data from the system to an alternate safe
location \(such as a USB drive or remote server\).

The upgrade data is located under /opt/platform-backup. Example file names
The upgrade data is located under ``/opt/platform-backup``. Example file names
are:

**lost+found upgrade\_data\_2020-06-23T033950\_61e5fcd7-a38d-40b0-ab83-8be55b87fee2.tgz**
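Waiting for the upgrade state to reach *started* can be scripted by extracting the state row from the captured :command:`system upgrade-show` table. A sketch only, assuming the two-column ``|``-delimited table layout shown in this procedure:

```shell
# Sketch only: print the value of the "state" row from captured
# "system upgrade-show" table output.
upgrade_state() {
    # $1: captured table output
    printf '%s\n' "$1" | awk -F'|' '$2 ~ /state/ {gsub(/ /, "", $3); print $3}'
}
```

A caller could poll, for example, until ``upgrade_state "$(system upgrade-show)"`` prints ``started``.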
@ -264,8 +266,8 @@ For more details, see :ref:`Detailed contents of a system backup
#. Upgrade controller-0.

This is the point of no return. All data except /opt/platform-backup/ will
be erased from the system. This will wipe the **rootfs** and reboot the
This is the point of no return. All data except ``/opt/platform-backup/``
will be erased from the system. This will wipe the ``rootfs`` and reboot the
host. The new release must then be manually installed \(via network or
USB\).
@ -279,11 +281,11 @@ For more details, see :ref:`Detailed contents of a system backup
#. Install the new release of |prod-long| Simplex software via network or USB.

#. Verify and configure IP connectivity. External connectivity is required to
run the Ansible upgrade playbook. The |prod-long| boot image will DHCP out all
interfaces so the server may have obtained an IP address and have external IP
connectivity if a DHCP server is present in your environment. Verify this using
the :command:`ip addr` command. Otherwise, manually configure an IP address and default IP
route.
run the Ansible upgrade playbook. The |prod-long| boot image will |DHCP| out
all interfaces so the server may have obtained an IP address and have
external IP connectivity if a |DHCP| server is present in your environment.
Verify this using the :command:`ip addr` command. Otherwise, manually
configure an IP address and default IP route.

#. Restore the upgrade data.
@ -298,14 +300,13 @@ For more details, see :ref:`Detailed contents of a system backup
following parameter:

``ansible_become_pass``

The ansible playbook will check /home/sysadmin/<hostname\>.yml for these
user configuration override files for hosts. For example, if running
ansible locally, /home/sysadmin/localhost.yml.
The ansible playbook will check ``/home/sysadmin/<hostname\>.yml`` for
these user configuration override files for hosts. For example, if
running ansible locally, ``/home/sysadmin/localhost.yml``.

By default the playbook will search for the upgrade data file under
/opt/platform-backup. If required, use the **upgrade\_data\_file**
parameter to specify the path to the **upgrade\_data**.
``/opt/platform-backup``. If required, use the ``upgrade_data_file``
parameter to specify the path to the ``upgrade_data``.

.. note::
This playbook does not support replay.
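Creating the per-host override file with the ``ansible_become_pass`` parameter can be done with a one-line write. A sketch only; the password value is a placeholder, and the path parameter would typically be ``/home/sysadmin/localhost.yml`` when running ansible locally:

```shell
# Sketch only: write an Ansible override file containing the
# ansible_become_pass parameter (password value is a placeholder).
write_override() {
    # $1: override file path, $2: sysadmin password
    printf 'ansible_become_pass: %s\n' "$2" > "$1"
}
```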
@ -314,7 +315,7 @@ For more details, see :ref:`Detailed contents of a system backup
This can take more than one hour to complete.

Once the data restoration is complete the upgrade state will be set to
**upgrading-hosts**.
*upgrading-hosts*.

#. Check the status of the upgrade.
@ -342,9 +343,9 @@ For more details, see :ref:`Detailed contents of a system backup
During the running of the :command:`upgrade-activate` command, new
configurations are applied to the controller. 250.001 \(**hostname
Configuration is out-of-date**\) alarms are raised and are cleared as the
configuration is applied. The upgrade state goes from **activating** to
**activation-complete** once this is done.
Configuration is out-of-date**\) alarms are raised and then cleared as the
configuration is applied. The upgrade state goes from *activating* to
*activation-complete* once this is done.

.. code-block:: none
@ -360,23 +361,25 @@ For more details, see :ref:`Detailed contents of a system backup
The following states apply when this command is executed.

**activation-requested**
``activation-requested``
State entered when :command:`system upgrade-activate` is executed.

**activating**
``activating``
State entered when we have started activating the upgrade by applying
new configurations to the controller and compute hosts.

**activating-hosts**
``activating-hosts``
State entered when applying host-specific configurations. This state is
entered only if needed.

**activation-complete**
``activation-complete``
State entered when new configurations have been applied to all
controller and compute hosts.

#. Check the status of the upgrade again to see it has reached
**activation-complete**.
``activation-complete``.
.. code-block:: none