DC Orchestration for AIO-DX & Standard Subclouds
Distributed Upgrade Orchestration Process Using the CLI
- Modified this topic to include 2 prerequisites

Changing Case on file extensions.
Updated comments for Patchset 3
Fixed merge conflicts
Updated comments for Patchset 4
Fixed merge conflicts

Story: 2008055
Task: 42387
Signed-off-by: Juanita-Balaraj <juanita.balaraj@windriver.com>
Change-Id: Ia2c44812052c4f70f4742923fa847698cc0d6fa6
This commit is contained in:
parent
cebb02eb21
commit
94fd67c34a
3
doc/source/dist_cloud/.vscode/settings.json
vendored
Normal file
@ -0,0 +1,3 @@
|
||||
{
|
||||
"restructuredtext.confPath": ""
|
||||
}
|
@ -0,0 +1,21 @@
|
||||
|
||||
.. hil1593180554641
|
||||
.. _aborting-the-distributed-upgrade-orchestration:
|
||||
|
||||
==============================================
|
||||
Aborting the Distributed Upgrade Orchestration
|
||||
==============================================
|
||||
|
||||
To abort the current upgrade orchestration operation, use the
|
||||
:command:`dcmanager upgrade-strategy abort` command.
|
||||
|
||||
.. note::
|
||||
|
||||
The :command:`dcmanager upgrade-strategy abort` command completes the
|
||||
current upgrading stage before aborting, to prevent hosts from being left
|
||||
in a locked state requiring manual intervention.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
~(keystone_admin)]$ dcmanager upgrade-strategy abort
|
||||
|
158
doc/source/dist_cloud/configuration-for-specific-subclouds.rst
Normal file
@ -0,0 +1,158 @@
|
||||
|
||||
.. jul1593180757282
|
||||
.. _configuration-for-specific-subclouds:
|
||||
|
||||
====================================
|
||||
Configuration for Specific Subclouds
|
||||
====================================
|
||||
|
||||
To determine how upgrades are applied to the nodes on each subcloud, the
|
||||
upgrade strategy refers to separate configuration settings.
|
||||
|
||||
The following settings are applied by default:
|
||||
|
||||
|
||||
.. _configuration-for-specific-subclouds-ul-sgb-p34-gdb:
|
||||
|
||||
- storage apply type: parallel
|
||||
|
||||
- worker apply type: parallel
|
||||
|
||||
- max parallel workers: 10
|
||||
|
||||
- alarm restriction type: relaxed
|
||||
|
||||
- default instance action: migrate \(This parameter is only applicable to
|
||||
hosted application |VMs| with the stx-openstack application.\)
|
||||
|
||||
|
||||
To update the default values, use the :command:`dcmanager strategy-config
|
||||
update` command. You can also use this command to configure custom behavior for
|
||||
individual subclouds.
|
||||
|
||||
- To list the default upgrade strategy and any custom configurations
|
||||
configured for individual subclouds, use the :command:`strategy-config
|
||||
list` command.
|
||||
|
||||
For example:
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
~(keystone_admin)]$ dcmanager strategy-config list
|
||||
+--------------------+--------------------+--------------------+-----------------------+------------------------+------------------+
|
||||
| cloud | storage apply type | worker apply type | max parallel workers | alarm restriction type | default instance |
|
||||
| | | | | | action |
|
||||
+--------------------+--------------------+--------------------+-----------------------+------------------------+------------------+
|
||||
| all clouds default | parallel | parallel | 10 | relaxed | migrate |
|
||||
| subcloud-6 | parallel | parallel | 2 | relaxed | stop-start |
|
||||
+--------------------+--------------------+--------------------+-----------------------+------------------------+------------------+
|
||||
|
||||
|
||||
- To show the configuration settings applicable to all subclouds by default,
|
||||
use the :command:`strategy-config show` command.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
~(keystone_admin)]$ dcmanager strategy-config show
|
||||
+-------------------------+--------------------+
|
||||
| Field | Value |
|
||||
+-------------------------+--------------------+
|
||||
| cloud | all clouds default |
|
||||
| storage apply type | parallel |
|
||||
| worker apply type | parallel |
|
||||
| max parallel workers | 10 |
|
||||
| alarm restriction type | relaxed |
|
||||
| default instance action | migrate |
|
||||
| created_at | None |
|
||||
| updated_at | None |
|
||||
+-------------------------+--------------------+
|
||||
|
||||
|
||||
- To update the settings, or to create a custom configuration for a subcloud,
|
||||
use the :command:`strategy-config update` command.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
~(keystone_admin)]$ dcmanager strategy-config update \
|
||||
--storage-apply-type <type> \
|
||||
--worker-apply-type <type> \
|
||||
--max-parallel-workers <i> \
|
||||
--alarm-restriction-type <level> \
|
||||
--default-instance-action <action> \
|
||||
[<subcloud_name>]
|
||||
|
||||
where
|
||||
|
||||
**storage apply type**
|
||||
parallel or serial — determines whether storage nodes are upgraded in
|
||||
parallel or serially.
|
||||
|
||||
**worker apply type**
|
||||
parallel or serial — determines whether worker nodes are upgraded in
|
||||
parallel or serially.
|
||||
|
||||
**max parallel workers**
|
||||
Set the maximum number of worker nodes that can be upgraded in
|
||||
parallel.
|
||||
|
||||
**alarm restriction type**
|
||||
relaxed or strict — determines whether the orchestration is aborted for
|
||||
alarms that are not management-affecting.
|
||||
|
||||
.. xbooklink :ref:`|updates-doc| <software-updates-and-upgrades-software-updates>` guide.
|
||||
|
||||
**default instance action**
|
||||
.. note::
|
||||
|
||||
This parameter is only applicable to hosted application |VMs| with
|
||||
the stx-openstack application.
|
||||
|
||||
migrate or stop-start — determines whether hosted application |VMs| are
|
||||
migrated or stopped and restarted when a worker host is upgraded.
|
||||
|
||||
**subcloud\_name**
|
||||
The name of the subcloud to use the custom strategy. If this is omitted,
|
||||
the default upgrade strategy is updated.
|
||||
|
||||
.. note::
|
||||
|
||||
You must specify all of the settings.
|
||||
|
||||
- To show the configuration settings for a subcloud, use the
|
||||
:command:`strategy-config show` <subcloud> command.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
~(keystone_admin)]$ dcmanager strategy-config show [<name>]
|
||||
|
||||
|
||||
For example:
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
~(keystone_admin)]$ dcmanager strategy-config show subcloud-6
|
||||
+-------------------------+----------------------------+
|
||||
| Field | Value |
|
||||
+-------------------------+----------------------------+
|
||||
| cloud | subcloud-6 |
|
||||
| storage apply type | parallel |
|
||||
| worker apply type | parallel |
|
||||
| max parallel workers | 2 |
|
||||
| alarm restriction type | relaxed |
|
||||
| default instance action | stop-start |
|
||||
| created_at | 2020-03-12 20:08:48.917866 |
|
||||
| updated_at | None |
|
||||
+-------------------------+----------------------------+
|
||||
|
||||
|
||||
If custom configuration settings have not been created for the subcloud,
|
||||
the following message is displayed:
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
ERROR (app) No options found for Subcloud with id 1, defaults will be
|
||||
used.
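Because every setting must be specified, it can help to assemble the update command from the five values plus the optional subcloud name before running it. The following is a minimal sketch under that assumption. The helper name ``strategy_config_update_cmd`` is hypothetical and is not part of dcmanager. It only prints the command for review, and the sample values are the ones listed for subcloud-6 above.

```shell
# Hypothetical helper: prints (does not run) a strategy-config update command.
# Arguments: storage apply type, worker apply type, max parallel workers,
# alarm restriction type, default instance action, [subcloud name].
strategy_config_update_cmd() {
    printf 'dcmanager strategy-config update'
    printf ' --storage-apply-type %s' "$1"
    printf ' --worker-apply-type %s' "$2"
    printf ' --max-parallel-workers %s' "$3"
    printf ' --alarm-restriction-type %s' "$4"
    printf ' --default-instance-action %s' "$5"
    # The subcloud name is optional; without it the defaults are updated.
    if [ -n "${6:-}" ]; then
        printf ' %s' "$6"
    fi
    printf '\n'
}

# Values taken from the subcloud-6 row in the example listing.
strategy_config_update_cmd parallel parallel 2 relaxed stop-start subcloud-6
```

Omitting the last argument prints a command that would update the default settings applied to all subclouds.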
|
||||
|
||||
|
@ -104,8 +104,8 @@ Deletes subcloud group details from the database.
|
||||
+--+------+----+----+-------+-------+------+-----------+-----------+-------------+-----------+------------+------------+------+----------+----------+
|
||||
|id|name |desc|loc.|sof.ver|mgmnt |avail |deploy_stat|mgmt_subnet|mgmt_start_ip|mgmt_end_ip|mgmt_gtwy_ip|sysctrl_gtwy|grp_id|created_at|updated_at|
|
||||
+--+------+----+----+-------+-------+------+-----------+-----------+-------------+-----------+------------+------------+------+----------+----------+
|
||||
|3 |subcl1|None|None|20.06 |managed|online|complete |fd01:12::0.|fd01:12::2 |fd01:12::11|fd01:12::1 |fd01:11::1 | 2 |2021-01-09|2021-01-12|
|
||||
|4 |subcl2|None|None|20.06 |managed|online|complete |fd01:13::0.|fd01:13::2 |fd01:13::11|fd01:13::1 |fd01:11::1 | 2 |2021-01-09|2021-01-12|
|
||||
|3 |subcl1|None|None|nn.nn |managed|online|complete |fd01:12::0.|fd01:12::2 |fd01:12::11|fd01:12::1 |fd01:11::1 | 2 |2021-01-09|2021-01-12|
|
||||
|4 |subcl2|None|None|nn.nn |managed|online|complete |fd01:13::0.|fd01:13::2 |fd01:13::11|fd01:13::1 |fd01:11::1 | 2 |2021-01-09|2021-01-12|
|
||||
+--+------+----+----+-------+-------+------+-----------+-----------+-------------+-----------+------------+------------+------+----------+----------+
|
||||
|
||||
- To show the details of a subcloud group, use the following command:
|
||||
|
@ -0,0 +1,335 @@
|
||||
|
||||
.. pek1594745988225
|
||||
.. _distributed-upgrade-orchestration-process-using-the-cli:
|
||||
|
||||
=======================================================
|
||||
Distributed Upgrade Orchestration Process Using the CLI
|
||||
=======================================================
|
||||
|
||||
Distributed upgrade orchestration can be initiated after the SystemController
cloud has been upgraded and is stable. Upgrade orchestration automatically
|
||||
iterates through each of the subclouds, installing the new software load on
|
||||
each one.
|
||||
|
||||
.. rubric:: |context|
|
||||
|
||||
The user first creates a distributed upgrade orchestration strategy, or plan,
|
||||
for the automated upgrade procedure. This customizes the upgrade orchestration,
|
||||
using parameters to specify:
|
||||
|
||||
|
||||
.. _distributed-upgrade-orchestration-process-using-the-cli-ul-eyw-fyr-31b:
|
||||
|
||||
- whether to stop on failure of a subcloud upgrade or continue with the next
|
||||
subcloud
|
||||
|
||||
- whether to upgrade hosts serially or in parallel
|
||||
|
||||
|
||||
Based on these parameters, and the state of the subclouds, distributed upgrade
|
||||
orchestration creates a number of stages for the overall upgrade strategy. All
|
||||
the subclouds that are included in the same stage will be upgraded in parallel.
|
||||
|
||||
.. rubric:: |prereq|
|
||||
|
||||
Distributed upgrade orchestration can only be done on a system that meets the
|
||||
following conditions:
|
||||
|
||||
.. _distributed-upgrade-orchestration-process-using-the-cli-ul-blp-gcx-ry:
|
||||
|
||||
- A subcloud must use the Redfish platform management service if it is
an |AIO-SX| subcloud.
|
||||
|
||||
- Duplex \(|AIO-DX|/Standard\) upgrades are supported, and they do not
|
||||
require remote install using Redfish.
|
||||
|
||||
- Redfish |BMC| is required for orchestrated subcloud upgrades. The install
values and :command:`bmc\_password` for each |AIO-SX| subcloud controller
|
||||
must be provided using the following |CLI| command on the SystemController:
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
~(keystone_admin)]$ dcmanager subcloud update subcloud1 --install-values \
|
||||
install-values.yaml --bmc-password <password>
|
||||
|
||||
For more information on the :command:`install-values.yaml` file, see
|
||||
:ref:`Installing a Subcloud Using Redfish Platform Management Service
|
||||
<installing-a-subcloud-using-redfish-platform-management-service>`.
|
||||
|
||||
- All subclouds are clear of alarms \(with the exception of the
upgrade-in-progress alarm\).
|
||||
|
||||
- All hosts of all subclouds must be unlocked, enabled, and available.
|
||||
|
||||
- No distributed update orchestration strategy exists. To verify, use the
command :command:`dcmanager upgrade-strategy show`. An upgrade cannot be
orchestrated while update orchestration is in progress.
|
||||
|
||||
- Verify the size and format of the platform-backup filesystem on each
|
||||
subcloud. From the shell on each subcloud, use the following command to view
|
||||
the details of the file system:
|
||||
|
||||
:command:`df -Th /opt/platform-backup`
|
||||
|
||||
The type must be ext4 and the size must be 9.5GB. For example, on
|
||||
controller-0, run the following command:
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
~(keystone_admin)]$ df -Th /opt/platform-backup/
Filesystem     Type  Size  Used Avail Use% Mounted on
/dev/sda2      ext4  9.5G   51M  9.0G   1% /opt/platform-backup
|
||||
|
||||
- **If a previous upgrade has been done on the subcloud**, from the shell on
|
||||
each subcloud, use the following command to remove the previous upgrade
|
||||
data:
|
||||
|
||||
:command:`sudo rm /opt/platform-backup/upgrade\_data\*`
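The platform-backup prerequisite above can be checked with a small filter over the :command:`df -Th` output. This is only a sketch. The function name ``check_platform_backup`` is hypothetical, and it assumes that the data row is the last line of output and that the columns appear in the order shown in the example (Filesystem, Type, Size, and so on).

```shell
# Hypothetical check: reads `df -Th /opt/platform-backup` output on stdin
# and succeeds only if the last row reports type ext4 and size 9.5G.
check_platform_backup() {
    awk '{ type = $2; size = $3 }
         END { exit ((type == "ext4" && size == "9.5G") ? 0 : 1) }'
}
```

On a subcloud controller it could be used as ``df -Th /opt/platform-backup | check_platform_backup && echo "platform-backup OK"``.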
|
||||
|
||||
.. rubric:: |proc|
|
||||
|
||||
.. _distributed-upgrade-orchestration-process-using-the-cli-steps-vcm-pq4-3mb:
|
||||
|
||||
#. Review the upgrade status for the subclouds.
|
||||
|
||||
After the SystemController upgrade is completed, wait for 10 minutes for
|
||||
the **load\_sync\_status** of all subclouds to be updated.
|
||||
|
||||
To identify which subclouds are upgrade-current \(in-sync\), use the
|
||||
:command:`subcloud list` command. For example:
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
~(keystone_admin)]$ dcmanager subcloud list
|
||||
+----+-----------+--------------+--------------------+-------------+
|
||||
| id | name | management | availability | sync |
|
||||
+----+-----------+--------------+--------------------+-------------+
|
||||
| 1 | subcloud1 | managed | online | out-of-sync |
|
||||
| 2 | subcloud2 | managed | online | out-of-sync |
|
||||
| 3 | subcloud3 | managed | online | out-of-sync |
|
||||
| 4 | subcloud4 | managed | online | out-of-sync |
|
||||
+----+-----------+--------------+--------------------+-------------+
|
||||
|
||||
.. note::
|
||||
The sync status is the rolled up sync status of platform, patching,
|
||||
identity, etc.
|
||||
|
||||
To see synchronization details for a subcloud, use the following command:
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
~(keystone_admin)]$ dcmanager subcloud show subcloud1
|
||||
+-----------------------------+----------------------------+
|
||||
| Field | Value |
|
||||
+-----------------------------+----------------------------+
|
||||
| id | 1 |
|
||||
| name | subcloud1 |
|
||||
| description | None |
|
||||
| location | None |
|
||||
| software_version | nn.nn |
|
||||
| management | managed |
|
||||
| availability | online |
|
||||
| deploy_status | complete |
|
||||
| management_subnet | fd01:82::0/64 |
|
||||
| management_start_ip | fd01:82::2 |
|
||||
| management_end_ip | fd01:82::11 |
|
||||
| management_gateway_ip | fd01:82::1 |
|
||||
| systemcontroller_gateway_ip | fd01:81::1 |
|
||||
| group_id | 1 |
|
||||
| created_at | 2020-07-15 19:23:50.966984 |
|
||||
| updated_at | 2020-07-17 12:36:28.815655 |
|
||||
| dc-cert_sync_status | in-sync |
|
||||
| identity_sync_status | in-sync |
|
||||
| load_sync_status | in-sync |
|
||||
| patching_sync_status | in-sync |
|
||||
| platform_sync_status | in-sync |
|
||||
+-----------------------------+----------------------------+
|
||||
|
||||
#. To create an upgrade strategy, use the :command:`dcmanager upgrade-strategy create`
|
||||
command.
|
||||
|
||||
The upgrade strategy for a |prod-dc| system controls how upgrades are
|
||||
applied to subclouds.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
~(keystone_admin)]$ dcmanager upgrade-strategy create \
|
||||
[--subcloud-apply-type <type>] \
|
||||
[--max-parallel-subclouds <i>] \
[--stop-on-failure <level>] \
|
||||
[--group group] \
|
||||
[<subcloud>]
|
||||
|
||||
where:
|
||||
|
||||
**subcloud-apply-type**
|
||||
**parallel** or **serial**— determines whether the subclouds are
|
||||
upgraded in parallel, or serially.
|
||||
|
||||
If this is not specified using the CLI, the values for
|
||||
:command:`subcloud\_update\_type` defined for each subcloud group will
|
||||
be used by default.
|
||||
|
||||
**max-parallel-subclouds**
|
||||
Sets the maximum number of subclouds that can be upgraded in parallel
|
||||
\(default 20\).
|
||||
|
||||
If this is not specified using the CLI, the values for
|
||||
:command:`max\_parallel\_subclouds` defined for each subcloud group
|
||||
will be used by default.
|
||||
|
||||
**stop-on-failure**
|
||||
**true**\(default\) or **false**— determines whether upgrade
|
||||
orchestration failure for a subcloud prevents application to subsequent
|
||||
subclouds.
|
||||
|
||||
**group**
|
||||
Optionally pass the name or ID of a subcloud group to the
|
||||
:command:`dcmanager upgrade-strategy create` command. This results in a
|
||||
strategy that is only applied to all subclouds in the specified group.
|
||||
The subcloud group values are used for subcloud apply type and max
|
||||
parallel subclouds parameters.
|
||||
|
||||
For example:
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
~(keystone_admin)]$ dcmanager upgrade-strategy create
|
||||
+------------------------+----------------------------+
|
||||
| Field | Value |
|
||||
+------------------------+----------------------------+
|
||||
| strategy type | upgrade |
|
||||
| subcloud apply type | parallel |
|
||||
| max parallel subclouds | 10 |
|
||||
| stop on failure | False |
|
||||
| state | initial |
|
||||
| created_at | 2020-06-10T17:16:51.857207 |
|
||||
| updated_at | None |
|
||||
+------------------------+----------------------------+
|
||||
|
||||
#. To show the settings for the upgrade strategy, use the
|
||||
:command:`dcmanager upgrade-strategy show` command.
|
||||
|
||||
For example:
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
~(keystone_admin)]$ dcmanager upgrade-strategy show
|
||||
+------------------------+----------------------------+
|
||||
| Field | Value |
|
||||
+------------------------+----------------------------+
|
||||
| subcloud apply type | parallel |
|
||||
| max parallel subclouds | 20 |
|
||||
| stop on failure | False |
|
||||
| state | initial |
|
||||
| created_at | 2020-02-02T14:42:13.822499 |
|
||||
| updated_at | None |
|
||||
+------------------------+----------------------------+
|
||||
|
||||
.. note::
|
||||
A value of **None** for :command:`subcloud apply type` and
:command:`max parallel subclouds` indicates that subcloud group values
|
||||
are being used.
|
||||
|
||||
#. Review the upgrade strategy for the subclouds.
|
||||
|
||||
To show the subclouds that will be upgraded when the upgrade strategy is
|
||||
applied, use the :command:`dcmanager strategy-step list` command. For
|
||||
example:
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
~(keystone_admin)]$ dcmanager strategy-step list
|
||||
+------------------+-------+---------+---------+------------+-------------+
|
||||
| cloud | stage | state | details | started_at | finished_at |
|
||||
+------------------+-------+---------+---------+------------+-------------+
|
||||
| subcloud-1 | 1 | initial | | None | None |
|
||||
| subcloud-4 | 1 | initial | | None | None |
|
||||
| subcloud-5 | 2 | initial | | None | None |
|
||||
| subcloud-6 | 2 | initial | | None | None |
|
||||
+------------------+-------+---------+---------+------------+-------------+
|
||||
|
||||
.. note::
|
||||
All the subclouds that are included in the same stage will be upgraded
|
||||
in parallel.
|
||||
|
||||
#. To apply the upgrade strategy, use the :command:`dcmanager upgrade-strategy apply`
|
||||
command.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
~(keystone_admin)]$ dcmanager upgrade-strategy apply
|
||||
+------------------------+----------------------------+
|
||||
| Field | Value |
|
||||
+------------------------+----------------------------+
|
||||
| subcloud apply type | parallel |
|
||||
| max parallel subclouds | 20 |
|
||||
| stop on failure | False |
|
||||
| state | applying |
|
||||
| created_at | 2020-02-02T14:42:13.822499 |
|
||||
| updated_at | 2020-02-02T14:42:19.376688 |
|
||||
+------------------------+----------------------------+
|
||||
|
||||
#. To show the step currently being performed on each of the subclouds, use
|
||||
the :command:`dcmanager strategy-step list` command.
|
||||
|
||||
For example:
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
~(keystone_admin)]$ dcmanager strategy-step list
|
||||
+------------------+-------+-------------+-----------------------------+----------------------------+----------------------------+
|
||||
| cloud | stage | state | details | started_at | finished_at |
|
||||
+------------------+-------+-------------+-----------------------------+----------------------------+----------------------------+
|
||||
| subcloud-1 | 2 | applying... | apply phase is 66% complete | 2020-03-13 14:12:12.262001 | 2020-03-13 14:15:52.450908 |
|
||||
| subcloud-4 | 2 | applying... | apply phase is 83% complete | 2020-03-13 14:16:02.457588 | None |
|
||||
| subcloud-5 | 2 | finishing | | 2020-03-13 14:16:02.463213 | None |
|
||||
| subcloud-6 | 2 | applying... | apply phase is 66% complete | 2020-03-13 14:16:02.473669 | None |
|
||||
+------------------+-------+-------------+-----------------------------+----------------------------+----------------------------+
|
||||
|
||||
#. To show the step currently being performed on a subcloud, use the
|
||||
:command:`dcmanager strategy-step show` <subcloud> command.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
~(keystone_admin)]$ dcmanager strategy-step show <subcloud>
|
||||
|
||||
#. When the distributed upgrade orchestration is complete, delete the upgrade
|
||||
strategy, using the :command:`dcmanager upgrade-strategy delete` command.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
~(keystone_admin)]$ dcmanager upgrade-strategy delete
|
||||
+------------------------+----------------------------+
|
||||
| Field | Value |
|
||||
+------------------------+----------------------------+
|
||||
| subcloud apply type | parallel |
|
||||
| max parallel subclouds | 20 |
|
||||
| stop on failure | False |
|
||||
| state | deleting |
|
||||
| created_at | 2020-03-23T20:04:50.992444 |
|
||||
| updated_at | 2020-03-23T20:05:14.157352 |
|
||||
+------------------------+----------------------------+
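The first step of the procedure reads the sync column of :command:`dcmanager subcloud list`. As a rough sketch, assuming the five-column table layout shown in that step (id, name, management, availability, sync), the following hypothetical helper prints the names of the subclouds that are still out-of-sync. The function name ``out_of_sync_subclouds`` is not part of dcmanager.

```shell
# Hypothetical filter: reads `dcmanager subcloud list` table output on stdin
# and prints the name of every subcloud whose sync column is out-of-sync.
out_of_sync_subclouds() {
    awk -F'|' 'NF >= 6 {
        name = $3; sync = $6
        gsub(/ /, "", name)   # strip the padding around the cell values
        gsub(/ /, "", sync)
        if (sync == "out-of-sync") print name
    }'
}
```

For example, ``dcmanager subcloud list | out_of_sync_subclouds`` lists the subclouds that still need the new load. Border and header rows are skipped because they either contain no ``|`` separators or do not carry the out-of-sync value.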
|
||||
|
||||
.. rubric:: |postreq|
|
||||
|
||||
.. _distributed-upgrade-orchestration-process-using-the-cli-ul-lx1-zcv-3mb:
|
||||
|
||||
- Check and update docker registry credentials for **ALL** subclouds. For
|
||||
each subcloud:
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
REGISTRY="docker-registry"
# Look up the UUID of the auth-secret service parameter for the registry.
SECRET_UUID=$(system service-parameter-list | fgrep $REGISTRY | fgrep auth-secret | awk '{print $10}')
# Resolve the secret reference, then print its payload.
SECRET_REF=$(openstack secret list | fgrep ${SECRET_UUID} | awk '{print $2}')
openstack secret get ${SECRET_REF} --payload -f value
|
||||
|
||||
The secret payload should be "username: sysinv password:<password>". If
the secret payload is "username: admin password:<password>", see
|
||||
:ref:`Updating Docker Registry Credentials on a Subcloud
|
||||
<updating-docker-registry-credentials-on-a-subcloud>` for more information.
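The payload comparison above can be automated with a one-line filter. This is a sketch only. The function name ``registry_secret_uses_sysinv`` is hypothetical, and it assumes the payload is printed on a single line in the "username: sysinv password:<password>" form described above.

```shell
# Hypothetical check: reads the secret payload on stdin and succeeds
# only when the payload names the sysinv user.
registry_secret_uses_sysinv() {
    grep -q '^username: sysinv '
}
```

It could be used as ``openstack secret get ${SECRET_REF} --payload -f value | registry_secret_uses_sysinv || echo "registry credentials need updating"``.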
|
||||
|
||||
.. only:: partner
|
||||
|
||||
.. include:: ../_includes/distributed-upgrade-orchestration-process-using-the-cli.rest
|
@ -0,0 +1,140 @@
|
||||
|
||||
.. oeo1597292999568
|
||||
.. _failure-during-the-installation-or-data-migration-of-n+1-load-on-a-subcloud:
|
||||
|
||||
===========================================================================
|
||||
Failure During the Installation or Data Migration of N+1 Load on a Subcloud
|
||||
===========================================================================
|
||||
|
||||
You may encounter some errors during installation or data migration of the
|
||||
**N+1** load on a subcloud. This section explains the errors and the steps
|
||||
required to fix these errors.
|
||||
|
||||
.. contents:: |minitoc|
|
||||
:local:
|
||||
:depth: 1
|
||||
|
||||
Errors can occur due to one of the following:
|
||||
|
||||
|
||||
.. _failure-during-the-installation-or-data-migration-of-n+1-load-on-a-subcloud-ul-j5r-czs-qmb:
|
||||
|
||||
- One or more invalid install values
|
||||
|
||||
- A network error that results in the subcloud being temporarily unreachable
|
||||
|
||||
- An invalid docker registry certificate
|
||||
|
||||
|
||||
**Failure Caused by Install Values**
|
||||
|
||||
If the subcloud install values contain an incorrect value, use the following
|
||||
command to fix it.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
~(keystone_admin)]$ dcmanager subcloud update <<subcloud-name>> --install-values <<subcloud-install-values-yaml>>
|
||||
|
||||
This type of failure is recoverable and you can rerun the upgrade strategy for
|
||||
the failed subcloud\(s\) using the following procedure:
|
||||
|
||||
.. rubric:: |proc|
|
||||
|
||||
.. _failure-during-the-installation-or-data-migration-of-n+1-load-on-a-subcloud-ol-lc1-cyr-qmb:
|
||||
|
||||
#. Delete the failed upgrade strategy.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
~(keystone_admin)]$ dcmanager upgrade-strategy delete
|
||||
|
||||
#. Create a new upgrade strategy for the failed subcloud.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
~(keystone_admin)]$ dcmanager upgrade-strategy create <<subcloud-name>> --force <<additional options>>
|
||||
|
||||
.. note::
|
||||
|
||||
If the upgrade failed during the |AIO-SX| upgrade or data migration, the
|
||||
subcloud availability status is displayed as 'offline'. Use the
|
||||
:command:`--force` option when creating the new strategy.
|
||||
|
||||
#. Apply the new upgrade strategy.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
~(keystone_admin)]$ dcmanager upgrade-strategy apply
|
||||
|
||||
#. Verify the upgrade strategy status.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
~(keystone_admin)]$ dcmanager strategy-step list
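The four recovery steps above can be chained in a small wrapper. This is a sketch under stated assumptions, not a supported tool. The function name ``rerun_upgrade_for_subcloud`` and the ``DCMANAGER`` override variable are hypothetical, and a real script may need to wait between steps, since the strategy passes through a deleting state before it is removed.

```shell
# Hypothetical wrapper around the recovery procedure for one failed subcloud.
# Set DCMANAGER=echo to print the commands instead of running them.
rerun_upgrade_for_subcloud() {
    dcm="${DCMANAGER:-dcmanager}"
    # Each step runs only if the previous one succeeded.
    "$dcm" upgrade-strategy delete &&
    "$dcm" upgrade-strategy create "$1" --force &&
    "$dcm" upgrade-strategy apply &&
    "$dcm" strategy-step list
}
```

Running ``DCMANAGER=echo rerun_upgrade_for_subcloud subcloud1`` prints the four commands in order without contacting the SystemController, which is a convenient way to review them first.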
|
||||
|
||||
.. _failure-during-the-installation-or-data-migration-of-n+1-load-on-a-subcloud-section-f5f-j1y-qmb:
|
||||
|
||||
-----------------------------------------------------
|
||||
Failure Caused by Invalid Docker Registry Certificate
|
||||
-----------------------------------------------------
|
||||
|
||||
If the docker registry certificate on the subcloud is invalid/expired prior to
|
||||
an upgrade, the upgrade will fail during data migration.
|
||||
|
||||
.. warning::
|
||||
|
||||
This type of failure cannot be recovered. You will need to re-deploy the
|
||||
subcloud, redo all configuration changes, and regenerate the data.
|
||||
|
||||
.. note::
|
||||
|
||||
Ensure that the docker registry certificate on all subclouds is upgraded
prior to performing an orchestrated upgrade.
|
||||
|
||||
To re-deploy the subcloud, use the following procedure:
|
||||
|
||||
.. rubric:: |proc|
|
||||
|
||||
.. _failure-during-the-installation-or-data-migration-of-n+1-load-on-a-subcloud-ol-dpp-bzr-qmb:
|
||||
|
||||
#. Unmanage the failed subcloud.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
~(keystone_admin)]$ dcmanager subcloud unmanage <<subcloud-name>>
|
||||
|
||||
#. Delete the subcloud.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
~(keystone_admin)]$ dcmanager subcloud delete <<subcloud-name>>
|
||||
|
||||
#. Re-deploy the failed subcloud.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
~(keystone_admin)]$ dcmanager subcloud add <<parameters>>
|
||||
|
||||
|
||||
.. _failure-during-the-installation-or-data-migration-of-n+1-load-on-a-subcloud-section-lj4-1rr-qmb:
|
||||
|
||||
-----------------------------------------
|
||||
Failure Post Data Migration on a Subcloud
|
||||
-----------------------------------------
|
||||
|
||||
Once the data migration on the subcloud is completed, the upgrade is activated
|
||||
and finalized. If a failure occurs:
|
||||
|
||||
|
||||
.. rubric:: |proc|
|
||||
|
||||
.. _failure-during-the-installation-or-data-migration-of-n+1-load-on-a-subcloud-ul-ogc-cp5-qmb:
|
||||
|
||||
- Check specified log files
|
||||
|
||||
- Follow the recovery procedure. See :ref:`Failure Prior to the Installation
|
||||
of N+1 Load on a Subcloud <failure-prior-to-the-installation-of-n+1-load-on-a-subcloud>`
|
||||
|
||||
.. only:: partner
|
||||
|
||||
.. include:: ../_includes/distributed-upgrade-orchestration-process-using-the-cli.rest
|
@ -0,0 +1,61 @@
|
||||
|
||||
.. uvp1597292940831
|
||||
.. _failure-prior-to-the-installation-of-n+1-load-on-a-subcloud:
|
||||
|
||||
===========================================================
|
||||
Failure Prior to the Installation of N+1 Load on a Subcloud
|
||||
===========================================================
|
||||
|
||||
You may encounter some errors prior to installation of the **N+1** load on a
|
||||
subcloud. This section explains the errors and the steps required to fix these
|
||||
errors.
|
||||
|
||||
Errors can occur due to any one of the following:
|
||||
|
||||
|
||||
.. _failure-prior-to-the-installation-of-n+1-load-on-a-subcloud-ul-onf-2vs-qmb:
|
||||
|
||||
- Insufficient disk space on scratch filesystems
|
||||
|
||||
- Missing subcloud install values
|
||||
|
||||
- Invalid license
|
||||
|
||||
- Invalid/corrupted load file
|
||||
|
||||
- The /home/sysadmin directory on the subcloud is too large
|
||||
|
||||
|
||||
If you encounter any of the above errors, use the following procedure to fix
|
||||
it:
|
||||
|
||||
.. rubric:: |proc|
|
||||
|
||||
#. Delete the failed upgrade strategy.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
~(keystone_admin)]$ dcmanager upgrade-strategy delete
|
||||
|
||||
#. Create a new upgrade strategy.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
~(keystone_admin)]$ dcmanager upgrade-strategy create <<additional options>>
|
||||
|
||||
.. note::
|
||||
|
||||
If only one subcloud fails the upgrade, specify the name of the
|
||||
subcloud in the command.
|
||||
|
||||
#. Apply the new upgrade strategy.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
~(keystone_admin)]$ dcmanager upgrade-strategy apply
|
||||
|
||||
#. Verify the upgrade strategy status.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
~(keystone_admin)]$ dcmanager strategy-step list
|
@ -60,6 +60,30 @@ Kubernetes Version Upgrade Distributed Cloud Orchestration
|
||||
the-kubernetes-distributed-cloud-update-orchestration-process
|
||||
configuring-kubernetes-update-orchestration-on-distributed-cloud
|
||||
|
||||
------------------
|
||||
Upgrade management
|
||||
------------------
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 1
|
||||
|
||||
upgrade-management-overview
|
||||
upgrading-the-systemcontroller-using-the-cli
|
||||
|
||||
*******************************************************************
|
||||
Upgrade Orchestration for Distributed Cloud Subclouds using the CLI
|
||||
*******************************************************************
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 1
|
||||
|
||||
distributed-upgrade-orchestration-process-using-the-cli
|
||||
aborting-the-distributed-upgrade-orchestration
|
||||
configuration-for-specific-subclouds
|
||||
robust-error-handling-during-an-orchestrated-upgrade
|
||||
failure-prior-to-the-installation-of-n+1-load-on-a-subcloud
|
||||
failure-during-the-installation-or-data-migration-of-n+1-load-on-a-subcloud
|
||||
|
||||
--------
|
||||
Appendix
|
||||
--------
|
||||
|
@ -109,7 +109,7 @@ subcloud, the subcloud installation has these phases:
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
# Specify the WRCP software version, for example '20.06' for the WRCP 20.06 release of software.
|
||||
# Specify the |pp| software version, for example 'nn.nn' for the |pp| nn.nn release of software.
|
||||
software_version: <software_version>
|
||||
bootstrap_interface: <bootstrap_interface_name> # e.g. eno1
|
||||
bootstrap_address: <bootstrap_interface_ip_address> # e.g.128.224.151.183
|
||||
|
@ -13,7 +13,7 @@ system.
|
||||
|
||||
The Central Cloud supports either
|
||||
|
||||
- an |AIO|-Duplex deployment configuration
|
||||
- an |AIO-DX| deployment configuration
|
||||
|
||||
- a Standard with Controller
|
||||
Storage and one or more workers deployment configuration, or
|
||||
|
@ -0,0 +1,41 @@
|
||||
|
||||
.. ziu1597089603252
|
||||
.. _robust-error-handling-during-an-orchestrated-upgrade:
|
||||
|
||||
====================================================
|
||||
Robust Error Handling During An Orchestrated Upgrade
|
||||
====================================================
|
||||
|
||||
This section describes the errors you may encounter during an orchestrated
|
||||
upgrade and the steps you can use to troubleshoot the errors.
|
||||
|
||||
.. rubric:: |prereq|
|
||||
|
||||
For a successful orchestrated upgrade, ensure the upgrade prerequisites,
|
||||
procedure, and postrequisites are met.
|
||||
|
||||
If a failure occurs, use the following general steps:
|
||||
|
||||
|
||||
.. _robust-error-handling-during-an-orchestrated-upgrade-ol-l5y-mby-qmb:
|
||||
|
||||
#. Allow the failed strategy to complete on its own.
|
||||
|
||||
#. Check the output of the :command:`dcmanager strategy-step list` command
|
||||
for any failures.
|
||||
|
||||
#. Address the cause of the failure. For more information, see :ref:`Failure
|
||||
During the Installation or Data Migration of N+1 Load on a Subcloud
|
||||
<failure-during-the-installation-or-data-migration-of-n+1-load-on-a-subcloud>`.
|
||||
|
||||
#. Rerun the orchestrated upgrade. For more information, see :ref:`Distributed
|
||||
Upgrade Orchestration Process Using the CLI
|
||||
<distributed-upgrade-orchestration-process-using-the-cli>`.
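The failure check in the steps above can be scripted. The following is a
minimal Python sketch, not part of the product: it assumes the output of
:command:`dcmanager strategy-step list` has been captured as text and is
rendered as an ASCII table. The sample table, the column names ``subcloud``
and ``state``, and the state value ``failed`` are illustrative assumptions;
adjust them to match the output on your system.

```python
def parse_ascii_table(text):
    """Parse a '+---+' / '| a | b |' style table into a list of dicts."""
    rows = []
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip borders, prompts, and blank lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # the first table row holds the column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def failed_steps(text, state_column="state", failed_value="failed"):
    """Return the rows whose state column equals the failed value."""
    return [row for row in parse_ascii_table(text)
            if row.get(state_column) == failed_value]


# Hypothetical sample output; real column names and values may differ.
SAMPLE = """
+-----------+-------+----------+
| subcloud  | stage | state    |
+-----------+-------+----------+
| subcloud1 | 1     | complete |
| subcloud2 | 1     | failed   |
+-----------+-------+----------+
"""

for row in failed_steps(SAMPLE):
    print(row["subcloud"], row["state"])
```

Keeping the column name and failure value as parameters makes the helper
easy to adapt if the actual table uses different labels.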
|
||||
|
||||
.. seealso::
|
||||
|
||||
:ref:`Failure Prior to the Installation of N+1 Load on a Subcloud
|
||||
<failure-prior-to-the-installation-of-n+1-load-on-a-subcloud>`
|
||||
|
||||
:ref:`Failure During the Installation or Data Migration of N+1 Load on a
|
||||
Subcloud <failure-during-the-installation-or-data-migration-of-n+1-load-on-a-subcloud>`
|
120
doc/source/dist_cloud/upgrade-management-overview.rst
Normal file
@ -0,0 +1,120 @@
|
||||
|
||||
.. gjf1592841770001
|
||||
.. _upgrade-management-overview:
|
||||
|
||||
===========================
|
||||
Upgrade Management Overview
|
||||
===========================
|
||||
|
||||
You can upgrade the |prod-dc| SystemController and subclouds with a new
|
||||
release of |prod| software.
|
||||
|
||||
.. rubric:: |context|
|
||||
|
||||
.. note::
|
||||
|
||||
Back up all YAML files that are updated using the Redfish Platform
|
||||
Management service. For more information, see :ref:`Installing a Subcloud
|
||||
Using Redfish Platform Management Service
|
||||
<installing-a-subcloud-using-redfish-platform-management-service>`.
|
||||
|
||||
You can use the |CLI| to manage upgrades. The workflow for upgrades is as
|
||||
follows:
|
||||
|
||||
|
||||
.. _upgrade-management-overview-ol-uqv-p24-3mb:
|
||||
|
||||
#. To upgrade the |prod-dc| system, you must first upgrade the
|
||||
SystemController. See :ref:`Upgrading the SystemController Using the CLI
|
||||
<upgrading-the-systemcontroller-using-the-cli>`.
|
||||
|
||||
#. Use |prod-dc| Upgrade Orchestration to upgrade the subclouds. See
|
||||
:ref:`Distributed Upgrade Orchestration Process Using the CLI <distributed-upgrade-orchestration-process-using-the-cli>`.
|
||||
|
||||
#. To handle errors during an orchestrated upgrade, see :ref:`Robust Error
|
||||
Handling During An Orchestrated Upgrade
|
||||
<robust-error-handling-during-an-orchestrated-upgrade>`.
|
||||
|
||||
.. rubric:: |prereq|
|
||||
|
||||
The following prerequisites apply to a |prod-dc| upgrade management service.
|
||||
|
||||
.. _upgrade-management-overview-ul-smx-y2m-cmb:
|
||||
|
||||
- **Configuration Verification**: Ensure that the following configurations
|
||||
are verified before you proceed with the upgrade on the |prod-dc|
|
||||
and subclouds:
|
||||
|
||||
|
||||
- Run the :command:`system application-list` command to ensure that all
|
||||
applications are running
|
||||
|
||||
- Run the :command:`system host-list` command to list the configured
|
||||
hosts
|
||||
|
||||
- Run the :command:`dcmanager subcloud list` command to list the
|
||||
subclouds
|
||||
|
||||
- Run the :command:`kubectl get pods --all-namespaces` command to test
|
||||
that the authentication token validates correctly
|
||||
|
||||
- Run the :command:`fm alarm-list` command to check the system health to
|
||||
ensure that there are no unexpected alarms
|
||||
|
||||
- Run the :command:`kubectl get host -n deployment` command to ensure all
|
||||
nodes in the cluster are reconciled, with the reconciled status set to 'true'
|
||||
|
||||
- Ensure **controller-0** is the active controller
|
||||
|
||||
- All subclouds must be |AIO-DX| and must use the Redfish
|
||||
platform management service.
|
||||
|
||||
- **Remove Non GA Applications**:
|
||||
|
||||
|
||||
- Use the following commands to remove the analytics application on the
|
||||
subclouds:
|
||||
|
||||
- :command:`system application-remove wra-analytics`
|
||||
|
||||
- :command:`system application-delete wra-analytics`
|
||||
|
||||
|
||||
- Remove any non-GA applications such as Wind River Analytics, and
|
||||
|prefix|-openstack, from the |prod-dc| system, if they exist.
|
||||
|
||||
- **Increase Scratch File System Size**:
|
||||
|
||||
- Check the size of the scratch partition on both the SystemController and
|
||||
subclouds using the :command:`system host-fs-list` command.
|
||||
|
||||
.. note::
|
||||
Increase in scratch filesystem size is also required on each
|
||||
subcloud.
|
||||
|
||||
- All controller nodes and subclouds should have a minimum of 16G scratch
|
||||
file system. The process of importing a new load for upgrade will
|
||||
temporarily use up to 11G of scratch disk space. Use the :command:`system
|
||||
host-fs-modify` command to increase scratch size on **each controller
|
||||
node** and subcloud controllers as needed in preparation for software
|
||||
upgrade. For example, run the following commands:
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
~(keystone_admin)]$ system host-fs-modify controller-0 scratch=16
|
||||
|
||||
Run the :command:`fm alarm-list` command to check the system health to
|
||||
ensure that there are no unexpected alarms.
|
||||
|
||||
- For orchestrated subcloud upgrades, the install-values for each subcloud
|
||||
that were used for deployment must be saved and restored to the SystemController
|
||||
after the SystemController upgrade.
|
||||
|
||||
- Run the :command:`kubectl -n kube-system get secret` command on the
|
||||
SystemController before upgrading subclouds, as the docker **rvmc** image on
|
||||
orchestrated subcloud upgrade tries to copy the :command:`kube-system
|
||||
default-registry-key`.
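The scratch file system prerequisite above lends itself to a small check.
The following Python sketch is illustrative only: it assumes the output of
:command:`system host-fs-list` has been captured as text in an ASCII table
whose columns are named ``FS Name`` and ``Size in GiB`` (both names are
assumptions, as is the sample table). It returns the
:command:`system host-fs-modify` command to run when the scratch file system
is smaller than the 16G minimum stated above.

```python
MIN_SCRATCH_GIB = 16  # minimum stated in the prerequisites above


def scratch_size_gib(table_text, name_col="FS Name", size_col="Size in GiB"):
    """Return the scratch size found in an ASCII table of file systems."""
    header = None
    for line in table_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip borders and blank lines
        cells = [c.strip() for c in line.strip("|").split("|")]
        if header is None:
            header = cells
            continue
        row = dict(zip(header, cells))
        if row.get(name_col) == "scratch":
            return int(row[size_col])
    raise ValueError("no scratch file system found in the table")


def resize_command(host, current_gib, minimum=MIN_SCRATCH_GIB):
    """Return the host-fs-modify command to run, or None if large enough."""
    if current_gib >= minimum:
        return None
    return f"system host-fs-modify {host} scratch={minimum}"


# Hypothetical sample output; real column names and sizes may differ.
SAMPLE = """
+---------+-------------+
| FS Name | Size in GiB |
+---------+-------------+
| backup  | 25          |
| scratch | 8           |
+---------+-------------+
"""

print(resize_command("controller-0", scratch_size_gib(SAMPLE)))
```

The sketch only builds the command string; running it is left to the
operator so that the change is reviewed before it is applied to each
controller.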
|
||||
|
||||
.. only:: partner
|
||||
|
||||
.. include:: ../_includes/upgrade-management-overview.rest
|
@ -0,0 +1,485 @@
|
||||
|
||||
.. vco1593176327490
|
||||
.. _upgrading-the-systemcontroller-using-the-cli:
|
||||
|
||||
==========================================
|
||||
Upgrade the SystemController Using the CLI
|
||||
==========================================
|
||||
|
||||
You can use the CLI to upload and apply upgrades to the SystemController in order to upgrade
|
||||
the central repository. The SystemController can be upgraded
|
||||
using either a manual software upgrade procedure or by using the
|
||||
non-distributed systems :command:`sw-manager` orchestration procedure.
|
||||
|
||||
.. rubric:: |context|
|
||||
|
||||
Follow the steps below to manually upgrade the SystemController:
|
||||
|
||||
.. rubric:: |proc|
|
||||
|
||||
.. _upgrading-the-systemcontroller-using-the-cli-steps-oq4-dgm-cmb:
|
||||
|
||||
#. Source the platform environment.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
$ source /etc/platform/openrc
|
||||
~(keystone_admin)]$
|
||||
|
||||
.. only:: partner
|
||||
|
||||
.. include:: ../_includes/upgrading-the-systemcontroller-using-the-cli.rest
|
||||
|
||||
#. Copy the ISO file to controller-0 \(the active controller\), and import the software release load.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
~(keystone_admin)]$ system --os-region-name SystemController load-import <bootimage>.iso <bootimage>.sig
|
||||
|
||||
For example,
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
~(keystone_admin)]$ system --os-region-name SystemController load-import <bootimage>.iso <bootimage>.sig
|
||||
|
||||
#. Apply any required software updates. After the update is installed, ensure
|
||||
controller-0 is active.
|
||||
|
||||
The system must be 'patch current'. All software updates related to your
|
||||
current |prod| software release must be uploaded, applied, and installed.
|
||||
|
||||
All software updates to the new |prod| release only need to be uploaded
|
||||
and applied. The install of these software updates will occur automatically
|
||||
during the software upgrade procedure as the hosts are reset to load the
|
||||
new release of software.
|
||||
|
||||
To find and download applicable updates, visit the `Wind River Support
|
||||
Network <https://docs.windriver.com>`__.
|
||||
|
||||
.. xbooklink For more information, see |updates-doc|: :ref:`Managing Software Updates <managing-software-updates>`.
|
||||
|
||||
#. Confirm that the system is healthy.
|
||||
|
||||
Check the current system health status, resolve any alarms and other issues
|
||||
reported by the :command:`health-query-upgrade` command, then recheck the
|
||||
system health status to confirm that all **System Health** fields are set
|
||||
to **OK**.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
~(keystone_admin)]$ system health-query-upgrade
|
||||
System Health:
|
||||
All hosts are provisioned: [OK]
|
||||
All hosts are unlocked/enabled: [OK]
|
||||
All hosts have current configurations: [OK]
|
||||
All hosts are patch current: [OK]
|
||||
Ceph Storage Healthy: [OK]
|
||||
No alarms: [OK]
|
||||
All kubernetes nodes are ready: [OK]
|
||||
All kubernetes control plane pods are ready: [OK]
|
||||
Required patches are applied: [OK]
|
||||
License valid for upgrade: [OK]
|
||||
|
||||
By default, the upgrade process cannot run while alarms are active. It is
|
||||
strongly recommended that you clear your
|
||||
system of all alarms before doing an upgrade.
|
||||
|
||||
.. note::
|
||||
|
||||
Use the command :command:`system upgrade-start --force` to force the
|
||||
upgrade process to start and to ignore management-affecting alarms.
|
||||
This should ONLY be done if these alarms do not cause an issue for the
|
||||
upgrade process.
|
||||
|
||||
If there are alarms present during the upgrade, subcloud load sync\_status
|
||||
will display "out-of-sync".
|
||||
|
||||
#. Start the upgrade from controller-0.
|
||||
|
||||
Make sure that controller-0 is the active controller, and you are logged
|
||||
into controller-0 as **sysadmin** and your present working directory is
|
||||
your home directory.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
~(keystone_admin)]$ system upgrade-start
|
||||
+--------------+--------------------------------------+
|
||||
| Property | Value |
|
||||
+--------------+--------------------------------------+
|
||||
| uuid | 61e5fcd7-a38d-40b0-ab83-8be55b87fee2 |
|
||||
| state | starting |
|
||||
| from_release | nn.nn |
|
||||
| to_release | nn.nn |
|
||||
+--------------+--------------------------------------+
|
||||
|
||||
This will make a copy of the system data to be used in the upgrade.
|
||||
Configuration changes are not allowed after this point until the swact to
|
||||
controller-1 is completed.
|
||||
|
||||
The following upgrade state applies once this command is executed. Run the
|
||||
:command:`system upgrade-show` command to verify the status of the upgrade.
|
||||
|
||||
|
||||
- started:
|
||||
|
||||
|
||||
- State entered after :command:`system upgrade-start` completes.
|
||||
|
||||
- Release 20.04 system data \(for example, postgres databases\) has
|
||||
been exported to be used in the upgrade.
|
||||
|
||||
- Configuration changes must not be made after this point, until the
|
||||
upgrade is completed.
|
||||
|
||||
|
||||
|
||||
As part of the upgrade, the upgrade process checks the health of the system
|
||||
and validates that the system is ready for an upgrade.
|
||||
|
||||
The upgrade process checks that no alarms are active before starting an
|
||||
upgrade.
|
||||
|
||||
.. note::
|
||||
|
||||
Use the command :command:`system upgrade-start --force` to force the
|
||||
upgrade process to start and to ignore management-affecting alarms.
|
||||
This should ONLY be done if these alarms do not cause an issue for the
|
||||
upgrade process.
|
||||
|
||||
If there are alarms present during the upgrade, subcloud load
|
||||
sync\_status will display "out-of-sync".
|
||||
|
||||
On systems with Ceph storage, it also checks that the Ceph cluster is
|
||||
healthy.
|
||||
|
||||
#. Upgrade controller-1.
|
||||
|
||||
|
||||
#. Lock controller-1.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
~(keystone_admin)]$ system host-lock controller-1
|
||||
|
||||
#. Start the upgrade on controller-1.
|
||||
|
||||
Controller-1 installs the update and reboots, then performs data
|
||||
migration.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
~(keystone_admin)]$ system host-upgrade controller-1
|
||||
|
||||
Wait for controller-1 to reinstall with the N+1 load and enter the
|
||||
**locked-disabled-online** state.
|
||||
|
||||
The following data migration states apply when this command is executed.
|
||||
|
||||
|
||||
- data-migration:
|
||||
|
||||
|
||||
- State entered when :command:`system host-upgrade controller-1`
|
||||
is executed.
|
||||
|
||||
- System data is being migrated from release N to release N+1.
|
||||
|
||||
- data-migration-complete:
|
||||
|
||||
- State entered when controller-1 upgrade is complete.
|
||||
|
||||
- System data has been successfully migrated from release nn.nn
|
||||
to release nn.nn.
|
||||
|
||||
where *nn.nn* in the update file name is the |prod| release number.
|
||||
|
||||
- data-migration-failed:
|
||||
|
||||
- State entered if data migration on controller-1 fails.
|
||||
|
||||
- Upgrade must be aborted.
|
||||
|
||||
#. Check the upgrade state.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
~(keystone_admin)]$ system upgrade-show
|
||||
+--------------+--------------------------------------+
|
||||
| Property | Value |
|
||||
+--------------+--------------------------------------+
|
||||
| uuid | e7c8f6bc-518c-46d4-ab81-7a59f8f8e64b |
|
||||
| state | data-migration-complete |
|
||||
| from_release | nn.nn |
|
||||
| to_release | nn.nn |
|
||||
+--------------+--------------------------------------+
|
||||
|
||||
If the :command:`upgrade-show` status indicates
|
||||
'data-migration-failed', then there is an issue with the data
|
||||
migration. Check the issue before proceeding to the next step.
|
||||
|
||||
.. note::
|
||||
|
||||
Do not unlock controller-1 until :command:`system
|
||||
upgrade-show` displays the upgrade status
|
||||
"data-migration-complete".
|
||||
|
||||
#. Unlock controller-1.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
~(keystone_admin)]$ system host-unlock controller-1
|
||||
|
||||
Wait for controller-1 to become **unlocked-enabled**. Wait until the DRBD
|
||||
sync **400.001** Services-related alarm is raised and then cleared.
|
||||
|
||||
The following states apply when this command is executed.
|
||||
|
||||
|
||||
- upgrading-controllers:
|
||||
|
||||
|
||||
- State entered when controller-1 has been unlocked and is
|
||||
running release nn.nn software.
|
||||
|
||||
where *nn.nn* in the update file name is the |prod| release
|
||||
number.
|
||||
|
||||
|
||||
If it transitions to **unlocked-disabled-failed**, check the issue
|
||||
before proceeding to the next step. The alarms may indicate a
|
||||
configuration error. Check the result of the configuration logs on
|
||||
controller-1 \(for example, error logs in
|
||||
controller-1:/var/log/puppet\).
|
||||
|
||||
#. Run the :command:`system application-list`, and :command:`system
|
||||
host-upgrade-list` commands to view the current progress.
|
||||
|
||||
#. Set controller-1 as the active controller. Swact to controller-1.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
~(keystone_admin)]$ system host-swact controller-0
|
||||
|
||||
Wait until services have gone active on the new active controller-1 before
|
||||
proceeding to the next step. When all services on controller-1 are
|
||||
enabled-active, the swact is complete.
|
||||
|
||||
.. note::
|
||||
|
||||
Continue the remaining steps below to manually upgrade or use upgrade
|
||||
orchestration to upgrade the remaining nodes.
|
||||
|
||||
#. Upgrade **controller-0**.
|
||||
|
||||
.. xbooklink :ref:`|updates-doc| <software-updates-and-upgrades-software-updates>`.
|
||||
|
||||
|
||||
#. Lock **controller-0**.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
~(keystone_admin)]$ system host-lock controller-0
|
||||
|
||||
#. Upgrade **controller-0**.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
~(keystone_admin)]$ system host-upgrade controller-0
|
||||
|
||||
|
||||
#. Unlock **controller-0**.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
~(keystone_admin)]$ system host-unlock controller-0
|
||||
|
||||
Wait until the DRBD sync **400.001** Services-related alarm is raised
|
||||
and then cleared before proceeding to the next step.
|
||||
|
||||
|
||||
- upgrading-hosts:
|
||||
|
||||
- State entered when both controllers are running release nn.nn
|
||||
software.
|
||||
|
||||
|
||||
#. Check the system health to ensure that there are no unexpected alarms.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
~(keystone_admin)]$ fm alarm-list
|
||||
|
||||
Clear all alarms unrelated to the upgrade process.
|
||||
|
||||
#. If using Ceph storage backend, upgrade the storage nodes one at a time.
|
||||
|
||||
The storage node must be locked and all OSDs must be down in order to do
|
||||
the upgrade.
|
||||
|
||||
|
||||
#. Lock storage-0.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
~(keystone_admin)]$ system host-lock storage-0
|
||||
|
||||
#. Verify that the OSDs are down after the storage node is locked.
|
||||
|
||||
In the Horizon interface, navigate to **Admin** \> **Platform** \>
|
||||
**Storage Overview** to view the status of the OSDs.
|
||||
|
||||
#. Upgrade storage-0.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
~(keystone_admin)]$ system host-upgrade storage-0
|
||||
|
||||
The upgrade is complete when the node comes online, and at that point,
|
||||
you can safely unlock the node.
|
||||
|
||||
After upgrading a storage node, but before unlocking, there are Ceph
|
||||
synchronization alarms \(that appear to be making progress in
|
||||
synching\), and there are infrastructure network interface alarms
|
||||
\(since the infrastructure network interface configuration has not been
|
||||
applied to the storage node yet, as it has not been unlocked\).
|
||||
|
||||
Unlock the node as soon as the upgraded storage node comes online.
|
||||
|
||||
#. Unlock storage-0.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
~(keystone_admin)]$ system host-unlock storage-0
|
||||
|
||||
Wait for all alarms to clear after the unlock before proceeding to
|
||||
upgrade the next storage host.
|
||||
|
||||
#. Repeat the above steps for each storage host.
|
||||
|
||||
.. note::
|
||||
|
||||
After upgrading the first storage node you can expect alarm
|
||||
**800.003**. The alarm is cleared after all storage nodes are
|
||||
upgraded.
|
||||
|
||||
#. If worker nodes are present, upgrade the worker hosts, serially or in
|
||||
parallel.
|
||||
|
||||
|
||||
#. Lock worker-0.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
~(keystone_admin)]$ system host-lock worker-0
|
||||
|
||||
|
||||
#. Upgrade worker-0.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
~(keystone_admin)]$ system host-upgrade worker-0
|
||||
|
||||
Wait for the host to run the installer, reboot, and go online before
|
||||
unlocking it in the next step.
|
||||
|
||||
#. Unlock worker-0.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
~(keystone_admin)]$ system host-unlock worker-0
|
||||
|
||||
Wait for all alarms to clear after the unlock before proceeding to the
|
||||
next worker host.
|
||||
|
||||
#. Repeat the above steps for each worker host.
|
||||
|
||||
|
||||
#. Set controller-0 as the active controller. Swact to controller-0.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
~(keystone_admin)]$ system host-swact controller-1
|
||||
|
||||
Wait until services have gone active on the active controller-0 before
|
||||
proceeding to the next step. When all services on controller-0 are
|
||||
enabled-active, the swact is complete.
|
||||
|
||||
#. Activate the upgrade.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
~(keystone_admin)]$ system upgrade-activate
|
||||
+--------------+--------------------------------------+
|
||||
| Property | Value |
|
||||
+--------------+--------------------------------------+
|
||||
| uuid | 61e5fcd7-a38d-40b0-ab83-8be55b87fee2 |
|
||||
| state | activating |
|
||||
| from_release | nn.nn |
|
||||
| to_release | nn.nn |
|
||||
+--------------+--------------------------------------+
|
||||
|
||||
During the running of the :command:`upgrade-activate` command, new
|
||||
configurations are applied to the controller. 250.001 \(**hostname
|
||||
Configuration is out-of-date**\) alarms are raised and are cleared as the
|
||||
configuration is applied. The upgrade state goes from **activating** to
|
||||
**activation-complete** once this is done.
|
||||
|
||||
The following states apply when this command is executed.
|
||||
|
||||
- activation-requested:
|
||||
|
||||
- State entered when :command:`system upgrade-activate` is executed.
|
||||
|
||||
- activating:
|
||||
|
||||
- State entered when we have started activating the upgrade by
|
||||
applying new configurations to the controller and compute hosts.
|
||||
|
||||
- activation-complete:
|
||||
|
||||
- State entered when new configurations have been applied to all
|
||||
controller and compute hosts.
|
||||
|
||||
#. Check the status of the upgrade again to confirm that it has reached
|
||||
**activation-complete**. For example:
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
~(keystone_admin)]$ system upgrade-show
|
||||
+--------------+--------------------------------------+
|
||||
| Property | Value |
|
||||
+--------------+--------------------------------------+
|
||||
| uuid | 61e5fcd7-a38d-40b0-ab83-8be55b87fee2 |
|
||||
| state | activation-complete |
|
||||
| from_release | nn.nn |
|
||||
| to_release | nn.nn |
|
||||
+--------------+--------------------------------------+
|
||||
|
||||
|
||||
.. note::
|
||||
Alarms are generated as the subcloud load sync\_status is "out-of-sync".
|
||||
|
||||
#. Complete the upgrade.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
~(keystone_admin)]$ system upgrade-complete
|
||||
+--------------+--------------------------------------+
|
||||
| Property | Value |
|
||||
+--------------+--------------------------------------+
|
||||
| uuid | 61e5fcd7-a38d-40b0-ab83-8be55b87fee2 |
|
||||
| state | completing |
|
||||
| from_release | nn.nn |
|
||||
| to_release | nn.nn |
|
||||
+--------------+--------------------------------------+
|
||||
|
||||
Run the :command:`system upgrade-show` command, and the status will display
|
||||
"no upgrade in progress". The subclouds will be out-of-sync.
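Two checks in the procedure above depend on reading command output: the
health query must report **OK** for every field, and controller-1 must not
be unlocked until the upgrade state is ``data-migration-complete``. The
following Python sketch shows one way these checks might be automated. It is
illustrative only and assumes the outputs of
:command:`system health-query-upgrade` and :command:`system upgrade-show`
have been captured as text in the layouts shown in this procedure. The
function names are hypothetical, and the ``[Fail]`` marker in the sample is
an assumed value for a failing check.

```python
def failed_health_checks(health_output):
    """Return the labels of health lines that are not marked [OK]."""
    failed = []
    for line in health_output.splitlines():
        label, sep, status = line.strip().rpartition(": ")
        # Only lines of the form '<label>: [<status>]' are health checks.
        if sep and status.startswith("[") and status != "[OK]":
            failed.append(label)
    return failed


def upgrade_state(show_output):
    """Extract the 'state' value from a Property/Value table."""
    for line in show_output.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) == 2 and cells[0] == "state":
            return cells[1]
    return None


def safe_to_unlock_controller_1(show_output):
    """True only when data migration has completed, per the note above."""
    return upgrade_state(show_output) == "data-migration-complete"


# Sample text modelled on the outputs shown in this procedure.
HEALTH = """
System Health:
All hosts are provisioned: [OK]
All hosts are patch current: [OK]
No alarms: [Fail]
"""

SHOW = """
+--------------+--------------------------------------+
| Property     | Value                                |
+--------------+--------------------------------------+
| uuid         | e7c8f6bc-518c-46d4-ab81-7a59f8f8e64b |
| state        | data-migration-complete              |
| from_release | nn.nn                                |
| to_release   | nn.nn                                |
+--------------+--------------------------------------+
"""

print(failed_health_checks(HEALTH))
print(safe_to_unlock_controller_1(SHOW))
```

Treating any state other than ``data-migration-complete`` as unsafe is a
deliberately conservative choice: an unrecognized or missing state blocks
the unlock instead of allowing it.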
|
||||
|
||||
.. rubric:: |postreq|
|
||||
|
||||
.. warning::
|
||||
Do NOT delete the N load from the SystemController once the upgrade is
|
||||
complete. If the load is deleted from the SystemController, you must
|
||||
manually delete the N load from each subcloud.
|
||||
|