Subcloud Local Installation Upgrade Support (pick dsR6)

Updated Patchset 7 comments
Updated Patchset 6 comments
Fixed Include file
Updated Patchset 4 comments
Updated Patchset 3 comments
Updated Patchset 1 comments
Added new topics for Prestaging a Subcloud

Signed-off-by: Juanita-Balaraj <juanita.balaraj@windriver.com>
Change-Id: Ibc51fa7159a06a6317eadb19524f9b00ada105eb
Juanita-Balaraj 2022-03-24 17:04:19 -04:00
parent cf21ee640e
commit 9fe711783e
9 changed files with 518 additions and 0 deletions

@ -0,0 +1,7 @@
start-after: prestage-image-begin
end-before: prestage-image-end
start-after: image-list-begin
end-before: image-list-end

@ -0,0 +1,3 @@
start-after: strategy-begin
end-before: strategy-end

@ -9,6 +9,9 @@ Distributed Upgrade Orchestration Process Using the CLI
Distributed upgrade orchestration can be initiated after the System Controller
has been successfully upgraded.
For more information on prestage subcloud orchestration, see
:ref:`prestage-subcloud-orchestration-eb516473582f`.
.. rubric:: |context|
The user first creates a distributed upgrade orchestration strategy, or plan,

@ -54,6 +54,16 @@ Operation
migrate-an-aiosx-subcloud-to-an-aiodx-subcloud
restoring-subclouds-from-backupdata-using-dcmanager
rehoming-a-subcloud
prestage-a-subcloud-using-dcmanager-df756866163f
--------------------------------------------------------------------
Prestage Orchestration for Distributed Cloud Subclouds using the CLI
--------------------------------------------------------------------
.. toctree::
:maxdepth: 1
prestage-subcloud-orchestration-eb516473582f
----------------------
Manage Subcloud Groups

@ -68,6 +68,29 @@ subcloud, the subcloud installation has these phases:
files that are referenced in the **bootstrap.yml** file must exist on both
controllers \(for example, /home/sysadmin/docker-registry-ca-cert.pem\).
.. _increase-subcloud-platform-backup-size:
----------------------------------------------------
Increase Subcloud Platform Backup Size using the CLI
----------------------------------------------------
By default, 30 GB is allocated for ``/opt/platform-backup``. If additional
persistent disk space is required, the partition can be enlarged at the next
subcloud reinstall as follows:
- To increase ``/opt/platform-backup`` to 40 GB, add the **persistent_size: 40000**
parameter to the subcloud install-values.yaml file.
- Use the :command:`dcmanager subcloud update` command to save the
configuration change for the next subcloud reinstall.
.. code-block:: none
~(keystone_admin)]$ dcmanager subcloud update --install-values <install-values-yaml-file> <subcloud-name>
For a new subcloud deployment, use the :command:`dcmanager subcloud add`
command with the install-values.yaml file containing the desired
**persistent_size** value.
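The size rule for **persistent_size** (a whole number of MB, at least 30000) can be checked before the install values file is saved. The following is a minimal Python sketch of such a check; the function name ``validate_persistent_size`` is hypothetical and is not part of dcmanager.

```python
def validate_persistent_size(value, minimum_mb=30000):
    """Return the size in MB if it is a whole number >= minimum_mb.

    Raises ValueError otherwise. The 30000 MB minimum is the default
    /opt/platform-backup allocation described above.
    """
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise ValueError(
            f"persistent_size must be a whole number of MB, got {value!r}"
        )
    if value < minimum_mb:
        raise ValueError(
            f"persistent_size must be >= {minimum_mb} MB, got {value}"
        )
    return value


if __name__ == "__main__":
    # 40000 MB corresponds to the 40 GB example above.
    print(validate_persistent_size(40000))
```

A value read from a YAML file may arrive as a string; this sketch rejects it rather than converting it silently.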
.. rubric:: |proc|
@ -162,6 +185,10 @@ subcloud, the subcloud installation has these phases:
# rootfs_device: "/dev/disk/by-path/pci-0000:00:1f.2-ata-1.0"
# boot_device: "/dev/disk/by-path/pci-0000:00:1f.2-ata-1.0"
# Set the value for persistent file system (/opt/platform-backup).
# The value must be a whole number (in MB) that is greater than or equal
# to 30000.
persistent_size: 30000
#. At the System Controller, create a
``/home/sysadmin/subcloud1-bootstrap-values.yaml`` overrides file for the

@ -26,3 +26,6 @@ Platform Management Service.
.. include:: /_includes/installing-and-provisioning-a-subcloud.rest
:start-after: begin-shared-nic
:end-before: end-shared-nic
For more information on subcloud deployment with local installation, see
:ref:`subcloud-deployment-with-local-installation-4982449058d5`.

@ -0,0 +1,261 @@
.. _prestage-a-subcloud-using-dcmanager-df756866163f:
===================
Prestage a Subcloud
===================
Before you start an |AIO-SX| subcloud upgrade or reinstall for the purpose of
restoring the subcloud, you can prestage the subcloud with software packages
and container image archives outside the maintenance window using the dcmanager
CLI. The prestaged data is stored in the subcloud persistent file system
``/opt/platform-backup/<sw_version>``, where `<sw_version>` is the active load
of the System Controller. This data is used the next time the subcloud is
reinstalled.
.. note::
Only |AIO-SX| subclouds can be prestaged using the dcmanager CLI.
For information on prestaging a batch of subclouds, see
:ref:`prestage-subcloud-orchestration-eb516473582f`.
.. rubric:: |context|
The main steps of this task are:
#. Ensure prestaging prerequisites are met. See :ref:`prestaging-prereqs`.
#. Upload the list of container images to prestage. This step applies only to
prestaging for upgrade and must be performed after the System Controller
has been upgraded. See :ref:`Upload Prestage Image List <prestaging-image-list>`.
#. Use dcmanager commands to prestage the subcloud(s).
To increase the subcloud platform backup size using the dcmanager CLI, see
:ref:`increase-subcloud-platform-backup-size`.
.. _prestaging-prereqs:
-----------------------
Prestaging Requirements
-----------------------
.. rubric:: |prereq|
Prestaging can be done for a single subcloud or a batch of subclouds via
orchestration. See :ref:`prestage-subcloud-orchestration-eb516473582f`.
There are two types of subcloud prestage:
- **Prestage for upgrade**: When the subcloud is running a different (older)
load than the System Controller at the time of prestaging.
- **Prestage for reinstall**: When the subcloud is running the same load as the
System Controller at the time of prestaging.
.. note::
Only |AIO-SX| subclouds can be prestaged using the dcmanager CLI.
**Pre-conditions common to both types of prestage**:
- Subclouds to be prestaged must be |AIO-SX|, online, managed, and free
of any management-affecting alarms.
.. note::
You can force prestaging using the ``--force`` option. However, this
is not recommended unless you are certain that the prestaging
process will not exacerbate the alarm condition on the subcloud.
- Subcloud ``/opt/platform-backup`` must have enough available disk space
for prestage data.
- Subcloud ``/var/lib/docker`` must have enough space for all prestage
image pulls and archive file generation. If the total size of prestage
images is N GB, the available Docker space should be at least N*2 GB.
.. warning::
If the available Docker space is inadequate, some application pods can be
evicted due to temporary disk pressure during the prestaging process. The
cert-manager application will cause the subcloud upgrade to fail if its
evicted pods are not cleaned up.
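The Docker space rule above (N GB of images needs N*2 GB of free space) is simple arithmetic that can be checked ahead of time. The following Python sketch illustrates it; the function names are hypothetical, and the free-space helper assumes decimal gigabytes (10^9 bytes), which may differ from how a given tool reports sizes.

```python
import shutil


def required_docker_space_gb(total_image_size_gb):
    """Free space needed in /var/lib/docker: twice the total image size."""
    return 2 * total_image_size_gb


def has_enough_docker_space(total_image_size_gb, available_gb):
    """Return True if the available space covers pulls plus archive generation."""
    return available_gb >= required_docker_space_gb(total_image_size_gb)


def available_space_gb(path="/var/lib/docker"):
    """Free space on the file system holding `path`, in decimal GB (assumption)."""
    return shutil.disk_usage(path).free / 1e9


if __name__ == "__main__":
    # Example: 5 GB of prestage images against 12 GB of free Docker space.
    print(has_enough_docker_space(5, 12))
```

The helper ``available_space_gb`` would have to run on the subcloud itself, since the space in question is the subcloud's ``/var/lib/docker``.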
**Pre-conditions specific to prestage for upgrade**:
- The total size of prestage images and custom images restored during upgrade
must not exceed docker-distribution capacity.
- Prestage images must already exist in the configured source(s) prior to
subcloud prestaging. For example, if the subcloud is configured to
download images from the central registry, the specified images must
already exist in the registry on the System Controller.
.. _prestaging-image-list:
--------------------------
Upload Prestage Image List
--------------------------
The prestage image list specifies what container images are to be pulled from
the configured sources and included in the image archive files during prestaging.
This list is used only if the prestage is intended for subcloud upgrade, that
is, the System Controller and subclouds are running different loads at the
time of prestaging.
The prestage image list must contain:
- Images required for subcloud platform upgrade.
- Images required for the restore and update of |prod-long| applications
currently applied on the subcloud, for example, cert-manager, |OIDC|, and
metrics-server.
.. only:: partner
.. include:: /_includes/prestage-a-subcloud-using-dcmanager-df756866163f.rest
:start-after: prestage-image-begin
:end-before: prestage-image-end
If ample Docker and docker-distribution storage is available, the prestage
image list can also contain:
- (Optional) Images required for Kubernetes version upgrades post subcloud upgrade.
- (Optional) Images required for the update of end users' Helm applications
post subcloud upgrade.
.. note::
Determine the total size of all images to be prestaged in advance. Too many
images can cause the subcloud upgrade to fail with a docker-distribution
(local registry) out-of-space error.
See :ref:`prestaging-prereqs` for more details.
.. rubric:: |proc|
#. To upload the prestage image list, use the following command after the
System Controller has been upgraded.
.. code-block:: none
~(keystone_admin)]$ dcmanager subcloud-deploy upload --prestage-images nn.nn_images.lst
+------------------+-----------------+
| Field            | Value           |
+------------------+-----------------+
| deploy_playbook  | None            |
| deploy_overrides | None            |
| deploy_chart     | None            |
| prestage_images  | nn.nn_images.lst|
+------------------+-----------------+
The name of the prestage image list file is user-defined. However, it is
recommended to use the format `<software_version>_images.lst`,
for example, `21.12_images.lst`.
#. To confirm that the image list has been uploaded, use the following command.
.. code-block:: none
~(keystone_admin)]$ dcmanager subcloud-deploy show
+------------------+-------------------------+
| Field | Value |
+------------------+-------------------------+
| deploy_playbook | None |
| deploy_overrides | None |
| deploy_chart | None |
| prestage_images | nn.nn_images.lst |
+------------------+-------------------------+
.. warning::
Because prestage images are pulled from the Docker registries currently
configured for the subcloud, images in the image list file must not contain
a custom or private registry prefix.
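The recommended file name and the registry prefix rule above can be checked with a short script before the list is uploaded. The following Python sketch is an illustration only. It assumes the common Docker convention that the first path component of an image reference is a registry host when it contains a ``.`` or ``:`` or equals ``localhost``. The set of permitted public registries is a hypothetical input that must be chosen per deployment.

```python
def image_list_filename(software_version):
    """Build the recommended name, for example '21.12_images.lst'."""
    return f"{software_version}_images.lst"


def registry_prefix(image):
    """Return the registry host of an image reference, or None if absent.

    Assumption: the first path component is a registry host when it
    contains '.' or ':' or is 'localhost' (common Docker convention).
    """
    first, sep, _rest = image.partition("/")
    if sep and ("." in first or ":" in first or first == "localhost"):
        return first
    return None


def find_disallowed_prefixes(images, allowed_registries):
    """Return the entries whose registry host is not in allowed_registries."""
    flagged = []
    for image in images:
        entry = image.strip()
        prefix = registry_prefix(entry)
        if prefix is not None and prefix not in allowed_registries:
            flagged.append(entry)
    return flagged


if __name__ == "__main__":
    # Hypothetical image list entries and allowed registries.
    entries = ["docker.io/example/app:1.0", "registry.local:9001/example/app:1.0"]
    print(find_disallowed_prefixes(entries, {"docker.io", "quay.io"}))
```

An empty result means no entry carries a registry prefix outside the permitted set; it does not confirm that the images exist in the configured sources.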
.. only:: partner
.. include:: /_includes/prestage-a-subcloud-using-dcmanager-df756866163f.rest
:start-after: image-list-begin
:end-before: image-list-end
------------------------
Single Subcloud Prestage
------------------------
See :ref:`prestaging-prereqs` for preconditions prior to prestaging the subcloud.
.. code-block:: none
~(keystone_admin)]$ dcmanager subcloud prestage subcloud2
Enter the sysadmin password for the subcloud:
Re-enter sysadmin password to confirm:
+-----------------------------+----------------------------+
| Field | Value |
+-----------------------------+----------------------------+
| id | 2 |
| name | subcloud2 |
| description | None |
| location | None |
| software_version | nn.nn |
| management | managed |
| availability | online |
| deploy_status | prestage-prepare |
| management_subnet | 2620:10a:a001:ac01::20/123 |
| management_start_ip | 2620:10a:a001:ac01::22 |
| management_end_ip | 2620:10a:a001:ac01::3e |
| management_gateway_ip | 2620:10a:a001:ac01::21 |
| systemcontroller_gateway_ip | 2620:10a:a001:a113::1 |
| group_id | 3 |
| created_at | 2022-03-18 20:31:16.548903 |
| updated_at | 2022-03-22 18:55:56.251643 |
+-----------------------------+----------------------------+
-----------------------
Rerun Subcloud Prestage
-----------------------
A subcloud can be prestaged multiple times. However, only image prestaging
is repeated. Once package prestaging has succeeded, it is skipped in
subsequent prestage reruns for the same software version.
------------------------
Verify Subcloud Prestage
------------------------
After a subcloud is successfully prestaged, the ``deploy_status`` changes to
``prestage-complete``. Use the :command:`dcmanager subcloud show` command to
verify the status. The packages directory, repodata directory, container
image bundles, and md5 file can be found on the subcloud in
``/opt/platform-backup/<sw_version>``, where `<sw_version>` is the active load
of the System Controller.
------------------------------
Troubleshoot Subcloud Prestage
------------------------------
If the subcloud prestage fails, check ``/var/log/dcmanager/dcmanager.log``
for the reason for the failure. Once the issue has been resolved, prestage can
be retried using the :command:`dcmanager subcloud prestage` command.
---------------------------------
Verifying Usage of Prestaged Data
---------------------------------
To verify that the prestaged data is used during subcloud upgrade, subcloud
reinstall, or subcloud remote restore:
- Search for the subcloud name in the log file, for example,
subcloud1 in ``/www/var/log/lighttpd-access.log``. There should be no
GET requests to download packages from ``/iso/<sw_version>/nodes/subcloud1/Packages/``.
- Check the subcloud Ansible log in the ``/var/log/dcmanager/ansible`` directory.
Images should be imported from local archives, and no images in the prestage
image list should need to be downloaded from configured sources.
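The access log check in the first item can be scripted. The following Python sketch scans log lines for GET requests under the subcloud's ``Packages`` path. The function names are hypothetical, and the sketch only assumes that each request line contains the method and the URL path as plain text. It does not depend on any other detail of the lighttpd log format.

```python
def find_package_downloads(log_lines, subcloud, sw_version):
    """Return access-log lines showing package downloads for one subcloud."""
    needle = f"/iso/{sw_version}/nodes/{subcloud}/Packages/"
    hits = []
    for line in log_lines:
        # A hit is any GET request whose path falls under the Packages directory.
        if "GET " in line and needle in line:
            hits.append(line.rstrip("\n"))
    return hits


def prestaged_packages_were_used(log_path, subcloud, sw_version):
    """Return True if the log shows no package downloads for the subcloud."""
    with open(log_path, encoding="utf-8", errors="replace") as log_file:
        return not find_package_downloads(log_file, subcloud, sw_version)


if __name__ == "__main__":
    # Hypothetical log line, for illustration only.
    sample = ['"GET /iso/21.12/nodes/subcloud1/Packages/example.rpm HTTP/1.1" 200']
    print(find_package_downloads(sample, "subcloud1", "21.12"))
```

If the log also contains entries from an earlier, non-prestaged installation of the same subcloud, those lines match as well, so the scan is best limited to the time window of the operation being verified.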

@ -0,0 +1,201 @@
.. _prestage-subcloud-orchestration-eb516473582f:
===============================
Prestage Subcloud Orchestration
===============================
This section describes the prestage strategy for a single subcloud, the
default subcloud group, or a specific subcloud group.
.. rubric:: |prereq|
For more information on prerequisites for prestage upgrade and reinstall, see
:ref:`prestage-a-subcloud-using-dcmanager-df756866163f`.
.. note::
Any existing strategy must be deleted first as only one type
of strategy can exist at a time.
.. only:: partner
.. include:: /_includes/prestage-subcloud-orchestration-eb516473582f.rest
:start-after: strategy-begin
:end-before: strategy-end
.. rubric:: |proc|
#. Create a prestage strategy.
Prestage strategy can be created for a single subcloud, the default
subcloud group (all subclouds), or a specific subcloud group.
To create a prestage strategy for a specific subcloud, use the following
command:
.. code-block:: none
~(keystone_admin)]$ dcmanager prestage-strategy create subcloud1
Enter the sysadmin password for the subcloud:
Re-enter sysadmin password to confirm:
+-----------------------------+----------------------------+
| Field | Value |
+-----------------------------+----------------------------+
| id | 1 |
| name | subcloud1 |
| description | None |
| location | False |
| software_version | nn.nn |
| management | managed |
| availability | online |
| deploy_status | prestage-prepare |
| management_subnet | 2620:10a:a001:ac01::20/123 |
| management_start_ip | 2620:10a:a001:ac01::22 |
| management_end_ip | 2620:10a:a001:ac01::3e |
| management_gateway_ip | 2620:10a:a001:ac01::21 |
| systemcontroller_gateway_ip | 2620:10a:a001:a113::1 |
| group_id | 3 |
| created_at | 2022-03-18 20:31:16.548903 |
| updated_at | 2022-03-22 18:55:56.251643 |
+-----------------------------+----------------------------+
To create a prestage strategy for the default subcloud group, use the
following command:
.. code-block:: none
~(keystone_admin)]$ dcmanager prestage-strategy create
Enter the sysadmin password for the subcloud:
Re-enter sysadmin password to confirm:
+------------------------+-----------------------------+
| Field | Value |
+------------------------+-----------------------------+
| strategy type | prestage |
| subcloud apply type | parallel |
| max parallel subclouds | 50 |
| stop on failure | False |
| state | initial |
| created_at | 2022-03-22T18:54:45.037336 |
| updated_at | None |
+------------------------+-----------------------------+
To create a prestage strategy for a specific subcloud group, use the
following command:
.. code-block:: none
~(keystone_admin)]$ dcmanager prestage-strategy create --group First_10_Subclouds
Enter the sysadmin password for the subcloud:
Re-enter sysadmin password to confirm:
+------------------------+-----------------------------+
| Field | Value |
+------------------------+-----------------------------+
| strategy type | prestage |
| subcloud apply type | parallel |
| max parallel subclouds | 10 |
| stop on failure | False |
| state | initial |
| created_at | 2022-03-22T18:54:45.037336 |
| updated_at | None |
+------------------------+-----------------------------+
.. note::
Unlike other types of orchestration, prestage orchestration requires the
sysadmin password because all communication with the subclouds is done
using Ansible over the OAM network to avoid disruptions to management
traffic.
#. Apply the strategy.
.. code-block:: none
~(keystone_admin)]$ dcmanager prestage-strategy apply
+------------------------+-----------------------------+
| Field | Value |
+------------------------+-----------------------------+
| strategy type | prestage |
| subcloud apply type | None |
| max parallel subclouds | None |
| stop on failure | False |
| state | applying |
| created_at | 2022-03-22T18:33:20.100712 |
| updated_at | 2022-03-22T18:36:03.895542 |
+------------------------+-----------------------------+
#. Monitor the progress of the strategy.
.. code-block:: none
~(keystone_admin)]$ dcmanager strategy-step list
+-----------+-------+---------------------+---------+----------------------------+-------------+
| cloud | stage | state | details | started_at | finished_at |
+-----------+-------+---------------------+---------+----------------------------+-------------+
| subcloud1 | 1 | prestaging-packages | | 2022-03-22 18:55:11.523970 | None |
+-----------+-------+---------------------+---------+----------------------------+-------------+
#. (Optional) Abort the strategy, if required.
The abort command stops the prestage orchestration strategy once the step
that is currently being applied has completed.
#. Delete the strategy.
.. code-block:: none
~(keystone_admin)]$ dcmanager prestage-strategy delete
+------------------------+-----------------------------+
| Field | Value |
+------------------------+-----------------------------+
| strategy type | prestage |
| subcloud apply type | None |
| max parallel subclouds | None |
| stop on failure | False |
| state | deleting |
| created_at | 2022-03-22T19:09:03.576053 |
| updated_at | 2022-03-22T19:09:09.436732 |
+------------------------+-----------------------------+
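When monitoring many subclouds, the table printed by :command:`dcmanager strategy-step list` in the monitoring step can be parsed by a script. The following Python sketch turns that table into a list of dictionaries. It assumes the output layout shown in the monitoring step (rows delimited by ``|``, separator lines starting with ``+``). The function names are hypothetical and are not part of dcmanager.

```python
def parse_cli_table(text):
    """Parse a '|'-delimited CLI table into a list of row dictionaries.

    The first data line is treated as the header. Separator lines that
    start with '+' and any non-table lines are ignored.
    """
    header = None
    rows = []
    for raw in text.splitlines():
        line = raw.strip()
        if not (line.startswith("|") and line.endswith("|")):
            continue
        # Drop the outer pipes, then split the remaining cells.
        cells = [cell.strip() for cell in line[1:-1].split("|")]
        if header is None:
            header = cells
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def unfinished_steps(rows):
    """Return the rows whose finished_at column is still 'None'."""
    return [row for row in rows if row.get("finished_at") == "None"]


if __name__ == "__main__":
    sample = """
    +-----------+-------+---------------------+---------+----------------------------+-------------+
    | cloud     | stage | state               | details | started_at                 | finished_at |
    +-----------+-------+---------------------+---------+----------------------------+-------------+
    | subcloud1 | 1     | prestaging-packages |         | 2022-03-22 18:55:11.523970 | None        |
    +-----------+-------+---------------------+---------+----------------------------+-------------+
    """
    for step in unfinished_steps(parse_cli_table(sample)):
        print(step["cloud"], step["state"])
```

The sketch works on captured text only; how the command output is collected (for example, from a terminal session on the System Controller) is left to the operator.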
--------------------------------------------
Troubleshoot Subcloud Prestage Orchestration
--------------------------------------------
If an orchestrated prestage fails for a subcloud, check the log specified in
the error message for the reason for the failure. After the issue has been
resolved, prestage can be retried using one of the following options:
.. rubric:: |proc|
- Run the :command:`dcmanager subcloud prestage` command on the failed subcloud.
- Create a subcloud group, for example, ``prestage-retry``, add the failed
subcloud(s) to group ``prestage-retry``, and finally create and apply the
prestage strategy for the group.
.. warning::
Do not retry orchestration with an existing group unless the subclouds
that have been successfully prestaged are removed from the group.
Otherwise, prestage will be repeated for ALL subclouds in the group.
For more information on the following, see
:ref:`prestage-a-subcloud-using-dcmanager-df756866163f`
- Upload Prestage Image List
- Single Subcloud Prestage
- Rerun Subcloud Prestage
- Verify Subcloud Prestage
- Verifying Usage of Prestaged Data

@ -9,6 +9,9 @@ For subclouds with servers that support Redfish Virtual Media Service
(version 1.2 or higher), you can use the Central Cloud's CLI to restore the
subcloud from data that was backed up previously.
Before you start an |AIO-SX| subcloud upgrade or reinstall for the purpose of
restoring the subcloud, see :ref:`prestage-a-subcloud-using-dcmanager-df756866163f`.
.. rubric:: |context|
The CLI command :command:`dcmanager subcloud restore` can be used to restore a