From 9fe711783e2c062cd8c5ff08f1335cc839cde7f1 Mon Sep 17 00:00:00 2001 From: Juanita-Balaraj Date: Thu, 24 Mar 2022 17:04:19 -0400 Subject: [PATCH] Subcloud Local Installation Upgrade Support (pick dsR6) Updated Patchset 7 comments Updated Patchset 6 comments Fixed Include file Updated Patchset 4 comments Updated Patchset 3 comments Updated Patchset 1 comments Added new topics for Prestaging a Subcloud Signed-off-by: Juanita-Balaraj Change-Id: Ibc51fa7159a06a6317eadb19524f9b00ada105eb --- ...subcloud-using-dcmanager-df756866163f.rest | 7 + ...e-subcloud-orchestration-eb516473582f.rest | 3 + ...de-orchestration-process-using-the-cli.rst | 3 + .../index-dist-cloud-kub-95bef233eef0.rst | 10 + ...ng-redfish-platform-management-service.rst | 27 ++ ...installing-and-provisioning-a-subcloud.rst | 3 + ...-subcloud-using-dcmanager-df756866163f.rst | 261 ++++++++++++++++++ ...ge-subcloud-orchestration-eb516473582f.rst | 201 ++++++++++++++ ...clouds-from-backupdata-using-dcmanager.rst | 3 + 9 files changed, 518 insertions(+) create mode 100644 doc/source/_includes/prestage-a-subcloud-using-dcmanager-df756866163f.rest create mode 100644 doc/source/_includes/prestage-subcloud-orchestration-eb516473582f.rest create mode 100644 doc/source/dist_cloud/kubernetes/prestage-a-subcloud-using-dcmanager-df756866163f.rst create mode 100644 doc/source/dist_cloud/kubernetes/prestage-subcloud-orchestration-eb516473582f.rst diff --git a/doc/source/_includes/prestage-a-subcloud-using-dcmanager-df756866163f.rest b/doc/source/_includes/prestage-a-subcloud-using-dcmanager-df756866163f.rest new file mode 100644 index 000000000..0839a8a12 --- /dev/null +++ b/doc/source/_includes/prestage-a-subcloud-using-dcmanager-df756866163f.rest @@ -0,0 +1,7 @@ + + +start-after: prestage-image-begin +end-before: prestage-image-end + +start-after: image-list-begin +end-before: image-list-end \ No newline at end of file diff --git a/doc/source/_includes/prestage-subcloud-orchestration-eb516473582f.rest 
b/doc/source/_includes/prestage-subcloud-orchestration-eb516473582f.rest new file mode 100644 index 000000000..92c08653d --- /dev/null +++ b/doc/source/_includes/prestage-subcloud-orchestration-eb516473582f.rest @@ -0,0 +1,3 @@ + +start-after: strategy-begin +end-before: strategy-end diff --git a/doc/source/dist_cloud/kubernetes/distributed-upgrade-orchestration-process-using-the-cli.rst b/doc/source/dist_cloud/kubernetes/distributed-upgrade-orchestration-process-using-the-cli.rst index 60fc72580..d6e1c0e87 100644 --- a/doc/source/dist_cloud/kubernetes/distributed-upgrade-orchestration-process-using-the-cli.rst +++ b/doc/source/dist_cloud/kubernetes/distributed-upgrade-orchestration-process-using-the-cli.rst @@ -9,6 +9,9 @@ Distributed Upgrade Orchestration Process Using the CLI Distributed upgrade orchestration can be initiated after the System Controller has been successfully upgraded. +For more information Prestaging Subcloud Orchestration see, +:ref:`prestage-subcloud-orchestration-eb516473582f`. + .. rubric:: |context| The user first creates a distributed upgrade orchestration strategy, or plan, diff --git a/doc/source/dist_cloud/kubernetes/index-dist-cloud-kub-95bef233eef0.rst b/doc/source/dist_cloud/kubernetes/index-dist-cloud-kub-95bef233eef0.rst index da1c25303..a9347039d 100644 --- a/doc/source/dist_cloud/kubernetes/index-dist-cloud-kub-95bef233eef0.rst +++ b/doc/source/dist_cloud/kubernetes/index-dist-cloud-kub-95bef233eef0.rst @@ -54,6 +54,16 @@ Operation migrate-an-aiosx-subcloud-to-an-aiodx-subcloud restoring-subclouds-from-backupdata-using-dcmanager rehoming-a-subcloud + prestage-a-subcloud-using-dcmanager-df756866163f + +-------------------------------------------------------------------- +Prestage Orchestration for Distributed Cloud Subclouds using the CLI +-------------------------------------------------------------------- + +.. 
toctree:: + :maxdepth: 1 + + prestage-subcloud-orchestration-eb516473582f ---------------------- Manage Subcloud Groups diff --git a/doc/source/dist_cloud/kubernetes/installing-a-subcloud-using-redfish-platform-management-service.rst b/doc/source/dist_cloud/kubernetes/installing-a-subcloud-using-redfish-platform-management-service.rst index e91acf3e3..269141fd2 100644 --- a/doc/source/dist_cloud/kubernetes/installing-a-subcloud-using-redfish-platform-management-service.rst +++ b/doc/source/dist_cloud/kubernetes/installing-a-subcloud-using-redfish-platform-management-service.rst @@ -68,6 +68,29 @@ subcloud, the subcloud installation has these phases: files that are referenced in the **bootstrap.yml** file must exist on both controllers \(for example, /home/sysadmin/docker-registry-ca-cert.pem\). +.. _increase-subcloud-platform-backup-size: + +---------------------------------------------------- +Increase Subcloud Platform Backup Size using the CLI +---------------------------------------------------- + +By default, 30GB is allocated for ``/opt/platform-backup``. If additional +persistent disk space is required, the partition can be increased in the next +subcloud reinstall using the following commands: + +- To increase ``/opt/platform-backup`` to 40GB, add the **persistent_size: 40000** + parameter to the subcloud install-values.yaml file. + +- Use the :command:`dcmanager subcloud update` command to save the + configuration change for the next subcloud reinstall. + + .. code-block:: none + + ~(keystone_admin)]$ dcmanager subcloud update --install-values + +For a new subcloud deployment, use the :command:`dcmanager subcloud add` +command with the install-values.yaml file containing the desired +**persistent_size** value. .. 
rubric:: |proc| @@ -162,6 +185,10 @@ subcloud, the subcloud installation has these phases: # rootfs_device: "/dev/disk/by-path/pci-0000:00:1f.2-ata-1.0" # boot_device: "/dev/disk/by-path/pci-0000:00:1f.2-ata-1.0" + # Set the value for the persistent file system (/opt/platform-backup). + # The value must be a whole number (in MB) that is greater than or equal + # to 30000. + persistent_size: 30000 #. At the System Controller, create a ``/home/sysadmin/subcloud1-bootstrap-values.yaml`` overrides file for the diff --git a/doc/source/dist_cloud/kubernetes/installing-and-provisioning-a-subcloud.rst b/doc/source/dist_cloud/kubernetes/installing-and-provisioning-a-subcloud.rst index c5903c069..6daa26d7a 100644 --- a/doc/source/dist_cloud/kubernetes/installing-and-provisioning-a-subcloud.rst +++ b/doc/source/dist_cloud/kubernetes/installing-and-provisioning-a-subcloud.rst @@ -26,3 +26,6 @@ Platform Management Service. .. include:: /_includes/installing-and-provisioning-a-subcloud.rest :start-after: begin-shared-nic :end-before: end-shared-nic + +For more information on subcloud deployment with local installation, see +:ref:`subcloud-deployment-with-local-installation-4982449058d5` \ No newline at end of file diff --git a/doc/source/dist_cloud/kubernetes/prestage-a-subcloud-using-dcmanager-df756866163f.rst b/doc/source/dist_cloud/kubernetes/prestage-a-subcloud-using-dcmanager-df756866163f.rst new file mode 100644 index 000000000..4de3b2d74 --- /dev/null +++ b/doc/source/dist_cloud/kubernetes/prestage-a-subcloud-using-dcmanager-df756866163f.rst @@ -0,0 +1,261 @@ +.. _prestage-a-subcloud-using-dcmanager-df756866163f: + +=================== +Prestage a Subcloud +=================== + +Before you start an |AIO-SX| subcloud upgrade or reinstall for the purpose of +restoring the subcloud, the subcloud can be prestaged with software packages +and container image archives outside the maintenance window using the dcmanager +CLI.
The prestaged data is stored in the subcloud persistent file systems +``/opt/platform-backup/``. This data will be used when the subcloud +is reinstalled next. + +Where, the `` number is the active load of the System Controller. + +.. note:: + + Only |AIO-SX| subclouds can be prestaged using the dcmanager CLI. + +For information on prestaging a batch of subclouds, see, +:ref:`prestage-subcloud-orchestration-eb516473582f`. + +.. rubric:: |context| + +The main steps of this task are: + +#. Ensure prestaging prerequisites are met, see :ref:`prestaging-prereqs`. + +#. Upload the list of container images to prestage. This step is relevant to + upgrade and must be performed after the System Controller + has been upgraded. See :ref:`Upload Prestage Image List `. + +#. Use dcmanager commands to prestage the subcloud(s). + +To increase Subcloud Platform Backup Size using dcmanager CLI, see +:ref:`increase-subcloud-platform-backup-size`. + +.. _prestaging-prereqs: + +----------------------- +Prestaging Requirements +----------------------- + +.. rubric:: |prereq| + +Prestaging can be done for a single subcloud or a batch of subclouds via +orchestration. See :ref:`prestage-subcloud-orchestration-eb516473582f`. + +There are two types of subcloud prestage: + +- **Prestage for upgrade**: when the subcloud is running a different (older) + load than the System Controller at the time of prestaging. + +- **Prestage for reinstall**: When the subcloud is running the same load as the + System Controller at the time of prestaging. + + .. note:: + Only |AIO-SX| subclouds can be prestaged using the dcmanager CLI. + +**Pre-conditions common to both types of prestage**: + +- Subclouds to be prestaged must be |AIO-SX|, online, managed and free + of any management affecting alarms. + + .. note:: + + You can force prestaging using ``--force`` option. However, + it is not recommended unless it is certain that the prestaging + process will not exacerbate the alarm condition on the subcloud. 
+ +- Subcloud ``/opt/platform-backup`` must have enough available disk space + for prestage data. + +- Subcloud ``/var/lib/docker`` must have enough space for all prestage + image pulls and archive file generation. If the total size of prestage + images is N GB, available Docker space should be N*2 GB. + +.. warning:: + + If the available docker space is inadequate, some application pods can get + evicted due to temporary disk pressure during the prestaging process. The + cert-manager application will fail subcloud upgrade if its evicted pods are + not cleaned up. + +**Pre-conditions specific to prestage for upgrade**: + +- The total size of prestage images and custom images restored over upgrade + must not exceed docker-distribution capacity. + +- Prestage images must already exist in the configured source(s) prior to + subcloud prestaging. For example, if the subcloud is configured to + download images from the central registry, the specified images must + already exist in the registry on the System Controller. + +.. _prestaging-image-list: + +-------------------------- +Upload Prestage Image List +-------------------------- + +The prestage image list specifies what container images are to be pulled from +the configured sources and included in the image archive files during prestaging. +This list is only used if the prestage is intended for subcloud upgrade, that is, +the System Controller and subclouds are running different loads at the time of +prestaging. + +The prestage image list must contain: + +- Images required for subcloud platform upgrade. + +- Images required for the restore and update of |prod-long| applications + currently applied on the subcloud, for example, cert-manager, |OIDC|, and + metrics-server. + +.. only:: partner + + .. 
include:: /_includes/prestage-a-subcloud-using-dcmanager-df756866163f.rest + :start-after: prestage-image-begin + :end-before: prestage-image-end + +If the available docker and docker-distribution storage is ample, prestage +image list should also contain: + +- (Optional) Images required for Kubernetes version upgrades post subcloud upgrade. + +- (Optional) Images required for the update of end users' Helm applications + post subcloud upgrade. + +.. note:: + + It is required to determine the total size of all images to be prestaged + in advance. Too many images can result in subcloud upgrade failure due to + docker-distribution (local registry) out of space error. + See the Prerequisites section above for more details. + +.. rubric:: |proc| + +#. To upload the prestage image list, use the following command after the + System Controller has been upgraded. + + .. code-block:: none + + ~(keystone_admin)]$ dcmanager subcloud-deploy upload --prestage-images nn.nn_images.lst + + +------------------+-----------------+ + | Field | Value | + +------------------+-----------------+ + |deploy_playbook | None | + |deploy_overrides | None | + |deploy_chart | None | + |prestage_images | nn.nn_images.lst| + +------------------+-----------------+ + + Where, the name of the prestage image file can be user defined. However, + it is recommended to use the following format `_images.lst`, + for example, `<21.12_images.lst>`. + +#. To confirm that the image list has been uploaded, use the following command. + + .. code-block:: none + + ~(keystone_admin)]$ dcmanager subcloud-deploy show + + +------------------+-------------------------+ + | Field | Value | + +------------------+-------------------------+ + | deploy_playbook | None | + | deploy_overrides | None | + | deploy_chart | None | + | prestage_images | nn.nn_images.lst | + +------------------+-------------------------+ + +.. 
warning:: + + As prestage images will be pulled from Docker registries currently + configured for the subcloud, images in the image list file must not contain + custom/private registry prefix. + +.. only:: partner + + .. include:: /_includes/prestage-a-subcloud-using-dcmanager-df756866163f.rest + :start-after: image-list-begin + :end-before: image-list-end + +------------------------ +Single Subcloud Prestage +------------------------ + +See :ref:`prestaging-prereqs` for preconditions prior to prestaging the subcloud. + +.. code-block:: none + + ~(keystone_admin)]$ dcmanager subcloud prestage subcloud2 + + Enter the sysadmin password for the subcloud: + Re-enter sysadmin password to confirm: + + +-----------------------------+----------------------------+ + | Field | Value | + +-----------------------------+----------------------------+ + | id | 2 | + | name | subcloud2 | + | description | None | + | location | None | + | software_version | nn.nn | + | management | managed | + | availability | online | + | deploy_status | prestage-prepare | + | management_subnet | 2620:10a:a001:ac01::20/123 | + | management_start_ip | 2620:10a:a001:ac01::22 | + | management_end_ip | 2620:10a:a001:ac01::3e | + | management_gateway_ip | 2620:10a:a001:ac01::21 | + | systemcontroller_gateway_ip | 2620:10a:a001:a113::1 | + | group_id | 3 | + | created_at | 2202-03-18 20:31:16.548903 | + | updated_at | 2202-03-22 18:55:56:251643 | + +-----------------------------+----------------------------+ + +----------------------- +Rerun Subcloud Prestage +----------------------- + +A subcloud can be prestaged multiple times. However, only prestaging images +will be repeated. Once packages prestaging is successful, this step will be +skipped in subsequent prestage reruns for the same software version. + +------------------------ +Verify Subcloud Prestage +------------------------ + +After a subcloud is successfully prestaged, the ``deploy_status`` will change to +``prestage-complete``. 
Use the :command:`dcmanager subcloud show` command to +verify the status. The packages directory, repodata directory, container +image bundles, and md5 file can be found on the subcloud in +``/opt/platform-backup/``. + +Where, the `` number is the active load of the System Controller. + +------------------------------ +Troubleshoot Subcloud Prestage +------------------------------ + +If the subcloud prestage fails, check ``/var/log/dcmanager/dcmanager.log`` +for the reason for the failure. Once the issue has been resolved, prestage can be +retried using the :command:`dcmanager subcloud prestage` command. + +--------------------------------- +Verifying Usage of Prestaged Data +--------------------------------- + +To verify that the prestaged data is used over subcloud upgrade, subcloud +reinstall, or subcloud remote restore: + +- Search for the subcloud name, for example, subcloud1, in the log file + ``/www/var/log/lighttpd-access.log``. There should not be + GET requests to download packages from ``/iso//nodes/subcloud1/Packages/``. + +- Check the subcloud ansible log in the ``/var/log/dcmanager/ansible`` directory. + Images are imported from local archives and no images in the prestage image + list need to be downloaded from configured sources. + diff --git a/doc/source/dist_cloud/kubernetes/prestage-subcloud-orchestration-eb516473582f.rst b/doc/source/dist_cloud/kubernetes/prestage-subcloud-orchestration-eb516473582f.rst new file mode 100644 index 000000000..34ae50f5a --- /dev/null +++ b/doc/source/dist_cloud/kubernetes/prestage-subcloud-orchestration-eb516473582f.rst @@ -0,0 +1,201 @@ +.. _prestage-subcloud-orchestration-eb516473582f: + +=============================== +Prestage Subcloud Orchestration +=============================== + +This section describes the prestage strategy for a single subcloud, the default +subcloud group, or a specific subcloud group. + +.. 
rubric:: |prereq| + +For more information on prerequisites for prestage upgrade and reinstall, see +:ref:`prestage-a-subcloud-using-dcmanager-df756866163f`. + + + .. note:: + + Any existing strategy must be deleted first as only one type + of strategy can exist at a time. + + .. only:: partner + + .. include:: /_includes/prestage-subcloud-orchestration-eb516473582f.rest + :start-after: strategy-begin + :end-before: strategy-end + +.. rubric:: |proc| + +#. Create a prestage strategy. + + A prestage strategy can be created for a single subcloud, the default + subcloud group (all subclouds), or a specific subcloud group. + + To create a prestage strategy for a specific subcloud, use the following + command: + + .. code-block:: none + + ~(keystone_admin)]$ dcmanager prestage-strategy create subcloud1 + + Enter the sysadmin password for the subcloud: + Re-enter sysadmin password to confirm: + + +-----------------------------+----------------------------+ + | Field | Value | + +-----------------------------+----------------------------+ + | id | 1 | + | name | subcloud1 | + | description | None | + | location | False | + | software_version | nn.nn | + | management | managed | + | availability | online | + | deploy_status | prestage-prepare | + | management_subnet | 2620:10a:a001:ac01::20/123 | + | management_start_ip | 2620:10a:a001:ac01::22 | + | management_end_ip | 2620:10a:a001:ac01::3e | + | management_gateway_ip | 2620:10a:a001:ac01::21 | + | systemcontroller_gateway_ip | 2620:10a:a001:a113::1 | + | group_id | 3 | + | created_at | 2202-03-18 20:31:16.548903 | + | updated_at | 2202-03-22 18:55:56:251643 | + +-----------------------------+----------------------------+ + + To create a prestage strategy for the default subcloud group, use the + following command: + + .. 
code-block:: none + + ~(keystone_admin)]$ dcmanager prestage-strategy create + Enter the sysadmin password for the subcloud: + Re-enter sysadmin password to confirm: + + +------------------------+-----------------------------+ + | Field | Value | + +------------------------+-----------------------------+ + | strategy type | prestage | + | subcloud apply type | parallel | + | max parallel subclouds | 50 | + | stop on failure | False | + | state | initial | + | created_at | 2202-03-22T18:54:45.037336 | + | updated_at | None | + +------------------------+-----------------------------+ + + To create a prestage strategy for a specific subcloud group, use the + following command: + + .. code-block:: none + + ~(keystone_admin)]$ dcmanager prestage-strategy create --group First_10_Subclouds + + Enter the sysadmin password for the subcloud: + Re-enter sysadmin password to confirm: + + +------------------------+-----------------------------+ + | Field | Value | + +------------------------+-----------------------------+ + | strategy type | prestage | + | subcloud apply type | parallel | + | max parallel subclouds | 10 | + | stop on failure | False | + | state | initial | + | created_at | 2202-03-22T18:54:45.037336 | + | updated_at | None | + +------------------------+-----------------------------+ + + .. note:: + + Unlike other types of orchestration, prestage orchestration requires + the sysadmin password as all communications with the subclouds are done + using ansible over the OAM network to avoid disruptions to management + traffic. + +#. Apply the strategy. + + .. 
code-block:: none + + ~(keystone_admin)]$ dcmanager prestage-strategy apply + + +------------------------+-----------------------------+ + | Field | Value | + +------------------------+-----------------------------+ + | strategy type | prestage | + | subcloud apply type | None | + | max parallel subclouds | None | + | stop on failure | False | + | state | applying | + | created_at | 2202-03-22T18:33:20:100712 | + | updated_at | 2202-03-22T18:36:03.895542 | + +------------------------+-----------------------------+ + +#. Monitor the progress of the strategy. + + .. code-block:: none + + ~(keystone_admin)]$ dcmanager strategy-step list + + +-----------+-------+---------------------+---------+----------------------------+-------------+ + | cloud | stage | state | details | started_at | finished_at | + +-----------+-------+---------------------+---------+----------------------------+-------------+ + | subcloud1 | 1 | prestaging-packages | | 2202-03-22 18:55:11.523970 | None | + +-----------+-------+---------------------+---------+----------------------------+-------------+ + +#. (Optional) Abort the strategy, if required. + + The abort command can be used to abort the prestage orchestration strategy + after the current step of the currently applying state is completed. + +#. Delete the strategy. + + .. 
code-block:: none + + ~(keystone_admin)]$ dcmanager prestage-strategy delete + + +------------------------+-----------------------------+ + | Field | Value | + +------------------------+-----------------------------+ + | strategy type | prestage | + | subcloud apply type | None | + | max parallel subclouds | None | + | stop on failure | False | + | state | deleting | + | created_at | 2202-03-22T19:09:03.576053 | + | updated_at | 2202-03-22T19:09:09.436732 | + +------------------------+-----------------------------+ + +-------------------------------------------- +Troubleshoot Subcloud Prestage Orchestration +-------------------------------------------- + +If an orchestrated prestage fails for a subcloud, check the log specified in +the error message for reasons of failure. After the issue has been resolved, +prestage can be retried using one of the following options: + +.. rubric:: |proc| + +- Run :command:`dcmanager subcloud prestage` command on the failed subcloud. + +- Create a subcloud group, for example, ``prestage-retry``, add the failed + subcloud(s) to group ``prestage-retry``, and finally create and apply the + prestage strategy for the group. + + .. warning:: + + Do not retry orchestration with an existing group unless the subclouds + that have been successfully prestaged are removed from the group. + Otherwise, prestage will be repeated for ALL subclouds in the group. 
+ +For more information on the following, see +:ref:`prestage-a-subcloud-using-dcmanager-df756866163f` + +- Upload Prestage Image List + +- Single Subcloud Prestage + +- Rerun Subcloud Prestage + +- Verify Subcloud Prestage + +- Verifying Usage of Prestaged Data diff --git a/doc/source/dist_cloud/kubernetes/restoring-subclouds-from-backupdata-using-dcmanager.rst b/doc/source/dist_cloud/kubernetes/restoring-subclouds-from-backupdata-using-dcmanager.rst index a4d3f109f..f7b0a543e 100644 --- a/doc/source/dist_cloud/kubernetes/restoring-subclouds-from-backupdata-using-dcmanager.rst +++ b/doc/source/dist_cloud/kubernetes/restoring-subclouds-from-backupdata-using-dcmanager.rst @@ -9,6 +9,9 @@ For subclouds with servers that support Redfish Virtual Media Service (version 1.2 or higher), you can use the Central Cloud's CLI to restore the subcloud from data that was backed up previously. +Before you start an |AIO-SX| subcloud upgrade or reinstall for the purpose of +restoring the subcloud, see :ref:`prestage-a-subcloud-using-dcmanager-df756866163f`. + .. rubric:: |context| The CLI command :command:`dcmanager subcloud restore` can be used to restore a