From 3733f8a967aef8dc3da0d853e80f209ef1bb027e Mon Sep 17 00:00:00 2001 From: Juanita-Balaraj Date: Thu, 27 May 2021 23:26:11 -0400 Subject: [PATCH] Rehoming a Subcloud Updated Patchset 5 Updated Patchset 4 Updated Patchset 3 Updated Patchset 2 Updated Patchset 1 Signed-off-by: Juanita-Balaraj Change-Id: Ie17aeb732c265bc2e5f72ed78a64b26905f7e022 Signed-off-by: Juanita-Balaraj --- doc/.vscode/settings.json | 3 + ...g-redfish-platform-management-service.rest | 5 +- ...t-redfish-platform-management-service.rest | 4 + doc/source/_includes/rehoming-a-subcloud.rest | 3 + doc/source/dist_cloud/index.rst | 2 + ...ng-redfish-platform-management-service.rst | 12 +- ...ut-redfish-platform-management-service.rst | 15 +- ...installing-and-provisioning-a-subcloud.rst | 108 -------- doc/source/dist_cloud/rehoming-a-subcloud.rst | 235 ++++++++++++++++++ ...th-redfish-platform-management-service.rst | 111 +++++++++ 10 files changed, 382 insertions(+), 116 deletions(-) create mode 100644 doc/.vscode/settings.json create mode 100644 doc/source/_includes/rehoming-a-subcloud.rest create mode 100644 doc/source/dist_cloud/rehoming-a-subcloud.rst create mode 100644 doc/source/dist_cloud/reinstalling-a-subcloud-with-redfish-platform-management-service.rst diff --git a/doc/.vscode/settings.json b/doc/.vscode/settings.json new file mode 100644 index 000000000..3cce948f6 --- /dev/null +++ b/doc/.vscode/settings.json @@ -0,0 +1,3 @@ +{ + "restructuredtext.confPath": "" +} \ No newline at end of file diff --git a/doc/source/_includes/installing-a-subcloud-using-redfish-platform-management-service.rest b/doc/source/_includes/installing-a-subcloud-using-redfish-platform-management-service.rest index 617e3141e..803cde16c 100644 --- a/doc/source/_includes/installing-a-subcloud-using-redfish-platform-management-service.rest +++ b/doc/source/_includes/installing-a-subcloud-using-redfish-platform-management-service.rest @@ -2,4 +2,7 @@ .. end-context .. begin-prereqs -.. 
end-prereqs \ No newline at end of file +.. end-prereqs + +.. begin-ref-1 +.. end-ref-1 \ No newline at end of file diff --git a/doc/source/_includes/installing-a-subcloud-without-redfish-platform-management-service.rest b/doc/source/_includes/installing-a-subcloud-without-redfish-platform-management-service.rest index e69de29bb..d42cedd76 100644 --- a/doc/source/_includes/installing-a-subcloud-without-redfish-platform-management-service.rest +++ b/doc/source/_includes/installing-a-subcloud-without-redfish-platform-management-service.rest @@ -0,0 +1,4 @@ + + +.. begin-ref-1 +.. end-ref-1 \ No newline at end of file diff --git a/doc/source/_includes/rehoming-a-subcloud.rest b/doc/source/_includes/rehoming-a-subcloud.rest new file mode 100644 index 000000000..fde54f527 --- /dev/null +++ b/doc/source/_includes/rehoming-a-subcloud.rest @@ -0,0 +1,3 @@ + +.. rehoming-begin +.. rehoming-end \ No newline at end of file diff --git a/doc/source/dist_cloud/index.rst b/doc/source/dist_cloud/index.rst index fcf2b80f7..15e18bee9 100644 --- a/doc/source/dist_cloud/index.rst +++ b/doc/source/dist_cloud/index.rst @@ -28,6 +28,7 @@ Installation installing-and-provisioning-a-subcloud installing-a-subcloud-using-redfish-platform-management-service installing-a-subcloud-without-redfish-platform-management-service + reinstalling-a-subcloud-with-redfish-platform-management-service --------- Operation @@ -50,6 +51,7 @@ Operation updating-docker-registry-credentials-on-a-subcloud migrate-an-aiosx-subcloud-to-an-aiodx-subcloud restoring-subclouds-from-backupdata-using-dcmanager + rehoming-a-subcloud ---------------------- Manage Subcloud Groups diff --git a/doc/source/dist_cloud/installing-a-subcloud-using-redfish-platform-management-service.rst b/doc/source/dist_cloud/installing-a-subcloud-using-redfish-platform-management-service.rst index 3f624ab77..5e22dbf9a 100644 --- a/doc/source/dist_cloud/installing-a-subcloud-using-redfish-platform-management-service.rst +++ 
b/doc/source/dist_cloud/installing-a-subcloud-using-redfish-platform-management-service.rst @@ -28,6 +28,7 @@ subcloud, the subcloud installation has these phases: .. note:: + After a successful remote installation of a subcloud in a Distributed Cloud system, a subsequent remote reinstallation fails because of an existing ssh key entry in the /root/.ssh/known\_hosts on the System Controller. In this @@ -50,6 +51,7 @@ subcloud, the subcloud installation has these phases: command\). .. note:: + This is required only once and does not have to be done for every subcloud install. @@ -72,10 +74,8 @@ subcloud, the subcloud installation has these phases: #. At the subcloud location, physically install the servers and network connectivity required for the subcloud. -.. See |inst-doc|: :ref:`Preparing Servers ` for more - information. - .. note:: + Do not power off the servers. The host portion of the server can be powered off, but the |BMC| portion of the server must be powered and accessible from the System Controller. @@ -83,16 +83,22 @@ subcloud, the subcloud installation has these phases: There is no need to wipe the disks. .. note:: + The servers require connectivity to a gateway router that provides IP routing between the subcloud management subnet and the System Controller management subnet, and between the subcloud |OAM| subnet and the System Controller subnet. + .. include:: ../_includes/installing-a-subcloud-using-redfish-platform-management-service.rest + :start-after: begin-ref-1 + :end-before: end-ref-1 + #. Create the install-values.yaml file and use the content to pass the file into the :command:`dcmanager subcloud add` command, using the :command:`--install-values` command option. .. 
note:: + If your controller is on a ZTSystems Triton server that requires a longer timeout value, you can now use the rd.net.timeout.ipv6dad dracut parameter to specify an increased timeout value for dracut to wait for diff --git a/doc/source/dist_cloud/installing-a-subcloud-without-redfish-platform-management-service.rst b/doc/source/dist_cloud/installing-a-subcloud-without-redfish-platform-management-service.rst index 038ccb96b..ea42def4b 100644 --- a/doc/source/dist_cloud/installing-a-subcloud-without-redfish-platform-management-service.rst +++ b/doc/source/dist_cloud/installing-a-subcloud-without-redfish-platform-management-service.rst @@ -30,6 +30,7 @@ subcloud, the subcloud installation process has two phases: .. note:: + After a successful remote installation of a subcloud in a Distributed Cloud system, a subsequent remote reinstallation fails because of an existing ssh key entry in the /root/.ssh/known\_hosts on the System Controller. In this @@ -51,14 +52,17 @@ subcloud, the subcloud installation process has two phases: #. At the subcloud location, physically install the servers and network connectivity required for the subcloud. -.. See |inst-doc|: :ref:`Preparing Servers `. - .. note:: + The servers require connectivity to a gateway router that provides IP routing between the subcloud management subnet and the System Controller management subnet, and between the subcloud OAM subnet and the System Controller subnet. + .. include:: ../_includes/installing-a-subcloud-without-redfish-platform-management-service.rest + :start-after: begin-ref-1 + :end-before: end-ref-1 + #. Update the ISO image to modify installation boot parameters \(if required\), automatically select boot menu options and add a kickstart file to automatically perform configurations such as configuring the initial IP @@ -132,13 +136,15 @@ subcloud, the subcloud installation process has two phases: #. 
At the subcloud location, install the |prod| software from a USB device or a |PXE| Boot Server on the server designated as controller-0. - See |inst-doc| instructions for preparing servers. + .. include:: ../_includes/installing-a-subcloud-without-redfish-platform-management-service.rest + :start-after: begin-ref-1 + :end-before: end-ref-1 #. At the subcloud location, verify that the |OAM| interface on the subcloud controller has been properly configured by the kickstart file added to the ISO. - Log in to the subcloud's controller-0 and ping the Central Cloud's floating +#. Log in to the subcloud's controller-0 and ping the Central Cloud's floating |OAM| IP Address. #. At the System Controller, create a @@ -203,6 +209,7 @@ subcloud, the subcloud installation process has two phases: password: .. note:: + If you have a reason not to use the Central Cloud's local registry you can pull the images from another local private docker registry. diff --git a/doc/source/dist_cloud/installing-and-provisioning-a-subcloud.rst b/doc/source/dist_cloud/installing-and-provisioning-a-subcloud.rst index c89853f74..dc220e074 100644 --- a/doc/source/dist_cloud/installing-and-provisioning-a-subcloud.rst +++ b/doc/source/dist_cloud/installing-and-provisioning-a-subcloud.rst @@ -26,111 +26,3 @@ Platform Management Service. .. include:: ../_includes/installing-and-provisioning-a-subcloud.rest :start-after: begin-shared-nic :end-before: end-shared-nic - -------------------------------------------------------------- -Reinstall a Subcloud with Redfish Platform Management Service -------------------------------------------------------------- - -For subclouds with servers that support Redfish Virtual Media Service -\(version 1.2 or higher\), you can use the Central Cloud's CLI to re-install -the ISO and bootstrap subclouds from the Central Cloud. - -.. caution:: - - All application and data on the subcloud will be lost after re-installation. - -.. 
rubric:: |context| - -The subcloud reinstallation has two phases: - -Executing the dcmanager subcloud reinstall command in the Central Cloud: - -- Uses Redfish Virtual Media Service to remote install the ISO on controller-0 - in the subcloud. - -- Uses Ansible to bootstrap |prod| on controller-0. - -.. rubric:: |prereq| - -- The install values are required for subcloud reinstallation. By default, - install values are stored in the database after a subcloud installation or - upgrade, and the reinstallation will re-use the install values. If you want - to update the install values, use the following CLI command in the Central - Cloud. - - .. code-block:: none - - ~(keystone_admin)]$ dcmanager subcloud update subcloud1 --install-values install-values.yaml --bmc-password - - For more information on install-values.yaml file, see :ref:`Installing a Subcloud Using Redfish Platform Management Service - `. - - You can only reinstall the same software version with the Central Cloud on - the subcloud. - -- Check the subcloud's availability in the Central Cloud, for example, - - .. code-block:: none - - ~(keystone_admin)]$ dcmanager subcloud list - - +----+----------+------------+--------------+---------------+---------+ - | id | name | management | availability | deploy status | sync | - +----+----------+------------+--------------+---------------+---------+ - | 1 | subcloud1| unmanaged | offline | complete | unknown | - +----+----------+------------+--------------+---------------+---------+ - - As the reinstall will cause data and application loss, it is not necessary - and not recommended to reinstall a healthy subcloud. The dcmanager rejects - the reinstallation of a managed or online subcloud. - -.. rubric:: |proc| - -#. Execute the reinstall using the CLI. For example, - - .. code-block:: none - - ~(keystone_admin)]$ dcmanager subcloud reinstall subcloud1 - -#. Confirm the reinstall of the subcloud. 
- - You are prompted to enter ``reinstall`` to confirm the reinstallation. - - .. warning:: - - This will reinstall the subcloud. All applications and data on the - subcloud will be lost. - - Please type ``reinstall`` to confirm: reinstall - - Any other input will abort the reinstallation. - -#. At the Central Cloud, monitor the progress of the subcloud installation - and bootstrapping by using the deploy status field of the dcmanager - subcloud list command, for example, - - .. code-block:: none - - ~(keystone_admin)]$ dcmanager subcloud list - - +----+-----------+------------+--------------+---------------+---------+ - | id | name | management | availability | deploy status | sync | - +----+-----------+------------+--------------+---------------+---------+ - | 1 | subcloud1 | unmanaged | offline | installing | unknown | - +----+-----------+------------+--------------+---------------+---------+ - - For more information on the deploy status filed, see :ref:`Installing a Subcloud Using Redfish Platform Management Service - `. - - You can also monitor detailed logging of the subcloud installation, - bootstrapping by monitoring the following log files on the active - controller in the Central Cloud. - - - /var/log/dcmanager/subcloud_name_install_date_stamp.log - - /var/log/dcmanager/subcloud_name_bootstrap_date_stamp.log - -#. After the subcloud is successfully reinstalled and bootstrapped, use the - following command to reconfigure the subcloud, **subcloud reconfig**. - For more information, see :ref:`Managing Subclouds Using the CLI - `. - diff --git a/doc/source/dist_cloud/rehoming-a-subcloud.rst b/doc/source/dist_cloud/rehoming-a-subcloud.rst new file mode 100644 index 000000000..ba23097f8 --- /dev/null +++ b/doc/source/dist_cloud/rehoming-a-subcloud.rst @@ -0,0 +1,235 @@ + +.. 
_rehoming-a-subcloud: + +================= +Rehome a Subcloud +================= + +|release-caveat| + +When the System Controller needs to be reinstalled, or when the subclouds from +multiple System Controllers are being consolidated into a single System +Controller, you can add already deployed subclouds to a different System +Controller using the rehoming playbook. + +.. note:: + + The rehoming playbook does not work with freshly installed/bootstrapped + subclouds. + +Use the following procedure to enable subcloud rehoming and to update the new +subcloud configuration \(networking parameters, passwords, etc.\) to be +compatible with the new System Controller. + +.. rubric:: |context| + +There are six phases for rehoming a subcloud: + + +#. Unmanage the subcloud from the previous System Controller. + + .. note:: + + You can skip this phase if the previous System Controller is no longer + running or is unable to connect to the subcloud. + +#. Update the admin password on the subcloud to match the new System + Controller, if required. + +#. Run the :command:`dcmanager subcloud add` command with the ``--migrate`` option on + the new System Controller. This will update the System Controller and + connect to the subcloud to update the appropriate configuration parameters. + +#. On the subcloud, lock/unlock the subcloud controller(s) to enable the new + configuration. + +#. Use the :command:`dcmanager subcloud list` command to check the status + of the subcloud, and ensure the subcloud is online and complete before managing + the subcloud. + +#. On the new System Controller, set the subcloud to "managed" and wait for it + to sync. + +.. rubric:: |prereq| + +- Ensure that the subcloud management subnet, oam_floating_address, + oam_node_0_address and oam_node_1_address \(if applicable\) do not overlap + addresses already being used by the new System Controller or any of its + subclouds. 
+ +- Ensure that the subcloud has been backed up, in case something goes wrong + and a subcloud system recovery is required. + +- Transfer the yaml file that was used to bootstrap the subcloud prior to + rehoming, to the new System Controller. This data is required for rehoming. + +- If the subcloud can be remotely installed via Redfish Virtual Media service, + transfer the yaml file that contains the install data for this subcloud, + and use this install data in the new System Controller, via the + ``--install-values`` option, when running the remote subcloud reinstall, + upgrade or restore commands. + + +.. rubric:: |proc| + +#. If the previous System Controller is running, use the following command to + ensure that it does not try to change subcloud configuration while you are + modifying it to be compatible with the new System Controller. + + .. code-block:: none + + ~(keystone_admin)]$ dcmanager subcloud unmanage + +#. Ensure that the subcloud's bootstrap values file is available on the new + System Controller. If required, in the subcloud's bootstrap values file + update the **systemcontroller_gateway_address** entry to point to the + appropriate network gateway for the new System Controller to communicate + with the subcloud. + +#. If the admin password of the subcloud does not match the admin password of + the new System Controller, use the following command to change the subcloud + admin password. This step is done on the subcloud that is being migrated. + + .. code-block:: none + + ~(keystone_admin)]$ openstack user password set + + .. note:: + + You will need to specify the old and the new password. + +#. For an |AIO-DX| subcloud, ensure that the active controller is + controller-0. Perform a host-swact of the active controller \(controller-1\) + to make controller-0 active. + + .. code-block:: none + + ~(keystone_admin)]$ system host-swact controller-1 + +#. Lock controller-1 of the subcloud. + + .. 
code-block:: none + + ~(keystone_admin)]$ system host-lock controller-1 + + +#. On the new System Controller, use the following command to start the + rehoming process. + + .. code-block:: none + + ~(keystone_admin)]$ dcmanager subcloud add --migrate --bootstrap-address --bootstrap-values [--install-values ] + + The subcloud deploy status will change to "pre-rehome" and, if the + preliminary steps complete successfully, it will change to "rehoming". + At this point an Ansible playbook will run and update the appropriate + configuration data in the subcloud. You can query the status by running the + :command:`dcmanager subcloud show` command. Once the subcloud has been + updated, the subcloud deploy status will change to "complete". + + .. note:: + + The ``--install-values`` option is optional. It is not mandatory for subcloud + rehoming. However, you can opt to save these values in the new System + Controller as part of the rehoming process so that future operations + that involve remote reinstallation of the subcloud (e.g. reinstall, + upgrade, restore) can be performed for a rehomed subcloud similar to + other existing Redfish-capable subclouds in the Distributed Cloud. + + **Delete the "image:" line from the install-values file, if it exists, so + that the image is correctly located based on the new System Controller + configuration**. + + +#. For an |AIO-SX| subcloud, use the following commands to lock/unlock the + controller \(wait for the lock to complete before unlocking the controller\). + + .. code-block:: none + + ~(keystone_admin)]$ system host-lock controller-0 + ~(keystone_admin)]$ system host-unlock controller-0 + + For an |AIO-DX| subcloud, first, use the following command to unlock + controller-1. + + .. code-block:: none + + ~(keystone_admin)]$ system host-unlock controller-1 + + #. Wait until controller-1 is unlocked/online/available, then use the + following command to switch activity to controller-1. + + .. 
code-block:: none + + ~(keystone_admin)]$ system host-swact controller-0 + + #. After the swact is complete, use the following commands to lock/unlock + controller-0. + + .. code-block:: none + + ~(keystone_admin)]$ system host-lock controller-0 + ~(keystone_admin)]$ system host-unlock controller-0 + + #. Wait until controller-0 is unlocked/online/available, then switch + activity back to controller-0. + + #. Perform a swact on controller-1. + + .. code-block:: none + + ~(keystone_admin)]$ system host-swact controller-1 + + Wait until the swact is complete. + +#. Use the :command:`dcmanager subcloud list` command to display the status of + the subcloud, and ensure the subcloud is online and complete before + managing the subcloud. + + .. code-block:: none + + ~(keystone_admin)]$ dcmanager subcloud list + + +----+-----------+------------+--------------+---------------+---------+ + | id | name | management | availability | deploy status | sync | + +----+-----------+------------+--------------+---------------+---------+ + | 1 | subcloud1 | unmanaged | online | complete | unknown | + +----+-----------+------------+--------------+---------------+---------+ + +#. Use the following command to "manage" the subcloud. This is executed on the + new System Controller. + + .. code-block:: none + + ~(keystone_admin)]$ dcmanager subcloud manage + +#. The new System Controller will audit the subcloud and determine whether it + is in sync with the System Controller. + +.. only:: partner + + .. include:: /_includes/rehoming-a-subcloud.rest + :start-after: rehoming-begin + :end-before: rehoming-end + +**Error Recovery** + +If the subcloud rehoming process begins successfully (the status changes to +"rehoming") but a transient fault prevents the rehoming step from completing +successfully, then manual error recovery is required. + +The first stage of error recovery is to delete the subcloud from +the new System Controller and re-attempt rehoming using the following commands: + +.. 
code-block:: none + + ~(keystone_admin)]$ dcmanager subcloud delete + ~(keystone_admin)]$ dcmanager subcloud add --migrate --bootstrap-address --bootstrap-values [--install-values ] + +If the second attempt fails, it is recommended to contact Wind River Customer +Support at https://www.windriver.com/support. + +If all attempts fail, restore the subcloud from backups, which will revert the +subcloud to the original state prior to rehoming. + + diff --git a/doc/source/dist_cloud/reinstalling-a-subcloud-with-redfish-platform-management-service.rst b/doc/source/dist_cloud/reinstalling-a-subcloud-with-redfish-platform-management-service.rst new file mode 100644 index 000000000..02ee69b6d --- /dev/null +++ b/doc/source/dist_cloud/reinstalling-a-subcloud-with-redfish-platform-management-service.rst @@ -0,0 +1,111 @@ + + +.. _reinstalling-a-subcloud-with-redfish-platform-management-service: + +============================================================= +Reinstall a Subcloud with Redfish Platform Management Service +============================================================= + +For subclouds with servers that support Redfish Virtual Media Service +\(version 1.2 or higher\), you can use the Central Cloud's CLI to re-install +the ISO and bootstrap subclouds from the Central Cloud. + +.. caution:: + + All applications and data on the subcloud will be lost after re-installation. + +.. rubric:: |context| + +The subcloud reinstallation has two phases: + +Executing the dcmanager subcloud reinstall command in the Central Cloud: + +- Uses Redfish Virtual Media Service to remote install the ISO on controller-0 + in the subcloud. + +- Uses Ansible to bootstrap |prod| on controller-0. + +.. rubric:: |prereq| + +- The install values are required for subcloud reinstallation. By default, + install values are stored in the database after a subcloud installation or + upgrade, and the reinstallation will re-use the install values. 
If you want + to update the install values, use the following CLI command in the Central + Cloud. + + .. code-block:: none + + ~(keystone_admin)]$ dcmanager subcloud update subcloud1 --install-values install-values.yaml --bmc-password + + For more information on the install-values.yaml file, see :ref:`Installing a Subcloud Using Redfish Platform Management Service + `. + + You can only reinstall the subcloud with the same software version as the + Central Cloud. + +- Check the subcloud's availability in the Central Cloud, for example: + + .. code-block:: none + + ~(keystone_admin)]$ dcmanager subcloud list + + +----+----------+------------+--------------+---------------+---------+ + | id | name | management | availability | deploy status | sync | + +----+----------+------------+--------------+---------------+---------+ + | 1 | subcloud1| unmanaged | offline | complete | unknown | + +----+----------+------------+--------------+---------------+---------+ + + As the reinstall will cause data and application loss, it is not necessary + and not recommended to reinstall a healthy subcloud. The dcmanager rejects + the reinstallation of a managed or online subcloud. + +.. rubric:: |proc| + +#. Execute the reinstall using the CLI. For example: + + .. code-block:: none + + ~(keystone_admin)]$ dcmanager subcloud reinstall subcloud1 + +#. Confirm the reinstall of the subcloud. + + You are prompted to enter ``reinstall`` to confirm the reinstallation. + + .. warning:: + + This will reinstall the subcloud. All applications and data on the + subcloud will be lost. + + Please type ``reinstall`` to confirm: reinstall + + Any other input will abort the reinstallation. + +#. At the Central Cloud, monitor the progress of the subcloud installation + and bootstrapping by using the deploy status field of the dcmanager + subcloud list command, for example: + + .. 
code-block:: none + + ~(keystone_admin)]$ dcmanager subcloud list + + +----+-----------+------------+--------------+---------------+---------+ + | id | name | management | availability | deploy status | sync | + +----+-----------+------------+--------------+---------------+---------+ + | 1 | subcloud1 | unmanaged | offline | installing | unknown | + +----+-----------+------------+--------------+---------------+---------+ + + For more information on the deploy status field, see :ref:`Installing a Subcloud Using Redfish Platform Management Service + `. + + You can also monitor detailed logging of the subcloud installation and + bootstrapping by monitoring the following log files on the active + controller in the Central Cloud. + + - /var/log/dcmanager/subcloud_name_install_date_stamp.log + - /var/log/dcmanager/subcloud_name_bootstrap_date_stamp.log + +#. After the subcloud is successfully reinstalled and bootstrapped, use the + following command to reconfigure the subcloud, **subcloud reconfig**. + For more information, see :ref:`Managing Subclouds Using the CLI + `. + 