From ac4d8fea449771da0fced6294be78b4dcbecd3d5 Mon Sep 17 00:00:00 2001
From: Adil
Date: Wed, 26 May 2021 14:59:08 -0300
Subject: [PATCH] Node Management and Distributed Cloud guide updates

Global pass upgrades. Added content from emails attached to the ticket and
to SharePoint.

Patch 01: inputs from email by Greg
Patch 03: created a new section for subcloud groups; updated Table 1,
          shared system configurations
Patch 04: corrected typos (Mary's comments)
Patch 05: resolved merge conflict
Patch 06: removed broken link

Story: TBD
Task: TBD

Signed-off-by: Adil
Change-Id: I60b0a40a60a44d30429cd3a4dd8374c16345951a
---
 doc/source/cli_ref/system.rst                 |  2 +-
 .../r3_release/distributed_cloud/index.rst    | 24 +++----
 .../r4_release/distributed_cloud/index.rst    | 22 +++---
 .../r5_release/distributed_cloud/index.rst    | 22 +++---
 ...anagement-for-admin-rest-api-endpoints.rst | 18 ++---
 ...he-admin-password-on-distributed-cloud.rst |  2 +-
 .../distributed-cloud-architecture.rst        | 10 +--
 .../distributed-cloud-ports-reference.rst     | 24 +++----
 ...de-orchestration-process-using-the-cli.rst | 41 ++++-------
 ...ta-migration-of-n+1-load-on-a-subcloud.rst | 50 +-------------
 ...installation-of-n+1-load-on-a-subcloud.rst |  4 +-
 doc/source/dist_cloud/index.rst               | 12 ++++
 ...ng-redfish-platform-management-service.rst | 18 ++---
 ...ut-redfish-platform-management-service.rst |  2 +-
 ...installing-and-provisioning-a-subcloud.rst |  2 +-
 ...user-accounts-on-the-system-controller.rst |  6 +-
 ...andling-during-an-orchestrated-upgrade.rst | 14 ++--
 .../dist_cloud/shared-configurations.rst      |  4 +-
 ...ker-registry-credentials-on-a-subcloud.rst | 14 ++--
 .../upgrade-management-overview.rst           | 68 +++++--------------
 ...ing-the-systemcontroller-using-the-cli.rst | 49 +++++--------
 ...ting-a-master-controller-using-horizon.rst |  6 +-
 ...ting-a-master-controller-using-the-cli.rst |  6 +-
 .../configuring-cpu-core-assignments.rst      |  2 +-
 .../displaying-worker-host-information.rst    |  3 +-
 ...or-hosted-vram-containerized-workloads.rst |  3 +-
 .../n3000-overview.rst                        |  8 +--
 .../set-up-pods-to-use-sriov.rst              |  6 +-
 .../showing-details-for-an-fpga-device.rst    |  6 +-
 .../updating-an-intel-n3000-fpga-image.rst    |  6 +-
 ...hardware-components-for-a-storage-host.rst |  5 +-
 .../kubernetes/host_inventory/hosts-tab.rst   |  2 +-
 .../kubernetes/host_inventory/lldp-tab.rst    |  4 +-
 .../node_management/kubernetes/index.rst      |  2 +-
 ...ated-ethernet-interfaces-using-the-cli.rst | 14 ----
 ...-ip-address-provisioning-using-the-cli.rst | 33 +++++++--
 .../interface-provisioning.rst                |  2 +-
 .../node_interfaces/interface-settings.rst    |  2 +-
 ...entication-setup-for-distributed-cloud.rst | 18 ++---
 ...-an-orchestrated-upgrade-using-the-cli.rst | 12 ++--
 .../performing-an-orchestrated-upgrade.rst    |  6 +-
 41 files changed, 235 insertions(+), 319 deletions(-)

diff --git a/doc/source/cli_ref/system.rst b/doc/source/cli_ref/system.rst
index 8cf236941..49dffc64e 100644
--- a/doc/source/cli_ref/system.rst
+++ b/doc/source/cli_ref/system.rst
@@ -937,7 +937,7 @@
 The following set of commands allow you to update the Intel N3000 |FPGA|
 |PAC| user image on StarlingX hosts. For more information, see
-:doc:`N3000 Overview `.
+:doc:`N3000 FPGA Overview `.

 ``host-device-image-update``

diff --git a/doc/source/deploy_install_guides/r3_release/distributed_cloud/index.rst b/doc/source/deploy_install_guides/r3_release/distributed_cloud/index.rst
index 38747b2b0..1b7e1b0fc 100644
--- a/doc/source/deploy_install_guides/r3_release/distributed_cloud/index.rst
+++ b/doc/source/deploy_install_guides/r3_release/distributed_cloud/index.rst
@@ -46,7 +46,7 @@ Distributed cloud architecture
 ------------------------------

 A distributed cloud system consists of a central cloud, and one or more
-subclouds connected to the SystemController region central cloud over L3
+subclouds connected to the System Controller region central cloud over L3
 networks, as shown in Figure 1.

 - **Central cloud**

@@ -65,13 +65,13 @@ networks, as shown in Figure 1.
   In the Horizon GUI, SystemController is the name of the access mode, or
   region, used to manage the subclouds.

-  You can use the SystemController to add subclouds, synchronize select
+  You can use the System Controller to add subclouds, synchronize select
   configuration data across all subclouds and monitor subcloud operations
   and alarms. System software updates for the subclouds are also centrally
-  managed and applied from the SystemController.
+  managed and applied from the System Controller.

   DNS, NTP, and other select configuration settings are centrally managed
-  at the SystemController and pushed to the subclouds in parallel to
+  at the System Controller and pushed to the subclouds in parallel to
   maintain synchronization across the distributed cloud.

 - **Subclouds**

@@ -81,7 +81,7 @@ networks, as shown in Figure 1.
   (including simplex, duplex, or standard with or without storage nodes),
   can be used for a subcloud. The two edge clouds shown in Figure 1 are
   subclouds.
-  Alarms raised at the subclouds are sent to the SystemController for
+  Alarms raised at the subclouds are sent to the System Controller for
   central reporting.

 .. figure:: ../figures/starlingx-deployment-options-distributed-cloud.png

@@ -95,21 +95,21 @@ networks, as shown in Figure 1.
 Network requirements
 --------------------

-Subclouds are connected to the SystemController through both the OAM and the
+Subclouds are connected to the System Controller through both the OAM and the
 Management interfaces. Because each subcloud is on a separate L3 subnet, the
 OAM, Management and PXE boot L2 networks are local to the subclouds. They are
 not connected via L2 to the central cloud, they are only connected via L3
-routing. The settings required to connect a subcloud to the SystemController
+routing. The settings required to connect a subcloud to the System Controller
 are specified when a subcloud is defined.
 A gateway router is required to complete the L3 connections, which will
 provide IP routing between the
-subcloud Management and OAM IP subnet and the SystemController Management and
-OAM IP subnet, respectively. The SystemController bootstraps the subclouds via
+subcloud Management and OAM IP subnet and the System Controller Management and
+OAM IP subnet, respectively. The System Controller bootstraps the subclouds via
 the OAM network, and manages them via the management network. For more
 information, see the `Install a Subcloud`_ section later in this guide.

 .. note::

-    All messaging between SystemControllers and Subclouds uses the ``admin``
+    All messaging between System Controllers and Subclouds uses the ``admin``
     REST API service endpoints which, in this distributed cloud environment,
     are all configured for secure HTTPS. Certificates for these HTTPS
     connections are managed internally by StarlingX.

@@ -159,12 +159,12 @@ At the subcloud location:
 2. Physically install the top of rack switch and configure it for the
    required networks.
 3. Physically install the gateway routers which will provide IP routing
-   between the subcloud OAM and Management subnets and the SystemController
+   between the subcloud OAM and Management subnets and the System Controller
    OAM and management subnets.
 4. On the server designated for controller-0, install the StarlingX
    Kubernetes software from USB or a PXE Boot server.
-5. Establish an L3 connection to the SystemController by enabling the OAM
+5. Establish an L3 connection to the System Controller by enabling the OAM
    interface (with OAM IP/subnet) on the subcloud controller using the
    ``config_management`` script. This step is for subcloud ansible bootstrap
    preparation.
diff --git a/doc/source/deploy_install_guides/r4_release/distributed_cloud/index.rst b/doc/source/deploy_install_guides/r4_release/distributed_cloud/index.rst
index 0703c48e3..60d66006e 100644
--- a/doc/source/deploy_install_guides/r4_release/distributed_cloud/index.rst
+++ b/doc/source/deploy_install_guides/r4_release/distributed_cloud/index.rst
@@ -65,13 +65,13 @@ networks, as shown in Figure 1.
   In the Horizon GUI, SystemController is the name of the access mode, or
   region, used to manage the subclouds.

-  You can use the SystemController to add subclouds, synchronize select
+  You can use the System Controller to add subclouds, synchronize select
   configuration data across all subclouds and monitor subcloud operations
   and alarms. System software updates for the subclouds are also centrally
-  managed and applied from the SystemController.
+  managed and applied from the System Controller.

   DNS, NTP, and other select configuration settings are centrally managed
-  at the SystemController and pushed to the subclouds in parallel to
+  at the System Controller and pushed to the subclouds in parallel to
   maintain synchronization across the distributed cloud.

 - **Subclouds**

@@ -81,7 +81,7 @@ networks, as shown in Figure 1.
   (including simplex, duplex, or standard with or without storage nodes),
   can be used for a subcloud. The two edge clouds shown in Figure 1 are
   subclouds.
-  Alarms raised at the subclouds are sent to the SystemController for
+  Alarms raised at the subclouds are sent to the System Controller for
   central reporting.

 .. figure:: ../figures/starlingx-deployment-options-distributed-cloud.png

@@ -95,21 +95,21 @@ networks, as shown in Figure 1.
 Network requirements
 --------------------

-Subclouds are connected to the SystemController through both the OAM and the
+Subclouds are connected to the System Controller through both the OAM and the
 Management interfaces. Because each subcloud is on a separate L3 subnet, the
 OAM, Management and PXE boot L2 networks are local to the subclouds. They are
 not connected via L2 to the central cloud, they are only connected via L3
-routing. The settings required to connect a subcloud to the SystemController
+routing. The settings required to connect a subcloud to the System Controller
 are specified when a subcloud is defined.

 A gateway router is required to complete the L3 connections, which will
 provide IP routing between the
-subcloud Management and OAM IP subnet and the SystemController Management and
-OAM IP subnet, respectively. The SystemController bootstraps the subclouds via
+subcloud Management and OAM IP subnet and the System Controller Management and
+OAM IP subnet, respectively. The System Controller bootstraps the subclouds via
 the OAM network, and manages them via the management network. For more
 information, see the `Install a Subcloud`_ section later in this guide.

 .. note::

-    All messaging between SystemControllers and Subclouds uses the ``admin``
+    All messaging between System Controllers and Subclouds uses the ``admin``
     REST API service endpoints which, in this distributed cloud environment,
     are all configured for secure HTTPS. Certificates for these HTTPS
     connections are managed internally by StarlingX.

@@ -159,12 +159,12 @@ At the subcloud location:
 2. Physically install the top of rack switch and configure it for the
    required networks.
 3. Physically install the gateway routers which will provide IP routing
-   between the subcloud OAM and Management subnets and the SystemController
+   between the subcloud OAM and Management subnets and the System Controller
    OAM and management subnets.
 4. On the server designated for controller-0, install the StarlingX
    Kubernetes software from USB or a PXE Boot server.
-5. Establish an L3 connection to the SystemController by enabling the OAM
+5. Establish an L3 connection to the System Controller by enabling the OAM
    interface (with OAM IP/subnet) on the subcloud controller using the
    ``config_management`` script. This step is for subcloud ansible bootstrap
    preparation.

diff --git a/doc/source/deploy_install_guides/r5_release/distributed_cloud/index.rst b/doc/source/deploy_install_guides/r5_release/distributed_cloud/index.rst
index 26ec39518..60d2c867a 100644
--- a/doc/source/deploy_install_guides/r5_release/distributed_cloud/index.rst
+++ b/doc/source/deploy_install_guides/r5_release/distributed_cloud/index.rst
@@ -65,13 +65,13 @@ networks, as shown in Figure 1.
   In the Horizon GUI, SystemController is the name of the access mode, or
   region, used to manage the subclouds.

-  You can use the SystemController to add subclouds, synchronize select
+  You can use the System Controller to add subclouds, synchronize select
   configuration data across all subclouds and monitor subcloud operations
   and alarms. System software updates for the subclouds are also centrally
-  managed and applied from the SystemController.
+  managed and applied from the System Controller.

   DNS, NTP, and other select configuration settings are centrally managed
-  at the SystemController and pushed to the subclouds in parallel to
+  at the System Controller and pushed to the subclouds in parallel to
   maintain synchronization across the distributed cloud.

 - **Subclouds**

@@ -81,7 +81,7 @@ networks, as shown in Figure 1.
   (including simplex, duplex, or standard with or without storage nodes),
   can be used for a subcloud. The two edge clouds shown in Figure 1 are
   subclouds.
-  Alarms raised at the subclouds are sent to the SystemController for
+  Alarms raised at the subclouds are sent to the System Controller for
   central reporting.

 .. figure:: ../figures/starlingx-deployment-options-distributed-cloud.png

@@ -95,21 +95,21 @@ networks, as shown in Figure 1.
 Network requirements
 --------------------

-Subclouds are connected to the SystemController through both the OAM and the
+Subclouds are connected to the System Controller through both the OAM and the
 Management interfaces. Because each subcloud is on a separate L3 subnet, the
 OAM, Management and PXE boot L2 networks are local to the subclouds. They are
 not connected via L2 to the central cloud, they are only connected via L3
-routing. The settings required to connect a subcloud to the SystemController
+routing. The settings required to connect a subcloud to the System Controller
 are specified when a subcloud is defined.

 A gateway router is required to complete the L3 connections, which will
 provide IP routing between the
-subcloud Management and OAM IP subnet and the SystemController Management and
-OAM IP subnet, respectively. The SystemController bootstraps the subclouds via
+subcloud Management and OAM IP subnet and the System Controller Management and
+OAM IP subnet, respectively. The System Controller bootstraps the subclouds via
 the OAM network, and manages them via the management network. For more
 information, see the `Install a Subcloud`_ section later in this guide.

 .. note::

-    All messaging between SystemControllers and Subclouds uses the ``admin``
+    All messaging between System Controllers and Subclouds uses the ``admin``
     REST API service endpoints which, in this distributed cloud environment,
     are all configured for secure HTTPS. Certificates for these HTTPS
     connections are managed internally by StarlingX.

@@ -159,12 +159,12 @@ At the subcloud location:
 2. Physically install the top of rack switch and configure it for the
    required networks.
 3. Physically install the gateway routers which will provide IP routing
-   between the subcloud OAM and Management subnets and the SystemController
+   between the subcloud OAM and Management subnets and the System Controller
    OAM and management subnets.
 4. On the server designated for controller-0, install the StarlingX
    Kubernetes software from USB or a PXE Boot server.
-5. Establish an L3 connection to the SystemController by enabling the OAM
+5. Establish an L3 connection to the System Controller by enabling the OAM
    interface (with OAM IP/subnet) on the subcloud controller using the
    ``config_management`` script. This step is for subcloud ansible bootstrap
    preparation.

diff --git a/doc/source/dist_cloud/certificate-management-for-admin-rest-api-endpoints.rst b/doc/source/dist_cloud/certificate-management-for-admin-rest-api-endpoints.rst
index e9705b42e..28be24661 100644
--- a/doc/source/dist_cloud/certificate-management-for-admin-rest-api-endpoints.rst
+++ b/doc/source/dist_cloud/certificate-management-for-admin-rest-api-endpoints.rst
@@ -6,7 +6,7 @@
 Certificate Management for Admin REST API Endpoints
 ===================================================

-All messaging between SystemControllers and Subclouds in the |prod-dc|
+All messaging between System Controllers and Subclouds in the |prod-dc|
 system uses the admin REST API service endpoints, which are all configured
 for secure HTTPS.

@@ -19,9 +19,9 @@ endpoints.
 .. certificate-management-for-admin-rest--api-endpoints-section-lkn-ypk-xnb:

-------------------------------------
-Certificates on the SystemController
-------------------------------------
+-------------------------------------
+Certificates on the System Controller
+-------------------------------------

 In a |prod-dc| system, the HTTPS certificates for admin endpoints are
 managed by |prod| internally.

@@ -29,7 +29,7 @@ managed by |prod| internally.
 .. note::
     All renewal operations are automatic, and no user operation is required.

-For admin endpoints, the SystemControllers in a |prod-dc| system
+For admin endpoints, the System Controllers in a |prod-dc| system
 manages the following certificates:

@@ -39,7 +39,7 @@ manages the following certificates:
   \(approximately 5 years\).
   Renewal of this certificate starts 30 days prior to expiry.

-  The Root |CA| certificate is renewed on the SystemController. When the
+  The Root |CA| certificate is renewed on the System Controller. When the
   certificate is renewed, |prod| renews the intermediate |CA| certificates
   for all subclouds.

@@ -66,7 +66,7 @@ certificates:
 .. certificate-management-for-admin-rest--api-endpoints-ul-x51-3qk-xnb:

 - **DC-AdminEp-Intermediate-CA certificate**: The intermediate CA certificate
-  for a subcloud is renewed on the SystemController. It is sent to the
+  for a subcloud is renewed on the System Controller. It is sent to the
   subcloud using a Rest API. Therefore, a subcloud needs to be online to
   receive the renewed certificate.

@@ -84,9 +84,9 @@ certificates:
 generated. The new |TLS| certificate is used to provide |TLS| termination.

-The SystemController audits subcloud AdminEp certificates daily. It also audits
+The System Controller audits subcloud AdminEp certificates daily. It also audits
 subcloud admin endpoints when a subcloud becomes online or managed. If the
-subcloud admin endpoint is "out-of-sync", the SystemController initiates
+subcloud admin endpoint is "out-of-sync", the System Controller initiates
 intermediate |CA| certificate renewal, to force subcloud renewal of the admin
 endpoint certificate.

diff --git a/doc/source/dist_cloud/changing-the-admin-password-on-distributed-cloud.rst b/doc/source/dist_cloud/changing-the-admin-password-on-distributed-cloud.rst
index 1e40ed5e1..546eeb219 100644
--- a/doc/source/dist_cloud/changing-the-admin-password-on-distributed-cloud.rst
+++ b/doc/source/dist_cloud/changing-the-admin-password-on-distributed-cloud.rst
@@ -22,7 +22,7 @@ Ensure that all subclouds are managed and online.
     System Controller.

-    - In the SystemController context, select **Identity** \> **Users**.
+    - In the System Controller context, select **Identity** \> **Users**.
       Select **Change Password** from the **Edit** menu for the Admin user.
     - From the |CLI|:

diff --git a/doc/source/dist_cloud/distributed-cloud-architecture.rst b/doc/source/dist_cloud/distributed-cloud-architecture.rst
index 031c4de7c..f79cffe0d 100644
--- a/doc/source/dist_cloud/distributed-cloud-architecture.rst
+++ b/doc/source/dist_cloud/distributed-cloud-architecture.rst
@@ -10,14 +10,14 @@
 A |prod-dc| system consists of a Central Cloud and one or more subclouds
 connected to the Central Cloud over L3 networks.

 The Central Cloud has two regions: RegionOne, used to manage the nodes in the
-Central Cloud, and SystemController, used to manage the subclouds in the
-|prod-dc| system. You can select RegionOne or SystemController regions from the
+Central Cloud, and System Controller, used to manage the subclouds in the
+|prod-dc| system. You can select RegionOne or System Controller regions from the
 Horizon Web interface or by setting the environment variable if using the CLI.

 **Central Cloud**

     The Central Cloud provides a RegionOne region for managing the physical
-    platform of the Central Cloud and the SystemController region for managing
+    platform of the Central Cloud and the System Controller region for managing
     and orchestrating over the subclouds.

     The Central Cloud does not support worker hosts. All worker functions are

@@ -29,7 +29,7 @@ if using the CLI.
 **System Controller**

     The System Controller access mode, or region, for managing subclouds is
-    SystemController.
+    System Controller.

     You can use the System Controller to add subclouds, synchronize select
     configuration data across all subclouds and monitor subcloud operations and

@@ -95,7 +95,7 @@ if using the CLI.
     L3 connections. The routers must be configured independently according to
     OEM instructions.

-    All messaging between SystemControllers and Subclouds uses the **admin**
+    All messaging between System Controllers and Subclouds uses the **admin**
     REST API service endpoints which, in this distributed cloud environment,
     are all configured for secure HTTPS.
     Certificates for these HTTPS connections are managed internally by |prod|.

diff --git a/doc/source/dist_cloud/distributed-cloud-ports-reference.rst b/doc/source/dist_cloud/distributed-cloud-ports-reference.rst
index 25cd03327..70f617340 100644
--- a/doc/source/dist_cloud/distributed-cloud-ports-reference.rst
+++ b/doc/source/dist_cloud/distributed-cloud-ports-reference.rst
@@ -37,27 +37,27 @@ function correctly.
 +----------+-------+----------------------------+--------------------------------------------------+-------------------------------------+-----------------------------------------+
 | tcp      | 6386  | sysinv-api                 | System Controller                                | Subclouds                           |                                         |
 +----------+-------+----------------------------+--------------------------------------------------+-------------------------------------+-----------------------------------------+
-| tcp      | 6443  | K8s API server             | Not used between SystemController and Subclouds  |                                     |                                         |
+| tcp      | 6443  | K8s API server             | Not used between System Controller and Subclouds |                                     |                                         |
 +----------+-------+----------------------------+--------------------------------------------------+-------------------------------------+-----------------------------------------+
-| tcp      | 7778  | stx-ha                     | Not used between SystemController and Subclouds  |                                     |                                         |
+| tcp      | 7778  | stx-ha                     | Not used between System Controller and Subclouds |                                     |                                         |
 +----------+-------+----------------------------+--------------------------------------------------+-------------------------------------+-----------------------------------------+
-| tcp      | 8443  | horizon https              | Not used between SystemController and Subclouds  |                                     |                                         |
+| tcp      | 8443  | horizon https              | Not used between System Controller and Subclouds |                                     |                                         |
 +----------+-------+----------------------------+--------------------------------------------------+-------------------------------------+-----------------------------------------+
-| tcp      | 8080  | horizon http               | Not used between SystemController and Subclouds  | Not required if using https         |                                         |
+| tcp      | 8080  | horizon http               | Not used between System Controller and Subclouds | Not required if using https         |                                         |
 +----------+-------+----------------------------+--------------------------------------------------+-------------------------------------+-----------------------------------------+
-| tcp      | 8119  | stx-distcloud              | Not used between SystemController and Subclouds  | dcmanager-api                       |                                         |
+| tcp      | 8119  | stx-distcloud              | Not used between System Controller and Subclouds | dcmanager-api                       |                                         |
 +----------+-------+----------------------------+--------------------------------------------------+-------------------------------------+-----------------------------------------+
-| tcp      | 15491 | stx-update                 | Not used between SystemController and Subclouds  | only required for system controller |                                         |
+| tcp      | 15491 | stx-update                 | Not used between System Controller and Subclouds | only required for system controller |                                         |
 +----------+-------+----------------------------+--------------------------------------------------+-------------------------------------+-----------------------------------------+
-| tcp      | 18003 | stx-fault                  | SystemController                                 | Subclouds                           |                                         |
+| tcp      | 18003 | stx-fault                  | System Controller                                | Subclouds                           |                                         |
 +----------+-------+----------------------------+--------------------------------------------------+-------------------------------------+-----------------------------------------+
 | icmp     | icmp  |                            |                                                  |                                     |                                         |
 +----------+-------+----------------------------+--------------------------------------------------+-------------------------------------+-----------------------------------------+
-| tcp      | 9312  | barbican                   | Not used between SystemController and Subclouds  |                                     |                                         |
+| tcp      | 9312  | barbican                   | Not used between System Controller and Subclouds |                                     |                                         |
 +----------+-------+----------------------------+--------------------------------------------------+-------------------------------------+-----------------------------------------+
-| udp      | 319   | PTP                        | Not used between SystemController and Subclouds  |                                     |                                         |
+| udp      | 319   | PTP                        | Not used between System Controller and Subclouds |                                     |                                         |
 +----------+-------+----------------------------+--------------------------------------------------+-------------------------------------+-----------------------------------------+
-| udp      | 320   | PTP                        | Not used between SystemController and Subclouds  |                                     |                                         |
+| udp      | 320   | PTP                        | Not used between System Controller and Subclouds |                                     |                                         |
 +----------+-------+----------------------------+--------------------------------------------------+-------------------------------------+-----------------------------------------+
 | tcp/udp  | 636   | LDAPS                      | Subcloud                                         | Windows AD server                   |                                         |
 +----------+-------+----------------------------+--------------------------------------------------+-------------------------------------+-----------------------------------------+
@@ -67,7 +67,7 @@ function correctly.
 +----------+-------+----------------------------+--------------------------------------------------+-------------------------------------+-----------------------------------------+
 | tcp/udp  | 30556 | DEC OIDC Provider          | Subcloud                                         |                                     |                                         |
 +----------+-------+----------------------------+--------------------------------------------------+-------------------------------------+-----------------------------------------+
-| tcp      | 8220  | Dist. cloud                | SystemController                                 | Subclouds                           | dcdbsync-api                            |
+| tcp      | 8220  | Dist. cloud                | System Controller                                | Subclouds                           | dcdbsync-api                            |
 +----------+-------+----------------------------+--------------------------------------------------+-------------------------------------+-----------------------------------------+
 | tcp      | 31001 | Elastic \(using NodePort\) | Subcloud                                         | DC                                  |                                         |
 +----------+-------+----------------------------+--------------------------------------------------+-------------------------------------+-----------------------------------------+
@@ -77,6 +77,6 @@ function correctly.
 +----------+-------+----------------------------+--------------------------------------------------+-------------------------------------+-----------------------------------------+
 | udp      | 162   | snmp trap                  | Subcloud                                         | DC                                  |                                         |
 +----------+-------+----------------------------+--------------------------------------------------+-------------------------------------+-----------------------------------------+
-| tcp      | 8443  | https                      | Not used between SystemController and Subclouds  |                                     |                                         |
+| tcp      | 8443  | https                      | Not used between System Controller and Subclouds |                                     |                                         |
 +----------+-------+----------------------------+--------------------------------------------------+-------------------------------------+-----------------------------------------+

diff --git a/doc/source/dist_cloud/distributed-upgrade-orchestration-process-using-the-cli.rst b/doc/source/dist_cloud/distributed-upgrade-orchestration-process-using-the-cli.rst
index 9b026796a..ccc7e5d5b 100644
--- a/doc/source/dist_cloud/distributed-upgrade-orchestration-process-using-the-cli.rst
+++ b/doc/source/dist_cloud/distributed-upgrade-orchestration-process-using-the-cli.rst
@@ -7,9 +7,8 @@
 Distributed Upgrade Orchestration Process Using the CLI
 =======================================================

 Distributed upgrade orchestration can be initiated after the upgrade and
-stability of the SystemController cloud. Upgrade orchestration automatically
-iterates through each of the subclouds, installing the new software load on
-each one.
+stability of the System Controller cloud; that is, it can be initiated only
+after the System Controller has been successfully upgraded.

 .. rubric:: |context|

@@ -26,8 +25,8 @@ using parameters to specify:

 - whether to upgrade hosts serially or in parallel

-Based on these parameters, and the state of the subclouds, distributed upgrade
-orchestration creates a number of stages for the overall upgrade strategy. All
+Based on these parameters, and the state of the subclouds, the upgrade
+orchestrator creates a number of stages for the overall upgrade strategy. All
 the subclouds that are included in the same stage will be upgraded in parallel.

 .. rubric:: |prereq|

@@ -45,7 +44,7 @@ following conditions:

 - Redfish |BMC| is required for orchestrated subcloud upgrades. The install
   values, and :command:`bmc\_password` for each |AIO-SX| subcloud controller
-  must be provided using the following |CLI| command on the SystemController:
+  must be provided using the following |CLI| command on the System Controller:

   .. code-block:: none

@@ -56,14 +55,14 @@ following conditions:
   :ref:`Installing a Subcloud Using Redfish Platform Management Service
   `.

-- All subclouds are clear of alarms \(with the exception of the alarm upgrade
-  in progress\).
+- All subclouds are clear of management-affecting alarms \(with the exception
+  of the upgrade-in-progress alarm\).

 - All hosts of all subclouds must be unlocked, enabled, and available.

-- No distributed update orchestration strategy exists, to verify use the
-  command :command:`dcmanager upgrade-stratagy-show`. An upgrade cannot be
-  orchestrated while update orchestration is in progress.
+- No distributed upgrade orchestration strategy exists; to verify, use the
+  command :command:`dcmanager upgrade-strategy-show`. An upgrade cannot be
+  orchestrated while upgrade orchestration is in progress.

 - Verify the size and format of the platform-backup filesystem on each
   subcloud. From the shell on each subcloud, use the following command to view

@@ -90,7 +89,7 @@ following conditions:

 #. Review the upgrade status for the subclouds.

-   After the SystemController upgrade is completed, wait for 10 minutes for
+   After the System Controller upgrade is completed, wait for 10 minutes for
    the **load\_sync\_status** of all subclouds to be updated.

    To identify which subclouds are upgrade-current \(in-sync\), use the

@@ -313,22 +312,10 @@ following conditions:

 .. _distributed-upgrade-orchestration-process-using-the-cli-ul-lx1-zcv-3mb:

-- Check and update docker registry credentials for **ALL** subclouds. For
-  each subcloud:
-
-  .. code-block:: none
-
-     REGISTRY="docker-registry"
-     SECRET_UUID=`system service-parameter-list | fgrep
-     $REGISTRY | fgrep auth-secret | awk '{print $10}'`
-     SECRET_REF=`openstack secret list | fgrep ${SECRET_UUID}|
-     awk '{print $2}'`
-     openstack secret get ${SECRET_REF} --payload -f value
-
-  The secret payload should be, "username: sysinv password:". If
-  the secret payload is, "username: admin password:", see,
-  :ref:`Updating Docker Registry Credentials on a Subcloud
-  ` for more information.
+The secret payload should be "username: sysinv password:". If
+the secret payload is "username: admin password:", see
+:ref:`Update Docker Registry Credentials on a Subcloud
+` for more information.

 .. only:: partner

diff --git a/doc/source/dist_cloud/failure-during-the-installation-or-data-migration-of-n+1-load-on-a-subcloud.rst b/doc/source/dist_cloud/failure-during-the-installation-or-data-migration-of-n+1-load-on-a-subcloud.rst
index 0e0bd05b1..0052aae5b 100644
--- a/doc/source/dist_cloud/failure-during-the-installation-or-data-migration-of-n+1-load-on-a-subcloud.rst
+++ b/doc/source/dist_cloud/failure-during-the-installation-or-data-migration-of-n+1-load-on-a-subcloud.rst
@@ -23,8 +23,6 @@ Errors can occur due to one of the following:

 - A network error that results in the subcloud's being temporarily
   unreachable

-- An invalid docker registry certificate

 **Failure Caused by Install Values**

@@ -35,8 +33,8 @@ command to fix it.

    ~(keystone_admin)]$ dcmanager subcloud update <> --install-values <>

-This type of failure is recoverable and you can rerun the upgrade strategy for
-the failed subcloud\(s\) using the following procedure:
+This type of failure is recoverable and you can retry the orchestrated
+upgrade for each of the failed subclouds using the following procedure:

 .. rubric:: |proc|

@@ -74,50 +72,6 @@ the failed subcloud\(s\) using the following procedure:

 .. _failure-during-the-installation-or-data-migration-of-n+1-load-on-a-subcloud-section-f5f-j1y-qmb:

------------------------------------------------------
-Failure Caused by Invalid Docker Registry Certificate
------------------------------------------------------
-
-If the docker registry certificate on the subcloud is invalid/expired prior to
-an upgrade, the upgrade will fail during data migration.
-
-.. warning::
-
-    This type of failure cannot be recovered. You will need to re-deploy the
-    subcloud, redo all configuration changes, and regenerate the data.
-
-.. note::
-
-    Ensure that the docker registry certificate on all subclouds must be
-    upgraded prior to performing an orchestrated upgrade.
-
-To re-deploy the subcloud, use the following procedure:
-
-.. rubric:: |proc|
-
-.. _failure-during-the-installation-or-data-migration-of-n+1-load-on-a-subcloud-ol-dpp-bzr-qmb:
-
-#. Unmanage the failed subcloud.
-
-   .. code-block:: none
-
-      ~(keystone_admin)]$ dcmanager subcloud unmanage <>
-
-#. Delete the subcloud.
-
-   .. code-block:: none
-
-      ~(keystone_admin)]$ dcmanager subcloud delete <>
-
-#. Re-deploy the failed subcloud.
-
-   .. code-block:: none
-
-      ~(keystone_admin)]$ dcmanager subcloud add <>

 .. _failure-during-the-installation-or-data-migration-of-n+1-load-on-a-subcloud-section-lj4-1rr-qmb:

 -----------------------------------------
 Failure Post Data Migration on a Subcloud
 -----------------------------------------

diff --git a/doc/source/dist_cloud/failure-prior-to-the-installation-of-n+1-load-on-a-subcloud.rst b/doc/source/dist_cloud/failure-prior-to-the-installation-of-n+1-load-on-a-subcloud.rst
index 7c72ed9a7..310288a40 100644
--- a/doc/source/dist_cloud/failure-prior-to-the-installation-of-n+1-load-on-a-subcloud.rst
+++ b/doc/source/dist_cloud/failure-prior-to-the-installation-of-n+1-load-on-a-subcloud.rst
@@ -26,8 +26,8 @@ Errors can occur due to any one of the following:

 - The /home/sysadmin directory on the subcloud is too large

-If you encounter any of the above errors, use the following procedure to fix
-it:
+If you encounter any of the above errors, follow this procedure to retry the
+orchestrated upgrade after addressing the cause of failure:

 .. rubric:: |proc|

diff --git a/doc/source/dist_cloud/index.rst b/doc/source/dist_cloud/index.rst
index 56bca6cfb..fcf2b80f7 100644
--- a/doc/source/dist_cloud/index.rst
+++ b/doc/source/dist_cloud/index.rst
@@ -51,6 +51,18 @@ Operation

    migrate-an-aiosx-subcloud-to-an-aiodx-subcloud
    restoring-subclouds-from-backupdata-using-dcmanager

+----------------------
+Manage Subcloud Groups
+----------------------
+
+..
toctree:: + :maxdepth: 1 + :caption: Contents: + + managing-subcloud-groups + creating-subcloud-groups + ochestration-strategy-using-subcloud-groups + ------------------------- Update (Patch) management ------------------------- diff --git a/doc/source/dist_cloud/installing-a-subcloud-using-redfish-platform-management-service.rst b/doc/source/dist_cloud/installing-a-subcloud-using-redfish-platform-management-service.rst index d1c23f400..3f624ab77 100644 --- a/doc/source/dist_cloud/installing-a-subcloud-using-redfish-platform-management-service.rst +++ b/doc/source/dist_cloud/installing-a-subcloud-using-redfish-platform-management-service.rst @@ -30,20 +30,20 @@ subcloud, the subcloud installation has these phases: .. note:: After a successful remote installation of a subcloud in a Distributed Cloud system, a subsequent remote reinstallation fails because of an existing ssh - key entry in the /root/.ssh/known\_hosts on the SystemController. In this + key entry in the /root/.ssh/known\_hosts on the System Controller. In this case, delete the host key entry, if present, from /root/.ssh/known\_hosts - on the SystemController before doing reinstallations. + on the System Controller before doing reinstallations. .. rubric:: |prereq| .. _installing-a-subcloud-using-redfish-platform-management-service-ul-g5j-3f3-qjb: -- The docker **rvmc** image needs to be added to the SystemController +- The docker **rvmc** image needs to be added to the System Controller bootstrap override file, docker.io/starlingx/rvmc:stx.5.0-v1.0.0. - A new system CLI option ``--active`` is added to the :command:`load-import` command to allow the import into the - SystemController /opt/dc-vault/loads. The purpose of this is to allow + System Controller /opt/dc-vault/loads. The purpose of this is to allow Redfish install of subclouds referencing a single full copy of the **bootimage.iso** at /opt/dc-vault/loads. 
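The shared-load import that the ``--active`` option enables can be sketched as a small pre-flight check. This is a hedged illustration only: the /home/sysadmin file paths and the exact two-argument form of :command:`load-import` are assumptions, and the import command itself is left commented because it only works on a live System Controller.

```shell
# Hedged sketch: stage the N+1 release on the System Controller so that
# Redfish subcloud installs share a single copy under /opt/dc-vault/loads.
# The /home/sysadmin paths are assumptions for illustration.
ISO=/home/sysadmin/bootimage.iso
SIG=/home/sysadmin/bootimage.sig

if [ -f "$ISO" ] && [ -f "$SIG" ]; then
    echo "ready to import"
    # On a live System Controller the import itself would be:
    # system load-import --active "$ISO" "$SIG"
else
    echo "missing ISO or signature file" >&2
fi
```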
\(Previously, the full **bootimage.iso** was duplicated for each :command:`subcloud add` @@ -78,15 +78,15 @@ subcloud, the subcloud installation has these phases: .. note:: Do not power off the servers. The host portion of the server can be powered off, but the |BMC| portion of the server must be powered and - accessible from the SystemController. + accessible from the System Controller. There is no need to wipe the disks. .. note:: The servers require connectivity to a gateway router that provides IP - routing between the subcloud management subnet and the SystemController + routing between the subcloud management subnet and the System Controller management subnet, and between the subcloud |OAM| subnet and the - SystemController subnet. + System Controller subnet. #. Create the install-values.yaml file and use the content to pass the file into the :command:`dcmanager subcloud add` command, using the @@ -156,7 +156,7 @@ subcloud, the subcloud installation has these phases: # boot_device: "/dev/disk/by-path/pci-0000:00:1f.2-ata-1.0" -#. At the SystemController, create a +#. At the System Controller, create a /home/sysadmin/subcloud1-bootstrap-values.yaml overrides file for the subcloud. @@ -275,7 +275,7 @@ subcloud, the subcloud installation has these phases: The :command:`dcmanager subcloud add` command can take up to ten minutes to complete. -#. At the Central Cloud / SystemController, monitor the progress of the +#. At the Central Cloud / System Controller, monitor the progress of the subcloud install, bootstrapping, and deployment by using the deploy status field of the :command:`dcmanager subcloud list` command. 
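Monitoring the deploy status field of :command:`dcmanager subcloud list`, as described above, can be scripted. The sample table below is an assumption modelled on the CLI's tabular output; the real column set may differ, so treat this as a sketch of the parsing approach rather than the authoritative format.

```shell
# Hedged sketch: extract a subcloud's deploy status from `dcmanager subcloud
# list` style table output (sample table and column order are assumptions).
sample='| id | name      | management | availability | deploy status |
|  1 | subcloud1 | unmanaged  | offline      | installing    |'

# Split on '|'; field 3 is the name column, field 6 the deploy status column.
status=$(printf '%s\n' "$sample" |
    awk -F'|' '$3 ~ /subcloud1/ {gsub(/ /, "", $6); print $6}')
echo "subcloud1 deploy status: $status"   # prints: subcloud1 deploy status: installing
```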
diff --git a/doc/source/dist_cloud/installing-a-subcloud-without-redfish-platform-management-service.rst b/doc/source/dist_cloud/installing-a-subcloud-without-redfish-platform-management-service.rst index e0f5d3bca..038ccb96b 100644 --- a/doc/source/dist_cloud/installing-a-subcloud-without-redfish-platform-management-service.rst +++ b/doc/source/dist_cloud/installing-a-subcloud-without-redfish-platform-management-service.rst @@ -32,7 +32,7 @@ subcloud, the subcloud installation process has two phases: .. note:: After a successful remote installation of a subcloud in a Distributed Cloud system, a subsequent remote reinstallation fails because of an existing ssh - key entry in the /root/.ssh/known\_hosts on the SystemController. In this + key entry in the /root/.ssh/known\_hosts on the System Controller. In this case, delete the host key entry, if present, from /root/.ssh/known\_hosts on the System Controller before doing reinstallations. diff --git a/doc/source/dist_cloud/installing-and-provisioning-a-subcloud.rst b/doc/source/dist_cloud/installing-and-provisioning-a-subcloud.rst index c92fdc4ba..c89853f74 100644 --- a/doc/source/dist_cloud/installing-and-provisioning-a-subcloud.rst +++ b/doc/source/dist_cloud/installing-and-provisioning-a-subcloud.rst @@ -16,7 +16,7 @@ Platform Management Service. .. note:: Each subcloud must be on a separate management subnet \(different from the - SystemController and from any other subclouds\). + System Controller and from any other subclouds\). .. 
_installing-and-provisioning-a-subcloud-section-orn-jkf-t4b: diff --git a/doc/source/dist_cloud/managing-ldap-linux-user-accounts-on-the-system-controller.rst b/doc/source/dist_cloud/managing-ldap-linux-user-accounts-on-the-system-controller.rst index 2d5d4a995..08f9c4a50 100644 --- a/doc/source/dist_cloud/managing-ldap-linux-user-accounts-on-the-system-controller.rst +++ b/doc/source/dist_cloud/managing-ldap-linux-user-accounts-on-the-system-controller.rst @@ -7,10 +7,10 @@ Managing LDAP Linux User Accounts on the System Controller ========================================================== In a |prod-dc| system, |LDAP| Linux user accounts are managed centrally -on the SystemController. +on the System Controller. -You can only add/modify/delete |LDAP| users on the SystemController. Any user -account modifications done on the SystemController will be available across all +You can only add/modify/delete |LDAP| users on the System Controller. Any user +account modifications done on the System Controller will be available across all subclouds. For more information, see |sec-doc|: :ref:`Local LDAP Linux User Accounts diff --git a/doc/source/dist_cloud/robust-error-handling-during-an-orchestrated-upgrade.rst b/doc/source/dist_cloud/robust-error-handling-during-an-orchestrated-upgrade.rst index 192c6ee94..b8c7514b2 100644 --- a/doc/source/dist_cloud/robust-error-handling-during-an-orchestrated-upgrade.rst +++ b/doc/source/dist_cloud/robust-error-handling-during-an-orchestrated-upgrade.rst @@ -2,9 +2,9 @@ .. ziu1597089603252 .. 
_robust-error-handling-during-an-orchestrated-upgrade: -==================================================== -Robust Error Handling During An Orchestrated Upgrade -==================================================== +============================================= +Error Handling During An Orchestrated Upgrade +============================================= This section describes the errors you may encounter during an orchestrated upgrade and the steps you can use to troubleshoot the errors. @@ -28,10 +28,16 @@ If a failure occurs, use the following general steps: During the Installation or Data Migration of N+1 Load on a Subcloud `. -#. Rerun the orchestrated upgrade. For more information, see :ref:`Distributed +#. Retry the orchestrated upgrade. For more information, see :ref:`Distributed Upgrade Orchestration Process Using the CLI `. +.. note:: + Orchestrated upgrade can be retried for a group of failed subclouds that + are still **online** using the :command:`upgrade-strategy create --group + ` command. + Failed subclouds that are **offline** must be retried one at a time. + .. seealso:: :ref:`Failure Prior to the Installation of N+1 Load on a Subcloud diff --git a/doc/source/dist_cloud/shared-configurations.rst b/doc/source/dist_cloud/shared-configurations.rst index 639f6ffd8..7f9fe3890 100644 --- a/doc/source/dist_cloud/shared-configurations.rst +++ b/doc/source/dist_cloud/shared-configurations.rst @@ -30,11 +30,9 @@ for resources of the Keystone Identity Service \(see :ref:`Table 2 +=============================+==============================================================================================================================================================================================================================================================================================================================================================+ | DNS IP addresses | Subclouds use the DNS servers specified at the System Controller. 
| +-----------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ - | SNMP community and trapdest | Subclouds use the SNMP alarm trap and destination settings specified at the System Controller, for example using the :command:`system snmp-comm-add`, :command:`system snmp-comm-delete`, and :command:`system snmp-trapdest-add` commands. A subcloud may use additional local settings; if present, these are not synchronized with the System Controller. | - +-----------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | **sysadmin** Password | The **sysadmin** password may take up to 10 minutes to sync with the controller. The **sysadmin** password is not modified via the :command:`system` command. It is modified using the regular Linux :command:`passwd` command. | +-----------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ - | Certificates | Subclouds use the digital certificates installed on the System Controller using the :command:`system certificate-install` command. 
| + | Certificates | Subclouds use the Trusted |CA| certificates installed on the System Controller using the :command:`system certificate-install -m ssl_ca` command. | +-----------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ diff --git a/doc/source/dist_cloud/updating-docker-registry-credentials-on-a-subcloud.rst b/doc/source/dist_cloud/updating-docker-registry-credentials-on-a-subcloud.rst index 27816fbf0..b496e48ad 100644 --- a/doc/source/dist_cloud/updating-docker-registry-credentials-on-a-subcloud.rst +++ b/doc/source/dist_cloud/updating-docker-registry-credentials-on-a-subcloud.rst @@ -6,26 +6,26 @@ Update Docker Registry Credentials on a Subcloud ================================================ -On a subcloud that uses the systemController's docker registry -(registry.central) as its install registry, one should use the -systemController's sysinv service credentials for accessing registry.central. +On a subcloud that uses the System Controller's Docker registry +(registry.central) as its install registry, you should use the +System Controller's sysinv service credentials for accessing registry.central. This makes access to registry.central independent of changes to the Distributed -Cloud's keystone admin user password. +Cloud's Keystone admin user password. Use the following procedure to update the install registry credentials on the -subcloud to the sysinv service credentials of the systemController. +subcloud to the sysinv service credentials of the System Controller. .. rubric:: |proc| .. _updating-docker-registry-credentials-on-a-subcloud-steps-ywx-wyt-kmb: -#. On the SystemController, get the password for the sysinv services. +#. 
On the System Controller, get the password for the sysinv services. .. code-block:: none $ keyring get sysinv services -#. On each subcloud, run the following script to update the docker registry +#. On each subcloud, run the following script to update the Docker registry credentials to sysinv: .. code-block:: none diff --git a/doc/source/dist_cloud/upgrade-management-overview.rst b/doc/source/dist_cloud/upgrade-management-overview.rst index 15d8eb1ec..c3514fe94 100644 --- a/doc/source/dist_cloud/upgrade-management-overview.rst +++ b/doc/source/dist_cloud/upgrade-management-overview.rst @@ -6,7 +6,7 @@ Upgrade Management Overview =========================== -You can upgrade |prod|'s |prod-dc|'s SystemController, and subclouds with a new +You can upgrade the |prod| |prod-dc| System Controller and subclouds with a new release of |prod| software. .. rubric:: |context| @@ -14,7 +14,7 @@ release of |prod| software. .. note:: Backup all yaml files that are updated using the Redfish Platform - Management service. For more information, see, :ref:`Installing a Subcloud + Management service. For more information, see :ref:`Installing a Subcloud Using Redfish Platform Management Service `. @@ -25,13 +25,13 @@ follows: .. _upgrade-management-overview-ol-uqv-p24-3mb: #. To upgrade the |prod-dc| system, you must first upgrade the - SystemController. See, :ref:`Upgrading the SystemController Using the CLI + System Controller. See :ref:`Upgrading the System Controller Using the CLI `. -#. Use |prod-dc| Upgrade Orchestration to upgrade the subclouds. See, +#. Use |prod-dc| Upgrade Orchestration to upgrade the subclouds. See :ref:`Distributed Upgrade Orchestration Process Using the CLI `. -#. To handle errors during an orchestrated upgrade, see :ref:`Robust Error +#. To handle errors during an orchestrated upgrade, see :ref:`Error Handling During An Orchestrated Upgrade `. @@ -47,73 +47,37 @@ The following prerequisites apply to a |prod-dc| upgrade management service. 
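The decision the registry-credential procedure is based on — whether the install-registry secret payload starts with "username: sysinv" or "username: admin" — can be sketched as a small check. In practice the payload would come from `openstack secret get ${SECRET_REF} --payload -f value`; the helper name and the example password below are hypothetical.

```shell
# Hedged sketch: classify an install-registry secret payload.
# Payload format "username: <user> password: <pw>" is taken from the doc text.
check_registry_secret() {
    case "$1" in
        "username: sysinv"*) echo "ok: sysinv service credentials in use" ;;
        "username: admin"*)  echo "update needed: switch to sysinv service credentials" ;;
        *)                   echo "unrecognized payload" ;;
    esac
}

check_registry_secret "username: admin password: example"
# prints: update needed: switch to sysinv service credentials
```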
- Run the :command:`system application-list` command to ensure that all - applications are running + applications are running. - Run the :command:`system host-list` command to list the configured - hosts + hosts. - Run the :command:`dcmanager subcloud list` command to list the - subclouds + subclouds. - Run the :command:`kubectl get pods --all-namespaces` command to test - that the authentication token validates correctly + that the authentication token validates correctly. - Run the :command:`fm alarm-list` command to check the system health to - ensure that there are no unexpected alarms + ensure that there are no unexpected or management-affecting alarms. - Run the :command:`kubectl get host -n deployment` command to ensure all - nodes in the cluster have reconciled and is set to 'true' + nodes in the cluster have reconciled and their status is set to 'true'. - - Ensure **controller-0** is the active controller + - Ensure **controller-0** is the active controller. - The subclouds must all be |AIO-DX|, and using the Redfish platform management service. - **Remove Non GA Applications**: + - Use the :command:`system application-remove` and :command:`system + application-delete` commands to remove the applications on the + subclouds. - - Use the following command to remove the analytics application on the - subclouds: - - - :command:`system application-remove wra-analytics` - - - :command:`system application-delete wra-analytics` - - - - Remove any non-GA applications such as Wind River Analytics, and + - Remove any non-GA applications and |prefix|-openstack from the |prod-dc| system, if they exist. -- **Increase Scratch File System Size**: - - - Check the size of scratch partition on both the system controller and - subclouds using the :command:`system host-fs-list` command. - - .. note:: - Increase in scratch filesystem size is also required on each - subcloud. - - - All controller nodes and subclouds should have a minimum of 16G scratch
The process of importing a new load for upgrade will - temporarily use up to 11G of scratch disk space. Use the :command:`system - host-fs-modify` command to increase scratch size on **each controller - node** and subcloud controllers as needed in preparation for software - upgrade. For example, run the following commands: - - .. code-block:: none - - ~(keystone_admin)]$ system host-fs-modify controller-0 scratch=16 - - Run the :command:`fm alarm-list` command to check the system health to - ensure that there are no unexpected alarms - -- For orchestrated subcloud upgrades the install-values for each subcloud - that was used for deployment must be saved and restored to the SystemController - after the SystemController upgrade. - -- Run the :command:`kubectl -n kube-system get secret` command on the - SystemController before upgrading subclouds, as the docker **rvmc** image on - orchestrated subcloud upgrade tries to copy the :command:`kube-system - default-registry-key`. .. only:: partner diff --git a/doc/source/dist_cloud/upgrading-the-systemcontroller-using-the-cli.rst b/doc/source/dist_cloud/upgrading-the-systemcontroller-using-the-cli.rst index 13f08f7a7..a677250a3 100644 --- a/doc/source/dist_cloud/upgrading-the-systemcontroller-using-the-cli.rst +++ b/doc/source/dist_cloud/upgrading-the-systemcontroller-using-the-cli.rst @@ -2,18 +2,18 @@ .. vco1593176327490 .. _upgrading-the-systemcontroller-using-the-cli: -========================================== -Upgrade the SystemController Using the CLI -========================================== +=========================================== +Upgrade the System Controller Using the CLI +=========================================== -You can upload and apply upgrades to the SystemController in order to upgrade -the central repository, from the CLI. The SystemController can be upgraded +You can upload and apply upgrades to the System Controller in order to upgrade +the central repository, from the CLI. 
The System Controller can be upgraded using either a manual software upgrade procedure or by using the non-distributed systems :command:`sw-manager` orchestration procedure. .. rubric:: |context| -Follow the steps below to manually upgrade the SystemController: +Follow the steps below to manually upgrade the System Controller: .. rubric:: |proc| @@ -30,7 +30,7 @@ Follow the steps below to manually upgrade the SystemController: .. include:: ../_includes/upgrading-the-systemcontroller-using-the-cli.rest -#. Import the software release load, and copy the iso file to controller-0 \(active controller\). +#. Transfer iso and signature files to controller-0 \(active controller\) and import the load. .. code-block:: none @@ -87,13 +87,10 @@ Follow the steps below to manually upgrade the SystemController: .. note:: Use the command :command:`system upgrade-start --force` to force the - upgrades process to start and to ignore management affecting alarms. + upgrade process to start and ignore non-management-affecting alarms. This should ONLY be done if these alarms do not cause an issue for the upgrades process. - If there are alarms present during the upgrade, subcloud load sync\_status - will display "out-of-sync". - #. Start the upgrade from controller-0. Make sure that controller-0 is the active controller, and you are logged @@ -113,8 +110,8 @@ Follow the steps below to manually upgrade the SystemController: +--------------+--------------------------------------+ This will make a copy of the system data to be used in the upgrade. - Configuration changes are not allowed after this point until the swact to - controller-1 is completed. + Configuration changes must not be made after this point, until the + upgrade is completed. The following upgrade state applies once this command is executed. Run the :command:`system upgrade-show` command to verify the status of the upgrade. 
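Checking the state reported by :command:`system upgrade-show` can also be scripted when polling during a long-running upgrade. The sample rows below are assumptions modelled on the property/value table shown above, so this is a sketch of the parsing, not the CLI's guaranteed output.

```shell
# Hedged sketch: pull the upgrade state out of `system upgrade-show` style
# output (the inlined sample table is an assumption).
sample='+--------------+--------------------------------------+
| Property     | Value                                |
+--------------+--------------------------------------+
| state        | started                              |
+--------------+--------------------------------------+'

# Field 2 is the Property column, field 3 the Value column.
state=$(printf '%s\n' "$sample" |
    awk -F'|' '$2 ~ /^ *state *$/ {gsub(/ /, "", $3); print $3}')
echo "upgrade state: $state"   # prints: upgrade state: started
```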
@@ -128,11 +125,6 @@ Follow the steps below to manually upgrade the SystemController: - Release 20.04 system data \(for example, postgres databases\) has been exported to be used in the upgrade. - - Configuration changes must not be made after this point, until the - upgrade is completed. - - - As part of the upgrade, the upgrade process checks the health of the system and validates that the system is ready for an upgrade. @@ -146,8 +138,9 @@ Follow the steps below to manually upgrade the SystemController: This should ONLY be done if these alarms do not cause an issue for the upgrades process. - If there are alarms present during the upgrade, subcloud load - sync\_status will display "out-of-sync". + The :command:`fm alarm-list` command lists the specific alarms behind the + :command:`system health-query-upgrade` notes, which may be blocking an orchestrated + upgrade. On systems with Ceph storage, it also checks that the Ceph cluster is healthy. @@ -313,7 +306,7 @@ Follow the steps below to manually upgrade the SystemController: #. If using Ceph storage backend, upgrade the storage nodes one at a time. - The storage node must be locked and all OSDs must be down in order to do + The storage node must be locked and all |OSDs| must be down in order to do the upgrade. @@ -323,10 +316,10 @@ Follow the steps below to manually upgrade the SystemController: ~(keystone_admin)]$ system host-lock storage-0 - #. Verify that the OSDs are down after the storage node is locked. + #. Verify that the |OSDs| are down after the storage node is locked. In the Horizon interface, navigate to **Admin** \> **Platform** \> - **Storage Overview** to view the status of the OSDs. + **Storage Overview** to view the status of the |OSDs|. #. Upgrade storage-0. @@ -362,7 +355,7 @@ Follow the steps below to manually upgrade the SystemController: **800.003**. The alarm is cleared after all storage nodes are upgraded. -#. If worker nodes are present, upgrade worker hosts, serially or parallelly, +#. 
If worker nodes are present, upgrade worker hosts, serially or in parallel, if any. @@ -475,11 +468,3 @@ Follow the steps below to manually upgrade the SystemController: Run the :command:`system upgrade-show` command, and the status will display "no upgrade in progress". The subclouds will be out-of-sync. - -.. rubric:: |postreq| - -.. warning:: - Do NOT delete the N load from the SystemController once the upgrade is - complete. If the load is deleted from the SystemController, you must - manually delete the N load from each subcloud. - diff --git a/doc/source/node_management/kubernetes/common_host_tasks/swacting-a-master-controller-using-horizon.rst b/doc/source/node_management/kubernetes/common_host_tasks/swacting-a-master-controller-using-horizon.rst index e28ad7f8f..1e9383887 100644 --- a/doc/source/node_management/kubernetes/common_host_tasks/swacting-a-master-controller-using-horizon.rst +++ b/doc/source/node_management/kubernetes/common_host_tasks/swacting-a-master-controller-using-horizon.rst @@ -2,9 +2,9 @@ .. fab1579714529266 .. _swacting-a-master-controller-using-horizon: -======================================= -Swact a Master/Controller Using Horizon -======================================= +=============================== +Swact Controllers Using Horizon +=============================== Swacting initiates a switch of the active/standby roles between two controllers. diff --git a/doc/source/node_management/kubernetes/common_host_tasks/swacting-a-master-controller-using-the-cli.rst b/doc/source/node_management/kubernetes/common_host_tasks/swacting-a-master-controller-using-the-cli.rst index 685dcd60e..29dc39451 100644 --- a/doc/source/node_management/kubernetes/common_host_tasks/swacting-a-master-controller-using-the-cli.rst +++ b/doc/source/node_management/kubernetes/common_host_tasks/swacting-a-master-controller-using-the-cli.rst @@ -2,9 +2,9 @@ .. qmi1579723342974 .. 
_swacting-a-master-controller-using-the-cli: -======================================= -Swact a Master/Controller Using the CLI -======================================= +=============================== +Swact Controllers Using the CLI +=============================== Swacting initiates a switch of the active/standby roles between two controllers. diff --git a/doc/source/node_management/kubernetes/configuring_cpu_core_assignments/configuring-cpu-core-assignments.rst b/doc/source/node_management/kubernetes/configuring_cpu_core_assignments/configuring-cpu-core-assignments.rst index 65bdb0bfa..56d45b8e3 100644 --- a/doc/source/node_management/kubernetes/configuring_cpu_core_assignments/configuring-cpu-core-assignments.rst +++ b/doc/source/node_management/kubernetes/configuring_cpu_core_assignments/configuring-cpu-core-assignments.rst @@ -6,7 +6,7 @@ Configure CPU Core Assignments ============================== -You can improve the performance of specific functions by assigning them to +You can improve the performance and capacity of specific functions by assigning them more CPU cores from the Horizon Web interface. .. rubric:: |proc| diff --git a/doc/source/node_management/kubernetes/displaying-worker-host-information.rst b/doc/source/node_management/kubernetes/displaying-worker-host-information.rst index 535354e21..2ab54de36 100644 --- a/doc/source/node_management/kubernetes/displaying-worker-host-information.rst +++ b/doc/source/node_management/kubernetes/displaying-worker-host-information.rst @@ -6,8 +6,7 @@ Display Worker Host Information =============================== -You can view worker host resources from the Horizon Web interface. You can -also view data interface assignments graphically from Horizon. +You can view worker host resources from the Horizon Web interface. .. 
rubric:: |proc| diff --git a/doc/source/node_management/kubernetes/hardware_acceleration_devices/enabling-mount-bryce-hw-accelerator-for-hosted-vram-containerized-workloads.rst b/doc/source/node_management/kubernetes/hardware_acceleration_devices/enabling-mount-bryce-hw-accelerator-for-hosted-vram-containerized-workloads.rst index d8a0c7295..d41e43cff 100644 --- a/doc/source/node_management/kubernetes/hardware_acceleration_devices/enabling-mount-bryce-hw-accelerator-for-hosted-vram-containerized-workloads.rst +++ b/doc/source/node_management/kubernetes/hardware_acceleration_devices/enabling-mount-bryce-hw-accelerator-for-hosted-vram-containerized-workloads.rst @@ -129,5 +129,6 @@ enables the Mount Bryce device. .. rubric:: |result| -To set up pods using |SRIOV|, see, :ref:`Setting Up Pods to Use SRIOV `. +To set up pods using |SRIOV|, see :ref:`Setting Up Pods to Use SRIOV to Access +Mount Bryce HW Accelerator `. diff --git a/doc/source/node_management/kubernetes/hardware_acceleration_devices/n3000-overview.rst b/doc/source/node_management/kubernetes/hardware_acceleration_devices/n3000-overview.rst index 0b17b5958..e67b20f80 100644 --- a/doc/source/node_management/kubernetes/hardware_acceleration_devices/n3000-overview.rst +++ b/doc/source/node_management/kubernetes/hardware_acceleration_devices/n3000-overview.rst @@ -2,9 +2,9 @@ .. pis1592390220404 .. _n3000-overview: -============== -N3000 Overview -============== +=================== +N3000 FPGA Overview +=================== The N3000 |FPGA| |PAC| has two Intel XL710 |NICs|, memory and an Intel |FPGA|. @@ -29,4 +29,4 @@ perform accelerated 5G |LDPC| encoding and decoding operations. .. seealso:: :ref:`N3000 FPGA Forward Error Correction - `. 
+ ` diff --git a/doc/source/node_management/kubernetes/hardware_acceleration_devices/set-up-pods-to-use-sriov.rst b/doc/source/node_management/kubernetes/hardware_acceleration_devices/set-up-pods-to-use-sriov.rst index 935384395..6e0d153f4 100644 --- a/doc/source/node_management/kubernetes/hardware_acceleration_devices/set-up-pods-to-use-sriov.rst +++ b/doc/source/node_management/kubernetes/hardware_acceleration_devices/set-up-pods-to-use-sriov.rst @@ -2,9 +2,9 @@ .. ggs1611608368857 .. _set-up-pods-to-use-sriov: -============================ -Set Up Pods to Use SRIOV -============================ +============================================================= +Set Up Pods to Use SRIOV to Access Mount Bryce HW Accelerator +============================================================= You can configure pods with |SRIOV| access to a Mount Bryce device by adding the appropriate 'resources' request in the pod specification. diff --git a/doc/source/node_management/kubernetes/hardware_acceleration_devices/showing-details-for-an-fpga-device.rst b/doc/source/node_management/kubernetes/hardware_acceleration_devices/showing-details-for-an-fpga-device.rst index d89398961..f464285b0 100644 --- a/doc/source/node_management/kubernetes/hardware_acceleration_devices/showing-details-for-an-fpga-device.rst +++ b/doc/source/node_management/kubernetes/hardware_acceleration_devices/showing-details-for-an-fpga-device.rst @@ -2,9 +2,9 @@ .. mmu1591729910787 .. _showing-details-for-an-fpga-device: -=============================== -Show Details for an FPGA Device -=============================== +========================= +Show Details for a Device +========================= Additional details are available when running the :command:`host-device-show` command in the context of an |FPGA| device. 
diff --git a/doc/source/node_management/kubernetes/hardware_acceleration_devices/updating-an-intel-n3000-fpga-image.rst b/doc/source/node_management/kubernetes/hardware_acceleration_devices/updating-an-intel-n3000-fpga-image.rst index e41e6168c..14f0571dd 100644 --- a/doc/source/node_management/kubernetes/hardware_acceleration_devices/updating-an-intel-n3000-fpga-image.rst +++ b/doc/source/node_management/kubernetes/hardware_acceleration_devices/updating-an-intel-n3000-fpga-image.rst @@ -2,9 +2,9 @@ .. yui1591714746999 .. _updating-an-intel-n3000-fpga-image: -================================ -Update an Intel N3000 FPGA Image -================================ +========================== +Update an N3000 FPGA Image +========================== The N3000 |FPGA| as shipped from the factory is expected to have production |BMC| and factory images. The following procedure describes how to update the diff --git a/doc/source/node_management/kubernetes/host_hardware_management/changing-hardware-components-for-a-storage-host.rst b/doc/source/node_management/kubernetes/host_hardware_management/changing-hardware-components-for-a-storage-host.rst index 9e84ec2fd..8a4c56bce 100644 --- a/doc/source/node_management/kubernetes/host_hardware_management/changing-hardware-components-for-a-storage-host.rst +++ b/doc/source/node_management/kubernetes/host_hardware_management/changing-hardware-components-for-a-storage-host.rst @@ -78,9 +78,6 @@ can reproduce them later. If the host has been deleted from the Host Inventory, the host software is reinstalled. -.. From Power up the host -.. xbookref For details, see :ref:`|inst-doc| `. - Wait for the host to be reported as **Locked**, **Disabled**, and **Online**. @@ -108,4 +105,6 @@ can reproduce them later. .. From If required, allocate the |OSD| and journal disk storage. .. xbooklinkFor more information, see |stor-doc|: `Provision Storage on a Storage Host `. +.. From Power up the host +.. xbookref For details, see :ref:`|inst-doc| `. 
diff --git a/doc/source/node_management/kubernetes/host_inventory/hosts-tab.rst b/doc/source/node_management/kubernetes/host_inventory/hosts-tab.rst index edaf0b8bf..44af06637 100644 --- a/doc/source/node_management/kubernetes/host_inventory/hosts-tab.rst +++ b/doc/source/node_management/kubernetes/host_inventory/hosts-tab.rst @@ -186,7 +186,7 @@ A sample **Hosts** tab is illustrated below: **Swact Host** This operation is available on controller nodes only. It initiates a switch of the active/standby roles between two controllers. For more - information, see :ref:`Swact a Master/Controller Using Horizon + information, see :ref:`Swact Controllers Using Horizon `. **Unlock Host** diff --git a/doc/source/node_management/kubernetes/host_inventory/lldp-tab.rst b/doc/source/node_management/kubernetes/host_inventory/lldp-tab.rst index 7c9d0e9d4..dd9916761 100644 --- a/doc/source/node_management/kubernetes/host_inventory/lldp-tab.rst +++ b/doc/source/node_management/kubernetes/host_inventory/lldp-tab.rst @@ -6,8 +6,8 @@ LLDP Tab ======== -The **LLDP** tab on the Host Detail page presents details about the Link -Layer Discovery Protocol configuration on a node. +The **LLDP** tab on the Host Detail page presents learned details about +neighbors' ports through the Link Layer Discovery Protocol. The **LLDP** tab provides the following information about each LLDP-enabled neighbor device: diff --git a/doc/source/node_management/kubernetes/index.rst b/doc/source/node_management/kubernetes/index.rst index 5faf5ebbe..35576d21b 100644 --- a/doc/source/node_management/kubernetes/index.rst +++ b/doc/source/node_management/kubernetes/index.rst @@ -89,8 +89,8 @@ Configuring CPU core assignments ..
toctree:: :maxdepth: 1 - configuring_cpu_core_assignments/changing-the-hyper-threading-status configuring_cpu_core_assignments/configuring-cpu-core-assignments + configuring_cpu_core_assignments/changing-the-hyper-threading-status ------------------------ Host memory provisioning diff --git a/doc/source/node_management/kubernetes/node_interfaces/configuring-aggregated-ethernet-interfaces-using-the-cli.rst b/doc/source/node_management/kubernetes/node_interfaces/configuring-aggregated-ethernet-interfaces-using-the-cli.rst index 85b1f9b6d..dff472afb 100644 --- a/doc/source/node_management/kubernetes/node_interfaces/configuring-aggregated-ethernet-interfaces-using-the-cli.rst +++ b/doc/source/node_management/kubernetes/node_interfaces/configuring-aggregated-ethernet-interfaces-using-the-cli.rst @@ -115,17 +115,3 @@ Settings `. ~(keystone_admin)$ system host-if-add controller-0 -a balanced -x layer2 ae0 ae enp0s9 enp0s10 ~(keystone_admin)$ system interface-datanetwork-assign controller-0 ae0 providernet-net-a ~(keystone_admin)$ system interface-datanetwork-assign controller-0 ae0 providernet-net-b - - For example, to attach an aggregated Ethernet interface named **bond0** to - the platform management network, using member interfaces **enp0s8** - and **enp0s11** on **controller-0**: - - .. code-block:: none - - ~(keystone_admin)$ system host-if-add controller-0 -c platform -a active_standby --primary-reselect failure bond0 ae enp0s8 enp0s11 - ~(keystone_admin)$ system interface-network-assign controller-0 bond0 mgmt - - -.. 
only:: partner - - ../../../_includes/configuring-aggregated-ethernet-interfaces-using-the-cli.rest diff --git a/doc/source/node_management/kubernetes/node_interfaces/interface-ip-address-provisioning-using-the-cli.rst b/doc/source/node_management/kubernetes/node_interfaces/interface-ip-address-provisioning-using-the-cli.rst index 48432ae58..dc09bb750 100644 --- a/doc/source/node_management/kubernetes/node_interfaces/interface-ip-address-provisioning-using-the-cli.rst +++ b/doc/source/node_management/kubernetes/node_interfaces/interface-ip-address-provisioning-using-the-cli.rst @@ -9,9 +9,9 @@ Interface IP Address Provisioning Using the CLI On a network that uses static addressing, you must assign an IP address to the interface using the :command:`system host-addr-add` command. -The procedure for attaching an interface depends on the interface type. +The procedure for adding an IP address depends on the interface type. -|prod| supports three types of interfaces: +|prod| supports the following types of interfaces: **Ethernet interfaces** These are created automatically for each port on the host. You must @@ -21,11 +21,32 @@ The procedure for adding an IP address depends on the interface type. For link protection, you can create an aggregated Ethernet interface with two or more ports, and configure it with the interface class. + .. code-block:: none + + ~(keystone_admin)$ system host-if-add -m mtu -a aemode -x txhashpolicy ifname ae + **VLAN interfaces** To support multiple interfaces on the same physical Ethernet or aggregated Ethernet interface, you can create |VLAN| interfaces and configure them with the interface class. + .. code-block:: none + + ~(keystone_admin)$ system host-if-add -V --vlan_id -c --ifclass + +**Virtual Function interfaces** + You can create an SR-IOV VF interface on top of an existing SR-IOV VF + interface in order to configure a subset of virtual functions with + different drivers.
For example, if the Ethernet SR-IOV interface is + configured with the kernel VF driver, you can create a VF interface to + configure a subset of virtual functions with the vfio driver that can be + used with userspace libraries such as DPDK. + + .. code-block:: none + + ~(keystone_admin)$ system host-if-add -c pci-sriov vf -N numvfs --vf-driver=drivername + + Logical interfaces of network types **oam** and **mgmt** cannot be deleted. They can only be modified to use different physical ports when required. @@ -55,16 +76,16 @@ They can only be modified to use different physical ports when required. where the following options are available: - **node** + ``node`` The name or UUID of the worker node. - **ifname** + ``ifname`` The name of the interface. - **ip\_address** + ``ip_address`` An IPv4 or IPv6 address. - **prefix** + ``prefix`` The netmask length for the address. #. Unlock the node and wait for it to become available. diff --git a/doc/source/node_management/kubernetes/node_interfaces/interface-provisioning.rst b/doc/source/node_management/kubernetes/node_interfaces/interface-provisioning.rst index cdc71d20d..e3b52a583 100644 --- a/doc/source/node_management/kubernetes/node_interfaces/interface-provisioning.rst +++ b/doc/source/node_management/kubernetes/node_interfaces/interface-provisioning.rst @@ -77,7 +77,7 @@ They can only be modified to use different physical ports when required. see |planning-doc|: `Ethernet Interfaces `. .. note:: - On the second worker and storage nodes, the Ethernet interface for the + On all nodes except the initial controller, the Ethernet interface for the internal management network is attached automatically to support installation using |PXE| booting.
On the initial controller node, the interface for the internal management network is attached according to the diff --git a/doc/source/node_management/kubernetes/node_interfaces/interface-settings.rst b/doc/source/node_management/kubernetes/node_interfaces/interface-settings.rst index f6d53d51d..b18cd0c6e 100644 --- a/doc/source/node_management/kubernetes/node_interfaces/interface-settings.rst +++ b/doc/source/node_management/kubernetes/node_interfaces/interface-settings.rst @@ -147,7 +147,7 @@ These settings are available on the **Edit Interface** and Use an address from a pool of IPv4 addresses that has been defined and associated with the data interface. -**IPv4 Addressing Mode** +**IPv4 Addressing Pool** \(Shown only when the **IPv4 Addressing Mode** is set to **pool**\). The pool from which to assign an IPv4 address. diff --git a/doc/source/security/kubernetes/centralized-oidc-authentication-setup-for-distributed-cloud.rst b/doc/source/security/kubernetes/centralized-oidc-authentication-setup-for-distributed-cloud.rst index 48d241b1a..d658003bb 100644 --- a/doc/source/security/kubernetes/centralized-oidc-authentication-setup-for-distributed-cloud.rst +++ b/doc/source/security/kubernetes/centralized-oidc-authentication-setup-for-distributed-cloud.rst @@ -17,7 +17,7 @@ Distributed Setup ----------------- For a distributed setup, configure the **kube-apiserver**, and -**oidc-auth-apps** independently for each cloud, SystemController, and all +**oidc-auth-apps** independently for each cloud, System Controller, and all subclouds. For more information, see: @@ -53,21 +53,21 @@ Centralized Setup ----------------- For a centralized setup, the **oidc-auth-apps** is configured '**only**' on -the SystemController. The **kube-apiserver** must be configured on all -clouds, SystemController, and all subclouds, to point to the centralized -**oidc-auth-apps** running on the SystemController. In the centralized +the System Controller. 
The **kube-apiserver** must be configured on all +clouds, System Controller, and all subclouds, to point to the centralized +**oidc-auth-apps** running on the System Controller. In the centralized setup, a user logs in, authenticates, and gets an |OIDC| token from the -Central SystemController's |OIDC| identity provider, and uses the |OIDC| token -with '**any**' of the subclouds as well as the SystemController cloud. +Central System Controller's |OIDC| identity provider, and uses the |OIDC| token +with '**any**' of the subclouds as well as the System Controller cloud. For a centralized |OIDC| authentication setup, use the following procedure: .. rubric:: |proc| -#. Configure the **kube-apiserver** parameters on the SystemController and +#. Configure the **kube-apiserver** parameters on the System Controller and each subcloud during bootstrapping, or by using the **system service-parameter-add kubernetes kube\_apiserver** command after - bootstrapping the system, using the SystemController's floating OAM IP + bootstrapping the system, using the System Controller's floating OAM IP address as the oidc\_issuer\_url for all clouds. address as the oidc\_issuer\_url for all clouds. @@ -89,7 +89,7 @@ For a centralized |OIDC| authentication setup, use the following procedure: ` -#. On the SystemController only configure the **oidc-auth-apps**. For more information, see: +#. On the System Controller only configure the **oidc-auth-apps**. 
For more information, see: :ref:`Configure OIDC Auth Applications ` diff --git a/doc/source/updates/kubernetes/performing-an-orchestrated-upgrade-using-the-cli.rst b/doc/source/updates/kubernetes/performing-an-orchestrated-upgrade-using-the-cli.rst index 9ee760104..9aec95f00 100644 --- a/doc/source/updates/kubernetes/performing-an-orchestrated-upgrade-using-the-cli.rst +++ b/doc/source/updates/kubernetes/performing-an-orchestrated-upgrade-using-the-cli.rst @@ -22,12 +22,12 @@ The upgrade strategy options are shown in the following output: ~(keystone_admin)]$ sw-manager upgrade-strategy --help usage: sw-manager upgrade-strategy [-h] ... - + optional arguments: -h, --help show this help message and exit - + Software Upgrade Commands: - + create Create a strategy delete Delete a strategy apply Apply a strategy @@ -56,7 +56,7 @@ upgrade orchestration to orchestrate the remaining nodes of the |prod|. - 900.201, Software upgrade auto apply in progress -.. _performing-an-orchestrated-upgrade-using-the-cli-ul-qhy-q1p-v1b: +.. _performing-an-orchestrated-upgrade-using-the-cli-ul-qhy-q1p-v1b: .. rubric:: |prereq| @@ -65,6 +65,10 @@ See :ref:`Upgrading All-in-One Duplex / Standard controller node before doing the upgrade orchestration described below to upgrade the remaining nodes of the |prod|. +- An All-in-One Simplex subcloud must use the Redfish platform management service. + +- Duplex \(AIO-DX/Standard\) upgrades are supported; they do not require remote installation via Redfish. + .. rubric:: |proc| ..
_performing-an-orchestrated-upgrade-using-the-cli-steps-e45-kh5-sy: diff --git a/doc/source/updates/kubernetes/performing-an-orchestrated-upgrade.rst b/doc/source/updates/kubernetes/performing-an-orchestrated-upgrade.rst index e0abe7afe..b4c917bf7 100644 --- a/doc/source/updates/kubernetes/performing-an-orchestrated-upgrade.rst +++ b/doc/source/updates/kubernetes/performing-an-orchestrated-upgrade.rst @@ -24,7 +24,7 @@ You can perform a partially-Orchestrated Upgrade of a |prod| system using the CL During an orchestrated upgrade, the following alarms are ignored even when strict restrictions are selected: - - 750.006, Automatic application re-apply is pending + - 750.006, Generic alarm for any platform-managed applications as they are auto-applied - 900.005, Upgrade in progress @@ -60,7 +60,7 @@ upgrade the remaining nodes of the |prod| system. **serial** \(default\) Storage hosts will be upgraded one at a time. - **parallel** + **parallel** Storage hosts will be upgraded in parallel, ensuring that only one storage node in each replication group is upgraded at a time. @@ -69,7 +69,7 @@ upgrade the remaining nodes of the |prod| system. - worker-apply-type: - **serial** \(default\): + **serial** \(default\): Worker hosts will be upgraded one at a time. **parallel**
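The interface-provisioning hunks above add placeholder skeletons for the :command:`system host-if-add` and :command:`system host-addr-add` commands. As a hedged illustration only (not part of the patch), the placeholders might be filled in as follows; the host name ``controller-0``, port names ``enp0s9``/``enp0s10``, interface names, VLAN ID, and VF counts are all hypothetical values to be replaced with site-specific ones:

```shell
# Hedged sketch: concrete forms of the placeholder commands documented above.
# All host, port, and interface names below are hypothetical examples.

# Aggregated Ethernet interface with two member ports
# (balanced mode, layer2 transmit hash policy)
~(keystone_admin)$ system host-if-add controller-0 -a balanced -x layer2 ae0 ae enp0s9 enp0s10

# VLAN interface on top of ae0, assigned the data interface class
~(keystone_admin)$ system host-if-add controller-0 -V 10 -c data vlan10 vlan ae0

# SR-IOV VF interface on top of an existing SR-IOV interface, binding a
# subset of the virtual functions to the vfio driver (e.g. for DPDK use)
~(keystone_admin)$ system host-if-add controller-0 -c pci-sriov sriov-vf0 vf sriov0 -N 2 --vf-driver=vfio

# Static IP address assignment on an interface (static addressing mode)
~(keystone_admin)$ system host-addr-add controller-0 ae0 192.168.1.10 24
```

These commands run against a locked host from a keystone_admin session; the node must be unlocked afterward for the changes to take effect.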