Install a Subcloud Using Redfish Platform Management Service
For subclouds with servers that support Redfish Virtual Media Service (version 1.2 or higher), you can use the Central Cloud's CLI to install the ISO and bootstrap the subclouds from the Central Cloud.
After physically installing the hardware and network connectivity of a subcloud, the subcloud installation has these phases:
- Executing the dcmanager subcloud add command in the Central Cloud:
  - Uses Redfish Virtual Media Service to remotely install the ISO on controller-0 in the subcloud
  - Uses Ansible to bootstrap controller-0 in the subcloud
Note
Remove all removable USB storage devices from subcloud servers before starting a Redfish remote subcloud install.
A new system CLI option, --active, is added to the load-import command to allow the import into the system controller's /opt/dc-vault/loads directory. The purpose of this is to allow Redfish installs of subclouds to reference a single full copy of the bootimage.iso at /opt/dc-vault/loads. (Previously, the full bootimage.iso was duplicated for each subcloud add command.)

Note

This is required only once and does not have to be done for every subcloud install. dcmanager recognizes bootimage names ending in .iso and .sig.

For example:

~(keystone_admin)]$ system --os-region-name SystemController load-import --active <bootimage>.iso <bootimage>.sig

The ISO imported via load-import --active must be at the same patch level as the system controller. This ensures that the subcloud boot image aligns with the patch level of the load to be installed on the subcloud. The --active option is not supported together with --local.
Warning

If the patch level of the load-imported ISO does not match the system controller's patch level, the subcloud patch state may not align with the system controller patch state.
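As a quick sanity check before importing, you can review the patch level applied on the system controller and compare it with the level of the ISO you are about to import; a minimal sketch, assuming the legacy patching CLI is available on your release:

# List the patches applied on the system controller, then confirm the ISO to be
# imported was built at, or patched to, the same level before running load-import.
$ sudo sw-patch query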
Run the load-import command on controller-0 to import the new release. You can specify either full or relative paths to the *.iso bootimage file and to the *.sig bootimage signature file.

$ source /etc/platform/openrc
~(keystone_admin)]$ system load-import [--local] /home/sysadmin/<bootimage>.iso <bootimage>.sig
+--------------------+-----------+
| Property           | Value     |
+--------------------+-----------+
| id                 | 2         |
| state              | importing |
| software_version   | nn.nn     |
| compatible_version | nn.nn     |
| required_patches   |           |
+--------------------+-----------+
The load-import must be done on controller-0.

(Optional) If --local is specified, the ISO and sig files are uploaded directly from the active controller, where <local_iso_file_path> and <local_sig_file_path> are paths on the active controller to the ISO file and sig file, respectively.

Note

If --local is specified, the ISO and sig files are transferred directly from the active controller filesystem to the load directory. If it is not specified, the files are transferred via the API.

Note

This will take a few minutes to complete.
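If you want to confirm that the import has finished before proceeding, the imported loads and their state can be listed from the system CLI; a minimal sketch, assuming the load listing command is available on your release:

# The new load should appear with its software_version once the import completes.
~(keystone_admin)]$ system load-list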
In order to deploy subclouds from either controller, all local files that are referenced in the subcloud-bootstrap-values.yaml file must exist on both controllers (for example, /home/sysadmin/docker-registry-ca-cert.pem).
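For example, a certificate referenced from the bootstrap values can be copied to the peer controller so that it is present on both; a minimal sketch (the peer hostname controller-1 is illustrative):

# Copy the referenced file to the same path on the other controller.
$ scp /home/sysadmin/docker-registry-ca-cert.pem sysadmin@controller-1:/home/sysadmin/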
Controlling the RVMC debug level and automatic serial console log capture
The optional parameter rvmc_debug_level, in the subcloud install_values YAML file, controls the generation of debug logs during installation, which are then stored in the Ansible log files for each subcloud.
Valid rvmc_debug_levels

The available rvmc_debug_level values control the log content as follows; each level is more verbose than the one before it (see the example after the list):
0: Debug logging is disabled (normal log content)
1: Logs operations of each remote install stage, such as RedFish session open/close, eject/insert image, power on/off host, and more. If the install_type matches a serial console, then the full serial console log output is also captured automatically.
2: Logs HTTP request type and URL with corresponding response status.
3: Logs HTTP request headers and payloads, along with Redfish action tracing logs.
4: Outputs JSON of all command responses.
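For example, to capture per-stage install logs for a particular subcloud, the level can be set in that subcloud's install values file; a minimal excerpt (only the relevant line is shown):

# subcloud-install-values.yaml (excerpt)
# 1 = log each remote install stage; use 2-4 for progressively more HTTP/JSON detail
rvmc_debug_level: 1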
Automatic Serial Console capture (for rvmc_debug_level > 1)
When rvmc_debug_level is enabled (rvmc_debug_level > 0), the full serial console output can be automatically captured, provided the serial console is configured in the install_type install value.
Note
Capturing graphical console output is not supported.
The install_type
in the subcloud install_values YAML
file must correspond to one of the serial console types:
0: Standard Controller, Serial Console
2: AIO, Serial Console
4: AIO Low-latency, Serial Console
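Putting the two settings together, automatic serial console capture requires a serial console install_type plus a non-zero rvmc_debug_level; a minimal excerpt (the values shown are illustrative):

# subcloud-install-values.yaml (excerpt)
install_type: 0        # Standard Controller, Serial Console
rvmc_debug_level: 1    # enables stage logging and serial console capture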
Automatic Serial Console capture is based on the global CONF.ipmi_capture value and the given rvmc_debug_level. The global CONF.ipmi_capture value defaults to 1, which defers the configuration to the per-subcloud rvmc_debug_level install value. CONF.ipmi_capture can be set in /etc/dcmanager/dcmanager.conf to override this setting for all subclouds.

CONF.ipmi_capture options:
0: globally disabled
1: enabled based on rvmc_debug_level
2: globally enabled
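For example, to force capture on (or off) for all subclouds regardless of each subcloud's rvmc_debug_level, the option can be set in /etc/dcmanager/dcmanager.conf; a minimal sketch, assuming the option is read from the [DEFAULT] section:

# /etc/dcmanager/dcmanager.conf (excerpt; the section name is an assumption)
[DEFAULT]
ipmi_capture = 2    # 0: globally disabled, 1: defer to rvmc_debug_level, 2: globally enabled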
At the subcloud location, physically install the servers and network connectivity required for the subcloud.
Note
Do not power off the servers. The host portion of the server can be powered off, but the BMC portion of the server must be powered and accessible from the system controller.
There is no need to wipe the disks.
Note
The servers require connectivity to a gateway router that provides IP routing between the subcloud management or admin subnet and the system controller management subnet, and between the subcloud OAM subnet and the system controller OAM subnet.
Create the subcloud-install-values.yaml file and pass it into the dcmanager subcloud add command using the --install-values command option.

Note

If your controller is on a ZTSystems Triton server that requires a longer timeout value, you can use the rd.net.timeout.ipv6dad dracut parameter to specify an increased timeout for dracut to wait for the interface to have carrier and to complete IPv6 duplicate address detection. For the ZTSystems server, this can take more than four minutes. It is recommended to set this value to 300 seconds, by specifying the following in the subcloud-install-values.yaml file:

rd.net.timeout.ipv6dad: 300

Note

The wait_for_timeout value must be chosen based on your network performance (bandwidth, latency, and quality) and should be increased if the network does not meet these minimums or causes timeouts. The default value of 3600 seconds is based on a network bandwidth of 100 Mbps with a 50 ms delay.

For example, --install-values /home/sysadmin/subcloud-install-values.yaml.

# Specify the software version, for example 'nn.nn' for the nn.nn release of software.
software_version: <software_version>
bootstrap_interface: <bootstrap_interface_name>  # e.g. eno1
bootstrap_address: <bootstrap_interface_ip_address>  # e.g. 128.224.151.183
bootstrap_address_prefix: <bootstrap_netmask>  # e.g. 23

# Board Management Controller
bmc_address: <BMCs_IPv4_or_IPv6_address>  # e.g. 128.224.64.180
bmc_username: <bmc_username>  # e.g. root

# If the subcloud's bootstrap IP interface and the system controller are not on the
# same network then the customer must configure a default route or static route
# so that the Central Cloud can log in to and bootstrap the newly installed subcloud.

# If nexthop_gateway is specified and the network_address is not specified then a
# default route will be configured. Otherwise, if a network_address is specified then
# a static route will be configured.
nexthop_gateway: <default_route_address>  # e.g. 128.224.150.1 (required)
network_address: <static_route_address>  # e.g. 128.224.144.0
network_mask: <static_route_mask>  # e.g. 255.255.254.0

# Installation type codes
# 0 - Standard Controller, Serial Console
# 1 - Standard Controller, Graphical Console
# 2 - AIO, Serial Console
# 3 - AIO, Graphical Console
# 4 - AIO Low-latency, Serial Console
# 5 - AIO Low-latency, Graphical Console
install_type: 3

# Optional parameters; defaults can be modified by uncommenting the option with a modified value.

# This option can be set to extend the installing stage timeout value
# wait_for_timeout: 3600

# Set this option for https
no_check_certificate: True

# If the bootstrap interface is a vlan interface then configure the vlan ID.
# bootstrap_vlan: <vlan_id>

# Override default filesystem devices.
# rootfs_device: "/dev/disk/by-path/pci-0000:00:1f.2-ata-1.0"
# boot_device: "/dev/disk/by-path/pci-0000:00:1f.2-ata-1.0"

# Set the value for the persistent file system (/opt/platform-backup).
# The value must be a whole number (in MB) that is greater than or equal
# to 30000.
persistent_size: 30000

# Configure custom arguments applied at boot within the installed subcloud.
# Multiple boot arguments can be provided by separating each argument by a
# single comma. Spaces are not allowed.
# Example: extra_boot_params: multi-drivers-switch=cvl-2.54
# extra_boot_params:
Note

By default, 30GB is allocated for /opt/platform-backup. If additional persistent disk space is required, the partition can be increased in the next subcloud redeploy using the following commands:

To increase /opt/platform-backup to 40GB, add the persistent_size: 40000 parameter to the subcloud-install-values.yaml file.

Use the dcmanager subcloud update command to save the configuration change for the next subcloud redeploy.

~(keystone_admin)]$ dcmanager subcloud update --install-values <subcloud-install-values.yaml> <subcloud-name>

For a new subcloud deployment, use the dcmanager subcloud add command with the subcloud-install-values.yaml file containing the desired persistent_size value.

At the system controller, create a /home/sysadmin/subcloud-bootstrap-values.yaml overrides file for the subcloud.

For example:
system_mode: simplex
name: "subcloud"
description: "test"
location: "loc"
management_subnet: 192.168.101.0/24
management_start_address: 192.168.101.2
management_end_address: 192.168.101.50
management_gateway_address: 192.168.101.1
external_oam_subnet: 10.10.10.0/24
external_oam_gateway_address: 10.10.10.1
external_oam_floating_address: 10.10.10.12
systemcontroller_gateway_address: 192.168.204.101
docker_registries:
  k8s.gcr.io:
    url: registry.central:9001/k8s.gcr.io
  gcr.io:
    url: registry.central:9001/gcr.io
  ghcr.io:
    url: registry.central:9001/ghcr.io
  quay.io:
    url: registry.central:9001/quay.io
  docker.io:
    url: registry.central:9001/docker.io
  docker.elastic.co:
    url: registry.central:9001/docker.elastic.co
  registry.k8s.io:
    url: registry.central:9001/registry.k8s.io
  icr.io:
    url: registry.central:9001/icr.io
  defaults:
    username: sysinv
    password: <sysinv_password>
    type: docker
Where <sysinv_password> can be found by running the following command as 'sysadmin' on the Central Cloud:
$ keyring get sysinv services
In the above example, if the admin network is used for communication between the subcloud and system controller, then the management_gateway_address parameter should be replaced with admin subnet information.

For example:
management_subnet: 192.168.101.0/24
management_start_address: 192.168.101.2
management_end_address: 192.168.101.50
admin_subnet: 192.168.102.0/24
admin_start_address: 192.168.102.2
admin_end_address: 192.168.102.50
admin_gateway_address: 192.168.102.1
This configuration will install container images from the local registry on your Central Cloud. The Central Cloud's local registry's HTTPS certificate must have the Central Cloud's IP, registry.local, and registry.central in the certificate's SAN list. For example, a valid certificate contains the following list:
"DNS.1: registry.local DNS.2: registry.central IP.1: floating_management IP.2: floating_OAM"
If required, refer to the migrate-platform-certificates-to-use-cert-manager-c0b1727e4e5d procedure to update the Docker Registry certificate for the Central Cloud. That procedure includes the required Domain Names and IPs in the certificate's SAN list.

If you prefer to install container images from the default external registries, make the following substitutions in the docker_registries section of the file.
docker_registries:
  defaults:
    username: <your_default_registry_username>
    password: <your_default_registry_password>
Using the add_docker_prefix flag
The add_docker_prefix flag determines whether the docker.io prefix is automatically added to image names while importing the images. This is useful for handling both custom and platform images.

add_docker_prefix: false (default)

For default image bundle file names (for example, container-image*.tar.gz), the docker.io prefix is added to images without a known registry prefix when they are loaded to the Docker cache. For custom image bundle file names (for example, custom-name.tar.gz), the image is loaded to the Docker cache without modifying the registry prefix.

add_docker_prefix: true

For both default and custom image bundle file names, the docker.io prefix is added to images without a known registry prefix when they are loaded to the Docker cache.
You can explicitly set the add_docker_prefix flag in the host overrides file for more granular control over prefixing during bootstrap or restore operations.

Example configuration:
add_docker_prefix: true   # Enable prefix for custom images
add_docker_prefix: false  # Disable prefix for custom images
This flag ensures that images are handled correctly based on their source, providing flexibility for custom image management.
Add the subcloud using dcmanager.
When calling the subcloud add command, specify the install values, bootstrap values, and the subcloud's sysadmin password.

~(keystone_admin)]$ dcmanager subcloud add \
 --bootstrap-address <oam_ip_address_of_subclouds_controller-0> \
 --bootstrap-values /home/sysadmin/subcloud1-bootstrap-values.yaml \
 --sysadmin-password <sysadmin_password> \
 --deploy-config /home/sysadmin/subcloud1-deploy-config.yaml \
 --install-values /home/sysadmin/install-values.yaml \
 --bmc-password <bmc_password>
If the --sysadmin-password is not specified, you are prompted to enter it once the full command is invoked. The password is masked when it is entered.

Enter the sysadmin password for the subcloud:

The --deploy-config option must reference the deployment configuration file mentioned above. In the deployment configurations, static routes from the management or admin interface of a subcloud to the system controller's management subnet must be explicitly listed. This ensures that the subcloud comes online after deployment. If the admin network is used for communication between the system controller and subcloud, the deployment configuration file must include both an admin network type and a management network type interface.

(Optional) The --bmc-password <password> option is used for subcloud installation, and is only required if the --install-values option is specified.

If the --bmc-password <password> option is omitted and the --install-values option is specified, the system administrator is prompted to enter it following the dcmanager subcloud add command. This option is ignored if the --install-values option is not specified. The password is masked when it is entered.

Enter the bmc password for the subcloud:
The dcmanager subcloud show or dcmanager subcloud list command can be used to view subcloud add progress.

At the Central Cloud / System Controller, monitor the progress of the subcloud install, bootstrapping, and deployment by using the deploy status field of the dcmanager subcloud list command.
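For example, progress can be polled from the system controller (the subcloud name subcloud1 is illustrative):

# Summary of all subclouds, including the deploy status column
~(keystone_admin)]$ dcmanager subcloud list

# Detailed status for a single subcloud
~(keystone_admin)]$ dcmanager subcloud show subcloud1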
Caution

If there is an installation failure, or a failure during bootstrapping, you must delete the subcloud before re-adding it using the dcmanager subcloud add command. For more information on deleting, managing, or unmanaging a subcloud, see Managing Subclouds Using the CLI.

If there is a deployment failure, do not delete the subcloud; use the dcmanager subcloud deploy config command to reconfigure the subcloud. For more information, see Managing Subclouds Using the CLI.

You can also monitor detailed logging of the subcloud installation, bootstrapping, and deployment by monitoring the following log file on the active controller in the Central Cloud:
/var/log/dcmanager/ansible/<subcloud_name>_playbook_output.log
For example:
controller-0:/home/sysadmin# tail /var/log/dcmanager/ansible/subcloud_playbook_output.log
k8s.gcr.io: {password: secret, url: null}
quay.io: {password: secret, url: null}
)

TASK [bootstrap/bringup-essential-services : Mark the bootstrap as completed] ***
changed: [subcloud]

PLAY RECAP *********************************************************************
subcloud                   : ok=230  changed=137  unreachable=0  failed=0
Note
The subcloud_playbook_output.log file can rotate; the previous log file is named subcloud_playbook_output.log.1.
Provision the newly installed and bootstrapped subcloud. For detailed deployment procedures for the desired deployment configuration of the subcloud, see the post-bootstrap steps of .
Check and update docker registry credentials on the subcloud:
REGISTRY="docker-registry"
SECRET_UUID=`system service-parameter-list | fgrep $REGISTRY | fgrep auth-secret | awk '{print $10}'`
SECRET_REF=`openstack secret list | fgrep ${SECRET_UUID} | awk '{print $2}'`
openstack secret get ${SECRET_REF} --payload -f value
The secret payload should be "username: sysinv password: <password>". If the secret payload is "username: admin password: <password>", see Updating Docker Registry Credentials on a Subcloud for more information.

For more information on bootstrapping and deploying, see the procedures listed under install-a-subcloud.