
R7.0 Release Notes
ISO image
The pre-built ISO (CentOS and Debian) and Docker images for StarlingX
release 7.0 are located at the CENGN StarlingX mirror
repos:
- http://mirror.starlingx.cengn.ca/mirror/starlingx/release/7.0.0/centos/flock/outputs/
- http://mirror.starlingx.cengn.ca/mirror/starlingx/release/7.0.0/debian/monolithic/outputs/
Note
Debian is a Technology Preview release in StarlingX Release 7.0 and uses the same Docker images as CentOS.
Branch
The source code for StarlingX release 7.0 is available in the r/stx.7.0 branch in the StarlingX repositories.
Deployment
To deploy StarlingX release 7.0, refer to Consuming StarlingX <index-intro-27197f27ad41>.
For detailed installation instructions, see R7.0 Installation Guides.
New features and enhancements
The list below describes the new features and provides links to the associated user guides (where applicable).
Debian-based Solution
The Debian-based solution inherits the 5.10 kernel version from the Yocto project; i.e., the Debian 5.10 kernel is replaced with the Yocto project 5.10.x kernel (linux-yocto).
StarlingX Release 7.0 provides Debian as a Technology Preview release for evaluation purposes.
The Debian release runs Debian Bullseye (11.3). It is limited in scope to the All-in-one Simplex (AIO-SX) configuration. It is also limited in scope to Kubernetes apps and does not yet support running OpenStack on Debian.
See:
index-debian-introduction-8eb59cf0a062
operational-impacts-9cf2e610b5b3
Istio Service Mesh Application
The Istio Service Mesh application is integrated into StarlingX as a system application.
Istio provides traffic management, observability as well as security as a Kubernetes service mesh. For more information, see https://istio.io/.
StarlingX includes the istio-operator container to manage the life cycle of the Istio components.
See: Istio Service Mesh Application <istio-service-mesh-application-eee5ebb3d3c4>
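As a system application, Istio is expected to be managed with the standard StarlingX application commands. A minimal sketch, assuming the application name istio-operator and a bundled tarball (both assumptions; verify with system application-list):
~(keystone_admin)$ system application-upload istio-operator-<version>.tgz
~(keystone_admin)$ system application-apply istio-operator
# Verify the apply status
~(keystone_admin)$ system application-list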
Pod Security Admission Controller
The Beta release of Pod Security Admission (PSA) controller is available in StarlingX release 7.0 as a Technology Preview feature. It will replace Pod Security Policies in a future release.
The PSA controller acts on creation and modification of pods and determines whether they should be admitted based on the requested security context and the defined policies. It provides a more usable, Kubernetes-native solution to enforce Pod Security Standards.
See:
- https://kubernetes.io/docs/concepts/security/pod-security-admission/
pod-security-admission-controller-8e9e6994100f
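For illustration, Pod Security Standards are selected per namespace using the standard upstream PSA labels (the namespace name below is hypothetical):
apiVersion: v1
kind: Namespace
metadata:
  name: demo
  labels:
    # Reject pods that violate the restricted policy
    pod-security.kubernetes.io/enforce: restricted
    # Warn and audit against the baseline policy
    pod-security.kubernetes.io/warn: baseline
    pod-security.kubernetes.io/audit: baseline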
Platform Application Components Revision
The following applications have been updated to a new version in Release 7.0.
- cert-manager, 1.7.1
- metrics-server, 1.0.18
- nginx-ingress-controller, 1.1.1
- oidc-dex, 2.31.1
cert-manager
The upgrade of cert-manager from 0.15.0 to 1.7.1 deprecated support for the cert-manager API versions cert-manager.io/v1alpha2 and cert-manager.io/v1alpha3. When creating cert-manager resources (certificates, issuers, etc.) with Release 7.0, use API version cert-manager.io/v1.
Cert manager resources that are already deployed on the system will be automatically converted to API version of cert-manager.io/v1. Anything created using automation or previous releases should be converted with the cert-manager kubectl plugin using the instructions documented in https://cert-manager.io/docs/installation/upgrading/upgrading-0.16-1.0/#converting-resources before being deployed to the new release.
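For example, the conversion described in the linked instructions uses the cert-manager kubectl plugin (the file names here are hypothetical):
# Convert a resource manifest to the cert-manager.io/v1 API version
$ kubectl cert-manager convert --output-version cert-manager.io/v1 -f certificate-v1alpha2.yaml > certificate-v1.yaml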
metrics-server
In Release 7.0 the Metrics Server will NOT be automatically updated.
To update the Metrics Server, see Install Metrics Server <kubernetes-admin-tutorials-metrics-server>
oidc-dex
Release 7.0 supports Helm overrides of the oidc-auth-apps application. The recommended and legacy example Helm overrides of oidc-auth-apps are supported for upgrades, as described in User Authentication Using Windows Active Directory <user-authentication-using-windows-active-directory-security-index>.
See: configure-oidc-auth-applications
Bond CNI plugin
The Bond CNI plugin v1.0.1 is now supported in Release 7.0.
The Bond CNI plugin provides a method for aggregating multiple network interfaces into a single logical "bonded" interface.
To add a bonded interface to a container, a network attachment definition of type bond must be created and added as a network annotation in the pod specification. The bonded interfaces can be taken from either the host or the container, based on the value of the linksInContainer parameter in the network attachment definition. The plugin provides transparent link aggregation for containerized applications via Kubernetes configuration, for improved redundancy and link capacity.
See:
integrate-the-bond-cni-plugin-2c2f14733b46
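A minimal sketch of a bond network attachment definition, modeled on the upstream Bond CNI examples (the names, mode, and IPAM settings are illustrative assumptions):
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: bond-net1
spec:
  config: '{
    "type": "bond",
    "cniVersion": "0.3.1",
    "name": "bond-net1",
    "mode": "active-backup",
    "failOverMac": 1,
    "linksInContainer": true,
    "miimon": "100",
    "links": [
      {"name": "net1"},
      {"name": "net2"}
    ],
    "ipam": {
      "type": "host-local",
      "subnet": "10.56.217.0/24"
    }
  }'
The pod then requests the bond by listing bond-net1 (and the member networks) in its k8s.v1.cni.cncf.io/networks annotation.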
PTP GNSS and Time SyncE Support for 5G Solutions
Intel's E810 Westport Channel and Logan Beach NICs
support a built-in GNSS module and the ability to distribute clock via
Synchronous Ethernet (SyncE). This feature allows a PPS signal to be
taken in via the module and redistributed to additional NICs on the same
host or on different hosts. This behavior is configured in StarlingX using the clock instance type in the PTP configuration.
These parameters are used to enable the UFL/SMA ports, recovered clock SyncE, etc. Refer to the user's guide for the Westport Channel or Logan Beach NIC for additional details on how to operate these cards.
See: GNSS and SyncE Support <gnss-and-synce-support-62004dc97f3e>
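As a sketch, a clock instance is created and assigned to a host with the standard StarlingX PTP commands (instance and host names are hypothetical; the specific UFL/SMA and recovered-clock parameters vary by NIC, so consult the linked guide):
~(keystone_admin)$ system ptp-instance-add clock-inst1 clock
~(keystone_admin)$ system host-ptp-instance-assign controller-0 clock-inst1
~(keystone_admin)$ system ptp-instance-apply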
PTP Clock TAI Support
A special ptp4l instance level parameter is provided to allow a PTP node to set the currentUtcOffsetValid flag in its announce messages and to correctly set the CLOCK_TAI on the system.
PTP Multiple NIC Boundary Clock Configuration
StarlingX 7.0 provides support for PTP multiple NIC Boundary Clock configuration. Multiple instances of ptp4l, phc2sys and ts2phc can now be configured on each host to support a variety of configurations, including Telecom Boundary Clock (T-BC), Telecom Grandmaster Clock (T-GM) and Ordinary Clock (OC).
See:
ptp-server-config-index
Enhanced Parallel Operations for Distributed Cloud
The following operations can now be performed on a larger number of subclouds in parallel. The supported maximum parallel number ranges from 100 to 500 depending on the type of operation.
- Subcloud Install
- Subcloud Deployment (bootstrap and deploy)
- Subcloud Manage and Sync
- Subcloud Application Deployment/Update
- Patch Orchestration
- Upgrade Orchestration
- Firmware Update Orchestration
- Kubernetes Upgrade Orchestration
- Kubernetes Root CA Orchestration
- Upgrade Prestaging
--force option
The --force option has been added to the dcmanager upgrade-strategy create command. This option upgrades both online and offline subclouds for a single subcloud or a group of subclouds.
See: Distributed Upgrade Orchestration Process Using the CLI <distributed-upgrade-orchestration-process-using-the-cli>
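A minimal sketch (the subcloud name is hypothetical):
# Create the strategy, including offline subclouds, then apply it
~(keystone_admin)$ dcmanager upgrade-strategy create --force subcloud1
~(keystone_admin)$ dcmanager upgrade-strategy apply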
Subcloud Local Installation Enhancements
Error preventive mechanisms have been implemented for subcloud local installation.
- Pre-check to avoid overwriting installed systems
- Unified ISO image for multiple systems and disk configurations
- Prestage execution optimization
- Effective handling of resized docker and docker-distribution filesystems over subcloud upgrade
See: Subcloud Deployment with Local Installation <subcloud-deployment-with-local-installation-4982449058d5>.
Distributed Cloud Horizon Orchestration Updates
You can use the Horizon Web interface to upgrade Kubernetes across the Distributed Cloud system by applying the Kubernetes upgrade strategy for Distributed Cloud Orchestration.
See: apply-a-kubernetes-upgrade-strategy-using-horizon-2bb24c72e947
You can use Horizon to update the device/firmware image across the Distributed Cloud system by applying the firmware update strategy for Distributed Cloud Update Orchestration.
See: apply-the-firmware-update-strategy-using-horizon-e78bf11c7189
You can upgrade the platform software across the Distributed Cloud system by applying the upgrade strategy for Distributed Cloud Upgrade Orchestration.
See: apply-the-upgrade-strategy-using-horizon-d0aab18cc724
You can use the Horizon Web interface as an alternative to the CLI for managing device / firmware image update strategies (Firmware update).
See: create-a-firmware-update-orchestration-strategy-using-horizon-cfecdb67cef2
You can use the Horizon Web interface as an alternative to the CLI for managing Kubernetes upgrade strategies.
See: create-a-kubernetes-upgrade-orchestration-using-horizon-16742b62ffb2
For more information, see Distributed Cloud Guide <index-dist-cloud-kub-95bef233eef0>
Security Audit Logging for Platform Commands
StarlingX logs all StarlingX REST API operator commands, except commands that use only GET requests. StarlingX also logs all SNMP commands, including GET requests.
See:
Operator Command Logging <operator-command-logging>
Operator Login/Authentication Logging <operator-login-authentication-logging>
Security Audit Logging for K8s API
Kubernetes API logging can be enabled and configured in StarlingX, and can be fully configured and enabled at bootstrap time. Post-bootstrap, Kubernetes API logging can only be enabled or disabled. Kubernetes auditing provides a security-relevant, chronological set of records documenting the sequence of actions in a cluster.
See: kubernetes-operator-command-logging-663fce5d74e7
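As a sketch, audit logging could be enabled at bootstrap with overrides in localhost.yml. The apiserver_extra_args key is an assumption here; the flag names match the kube-apiserver audit parameters listed in the Known Limitations section below:
apiserver_extra_args:
  audit-log-path: /var/log/kubernetes/audit/audit.log
  audit-log-maxage: "3"
  audit-log-maxbackup: "10"
  audit-log-maxsize: "100"
  audit-policy-file: /etc/kubernetes/my-audit-policy-file.yml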
Playbook for managing local LDAP Admin User
The purpose of this playbook is to simplify and automate the management of composite local LDAP accounts across multiple systems or standalone systems. A composite local LDAP account is defined as a local LDAP account that also has a unique Keystone account with admin role credentials and access to a Kubernetes ServiceAccount with cluster-admin role credentials.
See: Manage Composite Local LDAP Accounts at Scale <manage-local-ldap-39fe3a85a528>
Kubernetes Custom Configuration
Kubernetes configuration can be customized during deployment by specifying bootstrap overrides in the localhost.yml file during the Ansible bootstrap process. Additionally, you can override the extraVolumes section in the apiserver configuration to add new configuration files that may be needed by the server.
See: Kubernetes Custom Configuration <kubernetes-custom-configuration-31c1fd41857d>
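A minimal sketch of an extraVolumes bootstrap override in localhost.yml. The apiserver_extra_volumes section is named in the Known Limitations section below, and the mount path matches the audit-policy example used there; the exact field names within each entry and the policy content are assumptions:
apiserver_extra_volumes:
  - name: audit-policy-file
    mountPath: /etc/kubernetes/my-audit-policy-file.yml
    pathType: File
    content: |
      apiVersion: audit.k8s.io/v1
      kind: Policy
      rules:
      - level: Metadata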
Configuring Host CPU MHz Parameters
Some hosts support setting a maximum frequency for their CPU cores (application cores and platform cores). You may need to configure a maximum scaled frequency to avoid variability due to power and thermal issues when the host is configured for maximum performance. For these hosts, parameters control the maximum frequency of their CPU cores.
Support is enabled for power-saving modes available on Intel processors, to facilitate a balance between latency and power consumption:
- Permits CPU "p-states" and "c-states" control via the BIOS.
- Introduces a new starlingx-realtime tuned profile, specifically configured for the low-latency profile, to align with Intel recommendations for maximum performance while enabling support for higher c-states.
See: Host CPU MHz Parameters Configuration <host-cpu-mhz-parameters-configuration-d9ccf907ede0>
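A minimal sketch, assuming a max_cpu_mhz_configured host parameter (an assumption; see the linked guide for the exact parameter name; the host name and frequency are hypothetical):
~(keystone_admin)$ system host-lock worker-0
# Cap CPU cores at 2200 MHz
~(keystone_admin)$ system host-update worker-0 max_cpu_mhz_configured=2200
~(keystone_admin)$ system host-unlock worker-0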
vRAN Intel Tool Enablement
The following open-source tools are delivered in the container image docker.io/starlingx/stx-centos-tools-dev:stx.7.0-v1.0.1:
- dmidecode
- net-tools
- iproute
- ethtool
- tcpdump
- turbostat
- OPAE Tools (Open Programmable Acceleration Engine: fpgainfo, fpgabist, etc.)
- ACPICA Tools (acpidump, acpixtract, etc.)
- PCM Tools (https://github.com/opcm/pcm: pcm, pcm-core, etc.)
See: vRAN Tools <vran-tools-2c3ee49f4b0b>
Coredump Configuration Support
You can change the default core dump configuration used to create core files. These are images of the system's working memory used to debug crashes or abnormal exits.
See: Change the Default Coredump Configuration <change-the-default-coredump-configuration-51ff4ce0c9ae>
FluxCD replaces Airship Armada
StarlingX application management provides a wrapper around FluxCD and Kubernetes Helm (see https://github.com/helm/helm) for managing containerized applications. FluxCD is a tool for managing multiple Helm charts with dependencies by centralizing all configurations in a single FluxCD YAML definition and providing life-cycle hooks for all Helm releases.
See: StarlingX Application Package Manager <kubernetes-admin-tutorials-starlingx-application-package-manager>.
See: FluxCD Limitation note applicable to Release 7.0.
Kubernetes Upgrade
Kubernetes has now been upgraded to version 1.23.1, which is the default version for Release 7.0.
NetApp Trident Version Upgrade
StarlingX Release 7.0 contains the installer for Trident 22.01.
If you are using NetApp Trident in StarlingX and have upgraded from the previous release, ensure that your NetApp backend version is compatible with Trident 22.01.
Note
You need to upgrade the NetApp Trident driver to 22.01 before upgrading Kubernetes to 1.22.
See: upgrade-the-netapp-trident-software-c5ec64d213d3
Bug status
Fixed bugs
This release provides fixes for a number of defects. Refer to the StarlingX bug database to review the R7.0 Fixed Bugs.
Known Limitations and Workarounds
The following are known limitations you may encounter with StarlingX Release 7.0 and earlier releases. Workarounds are suggested where applicable.
Note
These limitations are considered temporary and will likely be resolved in a future release.
Debian Bootstrap
On CentOS, bootstrap worked even if dns_servers was not present in localhost.yml. This does not work for Debian bootstrap.
Workaround: For Debian bootstrap, you need to configure the dns_servers parameter in localhost.yml, as long as no FQDNs were used in the bootstrap overrides in the localhost.yml file.
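A minimal sketch of the override in localhost.yml (the server addresses are placeholders):
dns_servers:
  - 8.8.8.8
  - 8.8.4.4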
Installing a Debian ISO
Installing a Debian ISO may fail with a message that the system is in emergency mode. This occurs if the disks and disk partitions are not completely wiped before the install, especially if the server was previously running a CentOS ISO.
Workaround: When installing a lab for any Debian install, the disks must first be completely wiped using the following procedure before starting the install.
Run the following wipedisk commands before any Debian install, for each disk (e.g. sda, sdb, etc.):
sudo wipedisk
# Show the partition table
sudo sgdisk -p /dev/sda
# Clear the partition table
sudo sgdisk -o /dev/sda
Note
The above commands must be run before any Debian install. They must also be run if the same lab is used for CentOS installs after the lab was previously running a Debian ISO.
PTP 100.119 Alarm raised incorrectly on Debian
Alarm 100.119 (controller not locked to remote PTP Grand Master (Primary Time Source)) is raised on Release 7.0 systems running Debian after configuring PTP instances. This alarm does not affect system operations.
Workaround: Manually delete the alarm using the fm alarm-delete command.
Note
Lock/Unlock and reboot events will cause the alarm to reappear. Re-apply the workaround after these operations are completed.
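For example (the alarm UUID is obtained from the alarm listing):
# List alarms to find the UUID of the 100.119 alarm, then delete it
~(keystone_admin)$ fm alarm-list
~(keystone_admin)$ fm alarm-delete <alarm-uuid>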
N3000 image updates are not supported on Debian
N3000 image update and show operations are not supported on Debian. Support will be included in a future release.
Workaround: Do not attempt these operations on a Release 7.0 Debian system.
Security Audit Logging for K8s API
In Release 7.0, a custom policy file can only be created at bootstrap via the apiserver_extra_volumes section. If a custom policy file was configured at bootstrap, then after bootstrap the user has the option to configure the audit-policy-file parameter to either this custom policy file (/etc/kubernetes/my-audit-policy-file.yml) or the default policy file /etc/kubernetes/default-audit-policy.yaml. If no custom policy file was configured at bootstrap, then the user can only configure the audit-policy-file parameter to the default policy file.
Only the audit-policy-file parameter is configurable after bootstrap; the other parameters (audit-log-path, audit-log-maxsize, audit-log-maxage and audit-log-maxbackup) cannot be changed at runtime.
Workaround: NA
See: kubernetes-operator-command-logging-663fce5d74e7
PTP is not supported on Broadcom 57504 NIC
PTP is not supported on the Broadcom 57504 NIC.
Workaround: Do not configure PTP instances on the Broadcom 57504 NIC.
Backup and Restore: Remote restore fails to gather the SSH public key
IPv4 remote restore fails while running restore bootstrap.
Workaround: If the remote restore fails due to failed authentication, perform the restore on the box instead of remotely. This issue occurs when the remote restore fails to gather the SSH public key.
Deploying an App using nginx controller fails with internal error after controller.name override
A Helm override of controller.name on the nginx-ingress-controller app may result in errors when creating ingress resources later on.
Example of Helm override:
cat <<EOF > values.yml
controller:
  name: notcontroller
EOF
~(keystone_admin)$ system helm-override-update nginx-ingress-controller ingress-nginx kube-system --values values.yml
+----------------+-----------------------+
| Property | Value |
+----------------+-----------------------+
| name | ingress-nginx |
| namespace | kube-system |
| user_overrides | controller: |
| | name: notcontroller |
| | |
+----------------+-----------------------+
~(keystone_admin)$ system application-apply nginx-ingress-controller
Workaround: NA
Cloud installation causes disk errors in /dev/mapper/mpatha on CentOS
During installation on an HPE SAN disk, the error "/dev/mapper/mpatha is invalid" occurs intermittently, and CentOS is intermittently not bootable.
Workaround: Reboot the system to resolve the issue.
Optimization with a Large number of OSDs
As storage nodes are not optimized, you may need to optimize your Ceph configuration for balanced operation across deployments with a high number of OSDs. This results in an alarm being generated even if the installation succeeds:
800.001 - Storage Alarm Condition: HEALTH_WARN. Please check 'ceph -s'
Workaround: To optimize your storage nodes with a large number of OSDs, it is recommended to use the following commands:
$ ceph osd pool set kube-rbd pg_num 256
$ ceph osd pool set kube-rbd pgp_num 256
PTP Limitations
NICs using the Intel ice NIC driver may report the following in the ptp4l logs, which might coincide with a PTP port switching to FAULTY before re-initializing.
ptp4l[80330.489]: timed out while polling for tx timestamp
ptp4l[80330.489]: increasing tx_timestamp_timeout may correct
this issue, but it is likely caused by a driver bug
This is due to a limitation of the Intel ice driver.
Workaround: The recommended workaround is to set the tx_timestamp_timeout parameter to 700 (ms) in the ptp4l config using the following command:
~(keystone_admin)]$ system ptp-instance-parameter-add ptp-inst1 tx_timestamp_timeout=700
Multiple Lock/Unlock operations on the controllers causes 100.104 alarm
Performing multiple Lock/Unlock operations on controllers while stx-openstack is applied can fill the /var/lib/systemd/coredump partition and can trigger a 100.104 alarm.
Workaround: Check the amount of space used by core dumps using the controller-0:~$ ls -lha /var/lib/systemd/coredump command. Core dumps related to MariaDB can be safely deleted.
BPF is disabled
BPF cannot be used in the PREEMPT_RT/low-latency kernel due to the inherent incompatibility between PREEMPT_RT and BPF; see https://lwn.net/Articles/802884/.
Some packages might be affected when PREEMPT_RT and BPF are used together. These include, but are not limited to, the following packages:
- libpcap
- libnet
- dnsmasq
- qemu
- nmap-ncat
- libv4l
- elfutils
- iptables
- tcpdump
- iproute
- gdb
- valgrind
- kubernetes
- cni
- strace
- mariadb
- libvirt
- dpdk
- libteam
- libseccomp
- binutils
- libbpf
- dhcp
- lldpd
- containernetworking-plugins
- golang
- i40e
- ice
Workaround: It is recommended not to use BPF with the real-time kernel. If required, it can still be used, for example, for debugging only.
crashkernel Value
crashkernel=auto is no longer supported by newer kernels; the v5.10 kernel does not support the "auto" value.
Workaround: StarlingX uses crashkernel=512m instead of crashkernel=auto.
New Kubernetes Taint on Controllers for Standard Systems
In future releases, a new Kubernetes taint will be applied to controllers in Standard systems to prevent application pods from being scheduled on controllers, since controllers in Standard systems are intended ONLY for platform services. If application pods MUST run on controllers, a Kubernetes toleration of the taint can be specified in the application's pod specifications.
Workaround: Customer applications that need to run on controllers in Standard systems will need to be enabled/configured for Kubernetes toleration to ensure the applications continue working after an upgrade to Release 7.0 and future releases.
You can specify the toleration for a pod through the pod specification (PodSpec). For example:
spec:
  ....
  template:
    ....
    spec:
      tolerations:
      - key: "node-role.kubernetes.io/master"
        operator: "Exists"
        effect: "NoSchedule"
See: Taints and Tolerations.
Ceph alarm 800.001 interrupts the AIO-DX upgrade orchestration
Upgrade orchestration fails on systems that have Ceph enabled.
Workaround: Clear the Ceph alarm 800.001 by manually upgrading both controllers and then running the following command:
~(keystone_admin)]$ ceph mon enable-msgr2
Ceph alarm 800.001 is cleared.
Storage Nodes are not considered part of the Kubernetes cluster
When running the system kube-host-upgrade-list command, the output displays only the controller and worker hosts that have control-plane and kubelet components. Storage nodes do not have any of those components and so are not considered part of the Kubernetes cluster.
Workaround: Do not include Storage nodes.
Backup and Restore of ACC100 (Mount Bryce) configuration requires double unlock attempt
After restoring from a previous backup on a system with an Intel ACC100 processing accelerator device, the first unlock attempt will be refused, since this specific kind of device is updated in the same context.
Workaround: A second unlock attempt after a few minutes will be accepted and the host will unlock.
Application Pods with SRIOV Interfaces
Application pods with SRIOV interfaces require a restart-on-reboot: "true" label in their pod spec template.
Pods with SRIOV interfaces may fail to start after a platform restore or Simplex upgrade, and persist in the ContainerCreating state due to missing PCI address information in the CNI configuration.
Workaround: Application pods that require SRIOV should add the label restart-on-reboot: "true" to their pod spec template metadata. All pods with this label will be deleted and recreated after system initialization; therefore, all such pods must be restartable and managed by a Kubernetes controller (i.e. DaemonSet, Deployment or StatefulSet) for auto recovery.
Pod Spec template example:
template:
  metadata:
    labels:
      tier: node
      app: sriovdp
      restart-on-reboot: "true"
Management VLAN Failure
If the Management VLAN fails on the active System Controller, communication failure 400.005 is detected, and alarm 280.001 is raised indicating subclouds are offline.
Workaround: The System Controller will recover and subclouds will be manageable when the Management VLAN is restored.
Host Unlock During Orchestration
If a host unlock during orchestration takes longer than 30 minutes to complete, a second reboot may occur. Due to the delay, the VIM tries to abort; the abort operation triggers the second reboot.
Workaround: NA
Storage Nodes Recovery on Power Outage
Storage nodes take 10-15 minutes longer to recover in the event of a full power outage.
Workaround: NA
Ceph OSD Recovery on an AIO-DX System
In certain instances a Ceph OSD may not recover on an AIO-DX system (for example, if an OSD comes up after a controller reboot and a swact occurs), and remains in the down state when viewed using the ceph -s command.
Workaround: Manual recovery of the OSD may be required.
Using Helm with Container-Backed Remote CLIs and Clients
If Helm is used within container-backed remote CLIs and clients:
- You will NOT see any Helm installs from the platform's system Armada applications.
  Workaround: Do not use Helm directly to manage the platform's system Armada applications. Manage these applications using system application commands.
- You will NOT see any Helm installs from end-user applications installed using Helm on the controller's local CLI.
  Workaround: It is recommended that you manage your Helm applications only remotely; the controller's local CLI should only be used for management of the platform infrastructure.
Remote CLI Containers Limitation for StarlingX Platform HTTPS Systems
The python2 SSL library has limitations in how certificates are validated. If you are using remote CLI containers, due to this limitation in python2 SSL certificate validation, the certificate used for the 'ssl' certificate should either have:
- CN=IPADDRESS and SAN=empty or,
- CN=FQDN and SAN=FQDN
Workaround: Use CN=FQDN and SAN=FQDN, as CN is a deprecated field in the certificate.
Cert-manager does not work with uppercase letters in IPv6 addresses
Cert-manager does not work with uppercase letters in IPv6 addresses.
Workaround: Replace the uppercase letters in IPv6 addresses with lowercase letters. For example:
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: oidc-auth-apps-certificate
  namespace: test
spec:
  secretName: oidc-auth-apps-certificate
  dnsNames:
  - ahost.com
  ipAddresses:
  - fe80::903a:1c1a:e802:11e4
  issuerRef:
    name: cloudplatform-interca-issuer
    kind: Issuer
Kubernetes Root CA Certificates
Kubernetes does not properly support k8s_root_ca_cert and k8s_root_ca_key being an Intermediate CA.
Workaround: Accept internally generated k8s_root_ca_cert/key or customize only with a Root CA certificate and key.
Windows Active Directory
Limitation: The Kubernetes API does not support uppercase IPv6 addresses.
Workaround: The issuer_url IPv6 address must be specified as lowercase.
Limitation: The refresh token does not work.
Workaround: If the token expires, manually replace the ID token. For more information, see Obtain the Authentication Token Using the Browser <obtain-the-authentication-token-using-the-browser>.
Limitation: TLS error logs are reported in the oidc-dex container on subclouds. These logs should not have any system impact.
Workaround: NA
Limitation: The stx-oidc-client liveness probe sometimes reports failures. These errors may not have system impact.
Workaround: NA
BMC Password
The BMC password cannot be updated.
Workaround: To update the BMC password, de-provision the BMC, and then re-provision it with the new password.
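A minimal sketch of one way to do this, assuming the bm_type/bm_ip/bm_username/bm_password host attributes (an assumption; values are placeholders; verify the attribute names with system help host-update):
# De-provision the BMC, then re-provision it with the new password
~(keystone_admin)$ system host-update controller-0 bm_type=none
~(keystone_admin)$ system host-update controller-0 bm_type=ipmi bm_ip=10.10.10.100 bm_username=admin bm_password=<new-password>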
Application Fails After Host Lock/Unlock
In some situations, an application may fail to apply after a host lock/unlock due to previously evicted pods.
Workaround: Use the kubectl delete command to delete the evicted pods and reapply the application.
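For example (pod and application names are placeholders):
# Find and delete evicted pods, then reapply the application
~(keystone_admin)$ kubectl get pods --all-namespaces | grep Evicted
~(keystone_admin)$ kubectl delete pod <pod-name> -n <namespace>
~(keystone_admin)$ system application-apply <application-name>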
Application Apply Failure if Host Reset
If an application apply is in progress and a host is reset, the apply will likely fail.
Workaround: Once the host recovers and the system is stable, re-apply the application.
Pod Recovery after a Host Reboot
On occasion, some pods may remain in an unknown state after a host is rebooted.
Workaround: To recover these pods, kill them so they are recreated. Also, based on https://github.com/kubernetes/kubernetes/issues/68211, it is recommended that applications avoid using a subPath volume configuration.
Rare Node Not Ready Scenario
In rare cases, an instantaneous loss of communication with the active kube-apiserver may result in kubernetes reporting node(s) as stuck in the "Not Ready" state after communication has recovered and the node is otherwise healthy.
Workaround: A restart of the kubelet process on the affected node(s) will resolve the issue.
Platform CPU Usage Alarms
Alarms may occur indicating platform CPU usage is >90% if a large number of pods are configured with liveness probes that run every second.
Workaround: To mitigate, either reduce the frequency of the liveness probes or increase the number of platform cores. An example of adjusting probe frequency follows.
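As an illustration, probe frequency is set by the standard Kubernetes periodSeconds field in the pod spec (the probe command is hypothetical):
livenessProbe:
  exec:
    command:
    - cat
    - /tmp/healthy
  initialDelaySeconds: 5
  periodSeconds: 10   # probe every 10s instead of every 1s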
Pods Using isolcpus
The isolcpus feature currently does not support allocation of thread siblings for cpu requests (i.e. physical thread +HT sibling).
Workaround: NA
system host-disk-wipe command
The system host-disk-wipe command is not supported in this release.
Workaround: NA
Restrictions on the Size of Persistent Volume Claims (PVCs)
There is a limitation on the size of Persistent Volume Claims (PVCs) that can be used for all StarlingX Platform Releases.
Workaround: It is recommended that all PVCs should be a minimum size of 1GB. For more information, see, https://bugs.launchpad.net/starlingx/+bug/1814595.
Sub-NUMA Cluster Configuration not Supported on Skylake Servers
Sub-NUMA cluster configuration is not supported on Skylake servers.
Workaround: For servers with Skylake Gold or Platinum CPUs, Sub-NUMA clustering must be disabled in the BIOS.
The ptp-notification-demo App is Not a System-Managed Application
The ptp-notification-demo app is provided for demonstration purposes only. Therefore, it is not supported on typical platform operations such as Backup and Restore.
Workaround: NA
Deleting image tags in registry.local may delete tags under the same name
When deleting image tags in the registry.local Docker registry, be aware that deleting an <image-name:tag-name> will delete all tags under the specified <image-name> that have the same 'digest' as the specified <image-name:tag-name>. For more information, see Delete Image Tags in the Docker Registry <delete-image-tags-in-the-docker-registry-8e2e91d42294>.
Workaround: NA
Vault Application
The Vault application is not supported in Release 7.0.
Workaround: NA
Portieris Application
The Portieris application is not supported in Release 7.0.
Workaround: NA
Deprecated Notices
Control Group parameter
The control group (cgroup) parameter kmem.limit_in_bytes has been deprecated, and results in the following message in the kernel's log buffer (dmesg) during boot-up and/or during the Ansible bootstrap procedure: "kmem.limit_in_bytes is deprecated and will be removed. Please report your usecase to linux-mm@kvack.org if you depend on this functionality." This parameter is used by a number of software packages in StarlingX, including, but not limited to, systemd, docker, containerd, libvirt, etc.
Workaround: NA. This is only a warning message about the future deprecation of an interface.
Airship Armada is deprecated
StarlingX Release 7.0 introduces FluxCD based applications that utilize FluxCD Helm/source controller pods deployed in the flux-helm Kubernetes namespace. Airship Armada support is now considered to be deprecated. The Armada pod will continue to be deployed for use with any existing Armada based applications but will be removed in StarlingX Release 8.0, once the stx-openstack Armada application is fully migrated to FluxCD.
Workaround: NA