docs/doc/source/updates/kubernetes/software-upgrades.rst
Ron Stone d8d90b4d75 Install conditionalizations
2021-07-22 19:59:01 -04:00

Software Upgrades

Software upgrades enable you to move from one release of the software to the next release.

A software upgrade is a multi-step rolling-upgrade process in which hosts are upgraded one at a time while continuing to provide hosting services to their hosted applications. An upgrade can be performed manually or using Upgrade Orchestration, which automates much of the upgrade procedure, leaving a few manual steps to prevent operator oversight. For more information on manual upgrades, see Manual Platform Components Upgrade <manual-upgrade-overview>. For more information on upgrade orchestration, see Orchestrated Platform Component Upgrade <orchestration-upgrade-overview>.

Warning

Do NOT use information in the guide for orchestrated software upgrades. If information in this document is used for an orchestrated upgrade, the upgrade will fail, resulting in an outage. The Upgrade Orchestrator automates a recursive rolling upgrade of all subclouds and all hosts within the subclouds.

Before starting the upgrade process:

  • the system must be “patch current”
  • there must be no management-affecting alarms present on the system
  • the new software load must be imported, and
  • a valid license file for the new software release must be installed
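The checklist above can be expressed as a small pre-flight check. The following is a minimal sketch only: the `SystemStatus` fields and the `unmet_preconditions` helper are hypothetical names introduced for illustration and are not part of any platform API. How each fact is actually gathered (patch state, alarms, imported loads, licensing) is left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SystemStatus:
    """Hypothetical snapshot of the facts the pre-upgrade checklist depends on."""
    patch_current: bool
    management_affecting_alarms: int
    new_load_imported: bool
    new_release_license_installed: bool


def unmet_preconditions(status: SystemStatus) -> List[str]:
    """Return the preconditions that are not yet satisfied (empty list means ready)."""
    problems = []
    if not status.patch_current:
        problems.append("system is not patch current")
    if status.management_affecting_alarms > 0:
        problems.append("management-affecting alarms are present")
    if not status.new_load_imported:
        problems.append("new software load has not been imported")
    if not status.new_release_license_installed:
        problems.append("no valid license for the new release is installed")
    return problems
```

An empty result means all four conditions hold; otherwise each entry names one condition to resolve before starting the upgrade.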

The upgrade process starts by upgrading the controllers. The standby controller is upgraded first: it is loaded with the new release of software, and all the controller services' databases are migrated for the new release. Activity is then switched to the upgraded controller, which runs in a 'compatibility' mode where all inter-node messages use message formats from the old release of software. The point just before the second controller is upgraded is the "point-of-no-return for an in-service abort" of the upgrade process. The second controller is then loaded with the new release of software and becomes the new standby controller. For more information on manual upgrades, see Manual Platform Components Upgrade <manual-upgrade-overview>.

If present, storage nodes are locked, upgraded and unlocked one at a time in order to respect the redundancy model of storage nodes. Storage nodes can be upgraded in parallel if using upgrade orchestration.

Upgrade of worker nodes is the next step in the process. When a worker node is locked, the node is tainted so that Kubernetes shuts down any pods on that node and restarts them on another worker node. When a worker node is upgraded, it network boots and installs the new software from the active controller. After the worker node is unlocked, its worker services run in a 'compatibility' mode where all inter-node messages use message formats from the old release of software. Note that worker nodes can only be upgraded in parallel if using upgrade orchestration.

The final step of the upgrade process is to activate and complete the upgrade. This involves disabling 'compatibility' modes on all hosts and clearing the Upgrade Alarm.
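The ordering described above for a manual upgrade can be summarized as a simple plan generator. This is an illustrative sketch, not the platform's implementation: `plan_manual_upgrade` and the step strings are hypothetical, and the plan models only the manual, one-host-at-a-time case (orchestration may upgrade storage and worker nodes in parallel).

```python
from typing import List


def plan_manual_upgrade(standby: str, active: str,
                        storage: List[str], workers: List[str]) -> List[str]:
    """Build the ordered list of steps for a manual rolling upgrade."""
    # Controllers first: the standby controller, then a switch of activity,
    # then the remaining controller.
    steps = [
        f"upgrade {standby} (standby controller)",
        f"switch activity to {standby}",
        f"upgrade {active} (becomes the new standby controller)",
    ]
    # Storage nodes (if any) come before worker nodes; each host is
    # locked, upgraded and unlocked before the next one is touched.
    for host in storage + workers:
        steps += [f"lock {host}", f"upgrade {host}", f"unlock {host}"]
    # Finally, compatibility modes are disabled and the upgrade is closed out.
    steps += ["activate upgrade", "complete upgrade"]
    return steps
```

For example, `plan_manual_upgrade("controller-1", "controller-0", ["storage-0"], ["worker-0", "worker-1"])` yields the controller steps, then one lock/upgrade/unlock triple per storage and worker host, and ends with the activate and complete steps.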

Rolling Back / Aborting an Upgrade

In general, any issues encountered during an upgrade should be addressed during the upgrade with the intention of completing the upgrade after the issues are resolved. Issues specific to a storage or worker host can be addressed by temporarily downgrading the host, addressing the issues and then upgrading the host again, or in some cases by replacing the node.

In extremely rare cases, it may be necessary to abort an upgrade. This is a last resort and should only be done if there is no other way to address the issue within the context of the upgrade. There are two cases for doing such an abort:

  • Before controller-0 has been upgraded (that is, only controller-1 has been upgraded): In this case the upgrade can be aborted and the system remains in service during the abort. See Rolling Back a Software Upgrade Before the Second Controller Upgrade <rolling-back-a-software-upgrade-before-the-second-controller-upgrade>.
  • After controller-0 has been upgraded (that is, both controllers have been upgraded): In this case the upgrade can only be aborted with a complete outage and a reinstall of all hosts. This would only be done as a last resort, if there were absolutely no other way to recover the system. See Rolling Back a Software Upgrade After the Second Controller Upgrade <rolling-back-a-software-upgrade-after-the-second-controller-upgrade>.
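The two abort cases above reduce to a single decision on whether controller-0 has already been upgraded. A minimal sketch follows, with `abort_impact` as a hypothetical helper name used only for illustration:

```python
def abort_impact(controller_0_upgraded: bool) -> str:
    """Describe the cost of aborting, based on how far the upgrade has progressed."""
    if controller_0_upgraded:
        # Both controllers are on the new release: past the point of no return
        # for an in-service abort.
        return "complete outage and reinstall of all hosts"
    # Only controller-1 has been upgraded: the system stays in service.
    return "in-service abort"
```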