
Upgrading to a Next Major Release
=================================

Upgrading a TripleO deployment to the next major release is done by first
upgrading the undercloud and using it to upgrade the overcloud.

Version specific caveats and notes are called out in the relevant sections
below:

.. note::

   You can use the "Limit Environment Specific Content" in the left hand nav
   bar to restrict content to the upgrade you are performing.

.. note::

   Generic upgrade testing cannot cover all possible deployment
   configurations. Before performing the upgrade in production, test
   it in a matching staging environment, and create a backup of the
   production environment.

.. Undercloud upgrade section
.. include:: ../../install/installation/updating.rst

Upgrading the Overcloud from Pike to Queens
-------------------------------------------

The Queens upgrade workflow for TripleO is significantly different to previous
cycles. At a high level, the workflow no longer relies on Heat to deliver the
upgrade configuration, but instead uses ansible playbooks. Those
playbooks *are* still generated using a Heat stack update of the overcloud
stack, as the first step in the workflow. However there is no configuration
applied during that stack update and so comparatively it takes *much* less
time to complete than a 'normal' stack update. More information and pointers
can be found in the relevant queens-upgrade-spec_ and the
queens-upgrade-dev-docs_.

The TripleO CLI has been updated to accommodate this new workflow. Queens
introduces a new `openstack overcloud upgrade` command, which expects one of
three subcommands: **prepare**, **run** and **converge**. Each subcommand has
its own set of options, which you can explore with --help:

   .. code-block:: bash

          source /home/stack/stackrc
          openstack overcloud upgrade run --help

The Queens upgrade workflow essentially consists of the following steps:

#. `Prepare your environment - get latest container images`_, backup.
   Generate any environment files you need for the upgrade such as the
   references to the latest container images or commands used to switch repos.

#. `openstack overcloud upgrade prepare`_ $OPTS.
   Run a heat stack update to generate the upgrade playbooks.

#. `openstack overcloud upgrade run`_ $OPTS.
   Run the upgrade on specific nodes or groups of nodes. Repeat until all nodes
   are successfully upgraded.

#. `openstack overcloud ceph-upgrade run`_ $OPTS. (optional)
   Not necessary unless a TripleO managed Ceph cluster is deployed in the overcloud;
   this step performs the upgrade of the Ceph cluster.

#. `openstack overcloud upgrade converge`_ $OPTS.
   Finally run a heat stack update, unsetting any upgrade specific variables
   and leaving the heat stack in a healthy state for future updates.
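
The steps above can be strung together as a small driver script. The following
is only a sketch: the ``OPENSTACK`` and ``OPTS`` variables and the
``upgrade_workflow`` function are hypothetical names used for illustration, and
by default the script only prints the commands rather than running them. The
node and role names are the examples used later in this document.

   .. code-block:: bash

      # Set OPENSTACK=openstack to execute for real; by default only print.
      OPENSTACK="${OPENSTACK:-echo openstack}"
      # Placeholder for the options described in the sections below.
      OPTS="${OPTS:-}"

      upgrade_workflow() {
          # Stop at the first step that fails.
          $OPENSTACK overcloud upgrade prepare $OPTS || return 1
          $OPENSTACK overcloud upgrade run --roles Controller || return 1
          $OPENSTACK overcloud upgrade run --nodes overcloud-compute-0 || return 1
          # Optional, only with a TripleO managed Ceph cluster:
          # $OPENSTACK overcloud ceph-upgrade run $OPTS || return 1
          $OPENSTACK overcloud upgrade converge $OPTS || return 1
      }

      upgrade_workflow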

.. _queens-upgrade-dev-docs: https://docs.openstack.org/tripleo-docs/latest/install/developer/upgrades/major_upgrade.html
.. WIP @ https://review.openstack.org/#/c/569443/

Prepare your environment - get latest container images
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

As part of the upgrade, the container images for the target release should be
downloaded to the Undercloud. See the `openstack overcloud container image prepare`
command described in :doc:`../containers_deployment/overcloud` for more information.

The output of this step will be a Heat environment file that contains
references to the latest container images. You will need to pass this file
into the **upgrade prepare** command using the --container-registry-file
option.

You will also need to create an environment file to override the
UpgradeInitCommand_ tripleo-heat-templates parameter, which can be used to
switch the yum repos in use by the nodes during the upgrade. These will likely
be the same commands that were used to switch repositories on the undercloud.

   .. code-block:: bash

      cat <<EOF > init-repo.yaml
      parameter_defaults:
        UpgradeInitCommand: |
          set -e
          #  -- REPLACE LINES WITH YOUR REPO SWITCH COMMANDS --
          curl -L -o /etc/yum.repos.d/delorean.repo https://trunk.rdoproject.org/centos7-queens/current/delorean.repo
          curl -L -o /etc/yum.repos.d/delorean-deps.repo https://trunk.rdoproject.org/centos7-queens/delorean-deps.repo
          yum clean all
      EOF

The resulting init-repo.yaml will then be passed into the upgrade prepare using
the -e option.
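
Because the heredoc is sensitive to indentation, it can be worth checking the
generated file before passing it to upgrade prepare. A minimal sketch, assuming
only ``grep`` is available; the ``check_init_repo`` function name is
hypothetical and the check is deliberately shallow (it is not a YAML parser):

   .. code-block:: bash

      check_init_repo() {
          # $1: path to the environment file to check
          local file="$1"
          # parameter_defaults must be a top level key ...
          if ! grep -q '^parameter_defaults:' "$file"; then
              echo "parameter_defaults is missing or indented" >&2
              return 1
          fi
          # ... and UpgradeInitCommand must be indented beneath it.
          if ! grep -Eq '^[[:space:]]+UpgradeInitCommand:' "$file"; then
              echo "UpgradeInitCommand is missing or not nested" >&2
              return 1
          fi
          echo "$file looks sane"
      }

      # Usage:
      #   check_init_repo init-repo.yaml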

.. _UpgradeInitCommand: https://github.com/openstack/tripleo-heat-templates/blob/1d9629ec0b3320bcbc5a4150c8be19c6eb4096eb/puppet/role.role.j2.yaml#L468-L493

openstack overcloud upgrade prepare
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

   .. note::

      Before running the overcloud upgrade prepare ensure you have a valid backup
      of the current state, including the **undercloud** since there will be a
      Heat stack update performed here.

   .. note::

      If you have enabled neutron_DVR_ in your deployment you must ensure that
      compute nodes are connected to the External network via the
      roles_data.yaml that you will pass using the -r parameter to upgrade prepare.
      This is necessary to allow floating IP connectivity via the external api network.

   .. note::

      After running the upgrade prepare and until successful completion
      of the upgrade converge operation, stack updates to the deployment Heat
      stack are expected to fail. That is, operations such as scaling to add
      a new node or applying any new TripleO configuration via a Heat stack
      update **must not** be performed on a Heat stack that has been prepared
      for upgrade with the 'prepare' command. Only consider doing so after
      running the converge step. See the queens-upgrade-dev-docs_ for more.

Run **overcloud upgrade prepare**. This command expects the full set
of environment files that were passed into the deploy command, as well as the
roles_data.yaml file used to deploy the overcloud you are about to upgrade. The
--container-registry-file should point to the file that was output by the image
prepare command you ran to get the latest container image references.

   .. note::

      It is especially important to remember that you **must** include all
      environment files that were used to deploy the overcloud that you are about
      to upgrade.

   .. code-block:: bash

      openstack overcloud upgrade prepare --templates \
        --container-registry-file /home/stack/containers-default-parameters.yaml \
        -e <ALL Templates from overcloud-deploy.sh> \
        -e init-repo.yaml \
        -r /path/to/roles_data.yaml

This will begin an update on the overcloud Heat stack but without
applying any of the TripleO configuration, as explained above. Once this
`upgrade prepare` operation has successfully completed the heat stack will be
in the UPDATE_COMPLETE state. At that point you can use `config download` to
download and inspect the configuration ansible playbooks that will be used
to deliver the upgrade in the next step:

   .. code-block:: bash

      openstack overcloud config download --config-dir SOMEDIR
      # playbooks will be downloaded to SOMEDIR directory
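
Before moving on, you may want to confirm that the three playbooks used by the
upgrade run step described in the next section were actually downloaded. This
is a sketch only: ``check_playbooks`` is a hypothetical helper, and it searches
recursively on the assumption that the files may be placed in a subdirectory
of the directory given to --config-dir.

   .. code-block:: bash

      check_playbooks() {
          # $1: directory that was passed to --config-dir
          local dir="$1" name missing=0
          for name in upgrade_steps_playbook.yaml \
                      deploy_steps_playbook.yaml \
                      post_upgrade_steps_playbook.yaml; do
              # Search recursively and report each playbook that is absent.
              if ! find "$dir" -name "$name" | grep -q .; then
                  echo "missing: $name"
                  missing=1
              fi
          done
          return "$missing"
      }

      # Usage:
      #   check_playbooks SOMEDIR && echo "all upgrade playbooks present"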

.. _neutron_DVR: https://specs.openstack.org/openstack/neutron-specs/specs/juno/neutron-ovs-dvr.html


openstack overcloud upgrade run
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This will run the ansible playbooks to deliver the upgrade configuration.
By default, 3 playbooks are executed: the upgrade_steps_playbook, then the
deploy_steps_playbook and finally the post_upgrade_steps_playbook. These
playbooks are invoked on those overcloud nodes specified by the --roles or
--nodes parameters, which are mutually exclusive. You are expected to use
--roles for controlplane nodes, since these need to be upgraded in the same
step. For non controlplane nodes, such as Compute or Storage, you can use
--nodes to specify a single node or list of nodes to upgrade.

   .. code-block:: bash

      openstack overcloud upgrade run --roles Controller

**Optionally** specify `--playbook` to manually step through the upgrade
playbooks. You need to run all three, in this order and named as shown below
(no path), for a full upgrade to Queens.

   .. code-block:: bash

      openstack overcloud upgrade run --roles Controller --playbook upgrade_steps_playbook.yaml
      openstack overcloud upgrade run --roles Controller --playbook deploy_steps_playbook.yaml
      openstack overcloud upgrade run --roles Controller --playbook post_upgrade_steps_playbook.yaml
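
The same sequence can be written as a loop that stops at the first playbook
that fails, so that a later playbook is never run on top of a failed one. This
is an illustrative sketch: ``OPENSTACK``, ``PLAYBOOKS`` and
``step_through_playbooks`` are hypothetical names, and by default the commands
are only printed.

   .. code-block:: bash

      # Set OPENSTACK=openstack to execute for real; by default only print.
      OPENSTACK="${OPENSTACK:-echo openstack}"
      # The order matters: upgrade steps, then deploy steps, then post upgrade.
      PLAYBOOKS="upgrade_steps_playbook.yaml deploy_steps_playbook.yaml post_upgrade_steps_playbook.yaml"

      step_through_playbooks() {
          # $@: target selection, for example: --roles Controller
          local playbook
          for playbook in $PLAYBOOKS; do
              $OPENSTACK overcloud upgrade run "$@" --playbook "$playbook" || {
                  echo "$playbook failed, stopping" >&2
                  return 1
              }
          done
      }

      step_through_playbooks --roles Controller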

After all three playbooks have been executed without error on all nodes of
the controller role, the controlplane will have been fully upgraded to Queens.
At a minimum an operator should check the health of the pacemaker cluster:

   .. code-block:: bash

      [root@overcloud-controller-0 ~]# pcs status | grep -C 10 -i "error\|fail"

The operator may also want to confirm that openstack and related service
containers are all in a good state and using the image references passed
during upgrade prepare with the --container-registry-file parameter.

   .. code-block:: bash

      [root@overcloud-controller-0 ~]# docker ps -a
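
The pacemaker check above can be turned into a pass/fail test for use in
scripts. The ``pcs_is_clean`` function below is a hypothetical helper that
applies the same crude text match to whatever it reads on stdin; a match does
not always indicate a real problem, so treat a failure as a prompt to look at
the full `pcs status` output.

   .. code-block:: bash

      pcs_is_clean() {
          # Reads `pcs status` output on stdin. Matching lines are printed and
          # the function fails; it succeeds only when nothing matches.
          if grep -Ei 'error|fail'; then
              return 1
          fi
          return 0
      }

      # Usage, as root on a controller node:
      #   pcs status | pcs_is_clean && echo "no errors or failures reported"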

For non controlplane nodes, such as Compute or ObjectStorage, you can use
`--nodes overcloud-compute-0` to upgrade a particular node, or a comma
separated list such as "compute0,compute1,compute3" for multiple nodes. Note
that the listed nodes are again upgraded in parallel. Also note that you can
still use the `--roles` parameter with non controlplane roles if that is
preferred.

   .. code-block:: bash

      openstack overcloud upgrade run --nodes overcloud-compute-0

Use of `--nodes` allows the operator to upgrade some subset, perhaps just one,
compute or other non controlplane node and verify that the upgrade is
successful. One may even migrate workloads onto the newly upgraded node and
confirm there are no problems, before deciding to proceed with upgrading the
remaining nodes that are still on Pike.

Again you can optionally step through the upgrade playbooks if you prefer. Be
sure to run upgrade_steps_playbook.yaml then deploy_steps_playbook.yaml and
finally post_upgrade_steps_playbook.yaml in that order.

   .. code-block:: bash

      openstack overcloud upgrade run --nodes overcloud-compute-1 \
         --playbook upgrade_steps_playbook.yaml
      # etc for the other 2 as above example for controller
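
If you prefer to verify each node before touching the next one, the nodes can
be upgraded sequentially instead of in one parallel batch. A minimal sketch,
with hypothetical names (``OPENSTACK``, ``upgrade_nodes_one_by_one``) and
printing the commands by default:

   .. code-block:: bash

      # Set OPENSTACK=openstack to execute for real; by default only print.
      OPENSTACK="${OPENSTACK:-echo openstack}"

      upgrade_nodes_one_by_one() {
          # $@: overcloud node names, upgraded one after the other
          local node
          for node in "$@"; do
              $OPENSTACK overcloud upgrade run --nodes "$node" || {
                  echo "upgrade of $node failed, not continuing" >&2
                  return 1
              }
              # Verify the node here (for example by migrating a test
              # workload onto it) before the loop moves on.
          done
      }

      upgrade_nodes_one_by_one overcloud-compute-0 overcloud-compute-1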

When re-running the upgrade, you can specify --skip-tags validation to skip
the step 0 ansible tasks that check whether services are running, in case you
can't or don't want to start them all.

   .. code-block:: bash

      openstack overcloud upgrade run --roles Controller --skip-tags validation

openstack overcloud ceph-upgrade run
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This step is only necessary if Ceph was deployed in the Overcloud and it
triggers an upgrade of the Ceph cluster which will be performed without
taking down the cluster.

   .. note::

      It is especially important to remember that you **must** include all
      environment files that were used to deploy the overcloud that you are about
      to upgrade.

   .. code-block:: bash

      openstack overcloud ceph-upgrade run --templates \
        --container-registry-file /home/stack/containers-default-parameters.yaml \
        -e <ALL Templates from overcloud-deploy.sh> \
        -r /path/to/roles_data.yaml

At the end of the process, Ceph will have been upgraded from Jewel to Luminous,
so there will be new containers for the `ceph-mgr` service running on the
controlplane nodes.

openstack overcloud upgrade converge
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Finally, run the converge heat stack update. This will re-apply all Queens
configuration across all nodes and unset all variables that were used during
the upgrade. Until you have successfully completed this step, heat stack
updates against the overcloud stack are expected to fail. You can read more
about why this is the case in the queens-upgrade-dev-docs_.

   .. note::

      It is especially important to remember that you **must** include all
      environment files that were used to deploy the overcloud that you are about
      to upgrade converge, including the list of Queens container image references
      and the roles_data.yaml roles and services definition. You should omit
      any repo switch commands and ensure that none of the environment files
      you are about to use is specifying a value for UpgradeInitCommand.

   .. note::

      The Queens container image references that were passed into the
      `openstack overcloud upgrade prepare`_ with the `--container-registry-file`
      parameter **must** be included as an environment file, with the -e option
      to the openstack overcloud upgrade converge command, together with all
      other environment files for your deployment.

   .. code-block:: bash

      openstack overcloud upgrade converge --templates \
        -e /home/stack/containers-default-parameters.yaml \
        -e <ALL Templates from overcloud-deploy.sh> \
        -r /path/to/roles_data.yaml
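
Since no environment file passed to converge may set UpgradeInitCommand, a
quick search over the files you are about to use can catch a leftover repo
switch file. This is a sketch only; ``find_upgrade_init_command`` is a
hypothetical helper and it does a plain text match rather than parsing YAML.

   .. code-block:: bash

      find_upgrade_init_command() {
          # $@: the environment files you intend to pass with -e
          # Prints the files that still set UpgradeInitCommand and fails if
          # there are any; succeeds silently when none do.
          local offenders
          [ "$#" -gt 0 ] || return 0
          offenders=$(grep -l '^[[:space:]]*UpgradeInitCommand:' "$@" 2>/dev/null)
          if [ -n "$offenders" ]; then
              echo "$offenders"
              return 1
          fi
          return 0
      }

      # Usage:
      #   find_upgrade_init_command /home/stack/containers-default-parameters.yaml init-repo.yaml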

The Heat stack will be in the **UPDATE_IN_PROGRESS** state for the duration of
the openstack overcloud upgrade converge. Once converge has completed
successfully, the Heat stack should be in the **UPDATE_COMPLETE** state.

Upgrading the Overcloud to Ocata or Pike
----------------------------------------

As of the Ocata release, the upgrades workflow in tripleo has changed
significantly to accommodate the operators' new ability to deploy custom roles
with the Newton release (see the Composable Service Upgrade spec_ for more
info). The new workflow uses ansible upgrades tasks to define the upgrades
workflow on a per-service level. The Pike release upgrade uses a similar
mechanism and the steps are invoked with the same cli. A big difference however
is that after upgrading to Pike most of the overcloud services will be running
in containers.

.. note::

   Upgrades to Pike or Queens will only be tested with containers. Baremetal
   deployments, which don't use containers, will be deprecated in Queens and
   have full support removed in Rocky.

The operator starts the upgrade with an ``openstack overcloud deploy`` that
includes the major-upgrade-composable-steps.yaml_ environment file (or the
docker variant for the `containerized upgrade to Pike`__)
as well as all environment files used on the initial deployment. This will
collect the ansible upgrade tasks for all roles, except those that have the
``disable_upgrade_deployment`` flag set ``True`` in roles_data.yaml_. The
tasks will be executed in a series of steps, for example (and not limited to):
step 0 for validations or other pre-upgrade tasks, step 1 to stop the
pacemaker cluster, step 2 to stop services, step 3 for package updates,
step 4 for cluster startup, step 5 for any special case db syncs or post
package update migrations. The Pike upgrade tasks are in general much simpler
than those used in Ocata since for Pike these tasks are mainly for stopping
and disabling the systemd services, since they will be containerized as part
of the upgrade.

After the ansible tasks have run the puppet (or docker, for Pike containers)
configuration is also applied in the 'normal' manner we do on an initial
deploy, to complete the upgrade and bring services back up, or start the
service containers, as the case may be for Ocata or Pike.

For those roles with the ``disable_upgrade_deployment`` flag set True, the
operator will upgrade the corresponding nodes with the
upgrade-non-controller.sh_. The operator uses that script to invoke the
tripleo_upgrade_node.sh_ which is delivered during the
major-upgrade-composable-steps that come first, as described above.

#. Run the major upgrade composable ansible steps

   This step will upgrade the nodes of all roles that do not explicitly set the
   ``disable_upgrade_deployment`` flag to ``True`` in the roles_data.yaml_
   (this is an operator decision, and the current default is for the **Compute**
   and **ObjectStorage** roles to have this set).

   The ansible upgrades tasks are collected from all service manifests_ and
   executed in a series of steps as described in the introduction above.
   Even before the invocation of these ansible tasks however, this upgrade
   step also delivers the tripleo_upgrade_node.sh_ and role specific puppet
   manifest to allow the operator to upgrade those nodes after this step has
   completed.

   From Ocata to Pike, the Overcloud will be upgraded to a containerized
   environment. All OpenStack related services will run in containers.

   If you deploy TripleO with custom roles, synchronize them with the default
   `roles_data.yaml` of the target release and make sure that any new
   parameters and services are present in your roles.

   .. admonition:: Newton
      :class: newton

      Newton roles_data.yaml is available here:
      https://github.com/openstack/tripleo-heat-templates/blob/stable/newton/roles_data.yaml

   .. admonition:: Ocata
      :class: ocata

      Ocata roles_data.yaml is available here:
      https://github.com/openstack/tripleo-heat-templates/blob/stable/ocata/roles_data.yaml

   .. admonition:: Pike
      :class: pike

      Pike roles_data.yaml is available here:
      https://github.com/openstack/tripleo-heat-templates/blob/stable/pike/roles_data.yaml

   .. admonition:: Queens
      :class: queens

      Queens roles_data.yaml is available here:
      https://github.com/openstack/tripleo-heat-templates/blob/stable/queens/roles_data.yaml


   Create an environment file with commands to switch OpenStack repositories to
   a new release. This will likely be the same commands that were used to switch
   repositories on the undercloud:

   .. code-block:: bash

      cat > overcloud-repos.yaml <<EOF
      parameter_defaults:
        UpgradeInitCommand: |
          set -e
          # REPOSITORY SWITCH COMMANDS GO HERE
      EOF

   .. admonition:: Newton to Ocata
      :class: ntoo

      Run ``overcloud deploy``, passing in full set of environment files plus
      `major-upgrade-composable-steps.yaml` and ``overcloud-repos.yaml``

      .. code-block:: bash

         openstack overcloud deploy --templates \
           -e <full environment> \
           -e /usr/share/openstack-tripleo-heat-templates/environments/major-upgrade-composable-steps.yaml \
           -e overcloud-repos.yaml

   .. note::

      Before upgrading your deployment to containers, you must perform the
      actions described in :doc:`../containers_deployment/overcloud` to
      prepare your environment. In particular, run *image prepare* to generate
      the docker registry file, which you must include as one of the
      environment files specified below.

   .. __:

   Run `overcloud deploy`, passing in full set of environment
   files plus `major-upgrade-composable-steps-docker.yaml` and
   `overcloud-repos.yaml` (and docker registry if upgrading to containers)

   .. code-block:: bash

      openstack overcloud deploy --templates \
        -e <full environment> \
        -e /usr/share/openstack-tripleo-heat-templates/environments/docker.yaml \
        -e /usr/share/openstack-tripleo-heat-templates/environments/major-upgrade-composable-steps-docker.yaml \
        -e overcloud-repos.yaml

   .. note::

      It is especially important to remember that you **must** include all
      environment files that were used to deploy the overcloud that you are about
      to upgrade.

   .. note::

      If the Overcloud has been deployed with Pacemaker, then add the
      `docker-ha.yaml` environment file to the upgrade command

      .. code-block:: bash

         openstack overcloud deploy --templates \
           -e <full environment> \
           -e /usr/share/openstack-tripleo-heat-templates/environments/docker.yaml \
           -e /usr/share/openstack-tripleo-heat-templates/environments/docker-ha.yaml \
           -e /usr/share/openstack-tripleo-heat-templates/environments/major-upgrade-composable-steps-docker.yaml \
           -e overcloud-repos.yaml

   .. admonition:: Ceph
      :class: ceph

      When upgrading to Pike, if Ceph has been deployed in the Overcloud, then
      use the `ceph-ansible.yaml` environment file **instead of**
      `storage-environment.yaml`. Make sure to move any customization into
      `ceph-ansible.yaml` (or a copy of ceph-ansible.yaml)

      .. code-block:: bash

          openstack overcloud deploy --templates \
            -e <full environment> \
            -e /usr/share/openstack-tripleo-heat-templates/environments/docker.yaml \
            -e /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible.yaml \
            -e /usr/share/openstack-tripleo-heat-templates/environments/major-upgrade-composable-steps-docker.yaml \
            -e overcloud-repos.yaml

      Customizations for the Ceph deployment previously passed as hieradata
      via \*ExtraConfig should be removed, as they are ignored. In particular,
      the deployment will stop if ``ceph::profile::params::osds`` is found, to
      ensure the devices list has been migrated to the format expected by
      ceph-ansible. It is possible to use the ``CephAnsibleExtraConfig`` and
      ``CephAnsibleDisksConfig`` parameters to pass arbitrary variables to
      ceph-ansible, like ``devices`` and ``dedicated_devices``. See the
      `ceph-ansible scenarios`_ or the :doc:`TripleO Ceph config guide
      <../advanced_deployment/ceph_config>`.

      The other parameters (for example ``CinderRbdPoolName``,
      ``CephClientUserName``, ...) will behave as they used to with puppet-ceph
      with the only exception of ``CephPools``. This can be used to create
      additional pools in the Ceph cluster but the two tools expect the list
      to be in a different format. Specifically while puppet-ceph expected it
      in this format::

        {
         "mypool": {
          "size": 1,
          "pg_num": 32,
          "pgp_num": 32
         }
        }

      with ceph-ansible that would become::

        [{"name": "mypool", "pg_num": 32, "rule_name": ""}]

   .. note::

      The first step of the ansible tasks is to validate that the deployment is
      in a good state before performing any other upgrade operations. Each
      service manifest in the tripleo-heat-templates includes a check that it is
      running and if any of those checks fail the upgrade will exit early at
      ansible step 0.

      If you are re-running the upgrade after an initial failed attempt, you may
      need to disable these checks in order to allow the upgrade to proceed with
      services down. This is done with the SkipUpgradeConfigTags parameter to
      specify that tasks with the 'validation' tag should be skipped. You can
      include this in any of the environment files you are using::

         SkipUpgradeConfigTags: [validation]

#. Upgrade remaining nodes for roles with ``disable_upgrade_deployment: True``

   It is expected that the operator will want to upgrade the roles that have the
   ``openstack-nova-compute`` and ``openstack-swift-object`` services deployed
   to allow for pre-upgrade migration of workloads. For this reason the default
   ``Compute`` and ``ObjectStorage`` roles in the roles_data.yaml_ have the
   ``disable_upgrade_deployment`` set ``True``.

   Note that unlike in previous releases, this operator driven upgrade step
   includes a full puppet configuration run, just as happens after the ansible
   steps on the roles on which those are executed. The significance is that
   nodes are 'fully' upgraded after each step completes, rather than having to
   wait for the final converge step as has previously been the case. In the
   case of Ocata to Pike the full puppet/docker config is applied to bring up
   the overcloud services in containers.

   The tripleo_upgrade_node.sh_ script and puppet configuration are delivered to
   the nodes with ``disable_upgrade_deployment`` set ``True`` during the initial
   major upgrade composable steps in step 1 above.

   For Ocata to Pike, the tripleo_upgrade_node.sh_ is still delivered to the
   ``disable_upgrade_deployment`` nodes but is now empty. Instead, the
   `upgrade-non-controller.sh` downloads ansible playbooks and those are
   executed to deliver the upgrade. See the Queens-upgrade-spec_ for more
   information on this mechanism.

   To upgrade remaining roles (at your convenience):

   .. code-block:: bash

      upgrade-non-controller.sh --upgrade overcloud-compute-0

      for i in $(seq 0 2); do
        upgrade-non-controller.sh --upgrade overcloud-objectstorage-$i &
      done

#. Converge to unpin Nova RPC

   The final step is required to unpin Nova RPC version. Unlike in previous
   releases, for Ocata the puppet configuration has already been applied to
   nodes as part of each upgrades step, i.e. after the ansible tasks or when
   invoking the tripleo_upgrade_node.sh_ script to upgrade compute nodes. Thus
   the significance of this step is somewhat diminished compared to previously.
   However a re-application of puppet configuration across all nodes here will
   also serve as a sanity check and hopefully show any issues that an operator
   may have missed during any of the previous upgrade steps.

   To converge, run the deploy command with `major-upgrade-converge-docker.yaml`

   .. code-block:: bash

      openstack overcloud deploy --templates \
        -e <full environment> \
        -e /usr/share/openstack-tripleo-heat-templates/environments/docker.yaml \
        -e /usr/share/openstack-tripleo-heat-templates/environments/major-upgrade-converge-docker.yaml

   .. admonition:: Newton to Ocata
      :class: ntoo

      For Newton to Ocata, run the deploy command with
      `major-upgrade-pacemaker-converge.yaml`

      .. code-block:: bash

         openstack overcloud deploy --templates \
           -e <full environment> \
           -e /usr/share/openstack-tripleo-heat-templates/environments/major-upgrade-pacemaker-converge.yaml

   .. note::

      If the Overcloud has been deployed with Pacemaker, then add the
      `docker-ha.yaml` environment file to the upgrade command

      .. code-block:: bash

         openstack overcloud deploy --templates \
           -e <full environment> \
           -e /usr/share/openstack-tripleo-heat-templates/environments/docker.yaml \
           -e /usr/share/openstack-tripleo-heat-templates/environments/docker-ha.yaml \
           -e /usr/share/openstack-tripleo-heat-templates/environments/major-upgrade-converge-docker.yaml

         openstack overcloud deploy --templates \
           -e <full environment> \
           -e /usr/share/openstack-tripleo-heat-templates/environments/major-upgrade-converge.yaml

   .. note::

      It is especially important to remember that you **must** include all
      environment files that were used to deploy the overcloud.
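
For the containerized upgrade to Pike in the first step above, the list of
environment files depends on whether Pacemaker and Ceph are in use. The sketch
below assembles that list. ``THT`` and ``pike_upgrade_env_args`` are
hypothetical names, and combining the Pacemaker and Ceph files in a single
command is an assumption made here, since the examples above only show them
separately. Your own full set of environment files must still be passed first.

   .. code-block:: bash

      THT=/usr/share/openstack-tripleo-heat-templates/environments

      pike_upgrade_env_args() {
          # $1: "ha" if the overcloud was deployed with Pacemaker
          # $2: "ceph" if Ceph was deployed in the overcloud
          local args="-e $THT/docker.yaml"
          if [ "$1" = "ha" ]; then
              args="$args -e $THT/docker-ha.yaml"
          fi
          if [ "$2" = "ceph" ]; then
              # Used instead of storage-environment.yaml when upgrading to Pike.
              args="$args -e $THT/ceph-ansible/ceph-ansible.yaml"
          fi
          args="$args -e $THT/major-upgrade-composable-steps-docker.yaml"
          echo "$args -e overcloud-repos.yaml"
      }

      # Usage:
      #   openstack overcloud deploy --templates \
      #     -e <full environment> \
      #     $(pike_upgrade_env_args ha ceph)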

.. _spec: https://specs.openstack.org/openstack/tripleo-specs/specs/ocata/tripleo-composable-upgrades.html
.. _major-upgrade-composable-steps.yaml: https://github.com/openstack/tripleo-heat-templates/blob/master/environments/major-upgrade-composable-steps.yaml
.. _roles_data.yaml: https://github.com/openstack/tripleo-heat-templates/blob/master/roles_data.yaml
.. _tripleo_upgrade_node.sh: https://github.com/openstack/tripleo-heat-templates/blob/master/extraconfig/tasks/tripleo_upgrade_node.sh
.. _upgrade-non-controller.sh: https://github.com/openstack/tripleo-common/blob/master/scripts/upgrade-non-controller.sh
.. _manifests: https://github.com/openstack/tripleo-heat-templates/tree/master/puppet/services
.. _Queens-upgrade-spec: https://specs.openstack.org/openstack/tripleo-specs/specs/queens/tripleo_ansible_upgrades_workflow.html
.. _ceph-ansible scenarios: https://github.com/ceph/ceph-ansible/blob/stable-3.0/docs/source/testing/scenarios.rst


Upgrading the Overcloud to Newton and earlier
---------------------------------------------

.. note::

   The `openstack overcloud deploy` calls in upgrade steps below are
   non-blocking. Make sure that the overcloud is `UPDATE_COMPLETE` in
   `openstack stack list` and `sudo pcs status` on a controller reports
   everything running fine before proceeding to the next step.
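
Because the deploy calls return before the stack update has finished, a small
polling loop can be used to wait between steps. This is a sketch under stated
assumptions: the stack is assumed to have the default name ``overcloud``, and
``stack_status`` and ``wait_for_update_complete`` are hypothetical helper
names. It only covers the Heat side; `sudo pcs status` still needs to be
checked on a controller.

   .. code-block:: bash

      stack_status() {
          # Prints the Heat status of the given stack (default: overcloud).
          openstack stack show "${1:-overcloud}" -f value -c stack_status
      }

      wait_for_update_complete() {
          # $1: command that prints the current status (default: stack_status)
          # $2: seconds to sleep between polls (default: 30)
          local get_status="${1:-stack_status}" interval="${2:-30}" status
          while true; do
              status=$($get_status)
              case "$status" in
                  UPDATE_COMPLETE) echo "stack is $status"; return 0 ;;
                  *FAILED)         echo "stack is $status" >&2; return 1 ;;
                  "")              echo "could not read stack status" >&2; return 1 ;;
                  *)               sleep "$interval" ;;
              esac
          done
      }

      # Usage, after sourcing stackrc on the undercloud:
      #   wait_for_update_complete && echo "safe to continue with the next step"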

.. admonition:: Mitaka to Newton
   :class: mton

   **Deliver the migration for ceilometer to run under httpd.**

   This is to deliver the migration for ceilometer to be run under httpd (apache)
   rather than eventlet as was the case before. To execute this step run
   `overcloud deploy`, passing in the full set of environment files plus
   `major-upgrade-ceilometer-wsgi-mitaka-newton.yaml`

   .. code-block:: bash

      openstack overcloud deploy --templates \
        -e <full environment> \
        -e /usr/share/openstack-tripleo-heat-templates/environments/major-upgrade-ceilometer-wsgi-mitaka-newton.yaml

#. Upgrade initialization

   The initialization step switches to new repositories on overcloud nodes, and
   it delivers upgrade scripts to nodes which are going to be upgraded
   one-by-one (this means non-controller nodes, except any stand-alone block
   storage nodes).

   Create an environment file with commands to switch OpenStack repositories to
   a new release. This will likely be the same commands that were used to
   switch repositories on the undercloud

   .. code-block:: bash

      cat > overcloud-repos.yaml <<EOF
      parameter_defaults:
        UpgradeInitCommand: |
          set -e
          # REPOSITORY SWITCH COMMANDS GO HERE
      EOF


   And run `overcloud deploy`, passing in full set of environment files plus
   `major-upgrade-pacemaker-init.yaml` and `overcloud-repos.yaml`

   .. code-block:: bash

      openstack overcloud deploy --templates \
        -e <full environment> \
        -e /usr/share/openstack-tripleo-heat-templates/environments/major-upgrade-pacemaker-init.yaml \
        -e overcloud-repos.yaml

#. Object storage nodes upgrade

   If the deployment has any standalone object storage nodes, upgrade them
   one-by-one using the `upgrade-non-controller.sh` script on the undercloud
   node

   .. code-block:: bash

      upgrade-non-controller.sh --upgrade <nova-id of object storage node>

   This is run before controller node upgrade because swift storage services
   should be upgraded before swift proxy services.

#. Upgrade controller and block storage nodes

   .. admonition:: Mitaka to Newton
      :class: mton

      **Explicitly disable sahara services if so desired:**
      As discussed at bug1630247_  sahara services are disabled by default in
      the Newton overcloud deployment. This special case is handled for the
      duration of the upgrade by defaulting to 'keep sahara-\*'.

      That is by default sahara services are restarted after the mitaka to
      newton upgrade of controller nodes and sahara config is re-applied during
      the final upgrade converge step.

      If an operator wishes to **disable** sahara services as part of the
      mitaka to newton upgrade they need to include the
      major-upgrade-remove-sahara.yaml_ environment file during the controller
      upgrade step as well as during the converge step later

      .. code-block:: bash

         openstack overcloud deploy --templates \
           -e <full environment> \
           -e /usr/share/openstack-tripleo-heat-templates/environments/major-upgrade-pacemaker.yaml \
           -e /usr/share/openstack-tripleo-heat-templates/environments/major-upgrade-remove-sahara.yaml

   All controllers will be upgraded in sync in order to make services only talk
   to DB schema versions they expect. Services will be unavailable during this
   operation. Standalone block storage nodes are automatically upgraded in this
   step too, in sync with controllers, because block storage services don't
   have a version pinning mechanism.

   Run the deploy command with `major-upgrade-pacemaker.yaml`

   .. code-block:: bash

      openstack overcloud deploy --templates \
        -e <full environment> \
        -e /usr/share/openstack-tripleo-heat-templates/environments/major-upgrade-pacemaker.yaml

   Services of the compute component on the controller nodes are now pinned to
   communicate like the older release, ensuring that they can talk to the
   compute nodes which haven't been upgraded yet.

   .. note::

      If this step fails, it may leave the pacemaker cluster stopped (together
      with all OpenStack services on the controller nodes). The root cause and
      restoration procedure may vary, but in simple cases the pacemaker cluster
      can be started by logging into one of the controllers and running `sudo
      pcs cluster start --all`.

   .. note::

      After this step, or if this step failed with the error `ERROR: upgrade
      cannot start with some cluster nodes being offline`, it's possible
      that some pacemaker resources need to be cleaned up. Check the failed
      actions and clean them up by running the following as root on **only
      one** controller node:

      .. code-block:: bash

         pcs status
         pcs resource cleanup

      It can take a few minutes for the cluster to go back to a “normal” state
      as displayed by `crm_mon`. This is expected.

#. Upgrade ceph storage nodes

   If the deployment has any ceph storage nodes, upgrade them one-by-one using
   the `upgrade-non-controller.sh` script on the undercloud node:

   .. code-block:: bash

      upgrade-non-controller.sh --upgrade <nova-id of ceph storage node>

#. Upgrade compute nodes

   Upgrade compute nodes one-by-one using the `upgrade-non-controller.sh`
   script on the undercloud node:

   .. code-block:: bash

      upgrade-non-controller.sh --upgrade <nova-id of compute node>
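
   When several nodes are involved, the one-by-one procedure above can be
   wrapped in a small loop. The following is only a minimal sketch, not part
   of the TripleO tooling: the `upgrade_nodes_serially` function and the
   `UPGRADE_CMD` override are hypothetical names introduced here for
   illustration, and the nova IDs are assumed to be supplied by the operator.

   .. code-block:: bash

      #!/usr/bin/env bash

      # Upgrade the given nodes strictly one at a time and stop at the first
      # failure. UPGRADE_CMD is a hypothetical override (useful for dry runs);
      # by default the script described in this step is called.
      upgrade_nodes_serially() {
          local upgrade_cmd="${UPGRADE_CMD:-upgrade-non-controller.sh}"
          local node
          for node in "$@"; do
              echo "Upgrading node ${node}"
              if ! "${upgrade_cmd}" --upgrade "${node}"; then
                  echo "Upgrade of ${node} failed; not continuing" >&2
                  return 1
              fi
          done
      }

      # Usage (run on the undercloud, passing one nova ID per argument):
      #   upgrade_nodes_serially NOVA_ID_1 NOVA_ID_2

   Stopping at the first failure is a design choice of this sketch: the
   remaining nodes are left untouched so that the failed node can be
   investigated before the upgrade continues.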

#. Apply configuration from upgraded tripleo-heat-templates

   .. admonition:: Mitaka to Newton
      :class: mton

      **Explicitly disable sahara services if so desired:**
      As discussed in bug1630247_, sahara services are disabled by default in
      the Newton overcloud deployment. This special case is handled for the
      duration of the upgrade by defaulting to 'keep sahara-\*'.

      That is, by default, sahara services are restarted after the Mitaka to
      Newton upgrade of controller nodes, and the sahara config is re-applied
      during the final upgrade converge step.

      If an operator wishes to **disable** sahara services as part of the
      Mitaka to Newton upgrade, they need to include the
      major-upgrade-remove-sahara.yaml_ environment file during the controller
      upgrade step earlier as well as during the converge step here:

      .. code-block:: bash

         openstack overcloud deploy --templates \
           -e <full environment> \
           -e /usr/share/openstack-tripleo-heat-templates/environments/major-upgrade-pacemaker-converge.yaml \
           -e /usr/share/openstack-tripleo-heat-templates/environments/major-upgrade-remove-sahara.yaml
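
      Because the same choice has to be made in both the controller upgrade
      step and the converge step, it may help to keep it in a single variable.
      The following is only a sketch under that assumption: the
      `upgrade_env_args` function and the `REMOVE_SAHARA` toggle are
      hypothetical names introduced for illustration, while the environment
      file paths are the ones used in the commands above.

      .. code-block:: bash

         #!/usr/bin/env bash

         THT_ENV_DIR=/usr/share/openstack-tripleo-heat-templates/environments

         # Print the "-e" arguments for one upgrade step, one word per line.
         # $1 is the step's environment file, for example
         # major-upgrade-pacemaker.yaml or major-upgrade-pacemaker-converge.yaml.
         # REMOVE_SAHARA is a hypothetical toggle; set it to "true" for BOTH steps.
         upgrade_env_args() {
             local step_env="$1"
             printf '%s\n' -e "${THT_ENV_DIR}/${step_env}"
             if [ "${REMOVE_SAHARA:-false}" = "true" ]; then
                 printf '%s\n' -e "${THT_ENV_DIR}/major-upgrade-remove-sahara.yaml"
             fi
         }

         # Usage: collect the arguments into an array and append them to the
         # deploy command after the usual environment files.
         #   mapfile -t step_args < <(upgrade_env_args major-upgrade-pacemaker-converge.yaml)
         #   openstack overcloud deploy --templates ... "${step_args[@]}"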

   .. _bug1630247: https://bugs.launchpad.net/tripleo/+bug/1630247
   .. _major-upgrade-remove-sahara.yaml: https://github.com/openstack/tripleo-heat-templates/blob/2e6cc07c1a74c2dd7be70568f49834bace499937/environments/major-upgrade-remove-sahara.yaml


   This step unpins compute services communication (upgrade level) on
   controller and compute nodes, and it triggers configuration management
   tooling to converge the overcloud configuration according to the new release
   of `tripleo-heat-templates`.

   Make sure that all overcloud nodes have been upgraded to the new release,
   and then run the deploy command with `major-upgrade-pacemaker-converge.yaml`:

   .. code-block:: bash

      openstack overcloud deploy --templates \
        -e <full environment> \
        -e /usr/share/openstack-tripleo-heat-templates/environments/major-upgrade-pacemaker-converge.yaml


   .. note::

      After the converge step, it's possible that some pacemaker resources
      need to be cleaned up. Check the failed actions and clean them up by
      running the following as root on **only one** controller:

      .. code-block:: bash

         pcs status
         pcs resource cleanup

      It can take a few minutes for the cluster to go back to a “normal” state as
      displayed by ``crm_mon``. This is expected.
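
   The wait described in the note above can be scripted as a simple polling
   loop. This is a minimal sketch only: the `wait_for_clean_cluster` function
   and the `STATUS_CMD` override are hypothetical names, and the check assumes
   that failed actions appear in the `pcs status` output under a heading
   containing the word "Failed" and that `pcs status` exits non-zero while the
   cluster is not running. Verify both assumptions against the pacemaker
   version in use.

   .. code-block:: bash

      #!/usr/bin/env bash

      # Poll the cluster status until no failed actions are reported.
      # $1: maximum number of checks (default 30)
      # $2: seconds to sleep between checks (default 10)
      # STATUS_CMD is a hypothetical override; "pcs status" is used by default.
      wait_for_clean_cluster() {
          local attempts="${1:-30}" delay="${2:-10}"
          local status_cmd="${STATUS_CMD:-pcs status}"
          local i output
          for ((i = 1; i <= attempts; i++)); do
              # Assumption: failed actions are listed under a "Failed ..." heading.
              if output="$(${status_cmd} 2>&1)" && ! grep -q "Failed" <<<"${output}"; then
                  echo "No failed actions reported after ${i} check(s)"
                  return 0
              fi
              sleep "${delay}"
          done
          echo "Failed actions still reported after ${attempts} checks" >&2
          return 1
      }

      # Usage (as root on only one controller, after "pcs resource cleanup"):
      #   wait_for_clean_cluster 30 10

   A non-zero return only means that the status did not become clean within
   the allotted checks; the root cause still has to be investigated manually.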