From 3e6e1f1b0156f2cf74175c77faead9ed65b1dfbf Mon Sep 17 00:00:00 2001
From: Gorka Eguileor
Date: Thu, 8 Oct 2015 16:23:38 +0200
Subject: [PATCH] Support HA Active/Active configurations

This spec outlines the issues that need to be resolved to achieve Cinder
High Availability Active/Active configurations.

Implements: blueprint cinder-volume-active-active-support
Change-Id: I3affb2039888b8a22c3d39634e87344dd4c03729
---
 .../cinder-volume-active-active-support.rst | 260 ++++++++++++++++++
 1 file changed, 260 insertions(+)
 create mode 100644 specs/mitaka/cinder-volume-active-active-support.rst

diff --git a/specs/mitaka/cinder-volume-active-active-support.rst b/specs/mitaka/cinder-volume-active-active-support.rst
new file mode 100644
index 00000000..0f7579dd
--- /dev/null
+++ b/specs/mitaka/cinder-volume-active-active-support.rst
@@ -0,0 +1,260 @@
..
 This work is licensed under a Creative Commons Attribution 3.0 Unported
 License.

 http://creativecommons.org/licenses/by/3.0/legalcode

=============================================================
Cinder Volume Active/Active support - General description
=============================================================

https://blueprints.launchpad.net/cinder/+spec/cinder-volume-active-active-support

Right now the cinder-volume service can only run in an Active/Passive HA
fashion, and this spec proposes a possible path to supporting Active/Active
configurations in Cinder volumes.

This spec only provides a general description of the problem, enumerating the
different issues we have to resolve without going into too much detail. It is
more of a bird's-eye view of the problem.

Each specific issue will have its own spec giving a detailed description of
the problem together with the proposed solution.


Problem description
===================

Right now the cinder-volume service only supports Active/Passive High
Availability configurations, and a number of things need to change for it to
support Active/Active configurations.

API Races
---------

With the current code, API nodes are open to races that affect resources in
the database, and this will be exacerbated when working with Active/Active
configurations.

Manager Local Locks
-------------------

We have multiple local locks in the manager code of the volume nodes to
prevent multiple green threads from accessing the same resource during
specific operations.

This locking is local to each node and doesn't extend to other nodes, so we
need to solve mutual exclusion among volume nodes of the same cluster.

Job distribution
----------------

Cinder has no concept of clusters; it only has the concept of hosts, and each
host implements a specific backend/service. A mechanism is needed to group
hosts from the same cluster under one conceptual unit while retaining the
individual identities of the nodes belonging to the cluster, so crashed nodes
can be told apart during cleanup.

Cleanup
-------

Right now only one node can work on a specific backend, and therefore on the
resources it contains, so cleanup is done by the node itself on startup. If
the node does not come back up and resources are left in a stale state, it is
not a big deal.

It is different with an Active/Active deployment, since multiple nodes share
the same storage back-end and a node can only clean up the resources it was
working on when it died/failed.

It is also important to do proper cleanup even when a specific node does not
come back to life, since other nodes from the same cluster can still manage
those resources.

Data Corruption Prevention
--------------------------

Since multiple nodes will be accessing the same storage back-end, we have to
be extra careful not to access resources that are being accessed by other
nodes.

The most relevant case is when we lose the connection to the DB and can no
longer send service heartbeats: the scheduler's cleanup process (explained in
the Cleanup proposed changes) will kick in, and we could end up with two
different nodes accessing the same resource, one because it is still working
on it and the other because it is trying to do the cleanup.

Drivers' Locks
--------------

Some drivers require mutual exclusion for certain operations or when
accessing the same resources.

This mutual exclusion is currently achieved using local locks, in the same
way the manager does it, and these locks need to keep working when multiple
nodes are accessing the same storage back-end.


Use Cases
=========

Operators that have hard requirements, whether from SLAs or other reasons, to
keep their cloud operational at all times, or that have higher throughput
requirements, will want the possibility of deploying Cinder in an
Active/Active configuration.


Proposed change
===============

API Races
---------

Races on the API nodes will be removed using compare-and-swap updates to the
DB, as sketched below.

- Specs: https://review.openstack.org/207101/
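As an illustration only (not Cinder's actual code or schema), the
compare-and-swap idea folds the state check into the ``WHERE`` clause of the
``UPDATE`` itself, so checking and changing the status happen atomically in
the database. The ``volumes`` table and ``status`` values below are simplified
stand-ins:

.. code-block:: python

    from sqlalchemy import create_engine, Table, Column, Integer, String, MetaData

    engine = create_engine('sqlite://')
    metadata = MetaData()
    volumes = Table('volumes', metadata,
                    Column('id', Integer, primary_key=True),
                    Column('status', String(32)))
    metadata.create_all(engine)

    with engine.begin() as conn:
        conn.execute(volumes.insert().values(id=1, status='available'))

    with engine.begin() as conn:
        # Atomically flip the volume to 'deleting' only if it is still
        # 'available'. A concurrent worker that already changed the status
        # makes this UPDATE match zero rows instead of racing with us.
        result = conn.execute(
            volumes.update()
            .where(volumes.c.id == 1)
            .where(volumes.c.status == 'available')
            .values(status='deleting'))

        if result.rowcount == 0:
            # We lost the race: report a conflict instead of proceeding.
            print('volume 1 is no longer available')

If the conditional update matches zero rows, another worker changed the
resource first, and the API can return an error instead of acting on stale
state.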
Job distribution
----------------

Job distribution will add the concept of a cluster to Cinder, and jobs will
be sent through a topic message queue keyed on the cluster instead of on the
host, as we do now.

- Specs: https://review.openstack.org/232595

Cleanup
-------

Cleanup will keep track of resources that have ongoing operations and will
provide cleanup mechanisms on the scheduler as well as on the volume nodes.

Cleanup on the nodes will happen on initialization, as it does now, but we'll
also have an automatic cleanup job on the scheduler for the cases where a
node with the same host name is not brought back up.

The automatic cleanup mechanism will be disabled by default, and it will be
possible to trigger it manually.

- Specs: https://review.openstack.org/236977

Data Corruption Prevention
--------------------------

When a node can no longer send service heartbeats, it will stop listening for
new jobs from the message broker and halt all ongoing operations, so it is no
longer accessing resources on the storage back-end.

- Specs: https://review.openstack.org/237076

Manager Local Locks
-------------------

The default solution will be to use a DLM with TooZ as the abstraction layer
(see the sketch after the spec links below):

- Specs: https://review.openstack.org/202615

An alternative solution, initially left as a nice-to-have, will be available
for systems that don't want to install a DLM and are using drivers that don't
require distributed locking for Active/Active configurations. This solution
replaces the local file locks in the c-vol manager with a DB locking
mechanism that uses the ``workers`` DB table (introduced by the Cleanup
changes).

- Specs: https://review.openstack.org/237602
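Below is a minimal sketch of what acquiring a lock through the TooZ
abstraction layer could look like. The ZooKeeper URL, member id, and lock
name are illustrative assumptions, not the actual configuration or lock names
Cinder will use:

.. code-block:: python

    from tooz import coordination

    # Connect this service to the DLM backend; deployers choose the driver
    # (ZooKeeper, Redis, ...) through the backend URL.
    coordinator = coordination.get_coordinator(
        'zookeeper://127.0.0.1:2181', b'cinder-volume-node-1')
    coordinator.start()

    # Using the same lock name on every node of the cluster serializes the
    # operation across all cinder-volume services, not just across the
    # green threads of a single node as today's local locks do.
    lock = coordinator.get_lock(b'example-volume-uuid')
    with lock:
        # Critical section: safely operate on the shared resource here.
        pass

    coordinator.stop()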
Drivers' Locks
--------------

We will be using a DLM solution with TooZ as the abstraction layer:

- Specs: https://review.openstack.org/202615

Alternatives
------------

There are quite a number of alternatives to each of the issues we need to
fix, and they are discussed in the respective specs, except for the Drivers'
Locks alternative, which creates a generic locking mechanism by extending the
locking mechanism implemented to remove the `Manager Local Locks`_.

- Specs: https://review.openstack.org/237604


Data model impact
-----------------

Discussed in the respective specs.

REST API impact
---------------

Discussed in the respective specs.

Security impact
---------------

None

Notifications impact
--------------------

None

Other end user impact
---------------------

None

Performance Impact
------------------

Discussed in the respective specs.

Other deployer impact
---------------------

Discussed in the respective specs.

Developer impact
----------------

None


Implementation
==============

Assignee(s)
-----------

Discussed in the respective specs.

Work Items
----------

- API Races
- Job distribution
- Cleanup
- Data Corruption Prevention
- Manager Local Locks
- Drivers' Locks


Dependencies
============

None


Testing
=======

Discussed in the respective specs.


Documentation Impact
====================

Discussed in the respective specs.


References
==========

None