From 1f9ddb7692580dde4eafc2ae752997520b0b5064 Mon Sep 17 00:00:00 2001
From: Gorka Eguileor
Date: Mon, 19 Oct 2015 20:23:41 +0200
Subject: [PATCH] Remove Manager Local Locks for HA A/A

This spec outlines the work needed to remove local locks from Volume
nodes. Two methods are used: filters on the compare-and-swap mechanism
and a lock mechanism using the workers table.

This is one of the aspects that are needed to support Active/Active in
cinder.

Implements: blueprint cinder-volume-active-active-support
Change-Id: Iab5bf8c3971d4a908bec16882fb4ca4a50c7dfdb
---
 specs/newton/ha-aa-manager_locks.rst | 273 +++++++++++++++++++++++++++
 1 file changed, 273 insertions(+)
 create mode 100644 specs/newton/ha-aa-manager_locks.rst

diff --git a/specs/newton/ha-aa-manager_locks.rst b/specs/newton/ha-aa-manager_locks.rst
new file mode 100644
index 00000000..2ed46495
--- /dev/null
+++ b/specs/newton/ha-aa-manager_locks.rst
@@ -0,0 +1,273 @@
..
 This work is licensed under a Creative Commons Attribution 3.0 Unported
 License.

 http://creativecommons.org/licenses/by/3.0/legalcode

==========================================================
Cinder Volume Active/Active support - Manager Local Locks
==========================================================

https://blueprints.launchpad.net/cinder/+spec/cinder-volume-active-active-support

Right now the cinder-volume service can only run in Active/Passive HA
fashion.

One of the reasons for this is that we have multiple **local locks** in the
Volume nodes for mutual exclusion of specific operations on the same resource.
These locks need to be shared between nodes of the same cluster, removed, or
replaced with DB operations.

This spec proposes a different mechanism to replace these local locks.


Problem description
===================

We have locks in Volume nodes to prevent things like deleting a volume that is
being used to create another volume, or attaching a volume that is already
being attached.

Unfortunately these locks are local to the nodes, which works if we only
support Active/Passive configurations, but not in Active/Active configurations
with more than one node, since local locks cannot guarantee mutual exclusion
with operations running on other nodes.

We have different locks for different purposes, but we will be using the same
mechanism for all of them to handle Active/Active clusters.

The locks are:

- ${VOL_UUID}
- ${VOL_UUID}-delete_volume
- ${VOL_UUID}-detach_volume
- ${SNAPSHOT_UUID}-delete_snapshot

We are adding an abstraction layer to our locking methods in Cinder (for the
manager and drivers) using Tooz_, as explained in `Tooz locks for A/A`_, that
will use local file locks by default and a DLM for Active/Active
configurations.

But there are cases where drivers don't need distributed locks to work; they
may just need local locks, and we would be forcing deployments to install a
DLM just for these few locks in the manager, which can be considered a bit
extreme.


Use Cases
=========

Operators that have hard requirements (SLAs or other reasons) to keep their
cloud operational at all times, or that have higher throughput requirements,
will want the possibility of configuring their deployments as Active/Active
while keeping the same behavior they currently have. But when they are using
drivers that don't require distributed locks to operate in Active/Active
mode, they don't want to have to install a DLM just for these 4 locks.
Also, in the context of the effort to make Cinder capable of working as an SDS
outside of OpenStack, where we can no longer make the presence of a DLM a hard
requirement as we can within OpenStack, it makes even more sense for Cinder to
be able to work without a DLM in Active/Active configurations if the storage
backend drivers don't require distributed locking in Active/Active
environments.


Proposed change
===============

This spec suggests modifying the behavior introduced by the
`Tooz locks for A/A`_ spec for the case where the drivers don't need
distributed locks. So we would use local file locks in the drivers (if they
use any), and for the locks in the manager we would use a locking mechanism
based on the ``workers`` table that was introduced by the
`HA A/A Cleanup specs`_.

This new locking mechanism would be a hybrid between local and distributed
locks.

By using a locking mechanism similar to the one that local file locks provide,
we'll be able to mimic the same behavior we have today:

- Ensure mutual exclusion between different nodes of the same cluster as well
  as within the node itself.
- Request queuing.
- Lock release on node crash.
- Require no additional software in the system.
- Require no operator-specific configuration.

To ensure mutual exclusion using the ``workers`` table, we will need to add a
new field called ``lock_name`` that will store the operation currently being
performed on the Volume manager (the method name in most cases, since the
table already has the resource type and UUID); this field will be used for
locking.

The new locking mechanism will use the added ``lock_name`` field to check
whether the ``workers`` table already has an entry for that ``lock_name`` on
the specific resource and cluster of the node. In that case the lock is
already *acquired*, and we have to wait and retry until the lock has been
*released* (the row has been deleted) and we can insert the row ourselves to
*acquire* the lock. This means that, in the case of attaching a volume, we
will only proceed with the attach if there is no DB entry for our cluster for
the volume ID with the operation ``attach_volume``.

To ensure mutual exclusion, this lock checking and acquisition needs to be
atomic, so we'll be using a conditional insert of the "lock" in the same way
we are doing conditional updates (compare-and-swap) to remove `API Races`_.
The insert query will be conditioned on the absence of a row matching some
specific restrictions; in this case the conditions will be the ``lock_name``,
the ``cluster``, and the ``type``. If we couldn't insert the row, acquiring
the lock failed and we have to wait; if we inserted the row successfully, we
have acquired the lock and can proceed with the operation.

This process of trying to acquire, failing, waiting, and retrying is the exact
same mechanism we have today with the local file locks: the ``synchronized``
decorator, when using external synchronization, will try to acquire the file
lock with a relatively small timeout, and if it fails it will try again until
the lock is acquired.

So by mimicking the same behavior as the file locks we are preserving the
queuing of operations we currently have, and not altering our API's behavior
or the "implicit API contract" that some external scripts may be relying on.

Since we will be using the DB for the locking, operators will not need to add
any software to their systems or carry out specific system configuration.
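To make this more concrete, below is a minimal sketch of the conditional
insert using SQLAlchemy Core. The column names (``resource_type``,
``resource_id``, ``cluster_name``) and the helper functions are illustrative
assumptions for this sketch, not the actual ``workers`` schema or any
existing Cinder API; only the ``lock_name`` field is specific to this spec.

.. code-block:: python

    import time

    import sqlalchemy as sa

    metadata = sa.MetaData()

    # Illustrative subset of the ``workers`` table; only ``lock_name`` is
    # the new field proposed by this spec, the other columns are assumed.
    workers = sa.Table(
        'workers', metadata,
        sa.Column('id', sa.Integer, primary_key=True),
        sa.Column('resource_type', sa.String(40)),
        sa.Column('resource_id', sa.String(36)),
        sa.Column('cluster_name', sa.String(255)),
        sa.Column('lock_name', sa.String(255)),
    )

    def try_acquire(engine, resource_type, resource_id, cluster, lock_name):
        """Insert the lock row only if no equivalent row already exists."""
        absent = ~sa.exists().where(sa.and_(
            workers.c.resource_type == resource_type,
            workers.c.resource_id == resource_id,
            workers.c.cluster_name == cluster,
            workers.c.lock_name == lock_name,
        ))
        # INSERT ... SELECT <literals> WHERE NOT EXISTS (...): the check
        # and the insert happen in one atomic statement.  (Some backends,
        # e.g. MySQL, need a FROM clause on a SELECT with WHERE; omitted
        # here for brevity.)
        select = sa.select(
            sa.literal(resource_type), sa.literal(resource_id),
            sa.literal(cluster), sa.literal(lock_name),
        ).where(absent)
        insert = workers.insert().from_select(
            ['resource_type', 'resource_id', 'cluster_name', 'lock_name'],
            select)
        with engine.begin() as conn:
            # rowcount is 1 when the row was inserted (lock acquired).
            return bool(conn.execute(insert).rowcount)

    def lock(engine, resource_type, resource_id, cluster, lock_name,
             retry_interval=1):
        """Acquire-fail-wait-retry loop mimicking the file lock behavior."""
        while not try_acquire(engine, resource_type, resource_id, cluster,
                              lock_name):
            time.sleep(retry_interval)

    def unlock(engine, resource_type, resource_id, cluster, lock_name):
        """Release the lock by deleting the row."""
        with engine.begin() as conn:
            conn.execute(workers.delete().where(sa.and_(
                workers.c.resource_type == resource_type,
                workers.c.resource_id == resource_id,
                workers.c.cluster_name == cluster,
                workers.c.lock_name == lock_name,
            )))

Note that ``unlock`` is only the explicit release path; as described below,
crashed nodes release their locks implicitly when the cleanup process removes
their rows from the table.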
It is important to note that timeouts for these new locks will be handled by
the cleanup process itself: rows are removed from the DB when heartbeats are
no longer being received, thus releasing the locks and preventing a resource
from getting stuck.

Taking a closer look at the 4 locks mentioned before, we can classify them
into 2 categories:

- Locks on the resource of the operation:

  - *${VOL_UUID}-detach_volume* - Used in ``detach_volume`` to prevent
    multiple simultaneous detaches.
  - *${VOL_UUID}* - Used in ``attach_volume`` to prevent multiple
    simultaneous attaches.

- Locks that prevent deletion of the source of a volume creation (they are
  created by the ``create_volume`` method):

  - *${VOL_UUID}-delete_volume* - Used in ``delete_volume``.
  - *${SNAPSHOT_UUID}-delete_snapshot* - Used in ``delete_snapshot``.

For locks on the resource of the operation (attach and detach), the row in
the DB has already been inserted by the cleanup mechanism, so we'll reuse
that same row and condition the writing of the ``lock_name`` field on the
lock being available.

As for the locks preventing deletion, we will need to add the row ourselves,
since cleanup was not adding a row to the ``workers`` table for those
resources, as they didn't require any cleanup.


Alternatives
------------

We could use a DLM, which is a drop-in replacement for local locks, but some
operators have expressed concern about adding this burden (to their systems
and their duties) when they are using drivers that don't require distributed
locks for Active/Active, and they would prefer to avoid adding a DLM to their
systems.

Instead of using the new locking mechanism for the locks that prevent
deletion of resources, we could add a filter to the conditional update (the
one being used to prevent `API Races`_) that would prevent us from deleting a
volume or a snapshot that is being used as the source for a volume creation,
also returning the appropriate error when we try to delete such a
volume/snapshot. A sketch of this alternative is included just before the
References section.


Data model impact
-----------------

Adds a new string field called ``lock_name`` to the ``workers`` table.

REST API impact
---------------

None

Security impact
---------------

None

Notifications impact
--------------------

None

Other end user impact
---------------------

None

Performance Impact
------------------

Small, but necessary, performance impact from changing local file locks to DB
calls.

Other deployer impact
---------------------

None

Developer impact
----------------

None

Implementation
==============

Assignee(s)
-----------

Primary assignee:
  Gorka Eguileor (geguileo)

Other contributors:
  Anyone is welcome to help

Work Items
----------

- Add the ``lock_name`` field to the ``workers`` table.

- Modify Cinder's new locking methods/decorators to handle the hybrid
  behavior.


Dependencies
============

Cleanup for HA A/A: https://review.openstack.org/236977

- We need the new ``workers`` table and the cleanup mechanism.

Removing API Races: https://review.openstack.org/207101/

- We need the compare-and-swap mechanism on volume and snapshot deletion to
  be in place so we can add the required filters.

Testing
=======

Unit tests for the new locking mechanism.


Documentation Impact
====================

This needs to be properly documented, as this locking mechanism will *not* be
appropriate for all drivers.
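As an illustration of the conditional update alternative described in the
Alternatives section, the sketch below adds a "not used as a source" filter
to the compare-and-swap deletion of a snapshot. The table layouts and status
values are simplified assumptions for illustration, not Cinder's actual
schema:

.. code-block:: python

    import sqlalchemy as sa

    metadata = sa.MetaData()

    # Hypothetical, simplified tables for illustration only.
    snapshots = sa.Table(
        'snapshots', metadata,
        sa.Column('id', sa.String(36), primary_key=True),
        sa.Column('status', sa.String(255)),
    )

    volumes = sa.Table(
        'volumes', metadata,
        sa.Column('id', sa.String(36), primary_key=True),
        sa.Column('status', sa.String(255)),
        sa.Column('snapshot_id', sa.String(36)),
    )

    def conditional_delete_snapshot(engine, snapshot_id):
        """CAS the snapshot to 'deleting' unless a volume is being
        created from it."""
        used_as_source = sa.exists().where(sa.and_(
            volumes.c.snapshot_id == snapshot_id,
            volumes.c.status == 'creating',
        ))
        stmt = snapshots.update().where(sa.and_(
            snapshots.c.id == snapshot_id,
            snapshots.c.status == 'available',  # regular CAS condition
            ~used_as_source,                    # the additional filter
        )).values(status='deleting')
        with engine.begin() as conn:
            # rowcount 0 means the API should reject the delete with the
            # appropriate "busy" error instead of proceeding.
            return bool(conn.execute(stmt).rowcount)

The ``delete_volume`` case would be analogous, filtering on ``source_volid``
in the ``volumes`` table instead.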
References
==========

General Description for HA A/A: https://review.openstack.org/#/c/232599/

Cleanup for HA A/A: https://review.openstack.org/236977

Removal of API Races: https://review.openstack.org/207101/


.. _HA A/A Cleanup specs: https://review.openstack.org/236977
.. _API Races: https://review.openstack.org/207101/
.. _Tooz: http://docs.openstack.org/developer/tooz/
.. _Tooz locks for A/A: https://review.openstack.org/202615/