Remove Manager Local Locks for HA A/A
This spec outlines the work needed to remove local locks from Volume nodes. Two
methods are used: filters on the compare-and-swap mechanism, and a locking
mechanism based on the ``workers`` table. This is one of the aspects needed to
support Active/Active in Cinder.

Implements: blueprint cinder-volume-active-active-support
Change-Id: Iab5bf8c3971d4a908bec16882fb4ca4a50c7dfdb

..
 This work is licensed under a Creative Commons Attribution 3.0 Unported
 License.

 http://creativecommons.org/licenses/by/3.0/legalcode

=============================================================
Cinder Volume Active/Active support - Manager Local Locks
=============================================================

https://blueprints.launchpad.net/cinder/+spec/cinder-volume-active-active-support

Right now the cinder-volume service can only run in Active/Passive HA fashion.

One of the reasons for this is that we have multiple **local locks** in the
Volume nodes for mutual exclusion of specific operations on the same resource.
These locks need to be shared between nodes of the same cluster, removed, or
replaced with DB operations.

This spec proposes a different mechanism to replace these local locks.


Problem description
===================

We have locks in the Volume nodes to prevent things like deleting a volume that
is being used to create another volume, or attaching a volume that is already
being attached.

Unfortunately these locks are local to the nodes, which works if we only
support Active/Passive configurations, but not in Active/Active configurations
where we have more than one node, since local locks cannot guarantee mutual
exclusion with operations running on other nodes.

We have different locks for different purposes, but we will be using the same
mechanism to allow all of them to handle Active/Active clusters.

The list of locks is:

- ${VOL_UUID}
- ${VOL_UUID}-delete_volume
- ${VOL_UUID}-detach_volume
- ${SNAPSHOT_UUID}-delete_snapshot

We are adding an abstraction layer to our locking methods in Cinder -for the
manager and drivers- using Tooz_ as explained in `Tooz locks for A/A`_, which
will use local file locks by default and a DLM for Active-Active
configurations.

But there are cases where drivers don't need distributed locks to work, they
may just need local locks, and we would be forcing these deployments to install
a DLM just for these few locks in the manager, which can be considered a bit
extreme.


Use Cases
=========

Operators that have hard requirements -SLAs or other reasons- to keep their
cloud operational at all times, or that have higher throughput requirements,
will want the possibility of configuring their deployments in an Active/Active
configuration while keeping the same behavior they currently have. But they
don't want to have to install a DLM just for these 4 locks when they are using
drivers that don't require distributed locks to operate in Active-Active mode.

Also, in the context of the effort to make Cinder capable of working as an SDS
outside of OpenStack, where we can no longer make the presence of a DLM a hard
requirement as we can within OpenStack, it makes even more sense for Cinder to
be able to work without a DLM in Active-Active configurations when the storage
backend drivers don't require distributed locking in Active-Active
environments.


Proposed change
===============

This spec suggests modifying the behavior introduced by the
`Tooz locks for A/A`_ spec for the case where the drivers don't need
distributed locks. We would use local file locks in the drivers (if they use
any), and for the locks in the manager we would use a locking mechanism based
on the ``workers`` table that was introduced by the `HA A/A Cleanup specs`_.

This new locking mechanism would be a hybrid between local and distributed
locks.

By using a locking mechanism similar to the one that local file locks provide
we'll be able to mimic the same behavior we have today:

- Assure mutual exclusion between different nodes of the same cluster and in
  the node itself.
- Request queuing.
- Lock release on node crash.
- Require no additional software in the system.
- Require no operator specific configuration.

To assure this mutual exclusion using the ``workers`` table we will need to add
a new field called ``lock_name`` that will store the operation (the method name
in most cases, since the table already has the resource type and UUID) that is
currently being performed on the Volume manager, and it will be used for
locking.

The new locking mechanism will use the added ``lock_name`` field to check
whether the ``workers`` table already has an entry for that ``lock_name`` on
the specific resource and cluster of the node; in that case the lock is already
*acquired* by someone else and we have to wait and retry until the lock has
been *released* -the row has been deleted- and we can insert the row ourselves
to *acquire* the lock. This means that, in the case of attaching a volume, we
will only proceed with the attach if there is no DB entry for our cluster for
the volume ID with the operation ``volume_attach``.
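
To make that check concrete, the following is a minimal sketch -not existing
Cinder code- of the query behind it, assuming a simplified ``workers`` table
and illustrative column names::

    from sqlalchemy import and_, select


    def attach_lock_is_held(conn, workers, volume_id, cluster_name):
        """Return True if a node of our cluster already holds the lock."""
        # A matching row in ``workers`` means some node of this cluster is
        # already running attach_volume on this volume.
        query = select([workers.c.id]).where(and_(
            workers.c.resource_type == 'Volume',
            workers.c.resource_id == volume_id,
            workers.c.cluster_name == cluster_name,
            workers.c.lock_name == 'volume_attach'))
        return conn.execute(query).first() is not None

This check on its own is not enough, since two nodes could run it at the same
time; the next paragraph describes how the check and the insertion are made
atomic.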

To assure mutual exclusion, this lock checking and acquisition needs to be
atomic, so we'll be using a conditional insert of the "lock" in the same way we
are doing conditional updates (compare-and-swap) to remove `API Races`_. The
insert query will be conditioned on the absence of a row matching some specific
restrictions, in this case the ``lock_name``, the ``cluster``, and the resource
``type``. If we couldn't insert the row, acquiring the lock failed and we have
to wait; if we inserted the row successfully, we have acquired the lock and can
proceed with the operation.
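
As a rough illustration only -table layout, column names, and the exact SQL are
assumptions, not the final implementation- the conditional insert could be
built with SQLAlchemy along these lines::

    from sqlalchemy import (Column, Integer, MetaData, String, Table, and_,
                            exists, insert, literal, select)

    metadata = MetaData()
    # Simplified stand-in for the real ``workers`` table.
    workers = Table(
        'workers', metadata,
        Column('id', Integer, primary_key=True),
        Column('resource_type', String(40)),
        Column('resource_id', String(36)),
        Column('cluster_name', String(255)),
        Column('lock_name', String(255)))


    def try_acquire(conn, resource_type, resource_id, cluster_name, lock_name):
        """Return True if the row was inserted, i.e. the lock was acquired."""
        # The check and the insert happen in a single
        # INSERT ... SELECT ... WHERE NOT EXISTS statement, so no separate
        # read-then-write race is possible (exact guarantees depend on the
        # database backend, as with the compare-and-swap updates).
        held = exists().where(and_(
            workers.c.resource_type == resource_type,
            workers.c.resource_id == resource_id,
            workers.c.cluster_name == cluster_name,
            workers.c.lock_name == lock_name))
        values = select([literal(resource_type), literal(resource_id),
                         literal(cluster_name),
                         literal(lock_name)]).where(~held)
        stmt = insert(workers).from_select(
            ['resource_type', 'resource_id', 'cluster_name', 'lock_name'],
            values)
        return bool(conn.execute(stmt).rowcount)

If the insert affected no rows, some other node already holds the lock and we
will have to retry, as described next.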

This process of trying to acquire, failing, waiting, and retrying is the exact
same mechanism we have today with the local file locks: the ``synchronized``
method, when using external synchronization, will try to acquire the file lock
with a relatively small timeout and, if it fails, will try again until the lock
is acquired.
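
Building on the ``try_acquire`` sketch above (again purely illustrative, with
an assumed retry interval), the equivalent wait-and-retry behavior would look
something like this::

    import time


    def acquire_worker_lock(conn, resource_type, resource_id, cluster_name,
                            lock_name, retry_interval=1):
        # Keep retrying the conditional insert until we manage to create the
        # row, i.e. until whoever holds the "lock" deletes its row.
        while not try_acquire(conn, resource_type, resource_id, cluster_name,
                              lock_name):
            time.sleep(retry_interval)


    def release_worker_lock(conn, resource_type, resource_id, cluster_name,
                            lock_name):
        # Releasing the lock is just deleting the row we inserted earlier.
        conn.execute(workers.delete().where(and_(
            workers.c.resource_type == resource_type,
            workers.c.resource_id == resource_id,
            workers.c.cluster_name == cluster_name,
            workers.c.lock_name == lock_name)))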

So by mimicking the behavior of the file locks we preserve the queuing of
operations we currently have, and we don't alter our API's behavior or the
"implicit API contract" that some external scripts may be relying on.

Since we will be using the DB for the locking, operators will not need to add
any software to their systems or carry out specific system configurations.

It is important to notice that timeouts for these new locks will be handled by
the cleanup process itself: rows are removed from the DB when heartbeats are no
longer being received, thus releasing the locks and preventing a resource from
being stuck.

On a closer look at the 4 locks mentioned before, we can classify them in 2
categories:

- Locks on the resource of the operation:

  - *${VOL_UUID}-detach_volume* - Used in ``detach_volume`` to prevent
    multiple simultaneous detaches.
  - *${VOL_UUID}* - Used in ``attach_volume`` to prevent multiple
    simultaneous attaches.

- Locks that prevent deletion of the source of a volume creation (they are
  created by the ``create_volume`` method):

  - *${VOL_UUID}-delete_volume* - Used in ``delete_volume``.
  - *${SNAPSHOT_UUID}-delete_snapshot* - Used in ``delete_snapshot``.

For locks on the resource of the operation -attach and detach- the row in the
DB has already been inserted by the cleanup mechanism, so we'll reuse that same
row and condition the writing of the ``lock_name`` field on the lock being
available.
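
For these rows the acquisition becomes a conditional update in the same
compare-and-swap style; a minimal sketch, reusing the simplified table above
and assuming a free lock is represented by a NULL ``lock_name``::

    def try_acquire_on_existing_row(conn, resource_type, resource_id,
                                    cluster_name, lock_name):
        # Only set ``lock_name`` if the lock is currently available (NULL);
        # rowcount tells us whether we won the race.
        stmt = workers.update().where(and_(
            workers.c.resource_type == resource_type,
            workers.c.resource_id == resource_id,
            workers.c.cluster_name == cluster_name,
            workers.c.lock_name.is_(None))).values(lock_name=lock_name)
        return bool(conn.execute(stmt).rowcount)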

As for the locks preventing deletion, we will need to add the row ourselves,
since the cleanup mechanism was not adding a row in the ``workers`` table for
those resources, as they didn't require any cleanup.


Alternatives
------------

We could use a DLM, which is a drop-in replacement for local locks, but
operators have expressed their concern about adding this burden -to their
systems and duties- when they are using drivers that don't require locks for
Active-Active operation, and they would prefer to avoid adding a DLM to their
systems.

Instead of using the new locking mechanism for the locks that prevent deletion
of resources, we could add a filter to the conditional update -the one being
used to prevent `API Races`_- that would prevent us from deleting a volume or a
snapshot that is being used as the source for a volume creation, returning the
appropriate error when we try to delete such a volume/snapshot.
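
As an illustration of this alternative -with simplified, assumed table and
column names, not the actual API races implementation- the extra filter for the
snapshot case could look like this::

    from sqlalchemy import and_, exists


    def mark_snapshot_deleting(conn, snapshots, volumes, snapshot_id):
        # A volume in 'creating' state pointing at this snapshot means the
        # snapshot is being used as a source and must not be deleted.
        being_used = exists().where(and_(
            volumes.c.snapshot_id == snapshot_id,
            volumes.c.status == 'creating'))
        stmt = snapshots.update().where(and_(
            snapshots.c.id == snapshot_id,
            snapshots.c.status == 'available',
            ~being_used)).values(status='deleting')
        # rowcount 0 means the compare-and-swap failed, either because of the
        # snapshot's status or because of the new filter, and the API should
        # return an error.
        return bool(conn.execute(stmt).rowcount)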


Data model impact
-----------------

Adds a new string field called ``lock_name`` to the ``workers`` table.
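
A sketch of the schema change, in the style of the existing sqlalchemy-migrate
scripts and assuming it runs under the normal migration machinery (the field
length and nullability are assumptions)::

    from sqlalchemy import Column, MetaData, String, Table


    def upgrade(migrate_engine):
        meta = MetaData()
        meta.bind = migrate_engine
        workers = Table('workers', meta, autoload=True)
        # Operation being performed (e.g. 'volume_attach'); together with the
        # resource type/id and the cluster it forms the lock.
        workers.create_column(Column('lock_name', String(255), nullable=True))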

REST API impact
---------------

None

Security impact
---------------

None

Notifications impact
--------------------

None

Other end user impact
---------------------

None

Performance Impact
------------------

Small, but necessary, performance impact from changing local file locks to DB
calls.

Other deployer impact
---------------------

None

Developer impact
----------------

None

Implementation
==============

Assignee(s)
-----------

Primary assignee:
  Gorka Eguileor (geguileo)

Other contributors:
  Anyone is welcome to help

Work Items
----------

- Add ``lock_name`` field to the ``workers`` table.

- Modify Cinder's new locking methods/decorators to handle the hybrid behavior.


Dependencies
============

Cleanup for HA A/A: https://review.openstack.org/236977

- We need the new ``workers`` table and the cleanup mechanism.

Removing API Races: https://review.openstack.org/207101/

- We need the compare-and-swap mechanism on volume and snapshot deletion to be
  in place so we can add the required filters.

Testing
=======

Unit tests for the new locking mechanism.


Documentation Impact
====================

This needs to be properly documented, as this locking mechanism will *not* be
appropriate for all drivers.


References
==========

General Description for HA A/A: https://review.openstack.org/#/c/232599/

Cleanup for HA A/A: https://review.openstack.org/236977

Removal of API Races: https://review.openstack.org/207101/


.. _HA A/A Cleanup specs: https://review.openstack.org/236977
.. _API Races: https://review.openstack.org/207101/
.. _Tooz: http://docs.openstack.org/developer/tooz/
.. _Tooz locks for A/A: https://review.openstack.org/202615/