Add spec for RUG HA

This adds a spec to address changes that need to happen to enable RUG HA
and scaleout.

Change-Id: Iaeadb07b5a6909409053ab0b7341c2b3651e1444

parent dbd8943478
commit 1998f897a7

specs/liberty/rug_ha.rst (new file, 201 lines)

..
 This work is licensed under a Creative Commons Attribution 3.0 Unported
 License.

 http://creativecommons.org/licenses/by/3.0/legalcode

===================
RUG HA and scaleout
===================

Problem Description
===================

The RUG is a multi-process, multi-worker service, but it cannot be
scaled out to multiple nodes for the purposes of high availability and
distributed handling of load. The only current option for a
highly-available deployment is an active/passive cluster using Pacemaker
or similar, which is less than ideal and does not address scale-out
concerns.

Proposed Change
===============

This proposes allowing multiple RUG processes to be spawned across
many nodes. Each RUG process is responsible for a fraction of the
total running appliances. The mapping of RUG processes to appliances will be
managed by a consistent hash ring. An external coordination service
(e.g., ZooKeeper) will be leveraged to provide cluster membership
capabilities, and python-tooz will be used to manage cluster events.
When new members join or depart, the hash ring will be rebalanced and
appliances re-distributed across the RUG cluster.

This allows operators to scale out to many RUG instances, eliminating
the single point of failure and allowing appliances to be evenly
distributed across multiple worker processes.
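
The following is a minimal, illustrative sketch of the intended mapping model,
not the proposed implementation; the ``HashRing`` class and its method names
are hypothetical (it is also referenced by the worker sketch later in this
spec)::

  import bisect
  import hashlib


  class HashRing(object):
      """Minimal consistent hash ring mapping a key (e.g. tenant_id) to a host."""

      def __init__(self, hosts, replicas=32):
          # Several points per host spread keys evenly and limit how many
          # keys move when membership changes.
          self._ring = {}
          self._points = []
          for host in hosts:
              for i in range(replicas):
                  point = self._hash('%s-%d' % (host, i))
                  self._ring[point] = host
                  bisect.insort(self._points, point)

      @staticmethod
      def _hash(key):
          return int(hashlib.md5(key.encode('utf-8')).hexdigest(), 16)

      def get_host(self, key):
          """Return the member responsible for ``key``."""
          idx = bisect.bisect(self._points, self._hash(key)) % len(self._points)
          return self._ring[self._points[idx]]


  ring = HashRing(['rug-1.example.com', 'rug-2.example.com', 'rug-3.example.com'])
  # Every node with the same member list computes the same owner.
  owner = ring.get_host('a-tenant-id')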

Data Model Impact
-----------------

n/a

REST API Impact
---------------

n/a

Security Impact
---------------

None

Notifications Impact
--------------------

Neutron notifications will need to be distributed to per-RUG message queues,
and a new internal REBALANCE event will be published to the notifications
queue when cluster membership changes (see Implementation).

Other End User Impact
---------------------

n/a

Performance Impact
------------------

There will be some new overhead introduced at the messaging layer, as Neutron
notifications and RPCs will need to be distributed to per-RUG message queues.

Other Deployer Impact
---------------------

Deployers will need to evaluate and choose an appropriate backend to be used
by tooz for cluster coordination and group membership. memcached is a simple
yet non-robust solution, while ZooKeeper is a heavier-weight but proven one.
More info at [2].
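
As an illustrative sketch only (the backend URLs and member id shown are
examples, not recommendations), the backend is selected purely by the driver
URL passed to python-tooz::

  from tooz import coordination

  # ZooKeeper: heavier-weight but proven.
  coord = coordination.get_coordinator('zookeeper://127.0.0.1:2181', b'rug-node-1')
  # memcached: simpler, but not robust.
  # coord = coordination.get_coordinator('memcached://127.0.0.1:11211', b'rug-node-1')
  coord.start()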

Developer Impact
----------------

n/a

Community Impact
----------------

n/a

Alternatives
------------

One alternative to having each RUG instance declare its own messaging queue
and inspect all incoming messages would be to have the DHT master also serve
as a notification master. That is, the leader would be the only instance of
the RUG listening to and processing incoming Neutron notifications, and then
re-distributing them to specific RUG workers based on the state of the DHT.

Another option would be to do away with the use of Neutron notifications
entirely and hard-wire the akanda-neutron plugin to the RUG via a dedicated
message queue.

Implementation
==============

This proposes enabling operators to run multiple instances of the RUG.
Each instance of the RUG will be responsible for a subset of the managed
appliances. A distributed, consistent hash ring will be used to map appliances
to their respective RUG instance. The Ironic project is already doing
something similar and has a hash ring implementation we can likely leverage
to get started [1].

The RUG cluster is essentially leaderless. The hash ring is constructed
using the active node list, and each individual RUG instance is capable of
constructing a ring given a list of members. This ring is consistent
across nodes provided the coordination service is properly reporting
membership events and they are processed correctly. Using metadata attached
to incoming events (e.g., tenant_id), a consumer is able to check the hash
ring to determine which node in the ring the event is mapped to.

The RUG will spawn a new subprocess called the coordinator. Its only purpose
is to listen for cluster membership events using python-tooz. When a member
joins or departs, the coordinator will create a new Event of type REBALANCE
and put it onto the notifications queue. This event's body will contain an
updated list of current cluster nodes.
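
A minimal sketch of the coordinator loop is shown below, assuming the tooz
group-membership API; the ``Event`` type, ``REBALANCE`` constant, and
``publish`` callable are placeholders for the RUG's actual event plumbing,
not existing APIs::

  import collections
  import time

  from tooz import coordination

  # Placeholder stand-ins for the RUG's real event machinery.
  REBALANCE = 'rebalance'
  Event = collections.namedtuple('Event', ['type', 'body'])

  GROUP = b'rug'


  def run_coordinator(backend_url, member_id, publish):
      """Watch cluster membership and publish REBALANCE events via ``publish``."""
      coord = coordination.get_coordinator(backend_url, member_id)
      coord.start()
      try:
          coord.create_group(GROUP).get()
      except coordination.GroupAlreadyExist:
          pass
      coord.join_group(GROUP).get()

      def rebalance(membership_event):
          # On any join/leave, publish the full membership list so workers
          # can rebuild their hash rings.
          members = coord.get_members(GROUP).get()
          publish(Event(type=REBALANCE, body={'members': sorted(members)}))

      coord.watch_join_group(GROUP, rebalance)
      coord.watch_leave_group(GROUP, rebalance)

      while True:
          coord.run_watchers()  # fires the callbacks on membership changes
          coord.heartbeat()
          time.sleep(1)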

Each RUG worker process will maintain a copy of the hash ring, which is
shared by its worker threads. When it receives a REBALANCE event, it will
rebalance the hash ring given the new membership list. When it receives
normal CRUD events for resources, it will first check the hash ring to see
whether the event's target tenant_id is mapped to its host. If it is,
the event will be processed. If it is not, the event will be ignored and
serviced by another worker.

Ideally, REBALANCE events should be serviced before CRUD events.
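
The following sketch illustrates that dispatch logic, reusing the hypothetical
``HashRing`` class and ``REBALANCE`` constant from the earlier sketches; the
worker shape and event body keys shown are assumptions, not the RUG's current
API::

  class Worker(object):
      """Dispatches only the events whose tenant maps to this host."""

      def __init__(self, host, members):
          self.host = host
          self.ring = HashRing(members)

      def handle(self, event):
          # REBALANCE events carry the new membership list; ideally they are
          # serviced before any queued CRUD events.
          if event.type == REBALANCE:
              self.ring = HashRing(event.body['members'])
              return

          # Normal CRUD events are processed only if the target tenant_id
          # maps to this host; another RUG instance services the rest.
          if self.ring.get_host(event.body['tenant_id']) != self.host:
              return

          self.process(event)

      def process(self, event):
          raise NotImplementedError  # existing per-resource handling goes here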

Assignee(s)
-----------


Work Items
----------

* Implement a distributed hash ring for managing worker:appliance
  assignment.

* Add a new coordination sub-process to the RUG that publishes REBALANCE
  events to the notifications queue when membership changes.

* Set up per-RUG message queues such that notifications are distributed to
  all RUG processes equally.

* Update the worker to manage its own copy of the hash ring.

* Update the worker with the ability to respond to new REBALANCE events by
  rebalancing the ring with an updated membership list.

* Update the worker to drop events for resources that are not mapped to its
  host in the hash ring.

Dependencies
============


Testing
=======

Tempest Tests
-------------


Functional Tests
----------------

If we cannot sufficiently test this using unit tests, we could potentially
spin up our devstack job with multiple copies of the akanda-rug-service
running on a single host, with multiple router appliances. This would allow
us to test ring rebalancing by killing off one of the multiple
akanda-rug-service processes.

API Tests
---------

Documentation Impact
====================

User Documentation
------------------

Deployment docs need to be updated to mention that this feature is dependent
on an external coordination service.

Developer Documentation
-----------------------


References
==========

[1] https://git.openstack.org/cgit/openstack/ironic/tree/ironic/common/hash_ring.py

[2] http://docs.openstack.org/developer/tooz/drivers.html