Merge "Added specification for storyboard subscription"
This commit is contained in:
commit
bec7680a04
211
specs/storyboard_subscription_pub_sub.rst
Normal file
211
specs/storyboard_subscription_pub_sub.rst
Normal file
@ -0,0 +1,211 @@
|
||||
::
|
||||
|
||||
This work is licensed under a Creative Commons Attribution 3.0
|
||||
Unported License.
|
||||
http://creativecommons.org/licenses/by/3.0/legalcode
|
||||
|
||||
..
|
||||
This template should be in ReSTructured text. Please do not delete
|
||||
any of the sections in this template. If you have nothing to say
|
||||
for a whole section, just write: "None". For help with syntax, see
|
||||
http://sphinx-doc.org/rest.html To test out your formatting, see
|
||||
http://www.tele3.cz/jbar/rest/rest.html
|
||||
|
||||
========================
|
||||
Subscriptions and Events
|
||||
========================
|
||||
|
||||
https://storyboard.openstack.org/#!/story/96
|
||||
|
||||
StoryBoard needs to notify a user when changes occur to a resource which
|
||||
they have decided to be notified about.
|
||||
|
||||
A key feature needed by all ticket tracking systems is the ability to
|
||||
notify a user when topics which they care about have changed. A common way
|
||||
to describe this is "Subscription", where a single user will ask to be
|
||||
notified about certain types of changes for certain resources. More
|
||||
complicated implementations include filtering, notification alert levels,
|
||||
summaries of events that all impact the same resource,
|
||||
or automatic notification based on inferred or calculated relevance.
|
||||
|
||||
Problem description
|
||||
===================
|
||||
|
||||
In its simplest form, we would like each user to be able to indicate which
|
||||
resources they are interested in, and be able to retrieve a date-sorted
|
||||
list of which of those resources have changed recently. This will then be
|
||||
shown in the UI as a list of events that occurred that are relevant to the
|
||||
user. More advanced features would be the ability to filter these events,
|
||||
or to receive notification of new events in "real time".
|
||||
|
||||
Requirements are as follows:
|
||||
* A user should be able to manage their subscriptions
|
||||
* Subscriptions should be as up-to-date as possible with as little data
|
||||
loss as possible.
|
||||
* A user should be able to subscribe to tasks, stories, projects,
|
||||
and project groups.
|
||||
* Subscriptions should have a minimal impact on API performance.
|
||||
* When a subscription is added, all changes from that point forward should
|
||||
be reported. Historical changes do not need to be generated.
|
||||
* When a subscription is removed, a user's subscription list does not have
|
||||
to be recalculated to extract no-longer-relevant events.
|
||||
* StoryBoard should be able to run on a single server with no crazy
|
||||
additional infrastructure required.
|
||||
* Oslo libraries should be taken into account.
|
||||
|
||||
Subscriptions are the classic, personalized, pub-sub problem, where a user's
|
||||
list of subscriptions can be large, and the matrix of events that could
|
||||
cause a subscription to be notified can be complex and processor intensive.
|
||||
Classically, there are four ways to handle this problem: Push, Pull,
|
||||
Async, or JIT.
|
||||
|
||||
Push
|
||||
----
|
||||
The push approach assumes that you will notify all subscribers when the
|
||||
event occurs. In the case of our API, this means that during a
|
||||
PUT/POST/DELETE, all subscribers to that resource are checked to see
|
||||
whether they need to be notified. In most small-scale system this works
|
||||
fine, however as subscriptions increase this approach is generally not
|
||||
stable, as the time required to process the request (and the number of
|
||||
errors that could possibly be raised) will start to impact the client
|
||||
making the API request. Things that would raise concerns are timeout,
|
||||
as well as the number of points of failure, and as a result of that this is
|
||||
not an appropriate approach.
|
||||
|
||||
Pull
|
||||
----
|
||||
The pull approach believes in restricting the processing load described
|
||||
above to only those users who care enough to ask for subscription events.
|
||||
In this case, a list is generated fresh every time GET /subscriptions (or
|
||||
similar) is called. This approach is appropriate when usage is expected to
|
||||
be low, similar to the generation of a report. Given that a subscription
|
||||
list is a fairly frequently polled resource, this is not an appropriate
|
||||
approach.
|
||||
|
||||
Async
|
||||
-----
|
||||
An asynchronous approach makes use of deferred, distributed processing to
|
||||
"eventually" update a user's list of subscription events. Worker management
|
||||
systems such as Gearman or Rabbit are notified whenever a resource changes
|
||||
(likely via Pecan API hooks), and they 'eventually' ask a worker process to
|
||||
go figure out which subscriptions need to be updated. Advantages of this
|
||||
approach is that we can avoid the Python Global Interpreter Lock by having
|
||||
separate worker processes, and any errors encountered during subscription
|
||||
processing will be isolated and thus not impact the actual API request. The
|
||||
challenge with this is that most queueing/worker management systems are
|
||||
resource intensive (kafka), do not guarantee delivery (gearman),
|
||||
or have known issues with split-brain clustering (rabbit). Any approach
|
||||
will have to accommodate the chosen system.
|
||||
|
||||
Streaming
|
||||
---------
|
||||
A streaming approach begins by emitting an event whenever a resource
|
||||
changes, and to notify all subscribers that are currently connected via a
|
||||
socket. Persistence of events may be handled by creating
|
||||
individual processes that listen to the stream and persist the received
|
||||
data much like a subscribed client might update its UI. This approach
|
||||
solves the real-time problem with a hot sexy technology, however coordinating
|
||||
listeners and ensuring that persistence is handled properly raises the same
|
||||
problems as the Async or Push/Pull problem. As a result this approach is
|
||||
unnecessarily more complex than it needs to be.
|
||||
|
||||
|
||||
Proposed change
|
||||
===============
|
||||
|
||||
Summary
|
||||
-------
|
||||
StoryBoard will emit events whenever a resource changes. Since most
|
||||
resources map directly to database changes, the majority of these changes can
|
||||
be handled via Pecan post-request hooks.
|
||||
|
||||
Events, when emitted, will be written to a deferred processing queue. If
|
||||
the queue is unavailable or misconfigured, a warning should be written to
|
||||
the log, however the system should complete the original request normally
|
||||
and discard the change event altogether.
|
||||
|
||||
The queuing system to be used is RabbitMQ, because it guarantees delivery
|
||||
and recovers after a crash. This decision is based on the assumption that
|
||||
the number of events dispatched by the database will not be sufficient to
|
||||
require a full RabbitMQ cluster, which means we don't have to worry about
|
||||
split-brain problems.
|
||||
|
||||
The StoryBoard server will spin up a series of processes that listen to the
|
||||
event queue and perform actions based on the type of event received. In the
|
||||
case of subscriptions, a process would read the event,
|
||||
load the impacted resource and its change, search for any subscriptions to
|
||||
the impacted resource, update each subscription's owners' subscription feed,
|
||||
and then notify rabbit that the message has been received and processed. If
|
||||
updating one subscription fails, the process should still attempt to
|
||||
complete the other ones.
|
||||
|
||||
The number of events that are retained per user should be configurable by
|
||||
age, with a default of 1 month.
|
||||
|
||||
A user's event feed should be retained in the database in its own table.
|
||||
|
||||
API: Subscriptions
|
||||
------------------
|
||||
StoryBoard will expose a new endpoint at /v1/users/ID/subscriptions. This
|
||||
endpoint will support basic CRUD operations by which a user can manage
|
||||
their subscriptions.
|
||||
|
||||
API: Feed
|
||||
---------
|
||||
StoryBoard will expose a new endpoint at /v1/users/ID/feed. This endpoint
|
||||
will provide a list of events that have been emitted by a user's
|
||||
subscriptions, sorted by date in descending order.
|
||||
|
||||
Alternatives
|
||||
------------
|
||||
Alternative approaches have been listed in the problem description.
|
||||
Alternative queueing systems include ZeroMQ & gearman, which was disqualified
|
||||
because it does not guarantee delivery, Kafka which was disqualified
|
||||
because (anecdotally) it requires a cluster to perform properly.
|
||||
|
||||
Implementation
|
||||
==============
|
||||
|
||||
Assignee(s)
|
||||
-----------
|
||||
|
||||
Primary Assignee:
|
||||
TBD
|
||||
|
||||
Work Items
|
||||
----------
|
||||
* Create an API to add subscriptions for projects, project groups, stories,
|
||||
and tasks.
|
||||
* Teach the storyboard-webclient to allow subscription on projects, project
|
||||
groups, stories, and tasks.
|
||||
* Install RabbitMQ on StoryBoard Server.
|
||||
* Use Oslo.messaging to create an SQLAlchemy hook that broadcasts change
|
||||
events for project groups, projects, stories, and tasks.
|
||||
* Add configuration to StoryBoard for the AMQP connection string and
|
||||
optionally an enabling flag for the whole feature.
|
||||
* Create a storyboard-worker process that connects to AMQP and receives
|
||||
messages for processing.
|
||||
* Create a way for the storyboard-worker process to process lots of
|
||||
different kinds of events (event hooks of some sort? processor factory?)
|
||||
* Build a subscription event handler which is run by storyboard-worker and
|
||||
updates a subscriber's feed.
|
||||
* Create an API endpoint that exposes the feed.
|
||||
* Teach the storyboard-webclient to display the feed.
|
||||
|
||||
Repositories
|
||||
------------
|
||||
No new repositories.
|
||||
|
||||
Servers
|
||||
-------
|
||||
No new servers. storyboard.openstack.org will need to have a running
|
||||
RabbitMQ instance.
|
||||
|
||||
DNS Entries
|
||||
-----------
|
||||
No new DNS entries.
|
||||
|
||||
Dependencies
|
||||
============
|
||||
See above. Puppet module for storyboard will need to be updated. Additional
|
||||
dependencies are on oslo.messaging, rabbitmq-server, upstart, etc.
|
Loading…
x
Reference in New Issue
Block a user