winchester
==========

An OpenStack notification event processing library based on persistent streams.

Winchester is designed to process event streams, such as those produced from
OpenStack notifications. Events are represented as simple Python dictionaries.
They should be flat dictionaries (not nested), with a minimum of three keys:

- "message_id": A unique identifier for this event, such as a UUID.
- "event_type": A string identifying the event's type, usually a hierarchical dotted name like "foo.bar.baz".
- "timestamp": The time the event occurred (a Python datetime, in UTC).

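As an illustration of the event format described above, the following Python sketch builds one event and checks it against those rules. The `check_event` helper and the `instance_id` and `memory_mb` traits are hypothetical and not part of Winchester; the event type is only an example value.

```python
from datetime import datetime
import uuid

# The three keys every event needs, per the description above.
REQUIRED_KEYS = ("message_id", "event_type", "timestamp")
# Trait values may be strings, integers, floats or datetimes.
TRAIT_TYPES = (str, int, float, datetime)


def check_event(event):
    """Hypothetical helper: list the problems found in an event dict."""
    problems = []
    for key in REQUIRED_KEYS:
        if key not in event:
            problems.append("missing required key: %s" % key)
    for name, value in event.items():
        # Nested dicts or lists are rejected because events must be flat.
        if not isinstance(value, TRAIT_TYPES):
            problems.append("trait %s has unsupported type %s"
                            % (name, type(value).__name__))
    return problems


event = {
    "message_id": str(uuid.uuid4()),
    "event_type": "compute.instance.create.end",   # example value
    "timestamp": datetime(2014, 9, 1, 12, 30, 0),  # treated as UTC
    "instance_id": "example-instance",             # hypothetical trait
    "memory_mb": 2048,                             # hypothetical trait
}

problems = check_event(event)
```

For the event above, `problems` is an empty list. A dictionary with a nested value or a missing `timestamp` would produce one entry per problem.
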
The individual keys of the event dictionary are called *traits* and can be
strings, integers, floats or datetimes. For processing of the (often large)
notifications that come out of OpenStack, Winchester uses the
[StackDistiller library](https://github.com/StackTach/stackdistiller) to
extract flattened events from the notifications that contain only the data
you actually need for processing.

Winchester's processing is done through *triggers* and *pipelines*.

A *trigger* has several parts. The *match_criteria* acts like a
persistent query, collecting the events you want to process into a
persistent *stream* (stored in an SQL database). A set of distinguishing
traits separates those events into distinct streams, similar to a
**GROUP BY** clause in an SQL query. The *fire_criteria* specifies the
conditions a given *stream* has to match for the trigger to fire. When it
does, the events in the *stream* are sent to the *pipeline* listed as the
*fire_pipeline* for processing as a batch. A trigger also lists an
*expire_timestamp*. If a given stream does not meet the *fire_criteria* by
that time, it is expired, and can be sent to an *expire_pipeline* for
alternate processing. Both *fire_pipeline* and *expire_pipeline* are
optional, but at least one of them must be specified.

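The grouping behaviour described above can be pictured with a small, self-contained Python sketch. It is a conceptual model only, not Winchester's implementation or its configuration format: the functions `matches`, `group_into_streams` and `should_fire`, the prefix-based match rule and the "contains this event type" fire rule are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime


def matches(event, event_type_prefix):
    """Toy match criteria: select events whose type starts with a prefix."""
    return event["event_type"].startswith(event_type_prefix)


def group_into_streams(events, event_type_prefix, distinguished_by):
    """Collect matching events into streams keyed by distinguishing traits,
    much like GROUP BY splits rows into groups."""
    streams = defaultdict(list)
    for event in events:
        if not matches(event, event_type_prefix):
            continue
        if not all(trait in event for trait in distinguished_by):
            continue  # cannot place the event in a stream without the traits
        key = tuple(event[trait] for trait in distinguished_by)
        streams[key].append(event)
    return dict(streams)


def should_fire(stream, fire_event_type):
    """Toy fire criteria: fire once the stream holds a given event type."""
    return any(e["event_type"] == fire_event_type for e in stream)


events = [
    {"message_id": "1", "event_type": "compute.instance.create.start",
     "timestamp": datetime(2014, 9, 1, 12, 0), "instance_id": "a"},
    {"message_id": "2", "event_type": "compute.instance.create.start",
     "timestamp": datetime(2014, 9, 1, 12, 1), "instance_id": "b"},
    {"message_id": "3", "event_type": "compute.instance.create.end",
     "timestamp": datetime(2014, 9, 1, 12, 5), "instance_id": "a"},
    {"message_id": "4", "event_type": "image.upload",
     "timestamp": datetime(2014, 9, 1, 12, 6)},
]

streams = group_into_streams(events, "compute.instance.", ["instance_id"])
ready = {key: should_fire(stream, "compute.instance.create.end")
         for key, stream in streams.items()}
```

Here the events for instances `a` and `b` land in separate streams, the `image.upload` event is ignored, and only the stream for `a` satisfies the toy fire rule. A stream that never satisfies it would be the one handled at its *expire_timestamp*.
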
A *pipeline* is simply a list of *handlers*. Each *handler*
in the pipeline receives, in turn, the list of events in a given stream,
sorted by timestamp. *Handlers* can filter events from the list,
or add new events to it. These changes will be seen by *handlers* further
down the pipeline. *Handlers* should avoid operations with side effects,
other than modifying the list of events, as pipeline processing can be
retried later if there is an error. Instead, if all handlers process the
list of events without raising an exception, a *commit* call is made on
each handler, giving it the chance to perform actions like sending data
to external systems. *Handlers* are simple to write, as almost any
object that implements the appropriate *handle_events*, *commit* and
*rollback* methods can be a *handler*.

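To make the handler contract more concrete, here is a minimal Python sketch of two handlers and a toy driver. The method names *handle_events*, *commit* and *rollback* come from the description above, but their exact signatures, the `run_pipeline` function, and the choice to call *rollback* on every handler after a failure are assumptions for illustration. Winchester's real interface may differ.

```python
from datetime import datetime


class DropTestEvents:
    """Filtering handler: removes events whose type starts with 'test.'."""

    def handle_events(self, events):
        return [e for e in events if not e["event_type"].startswith("test.")]

    def commit(self):
        pass

    def rollback(self):
        pass


class CountReporter:
    """Stages a count in handle_events and only 'sends' it on commit."""

    def __init__(self, outbox):
        self.outbox = outbox  # stands in for an external system
        self.pending = None

    def handle_events(self, events):
        self.pending = len(events)  # no side effects yet
        return events

    def commit(self):
        self.outbox.append(self.pending)
        self.pending = None

    def rollback(self):
        self.pending = None  # discard staged work so a retry starts clean


def run_pipeline(handlers, events):
    """Toy driver: pass the events through each handler in turn, then commit
    on every handler only if none of them raised."""
    events = sorted(events, key=lambda e: e["timestamp"])
    try:
        for handler in handlers:
            events = handler.handle_events(events)
    except Exception:
        for handler in handlers:
            handler.rollback()
        return False
    for handler in handlers:
        handler.commit()
    return True


events = [
    {"message_id": "2", "event_type": "compute.instance.create.end",
     "timestamp": datetime(2014, 9, 1, 12, 5)},
    {"message_id": "1", "event_type": "compute.instance.create.start",
     "timestamp": datetime(2014, 9, 1, 12, 0)},
    {"message_id": "3", "event_type": "test.ping",
     "timestamp": datetime(2014, 9, 1, 12, 1)},
]
outbox = []
ok = run_pipeline([DropTestEvents(), CountReporter(outbox)], events)
```

Because `CountReporter` defers its only side effect to `commit`, a failure in any handler leaves `outbox` untouched and the whole pipeline can safely be retried.
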
## Installing and running

Winchester is installable as a simple Python package.
Once it is installed, and the appropriate database URL is specified in the
*winchester.yaml* config file (an example is included in the *etc* directory),
you can create the appropriate database schema with:

    winchester_db -c <path_to_your_config_files>/winchester.yaml upgrade head

If you need to run the SQL by hand, or just want to look at the schema, the
following will print out the appropriate table creation SQL:

    winchester_db -c <path_to_your_config_files>/winchester.yaml upgrade --sql head

Once you have done that, and configured the appropriate *triggers.yaml*,
*pipelines.yaml*, and, if using StackDistiller, *event_definitions.yaml* configs
(again, examples are in *etc* in the Winchester codebase), you can add events
into the system by calling the *add_event* method of Winchester's TriggerManager.
If you are processing OpenStack notifications, you can call *add_notification*,
which will pare down the notification into an event with StackDistiller, and
then call *add_event* with that. If you are reading OpenStack notifications off
of a RabbitMQ queue, there is a plugin for the
[Yagi](https://github.com/rackerlabs/yagi) notification processor included with
Winchester. Simply add "winchester.yagi\_handler.WinchesterHandler" to the "apps"
line in the section of your *yagi.conf* for the queues you want to listen to, and add a

    [winchester]
    config_file = <path_to_your_config_files>/winchester.yaml

section to the *yagi.conf*.

To run the actual pipeline processing, which runs as a separate daemon, use:

    pipeline_worker -c <path_to_your_config_files>/winchester.yaml

You can pass the *-d* flag to the *pipeline_worker* to tell it to run as a background
daemon.

Winchester uses an optimistic locking scheme in the database to coordinate firing,
expiring, and processing of streams, so you can run as many processes (like
Yagi's *yagi-event* daemon) feeding TriggerManagers as you need to handle the
incoming events, and as many *pipeline_worker*s as you need to handle the resulting
processing load, scaling the system horizontally.

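For readers unfamiliar with optimistic locking, the general technique can be sketched with Python's standard `sqlite3` module. This is a generic illustration, not Winchester's schema or code: the `stream` table, its `state` and `serial` columns, and the `claim_stream` function are hypothetical.

```python
import sqlite3


def claim_stream(conn, stream_id, seen_serial, new_state):
    """Try to move a stream to new_state. The UPDATE only matches if the row
    still carries the serial number we read earlier, so a worker that lost
    the race changes nothing and can simply move on."""
    cur = conn.execute(
        "UPDATE stream SET state = ?, serial = serial + 1 "
        "WHERE id = ? AND serial = ?",
        (new_state, stream_id, seen_serial),
    )
    conn.commit()
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stream (id INTEGER PRIMARY KEY, state TEXT, serial INTEGER)"
)
conn.execute("INSERT INTO stream (id, state, serial) VALUES (1, 'active', 0)")
conn.commit()

# Two workers both read the row at serial 0 and try to claim it.
first = claim_stream(conn, 1, 0, "firing")
second = claim_stream(conn, 1, 0, "firing")
```

The first claim succeeds and bumps the serial number; the second finds no row with the stale serial and fails without blocking. No worker holds a lock while it does its processing, which is what lets many feeders and workers run side by side.
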