Eoghan Glynn ede2329e54 Simple alarm partitioning protocol based on AMQP fanout RPC
All available partitions report their presence periodically.

The priority of each partition in terms of assuming mastership
is determined by earliest start-time (with a UUID-based tiebreaker
in the unlikely event of a time clash).

A single partion assumes mastership at any given time, taking
responsibility for allocating the alarms to be evaluated across
the set of currently available partitions.

When a partition lifecycle event is detected (i.e. a pre-existing
partition fails to report its presence, or a new one is started
up), a complete rebalance of the alarms is initiated.

Individual alarm lifecycle events, on the other hand, do not
require a full re-balance. Instead new alarms are allocated as
they are detected, whereas deleted alarms are initially allowed to
remain within the allocation (as the individual evaluators are tolerant
of assigned alarms not existing, and the deleted alarms should be
randomly distributed over the partions). However once the number of
alarms deleted since the last rebalance reaches a certain limit, a
rebalance will be initiated to maintain equity.

As presence reports are received, each partition keeps track of the
oldest partition it currently knows about, allowing an assumption of
mastership to be aborted if an older partition belatedly reports.

The alarm evaluation service to launch (singleton versus partitioned)
is controlled via a new alarm.evaluation_service config option.

Implements bp alarm-service-partitioner

Change-Id: I3dede464d019a7f776f3d302e2b24cc4a9fc5b66
2013-09-20 17:22:46 +01:00

0 lines
0 B
Python

The file is empty.