aodh/partition at 125411ea4f6a15b51e7e80419b2d15310864b771 - aodh

__init__.py	Simple alarm partitioning protocol based on AMQP fanout RPC	2013-09-20 17:22:46 +01:00
test_coordination.py	Simple alarm partitioning protocol based on AMQP fanout RPC	2013-09-20 17:22:46 +01:00

Eoghan Glynn ede2329e54 Simple alarm partitioning protocol based on AMQP fanout RPC

All available partitions report their presence periodically.

The priority of each partition in terms of assuming mastership
is determined by earliest start-time (with a UUID-based tiebreaker
in the unlikely event of a time clash).
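
A minimal sketch of this ordering, assuming each partition is
identified by a (start-time, UUID) pair. The names PartitionIdentity
and pick_master are hypothetical and are not taken from this change:

```python
from typing import NamedTuple


class PartitionIdentity(NamedTuple):
    """Hypothetical identity of a partition: start time first, then UUID."""
    start_time: float   # seconds since the epoch at which the partition started
    partition_id: str   # UUID string, only consulted when start times clash


def pick_master(partitions):
    """Return the partition with the highest priority for mastership.

    Tuples compare field by field, so the earliest start_time wins and
    the UUID string breaks a tie between equal start times.
    """
    return min(partitions)
```

Putting the start time first in the tuple lets ordinary tuple comparison
express both the priority rule and the tiebreaker.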

A single partition assumes mastership at any given time, taking
responsibility for allocating the alarms to be evaluated across
the set of currently available partitions.

When a partition lifecycle event is detected (i.e. a pre-existing
partition fails to report its presence, or a new one is started
up), a complete rebalance of the alarms is initiated.
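
A complete rebalance could be sketched as follows. The round-robin
strategy and the function name allocate are assumptions made for
illustration; the text above does not say how the master actually
divides the alarms:

```python
def allocate(alarm_ids, partition_ids):
    """Spread alarms evenly over partitions (hypothetical round-robin).

    Returns a dict mapping each partition ID to its list of alarm IDs.
    Both inputs are sorted first so the result is deterministic.
    """
    if not partition_ids:
        raise ValueError("at least one partition is required")
    ordered = sorted(partition_ids)
    allocation = {p: [] for p in ordered}
    for index, alarm in enumerate(sorted(alarm_ids)):
        allocation[ordered[index % len(ordered)]].append(alarm)
    return allocation
```

Under this scheme the sizes of any two partitions' shares differ by at
most one alarm.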

Individual alarm lifecycle events, on the other hand, do not
require a full rebalance. Instead, new alarms are allocated as
they are detected, whereas deleted alarms are initially allowed to
remain within the allocation (as the individual evaluators are tolerant
of assigned alarms not existing, and the deleted alarms should be
randomly distributed over the partitions). However, once the number of
alarms deleted since the last rebalance reaches a certain limit, a
rebalance will be initiated to maintain equity.
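
The bookkeeping described above might look roughly like this. The class
AllocationTracker, its observe method and the deletion_limit parameter
are hypothetical names chosen for the sketch:

```python
class AllocationTracker:
    """Hypothetical tracker deciding when deletions force a rebalance."""

    def __init__(self, deletion_limit):
        self.deletion_limit = deletion_limit
        self.allocated = set()   # alarms handed out, including stale ones
        self.deleted = set()     # allocated alarms that no longer exist

    def observe(self, current_alarm_ids):
        """Compare the current alarms against the allocation.

        Returns (new_alarms, rebalance_needed). New alarms are added to
        the allocation immediately; deleted alarms are only counted.
        """
        current = set(current_alarm_ids)
        new_alarms = current - self.allocated
        self.allocated |= new_alarms
        self.deleted = self.allocated - current
        if len(self.deleted) >= self.deletion_limit:
            # Limit reached: drop the stale alarms and start counting afresh.
            self.allocated = current
            self.deleted = set()
            return new_alarms, True
        return new_alarms, False
```

Leaving deleted alarms in place until the limit is reached keeps the
common case cheap, since a deletion needs no message to any partition.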

As presence reports are received, each partition keeps track of the
oldest partition it currently knows about, allowing an assumption of
mastership to be aborted if an older partition belatedly reports.
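
A sketch of how a partition might track the oldest known peer, assuming
presence reports carry a (start_time, uuid) tuple. MastershipTracker
and its methods are hypothetical, and expiry of partitions that stop
reporting is deliberately left out:

```python
class MastershipTracker:
    """Hypothetical view of the oldest partition seen so far."""

    def __init__(self, own_identity):
        self.own_identity = own_identity   # (start_time, uuid) of this partition
        self.oldest = own_identity         # until told otherwise, we are oldest

    def on_presence(self, reported_identity):
        """Record a presence report, remembering the oldest identity seen."""
        if reported_identity < self.oldest:
            self.oldest = reported_identity

    def should_be_master(self):
        """True only while no older partition has reported."""
        return self.oldest == self.own_identity
```

A partition that has begun assuming mastership can re-check
should_be_master() before each step and abort once it returns False.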

The alarm evaluation service to launch (singleton versus partitioned)
is controlled via a new alarm.evaluation_service config option.

Implements bp alarm-service-partitioner

Change-Id: I3dede464d019a7f776f3d302e2b24cc4a9fc5b66

2013-09-20 17:22:46 +01:00
