Added examples to create event based alarm

- Aodh event based alarm using single and multiple traits
- New sub-section in 'Debug alarm'
- Correct path to aodh.conf.sample file
- Seperated example sections into
  - Threshold based alarm
  - Composite alarm
  - Event based alarm

Change-Id: I0b025a7401ad7feaf05f7e31e77cd3f92c9761a0
Signed-off-by: Sachin Patil <psachin@redhat.com>
This commit is contained in:
Sachin 2017-07-15 11:35:21 +00:00 committed by Sachin Patil
parent 2c795471fe
commit c875113961
3 changed files with 159 additions and 18 deletions

View File

@ -144,18 +144,27 @@ Using alarms
Alarm creation Alarm creation
-------------- --------------
Threshold based alarm
`````````````````````
An example of creating a Gnocchi threshold-oriented alarm, based on an upper An example of creating a Gnocchi threshold-oriented alarm, based on an upper
bound on the CPU utilization for a particular instance: bound on the CPU utilization for a particular instance:
.. code-block:: console .. code-block:: console
$ aodh alarm create --name cpu_hi \ $ aodh alarm create \
--name cpu_hi \
--type gnocchi_resources_threshold \ --type gnocchi_resources_threshold \
--description 'instance running hot' \ --description 'instance running hot' \
--metric cpu_util --threshold 70.0 \ --metric cpu_util \
--comparison-operator gt --aggregation_method avg \ --threshold 70.0 \
--granularity 600 --evaluation-periods 3 \ --comparison-operator gt \
--alarm-action 'log://' --resource_id INSTANCE_ID --aggregation-method mean \
--granularity 600 \
--evaluation-periods 3 \
--alarm-action 'log://' \
--resource-id INSTANCE_ID \
--resource-type instance
This creates an alarm that will fire when the average CPU utilization This creates an alarm that will fire when the average CPU utilization
for an individual instance exceeds 70% for three consecutive 10 for an individual instance exceeds 70% for three consecutive 10
@ -220,17 +229,21 @@ time-constraint
day or days of the week (expressed as ``cron`` expression with an day or days of the week (expressed as ``cron`` expression with an
optional timezone). optional timezone).
Composite alarm
```````````````
An example of creating a combination alarm, based on the combined An example of creating a combination alarm, based on the combined
state of two underlying alarms: state of two underlying alarms:
.. code-block:: console .. code-block:: console
$ aodh alarm create --name meta --type composite \ $ aodh alarm create --name meta --type composite \
--composite-rule '{"or":[{"threshold": 0.8,"metric": "cpu_util", "type": \ --composite-rule '{"or": [{"threshold": 0.8,"metric": "cpu_util", \
"gnocchi_resources_threshold", "resource_id": INSTANCE_ID, \ "type": "gnocchi_resources_threshold", "resource_id": INSTANCE_ID1, \
"aggregation-method": "last"},{"threshold": 0.8,"metric": "cpu_util", \ "aggregation_method": "last", "resource_type": "instance"}, {"threshold": \
"type": "gnocchi_resources_threshold", "resource_id": INSTANCE_ID2, \ 0.8, "metric": "cpu_util", "type": "gnocchi_resources_threshold", \
"aggregation-method": "last"}]}' \ "resource_id": INSTANCE_ID2, "aggregation_method": "last", \
"resource_type": "instance"}]}' \
--alarm-action 'http://example.org/notify' --alarm-action 'http://example.org/notify'
This creates an alarm that will fire when either one of two underlying This creates an alarm that will fire when either one of two underlying
@ -239,6 +252,12 @@ is a webhook call. Any number of underlying alarms can be combined in
this way, using either ``and`` or ``or``. Additionally, combinations this way, using either ``and`` or ``or``. Additionally, combinations
can contain nested conditions: can contain nested conditions:
.. note::
Observe the *underscore in* ``resource_id`` & ``resource_type`` in
composite rule as opposed to ``--resource-id`` &
``--resource-type`` CLI arguments.
.. code-block:: console .. code-block:: console
$ aodh alarm create --name meta --type composite \ $ aodh alarm create --name meta --type composite \
@ -246,6 +265,76 @@ can contain nested conditions:
--alarm-action 'http://example.org/notify' --alarm-action 'http://example.org/notify'
Event based alarm
`````````````````
An example of creating a event alarm based on power state of
instance:
.. code-block:: console
$ aodh alarm create \
--type event \
--name instance_off \
--description 'Instance powered OFF' \
--event-type "compute.instance.power_off.*" \
--enable True \
--query "traits.instance_id=string::INSTANCE_ID" \
--alarm-action 'log://' \
--ok-action 'log://' \
--insufficient-data-action 'log://'
Valid list of ``event-type`` and ``traits`` can be found in
``event_definitions.yaml`` file . ``--query`` may also contain mix of
traits for example to create alarm when instance is powered on but
went into error state:
.. code-block:: console
$ aodh alarm create \
--type event \
--name instance_on_but_in err_state \
--description 'Instance powered ON but in error state' \
--event-type "compute.instance.power_on.*" \
--enable True \
--query "traits.instance_id=string::INSTANCE_ID;traits.state=string::error" \
--alarm-action 'log://' \
--ok-action 'log://' \
--insufficient-data-action 'log://'
Sample output of alarm type **event**:
.. code-block:: console
+---------------------------+---------------------------------------------------------------+
| Field | Value |
+---------------------------+---------------------------------------------------------------+
| alarm_actions | [u'log://'] |
| alarm_id | 15c0da26-524d-40ad-8fba-3e55ee0ddc91 |
| description | Instance powered ON but in error state |
| enabled | True |
| event_type | compute.instance.power_on.* |
| insufficient_data_actions | [u'log://'] |
| name | instance_on_state_err |
| ok_actions | [u'log://'] |
| project_id | 9ee200732f4c4d10a6530bac746f1b6e |
| query | traits.instance_id = bb912729-fa51-443b-bac6-bf4c795f081d AND |
| | traits.state = error |
| repeat_actions | False |
| severity | low |
| state | insufficient data |
| state_timestamp | 2017-07-15T02:28:31.114455 |
| time_constraints | [] |
| timestamp | 2017-07-15T02:28:31.114455 |
| type | event |
| user_id | 89b4e48bcbdb4816add7800502bd5122 |
+---------------------------+---------------------------------------------------------------+
.. note::
To enable event alarms please refer `Configuration
<https://docs.openstack.org/aodh/latest/contributor/event-alarm.html#configuration>`_
Alarm retrieval Alarm retrieval
--------------- ---------------
@ -341,3 +430,54 @@ or even deleted permanently (an irreversible step):
.. code-block:: console .. code-block:: console
$ aodh alarm delete ALARM_ID $ aodh alarm delete ALARM_ID
Debug alarms
------------
A good place to start is to add ``--debug`` flag when creating or
updating an alarm. For example:
.. code-block:: console
$ aodh --debug alarm create <OTHER_PARAMS>
Look for the state to transition when event is triggered in
``/var/log/aodh/listener.log`` file. For example, the below logs shows
the transition state of alarm with id
``85a2942f-a2ec-4310-baea-d58f9db98654`` triggered by event id
``abe437a3-b75b-40b4-a3cb-26022a919f5e``
.. code-block:: console
2017-07-15 07:03:20.149 2866 INFO aodh.evaluator [-] alarm 85a2942f-a2ec-4310-baea-d58f9db98654 transitioning to alarm because Event <id=abe437a3-b75b-40b4-a3cb-26022a919f5e,event_type=compute.instance.power_off.start> hits the query <query=[{"field": "traits.instance_id", "op": "eq", "type": "string", "value": "bb912729-fa51-443b-bac6-bf4c795f081d"}]>.
The below entry in ``/var/log/aodh/notifier.log`` also confirms that
event id ``abe437a3-b75b-40b4-a3cb-26022a919f5e`` hits the query
matching instance id ``bb912729-fa51-443b-bac6-bf4c795f081d``
.. code-block:: console
2017-07-15 07:03:24.071 2863 INFO aodh.notifier.log [-] Notifying alarm instance_off 85a2942f-a2ec-4310-baea-d58f9db98654 of low priority from insufficient data to alarm with action log: because Event <id=abe437a3-b75b-40b4-a3cb-26022a919f5e,event_type=compute.instance.power_off.start> hits the query <query=[{"field": "traits.instance_id", "op": "eq", "type": "string", "value": "bb912729-fa51-443b-bac6-bf4c795f081d"}]>
``aodh alarm-history`` as mentioned earlier will also display the
transition:
.. code-block:: console
$ aodh alarm-history show 85a2942f-a2ec-4310-baea-d58f9db98654
+----------------------------+------------------+--------------------------------------------------------------------------------------------------------------------------+--------------------------------------+
| timestamp | type | detail | event_id |
+----------------------------+------------------+--------------------------------------------------------------------------------------------------------------------------+--------------------------------------+
| 2017-07-15T01:33:20.390623 | state transition | {"transition_reason": "Event <id=abe437a3-b75b-40b4-a3cb-26022a919f5e,event_type=compute.instance.power_off.start> hits | c5ca92ae-584b-4da6-a12c-b7a00dd39fef |
| | | the query <query=[{\"field\": \"traits.instance_id\", \"op\": \"eq\", \"type\": \"string\", \"value\": \"bb912729-fa51 | |
| | | -443b-bac6-bf4c795f081d\"}]>.", "state": "alarm"} | |
| 2017-07-15T01:31:14.516188 | creation | {"alarm_actions": ["log://"], "user_id": "89b4e48bcbdb4816add7800502bd5122", "name": "instance_off", "state": | fb31f4c2-e357-44c3-9b6a-bd2aaaa4ae68 |
| | | "insufficient data", "timestamp": "2017-07-15T01:31:14.516188", "description": "event_instance_power_off", "enabled": | |
| | | true, "state_timestamp": "2017-07-15T01:31:14.516188", "rule": {"query": [{"field": "traits.instance_id", "type": | |
| | | "string", "value": "bb912729-fa51-443b-bac6-bf4c795f081d", "op": "eq"}], "event_type": "compute.instance.power_off.*"}, | |
| | | "alarm_id": "85a2942f-a2ec-4310-baea-d58f9db98654", "time_constraints": [], "insufficient_data_actions": ["log://"], | |
| | | "repeat_actions": false, "ok_actions": ["log://"], "project_id": "9ee200732f4c4d10a6530bac746f1b6e", "type": "event", | |
| | | "severity": "low"} | |
+----------------------------+------------------+--------------------------------------------------------------------------------------------------------------------------+--------------------------------------+

View File

@ -33,4 +33,4 @@ The following is a sample Aodh configuration for adaptation and use. It is
auto-generated from Aodh when this documentation is built, and can also be auto-generated from Aodh when this documentation is built, and can also be
viewed in `file form <_static/aodh.conf.sample>`_. viewed in `file form <_static/aodh.conf.sample>`_.
.. literalinclude:: _static/aodh.conf.sample .. literalinclude:: ../_static/aodh.conf.sample

View File

@ -63,13 +63,14 @@ The following is an example of event alarm rule::
Configuration Configuration
============= =============
To enable this functionality, config the Ceilometer to be able to publish To enable this functionality, config the Ceilometer to be able to
events to the queue the aodh-listener service listen on. The publish events to the queue the aodh-listener service listen on. The
*event_alarm_topic* config option of Aodh identify which messaging topic the *event_alarm_topic* config option of Aodh identify which messaging
aodh-listener on, the default value is "alarm.all". In Ceilometer side, topic the aodh-listener on, the default value is "alarm.all". In
a publisher of notifier type need to be configured in the event pipeline config Ceilometer side, a publisher of notifier type need to be configured in
file(event_pipeline.yaml as default), the notifier should be with a messaging the event pipeline config file(``event_pipeline.yaml`` as default),
topic same as the *event_alarm_topic* option defined. For an example:: the notifier should be with a messaging topic same as the
*event_alarm_topic* option defined. For an example::
--- ---
sources: sources: