Document version 3 template format

- Added vitrage-templates.rst and renamed the previous file to v2.
- Release notes

Change-Id: Ibed548118473902ed4bf7028aad854ebd1cb84bf
Story: 2004871
Task: 29128
2019-02-07 14:30:56 +00:00 · 2019-02-07 14:30:56 +00:00 · a2ce45c3c6
commit a2ce45c3c6
parent f87cdcd111
8 changed files with 533 additions and 8 deletions
--- a/doc/source/contributor/nova-notifier.rst
+++ b/doc/source/contributor/nova-notifier.rst
@@ -44,4 +44,4 @@ In order to support this use case, the user should perform the following:
 For more information about the mark-down action, see the Vitrage templates
 documentation: templates_

-.. _templates: https://docs.openstack.org/vitrage/latest/contributor/vitrage-template-format.html
+.. _templates: https://docs.openstack.org/vitrage/latest/contributor/vitrage-templates.html
--- a/doc/source/contributor/scenario-evaluator.rst
+++ b/doc/source/contributor/scenario-evaluator.rst
@@ -88,7 +88,7 @@ Template Loading
 Scenarios are written up in configuration files called *templates*. The format
 and specification of these can be seen here_.

-.. _here: vitrage-template-format.html
+.. _here: vitrage-templates.html

 Templates should all be located in the *<vitrage folder>/templates* folder.

--- a/doc/source/contributor/templates-loading.rst
+++ b/doc/source/contributor/templates-loading.rst
@@ -12,7 +12,7 @@ templates will be added into scenario repository.
 This document explains the implementation details of template data to help
 developer understand how scenario_evaluator_ works.

-.. _format: vitrage-template-format.html
+.. _format: vitrage-templates.html
 .. _scenario_evaluator: scenario-evaluator.html

 Example
--- a/doc/source/contributor/vitrage-first_steps.rst
+++ b/doc/source/contributor/vitrage-first_steps.rst
@@ -122,4 +122,4 @@ Writing your own templates
 For more information regarding Vitrage templates, their format and how to add
 them, see here_.

-.. _here: https://docs.openstack.org/vitrage/latest/contributor/vitrage-template-format.html
+.. _here: https://docs.openstack.org/vitrage/latest/contributor/vitrage-templates.html
--- a/doc/source/contributor/vitrage-template-format-v2.rst
+++ b/doc/source/contributor/vitrage-template-format-v2.rst
@@ -1,6 +1,10 @@
-================================
-Vitrage Templates Format & Usage
-================================
+====================================
+Vitrage Templates Format - Version 2
+====================================
+
+*Note:* Consider using the newer template format version_3_
+
+.. _version_3: https://docs.openstack.org/vitrage/latest/contributor/vitrage-templates.html

 Overview
 ========
--- a/doc/source/contributor/vitrage-templates.rst
+++ b/doc/source/contributor/vitrage-templates.rst
@@ -0,0 +1,515 @@
+===============
+Using Templates
+===============
+
+Overview
+########
+In Vitrage we use configuration files, called ``templates``, to express rules
+regarding raising deduced alarms, setting deduced states, and detecting/setting
+RCA links.
+This page describes the format of the Vitrage templates, with some examples.
+Additionally, a short guide on adding templates is presented.
+
+*Note:* This document refers to Vitrage templates version 3.
+
+For previous versions, see:
+
+Version_1_
+
+Version_2_
+
+.. _Version_1: https://docs.openstack.org/vitrage/pike/
+.. _Version_2: https://docs.openstack.org/vitrage/latest/contributor/vitrage-template-format-v2.html
+
+
+Template Structure
+##################
+The template is written in YAML, with the following structure:
+
+.. code-block:: yaml
+
+   metadata:
+    version: 3
+    name: <unique template identifier>
+    type: standard
+    description: <what this template does>
+   entities:
+    example_host:
+     type: nova.host
+     name: compute-0-0
+    example_instance:
+     type: nova.instance
+    example_alarm:
+     type: zabbix
+     name: memory threshold crossed
+   scenarios:
+    - condition: <if statement true do the actions>
+      actions:
+        ...
+
+The template is divided into three main sections:
+
+- ``metadata`` - contains general information about the template.
+
+  - ``version`` - the version of the template format.
+  - ``name`` - the name of the template.
+  - ``type`` - the type of the template. Currently only `standard` is supported.
+  - ``description`` - a brief description of what the template does (optional).
+
+- ``entities`` - describes the resources and alarms that are relevant to the template scenarios. Each entity corresponds to a vertex in the entity graph and is referenced later on by its key.
+
+- ``scenarios`` - a list of if-then scenarios to consider. Each scenario consists of:
+
+  - ``condition`` - an expression describing the existence of a structure in the topology.
+  - ``actions`` - a list of actions to execute when the condition is met.
+
+
+Scenario Condition
+==================
+
+The condition expression evaluates to True or False, depending on whether the structure it describes exists in the entity graph.
+An expression is either a *single* entity, a declaration describing a relationship between two entities, or some logical combination of these.
+
+Example 1
+---------
+
+.. code-block:: yaml
+
+   scenarios:
+    - condition: example_host
+      actions:
+
+True if an entity exists with properties matching those defined in `example_host`, False otherwise.
+
+Example 2
+---------
+
+.. code-block:: yaml
+
+   scenarios:
+    - condition: example_host [ contains ] example_instance
+      actions:
+
+True if all of the following are True:
+ - An entity exists with properties matching those defined in `example_host`
+ - An entity exists with properties matching those defined in `example_instance`
+ - A relationship (graph edge) with the label `contains` exists between these two entities
+
+Logical Operators
+-----------------
+
+Expressions can be combined using the following logical operators:
+
+- `AND` - Both expressions must be satisfied.
+- `OR` - At least one expression must be satisfied (non-exclusive or).
+- `NOT` - The expression must not be satisfied in order for the condition to be met.
+- `()` - parentheses clause indicating the scope of an expression.
+
+
+Example 3
+---------
+
+.. code-block:: yaml
+
+   scenarios:
+    - condition: example_host [ contains ] example_instance AND example_alarm [ on ] example_host
+      actions:
+
+True if all of the following are True:
+ - An entity exists with properties matching those defined in `example_host`
+ - An entity exists with properties matching those defined in `example_instance`
+ - An entity exists with properties matching those defined in `example_alarm`
+ - A relationship (graph edge) with the label `contains` exists between `example_host` and `example_instance`
+ - A relationship (graph edge) with the label `on` exists between `example_alarm` and `example_host`
+
+Example 4
+---------
+
+.. code-block:: yaml
+
+    - condition: example_host [ contains ] example_instance AND NOT example_alarm [ on ] example_host
+      actions:
+
+Similar to Example 3, but the `NOT` means that there must be no edge with the label `on` between `example_alarm` and `example_host`.
+
+Further examples
+----------------
+
+A few more example conditions:
+
+- `entity_a [contains] entity_b`
+- `entity_a [contains] entity_b AND entity_b [contains] entity_c AND entity_c [contains] entity_d`
+- `entity_a [contains] entity_b AND NOT entity_a [contains] entity_c`
+- `entity_a [contains] entity_b AND NOT (entity_a [contains] entity_c OR entity_a [contains] entity_d)`
+
+A few restrictions regarding the condition format
+-------------------------------------------------
+
+A condition cannot be entirely "negative": it must have at least one part that is not preceded by `NOT`. For example:
+
+::
+
+ This condition is illegal:
+ condition: NOT example_alarm [on] example_instance
+
+ Instead, add a positive term:
+ condition: example_instance AND NOT example_alarm [on] example_instance
+
+There must be at least one entity that is common to all `OR` clauses.
+
+::
+
+ This condition is illegal:
+ example_alarm_1 [on] example_instance OR example_alarm_2 [on] example_host
+
+ Instead, use two separate conditions and scenarios.
+
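+The split suggested above can be sketched as follows. This is only an
+illustration: it assumes that ``example_alarm_1``, ``example_alarm_2``,
+``example_instance`` and ``example_host`` are all defined under ``entities``,
+and the ``set_state`` actions are placeholders chosen for this sketch.
+
+.. code-block:: yaml
+
+   scenarios:
+    # first OR clause, as a scenario of its own
+    - condition: example_alarm_1 [on] example_instance
+      actions:
+        - set_state:
+           state: SUBOPTIMAL
+           target: example_instance
+    # second OR clause, as a separate scenario
+    - condition: example_alarm_2 [on] example_host
+      actions:
+        - set_state:
+           state: SUBOPTIMAL
+           target: example_host
+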
+For more information, see the 'Calculate the action_target' section in external actions Spec_.
+
+.. _Spec: https://specs.openstack.org/openstack/vitrage-specs/specs/pike/external-actions.html
+
+Scenario Actions
+================
+
+Each scenario contains a `condition` and `actions`. When the `condition` is met, all of the scenario's
+actions are executed. The executed actions may be reverted if the condition is no longer met.
+
+All of the supported actions described below use the following entity definitions:
+
+.. code-block:: yaml
+
+    metadata:
+        version: 3
+        name: Entities for action examples
+        type: standard
+    entities:
+        - host:
+            type: nova.host
+        - host_alarm:
+            category: ALARM
+        - instance:
+            type: nova.instance
+        - instance_alarm:
+            category: ALARM
+
+Set State
+---------
+
+.. code-block:: yaml
+
+ - condition: host_alarm [on] host
+   actions:
+     - set_state:
+        state: ERROR                         # Mandatory - ERROR/SUBOPTIMAL/OK
+        target: host                         # Mandatory - Entity key
+
+This action changes the state of the `target` resource to the specified `state`.
+This affects the state as seen in Vitrage.
+Once the condition is no longer met, the state will be reverted to the result of either the data source state or any other scenario.
+
+Raise Alarm
+-----------
+
+.. code-block:: yaml
+
+ - condition: host_alarm [on] host AND host [contains] instance
+   actions:
+    - raise_alarm:
+       target: instance                      # Mandatory - Entity key
+       alarm_name: affected by host problem  # Mandatory - Any string
+       severity: WARNING                     # Mandatory - CRITICAL/WARNING
+       causing_alarm: host_alarm             # Optional - Entity key
+
+This action creates a new alarm vertex, with the specified `alarm_name` as its `name` property.
+This alarm vertex will have an edge to the `target` vertex, with a label `on`.
+Optionally, if `causing_alarm` is specified, another edge will be added, from the `causing_alarm` vertex to the new alarm vertex, with a label `causes`.
+Note: the `on` and `causes` edge labels are predefined values.
+Once the condition is no longer met, the alarm may be removed, if it is not the result of any other scenario.
+
+Add Causal Relationship
+-----------------------
+
+.. code-block:: yaml
+
+ - condition: host_alarm [on] host AND host [contains] instance AND instance_alarm [on] instance
+   actions:
+     - add_causal_relationship:
+        source: host_alarm
+        target: instance_alarm
+
+A new edge will be added, from the `source` vertex to the `target` vertex, with a label `causes`.
+Once the condition is no longer met, the edge may be removed, if it is not the result of any other scenario.
+Note: the `causes` edge label is a predefined value.
+
+Mark Down
+---------
+
+.. code-block:: yaml
+
+ - condition: host_alarm [on] host
+   actions:
+     - mark_down:
+        target: host                         # Mandatory - Entity key
+
+Set an entity's `marked_down` field.
+This action adds a `marked_down` property to the resource (supported by the nova notifier).
+This can be used along with the nova notifier to:
+
+- call nova force_down for a host.
+- call nova reset-server-state for an instance.
+
+Once the condition is no longer met, the `marked_down` property may be removed, if it is not the result of any other scenario.
+
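+As a sketch of the instance case mentioned above, the same action can target
+an instance instead of a host. It reuses the ``instance`` and
+``instance_alarm`` entities defined at the start of this section; the
+condition is an assumption made for illustration.
+
+.. code-block:: yaml
+
+ - condition: instance_alarm [on] instance
+   actions:
+     - mark_down:
+        target: instance                     # Entity key of the instance to mark down
+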
+Execute Mistral
+---------------
+
+.. code-block:: yaml
+
+ - condition: host_alarm [on] host
+   actions:
+     - execute_mistral:
+        workflow: work_1                      # Mandatory - Workflow name
+        input:                                # Optional - Dictionary of custom workflow input
+          some_property: 5
+          another_property: hello
+
+Execute a Mistral workflow.
+If the Mistral notifier is used, the specified workflow will be executed with
+its parameters.
+
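+Because ``actions`` is a list, one scenario can carry several actions. The
+following sketch combines two of the actions described above, using the entity
+definitions from the start of this section; the alarm name is made up for the
+example.
+
+.. code-block:: yaml
+
+ - condition: host_alarm [on] host AND host [contains] instance
+   actions:
+     # raise a deduced alarm on the instance and link it to the host alarm
+     - raise_alarm:
+        target: instance
+        alarm_name: instance affected by host alarm
+        severity: WARNING
+        causing_alarm: host_alarm
+     # in the same scenario, also degrade the instance state
+     - set_state:
+        state: SUBOPTIMAL
+        target: instance
+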
+Advanced
+========
+
+Regular expressions
+-------------------
+Any parameter within an entity definition can be matched by a regular
+expression. To do this, append `.regex` to its key. For example, a Zabbix alarm
+contains a `rawtext` field, which is itself a regular expression, so a Zabbix
+alarm entity defined in the template may contain a ``rawtext.regex`` field that
+is also defined by a regular expression:
+::
+
+  - zabbix_alarm:
+     category: ALARM
+     type: zabbix
+     rawtext.regex: Interface ([_a-zA-Z0-9'-]+) down on {HOST.NAME}
+
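+The same suffix can presumably be applied to other keys. The sketch below
+assumes that ``name`` accepts ``.regex`` in the same way as ``rawtext``; the
+entity key and the pattern are invented for illustration.
+
+.. code-block:: yaml
+
+  - compute_host:
+     type: nova.host
+     # hypothetical pattern, intended to match names such as compute-0-0
+     name.regex: compute-[0-9]+-[0-9]+
+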
+Functions
+---------
+Some properties of an action can be defined using functions. In version 3, one
+function is supported: `get_attr`, and it is supported only for the `execute_mistral`
+action.
+
+
+get_attr
+^^^^^^^^
+This function retrieves the value of an attribute of an entity that is defined
+in the template.
+
+::
+
+    get_attr(template_id, attr_name)
+
+.. code-block:: yaml
+
+    metadata:
+        ...
+    entities:
+        - host:
+            type: nova.host
+        - host_alarm:
+            type: zabbix
+            name: host connectivity problem
+    scenarios:
+     - condition: host_alarm [on] host
+       actions:
+         - execute_mistral:
+            workflow: demo_workflow
+            input:
+              host_name: get_attr(host, name)
+              retries: 5
+
+
+Examples
+########
+
+
+Example 1: Basic RCA and Deduced Alarm/State
+============================================
+The following template demonstrates:
+
+1. How to raise a deduced alarm. Specifically, if there is high CPU load on a
+   host, raise an alarm indicating CPU performance problems on all contained
+   instances.
+2. How to link alarms for purposes of root cause analysis (RCA). Specifically,
+   if there is high CPU load on the host and CPU performance problems on the
+   hosted instances, we link them with a `causes` relationship, according to the optional property `causing_alarm`.
+
+.. code-block:: yaml
+
+    metadata:
+        version: 3
+        name: EXAMPLE 1 - host high CPU load to instance CPU suboptimal
+        type: standard
+        description: when there is high CPU load on the host, show implications on the instances
+    entities:
+        - host:
+            type: nova.host
+        - host_alarm:
+            type: zabbix
+            name: host high cpu load
+        - instance:
+            type: nova.instance
+        - instance_alarm:
+            category: ALARM
+            severity: CRITICAL
+    scenarios:
+     - condition: host_alarm [on] host AND host [contains] instance
+       actions:
+         - raise_alarm:
+            target: instance
+            alarm_name: instance cpu performance problem
+            severity: WARNING
+            causing_alarm: host_alarm
+     - condition: instance_alarm [on] instance
+       actions:
+         - set_state:
+            state: SUBOPTIMAL
+            target: instance
+
+
+Example 2: Deduced state based on alarm
+=======================================
+The following template will change the state of an instance to `ERROR` if there
+is any alarm of severity `CRITICAL` on it.
+
+.. code-block:: yaml
+
+    metadata:
+        version: 3
+        name: EXAMPLE 2 - deduced state for instances with critical alarm
+        type: standard
+        description: deduced state for all instances with alarms
+    entities:
+        - instance:
+            type: nova.instance
+        - instance_alarm:
+            category: ALARM
+            severity: CRITICAL
+    scenarios:
+     - condition: instance_alarm [on] instance
+       actions:
+         - set_state:
+            state: ERROR
+            target: instance
+
+Example 3: Deduced alarm based on state
+=======================================
+This template will cause an alarm to be raised on any host in state `ERROR`.
+
+Note that in this template, there are no relationships. The condition is just
+that the entity exists.
+
+
+.. code-block:: yaml
+
+    metadata:
+        version: 3
+        name: EXAMPLE 3 - deduced alarm for all hosts in error
+        type: standard
+        description: raise deduced alarm for all hosts in error
+    entities:
+        - host_in_error:
+            type: nova.host
+            state: error
+    scenarios:
+     - condition: host_in_error
+       actions:
+         - raise_alarm:
+            target: host_in_error
+            alarm_name: host in error state
+            severity: CRITICAL
+
+Example 4: Deduced Alarm triggered by several options
+=====================================================
+This template will raise a deduced alarm on an instance, which can be caused by
+an alarm on the hosting zone or an alarm on the hosting host.
+
+
+.. code-block:: yaml
+
+    metadata:
+        version: 3
+        name: EXAMPLE 4 - deduced alarm two possible triggers
+        type: standard
+        description: deduced alarm using or in condition
+    entities:
+        - zone:
+            type: nova.zone
+        - zone_alarm:
+            category: ALARM
+            name: zone connectivity problem
+        - host:
+            type: nova.host
+        - host_alarm:
+            type: zabbix
+            name: host connectivity problem
+        - instance:
+            type: nova.instance
+    scenarios:
+     - condition: (host_alarm [on] host OR (zone_alarm [on] zone AND zone [contains] host)) AND host [contains] instance
+       actions:
+         - raise_alarm:
+            target: instance
+            alarm_name: instance_connectivity_problem
+            severity: CRITICAL
+
+Applying the template
+#####################
+
+
+Template Validate
+=================
+Before adding a template, you can validate it:
+
+::
+
+    vitrage template validate --path /home/stack/my_new_template.yaml
+
+Template Add
+============
+Applying the template will evaluate it against the existing entity graph as well as against any new data.
+
+::
+
+    vitrage template add --path /home/stack/my_new_template.yaml
+
+
+Common parameters and their acceptable values
+=============================================
+
+-------------------+-----------------------+-------------------------+------------------------------------+
+| block             | key                   | supported values        | comments                           |
+===================+=======================+=========================+====================================+
+| entity            | category              | ALARM,                  |                                    |
+|                   |                       | RESOURCE                |                                    |
+-------------------+-----------------------+-------------------------+------------------------------------+
+| entity (ALARM)    | type                  | vitrage,                |                                    |
+|                   |                       | zabbix,                 |                                    |
+|                   |                       | doctor,                 |                                    |
+|                   |                       | aodh,                   |                                    |
+|                   |                       | prometheus,             |                                    |
+|                   |                       | nagios,                 |                                    |
+-------------------+-----------------------+-------------------------+------------------------------------+
+| entity (RESOURCE) | type                  | openstack.cluster,      | These are for the datasources that |
+|                   |                       | nova.zone,              | come with vitrage by default.      |
+|                   |                       | nova.host,              | Adding datasources will add more   |
+|                   |                       | nova.instance,          | supported types, as defined in the |
+|                   |                       | cinder.volume,          | datasource transformer             |
+|                   |                       | switch                  |                                    |
+-------------------+-----------------------+-------------------------+------------------------------------+
+| actions           |                       | raise_alarm,            |                                    |
+|                   |                       | set_state,              |                                    |
+|                   |                       | add_causal_relationship |                                    |
+|                   |                       | mark_down               |                                    |
+|                   |                       | execute_mistral         |                                    |
+-------------------+-----------------------+-------------------------+------------------------------------+
--- a/doc/source/index.rst
+++ b/doc/source/index.rst
@@ -69,13 +69,14 @@ Developer Guide
   :hidden:
   
   contributor/index
+   contributor/vitrage-template-format-v2

 .. toctree::
   :maxdepth: 1

   contributor/vitrage-first_steps
   contributor/vitrage-api
-   contributor/vitrage-template-format
+   contributor/vitrage-templates
   contributor/devstack-installation
   contributor/configuration

--- a/releasenotes/notes/template_version_3-cd8a0775b2f2e7cd.yaml
+++ b/releasenotes/notes/template_version_3-cd8a0775b2f2e7cd.yaml
@@ -0,0 +1,5 @@
+---
+features:
+  - Introduced template version 3, a shorter, more fluent template language.
+    Overall template yaml appearance improvements. `condition` definitions were
+    revised, `relationships` declarations removed.