documentation review

Change-Id: I655cd0b2a8871ec47ec92006bc7179a96ad13f83
2016-04-12 17:27:42 +03:00 · 2016-04-12 17:27:42 +03:00 · 3627ab1944
commit 3627ab1944
parent e6da524186
5 changed files with 192 additions and 115 deletions
--- a/doc/source/nagios-config.rst
+++ b/doc/source/nagios-config.rst
@@ -25,25 +25,35 @@ The following should be set in **/etc/vitrage/vitrage.conf**, under [nagios] sec
 the form *http://<nagios site url>/cgi-bin/status.cgi*, which returns all the
 nagios tests.

-**Example**
+Nagios access configuration - example
+++++++++++++++++++++++++++++++++++++
+
+When installing Nagios on devstack with IP 10.20.30.40, following
+the instructions here_, this would be the correct configuration:
+
+.. _here: https://github.com/openstack/vitrage/blob/master/doc/source/nagios-devstack-installation.rst

 ::

  [nagios]
  user = omdadmin
  password = omd
-  url = http://10.20.30.40/monitoring/nagios/cgi-bin/status.cgi
+  url = http://10.20.30.40:54321/my_site/nagios/cgi-bin/status.cgi
  config_file = /etc/vitrage/nagios_conf.yaml
-    

 Configure Nagios Host Mapping
 -----------------------------

-Nagios tests are defined in a table with columns: Host, Service, Status, Last Check, etc.
+Nagios tests are defined in a table with columns: Host, Service, Status, Last
+Check, etc.

-Nagios "Host" is not necessarily a resource of type host. It can also be an instance, switch, or other resource types. **nagios_conf.yaml** is used to map Nagios host type to a Vitrage resource.
+A Nagios "host" is not necessarily a resource of type "nova.host". It can also
+be an instance, switch, or other resource types. **nagios_conf.yaml** is used
+to map each Nagios host to a Vitrage resource.
+
+Format
++++++

-**Format**
 ::

 nagios:
@@ -57,29 +67,40 @@ Nagios "Host" is not necessarily a resource of type host. It can also be an inst

  ...

+Note that for ease of use, there is support for wildcards in the "nagios_host"
+value, and references to the actual value for a given wildcard match. See
+**Example 2** below.

-**Example**

-The following file will map compute-1 to a nova.host named compute-1; and compute-2 to a nova.host named host2
+
+Example 1
+++++++++
+
+The following example is for a system with two hosts. In Nagios they are named
+*compute-0, compute-1*, and in Nova they are named *host1, host2*.

 ::

 nagios:
-  - nagios_host: compute-1
+  - nagios_host: compute-0
    type: nova.host
-    name: compute-1
+    name: host1

-  - nagios_host: compute-2
+  - nagios_host: compute-1
    type: nova.host
    name: host2

+Example 2
+++++++++

+The following file will
+ - map all Nagios hosts named host-*<some_suffix>* or *<some_prefix>*-devstack
+   to resources of type nova.host with the same name.
+ - map all Nagios hosts named instance-*<some_suffix>* to nova.instance
+   resources.

-**Default Configuration**
-
-A default nagios_conf.yaml will be installed with Vitrage. Its content is still TBD, but it will be similar to the following example.
-
-All Nagios hosts named host* or * -devstack will be mapped in Vitrage to resoruces of type nova.host with the same name; and all Nagios hosts named instance* will be mapped to nova.instance resources.
+Note how *${nagios_host}* refers to the actual Nagios host name that matched
+the pattern defined in nagios_host.

 ::
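
Review sketch (not part of the commit): one way to read the wildcard mapping described for Example 2 is "first matching pattern wins, then substitute the matched host name". The rule patterns, the `RULES` list and the `map_nagios_host` helper below are assumptions for illustration only; the actual file body is not shown in this hunk, and Vitrage's real matching code may differ.

```python
import re
from string import Template

# Hypothetical rules mirroring the nagios_conf.yaml keys (nagios_host, type,
# name). The patterns are assumptions; the real Example 2 file is not shown.
RULES = [
    {"nagios_host": r"host-(.*)", "type": "nova.host", "name": "${nagios_host}"},
    {"nagios_host": r"(.*)-devstack", "type": "nova.host", "name": "${nagios_host}"},
    {"nagios_host": r"instance-(.*)", "type": "nova.instance", "name": "${nagios_host}"},
]


def map_nagios_host(nagios_host, rules=RULES):
    """Return (type, name) for the first rule whose pattern matches, else None."""
    for rule in rules:
        if re.fullmatch(rule["nagios_host"], nagios_host):
            # ${nagios_host} is replaced by the actual host name that matched.
            name = Template(rule["name"]).safe_substitute(nagios_host=nagios_host)
            return rule["type"], name
    return None
```

Under these assumptions, `map_nagios_host("instance-42")` yields a nova.instance resource named after the Nagios host, and a host that matches no rule is left unmapped.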

--- a/doc/source/nagios-devstack-installation.rst
+++ b/doc/source/nagios-devstack-installation.rst
@@ -7,10 +7,10 @@ Overview
 This page describes how to manually install and configure Nagios on devstack.
 After following the steps described here, Nagios will be installed via the OMD
 package (http://omdistro.org/) and will have a basic set of tests for
-monitoring the Devstack VM. It will then be possible to sync Nagios into
-Vitrage.
+monitoring the Devstack VM. It will then be possible to configure a Nagios
+datasource for Vitrage.

-The following guide is for Ubuntu. With slight modifications should work for
+The following guide is for Ubuntu. With slight modifications it should work for
 other linux flavours. Links for this purpose are added below.

 Installation
@@ -21,7 +21,7 @@ Installation

    wget -q "https://labs.consol.de/repo/stable/RPM-GPG-KEY" -O - | sudo apt-key add -

-2. Update your repo with the OMD site. For example, for wheezy release:
+2. Update your repo with the OMD site. For example, for the Ubuntu trusty release:
   ::

    sudo bash -c "echo 'deb http://labs.consol.de/repo/stable/ubuntu trusty main' >> /etc/apt/sources.list"
@@ -34,15 +34,16 @@ Installation

    sudo apt-get install omd

-4. Create a site for nagios with a name of your choosing, for example "my_host".
+4. Create a site for nagios with a name of your choosing, for example
+   "my_site".
   ::

-    sudo omd create my_host
-    sudo omd config my_host set APACHE_TCP_PORT 54321
-    sudo omd config my_host set APACHE_TCP_ADDR 0.0.0.0
-    sudo omd start  my_host
+    sudo omd create my_site
+    sudo omd config my_site set APACHE_TCP_PORT 54321
+    sudo omd config my_site set APACHE_TCP_ADDR 0.0.0.0
+    sudo omd start  my_site

-   You can now access your Nagios site here: *http://<devstack_ip>:54321/my_host/omd*.
+   You can now access your Nagios site here: *http://<devstack_ip>:54321/my_site/omd*.
   ::

    username: omdadmin
@@ -52,27 +53,28 @@ Installation
    - The default port OMD uses is 5000, which is also used by OpenStack
      Keystone, and so it must be changed. Port 54321 used here is only an
      example.
-    - APACHE_TCP_ADDR indicates the address to listen on. Use 0.0.0.0 to listen
-      for all traffic addressed to the specified port. Use a different address
-      to listen on a specific (public) address.
+    - *APACHE_TCP_ADDR* indicates the address to listen on. Use 0.0.0.0 to
+      listen for all traffic addressed to the specified port. Use a different
+      address to listen on a specific (public) address.

-5. Install the Check_MK client on devstack VM:
+5. Install the Check_MK agent on devstack VM:
   ::

    sudo apt-get install check-mk-agent

-6. Activate the agent, by editing */etc/xinetd.d/check_mk* and setting
-   "disable" to "no", and then run
+6. Activate the Check_MK agent, by editing */etc/xinetd.d/check_mk* and
+   **setting "disable" to "no"**, and then run
   ::

    sudo service xinetd restart

-7. In your browser, go to *http://<devstack_ip>:<selected port>/my_host/omd*
-   and follow the instructions at this link_ to configure the nagios host.
+7. In your browser, go to *http://<devstack_ip>:<selected port>/my_site/omd*
+   and follow the instructions at this link_ (**"Configuring the first host and
+   checks"** section) to configure the nagios host.

   .. _link: http://mathias-kettner.de/checkmk_install_with_omd.html#H1:Configuring_the_first_host_and_checks

-8. *Vitrage Support.* With Nagios installed, you can now sync it into Vitrage.
-   follow the instructions here_.
+8. *Vitrage Support.* With Nagios installed, you can now configure a Nagios
+   datasource for Vitrage by following the instructions here_.

-   .. _here: https://github.com/openstack/vitrage/blob/master/doc/source/nagios-config.rst
+   .. _here: https://github.com/openstack/vitrage/blob/master/doc/source/nagios-config.rst
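
Review sketch (not part of the commit): the site name and port chosen in step 4 determine both the OMD URL above and the status.cgi URL used in the [nagios] section of vitrage.conf. The helper name `nagios_urls` is hypothetical; the URL layout is taken from the examples in these two documents.

```python
def nagios_urls(devstack_ip, port, site):
    """Build the OMD and Nagios status URLs for a given OMD site."""
    base = f"http://{devstack_ip}:{port}/{site}"
    return {
        # Web entry point mentioned in the installation steps.
        "omd": f"{base}/omd",
        # Value for "url" under [nagios] in /etc/vitrage/vitrage.conf.
        "status": f"{base}/nagios/cgi-bin/status.cgi",
    }
```

For the values used in the documentation, `nagios_urls("10.20.30.40", 54321, "my_site")["status"]` reproduces the url shown in the nagios-config.rst example.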
--- a/doc/source/scenario-evaluator.rst
+++ b/doc/source/scenario-evaluator.rst
@@ -67,12 +67,12 @@ Flow
 Concepts and Guidelines
 -----------------------
 - *Events:* The Scenario Evaluator is notified of each event in the Entity
-  Graph *after* the event takes place. An event in this context is any change
+  Graph after the event takes place. An event in this context is any change
  (create/update/delete) in a graph element (vertex/edge). The notification
  will consist of two copies of the element: the element *before* the change
  and the *current* element after the change took place.

-- *Execute and Undo Actions:* If the Entity Graph matches a scenario, the
+- *Actions - Do and Undo:* If the Entity Graph matches a scenario, the
  relevant actions will need to be executed. Conversely, if a previously
  matched scenario no longer holds, the action needs to be undone. Thus, for
  each action there must be a "do" and "undo" procedure defined. For example,
@@ -97,12 +97,12 @@ When Vitrage is started up, all the templates are loaded into a *Scenario*
 ensure the correct format is used, references are valid, and more. Errors in
 each template should be written to the log. Invalid templates are skipped.

-The SR supports querying for scenarios based on a vertex or edge in the Entity
-Graph. Given such a graph element, the Scenario Repository will return all
-scenarios where this element appears in the scenario condition. This means that
-a corresponding element appears in the scenario condition, such that for each
-key-value in the template, the same key-value can be found in the element (but
-not always the reverse).
+The Scenario Repository supports querying for scenarios based on a vertex or
+edge in the Entity Graph. Given such a graph element, the Scenario Repository
+will return all scenarios where this element appears in the scenario condition.
+This means that a corresponding element appears in the scenario condition, such
+that for each key-value in the template, the same key-value can be found in the
+element (but not always the reverse).

 Ongoing Operation
 -----------------
@@ -110,7 +110,8 @@ Ongoing Operation
 1. The Scenario Evaluator is notified of an event on some element in the Entity
   Graph.
 2. The Scenario Evaluator queries the Scenario Repository for relevant
-   scenarios for both *before* and *current* state of the element.
+   scenarios for both *before* and *current* states of the element; the
+   repository returns a set of matching scenarios for each.
 3. The two sets of scenarios are analyzed and filtered, resulting in two
   disjoint sets, to avoid "do/undo" conflicts (See above in
   'Concepts and Guidelines'_).
@@ -118,24 +119,24 @@ Ongoing Operation
   in both from both sets.
 4. For each scenario related to the *before* element, the Scenario Evaluator
   queries the Entity Graph for all the matching patterns in the current
-   system. For each match and each associated action, add a reference to the
-   *undo* of the action to an *action collection*.
+   system. For each match and each associated action, a reference to the
+   *undo* of the action is added to an *action collection*.
 5. For each scenario related to the *current* element, the Scenario Evaluator
   queries the Entity Graph for all the matching patterns in the current
-   system. For each match and each associated action, add a reference to the
-   *do* of the action to the same *action collection*.
-6. Given all the actions (do & undo) in the *action collection*, perform them.
-   Using action executor.
+   system. For each match and each associated action, a reference to the
+   *do* of the action is added to the same *action collection*.
+6. Given all the actions (do & undo) in the *action collection*, perform them
+   using the action executor.

-   - Note that in Mitaka, the only action filtering planned is avoiding
-     performing the same action twice.
+   - Currently, the only action filtering is avoiding performing the same
+     action twice.


 System Initialization
 ---------------------

 During the initialization of Vitrage, the Scenario Evaluator will be
-de-activated until all the synchronizers complete their initial "get_all"
+de-activated until all the datasources complete their initial "get_all"
 process. After it is activated, the consistency flow will begin, which will
 trigger all the relevant scenarios for each element in the Entity Graph.

@@ -146,9 +147,10 @@ This approach has several benefits:
  bottlenecks and other performance issues.
 - During the initialization period the Entity Graph is built step-by-step until
  it reflects the current status of the Cloud. Thus, during this interim period
-  scenarios that contain "not" clauses might "fire" because a certain entity is
-  not present in the graph, even though it is present in reality and just has
-  not been processed into the graph (since the "get_all" is not finished).
+  scenarios that contain "not" clauses might be triggered because a certain
+  entity is not present in the graph, even though it is present in reality and
+  just has not been processed into the graph (since the "get_all" is not
+  finished).

 It is possible that this late activation of the evaluator will be removed or
 changed once we move to a persistent graph database for the Entity Graph in
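
Review sketch (not part of the commit): steps 3 to 6 of "Ongoing Operation" in the hunks above can be read roughly as follows. This assumes that step 3 drops scenarios present in both the *before* and *current* sets, and that "the same action" means the same (mode, action, match) triple. All names (`build_action_collection`, `actions_of`, `find_matches`) are hypothetical and not taken from the Vitrage code.

```python
def build_action_collection(before, current, actions_of, find_matches):
    """Collect undo/do actions for one graph event.

    before, current: sets of scenario names returned by the repository.
    actions_of: dict mapping a scenario name to its list of action names.
    find_matches: callable returning the hashable pattern matches that the
        Entity Graph currently holds for a scenario.
    """
    # Step 3 (assumed): scenarios in both sets cancel out, leaving two
    # disjoint sets and avoiding do/undo conflicts.
    shared = before & current
    collection = []
    seen = set()
    # Steps 4 and 5: "undo" for the before-only scenarios, "do" for the
    # current-only ones, all added to the same action collection.
    for mode, scenarios in (("undo", before - shared), ("do", current - shared)):
        for scenario in sorted(scenarios):
            for match in find_matches(scenario):
                for action in actions_of[scenario]:
                    entry = (mode, action, match)
                    # Only filtering applied: never queue the same action twice.
                    if entry not in seen:
                        seen.add(entry)
                        collection.append(entry)
    return collection
```

The returned list is what step 6 would hand to the action executor; execution itself is left out of the sketch.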
--- a/doc/source/static-physical-config.rst
+++ b/doc/source/static-physical-config.rst
@@ -1,57 +1,78 @@
-====================================
-Static Physical Plugin Configuration
-====================================
+========================================
+Static Physical Datasource Configuration
+========================================
+
+Overview
+--------
+
+The Static Physical datasource allows users to integrate the physical topology
+into Vitrage. Physical topology includes switches and their connection to
+other switches and physical hosts.
+
+This datasource is static - pre-configured in a file. This is sufficient in 
+many cases where the physical topology is relatively unchanging.

 Configure Access to Static Physical
 -----------------------------------

-The following should be set in **/etc/vitrage/vitrage.conf**, under [static_physical] section:
+The following should be set in **/etc/vitrage/vitrage.conf**, under 
+[static_physical] section:

-+------------------+---------------------------------------------------------+------------------------------+
-| Name             | Description                                             | Default Value                |
-+==================+=========================================================+==============================+
-| directory        | Directory path from where to load the configurations    | /etc/vitrage/static_plugins/ |
-+------------------+---------------------------------------------------------+------------------------------+
-| changes_interval | Interval of checking changes in the configuration files | 30 seconds                   |
-+------------------+---------------------------------------------------------+------------------------------+
-| entities         | Static physical entity types list                       | switch                       |
-+------------------+---------------------------------------------------------+------------------------------+
++------------------+---------------------------------------------------------+----------------------------------+
+| Name             | Description                                             | Default Value                    |
++==================+=========================================================+==================================+
+| directory        | Directory path from where to load the configurations    | /etc/vitrage/static_datasources/ |
++------------------+---------------------------------------------------------+----------------------------------+
+| changes_interval | Interval of checking changes in the configuration files | 30 seconds                       |
++------------------+---------------------------------------------------------+----------------------------------+
+| entities         | Static physical entity types list                       | switch                           |
++------------------+---------------------------------------------------------+----------------------------------+


 Configure Static Physical Mapping
 ---------------------------------

-Physical configuration is made for configuring statically physical entities, and their relationships to other entities in the topology.
+The physical configuration statically defines physical entities and their
+relationships to other entities in the topology.

-Some physical entities, such as switches, can not be retrieved from OpenStack, so for now we will configure them statically.
+Some physical entities, such as switches, cannot be retrieved from OpenStack,
+and so are defined here.

-There may be more than one configuration file. All files will be read from /etc/vitrage/static_plugins/.
+There may be more than one configuration file. By default, all files are read
+from */etc/vitrage/static_datasources/*. See the previous section on how to
+configure this location.
+
+Format
++++++

-**Format**
 ::


 entities:
  - name: <Physical entity name as appears in configuration>
    id: <Physical entity id as appears in configuration>
-    type: <Physical entity type - must be from constants.EntityType>
-    state: <resource state>
+    type: <Physical entity type - see below for details>
+    state: <default resource state>
    relationships:
-      - type: <Physical entity type it is connected to - must be from constants.EntityType>
-        name: <Physical entity name connected to as appears in configuration>
-        id: <Physical entity id connected to as appears in configuration>
-        relation_type: <Relation name>
-      - type: <Physical entity type it is connected to - must be from constants.EntityType>
-        name: <Physical entity name connected to as appears in configuration>
-        id: <Physical entity id connected to as appears in configuration>
+      - type: <Physical entity type it is connected to - see below for details>
+        name: <Name of physical entity as appears in configuration>
+        id: <Id of physical entity as appears in configuration>
        relation_type: <Relation name>
+      - type: ...

  ...


-**Example**
+Notes:
+- The "type" key must match the name of a type from an existing datasource.
+  Type names appear, for each datasource, in its __init__.py file. For example
+  see */workspace/dev/vitrage/vitrage/datasources/nova/host/__init__.py*

-The following will define a switch that is attached to host-1 and is a backup of switch-2
+Example
+++++++
+
+The following will define a switch that is attached to host-1 and is a backup
+of switch-2

 ::
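
Review sketch (not part of the commit): a small structural check for the "entities" format described above, operating on an already-parsed file (a plain dict). Treating every listed key as mandatory is an assumption, and the `validate_entities` helper is hypothetical; the documentation does not say which keys are optional.

```python
# Keys taken from the Format section; treating all of them as required is an
# assumption made for this sketch.
REQUIRED_ENTITY_KEYS = ("name", "id", "type", "state")
REQUIRED_RELATION_KEYS = ("type", "name", "id", "relation_type")


def validate_entities(config):
    """Return a list of human-readable problems found in a parsed config."""
    errors = []
    for i, entity in enumerate(config.get("entities", [])):
        for key in REQUIRED_ENTITY_KEYS:
            if key not in entity:
                errors.append(f"entities[{i}]: missing '{key}'")
        for j, relation in enumerate(entity.get("relationships", [])):
            for key in REQUIRED_RELATION_KEYS:
                if key not in relation:
                    errors.append(
                        f"entities[{i}].relationships[{j}]: missing '{key}'"
                    )
    return errors
```

An empty result means every entity and relationship carries the keys listed in the format; it does not check that "type" names an existing datasource type.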

--- a/doc/source/vitrage-graph-design.rst
+++ b/doc/source/vitrage-graph-design.rst
@@ -9,70 +9,101 @@ Vitrage Graph Design

 Main Components
 ===============
-**Note:** The gray plugins will not be implemented in Mitaka
+**Note:** The gray datasources will not be implemented in Mitaka

 Graph
 -----
-A library with a graph representation, that is used by Vitrage Graph and by Vitrage Evaluator.
+A library with a graph representation, that is used by Vitrage Graph and by
+Vitrage Evaluator.

-The **Graph Driver** consists of APIs for graph CRUD actions (add/remove vertex, add/remove edge, etc.) and graph algorithms like DFS or sub-graph matching.
-In Mitaka, the graph driver will be implemented over NetworkX. Future versions should support replacing NetworkX with a persistent graph DB such as Titan or Neo4J.
+The **Graph Driver** consists of APIs for graph CRUD operations (add/remove
+vertex, add/remove edge, etc.) and graph algorithms for iteration and pattern
+detection. In Mitaka, the graph driver will be implemented in-memory over
+NetworkX (https://networkx.github.io/). Future versions should support
+replacing NetworkX with a persistent graph DB such as Titan or Neo4J.


-Synchronizer
------------
-Responsible for importing information regarding entities in the cloud. Entities in this context refer both to resources (physical & virtual) and alarms (Aodh, Nagios, Monasca, etc.)
+Datasources
+-----------
+Vitrage can support multiple *datasources* to populate its graph. Each
+datasource is responsible for importing information regarding certain entities
+in the cloud and defining how to integrate their information into the graph.
+Entities in this context refer both to resources (physical & virtual) and
+alarms (Aodh, Nagios, Monasca, etc.)

-The Synchronizer can hold several plugins, each responsible for a different entity type, like Nova instance, Nova host, Nova zone, Nagios alarms, Aodh alarms, etc.
+The datasource is comprised of two components. The *Driver* handles retrieving
+the information and entering it into the entity queue, while the *Transformer*
+defines how to integrate the information retrieved into the graph (for more
+details about the Transformer, see next section).

-The plugin has two modes of action:
-
-- get_all (snapshot): Query all entities and send events to the queue. When done, send an "end" event.
-- notify: Send an event to the queue upon any change.
-
-For more details, see https://github.com/openstack/vitrage-specs/blob/master/specs/mitaka/vitrage-synchronizer.rst
+The datasource *driver* has two modes of operation:

+- *get_all* (snapshot): Pull-based operation. Query all entities for this
+  datasource and send events to the queue. When done, send an "end" event.
+- *notify*: Push-based operation. When a change occurs, send an event to the
+  queue.

 Entity Processor and Transformers
 ---------------------------------
-Responsible for polling events from the entity queue and inserting corresponding vertices to the Graph. For every entity in the queue, the Processor calls the Transformer that is registered on this entity type. The Transformer plugin understands the specific entity details, queries the graph, and outputs a vertex to be inserted to the Graph together with edges that connect it to its neighbors.
-
-Note that for every Synchronizer plugin there should be a matching Transformer plugin.
+The Entity Processor is responsible for polling events from the entity queue
+and inserting corresponding vertices to the Graph. For every entity event in
+the queue, the Processor calls the Transformer that is registered on this
+entity type. For each datasource there is a Transformer, which understands the
+specific entity details. It queries the graph, and after processing outputs a
+vertex to be inserted to the Graph together with edges that connect it to its
+neighbors in the graph.


 Evaluator
 ---------
-The Evaluator is notified on every change in the Graph, and is responsible for executing templates that are related to this change.
+The Evaluator is notified on every change in the Graph, and is responsible for
+executing templates that are related to this change.

 Template Examples:

-- Deduced alarm: In case an alarm is raised indicating a public switch is down, trigger an "instance is at risk" alarm on every instance that is contained by a host attached to this switch.
-- RCA: In case an alarm is raised indicating a public switch is down, and an "instance is at risk" alarm is active on an instance that is contained by a host attached to this switch - determine that the switch alarm is the root cause of the instance alarm, and add a "causes" edge to the Graph from the vertex representing the switch to the vertex representing the instance.
+- Deduced alarm: In case an alarm is raised indicating a public switch is down,
+  trigger an "instance is at risk" alarm on every instance that is contained by
+  a host attached to this switch.
+- Deduced state: In case an alarm is raised indicating a public switch is down,
+  set the state of every instance that is contained by a host attached to this
+  switch to be "suboptimal".
+- Causal relationship: In case an alarm is raised indicating a public switch is
+  down, and an "instance is at risk" alarm is active on an instance that is
+  contained by a host attached to this switch - determine that the switch alarm
+  is the root cause of the instance alarm, and add a "causes" edge to the Graph
+  from the vertex representing the switch to the vertex representing the
+  instance.

 Templates can be added, removed or modified by the user.

-The Evaluator detailed design is still TBD.
-
-
 Consistency
 -----------
-This component is responsible for verifying the Graph's consistency with the "real" situation in the cloud. It is called both during Vitrage startup, as part of the graph initialization, as well as periodically to ensure the graph is correct.
+This component is responsible for verifying the Graph's consistency with the
+actual situation in the cloud. It is called both during Vitrage startup, as
+part of the graph initialization, as well as periodically to ensure the graph
+is correct.

 The consistency component will include:

 - Deleting obsolete vertices
-- Handle the case that Vitrage missed a "delete entity" event, and did not delete the relevant deduced alarms
-- Ensure no entity is missed in the Graph. This can be done by retrieving all entities from all Synchronizer plugins.
+- Handle the case that Vitrage missed a "delete entity" event, and did not
+  delete certain deduced alarms.
+- Ensure no entity is missed in the Graph. This can be done by retrieving all
+  entities from all datasources.

-Note: If an entity is added, its related templates will also be executed as well to create deduced alarms and add RCA information to the graph. This step will be handled differently during graph initialization and during periodic checks.
+**Note:** If an entity is added, its related templates will also be executed
+to create deduced alarms and add RCA information to the graph.


 API Handler
 -----------
-Responsible for transferring Vitrage API calls to the Graph.
+The API handler is responsible for transferring Vitrage API calls to the Graph.


 Notifiers
 ---------
-Are called by the Evaluator, for example, in order to raise a Deduced Alarm. Each notifier is responsible to notify another component, like Aodh or Monasca, about alarm state changes.
+The notifier is responsible for notifying registered external components of
+changes that took place in the graph. For example, external OpenStack projects
+such as Aodh/Monasca could have a notifier to notify them about deduced alarms.
+Each notifier is responsible for notifying a different component.
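
Review sketch (not part of the commit): the Driver / entity queue / Entity Processor / Transformer flow described in vitrage-graph-design.rst can be pictured with a toy in-memory version. Everything here is hypothetical (event layout, function names, the "contains" edge label, the dict-of-sets graph); it only illustrates the division of responsibilities, not Vitrage's actual classes or its NetworkX-based graph driver.

```python
import queue

END_EVENT = {"type": "end"}  # sentinel marking the end of a get_all snapshot


def get_all(entities, entity_queue):
    """Driver, pull mode: enqueue every entity, then the "end" event."""
    for entity in entities:
        entity_queue.put(entity)
    entity_queue.put(END_EVENT)


def host_transformer(event, graph):
    """Return the vertex for a host and no edges."""
    return event["name"], []


def instance_transformer(event, graph):
    """Return the vertex for an instance plus an edge from its host."""
    return event["name"], [(event["host"], "contains", event["name"])]


def process_queue(entity_queue, transformers, graph):
    """Entity Processor: poll events and insert each transformer's output."""
    while True:
        event = entity_queue.get()
        if event is END_EVENT:
            break
        # Call the transformer registered for this entity type.
        vertex, edges = transformers[event["type"]](event, graph)
        graph["vertices"].add(vertex)
        graph["edges"].update(edges)
    return graph
```

A caller would create a `queue.Queue()`, run `get_all` with a list of entity dicts, and pass the queue to `process_queue` together with a mapping such as `{"nova.host": host_transformer, "nova.instance": instance_transformer}` and an empty graph `{"vertices": set(), "edges": set()}`.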