documentation fixes; first steps guide
Change-Id: I8b6bcd32f9c8eac5f74a1c35fb28ee1f94e7e9d4
parent 05aa2e8617
commit e7c97e0075
@@ -53,6 +53,8 @@ alarm is normalized. There are several guidelines for creating a config file:
 - Defining a config file for each datasource is recommended, but not mandatory.
   Datasources with no such configuration will use the values as-is.

+Once the file is modified, you must restart the **vitrage-graph** service to load
+the changes.

 Default Configuration
 +++++++++++++++++++++
@@ -21,7 +21,7 @@ Installation

    wget -q "https://labs.consol.de/repo/stable/RPM-GPG-KEY" -O - | sudo apt-key add -

-2. Update your repo with the OMD site. For example, for ubuntu wheezy release:
+2. Update your repo with the OMD site. For example, for ubuntu trusty release:
 ::

    sudo bash -c "echo 'deb http://labs.consol.de/repo/stable/ubuntu trusty main' >> /etc/apt/sources.list"
@@ -53,6 +53,8 @@ resource is normalized. Some guidelines for creating a config file:
 - Defining a config file for each datasource is recommended, but not mandatory.
   Datasources with no such configuration will use the values as-is.

+Once the file is modified, you must restart the **vitrage-graph** service to load
+the changes.

 Default Configuration
 +++++++++++++++++++++
@@ -39,7 +39,7 @@ Some physical entities, such as switches, can not be retrieved from OpenStack,
 and so are defined here.

 There may be more than one configuration file. All files will be read from
-*/etc/vitrage/static_plugins/*. See previous section on how to configure this
+*/etc/vitrage/static_datasources/*. See previous section on how to configure this
 location.

 Format
@@ -79,15 +79,15 @@ of switch-2
 entities:
  - type: switch
    name: switch-1
-   id: 11111
+   id: switch-1 # should be same as name
    state: available
    relationships:
      - type: nova.host
        name: host-1
-       id: 22222
+       id: host-1 # should be same as name
        relation_type: attached
      - type: switch
        name: switch-2
-       id: 33333
+       id: switch-2 # should be same as name
        relation_type: backup

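The "should be same as name" rule introduced in the hunk above can be checked mechanically. The following is a minimal sketch, not Vitrage code: the dictionary mirrors the YAML by hand, and the helper name ``find_id_mismatches`` is hypothetical.

```python
# Hand-written mirror of the static datasource YAML shown above (assumption:
# the file parses into this shape; Vitrage's own loader is not used here).
static_config = {
    "entities": [
        {
            "type": "switch",
            "name": "switch-1",
            "id": "switch-1",
            "state": "available",
            "relationships": [
                {"type": "nova.host", "name": "host-1", "id": "host-1",
                 "relation_type": "attached"},
                {"type": "switch", "name": "switch-2", "id": "switch-2",
                 "relation_type": "backup"},
            ],
        }
    ]
}


def find_id_mismatches(config):
    """Return (type, name, id) for every entry whose id differs from its name."""
    mismatches = []
    for entity in config.get("entities", []):
        # Check the entity itself and each of its relationship targets.
        for item in [entity] + entity.get("relationships", []):
            if str(item.get("id")) != str(item.get("name")):
                mismatches.append(
                    (item.get("type"), item.get("name"), item.get("id")))
    return mismatches


print(find_id_mismatches(static_config))
```

Run against the old numeric ids (``11111``, ``22222``, ``33333``), the same helper would report all three entries.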
113
doc/source/vitrage-first_steps.rst
Normal file
@@ -0,0 +1,113 @@
+===============================
+Vitrage - Getting Started Guide
+===============================
+
+This document explains how to get started using Vitrage. Here you will find
+easy-to-follow instructions on how to install & configure Vitrage to suit
+your needs, try out its different functions, and expand its capabilities.
+
+Before you start
+================
+
+Installation
+============
+- `Enable Vitrage in devstack <https://github.com/openstack/vitrage/blob/master/devstack/README.rst/>`_
+- `Enable Vitrage in horizon <https://github.com/openstack/vitrage-dashboard/blob/master/README.rst/>`_
+- Run ``./stack.sh``
+
+
+Nagios Installation & Configuration
+===================================
+Nagios_ is a widely-used tool for monitoring hardware and software systems.
+It periodically runs tests on the entities it monitors, and sets the state
+of these tests to OK (pass) or different levels of severity.
+
+Vitrage comes with Nagios as a datasource. The examples given below use Nagios
+as the trigger for deduced alarms, states and RCA templates in Vitrage.
+
+.. _Nagios: https://www.nagios.org/
+
+- `Install Nagios on your devstack <https://github.com/openstack/vitrage/blob/master/doc/source/nagios-devstack-installation.rst/>`_
+- `Configure Nagios datasource <https://github.com/openstack/vitrage/blob/master/doc/source/nagios-config.rst>`_
+
+
+Vitrage in action
+=================
+
+In order to see Vitrage in action, it comes prepackaged with a sample template
+that demonstrates its functionality. This can be found (with default config) at
+*/etc/vitrage/templates*.
+
+In the example shown here, we will cause Nagios to report high memory usage on
+the devstack host. As a result and as defined in our sample template, Vitrage
+will change the state of the hosted instances to "suboptimal", raise an alarm
+on each and indicate that the host-level alarm is the cause for the instance
+alarms.
+
+Setting up
+----------
+- Deploy several (3-5) instances on your devstack. Make sure that they are
+  in state "Running" before continuing.
+- In your browser, go to the Nagios site you defined. If you used the
+  steps defined above:
+  - URL: *http://<IP>:54321/my_site/omd/*.
+  - Select "Classic Nagios GUI" (other views are ok as well, the instructions
+    below on raising alarms are for this view)
+  - User/Password: omdadmin/omd
+- Set the "Memory Used" test to "Warning":
+  - Click on *Services --> Memory Used*
+  - On the right pane, select "Submit passive check result for this service"
+  - For the "Check result" enter "Warning", and for "Check Output" enter
+    "High memory usage". Click *commit*, then *Done*.
+  - On the right pane, select "Stop accepting passive checks for this service"
+    and then *Done*.
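The GUI steps above submit a passive check result. For readers who prefer scripting, Nagios also accepts the same information as a ``PROCESS_SERVICE_CHECK_RESULT`` external command line. The sketch below only builds that line. The host name ``my-devstack-host`` is a placeholder, and writing the line to the Nagios command file is left out because its path depends on the installation.

```python
import time

# Standard Nagios plugin return codes.
RETURN_CODES = {"OK": 0, "WARNING": 1, "CRITICAL": 2, "UNKNOWN": 3}


def passive_check_command(host, service, state, output, timestamp=None):
    """Format a PROCESS_SERVICE_CHECK_RESULT external command line."""
    ts = int(time.time()) if timestamp is None else int(timestamp)
    return "[{}] PROCESS_SERVICE_CHECK_RESULT;{};{};{};{}".format(
        ts, host, service, RETURN_CODES[state], output)


# "my-devstack-host" is a hypothetical host name; use the one shown in Nagios.
print(passive_check_command(
    "my-devstack-host", "Memory Used", "WARNING", "High memory usage"))
```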
+
+With the alarm on the host now activated, let's see how this is expressed in
+Vitrage.
+
+
+Deduced State
+-------------
+
+- In the Horizon UI, select *Vitrage --> Topology*
+- The UI will now show the Sunburst view of the compute hierarchy. The color
+  of each resource reflects its state: green (ok), yellow (warning), red
+  (critical).
+
+A list of alarms will appear in the UI, showing an alarm on the host, as well
+as one alarm per instance.
+
+
+Deduced Alarm
+-------------
+
+- In the Horizon UI, select *Vitrage --> Alarms*
+- A list of alarms will appear in the UI, showing an alarm on the host, as well
+  as one alarm per instance.
+
+
+Root Cause Analysis
+-------------------
+- In the Horizon UI, select *Vitrage --> Alarms*
+- Select a host alarm, and click on the RCA icon on the far right-hand side of
+  the screen. This will show how the host alarm caused the instance alarms.
+
+Advanced Usage
+==============
+
+Modify states & severities
+--------------------------
+Since each data-source might represent a resource state or alarm severity
+differently, for each data-source you can define its own mapping to the
+*normalized* states/severities supported in Vitrage. This will impact UI and
+templates behavior that depends on these fields.
+
+- `Resource state configuration <https://github.com/openstack/vitrage/blob/master/doc/source/resource-state-config.rst/>`_
+- `Alarm severity configuration <https://github.com/openstack/vitrage/blob/master/doc/source/alarm-state-config.rst/>`_
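To make the idea of a per-datasource mapping concrete, here is a rough sketch of the lookup. Everything in it is an assumption for illustration: the datasource name ``datasource_a``, the raw and normalized value names, and the choice to pass unmapped values through unchanged. The real mapping format is described in the two configuration documents linked above.

```python
# Hypothetical mappings from raw datasource values to normalized values.
SEVERITY_MAPPINGS = {
    "datasource_a": {"HIGH": "CRITICAL", "LOW": "WARNING"},
}


def normalize(datasource, value, mappings=SEVERITY_MAPPINGS):
    """Map a raw value to its normalized form.

    A datasource with no mapping uses the value as-is. Passing unmapped
    values through unchanged is an assumption of this sketch.
    """
    mapping = mappings.get(datasource)
    if mapping is None:
        return value
    return mapping.get(value, value)


print(normalize("datasource_a", "HIGH"))
print(normalize("datasource_b", "HIGH"))
```

The first call uses the mapping for ``datasource_a``. The second has no mapping for its datasource and returns the raw value.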
+
+Writing your own templates
+--------------------------
+For more information regarding Vitrage templates, their format and how to add
+them, see here_.
+
+.. _here: https://github.com/openstack/vitrage/blob/master/doc/source/vitrage-template-format.rst
@@ -9,10 +9,14 @@ Add Nova Instance
    :align: center


-#. Nova Synchronizer plugin queries all Nova instances, or gets a message bus notification about a new Nova instance
-#. Nova Synchronizer plugin sends corresponding events to the Entity Queue
-#. The Entity Processor polls the Entity Queue and gets the new Nova Instance event
-#. The Entity Processor passes the event to the Nova Instance Transformer plugin, which returns a Vertex with the instance data, and an edge to the host Vertex in the graph
+#. Nova datasource Driver queries all Nova instances, or gets a message bus
+   notification about a new Nova instance
+#. Nova datasource Driver sends corresponding events to the Entity Queue
+#. The Entity Processor polls the Entity Queue and gets the new Nova Instance
+   event
+#. The Entity Processor passes the event to the Nova Instance Transformer,
+   which returns a Vertex with the instance data, with an edge to the host
+   Vertex in the graph
 #. The Entity Processor adds the new vertex and edge to the Graph

 .. image:: ./images/add_nova_instance_graph.png
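The "Add Nova Instance" flow in the hunk above (driver, Entity Queue, Entity Processor, transformer, graph) can be sketched with the standard library alone. This is an illustration, not Vitrage code: the class and function names, the event fields, and the ``contains`` edge label are all hypothetical.

```python
import queue


class Graph:
    """Tiny in-memory stand-in for the entity graph."""

    def __init__(self):
        self.vertices = {}
        self.edges = set()

    def add_vertex(self, vertex_id, properties):
        self.vertices[vertex_id] = properties

    def add_edge(self, source_id, target_id, label):
        self.edges.add((source_id, target_id, label))


def transform_nova_instance(event):
    """Turn a raw event into a vertex plus an edge from its host."""
    vertex_id = "nova.instance:" + event["id"]
    vertex = (vertex_id, {"type": "nova.instance", "name": event["name"]})
    edge = ("nova.host:" + event["host"], vertex_id, "contains")
    return vertex, edge


def process_entity_queue(entity_queue, graph):
    """Drain the queue, passing each event through the transformer."""
    while True:
        try:
            event = entity_queue.get_nowait()
        except queue.Empty:
            return
        (vertex_id, properties), (source_id, target_id, label) = (
            transform_nova_instance(event))
        graph.add_vertex(vertex_id, properties)
        graph.add_edge(source_id, target_id, label)


# The driver step: one event for a new instance is placed on the queue.
entity_queue = queue.Queue()
entity_queue.put({"id": "vm-1", "name": "instance-1", "host": "host-1"})

graph = Graph()
process_entity_queue(entity_queue, graph)
print(sorted(graph.vertices))
print(sorted(graph.edges))
```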
@@ -27,10 +31,12 @@ Add Aodh Alarm
    :align: center


-#. Aodh Synchronizer plugin queries all Aodh alarms, or gets a notification (TBD) about an Aodh alarm state change
-#. Aodh Synchronizer plugin sends corresponding events to the Entity Queue
-#. The Entity Processor polls the Entity Queue and gets the Aodh Alarm event, for example threshold alarm on Instance1 CPU
-#. The Entity Processor passes the event to the Aodh Alarm Transformer plugin, which returns a Vertex with the alarm data, and an edge to the instance Vertex
+#. Aodh Driver queries all Aodh alarms
+#. Aodh Driver sends corresponding events to the Entity Queue
+#. The Entity Processor polls the Entity Queue and gets the Aodh Alarm event,
+   for example threshold alarm on Instance-1 CPU
+#. The Entity Processor passes the event to the Aodh Alarm Transformer, which
+   returns a Vertex with the alarm data, with an edge to the instance Vertex
 #. The Entity Processor adds the new vertex and edge to the Graph

 .. image:: ./images/add_aodh_alarm_graph.png
@@ -45,12 +51,18 @@ Nagios Alarm Causes Deduced Alarm
    :align: center


-5. (steps 1-5) Nagios Synchronizer plugin pushes a nagios alarm on a switch to the Entity Queue, which is converted by Nagios Transformer to a vertex and inserted to the Graph
-6. The Evaluator is notified about a new Vertex (Nagios switch alarm) that was added to the graph
-7. The Evaluator performs its calculations (TBD) and deduces that alarms should be triggered on every instance on every host attached to this switch
+5. (steps 1-4) Nagios datasource driver pushes a nagios alarm on a switch to
+   the Entity Queue, which is converted by Nagios Transformer to a vertex and
+   inserted to the Graph
+6. The Evaluator is notified about a new Vertex (Nagios switch alarm) that was
+   added to the graph
+7. The Evaluator performs its calculations and deduces that alarms should be
+   triggered on every instance on every host attached to this switch
 8. The Evaluator pushes alarms to the Entity Queue
-9. The Evaluator asks the notifier to notify on these new alarms
-10. Aodh Notifier creates new alarm definitions in Aodh, and sets their states to "alarm"
+9. The graph is updated with these new alarms
+10. The graph writes to the message bus that new alarms were created
+11. Aodh Notifier creates new alarm definitions in Aodh, and sets their states
+    to "alarm"

 .. image:: ./images/nagios_causes_deduced_graph.png
    :width: 100%
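Step 7 in the hunk above, deducing an alarm for every instance on every host attached to the alarmed switch, amounts to a two-level traversal. The sketch below uses plain dictionaries in place of the graph. The function name, the topology, and the alarm fields are hypothetical, and the real Evaluator works from templates rather than hard-coded logic.

```python
def deduce_instance_alarms(switch, hosts_by_switch, instances_by_host):
    """Return one deduced alarm per instance on each host attached to switch."""
    alarms = []
    for host in hosts_by_switch.get(switch, []):
        for instance in instances_by_host.get(host, []):
            alarms.append({"resource": instance, "deduced_from": switch})
    return alarms


# Hypothetical topology: one switch, two attached hosts, three instances.
hosts_by_switch = {"switch-1": ["host-1", "host-2"]}
instances_by_host = {
    "host-1": ["instance-1", "instance-2"],
    "host-2": ["instance-3"],
}

for alarm in deduce_instance_alarms(
        "switch-1", hosts_by_switch, instances_by_host):
    print(alarm)
```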
@@ -64,15 +76,18 @@ Create RCA Insights
    :align: center


-#. The Evaluator is notified of a new alarm.
-#. The Evaluator evaluates the templates and the Graph (TBD), and decides that there is a root cause relation between two alarms. It adds a "causes" edge to the Graph
+#. The Evaluator is notified of a new alarm *Alarm-X*.
+#. The Evaluator evaluates the templates and the Graph, and decides that there
+   is a root cause relation between *Alarm-X* and *Alarm-Y*. It adds a "causes"
+   edge to the Graph

 .. image:: ./images/rca_graph.png
    :width: 100%
    :align: center


-Note that in future versions the graph with RCA information may become more complex, for example:
+Note that in future versions the graph with RCA information may become more
+complex, for example:

 .. image:: ./images/complex_rca_graph.png
    :width: 100%