=============================================
Watcher Continuous Optimization
=============================================
https://blueprints.launchpad.net/watcher/+spec/continuously-optimization

Problem description
===================

A `Cluster`_ can be optimized by different `Strategies`_ only when they have
been triggered by an `Administrator`_. Launching a recommended `Action Plan`_
manually is not always suitable, since the state of the cluster is constantly
changing.

It would be better to have two ways of launching an audit: either by
triggering it manually or by launching it periodically.

We propose to include continuous optimization as a continuous type of
audit object in the Watcher project.

The main purpose of this change is to design and implement an active mode of
Watcher's audit.

This specification relates to the blueprint:
https://blueprints.launchpad.net/watcher/+spec/continuously-optimization

Use Cases
---------

As an administrator, I would like to create a periodic audit so that I can
continuously optimize my cloud infrastructure. I can specify a period
with the ``--period`` parameter (in seconds), for example to launch an audit
every 600 seconds.

As an administrator, I would like to be able to remove a continuous audit.

As an administrator, I would like to be able to update the period of an audit.

Project Priority
----------------

Essential for Newton-2

Proposed change
===============

The Watcher system enables a private cloud administrator to launch an `Audit`_
on an OpenStack cluster in order to optimize it with regard to one or several
goals. An `Audit`_ is an optimization request.

There are two types of audits:

- ONESHOT: the audit will only be executed once
- CONTINUOUS: the audit will be executed regularly with a given frequency.

We propose to use the `APScheduler`_ library to schedule the continuous audits.
Note: This library is already in the OpenStack global requirements.

`APScheduler`_ provides several scheduler implementations to schedule jobs with
a specific interval. The scheduler which seems to best match our requirements
is the BackgroundScheduler.
APScheduler provides an example: `BackgroundScheduler`_.

The DecisionEngineManager (watcher/decision_engine/manager.py) class will need
to be amended in order to instantiate the new ContinuousAuditManager class.
The ContinuousAuditManager class will contain the BackgroundScheduler as well
as the logic for managing the continuous audits. We should also create a
ContinuousAuditJob class in charge of supervising one `Audit`_.
This class will contain the `APScheduler`_ job and its associated audit.

We can easily add new audits or remove old ones on the fly with
BackgroundScheduler. So, the existing continuous audits should be
automatically added by the decision_engine at startup.
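
The relationship between the two proposed classes could be sketched roughly
as follows. This is only an illustration under stated assumptions: the method
names ``add_audit``, ``remove_audit`` and ``scheduled_audits`` and the
``execute_audit`` callback are hypothetical, and the scheduler is passed in as
any object exposing ``add_job`` and ``remove_job`` in the style of
APScheduler's BackgroundScheduler.

.. code-block:: python

    from dataclasses import dataclass
    from typing import Any, Callable, Dict, List


    @dataclass
    class ContinuousAuditJob:
        """Pairs one audit with the scheduler job that re-triggers it."""
        audit_uuid: str
        period: int   # seconds between two executions
        job: Any      # handle returned by the scheduler


    class ContinuousAuditManager:
        """Keeps one scheduler job per continuous audit (hypothetical API)."""

        def __init__(self, scheduler, execute_audit: Callable[[str], None]):
            self._scheduler = scheduler
            self._execute_audit = execute_audit
            self._jobs: Dict[str, ContinuousAuditJob] = {}

        def add_audit(self, audit_uuid: str, period: int) -> None:
            if audit_uuid in self._jobs:
                return  # already scheduled, nothing to do
            # Interval trigger: run execute_audit(audit_uuid) every `period` s.
            job = self._scheduler.add_job(
                self._execute_audit, 'interval', seconds=period,
                args=[audit_uuid], id=audit_uuid)
            self._jobs[audit_uuid] = ContinuousAuditJob(audit_uuid, period, job)

        def remove_audit(self, audit_uuid: str) -> None:
            # Only touch the scheduler if the audit was actually scheduled.
            if self._jobs.pop(audit_uuid, None) is not None:
                self._scheduler.remove_job(audit_uuid)

        def scheduled_audits(self) -> List[str]:
            return sorted(self._jobs)

Injecting the scheduler keeps the bookkeeping testable without starting a
real background thread; at startup the decision engine would call
``add_audit`` once for each existing continuous audit.
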

Then, the ContinuousAuditManager will manage the audits in an event-driven
fashion. In order to do that, we should modify the 'post', 'patch'
and 'delete' methods in the `API source file`_ so that they send immediate
notification messages.

The notifications generated by Watcher are in JSON format
and placed on an AMQP queue named ``watcher.status``.
The queue name must be configurable.

The ContinuousAuditManager will consume these events in order to update
the status of the audits.
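
As a rough illustration of how such events could be consumed, the sketch
below applies one notification to an in-memory view of the continuous audits.
It assumes the message layout of the notification examples in this section;
the function name ``apply_notification`` and the shape of the ``audits``
dictionary are hypothetical, and reading from the AMQP queue is left out.

.. code-block:: python

    import json


    def apply_notification(message, audits):
        """Apply one JSON notification to ``audits``.

        ``audits`` maps an audit UUID to its last known state and period.
        """
        event = json.loads(message)
        data = event["payload"]["watcher_object.data"]
        if data["type"] != "CONTINUOUS":
            return  # one-shot audits are not scheduled
        uuid = data["audit_uuid"]
        if event["event_type"] == "audit.delete":
            # Deleting an unknown audit is tolerated.
            audits.pop(uuid, None)
        elif event["event_type"] in ("audit.create", "audit.update"):
            audits[uuid] = {"state": data["state"], "period": data["period"]}

Create and update events are handled the same way here, so a message that is
delivered twice leaves the view unchanged.
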

Immediate Notification Examples

::

    {
      "event_type": "audit.create",
      "timestamp": "2016-03-12 17:01:29.899834",
      "message_id": "1234653e-ce46-4a82-979f-a9286cac5258",
      "priority": "INFO",
      "publisher_id": "<service name>:<the host where the service runs>",
      "payload": {
        "watcher_object.namespace": "watcher",
        "watcher_object.name": "Audit",
        "watcher_object.version": "1.0",
        "watcher_object.data": {
          "audit_uuid": "840eeb3e-3486-11e6-ac61-9e71128cae77",
          "type": "CONTINUOUS",
          "state": "PENDING",
          "period": 3600
        }
      }
    }

::

    {
      "event_type": "audit.update",
      "timestamp": "2016-03-12 17:01:29.899834",
      "message_id": "1234653e-ce46-4a82-979f-a9286cac5258",
      "priority": "INFO",
      "publisher_id": "<service name>:<the host where the service runs>",
      "payload": {
        "watcher_object.namespace": "watcher",
        "watcher_object.name": "Audit",
        "watcher_object.version": "1.0",
        "watcher_object.data": {
          "audit_uuid": "840eeb3e-3486-11e6-ac61-9e71128cae77",
          "type": "CONTINUOUS",
          "state": "ONGOING",
          "period": 3600
        }
      }
    }

::

    {
      "event_type": "audit.delete",
      "timestamp": "2016-03-12 17:01:29.899834",
      "message_id": "1234653e-ce46-4a82-979f-a9286cac5258",
      "priority": "INFO",
      "publisher_id": "<service name>:<the host where the service runs>",
      "payload": {
        "watcher_object.namespace": "watcher",
        "watcher_object.name": "Audit",
        "watcher_object.version": "1.0",
        "watcher_object.data": {
          "audit_uuid": "840eeb3e-3486-11e6-ac61-9e71128cae77",
          "type": "CONTINUOUS",
          "state": "SUCCEEDED",
          "period": 3600
        }
      }
    }

The notification logic isn't yet available in Watcher. We will work on this
with the `watcher-notifications-ovo`_ blueprint.
So, for the first implementation of this spec, we will manage the audits by
periodically querying the Watcher database in order to update the running
audits and their periods.
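
One way to picture this polling step is as a comparison between what the
database currently contains and what is already scheduled. The sketch below
is an assumption about how that comparison could look, not the actual
implementation: ``diff_audits`` is a hypothetical helper, and both arguments
are plain dictionaries mapping an audit UUID to its period in seconds.

.. code-block:: python

    def diff_audits(db_audits, scheduled):
        """Compare database rows with the periods currently scheduled.

        ``db_audits`` maps audit UUID -> period stored in the database and
        ``scheduled`` maps audit UUID -> period of the running job.
        Returns (to_add, to_remove, to_update).
        """
        # Continuous audits present in the database but not yet scheduled.
        to_add = {u: p for u, p in db_audits.items() if u not in scheduled}
        # Jobs whose audit no longer exists in the database.
        to_remove = sorted(u for u in scheduled if u not in db_audits)
        # Audits that are scheduled with an outdated period.
        to_update = {u: p for u, p in db_audits.items()
                     if u in scheduled and scheduled[u] != p}
        return to_add, to_remove, to_update

Each polling cycle would then add, remove or reschedule jobs according to the
three results, so an unchanged database leads to no scheduler calls at all.
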

APScheduler also makes it possible to store jobs in a database.
In this way, the jobs will survive decision engine restarts and maintain
their state. This feature is interesting, but for the first implementation
of the continuous `Audit`_ we will use the memory backend.

To keep track of the triggered audits, a notification has to be pushed on
the message bus every time an audit is re-triggered.

When a new action plan is proposed, all the previously generated action plans
(and actions) with the same Audit Template become obsolete; Watcher should
therefore cancel them by changing their state to CANCELLED.
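
The cancellation rule above can be sketched as follows. This is a simplified
illustration only: action plans are represented as plain dictionaries with
hypothetical ``uuid``, ``audit_template`` and ``state`` keys, and the
cancellation of the individual actions is not shown.

.. code-block:: python

    def cancel_superseded(action_plans, new_plan):
        """Cancel earlier plans sharing the new plan's audit template.

        Returns the UUIDs of the plans whose state was changed.
        """
        cancelled = []
        for plan in action_plans:
            if plan["uuid"] == new_plan["uuid"]:
                continue  # never cancel the plan that was just proposed
            if (plan["audit_template"] == new_plan["audit_template"]
                    and plan["state"] != "CANCELLED"):
                plan["state"] = "CANCELLED"
                cancelled.append(plan["uuid"])
        return cancelled

Plans generated from other audit templates are left untouched, and plans that
are already cancelled are not reported a second time.
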

Alternatives
------------

* Use Congress to automatically trigger audits when some conditions are met.
* Use a cron job which regularly triggers a new audit via
  python-watcherclient.

Data model impact
-----------------

A new integer field, 'period', must be added to the Audit model. It defaults
to 3600 (seconds).

REST API impact
---------------

The 'period' field has to be added as an Audit attribute.

Security impact
---------------

None expected.

Notifications impact
--------------------

None expected.

Other end user impact
---------------------

Support for the 'period' field must be added to python-watcherclient and
to watcher-dashboard.

Performance Impact
------------------

No specific performance impact is expected.

Other deployer impact
---------------------

No specific deployer impact is envisaged.

Developer impact
----------------

This will not impact other developers working on OpenStack.

Implementation
==============

Assignee(s)
-----------

Primary assignee:
  Alexander Chadin <alexchadin>

Other contributors:
  Vladimir Ostroverkhov <Ostroverkhov>
  Jean-Emile DARTOIS <jed56>

Work Items
----------

Part 1
^^^^^^

* Implement ContinuousAuditManager, which uses `APScheduler`_.
* Implement the ContinuousAuditJob class.
* Implement the logic to add new audits or remove old ones on the fly with
  BackgroundScheduler by periodically querying the Watcher database
  (``Audit.list()``).
* Adapt the API to support the period field.
* Update python-watcherclient to add support for the period argument.
* Update watcher-dashboard to support the CONTINUOUS type.
* Implement appropriate unit tests to test various scenarios.

Part 2
^^^^^^

* This part has to wait until `watcher-notifications-ovo`_ is implemented.
* Load the audits registered in the Watcher database during decision engine
  start.
* Implement the logic to add new audits or remove old ones on the fly with
  BackgroundScheduler by subscribing to the events.

Dependencies
============

There is a dependency on the `watcher-notifications-ovo`_ blueprint.

Testing
=======

Appropriate unit tests will be adapted to the new changes.

Documentation Impact
====================

It will be necessary to add new content relating to this change.

References
==========

No references.

History
=======

No history.

.. _APScheduler: https://github.com/agronholm/apscheduler
.. _Strategies: https://factory.b-com.com/www/watcher/doc/watcher/glossary.html#strategy
.. _Administrator: https://factory.b-com.com/www/watcher/doc/watcher/glossary.html#administrator
.. _Audit: https://factory.b-com.com/www/watcher/doc/watcher/glossary.html#audit
.. _Action Plan: https://factory.b-com.com/www/watcher/doc/watcher/glossary.html#action-plan
.. _Cluster: https://factory.b-com.com/www/watcher/doc/watcher/glossary.html#cluster
.. _BackgroundScheduler: https://github.com/agronholm/apscheduler/blob/master/examples/schedulers/background.py
.. _API source file: https://github.com/openstack/watcher/blob/master/watcher/api/controllers/v1/audit.py
.. _watcher-notifications-ovo: https://blueprints.launchpad.net/watcher/+spec/watcher-notifications-ovo