Add 2025.1 spec for prometheus as datasource in watcher
Also remove redundant .gitkeep

Co-Authored-By: Dan Smith <dms@danplanet.com>
Co-Authored-By: Sean Mooney <smooney@redhat.com>
Change-Id: Idc9bf7d218a55b3f1847e5579d55bd2f1acd7d7c
parent e7cc93a76c
commit fe6895c7c2
@@ -35,6 +35,7 @@ Here you can find the specs, and spec template, for each release:
specs/ocata/index
specs/newton/index
specs/mitaka/index
specs/2025.1/index

There are also some approved backlog specifications that are looking for
owners:

1
doc/source/specs/2025.1/approved
Symbolic link
@@ -0,0 +1 @@
../../../../specs/2025.1/approved
18
doc/source/specs/2025.1/index.rst
Normal file
@@ -0,0 +1,18 @@
=============================
Watcher 2025.1 Specifications
=============================

Template:

.. toctree::
   :maxdepth: 1

   Specification Template (2025.1 release) <template>

2025.1 approved (but not implemented) specs:

.. toctree::
   :glob:
   :maxdepth: 1

   approved/*
1
doc/source/specs/2025.1/template.rst
Symbolic link
@@ -0,0 +1 @@
../../../../specs/2025.1-template.rst
197
specs/2025.1/approved/prometheus-datasource.rst
Normal file
@@ -0,0 +1,197 @@
..
 This work is licensed under a Creative Commons Attribution 3.0 Unported
 License.

 http://creativecommons.org/licenses/by/3.0/legalcode

==========================================
Add Prometheus as a Watcher Data Source
==========================================

Launchpad blueprint: https://blueprints.launchpad.net/watcher/+spec/example

Watcher currently supports a small number of data sources for collecting
metrics: Ceilometer, Gnocchi and Grafana. Prometheus is a widely adopted
time-series metric collection system that can collect any type of custom
metric an operator may be interested in for their cloud VMs or containers.

Besides its use in OpenStack deployments, Prometheus is considered Kubernetes
'native', as both are CNCF projects and Prometheus is included as part of
Kubernetes distributions.

Adding the ability for Watcher to interact with a Prometheus data source
will increase the potential user base for Watcher, especially among
operators who are familiar with or already using Prometheus.

Problem description
===================

Watcher currently supports a small number of data sources for collecting
metrics: Ceilometer, Gnocchi and Grafana. Some of these are no longer actively
developed and integrated with OpenStack distributions, limiting the ability
to deploy Watcher at all.

As Prometheus becomes the de facto standard metrics store in the Kubernetes
ecosystem and OpenStack is increasingly deployed on Kubernetes, Watcher's
inability to consume metrics from Prometheus limits the project's reach.

Use Cases
----------

By coupling the efficient and highly customizable Prometheus collector with
Watcher, operators can build a powerful optimization solution for their
OpenStack deployments. There is currently no way to use Prometheus as a data
source for Watcher.

As an operator with existing knowledge of Prometheus, I would like to
leverage the power of Watcher as an optimization engine by using Prometheus
as its data source.

As an operator with existing Kubernetes infrastructure, I would like to reuse
the same metrics storage solution across my OpenStack and Kubernetes
deployments.

As a developer of Watcher, I want to allow it to be deployed in more OpenStack
clouds, leveraging popular open-source tools to increase the project's reach
and adoption.

Proposed change
===============

A new Prometheus module will be added to watcher.decision_engine.datasources.
It will use python-observabilityclient
(https://opendev.org/openstack/python-observabilityclient), which AODH already
uses to retrieve metrics from Prometheus:
https://github.com/openstack/aodh/commit/f932265290a4e923eac6111eb28578489c7dce33

In this first implementation, we do not expect to extend the DataSource
METRIC_MAP beyond the existing set (host and instance CPU, RAM, etc.).
Extending it could be considered as future work, depending on the success of
this proposal.
The new Prometheus client will provide a default set of mappings that
normalise Prometheus metric names and units to the values already supported
by the other data sources, so that a subset of strategies and goals can
function.
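
As a rough illustration of what such default mappings could look like, the
sketch below pairs Watcher metric names with PromQL templates and scales the
raw values to a common unit. Only the Watcher metric names come from this
spec. The PromQL expressions, the node_exporter style series names, the MiB
target unit and the names PROMETHEUS_METRIC_MAP, UNIT_SCALE, build_query and
normalise are all assumptions made for the example, not the proposed
implementation.

```python
# Hypothetical default mapping from Watcher metric names to PromQL templates.
# The series names assume a node_exporter style deployment (an assumption).
PROMETHEUS_METRIC_MAP = {
    # Percentage of CPU time not spent idle, averaged over all cores.
    "host_cpu_usage": (
        '100 - (avg(rate(node_cpu_seconds_total'
        '{{mode="idle",instance="{instance}"}}[{period}s])) * 100)'
    ),
    # Used memory in bytes; normalise() converts it to MiB.
    "host_ram_usage": (
        'node_memory_MemTotal_bytes{{instance="{instance}"}}'
        ' - node_memory_MemAvailable_bytes{{instance="{instance}"}}'
    ),
}

# Factor applied to raw Prometheus values to reach the assumed target unit.
UNIT_SCALE = {
    "host_cpu_usage": 1.0,                   # already a percentage
    "host_ram_usage": 1.0 / (1024 * 1024),   # bytes -> MiB
}


def build_query(meter_name, instance, period=300):
    """Render the PromQL expression for a Watcher metric name."""
    try:
        template = PROMETHEUS_METRIC_MAP[meter_name]
    except KeyError:
        raise ValueError("unsupported metric: %s" % meter_name) from None
    return template.format(instance=instance, period=period)


def normalise(meter_name, raw_value):
    """Convert a raw Prometheus sample (a string) to the target unit."""
    return float(raw_value) * UNIT_SCALE[meter_name]
```

Keeping the mapping in a plain dictionary would let deployments with
different exporters override individual entries without touching the
strategies that consume the normalised values.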

This initial work will not use Prometheus alerts to trigger audits; that use
case will instead build on AODH's existing integration.

Alternatives
------------

It is not currently possible to use Prometheus as a metrics collector. The
alternative is to use one of the currently supported data sources, which
restricts the potential user base for Watcher.

Data model impact
-----------------

There are no expected changes to the data model as part of this proposal.
Given the extensibility of Prometheus as a collector, future work could
propose extending the Watcher metrics beyond the current set (host and
instance CPU or RAM usage, temperature, etc.). However, that is outside the
scope of this proposal.

REST API impact
---------------

This proposal is not expected to impact the REST API.

Security impact
---------------

None expected.

Notifications impact
--------------------

None expected.

Other end user impact
---------------------

None expected.

Performance Impact
------------------

Using a Prometheus data source is not expected to affect performance compared
to any of the currently supported sources.

Other deployer impact
---------------------

No impact is anticipated beyond the ability to integrate with a new data
source. Deployers will have to provide the configuration values that the
integration requires, such as Prometheus authentication credentials.

A new optional dependency on python-observabilityclient will be introduced,
which may require changes to packaging and installers.
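
Because the dependency is optional, packaging and the data source itself
would presumably need a way to tell whether the library is installed. A
minimal sketch of such a check follows. It assumes the package is importable
as observabilityclient, and the helper name is hypothetical.

```python
import importlib.util


def optional_dependency_available(module_name="observabilityclient"):
    """Return True if the optional top-level module can be imported.

    The default module name is an assumption about how
    python-observabilityclient is installed.
    """
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if not optional_dependency_available():
        # Without the library, the Prometheus data source would stay disabled.
        print("Prometheus data source unavailable: library not installed")
```

A check of this kind would let installations without the library keep
working with the existing data sources.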

Developer impact
----------------

The Watcher devstack plugin will be extended to allow developers to use
Prometheus instead of the default Gnocchi/Ceilometer collectors.

Implementation
==============

Assignee(s)
-----------

Sean Mooney, Marios Andreou

Reviewers
-----------

Dan Smith

Work Items
----------

We will need:

* A new prometheus.py subclass of base.DataSourceBase in
  `datasources <https://github.com/openstack/watcher/tree/master/watcher/decision_engine/datasources>`_.
* A prometheus_client.py under
  `conf <https://github.com/openstack/watcher/tree/master/watcher/conf>`_
  to handle authentication and transport of metrics from the Prometheus
  instance.
* An extension of the Zuul CI testing for the Prometheus integration, that
  is, a new devstack job similar to the existing
  `watcher-tempest-strategies <https://zuul.opendev.org/t/openstack/builds?job_name=watcher-tempest-strategies&project=openstack/watcher>`_
  job, to enable Watcher with a Prometheus collector.
* An extension of the Watcher devstack plugin to support deployment with
  Prometheus instead of the default Gnocchi/Ceilometer.
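
To make the first work item more concrete, here is a hedged sketch of the
shape such a subclass might take. The DataSourceBase class below is a
stand-in defined only for this example and is not Watcher's real interface.
The PrometheusHelper name, the injected query_func transport and the PromQL
expression are likewise assumptions. The response layout follows the
Prometheus HTTP API instant-query format.

```python
import abc


class DataSourceBase(abc.ABC):
    """Stand-in base class for this example; NOT Watcher's real interface."""

    @abc.abstractmethod
    def get_host_cpu_usage(self, host, period=300):
        """Return host CPU usage as a percentage, or None if unknown."""


class PrometheusHelper(DataSourceBase):
    """Hypothetical Prometheus data source with an injectable transport."""

    NAME = "prometheus"

    def __init__(self, query_func):
        # query_func(promql) must return the decoded JSON body of a
        # Prometheus instant query; injecting it keeps the class testable
        # without a running Prometheus server.
        self._query = query_func

    def get_host_cpu_usage(self, host, period=300):
        # The PromQL below assumes node_exporter series names.
        promql = (
            '100 - (avg(rate(node_cpu_seconds_total'
            '{mode="idle",instance="%s"}[%ds])) * 100)' % (host, period)
        )
        response = self._query(promql)
        if response.get("status") != "success":
            return None
        result = response["data"]["result"]
        if not result:
            return None
        # Each instant-vector sample is a [timestamp, "value"] pair.
        return float(result[0]["value"][1])
```

In a real implementation the transport would be provided by
python-observabilityclient rather than by a callable passed to the
constructor.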

Dependencies
============

The proposal requires that the OpenStack deployment monitored by the
Prometheus instance used as a data source has the appropriate exporters
deployed, that is, the actual collection functions and API endpoints, so that
their metrics can be mapped to the expected Watcher metrics (host_cpu_usage,
host_ram_usage, instance_cpu_usage, etc.).
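
One way to reason about this dependency is to compare the series a Prometheus
instance exposes with the series each Watcher metric would be derived from.
The sketch below does that for a list of metric names, such as the one
returned in the data field of the Prometheus label-values endpoint for
__name__. The REQUIRED_SERIES table and its exporter series names are
illustrative assumptions, not part of the proposal.

```python
# Hypothetical: Prometheus series each Watcher metric would be derived from.
# The exporter series names are illustrative assumptions.
REQUIRED_SERIES = {
    "host_cpu_usage": {"node_cpu_seconds_total"},
    "host_ram_usage": {
        "node_memory_MemTotal_bytes",
        "node_memory_MemAvailable_bytes",
    },
    "instance_cpu_usage": {"libvirt_domain_info_cpu_time_seconds_total"},
}


def missing_series(available_names, required=REQUIRED_SERIES):
    """Map each Watcher metric to the series Prometheus does not expose.

    Metrics whose required series are all present are left out of the
    returned dictionary, so an empty result means full coverage.
    """
    available = set(available_names)
    gaps = {}
    for watcher_metric, series in required.items():
        absent = sorted(series - available)
        if absent:
            gaps[watcher_metric] = absent
    return gaps
```

For instance, if only the CPU series and node_memory_MemTotal_bytes are
exposed, the function reports host_ram_usage and instance_cpu_usage as
lacking their remaining series, which points at the exporters still to be
deployed.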

Testing
=======

As mentioned under Work Items, this work will also add a new CI job against
the Watcher code repository. Beyond verifying the integration point (e.g.
that communication with Prometheus works and that metrics are received and
processed correctly), this should ideally include functional testing that
executes strategies, similar to the existing watcher-tempest-strategies job.

Documentation Impact
====================

We will need to extend the documentation to cover setup considerations, for
example setting up the appropriate exporters on the Prometheus side and best
practices around authentication and certificates.

References
==========

This proposal was first mentioned by S Mooney during the
`October 2024 Watcher PTG session <https://etherpad.opendev.org/p/oct2024-ptg-watcher>`_.