Added specification for ZHAW load consolidation
This specification focuses on integration of the rudimentary load
consolidation mechanism developed at ZHAW with Watcher.

Modifications to take into account feedback from acabot, jwcroppe.

Further modifications to pass tox tests and also taking into account
further feedback from review.

Implements: blueprint basic-cloud-consolidation-integration
Change-Id: Icb4fe4f70ed11229d2f50112b4920a187b239c52
parent d12ee48f54
commit 9143ca37da
229
specs/mitaka/approved/zhaw-load-consolidation.rst
Normal file
@@ -0,0 +1,229 @@
..
 This work is licensed under a Creative Commons Attribution 3.0 Unported
 License.

 http://creativecommons.org/licenses/by/3.0/legalcode

===========================
Load consolidation strategy
===========================

This specification relates to blueprint:
https://blueprints.launchpad.net/watcher/+spec/basic-cloud-consolidation-integration

Problem description
===================

Watcher is a framework that supports more energy-efficient OpenStack
operations. It does this by providing access to system state information and
a set of actions that can be performed on an OpenStack installation. It is
specifically designed to support different approaches to energy-efficient
operation; consequently, interested parties are encouraged to provide their
own energy efficiency approaches and integrate them with Watcher. This
specification focuses on integrating with Watcher the rudimentary load
consolidation mechanism developed at the ICCLab cloud computing research lab
at Zürcher Hochschule für Angewandte Wissenschaften (ZHAW).

The original code implementing this algorithm (outside the context of
Watcher) was published here:
https://github.com/icclab/cloud-consolidation

Use Cases
---------

The use case is one in which the `Administrator`_ wants to consolidate load
on the `resources`_ to reduce the number of underutilized servers. The
Administrator invokes Watcher with the `Goal`_
“VM_WORKLOAD_CONSOLIDATION”. Watcher then executes the `Strategy`_
“VM_WORKLOAD_CONSOLIDATION_STRATEGY” and presents a set of `Actions`_ to the
Administrator. The Administrator approves the recommended `action plan`_,
which typically consists of VM live-migration actions, and instructs Watcher
to perform the actions.

Project Priority
----------------

Not relevant, because Watcher is not yet in the big tent.

Proposed change
===============

The proposed change is to add a new Goal and a new Strategy to Watcher.
The new Goal is “VM_WORKLOAD_CONSOLIDATION” and the new Strategy is
“VM_WORKLOAD_CONSOLIDATION_STRATEGY”. The new Strategy is designed to be a
lightweight consolidation mechanism that runs quickly and can be tuned based
on experience. Its purpose is to move the aggregate operating point of the
`Cluster`_ so that more servers run at moderate to high load and as few
servers as possible run at low load. This can be used in conjunction with a
server management mechanism to reduce overall energy consumption.

The new Strategy will use a modified first-fit algorithm to increase server
CPU and memory utilization, which ultimately frees some hosts so that they
can be powered down to save energy. It comprises two phases: one identifies
servers with high load and reduces their load, and the other identifies
servers with spare, underutilized capacity. Each phase operates as a
first-fit algorithm and takes utilization ordered in a different way as
input.

This Strategy will consider compute hosts' CPU utilization and memory
constraints. Upper utilization thresholds for these resources can be set
relative to resource capacity and hence provide simple resource overbooking
management if needed. The Strategy will not deal with any other limitations,
such as actual VM memory change rate or network constraints, and it relies
upon a robust live migration mechanism.
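
As a rough illustration of thresholds set relative to capacity, the sketch
below scales a host's physical capacity by a ratio: a ratio below 1.0 keeps
headroom, while a ratio above 1.0 overbooks the resource. The
``HostCapacity`` type, the ``effective_limits`` helper and the ratio
parameters are hypothetical names chosen for this example; they are not part
of Watcher.

```python
from dataclasses import dataclass


@dataclass
class HostCapacity:
    """Physical capacity of a compute host (hypothetical shape)."""
    cpu_cores: float
    memory_mb: float


def effective_limits(capacity, cpu_ratio=1.0, memory_ratio=1.0):
    """Return the (cpu, memory) load a host may be filled up to.

    A ratio above 1.0 overbooks the resource; below 1.0 keeps headroom.
    """
    return (capacity.cpu_cores * cpu_ratio,
            capacity.memory_mb * memory_ratio)


host = HostCapacity(cpu_cores=16, memory_mb=65536)
# Overbook CPU by 50% while keeping memory at its physical size.
print(effective_limits(host, cpu_ratio=1.5, memory_ratio=1.0))
# (24.0, 65536.0)
```

With these assumed values, the strategy would treat the host as able to
carry 24 cores' worth of CPU load but no more memory than is physically
installed.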

To predict host resource utilization, the following utilization estimation
model is used. A host's resource utilization equals the sum of the resource
utilizations of the workloads (VMs) it hosts. Considering hosts H1 and H2
with a workload W running on H1, moving W from H1 to H2 results in the
following predicted resource utilizations: H1 = H1 - W and H2 = H2 + W. The
metrics relating to the VM are taken from telemetry, and those pertaining to
the host are available via nova metrics.
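
The estimation model can be sketched in a few lines. This is a minimal
illustration, assuming utilization is represented as a dictionary mapping a
resource name to a number; the ``Usage`` alias and the function names are
hypothetical, and the sample values are invented rather than taken from
telemetry.

```python
from typing import Dict, Iterable, Tuple

Usage = Dict[str, float]  # hypothetical shape: resource name -> utilization


def host_usage(vm_usages: Iterable[Usage]) -> Usage:
    """A host's utilization is the sum of the utilizations of its VMs."""
    total: Usage = {}
    for usage in vm_usages:
        for resource, value in usage.items():
            total[resource] = total.get(resource, 0) + value
    return total


def predict_migration(h1: Usage, h2: Usage, w: Usage) -> Tuple[Usage, Usage]:
    """Predict (H1 - W, H2 + W) after moving workload W from H1 to H2."""
    new_h1 = {r: h1[r] - w.get(r, 0) for r in h1}
    new_h2 = {r: h2.get(r, 0) + w.get(r, 0) for r in {**h2, **w}}
    return new_h1, new_h2


w = {"cpu": 2, "memory": 2048}
h1 = host_usage([w, {"cpu": 1, "memory": 1024}])
h2 = host_usage([{"cpu": 4, "memory": 4096}])
new_h1, new_h2 = predict_migration(h1, h2, w)
print(new_h1)  # {'cpu': 1, 'memory': 1024}
print(new_h2)  # {'cpu': 6, 'memory': 6144}
```

Because the prediction is purely additive, a candidate migration can be
evaluated without touching the cluster.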

The strategy will work in two phases.

The first phase takes the hosts sorted by decreasing CPU utilization and
handles those whose CPU utilization exceeds the defined threshold. It
offloads a workload (VM) from such a host to the first suitable less loaded
host that can accommodate the workload without violating any of the
constraints described above. Offloading is repeated until the host’s CPU
utilization is predicted to be under the threshold. Doing so for all
overloaded hosts results in a system without overloaded hosts. In this phase
the workloads (VMs) of a host are handled in increasing order of their CPU
utilization.

The second phase iterates through the hosts in reverse order (sorted by
increasing CPU utilization, thus starting with the least loaded hosts) and
looks for the smallest possible space in which to accommodate each host's
remaining workloads, starting with the largest workload and the most loaded
destination hosts. This is repeated until no workload (VM) is left on the
host, in which case the host can be deactivated. The process continues with
the next hosts in the same manner until the source and the destination host
become the same. In this phase the workloads (VMs) of a host are handled in
decreasing order of their CPU utilization.
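
The two phases described above can be sketched as follows. This is one
possible reading of the description, not the implementation itself: it
assumes a cluster modelled as a dictionary of hosts, each mapping a VM name
to a ``(cpu, memory)`` tuple, and a single pair of limits shared by all
hosts. All function names and sample values are hypothetical.

```python
def cpu_load(vms):
    return sum(cpu for cpu, _mem in vms.values())


def mem_load(vms):
    return sum(mem for _cpu, mem in vms.values())


def fits(vm, vms, cpu_limit, mem_limit):
    """True if adding vm to a host keeps it within both limits."""
    cpu, mem = vm
    return (cpu_load(vms) + cpu <= cpu_limit
            and mem_load(vms) + mem <= mem_limit)


def move(hosts, vm, src, dst, plan):
    hosts[dst][vm] = hosts[src].pop(vm)
    plan.append((vm, src, dst))


def offload_phase(hosts, cpu_limit, mem_limit):
    """Phase 1: relieve overloaded hosts, smallest VMs first."""
    plan = []
    order = sorted(hosts, key=lambda h: cpu_load(hosts[h]), reverse=True)
    for src in order:
        for vm in sorted(hosts[src], key=lambda v: hosts[src][v][0]):
            if cpu_load(hosts[src]) <= cpu_limit:
                break  # host is predicted to be under the threshold
            for dst in reversed(order):  # initially least loaded first
                if dst != src and fits(hosts[src][vm], hosts[dst],
                                       cpu_limit, mem_limit):
                    move(hosts, vm, src, dst, plan)
                    break
    return plan


def consolidation_phase(hosts, cpu_limit, mem_limit):
    """Phase 2: try to empty the least loaded hosts, largest VMs first."""
    plan = []
    order = sorted(hosts, key=lambda h: cpu_load(hosts[h]))
    for lo, src in enumerate(order):
        for vm in sorted(hosts[src], key=lambda v: hosts[src][v][0],
                         reverse=True):
            # Most loaded destinations first; stop before reaching src.
            for hi in range(len(order) - 1, lo, -1):
                dst = order[hi]
                if fits(hosts[src][vm], hosts[dst], cpu_limit, mem_limit):
                    move(hosts, vm, src, dst, plan)
                    break
    return plan


hosts = {
    "h1": {"a": (6, 4), "b": (5, 4)},  # CPU load 11 exceeds the limit
    "h2": {"c": (3, 2)},
    "h3": {"d": (2, 2)},
}
print(offload_phase(hosts, cpu_limit=10, mem_limit=16))
# [('b', 'h1', 'h3')]
print(consolidation_phase(hosts, cpu_limit=10, mem_limit=16))
# [('c', 'h2', 'h3')]
print([h for h, vms in hosts.items() if not vms])  # ['h2']
```

In this toy cluster the first phase brings ``h1`` back under the CPU limit
and the second phase empties ``h2``, which could then be deactivated. The
sketch moves VMs even when a host cannot be fully emptied; whether the
strategy should do so is left open here.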

Together, the two phases produce a solution whose execution leads to a
consolidated system with no overloaded hosts.

This change will not affect any existing Strategies and will not affect
Watcher performance.

Concretely, the new Strategy will be implemented as a new class,
``VMWorkloadConsolidationStrategy``, inheriting from ``BaseStrategy``. The
implementation will be closely based on the ``BasicConsolidation`` example
in the current Watcher codebase.
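
The shape of such a class might look like the skeleton below. It is only a
sketch: ``BaseStrategy`` here is a local stand-in defined for the example,
and the real Watcher base class, its method names and its signatures may
differ. The threshold parameters and the two phase methods are likewise
hypothetical placeholders.

```python
import abc


class BaseStrategy(abc.ABC):
    """Stand-in for Watcher's base class; the real interface may differ."""

    @abc.abstractmethod
    def execute(self, model):
        """Return a solution (here: a list of migrations) for the model."""


class VMWorkloadConsolidationStrategy(BaseStrategy):
    """Skeleton only: the two phases are left as empty placeholders."""

    def __init__(self, cpu_threshold=1.0, memory_threshold=1.0):
        # Hypothetical tunables, expressed as fractions of host capacity.
        self.cpu_threshold = cpu_threshold
        self.memory_threshold = memory_threshold

    def offload_phase(self, model):
        return []  # placeholder: relieve overloaded hosts

    def consolidation_phase(self, model):
        return []  # placeholder: empty the least loaded hosts

    def execute(self, model):
        # Phase order matters: overload is resolved before consolidation.
        return self.offload_phase(model) + self.consolidation_phase(model)


strategy = VMWorkloadConsolidationStrategy(cpu_threshold=0.8)
print(strategy.execute(model={}))  # []
```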

Alternatives
------------

The alternatives to this approach are to use different Goals and associated
Strategies defined in Watcher.

Data model impact
-----------------

None expected.

Having reviewed the data models for the information available to the
different Strategies as well as the data models for the Actions, we believe
that no modifications are necessary to implement this Strategy.

REST API impact
---------------

There is no impact on the REST API.

Security impact
---------------

As the strategy only computes a new VM placement and doesn’t perform the
placement itself, no security impact is envisaged.

Notifications impact
--------------------

No specific notifications associated with executing a specific Strategy are
envisaged. (Notifications could arise from the resulting actions, but these
are presumably handled in other parts of Watcher.)

Other end user impact
---------------------

This capability will not have any specific impact on the API. It will have a
small impact on how Watcher is used via python-watcherclient, as a new
option will be available for the goal parameter in an Audit Template.

Performance Impact
------------------

No specific performance impact is envisaged. The Strategy has been designed
to operate over hundreds of servers on the order of a few seconds.

Other deployer impact
---------------------

No specific deployer impact is envisaged.

Developer impact
----------------

This will not impact other developers working on OpenStack.

Implementation
==============

Assignee(s)
-----------

Primary assignee:
  Seán Murphy <murp>
Other contributors:
  Bruno Grazioli <bwg-bruno>
  Vojtech Cima <cima-vojtech>

Work Items
----------

This task can be considered atomic. It just requires the development and
test of a single class.

Dependencies
============

No dependencies.

Testing
=======

Several unit tests will be provided to test various scenarios using fake
models (a mock model collector and a mock metrics collector), including edge
scenarios such as consolidation of an empty cluster, consolidation of
randomly generated clusters, and consolidation of an overloaded cluster.

Testing approaches similar to those of the basic consolidation strategy will
be used, comprising unit tests and integration tests in which a specific
input is given and the result is compared against the expected output.
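
A unit test in this style could look like the sketch below, which covers the
empty-cluster and overloaded-cluster scenarios. Everything in it is assumed
for illustration: ``fake_cluster`` stands in for the mock model and metrics
collectors, and ``plan_migrations`` is a simplified placeholder for the
strategy under test, not the strategy itself.

```python
import unittest


def fake_cluster(loads):
    """Build a fake model: host name -> {VM name: CPU load}."""
    return {
        f"host{i}": {f"vm{i}_{j}": cpu for j, cpu in enumerate(vms)}
        for i, vms in enumerate(loads)
    }


def plan_migrations(cluster, cpu_limit):
    """Placeholder for the strategy: first-fit offload of overloaded hosts."""
    plan = []
    for src, vms in cluster.items():
        for vm in sorted(vms, key=vms.get):  # smallest VMs first
            if sum(vms.values()) <= cpu_limit:
                break
            for dst, dst_vms in cluster.items():
                if dst != src and sum(dst_vms.values()) + vms[vm] <= cpu_limit:
                    dst_vms[vm] = vms.pop(vm)
                    plan.append((vm, src, dst))
                    break
    return plan


class TestConsolidation(unittest.TestCase):
    def test_empty_cluster(self):
        # Edge case: nothing to consolidate, so no actions are expected.
        self.assertEqual(plan_migrations({}, cpu_limit=10), [])

    def test_overloaded_cluster(self):
        # Specific input compared against the expected output.
        cluster = fake_cluster([[6, 5], [2]])
        plan = plan_migrations(cluster, cpu_limit=10)
        self.assertEqual(plan, [("vm0_1", "host0", "host1")])
        self.assertTrue(
            all(sum(vms.values()) <= 10 for vms in cluster.values()))
```

Saved as a module, the tests can be run with ``python -m unittest``.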

Documentation Impact
====================

It will be necessary to add new content relating to this new Goal and Strategy
to the documentation.

References
==========

No references.

History
=======

No history.

.. _Administrator: https://factory.b-com.com/www/watcher/doc/watcher/glossary.html#administrator
.. _resources: https://factory.b-com.com/www/watcher/doc/watcher/glossary.html#managed-resource
.. _Goal: https://factory.b-com.com/www/watcher/doc/watcher/glossary.html#goal
.. _Strategy: https://factory.b-com.com/www/watcher/doc/watcher/glossary.html#strategy
.. _Actions: https://factory.b-com.com/www/watcher/doc/watcher/glossary.html#action
.. _action plan: https://factory.b-com.com/www/watcher/doc/watcher/glossary.html#action-plan
.. _Cluster: https://factory.b-com.com/www/watcher/doc/watcher/glossary.html#cluster