Merge "Added specification for ZHAW load consolidation"
commit fddfc12e07
specs/mitaka/approved/zhaw-load-consolidation.rst (new file, 229 lines)

..
 This work is licensed under a Creative Commons Attribution 3.0 Unported
 License.

 http://creativecommons.org/licenses/by/3.0/legalcode

===========================
Load consolidation strategy
===========================

This specification relates to blueprint:
https://blueprints.launchpad.net/watcher/+spec/basic-cloud-consolidation-integration

Problem description
===================

Watcher is a framework that supports more energy-efficient OpenStack
operations. It does this by providing access to system state information and a
set of actions that can be performed on an OpenStack installation. It is
specifically designed to accommodate different approaches to energy-efficient
operation: interested parties are encouraged to provide their own energy
efficiency approaches and integrate them with Watcher. This specification
focuses on integrating with Watcher the rudimentary load consolidation
mechanism developed at the ICCLab cloud computing research lab at Zürcher
Hochschule für Angewandte Wissenschaften (ZHAW).

The original code implementing this algorithm (outside the context of Watcher)
is published here:
https://github.com/icclab/cloud-consolidation

Use Cases
---------

The `Administrator`_ wants to consolidate load on the `resources`_ to reduce
the number of underutilized servers. The Administrator invokes Watcher with
the `Goal`_ “VM_WORKLOAD_CONSOLIDATION”. Watcher then executes the
`Strategy`_ “VM_WORKLOAD_CONSOLIDATION_STRATEGY” and presents a set of
`Actions`_ to the Administrator. The Administrator approves the recommended
`action plan`_ (typically VM live-migration actions) and instructs Watcher to
perform the actions.

Project Priority
----------------

Not relevant, because Watcher is not yet part of the big tent.

Proposed change
===============

The proposed change is to add a new Goal and a new Strategy to Watcher.
The new Goal is “VM_WORKLOAD_CONSOLIDATION” and the new Strategy is
“VM_WORKLOAD_CONSOLIDATION_STRATEGY”. The new Strategy is designed to be a
lightweight, fast consolidation mechanism which can be tuned based on
experience. Its purpose is to move the aggregate operating point of the
`Cluster`_ so that more servers run at moderate to high load and fewer servers
run at low load. It can be used in conjunction with a server management
mechanism to reduce overall energy consumption.

The new Strategy will use a modified first-fit algorithm to increase server
CPU and memory utilization, which ultimately frees some hosts so that they can
be powered down to save energy. It comprises two phases: one identifies
servers with high load and reduces their load, and the other identifies
servers which have spare, underutilized capacity. Each phase operates as a
first-fit algorithm whose input is ordered by utilization in a different way.

This Strategy will consider each compute host's CPU utilization and memory
constraints. The upper utilization thresholds can be set relative to resource
capacity and hence provide simple resource overbooking management if
needed. The Strategy will not deal with any other limitations, such as the
actual VM memory change rate or network constraints, and relies upon a robust
live migration mechanism.
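
The idea of thresholds set relative to capacity can be sketched as follows.
This is only an illustration: the ``HostLimits`` class, its field names, and
the units are assumptions made for this example and are not part of Watcher.
A threshold above 1.0 models overbooking of the corresponding resource.

```python
from dataclasses import dataclass


@dataclass
class HostLimits:
    """Hypothetical per-host capacity and upper-threshold settings."""
    cpu_capacity: float          # e.g. number of physical cores
    mem_capacity: float          # e.g. MiB of RAM
    cpu_threshold: float = 1.0   # fraction of capacity; > 1.0 allows overbooking
    mem_threshold: float = 1.0   # fraction of capacity

    def cpu_limit(self) -> float:
        """Absolute CPU limit derived from capacity and threshold."""
        return self.cpu_capacity * self.cpu_threshold

    def mem_limit(self) -> float:
        """Absolute memory limit derived from capacity and threshold."""
        return self.mem_capacity * self.mem_threshold

    def allows(self, cpu_used: float, mem_used: float) -> bool:
        """True if the given usage stays within both upper thresholds."""
        return cpu_used <= self.cpu_limit() and mem_used <= self.mem_limit()


# 8 cores overbooked by 50 percent, memory not overbooked.
limits = HostLimits(cpu_capacity=8, mem_capacity=16384, cpu_threshold=1.5)
print(limits.cpu_limit(), limits.allows(10, 16384), limits.allows(13, 1000))
```

Keeping the threshold as a fraction of capacity means the same setting can be
applied to hosts of different sizes.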

To predict host resource utilization, the following estimation model is used.
A host's resource utilization equals the sum of the resource utilizations of
its hosted workloads (VMs). Given hosts H1 and H2 with a workload W running on
H1, moving W from H1 to H2 results in the predicted utilizations H1 = H1 - W
and H2 = H2 + W. The metrics relating to the VM are taken from telemetry, and
those pertaining to the host are available via nova metrics.
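
A minimal sketch of this estimation model is shown below. The ``placement``
dictionary and both function names are hypothetical and chosen only for
illustration; in Watcher the values would come from telemetry and nova metrics
rather than from literals.

```python
def host_utilization(workloads):
    """Host utilization is the sum of its hosted workloads' utilizations."""
    return sum(workloads.values())


def predict_move(placement, vm, src, dst):
    """Predict (src, dst) utilization after moving vm from src to dst.

    placement maps host name -> {vm name -> utilization}; it is not modified.
    """
    w = placement[src][vm]
    h1 = host_utilization(placement[src]) - w   # H1 = H1 - W
    h2 = host_utilization(placement[dst]) + w   # H2 = H2 + W
    return h1, h2


# Workload W (30 units) moves from H1 (50 units in total) to H2 (10 units).
placement = {"H1": {"W": 30, "X": 20}, "H2": {"Y": 10}}
print(predict_move(placement, "W", "H1", "H2"))
```

Because the prediction does not modify the placement, a strategy can evaluate
several candidate moves before committing to one.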

The strategy will work in two phases.

The first phase processes the hosts whose CPU utilization exceeds the defined
threshold, sorted by CPU utilization in decreasing order. It offloads each
workload (VM) to the first suitable less loaded host that can accommodate the
workload without violating any of the constraints described above. Offloading
from a host is repeated until that host’s CPU utilization is predicted to be
under the threshold. Doing so for all overloaded servers results in a system
without overloaded servers. In this phase, the workloads (VMs) on a host are
handled in increasing order of CPU utilization.

The second phase then iterates through the servers in reverse order (sorted by
CPU utilization in increasing order, thus starting with the least loaded
servers). For each server it looks for the smallest possible space that can
accommodate its remaining workloads, starting with the largest workload and
the most loaded hosts. This is repeated until no workload (VM) is left on the
host, in which case the host can be deactivated. The process continues with
the next hosts in the same manner until the source and the destination host
become the same. In this phase, the workloads (VMs) are handled in decreasing
order of CPU utilization.

Together, the two phases produce a solution whose execution leads to a
consolidated system with no overloaded hosts.
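
The two phases can be sketched roughly as below. This is one possible reading
of the description, not the ICCLab or Watcher implementation. The ``Host``
class and all function names are hypothetical. Two details are assumptions:
in the first phase, candidate destinations are tried from least to most
loaded, and in the second phase "until the source and the destination host
become the same" is modelled by only allowing destinations that come after
the source in the sorted order. The sketch mutates the hosts in place and
returns the list of migrations.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    cpu_cap: float
    mem_cap: float
    vms: dict = field(default_factory=dict)  # vm name -> (cpu, mem)

    def cpu(self):
        return sum(c for c, _ in self.vms.values())

    def mem(self):
        return sum(m for _, m in self.vms.values())

    def cpu_util(self):
        return self.cpu() / self.cpu_cap

    def fits(self, vm, cpu_thr, mem_thr):
        """True if adding vm keeps the host within both thresholds."""
        c, m = vm
        return (self.cpu() + c <= self.cpu_cap * cpu_thr
                and self.mem() + m <= self.mem_cap * mem_thr)


def _move(name, src, dst, plan):
    dst.vms[name] = src.vms.pop(name)
    plan.append((name, src.name, dst.name))


def offload_phase(hosts, cpu_thr, mem_thr, plan):
    """Phase 1: relieve hosts whose CPU load exceeds the threshold."""
    for src in sorted(hosts, key=Host.cpu_util, reverse=True):
        # Smallest workloads first, as described for this phase.
        for name in sorted(src.vms, key=lambda n: src.vms[n][0]):
            if src.cpu() <= src.cpu_cap * cpu_thr:
                break  # predicted to be under the threshold
            for dst in sorted(hosts, key=Host.cpu_util):
                if (dst is not src
                        and dst.cpu_util() < src.cpu_util()
                        and dst.fits(src.vms[name], cpu_thr, mem_thr)):
                    _move(name, src, dst, plan)
                    break


def consolidation_phase(hosts, cpu_thr, mem_thr, plan):
    """Phase 2: empty the least loaded hosts into the most loaded ones."""
    order = sorted(hosts, key=Host.cpu_util)  # least loaded first
    for i, src in enumerate(order):
        # Largest workloads first, as described for this phase.
        by_cpu = sorted(src.vms, key=lambda n: src.vms[n][0], reverse=True)
        for name in by_cpu:
            # Most loaded candidates first; never a host before src.
            for dst in reversed(order[i + 1:]):
                if dst.fits(src.vms[name], cpu_thr, mem_thr):
                    _move(name, src, dst, plan)
                    break


def consolidate(hosts, cpu_thr=1.0, mem_thr=1.0):
    """Run both phases and return the (vm, source, destination) moves."""
    plan = []
    offload_phase(hosts, cpu_thr, mem_thr, plan)
    consolidation_phase(hosts, cpu_thr, mem_thr, plan)
    return plan


hosts = [
    Host("h1", 8, 8, {"a": (6, 2), "b": (4, 2)}),  # overloaded: 10 of 8 CPU
    Host("h2", 8, 8, {"c": (1, 1)}),
    Host("h3", 8, 8, {"d": (2, 1)}),
]
plan = consolidate(hosts)
print(plan)
print([h.name for h in hosts if not h.vms])  # hosts that could be deactivated
```

In this small example the first phase moves one VM off the overloaded host,
and the second phase empties the least loaded host so that it could be
deactivated.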

This change will not affect any existing Strategies and will not affect Watcher
performance.

Concretely, the new Strategy will be implemented as a new class,
VMWorkloadConsolidationStrategy, inheriting from BaseStrategy. The
implementation will be closely based on the BasicConsolidation example in the
current Watcher codebase.

Alternatives
------------

The alternatives to this approach are to use different Goals and associated
Strategies defined in Watcher.

Data model impact
-----------------

None expected.

Having reviewed the data models for the information available to the
different Strategies as well as the data models for the Actions, we believe
that no modifications are necessary to implement this Strategy.

REST API impact
---------------

There is no impact on the REST API.

Security impact
---------------

As the strategy only computes a new VM placement and doesn’t deal with
placement itself, no security impact is envisaged.

Notifications impact
--------------------

No specific notifications associated with executing a specific Strategy are
envisaged. (Notifications could arise from the resulting actions, but these
are presumably handled in other parts of Watcher).

Other end user impact
---------------------

This capability will not have any specific impact on the API. It will have a
small impact on how Watcher is used via the python-watcherclient, as a new
option will be available for the goal parameter in an Audit Template.

Performance Impact
------------------

No specific performance impact is envisaged. The Strategy has been designed
to operate over hundreds of servers on the order of a few seconds.

Other deployer impact
---------------------

No specific deployer impact is envisaged.

Developer impact
----------------

This will not impact other developers working on OpenStack.

Implementation
==============

Assignee(s)
-----------

Primary assignee:
  Seán Murphy <murp>

Other contributors:
  Bruno Grazioli <bwg-bruno>
  Vojtech Cima <cima-vojtech>

Work Items
----------

This task can be considered atomic. It just requires the development and
testing of a single class.

Dependencies
============

No dependencies.

Testing
=======

Several unit tests will be provided to test various scenarios using mock
models (a mock model collector and a mock metrics collector), including edge
scenarios such as consolidation of an empty cluster, consolidation of
randomly generated clusters, and consolidation of an overloaded cluster.

Testing approaches similar to those of the basic consolidation strategy will
be used, comprising unit tests and integration tests in which a specific
input is given and the result is compared against the expected output.
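
As a rough illustration of such input/expected-output tests, the sketch below
checks invariants of a migration plan on an empty, an overloaded, and a
randomly generated cluster. It does not use Watcher's mock collectors: the
``apply_plan`` and ``overloaded_hosts`` helpers, the plan format, and the
hand-written plans are all assumptions made for this example.

```python
import random
import unittest


def apply_plan(cluster, plan):
    """Apply (vm, src, dst) migrations to a copy of host -> {vm: cpu}."""
    result = {host: dict(vms) for host, vms in cluster.items()}
    for vm, src, dst in plan:
        result[dst][vm] = result[src].pop(vm)
    return result


def overloaded_hosts(cluster, capacity, threshold=1.0):
    """Names of hosts whose summed VM CPU usage exceeds the threshold."""
    limit = capacity * threshold
    return sorted(h for h, vms in cluster.items()
                  if sum(vms.values()) > limit)


class PlanInvariantTests(unittest.TestCase):
    def test_empty_cluster(self):
        self.assertEqual(apply_plan({}, []), {})
        self.assertEqual(overloaded_hosts({}, capacity=8), [])

    def test_overloaded_cluster(self):
        cluster = {"h1": {"a": 6, "b": 4}, "h2": {"c": 1}}
        self.assertEqual(overloaded_hosts(cluster, capacity=8), ["h1"])
        after = apply_plan(cluster, [("b", "h1", "h2")])
        self.assertEqual(overloaded_hosts(after, capacity=8), [])

    def test_random_cluster_keeps_all_vms(self):
        rng = random.Random(42)  # fixed seed keeps the test reproducible
        cluster = {f"h{i}": {f"vm{i}-{j}": rng.randint(1, 4)
                             for j in range(rng.randint(0, 5))}
                   for i in range(10)}
        # Hypothetical plan: move every VM of h0 to h1.
        plan = [(vm, "h0", "h1") for vm in cluster["h0"]]
        after = apply_plan(cluster, plan)
        before_vms = sorted(vm for vms in cluster.values() for vm in vms)
        after_vms = sorted(vm for vms in after.values() for vm in vms)
        self.assertEqual(before_vms, after_vms)  # no VM lost or duplicated
        self.assertEqual(after["h0"], {})        # source host was emptied
```

In a real test suite, the hand-written plans would be replaced by the output
of the strategy under test, and the same invariants would be asserted on it.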

Documentation Impact
====================

It will be necessary to add new content relating to this new Goal and Strategy
to the documentation.

References
==========

No references.

History
=======

No history.

.. _Administrator: https://factory.b-com.com/www/watcher/doc/watcher/glossary.html#administrator
.. _resources: https://factory.b-com.com/www/watcher/doc/watcher/glossary.html#managed-resource
.. _Goal: https://factory.b-com.com/www/watcher/doc/watcher/glossary.html#goal
.. _Strategy: https://factory.b-com.com/www/watcher/doc/watcher/glossary.html#strategy
.. _Actions: https://factory.b-com.com/www/watcher/doc/watcher/glossary.html#action
.. _action plan: https://factory.b-com.com/www/watcher/doc/watcher/glossary.html#action-plan
.. _Cluster: https://factory.b-com.com/www/watcher/doc/watcher/glossary.html#cluster