workload balance migration strategy spec
This is one of the algorithm spec of Intel thermal POC blueprint: workload-balance-migration-strategy Change-Id: Ifd114715963f7d6ad682c6d21506b5e7372fd927
This commit is contained in:
parent
d12ee48f54
commit
a3f73e3d75
181
specs/mitaka/approved/workload-balance-migration-strategy.rst
Normal file
181
specs/mitaka/approved/workload-balance-migration-strategy.rst
Normal file
@ -0,0 +1,181 @@
|
|||||||
|
..
|
||||||
|
This work is licensed under a Creative Commons Attribution 3.0 Unported
|
||||||
|
License.
|
||||||
|
|
||||||
|
http://creativecommons.org/licenses/by/3.0/legalcode
|
||||||
|
|
||||||
|
==========================================
|
||||||
|
Workload Based Migration Strategy
|
||||||
|
==========================================
|
||||||
|
|
||||||
|
https://blueprints.launchpad.net/watcher/+spec/workload-balance-migration-strategy
|
||||||
|
|
||||||
|
This spec proposes a new Watcher migration strategy based on the VM workloads
|
||||||
|
of hypervisors. This strategy makes decisions to migrate workloads to make the
|
||||||
|
total VM workloads of each hypervisor balanced, when the total VM workloads of
|
||||||
|
hypervisor reaches threshold.
|
||||||
|
|
||||||
|
Note: * VM workloads means how much CPUs the VM instance fully used, eg a VM
|
||||||
|
instance has 4 CPUs, the VM workload range is [0.00 , 4.00].
|
||||||
|
|
||||||
|
* This strategy is based on that all hosts have the same CPUs.
|
||||||
|
|
||||||
|
Problem description
|
||||||
|
===================
|
||||||
|
|
||||||
|
In current Data Center, VM workloads on each hypervisor may not balance, some
|
||||||
|
are extremely high, some are idle, which will reduce the cooling efficiency,
|
||||||
|
this strategy will balance the workloads when the CPU utilization of
|
||||||
|
hypervisor reaches threshold.
|
||||||
|
|
||||||
|
Use Cases
|
||||||
|
----------
|
||||||
|
|
||||||
|
As an administrator, I want to be able to trigger an audit that controls the
|
||||||
|
CPU utilization below a certain threshold.
|
||||||
|
|
||||||
|
In order to:
|
||||||
|
|
||||||
|
* balance the workloads on each hypervisor, make it close to average workload
|
||||||
|
value of all hypervisors.
|
||||||
|
|
||||||
|
Project Priority
|
||||||
|
-----------------
|
||||||
|
|
||||||
|
Not relevant because Watcher is not in the big tent so far.
|
||||||
|
|
||||||
|
Proposed change
|
||||||
|
===============
|
||||||
|
|
||||||
|
Watcher already has its decision framework, so this strategy should be a new
|
||||||
|
class which extend the base strategy class.
|
||||||
|
|
||||||
|
* Set the threshold in 2 steps: hard coded first, then through the template.
|
||||||
|
see: https://blueprints.launchpad.net/watcher/+spec/optimization-threshold
|
||||||
|
Threshold is the trigger value to start workload balancing.
|
||||||
|
It can be the percentage of CPU utilization of hypervisor.
|
||||||
|
|
||||||
|
* Create a new Python class to extend the "BaseStrategy" class.
|
||||||
|
|
||||||
|
* Use the Nova objects framework to get free CPU/Memory/Disk of hypervisors.
|
||||||
|
|
||||||
|
* Use the Ceilometer client to get VM "cpu_util" to calculate the workloads.
|
||||||
|
The time window is 5 minutes here, will be configurable as threshold.
|
||||||
|
It uses ceilometer aggregation API to get the average value of "cpu_util".
|
||||||
|
|
||||||
|
* An algorithm to detect if the threshold of workloads has been reached, it
|
||||||
|
will figure out the suitable VM to be moved, and it will filter the viable
|
||||||
|
targets according to the free resource information of hypervisors from
|
||||||
|
previous step and choose the one with lowest workloads. It will use
|
||||||
|
select-destinations-filter when it is ready.
|
||||||
|
It will also do some check to avoid the corner case, such as "ping pong".
|
||||||
|
|
||||||
|
Alternatives
|
||||||
|
------------
|
||||||
|
|
||||||
|
No alternative
|
||||||
|
|
||||||
|
Data model impact
|
||||||
|
-----------------
|
||||||
|
|
||||||
|
None
|
||||||
|
|
||||||
|
REST API impact
|
||||||
|
---------------
|
||||||
|
|
||||||
|
None
|
||||||
|
|
||||||
|
Security impact
|
||||||
|
---------------
|
||||||
|
|
||||||
|
None
|
||||||
|
|
||||||
|
Notifications impact
|
||||||
|
--------------------
|
||||||
|
|
||||||
|
None
|
||||||
|
|
||||||
|
Other end user impact
|
||||||
|
---------------------
|
||||||
|
|
||||||
|
None
|
||||||
|
|
||||||
|
Performance Impact
|
||||||
|
------------------
|
||||||
|
|
||||||
|
There used to be some performance issues regarding the query of metrics from
|
||||||
|
the Ceilometer database. This is one of the reasons why it was rarely used in
|
||||||
|
production environment. These issues may now be solved thanks to an
|
||||||
|
abstraction layer which enables anybody to change the underlying metrics
|
||||||
|
storage backend easily.
|
||||||
|
|
||||||
|
There is a performance issue when you query the Nova DB to get CPU usage
|
||||||
|
metrics.
|
||||||
|
|
||||||
|
Other deployer impact
|
||||||
|
---------------------
|
||||||
|
|
||||||
|
None
|
||||||
|
|
||||||
|
Developer impact
|
||||||
|
----------------
|
||||||
|
|
||||||
|
None
|
||||||
|
|
||||||
|
|
||||||
|
Implementation
|
||||||
|
==============
|
||||||
|
|
||||||
|
Assignee(s)
|
||||||
|
-----------
|
||||||
|
|
||||||
|
Primary assignee:
|
||||||
|
<edwin-zhai>
|
||||||
|
|
||||||
|
|
||||||
|
Work Items
|
||||||
|
----------
|
||||||
|
|
||||||
|
1. function to calculate the total VM workloads of hypervisors.
|
||||||
|
|
||||||
|
2. function to filter hypervisors by Nova basic metrics(free CPU/Memory/Disk).
|
||||||
|
|
||||||
|
3. Rewrite execute function to add the algorithm to detect the threshold and
|
||||||
|
to pick up the suitable VM to be moved and choose the target hypervisors,
|
||||||
|
generate the solution.
|
||||||
|
|
||||||
|
|
||||||
|
Dependencies
|
||||||
|
============
|
||||||
|
|
||||||
|
* https://blueprints.launchpad.net/watcher/+spec/optimization-threshold
|
||||||
|
|
||||||
|
* https://blueprints.launchpad.net/watcher/+spec/select-destinations-filter
|
||||||
|
|
||||||
|
* http://docs.openstack.org/developer/python-novaclient/api.html
|
||||||
|
|
||||||
|
* https://blueprints.launchpad.net/watcher/+spec/get-goal-from-strategy
|
||||||
|
|
||||||
|
|
||||||
|
Testing
|
||||||
|
=======
|
||||||
|
|
||||||
|
Unit tests and functional test, will use a fake metrics set for running
|
||||||
|
functional test.
|
||||||
|
|
||||||
|
|
||||||
|
Documentation Impact
|
||||||
|
====================
|
||||||
|
|
||||||
|
A documentation explaining how to use this new optimization strategy.
|
||||||
|
|
||||||
|
|
||||||
|
References
|
||||||
|
==========
|
||||||
|
|
||||||
|
None
|
||||||
|
|
||||||
|
History
|
||||||
|
=======
|
||||||
|
|
||||||
|
None
|
Loading…
x
Reference in New Issue
Block a user