Merge "workload balance migration strategy spec"

2016-03-16 04:54:03 +00:00 · 2016-03-16 04:54:03 +00:00 · bf2de9d42e
commit bf2de9d42e
parent fddfc12e07 a3f73e3d75
1 changed files with 181 additions and 0 deletions
--- a/specs/mitaka/approved/workload-balance-migration-strategy.rst
+++ b/specs/mitaka/approved/workload-balance-migration-strategy.rst
@ -0,0 +1,181 @@
+..
+ This work is licensed under a Creative Commons Attribution 3.0 Unported
+ License.
+
+ http://creativecommons.org/licenses/by/3.0/legalcode
+
+==========================================
+Workload Based Migration Strategy
+==========================================
+
+https://blueprints.launchpad.net/watcher/+spec/workload-balance-migration-strategy
+
+This spec proposes a new Watcher migration strategy based on the VM workloads
+of hypervisors. This strategy makes decisions to migrate workloads to make the
+total VM workloads of each hypervisor balanced, when the total VM workloads of
+hypervisor reaches threshold.
+
+Note: * VM workloads means how much CPUs the VM instance fully used, eg a VM
+        instance has 4 CPUs, the VM workload range is [0.00 , 4.00].
+
+      * This strategy is based on that all hosts have the same CPUs.
+
+Problem description
+===================
+
+In current Data Center, VM workloads on each hypervisor may not balance, some
+are extremely high, some are idle, which will reduce the cooling efficiency,
+this strategy will balance the workloads when the CPU utilization of
+hypervisor reaches threshold.
+
+Use Cases
+----------
+
+As an administrator, I want to be able to trigger an audit that controls the
+CPU utilization below a certain threshold.
+
+In order to:
+
+* balance the workloads on each hypervisor, make it close to average workload
+  value of all hypervisors.
+
+Project Priority
+-----------------
+
+Not relevant because Watcher is not in the big tent so far.
+
+Proposed change
+===============
+
+Watcher already has its decision framework, so this strategy should be a new
+class which extend the base strategy class.
+
+* Set the threshold in 2 steps: hard coded first, then through the template.
+  see: https://blueprints.launchpad.net/watcher/+spec/optimization-threshold
+  Threshold is the trigger value to start workload balancing.
+  It can be the percentage of CPU utilization of hypervisor.
+
+* Create a new Python class to extend the "BaseStrategy" class.
+
+* Use the Nova objects framework to get free CPU/Memory/Disk of hypervisors.
+
+* Use the Ceilometer client to get VM "cpu_util" to calculate the workloads.
+  The time window is 5 minutes here, will be configurable as threshold.
+  It uses ceilometer aggregation API to get the average value of "cpu_util".
+
+* An algorithm to detect if the threshold of workloads has been reached, it
+  will figure out the suitable VM to be moved, and it will filter the viable
+  targets according to the free resource information of hypervisors from
+  previous step and choose the one with lowest workloads. It will use
+  select-destinations-filter when it is ready.
+  It will also do some check to avoid the corner case, such as "ping pong".
+
+Alternatives
+------------
+
+No alternative
+
+Data model impact
+-----------------
+
+None
+
+REST API impact
+---------------
+
+None
+
+Security impact
+---------------
+
+None
+
+Notifications impact
+--------------------
+
+None
+
+Other end user impact
+---------------------
+
+None
+
+Performance Impact
+------------------
+
+There used to be some performance issues regarding the query of metrics from
+the Ceilometer database. This is one of the reasons why it was rarely used in
+production environment. These issues may now be solved thanks to an
+abstraction layer which enables anybody to change the underlying metrics
+storage backend easily.
+
+There is a performance issue when you query the Nova DB to get CPU usage
+metrics.
+
+Other deployer impact
+---------------------
+
+None
+
+Developer impact
+----------------
+
+None
+
+
+Implementation
+==============
+
+Assignee(s)
+-----------
+
+Primary assignee:
+  <edwin-zhai>
+
+
+Work Items
+----------
+
+1. function to calculate the total VM workloads of hypervisors.
+
+2. function to filter hypervisors by Nova basic metrics(free CPU/Memory/Disk).
+
+3. Rewrite execute function to add the algorithm to detect the threshold and
+   to pick up the suitable VM to be moved and choose the target hypervisors,
+   generate the solution.
+
+
+Dependencies
+============
+
+* https://blueprints.launchpad.net/watcher/+spec/optimization-threshold
+
+* https://blueprints.launchpad.net/watcher/+spec/select-destinations-filter
+
+* http://docs.openstack.org/developer/python-novaclient/api.html
+
+* https://blueprints.launchpad.net/watcher/+spec/get-goal-from-strategy
+
+
+Testing
+=======
+
+Unit tests and functional test, will use a fake metrics set for running
+functional test.
+
+
+Documentation Impact
+====================
+
+A documentation explaining how to use this new optimization strategy.
+
+
+References
+==========
+
+None
+
+History
+=======
+
+None