Merge "Added specification for ZHAW load consolidation"

2016-03-02 11:26:05 +00:00 · 2016-03-02 11:26:05 +00:00 · fddfc12e07
commit fddfc12e07
parent 357dcdeaf0 9143ca37da
1 changed files with 229 additions and 0 deletions
--- a/specs/mitaka/approved/zhaw-load-consolidation.rst
+++ b/specs/mitaka/approved/zhaw-load-consolidation.rst
@@ -0,0 +1,229 @@
 ..
 This work is licensed under a Creative Commons Attribution 3.0 Unported
 License.
 http://creativecommons.org/licenses/by/3.0/legalcode
 ===========================
 Load consolidation strategy
 ===========================
 This specification relates to blueprint:
 https://blueprints.launchpad.net/watcher/+spec/basic-cloud-consolidation-integration
 Problem description
 ===================
 Watcher is a framework which provides support for more energy efficient
 OpenStack operations. It does this by providing access to system state
 information and a set of available actions which can be performed on an
 OpenStack installation. It is specifically designed to provide support for
 different approaches to realizing energy efficient operations: consequently,
 interested parties are encouraged to provide their own energy efficiency
 approaches and integrate them with Watcher. This specification focuses on
 integrating with Watcher the rudimentary load consolidation mechanism
 developed at the ICCLab cloud computing research lab at Zürcher Hochschule für
 Angewandte Wissenschaften (ZHAW).
 The original code implementing this algorithm (not in the context of Watcher)
 was published here:
 https://github.com/icclab/cloud-consolidation
 Use Cases
 ---------
 The use case is one in which the `Administrator`_ wants to perform a load
 consolidation on the `resources`_ to reduce the number of underutilized
 servers. The Administrator invokes Watcher with the `Goal`_ of
 “VM_WORKLOAD_CONSOLIDATION”. Watcher then executes the `Strategy`_
 “VM_WORKLOAD_CONSOLIDATION_STRATEGY” and presents a set of `Actions`_
 to the Administrator. The Administrator then approves the recommended
 `action plan`_ (typically VM live-migration actions) and instructs Watcher
 to perform the actions.
 Project Priority
 ----------------
 Not relevant because Watcher is not in the big tent so far.
 Proposed change
 ===============
 The proposed change is to add a new Goal and a new Strategy to Watcher.
 The new Goal is “VM_WORKLOAD_CONSOLIDATION” and the new Strategy is
 “VM_WORKLOAD_CONSOLIDATION_STRATEGY”. The new Strategy is designed to be a
 lightweight consolidation mechanism which can be tuned based on experience; it
 also operates quickly. The purpose of the strategy is to move the aggregate
 operating point of the `Cluster`_ to increase the number of servers with
 moderate to high load and minimize the number of servers with low load.
 This can be used in conjunction with a server management mechanism to reduce
 overall energy consumption.
 The new Strategy will leverage a modified first-fit algorithm to achieve
 increased server CPU and memory utilization, which ultimately frees some of
 the hosts so that they can be powered down to save energy. It comprises two
 phases: one focused on identifying servers with high load and reducing their
 load, and one focused on identifying servers which have spare, underutilized
 capacity. Each phase operates as a first-fit algorithm, with the utilization
 values ordered differently as input to each.
 This Strategy will consider the compute hosts’ CPU utilization and memory
 constraints. The upper utilization thresholds can be set relative to resource
 capacity and hence provide simple resource overbooking management if
 needed. This strategy will not deal with any other limitations such as actual
 VM memory change rate, network constraints, etc., and relies upon a robust live
 migration mechanism.
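
 As an illustration of thresholds set relative to capacity, the following is a
 minimal sketch only. The ``HostCapacity`` container, the ``within_thresholds``
 helper and the parameter names are hypothetical and are not part of Watcher;
 the units are arbitrary.

```python
from dataclasses import dataclass


@dataclass
class HostCapacity:
    """Hypothetical container for one host's capacity and current usage."""
    cpu_capacity: float  # total CPU capacity (arbitrary units)
    ram_capacity: float  # total memory (arbitrary units)
    cpu_used: float      # CPU currently used by the hosted VMs
    ram_used: float      # memory currently used by the hosted VMs


def within_thresholds(host, cpu_threshold=1.0, ram_threshold=1.0):
    """Return True if usage stays at or below threshold * capacity.

    A threshold above 1.0 permits overbooking of that resource; a
    threshold below 1.0 keeps some headroom.
    """
    return (host.cpu_used <= cpu_threshold * host.cpu_capacity
            and host.ram_used <= ram_threshold * host.ram_capacity)
```

 With these assumptions, a host using 20 CPU units out of 16 violates the
 default threshold of 1.0 but is acceptable with a CPU threshold of 1.5.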
 In order to predict host resource utilization, the following utilization
 estimation model is used. A host’s resource utilization equals the sum of the
 resource utilizations of the hosted workloads (VMs). Considering hosts H1 and
 H2 with a workload W running on H1, moving the workload W from H1 to H2 will
 result in the following predicted resource utilization: H1 = H1 - W and
 H2 = H2 + W. The metrics relating to the VM are taken from telemetry and those
 pertaining to the host are available via nova metrics.
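
 The estimation model can be sketched as follows. This is an illustrative
 example only: the function names and the plain dictionaries mapping VM names
 to utilization values are assumptions, not Watcher data structures.

```python
def host_utilization(vm_loads):
    """Estimated host utilization: the sum of its VMs' utilizations."""
    return sum(vm_loads.values())


def predict_after_migration(src_vms, dst_vms, vm_id):
    """Predict (source, destination) utilization if vm_id is migrated.

    Implements H1 = H1 - W and H2 = H2 + W without modifying the inputs.
    """
    w = src_vms[vm_id]
    return host_utilization(src_vms) - w, host_utilization(dst_vms) + w


h1 = {"vm-a": 30, "vm-b": 20}  # H1 utilization: 50
h2 = {"vm-c": 10}              # H2 utilization: 10
predicted = predict_after_migration(h1, h2, "vm-b")
```

 Here moving ``vm-b`` (W = 20) is predicted to leave both hosts at a
 utilization of 30.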
 The strategy will work in two phases.
 The first phase handles the hosts whose CPU utilization exceeds the defined
 threshold, sorted in decreasing order of CPU utilization, and offloads their
 workloads (VMs) to the first suitable less loaded host which is able to
 accommodate the workload without violating any of the constraints described.
 This offloading process is repeated for each overloaded host until the
 host’s CPU utilization is predicted to be under the threshold. Doing so for
 all overloaded servers results in a system without overloaded servers. In
 this phase the workloads (VMs) are handled in increasing order of their CPU
 utilization.
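
 A minimal sketch of this offloading phase is shown below. It is not the
 Watcher implementation. It assumes hosts are plain dictionaries of
 ``{vm: (cpu, ram)}``, that capacities are ``(cpu, ram)`` tuples, and that
 candidate destinations are tried from the least loaded host upwards (the
 candidate order is an assumption). All names are hypothetical.

```python
def cpu_load(vms):
    return sum(cpu for cpu, _ in vms.values())


def ram_load(vms):
    return sum(ram for _, ram in vms.values())


def fits(vms, vm, cap, cpu_thr, ram_thr):
    """True if adding vm=(cpu, ram) keeps the host within its thresholds."""
    return (cpu_load(vms) + vm[0] <= cpu_thr * cap[0]
            and ram_load(vms) + vm[1] <= ram_thr * cap[1])


def offload_phase(hosts, capacity, cpu_thr=1.0, ram_thr=1.0):
    """Move VMs off overloaded hosts; mutates hosts, returns migrations."""
    migrations = []
    # Most loaded hosts are handled first.
    for src in sorted(hosts, key=lambda h: cpu_load(hosts[h]), reverse=True):
        limit = cpu_thr * capacity[src][0]
        # VMs are handled in increasing order of CPU utilization.
        for vm in sorted(hosts[src], key=lambda v: hosts[src][v][0]):
            if cpu_load(hosts[src]) <= limit:
                break  # host is predicted to be within the threshold
            # First fit over less loaded hosts (assumed: least loaded first).
            for dst in sorted(hosts, key=lambda h: cpu_load(hosts[h])):
                if dst == src or cpu_load(hosts[dst]) >= cpu_load(hosts[src]):
                    continue
                if fits(hosts[dst], hosts[src][vm], capacity[dst],
                        cpu_thr, ram_thr):
                    hosts[dst][vm] = hosts[src].pop(vm)
                    migrations.append((vm, src, dst))
                    break
    return migrations


hosts = {
    "h1": {"a": (50, 1), "b": (40, 1), "c": (30, 1)},  # CPU 120: overloaded
    "h2": {"d": (20, 1)},
    "h3": {},
}
capacity = {name: (100, 10) for name in hosts}
plan = offload_phase(hosts, capacity)
```

 In this example only the smallest VM, ``c``, has to leave ``h1`` to bring its
 predicted CPU utilization down to 90.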
 The second phase then iterates through the servers in reverse order (sorted
 in increasing order of CPU utilization and thus starting with the least
 loaded servers). For each server it looks for the smallest possible space in
 which to accommodate the server’s remaining workloads, starting with the
 largest workload and the most loaded hosts. This process is repeated until
 there is no workload (VM) left on the host, in which case this host can be
 deactivated. The same is then done for the next hosts until the source and
 the destination host become the same. In this phase the workloads (VMs) are
 handled in decreasing order of their CPU utilization.
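
 The consolidation phase could look roughly like the sketch below. It is a
 simplified illustration under stated assumptions, not the Watcher
 implementation: hosts are plain ``{vm: (cpu, ram)}`` dictionaries, the host
 order is computed once at the start, and all names are hypothetical.

```python
def cpu_load(vms):
    return sum(cpu for cpu, _ in vms.values())


def ram_load(vms):
    return sum(ram for _, ram in vms.values())


def fits(vms, vm, cap, cpu_thr, ram_thr):
    """True if adding vm=(cpu, ram) keeps the host within its thresholds."""
    return (cpu_load(vms) + vm[0] <= cpu_thr * cap[0]
            and ram_load(vms) + vm[1] <= ram_thr * cap[1])


def consolidation_phase(hosts, capacity, cpu_thr=1.0, ram_thr=1.0):
    """Empty lightly loaded hosts; mutates hosts, returns migrations."""
    migrations = []
    # Least loaded hosts first (order fixed at the start: an assumption).
    order = sorted(hosts, key=lambda h: cpu_load(hosts[h]))
    for i, src in enumerate(order):
        # VMs are handled in decreasing order of CPU utilization.
        for vm in sorted(hosts[src], key=lambda v: hosts[src][v][0],
                         reverse=True):
            # Destinations: most loaded first, stopping before src itself.
            for dst in reversed(order[i + 1:]):
                if fits(hosts[dst], hosts[src][vm], capacity[dst],
                        cpu_thr, ram_thr):
                    hosts[dst][vm] = hosts[src].pop(vm)
                    migrations.append((vm, src, dst))
                    break
    return migrations


hosts = {
    "h1": {"a": (10, 1)},
    "h2": {"b": (30, 1), "c": (20, 1)},
    "h3": {"d": (60, 1)},
}
capacity = {name: (100, 10) for name in hosts}
plan = consolidation_phase(hosts, capacity)
empty_hosts = [name for name, vms in hosts.items() if not vms]
```

 In this example ``h1`` ends up with no VMs and could be deactivated, while
 ``c`` stays on ``h2`` because it no longer fits on ``h3``.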
 Both phases result in a solution whose execution leads to a consolidated system
 with no overloaded hosts.
 This change will not affect any existing Strategies and will not affect Watcher
 performance.
 Concretely, the new Strategy will be implemented as a new class called
 VMWorkloadConsolidationStrategy inheriting from BaseStrategy. The
 implementation will be closely based on the BasicConsolidation example in the
 current Watcher codebase.
 Alternatives
 ------------
 The alternatives to this approach are to use different Goals and associated
 Strategies defined in Watcher.
 Data model impact
 -----------------
 None expected.
 Having reviewed the data models for both information available to the different
 Strategies as well as the data models for the Actions, we believe that no
 modifications are necessary to implement this Strategy.
 REST API impact
 ---------------
 There is no impact on the REST API.
 Security impact
 ---------------
 As the strategy only computes a new VM placement and doesn’t deal with
 placement itself, no security impact is envisaged.
 Notifications impact
 --------------------
 No specific notifications associated with executing a specific Strategy are
 envisaged. (Notifications could arise from the resulting actions, but these
 are presumably handled in other parts of Watcher).
 Other end user impact
 ---------------------
 This capability will not have any specific impact on the API. It will have a
 small impact on how Watcher is used via the python-watcherclient, as a new
 option will now be available for the goal parameter in an Audit Template.
 Performance Impact
 ------------------
 No specific performance impact is envisaged. The Strategy has been designed
 to operate over hundreds of servers within a few seconds.
 Other deployer impact
 ---------------------
 No specific deployer impact is envisaged.
 Developer impact
 ----------------
 This will not impact other developers working on OpenStack.
 Implementation
 ==============
 Assignee(s)
 -----------
 Primary assignee:
  Seán Murphy <murp>
 Other contributors:
  Bruno Grazioli <bwg-bruno>
  Vojtech Cima <cima-vojtech>
 Work Items
 ----------
 This task can be considered atomic. It just requires the development and
 testing of a single class.
 Dependencies
 ============
 No dependencies.
 Testing
 =======
 Several unit tests will be provided to test various scenarios using fake
 models (a mock model collector and a mock metrics collector), including edge
 scenarios such as the consolidation of an empty cluster, of randomly
 generated clusters, or of an overloaded cluster.
 Testing approaches similar to those used for the basic consolidation strategy
 will be used, comprising unit tests and integration tests in which a specific
 input is given and the result is compared against the expected output.
 Documentation Impact
 ====================
 It will be necessary to add new content relating to this new Goal and Strategy
 to the documentation.
 References
 ==========
 No references.
 History
 =======
 No history.
 .. _Administrator: https://factory.b-com.com/www/watcher/doc/watcher/glossary.html#administrator
 .. _resources: https://factory.b-com.com/www/watcher/doc/watcher/glossary.html#managed-resource
 .. _Goal: https://factory.b-com.com/www/watcher/doc/watcher/glossary.html#goal
 .. _Strategy: https://factory.b-com.com/www/watcher/doc/watcher/glossary.html#strategy
 .. _Actions: https://factory.b-com.com/www/watcher/doc/watcher/glossary.html#action
 .. _action plan: https://factory.b-com.com/www/watcher/doc/watcher/glossary.html#action-plan
 .. _Cluster: https://factory.b-com.com/www/watcher/doc/watcher/glossary.html#cluster