From aea4693153884044cd3a35518c92fc2e6e6b9019 Mon Sep 17 00:00:00 2001 From: suzhengwei Date: Mon, 15 May 2017 18:11:16 +0800 Subject: [PATCH] cluster-maintenance-strategy Implements:blueprint cluster-maintaining Change-Id: I942b793a3182d81981bb817a6e8c970459a16e43 --- .../approved/cluster-maintenance-strategy.rst | 146 ++++++++++++++++++ 1 file changed, 146 insertions(+) create mode 100644 specs/queens/approved/cluster-maintenance-strategy.rst diff --git a/specs/queens/approved/cluster-maintenance-strategy.rst b/specs/queens/approved/cluster-maintenance-strategy.rst new file mode 100644 index 0000000..fdef271 --- /dev/null +++ b/specs/queens/approved/cluster-maintenance-strategy.rst @@ -0,0 +1,146 @@ +.. + This work is licensed under a Creative Commons Attribution 3.0 Unported + License. + + http://creativecommons.org/licenses/by/3.0/legalcode + +========================== +Host maintenance strategy +========================== + +https://blueprints.launchpad.net/watcher/+spec/cluster-maintaining + +Problem description +=================== + +Sometimes we need to maintain compute nodes, update hardware or software, +and so on, without interrupting user's applications. + +Use Cases +--------- + +As an openstack operator, sometimes I want to maintain one compute node +without interrupting user's applications. + + +Proposed change +=============== +There will be a new goal and strategy for cluster-maintenance. + +* Add one new goal - "Cluster Maintenance" +* Add one new strategy for this goal - "Host Maintenance" + +The new strategy executes as follows + +* First, get the compute node which needs maintenance. This input parameter + is provided by the administrator. Call change_nova_service_state action + to set the maintaining node in "maintaining" state (disabled with + disable_reason 'watcher_maintaining'). +* Then, call migrate action to migrate all instances on the maintaining node + to other nodes. Migrate active instances use "live-migrate" and + others use "cold-migrate". Calculate free cpus/memory/disk of a node + to determine whether one instance or all instances from the maintaining node + can migrate to. + This strategy just consider how to migrate all instances of the + maintaining node, further optimization rely on other strategies. + There are two methods to migrate the instances of the maintaining node: + Method No.1, migrate all instances on the maintaining node intensively to + one unused host.The 'unused' host means disable but not power-off node + for Watcher. If there are more than one "unused" hosts, choose one from + them by random. + (This method won't result in more VMs migration among other hosts.) + Method No.2, just migrate all instances on the maintaining node dispersedly + to other nodes. + Method No.1 is priority. Only if Method No.1 fails, Method No.2 will + execute. If both methods fail, this audit fails and raise exception with + no solution produced. + +After the maintenance finished, the administrator needs to activate the +maintaining node by cli 'nova service-enable' to change the node's state +from "maintaining" to "enabled" manually, which will make the compute node +rejoin into compute resource. + +Alternatives +------------ + +None + +Data model impact +----------------- + +None + +REST API impact +--------------- + +None + +Security impact +--------------- +None + +Notifications impact +-------------------- + +None + +Other end user impact +--------------------- + +None + +Performance Impact +------------------ + +None + +Other deployer impact +--------------------- + +None + +Developer impact +---------------- + +None + +Implementation +============== + +Assignee(s) +----------- + +Primary assignee:sue + +Work Items +---------- + + * Add strategy and goal for cluster_maintenance + * Update change_nova_service_state action, to make it available to + maintain one compute node. + +Dependencies +============ + +https://blueprints.launchpad.net/watcher/+spec/extend-node-status + +Testing +======= + +Unit tests + +Documentation Impact +==================== + +A documentation explaining how to use this new optimization strategy. + +References +========== + +None + +History +======= + +None +