cluster-maintenance-strategy
Implements: blueprint cluster-maintaining
Change-Id: I942b793a3182d81981bb817a6e8c970459a16e43
parent 3c538d49fc
commit aea4693153
146
specs/queens/approved/cluster-maintenance-strategy.rst
Normal file
@@ -0,0 +1,146 @@
..
 This work is licensed under a Creative Commons Attribution 3.0 Unported
 License.

 http://creativecommons.org/licenses/by/3.0/legalcode

==========================
Host maintenance strategy
==========================

https://blueprints.launchpad.net/watcher/+spec/cluster-maintaining

Problem description
===================

Operators sometimes need to take a compute node out of service for
maintenance, such as a hardware or software update, without interrupting
users' applications.

Use Cases
---------

As an OpenStack operator, I sometimes want to maintain one compute node
without interrupting users' applications.


Proposed change
===============

This spec adds a new goal and a new strategy for cluster maintenance.

* Add one new goal - "Cluster Maintenance"
* Add one new strategy for this goal - "Host Maintenance"

The new strategy executes as follows:

* First, get the compute node that needs maintenance. The administrator
  provides this node as an input parameter. Call the
  change_nova_service_state action to put the node into the "maintaining"
  state (disabled with disable_reason 'watcher_maintaining').
* Then, call the migrate action to move all instances off the maintaining
  node. Active instances use "live-migrate"; all others use "cold-migrate".
  Calculate the free CPUs, memory, and disk of each candidate node to
  determine whether it can accept one instance, or all instances, from the
  maintaining node.

  This strategy only considers how to migrate all instances off the
  maintaining node; further optimization relies on other strategies.

  There are two methods for migrating the instances:

  Method No.1: migrate all instances on the maintaining node together to
  one unused host. For Watcher, an "unused" host is a node that is disabled
  but not powered off. If there is more than one unused host, choose one at
  random. (This method does not cause further VM migrations among the other
  hosts.)

  Method No.2: spread the instances on the maintaining node across the
  other nodes.

  Method No.1 takes priority. Method No.2 runs only if Method No.1 fails.
  If both methods fail, the audit fails and raises an exception, and no
  solution is produced.

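The two-method selection described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the Watcher implementation: the ``Node``, ``Instance``, ``NoSolution``, and ``plan_maintenance`` names are hypothetical, resources are reduced to three integers, Method No.1 is assumed to consider only unused hosts that can hold every instance, and Method No.2 is assumed to use a simple first-fit pass over enabled nodes.

```python
import random
from dataclasses import dataclass, field
from typing import List, Tuple


class NoSolution(Exception):
    """Raised when neither method can place every instance (hypothetical)."""


@dataclass
class Instance:
    name: str
    vcpus: int
    memory_mb: int
    disk_gb: int
    active: bool = True


@dataclass
class Node:
    name: str
    vcpus: int
    memory_mb: int
    disk_gb: int
    enabled: bool = True
    powered_on: bool = True
    instances: List[Instance] = field(default_factory=list)

    def free(self) -> Tuple[int, int, int]:
        """Free (vcpus, memory_mb, disk_gb) after subtracting hosted instances."""
        return (
            self.vcpus - sum(i.vcpus for i in self.instances),
            self.memory_mb - sum(i.memory_mb for i in self.instances),
            self.disk_gb - sum(i.disk_gb for i in self.instances),
        )


def demand(instances):
    """Total (vcpus, memory_mb, disk_gb) requested by a list of instances."""
    return (
        sum(i.vcpus for i in instances),
        sum(i.memory_mb for i in instances),
        sum(i.disk_gb for i in instances),
    )


def fits(node, instances, reserved=(0, 0, 0)):
    """True if the node's free resources, minus reservations, cover the demand."""
    return all(f - r >= d
               for f, r, d in zip(node.free(), reserved, demand(instances)))


def migration_type(instance):
    """Active instances are live-migrated; all others are cold-migrated."""
    return "live-migrate" if instance.active else "cold-migrate"


def plan_maintenance(target, others, rng=random):
    """Return a list of (instance name, migration type, destination name)."""
    todo = list(target.instances)

    # Method No.1: one unused host (disabled but powered on) takes everything.
    # Assumption: only hosts that can hold all instances are candidates.
    unused = [n for n in others
              if not n.enabled and n.powered_on and fits(n, todo)]
    if unused:
        dest = rng.choice(unused)
        return [(i.name, migration_type(i), dest.name) for i in todo]

    # Method No.2: spread instances over the other nodes.
    # Assumption: first fit over enabled nodes, tracking what is already planned.
    reserved = {n.name: (0, 0, 0) for n in others}
    plan = []
    for inst in todo:
        for node in others:
            if node.enabled and fits(node, [inst], reserved[node.name]):
                reserved[node.name] = tuple(
                    r + d for r, d in zip(reserved[node.name], demand([inst])))
                plan.append((inst.name, migration_type(inst), node.name))
                break
        else:
            raise NoSolution("cannot place instance %s" % inst.name)
    return plan
```

With one disabled but powered-on host that has enough capacity, the plan sends every instance to that host. Without such a host, the instances are spread over the enabled nodes, and ``NoSolution`` is raised if any instance cannot be placed.
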
After the maintenance is finished, the administrator must manually
re-enable the node with the 'nova service-enable' CLI command. This changes
the node's state from "maintaining" to "enabled" and returns the compute
node to the pool of compute resources.

Alternatives
------------

None

Data model impact
-----------------

None

REST API impact
---------------

None

Security impact
---------------

None

Notifications impact
--------------------

None

Other end user impact
---------------------

None

Performance Impact
------------------

None

Other deployer impact
---------------------

None

Developer impact
----------------

None

Implementation
==============

Assignee(s)
-----------

Primary assignee: sue

Work Items
----------

* Add the strategy and goal for cluster_maintenance.
* Update the change_nova_service_state action so that it can put one
  compute node into maintenance.

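To illustrate the second work item, the sketch below shows what a disable request carrying a maintenance reason might look like as plain data, plus a check for the resulting "maintaining" state. The helper names and the ``resource_id``, ``state``, and ``disabled_reason`` keys are assumptions made for this example; they are not taken from the Watcher action's actual schema.

```python
MAINTENANCE_REASON = "watcher_maintaining"


def build_disable_action(hostname, reason=MAINTENANCE_REASON):
    """Describe a 'disable this compute service' step as plain data.

    All keys under input_parameters are hypothetical names for this sketch.
    """
    return {
        "action_type": "change_nova_service_state",
        "input_parameters": {
            "resource_id": hostname,
            "state": "disabled",
            "disabled_reason": reason,
        },
    }


def is_maintaining(service):
    """True if a service record is disabled with the Watcher maintenance reason."""
    return (service.get("status") == "disabled"
            and service.get("disabled_reason") == MAINTENANCE_REASON)
```

Keeping the reason in a single constant lets the strategy tell a node it put into maintenance apart from one an operator disabled for another reason.
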
Dependencies
============

https://blueprints.launchpad.net/watcher/+spec/extend-node-status

Testing
=======

Unit tests

Documentation Impact
====================

Add documentation explaining how to use this new optimization strategy.

References
==========

None

History
=======

None