Config STONITH role creation

This role configures the STONITH devices for all the nodes of a deployed
overcloud environment.
It makes it possible to configure STONITH (idempotently) on the
controllers, on the computes, or on all the nodes (which is the default).

Change-Id: I2bb5307b43e39bc7d26cd3911d56efcf3fae2a8b
Raoul Scarazzini 2017-03-13 13:16:44 -04:00
parent 9036d6de48
commit 2a5d666e6d
5 changed files with 230 additions and 0 deletions


@@ -0,0 +1,7 @@
---
- name: Configure STONITH for all the hosts on the overcloud
  hosts: undercloud
  gather_facts: no
  roles:
    - stonith-config


@@ -0,0 +1,99 @@
stonith-config
==============
This role acts on an already deployed TripleO environment, setting up STONITH (Shoot The Other Node In The Head) inside the Pacemaker configuration for all the hosts that are part of the overcloud.

Requirements
------------

This role must be used with a deployed TripleO environment, so you'll need a working directory of tripleo-quickstart containing these files:

- **hosts**: contains all the hosts used in the deployment;
- **ssh.config.ansible**: contains all the ssh data needed to connect to the undercloud and to all the overcloud nodes;
- **instackenv.json**: must be present in the undercloud workdir; it should be created by the installer (see the example after this list).
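For each node, the role only reads the power-management data and the MAC address (which is matched against the node's ctlplane port) from *instackenv.json*; a minimal entry looks roughly like this (a sketch, all values are placeholders):

```
$ cat /path/to/workdir/instackenv.json
{ "nodes": [
    { "mac": ["aa:bb:cc:dd:ee:ff"],
      "arch": "x86_64",
      "pm_type": "pxe_ipmitool",
      "pm_user": "admin",
      "pm_password": "password",
      "pm_addr": "10.0.0.10" },
    ...
  ]
}
```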
Quickstart invocation
---------------------

Quickstart can be invoked like this:

```bash
./quickstart.sh \
    --retain-inventory \
    --playbook overcloud-stonith-config.yml \
    --working-dir /path/to/workdir \
    --config /path/to/config.yml \
    --release <RELEASE> \
    --tags all \
    <HOSTNAME or IP>
```

Basically this command:

- **Keeps** the existing inventory data in the working directory (the most important option)
- Uses the *overcloud-stonith-config.yml* playbook
- Uses the same custom workdir where quickstart was first deployed
- Selects the specific config file
- Specifies the release (mitaka, newton, or “master” for ocata)
- Performs all the tasks in the playbook overcloud-stonith-config.yml

**Important note**

You might need to export *ANSIBLE_SSH_ARGS* with the path of the *ssh.config.ansible* file to make the command work, like this:

```bash
export ANSIBLE_SSH_ARGS="-F /path/to/quickstart/workdir/ssh.config.ansible"
```
STONITH configuration
---------------------

STONITH configuration relies on the same **instackenv.json** file used by TripleO to configure Ironic and all the provisioning. Basically this role enables STONITH on the Pacemaker cluster and takes all the information from the mentioned file, creating a STONITH resource for each host on the overcloud.

After running this playbook the cluster configuration will have **stonith-enabled** set to **true**:

```
$ sudo pcs property
Cluster Properties:
 cluster-infrastructure: corosync
 cluster-name: tripleo_cluster
 ...
 ...
 stonith-enabled: true
```

And something like this, depending on how many nodes are in the overcloud:

```
$ sudo pcs stonith
 ipmilan-overcloud-compute-0 (stonith:fence_ipmilan): Started overcloud-controller-1
 ipmilan-overcloud-controller-2 (stonith:fence_ipmilan): Started overcloud-controller-0
 ipmilan-overcloud-controller-0 (stonith:fence_ipmilan): Started overcloud-controller-0
 ipmilan-overcloud-controller-1 (stonith:fence_ipmilan): Started overcloud-controller-1
 ipmilan-overcloud-compute-1 (stonith:fence_ipmilan): Started overcloud-controller-1
```
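Under the hood the role generates a *create-stonith.sh* script and runs it on one of the controllers; the script first disables *stonith-enabled*, then (re)creates one fence device per node, and finally re-enables it. For a single node the generated commands look roughly like this (a sketch; the IPMI address and credentials shown are placeholders for the values read from *instackenv.json*):

```bash
pcs stonith delete ipmilan-overcloud-compute-0 || /bin/true
pcs stonith create ipmilan-overcloud-compute-0 fence_ipmilan \
    pcmk_host_list="overcloud-compute-0" ipaddr="10.0.0.10" \
    login="admin" passwd="password" lanplus="true" delay=20 \
    op monitor interval=60s
pcs location ipmilan-overcloud-compute-0 avoids overcloud-compute-0
```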
Having all this in place is a requirement for a reliable HA solution and for configuring special OpenStack features like [Instance HA](https://github.com/redhat-openstack/tripleo-quickstart-utils/tree/master/roles/instance-ha).
**Note**: by default this role configures STONITH for all the overcloud nodes, but it is possible to limit it to just the controllers or just the computes by setting the **stonith_devices** variable, which defaults to "all" and can also be set to "*controllers*" or "*computes*".
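For example, to restrict the configuration to the controllers only, the variable can be overridden at invocation time (a sketch, assuming your quickstart.sh build forwards *--extra-vars* to ansible-playbook):

```bash
./quickstart.sh \
    --retain-inventory \
    --playbook overcloud-stonith-config.yml \
    --working-dir /path/to/workdir \
    --config /path/to/config.yml \
    --release <RELEASE> \
    --tags all \
    --extra-vars stonith_devices=controllers \
    <HOSTNAME or IP>
```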
Limitations
-----------

At the moment the only kind of STONITH device supported is **IPMI** (via fence_ipmilan).

Example Playbook
----------------

The main playbook couldn't be simpler:

```yaml
---
- name: Configure STONITH for all the hosts on the overcloud
  hosts: undercloud
  gather_facts: no
  roles:
    - stonith-config
```

But it could also be used at the end of a deployment, like the validate-ha role is used in [baremetal-undercloud-validate-ha.yml](https://github.com/redhat-openstack/tripleo-quickstart-utils/blob/master/playbooks/baremetal-undercloud-validate-ha.yml).
License
-------
GPL
Author Information
------------------
Raoul Scarazzini <rasca@redhat.com>


@@ -0,0 +1,9 @@
---
# Working directory on the overcloud nodes (heat-admin user's home)
overcloud_working_dir: "/home/heat-admin"
# Working directory on the undercloud (stack user's home)
working_dir: "/home/stack"
# Location of the instackenv.json file produced by the installer
instack_env_file: "{{ working_dir }}/instackenv.json"
# Template of the script that generates the pcs STONITH commands
create_stonith_python_script: create-stonith-from-instackenv.py.j2
# Can be all, controllers, computes
stonith_devices: all


@@ -0,0 +1,26 @@
---
- name: Load the STONITH creation script on the undercloud
  template:
    src: "{{ create_stonith_python_script }}"
    dest: "{{ working_dir }}/create_stonith_from_instackenv.py"
    mode: 0755

- name: Generate STONITH script
  shell: |
    source {{ working_dir }}/stackrc
    {{ working_dir }}/create_stonith_from_instackenv.py {{ instack_env_file }} {{ stonith_devices }}
  register: stonith_script

- name: Create the STONITH script on the overcloud
  lineinfile:
    destfile: "{{ overcloud_working_dir }}/create-stonith.sh"
    line: "{{ stonith_script.stdout }}"
    create: yes
    mode: 0755
  delegate_to: overcloud-controller-0

- name: Execute STONITH script
  become: true
  delegate_to: overcloud-controller-0
  shell: >
    {{ overcloud_working_dir }}/create-stonith.sh &> create_stonith.log


@@ -0,0 +1,89 @@
#!/bin/python
import os
import json
import sys
# The below will be enabled once OS_AUTH_URL=http://192.0.2.1:5000/v3
#from keystoneauth1.identity import v3
from keystoneauth1.identity import v2
from keystoneauth1 import session
from pprint import pprint
from novaclient import client
# Environment variables (need to source before launching):
# export NOVA_VERSION=1.1
# export OS_PASSWORD=$(sudo hiera admin_password)
# If v3:
# export OS_AUTH_URL=http://192.0.2.1:5000/v3
# else
# export OS_AUTH_URL=http://192.0.2.1:5000/v2.0
# export OS_USERNAME=admin
# export OS_TENANT_NAME=admin
# export COMPUTE_API_VERSION=1.1
# export OS_NO_CACHE=True
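#
# Usage (run on the undercloud after sourcing stackrc):
#   ./create_stonith_from_instackenv.py <instackenv.json> <all|controllers|computes>
# The pcs commands are printed to stdout; they are meant to be executed on one
# of the overcloud controllers (the role collects them into create-stonith.sh
# and runs the script there).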
# JSON format:
# { "nodes": [
#     {
#       "mac": [
#         "b8:ca:3a:66:e3:82"
#       ],
#       "_comment": "host12-rack03.scale.openstack.engineering.redhat.com",
#       "cpu": "",
#       "memory": "",
#       "disk": "",
#       "arch": "x86_64",
#       "pm_type": "pxe_ipmitool",
#       "pm_user": "qe-scale",
#       "pm_password": "d0ckingSt4tion",
#       "pm_addr": "10.1.8.102"
#     },
#     ...
# JSON file as first parameter
jdata = open(sys.argv[1])
data = json.load(jdata)

# controllers, computes or all
fence_devices = sys.argv[2]

os_username = os.environ['OS_USERNAME']
os_password = os.environ['OS_PASSWORD']
os_auth_url = os.environ['OS_AUTH_URL']
if os.environ.get('OS_TENANT_NAME'):
    os_tenant_name = os.environ['OS_TENANT_NAME']
else:
    os_tenant_name = os.environ['OS_PROJECT_NAME']
os_compute_api_version = os.environ['COMPUTE_API_VERSION']
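# Disable STONITH while the fence devices are being (re)created; the last
# generated command re-enables it.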
print('pcs property set stonith-enabled=false')
# To make the below work, os_auth_url must be http://192.0.2.1:5000/v3
#auth = v3.Password(auth_url=os_auth_url,
# username=os_username,
# password=os_password,
#{% if release in [ 'liberty', 'mitaka' ] %}
# tenant_name=os_tenant_name,
#{% else %}
# project_name=os_tenant_name,
#{% endif %}
# user_domain_id='default',
# project_domain_id='default')
auth = v2.Password(auth_url=os_auth_url,
                   username=os_username,
                   password=os_password,
                   tenant_name=os_tenant_name)
sess = session.Session(auth=auth)
nt = client.Client("2.1", session=sess)
for instance in nt.servers.list():
    for node in data["nodes"]:
        if (node["mac"][0] == instance.addresses['ctlplane'][0]['OS-EXT-IPS-MAC:mac_addr']
                and (('controller' in instance.name and fence_devices in ['controllers', 'all'])
                     or ('compute' in instance.name and fence_devices in ['computes', 'all']))):
            print('pcs stonith delete ipmilan-{} || /bin/true'.format(instance.name))
            print('pcs stonith create ipmilan-{} fence_ipmilan pcmk_host_list="{}" ipaddr="{}" login="{}" passwd="{}" lanplus="true" delay=20 op monitor interval=60s'.format(instance.name, instance.name, node["pm_addr"], node["pm_user"], node["pm_password"]))
            print('pcs location ipmilan-{} avoids {}'.format(instance.name, instance.name))

print('pcs property set stonith-enabled=true')
jdata.close()