From 7a82904d619cefb7e786e3450bb1c4d1ae75c390 Mon Sep 17 00:00:00 2001
From: Alexandra
Date: Thu, 28 Apr 2016 15:23:12 +1000
Subject: [PATCH] Docs: Ops section - cleanup

As per discussion in the OSA docs summit session, clean up the
installation guide. This fixes typos, makes minor RST markup changes,
and removes passive voice.

Change-Id: Ibacaabddafee465a05bcb6eec01dd3ef04b33826
---
 .../install-guide/ops-addcomputehost.rst      |  7 +-
 .../install-guide/ops-galera-recovery.rst     | 76 +++++++++----------
 .../install-guide/ops-galera-remove.rst       |  3 +-
 doc/source/install-guide/ops-galera-start.rst | 10 +--
 doc/source/install-guide/ops-galera.rst       |  7 +-
 doc/source/install-guide/ops-logging.rst      | 17 +++--
 .../install-guide/ops-troubleshooting.rst     | 23 ++++--
 doc/source/install-guide/ops.rst              |  3 +-
 8 files changed, 79 insertions(+), 67 deletions(-)

diff --git a/doc/source/install-guide/ops-addcomputehost.rst b/doc/source/install-guide/ops-addcomputehost.rst
index bcb5cc368b..5f6ce63849 100644
--- a/doc/source/install-guide/ops-addcomputehost.rst
+++ b/doc/source/install-guide/ops-addcomputehost.rst
@@ -1,7 +1,8 @@
 `Home `_ OpenStack-Ansible Installation Guide

+=====================
 Adding a compute host
----------------------
+=====================

 Use the following procedure to add a compute host to an operational
 cluster.
@@ -14,8 +15,8 @@ cluster.

    If necessary, also modify the ``used_ips`` stanza.

-#. If the cluster is utilizing Ceilometer, it will be necessary to edit the
-   ``/etc/openstack_deploy/conf.d/ceilometer.yml`` file and add the host to
+#. If the cluster is utilizing Telemetry/Metering (Ceilometer),
+   edit the ``/etc/openstack_deploy/conf.d/ceilometer.yml`` file and add the host to
    the ``metering-compute_hosts`` stanza.

 #. Run the following commands to add the host. Replace
diff --git a/doc/source/install-guide/ops-galera-recovery.rst b/doc/source/install-guide/ops-galera-recovery.rst
index bede75a669..115a43d35b 100644
--- a/doc/source/install-guide/ops-galera-recovery.rst
+++ b/doc/source/install-guide/ops-galera-recovery.rst
@@ -1,12 +1,12 @@
 `Home `_ OpenStack-Ansible Installation Guide

+=======================
 Galera cluster recovery
------------------------
+=======================

-When one or all nodes fail within a galera cluster you may need to
-re-bootstrap the environment. To make take advantage of the
-automation Ansible provides simply execute the ``galera-install.yml``
-play using the **galera-bootstrap** to auto recover a node or an
+Run the ``galera-install.yml`` playbook using the ``galera-bootstrap``
+tag to automatically recover a node or an
 entire environment.

 #. Run the following Ansible command to show the failed nodes:

@@ -15,15 +15,13 @@ entire environment.

    .. code-block:: shell-session

       # openstack-ansible galera-install.yml --tags galera-bootstrap

-
-Upon completion of this command the cluster should be back online an in
-a functional state.
+The cluster comes back online after completion of this command.

 Single-node failure
 ~~~~~~~~~~~~~~~~~~~

-If a single node fails, the other nodes maintain quorum and continue to
-process SQL requests.
+If a single node fails, the other nodes maintain quorum and
+continue to process SQL requests.

 #. Run the following Ansible command to determine the failed node:

@@ -55,15 +53,15 @@ process SQL requests.

 #. Restart MariaDB on the failed node and verify that it rejoins the
    cluster.

-#. If MariaDB fails to start, run the **mysqld** command and perform
-   further analysis on the output. As a last resort, rebuild the
-   container for the node.
+#. If MariaDB fails to start, run the ``mysqld`` command and perform
+   further analysis on the output. As a last resort, rebuild the container
+   for the node.

 Multi-node failure
 ~~~~~~~~~~~~~~~~~~

-When all but one node fails, the remaining node cannot achieve quorum
-and stops processing SQL requests. In this situation, failed nodes that
+When all but one node fails, the remaining node cannot achieve quorum and
+stops processing SQL requests. In this situation, failed nodes that
 recover cannot join the cluster because it no longer exists.

 #. Run the following Ansible command to show the failed nodes:

@@ -92,7 +90,7 @@ recover cannot join the cluster because it no longer exists.

 #. Run the following command to
    `rebootstrap `_
-   the operational node into the cluster.
+   the operational node into the cluster:

    .. code-block:: shell-session

@@ -116,7 +114,7 @@ recover cannot join the cluster because it no longer exists.

    processing SQL requests.

 #. Restart MariaDB on the failed nodes and verify that they rejoin the
-   cluster.
+   cluster:

    .. code-block:: shell-session

@@ -144,16 +142,15 @@ recover cannot join the cluster because it no longer exists.

    wsrep_cluster_status Primary

 #. If MariaDB fails to start on any of the failed nodes, run the
-   **mysqld** command and perform further analysis on the output. As a
+   ``mysqld`` command and perform further analysis on the output. As a
    last resort, rebuild the container for the node.

 Complete failure
 ~~~~~~~~~~~~~~~~

-If all of the nodes in a Galera cluster fail (do not shutdown
-gracefully), then the integrity of the database can no longer be
-guaranteed and should be restored from backup. Run the following command
-to determine if all nodes in the cluster have failed:
+Restore from backup if all of the nodes in a Galera cluster fail (do not
+shut down gracefully). Run the following command to determine if all nodes
+in the cluster have failed:

 .. code-block:: shell-session

@@ -185,34 +182,35 @@

 nodes and all of the nodes contain a ``seqno`` value of -1.

 If any single node has a positive ``seqno`` value, then that node can be
 used to restart the cluster. However, because there is no guarantee that
-each node has an identical copy of the data, it is not recommended to
-restart the cluster using the **--wsrep-new-cluster** command on one
+each node has an identical copy of the data, we do not recommend
+restarting the cluster using the ``--wsrep-new-cluster`` command on one
 node.

 Rebuilding a container
 ~~~~~~~~~~~~~~~~~~~~~~

-Sometimes recovering from a failure requires rebuilding one or more
-containers.
+Recovering from certain failures requires rebuilding one or more containers.

 #. Disable the failed node on the load balancer.

-   Do not rely on the load balancer health checks to disable the node.
-   If the node is not disabled, the load balancer will send SQL requests
-   to it before it rejoins the cluster and cause data inconsistencies.
+   .. note::
+
+      Do not rely on the load balancer health checks to disable the node.
+      If the node is not disabled, the load balancer sends SQL requests
+      to it before it rejoins the cluster, which causes data inconsistencies.

-#. Use the following commands to destroy the container and remove
-   MariaDB data stored outside of the container. In this example, node 3
-   failed.
+#. Destroy the container and remove MariaDB data stored outside
+   of the container:

    .. code-block:: shell-session
       # lxc-stop -n node3_galera_container-3ea2cbd3
       # lxc-destroy -n node3_galera_container-3ea2cbd3
       # rm -rf /openstack/node3_galera_container-3ea2cbd3/*
+
+   In this example, node 3 failed.

-#. Run the host setup playbook to rebuild the container specifically on
-   node 3:
+#. Run the host setup playbook to rebuild the container on node 3:

    .. code-block:: shell-session

@@ -220,7 +218,7 @@ containers.

          -l node3_galera_container-3ea2cbd3

-   The playbook will also restart all other containers on the node.
+   The playbook restarts all other containers on the node.

 #. Run the infrastructure playbook to configure the container
    specifically on node 3:

    .. code-block:: shell-session

@@ -231,9 +229,11 @@

          -l node3_galera_container-3ea2cbd3

-   The new container runs a single-node Galera cluster, a dangerous
-   state because the environment contains more than one active database
-   with potentially different data.
+   .. warning::
+
+      The new container runs a single-node Galera cluster, which is a dangerous
+      state because the environment contains more than one active database
+      with potentially different data.

 .. code-block:: shell-session

diff --git a/doc/source/install-guide/ops-galera-remove.rst b/doc/source/install-guide/ops-galera-remove.rst
index c6f9cf6596..961753ae9b 100644
--- a/doc/source/install-guide/ops-galera-remove.rst
+++ b/doc/source/install-guide/ops-galera-remove.rst
@@ -1,7 +1,8 @@
 `Home `_ OpenStack-Ansible Installation Guide

+==============
 Removing nodes
---------------
+==============

 In the following example, all but one node was shut down gracefully:

diff --git a/doc/source/install-guide/ops-galera-start.rst b/doc/source/install-guide/ops-galera-start.rst
index 160e17c6c4..0c9c56b12d 100644
--- a/doc/source/install-guide/ops-galera-start.rst
+++ b/doc/source/install-guide/ops-galera-start.rst
@@ -1,15 +1,15 @@
 `Home `_ OpenStack-Ansible Installation Guide

+==================
 Starting a cluster
-------------------
+==================

 Gracefully shutting down all nodes destroys the cluster. Starting or
 restarting a cluster from zero nodes requires creating a new cluster on
 one of the nodes.

-#. The new cluster should be started on the most advanced node. Run the
-   following command to check the ``seqno`` value in the
-   ``grastate.dat`` file on all of the nodes:
+#. Start a new cluster on the most advanced node.
+   Check the ``seqno`` value in the ``grastate.dat`` file on all of the nodes:

    .. code-block:: shell-session

@@ -33,7 +33,7 @@ one of the nodes.

       cert_index:

    In this example, all nodes in the cluster contain the same positive
-   ``seqno`` values because they were synchronized just prior to
+   ``seqno`` values as they were synchronized just prior to
    graceful shutdown. If all ``seqno`` values are equal, any node can
    start the new cluster.

diff --git a/doc/source/install-guide/ops-galera.rst b/doc/source/install-guide/ops-galera.rst
index 5a74d1ff28..8bbbcee3ed 100644
--- a/doc/source/install-guide/ops-galera.rst
+++ b/doc/source/install-guide/ops-galera.rst
@@ -1,7 +1,8 @@
 `Home `_ OpenStack-Ansible Installation Guide

+==========================
 Galera cluster maintenance
---------------------------
+==========================

 .. toctree::

 Routine maintenance includes gracefully adding or removing nodes from
 the cluster without impacting operation and also starting a cluster
 after gracefully shutting down all nodes.
-MySQL instances are restarted when creating a cluster, adding a
-node, the service isn't running, or when changes are made to the
+MySQL instances are restarted when creating a cluster, when adding a
+node, when the service is not running, or when changes are made to the
 ``/etc/mysql/my.cnf`` configuration file.

 --------------

diff --git a/doc/source/install-guide/ops-logging.rst b/doc/source/install-guide/ops-logging.rst
index 2be3a007e0..2665631a69 100644
--- a/doc/source/install-guide/ops-logging.rst
+++ b/doc/source/install-guide/ops-logging.rst
@@ -1,15 +1,16 @@
 `Home `_ OpenStack-Ansible Installation Guide

-Centralized Logging
--------------------
+===================
+Centralized logging
+===================

-OpenStack-Ansible will configure all instances to send syslog data to a
-container (or group of containers) running rsyslog. The rsyslog server
+OpenStack-Ansible configures all instances to send syslog data to a
+container (or group of containers) running rsyslog. The rsyslog server
 containers are specified in the ``log_hosts`` section of the
 ``openstack_user_config.yml`` file.

 The rsyslog server container(s) have logrotate installed and configured with
-a 14 day retention. All rotated logs are compressed by default.
+a 14-day retention. All rotated logs are compressed by default.

 Finding logs
 ~~~~~~~~~~~~
@@ -18,10 +19,10 @@ Logs are accessible in multiple locations within an OpenStack-Ansible
 deployment:

 * The rsyslog server container collects logs in ``/var/log/log-storage`` within
-  directories named after the container or physical host
+  directories named after the container or physical host.
 * Each physical host has the logs from its service containers mounted at
-  ``/openstack/log/``
-* Each service container has its own logs stored at ``/var/log/``
+  ``/openstack/log/``.
+* Each service container has its own logs stored at ``/var/log/``.

 --------------

diff --git a/doc/source/install-guide/ops-troubleshooting.rst b/doc/source/install-guide/ops-troubleshooting.rst
index 7b873b148a..825b2cdded 100644
--- a/doc/source/install-guide/ops-troubleshooting.rst
+++ b/doc/source/install-guide/ops-troubleshooting.rst
@@ -13,8 +13,9 @@ All LXC containers on the host have two virtual Ethernet interfaces:
 * `eth1` in the container connects to `br-mgmt` on the host

 .. note::
-   Some containers, such as cinder, glance, neutron_agents, and
-   swift_proxy, have more than two interfaces to support their
+
+   Some containers, such as ``cinder``, ``glance``, ``neutron_agents``, and
+   ``swift_proxy``, have more than two interfaces to support their
    functions.

 Predictable interface naming
@@ -70,10 +71,15 @@ containers.

 Cached Ansible facts issues
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~

-At the beginning of a playbook run, information about each host, such
-as its Linux distribution, kernel version, and network interfaces, is
-gathered. To improve performance, particularly in larger deployments,
-these facts can be cached.
+At the beginning of a playbook run, information about each host is gathered.
+The information gathered includes:
+
+ * Linux distribution
+ * Kernel version
+ * Network interfaces
+
+To improve performance, particularly in large deployments, you can
+cache these host facts.

 OpenStack-Ansible enables fact caching by default. The facts are
 cached in JSON files within ``/etc/openstack_deploy/ansible_facts``.
@@ -87,8 +93,9 @@ documentation on `fact caching`_ for more details.
 Forcing regeneration of cached facts
 ------------------------------------

-If a host's kernel is upgraded or additional network interfaces or
-bridges are created on the host, its cached facts may be incorrect.
+Cached facts may be incorrect if the host receives a kernel upgrade or if
+new network interfaces or bridges are created on the host.
+
 This can lead to unexpected errors while running playbooks, and require
 that the cached facts be regenerated.

diff --git a/doc/source/install-guide/ops.rst b/doc/source/install-guide/ops.rst
index c0b9c07c4d..bbff979537 100644
--- a/doc/source/install-guide/ops.rst
+++ b/doc/source/install-guide/ops.rst
@@ -1,7 +1,8 @@
 `Home `_ OpenStack-Ansible Installation Guide

+=====================
 Chapter 8. Operations
----------------------
+=====================

 .. toctree::