From df1bd3b6e00748cbaf43cd7e4a075dbb5b1fed55 Mon Sep 17 00:00:00 2001 From: Alexandra Settle Date: Mon, 12 Dec 2016 16:29:44 +0000 Subject: [PATCH] [ops-guide] Adding new content to guide This patch is to simply create the ToC in the draft folder and to add pre-existing content. Edits will come later. Please review based on structure only. Change-Id: I9fa7754f3f6cdf1c861a146a93acf675c74f8a8b Implements: blueprint create-ops-guide --- .../advanced-config.rst | 340 +++++++++++++++++- .../draft-operations-guide/extending.rst | 306 ---------------- doc/source/draft-operations-guide/index.rst | 2 +- .../maintenance-tasks.rst | 12 +- .../maintenance-tasks/ansible-modules.rst | 6 + .../maintenance-tasks/backups.rst | 37 ++ .../maintenance-tasks/containers.rst | 194 ++++++++++ .../maintenance-tasks/firewalls.rst | 14 + .../galera.rst} | 152 +++++++- .../maintenance-tasks/managing-swift.rst | 78 ++++ .../maintenance-tasks/network-maintain.rst | 27 ++ .../maintenance-tasks/rabbitmq-maintain.rst | 24 ++ .../maintenance-tasks/scale-environment.rst | 234 ++++++++++++ .../monitor-environment.rst | 5 + .../monitoring-systems.rst | 11 + .../openstack-operations.rst | 6 +- .../access-environment.rst | 273 ++++++++++++++ .../openstack-operations/managing-images.rst | 125 +++++++ .../managing-instances.rst | 221 ++++++++++++ .../openstack-operations/network-service.rst | 41 +++ .../openstack-operations/verify-deploy.rst | 68 ++++ .../ops-add-computehost.rst | 29 -- .../ops-galera-remove.rst | 32 -- .../ops-galera-start.rst | 88 ----- .../draft-operations-guide/ops-galera.rst | 18 - .../ops-remove-computehost.rst | 49 --- .../draft-operations-guide/ops-tips.rst | 38 -- .../ops-troubleshooting.rst | 125 ------- .../draft-operations-guide/ref-info.rst | 59 ++- .../ref-info/ansible-scripts.rst | 21 ++ .../lxc-commands.rst} | 1 - .../troubleshooting.rst | 218 ++++++++++- .../draft-operations-guide/verify-deploy.rst | 6 - 33 files changed, 2141 insertions(+), 719 deletions(-) delete mode 100644 doc/source/draft-operations-guide/extending.rst create mode 100644 doc/source/draft-operations-guide/maintenance-tasks/ansible-modules.rst create mode 100644 doc/source/draft-operations-guide/maintenance-tasks/backups.rst create mode 100644 doc/source/draft-operations-guide/maintenance-tasks/containers.rst create mode 100644 doc/source/draft-operations-guide/maintenance-tasks/firewalls.rst rename doc/source/draft-operations-guide/{ops-galera-recovery.rst => maintenance-tasks/galera.rst} (67%) create mode 100644 doc/source/draft-operations-guide/maintenance-tasks/managing-swift.rst create mode 100644 doc/source/draft-operations-guide/maintenance-tasks/network-maintain.rst create mode 100644 doc/source/draft-operations-guide/maintenance-tasks/rabbitmq-maintain.rst create mode 100644 doc/source/draft-operations-guide/maintenance-tasks/scale-environment.rst create mode 100644 doc/source/draft-operations-guide/monitor-environment/monitoring-systems.rst create mode 100644 doc/source/draft-operations-guide/openstack-operations/access-environment.rst create mode 100644 doc/source/draft-operations-guide/openstack-operations/managing-images.rst create mode 100644 doc/source/draft-operations-guide/openstack-operations/managing-instances.rst create mode 100644 doc/source/draft-operations-guide/openstack-operations/network-service.rst create mode 100644 doc/source/draft-operations-guide/openstack-operations/verify-deploy.rst delete mode 100644 doc/source/draft-operations-guide/ops-add-computehost.rst delete mode 100644 
doc/source/draft-operations-guide/ops-galera-remove.rst delete mode 100644 doc/source/draft-operations-guide/ops-galera-start.rst delete mode 100644 doc/source/draft-operations-guide/ops-galera.rst delete mode 100644 doc/source/draft-operations-guide/ops-remove-computehost.rst delete mode 100644 doc/source/draft-operations-guide/ops-tips.rst delete mode 100644 doc/source/draft-operations-guide/ops-troubleshooting.rst create mode 100644 doc/source/draft-operations-guide/ref-info/ansible-scripts.rst rename doc/source/draft-operations-guide/{ops-lxc-commands.rst => ref-info/lxc-commands.rst} (99%) delete mode 100644 doc/source/draft-operations-guide/verify-deploy.rst diff --git a/doc/source/draft-operations-guide/advanced-config.rst b/doc/source/draft-operations-guide/advanced-config.rst index dd522a5b7e..0779e34333 100644 --- a/doc/source/draft-operations-guide/advanced-config.rst +++ b/doc/source/draft-operations-guide/advanced-config.rst @@ -2,11 +2,339 @@ Advanced configuration ====================== -This is a draft advanced configuration page for the proposed -OpenStack-Ansible operations guide. +The OpenStack-Ansible project provides a basic OpenStack environment, but +many deployers will wish to extend the environment based on their needs. This +could include installing extra services, changing package versions, or +overriding existing variables. -.. toctree:: - :maxdepth: 2 +Using these extension points, deployers can provide a more 'opinionated' +installation of OpenStack that may include their own software. - extending.rst - ops-tips.rst +Including OpenStack-Ansible in your project +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Including the openstack-ansible repository within another project can be +done in several ways: + +- A git submodule pointed to a released tag. +- A script to automatically perform a git checkout of Openstack-Ansible. + +When including OpenStack-Ansible in a project, consider using a parallel +directory structure as shown in the ``ansible.cfg`` files section. + +Also note that copying files into directories such as ``env.d`` or +``conf.d`` should be handled via some sort of script within the extension +project. + +Ansible forks +~~~~~~~~~~~~~ + +The default MaxSessions setting for the OpenSSH Daemon is 10. Each Ansible +fork makes use of a Session. By default, Ansible sets the number of forks to +5. However, you can increase the number of forks used in order to improve +deployment performance in large environments. + +Note that more than 10 forks will cause issues for any playbooks +which use ``delegate_to`` or ``local_action`` in the tasks. It is +recommended that the number of forks are not raised when executing against the +Control Plane, as this is where delegation is most often used. + +The number of forks used may be changed on a permanent basis by including +the appropriate change to the ``ANSIBLE_FORKS`` in your ``.bashrc`` file. +Alternatively it can be changed for a particular playbook execution by using +the ``--forks`` CLI parameter. For example, the following executes the nova +playbook against the control plane with 10 forks, then against the compute +nodes with 50 forks. + +.. 
code-block:: shell-session + + # openstack-ansible --forks 10 os-nova-install.yml --limit compute_containers + # openstack-ansible --forks 50 os-nova-install.yml --limit compute_hosts + +For more information about forks, please see the following references: + +* OpenStack-Ansible `Bug 1479812`_ +* Ansible `forks`_ entry for ansible.cfg +* `Ansible Performance Tuning`_ + +.. _Bug 1479812: https://bugs.launchpad.net/openstack-ansible/+bug/1479812 +.. _forks: http://docs.ansible.com/ansible/intro_configuration.html#forks +.. _Ansible Performance Tuning: https://www.ansible.com/blog/ansible-performance-tuning + +ansible.cfg files +~~~~~~~~~~~~~~~~~ + +You can create your own playbook, variable, and role structure while still +including the OpenStack-Ansible roles and libraries by putting an +``ansible.cfg`` file in your ``playbooks`` directory. + +The relevant options for Ansible 1.9 (included in OpenStack-Ansible) +are as follows: + + ``library`` + This variable should point to + ``openstack-ansible/playbooks/library``. Doing so allows roles and + playbooks to access OpenStack-Ansible's included Ansible modules. + ``roles_path`` + This variable should point to + ``openstack-ansible/playbooks/roles``. This allows Ansible to + properly look up any OpenStack-Ansible roles that extension roles + may reference. + ``inventory`` + This variable should point to + ``openstack-ansible/playbooks/inventory``. With this setting, + extensions have access to the same dynamic inventory that + OpenStack-Ansible uses. + +Note that the paths to the ``openstack-ansible`` top level directory can be +relative in this file. + +Consider this directory structure:: + + my_project + | + |- custom_stuff + | | + | |- playbooks + |- openstack-ansible + | | + | |- playbooks + +The variables in ``my_project/custom_stuff/playbooks/ansible.cfg`` would use +``../openstack-ansible/playbooks/``. + + +env.d +~~~~~ + +The ``/etc/openstack_deploy/env.d`` directory sources all YAML files into the +deployed environment, allowing a deployer to define additional group mappings. + +This directory is used to extend the environment skeleton, or modify the +defaults defined in the ``playbooks/inventory/env.d`` directory. + +See also `Understanding Container Groups`_ in Appendix C. + +.. _Understanding Container Groups: ../install-guide/app-custom-layouts.html#understanding-container-groups + +conf.d +~~~~~~ + +Common OpenStack services and their configuration are defined by +OpenStack-Ansible in the +``/etc/openstack_deploy/openstack_user_config.yml`` settings file. + +Additional services should be defined with a YAML file in +``/etc/openstack_deploy/conf.d``, in order to manage file size. + +See also `Understanding Host Groups`_ in Appendix C. + +.. _Understanding Host Groups: ../install-guide/app-custom-layouts.html#understanding-host-groups + +user_*.yml files +~~~~~~~~~~~~~~~~ + +Files in ``/etc/openstack_deploy`` beginning with ``user_`` will be +automatically sourced in any ``openstack-ansible`` command. Alternatively, +the files can be sourced with the ``-e`` parameter of the ``ansible-playbook`` +command. + +``user_variables.yml`` and ``user_secrets.yml`` are used directly by +OpenStack-Ansible. Adding custom variables used by your own roles and +playbooks to these files is not recommended. Doing so will complicate your +upgrade path by making comparison of your existing files with later versions +of these files more arduous. 
Rather, recommended practice is to place your own +variables in files named following the ``user_*.yml`` pattern so they will be +sourced alongside those used exclusively by OpenStack-Ansible. + +Ordering and precedence +----------------------- + +``user_*.yml`` variables are just YAML variable files. They will be sourced +in alphanumeric order by ``openstack-ansible``. + +.. _adding-galaxy-roles: + +Adding Galaxy roles +~~~~~~~~~~~~~~~~~~~ + +Any roles defined in ``openstack-ansible/ansible-role-requirements.yml`` +will be installed by the +``openstack-ansible/scripts/bootstrap-ansible.sh`` script. + + +Setting overrides in configuration files +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +All of the services that use YAML, JSON, or INI for configuration can receive +overrides through the use of a Ansible action plugin named ``config_template``. +The configuration template engine allows a deployer to use a simple dictionary +to modify or add items into configuration files at run time that may not have a +preset template option. All OpenStack-Ansible roles allow for this +functionality where applicable. Files available to receive overrides can be +seen in the ``defaults/main.yml`` file as standard empty dictionaries (hashes). + +Practical guidance for using this feature is available in the `Install Guide`_. + +This module has been `submitted for consideration`_ into Ansible Core. + +.. _Install Guide: ../install-guide/app-advanced-config-override.html +.. _submitted for consideration: https://github.com/ansible/ansible/pull/12555 + + +Build the environment with additional python packages +----------------------------------------------------- + +The system will allow you to install and build any package that is a python +installable. The repository infrastructure will look for and create any +git based or PyPi installable package. When the package is built the repo-build +role will create the sources as Python wheels to extend the base system and +requirements. + +While the packages pre-built in the repository-infrastructure are +comprehensive, it may be needed to change the source locations and versions of +packages to suit different deployment needs. Adding additional repositories as +overrides is as simple as listing entries within the variable file of your +choice. Any ``user_.*.yml`` file within the "/etc/openstack_deployment" +directory will work to facilitate the addition of a new packages. + + +.. code-block:: yaml + + swift_git_repo: https://private-git.example.org/example-org/swift + swift_git_install_branch: master + + +Additional lists of python packages can also be overridden using a +``user_.*.yml`` variable file. + +.. code-block:: yaml + + swift_requires_pip_packages: + - virtualenv + - virtualenv-tools + - python-keystoneclient + - NEW-SPECIAL-PACKAGE + + +Once the variables are set call the play ``repo-build.yml`` to build all of the +wheels within the repository infrastructure. When ready run the target plays to +deploy your overridden source code. + + +Module documentation +-------------------- + +These are the options available as found within the virtual module +documentation section. + +.. 
code-block:: yaml + + module: config_template + version_added: 1.9.2 + short_description: > + Renders template files providing a create/update override interface + description: + - The module contains the template functionality with the ability to + override items in config, in transit, through the use of a simple + dictionary without having to write out various temp files on target + machines. The module renders all of the potential jinja a user could + provide in both the template file and in the override dictionary which + is ideal for deployers who may have lots of different configs using a + similar code base. + - The module is an extension of the **copy** module and all of attributes + that can be set there are available to be set here. + options: + src: + description: + - Path of a Jinja2 formatted template on the local server. This can + be a relative or absolute path. + required: true + default: null + dest: + description: + - Location to render the template to on the remote machine. + required: true + default: null + config_overrides: + description: + - A dictionary used to update or override items within a configuration + template. The dictionary data structure may be nested. If the target + config file is an ini file the nested keys in the ``config_overrides`` + will be used as section headers. + config_type: + description: + - A string value describing the target config type. + choices: + - ini + - json + - yaml + + +Example task using the config_template module +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. code-block:: yaml + + - name: Run config template ini + config_template: + src: test.ini.j2 + dest: /tmp/test.ini + config_overrides: {{ test_overrides }} + config_type: ini + + +Example overrides dictionary(hash) +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. code-block:: yaml + + test_overrides: + DEFAULT: + new_item: 12345 + + +Original template file test.ini.j2 +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. code-block:: ini + + [DEFAULT] + value1 = abc + value2 = 123 + + +Rendered on disk file /tmp/test.ini +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. code-block:: ini + + [DEFAULT] + value1 = abc + value2 = 123 + new_item = 12345 + + +In this task the ``test.ini.j2`` file is a template which will be rendered and +written to disk at ``/tmp/test.ini``. The **config_overrides** entry is a +dictionary(hash) which allows a deployer to set arbitrary data as overrides to +be written into the configuration file at run time. The **config_type** entry +specifies the type of configuration file the module will be interacting with; +available options are "yaml", "json", and "ini". + + +Discovering available overrides +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +All of these options can be specified in any way that suits your deployment. +In terms of ease of use and flexibility it's recommended that you define your +overrides in a user variable file such as +``/etc/openstack_deploy/user_variables.yml``. + +The list of overrides available may be found by executing: + +.. code-block:: bash + + find . 
-name "main.yml" -exec grep '_.*_overrides:' {} \; \ + | grep -v "^#" \ + | sort -u diff --git a/doc/source/draft-operations-guide/extending.rst b/doc/source/draft-operations-guide/extending.rst deleted file mode 100644 index 4ecaf5dc78..0000000000 --- a/doc/source/draft-operations-guide/extending.rst +++ /dev/null @@ -1,306 +0,0 @@ -=========================== -Extending OpenStack-Ansible -=========================== - -The OpenStack-Ansible project provides a basic OpenStack environment, but -many deployers will wish to extend the environment based on their needs. This -could include installing extra services, changing package versions, or -overriding existing variables. - -Using these extension points, deployers can provide a more 'opinionated' -installation of OpenStack that may include their own software. - -Including OpenStack-Ansible in your project -------------------------------------------- - -Including the openstack-ansible repository within another project can be -done in several ways. - - 1. A git submodule pointed to a released tag. - 2. A script to automatically perform a git checkout of - openstack-ansible - -When including OpenStack-Ansible in a project, consider using a parallel -directory structure as shown in the `ansible.cfg files`_ section. - -Also note that copying files into directories such as `env.d`_ or -`conf.d`_ should be handled via some sort of script within the extension -project. - -ansible.cfg files ------------------ - -You can create your own playbook, variable, and role structure while still -including the OpenStack-Ansible roles and libraries by putting an -``ansible.cfg`` file in your ``playbooks`` directory. - -The relevant options for Ansible 1.9 (included in OpenStack-Ansible) -are as follows: - - ``library`` - This variable should point to - ``openstack-ansible/playbooks/library``. Doing so allows roles and - playbooks to access OpenStack-Ansible's included Ansible modules. - ``roles_path`` - This variable should point to - ``openstack-ansible/playbooks/roles``. This allows Ansible to - properly look up any OpenStack-Ansible roles that extension roles - may reference. - ``inventory`` - This variable should point to - ``openstack-ansible/playbooks/inventory``. With this setting, - extensions have access to the same dynamic inventory that - OpenStack-Ansible uses. - -Note that the paths to the ``openstack-ansible`` top level directory can be -relative in this file. - -Consider this directory structure:: - - my_project - | - |- custom_stuff - | | - | |- playbooks - |- openstack-ansible - | | - | |- playbooks - -The variables in ``my_project/custom_stuff/playbooks/ansible.cfg`` would use -``../openstack-ansible/playbooks/``. - - -env.d ------ - -The ``/etc/openstack_deploy/env.d`` directory sources all YAML files into the -deployed environment, allowing a deployer to define additional group mappings. - -This directory is used to extend the environment skeleton, or modify the -defaults defined in the ``playbooks/inventory/env.d`` directory. - -See also `Understanding Container Groups`_ in Appendix C. - -.. _Understanding Container Groups: ../install-guide/app-custom-layouts.html#understanding-container-groups - -conf.d ------- - -Common OpenStack services and their configuration are defined by -OpenStack-Ansible in the -``/etc/openstack_deploy/openstack_user_config.yml`` settings file. - -Additional services should be defined with a YAML file in -``/etc/openstack_deploy/conf.d``, in order to manage file size. 
- -See also `Understanding Host Groups`_ in Appendix C. - -.. _Understanding Host Groups: ../install-guide/app-custom-layouts.html#understanding-host-groups - -user\_*.yml files ------------------ - -Files in ``/etc/openstack_deploy`` beginning with ``user_`` will be -automatically sourced in any ``openstack-ansible`` command. Alternatively, -the files can be sourced with the ``-e`` parameter of the ``ansible-playbook`` -command. - -``user_variables.yml`` and ``user_secrets.yml`` are used directly by -OpenStack-Ansible. Adding custom variables used by your own roles and -playbooks to these files is not recommended. Doing so will complicate your -upgrade path by making comparison of your existing files with later versions -of these files more arduous. Rather, recommended practice is to place your own -variables in files named following the ``user_*.yml`` pattern so they will be -sourced alongside those used exclusively by OpenStack-Ansible. - -Ordering and Precedence -+++++++++++++++++++++++ - -``user_*.yml`` variables are just YAML variable files. They will be sourced -in alphanumeric order by ``openstack-ansible``. - -.. _adding-galaxy-roles: - -Adding Galaxy roles -------------------- - -Any roles defined in ``openstack-ansible/ansible-role-requirements.yml`` -will be installed by the -``openstack-ansible/scripts/bootstrap-ansible.sh`` script. - - -Setting overrides in configuration files ----------------------------------------- - -All of the services that use YAML, JSON, or INI for configuration can receive -overrides through the use of a Ansible action plugin named ``config_template``. -The configuration template engine allows a deployer to use a simple dictionary -to modify or add items into configuration files at run time that may not have a -preset template option. All OpenStack-Ansible roles allow for this -functionality where applicable. Files available to receive overrides can be -seen in the ``defaults/main.yml`` file as standard empty dictionaries (hashes). - -Practical guidance for using this feature is available in the `Install Guide`_. - -This module has been `submitted for consideration`_ into Ansible Core. - -.. _Install Guide: ../install-guide/app-advanced-config-override.html -.. _submitted for consideration: https://github.com/ansible/ansible/pull/12555 - - -Build the environment with additional python packages -+++++++++++++++++++++++++++++++++++++++++++++++++++++ - -The system will allow you to install and build any package that is a python -installable. The repository infrastructure will look for and create any -git based or PyPi installable package. When the package is built the repo-build -role will create the sources as Python wheels to extend the base system and -requirements. - -While the packages pre-built in the repository-infrastructure are -comprehensive, it may be needed to change the source locations and versions of -packages to suit different deployment needs. Adding additional repositories as -overrides is as simple as listing entries within the variable file of your -choice. Any ``user_.*.yml`` file within the "/etc/openstack_deployment" -directory will work to facilitate the addition of a new packages. - - -.. code-block:: yaml - - swift_git_repo: https://private-git.example.org/example-org/swift - swift_git_install_branch: master - - -Additional lists of python packages can also be overridden using a -``user_.*.yml`` variable file. - -.. 
code-block:: yaml - - swift_requires_pip_packages: - - virtualenv - - virtualenv-tools - - python-keystoneclient - - NEW-SPECIAL-PACKAGE - - -Once the variables are set call the play ``repo-build.yml`` to build all of the -wheels within the repository infrastructure. When ready run the target plays to -deploy your overridden source code. - - -Module documentation -++++++++++++++++++++ - -These are the options available as found within the virtual module -documentation section. - -.. code-block:: yaml - - module: config_template - version_added: 1.9.2 - short_description: > - Renders template files providing a create/update override interface - description: - - The module contains the template functionality with the ability to - override items in config, in transit, through the use of a simple - dictionary without having to write out various temp files on target - machines. The module renders all of the potential jinja a user could - provide in both the template file and in the override dictionary which - is ideal for deployers who may have lots of different configs using a - similar code base. - - The module is an extension of the **copy** module and all of attributes - that can be set there are available to be set here. - options: - src: - description: - - Path of a Jinja2 formatted template on the local server. This can - be a relative or absolute path. - required: true - default: null - dest: - description: - - Location to render the template to on the remote machine. - required: true - default: null - config_overrides: - description: - - A dictionary used to update or override items within a configuration - template. The dictionary data structure may be nested. If the target - config file is an ini file the nested keys in the ``config_overrides`` - will be used as section headers. - config_type: - description: - - A string value describing the target config type. - choices: - - ini - - json - - yaml - - -Example task using the "config_template" module -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -.. code-block:: yaml - - - name: Run config template ini - config_template: - src: test.ini.j2 - dest: /tmp/test.ini - config_overrides: {{ test_overrides }} - config_type: ini - - -Example overrides dictionary(hash) -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -.. code-block:: yaml - - test_overrides: - DEFAULT: - new_item: 12345 - - -Original template file "test.ini.j2" -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -.. code-block:: ini - - [DEFAULT] - value1 = abc - value2 = 123 - - -Rendered on disk file "/tmp/test.ini" -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -.. code-block:: ini - - [DEFAULT] - value1 = abc - value2 = 123 - new_item = 12345 - - -In this task the ``test.ini.j2`` file is a template which will be rendered and -written to disk at ``/tmp/test.ini``. The **config_overrides** entry is a -dictionary(hash) which allows a deployer to set arbitrary data as overrides to -be written into the configuration file at run time. The **config_type** entry -specifies the type of configuration file the module will be interacting with; -available options are "yaml", "json", and "ini". - - -Discovering Available Overrides -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - -All of these options can be specified in any way that suits your deployment. -In terms of ease of use and flexibility it's recommended that you define your -overrides in a user variable file such as -``/etc/openstack_deploy/user_variables.yml``. - -The list of overrides available may be found by executing: - -.. code-block:: bash - - find . 
-name "main.yml" -exec grep '_.*_overrides:' {} \; \ - | grep -v "^#" \ - | sort -u diff --git a/doc/source/draft-operations-guide/index.rst b/doc/source/draft-operations-guide/index.rst index 6da4e3a30a..f3a29b5475 100644 --- a/doc/source/draft-operations-guide/index.rst +++ b/doc/source/draft-operations-guide/index.rst @@ -6,7 +6,7 @@ This is a draft index page for the proposed OpenStack-Ansible operations guide. .. toctree:: - :maxdepth: 2 + :maxdepth: 3 openstack-operations.rst maintenance-tasks.rst diff --git a/doc/source/draft-operations-guide/maintenance-tasks.rst b/doc/source/draft-operations-guide/maintenance-tasks.rst index a5eae2546e..fcd28a2f45 100644 --- a/doc/source/draft-operations-guide/maintenance-tasks.rst +++ b/doc/source/draft-operations-guide/maintenance-tasks.rst @@ -8,6 +8,12 @@ operations guide. .. toctree:: :maxdepth: 2 - ops-add-computehost.rst - ops-remove-computehost.rst - ops-galera.rst + maintenance-tasks/network-maintain.rst + maintenance-tasks/galera.rst + maintenance-tasks/rabbitmq-maintain.rst + maintenance-tasks/backups.rst + maintenance-tasks/scale-environment.rst + maintenance-tasks/ansible-modules.rst + maintenance-tasks/managing-swift.rst + maintenance-tasks/containers.rst + maintenance-tasks/firewalls.rst diff --git a/doc/source/draft-operations-guide/maintenance-tasks/ansible-modules.rst b/doc/source/draft-operations-guide/maintenance-tasks/ansible-modules.rst new file mode 100644 index 0000000000..485b6de5a6 --- /dev/null +++ b/doc/source/draft-operations-guide/maintenance-tasks/ansible-modules.rst @@ -0,0 +1,6 @@ +============================ +Running ad-hoc Ansible plays +============================ + +This is a draft ad-hoc plays page for the proposed OpenStack-Ansible +operations guide. diff --git a/doc/source/draft-operations-guide/maintenance-tasks/backups.rst b/doc/source/draft-operations-guide/maintenance-tasks/backups.rst new file mode 100644 index 0000000000..7bdcca6c11 --- /dev/null +++ b/doc/source/draft-operations-guide/maintenance-tasks/backups.rst @@ -0,0 +1,37 @@ +======= +Backups +======= + +This is a draft backups page for the proposed OpenStack-Ansible +operations guide. + +Checking for recent back ups +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Before adding new nodes to your OpenStack-Ansible environment, it is possible +to confirm that a recent back up resides inside the ``holland_backups`` +repository: + +#. Log in to the Infra host where the Galera service creates backups. + +#. Run the :command:``ls -ls`` command to view the contents of the + back up files: + + .. code:: + + -Infra01~#: ls -ls /openstack/backup/XXXX_galera_containe + +Backup of /etc/openstack_deploy +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Details about our inventory backup we are already doing, but also more +content can go there on how to backup and restore this. + +Backup of Galera data +~~~~~~~~~~~~~~~~~~~~~ + +Backup your environment +~~~~~~~~~~~~~~~~~~~~~~~ + +Backup procedure +---------------- diff --git a/doc/source/draft-operations-guide/maintenance-tasks/containers.rst b/doc/source/draft-operations-guide/maintenance-tasks/containers.rst new file mode 100644 index 0000000000..38f0c09fe7 --- /dev/null +++ b/doc/source/draft-operations-guide/maintenance-tasks/containers.rst @@ -0,0 +1,194 @@ +==================== +Container management +==================== + +With Ansible, the OpenStack installation process is entirely automated +using playbooks written in YAML. After installation, the settings +configured by the playbooks can be changed and modified. 
Services and
+containers can shift to accommodate certain environment requirements.
+Scaling services is achieved by adjusting services within containers, or
+adding new deployment groups. It is also possible to destroy containers
+if needed after changes and modifications are complete.
+
+Scale individual services
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Individual OpenStack services, and other open source project services,
+run within containers. It is possible to scale out these services by
+modifying the ``/etc/openstack_deploy/openstack_user_config.yml`` file.
+
+#. Open the ``/etc/openstack_deploy/openstack_user_config.yml``
+   file.
+
+#. Access the deployment groups section of the configuration file.
+   Underneath the deployment group name, add an affinity value line for
+   each container whose OpenStack services you want to scale:
+
+   .. code::
+
+     infra_hosts:
+       infra1:
+         ip: 10.10.236.100
+         # Rabbitmq
+         affinity:
+           galera_container: 1
+           rabbit_mq_container: 2
+
+   In this example, ``galera_container`` has a container value of one.
+   In practice, any containers that do not need adjustment can remain at
+   the default value of one, and should not be adjusted above or below
+   the value of one.
+
+   The affinity value for each container is set at one by default.
+   Adjust the affinity value to zero for situations where the OpenStack
+   services housed within a specific container will not be needed when
+   scaling out other required services.
+
+#. Update the container number listed under the ``affinity``
+   configuration to the desired number. The above example has
+   ``galera_container`` set at one and ``rabbit_mq_container`` at two,
+   which scales RabbitMQ services, but leaves Galera services fixed.
+
+#. Run the appropriate playbook commands after changing the
+   configuration to create the new containers, and install the
+   services.
+
+   For example, run the **openstack-ansible lxc-containers-create.yml
+   rabbitmq-install.yml** commands from the
+   ``openstack-ansible/playbooks`` repository to complete the scaling
+   process described in the example above:
+
+   .. code::
+
+     $ cd openstack-ansible/playbooks
+     $ openstack-ansible lxc-containers-create.yml rabbitmq-install.yml
+
+Scale services with new deployment groups
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+In any OpenStack environment installed with Ansible, deployment groups
+reside on specific nodes. A deployment group collects the containers
+that run specific OpenStack services and other open source project
+services.
+
+For example, the ``compute_hosts`` deployment group holds the
+``nova_compute_container``, which contains the
+``neutron_linuxbridge_agent`` and ``nova_compute`` OpenStack services.
+This deployment group resides on the compute node.
+
+Users can create new infrastructure nodes, and scale OpenStack services
+within containers, by generating new deployment groups. The process
+requires setting up a new deployment group inside the host
+configuration files.
+
+#. On the host machine, navigate to the directory where the
+   ``openstack_config`` file resides. This configuration file
+   defines which deployment groups are assigned to each node.
+
+#. Add a new deployment group to the configuration file. Adjust the
+   deployment group name, followed by the affinity values, within the
+   deployment group section of the ``openstack_config`` file to
+   scale services.
+
+   .. code::
+
+      compute_hosts
+      infra_hosts
+      identity_hosts
+      log_hosts
+      network_hosts
+      os-infra_hosts
+      repo-infra_hosts
+      shared-infra_hosts
+      storage-infra_hosts
+      storage_hosts
+      swift_hosts
+      swift-proxy_hosts
+
+#. Modify the ``openstack_config`` file, adding containers for the new
+   deployment group.
+
+#. Specify the required affinity levels. Add a zero value for any
+   OpenStack or open source services not needed that would ordinarily
+   run on the deployment group.
+
+   For example, to add a new deployment group with nova\_api and
+   cinder\_api services, reconfigure the ``openstack_config`` file:
+
+   .. code::
+
+      os-infra_hosts:
+        my_new_node:
+          ip: 3.4.5.6
+          affinity:
+            glance_container: 0
+            heat_apis_container: 0
+            heat_engine_container: 0
+            horizon_container: 0
+            nova_api_metadata_container: 0
+            nova_cert_container: 0
+            nova_conductor_container: 0
+            nova_scheduler_container: 0
+            nova_console_container: 0
+
+   ``my_new_node`` is the name for the new node. ``3.4.5.6`` is the IP
+   address assigned to the new node.
+
+#. As another example, a new deployment group that houses the
+   ``cinder_api`` would have the following values:
+
+   .. code::
+
+      storage-infra_hosts:
+        my_new_node:
+          ip: 3.4.5.6
+          affinity:
+            cinder_api_container: 0
+
+   The ``storage-infra_hosts`` group contains only the ``cinder_api``
+   services.
+
+Destroy and recreate containers
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Resolving some issues may require destroying a container, and rebuilding
+that container from the beginning. It is possible to destroy and
+re-create a container with the ``destroy-containers.yml`` and
+``build-containers.yml`` playbooks. These Ansible playbooks reside in the
+``openstack-ansible/playbooks`` repository.
+
+#. Navigate to the ``openstack-ansible`` directory.
+
+#. Run the **openstack-ansible destroy-containers.yml** command,
+   specifying any other playbooks to run and the container group to be
+   destroyed.
+
+   .. code::
+
+      $ openstack-ansible destroy-containers.yml \
+        build-containers.yml OTHER_PLAYS -e container_group="CONTAINER_NAME"
+
+#. Replace *``OTHER_PLAYS``* with the playbooks to run once the
+   containers are recreated, and replace *``CONTAINER_NAME``* with the
+   name of the container group to destroy and recreate.
+
+#. Change the load balancer configuration to match the newly recreated
+   container identity if needed.
+
+Archive a container
+~~~~~~~~~~~~~~~~~~~
+
+If a container experiences a problem and needs to be deactivated, it is
+possible to flag the container as inactive, and archive it in the
+``/tmp`` directory.
+
+#. Change into the playbooks directory.
+
+#. Run the **openstack-ansible** command with the **-e** argument, and
+   replace the *``HOST_NAME``* and *``CONTAINER_NAME``* options with the
+   applicable host and container names.
+
+   .. code::
+
+      $ openstack-ansible -e \
+        "host_group=HOST_NAME,container_name=CONTAINER_NAME" \
+        setup/archive-container.yml
+
+   By default, Ansible archives the container contents to the ``/tmp``
+   directory on the host machine.
diff --git a/doc/source/draft-operations-guide/maintenance-tasks/firewalls.rst b/doc/source/draft-operations-guide/maintenance-tasks/firewalls.rst
new file mode 100644
index 0000000000..ed7021ced0
--- /dev/null
+++ b/doc/source/draft-operations-guide/maintenance-tasks/firewalls.rst
@@ -0,0 +1,14 @@
+=========
+Firewalls
+=========
+
+This is a draft firewalls page for the proposed OpenStack-Ansible
+operations guide.
+
+.. TODO Describe general approaches to adding firewalls to OSA infrastructure.
+
+Finding ports used by an external IP address
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
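+
+On the load balancer host, one quick way to see which ports are exposed on
+the external VIP is to list the listening sockets bound to that address.
+This is a minimal sketch only; the VIP address shown is illustrative and
+will differ in your environment:
+
+.. code-block:: console
+
+   # ss -lntp | grep 172.29.236.10
+
+Cross-referencing the output with the HAProxy configuration (if deployed)
+shows which service each listening port belongs to.
+
+.. 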
TODO explain how to find the ports used by the external IP + (whether you deploy haproxy or not), and what are the default ports diff --git a/doc/source/draft-operations-guide/ops-galera-recovery.rst b/doc/source/draft-operations-guide/maintenance-tasks/galera.rst similarity index 67% rename from doc/source/draft-operations-guide/ops-galera-recovery.rst rename to doc/source/draft-operations-guide/maintenance-tasks/galera.rst index cb7dcd2892..55b5024979 100644 --- a/doc/source/draft-operations-guide/ops-galera-recovery.rst +++ b/doc/source/draft-operations-guide/maintenance-tasks/galera.rst @@ -1,6 +1,138 @@ -======================= +========================== +Galera cluster maintenance +========================== + + maintenance-tasks/ops-galera-recovery.rst + +Routine maintenance includes gracefully adding or removing nodes from +the cluster without impacting operation and also starting a cluster +after gracefully shutting down all nodes. + +MySQL instances are restarted when creating a cluster, when adding a +node, when the service is not running, or when changes are made to the +``/etc/mysql/my.cnf`` configuration file. + +Remove nodes +~~~~~~~~~~~~ + +In the following example, all but one node was shut down gracefully: + +.. code-block:: shell-session + + # ansible galera_container -m shell -a "mysql -h localhost \ + -e 'show status like \"%wsrep_cluster_%\";'" + node3_galera_container-3ea2cbd3 | FAILED | rc=1 >> + ERROR 2002 (HY000): Can't connect to local MySQL server + through socket '/var/run/mysqld/mysqld.sock' (2) + + node2_galera_container-49a47d25 | FAILED | rc=1 >> + ERROR 2002 (HY000): Can't connect to local MySQL server + through socket '/var/run/mysqld/mysqld.sock' (2) + + node4_galera_container-76275635 | success | rc=0 >> + Variable_name Value + wsrep_cluster_conf_id 7 + wsrep_cluster_size 1 + wsrep_cluster_state_uuid 338b06b0-2948-11e4-9d06-bef42f6c52f1 + wsrep_cluster_status Primary + + +Compare this example output with the output from the multi-node failure +scenario where the remaining operational node is non-primary and stops +processing SQL requests. Gracefully shutting down the MariaDB service on +all but one node allows the remaining operational node to continue +processing SQL requests. When gracefully shutting down multiple nodes, +perform the actions sequentially to retain operation. + +Start a cluster +~~~~~~~~~~~~~~~ + +Gracefully shutting down all nodes destroys the cluster. Starting or +restarting a cluster from zero nodes requires creating a new cluster on +one of the nodes. + +#. Start a new cluster on the most advanced node. + Check the ``seqno`` value in the ``grastate.dat`` file on all of the nodes: + + .. code-block:: shell-session + + # ansible galera_container -m shell -a "cat /var/lib/mysql/grastate.dat" + node2_galera_container-49a47d25 | success | rc=0 >> + # GALERA saved state version: 2.1 + uuid: 338b06b0-2948-11e4-9d06-bef42f6c52f1 + seqno: 31 + cert_index: + + node3_galera_container-3ea2cbd3 | success | rc=0 >> + # GALERA saved state version: 2.1 + uuid: 338b06b0-2948-11e4-9d06-bef42f6c52f1 + seqno: 31 + cert_index: + + node4_galera_container-76275635 | success | rc=0 >> + # GALERA saved state version: 2.1 + uuid: 338b06b0-2948-11e4-9d06-bef42f6c52f1 + seqno: 31 + cert_index: + + In this example, all nodes in the cluster contain the same positive + ``seqno`` values as they were synchronized just prior to + graceful shutdown. If all ``seqno`` values are equal, any node can + start the new cluster. + + .. 
code-block:: shell-session + + # /etc/init.d/mysql start --wsrep-new-cluster + + This command results in a cluster containing a single node. The + ``wsrep_cluster_size`` value shows the number of nodes in the + cluster. + + .. code-block:: shell-session + + node2_galera_container-49a47d25 | FAILED | rc=1 >> + ERROR 2002 (HY000): Can't connect to local MySQL server + through socket '/var/run/mysqld/mysqld.sock' (111) + + node3_galera_container-3ea2cbd3 | FAILED | rc=1 >> + ERROR 2002 (HY000): Can't connect to local MySQL server + through socket '/var/run/mysqld/mysqld.sock' (2) + + node4_galera_container-76275635 | success | rc=0 >> + Variable_name Value + wsrep_cluster_conf_id 1 + wsrep_cluster_size 1 + wsrep_cluster_state_uuid 338b06b0-2948-11e4-9d06-bef42f6c52f1 + wsrep_cluster_status Primary + +#. Restart MariaDB on the other nodes and verify that they rejoin the + cluster. + + .. code-block:: shell-session + + node2_galera_container-49a47d25 | success | rc=0 >> + Variable_name Value + wsrep_cluster_conf_id 3 + wsrep_cluster_size 3 + wsrep_cluster_state_uuid 338b06b0-2948-11e4-9d06-bef42f6c52f1 + wsrep_cluster_status Primary + + node3_galera_container-3ea2cbd3 | success | rc=0 >> + Variable_name Value + wsrep_cluster_conf_id 3 + wsrep_cluster_size 3 + wsrep_cluster_state_uuid 338b06b0-2948-11e4-9d06-bef42f6c52f1 + wsrep_cluster_status Primary + + node4_galera_container-76275635 | success | rc=0 >> + Variable_name Value + wsrep_cluster_conf_id 3 + wsrep_cluster_size 3 + wsrep_cluster_state_uuid 338b06b0-2948-11e4-9d06-bef42f6c52f1 + wsrep_cluster_status Primary + Galera cluster recovery -======================= +~~~~~~~~~~~~~~~~~~~~~~~ Run the ``galera-bootstrap`` playbook to automatically recover a node or an entire environment. Run the ``galera install`` playbook @@ -15,8 +147,8 @@ entire environment. The cluster comes back online after completion of this command. -Single-node failure -~~~~~~~~~~~~~~~~~~~ +Recover a single-node failure +----------------------------- If a single node fails, the other nodes maintain quorum and continue to process SQL requests. @@ -55,8 +187,8 @@ continue to process SQL requests. further analysis on the output. As a last resort, rebuild the container for the node. -Multi-node failure -~~~~~~~~~~~~~~~~~~ +Recover a multi-node failure +---------------------------- When all but one node fails, the remaining node cannot achieve quorum and stops processing SQL requests. In this situation, failed nodes that @@ -143,8 +275,8 @@ recover cannot join the cluster because it no longer exists. ``mysqld`` command and perform further analysis on the output. As a last resort, rebuild the container for the node. -Complete failure -~~~~~~~~~~~~~~~~ +Recover a complete environment failure +-------------------------------------- Restore from backup if all of the nodes in a Galera cluster fail (do not shutdown gracefully). Run the following command to determine if all nodes in @@ -184,8 +316,8 @@ each node has an identical copy of the data, we do not recommend to restart the cluster using the ``--wsrep-new-cluster`` command on one node. -Rebuilding a container -~~~~~~~~~~~~~~~~~~~~~~ +Rebuild a container +------------------- Recovering from certain failures require rebuilding one or more containers. 
diff --git a/doc/source/draft-operations-guide/maintenance-tasks/managing-swift.rst b/doc/source/draft-operations-guide/maintenance-tasks/managing-swift.rst new file mode 100644 index 0000000000..a9c015f48f --- /dev/null +++ b/doc/source/draft-operations-guide/maintenance-tasks/managing-swift.rst @@ -0,0 +1,78 @@ +============================================ +Managing Object Storage for multiple regions +============================================ + +This is a draft Object Storage page for the proposed OpenStack-Ansible +operations guide. + +Failovers for multi-region Object Storage +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +In multi-region Object Storage, objects are retrievable from an +alternate location if the default location becomes unavailable. + +.. important:: + + It is recommended to perform the following steps before a failure + occurs to avoid having to dump and restore the database. + + If a failure does occur, follow these steps to restore the database + from the Primary (failed) Region: + +#. Record the Primary Region output of the ``default_project_id`` for + the specified user from the user table in the keystone database: + + .. note:: + + The user is ``admin`` in this example. + + .. code:: + + # mysql -e "SELECT default_project_id from keystone.user WHERE \ + name='admin';" + + +----------------------------------+ + | default_project_id | + +----------------------------------+ + | 76ef6df109744a03b64ffaad2a7cf504 | + +-----------------—————————————————+ + + +#. Record the Secondary Region output of the ``default_project_id`` + for the specified user from the user table in the keystone + database: + + .. code:: + + # mysql -e "SELECT default_project_id from keystone.user WHERE \ + name='admin';" + + +----------------------------------+ + | default_project_id | + +----------------------------------+ + | 69c46f8ad1cf4a058aa76640985c | + +----------------------------------+ + +#. In the Secondary Region, update the references to the + ``project_id`` to match the ID from the Primary Region: + + .. code:: + + # export PRIMARY_REGION_TENANT_ID="76ef6df109744a03b64ffaad2a7cf504" + # export SECONDARY_REGION_TENANT_ID="69c46f8ad1cf4a058aa76640985c" + + # mysql -e "UPDATE keystone.assignment set \ + target_id='${PRIMARY_REGION_TENANT_ID}' \ + WHERE target_id='${SECONDARY_REGION_TENANT_ID}';" + + # mysql -e "UPDATE keystone.user set \ + default_project_id='${PRIMARY_REGION_TENANT_ID}' WHERE \ + default_project_id='${SECONDARY_REGION_TENANT_ID}';" + + # mysql -e "UPDATE keystone.project set \ + id='${PRIMARY_REGION_TENANT_ID}' WHERE \ + id='${SECONDARY_REGION_TENANT_ID}';" + +The user in the Secondary Region now has access to objects PUT in the +Primary Region. The Secondary Region can PUT objects accessible by the +user in the Primary Region. diff --git a/doc/source/draft-operations-guide/maintenance-tasks/network-maintain.rst b/doc/source/draft-operations-guide/maintenance-tasks/network-maintain.rst new file mode 100644 index 0000000000..6794c0bec7 --- /dev/null +++ b/doc/source/draft-operations-guide/maintenance-tasks/network-maintain.rst @@ -0,0 +1,27 @@ +====================== +Networking maintenance +====================== + +This is a draft networking maintenance page for the proposed OpenStack-Ansible +operations guide. 
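+
+Before changing container networking, it is worth checking which interfaces
+a container currently has. The following is a minimal sketch only, using the
+``CONTAINER_NAME`` placeholder for an actual container name:
+
+.. code-block:: console
+
+   # lxc-ls -f
+   # lxc-attach -n CONTAINER_NAME -- ip addr
+
+The same commands are useful for verifying the result after an interface has
+been added or removed.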
+ +Add network interfaces to LXC containers +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Remove network interfaces from LXC containers +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Add provider bridges using new NICs +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Remove provider bridges +~~~~~~~~~~~~~~~~~~~~~~~ + +Move from Open vSwitch to LinuxBridge and vice versa +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Restart a neutron agent container +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Reference to networking guide/ops guide content +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ diff --git a/doc/source/draft-operations-guide/maintenance-tasks/rabbitmq-maintain.rst b/doc/source/draft-operations-guide/maintenance-tasks/rabbitmq-maintain.rst new file mode 100644 index 0000000000..f26700b344 --- /dev/null +++ b/doc/source/draft-operations-guide/maintenance-tasks/rabbitmq-maintain.rst @@ -0,0 +1,24 @@ +============================ +RabbitMQ cluster maintenance +============================ + +This is a draft RabbitMQ cluster maintenance page for the proposed +OpenStack-Ansible operations guide. + +Create a RabbitMQ cluster +~~~~~~~~~~~~~~~~~~~~~~~~~ + +Check the RabbitMQ cluster status +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Stop and restart a RabbitMQ cluster +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +RabbitMQ and mnesia +~~~~~~~~~~~~~~~~~~~ + +Repair a partitioned RabbitMQ cluster for a single-node +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Repair a partitioned RabbitMQ cluster for a multi-node cluster +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ diff --git a/doc/source/draft-operations-guide/maintenance-tasks/scale-environment.rst b/doc/source/draft-operations-guide/maintenance-tasks/scale-environment.rst new file mode 100644 index 0000000000..be9713c640 --- /dev/null +++ b/doc/source/draft-operations-guide/maintenance-tasks/scale-environment.rst @@ -0,0 +1,234 @@ +======================== +Scaling your environment +======================== + +This is a draft environment scaling page for the proposed OpenStack-Ansible +operations guide. + +Add a new infrastructure node +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +While three infrastructure hosts are recommended, if further hosts are +needed in an environment, it is possible to create additional nodes. + +.. warning:: + + Make sure you back up your current OpenStack environment + before adding any new nodes. See :ref:`backing-up` for more + information. + +#. Add the node to the ``infra_hosts`` stanza of the + ``/etc/openstack_deploy/openstack_user_config.yml`` + + .. code:: console + + infra_hosts: + [...] + NEW_infra: + ip: 10.17.136.32 + NEW_infra: + ip: 10.17.136.33 + +#. Run the ``add_host`` playbook on the deployment host. + + .. code:: console + + # cd /opt/openstack-ansible/playbooks + +#. Update the inventory to add new hosts. Make sure new rsyslog + container names are updated. Send the updated results to ``dev/null``. + + .. code:: console + + # /opt/rpc-openstack/openstack-ansible/playbooks/inventory/dynamic_inventory.py > /dev/null + +#. Create the ``/root/add_host.limit`` file, which contains all new node + host names. + + .. code:: console + + # /opt/rpc-openstack/openstack-ansible/scripts/inventory-manage.py \ + -f /opt/rpc-openstack/openstack-ansible/playbooks/inventory/dynamic_inventory.py \ + -l |awk -F\| '// {print $2}' |sort -u | tee /root/add_host.limit + +#. Run the ``setup-everything.yml`` playbook with the + ``limit`` argument. + + .. 
warning:: + + Do not run the ``setup-everything.yml`` playbook + without the ``--limit`` argument. Without ``--limit``, the + playbook will restart all containers inside your environment. + + .. code:: console + + # openstack-ansible setup-everything.yml --limit @/root/add_host.limit + # openstack-ansible --tags=openstack-host-hostfile setup-hosts.yml + +#. Run the rpc-support playbooks. + + .. code:: console + + # ( cd /opt/rpc-openstack/rpcd/playbooks ; openstack-ansible rpc-support.yml ) + +#. Generate a new impersonation token, and add that token after the + `maas_auth_token` variable in the ``user_rpco_variables_overrides.yml`` + file. + + .. code:: console + + - Update maas_auth_token /etc/openstack_deploy/user_rpco_variables_overrides.yml + +#. Run the MaaS playbook on the deployment host. + + .. code:: console + + # ( cd /opt/rpc-openstack/rpcd/playbooks ; openstack-ansible setup-maas.yml + --limit @/root/add_host.limit ) + +Test new nodes +~~~~~~~~~~~~~~ + +After creating a new node, test that the node runs correctly by +launching a new instance. Ensure that the new node can respond to +a networking connection test through the :command:`ping` command. +Log in to your monitoring system, and verify that the monitors +return a green signal for the new node. + +Add a compute host +~~~~~~~~~~~~~~~~~~ + +Use the following procedure to add a compute host to an operational +cluster. + +#. Configure the host as a target host. See `Prepare target hosts + `_ + for more information. + +#. Edit the ``/etc/openstack_deploy/openstack_user_config.yml`` file and + add the host to the ``compute_hosts`` stanza. + + If necessary, also modify the ``used_ips`` stanza. + +#. If the cluster is utilizing Telemetry/Metering (Ceilometer), + edit the ``/etc/openstack_deploy/conf.d/ceilometer.yml`` file and add the + host to the ``metering-compute_hosts`` stanza. + +#. Run the following commands to add the host. Replace + ``NEW_HOST_NAME`` with the name of the new host. + + .. code-block:: shell-session + + # cd /opt/openstack-ansible/playbooks + # openstack-ansible setup-hosts.yml --limit NEW_HOST_NAME + # openstack-ansible setup-openstack.yml --skip-tags nova-key-distribute --limit NEW_HOST_NAME + # openstack-ansible setup-openstack.yml --tags nova-key --limit compute_hosts + +Remove a compute host +~~~~~~~~~~~~~~~~~~~~~ + +The `openstack-ansible-ops `_ +repository contains a playbook for removing a compute host from an +OpenStack-Ansible environment. +To remove a compute host, follow the below procedure. + +.. note:: + + This guide describes how to remove a compute node from an OSA environment + completely. Perform these steps with caution, as the compute node will no + longer be in service after the steps have been completed. This guide assumes + that all data and instances have been properly migrated. + +#. Disable all OpenStack services running on the compute node. + This can include, but is not limited to, the ``nova-compute`` service + and the neutron agent service. + + .. note:: + + Ensure this step is performed first + + .. code-block:: console + + # Run these commands on the compute node to be removed + # stop nova-compute + # stop neutron-linuxbridge-agent + +#. Clone the ``openstack-ansible-ops`` repository to your deployment host: + + .. code-block:: console + + $ git clone https://git.openstack.org/openstack/openstack-ansible-ops \ + /opt/openstack-ansible-ops + +#. Run the ``remove_compute_node.yml`` Ansible playbook with the + ``node_to_be_removed`` user variable set: + + .. 
code-block:: console
+
+      $ cd /opt/openstack-ansible-ops/ansible_tools/playbooks
+      $ openstack-ansible remove_compute_node.yml \
+      -e node_to_be_removed=""
+
+#. After the playbook completes, remove the compute node from the
+   OpenStack-Ansible configuration file in
+   ``/etc/openstack_deploy/openstack_user_config.yml``.
+
+Recover a Compute node failure
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The following procedure addresses Compute node failure if shared storage
+is used.
+
+.. note::
+
+   If shared storage is not used, data can be copied from the
+   ``/var/lib/nova/instances`` directory on the failed Compute node
+   ``${FAILED_NODE}`` to another node ``${RECEIVING_NODE}`` before
+   performing the following procedure. Please note this method is
+   not supported.
+
+#. Re-launch all instances on the failed node.
+
+#. Invoke the MySQL command line tool.
+
+#. Generate a list of instance UUIDs hosted on the failed node:
+
+   .. code::
+
+      mysql> select uuid from instances where host = '${FAILED_NODE}' and deleted = 0;
+
+#. Set instances on the failed node to be hosted on a different node:
+
+   .. code::
+
+      mysql> update instances set host ='${RECEIVING_NODE}' where host = '${FAILED_NODE}' \
+      and deleted = 0;
+
+#. Reboot each instance on the failed node listed in the previous query
+   to regenerate the XML files:
+
+   .. code::
+
+      # nova reboot --hard $INSTANCE_UUID
+
+#. Query the volumes to check that the instance has successfully booted
+   and is at the login prompt:
+
+   .. code::
+
+      mysql> select nova.instances.uuid as instance_uuid, cinder.volumes.id \
+      as volume_uuid, cinder.volumes.status, cinder.volumes.attach_status, \
+      cinder.volumes.mountpoint, cinder.volumes.display_name from \
+      cinder.volumes inner join nova.instances on cinder.volumes.instance_uuid=nova.instances.uuid \
+      where nova.instances.host = '${FAILED_NODE}';
+
+#. If rows are found, detach and re-attach the volumes using the values
+   listed in the previous query:
+
+   .. code::
+
+      # nova volume-detach $INSTANCE_UUID $VOLUME_UUID && \
+      nova volume-attach $INSTANCE_UUID $VOLUME_UUID $VOLUME_MOUNTPOINT
+
+#. Rebuild or replace the failed node as described in `Adding a Compute
+   node `_
diff --git a/doc/source/draft-operations-guide/monitor-environment.rst b/doc/source/draft-operations-guide/monitor-environment.rst
index 3fd55fb55d..ae353115a5 100644
--- a/doc/source/draft-operations-guide/monitor-environment.rst
+++ b/doc/source/draft-operations-guide/monitor-environment.rst
@@ -4,3 +4,8 @@ Monitoring your environment
 
 This is a draft monitoring environment page for the proposed
 OpenStack-Ansible operations guide.
+
+.. toctree::
+   :maxdepth: 2
+
+   monitor-environment/monitoring-systems.rst
diff --git a/doc/source/draft-operations-guide/monitor-environment/monitoring-systems.rst b/doc/source/draft-operations-guide/monitor-environment/monitoring-systems.rst
new file mode 100644
index 0000000000..9508092705
--- /dev/null
+++ b/doc/source/draft-operations-guide/monitor-environment/monitoring-systems.rst
@@ -0,0 +1,11 @@
+=======================================================
+Integrate OpenStack-Ansible into your monitoring system
+=======================================================
+
+This is a draft monitoring system page for the proposed OpenStack-Ansible
+operations guide.
+
+
+.. TODO monitoring, at a high level, describe how to monitor the services,
+   and how haproxy currently checks system health (because it can influence
+   the monitoring process, and people may not be aware of the internals).
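+
+As a starting point, HAProxy's own view of backend health can be queried
+through its stats socket. This is a minimal sketch only; it assumes the
+admin stats socket is enabled at ``/var/run/haproxy.stat``, which may
+differ in your deployment:
+
+.. code-block:: console
+
+   # echo "show stat" | socat stdio /var/run/haproxy.stat | \
+     cut -d, -f1,2,18 | column -s, -t
+
+The third column of the output is the ``status`` field (``UP``, ``DOWN``,
+or ``MAINT``) that HAProxy uses when deciding whether to send traffic to a
+backend, so it is a useful signal to feed into an external monitoring
+system.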
diff --git a/doc/source/draft-operations-guide/openstack-operations.rst b/doc/source/draft-operations-guide/openstack-operations.rst index 4fea4340da..d3bd0f13e7 100644 --- a/doc/source/draft-operations-guide/openstack-operations.rst +++ b/doc/source/draft-operations-guide/openstack-operations.rst @@ -8,4 +8,8 @@ operations guide. .. toctree:: :maxdepth: 2 - verify-deploy.rst + openstack-operations/verify-deploy.rst + openstack-operations/access-environment.rst + openstack-operations/managing-images.rst + openstack-operations/managing-instances.rst + openstack-operations/network-service.rst diff --git a/doc/source/draft-operations-guide/openstack-operations/access-environment.rst b/doc/source/draft-operations-guide/openstack-operations/access-environment.rst new file mode 100644 index 0000000000..e8adba6a75 --- /dev/null +++ b/doc/source/draft-operations-guide/openstack-operations/access-environment.rst @@ -0,0 +1,273 @@ +========================== +Accessing your environment +========================== + +Viewing and setting environment variables +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +To connect to the OpenStack installation using command line clients, you must +set the appropriate environment variables. OpenStack clients use environment +variables to provide the information necessary to authenticate to the cloud. +Variables can be viewed and downloaded from the Dashboard and set in the +``admin-openrc.sh`` file on the controller node. + +#. Log in to the Dashboard as the ``admin`` user. + +#. Select the **Compute** tab in the **Project** section of the + navigation pane, then click **Access & Security**. + +#. Select the **API Access** tab, then click the **Download OpenStack RC + File** button. Save the ``admin-openrc.sh`` file to your local system. + +#. Open the ``admin-openrc.sh`` file in a text editor. The file will + display: + + .. important:: + + By default, the ``openrc`` file contains administrative credentials. + It is automatically maintained by the system, and should not by + edited by hand. + + .. code:: + + #!/bin/bash + + # To use an Openstack cloud you need to authenticate against keystone, which + # returns a **Token** and **Service Catalog**. The catalog contains the + # endpoint for all services the user/tenant has access to - including nova, + # glance, keystone, swift. + # + # *NOTE*: Using the 2.0 *auth api* does not mean that compute api is 2.0.We + # will use the 1.1 *compute api* + export OS_AUTH_URL=http://192.168.0.7:5000/v2.0 + + # With the addition of Keystone we have standardized on the term **tenant** + # as the entity that owns the resources. + export OS_TENANT_ID=25da08e142e24f55a9b27044bc0bdf4e + export OS_TENANT_NAME="admin" + + # In addition to the owning entity (tenant), OpenStack stores the entity + # performing the action as the **user**. + export OS_USERNAME="admin" + + # With Keystone you pass the keystone password. + echo "Please enter your OpenStack Password: " + read -sr OS_PASSWORD_INPUT + export OS_PASSWORD=$OS_PASSWORD_INPUT + + # If your configuration has multiple regions, we set that information here. + # OS_REGION_NAME is optional and only valid in certain environments. + export OS_REGION_NAME="RegionOne" + # Don't leave a blank variable, unset it if it was empty + if [ -z "$OS_REGION_NAME" ]; then unset OS_REGION_NAME; fi + + +#. Add the following environment variables entries to the + ``admin-openrc.sh`` file to ensure the OpenStack clients connect to + the correct endpoint type from the service catalog: + + .. 
code:: + + CINDER_ENDPOINT_TYPE=internalURL + NOVA_ENDPOINT_TYPE=internalURL + OS_ENDPOINT_TYPE=internalURL + +#. Log in to the controller node. + +#. Before running commands, source the ``admin-openrc`` file to set + environment variables. At the command prompt, type: + + .. code:: + + $ source admin-openrc + + +Managing the cloud using the command-line +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +This section describes some of the more common commands to view and +manage the cloud. + +Log in to the controller node to run the following commands: + +Server list + The :command:`openstack image list` command shows details about currently + available images: + + .. code:: + + $ openstack image list + +------------------+--------------+--------+ + | ID | Name | Status | + +------------------+--------------+--------+ + | [ID truncated] | ExampleImage | active | + +------------------+--------------+--------+ + + +List services + The :command:`nova service-list` command details the currently running + services: + + .. code:: + + $ nova service-list + +----+------------------+------------+----------+---------+-------+----------------------------+-----------------+ + | Id | Binary | Host | Zone | Status | State | Updated_at | Disabled Reason | + +----+------------------+------------+----------+---------+-------+----------------------------+-----------------+ + | 4 | nova-consoleauth | controller | internal | enabled | up | 2016-12-14T04:06:03.000000 | - | + | 5 | nova-scheduler | controller | internal | enabled | up | 2016-12-14T04:06:03.000000 | - | + | 6 | nova-conductor | controller | internal | enabled | up | 2016-12-14T04:05:59.000000 | - | + | 9 | nova-compute | compute | nova | enabled | down | 2016-10-21T02:35:03.000000 | - | + +----+------------------+------------+----------+---------+-------+----------------------------+-----------------+ + + +View logs + All logs are available in the ``/var/log/`` directory and its + subdirectories. The **tail** command shows the most recent entries + in a specified log file: + + .. code:: + + $ tail /var/log/nova/nova.log + + +See available flavors + The **openstack flavor list** command lists the *flavors* that are + available. These are different disk sizes that can be assigned to + images: + + .. code:: + + $ nova flavor-list + +----+-----------+-----------+------+-----------+------+-------+-------------+ + | ID | Name | Memory_MB | Disk | Ephemeral | Swap | VCPUs | RXTX_Factor | + +----+-----------+-----------+------+-----------+------+-------+-------------+ + | 1 | m1.tiny | 512 | 0 | 0 | | 1 | 1.0 | + | 2 | m1.small | 2048 | 10 | 20 | | 1 | 1.0 | + | 3 | m1.medium | 4096 | 10 | 40 | | 2 | 1.0 | + | 4 | m1.large | 8192 | 10 | 80 | | 4 | 1.0 | + | 5 | m1.xlarge | 16384 | 10 | 160 | | 8 | 1.0 | + +----+-----------+-----------+------+-----------+------+-------+-------------+ + + + .. important:: + + Do not remove the default flavors. + +List images + The **openstack image list** command lists the currently available + images: + + .. 
code:: + + $ openstack image list + +--------------------------+----------------------------+--------+ + | ID | Name | Status | + +--------------------------+----------------------------+--------+ + | 033c0027-[ID truncated] | cirros-image | active | + | 0ccfc8c4-[ID truncated] | My Image 2 | active | + | 85a0a926-[ID truncated] | precise-image | active | + +--------------------------+----------------------------+--------+ + + +List floating IP addresses + The **openstack floating ip list** command lists the currently + available floating IP addresses and the instances they are + associated with: + + .. code:: + + $ openstack floating ip list + +------------------+------------------+---------------------+------------ + + | id | fixed_ip_address | floating_ip_address | port_id | + +------------------+------------------+---------------------+-------------+ + | 0a88589a-ffac... | | 208.113.177.100 | | + +------------------+------------------+---------------------+-------------+ + + +OpenStack client utilities +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +OpenStack client utilities are a convenient way to interact with +OpenStack from the command line on the workstation, without being logged +in to the controller nodes. + +.. NOTE FROM JP TO ADD LATER: + If we talk about utilities, I suggest we move the CLI utilities section + above, because it's used already in things above. It makes sense to first + install them and then use them. I'd in that case I'd mention that they don't + need to be installed /upgraded again on the utility containers, because + they already handled by OSA deployment. + +Python client utilities are available using the Python Package Index +(PyPI), and can be installed on most Linux systems using these commands: + +.. NOTE FROM JP: I'd maybe mention the python-openstackclient first. It should + be our first citizen in the future. + + .. code:: + + # pip install python-PROJECTclient + + .. note:: + + The keystone client utility is deprecated. The OpenStackClient + utility should be used which supports v2 and v3 Identity API. + + +Upgrade or remove clients +~~~~~~~~~~~~~~~~~~~~~~~~~ + +To upgrade a client, add the **--upgrade** option to the command: + + .. code:: + + # pip install --upgrade python-PROJECTclient + + +To remove a client, run the **pip uninstall** command: + + .. code:: + + # pip uninstall python-PROJECTclient + + +For more information about OpenStack client utilities, see these links: + +- `OpenStack API Quick + Start `__ + +- `OpenStackClient + commands `__ + +- `Image Service (glance) CLI + commands `__ + +- `Image Service (glance) CLI command cheat + sheet `__ + +- `Compute (nova) CLI + commands `__ + +- `Compute (nova) CLI command cheat + sheet `__ + +- `Networking (neutron) CLI + commands `__ + +- `Networking (neutron) CLI command cheat + sheet `__ + +- `Block Storage (cinder) CLI commands + `__ + +- `Block Storage (cinder) CLI command cheat + sheet `__ + +- `python-keystoneclient `__ + +- `python-glanceclient `__ + +- `python-novaclient `__ + +- `python-neutronclient `__ diff --git a/doc/source/draft-operations-guide/openstack-operations/managing-images.rst b/doc/source/draft-operations-guide/openstack-operations/managing-images.rst new file mode 100644 index 0000000000..f42d45ead5 --- /dev/null +++ b/doc/source/draft-operations-guide/openstack-operations/managing-images.rst @@ -0,0 +1,125 @@ +=============== +Managing images +=============== + +.. FROM JP TO ADD: + I think a far more interesting section for operations is how to handle the + CHANGES of images. 
For example, deprecation of images, re-uploading new + ones... The process is dependant on each company, but at least it would be + original content, and far more valuable IMO. But it implies research. + +An image represents the operating system, software, and any settings +that instances may need depending on the project goals. Create images +first before creating any instances. + +Adding images can be done through the Dashboard, or the command line. +Another option available is the the ``python-openstackclient`` tool, which +can be installed on the controller node, or on a workstation. + +Adding an image using the Dashboard +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +In order to add an image using the Dashboard, prepare an image binary +file, which must be accessible over HTTP using a valid and direct URL. +Images can be compressed using ``.zip`` or ``.tar.gz``. + + .. note:: + + Uploading images using the Dashboard will be available to users + with administrator privileges. Operators can set user access + privileges. + +#. Log in to the Dashboard. + +#. Select the **Admin** tab in the navigation pane and click **images**. + +#. Click the **Create Image** button. The **Create an Image** dialog box + will appear. + +#. Enter the details of the image, including the **Image Location**, + which is where the URL location of the image is required. + +#. Click the **Create Image** button. The newly created image may take + some time before it is completely uploaded since the image arrives in + an image queue. + + +Adding an image using the command line +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Log into the controller node directly to add images using the command +line with OpenStack command line clients. Glance commands allow users to +add, manipulate, and manage images. + +Alternatively, configure the environment with administrative access to +the controller. + +#. Log into the controller node. + +#. Run the ``openstack image create`` command, and specify the image name. + Use the + ``--file `` variable to specify the image file. + +#. Run the ``openstack image set`` command, and specify the image name with the + ``--name `` variable. + +#. View a list of all the images currently available with ``openstack + image list``. + +#. Run the ``openstack image list`` command to view more details on each image. + + .. list-table:: **glance image details** + :widths: 33 33 33 + :header-rows: 1 + + * - Variable + - Required + - Details + * - ``--name NAME`` + - Optional + - A name for the image + * - ``--public [True|False]`` + - Optional + - If set to ``true``, makes the image available to all users. Permission + to set this variable is admin only by default. + * - ``--protected [True|False]`` + - Optional + - If set to ``true``, this variable prevents an image from being deleted. + * - ``--container-format CONTAINER_FORMAT`` + - Required + - The type of container format, one of ``ami``, ``ari``, ``aki``, + ``bare``, or ``ovf`` + * - ``--disk-format DISK_FORMAT`` + - Required + - The type of disk format, one of ``ami``, ``ari``, ``aki``, ``vhd``, + ``vdi``, and ``iso`` + * - ``--owner PROJECT_ID`` + - Optional + - The tenant who should own the image. + * - ``--size SIZE`` + - Optional + - Size of the image data, which is measured in bytes. + * - ``--min-disk DISK_GB`` + - Optional + - The minimum size of the disk needed to boot the image being configured, + which is measured in gigabytes. 
+ * - ``--min-ram DISK_GB`` + - Optional + - The minimum amount of RAM needed to boot the image being configured, + which is measured in megabytes. + * - ``--location IMAGE_URL`` + - Optional + - The location where the image data resides. This variables sets the + location as a URL. If the image data is stored on a swift service, + specify: swift://account:/container/obj. + * - ``--checksum CHECKSUM`` + - Optional + - Image data hash used for verification. + * - ``--copy-from IMAGE_URL`` + - Optional + - Indicates that the image server should copy data immediately, and store + it in its configured image store. + * - ``--property KEY=VALUE`` + - Optional + - This variable associates an arbitrary property to the image, and can be + used multiple times. diff --git a/doc/source/draft-operations-guide/openstack-operations/managing-instances.rst b/doc/source/draft-operations-guide/openstack-operations/managing-instances.rst new file mode 100644 index 0000000000..02b1b0a8d3 --- /dev/null +++ b/doc/source/draft-operations-guide/openstack-operations/managing-instances.rst @@ -0,0 +1,221 @@ +================== +Managing instances +================== + +This chapter describes how to create and access instances. + +Creating an instance using the Dashboard +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Using an image, create a new instance via the Dashboard options. + +#. Log into the Dashboard, and select the **Compute** project from the + drop down list. + +#. Click the **Images** option. + +#. Locate the image that will act as the instance base from the + **Images** table. + +#. Click **Launch** from the **Actions** column. + +#. Check the **Launch Instances** dialog, and find the **details** tab. + Enter the appropriate values for the instance. + + #. In the Launch Instance dialog, click the **Access & Security** tab. + Select the keypair. Set the security group as "default". + + #. Click the **Networking tab**. This tab will be unavailable if + OpenStack networking (neutron) has not been enabled. If networking + is enabled, select the networks on which the instance will + reside. + + #. Click the **Volume Options tab**. This tab will only be available + if a Block Storage volume exists for the instance. Select + **Don't boot from a volume** for now. + + For more information on attaching Block Storage volumes to + instances for persistent storage, see `the + “Managing volumes for persistent + storage” section `__. + + #. Add customisation scripts, if needed, by clicking the + **Post-Creation** tab. These run after the instance has been + created. Some instances support user data, such as root passwords, + or admin users. Enter the information specific to to the instance + here if required. + + #. Click **Advanced Options**. Specify whether the instance uses a + configuration drive to store metadata by selecting a disk + partition type. + +#. Click **Launch** to create the instance. The instance will start on a + compute node. The **Instance** page will open and start creating a + new instance. The **Instance** page that opens will list the instance + name, size, status, and task. Power state and public and private IP + addresses are also listed here. + + The process will take less than a minute to complete. Instance + creation is complete when the status is listed as active. Refresh the + page to see the new active instance. + + .. 
list-table:: **Launching an instance options** + :widths: 33 33 33 + :header-rows: 1 + + * - Field Name + - Required + - Details + * - **Availability Zone** + - Optional + - The availability zone in which the image service creates the instance. + If no availability zones is defined, no instances will be found. The + cloud provider sets the availability zone to a specific value. + * - **Instance Name** + - Required + - The name of the new instance, which becomes the initial host name of the + server. If the server name is changed in the API or directly changed, + the Dashboard names remain unchanged + * - **Image** + - Required + - The type of container format, one of ``ami``, ``ari``, ``aki``, + ``bare``, or ``ovf`` + * - **Flavor** + - Required + - The vCPU, Memory, and Disk configuration. Note that larger flavors can + take a long time to create. If creating an instance for the first time + and want something small with which to test, select ``m1.small``. + * - **Instance Count** + - Required + - If creating multiple instances with this configuration, enter an integer + up to the number permitted by the quota, which is ``10`` by default. + * - **Instance Boot Source** + - Required + - Specify whether the instance will be based on an image or a snapshot. If + it is the first time creating an instance, there will not yet be any + snapshots available. + * - **Image Name** + - Required + - The instance will boot from the selected image. This option will be + pre-populated with the instance selected from the table. However, choose + ``Boot from Snapshot`` in **Instance Boot Source**, and it will default + to ``Snapshot`` instead. + * - **Security Groups** + - Optional + - This option assigns security groups to an instance. + The default security group activates when no customised group is + specified here. Security Groups, similar to a cloud firewall, define + which incoming network traffic is forwarded to instances. + * - **Keypair** + - Optional + - Specify a key pair with this option. If the image uses a static key set + (not recommended), a key pair is not needed. + * - **Selected Networks** + - Optional + - To add a network to an instance, click the **+** in the **Networks + field**. + * - **Customisation Script** + - Optional + - Specify a customisation script. This script runs after the instance + launches and becomes active. + + +Creating an instance using the command line +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +On the command line, image creation is managed with the **nova boot** +command. Before launching an image, determine what images and flavors +are available to create a new instance using the **nova image-list** and +**nova flavor-list** commands. + +#. Log in to the controller node. + +#. Issue the **nova boot** command with a name for the instance, along + with the name of the image and flavor to use: + + .. 
code:: + + $ nova boot --image precise-image --flavor=2 --key-name example-key example-instance + +-------------------------------------+--------------------------------------+ + | Property | Value | + +-------------------------------------+--------------------------------------+ + | OS-DCF:diskConfig | MANUAL | + | OS-EXT-SRV-ATTR:host | None | + | OS-EXT-SRV-ATTR:hypervisor_hostname | None | + | OS-EXT-SRV-ATTR:instance_name | instance-0000000d | + | OS-EXT-STS:power_state | 0 | + | OS-EXT-STS:task_state | scheduling | + | OS-EXT-STS:vm_state | building | + | accessIPv4 | | + | accessIPv6 | | + | adminPass | ATSEfRY9fZPx | + | config_drive | | + | created | 2012-08-02T15:43:46Z | + | flavor | m1.small | + | hostId | | + | id | 5bf46a3b-084c-4ce1-b06f-e460e875075b | + | image | precise-image | + | key_name | example-key | + | metadata | {} | + | name | example-instance | + | progress | 0 | + | status | BUILD | + | tenant_id | b4769145977045e2a9279c842b09be6a | + | updated | 2012-08-02T15:43:46Z | + | user_id | 5f2f2c28bdc844f9845251290b524e80 | + +-------------------------------------+--------------------------------------+ + + +#. To check that the instance was created successfully, issue the **nova + list** command: + + .. code:: + + $ nova list + +------------------+------------------+--------+-------------------+ + | ID | Name | Status | Networks | + +------------------+------------------+--------+-------------------+ + | [ID truncated] | example-instance | ACTIVE | public=192.0.2.0 | + +------------------+------------------+--------+-------------------+ + + +Managing an instance +~~~~~~~~~~~~~~~~~~~~ + +#. Log in to the Dashboard. Select one of the projects, and click + **Instances**. + +#. Select an instance from the list of available instances. + +#. Check the **Actions** column, and click on the **More** option. + Select the instance state. + +The **Actions** column includes the following options: + +- Resize or rebuild any instance + +- View the instance console log + +- Edit the instance + +- Modify security groups + +- Pause, resume, or suspend the instance + +- Soft or hard reset the instance + + .. note:: + + Terminate the instance under the **Actions** column. + + +Managing volumes for persistent storage +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Volumes attach to instances, enabling persistent storage. Volume +storage provides a source of memory for instances. Administrators can +attach volumes to a running instance, or move a volume from one +instance to another. + +Live migration +~~~~~~~~~~~~~~ diff --git a/doc/source/draft-operations-guide/openstack-operations/network-service.rst b/doc/source/draft-operations-guide/openstack-operations/network-service.rst new file mode 100644 index 0000000000..5949ffa193 --- /dev/null +++ b/doc/source/draft-operations-guide/openstack-operations/network-service.rst @@ -0,0 +1,41 @@ +================== +Networking service +================== + +Load-Balancer-as-a-Service (LBaaS) +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The LBaaS functionality is configured and deployed using +OpenStack-Ansible. For more information about LBaaS operations, +see `LBaaS`_ in the OpenStack Networking guide. + +Understand the following characteristics of the OpenStack-Ansible LBaaS +technical preview: + + * The preview release is not intended to provide highly scalable or + highly available load balancing services. + * Testing and recommended usage is limited to 10 members in a pool + and no more than 20 pools. 
+ * Virtual load balancers deployed as part of the LBaaS service are + not monitored for availability or performance. + * OpenStack-Ansible enables LBaaS v2 with the default HAProxy-based agent. + * The Octavia agent is not supported. + * Integration with physical load balancer devices is not supported. + * Customers can use API or CLI LBaaS interfaces. + * The Dashboard offers a panel for creating and managing LBaaS load balancers, + listeners, pools, members, and health checks. + * SDN integration is not supported. + + +In Mitaka, you can `enable Dashboard (horizon) panels`_ for LBaaS. +Additionally, a customer can specify a list of servers behind a +listener and reuse that list for another listener. This feature, +called *shared pools*, only applies to customers that have a large +number of listeners (ports) behind a load balancer. + +.. _LBaaS: + http://docs.openstack.org/mitaka/networking-guide/config-lbaas.html + +.. _enable Dashboard (horizon) panels: + http://docs.openstack.org/developer/openstack-ansible/mitaka/install-guide/ + configure-network-services.html#deploying-lbaas-v2 diff --git a/doc/source/draft-operations-guide/openstack-operations/verify-deploy.rst b/doc/source/draft-operations-guide/openstack-operations/verify-deploy.rst new file mode 100644 index 0000000000..94886ac255 --- /dev/null +++ b/doc/source/draft-operations-guide/openstack-operations/verify-deploy.rst @@ -0,0 +1,68 @@ +=============================== +Verifying your cloud deployment +=============================== + +This is a draft cloud verification page for the proposed +OpenStack-Ansible operations guide. + +Verifying your OpenStack-Ansible operation +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +This chapter goes through the verification steps for a basic operation of +the OpenStack API and dashboard. + +.. note:: + + The utility container provides a CLI environment for additional + configuration and testing. + +#. Determine the utility container name: + + .. code:: + + $ lxc-ls | grep utility + infra1_utility_container-161a4084 + +#. Access the utility container: + + .. code:: + + $ lxc-attach -n infra1_utility_container-161a4084 + +#. Source the ``admin`` tenant credentials: + + .. code:: + + # source openrc + +#. Run an OpenStack command that uses one or more APIs. For example: + + .. code:: + + # openstack user list --domain default + +----------------------------------+--------------------+ + | ID | Name | + +----------------------------------+--------------------+ + | 04007b990d9442b59009b98a828aa981 | glance | + | 0ccf5f2020ca4820847e109edd46e324 | keystone | + | 1dc5f638d4d840c690c23d5ea83c3429 | neutron | + | 3073d0fa5ced46f098215d3edb235d00 | cinder | + | 5f3839ee1f044eba921a7e8a23bb212d | admin | + | 61bc8ee7cc9b4530bb18acb740ee752a | stack_domain_admin | + | 77b604b67b79447eac95969aafc81339 | alt_demo | + | 85c5bf07393744dbb034fab788d7973f | nova | + | a86fc12ade404a838e3b08e1c9db376f | swift | + | bbac48963eff4ac79314c42fc3d7f1df | ceilometer | + | c3c9858cbaac4db9914e3695b1825e41 | dispersion | + | cd85ca889c9e480d8ac458f188f16034 | demo | + | efab6dc30c96480b971b3bd5768107ab | heat | + +----------------------------------+--------------------+ + +#. With a web browser, access the Dashboard using the external load + balancer IP address. This is defined by the ``external_lb_vip_address`` + option in the ``/etc/openstack_deploy/openstack_user_config.yml`` + file. The dashboard uses HTTPS on port 443. + +#. 
Authenticate using the username ``admin`` and password defined by + the ``keystone_auth_admin_password`` option in the + ``/etc/openstack_deploy/user_osa_secrets.yml`` file. diff --git a/doc/source/draft-operations-guide/ops-add-computehost.rst b/doc/source/draft-operations-guide/ops-add-computehost.rst deleted file mode 100644 index b1def07110..0000000000 --- a/doc/source/draft-operations-guide/ops-add-computehost.rst +++ /dev/null @@ -1,29 +0,0 @@ -===================== -Adding a compute host -===================== - -Use the following procedure to add a compute host to an operational -cluster. - -#. Configure the host as a target host. See `Prepare target hosts - `_ - for more information. - -#. Edit the ``/etc/openstack_deploy/openstack_user_config.yml`` file and - add the host to the ``compute_hosts`` stanza. - - If necessary, also modify the ``used_ips`` stanza. - -#. If the cluster is utilizing Telemetry/Metering (Ceilometer), - edit the ``/etc/openstack_deploy/conf.d/ceilometer.yml`` file and add the - host to the ``metering-compute_hosts`` stanza. - -#. Run the following commands to add the host. Replace - ``NEW_HOST_NAME`` with the name of the new host. - - .. code-block:: shell-session - - # cd /opt/openstack-ansible/playbooks - # openstack-ansible setup-hosts.yml --limit NEW_HOST_NAME - # openstack-ansible setup-openstack.yml --skip-tags nova-key-distribute --limit NEW_HOST_NAME - # openstack-ansible setup-openstack.yml --tags nova-key --limit compute_hosts diff --git a/doc/source/draft-operations-guide/ops-galera-remove.rst b/doc/source/draft-operations-guide/ops-galera-remove.rst deleted file mode 100644 index af44acd444..0000000000 --- a/doc/source/draft-operations-guide/ops-galera-remove.rst +++ /dev/null @@ -1,32 +0,0 @@ -============== -Removing nodes -============== - -In the following example, all but one node was shut down gracefully: - -.. code-block:: shell-session - - # ansible galera_container -m shell -a "mysql -h localhost \ - -e 'show status like \"%wsrep_cluster_%\";'" - node3_galera_container-3ea2cbd3 | FAILED | rc=1 >> - ERROR 2002 (HY000): Can't connect to local MySQL server - through socket '/var/run/mysqld/mysqld.sock' (2) - - node2_galera_container-49a47d25 | FAILED | rc=1 >> - ERROR 2002 (HY000): Can't connect to local MySQL server - through socket '/var/run/mysqld/mysqld.sock' (2) - - node4_galera_container-76275635 | success | rc=0 >> - Variable_name Value - wsrep_cluster_conf_id 7 - wsrep_cluster_size 1 - wsrep_cluster_state_uuid 338b06b0-2948-11e4-9d06-bef42f6c52f1 - wsrep_cluster_status Primary - - -Compare this example output with the output from the multi-node failure -scenario where the remaining operational node is non-primary and stops -processing SQL requests. Gracefully shutting down the MariaDB service on -all but one node allows the remaining operational node to continue -processing SQL requests. When gracefully shutting down multiple nodes, -perform the actions sequentially to retain operation. diff --git a/doc/source/draft-operations-guide/ops-galera-start.rst b/doc/source/draft-operations-guide/ops-galera-start.rst deleted file mode 100644 index 7555546a1c..0000000000 --- a/doc/source/draft-operations-guide/ops-galera-start.rst +++ /dev/null @@ -1,88 +0,0 @@ -================== -Starting a cluster -================== - -Gracefully shutting down all nodes destroys the cluster. Starting or -restarting a cluster from zero nodes requires creating a new cluster on -one of the nodes. - -#. Start a new cluster on the most advanced node. 
- Check the ``seqno`` value in the ``grastate.dat`` file on all of the nodes: - - .. code-block:: shell-session - - # ansible galera_container -m shell -a "cat /var/lib/mysql/grastate.dat" - node2_galera_container-49a47d25 | success | rc=0 >> - # GALERA saved state version: 2.1 - uuid: 338b06b0-2948-11e4-9d06-bef42f6c52f1 - seqno: 31 - cert_index: - - node3_galera_container-3ea2cbd3 | success | rc=0 >> - # GALERA saved state version: 2.1 - uuid: 338b06b0-2948-11e4-9d06-bef42f6c52f1 - seqno: 31 - cert_index: - - node4_galera_container-76275635 | success | rc=0 >> - # GALERA saved state version: 2.1 - uuid: 338b06b0-2948-11e4-9d06-bef42f6c52f1 - seqno: 31 - cert_index: - - In this example, all nodes in the cluster contain the same positive - ``seqno`` values as they were synchronized just prior to - graceful shutdown. If all ``seqno`` values are equal, any node can - start the new cluster. - - .. code-block:: shell-session - - # /etc/init.d/mysql start --wsrep-new-cluster - - This command results in a cluster containing a single node. The - ``wsrep_cluster_size`` value shows the number of nodes in the - cluster. - - .. code-block:: shell-session - - node2_galera_container-49a47d25 | FAILED | rc=1 >> - ERROR 2002 (HY000): Can't connect to local MySQL server - through socket '/var/run/mysqld/mysqld.sock' (111) - - node3_galera_container-3ea2cbd3 | FAILED | rc=1 >> - ERROR 2002 (HY000): Can't connect to local MySQL server - through socket '/var/run/mysqld/mysqld.sock' (2) - - node4_galera_container-76275635 | success | rc=0 >> - Variable_name Value - wsrep_cluster_conf_id 1 - wsrep_cluster_size 1 - wsrep_cluster_state_uuid 338b06b0-2948-11e4-9d06-bef42f6c52f1 - wsrep_cluster_status Primary - -#. Restart MariaDB on the other nodes and verify that they rejoin the - cluster. - - .. code-block:: shell-session - - node2_galera_container-49a47d25 | success | rc=0 >> - Variable_name Value - wsrep_cluster_conf_id 3 - wsrep_cluster_size 3 - wsrep_cluster_state_uuid 338b06b0-2948-11e4-9d06-bef42f6c52f1 - wsrep_cluster_status Primary - - node3_galera_container-3ea2cbd3 | success | rc=0 >> - Variable_name Value - wsrep_cluster_conf_id 3 - wsrep_cluster_size 3 - wsrep_cluster_state_uuid 338b06b0-2948-11e4-9d06-bef42f6c52f1 - wsrep_cluster_status Primary - - node4_galera_container-76275635 | success | rc=0 >> - Variable_name Value - wsrep_cluster_conf_id 3 - wsrep_cluster_size 3 - wsrep_cluster_state_uuid 338b06b0-2948-11e4-9d06-bef42f6c52f1 - wsrep_cluster_status Primary - diff --git a/doc/source/draft-operations-guide/ops-galera.rst b/doc/source/draft-operations-guide/ops-galera.rst deleted file mode 100644 index fccac80652..0000000000 --- a/doc/source/draft-operations-guide/ops-galera.rst +++ /dev/null @@ -1,18 +0,0 @@ -========================== -Galera cluster maintenance -========================== - -.. toctree:: - - ops-galera-remove.rst - ops-galera-start.rst - ops-galera-recovery.rst - -Routine maintenance includes gracefully adding or removing nodes from -the cluster without impacting operation and also starting a cluster -after gracefully shutting down all nodes. - -MySQL instances are restarted when creating a cluster, when adding a -node, when the service is not running, or when changes are made to the -``/etc/mysql/my.cnf`` configuration file. 
- diff --git a/doc/source/draft-operations-guide/ops-remove-computehost.rst b/doc/source/draft-operations-guide/ops-remove-computehost.rst deleted file mode 100644 index bd6a9aea9d..0000000000 --- a/doc/source/draft-operations-guide/ops-remove-computehost.rst +++ /dev/null @@ -1,49 +0,0 @@ -======================= -Removing a compute host -======================= - -The `openstack-ansible-ops `_ -repository contains a playbook for removing a compute host from an -OpenStack-Ansible (OSA) environment. -To remove a compute host, follow the below procedure. - -.. note:: - - This guide describes how to remove a compute node from an OSA environment - completely. Perform these steps with caution, as the compute node will no - longer be in service after the steps have been completed. This guide assumes - that all data and instances have been properly migrated. - -#. Disable all OpenStack services running on the compute node. - This can include, but is not limited to, the ``nova-compute`` service - and the neutron agent service. - - .. note:: - - Ensure this step is performed first - - .. code-block:: console - - # Run these commands on the compute node to be removed - # stop nova-compute - # stop neutron-linuxbridge-agent - -#. Clone the ``openstack-ansible-ops`` repository to your deployment host: - - .. code-block:: console - - $ git clone https://git.openstack.org/openstack/openstack-ansible-ops \ - /opt/openstack-ansible-ops - -#. Run the ``remove_compute_node.yml`` Ansible playbook with the - ``node_to_be_removed`` user variable set: - - .. code-block:: console - - $ cd /opt/openstack-ansible-ops/ansible_tools/playbooks - openstack-ansible remove_compute_node.yml \ - -e node_to_be_removed="" - -#. After the playbook completes, remove the compute node from the - OpenStack-Ansible configuration file in - ``/etc/openstack_deploy/openstack_user_config.yml``. diff --git a/doc/source/draft-operations-guide/ops-tips.rst b/doc/source/draft-operations-guide/ops-tips.rst deleted file mode 100644 index 84d0221836..0000000000 --- a/doc/source/draft-operations-guide/ops-tips.rst +++ /dev/null @@ -1,38 +0,0 @@ -=============== -Tips and tricks -=============== - -Ansible forks -~~~~~~~~~~~~~ - -The default MaxSessions setting for the OpenSSH Daemon is 10. Each Ansible -fork makes use of a Session. By default, Ansible sets the number of forks to -5. However, you can increase the number of forks used in order to improve -deployment performance in large environments. - -Note that more than 10 forks will cause issues for any playbooks -which use ``delegate_to`` or ``local_action`` in the tasks. It is -recommended that the number of forks are not raised when executing against the -Control Plane, as this is where delegation is most often used. - -The number of forks used may be changed on a permanent basis by including -the appropriate change to the ``ANSIBLE_FORKS`` in your ``.bashrc`` file. -Alternatively it can be changed for a particular playbook execution by using -the ``--forks`` CLI parameter. For example, the following executes the nova -playbook against the control plane with 10 forks, then against the compute -nodes with 50 forks. - -.. 
code-block:: shell-session - - # openstack-ansible --forks 10 os-nova-install.yml --limit compute_containers - # openstack-ansible --forks 50 os-nova-install.yml --limit compute_hosts - -For more information about forks, please see the following references: - -* OpenStack-Ansible `Bug 1479812`_ -* Ansible `forks`_ entry for ansible.cfg -* `Ansible Performance Tuning`_ - -.. _Bug 1479812: https://bugs.launchpad.net/openstack-ansible/+bug/1479812 -.. _forks: http://docs.ansible.com/ansible/intro_configuration.html#forks -.. _Ansible Performance Tuning: https://www.ansible.com/blog/ansible-performance-tuning diff --git a/doc/source/draft-operations-guide/ops-troubleshooting.rst b/doc/source/draft-operations-guide/ops-troubleshooting.rst deleted file mode 100644 index e5b7a40584..0000000000 --- a/doc/source/draft-operations-guide/ops-troubleshooting.rst +++ /dev/null @@ -1,125 +0,0 @@ -=============== -Troubleshooting -=============== - -Host kernel upgrade from version 3.13 -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Ubuntu kernel packages newer than version 3.13 contain a change in -module naming from ``nf_conntrack`` to ``br_netfilter``. After -upgrading the kernel, re-run the ``openstack-hosts-setup.yml`` -playbook against those hosts. See `OSA bug 157996`_ for more -information. - -.. _OSA bug 157996: https://bugs.launchpad.net/openstack-ansible/+bug/1579963 - - - -Container networking issues -~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -All LXC containers on the host have two virtual Ethernet interfaces: - -* `eth0` in the container connects to `lxcbr0` on the host -* `eth1` in the container connects to `br-mgmt` on the host - -.. note:: - - Some containers, such as ``cinder``, ``glance``, ``neutron_agents``, and - ``swift_proxy``, have more than two interfaces to support their - functions. - -Predictable interface naming ----------------------------- - -On the host, all virtual Ethernet devices are named based on their -container as well as the name of the interface inside the container: - - .. code-block:: shell-session - - ${CONTAINER_UNIQUE_ID}_${NETWORK_DEVICE_NAME} - -As an example, an all-in-one (AIO) build might provide a utility -container called `aio1_utility_container-d13b7132`. That container -will have two network interfaces: `d13b7132_eth0` and `d13b7132_eth1`. - -Another option would be to use the LXC tools to retrieve information -about the utility container: - - .. code-block:: shell-session - - # lxc-info -n aio1_utility_container-d13b7132 - - Name: aio1_utility_container-d13b7132 - State: RUNNING - PID: 8245 - IP: 10.0.3.201 - IP: 172.29.237.204 - CPU use: 79.18 seconds - BlkIO use: 678.26 MiB - Memory use: 613.33 MiB - KMem use: 0 bytes - Link: d13b7132_eth0 - TX bytes: 743.48 KiB - RX bytes: 88.78 MiB - Total bytes: 89.51 MiB - Link: d13b7132_eth1 - TX bytes: 412.42 KiB - RX bytes: 17.32 MiB - Total bytes: 17.73 MiB - -The ``Link:`` lines will show the network interfaces that are attached -to the utility container. - -Reviewing container networking traffic --------------------------------------- - -To dump traffic on the ``br-mgmt`` bridge, use ``tcpdump`` to see all -communications between the various containers. To narrow the focus, -run ``tcpdump`` only on the desired network interface of the -containers. - -Cached Ansible facts issues -~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -At the beginning of a playbook run, information about each host is gathered. 
-Examples of the information gathered are: - - * Linux distribution - * Kernel version - * Network interfaces - -To improve performance, particularly in large deployments, you can -cache host facts and information. - -OpenStack-Ansible enables fact caching by default. The facts are -cached in JSON files within ``/etc/openstack_deploy/ansible_facts``. - -Fact caching can be disabled by commenting out the ``fact_caching`` -parameter in ``playbooks/ansible.cfg``. Refer to the Ansible -documentation on `fact caching`_ for more details. - -.. _fact caching: http://docs.ansible.com/ansible/playbooks_variables.html#fact-caching - -Forcing regeneration of cached facts ------------------------------------- - -Cached facts may be incorrect if the host receives a kernel upgrade or new -network interfaces. Newly created bridges also disrupt cache facts. - -This can lead to unexpected errors while running playbooks, and -require that the cached facts be regenerated. - -Run the following command to remove all currently cached facts for all hosts: - -.. code-block:: shell-session - - # rm /etc/openstack_deploy/ansible_facts/* - -New facts will be gathered and cached during the next playbook run. - -To clear facts for a single host, find its file within -``/etc/openstack_deploy/ansible_facts/`` and remove it. Each host has -a JSON file that is named after its hostname. The facts for that host -will be regenerated on the next playbook run. - diff --git a/doc/source/draft-operations-guide/ref-info.rst b/doc/source/draft-operations-guide/ref-info.rst index 47dfe8d7ba..773a8f223e 100644 --- a/doc/source/draft-operations-guide/ref-info.rst +++ b/doc/source/draft-operations-guide/ref-info.rst @@ -5,7 +5,60 @@ Reference information This is a draft reference information page for the proposed OpenStack-Ansible operations guide. -.. toctree:: - :maxdepth: 2 +Linux Container commands +~~~~~~~~~~~~~~~~~~~~~~~~ - ops-lxc-commands.rst +The following are some useful commands to manage LXC: + +- List containers and summary information such as operational state and + network configuration: + + .. code-block:: shell-session + + # lxc-ls --fancy + +- Show container details including operational state, resource + utilization, and ``veth`` pairs: + + .. code-block:: shell-session + + # lxc-info --name container_name + +- Start a container: + + .. code-block:: shell-session + + # lxc-start --name container_name + +- Attach to a container: + + .. code-block:: shell-session + + # lxc-attach --name container_name + +- Stop a container: + + .. code-block:: shell-session + + # lxc-stop --name container_name + +Finding Ansible scripts after installation +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +All scripts used to install OpenStack with Ansible can be viewed from +the repository on GitHub, and on the local infrastructure server. + +The repository containing the scripts and playbooks is located at +https://github.com/openstack/openstack-ansible. + +To access the scripts and playbooks on the local ``infra01`` server, +follow these steps. + +#. Log into the ``infra01`` server. + +#. Change to the ``/opt/rpc-openstack/openstack-ansible`` directory. + +#. The ``scripts`` directory contains scripts used in the installation. + Generally, directories and subdirectories under ``rpcd`` + contain files related to RPCO. For example, the + ``rpcd/playbooks`` directory contains the RPCO playbooks. 
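+
+As a short worked example that ties the LXC commands above together, the
+following session lists the containers on an infrastructure host, attaches to
+one of them, and checks a service before returning to the host. The container
+and service names shown are illustrative only; use the names that ``lxc-ls``
+reports in your own environment.
+
+.. code-block:: shell-session
+
+   # lxc-ls --fancy | grep galera
+   # lxc-attach --name infra01_galera_container-4ed0d84a
+   # service mysql status
+   # exit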
diff --git a/doc/source/draft-operations-guide/ref-info/ansible-scripts.rst b/doc/source/draft-operations-guide/ref-info/ansible-scripts.rst new file mode 100644 index 0000000000..4afc1df858 --- /dev/null +++ b/doc/source/draft-operations-guide/ref-info/ansible-scripts.rst @@ -0,0 +1,21 @@ +========================================== +Finding Ansible scripts after installation +========================================== + +All scripts used to install OpenStack with Ansible can be viewed from +the repository on GitHub, and on the local infrastructure server. + +The repository containing the scripts and playbooks is located at +https://github.com/openstack/openstack-ansible. + +To access the scripts and playbooks on the local ``infra01`` server, +follow these steps. + +#. Log into the ``infra01`` server. + +#. Change to the ``/opt/rpc-openstack/openstack-ansible`` directory. + +#. The ``scripts`` directory contains scripts used in the installation. + Generally, directories and subdirectories under ``rpcd`` + contain files related to RPCO. For example, the + ``rpcd/playbooks`` directory contains the RPCO playbooks. diff --git a/doc/source/draft-operations-guide/ops-lxc-commands.rst b/doc/source/draft-operations-guide/ref-info/lxc-commands.rst similarity index 99% rename from doc/source/draft-operations-guide/ops-lxc-commands.rst rename to doc/source/draft-operations-guide/ref-info/lxc-commands.rst index 2464d67fd6..69cae29a83 100644 --- a/doc/source/draft-operations-guide/ops-lxc-commands.rst +++ b/doc/source/draft-operations-guide/ref-info/lxc-commands.rst @@ -35,4 +35,3 @@ The following are some useful commands to manage LXC: .. code-block:: shell-session # lxc-stop --name container_name - diff --git a/doc/source/draft-operations-guide/troubleshooting.rst b/doc/source/draft-operations-guide/troubleshooting.rst index 122803cd08..d40b04b0c8 100644 --- a/doc/source/draft-operations-guide/troubleshooting.rst +++ b/doc/source/draft-operations-guide/troubleshooting.rst @@ -5,7 +5,219 @@ Troubleshooting This is a draft troubleshooting page for the proposed OpenStack-Ansible operations guide. -.. toctree:: - :maxdepth: 2 +Networking +~~~~~~~~~~ + +Checking services +~~~~~~~~~~~~~~~~~ + +Restarting services +~~~~~~~~~~~~~~~~~~~ + +Troubleshooting Instance connectivity issues +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Diagnose Image service issues +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Test the OpenStack Image service by downloading and running a virtual +machine image. + +#. When logged in to the target OpenStack environment, change into a + temporary directory and download a copy of the image: + + .. code:: + + $ mkdir /tmp/images + + .. code:: + + $ cd /tmp/images/ + + .. code:: + + $ wget http://cdn.download.cirros-cloud.net/0.3.2/cirros-0.3.2-x86_64-disk.img + +#. Upload the image into the Image Service: + + .. code:: + + $ glance image-create --name=IMAGELABEL --disk-format=FILEFORMAT \ + --container-format=CONTAINERFORMAT --is-public=ACCESSVALUE < IMAGEFILE + + For the arguments attached to the ``glance image-create`` command, modify the + names of each argument as needed: + + IMAGELABEL + A label or name for the image. + + FILEFORMAT + This argument specifies the file format of the image. Valid formats + include ``qcow2``, ``raw``, ``vhd``, ``vmdk``, ``vdi``, ``iso``, ``aki``, + ``ari``, and ``ami``. Verify an image's file format with the **file** command. + + CONTAINERFORMAT + This argument specifies the container format. Use ``bare`` to indicate + that the image file is not in a format that contains metadata about + the virtual machine.
+ + ACCESSVALUE + Specifies whether a user or an admin can access and view the + image. ``true`` makes the image available to all users with access + privileges, while ``false`` restricts access to admins. + + IMAGEFILE + This specifies the name of the downloaded image file. + + .. note:: + + The image ID returned will differ between OpenStack environments + since the ID is dynamic and variable, generated differently for + each image uploaded. + +#. Run the ``openstack image list`` command to confirm the image was + uploaded successfully. + +#. Remove the locally downloaded image, since the Image service now + stores a copy of it. + + .. code:: + + $ rm -r /tmp/images + +For investigating problems or errors, the Image service logs all activity +to the ``/var/log/glance-api.log`` and ``/var/log/glance-registry.log`` files +inside the glance API container. + +RabbitMQ issues +~~~~~~~~~~~~~~~ + +Analyze RabbitMQ queues +----------------------- + +Analyze OpenStack service logs and RabbitMQ logs +------------------------------------------------ + +Failed security hardening after host kernel upgrade from version 3.13 +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Ubuntu kernel packages newer than version 3.13 contain a change in +module naming from ``nf_conntrack`` to ``br_netfilter``. After +upgrading the kernel, re-run the ``openstack-hosts-setup.yml`` +playbook against those hosts. See `OSA bug 1579963`_ for more +information. + +.. _OSA bug 1579963: https://bugs.launchpad.net/openstack-ansible/+bug/1579963 + +Cached Ansible facts issues +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +At the beginning of a playbook run, information about each host is gathered. +Examples of the information gathered are: + + * Linux distribution + * Kernel version + * Network interfaces + +To improve performance, particularly in large deployments, you can +cache host facts and information. + +OpenStack-Ansible enables fact caching by default. The facts are +cached in JSON files within ``/etc/openstack_deploy/ansible_facts``. + +Fact caching can be disabled by commenting out the ``fact_caching`` +parameter in ``playbooks/ansible.cfg``. Refer to the Ansible +documentation on `fact caching`_ for more details. + +.. _fact caching: http://docs.ansible.com/ansible/playbooks_variables.html#fact-caching + +Forcing regeneration of cached facts +------------------------------------ + +Cached facts may be incorrect if the host receives a kernel upgrade or new +network interfaces. Newly created bridges also disrupt cached facts. + +This can lead to unexpected errors while running playbooks, and +can require that the cached facts be regenerated. + +Run the following command to remove all currently cached facts for all hosts: + +.. code-block:: shell-session + + # rm /etc/openstack_deploy/ansible_facts/* + +New facts will be gathered and cached during the next playbook run. + +To clear facts for a single host, find its file within +``/etc/openstack_deploy/ansible_facts/`` and remove it. Each host has +a JSON file that is named after its hostname. The facts for that host +will be regenerated on the next playbook run. + + +Failed Ansible playbooks during an upgrade +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + + +Container networking issues +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +All LXC containers on the host have two virtual Ethernet interfaces: + +* `eth0` in the container connects to `lxcbr0` on the host +* `eth1` in the container connects to `br-mgmt` on the host + +.. 
note:: + + Some containers, such as ``cinder``, ``glance``, ``neutron_agents``, and + ``swift_proxy``, have more than two interfaces to support their + functions. + +Predictable interface naming +---------------------------- + +On the host, all virtual Ethernet devices are named based on their +container as well as the name of the interface inside the container: + + .. code-block:: shell-session + + ${CONTAINER_UNIQUE_ID}_${NETWORK_DEVICE_NAME} + +As an example, an all-in-one (AIO) build might provide a utility +container called `aio1_utility_container-d13b7132`. That container +will have two network interfaces: `d13b7132_eth0` and `d13b7132_eth1`. + +Another option would be to use the LXC tools to retrieve information +about the utility container: + + .. code-block:: shell-session + + # lxc-info -n aio1_utility_container-d13b7132 + + Name: aio1_utility_container-d13b7132 + State: RUNNING + PID: 8245 + IP: 10.0.3.201 + IP: 172.29.237.204 + CPU use: 79.18 seconds + BlkIO use: 678.26 MiB + Memory use: 613.33 MiB + KMem use: 0 bytes + Link: d13b7132_eth0 + TX bytes: 743.48 KiB + RX bytes: 88.78 MiB + Total bytes: 89.51 MiB + Link: d13b7132_eth1 + TX bytes: 412.42 KiB + RX bytes: 17.32 MiB + Total bytes: 17.73 MiB + +The ``Link:`` lines will show the network interfaces that are attached +to the utility container. + +Review container networking traffic +----------------------------------- + +To dump traffic on the ``br-mgmt`` bridge, use ``tcpdump`` to see all +communications between the various containers. To narrow the focus, +run ``tcpdump`` only on the desired network interface of the +containers. - ops-troubleshooting.rst diff --git a/doc/source/draft-operations-guide/verify-deploy.rst b/doc/source/draft-operations-guide/verify-deploy.rst deleted file mode 100644 index ddcdb31134..0000000000 --- a/doc/source/draft-operations-guide/verify-deploy.rst +++ /dev/null @@ -1,6 +0,0 @@ -=============================== -Verifying your cloud deployment -=============================== - -This is a draft cloud verification page for the proposed OpenStack-Ansible -operations guide.