diff --git a/doc/source/index.rst b/doc/source/index.rst index 49914e0e..cb39fec1 100644 --- a/doc/source/index.rst +++ b/doc/source/index.rst @@ -13,27 +13,34 @@ License for the specific language governing permissions and limitations under the License. +============== What is Rally? ============== -**OpenStack** is, undoubtedly, a really *huge* ecosystem of cooperative services. **Rally** is a **benchmarking tool** that answers the question: **"How does OpenStack work at scale?"**. To make this possible, Rally **automates** and **unifies** multi-node OpenStack deployment, cloud verification, benchmarking & profiling. Rally does it in a **generic** way, making it possible to check whether OpenStack is going to work well on, say, a 1k-servers installation under high load. Thus it can be used as a basic tool for an *OpenStack CI/CD system* that would continuously improve its SLA, performance and stability. +**OpenStack** is, undoubtedly, a really *huge* ecosystem of cooperative +services. **Rally** is a **benchmarking tool** that answers the question: +**"How does OpenStack work at scale?"**. To make this possible, Rally +**automates** and **unifies** multi-node OpenStack deployment, cloud +verification, benchmarking & profiling. Rally does it in a **generic** way, +making it possible to check whether OpenStack is going to work well on, say, a +1k-servers installation under high load. Thus it can be used as a basic tool +for an *OpenStack CI/CD system* that would continuously improve its SLA, +performance and stability. .. image:: ./images/Rally-Actions.png :align: center Contents --------- +======== .. 
toctree:: :maxdepth: 2 - overview - glossary + overview/index install tutorial cli/cli_reference reports - user_stories plugins plugin/plugin_reference db_migrations diff --git a/doc/source/overview/glossary.rst b/doc/source/overview/glossary.rst new file mode 100644 index 00000000..4e289a70 --- /dev/null +++ b/doc/source/overview/glossary.rst @@ -0,0 +1,171 @@ +:tocdepth: 1 + +======== +Glossary +======== + +.. warning:: Unfortunately, our glossary is not yet complete, but the Rally + team is working on improving it. If you cannot find a definition in + which you are interested, feel free to ping us via IRC + (#openstack-rally channel at Freenode) or via E-Mail + (openstack-dev@lists.openstack.org with tag [Rally]). + +.. contents:: + :depth: 1 + :local: + +Common +====== + +Alembic +------- + +A lightweight database migration tool which powers Rally migrations. Read more +at `Official Alembic documentation `_ + +DB Migrations +------------- + +Rally supports database schema and data transformations, which are also +known as migrations. This allows you to get your data up-to-date with +the latest Rally version. + +Rally +----- + +A testing tool that automates and unifies multi-node OpenStack deployment +and cloud verification. It can be used as a basic tool +for an OpenStack CI/CD system that would continuously improve its SLA, +performance and stability. + +Rally Config +------------ + +Rally behavior can be customized by editing its configuration file, +*rally.conf*, in `configparser +`_ +format. During installation, Rally generates a config with default +values from its `sample +`_. +When started, Rally searches for its config in +"/etc/rally/rally.conf", "~/.rally/rally.conf", +"/etc/rally/rally.conf" + +Rally DB +-------- + +Rally uses a relational database as data storage. Several database backends +are supported: SQLite (default), PostgreSQL, and MySQL. +The database connection can be set via the configuration file option +*[database]/connection*. 
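
As an illustration of the *[database]/connection* option described in the Rally DB entry, the following minimal sketch reads that option from a configparser-style file using Python's standard library. The sample file contents, the SQLite URL and the helper name ``read_db_connection`` are assumptions made for this example; they are not taken from Rally's sample config or code.

```python
import configparser

# Hypothetical rally.conf contents; the connection URL is only an example.
SAMPLE_CONF = """
[database]
connection = sqlite:////tmp/rally.sqlite
"""


def read_db_connection(conf_text):
    """Return the [database]/connection value from configparser-style text."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    return parser.get("database", "connection")


connection = read_db_connection(SAMPLE_CONF)
print(connection)
```

To read a file on disk instead of a string, ``parser.read(path)`` could be used in place of ``parser.read_string``.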
+ +Rally Plugin +------------ + +Most parts of Rally +`are pluggable `_. +Scenarios, runners, contexts and even charts for the HTML report are plugins. +It is easy to create your own plugin and use it. Read more at +`plugin reference `_. + +Deployment +========== + +Deployment +---------- + +A set of information about the target environment (for example: URI and +authentication credentials) which is saved in the database. It is used +to define the target system for testing each time a task is started. +It has a "type" value which changes task behavior for the selected +target system; for example, type "openstack" will enable OpenStack +authentication and services. + +Task +==== + +Cleanup +------- + +This is a specific context which removes all resources on the target +system that were created by the current task. If some Rally-related +resources remain, please `file a bug +`_ and attach the task file and a +list of remaining resources. + +Context +------- + +A type of plugin that can run some actions on the target environment +before the workloads start and after the last workload finishes. This +allows, for example, preparing the environment for workloads (e.g., +create resources and change parameters) and restoring the environment +later. Each Context must implement ``setup()`` and ``cleanup()`` +methods. + +Input task +---------- + +A file that describes how to run a Rally Task. It can be in JSON or +YAML format. The *rally task start* command needs this file to run +the task. The input task is pre-processed by the `Jinja2 +`_ templating engine so it is very easy to +create repeated parts or calculate specific values at runtime. It is +also possible to pass values via CLI arguments, using the +*--task-args* or *--task-args-file* options. + +Runner +------ + +This is a Rally plugin which decides how to run Workloads. For +example, they can be run serially in a single process, or using +concurrency. 
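
The two strategies mentioned in the Runner entry (serial execution and execution with concurrency) can be sketched in plain Python as follows. This is only an illustration of the idea under stated assumptions: the function names ``run_serial``, ``run_concurrent`` and ``sample_workload`` are hypothetical and do not reflect Rally's actual runner plugin interface.

```python
from concurrent.futures import ThreadPoolExecutor


def run_serial(workload, times):
    """Run the workload `times` times, one iteration after another."""
    return [workload(i) for i in range(times)]


def run_concurrent(workload, times, concurrency):
    """Run the workload `times` times using up to `concurrency` threads."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        # pool.map returns results in iteration order, not completion order.
        return list(pool.map(workload, range(times)))


def sample_workload(iteration):
    """Hypothetical stand-in for a real workload; returns a dummy result."""
    return iteration * 2


serial_results = run_serial(sample_workload, 5)
concurrent_results = run_concurrent(sample_workload, 5, concurrency=2)
```

Both helpers return one result per iteration, so the same reporting code could consume either list regardless of how the iterations were scheduled.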
+ +Scenario +-------- + +Synonym for `Workload <#workload>`_ + +Service +------- + +An abstraction layer that represents the target environment API. For +example, this can be some OpenStack service. A Service provides API +versioning and action timings, simplifies API calls, and reduces code +duplication. It can be used in any Rally plugin. + +SLA +--- + +Service-Level Agreement (Success Criteria). +Allows you to determine whether a subtask or workload is successful +by setting success criteria rules. + +Subtask +------- + +A part of a Task. There can be many subtasks in a single Task. + +Task +---- + +An entity which includes all the necessary data for a test run, and +the results of this run. + +Workload +-------- + +An important part of a Task: a plugin which is run by the runner. It is +usually run in a separate thread. Workloads are grouped into Subtasks. + +Verify +====== + +Rally can run different subunit-based testing tools against a target +environment, for example `tempest +`_ for OpenStack. + +Verification +------------ + +A result of running some third-party subunit-based testing tool. diff --git a/doc/source/overview/index.rst b/doc/source/overview/index.rst new file mode 100644 index 00000000..b91df287 --- /dev/null +++ b/doc/source/overview/index.rst @@ -0,0 +1,25 @@ +.. + Copyright 2015 Mirantis Inc. All Rights Reserved. + + Licensed under the Apache License, Version 2.0 (the "License"); you may + not use this file except in compliance with the License. You may obtain + a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, WITHOUT + WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the + License for the specific language governing permissions and limitations + under the License. + +====================== +Rally project overview +====================== + +.. 
toctree:: + :glob: + + overview + glossary + user_stories diff --git a/doc/source/overview/overview.rst b/doc/source/overview/overview.rst new file mode 100644 index 00000000..505a5be0 --- /dev/null +++ b/doc/source/overview/overview.rst @@ -0,0 +1,183 @@ +.. + Copyright 2015 Mirantis Inc. All Rights Reserved. + + Licensed under the Apache License, Version 2.0 (the "License"); you may + not use this file except in compliance with the License. You may obtain + a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, WITHOUT + WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the + License for the specific language governing permissions and limitations + under the License. + +.. _overview: + +.. contents:: + :depth: 1 + :local: + +Overview +======== + +**Rally** is a **benchmarking tool** that **automates** and **unifies** +multi-node OpenStack deployment, cloud verification, benchmarking & profiling. +It can be used as a basic tool for an *OpenStack CI/CD system* that would +continuously improve its SLA, performance and stability. + +Who Is Using Rally +------------------ + +Here's a small selection of some of the many companies using Rally: + +.. image:: ../images/Rally_who_is_using.png + :align: center + +Use Cases +--------- + +Let's take a look at 3 major high level Use Cases of Rally: + +.. image:: ../images/Rally-UseCases.png + :align: center + + +Generally, there are a few typical cases where Rally proves to be of great use: + + 1. Automate measuring & profiling focused on how new code changes affect + the OS performance; + + 2. Using Rally profiler to detect scaling & performance issues; + + 3. 
Investigate how different deployments affect the OS performance: + + * Find the set of suitable OpenStack deployment architectures; + * Create deployment specifications for different loads (number of + controllers, swift nodes, etc.); + + 4. Automate the search for hardware best suited for a particular OpenStack + cloud; + + 5. Automate the production cloud specification generation: + + * Determine terminal loads for basic cloud operations: VM start & stop, + Block Device create/destroy & various OpenStack API methods; + * Check performance of basic cloud operations in case of different + loads. + + +Real-life examples +------------------ + +To be substantive, let's investigate a couple of real-life examples of Rally in +action. + + +How does amqp_rpc_single_reply_queue affect performance? +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Rally allowed us to reveal quite an interesting fact about **Nova**. We used +the *NovaServers.boot_and_delete* benchmark scenario to see how the +*amqp_rpc_single_reply_queue* option affects VM bootup time (it turns on a kind +of fast RPC). Some time ago it was +`shown `_ +that cloud performance can be boosted by setting it on, so we naturally decided +to check this result with Rally. To make this test, we issued requests for +booting and deleting VMs for a number of concurrent users ranging from 1 to 30 +with and without the investigated option. For each group of users, a total +number of 200 requests was issued. Averaged time per request is shown below: + +.. 
image:: ../images/Amqp_rpc_single_reply_queue.png + :align: center + +**So Rally has unexpectedly indicated that setting the +*amqp_rpc_single_reply_queue* option apparently affects the cloud performance, +but in quite the opposite way from what was thought before.** + + +Performance of Nova list command +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Another interesting result comes from the *NovaServers.boot_and_list_server* +scenario, which enabled us to launch the following benchmark with Rally: + + * **Benchmark environment** (which we also call **"Context"**): 1 temporary + OpenStack user. + * **Benchmark scenario**: boot a single VM from this user & list all VMs. + * **Benchmark runner** setting: repeat this procedure 200 times in a + continuous way. + +During the execution of this benchmark scenario, the user has more and more VMs +on each iteration. Rally has shown that in this case, the performance of the +**VM list** command in Nova is degrading much faster than one might expect: + +.. image:: ../images/Rally_VM_list.png + :align: center + + +Complex scenarios +^^^^^^^^^^^^^^^^^ + +In fact, the vast majority of Rally scenarios are expressed as a sequence of +**"atomic" actions**. For example, *NovaServers.snapshot* is composed of 6 +atomic actions: + + 1. boot VM + 2. snapshot VM + 3. delete VM + 4. boot VM from snapshot + 5. delete VM + 6. delete snapshot + +Rally measures not only the performance of the benchmark scenario as a whole, +but also that of single atomic actions. As a result, Rally also plots the +atomic actions performance data for each benchmark iteration in quite a +detailed way: + +.. image:: ../images/Rally_snapshot_vm.png + :align: center + + +Architecture +------------ + +Usually OpenStack projects are implemented *"as-a-Service"*, so Rally provides +this approach. In addition, it implements a *CLI-driven* approach that does not +require a daemon: + + 1. 
**Rally as-a-Service**: Run Rally as a set of daemons that present a Web + UI *(work in progress)* so 1 RaaS could be used by a whole team. + 2. **Rally as-an-App**: Rally as just a lightweight and portable CLI app + (without any daemons) that makes it simple to use & develop. + +The diagram below shows how this is possible: + +.. image:: ../images/Rally_Architecture.png + :align: center + +The actual **Rally core** consists of 4 main components, listed below in the +order they go into action: + + 1. **Server Providers** - provide a **unified interface** for interaction + with different **virtualization technologies** (*LXC*, *Virsh* etc.) and + **cloud suppliers** (like *Amazon*): it does so via *ssh* access and in + one *L3 network*; + 2. **Deploy Engines** - deploy some OpenStack distribution (like *DevStack* + or *FUEL*) before any benchmarking procedures take place, using servers + retrieved from Server Providers; + 3. **Verification** - runs *Tempest* (or another specific set of tests) + against the deployed cloud to check that it works correctly, collects + results & presents them in human readable form; + 4. **Benchmark Engine** - allows you to write parameterized benchmark scenarios + & run them against the cloud. + +It should become fairly obvious why Rally core needs to be split into these parts +if you take a look at the following diagram that visualizes a rough **algorithm +for starting benchmarking OpenStack at scale**. Keep in mind that there might +be lots of different ways to set up virtual servers, as well as to deploy +OpenStack to them. + +.. 
image:: ../images/Rally_QA.png + :align: center diff --git a/doc/source/overview/stories b/doc/source/overview/stories new file mode 120000 index 00000000..bb3efd11 --- /dev/null +++ b/doc/source/overview/stories @@ -0,0 +1 @@ +../../user_stories/ \ No newline at end of file diff --git a/doc/source/overview/user_stories.rst b/doc/source/overview/user_stories.rst new file mode 100644 index 00000000..11907d5f --- /dev/null +++ b/doc/source/overview/user_stories.rst @@ -0,0 +1,31 @@ +.. + Copyright 2015 Mirantis Inc. All Rights Reserved. + + Licensed under the Apache License, Version 2.0 (the "License"); you may + not use this file except in compliance with the License. You may obtain + a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, WITHOUT + WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the + License for the specific language governing permissions and limitations + under the License. + +.. _user_stories: + +User stories +============ + +Many users of Rally were able to make interesting discoveries concerning their +OpenStack clouds using our benchmarking tool. Numerous user stories presented +below show how Rally has made it possible to find performance bugs and validate +improvements for different OpenStack installations. + + +.. toctree:: + :glob: + :maxdepth: 1 + + stories/** diff --git a/doc/user_stories/keystone/authenticate.rst b/doc/user_stories/keystone/authenticate.rst index 33794f81..b91141b0 100644 --- a/doc/user_stories/keystone/authenticate.rst +++ b/doc/user_stories/keystone/authenticate.rst @@ -4,20 +4,29 @@ *(Contributed by Neependra Khare, Red Hat)* -Below we describe how we were able to get and verify a 4x better performance of Keystone inside Apache. 
To do that, we ran a Keystone token creation benchmark with Rally under different load (this benchmark scenario essentially just authenticate users with keystone to get tokens). +Below we describe how we were able to get and verify a 4x better performance of +Keystone inside Apache. To do that, we ran a Keystone token creation benchmark +with Rally under different load (this benchmark scenario essentially just +authenticate users with keystone to get tokens). Goal ---- - Get the data about performance of token creation under different load. -- Ensure that keystone with increased public_workers/admin_workers values and under Apache works better than the default setup. +- Ensure that keystone with increased public_workers/admin_workers values and + under Apache works better than the default setup. Summary ------- - As the concurrency increases, time to authenticate the user gets up. -- Keystone is CPU bound process and by default only one thread of keystone-all process get started. We can increase the parallelism by: - 1. increasing public_workers/admin_workers values in keystone.conf file - 2. running keystone inside Apache -- We configured Keystone with 4 public_workers and ran Keystone inside Apache. In both cases we got upto 4x better performance as compared to default keystone configuration. +- Keystone is CPU bound process and by default only one thread of + *keystone-all* process get started. We can increase the parallelism by: + + 1. increasing *public_workers/admin_workers* values in *keystone.conf* file + 2. running Keystone inside Apache + +- We configured Keystone with 4 *public_workers* and ran Keystone inside + Apache. In both cases we got up to 4x better performance as compared to + default Keystone configuration. Setup ----- @@ -35,9 +44,11 @@ Keystone - Commit#455d50e8ae360c2a7598a61d87d9d341e5d9d3ed Keystone API - 2 -To increase public_workers - Uncomment line with public_workers and set public_workers to 4. Then restart keystone service. 
+To increase public_workers - Uncomment line with *public_workers* and set +*public_workers* to 4. Then restart Keystone service. -To run keystone inside Apache - Added *APACHE_ENABLED_SERVICES=key* in localrc file while setting up OpenStack environment with devstack. +To run Keystone inside Apache - Added *APACHE_ENABLED_SERVICES=key* in +*localrc* file while setting up OpenStack environment with Devstack. Results diff --git a/doc/user_stories/nova/boot_server.rst b/doc/user_stories/nova/boot_server.rst index e7fcb496..8557c7bc 100644 --- a/doc/user_stories/nova/boot_server.rst +++ b/doc/user_stories/nova/boot_server.rst @@ -4,7 +4,11 @@ Finding a Keystone bug while benchmarking 20 node HA cloud performance at creati *(Contributed by Alexander Maretskiy, Mirantis)* -Below we describe how we found a `bug in keystone `_ and achieved 2x average performance increase at booting Nova servers after fixing that bug. Our initial goal was to benchmark the booting of a significant amount of servers on a cluster (running on a custom build of `Mirantis OpenStack `_ v5.1) and to ensure that this operation has reasonable performance and completes with no errors. +Below we describe how we found a `bug in Keystone`_ and achieved 2x average +performance increase at booting Nova servers after fixing that bug. Our initial +goal was to benchmark the booting of a significant amount of servers on a +cluster (running on a custom build of `Mirantis OpenStack`_ v5.1) and to ensure +that this operation has reasonable performance and completes with no errors. Goal ---- @@ -38,36 +42,36 @@ Cluster This cluster was created via Fuel Dashboard interface. 
-+----------------------+-----------------------------------------------------------------------------+ -| Deployment | Custom build of `Mirantis OpenStack `_ v5.1 | -+----------------------+-----------------------------------------------------------------------------+ -| OpenStack release | Icehouse | -+----------------------+-----------------------------------------------------------------------------+ -| Operating System | Ubuntu 12.04.4 | -+----------------------+-----------------------------------------------------------------------------+ -| Mode | High availability | -+----------------------+-----------------------------------------------------------------------------+ -| Hypervisor | KVM | -+----------------------+-----------------------------------------------------------------------------+ -| Networking | Neutron with GRE segmentation | -+----------------------+-----------------------------------------------------------------------------+ -| Controller nodes | 3 | -+----------------------+-----------------------------------------------------------------------------+ -| Compute nodes | 17 | -+----------------------+-----------------------------------------------------------------------------+ ++----------------------+--------------------------------------------+ +| Deployment | Custom build of `Mirantis OpenStack`_ v5.1 | ++----------------------+--------------------------------------------+ +| OpenStack release | Icehouse | ++----------------------+--------------------------------------------+ +| Operating System | Ubuntu 12.04.4 | ++----------------------+--------------------------------------------+ +| Mode | High availability | ++----------------------+--------------------------------------------+ +| Hypervisor | KVM | ++----------------------+--------------------------------------------+ +| Networking | Neutron with GRE segmentation | ++----------------------+--------------------------------------------+ +| Controller nodes | 3 | 
++----------------------+--------------------------------------------+ +| Compute nodes | 17 | ++----------------------+--------------------------------------------+ Rally ----- **Version** -For this benchmark, we use custom rally with the following patch: +For this benchmark, we use custom Rally with the following patch: https://review.openstack.org/#/c/96300/ **Deployment** -Rally was deployed for cluster using `ExistingCloud `_ type of deployment. +Rally was deployed for cluster using `ExistingCloud`_ type of deployment. **Server flavor** @@ -153,16 +157,18 @@ Rally was deployed for cluster using `ExistingCloud : Unauthorized (HTTP 401). +Starting from 142 server, we have error from novaclient: **Error : Unauthorized (HTTP 401).** -That is how a `bug in keystone `_ was found. +That is how a `bug in Keystone`_ was found. +------------------+-----------+-----------+-----------+---------------+---------------+---------+-------+ | action | min (sec) | avg (sec) | max (sec) | 90 percentile | 95 percentile | success | count | @@ -173,7 +179,8 @@ That is how a `bug in keystone