..
      Copyright 2015 Mirantis Inc. All Rights Reserved.

      Licensed under the Apache License, Version 2.0 (the "License"); you may
      not use this file except in compliance with the License. You may obtain
      a copy of the License at

          http://www.apache.org/licenses/LICENSE-2.0

      Unless required by applicable law or agreed to in writing, software
      distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
      WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
      License for the specific language governing permissions and limitations
      under the License.

.. _overview:

.. contents::
  :depth: 1
  :local:

Overview
========

**Rally** is a **generic testing tool** that **automates** and **unifies**
multi-node OpenStack deployment, verification, testing & profiling.
It can be used as a basic tool for an *OpenStack CI/CD system* that would
continuously improve its SLA, performance and stability.

Who Is Using Rally
------------------

Here's a small selection of some of the many companies using Rally:

.. image:: ../images/Rally_who_is_using.png
   :align: center

Use Cases
---------

Let's take a look at three major high-level use cases of Rally:

.. image:: ../images/Rally-UseCases.png
   :align: center


Generally, there are a few typical cases where Rally proves to be of great use:

1. Automate measuring & profiling focused on how new code changes affect
   OpenStack performance;

2. Use the Rally profiler to detect scaling & performance issues;

3. Investigate how different deployments affect OpenStack performance:

   * Find the set of suitable OpenStack deployment architectures;
   * Create deployment specifications for different loads (number of
     controllers, swift nodes, etc.);

4. Automate the search for hardware best suited for a particular OpenStack
   cloud;

5. Automate the production cloud specification generation:

   * Determine terminal loads for basic cloud operations: VM start & stop,
     Block Device create/destroy & various OpenStack API methods;
   * Check performance of basic cloud operations under different loads.


Real-life examples
------------------

To be substantive, let's investigate a couple of real-life examples of Rally in
action.


How does amqp_rpc_single_reply_queue affect performance?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Rally allowed us to reveal quite an interesting fact about **Nova**. We used
the *NovaServers.boot_and_delete* scenario to see how the
*amqp_rpc_single_reply_queue* option affects VM bootup time (it turns on a kind
of fast RPC). Some time ago it was
`shown <https://docs.google.com/file/d/0B-droFdkDaVhVzhsN3RKRlFLODQ/edit?pli=1>`_
that cloud performance can be boosted by turning it on, so we naturally decided
to check this result with Rally. To run this test, we issued requests for
booting and deleting VMs for a number of concurrent users ranging from 1 to 30,
with and without the investigated option. For each group of users, a total
of 200 requests was issued. The average time per request is shown below:

.. image:: ../images/Amqp_rpc_single_reply_queue.png
   :align: center

**So Rally has unexpectedly indicated that setting the
amqp_rpc_single_reply_queue option does affect cloud performance,
but in the opposite direction from what was previously thought.**

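
A task for this kind of concurrency sweep could be generated rather than
written by hand. The snippet below is a minimal sketch, not the task that
produced the chart above. It assumes the scenario is registered as
``NovaServers.boot_and_delete_server``, that a ``constant`` runner with
``times`` and ``concurrency`` settings is used, and that one temporary user is
created per parallel request. The flavor name, image name and the particular
concurrency levels between 1 and 30 are placeholders.

.. code-block:: python

    import json

    # Assumed registered name; the text above calls it NovaServers.boot_and_delete.
    SCENARIO = "NovaServers.boot_and_delete_server"


    def build_task(concurrencies, times=200):
        """Return a task dict with one workload per concurrency level."""
        workloads = []
        for concurrency in concurrencies:
            workloads.append({
                # Placeholder flavor and image; adjust to the cloud under test.
                "args": {
                    "flavor": {"name": "m1.tiny"},
                    "image": {"name": "cirros"},
                },
                # `times` requests in total, up to `concurrency` in parallel.
                "runner": {
                    "type": "constant",
                    "times": times,
                    "concurrency": concurrency,
                },
                # Assumption: one temporary user per parallel request.
                "context": {
                    "users": {"tenants": 1, "users_per_tenant": concurrency},
                },
            })
        return {SCENARIO: workloads}


    task = build_task([1, 5, 10, 20, 30])
    print(json.dumps(task, indent=2))

The printed JSON could be saved to a file and passed to ``rally task start``,
once with the option enabled on the cloud and once with it disabled, so that
the two result sets can be compared.
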

Performance of Nova list command
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Another interesting result comes from the *NovaServers.boot_and_list_server*
scenario, which enabled us to launch the following task with Rally:

* **Task context**: 1 temporary OpenStack user.
* **Task scenario**: boot a single VM as this user & list all VMs.
* **Task runner**: repeat this procedure 200 times in a continuous way.

During the execution of this task, the user has more and more VMs on each
iteration. Rally has shown that in this case, the performance of the
**VM list** command in Nova degrades much faster than one might expect:

.. image:: ../images/Rally_VM_list.png
   :align: center

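
The growth in listed VMs is easy to quantify. The sketch below writes the three
task parts described above as a task dict and then counts how many VMs each
listing has to return, assuming nothing is deleted between iterations. The
flavor and image names are placeholders, and mapping "in a continuous way" to a
``constant`` runner with a concurrency of 1 is an assumption.

.. code-block:: python

    import json

    SCENARIO = "NovaServers.boot_and_list_server"

    task = {
        SCENARIO: [{
            # Placeholder flavor and image; adjust to the cloud under test.
            "args": {"flavor": {"name": "m1.tiny"}, "image": {"name": "cirros"}},
            # Task runner: 200 iterations, one after another (assumed mapping).
            "runner": {"type": "constant", "times": 200, "concurrency": 1},
            # Task context: a single temporary user.
            "context": {"users": {"tenants": 1, "users_per_tenant": 1}},
        }]
    }

    runner = task[SCENARIO][0]["runner"]
    # Every iteration boots one more VM, so the i-th listing returns i VMs.
    vms_per_listing = list(range(1, runner["times"] + 1))

    print(json.dumps(task, indent=2))
    print("VMs returned by the last listing:", vms_per_listing[-1])
    print("VMs returned over the whole task:", sum(vms_per_listing))

The per-listing count grows linearly with the iteration number, which gives a
baseline to compare the measured list durations against.
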

Complex scenarios
^^^^^^^^^^^^^^^^^

In fact, the vast majority of Rally scenarios are expressed as a sequence of
**"atomic" actions**. For example, *NovaServers.snapshot* is composed of 6
atomic actions:

1. boot VM
2. snapshot VM
3. delete VM
4. boot VM from snapshot
5. delete VM
6. delete snapshot

Rally measures not only the performance of the scenario as a whole,
but also that of single atomic actions. As a result, Rally also displays the
atomic actions performance data for each scenario iteration in considerable
detail:

.. image:: ../images/Rally_snapshot_vm.png
   :align: center

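
The idea of timing each atomic action separately, alongside the iteration as a
whole, can be illustrated with plain Python. This is only a sketch of the
concept and not Rally's own implementation: ``AtomicTimer``, ``fake_call`` and
the sleep durations are hypothetical stand-ins for real OpenStack API calls.

.. code-block:: python

    import time
    from contextlib import contextmanager


    class AtomicTimer:
        """Collects the duration of each named atomic action (hypothetical)."""

        def __init__(self):
            self.durations = []  # (action name, seconds) in execution order

        @contextmanager
        def atomic(self, name):
            started = time.perf_counter()
            try:
                yield
            finally:
                self.durations.append((name, time.perf_counter() - started))


    def fake_call(seconds):
        """Stand-in for an OpenStack API call."""
        time.sleep(seconds)


    def snapshot_iteration(timer):
        """Run one iteration of six atomic actions and return its total time."""
        steps = [
            ("boot VM", 0.03),
            ("snapshot VM", 0.02),
            ("delete VM", 0.01),
            ("boot VM from snapshot", 0.03),
            ("delete VM", 0.01),
            ("delete snapshot", 0.01),
        ]
        started = time.perf_counter()
        for name, cost in steps:
            with timer.atomic(name):
                fake_call(cost)
        return time.perf_counter() - started


    timer = AtomicTimer()
    total = snapshot_iteration(timer)
    for name, seconds in timer.durations:
        print(f"{name:<24}{seconds:.3f}s")
    print(f"{'whole iteration':<24}{total:.3f}s")

Durations are kept in a list rather than a dict because the same action name
("delete VM") occurs twice within one iteration.
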

Architecture
------------

Usually OpenStack projects are implemented *"as-a-Service"*, so Rally provides
this approach as well. In addition, it implements a *CLI-driven* approach that
does not require a daemon:

1. **Rally as-a-Service**: Run Rally as a set of daemons that present a Web
   UI *(work in progress)* so 1 RaaS could be used by a whole team.
2. **Rally as-an-App**: Rally as just a lightweight and portable CLI app
   (without any daemons) that makes it simple to use & develop.

The diagram below shows how this is possible:

.. image:: ../images/Rally_Architecture.png
   :align: center

The actual **Rally core** consists of 3 main components, listed below in the
order they go into action:

1. **Deploy** - stores credentials for your deployments; these credentials
   are used by the verify and task commands. It has a pluggable mechanism that
   allows one to implement basic LCM for the testing environment as well.

2. **Verify** - wraps a unittest-based functional testing framework to
   provide a complete tool with result storage and reporting.
   Currently only a plugin for OpenStack Tempest is implemented.

3. **Task** - a framework that allows you to write parametrized plugins and
   combine them into complex test cases using YAML. The framework allows you
   to produce all kinds of tests including functional, concurrency,
   regression, load, scale, capacity and even chaos testing.

It should become fairly obvious why the Rally core needs to be split into these
parts if you take a look at the following diagram that visualizes a rough
**algorithm for starting testing clouds at scale**.

.. image:: ../images/Rally_QA.png
   :align: center