Merge "Add test plan for OpenStack API performance metrics"
commit e12762db42

doc/source/test_plans/openstack_api_metrics/plan.rst (normal file, 250 lines)

@@ -0,0 +1,250 @@

.. _openstack_api_performance_metrics_test_plan:

=================================
OpenStack API Performance Metrics
=================================

:status: **draft**
:version: 1.0

:Abstract:

  This test plan defines performance metrics for the OpenStack API and the
  way to measure them.

:Conventions:

  - **Operation Duration** - how long it takes to perform a single
    operation.
  - **Operation Throughput** - how many operations can be done in one second
    on average.
  - **Concurrency** - how many parallel operations can be run when operation
    throughput reaches its maximum.
  - **Scale Impact** - comparison of operation metrics when the number of
    objects is high versus low.


Test Plan
=========

This test plan defines a set of performance metrics for the OpenStack API.
These metrics can be used to compare different cloud implementations and for
performance tuning.

This test plan can be used to answer the following questions:

* How long does it take to perform a particular operation? (*e.g. the
  duration of the Neutron net_create operation*)
* How many operations can be run in parallel without degradation?
  (*e.g. can one run 10 Neutron net_create operations in parallel, or is it
  better to run them one by one*)
* How many operations of a particular kind can an OpenStack cloud process
  per second? (*e.g. find out whether one can do 100 Neutron net_create
  operations per second or not*)
* What is the impact of having many objects in the cloud? How does the
  performance degrade? (*e.g. will the cloud be slower when there are
  thousands of objects, and how much slower will it be*)

Test Environment
----------------

Preparation
^^^^^^^^^^^

This test plan is executed against an existing OpenStack cloud.

Measurements can be done with any tool that can:

* report the duration of single operations;
* execute operations one by one and in a configurable number of concurrent
  threads.

Environment description
^^^^^^^^^^^^^^^^^^^^^^^

The environment description includes the hardware specification of servers,
network parameters, operating system and OpenStack deployment characteristics.

Hardware
~~~~~~~~

This section contains a list of all types of hardware nodes.

+-----------+-------+----------------------------------------------------+
| Parameter | Value | Comments                                           |
+-----------+-------+----------------------------------------------------+
| model     |       | e.g. Supermicro X9SRD-F                            |
+-----------+-------+----------------------------------------------------+
| CPU       |       | e.g. 6 x Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10GHz |
+-----------+-------+----------------------------------------------------+
| role      |       | e.g. compute or network                            |
+-----------+-------+----------------------------------------------------+

Network
~~~~~~~

This section contains a list of interfaces and network parameters.
For complicated cases this section may include a topology diagram and switch
parameters.

+------------------+-------+-------------------------+
| Parameter        | Value | Comments                |
+------------------+-------+-------------------------+
| network role     |       | e.g. provider or public |
+------------------+-------+-------------------------+
| card model       |       | e.g. Intel              |
+------------------+-------+-------------------------+
| driver           |       | e.g. ixgbe              |
+------------------+-------+-------------------------+
| speed            |       | e.g. 10G or 1G          |
+------------------+-------+-------------------------+
| MTU              |       | e.g. 9000               |
+------------------+-------+-------------------------+
| offloading modes |       | e.g. default            |
+------------------+-------+-------------------------+

Software
~~~~~~~~

This section describes installed software.

+-----------------+-------+---------------------------+
| Parameter       | Value | Comments                  |
+-----------------+-------+---------------------------+
| OS              |       | e.g. Ubuntu 14.04.3       |
+-----------------+-------+---------------------------+
| OpenStack       |       | e.g. Liberty              |
+-----------------+-------+---------------------------+
| Hypervisor      |       | e.g. KVM                  |
+-----------------+-------+---------------------------+
| Neutron plugin  |       | e.g. ML2 + OVS            |
+-----------------+-------+---------------------------+
| L2 segmentation |       | e.g. VLAN or VxLAN or GRE |
+-----------------+-------+---------------------------+
| virtual routers |       | e.g. legacy or HA or DVR  |
+-----------------+-------+---------------------------+


Test Case: Operation Performance Measurements
---------------------------------------------

Description
^^^^^^^^^^^

The test case is performed by running a specific OpenStack operation. Every
operation is executed several times to collect more reliable statistical data.


Parameters
^^^^^^^^^^

The only parameter is the operation being tested.

List of performance metrics
^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. list-table::
   :header-rows: 1

   *
     - Priority
     - Value
     - Measurement Unit
     - Description
   *
     - 1
     - Duration median
     - ms
     - Median of operation durations measured when operations are performed
       one by one in 1 thread
   *
     - 1
     - Duration 95th percentile
     - ms
     - 95th percentile of operation durations measured when operations are
       performed one by one in 1 thread
   *
     - 2
     - Duration 99th percentile
     - ms
     - 99th percentile of operation durations measured when operations are
       performed one by one in 1 thread
   *
     - 1
     - Concurrency
     - count
     - How many operations can be processed in parallel without significant
       degradation of duration
   *
     - 1
     - Throughput
     - operations per second
     - How many operations can be processed in one second
   *
     - 1
     - Scale impact
     - %
     - Performance degradation measured as the ratio of operation duration
       when the number of objects is 1k versus when the number of objects is
       low


Tools
=====

Rally
-----

This test plan can be executed with the `Rally`_ tool. Rally can report the
duration of individual operations and can be configured to perform operations
in multiple parallel threads.
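
As an illustration of the "configurable number of threads" requirement, the sketch below builds a task definition in the dictionary layout that Rally task files commonly use (scenario name mapped to a list of runs, each with ``args``, ``runner`` and ``context``). The scenario name, the ``constant`` runner settings and the ``users`` context values are assumptions chosen for the example, not values prescribed by this plan; check the documentation of the Rally version in use for the exact schema.

```python
import json


def build_task(scenario, times, concurrency):
    """Return a Rally-style task definition as a plain dict.

    The layout is an assumption based on the commonly used Rally task
    format; it is not taken from this test plan.
    """
    return {
        scenario: [
            {
                "args": {},
                "runner": {
                    "type": "constant",          # fixed number of iterations
                    "times": times,              # N iterations in total
                    "concurrency": concurrency,  # parallel threads
                },
                "context": {
                    # Temporary tenants and users for the run (assumed values).
                    "users": {"tenants": 1, "users_per_tenant": 1},
                },
            }
        ]
    }


# Assumed scenario name, used here only as an example.
task = build_task("NeutronNetworks.create_and_delete_networks", 100, 1)
print(json.dumps(task, indent=2))
```

The printed JSON could be saved to a file and adapted before being passed to Rally; only the ``concurrency`` value would need to change between the single-thread and the parallel measurements.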

Rally scenario execution also involves creating and deleting auxiliary objects
(such as tenants and users) and cleaning up resources created by the scenario.
All of this consumes extra time, so it makes sense to run measurements grouped
by resource type rather than one by one. For example, instead of having 4
separate scenarios for the create, get, list and delete operations, have 1
scenario that calls these operations sequentially.

Scenarios
^^^^^^^^^

To perform measurements we need 2 types of scenarios:

* **cyclic** - a sequence of `create`, `get`, `list` and `delete`
  operations; the total number of objects does not increase.
* **accumulative** - a sequence of `create`, `get` and `list` operations;
  the total number of objects increases.
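
The difference between the two scenario types can be sketched in a few lines of Python. ``InMemoryClient`` is a hypothetical stand-in for a real OpenStack client and ``timed`` is a helper invented for this example; neither is part of Rally or of this plan.

```python
import time


class InMemoryClient:
    """Hypothetical stand-in for a real OpenStack client (assumption)."""

    def __init__(self):
        self._objects = {}
        self._next_id = 0

    def create(self):
        self._next_id += 1
        self._objects[self._next_id] = {"id": self._next_id}
        return self._next_id

    def get(self, object_id):
        return self._objects[object_id]

    def list(self):
        return list(self._objects.values())

    def delete(self, object_id):
        del self._objects[object_id]

    def count(self):
        return len(self._objects)


def timed(durations, name, func, *args):
    """Call func, record its wall-clock duration in ms under `name`."""
    start = time.perf_counter()
    result = func(*args)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    durations.setdefault(name, []).append(elapsed_ms)
    return result


def cyclic(client, durations):
    """create, get, list, delete: the object count stays constant."""
    object_id = timed(durations, "create", client.create)
    timed(durations, "get", client.get, object_id)
    timed(durations, "list", client.list)
    timed(durations, "delete", client.delete, object_id)


def accumulative(client, durations):
    """create, get, list: every iteration leaves one more object behind."""
    object_id = timed(durations, "create", client.create)
    timed(durations, "get", client.get, object_id)
    timed(durations, "list", client.list)


cyclic_client, cyclic_durations = InMemoryClient(), {}
for _ in range(10):
    cyclic(cyclic_client, cyclic_durations)

acc_client, acc_durations = InMemoryClient(), {}
for _ in range(10):
    accumulative(acc_client, acc_durations)

print(cyclic_client.count(), acc_client.count())  # prints "0 10"
```

Both scenarios record one duration per operation per iteration, so a single run yields samples for every operation of the resource type.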

Duration metrics
^^^^^^^^^^^^^^^^

Duration metrics are collected with the help of the cyclic scenario.

Actions:

#. Set concurrency to 1 thread.
#. Run the scenario N times, where N is large enough to make a good sample.
   Collect the list of operation durations.
#. For every operation calculate the median and the percentiles.
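
The last step can be sketched as follows. The plan does not say which percentile definition to use; this example assumes the nearest-rank method, and the function names are made up for illustration.

```python
import math
import statistics


def percentile(values, percent):
    """Nearest-rank percentile (an assumed definition, see the lead-in)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(percent * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(durations_ms):
    """Return the duration metrics of the plan for one operation."""
    return {
        "median": statistics.median(durations_ms),
        "p95": percentile(durations_ms, 95),
        "p99": percentile(durations_ms, 99),
    }


# Synthetic sample: 100 durations of 1 ms, 2 ms, ..., 100 ms.
sample = [float(value) for value in range(1, 101)]
print(summarize(sample))
```

With interpolating percentile definitions the 95th and 99th percentile values can differ slightly on small samples, so it is worth stating the chosen method next to reported results.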


Concurrency and throughput metrics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

These metrics are collected with the help of cyclic scenarios.

Actions:

#. Start with a concurrency of 1 thread.
#. Run the scenario N times, where N is large enough to make a good sample.
   Collect the list of operation durations.
#. Calculate throughput (divide the number of operations by the total
   duration).
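
A minimal sketch of the throughput calculation is shown below. Repeating the measurement at several concurrency levels is an assumption made for this example (the steps above only say to start with 1 thread), and ``fake_operation`` is a hypothetical placeholder for a real API call.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_operation():
    """Hypothetical stand-in for one API call; sleeps for about 10 ms."""
    time.sleep(0.01)


def timed_call(operation):
    """Run one operation and return its duration in seconds."""
    start = time.perf_counter()
    operation()
    return time.perf_counter() - start


def measure(operation, times, concurrency):
    """Run `operation` `times` times in `concurrency` threads.

    Returns (throughput in operations per second, list of durations).
    """
    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        durations = list(pool.map(timed_call, [operation] * times))
    total = time.perf_counter() - started
    return times / total, durations


results = {}
for concurrency in (1, 2, 4, 8):  # assumed sweep of concurrency levels
    throughput, durations = measure(fake_operation, 40, concurrency)
    results[concurrency] = throughput
    print(concurrency, round(throughput, 1))
```

With a real cloud, throughput would be expected to stop growing at some concurrency level while individual durations start to rise; that level is a candidate value for the Concurrency metric.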

Scale impact metrics
^^^^^^^^^^^^^^^^^^^^

These metrics are collected with the help of accumulative scenarios.

Actions:

#. Set concurrency to 1 thread.
#. Run the scenario until the desired number of objects is reached (e.g.
   1 thousand).
#. Calculate the mean duration for the first 50 objects and for the last 50.
#. Calculate the ratio between the means.
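
The last two steps amount to the small calculation below. The direction of the ratio (last window divided by first window) is an assumption derived from the metric description, and the input data is synthetic.

```python
import statistics


def scale_impact(durations_ms, window=50):
    """Ratio of the mean of the last `window` durations to the first."""
    if len(durations_ms) < 2 * window:
        raise ValueError("need at least 2 * window samples")
    first = statistics.mean(durations_ms[:window])
    last = statistics.mean(durations_ms[-window:])
    return last / first


# Synthetic data: duration grows linearly from 10 ms over 1000 objects.
durations = [10.0 + 0.01 * index for index in range(1000)]
ratio = scale_impact(durations)
print("ratio: {:.3f}, degradation: {:.1f}%".format(ratio, (ratio - 1) * 100))
```

A ratio close to 1 means that the operation is not noticeably affected by the number of existing objects; expressing it as a percentage matches the measurement unit of the Scale impact metric.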

.. references:

.. _Rally: http://rally.readthedocs.io/