Add test plan for OpenStack API performance metrics
The test plan defines a set of metrics that can be collected for OpenStack API operations.

Change-Id: I20899adff29578253be7a1d9440d937797a9f400
parent 20b0943204
commit 20d28676ec
doc/source/test_plans/openstack_api_metrics/plan.rst
@@ -0,0 +1,250 @@
.. _openstack_api_performance_metrics_test_plan:

=================================
OpenStack API Performance Metrics
=================================

:status: **draft**
:version: 1.0

:Abstract:

  This test plan defines performance metrics for the OpenStack API and the way
  to measure them.

:Conventions:

  - **Operation Duration** - how long it takes to perform a single
    operation.
  - **Operation Throughput** - how many operations can be completed per second
    on average.
  - **Concurrency** - how many operations run in parallel when operation
    throughput reaches its maximum.
  - **Scale Impact** - comparison of operation metrics when the number of
    objects is high versus low.


Test Plan
=========

This test plan defines a set of performance metrics for the OpenStack API.
These metrics can be used to compare different cloud implementations and for
performance tuning.

This test plan can be used to answer the following questions:

* How long does it take to perform a particular operation? (*e.g. the
  duration of the Neutron net_create operation*)
* How many operations can be run in parallel without degradation?
  (*e.g. can one run 10 Neutron net_create operations in parallel, or is it
  better to run them one by one*)
* How many operations of a particular kind can an OpenStack cloud process per
  second? (*e.g. find out whether one can do 100 Neutron net_create
  operations per second or not*)
* What is the impact of having many objects in the cloud? How does the
  performance degrade? (*e.g. will the cloud be slower when there are
  thousands of objects, and how much slower*)

Test Environment
----------------

Preparation
^^^^^^^^^^^

This test plan is executed against an existing OpenStack cloud.

Measurements can be done with any tool that can:

* report the duration of individual operations;
* execute operations one by one and in a configurable number of concurrent
  threads.

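The two tool requirements listed under Preparation can be illustrated with a
minimal Python sketch. It is not part of any existing tool: `measure` and
`fake_operation` are hypothetical names, and the operation is a sleep standing
in for a real API call.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def measure(operation, times, concurrency=1):
    """Run `operation` `times` times using `concurrency` worker threads.

    Returns (durations, wall_time), both in seconds. `durations` holds one
    entry per call; `wall_time` covers the whole run.
    """
    def timed_call(_):
        start = time.perf_counter()
        operation()
        return time.perf_counter() - start

    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        durations = list(pool.map(timed_call, range(times)))
    wall_time = time.perf_counter() - wall_start
    return durations, wall_time


def fake_operation():
    # Hypothetical stand-in for a real API call such as Neutron net_create.
    time.sleep(0.01)


if __name__ == "__main__":
    durations, wall_time = measure(fake_operation, times=20, concurrency=4)
    print(len(durations), "operations in", round(wall_time, 3), "s")
```

With `concurrency=1` the calls run one by one; larger values run them in
parallel threads, which is the only knob the test cases below rely on.
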
Environment description
^^^^^^^^^^^^^^^^^^^^^^^

The environment description includes the hardware specification of servers,
network parameters, operating system and OpenStack deployment characteristics.

Hardware
~~~~~~~~

This section lists all types of hardware nodes.

+-----------+-------+----------------------------------------------------+
| Parameter | Value | Comments                                           |
+-----------+-------+----------------------------------------------------+
| model     |       | e.g. Supermicro X9SRD-F                            |
+-----------+-------+----------------------------------------------------+
| CPU       |       | e.g. 6 x Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10GHz |
+-----------+-------+----------------------------------------------------+
| role      |       | e.g. compute or network                            |
+-----------+-------+----------------------------------------------------+

Network
~~~~~~~

This section lists interfaces and network parameters.
For complicated cases this section may include a topology diagram and switch
parameters.

+------------------+-------+-------------------------+
| Parameter        | Value | Comments                |
+------------------+-------+-------------------------+
| network role     |       | e.g. provider or public |
+------------------+-------+-------------------------+
| card model       |       | e.g. Intel              |
+------------------+-------+-------------------------+
| driver           |       | e.g. ixgbe              |
+------------------+-------+-------------------------+
| speed            |       | e.g. 10G or 1G          |
+------------------+-------+-------------------------+
| MTU              |       | e.g. 9000               |
+------------------+-------+-------------------------+
| offloading modes |       | e.g. default            |
+------------------+-------+-------------------------+

Software
~~~~~~~~

This section describes the installed software.

+-----------------+-------+---------------------------+
| Parameter       | Value | Comments                  |
+-----------------+-------+---------------------------+
| OS              |       | e.g. Ubuntu 14.04.3       |
+-----------------+-------+---------------------------+
| OpenStack       |       | e.g. Liberty              |
+-----------------+-------+---------------------------+
| Hypervisor      |       | e.g. KVM                  |
+-----------------+-------+---------------------------+
| Neutron plugin  |       | e.g. ML2 + OVS            |
+-----------------+-------+---------------------------+
| L2 segmentation |       | e.g. VLAN or VXLAN or GRE |
+-----------------+-------+---------------------------+
| virtual routers |       | e.g. legacy or HA or DVR  |
+-----------------+-------+---------------------------+


Test Case: Operation Performance Measurements
---------------------------------------------

Description
^^^^^^^^^^^

The test case is performed by running a specific OpenStack operation. Every
operation is executed several times to collect more reliable statistical data.

Parameters
^^^^^^^^^^

The only parameter is the operation being tested.

List of performance metrics
^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. list-table::
   :header-rows: 1

   *
     - Priority
     - Value
     - Measurement Unit
     - Description
   *
     - 1
     - Duration median
     - ms
     - Median of operation durations measured when operations are performed
       one by one in 1 thread
   *
     - 1
     - Duration 95th percentile
     - ms
     - 95th percentile of operation durations measured when operations are
       performed one by one in 1 thread
   *
     - 2
     - Duration 99th percentile
     - ms
     - 99th percentile of operation durations measured when operations are
       performed one by one in 1 thread
   *
     - 1
     - Concurrency
     - count
     - How many operations can be processed in parallel without significant
       degradation of duration
   *
     - 1
     - Throughput
     - operations per second
     - How many operations can be processed in one second
   *
     - 1
     - Scale impact
     - %
     - Performance degradation measured as the ratio of operation duration
       when the number of objects is 1k to the duration when the number of
       objects is low


Tools
=====

Rally
-----

This test plan can be executed with the `Rally`_ tool. Rally can report the
duration of individual operations and can be configured to perform operations
in multiple parallel threads.

Rally scenario execution also involves creating and deleting additional
objects (such as tenants and users) and cleaning up resources created by the
scenario. All this consumes extra time, so it makes sense to run measurements
grouped by resource type rather than one by one. For example, instead of
having 4 separate scenarios for the create, get, list and delete operations,
have 1 scenario that calls these operations sequentially.

Scenarios
^^^^^^^^^

To perform the measurements we need 2 types of scenarios:

* **cyclic** - a sequence of `create`, `get`, `list` and `delete`
  operations; the total number of objects does not increase.
* **accumulative** - a sequence of `create`, `get` and `list` operations;
  the total number of objects increases.

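As an illustration of the two scenario types, the sketch below builds a Rally
task definition in Python and prints it as JSON. The helper `rally_task` is
hypothetical. The two built-in Neutron scenarios are only assumed to be the
closest approximations: neither calls all of `create`, `get`, `list` and
`delete`, so a custom scenario plugin would likely be needed to match the plan
exactly, and the accumulative run may also require raised quotas.

```python
import json


def rally_task(scenario, times, concurrency):
    """Build a Rally task dict with a single workload for `scenario`."""
    return {
        scenario: [
            {
                # "constant" runner: fixed number of iterations executed
                # with a fixed number of parallel workers.
                "runner": {
                    "type": "constant",
                    "times": times,
                    "concurrency": concurrency,
                },
                # Temporary tenant and user created for the run.
                "context": {
                    "users": {"tenants": 1, "users_per_tenant": 1},
                },
            }
        ]
    }


# Assumed closest built-in approximations of the two scenario types.
CYCLIC = "NeutronNetworks.create_and_delete_networks"
ACCUMULATIVE = "NeutronNetworks.create_and_list_networks"

if __name__ == "__main__":
    print(json.dumps(rally_task(CYCLIC, times=100, concurrency=1), indent=2))
```

The printed JSON can be saved to a file and passed to Rally as a task; the
values of `times` and `concurrency` here are placeholders.
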
Duration metrics
^^^^^^^^^^^^^^^^

Duration metrics are collected with the help of the cyclic scenario.

Actions:

#. Set concurrency to 1 thread.
#. Run the scenario N times, where N is large enough to make a good sample.
   Collect the list of operation durations.
#. For every operation, calculate the median and the 95th and 99th
   percentiles.


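The last step of the duration measurement can be sketched as follows. The
function names are hypothetical, and linear interpolation between neighbouring
samples is an assumption: the plan does not say which percentile definition to
use.

```python
import statistics


def percentile(values, pct):
    """Return the `pct`-th percentile using linear interpolation."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no samples")
    # Fractional index of the requested percentile in the sorted sample.
    rank = (len(ordered) - 1) * pct / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def duration_metrics(durations_ms):
    """Compute the duration metrics for one operation (values in ms)."""
    return {
        "median": statistics.median(durations_ms),
        "p95": percentile(durations_ms, 95),
        "p99": percentile(durations_ms, 99),
    }
```

`duration_metrics` would be called once per operation (for example once for
`create` and once for `delete`) with the durations collected in 1 thread.
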
Concurrency and throughput metrics
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

These metrics are collected with the help of cyclic scenarios.

Actions:

#. Start with concurrency set to 1 thread.
#. Run the scenario N times, where N is large enough to make a good sample.
   Collect the list of operation durations.
#. Calculate throughput (divide the number of operations by the total
   duration).
#. Increase concurrency and repeat the previous two steps until throughput
   stops growing. The concurrency at which throughput peaks is the
   concurrency metric.

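A minimal sketch of the throughput calculation, combined with the definition
of Concurrency from the conventions (the number of parallel operations at
which throughput is highest). The function names are hypothetical, and "total
duration" is assumed to mean the wall-clock time of the whole run rather than
the sum of individual operation durations.

```python
def throughput(operation_count, total_duration_s):
    """Operations per second: operation count divided by total duration."""
    if total_duration_s <= 0:
        raise ValueError("total duration must be positive")
    return operation_count / total_duration_s


def best_concurrency(runs):
    """Pick the concurrency level with the highest throughput.

    `runs` maps concurrency level -> (operation_count, total_duration_s).
    Ties are resolved in favour of the lower concurrency level.
    """
    rates = {
        level: throughput(count, total)
        for level, (count, total) in runs.items()
    }
    peak = max(rates.values())
    level = min(lvl for lvl, rate in rates.items() if rate == peak)
    return level, peak
```

For example, with made-up measurements of 100 operations taking 50 s, 26 s,
20 s and 25 s at concurrency 1, 2, 4 and 8, `best_concurrency` returns
concurrency 4 with a throughput of 5 operations per second.
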
Scale impact metrics
^^^^^^^^^^^^^^^^^^^^

These metrics are collected with the help of accumulative scenarios.

Actions:

#. Set concurrency to 1 thread.
#. Run the scenario until the desired number of objects is reached (e.g.
   1 thousand).
#. Calculate the mean operation duration for the first 50 objects and for
   the last 50.
#. Calculate the ratio between the two means.

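The scale impact calculation can be sketched in a few lines. `scale_impact` is
a hypothetical helper; it assumes the durations are listed in the order in
which the objects were created and that the ratio is taken as last over first.

```python
from statistics import mean


def scale_impact(durations, window=50):
    """Ratio of the mean duration of the last `window` operations to the
    mean duration of the first `window` operations."""
    if len(durations) < 2 * window:
        raise ValueError("need at least 2 * window samples")
    first = mean(durations[:window])
    last = mean(durations[-window:])
    return last / first
```

A result of 1.5 means the operation is 1.5 times slower once the cloud holds
the target number of objects; multiply by 100 to report it in percent.
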
.. references:

.. _Rally: http://rally.readthedocs.io/