From 20d28676eca00527371e2063cb5c9df969ab00d3 Mon Sep 17 00:00:00 2001
From: Ilya Shakhat
Date: Wed, 14 Sep 2016 16:03:15 +0300
Subject: [PATCH] Add test plan for OpenStack API performance metrics

The test plan defines set of metrics that can be collected for OpenStack
API operations.

Change-Id: I20899adff29578253be7a1d9440d937797a9f400
---
 .../test_plans/openstack_api_metrics/plan.rst | 250 ++++++++++++++++++
 1 file changed, 250 insertions(+)
 create mode 100644 doc/source/test_plans/openstack_api_metrics/plan.rst

diff --git a/doc/source/test_plans/openstack_api_metrics/plan.rst b/doc/source/test_plans/openstack_api_metrics/plan.rst
new file mode 100644
index 0000000..35b4666
--- /dev/null
+++ b/doc/source/test_plans/openstack_api_metrics/plan.rst
@@ -0,0 +1,250 @@
+.. _openstack_api_performance_metrics_test_plan:
+
+=================================
+OpenStack API Performance Metrics
+=================================
+
+:status: **draft**
+:version: 1.0
+
+:Abstract:
+
+  This test plan defines performance metrics for OpenStack API and the way
+  to measure them.
+
+:Conventions:
+  - **Operation Duration** - how long it takes to perform a single
+    operation.
+  - **Operation Throughput** - how many operations can be done in one second on
+    average.
+  - **Concurrency** - how many parallel operations can be run when operation
+    throughput reaches the maximum.
+  - **Scale Impact** - comparison of operation metrics when the number of
+    objects is high versus low.
+
+
+Test Plan
+=========
+
+This test plan defines a set of performance metrics for OpenStack API. These
+metrics can be used to compare different cloud implementations and for
+performance tuning.
+
+This test plan can be used to answer the following questions:
+  * How long does it take to perform a particular operation? (*e.g. duration of
+    Neutron net_create operation*)
+  * How many concurrent operations can be run in parallel without degradation?
+    (*e.g. can one do 10 Neutron net_create operations in parallel or better do
+    them one-by-one*)
+  * How many particular operations can an OpenStack cloud process in a second?
+    (*e.g. find out whether one can do 100 Neutron net_create ops per second or
+    not*)
+  * What is the impact of having many objects in the cloud? How does the
+    performance degrade? (*e.g. will the cloud be slower when there are
+    thousands of objects and how much slower will it be*)
+
+Test Environment
+----------------
+
+Preparation
+^^^^^^^^^^^
+
+This test plan is executed against an existing OpenStack cloud.
+
+Measurements can be done with any tool that can:
+  * report duration of single operations;
+  * execute operations one-by-one and in a configurable number of concurrent
+    threads.
+
+Environment description
+^^^^^^^^^^^^^^^^^^^^^^^
+
+The environment description includes hardware specification of servers,
+network parameters, operating system and OpenStack deployment characteristics.
+
+Hardware
+~~~~~~~~
+
+This section contains a list of all types of hardware nodes.
+
++-----------+-------+----------------------------------------------------+
+| Parameter | Value | Comments                                           |
++-----------+-------+----------------------------------------------------+
+| model     |       | e.g. Supermicro X9SRD-F                            |
++-----------+-------+----------------------------------------------------+
+| CPU       |       | e.g. 6 x Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10GHz |
++-----------+-------+----------------------------------------------------+
+| role      |       | e.g. compute or network                            |
++-----------+-------+----------------------------------------------------+
+
+Network
+~~~~~~~
+
+This section contains a list of interfaces and network parameters.
+For complicated cases this section may include topology diagram and switch
+parameters.
+
++------------------+-------+-------------------------+
+| Parameter        | Value | Comments                |
++------------------+-------+-------------------------+
+| network role     |       | e.g. provider or public |
++------------------+-------+-------------------------+
+| card model       |       | e.g. Intel              |
++------------------+-------+-------------------------+
+| driver           |       | e.g. ixgbe              |
++------------------+-------+-------------------------+
+| speed            |       | e.g. 10G or 1G          |
++------------------+-------+-------------------------+
+| MTU              |       | e.g. 9000               |
++------------------+-------+-------------------------+
+| offloading modes |       | e.g. default            |
++------------------+-------+-------------------------+
+
+Software
+~~~~~~~~
+
+This section describes installed software.
+
++-----------------+-------+---------------------------+
+| Parameter       | Value | Comments                  |
++-----------------+-------+---------------------------+
+| OS              |       | e.g. Ubuntu 14.04.3       |
++-----------------+-------+---------------------------+
+| OpenStack       |       | e.g. Liberty              |
++-----------------+-------+---------------------------+
+| Hypervisor      |       | e.g. KVM                  |
++-----------------+-------+---------------------------+
+| Neutron plugin  |       | e.g. ML2 + OVS            |
++-----------------+-------+---------------------------+
+| L2 segmentation |       | e.g. VLAN or VxLAN or GRE |
++-----------------+-------+---------------------------+
+| virtual routers |       | e.g. legacy or HA or DVR  |
++-----------------+-------+---------------------------+
+
+
+Test Case: Operation Performance Measurements
+---------------------------------------------
+
+Description
+^^^^^^^^^^^
+
+The test case is performed by running a specific OpenStack operation. Every
+operation is executed several times to collect more reliable statistical data.
+
+
+Parameters
+^^^^^^^^^^
+
+The only parameter is the operation being tested.
+
+List of performance metrics
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. list-table::
+   :header-rows: 1
+
+   *
+     - Priority
+     - Value
+     - Measurement Unit
+     - Description
+   *
+     - 1
+     - Duration median
+     - ms
+     - Median of operation durations measured when operations are performed
+       one-by-one in 1 thread
+   *
+     - 1
+     - Duration 95% percentile
+     - ms
+     - 95% percentile of operation durations measured when operations are
+       performed one-by-one in 1 thread
+   *
+     - 2
+     - Duration 99% percentile
+     - ms
+     - 99% percentile of operation durations measured when operations are
+       performed one-by-one in 1 thread
+   *
+     - 1
+     - Concurrency
+     - count
+     - How many operations can be processed in parallel without significant
+       degradation of duration
+   *
+     - 1
+     - Throughput
+     - operations per second
+     - How many operations can be processed in one second
+   *
+     - 1
+     - Scale impact
+     - %
+     - Performance degradation measured as ratio of operation duration when
+       number of objects is 1k versus when number of objects is low.
+
+
+Tools
+=====
+
+Rally
+-----
+
+This test plan can be executed with the `Rally`_ tool. Rally can report
+duration of individual operations and can be configured to perform operations
+in multiple parallel threads.
+
+Rally scenario execution also involves creation/deletion of additional objects
+(like tenants, users) and cleaning of resources created by the scenario. All
+this consumes extra time, so it makes sense to run measurements not one-by-one,
+but grouped by resource type. E.g. instead of having 4 separate scenarios for
+create, get, list and delete operations, have 1 that calls these operations
+sequentially.
+
+Scenarios
+^^^^^^^^^
+
+To perform measurements we will need 2 types of scenarios:
+  * **cyclic** - sequence of `create`, `get`, `list` and `delete`
+    operations; total number of objects is not increased.
+  * **accumulative** - sequence of `create`, `get` and `list` operations;
+    total number of objects is increasing.
+
+Duration metrics
+^^^^^^^^^^^^^^^^
+
+Duration metrics are collected with the help of the cyclic scenario.
+
+Actions:
+  #. Set concurrency to 1 thread.
+  #. Run scenario N times, where N is large enough to make a good sample.
+     Collect list of operation durations.
+  #. For every operation, calculate the median and percentiles.
+
+
+Concurrency and throughput metrics
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+These metrics are collected with the help of cyclic scenarios.
+
+Actions:
+  #. Start with a concurrency of 1 thread.
+  #. Run scenario N times, where N is large enough to make a good sample.
+     Collect list of operation durations.
+  #. Calculate throughput (divide the number of operations by total duration).
+
+Scale impact metrics
+^^^^^^^^^^^^^^^^^^^^
+
+These metrics are collected with the help of accumulative scenarios.
+
+Actions:
+  #. Set concurrency to 1 thread.
+  #. Run scenario until desired number of objects is reached (e.g. 1 thousand).
+  #. Calculate the mean duration for the first 50 objects and for the last 50.
+  #. Calculate the ratio between the two means.
+
+.. references:
+
+.. _Rally: http://rally.readthedocs.io/
\ No newline at end of file