Merge "Ceph RBD test plan"
commit 595fe0d395

doc/source/test_plans/ceph_rbd/index.rst (new file, 216 lines)
@@ -0,0 +1,216 @@

.. _ceph_rbd_test_plan:

============================
Ceph RBD performance testing
============================

:status: **ready**
:version: 1.0

:Abstract:

  This test plan provides a set of tests to measure Ceph RBD performance
  on a given Ceph cluster using Wally tests.

Test Plan
=========

The purpose of this document is to describe the environment and the test plan
for benchmarking Ceph block storage (RBD) performance.

The main goals are:

- Define the test approach, methodology and benchmarking toolset for testing
  Ceph block storage performance
- Benchmark Ceph performance for the defined scenarios

Preparation
-----------

This test plan is executed against an existing Ceph cluster. A single VM is
created on every compute node for running the tests. Before the IO load is
applied, the storage devices are filled with pseudo-random data.
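
The plan does not prescribe a particular fill procedure. A minimal sketch,
assuming the test device appears inside the VM as ``/dev/vdb`` (an
illustrative path, not part of the plan), is a fio job that writes the whole
device once with randomized buffers::

  ; fill.fio -- hypothetical pre-fill job, run as "fio fill.fio"
  [fill]
  ; assumed test device attached to the VM
  filename=/dev/vdb
  ; single sequential pass over the whole device, bypassing the page cache
  rw=write
  bs=1M
  direct=1
  ; regenerate pseudo-random buffer contents on every submission
  refill_buffers=1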

Execution Strategy
------------------

All tests are executed sequentially on all dedicated virtual machines. The
number of IO load threads per VM depends on the test phase. Every test starts
with a 30-second warm-up, which is not included in the test results, followed
by a 180-second load phase. At any given time a single VM per compute node
generates IO load with the given number of threads.
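
In fio terms (the load generator used by Wally, see "Test tool" below), the
warm-up and load phases map naturally onto the ``ramp_time`` and ``runtime``
options, and the thread count onto ``numjobs``. A sketch of the shared
options, with the values from the strategy above::

  ; assumed [global] section shared by all load jobs
  [global]
  ; 30-second warm-up, excluded from the reported results
  ramp_time=30
  ; 180-second measured load phase
  runtime=180
  time_based=1
  ; number of IO load threads, varied per test phase
  numjobs=10
  group_reporting=1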

The block size for small-block read/write operations is chosen to be 4KB,
since using smaller blocks is not reasonable because a) most modern HDD drives
have a physical sector size of 4KB and b) the default Linux virtual memory
page size also equals 4KB. Larger block sizes provide no additional
information, since the maximal number of I/O operations per second stays
constant due to HDD mechanics.

The block size for large-block sequential read/write operations has no firm
constraints other than being bigger than the Ceph object size (4MB), so a
value of 16MB was chosen.

Test tool
---------

For benchmarking Ceph performance a new tool (`Wally`_) was developed. It uses
Flexible IO (fio) as the load generator.

Test types
----------

The following load scenarios are selected for Ceph benchmarking:

- Average random read IOPS for small (4KB) blocks, as a function of thread
  count
- Average random write IOPS for small (4KB) blocks, both in direct and
  synchronous mode, as a function of thread count
- Average linear read throughput for large (16MB) blocks, as a function of
  thread count
- Average linear write throughput for large (16MB) blocks, as a function of
  thread count
- Maximal synchronous random write IOPS for small (4KB) blocks with latency
  not exceeding a predefined value
- Maximal random read IOPS for small (4KB) blocks with latency not exceeding a
  predefined value
- Maximal number of threads (virtual machines) that can be served by the
  storage within a given SLA

Every load scenario is executed for several different numbers of simultaneous
threads.
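
The direct and synchronous write modes listed above correspond to different
open flags. In fio job syntax the distinction might be expressed as follows
(job names are illustrative)::

  ; direct mode: O_DIRECT, bypasses the page cache
  [randwrite-direct]
  rw=randwrite
  bs=4k
  direct=1

  ; synchronous mode: O_SYNC, each write is durable before it completes
  [randwrite-sync]
  rw=randwrite
  bs=4k
  sync=1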

Actual values for the scenario parameters are defined in the section "Load
Description".

Disk operations with a small block size show the maximum rate of IO operations
under sustained load, moving the bottleneck to the disks, while sequential
operations with large block sizes allow estimating the system performance when
the bottleneck is the network.
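
For example, assuming a 10 Gbit/s network interface (an illustrative figure,
not part of the plan), large-block throughput per link is capped at

.. math::

   \frac{10\ \text{Gbit/s}}{8\ \text{bits per byte}} = 1.25\ \text{GB/s}

regardless of disk speed, which is why large-block tests probe the network
rather than the disks.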

Test Measurements and Metrics
-----------------------------

During every test run raw metrics are collected at least once per second. The
collected data are reported after the test run. The report should include the
median value of a metric, a 95% confidence interval and the standard
deviation. Charts can be generated for selected metrics.
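
Assuming the :math:`n` per-second samples :math:`x_1, \dots, x_n` of a metric
are treated as independent, the reported statistics can be computed as follows
(the normal approximation for the confidence interval is an assumption of this
sketch, not something the plan fixes):

.. math::

   \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad
   s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2}, \qquad
   \mathrm{CI}_{95\%} = \bar{x} \pm 1.96\,\frac{s}{\sqrt{n}}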

The following metrics are collected on each host for all test scenarios:

- CPU usage, per core and total
- RAM utilization
- Network throughput and IOPS on both the replication and public interfaces
- Throughput, IOPS and latency for each participating storage device

The following metrics are additionally collected on the test VMs, depending on
the test type:

- Random read/write tests:

  - Storage IOPS per thread
  - Storage operation latency

- Sequential read/write tests:

  - Storage throughput

Expected Results and Pass/Fail Criteria
---------------------------------------

Pass/Fail Criteria
~~~~~~~~~~~~~~~~~~

A test run is considered failed if one or more of its test loads does not
complete without errors.

Expected results
~~~~~~~~~~~~~~~~

No specific expected results are provided, since the purpose of this testing
effort is to create a benchmarking framework and collect baseline data for the
described environment.

The only requirement is that the pass criteria are fulfilled.

However, a difference of more than 10% between runs of the same test scenario
should be explained. This threshold is based on test execution experience (the
typical variation of results is about 5%).

Load Description
----------------

- Random write in synchronous mode using a 4k block size; 1, 5, 10, 15, 25 and
  40 threads (a sample fio job for this scenario is sketched after this list)
- Random write in direct mode using a 4k block size; 1 thread
- Random read using a 4k block size; 1, 5, 10, 15, 25, 40, 80 and 120 threads
- Number of VMs conforming to the SLA (4K block size, 60 MBps, 100 IOPS for
  read/write, 30 ms latency)
- Sequential read, direct mode, 16m block size; 1, 3 and 10 threads
- Sequential write, direct mode, 16m block size; 1, 3 and 10 threads
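
As an illustration, the first scenario at its 25-thread step might translate
into a fio job such as the following; the device path is an assumed
placeholder::

  ; hypothetical job: random synchronous write, 4k blocks, 25 threads
  [global]
  ; assumed RBD-backed device inside the test VM
  filename=/dev/vdb
  ramp_time=30
  runtime=180
  time_based=1
  group_reporting=1

  [randwrite-sync-4k]
  rw=randwrite
  bs=4k
  sync=1
  ; one fio worker per load thread
  numjobs=25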

All test loads should be run with both the default and the optimal number of
placement groups.
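
The plan leaves "optimal" unspecified here. A commonly cited rule of thumb
from the Ceph documentation, which could serve as a starting point, targets
about 100 placement groups per OSD:

.. math::

   \text{PG count} \approx \frac{\text{OSD count} \times 100}
                                {\text{replica count}}

rounded up to the nearest power of two.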

Test Environment
----------------

Environment description
^^^^^^^^^^^^^^^^^^^^^^^

The environment description includes the hardware specification of the
servers, the network parameters, the operating system and the Ceph deployment
characteristics.

Hardware
~~~~~~~~

This section contains a list of all types of hardware nodes (the table below
is an example).

+-----------+-------+----------------------------------------------------+
| Parameter | Value | Comments                                           |
+-----------+-------+----------------------------------------------------+
| model     |       | e.g. Supermicro X9SRD-F                            |
+-----------+-------+----------------------------------------------------+
| CPU       |       | e.g. 6 x Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10GHz |
+-----------+-------+----------------------------------------------------+
| role      |       | e.g. compute or network                            |
+-----------+-------+----------------------------------------------------+

Network
~~~~~~~

This section contains a list of interfaces and network parameters. For
complicated cases this section may include a topology diagram and switch
parameters (the table below is an example).

+------------------+-------+-------------------------+
| Parameter        | Value | Comments                |
+------------------+-------+-------------------------+
| network role     |       | e.g. provider or public |
+------------------+-------+-------------------------+
| card model       |       | e.g. Intel              |
+------------------+-------+-------------------------+
| driver           |       | e.g. ixgbe              |
+------------------+-------+-------------------------+
| speed            |       | e.g. 10G or 1G          |
+------------------+-------+-------------------------+
| MTU              |       | e.g. 9000               |
+------------------+-------+-------------------------+
| offloading modes |       | e.g. default            |
+------------------+-------+-------------------------+

Software
~~~~~~~~

This section describes the installed software (the table below is an example).

+-----------------+-------+---------------------------+
| Parameter       | Value | Comments                  |
+-----------------+-------+---------------------------+
| OS              |       | e.g. Ubuntu 16.04         |
+-----------------+-------+---------------------------+
| Ceph            |       | e.g. Jewel                |
+-----------------+-------+---------------------------+

Reports
=======

Test plan execution reports:

* :ref:`ceph_rbd_performance_results_50_osd`

.. references:

.. _Wally: https://github.com/Mirantis/disk_perf_test_tool