Define performance test suite framework

This specification will describe the scope of the performance framework, its
use cases, and how it can scale with new tests added by the community.

Change-Id: I844e1619772789edf571a598ce0b6e4561e4e94b
Signed-off-by: VictorRodriguez <victor.rodriguez.bahena@intel.com>

..
 This work is licensed under a Creative Commons Attribution 3.0 Unported
 License. http://creativecommons.org/licenses/by/3.0/legalcode

..

============================================
StarlingX: Performance measurement framework
============================================

Storyboard: https://storyboard.openstack.org/#!/story/2006406

The measurement of performance metrics in edge cloud systems is a key factor in
quality assurance. Having a common understanding and a practical framework to
run the performance tests is important and necessary for users and developers
in the community. This specification describes the scope of the performance
framework, its use cases, and how it can scale with new tests developed by the
community or imported from existing performance test frameworks.

Problem description
===================

StarlingX can be deployed on a wide variety of hardware and network
environments, and these differences can lead to differences in performance
metrics. As of today, the community does not have a consolidated performance
framework to measure performance metrics with specific, well-defined test
cases.

Use Cases
---------

Developers who make significant changes to the codebase want to ensure that
the changes introduce no performance regression. Developers who work on
performance improvements need to measure the results with a consolidated
framework that is valid across the community. Some community members want to
measure StarlingX performance and promote the advantages of StarlingX that are
relevant to an edge solution. Potential users want to evaluate StarlingX as
one of their edge solution candidates, so offering such a performance
framework will be a positive factor in the evaluation process.

Proposed change
===============

The proposed change is to create a set of scripts and documentation under the
StarlingX testing repository (https://opendev.org/starlingx/test) that anyone
can use to measure the metrics they need on their own hardware and software
environment.

Alternatives
------------

The community can use existing performance test frameworks. Some of these are:

* Rally:

  An OpenStack project dedicated to performance analysis and benchmarking of
  individual OpenStack components.

* Kubernetes perf-tests:

  An open source project dedicated to Kubernetes-related performance test
  tools.

* Yardstick by OPNFV:

  The Yardstick concept decomposes typical virtual network function workload
  performance metrics into several characteristics/performance vectors, each
  of which can be represented by distinct test cases.

The disadvantage of all these frameworks is that they live in separate
projects. Users need to track down test cases scattered across the internet,
which is neither a centralized nor a scalable solution. The proposed StarlingX
performance measurement framework will allow existing test cases to be
imported and re-used in a centralized system that is easy for the community
to use.

Data model impact
-----------------

None

REST API impact
---------------

None

Security impact
---------------

None

Other end-user impact
---------------------

End users will interact with this framework from the command line on the
controller node through the performance-test-runner.py script, which is the
entry point to select and configure a test case. An example of launching a
test case might be:

::

  ./performance-test-runner.py --test failed_vm_detection --compute=0

This will generate results of multiple runs in CSV format that is easy to
post-process.
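
The exact CSV layout is defined by each test case. As a minimal sketch,
assuming a hypothetical ``run,detection_time_ms`` layout written by a test
case such as failed_vm_detection, the results could be summarized with the
Python standard library:

::

  import csv
  import statistics
  import sys


  def summarize(csv_path, column="detection_time_ms"):
      """Return basic statistics for one numeric column of a results file.

      The column name is an assumption for illustration; real column names
      are defined by each test case.
      """
      with open(csv_path, newline="") as handle:
          values = [float(row[column]) for row in csv.DictReader(handle)]
      return {
          "samples": len(values),
          "min": min(values),
          "mean": statistics.mean(values),
          "max": max(values),
          "stdev": statistics.stdev(values) if len(values) > 1 else 0.0,
      }


  if __name__ == "__main__":
      # Example: python3 summarize_results.py failed_vm_detection.csv
      print(summarize(sys.argv[1]))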

Performance Impact
------------------

Running this performance framework or its test cases should not have a
performance impact or introduce noticeable overhead on the system.

Other deployer impact
---------------------

None; there will be no impact on how StarlingX is deployed and configured.

Developer impact
----------------

The impact on developers will be positive, since they will have a consolidated
and well-aligned framework to measure the performance impact of their code and
configuration changes.

Developers can't improve something if they don't know it needs improvement.
That's where this performance testing framework comes in. It gives developers
the tools to detect how fast, stable, and scalable their StarlingX deployment
is, so they can decide whether it needs to be improved.

Upgrade impact
--------------

None

Implementation
==============

The test framework will have the following components:

* Common scripts:

  This directory will contain scripts to capture key timestamps or collect
  other key data along with some use cases, so that developers can measure
  time (latency) or other performance data (such as throughput, system CPU
  utilization, and so on). Any common script that other test cases could use
  should live in this directory (a minimal sketch of what such a helper could
  look like is shown after this component list).

  Despite the existence of a directory for common scripts, each test case is
  responsible for its own metrics. One example is the time measurement of test
  cases that run on more than one node. These kinds of test cases will need a
  clock synchronization protocol such as the Network Time Protocol (NTP). NTP
  is a networking protocol for clock synchronization between computer systems,
  intended to synchronize all participating computers to within a few
  milliseconds of Coordinated Universal Time (UTC). In other cases, a clock
  synchronization protocol might not be necessary.

  Most of the details of whether to re-use a common script from the common
  directory or handle the measurement within the test itself will be left to
  the test implementation.

* performance-test-runner.py:

  The main command-line interface to execute the test cases on the system
  under test.

* Test cases directory:

  This directory can contain wrapper scripts for existing upstream performance
  test cases as well as new test cases specific to StarlingX.
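
As an illustration of the common scripts mentioned above, a minimal
timestamp-capture helper could look like the following sketch. The function
names and the on-disk format are assumptions for illustration only, not a
defined interface of latency_metrics.py or ntp_metrics.py:

::

  import json
  import socket
  import time


  def record_event(label, logfile="/tmp/perf_events.jsonl"):
      """Append one timestamped event to a log file.

      Events recorded on different nodes can later be merged and subtracted
      to compute a latency, provided the nodes are NTP-synchronized as
      described above.
      """
      event = {
          "label": label,
          "host": socket.gethostname(),
          "timestamp": time.time(),  # seconds since the epoch (UTC)
      }
      with open(logfile, "a") as handle:
          handle.write(json.dumps(event) + "\n")
      return event


  def elapsed_ms(start_event, end_event):
      """Latency in milliseconds between two recorded events."""
      return (end_event["timestamp"] - start_event["timestamp"]) * 1000.0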

A view of the directory layout will look like:

::

  performance/
  ├── common
  │   ├── latency_metrics.py
  │   ├── network_generator.py
  │   ├── ntp_metrics.py
  │   └── statistics.py
  ├── tests_cases
  │   ├── failed_compute_detection.py
  │   ├── failed_control_detection.py
  │   ├── failed_network_detection.py
  │   ├── failed_vm_detection.py
  │   ├── neutron_test_case_1.py
  │   ├── opnfv_test_case_1.py
  │   ├── opnfv_test_case_2.py
  │   ├── rally_test_case_1.py
  │   └── rally_test_case_2.py
  └── performance-test-runner.py
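
The internal structure of each test case is left to its implementation. As a
sketch only, assuming a hypothetical run() entry point (the real interface
will be defined together with performance-test-runner.py), a test case module
such as tests_cases/failed_vm_detection.py could look like:

::

  import csv
  import time


  def trigger_vm_failure(compute):
      """Placeholder: force a VM failure on the given compute node."""
      raise NotImplementedError


  def wait_for_detection():
      """Placeholder: poll the platform until the VM failure is reported."""
      raise NotImplementedError


  def run(compute=0, iterations=10, results_file="failed_vm_detection.csv"):
      """Measure VM failure detection time over several iterations and write
      the results in the CSV format described earlier."""
      with open(results_file, "w", newline="") as handle:
          writer = csv.writer(handle)
          writer.writerow(["run", "detection_time_ms"])
          for iteration in range(iterations):
              start = time.monotonic()
              trigger_vm_failure(compute)
              wait_for_detection()
              writer.writerow([iteration, (time.monotonic() - start) * 1000.0])
      return results_file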

The goal is that anyone in the StarlingX community can either define
performance test cases or re-use existing ones from other projects, and create
scripts to measure them in a scalable and repeatable framework.

In the future, it might be possible to evaluate how to connect some basic
performance test cases to Zuul to detect regressions in commits before they
are merged. This will require setting up a robust infrastructure to run the
sanity performance test cases for each commit to be merged.

Assignee(s)
-----------

Wang Hai Tao
Elio Martinez
Juan Carlos Alonzo
Victor Rodriguez

Repos Impacted
--------------

https://opendev.org/starlingx/test

Work Items
----------

* Develop automated scripts for core test cases, such as:

  * Detection of a failed VM
  * Detection of a failed compute node
  * Controller/node failure detection and recovery
  * Detection of a network link failure
  * DPDK live migration latency
  * Average/maximum host/guest latency (cyclictest)
  * Swact time

* Develop performance-test-runner.py to call each test case script
* Integrate performance-test-runner.py with the pytest framework
* Implement cloning of the OPNFV test cases
* Implement execution of the OPNFV test cases
* Implement a monitoring pipeline to automatically detect changes upstream
* Automate a weekly mail with the list of upstream performance test cases that
  changed
* Implement automatic test case documentation based on pydoc
* Document the framework on the StarlingX wiki
* Send regular status to the mailing list and demos on the community call for
  feedback

Dependencies
============

None

Testing
=======

None

Documentation Impact
====================

The performance test cases will be documented with pydoc
(https://docs.python.org/3.0/library/pydoc.html). The pydoc module
automatically generates documentation from Python modules. The documentation
can be presented as pages of text on the console, served to a web browser, or
saved to HTML files.
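
For example, assuming the directory layout shown above, the HTML page for a
single test case could be generated from the console with:

::

  # Writes failed_vm_detection.html to the current directory
  python3 -m pydoc -w tests_cases/failed_vm_detection.py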

The goal is to automatically generate the documentation for the end-user
testing guide. This methodology will catch when the community adds or modifies
a performance test case. At the same time, if a standard config option changes
or is deprecated, the test framework documentation will be updated to reflect
the change.

References
==========

History
=======

.. list-table:: Revisions
   :header-rows: 1

   * - Release Name
     - Description
   * - Stein
     - Introduced