Dina Belova 20b0943204 Add plan<->result references everywhere
Change-Id: I68acbcd3f656faa57540d2d7d4b8e1f2ffb8cfae
Closes-Bug: #1609927
2016-08-16 10:04:02 -07:00

Measuring performance of provisioning systems

status: ready

version: 1.0

Abstract

This document describes a test plan for quantifying the performance of provisioning systems as a function of the number of nodes to be provisioned. The plan includes the collection of several resource utilization metrics, which will be used to analyze and understand the overall performance of each system. In particular, resource bottlenecks will either be fixed, or best practices for system configuration and hardware requirements will be developed.

Conventions
  • Provisioning: the entire process of installing and configuring an operating system.
  • Provisioning system: a service or a set of services which enables the installation of an operating system and performs basic operations such as configuring network interfaces and partitioning disks. A preliminary list of provisioning systems can be found below in Applications. A provisioning system can include configuration management systems like Puppet or Chef, but this feature is not considered in this document. The test plan for configuration management systems is described in the "Measuring_performance_of_configuration_management_systems" document.
  • Performance of a provisioning system: a set of metrics which describes how many nodes can be provisioned at the same time and the hardware resources required to do so.
  • Nodes: the servers which will be provisioned.

Test Plan

This test plan aims to identify the best provisioning solution for cloud deployment, using a specified list of performance measurements and tools.

Test Environment

Preparation

1. The following package needs to be installed on the provisioning system servers to collect performance metrics.

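Software to be installed
+--------------+---------+-----------------------------------+
| package name | version | source                            |
+==============+=========+===================================+
| dstat        | 0.7.2   | Ubuntu trusty universe repository |
+--------------+---------+-----------------------------------+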

Environment description

Test results MUST include a description of the environment used. The following items should be included:

  • Hardware configuration of each server. If virtual machines are used then both physical and virtual hardware should be fully documented. An example format is given below:
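Description of server hardware
+-------+----------------+-------+-------+
|server |name            |       |       |
|       +----------------+-------+-------+
|       |role            |       |       |
|       +----------------+-------+-------+
|       |vendor,model    |       |       |
|       +----------------+-------+-------+
|       |operating_system|       |       |
+-------+----------------+-------+-------+
|CPU    |vendor,model    |       |       |
|       +----------------+-------+-------+
|       |processor_count |       |       |
|       +----------------+-------+-------+
|       |core_count      |       |       |
|       +----------------+-------+-------+
|       |frequency_MHz   |       |       |
+-------+----------------+-------+-------+
|RAM    |vendor,model    |       |       |
|       +----------------+-------+-------+
|       |amount_MB       |       |       |
+-------+----------------+-------+-------+
|NETWORK|interface_name  |       |       |
|       +----------------+-------+-------+
|       |vendor,model    |       |       |
|       +----------------+-------+-------+
|       |bandwidth       |       |       |
+-------+----------------+-------+-------+
|STORAGE|dev_name        |       |       |
|       +----------------+-------+-------+
|       |vendor,model    |       |       |
|       +----------------+-------+-------+
|       |SSD/HDD         |       |       |
|       +----------------+-------+-------+
|       |size            |       |       |
+-------+----------------+-------+-------+
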

  • Configuration of hardware network switches. The configuration file from the switch can be downloaded and attached.
  • Configuration of virtual machines and virtual networks (if used). The configuration files can be attached, along with the mapping of virtual machines to host machines.
  • Network scheme. The scheme should show how all hardware is connected and how the components communicate. All Ethernet/Fibre Channel and VLAN channels should be included. Each interface of every hardware component should be matched with the corresponding L2 channel and IP address.
  • Software configuration of the provisioning system. sysctl.conf and any other kernel file that is changed from the default should be attached. A list of installed packages should be attached. Specifications of the operating system, network interface configuration, and disk partitioning configuration should be included. If distributed provisioning systems are to be tested, the parts that are distributed need to be described.
  • Desired software configuration of the provisioned nodes. The operating system, disk partitioning scheme, network interface configuration, installed packages and other components of the nodes affect the amount of work to be performed by the provisioning system and thus its performance.

Test Case

Description

This test plan contains a single test case, which needs to be run step by step on environments that differ in the parameters listed below.

Parameters

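+-----------------+-------------------------------------------+
| Parameter name  | Value                                     |
+=================+===========================================+
| number of nodes | 10, 20, 40, 80, 160, 320, 640, 1280, 2000 |
+-----------------+-------------------------------------------+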

List of performance metrics

The table below shows the list of test metrics to be collected. The priority is the relative ranking of the importance of each metric in evaluating the performance of the system.

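List of performance metrics
+----------+-------------------+-------------------+------------------------------------------+
| Priority | Value             | Measurement Units | Description                              |
+==========+===================+===================+==========================================+
| 1        | PROVISIONING_TIME | seconds           | The elapsed time to provision all        |
|          |                   |                   | nodes, as a function of the number of    |
|          |                   |                   | nodes.                                   |
+----------+-------------------+-------------------+------------------------------------------+
| 2        | INGRESS_NET       | Gbit/s            | Incoming network bandwidth usage as a    |
|          |                   |                   | function of the number of nodes.         |
|          |                   |                   | Average during provisioning on the host  |
|          |                   |                   | where the provisioning system is         |
|          |                   |                   | installed.                               |
+----------+-------------------+-------------------+------------------------------------------+
| 2        | EGRESS_NET        | Gbit/s            | Outgoing network bandwidth usage as a    |
|          |                   |                   | function of the number of nodes.         |
|          |                   |                   | Average during provisioning on the host  |
|          |                   |                   | where the provisioning system is         |
|          |                   |                   | installed.                               |
+----------+-------------------+-------------------+------------------------------------------+
| 3        | CPU               | percentage        | CPU utilization as a function of the     |
|          |                   |                   | number of nodes. Average during          |
|          |                   |                   | provisioning on the host where the       |
|          |                   |                   | provisioning system is installed.        |
+----------+-------------------+-------------------+------------------------------------------+
| 3        | RAM               | GB                | Active memory usage as a function of     |
|          |                   |                   | the number of nodes. Average during      |
|          |                   |                   | provisioning on the host where the       |
|          |                   |                   | provisioning system is installed.        |
+----------+-------------------+-------------------+------------------------------------------+
| 3        | WRITE_IO          | operations/second | Storage write IO bandwidth as a          |
|          |                   |                   | function of the number of nodes.         |
|          |                   |                   | Average during provisioning on the host  |
|          |                   |                   | where the provisioning system is         |
|          |                   |                   | installed.                               |
+----------+-------------------+-------------------+------------------------------------------+
| 3        | READ_IO           | operations/second | Storage read IO bandwidth as a           |
|          |                   |                   | function of the number of nodes.         |
|          |                   |                   | Average during provisioning on the host  |
|          |                   |                   | where the provisioning system is         |
|          |                   |                   | installed.                               |
+----------+-------------------+-------------------+------------------------------------------+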

Measuring performance values

The script in the section "Full script for collecting performance metrics" can be used for the first five of the following steps.

Note

If a distributed provisioning system is used, the values need to be measured on each provisioning system instance.

1. Start the collection of CPU, memory, network, and storage metrics during the provisioning process. Use the dstat program, which can collect all of these metrics in CSV format into a log file.

2. Start the provisioning process for the first node and record the wall time.

3. Wait until the provisioning process has finished (when all nodes are reachable via ssh) and record the wall time.

4. Stop the dstat program.

5. Prepare collected data for analysis. dstat provides a large amount of information, which can be pruned by saving only the following:

  • "system"[time]. Save as given.
  • 100-"total cpu usage"[idl]. dstat provides only the idle CPU value. CPU utilization is calculated by subtracting the idle value from 100%.
  • "memory usage"[used]. dstat provides this value in Bytes. It is converted to Megabytes by dividing by 1024*1024=1048576.
  • "net/eth0"[recv] receive bandwidth on the NIC. It is converted to Megabits per second by dividing by 1024*1024/8=131072.
  • "net/eth0"[send] send bandwidth on the NIC. It is converted to Megabits per second by dividing by 1024*1024/8=131072.
  • "net/eth0"[recv]+"net/eth0"[send]. The total receive and transmit bandwidth on the NIC. dstat provides these values in Bytes per second. They are converted to Megabits per second by dividing by 1024*1024/8=131072.
  • "io/total"[read] storage read IO bandwidth.
  • "io/total"[writ] storage write IO bandwidth.
  • "io/total"[read]+"io/total"[writ]. The total read and write storage IO bandwidth.

These values will be graphed and maximum values reported.

Additional tests will be performed if some anomalous behaviour is found. These may require the collection of additional performance metrics.
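
The pruning and unit conversions listed in step 5 can be sketched in Python as follows. This is only an illustrative sketch, not part of the measurement script: the function name `prune_sample` and the flat input keys (`time`, `idl`, `used`, `recv`, `send`, `read`, `writ`) are assumptions, and parsing of the dstat CSV header lines into such a dictionary is left out.

```python
# Divisors taken from the conversion rules in step 5.
BYTES_PER_MEGABYTE = 1024 * 1024        # 1048576
BYTES_PER_MEGABIT = 1024 * 1024 // 8    # 131072


def prune_sample(sample):
    """Reduce one dstat sample to the values kept for analysis.

    `sample` is assumed to be a dict with numeric values already parsed
    from one CSV row (hypothetical layout, see the lead-in above).
    """
    ingress_mbit = sample["recv"] / BYTES_PER_MEGABIT
    egress_mbit = sample["send"] / BYTES_PER_MEGABIT
    return {
        "time": sample["time"],                    # saved as given
        "cpu_percent": 100.0 - sample["idl"],      # utilization = 100 - idle
        "ram_mb": sample["used"] / BYTES_PER_MEGABYTE,
        "ingress_mbit": ingress_mbit,
        "egress_mbit": egress_mbit,
        "all_net_mbit": ingress_mbit + egress_mbit,
        "read_io": sample["read"],
        "write_io": sample["writ"],
        "all_io": sample["read"] + sample["writ"],
    }
```

Keeping the divisors as named constants makes it easy to check them against the values quoted in the list (1048576 and 131072).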

6. The result of this part of the test will be:

  • to provide the following graphs, one for each number of provisioned nodes:
    1. Three dependencies on one graph.
      • INGRESS_NET(TIME) Dependence on time of incoming network bandwidth usage.
      • EGRESS_NET(TIME) Dependence on time of outgoing network bandwidth usage.
      • ALL_NET(TIME) Dependence on time of total network bandwidth usage.
    2. One dependence on one graph.
      • CPU(TIME) Dependence on time of CPU utilization.
    3. One dependence on one graph.
      • RAM(TIME) Dependence on time of active memory usage.
    4. Three dependencies on one graph.
      • WRITE_IO(TIME) Dependence on time of storage write IO bandwidth.
      • READ_IO(TIME) Dependence on time of storage read IO bandwidth.
      • ALL_IO(TIME) Dependence on time of total storage IO bandwidth.

Note

If a distributed provisioning system is used, the above graphs should be provided for each provisioning system instance.

  • to fill in the following table for maximum values:

The resource metrics are obtained from the maxima of the corresponding graphs above. The provisioning time is the elapsed time for all nodes to be provisioned. One set of metrics will be given for each number of provisioned nodes.

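Maximum values of performance metrics
+-------+--------------+---------+---------+---------+---------+
| nodes | provisioning | maximum | maximum | maximum | maximum |
| count | time         | CPU     | RAM     | NET     | IO      |
|       |              | usage   | usage   | usage   | usage   |
+=======+==============+=========+=========+=========+=========+
| 10    |              |         |         |         |         |
+-------+--------------+---------+---------+---------+---------+
| 20    |              |         |         |         |         |
+-------+--------------+---------+---------+---------+---------+
| 40    |              |         |         |         |         |
+-------+--------------+---------+---------+---------+---------+
| 80    |              |         |         |         |         |
+-------+--------------+---------+---------+---------+---------+
| 160   |              |         |         |         |         |
+-------+--------------+---------+---------+---------+---------+
| 320   |              |         |         |         |         |
+-------+--------------+---------+---------+---------+---------+
| 640   |              |         |         |         |         |
+-------+--------------+---------+---------+---------+---------+
| 1280  |              |         |         |         |         |
+-------+--------------+---------+---------+---------+---------+
| 2000  |              |         |         |         |         |
+-------+--------------+---------+---------+---------+---------+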
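
One row of the table of maximum values could be derived from the per-sample values of a single run as sketched below. The function name `summarize_run` and the sample keys (`cpu_percent`, `ram_mb`, `all_net_mbit`, `all_io`) are hypothetical. The sketch also assumes that the NET and IO maxima refer to the total series (receive plus send, read plus write), which the plan does not state explicitly.

```python
def summarize_run(node_count, start_time, end_time, samples):
    """Build one table row from the samples of a single provisioning run.

    `start_time` and `end_time` are the wall times (in seconds) recorded
    when provisioning started and when all nodes became reachable.
    `samples` is a non-empty list of dicts with the hypothetical keys
    named in the lead-in above.
    """
    return {
        "nodes_count": node_count,
        "provisioning_time_s": end_time - start_time,
        "max_cpu_percent": max(s["cpu_percent"] for s in samples),
        "max_ram_mb": max(s["ram_mb"] for s in samples),
        "max_net_mbit": max(s["all_net_mbit"] for s in samples),
        "max_io": max(s["all_io"] for s in samples),
    }
```

For a distributed provisioning system, one such row would be produced per provisioning system instance.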

Applications

List of provisioning systems

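list of provisioning systems
+------------------------------------+---------+
| Name of provisioning system        | Version |
+====================================+=========+
| Cobbler                            | 2.4     |
+------------------------------------+---------+
| Razor                              | 0.13    |
+------------------------------------+---------+
| Image based provisioning via       | -       |
| downloading images with bittorrent |         |
| protocol                           |         |
+------------------------------------+---------+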

Full script for collecting performance metrics

measure.sh
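
The content of measure.sh is not reproduced here. As a rough illustration only, steps 1 to 4 of the measurement procedure could be sketched in Python as below. All function names, the default interface `eth0`, and the `start_provisioning` callback are assumptions. An open TCP port 22 is used as a simple stand-in for "reachable via ssh", which is weaker than a successful login.

```python
import socket
import subprocess
import time


def build_dstat_command(output_file, interface="eth0"):
    """Return a dstat command line that writes the needed metrics as CSV."""
    return ["dstat", "--nocolor", "--time", "--cpu", "--mem",
            "--net", "-N", interface, "--io", "--output", output_file]


def ssh_port_open(host, port=22, timeout=2.0):
    """Return True if a TCP connection to the ssh port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_nodes(hosts, is_reachable=ssh_port_open, poll_interval=5.0):
    """Block until every host is reported reachable."""
    pending = set(hosts)
    while pending:
        pending = {h for h in pending if not is_reachable(h)}
        if pending:
            time.sleep(poll_interval)


def measure(hosts, start_provisioning, output_file="dstat.csv"):
    """Run steps 1 to 4 and return the start and finish wall times."""
    # Step 1: start metric collection.
    dstat = subprocess.Popen(build_dstat_command(output_file),
                             stdout=subprocess.DEVNULL)
    try:
        # Step 2: start provisioning and record the wall time.
        started = time.time()
        start_provisioning()
        # Step 3: wait until all nodes are reachable, record the wall time.
        wait_for_nodes(hosts)
        finished = time.time()
    finally:
        # Step 4: stop dstat even if provisioning fails.
        dstat.terminate()
        dstat.wait()
    return started, finished
```

The reachability check is passed in as a parameter so that a stricter test, such as an actual ssh login, can be substituted without changing the polling loop.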

Reports

Test plan execution reports:
  • Measuring_performance_of_Cobbler