Merge "Platform Power Metrics - Technology Preview (dsR8)"
This commit is contained in:
commit
290fc8190a
@ -408,3 +408,10 @@ Provision BMC using the CLI
|
||||
provisioning_bmc/provisioning-bmc-after-adding-a-host
|
||||
provisioning_bmc/deprovisioning-board-management-control-from-the-cli
|
||||
|
||||
----------------------------------
|
||||
Technology Preview - Power Metrics
|
||||
----------------------------------
|
||||
.. toctree::
|
||||
:maxdepth: 1
|
||||
|
||||
install-power-metrics-application-a12de3db7478
|
@ -0,0 +1,452 @@
|
||||
.. _install-power-metrics-application-a12de3db7478:
|
||||
|
||||
======================================================
|
||||
Technology Preview - Install Power Metrics Application
|
||||
======================================================
|
||||
|
||||
The Power Metrics app deploys two containers, cAdvisor and Telegraf that
|
||||
collect metrics about hardware usage. This document describes the technical
|
||||
preview of the Power Metrics functionality.
|
||||
|
||||
.. rubric:: |prereq|
|
||||
|
||||
For running power-metrics, your system must have the following drivers:
|
||||
|
||||
cpufreq kernel module
|
||||
exposes per-CPU Frequency over sysfs
|
||||
(``/sys/devices/system/cpu/cpu%d/cpufreq/scaling_cur_freq``)
|
||||
|
||||
msr kernel module
|
||||
provides access to processor model specific registers over devfs
|
||||
(``/dev/cpu/cpu%d/msr``)
|
||||
|
||||
intel-rapl module
|
||||
exposes Intel Runtime Power Limiting metrics over sysfs
|
||||
(``/sys/devices/virtual/powercap/intel-rapl``)
|
||||
|
||||
intel-uncore-frequency module
|
||||
exposes Intel uncore frequency metrics over sysfs
|
||||
(``/sys/devices/system/cpu/intel_uncore_frequency``)
|
||||
|
||||
|
||||
Uncore events can only be loaded from the following cpu models:
|
||||
|
||||
.. list-table::
|
||||
:widths: 6 25
|
||||
:header-rows: 1
|
||||
|
||||
* - **Model number**
|
||||
- **Processor name**
|
||||
* - 0x55
|
||||
- Intel Skylake-X
|
||||
* - 0x6A
|
||||
- Intel IceLake-X
|
||||
* - 0x6C
|
||||
- Intel IceLake-D
|
||||
* - 0x47
|
||||
- Intel Broadwell-G
|
||||
* - 0x4F
|
||||
- Intel Broadwell-X
|
||||
* - 0x56
|
||||
- Intel Broadwell-D
|
||||
* - 0x8F
|
||||
- Intel Sapphire Rapids X
|
||||
* - 0xCF
|
||||
- Intel Emerald Rapids X
|
||||
|
||||
Source: https://github.com/influxdata/telegraf/issues/13098#issuecomment-1512585422
|
||||
|
||||
.. rubric:: |proc|
|
||||
|
||||
#. Upload the application.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
[sysadmin@controller-0 (keystone_admin)]$ system application-upload /usr/local/share/applications/helm/power-metrics-[version].tgz
|
||||
|
||||
#. Apply the application.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
[sysadmin@controller-0 (keystone_admin)]$ system application-apply power-metrics
|
||||
|
||||
#. Wait until Power Metrics is in applied state.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
[sysadmin@controller-0 (keystone_admin)]$ system application-show power-metrics
|
||||
|
||||
#. Assign a label to the node:
|
||||
|
||||
.. note::
|
||||
|
||||
A label must be assigned for the power-metrics to be enabled in a node.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
power-metrics:enabled
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
[sysadmin@controller-0 (keystone_admin)]$ system host-label-assign <node-name> power-metrics=enabled
|
||||
|
||||
.. rubric:: |result|
|
||||
|
||||
The Power Metrics should be installed and both cAdvisor and Telegraf pods must
|
||||
be up and running.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
sysadmin@controller-0:~$ kubectl get pods -n power-metrics
|
||||
|
||||
NAME READY STATUS RESTARTS AGE
|
||||
cadvisor-v76zx 1/1 Running 0 26h
|
||||
telegraf-mc6vd 1/1 Running 0 4d7h
|
||||
|
||||
It is possible to change some configurations via override.
|
||||
|
||||
--------
|
||||
Telegraf
|
||||
--------
|
||||
|
||||
Enable and disable Intel PMU metrics
|
||||
------------------------------------
|
||||
|
||||
You can activate the Intel PMU plugin with the following command:
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
[sysadmin@controller-0 ~(keystone_admin)]$ system helm-override-update power-metrics telegraf power-metrics --set pmu_enabled=true
|
||||
+----------------+-------------------+
|
||||
| Property | Value |
|
||||
+----------------+-------------------+
|
||||
| name | telegraf |
|
||||
| namespace | power-metrics |
|
||||
| user_overrides | pmu_enabled: true |
|
||||
| | |
|
||||
+----------------+-------------------+
|
||||
|
||||
|
||||
Override intel_powerstat plugin
|
||||
-------------------------------
|
||||
|
||||
You can change the default ``intel_powerstat`` plugin parameters by override.
|
||||
|
||||
The plugin parameters include CPU and package metrics, and also the read method
|
||||
of |MSR|.
|
||||
|
||||
The list of available options for both CPU and package metrics can be found on
|
||||
the powerstat documentation:
|
||||
https://github.com/influxdata/telegraf/blob/master/plugins/inputs/intel_powerstat/README.md#configuration
|
||||
|
||||
It is worth noting that when overriding, the user must inform both metrics
|
||||
parameters (cpu and package), otherwise the plugin would stop collecting the
|
||||
missing metrics.
|
||||
|
||||
The ``read_method`` parameter specifies the reading method of |MSR|. This
|
||||
parameter accepts two values, concurrent or sequential. The default is
|
||||
concurrent. Concurrent method uses goroutines to read each |MSR| value
|
||||
concurrently.
|
||||
|
||||
The sequential method reads each value sequentially. This reduces latency
|
||||
overhead when using preempt-rt kernel with isolated cores, but might cause loss
|
||||
of precision on metrics calculation.
|
||||
|
||||
Example of overriding the powerstat plugin:
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
[sysadmin@controller-0 ~(keystone_admin)]$ cat telegraf-powerstat.yaml
|
||||
config:
|
||||
intel_powerstat:
|
||||
read_method: "sequential"
|
||||
cpu_metrics: ["cpu_frequency","cpu_busy_frequency","cpu_temperature","cpu_c0_state_residency","cpu_c1_state_residency","cpu_c6_state_residency","cpu_busy_cycles"]
|
||||
package_metrics: ["current_power_consumption","current_dram_power_consumption","thermal_design_power","cpu_base_frequency"]
|
||||
|
||||
[sysadmin@controller-0 ~(keystone_admin)]$ system helm-override-update power-metrics telegraf power-metrics --values telegraf-powerstat.yaml
|
||||
+----------------+--------------------------------------+
|
||||
| Property | Value |
|
||||
+----------------+--------------------------------------+
|
||||
| name | telegraf |
|
||||
| namespace | power-metrics |
|
||||
| user_overrides | config: |
|
||||
| | intel_powerstat: |
|
||||
| | cpu_metrics: |
|
||||
| | - cpu_frequency |
|
||||
| | - cpu_busy_frequency |
|
||||
| | - cpu_temperature |
|
||||
| | - cpu_c0_state_residency |
|
||||
| | - cpu_c1_state_residency |
|
||||
| | - cpu_c6_state_residency |
|
||||
| | - cpu_busy_cycles |
|
||||
| | package_metrics: |
|
||||
| | - current_power_consumption |
|
||||
| | - current_dram_power_consumption |
|
||||
| | - thermal_design_power |
|
||||
| | - cpu_base_frequency |
|
||||
| | read_method: sequential |
|
||||
| | |
|
||||
+----------------+--------------------------------------+
|
||||
|
||||
Then, you can re-apply the app:
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
[sysadmin@controller-0 ~(keystone_admin)]$ system application-apply power-metrics
|
||||
|
||||
|
||||
Add input plugins
|
||||
-----------------
|
||||
|
||||
You can add new plugins overriding the plugins column.
|
||||
|
||||
#. Add the cgroups plugin:
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
[sysadmin@controller-0 ~(keystone_admin)]$ cat telegraf-cgroups.yaml
|
||||
config:
|
||||
inputs:
|
||||
- cgroup:
|
||||
paths: ["/sys/fs/cgroup/cpu","/sys/fs/cgroup/cpu/*","/sys/fs/cgroup/cpu/*/*",]
|
||||
files: ["cpuacct.usage", "cpuacct.usage_percpu", "cpu.cfs_period_us", "cpu.cfs_quota_us", "cpu.shares", "cpu.stat"]
|
||||
|
||||
#. Then apply the override:
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
system helm-override-update power-metrics telegraf power-metrics --values /path/to/file.yaml
|
||||
[sysadmin@controller-0 ~(keystone_admin)]$ system helm-override-update power-metrics telegraf power-metrics --values telegraf-cgroups.yaml
|
||||
+----------------+--------------------------------+
|
||||
| Property | Value |
|
||||
+----------------+--------------------------------+
|
||||
| name | telegraf |
|
||||
| namespace | power-metrics |
|
||||
| user_overrides | config: |
|
||||
| | inputs: |
|
||||
| | - cgroup: |
|
||||
| | files: |
|
||||
| | - cpuacct.usage |
|
||||
| | - cpuacct.usage_percpu |
|
||||
| | - cpu.cfs_period_us |
|
||||
| | - cpu.cfs_quota_us |
|
||||
| | - cpu.shares |
|
||||
| | - cpu.stat |
|
||||
| | paths: |
|
||||
| | - /sys/fs/cgroup/cpu |
|
||||
| | - /sys/fs/cgroup/cpu/* |
|
||||
| | - /sys/fs/cgroup/cpu/*/* |
|
||||
| | |
|
||||
+----------------+--------------------------------+
|
||||
|
||||
#. After you can re-apply the app:
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
[sysadmin@controller-0 ~(keystone_admin)]$ system application-apply power-metrics
|
||||
|
||||
#. If needed, add configmap and volumes via override:
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
volumes:
|
||||
- name: telegraf-example
|
||||
configMap:
|
||||
name: telegraf-example
|
||||
mountPoints:
|
||||
- name: telegraf-example
|
||||
mountPath: /path/to/file.json
|
||||
subPath: file.json
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
system helm-override-update power-metrics telegraf power-metrics --values /path/to/file.yaml
|
||||
|
||||
For more information on Telegraf plugins, see
|
||||
https://github.com/influxdata/telegraf#documentation.
|
||||
|
||||
|
||||
Modify Telegraf data collection interval
|
||||
----------------------------------------
|
||||
|
||||
Telegraf report its metrics each 10 seconds, but you can modify this time
|
||||
interval with the following command:
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
system helm-override-update power-metrics telegraf power-metrics --set config.agent.interval=<time-interval>
|
||||
|
||||
--------
|
||||
cAdvisor
|
||||
--------
|
||||
|
||||
Enable and disable Perf Events on cAdvisor
|
||||
------------------------------------------
|
||||
|
||||
To enable or disable Perf Events on cAdvisor, use the following command:
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
[sysadmin@controller-0 ~(keystone_admin)]$ system helm-override-update power-metrics cadvisor power-metrics --set perf_events=true
|
||||
+----------------+-------------------+
|
||||
| Property | Value |
|
||||
+----------------+-------------------+
|
||||
| name | cadvisor |
|
||||
| namespace | power-metrics |
|
||||
| user_overrides | perf_events: true |
|
||||
| | |
|
||||
+----------------+-------------------+
|
||||
|
||||
After that, reapply the power-metrics app, and wait until the pod restarts:
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
system application-apply power-metrics
|
||||
|
||||
|
||||
----------------------------
|
||||
Remove the Power Metrics App
|
||||
----------------------------
|
||||
|
||||
To remove the Power metrics app use the following command:
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
system application-remove power-metrics
|
||||
|
||||
Then, use the following command to return the application to the uploaded state:
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
system application-delete power-metrics
|
||||
|
||||
-----------------
|
||||
Available Metrics
|
||||
-----------------
|
||||
|
||||
With Power Metrics application, we have access to system and hardware level raw
|
||||
data, enabling to visualize the power usage.
|
||||
|
||||
Power Metrics, by default, exposes the data collected from both, cAdvisor and
|
||||
Telegraf, in the OpenMetrics format.
|
||||
|
||||
.. rubric:: **Thermal Design Power**
|
||||
|
||||
The Thermal Design Power, or TDP, is the maximum energy available, in watts,
|
||||
for the processor. The metric name for checking the TDP is:
|
||||
``powerstat_package_thermal_design_power_watts``.
|
||||
|
||||
.. rubric:: **Current Power Consumption**
|
||||
|
||||
The current power usage of the system in watts. The metric name for checking
|
||||
power consumption is ``powerstat_package_current_power_consumption_watts``.
|
||||
|
||||
.. rubric:: **Current DRAM Power Consumption**
|
||||
|
||||
The current power usage of dram in the system in watts. The metric name for
|
||||
checking DRAM Consumption is:
|
||||
``powerstat_package_current_dram_power_consumption_watts``.
|
||||
|
||||
.. rubric:: **Current CPU Frequency**
|
||||
|
||||
The current CPU frequency of the of the processor. The metric name for
|
||||
checking the CPU frequency is ``powerstat_core_cpu_frequency_mhz``.
|
||||
|
||||
.. rubric:: **CPU Base Frequency**
|
||||
|
||||
The base frequency (non-turbo) of the processor, it is the default speed of the
|
||||
processor. The metric name for checking cpu base frequency is
|
||||
``powerstat_package_cpu_base_frequency_mhz``.
|
||||
|
||||
.. rubric:: **Uncore Frequency**
|
||||
|
||||
The application reports the current, maximum, and minimum frequency. The uncore
|
||||
frequency can be described as the frequency on a processor that is not actually
|
||||
part of its processor core, like memory controller and cache controller.
|
||||
|
||||
You can check the current uncore frequency with the following metric name:
|
||||
``powerstat_package_uncore_frequency_mhz_cur``, for maximum frequency metric
|
||||
name is ``powerstat_package_uncore_frequency_limit_mhz_max``, and for minimum
|
||||
the name ``powerstat_package_uncore_frequency_limit_mhz_min``.
|
||||
|
||||
.. rubric:: **Per-cpu minimum and maximum frequency**
|
||||
|
||||
The application reports the minimum and maximum frequency that each core of the
|
||||
processor can achieve. It is possible to check the minimum frequency with the
|
||||
metric name ``linux_cpu_cpuinfo_min_freq`` or ``linux_cpu_scaling_min_freq``,
|
||||
and maximum with ``linux_cpu_cpuinfo_max_freq`` or
|
||||
``linux_cpu_scaling_max_freq``.
|
||||
|
||||
.. rubric:: **Per-cpu busy frequency**
|
||||
|
||||
Busy frequency is the frequency of a core that has a high utilization. (confirm
|
||||
this later). It is possible to see the busy frequency with the following metric
|
||||
name ``powerstat_core_cpu_busy_frequency_mhz``.
|
||||
|
||||
.. rubric:: **Per-cpu percentage in C-State**
|
||||
|
||||
The application can report the time, in percent, that a core of the processor
|
||||
spent in each c-state. c-State is the state of the core, in which it can reduce
|
||||
its power consumption, the higher the c-state the higher the sleep state of
|
||||
the core. We have in the power metrics the following c-states reports:
|
||||
|
||||
- C0 state, in this state, the core is executing normally, it is exposed as
|
||||
``powerstat_core_cpu_c0_state_residency_percent``.
|
||||
|
||||
- C1 state, in this state, the core is active but it's not processing any
|
||||
instructions, it can quickly go back to the C0 state, it is exposed as
|
||||
``powerstat_core_cpu_c1_state_residency_percent``.
|
||||
|
||||
- C6 State, in this state the core is with its voltage reduced (or powered
|
||||
off). This is the highest state. It takes a longer time to go to C0 state,
|
||||
but the power saving is higher. It is exposed as
|
||||
``powerstat_core_cpu_c6_state_residency_percent``.
|
||||
|
||||
.. rubric:: **Per-cpu current temperature**
|
||||
|
||||
The application reports the current temperature of each individual core from
|
||||
the processor. The current temperature is exposed as the metric name
|
||||
``powerstat_core_cpu_temperature_celsius``.
|
||||
|
||||
.. rubric:: **Container perf events total**
|
||||
|
||||
From cAdvisor it is reported the number of performance events that occurred in
|
||||
a container, it is exposed as ``container_perf_events_total``.
|
||||
|
||||
.. rubric:: **Container perf events scaling ratio**
|
||||
|
||||
It also reports the scaling ratio, which calculates the ratio of performance
|
||||
events in a container, it is exposed as
|
||||
``container_perf_events_scaling_ration``.
|
||||
|
||||
.. rubric:: **Per Core CPU Power usage**
|
||||
|
||||
By considering the frequency of each core, gathered by
|
||||
``powerstat_core_cpu_frequency_mhz`` metric with the amount of power usage of
|
||||
the processor, gathered by
|
||||
``powerstat_package_current_power_consumption_watts`` metric, it is possible to
|
||||
estimate the total amount of power, in watts, that is being used by each core.
|
||||
|
||||
Example of formula:
|
||||
|
||||
per_cpu_consumption = ((0.6 * powerstat_core_cpu_frequency_mhz{cpu_id=x,
|
||||
package_id=y})/ ∑ powerstat_core_cpu_frequency_mhz{package_id=y}) *
|
||||
powerstat_package_current_power_consumption_watts{package_id=y}
|
||||
|
||||
.. rubric:: **Container CPU Power usage**
|
||||
|
||||
By gathering the number of instructions in each container running on the
|
||||
cluster, gathered by the ``container_perf_events_total`` metric, with the
|
||||
corresponding core that they are using, determined by the per core cpu power
|
||||
usage described above, and the total number of instructions per core, also
|
||||
available from ``container_perf_events_total metric``, it is possible to
|
||||
estimate the power that is being consumed by each container.
|
||||
|
||||
Example of formula to calculate the power consumption of a container on a core:
|
||||
|
||||
container_per_cpu_consumption = (container_perf_events_total{cpu=x,
|
||||
container=z} / container_perf_events_total {cpu=x}) *
|
||||
per_cpu_consumption{cpu=x}
|
||||
|
||||
Where "X" is the core_id of the cpu, "Y" is the package_id or physical_id of
|
||||
the processor, and "Z" is the container name.
|
@ -90,6 +90,7 @@
|
||||
.. |ML| replace:: :abbr:`ML (Machine Learning)`
|
||||
.. |MNFA| replace:: :abbr:`MNFA (Multi-Node Failure Avoidance)`
|
||||
.. |MOTD| replace:: :abbr:`MOTD (Message of the Day)`
|
||||
.. |MSR| replace:: :abbr:`MSR (Model-specific Registers)`
|
||||
.. |MTU| replace:: :abbr:`MTU (Maximum Transmission Unit)`
|
||||
.. |NA| replace:: :abbr:`NA (Not Applicable)`
|
||||
.. |NAT| replace:: :abbr:`NAT (Network Address Translation)`
|
||||
|
Loading…
x
Reference in New Issue
Block a user