
Configurable Power Manager
Configurable Power Manager targets containerized applications that need power profiles applied individually, per core and/or per application.
The platform can regulate the frequency of the entire processor. However, this control is applied by core classification, distinguishing only between application cores and platform cores. Consequently, if a user needs to control an individual core, such as core 10 in a 24-core CPU, the adjustment must be applied to all cores collectively. Containerized operations require personalized configurations: each container must be assigned the power configuration it needs, which means providing a specific power configuration to each core or group of cores.
With Configurable Power Manager, it is possible to control, per core, the acceptable frequency range (minimum and maximum frequency), the behavior of the core within this range (governor), and the power levels (c-states) that the core can access, as well as the behavior of the system for workloads with known intervals/demands.
To encapsulate the dependencies, images, profiles and configurations, Power Manager is delivered as an application.
System Requirements
- BIOS functionality to delegate c-states and p-states control to the operating system, supported and enabled
- intel_pstate and acpi_cpufreq drivers
- intel_idle or acpi_idle module
- 3rd and 4th Generation Intel® Xeon® Scalable Processors
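The driver requirements above can be checked on a host by reading the names the Linux kernel exposes through sysfs. The following is a minimal sketch and not part of Power Manager: the function names are hypothetical, and it assumes the standard sysfs files cpu0/cpufreq/scaling_driver and cpuidle/current_driver under /sys/devices/system/cpu. It does not verify BIOS settings or the processor generation.

```python
from pathlib import Path
from typing import Optional

# Driver names taken from the requirements list. sysfs may report the
# cpufreq driver with a hyphen ("acpi-cpufreq"), so names are normalised.
FREQ_DRIVERS = {"intel_pstate", "acpi_cpufreq"}
IDLE_DRIVERS = {"intel_idle", "acpi_idle"}


def read_driver(path: Path) -> Optional[str]:
    """Return the driver name stored in a sysfs file, or None if unreadable."""
    try:
        return path.read_text().strip().replace("-", "_")
    except OSError:
        return None


def check_power_drivers(sysfs_cpu: Path = Path("/sys/devices/system/cpu")) -> dict:
    """Report the frequency and idle drivers in use and whether they are expected."""
    freq = read_driver(sysfs_cpu / "cpu0" / "cpufreq" / "scaling_driver")
    idle = read_driver(sysfs_cpu / "cpuidle" / "current_driver")
    return {
        "scaling_driver": freq,
        "scaling_driver_ok": freq in FREQ_DRIVERS,
        "idle_driver": idle,
        "idle_driver_ok": idle in IDLE_DRIVERS,
    }


if __name__ == "__main__":
    for key, value in check_power_drivers().items():
        print(f"{key}: {value}")
```

A value of None means the file could not be read, which usually indicates that the corresponding driver is not loaded or that control has not been delegated to the operating system.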
Dependencies
The installer looks for the Node Feature Discovery (NFD) application on the cluster. Because Intel recommends its use, the installation fails if NFD is not installed. To proceed, either install NFD or set the user_override parameter nfd-required to False, which allows Power Manager to be installed without the NFD application.
The following example shows how to override the nfd-required parameter:
(keystone_admin)]$ system helm-override-update --set nfd-required=False kubernetes-power-manager kubernetes-power-manager intel-power
Power Manager Installation
The installation follows the standard procedure to install an application.
The application tgz file is located at /usr/local/share/applications/helm/kubernetes-power-manager-<VERSION>.tgz. Upload and apply it:
(keystone_admin)]$ system application-upload /usr/local/share/applications/helm/kubernetes-power-manager-<VERSION>.tgz
(keystone_admin)]$ system application-apply kubernetes-power-manager
The namespace, service accounts, roles and Custom Resource Definitions in Kubernetes are all provided by the Power Manager project.
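+----------------------------+---------------------------------------------------+
| Resource Type              | Resource Names                                    |
+----------------------------+---------------------------------------------------+
| Namespace                  | intel-power                                       |
+----------------------------+---------------------------------------------------+
| Service Account            | intel-power-node-agent                            |
|                            | intel-power-operator                              |
+----------------------------+---------------------------------------------------+
| Role                       | operator-custom-resource-definitions-role         |
+----------------------------+---------------------------------------------------+
| RoleBinding                | operator-custom-resource-definitions-role-binding |
+----------------------------+---------------------------------------------------+
| Cluster Role               | operator-nodes                                    |
|                            | manager-role                                      |
|                            | node-agent-cluster-resources                      |
+----------------------------+---------------------------------------------------+
| Cluster Role Binding       | operator-nodes-binding                            |
|                            | node-agent-cluster-resources-binding              |
+----------------------------+---------------------------------------------------+
| Custom Resource Definition | cstates.power.intel.com                           |
|                            | powerconfigs.power.intel.com                      |
|                            | powernodes.power.intel.com                        |
|                            | powerpods.power.intel.com                         |
|                            | powerprofiles.power.intel.com                     |
|                            | powerworkloads.power.intel.com                    |
|                            | timeofdaycronjobs.power.intel.com                 |
|                            | timeofdays.power.intel.com                        |
|                            | uncores.power.intel.com                           |
+----------------------------+---------------------------------------------------+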
The installation deploys the manager container (Kubernetes Operator) of Kubernetes Power Manager, which monitors the cluster and starts the Power Manager agent on selected nodes. Only one instance of the operator runs in the cluster, preferably on one of the control plane nodes.
The installation also publishes the power configuration profile to the cluster. This resource exposes the standard power profiles of the Intel Power Optimization Library. The default power profiles are: performance, balance-performance, and balance-power.
Power Manager creates the available configurations. To customize the application, apply your modifications via helm-override. For an example of a customization, see the User Defined Settings section.
Label Assignment
A Kubernetes label controls which hosts the Power Manager agent runs on. The operator (manager) listens for changes on hosts and, when it detects the label, starts the agent container on that host.
The agent is responsible for monitoring and applying the power configurations described by Custom Resources (c-state, Power Profiles, Power Workloads, etc) or in the Pod specifications.
Important
In the kubelet configuration file, cpuManagerPolicy has to be set to "static", and reservedSystemCPUs must be set to the desired value:
(keystone_admin)]$ system host-label-assign --overwrite <HOSTNAME> kube-cpu-mgr-policy=static
To indicate the host where the agent must be deployed, create the label manually with the command below:
(keystone_admin)]$ system host-label-assign <HOSTNAME> power-management=enabled
Note
This command is accepted only if the max_cpu_mhz_configured parameter is disabled. Do not enable both at the same time.
Once the label is applied, the following tasks will be automatically performed:
Default CPU c-states
During the installation process, default c-state levels are configured. By default, platform cores can access the available levels up to C6, while application cores can access levels up to C1.
This configuration is performed automatically on each node and is based on the levels available in the processor. If the target levels do not exist, the application will choose to maintain only C0 on the application cores, and the lowest available level on the platform cores.
Default CPU Frequency (p-state)
CPU p-state management can be controlled either through power profiles applied to containers or through a shared profile that manages CPU cores individually or in groups.
By default, all CPU cores use the full frequency range available, with the CPU governor in performance mode.
Two resources will be deployed on Kubernetes: Shared profile and Shared workload profile.
To create a custom profile, use the parameters in the YAML file below:
apiVersion: power.intel.com/v1
kind: PowerProfile
metadata:
  name: profile-name
  namespace: intel-power
spec:
  name: profile-name
  max: <HOST-MAX-CPU-FREQ> # Maximum core frequency supported
  min: <HOST-MIN-CPU-FREQ> # Minimum core frequency supported
  epp: performance
  governor: performance
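When several custom profiles are needed, the manifest can be generated rather than written by hand. The sketch below is illustrative only: build_power_profile is a hypothetical helper, its field names are copied from the YAML above, and the frequencies in the example call are placeholders that must be replaced with values supported by the host. The min/max sanity check is an assumption added for illustration, not documented Power Manager behavior.

```python
import json


def build_power_profile(name, min_freq, max_freq,
                        governor="performance", epp="performance",
                        namespace="intel-power"):
    """Return a PowerProfile manifest as a dict, using the fields shown above."""
    if min_freq > max_freq:
        raise ValueError(f"min ({min_freq}) must not exceed max ({max_freq})")
    return {
        "apiVersion": "power.intel.com/v1",
        "kind": "PowerProfile",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "name": name,
            "max": max_freq,
            "min": min_freq,
            "epp": epp,
            "governor": governor,
        },
    }


if __name__ == "__main__":
    # Placeholder frequencies; use the range supported by your host.
    print(json.dumps(build_power_profile("profile-name", 800, 3000), indent=2))
```

The output is JSON, which Kubernetes accepts in place of YAML for manifests.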
Shared Profile
This resource specifies the minimum and maximum core frequencies and the CPU governor for each host in the cluster. Assigning the label to a host triggers the creation of this profile, which applies the minimum and maximum frequencies supported. The CPU governor is always performance.
Note
In real-time systems, the minimum and maximum frequencies are the same on all cores (min = max). This is standard behavior for real-time systems; configuring them differently will affect the maximum frequency.
Shared Workload Profile
This resource binds the Shared Profile to CPU cores on the host. Once the label is created on the host, the created profile points to the Shared Profile and selects all available CPU cores except the platform cores, which are listed in the reservedCPUs parameter.
Note
The CPU p-state of the platform cores is managed through the reservedProfile parameter.
Node Agent Pod
The Pod Controller watches for pods. When a pod is created, the Pod Controller checks whether it is in the Guaranteed quality of service class, that is, whether it uses exclusive cores taken out of the shared pool. This is the only option in Kubernetes that can do this operation; for more details see https://kubernetes.io/docs/tasks/configure-pod-container/quality-service-pod/. The Pod Controller then examines the pod to determine which Power Profile has been requested and creates or updates the appropriate Power Workload.
Note
The requests and the limits must have a matching number of cores on a container-by-container basis. Currently, Power Manager supports only a single Power Profile per pod. If two profiles are requested in different containers, the pod is created, but the cores are not tuned. This will only work if the pods use isolcpus.
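The conditions above can be checked against a pod specification before it is deployed. The sketch below is a simplified illustration, not the Pod Controller's implementation: is_guaranteed and find_problems are hypothetical names, the Guaranteed check only compares cpu and memory requests and limits as strings, and the power.intel.com/ resource prefix used to request a profile is an assumption taken from the upstream Kubernetes Power Manager project rather than from this document.

```python
# Assumed prefix of the extended resource that requests a Power Profile.
POWER_PREFIX = "power.intel.com/"


def is_guaranteed(pod_spec):
    """Simplified Guaranteed QoS test: cpu and memory limits set, requests equal."""
    containers = pod_spec.get("containers", [])
    if not containers:
        return False
    for container in containers:
        resources = container.get("resources", {})
        requests = resources.get("requests", {})
        limits = resources.get("limits", {})
        for name in ("cpu", "memory"):
            if name not in limits:
                return False
            # Kubernetes defaults a missing request to the limit.
            if str(requests.get(name, limits[name])) != str(limits[name]):
                return False
    return True


def find_problems(pod_spec):
    """Return a list of reasons why the cores of this pod might not be tuned."""
    problems = []
    if not is_guaranteed(pod_spec):
        problems.append("pod is not in the Guaranteed QoS class")
    profiles = set()
    for container in pod_spec.get("containers", []):
        resources = container.get("resources", {})
        requests = resources.get("requests", {})
        limits = resources.get("limits", {})
        for key in sorted(set(requests) | set(limits)):
            if not key.startswith(POWER_PREFIX):
                continue
            profiles.add(key[len(POWER_PREFIX):])
            if str(requests.get(key)) != str(limits.get(key)):
                problems.append(f"{container.get('name')}: {key} request and limit differ")
    if len(profiles) > 1:
        problems.append("more than one Power Profile requested: "
                        + ", ".join(sorted(profiles)))
    return problems
```

An empty list means none of these conditions was violated; it does not guarantee that the profile exists on the cluster or that the pod uses isolated cores.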
Exclude kernel parameter
When you apply the power-management label, the intel_idle.max_cstate parameter is removed from the kernel arguments.
Note
This change takes effect after a reboot. Until then, the host retains its current behavior and Power Manager manages the CPU c-states using the acpi_idle driver, which may not expose all c-states supported by the processor. After the reboot, ensure that all overrides are applied.
User Defined Settings
You can override the auto-generated settings using the user_override functionality of the Power Manager application. It allows you to customize the settings on a per-host basis for:
Shared Profile [section sharedProfile]:
- governor: CPU governor
- max: Maximum CPU frequency
- min: Minimum CPU frequency
- reservedCPUs: List of CPU cores to which the profile is not applied (platform cores)
- reservedProfile: The profile to apply to platform cores
c-states Profile [section cstatesProfile]:
- sharedPoolCStates: List of CPU c-states for all application cores and their status (on/off)
- individualCoreCStates: List of individual CPU cores (the platform cores in the example below), each with a list of CPU c-states and their status (on/off)
See the example below to configure host controller-0. This setting will override the CPU governor and maximum CPU frequency in Shared Profile and disable C6 state for the platform cores (0,96) and enable C6 state for all application cores through the c-state Profile.
sharedProfile:
  controller-0:
    governor: powersave
    max: 2000
cstatesProfile:
  controller-0:
    individualCoreCStates:
      "0":
        C6: false
      "96":
        C6: false
    sharedPoolCStates:
      C6: true
Applying these user_overrides generates a new configuration (combined_overrides) by merging the user's definitions into the auto-generated configuration, with the user's values taking precedence. You can also view both configurations individually: the configuration auto-generated by Power Manager in the system_overrides section and the user configuration in the user_overrides section, as shown below.
(keystone_admin)]$ system helm-override-show kubernetes-power-manager kubernetes-power-manager intel-power
+--------------------+---------------------------------------------------------------------+
| Property           | Value                                                               |
+--------------------+---------------------------------------------------------------------+
| attributes         | enabled: true                                                       |
|                    |                                                                     |
| combined_overrides | cstatesProfile:                                                     |
|                    |   controller-0:                                                     |
|                    |     individualCoreCStates:                                          |
|                    |       "0":                                                          |
|                    |         C1: true                                                    |
|                    |         C1E: true                                                   |
|                    |         C6: false                                                   |
|                    |         POLL: true                                                  |
|                    |       "96":                                                         |
|                    |         C1: true                                                    |
|                    |         C1E: true                                                   |
|                    |         C6: false                                                   |
|                    |         POLL: true                                                  |
|                    |     sharedPoolCStates:                                              |
|                    |       C1: true                                                      |
|                    |       C1E: false                                                    |
|                    |       C6: true                                                      |
|                    |       POLL: true                                                    |
|                    | sharedProfile:                                                      |
|                    |   controller-0:                                                     |
|                    |     governor: powersave                                             |
|                    |     max: 2000                                                       |
|                    |     min: 800                                                        |
|                    |     reservedCPUs: '[0, 96]'                                         |
|                    |     reservedProfile: performance                                    |
|                    |     shared: true                                                    |
|                    |                                                                     |
|                    |                                                                     |
| name               | kubernetes-power-manager                                            |
| namespace          | intel-power                                                         |
| system_overrides   | cstatesProfile:                                                     |
|                    |   controller-0:                                                     |
|                    |     individualCoreCStates:                                          |
|                    |       '0': {C1: true, C1E: true, C6: true, POLL: true}              |
|                    |       '96': {C1: true, C1E: true, C6: true, POLL: true}             |
|                    |     sharedPoolCStates: {C1: true, C1E: false, C6: false, POLL: true}|
|                    | sharedProfile:                                                      |
|                    |   controller-0:                                                     |
|                    |     governor: performance                                           |
|                    |     max: 3000                                                       |
|                    |     min: 800                                                        |
|                    |     reservedCPUs: [0, 96]                                           |
|                    |     reservedProfile: performance                                    |
|                    |     shared: true                                                    |
| user_overrides     | cstatesProfile:                                                     |
|                    |   controller-0:                                                     |
|                    |     individualCoreCStates:                                          |
|                    |       "0":                                                          |
|                    |         C6: false                                                   |
|                    |       "96":                                                         |
|                    |         C6: false                                                   |
|                    |     sharedPoolCStates:                                              |
|                    |       C6: true                                                      |
|                    | sharedProfile:                                                      |
|                    |   controller-0:                                                     |
|                    |     governor: powersave                                             |
|                    |     max: 2000                                                       |
|                    |                                                                     |
+--------------------+---------------------------------------------------------------------+
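The relationship between system_overrides, user_overrides and combined_overrides in the output above can be pictured as a recursive dictionary merge in which user values win. The sketch below only illustrates that idea under this assumption; merge_overrides is a hypothetical function and not the code the platform uses, and only part of the controller-0 data is reproduced.

```python
import copy


def merge_overrides(system, user):
    """Return a new dict: system values, recursively overridden by user values."""
    merged = copy.deepcopy(system)
    for key, value in user.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge them key by key.
            merged[key] = merge_overrides(merged[key], value)
        else:
            # Scalars, lists, or new keys: the user value replaces the default.
            merged[key] = copy.deepcopy(value)
    return merged


system_overrides = {
    "cstatesProfile": {"controller-0": {
        "individualCoreCStates": {
            "0": {"C1": True, "C1E": True, "C6": True, "POLL": True},
            "96": {"C1": True, "C1E": True, "C6": True, "POLL": True},
        },
        "sharedPoolCStates": {"C1": True, "C1E": False, "C6": False, "POLL": True},
    }},
    "sharedProfile": {"controller-0": {
        "governor": "performance", "max": 3000, "min": 800,
    }},
}

user_overrides = {
    "cstatesProfile": {"controller-0": {
        "individualCoreCStates": {"0": {"C6": False}, "96": {"C6": False}},
        "sharedPoolCStates": {"C6": True},
    }},
    "sharedProfile": {"controller-0": {"governor": "powersave", "max": 2000}},
}

combined_overrides = merge_overrides(system_overrides, user_overrides)
```

Keys the user does not mention, such as min or the C1 state, keep their auto-generated values, which is why a user override only needs to list the settings that change.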
This final configuration will be published into Kubernetes as a Shared Profile and c-state Profile when you reapply the application.
(keystone_admin)]$ system application-apply kubernetes-power-manager
It is also possible (and optional) to add c-states for a specific profile. To do this, add the exclusivePoolCstates tag. The example below includes c-states for the performance profile:
sharedProfile:
  controller-0:
    governor: powersave
    max: 2000
cstatesProfile:
  controller-0:
    individualCoreCStates:
      "0":
        C6: false
      "96":
        C6: false
    sharedPoolCStates:
      C6: true
    exclusivePoolCstates:
      performance:
        C6: true
Power Manager offers other features, such as Uncore Frequency and Time of Day. Their settings should be deployed directly to the cluster using the procedures described in the Power Manager documentation at https://github.com/intel/kubernetes-power-manager.
Inconsistent Settings
When using the application, configure frequency and power profiles with caution. Inconsistent settings may leave pods in an undesired power state, either because settings are only partially applied (only c-states or only p-states) or because they are not applied at all (pod deployed without power settings).
Power Manager Uninstall
To uninstall the application, use the standard commands for removing an application:
(keystone_admin)]$ system application-remove kubernetes-power-manager
(keystone_admin)]$ system application-delete kubernetes-power-manager
The uninstall process shuts down the containers (manager and all agents) and removes all configurations deployed to Kubernetes related to Power Manager, including the namespace intel-power. The Node Feature Discovery (NFD) application is not uninstalled, even if it was installed as a dependency of Power Manager; this avoids disrupting other applications that use it. The power-management label must be removed manually.
(keystone_admin)]$ system host-label-remove <HOSTNAME> power-management
Note
While the label is assigned to a host, the intel_idle.max_cstate kernel parameter will not be restored on that host and the max_cpu_mhz_configured parameter will remain disabled.