C-state Management Application on StarlingX

This commit introduces the StarlingX specification for C-state Management,
an application that allows Kubernetes resources to dynamically control
their C-states.

Story: 2011105
Task: 49878

Author: Guilherme Santos <guilherme.santos@windriver.com>
Co-author: Vinicius Lobo <vinicius.rochalobo@windriver.com>

Change-Id: Iebae30c72d94e3d490ecc00a55462aa70fa77516
Signed-off-by: Guilherme Santos <guilherme.santos@windriver.com>

parent 31bf76b1f8
commit f945ad22ae

..
  This work is licensed under a Creative Commons Attribution 3.0 Unported
  License. http://creativecommons.org/licenses/by/3.0/legalcode

..
  Many thanks to the OpenStack Nova team for the Example Spec that formed the
  basis for this document.

===========================================
C-state Management Application on StarlingX
===========================================

Storyboard: `#2011105`_

The objective of this spec is to introduce the C-state Management
Application in StarlingX Platform.

Problem description
===================

StarlingX, in its current version, offers a comprehensive set of features
for power management. Users and applications can control the acceptable
frequency range (minimum and maximum frequency) per core; the behavior of
cores within that range (governor); which idle sleep states (C-states) a given
core can access; and the behavior of the system in the face of
workloads with known intervals/demands. `Kubernetes Power Manager`_ powers
the control of these features in targeted CPUs/cores, allowing
individualized configurations.

Containerized applications often require finer granularity: the ability to
control their CPU idle states (C-states) at run time. The
`C-state Management Application` offers a set of endpoints that enable pods to
dynamically query and adjust their C-states. It therefore allows users to
save energy through fine-grained control of the C-states of the cores
assigned to their applications.

Use Cases
---------

With the introduction of these new capabilities for C-state management,
StarlingX end users and deployers gain enhanced control over CPU core
configurations. These new features are beneficial for optimizing power
consumption and performance.

This dynamic C-state management integration has the following potential
impacts on StarlingX stakeholders:

* End users: The ability to adjust the maximum C-state level of CPU cores
  assigned to pods through REST API requests offers increased flexibility
  without disrupting existing workflows. This feature integrates with
  applications already running on StarlingX.

* Deployers: Dynamic C-state management may require minor adjustments from
  deployers, primarily to ensure that the assigned CPU cores are configured
  as application-isolated or exclusively allocated to the pods. Additionally,
  deployers may need to ensure that REST API requests for C-state adjustments
  originate from the same node where the application's pods are deployed,
  maintaining security and efficiency.

* Developers: With dynamic C-state management, developers gain a more
  granular level of control over CPU core configurations, allowing finer
  optimization of power usage and system performance.

Proposed change
===============

The new `C-state Management Application` will be introduced to StarlingX,
adding a REST API that lets pods dynamically control their C-states. When
disabled, the application does not change StarlingX's standard behavior.
When enabled, Kubernetes pods are able to programmatically manage their
C-states.

`C-state Management Application` provides endpoints that enable the
following functionalities:

* Change the maximum C-state level of CPU cores.

  * An application sends a request to the REST API to change the maximum
    C-state level of the CPU cores allocated to its pods.
  * The assigned CPU cores must either be application-isolated or be
    exclusively assigned to the pods.
  * The request must originate from the node on which the application's pods
    are deployed.

* Query the maximum available C-state levels.

  * An application sends a request to the REST API to query the maximum
    C-state levels available for modification.

* Query the maximum C-state configuration.

  * An application sends a request to the REST API to query the configured
    maximum C-state on the node where its pods are currently deployed.


This specification also requires that the cloud platform shall be able to:

* Process C-state level requests (change/query) and respond with either
  confirmation that the change occurred or the current maximum C-state level.

* Process maximum C-state level requests (change/query) on the platform
  cores; in other words, the API producer shall run on the platform cores.

* Fulfill a request to change the maximum C-state on a time scale of
  seconds.
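
The three operations listed above, together with the rule that a pod may only change cores assigned to it, can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: the spec does not define endpoint paths, payloads, or class names, so ``NodeCStateModel``, ``CStateRequestError``, and every method name below are hypothetical and stand in for whatever the real REST API exposes.

```python
class CStateRequestError(Exception):
    """Raised when a request breaks one of the rules described in this spec."""


class NodeCStateModel:
    """Hypothetical in-memory stand-in for the C-state service on one node."""

    def __init__(self, available_levels, exclusive_cpus_by_pod):
        # Levels ordered from shallowest to deepest, e.g. ["C1", "C1E", "C6"].
        self.available_levels = list(available_levels)
        # Mapping pod name -> set of CPU ids isolated/exclusive to that pod.
        self.exclusive_cpus_by_pod = {
            pod: set(cpus) for pod, cpus in exclusive_cpus_by_pod.items()
        }
        # Assumption: every managed CPU starts at the deepest available level.
        deepest = self.available_levels[-1]
        self.max_cstate = {
            cpu: deepest
            for cpus in self.exclusive_cpus_by_pod.values()
            for cpu in cpus
        }

    def query_available(self):
        """Query the maximum C-state levels available for modification."""
        return list(self.available_levels)

    def query_configured(self, pod):
        """Query the configured maximum C-state of the pod's CPUs."""
        cpus = self.exclusive_cpus_by_pod.get(pod, set())
        return {cpu: self.max_cstate[cpu] for cpu in sorted(cpus)}

    def change_max(self, pod, cpus, level):
        """Change the maximum C-state of CPUs that belong to the pod."""
        if level not in self.available_levels:
            raise CStateRequestError(f"unknown C-state level: {level}")
        foreign = set(cpus) - self.exclusive_cpus_by_pod.get(pod, set())
        if foreign:
            # Reject CPUs that are not isolated/exclusive to the caller.
            raise CStateRequestError(f"CPUs not owned by {pod}: {sorted(foreign)}")
        for cpu in cpus:
            self.max_cstate[cpu] = level
        return self.query_configured(pod)
```

In this model, a pod that owns CPUs 2 and 3 can call ``change_max("pod-a", [2], "C1")`` and then read the result back with ``query_configured("pod-a")``, while a request naming a CPU owned by another pod raises ``CStateRequestError``. The node-origin check is left out because it depends on how the real service identifies callers.
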

Alternatives
------------

None

Data model impact
-----------------

None

REST API impact
---------------

None

Security impact
---------------

None

Other end user impact
---------------------

A new REST API will be available, resulting in procedural changes for
dynamically managing C-states on StarlingX. Users should be aware that
the `C-state Management Application` is not designed to work in tandem with
`Kubernetes Power Manager`_. Therefore, we recommend using only one of
these applications at a time.

C-state availability might depend on the presence of a label such as
`power-management`_. The `C-state Management Application` is able to manage the
available C-states independently of the applied labels.

Performance Impact
------------------

Given the nature of dynamic C-state management, impacts related to power
consumption and latency are expected to vary based on the usage of
`C-state Management Application`. The following shall be considered:

* Power Consumption: By actively monitoring and controlling the C-states,
  applications can optimize power consumption based on workload demands,
  reducing the overall energy consumption in the cluster. On the other hand,
  an incorrect or inconsistent configuration might lead to performance
  degradation.

* Latency: C-states range from C0 to Cn. C0 indicates an active state. All
  other C-states (C1-Cn) represent idle sleep states with different parts of
  the processor powered down. As the C-states get deeper, the exit latency
  (the time to transition back to C0) becomes longer and the power savings
  become greater. This can increase the time required to process
  varying workloads, depending on the configured parameters.
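
The trade-off between C-state depth and exit latency can be inspected on a Linux host through the cpuidle sysfs interface, where each idle state of a CPU appears as a ``stateN`` directory (for example under ``/sys/devices/system/cpu/cpu0/cpuidle/``) containing a ``name`` file and a ``latency`` file in microseconds. The following is a minimal sketch, not the application's implementation: the function names and the latency budget are hypothetical and only illustrate how a maximum acceptable exit latency maps to a set of idle states.

```python
from pathlib import Path


def read_idle_states(cpuidle_dir):
    """Return [(index, name, exit_latency_us)] sorted by state index.

    cpuidle_dir is a directory laid out like
    /sys/devices/system/cpu/cpu0/cpuidle on a Linux host.
    """
    states = []
    for state_dir in Path(cpuidle_dir).glob("state[0-9]*"):
        index = int(state_dir.name[len("state"):])
        name = (state_dir / "name").read_text().strip()
        latency_us = int((state_dir / "latency").read_text())
        states.append((index, name, latency_us))
    return sorted(states)


def states_over_budget(states, max_latency_us):
    """Names of idle states whose exit latency exceeds the given budget."""
    return [name for _, name, latency in states if latency > max_latency_us]
```

A caller could pass the result of ``read_idle_states`` for one CPU to ``states_over_budget`` with a hypothetical budget such as 10 microseconds to see which deeper states a latency-sensitive workload would want excluded.
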

Other deployer impact
---------------------

None

Developer impact
----------------

Please see the `Use Cases`_ section.

Upgrade impact
--------------

None


Implementation
==============

Assignee(s)
-----------

Primary assignee:

* Guilherme Batista Leite (guilhermebatista)

Other contributors:

* Alyson Deives Pereira (adeivesp)
* Eduardo Juliano Alberti (ealberti)
* Fabio Studyny Higa (fstudyny)
* Guilherme Henrique Pereira dos Santos (gsantos1)
* Vinicius Fernando Rocha Lobo (vrochalo)

Repos Impacted
--------------

* starlingx/docs
* starlingx/config
* starlingx/app-cstate-management (new)


Work Items
----------

The following work items are expected to be carried out, with the understanding
that the storyboard will be updated as more work items are found to be
necessary.

Spikes and Design
*****************

* Basic testing of per-CPU latency specification.
* Review of the proposed design.
* Evaluation of options to reduce latency and expected latency reduction.

Development Work Items
**********************

* Merge proof of concept to StarlingX codebase.
* Create FluxCD manifest for C-state DaemonSet.
* Create StarlingX application to wrap the FluxCD manifest.
* Enhance C-state application to support IPv6 addresses.
* Enhance C-state application to prevent modification of CPUs allocated to
  other pods.
* Installation via system application.

Customer Documentation
**********************

* Publish a usage guide describing the available functionality and how to
  use it.
* Sample code showing how to make use of the functionality.

Dependencies
============

None

Testing
=======

System configuration
--------------------
The tests will be conducted in the following system configurations:

* AIO-SX
* AIO-DX
* Standard

Test Scenarios
--------------

* Functional tests for `C-state Management Application` and its customizations.
* Unit testing the impacted code areas.
* Performance testing to identify and address any performance impacts.
* Backup and restore tests.

Documentation Impact
====================

The end-user documentation must be created, adding a guide to
`C-state Management Application` deployments, configurations and
customizations.

References
==========
#. `Kubernetes Power Manager`_


History
=======

.. list-table:: Revisions
   :header-rows: 1

   * - Release Name
     - Description
   * - stx-10.0
     - Introduced

.. Links
.. _#2011105: https://storyboard.openstack.org/#!/story/2011105
.. _Kubernetes Power Manager: https://github.com/intel/kubernetes-power-manager
.. _power-management: https://docs.starlingx.io/node_management/kubernetes/configurable-power-manager-04c24b536696.html