C-state Management Application on StarlingX
This commit introduces the StarlingX specification for C-state Management, an application that allows Kubernetes resources to dynamically control their C-states.

Story: 2011105
Task: 49878

Author: Guilherme Santos <guilherme.santos@windriver.com>
Co-author: Vinicius Lobo <vinicius.rochalobo@windriver.com>
Change-Id: Iebae30c72d94e3d490ecc00a55462aa70fa77516
Signed-off-by: Guilherme Santos <guilherme.santos@windriver.com>
parent 31bf76b1f8
commit f945ad22ae
@ -0,0 +1,289 @@
..
  This work is licensed under a Creative Commons Attribution 3.0 Unported
  License. http://creativecommons.org/licenses/by/3.0/legalcode

..
  Many thanks to the OpenStack Nova team for the Example Spec that formed the
  basis for this document.

===========================================
C-state Management Application on StarlingX
===========================================

Storyboard: `#2011105`_

The objective of this spec is to introduce the C-state Management
Application to the StarlingX platform.

Problem description
===================

StarlingX, in its current version, offers a comprehensive set of features
for power management, allowing users and applications to control: the
acceptable frequency range (minimum and maximum frequency) per core; the
behavior of cores within that range (governor); which idle sleep states
(C-states) a given core can enter; and the behavior of the system under
workloads with known intervals or demands. `Kubernetes Power Manager`_
provides control of these features on targeted CPUs/cores, allowing
individualized configurations.

Containerized applications often need finer granularity: the ability to
control their CPU idle states (C-states) at run time. The
`C-state Management Application` offers a set of endpoints that enable pods to
dynamically query and adjust their C-states. It therefore allows users to
save energy through fine-grained control of the C-states of the cores
assigned to their applications.

Use Cases
---------

With the introduction of these new capabilities for C-state management,
StarlingX end users and deployers gain enhanced control over CPU core
configurations. These new features are beneficial for optimizing power
consumption and performance.

We identify the following potential impacts on StarlingX stakeholders from
this dynamic C-state management integration:

* End users: The ability to adjust the maximum C-state level of CPU cores
  assigned to pods through REST API requests offers increased flexibility
  without disrupting existing workflows. This feature ensures seamless
  integration with applications running on StarlingX, enhancing user
  experience.

* Deployers: The introduction of dynamic C-state management may necessitate
  minor adjustments for deployers, primarily related to ensuring that assigned
  CPU cores are appropriately configured as application-isolated or
  exclusively allocated to the pods. Additionally, deployers may need to ensure
  that REST API requests for C-state adjustments originate from the same node
  where the application's pods are deployed, maintaining security and
  efficiency.

* Developers: The integration of C-state management brings significant
  enhancements to the development workflow within StarlingX. By incorporating
  dynamic C-state management functionality, developers gain a more granular
  level of control over CPU core configurations, allowing for finer
  optimization of power usage and system performance.

Proposed change
===============

The new `C-state Management Application` will be introduced to StarlingX,
adding a REST API that allows pods to dynamically control their C-states.
When disabled, the application does not change StarlingX's standard
behavior. When enabled, Kubernetes pods will be able to programmatically
manage their C-states.

`C-state Management Application` essentially provides endpoints that enable the
following functionalities:

* Change the maximum C-state level of CPU cores.

  * The application, via its REST API, initiates a request to modify the
    maximum C-state level of the CPU cores allocated to its pods.
  * The assigned CPU cores must either be application-isolated or be
    exclusively assigned to the pods.
  * The request originates from the node on which the application's pods
    are deployed.

* Query the maximum available C-state levels.

  * The application, through its REST API, sends a request to inquire about
    the maximum C-state levels available for modification.

* Query the maximum C-state configuration.

  * The application, utilizing its REST API, requests information regarding
    the configured maximum C-state from the node where its pods are currently
    deployed.

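
This spec does not define how the API producer applies these operations on
the node. As a minimal sketch, assuming the standard Linux ``cpuidle`` sysfs
interface is used (an assumption, not something this spec confirms), the three
operations could map onto the per-state ``name`` and ``disable`` files. The
function names below are hypothetical and are not part of the application.

```python
from pathlib import Path

# Real Linux cpuidle sysfs root. It is a parameter everywhere so the sketch
# can be tried against a temporary directory instead of a live system.
SYSFS_CPU_ROOT = Path("/sys/devices/system/cpu")


def _state_dirs(cpu, root=SYSFS_CPU_ROOT):
    """Return the cpuidle state directories of one CPU, shallowest first."""
    base = Path(root) / f"cpu{cpu}" / "cpuidle"
    return sorted(base.glob("state[0-9]*"), key=lambda p: int(p.name[5:]))


def available_cstates(cpu, root=SYSFS_CPU_ROOT):
    """Query the C-state names the CPU exposes (hypothetical helper)."""
    return [(d / "name").read_text().strip() for d in _state_dirs(cpu, root)]


def current_max_cstate(cpu, root=SYSFS_CPU_ROOT):
    """Query the deepest C-state that is currently enabled, or None."""
    enabled = [d for d in _state_dirs(cpu, root)
               if (d / "disable").read_text().strip() == "0"]
    return (enabled[-1] / "name").read_text().strip() if enabled else None


def set_max_cstate(cpu, max_name, root=SYSFS_CPU_ROOT):
    """Enable states up to max_name and disable every deeper state."""
    dirs = _state_dirs(cpu, root)
    names = [(d / "name").read_text().strip() for d in dirs]
    if max_name not in names:
        raise ValueError(f"{max_name!r} is not available on cpu{cpu}")
    limit = names.index(max_name)
    for index, d in enumerate(dirs):
        (d / "disable").write_text("1" if index > limit else "0")
```

On a real node, writing to the ``disable`` files requires root privileges,
which is one reason such logic would live in the API producer rather than in
the requesting pod.
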
This specification also requires that the cloud platform shall be able to:

* Process C-state level requests (change/query) and respond with whether the
  change occurred or with the current maximum C-state level.

* Process maximum C-state level requests (change/query) on the platform
  cores; in other words, it shall run the API producer on the platform cores.

* Fulfill a request to change the maximum C-state within seconds.

Alternatives
------------

None

Data model impact
-----------------

None

REST API impact
---------------

None

Security impact
---------------

None

Other end user impact
---------------------

A new REST API will be available, resulting in procedural changes for
dynamically managing C-states on StarlingX. Users should be aware that
the `C-state Management Application` is not designed to work in tandem with
`Kubernetes Power Manager`_. We therefore recommend using only one of
these applications at a time.

C-state availability may depend on the presence of a label such as
`power-management`_. The `C-state Management Application` can manage the
available C-states regardless of the applied labels.

Performance Impact
------------------

Given the nature of dynamic C-state management, impacts related to power
consumption and latency are expected to vary based on how the
`C-state Management Application` is used. The following should be considered:

* Power Consumption: By actively monitoring and controlling the C-states,
  applications can optimize power consumption based on workload demands,
  reducing the overall energy consumption in the cluster. On the other hand,
  an incorrect or inconsistent configuration might lead to performance
  degradation.

* Latency: C-states range from C0 to Cn. C0 is the active state. All other
  C-states (C1 to Cn) are idle sleep states in which different parts of the
  processor are powered down. As C-states get deeper, the exit latency (the
  time to transition back to C0) becomes longer and the power savings become
  greater. Deeper C-states can therefore increase the time required to
  process varying workloads, depending on the pre-defined parameters.

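
The latency trade-off above can be sketched as a simple selection rule: given
the longest wake-up delay a workload tolerates, pick the deepest C-state whose
exit latency still fits. The state names and latency values below are
illustrative assumptions, not measurements; real values are hardware specific.
The function name is hypothetical.

```python
# Illustrative (name, exit latency in microseconds) pairs, ordered from
# shallowest to deepest. These numbers are assumptions for the example only.
EXAMPLE_CSTATES = [("C0", 0), ("C1", 2), ("C1E", 10), ("C6", 133)]


def deepest_cstate_within(budget_us, cstates=EXAMPLE_CSTATES):
    """Return the deepest C-state whose exit latency fits the budget."""
    chosen = cstates[0][0]  # C0 is always acceptable: it has no exit latency
    for name, exit_latency_us in cstates:
        if exit_latency_us <= budget_us:
            chosen = name  # later entries are deeper, so keep the last match
    return chosen
```

With these example values, a workload that tolerates 50 microseconds of
wake-up delay would be limited to ``C1E``, while one that tolerates a full
millisecond could use ``C6`` and save more power.
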
Other deployer impact
---------------------

None

Developer impact
----------------

Please see the `Use Cases`_ section.

Upgrade impact
--------------

None


Implementation
==============

Assignee(s)
-----------

Primary assignee:

* Guilherme Batista Leite (guilhermebatista)

Other contributors:

* Alyson Deives Pereira (adeivesp)
* Eduardo Juliano Alberti (ealberti)
* Fabio Studyny Higa (fstudyny)
* Guilherme Henrique Pereira dos Santos (gsantos1)
* Vinicius Fernando Rocha Lobo (vrochalo)

Repos Impacted
--------------

* starlingx/docs
* starlingx/config
* starlingx/app-cstate-management (new)


Work Items
----------

The following work items are expected to be carried out, with the understanding
that the storyboard will be updated as more work items are found to be
necessary.

Spikes and Design
*****************

* Basic testing of per-CPU latency specification.
* Review of the proposed design.
* Evaluation of options to reduce latency and expected latency reduction.

Development Work Items
**********************

* Merge proof of concept to StarlingX codebase.
* Create FluxCD manifest for C-state DaemonSet.
* Create StarlingX application to wrap the FluxCD manifest.
* Enhance C-state application to support IPv6 addresses.
* Enhance C-state application to prevent modification of CPUs allocated to
  other Pods.
* Installation via system application.

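
The work item on preventing modification of CPUs allocated to other Pods
amounts to a subset check. The sketch below assumes that both the requested
CPUs and the CPUs owned by the requesting pod are available in Linux CPU-list
notation (for example ``0-3,8``, the format used by cpuset files such as
``cpuset.cpus``). How the application actually discovers a pod's CPUs is not
specified here, and both function names are hypothetical.

```python
def parse_cpu_list(text):
    """Parse a Linux CPU list such as "0-3,8" into a set of CPU numbers."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # tolerate an empty list or a trailing comma
        first, _, last = part.partition("-")
        cpus.update(range(int(first), int(last or first) + 1))
    return cpus


def check_request(requested, owned):
    """Raise PermissionError if any requested CPU is not owned by the pod."""
    foreign = parse_cpu_list(requested) - parse_cpu_list(owned)
    if foreign:
        raise PermissionError(
            f"CPUs not assigned to this pod: {sorted(foreign)}")
    return True
```

Rejecting the whole request when any CPU falls outside the pod's set, rather
than silently applying it to the permitted subset, keeps the outcome of a
change request unambiguous for the caller.
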
Customer Documentation
**********************

* Publish a usage guide describing the available functionality and how to
  use it.
* Provide sample code showing how to use the functionality.

Dependencies
============

None

Testing
=======

System configuration
--------------------

The tests will be conducted in the following system configurations:

* AIO-SX
* AIO-DX
* Standard

Test Scenarios
--------------

* Functional tests for `C-state Management Application` and its customizations.
* Unit testing the impacted code areas.
* Performance testing to identify and address any performance impacts.
* Backup and restore tests.

Documentation Impact
====================

End-user documentation must be created, including a guide to
`C-state Management Application` deployment, configuration, and
customization.

References
==========

#. `Kubernetes Power Manager`_


History
=======

.. list-table:: Revisions
   :header-rows: 1

   * - Release Name
     - Description
   * - stx-10.0
     - Introduced

.. Links
.. _#2011105: https://storyboard.openstack.org/#!/story/2011105
.. _Kubernetes Power Manager: https://github.com/intel/kubernetes-power-manager
.. _power-management: https://docs.starlingx.io/node_management/kubernetes/configurable-power-manager-04c24b536696.html