Merge "Sapphire Rapids Data Streaming Accelerator Support"
commit
d53bbc2f7a
@ -47,6 +47,7 @@
|
||||
.. |DDP| replace:: :abbr:`DDP (Dynamic Device Personalization)`
|
||||
.. |DOR| replace:: :abbr:`DOR (Dead Office Recovery)`
|
||||
.. |DHCP| replace:: :abbr:`DHCP (Dynamic Host Configuration Protocol)`
|
||||
.. |DLB| replace:: :abbr:`DLB (Dynamic Load Balancer)`
|
||||
.. |DMA| replace:: :abbr:`DMA (Direct Memory Access)`
|
||||
.. |DMS| replace:: :abbr:`DMS (O-Cloud Deployment Management Services)`
|
||||
.. |DNAT| replace:: :abbr:`DNAT (Destination Network Address Translation)`
|
||||
@ -55,8 +56,10 @@
|
||||
.. |DN| replace:: :abbr:`DN (Distinguished Name)`
|
||||
.. |DORA| replace:: :abbr:`DORA (Dell Open RAN Accelerator)`
|
||||
.. |DRBD| replace:: :abbr:`DRBD (Distributed Replicated Block Device)`
|
||||
.. |DSA| replace:: :abbr:`DSA (Data Streaming Accelerator)`
|
||||
.. |DSCP| replace:: :abbr:`DSCP (Differentiated Services Code Point)`
|
||||
.. |DVR| replace:: :abbr:`DVR (Distributed Virtual Router)`
|
||||
.. |DWQ| replace:: :abbr:`DWQ (Dedicated Work Queue)`
|
||||
.. |EMS| replace:: :abbr:`EMS (Element Management System)`
|
||||
.. |ePRTC| replace:: :abbr:`ePRTC (Enhanced Primary Reference Time Clock)`
|
||||
.. |FEC| replace:: :abbr:`FEC (Forward Error Correction)`
|
||||
@ -147,6 +150,7 @@
|
||||
.. |PVCs| replace:: :abbr:`PVCs (Persistent Volume Claims)`
|
||||
.. |PXE| replace:: :abbr:`PXE (Preboot Execution Environment)`
|
||||
.. |PW| replace:: :abbr:`PW (Per Worker)`
|
||||
.. |QAT| replace:: :abbr:`QAT (QuickAssist Technology)`
|
||||
.. |QoS| replace:: :abbr:`QoS (Quality of Service)`
|
||||
.. |RAID| replace:: :abbr:`RAID (Redundant Array of Inexpensive Disks)`
|
||||
.. |RAN| replace:: :abbr:`RAN (Radio Access Network)`
|
||||
@ -165,6 +169,7 @@
|
||||
.. |SBR| replace:: :abbr:`SBR (Source-Based Routing)`
|
||||
.. |SCTP| replace:: :abbr:`SCTP (Stream Control Transmission Protocol)`
|
||||
.. |SDO| replace:: :abbr:`SDO (Secure Device Onboard)`
|
||||
.. |SGX| replace:: :abbr:`SGX (Software Guard Extensions)`
|
||||
.. |SLA| replace:: :abbr:`SLA (Service Level Agreement)`
|
||||
.. |SLAs| replace:: :abbr:`SLAs (Service Level Agreements)`
|
||||
.. |SM| replace:: :abbr:`SM (Service Manager)`
|
||||
@ -186,6 +191,7 @@
|
||||
.. |SSHD| replace:: :abbr:`SSHD (Secure Shell Daemon)`
|
||||
.. |STP| replace:: :abbr:`STP (Spanning Tree Protocol)`
|
||||
.. |SWACT| replace:: :abbr:`SWACT (SWitch ACTivity)`
|
||||
.. |SWQ| replace:: :abbr:`SWQ (Shared Work Queue)`
|
||||
.. |TAI| replace:: :abbr:`TAI (International Atomic Time)`
|
||||
.. |T-BC| replace:: :abbr:`T-BC (Telecom Boundary Clock)`
|
||||
.. |TBF| replace:: :abbr:`TBF (Token Bucket Filter)`
|
||||
|
@ -0,0 +1,481 @@
|
||||
.. _data-streaming-accelerator-db88a67c930c:
|
||||
|
||||
==========================
|
||||
Data Streaming Accelerator
|
||||
==========================
|
||||
|
||||
.. rubric:: |context|
|
||||
|
||||
Intel® |DSA| is a high-performance data copy and transformation accelerator
|
||||
integrated into Intel® processors starting with 4th Generation Intel® Xeon®
|
||||
processors. It is designed to optimize the streaming data movement and
transformation operations common in high-performance storage, networking,
persistent memory, and other data processing applications.
|
||||
|
||||
In the |DSA| architecture, descriptors specify the work to be done by the
device. They contain the type of operation to be performed and the addresses
of the data and status buffers.
|
||||
|
||||
A work queue is a queue on the device where the descriptors submitted by
|
||||
software clients are stored until they are processed.
|
||||
|
||||
Intel |DSA| supports two kinds of work queues:
|
||||
|
||||
- |DWQ|: the work queue is owned by a single client.

- |SWQ|: multiple clients can submit work to the queue.

The engine is the unit responsible for processing work. A group is a set of
work queues and engines.
|
||||
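The relationship between descriptors, work queues, engines, and groups can be sketched as a small conceptual model. This is only an illustration of the ownership rules described above; the class names and the ``submit`` and ``process_all`` methods are hypothetical and do not correspond to the IDXD driver or any Intel API.

```python
from collections import deque
from dataclasses import dataclass, field
from itertools import cycle
from typing import Optional


@dataclass
class Descriptor:
    """Work item: operation type plus data and status buffer addresses."""
    operation: str
    data_address: int
    status_address: int


@dataclass
class WorkQueue:
    """Holds submitted descriptors until an engine processes them."""
    name: str
    mode: str                      # "dedicated" (DWQ) or "shared" (SWQ)
    owner: Optional[str] = None    # only used by a dedicated queue
    pending: deque = field(default_factory=deque)

    def submit(self, client, descriptor):
        if self.mode == "dedicated":
            if self.owner is None:
                self.owner = client            # first client claims the queue
            elif self.owner != client:
                raise PermissionError(f"{self.name} is owned by {self.owner}")
        self.pending.append((client, descriptor))


@dataclass
class Group:
    """A set of work queues and the engines that drain them."""
    queues: list
    engines: list

    def process_all(self):
        """Drain every queue, handing descriptors to engines in turn."""
        done = []
        engines = cycle(self.engines)
        for queue in self.queues:
            while queue.pending:
                client, desc = queue.pending.popleft()
                done.append((next(engines), queue.name, client, desc.operation))
        return done
```

In this model a second client submitting to a dedicated queue is rejected, while a shared queue accepts descriptors from any client.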
|
||||
.. rubric:: |prereq|
|
||||
|
||||
- BIOS
|
||||
|
||||
Intel |DSA| requires Intel Virtualization Technology for Directed I/O
(VT-d) to be enabled in the BIOS. On some systems, you may also need to
enable |DSA| in the Socket Configuration menu.
|
||||
|
||||
For example: **Socket Configuration** > **IIO Configuration** > **IOAT
|
||||
Configuration** > **Sck0 IOAT Config** > **DSA**
|
||||
|
||||
- IDXD Driver
|
||||
|
||||
IDXD driver initialization can be checked using the :command:`dmesg`
|
||||
command to print the kernel message buffer.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
sysadmin@controller-0:~$ dmesg | grep "idxd"
|
||||
[ 11.094099] idxd 0000:f6:01.0: enabling device (0144 -> 0146)
|
||||
[ 11.182431] idxd 0000:f6:01.0: Intel(R) Accelerator Device (v100)
|
||||
|
||||
- |DSA| Devices
|
||||
|
||||
The Intel |DSA| |PCI| device ID is ``0x0b25``. The following command lists the
|
||||
Intel |DSA| devices on the system.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
sysadmin@controller-0:~$ lspci | grep 0b25
|
||||
f6:01.0 System peripheral: Intel Corporation Device 0b25
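The same check can be scripted when several hosts need to be audited. The following is a minimal sketch that filters captured ``lspci`` output for the device ID given above; the function name ``find_dsa_devices`` is hypothetical, and the sketch assumes the output format shown in the example.

```python
import re

DSA_DEVICE_ID = "0b25"  # Intel DSA PCI device ID, as stated in the text above


def find_dsa_devices(lspci_output):
    """Return the PCI addresses of lspci lines that mention the DSA device ID."""
    pattern = re.compile(r"\b" + DSA_DEVICE_ID + r"\b", re.IGNORECASE)
    addresses = []
    for line in lspci_output.splitlines():
        # The first whitespace-separated token is the PCI address.
        address, _, description = line.strip().partition(" ")
        if pattern.search(description):
            addresses.append(address)
    return addresses


if __name__ == "__main__":
    sample = "f6:01.0 System peripheral: Intel Corporation Device 0b25"
    print(find_dsa_devices(sample))
```

An empty result suggests that the device is not exposed to the host, in which case the BIOS settings listed above are worth rechecking.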
|
||||
|
||||
Install Intel Device Plugins Operator for Kubernetes
|
||||
----------------------------------------------------
|
||||
|
||||
Intel Device Plugins Operator is a Kubernetes custom controller that manages
the installation and lifecycle of Intel device plugins for Kubernetes. It
gives cluster administrators a single point of control for |GPU|, |QAT|,
|SGX|, |FPGA|, |DSA| and |DLB| devices. The |DSA| plugin discovers |DSA| work
queues and presents them as node resources.
|
||||
|
||||
This operator is provided by the Intel Device Plugins StarlingX application:
|
||||
https://opendev.org/starlingx/app-intel-device-plugins.
|
||||
|
||||
.. rubric:: **Dependencies**
|
||||
|
||||
Intel Device Plugins Operator depends on the ``node-feature-discovery`` StarlingX application.
|
||||
|
||||
.. rubric:: |proc|
|
||||
|
||||
#. Upload and apply ``node-feature-discovery`` app.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
$ system application-upload /usr/local/share/applications/helm/node-feature-discovery-24.09-<version>.tgz
|
||||
$ system application-apply node-feature-discovery
|
||||
|
||||
#. Upload the ``intel-device-plugins-operator`` app.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
$ system application-upload /usr/local/share/applications/helm/intel-device-plugins-operator-24.09-<version>.tgz
|
||||
|
||||
#. Enable ``intel-device-plugins-dsa`` Helm chart.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
$ system helm-chart-attribute-modify --enabled true intel-device-plugins-operator intel-device-plugins-dsa intel-device-plugins-operator
|
||||
|
||||
#. Apply ``intel-device-plugins-operator`` app.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
$ system application-apply intel-device-plugins-operator
|
||||
|
||||
#. Confirm that |DSA| resources are available.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
$ kubectl get nodes -o go-template='{{range .items}}{{.metadata.name}}{{"\n"}}{{range $k,$v:=.status.allocatable}}{{" "}}{{$k}}{{": "}}{{$v}}{{"\n"}}{{end}}{{end}}' | grep '^\([^ ]\)\|\( dsa\)'
|
||||
controller-0
|
||||
dsa.intel.com/wq-user-shared: 40
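The resource check in the last step can also be done programmatically. The sketch below assumes the node list has been captured with ``kubectl get nodes -o json`` and extracts only the ``dsa.intel.com/`` entries from each node's allocatable resources; the function name ``dsa_allocatable`` is hypothetical.

```python
import json

DSA_PREFIX = "dsa.intel.com/"


def dsa_allocatable(nodes_json):
    """Map each node name to its allocatable dsa.intel.com resources."""
    result = {}
    for node in json.loads(nodes_json).get("items", []):
        name = node["metadata"]["name"]
        allocatable = node.get("status", {}).get("allocatable", {})
        # Extended resources are reported as integer strings, for example "40".
        result[name] = {
            key: int(value)
            for key, value in allocatable.items()
            if key.startswith(DSA_PREFIX)
        }
    return result


if __name__ == "__main__":
    import sys
    # Example: kubectl get nodes -o json | python3 this_script.py
    print(dsa_allocatable(sys.stdin.read()))
```

A node that maps to an empty dictionary has no |DSA| work queues advertised, so pods requesting them cannot be scheduled there.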
|
||||
|
||||
Test Case Example
|
||||
-----------------
|
||||
|
||||
The plugin can be tested by deploying a pod using the |VRAN| tools image
|
||||
(``stx-debian-tools-dev``).
|
||||
|
||||
#. Create a YAML file for the test pod:
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
$ cat << 'EOF' > dsa-accel-config-demo.yml
|
||||
apiVersion: v1
|
||||
kind: Pod
|
||||
metadata:
|
||||
name: dsa-accel-config-demo
|
||||
labels:
|
||||
app: dsa-accel-config-demo
|
||||
spec:
|
||||
containers:
|
||||
- name: dsa-accel-config-demo
|
||||
image: registry.local:9001/docker.io/starlingx/stx-debian-tools-dev:stx.10.0-v1.0.0
|
||||
imagePullPolicy: "Always"
|
||||
workingDir: "/usr/libexec/accel-config/test/"
|
||||
command:
|
||||
- "./dsa_user_test_runner.sh"
|
||||
args:
|
||||
- "--skip-config"
|
||||
resources:
|
||||
limits:
|
||||
dsa.intel.com/wq-user-shared: 1
|
||||
restartPolicy: Never
|
||||
imagePullSecrets:
|
||||
- name: default-registry-key
EOF
|
||||
|
||||
#. Apply the yaml file.
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
$ kubectl apply -f dsa-accel-config-demo.yml
|
||||
|
||||
Review the pod's log:
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
$ kubectl logs dsa-accel-config-demo | tail
[ info] Checking Src & Dst buffers
|
||||
[ info] compsts: 1
|
||||
[ info] Checking All Tags
|
||||
[ info] All Tags Validated
|
||||
[ info] verifying task result for 0x5625182865b0
|
||||
[ info] Checking Src & Dst buffers
|
||||
[ info] compsts: 1
|
||||
[ info] Checking All Tags
|
||||
[ info] All Tags Validated
|
||||
[ info] verifying task result for 0x562518286670
|
||||
[ info] Checking Src & Dst buffers
|
||||
[ info] compsts: 1
|
||||
[ info] Checking All Tags
|
||||
[ info] All Tags Validated
|
||||
[ info] verifying task result for 0x562518286730
|
||||
[ info] Checking Src & Dst buffers
|
||||
[ info] compsts: 1
|
||||
[ info] Checking All Tags
|
||||
[ info] All Tags Validated
|
||||
[ info] verifying task result for 0x5625182867f0
|
||||
[ info] Checking Src & Dst buffers
|
||||
[ info] compsts: 1
|
||||
[ info] Checking All Tags
|
||||
[ info] All Tags Validated
|
||||
[ info] verifying task result for 0x5625182868b0
|
||||
[ info] Checking Src & Dst buffers
|
||||
[ info] compsts: 1
|
||||
[ info] Checking All Tags
|
||||
[ info] All Tags Validated
|
||||
[ info] verifying task result for 0x562518286970
|
||||
[ info] Checking Src & Dst buffers
|
||||
[ info] compsts: 1
|
||||
[ info] Checking All Tags
|
||||
[ info] All Tags Validated
|
||||
[ info] verifying task result for 0x562518286a30
|
||||
[ info] Checking Src & Dst buffers
|
||||
[ info] compsts: 1
|
||||
[ info] Checking All Tags
|
||||
[ info] All Tags Validated
|
||||
[ info] verifying task result for 0x562518286af0
|
||||
[ info] Checking Src & Dst buffers
|
||||
[ info] compsts: 1
|
||||
[ info] Checking All Tags
|
||||
[ info] All Tags Validated
|
||||
[ info] verifying task result for 0x562518286bb0
|
||||
[ info] Checking Src & Dst buffers
|
||||
[ info] compsts: 1
|
||||
[ info] Checking All Tags
|
||||
[ info] All Tags Validated
|
||||
+ '[' --skip-config '!=' --skip-config ']'
|
||||
|
||||
If the pod does not launch successfully, possibly because it could not obtain
the |DSA| resource, it remains in the ``Pending`` status:
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
$ kubectl get pods
|
||||
NAME READY STATUS RESTARTS AGE
|
||||
dsa-accel-config-demo 0/1 Pending 0 7s
|
||||
|
||||
To confirm the cause, check the pod's events:
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
$ kubectl describe pod dsa-accel-config-demo | grep -A3 Events:
|
||||
Events:
|
||||
Type Reason Age From Message
|
||||
---- ------ ---- ---- -------
|
||||
Warning FailedScheduling 2m26s default-scheduler 0/1 nodes are available: 1 Insufficient dsa.intel.com/wq-user-dedicated, 1 Insufficient dsa.intel.com/wq-user-shared.
|
||||
|
||||
|
||||
Customize the configuration
|
||||
---------------------------
|
||||
|
||||
The default configuration uses shared queues for the controller-0 node and
dedicated queues for the remaining nodes. Node-specific configuration can be
passed by naming the configuration ``dsa-<node-name>.conf``.
|
||||
|
||||
The default |DSA| device configuration is as follows:
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
$ cat << 'EOF' > dsa-override.yml
|
||||
overrideConfig:
|
||||
dsa.conf: |
|
||||
[
|
||||
{
|
||||
"dev":"dsaX",
|
||||
"read_buffer_limit":0,
|
||||
"groups":[
|
||||
{
|
||||
"dev":"groupX.0",
|
||||
"read_buffers_reserved":0,
|
||||
"use_read_buffer_limit":0,
|
||||
"read_buffers_allowed":8,
|
||||
"grouped_workqueues":[
|
||||
{
|
||||
"dev":"wqX.0",
|
||||
"mode":"dedicated",
|
||||
"size":16,
|
||||
"group_id":0,
|
||||
"priority":10,
|
||||
"block_on_fault":1,
|
||||
"type":"user",
|
||||
"name":"appX0",
|
||||
"threshold":15
|
||||
}
|
||||
],
|
||||
"grouped_engines":[
|
||||
{
|
||||
"dev":"engineX.0",
|
||||
"group_id":0
|
||||
},
|
||||
]
|
||||
},
|
||||
{
|
||||
"dev":"groupX.1",
|
||||
"read_buffers_reserved":0,
|
||||
"use_read_buffer_limit":0,
|
||||
"read_buffers_allowed":8,
|
||||
"grouped_workqueues":[
|
||||
{
|
||||
"dev":"wqX.1",
|
||||
"mode":"dedicated",
|
||||
"size":16,
|
||||
"group_id":1,
|
||||
"priority":10,
|
||||
"block_on_fault":1,
|
||||
"type":"user",
|
||||
"name":"appX1",
|
||||
"threshold":15
|
||||
}
|
||||
],
|
||||
"grouped_engines":[
|
||||
{
|
||||
"dev":"engineX.1",
|
||||
"group_id":1
|
||||
},
|
||||
]
|
||||
},
|
||||
{
|
||||
"dev":"groupX.2",
|
||||
"read_buffers_reserved":0,
|
||||
"use_read_buffer_limit":0,
|
||||
"read_buffers_allowed":8,
|
||||
"grouped_workqueues":[
|
||||
{
|
||||
"dev":"wqX.2",
|
||||
"mode":"dedicated",
|
||||
"size":16,
|
||||
"group_id":2,
|
||||
"priority":10,
|
||||
"block_on_fault":1,
|
||||
"type":"user",
|
||||
"name":"appX2",
|
||||
"threshold":15
|
||||
}
|
||||
],
|
||||
"grouped_engines":[
|
||||
{
|
||||
"dev":"engineX.2",
|
||||
"group_id":2
|
||||
},
|
||||
]
|
||||
},
|
||||
{
|
||||
"dev":"groupX.3",
|
||||
"read_buffers_reserved":0,
|
||||
"use_read_buffer_limit":0,
|
||||
"read_buffers_allowed":8,
|
||||
"grouped_workqueues":[
|
||||
{
|
||||
"dev":"wqX.3",
|
||||
"mode":"dedicated",
|
||||
"size":16,
|
||||
"group_id":3,
|
||||
"priority":10,
|
||||
"block_on_fault":1,
|
||||
"type":"user",
|
||||
"name":"appX3",
|
||||
"threshold":15
|
||||
}
|
||||
],
|
||||
"grouped_engines":[
|
||||
{
|
||||
"dev":"engineX.3",
|
||||
"group_id":3
|
||||
},
|
||||
]
|
||||
},
|
||||
]
|
||||
}
|
||||
]
EOF
|
||||
|
||||
|
||||
The |DSA| device configuration can be customized via application overrides.
|
||||
|
||||
For example, the following configuration uses dedicated queues on all nodes:
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
$ cat << 'EOF' > dsa-override.yml
|
||||
overrideConfig:
|
||||
dsa.conf: |
|
||||
[
|
||||
{
|
||||
"dev":"dsaX",
|
||||
"read_buffer_limit":0,
|
||||
"groups":[
|
||||
{
|
||||
"dev":"groupX.0",
|
||||
"read_buffers_reserved":0,
|
||||
"use_read_buffer_limit":0,
|
||||
"read_buffers_allowed":8,
|
||||
"grouped_workqueues":[
|
||||
{
|
||||
"dev":"wqX.0",
|
||||
"mode":"dedicated",
|
||||
"size":16,
|
||||
"group_id":0,
|
||||
"priority":10,
|
||||
"block_on_fault":1,
|
||||
"type":"user",
|
||||
"name":"appX0",
|
||||
"threshold":15
|
||||
}
|
||||
],
|
||||
"grouped_engines":[
|
||||
{
|
||||
"dev":"engineX.0",
|
||||
"group_id":0
|
||||
},
|
||||
]
|
||||
},
|
||||
{
|
||||
"dev":"groupX.1",
|
||||
"read_buffers_reserved":0,
|
||||
"use_read_buffer_limit":0,
|
||||
"read_buffers_allowed":8,
|
||||
"grouped_workqueues":[
|
||||
{
|
||||
"dev":"wqX.1",
|
||||
"mode":"dedicated",
|
||||
"size":16,
|
||||
"group_id":1,
|
||||
"priority":10,
|
||||
"block_on_fault":1,
|
||||
"type":"user",
|
||||
"name":"appX1",
|
||||
"threshold":15
|
||||
}
|
||||
],
|
||||
"grouped_engines":[
|
||||
{
|
||||
"dev":"engineX.1",
|
||||
"group_id":1
|
||||
},
|
||||
]
|
||||
},
|
||||
{
|
||||
"dev":"groupX.2",
|
||||
"read_buffers_reserved":0,
|
||||
"use_read_buffer_limit":0,
|
||||
"read_buffers_allowed":8,
|
||||
"grouped_workqueues":[
|
||||
{
|
||||
"dev":"wqX.2",
|
||||
"mode":"dedicated",
|
||||
"size":16,
|
||||
"group_id":2,
|
||||
"priority":10,
|
||||
"block_on_fault":1,
|
||||
"type":"user",
|
||||
"name":"appX2",
|
||||
"threshold":15
|
||||
}
|
||||
],
|
||||
"grouped_engines":[
|
||||
{
|
||||
"dev":"engineX.2",
|
||||
"group_id":2
|
||||
},
|
||||
]
|
||||
},
|
||||
{
|
||||
"dev":"groupX.3",
|
||||
"read_buffers_reserved":0,
|
||||
"use_read_buffer_limit":0,
|
||||
"read_buffers_allowed":8,
|
||||
"grouped_workqueues":[
|
||||
{
|
||||
"dev":"wqX.3",
|
||||
"mode":"dedicated",
|
||||
"size":16,
|
||||
"group_id":3,
|
||||
"priority":10,
|
||||
"block_on_fault":1,
|
||||
"type":"user",
|
||||
"name":"appX3",
|
||||
"threshold":15
|
||||
}
|
||||
],
|
||||
"grouped_engines":[
|
||||
{
|
||||
"dev":"engineX.3",
|
||||
"group_id":3
|
||||
},
|
||||
]
|
||||
},
|
||||
]
|
||||
}
|
||||
]
EOF
|
||||
|
||||
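Because the four groups differ only in their index, the override file can be generated rather than written by hand. The sketch below reproduces the fields of the configuration above for a chosen queue mode and group count. The function names are hypothetical, the ``shared`` mode value and the nesting of ``dsa.conf`` under ``overrideConfig`` are assumptions based on the examples in this section, and the output is strict JSON without the trailing commas shown above.

```python
import json


def build_dsa_conf(mode="dedicated", groups=4, wq_size=16):
    """Build the dsa.conf structure with one work queue and engine per group."""
    conf_groups = []
    for i in range(groups):
        conf_groups.append({
            "dev": f"groupX.{i}",
            "read_buffers_reserved": 0,
            "use_read_buffer_limit": 0,
            "read_buffers_allowed": 8,
            "grouped_workqueues": [{
                "dev": f"wqX.{i}",
                "mode": mode,          # "dedicated" or (assumed) "shared"
                "size": wq_size,
                "group_id": i,
                "priority": 10,
                "block_on_fault": 1,
                "type": "user",
                "name": f"appX{i}",
                "threshold": 15,
            }],
            "grouped_engines": [{"dev": f"engineX.{i}", "group_id": i}],
        })
    return [{"dev": "dsaX", "read_buffer_limit": 0, "groups": conf_groups}]


def render_override(conf):
    """Render the override YAML with dsa.conf as an indented literal block."""
    body = json.dumps(conf, indent=2)
    indented = "\n".join("    " + line for line in body.splitlines())
    return "overrideConfig:\n  dsa.conf: |\n" + indented + "\n"


if __name__ == "__main__":
    with open("dsa-override.yml", "w") as handle:
        handle.write(render_override(build_dsa_conf(mode="dedicated")))
```

Review the generated file against the default configuration before applying it, since accepted values depend on the installed plugin version.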
Apply the override file:
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
$ system helm-override-update intel-device-plugins-operator intel-device-plugins-dsa intel-device-plugins-operator --values dsa-override.yml
|
||||
|
||||
Apply the ``intel-device-plugins-operator`` application:
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
$ system application-apply intel-device-plugins-operator
|
@ -194,3 +194,12 @@ Tools
|
||||
:maxdepth: 1
|
||||
|
||||
vran-tools-2c3ee49f4b0b
|
||||
|
||||
--------------------------
|
||||
Data Streaming Accelerator
|
||||
--------------------------
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 1
|
||||
|
||||
data-streaming-accelerator-db88a67c930c
|
||||
|
@ -44,6 +44,9 @@ web page https://packages.debian.org/bullseye/<package name>.
|
||||
- `PCM Tools
|
||||
<https://github.com/opcm/pcm>`__: includes ``pcm``, and other ``Processor Counter Monitor`` tools.
|
||||
|
||||
|
||||
- `accel-config tools <https://github.com/intel/idxd-config>`__.
|
||||
|
||||
You can launch this container image in a Kubernetes pod and ``exec`` into a shell
|
||||
in the container in order to execute the commands. The Kubernetes pod must run
|
||||
in a privileged and host context, such that the above tools provide information
|
||||
|