Data Streaming Accelerator
Intel® Data Streaming Accelerator (DSA) is a high-performance data copy and transformation accelerator integrated into Intel® processors, starting with 4th Generation Intel® Xeon® processors. It is designed to optimize the streaming data movement and transformation operations that are common in high-performance storage, networking, persistent memory, and other data processing applications.
In the DSA architecture, descriptors specify the work to be done by the device. They contain the type of operation to be performed and the addresses of the data and status buffers.
A work queue is a queue on the device where the descriptors submitted by software clients are stored until they are processed.
Intel DSA supports two kinds of work queues:
- Dedicated work queue: the work queue is owned by a single client.
- Shared work queue: multiple clients can submit work to the queue.
The engine is the unit responsible for processing work. A group is a set of work queues and engines.
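As a rough illustration of the two queue kinds, the sketch below lists each work queue together with its mode. It assumes that the idxd driver exposes work queues in sysfs as /sys/bus/dsa/devices/wq<device>.<index>/mode; the function name list_wq_modes is hypothetical, and the sysfs root is a parameter so the function can be tried against a test directory.

```shell
#!/bin/sh
# List each DSA work queue with its mode ("dedicated" or "shared").
# Assumption: work queues appear as <root>/wq<device>.<index>/mode,
# with <root> defaulting to /sys/bus/dsa/devices.
list_wq_modes() {
    root="${1:-/sys/bus/dsa/devices}"
    for wq in "$root"/wq*; do
        # Skip an unexpanded glob (no work queues) and entries without a mode file.
        [ -r "$wq/mode" ] || continue
        printf '%s %s\n' "$(basename "$wq")" "$(cat "$wq/mode")"
    done
}
```

Calling list_wq_modes with no argument on a host reads the live sysfs tree. No output means that no work queues have been configured yet.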
BIOS
Intel DSA requires Intel Virtualization Technology for Directed I/O (VT-d) to be enabled in the BIOS. On some systems, you may also need to enable DSA under Socket Configuration.
For example: Socket Configuration > IIO Configuration > IOAT Configuration > Sck0 IOAT Config > DSA
IDXD Driver
IDXD driver initialization can be checked using the dmesg command to print the kernel message buffer.
sysadmin@controller-0:~$ dmesg | grep "idxd"
[ 11.094099] idxd 0000:f6:01.0: enabling device (0144 -> 0146)
[ 11.182431] idxd 0000:f6:01.0: Intel(R) Accelerator Device (v100)
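To script this check, the kernel log can be filtered for the banner line that the driver prints once a device is initialized. This is a minimal sketch that assumes the message format shown in the sample output; the function name idxd_ready_devices is hypothetical.

```shell
#!/bin/sh
# Print the PCI address of every device for which the idxd driver logged
# the "Intel(R) Accelerator Device" banner. Reads kernel log text on stdin.
# Assumption: the log lines follow the format of the sample output.
idxd_ready_devices() {
    sed -n 's/.*idxd \([0-9a-fA-F:.]*\): Intel(R) Accelerator Device.*/\1/p' | sort -u
}
```

Typical usage would be dmesg | idxd_ready_devices. An empty result suggests that the driver did not initialize any device.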
Devices
The Intel DSA device ID is 0x0b25. The following command lists the DSA devices on the system.
sysadmin@controller-0:~$ lspci | grep 0b25
f6:01.0 System peripheral: Intel Corporation Device 0b25
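Because grep 0b25 matches the string anywhere in the lspci output, a stricter check can compare the vendor and device IDs directly. The sketch below reads them from sysfs. It assumes the usual /sys/bus/pci/devices/<address>/vendor and device files and the Intel PCI vendor ID 0x8086, which this document does not state; the function name find_dsa_pci_devices is hypothetical.

```shell
#!/bin/sh
# Print the PCI address of each device whose vendor ID is 0x8086 (Intel,
# an assumption not taken from this document) and whose device ID is 0x0b25.
# The sysfs root is a parameter so the function can be pointed at a test
# directory; it defaults to /sys/bus/pci/devices.
find_dsa_pci_devices() {
    root="${1:-/sys/bus/pci/devices}"
    for dev in "$root"/*; do
        [ -r "$dev/vendor" ] && [ -r "$dev/device" ] || continue
        if [ "$(cat "$dev/vendor")" = "0x8086" ] && [ "$(cat "$dev/device")" = "0x0b25" ]; then
            basename "$dev"
        fi
    done
}
```

On the command line, lspci -d 8086:0b25 applies the same vendor and device filter.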
Install Intel Device Plugins Operator for Kubernetes
Intel Device Plugins Operator is a Kubernetes custom controller that handles the installation and lifecycle management of Intel device plugins for Kubernetes. It gives cluster administrators a single point of control for several Intel device types, including DSA. The DSA plugin discovers DSA work queues and presents them as node resources.
This operator is provided by the Intel Device Plugins StarlingX application: https://opendev.org/starlingx/app-intel-device-plugins.
Dependencies
Intel Device Plugins Operator depends on the node-feature-discovery StarlingX application.
Upload and apply the node-feature-discovery app.
$ system application-upload /usr/local/share/applications/helm/node-feature-discovery-24.09-<version>.tgz
$ system application-apply node-feature-discovery
Upload the intel-device-plugins-operator app.
$ system application-upload /usr/local/share/applications/helm/intel-device-plugins-operator-24.09-<version>.tgz
Enable the intel-device-plugins-dsa Helm chart.
$ system helm-chart-attribute-modify --enabled true intel-device-plugins-operator intel-device-plugins-dsa intel-device-plugins-operator
Apply the intel-device-plugins-operator app.
$ system application-apply intel-device-plugins-operator
Confirm that DSA resources are available.
$ kubectl get nodes -o go-template='{{range .items}}{{.metadata.name}}{{"\n"}}{{range $k,$v:=.status.allocatable}}{{" "}}{{$k}}{{": "}}{{$v}}{{"\n"}}{{end}}{{end}}' | grep '^\([^ ]\)\|\( dsa\)'
controller-0
 dsa.intel.com/wq-user-shared: 40
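On a multi-node system it can help to total the advertised work queues per resource name. The following sketch post-processes the output of the command above. It assumes the "<resource>: <count>" line format shown in the sample; the function name sum_dsa_resources is hypothetical.

```shell
#!/bin/sh
# Sum allocatable DSA work queue resources across all nodes.
# Reads lines such as " dsa.intel.com/wq-user-shared: 40" on stdin and
# prints one "<resource> <total>" line per resource name.
# Node name lines and any other lines are ignored.
sum_dsa_resources() {
    awk '$1 ~ /^dsa\.intel\.com\// {
             name = $1
             sub(/:$/, "", name)    # drop the trailing colon
             total[name] += $2
         }
         END { for (name in total) print name, total[name] }' | sort
}
```

Pipe the kubectl command shown above into sum_dsa_resources. If it prints nothing, no node is advertising DSA resources.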
Test Case Example
The plugin can be tested by deploying a pod using the tools image (stx-debian-tools-dev).
Create a YAML file for the test pod:
$ cat << 'EOF' > dsa-accel-config-demo.yml
apiVersion: v1
kind: Pod
metadata:
  name: dsa-accel-config-demo
  labels:
    app: dsa-accel-config-demo
spec:
  containers:
  - name: dsa-accel-config-demo
    image: docker.io/starlingx/stx-debian-tools-dev:stx.10.0-v1.0.0
    imagePullPolicy: "Always"
    workingDir: "/usr/libexec/accel-config/test/"
    command:
    - "./dsa_user_test_runner.sh"
    args:
    - "--skip-config"
    resources:
      limits:
        dsa.intel.com/wq-user-shared: 1
    securityContext:
      capabilities:
        add: ["SYS_RAWIO"]
  restartPolicy: Never
EOF
Apply the yaml file.
$ kubectl apply -f dsa-accel-config-demo.yml
Review the pod's log:
$ kubectl logs dsa-accel-config-demo | tail
[ info] Checking Src & Dst buffers
[ info] compsts: 1
[ info] Checking All Tags
[ info] All Tags Validated
[ info] verifying task result for 0x5625182865b0
[ info] Checking Src & Dst buffers
[ info] compsts: 1
[ info] Checking All Tags
[ info] All Tags Validated
[ info] verifying task result for 0x562518286670
[ info] Checking Src & Dst buffers
[ info] compsts: 1
[ info] Checking All Tags
[ info] All Tags Validated
[ info] verifying task result for 0x562518286730
[ info] Checking Src & Dst buffers
[ info] compsts: 1
[ info] Checking All Tags
[ info] All Tags Validated
[ info] verifying task result for 0x5625182867f0
[ info] Checking Src & Dst buffers
[ info] compsts: 1
[ info] Checking All Tags
[ info] All Tags Validated
[ info] verifying task result for 0x5625182868b0
[ info] Checking Src & Dst buffers
[ info] compsts: 1
[ info] Checking All Tags
[ info] All Tags Validated
[ info] verifying task result for 0x562518286970
[ info] Checking Src & Dst buffers
[ info] compsts: 1
[ info] Checking All Tags
[ info] All Tags Validated
[ info] verifying task result for 0x562518286a30
[ info] Checking Src & Dst buffers
[ info] compsts: 1
[ info] Checking All Tags
[ info] All Tags Validated
[ info] verifying task result for 0x562518286af0
[ info] Checking Src & Dst buffers
[ info] compsts: 1
[ info] Checking All Tags
[ info] All Tags Validated
[ info] verifying task result for 0x562518286bb0
[ info] Checking Src & Dst buffers
[ info] compsts: 1
[ info] Checking All Tags
[ info] All Tags Validated
+ '[' --skip-config '!=' --skip-config ']'
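For a quick pass or fail summary, the log can be reduced to two counters. This sketch assumes that each task is announced by a "verifying task result" line and passes when an "All Tags Validated" line follows, as in the sample output above. The function name summarize_dsa_log is hypothetical, and other log formats are not handled.

```shell
#!/bin/sh
# Summarize a dsa_user_test_runner.sh log read from stdin.
# Assumption: "verifying task result" starts a task and
# "All Tags Validated" marks a task that passed.
summarize_dsa_log() {
    awk '/verifying task result/ { tasks++ }
         /All Tags Validated/    { ok++ }
         END { printf "tasks=%d validated=%d\n", tasks, ok }'
}
```

Run it over the full log rather than the tail, for example kubectl logs dsa-accel-config-demo | summarize_dsa_log. A mismatch between the two counters is worth inspecting by hand.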
If the pod did not successfully launch, possibly because it could not obtain the resource, it will be stuck in the Pending status:
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
dsa-accel-config-demo 0/1 Pending 0 7s
This can be verified by checking the Events of the pod:
$ kubectl describe pod dsa-accel-config-demo | grep -A3 Events:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 2m26s default-scheduler 0/1 nodes are available: 1 Insufficient dsa.intel.com/wq-user-dedicated, 1 Insufficient dsa.intel.com/wq-user-shared.
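The names of the missing resources can be extracted from the event text so they can be compared with what the nodes advertise. This is a minimal sketch that assumes the "Insufficient <resource>" wording of the scheduler message shown above; the function name insufficient_resources is hypothetical.

```shell
#!/bin/sh
# Print each resource the scheduler reported as insufficient, one per line.
# Reads "kubectl describe pod" output (or any text containing the
# FailedScheduling message) on stdin.
insufficient_resources() {
    grep -o 'Insufficient [^ ,]*' |
        sed -e 's/^Insufficient //' -e 's/\.$//' |
        sort -u
}
```

Typical usage would be kubectl describe pod dsa-accel-config-demo | insufficient_resources. Each name it prints should match a resource requested under resources.limits in the pod specification.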
Customize the configuration
The default configuration uses shared queues for the controller-0 node and dedicated queues for the remaining nodes. Node-specific configuration can be passed by naming the config dsa-<node-name>.conf.
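The naming rule can be pictured as a lookup that prefers the node-specific name. The sketch below is illustrative only. It assumes, without confirmation from this document, that dsa-<node-name>.conf takes precedence when present and that dsa.conf is the fallback; the function name select_dsa_config and the directory argument are hypothetical.

```shell
#!/bin/sh
# Print the configuration name that would apply to a node.
# Assumption (not confirmed by this document): a node-specific
# dsa-<node-name>.conf wins over the generic dsa.conf.
# $1 = node name, $2 = directory holding the configuration files.
select_dsa_config() {
    node="$1"
    dir="$2"
    if [ -f "$dir/dsa-$node.conf" ]; then
        printf '%s\n' "dsa-$node.conf"
    else
        printf '%s\n' "dsa.conf"
    fi
}
```

Under this assumption, controller-0 would pick up dsa-controller-0.conf if that file is defined, and every other node would use dsa.conf.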
The default DSA device configuration is as follows:
$ cat << 'EOF' > dsa-override.yml
overrideConfig:
  dsa.conf: |
    [
      {
        "dev":"dsaX",
        "read_buffer_limit":0,
        "groups":[
          {
            "dev":"groupX.0",
            "read_buffers_reserved":0,
            "use_read_buffer_limit":0,
            "read_buffers_allowed":8,
            "grouped_workqueues":[
              {
                "dev":"wqX.0",
                "mode":"dedicated",
                "size":16,
                "group_id":0,
                "priority":10,
                "block_on_fault":1,
                "type":"user",
                "name":"appX0",
                "threshold":15
              }
            ],
            "grouped_engines":[
              {
                "dev":"engineX.0",
                "group_id":0
              },
            ]
          },
          {
            "dev":"groupX.1",
            "read_buffers_reserved":0,
            "use_read_buffer_limit":0,
            "read_buffers_allowed":8,
            "grouped_workqueues":[
              {
                "dev":"wqX.1",
                "mode":"dedicated",
                "size":16,
                "group_id":1,
                "priority":10,
                "block_on_fault":1,
                "type":"user",
                "name":"appX1",
                "threshold":15
              }
            ],
            "grouped_engines":[
              {
                "dev":"engineX.1",
                "group_id":1
              },
            ]
          },
          {
            "dev":"groupX.2",
            "read_buffers_reserved":0,
            "use_read_buffer_limit":0,
            "read_buffers_allowed":8,
            "grouped_workqueues":[
              {
                "dev":"wqX.2",
                "mode":"dedicated",
                "size":16,
                "group_id":2,
                "priority":10,
                "block_on_fault":1,
                "type":"user",
                "name":"appX2",
                "threshold":15
              }
            ],
            "grouped_engines":[
              {
                "dev":"engineX.2",
                "group_id":2
              },
            ]
          },
          {
            "dev":"groupX.3",
            "read_buffers_reserved":0,
            "use_read_buffer_limit":0,
            "read_buffers_allowed":8,
            "grouped_workqueues":[
              {
                "dev":"wqX.3",
                "mode":"dedicated",
                "size":16,
                "group_id":3,
                "priority":10,
                "block_on_fault":1,
                "type":"user",
                "name":"appX3",
                "threshold":15
              }
            ],
            "grouped_engines":[
              {
                "dev":"engineX.3",
                "group_id":3
              },
            ]
          },
        ]
      }
    ]
EOF
The device configuration can be customized via application overrides.
For example, the following config uses dedicated queues for all nodes:
$ cat << 'EOF' > dsa-override.yml
overrideConfig:
  dsa.conf: |
    [
      {
        "dev":"dsaX",
        "read_buffer_limit":0,
        "groups":[
          {
            "dev":"groupX.0",
            "read_buffers_reserved":0,
            "use_read_buffer_limit":0,
            "read_buffers_allowed":8,
            "grouped_workqueues":[
              {
                "dev":"wqX.0",
                "mode":"dedicated",
                "size":16,
                "group_id":0,
                "priority":10,
                "block_on_fault":1,
                "type":"user",
                "name":"appX0",
                "threshold":15
              }
            ],
            "grouped_engines":[
              {
                "dev":"engineX.0",
                "group_id":0
              },
            ]
          },
          {
            "dev":"groupX.1",
            "read_buffers_reserved":0,
            "use_read_buffer_limit":0,
            "read_buffers_allowed":8,
            "grouped_workqueues":[
              {
                "dev":"wqX.1",
                "mode":"dedicated",
                "size":16,
                "group_id":1,
                "priority":10,
                "block_on_fault":1,
                "type":"user",
                "name":"appX1",
                "threshold":15
              }
            ],
            "grouped_engines":[
              {
                "dev":"engineX.1",
                "group_id":1
              },
            ]
          },
          {
            "dev":"groupX.2",
            "read_buffers_reserved":0,
            "use_read_buffer_limit":0,
            "read_buffers_allowed":8,
            "grouped_workqueues":[
              {
                "dev":"wqX.2",
                "mode":"dedicated",
                "size":16,
                "group_id":2,
                "priority":10,
                "block_on_fault":1,
                "type":"user",
                "name":"appX2",
                "threshold":15
              }
            ],
            "grouped_engines":[
              {
                "dev":"engineX.2",
                "group_id":2
              },
            ]
          },
          {
            "dev":"groupX.3",
            "read_buffers_reserved":0,
            "use_read_buffer_limit":0,
            "read_buffers_allowed":8,
            "grouped_workqueues":[
              {
                "dev":"wqX.3",
                "mode":"dedicated",
                "size":16,
                "group_id":3,
                "priority":10,
                "block_on_fault":1,
                "type":"user",
                "name":"appX3",
                "threshold":15
              }
            ],
            "grouped_engines":[
              {
                "dev":"engineX.3",
                "group_id":3
              },
            ]
          },
        ]
      }
    ]
EOF
Apply the override file:
$ system helm-override-update intel-device-plugins-operator intel-device-plugins-dsa intel-device-plugins-operator --values dsa-override.yml
Apply the intel-device-plugins-operator application:
$ system application-apply intel-device-plugins-operator