diff --git a/doc/source/admintasks/kubernetes/gpu-device-plugin-configuration-615e2f6edfba.rst b/doc/source/admintasks/kubernetes/gpu-device-plugin-configuration-615e2f6edfba.rst new file mode 100644 index 000000000..fda23d3cd --- /dev/null +++ b/doc/source/admintasks/kubernetes/gpu-device-plugin-configuration-615e2f6edfba.rst @@ -0,0 +1,95 @@ +.. WARNING: Add no lines of text between the label immediately following +.. and the title. + +.. _gpu-device-plugin-configuration-615e2f6edfba: + +=============================== +GPU Device Plugin Configuration +=============================== + +Intel |GPU| plugin enables Kubernetes clusters to utilize Intel GPUs for +hardware acceleration of various workloads. + +This section describes how to enable and use the Intel |GPU| device plugin +in |prod|. + +.. _prerequisites-1: + +.. rubric:: |prereq| + +- The host should have Intel |GPU| hardware. For supported |GPU| devices, + refer to Intel |GPU| plugin documentation for more details: `Intel GPU + device plugin for Kubernetes + `__. + +- Node Feature Discovery application must be installed using the following + commands: + + .. code-block:: + + ~(keystone_admin)]$ system application-upload /usr/local/share/applications/helm/node-feature-discovery*.tgz + ~(keystone_admin)]$ system application-apply node-feature-discovery + +Enable Intel GPU Device Plugin +------------------------------ + +#. Locate the application tarball in the ``/usr/local/share/applications/helm`` + directory. For example: + + ``/usr/local/share/applications/helm/intel-device-plugins-operator-.tgz`` + +#. Upload the application using the following command. + + .. code-block:: + + ~(keystone_admin)]$ system application-upload intel-device-plugins-operator-.tgz + + Replace ```` with the latest version number. + +#. Verify that the application has been uploaded successfully. + + .. code-block:: + + ~(keystone_admin)]$ system application-list + +#. 
Check the Helm chart status using the following command: + + .. code-block:: + + ~(keystone_admin)]$ system helm-override-list intel-device-plugins-operator --long + +#. Enable the |GPU| Helm chart using the following command: + + .. code-block:: + + ~(keystone_admin)]$ system helm-chart-attribute-modify --enabled true intel-device-plugins-operator intel-device-plugins-gpu intel-device-plugins-operator + +#. Apply the application using the following command: + + .. code-block:: + + ~(keystone_admin)]$ system application-apply intel-device-plugins-operator + +#. Monitor the status of the application using one of the following commands. + + .. code-block:: + + ~(keystone_admin)]$ watch -n 5 system application-list + + OR + + .. code-block:: + + ~(keystone_admin)]$ watch kubectl get pods -n intel-device-plugins-operator + +#. Check the pods using the following command: + + .. code-block:: + + $ kubectl get pods -n intel-device-plugins-operator + +Use Intel GPU Device Plugin +--------------------------- + +For information related to using the |GPU| device plugin, see `Testing and Demos +`__. diff --git a/doc/source/admintasks/kubernetes/index-admintasks-kub-ebc55fefc368.rst b/doc/source/admintasks/kubernetes/index-admintasks-kub-ebc55fefc368.rst index c561038f9..b4e8e1275 100644 --- a/doc/source/admintasks/kubernetes/index-admintasks-kub-ebc55fefc368.rst +++ b/doc/source/admintasks/kubernetes/index-admintasks-kub-ebc55fefc368.rst @@ -48,17 +48,18 @@ Optimize application performance isolating-cpu-cores-to-enhance-application-performance kubernetes-topology-manager-policies -.. only:: starlingx - ----------------- - QAT Device Plugin - ----------------- +--------------------------------- +QAT Device and GPU Device Plugins +--------------------------------- - .. toctree:: - :maxdepth: 1 - - k8s_qat_device_plugin +.. 
toctree:: + :maxdepth: 1 + intel-device-plugins-operator-application-overview-c5de2a6212ae + qat-device-plugin-configuration-616551306371 + gpu-device-plugin-configuration-615e2f6edfba + uninstall-intel-device-plugins-operator-application-e712eabc1e49 -------------- Metrics Server diff --git a/doc/source/admintasks/kubernetes/intel-device-plugins-operator-application-overview-c5de2a6212ae.rst b/doc/source/admintasks/kubernetes/intel-device-plugins-operator-application-overview-c5de2a6212ae.rst new file mode 100644 index 000000000..dc77c9b48 --- /dev/null +++ b/doc/source/admintasks/kubernetes/intel-device-plugins-operator-application-overview-c5de2a6212ae.rst @@ -0,0 +1,32 @@ +.. WARNING: Add no lines of text between the label immediately following +.. and the title. + +.. _intel-device-plugins-operator-application-overview-c5de2a6212ae: + +================================================== +Intel Device Plugins Operator Application Overview +================================================== + +This application provides a set of plugins developed by Intel to facilitate the +use of Intel hardware features in Kubernetes clusters. These plugins are +designed to enable and optimize the use of Intel-specific hardware capabilities +in a Kubernetes environment. + +The following plugins are supported: + +* Intel |QAT| device plugin 0.26.0 + +* Intel |GPU| device plugin 0.26.0 + + +Install Intel Device Plugins Operator Application +------------------------------------------------- + +The Intel Device Plugins Operator application must be installed to configure +the Intel |QAT| device plugin and the Intel |GPU| device plugin. +Installation steps are described in the respective device plugin configuration +sections below. 
+ +:ref:`qat-device-plugin-configuration-616551306371` + +:ref:`gpu-device-plugin-configuration-615e2f6edfba` \ No newline at end of file diff --git a/doc/source/admintasks/kubernetes/k8s_qat_device_plugin.rst b/doc/source/admintasks/kubernetes/k8s_qat_device_plugin.rst deleted file mode 100644 index 58f1ad92e..000000000 --- a/doc/source/admintasks/kubernetes/k8s_qat_device_plugin.rst +++ /dev/null @@ -1,115 +0,0 @@ -.. _k8s_qat_device_plugin: - -.. only:: starlingx - - ========================================== - Kubernetes QAT Device Plugin Configuration - ========================================== - - Intel® QuickAssist Technology (Intel® QAT) accelerates cryptographic workloads - by offloading the data to hardware capable of optimizing those functions. This - guide describes how to enable and consume the Intel QAT device plugin in - StarlingX. - - .. contents:: - :local: - :depth: 1 - - ------------- - Prerequisites - ------------- - - - Install Intel QuickAssist device on host. - - Install StarlingX on bare metal with DPDK enabled. Refer to the |_link-inst-book| - for details. - - ------------------------------ - Enable Intel QAT device plugin - ------------------------------ - - The Intel QAT device plugin daemonset is pre-installed in StarlingX. This - section describes the steps to enable the Intel QAT device plugin for - discovering and advertising QAT VF resources to Kubernetes host. - - #. Verify QuickAssist SR-IOV virtual functions are configured on a specified - node after StarlingX is installed. This example uses the worker-0 node. - - :: - - $ ssh worker-0 - $ for i in 0442 0443 37c9 19e3; do lspci -d 8086:$i; done - - .. note:: - - The Intel QAT device plugin only supports QAT VF resources in the current - release. - - #. Assign the ``intelqat`` label to the node (worker-0 in this example). - - :: - - $ NODE=worker-0 - $ system host-lock $NODE - $ system host-label-assign $NODE intelqat=enabled - $ system host-unlock $NODE - - #. 
After the node becomes available, verify the Intel QAT device plugin is - registered. - - :: - - $ kubectl describe node $NODE | grep qat.intel.com/generic - qat.intel.com/generic: 10 - .intel.com/generic: 10 - - ------------------------------- - Consume Intel QAT device plugin - ------------------------------- - - #. Build the DPDK image. - - :: - - $ git clone https://github.com/intel/intel-device-plugins-for-kubernetes.git - $ cd demo - $ ./build-image.sh crypto-perf - - This command produces a Docker image named ``crypto-perf``. - - #. Deploy a pod to run an example DPDK application named - ``dpdk-test-crypto-perf``. - - In the pod specification file, add the container resource request and - limit. - - For example, use ``qat.intel.com/generic: `` for a - container requesting Intel QAT devices. - - - For a DPDK-based workload, you may need to add a hugepage request and limit. - - :: - - $ kubectl apply -k deployments/qat_dpdk_app/base/ - $ kubectl get pods - NAME READY STATUS RESTARTS AGE - qat-dpdk 1/1 Running 0 27m - intel-qat-plugin-5zgvb 1/1 Running 0 3h - - .. Note:: - - The deployment example above uses kustomize, which is a tool supported by - kubectl since the Kubernetes v1.14 release. - - - #. Manually execute the ``dpdk-test-crypto-perf`` application to review the - logs. - - :: - - $ kubectl exec -it qat-dpdk bash - - $ ./dpdk-test-crypto-perf -l 6-7 -w $QAT1 -- --ptest throughput --\ - devtype crypto_qat --optype cipher-only --cipher-algo aes-cbc --cipher-op \ - encrypt --cipher-key-sz 16 --total-ops 10000000 --burst-sz 32 --buffer-sz 64 - diff --git a/doc/source/admintasks/kubernetes/qat-device-plugin-configuration-616551306371.rst b/doc/source/admintasks/kubernetes/qat-device-plugin-configuration-616551306371.rst new file mode 100644 index 000000000..71bea5055 --- /dev/null +++ b/doc/source/admintasks/kubernetes/qat-device-plugin-configuration-616551306371.rst @@ -0,0 +1,249 @@ +.. 
WARNING: Add no lines of text between the label immediately following +.. and the title. + +.. _qat-device-plugin-configuration-616551306371: + +=============================== +QAT Device Plugin Configuration +=============================== + +Intel® QuickAssist Technology (Intel® QAT) accelerates cryptographic workloads +by offloading the data to hardware that is capable of optimizing those +functions. + +This section describes how to enable and consume the Intel |QAT| device plugin +in |prod|. + +.. rubric:: |prereq| + +- The host should have Intel |QAT| hardware. Supported |QAT| devices are 4940 + and 4942. After |prod| is installed, perform the following checks to verify + that the |QAT| devices are configured. + + - Verify |QAT| |SRIOV| physical functions are configured. + + .. code-block:: + + $ for i in 4942 4940; do lspci -d 8086:$i; done + + - Verify |QAT| |SRIOV| virtual functions are configured. + + .. code-block:: + + $ for i in 4943 4941; do lspci -d 8086:$i; done + + $ sudo /etc/init.d/qat_service status # Must list all the virtual functions + + Checking status of all devices. + + There is 34 QAT acceleration device(s) in the system: + + qat_dev0 - type: 4xxx, inst_id: 0, node_id: 0, bsf: 0000:f3:00.0, + #accel: 1 #engines: 9 state: up + + qat_dev1 - type: 4xxx, inst_id: 1, node_id: 0, bsf: 0000:f7:00.0, + #accel: 1 #engines: 9 state: up + + qat_dev2 - type: 4xxxvf, inst_id: 0, node_id: 0, bsf: 0000:f3:00.1, + #accel: 1 #engines: 1 state: up + + qat_dev3 - type: 4xxxvf, inst_id: 1, node_id: 0, bsf: 0000:f3:00.2, + #accel: 1 #engines: 1 state: up + + - Verify the |QAT| driver ``vfio_pci`` is installed. + + .. code-block:: + + $ lsmod | grep vfio_pci + + vfio_pci 69632 0 + vfio_virqfd 16384 1 vfio_pci + vfio 45056 4 intel_qat,vfio_mdev,vfio_iommu_type1,vfio_pci + irqbypass 16384 3 intel_qat,vfio_pci,kvm + +- The Node Feature Discovery application must be installed using the following + commands. + + .. 
code-block:: + + ~(keystone_admin)]$ system application-upload /usr/local/share/applications/helm/node-feature-discovery*.tgz + ~(keystone_admin)]$ system application-apply node-feature-discovery + + +Enable Intel QAT Device Plugin +------------------------------ + +Perform the following steps to enable the Intel |QAT| device plugin +to discover and advertise |QAT| Virtual Function (VF) resources to +the Kubernetes host. + +#. Locate the application tarball in the ``/usr/local/share/applications/helm`` + directory. For example: + + ``/usr/local/share/applications/helm/intel-device-plugins-operator-.tgz`` + +#. Upload the application using the following command. + + .. code-block:: + + ~(keystone_admin)]$ system application-upload intel-device-plugins-operator-.tgz + + Replace ```` with the latest version number. + +#. Verify that the application has been uploaded successfully. + + .. code-block:: + + ~(keystone_admin)]$ system application-list + +#. Check the Helm chart status. + + .. code-block:: + + ~(keystone_admin)]$ system helm-override-list intel-device-plugins-operator --long + +#. Enable the |QAT| Helm chart. + + .. code-block:: + + ~(keystone_admin)]$ system helm-chart-attribute-modify --enabled true intel-device-plugins-operator intel-device-plugins-qat intel-device-plugins-operator + +#. Apply the application. + + .. code-block:: + + ~(keystone_admin)]$ system application-apply intel-device-plugins-operator + +#. Monitor the status of the application. + + .. code-block:: + + ~(keystone_admin)]$ watch -n 5 system application-list + + OR + + .. code-block:: none + + ~(keystone_admin)]$ watch kubectl get pods -n intel-device-plugins-operator + +#. Check the pods. + + .. code-block:: + + $ kubectl get pods -n intel-device-plugins-operator + + NAME READY STATUS RESTARTS AGE + + intel-qat-plugin-qatdeviceplugin-sample-g8n45 1/1 Running 0 34s + inteldeviceplugins-controller-manager-74f4c 2/2 Running 0 64s + +#. 
Verify |QAT| devices by checking the node's resource allocations. The |QAT| + 4940 device and the |QAT| 4942 device each have 16 virtual functions. If + both devices are present, the following command will display a total of 32 + virtual functions: + + .. code-block:: + + $ kubectl describe node | grep qat.intel.com/asym-dc + + Capacity: + --- + qat.intel.com/asym-dc: 32 + --- + Allocatable: + --- + qat.intel.com/asym-dc: 32 + --- + +Use Intel QAT Device Plugin +--------------------------- + +This section describes the steps for using the |QAT| device plugin. + +#. Deploy a pod using the following sample pod specification file. The file + can be modified to set the required resource requests and limits. + + The ``qat.intel.com/asym-dc: `` field is used to + configure the requested |QAT| virtual functions. + + For a |DPDK|-based workload, you may need to add a hugepage request and + limit. + + ``qat-dpdk.yaml`` + + .. code-block:: yaml + + kind: Pod + apiVersion: v1 + metadata: + name: dpdk-test-crypto-perf + spec: + containers: + - name: crypto-perf + image: intel/crypto-perf:devel + imagePullPolicy: IfNotPresent + command: [ "/bin/bash", "-c", "--" ] + args: [ "while true; do sleep 300000; done;" ] + volumeMounts: + - mountPath: /dev/hugepages + name: hugepage + - mountPath: /var/run/dpdk + name: dpdk-runtime + resources: + requests: + cpu: "3" + memory: "128Mi" + qat.intel.com/asym-dc: '4' + hugepages-2Mi: "128Mi" + limits: + cpu: "3" + memory: "128Mi" + qat.intel.com/asym-dc: '4' + hugepages-2Mi: "128Mi" + securityContext: + readOnlyRootFilesystem: true + allowPrivilegeEscalation: false + capabilities: + add: + ["IPC_LOCK"] + restartPolicy: Never + volumes: + - name: dpdk-runtime + emptyDir: + medium: Memory + - name: hugepage + emptyDir: + medium: HugePages + + + Apply the pod specification file to create the ``dpdk-test-crypto-perf`` pod. + + .. code-block:: + + $ kubectl apply -f qat-dpdk.yaml + +#. 
Verify the pod status and the allocated |QAT| virtual functions. + + .. code-block:: + + $ kubectl get pods + + NAME READY STATUS RESTARTS AGE + dpdk-test-crypto-perf 1/1 Running 0 27m + + $ kubectl describe pod dpdk-test-crypto-perf + + Requests: + --- + qat.intel.com/asym-dc: 4 + --- + + $ kubectl describe node + + Allocated resources: + --- + qat.intel.com/asym-dc: 4 + --- + +For more information, see `Demos and Testing +`__. diff --git a/doc/source/admintasks/kubernetes/uninstall-intel-device-plugins-operator-application-e712eabc1e49.rst b/doc/source/admintasks/kubernetes/uninstall-intel-device-plugins-operator-application-e712eabc1e49.rst new file mode 100644 index 000000000..40ae88f62 --- /dev/null +++ b/doc/source/admintasks/kubernetes/uninstall-intel-device-plugins-operator-application-e712eabc1e49.rst @@ -0,0 +1,23 @@ +.. WARNING: Add no lines of text between the label immediately following +.. and the title. + +.. _uninstall-intel-device-plugins-operator-application-e712eabc1e49: + +=================================================== +Uninstall Intel Device Plugins Operator Application +=================================================== + +Use the following steps to uninstall the Intel Device Plugins Operator +application: + +#. Remove the application using the following command: + + .. code-block:: + + ~(keystone_admin)]$ system application-remove intel-device-plugins-operator + +#. Delete the application using the following command: + + .. 
code-block:: + + ~(keystone_admin)]$ system application-delete intel-device-plugins-operator diff --git a/doc/source/operations/index.rst b/doc/source/operations/index.rst index c751f85e3..4b7e295b5 100644 --- a/doc/source/operations/index.rst +++ b/doc/source/operations/index.rst @@ -34,7 +34,7 @@ Kubernetes Operation k8s_nodeport_usage k8s_persistent_vol_claims k8s_sriov_config - k8s_gpu_device_plugin + ------------------- OpenStack Operation diff --git a/doc/source/operations/k8s_gpu_device_plugin.rst b/doc/source/operations/k8s_gpu_device_plugin.rst deleted file mode 100644 index d74e5debb..000000000 --- a/doc/source/operations/k8s_gpu_device_plugin.rst +++ /dev/null @@ -1,77 +0,0 @@ -================================================ -Kubernetes Intel GPU Device Plugin Configuration -================================================ - -This document describes how to enable the Intel GPU device plugin in StarlingX -and schedule pods on nodes with an Intel GPU. - ------------------------------- -Enable Intel GPU device plugin ------------------------------- - -You can pre-install the ``intel-gpu-plugin`` daemonset as follows: - -#. Launch the ``intel-gpu-plugin`` daemonset. - - Add the following lines to the ``localhost.yaml`` file before playing the - Ansible bootstrap playbook to configure the system. - - :: - - k8s_plugins: - intel-gpu-plugin: intelgpu=enabled - -#. Assign the ``intelgpu`` label to each node that should have the Intel GPU - plugin enabled. This will make any GPU devices on a given node available for - scheduling to containers. The following example assigns the ``intelgpu`` - label to the worker-0 node. - - :: - - $ NODE=worker-0 - $ system host-lock $NODE - $ system host-label-assign $NODE intelgpu=enabled - $ system host-unlock $NODE - -#. After the node becomes available, verify the GPU device plugin is registered - and that the available GPU devices on the node have been discovered and reported. 
- - :: - - $ kubectl describe node $NODE | grep gpu.intel.com - gpu.intel.com/i915: 1 - gpu.intel.com/i915: 1 - -------------------------------------- -Schedule pods on nodes with Intel GPU -------------------------------------- - -Add a ``resources.limits.gpu.intel.com`` to your container specification in order -to request an available GPU device for your container. - -:: - - ... - spec: - containers: - - name: ... - ... - resources: - limits: - gpu.intel.com/i915: 1 - - -The pods will be scheduled to the nodes with available Intel GPU devices. A GPU -device will be allocated to the container and the available GPU devices will be -updated. - -:: - - $ kubectl describe node $NODE | grep gpu.intel.com - gpu.intel.com/i915: 1 - gpu.intel.com/i915: 0 - -For more details, refer to the following examples: - -* `Kubernetes manifest file example `_ -* `Scheduling pods on nodes with Intel GPU example `_