From ad2a8f274f2201152770630ef1ec443d040178d1 Mon Sep 17 00:00:00 2001 From: Juanita-Balaraj Date: Fri, 9 Jul 2021 13:01:41 -0400 Subject: [PATCH] NVIDIA Update - Verified Commercial Hardware Updated Patchset 5 comments Updated Patchset 4 comments Updated Patchset 3 comments Updated Patchset 2 comments Updated Patchset 1 comments Signed-off-by: Juanita-Balaraj Change-Id: Ie0d037d7761b0542436314b2f4e04bc12d7f4e1e Signed-off-by: Juanita-Balaraj --- ...ci-passthrough-interface-to-nvidia-gpu.rst | 120 ++++++++++++++++++ .../node_management/openstack/index.rst | 1 + ...-planning-verified-commercial-hardware.rst | 15 ++- doc/source/shared/abbrevs.txt | 1 + 4 files changed, 133 insertions(+), 4 deletions(-) create mode 100644 doc/source/node_management/openstack/configure-pci-passthrough-interface-to-nvidia-gpu.rst diff --git a/doc/source/node_management/openstack/configure-pci-passthrough-interface-to-nvidia-gpu.rst b/doc/source/node_management/openstack/configure-pci-passthrough-interface-to-nvidia-gpu.rst new file mode 100644 index 000000000..04fd2bf16 --- /dev/null +++ b/doc/source/node_management/openstack/configure-pci-passthrough-interface-to-nvidia-gpu.rst @@ -0,0 +1,120 @@ + + +.. _configure-pci-passthrough-interface-to-nvidia-gpu: + +========================================================= +Configure PCI-Passthrough Interface to NVIDIA GPU in a VM +========================================================= + +This section provides instructions for configuring PCI-Passthrough interface +to NVIDIA GPU Operator in a |VM|. + +.. rubric:: |proc| + +#. Source the platform environment. + + .. code-block:: none + + $ source /etc/platform/openrc + ~(keystone_admin)$ + +#. Lock controller-0 to enable the NVIDIA GPU device. + + .. code-block:: none + + ~(keystone_admin)$ system host-lock controller-0 + + #. Verify that the NVIDIA GPU device is available in the table. + + .. code-block:: none + + ~(keystone_admin)$ system host-device-list controller-0 -a + + + #. 
Enable the NVIDIA GPU device. + + .. code-block:: none + + ~(keystone_admin)$ system host-device-modify controller-0 --enable=True + + +#. Unlock controller-0 to enable the NVIDIA GPU device. + + .. code-block:: none + + ~(keystone_admin)$ system host-unlock controller-0 + + +#. Add the NVIDIA GPU device information to the Nova overrides. + + #. Check the Nova Helm overrides. + + .. parsed-literal:: + + ~(keystone_admin)$ system helm-override-show |prefix|-openstack nova openstack + + #. Check the **conf.nova.pci.alias.values** override. If your graphics + card alias exists, for example, **"name": "nvidia-tesla-t4-pf"**, + verify that its values match those from step 2.b., then proceed to step 5. + If the alias does not exist or its values are incorrect, update the Nova Helm + overrides. + + #. Create a file containing the current **conf.nova.pci.alias.values** + overrides and add an entry to the values array for the NVIDIA + device, based on the values from step 2.b., + for example: + + .. code-block:: none + + '{"vendor_id": **, "product_id": **, "device_type": "type-PF", "name": "nvidia-tesla-t4-pf"}' + + Where + + ** is the ID of the vendor + + ** is the ID of the product + + #. Save the **.yaml** file. + + .. code-block:: none + + conf: + nova: + pci: + alias: + values: ['{"vendor_id": "8086", "product_id": "0435", "name": "qat-dh895xcc-pf"}', '{"vendor_id": "8086", "product_id": "0443", "name": "qat-dh895xcc-vf"}', '{"vendor_id": "8086", "product_id": "37c8", "name": "qat-c62x-pf"}', '{"vendor_id": "8086", "product_id": "37c9", "name": "qat-c62x-vf"}', '{"name": "gpu"}', '{"vendor_id": "102b", "product_id": "0522", "name": "matrox-g200e"}', '{"vendor_id": "10de", "product_id": "13f2", "name": "nvidia-tesla-m60"}', '{"vendor_id": "10de", "product_id": "1b38", "name": "nvidia-tesla-p40"}', '{"vendor_id": "10de", "product_id": "1eb8", "device_type": "type-PF", "name": "nvidia-tesla-t4-pf"}'] + + #. Upload the **.yaml** file to the platform and update the Nova Helm overrides. 
+ + .. parsed-literal:: + + ~(keystone_admin)$ system helm-override-update |prefix|-openstack nova openstack --reuse-values --values=your-override-file.yaml + + #. Apply the changes. + + .. parsed-literal:: + + ~(keystone_admin)$ system application-apply |prefix|-openstack + +#. In OpenStack, add a new flavor for the NVIDIA GPU device, for example: + + .. code-block:: none + + # Set up admin credentials for the containerized OpenStack application + $ source /etc/platform/openrc + ~(keystone_admin)$ OS_AUTH_URL=http://keystone.openstack.svc.cluster.local/v3 + # Create a new flavor with pci_passthrough:alias for the NVIDIA device + ~(keystone_admin)$ openstack flavor create --ram 8192 --vcpus 4 nvidiaT4gpu_8GB_v3 --property "pci_passthrough:alias"="nvidia-tesla-t4-pf:1" --property "hw:mem_page_size"="large" + + .. note:: + 8 GB of RAM, 4 vCPUs, and a large memory page size are example values + for the GPU driver's system requirements. For the actual requirements of + your GPU driver, see `https://www.nvidia.com/en-us/geforce/drivers/ `__. + + +#. In OpenStack, create a |VM| and test access to the NVIDIA GPU device. + + #. Create a new |VM| using the flavor created in step 5. + + #. In the |VM|, install and test the CUDA drivers. + See `https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html `__. 
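Reviewer note on the Nova override step in the new procedure: the alias entries are JSON strings nested inside YAML, which is easy to get wrong when edited by hand. The following is a minimal sketch, not part of the patch, of how the override file could be generated instead. The helper names ``make_alias`` and ``render_override`` are hypothetical, the ``existing`` list is abbreviated to two of the entries shown in the patch (a real file must carry all current values), and the output uses a YAML block sequence, which is assumed to be equivalent to the inline list used in the documented example.

```python
import json


def make_alias(vendor_id, product_id, name, device_type=None):
    """Return one PCI alias entry as a JSON string (hypothetical helper)."""
    entry = {"vendor_id": vendor_id, "product_id": product_id}
    if device_type is not None:
        entry["device_type"] = device_type
    entry["name"] = name
    return json.dumps(entry)


def render_override(aliases):
    """Render conf.nova.pci.alias.values as YAML text (hypothetical helper)."""
    lines = ["conf:", "  nova:", "    pci:", "      alias:", "        values:"]
    for alias in aliases:
        # Single-quote each JSON string so YAML treats it as one scalar;
        # a literal single quote inside the string is escaped by doubling it.
        lines.append("        - '" + alias.replace("'", "''") + "'")
    return "\n".join(lines) + "\n"


# Abbreviated for illustration: include every current alias in a real file.
existing = [
    make_alias("10de", "13f2", "nvidia-tesla-m60"),
    make_alias("10de", "1b38", "nvidia-tesla-p40"),
]

# IDs taken from the example values in the patch; confirm them on the host.
t4 = make_alias("10de", "1eb8", "nvidia-tesla-t4-pf", device_type="type-PF")

override_text = render_override(existing + [t4])
print(override_text)
```

The printed text could be saved to a file and passed to ``system helm-override-update`` with ``--values``, as in the documented step.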
diff --git a/doc/source/node_management/openstack/index.rst b/doc/source/node_management/openstack/index.rst index 048dcebb3..fd14a4bcc 100644 --- a/doc/source/node_management/openstack/index.rst +++ b/doc/source/node_management/openstack/index.rst @@ -20,6 +20,7 @@ PCI Device Access for VMs sr-iov-encryption-acceleration configuring-pci-passthrough-ethernet-interfaces pci-passthrough-ethernet-interface-devices + configure-pci-passthrough-interface-to-nvidia-gpu configuring-a-flavor-to-use-a-generic-pci-device generic-pci-passthrough pci-device-access-for-vms diff --git a/doc/source/planning/openstack/installation-and-resource-planning-verified-commercial-hardware.rst b/doc/source/planning/openstack/installation-and-resource-planning-verified-commercial-hardware.rst index 9229e3ef9..c467d349d 100755 --- a/doc/source/planning/openstack/installation-and-resource-planning-verified-commercial-hardware.rst +++ b/doc/source/planning/openstack/installation-and-resource-planning-verified-commercial-hardware.rst @@ -12,11 +12,11 @@ here. .. _installation-and-resource-planning-verified-commercial-hardware-verified-components: .. table:: Table 1. 
Verified Components - :widths: 100, 200 + :widths: auto +----------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Component | Approved Hardware | - +----------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ + +==========================================================+=========================================================================================================================================================================================================================================================================================================================================================================================================================================+ | Hardware Platforms | - Hewlett Packard Enterprise | | | | | | | @@ -174,6 +174,8 @@ here. | PCI SR-IOV Hardware Accelerators | - Intel AV-ICE02 VPN Acceleration Card, based on the Intel Coleto Creek 8925/8950, and C62x device with QuickAssist Technology. 
| +----------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | GPUs Verified for PCI Passthrough | - NVIDIA Corporation: VGA compatible controller - GM204GL \(Tesla M60 rev a1\) | + | | | + | | - NVIDIA T4 TENSOR CORE GPU | +----------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | Board Management Controllers | - HPE iLO3 | | | | @@ -182,8 +184,13 @@ here. | | - Quanta | +----------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ -.. include:: ../../_includes/installation-and-resource-planning-verified-commercial-hardware.rest - .. 
seealso:: :ref:`Kubernetes Verified Commercial Hardware ` + +To configure a PCI-Passthrough interface to an NVIDIA GPU in a |VM|, see +:ref:`Configure PCI-Passthrough Interface to NVIDIA GPU in a VM `. + +.. include:: ../../_includes/installation-and-resource-planning-verified-commercial-hardware.rest + + diff --git a/doc/source/shared/abbrevs.txt b/doc/source/shared/abbrevs.txt index 05ebee3bb..2dc6ec813 100755 --- a/doc/source/shared/abbrevs.txt +++ b/doc/source/shared/abbrevs.txt @@ -43,6 +43,7 @@ .. |FQDN| replace:: :abbr:`FQDN (Fully Qualified Domain Name)` .. |FQDNs| replace:: :abbr:`FQDNs (Fully Qualified Domain Names)` .. |GNP| replace:: :abbr:`GNP (Global Network Policy)` +.. |GCC| replace:: :abbr:`GCC (GNU Compiler Collection)` .. |ICMP| replace:: :abbr:`ICMP (Internet Control Message Protocol)` .. |IEEE| replace:: :abbr:`IEEE (Institute of Electrical and Electronics Engineers)` .. |IGMP| replace:: :abbr:`IGMP (Internet Group Management Protocol)`