
How-to: Configure NVIDIA ConnectX-5/6 adapter in SR-IOV mode on VMware ESXi 6.7/7.0 and above.

Created on Jun 9, 2019

Updated on Sep 13, 2021

Introduction

This post describes how to configure SR-IOV (Ethernet) with the NVIDIA ConnectX-5/6 native driver on ESXi 6.7/7.0.

References

ConnectX® Ethernet Driver for VMware® ESXi Server
How-to: Install NVIDIA Firmware Tools (MFT) on VMware ESXi 6.7/7.0.
How-to: NVIDIA ConnectX driver upgrade on VMware ESXi 6.5 and above.
How-to: Firmware update for NVIDIA ConnectX-5/6 adapter on VMware ESXi 6.5 and above.

Single Root IO Virtualization (SR-IOV)

Single Root IO Virtualization (SR-IOV) is a technology that allows a physical PCIe device to present itself multiple times on the PCIe bus. This technology enables multiple virtual instances of the device with separate resources. NVIDIA ConnectX-4/ConnectX-5 adapter cards are capable of exposing up to 128 virtual instances, called Virtual Functions (VFs). These virtual functions can then be provisioned separately. Each VF can be seen as an additional device connected to the Physical Function, and it shares the same resources with the Physical Function.

SR-IOV is commonly used in conjunction with an SR-IOV enabled hypervisor to provide virtual machines direct hardware access to network resources, thereby increasing their performance.

Related Documents

How-to: Install NVIDIA Firmware Tools (MFT) on VMware ESXi 6.7/7.0.

RDG: RoCE accelerated vSphere 6.7 cluster deployment for ML and HPC workloads.

RDG: VMware NSX-V hardware VTEPs in High-Availability mode on Spectrum switches running Cumulus Linux.

How-to: Firmware update for NVIDIA ConnectX-5/6 adapter on VMware ESXi 6.5 and above.

How-to: Change port type of NVIDIA ConnectX VPI adapter on VMware ESXi 6.x and above.

How-to: NVIDIA ConnectX driver upgrade on VMware ESXi 6.7/7.0 and above.

How-to: Configure NVIDIA network device in VMDirectPath I/O passthrough mode on VMware ESXi 6.x.

How-to: Configure NVIDIA ConnectX-5/6 adapter in SR-IOV mode on VMware ESXi 6.7/7.0 and above.

How-to: Configure RoCEv2 lossless fabric for VMware ESXi 6.5 and above.

How-to: Configure NVIDIA GPU device in VMDirectPath I/O passthrough mode on VMware ESXi 6.x server.

How-to: Configure PVRDMA in VMware vSphere 6.5/6.7.

RDG: Kubernetes Cluster Deployment for ML and HPC Workloads with NVIDIA GPU Virtualization and VMware PVRDMA Technologies.

How-to: Configure NVIDIA network device in DirectPath I/O and Dynamic DirectPath I/O passthrough modes on VMware ESXi 7.0.

RDG: Apache Spark 3.0 on Kubernetes accelerated with RAPIDS over RoCE network.

How-to: Configure RoCE PVRDMA Namespace in VMware vSphere 7.0.

Note: Setting up a VM is out of the scope of this post.


Overview

SR-IOV configuration includes the following steps:

1. Enable Virtualization (SR-IOV) in the BIOS (prerequisite).
2. Enable SR-IOV in the firmware.
3. Enable SR-IOV in the MLNX_OFED driver.
4. Map the Virtual Machine (VM) to the relevant port via SR-IOV.

Hardware and Software Requirements

1. A server platform with an adapter card based on one of the following NVIDIA HCA devices:

   ConnectX®-5
   ConnectX®-6 Dx

2. Installer Privileges: The installation requires administrator privileges on the target machine.

3. Device ID: For the latest list of device IDs, please visit the NVIDIA website.

Prerequisites

To set up an SR-IOV environment, the following is required:


1. Make sure that SR-IOV is enabled in the BIOS of the specific server. Each server has different BIOS configuration options for virtualization. See HowTo Set Dell PowerEdge R730 BIOS parameters to support SR-IOV as a sample for BIOS configuration examples.

2. Install NVIDIA Firmware Tools (MFT) on the ESXi server; refer to How-to: Install NVIDIA Firmware Tools (MFT) on VMware ESXi 6.7/7.0.

3. Make sure to have the latest nmlx5_core native driver on the Hypervisor. Refer to NVIDIA ConnectX® Ethernet Driver for VMware® ESXi Server and How-to: NVIDIA ConnectX driver upgrade on VMware ESXi 6.5 and above.

4. Make sure to have the supported firmware version. Refer to NVIDIA ConnectX® Ethernet Driver for VMware® ESXi Server and How-to: Firmware update for NVIDIA ConnectX-5/6 adapter on VMware ESXi 6.5 and above.
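
Before continuing, you can quickly verify which nmlx5_core driver bundle is installed and which driver and firmware versions the adapter reports. This is a minimal sketch; vmnic0 is an assumed uplink name, so substitute the vmnic that belongs to your ConnectX adapter.

ESXi Console

# esxcli software vib list | grep nmlx
# esxcli network nic get -n vmnic0 | grep -i -E "driver|version"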

Setting Up SR-IOV

Enable SR-IOV in the BIOS

The figures used in this section are for illustration purposes only.

For further information, please refer to the appropriate BIOS User Manual:

1. Enable "SR-IOV" in the system BIOS.

2. Enable "Intel Virtualization Technology".

Enable SR-IOV on the Firmware

1. Enable SSH access to the ESXi server.

2. Log into the ESXi vSphere Command-Line Interface with root permissions.

3. Run MFT and check the status.



ESXi Console

# /opt/mellanox/bin/mst start
Module mst is already loaded

# /opt/mellanox/bin/mst status

MST devices:
------------
mt4125_pciconf7

4. Query the status of the device.

ESXi Console

# /opt/mellanox/bin/mlxconfig -d mt4125_pciconf7 q

Device #1:
----------
Device type:    ConnectX6DX
Name:           MCX623106AC-CDA_Ax
Description:    ConnectX-6 Dx EN adapter card; 100GbE; Dual-port QSFP56; PCIe 4.0 x16; Crypto and Secure Boot
Device:         mt4125_pciconf7

Configurations:                      Next Boot
...
NUM_OF_VFS                           0
SRIOV_EN                             False(0)
...

5. Enable SR-IOV and set the desired number of Virtual Functions (VFs).

SRIOV_EN=1
NUM_OF_VFS=16 ; This is an example with eight VFs per port.

ESXi Console

# /opt/mellanox/bin/mlxconfig -d mt4125_pciconf7 s SRIOV_EN=1 NUM_OF_VFS=16

Device #1:
----------
Device type:    ConnectX6DX
Name:           MCX623106AC-CDA_Ax
Description:    ConnectX-6 Dx EN adapter card; 100GbE; Dual-port QSFP56; PCIe 4.0 x16; Crypto and Secure Boot
Device:         mt4125_pciconf7

Configurations:                      Next Boot       New
SRIOV_EN                             False(0)        True(1)
NUM_OF_VFS                           0               16

Apply new Configuration? (y/n) [n] : y
Applying... Done!
-I- Please reboot machine to load new configurations.


6. Enter Maintenance Mode on the ESXi host.
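
If you prefer the command line to the vSphere client, maintenance mode can also be toggled from the ESXi shell. This is a minimal sketch; the same command with --enable false is used later to exit maintenance mode (step 8).

ESXi Console

# esxcli system maintenanceMode set --enable true
# esxcli system maintenanceMode get
Enabled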

7. Reboot the server.

ESXi Console

# lspci -d | grep Mellanox
0000:39:00.0 Ethernet controller: Mellanox Technologies ConnectX-6 Dx EN NIC; 100GbE; dual-port QSFP56; PCIe4.0 x16; (MCX623106AC-CDA) [vmnic0]
0000:39:00.1 Ethernet controller: Mellanox Technologies ConnectX-6 Dx EN NIC; 100GbE; dual-port QSFP56; PCIe4.0 x16; (MCX623106AC-CDA) [vmnic1]
...

8. Exit Maintenance Mode on the ESXi host.

9. Check if SR-IOV is enabled in the firmware.

ESXi Console

# /opt/mellanox/bin/mlxconfig -d mt4125_pciconf7 q

Device #1:
----------
Device type:    ConnectX6DX
Name:           MCX623106AC-CDA_Ax
Description:    ConnectX-6 Dx EN adapter card; 100GbE; Dual-port QSFP56; PCIe 4.0 x16; Crypto and Secure Boot
Device:         mt4125_pciconf7

Configurations:                      Current
...
NUM_OF_VFS                           16
SRIOV_EN                             True(1)
...

Enable SR-IOV on the Driver

1. Get the module parameter list as follows:

ESXi Console

# esxcli system module parameters list -m nmlx5_core

Name       Type            Value   Description
...
max_vfs    array of uint           Number of PCI VFs to initialize
                                    Values : Array of 'uint' of range 0-128, May be limited by device, 0 - disabled
                                    Default: 0
...

Note: mlxconfig must be performed for each PCI device (adapter). In contrast, the driver configuration is per module, which means that it applies to all adapters installed on the server.

Note: At this point, the VFs are not seen when using lspci. Only when SR-IOV is enabled on the driver will you be able to see them.


2. Enable SR-IOV in the driver and set the max_vfs module parameter.

ESXi Console

# esxcli system module parameters set -m nmlx5_core -p "max_vfs=16,16"

Or, if you have configured PFC (Priority Flow Control):

ESXi Console

# esxcli system module parameters set -m nmlx5_core -p "pfctx=0x08 pfcrx=0x08 max_vfs=16,16"
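
After setting the parameter, you can list the module parameters again to confirm the value that will take effect on the next driver load. This is a quick sanity-check sketch; the exact output layout may differ slightly between driver versions.

ESXi Console

# esxcli system module parameters list -m nmlx5_core | grep max_vfs
max_vfs    array of uint    16,16    Number of PCI VFs to initialize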

3. Enter Maintenance Mode on the ESXi host.

4. Reboot the server.

5. Exit Maintenance Mode on the ESXi host.

6. Check the PCI bus and verify that you see the VFs (with the same number of VFs on each port).

ESXi Console

# lspci -d | grep Mellanox

0000:39:00.0 Ethernet controller: Mellanox Technologies ConnectX-6 Dx EN NIC; 100GbE; dual-port QSFP56; PCIe4.0 x16; (MCX623106AC-CDA) [vmnic0]
0000:39:00.1 Ethernet controller: Mellanox Technologies ConnectX-6 Dx EN NIC; 100GbE; dual-port QSFP56; PCIe4.0 x16; (MCX623106AC-CDA) [vmnic1]
0000:39:00.2 Ethernet controller: Mellanox Technologies ConnectX Family nmlx5Gen Virtual Function [PF_0.57.0_VF_0]
0000:39:00.3 Ethernet controller: Mellanox Technologies ConnectX Family nmlx5Gen Virtual Function [PF_0.57.0_VF_1]
0000:39:00.4 Ethernet controller: Mellanox Technologies ConnectX Family nmlx5Gen Virtual Function [PF_0.57.0_VF_2]
0000:39:00.5 Ethernet controller: Mellanox Technologies ConnectX Family nmlx5Gen Virtual Function [PF_0.57.0_VF_3]
0000:39:00.6 Ethernet controller: Mellanox Technologies ConnectX Family nmlx5Gen Virtual Function [PF_0.57.0_VF_4]
0000:39:00.7 Ethernet controller: Mellanox Technologies ConnectX Family nmlx5Gen Virtual Function [PF_0.57.0_VF_5]
0000:39:01.0 Ethernet controller: Mellanox Technologies ConnectX Family nmlx5Gen Virtual Function [PF_0.57.0_VF_6]
0000:39:01.1 Ethernet controller: Mellanox Technologies ConnectX Family nmlx5Gen Virtual Function [PF_0.57.0_VF_7]
0000:39:01.2 Ethernet controller: Mellanox Technologies ConnectX Family nmlx5Gen Virtual Function [PF_0.57.0_VF_8]
0000:39:01.3 Ethernet controller: Mellanox Technologies ConnectX Family nmlx5Gen Virtual Function [PF_0.57.0_VF_9]
0000:39:01.4 Ethernet controller: Mellanox Technologies ConnectX Family nmlx5Gen Virtual Function [PF_0.57.0_VF_10]
0000:39:01.5 Ethernet controller: Mellanox Technologies ConnectX Family nmlx5Gen Virtual Function [PF_0.57.0_VF_11]
0000:39:01.6 Ethernet controller: Mellanox Technologies ConnectX Family nmlx5Gen Virtual Function [PF_0.57.0_VF_12]

Note 1: Allow at least one more VF to be configured on the firmware (num_of_vfs) than is configured on the driver (max_vfs). In our example, we had eight VFs configured on the firmware while four are configured on the driver.

Note 2: mlxconfig must be performed for each PCI device (adapter). In contrast, the driver configuration is per module, which means that it applies to all adapters installed on the server.

Note 3: Changing the number of VFs is persistent.



0000:39:01.7 Ethernet controller: Mellanox Technologies ConnectX Family nmlx5Gen Virtual Function [PF_0.57.0_VF_13]
0000:39:02.0 Ethernet controller: Mellanox Technologies ConnectX Family nmlx5Gen Virtual Function [PF_0.57.0_VF_14]
0000:39:02.1 Ethernet controller: Mellanox Technologies ConnectX Family nmlx5Gen Virtual Function [PF_0.57.0_VF_15]

At this point you can see 16 VFs and one Physical Function (PF).
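
You can also confirm from the ESXi shell which uplinks have SR-IOV enabled and list their VFs. This is a minimal sketch; vmnic0 is an assumed uplink name.

ESXi Console

# esxcli network sriovnic list
# esxcli network sriovnic vf list -n vmnic0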

Add Network Adapter to the VM in SR-IOV Mode

After you enable the Virtual Functions on the host, each of them becomes available as a PCI device.

To assign a Virtual Function to a Virtual Machine in the vSphere Web Client:

1. Locate the Virtual Machine in the vSphere Web Client.

   Select a data center, folder, cluster, resource pool, or host and click the Related Objects tab.
   Click Virtual Machines and select the virtual machine from the list.

2. Power off the Virtual Machine.

3. Select the VM and go to "Edit Settings".

4. Click on Add Network adapter.

5. Under Adapter Type, select the SR-IOV passthrough connectivity option.

Note 1: Make sure the VM version is Rel. 10 or above, and upgrade it if needed by accessing the Compatibility section (otherwise SR-IOV will not appear as an option in the network adapter selection).

Note 2: Before you start, power off the VM.


6. Check the Reserve all guest memory (All locked) checkbox.

The I/O memory management unit (IOMMU) must be able to reach all Virtual Machine memory so that the passthrough device can access the memory by using direct memory access (DMA).

7. Expand the New Network section and connect the Virtual Machine to the SRIOV net port group from the combo box at the bottom of the screen.

The virtual NIC does not use this port group for data traffic. The port group is used to extract networking properties, for example VLAN tagging, to apply to the data traffic.
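
If you have not yet created a port group for the SR-IOV network, a minimal sketch using a standard vSwitch is shown below. The names vSwitch1 and SRIOV, the uplink vmnic0, and VLAN ID 100 are assumptions; adjust them to your environment, or create the equivalent port group on a Distributed Switch instead.

ESXi Console

# esxcli network vswitch standard add -v vSwitch1
# esxcli network vswitch standard uplink add -v vSwitch1 -u vmnic0
# esxcli network vswitch standard portgroup add -v vSwitch1 -p SRIOV
# esxcli network vswitch standard portgroup set -p SRIOV --vlan-id 100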


8. Power on the VM.

9. Open the VM command line and make sure that you have the interface connected.

On the guest VM, install the NVIDIA OS driver (OFED, WinOF, ...).
Configure the IP address and check network connectivity.
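
For a Linux guest, the verification might look like the following minimal sketch: confirm the VF is visible as a ConnectX Virtual Function, identify its interface, assign an address, and ping a peer. The interface name eth1 and the 192.168.1.x addresses are assumptions; the VF interface name depends on the guest OS and driver.

VM Console

# lspci | grep -i mellanox
# ip link show
# ip addr add 192.168.1.10/24 dev eth1
# ip link set eth1 up
# ping 192.168.1.20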

Troubleshooting

1. At least one more VF must be configured on the firmware than is configured on the driver. In our example, we had eight VFs configured on the firmware while four are configured on the driver.

2. mlxconfig must be performed for each PCI device (adapter). In contrast, the driver configuration is per module, which means that it applies to all adapters installed on the server.

3. Make sure the VM version is Rel. 10 or above, and upgrade it if needed by accessing the Compatibility section (otherwise SR-IOV will not appear as an option in the network adapter selection).

Done!

MAC Address and MTU Considerations

Note 1: You can keep the automatically generated MAC address (this is the default), or change it manually.

Note 2: The hypervisor MTU should be higher than or equal to the guest VM MTU; otherwise, packets may be dropped. You may enable “Set Guest OS MTU change” to allow changing the MTU from the guest. This step is applicable only if this feature is supported by the driver.
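
For example, to raise the MTU on the hypervisor side for the vSwitch that carries the SR-IOV port group, and then match it inside a Linux guest, you could use the sketch below. vSwitch1, eth1, and MTU 9000 are assumptions; adjust them to your environment.

ESXi Console

# esxcli network vswitch standard set -v vSwitch1 -m 9000

VM Console

# ip link set dev eth1 mtu 9000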