DATA PROCESSING UNIT (DPU)
Alexander Petrovskiy, Staff System Engineer, NVIDIA Networking
August 2021
DATA PROCESSING UNIT (DPU)
Technical overview
2
SERVER NETWORKING EVOLUTION: FROM NIC TO DPU
Software Defined Data Center Infrastructure-on-a-Chip
Legacy Infrastructure
- CPU, Legacy NIC
- SW-defined Networking, SW-defined Security, SW-defined Storage, Infrastructure Management
- VMs, Containers
Accelerated SW-defined Infrastructure on CPU
- CPU, SmartNIC
- SW-defined Networking, SW-defined Security, SW-defined Storage, Infrastructure Management
- Acceleration Engines
- VMs, Containers
SW-defined Infrastructure on DPU
- CPU, DPU
- SW-defined Networking, SW-defined Security, SW-defined Storage, Infrastructure Management
- Acceleration Engines
- VMs, Containers
Datacenter on a DPU
- CPU, DPU with integrated GPU
- SW-defined Networking, SW-defined Security, SW-defined Storage, Infrastructure Management
- Acceleration Engines
- VMs, Containers
- AI Applications
3
DEMYSTIFYING SMARTNICs AND DPUs
Software-Defined, Hardware-Accelerated
BLUEFIELD DPU
SoC-based DPU with full Data & Control Path Acceleration for Unified Cloud
CONNECT-X SmartNIC
ASIC-based advanced NIC with Fully Accelerated Datapath for Secure Cloud,
Telco and Enterprise
4
WHAT MAKES A SMARTNIC SMART?
- PAM4
- PTP HW Clock
- Secure Firmware Update
- Secure Boot (HW RoT)
- Hardware Steering and Filtering
- AES-XTS Storage Encryption Engine
- Key Management
- TLS Inline Offload Engine
- Connection Tracking
- IPsec Inline Offload Engine (aware/un-aware)
- RoCE Selective Repeat
- Resilient RoCE
- Accurate timestamp
- x16 PCIe Gen 4.0
5
BLUEFIELD-2 DPU
ConnectX-6 Dx inside
200 Gbps Ethernet & InfiniBand, NRZ & PAM4 modulation
8 Arm A72 CPU cores in a tile architecture
- 8MB L2 cache, 6MB L3 cache in 4 tiles
- Arm frequency up to 2.75GHz
Fully integrated PCIe switch, 16 bifurcated Gen 4.0 lanes
- Root Complex or Endpoint modes
1GbE out-of-band management port
16 lanes PCIe Gen3/4
Technical Overview
6
BLUEFIELD-3 DPU
ConnectX-7 inside
I/O
2x400Gb/s (Active/Standby), 4x100Gb/s Ethernet/InfiniBand
100G PAM4 serdes
400Gb/s bandwidth
Integrated PCIe switch
Gen5.0 x32+2
Multi-host – 8 hosts
Compute sub-system
16 Arm® A78 (Armv8.2+, Hercules) cores @ 2.3GHz
SkyMesh fully coherent low-latency interconnect
8MB L2 Cache, 16MB LLC System Cache
Built-in accelerators
Advanced Memory sub-system
Dual Channel 256GB DDR5-4800MT/s w/ ECC
NVDIMM-N Support
DDR memory encryption
1GbE Out-Of-Band management port
Self-hosted or Server-hosted
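The memory and PCIe figures on this slide translate into peak bandwidth with simple arithmetic. The following is a minimal Python sketch, not an NVIDIA tool. It assumes that "64b + 16b" in the block diagram means 64 data bits plus 16 ECC bits per channel, that PCIe Gen 5.0 runs at 32 GT/s per lane with 128b/130b encoding, and it ignores the "+2" lanes. The function names are illustrative, and the results are theoretical peaks rather than measured throughput.

```python
def ddr_peak_gbytes(channels: int, data_bits: int, mega_transfers: float) -> float:
    """Peak DRAM bandwidth in GB/s; ECC bits carry no payload and are excluded."""
    return channels * (data_bits / 8) * mega_transfers * 1e6 / 1e9


def pcie_peak_gbytes(lanes: int, gt_per_s: float, encoding: float) -> float:
    """Peak one-direction PCIe bandwidth in GB/s, before protocol overhead."""
    return lanes * gt_per_s * encoding / 8


# Dual-channel DDR5-4800, 64 data bits per channel (assumed reading of "64b + 16b").
ddr = ddr_peak_gbytes(channels=2, data_bits=64, mega_transfers=4800)

# PCIe Gen 5.0 x32: 32 GT/s per lane, 128b/130b line encoding.
pcie = pcie_peak_gbytes(lanes=32, gt_per_s=32.0, encoding=128 / 130)

# 400 Gb/s of network bandwidth expressed in GB/s for comparison.
net = 400 / 8

print(f"DDR5: {ddr:.1f} GB/s, PCIe Gen5 x32: {pcie:.1f} GB/s, network: {net:.1f} GB/s")
```

Under these assumptions the local DDR5 peak (about 76.8 GB/s) and the PCIe peak (about 126 GB/s per direction) both exceed the 50 GB/s that a 400Gb/s network port can deliver.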
Technical Overview
Block diagram:
- Quad VPI ports, Ethernet/InfiniBand: 10/25/50/100/200/400G
- Out-of-band management port (GbE, SGMII)
- ConnectX-7 subsystem: packet processing, eSwitch flow steering / switching, IPsec/TLS/CT, RDMA transport, encrypt/decrypt, application offload (NVMe-oF, NVMe-oTCP, T10-DIF, etc.)
- DMA engines (4 shown)
- 16 Hercules cores, each with its own L2 cache
- Last Level Cache (4 blocks shown)
- Two DDR5 interfaces, 64b + 16b, 4800MT/s
- PCIe Gen 5.0 switch, x32+2, Root Complex or Endpoint
- Peripherals: I2C, USB, DAP, UART, eMMC, GPIO
- Accelerators: APU, RoT, RegExp, Decomp, TRNG, PKA
7
MOVING INFRASTRUCTURE SERVICES TO DPU
Software Defined Security, Software Defined Storage, Software Defined Networking
- Distributed NG Firewall
- IDS/IPS
- DDoS Prevention
- vRouter
- vSwitch
- VMs & Containers
- NVMe-oF
- Storage Direct
- Data Encryption
- DeDup
- Micro Segmentation
- Telco/NFV
- Elastic Storage
- Root of Trust
- Compression
- NAT/Load Balancer
8
DPU FOR UNIFIED CLOUD USE-CASE
Unified infrastructure for host Networking, Storage, Security and Management
Today’s Environment
Standard NIC
Network I/O, Host Mgmt, Storage I/O
Hypervisor
Security &
Crypto
Functional Isolation
Today’s Environment
Network I/O
CPU
Scheduling
Storage I/O
Lightweight
Hypervisor
Security &
Crypto
BlueField-2 DPU
Unified Datacenter
Container Container
APP APP
VM VM
APP APP
Host Mgmt
Bare Metal Server
VM
APP
Container
APP
VM
APP
APP
Container Container
APP APP
AI
Acceleration
9
DPU IS THE NEW NETWORK EDGE
Moving the Top-of-rack Into the Server
SmartNIC/DPU
Datacenter ToR Switch
Host Based Networking
FRR (BGP/EVPN)
Linux networking
Linux apps
10
HBN FOR UNIFIED CLOUD
Zero Trust
Network Administrator Server Administrator
EVPN Crypto Ansible
Standard Linux Control Plane
Hypervisors
Bare Metal
Modern Networking with Classic Controls
VLAN Trunk
BGP Peering
Offloaded VXLAN
Accelerated Data Plane
Containers
12
HOST
ASAP2: VIRTUAL NETWORKING AND SDN ACCELERATION
Accelerated Switching and Packet Processing
VM VM Container
Legacy NIC
Hypervisor
CPU
Bandwidth
Virtual Switches / SDN Packet Processing Put Heavy Load on CPU
CPU
Bandwidth
Hardware Accelerated Packet Switching with Zero CPU Utilization
Integrated with commercial and community partners
Leveraged to create Efficient Cloud Architectures
HOST
VM VM Container
ConnectX
Hypervisor
vSwitch vSwitch
13
HOST NETWORKING ACCELERATION
Comparing two general approaches
SR-IOV
Single root input/output virtualization
- Native hardware access at the VM level
- Every VM has direct access to the network adapter (through a virtual function, VF, interface)
- Bare-metal-like network performance in the VM, zero CPU utilization
- Guest-awareness limitations: NIC driver in the VM, VM live migration challenges
VirtIO
Virtualization standard for network device drivers in Linux systems
- VirtIO abstracts the hardware to the guest OS in software
- Poor network performance in the VM; the CPU is used to move packets
- Guest-unaware: virtio-net interface in the VM, native VM live migration
- Can be accelerated in NIC hardware using vDPA
Diagram (SR-IOV): VMs, each with a VF; Hypervisor; PF; Network adapter; Host
Diagram (VirtIO): VMs, each with a VirtIO interface; Hypervisor (vhost-net); Network adapter; Host
14
VF REPRESENTOR PORTS
SW Representation of SR-IOV NIC Virtual Function
VF Representor
Netdevice model of an eSwitch port, exposed through the PF driver
A VF and its representor work like a Linux veth pair
Flow configuration (add/remove)
Works under switchdev mode
Access from both kernel and DPDK
Multi Queue (RSS/TSO/CSUM)
Attach/Detach in DPDK
Multiple DPDK instances over VF representor
With VF representors, a vSwitch can work together with SR-IOV and reduce the CPU% consumed by virtio.
Port
DPU
SR-IOV
Host
SW Datapath
PF R1 R2
VM1
VF1
VM2
VF2
15
EMBEDDED SWITCH (ESWITCH)
Flow-based Packet Processing and Steering Engine in SmartNIC/DPU
e-switch
Packets In
Classification A, Classification B, ... Classification N
Action A, Action B, ... Action N
Processed Packets Out
Flow-based classification and action
Hierarchical multi-layer tables
- Each table consists of classification and action
- An action may point to the next table
Key fields example: Ethernet L2/IPv4/IPv6/TCP/UDP/Inner Packet (VXLAN/GENEVE/etc.)
Actions example: Allow/Deny, Re-write (Route/NAT), Encap/Decap of headers, Meta Data set, Hairpin, Sample, Counter, etc.
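The table model described on this slide (classification plus action, with an action optionally pointing to a next table) can be illustrated in a few lines of Python. This is a conceptual sketch only, not the eSwitch programming interface. The class names, the packet-as-dictionary model, the action names ("deny", "decap", "rewrite", "meta") and the drop-on-miss behaviour are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Rule:
    match: dict                  # key fields that must all equal the packet's fields
    actions: list                # ordered (action_name, argument) pairs
    goto: Optional[int] = None   # next table to consult, or None to finish
    hits: int = 0                # per-rule counter


@dataclass
class FlowPipeline:
    tables: dict = field(default_factory=dict)   # table id -> list of rules

    def add_rule(self, table_id: int, rule: Rule) -> None:
        # Only allow forward jumps so that a lookup always terminates.
        if rule.goto is not None and rule.goto <= table_id:
            raise ValueError("goto must point to a later table")
        self.tables.setdefault(table_id, []).append(rule)

    def _lookup(self, table_id: int, pkt: dict) -> Optional[Rule]:
        for rule in self.tables.get(table_id, []):
            if all(pkt.get(k) == v for k, v in rule.match.items()):
                return rule
        return None

    def process(self, packet: dict) -> Optional[dict]:
        """Return the processed packet, or None if it is dropped."""
        pkt = dict(packet)
        table_id: Optional[int] = 0
        while table_id is not None:
            rule = self._lookup(table_id, pkt)
            if rule is None:
                return None                      # table miss: this sketch drops
            rule.hits += 1
            for name, arg in rule.actions:
                if name == "deny":
                    return None
                if name == "decap":
                    pkt = dict(pkt["inner"])     # continue with the inner packet
                elif name == "rewrite":
                    pkt.update(arg)              # e.g. NAT: overwrite header fields
                elif name == "meta":
                    pkt["meta"] = {**pkt.get("meta", {}), **arg}
            table_id = rule.goto
        return pkt


# Table 0 strips a VXLAN header, table 1 applies policy to the inner packet.
pipe = FlowPipeline()
pipe.add_rule(0, Rule({"udp_dport": 4789, "vni": 100}, [("decap", None)], goto=1))
pipe.add_rule(1, Rule({"tcp_dport": 22}, [("deny", None)]))
pipe.add_rule(1, Rule({"ip_dst": "10.0.0.5", "tcp_dport": 80},
                      [("rewrite", {"ip_dst": "192.168.1.5"}),
                       ("meta", {"out_port": "vf1"})]))

vxlan = {"udp_dport": 4789, "vni": 100,
         "inner": {"ip_dst": "10.0.0.5", "tcp_dport": 80}}
result = pipe.process(vxlan)
print(result)
```

In this example the HTTP packet is decapsulated, its destination address is rewritten, and it is tagged with an output port, while an inner packet to TCP port 22 would be denied in the second table.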
16
NETWORKING OFFLOAD MODEL ON DPU
Full Control Plane and Data Plane offload
Control plane and SW Datapath on DPU
HW Datapath is accelerated as in SmartNIC
Both SR-IOV and VirtIO interfaces to VMs
Advantages
- Supports virtualized network services for VMs, containers and bare-metal cloud
- Zero host CPU utilization for networking services
- All host resources (cores and memory) can be used for VMs
- Efficient packet forwarding in HW
- Host isolation
DPU can offload extra I/O and management services
- Storage (NVMe-oF, Virtio-blk)
- Security (Firewall, DPI, IPsec/SSL crypto)
- Host infrastructure management (BMC, bare-metal/VM provisioning)
VM3 VM1
Host
Port
DPU
Control Plane
HW Datapath (eSwitch)
PF R1 R2
SW Datapath
R3
Vhost Backend
VF3
VM2
virtio VF1 VF2
17
ASAP2: BLUEFIELD-2 TRAFFIC FLOW
Embedded CPU Configuration (Switchdev)
OVS / OVS-DPDK
ConnectX-6 Dx SmartNIC
Bare Metal
PF0
pf0vf0 pf0hpf
Host
ECPF0
Bare Metal, VM, VM
vport1, vport0
pf0vf1
Network Interface
Flows
FDB
Control plane isolation
Uplink
p0
19
DPU STORAGE
Storage Agility Meets Best-in-Class Hardware Acceleration
UNPARALLELED PERFORMANCE
- Dual 100Gbps or single 200Gbps
- Up to 5.4M IOPS @4KB
- Lowest latency
- NVMe-oF acceleration
STORAGE SECURITY
- Data-at-rest AES-XTS encryption
- Authentication services
- Protection between users
DISAGGREGATED STORAGE
- NVMe SNAP
- Virtio-blk SNAP
- Integrated data & control planes
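The IOPS and link-speed figures on this slide can be cross-checked with a quick calculation. The following is a minimal Python sketch, assuming that "4KB" means 4096 bytes and counting payload only, with no NVMe-oF or transport headers. The function name is illustrative.

```python
def iops_to_gbps(iops: float, block_bytes: int) -> float:
    """Payload throughput in Gb/s (10**9 bits per second) for a given IOPS rate."""
    return iops * block_bytes * 8 / 1e9


# "Up to 5.4M IOPS @4KB", with 4KB taken as 4096 bytes (assumption).
payload_gbps = iops_to_gbps(5.4e6, 4096)

# "single 200Gbps" port from the same slide.
link_gbps = 200.0

print(f"{payload_gbps:.1f} Gb/s payload, "
      f"{payload_gbps / link_gbps:.0%} of a 200 Gb/s link")
```

Under these assumptions, 5.4M IOPS of 4KB blocks corresponds to roughly 177 Gb/s of payload, which is close to the line rate of a single 200Gbps port.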
20
SNAP: LOCAL STORAGE TO EMULATED STORAGE
Physical Local NVMe Storage
✓ Serving bare-metal and hypervisor/VMs
Bound by physical SSD capacity
Under-utilized storage
Scalability on demand
Over-provisioning bound to compute node
SNAP Drive Emulation
✓ Serving bare-metal and hypervisor/VMs
✓ Over-provisioning, scaled to rack/cluster
✓ Saving OPEX and CAPEX
✓ OS-agnostic using inbox standard driver
✓ Supports all network transport types: NVMe-oF, iSCSI, iSER and even proprietary
✓ Accelerated data path* for VMs
✓ Live-migration with virtio-blk* and vDPA*
✓ Support for older OSs where only virtio-blk* is available
* Roadmap
Diagram (physical local storage): Host OS, NVMe Driver, Physical Local Storage
Diagram (SNAP): Host OS, NVMe Driver, virtio-blk, SNAP, Remote Storage
21
BLUEFIELD-2 SNAP – NVME/VIRTIO-BLK
Emulate NVMe Local Storage
Connected to Remote Cloud Storage
Virtualized or Bare Metal Cloud
OS Agnostic with RDMA inside
OS/Hypervisor
NVMe std drvr
SNAP
NVMe SNAP SDK
User’s Storage Application
SPDK
Hardware NVMe-oF Offload Accelerations
Eth/IB
Host Server
PCIe BUS
2 1
virtio-blk std drvr
2
virtio-blk
1
virtio
NVMe
- Enabling two data paths – (1) offload with NVMe-oF(RDMA)* vs (2) SPDK
- Pluggable to Linux’s block devices (NVMe-oF, iSCSI, iSER, etc)
- Provides infrastructure for Storage Application development
- Enabling End to End storage orchestration and integration
Framework for Storage virtualization software
23
DPU SECURITY SOLUTIONS
Integrated Security for modern data center needs
SECURED HARDWARE
- Secure FW upgrade
- Root-of-Trust
- Arm TrustZone
ADVANCED L4-L7 SECURITY
- NG stateful firewall
- Deep Packet Inspection
- Host introspection
CRYPTO ACCELERATION
- Data-in-motion enc.
- Data-at-rest enc.
- Public Key Acceleration
PROGRAMMABILITY & ISOLATION
- Hardened Isolation
- Micro-Segmentation
- Programmable Networking
24
DPU SECURITY CAPABILITIES
Trust Shifts to the DPU
Root-of-Trust
Stateful Firewall
Inline Crypto Accelerators
Deep Packet Inspection
Isolated Security Control Plane
DPU Security Services
- Micro-segmentation
- Next Generation Firewall
- DDoS Protection
- Intrusion Protection
- Anomaly Detection
Security Requires Full Isolation from the Host
CPU
GPU
Network Traffic
BlueField DPU
25
IPSEC: TRANSPARENT ENCRYPTION
Encryption/decryption at 100Gb/s bidirectional
Host Host
IPsec
Software
Virtual
Switch
Encrypted IPsec Packet
Plaintext Packet
Workload Workload
Workload
Encrypted Packet
Plaintext Packet
Simple NIC vSwitch Control Plane
IPsec Control Plane
eSwitch and
IPsec engine
BlueField DPU
Traditional Server: IPsec runs on CPU
Workload
DPU Accelerated Server: IPsec and vSwitch on DPU
Acceleration Engine
Control Plane Software on Arm
Inline with other accelerators (tunneling, TLS, etc.)
Cipher: AES-GCM 128/256bit keys
Keys are stored encrypted in hardware
Encrypted RDMA
East-West encryption
26
ACCELERATING NEXT-GENERATION FIREWALLS
Accelerated Switching and Packet Processing (ASAP2) enables seamless offload of packet filtering, steering, crypto and stateful connection tracking rules to the DPU HW.
Hardware-Accelerated Policy Enforcement
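The idea of offloading stateful connection tracking can be illustrated with a small model: software evaluates the policy for the first packet of a connection, then installs both directions in a hardware table so later packets bypass software. This is a conceptual Python sketch only. The slides do not describe how ASAP2 implements this, and the class name, the port-based policy, and the counters are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ConntrackOffload:
    allowed_ports: set                              # policy: permitted destination ports
    hw_table: set = field(default_factory=set)      # 5-tuples handled in "hardware"
    slow_path_packets: int = 0                      # packets inspected by software
    fast_path_packets: int = 0                      # packets matched in the hardware table

    def handle(self, five_tuple: tuple) -> bool:
        """Return True if the packet is forwarded, False if it is dropped."""
        if five_tuple in self.hw_table:
            self.fast_path_packets += 1             # hit: no software involvement
            return True
        self.slow_path_packets += 1                 # miss: software evaluates the policy
        src, sport, dst, dport, proto = five_tuple
        if dport not in self.allowed_ports:
            return False
        self.hw_table.add(five_tuple)                        # offload the forward direction
        self.hw_table.add((dst, dport, src, sport, proto))   # and the reply direction
        return True


fw = ConntrackOffload(allowed_ports={443})
flow = ("10.0.0.1", 40000, "10.0.0.2", 443, "tcp")
reply = ("10.0.0.2", 443, "10.0.0.1", 40000, "tcp")

for _ in range(5):
    fw.handle(flow)          # first packet takes the slow path, the rest the fast path
fw.handle(reply)             # reply direction was offloaded with the first packet
blocked = fw.handle(("10.0.0.1", 40001, "10.0.0.2", 23, "tcp"))

print(fw.slow_path_packets, fw.fast_path_packets, blocked)
```

In this model only the first packet of an allowed connection, and every packet of a denied one, consumes software cycles; the remaining traffic is matched in the table that stands in for the DPU hardware.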
Host
OS
Workload
Host
OS
NGFW
Workload
NGFW
Workload
NVIDIA DPU
28
DPU HIGH LEVEL SW ARCHITECTURE
Software-Defined, Hardware-Accelerated Infrastructure
Software-Defined Security, Software-Defined Storage, Software-Defined Networking
- Distributed Next-Gen Firewall
- IDS/IPS
- DDoS Prevention
- vRouter
- vSwitch
- VMs and Containers
- NVMe-oF
- Encrypt
- Dedupe
- Micro Segmentation
- Telemetry/PTP
- Elastic
- Root of Trust
- Compress
- NAT/Load Balancer
- Video Streaming
DPU HW
DPU SW and SDK (DOCA)
Open and Programmable API Framework
Easy, Flexible Programming of Infrastructure / Acceleration and Security
29
DPU SOFTWARE COMPONENTS
Bootloader: UEFI, ATF (Arm Trusted Firmware), ACPI
Linux distro: CentOS reference drivers, Ubuntu commercial OS
Mellanox drivers: OFED driver, ASAP2, NVMe SNAP
Secure Boot and Secure Firmware Upgrade
OpenBMC for BMC Management
ConnectX-6 Dx firmware binary file
30
NVIDIA DOCA
Data-Center-Infrastructure-on-a-Chip Architecture
COMMUNITY OF DEVELOPERS
- SDK for ecosystem partners, academia, community
ACCELERATE TTM
- Leverages open-source and industry standards (DPDK, P4); NGC-certified
COMPETITIVE EDGE
- Best performance; out-of-the-box experience; libraries with special capabilities
LONG-TERM COMMITMENT
- Backward and forward compatibility; consistency with performance improvements
DOCA is for DPUs what CUDA is for GPUs
31
DOCA
ONE-STOP SHOP FOR DPU DEVELOPERS
Developer Zone Program and Website
SDK Manager Support
Tools (Compilers, Benchmarks, etc.)
DOCA Drivers and Libraries
API References and Programming Guides
Reference Applications per Use Case
Accelerated Solutions Integration
32
DOCA SDK STACK
Layers: APPLICATIONS, DOCA SERVICES, DOCA LIBRARIES, DOCA DRIVERS
Application domains: Networking, Security, Storage, HPC/AI, Telco, Media
Platform: DPU: BlueField and BlueField-X
Group labels in the diagram: Networking, Security, Storage, HPC/AI
Components shown:
- RDMA
- DPI
- VNF/UPF
- FlexIO
- TSDC
- DPU Management
- DPDK RegEx
- DPDK SFT
- Inline Crypto
- ASAP2
- DPDK
- P4
- P4-RT
- SNAP
- VirtIO-FS
- XTS Crypto
- FLOW
- UCX/UCC
- Host Introspection
- Orchestration
- SDN
- Telemetry
- Comm Channel
- RiverMax
- Data Integrity
- SPDK
33
JOIN THE DOCA DEVELOPER PROGRAM TODAY
https://developer.nvidia.com/nvidia-doca-sdk-early-access