Vhost and VIOMMU - KVM · 2016-08-30
Vhost and VIOMMU
Jason Wang (Wei Xu)
Peter Xu
08/18/16 VHOST AND VIOMMU 2
Agenda
● IOMMU & Qemu vIOMMU background
● Motivation of secure virtio
● DMAR (DMA Remapping)
– Design Overview
– Implementation illustration
– Performance optimization – vhost device iotlb
● IR (Interrupt Remapping)
● Performance results & status

IOMMU & Qemu vIOMMU Revisit
● What is an IOMMU?
– A hardware component that provides two main functions: IO translation and device isolation.
● How does an IOMMU support IO translation and device isolation?
– DMA Remapping (DMAR): IO-space addresses presented by devices are translated to physical addresses, with access permissions checked on the fly, so each device can access only specific regions of memory.
– Interrupt Remapping (IR): some architectures also support interrupt remapping, in a manner similar to memory remapping.
● What is the Qemu vIOMMU?
– An emulated IOMMU that behaves like a real one.
– Its functionality is always a subset of a physical unit's, depending on the implementation.
– Only Intel, ppc, and sun4m IOMMUs are currently supported in Qemu.
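
The DMAR idea (translate a device's IO address and check permissions in the same step) can be sketched in a few lines. This is a toy model only: `ToyIommu`, the single-level table, the 4 KiB page size, and the `READ`/`WRITE` bits are all assumptions made for illustration, not the layout of any real IOMMU.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096        # assumed page size for this sketch
READ, WRITE = 0x1, 0x2  # hypothetical permission bits


@dataclass
class Mapping:
    phys_page: int  # physical page frame number
    perm: int       # bitmask of READ / WRITE


class ToyIommu:
    """Per-device, single-level IO page tables (real hardware uses multi-level tables)."""

    def __init__(self):
        self.tables = {}  # device id -> {iova page number -> Mapping}

    def map(self, dev, iova, phys, perm):
        table = self.tables.setdefault(dev, {})
        table[iova // PAGE_SIZE] = Mapping(phys // PAGE_SIZE, perm)

    def translate(self, dev, iova, access):
        # Device isolation: a device only ever sees its own table.
        m = self.tables.get(dev, {}).get(iova // PAGE_SIZE)
        if m is None or (m.perm & access) != access:
            raise PermissionError(f"DMA fault: dev={dev} iova={iova:#x}")
        return m.phys_page * PAGE_SIZE + iova % PAGE_SIZE


iommu = ToyIommu()
iommu.map("nic0", iova=0x1000, phys=0x7F000, perm=READ)
print(hex(iommu.translate("nic0", 0x1010, READ)))  # prints 0x7f010
```

A write through the same mapping, or any access from a device with no mapping, raises `PermissionError`, which is the isolation property the slide describes.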

IOMMU and vIOMMU
[Figure: a HOST with CPU, MMU, IOMMU, hardware devices, and host memory, running two VMs; each VM has its own vCPU, vMMU, vIOMMU, emulated devices, and memory.]

Motivation
● Security, security, and security.
● DPDK: the userspace poll-mode drivers (DPDK) for virtio-net devices are widely used in NFV.
● Vhost is the popular backend for most use cases.
● Vhost is still outside the scope of the IOMMU.

DMA Remapping (DMAR)

Virtio-Net Device Address Space Overview
[Figure: Guest with virtio-net and its vring (gpa); Qemu virtio-net device using the memory API (gpa); virtio-net backends (vhost-net, vhost-user, other virtio-net backends) handling tx/rx; the virtio-net backend service performs gpa-to-hva translation to reach guest pages.]

Design of Secure Virtio-Net Device
[Figure: Guest with virtio-net driver, IOMMU driver, and dma api placing iova in the vring; Qemu virtio-net device using the memory API and the vIOMMU IOTLB API (iotlb entry lookup); virtio-net backends (vhost-net, vhost-user, other virtio-net backends) handling tx/rx; the virtio-net backend service performs iova-to-hva translation to reach guest pages.]

Implementation: Guest
● Guest
– Boot the guest with a vIOMMU assigned.
– VIRTIO_F_IOMMU_PLATFORM: if the device offers this feature bit, the guest virtio driver must use the DMA API for all of its DMA memory accesses; otherwise the system disables the device.
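
The feature-bit rule can be illustrated with a simplified negotiation model. Bit number 33 for VIRTIO_F_IOMMU_PLATFORM comes from the virtio specification; the `negotiate` and `uses_dma_api` helpers are hypothetical names for this sketch and are not taken from the Linux virtio driver.

```python
VIRTIO_F_IOMMU_PLATFORM = 33  # feature bit number from the virtio specification
IOMMU_PLATFORM_MASK = 1 << VIRTIO_F_IOMMU_PLATFORM


def negotiate(device_features: int, driver_features: int) -> int:
    """Return the negotiated feature bits, or raise if the device must be refused.

    Simplified model: a device that offers the bit cannot be driven by a
    driver that does not accept it.
    """
    negotiated = device_features & driver_features
    if device_features & IOMMU_PLATFORM_MASK and not negotiated & IOMMU_PLATFORM_MASK:
        raise RuntimeError("device requires DMA API use; driver did not accept")
    return negotiated


def uses_dma_api(negotiated: int) -> bool:
    """With the bit negotiated, ring addresses are iova obtained via the DMA API."""
    return bool(negotiated & IOMMU_PLATFORM_MASK)


# Device offers the bit, driver accepts it: DMA API in use.
features = negotiate(IOMMU_PLATFORM_MASK, IOMMU_PLATFORM_MASK)
print(uses_dma_api(features))  # prints True
```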

Implementation: Qemu and Backends
● Qemu
– DMA address translation for the vIOMMU is fully supported. Unfortunately, virtio-pci devices still use the memory address space and never use iova, so they must be switched to DMA addresses (iova).
● Backends
– Every access to the vring must be translated from guest iova to hva; this is done via an iotlb lookup that goes through the vIOMMU.

More optimization: Vhost Device IOTLB Cache
● Why vhost?
– Vhost-net is the most powerful and reliable in-kernel network backend, and is widely used as the preferred backend.
● What problem does vhost encounter?
– The vIOMMU IOTLB API is implemented in Qemu, while vhost works in the kernel; frequent iotlb translations that cross between kernel and userspace hurt performance dramatically.
● How does vhost survive?
– A kernel-side device iotlb cache (ATS).

Address Translation Services (ATS) Overview
[Figure: a Root Complex with a Translation Agent (TA) and memory; PCIe Device A and PCIe Device B with a device iotlb cache; ats request and ats completion messages between device and TA.]

Why Address Translation Services (ATS)?
● Alternative
– A separate VT-d implementation in vhost. Drawbacks:
● Code duplication.
● Vendor and architecture specific.
● New API for error reporting.
● Benefits of ATS
– PCIe spec
– Platform independent.
– Easily achieved based on the current iommu infrastructure.

Vhost Device IOTLB Cache Workflow
[Figure: on Tx/Rx, vhost translates iova 'd' by looking it up in its device iotlb cache, an interval tree of entries such as (a, size, ro), (b, size, wo), (c, size, rw). On a miss, vhost sends iotlb-miss 'd' through the vhost IOTLB API to Qemu, whose IOTLB API looks up 'd'. For a legal address range, Qemu replies with iotlb-update 'd' and the new entry (d, size, wo) is added to the cache; for an illegal address range, an error is reported. When the guest unmaps 'c', Qemu sends iotlb invalidate 'c'.]

Vhost Device IOTLB Implementation Summary
● Implementation
– Save device iotlb cache entries in the kernel.
– Look up entries in the cache when accessing virtio buffers.
– Ask Qemu to translate on demand for any tlb miss.
– Process update/invalidate messages from Qemu and manage the kernel cache accordingly.
● Data Structure and Userspace/Kernel Interface
– An interval tree is used to store the dynamic device iotlb cache entries.
– A message mechanism via read/write on the vhost 'fd' passes vATS requests and replies.
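
The lookup / miss / update / invalidate cycle can be sketched as below. Note the substitutions: a sorted list searched with `bisect` stands in for the interval tree, and a plain callback (`ask_qemu`) stands in for the message exchange over the vhost fd. `DeviceIotlb`, `Entry`, and the `RO`/`WO`/`RW` values are hypothetical names for this sketch, and entries are assumed never to overlap.

```python
import bisect
from dataclasses import dataclass

RO, WO, RW = 0x1, 0x2, 0x3  # hypothetical permission encoding


@dataclass
class Entry:
    iova: int
    size: int
    hva: int
    perm: int


class DeviceIotlb:
    """Stand-in for the kernel-side cache: non-overlapping ranges sorted by iova."""

    def __init__(self, ask_qemu):
        self.starts = []          # sorted start addresses
        self.entries = []         # entries, parallel to self.starts
        self.ask_qemu = ask_qemu  # stands in for the miss/update messages
        self.misses = 0

    def _find(self, iova):
        i = bisect.bisect_right(self.starts, iova) - 1
        if i >= 0 and iova < self.entries[i].iova + self.entries[i].size:
            return self.entries[i]
        return None

    def update(self, entry):
        i = bisect.bisect_left(self.starts, entry.iova)
        self.starts.insert(i, entry.iova)
        self.entries.insert(i, entry)

    def invalidate(self, iova, size):
        # Drop every cached entry that overlaps [iova, iova + size).
        self.entries = [e for e in self.entries
                        if e.iova + e.size <= iova or e.iova >= iova + size]
        self.starts = [e.iova for e in self.entries]

    def translate(self, iova, access):
        entry = self._find(iova)
        if entry is None:
            self.misses += 1
            entry = self.ask_qemu(iova)  # miss request, update reply
            if entry is None:
                raise LookupError(f"illegal iova {iova:#x}")
            self.update(entry)
        if (entry.perm & access) != access:
            raise PermissionError(f"access denied at iova {iova:#x}")
        return entry.hva + (iova - entry.iova)


# Userspace side: the only mapping the guest has set up is 'd' (write-only).
guest_mappings = [Entry(iova=0xD000, size=0x1000, hva=0x7F00_0000_D000, perm=WO)]


def ask_qemu(iova):
    for e in guest_mappings:
        if e.iova <= iova < e.iova + e.size:
            return e
    return None


tlb = DeviceIotlb(ask_qemu)
hva1 = tlb.translate(0xD010, WO)  # miss: goes to "Qemu", entry is cached
hva2 = tlb.translate(0xD020, WO)  # hit: served from the cache
print(hex(hva1), hex(hva2), tlb.misses)
```

Only the first access pays for the round trip; later accesses to the same range stay on the fast path until an invalidate removes the entry.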

Interrupt Remapping (IR)

X86 system interrupts
[Figure: processors, each with a Local APIC, attached to the system bus; a bridge connects the system bus to the PCI bus; line-based interrupts are routed through the IOAPIC, while signal-based interrupts (MSI/MSI-X) are sent as messages.]
Kinds of interrupts:
– Line-based (edge/level)
– Signal-based (MSI/MSI-X)
IRQ chips
– IOAPIC
– Local APICs (LAPICs)

IR challenges for vhost
● Interrupt remapping (IR) is still not supported for the x86 vIOMMU
– MSI and IOAPIC interrupts
● Kernel irqchip support:
– How to define the interface between user and kernel space?
– How to enable the vhost fast irq path (irqfd)?
● Performance impact?
● Interrupt caching

IOAPIC interrupt delivery
● Workflow before IR:
– Fill in the IOAPIC entry with interrupt information (trigger mode, destination ID, destination mode, etc.).
– When the line is triggered, the interrupt is sent to the CPU with the information stored in the IOAPIC entry.
● Workflow after IR (IRTE: Interrupt Remapping Table Entry):
– Fill in the IRTE with interrupt information (in system memory).
– Fill in the IOAPIC entry with the IRTE index.
– When the line is triggered, fetch the IRTE index from the IOAPIC entry and send the interrupt with the information stored in that IRTE.
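
The two workflows differ only in where the interrupt information lives. A minimal sketch, with dictionaries standing in for the IOAPIC redirection table and the interrupt remapping table; the `Interrupt` fields and function names are illustrative assumptions, not the hardware entry formats.

```python
from dataclasses import dataclass


@dataclass
class Interrupt:
    vector: int
    dest_id: int
    trigger_mode: str  # "edge" or "level"


def deliver_without_ir(ioapic, pin):
    """Before IR: the IOAPIC entry itself carries the interrupt information."""
    return ioapic[pin]


def deliver_with_ir(ioapic, irt, pin):
    """After IR: the IOAPIC entry carries only an index into the remapping table."""
    index = ioapic[pin]
    irte = irt.get(index)
    if irte is None:
        raise LookupError(f"no IRTE at index {index}")
    return irte


# Before IR: pin 2 is programmed with the full interrupt description.
plain_ioapic = {2: Interrupt(vector=0x30, dest_id=0, trigger_mode="edge")}
print(deliver_without_ir(plain_ioapic, 2))

# After IR: the table lives in system memory; the IOAPIC entry stores index 5.
irt = {5: Interrupt(vector=0x30, dest_id=0, trigger_mode="edge")}
remapped_ioapic = {2: 5}
print(deliver_with_ir(remapped_ioapic, irt, 2))

# Retargeting the interrupt touches only the IRTE, not the IOAPIC entry.
irt[5].dest_id = 1
print(deliver_with_ir(remapped_ioapic, irt, 2).dest_id)  # prints 1
```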

MSI/MSI-X delivery
[Figure, two panels. MSI delivery without IR: the interrupt request (MSI) is parsed and delivered. MSI delivery with IR: the interrupt request (MSI with IR) is used to index the Interrupt Remapping Table, the matching Interrupt Remapping Table Entry (IRTE) is looked up, and the interrupt is delivered.]

IR with kernel-irqchip
● We want interrupts “as fast as before”.
● Current implementation:
– Leverage the existing GSI routing table in KVM
– Instead of translating “on the fly”, translate during setup
– Easy to implement (no KVM change required)
– Little performance impact (slow setup, fast delivery)
– Only supports “split|off” kernel irqchip, not “on”
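
"Translate during setup" can be sketched as follows: userspace resolves the remapped MSI once and stores the already-translated message in the routing table, so the delivery path never consults the remapping table. `ToyKvm` and `ToyQemu` are hypothetical stand-ins for this sketch and do not mirror the real KVM or QEMU interfaces; an MSI message is modelled as a `(dest_id, vector)` tuple.

```python
class ToyKvm:
    """Stands in for KVM's GSI routing table: gsi -> MSI message ready to inject."""

    def __init__(self):
        self.gsi_routes = {}

    def set_route(self, gsi, msi):
        self.gsi_routes[gsi] = msi

    def irqfd_signal(self, gsi):
        # Fast path: a plain table lookup, no remapping work at delivery time.
        return self.gsi_routes[gsi]


class ToyQemu:
    """Userspace side: owns the remapping table and does all translation."""

    def __init__(self, kvm, irt):
        self.kvm = kvm
        self.irt = irt        # IRTE index -> translated (dest_id, vector)
        self.gsi_to_irte = {}  # which IRTE each route was built from

    def setup_route(self, gsi, irte_index):
        # Slow setup: translate once, install the translated message.
        self.gsi_to_irte[gsi] = irte_index
        self.kvm.set_route(gsi, self.irt[irte_index])

    def irte_changed(self, irte_index, new_msi):
        # When the guest rewrites an IRTE, refresh every route built from it.
        self.irt[irte_index] = new_msi
        for gsi, index in self.gsi_to_irte.items():
            if index == irte_index:
                self.kvm.set_route(gsi, new_msi)


kvm = ToyKvm()
qemu = ToyQemu(kvm, irt={7: (0, 0x41)})
qemu.setup_route(gsi=24, irte_index=7)
print(kvm.irqfd_signal(24))  # prints (0, 65)

qemu.irte_changed(7, (1, 0x41))
print(kvm.irqfd_signal(24))  # prints (1, 65)
```

The cost is that every change to the remapping table must be propagated to the routes, which is why explicit cache invalidation matters for this design.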

Remap irqfd interrupts
● Fast IRQ path for vhost devices: without remapping
[Figure: vhost signals an event notifier to KVM, which performs IRQ injection into the guest using the GSI routing table; QEMU sets up the GSI routing table with the guest's MSI messages 1 to 4.]

Remap irqfd interrupts (cont.)
● Fast IRQ path for vhost devices: with remapping
[Figure: vhost signals an event notifier to KVM, which performs IRQ injection into the guest using the GSI routing table; QEMU sets up the GSI routing table with translated MSI messages 1 to 4.]

All in all...
● To boot a guest with DMAR and IR enabled:
(Possibly one extra flag to enable DMAR for the guest virtio driver)
qemu-system-x86_64 -M q35,accel=kvm,kernel-irqchip=split \
    -device intel-iommu,intremap=on \
    -netdev tap,id=tap1,script=no,downscript=no,vhost=on \
    -device virtio-net-pci,netdev=tap1,disable-modern=off,ats=on

Vhost + VIOMMU Performance
● For dynamic DMA mapping (e.g., using generic Linux kernel drivers):
– Performance dropped drastically
– TCP_STREAM: 24500 Mbps → 600 Mbps
– TCP_RR: 25000 trans/s → 11600 trans/s
● For static DMA mapping (e.g., a DPDK-based application like l2fwd):
– Around 5% performance drop in throughput (pktgen)
– Still more work TBD...
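
For comparison with the roughly 5% drop in the static case, the dynamic-mapping figures can be turned into relative drops with simple arithmetic on the numbers above. The helper name `drop_percent` is just for this sketch.

```python
def drop_percent(before: float, after: float) -> float:
    """Relative drop from `before` to `after`, in percent."""
    return (before - after) / before * 100


stream = drop_percent(24500, 600)   # TCP_STREAM, Mbps
rr = drop_percent(25000, 11600)     # TCP_RR, trans/s

print(f"TCP_STREAM drop: {stream:.1f}%")  # prints TCP_STREAM drop: 97.6%
print(f"TCP_RR drop: {rr:.1f}%")          # prints TCP_RR drop: 53.6%
```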

Current status & TBDs
● DMAR/IR upstream status:
– QEMU: IR merged (Peter Xu), DMAR still RFC (Jason Wang will post formal patch soon)
– Vhost & Virtio driver: merged (Michael S. Tsirkin/Jason Wang)
– DPDK: vhost-user IOTLB is being developed (Victor Kaplansky)
● TBDs
– Performance tuning for DMAR
– Quite a few enhancements for IR: explicit cache invalidations, better error handling, etc.

Thanks!

Appendix
Kernel-irqchip: a review
● Command line interface:
qemu-system-x86_64 -M q35,kernel-irqchip={on|off|split}
● Supported modes
Mode         IOAPIC          APIC
“ON”         In kernel       In kernel
“SPLIT”      In userspace    In kernel
(no mode)    In kernel       In userspace
“OFF”        In userspace    In userspace