Nested Virtualization Update from Intel
-
Upload
the-linux-foundation -
Category
Technology
-
view
7.269 -
download
5
description
Transcript of Nested Virtualization Update from Intel
Nested Virtualization Update From Intel
Xiantao Zhang, Eddie Dong
Intel Corporation
Legal Disclaimer
INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL® PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL’S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER, AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL® PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. INTEL PRODUCTS ARE NOT INTENDED FOR USE IN MEDICAL, LIFE SAVING, OR LIFE SUSTAINING APPLICATIONS.
Intel may make changes to specifications and product descriptions at any time, without notice.
All products, dates, and figures specified are preliminary based on current expectations, and are subject to change without notice.
Intel, processors, chipsets, and desktop boards may contain design defects or errors known as errata, which may cause the product to deviate from published specifications. Current characterized errata are available on request.
Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.
*Other names and brands may be claimed as the property of others.
Copyright © 2012 Intel Corporation.
Agenda
• Motivation and Goals
• History
−Nested VMX Architecture
−Previous status
• Latest status and new features
−Stability Enhancement
−Virtual EPT
−Virtual VT-d
• Preliminary Performance
• Call to Action
3
Motivation and Goals
4
• Why nested virtualization?
− Ordinary OS are adopting VMX now
−Windows 7 XP compatibility mode
−Windows 8 Hyper-V
− Other Commercial VMMs requires VMX for
better performance
− vmware vmm
− Anti-virus software depends on VMX
− McAfee Deep Defender
Guest
VMM
VMM
Guest
Hardware Platform with VMX Enabled
• What is the goal ?
− To make VMX-based system software run
smoothly in a Xen guest.
Agenda
• Motivation and Goals
• History
−Nested VMX Architecture
−Previous status
• Latest status and new features
−Stability Enhancements
−Virtual EPT
−Virtual VT-d
• Preliminary Performance
• Call to Action
5
Nested VMX Architecture
6
VMCSShadow VMCS
HVM Guest
Dom0
L0
L1vVMCS
Nested Guest L2
Virtual EPT
Virtual VT-d
Xen VMM
Sh
ad
ow
ing
VM entry/exit
Nested
VMM
Virtual VM entry/exit
Virtual VMX
VMX-Enabled Platform (VT-d, EPT etc.)
History
• Nested VMX update @ Xen Summit Asia (Nov. 2009)
− Nested VMX design is presented
− Showed Initial Status
−Nested guest can boot up to BIOS early stage with limitations
− single vCPU/single nested guest/ No vCPU migration
• Refined nested VMX support was pushed into upstream
− Support multiple nested guests
− Also includes supporing SMP nested guests
• However, experimental & preliminary support
− Very limited configurations can work
−“KVM on Xen”, Linux guest can successfully boot up
−“Xen on Xen” does not work
− No virtual VT-d, virtual EPT
7
Previous Status
− Only one combination can work
8
L0-VMM L1-VMM
L2 Guest OS
32Bit PAE OS 64Bit OS
RHEL6.0 RHEL5.4 Win7 RHEL6.0 Win7Win2012 Server
Ubuntu 12.04
Xen
Xen X X X X X X X
KVM X √ X X X X X
Agenda
• Motivation and Goals
• History
−Nested VMX Architecture
−Previous status
• Latest status and new features
−Stability Enhancements
−Virtual EPT
−Virtual VT-d
• Preliminary Performance
• Call to Action
9
Stability Enhancement
− Greatly enhanced stability, with several critical bugs fixed!
10
L0-VMM L1-VMM
L2 Guest OS
32Bit PAE OS 64Bit OS
RHEL6.0 RHEL5.4 Win7 RHEL6.3 Win7Win2012 Server
Ubuntu 12.04
Xen
Xen X X X X X X X
KVM X √ X X X X X
L0-VMM L1-VMM
L2 Guest OS(SMP)
32Bit PAE OS 64Bit OS
RHEL6.0 RHEL5.4 Win7 RHEL6.0 Win7Win2012 Server
Ubuntu 12.04
Xen
Xen √ √ √ √ √ √ √
KVM √ √ √ √ √ √ √
Performance Without Optimizations
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1
SpecJBB Unixbench Kernel Build CPU-INT
L2 Guest2(Xen on Xen)
L1 Guest
Native
Platform: SNB-EPOS: RHEL6.3 GuestMEM:4GB MemoryCPU: 2 VCPU
11
Agenda
• Motivation and Goals
• History
−Nested VMX Architecture
−Previous status
• Latest status and new features
−Stability Enhancements
−Virtual EPT
−Virtual VT-d
• Preliminary Performance
• Call to Action
12
Virtual EPT Architecture
L2 Guest
L0 VMM
L1 VMM
L0
L1
L2
vVMCS
L1 EPT L2-GPA-> L1 GPA
VMCS-L1L0 EPT
L1-GPA-> HPA
Switch to Shadow EPT @ virtual vmentry
sVMCS-L2
Sh
ad
ow
ing
L2-GPA->HPAShadow EPT
Sh
ad
ow
ing
13
VM entry/exit
Virtual VM entry/exit
Virtual EPT: Using EPT Shadowing
• No write-protection to L1-EPT (Guest EPT paging structure)
− Flexibility is good.
• Trap-and-emulate guest’s INVEPT
− Update the shadow EPT entries
• Better SMP Scalability
− No global lock is required
• Requires page-level INVEPT
− Individual address invalidation
14
Enhanced INVEPT Instruction for Virtual EPT
• INVEPT limitations
− No Individual address invalidation
−Only single context and all context invalidation
• Little performance impact, however, hurt nested performance sharply!
−Has to drop shadow EPT table for L1’s each INVEPT(with single context)
• Performance loss if frequent INVEPT in VMM
• For example, KVM
• Enhance it in Software Way
− Add Individual address invalidation for virtual EPT
−Expose it to nested VMM through PV approach
− Need to enhance VMMs
−Easy implementation for Xen and VMM
• Benefits
− Reduce frequent shadow EPT paging structure flush
15
Performance Evaluation For Virtual EPT
0
1
2
3
4
5
6
7
Kernel Build SpecJBB Netperf CPU-INT
w/o virtual EPT
with virtual EPT
16
Agenda
• Motivation and Goals
• History
−Nested VMX Architecture
−Previous status
• Latest status and new features
−Stability Enhancements
−Virtual EPT
−Virtual VT-d
• Preliminary Performance
• Call to Action
17
Virtual VT-d: Expose VT-d Capability to L1VMM
• I/O performance for L2 guest is very slow
− Due to extremely long device emulation path through all the way to L1 & L0 VMMs
• How to fix that?
− Present virtual VT-d engine to L1 VMM
− So, device can be directly assigned to L2 guest
−High I/O performance, because of minimum VMM intervention.
• Must-to-have features in Virtual VT-d
− DMA Remapping & Queue Invalidation: Exposed
− Interrupt remapping: Not Exposed
18
Virtual VT-d Architecture
Dev Q: Qemu device
Dev P: PT device
Xen
Bus 0
Bus N
Dev M FunN
Hardware VT-d
Hypervisor
Shadow VT-d page table
Qemu
HypercallHypercall
IO TLB
Device ATC
Domain 0
Bus 0 DevQ,Funy
DevP,Funn
Guest view of VT-d
VT-d page tableNested VMM
HVM
loader
Shadow
ing
DMA
Engine
vDMAR
Virtual VT-d
vQI
19
L0
L1
L2Nested Guest
VMM
Two types of guest devices
• Pass through device
−DMA (IOVA->GPA) is handled by hardware VT-d engine
−Remap guest root/context structure
−Use physical remapping table to emulate guest remapping table
− IOVA -> L0 HPA, + audit (use a dummy page for Out of Bound gpn)
− Maybe cached by IOTLB and ATC
−IOTLB/Context Cache Synchronization
−Track guest invalidation of IOTLB
− Invalidate physical IOTLB, and may invalidate ATC as well if the device has ATC
−Track guest invalidation of Context Cache
• Qemu device
−DMA (IOVA->PA remapping) is emulated by Qemu
−2 Options: Caching the remapping table, or No-Caching
−Starting from simple solution: No caching
−Qemu device is already slow
20
Performance Evaluation of virtual VT-d
0
100
200
300
400
500
600
700
800
900
1000
1100
TCP Send TCP Receive UDP Send UDP Receive
Bandwidth of Nested Guest Ideal Bandwidth
21
Iperf testing with the assigned NIC to nested Guest
Mb/s
Bandwidth is good enough!
Latency Evaluation of virtual VT-d
0
20
40
60
80
100
120
Native Nested Guest
Latency
Latency
ms
22
Still have room to tune Latency
Agenda
• Motivation and Goals
• History
−Nested VMX Architecture
−Previous status
• Latest status and new features
−Stability Enhancements
−Virtual EPT
−Virtual VT-d
• Preliminary Performance
• Call to Action
23
Preliminary Performance
0
0.2
0.4
0.6
0.8
1
1.2
L1 Guest
L2 Guest
24
Based on Xen #25467
Platform: SNB-EPOS: RHEL5.4 GuestMEM:2GB MemoryCPU: 2 VCPU
Agenda
• Motivation and Goals
• History
−Nested VMX Architecture
−Previous status
• Latest status and new features
−Stability Enhancements
−Virtual EPT
−Virtual VT-d
• Preliminary Performance
• Call to Action
25
Call to Action
• Support more L1 VMMs
− McAfee Deep Defender
− VMware VMM
− Hyper-V
− Virtual Box
• Virtual APIC-V
− New Features for Interrupt/APIC Virtualization are coming
− For more information, please come to Nakajima Jun’s talk “Intel Update” this
afternoon.
− Improve interrupt virtualization efficiency for both L1 and L2
• Performance Tuning
26
Reference
• Nested Virtualization on Xen
− Qing He:
− Xen Summit 2009: http://xen.org/xensummit/xensummit_fall_2009.html
• Virtual APIC-V
− Jun Nakajima: Intel Update
− Xen Summit 2012: http://www.xen.org/xensummit/xs12na_talks/T10.html
27