Nested Virtualization Update from Intel

Nested Virtualization Update From Intel Xiantao Zhang, Eddie Dong Intel Corporation

description

Nested virtualization is becoming popular and is required to support multiple emerging usage models such as XenClient, McAfee Deep Safe, and Hyper-V. After enabling nested VMX support for Xen, we have been working on improving its quality and performance so that it runs well with these new usages. In this session, we would like to give an update on the current status of nested virtualization support in Xen, and also demonstrate what we are doing with new hardware-assisted nested virtualization in this area.

Transcript of Nested Virtualization Update from Intel

Page 1: Nested Virtualization Update from Intel

Nested Virtualization Update From Intel

Xiantao Zhang, Eddie Dong

Intel Corporation

Page 2: Nested Virtualization Update from Intel

Legal Disclaimer

INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL® PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL’S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER, AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL® PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. INTEL PRODUCTS ARE NOT INTENDED FOR USE IN MEDICAL, LIFE SAVING, OR LIFE SUSTAINING APPLICATIONS.

Intel may make changes to specifications and product descriptions at any time, without notice.

All products, dates, and figures specified are preliminary based on current expectations, and are subject to change without notice.

Intel, processors, chipsets, and desktop boards may contain design defects or errors known as errata, which may cause the product to deviate from published specifications. Current characterized errata are available on request.

Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.

*Other names and brands may be claimed as the property of others.

Copyright © 2012 Intel Corporation.

Page 3: Nested Virtualization Update from Intel

Agenda

• Motivation and Goals

• History

−Nested VMX Architecture

−Previous status

• Latest status and new features

−Stability Enhancement

−Virtual EPT

−Virtual VT-d

• Preliminary Performance

• Call to Action

Page 4: Nested Virtualization Update from Intel

Motivation and Goals

• Why nested virtualization?

− Ordinary OSes are adopting VMX now

−Windows 7 XP compatibility mode

−Windows 8 Hyper-V

− Other commercial VMMs require VMX for better performance

−VMware VMM

− Anti-virus software depends on VMX

−McAfee Deep Defender

[Figure: a guest on top of a VMM, which itself runs as a guest on top of another VMM, on a hardware platform with VMX enabled]

• What is the goal?

− To make VMX-based system software run smoothly in a Xen guest.

Page 6: Nested Virtualization Update from Intel

Nested VMX Architecture

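[Figure: Nested VMX architecture. L0: Xen VMM with Dom0, on a VMX-enabled platform (VT-d, EPT, etc.), holding the VMCS and shadow VMCS. L1: HVM guest running the nested VMM with its vVMCS, on top of virtual VMX, virtual EPT, and virtual VT-d. L2: nested guest. Virtual VM entry/exit at L1 is mapped onto hardware VM entry/exit by shadowing.]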

Page 7: Nested Virtualization Update from Intel

History

• Nested VMX update @ Xen Summit Asia (Nov. 2009)

− Nested VMX design was presented

− Showed initial status

−Nested guest could boot up to the early BIOS stage, with limitations

− Single vCPU / single nested guest / no vCPU migration

• Refined nested VMX support was pushed upstream

− Supports multiple nested guests

− Also supports SMP nested guests

• However, the support was experimental and preliminary

− Only very limited configurations worked

−“KVM on Xen”: a Linux guest could boot successfully

−“Xen on Xen” did not work

− No virtual VT-d or virtual EPT

Page 8: Nested Virtualization Update from Intel

Previous Status

− Only one combination can work

L2 Guest OS:

                  32-bit PAE OS                64-bit OS
L0-VMM  L1-VMM    RHEL6.0  RHEL5.4  Win7       RHEL6.0  Win7  Win2012 Server  Ubuntu 12.04
Xen     Xen       X        X        X          X        X     X               X
Xen     KVM       X        √        X          X        X     X               X

Page 10: Nested Virtualization Update from Intel

Stability Enhancement

− Greatly enhanced stability, with several critical bugs fixed!

L2 Guest OS:

                  32-bit PAE OS                64-bit OS
L0-VMM  L1-VMM    RHEL6.0  RHEL5.4  Win7       RHEL6.3  Win7  Win2012 Server  Ubuntu 12.04
Xen     Xen       X        X        X          X        X     X               X
Xen     KVM       X        √        X          X        X     X               X

L2 Guest OS (SMP):

                  32-bit PAE OS                64-bit OS
L0-VMM  L1-VMM    RHEL6.0  RHEL5.4  Win7       RHEL6.0  Win7  Win2012 Server  Ubuntu 12.04
Xen     Xen       √        √        √          √        √     √               √
Xen     KVM       √        √        √          √        √     √               √

Page 11: Nested Virtualization Update from Intel

Performance Without Optimizations

[Chart: bars for L2 Guest (Xen on Xen), L1 Guest, and Native on SpecJBB, Unixbench, Kernel Build, and CPU-INT; y-axis from 0 to 1]

Platform: SNB-EP; OS: RHEL6.3; Guest memory: 4GB; CPU: 2 vCPUs

Page 13: Nested Virtualization Update from Intel

Virtual EPT Architecture

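[Figure: Virtual EPT architecture across L0 (L0 VMM), L1 (L1 VMM), and L2 (L2 guest). L1 EPT: L2-GPA -> L1-GPA, associated with the vVMCS. L0 EPT: L1-GPA -> HPA, associated with VMCS-L1. Shadow EPT: L2-GPA -> HPA, associated with sVMCS-L2 and built by shadowing. Switch to the shadow EPT @ virtual vmentry. Virtual VM entry/exit is carried over hardware VM entry/exit.]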

Page 14: Nested Virtualization Update from Intel

Virtual EPT: Using EPT Shadowing

• No write protection on the L1 EPT (guest EPT paging structures)

− Flexibility is good.

• Trap-and-emulate guest’s INVEPT

− Update the shadow EPT entries

• Better SMP Scalability

− No global lock is required

• Requires page-level INVEPT

− Individual address invalidation
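
The shadowing scheme on this slide can be illustrated with a toy model. This is a minimal sketch, not Xen's implementation: page tables are modeled as Python dicts keyed by frame number, and the names `ShadowEPT`, `handle_violation`, and `handle_invept` are hypothetical. It shows the shadow EPT being filled lazily by composing the L1 EPT (L2-GPA -> L1-GPA) with the L0 EPT (L1-GPA -> HPA), and stale entries being dropped only when L1 executes INVEPT, because L1's EPT is not write-protected.

```python
class ShadowEPT:
    """Toy shadow EPT: maps L2 guest frames directly to host frames."""

    def __init__(self, l1_ept, l0_ept):
        self.l1_ept = l1_ept   # L2-GPA frame -> L1-GPA frame (owned by L1 VMM)
        self.l0_ept = l0_ept   # L1-GPA frame -> HPA frame (owned by L0 VMM)
        self.shadow = {}       # L2-GPA frame -> HPA frame (what hardware walks)

    def handle_violation(self, l2_gfn):
        """Fill one shadow entry on an EPT violation taken while L2 runs."""
        l1_gfn = self.l1_ept.get(l2_gfn)
        if l1_gfn is None:
            return None        # L1 has no mapping: the fault belongs to L1
        hfn = self.l0_ept.get(l1_gfn)
        if hfn is None:
            return None        # L0 has no backing page in this toy model
        self.shadow[l2_gfn] = hfn
        return hfn

    def translate(self, l2_gfn):
        """Hit in the shadow table, or fault and fill."""
        if l2_gfn in self.shadow:
            return self.shadow[l2_gfn]
        return self.handle_violation(l2_gfn)

    def handle_invept(self):
        """L1 executed INVEPT (trapped by L0): drop all shadow entries."""
        self.shadow.clear()


l1_ept = {0: 10, 1: 11}
l0_ept = {10: 100, 11: 101, 12: 102}
sept = ShadowEPT(l1_ept, l0_ept)

first = sept.translate(0)   # filled lazily from both tables
l1_ept[0] = 12              # L1 edits its EPT; nothing traps
stale = sept.translate(0)   # shadow still holds the old mapping
sept.handle_invept()        # L1 then issues INVEPT
fresh = sept.translate(0)   # refilled from the updated L1 EPT
```

In this model the stale entry is harmless only because L1 is expected to issue INVEPT after changing its EPT, which is the same contract hardware imposes on any VMM.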

Page 15: Nested Virtualization Update from Intel

Enhanced INVEPT Instruction for Virtual EPT

• INVEPT limitations

− No individual-address invalidation

−Only single-context and all-context invalidation

• Little performance impact in the non-nested case, but it hurts nested performance sharply!

−The shadow EPT table has to be dropped on each single-context INVEPT from L1

• Performance is lost if the VMM issues INVEPT frequently

• For example, KVM

• Enhance it in software

− Add individual-address invalidation for virtual EPT

−Expose it to the nested VMM through a PV approach

− Need to enhance VMMs

−Easy to implement for Xen and VMM

• Benefits

− Reduces frequent flushes of the shadow EPT paging structures
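
To see why individual-address invalidation helps, the toy comparison below counts how many shadow entries are lost after L1 changes a single mapping. It is a sketch under stated assumptions: the shadow EPT is a plain dict, and `invept_single_context` and `pv_invept_address` are hypothetical names standing in for the architectural single-context INVEPT and the proposed PV interface, whose actual form the slides do not specify.

```python
def invept_single_context(shadow):
    """Architectural INVEPT: L0 cannot tell what changed, so drop everything."""
    dropped = len(shadow)
    shadow.clear()
    return dropped


def pv_invept_address(shadow, l2_gfn):
    """Hypothetical PV call: L1 names the one frame whose mapping changed."""
    return 1 if shadow.pop(l2_gfn, None) is not None else 0


def build_shadow(n):
    """A shadow table with n populated entries (toy data)."""
    return {gfn: gfn + 1000 for gfn in range(n)}


full = build_shadow(512)
refills_full_flush = invept_single_context(full)   # every entry must be refilled

partial = build_shadow(512)
refills_pv = pv_invept_address(partial, 7)          # only one entry is refilled

print(refills_full_flush, refills_pv, len(partial))
```

Each dropped entry costs a later EPT violation and refill while L2 runs, so the gap between the two counts is a rough proxy for the nested overhead a frequently invalidating L1 VMM would see.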

Page 16: Nested Virtualization Update from Intel

Performance Evaluation For Virtual EPT

[Chart: Kernel Build, SpecJBB, Netperf, and CPU-INT, without vs. with virtual EPT; y-axis from 0 to 7]

Page 18: Nested Virtualization Update from Intel

Virtual VT-d: Expose VT-d Capability to L1 VMM

• I/O performance for an L2 guest is very slow

− Due to the extremely long device emulation path, which goes all the way through the L1 and L0 VMMs

• How to fix that?

− Present a virtual VT-d engine to the L1 VMM

− Devices can then be directly assigned to the L2 guest

−High I/O performance, because of minimal VMM intervention

• Must-have features in virtual VT-d

− DMA remapping & queued invalidation: exposed

− Interrupt remapping: not exposed
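
One way to picture the exposed feature set is as a filter on the VT-d Extended Capability register that L1 reads. The sketch below is illustrative only: it assumes the ECAP bit layout with queued invalidation at bit 1 and interrupt remapping at bit 3 (worth checking against the VT-d specification), and `virtual_ecap` is a hypothetical helper, not a Xen function.

```python
ECAP_QI = 1 << 1   # queued invalidation support (assumed bit position)
ECAP_IR = 1 << 3   # interrupt remapping support (assumed bit position)

# Extended capabilities the virtual VT-d chooses to expose to the L1 VMM.
EXPOSED_ECAP = ECAP_QI


def virtual_ecap(hw_ecap):
    """Keep only the extended capabilities the virtual VT-d exposes."""
    return hw_ecap & EXPOSED_ECAP


hw = ECAP_QI | ECAP_IR          # toy hardware value with both features
v = virtual_ecap(hw)
has_qi = bool(v & ECAP_QI)      # L1 sees queued invalidation
has_ir = bool(v & ECAP_IR)      # L1 does not see interrupt remapping
```

DMA remapping itself is the base function of the unit, so it needs no capability bit in this sketch.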

Page 19: Nested Virtualization Update from Intel

Virtual VT-d Architecture

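[Figure: Virtual VT-d architecture. L0: Xen hypervisor holding the shadow VT-d page table, hardware VT-d with its IO TLB, and Domain 0 running Qemu, which reaches the hypervisor through hypercalls; a physical device on Bus N (Dev M, Fun N) with a device ATC and DMA engine. L1: nested VMM in an HVM guest (with HVM loader), holding the guest view of VT-d (VT-d page table) provided by the virtual VT-d (vDMAR, vQI); guest Bus 0 carries Dev Q (Fun y) and Dev P (Fun n). L2: nested guest. Shadowing links the guest VT-d page table to the shadow VT-d page table.]

Dev Q: Qemu device

Dev P: PT device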

Page 20: Nested Virtualization Update from Intel

Two types of guest devices

• Pass-through device

−DMA (IOVA->GPA) is handled by the hardware VT-d engine

−Remap guest root/context structure

−Use the physical remapping table to emulate the guest remapping table

− IOVA -> L0 HPA, plus audit (use a dummy page for an out-of-bound gpn)

− May be cached by the IOTLB and ATC

−IOTLB/context cache synchronization

−Track guest invalidation of the IOTLB

− Invalidate the physical IOTLB, and also the ATC if the device has one

−Track guest invalidation of the context cache

• Qemu device

−DMA (IOVA->PA remapping) is emulated by Qemu

−Two options: caching the remapping table, or no caching

−Start with the simple solution: no caching

−A Qemu device is already slow
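
The two device paths above can be sketched with dict-based tables. This is a toy model under stated assumptions, not the Xen or Qemu code: `ShadowVtd`, `shadow_dma_entry`, `qemu_dma_translate`, and the `DUMMY_HFN` value are all hypothetical. For a pass-through device, the physical remapping table is built by composing the guest's IOVA -> GPA table with L0's GPA -> HPA mapping, auditing out-of-bound frames to a dummy page, and is resynchronized when the guest invalidates its IOTLB. For a Qemu device, the no-caching option simply walks the guest table on every DMA.

```python
DUMMY_HFN = 0xDEAD  # hypothetical host frame backing out-of-bound accesses


def shadow_dma_entry(iova_fn, guest_vtd, p2m, dummy=DUMMY_HFN):
    """Compose IOVA -> GPA (guest VT-d table) with GPA -> HPA (L0 mapping)."""
    gfn = guest_vtd.get(iova_fn)
    if gfn is None:
        return None                # guest has no mapping: DMA would fault
    return p2m.get(gfn, dummy)     # audit: out-of-bound gpn -> dummy page


class ShadowVtd:
    """Toy physical remapping table emulating the guest remapping table."""

    def __init__(self, guest_vtd, p2m):
        self.guest_vtd = guest_vtd
        self.p2m = p2m
        self.table = {}            # IOVA frame -> HPA frame, used by hardware

    def sync(self, iova_fn):
        hfn = shadow_dma_entry(iova_fn, self.guest_vtd, self.p2m)
        if hfn is None:
            self.table.pop(iova_fn, None)
        else:
            self.table[iova_fn] = hfn

    def on_guest_iotlb_invalidate(self, iova_fn):
        """Guest invalidated an IOTLB entry: resync the shadow entry.
        A real VMM would also invalidate the physical IOTLB and device ATC."""
        self.sync(iova_fn)


def qemu_dma_translate(iova_fn, guest_vtd):
    """Qemu device, no-caching option: walk the guest table on every DMA."""
    return guest_vtd.get(iova_fn)


guest_vtd = {0: 5, 1: 999}         # frame 999 lies outside the guest's memory
p2m = {5: 50, 6: 60}
svtd = ShadowVtd(guest_vtd, p2m)
svtd.sync(0)
svtd.sync(1)
guest_vtd[0] = 6                   # guest remaps IOVA frame 0 ...
svtd.on_guest_iotlb_invalidate(0)  # ... and invalidates its IOTLB
```

Starting without a cache for Qemu devices keeps the emulation trivially coherent with the guest table, at the cost of a walk per DMA, which matters little given that the emulated device path is already slow.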

Page 21: Nested Virtualization Update from Intel

Performance Evaluation of virtual VT-d

[Chart: Iperf testing with the NIC assigned to the nested guest. Bandwidth in Mb/s (y-axis from 0 to 1100) for TCP Send, TCP Receive, UDP Send, and UDP Receive; series: bandwidth of nested guest vs. ideal bandwidth]

Bandwidth is good enough!

Page 22: Nested Virtualization Update from Intel

Latency Evaluation of virtual VT-d

[Chart: latency in ms (y-axis from 0 to 120), Native vs. Nested Guest]

There is still room to tune latency.

Page 24: Nested Virtualization Update from Intel

Preliminary Performance

[Chart: L1 Guest vs. L2 Guest; y-axis from 0 to 1.2]

Based on Xen #25467

Platform: SNB-EP; OS: RHEL5.4; Guest memory: 2GB; CPU: 2 vCPUs

Page 26: Nested Virtualization Update from Intel

Call to Action

• Support more L1 VMMs

− McAfee Deep Defender

− VMware VMM

− Hyper-V

− VirtualBox

• Virtual APIC-V

− New features for interrupt/APIC virtualization are coming

− For more information, please come to Nakajima Jun’s talk “Intel Update” this afternoon.

− Improve interrupt virtualization efficiency for both L1 and L2

• Performance tuning

Page 27: Nested Virtualization Update from Intel

Reference

• Nested Virtualization on Xen

− Qing He:

− Xen Summit 2009: http://xen.org/xensummit/xensummit_fall_2009.html

• Virtual APIC-V

− Jun Nakajima: Intel Update

− Xen Summit 2012: http://www.xen.org/xensummit/xs12na_talks/T10.html

Page 28: Nested Virtualization Update from Intel

Questions?

• Or contact [email protected]
