Computer Architecture course lecture on Virtualization and ...
Virtualization Susanta K. Nanda ECSL CSE 502, Fall’05
1. Virtualization: Susanta K. Nanda, ECSL, CSE 502, Fall’05
2. Virtualization at the Hardware Level
- Observation
  - Hardware resources are typically under-utilized
  - Hardware resources directly relate to cost
- Goal: Improve hardware utilization
- How?
  - Share hardware resources across multiple machines
  - May make sense for network-attached storage, but what about processor, memory, etc.?
- Theme
  - Decouple the machine from the hardware
- Virtual Machine (VM)
  - A machine decoupled from the hardware, i.e. one that does not necessarily correspond to the hardware
  - Multiple virtual machines on the same physical host can share the underlying hardware
  - First VM: IBM System/360 Model 40 VM [1965]
3. Virtual Machine Monitor (VMM)
- A thin layer of software on top of the bare machine that facilitates virtualization of hardware resources
- Mediates between VMs and the hardware
- Manages VMs
  - Create, Destroy, Power Off/On, etc.
- Concerns
  - Isomorphism: State transitions must be isomorphic to those of a physical machine
  - Isolation: Each VM is isolated from all others
  - Performance: Close to native
  - Correctness: Exactly the same hardware interface is presented to the guest OS, so commodity OSes run without any modification
4. A Stolen Picture
5. VM: Additional Advantages
- Non-existent hardware
  - Virtual devices provided through emulation, using a combination of software and other available devices
  - Example: a SCSI disk emulated on IDE disks, a (virtual) timer
  - Use: Legacy systems/software
- Hides heterogeneity of the underlying hardware
  - Ability to switch hardware vendors
- Mobility
  - Decoupling helps move a VM from one physical host to another, just like a file
  - Use: Server consolidation, hardware maintenance, etc.
- OS debugging, mixed OS, event monitoring, execution undo, and many more
6. Key Concepts: Appearance
- A VM consists of shared and dedicated hardware
  - Shared: Disk, Memory, NIC, CPU, Printer, etc.
  - Dedicated: Keyboard, Mouse, Display, Speakers, CD drive, etc.
  - A server VM may not require some of the dedicated devices
- Dedicated hardware
  - Per user
  - Sharable across multiple VMs if they belong to the same user
7. Key Concepts: State Management
- Each VM has its own architected state information
  - Example: registers/memory/disks, page table/TLB
- It is not always possible to map every piece of architected state to its natural level in the host
  - Insufficient/unavailable host resources
  - Example: Registers of a VM may be architected using main memory in the host
- VMs keep getting switched in/out by the VMM
  - Isomorphism requires all state transitions to be performed on the VM states
  - Performance requires efficient state management
- State Management: Indirection vs. Copying
8. Key Concepts: State Management (contd.)
- Indirection
  - Hold the state of each VM in fixed locations in the host's memory hierarchy
  - A pointer managed by the VMM indicates the guest state that is currently active
  - Example: A register block maintained in memory, and a processor register pointing to the register block of the currently active VM
  - Pros: Ease of management
  - Cons: Inefficient (mov eax, ebx requires 2 instructions)
- Copying
  - Copy the VM's state information to its natural level in the memory hierarchy when the VM is switched in
  - Copy it back to the original place when the VM is switched out
  - Example: Copy all the VM registers to the processor registers
  - Pros: Efficient (most instructions are executed natively)
  - Cons: Copying overhead
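The two strategies on this slide can be contrasted with a small simulation. This is only a sketch: the class names, the four-register set, and the dictionary standing in for the processor registers are assumptions made for illustration, not part of the lecture.

```python
from dataclasses import dataclass, field

REGS = ("eax", "ebx", "ecx", "edx")  # hypothetical, reduced register set


@dataclass
class VMState:
    """Architected register state of one guest, held in host memory."""
    regs: dict = field(default_factory=lambda: {r: 0 for r in REGS})


class IndirectionVMM:
    """Guest registers stay in memory; a pointer selects the active block."""

    def __init__(self, vms):
        self.vms = vms
        self.active = None  # plays the role of the register-block pointer

    def switch_in(self, name):
        self.active = self.vms[name]  # cheap: only the pointer changes

    def mov(self, dst, src):
        # One guest instruction becomes two memory accesses (load, then store).
        value = self.active.regs[src]
        self.active.regs[dst] = value


class CopyingVMM:
    """Guest registers are copied into the (simulated) processor registers."""

    def __init__(self, vms):
        self.vms = vms
        self.cpu = {r: 0 for r in REGS}
        self.current = None

    def switch_in(self, name):
        if self.current is not None:
            self.vms[self.current].regs = dict(self.cpu)  # copy out old VM
        self.cpu = dict(self.vms[name].regs)              # copy in new VM
        self.current = name

    def mov(self, dst, src):
        self.cpu[dst] = self.cpu[src]  # runs directly on processor registers
```

With indirection a switch is a single pointer update but every guest instruction pays for memory accesses; with copying the instruction is cheap and the cost moves to `switch_in`.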
9. Key Concepts: Resource Control
- The VMM must maintain overall control of the hardware resources
  - Hardware resources are assigned to VMs when they are created/executed
  - The VMM needs a way to get them back when they need to be assigned to a different VM
  - Similar to multi-programming in an OS
- Privileged Resources
  - Certain resources are accessible only to, and managed by, the VMM
  - Interrupts relating to such resources must then be handled by the VMM
  - Privileged resources are emulated by the VMM for the VM
  - Example: interval timer
- All resources that could help maintain control are marked privileged
  - The interval timer is used to decide VM scheduling
  - The page table base register (CR3 on x86) is used to isolate VM memory
- Issues: VM scheduling (an ideally fair scheduling may not be good)
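As a rough illustration of how a privileged interval timer lets the VMM regain control, the sketch below models a round-robin scheduler driven by timer ticks. The class name, the fixed quantum, and the tick-based loop are assumptions; the slide does not prescribe a scheduling policy.

```python
from collections import deque


class TimerVMM:
    """Toy VMM that owns the interval timer and time-slices its VMs."""

    def __init__(self, vm_names, quantum):
        self.run_queue = deque(vm_names)
        self.quantum = quantum  # ticks a VM may run before the timer fires
        self.trace = []         # which VM ran during each tick

    def run(self, total_ticks):
        ticks_left = self.quantum
        for _ in range(total_ticks):
            self.trace.append(self.run_queue[0])  # VM at the head is running
            ticks_left -= 1
            if ticks_left == 0:
                # Timer interrupt: control returns to the VMM, which
                # schedules the next VM and re-arms the timer.
                self.run_queue.rotate(-1)
                ticks_left = self.quantum
        return self.trace
```

Because guests cannot program the real timer themselves, no VM can extend its own quantum; a guest's view of the timer would have to be emulated separately, which this sketch leaves out.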
10. Key Concepts: Native/Hosted VMs
- Native VMs
  - VMM is installed on the bare machine, no host OS
  - All other VMs are then created through the VMM
  - Pros: Clean architecture, efficient
  - Cons: Complicated VMM due to device drivers
  - Example: VMware ESX Server
- Hosted VMs
  - VMM is installed on top of a host OS
  - User-mode: VMM runs in non-privileged mode
  - Dual-mode: VMM runs partly in privileged mode (as a driver on the host OS) and partly in unprivileged mode (like an application)
  - Pros: VMM uses drivers in the host OS for I/O, resulting in a thin VMM
  - Cons: Inefficient for I/O-intensive applications
  - Example: Microsoft Virtual Server
11. Processor Virtualization
- Privilege Levels/Rings
  - System/User mode
- System ISA vs. User ISA
- Emulation
  - Guest ISA may differ from Host ISA
  - Binary translation
  - Slower
- Native Execution
  - Guest and Host ISA must be the same
  - Some critical instructions may still need to be emulated
  - Issues: Complexity of discovering and emulating critical instructions efficiently
12. ISA Virtualizability
- Privileged Instructions (PI)
  - Instructions that generate a trap when executed at any but the most-privileged level
  - Example: LIDT (load interrupt descriptor table)
- Sensitive Instructions (SI)
  - Instructions whose behavior depends on the current privilege level
  - Example: POPF (pops the stack to EFLAGS)
    - In user mode, the Interrupt Enable bit of the EFLAGS register is not over-written
    - In system mode, the value is blindly copied
- Popek/Goldberg Theorem
  - For any conventional third-generation computer, a virtual machine monitor may be constructed if the set of sensitive instructions for that computer is a subset of the set of privileged instructions.
  - In other words, an ISA is virtualizable if SI is a subset of PI (the theorem states a sufficient condition)
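The subset condition, and the reason POPF violates it, can be sketched in a few lines. The two-instruction sets below are toy data, and `popf` follows the slide's simplified user/system description rather than the full x86 rules; all function names are assumptions.

```python
IF_BIT = 1 << 9  # Interrupt Enable flag is bit 9 of EFLAGS


def satisfies_popek_goldberg(sensitive, privileged):
    """True when every sensitive instruction is also privileged."""
    return set(sensitive) <= set(privileged)


def critical_instructions(sensitive, privileged):
    """Sensitive instructions that do not trap, so the VMM cannot see them."""
    return set(sensitive) - set(privileged)


def popf(eflags, popped, cpl):
    """Simplified POPF: returns the new EFLAGS value.

    cpl == 0 models system mode, any other value models user mode.
    """
    if cpl == 0:
        return popped                              # value is blindly copied
    return (popped & ~IF_BIT) | (eflags & IF_BIT)  # IF silently preserved
```

For example, with `sensitive = {"LIDT", "POPF"}` and `privileged = {"LIDT"}`, the check fails and `critical_instructions` returns `{"POPF"}`: in user mode POPF neither traps nor updates the flag, so a deprivileged guest kernel gets different behavior without the VMM being notified.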
13. When the ISA Is Not Virtualizable
- All is not lost if an ISA violates the Popek/Goldberg theorem
  - However, it brings additional complication and inefficiency to the VMM implementation
- Critical instructions:
  - Instructions that are sensitive but not privileged
  - x86 has 17 critical instructions
  - All critical instructions must be emulated by the VMM
- VMM Components
  - Binary Scanner: Inspects the code and inserts a trap at each critical instruction
  - Dispatcher: Gets control when a trap occurs
  - Allocator: Allocates machine resources (e.g. loads the relocation-bounds register)
  - Interpreters: Each interpreter interprets one privileged instruction
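How the scanner, dispatcher, and interpreters fit together might look roughly like the sketch below. Instructions are modeled as hypothetical `(opcode, operand)` tuples, only POPF is treated as critical, and the allocator is omitted; none of these names come from the lecture.

```python
CRITICAL = {"POPF"}  # toy set; the slide counts 17 such instructions on x86


def scan(code):
    """Binary scanner: wrap each critical instruction in an explicit trap."""
    return [("TRAP", ins) if ins[0] in CRITICAL else ins for ins in code]


class ToyVMM:
    def __init__(self):
        self.virtual_if = 1   # guest's view of the Interrupt Enable flag
        self.trap_count = 0
        # One interpreter routine per emulated instruction.
        self.interpreters = {"POPF": self.emulate_popf}

    def emulate_popf(self, operand):
        # Update the virtual flag instead of the real EFLAGS register.
        self.virtual_if = (operand >> 9) & 1

    def dispatch(self, ins):
        """Dispatcher: gets control on a trap and picks the interpreter."""
        self.trap_count += 1
        opcode, operand = ins
        self.interpreters[opcode](operand)

    def run(self, code):
        for ins in scan(code):
            if ins[0] == "TRAP":
                self.dispatch(ins[1])
            # Non-critical instructions would execute natively (no-op here).
```

Running `[("ADD", 1), ("POPF", 0x000), ("NOP", None)]` through `run` traps exactly once and clears `virtual_if`, while the other two instructions never reach the VMM.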
14. Memory Virtualization
- Virtual memory support in traditional architectures
  - Architected TLB vs. Architected Page Table
  - Page fault and swap
  - One level of indirection: Page Table
- A VMM requires two levels of indirection
  - Virtual Memory to Real Memory: Page Table (Guest OS)
  - Real Memory to Physical Memory: Real Map Table (VMM)
- Architected Page Table
  - Additional Data Structures
    - Real Map Table (VMM)
    - Shadow Page Tables (VMM): Used by the hardware for address translation; they map virtual addresses directly to physical (not real) addresses
  - Maintenance:
    - The VMM intercepts and emulates the guest OS's modifications to page tables and to the page table base register
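The relationship between the guest page table, the real map table, and the shadow page table can be sketched as a composition of two mappings. The class below is a minimal model with single-level tables stored as dictionaries; the method names and the 4096-byte page size are assumptions.

```python
class ShadowPaging:
    """Toy model of shadow page table maintenance by a VMM."""

    def __init__(self, real_map):
        self.real_map = dict(real_map)  # real page -> physical page (VMM)
        self.guest_pt = {}              # virtual page -> real page (guest OS)
        self.shadow_pt = {}             # virtual page -> physical page (hardware)

    def guest_map(self, vpage, rpage):
        """A guest page-table write, intercepted and emulated by the VMM."""
        self.guest_pt[vpage] = rpage
        # Keep the shadow entry equal to real_map applied to guest_pt.
        self.shadow_pt[vpage] = self.real_map[rpage]

    def guest_unmap(self, vpage):
        self.guest_pt.pop(vpage, None)
        self.shadow_pt.pop(vpage, None)

    def translate(self, vaddr, page_size=4096):
        """What the hardware does: a single lookup in the shadow table."""
        vpage, offset = divmod(vaddr, page_size)
        return self.shadow_pt[vpage] * page_size + offset
```

The guest only ever sees real page numbers in `guest_pt`; the hardware only ever walks `shadow_pt`, so the second level of indirection costs nothing on the translation path and is paid for when the guest modifies its tables.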
15. Memory Virtualization (contd.)
- Architected TLB
  - Virtual TLB: maintained by the guest OS
    - Virtual ASID, Virtual Page, Real Page
  - Real TLB: maintained by the VMM
    - Real ASID, Virtual Page, Physical Page
  - ASID map table
    - Virtual ASID, Real ASID
  - The VMM intercepts/emulates all modifications to the TLB by the guest OS
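The three tables on this slide could be modeled as follows. This is a sketch under several assumptions: tables are plain dictionaries, real ASIDs are handed out sequentially, eviction is ignored, and the real map is keyed by a hypothetical `(vm, real page)` pair.

```python
class VirtualizedTLB:
    """Toy model of a software-managed (architected) TLB under a VMM."""

    def __init__(self, real_map):
        self.real_map = dict(real_map)  # (vm, real page) -> physical page
        self.asid_map = {}     # (vm, virtual ASID) -> real ASID
        self.virtual_tlb = {}  # (vm, virtual ASID, virtual page) -> real page
        self.real_tlb = {}     # (real ASID, virtual page) -> physical page
        self.next_asid = 0

    def _real_asid(self, vm, vasid):
        key = (vm, vasid)
        if key not in self.asid_map:
            self.asid_map[key] = self.next_asid  # allocate a fresh real ASID
            self.next_asid += 1
        return self.asid_map[key]

    def guest_tlb_write(self, vm, vasid, vpage, rpage):
        """A guest TLB write, intercepted and emulated by the VMM."""
        self.virtual_tlb[(vm, vasid, vpage)] = rpage       # guest's view
        rasid = self._real_asid(vm, vasid)
        self.real_tlb[(rasid, vpage)] = self.real_map[(vm, rpage)]
```

Two guests can both use virtual ASID 1 for the same virtual page without colliding, because the ASID map table gives each `(vm, virtual ASID)` pair its own real ASID.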
16. I/O Virtualization
- Virtualizing Devices
  - Dedicated Devices: Display, Keyboard, Mouse, etc.
  - Partitioned Devices: Disk
  - Shared Devices: Network adapter
  - Spooled Devices: Printer
  - Non-existent Physical Devices: virtual network adapter
- Virtualizing I/O Operations
  - Intercepting/emulating IN/OUT, INS/OUTS
  - Map virtual resource ID to physical device ID
  - De-multiplexing the interrupts for the devices
- Virtualizing I/O in a Hosted VMM
  - The VMM driver translates I/O instructions back to system calls in the host OS
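The three steps under "Virtualizing I/O Operations" can be sketched for the simplest case, a partitioned device with one owner per physical device. The class, the port numbers, and the device names are illustrative assumptions; a shared device such as a network adapter would need a richer interrupt-routing rule.

```python
class IOVirtualizer:
    """Toy VMM I/O layer: intercept OUT, map IDs, de-multiplex interrupts."""

    def __init__(self):
        self.port_map = {}    # (vm, virtual port) -> physical device id
        self.owner = {}       # physical device id -> owning vm
        self.device_log = []  # stands in for the real devices

    def attach(self, vm, vport, device):
        self.port_map[(vm, vport)] = device
        self.owner[device] = vm

    def emulate_out(self, vm, vport, value):
        """Called when a guest OUT instruction traps to the VMM."""
        device = self.port_map[(vm, vport)]       # virtual ID -> physical ID
        self.device_log.append((device, value))   # perform the real operation
        return device

    def route_interrupt(self, device):
        """De-multiplex a physical interrupt to the VM that owns the device."""
        return self.owner[device]
```

Both guests may believe they are writing to the same port number; the mapping keeps their operations on separate physical partitions, and completion interrupts are delivered back to the matching VM.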
17. Performance Degradation in VMMs
- Setup: VM state initialization
- Emulation: Emulating critical instructions
- Interrupt Handling
  - Interrupts generated by a program within a VM must first be handled by the VMM, even though this is sometimes unnecessary
- State Saving: During world switches
- Bookkeeping: Timers, etc.
- Time Elongation: Memory references take longer
18. VT-x: Vanderpool Technology
- VMX Mode for Processors
  - VMX Root and VMX Non-root
  - All four privilege levels (rings) are available in both root and non-root in VMX mode
    - Thus, non-root adds four new privilege levels, each less privileged than the Pentium's existing ones
  - Guest VMs can run in VMX non-root
  - The host OS (for a hosted VMM) and the VMM run in VMX root
- VMX instructions
  - VMX root has access to a new set of instructions
  - Critical shared resources are kept under the control of a monitor in VMX root
  - VMX non-root ring 0 does not have access to the critical resources
  - An example of a critical resource: Memory for state management
19. An Example Operation
- VMXON: Switch into VMX mode (to the VMM)
- VMLAUNCH VM1: Start executing VM1 in VMX non-root operation
- VM1 exits: Go back to VMM
- VMLAUNCH VM2: Start executing VM2
- VM2 exits: Go back to VMM
- VMRESUME VM2: Switch to VM2 again
- VM2 exits: Go back to VMM
- VMRESUME VM2: Switch to VM2
- VMRESUME VM1: VM2 exits, VM1 switched in
- VM1 exits: Go back to VMM
- VMXOFF: Get back to regular mode
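The sequence above can be replayed on a small state machine. This is a sketch, not a model of the real instructions: the class and method names are assumptions, VM exits are triggered explicitly, and the VM is passed as an argument in the slide's shorthand, whereas the actual VMLAUNCH and VMRESUME act on the current VMCS.

```python
class ToyVMX:
    """Tracks regular / VMX root / VMX non-root mode transitions."""

    def __init__(self):
        self.mode = "regular"
        self.running = None
        self.launched = set()
        self.trace = []  # who holds the CPU after each transition

    def _require(self, mode):
        if self.mode != mode:
            raise RuntimeError(f"needs {mode} mode, but CPU is in {self.mode}")

    def _enter(self, vm):
        self.mode, self.running = "non-root", vm
        self.trace.append(vm)

    def vmxon(self):
        self._require("regular")
        self.mode = "root"
        self.trace.append("VMM")

    def vmlaunch(self, vm):
        self._require("root")
        if vm in self.launched:
            raise RuntimeError("already launched; use vmresume")
        self.launched.add(vm)
        self._enter(vm)

    def vmresume(self, vm):
        self._require("root")
        if vm not in self.launched:
            raise RuntimeError("not launched yet; use vmlaunch")
        self._enter(vm)

    def vm_exit(self):
        self._require("non-root")
        self.mode, self.running = "root", None
        self.trace.append("VMM")

    def vmxoff(self):
        self._require("root")
        self.mode = "regular"


def replay_slide():
    """Replays the slide's sequence, with every VM exit made explicit."""
    cpu = ToyVMX()
    cpu.vmxon()
    cpu.vmlaunch("VM1"); cpu.vm_exit()
    cpu.vmlaunch("VM2"); cpu.vm_exit()
    cpu.vmresume("VM2"); cpu.vm_exit()
    cpu.vmresume("VM2"); cpu.vm_exit()  # the exit implied before VM1 resumes
    cpu.vmresume("VM1"); cpu.vm_exit()
    cpu.vmxoff()
    return cpu
```

Every guest entry is bracketed by the VMM in the resulting trace, which is the point of the example: control always returns to VMX root between two guests.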
20. Maintenance of State
- VMCS Data Structure
  - Fully specified, various fields defined
  - Manipulated only by hardware or by software in VMX root
  - VMPTR points to the VMCS structure of the currently executing VM
  - There can be multiple VMs active at any point, but only one of them is executing
  - VMWRITE/VMREAD to write/read the contents of the VMCS
  - State: More than normal, e.g. the architecturally hidden part of segment registers
- Control Fields: Define under what conditions a VM exits
  - Example: Some specific interrupt/instruction/etc., the number of model-specific registers (MSRs) that need to be saved when the VM exits
- VM exit info
  - Informs the VMM of the reason for the exit, along with supporting info
21. Maintenance of State (contd.)
- State Area
  - Guest State: Register state, Interruptibility state
  - Host State: Register state
- Control Area
  - VM Execution Controls
    - Pin/Processor-based execution controls, bitmap fields, etc.
  - VM Exit Controls
    - Control bitmap, MSR Controls
  - VM Entry Controls
    - Control bitmap, MSR Controls, Controls for Event Injection
- VM Exit Information
  - Basic Info: VM-Exit Info, Vectoring Event Info
  - Other Exit Info: Due to event delivery, due to instruction execution
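The VMCS layout described on the last two slides can be pictured as a record with one dictionary per area. The sketch below is a simplification under stated assumptions: the area and field names are illustrative, and the real VMREAD/VMWRITE instructions address fields by numeric encodings rather than by (area, name) pairs.

```python
from dataclasses import dataclass, field


@dataclass
class VMCS:
    """One control structure per VM, grouped into the areas from the slides."""
    guest_state: dict = field(default_factory=dict)
    host_state: dict = field(default_factory=dict)
    execution_controls: dict = field(default_factory=dict)
    exit_controls: dict = field(default_factory=dict)
    entry_controls: dict = field(default_factory=dict)
    exit_info: dict = field(default_factory=dict)


class ToyCPU:
    def __init__(self):
        self.vmx_root = True
        self.current = None  # the "VMPTR": VMCS of the current VM

    def _check_root(self):
        if not self.vmx_root:
            raise PermissionError("VMCS access is only allowed in VMX root")

    def vmptrld(self, vmcs):
        self._check_root()
        self.current = vmcs

    def vmwrite(self, area, name, value):
        self._check_root()
        getattr(self.current, area)[name] = value

    def vmread(self, area, name):
        self._check_root()
        return getattr(self.current, area)[name]

    def vm_entry(self):
        self._check_root()
        self.vmx_root = False  # guest now runs in VMX non-root

    def vm_exit(self, reason, **details):
        # Hardware records why the guest stopped, then returns to VMX root.
        self.current.exit_info = {"reason": reason, **details}
        self.vmx_root = True
```

A VMM built on this model would load a VMCS, write guest state and controls, enter the guest, and after the exit read `exit_info` to decide how to emulate the event that caused it.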