Devconf.cz 2016 Linux as a guest on Hyper-V
Linux as a guest on Hyper-V
Vitaly Kuznetsov, Red Hat, DevConf 2016

Virtualization at Red Hat
● OS-level virtualization at Red Hat:
  ● Xen-based solutions in the past
  ● KVM-based solutions now:
    ● RHEV, OpenStack
● RHEL-as-a-guest efforts:
  ● On KVM: RHEV and OpenStack, standalone
  ● On Xen: Amazon Web Services
  ● On VMware: standalone
  ● On Hyper-V: Azure and standalone

Microsoft Hyper-V
● Present since Windows Server 2008
● The core of Microsoft Azure cloud
● Hyper-V is a Type 1 hypervisor for the x86 architecture
● Requires hardware support (Intel VT-x, AMD-V)
● Emulates standard x86 platforms:
  ● Generation 1 VM: “legacy” BIOS platform with emulated devices
  ● Generation 2 VM: UEFI platform without emulated devices

Hyper-V architecture
● Full virtualization with selective enlightenments:
  ● Enlightened I/O paths
    ● Optional for Generation 1
    ● Mandatory for Generation 2
  ● Heartbeat
  ● Utility drivers
  ● Time keeping and synchronization
  ● Crash reporting
  ● ...

Hyper-V and Linux
● Kernel drivers:
  ● Added to staging in 2009
  ● Out of staging in 2011
  ● Included in all major Linux distributions: RHEL, Fedora, OpenSUSE/SLES, Debian, Ubuntu, ...
● Actively used on Azure: 25% of all VMs are Linux! (Microsoft)

Hyper-V drivers development
● Commits since 2011 (leaving staging):
[Figure: bar chart of commits per year, 2011 to 2015, y-axis from 0 to 300. Legend: commits from @microsoft.com, commits from @redhat.com, commits from other community members.]

Hyper-V drivers in Linux kernel
● Currently present drivers:
  ● hv_storvsc (IDE/SCSI/FC storage)
  ● hv_netvsc (network adapter)
  ● hyperv_fb (framebuffer device)
  ● hyperv-keyboard (keyboard)
  ● hid-hyperv (mouse)
  ● hv_balloon (memory ballooning and hotplug)
  ● hv_util (utility drivers)

Hyper-V storvsc driver
● High performance storage driver
● Device support:
  ● SCSI
  ● IDE (Gen1 VMs)
  ● Fibre Channel
● Partial SPC-3 compliance since Win8/WS2012
● Full SPC-3 compliance since Win10/WS2016
● Multiqueue support

Hyper-V netvsc driver
● High performance network driver
● Multiqueue
  ● Supports scaling for RX with vRSS
  ● Dynamic and Static VMQ for TX
● Supports batching for TX
● Decorates each outgoing packet with an RNDIS header
● No NAPI support (yet)

Hyper-V netvsc performance (Microsoft data)
[Figure: "On Local HyperV". Throughput (Gbps, y-axis from 0.0 to 35.0) versus number of connections (1, 8, 64, 256, 1024, 6000). Legend: WS2012R2, Linux. Data labels in extraction order (the mapping of each group to a legend entry is not recoverable from the transcript): 4.1, 25.0, 28.3, 26.9, 23.4, 15.5 and 5.6, 20.7, 30.6, 31.3, 25.2, 10.0.]
Note: Server VM CPU: 8 vCPUs of E5-2690 @ 2.90GHz, on one NUMA node

Hyper-V netvsc performance (Microsoft data)
[Figure: "On Azure G5". Throughput (Gbps, y-axis from 0.0 to 30.0) versus number of connections (1, 8, 64, 256, 1024, 6000). Legend: WS2012R2, Linux. Data labels in extraction order (the mapping of each group to a legend entry is not recoverable from the transcript): 2.3, 20.6, 24.2, 23.3, 15.8, 9.9 and 4.1, 15.3, 21.3, 17.1, 14.0, 12.7.]
Note: VM CPU: 32 vCPUs of E5-2698B v3 @ 2.00GHz, on two NUMA nodes

Hyper-V utility drivers and daemons
● 'Internal' drivers
  ● Clocksources, clockevents
  ● Time synchronization
  ● Heartbeat
● Paired: kernel driver + userspace daemon
  ● hv_kvp – key/value pair exchange (network settings)
  ● hv_vss – freeze/thaw file systems for backup
  ● hv_fcopy – copy an arbitrary file from the host to the guest

Memory ballooning and hotplug
● Post memory pressure reports to the host every second.
● “balloon up” request from the host:
  ● Allocate pages and send their PFNs to the host so actual pages behind these frames can be reused.
● “balloon down” request from the host:
  ● Get PFNs from the host, de-allocate pages.
● Memory hotplug:
  ● Initiated by the host, 2M granularity (128M in Linux)
  ● Possible with 'Dynamic memory' disabled in WS2016
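The balloon up/down exchange above can be sketched as a toy model. This is an illustration only, not the hv_balloon implementation: the class name `BalloonGuest` and its methods are hypothetical, and real guests allocate pages through the kernel allocator rather than a Python list.

```python
class BalloonGuest:
    """Toy guest that tracks which page frame numbers (PFNs) it can still use."""

    def __init__(self, total_pages):
        self.free_pfns = list(range(total_pages))  # pages the guest may use
        self.ballooned = set()                     # pages handed over to the host

    def balloon_up(self, num_pages):
        """Host asks for memory: allocate pages and report their PFNs."""
        count = min(num_pages, len(self.free_pfns))
        taken = [self.free_pfns.pop() for _ in range(count)]
        self.ballooned.update(taken)
        return taken  # this list stands in for the message sent to the host

    def balloon_down(self, pfns):
        """Host returns memory: it names the PFNs, the guest de-allocates them."""
        for pfn in pfns:
            if pfn in self.ballooned:
                self.ballooned.remove(pfn)
                self.free_pfns.append(pfn)


guest = BalloonGuest(total_pages=8)
given = guest.balloon_up(3)      # guest gives three pages to the host
guest.balloon_down(given[:1])    # host hands one of them back
```

After these two calls the guest has two pages still in the balloon and six pages available for its own use.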

Timekeeping
● TSC exists but is not very reliable
● hv_clocksource:
  ● MSR-based
  ● Stable but slow
● TSC PAGE clocksource
  ● Reading from a shared memory page
  ● Fast as there is no exit to the hypervisor
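A minimal sketch of how a TSC page read might look. The slide only says the time is read from a shared memory page; the field names (`sequence`, `scale`, `offset`), the retry on a changed sequence number, and the formula `((tsc * scale) >> 64) + offset` in 100 ns units are assumptions taken from the publicly documented Hyper-V reference TSC page, not from the talk. `read_tsc` is a hypothetical stand-in for the RDTSC instruction.

```python
from dataclasses import dataclass

MASK64 = (1 << 64) - 1


@dataclass
class TscPage:
    """Assumed layout of the shared page the hypervisor keeps up to date."""
    sequence: int  # changes whenever the host rewrites scale/offset
    scale: int     # 64.64 fixed-point multiplier
    offset: int    # added after scaling, in 100 ns units


def read_reference_time(page, read_tsc):
    """Return the reference time in 100 ns units without exiting to the host."""
    while True:
        seq = page.sequence
        scale, offset = page.scale, page.offset
        tsc = read_tsc()
        # Retry if the host updated the page while we were reading it.
        if page.sequence == seq:
            return (((tsc * scale) >> 64) + offset) & MASK64


# scale = 2**63 halves the raw TSC value: 2000 ticks become 1000 units.
page = TscPage(sequence=1, scale=1 << 63, offset=100)
now = read_reference_time(page, read_tsc=lambda: 2_000)
```

Everything here is plain memory reads and arithmetic, which is why this path is faster than the MSR-based clocksource.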

Hyper-V drivers in development
● Hvsock
  ● Userspace-to-userspace communications through VMBUS
  ● Similar to VSOCK
● PCI passthrough
● RDMA
  ● Open-source but not upstream yet

Linux on Hyper-V internals
● Why do we need Hyper-V-specific drivers?
  ● Emulating real hardware is SLOW, other hypervisors have their drivers in kernel too:
    ● KVM: virtio
    ● Xen: blkfront, netfront, balloon, …
    ● VMware: pvscsi, vmxnet3, …
● Some devices don't have hardware counterparts:
  ● Utility drivers
  ● Memory ballooning

“Enlightened drivers”
● The core is VMBUS
  ● Protocol for guest ↔ host communication
  ● Based on a concept of “channels”
    ● Primary/secondary channels for devices
    ● Each channel is bound to a VCPU
    ● Channels don't block each other
  ● Guest → Host signalling by hypercalls
  ● Host → Guest signalling by interrupts for “events” or “messages”
  ● Ring buffers for data exchange

Hypercalls
● The mechanism to signal something to the host
● A single 4k page per guest
● Setup:
  ● Virtual mapping within the guest
  ● Physical address → HV_X64_MSR_HYPERCALL
● Usage:
  ● Do function-like call to the page (call id, input addr, output addr)
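The setup step, writing the page's physical address to HV_X64_MSR_HYPERCALL, can be illustrated with a small sketch. The MSR name comes from the slide; the MSR index `0x40000001` and the layout (page address in the upper bits, enable flag in bit 0) are assumptions based on the public Hyper-V specification. The dictionary `msrs` and the helper `write_msr` are hypothetical stand-ins for a real WRMSR.

```python
HV_X64_MSR_HYPERCALL = 0x40000001  # assumed MSR index
PAGE_SIZE = 4096                   # "a single 4k page per guest"
ENABLE_BIT = 1                     # assumed: bit 0 enables the hypercall page


def hypercall_msr_value(page_gpa):
    """Combine the guest physical address of the page with the enable bit."""
    if page_gpa % PAGE_SIZE != 0:
        raise ValueError("hypercall page must be page-aligned")
    return page_gpa | ENABLE_BIT


def write_msr(msrs, index, value):
    """Stand-in for WRMSR: record the value in a dictionary."""
    msrs[index] = value


msrs = {}
write_msr(msrs, HV_X64_MSR_HYPERCALL, hypercall_msr_value(0x1234000))
```

Once the MSR is written, the guest calls into the page like a function, passing the call id and the input and output addresses.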

Host→Guest signalling: “Messages”
● A magical page per-VCPU
● … which contains actual data
● One message at a time
● Message's payload is <= 30 QWORDS
● Used mainly for setup/teardown paths (channel offers, open/close, unload, …)
● Clockevents also use messages.
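The "one message at a time" rule and the 30-QWORD payload limit can be modelled as a single slot that the host cannot refill until the guest has consumed it. This is a sketch under that reading of the slide; the class `MessageSlot` and its method names are hypothetical.

```python
MAX_PAYLOAD_QWORDS = 30  # from the slide: payload is <= 30 QWORDS (240 bytes)


class MessageSlot:
    """Toy per-VCPU message slot holding at most one pending message."""

    def __init__(self):
        self.pending = None

    def host_post(self, payload_qwords):
        """Host side: deliver a message if the slot is free."""
        if len(payload_qwords) > MAX_PAYLOAD_QWORDS:
            raise ValueError("payload exceeds 30 QWORDS")
        if self.pending is not None:
            return False  # previous message not consumed yet
        self.pending = list(payload_qwords)
        return True

    def guest_consume(self):
        """Guest side: take the message and free the slot for the next one."""
        message, self.pending = self.pending, None
        return message


slot = MessageSlot()
first = slot.host_post([1, 2, 3])   # accepted
second = slot.host_post([4])        # rejected, slot still occupied
received = slot.guest_consume()     # frees the slot
third = slot.host_post([4])         # accepted now
```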

Host→Guest signalling: “Events”
● A (different) magic page per-VCPU
● … with an indication that there's pending data on a particular channel.
● Each channel has its own bit so all channels assigned to the same vCPU which need processing are signalled with a single interrupt.
● An event means “go check the ring buffer” and that's where the actual data is.
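As an illustration of the one-bit-per-channel scheme, the sketch below turns an event bitmap into the list of channels that need processing. Representing the event page as a single Python integer is a simplification, and the function name `pending_channels` is hypothetical.

```python
def pending_channels(event_bitmap):
    """Return the ids of all channels whose event bit is set, lowest first."""
    channel_ids = []
    bit = 0
    while event_bitmap:
        if event_bitmap & 1:
            channel_ids.append(bit)
        event_bitmap >>= 1
        bit += 1
    return channel_ids


# One interrupt, three channels to service: bits 0, 2 and 5 are set.
ids = pending_channels(0b100101)
```

The bitmap carries no payload. Each id returned here only tells the guest which ring buffer to go and read.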
Ring buffers
● Data transfer mechanism based on shared memory for performance-critical devices

Ring buffers for channels
● Two separate rings for each channel for guest → host and host → guest communication.
● Different ring sizes for different drivers:
  ● Netvsc – 128 pages
  ● Storvsc – 256 pages
  ● ...
● Need for signalling both ways.
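A generic single-direction ring buffer can be sketched as follows. This is a textbook ring with read and write indices, not the VMBUS layout: the class `Ring`, the byte-granular interface, and the choice to keep one byte free to tell "full" from "empty" are assumptions made for illustration. A channel would hold two such rings, one per direction.

```python
class Ring:
    """Fixed-size byte ring shared by one writer and one reader."""

    def __init__(self, size):
        self.buf = bytearray(size)
        self.size = size
        self.read_index = 0
        self.write_index = 0

    def bytes_to_read(self):
        return (self.write_index - self.read_index) % self.size

    def bytes_free(self):
        # One byte stays unused so that a full ring differs from an empty one.
        return self.size - 1 - self.bytes_to_read()

    def write(self, data):
        """Copy data into the ring, wrapping around; refuse if it does not fit."""
        if len(data) > self.bytes_free():
            return False
        for byte in data:
            self.buf[self.write_index] = byte
            self.write_index = (self.write_index + 1) % self.size
        return True

    def read(self, count):
        """Read up to count bytes and advance the read index, freeing space."""
        count = min(count, self.bytes_to_read())
        out = bytearray()
        for _ in range(count):
            out.append(self.buf[self.read_index])
            self.read_index = (self.read_index + 1) % self.size
        return bytes(out)


ring = Ring(size=8)
ring.write(b"abcde")
head = ring.read(3)            # b"abc"
wrapped = ring.write(b"fghij") # fits only because reading freed space
rest = ring.read(7)            # b"defghij", crossing the wrap-around point
```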

Receive ring signalling
● We receive an interrupt indicating there are events pending.
● We scan the event page to see which channels have new data.
● For a particular channel with a pending event we read and process all the data on the ring buffer.
● We advance the read pointer, freeing space on the buffer.
● If the host was blocked by the absence of space on the ring we signal it when we're done reading.
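The receive steps above can be sketched as a small simulation. The names `InboundRing`, `host_blocked`, and `handle_interrupt` are hypothetical, the ring is modelled as a bounded queue of packets rather than shared memory, and the signal to the host is reduced to a counter.

```python
from collections import deque


class InboundRing:
    """Toy host-to-guest ring: a bounded queue plus a 'host is waiting' flag."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.packets = deque()
        self.host_blocked = False  # set when the host found the ring full
        self.host_signals = 0      # how often the guest signalled the host

    def host_write(self, packet):
        if len(self.packets) >= self.capacity:
            self.host_blocked = True
            return False
        self.packets.append(packet)
        return True


def handle_interrupt(event_bits, rings, process):
    """Drain every ring whose event bit is set, then unblock the host if needed."""
    for channel_id, ring in rings.items():
        if not (event_bits >> channel_id) & 1:
            continue  # no pending event for this channel
        while ring.packets:
            # Removing a packet stands in for advancing the read pointer.
            process(channel_id, ring.packets.popleft())
        if ring.host_blocked:
            ring.host_blocked = False
            ring.host_signals += 1


rings = {0: InboundRing(capacity=4), 3: InboundRing(capacity=2)}
for packet in ("p1", "p2", "p3"):
    rings[3].host_write(packet)  # the third write finds the ring full

seen = []
handle_interrupt(0b1000, rings, lambda cid, pkt: seen.append((cid, pkt)))
```

Only channel 3 is touched, because only bit 3 is set, and the host is signalled once because it had been blocked on a full ring.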

Transmit ring signalling
● Host guarantees to drain the buffer on each read operation.
● Host sets interrupt mask to signal an ongoing read.
● We signal the host when the ring transitions from empty to non-empty state and the host is not currently reading.
● … but we can also delay signalling if more data is on the way.
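The transmit-side rules can be sketched as a decision made on every guest write. This is a toy model: `OutboundRing`, `more_coming`, and `signal_owed` are hypothetical names, the ring is a plain list, and the signal to the host is reduced to a counter.

```python
class OutboundRing:
    """Toy guest-to-host ring that counts how often the host gets signalled."""

    def __init__(self):
        self.pending = []
        self.interrupt_mask = False  # set by the host while it is reading
        self.signal_owed = False     # a signal was deferred, still to be sent
        self.signals = 0

    def guest_write(self, packet, more_coming=False):
        was_empty = not self.pending
        self.pending.append(packet)
        # A signal is needed only on the empty -> non-empty transition,
        # and only if the host is not already reading.
        if was_empty and not self.interrupt_mask:
            self.signal_owed = True
        # The signal may be delayed while more data is on the way.
        if self.signal_owed and not more_coming:
            self.signal_owed = False
            self.signals += 1

    def host_drain(self):
        """Host reads everything in one go, masking interrupts meanwhile."""
        self.interrupt_mask = True
        drained, self.pending = self.pending, []
        self.signal_owed = False
        self.interrupt_mask = False
        return drained


ring = OutboundRing()
ring.guest_write("a")                    # empty -> non-empty: signal
ring.guest_write("b")                    # already non-empty: no signal
drained = ring.host_drain()
ring.guest_write("c", more_coming=True)  # signal deferred
ring.guest_write("d")                    # deferred signal sent now
```

Four packets are sent with two signals, which is the point of the scheme: the host is told once per batch, not once per packet.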