Jailhouse (github.com/siemens/jailhouse): a Xen or KVM alternative (can run under it) for high-speed, low-latency network and I/O access. Presentation by Jan Kiszka


description

Jan Kiszka from Siemens presents Jailhouse, an open source partitioning hypervisor designed with simplicity in mind. It gives workloads low-latency, high-speed access to I/O and networking functions, and it can run under KVM.

Transcript

Page 1:

Unrestricted © Siemens AG 2013. All rights reserved

Static System Partitioning and KVM

Siemens Corporate Technology | October 2013

Page 2:

Page 2 October 2013 Jan Kiszka, Corporate Technology Unrestricted © Siemens AG 2013. All rights reserved

Static System Partitioning and KVM

Agenda

Motivation & requirements

Jailhouse – a new partitioning approach

Combining partitioning and virtualization

Going open source

Summary and outlook

Page 3:

Gimme all your CPUs!

The need for full resource dedication

• High-speed control tasks (>10 kHz)

• Every µs of overhead reduces the achievable frequency

• Small, infrequent disturbances can have significant impact => Cache pollution => Deadline misses

• High-performance computing

• Long-running tasks don't want interruptions => Keep caches hot
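To make the cost of "every µs of overhead" concrete, the following is a small worked calculation rather than anything from the talk. The function names and the example numbers (90 µs of control work, 10 to 15 µs of overhead) are assumptions chosen only for illustration.

```python
def usable_fraction(loop_hz: float, overhead_us: float) -> float:
    """Fraction of each control period left for useful work.

    loop_hz is the control loop frequency, overhead_us the per-period
    overhead (interrupts, cache refills, hypervisor exits) in microseconds.
    """
    period_us = 1_000_000 / loop_hz
    return max(0.0, (period_us - overhead_us) / period_us)


def max_frequency_hz(work_us: float, overhead_us: float) -> float:
    """Highest loop frequency if every period needs work_us plus overhead_us."""
    return 1_000_000 / (work_us + overhead_us)


if __name__ == "__main__":
    # At 10 kHz the period is 100 us, so 5 us of overhead costs 5 % of it.
    print(usable_fraction(10_000, 5))
    # Hypothetical task: 90 us of work per cycle.
    print(max_frequency_hz(90, 10))  # exactly fits a 10 kHz loop
    print(max_frequency_hz(90, 15))  # 5 us more overhead drops below 10 kHz
```

With these illustrative numbers, five additional microseconds of disturbance per cycle are enough to push the task below the 10 kHz target.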

Page 4:

CONFIG_NO_HZ* – the solution?

CPU domination with Linux

• Goal: dominate CPU with a single task

• No interrupts, including timer ticks

• No housekeeping work (RCU, load measurement etc.)

• Standard application programming model

• But do not break Linux!

• Important steps made in upstream

• Reduce ticks to 1 Hz if only one task is present

• Offload RCU work to other CPUs

• But...

• not yet 100%

• more tasks/interrupts may have to run (on_each_cpu...)

Page 5:

What if you need asymmetric multiprocessing?

[Diagram labels: RTOS on hardware; Linux ? RTOS; RT-KVM]

Page 6:

Latencies Achievable in KVM-only Setups

• Host setup

• KVM on x86 PREEMPT-RT Linux

• Virtual machine on dedicated core

• Intel NIC (E1000 family) as I/O device, directly assigned to guest

• Permanent disk I/O load

• Guest setup

• Proprietary RTOS

• Real-time network stack

• Measurement setup

• Linux/Xenomai (native installation)

• Real-time network stack RTnet

• Periodic ICMP ping messages sent to target

• Record round-trip latency (error <50 µs)

=> Worst-case latency after 16h: 330 µs

Improvable via CONFIG_NO_HZ & Co.

[Diagram: Measuring I/O latency of an RT guest. Target: RTOS guest on RT-KVM, with NIC and other hardware. Measurement system: real-time Linux with RTnet, with NIC and other hardware.]
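The evaluation step of such a measurement can be sketched as follows. This is not the tooling used for the talk (the slides name RTnet and ICMP ping only); the function names and the sample values are hypothetical, with 330 µs included to mirror the reported worst case.

```python
def summarize_latencies(samples_us):
    """Return count, minimum, average and worst-case round-trip latency."""
    if not samples_us:
        raise ValueError("no latency samples recorded")
    return {
        "count": len(samples_us),
        "min_us": min(samples_us),
        "avg_us": sum(samples_us) / len(samples_us),
        "max_us": max(samples_us),  # the figure that matters for real-time use
    }


def deadline_misses(samples_us, deadline_us):
    """Count samples that exceeded the given deadline."""
    return sum(1 for s in samples_us if s > deadline_us)


if __name__ == "__main__":
    # Hypothetical round-trip times in microseconds.
    samples = [42.0, 57.5, 330.0, 61.0]
    print(summarize_latencies(samples))
    print(deadline_misses(samples, deadline_us=100.0))
```

For real-time workloads the maximum is the relevant number: a single outlier over a 16 h run determines whether a deadline can be guaranteed, regardless of how good the average looks.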

Page 7:

Small is Beautiful

Validation efforts correlate with code sizes

• Demanding security & safety scenarios

• Often require certification (Common Criteria, IEC 61508, …)

• Need to look closely at hardware & software

• Review / testing

• (Formal) validation

• The larger your system, the higher your effort

• Split critical from non-critical components

• Keep critical components small

• Virtualization can help with segregation

• ...if it remains simpler than non-critical parts

Page 8:

1st Approach: Micro-Hypervisor

Small, bare-metal hypervisor separates workloads

• Focused on guest isolation

• Spatial

• Temporal

• Reduced complexity (& features)

• Reduces validation effort

• Reduces guest latencies

• No standard available yet

• Niche market

• Many commercial hypervisors

• Few open source projects

• Hardware restrictions

• Not targeting industrial use

[Diagram: a micro-hypervisor separates a non-RT, non-critical workload from a critical workload; each has its own CPU and I/O]

Page 9:

A bare-metal hypervisor has to boot its guest

Classic type-1 hypervisor boot-up

[Diagram: 1. Boot phase: firmware/BIOS, [boot loader], hypervisor and GPOS (general purpose OS) on the hardware. 2. Operational phase: GPOS and RT app on the hypervisor, on the hardware.]

Page 10:

Static System Partitioning and KVM

Agenda

Motivation & requirements

Jailhouse – a new partitioning approach

Combining partitioning and virtualization

Going open source

Summary and outlook

Page 11:

What about postponing the hypervisor start?

Basic concept of late partitioning

[Diagram: 1. Boot phase: firmware/BIOS, [boot loader] and Linux on the hardware. 2. Partitioning phase: Linux loads the partitioning layer together with images and configs. 3. Operational phase: Linux and RT app on the partitioning layer, on the hardware.]

Page 12:

Choosing the Right Balance

Jailhouse focuses on simplicity

[Figure: balance between features and simplicity]

Page 13:

Jailhouse Architecture

[Diagram: the Linux kernel hosts the Jailhouse loader module, reached via /dev/jailhouse by the Jailhouse management tool. Inputs are the Jailhouse image, the system config, cell configs and cell images. The Jailhouse hypervisor partitions CPUs 1 to 6 and devices 1 to 5 between Linux and the "cells" running RT App 1 and RT App 2.]

Page 14:

Access Control instead of Virtualization

Limits of exclusive resource assignment

• Intercept and filter access to sensitive resources

• Physical addresses (unless hardware filters)

• I/O interrupt & IPI destination programming

• Cross-cell impact (e.g. system reset)

• 1:1 resource assignment

• No overcommitment, no scheduling

=> Better predictability, less complexity

• Do not hide hypervisor existence

• No emulation of lacking resources

• Expose assigned resource (widely) unmodified

• Linux won't notice (already booted), other cells need awareness
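The 1:1 assignment rule (no overcommitment, no scheduling) can be illustrated with a small consistency check. This is a sketch in Python, not Jailhouse's actual configuration format; the dictionary layout, the cell names and the function `check_exclusive` are assumptions made for the example.

```python
def check_exclusive(cells):
    """Return a list of conflicts where a CPU or device has two owners.

    cells maps a cell name to {"cpus": set, "devices": set}. With 1:1
    assignment every resource may appear in at most one cell.
    """
    conflicts = []
    owner = {}
    for cell_name, resources in cells.items():
        for kind in ("cpus", "devices"):
            for item in sorted(resources.get(kind, ())):
                key = (kind, item)
                if key in owner:
                    conflicts.append(
                        f"{kind[:-1]} {item}: {owner[key]} and {cell_name}"
                    )
                else:
                    owner[key] = cell_name
    return conflicts


if __name__ == "__main__":
    # Hypothetical layout loosely following the architecture slide.
    cells = {
        "linux": {"cpus": {1, 2, 3, 4}, "devices": {1, 2, 3}},
        "rt-app-1": {"cpus": {5}, "devices": {4}},
        "rt-app-2": {"cpus": {5, 6}, "devices": {5}},  # CPU 5 assigned twice
    }
    for conflict in check_exclusive(cells):
        print("conflict:", conflict)
```

Because no resource is ever shared, a check like this can run once at configuration time; nothing has to be arbitrated at run time, which is where the predictability claimed on the slide comes from.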

Page 15:

Jailhouse does not overlap with KVM

You need more? Use KVM!

Page 16:

Linux is Our Friend

Reuse Linux for management tasks

• Bootstrap

• System boot-up, hardware pre-configuration

• Hypervisor loading and configuration

• jailhouse enable CONFIG-FILE

• RT partition creation & image loading

• jailhouse cell create CONFIG-FILE IMAGE-FILE

• Linux unplugs resources for new cell (CPU, devices, memory)

=> Reduced hypervisor complexity

=> UNIX-like look & feel

Page 17:

Linux is Our Friend (2)

Reuse Linux for management tasks

• Operation

• Reconfigurations (while in non-operational mode)

• jailhouse cell destroy NAME

• Monitoring, logging etc.

• Shutdown

• jailhouse disable

=> Reduced hypervisor complexity

=> Short turn-around times, fewer reasons to reboot
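The management commands named on these two slides form a simple lifecycle: enable, create cells, destroy cells, disable. The sketch below only assembles the command lines and does not run them. The command syntax is copied from the slides (as of October 2013) and may differ in later Jailhouse releases; the file names and the cell name are hypothetical.

```python
import shlex


def enable(system_config):
    """Load and start the hypervisor with the given system configuration."""
    return ["jailhouse", "enable", system_config]


def cell_create(cell_config, image):
    """Create an RT partition (cell) and load its image."""
    return ["jailhouse", "cell", "create", cell_config, image]


def cell_destroy(name):
    """Tear a cell down again; its resources go back to Linux."""
    return ["jailhouse", "cell", "destroy", name]


def disable():
    """Shut the hypervisor down without rebooting."""
    return ["jailhouse", "disable"]


def lifecycle(system_config, cell_config, image, cell_name):
    """Full bring-up and tear-down sequence in the order given on the slides."""
    return [
        enable(system_config),
        cell_create(cell_config, image),
        cell_destroy(cell_name),
        disable(),
    ]


if __name__ == "__main__":
    # Hypothetical file and cell names, for illustration only.
    for argv in lifecycle("system.cfg", "rt-cell.cfg", "rt-app.bin", "rt-cell"):
        print(shlex.join(argv))
```

Each list could be handed to `subprocess.run` on a system where the tool is installed; printing them first keeps the sequence reviewable.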

Page 18:

Prototyping on x86

Jailhouse on Intel x86

• Initial focus on Intel

• VT-x with EPT, unrestricted guest mode, x2APIC

• VT-d with interrupt remapping

• Direct interrupt delivery feasible

• Keep IRQs off while in hypervisor

• Use NMIs + preemption timer for hypervisor IPIs

• Minimalistic MMIO

• Enables IO-APIC, xAPIC, PCI mmconfig interception

• Simple, unoptimized, slow-path only use cases

• Work in progress

• Device assignment, management

• Interrupt access control

Page 19:

Prototyping on x86 (2)

Jailhouse development inside QEMU/KVM

• Bootstrap development done inside QEMU/KVM

• Unbeatable turn-around times

• <30 s from code fix over recompilation and deployment to execution

• Source-level debugging of hypervisor

• Found and fixed several nVMX deficits & bugs

• Direct IRQ delivery

• nEPT stabilization

• Unrestricted guest mode

• Preemption timer

• Unfortunately no virtual VT-d available yet...

Page 20:

Static System Partitioning and KVM

Agenda

Motivation & requirements

Jailhouse – a new partitioning approach

Combining partitioning and virtualization

Going open source

Summary and outlook

Page 21:

What if more than Linux should run?

Hosting non-Linux guests

[Diagram: Linux kernel and an RT app on the Jailhouse hypervisor, with CPUs 1 to 6 and devices 1 to 5 partitioned between them; labeled "Extended Jailhouse?"]

Page 22:

How to minimize the complexity increase?

Nested virtualization will be more beneficial

• Full OS boot over Jailhouse

• Less overhead for guest

• Requires more device emulations

• Requires more accurate virtualization

• Requires virtual BIOS

• ...

• Enable KVM over Jailhouse

• Overhead of monitoring privileged KVM operations

• Can focus on CPU virtualization features

• No need to virtualize/emulate, just validate

• Gain (almost) all features of QEMU/KVM, benefit from its stability

[Figure: balance between features and simplicity]

Page 23:

Nested Virtualization on Diet

Enabling Intel x86 KVM over Jailhouse

• Execute VMX instructions on behalf of KVM

• Monitor (shadow) VMCS accesses

• Valid fields?

• Physical addresses with limits?

• Unsupported features disabled?

• We don't care if KVM crashes its CPU

• ...as long as it doesn't affect other cells

• Deny EPT in 1st prototype

• Slow but simple

• General need to establish feature restrictions

• Pragmatic: load KVM after Jailhouse

• Rediscover features on Jailhouse detection
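The three checks listed for VMCS accesses (valid field, physical address within limits, unsupported feature disabled) can be sketched as a filter function. This is a simplified Python model, not Jailhouse code: the field names, the `CellLimits` type and the returned strings are all hypothetical, and a real implementation would work on VMCS field encodings rather than names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CellLimits:
    """Physical memory range owned by the cell that runs KVM (assumed contiguous)."""
    mem_start: int
    mem_size: int

    def contains(self, addr: int) -> bool:
        return self.mem_start <= addr < self.mem_start + self.mem_size


# Hypothetical field names, for illustration only.
ALLOWED_FIELDS = {"guest_rip", "guest_rsp", "msr_bitmap_addr", "ept_pointer"}
ADDRESS_FIELDS = {"msr_bitmap_addr"}   # values are physical addresses
DENIED_FEATURES = {"ept_pointer"}      # EPT is denied in the first prototype


def validate_vmcs_write(field: str, value: int, limits: CellLimits) -> str:
    """Decide whether a VMCS write requested by KVM may be executed."""
    if field not in ALLOWED_FIELDS:
        return "reject: unknown field"
    if field in DENIED_FEATURES:
        return "reject: feature disabled"
    if field in ADDRESS_FIELDS and not limits.contains(value):
        return "reject: address outside cell"
    return "ok"


if __name__ == "__main__":
    limits = CellLimits(mem_start=0x1000_0000, mem_size=0x1000_0000)
    print(validate_vmcs_write("guest_rip", 0xFFFF_8000, limits))
    print(validate_vmcs_write("msr_bitmap_addr", 0x4000_0000, limits))
    print(validate_vmcs_write("ept_pointer", 0x1000_2000, limits))
```

Note that the guest instruction pointer is not range-checked here: following the slide, a value that only crashes KVM's own CPU is acceptable as long as other cells stay unaffected.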

Page 24:

Optimization: Nested EPT

Monitoring of Extended Page Table usage by KVM

• 1:1 mapping – no shadowing required

• Monitoring concept

• Full validation walk on new EPT

• Note EPT internally as valid

• Trap writes to known EPTs

• Check if page belongs to known EPT (drop write-protection if not)

• Declare EPT invalid if entry becomes invalid through write

• Execute write

• Use KVM's EPT while running its guest

• Last resort: para-virtualization
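The monitoring concept above can be modeled in a few lines. The sketch is a deliberately flattened Python model, assuming a single table level and page-frame numbers instead of real multi-level EPT structures; the class `EptMonitor` and its method names are hypothetical and not taken from Jailhouse.

```python
class EptMonitor:
    """Toy model of validating and write-tracking KVM's EPTs (1:1 mapping)."""

    def __init__(self, cell_frames):
        self.cell_frames = set(cell_frames)  # frames the cell may map
        self.valid_roots = set()             # EPTs noted internally as valid
        self.table_owner = {}                # table page -> EPT root

    def register(self, root, tables):
        """Full validation walk on a new EPT.

        tables maps each table page to the frames its entries point to.
        """
        for targets in tables.values():
            if any(frame not in self.cell_frames for frame in targets):
                return False
        self.valid_roots.add(root)
        for page in tables:
            self.table_owner[page] = root    # writes to it are trapped from now on
        return True

    def on_write(self, page, new_target):
        """Handle a trapped write to a write-protected page."""
        root = self.table_owner.get(page)
        if root is None:
            return "untracked"               # not an EPT page: drop write-protection
        if new_target not in self.cell_frames:
            self.valid_roots.discard(root)   # entry became invalid through the write
        return "executed"                    # the write itself is still performed

    def may_run_guest(self, root):
        """KVM's EPT is used directly only while it is known to be valid."""
        return root in self.valid_roots


if __name__ == "__main__":
    monitor = EptMonitor(cell_frames={10, 11, 12})
    print(monitor.register("ept-a", {100: [10, 11]}))
    print(monitor.on_write(100, 99), monitor.may_run_guest("ept-a"))
```

Since the mapping is 1:1, validation reduces to checking that every target frame belongs to the cell, which is why no shadow table has to be maintained.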

Page 25:

Static System Partitioning and KVM

Agenda

Motivation & requirements

Jailhouse – a new partitioning approach

Combining partitioning and virtualization

Going open source

Summary and outlook

Page 26:

Why Open Source?

Benefits of maintaining Jailhouse as open source

• “Just a few lines of code, easily maintainable.”

• Hardware-assisted virtualization is non-trivial => Many-eyes principle

• New CPUs and hardware features will keep us busy => Attract contributors, including silicon vendors

• Broaden the usage

• Higher test coverage, faster stabilization

• Additional use cases => more contributors

• Close cooperation with Linux kernel

• Enable upstream changes of Linux (if required)

• Keep the door open for integration

• GPL: Preserve openness


Page 27:

Static System Partitioning and KVM

Agenda

Motivation & requirements

Jailhouse – a new partitioning approach

Combining partitioning and virtualization

Going open source

Summary and outlook

Page 28:

Jailhouse – Static Partitioning as Linux Feature

Summary

• Need for critical workload isolation

• Undisturbed from non-critical system parts

• Low-latency access to I/O

• Reduce validation efforts

• Jailhouse provides building block for partitioning

• Allows full CPU isolation

• Reduced to the minimum (goal: <10k lines of code)

• Linux-based to reuse handy infrastructure

• Optionally combine with KVM for full virtualization

Page 29:

What is next?

Outlook

x86 completion

KVM over Jailhouse

ARMv7 port

Linux + Linux?

Follow / join the development! https://github.com/siemens/jailhouse

Management features

Page 30:

Any Questions?

Thank you!

Jan Kiszka <[email protected]>