Lxc – next gen virtualization for cloud intro (cloudexpo)

Linux Containers – NextGen Virtualization for Cloud (Intro & Overview) Cloud Expo June 10-12, 2014 New York City, NY Boden Russell ([email protected])

description

Intro to Linux Containers presented @ cloudexpo 2014 east in NYC.

Transcript of Lxc – next gen virtualization for cloud intro (cloudexpo)

Page 1: Lxc – next gen virtualization for cloud   intro (cloudexpo)


Page 2: Lxc – next gen virtualization for cloud   intro (cloudexpo)

Why LXC: Performance

Provision Time
- Manual: Days
- VM: Minutes
- LXC: Seconds / ms

[Chart: linpack performance @ 45000; x-axis: vcpus; y-axis: GFlops]

Page 3: Lxc – next gen virtualization for cloud   intro (cloudexpo)

Why LXC: Industry Uptrend

[Chart: Google trends - LXC]

[Chart: Google trends - docker]

Page 4: Lxc – next gen virtualization for cloud   intro (cloudexpo)

Why LXC: Flexible & Lightweight

[Figure: Virtual Machines (each with its own OS, bins / libs, app) vs. Linux Containers (bins / libs and app(s) on a shared OS); axes: Flexibility, Density]

Page 5: Lxc – next gen virtualization for cloud   intro (cloudexpo)

Why LXC: Lower TCO

Supported out of the box by modern Linux kernels

Open source toolsets

Cloudy integration

Page 6: Lxc – next gen virtualization for cloud   intro (cloudexpo)

Definitions

Linux Containers (LXC: LinuX Containers)
- Lightweight virtualization
- Realized using features provided by a modern Linux kernel
- VMs without the hypervisor (kind of)

Containerization of
- (Linux) Operating Systems
- Single or multiple applications

LXC as a technology ≠ LXC “tools”

Page 7: Lxc – next gen virtualization for cloud   intro (cloudexpo)

Hypervisors vs. Linux Containers

[Figure: three stacks, bottom to top]
- Type 1 Hypervisor: Hardware, Hypervisor, Virtual Machines (each with Operating System, Bins / libs, Apps)
- Type 2 Hypervisor: Hardware, Operating System, Hypervisor, Virtual Machines (each with Operating System, Bins / libs, Apps)
- Linux Containers: Hardware, Operating System, Containers (each with Bins / libs, Apps)

Containers share the host's OS kernel and are therefore lightweight. However, every container on a host must use that same kernel.

Containers are isolated, but share OS and, where appropriate, libs / bins.

Page 8: Lxc – next gen virtualization for cloud   intro (cloudexpo)

LXC Technology Stack

[Figure: layered stack, top to bottom]

User Space
- Orchestration & Management
- Linux Container Commoditization
- Linux Container Tooling
- GLIBC / Pseudo FS / User Space Tools & Libs

Kernel Space
- System Call Interface
- Kernel: cgroups, namespaces, chroots, LSM (lxc)
- Architecture Dependent Kernel Code

Hardware

Page 9: Lxc – next gen virtualization for cloud   intro (cloudexpo)

So You Want To Build A Container?

High level checklist
- Process(es)
- Throttling / limits
- Prioritization
- Resource isolation
- Root file system
- Security

[Figure: my-lxc = ?]

Page 10: Lxc – next gen virtualization for cloud   intro (cloudexpo)

Linux Control Groups (cgroups)

Problem
- How do I throttle, prioritize, control and obtain metrics for a group of tasks (processes)?

Solution: control groups (cgroups)
- Device Access
- Resource limiting
- Prioritization
- Accounting
- Control
- Injection

[Figure: cgroup blue containing three procs]

Page 11: Lxc – next gen virtualization for cloud   intro (cloudexpo)

Linux cgroup Subsystems

Subsystem: Tunable Parameters

blkio
- Weighted proportional block I/O access. Group wide or per device.
- Per device hard limits on block I/O read/write specified as bytes per second or IOPS per second.

cpu
- Time period (microseconds per second) a group should have CPU access.
- Group wide upper limit on CPU time per second.
- Weighted proportional value of relative CPU time for a group.

cpuset
- CPUs (cores) the group can access.
- Memory nodes the group can access and migrate ability.
- Memory hardwall, pressure, spread, etc.

devices
- Define which devices and access type a group can use.

freezer
- Suspend/resume group tasks.

memory
- Max memory limits for the group (in bytes).
- Memory swappiness, OOM control, hierarchy, etc.

hugetlb
- Limit HugeTLB size usage.
- Per cgroup HugeTLB metrics.

net_cls
- Tag network packets with a class ID.
- Use tc to prioritize tagged packets.

net_prio
- Weighted proportional priority on egress traffic (per interface).
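The cpu subsystem's "upper limit on CPU time" is a ratio of two tunables. A minimal sketch of that arithmetic follows, assuming the cgroup v1 file names cpu.cfs_period_us and cpu.cfs_quota_us (not named on the slide) and a 100000 microsecond period chosen only for illustration. The helper names are hypothetical.

```python
def cfs_quota_us(cpu_limit: float, period_us: int = 100_000) -> int:
    """Microseconds of CPU time per period that grant `cpu_limit` CPUs.

    The result is the value one would write to cpu.cfs_quota_us
    (assumed cgroup v1 file name) for the given cpu.cfs_period_us.
    """
    if cpu_limit <= 0 or period_us <= 0:
        raise ValueError("cpu_limit and period_us must be positive")
    return int(round(cpu_limit * period_us))


def cpu_limit(quota_us: int, period_us: int) -> float:
    """Inverse view: how many CPUs' worth of time a quota/period pair allows."""
    return quota_us / period_us


# Half a CPU with the illustrative 100 ms period.
print(cfs_quota_us(0.5))
# Two full CPUs with a 50 ms period.
print(cfs_quota_us(2, period_us=50_000))
```

A quota larger than the period simply means the group may use more than one core's worth of time per period.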

Page 12: Lxc – next gen virtualization for cloud   intro (cloudexpo)

Linux cgroups Pseudo FS Interface

Linux pseudo FS is the interface to cgroups
- Directory per subsystem per cgroup
- Read / write to pseudo file(s) in your cgroup directory

/sys/fs/cgroup/my-lxc
|-- blkio
|   |-- blkio.io_merged
|   |-- blkio.io_queued
|   |-- blkio.io_service_bytes
|   |-- blkio.io_serviced
|   |-- blkio.io_service_time
|   |-- blkio.io_wait_time
|   |-- blkio.reset_stats
|   |-- blkio.sectors
|   |-- blkio.throttle.io_service_bytes
|   |-- blkio.throttle.io_serviced
|   |-- blkio.throttle.read_bps_device
|   |-- blkio.throttle.read_iops_device
|   |-- blkio.throttle.write_bps_device
|   |-- blkio.throttle.write_iops_device
|   |-- blkio.time
|   |-- blkio.weight
|   |-- blkio.weight_device
|   |-- cgroup.clone_children
|   |-- cgroup.event_control
|   |-- cgroup.procs
|   |-- notify_on_release
|   |-- release_agent
|   `-- tasks
|-- cpu
|   |-- ...
|-- ...
`-- perf_event

echo "8:16 1048576" > blkio.throttle.read_bps_device

cat blkio.weight_device
dev     weight
8:1     200
8:16    500
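The echo and cat calls on this slide amount to plain file writes and reads, which can be sketched in Python. This is an illustration under assumptions: the helper names are hypothetical, and the demo writes into a temporary directory standing in for a real cgroup directory, since writing under /sys/fs/cgroup normally requires root.

```python
import os
import tempfile


def set_read_bps(cgroup_dir: str, major: int, minor: int, bytes_per_sec: int) -> str:
    """Write a '<major>:<minor> <bytes/sec>' throttle line, like the echo above."""
    line = f"{major}:{minor} {bytes_per_sec}"
    path = os.path.join(cgroup_dir, "blkio.throttle.read_bps_device")
    with open(path, "w") as pseudo_file:
        pseudo_file.write(line + "\n")
    return line


def parse_weight_device(text: str) -> dict:
    """Parse 'cat blkio.weight_device' output into {'major:minor': weight}."""
    weights = {}
    for row in text.splitlines():
        fields = row.split()
        # Skip the 'dev weight' header and anything that is not a device row.
        if len(fields) == 2 and ":" in fields[0] and fields[1].isdigit():
            weights[fields[0]] = int(fields[1])
    return weights


# Demo against a stand-in directory (not a real cgroup mount).
with tempfile.TemporaryDirectory() as demo_dir:
    print(set_read_bps(demo_dir, 8, 16, 1048576))

print(parse_weight_device("dev weight\n8:1 200\n8:16 500\n"))
```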

Page 13: Lxc – next gen virtualization for cloud   intro (cloudexpo)


Linux cgroups FS Layout

Page 14: Lxc – next gen virtualization for cloud   intro (cloudexpo)

Linux cgroups: CPU Usage

- Use CPU shares (and other controls) to prioritize jobs / containers
- Carry out complex scheduling schemes
- Segment host resources
- Adhere to SLAs
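CPU shares are relative weights: under contention, each group receives CPU time in proportion to its share of the total. A small worked sketch of that ratio follows. The group names and share values are made up for illustration, and the cgroup v1 file name cpu.shares is an assumption not stated on the slide.

```python
def cpu_fractions(shares: dict) -> dict:
    """Map each group's cpu.shares value to its fraction of contended CPU time."""
    total = sum(shares.values())
    if total <= 0:
        raise ValueError("at least one group needs a positive share")
    return {name: value / total for name, value in shares.items()}


# Hypothetical groups: 'web' is weighted twice as heavily as the others.
fractions = cpu_fractions({"web": 1024, "batch": 512, "cron": 512})
for name, fraction in fractions.items():
    print(f"{name}: {fraction:.0%}")
```

Shares only matter when the CPUs are busy; an idle host lets any group use more than its proportional slice.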

Page 15: Lxc – next gen virtualization for cloud   intro (cloudexpo)

Linux cgroups: CPU Pinning

- Pin containers / jobs to CPU cores
- Carry out complex scheduling schemes
- Reduce core switching costs
- Adhere to SLAs
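Pinning is expressed as a list of core IDs in the cpuset subsystem. The sketch below builds the comma-and-range list syntax (for example 0-2,4) from a set of cores. It assumes the cgroup v1 file name cpuset.cpus, which the slide does not name, and format_cpuset is a hypothetical helper.

```python
def format_cpuset(cores) -> str:
    """Collapse core IDs into range-list syntax, e.g. [0, 1, 2, 4] -> '0-2,4'."""
    ordered = sorted(set(cores))
    if not ordered:
        return ""
    ranges = []
    start = prev = ordered[0]
    for core in ordered[1:]:
        if core == prev + 1:
            # Still inside a run of consecutive cores.
            prev = core
            continue
        ranges.append((start, prev))
        start = prev = core
    ranges.append((start, prev))
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


# The resulting string is what one would write to cpuset.cpus (assumed name).
print(format_cpuset([0, 1, 2, 4]))
```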

Page 16: Lxc – next gen virtualization for cloud   intro (cloudexpo)

Linux cgroups: Device Access

- Limit device visibility; isolation
- Implement device access controls
  - Secure sharing
- Segment device access
- Device whitelist / blacklist
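A device whitelist entry is a short text rule: device type, major:minor numbers, and the allowed access. The sketch below only builds and validates such rule strings. It assumes the cgroup v1 devices.allow / devices.deny files (not named on the slide), and device_rule is a hypothetical helper.

```python
VALID_TYPES = {"a", "b", "c"}   # all, block, character
VALID_ACCESS = set("rwm")       # read, write, mknod


def device_rule(dev_type: str, major, minor, access: str) -> str:
    """Build a '<type> <major>:<minor> <access>' rule for the devices subsystem.

    major / minor may be integers or '*' as a wildcard.
    """
    if dev_type not in VALID_TYPES:
        raise ValueError(f"unknown device type: {dev_type!r}")
    if not access or not set(access) <= VALID_ACCESS:
        raise ValueError(f"access must be a combination of r, w, m: {access!r}")
    return f"{dev_type} {major}:{minor} {access}"


# Character device 1:3 is /dev/null on Linux; allow read, write and mknod.
print(device_rule("c", 1, 3, "rwm"))
# Read-only access to every minor of block major 8.
print(device_rule("b", 8, "*", "r"))
```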

Page 17: Lxc – next gen virtualization for cloud   intro (cloudexpo)


So You Want To Build A Container?

Page 18: Lxc – next gen virtualization for cloud   intro (cloudexpo)

Linux namespaces

Problem
- How do I provide an isolated view of global resources to a group of tasks (processes)?

Solution: namespaces
- MNT; mount points, file systems, etc.
- PID; processes
- NET; NICs, routing, etc.
- IPC; System V IPC
- UTS; host and domain name
- USER; UID and GID

[Figure: namespace blue (MNT, PID, NET, UTS, USER) containing three procs]
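One way to see which namespaces a task belongs to is the per-process ns directory in procfs, where each entry is a symlink such as uts:[4026531838]. The following is a minimal sketch, not part of the original deck. The helper names are hypothetical, and the proc_root parameter exists only so the function can be pointed at a test directory.

```python
import os


def namespace_ids(pid="self", proc_root="/proc") -> dict:
    """Return {namespace name: link target} for a process, or {} if unavailable."""
    ns_dir = os.path.join(proc_root, str(pid), "ns")
    ids = {}
    if not os.path.isdir(ns_dir):
        return ids
    for name in sorted(os.listdir(ns_dir)):
        try:
            ids[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            # Entry not readable (permissions) or not a symlink; skip it.
            continue
    return ids


def share_namespace(ids_a: dict, ids_b: dict, name: str) -> bool:
    """Two tasks share a namespace when its link target is identical."""
    return name in ids_a and ids_a.get(name) == ids_b.get(name)


# On a Linux host this lists the current process's namespaces.
print(namespace_ids())
```

Comparing the output for a host process and a containerized process shows which global resources are isolated between them.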

Page 19: Lxc – next gen virtualization for cloud   intro (cloudexpo)

Linux namespaces: Conceptual Overview

global (i.e. root) namespace
- MNT NS: /, /proc, /mnt/fsrd, /mnt/fsrw, /mnt/cdrom, /run2
- UTS NS: globalhost, rootns.com
- PID NS (PID COMMAND): 1 /sbin/init, 2 [kthreadd], 3 [ksoftirqd], 4 [cpuset], 5 /sbin/udevd, 6 /bin/sh, 7 /bin/bash
- IPC NS
  - SHMID OWNER: 32452 root, 43321 boden
  - SEMID OWNER: 0 root, 1 Boden
  - MSQID OWNER: (empty)
- NET NS: lo: UNKNOWN…, eth0: UP…, eth1: UP…, br0: UP…
  - app1 IP:5000, app2 IP:6000, app3 IP:7000
- USER NS: root 0:0, ntp 104:109, mysql 105:110, boden 106:111

purple namespace
- MNT NS: /, /proc, /mnt/purplenfs, /mnt/fsrw, /mnt/cdrom
- UTS NS: purplehost, purplens.com
- PID NS (PID COMMAND): 1 /bin/bash, 2 /bin/vim
- IPC NS
  - SHMID OWNER: (empty)
  - SEMID OWNER: 0 root
  - MSQID OWNER: (empty)
- NET NS: lo: UNKNOWN…, eth0: UP…
  - app1 IP:1000, app2 IP:7000
- USER NS: root 0:0, app 106:111

blue namespace
- MNT NS: /, /proc, /mnt/cdrom, /bluens
- UTS NS: bluehost, bluens.com
- PID NS (PID COMMAND): 1 /bin/bash, 2 python, 3 node
- IPC NS
  - SHMID OWNER: (empty)
  - SEMID OWNER: (empty)
  - MSQID OWNER: (empty)
- NET NS: lo: UNKNOWN…, eth0: DOWN…, eth1: UP
  - app1 IP:7000, app2 IP:9000
- USER NS: root 0:0, app 104:109

Page 20: Lxc – next gen virtualization for cloud   intro (cloudexpo)

Linux namespaces: Common Idioms

- It’s not required to use all namespaces
  - Pick & choose; if your toolset allows it
- Constructs exist to permit “connectivity” between parent / child namespace
- Various Linux user space tools have namespace support
- Linux sys API supports flexible namespace creation
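The "pick & choose" idiom maps onto the flags argument of the clone(2) and unshare(2) system calls, where each namespace type has its own bit. The sketch below only computes the combined flag value and makes no system call. The constants are the CLONE_NEW* values from the Linux headers, while the table and helper names are hypothetical.

```python
# CLONE_NEW* flag values as defined in the Linux kernel headers.
CLONE_FLAGS = {
    "MNT": 0x00020000,   # CLONE_NEWNS
    "UTS": 0x04000000,   # CLONE_NEWUTS
    "IPC": 0x08000000,   # CLONE_NEWIPC
    "USER": 0x10000000,  # CLONE_NEWUSER
    "PID": 0x20000000,   # CLONE_NEWPID
    "NET": 0x40000000,   # CLONE_NEWNET
}


def namespace_flags(*names: str) -> int:
    """OR together the flags for the chosen namespaces (unknown names raise KeyError)."""
    flags = 0
    for name in names:
        flags |= CLONE_FLAGS[name.upper()]
    return flags


# Isolate only hostname and networking, leaving everything else shared.
print(hex(namespace_flags("uts", "net")))
```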

Page 21: Lxc – next gen virtualization for cloud   intro (cloudexpo)

Linux namespaces & cgroups: Availability

Note: user namespace support is in the upstream kernel 3.8+, but distributions are rolling out support in phases:
- Map LXC UID/GID between container and host
- Non-root LXC creation
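UID/GID mapping between container and host is described by lines of the form "inside-id outside-id count", as exposed in the uid_map and gid_map files under /proc/<pid>/. The sketch below translates a container ID to a host ID from such a map. The 100000 / 65536 range is illustrative only, and both helper names are hypothetical.

```python
def parse_id_map(text: str) -> list:
    """Parse uid_map / gid_map content into (inside, outside, count) tuples."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3:
            inside, outside, count = (int(field) for field in fields)
            entries.append((inside, outside, count))
    return entries


def to_host_id(container_id: int, id_map: list):
    """Translate an ID seen inside the container to the host ID, or None if unmapped."""
    for inside, outside, count in id_map:
        if inside <= container_id < inside + count:
            return outside + (container_id - inside)
    return None


# Illustrative map: container IDs 0..65535 map to host IDs 100000..165535.
id_map = parse_id_map("0 100000 65536\n")
print(to_host_id(0, id_map))      # container root is an unprivileged host ID
print(to_host_id(1000, id_map))
```

With such a map, root inside the container is an ordinary unprivileged user on the host, which is what makes non-root container creation possible.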

Page 22: Lxc – next gen virtualization for cloud   intro (cloudexpo)


So You Want To Build A Container?

Page 23: Lxc – next gen virtualization for cloud   intro (cloudexpo)

Linux chroot & pivot_root

- Using pivot_root with MNT namespace addresses escaping chroot concerns
- The pivot_root target directory becomes the “new root FS”

Page 24: Lxc – next gen virtualization for cloud   intro (cloudexpo)

LXC Images

LXC images provide a flexible means to deliver only what you need: lightweight and minimal footprint

Basic constraints
- Same architecture & endian
- Linux’ish Operating System; you can run different Linux distros on same host

Image types
- System; virtualize Operating System(s): standard distro root FS less the kernel
- Application; virtualize application(s): only package apps + dependencies (aka JeOS, Just enough Operating System)

Bind mount host libs / bins into LXC to share host resources

Container image init process
- Container init command provided on invocation; can be an application or a full fledged init process
- Init script customized for image: skinny SysVinit, upstart, etc.
- Reduces overhead of lxc start-up and runtime footprint

Various tools to build images
- SuSE Kiwi
- Debootstrap
- Etc.

LXC tooling options often include numerous image templates

Page 25: Lxc – next gen virtualization for cloud   intro (cloudexpo)


So You Want To Build A Container?

Page 26: Lxc – next gen virtualization for cloud   intro (cloudexpo)

Linux Security Modules & MAC

Linux Security Modules (LSM): kernel modules which provide a framework for Mandatory Access Control (MAC) security implementations

MAC vs DAC
- In MAC, admin (user or process) assigns access controls to subject / initiator
- In DAC, resource owner (user) assigns access controls to individual resources

Existing LSM implementations include: AppArmor, SELinux, GRSEC, etc.

Page 27: Lxc – next gen virtualization for cloud   intro (cloudexpo)

Linux Capabilities

Per-process privileges that define which privileged system calls a process may use

Can be assigned to LXC process(es)
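Capabilities are stored as a bitmask per process, visible for example as the hexadecimal CapEff field in /proc/<pid>/status. The sketch below decodes such a mask for a handful of capabilities. The bit numbers come from the Linux capability header, the table is deliberately partial, and decode_caps is a hypothetical helper.

```python
# Partial table of capability bit numbers (see linux/capability.h).
CAP_BITS = {
    0: "CAP_CHOWN",
    5: "CAP_KILL",
    7: "CAP_SETUID",
    10: "CAP_NET_BIND_SERVICE",
    12: "CAP_NET_ADMIN",
    18: "CAP_SYS_CHROOT",
    21: "CAP_SYS_ADMIN",
}


def decode_caps(hex_mask: str) -> list:
    """Return the known capability names set in a hex mask such as CapEff."""
    mask = int(hex_mask, 16)
    return [name for bit, name in sorted(CAP_BITS.items()) if mask & (1 << bit)]


# A mask with bits 0 and 10 set.
print(decode_caps("0000000000000401"))
```

Dropping a bit from a container's mask removes the corresponding privilege even for root inside the container.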

Page 28: Lxc – next gen virtualization for cloud   intro (cloudexpo)

Other Security Measures

Reduce shared FS access using RO bind mounts

Linux seccomp
- Confine system calls

Keep Linux kernel up to date

User namespaces in 3.8+ kernel
- Launching containers as non-root user
- Mapping UID / GID into container

Page 29: Lxc – next gen virtualization for cloud   intro (cloudexpo)


So You Want To Build A Container?

Page 30: Lxc – next gen virtualization for cloud   intro (cloudexpo)

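LXC Industry Tooling

Virtuozzo
- Summary: Commercial product using OpenVZ under the hood
- Part of upstream Kernel? No
- License: Commercial
- APIs / Bindings: CLI, API
- Management plane / Dashboard: Virtuozzo Parallels

OpenVZ
- Summary: Custom Kernel providing well seasoned LXC support
- Part of upstream Kernel? No
- License: GNU GPL v2
- APIs / Bindings: CLI, C
- Management plane / Dashboard: Virtuozzo Parallels + others

Linux VServer
- Summary: A set of kernel patches providing LXC. Not based on cgroups or namespaces.
- Part of upstream Kernel? Partial
- License: GNU GPL v2

Libvirt-lxc
- Summary: Libvirt support for LXC via cgroups and namespaces.
- Part of upstream Kernel? Yes
- License: GNU LGPL
- APIs / Bindings: CLI, C, Python, Java, C#, PHP
- Management plane / Dashboard: OpenStack, Archipel, Virt-Manager

Lxc (tools)
- Summary: Lib + set of user space tools / bindings for LXC.
- Part of upstream Kernel? Yes
- License: GNU LGPL
- APIs / Bindings: Python, Lua, GO, CLI
- Management plane / Dashboard: LXC web panel, Lexy

Warden
- Summary: LXC management tooling used by CF.
- Part of upstream Kernel? Yes
- License: Apache v2

lmctfy
- Summary: Similar to LXC, but provides more intent based focus.
- Part of upstream Kernel? Yes, but additional patches needed for specific features.
- License: Apache v2

Docker
- Summary: Commoditization of LXC adding support for images, build files, etc.
- Part of upstream Kernel? Yes
- License: Apache v2
- APIs / Bindings: GO, REST, CLI, Python, other 3rd party libs
- Management plane / Dashboard: OpenStack, Shipyard, Docker UI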

Page 31: Lxc – next gen virtualization for cloud   intro (cloudexpo)

LXC Orchestration & Management

Docker & libvirt-lxc in OpenStack
- Manage containers heterogeneously with traditional VMs… but not w/the level of support & features we might like

CoreOS
- Zero-touch admin Linux distro with docker images as the unit of operation
- Centralized key/value store to coordinate distributed environment

Various other 3rd party apps
- Maestro for docker
- Shipyard for docker
- Fleet for CoreOS
- Etc.

LXC migration
- Container migration via criu

But…
- Still no great way to tie all virtual resources together with LXC, e.g. storage + networking
  - IMO; an area which needs focus for LXC to become more generally applicable

Page 32: Lxc – next gen virtualization for cloud   intro (cloudexpo)

LXC Gaps

There are gaps…

- Lack of industry tooling / support
- Live migration still a WIP
- Full orchestration across resources (compute / storage / networking)
- Fears of security
- Not a well known technology… yet
- Integration with existing virtualization and Cloud tooling
- Few, if any, industry standards
- Missing skillset
- Slower upstream support due to kernel dev process
- Etc.

Page 33: Lxc – next gen virtualization for cloud   intro (cloudexpo)

LXC: Use Cases For Traditional VMs

There are still use cases where traditional VMs are warranted.

Virtualization of non Linux based OSs
- Windows
- AIX
- Etc.

LXC not supported on host

VM requires unique kernel setup which is not applicable to other VMs on the host (i.e. per VM kernel config)

Etc.

Page 34: Lxc – next gen virtualization for cloud   intro (cloudexpo)

References & Related Links
- http://www.slideshare.net/BodenRussell/realizing-linux-containerslxc
- http://bodenr.blogspot.com/2014/05/kvm-and-docker-lxc-benchmarking-with.html
- https://www.docker.io/
- http://sysbench.sourceforge.net/
- http://dag.wiee.rs/home-made/dstat/
- http://www.openstack.org/
- https://wiki.openstack.org/wiki/Rally
- https://wiki.openstack.org/wiki/Docker
- http://devstack.org/
- http://www.linux-kvm.org/page/Main_Page
- https://github.com/stackforge/nova-docker
- https://github.com/dotcloud/docker-registry
- http://www.netperf.org/netperf/
- http://www.tokutek.com/products/iibench/
- http://www.brendangregg.com/activebenchmarking.html
- http://wiki.openvz.org/Performance