CRIU: time and space travel for Linux containers -- Kir Kolyshkin

Kirill Kolyshkin

ContainerDays NYC, 30 Oct 2015

Agenda

• Why would we want to migrate containers

• Why wouldn't we want to migrate containers

• How complex it is to migrate containers

Live migration at a glance

• Save the state

• Transfer the state

• Restore the state

Container live migration

Why would we want to migrate containers?

• It's awesome!

• Load balancing in a cluster

• Kernel upgrade

– Can be done without migration

• Hardware upgrade

Why wouldn't we want to live migrate containers?

How to avoid live migrating containers

• Incoming traffic load balancing

• Microservices

• Crash-driven upgrades

• Scheduled downtimes

How to make live migration really live?

• Need to avoid migrating memory while the container is frozen

• Two ways:

– Pre-copy the memory

– Post-copy the memory

Live migration in more detail

• Pre-copy: collect and transfer the memory (might be iterative)

• Freeze the container

• Save its state

• Copy the state

• Restore

• Unfreeze

• Post-copy: swap in the memory over the network
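
The iterative pre-copy step above can be illustrated with a toy model. This is only a sketch: memory is a plain dict of pages, and `run_workload` is a hypothetical callback standing in for the container that keeps dirtying pages while a copy round is in flight. Neither name comes from CRIU.

```python
def precopy_migrate(memory, run_workload, max_rounds=5, threshold=1):
    """Toy iterative pre-copy. Returns (dest, rounds, pages_copied_while_frozen)."""
    dest = {}
    dirty = set(memory)              # first round: every page counts as dirty
    rounds = 0
    while len(dirty) > threshold and rounds < max_rounds:
        for page in dirty:           # copy while the container keeps running
            dest[page] = memory[page]
        dirty = run_workload(memory) # pages written during this round
        rounds += 1
    # Freeze: the workload is no longer called, so nothing new gets dirtied.
    frozen_copy = len(dirty)
    for page in dirty:               # only the remaining dirty pages move now
        dest[page] = memory[page]
    return dest, rounds, frozen_copy


def make_workload(hot_pages):
    """Hypothetical workload that rewrites the same few pages every round."""
    counter = {"n": 0}

    def run(memory):
        counter["n"] += 1
        for page in hot_pages:
            memory[page] = f"v{counter['n']}".encode()
        return set(hot_pages)

    return run


memory = {i: b"v0" for i in range(100)}
dest, rounds, frozen_copy = precopy_migrate(memory, make_workload([3, 7]), threshold=2)
```

In this model the freeze window only has to cover the hot pages rather than all 100, which is the point of pre-copying.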

Obstacles, booby traps, and rakes

What do we need to migrate

• Virtual Machine

– Environment (i.e. virtual hardware)

– CPU state

– Memory

• Container

– Environment (cgroups, namespaces)

– Processes and stuff

– Memory

Collect and copy the memory

• Virtual Machine

– All memory is at hand

• Container

– Memory is spread through the processes

– Different types of memory (shared/private, backed by a file or not)

– Need to collect the processes first

● Only then collect the memory
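
To make the "different types of memory" point concrete, here is a minimal sketch that classifies the mappings of one process from the text format of `/proc/<pid>/maps`. The sample lines are made up for illustration, and the `Mapping` type and `parse_maps_line` helper are hypothetical names, not part of CRIU.

```python
from typing import NamedTuple


class Mapping(NamedTuple):
    start: int
    end: int
    shared: bool       # 's' in the permissions field, otherwise private ('p')
    file_backed: bool  # a non-zero inode means the mapping is backed by a file
    path: str


def parse_maps_line(line):
    """Parse one line of /proc/<pid>/maps: addr perms offset dev inode [path]."""
    fields = line.split(None, 5)     # the path, if present, may contain spaces
    addr, perms, _offset, _dev, inode = fields[:5]
    path = fields[5].strip() if len(fields) > 5 else ""
    start, end = (int(x, 16) for x in addr.split("-"))
    return Mapping(start, end, perms[3] == "s", inode != "0", path)


# Made-up sample in the /proc/<pid>/maps format.
SAMPLE = """\
00400000-0040b000 r-xp 00000000 08:01 1311 /bin/cat
7f2c1a000000-7f2c1a021000 rw-p 00000000 00:00 0
7f2c1b000000-7f2c1b100000 rw-s 00000000 00:05 98 /dev/shm/demo
7ffd3c000000-7ffd3c021000 rw-p 00000000 00:00 0 [stack]
"""

mappings = [parse_maps_line(line) for line in SAMPLE.splitlines()]
```

On a live system the same function could be fed the lines of `/proc/<pid>/maps` for every process found in the container, which is why the processes have to be collected first.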

Freezing

• Virtual Machine

– Suspend all CPUs

• Container

– Walk the process tree (via /proc), catch the processes and freeze them

– The freezer cgroup helps a bit
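
Walking the tree can be sketched as follows, assuming the parent PID is taken from `/proc/<pid>/stat`. The helper names (`parse_stat`, `read_parents`, `subtree`) are hypothetical, and this is an illustration rather than what CRIU itself does.

```python
import os


def parse_stat(text):
    """Return (pid, comm, ppid) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split around the first '(' and the last ')'.
    left, _, rest = text.partition("(")
    comm, _, tail = rest.rpartition(")")
    fields = tail.split()            # fields[0] is the state, fields[1] the ppid
    return int(left), comm, int(fields[1])


def read_parents(proc="/proc"):
    """Map every visible pid to its parent pid (Linux only)."""
    parents = {}
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "stat")) as f:
                pid, _comm, ppid = parse_stat(f.read())
        except OSError:
            continue                 # the process exited while we were walking
        parents[pid] = ppid
    return parents


def subtree(parents, root):
    """List root and all its descendants, parents before children."""
    children = {}
    for pid, ppid in parents.items():
        children.setdefault(ppid, []).append(pid)
    order, stack = [], [root]
    while stack:
        pid = stack.pop()
        order.append(pid)
        stack.extend(sorted(children.get(pid, []), reverse=True))
    return order
```

On Linux, `subtree(read_parents(), root_pid)` gives a candidate list of processes to stop. A walk like this is inherently racy, since processes can fork or exit between the scan and the freeze, which is presumably where the freezer cgroup helps.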

Saving the state

• Virtual Machine

– Hardware state, tree, 300K, ~70 objects

• Container

– State of all objects, graph, 160K, ~1000 objects

– Not all objects have decent API to get the state

Copying the state

• Virtual Machine

– Can read and copy at once, easy to serialize

• Container

– Not easy to serialize, as it's a graph, not a tree
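
One way to see the graph problem: two processes can refer to the very same kernel object (for instance a pipe inherited across fork), so a naive tree dump would duplicate it. The sketch below flattens such a structure into an object table plus references. The data layout and the function names are invented for illustration and do not reflect CRIU's image format.

```python
import json


def serialize(processes):
    """Flatten processes whose `files` lists may share the same object."""
    ids, objects = {}, []

    def obj_id(obj):
        key = id(obj)                # identity, not equality, defines sharing
        if key not in ids:
            ids[key] = len(objects)
            objects.append(dict(obj))
        return ids[key]

    procs = [
        {"pid": p["pid"], "files": [obj_id(f) for f in p["files"]]}
        for p in processes
    ]
    return {"objects": objects, "processes": procs}


def deserialize(image):
    """Rebuild the processes so that shared objects are shared again."""
    objects = [dict(o) for o in image["objects"]]
    return [
        {"pid": p["pid"], "files": [objects[i] for i in p["files"]]}
        for p in image["processes"]
    ]


pipe = {"kind": "pipe"}
log = {"kind": "file", "path": "/tmp/log"}
procs = [{"pid": 1, "files": [pipe, log]}, {"pid": 2, "files": [pipe]}]

image = serialize(procs)
restored = deserialize(json.loads(json.dumps(image)))
```

After the round trip through JSON, both restored processes again point at one pipe object instead of two copies.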

Restoring the state

• VM: recreate the memory, state of CPUs and virtual hardware

• Containers

– In-kernel: create a myriad of small objects

– In CRIU: same, but there might not be a convenient API

● Over 1000 syscalls

● Need to sort it all out
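
"Sorting it all out" is, at its core, a dependency-ordering problem: an object can only be recreated after the objects it depends on exist. A minimal sketch using the standard library's `graphlib` (Python 3.9+) follows. The object names and their dependencies are hypothetical examples, not CRIU's actual restore plan.

```python
from graphlib import TopologicalSorter

# Hypothetical objects mapped to the objects each one needs to exist first.
DEPS = {
    "namespaces": set(),
    "task:1": {"namespaces"},
    "task:2": {"task:1"},            # the child is forked from its parent
    "pipe:A": {"task:1"},
    "fd:1/3": {"task:1", "pipe:A"},
    "fd:2/0": {"task:2", "pipe:A"},
}


def restore_order(deps):
    """Return one order in which every object follows its dependencies."""
    return list(TopologicalSorter(deps).static_order())


order = restore_order(DEPS)
```

`TopologicalSorter` raises `graphlib.CycleError` if the dependencies form a cycle, which in a real restorer would have to be broken up by some other means.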

Unfreeze

• VM: resume the virtual CPUs

• Container

– Either SIGCONT through the tree

– Or “unfreeze” the cgroup

– Problem: need to wake processes in the proper order
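
Both options can be sketched in a few lines. The first helper assumes the cgroup v1 freezer interface, where writing `FROZEN` or `THAWED` to the `freezer.state` file changes the state of the whole group. The second sends SIGCONT in whatever order the caller supplies, since the slide does not say what the proper order is. Both function names are hypothetical.

```python
import os
import signal


def set_freezer_state(cgroup_dir, state):
    """Write FROZEN or THAWED to a cgroup v1 freezer.state file."""
    if state not in ("FROZEN", "THAWED"):
        raise ValueError(f"unsupported freezer state: {state}")
    with open(os.path.join(cgroup_dir, "freezer.state"), "w") as f:
        f.write(state)


def resume_tree(pids_in_order, send=os.kill):
    """Send SIGCONT to each pid in the given order.

    `send` is injectable so the function can be exercised without
    signalling real processes.
    """
    for pid in pids_in_order:
        send(pid, signal.SIGCONT)
```

The cgroup route thaws all tasks in one write, while the SIGCONT route leaves the ordering decision to the caller.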

Post-copy memory migration: network swap device

• Not yet ready for either VMs or containers

• userfaultfd by Andrea Arcangeli of Red Hat

– a file descriptor that notifies userspace about page faults and lets it supply the memory

– merged into the 4.3 kernel

– work in progress to use it for KVM/QEMU

• Container

– userfaultfd alone is not sufficient for the CRIU case
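
The post-copy idea can be modelled without touching the kernel: the restored side starts with no pages and pulls each one from the source the first time it is accessed. The class below is a pure-Python simulation of that behaviour. It does not use userfaultfd, and `LazyMemory` and `fetch_page` are hypothetical names.

```python
class LazyMemory:
    """Toy post-copy memory: pages are fetched on first access."""

    def __init__(self, fetch_page):
        self._fetch = fetch_page     # stands in for a request over the network
        self._pages = {}
        self.faults = 0

    def read(self, page):
        if page not in self._pages:  # the simulated page fault
            self.faults += 1
            self._pages[page] = self._fetch(page)
        return self._pages[page]

    def resident(self):
        """Page numbers already present on the destination."""
        return set(self._pages)


source = {0: b"a", 1: b"b", 2: b"c"}
mem = LazyMemory(source.__getitem__)
first = mem.read(1)                  # faults and fetches page 1
again = mem.read(1)                  # already resident, no new fault
other = mem.read(0)                  # faults and fetches page 0
```

Pages that are never touched (page 2 here) are never transferred on demand, so a real implementation would also need a background copy before the source can be released.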

Implementation

• https://criu.org

• [email protected]

• plus.google.com/+CriuOrg

• @__criu__

• github: xemul/criu

CRIU uses beyond live migration

• HPC jobs: periodic checkpoints

• Speeding up slow-booting services

• That magical SAVE button, e.g. in games

• Speeding up software testing

• Reverse debugging

Live migration

• P.Haul

– Process hauler

– http://criu.org/P.Haul

– Uses CRIU for checkpoint/restore (C/R)
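
As a rough picture of the checkpoint/restore steps a hauler has to drive, the sketch below only builds `criu` command lines and does not run them. It assumes the `criu` CLI subcommands `pre-dump`, `dump` and `restore` with the options `--tree`, `--images-dir`, `--prev-images-dir` and `--track-mem`, which should be checked against `criu --help` for the installed version. This is not how P.Haul itself talks to CRIU, and the helper names and paths are hypothetical.

```python
def predump_cmd(pid, images_dir, prev_dir=None):
    """Command for one pre-copy round: save memory, keep the tree running."""
    cmd = ["criu", "pre-dump", "--tree", str(pid), "--images-dir", images_dir]
    if prev_dir:
        # Points at the previous round's images so only changes are saved.
        cmd += ["--prev-images-dir", prev_dir]
    return cmd


def dump_cmd(pid, images_dir, prev_dir=None):
    """Command for the final dump, taken while the tree is frozen."""
    cmd = ["criu", "dump", "--tree", str(pid), "--images-dir", images_dir]
    if prev_dir:
        cmd += ["--prev-images-dir", prev_dir, "--track-mem"]
    return cmd


def restore_cmd(images_dir):
    """Command to run on the destination after the images were copied."""
    return ["criu", "restore", "--images-dir", images_dir]


# Hypothetical pid and directories: one pre-dump round, then the final dump.
plan = [
    predump_cmd(4242, "/tmp/img/1"),
    dump_cmd(4242, "/tmp/img/2", prev_dir="../1"),
    restore_cmd("/tmp/img/2"),
]
```

Each list could be passed to `subprocess.run` on a machine with CRIU installed and sufficient privileges. Transferring the image directories between hosts is left out of the sketch.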

That's all Folks!

Kirill Kolyshkin

[email protected]
