CRIU: time and space travel for Linux containers -- Kir Kolyshkin
CRIU:
time and space travel
for Linux containers
Kirill Kolyshkin
ContainerDays NYC, 30 Oct 2015
Agenda
• Why would we want to migrate containers
• Why wouldn't we want to migrate containers
• How complex it is to migrate containers
Live migration at a glance
• Save the state
• Transfer the state
• Restore the state
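As a rough illustration, the three steps could map onto the CRIU command-line tool as sketched below. The code only builds the command lines and does not run them. The `criu dump` and `criu restore` subcommands and their `--tree` and `--images-dir` options exist in CRIU; using rsync and ssh for the transfer, the host name, and the paths are assumptions made for this example, and the container's filesystem is assumed to be available on the destination already.

```python
def migration_commands(pid, images_dir, dest_host):
    """Return argv lists for save, transfer and restore (nothing is executed)."""
    # Save the state: dump the process tree rooted at `pid` into image files.
    save = ["criu", "dump", "--tree", str(pid), "--images-dir", images_dir]
    # Transfer the state: rsync is an assumption, any file copy would do.
    transfer = ["rsync", "-a", images_dir + "/", f"{dest_host}:{images_dir}/"]
    # Restore the state: run criu on the destination (ssh is an assumption).
    restore = ["ssh", dest_host, "criu", "restore", "--images-dir", images_dir]
    return [save, transfer, restore]


if __name__ == "__main__":
    # Hypothetical pid, directory and host, for illustration only.
    for argv in migration_commands(1234, "/tmp/ct-images", "dest.example.com"):
        print(" ".join(argv))
```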
Container live migration
Why would we want to migrate containers?
• It's awesome!
• Load balancing in a cluster
• Kernel upgrade
– Can be done without migration
• Hardware upgrade
Why wouldn't we want to live migrate containers?
How to avoid live migrating containers
• Incoming traffic load balancing
• Microservices
• Crash-driven upgrades
• Scheduled downtimes
How to make live migration really live?
• Need to avoid transferring memory while the container is frozen
• Two ways:
– Pre-copy the memory
– Post-copy the memory
Live migration in more detail
• Pre-copy: collect and transfer the memory (might be iterative)
• Freeze the container
• Save its state
• Copy the state
• Restore
• Unfreeze
• Post-copy: swap in the memory over the network
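The iterative pre-copy step can be illustrated with a toy model. This is a sketch, not CRIU's implementation: memory is a plain dict of pages, the "running container" is simulated by randomly rewriting a fixed number of pages after each round, and the names `precopy_migrate`, `threshold` and `dirty_pages_per_round` are invented for this example.

```python
import random


def precopy_migrate(source, dirty_pages_per_round, max_rounds=5, threshold=4, seed=0):
    """Toy model of iterative pre-copy.

    source: dict page_number -> bytes, the memory of the running container.
    Returns (destination_memory, number_of_pages_copied_while_frozen).
    """
    rng = random.Random(seed)
    dest = {}
    dirty = set(source)                 # first round: every page must be sent
    for _ in range(max_rounds):
        if len(dirty) <= threshold:     # few enough pages left: stop iterating
            break
        for page in dirty:              # transfer while the container runs
            dest[page] = source[page]
        # Simulate the container modifying some pages in the meantime.
        dirty = set(rng.sample(sorted(source), dirty_pages_per_round))
        for page in dirty:
            source[page] = bytes([rng.randrange(256)])
    # Freeze: the source no longer changes, copy only what is still dirty.
    copied_while_frozen = len(dirty)
    for page in dirty:
        dest[page] = source[page]
    return dest, copied_while_frozen


if __name__ == "__main__":
    memory = {n: bytes([n]) for n in range(64)}
    dest, frozen = precopy_migrate(memory, dirty_pages_per_round=3)
    print(dest == memory, frozen)
```

In this model only the pages dirtied during the last round are copied while the container is frozen, which is the point of pre-copying.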
Obstacles, booby traps, and rakes
VS
What do we need to migrate
• Virtual Machine
– Environment (i.e. virtual hardware)
– CPU state
– Memory
• Container
– Environment (cgroups, namespaces)
– Processes and stuff
– Memory
Collect and copy the memory
• Virtual Machine
– All memory is at hand
• Container
– Memory is spread through the processes
– Different types of memory (shared/private, backed by a file or not)
– Need to collect the processes first
● Only then collect the memory
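To make the different types of memory concrete, the sketch below classifies the mappings of one process from the `/proc/<pid>/maps` format. It is an illustration only, not how CRIU does it: it treats a mapping as shared when the fourth permission character is `s`, and as file-backed when the inode field is non-zero. The helper names are invented for this example.

```python
from collections import namedtuple

Mapping = namedtuple("Mapping", "start end perms shared file_backed path")


def parse_maps_line(line):
    """Parse one line in /proc/<pid>/maps format."""
    # Fields: address range, perms, offset, device, inode, optional pathname.
    fields = line.split(None, 5)
    addr, perms, _offset, _dev, inode = fields[:5]
    path = fields[5].strip() if len(fields) > 5 else ""
    start, end = (int(x, 16) for x in addr.split("-"))
    return Mapping(start, end, perms, perms[3] == "s", inode != "0", path)


def summarize(lines):
    """Total bytes per (shared/private, file/anonymous) class."""
    totals = {}
    for line in lines:
        if not line.strip():
            continue
        m = parse_maps_line(line)
        key = ("shared" if m.shared else "private",
               "file" if m.file_backed else "anonymous")
        totals[key] = totals.get(key, 0) + (m.end - m.start)
    return totals


if __name__ == "__main__":
    # Linux only: look at the mappings of this very process.
    with open("/proc/self/maps") as f:
        for kind, size in sorted(summarize(f).items()):
            print(kind, size)
```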
Freezing
• Virtual Machine
– Suspend all CPUs
• Container
– Walk the tree (/proc), catch the processes and freeze those
– The freezer cgroup helps a bit
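Walking the tree can be sketched as follows: read the parent pid of every process from `/proc/<pid>/stat`, then collect everything below the container's init process. This is a simplified illustration with invented helper names. A real implementation also has to cope with processes forking while the walk is in progress, which is presumably where freezing the whole cgroup at once helps.

```python
import os


def parse_stat(text):
    """Return (pid, comm, ppid) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')'.
    open_paren = text.index("(")
    close_paren = text.rindex(")")
    pid = int(text[:open_paren])
    comm = text[open_paren + 1:close_paren]
    rest = text[close_paren + 1:].split()
    return pid, comm, int(rest[1])      # rest[0] is the state, rest[1] the ppid


def descendants(root, parent_of):
    """All pids in the tree rooted at `root`, parents before children."""
    children = {}
    for pid, ppid in parent_of.items():
        children.setdefault(ppid, []).append(pid)
    order, queue = [], [root]
    while queue:
        pid = queue.pop(0)
        order.append(pid)
        queue.extend(sorted(children.get(pid, [])))
    return order


def snapshot_proc():
    """Read pid -> ppid for every process visible in /proc (Linux only)."""
    parent_of = {}
    for entry in os.listdir("/proc"):
        if entry.isdigit():
            try:
                with open(f"/proc/{entry}/stat") as f:
                    pid, _comm, ppid = parse_stat(f.read())
            except OSError:
                continue                # the process exited during the walk
            parent_of[pid] = ppid
    return parent_of


if __name__ == "__main__":
    print(descendants(os.getpid(), snapshot_proc()))
```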
Saving the state
• Virtual Machine
– Hardware state, tree, 300K, ~70 objects
• Container
– State of all objects, graph, 160K, ~1000 objects
– Not all objects have a decent API to get the state
Copying the state
• Virtual Machine
– Can read and copy at once, easy to serialize
• Container
– Not easy to serialize, as it's a graph, not a tree
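The difference between a tree and a graph shows up as soon as two objects share a third, for example two processes holding the same open file. A tree can be written out by simple nesting; a graph needs identifiers so that the shared object is stored once and referenced. The sketch below shows one common way to do this, with objects modelled as plain dicts. It is an illustration of the problem, not CRIU's image format.

```python
def serialize(roots):
    """Flatten an object graph into a list of records with integer ids.

    Objects are plain dicts; a value that is itself a dict is a reference.
    Shared objects are emitted once and referred to as {"ref": id}.
    """
    ids, records = {}, []

    def visit(obj):
        key = id(obj)
        if key in ids:                  # already emitted: reuse its id
            return ids[key]
        ids[key] = index = len(records)
        records.append(None)            # reserve the slot before recursing
        records[index] = {
            name: {"ref": visit(value)} if isinstance(value, dict) else value
            for name, value in obj.items()
        }
        return index

    return [visit(root) for root in roots], records


def deserialize(root_ids, records):
    """Rebuild the graph, restoring sharing between objects."""
    objs = [dict() for _ in records]
    for obj, record in zip(objs, records):
        for name, value in record.items():
            is_ref = isinstance(value, dict)
            obj[name] = objs[value["ref"]] if is_ref else value
    return [objs[i] for i in root_ids]


if __name__ == "__main__":
    # Hypothetical example: two tasks sharing one open file.
    shared_file = {"path": "/tmp/log", "pos": 128}
    tasks = [{"pid": 1, "fd3": shared_file}, {"pid": 2, "fd5": shared_file}]
    root_ids, records = serialize(tasks)
    a, b = deserialize(root_ids, records)
    print(len(records), a["fd3"] is b["fd5"])
```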
Restoring the state
• VM: recreate the memory, state of CPUs and virtual hardware
• Containers
– In-kernel: create a myriad of small objects
– In CRIU: same, but there might not be a convenient API
● Over 1000 syscalls
● Need to sort it all out
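One way to think about "sorting it all out" is as a dependency-ordering problem: an object can only be recreated once everything it refers to exists. The sketch below orders a hypothetical set of objects with a topological sort from Python's standard library. The object names and their dependencies are invented for illustration and do not describe CRIU's actual restore sequence.

```python
from graphlib import TopologicalSorter  # Python 3.9+


def restore_order(depends_on):
    """Order objects so that each comes after everything it depends on.

    depends_on: dict name -> set of names that must be restored first.
    Raises graphlib.CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    # Hypothetical dependencies, for illustration only.
    deps = {
        "namespaces": set(),
        "process tree": {"namespaces"},
        "memory mappings": {"process tree"},
        "open files": {"process tree"},
        "sockets": {"open files"},
    }
    print(restore_order(deps))
```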
Unfreeze
• VM: resume the virtual CPUs
• Container
– Either SIGCONT through the tree
– Or “unfreeze” the cgroup
– Problem: need to wake processes in the proper order
Post-copy memory migration: network swap device
• Not yet ready for either VMs or CTs
• userfaultfd by Andrea Arcangeli of Red Hat
– a file descriptor that reports page faults to user space, which then supplies the missing memory
– merged into the 4.3 kernel
– work in progress to use it for KVM/QEMU
• Container
– Userfault FD is not sufficient for CRIU case
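The idea behind post-copy can be shown with a toy model in which a page is fetched from the source only when it is first touched. This is purely conceptual: it does not use userfaultfd or any kernel interface, and the class and callback names are invented for this example.

```python
class LazyMemory:
    """Toy model of post-copy: pages arrive from the source on first touch."""

    def __init__(self, fetch_page):
        self._fetch_page = fetch_page   # stands in for a request over the network
        self._pages = {}
        self.faults = 0

    def read(self, page):
        if page not in self._pages:     # the "page fault"
            self.faults += 1
            self._pages[page] = self._fetch_page(page)
        return self._pages[page]

    def write(self, page, data):
        # Simplification: a write replaces the whole page, so a page that
        # has not arrived yet does not need to be fetched first.
        self._pages[page] = data


if __name__ == "__main__":
    source = {0: b"a", 1: b"b", 2: b"c"}    # memory left on the source host
    mem = LazyMemory(source.__getitem__)
    mem.read(1)
    mem.read(1)
    print(mem.faults)                       # only the first read had to fetch
```

The restored container can start running immediately, at the cost of a network round trip on each first access.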
Implementation
• https://criu.org
• plus.google.com/+CriuOrg
• @__criu__
• github: xemul/criu
CRIU uses beyond live migration
• HPC jobs: periodic checkpoints
• Speeding up slow-booting services
• That magical SAVE button e.g. in games
• Speeding up software testing
• Reverse debugging
Live migration
• P.Haul
– Process hauler
– http://criu.org/P.Haul
– Uses CRIU for c/r
That's all Folks!
Kirill Kolyshkin