Library OS is the New Container.
Chia-Che Tsai / RISE Lab @ UC Berkeley
Talking Points
• In a nutshell, what is LibOS?
• Why might you want to consider a LibOS?
• What’s our experience?
• Introducing Graphene: an open-source Linux LibOS
Containers vs VMs
Containers (diagram: three App + Bin/Lib stacks sharing one Linux OS)
• Host-dependent
• Light resources
• Binary/library compatibility
• Userland isolation
VMs (diagram: two App + Bin/Lib stacks, each on its own Guest OS, on top of a Host OS)
• Host-independent
• Heavy resources
• System ABI compatibility
• Kernel isolation
LibOS: Pack Your OS with You
• A part of the OS as a library
• Per-application OS isolation
• Can be lightweight
• Can be compatible at the system ABI level
• Can be host-independent
The "can be" properties depend on how you implement the libOS.
Diagram: three App + Bin/Lib stacks, each with its own LibOS, on top of one Host OS.
LibOS and Friends
• Drawbridge
• Unikernels
• Google gVisor
Graphene: An Open-source Linux LibOS
• An ambitious project to build an ultimate libOS
– As lightweight as it can be
– As host-independent as it can be (maybe even more than VMs; explained later)
– As securely isolated as it can be
https://github.com/oscarlab/graphene
Research Prototype Turned Open-source
• 2014: Graphene released as an artifact
• 2016: First to support native Linux applications on hardware enclaves (Intel SGX)
• Today: Working toward code stability and community building
Main contributors: Intel Labs, Golem / ITL, Fortanix
Getting Compatibility for Any Host
Compatibility Goal of Graphene
• Running a Linux application on any platform
  – Off-the-shelf binaries
  – Without relying on virtualization
Linux Compatibility is Hard
• Imagine implementing 300+ system calls on any host
  – Flags, opcodes, corner cases (see “man 2 open”)
  – Namespaces and idiosyncratic features
  – ioctl() and pseudo-filesystems
  – Architectural ABI (e.g., thread-local storage)
  – Unspecified behaviors (bug-for-bug compatibility)
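To make the "flags and corner cases" point concrete, here is a minimal sketch that probes three open(2) flag combinations on the host kernel. It does not use Graphene; the helper names `try_open` and `open_corner_cases` are made up for this illustration. A libOS has to reproduce each of these outcomes itself, for every flag and every system call.

```python
import errno
import os
import tempfile


def try_open(path, flags):
    """Call os.open and return "ok" or the symbolic errno name."""
    try:
        fd = os.open(path, flags, 0o600)
    except OSError as exc:
        return errno.errorcode[exc.errno]
    os.close(fd)
    return "ok"


def open_corner_cases():
    """Exercise three open(2) flag combinations a libOS must reproduce."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "file.txt")
        results = {
            # O_CREAT alone creates the file if it is missing.
            "create": try_open(path, os.O_WRONLY | os.O_CREAT),
            # O_CREAT | O_EXCL must fail once the file exists.
            "create_excl_existing": try_open(
                path, os.O_WRONLY | os.O_CREAT | os.O_EXCL
            ),
            # O_DIRECTORY must reject a regular file.
            "directory_on_file": try_open(path, os.O_RDONLY | os.O_DIRECTORY),
        }
    return results


if __name__ == "__main__":
    for case, outcome in open_corner_cases().items():
        print(f"{case}: {outcome}")
```

On a Linux host the three cases should report `ok`, `EEXIST`, and `ENOTDIR` respectively; an application binary may depend on any of them.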
Dilemma for API Compatibility
• Rich in features: having a rich set of APIs defined for application developers
• Ease of porting: being easy to port to other platforms or maintain in new versions
• Compatibility: being able to reuse existing application binaries as they are
Cannot achieve all these properties at the same time.
Solving the Dilemma
• Top: Linux ABI (300+ syscalls: open, read, write, …). Rich features, backward-compatible.
• Middle: LibOS
• Bottom: Host ABI (36 functions). Easy to port, backward-compatible.
• Host options: Linux kernel versions, BSD, OSX, Win, Intel SGX
Components of Graphene
LibOS (open, read, write, …)
• System calls implemented from scratch (one-time effort)
Host ABI (36 functions)
• Designed for portability
  – Short answer: UNIX
  – Long answer: a common subset of all host ABIs
Platform Adaptation Layers (PAL): Linux PAL, BSD PAL, OSX PAL, WIN PAL, SGX PAL
• The only part that has to be ported for each host
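The layering above can be sketched in a few dozen lines. This is a toy model, not Graphene's code: the `Pal` interface and its three `stream_*` methods are hypothetical stand-ins for the 36-function host ABI, and `MemoryPal` / `FilePal` stand in for per-host PALs. The point is that `LibOS` is written once and only the PAL changes per host.

```python
import os
import tempfile
from abc import ABC, abstractmethod


class Pal(ABC):
    """Hypothetical host ABI: the only part rewritten per host."""

    @abstractmethod
    def stream_open(self, uri): ...

    @abstractmethod
    def stream_read(self, handle, offset, size): ...

    @abstractmethod
    def stream_write(self, handle, offset, data): ...


class MemoryPal(Pal):
    """Toy host that keeps every stream in a dict of bytearrays."""

    def __init__(self):
        self.streams = {}

    def stream_open(self, uri):
        self.streams.setdefault(uri, bytearray())
        return uri

    def stream_read(self, handle, offset, size):
        return bytes(self.streams[handle][offset:offset + size])

    def stream_write(self, handle, offset, data):
        buf = self.streams[handle]
        buf[offset:offset + len(data)] = data
        return len(data)


class FilePal(Pal):
    """Toy host that maps streams onto files under a root directory."""

    def __init__(self, root):
        self.root = root

    def stream_open(self, uri):
        path = os.path.join(self.root, uri)
        open(path, "ab").close()  # create the file if it is missing
        return path

    def stream_read(self, handle, offset, size):
        with open(handle, "rb") as f:
            f.seek(offset)
            return f.read(size)

    def stream_write(self, handle, offset, data):
        with open(handle, "r+b") as f:
            f.seek(offset)
            return f.write(data)


class LibOS:
    """Linux-style open/read/write built only on the Pal interface."""

    def __init__(self, pal):
        self.pal = pal
        self.fds = {}      # fd -> [handle, file offset]
        self.next_fd = 3   # 0-2 are reserved for stdio

    def open(self, name):
        fd = self.next_fd
        self.next_fd += 1
        self.fds[fd] = [self.pal.stream_open(name), 0]
        return fd

    def write(self, fd, data):
        entry = self.fds[fd]
        written = self.pal.stream_write(entry[0], entry[1], data)
        entry[1] += written
        return written

    def read(self, fd, size):
        entry = self.fds[fd]
        data = self.pal.stream_read(entry[0], entry[1], size)
        entry[1] += len(data)
        return data

    def lseek(self, fd, offset):
        self.fds[fd][1] = offset
        return offset


def demo(pal):
    """Run the same 'application' against whichever PAL is supplied."""
    libos = LibOS(pal)
    fd = libos.open("greeting")
    libos.write(fd, b"hello libos")
    libos.lseek(fd, 6)
    return libos.read(fd, 5)


if __name__ == "__main__":
    print(demo(MemoryPal()))
    with tempfile.TemporaryDirectory() as root:
        print(demo(FilePal(root)))
```

File descriptors and offsets live entirely in the libOS, so the PAL stays small and stateless enough to port quickly.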
How Easy is Porting Our Host ABI?
• BSD PAL (released): 2 MS students × term project
• WIN PAL (experimental): 1 MS student × 2 semesters
• OSX PAL (experimental): 1 MS student × 3 semesters
• SGX PAL (released): 1 PhD student (me) × 3 months
Not all straightforward, but we learned where the pains are.
• Problem: can’t set the FS register!
• Problem: mmap() vs MapViewOfFile()
Summary
• A LibOS to implement Linux ABI; painful, but reusable
• Host ABI is simple and portable
• Porting a PAL = Porting all applications
How does Graphene gain compatibility?
Porting to Intel SGX (A Uniquely Challenging Example)
What Is Intel SGX?
• SGX = Software Guard Extensions
• Available on Intel 7th-gen and later E3 / i5 / i7 CPUs
• Hardware enclave containing trusted code
• Data stays encrypted in DRAM
• Program integrity
• CPU attestation
What Can Intel SGX Do?
• Assume the host is untrusted
• You only have to trust your software and
Hacked OS or hypervisor
Modified devices
Interposed DRAM
Compromised admins
As a Platform, SGX Has Many Restrictions
• Limited physical memory (93.5 MB)
• Only ring-3 (no VT)
• Cannot make system calls (for explicit security reasons)
Hardware Enclave
Serving System Calls Inside Applications
• LibOS absorbs all system calls
• RPCs for I/O & scheduling
• Shielding: verify RPC results from untrusted hosts
Diagram: inside the hardware enclave, the Graphene LibOS intercepts syscalls and sits on the SGX PAL; the SGX PAL reaches the host OS through remote procedure calls.
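The shielding idea can be illustrated with a small sketch. It is an assumption-laden model, not Graphene-SGX's implementation: `shielded_read`, `ShieldError`, and the three fake hosts are hypothetical names, and the check against a SHA-256 digest recorded ahead of time is one plausible way to validate file data returned by an untrusted host.

```python
import hashlib


class ShieldError(Exception):
    """Raised when the untrusted host returns an implausible result."""


def shielded_read(host_read, path, size, trusted_sha256):
    """Ask the untrusted host for file data, then verify it before use."""
    data = host_read(path, size)
    # A malicious host may return the wrong type or overflow the buffer.
    if not isinstance(data, bytes) or len(data) > size:
        raise ShieldError("host returned a malformed buffer")
    # Compare the content with a digest recorded before deployment.
    if hashlib.sha256(data).hexdigest() != trusted_sha256:
        raise ShieldError("file content does not match trusted digest")
    return data


CONFIG = b"workers = 4\n"
TRUSTED = hashlib.sha256(CONFIG).hexdigest()


def honest_host(path, size):
    """Host that returns the real file content."""
    return CONFIG[:size]


def tampering_host(path, size):
    """Host that silently changes the file content."""
    return b"workers = 9\n"[:size]


def oversized_host(path, size):
    """Host that returns more bytes than the caller asked for."""
    return b"A" * (size + 1)


def is_rejected(host):
    """Return True when the shield refuses the host's answer."""
    try:
        shielded_read(host, "app.conf", 64, TRUSTED)
    except ShieldError:
        return True
    return False


if __name__ == "__main__":
    for host in (honest_host, tampering_host, oversized_host):
        print(host.__name__, "rejected" if is_rejected(host) else "accepted")
```

The enclave never acts on a host reply until both the shape and the content of the reply have been checked.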
Sharing Memory is a Big Problem
Linux is multi-process: servers, shells, daemons
• Enclaves can’t share memory
• Why not single-enclave?
  – Position-dependent binaries
  – Process means isolation
• LibOSes need to share state:
  – Fork, IPCs, namespaces
Diagram (multi-enclave): bash, ps, and grep each run on their own LibOS instance.
Assumes No Shared Memory
• Basically a distributed OS w/ RPCs
  – Shared namespaces
  – Fork by migration
  – IPCs: signal, msg queue, semaphore
  – No System V shared mem
Diagram: bash, ps, and grep each run on their own LibOS instance; the instances coordinate over RPC.
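"Fork by migration" can be pictured as checkpoint, ship, restore. The sketch below is a simplified model under stated assumptions: `LibOSInstance`, `checkpoint`, `restore`, and `fork_by_migration` are hypothetical names, process state is reduced to two dictionaries, and JSON stands in for whatever wire format a real libOS would send over its RPC stream.

```python
import json


class LibOSInstance:
    """Toy libOS holding the state of one process."""

    def __init__(self, pid, memory=None, fds=None):
        self.pid = pid
        self.memory = memory if memory is not None else {}
        self.fds = fds if fds is not None else {}

    def checkpoint(self):
        """Serialize process state into bytes that can cross an RPC stream."""
        return json.dumps({"memory": self.memory, "fds": self.fds}).encode()

    @classmethod
    def restore(cls, pid, blob):
        """Rebuild a process in a fresh instance from a checkpoint."""
        state = json.loads(blob.decode())
        return cls(pid, state["memory"], state["fds"])


def fork_by_migration(parent, child_pid):
    """Emulate fork() without shared memory by copying state as bytes."""
    blob = parent.checkpoint()                      # parent side: snapshot
    return LibOSInstance.restore(child_pid, blob)   # child side: rebuild


if __name__ == "__main__":
    parent = LibOSInstance(1, {"counter": 1}, {"3": "pipe:out"})
    child = fork_by_migration(parent, 2)
    child.memory["counter"] += 1
    print(parent.memory, child.memory)
```

Because the child is rebuilt from bytes, later writes in the child never reach the parent, which matches the no-shared-memory assumption.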
Summary
• LibOS serves APIs on a flattened architecture
• For multi-proc: Graphene keeps distributed OS views without shared memory
Why does Graphene work on SGX while containers/VMs don’t?
Security Isolation & Sandboxing
Mutually-Distrusting Containers
• SW technique
  – No HW isolation
  – Can’t stop kernel bugs
Diagram: User A and User B distrust each other. Each runs App + Bin/Lib inside its own user, PID, and mount namespaces, and both issue syscalls to the shared Linux OS.
Mutually-Distrusting LibOS Instances
• If syscalls are served inside libOS, no attack can happen
Diagram: User A and User B each own a trust group of processes, and the two groups distrust each other. Every process has its own LibOS and PAL, with process isolation between them.
Protecting Host OS From LibOS
Diagram: the same two mutually distrusting trust groups, each process with its own LibOS and PAL. Each group’s syscalls to the host OS (Linux) pass through a seccomp filter.
Default Seccomp Filter: Graphene vs Docker
• What’s used most of the time in the cloud
Graphene (48 syscalls allowed): https://github.com/oscarlab/graphene/blob/master/Pal/src/security/Linux/filter.c
SYSCALL(__NR_accept4, ALLOW),
SYSCALL(__NR_clone, JUMP(&labels, clone)),
SYSCALL(__NR_close, ALLOW),
SYSCALL(__NR_dup2, ALLOW),
SYSCALL(__NR_exit, ALLOW),
...
Note: the clone rule only allows a specific flag value.
Docker (307 syscalls allowed): https://github.com/moby/moby/blob/master/profiles/seccomp/default.json
"names": [
"accept",
"accept4",
"access",
...
],
"action": "SCMP_ACT_ALLOW",
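The allowlist semantics of the Graphene filter can be modelled in a few lines. This is only a model of the decision logic, not a real seccomp or BPF program: `FILTER`, `evaluate`, `clone_rule`, and the `REQUIRED_CLONE_FLAGS` value are hypothetical and chosen for illustration; the actual flag value lives in filter.c.

```python
ALLOW = "ALLOW"
KILL = "KILL"

# Hypothetical flag value for illustration; the real filter.c defines its own.
REQUIRED_CLONE_FLAGS = 0x100 | 0x200


def clone_rule(args):
    """Allow clone only when the flags argument matches exactly."""
    return ALLOW if args and args[0] == REQUIRED_CLONE_FLAGS else KILL


# Syscall name -> fixed action, or a callable that inspects the arguments.
FILTER = {
    "accept4": ALLOW,
    "clone": clone_rule,
    "close": ALLOW,
    "dup2": ALLOW,
    "exit": ALLOW,
}


def evaluate(filter_table, name, args=()):
    """Return the action for one syscall; anything unlisted is denied."""
    rule = filter_table.get(name, KILL)
    return rule(args) if callable(rule) else rule


if __name__ == "__main__":
    print(evaluate(FILTER, "close"))
    print(evaluate(FILTER, "ptrace"))
    print(evaluate(FILTER, "clone", (REQUIRED_CLONE_FLAGS,)))
    print(evaluate(FILTER, "clone", (0,)))
```

A short table with deny-by-default is easy to audit, and argument checks such as the clone rule narrow an allowed call further than a name-only allowlist can.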
Not enough? Try Graphene-SGX Containers
• Graphene-SGX as a backend for Docker
Diagram: a Dockerfile goes to the Docker Engine, which generates a Graphene configuration; launching the Docker container then runs the Graphene LibOS and SGX PAL inside a hardware enclave.
Summary
• System calls inside libOS are naturally isolated
• Much smaller seccomp filter (48 calls)
• Graphene-SGX containers: mutual protection between OS and applications
Why is Graphene better at sandboxing than containers?
Functionality & Performance
Current LibOS Implementation
Graphene LibOS components: Virtual File System, ProcFS, RPC, ELF loader, Socket, Chroot (Passthru) FS, Pipe, Signal, SYS V IPC, Thread, fork, Migration, Namespace, VMA, exec
• 145 / 318 system calls implemented (core features)
• 34 KLOC of source code
• 909 KB library size
Tested Applications
… and more.
See examples on: https://github.com/oscarlab/graphene
Memory Usage & Startup Time
Figure: memory usage (MB) for three workloads (make -j4, Apache 4-proc, bash unixbench) and startup time (millisec), comparing Graphene on Linux, LXC, and KVM. Startup time labels in legend order: 0.64, 200, 10,342.
Graphene is as lightweight as containers, with extremely short startup time.
R Benchmarks
Figure: overhead relative to Linux for Linux, Graphene on Linux, and Graphene on SGX.
Graphene itself adds no overhead, but SGX does (up to 10x).
Microservices (Threads vs Processes)
Figure: two panels, (25 threads) and (5-proc), each plotting response time (s) against throughput (k req/s) for Linux, Graphene on Linux, and Graphene-SGX.
• Nearly no throughput loss at high concurrency
• With IPCs, 5% throughput loss on Graphene-Linux, 25% throughput loss on SGX
Takeaway Note
• LibOS: Compatibility & sandboxing w/o VMs, as light as containers.
• Graphene LibOS:
  – Aiming for full Linux compatibility (progress: 45%)
  – What’s the craziest place you want to run Linux programs? It’s possible!
https://github.com/oscarlab/graphene
Send your questions & feedback to: [email protected]