LIBMPK: SOFTWARE ABSTRACTION FOR INTEL MEMORY …libmpk-slides.pdf · Chakracore (10 LoC) :...
Transcript of LIBMPK: SOFTWARE ABSTRACTION FOR INTEL MEMORY …libmpk-slides.pdf · Chakracore (10 LoC) :...
Soyeon Park, Sangho Lee, Wen Xu, Hyungon Moon and Taesoo Kim
LIBMPK: SOFTWARE ABSTRACTION FOR INTEL MEMORY PROTECTION KEYS (INTEL MPK)
INTRODUCTION
SECURITY CRITICAL MEMORY REGIONS NEED PROTECTION
▸ JIT page“To achieve code execution, we can simply locate one of these RWX JIT pages and overwrite it with our own shellcode.” - [1]
▸ Personal information ▸ Password ▸ Private key
“We confirmed that all individuals used only the Heartbleed exploit to obtain the private key.” - [2]
[1] Amy Burnett, et al. “Weaponization of a Javascriptcore vulnerability” RET2 Systems Engineering Blog [2] Nick Sullivan “The Results of the CloudFlare Challenge” CloudFlare Blog
INTRODUCTION
EXAMPLE 1 - HEARTBLEED ATTACK
Private key
Heartbleed request
Leaked data including private key Web Server
Reply “HELLO” (1000 bytes)
1000 bytes
H E L L O · · ·
· · · Private key ·
· ·
INTRODUCTION
EXAMPLE 1 : EXISTING SOLUTION TO PROTECT MEMORY▸ Process separation
Process
MEMORY
[1] Song, Chengyu, et al. "Exploiting and Protecting Dynamic Code Generation”, NDSS 2015. [2] Litton, James, et al. "Light-Weight Contexts: An OS Abstraction for Safety and Performance”, OSDI 2016.
Process
MEMORY
INTRODUCTION
EXAMPLE 2 - EXISTING SOLUTION TO PROTECT JIT PAGE
Process
mprotect(W)
Write code
mprotect(RX)Code Cache
Execute WriteWrite
▸ JIT page W^X protection
INTRODUCTION
PROBLEMS OF EXISTING SOLUTIONS
▸ Process Separation
▸ W^X Protection
High overhead to spawn new process and synch data
Race condition due to permission synchronization
Multiple cost to change permission of multiple pages
This talk: utilizing a hardware mechanism, Intel Memory Protection Key (MPK), to address these challenges
OUTLINE
OUTLINE▸ Introduction
▸ Intel MPK Explained ▸ Challenges ▸ Design ▸ Implementation ▸ Evaluation ▸ Discussion ▸ Related Work ▸ Conclusion
INTEL MPK EXPLAINED
OVERVIEW
▸ Support fast permission change for page groups with single instruction ▸ Fast single invocation ▸ Fast permission change for multiple pages
Kernel
mprotect Intel MPK
Userspace
Late
ncy
(ms)
0
4.5
9
13.5
18
Number of pages1000 6000 11000 16000 21000 26000 31000 36000
mprotect (contiguous)mprotect (sparse)
INTEL MPK EXPLAINED
UNDERLINE IMPLEMENTATION
Kernel
pkey 2 <- R/W page 120 -> R/W
pkey 2 <- R page 120 -> R
page # pkey ··· perm.120 2 ··· R/W···
< Page table>
WRPKRU
RDPKRUpkey_mprotect
32-bit register
16 pkeys
R W R W
▸ Permissions per cpu ▸ 32-bit PKRU register contains keys/perm ▸ WRPKRU: write key/perm ▸ RDPKRU: read key/perm
RWX
function init() pkey = pkey_alloc() pkey_mprotect(code_cache, len, RWX, pkey)
function JIT() WRPKRU(pkey, W) ... write code cache ... WRPKRU(pkey, R)
function fini() pkey_free(pkey)
INTEL MPK EXPLAINED
EXAMPLE - JIT PAGE W^X PROTECTION
Write code in code cache
Grant permission
Revoke permission
pkey = 1
CODE CACHERWX
R1
W
PKRU Register1
INTEL MPK EXPLAINED
EXAMPLE : EXECUTABLE-ONLY MEMORY
function init() pkey = pkey_alloc() pkey_mprotect(code_cache, len, RWX, pkey)
function JIT() WRPKRU(pkey, W) ... write code cache ... WRPKRU(pkey, R)
function fini() pkey_free(pkey)
pkey
CODE CACHERWX
INTEL MPK EXPLAINED
function init() pkey = pkey_alloc() pkey_mprotect(code_cache, len, RWX, pkey)
function JIT() WRPKRU(pkey, W) ... write code cache ... WRPKRU(pkey, None)
function fini() pkey_free(pkey)
CODE CACHERWX RWX
pkey
EXAMPLE : EXECUTABLE-ONLY MEMORY
OUTLINE
OUTLINE▸ Introduction ▸ Intel MPK Explained
▸ Challenges ▸ Non-scalable Hardware Resource ▸ Asynchronous Permission Change
▸ Design ▸ Implementation ▸ Evaluation ▸ Discussion ▸ Related Work ▸ Conclusion
CHALLENGES
NON-SCALABLE HARDWARE RESOURCE▸ Only 16 keys are provided
Process
Write code cache 1
W
R
1 2 3 4
5 … 16 17
Write code cache 16
W
R
Write code cache 17
W
R
pkey 1
pkey 2
pkey 3
pkey 4
pkey 5
pkey 16
pkey 1
pkey 16
pkey ?
Process
CHALLENGES
ASYNCHRONOUS PERMISSION CHANGE - PROS▸ Permission change with MPK is per-thread intrinsically
Code Cache
Write Code Cache
W
R
RXRX
RX
RX
RX
Process
CHALLENGES
ASYNCHRONOUS PERMISSION CHANGE - PROS
Code Cache
Write Code Cache
W
R
pkey
RXRX
RX
RX
W
Write
▸ Permission change with MPK is per-thread intrinsically
CHALLENGES
ASYNCHRONOUS PERMISSION CHANGE - CONS▸ Permission synchronization is necessary in some context
Process
Code Cache
Write Code Cache
W
None pkey
RXRX
RX
RX
X
CHALLENGES
Process
Code Cache
Write Code Cache
W
None pkey
RXRX
RX
RX
X
Read
ASYNCHRONOUS PERMISSION CHANGE - CONS▸ Permission synchronization is necessary in some context
DESIGN
REVISIT : CHALLENGES
▸ Non-scalable Hardware Resources
▸ Asynchronous Permission Change
Key virtualization solve by key indirection.
libmpk provide permission synchronization API
Library
Application
DESIGN
KEY VIRTUALIZATION
▸ Decoupling physical keys from user interface ▸ Key indirection working like cache
Write code
W
R
pkey 1 pkey 16 pkey ?vkey 1 vkey 16 vkey 17
pkey 1 pkey 16
Write code
W
R
Write code
W
R
😱
😊
Evicted
➊ call mpk_mprotect()
pkey_sync
DESIGN
INTER-THREAD PERMISSION SYNCHRONIZATION
Userspace
Kernel
THREAD A STATE : RUNNING
THREAD B STATE : RUNNING
➍ return➌ interrupt
SLEEP
➋ add hooks
task_work
WRPKRU
➎ update PKRU (rescheduled)
RUNNING
RX RXX X
IMPLEMENTATION
IMPLEMENTATION▸ libmpk is written in C/C++ ▸ Userspace library : 663 LoC ▸ Kernel support : 1K LoC ▸ Permission Synchronization ▸ Kernel module for managing metadata ▸ Userspace cannot fabricate metadata
‣ We open source at https://github.com/sslab-gatech/libmpk
function init() vkey = libmpk_mmap(&code_cache, len, RWX)
function JIT() libmpk_begin(vkey, W) ... write code cache ... libmpk_end(vkey) libmpk_mprotect(vkey, X)
CODE CACHERWX
IMPLEMENTATION
USE CASE - JIT PAGE W^X PROTECTION
RWX
XX
XX
Key virtualization
Permission synchronization
OUTLINE
OUTLINE▸ Introduction ▸ Intel MPK Explained ▸ Challenges ▸ Design ▸ Implementation ▸ Evaluation ▸ Usability ▸ Checking overhead occurred by design ▸ Use cases - applying for memory isolation and protection
▸ Discussion ▸ Related Work ▸ Conclusion
EVALUATION
LIBMPK IS EASY TO ADOPT
▸ OpenSSL (83 LoC) : protecting private key ▸ Memcached (117 LoC) : protecting slabs ▸ Chakracore (10 LoC) : protecting JIT pages
EVALUATION
LATENCY - KEY VIRTUALIZATION
▸ Cache miss costs overhead due to eviction
0.0
0.8
1.5
2.3
3.0
0 25 50 75 100Hit rate
Tim
e (μ
s)
HitMiss
mprotect
Reasonable overhead while providing similar functionality.
EVALUATION
LATENCY - INTER-THREAD PERMISSION SYNCHRONIZATION
▸ Performance ▸ 1,000 pages : 3.8x ▸ Single page : 1.7x
Late
ncy (μs
)
0
10
20
30
40
Number of threads
1 5 10 15 20 25 30 35 40
mpk_mprotectmprotect (4KB)mprotect (4000KB)
libmpk outperform mprotect regardless of the number of pages.
EVALUATION
FAST MEMORY ISOLATION - OPENSSL & MEMCACHED▸ OpenSSL ▸ request/sec: 0.53%
slowdown
requ
est/
sec
0
375
750
1125
1500
Size of each request (KB)1 2 4 8 16 32 64 128 256 512 1024
original libmpk
Kbyt
e/se
c
0
125
250
375
500
Number of connections
250 500 750 1000
original mpk_inthreadmpk_synch mprotect
▸ For 1GB protection : ▸ original vs mpk_inthread :
0.01% ▸ mpk_synch vs mprotect :
8.1x
EVALUATION
FAST AND SECURE W⊕X - JIT COMPILATION▸ Chakracore ▸ mprotect-based protection ▸ Allows race-condition attack
▸ 4.39% performance improvement (31.11% at most)
Norm
alize
d Sc
ore
0.70
0.85
1.00
1.15
1.30
RICH
ARDS
DELT
ABLU
ECR
YPTO
RAYT
RACE
EARL
EYBO
YER
REGE
XP
SPLA
YSP
LAYL
ATEN
CYNA
VIER
STOK
ESPD
FJS
MANDR
EEL
MANDR
EELL
ATEN
CYGA
MEBOY
CODE
LOAD
BOX2
D
ZLIB
TYPE
SCRI
PT
mprotect libmpk
DISCUSSION
DISCUSSION▸ Rogue data cache load (Meltdown) ▸ MPK is also affected by the Meltdown attack ▸ Hardware or software-level mitigation
▸ Code reuse attack ▸ Arbitrary executed WRPKRU may break the security ▸ Applying sandboxing or control-flow integrity
▸ Protection key use-after-free ▸ pkey_free does not perfectly free the protection key ▸ Pages are still associated with the pkey after free
RELATED WORK
RELATED WORK▸ ERIM [1] : Secure wrapper of MPK ▸ Shadow Stack [2] : Shadow stack protected by MPK ▸ XOM-Switch [3] : Code-reuse attack prevention with
execute-only memory supported by MPK
[1] Anjo Vahldiek-Oberwagner, et al. “ERIM: Secure, Efficient In-Process Isolation with Memory Protection Keys”, Security 2019 [2] Nathan Burow, et al. “Shining Light on Shadow Stacks”, Oakland 2019 [3] Mingwei Zhang, et al. “XOM-Switch: Hiding Your Code From Advanced Code Reuse Attacks in One Shot”, Black Hat Asia 2018
CONCLUSION
CONCLUSION▸ libmpk is a secure, scalable, and
synchronizable abstraction of MPK for supporting fast memory protection and isolation with little effort.
THANKS!https://github.com/sslab-gatech/libmpk