HPCA 2019 Keynote Towards Secure High- Performance ...people.csail.mit.edu/devadas/pubs/hpca.pdf ·...
Transcript of HPCA 2019 Keynote Towards Secure High- Performance ...people.csail.mit.edu/devadas/pubs/hpca.pdf ·...
Srini DevadasMassachusetts Institute of Technology
HPCA 2019 Keynote
Towards Secure High-Performance Computer
Architectures
Architectural Isolationof Processes
2
Fundamental to maintaining correctness and privacy!
Performance Dictates Microarchitectural Optimization
3
Isolation Breaks Because of Shared Microarchitectural State!
Shared Last Level Cache
4
Line Offsetl-1…0
Address Tagn-1…s+l
Set Indexs+l-1…l
Memory Address
…Set S-1, Way 1 Set S-1, Way W-1Set S-1, Way 0⋮ ⋱ ⋮⋮
Set i, Way 1 Set i, Way W-1…Set i, Way 0
⋮⋮ ⋮ ⋱
Set 1, Way W-1Set 1, Way 0 …Set 1, Way 1
Set 0, Way W-1…Set 0, Way 1Set 0, Way 0
Way W-1…Way 1Way 0
Tag Line Tag Line Tag Line
Matched Line
Tag Comparator
Match? Matched Word
Process 1
Process 2
Control Flow Speculation for Performance
I: Compute
I+1: Compute
I+2: Compute
I+3: Compute
I: Control Flow
J: Compute
J+1: Compute
J+2: Compute
K: Compute
K+1: Compute
K+2: Compute
Correct direction
Mis-speculated direction
Sequential InstructionExecution
Non-Sequential InstructionExecution
5
Control Flow Speculation is insecure
Speculative execution does not affect architectural state → “correct”… but can be observed via some “side channels” (primarily cache tag state)
… and attacker can influence (mis)speculation (branch predictor inputs not authenticated)
A huge, complex attack surface!6
Domain of Victim
Transmitter
Secret
ReceiverChannelAccess
Secret
Attacker
Pre-existing (RSA conditional execution example)Written by attacker (Meltdown)Synthesized out of existing victim code by attacker (Spectre)
Building a Transmitter
7
8
• Real systems: large, complex, cyberphysical(not secure)
•Introducing a spy:
Hypervisor,Bios
CPU
Side Channels Gone Wild!
DR
AMChipset
Network
Thread L1 $L2 $
L3 $DR
AM C
trl.
Disk
Mainboard
OS
sharing!
sharing!sharing!
App
admins! users!vendors!
users!
admins!
Philosophy
Build enclaves on an enclave platform, not
just processes
9
Enclaves strengthenthe process abstraction
• Processes guarantee isolation of memory• Enclaves provide a stronger guarantee
– No other program can infer anything private from the enclave program through its use of shared resources or shared microarchitectural state
• Largely decouple performance considerations from security
• Minimally invasive hardware changes• Provable security under chosen threat model
10
A Typical ComputerTrusted Computing Base
TRUSTED!
TRUSTED!
CPU!!
DRAM
!Chipset!
Network!
Thread! L1 $!L2 $!
L3 $!DRAM
Ctrl
.!Disk!
Main!board!
Priv
eleg
e!
BIOS (SMM)!
Hypervisor (Ring 0, VMX root) !
OS Kernel (Ring 0) !
App! App!
(Ring 3) !
Software…!… Running on hardware!
Secure App!
Single-Chip Secure Processor:Shrink the TCB
Protected Environment
Memory
I/O
TrustedSoftware
Protect
Identify
• Enclave assumes trusted hardware + trusted software “monitor”
• Operating system is untrusted 12
Edward Suh’s ICS 2003 Talk on Aegis processor
Enclave Enclave
Enclave Lifecycle (simplified)
13
Enclave binary image
Untrusted software (OS)
Create enclave,Grant resources
Loadenclave
Sealenclave
Enclave
Enclave binary image
Enclave binary image
① ② ③
Enclave can no longer be modified from outside
Enclave has a measurement for attestation
Enclave Lifecycle (simplified)
14
Untrusted software (OS)
Create enclave,Grant resources
Loadenclave
Sealenclave
Enterenclave
Exitenclave
Enclave binary image
④① ② ③
⑤
(enclave executes in a strongly isolated
environment)
Platform erases (flushes)
sensitive state
Strong Isolation Goal
• Any attack by a privileged attacker on the same machine as the victim that can extract a secret inside the victim enclave, could also have been run successfully by an attacker on a different machine than the victim.– No protection against an enclave leaking its own
secrets through its public API.
• Three strategies for isolation: Spatial isolation, temporal isolation and cryptography 15
Sanctum Design
Victor Costan, Ilia LebedevSanctum: Minimal Hardware Extensions
for Strong Software Isolation
Software Stack
User
Supervisor
Hypervisor
Machine
Hypervisor
Host ApplicationEnclave
Security MonitorMeasurement Root
Enclave multiplexing
Operating System
Enclave management
Enclave syscall shims Sanctum-aware runtimeNon-sensitive code and data Sensitive code and data
Enclave setup
User
Supervisor
Hypervisor
Machine
Hypervisor
Host ApplicationEnclave
Security MonitorMeasurement Root
Enclave multiplexing
Operating System
Enclave management
Enclave syscall shims Sanctum-aware runtimeNon-sensitive code and data Sensitive code and data
Enclave setup
Software Stack
Target: multi-core processor(no hyperthreading, no speculation)
L3 C
ache
CBox
Core
L2 Cache
L3 CacheSlice
L3 CacheSlice
CBox
Core
L2 Cache
HomeAgent
CBox
Core
L2 Cache
L3 CacheSlice
L3 CacheSlice
CBox
Core
L2 Cache
QPIPacketizer
MemoryController
DDR3Channel
Ring toQPI
Ring toPCIeI/O Controller
UBox
QPI Link
PCIe Lanes
Microarchitectural StateIsolation in Sanctum Enclaves
• Resources exclusively granted to an enclave, and scheduled at the granularity of process context switches are isolated temporally– Register files, branch predictors, private caches,
and private TLBs
• Resources shared between processes on-demand, with arbitrarily small granularity are isolated spatially by partitioning– Shared caches and shared TLBs
20
Operating SystemManages Page Tables
VirtualAddress
Physical AddressMapping
PageTables
VirtualAddress Space
PhysicalAddress Space
Address Translation
Software DRAM
System bus
Practical Software Attackon SGX “Simulators”
• Microsoft Research, IEEE S&P 2015: Exploit no-noise side channel due to page faults
22
Page Table Isolation
Host applicationspace
Host applicationspace
EVRANGE A
Enclave A VirtualAddress Space
Physical Memory
Enclave A region
Enclave A page tables
Enclave A region
Enclave B region
Enclave B page tables
OS region
OS region
OS page tables
Host applicationspace
Host applicationspace
EVRANGE B
Enclave B VirtualAddress Space
Partitioning to Prevent Timing Attacks
24
Line Offsetl-1…0
Address Tagn-1…s+l
Set Indexs+l-1…l
Memory Address
…Set S-1, Way 1 Set S-1, Way W-1Set S-1, Way 0⋮ ⋱ ⋮⋮
Set i, Way 1 Set i, Way W-1…Set i, Way 0
⋮⋮ ⋮ ⋱
Set 1, Way W-1Set 1, Way 0 …Set 1, Way 1
Set 0, Way W-1…Set 0, Way 1Set 0, Way 0
Way W-1…Way 1Way 0
Tag Line Tag Line Tag Line
Matched Line
Tag Comparator
Match? Matched Word
Enclave
OS
Page Coloring
DRAM RegionIndex
CacheLine Offset
5 0611Page Offset
1214
Cache Set Index
DRAM StripeIndex
151720 18Cache Tag
Address bits used by 256 KB of DRAM
Address bits covering the maximum addressable physical space of 2 MB
Physical page number (PPN)
Page Colors =
DRAM Regions
0
MEMTOP
Normal DRAMpage colors
Sanctum DRAMpage colors / regions
0
MEMTOP
Region 0 Region 1 Region 2 Region 3Region 4 Region 5 Region 6 Region 7
LLC set colors
Page Colors =
DRAM Regions
0
MEMTOP
Normal DRAMpage colors
Sanctum DRAMpage colors / regions
0
MEMTOP
Region 0 Region 1 Region 2 Region 3Region 4 Region 5 Region 6 Region 7
LLC set colors
A little bit-shifting gets us a large contiguous DRAM region
Sanctum Secure ProcessorNo Speculation, No Hyperthreading
RISCV Rocket Core, Changes required by Sanctum (+ ~2% of core)
Also requires 9 new config registers
Sanctum Status andCurrent Limitations
• We have built an open-source Sanctum based on the RISC-V ISA– Low performance and area overhead to support
enclaves– Ongoing formal verification effort
• Sanctum is an academic, lightweight processor• Apply its design philosophy to speculative
out-of-order (OOO) processors, which need to protect against Spectre-style attacks
29
MI6 Design
Thomas Bourgeat, Ilia Lebedev,Andrew Wright, Sizhuo Zhang, Arvind
MI6: Secure Enclaves in aSpeculative Out-of-Order Processor
RiscyOO Processor
31
Rename
ROB
ALU IQ Issue RegRead Exec Reg
Write
MEM IQ Issue RegRead
AddrCalc
UpdateLSQ
Physical Reg File
L1 D TLB
LSQ (LQ + SQ)
Commit
IssueLd
Deq
StoreBuffer
L1 D$
RespLd
IssueSt
RespSt
RenameTable
SpeculationManager
EpochManager
Scoreboard
ALU pipeline
MEM pipeline
Load-Store Unit
Front-end
Fetch Bypass
Last Level Cache
RiscyOO Processor
32
Rename
ROB
ALU IQ Issue RegRead Exec Reg
Write
MEM IQ Issue RegRead
AddrCalc
UpdateLSQ
Physical Reg File
L1 D TLB
LSQ (LQ + SQ)
Commit
IssueLd
Deq
StoreBuffer
L1 D$
RespLd
IssueSt
RespSt
RenameTable
SpeculationManager
EpochManager
Scoreboard
ALU pipeline
MEM pipeline
Load-Store Unit
Front-end
Fetch Bypass
Last Level Cache
Not speculating during entire program results in average 3X slowdown!
MI6 Processor
33
Rename
ROB
ALU IQ Issue RegRead Exec Reg
Write
MEM IQ Issue RegRead
AddrCalc
UpdateLSQ
Physical Reg File
L1 D TLB
LSQ (LQ + SQ)
Commit
IssueLd
Deq
StoreBuffer
L1 D$
RespLd
IssueSt
RespSt
RenameTable
SpeculationManager
EpochManager
Scoreboard
ALU pipeline
MEM pipeline
Load-Store Unit
Front-end
Fetch Bypass
Last Level Cache
All modules with private state are flushed on enclave entry and exit.
Performance overhead ~ 5%.
MI6 Processor
34
Rename
ROB
ALU IQ Issue RegRead Exec Reg
Write
MEM IQ Issue RegRead
AddrCalc
UpdateLSQ
Physical Reg File
L1 D TLB
LSQ (LQ + SQ)
Commit
IssueLd
Deq
StoreBuffer
L1 D$
RespLd
IssueSt
RespSt
RenameTable
SpeculationManager
EpochManager
Scoreboard
ALU pipeline
MEM pipeline
Load-Store Unit
Front-end
Fetch Bypass
Last Level Cache
Last Level Cache is spatially partitioned.
Performance overhead ~7%.
Leaky Cache Hierarchy
35
Timing IndependentCache Hierarchy
36
Fair LLC Arbiter.Performance overhead ~8%.
MI6 Processor
37
Rename
ROB
ALU IQ Issue RegRead Exec Reg
Write
MEM IQ Issue RegRead
AddrCalc
UpdateLSQ
Physical Reg File
L1 D TLB
LSQ (LQ + SQ)
Commit
IssueLd
Deq
StoreBuffer
L1 D$
RespLd
IssueSt
RespSt
RenameTable
SpeculationManager
EpochManager
Scoreboard
ALU pipeline
MEM pipeline
Load-Store Unit
Front-end
Fetch Bypass
Last Level Cache
Stop speculating: Every memory instruction will be stalled at the rename stage until ROB is empty.
MI6 Processor
38
Rename
ROB
ALU IQ Issue RegRead Exec Reg
Write
MEM IQ Issue RegRead
AddrCalc
UpdateLSQ
Physical Reg File
L1 D TLB
LSQ (LQ + SQ)
Commit
IssueLd
Deq
StoreBuffer
L1 D$
RespLd
IssueSt
RespSt
RenameTable
SpeculationManager
EpochManager
Scoreboard
ALU pipeline
MEM pipeline
Load-Store Unit
Front-end
Fetch Bypass
Last Level Cache
Stop speculating ONLY during enclave copy operations (I/O) à
overhead is negligible.
MI6 Processor
39
Rename
ROB
ALU IQ Issue RegRead Exec Reg
Write
MEM IQ Issue RegRead
AddrCalc
UpdateLSQ
Physical Reg File
L1 D TLB
LSQ (LQ + SQ)
Commit
IssueLd
Deq
StoreBuffer
L1 D$
RespLd
IssueSt
RespSt
RenameTable
SpeculationManager
EpochManager
Scoreboard
ALU pipeline
MEM pipeline
Load-Store Unit
Front-end
Fetch Bypass
Last Level Cache
Area overhead for hardware modifications is ~2% not counting
L1 cache, FPU, LLC slice.
MI6 Processor
40
Rename
ROB
ALU IQ Issue RegRead Exec Reg
Write
MEM IQ Issue RegRead
AddrCalc
UpdateLSQ
Physical Reg File
L1 D TLB
LSQ (LQ + SQ)
Commit
IssueLd
Deq
StoreBuffer
L1 D$
RespLd
IssueSt
RespSt
RenameTable
SpeculationManager
EpochManager
Scoreboard
ALU pipeline
MEM pipeline
Load-Store Unit
Front-end
Fetch Bypass
Last Level Cache
Security Monitor virtually unchanged. Hardware can evolve
separately from software (and vice versa).
Challenge: Expressivity
• ~15% performance overhead for enclaves• Enclaves trade expressivity for security
– Cannot make system calls directly since OS can’t be trusted to restore an enclave’s execution state
– Enclave’s runtime must ask the host application to proxy file system and network I/O requests
– What syscall functionality should the enclave’s runtime provide?
41
Challenge: Adaptivity
• Runtime decisions based on sensitive data leak information through timing: completion time, resource usage
• Crypto to the rescue?– Secure demand paging using page-level memory
encryption, integrity verification and ORAM– Secure and efficient dynamic memory allocation
in enclaves an open problem
42
Challenge: Interaction
• Interaction with the outside may leak information– Public schedule for interaction does not leak
• Can we bound leakage of adaptive interactions with users, other programs?
43
Can ITrust You?
Challenge: (Formal) Verification
Open Source à Independent Verification
Properties of Enclaves:
Measurement := Different enclaves have different measurements (also inverse)
Integrity := Modelled attacker cannot affect enclave state
Confidentiality := Modelled attacker cannot observe enclave state
44
Modeling the Adversary
Adversary := set of ops an attacker can use to tamper with or observe enclave state. Any combination of these can be used at any time.
Threat model := ∪(observation function, tamper function, model initial state)
Specify non-interference properties or invariants that execution should satisfy
45
Invariants and Non-Interference
46
Check invariants
Do an attacker actionInit
operationsDo a victim action
The proof describes a CFG with “forks”. Search this graph for a path that violates an invariant.
Summary: Desiderata forSingle-Chip Secure Processor
• Open source• Formally verified (small) TCB• Secure against all practical software attacks• Secure against physical attacks on memory• Enhanced physical security against invasive
attacks• Minimal performance overhead
47
Thank you for your attention!48
Acknowledgements
• Edward Suh• Victor Costan• Ilia Lebedev• Chris Fletcher• Ling Ren• Albert Kwon• Sanjit Seshia• Pramod Subramanyan
• Arvind• Thomas Bourgeat• Andrew Wright• Sizhuo Zhang• Kyle Hogan• Jules Drean• Rohit Sinha• NSF, DARPA, ADI, Delta