Using Undocumented CPU Behaviour to See into Kernel Mode and Break KASLR in the Process

Anders Fogh and Daniel Gruss

August 4, 2016

About this presentation

This talk is about a class of microarchitectural attacks
- Not about software bugs
- It is about CPU design as an attack vector
- But not about the Instruction Set Architecture
- Focus on Intel x86-64; applies to other architectures too

Takeaways

- CPU design is security relevant
- Prefetch instructions leak information

Exploit this to:
- Locate a driver in the kernel = defeat KASLR
- Translate virtual to physical addresses for other attacks

Outline

- Introduction
- Memory subsystem
- Kernel Address-space Layout Randomization (KASLR)
- Prefetch Side Channel
- Prefetching the Kernel
- Case study: Defeating Windows 7 KASLR
- Case study: Exploiting direct-physical maps
- Bonus material

Introduction

The chamber of secrets

NOTE

Using the PREFETCH instruction is recommended only if data does not fit in cache. Use of software prefetch should be limited to memory addresses that are managed or owned within the application context. Prefetching to addresses that are not mapped to physical pages can experience non-deterministic performance penalty. For example specifying a NULL pointer (0L) as address for a prefetch can cause long delays.

The chamber of secrets (translation)

PLEASE, only use prefetch as Intel intends, or else Intel will be angry, and there is no reason why anyone would measure the execution time.

Whoami

- Anders Fogh
- Principal Security Researcher, GDATA Advanced Analytics
- Playing with malware since 1992
- Twitter: @anders_fogh
- Email: [email protected]

Whoami

- Daniel Gruss
- PhD Student, Graz University of Technology
- Currently intern at Microsoft Research Cambridge
- Twitter: @lavados
- Email: [email protected]

And the team

The research team
- Clementine Maurice
- Moritz Lipp
- Stefan Mangard

from Graz University of Technology

Memory subsystem

CPU Caches

Memory (DRAM) is slow compared to the CPU
- buffer frequently used memory for the CPU
- every memory reference goes through the cache
- transparent to OS and programs

Memory Access Latency

[Figure: histogram of memory access times for cache hits and cache misses. x-axis: access time in cycles (50 to 400); y-axis: number of accesses (log scale, 10^0 to 10^7).]

Data Caches

[Figure: four cores (core 0 to core 3), each with its own L1 and L2 cache, connected by a ring bus to four last-level cache slices (LLC slice 0 to LLC slice 3).]

Last-level cache:
- shared memory is shared in cache, across cores!
  → physically indexed
- need physical address to manipulate
- only one cache entry per physical address

Unprivileged cache maintenance

User programs can optimize cache usage:
- prefetch: suggests that the CPU load data into the cache
- clflush: throws data out of all caches

... based on virtual addresses

Caches: Software control

There are 5 prefetch instructions:
- prefetcht0: suggests that the CPU load data into L1
- prefetcht1: suggests that the CPU load data into L2
- prefetcht2: suggests that the CPU load data into L3
- prefetchnta: suggests that the CPU load data for non-temporal access
- prefetchw: suggests that the CPU load data with the intention to write

Actual behaviour varies between CPU models.

Caches: Software control

The prefetch instructions are somewhat unusual
- Hints: can be ignored by the CPU
- Do not check privileges or cause exceptions

Virtual and physical addressing

Why address translation: run multiple processes securely on a single CPU
- Let applications run in their own virtual address space
- Create exchangeable map from “virtual memory” to “physical memory”
- Privileges are checked on memory accesses
- Managed by the operating system kernel

Address translation on x86-64

48-bit virtual address: PML4I (9 b) | PDPTI (9 b) | PDI (9 b) | PTI (9 b) | Offset (12 b)

[Figure: four-level page walk. CR3 points to the PML4 (PML4E 0 to PML4E 511), indexed by PML4I. The selected entry points to the PDPT (PDPTE 0 to PDPTE 511), indexed by PDPTI, then to the Page Directory (PDE 0 to PDE 511), indexed by PDI, then to the Page Table (PTE 0 to PTE 511), indexed by PTI. The selected PTE points to a 4 KiB page (Byte 0 to Byte 4095), addressed by Offset.]
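As a rough illustration of the index split on this slide, the following Python sketch extracts the four 9-bit table indices and the 12-bit page offset from a 48-bit virtual address. The function name `split_virtual_address` and the example address are chosen here for illustration only and do not come from the talk.

```python
def split_virtual_address(vaddr: int) -> dict:
    """Split an x86-64 virtual address into 4-level paging indices and page offset."""
    vaddr &= (1 << 48) - 1  # drop the sign-extension bits 48..63
    return {
        "pml4i": (vaddr >> 39) & 0x1FF,  # bits 39..47 index the PML4
        "pdpti": (vaddr >> 30) & 0x1FF,  # bits 30..38 index the PDPT
        "pdi": (vaddr >> 21) & 0x1FF,    # bits 21..29 index the page directory
        "pti": (vaddr >> 12) & 0x1FF,    # bits 12..20 index the page table
        "offset": vaddr & 0xFFF,         # bits 0..11 select the byte in a 4 KiB page
    }


# Illustrative address only (hypothetical, not taken from the slides).
parts = split_virtual_address(0xFFFFF88000001234)
print(parts)
```

Each index is 9 bits wide because every table holds 512 entries, and the offset is 12 bits wide because a page is 4 KiB.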

Address Translation Caches

Problem: translation tables are stored in physical memory

Solution: Address Translation Caches

[Figure: each core (Core 0, Core 1) has an ITLB and a DTLB, followed by a PDE cache, a PDPTE cache, and a PML4E cache, and finally the page table structures in system memory (DRAM). An arrow marks the lookup direction through these structures.]

Kernel Address-space Layout Randomization (KASLR)

Kernel is mapped in every process

Today’s operating systems: shared address space

[Figure: a single address space from 0 to −1 holding user memory and kernel memory, with a context switch crossing between the two.]

Address-Space Layout Randomization (ASLR)

- Kernel and drivers at randomized offsets in virtual memory
- Mitigates code reuse attacks, e.g. return-oriented programming
- Attacks based on read primitives or write primitives
- But: leaking kernel/driver addresses defeats ASLR

Kernel direct-physical map

[Figure: the virtual address space, with the user part from 0 to 2^47 and the kernel part from −2^47 to −1. Physical memory, from 0 to max. phys., is mapped linearly into the kernel part as the direct map.]

Available on many operating systems / hypervisors
- OS X
- Linux
- BSD
- Xen PVM (Amazon EC2)

But not on Windows!

Prefetch Side Channel

Summary

1. The kernel is mapped in every process

2. The prefetch instruction takes a virtual address as input
   - To manipulate L3 a physical address is needed
   ⇒ The prefetch instruction must translate

3. The prefetch instruction does not check privileges
   ⇒ Any address can be prefetched

4. Translation is cached
   - Lookup searches caches in a fixed order

Can we measure a time difference?

There is a timing difference!

[Figure: prefetch execution time in cycles by mapping level. PDPT: 230, PD: 246, PT: 222, cached page: 181, uncached page: 383.]

Idea: Would this also work on inaccessible kernel memory?
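One simple way to turn such a measurement into a translation level is a nearest-value lookup. The sketch below assumes that the five timings on the plot (230, 246, 222, 181 and 383 cycles) belong to the axis labels in the order shown; the function name and the nearest-neighbour rule are illustrative assumptions, not the authors' implementation.

```python
# Timings in cycles as read off the slide's plot.
# Assumption: the values appear in the same order as the axis labels.
REFERENCE_CYCLES = {
    "PDPT": 230,
    "PD": 246,
    "PT": 222,
    "cached page": 181,
    "uncached page": 383,
}


def classify_level(measured_cycles: float) -> str:
    """Return the mapping level whose reference timing is closest to the measurement."""
    return min(
        REFERENCE_CYCLES,
        key=lambda level: abs(REFERENCE_CYCLES[level] - measured_cycles),
    )


print(classify_level(185))  # closest to 181
print(classify_level(370))  # closest to 383
```

The PDPT, PD and PT values differ by only a few cycles, so a single measurement is unlikely to separate them; averaging many repeated measurements per address would presumably be needed in practice.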

Prefetching the Kernel

A translation oracle

Definition of our translation oracle
- Timing the prefetch instruction on an arbitrary address will recover the translation level

Recovering a map of the kernel

Recovering /proc/pid/pagemap in 4 steps:

1. Recover mappings in PML4 by using the translation oracle on kernel addresses
2. Recover mappings in PDPT by using the translation oracle on kernel addresses
3. Recover mappings in PD by using the translation oracle on kernel addresses
4. Recover mappings in PT by using the translation oracle on kernel addresses

Complete process takes from seconds to hours depending on how pages are actually mapped by the operating system.
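The four steps amount to a top-down scan that only descends into regions the oracle reports as mapped. The following Python sketch shows that control flow against a toy oracle; `make_toy_oracle`, `recover_map` and the example addresses are hypothetical stand-ins, and a real attack would replace the toy oracle with prefetch timing. Large pages, where a mapping ends at the PD or PDPT level, are ignored here for brevity.

```python
# (level name, log2 of the region size covered by one entry at that level)
LEVELS = [("PML4", 39), ("PDPT", 30), ("PD", 21), ("PT", 12)]


def make_toy_oracle(mapped_pages):
    """Hypothetical stand-in for the timing oracle.

    Answers whether anything is mapped in the aligned region of size
    2**shift that contains addr, based on a known set of 4 KiB pages.
    """
    def oracle(addr, shift):
        return any(page >> shift == addr >> shift for page in mapped_pages)
    return oracle


def recover_map(oracle, start, end, depth=0):
    """Probe [start, end) level by level, descending only into mapped regions."""
    _, shift = LEVELS[depth]
    found = []
    for addr in range(start, end, 1 << shift):
        if not oracle(addr, shift):
            continue  # nothing mapped here, skip the whole region
        if depth == len(LEVELS) - 1:
            found.append(addr)  # reached a mapped 4 KiB page
        else:
            found += recover_map(oracle, addr, addr + (1 << shift), depth + 1)
    return found


# Toy layout: three mapped pages inside one 512 GiB region.
mapped = [0x40201000, 0x40202000, 0x7FE00000]
pages = recover_map(make_toy_oracle(mapped), 0, 1 << 39)
print([hex(p) for p in pages])
```

Skipping unmapped regions early is what keeps the number of probes manageable: a sparsely mapped region is ruled out with one probe per level, which fits the slide's remark that the run time depends on how the pages are actually mapped.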

The address-translation oracle

The address-translation oracle tells us whether virtual addresses p and p̄ map to the same physical address

1. Use clflush to remove p from the cache
2. Use prefetch to load p̄ into the cache
3. Time an access to p. If it is fast, it was cached: p̄ maps to the same physical memory as p

Beware! prefetch is a hint!
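The flush, prefetch and timed-reload steps can be modelled with a toy cache that is keyed by physical address. The Python sketch below is a simulation under stated assumptions: the `ToyCache` class, the latencies (60 and 300 cycles), the threshold and the page-table contents are all made up for illustration, and the toy prefetch always succeeds, whereas a real prefetch is only a hint and may be ignored.

```python
class ToyCache:
    """Toy physically indexed cache that tracks which 64-byte lines are present."""

    def __init__(self, page_table):
        self.page_table = page_table  # virtual page number -> physical page number
        self.lines = set()

    def _line(self, vaddr):
        paddr = (self.page_table[vaddr >> 12] << 12) | (vaddr & 0xFFF)
        return paddr >> 6

    def clflush(self, vaddr):
        self.lines.discard(self._line(vaddr))

    def prefetch(self, vaddr):
        self.lines.add(self._line(vaddr))  # unlike real hardware, never ignored

    def timed_access(self, vaddr):
        line = self._line(vaddr)
        cycles = 60 if line in self.lines else 300  # illustrative hit/miss latencies
        self.lines.add(line)
        return cycles


def same_physical_memory(cache, p, p_bar, threshold=150):
    """Flush p, prefetch p_bar, then report whether reloading p is fast."""
    cache.clflush(p)
    cache.prefetch(p_bar)
    return cache.timed_access(p) < threshold


# Hypothetical mappings: pages 0x10 and 0xABC00 share physical page 0x99.
cache = ToyCache({0x10: 0x99, 0xABC00: 0x99, 0xABC01: 0x55})
print(same_physical_memory(cache, 0x10040, 0xABC00040))
print(same_physical_memory(cache, 0x10040, 0xABC01040))
```

Both addresses use the same page offset, so they fall on the same cache line whenever they share a physical page. Because a real prefetch can be dropped, a negative result would have to be confirmed by repeating the measurement.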

Timing instructions

The CPU may reorder instructions: a look at rdtscp

[Figure: the sequence instruction 1, rdtscp, instruction 2, rdtscp, instruction 3, with marks showing which reorderings across rdtscp are possible (✓) and which are not (✗).]

Timing instructions

The CPU may reorder instructions: a look at mfence

[Figure: the sequence instruction 1, mfence, instruction 2, mfence, instruction 3, with marks showing which reorderings across mfence are possible (✓) and which are not (✗).]

Timing instructions

The CPU may reorder instructions

[Figure: the sequence instruction 1, cpuid, instruction 2, cpuid, instruction 3.]

but not over cpuid!

Timing the prefetch instruction

The CPU may reorder prefetch instructions: a look at rdtscp

[Figure: the sequence prefetch, rdtscp, prefetch, rdtscp, prefetch. All reorderings shown across rdtscp are marked as possible (✓).]

Timing the prefetch instruction

The CPU may reorder instructions: a look at mfence

[Figure: the sequence prefetch, mfence, prefetch, mfence, prefetch. All reorderings shown across mfence are marked as possible (✓).]

Timing instructions

Combine instructions in clever procedures to
- prevent reordering of prefetch if necessary
  ⇒ mfence rdtscp cpuid target instruction cpuid rdtscp mfence
- avoid noise from cpuid if possible
  ⇒ mfence cpuid rdtscp target instruction rdtscp cpuid mfence

We use either the prefetchnta or prefetcht2 instructions and a mov instruction for memory access

Case study: Defeating Windows 7 KASLR

Windows 7 Memory layout

- HAL and kernel located between
  - start: 0xffff f800 0000 0000
  - end:   0xffff f87f ffff ffff
- Kernel drivers
  - start: 0xffff f880 0000 0000
  - end:   0xffff f8ff ffff ffff
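A short worked calculation, added here as an illustration, shows how these two ranges relate to the paging structure: each one spans exactly 2^39 bytes (512 GiB), which is the region covered by a single PML4 entry. The helper name `region_info` is hypothetical.

```python
KERNEL_START, KERNEL_END = 0xFFFFF80000000000, 0xFFFFF87FFFFFFFFF
DRIVER_START, DRIVER_END = 0xFFFFF88000000000, 0xFFFFF8FFFFFFFFFF


def region_info(start, end):
    """Return (size in bytes, PML4 index of the first address) for a range."""
    size = end - start + 1
    pml4_index = (start >> 39) & 0x1FF  # bits 39..47 of the virtual address
    return size, pml4_index


kernel = region_info(KERNEL_START, KERNEL_END)
drivers = region_info(DRIVER_START, DRIVER_END)
print(kernel, drivers)
print(kernel[0] == 1 << 39)  # True: exactly one PML4 entry's worth of address space
```

The two ranges therefore sit under neighbouring PML4 entries (indices 496 and 497), so the scan for drivers can start directly below the second one.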

Windows 7 Breaking KASLR

1. Map the driver address space using the translation recovery attack
2. Evict the page translation caches: Sleep() and/or access a large memory buffer
3. Perform a syscall to the targeted driver
4. Time prefetch(PageAddress)
5. Repeat 2, 3, 4 for all pages found in 1

The page with the fastest average access time is the right address.
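The final selection step can be sketched as picking the page with the lowest mean timing. In the Python sketch below, the function name `locate_driver` and all sample timings and addresses are invented for illustration; only the rule "fastest average wins" comes from the slide.

```python
from statistics import mean


def locate_driver(timings_by_page):
    """Return the page address with the lowest average prefetch time."""
    return min(timings_by_page, key=lambda page: mean(timings_by_page[page]))


# Made-up measurements (cycles) for three candidate pages in the driver region.
samples = {
    0xFFFFF88000010000: [112, 109, 115],
    0xFFFFF88000011000: [93, 96, 91],
    0xFFFFF88000012000: [110, 118, 111],
}
print(hex(locate_driver(samples)))
```

Averaging over several rounds of steps 2 to 4 is what makes the comparison meaningful, since an individual timing can be disturbed by unrelated system activity.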

Locate Kernel Driver (defeat KASLR)

[Figure: average execution time in cycles (about 90 to 120) against page offset in the kernel driver region (0 to 12,000).]

Case study: Exploiting direct-physical maps

Kernel exploits

- Overwrite return address → jump to userspace code
- Overwrite stack pointer → switch to userspace stack

Mitigating kernel exploits

- Jump to userspace code? Nope! Hardware prevents that.
  ⇒ Supervisor-mode execution prevention (SMEP)
- Switch to userspace stack? Nope! Hardware prevents that.
  ⇒ Supervisor-mode access prevention (SMAP)

Kernel direct-physical map

[Figure: the virtual address space, with the user part from 0 to 2^47 and the kernel part from −2^47 to −1. Physical memory, from 0 to max. phys., is mapped linearly into the kernel part as the direct map.]

Evading the mitigation

- Get direct-physical-map address of userspace address → jump/switch there

Known as “ret2dir” attacks (Kemerlis et al. 2014)

Mitigating the evasion

- Getting rid of direct-physical map? Apparently not.
  → Do not leak physical addresses to user

Circumventing the mitigation

Prefetching via direct-physical map
- use a known address or the translation recovery attack to find the direct-physical map
- find the user-mode address within the direct-physical map using the address-translation oracle

Circumventing the mitigation

Prefetching via direct-physical map

[Figure: minimum access latency in cycles (about 100 to 250) against page offset in the direct-physical map (0 to 240).]

Prefetching via direct-physical map

- immediately leaks a direct-physical map address
  → no information leak necessary (compared to ret2dir)
- if direct-physical map offset is known
  → leaks physical address
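Once the start of the direct-physical map is known, converting between a direct-map address and a physical address is a single subtraction or addition. The sketch below assumes a base of 0xFFFF880000000000, the value used by older x86-64 Linux kernels without KASLR; that constant and the function names are assumptions for illustration, and other systems or kernel versions use different bases.

```python
# Assumed direct-map base (older x86-64 Linux without KASLR); not from the slides.
DIRECT_MAP_BASE = 0xFFFF880000000000


def direct_map_to_physical(vaddr, base=DIRECT_MAP_BASE):
    """Physical address behind a virtual address inside the direct-physical map."""
    return vaddr - base


def physical_to_direct_map(paddr, base=DIRECT_MAP_BASE):
    """Kernel virtual address in the direct-physical map for a physical address."""
    return base + paddr


leaked = 0xFFFF880012345000  # hypothetical address found with the oracle
print(hex(direct_map_to_physical(leaked)))
```

This is why a direct-map address found by prefetching is as useful as a physical address for the follow-up attacks, provided the base is known.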

Prefetching via direct-physical map

- works on Linux
- works on OS X
- works on Xen PVM (on Amazon EC2)
- does not work on Windows
  - no direct-physical map ;)

Cache side-channel attacks

Powerful side channel in the cloud
- infer user input
- crypto key recovery
- cross-VM, cross-core, even cross-CPU
- any architecture

Cache mapping

[Figure: a physical address split into tag (bits 35 to 17), set (bits 16 to 6) and offset (bits 5 to 0). The 11 set bits select the line within a slice; 30 address bits feed the hash function H, whose 2-bit output selects one of slice 0 to slice 3.]

- slice function H unknown
- reverse-engineered by Hund et al. 2013; Maurice et al. 2015; Inci et al. 2015; Yarom et al. 2015

→ we need the physical address
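The fixed part of this mapping is a plain bit-field split of the physical address. The Python sketch below assumes the bit positions read from the figure (offset in bits 0 to 5, set index in bits 6 to 16, tag in bits 17 to 35); the function name and example address are hypothetical, and the slice hash H is deliberately left out because its exact form depends on the CPU model.

```python
def split_physical_address(paddr: int) -> dict:
    """Split a physical address into cache-line offset, set index and tag.

    Assumes the bit ranges shown on the slide; the slice hash H is not modelled.
    """
    return {
        "offset": paddr & 0x3F,          # bits 0..5: byte within a 64-byte line
        "set": (paddr >> 6) & 0x7FF,     # bits 6..16: 11-bit set index
        "tag": (paddr >> 17) & 0x7FFFF,  # bits 17..35: tag
    }


fields = split_physical_address(0x12345678)  # illustrative address
print(fields)

# The three fields together reproduce the original address.
rebuilt = (fields["tag"] << 17) | (fields["set"] << 6) | fields["offset"]
print(hex(rebuilt))
```

Since every field is taken from the physical address, an attacker who only knows virtual addresses cannot choose which cache set and slice a line lands in, which is the point the slide makes.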

Rowhammer

Rowhammer: yet another attack requiring physical address information
- Rowhammer: bit flip at a random location in DRAM
- exploitable → gain root privileges (Seaborn and Dullien 2015)

Rowhammer

“It’s like breaking into an apartment by repeatedly slamming a neighbor’s door until the vibrations open the door you were after” (Motherboard Vice)

[Figure: a DRAM bank whose rows all hold 1 bits, with a row buffer below. Rows are repeatedly activated and copied into the row buffer; in the final step one row reads 1 0 1 1 1 1 1 0 1 0 1 1 1 1.]

bit flips in row 2!

How is DRAM organized?

[Figure: two memory channels (channel 0, channel 1) with DIMMs. The front of a DIMM is rank 0, the back is rank 1, and each rank is made up of chips.]

DRAM organization example

[Figure: a chip containing bank 0 with row 0, row 1, row 2, ..., row 32767 and a row buffer.]

- bits in cells in rows
- access: activate row, copy to row buffer
- cells leak → refresh necessary
- cells leak faster upon proximate accesses

DRAM mapping functions on a DDR4 system

[Figure: physical address bits 6 to 22 and the functions that select the channel (Ch.), the rank, the bank group bits (BG0, BG1) and the bank address bits (BA0, BA1).]

Again: based on physical addresses

Summary

- Defeat SMAP/SMEP through direct-physical map
- Leak physical addresses
  - Perform cache attacks
  - Perform Rowhammer attacks

Black Hat Sound Bytes

- CPU design is security relevant (prefetch leaks significant information)
- We can locate a driver in the kernel and thus break KASLR
- We can break SMAP/SMEP and get physical addresses to assist other attacks


Bonus material

Stronger Kernel Isolation

Today’s operating systems: shared address space

[Figure: a single address space from 0 to −1 holding user memory and kernel memory, with a context switch crossing between the two.]

Stronger kernel isolation: separate user and kernel address spaces

[Figure: the user address space (0 to −1) holds user memory, with the kernel part not mapped. The kernel address space (0 to −1) holds kernel memory, with the user part not mapped. A context switch passes through an interrupt dispatcher that switches the address space.]

Bibliography

Hund, Ralf et al. (2013). “Practical Timing Side Channel Attacks against Kernel Space ASLR”. In: S&P’13.

Inci, Mehmet Sinan et al. (2015). “Seriously, get off my cloud! Cross-VM RSA Key Recovery in a Public Cloud”. In: Cryptology ePrint Archive, Report 2015/898, pp. 1–15.

Kemerlis, Vasileios P et al. (2014). “ret2dir: Rethinking kernel isolation”. In: USENIX Security Symposium, pp. 957–972.

Maurice, Clementine et al. (2015). “Reverse Engineering Intel Complex Addressing Using Performance Counters”. In: RAID.

Seaborn, Mark and Thomas Dullien (2015). “Exploiting the DRAM rowhammer bug to gain kernel privileges”. In: Black Hat 2015 Briefings.

Yarom, Yuval et al. (2015). “Mapping the Intel Last-Level Cache”. In: Cryptology ePrint Archive, Report 2015/905, pp. 1–12.