Qin Zhao (MIT) Derek Bruening (VMware) Saman Amarasinghe (MIT) Efficient Memory Shadowing for 64-bit Architectures ISMM 2010, Toronto, Canada June 6, 2010

Page 1: Efficient Memory Shadowing for 64-bit Architectures

Qin Zhao (MIT), Derek Bruening (VMware), Saman Amarasinghe (MIT)

ISMM 2010, Toronto, Canada, June 6, 2010

Page 2: Dynamic Program Analysis

• Understand Program Behavior
  – Optimization
  – Debugging
  – Security
  – Memory management
• Shadow Memory Tools
  – Maintain meta-data for every memory location
  – Update meta-data on every memory operation
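As a minimal sketch of what a shadow memory tool does (not any particular tool's implementation), the Python class below keeps one meta-data value per application byte and updates it on every store. The class name `ShadowedMemory` and the choice of an "initialized" flag as the meta-data are assumptions made purely for illustration.

```python
class ShadowedMemory:
    """Toy application memory with one shadow flag per byte (hypothetical)."""

    def __init__(self, size):
        self.data = bytearray(size)      # application memory
        self.shadow = bytearray(size)    # meta-data: 1 = byte has been written

    def store(self, addr, value):
        self.data[addr] = value
        self.shadow[addr] = 1            # meta-data update on every store

    def load(self, addr):
        if not self.shadow[addr]:        # meta-data check on every load
            raise RuntimeError(f"read of uninitialized byte at {addr}")
        return self.data[addr]


mem = ShadowedMemory(16)
mem.store(3, 0x2A)
print(mem.load(3))
```

Every load and store pays for an extra shadow access, which is why the cost of finding the shadow address matters so much in the rest of the talk.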

Page 3: Examples

• Memory Error Detection
  – MemCheck [VEE’07]
  – Purify [USENIX’92]
  – Dr. Memory
• Dynamic Information Flow Tracking
  – LIFT [MICRO’39]
  – TaintTrace [ISCC’06]
• Multi-threaded Program Analysis
  – Eraser [TOCS’97]
  – Helgrind
• Memory Usage Analysis
  – CETS [ISMM’10]
  – Staleness

Page 4: Shadow Memory System

• Shadow Memory Manager
  – Meta-data for application memory
  – Memory mapping scheme (addrA → addrS)
    • DMS (Direct Mapping)
    • SMS (Segmented Mapping)
• Instrumentor
  – Every memory operation
    • Address calculation
    • Meta-data update
  – Expensive
    • MemCheck (~25x)
      – ~12x for addrA → addrS

[Figure: application memory regions (a.out, stack, libc, heap) and their corresponding regions in shadow memory.]

Page 5: Direct Mapping Scheme (DMS)

• Single memory region for entire address space.
• Translation: addrS = addrA + disp
• Issue: address conflict between memA and memS

    lea [addr] → %r1
    add %r1, disp → %r1

[Figure: application memory and its shadow memory within a single address space.]

[Figure: bar chart of slowdown relative to native execution. DMS-32: 1.80, SMS-32: 2.40, SMS-64: 4.67; no value is shown for DMS-64.]
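The direct mapping and its conflict issue can be sketched in a few lines of Python. This is an illustration under assumed values: the region addresses and displacements below are hypothetical, and the helper names `dms_translate` and `conflicts` do not come from the talk.

```python
def dms_translate(addr_a, disp):
    """Direct mapping: one displacement for the whole address space."""
    return addr_a + disp


def conflicts(app_regions, disp):
    """Return application regions whose shadow overlaps some application region."""
    bad = []
    for start, end in app_regions:
        s_start, s_end = dms_translate(start, disp), dms_translate(end, disp)
        for a_start, a_end in app_regions:
            if s_start < a_end and a_start < s_end:   # half-open interval overlap
                bad.append((start, end))
                break
    return bad


# Hypothetical 32-bit layout: (start, end) half-open ranges.
regions = [(0x08048000, 0x08100000), (0xB7000000, 0xB7800000)]
print(conflicts(regions, 0x20000000))   # this displacement avoids every region
print(conflicts(regions, 0xAF000000))   # shadow of the first region hits the second
```

A single displacement works only when some value of `disp` keeps every shadow region clear of every application region, which depends entirely on the layout.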

Page 6: Segmented Mapping Scheme (SMS)

• Shadow segment per application segment
• Translation: addrS = addrA + disp_seg
  – Segment lookup (address indexing)
  – Address translation

    lea [addr] → %r1
    mov %r1 → %r2
    shr %r2, 16 → %r2
    add %r1, disp[%r2] → %r1

[Figure: a segment table maps addrA in application segments App 1 and App 2 to addrS in shadow segments Shd 1 and Shd 2.]

[Figure: bar chart of slowdown relative to native execution. DMS-32: 1.80, SMS-32: 2.40, SMS-64: 4.67; no value is shown for DMS-64.]
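The two translation steps above (segment lookup, then address translation) might look as follows in Python. This is a sketch, not the instrumented code itself: the table is modeled as a dictionary for brevity, whereas the slide's `disp[%r2]` indexing suggests a flat array, and the segment numbers and displacements are made up.

```python
SEG_SHIFT = 16  # segment index = addr >> 16, as in the slide's `shr` step


def sms_translate(addr_a, disp_table):
    seg = addr_a >> SEG_SHIFT            # segment lookup (address indexing)
    return addr_a + disp_table[seg]      # address translation


# Hypothetical table: two application segments with different displacements.
disp_table = {0x0804: 0x10000000, 0xB700: -0x20000000}

print(hex(sms_translate(0x08041234, disp_table)))
print(hex(sms_translate(0xB7000010, disp_table)))
```

Compared with direct mapping, each translation now needs an extra shift and a table load, which is consistent with the higher SMS-32 slowdown reported in the chart.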

Page 7: Shadow Memory Mapping

• Scaling to 64-bit Architecture
  – DMS
    • Infeasible due to memory layout

[Figure: 64-bit address space layout. User space (a.out, stack) lies below 2^47, followed by unusable space, with kernel space (vsyscall) extending up to 2^64.]

Page 8: Shadow Memory Mapping

• Scaling to 64-bit Architecture
  – DMS
    • Infeasible due to memory layout
  – Single-Level SMS
    • Too big (~4 billion entries)

[Figure: addrA table lookup.]
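The "~4 billion entries" figure can be reproduced with a short calculation. The sketch below assumes a 48-bit address space (the width mentioned later in the talk) and 2^16-byte segments (matching the `shr 16` step in the SMS code); the 8-byte entry size used for the table footprint is purely an assumption.

```python
def single_level_entries(address_bits, segment_bits):
    """Number of table entries when every possible segment gets its own slot."""
    return 2 ** (address_bits - segment_bits)


entries = single_level_entries(address_bits=48, segment_bits=16)
print(entries)                       # 2**32 entries
print(entries * 8 / 2**30, "GiB")    # footprint with assumed 8-byte entries
```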

Page 9: Shadow Memory Mapping

• Scaling to 64-bit Architecture
  – DMS
    • Infeasible due to memory layout
  – Single-Level SMS
    • Too big (~4 billion entries)
  – Multi-Level SMS
    • Even more expensive

[Figure: addrA table lookup.]

[Figure: bar chart of slowdown relative to native execution. DMS-32: 1.80, SMS-32: 2.40, SMS-64: 4.67; no value is shown for DMS-64.]

Page 10: Umbra (CGO’10)

• Scaling to 64-bit Architecture
  – Single-Level SMS is too big but sparse
• Umbra (CGO’10)
  – Eliminate empty entries
  – Compact table
  – Walk the table to find the entry
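A compact table that is walked rather than indexed could be sketched as below. The slides do not describe Umbra's actual table layout, so the `(base, end, disp)` entry format, the linear scan, and the sample addresses are all assumptions for illustration.

```python
def table_walk(addr_a, table):
    """Walk a compact list of (base, end, disp) entries for in-use regions."""
    for base, end, disp in table:
        if base <= addr_a < end:
            return addr_a + disp
    raise LookupError(hex(addr_a))


# Hypothetical table: only regions the application actually uses have entries.
table = [
    (0x000000400000, 0x000000600000, 0x100000000000),
    (0x7F0000000000, 0x7F0000200000, -0x200000000000),
]

print(hex(table_walk(0x400010, table)))
print(hex(table_walk(0x7F0000000010, table)))
```

The table stays small because empty entries are never stored, but each lookup now costs a walk instead of a single indexed load.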

Page 11: Umbra (CGO’10)

• Reference Uni-Cache
  – Software cache per instr per thread
    • Segment tag & displacement
    • Check uni-cache before table walk
    • 99.97% hit ratio

[Figure: bar chart of slowdown relative to native execution. DMS-32: 1.80, SMS-32: 2.40, SMS-64: 4.67, Umbra-64: 2.49; no value is shown for DMS-64.]

    tag = addrA & mask;
    if (cachetag != tag) {
        … // table walk
    }
    addrS = addrA + cachedisp
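The uni-cache check can be modeled in Python as a one-entry cache holding a segment tag and a displacement. This is a sketch only: the class name `UniCache`, the `walk` callback standing in for the table walk, the `misses` counter, and the mask value are hypothetical.

```python
MASK = 0xFFFFFFFFFFFF0000   # assumed: the tag is the address without its low 16 bits


class UniCache:
    """One-entry software cache: segment tag plus displacement (hypothetical)."""

    def __init__(self, mask, walk):
        self.mask = mask
        self.walk = walk          # slow path: returns (tag, disp) for an address
        self.tag = None
        self.disp = 0
        self.misses = 0

    def translate(self, addr_a):
        tag = addr_a & self.mask
        if self.tag != tag:       # miss: fall back to the table walk
            self.tag, self.disp = self.walk(addr_a)
            self.misses += 1
        return addr_a + self.disp


cache = UniCache(MASK, lambda a: (a & MASK, 0x1000))
print(hex(cache.translate(0x400010)), cache.misses)
print(hex(cache.translate(0x400020)), cache.misses)
```

Since one instruction tends to touch the same segment repeatedly, the slow path runs rarely, but the tag comparison itself is still executed on every access.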

Page 12: EMS64: Key Idea

• Umbra

    tag = addrA & mask;
    if (cachetag != tag) {
        … // table walk (0.03%)
    }
    addrS = addrA + cachedisp

• EMS64
  – Speculatively use a disp without check
  – Smart shadow memory placement
    • Notified by memory access violation fault for incorrect displacement
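The speculative scheme can be simulated in Python by letting an exception stand in for the hardware access violation. This is a sketch under stated assumptions: addresses are unit indices taken from the example layout on the following page (A0 = 0, A1 = 7, A2 = 11 with shadows at 2, 6, 10), and `AccessFault`, `SpeculativeRef`, and the `walk` callback are hypothetical names.

```python
class AccessFault(Exception):
    """Stands in for a memory access violation."""


class SpeculativeRef:
    def __init__(self, disp, shadow, walk):
        self.disp = disp        # displacement used without any tag check
        self.shadow = shadow    # dict: mapped shadow address -> meta-data
        self.walk = walk        # slow path: correct disp for an address
        self.faults = 0

    def _touch(self, addr_s):
        if addr_s not in self.shadow:     # unmapped or reserved: would fault
            raise AccessFault(hex(addr_s))
        return self.shadow[addr_s]

    def read_meta(self, addr_a):
        try:
            return self._touch(addr_a + self.disp)    # fast path, no check
        except AccessFault:
            self.faults += 1
            self.disp = self.walk(addr_a)             # fault handler fixes disp
            return self._touch(addr_a + self.disp)


shadow = {2: "meta-A0", 6: "meta-A1", 10: "meta-A2"}
ref = SpeculativeRef(2, shadow, lambda a: {0: 2, 7: -1, 11: -1}[a])
print(ref.read_meta(0), ref.read_meta(7), ref.read_meta(11), ref.faults)
```

The fast path is safe only because the placement guarantees that a wrong displacement lands on an inaccessible unit (here, unit 9) rather than on live application or shadow memory.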

Page 13: EMS64: Example

[Figure: address space divided into units 0 to 15]

     0: Application (A0)
     2: Shadow (S0)
     6: Shadow (S1)
     7: Application (A1)
     9: Reserved
    10: Shadow (S2)
    11: Application (A2)
    12: Unavailable
    13: Unavailable/Reserved
    14: Unavailable
    15: Unavailable/Reserved

Displacement: {-1, 2}

Page 14: EMS64: Potential Problem

[Figure: address space divided into units 0 to 15]

     0: Application (A0)
     2: Shadow (S0)
     6: Shadow (S1)
     7: Application (A1)
     9: Reserved
    10: Shadow (S2)
    11: Application (A2)
    12: Unavailable
    13: Unavailable/Reserved
    14: Unavailable
    15: Unavailable/Reserved

Displacement: {-1, 2}

Page 15: EMS64: Final Solution

[Figure: address space divided into units 0 to 15]

     0: Application (A0)
     1: Reserved
     2: Shadow (S0)
     4: Reserved
     5: Reserved
     6: Shadow (S1)
     7: Application (A1)
     8: Reserved
     9: Reserved
    10: Shadow (S2)
    11: Application (A2)
    12: Unavailable/Reserved
    13: Unavailable/Reserved
    14: Unavailable
    15: Unavailable/Reserved

Displacement: {-1, 2}
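The Final Solution layout can be checked mechanically. The sketch below takes the unit numbers directly from the figure and computes every unit reached either by applying the wrong displacement to an application unit or by applying any displacement to a shadow unit; the function name `stray_targets` is an assumption, and arithmetic wraps modulo 16.

```python
N = 16
APP = {0: 2, 7: 6, 11: 10}                   # application unit -> shadow unit
RESERVED = {1, 4, 5, 8, 9, 12, 13, 15}       # includes Unavailable/Reserved units

disps = {s - a for a, s in APP.items()}      # valid displacements: 2 and -1
in_use = set(APP) | set(APP.values())        # application and shadow units


def stray_targets():
    """Units reached by a wrong displacement, or by any displacement from a shadow unit."""
    targets = set()
    for a, s in APP.items():
        for d in disps:
            if d != s - a:
                targets.add((a + d) % N)     # wrong displacement on an application unit
            targets.add((s + d) % N)         # any displacement on a shadow unit
    return targets


print(sorted(stray_targets()))
print(stray_targets() == RESERVED, stray_targets().isdisjoint(in_use))
```

In this layout the stray targets coincide exactly with the units marked Reserved or Unavailable/Reserved, so no stray access can land on application or shadow memory.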

Page 16: Slot Finding Problem

• Given n slots:
  – k Application slots
  – x Empty slots
  – y Reserved slots
• Find k S-slots.
  – For each slot Ai, there is one associated slot Si with displacement di, where di = Si - Ai.
  – For each slot Ai and each existing displacement dj where di ≠ dj, slot ((Ai + dj) mod n) is an R-slot or an E-slot.
  – For each slot Si and any existing valid displacement dj, slot ((Si + dj) mod n) is an R-slot or an E-slot.

Legend: Ai = Application slot, Si = Shadow slot, Ei = Empty slot, Ri = Reserved slot

[Figure: example row of slots A0 A1 E0 E1 E2 E3 E4 R0, in which empty slots are assigned as S0, S1, R1, R2.]
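The constraints above translate fairly directly into a validity check. The sketch below pairs that check with an exhaustive search, which is a deliberately naive stand-in and not the placement algorithm used by EMS64; the function names and the treatment of unavailable units as legal targets (but not as shadow candidates) are assumptions.

```python
from itertools import permutations


def is_valid(n, assign, apps):
    """Check the slide's constraints for a mapping {A-slot: S-slot}."""
    shadows = list(assign.values())
    if len(set(shadows)) != len(shadows) or set(shadows) & set(apps):
        return False                       # each A-slot needs its own free S-slot
    used = set(apps) | set(shadows)        # A-slots and S-slots
    disps = {(s - a) % n for a, s in assign.items()}
    for a, s in assign.items():
        for d in disps:
            # A wrong displacement applied to an A-slot must miss A/S slots.
            if d != (s - a) % n and (a + d) % n in used:
                return False
            # Any displacement applied to an S-slot must miss A/S slots.
            if (s + d) % n in used:
                return False
    return True


def find_shadow_slots(n, apps, unavailable=()):
    """Exhaustive search; only practical for tiny n."""
    free = [s for s in range(n) if s not in apps and s not in unavailable]
    for choice in permutations(free, len(apps)):
        assign = dict(zip(apps, choice))
        if is_valid(n, assign, apps):
            return assign
    return None


apps = [0, 7, 11]
print(is_valid(16, {0: 2, 7: 6, 11: 10}, apps))
print(find_shadow_slots(16, apps, unavailable={12, 13, 14, 15}))
```

The first call checks the assignment from the EMS64 example pages; the second shows that the search may settle on a different but equally valid placement.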

Page 17: Slot Finding Problem

• Given n slots:
  – k Application slots
  – x Empty slots
  – y Reserved slots
• Can We Find k S-slots?
  – Depends on layout!
  – Guaranteed to find them, for a 48-bit address space, if
    • Application memory < 250 GB
  – Proof
    • x ≥ 8k^2 + 2k + 1
    • We can always find an Si for Ai if #E-slots > #conflicts
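To get a feel for the bound x ≥ 8k^2 + 2k + 1, the sketch below finds the largest k that satisfies it for a given number of slots. Several inputs are assumptions not stated on the slide: a slot size of 4 GiB, slots covering the 2^47-byte user space shown earlier, and no pre-existing reserved slots (so x = n - k).

```python
def max_app_slots(n):
    """Largest k with 8k^2 + 2k + 1 <= n - k (no pre-reserved slots assumed)."""
    k = 0
    while 8 * (k + 1) ** 2 + 2 * (k + 1) + 1 <= n - (k + 1):
        k += 1
    return k


UNIT = 4 * 2**30            # assumed slot size: 4 GiB (not stated on the slide)
n = 2**47 // UNIT           # assumed: slots cover the 2^47-byte user space
k = max_app_slots(n)
print(n, k, k * UNIT // 2**30)   # slots, max application slots, GiB of application memory
```

Under these assumptions the result lands close to the slide's 250 GB figure, but since the slide does not give the slot size or the derivation, the agreement should be read as suggestive rather than as a reconstruction of the proof.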

Page 18: Implementation & Optimization

• Implementation
  – Shadow memory allocation
  – Add signal handler
  – Remove reference uni-cache check
• Optimization
  – Restore uni-cache checks for instructions that access multiple segments, e.g., references from memcpy
    • When the number of access violations exceeds 2

    lea [addr] → %r1
    add %r1, unicachedisp → %r1
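The optimization can be pictured as a per-instruction fault counter that switches a reference back to the checked form once it has faulted too often. The sketch below only models that policy; the class `RefSite` and its fields are hypothetical, and the real system would re-instrument the code rather than flip a flag.

```python
FAULT_LIMIT = 2   # from the slide: restore the check once violations exceed 2


class RefSite:
    """Per-instruction state (hypothetical names)."""

    def __init__(self):
        self.faults = 0
        self.checked = False      # False: speculative lea/add only

    def on_access_violation(self):
        self.faults += 1
        if self.faults > FAULT_LIMIT:
            self.checked = True   # switch this site back to the uni-cache check


site = RefSite()
for _ in range(3):
    site.on_access_violation()
print(site.faults, site.checked)
```

Instructions such as the copies inside memcpy keep crossing segments, so paying for the check there is cheaper than taking a fault each time the segment changes.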

Page 19: Experimental Results

Slowdown relative to native execution:

    Scheme     Slowdown
    DMS-32     1.80
    SMS-32     2.40
    SMS-64     4.67
    Umbra-64   2.49
    EMS-64     1.81

Page 20: Thank You

• Download
  – http://people.csail.mit.edu/qin_zhao/umbra/
• Q & A