SWAT Memory Leak Detection

Matthias Hauswirth


Transcript of SWAT Memory Leak Detection

Page 1: SWAT Memory Leak Detection

SWAT Memory Leak Detection

Matthias Hauswirth

Page 2: SWAT Memory Leak Detection

Agenda

Approaches to memory leak detection
SWAT infrastructure
Heap model
Staleness predicates
Leak analysis tool

Page 3: SWAT Memory Leak Detection

Memory Leaks

[Figure: timeline of object1 over time: alloc, access, free.]

Page 4: SWAT Memory Leak Detection

Memory Leaks

[Figure: timelines of object1 and object2 up to shutdown. Events shown: alloc, access, free on one object; alloc, access (no free) on the other.]

Page 5: SWAT Memory Leak Detection

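Memory Leaks

[Figure: timelines of object1, object2 and object3 up to shutdown. Events shown: alloc, access, free on one object; alloc, access (no free) on the other two, one of which changes from reachable to unreachable.]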

Page 6: SWAT Memory Leak Detection

Approaches to Leak Detection

Survivors: objects surviving until program termination
Unreachables: objects unreachable at snapshot (GC)
Stales: objects not recently accessed at snapshot (SWAT)
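The three definitions above can be compared on a single object record. The following is a minimal sketch, not SWAT code: the ObjectState fields, the function names and the max_idle threshold are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class ObjectState:
    """Hypothetical per-object record; field names are illustrative only."""
    freed: bool        # has the program freed the object?
    reachable: bool    # is it reachable from the roots at the snapshot?
    idle_time: int     # time since the last observed access


def is_survivor(obj):
    # Survivors: evaluated at program termination; anything never freed.
    return not obj.freed


def is_unreachable(obj):
    # Unreachables: evaluated at a snapshot; not freed and not reachable.
    return not obj.freed and not obj.reachable


def is_stale(obj, max_idle):
    # Stales: evaluated at a snapshot; not freed and not recently accessed.
    return not obj.freed and obj.idle_time > max_idle


# A still-reachable object that has not been touched for a long time
# is missed by the reachability test but flagged by the staleness test.
forgotten = ObjectState(freed=False, reachable=True, idle_time=500)
```

With this record, is_unreachable(forgotten) is False while is_stale(forgotten, max_idle=100) is True, which is the case the staleness approach is meant to catch.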

Page 7: SWAT Memory Leak Detection

Survivors: Guess

[Figure: timelines of objects o1 to o5 between startup and shutdown. Classification labels shown: leak, leak, leak, leak, -.]

Page 8: SWAT Memory Leak Detection

Survivors: Reality

[Figure: timelines of objects o1 to o5 between startup and shutdown. Classification labels shown: leak ?, leak, leak ?, leak, -.]

Page 9: SWAT Memory Leak Detection

Unreachables: Guess

[Figure: timelines of objects o1 to o5 between startup and shutdown, with a snapshot marker. Classification labels shown: leak, -, alive, -, alive.]

Page 10: SWAT Memory Leak Detection

Unreachables: Reality

[Figure: timelines of objects o1 to o5 between startup and shutdown, with a snapshot marker. Classification labels shown: alive ?, leak, -, alive, -.]

Page 11: SWAT Memory Leak Detection

Stales (SWAT): Guess

[Figure: timelines of objects o1 to o5 between startup and shutdown, with a snapshot marker. Classification labels shown: leak, leak, -, -, alive.]

Page 12: SWAT Memory Leak Detection

Stales (SWAT): Reality

[Figure: timelines of objects o1 to o5 between startup and shutdown, with a snapshot marker. Classification labels shown: -, -, alive, leak, leak.]

Page 13: SWAT Memory Leak Detection

SWAT Infrastructure

[Diagram: pipeline with the stages instrument, run, postprocess, view. Artifacts shown: winword.exe, winword.swat.exe, swatruntime.dll, source info, settings, snapshots, statistics.]

Page 14: SWAT Memory Leak Detection

Instrument

[Diagram: proc1 in comp1.]

Page 15: SWAT Memory Leak Detection

Bursty Tracing: Duplicate Basic Blocks

[Diagram: proc1 and prof$proc1 in comp1.]

Page 16: SWAT Memory Leak Detection

Bursty Tracing: Insert Dispatch Checks

[Diagram: proc1 and prof$proc1 in comp1.]

Page 17: SWAT Memory Leak Detection

Instrumentation: Patch Allocations & Frees

[Diagram: xalloc in comp1 and XallocWrapper in swatruntime.dll.]

Page 18: SWAT Memory Leak Detection

Instrumentation: Instrument Loads & Stores

[Diagram: proc1 and prof$proc1 in comp1; RecordReference in swatruntime.dll.]

Page 19: SWAT Memory Leak Detection

Bursty Tracing: Dispatch Check

[Diagram: dispatch check state machine. Labels shown: OrigSrc, ProfSrc, StartOrig, StartProf, DecOrig, DecProf, StayOrig, StayProf, OrigZero, OrigTgt, ProfTgt. Conditions shown: cOrig==1, cOrig>1, cProf==0, cProf==1, cProf>1.]

Global counters:
cOrig: # of StayOrig
cProf: # of StayProf
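One way to read the two global counters is as a pair of countdowns that decide, at every dispatch check, whether execution continues in the original or in the profiled copy of the code. The sketch below illustrates that idea only. It is not SWAT's dispatch logic: the class name, the reload values n_orig and n_prof, and the exact transition rule are assumptions; only the counter names cOrig and cProf come from the slide.

```python
class BurstyDispatcher:
    """Hypothetical model of a bursty-tracing dispatch check."""

    def __init__(self, n_orig, n_prof):
        self.n_orig = n_orig   # checks to stay in original code per period
        self.n_prof = n_prof   # checks to stay in profiled code per burst
        self.c_orig = n_orig   # plays the role of cOrig
        self.c_prof = 0        # plays the role of cProf

    def check(self):
        """Return True if the profiled copy runs after this check."""
        if self.c_prof > 0:
            # Inside a burst: consume one profiled step.
            self.c_prof -= 1
            if self.c_prof == 0:
                # Burst finished: start a new period in original code.
                self.c_orig = self.n_orig
            return True
        # In original code: count down to the next burst.
        self.c_orig -= 1
        if self.c_orig == 0:
            self.c_prof = self.n_prof
        return False


dispatcher = BurstyDispatcher(n_orig=3, n_prof=2)
trace = [dispatcher.check() for _ in range(10)]
```

In this model the fraction of checks that lead into profiled code is n_prof / (n_orig + n_prof), which corresponds to the sampling rate; with the values above that is 2 out of every 5 checks.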

Page 20: SWAT Memory Leak Detection

Adaptive Bursty Tracing

Bursty tracing
  Sampling rate influences results
  Rate chosen at runtime

Adaptive bursty tracing
  Different sampling rate by dispatch check point
  Start at high rate
  Wait until average gets down to requested rate
  Start rate, delta & target rate chosen at runtime

Page 21: SWAT Memory Leak Detection

Why Adaptive Bursty Tracing?

[Chart: Skewed Code Coverage. x-axis: dispatch check (sorted by # executions, top 30 of 1200). y-axis: # executions, in billions (0 to 2.5).]

Page 22: SWAT Memory Leak Detection

Adaptive Bursty Tracing: Dispatch Check

[Diagram: dispatch check state machine, indexed by dcid. Labels shown: OrigSrc, ProfSrc, StartOrig, StartProf, DecOrig, DecProf, StayOrig, StayProf, OrigZero, OrigTgt, ProfTgt. Conditions shown: cOrig[dcid]==1, cOrig[dcid]>1, cProf==0, cProf==1, cProf>1.]

Per-dispatch check counter:
cOrig[dcid]: # of StayOrig

Global counter:
cProf: # of StayProf
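The adaptive variant keeps one cOrig counter per dispatch check, so a rarely executed check can start at a high sampling rate and back off over time. The following sketch shows one possible reading. The parameters start_gap, gap_step, target_gap and burst_len are hypothetical stand-ins for the "start rate, delta & target rate" mentioned earlier, and the back-off rule (lengthening the gap after every burst) is an assumption rather than SWAT's documented behavior.

```python
from collections import defaultdict


class AdaptiveBurstyDispatcher:
    """Hypothetical model of adaptive bursty tracing."""

    def __init__(self, start_gap, gap_step, target_gap, burst_len):
        self.gap_step = gap_step      # how much the gap grows per burst
        self.target_gap = target_gap  # gap matching the requested rate
        self.burst_len = burst_len    # profiled steps per burst
        # Current gap and countdown per dispatch check id (dcid).
        self.gap = defaultdict(lambda: start_gap)
        self.c_orig = defaultdict(lambda: start_gap)  # cOrig[dcid]
        self.c_prof = 0                               # global cProf

    def check(self, dcid):
        """Return True if the profiled copy runs after this check."""
        if self.c_prof > 0:
            self.c_prof -= 1
            return True
        self.c_orig[dcid] -= 1
        if self.c_orig[dcid] == 0:
            # Trigger a burst and lower this check's sampling rate.
            self.c_prof = self.burst_len
            self.gap[dcid] = min(self.gap[dcid] + self.gap_step,
                                 self.target_gap)
            self.c_orig[dcid] = self.gap[dcid]
        return False


adaptive = AdaptiveBurstyDispatcher(start_gap=1, gap_step=1,
                                    target_gap=3, burst_len=1)
rare_trace = [adaptive.check("rare") for _ in range(9)]
```

In this model a dispatch check that has never been seen triggers a burst on its first execution, while a frequently executed check settles at the target gap. That is the intended effect on skewed code coverage.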

Page 23: SWAT Memory Leak Detection

Effect of Adaptive Bursty Tracing on Coverage

[Chart: Boosting Coverage of Rare Dispatch Checks. x-axis: dispatch checks (sorted by # executions, top 30 of 1200). y-axis: % more # profiled executions with adaptive bursty tracing (-50% to 150%).]

Page 24: SWAT Memory Leak Detection

SWAT Heap Model

Requirements
  AllocateObject(eip, startAddress, size)
  FreeObject(eip, startAddress)
  FindObject(eip, address)
  GetObjectIterator()

Implementations
  Hash table (address→objectInfo)
  Hash table (startAddress→objectInfo)
  Hash table (address→offsetToStartAddress)
  Address tree
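The four required operations can be sketched with two of the hash tables listed above. This is an illustrative model, not the SWAT runtime: the Python method names mirror the slide's operations, but the ObjectInfo fields and the decision to record each access inside find_object are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ObjectInfo:
    """Hypothetical per-object record."""
    start: int
    size: int
    alloc_site: int
    last_access_site: Optional[int] = None
    access_count: int = 0


class HashHeapModel:
    def __init__(self):
        self.objects = {}  # startAddress -> ObjectInfo
        self.offsets = {}  # address -> offsetToStartAddress

    def allocate_object(self, eip, start, size):
        self.objects[start] = ObjectInfo(start, size, alloc_site=eip)
        for off in range(size):
            self.offsets[start + off] = off

    def free_object(self, eip, start):
        info = self.objects.pop(start)
        for off in range(info.size):
            del self.offsets[start + off]

    def find_object(self, eip, address):
        """Map any address inside an object to its record, or None."""
        off = self.offsets.get(address)
        if off is None:
            return None
        info = self.objects[address - off]
        info.access_count += 1         # assumption: lookups count as accesses
        info.last_access_site = eip
        return info

    def get_object_iterator(self):
        return iter(self.objects.values())


heap = HashHeapModel()
heap.allocate_object(0x400019, 0x50, 8)
hit = heap.find_object(0x400190, 0x53)   # interior address
miss = heap.find_object(0x400190, 0x58)  # one past the end
```

The per-address offset table makes interior-pointer lookups constant time but costs one entry per allocated byte, which is the kind of space trade-off an address tree can avoid.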

Page 25: SWAT Memory Leak Detection

SWAT Heap Model

[Diagram: address tree, a binary tree with edges labeled 0 and 1. Leaf labels shown: 0000, 0100, 1000, 1100. Address: 0101.]

Page 26: SWAT Memory Leak Detection

SWAT Heap Model

[Diagram: address tree, a binary tree with edges labeled 0 and 1. Leaf labels shown: 0000, 0100, 1000, 1100. Annotation: 8 byte, 0101.]

Page 27: SWAT Memory Leak Detection

SWAT Heap Model

[Diagram: address tree, a binary tree with edges labeled 0 and 1. Leaf labels shown: 0000, 0100, 1000, 1100.]

Page 28: SWAT Memory Leak Detection

SWAT Heap Model

[Diagram: address tree with fewer edges than on the previous slide. Leaf labels shown: 0000, 0100, 1000, 1100.]

Page 29: SWAT Memory Leak Detection

SWAT Heap Model

[Diagram: address tree with the same number of edges as on the previous slide. Leaf labels shown: 0000, 0100, 1000, 1100.]

Page 30: SWAT Memory Leak Detection

SWAT Heap Model

[Diagram: address tree with fewer edges than on the previous slide. Leaf labels shown: 0000, 0100, 1000, 1100.]

Page 31: SWAT Memory Leak Detection

SWAT Heap Model

[Diagram: address tree with an object record attached.]

Start address: 0101
Size: 8
Access count: 19
Last access time: 19’000’000
Alloc site: EIP 0x400019
Last access site: EIP 0x400190
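An address tree of this kind can be modeled as a binary trie over the bits of an address, walked from the most significant bit down. The sketch below is a simplification under stated assumptions: it stores one leaf per byte of each object, whereas the slides suggest that SWAT keeps the tree more compact. The class and method names are hypothetical.

```python
class AddressTree:
    """Hypothetical binary trie keyed by address bits (MSB first)."""

    def __init__(self, bits=32):
        self.bits = bits
        self.root = {}

    def _path(self, address):
        # Bits of the address from most to least significant.
        return [(address >> i) & 1 for i in range(self.bits - 1, -1, -1)]

    def insert(self, start, size, info):
        # Simplification: one leaf per byte covered by the object.
        for address in range(start, start + size):
            node = self.root
            for bit in self._path(address):
                node = node.setdefault(bit, {})
            node["info"] = info

    def find(self, address):
        """Return the record covering the address, or None."""
        node = self.root
        for bit in self._path(address):
            node = node.get(bit)
            if node is None:
                return None
        return node.get("info")


# 4-bit address space, matching the example on the slides.
tree = AddressTree(bits=4)
record = {"start": 0b0101, "size": 8, "alloc_site": 0x400019}
tree.insert(0b0101, 8, record)
```

A lookup visits one node per address bit, so its cost is bounded by the width of the address rather than by the number of live objects.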

Page 32: SWAT Memory Leak Detection

SWAT Heap Model

Space overhead
  Address tree nodes: 0.03 … 0.35 allocated node bytes / allocated byte
  Overall: 0.12 … 3.4 times the allocated memory

Time
  FindObject(eip, address): log(addressSpaceSize) (32 bits = 32 nodes)

Page 33: SWAT Memory Leak Detection

Evaluation: Time Overhead

[Chart: run time in seconds (0 to 8000) for the benchmarks twolf and vpr, each under the configs b 0.1%, a 0.1%, b 1%, a 1%, b 10%, a 10%. Series: Uninstrumented, Excluding Snapshots, Total. Filter: Leak (All).]

Page 34: SWAT Memory Leak Detection

Staleness Predicates

Stale = object not needed anymore
Stale, if…
  Never accessed
  Idle time > t
  Idle time > n * active time

[Diagram: object timelines marking active and idle periods, compared against the thresholds t and n*active.]
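The three predicates can be written as small functions over a per-object record. This is a sketch under assumptions the slide does not state: idle time is taken as the time since the last access (or since allocation if there was none), and active time as the span from allocation to the last access. All names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TrackedObject:
    alloc_time: int
    last_access_time: Optional[int]  # None if never accessed


def idle_time(obj, now):
    # Assumption: idle time runs from the last access (or the allocation).
    last = obj.alloc_time if obj.last_access_time is None else obj.last_access_time
    return now - last


def active_time(obj):
    # Assumption: active time spans allocation to last access.
    if obj.last_access_time is None:
        return 0
    return obj.last_access_time - obj.alloc_time


def never_accessed(obj, now):
    return obj.last_access_time is None


def idle_exceeds(t):
    """Predicate: idle time > t."""
    return lambda obj, now: idle_time(obj, now) > t


def idle_exceeds_active(n):
    """Predicate: idle time > n * active time."""
    return lambda obj, now: idle_time(obj, now) > n * active_time(obj)


# Active for 100 time units, then idle for 150.
sample = TrackedObject(alloc_time=0, last_access_time=100)
```

At time 250 this object is stale under idle_exceeds(100) and idle_exceeds_active(1), but not under idle_exceeds(200) or idle_exceeds_active(10). The relative predicate adapts to how long the object was in use, while the absolute one needs a threshold suited to the program.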

Page 35: SWAT Memory Leak Detection

Evaluation

Inject leaks
  Randomly, at runtime, decide not to execute a free

Variables
  Sampling rate
  Adaptive or bursty
  Predicate

Measurement results per snapshot
  List of objects assumed leaked
    Some true, some false
  List of objects assumed alive
    Some true, some false
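Because the leaks are injected, the two lists produced at each snapshot can be scored against ground truth. The sketch below assumes one plausible definition of the two error figures: the fraction of wrong entries in each list. The function name and the use of Python sets of object ids are illustrative choices, not taken from the slides.

```python
def list_errors(assumed_leaked, assumed_alive, injected_leaks):
    """Return (error in leak list, error in alive list) for one snapshot.

    All arguments are sets of object ids. The error of a list is taken
    to be the fraction of its entries that are wrong (an assumption).
    """
    false_leaks = assumed_leaked - injected_leaks   # reported, but not leaks
    missed_leaks = assumed_alive & injected_leaks   # leaks reported as alive
    leak_error = len(false_leaks) / len(assumed_leaked) if assumed_leaked else 0.0
    alive_error = len(missed_leaks) / len(assumed_alive) if assumed_alive else 0.0
    return leak_error, alive_error


leak_error, alive_error = list_errors(
    assumed_leaked={1, 2, 3, 4},
    assumed_alive={5, 6, 7, 8, 9},
    injected_leaks={1, 2, 3, 5},
)
```

Here object 4 is wrongly reported as leaked and object 5 is a leak reported as alive, so one of four leak-list entries and one of five alive-list entries are wrong.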

Page 36: SWAT Memory Leak Detection

Comparing Predicates

[Chart: error in leak list and error in alive list (0% to 50%) per predicate: Idle>100*Active, Idle>10*Active, Idle>1*Active, Idle>1000Mio, Idle>100Mio, Idle>10Mio. Filters: benchmark (All), leakage (All), # (All), config (All).]

Page 37: SWAT Memory Leak Detection

Comparing Sampling Rates

[Chart: error in leak list and error in alive list (0% to 45%) per config: b 0.1%, a 0.1%, b 1%, a 1%, b 10%, a 10%. Filters: benchmark (All), leakage (All), # (All), predicate (All).]

Page 38: SWAT Memory Leak Detection

Lucky Omission Effect

Question: At time of snapshot, is object a leak?

[Diagram: timeline (time in # actual references) showing an injected leak, the maxIdleTime window, and the snapshot.]

Page 39: SWAT Memory Leak Detection

Lucky Omission Effect

Low sampling rate

[Diagram: timeline (time in # actual references) showing the maxIdleTime window and the snapshot.]

Page 40: SWAT Memory Leak Detection

Lucky Omission Effect

Low sampling rate
  assumed leaked: true

[Diagram: timeline (time in # actual references) showing the maxIdleTime window and the snapshot.]

Page 41: SWAT Memory Leak Detection

Lucky Omission Effect

Low sampling rate
  assumed leaked: true
High sampling rate

[Diagram: timeline (time in # actual references) showing the maxIdleTime window and the snapshot.]

Page 42: SWAT Memory Leak Detection

Lucky Omission Effect

Low sampling rate
  assumed leaked: true
High sampling rate
  assumed alive: false

[Diagram: timeline (time in # actual references) showing the maxIdleTime window and the snapshot.]

Page 43: SWAT Memory Leak Detection

Lucky Omission Effect

Low sampling rate
  assumed leaked: true
High sampling rate
  assumed alive: false

[Diagram: timeline (time in # actual references) showing the lucky omission window, the maxIdleTime window, and the snapshot.]

Page 44: SWAT Memory Leak Detection

Mitigation of Lucky Omission Effect

Reduce chance of leak happening during maxIdleTime
  snapshotInterval >> maxIdleTime

[Diagram: timeline (time in # actual references) with two snapshots separated by snapshotInterval, each with a maxIdleTime window.]

Page 45: SWAT Memory Leak Detection

Practical Sampling Rates & Useful Predicates

[Chart: error in leak list and error in alive list (0% to 30%) for the configs a 1% and b 1% under the predicates Idle>1*Active and Idle>1000Mio. Filters: benchmark (All), leakage (All), # (All).]

Page 46: SWAT Memory Leak Detection

Leak Analysis Tool

Page 47: SWAT Memory Leak Detection

Ranking

Sort <alloc site, last access site> pairs

Old rankings:
  # of stale objects [currently used]
  # of stale bytes
  Drag caused by stale objects (bytes*idle time)

New ranking:
  # of predicates declaring an object stale
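The three older rankings can be computed by grouping stale objects by their <alloc site, last access site> pair. The following is a minimal sketch of that grouping. The dictionary-based object records and the function name rank_sites are hypothetical, and the newer predicate-count ranking is left out because the slides do not say how per-object counts are combined per pair.

```python
from collections import defaultdict


def rank_sites(stale_objects, key="count"):
    """Rank <alloc site, last access site> pairs.

    key is one of "count" (# of stale objects), "bytes" (# of stale
    bytes) or "drag" (bytes * idle time).
    """
    totals = defaultdict(lambda: {"count": 0, "bytes": 0, "drag": 0})
    for obj in stale_objects:
        pair = (obj["alloc_site"], obj["last_access_site"])
        entry = totals[pair]
        entry["count"] += 1
        entry["bytes"] += obj["size"]
        entry["drag"] += obj["size"] * obj["idle_time"]
    return sorted(totals.items(), key=lambda item: item[1][key], reverse=True)


SMALL = (0x400019, 0x400190)  # many small stale objects
LARGE = (0x400500, 0x400600)  # one large stale object
stale = [
    {"alloc_site": SMALL[0], "last_access_site": SMALL[1], "size": 8, "idle_time": 10},
    {"alloc_site": SMALL[0], "last_access_site": SMALL[1], "size": 8, "idle_time": 10},
    {"alloc_site": LARGE[0], "last_access_site": LARGE[1], "size": 1000, "idle_time": 1},
]
```

Ranking by object count puts SMALL first, while ranking by bytes or by drag puts LARGE first, which shows how the choice of ranking changes what a user looks at first.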

Page 48: SWAT Memory Leak Detection

Conclusions

Many approaches to leak detection
Predicting leaks by looking at past events:
  Important objects might never be used (boxsim)
  Lots of stale objects might indicate a space-inefficient algorithm
Leak Analysis Tool
  Made it easy to find several statically injected leaks

Page 49: SWAT Memory Leak Detection

Future Work

Currently:
  Store source info compactly (at instrumentation time)
  Snapshots at runtime don’t use source info
  Post process snapshots to add source info

This week:
  Rank leaks
  Update Leak Analysis Tool to use ranking
  Run new version on winword.exe and mshtml.dll

Later:
  Combine “Unreachables” with “Stales” approach