Interprocedural Shape Analysis for Recursive Programs Noam Rinetzky Mooly Sagiv.

Interprocedural Shape Analysis for Recursive Programs

Noam Rinetzky Mooly Sagiv

Shape Analysis

• Static program analysis

• Determines information about dynamically allocated storage

– A pointer variable is not NULL

– Two data structures are disjoint

• The algorithm is conservative

Applications of Shape Analysis

• Cleanness

– Dor, Rodeh, Sagiv [SAS2000]

• Parallelization

– Assmann, Weinhardt [PMMPC93]

– Hendren, Nicolau [TPDS90]

– Larus, Hilfinger [PLDI88]

Current State

• Good intraprocedural analyses

– Sagiv, Reps, Wilhelm [TOPLAS 1998]

– Analyze the bodies of list-manipulating procedures: reverse, insert, delete

• Expensive, imprecise interprocedural analyses of recursive procedures

Main Results

• Interprocedural shape analysis algorithm for programs manipulating linked lists

– Handles recursive procedures

• Prototype implementation

– Successfully analyzed several list-manipulating procedures: insert, delete, reverse, reverse_append

– Properties verified:

• An acyclic list remains acyclic

• No memory leaks

• No NULL dereference

Running Example

#include <stdlib.h>

typedef struct List {
    int data;
    struct List *n;
} *L;

L create(int s) {
    L t = NULL;
    if (s <= 0)
        return NULL;
    t = (L) malloc(sizeof(*t));
    t->data = s;
l2: t->n = create(s-1);
    return t;
}

void main() {
    L r = NULL;
    int k;
    …
l1: r = create(k);
}

Selected Memory States

[Figure: snapshots of the stack and heap while main calls create(k) with k=3. The stack grows from main's record (exit, k=3, r=NULL) through the records of create called at l1 (s=3) and at l2 (s=2, s=1, s=0). Every record except the last (s=0, t=NULL) has t pointing to a newly allocated cell. As the calls return, the separate cells 3, 2 and 1 are linked into the list 3 -> 2 -> 1 -> NULL, which r points to at the end.]

Where is the Challenge?

• Dynamic allocation

– Unbounded number of objects

• Recursion

– Unbounded number of activation records

• Properties of:

– Invisible instances of local variables

– Dynamically allocated objects

[Figure: the stack of the running example for k=3, with records exit (k=3, r=NULL), l1 (s=3), l2 (s=2), l2 (s=1) and l2 (s=0, t=NULL).]

Our Approach

Reduce the interprocedural shape analysis problem to an intraprocedural problem

Program with procedures -> Program without procedures

Represent the activation record stack as a linked list:

• Control information

• Invisible instances of local variables

Explicit manipulation of the stack
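The reduction above can be illustrated on the running example. The following is a minimal sketch, not the authors' transformation: it assumes a hypothetical activation record type Frame whose fields cs, s, t and pr mirror the call-site, parameter, local variable and "invoked by" link named in the slides, and a hypothetical function create_explicit that builds the same list as create without recursion.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct List { int data; struct List *n; } *L;

/* Call sites of create in the running example (labels l1 and l2). */
enum CallSite { CS_L1, CS_L2 };

/* Hypothetical activation record: one per pending invocation of create. */
typedef struct Frame {
    enum CallSite cs;   /* control information: where the call came from */
    int s;              /* parameter of this invocation */
    L t;                /* this invocation's instance of the local t */
    struct Frame *pr;   /* record of the invoking procedure */
} Frame;

static Frame *push(Frame *top, enum CallSite cs, int s) {
    Frame *f = malloc(sizeof *f);
    if (!f) abort();
    f->cs = cs; f->s = s; f->t = NULL; f->pr = top;
    return f;
}

/* create(k) with recursion replaced by explicit stack manipulation. */
L create_explicit(int k) {
    Frame *top = push(NULL, CS_L1, k);        /* the call at l1 */

    /* Call phase: run each invocation up to the call at l2. */
    while (top->s > 0) {
        top->t = malloc(sizeof *top->t);
        if (!top->t) abort();
        top->t->data = top->s;
        top->t->n = NULL;
        top = push(top, CS_L2, top->s - 1);   /* the call at l2 */
    }

    /* Return phase: pop records and resume each caller at its call site. */
    for (;;) {
        L ret = top->t;                 /* "return t" (NULL in the base case) */
        Frame *caller = top->pr;
        enum CallSite cs = top->cs;
        free(top);
        if (cs == CS_L1)
            return ret;                 /* back in main at l1 */
        caller->t->n = ret;             /* completes "t->n = create(s-1)" */
        top = caller;
    }
}
```

Calling create_explicit(3) should yield the list 3 -> 2 -> 1 -> NULL. While the loop runs, every record below top holds an invisible instance of t, which is exactly the information the analysis has to summarize.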

Our Algorithm

• Abstract interpretation

– Concrete semantics:

• Concrete representation of memory states

• Effect of program statements

– Abstract semantics:

• Abstract representation of memory states

• Transfer functions

• Finds an abstract representation of the memory states at every program point

Concrete Memory Descriptors

[Figure: the concrete memory descriptor of the running example for k=3. There is one node per activation record, labeled with its call-site (csexit, csl1, csl2, csl2, csl2), and top marks the current record. pr edges link each record to the record of its invoker; t edges lead from records to heap cells.]

Relationships between memory elements:

• Value of local variables: t, r

• n-successor: n

• Invoked by: pr

Properties of memory elements:

• “type”: stack, heap

• “visibility”: top

• “call-site”: exit, csl1, csl2

Bounding the Representation

• Concrete Memory Descriptors represent memory states

– Every object is represented uniquely

• Abstract Memory Descriptors

– Conservatively represent Concrete Memory Descriptors

– A bounded representation

3-Valued Properties

• True

• False

• Don’t Know (1/2), e.g. top=1/2
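The three values can be sketched as a small C type. This is a minimal illustration that assumes the standard Kleene interpretation of the connectives; the names Kleene, k_join, k_and, k_or and k_not are hypothetical and do not come from the slides.

```c
#include <assert.h>

/* Kleene truth values; K_HALF stands for "don't know" (1/2). */
typedef enum { K_FALSE = 0, K_HALF = 1, K_TRUE = 2 } Kleene;

/* Join: two equal values are kept, two different values become 1/2. */
static Kleene k_join(Kleene a, Kleene b) { return a == b ? a : K_HALF; }

/* With the order FALSE < 1/2 < TRUE, "and" is min and "or" is max. */
static Kleene k_and(Kleene a, Kleene b) { return a < b ? a : b; }
static Kleene k_or(Kleene a, Kleene b)  { return a > b ? a : b; }

/* Negation swaps TRUE and FALSE and leaves 1/2 unchanged. */
static Kleene k_not(Kleene a) { return (Kleene)(K_TRUE - a); }
```

For instance, joining the value of top on a record where it holds with its value on a record where it does not gives 1/2, as in top=1/2.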

Abstraction

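[Figure: the concrete memory descriptor (records csexit, csl1, csl2, csl2 and csl2, top, with pr and t edges) next to its abstraction, in which the records labeled only csl2 are merged into one summary node, leaving the nodes csexit, csl1, csl2 and csl2, top.]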

Bounding the Representation

• Summarize nodes according to their unary properties

• Join values of relationships

• Convert a Concrete Memory Descriptor of arbitrary size into an Abstract Memory Descriptor of bounded size

• Does the Abstract Memory Descriptor contain enough information?
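The first two bullets (summarizing nodes by their unary properties and joining relationship values) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the TVLA implementation: unary properties are encoded as a bitmask, a single binary relationship (for example pr) is stored as a matrix, and the names Concrete, Abstract, summarize and the P_* flags are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NODES 16

typedef enum { K_FALSE = 0, K_HALF = 1, K_TRUE = 2 } Kleene;

/* Hypothetical encoding of the unary properties of stack records. */
enum { P_EXIT = 1, P_CSL1 = 2, P_CSL2 = 4, P_TOP = 8 };

typedef struct {
    int n;                              /* number of concrete nodes */
    unsigned unary[MAX_NODES];          /* unary properties of each node */
    bool rel[MAX_NODES][MAX_NODES];     /* one binary relationship, e.g. pr */
} Concrete;

typedef struct {
    int n;                              /* number of abstract nodes */
    unsigned unary[MAX_NODES];          /* properties shared by merged nodes */
    bool summary[MAX_NODES];            /* stands for more than one node */
    Kleene rel[MAX_NODES][MAX_NODES];   /* joined relationship values */
} Abstract;

void summarize(const Concrete *c, Abstract *a) {
    int group[MAX_NODES];
    int count[MAX_NODES] = {0};
    bool seen[MAX_NODES][MAX_NODES] = {{false}};

    /* Nodes with identical unary properties map to the same abstract node. */
    a->n = 0;
    for (int i = 0; i < c->n; i++) {
        int g = 0;
        while (g < a->n && a->unary[g] != c->unary[i]) g++;
        if (g == a->n) { a->unary[g] = c->unary[i]; a->n++; }
        group[i] = g;
        count[g]++;
    }
    for (int g = 0; g < a->n; g++) a->summary[g] = count[g] > 1;

    /* Join the relationship over all concrete pairs behind each abstract pair. */
    for (int i = 0; i < c->n; i++) {
        for (int j = 0; j < c->n; j++) {
            int gi = group[i], gj = group[j];
            Kleene v = c->rel[i][j] ? K_TRUE : K_FALSE;
            if (!seen[gi][gj]) { a->rel[gi][gj] = v; seen[gi][gj] = true; }
            else if (a->rel[gi][gj] != v) a->rel[gi][gj] = K_HALF;
        }
    }
}
```

The number of abstract nodes is bounded by the number of distinct property combinations, regardless of how deep the recursion is. Applied to five records with properties exit, csl1, csl2, csl2 and csl2 plus top, the sketch merges the two records labeled only csl2 into one summary node, and the pr values that touch that node become 1/2.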

Problem

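[Figure: an abstract memory descriptor (nodes csl2, top; csl2; exit; csl1, with pr and t edges) alongside a concrete memory descriptor (records exit, csl1, csl2, csl2 and csl2, top, with pr and t edges).]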

Observing Properties of Invisible Variables

• Explicitly track universal properties of invisible variables

– Different invisible instances of t cannot point to the same heap cell

• Instrumentation properties

– Track derived properties of memory elements

Some Instrumentation Properties

• Pointed-to by an invisible instance of t

• Pointed-to by more than one invisible instance of t

• t is not NULL
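On a concrete stack, the three properties listed above could be evaluated as in the sketch below. It assumes a hypothetical Frame record that holds one instance of t and a pr link to the invoking record, and it treats every record below the top one as invisible. In the analysis itself these properties are tracked as 3-valued predicates rather than computed by traversal; the function names are illustrative only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct List { int data; struct List *n; } *L;

/* Hypothetical activation record: an instance of t plus the invoker link. */
typedef struct Frame {
    L t;
    struct Frame *pr;
} Frame;

/* Count the invisible instances of t (records below top) pointing to cell. */
static int invisible_refs(const Frame *top, L cell) {
    int refs = 0;
    if (top == NULL) return 0;
    for (const Frame *f = top->pr; f != NULL; f = f->pr)
        if (f->t == cell) refs++;
    return refs;
}

/* Pointed-to by an invisible instance of t. */
bool pointed_by_invisible_t(const Frame *top, L cell) {
    return cell != NULL && invisible_refs(top, cell) >= 1;
}

/* Pointed-to by more than one invisible instance of t. */
bool shared_by_invisible_t(const Frame *top, L cell) {
    return cell != NULL && invisible_refs(top, cell) >= 2;
}

/* t is not NULL in the given record. */
bool t_is_not_null(const Frame *f) {
    return f->t != NULL;
}
```

In the running example each invocation allocates its own cell, so shared_by_invisible_t should be false for every heap cell. That is the universal property the instrumentation is meant to preserve under summarization.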

Memory Descriptors with Instrumentation

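[Figure: a concrete memory descriptor (records exit, csl1, csl2, csl2 and csl2, top, with pr and t edges) and an abstract memory descriptor (nodes csl2, top; csl2; exit; csl1, with pr and t edges), annotated with instrumentation properties.]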

Problem - solved

[Figure: abstract and concrete memory descriptors over the nodes exit, csl1, csl2 and csl2, top, connected by pr and t edges.]

Why Does It Work

• Shape analysis handles linked lists quite precisely (Sagiv, Reps, Wilhelm [TOPLAS98])

• Utilize the (intraprocedural) 3-valued logic framework of Sagiv, Reps and Wilhelm [POPL99] to analyze the resulting intraprocedural problem

Prototype Implementation

• Implemented in TVLA [Lev-Ami, Sagiv SAS 2000]

• Analyzed some recursive list manipulating programs

• Verified cleanness properties:

– No memory leaks

– No NULL dereferences

Prototype Implementation

Procedure           Time (sec)   Number of (3VL) Structures
create                    7.31                          219
delAll                   12.74                          139
insert                   34.61                          344
delete                   38.29                          423
search                    8.07                          303
append                   40.64                          326
reverse                  47.56                          414
reverse_append           95.35                          797
reverse_append_r       1204.13                         2285
Running example          16.50                          208

Conclusion

• Need to know more than potential values of invisible variables

• Tracking properties of invisible variables helps to overcome the (necessary) imprecise summarization of their values

• Instrumentation

– Generic

• Sharing by different instances of a local variable

– List specific

Conclusion

• Storing the call-site improves the propagation of information to return-sites

• Shows how the intraprocedural framework of Sagiv, Reps and Wilhelm can be used for interprocedural analyses

• Analysis of a complex data structure

Limitations

• Small programs

• No mutual recursion (Implementation)

• Predefined instrumentation library

– Easy to use, no need for user intervention

– Might not be good for all programs

Further Work

• Scaling the algorithm

– Distinguishing between “relevant context” and “irrelevant context”

– Analysis of programs manipulating Abstract Data Types

The End

Interprocedural Shape Analysis for Recursive Programs

Noam Rinetzky and Mooly Sagiv

Compiler Construction 2001

www.cs.tau.ac.il/~maon
