A Practical and Precise Inference and Specializer for Array Bound Checks Elimination

PEPM 2008 - 8 January


Corneliu Popeea, Natl Univ of Singapore

Dana N. Xu, Univ of Cambridge

Wei-Ngan Chin, Natl Univ of Singapore

Array Bound Check Elimination

• Problem:
  – without array bound checks (e.g. C), programs may be unsafe.
  – with array bound checks (e.g. Java), program execution is slowed down.

• Solution: eliminate redundant checks.

[Figure: input program → Inference → method summaries → Specialization → optimized program]

Inference

Goal: derive preconditions that make checks redundant.

Our contributions:
• modular inference of preconditions.
• handling indirection arrays.

    float foo (float a[], int j, int n) {
      float v=0;
      int i = j+1;
      if (0<i<=n) then v=a[i] else ();      // L1
      int m = abs(random());
      v + a[m];                             // L2
    }

At L1:
  Symb. Program State: i=j+1 ∧ 0<i≤n
  Checks: i≥0 (SAFE), i<len(a) (PRECONDITION)

At L2:
  Symb. Program State: i=j+1 ∧ m≥0
  Checks: m≥0 (SAFE), m<len(a) (UNSAFE)

Specialization

Goal: eliminate runtime checks guided by inference results.

• If we assume all callers satisfy (j+1 < len(a)):

    float foo (float a[], int j, int n) {
      float v = 0;
      int i = j+1;
      if (0<i<=n) then v=a[i] else ();      // L1
      int m = abs(random());
      v + (if (m<len(a)) then a[m] else error);
    }

Our contribution: integrate modular inference with specializer.


Overview

• Introduction

• Our approach
  – Modular inference: postcondition + preconditions.
  – Flexi-variant specialization.

• Experimental results.

• Conclusion.

Setting

• First order imperative language:

    meth ::= t mn ( ([ref] t v)* ) { e }        - method
    t    ::= int | float | t[int, .. , int]     - type
    e    ::= k | v | if v then e1 else e2       - expression
           | v=e | t v=e1;e2 | mn(v*)

• Invariants expressed as linear formulae:

    Q ::= { q(v*) = φ }                         - recursive formula
    φ ::= φ1∧φ2 | φ1∨φ2 | q(v*) | s             - formula
    s ::= a1v1 + .. + anvn ≤ a                  - linear inequality

Forward Derivation

• Compute sps (symb. program state) at each point.
• To support modularity, symbolic transitions relate initial values (j,n) and latest values (j',n'):

    float foo (float a[], int j, int n) {   // L0
      float v=0;
      int i = j+1;
      if (0<i<=n) then v=a[i] else ();      // L1
      int m = abs(random());
      v + a[m];                             // L2
    }

sps(L0) = len(a)>0 ∧ j'=j ∧ n'=n
sps(L1) = sps(L0) ∧ i'=j'+1 ∧ 0<i'≤n'
sps(L2) = sps(L0) ∧ i'=j'+1 ∧ m'≥0


Forward Derivation for Recursion

• Each method is first translated to a recursive constraint.

• Compute an over-approximation of the least fixed point of this recursive constraint:
  – precise disjunctive polyhedron abstract domain.
  – with hulling and widening operators.

• Details and examples in the paper.
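To illustrate how hulling and widening yield an over-approximation of a least fixed point, here is a minimal sketch. It deliberately swaps in a plain interval domain for the disjunctive polyhedron domain named above, and it analyses a hypothetical counting loop (`x = 0; while (x < bound) x = x + 1`) rather than a recursive constraint from the paper. All function names are assumptions made for this example.

```python
import math

INF = math.inf

def join(a, b):
    """Hull of two intervals: the smallest interval containing both."""
    return (min(a[0], b[0]), max(a[1], b[1]))

def widen(old, new):
    """Interval widening: any bound that grew is pushed to infinity."""
    lo = old[0] if new[0] >= old[0] else -INF
    hi = old[1] if new[1] <= old[1] else INF
    return (lo, hi)

def leq(a, b):
    """Interval inclusion: a is contained in b."""
    return b[0] <= a[0] and a[1] <= b[1]

def step(x, bound):
    """States at the loop head after one more unfolding of
    x = 0; while (x < bound) x = x + 1 (hypothetical example loop)."""
    guarded_hi = min(x[1], bound - 1)      # apply the guard x < bound
    if x[0] > guarded_hi:                  # guard unsatisfiable
        return (0, 0)
    body = (x[0] + 1, guarded_hi + 1)      # effect of x = x + 1
    return join((0, 0), body)              # hull with the entry state

def analyze(bound=100):
    """Iterate with widening until a post-fixed point is reached."""
    x, iterations = (0, 0), 0
    while True:
        nxt = step(x, bound)
        iterations += 1
        if leq(nxt, x):
            return x, iterations
        x = widen(x, nxt)
```

With `bound=100` the exact least fixed point at the loop head is the interval [0, 100]. The sketch stabilises after two iterations on [0, +inf), which is sound but less precise; a more expressive domain such as the one used in the paper aims to recover this kind of lost precision.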


Indirection Arrays

• Hold indexes for accessing another array.
• Used intensively for sparse matrix operations.

• Need to capture universal properties about elements inside array:

    ∀ i ∈ indexes(a) · 0 ≤ a[i] ≤ 10

  represented as:

    0 ≤ a_elem ≤ 10


Indirection Arrays

• Given method:

    void initArr(int a[], int i, int j, int n) {
      if (i>j) then ()
      else { a[i]=n; initArr(a,i+1,j,n+1) }
    }

• Compute postcondition:

    (i>j ∧ a_elem'=a_elem)
    ∨ (0≤i≤j<len(a) ∧ (a_elem'=a_elem ∨ n≤a_elem'≤n+j-i))
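The postcondition can be sanity-checked against concrete runs. The sketch below is a direct Python transcription of initArr plus a checker that reads a_elem'=a_elem as "this element is unchanged" and n≤a_elem'≤n+j-i as a bound on each updated element. The names `init_arr` and `post_holds` are hypothetical, and testing a few runs is only evidence, not the proof that the inference provides.

```python
def init_arr(a, i, j, n):
    """Transcription of initArr: writes n, n+1, ... into a[i..j]."""
    if i > j:
        return
    a[i] = n
    init_arr(a, i + 1, j, n + 1)

def post_holds(old, new, i, j, n):
    """Evaluate the inferred postcondition on one concrete before/after pair.

    Each element must be either unchanged or inside [n, n+j-i].
    """
    if i > j:
        return new == old
    in_range = 0 <= i <= j < len(old)
    return in_range and all(
        before == after or n <= after <= n + j - i
        for before, after in zip(old, new)
    )

def run_and_check(a, i, j, n):
    """Run init_arr on a copy of a and report (result, postcondition holds)."""
    old = list(a)
    new = list(a)
    init_arr(new, i, j, n)
    return new, post_holds(old, new, i, j, n)
```

For example, `run_and_check([7, 7, 7, 7, 7], 1, 3, 10)` writes 10, 11, 12 into positions 1 to 3, and every element is either untouched or within [10, 12], as the postcondition requires.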


Inference of Preconditions

• Classify checks with pre = ∀L · (sps ⇒ chk):

  – pre is valid: safe check.

  – pre is unsatisfiable: unsafe check.

  – otherwise: propagate pre as a check for the caller.

Example: Preconditions

• Derive weakest precondition for each check:

    float foo (float a[], int j, int n) {
      float v = 0;
      int i = j+1;
      if (0<i<=n) then v=a[i] else ();      // L1
      int m = abs(random());
      v + a[m];                             // L2
    }

sps(L1) = len(a)>0 ∧ i'=j'+1 ∧ 0<i'≤n' ∧ j'=j ∧ n'=n

pre(L1.high) = ∀ {i',j',n'} · (sps(L1) ⇒ i'<len(a))
             = (j<len(a)-1) ∨ (n≤j ∧ j≥len(a)-1)

pre(L1.low)  = ∀ {i',j',n'} · (sps(L1) ⇒ i'≥0)
             = true

sps(L2) = len(a)>0 ∧ i'=j'+1 ∧ m'≥0 ∧ j'=j ∧ n'=n

pre(L2.high) = ∀ {i',j',n',m'} · (sps(L2) ⇒ m'<len(a))
             = false
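The three classifications for foo can be reproduced with a brute-force sketch. It evaluates pre = ∀L · (sps ⇒ chk) by enumerating a small window of integers instead of calling a Presburger solver, so it is an illustration under stated assumptions rather than a decision procedure: the ranges `J`, `N`, `LEN` and `LOCALS` are arbitrary, len(a)>0 is assumed as a type invariant, and all function names are hypothetical.

```python
from itertools import product

J = N = range(-4, 5)       # sampled values of the parameters j and n
LEN = range(1, 5)          # len(a) > 0 is assumed as a type invariant
LOCALS = range(-6, 11)     # window for the quantified local (i' or m')

def sps_L1(i, j, n, length):
    # j'=j and n'=n are folded in by using j and n directly
    return length > 0 and i == j + 1 and 0 < i <= n

def sps_L2(m, j, n, length):
    # i'=j'+1 is always satisfiable and does not constrain m', so it is dropped
    return length > 0 and m >= 0

def low(x, length):
    return x >= 0

def high(x, length):
    return x < length

def pre(sps, chk):
    """pre(j, n, len) = forall x in LOCALS . sps(x, ...) => chk(x, len)."""
    return lambda j, n, length: all(
        (not sps(x, j, n, length)) or chk(x, length) for x in LOCALS)

def classify(p):
    """safe if p always holds, unsafe if it never holds, else propagate."""
    values = {p(j, n, length) for j, n, length in product(J, N, LEN)}
    if values == {True}:
        return "safe"
    if values == {False}:
        return "unsafe"
    return "propagate"

pre_L1_low = pre(sps_L1, low)
pre_L1_high = pre(sps_L1, high)
pre_L2_high = pre(sps_L2, high)
```

On this window, `classify` returns "safe" for L1.low, "propagate" for L1.high and "unsafe" for L2.high, and `pre_L1_high` agrees pointwise with the closed form (j<len(a)-1) ∨ (n≤j ∧ j≥len(a)-1).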

Efficient Preconditions

• Problem: negation of sps results in large preconditions.

  – naïve pre-derivation (too large):
      (len(a)≤0) ∨ (j<len(a)-1 ∧ 1≤len(a)) ∨ (n≤j ∧ 1≤len(a)≤j+1)

• Simplify preconditions via strengthening:

  – weak pre-derivation drops disjuncts that violate type-invariants (no loss in precision):
      (j<len(a)-1) ∨ (n≤j ∧ len(a)≤j+1)

  – strong pre-derivation drops disjuncts that allow the avoidance of the check (less precise, but more efficient):
      (j<len(a)-1)

  – selective pre-derivation: between weak and strong.
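The relationship between the three pre-derivations can be checked by enumeration. The sketch below encodes the naïve, weak and strong formulas for L1.high as Python predicates and tests implications over a small, arbitrarily chosen window of integers. The window and the helper `implies` are assumptions of this example, and a bounded check is evidence rather than a proof.

```python
from itertools import product

def naive(j, n, length):
    return (length <= 0
            or (j < length - 1 and 1 <= length)
            or (n <= j and 1 <= length <= j + 1))

def weak(j, n, length):
    return j < length - 1 or (n <= j and length <= j + 1)

def strong(j, n, length):
    return j < length - 1

# Bounded window of (j, n, len(a)) triples; includes len(a) <= 0 on purpose.
STATES = list(product(range(-5, 6), range(-5, 6), range(-2, 6)))
TYPED = [s for s in STATES if s[2] >= 1]   # states satisfying len(a) > 0

def implies(p, q, states):
    """True if every state satisfying p also satisfies q."""
    return all((not p(*s)) or q(*s) for s in states)
```

On this window, weak implies naïve everywhere and the two coincide once len(a)>0 holds, which matches "no loss in precision". Strong implies weak but not the other way round: the state j=2, n=0, len(a)=1 satisfies weak (the check at L1 is never reached) yet fails strong, which is the precision given up for a smaller formula.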

Inference Result: Method Summary

• Postcondition: (j<len(a)-1 ∨ (j≥len(a)-1 ∧ n≤j)) ∧ j'=j ∧ n'=n

• Preconditions: { L1.high: (j<len(a)-1) }
• Unsafe-checks: { L2.high }

    float foo (float a[], int j, int n) {
      float v = 0;
      int i = j+1;
      if (0<i<=n) then v=a[i] else ();      // L1
      int m = abs(random());
      v + a[m];                             // L2
    }


Overview

• Introduction

• Our approach
  – Modular inference: postcondition + preconditions.
  – Flexi-variant specialization.

• Experimental results.

• Conclusion.

Specialization

• If we assume all contexts satisfy (j+1 < len(a)):

    float foo (float a[], int j, int n) {
      float v = 0;
      int i = j+1;
      if (0<i<=n) then v=a[i] else ();      // L1
      int m = abs(random());
      v + (if (m<len(a)) then a[m] else error);
    }

• If we assume all contexts do not satisfy (j+1 < len(a)): specialize foo with 2 runtime checks.

• Otherwise … ?


Specialization

• Monovariant specializer
  – One specialized code for each method.
  – Lower bound of all optimization.
  – Compact code size.

• Polyvariant specializer
  – Multiple optimized codes per method.
  – Each call site is replaced by a specialized call.
  – Highly optimized but may have code blow-up.


Flexivariant Specialization

• Allows trade-off between optimization and code size.
• Decides how many copies to generate per method, based on frequency and size constraint.

• Less optimization - 1 copy: foo (2 runtime checks).
• More optimization - 2 copies: foo1 (1 runtime check) + foo2 (2 runtime checks)
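The two copies can be pictured with a small sketch in which every residual runtime check is made explicit and counted. Several things here are assumptions for illustration: the value of `abs(random())` is passed in as a parameter `m` (assumed non-negative) so that runs are repeatable, `Counter` and `checked_read` are hypothetical helpers, and Python still bound-checks `a[i]` internally, so the "unchecked" read only models what a specializer would emit.

```python
class OutOfBounds(Exception):
    """Raised when an explicit upper-bound runtime check fails."""

class Counter:
    """Counts how many explicit runtime checks were executed."""
    def __init__(self):
        self.checks = 0

def checked_read(a, idx, counter):
    """Read a[idx] behind an explicit upper-bound runtime check."""
    counter.checks += 1
    if idx >= len(a):
        raise OutOfBounds(idx)
    return a[idx]

def foo2(a, j, n, m, counter):
    """General copy: keeps the upper-bound checks at L1 and L2."""
    v = 0.0
    i = j + 1
    if 0 < i <= n:
        v = checked_read(a, i, counter)     # L1.high kept
    return v + checked_read(a, m, counter)  # L2.high kept

def foo1(a, j, n, m, counter):
    """Copy for call sites where j+1 < len(a) is known to hold."""
    v = 0.0
    i = j + 1
    if 0 < i <= n:
        v = a[i]                            # L1.high eliminated
    return v + checked_read(a, m, counter)  # L2.high is unsafe, so it stays
```

A call site whose context establishes j+1 < len(a) would be redirected to `foo1`; every other call site keeps `foo2`. The lower-bound checks are absent from both copies because inference classifies them as safe.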


Soundness

• Inference + Specialization = Well-typed program

Theorem: Given a program P, let PI be the result of the inference judgment on P, and let PT be the flexivariant specialization of PI. Then, if PT is well-typed, its execution will never proceed to invalid array accesses.


Implementation

• Prototype written in Haskell:
  – uses an efficient Presburger solver [W. Pugh et al].
  – disjunctive fixed-point analyzer [Popeea and Chin].

• Test programs:
  – small programs: binary search, merge sort, quick sort.
  – numerical benchmarks: Fast Fourier Transform, LU decomposition, Linpack.

Experimental Results

Benchmarks        Source (lines)   Static Checks   Time (secs)   Static Checks Eliminated
binary-search     31               2               1.81          100%
bubble-sort       39               12              1.51          100%
merge-sort        58               24              16.01         100%
queens            39               8               2.11          100%
quick-sort        43               20              1.92          100%
sentinel          26               4               0.16          75%
sparse multiply   46               12              17.37         100%
FFT               336              62              58.02         100%
LU                191              82              93.31         100%
SOR               84               32              4.67          100%
Linpack           903              166             360.1         100%

Precondition Strengthening

• Weak pre-derivation may generate preconditions that are too large to be manipulated (* signifies a timing over an hour).

• Strong pre-derivation keeps preconditions small (simplifies 81% from weak-pre).

• Selective pre-derivation: both efficient and precise (simplifies 63.4% from weak-pre).

Time (secs) per benchmark program:

Benchmark   Weak    Selective   Strong
FFT         *       58.02       28.74
LU          137.1   93.31       72.91
SOR         7.18    4.67        3.8
Linpack     *       360.1       162.2


Conclusion

• Modular summary-based analysis:
  – Disjunctive postcondition inference.
  – Derivation of efficient, scalable preconditions.

• Integration with a flexi-variant specializer.

• Implementation of a prototype system.

• Correctness proof.


Corneliu Popeea, Dana N. Xu, Wei-Ngan Chin

We thank Siau-Cheng Khoo for sound and insightful suggestions. Thanks to anonymous referees for comments.


Related Work

• Global analyses:
  – Techniques: Suzuki and Ishihata [POPL'77], Cousot and Halbwachs [POPL'78]
  – Tools: Astrée [PLDI'03], C Global Surveyor [PLDI'04]

• Modular analyses:
  – Cousot and Cousot [IFIP'77, CC'02]
  – Chatterjee, Ryder and Landi [POPL'99]
  – Moy [VMCAI'08]

• Dependent type checking:
  – Xi and Pfenning [PLDI'98]


• Limitations:
  – Large formulae: currently under-approx. formulae are propagated. Over-approx. formulae are more compact, since sps appears in a positive position.

• Future work:
  – Dual analysis to validate some alarms as true bugs.
  – Extend the analysis with sound treatment of reference types.
  – Handle more (existential) properties about array elements.


Two Kinds of Recursive Invariants

• For loops:
  – compute a loop invariant.

• For methods with general recursion:
  – compute a loop invariant.
  – the method postcondition cannot be determined directly from the loop invariant: a separate fixed-point is computed.

VCgen (Verification Condition Generator)

• Backward VCgen:
  – given: {P} assert chk {Q}
  – derives: P = (Q ∧ chk)

• Our precondition derivation:
  – given: {pre} …; assert chk {sps}
  – derives: pre = (sps ⇒ chk)

• Differences:
  – sps is a transition relation: pre holds at the beginning of the current method.
  – sps is computed by a separate forward derivation.
