A Practical and Precise Inference and Specializer for Array Bound Checks Elimination
PEPM 2008 - 8 January
Corneliu Popeea, Natl Univ of Singapore
Dana N. Xu, Univ of Cambridge
Wei-Ngan Chin, Natl Univ of Singapore
2
Array Bound Check Elimination
• Problem:
  – without array bound checks (e.g. C), programs may be unsafe.
  – with array bound checks (e.g. Java), program execution is slowed down.
• Solution: eliminate redundant checks.
[Diagram: input program → Inference → method summaries → Specialization → optimized program]
3
Inference
Goal: derive preconditions that make checks redundant.
Our contributions: modular inference of preconditions, handling indirection arrays.
    float foo (float a[], int j, int n) {
      float v = 0;
      int i = j+1;
      if (0<i<=n) then v = a[i] else ();   // L1
      int m = abs(random());
      v + a[m];                            // L2
    }
At L1:
  Symb. Program State: i=j+1 ∧ 0<i≤n
  Checks: i≥0 (SAFE), i<len(a) (PRECONDITION)
At L2:
  Symb. Program State: i=j+1 ∧ m≥0
  Checks: m≥0 (SAFE), m<len(a) (UNSAFE)
4
Specialization
Goal: eliminate runtime checks guided by inference results.
• If we assume all callers satisfy (j+1 < len(a)):
    float foo (float a[], int j, int n) {
      float v = 0;
      int i = j+1;
      if (0<i<=n) then v = a[i] else ();   // L1
      int m = abs(random());
      v + (if (m<len(a)) then a[m] else error);
    }
Our contribution: integrate modular inference with specializer.
5
Overview
• Introduction
• Our approach
  – Modular inference: postcondition + preconditions.
  – Flexi-variant specialization.
• Experimental results.
• Conclusion.
6
Setting
• First order imperative language:
    meth ::= t mn ( ([ref] t v)* ) { e }                                - method
    t    ::= int | float | t[int, .. , int]                             - type
    e    ::= k | v | if v then e1 else e2 | v=e | t v=e1;e2 | mn(v*)    - expression
• Invariants expressed as linear formulae:
    Q ::= { q(v*) = φ }                   - recursive formula
    φ ::= φ1∧φ2 | φ1∨φ2 | q(v*) | s       - formula
    s ::= a1v1 + .. + anvn ≤ a            - linear inequality
7
Forward Derivation
• Compute sps (symb. program state) at each point.
• To support modularity, symbolic transitions relate initial values (j,n) and latest values (j',n'):
    float foo (float a[], int j, int n) {   // L0
      float v = 0;
      int i = j+1;
      if (0<i<=n) then v = a[i] else ();    // L1
      int m = abs(random());
      v + a[m];                             // L2
    }
  sps(L0) = len(a)>0 ∧ j'=j ∧ n'=n
  sps(L1) = sps(L0) ∧ i'=j'+1 ∧ 0<i'≤n'
  sps(L2) = sps(L0) ∧ i'=j'+1 ∧ m'≥0
8
Forward Derivation for Recursion
• Each method is first translated to a recursive constraint.
• Compute an over-approximation of the least fixed point of this recursive constraint:
  – precise disjunctive polyhedron abstract domain.
  – with hulling and widening operators.
• Details and examples in the paper.
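The role of widening in such a fixed-point computation can be illustrated on a much simpler domain than disjunctive polyhedra. The sketch below is not the analyzer described here: it swaps in plain intervals, analyzes a hypothetical loop `i = 0; while (i < 100) i = i + 1`, and all helper names (`join`, `widen`, `guard_lt`, `step_inc`, `loop_invariant`) are invented for the example.

```python
import math

def join(a, b):
    """Interval hull of two intervals; None stands for the empty interval."""
    if a is None:
        return b
    if b is None:
        return a
    return (min(a[0], b[0]), max(a[1], b[1]))

def widen(old, new):
    """Interval widening: any bound that is still moving jumps to infinity."""
    if old is None:
        return new
    lo = old[0] if new[0] >= old[0] else -math.inf
    hi = old[1] if new[1] <= old[1] else math.inf
    return (lo, hi)

def guard_lt(x, bound):
    """Restrict interval x to the integers satisfying i < bound."""
    if x is None or x[0] > bound - 1:
        return None
    return (x[0], min(x[1], bound - 1))

def step_inc(x):
    """Abstract effect of the loop body i = i + 1."""
    return None if x is None else (x[0] + 1, x[1] + 1)

def loop_invariant(init, bound):
    """Iterate X := widen(X, init join step(guard(X))) until X stabilizes."""
    x = None
    while True:
        nxt = widen(x, join(init, step_inc(guard_lt(x, bound))))
        if nxt == x:
            return x
        x = nxt

invariant = loop_invariant((0, 0), 100)
print(invariant)  # (0, inf)
```

The exact invariant at the loop head is 0 ≤ i ≤ 100. Widening gives up the upper bound in exchange for termination after three iterations, so the result is a sound over-approximation rather than the least fixed point.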
9
Indirection Arrays
• Hold indexes for accessing another array.
• Used intensively for sparse matrix operations.
• Need to capture universal properties about elements inside array:
    0 ≤ a_elem ≤ 10
  represented as:
    ∀ i ∈ indexes(a) · 0 ≤ a[i] ≤ 10
10
Indirection Arrays
• Given method:
    void initArr(int a[], int i, int j, int n) {
      if (i>j) then ()
      else { a[i]=n; initArr(a,i+1,j,n+1) }
    }
• Compute postcondition:
    (i>j ∧ a_elem'=a_elem) ∨ (0≤i≤j<len(a) ∧ (a_elem'=a_elem ∨ n≤a_elem'≤n+j-i))
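The initArr postcondition can be sanity-checked on a concrete run. The following is a rough Python transcription, not part of the original slides. It assumes that a_elem stands for the old value of an arbitrary element and a_elem' for its new value, and the helper `post_holds` is a hypothetical name introduced only for this check.

```python
def init_arr(a, i, j, n):
    """Python transcription of initArr; mutates the list a in place."""
    if i > j:
        return
    a[i] = n
    init_arr(a, i + 1, j, n + 1)

def post_holds(old, new, i, j, n):
    """Check the postcondition for every element pair (a_elem, a_elem')."""
    if i > j:
        # First disjunct: nothing is written, every element is unchanged.
        return all(e_new == e_old for e_old, e_new in zip(old, new))
    # Second disjunct: each element is unchanged or lies in [n, n+j-i].
    return 0 <= i <= j < len(old) and all(
        e_new == e_old or n <= e_new <= n + j - i
        for e_old, e_new in zip(old, new)
    )

old = [7, 7, 7, 7, 7, 7]
new = list(old)
init_arr(new, 1, 4, 10)
print(new)                              # [7, 10, 11, 12, 13, 7]
print(post_holds(old, new, 1, 4, 10))   # True
```

The written elements range from n = 10 to n+j-i = 13, while positions outside i..j keep their old value, so both inner disjuncts of the postcondition are exercised.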
11
Inference of Preconditions
• Classify checks with pre = ∀L · (sps ⇒ chk):
  – pre is valid: safe check.
  – pre is unsatisfiable: unsafe check.
  – otherwise: propagate pre as a check for the caller.
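The three-way classification can be mimicked by brute force on the running example foo. This is only a finite-domain sketch: the actual inference decides validity and unsatisfiability symbolically, whereas the code below enumerates small integer windows (an assumption made for illustration) and uses hypothetical helper names `pre` and `classify`. Array lengths are restricted to len(a) > 0, matching the entry state of foo.

```python
from itertools import product

PARAMS = range(-4, 5)    # finite stand-in for the parameters j and n (assumption)
LENS = range(1, 5)       # array lengths, with len(a) > 0
LOCALS = range(-6, 7)    # finite window for the quantified locals i', m'

def pre(sps, chk, len_a, j, n):
    """pre = forall i', m' . (sps => chk), evaluated over the finite window."""
    return all(not sps(len_a, j, n, i, m) or chk(len_a, i, m)
               for i, m in product(LOCALS, LOCALS))

def classify(sps, chk):
    """valid -> safe, unsatisfiable -> unsafe, otherwise a precondition for the caller."""
    results = {pre(sps, chk, len_a, j, n)
               for len_a, j, n in product(LENS, PARAMS, PARAMS)}
    if results == {True}:
        return "safe"
    if results == {False}:
        return "unsafe"
    return "precondition"

# Symbolic program states of foo at the two array accesses.
sps_L1 = lambda len_a, j, n, i, m: i == j + 1 and 0 < i <= n
sps_L2 = lambda len_a, j, n, i, m: i == j + 1 and m >= 0

verdicts = {
    "L1.low":  classify(sps_L1, lambda len_a, i, m: i >= 0),
    "L1.high": classify(sps_L1, lambda len_a, i, m: i < len_a),
    "L2.low":  classify(sps_L2, lambda len_a, i, m: m >= 0),
    "L2.high": classify(sps_L2, lambda len_a, i, m: m < len_a),
}
print(verdicts)
```

On these windows the lower-bound checks come out as safe, L1.high as a precondition to propagate, and L2.high as unsafe, since m' = len(a) always violates the check.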
12
Example: Preconditions
• Derive weakest precondition for each check:
    float foo (float a[], int j, int n) {
      float v = 0;
      int i = j+1;
      if (0<i<=n) then v = a[i] else ();   // L1
      int m = abs(random());
      v + a[m];                            // L2
    }
At L1:
  sps(L1) = len(a)>0 ∧ i'=j'+1 ∧ 0<i'≤n' ∧ j'=j ∧ n'=n
  pre(L1.low) = ∀{i',j',n'} · (sps(L1) ⇒ i'≥0)
              = true
  pre(L1.high) = ∀{i',j',n'} · (sps(L1) ⇒ i'<len(a))
               = (j<len(a)-1) ∨ (n≤j ∧ j≥len(a)-1)
At L2:
  sps(L2) = len(a)>0 ∧ i'=j'+1 ∧ m'≥0 ∧ j'=j ∧ n'=n
  pre(L2.high) = ∀{i',j',n',m'} · (sps(L2) ⇒ m'<len(a))
               = false
13
Efficient Preconditions
• Problem: negation of sps results in large preconditions.
  – naïve pre-derivation (too large):
      (len(a)≤0) ∨ (j<len(a)-1 ∧ 1≤len(a)) ∨ (n≤j ∧ 1≤len(a)≤j+1)
• Simplify preconditions via strengthening:
  – weak pre-derivation drops disjuncts that violate type-invariants (no loss in precision):
      (j<len(a)-1) ∨ (n≤j ∧ len(a)≤j+1)
  – strong pre-derivation drops disjuncts that allow the avoidance of the check (less precise, but more efficient):
      (j<len(a)-1)
  – selective pre-derivation between weak and strong.
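The relation between the three pre-derivations for L1.high can be checked by enumeration. This is a sketch over a small integer window (an assumption; a Presburger solver would establish the same facts for all integers). The function names are hypothetical, and `exact` is the quantified formula ∀i' · (sps(L1) ⇒ i'<len(a)) with i' replaced by j+1.

```python
from itertools import product

def exact(len_a, j, n):
    """forall i' . (len(a)>0 and i'=j+1 and 0<i'<=n) => i'<len(a)."""
    i = j + 1  # i' is fully determined by j
    return not (len_a > 0 and 0 < i <= n) or i < len_a

def naive(len_a, j, n):
    return (len_a <= 0) or (j < len_a - 1 and 1 <= len_a) \
        or (n <= j and 1 <= len_a <= j + 1)

def weak(len_a, j, n):
    return (j < len_a - 1) or (n <= j and len_a <= j + 1)

def strong(len_a, j, n):
    return j < len_a - 1

window = range(-5, 6)
points = list(product(window, window, window))   # (len_a, j, n)
typed = [p for p in points if p[0] >= 1]         # type invariant: len(a) >= 1

# naive is exactly the quantified formula
print(all(naive(*p) == exact(*p) for p in points))           # True
# weak loses nothing once the type invariant holds
print(all(weak(*p) == naive(*p) for p in typed))             # True
# strong is a strengthening of weak ...
print(all(weak(*p) for p in typed if strong(*p)))            # True
# ... and a strict one: some callers accepted by weak are rejected by strong
print(any(weak(*p) and not strong(*p) for p in typed))       # True
```

For instance, with len(a)=1, j=0, n=0 the access at L1 is never executed, so weak accepts the call while strong rejects it. This is the precision given up for a smaller formula.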
14
Inference Result: Method Summary
• Postcondition: (j<len(a)-1 ∨ (j≥len(a)-1 ∧ n≤j)) ∧ j'=j ∧ n'=n
• Preconditions: { L1.high: (j<len(a)-1) }
• Unsafe-checks: { L2.high }
    float foo (float a[], int j, int n) {
      float v = 0;
      int i = j+1;
      if (0<i<=n) then v = a[i] else ();   // L1
      int m = abs(random());
      v + a[m];                            // L2
    }
15
Overview
• Introduction
• Our approach
  – Modular inference: postcondition + preconditions.
  – Flexi-variant specialization.
• Experimental results.
• Conclusion.
16
Specialization
• If we assume all contexts satisfy (j+1 < len(a)):
    float foo (float a[], int j, int n) {
      float v = 0;
      int i = j+1;
      if (0<i<=n) then v = a[i] else ();   // L1
      int m = abs(random());
      v + (if (m<len(a)) then a[m] else error);
    }
• If we assume all contexts do not satisfy (j+1 < len(a)): specialize foo with 2 runtime checks.
• Otherwise … ?
17
Specialization
• Monovariant specializer
  – One specialized code for each method.
  – Lower bound of all optimization.
  – Compact code size.
• Polyvariant specializer
  – Multiple optimized codes per method.
  – Each call site is replaced by a specialized call.
  – Highly optimized but may have code blow-up.
18
Flexivariant Specialization
• Allows trade-off between optimization and code size.
• Decides how many copies to generate per method, based on frequency and size constraint.
• Less optimization - 1 copy: foo (2 runtime checks).
• More optimization - 2 copies: foo1 (1 runtime check) + foo2 (2 runtime checks).
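The two-copy outcome can be pictured with a Python rendering of foo. This is a sketch rather than specializer output: the extra parameter `m` replaces abs(random()) so that runs are repeatable (it is assumed non-negative), `BoundsError` is a hypothetical exception standing in for `error`, and the lower-bound checks are omitted because inference classifies them as safe.

```python
class BoundsError(Exception):
    """Raised when an explicit runtime bound check fails."""

def foo2(a, j, n, m):
    """General copy: 2 runtime checks (L1.high and L2.high)."""
    v = 0.0
    i = j + 1
    if 0 < i <= n:
        if not i < len(a):          # L1.high kept: nothing is known about the caller
            raise BoundsError("L1.high")
        v = a[i]
    if not m < len(a):              # L2.high is unsafe, so it is always kept
        raise BoundsError("L2.high")
    return v + a[m]

def foo1(a, j, n, m):
    """Specialized copy: 1 runtime check; the caller must ensure j + 1 < len(a)."""
    v = 0.0
    i = j + 1
    if 0 < i <= n:
        v = a[i]                    # L1.high removed under the precondition
    if not m < len(a):
        raise BoundsError("L2.high")
    return v + a[m]

a = [1.0, 2.0, 3.0]
print(foo1(a, 0, 5, 2))   # 5.0  (call site known to satisfy j + 1 < len(a))
print(foo2(a, 0, 5, 2))   # 5.0  (same result, one extra check executed)
```

A call site where j+1 < len(a) is established statically would be redirected to foo1, and every other call site keeps using foo2. Generating only foo2 corresponds to the 1-copy option.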
19
Soundness
• Inference + Specialization = Well-typed program
Theorem: Given a program P, let PI be the result of inference on P, and let PT be the flexivariant specialization of PI. Then, if PT is well-typed, its execution will never proceed to invalid array-accesses.
20
Implementation
• Prototype written in Haskell:
  – uses an efficient Presburger solver [W. Pugh et al].
  – disjunctive fixed-point analyzer [Popeea and Chin].
• Test programs:
  – small programs: binary search, merge sort, quick sort.
  – numerical benchmarks: Fast Fourier Transform, LU decomposition, Linpack.
21
Experimental Results
Benchmark        Source (lines)  Static Checks  Time (secs)  Static Checks Eliminated
binary-search    31              2              1.81         100%
bubble-sort      39              12             1.51         100%
merge-sort       58              24             16.01        100%
queens           39              8              2.11         100%
quick-sort       43              20             1.92         100%
sentinel         26              4              0.16         75%
sparse multiply  46              12             17.37        100%
FFT              336             62             58.02        100%
LU               191             82             93.31        100%
SOR              84              32             4.67         100%
Linpack          903             166            360.1        100%
22
Precondition Strengthening
• Weak prederivation may generate preconditions that are too large to be manipulated (* signifies a timing over an hour).
• Strong prederivation keeps preconditions small (simplifies 81% from weak-pre).
• Selective prederivation: both efficient and precise (simplifies 63.4% from weak-pre).
Time (secs) per benchmark program:
Benchmark  Weak   Selective  Strong
FFT        *      58.02      28.74
LU         137.1  93.31      72.91
SOR        7.18   4.67       3.8
Linpack    *      360.1      162.2
23
Conclusion
• Modular summary-based analysis:
  – Disjunctive postcondition inference.
  – Derivation of efficient, scalable preconditions.
• Integration with a flexi-variant specializer.
• Implementation of a prototype system.
• Correctness proof.
24
We thank Siau-Cheng Khoo for sound and insightful suggestions. Thanks to anonymous referees for comments.
25
Related Work
• Global analyses:
  – Techniques: Suzuki and Ishihata [POPL'77], Cousot and Halbwachs [POPL'78]
  – Tools: Astrée [PLDI'03], C Global Surveyor [PLDI'04]
• Modular analyses:
  – Cousot and Cousot [IFIP'77, CC'02]
  – Chatterjee, Ryder and Landi [POPL'99]
  – Moy [VMCAI'08]
• Dependent type checking:
  – Xi and Pfenning [PLDI'98]
26
• Limitations:
  – Large formulae: currently under-approx. formulae are propagated. Over-approx. formulae are more compact, since sps appears in a positive position.
• Future work:
  – Dual analysis to validate some alarms as true bugs.
  – Extend the analysis with sound treatment of reference types.
  – Handle more (existential) properties about array elements.
27
Two Kinds of Recursive Invariants
• For loops:
  – compute a loop invariant.
• For methods with general recursion:
  – compute a loop invariant.
  – the method postcondition cannot be determined directly from the loop invariant: a separate fixed-point is computed.
28
VCgen: Verification Condition Generator
• Backward VCgen:
  – given: {P} assert chk {Q}
  – derives: P = (Q ∧ chk)
• Our precondition derivation:
  – given: {pre} …; assert chk {sps}
  – derives: pre = (sps ⇒ chk)
• Differences:
  – sps is a transition relation: pre holds at the beginning of the current method.
  – sps is computed by a separate forward derivation.