Date posted: 20-Dec-2015
Dynamically Discovering Likely Program Invariants to Support
Program Evolution
Michael D. Ernst, Jake Cockrell,
William G. Griswold, David Notkin
Presented by: Nick Rutar
Program Invariants
- Useful in software development
  - Protect programmers from making errant changes
  - Verify properties of a program
- Can be explicitly stated in programs
  - Programmers can annotate code with invariants
  - This can take time and effort
  - Many important invariants will be missed
Could there be a way to dynamically discover program invariants?
Daikon: An Invariant Detector
- Pick a source program (Daikon is language independent)
- Instrument the source program to trace variables of interest
- Run the instrumented program over test cases
- Infer invariants over
  - Instrumented variables (variables present in source)
  - Derived variables
    - Created variables that might be of interest
Derived Variables
- From any sequence s
  - Length: size(s)
  - Extremal elements: s[0], s[1], s[-1], s[-2]
- From a numeric sequence s
  - sum(s), min(s), max(s)
- From any sequence s and numeric variable i
  - Element at index: s[i], s[i-1]
  - Subsequences: s[0…i], s[0…i-1]
- From function invocations
  - Number of calls so far
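The derived variables listed above can be sketched in a few lines of Python. This is only an illustration: the function name `derive` and the dictionary keys are hypothetical and are not taken from Daikon's source.

```python
def derive(s, i=None):
    """Return derived variables for a sequence s and an optional index variable i.

    Names and structure are illustrative assumptions, not Daikon's code.
    """
    d = {"size(s)": len(s)}
    # Extremal elements: first two and last two, when they exist.
    for k in (0, 1, -1, -2):
        if -len(s) <= k < len(s):
            d[f"s[{k}]"] = s[k]
    # Aggregates apply only to non-empty numeric sequences.
    if s and all(isinstance(x, (int, float)) for x in s):
        d["sum(s)"] = sum(s)
        d["min(s)"] = min(s)
        d["max(s)"] = max(s)
    # Elements and subsequences selected by a numeric variable i.
    if i is not None and 0 <= i < len(s):
        d["s[i]"] = s[i]
        d["s[0..i]"] = s[: i + 1]  # upper bound is inclusive
        if i >= 1:
            d["s[i-1]"] = s[i - 1]
        d["s[0..i-1]"] = s[:i]
    return d
```

Each derived value is then treated like any other traced variable when invariants are inferred.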
Example Program (taken from “The Science of Programming”)
    i, s := 0, 0;
    do i ≠ n →
        i, s := i + 1, s + b[i]
    od
Precondition: n ≥ 0
Postcondition: s = (Σ j : 0 ≤ j < n : b[j])
Loop invariant: 0 ≤ i ≤ n and s = (Σ j : 0 ≤ j < i : b[j])
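For readers less familiar with guarded-command notation, here is a rough Python rendering of the array-sum program, with the precondition, loop invariant, and postcondition written as assertions. The function name `array_sum` is an assumption made for this sketch.

```python
def array_sum(b):
    """Sum the elements of b, checking the stated conditions at run time."""
    n = len(b)
    assert n >= 0  # precondition
    i, s = 0, 0
    while i != n:
        # Loop invariant, checked at the loop head.
        assert 0 <= i <= n and s == sum(b[:i])
        # Simultaneous assignment: the right-hand side uses the old i.
        i, s = i + 1, s + b[i]
    assert s == sum(b[:n])  # postcondition
    return s
```

These assertions are the properties one would hope an invariant detector recovers from execution traces alone.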
Daikon results from the program (100 randomly generated input arrays of length 7-13)
ENTER
  N = size(B)
  N in [7 … 13]
  B: all elements ≥ -100
EXIT
  N = I = orig(N) = size(B)
  B = orig(B)
  S = sum(B)
  N in [7 … 13]
  B: all elements ≥ -100
LOOP
  N = size(B)
  S = sum(B[0 … I-1])
  N in [7 … 13]
  I in [0 … 13]
  I ≤ N
  B: all elements in [-100 … 100]
  sum(B) in [-556 … 539]
  B[0] nonzero in [-99 … 96]
  B[-1] in [-88 … 99]
  N != B[-1]
  B[0] != B[-1]
*boxes indicate generated invariants that match expected ones
Architecture of the Daikon tool
Original Program → [Instrument] → Instrumented Program → [Run, with Test Suite] → Data Trace → [Detect Invariants] → Invariants
Instrument (Original Program → Instrumented Program)
- Daikon has instrumenters for Java, C, and Lisp
  - Source-to-source translation
  - Determines which variables are in scope
  - Inserts code to dump the variables into an output file
- Creates a declaration file
  - Variables being instrumented
  - Types in the original program
  - Representations in the trace file
  - Sets of variables that may be sensibly compared
- Operates only on scalar numbers and arrays of numbers
  - Scalar numbers include characters and booleans
  - Any other type is converted to one of these forms
Run (Instrumented Program → Data Trace)
- At each program point of interest, the instrumented program writes to a data trace file
  - All variables in scope
    - Global variables
    - Procedure arguments
    - Local variables
    - Return values (at procedure exits)
  - Modification bit
    - Whether a value has been set since last time
- For small programs, runtime may be I/O bound
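The idea of dumping in-scope values at procedure entry and exit can be approximated in Python with a decorator. This is a deliberately simplified stand-in: Daikon's instrumenters work by source-to-source translation, whereas this sketch wraps a function at run time, records only arguments and the return value (not locals or globals), omits the modification bit, and keeps the trace in a list. The names `TRACE`, `instrument`, and `total` are hypothetical.

```python
import copy
import functools
import inspect

TRACE = []  # in-memory stand-in for the data trace file (hypothetical)


def instrument(func):
    """Record argument values at ENTER and arguments plus result at EXIT."""
    sig = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        bound.apply_defaults()
        # Snapshot values at entry so later mutation cannot alter the record.
        TRACE.append((f"{func.__name__}:ENTER", copy.deepcopy(dict(bound.arguments))))
        result = func(*args, **kwargs)
        exit_vars = copy.deepcopy(dict(bound.arguments))
        exit_vars["return"] = result
        TRACE.append((f"{func.__name__}:EXIT", exit_vars))
        return result

    return wrapper


@instrument
def total(b, n):
    s = 0
    for i in range(n):
        s += b[i]
    return s
```

Calling `total([1, 2, 3], 3)` appends one ENTER record and one EXIT record to `TRACE`; running a whole test suite accumulates the samples that invariant detection consumes.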
Detect Invariants (Data Trace → Invariants)
- Single-variable invariants (numeric or sequence)
  - Constant value: x = a (variable is a constant)
  - Uninitialized: x = uninit (variable is never set)
  - Modulus: x ≡ a (mod b) (x mod b = a always holds)
- Invariants over multiple variables, up to 3 (numeric or sequence)
  - Linear relationship: y = ax + b
  - Reversal: x is the reverse of y
  - Invariants over x - y, x + y
- These are just a few
  - Complete list can be found in the paper
  - Domain-specific invariants can easily be coded in
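A minimal sketch of the detection step: propose candidate invariants, then discard each one as soon as a sample falsifies it. Only three candidate forms are covered here (constant value, equality, and ordering between pairs), and the function name `detect_invariants` and the sample format are assumptions for illustration, not Daikon's actual interface.

```python
from itertools import combinations


def detect_invariants(samples):
    """Return descriptions of candidate invariants that no sample falsifies.

    samples: list of dicts mapping variable name -> number, one dict per
    visit to a single program point.
    """
    if not samples:
        return []
    names = sorted(samples[0])
    first = samples[0]
    # Each candidate is (description, predicate over one sample).
    candidates = []
    for x in names:
        a = first[x]
        candidates.append((f"{x} = {a}", lambda v, x=x, a=a: v[x] == a))
    for x, y in combinations(names, 2):
        candidates.append((f"{x} = {y}", lambda v, x=x, y=y: v[x] == v[y]))
        candidates.append((f"{x} <= {y}", lambda v, x=x, y=y: v[x] <= v[y]))
        candidates.append((f"{y} <= {x}", lambda v, x=x, y=y: v[y] <= v[x]))
    # Drop a candidate as soon as any sample falsifies it.
    surviving = candidates
    for v in samples:
        surviving = [(desc, pred) for desc, pred in surviving if pred(v)]
    return [desc for desc, _ in surviving]
```

For the samples `{"i": 0, "n": 7}`, `{"i": 3, "n": 7}`, `{"i": 7, "n": 7}`, the survivors are `n = 7` and `i <= n`. The result is only as good as the samples: an invariant reported here is "likely", not proven.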
Run Time of Daikon
Informally, can be characterized as
    Time = O((vars³ × falsetime + trueinvs × testsuite) × program)
- vars is the number of variables at a program point (in scope)
  - Most invariants are falsified quickly
  - Only true invariants are checked for the entire run
  - Potentially cubic because invariants involve at most 3 variables
- falsetime is the (small constant) time to falsify a potential invariant
- trueinvs is the (small) number of true invariants at a program point
- testsuite is the size of the test suite
  - Must balance accuracy versus runtime
- program is the number of instrumented program points
  - The default is proportional to the size of the program
  - Users can control the extent of instrumentation
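To get a feel for how the terms in the bound interact, the expression can be evaluated directly. This is a unit-less illustration that ignores the constant factors hidden by the O-notation; the function name `daikon_cost` and the sample numbers are invented for the example.

```python
def daikon_cost(vars_, falsetime, trueinvs, testsuite, program):
    """Evaluate (vars^3 * falsetime + trueinvs * testsuite) * program."""
    return (vars_ ** 3 * falsetime + trueinvs * testsuite) * program


base = daikon_cost(vars_=10, falsetime=1, trueinvs=5, testsuite=100, program=20)
more_vars = daikon_cost(vars_=20, falsetime=1, trueinvs=5, testsuite=100, program=20)
more_tests = daikon_cost(vars_=10, falsetime=1, trueinvs=5, testsuite=200, program=20)
```

With these made-up inputs, doubling the variables in scope raises the estimate from 30000 to 170000, while doubling the test suite raises it only to 40000, which reflects the cubic versus linear dependence.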
Invariant Stability
- Size of test suite
  - Too small
    - Small number of invariants
    - More false invariants
  - Too large
    - Increases runtime linearly
- Interesting vs. uninteresting
  - Test suites of different sizes will yield more or fewer invariants
  - Uninteresting
    - Difference in a bound on a variable’s range
    - Different small set of possible values
  - Interesting: everything else
Invariant differences (2500-element test suite)

Invariant Type / Test Cases    500    1000    1500    2000
Identical unary               2129    2419    2553    2612
Missing unary                  125      47      27      14
Diff unary                     442     230     117      73
  Interesting                   57      18      10       8
  Uninteresting                385     212     107      65
Identical binary              5296    9102   12515   14089
Missing binary                4089    1921    1206     732
Diff binary                    109      45      24      19
  Interesting                   22      21      15      13
  Uninteresting                 87      24       9       6
Invariants and Program Correctness
- Compare invariants detected across programs
  - Correct versions of programs have more invariants than incorrect ones
- Examination of 424 intro C programs from U of Washington
  - Given # of students, amount of money, and # of pizzas, each program calculates whether the students can afford the pizzas
- Chose eight relevant invariants
  - people in [1…50]
  - pizzas in [1…10]
  - pizza_price in {9, 11}
  - excess_money in [0…40]
  - slices = 8 * pizzas
  - slices = 0 (mod 8)
  - slices_per in {0, 1, 2, 3}
  - slices_left ≤ people - 1
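The eight goal invariants can be written as executable checks over one sample of variable values. This sketch reads the last invariant as slices_left ≤ people - 1 and the fifth as slices = 8 * pizzas, both of which are assumptions about the slide's intent; the function name `check_goal_invariants` and the sample format are also hypothetical. Note that it tests a single sample, whereas Daikon reports an invariant only if it holds over every sample in the trace.

```python
def check_goal_invariants(v):
    """Return the names of the goal invariants that hold for one sample v."""
    checks = {
        "people in [1..50]": 1 <= v["people"] <= 50,
        "pizzas in [1..10]": 1 <= v["pizzas"] <= 10,
        "pizza_price in {9, 11}": v["pizza_price"] in (9, 11),
        "excess_money in [0..40]": 0 <= v["excess_money"] <= 40,
        "slices = 8 * pizzas": v["slices"] == 8 * v["pizzas"],
        "slices = 0 (mod 8)": v["slices"] % 8 == 0,
        "slices_per in {0, 1, 2, 3}": v["slices_per"] in (0, 1, 2, 3),
        # Assumed reading of the last invariant on the slide.
        "slices_left <= people - 1": v["slices_left"] <= v["people"] - 1,
    }
    return [name for name, ok in checks.items() if ok]
```

Counting how many of the eight hold across all of a program's runs gives the "invariants detected" figure that is compared against grades.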
Relationship of Grade and Goal Invariants

          Invariants Detected
Grade     2     3     4     5     6
12        4     2     0     0     0
14        9     2     5     2     0
15       15    23    27    11     3
16       33    40    42    19     9
17       13    10    23    27     7
18       16     5    29    27    21
Other Applications of Invariants
- Inserted as assert statements for testing
- Double-check existing documentation
  - Check against existing assert statements
  - Useful when program self-checks are ineffective
- Discovering bugs
- Generate test cases or validate existing test suites
- Could possibly direct a correctness proof
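As a rough illustration of the first application, detected invariants that happen to be valid expressions in the target language can be turned into assert statements mechanically. The helper `to_asserts` is hypothetical and assumes each invariant is already a valid Python boolean expression; translating Daikon's own output format would need more work.

```python
def to_asserts(invariants):
    """Turn invariant expressions (Python syntax assumed) into assert lines."""
    return [
        f"assert {expr}, {('violated invariant: ' + expr)!r}"
        for expr in invariants
    ]


lines = to_asserts(["0 <= i <= n", "s == sum(b[:i])"])
```

Pasting the generated lines at the corresponding program point makes later changes that break an invariant fail loudly during testing.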
Ongoing and Future Work
- Increasing relevance
  - An invariant is relevant if it assists the programmer
  - Suppress invariants logically implied by others
  - Unrelated variables don’t need to be compared
  - Ignore variables not assigned since last time
- Viewing and managing invariants
  - Overwhelming for a programmer to sort through
  - Various tools for selective reporting of invariants
    - Ordering by category
    - Retrieving invariants based on a supplied property
    - Listing invariants by program point
More Ongoing Work
- Improving performance
  - Balance between invariant quality and runtime
  - Number of derived variables used
- Richer invariants
  - Invariants over pointer-based data structures
  - Computing conditional invariants
Resources
- Daikon website: http://pag.lcs.mit.edu/daikon/download/
- Contains links to
  - Papers
  - Source code
  - User manual
  - Developers manual
Questions???