Bias-Variance Tradeoffs in Program Analysis
Rahul Sharma (Stanford), Aditya V. Nori (MSR India), Alex Aiken (Stanford)
Observation
int i = 1, j = 0;
while (i <= 5) {
  j = j + i;
  i = i + 1;
}
Invariant inference, in order of increasing precision: intervals, octagons, polyhedra
D. Monniaux and J. Le Guen. Stratified static analysis based on variable dependencies. Electr. Notes Theor. Comput. Sci., 2012.
Another Example: Yogi
A. V. Nori and S. K. Rajamani. An empirical study of optimizations in YOGI. ICSE (1), 2010.
The Problem
Increased precision is causing worse results
Programs have unbounded behaviors
Program analysis must analyze all behaviors, yet run for a finite time
In finite time, it can observe only finitely many behaviors
So it needs to generalize
Generalization
Generalization is ubiquitous
Abstract interpretation: widening
CEGAR: interpolants
Parameter tuning of tools
A lot of folk knowledge, heuristics, …
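As one illustration of widening as generalization, here is a minimal sketch of an interval analysis of the loop from the Observation slide. The helper names (`join`, `widen`, `analyze_loop`) are made up for this sketch, and the widening operator is the usual textbook one for intervals, not necessarily what any tool cited in the talk implements.

```python
import math

def join(a, b):
    """Smallest interval containing both a and b."""
    return (min(a[0], b[0]), max(a[1], b[1]))

def widen(old, new):
    """Textbook interval widening: a bound that moved jumps to infinity."""
    lo = old[0] if new[0] >= old[0] else -math.inf
    hi = old[1] if new[1] <= old[1] else math.inf
    return (lo, hi)

def analyze_loop():
    """Interval invariant at the loop head of:
       int i = 1, j = 0; while (i <= 5) { j = j + i; i = i + 1; }"""
    i, j = (1, 1), (0, 0)                 # state on loop entry
    while True:
        guard_i = (i[0], min(i[1], 5))    # restrict i by the guard i <= 5
        if guard_i[0] > guard_i[1]:
            return i, j                   # loop body unreachable
        body_j = (j[0] + guard_i[0], j[1] + guard_i[1])   # j = j + i
        body_i = (guard_i[0] + 1, guard_i[1] + 1)         # i = i + 1
        new_i = widen(i, join(i, body_i))
        new_j = widen(j, join(j, body_j))
        if (new_i, new_j) == (i, j):
            return i, j                   # fixpoint reached
        i, j = new_i, new_j

print(analyze_loop())
```

After observing a single iteration, widening extrapolates the upper bounds of both variables to infinity. The result is sound but coarser than the concrete loop-head bounds (1 <= i <= 6, 0 <= j <= 15): a finite observation was generalized to cover all behaviors.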
Machine Learning
“It’s all about generalization”
Learn a function from observations
Hope that the function generalizes
Work on formalization of generalization
Our Contributions
Model the generalization process: the Probably Approximately Correct (PAC) model
Explain known observations by this model
Use this model to obtain better tools
http://politicalcalculations.blogspot.com/2010/02/how-science-is-supposed-to-work.html
Why Machine Learning?
Interpolants as classifiers
[Figure: positive (+) and negative (-) examples]
Rahul Sharma, Aditya V. Nori, Alex Aiken. Interpolants as Classifiers. CAV 2012.
PAC Learning Framework
Assume an arbitrary but fixed distribution D
Given (i.i.d.) samples from D
Each sample is an example with a label (+/-)
[Figure: samples labeled + and - by a concept c]
Empirical error of a hypothesis h: the fraction of the samples that h mislabels
Empirical risk minimization (ERM)
Given a set H of possible hypotheses (the precision)
Select the h in H that minimizes empirical error
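A minimal sketch of empirical error and ERM, assuming a toy hypothesis set of one-dimensional threshold classifiers. The names `empirical_error`, `threshold`, and `erm`, and the sample data, are illustrative and do not come from the talk.

```python
def empirical_error(h, samples):
    """Fraction of labeled samples (x, label) that hypothesis h mislabels."""
    return sum(1 for x, label in samples if h(x) != label) / len(samples)

def threshold(t):
    """Hypothesis that labels x positive iff x >= t."""
    return lambda x: x >= t

def erm(thresholds, samples):
    """Return the threshold whose classifier has the lowest empirical error.
    Ties are broken in favor of the first candidate."""
    return min(thresholds, key=lambda t: empirical_error(threshold(t), samples))

# Toy labeled samples: True is '+', False is '-'.
samples = [(1, False), (2, False), (3, False), (6, True), (8, True)]
best_t = erm(range(0, 11), samples)
print(best_t, empirical_error(threshold(best_t), samples))
```

Several thresholds fit these samples perfectly; ERM only sees the samples, so it cannot tell which of them generalizes best.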
Generalization error: the error of a hypothesis on a new sample
Relate generalization error to empirical error and precision
Precision
Capture precision by VC dimension (VC-d)
Higher precision -> More possible hypotheses
VC(H) = d
There is a set of d points for which H realizes any arbitrary labeling (and no larger such set)
[Figure: example labeling ++-+]
VC-d Example
[Figure: example labelings of points with + and -]
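To make the definition concrete, the following brute-force sketch checks whether a hypothesis class shatters a set of points. It assumes, purely for illustration, the class of closed intervals on the real line (a point is labeled + iff it lies in the interval); the slide's own figure may use a different class.

```python
from itertools import product

def interval_realizes(points, labels):
    """True if some closed interval contains exactly the points labeled True."""
    positives = [p for p, lab in zip(points, labels) if lab]
    if not positives:
        return True  # an interval away from all points labels everything '-'
    lo, hi = min(positives), max(positives)
    # The tightest candidate interval is [lo, hi]; it must not capture a '-' point.
    return all(lab for p, lab in zip(points, labels) if lo <= p <= hi)

def shatters(points):
    """True if intervals realize every possible +/- labeling of the points."""
    return all(interval_realizes(points, labels)
               for labels in product([False, True], repeat=len(points)))

print(shatters([1.0, 2.0]))       # two points
print(shatters([1.0, 2.0, 3.0]))  # three points: the labeling + - + fails
```

Two points can be shattered but three cannot, so this class has VC dimension 2.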
Regression Example
Precision is low: underfitting
Precision is high: overfitting
Good fit
[Figure: fitted curves, Y against X]
Main Result of PAC Framework
Generalization error is bounded by the sum of
Bias: empirical error of the best available hypothesis
Variance: O(VC-d)
[Figure: bias, variance, and generalization error against possible hypotheses, as precision increases]
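For reference, one common textbook form of such a bound is given below. It is not taken from the slides, which state only that the variance term is O(VC-d); the symbols m (number of samples), δ (confidence parameter), and the error notation are assumptions of this sketch, and constants differ between statements of the theorem.

```latex
% With probability at least 1 - \delta over m i.i.d. samples S drawn from D,
% every hypothesis h in a class H with VC(H) = d satisfies:
\[
  \underbrace{\mathrm{err}_D(h)}_{\text{generalization error}}
  \;\le\;
  \underbrace{\widehat{\mathrm{err}}_S(h)}_{\text{empirical error}}
  \;+\;
  \underbrace{\sqrt{\frac{d\left(\ln\frac{2m}{d} + 1\right) + \ln\frac{4}{\delta}}{m}}}_{\text{grows with } d}
\]
```

Enlarging H can only lower the best achievable empirical error, while the second term grows with d, which is the tradeoff the slide describes.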
Example Revisited
int i = 1, j = 0;
while (i <= 5) {
  j = j + i;
  i = i + 1;
}
Invariant inference
Intervals
Octagons
Polyhedra
Intuition
What goes wrong with excess precision?
Fit polyhedra to program behaviors, using transfer functions, join, and widening
Too many polyhedra, so the analysis can make a wrong choice
int i = 1, j = 0;
while (i <= 5) {
  j = j + i;
  i = i + 1;
}
Intervals: Polyhedra:
Abstract Interpretation
J. Henry, D. Monniaux, and M. Moy. Pagai: A path sensitive static analyser. Electr. Notes Theor. Comput. Sci. 2012.
Yogi
A. V. Nori and S. K. Rajamani. An empirical study of optimizations in YOGI. ICSE (1), 2010.
Case Study
Parameter tuning of program analyses
Overfitting? Generalization on new tasks?
P. Godefroid, A. V. Nori, S. K. Rajamani, and S. Tetali. Compositional may-must program analysis: unleashing the power of alternation. POPL 2010.
Benchmark set (2490 verification tasks)
Train Yogi on the benchmark set
Tuned parameters: test length = 500, …
Cross Validation
How to set the test length in Yogi?
Benchmark set (2490 verification tasks), split into a training set (1743) and a test set (747)
Train: Yogi_50, …, Yogi_i, … on the training set
Test: Yogi_50, …, Yogi_i, … on the test set, selecting Yogi_k
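The train/test procedure can be sketched as follows. Only the set sizes (2490, 1743, 747) come from the slide; `tune` and `score` are hypothetical callbacks standing in for tuning and running the tool, and the toy versions below are synthetic stand-ins, not Yogi measurements.

```python
import random

def split_benchmarks(tasks, train_size, seed=0):
    """Shuffle the tasks and cut them into disjoint training and test sets."""
    shuffled = list(tasks)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:train_size], shuffled[train_size:]

def pick_test_length(candidates, tune, score, train_set, test_set):
    """Tune one tool variant per candidate on the training set, then return
    the candidate whose variant scores best on the held-out test set."""
    variants = {k: tune(k, train_set) for k in candidates}
    return max(candidates, key=lambda k: score(variants[k], test_set))

# Synthetic stand-ins (hypothetical): a real setup would tune and run the tool.
def toy_tune(test_length, train_set):
    return {"test_length": test_length}

def toy_score(variant, test_set):
    return -abs(variant["test_length"] - 200)  # arbitrary peak, for illustration

train_set, test_set = split_benchmarks(range(2490), train_size=1743)
best = pick_test_length([50, 100, 200, 400], toy_tune, toy_score,
                        train_set, test_set)
print(len(train_set), len(test_set), best)
```

The design point is that the test set never influences tuning; it is used only to compare the already tuned variants.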
Cross Validation on Yogi
Performance on the test set of the tuned Yogi_i's
[Figure: performance plot with 350 and 500 marked]
Comparison
On 2106 new verification tasks
40% performance improvement!
Yogi in production suffers from overfitting
Recommendations
Keep separate training and test sets
Design of the tools is governed by the training set
The test set serves as a check
SV-COMP: all benchmarks are public
Test tools on some new benchmarks too
Increase Precision Incrementally
R. Jhala and K. L. McMillan. A practical and complete approach to predicate refinement. TACAS 2006.
Suggests incrementally increasing precision
Find a sweet spot where generalization error is low
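One way to read this advice is as a search over precision levels that stops once held-out error stops improving. The sketch below assumes a hypothetical `held_out_error` callback and made-up error values; it is not the procedure of the cited paper.

```python
import math

def sweet_spot(levels, held_out_error):
    """Walk precision levels from coarse to fine and return the last level
    before the held-out error stops improving."""
    best, best_err = None, math.inf
    for level in levels:
        err = held_out_error(level)
        if err >= best_err:
            break  # more precision no longer helps on unseen tasks
        best, best_err = level, err
    return best

# Synthetic U-shaped error curve (illustrative values only).
toy_errors = {1: 0.40, 2: 0.20, 3: 0.10, 4: 0.30}
print(sweet_spot([1, 2, 3, 4], toy_errors.get))
```

Stopping at the first increase assumes the error curve has a single minimum; a more cautious variant would evaluate every level and take the overall minimum.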
More in the paper
VC-d of TCMs: intervals, octagons, etc.
Templates:
Arrays, separation logic
Expressive abstract domains -> higher VC-d
VC-d can help choose abstractions
Inapplicability
No generalization -> no bias-variance tradeoff
Certain classes of type inference
Abstract interpretation without widening
Loop-free and recursion-free programs
Verify a particular program (e.g., seL4)
Overfit on the one important program
Conclusion
A model to understand generalization: bias-variance tradeoffs
These tradeoffs do occur in program analysis
Understand these tradeoffs for better tools