Open Source Model Checking Radu Grosu SUNY at Stony Brook
Joint work with
X. Huang, S. Jain and S. A. Smolka
GCC Compiler
• Early stages: a modest C compiler.
- Translation: source code translated directly to RTL.
- Optimization: at low RTL level.
- High-level information lost: calls, structures, fields, etc.
• Nowadays: a full-blown, multi-language compiler generating code for more than 30 architectures.
- Input: C, C++, Objective-C, Fortran, Java and Ada.
- Tree-SSA: added GENERIC, GIMPLE and SSA ILs.
- Optimization: at GENERIC, GIMPLE, SSA and RTL levels.
- Verification: Tree-SSA API suitable for verification, too.
GCC Compilation Process
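[Figure: GCC compilation pipeline. C, C++ and Java files are read by their respective parsers into parse trees; Genericize produces the GENERIC AST (GEN AST); Gimplify produces the GIMPLE AST (GPL AST); Build CFG produces the SSA/GIMPLE CFG; the rest of the compilation produces RTL code, from which code generation produces object code.]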
C Program and its GIMPLE IL
C program:

int main() {
  int a, b, c;
  a = 5;
  b = a + 10;
  c = a + foo(a, b);
  if (a > c)
    c = b++ / a + b * a;
  bar(a, b, c);
}

GIMPLE IL:

int main() {
  int a, b, c;
  int T1, T2, T3, T4;
  a = 5;
  b = a + 10;
  T1 = foo(a, b);
  T2 = a + T1;
  if (a > T2) goto th; else goto fi;
th:
  T3 = b / a;
  T4 = b * a;
  c = T3 + T4;
  b = b + 1;
fi:
  bar(a, b, c);
}
Associated GIMPLE CFG
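[Figure: CFG of main. A FUNCTION DECL node holds the declarations (int a, b, c, T1, T2, T3, T4) together with the Entry and Exit blocks. Each statement in a block is stored as an expression tree, e.g. "=" over b and "+" over a and 10, or a call expression for foo.]
Entry -> A
A: a = 5; b = a + 10; T1 = foo(a,b); T2 = a + T1; if (a > T2) goto B;
   true -> B, false -> C
B: T3 = b / a; T4 = b * a; c = T3 + T4; b = b + 1;
   -> C
C: bar(a,b,c); return;
   -> Exit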
GCC Model Checking (GMC)
• GMC: a suite of analysis and verification tools we are developing for the Tree-SSA level of GCC. Currently:
– Intra-procedural slicer: inter-procedural slicing is in progress.
– Symbolic execution engine: for Boolean C programs.
– Interpreter: traverses the CFG using Tree-SSA iterators.
– Monte Carlo MC (GMC2): one-sided error (OSE), randomized algorithm for LTL MC.
• GMC2: a newly developed technique that uses the theory of geometric random variables, statistical hypothesis testing and random sampling of lassos.
LTL MC: Finding Accepting Lassos
[Figure: computation tree (CT) of lassos, with depth bounded by the recurrence diameter.]
• Explore all lassos in the CT.
– DDFS, SCC: time efficient.
– DFS: memory efficient.
Randomized Algorithms
• The next step the algorithm takes may depend on a random choice (coin flip).
– Benefits: simplicity, efficiency, and symmetry breaking.
• Monte Carlo: may produce an incorrect result, but with bounded error probability.
– Example: election result prediction
• Las Vegas: always gives correct result but running time is a random variable.
– Example: Randomized Quick Sort
Monte Carlo Approach
[Figure: computation tree (CT) of lassos, with depth bounded by the recurrence diameter; at each state the successor is chosen by flipping a k-sided coin.]
• Explore N(ε, δ) independent lassos in the CT.
• ε: error margin; δ: confidence ratio.
Bernoulli Random Variable Z (coin flip)
[Figure: example with states 1 to 4 whose lassos have probabilities ½, ¼, ⅛ and ⅛.]
Probability mass function:
$p(0) = P[Z=0] = q_Z = 7/8$
$p(1) = P[Z=1] = p_Z = 1/8$
Geometric Random Variable
• Value of geometric RV X with parameter $p_Z$:
– No. of independent lassos until success.
• Probability mass function:
– $p(N) = P[X = N] = q_Z^{N-1} p_Z$
• Cumulative Distribution Function:
– $F(N) = P[X \le N] = \sum_{i \le N} p(i) = 1 - q_Z^N = 1 - (1 - p_Z)^N$
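As a quick numeric check of the pmf and CDF above, the following Python sketch evaluates both with exact fractions. The function names are illustrative (they do not come from the talk), and the value 1/8 for the success probability is borrowed from the Bernoulli example.

```python
from fractions import Fraction

def geometric_pmf(n: int, p: Fraction) -> Fraction:
    """P[X = n]: the first success occurs on trial n (n >= 1)."""
    return (1 - p) ** (n - 1) * p

def geometric_cdf(n: int, p: Fraction) -> Fraction:
    """P[X <= n] = 1 - (1 - p)^n."""
    return 1 - (1 - p) ** n

p_z = Fraction(1, 8)  # illustrative value taken from the Bernoulli example
# The closed-form CDF agrees with summing the pmf term by term.
assert geometric_cdf(5, p_z) == sum(geometric_pmf(i, p_z) for i in range(1, 6))
print(geometric_cdf(5, p_z))
```

Using `Fraction` keeps the comparison exact, so the equality between the closed form and the partial sum is not blurred by floating-point rounding.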
How Many Lassos?
• Requiring $1 - (1 - p_Z)^N = 1 - \delta$ yields:
$N = \ln(\delta) / \ln(1 - p_Z)$
• Lower bound on number of trials N needed to achieve success with confidence ratio δ.
What If pz Unknown?
• Requiring $p_Z \ge \varepsilon$ yields:
$M = \ln(\delta) / \ln(1 - \varepsilon) \ge N = \ln(\delta) / \ln(1 - p_Z)$
and therefore $P[X \le M] \ge 1 - \delta$
• Lower bound on number of trials M needed to achieve success with
confidence ratio δ and error margin ε .
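To get a feel for how the bound scales, here is a minimal Python sketch that computes the number of trials from the error margin and confidence ratio. The helper name `num_samples` and the sample values of ε and δ are assumptions made for illustration; rounding up to an integer is also a choice of this sketch.

```python
import math

def num_samples(epsilon: float, delta: float) -> int:
    """Smallest integer M with (1 - epsilon)^M <= delta (up to float rounding)."""
    if not (0 < epsilon < 1 and 0 < delta < 1):
        raise ValueError("epsilon and delta must lie strictly between 0 and 1")
    return math.ceil(math.log(delta) / math.log(1 - epsilon))

# Illustrative parameter choices, not values reported in the talk.
for eps, delta in [(0.1, 0.1), (0.01, 0.1), (0.01, 0.01)]:
    print(f"eps={eps}, delta={delta}: M={num_samples(eps, delta)}")
```

Note that M depends only on ε and δ, not on the size of the system being checked.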
Statistical Hypothesis Testing
• Null hypothesis H0: $p_Z \ge \varepsilon$
• Alternative hypothesis H1: $p_Z < \varepsilon$
• If no success after M trials, then reject H0
• Type I error: $\alpha = P[X > M \mid H_0] < \delta$
• Since: $P[X \le M \mid H_0] \ge 1 - \delta$
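The bound on the type I error follows from the geometric CDF. Spelled out as a short derivation, assuming M is chosen as ln(δ) / ln(1 - ε), possibly rounded up to an integer:

```latex
\begin{align*}
\alpha &= P[X > M \mid H_0] = 1 - F(M) = (1 - p_Z)^M \\
       &\le (1 - \varepsilon)^M
          && \text{since } p_Z \ge \varepsilon \text{ under } H_0 \\
       &\le \delta
          && \text{since } M \ge \ln(\delta) / \ln(1 - \varepsilon)
             \text{ and } \ln(1 - \varepsilon) < 0 .
\end{align*}
```

The inequality is strict whenever the success probability exceeds ε or M is strictly rounded up.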
Monte Carlo Model Checking (MC2)
input: B = (Σ, Q, Q0, δ, F), ε, δ
N = ln(δ) / ln(1 - ε)
for (i = 1; i ≤ N; i++)
  if (RL(B) == 1) return (1, error-trace);
return (0, “reject H0 with α = Pr[ X > N | H0 ] < δ”);
where RL(B) performs a uniform random walk through B to obtain a random lasso.
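The loop above can be sketched in Python as follows. This is a minimal illustration, not the GMC2 implementation: the lasso sampler is passed in as a callable, the results are booleans rather than the 0/1 codes of the slide, and `toy_sampler` is a hypothetical stand-in for RL(B) that is accepting with probability 1/8.

```python
import math
import random
from typing import Callable, List, Optional, Tuple

# A sampler draws one random lasso and reports (is_accepting, trace).
Sampler = Callable[[], Tuple[bool, List[str]]]

def mc2(sample_lasso: Sampler, epsilon: float,
        delta: float) -> Tuple[bool, Optional[List[str]]]:
    """Return (True, trace) on the first accepting lasso, else (False, None).

    (False, None) means: H0 (p_Z >= epsilon) is rejected, with type I
    error bounded by delta.
    """
    n = math.ceil(math.log(delta) / math.log(1 - epsilon))
    for _ in range(n):
        accepting, trace = sample_lasso()
        if accepting:
            return True, trace  # counterexample (error trace) found
    return False, None

# Hypothetical stand-in for RL(B): accepting with probability 1/8.
rng = random.Random(0)

def toy_sampler() -> Tuple[bool, List[str]]:
    return rng.random() < 0.125, ["s0", "s1", "s0"]

found, trace = mc2(toy_sampler, epsilon=0.1, delta=0.01)
print("counterexample found" if found else "H0 rejected")
```

Passing the sampler as a parameter keeps the statistical loop independent of how lassos are generated, which is also what lets the sampling strategy be swapped.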
GCC MC2 (GMC2)
• Input: a set of CFGs.
– Main function: a specifically designated CFG.
• Random walks in the Büchi automaton: generated on-the-fly.
– Initial state: of the main routine + bookkeeping information.
– Next state: choose process + call interpreter on its CFG.
– Processes: created by using the fork primitive.
– Optimization: interpreter returns only upon context switch.
• Lassos: detected by using a hierarchic hash table.
– Local variables: removed upon return from a procedure.
Program State
[Figure: layout of a program state, shown at two levels of detail.]
• Control state: list of process states p1, p2, p3, …
– Simple view: each process state is a pair (CFG name, statement #).
– Detailed view: each process state also has a frame stack f1, f2, …; each frame holds a return control state and a local variables valuation.
• Data state: shared variables valuation (channels & semaphores); in the detailed view also the heap and the global variables valuation.
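One way to picture the program state layout is as nested immutable records, so that a whole state can serve as a hash-table key when looking for repeated states. The Python sketch below is only an illustration under that assumption: all class and field names are hypothetical, the heap is omitted, and it does not model GMC2's hierarchic hash table.

```python
from dataclasses import dataclass
from typing import Tuple

Valuation = Tuple[Tuple[str, int], ...]  # (variable name, value) pairs

@dataclass(frozen=True)
class ControlPoint:
    cfg_name: str   # which CFG (function) is executing
    statement: int  # index of the next statement in that CFG

@dataclass(frozen=True)
class Frame:
    return_to: ControlPoint  # return control state
    local_vars: Valuation    # local variables valuation

@dataclass(frozen=True)
class ProcessState:
    pc: ControlPoint
    frames: Tuple[Frame, ...] = ()  # frame stack f1, f2, ...

@dataclass(frozen=True)
class ProgramState:
    processes: Tuple[ProcessState, ...]  # p1, p2, p3, ...
    shared: Valuation                    # channels & semaphores
    global_vars: Valuation = ()

# Frozen dataclasses are hashable, so a revisited state is found by lookup.
seen = {}
s0 = ProgramState((ProcessState(ControlPoint("main", 0)),), (("sem", 1),))
seen[s0] = 1
s1 = ProgramState((ProcessState(ControlPoint("main", 0)),), (("sem", 1),))
print(s1 in seen)
```

Tuples are used instead of lists and dicts because every component must itself be hashable for the enclosing state to be hashable.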
Interpreter
• Interprets GIMPLE statements: according to their semantics. Interesting cases:
– Inter-procedural: call(), return(). Manipulate the frame stack.
• Catches and interprets: function calls to various modeling and concurrency primitives:
– Modeling: toss(), assert(). Nondeterminism and checks.
– Processes: fork(), … Manipulate the process list.
– Communication: send(), recv(). Manipulate shared vars. May involve a context switch.
Results: TCAS (GMC2)

property                     rule  bugs  time  sampl
Safe Advisory Selection      1     no    0.23  1278
                             2     yes   0.03  147
Best Advisory Selection      1     no    0.23  1278
                             2     yes   0.04  206
Avoid unnecessary Crossing   1     yes   0.01  36
                             2     yes   0.03  180
No. Crossing Adv. Selection  1     yes   0.01  27
                             2     yes   0.01  8
Optimal Advisory Selection   1     no    0.23  1278
                             2     yes   0.06  217
DPh: Symmetric Fair Version (Deadlock freedom)

      GMC2                      Verisoft
ph    time     sampl  ce.len    time     states  trans
4     0:00.07  2      12        0:00.61  16      37
6     0:00.11  4      12        0:16.60  773     1171
8     0:00.78  11     20        2:57.29  5431    8449
10    0:02.17  31     24        10:41    17908   31433
12    0:04.82  24     27        >2hr     N/A     N/A
14    0:06.22  22     44        >2hr     N/A     N/A
16    0:11.56  14     32        >2hr     N/A     N/A
Needham-Schroeder Protocol

GMC2                 Verisoft       Genetic
time    sampl        time  states   time    errors
6h 37'  10,682,639   >8h   N/A      2h 33'  3
• Quite sophisticated C implementation.
• However, of a sequential nature:
- Essentially executes only one round of a reactive system
Related Work
• Software model checkers for concurrent C/C++:
– VeriSoft, Spin, Blast (Slam), Magic, C-Wolf. Bogor?
• Cooperative Bug Isolation [Liblit, Naik & Zheng]:
– Compile-time instrumentation. Distribute binaries/collect bugs.
– Statistical analysis to isolate erroneous code segments.
• Random interpretation [Gulwani & Necula]:
– Execute random paths and merge with random linear operators.
• Monte Carlo and abstract interpretation [Monniaux]:
– Analyze programs with probabilistic and nondeterministic input.
Conclusions
• Presented GMC2: a software MC for GCC based on Monte Carlo MC:
– At Tree-SSA level: applicable to C, C++, Ada, Java, etc.
– Open source: freely available for usage/critique/extension.
• Ongoing and Future Work: Create a software MC branch of GCC, which also includes:
– Automated abstraction/refinement/interpolation techniques.
– Currently we manually apply a form of bounded-range abstraction (e.g. in TCAS).
Talk Outline
1. Model Checking
2. Randomized Algorithms
3. LTL Model Checking
4. Probability Theory Primer
5. Monte Carlo Model Checking
6. Implementation & Results
7. Conclusions & Open Problem
Linear Temporal Logic
• LTL formula: made up inductively of
– atomic propositions p, boolean connectives $\neg$, $\wedge$, $\vee$
– temporal modalities X (neXt) and U (Until).
• Safety: “nothing bad ever happens”
E.g. $G\,\neg(pc_1 = cs \wedge pc_2 = cs)$, where G is a derived modality (Globally).
• Liveness: “something good eventually happens”
E.g. $G(req \rightarrow F\, serviced)$, where F is a derived modality (Finally).
Model Checking
• S is a nondeterministic/concurrent system.
• $\varphi$ is a temporal logic formula.
– in our case Linear Temporal Logic (LTL).
• Basic idea: intelligently explore S’s state space in an attempt to establish $S \models \varphi$.
LTL Model Checking
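• Every LTL formula $\varphi$ can be translated to a Büchi automaton $B_\varphi$ such that $L(\varphi) = L(B_\varphi)$
• Automata-theoretic approach:
$S \models \varphi$ iff $L(B_S) \subseteq L(B_\varphi)$ iff $L(B_S \times B_{\neg\varphi}) = \emptyset$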
• Checking non-emptiness is equivalent to finding a reachable accepting cycle (lasso).
Emptiness Checking
• Checking non-emptiness is equivalent to finding an accepting cycle reachable from initial state (lasso).
• Double Depth-First Search (DDFS) algorithm can be used to search for such cycles, and this can be done on-the-fly!
[Figure: a lasso over states s1, s2, s3, …, sk-2, sk-1, sk, sk+1, sk+2, sk+3, …, sn, annotated with the two searches DFS1 and DFS2.]
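For comparison with the randomized approach, the exhaustive search can be sketched as a textbook nested (double) depth-first search. The Python below is a minimal recursive version over an explicit successor map; the function name and the graph encoding are assumptions of this sketch, and a real on-the-fly checker would generate successors lazily and avoid deep recursion.

```python
from typing import Dict, List, Set

def has_accepting_lasso(succ: Dict[int, List[int]], init: int,
                        accepting: Set[int]) -> bool:
    """Nested DFS: is an accepting cycle reachable from init?"""
    visited1: Set[int] = set()
    visited2: Set[int] = set()  # shared by all second searches

    def dfs2(s: int, seed: int) -> bool:
        # Second search: look for a path from the seed back to itself.
        visited2.add(s)
        for t in succ.get(s, []):
            if t == seed:
                return True
            if t not in visited2 and dfs2(t, seed):
                return True
        return False

    def dfs1(s: int) -> bool:
        # First search: explore reachable states; start the second
        # search from accepting states in postorder.
        visited1.add(s)
        for t in succ.get(s, []):
            if t not in visited1 and dfs1(t):
                return True
        return s in accepting and dfs2(s, s)

    return dfs1(init)

# Stem 0 -> 1, cycle 1 -> 2 -> 1; state 2 is accepting.
print(has_accepting_lasso({0: [1], 1: [2], 2: [1]}, 0, {2}))
```

Starting the second search in postorder is what allows the second visited set to be shared across searches without missing an accepting cycle.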
Randomized Algorithms
• Huge impact on CS: (distributed) algorithms, complexity theory, cryptography, etc.
• The next step the algorithm takes may depend on a random choice (coin flip).
• Benefits of randomization include simplicity, efficiency, and symmetry breaking.
Lassos Probability Space
• Sample Space: lassos in $B_S \times B_{\neg\varphi}$
• Bernoulli random variable Z :
– Outcome = 1 if randomly chosen lasso accepting
– Outcome = 0 otherwise
• $p_Z = \sum_i p_i Z_i$ (expectation of Z, i.e. the probability that a random lasso is accepting)
where $p_i$ is the probability of lasso i under a uniform random walk.
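For a small automaton the sum over lassos can be computed exactly by enumeration. The Python sketch below does this for a hypothetical 4-state graph, chosen here so that the lasso probabilities come out as ½, ¼, ⅛ and ⅛ with accepting probability 1/8, the numbers used in the Bernoulli example; the graph itself is an assumption, not taken from the talk, and every state is assumed to have at least one successor.

```python
from fractions import Fraction
from typing import Dict, List, Set, Tuple

Lasso = Tuple[List[int], Fraction, bool]  # (states, probability, accepting?)

def lasso_distribution(succ: Dict[int, List[int]], init: int,
                       accepting: Set[int]) -> List[Lasso]:
    """Enumerate every lasso from init under a uniform random walk."""
    result: List[Lasso] = []

    def walk(path: List[int], prob: Fraction) -> None:
        s = path[-1]
        for t in succ[s]:
            p = prob / len(succ[s])  # uniform choice among successors
            if t in path:            # a state repeats: the lasso is closed
                cycle = path[path.index(t):]
                result.append((path + [t], p,
                               any(q in accepting for q in cycle)))
            else:
                walk(path + [t], p)

    walk([init], Fraction(1))
    return result

# Hypothetical automaton; state 4 is the only accepting state.
succ = {1: [1, 2], 2: [1, 3], 3: [1, 4], 4: [4]}
lassos = lasso_distribution(succ, 1, {4})
p_z = sum(p for _, p, acc in lassos if acc)
for states, p, acc in lassos:
    print(states, p, "accepting" if acc else "non-accepting")
```

Enumeration is exponential in general, which is why the Monte Carlo approach samples lassos instead of listing them.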
Bernoulli Random Variable (coin flip)
• Value of Bernoulli RV Z:
Z = 1 (success) & Z = 0 (failure)
• Probability mass function:
$p(1) = \Pr[Z=1] = p_Z$
$p(0) = \Pr[Z=0] = 1 - p_Z = q_Z$
• Expectation: $E[Z] = p_Z$
Statistical Hypothesis Testing
• Example: Given a fair and a biased coin.
– Null hypothesis H0 - fair coin selected.
– Alternative hypothesis H1 - biased coin selected.
• Hypothesis testing: Perform N trials.
– If number of heads is LOW, reject H0 .
– Else fail to reject H0 .
Statistical Hypothesis Testing
                    H0 is True               H0 is False
reject H0           Type I error w/prob. α   Correct to reject H0
fail to reject H0   Correct to fail to       Type II error w/prob. β
                    reject H0
Random Lasso (RL) Algorithm
input: Büchi automaton B;
output: samples a lasso; returns 1 if it is accepting, 0 if not;

(1) s := rInit(B); i := 1; f := 0;
(2) while (s ∉ HashTbl) {
(3)   HashTbl(s) := i;
(4)   if acc(s,B) f := i;
(5)   s := rNext(s,B); i := i + 1; }
(6) if (HashTbl(s) ≤ f) return 1 else return 0;
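A direct Python transcription of the random lasso routine might look as follows. It assumes the convention used by the MC2 loop, where a result of 1 signals an accepting lasso; the graph encoding, the explicit `rng` parameter and the example automaton are assumptions of this sketch.

```python
import random
from typing import Dict, List, Set, Tuple

def random_lasso(succ: Dict[int, List[int]], init: List[int],
                 accepting: Set[int],
                 rng: random.Random) -> Tuple[int, Dict[int, int]]:
    """Sample one lasso by a uniform random walk; 1 means accepting."""
    table: Dict[int, int] = {}  # state -> position on the walk (HashTbl)
    s = rng.choice(init)        # rInit(B)
    i, f = 1, 0                 # f: position of the last accepting state
    while s not in table:
        table[s] = i
        if s in accepting:
            f = i
        s = rng.choice(succ[s])  # rNext(s, B)
        i += 1
    # s closes the cycle at position table[s]; the lasso is accepting
    # iff an accepting state lies at or after that position.
    return (1 if table[s] <= f else 0), table

# Hypothetical automaton; state 4 is the only accepting state.
rng = random.Random(1)
succ = {1: [1, 2], 2: [1, 3], 3: [1, 4], 4: [4]}
hits = sum(random_lasso(succ, [1], {4}, rng)[0] for _ in range(1000))
print(f"{hits} of 1000 sampled lassos were accepting")
```

The table holds only the states on the current walk, so its size is bounded by the length of the sampled lasso rather than by the size of the state space.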
Correctness of MC2
Theorem: Given a Büchi automaton B, error margin ε, and confidence ratio δ, if MC2 rejects H0, then its type I error has probability
α = P[ X > M | H0 ] < δ
Complexity of MC2
Theorem: Given a Büchi automaton B having diameter D, error margin ε, and confidence ratio δ, MC2 runs in time O(N∙D) and uses space O(D), where N = ln(δ) / ln(1- ε)
Cf. DDFS, which runs in $O(2^{|S|+|\varphi|})$ time for $B = B_S \times B_{\neg\varphi}$.
Alternative Sampling Strategies
• Multilasso sampling: ignores backedges that do not lead to an accepting lasso.
[Figure: chain of states 0, 1, …, n-1, n, for which $\Pr[L_n] = O(2^{-n})$.]
• Probabilistic systems: there is a natural way to assign a probability to a RL.
• Input partitioning: partition input into classes that trigger the same behavior (guards).