Decision-Procedure Based Theorem Provers Tactic-Based Theorem Proving Inferring Loop Invariants
Prof. Necula CS 294-8 Lecture 12 1
Decision-Procedure Based Theorem Provers
Tactic-Based Theorem Proving
Inferring Loop Invariants
CS 294-8, Lecture 12
Review
[Diagram: Source language → VCGen → FOL; FOL goals handled by theories via Sat. proc 1 … Sat. proc n and goal-directed proving]
Combining Satisfiability Procedures
• Consider a set of literals F
  – Containing symbols from two theories T1 and T2
• We split F into two sets of literals
  – F1 containing only literals in theory T1
  – F2 containing only literals in theory T2
  – We name all subexpressions: p1(f2(E)) is split into f2(E) = n ∧ p1(n)
• We have: unsat(F1 ∧ F2) iff unsat(F)
  – unsat(F1) ∨ unsat(F2) ⇒ unsat(F)
  – But the converse is not true
• So we cannot compute unsat(F) with a trivial combination of the sat. procedures for T1 and T2
Combining Satisfiability Procedures. Example
• Consider equality and arithmetic:
  f(f(x) − f(y)) ≠ f(z) ∧ y ≥ x ∧ x ≥ y + z ∧ z ≥ 0
• The derivation of false interleaves both theories:
  – From arithmetic: x = y and 0 = z
  – From equality: f(x) = f(y)
  – Hence f(x) − f(y) = z, so f(f(x) − f(y)) = f(z)
  – This contradicts f(f(x) − f(y)) ≠ f(z): false
Combining Satisfiability Procedures
• Combining satisfiability procedures is nontrivial
• And that is to be expected:
  – Equality was solved by Ackermann in 1924, arithmetic by Fourier even before, but E + A only in 1979!
• Yet in any single verification problem we will have literals from several theories:
  – Equality, arithmetic, lists, …
• When and how can we combine separate satisfiability procedures?
Nelson-Oppen Method (1)
1. Represent all conjuncts in the same DAG:
   f(f(x) − f(y)) ≠ f(z) ∧ y ≥ x ∧ x ≥ y + z ∧ z ≥ 0
[Diagram: DAG with shared nodes for f, −, +, ≥, ≠ over x, y, z, 0]
Nelson-Oppen Method (2)
2. Run each sat. procedure
   • Require it to report all contradictions (as usual)
   • Also require it to report all equalities between nodes
[Diagram: the same DAG, with discovered equalities marked between nodes]
Nelson-Oppen Method (3)
3. Broadcast all discovered equalities and re-run sat. procedures• Until no more equalities are discovered or a
contradiction arisesf
-f f
y x
¸
z 0
¸
+
f ¸
x Contradiction
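The broadcast loop can be sketched in runnable Python on this lecture's example. This is only a toy rendering: the two "procedures" below are hard-coded inference rules standing in for a real arithmetic decision procedure and a real congruence-closure procedure, and all the names and the fact encoding are invented for illustration.

```python
# Toy sketch of the Nelson-Oppen driver on the example
#   f(f(x)-f(y)) != f(z)  /\  y >= x  /\  x >= y+z  /\  z >= 0
# Equalities are shared as pairs of term strings.

def arith_proc(eqs):
    # From y >= x, x >= y+z, z >= 0: if z > 0 then x >= y+z > y >= x,
    # a contradiction; so z = 0 and hence x = y.
    new = {("z", "0"), ("x", "y")}
    if ("f(x)", "f(y)") in eqs:
        new.add(("f(x)-f(y)", "z"))    # f(x) - f(y) = 0 = z  (arithmetic)
    return "sat", new

def eq_proc(eqs):
    new = set()
    if ("x", "y") in eqs:
        new.add(("f(x)", "f(y)"))      # congruence: x = y  =>  f(x) = f(y)
    if ("f(x)-f(y)", "z") in eqs:
        # then f(f(x)-f(y)) = f(z), contradicting f(f(x)-f(y)) != f(z)
        return "unsat", new
    return "sat", new

def nelson_oppen(procs):
    eqs = set()                        # equalities shared between theories
    while True:
        new = set()
        for proc in procs:
            status, found = proc(eqs)
            if status == "unsat":
                return "unsat"
            new |= found
        if new <= eqs:                 # nothing newly discovered: stop
            return "sat"
        eqs |= new                     # broadcast and re-run

assert nelson_oppen([arith_proc, eq_proc]) == "unsat"
```

Note how neither procedure alone finds the contradiction; it takes three rounds of equality exchange before `eq_proc` can close the loop.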
What Theories Can be Combined?
• Only theories without common interpreted symbols
  – But OK if one theory takes the symbol uninterpreted
• Only certain theories can be combined
  – Consider (Z, +, ≤) and Equality
  – Consider: 1 ≤ x ≤ 2 ∧ a = 1 ∧ b = 2 ∧ f(x) ≠ f(a) ∧ f(x) ≠ f(b)
  – No equalities and no contradictions are discovered
  – Yet, unsatisfiable: over Z the constraints entail x = a ∨ x = b, and either case contradicts a disequality
• A theory is non-convex when a set of literals entails a disjunction of equalities without entailing any single equality
Handling Non-Convex Theories
• Many theories are non-convex
• Consider the theory of memory and pointers
  – It is non-convex:
    true ⇒ A = A′ ∨ sel(upd(M, A, V), A′) = sel(M, A′)
    (neither of the disjuncts is entailed individually)
• For such theories it can be the case that
  – No contradiction is discovered
  – No single equality is discovered
  – But a disjunction of equalities is discovered
• We need to propagate disjunctions of equalities …
Propagating Disjunction of Equalities
• To propagate disjunctions we perform a case split
• If a disjunction E1 ∨ … ∨ En is discovered:
  save the current state of the prover
  for i = 1 to n {
    broadcast Ei
    if no contradiction arises then return “satisfiable”
    restore the saved prover state
  }
  return “unsatisfiable”
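A minimal runnable sketch of this case split, where the prover state is modeled simply as a set of asserted facts and `contradiction` stands in for re-running the satisfiability procedures (both the state encoding and the example facts are invented for illustration):

```python
# Case split over a discovered disjunction E1 \/ ... \/ En.
def case_split(state, disjuncts, contradiction):
    for e in disjuncts:
        saved = set(state)            # save the current prover state
        state.add(e)                  # broadcast Ei
        if not contradiction(state):
            return "satisfiable"      # some branch survives
        state.clear()
        state.update(saved)           # backtrack: restore the saved state
    return "unsatisfiable"            # every branch was contradictory

# Example mirroring the previous slide: x in {1, 2}, yet f(x) differs
# from both f(1) and f(2).
facts = {"f(x)!=f(1)", "f(x)!=f(2)"}
def contradiction(s):
    return ("x=1" in s and "f(x)!=f(1)" in s) or \
           ("x=2" in s and "f(x)!=f(2)" in s)

assert case_split(facts, ["x=1", "x=2"], contradiction) == "unsatisfiable"
```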
Handling Non-Convex Theories
• Case splitting is expensive
  – Must backtrack (performance −)
  – Must implement all satisfiability procedures in an incremental fashion (simplicity −)
• In some cases the splitting can be prohibitive:
  – Take pointers for example:
    upd(upd(…(upd(m, i1, x), …), in−1, x), in, x) = upd(…(upd(m, j1, x), …), jn−1, x) ∧ sel(m, i1) ≠ x ∧ … ∧ sel(m, in) ≠ x
    entails ∨j,k ij = jk (a conjunction of length n entails n² disjuncts)
Forward vs. Backward Theorem Proving
Forward vs. Backward Theorem Proving
• The state of a prover can be expressed as:
  H1 ∧ … ∧ Hn ⇒? G
  – Given the hypotheses Hi try to derive goal G
• A forward theorem prover derives new hypotheses, in hope of deriving G
  – If H1 ∧ … ∧ Hn ⇒ H then move to state H1 ∧ … ∧ Hn ∧ H ⇒? G
  – Success state: H1 ∧ … ∧ G ∧ … ∧ Hn ⇒? G
• A forward theorem prover uses heuristics to reach G
  – Or it can exhaustively derive everything that is derivable!
Forward Theorem Proving
• Nelson-Oppen is a forward theorem prover:
  – The state is L1 ∧ … ∧ Ln ∧ ¬L ⇒? false
  – If L1 ∧ … ∧ Ln ∧ ¬L ⇒ E (an equality) then
  – New state is L1 ∧ … ∧ Ln ∧ ¬L ∧ E ⇒? false (add the equality)
  – Success state is L1 ∧ … ∧ L ∧ … ∧ ¬L ∧ … ∧ Ln ⇒? false
• Nelson-Oppen provers exhaustively produce all derivable facts hoping to encounter the goal
• Case splitting can be explained this way too:
  – If L1 ∧ … ∧ Ln ∧ ¬L ⇒ E ∨ E′ (a disjunction of equalities) then
  – Two new states are produced (both must lead to success):
    L1 ∧ … ∧ Ln ∧ ¬L ∧ E ⇒? false
    L1 ∧ … ∧ Ln ∧ ¬L ∧ E′ ⇒? false
Backward Theorem Proving
• A backward theorem prover derives new subgoals from the goal
  – The current state is H1 ∧ … ∧ Hn ⇒? G
  – If H1 ∧ … ∧ Hn ∧ G1 ∧ … ∧ Gn ⇒ G (the Gi are subgoals)
  – Produce n new states (all must lead to success): H1 ∧ … ∧ Hn ⇒? Gi
• Similar to case splitting in Nelson-Oppen:
  – Consider a non-convex theory: H1 ∧ … ∧ Hn ⇒ E ∨ E′ is the same as H1 ∧ … ∧ Hn ∧ ¬E ∧ ¬E′ ⇒ false
    (thus we have reduced the goal “false” to subgoals ¬E and ¬E′)
Programming Theorem Provers
• Backward theorem provers most often use heuristics
• It is useful to be able to program the heuristics
• Such programs are called tactics, and tactic-based provers have this capability
  – E.g., the Edinburgh LCF was a tactic-based prover whose programming language was called the Meta-Language (ML)
• A tactic examines the state and either:
  – Announces that it is not applicable in the current state, or
  – Modifies the proving state
Programming Theorem Provers. Tactics.
• State = Formula list × Formula
  – A set of hypotheses and a goal
• Tactic = State → (State → α) → (unit → α) → α
  – Continuation-passing style
  – Given a state, will invoke either the success continuation with a modified state or the failure continuation
• Example: a congruence-closure based tactic
  cc (h, g) c f =
    let e1, …, en = the new equalities in the congruence closure of h
    in c (h ∪ {e1, …, en}, g)
  – A forward chaining tactic
Programming Theorem Provers. Tactics.
• Consider an axiom: ∀x. a(x) ⇒ b(x)
  – Like the clause b(x) :- a(x) in Prolog
• This could be turned into a tactic:
  clause (h, g) c f =
    if unif(g, b) = σ then c (h, σ(a)) else f ()
  – A backward chaining tactic
Programming Theorem Provers. Tacticals.
• Tactics can be composed using tacticals. Examples:
• THEN : tactic → tactic → tactic
  THEN t1 t2 = λs. λc. λf. let newc s′ = t2 s′ c f in t1 s newc f
• REPEAT : tactic → tactic
  REPEAT t = THEN t (REPEAT t)
• ORELSE : tactic → tactic → tactic
  ORELSE t1 t2 = λs. λc. λf. let newf x = t2 s c f in t1 s c newf
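These tacticals translate almost directly into Python closures. In the sketch below, the SKIP base case inside REPEAT and the toy integer-state tactic `dec` are invented for illustration (with an eager language, REPEAT also needs the recursion delayed inside a function body, and a base case so it succeeds once the tactic stops applying):

```python
# CPS tactics: a tactic takes a state, a success continuation, and a
# failure continuation, and calls exactly one of them.

def THEN(t1, t2):
    def t(s, c, f):
        return t1(s, lambda s2: t2(s2, c, f), f)   # feed t1's result to t2
    return t

def ORELSE(t1, t2):
    def t(s, c, f):
        return t1(s, c, lambda: t2(s, c, f))       # try t2 only if t1 fails
    return t

SKIP = lambda s, c, f: c(s)        # always succeeds, leaves the state alone

def REPEAT(t1):
    def t(s, c, f):
        # apply t1 as long as it succeeds, then succeed with the last state
        return ORELSE(THEN(t1, REPEAT(t1)), SKIP)(s, c, f)
    return t

def dec(s, c, f):                  # toy tactic: the "state" is just an int
    return c(s - 1) if s > 0 else f()

assert REPEAT(dec)(5, lambda s: s, lambda: "fail") == 0
assert ORELSE(dec, SKIP)(0, lambda s: s, lambda: "fail") == 0
```

The failure continuation is what makes backtracking free: ORELSE simply hands t1 a failure continuation that restarts t2 from the original state.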
Programming Theorem Provers. Tacticals.
• Prolog is just one possible tactic:
  – Given tactics for each clause: c1, …, cn
  – Prolog : tactic
    Prolog = REPEAT (c1 ORELSE c2 ORELSE … ORELSE cn)
• Nelson-Oppen can also be programmed this way
  – The result is not as efficient as a special-purpose implementation
• This is a very powerful mechanism for semi-automatic theorem proving
  – Used in: Isabelle, HOL, and many others
Techniques for Inferring Loop Invariants
Inferring Loop Invariants
• Traditional program verification has several elements:
  – Function specifications and loop invariants
  – Verification condition generation
  – Theorem proving
• Requiring specifications from the programmer is often acceptable
• Requiring loop invariants is not acceptable
  – Same for specifications of local functions
Inferring Loop Invariants
• A set of cutpoints is a set of program points such that:
  – There is at least one cutpoint on each circular path in the CFG
  – There is a cutpoint at the start of the program
  – There is a cutpoint before the return
• Consider that our function uses n variables x
• We associate with each cutpoint k an assertion Ik(x)
• If π is a path from cutpoint k to j then:
  – Rπ(x) : Zⁿ → Zⁿ expresses the effect of path π on the values of x at j as a function of those at k
  – Pπ(x) : Zⁿ → B is a path predicate that is true exactly of those values of x at k that will enable the path π
Cutpoints. Example.
• The program (cutpoints 0, 1, 2):
  0: L = len(A); K = 0; S = 0
  1: S = S + A[K]; K++; if K < L goto 1
  2: return S
• p01 = true
  r01 = { A ← A, K ← 0, L ← len(A), S ← 0, M ← M }
• p11 = K + 1 < L
  r11 = { A ← A, K ← K + 1, L ← L, S ← S + sel(M, A + K), M ← M }
• p12 = K + 1 ≥ L
  r12 = r11
• Easily obtained through symbolic evaluation
Equational Definition of Invariants
• A set of assertions is a set of invariants if:
  – The assertion for the start cutpoint is the precondition
  – The assertion for the end cutpoint is the postcondition
  – For each path from i to j we have ∀x. Ii(x) ∧ Pij(x) ⇒ Ij(Rij(x))
• Now we have to solve a system of constraints with the unknowns I1, …, In−1
  – I0 and In are known
• We will consider the simpler case of a single loop
  – Otherwise we might want to try solving the inner/last loop first
Invariants. Example.
1. I0 ⇒ I1(r0(x))
   • The invariant I1 is established initially
2. I1 ∧ K+1 < L ⇒ I1(r1(x))
   • The invariant I1 is preserved in the loop
3. I1 ∧ K+1 ≥ L ⇒ I2(r1(x))
   • The invariant I1 is strong enough (i.e. useful)
[Diagram: the same example program, with path r0 into the loop head and path r1 around and out of the loop]
The Lattice of Invariants
• A few predicates satisfy condition 2
  – Are invariant
  – Form a lattice
• Weak predicates satisfy condition 1
  – Are satisfied initially
• Strong predicates satisfy condition 3
  – Are useful
[Diagram: lattice of predicates ordered by implication, from true at the top down to false at the bottom]
Finding The Invariant
• Which of the potential invariants should we try to find?
• We prefer to work backwards
  – Essentially proving only what is needed to satisfy In
  – Forward is also possible but sometimes wasteful, since we have to prove everything that holds at any point
Finding the Invariant
• Thus we do not know the “precondition” of the loop
• The weakest invariant that is strong enough has the most chances of holding initially
• This is the one that we’ll try to find
  – And then check that it is weak enough
[Diagram: the lattice again, aiming for the weakest strong-enough invariant]
Induction Iteration Method
• Equation 3 gives a predicate weaker than any invariant:
  I1 ∧ K+1 ≥ L ⇒ I2(r1(x))
  I1 ⇒ (K+1 ≥ L ⇒ I2(r1(x)))
  W0 = K+1 ≥ L ⇒ I2(r1(x))
• Equation 2 suggests an iterative computation of the invariant I1:
  I1 ⇒ (K+1 < L ⇒ I1(r1(x)))
Induction Iteration Method
• Define a family of predicates:
  W0 = K+1 ≥ L ⇒ I2(r1(x))
  Wj = W0 ∧ (K+1 < L ⇒ Wj−1(r1(x)))
• Properties of Wj:
  – Wj ⇒ Wj−1 ⇒ … ⇒ W0 (they form a strengthening chain)
  – I1 ⇒ Wj (they are weaker than any invariant)
  – If Wj−1 ⇒ Wj then:
    • Wj is an invariant (satisfies both equations 2 and 3): Wj ⇒ (K+1 < L ⇒ Wj(r1(x)))
    • Wj is the weakest invariant
(recall domain theory: predicates form a domain, and we use the fixed-point theorem to obtain least solutions to recursive equations)
Induction Iteration Method
W = K+1 ≥ L ⇒ I2(r1(x))   // This is W0
W′ = true
while (not (W′ ⇒ W)) {
  W′ = W
  W = (K+1 ≥ L ⇒ I2(r1(x))) ∧ (K+1 < L ⇒ W′(r1(x)))
}
• The only hard part is to check whether W′ ⇒ W
  – We use a theorem prover for this purpose
Induction Iteration.
• The sequence of Wj approaches the weakest invariant from above
• The predicate Wj can quickly become very large
  – Checking W′ ⇒ W becomes harder and harder
• This is not guaranteed to terminate
[Diagram: W0, W1, W2, W3 descending from true toward the weakest invariant, above false]
Induction Iteration. Example.
• Consider that the strength condition is:
  I1 ⇒ K ≥ 0 ∧ K < len(A)   (array bounds)
• We compute the W’s:
  W0 = K ≥ 0 ∧ K < len(A)
  W1 = W0 ∧ (K+1 < L ⇒ K+1 ≥ 0 ∧ K+1 < len(A))
  W2 = W0 ∧ (K+1 < L ⇒ (K+1 ≥ 0 ∧ K+1 < len(A) ∧ (K+2 < L ⇒ K+2 ≥ 0 ∧ K+2 < len(A))))
  …
[Diagram: the same example program with paths r0 and r1]
Induction Iteration. Strengthening.
• We can try to strengthen the inductive invariant
• Instead of:
  Wj = W0 ∧ (K+1 < L ⇒ Wj−1(r1(x)))
  we compute:
  Wj = strengthen(W0 ∧ (K+1 < L ⇒ Wj−1(r1(x))))
  where strengthen(P) ⇒ P
• We still have Wj ⇒ Wj−1, and we stop when Wj−1 ⇒ Wj
  – The result is still an invariant that satisfies 2 and 3
Strengthening Heuristics
• One goal of strengthening is simplification:
  – Drop disjuncts: P1 ∨ P2 → P1
  – Drop implications: P1 ⇒ P2 → P2
• A good idea is to try to eliminate variables changed in the loop body:
  – If Wj does not depend on variables changed by r1 (e.g., K, S)
  – Wj+1 = W0 ∧ (K+1 < L ⇒ Wj(r1(x))) = W0 ∧ (K+1 < L ⇒ Wj)
  – Now Wj ⇒ Wj+1 and we are done!
Induction Iteration. Strengthening
• We are still in the “strong enough” area
• We are making bigger steps
• And we might overshoot the weakest invariant
• We might also fail to find any invariant
• But we do so quickly
[Diagram: the strengthened iterates W′1, W′2 descending faster than W0, W1, W2, W3]
One Strengthening Heuristic for Integers
• Rewrite Wj in conjunctive normal form:
  W1 = K ≥ 0 ∧ K < len(A) ∧ (K+1 < L ⇒ K+1 ≥ 0 ∧ K+1 < len(A))
     = K ≥ 0 ∧ K < len(A) ∧ (K+1 ≥ L ∨ K+1 < len(A))
• Take each disjunction containing arithmetic literals
  – Negate it and obtain a conjunction of arithmetic literals: K+1 < L ∧ K+1 ≥ len(A)
  – Weaken the result by eliminating a variable (preferably a loop-modified variable), e.g., add the two literals: L > len(A)
  – Negate the result and get another disjunction: L ≤ len(A)
  W1 = K ≥ 0 ∧ K < len(A) ∧ L ≤ len(A)   (check that W1 ⇒ W2)
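The variable-elimination step (negate the disjunction, add literals to cancel a loop-modified variable, negate back) can be sketched with inequalities represented as coefficient maps meaning "the linear sum is ≥ 0" — a representation invented for this illustration:

```python
# Each inequality is a dict {var: coeff} meaning  sum(coeff * var) >= 0,
# with "1" denoting the constant term.
def add_ineqs(a, b):
    # If both sums are >= 0, so is their sum; a variable with opposite
    # coefficients in the two literals is eliminated.
    out = dict(a)
    for v, c in b.items():
        out[v] = out.get(v, 0) + c
        if out[v] == 0:
            del out[v]
    return out

# Negating the disjunct K+1 >= L gives K+1 < L, i.e. L - K - 2 >= 0 (integers):
lit1 = {"L": 1, "K": -1, "1": -2}
# Negating K+1 < len(A) gives K+1 >= len(A), i.e. K + 1 - len(A) >= 0:
lit2 = {"K": 1, "len(A)": -1, "1": 1}

summed = add_ineqs(lit1, lit2)                      # K drops out
assert summed == {"L": 1, "len(A)": -1, "1": -1}    # L - len(A) - 1 >= 0, i.e. L > len(A)
# Negating this weakened conjunction yields the new disjunct:  L <= len(A)
```

Adding literals only weakens the negated conjunction, so negating back yields a conjunct that strengthens the original disjunction, as the heuristic requires.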
Induction Iteration
• We showed a way to compute invariants algorithmically
  – Similar to fixed-point computation in domains
  – Similar to abstract interpretation on the lattice of predicates
• Then we discussed heuristics that improve the termination properties
  – Similar to widening in abstract interpretation
Theorem Proving. Conclusions.
• Theorem proving strengths
  – Very expressive
• Theorem proving weaknesses
  – Too ambitious
• A great toolbox for software analysis
  – Symbolic evaluation
  – Decision procedures
• Related to program analysis
  – Abstract interpretation on the lattice of predicates