Decision-Procedure Based Theorem Provers
Tactic-Based Theorem Proving
Inferring Loop Invariants

Prof. Necula, CS 294-8, Lecture 12


Review

[Diagram: Source language; VCGen; FOL; Theories; Sat. proc 1 … Sat. proc n; Goal-directed proving; …; ?]


Combining Satisfiability Procedures

• Consider a set of literals F
  – Containing symbols from two theories T1 and T2

• We split F into two sets of literals
  – F1 containing only literals in theory T1
  – F2 containing only literals in theory T2
  – We name all subexpressions: p1(f2(E)) is split into f2(E) = n ∧ p1(n)

• We have: unsat(F1 ∧ F2) iff unsat(F)
  – unsat(F1) ∨ unsat(F2) ⇒ unsat(F)
  – But the converse is not true

• So we cannot compute unsat(F) with a trivial combination of the sat. procedures for T1 and T2


Combining Satisfiability Procedures. Example

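• Consider equality and arithmetic

  f(f(x) − f(y)) ≠ f(z) ∧ x ≤ y ∧ y + z ≤ x ∧ 0 ≤ z

• The contradiction is derived by alternating between the two theories:
  – Arithmetic: y ≤ x (from y + z ≤ x and 0 ≤ z), hence x = y
  – Arithmetic: 0 = z
  – Equality: f(x) = f(y)
  – Arithmetic: f(x) − f(y) = z
  – Equality: f(f(x) − f(y)) = f(z)
  – This contradicts f(f(x) − f(y)) ≠ f(z): false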


Combining Satisfiability Procedures

• Combining satisfiability procedures is non-trivial
• And that is to be expected:
  – Equality was solved by Ackermann in 1924, arithmetic by Fourier even before, but E + A only in 1979!

• Yet in any single verification problem we will have literals from several theories:
  – Equality, arithmetic, lists, …

• When and how can we combine separate satisfiability procedures?


Nelson-Oppen Method (1)

1. Represent all conjuncts in the same DAG

   f(f(x) − f(y)) ≠ f(z) ∧ y ≥ x ∧ x ≥ y + z ∧ z ≥ 0

   [Figure: DAG of the terms, with shared nodes for x, y, z, 0, f(x), f(y), f(z), f(x) − f(y), y + z and f(f(x) − f(y))]

2. Run each sat. procedure
   • Require it to report all contradictions (as usual)
   • Also require it to report all equalities between nodes

3. Broadcast all discovered equalities and re-run sat. procedures
   • Until no more equalities are discovered or a contradiction arises

   [Figure: the same DAG with the discovered equalities merged, ending in a contradiction]
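The three steps above can be sketched as a small propagation loop. This is only an illustration: the two procedures below are hypothetical, table-driven stand-ins that encode what real arithmetic and congruence-closure procedures would report on this one example, and the names `nelson_oppen`, `arithmetic` and `equality` are not from the lecture.

```python
def nelson_oppen(procedures, max_rounds=100):
    """Broadcast equalities between procedures until a fixed point or a contradiction."""
    known = set()
    for _ in range(max_rounds):
        discovered = set()
        for proc in procedures:
            status, equalities = proc(frozenset(known))
            if status == "unsat":
                return "unsatisfiable"
            discovered |= equalities - known
        if not discovered:              # no new equalities: stop
            return "no contradiction found"
        known |= discovered             # broadcast and re-run
    raise RuntimeError("round limit reached")


def arithmetic(known):
    # Stand-in (assumption) for an arithmetic procedure on
    # y >= x, x >= y + z, z >= 0: these entail x = y and z = 0.
    entailed = {"x = y", "z = 0"}
    if "f(x) = f(y)" in known:
        entailed.add("f(x) - f(y) = z")  # both sides are 0
    return "ok", entailed


def equality(known):
    # Stand-in (assumption) for congruence closure with the
    # disequality f(f(x) - f(y)) != f(z).
    entailed = set()
    if "x = y" in known:
        entailed.add("f(x) = f(y)")
    if "f(x) - f(y) = z" in known:
        # Congruence gives f(f(x) - f(y)) = f(z): contradiction.
        return "unsat", set()
    return "ok", entailed


result = nelson_oppen([arithmetic, equality])
```

Neither stand-in finds the contradiction alone; it only appears after equalities have been exchanged in both directions.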


What Theories Can be Combined?

• Only theories without common interpreted symbols
  – But OK if one theory takes the symbol uninterpreted

• Only certain theories can be combined
  – Consider (Z, +, ≤) and Equality
  – Consider: 1 ≤ x ≤ 2 ∧ a = 1 ∧ b = 2 ∧ f(x) ≠ f(a) ∧ f(x) ≠ f(b)
  – No equalities and no contradictions are discovered
  – Yet, unsatisfiable

• A theory is non-convex when a set of literals entails a disjunction of equalities without entailing any single equality
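A brute-force check makes the non-convexity of the integer example concrete. This is a minimal sketch: the helper `entails` and the enumeration range are assumptions chosen for illustration (the range is harmless here because 1 ≤ x ≤ 2 already bounds x).

```python
def entails(models, conclusion):
    """True when the conclusion holds in every model of the literals."""
    return all(conclusion(x) for x in models)


# Integer models of the literals 1 <= x <= 2.
models = [x for x in range(-10, 11) if 1 <= x <= 2]

entails_disjunction = entails(models, lambda x: x == 1 or x == 2)
entails_x_is_1 = entails(models, lambda x: x == 1)
entails_x_is_2 = entails(models, lambda x: x == 2)
```

The disjunction x = 1 ∨ x = 2 is entailed while neither equality is, so a procedure that only reports single equalities has nothing to broadcast.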


Handling Non-Convex Theories

• Many theories are non-convex
• Consider the theory of memory and pointers
  – It is non-convex: true ⇒ A = A' ∨ sel(upd(M, A, V), A') = sel(M, A') (neither of the disjuncts is entailed individually)

• For such theories it can be the case that
  – No contradiction is discovered
  – No single equality is discovered
  – But a disjunction of equalities is discovered

• We need to propagate disjunctions of equalities …


Propagating Disjunction of Equalities

• To propagate disjunctions we perform a case split:

• If a disjunction E1 ∨ … ∨ En is discovered:

  Save the current state of the prover
  for i = 1 to n {
    broadcast Ei
    if no contradiction arises then return "satisfiable"
    restore the saved prover state
  }
  return "unsatisfiable"
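The case-split loop can be sketched in Python as follows. The prover state here is a toy assumption: a dict of equalities and disequalities, where `broadcast` adds only the one congruence consequence f(l) = f(r) and ignores symmetry. A real prover would rerun its satisfiability procedures at that point.

```python
import copy


def broadcast(state, equality):
    """Record l = r plus its congruence consequence f(l) = f(r) (toy closure)."""
    left, right = equality
    state["eqs"].add((left, right))
    state["eqs"].add((f"f({left})", f"f({right})"))


def has_contradiction(state):
    """A contradiction is an equality that is also asserted as a disequality."""
    return any(eq in state["diseqs"] for eq in state["eqs"])


def case_split(state, disjuncts):
    saved = copy.deepcopy(state)            # save the current prover state
    for equality in disjuncts:
        broadcast(state, equality)
        if not has_contradiction(state):
            return "satisfiable"
        state.clear()
        state.update(copy.deepcopy(saved))  # restore the saved prover state
    return "unsatisfiable"


# f(x) != f(a) and f(x) != f(b), with the discovered disjunction x = a or x = b.
state = {"eqs": set(), "diseqs": {("f(x)", "f(a)"), ("f(x)", "f(b)")}}
result = case_split(state, [("x", "a"), ("x", "b")])
```

Both branches contradict a disequality, so the split reports "unsatisfiable" and leaves the state as it was saved.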


Handling Non-Convex Theories

• Case splitting is expensive
  – Must backtrack (performance --)
  – Must implement all satisfiability procedures in incremental fashion (simplicity --)

• In some cases the splitting can be prohibitive:
  – Take pointers for example.

    upd(upd(…(upd(m, i1, x), …, in-1, x), in, x) = upd(…(upd(m, j1, x), …, jn-1, x) ∧
    sel(m, i1) ≠ x ∧ … ∧ sel(m, in) ≠ x

    entails ∨ (j ≠ k) ij = ik

    (a conjunction of length n entails n² disjuncts)


Forward vs. Backward Theorem Proving

• The state of a prover can be expressed as: H1 ∧ … ∧ Hn ⇒? G
  – Given the hypotheses Hi try to derive goal G

• A forward theorem prover derives new hypotheses, in hope of deriving G
  – If H1 ∧ … ∧ Hn ⇒ H then move to state H1 ∧ … ∧ Hn ∧ H ⇒? G
  – Success state: H1 ∧ … ∧ G ∧ … ∧ Hn ⇒? G

• A forward theorem prover uses heuristics to reach G
  – Or it can exhaustively derive everything that is derivable!


Forward Theorem Proving

• Nelson-Oppen is a forward theorem prover:
  – The state is L1 ∧ … ∧ Ln ∧ ¬L ⇒? false
  – If L1 ∧ … ∧ Ln ∧ ¬L ⇒ E (an equality) then
  – New state is L1 ∧ … ∧ Ln ∧ ¬L ∧ E ⇒? false (add the equality)
  – Success state is L1 ∧ … ∧ L ∧ … ∧ ¬L ∧ … ∧ Ln ⇒? false

• Nelson-Oppen provers exhaustively produce all derivable facts hoping to encounter the goal

• Case splitting can be explained this way too:
  – If L1 ∧ … ∧ Ln ∧ ¬L ⇒ E ∨ E' (a disjunction of equalities) then
  – Two new states are produced (both must lead to success)
    L1 ∧ … ∧ Ln ∧ ¬L ∧ E ⇒? false
    L1 ∧ … ∧ Ln ∧ ¬L ∧ E' ⇒? false


Backward Theorem Proving

• A backward theorem prover derives new subgoals from the goal
  – The current state is H1 ∧ … ∧ Hn ⇒? G
  – If H1 ∧ … ∧ Hn ∧ G1 ∧ … ∧ Gn ⇒ G (Gi are subgoals)
  – Produce "n" new states (all must lead to success): H1 ∧ … ∧ Hn ⇒? Gi

• Similar to case splitting in Nelson-Oppen:
  – Consider a non-convex theory: H1 ∧ … ∧ Hn ⇒ E ∨ E'
    is same as H1 ∧ … ∧ Hn ∧ ¬E ∧ ¬E' ⇒ false
    (thus we have reduced the goal "false" to subgoals ¬E ∧ ¬E')


Programming Theorem Provers

• Backward theorem provers most often use heuristics

• It is useful to be able to program the heuristics

• Such programs are called tactics, and tactic-based provers have this capability
  – E.g. the Edinburgh LCF was a tactic-based prover whose programming language was called the Meta-Language (ML)

• A tactic examines the state and either:
  – Announces that it is not applicable in the current state, or
  – Modifies the proving state


Programming Theorem Provers. Tactics.

• State = Formula list × Formula
  – A set of hypotheses and a goal

• Tactic = State → (State → α) → (unit → α) → α
  – Continuation passing style
  – Given a state will invoke either the success continuation with a modified state or the failure continuation

• Example: a congruence-closure based tactic

  cc (h, g) c f =
    let e1, …, en new equalities in the congruence closure of h
    c (h ∪ {e1, …, en}, g)

  – A forward chaining tactic


Programming Theorem Provers. Tactics.

• Consider an axiom: ∀x. a(x) ⇒ b(x)
  – Like the clause b(x) :- a(x) in Prolog

• This could be turned into a tactic

  clause (h, g) c f = if unif(g, b) = σ then c (h, σ(a)) else f ()

  – A backward chaining tactic


Programming Theorem Provers. Tacticals.

• Tactics can be composed using tacticals

Examples:

• THEN : tactic → tactic → tactic

  THEN t1 t2 = λs.λc.λf.
    let newc s' = t2 s' c f in t1 s newc f

• REPEAT : tactic → tactic

  REPEAT t = THEN t (REPEAT t)

• ORELSE : tactic → tactic → tactic

  ORELSE t1 t2 = λs.λc.λf.
    let newf x = t2 s c f in t1 s c newf
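The tacticals translate directly into continuation-passing Python. This is a sketch under stated assumptions: states are (hypotheses, goal) pairs of plain strings, `clause` handles only ground axioms (no unification), and `repeat` deliberately differs from the slide's definition by calling the success continuation with the last state once the tactic stops applying, since the literal unfolding THEN t (REPEAT t) would never terminate in an eager language.

```python
# A tactic takes (state, success continuation, failure continuation).

def then(t1, t2):
    def tactic(s, c, f):
        return t1(s, lambda s2: t2(s2, c, f), f)
    return tactic


def orelse(t1, t2):
    def tactic(s, c, f):
        return t1(s, c, lambda: t2(s, c, f))
    return tactic


def repeat(t):
    # Assumption: succeed with the current state when t no longer applies.
    def tactic(s, c, f):
        return t(s, lambda s2: tactic(s2, c, f), lambda: c(s))
    return tactic


def clause(premise, conclusion):
    """Backward-chaining tactic for the ground axiom premise => conclusion."""
    def tactic(s, c, f):
        hyps, goal = s
        return c((hyps, premise)) if goal == conclusion else f()
    return tactic


def proved(s):
    """Final success continuation: the remaining goal is a hypothesis."""
    hyps, goal = s
    return goal in hyps


# Axioms b => c and a => b, hypothesis a, goal c.
prolog = repeat(orelse(clause("b", "c"), clause("a", "b")))
result = prolog((frozenset({"a"}), "c"), proved, lambda: False)
```

The goal c is reduced to b and then to a, at which point no clause applies and the final continuation finds a among the hypotheses.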


Programming Theorem Provers. Tacticals

• Prolog is just one possible tactic:
  – Given tactics for each clause: c1, …, cn
  – Prolog : tactic
    Prolog = REPEAT (c1 ORELSE c2 ORELSE … ORELSE cn)

• Nelson-Oppen can also be programmed this way
  – The result is not as efficient as a special-purpose implementation

• This is a very powerful mechanism for semi-automatic theorem proving
  – Used in: Isabelle, HOL, and many others


Techniques for Inferring Loop Invariants


Inferring Loop Invariants

• Traditional program verification has several elements:
  – Function specifications and loop invariants
  – Verification condition generation
  – Theorem proving

• Requiring specifications from the programmer is often acceptable

• Requiring loop invariants is not acceptable
  – Same for specifications of local functions


Inferring Loop Invariants

• A set of cutpoints is a set of program points:
  – There is at least one cutpoint on each circular path in the CFG
  – There is a cutpoint at the start of the program
  – There is a cutpoint before the return

• Consider that our function uses n variables x:

• We associate with each cutpoint an assertion Ik(x)

• If π is a path from cutpoint k to j then:
  – Rπ(x) : Zⁿ → Zⁿ expresses the effect of path π on the values of x at j as a function of those at k
  – Pπ(x) : Zⁿ → B is a path predicate that is true exactly of those values of x at k that will enable the path π


Cutpoints. Example.

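• p01 = true
  r01 = { A ← A, K ← 0, L ← len(A), S ← 0, μ ← μ }

• p11 = K + 1 < L
  r11 = { A ← A, K ← K + 1, L ← L, S ← S + sel(μ, A + K), μ ← μ }

• p12 = K + 1 ≥ L
  r12 = r11

• Easily obtained through sym. eval.

[Figure: control-flow graph. Cutpoint 0 precedes "L = len(A); K = 0; S = 0"; cutpoint 1 precedes the loop body "S = S + A[K]; K++", which is followed by the test "K < L"; cutpoint 2 precedes "return S".]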
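The path predicates and relations of this example can be written out as ordinary functions and composed to execute the program path by path. This is a sketch under simplifying assumptions: the state is a Python dict, and the array is a Python list indexed directly, standing in for the memory lookup at address A + K.

```python
def r01(x):
    """Path 0 -> 1: initialise K, L and S."""
    return {**x, "K": 0, "L": len(x["A"]), "S": 0}


def p11(x):
    """Path 1 -> 1 is enabled when the test K < L holds after K++."""
    return x["K"] + 1 < x["L"]


def p12(x):
    """Path 1 -> 2 is enabled when the test fails after K++."""
    return x["K"] + 1 >= x["L"]


def r11(x):
    """Effect of one loop body: add A[K] to S, then increment K."""
    return {**x, "K": x["K"] + 1, "S": x["S"] + x["A"][x["K"]]}


r12 = r11  # the exit path executes the same body before leaving


def run(array):
    x = r01({"A": array, "K": None, "L": None, "S": None})
    while p11(x):      # follow path 1 -> 1 while it is enabled
        x = r11(x)
    assert p12(x)      # exactly one of p11, p12 holds at cutpoint 1
    return r12(x)["S"]


total = run([1, 2, 3])
```

Because the body runs before the test, `run([])` raises an IndexError: the program reads A[0] even when the array is empty, which is the kind of array-bounds violation a loop invariant has to rule out.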


Equational Definition of Invariants

• A set of assertions is a set of invariants if:
  – The assertion for the start cutpoint is the precondition
  – The assertion for the end cutpoint is the postcondition
  – For each path from i to j we have ∀x. Ii(x) ∧ Pij(x) ⇒ Ij(Rij(x))

• Now we have to solve a system of constraints with the unknowns I1, …, In-1
  – I0 and In are known

• We will consider the simpler case of a single loop
  – Otherwise we might want to try solving the inner/last loop first


Invariants. Example.

1. I0 ⇒ I1(r0(x))
   • The invariant I1 is established initially

2. I1 ∧ K+1 < L ⇒ I1(r1(x))
   • The invariant I1 is preserved in the loop

3. I1 ∧ K+1 ≥ L ⇒ I2(r1(x))
   • The invariant I1 is strong enough (i.e. useful)

[Figure: the same control-flow graph with cutpoints 0, 1, 2; r0 labels the path from cutpoint 0 to cutpoint 1 and r1 labels the loop body leaving cutpoint 1.]


The Lattice of Invariants

• A few predicates satisfy condition 2
  – Are invariant
  – Form a lattice

• Weak predicates satisfy condition 1
  – Are satisfied initially

• Strong predicates satisfy condition 3
  – Are useful

[Figure: the lattice of predicates, with true and false as its extremes]


Finding The Invariant

• Which of the potential invariants should we try to find?

• We prefer to work backwards
  – Essentially proving only what is needed to satisfy In
  – Forward is also possible but sometimes wasteful since we have to prove everything that holds at any point


Finding the Invariant

• Thus we do not know the "precondition" of the loop

• The weakest invariant that is strong enough has most chances of holding initially

• This is the one that we'll try to find
  – And then check that it is weak enough

[Figure: the lattice of predicates, with true and false as its extremes]


Induction Iteration Method

• Equation 3 gives a predicate weaker than any invariant

  I1 ∧ K+1 ≥ L ⇒ I2(r1(x))
  I1 ⇒ (K+1 ≥ L ⇒ I2(r1(x)))
  W0 = K+1 ≥ L ⇒ I2(r1(x))

• Equation 2 suggests an iterative computation of the invariant I1

  I1 ⇒ (K+1 < L ⇒ I1(r1(x)))

[Figure: the lattice of predicates, with true and false as its extremes]



• Define a family of predicates

  W0 = K+1 ≥ L ⇒ I2(r1(x))
  Wj = W0 ∧ (K+1 < L ⇒ Wj-1(r1(x)))

• Properties of Wj
  – Wj ⇒ Wj-1 ⇒ … ⇒ W0 (they form a strengthening chain)
  – I1 ⇒ Wj (they are weaker than any invariant)
  – If Wj-1 ⇒ Wj then
    • Wj is an invariant (satisfies both equations 2 and 3):
      Wj ⇒ (K+1 < L ⇒ Wj(r1(x)))
    • Wj is the weakest invariant
    (recall domain theory, predicates form a domain, and we use the fixed point theorem to obtain least solutions to recursive equations)



  W = K+1 ≥ L ⇒ I2(r1(x))   // This is W0
  W' = true
  while (not (W' ⇒ W)) {
    W' = W
    W = (K+1 ≥ L ⇒ I2(r1(x))) ∧ (K+1 < L ⇒ W'(r1(x)))
  }

• The only hard part is to check whether W' ⇒ W
  – We use a theorem prover for this purpose
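The loop above can be run end to end if the theorem prover is replaced by a brute-force check over a small range of integers. That replacement is an assumption made purely for illustration, as are the bound, the state layout (K, L, N) with N standing for len(A), and the choice of the array-bounds condition K ≥ 0 ∧ K < len(A) as W0.

```python
from itertools import product

BOUND = 6  # assumption: K, L and N each range over -BOUND..BOUND
STATES = list(product(range(-BOUND, BOUND + 1), repeat=3))


def implies(p, q):
    """Stand-in for the theorem prover: check p => q on every bounded state."""
    return all(q(s) for s in STATES if p(s))


def r1(s):
    k, l, n = s
    return (k + 1, l, n)   # the loop body only changes K (S is ignored here)


def w0(s):
    k, l, n = s
    return 0 <= k < n      # array-bounds condition


def step(w_prev):
    """Build Wj = W0 and (K+1 < L => Wj-1(r1(x)))."""
    def w(s):
        k, l, n = s
        return w0(s) and (not (k + 1 < l) or w_prev(r1(s)))
    return w


def induction_iteration(max_iters=50):
    w_prev, w = (lambda s: True), w0        # W' = true, W = W0
    for iters in range(max_iters):
        if implies(w_prev, w):              # W' => W: fixed point reached
            return w, iters
        w_prev, w = w, step(w)
    return None, max_iters


invariant, iters = induction_iteration()
```

On these bounded states the result agrees with K ≥ 0 ∧ K < len(A) ∧ (K+1 < L ⇒ L ≤ len(A)). Termination here is an artifact of the finite bound: the number of iterations grows with BOUND, which matches the warning that the unbounded computation need not terminate.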


Induction Iteration.

• The sequence of Wj approaches the weakest invariant from above

• The predicate Wj can quickly become very large
  – Checking W' ⇒ W becomes harder and harder

• This is not guaranteed to terminate

[Figure: the chain W0, W1, W2, W3 in the lattice between true and false]


Induction Iteration. Example.

• Consider that the strength condition is:

  I1 ⇒ K ≥ 0 ∧ K < len(A) (array bounds)

• We compute the W's:

  W0 = K ≥ 0 ∧ K < len(A)
  W1 = W0 ∧ (K+1 < L ⇒ K+1 ≥ 0 ∧ K+1 < len(A))
  W2 = W0 ∧ (K+1 < L ⇒ (K+1 ≥ 0 ∧ K+1 < len(A) ∧ (K+2 < L ⇒ K+2 ≥ 0 ∧ K+2 < len(A))))
  …

[Figure: the same control-flow graph with cutpoints 0, 1, 2 and paths r0, r1]


Induction Iteration. Strengthening.

• We can try to strengthen the inductive invariant
• Instead of:

  Wj = W0 ∧ (K+1 < L ⇒ Wj-1(r1(x)))

  we compute:

  Wj = strengthen(W0 ∧ (K+1 < L ⇒ Wj-1(r1(x))))

  where strengthen(P) ⇒ P

• We still have Wj ⇒ Wj-1 and we stop when Wj-1 ⇒ Wj
  – The result is still an invariant that satisfies 2 and 3


Strengthening Heuristics

• One goal of strengthening is simplification:
  – Drop disjuncts: P1 ∨ P2 → P1
  – Drop implications: P1 ⇒ P2 → P2

• A good idea is to try to eliminate variables changed in the loop body:
  – If Wj does not depend on variables changed by r1 (e.g. K, S)
  – Wj+1 = W0 ∧ (K+1 < L ⇒ Wj(r1(x)))
         = W0 ∧ (K+1 < L ⇒ Wj)
  – Now Wj ⇒ Wj+1 and we are done!


Induction Iteration. Strengthening

• We are still in the "strong-enough" area

• We are making bigger steps

• And we might overshoot the weakest invariant

• We might also fail to find any invariant

• But we do so quickly

[Figure: the chain W0, W1, W2, W3 together with the strengthened predicates W'1 and W'2, in the lattice between true and false]


One Strengthening Heuristic for Integers

• Rewrite Wj in conjunctive normal form

  W1 = K ≥ 0 ∧ K < len(A) ∧ (K+1 < L ⇒ K+1 ≥ 0 ∧ K+1 < len(A))
     = K ≥ 0 ∧ K < len(A) ∧ (K+1 ≥ L ∨ K+1 < len(A))

• Take each disjunction containing arithmetic literals
  – Negate it and obtain a conjunction of arithmetic literals: K+1 < L ∧ K+1 ≥ len(A)
  – Weaken the result by eliminating a variable (preferably a loop-modified variable). E.g., add the two literals: L > len(A)
  – Negate the result and get another disjunction: L ≤ len(A)

  W1 = K ≥ 0 ∧ K < len(A) ∧ L ≤ len(A) (check that W1 ⇒ W2)
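The final check (W1 ⇒ W2 for the strengthened W1) can be tried out on a bounded range of integers. As a sketch, a brute-force enumeration stands in for the theorem prover; the bound, the state layout (K, L, N) with N standing for len(A), and all function names are assumptions made for illustration.

```python
from itertools import product

BOUND = 6  # assumption: K, L and N each range over -BOUND..BOUND
STATES = list(product(range(-BOUND, BOUND + 1), repeat=3))


def implies(p, q):
    """Stand-in for the theorem prover: check p => q on every bounded state."""
    return all(q(s) for s in STATES if p(s))


def w0(s):
    k, l, n = s
    return 0 <= k < n


def w1_plain(s):
    """W1 before strengthening: W0 and (K+1 < L => K+1 >= 0 and K+1 < len(A))."""
    k, l, n = s
    return w0(s) and (not (k + 1 < l) or 0 <= k + 1 < n)


def w1_strong(s):
    """Strengthened W1: K >= 0 and K < len(A) and L <= len(A)."""
    k, l, n = s
    return 0 <= k < n and l <= n


def w2(s):
    """W2 = W0 and (K+1 < L => W1(r1(x))), built from the strengthened W1."""
    k, l, n = s
    return w0(s) and (not (k + 1 < l) or w1_strong((k + 1, l, n)))


is_strengthening = implies(w1_strong, w1_plain)   # strengthen(P) => P
is_fixed_point = implies(w1_strong, w2)           # W1 => W2: stop here
is_equivalent = implies(w1_plain, w1_strong)      # does the step lose anything?
```

On this range the strengthened predicate implies the original and is already a fixed point, but the two are not equivalent (for example K = 0, L = 3, len(A) = 2 satisfies only the original), which illustrates how strengthening can overshoot the weakest invariant.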


Induction Iteration

• We showed a way to compute invariants algorithmically
  – Similar to fixed point computation in domains
  – Similar to abstract interpretation on the lattice of predicates

• Then we discussed heuristics that improve the termination properties
  – Similar to widening in abstract interpretation


Theorem Proving. Conclusions.

• Theorem proving strengths
  – Very expressive

• Theorem proving weaknesses
  – Too ambitious

• A great toolbox for software analysis
  – Symbolic evaluation
  – Decision procedures

• Related to program analysis
  – Abstract interpretation on the lattice of predicates
