The behavior of SAT solvers in model checking applications K. L. McMillan Cadence Berkeley Labs.




Copyright 2002 Cadence Design Systems. Permission is granted to reproduce without modification.

Overview

• Some SAT-based model checking methods
  – Localization abstraction using SAT
  – Interpolation-based model checking

• Discuss
  – When SAT is effective in these applications
  – Why SAT solvers behave this way
  – Where improvements are possible


Refutations

• The ability to generate a refutation in the unsatisfiable case is critical to these applications.

• A DPLL-style solver naturally produces refutations by resolution.


Conflict Clauses and resolution

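Clauses: $(\neg a \vee b)$, $(\neg b \vee c \vee d)$, $(\neg b \vee \neg d)$

Decisions: $a$, $\neg c$

Assignment: $a$, $b$, $\neg c$, $d$. Conflict!

Resolve $(\neg b \vee c \vee d)$ with $(\neg b \vee \neg d)$: $(\neg b \vee c)$. Conflict!

Resolve $(\neg b \vee c)$ with $(\neg a \vee b)$: $(\neg a \vee c)$. Conflict!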
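The two resolution steps on this slide can be replayed with a few lines of Python. This is only a sketch: literals are strings with a leading "-" for negation, and the clause polarities are an assumption, chosen so that the steps produce the conflict clauses (b c) and (a c) shown above.

```python
def resolve(c1, c2, var):
    """Resolve two clauses on `var`; a literal is a string, '-' marks negation."""
    pos, neg = var, "-" + var
    if pos in c1 and neg in c2:
        return (c1 - {pos}) | (c2 - {neg})
    if neg in c1 and pos in c2:
        return (c1 - {neg}) | (c2 - {pos})
    raise ValueError(f"clauses do not clash on {var}")

# Assumed polarities (hypothetical reading of the slide's three clauses).
c1 = frozenset({"-a", "b"})
c2 = frozenset({"-b", "c", "d"})
c3 = frozenset({"-b", "-d"})

step1 = resolve(c2, c3, "d")     # first conflict clause, over b and c
step2 = resolve(step1, c1, "b")  # second conflict clause, over a and c
```

Both derived clauses are falsified by the decisions a = true, c = false, which is why each step still reports a conflict.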


Conflict Clauses (cont.)

• Conflict clause generation is really a way of using failures in the backtracking search to guide resolution.
  – search guides deduction
  – deduction guides search

Many heuristics are available for determining when to terminate the resolution process (e.g., the 1UIP rule)


Generating refutations

• Refutation = a proof of the null clause
  – Record a DAG containing all resolution steps performed during conflict clause generation.
  – When the null clause is generated, we can extract a proof of the null clause as a resolution DAG.

[Figure: resolution DAG leading from the original clauses through derived clauses to the null clause]
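A minimal sketch of the bookkeeping described here, assuming clauses are frozensets of string literals. The class name `ProofLog` and its methods are hypothetical, not taken from any particular solver.

```python
class ProofLog:
    """Records, for each derived clause, the two clauses it was resolved from."""

    def __init__(self, original_clauses):
        self.original = set(original_clauses)
        self.parents = {}  # derived clause -> (left parent, right parent)

    def record(self, derived, left, right):
        self.parents[derived] = (left, right)

    def used_originals(self, root=frozenset()):
        """Walk the DAG back from `root` (default: the null clause)."""
        seen, stack, used = set(), [root], set()
        while stack:
            clause = stack.pop()
            if clause in seen:
                continue
            seen.add(clause)
            if clause in self.original:
                used.add(clause)                     # leaf of the proof
            else:
                stack.extend(self.parents[clause])   # derived: visit its parents
        return used

# Tiny example: (x), (-x y), (-y) are refutable; (z w) plays no part.
x, nx_y, ny, unused = (frozenset({"x"}), frozenset({"-x", "y"}),
                       frozenset({"-y"}), frozenset({"z", "w"}))
log = ProofLog([x, nx_y, ny, unused])
y = frozenset({"y"})
log.record(y, x, nx_y)
log.record(frozenset(), y, ny)
core = log.used_originals()
```

The set `core` holds exactly the original clauses that appear as leaves of the refutation, which is what the abstraction methods later in the talk rely on.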


Bounded Model Checking

• Given
  – A finite transition system M
  – A property p

• Determine
  – Does M allow a counterexample to p of k transitions or fewer?

This problem can be translated to a SAT problem (BCCZ99)


Models

Transition system described by a set of constraints

[Figure: circuit over signals a, b, c, g, p]

Each circuit element is a constraint (note: $a = a_t$ and $a' = a_{t+1}$):

$g = a \wedge b$

$p = g \vee c$

$c' = p$

Model:

$C = \{\, g = a \wedge b,\ p = g \vee c,\ c' = p \,\}$


Properties

• We restrict our attention to safety properties.

• Characterized by:
  – Initial condition I
  – Final condition F (representing "bad" states)

• A counterexample is a path from a state satisfying I to a state satisfying F, where every transition satisfies C.


Unfolding

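• Unfold the model k times: $U_k = C_0 \wedge C_1 \wedge \dots \wedge C_{k-1}$

[Figure: k copies of the circuit chained in sequence, with $I_0$ constraining the first frame and $F_k$ the last]

• Use SAT solver to check satisfiability of $I_0 \wedge U_k \wedge F_k$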

• A satisfying assignment is a counterexample of k steps
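To make the unfolding concrete, here is a brute-force stand-in for the SAT call on the three-constraint example model. It is a sketch only: it enumerates assignments instead of calling a solver, the gate types (AND for g, OR for p) are an assumption, and the initial and final conditions used in the example calls are hypothetical.

```python
from itertools import product

VARS = ("a", "b", "c", "g", "p")

# Each constraint sees the frame at time t (cur) and at time t+1 (nxt).
# Gate types are assumed: g is an AND gate, p is an OR gate.
C = {
    "g = a & b": lambda cur, nxt: cur["g"] == (cur["a"] and cur["b"]),
    "p = g | c": lambda cur, nxt: cur["p"] == (cur["g"] or cur["c"]),
    "c' = p":    lambda cur, nxt: nxt["c"] == cur["p"],
}

def bmc(constraints, init, bad, k):
    """Return k+1 frames satisfying I_0, U_k and F_k, or None if unsatisfiable."""
    n = len(VARS)
    for bits in product((False, True), repeat=n * (k + 1)):
        frames = [dict(zip(VARS, bits[i * n:(i + 1) * n])) for i in range(k + 1)]
        if not init(frames[0]) or not bad(frames[k]):
            continue
        if all(con(frames[t], frames[t + 1])
               for t in range(k) for con in constraints.values()):
            return frames
    return None

# Hypothetical conditions: start with c low, "bad" means c high (and vice versa).
cex = bmc(C, init=lambda s: not s["c"], bad=lambda s: s["c"], k=1)
no_cex = bmc(C, init=lambda s: s["c"], bad=lambda s: not s["c"], k=2)
```

With these conditions, `cex` is a one-step counterexample, while `no_cex` is `None` because c can never fall once it is set.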


Localization abstraction

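• Property: $G\,(c \Rightarrow X\,c)$

[Figure: the circuit over signals a, b, c, g, p, with one signal marked "free variable" in the abstraction]

Model:

$C = \{\, g = a \wedge b,\ p = g \vee c,\ c' = p \,\}$

Abstraction: $C' \subseteq C$; a signal whose defining constraint is omitted from $C'$ becomes a free variable.

If $C' \models$ property and $C' \subseteq C$, then $C \models$ property (Kurshan)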
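The soundness of localization can be checked by enumeration on this example. The sketch below assumes the property is "c implies next c", that g is an AND gate and p an OR gate, and that the abstraction drops the definition of g; all of these are assumptions about the slide's figure.

```python
from itertools import product

NAMES = ("a", "b", "c", "g", "p", "c_next")

def holds_on_all_transitions(constraints):
    """True if no transition allowed by `constraints` has c high and c_next low."""
    for bits in product((False, True), repeat=len(NAMES)):
        v = dict(zip(NAMES, bits))
        if all(con(v) for con in constraints) and v["c"] and not v["c_next"]:
            return False
    return True

g_def = lambda v: v["g"] == (v["a"] and v["b"])
p_def = lambda v: v["p"] == (v["g"] or v["c"])
c_def = lambda v: v["c_next"] == v["p"]

C = [g_def, p_def, c_def]
C_abs = [p_def, c_def]      # g is left unconstrained: a free variable
too_coarse = [c_def]        # p is free as well

ok_abs = holds_on_all_transitions(C_abs)
ok_full = holds_on_all_transitions(C)
ok_coarse = holds_on_all_transitions(too_coarse)
```

The abstraction with two constraints already proves the property, so the full model satisfies it too; dropping the definition of p as well loses the proof.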


Localization, cont.

• C' may refer to fewer state variables than C
  – reduction in the state explosion problem

• Key issue: how to choose constraints in C'
  – counterexample-based
  – proof-based


Algorithm

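[Flowchart (Kurshan); the figure marks where SAT is used]

• Choose initial C'
• Model check abstraction C'
  – true: done
  – otherwise: Cex
• Can extend Cex from C' to C?
  – yes: Cex, done
  – no: add constraints to C', then model check again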


Abstract counterexamples

• Assume simple safety property:
  – initial condition I and final condition F
  – w.l.o.g., assume I and F are atomic formulas
    • to make this true, add constraints in C: $v_I \Leftrightarrow I$, $v_F \Leftrightarrow F$

• Abstract variables $V' = \mathrm{support}(C', I, F)$

• Abstract counterexample A' is a truth assignment to $\{\, v_t \mid v \in V',\ t \in 0..k \,\}$, where k is the number of steps.


Counterexample extension

• Abstract counterexample A' satisfies $I_0 \wedge U'_k \wedge F_k$, where $U'_k = C'_0 \wedge C'_1 \wedge \dots \wedge C'_{k-1}$

• Find A consistent with A', satisfying $I_0 \wedge U_k \wedge F_k$, where $U_k = C_0 \wedge C_1 \wedge \dots \wedge C_{k-1}$

• That is, A is any satisfying assignment to:

$A' \wedge I_0 \wedge U_k \wedge F_k$

I.e., to extend an abstract counterexample, we just apply it as a constraint in BMC. If unsat, the abstract counterexample is "false".

CGJLV 2000


Abstraction refinement

• Refinement = adding constraints to C' to eliminate false counterexamples.

• Many heuristics used for this.

– Too many to cover here.

– Recall that a SAT solver can produce a resolution-based refutation in the UNSAT case....


Proof-based refinement

• Recall, to extend abstract Cex A', we check: $A' \wedge I_0 \wedge U_k \wedge F_k$

• If UNSAT, we obtain refutation proof P
  – proof that A' cannot be extended to concrete Cex

• Let E be set of constraints used in proof P: $E = \{\, c \in C \mid \text{some } c_i \text{ occurs in } P \,\}$

• A' cannot be extended to a Cex for E
  – P is the proof of this.

Thus, add E to C' and continue...

CCKSVW02
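The set E can be read off the leaves of the refutation. A small sketch, assuming each leaf is tagged with a hypothetical (name, time frame) pair and that non-constraint leaves such as I, F or the units of A' carry names outside C:

```python
def used_constraints(C, proof_leaves):
    """E = { c in C | some time-frame instance c_i occurs in the proof }."""
    occurring = {name for name, _frame in proof_leaves}
    return {c for c in C if c in occurring}

C = {"g = a & b", "p = g | c", "c' = p"}

# Hypothetical leaves of a refutation P: (constraint name, time frame).
leaves = [("I", 0), ("p = g | c", 0), ("c' = p", 0), ("c' = p", 1), ("F", 2)]

E = used_constraints(C, leaves)
C_abs = set() | E   # refinement: add E to the current abstraction C'
```

The time-frame index is dropped on purpose: a constraint enters E as soon as any of its instances occurs in P.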


In other words...

The refutation of the formula $A' \wedge I_0 \wedge U_k \wedge F_k$

gives us a sufficient set of constraints to rule out the abstract counterexample.

We continue ruling out counterexamples until either the abstraction C' proves the property or we can extend an abstract counterexample to a concrete one.


Weakness of Cex-based approach

• Arbitrarily chosen abstract Cex may be refutable for many reasons not related to property.
  – Thus, may add irrelevant constraints.
  – May require many iterations
  – To remedy, may try to characterize a set of Cex's rather than just one (e.g., GKM-HFV, TACAS03).

Alternative: don't use counterexamples


Proof-based abstraction

[Flowchart (MA, TACAS03)]

• BMC at depth k
  – Cex? done
  – No Cex? Use refutation to choose abstraction
• MC abstraction
  – True? done
  – False? Increase k and repeat BMC


BMC phase

• Unfold the model k times: $U = C_0 \wedge C_1 \wedge \dots \wedge C_{k-1}$

• Use SAT solver to check satisfiability of $I_0 \wedge U \wedge F_k$

• If unsatisfiable:
  – property has no Cex of length k
  – produce a refutation proof P


Abstraction phase

• Let C' be set of constraints used in proof P: $C' = \{\, c \in C \mid \text{some } c_i \text{ occurs in } P \,\}$

• C' admits no counterexample of length k
  – let $U' = C'_0 \wedge C'_1 \wedge \dots \wedge C'_{k-1}$
  – P is a refutation of $I_0 \wedge U' \wedge F_k$

• Model check property on C'
  – property true for C' implies true for C
  – else Cex of length k' > k (why?)
    • restart for k = k'


Algorithm

[Flowchart]

• BMC on C at depth k
  – Cex? done
  – No Cex? Refutation P induces abstraction C'
• Model check C'
  – True? done
  – Cex of depth k'? let k = k' and repeat BMC

Notice: MC counterexample is thrown away!
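The loop in this flowchart can be sketched as a driver around two engines. Both engines are hypothetical callbacks with assumed return conventions (documented in the docstring); the stubs at the end only exercise the control flow and do not model a real design.

```python
def proof_based_abstraction(bmc, model_check, k=1, max_k=50):
    """Sketch of the BMC / model-check loop.

    Assumed callback conventions (not from any real tool):
      bmc(k)            -> ("cex", trace) or ("unsat", constraints_in_refutation)
      model_check(abs_) -> ("true", None) or ("cex", depth_of_abstract_cex)
    """
    while k <= max_k:
        status, payload = bmc(k)
        if status == "cex":
            return ("false", payload)        # concrete counterexample: done
        abstraction = payload                # C' induced by refutation P
        verdict, depth = model_check(abstraction)
        if verdict == "true":
            return ("true", abstraction)     # true for C' implies true for C
        k = max(depth, k + 1)                # keep only the depth k', drop the Cex
    return ("unknown", None)

# Stub engines: a second constraint only shows up in refutations at depth 3.
def fake_bmc(k):
    return ("unsat", {"c1"} if k < 3 else {"c1", "c2"})

def fake_model_check(abstraction):
    return ("true", None) if "c2" in abstraction else ("cex", 3)

result = proof_based_abstraction(fake_bmc, fake_model_check)
```

The `max(depth, k + 1)` guard is an addition for safety; the slides argue that the abstract counterexample is always longer than k, in which case it simply yields k'.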


Termination

• Depth k increases at each iteration
• Eventually k > d, diameter of C'
• If k > d, no counterexample is possible

In practice, termination usually occurs when $k \approx d/2$

Usually, diameter of C' << diameter of C


Weakness of proof-based abs

• BMC must refute all counterexamples of length k, while in Cex-based, BMC must refute only one (partial) counterexample.
  – more stress on the SAT solver with PBA

• Experimentally...
  – CBA and PBA behave similarly for smaller circuits
  – PBA is faster for larger circuits because it terminates in fewer iterations.


PicoJavaII benchmarks

• Hardware Java virtual machine implementation
• Properties derived from verification of ICU
  – handles cache, instruction prefetch and decode
• Original abstraction was manual
• Added neighboring IFU to make problem harder

[Figure: block diagram with ICU, IFU, Mem/Cache, integer unit, and the properties]

No properties can be verified by standard model checking!


Abstraction results

[Bar chart: state variables (axis 0 to 400) per property; solid = original, gray = manual, open = proof-based abstraction; bar labels range from 51 to 354]


Inference

• SAT solver seems to be very effective at narrowing down the proof to relevant facts.

In most cases, it did better than manual abstraction.


A (fuzzy) hypothesis

SAT-based BMC "succeeds" when number of relevant variables is small, and fails otherwise.

"success" is BMC for k = diameter of relevant logic

• Parameterized models allowing no abstraction

Model             Max state vars
German protocol   42
"swap"            21


Industrial benchmarks

[Scatter plot: abstraction state variables against original state variables, both axes 0 to 700]


Possible explanation

• Internally, SAT solver is really doing CBA

[Figure: decision stack (a=0, b=1, c=0, d=1) = abstract Cex A', followed by the refutation of A'; the decision heuristic moves proof variables up, into A']


VSIDS heuristic (CHAFF)

• Increment a variable's score when it is used in the proof of a conflict clause

• Scores decay exponentially with number of decisions
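A simplified sketch of the scoring scheme as stated on this slide. It follows only the two bullets above; the class name, the decay factor and the tie-breaking are assumptions, and real solvers differ in detail (for example in how and when scores are rescaled).

```python
class VsidsScores:
    """Toy activity scores: bump on conflict, decay on every decision."""

    def __init__(self, variables, decay=0.95):
        self.score = {v: 0.0 for v in variables}
        self.decay = decay   # assumed value, purely illustrative

    def on_conflict(self, proof_variables):
        # Variables used in deriving the conflict clause gain score.
        for v in proof_variables:
            self.score[v] += 1.0

    def on_decision(self):
        # Exponential decay with the number of decisions.
        for v in self.score:
            self.score[v] *= self.decay

    def pick(self, unassigned):
        # Decide on the highest-scoring unassigned variable.
        return max(unassigned, key=lambda v: self.score[v])

s = VsidsScores(["a", "b", "c", "d"], decay=0.5)
s.on_conflict(["a", "b"])
s.on_decision()
s.on_conflict(["b"])
choice = s.pick(["a", "c", "d"])
```

Variables that stop appearing in conflict proofs fade toward zero, so the set of variables with a noticeable score tracks the part of the problem the solver is currently working on.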


VSIDS working set

[Plot: number of variables with non-zero score against time (decisions); one axis tick reads 500000]

Working set is solver’s localization of the problem


Working set, cont

• Size is small and relatively stable
• When exhausted, decisions become random
  – Working set is rapidly forgotten, then relearned
  – Most decisions are made randomly!
• Cost of irrelevant decisions is low
  – Decision cost must be proportional to working set size!

As a result, the effect of improving the decision heuristic is minimal for this class of problems.


Interpolation-based MC

• BMC and Craig interpolation allow us to compute image over-approximation relative to property.
  – Avoid computing exact image.
  – Maintain SAT solver's advantage of filtering out irrelevant facts. Exploit it to localize invariants.

McM03


Interpolation

• If $A \wedge B = \text{false}$, there exists an interpolant A' for (A,B) such that (Craig, 57):
  – $A \Rightarrow A'$
  – $A' \wedge B = \text{false}$
  – A' refers only to common variables of A, B

• Example:
  – $A = p \wedge q$, $B = \neg q \wedge r$, $A' = q$

• New result (Pudlak, Krajicek, 97)
  – given a resolution refutation of $A \wedge B$, A' can be derived in linear time.
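The three interpolant conditions can be checked by enumeration on the slide's example. This sketch does not derive an interpolant from a refutation; it only verifies a candidate. The negation on q in B is an assumption, and the "common variables" condition is checked semantically (the candidate must not depend on any non-shared variable).

```python
from itertools import product

def is_interpolant(A, B, Ap, names, shared):
    """Check A => A', A' & B unsatisfiable, and A' independent of non-shared vars."""
    envs = [dict(zip(names, bits))
            for bits in product((False, True), repeat=len(names))]
    implied = all(Ap(e) for e in envs if A(e))
    inconsistent = not any(Ap(e) and B(e) for e in envs)
    shared_only = all(Ap(e) == Ap({**e, v: not e[v]})
                      for e in envs for v in names if v not in shared)
    return implied and inconsistent and shared_only

A = lambda e: e["p"] and e["q"]
B = lambda e: (not e["q"]) and e["r"]      # assumed polarity of q in B
names, shared = ("p", "q", "r"), {"q"}

good = is_interpolant(A, B, lambda e: e["q"], names, shared)
bad = is_interpolant(A, B, lambda e: e["p"], names, shared)
```

The candidate q passes all three checks, while p fails: it is implied by A but is consistent with B and mentions a variable local to A.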


Interpolation-based MC

• Interpolation gives us
  – SAT-based algorithm for over-approximate image computation, using interpolation
  – SAT-only symbolic model checking


Reachability

Is there a path from I to F satisfying transition constraint C?

$R_1 = I \cup \mathrm{Img}(I, C)$
$R_2 = R_1 \cup \mathrm{Img}(R_1, C)$
...

[Figure: growing sets I, R1, R2, ..., R, and the set F]

R is the "strongest inductive invariant"
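The iteration toward R is an ordinary least fixed point. A minimal sketch, with explicit sets of states standing in for the symbolic formulas; the transition table `SUCC` and the choice of I and F are made up for illustration.

```python
def reachable(init, image):
    """Iterate R := R u Img(R) until nothing new is added."""
    reached = set(init)
    while True:
        new = image(reached) - reached
        if not new:
            return reached      # fixed point: the strongest inductive invariant
        reached |= new

# Hypothetical transition relation given as a successor table.
SUCC = {0: {1}, 1: {2}, 2: {0}, 3: {4}, 4: {4}}

def img(states):
    return set().union(*(SUCC[s] for s in states))

R = reachable({0}, img)
path_to_F = bool(R & {4})   # F = {4}: is a bad state reachable?
```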


Adequate image

[Figure: sets P and F; regions labeled "Reached from P" and "Can reach F"; the exact image Img(P,C) and the approximate image Img'(P,C)]

But how do you get an adequate Img'?


k-adequate image operator

• Image operator is k-adequate (w.r.t. F) when
  – if P cannot reach F, image of P cannot reach F within k steps

• Note, if k > diameter, then k-adequate is equivalent to adequate.


Interpolation-based image

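• Idea -- use unfolding to enforce k-adequacy

$A = P_{-1} \wedge C_{-1}$

$B = C_0 \wedge C_1 \wedge \dots \wedge C_{k-1} \wedge F_k$

[Figure: P, a chain of copies of C, then F; A covers the part before t=0 and B the part from t=0 to t=k]

Let Image of P = A', where A' is an interpolant for (A,B)...

Img' is k-adequate!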


Intuition

• A' tells us everything the SAT solver deduced about the image of P in proving it can't reach F in k steps.

• Hence, A' is in some sense an abstraction of the image relative to the property.

[Figure: the same unfolding from P to F, split into A and B at t=0, with A' marked]


Reachability algorithm

• Increase k until invariant obtained proves the property.

• Eventually k > d, the diameter, in which case image operator is adequate, hence we terminate.

Notes:
  – don't need to know when k > d in order to terminate
  – often termination occurs with k << d
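One way the outer loop could look, as a sketch under stated assumptions. The slides do not spell out the inner iteration, so the restart rule and the fixed-point test below are assumptions. Sets of explicit states stand in for formulas, and `approx_image` is a hypothetical callback playing the role of the SAT call plus interpolation; the stand-in used in the example returns the coarsest k-adequate over-approximation instead of a real interpolant.

```python
def interpolation_style_reachability(init, bad, approx_image, max_k=20):
    """True if `bad` is proved unreachable, False if it is reachable.

    approx_image(R, k) is hypothetical: it returns None when R can reach
    `bad` within k+1 steps, else an over-approximation of the image of R
    that cannot reach `bad` within k steps.
    """
    if init & bad:
        return False
    for k in range(max_k + 1):
        reached = set(init)
        while True:
            image = approx_image(reached, k)
            if image is None:
                if reached == init:
                    return False   # a real path from the initial states
                break              # possibly spurious: increase k and restart
            if image <= reached:
                return True        # fixed point: inductive invariant found
            reached |= image
    raise RuntimeError("no answer up to max_k")

# Explicit-state stand-in for the SAT + interpolation step (made-up system).
SUCC = {0: {1}, 1: {0}, 2: {3}, 3: {4}, 4: {5}, 5: {5}}
BAD = {5}

def explicit_approx_image(R, k):
    near = set(BAD)                        # states reaching BAD within k steps
    for _ in range(k):
        near |= {s for s, nxt in SUCC.items() if nxt & near}
    image = set().union(*(SUCC[s] for s in R))
    if image & near:
        return None
    return set(SUCC) - near                # coarsest k-adequate approximation

proved = interpolation_style_reachability({0}, BAD, explicit_approx_image)
refuted = interpolation_style_reachability({2}, BAD, explicit_approx_image)
```

In this toy system the proof for the initial state 0 only succeeds once k is large enough for the over-approximation to exclude every state on the chain leading to the bad state.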


PicoJava II Benchmarks

[Scatter plot, logarithmic axes from 0.01 to 1000: run time of the interpolation-based method (s) against proof-based abstraction (s)]

Reason: terminates for smaller k value


Interpolation-based MC

• Fully SAT-based.
• Exploits SAT solver's ability to concentrate on facts relevant to a property.
• Like PBA, most effective when
  – Very large set of facts is available
  – Only a small subset are relevant to property
• For true properties, appears to converge for smaller k values than PBA
  – Very important, because SAT-based BMC performance degrades rapidly with k.


Conclusions

• SAT solvers are very effective at ignoring irrelevant facts
  – Can think of decision heuristic as a form of CBA

• Solver’s “working set” is in effect a localization

• For MC applications, SAT solver performance is tied to number of relevant variables
  – Performs well if there is a small UNSAT "core"
  – Performs badly when all variables relevant.

Challenge: SAT solvers that are efficient for non-localizable instances!


CCKSVW approach (FMCAD02)

• Find the shortest prefix of Cex A' that cannot be extended.

• That is, $A' \wedge I_0 \wedge U_k \wedge F_k$ is feasible for all k < i, but not for k=i.

[Figure: states s0, s1, s2, ..., si-1, si; the prefix is OK up to si-1 and fails ("NO!") at si]


CCKSVW approach cont.

• Let P be a refutation of $A' \wedge I_0 \wedge U_i \wedge F_i$

• Let E be set of constraints used in proof P only on state si-1:

$E = \{\, c \in C \mid c_{i-2} \text{ occurs in } P \,\}$

[Figure: states s0, s1, s2, ..., si-1, si; OK up to si-1, "NO!" at si; add the constraints used at si-1]