SAT/SMT Solvers and Their Applications - IIT Bombay · 2017. 3. 2. · CFDVS 2017 Ashutosh Gupta...
cbna CFDVS 2017 Ashutosh Gupta TIFR, India 1
SAT/SMT Solvers and Their Applications
Ashutosh Gupta
TIFR, India
Compile date: 2017-03-01
Logic is the backbone of formal methods
Differential equations are the calculus of electrical engineering.
Logic is the calculus of formal methods.
Logic provides tools to define/manipulate computational objects
Topic 1.1
SAT problem
Example: SAT problem
Let x , y be rational variables.
Choose a value of x and y such that the following formula holds true.
x + y = 3
We say {x ↦ 1, y ↦ 2} ⊨ x + y = 3
Commentary: We are not calling x and y rational numbers. They are not numbers. They are symbols that can hold numbers.
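The satisfaction relation above can be checked mechanically. The following is a minimal sketch, assuming the formula is written as a Python predicate and the assignment as a dictionary of rationals; the names `satisfies`, `model` and `non_model` are illustrative only.

```python
from fractions import Fraction


def satisfies(model, formula):
    """Return True if the assignment `model` makes `formula` true."""
    return formula(**model)


# The formula x + y = 3, written as a predicate over rational values.
formula = lambda x, y: x + y == 3

model = {"x": Fraction(1), "y": Fraction(2)}
non_model = {"x": Fraction(1, 2), "y": Fraction(1, 2)}

print(satisfies(model, formula))      # True
print(satisfies(non_model, formula))  # False
```

Checking a given assignment is easy; the SAT problem is to find one.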
Example: SAT problem (contd.)
Let x , y be rational variables.
Choose a value of x and y such that the following formula holds true.
x + y = 3 ∧ y > 10 ∧ x > 0    (theory formulas: easy)
x + y = 3 ∧ y > 10 ∧ (x > 0 ∨ x < −4)    (quantifier-free: hard)
∀y . x + y = 3 ∧ y > 10 ∧ (x > 0 ∨ x < −4)    (quantified formulas: impossible)
Commentary: The above are increasingly harder classes of satisfiability problems.
Solvers for Quantifier-free formulas
We will look at satisfiability solvers for the quantifier-free formulas that consist of
- Theory atoms
- Boolean structure
Example 1.1
x + y = 3 ∧ y > 10 ∧ (x > 0 ∨ x < −4)
Theory atoms: x + y = 3, y > 10, x > 0, x < −4
Boolean structure: the connectives ∧ and ∨ that combine the atoms
A comment on theories
Theory is a technical name for the subject of interest.
- Rationals
- Integers
- Reals
- Floats
- Arrays
- Chairs
- Cartoons
Let us stick to rational/integer arithmetic in this talk.
Theory is a very general concept.
Propositional formulas
Propositional formulas are a special case, where the theory atoms are Boolean variables.
Example 1.2
Let p1, p2, p3 be Boolean variables.
p1 ∧ ¬p2 ∧ (p3 ∨ p2)
A satisfying assignment:
{p1 ↦ 1, p2 ↦ 0, p3 ↦ 1} ⊨ p1 ∧ ¬p2 ∧ (p3 ∨ p2)
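For a formula this small, satisfying assignments can be found by trying all 2^3 assignments. The sketch below does exactly that; it is only an illustration (the helper `models` is a made-up name), not how a SAT solver works, since enumeration is exponential in the number of variables.

```python
from itertools import product


def models(formula, names):
    """Enumerate all assignments over `names` that satisfy `formula`."""
    for bits in product([0, 1], repeat=len(names)):
        m = dict(zip(names, bits))
        if formula(m):
            yield m


# p1 ∧ ¬p2 ∧ (p3 ∨ p2)
F = lambda m: m["p1"] and not m["p2"] and (m["p3"] or m["p2"])

print(list(models(F, ["p1", "p2", "p3"])))  # [{'p1': 1, 'p2': 0, 'p3': 1}]
```

The assignment from the example turns out to be the only model of the formula.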
A bit of jargon
- Solvers for quantifier-free propositional formulas are called SAT solvers.
- Solvers for quantifier-free formulas with the other theories are called SMT solvers.
SMT = satisfiability modulo theory
Topic 1.2
SAT problems are everywhere
SAT problems
Every field of science and technology encounters the SAT problem for quantifier-free formulas.
A few are listed here:
- Hardware verification and design assistance. Almost all hardware/EDA companies have their own SAT solver.
- Planning: many resource allocation problems are convertible to SAT
- Security: analysis of crypto algorithms
- Solving hard problems, e.g., the travelling salesman problem
- Sampling/counting
Example: Solving Sudoku using SAT solvers
Example 1.3
- Variables: $v_{i,j,k} \in \mathbb{B}$ with $i, j, k \in \{1, \dots, 9\}$
- If $v_{i,j,k} = 1$, the cell at column $i$ and row $j$ contains $k$.
- Value in each cell is valid: $\sum_{k=1}^{9} v_{i,j,k} = 1$ for $i, j \in \{1, \dots, 9\}$
- Each value used exactly once in each row: $\sum_{i=1}^{9} v_{i,j,k} = 1$ for $j, k \in \{1, \dots, 9\}$
- Each value used exactly once in each column: $\sum_{j=1}^{9} v_{i,j,k} = 1$ for $i, k \in \{1, \dots, 9\}$
- Each value used exactly once in each 3 × 3 grid: $\sum_{s=1}^{3} \sum_{r=1}^{3} v_{3i+r,\,3j+s,\,k} = 1$ for $i, j \in \{0, 1, 2\}$, $k \in \{1, \dots, 9\}$
Encoding x1 + ... + xk = 1
- At least one of the xi is true:
(x1 ∨ ... ∨ xk)
- No two xi are true at the same time:
(¬xi ∨ ¬xj) for i, j ∈ {1, ..., k}, i < j
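The constraint x1 + ... + xk = 1 can be turned into clauses mechanically. The sketch below assumes a DIMACS-style representation that is not part of the slides: variables are positive integers and a negative integer denotes a negated variable. The helper names `exactly_one` and `holds` are illustrative.

```python
from itertools import combinations, product


def exactly_one(variables):
    """CNF clauses forcing exactly one of `variables` to be true."""
    clauses = [list(variables)]                                   # at least one
    clauses += [[-a, -b] for a, b in combinations(variables, 2)]  # at most one
    return clauses


def holds(clauses, assignment):
    """Check a total assignment (variable -> bool) against the clauses."""
    return all(any(assignment[abs(l)] == (l > 0) for l in c) for c in clauses)


xs = [1, 2, 3, 4]
cnf = exactly_one(xs)
models = [bits for bits in product([False, True], repeat=len(xs))
          if holds(cnf, dict(zip(xs, bits)))]

print(len(cnf))     # 7 clauses: 1 + C(4, 2)
print(len(models))  # 4 models, one per variable
```

This pairwise encoding needs a quadratic number of clauses, which is fine for k = 9 as in Sudoku.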
SMT problem in bug detection
Example 1.4
Consider program
foo(x,y) {
u=x+y;
if (u!=1)
z=2;
else
z=u+1;
u = y/z;//avoid divide by 0
return u;
}
The following formula in quantifier-free linear integer arithmetic encodes the program behaviors:
u = x + y ∧ (u ≠ 1 ∧ z = 2 ∨ u = 1 ∧ z = u + 1) ∧ z = 0
If the above formula is sat, the program has a bug.
Detailed presentation will be given on Tuesday
Topic 1.3
Rise of Solvers
Rise of SAT/SMT solvers
SAT solving is theoretically known to be a hard problem.
However, this did not stop researchers from attempting to build practical solvers.
- In the early 2000s, stable and scalable SAT/SMT solvers started appearing, e.g., zChaff, Yices
- SAT/SMT competitions became a driving force in their ever increasing efficiency
- The formal methods community quickly realized their potential
- Z3, one of the leading SMT solvers, alone has 3000+ citations (375 per year) (June 2016)
Efficiency of SAT solvers over the years
Source: http://satsmt2014.forsyte.at/files/2014/07/SAT-introduction.pdf
Cactus plot:
Y-axis: time out
X-axis: number of solved problems
Color: a competing solver
SAT technology: quiet revolution
Impact is enormous.
Probably one of the greatest achievements of the first decade of this century
All verification tools depend on the solvers.
Topic 1.4
SAT solver
Some terminology
- Propositional variables are also referred to as atoms.
- A literal is either an atom or its negation.
- A clause is a disjunction of literals.
- A formula is in CNF if it is a conjunction of clauses.
Example 1.5
- p is an atom but ¬p is not.
- ¬p and p both are literals.
- p ∨ ¬p ∨ p ∨ q is a clause.
- ¬p and p both are in CNF.
- (p ∨ ¬q) ∧ (r ∨ ¬q) ∧ ¬r is in CNF.
- (p ∨ ¬q) ∧ ((r ∧ ¬p) ∨ ¬q) ∧ ¬r is not in CNF.
Definition 1.1
Let atoms(F) denote the set of atoms appearing in F.
Partial model
Definition 1.2
For a CNF F, a partial model m is an ordered partial map from atoms(F) to B.
Example 1.6
Partial models m1 = {x ↦ 0, y ↦ 1} and m2 = {y ↦ 1, x ↦ 0} are not the same.
Some notation
Before presenting the solvers, let us define some notation.
Under partial model m,
A literal ℓ is true if m(ℓ) = 1 and ℓ is false if m(ℓ) = 0. Otherwise, ℓ is undefined.
A clause C is true if there is ℓ ∈ C s.t. ℓ is true, and C is false if for each ℓ ∈ C, ℓ is false. Otherwise, C is undefined.
CNF F is true if for each C ∈ F, C is true, and F is false if there is C ∈ F s.t. C is false. Otherwise, F is undefined.
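These three-valued definitions translate almost literally into code. The sketch below assumes a representation that is not in the slides: a clause is a list of signed integers, a partial model is a dictionary from variables to booleans, and `None` stands for undefined.

```python
def lit_value(m, lit):
    """Value of a literal under partial model m: True, False or None."""
    v = m.get(abs(lit))
    if v is None:
        return None
    return v if lit > 0 else not v


def clause_value(m, clause):
    vals = [lit_value(m, l) for l in clause]
    if any(v is True for v in vals):
        return True       # some literal is true
    if all(v is False for v in vals):
        return False      # every literal is false
    return None           # undefined


def cnf_value(m, cnf):
    vals = [clause_value(m, c) for c in cnf]
    if any(v is False for v in vals):
        return False      # some clause is false
    if all(v is True for v in vals):
        return True       # every clause is true
    return None           # undefined


# (p ∨ ¬q) ∧ (r ∨ ¬q) ∧ ¬r with p = 1, q = 2, r = 3
F = [[1, -2], [3, -2], [-3]]
print(cnf_value({3: False}, F))            # None
print(cnf_value({3: False, 2: True}, F))   # False
print(cnf_value({2: False, 3: False}, F))  # True
```

Python dictionaries keep insertion order, which matches the requirement that a partial model is an ordered map.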
Unit clause and unit literal
Definition 1.3
C is a unit clause under m if a literal ℓ ∈ C is undefined and the rest are false. ℓ is called the unit literal.
DPLL (Davis-Putnam-Logemann-Loveland)
Algorithm 1.1: DPLL(F, m)
Input: CNF F, partial model m
if F is true under m then
    return sat
if F is false under m then
    return unsat        // backtracking at conflict
if ∃ unit literal x under m then
    return DPLL(F, m[x ↦ 1])
if ∃ unit literal ¬x under m then
    return DPLL(F, m[x ↦ 0])
Choose an undefined x;
if DPLL(F, m[x ↦ 0]) == sat then
    return sat
else
    return DPLL(F, m[x ↦ 1])
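The recursive procedure can be sketched in a few lines of Python. This is a plain illustration under assumptions that are not in the slides: clauses are lists of signed integers, the partial model is a dictionary, and the function returns the model itself instead of sat. It rescans all clauses at every call, which real solvers avoid.

```python
def lit_value(m, lit):
    """True/False/None value of a signed-int literal under partial model m."""
    v = m.get(abs(lit))
    return None if v is None else (v if lit > 0 else not v)


def dpll(cnf, m=None):
    """Return a (possibly partial) model making every clause true, or None."""
    m = dict(m or {})
    unit, all_true = None, True
    for clause in cnf:
        vals = [lit_value(m, l) for l in clause]
        if any(v is True for v in vals):
            continue                          # clause is true
        all_true = False
        undefined = [l for l, v in zip(clause, vals) if v is None]
        if not undefined:
            return None                       # F is false under m: backtrack
        if len(undefined) == 1 and unit is None:
            unit = undefined[0]               # remember one unit literal
    if all_true:
        return m                              # F is true under m
    if unit is not None:                      # unit propagation
        return dpll(cnf, {**m, abs(unit): unit > 0})
    x = next(abs(l) for c in cnf for l in c if abs(l) not in m)
    result = dpll(cnf, {**m, x: False})       # try x -> 0 first
    return result if result is not None else dpll(cnf, {**m, x: True})


# A small clause set over p1..p6, written with signed integers.
cnf = [[-1, 2], [-1, 3, 5], [-2, 4], [-3, -4],
       [1, 5, -2], [2, 3], [2, -3], [6, -5]]
model = dpll(cnf)
print(model is not None)        # True: the clause set is satisfiable
print(dpll([[1], [-1]]))        # None: p ∧ ¬p is unsatisfiable
```

Backtracking is implicit here: a failed recursive call returns `None` and the caller tries the other value.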
Example: Branching and backtracking in DPLL
Example 1.7
c1 = (¬p1 ∨ p2)
c2 = (¬p1 ∨ p3 ∨ p5)
c3 = (¬p2 ∨ p4)
c4 = (¬p3 ∨ ¬p4)
c5 = (p1 ∨ p5 ∨ ¬p2)
c6 = (p2 ∨ p3)
c7 = (p2 ∨ ¬p3)
c8 = (p6 ∨ ¬p5)
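[Figure: DPLL run drawn as a tree. Decide p6 = 0; c8 propagates p5 = 0; decide p1 = 1; c2 propagates p3 = 1, c1 propagates p2 = 1, c3 propagates p4 = 1; c4 then forces p3 = 0, which is a conflict. Backtrack to the last decision and continue with p1 = 0. Legend: decision variable, propagated variable.]
Exercise 1.1
Complete the DPLL run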
Optimizations
There are various optimizations in implementing DPLL
We will discuss only four optimizations.
- clause learning
- 2-watched literals
- variable ordering
- restarts
Topic 1.5
Clause learning
Clause learning
As we decide and propagate, we may construct a data structure that allows us to do efficient backtracking.
Definition 1.4 (implication graph)
An implication graph is a labeled directed graph (N, E), where
- N contains true literals and a conflict node to denote contradiction
- E = {(ℓ1, ℓ2) | ¬ℓ1 ∈ clause(ℓ2)}
  clause(ℓ) ≜ the clause due to which unit propagation made ℓ true
  Note: For decision literals, clause(ℓ) is undefined.
Note: Not the same definition as defined for 2-SAT!
We also annotate each node with its decision level (e.g., ¬p@3), i.e., the number of decisions after which the variable was assigned.
Example: implication graph
Example 1.8
c1 = (¬p1 ∨ p2)
c2 = (¬p1 ∨ p3 ∨ p5)
c3 = (¬p2 ∨ p4)
c4 = (¬p3 ∨ ¬p4)
c5 = (p1 ∨ p5 ∨ ¬p2)
c6 = (p2 ∨ p3)
c7 = (p2 ∨ ¬p3 ∨ p7)
c8 = (p6 ∨ ¬p5)
Note: Modified example
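[Figure: DPLL run. Decide p6 = 0; c8 propagates p5 = 0; decide p7 = 0; decide p1 = 1; c2 propagates p3 = 1, c1 propagates p2 = 1, c3 propagates p4 = 1; c4 then forces p3 = 0, which is a conflict.]
Implication graph
Nodes: ¬p6@1, ¬p5@1, ¬p7@2, p1@3, p2@3, p3@3, p4@3, conflict
Edges (labeled by clause): ¬p6 → ¬p5 (c8), ¬p5 → p3 (c2), p1 → p3 (c2), p1 → p2 (c1), p2 → p4 (c3), p3 → conflict (c4), p4 → conflict (c4)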
Conflict clause
In the case of conflict, we traverse the implication graph backwards to find the set of decisions that caused the conflict.
The clause of the negations of these decisions is called the conflict clause.
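Example 1.9
Implication graph: ¬p6@1 → ¬p5@1 (c8); ¬p5@1 → p3@3 (c2); p1@3 → p3@3 (c2); p1@3 → p2@3 (c1); p2@3 → p4@3 (c3); p3@3 → conflict (c4); p4@3 → conflict (c4). The node ¬p7@2 has no edges.
Conflict clause: p6 ∨ ¬p1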
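The backward traversal can be sketched as follows. The representation is an assumption made for illustration: literals are signed integers, and the dictionary `reason` plays the role of clause(ℓ), with `None` for decision literals. The function name `conflict_clause` is hypothetical.

```python
def conflict_clause(reason, conflict):
    """Walk the implication graph backwards from the falsified clause and
    return the negations of the decision literals that were reached."""
    decisions, seen = set(), set()
    stack = [-l for l in conflict]          # true literals that falsify it
    while stack:
        lit = stack.pop()
        if lit in seen:
            continue
        seen.add(lit)
        if reason.get(lit) is None:         # decision: clause(lit) undefined
            decisions.add(lit)
        else:                               # follow the incoming edges
            stack.extend(-l for l in reason[lit] if l != lit)
    return sorted(-d for d in decisions)


c1, c2, c3, c4, c8 = [-1, 2], [-1, 3, 5], [-2, 4], [-3, -4], [6, -5]
# Decisions: ¬p6, ¬p7, p1. Propagated: ¬p5 (c8), p2 (c1), p3 (c2), p4 (c3).
reason = {-6: None, -5: c8, -7: None, 1: None, 2: c1, 3: c2, 4: c3}

print(conflict_clause(reason, c4))  # [-1, 6], i.e. p6 ∨ ¬p1
```

The decision ¬p7 is not reachable from the conflict, so it does not appear in the learned clause.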
Clause learning
Clause learning heuristics
- add the conflict clause to the input clauses, and
- backtrack to the second last conflicting decision, and proceed like DPLL
Benefit of adding conflict clauses
1. Prunes away search space
2. Records past work of the SAT solver
3. Enables many other heuristics without much complication.
We will see them shortly.
Example 1.10
In the previous example, we made decisions: m(p6) = 0, m(p7) = 0, and m(p1) = 1
We learned a conflict clause: p6 ∨ ¬p1
Adding this clause to the input clauses results in
1. m(p6) = 0, m(p7) = 1, and m(p1) = 1 will never be tried
2. m(p6) = 0 and m(p1) = 1 will never occur simultaneously.
The impact of clause learning was so profound that some people call the optimized algorithm CDCL (conflict driven clause learning) instead of DPLL.
CDCL as an algorithm
Algorithm 1.2: CDCL
Input: CNF F
AddClauses(F); m := UnitPropagation(); dl := 0; dstack := λx.0;
do
    // backtracking
    while ∃x. {x ↦ 0, x ↦ 1} ⊆ m do
        if dl = 0 then return unsat;
        (C, dl) := AnalyzeConflict(m);
        m.resize(dstack(dl)); AddClauses({C}); m := UnitPropagation();
    // Boolean decision
    if m is partial then
        dstack(dl) := m.size();
        dl := dl + 1; m := Decide(); m := UnitPropagation();
while m is partial or ∃x. {x ↦ 0, x ↦ 1} ⊆ m;
return sat
- AddClauses(Cs) - adds Cs to the current set of problem clauses
- UnitPropagation() - applies unit propagation and extends m as much as possible
- Decide() - chooses an undefined variable in m and assigns a Boolean value
- AnalyzeConflict() - returns a conflict clause learned using the implication graph and a decision level for backtracking
Note: dl stands for decision level; dstack records history for backtracking.
Topic 1.6
Other heuristics
Other heuristics
Now we will discuss the other heuristics that may improve the performance of SAT solvers
- 2-watched literals
- pure literals
- variable ordering
- restarts
- learned clause deletion
- cache-aware implementation
Commentary: Clause learning is an algorithmic change. The above optimizations are clever data structures and implementations.
2-watched literals
This data structure optimizes unit clause propagation.
Observation: To decide if a clause is ready for unit propagation, we need to look at only two literals that are not false.
For each clause we choose two literals and we call them watched literals.
In a clause,
- if the watched literals are non-false, the clause is not a unit clause
- if any of the two becomes false, we look for another two non-false literals
- if we cannot find another two, the clause is a unit clause
Exercise 1.2
Why may this scheme optimize CDCL?
Example: 2-watched literals
Example 1.11
Consider clause p1 ∨ p2 ∨ ¬p3 ∨ ¬p4 in a formula among other variables and clauses. Let us suppose initially we watch p1 and p2 in the clause.
* ≜ watched literal. ○ ≜ no work to be done!
Initially:        p1* ∨ p2* ∨ ¬p3 ∨ ¬p4    m = {}
...
Assign p1 = 0:    p1 ∨ p2* ∨ ¬p3* ∨ ¬p4    m = {..., p1 ↦ 0}
Assign p2 = 1:    p1 ∨ p2* ∨ ¬p3* ∨ ¬p4    m = {..., p1 ↦ 0, p2 ↦ 1}    ○
Backtrack to p1:  p1 ∨ p2* ∨ ¬p3* ∨ ¬p4    m = {...}    ○
Assign p4 = 1:    p1 ∨ p2* ∨ ¬p3* ∨ ¬p4    m = {..., p4 ↦ 1}    ○
The benefit: often no work to be done!
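The run above can be replayed with a small sketch of a single watched clause. Everything here is an illustrative assumption: literals are signed integers, `WatchedClause` and `assign` are made-up names, and a real solver would keep per-literal watch lists over all clauses instead of testing one clause directly.

```python
class WatchedClause:
    """A clause (list of signed ints) with two watched positions."""

    def __init__(self, lits):
        self.lits = list(lits)
        self.watch = [0, 1]              # indices of the watched literals

    def on_false(self, lit, is_false):
        """Handle watched literal `lit` becoming false.

        Returns "ok" if the watch moved to another non-false literal,
        ("unit", l) if l is the only non-false literal left, or "conflict".
        """
        w = self.watch.index(self.lits.index(lit))
        for i, l in enumerate(self.lits):
            if i not in self.watch and not is_false(l):
                self.watch[w] = i
                return "ok"
        other = self.lits[self.watch[1 - w]]
        return "conflict" if is_false(other) else ("unit", other)


m = {}                                   # partial model: variable -> bool
clause = WatchedClause([1, 2, -3, -4])   # p1 ∨ p2 ∨ ¬p3 ∨ ¬p4
visits = []                              # assignments that touched the clause


def is_false(l):
    return abs(l) in m and m[abs(l)] != (l > 0)


def assign(var, value):
    m[var] = value
    false_lit = -var if value else var   # the literal this assignment falsifies
    if false_lit in clause.lits and clause.lits.index(false_lit) in clause.watch:
        visits.append(var)
        return clause.on_false(false_lit, is_false)
    return "no work"


print(assign(1, False))   # ok: the watch moves from p1 to ¬p3
print(assign(2, True))    # no work
m.clear()                 # backtracking leaves the watches untouched
print(assign(4, True))    # no work: ¬p4 is false but not watched
print(visits)             # [1]
```

Only one of the three assignments visits the clause, and backtracking requires no bookkeeping at all.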
Topic 1.7
SMT solver
SMT solver
We will now solve quantifier-free formulas in some theory.
Example 1.12
- f(x) ≈ g(h(x, y)) is a formula in QF_EUF.
- x > 0 ∨ y + x ≈ 3.5z is a formula in QF_LRA.
CDCL(T )
CDCL solves (i.e., checks satisfiability of) quantifier-free propositional formulas.
CDCL(T) solves quantifier-free formulas in theory T,
- separates the Boolean and theory reasoning,
- proceeds like CDCL, and
- needs support of a T-solver DP_T, i.e., a decision procedure for conjunctions of literals of T
The tools that are built using CDCL(T) are called satisfiability modulo theory solvers (SMT solvers).
Boolean encoder
For a formula F, let Boolean encoder e be a partial map from atoms(F) to fresh Boolean variables.
For a term t, let e(t) denote the term obtained by replacing each atom a by e(a) if e(a) is defined.
Example 1.13
Let F = x < 2 ∨ (y > 0 ∨ x ≥ 2)
and e = {x < 2 ↦ x1, y > 0 ↦ x2}
e(F) = x1 ∨ (x2 ∨ ¬x1)
Definition 1.5
For a partial model m of e, let
e⁻¹(m) ≜ {e⁻¹(x) | x ↦ 1 ∈ m} ∪ {¬e⁻¹(x) | x ↦ 0 ∈ m}
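The encoder and its inverse are just a pair of lookup tables. The sketch below uses a toy representation chosen for illustration only: atoms are strings, a literal is a pair (atom, polarity), and x ≥ 2 is read as the negation of the atom x < 2.

```python
# Boolean encoder e and its inverse, as dictionaries.
e = {"x < 2": "x1", "y > 0": "x2"}
e_inv = {b: a for a, b in e.items()}


def encode(clause):
    """Map a clause of theory literals to a clause of Boolean literals."""
    return [(e[atom], polarity) for atom, polarity in clause]


def decode(m):
    """e^{-1}(m): the theory literals asserted by Boolean partial model m."""
    return [(e_inv[x], value) for x, value in m.items()]


# F = x < 2 ∨ (y > 0 ∨ x ≥ 2), flattened into one clause
F = [("x < 2", True), ("y > 0", True), ("x < 2", False)]

print(encode(F))
# [('x1', True), ('x2', True), ('x1', False)]
print(decode({"x1": False, "x2": True}))
# [('x < 2', False), ('y > 0', True)]
```

The decoded list is the conjunction of theory literals that a T-solver would be asked to check.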
CDCL(T)
Algorithm 1.3: CDCL(T)
Input: CNF F, Boolean encoder e
AddClauses(e(F)); m := UnitPropagation(); dl := 0; dstack := λx.0;
do
    // backtracking
    while ∃x. {x ↦ 0, x ↦ 1} ⊆ m do
        if dl = 0 then return unsat;
        (C, dl) := AnalyzeConflict(m);    // clause learning
        m.resize(dstack(dl)); AddClauses({C}); m := UnitPropagation();
    // Boolean decision
    if m is partial then
        dstack(dl) := m.size();
        dl := dl + 1; m := Decide(); m := UnitPropagation();
    // Theory propagation
    if ∀x. {x ↦ 0, x ↦ 1} ⊈ m then
        (Cs, dl′) := TheoryDeduction(⋀ e⁻¹(m));
        if dl′ < dl then { dl = dl′; m.resize(dstack(dl)); };
        AddClauses(e(Cs)); m := UnitPropagation();
while m is partial or ∃x. {x ↦ 0, x ↦ 1} ⊆ m;
return sat
Note: dl stands for decision level; dstack records history for backtracking; TheoryDeduction returns a clause set and a decision level.
Theory propagation
TheoryDeduction looks at the atoms assigned so far and checks
- if they are mutually unsatisfiable
- if not, whether there are other literals from F that are implied by the current assignment
Any implementation must comply with the following goals
- Correctness: Boolean model is consistent with T
- Termination: unsat partial models are never repeated
TheoryDeduction
TheoryDeduction solves a conjunction of literals and returns a set of clauses and a decision level.
(Cs, dl′) := TheoryDeduction(⋀ e⁻¹(m))
Cs may contain clauses of the form
(⋀ L) ⇒ ℓ
where ℓ ∈ lits(F) ∪ {⊥} and L ⊆ e⁻¹(m).
Note: The RHS need not be a single literal.
Requirements from TheoryDeduction
The output of TheoryDeduction must satisfy the following conditions
- If ⋀ e⁻¹(m) is unsat in T, then Cs must contain a clause with ℓ = ⊥.
- If ⋀ e⁻¹(m) is sat, then dl′ = dl. Otherwise, dl′ is the decision level immediately after which the unsatisfiability occurred (stated precisely shortly).
Example : CDCL(T )
Consider F = (x = y ∨ y = z) ∧ (y ≠ z ∨ z = u) ∧ (z = x)
e(F) = (x1 ∨ x2) ∧ (¬x2 ∨ x3) ∧ x4
After AddClauses(e(F)); m := UnitPropagation()
m = {x4 ↦ 1}
After m := Decide();
m = {x4 ↦ 1, x2 ↦ 0}
After m := UnitPropagation()
m = {x4 ↦ 1, x2 ↦ 0, x1 ↦ 1}
After (Cs, dl′) := TheoryDeduction(x = y ∧ y ≠ z ∧ z = x)
Cs = {x ≠ y ∨ y = z ∨ z ≠ x}, dl′ = 0,
e(Cs) = {¬x1 ∨ x2 ∨ ¬x4}
After AddClauses(e(Cs)); m := UnitPropagation()
m = {x4 ↦ 1, x2 ↦ 0, x1 ↦ 1, x1 ↦ 0} ← conflict
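The theory check in this example only involves equalities and disequalities between variables, which a union-find structure can decide. The sketch below is a stand-in for the T-solver under that assumption; it does not handle function symbols, and the name `eq_conjunction_sat` and the triple representation (lhs, rhs, is_equal) are made up for illustration.

```python
def eq_conjunction_sat(literals):
    """Decide a conjunction of literals (lhs, rhs, is_equal) over variables."""
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    for a, b, is_eq in literals:
        if is_eq:
            parent[find(a)] = find(b)       # merge the two classes
    # Satisfiable iff no disequality relates two members of one class.
    return all(find(a) != find(b) for a, b, is_eq in literals if not is_eq)


# x = y ∧ y ≠ z ∧ z = x
lits = [("x", "y", True), ("y", "z", False), ("z", "x", True)]
print(eq_conjunction_sat(lits))             # False: the conjunction is unsat

# The returned clause negates every literal: x ≠ y ∨ y = z ∨ z ≠ x
blocking = [(a, b, not is_eq) for a, b, is_eq in lits]
print(blocking)
```

Negating the whole conjunction is the simplest possible clause; a smaller unsat core would give a stronger one.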
Topic 1.8
Theory propagation implementation
Theory propagation implementation - Incremental theory solver
Typically, theory propagation is implemented using incremental/online solvers.
Incremental/online solver DP_T
- takes input constraints as a sequence of literals,
- maintains a data structure that defines the solver state and satisfiability of constraints seen so far,
- provides a stack-like interface
  - push(ℓ) - adds literal ℓ to the “constraint store”
  - pop() - removes the last pushed literal from the store
  - checkSat() - checks satisfiability of the current store
  - unsatCore() - returns the set of literals that caused unsatisfiability
Note: We assume that push and pop call checkSat() at the end of their execution. Therefore, explicit calls to checkSat() are not necessary. However, practical tools allow users to choose the policy of calling checkSat() - lazy vs. eager
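The stack-like interface can be illustrated with a deliberately tiny theory. The sketch assumes literals of the form (var, op, c) with op in {'<', '>'} over the rationals, so a store is satisfiable exactly when every variable's largest lower bound is below its smallest upper bound. The class `BoundsSolver` is hypothetical, recomputes its state on every check rather than incrementally, and leaves `check_sat` as an explicit call (the lazy policy).

```python
class BoundsSolver:
    """Toy DP_T for literals (var, op, c) with op in {'<', '>'}."""

    def __init__(self):
        self.store = []                  # the constraint store, used as a stack

    def push(self, lit):
        self.store.append(lit)

    def pop(self):
        return self.store.pop()

    def _conflict(self):
        lower, upper = {}, {}            # tightest bound literal per variable
        for lit in self.store:
            var, op, c = lit
            if op == ">" and (var not in lower or c > lower[var][2]):
                lower[var] = lit
            if op == "<" and (var not in upper or c < upper[var][2]):
                upper[var] = lit
        for var in lower:
            if var in upper and lower[var][2] >= upper[var][2]:
                return [lower[var], upper[var]]
        return None

    def check_sat(self):
        return self._conflict() is None

    def unsat_core(self):
        return self._conflict() or []


s = BoundsSolver()
s.push(("x", ">", 0))
s.push(("y", "<", 5))
s.push(("x", "<", -4))
print(s.check_sat())    # False
print(s.unsat_core())   # [('x', '>', 0), ('x', '<', -4)]
s.pop()
print(s.check_sat())    # True
```

The core omits the literal on y, which is what lets the SAT solver learn a short clause.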
Theory propagation implementation
Algorithm 1.4: TheoryDeduction
Input: Set of literals Ls
Read only input: m partial model, dstack decision depths, dl current decision level
foreach ℓ ∈ Ls do
    DP_T.push(ℓ)
if DP_T.checkSat() == unsat then
    Ls′ := DP_T.unsatCore();    // minimize clause
    dl′ := max{dl″ | ∃ℓ ∈ Ls′, i. dstack(dl″) < i ∧ m[i] = e(ℓ) ↦ _};
    return ({¬⋀Ls′}, dl′)
else
    // implied clauses
    Cs := ∅;
    foreach ℓ ∈ Lits(F) do
        DP_T.push(¬ℓ);
        if DP_T.checkSat() == unsat then
            Ls′ := DP_T.unsatCore();    // ℓ is called implied model and ¬ℓ ∈ Ls′
            Cs := Cs ∪ {¬⋀Ls′};
        DP_T.pop();
    return (Cs, dl)
Topic 1.9
Indian research in SAT/SMT solving
Indian research in SAT/SMT solving
Limited research activity in the field in India.
Solvers for verification tools are like engines for cars.
One must learn to build engines if one wants to build cars.
What should we do?
We need to build an eco-system for the field
- funding for the backend research
- concretely defined projects
- more users/researchers interactions
- support for start-ups in the related areas
- no expectations of finished products from academia
- support promising individuals