Constrained Sampling and Counting: When Practice Drives Theory...

Post on 01-Apr-2021

2 views 0 download

Transcript of Constrained Sampling and Counting: When Practice Drives Theory...

Constrained Sampling and Counting: When Practice Drives Theory

Supratik Chakraborty

IIT Bombay

Joint work with Kuldeep Meel and Moshe Y. Vardi

(Rice University)

Probabilistic Inference

2

Interest

in topic

Trust in

Speaker

Availabili

ty

Attend

Talk

Modeling Attendance for

Today’s Talk

Model

CountingRoth, 1996

How do we infer useful information from the data filled with uncertainty?

+Pr(Attending Talk

|Interest in topic =

True )

Smart Cities

• Alarm system in every house that responds to either burglary or earthquake

• Every alarm system is connected to the central dispatcher (of course, automated!)

• Suppose one of the alarm goes off

• Important to predict whether its earthquake or burglary

3

Deriving Useful Inferences

4

What is the probability of earthquake (𝐸)

given that alarm sounded (𝐴)?

=

Bayes’ rule to the rescue

How do we calculate these

probabilities?

Pr[event|evidence]

Probabilistic Models

Graphical Models

5

Graphical Models

6

B E

A

𝐸 Pr

𝑇 0.1

𝐹 0.9

𝐵 Pr

𝑇 0.8

𝐹 0.2

B E A Pr(A|E,

B)

𝑇 𝑇 𝑇 0.3

𝑇 𝑇 𝐹 0.7

𝑇 𝐹 𝑇 0.4

𝑇 𝐹 𝐹 0.6

𝐹 𝑇 𝑇 0.2

𝐹 𝐹 𝐹 0.8

𝐹 𝐹 𝑇 0.1

Calculating

7

B E

A

𝐸 Pr

𝑇 0.1

𝐹 0.9

𝐵 Pr

𝑇 0.8

𝐹 0.2

B E A Pr(A|E,

B)

𝑇 𝑇 𝑇 0.3

𝑇 𝑇 𝐹 0.7

𝑇 𝐹 𝑇 0.4

𝑇 𝐹 𝐹 0.6

𝐹 𝑇 𝑇 0.2

𝐹 𝐹 𝐹 0.8

𝐹 𝐹 𝑇 0.1

Pr 𝐸 ∩ 𝐴 = Pr 𝐸 ∗ Pr ¬𝐵 ∗ Pr 𝐴 𝐸, ¬𝐵

+Pr 𝐸 ∗ Pr 𝐵 ∗ Pr [𝐴|𝐸, 𝐵]

Calculating

8

B E

A

𝐸 Pr

𝑇 0.1

𝐹 0.9

𝐵 Pr

𝑇 0.8

𝐹 0.2

B E A Pr(A|E,

B)

𝑇 𝑇 𝑇 0.3

𝑇 𝑇 𝐹 0.7

𝑇 𝐹 𝑇 0.4

𝑇 𝐹 𝐹 0.6

𝐹 𝑇 𝑇 0.2

𝐹 𝐹 𝐹 0.8

𝐹 𝐹 𝑇 0.1

Pr 𝐸 ∩ 𝐴 = Pr 𝐸 ∗ Pr 𝐵 ∗ Pr [𝐴|𝐸, 𝐵] + Pr 𝐸 ∗ Pr ¬𝐵 ∗ Pr 𝐴 𝐸, ¬𝐵

Calculating

9

B E

A

𝐸 Pr

𝑇 0.1

𝐹 0.9

𝐵 Pr

𝑇 0.8

𝐹 0.2

B E A Pr(A|E,

B)

𝑇 𝑇 𝑇 0.3

𝑇 𝑇 𝐹 0.7

𝑇 𝐹 𝑇 0.4

𝑇 𝐹 𝐹 0.6

𝐹 𝑇 𝑇 0.2

𝐹 𝐹 𝐹 0.8

𝐹 𝐹 𝑇 0.1

Pr 𝐸 ∩ 𝐴 = Pr 𝐸 ∗ Pr 𝐵 ∗ Pr [𝐴|𝐸, 𝐵] + Pr 𝐸 ∗ Pr ¬𝐵 ∗ Pr 𝐴 𝐸, ¬𝐵

Moving from Probability to Logic

• 𝑋 = 𝐴, 𝐵, 𝐸

• 𝐹 = 𝐸 ∧ 𝐴

• 𝑊 𝐵 = 0 = 0.2, 𝑊 𝐵 = 1 = 1 − 𝑊 𝐵 = 0 = 0.8

• 𝑊 𝐴 = 0 = 0.1, 𝑊 𝐴 = 1 = 0.9

• 𝑊 𝐸 = 0| 𝐴 = 0, 𝐵 = 0 = ⋯

• 𝑊 𝐴 = 1, 𝐸 = 1, 𝐵 = 1 = 𝑊 𝐵 = 1 ∗ 𝑊 𝐸 = 1 ∗ 𝑊(𝐴 = 1|𝐸 = 1, 𝐵 = 1)

• 𝑅𝐹 = (𝐴 = 1, 𝐸 = 1, 𝐵 = 0 , (𝐴 = 1, 𝐸 = 1, 𝐵 = 1)}

• 𝑊 𝐹 = 𝑊 𝐴 = 1, 𝐸 = 1, 𝐵 = 1 + 𝑊(𝐴 = 1, 𝐸 = 1, 𝐵 = 1)

10

𝑾 𝑭 = 𝐏𝐫 [𝑬 ∩ 𝑨]

Weighted Model Count

(WMC)

Probabilistic Inference to WMC to Unweighted Model Counting

B E

A

Pr [𝐸|𝐴]

Weighte

d Model

Countin

gRoth, 1996

11

Weighted Model Counting Unweighted Model Counting

Polynomial time reductions

Model Counting

• Given a SAT formula F

• RF: Set of all solutions of F

• Problem (#SAT): Estimate the number of solutions of F (#F) i.e., what is the cardinality of RF?

• E.g., F = (a v b)

• RF = {(0,1), (1,0), (1,1)}

• The number of solutions (#F) = 3

#P: The class of counting problems for

decision problems in NP!12

How do we guarantee that systems work correctly ?

Functional Verification

• Formal verification

Challenges: formal requirements, scalability

~10-15% of verification effort

• Dynamic verification: dominant approach

13

📱

Dynamic Verification

Design is simulated with test vectors

•Test vectors represent different verification scenarios

Results from simulation compared to intended results

Challenge: Exceedingly large test space!

14

Constrained-Random Simulation

15

a b

c

64 bit

64 bit

64 bit

c = f(a,b)

Sources for Constraints

• Designers:

1. a +64 11 *32 b = 12

2. a <64 (b >> 4)

• Past Experience:

1. 40 <64 34 + a <64 5050

2. 120 <64 b <64 230

• Users:

1. 232 *32 a + b != 1100

2. 1020 <64 (b /64 2) +64 a <64 2200

Problem: How can we uniformly sample the values of a and b

satisfying the above constraints?

Problem Formulation

16

Set of

Constraints

Sample satisfying assignments

uniformly at random

SAT Formula

Scalable Uniform Generation of SAT Witnesses

a b

c

64 bit

64 bit

64 bit

c = f(a,b)

Agenda

Design Scalable Techniques for

Uniform Generation and

Model Counting

with Strong Theoretical Guarantees

17

Agenda

Design Scalable Techniques for

Almost-Uniform Generation and

Approximate-Model Counting

with Strong Theoretical Guarantees

18

Formal Definitions

• 𝐹: CNF Formula; RF ∶ Solution Space of 𝐹

• Input: 𝐹 Output: 𝑦 ∈ 𝑅𝐹

• Uniform Generator:

Guarantee: ∀𝑦 ∈ 𝑅𝐹 , Pr[𝑦 is output] = 1

𝑅𝐹

• Almost-Uniform Generator

Guarantee: ∀𝑦 ∈ 𝑅𝐹 , 1

1+𝜀 𝑅𝐹≤ Pr[𝑦 is output ] ≤

1+𝜀

𝑅𝐹

19

Formal Definitions

• 𝐹: CNF Formula; RF ∶ Solution Space of 𝐹

• Probably Approximately Correct (PAC) Counter

Input: 𝐹 Output: 𝐶

20

Uniform Generation

21

Rich History of Theoretical Work

• Jerrum, Valiant and Vazirani (1986): Uniform Generator: Polynomial time PTM (Probabilistic Turing

Machine) given access to σ 𝑃2 oracle

Almost-Uniform

Generator

PAC

Counter

PTIME

No Practical Algorithms22

Stockmeyer (1983): Deterministic approximate counting in 3rd level of

polynomial hierarchy.

Can be used to design a BPP^NP procedure -- too large NP instances

Rich History of Theoretical Work

• Bellare, Goldreich, and Petrank (2000)

• Uniform Generator: Polynomial time PTM given access to NP oracle

• Employs n-universal hash functions

23

Universal Hashing

• 𝐻 𝑛, 𝑚, 𝑟 : Set of r-universal hash functions from 0,1 𝑛 →0,1 𝑚

∀𝑦1, 𝑦2, ⋯ 𝑦𝑟 (distinct) ∈ 0,1 𝑛 and ∀𝛼1, 𝛼2 ⋯ 𝛼𝑟 ∈ 0,1 𝑚

Pr ℎ 𝑦𝑖 = 𝛼𝑖 = 1

2𝑚

Pr ℎ 𝑦1 = 𝛼1 ∧ ⋯ ∧ ℎ 𝑦𝑟 = 𝛼𝑟 = 2− 𝑚𝑟

• (r-1) degree polynomials → r-universal hash functions

24

(Independence)

(Uniformity)

Concentration Bounds

• t-wise 𝑡 ≥ 4 random variables 𝑋1 , 𝑋2, ⋯ 𝑋𝑛 ∈ 0,1

𝑋 = σ 𝑋𝑖 ; 𝜇 = 𝐸 𝑋

• For t = 2

25

BGP Method

Choose mChoose ℎ ∈ 𝐻 𝑛, 𝑚, 𝑛

• For right choice of m, all the cells are small (# of solutions ≤ 2𝑛2)• Check if all the cells are small (NP- Query)

• If yes, pick a solution randomly from randomly picked cell

In practice, the query is too long and

can not be handled by SAT Solvers!

• Polynomial of degree n-1

• SAT Solvers can not handle

large polynomials!

26

To Recap

• Jerrum, Valiant and Vazirani (1986): Uniform Generator: Polynomial time PTM given access to

σ 𝑃2 oracle

Almost-Uniform Generation is inter-reducible to PAC counting

• Bellare, Goldreich, and Petrank (2000)

• Uniform Generator: Polynomial time PTM given access to NP oracle

Does not work in practice!27

Prior Work

28Performance

Gu

ara

nte

es

MCMC

SAT-

Based

BGP BDD

Desires

Generator Relative runtime

State-of-the-art:

XORSample’

50000

Ideal Uniform

Generator*

10

SAT Solver 1

Experiments over 200+ benchmarks

*: According to EDA experts

29

Our Contribution

30Performance

Gu

ara

nte

es

MCMC

SAT-

Based

BGP BDD

UniGen

Key Ideas

Choose mChoose ℎ ∈ 𝐻 𝑛, 𝑚,∗

• For right choice of m, large number of cells are “small”

• “almost all” the cells are “roughly” equal

• Check if a randomly picked cell is “small”

• If yes, pick a solution randomly from randomly picked cell

31

Key Challenges

• F: Formula X: Set of variables 𝑅𝐹 : Solution space

• 𝑅𝐹,ℎ,𝛼: Set of solutions for 𝐹 ∧ (ℎ 𝑋 = 𝛼) where

ℎ ∈ 𝐻 𝑛, 𝑚,∗ ; 𝛼 ∈ 0,1 𝑚

1. How large is “small” cell ?

2. How much universality do we need?

3. What is the value of m?

32

Size of cell

Pr[ y is output ] = 1

2𝑚 ∗ Pr[Cell is small| y is in the cell] ∗1

𝑆𝑖𝑧𝑒 𝑜𝑓 𝑐𝑒𝑙𝑙

Let Size of cell ∈ [𝑙𝑜𝑇ℎ𝑟𝑒𝑠ℎ, ℎ𝑖𝑇ℎ𝑟𝑒𝑠ℎ], Then:

q

33

Losing Independence

Our desire:

Suppose ℎ ∈ 𝐻 𝑛, 𝑚,∗ 𝑎𝑛𝑑 𝑚 = log𝑅𝐹

𝑝𝑖𝑣𝑜𝑡

Then, 𝐸[ 𝑅𝐹,ℎ,𝛼 ] = |𝑅𝐹|

2𝑚= 𝑝𝑖𝑣𝑜𝑡

Concentration bound k-universal (small constant)

34

How many cells?

• Our desire: 𝑚 = log|𝑅𝐹|

𝑝𝑖𝑣𝑜𝑡

But determining 𝑅𝐹 is expensive (#P complete)

• How about approximation? 𝐴𝑝𝑝𝑟𝑜𝑥𝑀𝐶 𝐹, 휀, 𝛿 returns C:

Pr[ 𝑅𝐹

1+𝜀≤ 𝐶 ≤ 1 + 휀 |𝑅𝐹|] ≥ 1 − 𝛿

𝑞 = log 𝐶 − log 𝑝𝑖𝑣𝑜𝑡

Concentrate on m = q-1, q, q+1

35

UniGen(F,

1. C = ApproxMC(F,휀)

2. Compute pivot, loThresh, hiThresh

3. 𝑞 = log 𝐶 − log 𝑝𝑖𝑣𝑜𝑡

4. for i in {q-1, q, q+1}:

5. Choose h randomly* from H(n,i,3)

6. Choose 𝛼 randomly from 0,1 𝑚

7. If (𝑙𝑜𝑇ℎ𝑟𝑒𝑠ℎ ≤ 𝑅𝐹,ℎ,𝛼 ≤ ℎ𝑖𝑇ℎ𝑟𝑒𝑠ℎ):

8. Pick 𝑦 ∈ 𝑅𝐹,ℎ,𝛼 randomly

36

One time execution

Run for

every sample

required

Are we back to JVV?

NOT Really

•JVV makes linear (in n ) calls to Approximate counter compared to just 1 in UniGen

•# of calls to ApproxMC is only 1 regardless of the number of samples required unlike JVV 37

PAC Counter: ApproxMC(F,

Choose mChoose ℎ ∈ 𝐻 𝑛, 𝑚, 3

• For right choice of m, large number of cells are “small”

• “almost all” the cells are “roughly” equal

• Check if a randomly picked cell is “small”

• If yes, then estimate = # of solutions in cell * 2𝑚

38

ApproxMC(F,

#sols <

pivot

NO

39

ApproxMC(F,

#sols <

pivot

NO

40

ApproxMC(F,

#sols <

pivotYES

Estimate:

# of sols * 2𝑚

41

ApproxMC(F,

Key Lemmas

Let 𝑚∗ = log 𝑅𝐹 − log 𝑝𝑖𝑣𝑜𝑡

Lemma 1: The algorithm terminates with 𝑚 ∈ 𝑚∗ − 1 , 𝑚∗

with high probability

Lemma 2: The estimate from a randomly picked cell for 𝑚 ∈𝑚∗ − 1 , 𝑚∗ is correct with high probability

42

Results: Performance Comparison

43

0

10000

20000

30000

40000

50000

60000

70000

0 10 20 30 40 50 60 70 80 90 100 110 120 130 140 150 160 170 180 190

Tim

e (

secon

ds)

Benchmarks

ApproxMC Cachet

Results: Performance Comparison

44

0

10000

20000

30000

40000

50000

60000

70000

0 10 20 30 40 50 60 70 80 90 100 110 120 130 140 150 160 170 180

ApproxMC

Cachet

Can Solve a Large Class of Problems

45Large class of problems that lie beyond the exact

counters but can be computed by ApproxMC

0

10000

20000

30000

40000

50000

60000

70000

0 10 20 30 40 50 60 70 80 90 100 110 120 130 140 150 160 170 180 190

Tim

e (

seco

nd

s)

Benchmarks

ApproxMC

Cachet

Mean Error: Only 4% (allowed: 75%)

46Mean error: 4% – much smaller than the

theoretical guarantee of 75%

1.0E+00

3.2E+01

1.0E+03

3.3E+04

1.0E+06

3.4E+07

1.1E+09

3.4E+10

1.1E+12

3.5E+13

1.1E+15

3.6E+16

0 10 20 30 40 50 60 70 80 90

Co

un

t

Benchmarks

Cachet*1.75

Cachet/1.75

ApproxMC

Runtime Performance of UniGen

47

1-2 Orders of Magnitude Faster

0.1

1

10

100

1000

10000

100000

case4

7

case_

3_b1

4_3

case1

05

case8

case2

03

case1

45

case6

1

case9

case1

5

case1

40

case_

2_b1

4_1

case_

3_b1

4_1

squ

ari

ng

14

squ

ari

ng

7

case_

2_p

tb_1

case_

1_p

tb_1

case_

2_b1

4_2

case_

3_b1

4_2

Time(s)

Benchmarks

UniGen

XORSample'

48

Results: Uniformity

49• Benchmark: case110.cnf; #var: 287; #clauses: 1263

• Total Runs: 4x106; Total Solutions : 16384

0

50

100

150

200

250

300

350

400

450

500

184 208 228 248 268 288

Frequency

#Solutions

Results: Uniformity

50• Benchmark: case110.cnf; #var: 287; #clauses: 1263

• Total Runs: 4x106; Total Solutions : 16384

0

50

100

150

200

250

300

350

400

450

500

184 208 228 248 268 288

Frequency

#Solutions

US

UniGen

So far

• The first scalable approximate model counter

• The first scalable uniform generator

• Outperforms state-of-the-art generators/counters

Are we done?

51

Where are we?

Generator Relative runtime

State-of-the-art:

XORSample’

50000

UniGen ~5000

Ideal Uniform

Generator*

10

SAT Solver 1

Experiments over 200+ benchmarks

*: According to EDA experts 52

XOR-Based Hashing

• Partition 2n space into 2m cells

• Variables: X1, X2, X3,….., Xn

• Pick every variable with prob. ½ ,XOR them and add 0/1 with prob. ½

• X1+X3+X6+…. Xn-1 + 0

• To construct h: 0,1 𝑛 → 0,1 𝑚, choose m random XORs

• 𝛼 ∈ 0,1 𝑚 → Set every XOR equation to 0 or 1 randomly

• The cell: F ∧ XOR (CNF+XOR)

53

XOR-Based Hashing

• CryptoMiniSAT: Efficient for CNF+XOR

• Avg Length : n/2

• Smaller XORs better performance

How to shorten XOR clauses? 54

Independent Support

• Set I of variables such that assignments to these uniquely determine assignments to rest of variables (for satisfying assignments)

• If agree on I then

• c ⟷ (a V b) ; Independent Support I: {a, b}

• Key Idea: Hash only on the independent variables

55

s1ands2 s1=s2

Independent Support

• Hash only on the Independent Support

• Average size of XOR: n/2 to |I|/2

56

Formal Definition

57

Minimal Unsatisfiable Subset

• Given Ψ = 𝐻1 ∧ 𝐻2 ⋯ 𝐻𝑚

Find subset {𝐻𝑖1 , 𝐻𝑖2 , ⋯ 𝐻𝑖𝑘 } of {𝐻1 , 𝐻2 , ⋯ 𝐻𝑚} such that 𝐻𝑖1 ∧ 𝐻𝑖2 ⋯ 𝐻𝑖𝑘 ∧ Ω is UNSAT

Unsatisfiable subset

Find minimal subset {𝐻𝑖1 , 𝐻𝑖2 , ⋯ 𝐻𝑖𝑘} of {𝐻1 , 𝐻2 , ⋯ 𝐻𝑚} such that 𝐻𝑖1 ∧ 𝐻𝑖2 ⋯ 𝐻𝑖𝑘 is UNSAT

Minimal Unsatisfiable subset

58

Key Idea

59

Key Idea

60

𝐼 = {𝑥𝑖} is Independent Support iff 𝐻𝐼 ∧ Ω is

unsatisfiable where 𝐻𝐼 = 𝐻𝑖 𝑥𝑖 ∈ 𝐼}

Group-Oriented Minimal UnsatisfiableSubset

• Given Ψ = 𝐻1 ∧ 𝐻2 ⋯ 𝐻𝑚 ∧ Ω

Find subset {𝐻𝑖1 , 𝐻𝑖2 , ⋯ 𝐻𝑖𝑘 } of {𝐻1 , 𝐻2 , ⋯ 𝐻𝑚} such that 𝐻𝑖1 ∧𝐻𝑖2 ⋯ 𝐻𝑖𝑘 ∧ Ω is UNSAT

Group Oriented Unsatisfiable subset

Find minimal subset {𝐻𝑖1 , 𝐻𝑖2 , ⋯ 𝐻𝑖𝑘} of {𝐻1 , 𝐻2 , ⋯ 𝐻𝑚} such that 𝐻𝑖1 ∧ 𝐻𝑖2 ⋯ 𝐻𝑖𝑘 ∧ Ω is UNSAT

Group Oriented Minimal Unsatisfiable subset

61

Minimal Independent Support

62

𝐼 = {𝑥𝑖} is minimal Independent Support iff 𝐻𝐼 is

minimal unsatisfiable subset where 𝐻𝐼 = 𝐻𝑖 𝑥𝑖 ∈𝐼}

Key Idea

63

Minimal

Independent

Support (MIS)

Minimal

Unsatisfiable

Subset (MUS)

Impact on Sampling and Counting Techniques

64

MIS

SamplingTools

Counting Tools

FI

What about complexity

• Computation of MUS: 𝐹𝑃𝑁𝑃

• Why solve a 𝐹𝑃𝑁𝑃 for almost-uniform generation/approximate counter (PTIME PTM with NP Oracle)

Settling the debate through practice!

65

Performance Impact on Approximate Model Counting

1.8

18

180

1800

18000

ApproxMC IApproxMC

66

Performance Impact on Uniform Sampling

67

0.018

0.18

1.8

18

180

1800

18000

UniGen UniGen1

Where are we?

Generator Relative runtime

State-of-the-art:

XORSample’

50000

UniGen 5000

UniGen1 470

Ideal Uniform Generator* 10

SAT Solver 1

68

Back to basics

69

# of solutions in “small” cell ∈ 𝑙𝑜𝑇ℎ𝑟𝑒𝑠ℎ, ℎ𝑖𝑇ℎ𝑟𝑒𝑠ℎWe pick one solution

“Wastage” of loThresh solutions

Pick 𝑙𝑜𝑇ℎ𝑟𝑒𝑠ℎ samples!

70

3-Universal hash functions: Choose hash function randomly

For arbitrary distribution on solutions=> All cells are roughly equal in expectation

But:

While each input is hashed uniformly

And each 3-solutions set is hashed independently

A 4-solutions set might not be hashed independently

3-Universal and Independence of Samples

Balancing Independence

For ℎ ∈ 𝐻 𝑛, 𝑚, 3

• Choosing up to 3 samples => Full independence among samples

• Choosing loThresh (>> 3) samples => Loss of independence

71

Why care about Independence

72

🕷🕷

If every sample is independent => Faster convergence

Convergence

requires

multiplication of

probabilities

The principle of principled compromise!

• Choosing up to 3 samples => Full independence among samples

• Choosing loThresh (>> 3) samples => Loss of independence “Almost-Independence” among samples

Still provides strong theoretical guarantees of coverage

73

Strong Guarantees

•Polynomial Constant number of SAT calls per sample

After one call to ApproxMC

74

Bug-finding effectiveness

75

bug frequency f =

Simply put,

#of SAT calls for UniGen2 << # of SAT calls for

UniGen

Bug-finding effectiveness

UniGen UniGen2

Expected

number of SAT

calls

4.35 × 107 3.38 × 106

76

bug frequency f = 1/104

find bug with probability ≥ 1/2

An order of magnitude difference!

~20 times faster than UniGen1

0.01

0.1

1

10

100

1000

s12

38

a_3_2

s11

96

a_3_2

s83

2a

_1

5_7

case_

1_b1

2_2

squ

ari

ng

16

squ

ari

ng

7

dou

bly

Lin

ked

Lis

t

Log

inS

ervic

e2

Sort

20

.sk

enq

ueu

e

Ka

rats

ub

a

lltr

aver

sal

llre

ver

se

dia

gS

ten

cil_

new

tuto

ria

l3

dem

o2

_n

ew

Time

per

sample

(s)

Benchmarks

UniGen2

UniGen1

77

Where are we?

Generator Relative runtime

State-of-the-art:

XORSample’

50000

UniGen 5000

UniGen1 470

UniGen2 20

Ideal Uniform Generator* 10

SAT Solver 1

78

The Final Push….

•UniGen requires one time computation of ApproxMC

•Generation of samples in fully distributed fashion

(Previous algorithms lacked the above property)

•New paradigms!

79

80

Current Paradigm of Simulation-based Verification

Test 2 Test 3

Test 4Test 1

Test Generator

Simulator

SimulatorSimulator

Simulator

• Can not be

parallelized since

test generators

maintain “global

state”

• Loses theoretical

guarantees (if any)

of uniformity

Test Generator

New Paradigm of Simulation-based Verification

Simulator

Simulator

Simulator

Simulator

Test Generator

Test Generator

Test Generator

Preprocessin

g

• Preprocessing needs to be done only once

• No communication required between

different copies of the test generator

• Fully distributed!

81

Closing in…

Generator Relative runtime

State-of-the-art:

XORSample’

50000

UniGen 5000

UniGen1 470

UniGen2 20

Multi-core UniGen2 10 (two cores)

Ideal Uniform Generator* 10

SAT Solver 1 82

So what happened….

83

Sampling and

Counting

Important

Applications

Beautiful Theory

But does not work in

practice

Theoretical

Contributions

(Practice drives

theory)

New Applications

(Theory drives

practice)

Future Directions

84

Extension to More Expressive domains

• Efficient hashing schemes Extending bit-wise XOR to richer constraint domains provides

guarantees but no advantage of SMT progress

• Solvers to handle F + Hash efficiently CryptoMiniSAT has fueled progress for SAT domain

Similar solvers for other domains?

85

Handling Distributions

• Given: CNF formula F and Weight function W over assignments

• Weighted Counting: sum the weight of solutions

• Weighted Sampling: Sample according to weight of solution

• Wide range of applications in Machine Learning

• Extending universal hashing works only in theory so far

86