Passenger Counting / People Counting Applications and Devices
Constrained Sampling and Counting: When Practice Drives Theory...
Transcript of Constrained Sampling and Counting: When Practice Drives Theory...
Constrained Sampling and Counting: When Practice Drives Theory
Supratik Chakraborty
IIT Bombay
Joint work with Kuldeep Meel and Moshe Y. Vardi
(Rice University)
Probabilistic Inference
2
Interest
in topic
Trust in
Speaker
Availabili
ty
Attend
Talk
Modeling Attendance for
Today’s Talk
Model
CountingRoth, 1996
How do we infer useful information from the data filled with uncertainty?
+Pr(Attending Talk
|Interest in topic =
True )
Smart Cities
• Alarm system in every house that responds to either burglary or earthquake
• Every alarm system is connected to the central dispatcher (of course, automated!)
• Suppose one of the alarm goes off
• Important to predict whether its earthquake or burglary
3
Deriving Useful Inferences
4
What is the probability of earthquake (𝐸)
given that alarm sounded (𝐴)?
=
Bayes’ rule to the rescue
How do we calculate these
probabilities?
Pr[event|evidence]
Probabilistic Models
Graphical Models
5
Graphical Models
6
B E
A
𝐸 Pr
𝑇 0.1
𝐹 0.9
𝐵 Pr
𝑇 0.8
𝐹 0.2
B E A Pr(A|E,
B)
𝑇 𝑇 𝑇 0.3
𝑇 𝑇 𝐹 0.7
𝑇 𝐹 𝑇 0.4
𝑇 𝐹 𝐹 0.6
𝐹 𝑇 𝑇 0.2
𝐹 𝐹 𝐹 0.8
𝐹 𝐹 𝑇 0.1
Calculating
7
B E
A
𝐸 Pr
𝑇 0.1
𝐹 0.9
𝐵 Pr
𝑇 0.8
𝐹 0.2
B E A Pr(A|E,
B)
𝑇 𝑇 𝑇 0.3
𝑇 𝑇 𝐹 0.7
𝑇 𝐹 𝑇 0.4
𝑇 𝐹 𝐹 0.6
𝐹 𝑇 𝑇 0.2
𝐹 𝐹 𝐹 0.8
𝐹 𝐹 𝑇 0.1
Pr 𝐸 ∩ 𝐴 = Pr 𝐸 ∗ Pr ¬𝐵 ∗ Pr 𝐴 𝐸, ¬𝐵
+Pr 𝐸 ∗ Pr 𝐵 ∗ Pr [𝐴|𝐸, 𝐵]
Calculating
8
B E
A
𝐸 Pr
𝑇 0.1
𝐹 0.9
𝐵 Pr
𝑇 0.8
𝐹 0.2
B E A Pr(A|E,
B)
𝑇 𝑇 𝑇 0.3
𝑇 𝑇 𝐹 0.7
𝑇 𝐹 𝑇 0.4
𝑇 𝐹 𝐹 0.6
𝐹 𝑇 𝑇 0.2
𝐹 𝐹 𝐹 0.8
𝐹 𝐹 𝑇 0.1
Pr 𝐸 ∩ 𝐴 = Pr 𝐸 ∗ Pr 𝐵 ∗ Pr [𝐴|𝐸, 𝐵] + Pr 𝐸 ∗ Pr ¬𝐵 ∗ Pr 𝐴 𝐸, ¬𝐵
Calculating
9
B E
A
𝐸 Pr
𝑇 0.1
𝐹 0.9
𝐵 Pr
𝑇 0.8
𝐹 0.2
B E A Pr(A|E,
B)
𝑇 𝑇 𝑇 0.3
𝑇 𝑇 𝐹 0.7
𝑇 𝐹 𝑇 0.4
𝑇 𝐹 𝐹 0.6
𝐹 𝑇 𝑇 0.2
𝐹 𝐹 𝐹 0.8
𝐹 𝐹 𝑇 0.1
Pr 𝐸 ∩ 𝐴 = Pr 𝐸 ∗ Pr 𝐵 ∗ Pr [𝐴|𝐸, 𝐵] + Pr 𝐸 ∗ Pr ¬𝐵 ∗ Pr 𝐴 𝐸, ¬𝐵
Moving from Probability to Logic
• 𝑋 = 𝐴, 𝐵, 𝐸
• 𝐹 = 𝐸 ∧ 𝐴
• 𝑊 𝐵 = 0 = 0.2, 𝑊 𝐵 = 1 = 1 − 𝑊 𝐵 = 0 = 0.8
• 𝑊 𝐴 = 0 = 0.1, 𝑊 𝐴 = 1 = 0.9
• 𝑊 𝐸 = 0| 𝐴 = 0, 𝐵 = 0 = ⋯
• 𝑊 𝐴 = 1, 𝐸 = 1, 𝐵 = 1 = 𝑊 𝐵 = 1 ∗ 𝑊 𝐸 = 1 ∗ 𝑊(𝐴 = 1|𝐸 = 1, 𝐵 = 1)
• 𝑅𝐹 = (𝐴 = 1, 𝐸 = 1, 𝐵 = 0 , (𝐴 = 1, 𝐸 = 1, 𝐵 = 1)}
• 𝑊 𝐹 = 𝑊 𝐴 = 1, 𝐸 = 1, 𝐵 = 1 + 𝑊(𝐴 = 1, 𝐸 = 1, 𝐵 = 1)
10
𝑾 𝑭 = 𝐏𝐫 [𝑬 ∩ 𝑨]
Weighted Model Count
(WMC)
Probabilistic Inference to WMC to Unweighted Model Counting
B E
A
Pr [𝐸|𝐴]
Weighte
d Model
Countin
gRoth, 1996
11
Weighted Model Counting Unweighted Model Counting
Polynomial time reductions
Model Counting
• Given a SAT formula F
• RF: Set of all solutions of F
• Problem (#SAT): Estimate the number of solutions of F (#F) i.e., what is the cardinality of RF?
• E.g., F = (a v b)
• RF = {(0,1), (1,0), (1,1)}
• The number of solutions (#F) = 3
#P: The class of counting problems for
decision problems in NP!12
How do we guarantee that systems work correctly ?
Functional Verification
• Formal verification
Challenges: formal requirements, scalability
~10-15% of verification effort
• Dynamic verification: dominant approach
13
📱
Dynamic Verification
Design is simulated with test vectors
•Test vectors represent different verification scenarios
Results from simulation compared to intended results
Challenge: Exceedingly large test space!
14
Constrained-Random Simulation
15
a b
c
64 bit
64 bit
64 bit
c = f(a,b)
Sources for Constraints
• Designers:
1. a +64 11 *32 b = 12
2. a <64 (b >> 4)
• Past Experience:
1. 40 <64 34 + a <64 5050
2. 120 <64 b <64 230
• Users:
1. 232 *32 a + b != 1100
2. 1020 <64 (b /64 2) +64 a <64 2200
Problem: How can we uniformly sample the values of a and b
satisfying the above constraints?
Problem Formulation
16
Set of
Constraints
Sample satisfying assignments
uniformly at random
SAT Formula
Scalable Uniform Generation of SAT Witnesses
a b
c
64 bit
64 bit
64 bit
c = f(a,b)
Agenda
Design Scalable Techniques for
Uniform Generation and
Model Counting
with Strong Theoretical Guarantees
17
Agenda
Design Scalable Techniques for
Almost-Uniform Generation and
Approximate-Model Counting
with Strong Theoretical Guarantees
18
Formal Definitions
• 𝐹: CNF Formula; RF ∶ Solution Space of 𝐹
• Input: 𝐹 Output: 𝑦 ∈ 𝑅𝐹
• Uniform Generator:
Guarantee: ∀𝑦 ∈ 𝑅𝐹 , Pr[𝑦 is output] = 1
𝑅𝐹
• Almost-Uniform Generator
Guarantee: ∀𝑦 ∈ 𝑅𝐹 , 1
1+𝜀 𝑅𝐹≤ Pr[𝑦 is output ] ≤
1+𝜀
𝑅𝐹
19
Formal Definitions
• 𝐹: CNF Formula; RF ∶ Solution Space of 𝐹
• Probably Approximately Correct (PAC) Counter
Input: 𝐹 Output: 𝐶
20
Uniform Generation
21
Rich History of Theoretical Work
• Jerrum, Valiant and Vazirani (1986): Uniform Generator: Polynomial time PTM (Probabilistic Turing
Machine) given access to σ 𝑃2 oracle
Almost-Uniform
Generator
PAC
Counter
PTIME
No Practical Algorithms22
Stockmeyer (1983): Deterministic approximate counting in 3rd level of
polynomial hierarchy.
Can be used to design a BPP^NP procedure -- too large NP instances
Rich History of Theoretical Work
• Bellare, Goldreich, and Petrank (2000)
• Uniform Generator: Polynomial time PTM given access to NP oracle
• Employs n-universal hash functions
23
Universal Hashing
• 𝐻 𝑛, 𝑚, 𝑟 : Set of r-universal hash functions from 0,1 𝑛 →0,1 𝑚
∀𝑦1, 𝑦2, ⋯ 𝑦𝑟 (distinct) ∈ 0,1 𝑛 and ∀𝛼1, 𝛼2 ⋯ 𝛼𝑟 ∈ 0,1 𝑚
Pr ℎ 𝑦𝑖 = 𝛼𝑖 = 1
2𝑚
Pr ℎ 𝑦1 = 𝛼1 ∧ ⋯ ∧ ℎ 𝑦𝑟 = 𝛼𝑟 = 2− 𝑚𝑟
• (r-1) degree polynomials → r-universal hash functions
24
(Independence)
(Uniformity)
Concentration Bounds
• t-wise 𝑡 ≥ 4 random variables 𝑋1 , 𝑋2, ⋯ 𝑋𝑛 ∈ 0,1
𝑋 = σ 𝑋𝑖 ; 𝜇 = 𝐸 𝑋
• For t = 2
25
BGP Method
Choose mChoose ℎ ∈ 𝐻 𝑛, 𝑚, 𝑛
• For right choice of m, all the cells are small (# of solutions ≤ 2𝑛2)• Check if all the cells are small (NP- Query)
• If yes, pick a solution randomly from randomly picked cell
In practice, the query is too long and
can not be handled by SAT Solvers!
• Polynomial of degree n-1
• SAT Solvers can not handle
large polynomials!
26
To Recap
• Jerrum, Valiant and Vazirani (1986): Uniform Generator: Polynomial time PTM given access to
σ 𝑃2 oracle
Almost-Uniform Generation is inter-reducible to PAC counting
• Bellare, Goldreich, and Petrank (2000)
• Uniform Generator: Polynomial time PTM given access to NP oracle
Does not work in practice!27
Prior Work
28Performance
Gu
ara
nte
es
MCMC
SAT-
Based
BGP BDD
Desires
Generator Relative runtime
State-of-the-art:
XORSample’
50000
Ideal Uniform
Generator*
10
SAT Solver 1
Experiments over 200+ benchmarks
*: According to EDA experts
29
Our Contribution
30Performance
Gu
ara
nte
es
MCMC
SAT-
Based
BGP BDD
UniGen
Key Ideas
Choose mChoose ℎ ∈ 𝐻 𝑛, 𝑚,∗
• For right choice of m, large number of cells are “small”
• “almost all” the cells are “roughly” equal
• Check if a randomly picked cell is “small”
• If yes, pick a solution randomly from randomly picked cell
31
Key Challenges
• F: Formula X: Set of variables 𝑅𝐹 : Solution space
• 𝑅𝐹,ℎ,𝛼: Set of solutions for 𝐹 ∧ (ℎ 𝑋 = 𝛼) where
ℎ ∈ 𝐻 𝑛, 𝑚,∗ ; 𝛼 ∈ 0,1 𝑚
1. How large is “small” cell ?
2. How much universality do we need?
3. What is the value of m?
32
Size of cell
Pr[ y is output ] = 1
2𝑚 ∗ Pr[Cell is small| y is in the cell] ∗1
𝑆𝑖𝑧𝑒 𝑜𝑓 𝑐𝑒𝑙𝑙
Let Size of cell ∈ [𝑙𝑜𝑇ℎ𝑟𝑒𝑠ℎ, ℎ𝑖𝑇ℎ𝑟𝑒𝑠ℎ], Then:
q
33
Losing Independence
Our desire:
Suppose ℎ ∈ 𝐻 𝑛, 𝑚,∗ 𝑎𝑛𝑑 𝑚 = log𝑅𝐹
𝑝𝑖𝑣𝑜𝑡
Then, 𝐸[ 𝑅𝐹,ℎ,𝛼 ] = |𝑅𝐹|
2𝑚= 𝑝𝑖𝑣𝑜𝑡
Concentration bound k-universal (small constant)
34
How many cells?
• Our desire: 𝑚 = log|𝑅𝐹|
𝑝𝑖𝑣𝑜𝑡
But determining 𝑅𝐹 is expensive (#P complete)
• How about approximation? 𝐴𝑝𝑝𝑟𝑜𝑥𝑀𝐶 𝐹, 휀, 𝛿 returns C:
Pr[ 𝑅𝐹
1+𝜀≤ 𝐶 ≤ 1 + 휀 |𝑅𝐹|] ≥ 1 − 𝛿
𝑞 = log 𝐶 − log 𝑝𝑖𝑣𝑜𝑡
Concentrate on m = q-1, q, q+1
35
UniGen(F,
1. C = ApproxMC(F,휀)
2. Compute pivot, loThresh, hiThresh
3. 𝑞 = log 𝐶 − log 𝑝𝑖𝑣𝑜𝑡
4. for i in {q-1, q, q+1}:
5. Choose h randomly* from H(n,i,3)
6. Choose 𝛼 randomly from 0,1 𝑚
7. If (𝑙𝑜𝑇ℎ𝑟𝑒𝑠ℎ ≤ 𝑅𝐹,ℎ,𝛼 ≤ ℎ𝑖𝑇ℎ𝑟𝑒𝑠ℎ):
8. Pick 𝑦 ∈ 𝑅𝐹,ℎ,𝛼 randomly
36
One time execution
Run for
every sample
required
Are we back to JVV?
NOT Really
•JVV makes linear (in n ) calls to Approximate counter compared to just 1 in UniGen
•# of calls to ApproxMC is only 1 regardless of the number of samples required unlike JVV 37
PAC Counter: ApproxMC(F,
Choose mChoose ℎ ∈ 𝐻 𝑛, 𝑚, 3
• For right choice of m, large number of cells are “small”
• “almost all” the cells are “roughly” equal
• Check if a randomly picked cell is “small”
• If yes, then estimate = # of solutions in cell * 2𝑚
38
ApproxMC(F,
#sols <
pivot
NO
39
ApproxMC(F,
#sols <
pivot
NO
40
ApproxMC(F,
#sols <
pivotYES
Estimate:
# of sols * 2𝑚
41
ApproxMC(F,
Key Lemmas
Let 𝑚∗ = log 𝑅𝐹 − log 𝑝𝑖𝑣𝑜𝑡
Lemma 1: The algorithm terminates with 𝑚 ∈ 𝑚∗ − 1 , 𝑚∗
with high probability
Lemma 2: The estimate from a randomly picked cell for 𝑚 ∈𝑚∗ − 1 , 𝑚∗ is correct with high probability
42
Results: Performance Comparison
43
0
10000
20000
30000
40000
50000
60000
70000
0 10 20 30 40 50 60 70 80 90 100 110 120 130 140 150 160 170 180 190
Tim
e (
secon
ds)
Benchmarks
ApproxMC Cachet
Results: Performance Comparison
44
0
10000
20000
30000
40000
50000
60000
70000
0 10 20 30 40 50 60 70 80 90 100 110 120 130 140 150 160 170 180
ApproxMC
Cachet
Can Solve a Large Class of Problems
45Large class of problems that lie beyond the exact
counters but can be computed by ApproxMC
0
10000
20000
30000
40000
50000
60000
70000
0 10 20 30 40 50 60 70 80 90 100 110 120 130 140 150 160 170 180 190
Tim
e (
seco
nd
s)
Benchmarks
ApproxMC
Cachet
Mean Error: Only 4% (allowed: 75%)
46Mean error: 4% – much smaller than the
theoretical guarantee of 75%
1.0E+00
3.2E+01
1.0E+03
3.3E+04
1.0E+06
3.4E+07
1.1E+09
3.4E+10
1.1E+12
3.5E+13
1.1E+15
3.6E+16
0 10 20 30 40 50 60 70 80 90
Co
un
t
Benchmarks
Cachet*1.75
Cachet/1.75
ApproxMC
Runtime Performance of UniGen
47
1-2 Orders of Magnitude Faster
0.1
1
10
100
1000
10000
100000
case4
7
case_
3_b1
4_3
case1
05
case8
case2
03
case1
45
case6
1
case9
case1
5
case1
40
case_
2_b1
4_1
case_
3_b1
4_1
squ
ari
ng
14
squ
ari
ng
7
case_
2_p
tb_1
case_
1_p
tb_1
case_
2_b1
4_2
case_
3_b1
4_2
Time(s)
Benchmarks
UniGen
XORSample'
48
Results: Uniformity
49• Benchmark: case110.cnf; #var: 287; #clauses: 1263
• Total Runs: 4x106; Total Solutions : 16384
0
50
100
150
200
250
300
350
400
450
500
184 208 228 248 268 288
Frequency
#Solutions
Results: Uniformity
50• Benchmark: case110.cnf; #var: 287; #clauses: 1263
• Total Runs: 4x106; Total Solutions : 16384
0
50
100
150
200
250
300
350
400
450
500
184 208 228 248 268 288
Frequency
#Solutions
US
UniGen
So far
• The first scalable approximate model counter
• The first scalable uniform generator
• Outperforms state-of-the-art generators/counters
Are we done?
51
Where are we?
Generator Relative runtime
State-of-the-art:
XORSample’
50000
UniGen ~5000
Ideal Uniform
Generator*
10
SAT Solver 1
Experiments over 200+ benchmarks
*: According to EDA experts 52
XOR-Based Hashing
• Partition 2n space into 2m cells
• Variables: X1, X2, X3,….., Xn
• Pick every variable with prob. ½ ,XOR them and add 0/1 with prob. ½
• X1+X3+X6+…. Xn-1 + 0
• To construct h: 0,1 𝑛 → 0,1 𝑚, choose m random XORs
• 𝛼 ∈ 0,1 𝑚 → Set every XOR equation to 0 or 1 randomly
• The cell: F ∧ XOR (CNF+XOR)
53
XOR-Based Hashing
• CryptoMiniSAT: Efficient for CNF+XOR
• Avg Length : n/2
• Smaller XORs better performance
How to shorten XOR clauses? 54
Independent Support
• Set I of variables such that assignments to these uniquely determine assignments to rest of variables (for satisfying assignments)
• If agree on I then
• c ⟷ (a V b) ; Independent Support I: {a, b}
• Key Idea: Hash only on the independent variables
55
s1ands2 s1=s2
Independent Support
• Hash only on the Independent Support
• Average size of XOR: n/2 to |I|/2
56
Formal Definition
57
Minimal Unsatisfiable Subset
• Given Ψ = 𝐻1 ∧ 𝐻2 ⋯ 𝐻𝑚
Find subset {𝐻𝑖1 , 𝐻𝑖2 , ⋯ 𝐻𝑖𝑘 } of {𝐻1 , 𝐻2 , ⋯ 𝐻𝑚} such that 𝐻𝑖1 ∧ 𝐻𝑖2 ⋯ 𝐻𝑖𝑘 ∧ Ω is UNSAT
Unsatisfiable subset
Find minimal subset {𝐻𝑖1 , 𝐻𝑖2 , ⋯ 𝐻𝑖𝑘} of {𝐻1 , 𝐻2 , ⋯ 𝐻𝑚} such that 𝐻𝑖1 ∧ 𝐻𝑖2 ⋯ 𝐻𝑖𝑘 is UNSAT
Minimal Unsatisfiable subset
58
Key Idea
59
Key Idea
60
𝐼 = {𝑥𝑖} is Independent Support iff 𝐻𝐼 ∧ Ω is
unsatisfiable where 𝐻𝐼 = 𝐻𝑖 𝑥𝑖 ∈ 𝐼}
Group-Oriented Minimal UnsatisfiableSubset
• Given Ψ = 𝐻1 ∧ 𝐻2 ⋯ 𝐻𝑚 ∧ Ω
Find subset {𝐻𝑖1 , 𝐻𝑖2 , ⋯ 𝐻𝑖𝑘 } of {𝐻1 , 𝐻2 , ⋯ 𝐻𝑚} such that 𝐻𝑖1 ∧𝐻𝑖2 ⋯ 𝐻𝑖𝑘 ∧ Ω is UNSAT
Group Oriented Unsatisfiable subset
Find minimal subset {𝐻𝑖1 , 𝐻𝑖2 , ⋯ 𝐻𝑖𝑘} of {𝐻1 , 𝐻2 , ⋯ 𝐻𝑚} such that 𝐻𝑖1 ∧ 𝐻𝑖2 ⋯ 𝐻𝑖𝑘 ∧ Ω is UNSAT
Group Oriented Minimal Unsatisfiable subset
61
Minimal Independent Support
62
𝐼 = {𝑥𝑖} is minimal Independent Support iff 𝐻𝐼 is
minimal unsatisfiable subset where 𝐻𝐼 = 𝐻𝑖 𝑥𝑖 ∈𝐼}
Key Idea
63
Minimal
Independent
Support (MIS)
Minimal
Unsatisfiable
Subset (MUS)
Impact on Sampling and Counting Techniques
64
MIS
SamplingTools
Counting Tools
FI
What about complexity
• Computation of MUS: 𝐹𝑃𝑁𝑃
• Why solve a 𝐹𝑃𝑁𝑃 for almost-uniform generation/approximate counter (PTIME PTM with NP Oracle)
Settling the debate through practice!
65
Performance Impact on Approximate Model Counting
1.8
18
180
1800
18000
ApproxMC IApproxMC
66
Performance Impact on Uniform Sampling
67
0.018
0.18
1.8
18
180
1800
18000
UniGen UniGen1
Where are we?
Generator Relative runtime
State-of-the-art:
XORSample’
50000
UniGen 5000
UniGen1 470
Ideal Uniform Generator* 10
SAT Solver 1
68
Back to basics
69
# of solutions in “small” cell ∈ 𝑙𝑜𝑇ℎ𝑟𝑒𝑠ℎ, ℎ𝑖𝑇ℎ𝑟𝑒𝑠ℎWe pick one solution
“Wastage” of loThresh solutions
Pick 𝑙𝑜𝑇ℎ𝑟𝑒𝑠ℎ samples!
70
3-Universal hash functions: Choose hash function randomly
For arbitrary distribution on solutions=> All cells are roughly equal in expectation
But:
While each input is hashed uniformly
And each 3-solutions set is hashed independently
A 4-solutions set might not be hashed independently
3-Universal and Independence of Samples
Balancing Independence
For ℎ ∈ 𝐻 𝑛, 𝑚, 3
• Choosing up to 3 samples => Full independence among samples
• Choosing loThresh (>> 3) samples => Loss of independence
71
Why care about Independence
72
🕷🕷
If every sample is independent => Faster convergence
Convergence
requires
multiplication of
probabilities
The principle of principled compromise!
• Choosing up to 3 samples => Full independence among samples
• Choosing loThresh (>> 3) samples => Loss of independence “Almost-Independence” among samples
Still provides strong theoretical guarantees of coverage
73
Strong Guarantees
•
•Polynomial Constant number of SAT calls per sample
After one call to ApproxMC
74
Bug-finding effectiveness
75
bug frequency f =
Simply put,
#of SAT calls for UniGen2 << # of SAT calls for
UniGen
Bug-finding effectiveness
UniGen UniGen2
Expected
number of SAT
calls
4.35 × 107 3.38 × 106
76
bug frequency f = 1/104
find bug with probability ≥ 1/2
An order of magnitude difference!
~20 times faster than UniGen1
0.01
0.1
1
10
100
1000
s12
38
a_3_2
s11
96
a_3_2
s83
2a
_1
5_7
case_
1_b1
2_2
squ
ari
ng
16
squ
ari
ng
7
dou
bly
Lin
ked
Lis
t
Log
inS
ervic
e2
Sort
20
.sk
enq
ueu
e
Ka
rats
ub
a
lltr
aver
sal
llre
ver
se
dia
gS
ten
cil_
new
tuto
ria
l3
dem
o2
_n
ew
Time
per
sample
(s)
Benchmarks
UniGen2
UniGen1
77
Where are we?
Generator Relative runtime
State-of-the-art:
XORSample’
50000
UniGen 5000
UniGen1 470
UniGen2 20
Ideal Uniform Generator* 10
SAT Solver 1
78
The Final Push….
•UniGen requires one time computation of ApproxMC
•Generation of samples in fully distributed fashion
(Previous algorithms lacked the above property)
•New paradigms!
79
80
Current Paradigm of Simulation-based Verification
Test 2 Test 3
Test 4Test 1
Test Generator
Simulator
SimulatorSimulator
Simulator
• Can not be
parallelized since
test generators
maintain “global
state”
• Loses theoretical
guarantees (if any)
of uniformity
Test Generator
New Paradigm of Simulation-based Verification
Simulator
Simulator
Simulator
Simulator
Test Generator
Test Generator
Test Generator
Preprocessin
g
• Preprocessing needs to be done only once
• No communication required between
different copies of the test generator
• Fully distributed!
81
Closing in…
Generator Relative runtime
State-of-the-art:
XORSample’
50000
UniGen 5000
UniGen1 470
UniGen2 20
Multi-core UniGen2 10 (two cores)
Ideal Uniform Generator* 10
SAT Solver 1 82
So what happened….
83
Sampling and
Counting
Important
Applications
Beautiful Theory
But does not work in
practice
Theoretical
Contributions
(Practice drives
theory)
New Applications
(Theory drives
practice)
Future Directions
84
Extension to More Expressive domains
• Efficient hashing schemes Extending bit-wise XOR to richer constraint domains provides
guarantees but no advantage of SMT progress
• Solvers to handle F + Hash efficiently CryptoMiniSAT has fueled progress for SAT domain
Similar solvers for other domains?
85
Handling Distributions
• Given: CNF formula F and Weight function W over assignments
• Weighted Counting: sum the weight of solutions
• Weighted Sampling: Sample according to weight of solution
• Wide range of applications in Machine Learning
• Extending universal hashing works only in theory so far
86