Model Counting of Query Expressions: Limitations of Propositional Methods

Paul Beame¹  Jerry Li²  Sudeepa Roy¹  Dan Suciu¹
¹University of Washington  ²MIT
Probabilistic Databases

[Figure: three tables with tuple variables and probabilities —
AsthmaPatient: Ann (x1, 0.3), Bob (x2, 0.1);
Friend: (Ann, Joe) (y1, 0.9), (Ann, Tom) (y2, 0.5), (Bob, Tom) (y3, 0.7);
Smoker: Joe (z1, 0.5), Tom (z2, 1.0)]

Boolean query Q: ∃x ∃y AsthmaPatient(x) ∧ Friend(x, y) ∧ Smoker(y)

• Tuples are probabilistic (and independent)
  ▫ “Ann” is present with probability 0.3, i.e. Pr(x1) = 0.3
• Lineage F_{Q,D} = (x1 ∧ y1 ∧ z1) ∨ (x1 ∧ y2 ∧ z2) ∨ (x2 ∧ y3 ∧ z2)
  ▫ Q is true on D ⇔ F_{Q,D} is true
• What is the probability that Q is true on D?
• Two main evaluation techniques: lifted vs. grounded inference
Lifted Inference

Q: ∃x ∃y AsthmaPatient(x) ∧ Friend(x, y) ∧ Smoker(y)

Work with the explicit query structure, i.e. the first-order logic

Dichotomy Theorem [Dalvi, Suciu 12]: For any UCQ, evaluating it is either
▫ #P-hard, or
▫ polynomial-time computable using lifted inference,
▫ and there is a simple condition to tell which case holds
Grounded Inference

F_{Q,D} = (x1 ∧ y1 ∧ z1) ∨ (x1 ∧ y2 ∧ z2) ∨ (x2 ∧ y3 ∧ z2)

Work with the Boolean formula

Folklore sentiment: Lifted inference is strictly stronger than grounded inference

We give the first clear proof of this
Outline

• Background: Model Counting, DPLL algorithms
  ▫ Extensions (Caching & Component Analysis)
  ▫ Knowledge Compilation (FBDDs & Decision-DNNFs)
• Our Contributions
  ▫ Statement of separation
  ▫ Sketch of FBDD lower bound
• Conclusions
Model Counting

• Probability Computation Problem:
  Given F, and independent Pr(x), Pr(y), Pr(z), …, compute Pr(F)

• Model Counting Problem:
  Given a Boolean formula F, compute #F = #Models (satisfying assignments) of F

  e.g. F = (x ∨ y) ∧ (x ∨ u ∨ w) ∧ (¬x ∨ u ∨ w ∨ z)
  #Assignments of x, y, u, z, w which make F = true
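The example can be checked directly by enumeration. A minimal Python sketch; the polarity of x in the third clause (¬x) is my reconstruction, chosen to match the residual subformulas and the value Pr(F) = 5/8 that appear on the DPLL slides below.

```python
from itertools import product

# The talk's running example; the negation on x in the third clause is an
# assumption consistent with the branch subformulas shown later:
#   F = (x ∨ y) ∧ (x ∨ u ∨ w) ∧ (¬x ∨ u ∨ w ∨ z)
def F(x, y, u, w, z):
    return (x or y) and (x or u or w) and ((not x) or u or w or z)

# Model counting: enumerate all 2^5 assignments, count the satisfying ones.
models = sum(F(*bits) for bits in product([False, True], repeat=5))
print(models)           # 20 models out of 32
print(models / 2 ** 5)  # Pr(F) = 0.625 = 5/8 under the uniform distribution
```

Brute-force enumeration is exponential in the number of variables, which is exactly why the DPLL-style algorithms below matter.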
Known Model Counting Algorithms

Search-based / DPLL-based (explore the assignment space and count the satisfying ones):
• CDP [Birnbaum et al. ’99]
• Relsat [Bayardo Jr. et al. ’97, ’00]
• Cachet [Sang et al. ’05]
• SharpSAT [Thurley ’06]

Knowledge compilation-based (compile F into a “computation-friendly” form):
• c2d [Darwiche ’04]
• Dsharp [Muise et al. ’12]
• …

[Survey by Gomes et al. ’09]

Both techniques explicitly or implicitly
• use DPLL-based algorithms
• produce FBDD or Decision-DNNF compiled forms (output or trace) [Huang-Darwiche ’05, ’07]
DPLL Algorithms

Davis, Putnam, Logemann, Loveland [Davis et al. ’60, ’62]
F: (x ∨ y) ∧ (x ∨ u ∨ w) ∧ (¬x ∨ u ∨ w ∨ z)

[Figure: a decision tree for F branching on x, y, u, w, z, with residual
subformulas u ∨ w ∨ z, y ∧ (u ∨ w), u ∨ w, and w at internal nodes, and the
probabilities ½, ¾, 3/8, 7/8 combining to Pr(F) = 5/8 at the root]

Assume uniform distribution for simplicity

// basic DPLL:
Function Pr(F):
  if F = false then return 0
  if F = true then return 1
  select a variable x, return ½ Pr(F[x=0]) + ½ Pr(F[x=1])
DPLL Algorithms

F: (x ∨ y) ∧ (x ∨ u ∨ w) ∧ (¬x ∨ u ∨ w ∨ z)

[Figure: the same decision tree, with the probability of each subtree
annotated at its root]

The trace is a Decision-Tree for F
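The pseudocode above can be made runnable. The CNF encoding (clauses as sets of (variable, polarity) literals) and the helper `condition` are my own illustrative choices; the recursion also generalizes the slide's uniform ½/½ weights to arbitrary independent tuple probabilities.

```python
# A runnable sketch of the basic DPLL recursion on the slide.

def condition(cnf, var, value):
    """Substitute var := value: drop satisfied clauses, shrink the rest."""
    result = []
    for clause in cnf:
        if (var, value) in clause:            # clause is satisfied
            continue
        result.append({lit for lit in clause if lit[0] != var})
    return result

def pr(cnf, p):
    """Pr[F = true] for independent variables with Pr[v = True] = p[v]."""
    if any(not clause for clause in cnf):     # empty clause: F = false
        return 0.0
    if not cnf:                               # no clauses left: F = true
        return 1.0
    x = next(iter(cnf[0]))[0]                 # select a variable
    return ((1 - p[x]) * pr(condition(cnf, x, False), p)
            + p[x] * pr(condition(cnf, x, True), p))

# F = (x ∨ y) ∧ (x ∨ u ∨ w) ∧ (¬x ∨ u ∨ w ∨ z), uniform distribution
F = [{("x", True), ("y", True)},
     {("x", True), ("u", True), ("w", True)},
     {("x", False), ("u", True), ("w", True), ("z", True)}]
print(pr(F, {v: 0.5 for v in "xyuwz"}))       # 0.625 = 5/8, as on the slide
```

Without caching or component analysis the recursion explores a full decision tree, so its running time can be exponential even when the final answer has small support.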
Extensions to DPLL

• Caching Subformulas
• Component Analysis
• Conflict-Directed Clause Learning
  ▫ Affects the efficiency of the algorithm, but not the final “form” of the trace
Extensions to DPLL: Caching

F: (x ∨ y) ∧ (x ∨ u ∨ w) ∧ (¬x ∨ u ∨ w ∨ z)

[Figure: the decision tree from before; the subformulas u ∨ w and w each
occur on two different branches, so their subtrees can be shared]

// basic DPLL:
Function Pr(F):
  if F = false then return 0
  if F = true then return 1
  select a variable x, return ½ Pr(F[x=0]) + ½ Pr(F[x=1])

// DPLL with caching:
Cache F and Pr(F); look it up before computing
Caching & FBDDs

F: (x ∨ y) ∧ (x ∨ u ∨ w) ∧ (¬x ∨ u ∨ w ∨ z)

[Figure: the decision tree with shared subtrees merged — the trace is a
decision-DAG for F]

FBDD (Free Binary Decision Diagram), or ROBP (Read-Once Branching Program)
• Every variable is tested at most once on any path
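One reason compiled forms are “computation-friendly”: once F is an FBDD, Pr(F) falls out of a single memoized bottom-up pass over the DAG. A small sketch; the dict-based node encoding (`var`/`lo`/`hi` fields, integer sinks) is an assumption of mine, not notation from the talk.

```python
# Evaluate Pr(F) over an FBDD: each node is visited once, so the cost is
# linear in the size of the compiled form.

def fbdd_pr(node, p, memo=None):
    """Pr[F] for the FBDD rooted at `node`, with independent Pr[v]=p[v]."""
    if memo is None:
        memo = {}
    if node in (0, 1):                # sinks: constant false / true
        return float(node)
    if id(node) in memo:              # shared subgraph: reuse its value
        return memo[id(node)]
    var, lo, hi = node["var"], node["lo"], node["hi"]
    r = (1 - p[var]) * fbdd_pr(lo, p, memo) + p[var] * fbdd_pr(hi, p, memo)
    memo[id(node)] = r
    return r

# FBDD for x ∧ (y ∨ z): every variable is tested at most once on any path.
yz = {"var": "y", "lo": {"var": "z", "lo": 0, "hi": 1}, "hi": 1}
root = {"var": "x", "lo": 0, "hi": yz}
print(fbdd_pr(root, {"x": 0.5, "y": 0.5, "z": 0.5}))  # 0.5 * 0.75 = 0.375
```

The read-once property is what makes the arithmetic sound: along any path a variable's probability is multiplied in exactly once.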
Extensions to DPLL: Component Analysis

F: (x ∨ y) ∧ (x ∨ u ∨ w) ∧ (¬x ∨ u ∨ w ∨ z)

[Figure: the decision-DAG from before; after setting x = 0, the residual
formula factors as y ∧ (u ∨ w) over disjoint variables]

// basic DPLL:
Function Pr(F):
  if F = false then return 0
  if F = true then return 1
  select a variable x, return ½ Pr(F[x=0]) + ½ Pr(F[x=1])

// DPLL with component analysis (and caching):
if F = G ∧ H where G and H have disjoint sets of variables,
  Pr(F) = Pr(G) × Pr(H)
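Both extensions can be bolted onto the basic recursion. A sketch under my own assumptions: a hashable frozenset CNF encoding so subformulas can key a cache, and an illustrative `components` helper that groups clauses sharing variables.

```python
# DPLL with caching and component analysis, as in the pseudocode above.

def condition(cnf, var, value):
    return frozenset(frozenset(l for l in clause if l[0] != var)
                     for clause in cnf if (var, value) not in clause)

def components(cnf):
    """Group clauses into parts with pairwise-disjoint variable sets."""
    parts = []                                  # list of (vars, clauses)
    for clause in cnf:
        vs, cs = {l[0] for l in clause}, [clause]
        for part in parts[:]:
            if part[0] & vs:                    # shares a variable: merge
                vs |= part[0]
                cs += part[1]
                parts.remove(part)
        parts.append((vs, cs))
    return [frozenset(cs) for _, cs in parts]

cache = {}

def pr(cnf, p):
    if frozenset() in cnf:                      # empty clause: F = false
        return 0.0
    if not cnf:                                 # no clauses: F = true
        return 1.0
    if cnf not in cache:
        comps = components(cnf)
        if len(comps) > 1:                      # F = G ∧ H, disjoint vars
            r = 1.0
            for g in comps:
                r *= pr(g, p)
        else:
            x = next(iter(next(iter(cnf))))[0]  # select a variable
            r = ((1 - p[x]) * pr(condition(cnf, x, False), p)
                 + p[x] * pr(condition(cnf, x, True), p))
        cache[cnf] = r
    return cache[cnf]

F = frozenset([frozenset([("x", True), ("y", True)]),
               frozenset([("x", True), ("u", True), ("w", True)]),
               frozenset([("x", False), ("u", True), ("w", True), ("z", True)])])
print(pr(F, {v: 0.5 for v in "xyuwz"}))         # 0.625 = 5/8, as before
```

After x = 0 the residual formula splits into the components {y} and {u ∨ w}, so this run multiplies ½ × ¾ = 3/8 on that branch, exactly as the figure shows.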
Components & Decision-DNNF

F: (x ∨ y) ∧ (x ∨ u ∨ w) ∧ (¬x ∨ u ∨ w ∨ z)

[Figure: the decision-DAG with a decomposable AND node combining the
independent subformulas y and u ∨ w]

The trace is a Decision-DNNF [Huang-Darwiche ’05, ’07]

FBDD + “decomposable” AND-nodes
(the two sub-DAGs of an AND node do not share variables)
How much power does component analysis add?

Theorem [BLRS]: any decision-DNNF for F of size N can be converted into an
FBDD for F of size N^{log N + 1} [UAI ’13]

The conversion works even when we allow negation and arbitrary decomposable
binary gates [ICDT ’14]

Corollary: an exponential lower bound for FBDD(F) implies an exponential
lower bound for decision-DNNF(F)
Implications for Lower Bounds?

• All real-world exact model counters compile into FBDDs or decision-DNNFs
• By the conversion, an exponential size lower bound for FBDDs implies an
  exponential lower bound for decision-DNNFs
• Thus it suffices to consider FBDDs
Outline

• Background: Model Counting, DPLL algorithms
  ▫ Extensions (Caching & Component Analysis)
  ▫ Knowledge Compilation (FBDDs & Decision-DNNFs)
• Our Contributions
  ▫ Statement of separation
  ▫ Sketch of FBDD lower bound
• Conclusions
An important class of queries

H1 = R(x) ∧ S(x,y)  ∨  S(x,y) ∧ T(y)

Hk = R(x) ∧ S1(x,y)  ∨  …  ∨  Si(x,y) ∧ Si+1(x,y)  ∨  …  ∨  Sk(x,y) ∧ T(y)
     (write hk0, …, hki, …, hkk for its k + 1 disjuncts)

▫ [Dalvi, Suciu 12]: Hk is #P-hard to evaluate
▫ Known to “capture” hardness for probabilistic DB queries
▫ But some functions of the hki are poly-time computable using lifted
  inference, e.g. (h30 ∨ h32) ∧ (h30 ∨ h33) ∧ (h31 ∨ h33)
New Lower Bounds

Theorem: For all k, FBDD(Hk) = 2^{Ω(n)}, which implies
Decision-DNNF(Hk) = 2^{Ω(√n)}

Theorem: Any Boolean function f of hk0, …, hkk that depends on all of them
requires FBDD(f) = 2^{Ω(n)}, which implies Decision-DNNF(f) = 2^{Ω(√n)}

Corollary: Grounded inference requires 2^{Ω(√n)} time even on probabilistic
DB instances with poly(n) time algorithms using lifted inference.

This implies a separation between grounded and lifted inference
Proof for H1

H1 = R(x) ∧ S(x,y)  ∨  S(x,y) ∧ T(y)

Over the complete database of size n,

H1 = ⋁_{i,j=1..n} R(i) ∧ S(i,j)  ∨  ⋁_{i,j=1..n} S(i,j) ∧ T(j)

Q: why is H1 hard for FBDDs?
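For intuition, the lineage over the complete database can be built and model-counted by brute force for a tiny n. The variable layout (R, then T, then S row by row) is my own choice; the lower bound, of course, is about what happens as n grows.

```python
from itertools import product

# Brute-force model count of H1's lineage over the complete database of
# size n (kept tiny: there are n^2 + 2n Boolean variables).
n = 2

def h1(R, S, T):
    """Lineage of H1:  ⋁ R(i) ∧ S(i,j)  ∨  ⋁ S(i,j) ∧ T(j)."""
    return (any(R[i] and S[i][j] for i in range(n) for j in range(n)) or
            any(S[i][j] and T[j] for i in range(n) for j in range(n)))

count = 0
for bits in product([False, True], repeat=n * n + 2 * n):
    R, T = bits[:n], bits[n:2 * n]
    S = [bits[2 * n + i * n: 2 * n + (i + 1) * n] for i in range(n)]
    count += h1(R, S, T)
print(count)  # 209 of the 2^8 = 256 assignments satisfy H1 for n = 2
```

Note that H1 is true exactly when some S(i,j) is set together with R(i) or T(j) — the matrix view on the next slides makes this row/column structure explicit.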
Matrix view

      T(1)   T(2)   T(3)   T(4)   T(5)
R(1)  S(1,1) S(1,2) S(1,3) S(1,4) S(1,5)
R(2)  S(2,1) S(2,2) S(2,3) S(2,4) S(2,5)
R(3)  S(3,1) S(3,2) S(3,3) S(3,4) S(3,5)
R(4)  S(4,1) S(4,2) S(4,3) S(4,4) S(4,5)
R(5)  S(5,1) S(5,2) S(5,3) S(5,4) S(5,5)

H1 = ⋁_{i,j} R(i) ∧ S(i,j)  ∨  ⋁_{i,j} S(i,j) ∧ T(j)
Matrix view

[Figure: the matrix repeated over three slides, with the FBDD testing R(1)
and branching on R(1) = 0 / R(1) = 1; each branch leaves a different
residual formula over the S and T variables]
Matrix view

[Figure: the matrix with two partial paths — R(1) = 1, and R(1) = 0
followed by S(1,1), …, S(1,5) — that reach different residual formulas]

Can’t cache!
A “unit rule” for FBDDs

Variable x in a formula Φ is a unit if Φ = x ∨ G

An FBDD follows the unit rule if each node tests a unit variable whenever possible

[Figure: a unit node testing x, with its 1-edge going to the 1-sink and its
0-edge going to an FBDD for G]

Can we assume that FBDDs follow the unit rule?
A “unit rule” for FBDDs

Lemma: Given an FBDD for a monotone DNF formula Φ of size N, there exists an
FBDD for Φ that follows the unit rule of size at most |var(Φ)| · N.

Proof: Alter the FBDD to test units whenever possible, then restore the
read-once property
Bound for H1

• Idea: specify a set of “admissible” partial paths A so that:
  1. None of them cache
  2. Each takes n − 1 degrees of freedom to specify

Given this set A:
▫ Each partial path in A must end at a unique node (they don’t cache)
▫ There are 2^{n−1} such paths (n − 1 degrees of freedom)

This implies the FBDD has at least 2^{n−1} nodes
Admissible Paths

Let A be the set of partial paths P which
1. Don’t end at a leaf node
2. Touch n − 1 rows and/or columns, but not more
3. Never set R(i) = S(i,j) = T(j) = 0, for any i, j
Bound for H1

Proposition: If P, Q are paths in A which end at the same node v, then they
test the same set of R and T variables, and assign them the same values.

Proof: Suppose P sets R(i) but Q does not
⇒ the subformula at v cannot contain any term R(i) ∧ S(i,j)
⇒ Q sets every S(i,j) = 0 or every T(j) = 1 (unit rule)
⇒ #Col(Q) = n, contradiction
Paths don’t cache

Intuition: given that R(i), T(j) are set, S(i,j) is determined:

R(i)  S(i,j)  T(j)
 0      1      0
 1      0      0
 0      0      1
 1      0      1

Since two paths that end at the same node v set the same R, T variables,
they set the same S variables
n − 1 degrees of freedom

Let P be an admissible path; w.l.o.g. |Row(P)| = n − 1

At each node where we first visit a row, we could have chosen either edge
and still been admissible!

⇒ n − 1 degrees of freedom
⇒ 2^{n−1} distinct admissible paths
⇒ FBDD(H1) = 2^{Ω(n)}

[Figure: a path through nodes R(2), S(1,4), S(5,4), marking the first visit
to each row]
Proof for Hk

Same basic structure; we only need to change the definition of admissible path.

Let A be the set of partial paths P which
1. Don’t end at a leaf node
2. Touch n − 1 rows and/or columns, but not more
3. Always set i, j consistent with the following table (shown for k = 3):

R(i)  S1(i,j)  S2(i,j)  S3(i,j)  T(j)
 0       1        0        1       0
 1       0        1        0       1
 0       0        1        0       1
 1       0        1        0       1
New Lower Bounds

Theorem: For all k, FBDD(Hk) = 2^{Ω(n)}, which implies
Decision-DNNF(Hk) = 2^{Ω(√n)}

Theorem: Any Boolean function f of hk0, …, hkk that depends on all of them
requires FBDD(f) = 2^{Ω(n)}, which implies Decision-DNNF(f) = 2^{Ω(√n)}

Corollary: Grounded inference requires 2^{Ω(√n)} time even on probabilistic
DB instances with poly(n) time algorithms using lifted inference.

This implies a separation between grounded and lifted inference
Boolean combinations of hk0, …, hkk

f is a Boolean function that depends on all its inputs

Ψ = f(hk0, hk1, …, hkk)

We give a reduction from any FBDD for Ψ to an FBDD for Hk

Intuitively: to compute Ψ using an FBDD, you must compute hk0, hk1, …, hkk,
so that FBDD can also compute Hk
Summary

• FBDDs and decision-DNNFs bound the power of known model counting algorithms
• We prove exponential lower bounds on FBDDs & decision-DNNFs
• This implies a separation between lifted and grounded inference
Open Problems

• A polynomial conversion of decision-DNNFs to FBDDs?
• A general dichotomy theorem for grounded inference?
• Approximate model counting?
Thank You
Questions?