Model Counting of Query Expressions: Limitations of Propositional Methods

Paul Beame¹  Jerry Li²  Sudeepa Roy¹  Dan Suciu¹
¹University of Washington  ²MIT
Probabilistic Databases

AsthmaPatient: Ann (x1), Bob (x2)
Friend: (Ann, Joe) (y1), (Ann, Tom) (y2), (Bob, Tom) (y3)
Smoker: Joe (z1), Tom (z2)

Boolean query Q: ∃x ∃y AsthmaPatient(x) ∧ Friend(x, y) ∧ Smoker(y)

• Tuples are probabilistic (and independent)
  ▫ "Ann" is present with probability 0.3, i.e. Pr(x1) = 0.3
• Boolean formula F_Q,D = (x1 ∧ y1 ∧ z1) ∨ (x1 ∧ y2 ∧ z2) ∨ (x2 ∧ y3 ∧ z2)
  ▫ Q is true on D ⟺ F_Q,D is true
• What is the probability that Q is true on D?
• Two main evaluation techniques: lifted vs. grounded inference

[Figure: the three tables annotated with tuple probabilities 0.3, 0.1, 0.5, 1.0, 0.9, 0.5, 0.7]
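As a minimal sketch of grounded evaluation, the lineage formula above can be evaluated by summing the probabilities of all satisfying tuple assignments. Only Pr(x1) = 0.3 is stated on the slide; the remaining probability values below are illustrative stand-ins.

```python
from itertools import product

# Independent tuple probabilities. Only Pr(x1) = 0.3 is given on the
# slide; the other values are illustrative assumptions.
prob = {"x1": 0.3, "x2": 0.1, "y1": 0.9, "y2": 0.5, "y3": 0.7,
        "z1": 0.5, "z2": 1.0}

def lineage(a):
    # F_Q,D = (x1 ∧ y1 ∧ z1) ∨ (x1 ∧ y2 ∧ z2) ∨ (x2 ∧ y3 ∧ z2)
    return ((a["x1"] and a["y1"] and a["z1"]) or
            (a["x1"] and a["y2"] and a["z2"]) or
            (a["x2"] and a["y3"] and a["z2"]))

def grounded_probability():
    # Sum the probability of every truth assignment that satisfies F_Q,D.
    # Exponential in the number of tuples: grounded inference with no pruning.
    total = 0.0
    names = sorted(prob)
    for bits in product([False, True], repeat=len(names)):
        a = dict(zip(names, bits))
        if lineage(a):
            p = 1.0
            for v in names:
                p *= prob[v] if a[v] else 1 - prob[v]
            total += p
    return total
```

Since every disjunct of F_Q,D requires x1 or x2, the result is bounded by Pr(x1) + Pr(x2), which is a quick sanity check on the enumeration.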
Lifted Inference

Q: ∃x ∃y AsthmaPatient(x) ∧ Friend(x, y) ∧ Smoker(y)

Dichotomy Theorem [Dalvi, Suciu 12]: For any union of conjunctive queries (UCQ), evaluation is either
▫ #P-hard, or
▫ polynomial-time computable using lifted inference,
▫ and there is a simple condition that tells which case holds.
Grounded Inference

F_Q,D = (x1 ∧ y1 ∧ z1) ∨ (x1 ∧ y2 ∧ z2) ∨ (x2 ∧ y3 ∧ z2)

Grounded inference is equivalent to model counting: counting the number of satisfying assignments of F_Q,D.

Folklore sentiment: lifted inference is strictly stronger than grounded inference.
Our examples give the first clear proof of this.
Outline
• DPLL algorithms
  ▫ Extensions (Caching & Component Analysis)
  ▫ Knowledge Compilation (FBDDs & Decision-DNNFs)
• Our Contributions
  ▫ DLDD to FBDD conversion
  ▫ FBDD lower bounds
• Sketch of FBDD lower bound
• Conclusions
Model Counting

• Probability Computation Problem: given F and independent Pr(x), Pr(y), Pr(z), …, compute Pr(F)
• Model Counting Problem: given a Boolean formula F, compute #F = # of models (satisfying assignments) of F

e.g. F = (x ∨ y) ∧ (x ∨ u ∨ w) ∧ (¬x ∨ u ∨ ¬w ∨ z)
#F = # of assignments to x, y, u, w, z that make F true
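The definition can be checked by brute force. This sketch counts the models of the example formula, written with explicit connectives; the negations are an assumption, reconstructed so that the numbers agree with the DPLL trace shown later in the deck.

```python
from itertools import product

def F(x, y, u, w, z):
    # The running example, with connectives made explicit
    # (negations reconstructed from the DPLL trace):
    # F = (x ∨ y) ∧ (x ∨ u ∨ w) ∧ (¬x ∨ u ∨ ¬w ∨ z)
    return (x or y) and (x or u or w) and ((not x) or u or (not w) or z)

def model_count(f, nvars):
    # Enumerate all 2^nvars assignments and count the satisfying ones.
    return sum(f(*bits) for bits in product([False, True], repeat=nvars))
```

With this reading, #F = 20 of the 32 assignments, i.e. Pr(F) = 5/8 under the uniform distribution, matching the root value of the trace.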
Exact Model Counters

Search-based / DPLL-based (explore the assignment space and count the satisfying assignments):
• CDP [Birnbaum et al. '99]
• Relsat [Bayardo Jr. et al. '97, '00]
• Cachet [Sang et al. '05]
• SharpSAT [Thurley '06]
• …

Knowledge-compilation-based (compile F into a "computation-friendly" form):
• c2d [Darwiche '04]
• Dsharp [Muise et al. '12]
• …

[Survey by Gomes et al. '09]

Both techniques explicitly or implicitly
• use DPLL-based algorithms
• produce FBDD or Decision-DNNF compiled forms (output or trace) [Huang-Darwiche '05, '07]
DPLL Algorithms
Davis, Putnam, Logemann, Loveland [Davis et al. '60, '62]

F: (x ∨ y) ∧ (x ∨ u ∨ w) ∧ (¬x ∨ u ∨ ¬w ∨ z)

Assume the uniform distribution for simplicity, i.e. Pr(x) = ½ for every variable.

// basic DPLL:
Function Pr(F):
  if F = false then return 0
  if F = true then return 1
  select a variable x, return ½ Pr(F|x=0) + ½ Pr(F|x=1)

[Figure: the DPLL search tree for F, branching on x, then y, u, w, z; residual subformulas include y ∧ (u ∨ w) (Pr 3/8), u ∨ ¬w ∨ z (Pr 7/8), u ∨ w (Pr ¾), ¬w ∨ z (Pr ¾), and w (Pr ½); the root value is Pr(F) = 5/8]
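The basic DPLL procedure above can be sketched directly, representing a CNF as a list of clauses, each clause a map from variable to the value that satisfies it. `CNF` below is the running example, with the negations assumed as reconstructed earlier.

```python
from fractions import Fraction

# The running example F = (x ∨ y)(x ∨ u ∨ w)(¬x ∨ u ∨ ¬w ∨ z);
# each clause maps a variable to the value that satisfies it.
CNF = [{"x": 1, "y": 1},
       {"x": 1, "u": 1, "w": 1},
       {"x": 0, "u": 1, "w": 0, "z": 1}]

def restrict(cnf, var, val):
    # Apply var := val: drop satisfied clauses, shrink the others.
    out = []
    for c in cnf:
        if var in c:
            if c[var] == val:
                continue          # clause satisfied, drop it
            c = {v: b for v, b in c.items() if v != var}
            if not c:
                return None       # empty clause: formula is false
        out.append(c)
    return out

def pr(cnf):
    # Basic DPLL under the uniform distribution:
    # Pr(F) = ½ Pr(F|x=0) + ½ Pr(F|x=1)
    if cnf is None:
        return Fraction(0)
    if not cnf:
        return Fraction(1)
    x = next(iter(cnf[0]))        # select a variable
    return (pr(restrict(cnf, x, 0)) + pr(restrict(cnf, x, 1))) / 2
```

`pr(CNF)` returns 5/8, the root value of the trace; `Fraction` keeps the arithmetic exact.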
DPLL Algorithms

F: (x ∨ y) ∧ (x ∨ u ∨ w) ∧ (¬x ∨ u ∨ ¬w ∨ z)

[Figure: the same DPLL run as on the previous slide, drawn as a tree]

The trace is a Decision-Tree for F.
Extensions to DPLL
• Caching subformulas
• Component analysis
• Conflict-directed clause learning
  ▫ Affects the efficiency of the algorithm, but not the final "form" of the trace

Traces of:
• DPLL + caching + (clause learning) → FBDD
• DPLL + caching + components + (clause learning) → Decision-DNNF
Caching

F: (x ∨ y) ∧ (x ∨ u ∨ w) ∧ (¬x ∨ u ∨ ¬w ∨ z)

// basic DPLL:
Function Pr(F):
  if F = false then return 0
  if F = true then return 1
  select a variable x, return ½ Pr(F|x=0) + ½ Pr(F|x=1)

// DPLL with caching:
Cache F and Pr(F); look it up before computing

[Figure: the DPLL tree for F; subformulas already in the cache are looked up instead of recomputed]
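The caching extension can be sketched by memoizing on a canonical form of the residual formula. This is a toy illustration of the idea, not the data structures real counters such as Cachet use; `restrict` is the clause-restriction helper repeated here so the sketch is self-contained.

```python
from fractions import Fraction

def restrict(cnf, var, val):
    # Apply var := val to a CNF given as a list of {variable: value} clauses.
    out = []
    for c in cnf:
        if var in c:
            if c[var] == val:
                continue          # clause satisfied
            c = {v: b for v, b in c.items() if v != var}
            if not c:
                return None       # empty clause: formula is false
        out.append(c)
    return out

def pr_cached(cnf, cache=None):
    # DPLL with caching: memoize Pr on a canonical (hashable) form of the
    # residual formula, so each distinct subformula is solved only once.
    if cache is None:
        cache = {}
    if cnf is None:
        return Fraction(0)
    if not cnf:
        return Fraction(1)
    key = frozenset(frozenset(c.items()) for c in cnf)
    if key not in cache:
        x = min(v for c in cnf for v in c)   # deterministic variable choice
        cache[key] = (pr_cached(restrict(cnf, x, 0), cache) +
                      pr_cached(restrict(cnf, x, 1), cache)) / 2
    return cache[key]

# The running example (negations reconstructed from the trace):
F = [{"x": 1, "y": 1}, {"x": 1, "u": 1, "w": 1},
     {"x": 0, "u": 1, "w": 0, "z": 1}]
```

The cache is what turns the tree-shaped trace into a DAG: two branches that reach the same residual formula now share one node.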
Caching & FBDDs

F: (x ∨ y) ∧ (x ∨ u ∨ w) ∧ (¬x ∨ u ∨ ¬w ∨ z)

[Figure: the trace with cached nodes merged, now a DAG]

The trace is a decision-DAG for F:
FBDD (Free Binary Decision Diagram) or ROBP (Read-Once Branching Program)
• Every variable is tested at most once on any path
• All internal nodes are decision nodes
Component Analysis

F: (x ∨ y) ∧ (x ∨ u ∨ w) ∧ (¬x ∨ u ∨ ¬w ∨ z)

// basic DPLL:
Function Pr(F):
  if F = false then return 0
  if F = true then return 1
  select a variable x, return ½ Pr(F|x=0) + ½ Pr(F|x=1)

// DPLL with component analysis (and caching):
if F = G ∧ H, where G and H have disjoint sets of variables,
  Pr(F) = Pr(G) × Pr(H)

[Figure: the trace for F; at the node with subformula y ∧ (u ∨ w), the variable-disjoint components y and u ∨ w are solved independently]
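Component analysis can be layered on top of caching: split the residual clause set into variable-disjoint components and multiply their probabilities. A sketch under the same toy CNF representation (the greedy component grouping is for clarity; real counters use union-find style structures):

```python
from fractions import Fraction

def restrict(cnf, var, val):
    # Apply var := val: drop satisfied clauses, shrink the others.
    out = []
    for c in cnf:
        if var in c:
            if c[var] == val:
                continue
            c = {v: b for v, b in c.items() if v != var}
            if not c:
                return None       # empty clause: formula is false
        out.append(c)
    return out

def components(cnf):
    # Group clauses into variable-disjoint connected components.
    groups = []                   # each entry: [variable set, clause list]
    for c in cnf:
        hit = [g for g in groups if g[0] & set(c)]
        new = [set(c) | set().union(*(g[0] for g in hit)),
               [c] + [cl for g in hit for cl in g[1]]]
        groups = [g for g in groups if g not in hit] + [new]
    return [g[1] for g in groups]

def pr_cc(cnf, cache=None):
    # DPLL + caching + component analysis:
    # Pr(G ∧ H) = Pr(G) · Pr(H) when G and H share no variables.
    if cache is None:
        cache = {}
    if cnf is None:
        return Fraction(0)
    if not cnf:
        return Fraction(1)
    comps = components(cnf)
    if len(comps) > 1:
        p = Fraction(1)
        for g in comps:
            p *= pr_cc(g, cache)
        return p
    key = frozenset(frozenset(c.items()) for c in cnf)
    if key not in cache:
        x = min(v for c in cnf for v in c)
        cache[key] = (pr_cc(restrict(cnf, x, 0), cache) +
                      pr_cc(restrict(cnf, x, 1), cache)) / 2
    return cache[key]

F = [{"x": 1, "y": 1}, {"x": 1, "u": 1, "w": 1},
     {"x": 0, "u": 1, "w": 0, "z": 1}]
```

On the running example, the split fires at the subformula y ∧ (u ∨ w): the factors ½ and ¾ are computed separately and multiplied, which is exactly the decomposable AND-node of the Decision-DNNF trace.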
Components & Decision-DNNF

F: (x ∨ y) ∧ (x ∨ u ∨ w) ∧ (¬x ∨ u ∨ ¬w ∨ z)

[Figure: the trace with an AND-node whose two sub-DAGs, y and u ∨ w, share no variables]

The trace is a Decision-DNNF [Huang-Darwiche '05, '07]:
FBDD + "decomposable" AND-nodes
(the two sub-DAGs of an AND-node do not share variables)
Decomposable Logic Decision Diagrams (DLDDs)
• Generalization of Decision-DNNFs:
  ▫ not just decomposable AND-nodes
  ▫ also NOT-nodes and decomposable binary OR, XOR, etc.
    (the sub-DAGs of each node are labelled by disjoint sets of variables)
Outline
• DPLL algorithms
  ▫ Extensions (Caching & Component Analysis)
  ▫ Knowledge Compilation (FBDDs & Decision-DNNFs)
• Our Contributions
  ▫ DLDD to FBDD conversion
  ▫ FBDD lower bounds
• Sketch of FBDD lower bound
• Conclusions
How much power does component analysis add?

Theorem [UAI 2013]: a decision-DNNF for F of size N ⇒ an FBDD for F of size N^(log N + 1)
• If F is a k-DNF or k-CNF, then the FBDD is of size N^k
• The conversion algorithm runs in time linear in the size of its output

Theorem [ICDT 2014]: the conversion works even for DLDDs
An important class of queries

H1(x,y) = R(x) ∧ S(x,y)  ∨  S(x,y) ∧ T(y)
Hk(x,y) = R(x) ∧ S1(x,y)  ∨ … ∨  Si(x,y) ∧ Si+1(x,y)  ∨ … ∨  Sk(x,y) ∧ T(y)
               (hk0)                    (hki)                      (hkk)

▫ [Dalvi, Suciu 12]: Hk is #P-hard to evaluate
▫ However, some Boolean combinations of the hki are poly-time computable using lifted inference, e.g. (h30 ∧ h32) ∨ (h30 ∧ h33) ∨ (h31 ∧ h33)
New Lower Bounds

Theorem: Any Boolean function f of hk0, ..., hkk that depends on all of them requires FBDD(f) = 2^Ω(n),
which implies DLDD(f) = 2^Ω(√n), and DLDD(f) = 2^(Ω(n)/k) if f is monotone.

Corollary: Grounded inference requires 2^Ω(√n) time even on probabilistic DB instances that have poly(n)-time algorithms using lifted inference.

This implies a separation between grounded and lifted inference.
Outline
• DPLL algorithms
  ▫ Extensions (Caching & Component Analysis)
  ▫ Knowledge Compilation (FBDDs & Decision-DNNFs)
• Our Contributions
  ▫ DLDD to FBDD conversion
  ▫ FBDD lower bounds
• Sketch of FBDD lower bound
• Conclusions
Outline of Proof
• FBDD → FBDD with unit rule
• Prove hardness of Hk for FBDDs with the unit rule
• Then reduce FBDDs for functions over hk0, ..., hkk to FBDDs for Hk
A "unit rule" for FBDDs

Definition: A variable x in a Boolean formula Φ is a unit for Φ if Φ = x ∨ G for some G.

Definition: An FBDD for a formula F follows the unit rule if each node tests a unit variable whenever a unit variable exists in the corresponding subformula.

[Figure: a unit node testing x, with its 1-edge going directly to the 1-sink and its 0-edge to G]
A "unit rule" for FBDDs

For any variable X in a DNF formula Φ, let deg(X) be the number of variables that co-occur with X in some clause, and let ∆(Φ) = max_X deg(X).

Note: ∆(Hk) = n

Lemma: Given an FBDD of size N for a monotone DNF formula Φ, there exists an FBDD for Φ that follows the unit rule of size at most ∆(Φ)·N.
Proof of Lemma: A Local Transform

[Figure: on the left, a decision node y with subformula y(x1 ∨ x2 ∨ … ∨ xn) ∨ H, whose 1-child has subformula x1 ∨ x2 ∨ … ∨ xn ∨ H[y = 1], in which x1, …, xn are units. On the right, the 1-child is replaced by a chain of unit nodes testing x1, …, xn, each with its 1-edge to the 1-sink, ending at H[y = 1, x1 = x2 = … = xn = 0]]

Note: the size increases by a factor of at most ∆(Φ).
Proof of Lemma: A Local Transform
• But this might cause us to test a variable twice along a path, violating the read-once property!
• In that case, simply remove the second test
  ▫ Point all edges that point to it to its 0-child
  ▫ Does not increase the size!
• The resulting structure is then read-once
Proof of Lemma

Apply these transformations globally. Then it suffices to show the following:

Claim: Let v be any node in the original FBDD, with corresponding subformula Φv. In the new FBDD, v has corresponding subformula Φv[X = 0], where X is the set of units of Φv.

Proof: Every unit of Φv became a unit somewhere along each path to v, where we set it to 0.
Back to Hk

H1 = R(x) ∧ S(x,y)  ∨  S(x,y) ∧ T(y)

Over the complete database of size n:

H1 = ⋁_{i,j ≤ n} R(i) ∧ S(i,j)  ∨  ⋁_{i,j ≤ n} S(i,j) ∧ T(j)

Key idea: If a subformula of H1 is unit-free, then every conjunct clearly comes from either h10 or h11.
Bound for H1

• Let F be an FBDD for H1 that follows the unit rule.
• For any partial path P in F starting at the root, let
  Row(P) = {i : P tests R(i) or S(i,j) at a decision node, for some j}
  Col(P) = {j : P tests T(j) or S(i,j) at a decision node, for some i}
• Let P be the set of partial paths P such that the resulting subformula is not 0 or 1, |Row(P)| < n and |Col(P)| < n, but no proper extension of P has both |Row| < n and |Col| < n.
Bound for H1

Proposition: If P, Q are paths in P that end at the same node v, then they test the same set of R and T variables and assign them the same values.

Proof: Suppose P sets R(i) and Q does not (the other cases are similar).
The subformula at v cannot contain any term R(i) ∧ S(i,j)
⇒ for each j, Q sets S(i,j) = 0 or T(j) = 1 at some decision node
⇒ |Col(Q)| = n, contradiction.
Admissible Paths

Definition: A path P in P is admissible if for every i, j, the values P assigns to (R(i), S(i,j), T(j)) are consistent with one of the rows of the following table:

R(i) | S(i,j) | T(j)
  0  |   1    |  0
  1  |   0    |  0
  0  |   0    |  1
  1  |   0    |  1

Let A denote the set of admissible paths.
Admissible Paths

Theorem: Two distinct admissible paths P, Q end at different vertices.

Proof: By contradiction. Assume P and Q end at the same vertex (i.e., their subformulas are the same).
• Let v be the first (decision) node where P and Q differ, with variable x
  ▫ w.l.o.g. assume P sets x = 0 and Q sets x = 1
• If x is an R or T variable, we are done by the Proposition, so assume x = S(i,j)
• Q sets x = 1 ⇒ Q sets R(i) = T(j) = 0 (unit rule)
• By the Proposition, P also sets R(i) = T(j) = 0; contradiction.
Proof of Lower Bound

• Thus, it suffices to bound the number of admissible paths.
• Let P be an admissible path with |Row(P)| = n − 1
  ▫ For each i ∈ Row(P), consider the first of the variables R(i), S(i,1), S(i,2), …, S(i,n) that we encounter along P
  ▫ We could have set it to either 0 or 1 and still maintained admissibility up to that decision
• There are always at least n − 1 such "unforced" decisions
• Any different choice for these decisions leads to a different admissible path
  ⇒ # of admissible paths ≥ 2^(n−1)
Proof for Hk

Same basic structure; we only need to change the definition of an admissible path (shown here for k = 3):

R(i) | S1(i,j) | S2(i,j) | S3(i,j) | T(j)
  0  |    1    |    0    |    1    |  0
  1  |    0    |    1    |    0    |  1
  0  |    0    |    1    |    0    |  1
  1  |    0    |    1    |    0    |  1
Boolean combinations of hk0, ..., hkk

f is a Boolean function that depends on all its inputs:
Ψ = f(hk0, hk1, …, hkk)

We want to turn any FBDD F for Ψ into an FBDD for Hk.

Intuitively: to compute Ψ with an FBDD, you must compute h30, h31, h32, h33, so the FBDD can also compute Hk.
Transparent Subformulas

Definition: A formula Φ that is a restriction of Ψ is called transparent if for any two partial assignments θ1, θ2 with Φ = Ψ[θ1] = Ψ[θ2], we have hki[θ1] = hki[θ2] for all i.

From a transparent Φ, we can read off the values of the hki.

When will a subformula be transparent?
When is a subformula easy?

Definition: Let θ be a partial assignment to the variables of hk0, ..., hkk. A transversal in θ is a pair of indices (i, j) such that R(i) ∧ S1(i,j) is a prime implicant of hk0[θ], Sl(i,j) ∧ Sl+1(i,j) is a prime implicant of hkl[θ] for each l, and Sk(i,j) ∧ T(j) is a prime implicant of hkk[θ].

We say a formula Φ is transversal-free if there exists a θ with no transversals such that Φ = Ψ[θ].
Transversal-free subformulas are easy

[Figure (three slides): an n×n grid of S(i,j) variables with rows labelled R(1), …, R(n) and columns labelled T(1), …, T(n), illustrating why a transversal-free subformula admits a small FBDD]
Subformulas with few transversals are easy

Two transversals (i, j) and (i', j') are independent if i ≠ i' and j ≠ j'.

If a subformula has few independent transversals, then we can test the variables shared by the transversals first and make the formula transversal-free.

E.g., if all transversals go through R(i), then first test R(i); the resulting formula is transversal-free.
Subformulas with few transversals are easy

[Figure: a node with subformula Φ whose transversals all involve the variables R(1) and T(2); testing R(1) first, then T(2) on each branch, yields transversal-free subformulas]
A Pseudo-Unit Rule

• Transversal-free ⇒ easy for FBDDs ⇒ hard to prove lower bounds against
• So we need control over when a subformula becomes transversal-free
• This is just like the units for H1!

Definition: A variable X in a subformula Φ is an Hk-unit if Φ is not transversal-free but Φ[X = 1] is.
Transparent Subformula Lemma

Theorem: If a subformula is Hk-unit-free and has at least 4 independent transversals, then it is transparent.

Proof: See the paper.
Putting it all together

1. Do the unit-rule conversion with Hk-units
2. If a node has < 4 independent transversals, transform it as above
3. Now the FBDD is transparent except at nodes where we control ingress, so we can deduce the values of hk0, ..., hkk at every node.
Summary

• Quasi-polynomial conversion of any decision-DNNF into an FBDD (polynomial for k-DNFs)
• Exponential lower bounds for model-counting algorithms
• d-DNNFs and AND-FBDDs are exponentially more powerful than decision-DNNFs
• Applications in probabilistic databases
Open Problems

• A polynomial conversion of decision-DNNFs to FBDDs?
• A more powerful syntactic subclass of d-DNNFs than decision-DNNFs?
  ▫ d-DNNF is a semantic concept
  ▫ There is no efficient algorithm to test whether two sub-DAGs of an OR-node are simultaneously satisfiable
• Approximate model counting?
Thank You
Questions?