Model Counting of Query Expressions: Limitations of Propositional Methods

Paul Beame¹  Jerry Li²  Sudeepa Roy¹  Dan Suciu¹
¹University of Washington  ²MIT
Probabilistic Databases

AsthmaPatient: Ann (x1), Bob (x2)
Friend: (Ann, Joe) (y1), (Ann, Tom) (y2), (Bob, Tom) (y3)
Smoker: Joe (z1), Tom (z2)

Boolean query Q: ∃x ∃y AsthmaPatient(x) ∧ Friend(x, y) ∧ Smoker(y)

• Tuples are probabilistic (and independent)
  ▫ "Ann" is present with probability 0.3, i.e. Pr(x1) = 0.3
• Boolean formula F_Q,D = (x1 ∧ y1 ∧ z1) ∨ (x1 ∧ y2 ∧ z2) ∨ (x2 ∧ y3 ∧ z2)
  ▫ Q is true on D ⟺ F_Q,D is true
• What is the probability that Q is true on D?
• Two main evaluation techniques: lifted vs. grounded inference

[Figure: the three tables annotated with tuple probabilities 0.3, 0.1, 0.5, 1.0, 0.9, 0.5, 0.7]
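As a minimal sketch of grounded evaluation, the lineage formula above can be evaluated by summing the probabilities of all satisfying tuple assignments. Only Pr(x1) = 0.3 is stated on the slide; the remaining probability values below are illustrative stand-ins.

```python
from itertools import product

# Independent tuple probabilities. Only Pr(x1) = 0.3 is given on the
# slide; the other values are illustrative assumptions.
prob = {"x1": 0.3, "x2": 0.1, "y1": 0.9, "y2": 0.5, "y3": 0.7,
        "z1": 0.5, "z2": 1.0}

def lineage(a):
    # F_Q,D = (x1 ∧ y1 ∧ z1) ∨ (x1 ∧ y2 ∧ z2) ∨ (x2 ∧ y3 ∧ z2)
    return ((a["x1"] and a["y1"] and a["z1"]) or
            (a["x1"] and a["y2"] and a["z2"]) or
            (a["x2"] and a["y3"] and a["z2"]))

def grounded_probability():
    # Sum the probability of every truth assignment that satisfies F_Q,D.
    # Exponential in the number of tuples: grounded inference with no pruning.
    total = 0.0
    names = sorted(prob)
    for bits in product([False, True], repeat=len(names)):
        a = dict(zip(names, bits))
        if lineage(a):
            p = 1.0
            for v in names:
                p *= prob[v] if a[v] else 1 - prob[v]
            total += p
    return total
```

Since every disjunct of F_Q,D requires x1 or x2, the result is bounded by Pr(x1) + Pr(x2), which is a quick sanity check on the enumeration.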
Lifted Inference

Q: ∃x ∃y AsthmaPatient(x) ∧ Friend(x, y) ∧ Smoker(y)

Dichotomy Theorem [Dalvi, Suciu 12]: For any union of conjunctive queries (UCQ), evaluation is either
▫ #P-hard, or
▫ polynomial-time computable using lifted inference,
▫ and there is a simple condition that tells which case holds.
Grounded Inference

F_Q,D = (x1 ∧ y1 ∧ z1) ∨ (x1 ∧ y2 ∧ z2) ∨ (x2 ∧ y3 ∧ z2)

Grounded inference is equivalent to model counting: counting the number of satisfying assignments of F_Q,D.

Folklore sentiment: lifted inference is strictly stronger than grounded inference.
Our examples give the first clear proof of this.
Outline
• DPLL algorithms
  ▫ Extensions (Caching & Component Analysis)
  ▫ Knowledge Compilation (FBDDs & Decision-DNNFs)
• Our Contributions
  ▫ DLDD to FBDD conversion
  ▫ FBDD lower bounds
• Sketch of FBDD lower bound
• Conclusions
Model Counting

• Probability Computation Problem: given F and independent Pr(x), Pr(y), Pr(z), …, compute Pr(F)
• Model Counting Problem: given a Boolean formula F, compute #F = # of models (satisfying assignments) of F

e.g. F = (x ∨ y) ∧ (x ∨ u ∨ w) ∧ (¬x ∨ u ∨ ¬w ∨ z)
#F = # of assignments to x, y, u, w, z that make F true
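The definition can be checked by brute force. This sketch counts the models of the example formula, written with explicit connectives; the negations are an assumption, reconstructed so that the numbers agree with the DPLL trace shown later in the deck.

```python
from itertools import product

def F(x, y, u, w, z):
    # The running example, with connectives made explicit
    # (negations reconstructed from the DPLL trace):
    # F = (x ∨ y) ∧ (x ∨ u ∨ w) ∧ (¬x ∨ u ∨ ¬w ∨ z)
    return (x or y) and (x or u or w) and ((not x) or u or (not w) or z)

def model_count(f, nvars):
    # Enumerate all 2^nvars assignments and count the satisfying ones.
    return sum(f(*bits) for bits in product([False, True], repeat=nvars))
```

With this reading, #F = 20 of the 32 assignments, i.e. Pr(F) = 5/8 under the uniform distribution, matching the root value of the trace.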
Exact Model Counters

Search-based / DPLL-based (explore the assignment space and count the satisfying assignments):
• CDP [Birnbaum et al. '99]
• Relsat [Bayardo Jr. et al. '97, '00]
• Cachet [Sang et al. '05]
• SharpSAT [Thurley '06]
• …

Knowledge-compilation-based (compile F into a "computation-friendly" form):
• c2d [Darwiche '04]
• Dsharp [Muise et al. '12]
• …

[Survey by Gomes et al. '09]

Both techniques explicitly or implicitly
• use DPLL-based algorithms
• produce FBDD or Decision-DNNF compiled forms (output or trace) [Huang-Darwiche '05, '07]
DPLL Algorithms
Davis, Putnam, Logemann, Loveland [Davis et al. '60, '62]

F: (x ∨ y) ∧ (x ∨ u ∨ w) ∧ (¬x ∨ u ∨ ¬w ∨ z)

Assume the uniform distribution for simplicity, i.e. Pr(x) = ½ for every variable.

// basic DPLL:
Function Pr(F):
  if F = false then return 0
  if F = true then return 1
  select a variable x, return ½ Pr(F|x=0) + ½ Pr(F|x=1)

[Figure: the DPLL search tree for F, branching on x, then y, u, w, z; residual subformulas include y ∧ (u ∨ w) (Pr 3/8), u ∨ ¬w ∨ z (Pr 7/8), u ∨ w (Pr ¾), ¬w ∨ z (Pr ¾), and w (Pr ½); the root value is Pr(F) = 5/8]
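The basic DPLL procedure above can be sketched directly, representing a CNF as a list of clauses, each clause a map from variable to the value that satisfies it. `CNF` below is the running example, with the negations assumed as reconstructed earlier.

```python
from fractions import Fraction

# The running example F = (x ∨ y)(x ∨ u ∨ w)(¬x ∨ u ∨ ¬w ∨ z);
# each clause maps a variable to the value that satisfies it.
CNF = [{"x": 1, "y": 1},
       {"x": 1, "u": 1, "w": 1},
       {"x": 0, "u": 1, "w": 0, "z": 1}]

def restrict(cnf, var, val):
    # Apply var := val: drop satisfied clauses, shrink the others.
    out = []
    for c in cnf:
        if var in c:
            if c[var] == val:
                continue          # clause satisfied, drop it
            c = {v: b for v, b in c.items() if v != var}
            if not c:
                return None       # empty clause: formula is false
        out.append(c)
    return out

def pr(cnf):
    # Basic DPLL under the uniform distribution:
    # Pr(F) = ½ Pr(F|x=0) + ½ Pr(F|x=1)
    if cnf is None:
        return Fraction(0)
    if not cnf:
        return Fraction(1)
    x = next(iter(cnf[0]))        # select a variable
    return (pr(restrict(cnf, x, 0)) + pr(restrict(cnf, x, 1))) / 2
```

`pr(CNF)` returns 5/8, the root value of the trace; `Fraction` keeps the arithmetic exact.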
DPLL Algorithms

F: (x ∨ y) ∧ (x ∨ u ∨ w) ∧ (¬x ∨ u ∨ ¬w ∨ z)

[Figure: the same DPLL run as on the previous slide, drawn as a tree]

The trace is a Decision-Tree for F.
Extensions to DPLL
• Caching subformulas
• Component analysis
• Conflict-directed clause learning
  ▫ Affects the efficiency of the algorithm, but not the final "form" of the trace

Traces of:
• DPLL + caching + (clause learning) → FBDD
• DPLL + caching + components + (clause learning) → Decision-DNNF
Caching

F: (x ∨ y) ∧ (x ∨ u ∨ w) ∧ (¬x ∨ u ∨ ¬w ∨ z)

// basic DPLL:
Function Pr(F):
  if F = false then return 0
  if F = true then return 1
  select a variable x, return ½ Pr(F|x=0) + ½ Pr(F|x=1)

// DPLL with caching:
Cache F and Pr(F); look it up before computing

[Figure: the DPLL tree for F; subformulas already in the cache are looked up instead of recomputed]
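The caching extension can be sketched by memoizing on a canonical form of the residual formula. This is a toy illustration of the idea, not the data structures real counters such as Cachet use; `restrict` is the clause-restriction helper repeated here so the sketch is self-contained.

```python
from fractions import Fraction

def restrict(cnf, var, val):
    # Apply var := val to a CNF given as a list of {variable: value} clauses.
    out = []
    for c in cnf:
        if var in c:
            if c[var] == val:
                continue          # clause satisfied
            c = {v: b for v, b in c.items() if v != var}
            if not c:
                return None       # empty clause: formula is false
        out.append(c)
    return out

def pr_cached(cnf, cache=None):
    # DPLL with caching: memoize Pr on a canonical (hashable) form of the
    # residual formula, so each distinct subformula is solved only once.
    if cache is None:
        cache = {}
    if cnf is None:
        return Fraction(0)
    if not cnf:
        return Fraction(1)
    key = frozenset(frozenset(c.items()) for c in cnf)
    if key not in cache:
        x = min(v for c in cnf for v in c)   # deterministic variable choice
        cache[key] = (pr_cached(restrict(cnf, x, 0), cache) +
                      pr_cached(restrict(cnf, x, 1), cache)) / 2
    return cache[key]

# The running example (negations reconstructed from the trace):
F = [{"x": 1, "y": 1}, {"x": 1, "u": 1, "w": 1},
     {"x": 0, "u": 1, "w": 0, "z": 1}]
```

The cache is what turns the tree-shaped trace into a DAG: two branches that reach the same residual formula now share one node.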
Caching & FBDDs

F: (x ∨ y) ∧ (x ∨ u ∨ w) ∧ (¬x ∨ u ∨ ¬w ∨ z)

[Figure: the trace with cached nodes merged, now a DAG]

The trace is a decision-DAG for F:
FBDD (Free Binary Decision Diagram) or ROBP (Read-Once Branching Program)
• Every variable is tested at most once on any path
• All internal nodes are decision nodes
Component Analysis

F: (x ∨ y) ∧ (x ∨ u ∨ w) ∧ (¬x ∨ u ∨ ¬w ∨ z)

// basic DPLL:
Function Pr(F):
  if F = false then return 0
  if F = true then return 1
  select a variable x, return ½ Pr(F|x=0) + ½ Pr(F|x=1)

// DPLL with component analysis (and caching):
if F = G ∧ H, where G and H have disjoint sets of variables,
  Pr(F) = Pr(G) × Pr(H)

[Figure: the trace for F; at the node with subformula y ∧ (u ∨ w), the variable-disjoint components y and u ∨ w are solved independently]
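Component analysis can be layered on top of caching: split the residual clause set into variable-disjoint components and multiply their probabilities. A sketch under the same toy CNF representation (the greedy component grouping is for clarity; real counters use union-find style structures):

```python
from fractions import Fraction

def restrict(cnf, var, val):
    # Apply var := val: drop satisfied clauses, shrink the others.
    out = []
    for c in cnf:
        if var in c:
            if c[var] == val:
                continue
            c = {v: b for v, b in c.items() if v != var}
            if not c:
                return None       # empty clause: formula is false
        out.append(c)
    return out

def components(cnf):
    # Group clauses into variable-disjoint connected components.
    groups = []                   # each entry: [variable set, clause list]
    for c in cnf:
        hit = [g for g in groups if g[0] & set(c)]
        new = [set(c) | set().union(*(g[0] for g in hit)),
               [c] + [cl for g in hit for cl in g[1]]]
        groups = [g for g in groups if g not in hit] + [new]
    return [g[1] for g in groups]

def pr_cc(cnf, cache=None):
    # DPLL + caching + component analysis:
    # Pr(G ∧ H) = Pr(G) · Pr(H) when G and H share no variables.
    if cache is None:
        cache = {}
    if cnf is None:
        return Fraction(0)
    if not cnf:
        return Fraction(1)
    comps = components(cnf)
    if len(comps) > 1:
        p = Fraction(1)
        for g in comps:
            p *= pr_cc(g, cache)
        return p
    key = frozenset(frozenset(c.items()) for c in cnf)
    if key not in cache:
        x = min(v for c in cnf for v in c)
        cache[key] = (pr_cc(restrict(cnf, x, 0), cache) +
                      pr_cc(restrict(cnf, x, 1), cache)) / 2
    return cache[key]

F = [{"x": 1, "y": 1}, {"x": 1, "u": 1, "w": 1},
     {"x": 0, "u": 1, "w": 0, "z": 1}]
```

On the running example, the split fires at the subformula y ∧ (u ∨ w): the factors ½ and ¾ are computed separately and multiplied, which is exactly the decomposable AND-node of the Decision-DNNF trace.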
Components & Decision-DNNF

F: (x ∨ y) ∧ (x ∨ u ∨ w) ∧ (¬x ∨ u ∨ ¬w ∨ z)

[Figure: the trace with an AND-node whose two sub-DAGs, y and u ∨ w, share no variables]

The trace is a Decision-DNNF [Huang-Darwiche '05, '07]:
FBDD + "decomposable" AND-nodes
(the two sub-DAGs of an AND-node do not share variables)
Decomposable Logic Decision Diagrams (DLDDs)
• Generalization of Decision-DNNFs:
  ▫ not just decomposable AND-nodes
  ▫ also NOT-nodes and decomposable binary OR, XOR, etc.
    (the sub-DAGs of each node are labelled by disjoint sets of variables)
Outline
• DPLL algorithms
  ▫ Extensions (Caching & Component Analysis)
  ▫ Knowledge Compilation (FBDDs & Decision-DNNFs)
• Our Contributions
  ▫ DLDD to FBDD conversion
  ▫ FBDD lower bounds
• Sketch of FBDD lower bound
• Conclusions
How much power does component analysis add?

Theorem [UAI 2013]: a decision-DNNF for F of size N ⇒ an FBDD for F of size N^(log N + 1)
• If F is a k-DNF or k-CNF, then the FBDD is of size N^k
• The conversion algorithm runs in time linear in the size of its output

Theorem [ICDT 2014]: the conversion works even for DLDDs
An important class of queries

H1(x,y) = R(x) ∧ S(x,y)  ∨  S(x,y) ∧ T(y)
Hk(x,y) = R(x) ∧ S1(x,y)  ∨ … ∨  Si(x,y) ∧ Si+1(x,y)  ∨ … ∨  Sk(x,y) ∧ T(y)
               (hk0)                    (hki)                      (hkk)

▫ [Dalvi, Suciu 12]: Hk is #P-hard to evaluate
▫ However, some Boolean combinations of the hki are poly-time computable using lifted inference, e.g. (h30 ∧ h32) ∨ (h30 ∧ h33) ∨ (h31 ∧ h33)
New Lower Bounds

Theorem: Any Boolean function f of hk0, ..., hkk that depends on all of them requires FBDD(f) = 2^Ω(n),
which implies DLDD(f) = 2^Ω(√n), and DLDD(f) = 2^(Ω(n)/k) if f is monotone.

Corollary: Grounded inference requires 2^Ω(√n) time even on probabilistic DB instances that have poly(n)-time algorithms using lifted inference.

This implies a separation between grounded and lifted inference.
Outline
• DPLL algorithms
  ▫ Extensions (Caching & Component Analysis)
  ▫ Knowledge Compilation (FBDDs & Decision-DNNFs)
• Our Contributions
  ▫ DLDD to FBDD conversion
  ▫ FBDD lower bounds
• Sketch of FBDD lower bound
• Conclusions
Outline of Proof
• FBDD → FBDD with unit rule
• Prove hardness of Hk for FBDDs with the unit rule
• Then reduce FBDDs for functions over hk0, ..., hkk to FBDDs for Hk
A "unit rule" for FBDDs

Definition: A variable x in a Boolean formula Φ is a unit for Φ if Φ = x ∨ G for some G.

Definition: An FBDD for a formula F follows the unit rule if each node tests a unit variable whenever a unit variable exists in the corresponding subformula.

[Figure: a unit node testing x, with its 1-edge going directly to the 1-sink and its 0-edge to G]
A "unit rule" for FBDDs

For any variable X in a DNF formula Φ, let deg(X) be the number of variables that co-occur with X in some clause, and let ∆(Φ) = max_X deg(X).

Note: ∆(Hk) = n

Lemma: Given an FBDD of size N for a monotone DNF formula Φ, there exists an FBDD for Φ that follows the unit rule of size at most ∆(Φ)·N.
Proof of Lemma: A Local Transform

[Figure: on the left, a decision node y with subformula y(x1 ∨ x2 ∨ … ∨ xn) ∨ H, whose 1-child has subformula x1 ∨ x2 ∨ … ∨ xn ∨ H[y = 1], in which x1, …, xn are units. On the right, the 1-child is replaced by a chain of unit nodes testing x1, …, xn, each with its 1-edge to the 1-sink, ending at H[y = 1, x1 = x2 = … = xn = 0]]

Note: the size increases by a factor of at most ∆(Φ).
Proof of Lemma: A Local Transform
• But this might cause us to test a variable twice along a path, violating the read-once property!
• In that case, simply remove the second test
  ▫ Point all edges that point to it to its 0-child
  ▫ Does not increase the size!
• The resulting structure is then read-once
Proof of Lemma

Apply these transformations globally. Then it suffices to show the following:

Claim: Let v be any node in the original FBDD, with corresponding subformula Φv. In the new FBDD, v has corresponding subformula Φv[X = 0], where X is the set of units of Φv.

Proof: Every unit of Φv became a unit somewhere along each path to v, where we set it to 0.
Back to Hk

H1 = R(x) ∧ S(x,y)  ∨  S(x,y) ∧ T(y)

Over the complete database of size n:

H1 = ⋁_{i,j ≤ n} R(i) ∧ S(i,j)  ∨  ⋁_{i,j ≤ n} S(i,j) ∧ T(j)

Key idea: If a subformula of H1 is unit-free, then every conjunct clearly comes from either h10 or h11.
Bound for H1

• Let F be an FBDD for H1 that follows the unit rule.
• For any partial path P in F starting at the root, let
  Row(P) = {i : P tests R(i) or S(i,j) at a decision node, for some j}
  Col(P) = {j : P tests T(j) or S(i,j) at a decision node, for some i}
• Let P be the set of partial paths P such that the resulting subformula is not 0 or 1, |Row(P)| < n and |Col(P)| < n, but no proper extension of P has both |Row| < n and |Col| < n.
Bound for H1

Proposition: If P, Q are paths in P that end at the same node v, then they test the same set of R and T variables and assign them the same values.

Proof: Suppose P sets R(i) and Q does not (the other cases are similar).
The subformula at v cannot contain any term R(i) ∧ S(i,j)
⇒ for each j, Q sets S(i,j) = 0 or T(j) = 1 at some decision node
⇒ |Col(Q)| = n, contradiction.
Admissible Paths

Definition: A path P in P is admissible if for every i, j, the values P assigns to (R(i), S(i,j), T(j)) are consistent with one of the rows of the following table:

R(i) | S(i,j) | T(j)
  0  |   1    |  0
  1  |   0    |  0
  0  |   0    |  1
  1  |   0    |  1

Let A denote the set of admissible paths.
Admissible Paths

Theorem: Two distinct admissible paths P, Q end at different vertices.

Proof: By contradiction. Assume P and Q end at the same vertex (i.e., their subformulas are the same).
• Let v be the first (decision) node where P and Q differ, with variable x
  ▫ w.l.o.g. assume P sets x = 0 and Q sets x = 1
• If x is an R or T variable, we are done by the Proposition, so assume x = S(i,j)
• Q sets x = 1 ⇒ Q sets R(i) = T(j) = 0 (unit rule)
• By the Proposition, P also sets R(i) = T(j) = 0; contradiction.
Proof of Lower Bound

• Thus, it suffices to bound the number of admissible paths.
• Let P be an admissible path with |Row(P)| = n − 1
  ▫ For each i ∈ Row(P), consider the first of the variables R(i), S(i,1), S(i,2), …, S(i,n) that we encounter along P
  ▫ We could have set it to either 0 or 1 and still maintained admissibility up to that decision
• There are always at least n − 1 such "unforced" decisions
• Any different choice for these decisions leads to a different admissible path
  ⇒ # of admissible paths ≥ 2^(n−1)
Proof for Hk

Same basic structure; we only need to change the definition of an admissible path (shown here for k = 3):

R(i) | S1(i,j) | S2(i,j) | S3(i,j) | T(j)
  0  |    1    |    0    |    1    |  0
  1  |    0    |    1    |    0    |  1
  0  |    0    |    1    |    0    |  1
  1  |    0    |    1    |    0    |  1
Boolean combinations of hk0, ..., hkk

f is a Boolean function that depends on all its inputs:
Ψ = f(hk0, hk1, …, hkk)

We want to turn any FBDD F for Ψ into an FBDD for Hk.

Intuitively: to compute Ψ with an FBDD, you must compute h30, h31, h32, h33, so the FBDD can also compute Hk.
Transparent Subformulas

Definition: A formula Φ that is a restriction of Ψ is called transparent if for any two partial assignments θ1, θ2 with Φ = Ψ[θ1] = Ψ[θ2], we have hki[θ1] = hki[θ2] for all i.

From a transparent Φ, we can read off the values of the hki.

When will a subformula be transparent?
When is a subformula easy?

Definition: Let θ be a partial assignment to the variables of hk0, ..., hkk. A transversal in θ is a pair of indices (i, j) such that R(i) ∧ S1(i,j) is a prime implicant of hk0[θ], Sl(i,j) ∧ Sl+1(i,j) is a prime implicant of hkl[θ] for each l, and Sk(i,j) ∧ T(j) is a prime implicant of hkk[θ].

We say a formula Φ is transversal-free if there exists a θ with no transversals such that Φ = Ψ[θ].
Transversal-free subformulas are easy

[Figure (three slides): an n×n grid of S(i,j) variables with rows labelled R(1), …, R(n) and columns labelled T(1), …, T(n), illustrating why a transversal-free subformula admits a small FBDD]
Subformulas with few transversals are easy

Two transversals (i, j) and (i', j') are independent if i ≠ i' and j ≠ j'.

If a subformula has few independent transversals, then we can test the variables shared by the transversals first and make the formula transversal-free.

E.g., if all transversals go through R(i), then first test R(i); the resulting formula is transversal-free.
Subformulas with few transversals are easy

[Figure: a node with subformula Φ whose transversals all involve the variables R(1) and T(2); testing R(1) first, then T(2) on each branch, yields transversal-free subformulas]
A Pseudo-Unit Rule

• Transversal-free ⇒ easy for FBDDs ⇒ hard to prove lower bounds against
• So we need control over when a subformula becomes transversal-free
• This is just like the units for H1!

Definition: A variable X in a subformula Φ is an Hk-unit if Φ is not transversal-free but Φ[X = 1] is.
Transparent Subformula Lemma

Theorem: If a subformula is Hk-unit-free and has at least 4 independent transversals, then it is transparent.

Proof: See the paper.
Putting it all together

1. Do the unit-rule conversion with Hk-units
2. If a node has < 4 independent transversals, transform it as above
3. Now the FBDD is transparent except at nodes where we control ingress, so we can deduce the values of hk0, ..., hkk at every node.
Summary

• Quasi-polynomial conversion of any decision-DNNF into an FBDD (polynomial for k-DNFs)
• Exponential lower bounds for model-counting algorithms
• d-DNNFs and AND-FBDDs are exponentially more powerful than decision-DNNFs
• Applications in probabilistic databases
Open Problems

• A polynomial conversion of decision-DNNFs to FBDDs?
• A more powerful syntactic subclass of d-DNNFs than decision-DNNFs?
  ▫ d-DNNF is a semantic concept
  ▫ There is no efficient algorithm to test whether two sub-DAGs of an OR-node are simultaneously satisfiable
• Approximate model counting?
Thank You
Questions?