A Dichotomy in the Complexity of Deletion Propagation with Functional Dependencies

PODS 2012: 2012 ACM SIGMOD/PODS Conference, Scottsdale, Arizona, USA. Benny Kimelfeld, IBM Research – Almaden.

Page 1: A Dichotomy in the Complexity of Deletion Propagation with Functional Dependencies

A Dichotomy in the Complexity of Deletion Propagation with Functional Dependencies

2012 ACM SIGMOD/PODS Conference, Scottsdale, Arizona, USA

PODS 2012

Benny Kimelfeld

IBM Research – Almaden

Page 2: A Dichotomy in the Complexity of Deletion Propagation with Functional Dependencies

Deletion Propagation

• Translate a tuple deletion on the view back to the source relations … properly

• Classic database problem
  – Specializing the more general view-update problem
  – [Dayal & Bernstein 1982; Cosmadakis & Papadimitriou 1984; Keller 1986; Cui & Widom 2001; Buneman & Khanna & Tan 2002; Cong & Fan & Geerts 2006; …]

• Renewed motivation: debug/causality for false positives [K, Vondrák, Williams, 2011]

• Various definitions of “properly” were studied
  – Minimize the view side effect (this work!)
    • # view tuples lost except the intentional one
  – Minimize the source side effect
    • # source tuples to delete
    • = maximal “responsibility” for an answer [Meliou et al., 2010]

Page 3: A Dichotomy in the Complexity of Deletion Propagation with Functional Dependencies

Example: File Access

Access(u,f) :– UserGroup(u,g), GroupFile(g,f)

UserGroup
user    group
Emma    ai
Emma    db
Olivia  os
Olivia  db
Jacob   ai

GroupFile
group   file
ai      a.txt
ai      b.txt
db      a.txt
db      b.txt
os      a.txt

Access = UserGroup ⋈ GroupFile
user    file
Emma    a.txt
Emma    b.txt
Olivia  a.txt
Olivia  b.txt
Jacob   a.txt
Jacob   b.txt

Delete source rows, s.t. Emma won’t access a.txt. But, maintain maximum access permissions!

[Cui & Widom 2001; Buneman et al. 2002]
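The view in this example can be reproduced with a few lines of Python. This is only an illustrative sketch: the names `USER_GROUP`, `GROUP_FILE`, and `access` are assumptions introduced here, not part of the slides.

```python
# Source relations from the File Access example.
USER_GROUP = {("Emma", "ai"), ("Emma", "db"), ("Olivia", "os"),
              ("Olivia", "db"), ("Jacob", "ai")}
GROUP_FILE = {("ai", "a.txt"), ("ai", "b.txt"), ("db", "a.txt"),
              ("db", "b.txt"), ("os", "a.txt")}


def access(user_group, group_file):
    """Access(u,f) :- UserGroup(u,g), GroupFile(g,f): join on g, project g out."""
    return {(u, f) for (u, g) in user_group for (g2, f) in group_file if g == g2}


view = access(USER_GROUP, GROUP_FILE)
for user, file in sorted(view):
    print(user, file)
```

Because the group variable is projected out, each view tuple may have several derivations. For instance, (Emma, a.txt) is derived through both ai and db, which is why deleting it requires touching more than one source row.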

Page 5: A Dichotomy in the Complexity of Deletion Propagation with Functional Dependencies

Example: File Access

Access(u,f) :– UserGroup(u,g), GroupFile(g,f)

Delete source rows, s.t. Emma won’t access a.txt. But, maintain maximum access permissions!

(GroupFile, UserGroup, and Access tables as on Page 3.)

Side-effect free (& minimal side effect)

[Cui & Widom 2001; Buneman et al. 2002]
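For an instance this small, a side-effect-free solution can be found by exhaustive search. The sketch below is a brute-force illustration only (exponential in the number of source rows), not an algorithm from the slides; `best_deletion` and the other names are assumptions made for this example.

```python
from itertools import combinations

USER_GROUP = {("Emma", "ai"), ("Emma", "db"), ("Olivia", "os"),
              ("Olivia", "db"), ("Jacob", "ai")}
GROUP_FILE = {("ai", "a.txt"), ("ai", "b.txt"), ("db", "a.txt"),
              ("db", "b.txt"), ("os", "a.txt")}


def access(user_group, group_file):
    """Access(u,f) :- UserGroup(u,g), GroupFile(g,f)."""
    return {(u, f) for (u, g) in user_group for (g2, f) in group_file if g == g2}


def best_deletion(target):
    """Try every set of source rows to delete, smallest sets first, and keep
    the one that removes `target` while leaving the largest view."""
    facts = ([("UserGroup", t) for t in sorted(USER_GROUP)] +
             [("GroupFile", t) for t in sorted(GROUP_FILE)])
    best = None
    for k in range(len(facts) + 1):
        for deleted in combinations(facts, k):
            ug = USER_GROUP - {t for rel, t in deleted if rel == "UserGroup"}
            gf = GROUP_FILE - {t for rel, t in deleted if rel == "GroupFile"}
            view = access(ug, gf)
            if target in view:
                continue  # not a solution: the undesired answer survives
            if best is None or len(view) > len(best[1]):
                best = (set(deleted), view)
    return best


target = ("Emma", "a.txt")
deleted, surviving = best_deletion(target)
full_view = access(USER_GROUP, GROUP_FILE)
print("delete:", sorted(deleted))
print("side-effect free:", surviving == full_view - {target})
```

On this data the search deletes UserGroup(Emma, ai) and GroupFile(db, a.txt), after which every other Access tuple still has a derivation.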

Page 6: A Dichotomy in the Complexity of Deletion Propagation with Functional Dependencies

Formal Definitions

Schema S: rel. symbols + functional dependencies (fd)
  R1,…,Rm          Ri: attribute-set → attribute

Conjunctive Query (CQ) Q:
  Q(y1, y2, y3) :– R1(x1, y1), R2(x1, 'ibm'), R3(x2, y1, y2, x3), R4(x4, y3)
  • head variables: y1, y2, y3
  • existential variables: x1, x2, x3, x4
  • each Ri(…) in the body is an atom
  • No self joins!

Input:
• DB D over S
• Answer a ∈ Q(D) to delete

Solution: E ⊆ D s.t. a ∉ Q(E)
• Side-effect free: Q(E) = Q(D) – {a}
• Optimal: |Q(E)| is maximal

Page 7: A Dichotomy in the Complexity of Deletion Propagation with Functional Dependencies

Complexity Questions

What is the complexity of

• Deciding if a side-effect-free solution exists?

• Finding an optimal solution?
  – Or one w/ approximately minimal side effect?
  – Or one w/ approximately maximal # surviving answers?
    • Not the same [K, Vondrák, Williams, 2011]

Data complexity:

Fixed: Schema S, CQ Q

Input: DB D over S, answer a ∈ Q(D) to delete

Page 8: A Dichotomy in the Complexity of Deletion Propagation with Functional Dependencies

Unirelation Algorithm (1Rel): Example

Access(u,f) :– UserGroup(u,g), GroupFile(g,f)

Delete a = (Emma, a.txt)

(GroupFile, UserGroup, and Access tables as on Page 3.)

[Buneman et al., 2002]

Page 9: A Dichotomy in the Complexity of Deletion Propagation with Functional Dependencies

Unirelation Algorithm (1Rel): Example

Access(u,f) :– UserGroup(u,g), GroupFile(g,f)

Delete a = (Emma, a.txt)

(GroupFile, UserGroup, and Access tables as on Page 3.)

Better than previous ⇒ selected solution

Recall: there is an even better solution (side-effect free)

[Buneman et al., 2002]

Page 10: A Dichotomy in the Complexity of Deletion Propagation with Functional Dependencies

1Rel: General Case

Q has k atoms, over relations R1, R2, …, Rk of D; undesired a ∈ Q(D)

Solution i (i = 1,…,k): delete from Ri each tuple consistent w/ a

Select best among solution 1, solution 2, …, solution k
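The general case of 1Rel can be sketched as follows. This is a minimal illustration under stated assumptions: a CQ is encoded as a head tuple plus a list of (relation, variables) atoms without constants or self joins, and the helper names `evaluate` and `one_rel` are hypothetical.

```python
def evaluate(head, body, db):
    """Answers of a conjunctive query, by backtracking over the atoms."""
    answers = set()

    def extend(i, binding):
        if i == len(body):
            answers.add(tuple(binding[v] for v in head))
            return
        rel, args = body[i]
        for row in db[rel]:
            new = dict(binding)
            # Bind unseen variables; reject rows that clash with the binding.
            if all(new.setdefault(v, c) == c for v, c in zip(args, row)):
                extend(i + 1, new)

    extend(0, {})
    return answers


def one_rel(head, body, db, a):
    """For each atom, delete from its relation every tuple consistent with
    the answer a; return the candidate with the most surviving answers."""
    target = dict(zip(head, a))
    best = None
    for rel, args in body:
        consistent = {row for row in db[rel]
                      if all(target.get(v, c) == c for v, c in zip(args, row))}
        candidate = dict(db)
        candidate[rel] = db[rel] - consistent
        view = evaluate(head, body, candidate)
        if best is None or len(view) > len(best[2]):
            best = (rel, consistent, view)
    return best


db = {
    "UserGroup": {("Emma", "ai"), ("Emma", "db"), ("Olivia", "os"),
                  ("Olivia", "db"), ("Jacob", "ai")},
    "GroupFile": {("ai", "a.txt"), ("ai", "b.txt"), ("db", "a.txt"),
                  ("db", "b.txt"), ("os", "a.txt")},
}
head = ("u", "f")
body = [("UserGroup", ("u", "g")), ("GroupFile", ("g", "f"))]
rel, deleted, view = one_rel(head, body, db, ("Emma", "a.txt"))
print(rel, sorted(deleted), len(view))
```

On the Access example, deleting Emma's rows from UserGroup leaves four answers, while deleting the a.txt rows from GroupFile leaves three, so the first candidate is selected. Neither is side-effect free, consistent with the remark on Page 9.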

Page 11: A Dichotomy in the Complexity of Deletion Propagation with Functional Dependencies

Head Domination [K, Vondrák, Williams, 2011]

Q: A CQ over a schema S

G∃[Q]: nodes = atoms(Q); edges = “sharing ≥1 existential var.”

head domination: ∀ C ∈ CC(G∃[Q]) ∃ φ ∈ atoms(Q) s.t. headVars(C) ⊆ vars(φ)

(CC = Connected Components)

Q(y1, y2, y3) :– R1(x1, y1), R2(x1, y2), R3(y1, y2), R4(x2, y2, y3)

Q(y1, y2) :– R1(x, y1), R2(x, y2)   (same shape as Access(u,f))

Q(y1, y2) :– R1(x1, y1), R2(x1, y2), R3(x1, y1, y2)
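The head-domination test is purely syntactic, so it is easy to check mechanically. The following sketch assumes a CQ is given as a head tuple and a list of (relation, variables) atoms; the function names are illustrative and not from the slides.

```python
def components(exist):
    """Connected components of the graph whose nodes are atom indices and whose
    edges join atoms sharing at least one existential variable."""
    comps, seen = [], set()
    for start in range(len(exist)):
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:
            j = stack.pop()
            if j in comp:
                continue
            comp.add(j)
            stack.extend(k for k in range(len(exist))
                         if k not in comp and exist[j] & exist[k])
        seen |= comp
        comps.append(comp)
    return comps


def has_head_domination(head, body):
    """True iff for every component C some atom contains all of headVars(C)."""
    head_set = set(head)
    exist = [set(args) - head_set for _, args in body]
    for comp in components(exist):
        comp_head = set().union(*(set(body[i][1]) for i in comp)) & head_set
        if not any(comp_head <= set(args) for _, args in body):
            return False
    return True


q1 = (("y1", "y2", "y3"), [("R1", ("x1", "y1")), ("R2", ("x1", "y2")),
                           ("R3", ("y1", "y2")), ("R4", ("x2", "y2", "y3"))])
q2 = (("y1", "y2"), [("R1", ("x", "y1")), ("R2", ("x", "y2"))])
q3 = (("y1", "y2"), [("R1", ("x1", "y1")), ("R2", ("x1", "y2")),
                     ("R3", ("x1", "y1", "y2"))])
for head, body in (q1, q2, q3):
    print(has_head_domination(head, body))
```

For the three queries above the check gives True, False, True: in the second query, R1 and R2 form one component whose head variables y1 and y2 never occur together in a single atom.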

Page 12: A Dichotomy in the Complexity of Deletion Propagation with Functional Dependencies

Previous Dichotomy Theorem [KVW 2011]

Let Q be a CQ over a schema S (no self joins)

[K, Vondrák, Williams, 2011], no FDs:

Q has head domination ⇒ 1Rel returns an optimal solution (in PTime)

otherwise ⇒ deciding whether a side-effect-free solution exists is NP-complete; NP-hard to find an (αQ-approx.) optimal solution

Q(y1, y2, y3) :– R1(x1, y1), R2(x1, y2), R3(y1, y2), R4(x2, y2, y3)   PTime (1Rel)

Q(y1, y2) :– R1(x, y1), R2(x, y2)   NP-hard (same shape as Access(u,f))

Q(y1, y2) :– R1(x1, y1), R2(x1, y2), R3(x1, y1, y2)   PTime (1Rel)

Page 13: A Dichotomy in the Complexity of Deletion Propagation with Functional Dependencies

Access Example Revisited

Delete (Emma, a.txt)

(GroupFile, UserGroup, and Access tables as on Page 3.)

No FDs: NP-hard

group ← file: PTime

Page 14: A Dichotomy in the Complexity of Deletion Propagation with Functional Dependencies

Access Example Revisited

Delete (Emma, a.txt)

(GroupFile, UserGroup, and Access tables as on Page 3.)

No FDs: NP-hard

user → group: PTime

group ← file: PTime

Page 15: A Dichotomy in the Complexity of Deletion Propagation with Functional Dependencies

Access Example Revisited

Delete (Emma, a.txt)

(GroupFile, UserGroup, and Access tables as on Page 3.)

No FDs: NP-hard

user → group: PTime

group ← file: PTime

user ← group: PTime

Page 16: A Dichotomy in the Complexity of Deletion Propagation with Functional Dependencies

Access Example Revisited

Delete (Emma, a.txt)

(GroupFile, UserGroup, and Access tables as on Page 3.)

No FDs: NP-hard

user → group: PTime

group ← file: PTime

user ← group: PTime

group → file: PTime

Every nontrivial set of FDs brings the problem to PTime

Page 17: A Dichotomy in the Complexity of Deletion Propagation with Functional Dependencies

Additional Examples

Q(y, y1, y2) :– R1(y1, x1), R(x1, y, x2), R2(y2, x2)

Q(y, y1, y2) :– R1(x1, y1), R(x1, y, x2), R2(x2, y2)

Q(y, y1, y2) :– R1(x1, y1), R(x1, y, x2), R2(x2, y2)

PTime

NP-hard

NP-hard

Page 18: A Dichotomy in the Complexity of Deletion Propagation with Functional Dependencies

Dichotomy with FDs

Let Q be a CQ over a schema S (no self joins)

[K, Vondrák, Williams, 2011], no FDs:

Q has head domination ⇒ 1Rel returns an optimal solution (in PTime)

otherwise ⇒ deciding whether a side-effect-free solution exists is NP-complete; NP-hard to find an (αQ-approx.) optimal solution

This paper (FDs):

Q+ has functional head domination ⇒ 1Rel* returns an optimal solution (in PTime)

otherwise ⇒ deciding whether a side-effect-free solution exists is NP-complete; NP-hard to find an (αQ-approx.) optimal solution

Depending on the CQ and FDs, the problem is either straightforward or hard!

Remove a tuple only if it is used for the undesired answer

Page 19: A Dichotomy in the Complexity of Deletion Propagation with Functional Dependencies

FDs Among Variables

Access(u,f) :– UserGroup(u,g), GroupFile(g,f)

FD: group → file ⇒ g → f

FD: user → group ⇒ u → g

Together: u → f, {u,g} → f

Definition:

CQ Q over schema S, U, V ⊆ variables(Q)

U → V: ∀ D ∈ db(S), ∀ μ1, μ2 ∈ hom(Q→D): μ1 = μ2 on U ⇒ μ1 = μ2 on V
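One way to derive FDs among variables is to lift each schema FD to the variables of the atom over that relation and then take the usual attribute-set closure. The sketch below assumes this syntactic derivation (each derived implication holds under the definition above, but the sketch is not claimed to be the paper's procedure). The encoding of schema FDs as (left-hand positions, right-hand position) pairs and all names are assumptions for illustration.

```python
def variable_fds(body, schema_fds):
    """Lift FDs on relation positions to FDs on the variables of each atom."""
    fds = []
    for rel, args in body:
        for lhs, rhs in schema_fds.get(rel, []):
            fds.append((frozenset(args[i] for i in lhs), args[rhs]))
    return fds


def closure(variables, fds):
    """All variables functionally determined by `variables` under `fds`."""
    closed = set(variables)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= closed and rhs not in closed:
                closed.add(rhs)
                changed = True
    return closed


body = [("UserGroup", ("u", "g")), ("GroupFile", ("g", "f"))]

# FD group -> file on GroupFile(group, file): position 0 determines position 1.
only_gf = variable_fds(body, {"GroupFile": [((0,), 1)]})
# Additionally, FD user -> group on UserGroup(user, group).
both = variable_fds(body, {"GroupFile": [((0,), 1)], "UserGroup": [((0,), 1)]})

print(sorted(closure({"g"}, only_gf)))
print(sorted(closure({"u"}, both)))
```

With both FDs in place the closure of {u} contains g and f, matching the derived dependencies u → f and {u,g} → f on the slide.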

Page 20: A Dichotomy in the Complexity of Deletion Propagation with Functional Dependencies

The CQ Q+

Definition:

CQ Q over schema S, U, V ⊆ variables(Q)

U → V: ∀ D ∈ db(S), ∀ μ1, μ2 ∈ hom(Q→D): μ1 = μ2 on U ⇒ μ1 = μ2 on V

Q+ : add to Q’s head every x s.t. headVars → x

Access(u,f) :– UserGroup(u,g), GroupFile(g,f)

group ← file ⇒ g ← {u,f} ⇒ Access+(u,g,f) :– UserGroup(u,g), GroupFile(g,f)

Tractability Condition: Q+ has functional head domination
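Computing the head of Q+ amounts to closing the head variables under the FDs among variables. The sketch below assumes the FDs are derived syntactically from the schema FDs, with each schema FD encoded as a (left-hand positions, right-hand position) pair; `q_plus_head` and the other names are hypothetical.

```python
def variable_fds(body, schema_fds):
    """Lift FDs on relation positions to FDs on the variables of each atom."""
    return [(frozenset(args[i] for i in lhs), args[rhs])
            for rel, args in body
            for lhs, rhs in schema_fds.get(rel, [])]


def closure(variables, fds):
    """All variables functionally determined by `variables` under `fds`."""
    closed = set(variables)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= closed and rhs not in closed:
                closed.add(rhs)
                changed = True
    return closed


def q_plus_head(head, body, schema_fds):
    """Head of Q+: the original head followed by every other variable x
    such that headVars -> x."""
    added = closure(head, variable_fds(body, schema_fds)) - set(head)
    return tuple(head) + tuple(sorted(added))


head = ("u", "f")
body = [("UserGroup", ("u", "g")), ("GroupFile", ("g", "f"))]

# FD file -> group on GroupFile(group, file): position 1 determines position 0.
plus = q_plus_head(head, body, {"GroupFile": [((1,), 0)]})
print(plus)
```

With the FD file → group, the variable g joins the head, so Access+ has no existential variables left. Without any FDs the head is unchanged.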

Page 21: A Dichotomy in the Complexity of Deletion Propagation with Functional Dependencies

Functional Head Domination

Q: A CQ over a schema S

G∃[Q]: nodes = atoms(Q); edges = “sharing ≥1 existential var.”

head domination: ∀ C ∈ CC(G∃[Q]) ∃ φ ∈ atoms(Q) s.t. vars(φ) ⊇ headVars(C)

functional head domination: ∀ C ∈ CC(G∃[Q]) ∃ φ ∈ atoms(Q) s.t. vars(φ) → headVars(C)

Access(u,f) :– UserGroup(u,g), GroupFile(g,f)

group → file ⇒ {u,g} → {u,f}

Tractability Condition: Q+ has functional head domination
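The tractability condition can be checked mechanically for the Access query under each FD from the earlier slides. The sketch below assumes FDs among variables are obtained by lifting schema FDs to atoms and closing them syntactically; the encoding and every function name are assumptions for illustration.

```python
def variable_fds(body, schema_fds):
    """Lift FDs on relation positions to FDs on the variables of each atom."""
    return [(frozenset(args[i] for i in lhs), args[rhs])
            for rel, args in body
            for lhs, rhs in schema_fds.get(rel, [])]


def closure(variables, fds):
    """All variables functionally determined by `variables` under `fds`."""
    closed = set(variables)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= closed and rhs not in closed:
                closed.add(rhs)
                changed = True
    return closed


def components(exist):
    """Components of the graph joining atoms that share an existential var."""
    comps, seen = [], set()
    for start in range(len(exist)):
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:
            j = stack.pop()
            if j in comp:
                continue
            comp.add(j)
            stack.extend(k for k in range(len(exist))
                         if k not in comp and exist[j] & exist[k])
        seen |= comp
        comps.append(comp)
    return comps


def q_plus_has_fhd(head, body, schema_fds):
    """True iff Q+ has functional head domination."""
    fds = variable_fds(body, schema_fds)
    plus_head = closure(head, fds)                 # head variables of Q+
    exist = [set(args) - plus_head for _, args in body]
    for comp in components(exist):
        comp_head = set().union(*(set(body[i][1]) for i in comp)) & plus_head
        # Some atom must functionally determine all head variables of comp.
        if not any(comp_head <= closure(args, fds) for _, args in body):
            return False
    return True


head = ("u", "f")
body = [("UserGroup", ("u", "g")), ("GroupFile", ("g", "f"))]
cases = {
    "no FDs": {},
    "user -> group": {"UserGroup": [((0,), 1)]},
    "group -> user": {"UserGroup": [((1,), 0)]},
    "group -> file": {"GroupFile": [((0,), 1)]},
    "file -> group": {"GroupFile": [((1,), 0)]},
}
results = {name: q_plus_has_fhd(head, body, fds) for name, fds in cases.items()}
print(results)
```

Only the FD-free case fails the condition, in line with the Access Example Revisited slides, where every single FD moves the problem to PTime.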

Page 22: A Dichotomy in the Complexity of Deletion Propagation with Functional Dependencies

Examples

Tractability Condition: Q+ has functional head domination

Q(y, y1, y2) :– R1(x1, y1), R(x1, y, x2), R2(x2, y2)

{y, y1, y2} → x2

Q+(y, y1, y2, x2) :– R1(x1, y1), R(x1, y, x2), R2(x2, y2)

PTime (1Rel*)

Q(y, y1, y2) :– R1(x1, y1), R(x1, y, x2), R2(x2, y2)

NP-hard

Page 23: A Dichotomy in the Complexity of Deletion Propagation with Functional Dependencies

Example: Key-Preserving Views

Theorem [Cong, Fan, Geerts, 2006]:

Q preserves keys* ⇒ deletion propagation in PTime

* Each relation has a key; none of the key attributes are projected out

Tractability Condition: Q+ has functional head domination

For CQs w/o self joins, follows directly from our positive side:

Q preserves keys

⇒ Q+ has no existential vars ⇒ G∃[Q+] has no edges

⇒ Q+ trivially has functional head domination (every connected component is a node, dominated by itself…)

⇒ 1Rel* returns an optimal solution

Page 24: A Dichotomy in the Complexity of Deletion Propagation with Functional Dependencies

About the Proof

• The positive side is fairly simple
  – … once the tractability condition is found

• The negative side is intricate
  – Reduction from the special case of the Access CQ
  – Challenge: simulating Access(u,f) by an instance that satisfies all the FDs
  – Central concept: graph separation on the variable graph of the CQ

Q(y1, y2) :– R1(y1, x), R2(x, y2)

→

Q'(y1, y2) :– R1(y1, x1, x), R2(x, x2, y2), R3(x1, x2)

Page 25: A Dichotomy in the Complexity of Deletion Propagation with Functional Dependencies

Conclusions & Ongoing Work

• Studied deletion propagation in the presence of functional dependencies

• Established a dichotomy in complexity:
  – PTime by a straightforward algorithm vs.
  – Hardness (of approximation)

• Generalizes previously established special cases: no FDs, key-preserving views

• Ongoing work: deletion of multiple answers
  – Preview: trichotomy
    • Straightforward
    • Hard but approximable (by a constant factor)
    • Hard to approximate

Questions?