Database Tuples Play Cooperative Games

Post on 11-Nov-2021


Ester Livshits

Joint work with:

Leopoldo Bertossi, Benny Kimelfeld, Alon Reshef, Moshe Sebag

Ester Livshits Oxford Data and Knowledge Seminar 2

AUTHOR
Name    Affiliation
Alice   UCLA
Bob     NYU
Cathy   MIT
David   UCSD
Ellen   NYU

INSTITUTE
Name   State
UCLA   CA
UCSD   CA
NYU    NY
MIT    MA

PUBLICATION
Author   Paper
Alice    A
Alice    B
Bob      C
Cathy    C
Cathy    D
David    C

CITATIONS
Paper   Cits
A       18
B       2
C       8
D       12

q(z, w) :- AUTHOR(x, y), INSTITUTE(y, 'CA'), PUBLICATION(x, z), CITATIONS(z, w)

Query result:
Paper   Cits
A       18
B       2
C       8

Why did we obtain a particular answer?

Why did we not obtain some other answer?


Which tuples in the database explain the query result?

Measuring Contribution

➢ Causal responsibility [Meliou et al. 2010]
■ $t$ is a counterfactual cause for $q$ if $D \models q$ and $D \setminus \{t\} \not\models q$
■ $t$ is an actual cause for $q$ if $D \setminus \Gamma \models q$ and $D \setminus (\Gamma \cup \{t\}) \not\models q$ for some $\Gamma \subseteq D \setminus \{t\}$
■ The responsibility of $t$ is $\frac{1}{1 + |\Gamma_{\min}|}$, where $\Gamma_{\min}$ is a smallest such contingency set $\Gamma$
➢ Not extendable to aggregate queries
➢ May be counterintuitive

[Figure: example graph for the query "Is there a path from a to b?", with a contingency set marked]
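The responsibility definition above can be sketched by brute force. This is a minimal illustration, not the method of Meliou et al.: it assumes a monotone Boolean query given by its witnesses (sets of tuples that each make the query true), and the instance (`e1` to `e4`, read as edges of two a-to-b paths plus one unused edge) is hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def holds(witnesses, facts):
    """A monotone Boolean query holds if some witness is fully present."""
    return any(w <= facts for w in witnesses)

def responsibility(database, witnesses, t):
    """Return 1 / (1 + |Gamma_min|), or 0 if t is not an actual cause."""
    others = sorted(database - {t})
    for size in range(len(others) + 1):        # try the smallest contingency sets first
        for gamma in combinations(others, size):
            rest = database - set(gamma)
            # Gamma is a contingency set if q holds without Gamma
            # and stops holding once t is removed as well.
            if holds(witnesses, rest) and not holds(witnesses, rest - {t}):
                return Fraction(1, 1 + size)
    return Fraction(0)

# Hypothetical instance: one path uses edges e1 and e2, another uses only e3;
# e4 lies on no path from a to b.
database = frozenset({"e1", "e2", "e3", "e4"})
witnesses = [frozenset({"e1", "e2"}), frozenset({"e3"})]
scores = {t: responsibility(database, witnesses, t) for t in sorted(database)}
```

In this toy instance every edge on some path gets responsibility 1/2, including the edge that forms a path on its own, and the unused edge gets 0.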

Measuring Contribution

➢ Causal effect [Salimi et al. 2016]
■ View the database as a probabilistic database
■ $CE(t) = E[q \mid t \in D] - E[q \mid t \notin D]$

What makes the choice of a contribution score a good one?

Shapley Value

➢ A widely known profit-sharing formula in cooperative game theory
➢ Introduced by Lloyd Shapley in 1953
➢ Applied in various areas beyond cooperative game theory:
■ Pollution responsibility in environmental management
■ Influence measurement in social network analysis
■ Identifying candidate autism genes
■ Bargaining foundations in economics
■ Takeover corporate rights in law
■ Explanations in machine learning

Shapley Value

Set $A$ of players; wealth function $v : \mathcal{P}(A) \to \mathbb{R}$

[Figure: example wealth values 3, 7, and 42]

How to distribute the total wealth among the players?

Application        Players    Wealth
Machine learning   Features   Prediction
Query answering    Tuples     Answer
Inconsistency      Tuples     Measure

References: [Lundberg, Lee 2017], [L, Kimelfeld 2021], [L et al. 2020]

Shapley Value

$$\mathrm{Shapley}(A, v, a) = \sum_{B \subseteq A \setminus \{a\}} \frac{|B|! \, (|A| - |B| - 1)!}{|A|!} \bigl( v(B \cup \{a\}) - v(B) \bigr)$$

[Figure: example of a player joining a coalition; labels 72, 21, 25, +4]

The Shapley value of a player is the expected change in wealth caused by adding the player, over a uniformly random permutation of the players.
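The subset formula for the Shapley value can be evaluated directly on small games. The following is a minimal sketch using exact fractions; the three-player game `v` is a hypothetical example, not one from the talk.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial

def shapley(players, v, a):
    """Shapley(A, v, a), computed directly from the subset formula."""
    others = [p for p in players if p != a]
    n = len(players)
    total = Fraction(0)
    for k in range(n):                          # k = |B|
        weight = Fraction(factorial(k) * factorial(n - k - 1), factorial(n))
        for coalition in combinations(others, k):
            b = frozenset(coalition)
            total += weight * (v(b | {a}) - v(b))
    return total

# Hypothetical game: a coalition earns 1 if it contains "a"
# together with at least one of "b" and "c", and 0 otherwise.
def v(coalition):
    return 1 if "a" in coalition and ({"b", "c"} & coalition) else 0

players = ["a", "b", "c"]
values = {p: shapley(players, v, p) for p in players}
```

Here `values` is 2/3 for "a" and 1/6 each for "b" and "c", and the three values sum to v(A) = 1. The enumeration is exponential in the number of players, so it only serves as a reference for tiny inputs.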

Shapley Value for Database Queries

➢ Which tuples in the database explain the query result?

Over the AUTHOR, INSTITUTE, PUBLICATION, and CITATIONS tables above:

q(z, w) :- AUTHOR(x, y), PUBLICATION(x, z), CITATIONS(z, w)

$\mathrm{SUM}_w \langle q(z, w) \rangle$

Players: the AUTHOR tuples
Wealth function: the result of the aggregate query

SV(Alice) = 20
SV(Cathy) = 14.67
SV(Bob) = 2.67
SV(David) = 2.67
SV(Ellen) = 0
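The Shapley values listed for the authors can be reproduced by averaging marginal contributions over all permutations. This sketch assumes that only the AUTHOR tuples are players and that PUBLICATION and CITATIONS are fixed; the function names are illustrative. The affiliation column is omitted because this query does not constrain it.

```python
from fractions import Fraction
from itertools import permutations

AUTHORS = ["Alice", "Bob", "Cathy", "David", "Ellen"]    # players
PUBLICATION = [("Alice", "A"), ("Alice", "B"), ("Bob", "C"),
               ("Cathy", "C"), ("Cathy", "D"), ("David", "C")]
CITATIONS = {"A": 18, "B": 2, "C": 8, "D": 12}

def wealth(authors):
    """SUM of citations over the distinct papers of the given authors."""
    papers = {paper for (author, paper) in PUBLICATION if author in authors}
    return sum(CITATIONS[paper] for paper in papers)

def shapley_by_permutations(players, v):
    """Average marginal contribution of each player over all permutations."""
    totals = {p: Fraction(0) for p in players}
    orders = list(permutations(players))
    for order in orders:
        seen = set()
        for p in order:
            before = v(seen)
            seen.add(p)
            totals[p] += v(seen) - before
    return {p: total / len(orders) for p, total in totals.items()}

sv = shapley_by_permutations(AUTHORS, wealth)
```

The exact results are 20 for Alice, 44/3 for Cathy, 8/3 each for Bob and David, and 0 for Ellen, matching the rounded values on the slide. They sum to 40, the total number of citations.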

q(z, w) :- AUTHOR(x, y), INSTITUTE(y, 'CA'), PUBLICATION(x, z), CITATIONS(z, w)

The answer (A, 18) as a Boolean query:

q() :- AUTHOR(x, y), INSTITUTE(y, 'CA'), PUBLICATION(x, 'A'), CITATIONS('A', 18)

Outline

➢ Explaining Query Answers
➢ Computational Complexity
➢ Responsibility to Inconsistency

Computational Complexity

➢ A CQ $q$ is hierarchical if for every two existential variables $x$ and $y$:
■ $\mathit{Atoms}(x) \subseteq \mathit{Atoms}(y)$, or
■ $\mathit{Atoms}(y) \subseteq \mathit{Atoms}(x)$, or
■ $\mathit{Atoms}(x) \cap \mathit{Atoms}(y) = \emptyset$

Hierarchical example: q1() :- R(x, y), S(x, z)

Query                  Hierarchical   Non-hierarchical
SJFCQ                  PTIME          FP^#P-complete
SJFCQ with negations   PTIME          FP^#P-complete
sum / count            PTIME          FP^#P-complete

(SJFCQ: self-join-free CQ)

References: [L et al. ICDT 2020], [Reshef et al. PODS 2020]

q() :- AUTHOR(x, y), INSTITUTE(y, 'CA'), PUBLICATION(x, z)

This query is non-hierarchical: $\mathit{Atoms}(x)$ and $\mathit{Atoms}(y)$ share the AUTHOR atom, but neither set contains the other.

[Figure: diagram of the atom sets of x, y, and z]
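The hierarchical condition is easy to check mechanically. The sketch below assumes a Boolean self-join-free CQ, so every variable is existential and each relation name identifies one atom. The encoding of a query as a list of (relation, variables) pairs, with constants left out, is a choice made for this illustration.

```python
from itertools import combinations

def is_hierarchical(atoms):
    """atoms: list of (relation, variables) pairs of a Boolean self-join-free CQ."""
    atoms_of = {}                                # variable -> set of relations using it
    for relation, variables in atoms:
        for var in variables:
            atoms_of.setdefault(var, set()).add(relation)
    for x, y in combinations(atoms_of, 2):
        ax, ay = atoms_of[x], atoms_of[y]
        if not (ax <= ay or ay <= ax or ax.isdisjoint(ay)):
            return False
    return True

q1 = [("R", ("x", "y")), ("S", ("x", "z"))]
q2 = [("R", ("x",)), ("S", ("x", "y")), ("T", ("y",))]
# AUTHOR(x, y), INSTITUTE(y, 'CA'), PUBLICATION(x, z); the constant is omitted.
q_authors = [("AUTHOR", ("x", "y")), ("INSTITUTE", ("y",)), ("PUBLICATION", ("x", "z"))]

results = (is_hierarchical(q1), is_hierarchical(q2), is_hierarchical(q_authors))
```

The check reports q1 as hierarchical, and both q2() :- R(x), S(x, y), T(y) and the AUTHOR/INSTITUTE/PUBLICATION query as non-hierarchical.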

Non-hierarchical example: q2() :- R(x), S(x, y), T(y)

Conjunctive Queries

➢ To prove hardness, we consider the simplest non-hierarchical query: q_RST() :- R(x), S(x, y), T(y)
➢ Reduction from counting independent sets in a bipartite graph

[Figure: the relations R, S, and T]

➢ Each instance provides us with an equation over $|IS(g, k)|$
➢ $|IS(g, k)|$: the number of independent sets of size $k$ in $g$
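To make the quantity |IS(g, k)| concrete, here is a brute-force counter. It only illustrates the definition and is not part of the reduction. The small bipartite graph (left side {1, 2}, right side {3, 4}) is a hypothetical example.

```python
from itertools import combinations

def count_independent_sets(vertices, edges, k):
    """|IS(g, k)|: number of k-subsets of the vertices that contain no edge."""
    edge_set = {frozenset(e) for e in edges}
    return sum(
        1
        for subset in combinations(vertices, k)
        if not any(frozenset(pair) in edge_set for pair in combinations(subset, 2))
    )

# Hypothetical bipartite graph: left vertices 1 and 2, right vertices 3 and 4.
vertices = [1, 2, 3, 4]
edges = [(1, 3), (2, 3), (2, 4)]
counts = [count_independent_sets(vertices, edges, k) for k in range(len(vertices) + 1)]
```

For this graph, `counts` is `[1, 4, 3, 0, 0]`, so the graph has 8 independent sets in total.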

Example with negation:

q() :- AUTHOR(x, y), ¬INSTITUTE(y, 'CA'), PUBLICATION(x, z)

Examples with aggregation:

q(z, w) :- AUTHOR(x, y), PUBLICATION(x, z), CITATIONS(z, w)

$\mathrm{SUM}_w \langle q(z, w) \rangle$

$\mathrm{MAX}_w \langle q(z, w) \rangle$, $\mathrm{MIN}_w \langle q(z, w) \rangle$, $\mathrm{AVERAGE}_w \langle q(z, w) \rangle$

Hardness can be extended to general numerical queries

Approximation Complexity

➢ Computing the Shapley value is often hard
➢ The picture is more positive when allowing approximation
➢ Generalizes to unions of CQs

Multiplicative approximation (FPRAS):

$$\Pr\left[ \frac{f(x)}{1 + \epsilon} \le A(x, \epsilon, \delta) \le (1 + \epsilon) f(x) \right] \ge 1 - \delta$$

Query         Hierarchical   Non-hierarchical
SJFCQ         PTIME          FPRAS
sum / count   PTIME          FPRAS

➢ Additive approximation via Monte Carlo sampling
➢ Also a multiplicative approximation due to the “gap property”
➢ Does not hold when allowing negation
➢ Negation fundamentally changes the complexity picture!
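The Monte Carlo idea can be sketched as follows: sample random permutations and average the marginal contribution of the tuple of interest. This is an illustrative sketch, not the algorithm from the cited papers. It reuses the citation example from the earlier slide, with the AUTHOR tuples as players (an assumption), and the sample size and seed are arbitrary.

```python
import random

AUTHORS = ["Alice", "Bob", "Cathy", "David", "Ellen"]    # players
PUBLICATION = [("Alice", "A"), ("Alice", "B"), ("Bob", "C"),
               ("Cathy", "C"), ("Cathy", "D"), ("David", "C")]
CITATIONS = {"A": 18, "B": 2, "C": 8, "D": 12}

def wealth(authors):
    """SUM of citations over the distinct papers of the given authors."""
    papers = {paper for (author, paper) in PUBLICATION if author in authors}
    return sum(CITATIONS[paper] for paper in papers)

def monte_carlo_shapley(players, v, a, samples, rng):
    """Estimate Shapley(players, v, a) by sampling random permutations."""
    total = 0
    for _ in range(samples):
        order = list(players)
        rng.shuffle(order)
        before = set(order[:order.index(a)])     # players that precede a
        total += v(before | {a}) - v(before)
    return total / samples

rng = random.Random(0)    # fixed seed, only to make the sketch reproducible
estimate_bob = monte_carlo_shapley(AUTHORS, wealth, "Bob", 20000, rng)
```

The exact value for Bob is 8/3, and the estimate approaches it as the number of samples grows. Each sample needs only two evaluations of the wealth function, regardless of the number of players.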

Additive approximation:

$$\Pr\left[ f(x) - \epsilon \le A(x, \epsilon, \delta) \le f(x) + \epsilon \right] \ge 1 - \delta$$

Gap property: for every tuple $t$ in the database $D$, $\mathrm{Shapley}(t) = 0$ or $\mathrm{Shapley}(t) \ge \frac{1}{p(|D|)}$ for some polynomial $p$.

➢ With negation, the contribution can be negative

Register
Student   Course
Alice     OS
Alice     AI
Bob       OS
Cathy     DB
Cathy     IC

Student
Name
Alice
Bob
Cathy
David

TA
Name
Alice
Bob
David

q() :- Student(x), ¬TA(x), Register(x, y)

In some cases, deciding whether $\mathrm{Shapley}(t) \neq 0$ is hard
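The effect of negation can be seen by computing exact Shapley values on this small instance. The sketch treats every tuple of Student, TA, and Register as a player, which is an assumption made for illustration, and evaluates the subset formula by brute force.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial

STUDENT = ["Alice", "Bob", "Cathy", "David"]
TA = ["Alice", "Bob", "David"]
REGISTER = [("Alice", "OS"), ("Alice", "AI"), ("Bob", "OS"),
            ("Cathy", "DB"), ("Cathy", "IC")]

# Assumption of this sketch: all twelve tuples are players.
PLAYERS = ([("Student", s) for s in STUDENT]
           + [("TA", s) for s in TA]
           + [("Register", s, c) for (s, c) in REGISTER])

def q(facts):
    """1 if some registered student is not a TA, else 0."""
    return int(any(
        ("Student", f[1]) in facts and ("TA", f[1]) not in facts
        for f in facts if f[0] == "Register"
    ))

def shapley(players, v, a):
    """Shapley value of a, computed from the subset formula."""
    others = [p for p in players if p != a]
    n = len(players)
    total = Fraction(0)
    for k in range(n):
        weight = Fraction(factorial(k) * factorial(n - k - 1), factorial(n))
        for coalition in combinations(others, k):
            b = frozenset(coalition)
            total += weight * (v(b | {a}) - v(b))
    return total

sv = {p: shapley(PLAYERS, q, p) for p in PLAYERS}
```

In this setting TA(Alice) has a negative value, since adding it can only turn the query from true to false. TA(David) and Student(David) have value 0 because David is not registered to any course, and all values sum to q(D) = 1.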

Banzhaf Power Index

➢ Causal effect [Salimi et al. 2016]
■ View the database as a probabilistic database
■ $CE(t) = E[q \mid t \in D] - E[q \mid t \notin D]$
➢ Coincides with the Banzhaf Power Index [Banzhaf 1965]
➢ Our complexity results extend to this measure
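As a small illustration, the Banzhaf index averages a player's marginal contribution over all subsets of the other players with equal weight. This equals the difference of expectations in the causal-effect formula under the assumption that every other player is present independently with probability 1/2. The game `v` below is a hypothetical example.

```python
from fractions import Fraction
from itertools import combinations

def banzhaf(players, v, a):
    """Average marginal contribution of a over all subsets of the other players."""
    others = [p for p in players if p != a]
    total = 0
    for k in range(len(others) + 1):
        for coalition in combinations(others, k):
            b = frozenset(coalition)
            total += v(b | {a}) - v(b)
    return Fraction(total, 2 ** len(others))

# Hypothetical game: a coalition earns 1 if it contains "a"
# together with at least one of "b" and "c", and 0 otherwise.
def v(coalition):
    return 1 if "a" in coalition and ({"b", "c"} & coalition) else 0

players = ["a", "b", "c"]
index = {p: banzhaf(players, v, p) for p in players}
```

For this game the index is 3/4 for "a" and 1/4 for each of "b" and "c". Unlike the Shapley value, these do not have to sum to v(A).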


Inconsistent Databases

➢ A database is inconsistent if it violates integrity constraints

Cullen Douglas
dbo:birthPlace
▪ dbr:California
▪ dbr:Florida

Marion Jones
dbo:height
▪ 1.524
▪ 1.778

Irene Tedrow
dbo:deathPlace
▪ dbr:California
▪ dbr:Hollywood,_Los_Angeles
▪ dbr:New_York_City

Inconsistent Databases

➢ Imprecise data sources
■ Crowd, Web pages, social encyclopedias, sensors, …
➢ Imprecise data generation
■ Natural-language processing, sensor/signal processing, image recognition, …
➢ Conflicts in data integration
■ Crowd + enterprise data + KB + Web + ...
➢ Data staleness
■ Entities change address, status, ...
➢ And so on…

Idea: Quantify the extent to which integrity constraints are violated

Reliability estimation: How reliable is a new data source?
Progress indication: Progress bar for data repairing
Action prioritization: Which tuples are mostly responsible for inconsistency?

How can we quantify the responsibility of individual tuples to inconsistency?

Inconsistency measure
Responsibility sharing mechanism

How to Measure Inconsistency?

➢ Several measures proposed by the KR and DB communities
■ The drastic measure – 1 if inconsistent, 0 otherwise [Thimm 2017]
■ #minimal inconsistent subsets [Hunter and Konieczny 2008]
■ #problematic tuples [Grant and Hunter 2011]
■ Minimal #tuples to remove to satisfy the constraints [Grant and Hunter 2013], [Bertossi 2018]
■ #maximal consistent subsets [Grant and Hunter 2011]
➢ What makes a measure a good one? [L et al. SIGMOD 2021]

Responsibility sharing mechanism: Shapley Value

Computational Complexity

Measure               lhs chain   No lhs chain, tractable c-repair   Other
drastic               PTIME       FP^#P-complete                     FP^#P-complete
#min-inconsistent     PTIME       PTIME                              PTIME
#problematic tuples   PTIME       PTIME                              PTIME
cardinality repair    PTIME       Open                               NP-hard
#repairs              PTIME       FP^#P-complete                     FP^#P-complete

Example FD: birthCity → birthState

Tractable Measures

     Train   Departs   Arrives   Time   Duration
f1   16      NYP       BBY       1030   315
f2   16      NYP       PVD       1030   250
f3   16      PHL       WIL       1030   20
f4   16      PHL       BAL       1030   70
f5   16      PHL       WAS       1030   120
f6   16      BBY       PHL       1030   260
f7   16      BBY       NYP       1030   260
f8   16      BBY       WAS       1030   420
f9   16      WAS       PVD       1030   390

Train Time → Departs
Train Time Duration → Arrives

➢ $I_{MI}$: number of minimal inconsistent subsets

[Figure: the tuple f4 is inserted into the permutation f7 f1 f3 f9 f2 f5 f8 f6; label +2]

$f$ increases the value of $I_{MI}$ by $k$ if $k$ of the previous tuples conflict with it

➢ $I_P$: number of problematic tuples

[Figure: the tuple f4 is inserted into the permutation f7 f1 f3 f9 f2 f5 f8 f6; label +1]

$f$ increases the value of $I_P$ by $k$ if $(k - 1)$ of the previous tuples:
(1) conflict with $f$,
(2) do not conflict with other tuples that occur before $f$.
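The two tractable measures can be computed directly on the train example. The sketch relies on the fact that, for FDs, the minimal inconsistent subsets are exactly the conflicting pairs of tuples. The closed form used for the Shapley value of the number of minimal inconsistent subsets (half the number of conflicts a tuple takes part in) follows from each conflicting pair splitting its unit of value equally between its two tuples. Function names are illustrative.

```python
from fractions import Fraction
from itertools import combinations

# (Train, Departs, Arrives, Time, Duration), copied from the slide.
TUPLES = {
    "f1": (16, "NYP", "BBY", 1030, 315), "f2": (16, "NYP", "PVD", 1030, 250),
    "f3": (16, "PHL", "WIL", 1030, 20),  "f4": (16, "PHL", "BAL", 1030, 70),
    "f5": (16, "PHL", "WAS", 1030, 120), "f6": (16, "BBY", "PHL", 1030, 260),
    "f7": (16, "BBY", "NYP", 1030, 260), "f8": (16, "BBY", "WAS", 1030, 420),
    "f9": (16, "WAS", "PVD", 1030, 390),
}

def conflict(s, t):
    """True if s and t jointly violate one of the two FDs."""
    same_train_time = s[0] == t[0] and s[3] == t[3]
    fd1 = same_train_time and s[1] != t[1]                    # Train Time -> Departs
    fd2 = same_train_time and s[4] == t[4] and s[2] != t[2]   # Train Time Duration -> Arrives
    return fd1 or fd2

def conflicts(names):
    """All conflicting pairs among the named tuples."""
    return [(a, b) for a, b in combinations(sorted(names), 2)
            if conflict(TUPLES[a], TUPLES[b])]

def i_mi(names):
    """Number of minimal inconsistent subsets (conflicting pairs)."""
    return len(conflicts(names))

def i_p(names):
    """Number of problematic tuples (tuples in at least one conflict)."""
    return len({f for pair in conflicts(names) for f in pair})

all_pairs = conflicts(TUPLES)
degree = {f: sum(f in pair for pair in all_pairs) for f in TUPLES}
shapley_mi = {f: Fraction(d, 2) for f, d in degree.items()}
```

On the full table there are 30 conflicting pairs and all 9 tuples are problematic. The Shapley value of f4 with respect to the first measure is 3, and the values of all tuples sum to 30.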

An FD set has a left-hand side (lhs) chain if, for every two FDs X → Y and X' → Y' in the set, X ⊆ X' or X' ⊆ X.

lhs chain: {B → A, BC → D, BCF → E}, since $\{B\} \subseteq \{B, C\} \subseteq \{B, C, F\}$

No lhs chain: {B → A, BC → D, BCG → E, BCF → H}, since $\{B, C, G\} \not\subseteq \{B, C, F\}$ and $\{B, C, F\} \not\subseteq \{B, C, G\}$
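Whether an FD set has an lhs chain can be checked by sorting the left-hand sides by size and testing consecutive containment. This is a minimal sketch; the encoding of an FD as a (set of lhs attributes, rhs attribute) pair is a choice made for this illustration.

```python
def has_lhs_chain(fds):
    """fds: list of (lhs, rhs) pairs, where lhs is a set of attribute names."""
    lhs_sets = sorted((frozenset(lhs) for lhs, _ in fds), key=len)
    # Containment is transitive, so checking neighbours in size order suffices.
    return all(small <= big for small, big in zip(lhs_sets, lhs_sets[1:]))

chain = [({"B"}, "A"), ({"B", "C"}, "D"), ({"B", "C", "F"}, "E")]
no_chain = [({"B"}, "A"), ({"B", "C"}, "D"),
            ({"B", "C", "G"}, "E"), ({"B", "C", "F"}, "H")]
train = [({"Train", "Time"}, "Departs"),
         ({"Train", "Time", "Duration"}, "Arrives")]

results = (has_lhs_chain(chain), has_lhs_chain(no_chain), has_lhs_chain(train))
```

The first and third FD sets have an lhs chain; the second does not, because {B, C, G} and {B, C, F} are incomparable.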

Left-Hand Side Chain

[Figure: the train tuples f1 to f9 arranged as a tree that follows the FDs Train Time → Departs and Train Time Duration → Arrives. Root (Train, Time): 16, 1030. Departs level: NYP, PHL, BBY, WAS. Duration level: 315, 250, 20, 70, 120, 260, 420, 390. Arrives level: BBY, PVD, WIL, BAL, WAS, PHL, NYP, WAS, PVD. Branches are marked as independent or conflicting.]

Efficiency: $\sum_{a \in A} \mathrm{Shapley}(A, v, a) = v(A)$

Approximation Complexity

Measure               lhs chain   No lhs chain, tractable c-repair   Other
drastic               PTIME       FPRAS                              FPRAS
#min-inconsistent     PTIME       PTIME                              PTIME
#problematic tuples   PTIME       PTIME                              PTIME
cardinality repair    PTIME       FPRAS                              No FPRAS
#repairs              PTIME       Open                               Open

Open: an FPRAS would imply an FPRAS for #MIS in a bipartite graph, a long-standing open problem

Concluding Remarks

➢ Two situations where we wish to quantify the responsibility of tuples:
■ Query answering
■ Database inconsistency
➢ We treat the contribution from the viewpoint of game theory
➢ We investigated the computational complexity

Thank you for listening!

Questions?