Provenance Management In Knowledge Graphs · 2017-08-18

Provenance Management In Knowledge Graphs Prof. Arnab Bhattacharya Dr. Srikanta B Jagannath Garima Gaur April 6, 2017

Provenance Management In Knowledge Graphs

Prof. Arnab Bhattacharya

Dr. Srikanta B Jagannath

Garima Gaur

April 6, 2017

KNOWLEDGE GRAPH

[Figure: example knowledge graph with actor nodes A1 to A4, movie nodes M1 to M4 and genre values "Com" and "Drama", connected by ActedIn and Genre edges.]

- Graphical way of representing knowledge.
- Belongs to the category of semantic networks.
- Directed or undirected graph with concepts as vertices and relationships between concepts as edges.

DYNAMIC KG

- The KG changes as new facts come in, resulting in deletions and insertions.
- How an answer set is affected depends on the query type and the nature of the change in the KG:
  - Top-k queries: the value of the parameter under consideration might have changed.
  - Descriptive query: a particular item in the answer set no longer satisfies the query condition.
  - Shortest path query: the answer is no longer correct.
  - Analytical query: the evaluated value is no longer valid.

WHY DOES IT MATTER?

- Critical decision making based on query results.
- Ever-growing knowledge graphs.

Recompute Query!!

HANDLING DESCRIPTIVE QUERY

- Trying to answer a simple question:

  Does the deletion/insertion of an edge e affect the query result R?

- Metadata provides better insight.
- Provenance: the origin of something.
- Various perspectives under one umbrella:
  - Why-provenance: comprises the data involved.
  - How-provenance: concerns the derivation process.

IN NEED OF PROVENANCE MODEL

[Figure: the example knowledge graph with actor nodes A1 to A4, movie nodes M1 to M4 and genre values "Com" and "Drama", connected by ActedIn and Genre edges.]

Query:

Select ?actor
where {
  ?actor ActedIn ?movie.
  ?movie Genre "Com".
}

Result R

A1
A2
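As a rough illustration of how this query produces R, the following sketch evaluates the two triple patterns over a small edge list. The edge endpoints (which actor acted in which movie) are an assumption; the slides only show the node and edge labels. The function name `comedy_actors` is hypothetical.

```python
# Edges of the query-relevant part of the example graph, as (subject, predicate, object).
# ASSUMPTION: the endpoints below are illustrative; the slides do not spell them out.
TRIPLES = [
    ("A1", "ActedIn", "M1"),
    ("A2", "ActedIn", "M1"),
    ("A2", "ActedIn", "M2"),
    ("M1", "Genre", "Com"),
    ("M2", "Genre", "Com"),
    ("A3", "ActedIn", "M3"),
    ("M3", "Genre", "Drama"),
]


def comedy_actors(triples):
    """Return actors that have an ActedIn edge to a movie whose Genre is "Com"."""
    # Pattern 2: ?movie Genre "Com"
    comedies = {s for (s, p, o) in triples if p == "Genre" and o == "Com"}
    # Pattern 1: ?actor ActedIn ?movie, joined on ?movie
    return sorted({s for (s, p, o) in triples if p == "ActedIn" and o in comedies})


print(comedy_actors(TRIPLES))  # ['A1', 'A2']
```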

OBVIOUS CHOICE – LINEAGE

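[Figure: the part of the graph relevant to the query, with actors A1 and A2, movies M1 and M2 and genre "Com", connected by three ActedIn edges and two Genre edges labeled e1 to e5.]

Query:

Select ?actor
where {
  ?actor ActedIn ?movie.
  ?movie Genre "Com".
}

Result R    Lineage
A1          {e1, e4}
A2          {e2, e3, e4, e5}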

LET’S DELETE AN EDGE

Assume that edge e1 gets deleted.

[Figure: the same subgraph after deleting e1; the remaining edges are e2, e3, e4 and e5.]

Result R

A1    {e1, e4}
A2    {e2, e3, e4, e5}

Updated Result R'

A2    {e2, e3, e4, e5}
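A minimal sketch of a lineage-based update, assuming the rule is "drop every answer whose lineage mentions the deleted edge" (the slides show the outcome for e1 but do not state the rule explicitly). The name `update_with_lineage` is hypothetical.

```python
# Lineage of each answer, copied from the Result R table.
LINEAGE = {
    "A1": {"e1", "e4"},
    "A2": {"e2", "e3", "e4", "e5"},
}


def update_with_lineage(lineage, deleted_edge):
    """ASSUMED rule: keep only answers whose lineage does not contain the deleted edge."""
    return {
        answer: edges
        for answer, edges in lineage.items()
        if deleted_edge not in edges
    }


updated = update_with_lineage(LINEAGE, "e1")
print(sorted(updated))  # ['A2']
```

Because a lineage is a flat set of edges, this check cannot tell whether an answer has another derivation that avoids the deleted edge.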

IS LINEAGE SUFFICIENT?

What if edge e3 gets deleted?

[Figure: the same subgraph after also deleting e3; the remaining edges are e2, e4 and e5.]

Result R'

A2    {e2, e3, e4, e5}

- A2 is still an answer, even though its lineage contains the deleted edge e3!
- Need a provenance model that can capture the derivation process.

PROVENANCE POLYNOMIAL

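[Figure: the part of the graph relevant to the query, with actors A1 and A2, movies M1 and M2 and genre "Com", connected by three ActedIn edges and two Genre edges labeled e1 to e5.]

Result R    Provenance polynomial
A1          e1·e4
A2          e2·e4 + e3·e5

- The provenance polynomial encodes the interaction of the involved edges.
- Each term of the polynomial is self-sufficient: the edges in a single term are enough, on their own, to derive the answer.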
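One way such polynomials could be derived is sketched below: every match of the query pattern contributes one term (the product of the edges it uses), and the terms of an answer are summed. This is an illustration under assumptions, not the authors' implementation; the edge endpoints are chosen only so that the output agrees with the polynomials on the slide, and `provenance_polynomials` and `show` are hypothetical names.

```python
from collections import defaultdict

# Edge id -> (subject, predicate, object).
# ASSUMPTION: endpoints are illustrative, picked to be consistent with the slide.
EDGES = {
    "e1": ("A1", "ActedIn", "M1"),
    "e2": ("A2", "ActedIn", "M1"),
    "e3": ("A2", "ActedIn", "M2"),
    "e4": ("M1", "Genre", "Com"),
    "e5": ("M2", "Genre", "Com"),
}


def provenance_polynomials(edges):
    """Map each answer to a list of terms; a term is the set of edges of one match."""
    polys = defaultdict(list)
    for acted_id, (actor, pred1, movie) in sorted(edges.items()):
        if pred1 != "ActedIn":
            continue
        for genre_id, (subject, pred2, genre) in sorted(edges.items()):
            # Join the two triple patterns on ?movie.
            if pred2 == "Genre" and subject == movie and genre == "Com":
                polys[actor].append(frozenset({acted_id, genre_id}))
    return dict(polys)


def show(poly):
    """Render a polynomial in the slide's notation, e.g. 'e2.e4 + e3.e5'."""
    return " + ".join(".".join(sorted(term)) for term in poly)


polys = provenance_polynomials(EDGES)
for answer in sorted(polys):
    print(answer, show(polys[answer]))
```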

HOW DELETION WORKS

- Membership function M : E → {0, 1}, where E is the set of edge variables.

  M(e) = 1 if e is part of the KG
  M(e) = 0 if e is deleted

- Substitute each edge variable e by M(e) and evaluate the polynomial.
- The result set retains all the answers whose corresponding provenance polynomial evaluates to a non-zero value.
- On deletion of edge e3,

  A2: e3·e5 + e2·e4 = 0·1 + 1·1 = 1
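The substitution can be sketched as follows, using the two polynomials from the slides. Representing a polynomial as a list of tuples of edge variables is an assumption made for illustration, as are the names `evaluate` and `surviving_answers`.

```python
# Provenance polynomials from the slides: a list of terms, each a tuple of edge variables.
POLYNOMIALS = {
    "A1": [("e1", "e4")],
    "A2": [("e2", "e4"), ("e3", "e5")],
}


def evaluate(poly, deleted):
    """Substitute M(e) for every edge variable and evaluate the sum of products."""
    def membership(edge):
        # M(e) = 0 if e is deleted, 1 if e is still part of the KG.
        return 0 if edge in deleted else 1

    total = 0
    for term in poly:
        product = 1
        for edge in term:
            product *= membership(edge)
        total += product
    return total


def surviving_answers(polys, deleted):
    """Answers whose polynomial evaluates to a non-zero value."""
    return sorted(a for a, p in polys.items() if evaluate(p, deleted) != 0)


print(evaluate(POLYNOMIALS["A2"], {"e3"}))           # 1
print(surviving_answers(POLYNOMIALS, {"e1", "e3"}))  # ['A2']
```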

OUR SYSTEM

- 2-step process
  - Search step: find candidate queries.
  - Confirmation step: evaluate polynomials to confirm the changes.
- YAGO dataset: 5.8M nodes, 22.5M edges and 39 relations.
- Achieved an update time of 2.6% of the RDF query execution time.
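The slides do not say how either step is implemented. The sketch below is one possible reading, assuming an inverted index from edges to the stored (query, answer) pairs whose polynomials mention them; the store layout, the query id `comedy_actors` and all function names are hypothetical.

```python
from collections import defaultdict

# HYPOTHETICAL store: query id -> {answer: polynomial as a list of terms}.
STORED = {
    "comedy_actors": {
        "A1": [("e1", "e4")],
        "A2": [("e2", "e4"), ("e3", "e5")],
    },
}


def build_index(stored):
    """ASSUMED inverted index: edge -> set of (query id, answer) pairs that mention it."""
    index = defaultdict(set)
    for query_id, answers in stored.items():
        for answer, poly in answers.items():
            for term in poly:
                for edge in term:
                    index[edge].add((query_id, answer))
    return index


def search_step(index, deleted_edge):
    """Find the candidate queries (and answers) the deleted edge could affect."""
    return sorted(index.get(deleted_edge, set()))


def confirmation_step(stored, candidates, deleted):
    """Return the candidates whose polynomial evaluates to zero under the deletions."""
    removed = []
    for query_id, answer in candidates:
        poly = stored[query_id][answer]
        # With 0/1 values the sum is non-zero iff some term has no deleted edge.
        if not any(all(e not in deleted for e in term) for term in poly):
            removed.append((query_id, answer))
    return removed


index = build_index(STORED)
candidates = search_step(index, "e1")
print(candidates)                                    # [('comedy_actors', 'A1')]
print(confirmation_step(STORED, candidates, {"e1"}))  # [('comedy_actors', 'A1')]
```

Only the polynomials returned by the search step are evaluated, so the stored query does not have to be re-run against the graph.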

Questions?

Thanks!