Provenance Management In Knowledge Graphs · 2017-08-18

Post on 17-Jul-2020

Provenance Management In Knowledge Graphs

Prof. Arnab Bhattacharya
Dr. Srikanta B Jagannath

Garima Gaur

April 6, 2017

KNOWLEDGE GRAPH

[Figure: example knowledge graph with nodes A1, A2, A3, A4, M1, M2, M3, M4 and the literals "Com" and "Drama"; six edges labelled ActedIn and four edges labelled Genre.]

- Graphical way of representing knowledge.
- Belongs to the category of semantic networks.
- Directed or undirected graph with concepts as vertices and relationships between concepts as edges.
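As a minimal sketch, a directed graph of this kind can be held as a set of (subject, relation, object) triples. The endpoints of the edges below are an assumption: the slide gives the node and edge labels, but the extracted text does not preserve which nodes each edge connects. `neighbours` is a hypothetical helper name.

```python
from collections import defaultdict

# Each edge is a (subject, relation, object) triple.
# The endpoints are illustrative; only the labels come from the slide.
triples = {
    ("A1", "ActedIn", "M1"),
    ("A2", "ActedIn", "M1"),
    ("A2", "ActedIn", "M2"),
    ("M1", "Genre", "Com"),
    ("M2", "Genre", "Com"),
}

# Adjacency index: vertex -> list of (relation, target vertex).
adjacency = defaultdict(list)
for subject, relation, obj in triples:
    adjacency[subject].append((relation, obj))

def neighbours(vertex, relation):
    """Return the vertices reachable from `vertex` over edges labelled `relation`."""
    return sorted(o for r, o in adjacency[vertex] if r == relation)

print(neighbours("A2", "ActedIn"))  # ['M1', 'M2']
```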

DYNAMIC KG

- New facts keep coming in, resulting in deletions and insertions.
- How an answer set is affected depends on the query type and the nature of the change in the KG:
  - Top-k queries: the value of the parameter under consideration might have changed.
  - Descriptive query: a particular item in the answer set no longer satisfies the query condition.
  - Shortest path query: the answer is no longer correct.
  - Analytical query: the evaluated value is no longer valid.

WHY DOES IT MATTER ?

- Critical decision making based on query result.
- Ever growing Knowledge Graphs.

Recompute Query!!

HANDLING DESCRIPTIVE QUERY

- Trying to answer a simple question:

  Does the deletion/insertion of an edge e affect the query result R?

- Metadata provides better insight.
- Provenance: the origin of something.
- Various perspectives under one umbrella:
  - Why-provenance: comprises the data involved.
  - How-provenance: concerns the derivation process.

IN NEED OF PROVENANCE MODEL

[Figure: the example knowledge graph from the KNOWLEDGE GRAPH slide, with nodes A1 to A4, M1 to M4, "Com" and "Drama", and edges labelled ActedIn and Genre.]

Query:

Select ?actor
where {
    ?actor ActedIn ?movie .
    ?movie Genre "Com" .
}

Result R: A1, A2
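The query is a two-pattern join on `?movie`. The sketch below evaluates it over an in-memory list of triples; the edge endpoints are an assumption (the figure's topology is not preserved in the text), chosen so that the result matches R, and `actors_in_genre` is a hypothetical helper name.

```python
# Assumed edges of the comedy part of the graph (endpoints are illustrative).
triples = [
    ("A1", "ActedIn", "M1"),
    ("A2", "ActedIn", "M1"),
    ("A2", "ActedIn", "M2"),
    ("M1", "Genre", "Com"),
    ("M2", "Genre", "Com"),
]

def actors_in_genre(triples, genre):
    """Join `?actor ActedIn ?movie` with `?movie Genre genre` on ?movie."""
    movies = {s for s, p, o in triples if p == "Genre" and o == genre}
    return sorted({s for s, p, o in triples if p == "ActedIn" and o in movies})

print(actors_in_genre(triples, "Com"))  # ['A1', 'A2']
```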

OBVIOUS CHOICE – LINEAGE

[Figure: the comedy part of the graph, with nodes A1, A2, M1, M2 and "Com", three ActedIn edges and two Genre edges, labelled e1 to e5.]

Query:

Select ?actor
where {
    ?actor ActedIn ?movie .
    ?movie Genre "Com" .
}

Result R    Lineage
A1          {e1, e4}
A2          {e2, e3, e4, e5}
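Lineage can be read as the union of all edges that take part in any derivation of an answer. The sketch below computes it for the query above; the mapping from edge identifiers to triples is an assumption chosen to be consistent with the lineage sets on the slide, and `lineage` is a hypothetical function name.

```python
from collections import defaultdict

# Edge identifiers mapped to triples. The assignment is an assumption
# chosen to be consistent with the lineage sets on the slide.
edges = {
    "e1": ("A1", "ActedIn", "M1"),
    "e2": ("A2", "ActedIn", "M1"),
    "e3": ("A2", "ActedIn", "M2"),
    "e4": ("M1", "Genre", "Com"),
    "e5": ("M2", "Genre", "Com"),
}

def lineage(edges, genre):
    """Map each answer to the set of all edges used in any of its derivations."""
    result = defaultdict(set)
    for acted_id, (actor, p1, movie) in edges.items():
        if p1 != "ActedIn":
            continue
        for genre_id, (m, p2, g) in edges.items():
            if p2 == "Genre" and m == movie and g == genre:
                result[actor] |= {acted_id, genre_id}
    return dict(result)

for answer, edge_set in sorted(lineage(edges, "Com").items()):
    print(answer, sorted(edge_set))
```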

LET’S DELETE AN EDGE

Assume that edge e1 gets deleted.

[Figure: the same subgraph after deleting e1. Remaining edges: e2, e3, e4, e5 (two ActedIn, two Genre).]

Result R
A1 {e1, e4}
A2 {e2, e3, e4, e5}

Updated Result R'
A2 {e2, e3, e4, e5}
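The update on this slide can be read as a simple lineage check: an answer is dropped when the deleted edge appears in its lineage. A minimal sketch of that rule follows; `drop_affected` is a hypothetical helper name, not something named in the slides.

```python
# Lineage of each answer, copied from the slide.
lineage = {
    "A1": {"e1", "e4"},
    "A2": {"e2", "e3", "e4", "e5"},
}

def drop_affected(lineage, deleted_edge):
    """Keep only the answers whose lineage does not contain the deleted edge."""
    return {a: es for a, es in lineage.items() if deleted_edge not in es}

updated = drop_affected(lineage, "e1")
print(sorted(updated))  # ['A2']
```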

IS LINEAGE SUFFICIENT ?

What if edge e3 gets deleted ?

[Figure: the subgraph after also deleting e3. Remaining edges: e2, e4, e5 (one ActedIn, two Genre).]

Result R'
A2 {e2, e3, e4, e5}

- A2 is still an answer, even though e3 is in its lineage!!
- Need a provenance model which can capture the derivation process.
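The mismatch can be reproduced in a few lines: the lineage check flags A2 as affected by the deletion of e3, while re-running the query on the remaining edges still returns A2. The edge endpoints are an assumption consistent with the lineage sets shown earlier, and `comedy_actors` is a hypothetical helper name.

```python
# Remaining edges after deleting e1 and e3 (endpoints are an assumption
# consistent with the lineage sets shown earlier).
remaining = {
    "e2": ("A2", "ActedIn", "M1"),
    "e4": ("M1", "Genre", "Com"),
    "e5": ("M2", "Genre", "Com"),
}
lineage_A2 = {"e2", "e3", "e4", "e5"}

def comedy_actors(edges):
    """Re-run the query from scratch on the given edges."""
    triples = list(edges.values())
    movies = {s for s, p, o in triples if p == "Genre" and o == "Com"}
    return {s for s, p, o in triples if p == "ActedIn" and o in movies}

# Lineage flags A2 because e3 is one of its edges ...
lineage_says_affected = "e3" in lineage_A2
# ... but A2 is still derivable through e2 and e4.
still_an_answer = "A2" in comedy_actors(remaining)
print(lineage_says_affected, still_an_answer)  # True True
```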

PROVENANCE POLYNOMIAL

[Figure: the comedy part of the graph, with nodes A1, A2, M1, M2 and "Com", three ActedIn edges and two Genre edges, labelled e1 to e5.]

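Result R    Provenance polynomial
A1          $e_1 \cdot e_4$
A2          $e_2 \cdot e_4 + e_3 \cdot e_5$

- The provenance polynomial encodes the interaction of the involved edges.
- Each term of the polynomial is self-sufficient: it is one complete derivation of the answer.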
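One way to picture such a polynomial in code is as a sum of products: a list of monomials, each monomial being the tuple of edges used by one derivation. The sketch below builds these for the example query; the edge endpoints are an assumption chosen to reproduce the polynomials on the slide, and the function names are hypothetical.

```python
from collections import defaultdict

# Edge identifiers and endpoints; the assignment is an assumption
# chosen to reproduce the polynomials on the slide.
edges = {
    "e1": ("A1", "ActedIn", "M1"),
    "e2": ("A2", "ActedIn", "M1"),
    "e3": ("A2", "ActedIn", "M2"),
    "e4": ("M1", "Genre", "Com"),
    "e5": ("M2", "Genre", "Com"),
}

def provenance_polynomials(edges, genre):
    """Map each answer to a list of monomials, one per derivation."""
    polys = defaultdict(list)
    for acted_id, (actor, p1, movie) in sorted(edges.items()):
        if p1 != "ActedIn":
            continue
        for genre_id, (m, p2, g) in sorted(edges.items()):
            if p2 == "Genre" and m == movie and g == genre:
                polys[actor].append((acted_id, genre_id))
    return dict(polys)

def show(poly):
    """Render a polynomial in the slide's notation, e.g. 'e2.e4 + e3.e5'."""
    return " + ".join(".".join(term) for term in poly)

polys = provenance_polynomials(edges, "Com")
for answer, poly in polys.items():
    print(answer, show(poly))
# A1 e1.e4
# A2 e2.e4 + e3.e5
```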

HOW DELETION WORKS

- Membership function $M : E \to \{0, 1\}$, where $E$ is the set of edge variables:

$$M(e) = \begin{cases} 1 & \text{if } e \text{ is part of the KG} \\ 0 & \text{if } e \text{ is deleted} \end{cases}$$

- Substitute each edge variable $e$ by $M(e)$ and evaluate the polynomial.
- The result set retains all the answers whose corresponding provenance polynomial evaluates to a non-zero value.
- On deletion of edge $e_3$:

  A2: $e_3 \cdot e_5 + e_2 \cdot e_4 = 0 \cdot 1 + 1 \cdot 1 = 1$
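The substitution step can be sketched as follows, assuming the polynomials are stored as lists of monomials (a representation chosen here for illustration, not one stated in the slides). `evaluate` and `surviving_answers` are hypothetical names.

```python
# Provenance polynomials from the slides, as lists of monomials.
polynomials = {
    "A1": [("e1", "e4")],
    "A2": [("e2", "e4"), ("e3", "e5")],
}

def evaluate(poly, deleted):
    """Substitute M(e) for every edge variable and evaluate the sum of products."""
    def M(e):
        return 0 if e in deleted else 1
    total = 0
    for term in poly:
        product = 1
        for e in term:
            product *= M(e)
        total += product
    return total

def surviving_answers(polynomials, deleted):
    """Answers whose polynomial evaluates to a non-zero value."""
    return [a for a, p in polynomials.items() if evaluate(p, deleted) != 0]

print(evaluate(polynomials["A2"], {"e3"}))           # 1
print(surviving_answers(polynomials, {"e1", "e3"}))  # ['A2']
```

With nothing deleted, A2 evaluates to 2 because both derivations are intact; only a value of 0 removes an answer.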

OUR SYSTEM

- 2-step process:
  - Search step: find candidate queries.
  - Confirmation step: evaluate polynomials to confirm the changes.
- YAGO dataset: 5.8M nodes, 22.5M edges and 39 relations.
- Achieved an update time of 2.6% of the RDF query execution time.
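The slides do not say how the two steps are implemented. One possible reading, sketched below purely as an assumption, is an inverted index from edges to the stored (query, answer) polynomials that mention them for the search step, followed by re-evaluation of only those polynomials for the confirmation step. The query id `"Q1"` and all function names are hypothetical.

```python
from collections import defaultdict

# Stored polynomials per (query id, answer); "Q1" is a hypothetical query id.
stored = {
    ("Q1", "A1"): [("e1", "e4")],
    ("Q1", "A2"): [("e2", "e4"), ("e3", "e5")],
}

# Inverted index: edge id -> set of (query, answer) keys mentioning that edge.
index = defaultdict(set)
for key, poly in stored.items():
    for term in poly:
        for e in term:
            index[e].add(key)

def search(edge):
    """Search step: candidates whose polynomial mentions the edge."""
    return sorted(index.get(edge, set()))

def confirm(candidates, deleted):
    """Confirmation step: candidates whose polynomial now evaluates to zero."""
    removed = []
    for key in candidates:
        # A term contributes 1 only if none of its edges is deleted.
        value = sum(all(e not in deleted for e in term) for term in stored[key])
        if value == 0:
            removed.append(key)
    return removed

candidates = search("e3")
print(candidates)                   # [('Q1', 'A2')]
print(confirm(candidates, {"e3"}))  # []
```

Only polynomials that mention the changed edge are touched, which is in the spirit of avoiding a full recomputation of the query.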

Questions?

Thanks!