Provenance Management In Knowledge Graphs · 2017-08-18 · KNOWLEDGE GRAPH A1 A2 M1 M2 ”Com”...
Provenance Management in Knowledge Graphs
Prof. Arnab Bhattacharya, Dr. Srikanta B Jagannath
Garima Gaur
April 6, 2017
KNOWLEDGE GRAPH
[Figure: example knowledge graph. Nodes A1, A2, A3, A4, M1, M2, M3, M4, "Com" and "Drama"; edges labelled ActedIn and Genre.]
- A graphical way of representing knowledge.
- Belongs to the category of semantic networks.
- A directed or undirected graph with concepts as vertices and relationships between concepts as edges.
DYNAMIC KG
- New facts keep arriving, resulting in deletions and insertions.
- How a change affects an answer set depends on the query type and the nature of the change in the KG:
  - Top-k queries: the value of the parameter under consideration might have changed.
  - Descriptive queries: a particular item in the answer set no longer satisfies the query condition.
  - Shortest-path queries: the answer is no longer correct.
  - Analytical queries: the evaluated value is no longer valid.
WHY DOES IT MATTER?
- Critical decisions are made based on query results.
- Knowledge graphs are ever growing.
Recompute the query!!
HANDLING DESCRIPTIVE QUERY
- We are trying to answer a simple question:
  Does the deletion or insertion of an edge e affect the query result R?
- Metadata provides better insight.
- Provenance: the origin of something.
- Various perspectives under one umbrella:
  - Why-provenance: comprises the data involved.
  - How-provenance: concerns the derivation process.
IN NEED OF PROVENANCE MODEL
[Figure: the example knowledge graph. Nodes A1, A2, A3, A4, M1, M2, M3, M4, "Com" and "Drama"; edges labelled ActedIn and Genre.]
Query:
    Select ?actor
    where {
        ?actor ActedIn ?movie .
        ?movie Genre "Com" .
    }
Result R: A1, A2
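To make the query concrete, here is a minimal Python sketch that evaluates the same two-pattern query over a list of triples. The edge endpoints are assumptions: the "Com" edges are read off the result and the polynomials shown later in the deck, and the "Drama" edges are purely hypothetical stand-ins, since the figure itself is not recoverable. The function name `actors_in_genre` is also illustrative.

```python
# Triples (subject, predicate, object). Endpoints are assumed, not taken
# verbatim from the slide; the "Drama" edges are hypothetical filler.
TRIPLES = [
    ("A1", "ActedIn", "M1"),
    ("A2", "ActedIn", "M1"),
    ("A2", "ActedIn", "M2"),
    ("M1", "Genre", "Com"),
    ("M2", "Genre", "Com"),
    ("A3", "ActedIn", "M3"),
    ("A4", "ActedIn", "M3"),
    ("A4", "ActedIn", "M4"),
    ("M3", "Genre", "Drama"),
    ("M4", "Genre", "Drama"),
]


def actors_in_genre(triples, genre):
    """Join '?actor ActedIn ?movie' with '?movie Genre <genre>'."""
    # Bind ?movie from the Genre pattern first.
    movies = {s for s, p, o in triples if p == "Genre" and o == genre}
    # Then bind ?actor from the ActedIn pattern, keeping matching movies only.
    return sorted({s for s, p, o in triples if p == "ActedIn" and o in movies})


result = actors_in_genre(TRIPLES, "Com")
print(result)
```

Under these assumed edges the sketch returns A1 and A2, matching Result R.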
OBVIOUS CHOICE – LINEAGE
[Figure: the "Com" part of the graph. Nodes A1, A2, M1, M2 and "Com"; three ActedIn edges and two Genre edges, labelled e1 to e5.]
Query:
    Select ?actor
    where {
        ?actor ActedIn ?movie .
        ?movie Genre "Com" .
    }
Result R with lineage:
    A1    {e1, e4}
    A2    {e2, e3, e4, e5}
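The lineage table can be reproduced with a small sketch that collects, for each answer, every edge that took part in any match of the query. The mapping of e1 to e5 onto concrete triples is an assumption inferred from the lineage sets and polynomials in the deck, and the sketch further assumes each movie has a single Genre edge per genre.

```python
# Assumed labelling of the five edges in the "Com" subgraph.
EDGES = {
    "e1": ("A1", "ActedIn", "M1"),
    "e2": ("A2", "ActedIn", "M1"),
    "e3": ("A2", "ActedIn", "M2"),
    "e4": ("M1", "Genre", "Com"),
    "e5": ("M2", "Genre", "Com"),
}


def lineage(edges, genre):
    """Map each answer to the set of edge ids used by any of its matches."""
    # movie -> id of its Genre edge for the requested genre
    genre_edge = {s: eid for eid, (s, p, o) in edges.items()
                  if p == "Genre" and o == genre}
    result = {}
    for eid, (s, p, o) in edges.items():
        if p == "ActedIn" and o in genre_edge:
            # Lineage is a flat union: it forgets which edges go together.
            result.setdefault(s, set()).update({eid, genre_edge[o]})
    return result


lin = lineage(EDGES, "Com")
for answer in sorted(lin):
    print(answer, sorted(lin[answer]))
```

Note that the union for A2 mixes the edges of two separate matches, which is the information loss the following slides turn on.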
LET’S DELETE AN EDGE
Assume that edge e1 gets deleted.
[Figure: the graph without e1. Remaining edges: e2, e3, e4, e5 (two ActedIn, two Genre).]
Result R:
    A1    {e1, e4}
    A2    {e2, e3, e4, e5}
Updated result R':
    A2    {e2, e3, e4, e5}
IS LINEAGE SUFFICIENT ?
What if edge e3 gets deleted?
[Figure: the graph without e1 and e3. Remaining edges: e2, e4, e5 (one ActedIn, two Genre).]
Result R':
    A2    {e2, e3, e4, e5}
- A2 is still an answer, even though e3 is in its lineage!!
- We need a provenance model that can capture the derivation process.
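The limitation can be seen in a short sketch that compares a lineage-based check with a full recomputation. The edge endpoints are assumed (inferred from the lineage sets on the slides), and `flagged` and `recompute` are hypothetical helper names.

```python
# Assumed labelling of the five edges in the "Com" subgraph.
EDGES = {
    "e1": ("A1", "ActedIn", "M1"),
    "e2": ("A2", "ActedIn", "M1"),
    "e3": ("A2", "ActedIn", "M2"),
    "e4": ("M1", "Genre", "Com"),
    "e5": ("M2", "Genre", "Com"),
}
LINEAGE = {"A1": {"e1", "e4"}, "A2": {"e2", "e3", "e4", "e5"}}


def flagged(lineage, deleted_edge):
    """Answers whose lineage mentions the deleted edge."""
    return sorted(a for a, used in lineage.items() if deleted_edge in used)


def recompute(edges, deleted, genre="Com"):
    """Re-run the query from scratch on the edges that are still present."""
    live = [t for eid, t in edges.items() if eid not in deleted]
    movies = {s for s, p, o in live if p == "Genre" and o == genre}
    return sorted({s for s, p, o in live if p == "ActedIn" and o in movies})


# Deleting e1: lineage flags A1, and A1 really does drop out.
print(flagged(LINEAGE, "e1"), recompute(EDGES, {"e1"}))
# Deleting e3 as well: lineage flags A2, yet A2 is still an answer.
print(flagged(LINEAGE, "e3"), recompute(EDGES, {"e1", "e3"}))
```

Lineage can only say that an answer might be affected; deciding whether it is actually lost still requires re-running the query.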
PROVENANCE POLYNOMIAL
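[Figure: the "Com" part of the graph. Nodes A1, A2, M1, M2 and "Com"; three ActedIn edges and two Genre edges, labelled e1 to e5.]
Result R with provenance polynomials:
    A1    e1·e4
    A2    e2·e4 + e3·e5
- The provenance polynomial encodes the interaction of the involved edges.
- Each term of the polynomial is self-sufficient: the edges in a single term are enough, on their own, to derive the answer.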
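One way to picture the polynomial is as a sum of products, with one product (monomial) per match of the query. The sketch below builds that representation for the example; the edge endpoints are assumed, and storing a polynomial as a list of edge-id tuples is an illustrative choice, not necessarily the representation used by the authors.

```python
# Assumed labelling of the five edges in the "Com" subgraph.
EDGES = {
    "e1": ("A1", "ActedIn", "M1"),
    "e2": ("A2", "ActedIn", "M1"),
    "e3": ("A2", "ActedIn", "M2"),
    "e4": ("M1", "Genre", "Com"),
    "e5": ("M2", "Genre", "Com"),
}


def provenance(edges, genre):
    """Map each answer to a list of monomials, one per query match."""
    genre_edge = {s: eid for eid, (s, p, o) in edges.items()
                  if p == "Genre" and o == genre}
    poly = {}
    for eid, (s, p, o) in edges.items():
        if p == "ActedIn" and o in genre_edge:
            # Unlike lineage, each match keeps its own pair of edges.
            poly.setdefault(s, []).append((eid, genre_edge[o]))
    return poly


def show(monomials):
    """Render a list of monomials as a sum of products."""
    return " + ".join("·".join(m) for m in monomials)


poly = provenance(EDGES, "Com")
for answer in sorted(poly):
    print(answer, show(poly[answer]))
```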
HOW DELETION WORKS
- Membership function M : E → {0, 1}, where E is the set of edge variables:
  $$M(e) = \begin{cases} 1 & \text{if } e \text{ is part of the KG} \\ 0 & \text{if } e \text{ is deleted} \end{cases}$$
- Substitute each edge variable e by M(e) and evaluate the polynomial.
- The result set retains every answer whose corresponding provenance polynomial evaluates to a non-zero value.
- On deletion of edge e3:
    A2: e3·e5 + e2·e4 = 0·1 + 1·1 = 1
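The substitution step can be sketched in a few lines of Python. The polynomials are the ones from the slides, stored here as lists of edge-id tuples (an illustrative representation); `evaluate` and `surviving` are hypothetical names.

```python
from math import prod

# Provenance polynomials from the slides, as sums of products of edge ids.
POLY = {
    "A1": [("e1", "e4")],
    "A2": [("e2", "e4"), ("e3", "e5")],
}


def evaluate(monomials, deleted):
    """Substitute M(e) for every edge variable and evaluate the polynomial."""
    def membership(e):
        return 0 if e in deleted else 1
    return sum(prod(membership(e) for e in mono) for mono in monomials)


def surviving(poly, deleted):
    """Answers whose polynomial evaluates to a non-zero value."""
    return sorted(a for a, monos in poly.items() if evaluate(monos, deleted) != 0)


print(evaluate(POLY["A2"], {"e3"}))   # e2·e4 + e3·e5 with e3 deleted
print(surviving(POLY, {"e1", "e3"}))  # after deleting both e1 and e3
```

With only e3 deleted, the polynomial for A2 evaluates to 1, so A2 stays in the result; once e1 is deleted, the polynomial for A1 evaluates to 0 and A1 is dropped.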
OUR SYSTEM
- Two-step process:
  - Search step: find candidate queries.
  - Confirmation step: evaluate polynomials to confirm the changes.
- YAGO dataset: 5.8M nodes, 22.5M edges and 39 relations.
- Achieved an update time of 2.6% of the RDF query execution time.
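The slides do not say how the search step is implemented, so the following is only a guess at the general shape of such a pipeline for deletions: an inverted index from edges to the stored answers whose polynomial mentions them (search), followed by polynomial evaluation on just those candidates (confirmation). The query id `q_com`, the index layout and the function names are all hypothetical.

```python
from collections import defaultdict
from math import prod

# Stored query results with their provenance polynomials (hypothetical store).
STORED = {
    "q_com": {
        "A1": [("e1", "e4")],
        "A2": [("e2", "e4"), ("e3", "e5")],
    }
}


def build_index(stored):
    """Inverted index: edge id -> set of (query id, answer) that mention it."""
    index = defaultdict(set)
    for qid, answers in stored.items():
        for answer, monomials in answers.items():
            for mono in monomials:
                for edge in mono:
                    index[edge].add((qid, answer))
    return index


def on_delete(stored, index, deleted, edge):
    """Handle one edge deletion; return the answers that are lost."""
    deleted.add(edge)
    dropped = []
    # Search step: only answers that mention the edge are candidates.
    for qid, answer in sorted(index.get(edge, ())):
        monomials = stored[qid][answer]
        # Confirmation step: evaluate the polynomial under the membership function.
        value = sum(prod(0 if e in deleted else 1 for e in m) for m in monomials)
        if value == 0:
            dropped.append((qid, answer))
    return dropped


index = build_index(STORED)
deleted = set()
print(on_delete(STORED, index, deleted, "e3"))  # A2 is a candidate but survives
print(on_delete(STORED, index, deleted, "e2"))  # now A2 has no term left
```

The point of the split is that a deletion touches only the answers indexed under that edge, and none of them requires re-running the query.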
Questions?
Thanks!