Managing Completeness of Data



• Fariz Darari, Werner Nutt, Giuseppe Pirrò, Simon Razniewski: Completeness Statements about RDF Data Sources and Their Use for Query Answering. ISWC 2013

• Elisa Marengo, Werner Nutt, Ognjen Savković: Towards a Theory of Query Stability in Business Processes. AMW 2014

• Werner Nutt, Sergey Paramonov, Ognjen Savković: Implementing Query Completeness Reasoning. CIKM 2015

MANAGING COMPLETENESS OF DATA

Fariz Darari, Elisa Marengo, Werner Nutt, Simon Razniewski, Ognjen Savković

When executing a query over the Web, are we sure we are not missing something from the result?

Steve Scholar wants to find out all presidents of Union College


Can we use these statements to reason about the completeness of the query result?

Statements on data completeness

Mario Moviegoer wants to find out all cast members of Reservoir Dogs

CURRENT LIMITATIONS

• The statements are only in natural language

• It is not clear what data completeness & query completeness mean

• No techniques and algorithms exist to reason about the completeness of query results

SOLUTIONS

• We formalize completeness statements in a Web data format

• Formal foundation for data completeness & query completeness and their relation

• Algorithms to check query completeness from completeness statements and to provide a complete approximation of the query in the case of an incomplete query

Complete for all cast members of Reservoir Dogs: Compl(reservoirDogs actor ?a)

Complete for all directors of Reservoir Dogs: Compl(reservoirDogs director ?d)

Give me all cast members of Reservoir Dogs who were also a director of the movie:

SELECT ?x {reservoirDogs actor ?x . reservoirDogs director ?x}

Our framework can infer that the query answer is complete!
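The inference above can be illustrated with a minimal Python sketch. It is a simplification and not the authors' algorithm: it assumes unconditional, single-triple completeness statements and treats a query as complete when every triple pattern of the query is an instance of some statement pattern. The names `Triple`, `covers`, and `query_is_complete` are hypothetical and chosen only for this illustration.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    """A triple pattern; terms starting with '?' are variables (assumed convention)."""
    s: str
    p: str
    o: str


def is_var(term: str) -> bool:
    return term.startswith("?")


def covers(statement: Triple, pattern: Triple) -> bool:
    """True if the statement pattern is at least as general as the query pattern."""
    binding = {}
    for st, qt in zip(statement, pattern):
        if is_var(st):
            # A statement variable may stand for any query term, but consistently.
            if binding.setdefault(st, qt) != qt:
                return False
        elif st != qt:
            # A constant in the statement must appear unchanged in the query.
            return False
    return True


def query_is_complete(query: List[Triple], statements: List[Triple]) -> bool:
    """Sufficient check: every query pattern is covered by some statement."""
    return all(any(covers(s, t) for s in statements) for t in query)


statements = [
    Triple("reservoirDogs", "actor", "?a"),
    Triple("reservoirDogs", "director", "?d"),
]
query = [
    Triple("reservoirDogs", "actor", "?x"),
    Triple("reservoirDogs", "director", "?x"),
]
print(query_is_complete(query, statements))  # prints True
```

With only the actor statement available, the same check returns False, because nothing guarantees that all directors are present in the data source.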

FRAMEWORK IN ACTION

LMDB

Give me all presidents of both Union College and UniBZ:

SELECT ?p {unionCollege everPresident ?p . unibz everPresident ?p}

Complete for all presidents of Union College up to 2010: Compl(unionCollege everPresident ?p, 2010)

Complete for all presidents of UniBZ up to 2012: Compl(unibz everPresident ?p, 2012)

Our framework can infer that the query answer is complete up to 2010!
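The timestamped case can be sketched in the same spirit. The Python below is an illustrative simplification, not the authors' implementation: it assumes each statement is a single triple pattern paired with a year, and it reports the earliest guarantee among the query's patterns. The names `covers` and `complete_up_to` are hypothetical.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object); '?x' marks a variable


def covers(statement: Triple, pattern: Triple) -> bool:
    """True if the statement pattern is at least as general as the query pattern."""
    binding = {}
    for st, qt in zip(statement, pattern):
        if st.startswith("?"):
            if binding.setdefault(st, qt) != qt:
                return False
        elif st != qt:
            return False
    return True


def complete_up_to(
    query: List[Triple], statements: List[Tuple[Triple, int]]
) -> Optional[int]:
    """Year up to which the answer is guaranteed complete, or None if no guarantee."""
    bounds = []
    for pattern in query:
        years = [year for stmt, year in statements if covers(stmt, pattern)]
        if not years:
            return None  # some pattern has no completeness guarantee at all
        bounds.append(max(years))  # most recent guarantee for this pattern
    # The whole query is only as current as its least current pattern.
    return min(bounds, default=None)


statements = [
    (("unionCollege", "everPresident", "?p"), 2010),
    (("unibz", "everPresident", "?p"), 2012),
]
query = [
    ("unionCollege", "everPresident", "?p"),
    ("unibz", "everPresident", "?p"),
]
print(complete_up_to(query, statements))  # prints 2010
```

Taking the minimum over the patterns reflects the poster's example: the UniBZ data is guaranteed up to 2012, but the Union College data only up to 2010, so the joint answer is guaranteed only up to 2010.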

DBpedia


RELATED LINES OF RESEARCH

Data is usually manipulated according to business processes

• Query Stability: given a query, is the result final or is it going to change as an effect of a process execution?

We developed:

1. A framework that describes how a business process reads data from and writes new data into a database

2. Techniques to reason about query stability taking into account the business process and the query

Query Stability: Completeness over Business Processes

• How to extend completeness reasoning to include DB constraints such as foreign keys?

• How to detect which parts of a DB are needed for a query to be complete?

We developed:

1. Algorithms that take into account DB constraints to reason about query completeness

2. Algorithms that suggest which parts of the DB should be completed to make the query complete

3. A Web tool that implements those algorithms: magik-demo.inf.unibz.it

Completeness over Relational Databases


SELECTED PUBLICATIONS