Querying Heterogeneous Datasets on the Linked Data Web
Copyright 2009 Digital Enterprise Research Institute. All rights reserved.
Digital Enterprise Research Institute www.deri.ie
Querying Heterogeneous Datasets on the Linked Data Web:
Challenges, Approaches, and Trends
André Freitas, Edward Curry, João G. Oliveira, Seán O’Riain
IEEE Internet Computing
http://doi.ieeecomputersociety.org/10.1109/MIC.2011.141
http://andrefreitas.org
A. Freitas, E. Curry, J. G. Oliveira, and S. O’Riain, “Querying Heterogeneous Datasets on the Linked Data Web: Challenges, Approaches, and Trends,” IEEE Internet Computing, vol. 16, no. 1, pp. 24-33, 2012.
Motivation
Querying Data over the Web
The figure shows (a) a natural language query over two search engines; (b) the corresponding SPARQL representation; and (c) the semantic gap between the user’s information needs and the data representation.
Expressivity-Usability Trade-Off
Expressivity–usability trade-off for querying over structured data. Blue dots indicate that an ideal query mechanism for linked data must provide both high expressivity and high usability.
Challenges
Challenges
The analysis investigates existing approaches from the perspective of the usability-expressivity trade-off.
This focus guides the categorization and analysis of existing challenges, approaches, and trends.
Challenge Dimensions
Query Expressivity: Ability to query datasets by referencing elements in the data model structure, as well as to operate over the data (aggregate results, express conditional statements, etc.)
Usability: Easy-to-operate, intuitive, and task-efficient query interface
Vocabulary-level Semantic Matching: Ability to semantically match user query terms to dataset vocabulary-level terms
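Vocabulary-level matching can be illustrated with a minimal Python sketch. It is not the method of any system named in these slides: the vocabulary, the synonym table, and the function name `match_term` are all hypothetical, and plain string similarity from the standard library stands in for a real semantic matcher.

```python
from difflib import SequenceMatcher

# Hypothetical dataset vocabulary (property names are illustrative only).
VOCABULARY = ["spouse", "birthPlace", "occupation", "starring"]

# Hypothetical synonym table standing in for an external knowledge source.
SYNONYMS = {"wife": "spouse", "husband": "spouse", "job": "occupation"}


def match_term(query_term, vocabulary=VOCABULARY, threshold=0.6):
    """Return the best-matching vocabulary term, or None if nothing is close."""
    term = query_term.lower()
    # 1. Synonym lookup bridges terms that share no characters at all.
    if term in SYNONYMS:
        return SYNONYMS[term]
    # 2. Fall back to string similarity for near-identical spellings.
    scored = [(SequenceMatcher(None, term, v.lower()).ratio(), v) for v in vocabulary]
    score, best = max(scored)
    return best if score >= threshold else None
```

In this sketch, "wife" resolves through the synonym table, while "birthplace" resolves through string similarity alone. A term with no close candidate returns None, which leaves room for asking the user rather than guessing.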
Challenge Dimensions
Entity Reconciliation: Matches entities expressed in the query to semantically equivalent dataset entities
Semantic Tractability: Ability to answer queries not supported by explicit dataset statements
– For example, “Is Natalie Portman an Actress?” can be supported by the statement “Natalie Portman starred Star Wars,” instead of an explicit statement “Natalie Portman occupation Actress,” which might not be present in the dataset
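The Natalie Portman example can be sketched in Python as follows. This is a toy under stated assumptions: the triple layout, the rule table `IMPLIED_OCCUPATION`, and the function `has_occupation` are hypothetical, and the single hand-written rule only stands in for whatever knowledge a real system would draw on.

```python
# Toy dataset: (subject, predicate, object) triples from the slide's example.
TRIPLES = {("Natalie Portman", "starred", "Star Wars")}

# Hypothetical hand-written rule: a "starred" statement implies this occupation.
# A real system would need a far richer and less brittle source of knowledge.
IMPLIED_OCCUPATION = {"starred": "Actress"}


def has_occupation(entity, occupation, triples=TRIPLES):
    """Answer 'Is <entity> a(n) <occupation>?' from explicit or implied evidence."""
    # Explicit support: the dataset states the occupation directly.
    if (entity, "occupation", occupation) in triples:
        return True
    # Implicit support: another statement about the entity implies it.
    return any(
        s == entity and IMPLIED_OCCUPATION.get(p) == occupation
        for s, p, _ in triples
    )
```

The function first looks for the explicit "occupation" statement and only then falls back to the implied evidence, so the question is answered even though the explicit statement is absent from the toy dataset.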
Approaches
Approaches
Information Retrieval approaches
– Entity-centric search
– Structure search
Natural Language approaches
– Question Answering
– Semantic best-effort natural language interfaces
Entity-Centric Search
e.g. Sindice
Structure Search
e.g. Semplore
Question Answering
e.g. FreyA
Semantic Best-Effort/NL
e.g. Treo
Comparative Analysis (Approaches)
Trends
Addressing the Challenges
The functionality analysis of existing approaches provides insight into how the major challenges should be addressed.
This set of strategic functionalities defines the set of trends.
Linked Data Web
Trends
Complementary Search and Query Services
User Interaction and Feedback Mechanisms
Semantic Best-Effort Query Model
Natural Language Processing Techniques
Distributional Semantic Model
External Knowledge Sources for Semantic Enrichment
Integrated Entity Reconciliation Techniques
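As a rough illustration of the idea behind a distributional semantic model, the Python sketch below scores two words as related when they occur in similar contexts. The three-sentence corpus is invented for the example, and the helper names `context_vector` and `relatedness` are hypothetical. The slides do not specify a particular model, so this is only one simple possibility (sentence-level co-occurrence counts compared by cosine similarity).

```python
import math
from collections import Counter

# Tiny invented corpus; a real model would use a large reference corpus.
CORPUS = [
    "the actress starred in the film",
    "the actor starred in the movie",
    "the river flows to the sea",
]


def context_vector(word, corpus=CORPUS):
    """Count the words that co-occur with `word` in the same sentence."""
    counts = Counter()
    for sentence in corpus:
        tokens = sentence.split()
        if word in tokens:
            counts.update(t for t in tokens if t != word)
    return counts


def relatedness(a, b, corpus=CORPUS):
    """Cosine similarity between the context vectors of two words."""
    va, vb = context_vector(a, corpus), context_vector(b, corpus)
    dot = sum(va[t] * vb[t] for t in va)
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0
```

On this toy corpus, "actress" scores closer to "actor" than to "river" because the first pair shares more context words. Function words such as "the" inflate every score here, so a practical model would weight or filter them.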
Further Reading
A. Freitas, E. Curry, J. G. Oliveira, and S. O’Riain, A Distributional Structured Semantic Space for Querying RDF Graph Data, International Journal of Semantic Computing, vol. 5, no. 4, pp. 433-462, 201
S. O’Riain, E. Curry, and A. Harth, XBRL and Open Data for Global Financial Ecosystems: A Linked Data Approach, International Journal of Accounting Information Systems, vol. 13, no. 2, pp. 141-162, 2012.
A. Freitas, E. Curry, and S. O'Riain, A Distributional Approach for Terminology-Level Semantic Search on the Linked Data Web, in 27th ACM Symposium On Applied Computing (SAC 2012), 2012.
A. Freitas, J. G. Oliveira, S. O'Riain, and E. Curry, A Multidimensional Semantic Space for Data Model Independent Queries over RDF Data, in Fifth IEEE International Conference on Semantic Computing (ICSC 2011)
A. Freitas, T. Knap, S. O’Riain, and E. Curry, W3P: Building an OPM based provenance model for the Web, Future Generation Computer Systems, vol. 27, no. 6, pp. 766-774, Jun. 2011.