LOD2 Webinar . 29.11.2011 . Page 1 http://lod2.eu
Creating Knowledge out of Interlinked Data
LOD2 is a large-scale integrating project co-funded by the European Commission within the FP7 Information and Communication Technologies Work Programme. This 4-year project comprises leading Linked Open Data technology researchers, companies, and service providers. The partners come from 12 countries and are coordinated by the Agile Knowledge Engineering and Semantic Web Research Group at the University of Leipzig, Germany.
LOD2 will integrate and syndicate Linked Data with existing large-scale applications. The project shows the benefits in the scenarios of Media and Publishing, Corporate Data intranets and eGovernment.
Once per month, the LOD2 webinar series offers a free webinar about tools and services along the Linked Open Data Life Cycle.
Stay with us and learn more about acquisition, editing, composing, connected applications – and finally publishing Linked Open Data.
• School of Business & Economics, Freie Universität Berlin
• Research focus: Linked Data technologies for extending the World Wide Web with a global data commons
• Funded Projects:
  • LOD2 - Creating Knowledge out of Interlinked Data
  • LATC - LOD Around The Clock
  • PlanetData
• Visit us at: http://wbsg.de
Web-based Systems Group
• DBpedia is a community effort led by WBSG, AKSW and OpenLink Software to:
  • Extract structured information from Wikipedia
  • Make this information available on the Web under an open license
  • Interlink the DBpedia dataset with other open datasets on the Web
• DBpedia Spotlight: Automatic annotation of free text with DBpedia URIs
• Data Integration
  • R2R: Translates Web data that is represented using terms from different vocabularies into a single target vocabulary.
  • Silk: Tool for generating RDF links between data items.
  • LDIF: Translates heterogeneous Linked Data from the Web into a clean, local target representation while keeping track of data provenance.
Main Projects
• D2R/Sparqlify in the LOD2 Stack
• The D2RQ Platform
• The D2RQ Mapping Language
• Example and Demo
• Availability
• Sparqlify (Claus Stadler)
• Q & A
Outline
D2R/Sparqlify in the LOD2 Stack
• System for accessing relational databases as virtual RDF graphs
• Offers RDF-based access to the content of relational databases without having to replicate it into an RDF store
• Features:
  • query a non-RDF database using SPARQL
  • access the content of the database as Linked Data over the Web
  • create custom dumps of the database in RDF
  • access information using the Apache Jena API
The D2RQ Platform
• The D2RQ Platform consists of:
  • D2RQ Mapping Language, a declarative mapping language for describing the relation between an ontology and a relational data model.
  • D2RQ Engine, which uses the mappings to rewrite Jena API calls into SQL queries against the database and passes the query results up to the higher layers of the framework.
  • D2R Server, an HTTP server that provides a Linked Data view, an HTML view for debugging, and a SPARQL Protocol endpoint over the database.
Components
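As a rough illustration of the SPARQL Protocol endpoint offered by D2R Server, the following Python sketch builds the GET request URL that a client would send. The endpoint address http://localhost:2020/sparql is an assumption (adjust it to the actual deployment), and the helper name sparql_get_url is made up for this example.

```python
from urllib.parse import urlencode


def sparql_get_url(endpoint: str, query: str) -> str:
    """Build a SPARQL Protocol GET URL: the query text goes into the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


# Assumed endpoint address; not taken from the slides.
ENDPOINT = "http://localhost:2020/sparql"
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"

url = sparql_get_url(ENDPOINT, QUERY)
print(url)
```

Fetching this URL with any HTTP client should return the query results, provided a server is actually listening at the assumed address.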
Architecture
• Declarative language for mapping relational database schemas to RDF vocabularies and OWL ontologies.
• N3 based syntax
• Very flexible
• Usual workflow: auto-generate mapping from DB schema, then customize
D2RQ Mapping Language
Mapping process
• Existing database which stores information about:
  • Conferences
  • Papers
  • Authors
  • Topics
• We want to publish this database as RDF
• We will use the International Semantic Web Conference (ISWC) Ontology.
Example
• d2rq:Database defines a JDBC connection to a local or remote relational database
• d2rq:jdbcDSN specifies the JDBC database URL
  • Typically of the form: jdbc:subprotocol:subname
• d2rq:jdbcDriver specifies the JDBC driver for the database
Define DB connection
map:MyDatabase a d2rq:Database;
    d2rq:jdbcDSN "jdbc:mysql://localhost/mydb";
    d2rq:jdbcDriver "com.mysql.jdbc.Driver";
    d2rq:username "user";
    d2rq:password "password".
• d2rq:ClassMap represents a class or a group of similar classes
• A class map defines how instances of the class are identified
• d2rq:uriPattern specifies a URI pattern that will be used to identify instances of this class map.
Define your entities
map:People a d2rq:ClassMap;
    d2rq:uriPattern "http://.../people/@@User.ID@@".
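To make the role of the @@Table.Column@@ placeholders concrete, here is a minimal Python sketch of how a URI pattern could be expanded for each database row. It is a simplified illustration, not D2RQ's implementation: the function name expand_uri_pattern, the example.org base URI, and the sample rows are all assumptions.

```python
import re
from urllib.parse import quote

# Matches placeholders of the form @@Table.Column@@.
PLACEHOLDER = re.compile(r"@@([^@]+)@@")


def expand_uri_pattern(pattern: str, row: dict) -> str:
    """Replace every @@Table.Column@@ placeholder with the row's value."""
    def substitute(match):
        value = row[match.group(1)]
        # Percent-encode the value so it is safe inside a URI (simplified).
        return quote(str(value), safe="")
    return PLACEHOLDER.sub(substitute, pattern)


# Hypothetical rows of the User table, keyed by Table.Column.
rows = [{"User.ID": 1}, {"User.ID": 2}]
pattern = "http://example.org/people/@@User.ID@@"  # hypothetical base URI

uris = [expand_uri_pattern(pattern, row) for row in rows]
print(uris)
```

Each row of the table yields exactly one instance URI, which is why the placeholder normally refers to a key column.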
• d2rq:condition specifies an SQL WHERE condition
• An instance of this class will only be generated for database rows that satisfy the condition
• Conditions can be used to hide parts of the database from D2RQ
Define your entities
map:People a d2rq:ClassMap;
    d2rq:uriPattern "http://.../people/@@User.ID@@";
    d2rq:condition "User.deleted=0".
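The effect of the condition can be sketched in Python with an in-memory SQLite database standing in for the real one. The table contents, the example.org base URI, and the helper people_uris are invented for illustration; the sketch only shows the idea that the condition ends up as a WHERE clause, not how D2RQ generates its SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE User (ID INTEGER PRIMARY KEY, name TEXT, deleted INTEGER);
    INSERT INTO User VALUES (1, 'alice', 0), (2, 'bob', 1), (3, 'carol', 0);
""")


def people_uris(conn, condition=None):
    """List instance URIs, optionally restricted by a d2rq:condition-style expression."""
    sql = "SELECT User.ID FROM User"
    if condition:
        # The condition string is appended as an SQL WHERE clause.
        sql += " WHERE " + condition
    sql += " ORDER BY User.ID"
    return ["http://example.org/people/%d" % row[0] for row in conn.execute(sql)]


all_people = people_uris(conn)
visible_people = people_uris(conn, "User.deleted=0")
print(visible_people)
```

The row marked as deleted never produces a URI, so it stays hidden from every RDF view of the database.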
• d2rq:class relates the generated entity to an OWL/RDFS class
• We use the Person class from the FOAF vocabulary
Add properties to entities
map:People a d2rq:ClassMap;
    d2rq:uriPattern "http://.../people/@@User.ID@@";
    d2rq:condition "User.deleted=0";
    d2rq:class foaf:Person .
• A d2rq:PropertyBridge relates a database column to an RDF property.
• Here we use properties from the FOAF vocabulary as well
Add properties to entities
map:name a d2rq:PropertyBridge;
    d2rq:belongsToClassMap map:People;
    d2rq:property foaf:nick;
    d2rq:column "User.name".

map:mbox a d2rq:PropertyBridge;
    d2rq:belongsToClassMap map:People;
    d2rq:property foaf:mbox;
    d2rq:uriPattern "mailto:@@User.email@@".
• d2rq:sqlExpression generates literal values by evaluating a SQL expression.
• Note that querying for such a computed value might put a heavy load on the database.
• We compute the SHA1 sum from the user email address
Add properties to entities
map:mbox_sha1 a d2rq:PropertyBridge;
    d2rq:belongsToClassMap map:People;
    d2rq:property foaf:mbox_sha1sum;
    d2rq:sqlExpression "SHA1(CONCAT('mailto:', User.email))".
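For comparison, the same value can be computed outside the database. The Python sketch below mirrors the SQL expression (SHA1 over the string 'mailto:' followed by the email address); the function name mbox_sha1sum and the sample address are assumptions made for this example.

```python
import hashlib


def mbox_sha1sum(email: str) -> str:
    """Hex SHA1 digest of 'mailto:' + email, mirroring SHA1(CONCAT('mailto:', User.email))."""
    return hashlib.sha1(("mailto:" + email).encode("utf-8")).hexdigest()


digest = mbox_sha1sum("alice@example.org")  # hypothetical address
print(digest)
```

Because the digest is derived on the fly, a query that filters on foaf:mbox_sha1sum forces the database to evaluate the expression for every candidate row, which is the load the slide warns about.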
• We define a second class mapping for photos
• In the next step, we will interlink persons with their photos
Link your entities
map:Photos a d2rq:ClassMap;
    d2rq:uriPattern "http://.../photo/@@Photo.ID@@";
    d2rq:class foaf:Image .
• We can use the already presented syntax to interlink persons to their photo
• Photo.UserID is a foreign key to User.ID
Link your entities
map:photo a d2rq:PropertyBridge;
    d2rq:belongsToClassMap map:People;
    d2rq:property foaf:made;
    d2rq:uriPattern "http://.../photo/@@Photo.UserID@@".
• Better, with less repetition:
Link your entities
map:photo a d2rq:PropertyBridge;
    d2rq:belongsToClassMap map:People;
    d2rq:property foaf:made;
    d2rq:join "User.ID = Photo.UserID";
    d2rq:refersToClassMap map:Photos .
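The join-based bridge can be pictured as the following Python sketch, which uses an in-memory SQLite database in place of the real one. The table contents and the example.org base URI are invented; the point is only that the join condition pairs each person with the photos whose foreign key refers to them, and that the URIs on both sides come from the two class maps.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE User  (ID INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Photo (ID INTEGER PRIMARY KEY, UserID INTEGER REFERENCES User(ID));
    INSERT INTO User  VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO Photo VALUES (10, 1), (11, 1), (12, 2);
""")

BASE = "http://example.org"  # hypothetical base URI

# One foaf:made triple per (person, photo) pair satisfying the join condition.
triples = [
    (f"{BASE}/people/{user_id}", "foaf:made", f"{BASE}/photo/{photo_id}")
    for user_id, photo_id in conn.execute(
        "SELECT User.ID, Photo.ID FROM User "
        "JOIN Photo ON User.ID = Photo.UserID "
        "ORDER BY User.ID, Photo.ID"
    )
]
for triple in triples:
    print(triple)
```

The photo URI pattern is written only once, in the Photos class map, which is the repetition the join-based form avoids.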
Mapping Overview
•Demo
• D2RQ can be downloaded from the official homepage at:
• http://d2rq.org/
• Support is provided through the official mailing list:
• The latest source code is available from the project's Git repository:
• https://github.com/d2rq/d2rq
• D2RQ is licensed under the terms of the Apache Software Licence
Availability
Developers
• Supported databases
  • Oracle
  • MySQL
  • PostgreSQL
  • SQL Server
  • HSQLDB
  • Interbase/Firebird
• ODBC data sources
  • Works with some limitations.
• Other databases
  • May or may not work. By default, D2RQ interacts with the database using the SQL-92 standard. Any compatible database should work out of the box. We are interested in reports about D2RQ on other databases.
Database Compatibility
• D2RQ is actively developed
• Work on supporting RDB2RDF (Direct Mapping and R2RML) in the next 6 weeks
Current Work
Sparqlify
Project Page: http://aksw.org/projects/Sparqlify
Source Code: https://github.com/AKSW/Sparqlify
• Claus Stadler
• Austria
• PhD Student at the University of Leipzig since 2011
  – In the Agile Knowledge Engineering and Semantic Web (AKSW) research group, headed by Soeren Auer.
• Research Interests: Spatial Data Management, SPARQL-SQL query rewriting and optimization, Data integration.
About me
• Founded in 2006
• 25+ Researchers
• 3 Sub groups
• Goals
  – Contributing to the advancement of science in Semantic Web, Knowledge Engineering, Software Engineering
  – Cost efficient, high-impact R&D, which proves usefulness at an early stage
  – Bridge the gap between research results and applications
• Committed to Open Source, Open Access, and Open Knowledge movements
Agile Knowledge Engineering and Semantic Web Research Group
• EU Funded Projects:
– Linked Open Data 2 (LOD2)
– LOD Around the Clock (LATC)
– Open Data Portal (ODP)
– Semantic Content Management Systems for Enterprise Knowledge Management and News Mining (SCMS)
– OntoWiki - Semantic Collaboration for Knowledge Management, E-Learning and E-Tourism
Agile Knowledge Engineering and Semantic Web Research Group
• Further Projects
  – SlideWiki
    • SlideWiki is a collaboration platform which enables communities to build, share and play online presentations.
  – LinkedGeoData
    • Making OpenStreetMap data available in the Semantic Web
    • Motivation for Sparqlify
  – LIMES
    • Very fast tool for interlinking RDF knowledge bases.
  – DBpedia Live
    • Synchronization of DBpedia with Wikipedia
  – …
• Find more at
  – http://aksw.org/Projects
Agile Knowledge Engineering and Semantic Web Research Group
• Introduction
• View Definition Example
  – based on challenges encountered with LinkedGeoData
• Launching Sparqlify Server
• Demonstration
• Initial Results of the Performance Evaluation
• Conclusion & Future Work
• Outro
Structure
• Sparqlify is a SPARQL-SQL rewriter that enables one to define RDF views on relational databases and query them with SPARQL. Currently only PostgreSQL is supported.
• Inputs
  – PostgreSQL Database, Set of View Definitions, SPARQL Query
• Features
  – Intuitive View Definition Syntax
  – SPARQL queries are rewritten into a single SQL query
    • Give as much control as possible to the query optimizer of the underlying RDBMS
  – High expressivity
    • Language and Data type Tags can originate from columns
    • Constraints can be stated for tuning the rewriting process
  – Initial support for geospatial predicates
    • Can be extended to enable the use of arbitrary SQL predicates on the SPARQL level
Introduction
View Definition Example: Mapping the table “points_of_interest”
id   type          geom
1    lgdo:Bakery   (1, 1)
2    lgdo:School   (2, 2)
3    lgdo:Pub      (3, 3)
On the following slides, Prefix Declarations are omitted for brevity
Create View pois As Construct { …
Create View pois As
  Construct {
    ?s a ?t .
    ?s geom:geometry ?geo .
  }
  With …
Create View pois As
  Construct {
    ?s a ?t .
    ?s geom:geometry ?geo .
  }
  With
    ?s = spy:uri(concat("http://ex.org/", ?id))
    …
Create View pois As
  Construct {
    ?s a ?t .
    ?s geom:geometry ?geo .
  }
  With
    ?s = spy:uri(concat("http://...", ?id))
    ?t = spy:uri(?type)
    ?geo = spy:typedLiteral(?geom, ogc:WKTLiteral)
  From …
Create View pois As
  Construct {
    ?s a ?t .
    ?s geom:geometry ?geo .
  }
  With
    ?s = spy:uri(concat("http://ex.org/", ?id))
    ?t = spy:uri(?type)
    ?geo = spy:typedLiteral(?geom, ogc:WKTLiteral)
  From
    points_of_interest;
Create View pois As
  Construct {
    ?s a ?t .
    ?s geom:geometry ?geo .
  }
  With
    ?s = spy:uri(concat("http://ex.org/", ?id))
    ?t = spy:uri(?type)
    ?geo = spy:typedLiteral(?geom, ogc:WKTLiteral)
  Constrain
    ?t prefix "http://linkedgeodata.org/ontology/"
  From
    points_of_interest;
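To show what the finished view describes, the following Python sketch applies the same variable bindings to a few rows and emits the triples of the Construct template. It is an illustration of the view's meaning only; Sparqlify itself rewrites queries to SQL rather than materializing triples. The row values (full type URIs, WKT-style geometry strings) and the function name pois_view are assumptions, and the prefixed names geom:geometry and ogc:WKTLiteral are kept unexpanded because the slides omit the prefix declarations.

```python
LGDO = "http://linkedgeodata.org/ontology/"  # prefix taken from the Constrain clause

# Hypothetical rows of points_of_interest, with the type stored as a full URI.
rows = [
    {"id": 1, "type": LGDO + "Bakery", "geom": "POINT(1 1)"},
    {"id": 2, "type": LGDO + "School", "geom": "POINT(2 2)"},
    {"id": 3, "type": LGDO + "Pub", "geom": "POINT(3 3)"},
]


def pois_view(row):
    """Instantiate the Construct template of the pois view for one row."""
    s = "<http://ex.org/%s>" % row["id"]           # ?s = spy:uri(concat("http://ex.org/", ?id))
    t = "<%s>" % row["type"]                       # ?t = spy:uri(?type)
    geo = '"%s"^^ogc:WKTLiteral' % row["geom"]     # ?geo = spy:typedLiteral(?geom, ogc:WKTLiteral)
    return [(s, "a", t), (s, "geom:geometry", geo)]


triples = [triple for row in rows for triple in pois_view(row)]
for triple in triples:
    print(" ".join(triple), ".")
```

Every row contributes one triple per pattern in the Construct template, so three rows yield six triples here.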
View Definition Example: Mapping the table “resource_label”
resource      label       language
lgdo:Bakery   Baeckerei   de
lgdo:Bakery   Bakery      en
lgdo:School   Schule      de
Create View labels As
  Construct {
    ?s rdfs:label ?l .
  }
  With
    ?s = spy:uri(?resource)
    ?l = spy:plainLiteral(?label, ?language)
  Constrain
    ?s prefix "http://linkedgeodata.org/ontology/"
  From
    resource_labels;
View Definition Example: Adding a set of static triples
Create View static_triples As
  Construct {
    lgdo:Bakery a owl:Class .
    lgdo:School a owl:Class .
    lgdo:Pub a owl:Class
  };
View Definition File Syntax
Prefix Declarations
Create View {name} As
  Construct { {triple patterns} }
  With {variable bindings}
  Constrain {constraint expressions}
  From {logical table (table, view or SQL query)};
… More View Definitions …
View Definition Example: Wortschatz
Create View view_co_n As
  Construct {
    ?a wso:coOccursDirectlyWith ?b .
    ?x owl:annotatedSource ?a .
    ?x owl:annotatedProperty wso:coOccursDirectlyWith .
    ?x owl:annotatedTarget ?b .
    ?x wso:frequency ?f .
    ?x wso:sigma ?s .
  }
  With
    ?a = spy:uri(concat('http://aksw.org/wortschatz/word/', ?w1_id))
    ?b = spy:uri(concat('http://aksw.org/wortschatz/word/', ?w2_id))
    ?x = spy:uri(concat('http://aksw.org/wortschatz/co-occurence/direct/', ?w1_id, '/', ?w2_id))
    ?f = spy:typedLiteral(?freq, xsd:long)
    ?s = spy:typedLiteral(?sig, xsd:long)
  From
    [[SELECT w1_id, w2_id, freq::bigint, sig::bigint FROM co_n]];
Escape SQL queries in double brackets
• Download from git, build with
  – mvn assembly:assembly
• Run
  – java -cp target/sparqlify-0.0.1-SNAPSHOT-jar-with-dependencies.jar RunEndpoint [options]
• Options are
  – Server Configuration
    • -c Config file containing the mapping definitions
    • -P Server port [default 7531]
  – Database settings
    • -h Hostname of the database (e.g. localhost or localhost:5432)
    • -d Database name
    • -u User name
    • -p Password
  – Quality of Service
    • -n Maximum result set size
    • -t Maximum query execution time (excluding rewriting time)
Launching Sparqlify
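The options listed for launching the server can be assembled into a full command line as in the Python sketch below. Only the jar path, the main class, and the flags come from the slide; the helper name sparqlify_command and all argument values (config file name, database name, credentials) are placeholders made up for this example.

```python
JAR = "target/sparqlify-0.0.1-SNAPSHOT-jar-with-dependencies.jar"


def sparqlify_command(config, host, database, user, password,
                      port=7531, max_rows=None, timeout=None):
    """Assemble the argument list for launching the Sparqlify endpoint."""
    cmd = ["java", "-cp", JAR, "RunEndpoint",
           "-c", config,        # config file containing the mapping definitions
           "-P", str(port),     # server port
           "-h", host,          # database host, optionally host:port
           "-d", database,
           "-u", user,
           "-p", password]
    if max_rows is not None:
        cmd += ["-n", str(max_rows)]   # maximum result set size
    if timeout is not None:
        cmd += ["-t", str(timeout)]    # maximum query execution time
    return cmd


# Placeholder values; replace them with the real configuration.
command = sparqlify_command("views.sparqlify", "localhost:5432", "lgd",
                            "postgres", "secret", max_rows=1000)
print(" ".join(command))
```

The resulting list could be passed to subprocess.run once the jar has been built; the sketch itself only prints the command.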
Demonstration
• Initial performance comparison on the BSBM 1 million dataset on PostgreSQL:
  – (Times per Query Mix)
  – D2R Fast Mode Disabled: ~8 sec
  – D2R Fast Mode Enabled: ~3 sec
  – Sparqlify: 4 sec
  – Performance is comparable to D2R.
• Mixed results for the LinkedGeoData schema:
  – Simple queries work well on the LGD schema
  – Complex queries are troublesome (timeouts) on a complete OSM dump as the PostgreSQL optimizer makes suboptimal choices.
Initial Results of the Performance Evaluation
• Sparqlify provides an intuitive Mapping Syntax
• Originally developed for the LinkedGeoData use-case
  – Spatial predicate support, arbitrary predicate support planned.
  – URIs, language and datatype tags can be mapped from columns of the DB.
  – Queries are rewritten into a single SQL statement, in order to give as much control to the query optimizer of the underlying DBMS as possible.
• Initial performance results seem to be comparable to D2R
  – More extensive testing has yet to be done
• Bugfixing
• Additional features
  – Especially support for the COUNT keyword
Conclusion and Future Work
• Project Page
  – http://aksw.org/projects/Sparqlify
• Source Code
  – https://github.com/AKSW/Sparqlify
• AKSW Research Group
  – http://aksw.org
• My Work Page
  – http://bis.informatik.uni-leipzig.de/ClausStadler
• My Email
  – [email protected]
Contact
Thank you for your attention!
Q & A
Credits
Jingle R.E.M., Martin Kaltenböck, Florian Kondert
Coordination Thomas Thurner
Martin Kaltenböck
Moderation Martin Kaltenböck
Presented by Robert Isele & Claus Stadler
http://lod2.eu
Hope you enjoyed staying with us – if you need more detailed information, visit us at www.lod2.eu and let us know how we can improve to meet your expectations!
Don’t forget to register for our next webinar
22.05.2012 – Cloud View (Exalead Dassault Systems, France)
19.06.2012 – PoolParty Thesaurus Manager (SWC, Austria)
Have a great day and don’t forget ...
• There are
  – Virtuoso RDF Views
  – D2R
  – Revelytix Spyder
  – Asio Semantic Web Bridge for Relational Databases
  – ODE Mapster, RDBToOnto
  – Soon further implementations of R2RML
  – Ultrawrap
  – …
Why another SPARQL – SQL Rewriter?
• Map OpenStreetMap data to RDF
  – Taken approach
    • Download an OSM planet file (>10GB compressed), pipe each OSM entity (node, way, relation) through a custom Java RDF mapper, and load the data into Virtuoso
    • Implemented a LiveSync on top of that
    • Repeat the dump process after each change in the mappings
    • Takes more than 2 days.
  – Goal
    • Immediate effect of a change in the mappings
    • Reuse of Osmosis' LiveSync
  – Possible Solution
    • Keep the mapping information in the relational database, and use an RDB-RDF mapper for querying it.
  – However: Back in April 2011, none of the existing RDB-RDF solutions seemed suitable
    • Lack of support for spatial predicates
    • Evaluation of SPARQL filters in memory
    • No support for creating literals where the language tag or datatype is stored in the database.
Motivation
• LinkedGeoData project: Convert OpenStreetMap (OSM) data to RDF
  – (http://linkedgeodata.org)
• Main tables of the OSM Schema (Excerpt):
  – Nodes(id, geom, tstamp)
  – NodeTags(node_id, k, v)
  – Ways(id, geom, tstamp)
  – WayTags(way_id, k, v)
  – WayNodes(way_id, sequence_id, node_id)
Motivation
Example tags (k, v): (place, city), (name, Leipzig)
• Geometry datatype• URIs and language tags stored in database tables
Challenges with OpenStreetMap data
Nodes (OSM)
node_id   k         v
1         amenity   school

Additional mapping tables for LGD

lgd_map_resource_kv
k         v        property   object
amenity   school   rdf:type   lgdo:school

lgd_map_resource_labels (labels imported from TranslateWiki)
k         v        label    language
amenity   school   Schule   de
Rewriting process
• Rewriting process
  – View Candidate Finding
    • Given a SPARQL query, find an appropriate subset of the views for answering the query
  – Rewriting
    • After the candidates have been identified, translate the SPARQL algebra to SQL algebra.
    • Thereby do book-keeping of how the SPARQL variables are reconstructed from the SQL columns.
  – Result Set Rendering
    • Execute the SQL query, construct the RDF according to the SPARQL variable bindings, serialize the result.
Rewriting process
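The three steps can be sketched for a single triple pattern as follows. This is a toy model under strong assumptions, not Sparqlify's algorithm: each view is reduced to one triple template, URIs are modelled as a prefix plus a column, and SQLite stands in for PostgreSQL. The names VIEWS, find_candidates, rewrite, and render, as well as the table contents, are invented for this example.

```python
import sqlite3

# Toy view catalogue: one triple template per view (hypothetical, heavily simplified).
VIEWS = [
    {"name": "pois", "table": "points_of_interest", "predicate": "rdf:type",
     "subject": ("http://ex.org/", "id"),   # URI = prefix + column value
     "object": ("", "type"),
     "object_prefix": "http://linkedgeodata.org/ontology/"},
    {"name": "labels", "table": "resource_labels", "predicate": "rdfs:label",
     "subject": ("", "resource"), "object": ("", "label"), "object_prefix": None},
]


def find_candidates(predicate, obj=None):
    """Step 1: keep the views whose template can match the triple pattern."""
    result = []
    for view in VIEWS:
        if view["predicate"] != predicate:
            continue
        prefix = view["object_prefix"]
        if obj is not None and prefix is not None and not obj.startswith(prefix):
            continue  # the prefix constraint rules this view out
        result.append(view)
    return result


def rewrite(view, obj=None):
    """Step 2: build the SQL plus the book-keeping needed to rebuild RDF terms."""
    s_prefix, s_col = view["subject"]
    o_prefix, o_col = view["object"]
    sql = f"SELECT {s_col}, {o_col} FROM {view['table']}"
    params = []
    if obj is not None:
        sql += f" WHERE {o_col} = ?"
        params.append(obj[len(o_prefix):])
    bookkeeping = {"s": s_prefix, "o": o_prefix}
    return sql + f" ORDER BY {s_col}", params, bookkeeping


def render(conn, sql, params, bookkeeping):
    """Step 3: run the SQL and reconstruct the SPARQL variable bindings."""
    return [{"s": bookkeeping["s"] + str(s), "o": bookkeeping["o"] + str(o)}
            for s, o in conn.execute(sql, params)]


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE points_of_interest (id INTEGER, type TEXT, geom TEXT);
    INSERT INTO points_of_interest VALUES
        (1, 'http://linkedgeodata.org/ontology/Bakery', '(1, 1)'),
        (2, 'http://linkedgeodata.org/ontology/School', '(2, 2)');
""")

BAKERY = "http://linkedgeodata.org/ontology/Bakery"
bindings = []
for view in find_candidates("rdf:type", BAKERY):   # pattern: ?s rdf:type <Bakery>
    bindings += render(conn, *rewrite(view, BAKERY))
print(bindings)
```

In this toy model the prefix constraint lets the candidate finder discard a view without touching the database, which is one plausible reading of how constraints tune the rewriting.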
• Based on: Wangchao Le, Songyun Duan, Anastasios Kementsietsidis, Feifei Li, and Min Wang. Rewriting Queries on SPARQL Views. In WWW 2011.
View Candidate Finding