Database Techniques for Linked Data Management - PlanetData
Database Techniques for Linked Data Management
Andreas Harth, Katja Hose, Ralf Schenkel Tutorial SIGMOD 2012
Introduction to Linked Data (Andreas)
- Motivation
- Linked Data principles
- Relation to Dataspaces
- Linked Data application architectures
- Conclusion
Andreas Harth, Katja Hose, Ralf Schenkel – Tutorial on Database Techniques for Linked Data Management 2 30.11.2012
Centralized storage and query processing (Ralf)
- SPARQL
- Overview
- Rowstore solutions
- Columnstore solutions
- Other solutions and outlook
Distributed query processing (Katja)
- Motivation for virtual integration
- Lookup-based query processing
- Distributed query processing
MOTIVATION
Facebook Open Graph
$ curl -H "Accept: text/turtle" http://graph.facebook.com/?ids=http://www.cs.rpi.edu/~wehtml,jesserweaver
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix api: <tag:graph.facebook.com,2011:/> .
@prefix og: <http://ogp.me/ns#> .
@prefix fb: <http://ogp.me/ns/fb#> .
@prefix : <http://graph.facebook.com/schema/~/> .
@prefix user: <http://graph.facebook.com/schema/user#>
@prefix page: <http://graph.facebook.com/schema/page#
</100002988319400#>
    user:id "100002988319400" ;
    user:name "Jesser Weaver" ;
    user:first_name "Jesser" ;
    user:last_name "Weaver" ;
    user:username "jesser.weaver" .
$
Schema.org (Google, Yahoo, Bing)
Goal: embedding structured data into web pages via HTML Microdata markup
Popular classes:
- Creative works: CreativeWork, Book, Movie, MusicRecording, Recipe, TVSeries ...
- Embedded non-text objects: AudioObject, ImageObject, VideoObject
- Event
- Organization
- Person
- Place, LocalBusiness, Restaurant ...
- Product, Offer, AggregateOffer
- Review, AggregateRating
Google Rich Snippets/Knowledge Graph
Linked Data on the Web
2007-10
Linked Data on the Web
2011-09 http://lod-cloud.net/
Types of Data in the Linking Open Data Cloud
http://www4.wiwiss.fu-berlin.de/lodcloud/state/ (Sept 2010)
Billion Triple Challenge Dataset
- Part of the annual Semantic Web Challenge (http://challenge.semanticweb.org/)
- 2011 dataset at http://km.aifb.kit.edu/projects/btc-2011/
- 20 GB compressed, 200 GB uncompressed
- ~2 bn statements, 213,384 distinct classes, 47,681 distinct properties
| Class URI | # documents |
| --- | --- |
| http://xmlns.com/foaf/0.1/Person | 1633434 |
| http://xmlns.com/foaf/0.1/Document | 814800 |
| http://rdf.freebase.com/ns/common.topic | 572382 |
| http://www.w3.org/2002/07/owl#Thing | 468387 |
| http://purl.org/ontology/mo/MusicArtist | 346728 |
WebDataCommons Dataset
- Based on CommonCrawl corpus (http://commoncrawl.org/)
- Parse structured data from HTML pages
  - RDFa
  - HTML Microdata
  - Microformats: hCard, hListing, hCalendar, Geo, hResume, hReview, hRecipe, Species, xfn
| Crawl date | 2009/09-2010/11 | 2012/02 |
| --- | --- | --- |
| Total URIs | 2.8 bn | 1.7 bn |
| HTML pages | 2.5 bn (28.9 TB) | 1.5 bn (20.9 TB) |
| URIs with structured data | 148 m | 189 m |
| Domains with structured data | 19 m | 65 m |
| Resulting statements | 5.2 bn | 3.3 bn |
Scenario Overview
Semantic Technologies facilitate access to data
- Q: data about Berlin?
- Q: famous people that died in Berlin?
- Q: data about Hegel?
- Q: Hegel’s publications?
- Q: data about Marlene Dietrich?
- Q: Dietrich’s songs?
[Figure: 1. Query (?), 2. Answer (!)]
DBpedia
- Linked Data version of Wikipedia
- Scripts that extract data (text, links, infoboxes) from Wikipedia
- Published as Linked Data
- Interlinking hub in the Linked Data web
- Berlin: http://dbpedia.org/resource/Berlin
- Hegel: http://dbpedia.org/resource/Georg_Wilhelm_Friedrich_Hegel
- Marlene Dietrich: http://dbpedia.org/resource/Marlene_Dietrich
BBC Music
- Data about BBC (radio) programmes, artists, songs…
- Combination of BBC-internal data (playlists), MusicBrainz (artists, albums), Wikipedia (artists)
- Underpinning the BBC Music website
- Data published according to Linked Data principles
- Marlene Dietrich: http://www.bbc.co.uk/music/artists/191cba6a-b83f-49ca-883c-02b20c7a9dd5.rdf#artist
Virtual International Authority File (VIAF)
- Joint project of national libraries and related organisations
  - 21 institutions, among them the Library of Congress, Deutsche Nationalbibliothek, Bibliothèque nationale de France
- Provide access to “authority files”
- Matching and interlinking collections from participating institutions
- Hegel: http://viaf.org/viaf/89774942/
- Marlene Dietrich: http://viaf.org/viaf/97773925/
LINKED DATA PRINCIPLES
Semantic Technologies
Semantic Web technologies, standardised by the W3C, are mature:
- RDF recommendation in 1999, update in 2004
- RDFa (RDF in HTML) note in 2008
- RDFS recommendation in 2004
- SPARQL recommendation in 2008
- OWL recommendation in 2004, update in 2009
- RIF Core recommendation in 2010
Linked Data is a subset of the Semantic Web stack, including web architecture:
- IRI (IETF RFC 3987, 2005)
- HTTP (IETF RFC 2616, 1999)
Linked Data Principles
1. Use URIs as names for things.
2. Use HTTP URIs so that people can look up those names.
3. When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL).
4. Include links to other URIs, so that they can discover more things.
http://www.w3.org/DesignIssues/LinkedData
1. Use URIs as Names for Things
- Use a unique identifier to denote things
- URIs are defined in RFC 3986 (which obsoletes RFC 2396)
- Hegel, Georg Wilhelm Friedrich
  - http://dbpedia.org/resource/Georg_Wilhelm_Friedrich_Hegel
  - http://viaf.org/viaf/89774942/
  - …
- Hegel, Georg Wilhelm Friedrich: Gesammelte Werke / Vorlesungen über die Logik
  - urn:isbn:978-3-7873-1964-0
Names for Things
2. Use HTTP URIs
- Enables “lookup” of URIs
  - Via Hypertext Transfer Protocol (HTTP)
- Piggy-backs on hierarchical Domain Name System to guarantee uniqueness of identifiers
- Uses established HTTP infrastructure
- Connects logical level (thing) with physical level (source)
- Important: distinction between “thing URI” and “source URI” (“other resource” vs. “information resource”)
Information Resources vs. Other Resources
Name? Creator? Birth date? Last change date? License? Copyright? …
Marlene Dietrich, the person
File containing data about Marlene Dietrich
Correspondence between thing-URI and source-URI (“hash URIs”)
Thing URI: http://www.bbc.co.uk/music/artists/191cba6a-b83f-49ca-883c-02b20c7a9dd5.rdf#artist
Source URI: http://www.bbc.co.uk/music/artists/191cba6a-b83f-49ca-883c-02b20c7a9dd5.rdf
[Figure: the user agent sends an HTTP GET for the source URI to the web server, which returns RDF]
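The hash-URI correspondence can be sketched in a few lines of Python. This is only an illustration using the standard `urllib.parse` module; the helper name `source_uri` is an assumption of this sketch and does not appear in the tutorial. The point it shows is that the fragment is never sent over the wire, so the source URI is the thing URI with the `#...` part removed.

```python
from urllib.parse import urldefrag


def source_uri(thing_uri: str) -> str:
    """Return the URI of the document that is fetched for a hash URI.

    Hypothetical helper for illustration: an HTTP client drops the
    fragment ("#artist") before issuing the GET request.
    """
    return urldefrag(thing_uri).url


thing = "http://www.bbc.co.uk/music/artists/191cba6a-b83f-49ca-883c-02b20c7a9dd5.rdf#artist"
print(source_uri(thing))
```

A URI without a fragment is returned unchanged, so the same helper can be applied to any identifier before lookup.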
Hypertext Transfer Protocol (HTTP)
$ curl -H "Accept: application/rdf+xml" -v http://viaf.org/viaf/97773925/
Request:
> GET /viaf/97773925/ HTTP/1.1
> User-Agent: curl/7.25.0
> Host: viaf.org
> Accept: application/rdf+xml
Response:
< HTTP/1.1 200 OK
< Date: Mon, 28 Mar 2011 17:16:30 GMT
< Content-Location: rdf.xml
< Last-Modified: Wed, 29 Sep 2010 15:39:28 GMT
< Content-Type: application/rdf+xml; qs=0.9
< Connection: close
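The same lookup can be prepared from Python's standard library. The following is a sketch, not part of the original slides: the helper name `rdf_lookup_request` is an assumption, and the request is only constructed, not sent, so the example runs without network access.

```python
from urllib.request import Request


def rdf_lookup_request(uri: str) -> Request:
    """Build a GET request that asks for RDF/XML via content negotiation.

    Hypothetical helper for illustration; it mirrors the Accept header
    used in the curl call above.
    """
    return Request(uri, headers={"Accept": "application/rdf+xml"})


req = rdf_lookup_request("http://viaf.org/viaf/97773925/")
print(req.get_method(), req.selector)
print("Host:", req.host)
print("Accept:", req.get_header("Accept"))
# Sending the request would be urllib.request.urlopen(req);
# it is left out here because it needs network access.
```

The server is free to ignore the `Accept` header, so a real client should still check the `Content-Type` of the response before parsing it as RDF.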
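Correspondence between thing-URI and source-URI (“slash URIs”)
Thing URI: http://dbpedia.org/resource/Marlene_Dietrich
Source URI (RDF): http://dbpedia.org/data/Marlene_Dietrich
Source URI (HTML): http://dbpedia.org/page/Marlene_Dietrich
[Figure: the user agent sends an HTTP GET for the thing URI; the web server answers with a 303 redirect; the user agent then sends an HTTP GET for the source URI and receives RDF]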
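The 303 redirect step can be simulated without a network. The sketch below is an assumption-laden illustration: `SIMULATED_SERVER` is a hypothetical in-memory stand-in for the web server, and it ignores content negotiation (a real server picks the redirect target based on the `Accept` header).

```python
# Hypothetical in-memory stand-in for a web server: URI -> (status, value).
# For status 303 the value is the redirect target; for status 200 it is the
# media type of the document that would be returned.
SIMULATED_SERVER = {
    "http://dbpedia.org/resource/Marlene_Dietrich": (
        303,
        "http://dbpedia.org/data/Marlene_Dietrich",
    ),
    "http://dbpedia.org/data/Marlene_Dietrich": (200, "application/rdf+xml"),
}


def dereference(uri, server=SIMULATED_SERVER, max_hops=5):
    """Follow 303 redirects from a thing URI to the source URI that is served."""
    for _ in range(max_hops):
        status, value = server[uri]
        if status == 303:
            uri = value  # move on to the redirect target
            continue
        return uri, value
    raise RuntimeError("too many redirects")


source, media_type = dereference("http://dbpedia.org/resource/Marlene_Dietrich")
print(source, media_type)
```

Keeping the thing URI and the resolved source URI side by side is what later allows a system to record where each triple came from.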
3. Provide Useful Information
- When somebody looks up a URI, return data using the standards (RDF*, SPARQL)
- Resource Description Framework, a format for encoding graph-structured data (with URIs to identify nodes/vertices and links/edges)
Resource Description Framework
- Directed, labeled graph
- triple(subject, predicate, object)
  - subject: URI (or blank node)
  - predicate: URI
  - object: URI (or blank node) or RDF literal (string, integer, date…)
- RDF/XML is the most widely deployed serialisation
- Other serialisations possible (N-Triples, Turtle, Notation3…)
- Quadruples (or quads) used as internal representation when integrating data
- quad(subject, predicate, object, context)
  - context: URI (used to store origin of triple)
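The triple and quad shapes can be written down as plain Python types. This is a simplified sketch: all terms are represented as strings, so it does not distinguish URIs, blank nodes and literals, and the names `Triple`, `Quad` and `with_context` are assumptions of this example rather than anything defined in the tutorial.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str    # URI or blank node
    predicate: str  # URI
    object: str     # URI, blank node or literal (all plain strings here)


class Quad(NamedTuple):
    subject: str
    predicate: str
    object: str
    context: str    # URI of the source the triple was retrieved from


def with_context(triple: Triple, source_uri: str) -> Quad:
    """Attach the origin of a triple, turning it into a quad."""
    return Quad(*triple, context=source_uri)


t = Triple(
    "http://viaf.org/viaf/97773925",
    "http://www.w3.org/2002/07/owl#sameAs",
    "http://dbpedia.org/resource/Marlene_Dietrich",
)
q = with_context(t, "http://viaf.org/viaf/97773925/")
print(q.context)
```

Storing the context alongside each triple is what makes it possible to trace an integrated statement back to the document it came from.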
Merging Data with RDF
[Figure: RDF graph + RDF graph = merged RDF graph]
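Merging can be sketched as a set union when graphs are held as sets of (subject, predicate, object) tuples. The triples below are illustrative assumptions (the `example.org` predicates and the literal values are not quoted from the tutorial), and the sketch ignores blank nodes, which would have to be kept apart per source in a real merge.

```python
# Illustrative data only: predicates under example.org and the literals
# are assumptions made for this sketch.
DIETRICH = "http://dbpedia.org/resource/Marlene_Dietrich"
BERLIN = "http://dbpedia.org/resource/Berlin"

graph_a = {
    (DIETRICH, "http://xmlns.com/foaf/0.1/name", "Marlene Dietrich"),
    (DIETRICH, "http://example.org/birthPlace", BERLIN),
}
graph_b = {
    (DIETRICH, "http://xmlns.com/foaf/0.1/name", "Marlene Dietrich"),
    (BERLIN, "http://example.org/country", "Germany"),
}


def merge(*graphs):
    """Merge RDF graphs represented as sets of (s, p, o) tuples."""
    merged = set()
    for graph in graphs:
        merged |= graph  # identical triples collapse automatically
    return merged


merged = merge(graph_a, graph_b)
print(len(merged))
```

Because both graphs use the same URI for Berlin, the merged graph connects the person to the country without any extra mapping step.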
4. Link to Other URIs
- Enable people (and machines) to jump from server to server
- External links vs. internal links (for any predicate)
- Special owl:sameAs links to denote equivalence of identifiers (useful for data merging)
Equivalences via owl:sameAs
- http://viaf.org/viaf/89774942
  - http://dbpedia.org/resource/Georg_Wilhelm_Friedrich_Hegel
  - http://www.idref.fr/026917467/id
  - http://libris.kb.se/resource/auth/190350
  - http://d-nb.info/gnd/118547739
- http://www.bbc.co.uk/music/artists/191cba6a-b83f-49ca-883c-02b20c7a9dd5#artist
  - http://dbpedia.org/resource/Marlene_Dietrich
- http://viaf.org/viaf/97773925
  - http://dbpedia.org/resource/Marlene_Dietrich
  - http://d-nb.info/gnd/118525565
  - http://libris.kb.se/resource/auth/238817
  - http://www.idref.fr/027561844/id
- http://dbpedia.org/resource/Berlin
  - http://mpii.de/yago/resource/Berlin
  - http://data.nytimes.com/N50987186835223032381 - Berlin (Germany)
  - http://www4.wiwiss.fu-berlin.de/flickrwrappr/photos/Berlin
  - http://data.nytimes.com/16057429728088573361 - Gaspe Peninsula (Quebec) (?)
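Since owl:sameAs is symmetric and transitive, equivalent identifiers can be grouped into classes. The sketch below uses a small union-find structure over a few of the pairs listed above; the function name `same_as_classes` and the choice of union-find are assumptions of this example, not a method described in the tutorial.

```python
def same_as_classes(pairs):
    """Group identifiers connected by owl:sameAs pairs into equivalence classes."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # union the two classes

    classes = {}
    for uri in list(parent):
        classes.setdefault(find(uri), set()).add(uri)
    return list(classes.values())


DBPEDIA_DIETRICH = "http://dbpedia.org/resource/Marlene_Dietrich"
PAIRS = [
    ("http://www.bbc.co.uk/music/artists/191cba6a-b83f-49ca-883c-02b20c7a9dd5#artist",
     DBPEDIA_DIETRICH),
    ("http://viaf.org/viaf/97773925", DBPEDIA_DIETRICH),
    ("http://viaf.org/viaf/97773925", "http://d-nb.info/gnd/118525565"),
    ("http://viaf.org/viaf/89774942",
     "http://dbpedia.org/resource/Georg_Wilhelm_Friedrich_Hegel"),
]

classes = same_as_classes(PAIRS)
for members in classes:
    print(sorted(members))
```

A questionable link such as the last Berlin entry above would pull unrelated identifiers into one class, which is why sameAs data is usually checked before it is used for merging.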
Benefits of Linked Data
- Explicit, simple data representation
  - Common data representation (Resource Description Framework, RDF) hides underlying technologies and systems
- Distributed System
  - Decentralised distributed ownership and control facilitates adoption and scalability
- Cross-referencing
  - Allows for linking and referencing of existing data, via reuse of URIs
- Loose coupling with common language layer
  - Large scale systems require loose coupling, via HTTP as common access protocol
- Ease of publishing and consumption
  - Simple and easy-to-use systems and technologies to facilitate uptake
- Incremental data integration
  - Start with merged RDF graphs and provide mappings as you go
Challenges (I)
- Ramp-up cost for data conversion
  - May be alleviated by semi-automatic mappings and adequate tool support for manual conversion
- Integrated data may be messy at first
  - But can be refined as need arises
- Distributed creation and loose coordination may result in inconsistencies
  - Can be detected, diagnosed, and fixed with appropriate tools
The Pedantic Web Group
- Get the community to contact publishers about errors/issues as they arise
- Get involved: http://pedantic-web.org/
- 137 members!
- Acknowledgements to: Aidan Hogan, Alex Passant, Me, Antoine Zimmermann, Axel Polleres, Michael Hausenblas, Richard Cyganiak, Stéphane Corlosquet
Challenges (II)
- Often very much oriented towards individuals
- Few possibilities for expressing schema knowledge
- Different data sources have different ways of representing the same facts
- Ontology languages (RDFS, OWL) address these drawbacks
- RDFS and OWL are layered on top of RDF
LINKED DATA AND DATASPACES
Dataspaces
Abstraction for Data Management to overcome data integration problems
Dataspace Architecture Components
- Catalog and browse
  - Collection of data sources (schema, rate of change, accuracy…)
- Search and query
  - Query everything
  - Structured queries
  - Metadata queries
  - Monitoring
- Local store and index
  - Store associations between objects, increase availability,…
- Discovery
  - Locate new databases
- Source extension
  - Add query functionality,…
Linked Data vs. Dataspaces
| Linked Data | Dataspaces |
| --- | --- |
| Method for decentralised data publishing and interlinking | Comprehensive architecture for data integration |
| Ecosystem (incl. people) | Platform |
| m:n mappings | 1:m mappings |
| Many small sources | Few large sources |
| Decentralised interlinking | Links in the local index |
| No central catalog | Central catalog |
LINKED DATA APPLICATION ARCHITECTURES
Architecture Styles
[Figure: two architecture styles]
- Warehousing / Crawl-Index-Serve: 0. Crawl-Index, 1. Query (?), 2. Answer (!)
- Virtual Integration / Distributed Querying: 1. Query (?), 2. Answer (!)
Basic Application: Entity Browsing
- Warehousing / Crawl-Index-Serve: Google, SWSE, Falcons, Sindice, Watson, FactForge…
- Virtual Integration / Distributed Querying: Tabulator, Disco, Zitgist…
SUMMARY
Summary
- The Linked Data Web is a large, decentralised, complex system built on simple principles
  - identify resources via HTTP URIs
  - provide RDF that links to other URIs upon lookup
- Current trend around Linked Data allows for a re-think of components in Semantic Web Layer Cake
- Data publishers and consumers coordinate little
- Web of Data grows rapidly and covers a large variety of domains
- Algorithms operating over a common access protocol and data model
- Ontology languages provide integration and mapping between disparate sources
- First commercial applications emerging
Attribution
- Slides adapted from my SWT-2 lectures and WWW 2010 SILD and INFORMATIK 2011 tutorials
- Linking Open Data cloud diagrams, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/
- Images of Berlin, Hegel and Dietrich via Wikipedia
Motivation
- With increased use of computers more and more data is being stored
  - Organisations rely on data for business decisions
  - Data drives policy decisions in government
  - Individuals rely on data from the Web for information and communication
- Data volumes explode
- More and more data available on the Web is represented in Semantic Web standards
  - Linking Open Data (LOD) initiative
- Semantic Web technologies facilitate the integration of data from multiple sources
- Combining data from multiple sources enables insights
RDF Graph
RDF graph collected via breadth-first expansion from http://danbri.org/foaf.rdf
- 7683 triples from 25 RDF files
- 1062 IRIs
- 154 blank nodes
- 1160 literals
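Breadth-first expansion from a seed document can be sketched as follows. The link structure in `LINKS` is a hypothetical stand-in for the URIs found while parsing each fetched file (the `example.org` documents are invented for this sketch), and `max_documents` plays the role of the file limit.

```python
from collections import deque


def breadth_first_expansion(seed, links, max_documents):
    """Return documents in the order a breadth-first crawl would collect them.

    `links` maps a document URI to the document URIs it references; in a
    real crawler this would come from fetching and parsing each RDF file.
    """
    visited = [seed]
    seen = {seed}
    queue = deque([seed])
    while queue and len(visited) < max_documents:
        doc = queue.popleft()
        for target in links.get(doc, []):
            if target not in seen and len(visited) < max_documents:
                seen.add(target)
                visited.append(target)
                queue.append(target)
    return visited


SEED = "http://danbri.org/foaf.rdf"
# Hypothetical link structure, for illustration only.
LINKS = {
    SEED: ["http://example.org/a.rdf", "http://example.org/b.rdf"],
    "http://example.org/a.rdf": ["http://example.org/c.rdf", SEED],
    "http://example.org/b.rdf": ["http://example.org/c.rdf"],
}

print(breadth_first_expansion(SEED, LINKS, max_documents=25))
```

The `seen` set prevents a document from being fetched twice when several files link to it, which matters in densely linked FOAF data.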
Information Resource Graph
RDF graph collected via breadth-first expansion from http://danbri.org/foaf.rdf
- 319 nodes from RDF files
- 453 edges
- average outdegree: 25
- http://mmt.me.uk/foaf.rdf has outdegree of 105!
Dataspace Architecture
Semantic Web Components
Linked Data: Minimal Components
[Figure: 1. Query (?), 2. Answer (!)]
Data Integration System Architecture
[Figure: queries (?) and answers (!) pass through an integration layer built on Wrapper 1, Wrapper 2, … Wrapper n, which sit on top of Source 1, Source 2, … Source n]
Linked Data on the Web
2007-11
Linked Data on the Web
2008-02
Linked Data on the Web
2008-03
Linked Data on the Web
2008-09
Linked Data on the Web
2009-03
Linked Data on the Web
2009-07
Linked Data on the Web
2010-09