Introduction to linked data and the semantic web

Linked data and its role in the semantic web Dave Reynolds, Epimorphics Ltd @der42

description

Talk on linked data for the BCS Data Management Specialist Group

Transcript of Introduction to linked data and the semantic web

Page 1: Introduction to linked data and the semantic web

Linked data and its role in the semantic web

Dave Reynolds, Epimorphics Ltd @der42

Page 2: Introduction to linked data and the semantic web

Roadmap

image: Leo Oosterloo @ flickr.com

What is linked data?

Examples

Modelling

Access

Strengths and weaknesses

Other topics

Page 3: Introduction to linked data and the semantic web

Linked data intro

Page 4: Introduction to linked data and the semantic web

Linked data ...

publishing data on the web ...

... to enable integration, linking and reuse across silos

Page 5: Introduction to linked data and the semantic web

Can’t we just publish data as files?

pdf: easy to read and publish

Excel: allows further processing and analysis

csv: processing without need for proprietary tools

But ...

structure of data not explained
no connection between different data sets, silos
static and fixed – can’t retrieve just the slices relevant to the problem

Page 6: Introduction to linked data and the semantic web

Linked data

Apply the principles of the web to publication of data

The web:

is a global network of pages
each identified by a URL
fetching a URL gives a document
pages connected by links
open, anyone can say anything about anything else

Page 7: Introduction to linked data and the semantic web

Linked data

Apply the principles of the web to publication of data

The linked data web:

is a global network of things
each identified by a URI
fetching a URI gives a set of statements
things connected by typed links
open, anyone can say anything about anything else

Linked data is “data you can click on”

Page 8: Introduction to linked data and the semantic web

Example schools information

http://education.data.gov.uk/id/school/401874

Page 9: Introduction to linked data and the semantic web

Example schools information

http://education.data.gov.uk/id/school/401874

label: “Cardiff High School”
phase: “Secondary”
district: “Cardiff”

Page 10: Introduction to linked data and the semantic web

Example schools information

http://education.data.gov.uk/id/school/401874

label: “Cardiff High School”
phase: school:PhaseOfEducation_Secondary (which has its own label)
district: http://statistics.data.gov.uk/id/local-authority-district/00PT (label: “Cardiff”)

Page 11: Introduction to linked data and the semantic web

Example schools information

http://education.data.gov.uk/id/school/401874

label: “Cardiff High School”
phase: school:PhaseOfEducation_Secondary (which has its own label)
district: http://statistics.data.gov.uk/id/local-authority-district/00PT (label: “Cardiff”)

http://data.ordnancesurvey.co.uk/id/7000000000025484

contains ward
contains parish
extent: GML: 310499.4 184176.6 310476.5 ...

Page 12: Introduction to linked data and the semantic web

Example schools information

http://education.data.gov.uk/id/school/401874

label: “Cardiff High School”
phase: school:PhaseOfEducation_Secondary (which has its own label)
district: http://statistics.data.gov.uk/id/local-authority-district/00PT (label: “Cardiff”)

http://data.ordnancesurvey.co.uk/id/7000000000025484

contains ward
contains parish
extent: GML: 310499.4 184176.6 310476.5 ...

same as: links the statistics.data.gov.uk district to the Ordnance Survey resource

Page 13: Introduction to linked data and the semantic web
Page 14: Introduction to linked data and the semantic web

Linked data principles

Use URIs as names for things

Use HTTP URIs so that people can look up those names

When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL)

Include links to other URIs, so that they can discover more things

Pattern of application of semantic web stack

Page 15: Introduction to linked data and the semantic web

Linked open data cloud: 2007

Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/

Page 16: Introduction to linked data and the semantic web

Linked open data cloud: 2009

Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/

Page 17: Introduction to linked data and the semantic web

Linked open data cloud: 2010

Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/

Page 18: Introduction to linked data and the semantic web

Data.gov.uk – linked datasets and APIs

Page 19: Introduction to linked data and the semantic web

Data.gov.uk – visualizations on top of linked data

Page 20: Introduction to linked data and the semantic web
Page 21: Introduction to linked data and the semantic web

Ordnance Survey

Page 22: Introduction to linked data and the semantic web

Environment Agency – data, API, visualizations

Page 23: Introduction to linked data and the semantic web

BBC – integration and site design

Page 24: Introduction to linked data and the semantic web

E-commerce and rich snippets

Overstock.com

Peek-cloppenburg.de

Page 25: Introduction to linked data and the semantic web

Internal use

Page 26: Introduction to linked data and the semantic web

Open?

Linked open data =

linked data +

open data

Page 27: Introduction to linked data and the semantic web

Modelling

Page 28: Introduction to linked data and the semantic web

Modelling

Thing, entity, concept ... resource

resource being described
  abstract concept
  real world thing
  data item, particular measurement
  document

identify by URI
provide information by making statements about those resources
identifier NOT a container, c.f. UML

open schema
  critical to open extensibility and integration
  similar to Entity-Attribute-Value modelling

Page 29: Introduction to linked data and the semantic web

Modelling – RDF – Resource Description Framework

Statement, triple, logical assertion

Subject Predicate Object

Page 30: Introduction to linked data and the semantic web

Modelling – RDF

Statement, triple, logical assertion

Subject Predicate Object

some school has a name/label some literal

Page 31: Introduction to linked data and the semantic web

Modelling – RDF

Statement, triple, logical assertion

Subject Predicate Object

http://education.data.gov.uk/id/school/401874 has a name/label “Cardiff High School”

Page 32: Introduction to linked data and the semantic web

Modelling – RDF

Statement, triple, logical assertion

Subject Predicate Object

http://education.data.gov.uk/id/school/401874 http://www.w3.org/2000/01/rdf-schema#label “Cardiff High School”

Page 33: Introduction to linked data and the semantic web

Modelling – RDF

Statement, triple, logical assertion

Subject Predicate Object

school:401874 rdfs:label “Cardiff High School”

where
school: = http://education.data.gov.uk/id/school/
rdfs: = http://www.w3.org/2000/01/rdf-schema#
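The prefix shorthand above is plain string substitution. A minimal sketch of the expansion, assuming only the two prefixes shown on the slide (the `expand` helper name is hypothetical, not part of any RDF library):

```python
# Prefix table taken from the slide; expand() is a hypothetical helper name.
PREFIXES = {
    "school": "http://education.data.gov.uk/id/school/",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}


def expand(prefixed_name, prefixes=PREFIXES):
    """Expand 'prefix:local' into a full URI using the prefix table."""
    prefix, sep, local = prefixed_name.partition(":")
    if not sep:
        raise ValueError(f"not a prefixed name: {prefixed_name!r}")
    # An unknown prefix raises KeyError rather than guessing a namespace.
    return prefixes[prefix] + local


print(expand("school:401874"))
print(expand("rdfs:label"))
```

The expanded forms are the same full URIs used as subject and predicate on the previous slide.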

Page 34: Introduction to linked data and the semantic web

Modelling – RDF

Statement, triple, logical assertion

Subject Predicate Object

school:401874 rdfs:label “Cardiff High School”

school:401874 ont:districtAdministrative la:00PT

la:00PT rdfs:label “Cardiff”

Page 35: Introduction to linked data and the semantic web

Modelling – RDF

Statement, triple, logical assertion

Subject Predicate Object

school:401874 rdfs:label “Cardiff High School”

school:401874 ont:districtAdministrative la:00PT

la:00PT rdfs:label “Cardiff”

[Diagram: the same three triples drawn as a graph. school:401874 is linked to “Cardiff High School” by rdfs:label and to la:00PT by ont:districtAdministrative; la:00PT is linked to “Cardiff” by rdfs:label]

Page 36: Introduction to linked data and the semantic web

Modelling – RDF

Statement, triple, logical assertion

Subject Predicate Object

school:401874 rdfs:label “Cardiff High School”

school:401874 ont:districtAdministrative la:00PT

la:00PT rdfs:label “Cardiff”

la:00PT rdfs:label “Caerdydd”@cy
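One way to picture these statements in a program is as a set of (subject, predicate, object) tuples, with literals carrying an optional language tag. The sketch below is illustrative only: the `Literal` type and `labels` function are assumptions made for this example, not the API of any particular RDF toolkit.

```python
from typing import NamedTuple, Optional


class Literal(NamedTuple):
    """A literal value with an optional language tag (hypothetical helper type)."""
    value: str
    lang: Optional[str] = None  # e.g. "cy" for Welsh; None when untagged


# The four statements from the slide; a graph is just a set of triples.
TRIPLES = {
    ("school:401874", "rdfs:label", Literal("Cardiff High School")),
    ("school:401874", "ont:districtAdministrative", "la:00PT"),
    ("la:00PT", "rdfs:label", Literal("Cardiff")),
    ("la:00PT", "rdfs:label", Literal("Caerdydd", "cy")),
}


def labels(subject, lang=None, triples=TRIPLES):
    """Return the sorted rdfs:label values of subject that carry the given tag."""
    return sorted(
        o.value
        for s, p, o in triples
        if s == subject
        and p == "rdfs:label"
        and isinstance(o, Literal)
        and o.lang == lang
    )


print(labels("la:00PT"))
print(labels("la:00PT", lang="cy"))
```

Adding the Welsh label is just one more statement about la:00PT; nothing about the existing triples has to change.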

Page 37: Introduction to linked data and the semantic web

RDF Syntaxes

RDF/XML
  normative

Turtle
  more human readable/writable
  being standardized

RDFa
  embed in (X)HTML

[others omitted]

Page 38: Introduction to linked data and the semantic web

Modelling – RDF

RDF/XML syntax

Subject Predicate Object

school:401874 rdfs:label “Cardiff High School”

school:401874 ont:districtAdministrative la:00PT

la:00PT rdfs:label “Cardiff”

<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:ont="http://education.data.gov.uk/def/school/"
    xmlns:la="http://statistics.data.gov.uk/id/local-authority-district/"
    xmlns:school="http://education.data.gov.uk/id/school/"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <rdf:Description rdf:about="http://education.data.gov.uk/id/school/401874">
    <rdfs:label>Cardiff High School</rdfs:label>
    <ont:districtAdministrative>
      <rdf:Description rdf:about="http://statistics.data.gov.uk/id/local-authority-district/00PT">
        <rdfs:label>Cardiff</rdfs:label>
      </rdf:Description>
    </ont:districtAdministrative>
  </rdf:Description>
</rdf:RDF>

Page 39: Introduction to linked data and the semantic web

Modelling – RDF

Turtle syntax

Subject Predicate Object

school:401874 rdfs:label “Cardiff High School”

school:401874 ont:districtAdministrative la:00PT

la:00PT rdfs:label “Cardiff”

@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix school: <http://education.data.gov.uk/id/school/> .
@prefix ont: <http://education.data.gov.uk/def/school/> .
@prefix la: <http://statistics.data.gov.uk/id/local-authority-district/> .

school:401874 rdfs:label "Cardiff High School"; ont:districtAdministrative la:00PT .

la:00PT rdfs:label "Cardiff" .

Page 40: Introduction to linked data and the semantic web

Modelling

Vocabularies

so far no actual models, let alone semantics

want to define
  types of thing : Class
  what you can say about them : Property

encode definitions in more RDF and publish at the corresponding URIs
link from data to data model
reuse published vocabularies to enable integration
freely combine different vocabularies or new ones

Page 41: Introduction to linked data and the semantic web

Modelling – vocabularies

Logical modelling

modelling the domain, not a particular data structure
what exists
what is asserted? what can you deduce from that?
not about constraints as such
monotonic, open world

[Diagram: spectrum running from controlled vocabulary through taxonomy and thesaurus to ontology]

Page 42: Introduction to linked data and the semantic web

Modelling – vocabularies

unfamiliar terminology but related to
  information architecture and conceptual modelling
  domain-driven design

... and yes knowledge representation

Page 43: Introduction to linked data and the semantic web

Modelling – RDFS

RDF vocabulary description language

classes, types and type hierarchy

ont:School rdf:type rdfs:Class

ont:School rdfs:label “School”

Page 44: Introduction to linked data and the semantic web

Modelling – RDFS

RDF vocabulary description language

classes, types and type hierarchy

ont:School rdf:type rdfs:Class

ont:School rdfs:label “School”

ont:WelshEstablishment rdf:type rdfs:Class

ont:WelshEstablishment rdfs:subClassOf ont:School

Page 45: Introduction to linked data and the semantic web

Modelling – RDFS

RDF vocabulary description language

classes, types and type hierarchy

ont:School rdf:type rdfs:Class

ont:School rdfs:label “School”

ont:WelshEstablishment rdf:type rdfs:Class

ont:WelshEstablishment rdfs:subClassOf ont:School

school:401874 rdf:type ont:WelshEstablishment

Page 46: Introduction to linked data and the semantic web

Modelling – RDFS

RDF vocabulary description language

classes, types and type hierarchy

ont:School rdf:type rdfs:Class

ont:School rdfs:label “School”

ont:WelshEstablishment rdf:type rdfs:Class

ont:WelshEstablishment rdfs:subClassOf ont:School

school:401874 rdf:type ont:WelshEstablishment

inferred: school:401874 rdf:type ont:School
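The inferred type on this slide follows from a single RDFS rule: if x has type C and C is a subclass of D, then x also has type D. A minimal sketch of that rule, applied until nothing new appears, is below. It assumes triples are plain tuples of prefixed names and that ont:WelshEstablishment is the subclass of ont:School; `infer_types` is a hypothetical helper, not a real reasoner.

```python
def infer_types(triples):
    """Apply: x rdf:type C, C rdfs:subClassOf D  =>  x rdf:type D (to a fixpoint)."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        subclass_links = [(s, o) for s, p, o in inferred if p == "rdfs:subClassOf"]
        type_links = [(s, o) for s, p, o in inferred if p == "rdf:type"]
        for x, c in type_links:
            for sub, sup in subclass_links:
                if c == sub and (x, "rdf:type", sup) not in inferred:
                    inferred.add((x, "rdf:type", sup))
                    changed = True
    return inferred


# Asserted statements, assuming the class hierarchy described above.
ASSERTED = {
    ("ont:School", "rdf:type", "rdfs:Class"),
    ("ont:WelshEstablishment", "rdf:type", "rdfs:Class"),
    ("ont:WelshEstablishment", "rdfs:subClassOf", "ont:School"),
    ("school:401874", "rdf:type", "ont:WelshEstablishment"),
}

# Only the statements that were derived, not asserted.
print(infer_types(ASSERTED) - ASSERTED)
```

The loop repeats so that chains of subclasses are followed all the way up, not just one step.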

Page 47: Introduction to linked data and the semantic web

Modelling – RDFS

RDF vocabulary description language

properties, property hierarchy

ont:headOf rdf:type rdf:Property

ont:headOf rdfs:subPropertyOf ont:staffAt

person:JoeBloggs ont:headOf school:401874

inferred: person:JoeBloggs ont:staffAt school:401874
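Property hierarchies work the same way as class hierarchies: a statement made with a sub-property also holds for its super-property. The sketch below illustrates that rule, assuming ont:headOf is declared a sub-property of ont:staffAt; `infer_subproperties` is a hypothetical helper name.

```python
def infer_subproperties(triples):
    """Apply: s P o, P rdfs:subPropertyOf Q  =>  s Q o (to a fixpoint)."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        sub_links = {(s, o) for s, p, o in inferred if p == "rdfs:subPropertyOf"}
        for s, p, o in list(inferred):
            for sub, sup in sub_links:
                if p == sub and (s, sup, o) not in inferred:
                    inferred.add((s, sup, o))
                    changed = True
    return inferred


# Asserted statements, assuming headOf is the more specific property.
ASSERTED_STAFF = {
    ("ont:headOf", "rdf:type", "rdf:Property"),
    ("ont:headOf", "rdfs:subPropertyOf", "ont:staffAt"),
    ("person:JoeBloggs", "ont:headOf", "school:401874"),
}

print(infer_subproperties(ASSERTED_STAFF) - ASSERTED_STAFF)
```

A query for staff at the school would then find the head teacher even though only ont:headOf was asserted.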

Page 48: Introduction to linked data and the semantic web

Modelling – RDFS

RDF vocabulary description language

class/property relations
  domain
  range

Already have power to do some vocabulary mapping
  declare classes or properties from different vocabularies to be equivalent:
  A rdfs:subClassOf B
  B rdfs:subClassOf A

Page 49: Introduction to linked data and the semantic web

Modelling – OWL

richer modelling and semantics

axioms on properties
  transitive, symmetric, inverseOf, ...
  functional, inverse functional
  equivalent property

axioms on classes
  intersection, union, disjoint, equivalent

restrictions on classes
  some value from, all values from, cardinality, has value, one of, keys

axioms on individuals
  same as, different from, all different

imports

Page 50: Introduction to linked data and the semantic web

Modelling – OWL

supports much richer modelling
  consistency checking of model
  consistency checking of data

some surprises if used to schema languages
  open world, no unique name assumption
  can extend to closed world checking

inference
  classification
  inferred relationships
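One of the surprises for people used to schema languages can be shown with a functional property. In a database, two values for a single-valued field is a validation error. Under OWL's open world semantics with no unique name assumption, two URIs as values of a functional property lead instead to the conclusion that both URIs name the same individual. The sketch below illustrates this; the property ex:hasHead and the second person URI are made up for the example, and the function is not a real OWL reasoner.

```python
def same_as_from_functional(triples, functional_properties):
    """For a functional property P: s P a and s P b  =>  a owl:sameAs b."""
    values = {}
    for s, p, o in triples:
        if p in functional_properties:
            values.setdefault((s, p), set()).add(o)
    inferred = set()
    for objects in values.values():
        ordered = sorted(objects)
        # Pair up every two distinct values recorded for the same subject.
        for i, a in enumerate(ordered):
            for b in ordered[i + 1:]:
                inferred.add((a, "owl:sameAs", b))
    return inferred


# Hypothetical data: ex:hasHead is assumed to be declared functional.
DATA = {
    ("school:401874", "ex:hasHead", "person:JoeBloggs"),
    ("school:401874", "ex:hasHead", "person:JBloggs"),
}

print(same_as_from_functional(DATA, {"ex:hasHead"}))
```

A closed world checker layered on top would be the place to flag such data as an error, if that is the behaviour wanted.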

Page 51: Introduction to linked data and the semantic web

Modelling

Spectrum of goals and styles

Lightweight vocabularies
  simple modelling
  just enough agreement to get useful work done
  removing boundaries to enable information to be found and connected
  global consistency not possible
  a little semantics goes a long way

Rich ontological models
  rich domain models need expressivity
  consistency is critical
  make complex inferences you can rely on, across data you trust
  knowledge is power

Page 52: Introduction to linked data and the semantic web

Modelling

Ontology reuse

invest in complete ontology for a domain
  rich but general model, may be modular inside
  strong “ontological commitment”
  e.g. medical ontologies

reuse small, common vocabularies
  FOAF, SIOC, Dublin Core, Org ...
  pick and choose classes and properties you need
  fill in a few missing links for your domain

generic reusable vocabularies
  Data cube vocabulary

Page 53: Introduction to linked data and the semantic web

Accessing all this data

link following
  HTTP GET, follow links, aggregate relevant statements

query
  SPARQL
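Link following can be sketched as a small breadth-first crawl: fetch a URI, keep the statements that come back, and queue any URIs they mention. To stay runnable offline, the example below replaces HTTP GET with a lookup in a hard-coded dictionary. `FAKE_WEB`, `fetch` and `follow_links` are all hypothetical names, and a real client would issue HTTP requests and parse the RDF that is returned.

```python
from collections import deque

# Hypothetical stand-in for the web: URI -> statements returned when fetched.
FAKE_WEB = {
    "school:401874": [
        ("school:401874", "rdfs:label", '"Cardiff High School"'),
        ("school:401874", "ont:districtAdministrative", "la:00PT"),
    ],
    "la:00PT": [
        ("la:00PT", "rdfs:label", '"Cardiff"'),
    ],
}


def fetch(uri):
    """Stand-in for an HTTP GET; unknown URIs yield no statements."""
    return FAKE_WEB.get(uri, [])


def follow_links(start, max_fetches=10):
    """Breadth-first crawl: fetch a URI, keep its statements, queue linked URIs."""
    seen = {start}
    queue = deque([start])
    statements = set()
    fetches = 0
    while queue and fetches < max_fetches:
        uri = queue.popleft()
        fetches += 1
        for s, p, o in fetch(uri):
            statements.add((s, p, o))
            # Literals are written with quotes here; anything else is a link.
            if not o.startswith('"') and o not in seen:
                seen.add(o)
                queue.append(o)
    return statements


for statement in sorted(follow_links("school:401874")):
    print(statement)
```

The `max_fetches` limit is there because on the real web the link graph is effectively unbounded, so a crawl needs some stopping rule.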

Page 54: Introduction to linked data and the semantic web

SPARQL

core idea is pattern matching
  graph patterns with variables
  any subgraph which matches yields row of bindings

  ?school ont:districtAdministrative [ rdfs:label “Cardiff” ]

syntax based on Turtle syntax for RDF
web API endpoints
lots of power
  filters, optionals, named graphs
  sub-queries, property chains, aggregation
  federated query, update, construct
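The pattern matching idea can be sketched in a few lines: each pattern is a triple in which names starting with "?" are variables, and every way of matching all the patterns against the graph gives one row of bindings. This is only an illustration of the core idea, not how a SPARQL engine is implemented; the helper names are hypothetical, and a named variable ?district stands in for the anonymous node in the slide's pattern.

```python
# Roughly equivalent SPARQL (prefix declarations omitted):
#   SELECT ?school ?district WHERE {
#     ?school ont:districtAdministrative ?district .
#     ?district rdfs:label "Cardiff" .
#   }

GRAPH = {
    ("school:401874", "rdfs:label", '"Cardiff High School"'),
    ("school:401874", "ont:districtAdministrative", "la:00PT"),
    ("la:00PT", "rdfs:label", '"Cardiff"'),
}


def is_var(term):
    """Variables are written with a leading question mark, as in SPARQL."""
    return term.startswith("?")


def match_pattern(pattern, triple, bindings):
    """Extend bindings so that pattern matches triple; return None on failure."""
    result = dict(bindings)
    for p_term, t_term in zip(pattern, triple):
        if is_var(p_term):
            # A variable must keep the value it was bound to earlier.
            if result.setdefault(p_term, t_term) != t_term:
                return None
        elif p_term != t_term:
            return None
    return result


def query(patterns, triples):
    """Match every pattern against the graph; one bindings dict per solution."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for bindings in solutions:
            for triple in triples:
                extended = match_pattern(pattern, triple, bindings)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions


PATTERNS = [
    ("?school", "ont:districtAdministrative", "?district"),
    ("?district", "rdfs:label", '"Cardiff"'),
]

print(query(PATTERNS, GRAPH))
```

Real engines reorder patterns and use indexes rather than scanning every triple, which is where the optimization challenges mentioned later come from.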

Page 55: Introduction to linked data and the semantic web

Accessing all this data

link following
  HTTP GET, follow links, aggregate relevant statements

query
  SPARQL

linked data API
  RESTful API onto linked data resources
  simple query, usable without RDF stack, web dev friendly
  easy to layer visualizations and UIs on top

third parties
  search engines and aggregators e.g. Sindice, sameAs.org

Page 56: Introduction to linked data and the semantic web

Semantic web layer cake

Page 57: Introduction to linked data and the semantic web

Strengths and weaknesses

image: spcbrass @ flickr.com

Page 58: Introduction to linked data and the semantic web

Strengths

data integration
  use of global identifiers (URIs)
  composable – statements v. containers, schemaless
  linking, vocabulary mapping

extensible, incremental, decentralized, resilient
  no global ontology/schema to develop or maintain
  freely add terms from other vocabularies
  open world assumption

modelling and data entwined
  link data to models, data in context
  use same technology to share, manage, extend models

supports inference and classification

rich access routes
  web linking, download, query, web APIs

Page 59: Introduction to linked data and the semantic web

Weaknesses

complexity of the stack
  alphabet soup – RDF, RDFS, OWL, SPARQL, RIF ...
  unfamiliar “ontology”, “logical entailment”
  lots of arcane details
  RDF/XML syntax
  • use the parts you need
  • tooling e.g. Linked Data API
  • core ideas not that complex

performance of schema-less stores
  optimization challenges
  • technology improving steadily
  • hybrid solutions

limited validation and constraints
  • closed world checkers

cost of modelling, ontology development
  • ontology reuse
  • generic ontologies (data cube)
  • tools

no inbuilt notions of time, uncertainty
  • model on top

Page 60: Introduction to linked data and the semantic web

Wrapping up

image: erika g. @ flickr.com

Page 61: Introduction to linked data and the semantic web

Things we missed out

RDF nuances
  blank nodes, containers and collections
  named graphs

linked data nuances
  URI for thing v. web page, content negotiation, httpRange-14
  URI architecture

OWL nuances
  OWL species, serializations, lots of details

Other technologies in the stack
  SPARQL update, rules (RIF), GRDDL, POWDER, GeoSPARQL, RDB mapping, triple/quad stores

Embedding structured data in markup
  RDFa, microformats, microdata, schema.org and all that

Page 62: Introduction to linked data and the semantic web

Hot topics

Government linked data
  identifiers to seed linked data
  data publication
  transparency, improving services, economic growth

structured data and search engines
  rich snippets, structured results, SEO
  search => question answering

user interfaces
  visualization, exploration, exploiting linking

data as a service

Page 63: Introduction to linked data and the semantic web

fin.

Page 64: Introduction to linked data and the semantic web

Spare

Page 65: Introduction to linked data and the semantic web

Case study: Local government payments

data, model, publish, use