Publishing linked data from relational databases

Post on 17-Jan-2015

Jornadas de Software Libre y Web 2.0, Universidad de Cádiz, 10 November 2011. http://softwarelibre.uca.es/node/1201

Iván Ruiz Rube Departamento de Lenguajes y Sistemas Informáticos

Universidad de Cádiz

09/11/2011 1

Publishing Linked Data from relational databases

Jornadas de Software Libre y Web 2.0

Roadmap

The evolution of the Web

Linked Open Data

Exposing databases with D2R Server

Case study: The VOA3R Project

Conclusions

THE EVOLUTION OF THE WEB

World Wide Web

Most important infrastructure for the distribution of information.

Rich and broad information: text, images, videos, slides, etc.

Web browsers support HTML, JavaScript, CSS and other formats.

Navigation based on hyperlinks.

Web based on documents

Web Evolution

Web 1.0, Web 2.0, Web 3.0

Web 1.0

Beginnings of the Web

Static pages

Limited use of standards

Lack of interaction with the user

Web 2.0

Higher bandwidth

Standards

Rich User Interface

Accessibility

Usability

Social networks

Web 3.0

3D virtual environments

The Internet of Things

Home automation (domotics)

Cloud Computing

Semantic Web

Semantic Web

“I have a dream for the Web (in which computers) become capable of analyzing all the data on the Web…”

“…the day-to-day mechanisms of trade, bureaucracy and our daily lives will be handled by machines talking to machines.”

Tim Berners-Lee

LINKED OPEN DATA

Information Age

Huge amount of information

A large number of information systems

Big challenges:

◦ Data integration

◦ Data analysis

Need for open data

Improvement of organizational transparency

Public data

Foster research

Promote the development of third-party systems

Linked Open Data

“A method of publishing structured data so that it can be interlinked and become more useful. …it extends web pages to share information in a way that can be read automatically by computers.”

Tim Berners-Lee

Resource Description Framework

[Figure: an RDF graph built up step by step over four slides. The resource http://publisher.org/Papers/Paper12345 has the properties title ("Linked Data - The Story So Far"), year (2008), author (http://w3.org/People/Berners-Lee) and publishedIn (http://publisher.org/Journals/JournalSWIS). The author resource has type http://xmlns.com/foaf/Person and is connected to http://w3c.org by a director edge.]

RDF (syntax)

<http://publisher.org/Papers/Paper12345>
    title "Linked Data - The Story So Far" ;
    year "2008-01-01" ;
    author <http://w3.org/People/Berners-Lee> ;
    publishedIn <http://publisher.org/Journal/JournalSWIS> .

<rdf:Description rdf:about="http://publisher.org/Papers/Paper12345">
    <title>Linked Data - The Story So Far</title>
    <year>2008-01-01</year>
    <author rdf:resource="http://w3.org/People/Berners-Lee" />
    <publishedIn rdf:resource="http://publisher.org/Journal/JournalSWIS" />
</rdf:Description>
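
Both notations above describe the same four statements about one paper. As a minimal sketch of that idea, the following Python snippet holds the statements as (subject, predicate, object) tuples and prints them in the line-based N-Triples notation. The property URIs are an assumption: the slide abbreviates them to bare names, so Dublin Core terms are used here purely for illustration.

```python
# Illustrative only: the slide writes bare property names ("title", "year", ...),
# so the full property URIs below (Dublin Core terms) are assumed.
DCTERMS = "http://purl.org/dc/terms/"
PAPER = "http://publisher.org/Papers/Paper12345"

# Each statement is (subject, predicate, object); the object is tagged as
# either a URI reference or a plain literal.
triples = [
    (PAPER, DCTERMS + "title", ("literal", "Linked Data - The Story So Far")),
    (PAPER, DCTERMS + "issued", ("literal", "2008-01-01")),
    (PAPER, DCTERMS + "creator", ("uri", "http://w3.org/People/Berners-Lee")),
    (PAPER, DCTERMS + "isPartOf", ("uri", "http://publisher.org/Journal/JournalSWIS")),
]


def escape_literal(text):
    """Escape the characters that must not appear raw inside a quoted literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def to_ntriples(statements):
    """Serialize the statements one per line: <s> <p> <o-or-literal> ."""
    lines = []
    for subject, predicate, (kind, value) in statements:
        obj = f"<{value}>" if kind == "uri" else f'"{escape_literal(value)}"'
        lines.append(f"<{subject}> <{predicate}> {obj} .")
    return "\n".join(lines)


print(to_ntriples(triples))
```

Whatever the notation, the data model stays the same: a set of triples whose subjects and predicates are URIs.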

Ontologies (vocabularies)

“An ontology is an explicit and formal specification of a shared conceptualization.”

Tom Gruber

Linked Data Cloud

EXPOSING DATABASES WITH D2R SERVER

How is your data currently stored?

How to publish Linked Data?

Annotation

◦ Manual

◦ Collaborative

◦ (Semi-)automatic

Exposure

◦ RDF Triple Store

◦ HTML+RDF (RDFa)

◦ RDF Wrappers

◦ SQL2RDF

Web Application Architecture

[Figure: architecture diagram with a Relational Database, an Application Server and a User Interface.]

Web Application Architecture using D2R Server

[Figure: architecture diagram with a Relational Database, D2R Server and an Application Server; D2R Server serves RDF such as the following.]

<http://cris.org:/resource/projects/Organic>
    a cerif:Project ;
    rdfs:label "Multilingual Federation of Learning Repositories"@en-uk ;
    cerif:acronym "Organic.Edunet" ;
    cerif:endDate "2010-09-30"^^xsd:date ;
    cerif:internalIdentifier "ff808181300cf99e01300d1a355f0003" ;
    cerif:isLinkedByOrganisationUnit

Exposing and Consuming Linked Data

[Figure: a web browser showing a mashup at http://mashup.org, together with D2R Server and the Relational Database.]

Installing D2R

Using D2R

~/d2rserver$> generate-mapping -o MAPPING.n3 -d com.mysql.jdbc.Driver -u USER -p PASSWORD jdbc:mysql://localhost:3306/DATABASE

~/d2rserver$> d2r-server MAPPING.n3

Database model example

Vocabularies

# Built-in vocabularies
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .

# Specific vocabularies
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix bibo: <http://purl.org/ontology/bibo/> .
@prefix geo: <http://www.w3.org/2003/01/geo/wgs84_pos#> .
@prefix cerif: <http://eurocris.org/cerif#> .

Database Connection

map:database a d2rq:Database;
    # Main settings
    d2rq:jdbcDriver "com.mysql.jdbc.Driver";
    d2rq:jdbcDSN "jdbc:mysql://localhost:3306/DATABASE";
    d2rq:username "USER";
    d2rq:password "PASSWORD";
    # Other settings
    jdbc:autoReconnect "true";
    jdbc:zeroDateTimeBehavior "convertToNull";
    d2rq:allowDistinct "true";
    jdbc:keepAlive "3600"; # value in seconds
    jdbc:keepAliveQuery "SELECT 1";
    .

Exposing RDF Resources

map:OrganisationUnits a d2rq:ClassMap;
    d2rq:dataStorage map:database;
    d2rq:class cerif:Organization;
    d2rq:uriPattern "organizations/@@ORGANISATIONS.ACRONYM@@";
    d2rq:condition "ORGANISATIONS.ACRONYM <> ''" .

[Figure: the resulting resource http://dataset.org/organizations/UCA has rdf:type http://eurocris.org/cerif/Organization.]
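
To make the role of the URI pattern and the condition more concrete, here is a small Python sketch of the idea, not D2R Server's actual implementation. It fills each @@TABLE.COLUMN@@ placeholder with the value from a row and skips rows that fail the condition. The base URI http://dataset.org/ is taken from the figure; the function names, the sample rows and the simple percent-encoding are assumptions made for illustration.

```python
import re
from urllib.parse import quote

# Assumed base URI, taken from the figure on the slide.
BASE_URI = "http://dataset.org/"


def expand_uri_pattern(pattern, row, base_uri=BASE_URI):
    """Fill every @@TABLE.COLUMN@@ placeholder with the row's value."""

    def substitute(match):
        value = row[match.group(1)]
        # Percent-encode the value so it is safe inside a URI (simplified).
        return quote(str(value), safe="")

    return base_uri + re.sub(r"@@([^@]+)@@", substitute, pattern)


def class_map_uris(pattern, rows, condition):
    """Return one resource URI per row that satisfies the condition."""
    return [expand_uri_pattern(pattern, row) for row in rows if condition(row)]


# Hypothetical rows of the ORGANISATIONS table; the second one has no acronym.
rows = [
    {"ORGANISATIONS.ACRONYM": "UCA"},
    {"ORGANISATIONS.ACRONYM": ""},
]

uris = class_map_uris(
    "organizations/@@ORGANISATIONS.ACRONYM@@",
    rows,
    condition=lambda row: row["ORGANISATIONS.ACRONYM"] != "",
)
print(uris)
```

The row with an empty acronym produces no resource, which mirrors the purpose of the condition in the mapping: it avoids minting a URI that ends in an empty segment.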

Exposing literal properties

map:OrganisationUnits_Headcount a d2rq:PropertyBridge;
    d2rq:belongsToClassMap map:OrganisationUnits;
    d2rq:property cerif:headcount;
    d2rq:column "ORGANISATIONS.HEADCOUNT" .

[Figure: the resource http://dataset.org/organizations/UCA has cerif:headcount 2400.]

Exposing 1:N relations

map:OrganisationUnits_Name a d2rq:PropertyBridge;
    d2rq:belongsToClassMap map:OrganisationUnits;
    d2rq:property cerif:name;
    d2rq:join "ORG_NAME.ORGID = ORGANISATIONS.ID";
    d2rq:column "ORG_NAME.NAME" .

[Figure: the resource http://dataset.org/organizations/UCA has two cerif:name values, "University of Cádiz"@en and "Universidad de Cádiz"@es.]
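
The effect of the join in this property bridge can be approximated with an ordinary SQL join. The sketch below uses an in-memory SQLite database in place of MySQL and emits one cerif:name statement per joined row. The table contents are invented sample data, and the LANG column is hypothetical: the mapping on the slide does not show where the language tags in the figure come from.

```python
import sqlite3

# In-memory stand-in for the relational database; the rows are sample data.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE ORGANISATIONS (ID INTEGER PRIMARY KEY, ACRONYM TEXT);
    CREATE TABLE ORG_NAME (ORGID INTEGER, NAME TEXT, LANG TEXT);  -- LANG is hypothetical
    INSERT INTO ORGANISATIONS VALUES (1, 'UCA');
    INSERT INTO ORG_NAME VALUES (1, 'University of Cádiz', 'en');
    INSERT INTO ORG_NAME VALUES (1, 'Universidad de Cádiz', 'es');
    """
)

# Same join condition as d2rq:join, plus the class map's condition on ACRONYM.
rows = conn.execute(
    """
    SELECT ORGANISATIONS.ACRONYM, ORG_NAME.NAME, ORG_NAME.LANG
    FROM ORGANISATIONS
    JOIN ORG_NAME ON ORG_NAME.ORGID = ORGANISATIONS.ID
    WHERE ORGANISATIONS.ACRONYM <> ''
    ORDER BY ORG_NAME.LANG
    """
).fetchall()

# One organisation row joined with N name rows yields N cerif:name statements.
name_triples = [
    (f"http://dataset.org/organizations/{acronym}", "cerif:name", f'"{name}"@{lang}')
    for acronym, name, lang in rows
]

for triple in name_triples:
    print(*triple)
```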

Exposing N:M relations

map:OrganisationUnits_Person a d2rq:PropertyBridge;
    d2rq:belongsToClassMap map:OrganisationUnits;
    d2rq:property cerif:members;
    d2rq:join "ORG_PERS.ORGID = ORGANISATIONS.ID";
    d2rq:join "ORG_PERS.PERSID = PERSON.ID";
    d2rq:refersToClassMap map:Person .

[Figure: the resource http://dataset.org/organizations/UCA is linked by cerif:members to http://dataset.org/people/InvestigadorXYZ.]
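
For the N:M case, the two joins walk through the link table ORG_PERS, and the object of each statement is another resource rather than a literal. The following sketch illustrates this with SQLite and invented sample rows. The PERSON.NICK column and the people/ URI prefix are assumptions, since the slides do not show the map:Person class map.

```python
import sqlite3


def member_links(conn):
    """Return (organisation URI, property, person URI) for every link row."""
    query = """
        SELECT ORGANISATIONS.ACRONYM, PERSON.NICK
        FROM ORGANISATIONS
        JOIN ORG_PERS ON ORG_PERS.ORGID = ORGANISATIONS.ID
        JOIN PERSON ON ORG_PERS.PERSID = PERSON.ID
        ORDER BY ORGANISATIONS.ACRONYM, PERSON.NICK
    """
    return [
        (
            f"http://dataset.org/organizations/{acronym}",
            "cerif:members",
            f"http://dataset.org/people/{nick}",  # assumed URI pattern for people
        )
        for acronym, nick in conn.execute(query)
    ]


# Sample data: two organisations, two people, three membership rows.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE ORGANISATIONS (ID INTEGER PRIMARY KEY, ACRONYM TEXT);
    CREATE TABLE PERSON (ID INTEGER PRIMARY KEY, NICK TEXT);  -- NICK is hypothetical
    CREATE TABLE ORG_PERS (ORGID INTEGER, PERSID INTEGER);
    INSERT INTO ORGANISATIONS VALUES (1, 'UCA'), (2, 'UAH');
    INSERT INTO PERSON VALUES (10, 'InvestigadorXYZ'), (11, 'InvestigadorABC');
    INSERT INTO ORG_PERS VALUES (1, 10), (1, 11), (2, 10);
    """
)

links = member_links(conn)
for link in links:
    print(*link)
```

Because the object is a URI, a client that follows it reaches the description of the person, which is what makes the data linked.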

CASE STUDY: THE VOA3R PROJECT

Platform based on semantic technologies to integrate open content for researchers.

Manages scientific context:

◦ Organizations

◦ Research Projects

◦ Researcher Profiles

◦ etc.

Publishes its data using D2R Server.

VOA3R Portal

Organization’s Data in VOA3R

Organization’s Data in RDF

<http://voa3r.cc.uah.es/dataset/resource/organisationUnits/UAH>
    rdf:type cerif:OrganisationUnit ;
    rdfs:label "University of Alcala" ;
    cerif:acronym "UAH" ;
    foaf:homepage <http://www.uah.es> ;
    cerif:researchActivities "Ontologies, Linked Data" ;
    dcterms:subject
        <http://aims.fao.org/aos/agrovoc/c_7273> ,
        <http://aims.fao.org/aos/agrovoc/c_8070> ;
    cerif:researchProjects
        <http://voa3r.cc.uah.es/dataset/resource/projects/Organic.Edunet> ,
        <http://voa3r.cc.uah.es/dataset/resource/projects/Organic.Lingua> ,
        <http://voa3r.cc.uah.es/dataset/resource/projects/VOA3R> ;
    cerif:innerGroups
        <http://voa3r.cc.uah.es/dataset/resource/organisationUnits/IERU> ;
    cerif:members
        <http://voa3r.cc.uah.es/dataset/resource/person/Salvador_Sanchez> ,
        <http://voa3r.cc.uah.es/dataset/resource/person/Miguel_Refusta> ,
        <http://voa3r.cc.uah.es/dataset/resource/person/Luis_Torrico> .

Organization’s Data via D2R

SPARQL Client
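
A SPARQL client can also be a few lines of code rather than a web form. As a hedged sketch, the snippet below only builds the GET request URL for a query that lists organisation units and their labels. The endpoint address is an assumption (a D2R Server instance running locally), and the query reuses the cerif prefix from the Vocabularies slide; neither is taken from the VOA3R deployment.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumed endpoint: a D2R Server instance running on the local machine.
ENDPOINT = "http://localhost:2020/sparql"

# Example query: organisation units and their labels (first ten results).
QUERY = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX cerif: <http://eurocris.org/cerif#>
SELECT ?org ?label WHERE {
  ?org a cerif:OrganisationUnit ;
       rdfs:label ?label .
}
LIMIT 10
"""


def build_query_url(endpoint, query):
    """Encode the query text into the 'query' parameter of a GET request URL."""
    return endpoint + "?" + urlencode({"query": query.strip()})


url = build_query_url(ENDPOINT, QUERY)
print(url)

# Round-trip check: decoding the URL gives back the original query text.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(decoded == QUERY.strip())
```

Fetching the URL, for example with urllib.request.urlopen and an Accept header asking for SPARQL JSON results, would return the matching rows if such a server is running.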

CONCLUSIONS

Conclusions

Web based on documents → Web based on data.

Linked Data as a way of interchanging data between different datasets on the Web.

RDF as a standard format to describe data.

D2R allows publishing RDF metadata from databases (a non-intrusive technique).

Main aim: create new third-party applications using open linked data from LD systems.

References

Linked Data: Evolving the Web into a Global Data Space

◦ http://linkeddatabook.com/

W3C Linking Open Data Project

◦ http://www.w3.org/wiki/SweoIG/TaskForces/CommunityProjects/LinkingOpenData

D2R Server

◦ http://www4.wiwiss.fu-berlin.de/bizer/d2r-server

Iván Ruiz Rube ivan.ruiz@uca.es

Publishing Linked Data from relational databases

Thanks