Produce and Consume Linked Data with Drupal!

Stéphane Corlosquet, Renaud Delbru, Tim Clark, Axel Polleres and Stefan Decker
ISWC 2009, October 27th, 2009
Digital Enterprise Research Institute (www.deri.ie), DERI NUI Galway, MGH
Copyright 2009 Digital Enterprise Research Institute. All rights reserved.


Loads of Data on the Web in CMS...


Some Motivations...

Status of the current web
• Data contained in millions of documents
• Disparate platforms and systems
• Wide range of topics (personal blogs, news, etc.)
• Various types of resources (text, pictures, video, etc.)
• Note: lots of structured data in Content Management Systems

Problem
• Not possible to reuse this data outside the CMS (except RSS)
• Not available in a unified machine-readable format


So, here’s our idea of CMS:

[Figure 3.5 diagram: the PROJECT BLOGS site, a REMOTE DRUPAL SITE and DBLP, each exposing a SPARQL endpoint, and the user Tim. Example query shown in the figure:]

    SELECT ?name ?title
    WHERE {
      ?person foaf:made ?pub .
      ?person rdfs:label ?name .
      ?pub dc:title ?title .
      FILTER regex(?title, "knowledge", "i")
    }

Figure 3.5: Extended example in a typical Linked Data eco-system.
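The query in Figure 3.5 joins three triple patterns and keeps only the titles that match a case-insensitive regular expression. As a rough illustration of what a SPARQL engine does with it, the following Python sketch evaluates the same pattern over a handful of in-memory triples. The sample data and the helper names (`TRIPLES`, `objects`, `names_and_titles`) are made up for this example; they are not DBLP data or part of any module described here.

```python
import re

# Hypothetical sample triples (subject, predicate, object); not real DBLP data.
TRIPLES = [
    ("ex:tim", "foaf:made", "ex:pub1"),
    ("ex:tim", "foaf:made", "ex:pub2"),
    ("ex:tim", "rdfs:label", "Tim"),
    ("ex:pub1", "dc:title", "Knowledge integration for science"),
    ("ex:pub2", "dc:title", "A study of stem cells"),
]


def objects(triples, subject, predicate):
    """Return all objects of triples matching (subject, predicate, ?o)."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def names_and_titles(triples, pattern="knowledge"):
    """Mimic the SELECT ?name ?title query with its case-insensitive FILTER."""
    regex = re.compile(pattern, re.IGNORECASE)
    rows = []
    for person, predicate, pub in triples:
        if predicate != "foaf:made":
            continue  # only ?person foaf:made ?pub starts a match
        for name in objects(triples, person, "rdfs:label"):
            for title in objects(triples, pub, "dc:title"):
                if regex.search(title):
                    rows.append((name, title))
    return rows
```

Calling `names_and_titles(TRIPLES)` returns one row per (name, title) pair whose title contains "knowledge" in any letter case, which is the shape of result a SPARQL endpoint would return for the query above.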

one for bridging the DBLP SPARQL endpoint to the project blogs website, and a second for bridging the Science Collaboration Framework website. When visiting Tim’s profile page, the relevant publication information will be fetched from both DBLP and SCF websites, and either new nodes will be created on the site or older ones will be updated if necessary.

3.4 Neologism: Easy RDFS vocabulary publishing

Neologism (http://neologism.deri.ie/) is a web-based vocabulary editor and publishing platform designed to address these issues related to vocabulary authoring and publishing on the Web. It is currently available as an open-source project (http://neologism.googlecode.com/).

3.4.1 Architecture

Public interface. To non-authenticated users on the Web, Neologism presents a very simple interface: a homepage that lists one or more vocabularies, and for each of them


Approach

Our Goal
• Integrate "any" CMS site into the Web of Data

A challenging task
• Little incentive for users to annotate their data manually
• Site owners do not have the resources to convert their data to RDF
• Per-site schema: each site is different and its structure cannot be predefined

Solutions
• Expose the CMS site structure in a unified format AUTOMATICALLY!
• Use Semantic Web standards (RDFa, SPARQL)


Approach

Implementation in Drupal
Why?
• One of the most popular CMS out there
• Modules to take the burden off the site users

What our modules allow:
1. Automatic site vocabulary generation
2. Mapping Content Models to existing ontologies
3. Data endpoint for SPARQL querying
4. Lazy loading of external data (data import)


Pre-Existing work

“Semantic Content Management Systems”

Ontology-based CMS:
– Semantic community Web portals (2000)
– OntoWebber: Model-Driven Ontology-Based Web Site Management (2001)

Our approach is the reverse: from existing CMS structure to ontologies


The Drupal CMS

Drupal*
• Easy to use
• Large community
• Popular on the Web
• Hundreds of thousands of sites
• Modular design

Drupal site workflow
• Site administrator: sets up the site and installs the modules they like/need
• Site editors: create the content of the site following the schema defined by the site administrator

* http://drupal.org/


Drupal: Content Construction Kit

Content Construction Kit (CCK) module
• GUI for extending the internal schema of a Drupal site
• Used on many Drupal sites
• Can build new types of pages, known as content types
• Can create fields for each content type. Fields can be of various types: plain text fields, dates, email addresses, file uploads, references to other pages


Demo use case: project blogs site*
• Community site
• Various content:
– People
– Organizations
– Projects
– Blogs

* http://drupal.deri.ie/projectblogs/


CCK User Interface


Figure 2.11: List of fields of the Person content type in Drupal’s CCK.

The fields form for the Person content type is displayed in Figure 2.11. This form makes it easy to reorder the fields by drag and drop, add new fields, remove existing fields or access the configuration form for a field.

Figure 2.12: Defining constraints on the gender field in Drupal’s CCK.

The configuration form for the gender field appears in Figure 2.12, where it can be set as required. Its cardinality can be specified as well as, if appropriate, a list of allowed values, in which case the form will present a drop-down list of choices for this field.
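The three constraints named here (required, cardinality, allowed values) can be pictured as a small record attached to each field. The sketch below is only an illustration in Python of how such constraints might be checked; `FieldConfig`, `validate` and the allowed values chosen for gender are hypothetical and do not reflect Drupal's or CCK's actual API.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class FieldConfig:
    """Hypothetical stand-in for a CCK field's settings (not Drupal's API)."""
    name: str
    required: bool = False
    max_values: Optional[int] = None                 # None means unlimited
    allowed_values: Optional[Sequence[str]] = None   # None means free text


def validate(field: FieldConfig, values: Sequence[str]) -> list:
    """Return a list of constraint violations; an empty list means valid."""
    errors = []
    if field.required and not values:
        errors.append(f"{field.name} is required")
    if field.max_values is not None and len(values) > field.max_values:
        errors.append(f"{field.name} allows at most {field.max_values} value(s)")
    if field.allowed_values is not None:
        for value in values:
            if value not in field.allowed_values:
                errors.append(f"{field.name}: '{value}' is not an allowed value")
    return errors


# Assumed example configuration: required, single-valued, fixed choices.
gender = FieldConfig("gender", required=True, max_values=1,
                     allowed_values=("female", "male"))
```

With this configuration, `validate(gender, [])` reports the missing required value, while a single allowed value passes without errors.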

In particular, we will illustrate in later sections how to extend the publications field to automatically display a list of publications pulled from various data endpoints.


Figures 2.9, 2.10, 2.11 and 2.12 show the typical look and feel of a Drupal page and administrative interface for the Person content type, without our extensions installed. This content type offers fields such as name, homepage, email, colleagues, blog url, current project, past projects, publications, contributions.

Figure 2.9: User profile page built with Drupal’s CCK.

An example of a node (page) of the type Person is depicted in Figure 2.9, where all the fields are listed with their respective values. These values are formatted depending on the type of field they belong to: e.g. a value of type link, such as the homepage field, is a link to http://openspring.net/, while a value of type Node reference, such as the colleagues field, is a link to the page of Aidan Hogan, for instance, which is hosted on the same site. Note also the View and Edit links at the top, which are available to logged-in users who have permission to edit the page.

Figure 2.10: Administration page of the Person content type in Drupal’s CCK.

Figure 2.10 presents the basic form of the Person content type. It does not contain any information about the fields, but rather more general settings such as the name of the content type, whether it should allow comments, have revisions, etc.


What do we add?


1. Site Vocabulary

Automatic site vocabulary in RDFS/OWL from CCK
• Describes the content types and fields
• Content type <=> RDF class
• Field <=> RDF property
• RDFa output on site
• http://siteurl/ns#
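To make the correspondence concrete (content type to class, field to property, all under the site namespace), here is a minimal Python sketch that renders a content model as Turtle. The function name `site_vocabulary` and the choice of `rdfs:Class` and `rdf:Property` as the generated types are assumptions for illustration; the actual RDF CCK module may emit more specific OWL types.

```python
def site_vocabulary(site_url, content_types):
    """Render content types and their fields as Turtle lines.

    content_types maps a content type name to a list of field names.
    """
    lines = [
        f"@prefix site: <{site_url}/ns#> .",
        "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .",
        "@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .",
        "",
    ]
    for type_name, fields in content_types.items():
        # Content type <=> RDF class
        lines.append(f"site:{type_name} a rdfs:Class .")
        for field in fields:
            # Field <=> RDF property
            lines.append(f"site:{field} a rdf:Property .")
    return "\n".join(lines)
```

For instance, `site_vocabulary("http://siteurl", {"Person": ["name", "homepage"]})` yields a `site:Person` class and `site:name` and `site:homepage` properties in the `http://siteurl/ns#` namespace.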


Automatic site vocabulary in RDFS/OWL
• Field constraints
• Example with cardinalities:
– the name of a Person is required
– max. 5 projects per person
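One plausible way to express these two constraints in OWL is as cardinality restrictions on the Person class: a minimum of 1 for name and a maximum of 5 for projects. The Python sketch below generates such restrictions as Turtle strings. The exact encoding used by the module is not shown in the slides, so the function `cardinality_restrictions`, the field name `projects` and the restriction shape are assumptions.

```python
def cardinality_restrictions(class_name, constraints):
    """Render (min, max) cardinalities per field as OWL restrictions in Turtle.

    constraints maps a field name to a (min_count, max_count) pair;
    use None for "no bound". Assumes the site:, rdfs:, owl: and xsd:
    prefixes are declared elsewhere in the document.
    """
    lines = []
    for field, (min_count, max_count) in constraints.items():
        for keyword, count in (("owl:minCardinality", min_count),
                               ("owl:maxCardinality", max_count)):
            if count is None:
                continue  # no bound on this side
            lines.append(
                f"site:{class_name} rdfs:subClassOf [ a owl:Restriction ; "
                f"owl:onProperty site:{field} ; "
                f'{keyword} "{count}"^^xsd:nonNegativeInteger ] .'
            )
    return lines
```

A required name and at most five projects would then be requested as `cardinality_restrictions("Person", {"name": (1, None), "projects": (None, 5)})`.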


2. Mapping Content Models to existing ontologies

Mapping Content Models to Existing Ontologies
• Import of any vocabulary published online
• External ontology search service
• Local terms are subclasses/subproperties of public terms

Ensure “safe” vocabulary re-use:
– only subclassing/subproperty avoids “redefinition”
– adding cardinalities might still introduce inconsistencies, but these can be avoided in the user interface


Objects of type literal and URI are normalised (tokenised) before being concatenated with their field name. It is thus possible to use full-text search not only on literals, but also on URIs identifying ontology terms. For example, one could search for "Agent" to match foaf:Agent, ignoring the namespace.
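The normalisation step can be illustrated with a short Python sketch. It splits a literal or URI on non-alphanumeric characters and camelCase boundaries, lowercases the tokens, and prefixes them with the field name. The tokenisation rules and the `field:token` form are assumptions made for this example; the actual search service may normalise differently.

```python
import re


def tokenise(value):
    """Split a literal or URI into lowercase word tokens.

    Splits on non-alphanumeric characters and on camelCase boundaries.
    """
    tokens = []
    for part in re.split(r"[^A-Za-z0-9]+", value):
        # Acronyms, capitalised or lowercase words, and digit runs.
        tokens.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+", part))
    return [token.lower() for token in tokens]


def index_terms(field, value):
    """Concatenate each token with its field name, e.g. 'subClassOf:person'."""
    return [f"{field}:{token}" for token in tokenise(value)]
```

Because the full URI `http://xmlns.com/foaf/0.1/Agent` produces the token `agent`, a keyword search for "Agent" can match the term regardless of its namespace.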

We allow search for plain keywords, combinations of keywords, or structured queries (e.g. student AND subClassOf:Person or name AND domain:Person). Search examples are shown in Figure 3.2. Details on improving the ranking of our search algorithm can be found in [45].

3.2.3 Mapping process

The terms suggested by both the import service and the ontology search service can be mapped to each content type and its fields. For mapping content types, one can choose among the classes of the imported ontologies, and for fields, one can choose among the properties. The local terms will be linked to the mapped terms in the site vocabulary with rdfs:subClassOf and rdfs:subPropertyOf statements, e.g. site:Person rdfs:subClassOf foaf:Person; wherever a mapping is defined, extra triples using the mapped terms are exposed in the RDFa of the page.
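The mapping process has two effects: subclass and subproperty statements are added to the site vocabulary, and each page additionally exposes triples that use the mapped public terms. A minimal Python sketch of both steps follows; the function names and the dictionary-based mapping tables are hypothetical and only illustrate the idea.

```python
def mapping_statements(class_map, property_map):
    """Link local terms to public terms without redefining the public ones."""
    statements = [f"{local} rdfs:subClassOf {public} ."
                  for local, public in class_map.items()]
    statements += [f"{local} rdfs:subPropertyOf {public} ."
                   for local, public in property_map.items()]
    return statements


def expand_triple(triple, property_map):
    """Return the local triple plus an extra triple using the mapped property."""
    subject, predicate, obj = triple
    result = [triple]
    if predicate in property_map:
        result.append((subject, property_map[predicate], obj))
    return result
```

For example, with `{"site:name": "foaf:name"}` as the property mapping, a page triple using `site:name` is exposed a second time with `foaf:name`, while unmapped fields are left unchanged.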

Additionally, we allow inverse reuse of existing properties. E.g., assume the site administrator imports a vocabulary ex: that defines a relation between countries/regions and the goods that this region/country produces via the property ex:produces. Our user interface also allows fields to be related to the inverse of imported properties. For instance, the origin field could be related to ex:produces in such an inverse manner, resulting in

    site:origin rdfs:subPropertyOf [ owl:inverseOf ex:produces ] .

being added to the site vocabulary.

The use of subclassing and subproperties for mapping to existing ontologies, instead of reusing the imported terms directly in the definitions of the site vocabulary, is a simple way of minimizing unintended conflicts between the semantics of the local vocabulary and public terms. Per OWL semantics, constraints imposed on the local term by the content model/site vocabulary, such as the cardinality restrictions which we derive from CCK (see Section 3.1), will thus not propagate to the public term. This ensures safe vocabulary re-use, i.e. avoids what is sometimes referred to as “Ontology Hijacking” [3].

Intuitively, safe reuse means that a vocabulary importing another one does not modify the meaning of the imported vocabulary or “hijack” instance data of the imported vocabulary.

Let us assume that we would, on the contrary, directly use the imported properties and classes in the site vocabulary. That would cause problems. For instance, the export of the content model as described in the previous section contains the triple site:name a owl:DatatypeProperty. Had we used foaf:name directly here, we would have changed the FOAF vocabulary, which, by itself doesn’t


RDF mappings page

Figure 3.2: RDF mappings management through the Drupal interface: RDF class mapping (left) and RDF property mapping (right).

Figure 3.3: Form for importing an external vocabulary.

goal is to use our project blogs website as a hub containing information federated from various remote locations:

• DBLP is a public SPARQL endpoint containing metadata on scientific publications. It is part of the Linking Open Data cloud and runs on a D2R server (http://www4.wiwiss.fu-berlin.de/bizer/d2r-server/).

• The Science Collaboration Framework website, which contains information about the SCF team and their scientific publications. It runs Drupal and the modules described in this thesis.

3.3.1 Exposing RDF data with a SPARQL endpoint

The first step to ensure interoperability on the Web of Data is to provide an endpoint which exposes RDF data. The RDF SPARQL endpoint module uses the PHP ARC2 library (http://arc.semsol.org/). Upon installation, the module will create a local RDF repository which will host all the RDF data generated by the RDF CCK module (see Section 3.1). The site can then be indexed with a simple click. The RDF data of each node is stored in a graph


3. Data endpoint for complex querying

Local RDF data exposed in a SPARQL endpoint
• Enables interoperability across sites
• Built on the PHP ARC2 library
• All RDF data indexed in the endpoint
• Each page stored as a graph and kept up to date
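Storing each page as its own graph makes it simple to keep the endpoint up to date: when a page changes, its graph is replaced as a whole. The following Python sketch illustrates that idea with an in-memory store. `GraphStore` and its methods are invented for this example and are not the ARC2 API.

```python
class GraphStore:
    """Toy quad store: one named graph per page URL (not the ARC2 API)."""

    def __init__(self):
        self._graphs = {}

    def index_page(self, page_url, triples):
        """Replace the page's graph so stale triples never linger."""
        self._graphs[page_url] = set(triples)

    def delete_page(self, page_url):
        """Drop the graph of a deleted page, if it was indexed."""
        self._graphs.pop(page_url, None)

    def match(self, subject=None, predicate=None, obj=None):
        """Yield (graph, s, p, o) for matching triples; None is a wildcard."""
        for graph, triples in self._graphs.items():
            for s, p, o in sorted(triples):
                if ((subject is None or s == subject)
                        and (predicate is None or p == predicate)
                        and (obj is None or o == obj)):
                    yield graph, s, p, o
```

Re-indexing a page with `index_page` discards the triples from its previous version, so queries only ever see the current state of each page.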

Figure 3.6: A list of SPARQL results (left) and an RDF SPARQL Proxy profile form (right).

a vocabulary page containing some general information about the vocabulary, followed by the descriptions of all its classes and properties.

Editor. After a vocabulary maintainer logs in, additional links become visible on the vocabulary page and allow adding new terms, as well as editing of existing terms. Terms are created and edited through a web form (Figure 3.8). The form allows entry of an ID (to become part of the term’s URI), label, comment, subclasses, subproperties, domain, range, disjoint classes, inverse properties, and marking a property as functional or inverse functional. Authenticated users can also create new vocabularies and modify the vocabulary metadata.

RDFS output, URIs and content negotiation. The URIs identifying classes and properties are always generated by appending the hash character and the term’s ID to the URI of the vocabulary page. This makes sure that the vocabulary page is returned when these URIs are resolved. HTTP requests to the vocabulary page are subject to content negotiation. Web browsers will see the HTML variant shown in Figure 3.7. RDF-aware clients will receive the RDFS/OWL specification, either in RDF/XML or N3 syntax. In a nutshell, Neologism publishes standards-compliant vocabularies on the Web without requiring any additional effort on the part of vocabulary maintainers.
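Two mechanisms are at work here. Hash URIs resolve to the vocabulary page because HTTP clients strip the fragment before sending the request, and the server then picks a representation from the request's Accept header. The Python sketch below illustrates both steps. Neologism itself is written in PHP and uses a dedicated content negotiation library, so the function names, the media type table and the fallback to HTML are assumptions for illustration only.

```python
def term_uri(vocabulary_url, term_id):
    """Build a hash URI for a term: vocabulary page URI + '#' + term ID."""
    return f"{vocabulary_url}#{term_id}"


def parse_accept(header):
    """Parse an Accept header into [(media_type, q)] sorted by descending q."""
    entries = []
    for item in header.split(","):
        parts = [part.strip() for part in item.split(";")]
        media_type = parts[0].lower()
        if not media_type:
            continue
        quality = 1.0  # default when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        entries.append((media_type, quality))
    # sorted() is stable, so equal q values keep their header order.
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


# Assumed media type table; the real deployment may recognise other types.
VARIANTS = {
    "text/html": "html",
    "application/xhtml+xml": "html",
    "application/rdf+xml": "rdfxml",
    "text/n3": "n3",
    "text/rdf+n3": "n3",
}


def negotiate(accept_header, default="html"):
    """Pick the variant for the most preferred media type we can serve."""
    for media_type, quality in parse_accept(accept_header):
        if quality > 0 and media_type in VARIANTS:
            return VARIANTS[media_type]
    return default
```

A browser's Accept header leads to the HTML variant, whereas a client sending `Accept: application/rdf+xml` receives the RDF/XML serialization of the same vocabulary.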

Implementation. Neologism is implemented in PHP as a Drupal module. Drupal reduces development time by providing many features for free, such as account management, database abstraction layer and content management. It also makes integration with a larger Drupal-based site very easy, for example to provide a news blog and discussion forum for each vocabulary built with Neologism. All data is stored in a MySQL database. RAP (http://www4.wiwiss.fu-berlin.de/bizer/rdfapi/) is used to serialize RDF/XML and N3. The PHP Content Negotiation library (http://ptlis.net/source/php-content-negotiation/) is used instead of the usual Apache rules to implement content negotiation, and Vapour (http://vapour.sourceforge.net/) was used to validate its correctness. The overview diagram


4. Lazy loading of external data

Lazy loading (caching) of distant RDF resources
• Enables interoperability across sites
• Built on the PHP ARC2 library
• CONSTRUCT query to map distant schema to local schema
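Lazy loading means that remote data is fetched only when a page first needs it, then reused from a local cache. The Python sketch below combines that with a CONSTRUCT-like step that renames remote predicates to local ones. The class `LazyRemoteCache`, the time-to-live expiry policy and the predicate mapping table are assumptions for illustration; the RDF Proxy module's actual behaviour is not detailed in the slides.

```python
import time


class LazyRemoteCache:
    """Fetch remote triples on first access and reuse them until they expire."""

    def __init__(self, fetch, mapping, ttl_seconds=3600, clock=time.time):
        self._fetch = fetch        # callable: resource URI -> list of triples
        self._mapping = mapping    # remote predicate -> local predicate
        self._ttl = ttl_seconds    # assumed expiry policy
        self._clock = clock
        self._cache = {}           # resource URI -> (fetched_at, triples)

    def _to_local(self, triples):
        """CONSTRUCT-like step: keep mapped predicates, renamed to local terms."""
        return [(s, self._mapping[p], o)
                for s, p, o in triples if p in self._mapping]

    def get(self, uri):
        """Return local triples for uri, fetching only when missing or stale."""
        now = self._clock()
        cached = self._cache.get(uri)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]
        triples = self._to_local(self._fetch(uri))
        self._cache[uri] = (now, triples)
        return triples
```

Here `fetch` stands in for a query against a remote SPARQL endpoint; passing it in as a callable keeps the caching logic independent of any particular HTTP or RDF library.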


Lazy loading of distant RDF resources


Where is it used?


Science Collaboration Framework

• Web application toolkit based on Drupal
• Enables online scientific collaboration
– publishing, annotating, sharing and discussing any content
– articles, papers, reviews, perspectives, interviews, news, biographies
– profile information on community members
• Targets biomedicine communities, but generic in essence
• Networked sites producing Linked Data


SCF collaborating sites

Stembook (stem cell articles and reviews)
– http://www.stembook.org/


Michael J. Fox Foundation (Parkinson's disease)
– http://www.pdonlineresearch.org/


Conclusion


The structure of CMS sites contains valuable schema information

Our suggested “workflow”:
• Site vocabulary from the local structure (RDF CCK)
• Enables out-of-the-box RDF export: expose your Drupal site to the Web of Data without any additional effort from site admin or content editors (RDF CCK)
• Mapping to existing RDF vocabularies improves integration in the LOD cloud (evoc)
• SPARQL endpoint
• Lazy loading of RDF resources (RDF Proxy)


Drupal 6 modules available for download
– http://drupal.org/project/rdfcck
– http://drupal.org/project/evoc
– http://drupal.org/project/sparql_ep
– http://drupal.org/project/rdfproxy

Online prototype
– http://drupal.deri.ie/projectblogs/


Good news from Drupal 7:

• RDF mapping feature committed to Drupal 7 core
• RDFa output by default (blogs, forums, comments, etc.) using FOAF, SIOC, DC, SKOS
• Download development snapshot
– http://ftp.drupal.org/files/projects/drupal-7.x-dev.tar.gz

Currently more than 200,000* sites on Drupal 6
• waiting to make the switch to Drupal 7
• waiting to massively increase the amount of RDF data on the Web

Discussion
– http://groups.drupal.org/semantic-web

* http://drupal.org/project/usage/drupal