Drupal and the semantic web - SemTechBiz 2012
Leveraging the Semantic Web with Drupal 7
Stéphane Corlosquet, Paolo Ciccarese
MIND Informatics
SemTechBiz San Francisco 2012
June 4th, 2012
About the speakers
● Stéphane Corlosquet
● 6 years with Drupal
● Drupal core maintainer (RDF)
● Drupal Security Team member
● Co-authored the Definitive Guide to Drupal 7
● Co-maintain RDF Extensions, SPARQL, schema.org
● Member of the RDFa WG
About the speakers
● Paolo Ciccarese, PhD
● Assistant in Neurology at Mass General Hospital
● Research faculty at Harvard Medical School
● Author of 30+ scientific publications
● Senior software and knowledge engineer
● Member of W3C HCLS Interest Group
● Co-chair of the W3C Open Annotation Community Group
Tutorial outline
● Introduction to Drupal
● What is it good for
● Installation / Hosted Drupal
● Semantic Web and Drupal
● Technology stack
● Use cases, hands on session
● Domeo & Drupal
Drupal
● Dries Buytaert - small news site in 2000
● Open Source - 2001
● Content Management System
● LAMP stack
● Non-developers can build sites and publish content
● Control panels instead of code
http://www.flickr.com/photos/funkyah/2400889778/
Drupal
● Open & modular architecture
● Extensible by modules
● Standards-based
● Low resource hosting
● Scalable
Building a Drupal site
http://www.flickr.com/photos/toomuchdew/3792159077/
Building a Drupal site
● Create the content types you need
Blog, article, wiki, forum, polls, image, video, podcast, e-commerce... (be creative)
http://www.flickr.com/photos/georgivar/4795856532/
Building a Drupal site
● Enable the features you want
Comments, tags, voting/rating, location, translations, revisions, search...
http://www.flickr.com/photos/skip/42288941/
Building a Drupal site
Set how your content is displayed
Building a Drupal site
Thousands of free contributed modules
● Google Analytics
● Wysiwyg
● Captcha
● Calendar
● XML sitemap
● Fivestar
● ...
http://www.flickr.com/photos/kaptainkobold/1422600992/
The Drupal Community
http://www.flickr.com/photos/x-foto/4923221504/
The Drupal Community
http://webchick.net/node/80
“It’s really the Drupal community and not so much the software that makes the Drupal project what it
is. So fostering the Drupal community is actually more important than just managing the code base.” -
Dries Buytaert
Who uses Drupal?
http://buytaert.net/tag/drupal-sites
Try Drupal 7
● Download and Install Drupal 7
● Grab latest release http://drupal.org/project/drupal
● LAMP stack:
– Mac OS: http://www.mamp.info/
– Acquia Stack http://acquia.com/downloads
● Drupal Gardens: free Drupal 7 site http://www.drupalgardens.com/
Rich Snippets
Yahoo!
Bing
Why Structured Data in HTML
● Help machines extract relevant data from HTML
● Can make use of this data in amazing ways (e.g. enhanced search results)
Structured Data in HTML
● Add or alter HTML attributes
● Syntaxes
– Microformats (@class, @rel)
– RDFa (@property, @about, @typeof, …)
– Microdata (@itemscope, @itemtype, @itemprop, …)
– RDFa 1.1 & RDFa Lite
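To make the "help machines extract relevant data" point concrete, here is a minimal sketch of a consumer reading RDFa Lite attributes. The sample markup and the class name `PropertyExtractor` are illustrative assumptions, and the parser is deliberately simplified (it assumes property elements contain plain text only); a real RDFa processor handles far more.

```python
from html.parser import HTMLParser

# Hypothetical markup: an event described with RDFa Lite attributes.
SAMPLE_HTML = """
<div vocab="http://schema.org/" typeof="Event">
  <h1 property="name">SemTechBiz 2012</h1>
  <span property="location">San Francisco</span>
  <p>Unmarked text that a machine can safely ignore.</p>
</div>
"""


class PropertyExtractor(HTMLParser):
    """Collects the text content of every element carrying @property."""

    def __init__(self):
        super().__init__()
        self.item_type = None   # value of the first @typeof seen
        self.properties = {}    # property name -> text content
        self._current = None    # property whose text is being read

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "typeof" in attrs and self.item_type is None:
            self.item_type = attrs["typeof"]
        if "property" in attrs:
            self._current = attrs["property"]
            self.properties[self._current] = ""

    def handle_data(self, data):
        if self._current is not None:
            self.properties[self._current] += data

    def handle_endtag(self, tag):
        # Simplification: assumes property elements have no nested tags.
        self._current = None


def extract(html):
    """Return (type, {property: text}) for the first typed item found."""
    parser = PropertyExtractor()
    parser.feed(html)
    parser.close()
    return parser.item_type, parser.properties
```

Calling `extract(SAMPLE_HTML)` yields the item type and a dictionary of its marked-up properties, while the unmarked paragraph is skipped.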
Structured Data in HTML
● Evolution and cross-syntax influence
Schema.org
Schema.org
● Describe the type of your content (Person, Event, Recipe, Product, Book, Movie, etc.)
– 290 types and counting
● Each type has a set of properties
– Common properties: name, description, image, url
– Specific properties depending on the type (see type page on schema.org)
– 240 properties and counting
Credits: Dan Brickley
Schema.org
Schema.org module for Drupal
● UI instead of code
● Map your content types and fields to the schema.org terms
http://drupal.org/project/schemaorg
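The mapping idea can be sketched outside Drupal as follows. This is not the module's code: the mapping structure, the field names (`field_date`, `field_venue`) and the function `render_rdfa` are hypothetical, and only the schema.org terms (`Event`, `name`, `startDate`, `location`) come from the vocabulary itself.

```python
from html import escape

# Hypothetical mapping for an "event" content type; the field names are
# illustrative and not taken from the schema.org module.
EVENT_MAPPING = {
    "type": "Event",
    "fields": {
        "title": "name",
        "field_date": "startDate",
        "field_venue": "location",
    },
}


def render_rdfa(node, mapping):
    """Render the mapped fields of a node as RDFa Lite markup."""
    lines = [
        '<div vocab="http://schema.org/" typeof="%s">' % escape(mapping["type"])
    ]
    for field, term in mapping["fields"].items():
        if field in node:  # unmapped or empty fields produce no markup
            lines.append(
                '  <span property="%s">%s</span>'
                % (escape(term), escape(str(node[field])))
            )
    lines.append("</div>")
    return "\n".join(lines)
```

A node such as `{"title": "SemTechBiz tutorial", "field_date": "2012-06-04"}` would be rendered with `name` and `startDate` properties; fields without a mapping are left out of the markup.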
Example: Event
Rich Snippet testing tool
● http://www.google.com/webmasters/tools/richsnippets
Examples in the wild
● Events
– “force11 events”: http://goo.gl/VVhNM
– DrupalCon Munich: http://goo.gl/jgMvw
● Recipes
– “delicious lemon coconut squares”: http://goo.gl/ORdl1
– Apple pie with ingredients: http://goo.gl/wCO1w
Examples in the wild
● University of Waterloo
– School of Public Health and Health Systems launch: http://goo.gl/Df9hp
● Curling tournament calendar
– European Curling Championships 2012: http://goo.gl/YXgXl
– World Women’s Curling Championships 2013: http://goo.gl/BDNZW
Schema.org module
● http://drupal.org/project/schemaorg
– Download module (beta)
– Documentation on drupal.org
– Screencast + examples
Schema.org module
Play time!
http://www.google.com/webmasters/tools/richsnippets
Drupal 7 and RDF
History of RDF in Drupal
● rdf.php (2000, Dries)
● FOAF, vCard (2004, walkah)
● Relationship (2005, dman)
● Semantic Search (2006, hendler)
● RDF (2007, Arto)
● OpenCalais (2008, febbraro)
● RDF CCK (2008, scor)
Drupal 7 and RDF
● Drupal 7 core is RDFa enabled
● RDFa output by default on blogs, forums, comments, etc. using FOAF, SIOC, DC, SKOS
http://en.wikipedia.org/wiki/File:Oriente_Station_Lisboa_roof.jpg
Architecture
● User driven data model
● Content type => RDF class
● Field => RDF property
● Node => RDF resource
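The three correspondences above can be illustrated with a small sketch that turns a node into N-Triples. The mapping table, the choice of `sioc:Item` and Dublin Core terms, and the `/node/<nid>` URL pattern are assumptions made for illustration, not a dump of Drupal's actual default mappings.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical mapping: content type -> RDF class, field -> RDF property.
MAPPING = {
    "article": {
        "rdf_class": "http://rdfs.org/sioc/ns#Item",
        "fields": {
            "title": "http://purl.org/dc/terms/title",
            "created": "http://purl.org/dc/terms/created",
        },
    },
}


def escape_literal(value):
    """Escape backslashes, quotes and newlines for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def node_to_ntriples(node, base_url, mapping=MAPPING):
    """Node -> RDF resource, with one triple per mapped field."""
    bundle = mapping[node["type"]]
    subject = "<%s/node/%d>" % (base_url.rstrip("/"), node["nid"])
    triples = ["%s <%s> <%s> ." % (subject, RDF_TYPE, bundle["rdf_class"])]
    for field, prop in bundle["fields"].items():
        if field in node:
            triples.append(
                '%s <%s> "%s" .' % (subject, prop, escape_literal(str(node[field])))
            )
    return triples
```

Because the data model is user driven, adding a field to the mapping is all it takes for that field to appear as a new property on every resource of that type.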
Content types and Fields
Content types and Fields
Node
Drupal 7 and RDF
Drupal 7 and RDF
● Contributed module for more features
● RDF Extensions
– Serialization formats: RDF/XML, Turtle, N-Triples
● SPARQL
– Expose Drupal RDF data in a SPARQL Endpoint
● SPARQL Views
– Display remote RDF data in Drupal using SPARQL
● JSON-LD
– Expose Drupal RDF data as JSON-LD (CORS-enabled)
● Features and packaging
– Build distributions / deployment workflow
SPARQL Endpoint
http://drupal.org/project/sparql
● Indexing
SPARQL Endpoint
● Public endpoint available at /sparql
● http://prefix.cc/sioc,rnews.sparql
JSON-LD in Drupal
● Client side as well as server side friendly
● Browser Scripting:
– Native javascript format
– RDFa API in the DOM
● Data can be fetched from anywhere:
– Cross-Origin Resource Sharing (CORS) enabled
● Client can mash data
● http://drupal.org/project/jsonld
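As a rough illustration of why JSON-LD is easy to consume on either side, the sketch below builds a small document as a plain dictionary and resolves a prefixed term against its context. The document shape, the vocabulary choices and both function names are assumptions, not the output of the jsonld module, and `expand_term` is a simplified prefix lookup rather than the full JSON-LD expansion algorithm.

```python
import json


def node_to_jsonld(node, base_url):
    """Build a minimal JSON-LD document for a node (illustrative shape)."""
    return {
        "@context": {
            "dc": "http://purl.org/dc/terms/",
            "sioc": "http://rdfs.org/sioc/ns#",
        },
        "@id": "%s/node/%d" % (base_url.rstrip("/"), node["nid"]),
        "@type": "sioc:Item",
        "dc:title": node["title"],
    }


def expand_term(term, context):
    """Expand a compact IRI such as dc:title using the context prefixes."""
    prefix, sep, local = term.partition(":")
    if sep and prefix in context:
        return context[prefix] + local
    return term  # keywords like @id and unknown terms are left untouched


def serialize(doc):
    """Serialize the document; any JSON parser can read the result."""
    return json.dumps(doc, indent=2, sort_keys=True)
```

Since the result is ordinary JSON, a browser script or a server-side client can load it with a standard JSON parser and only resolve terms against `@context` when it needs full IRIs.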
JSON-LD plug
RDFa 1.1
● RDFa Lite
● RDFa 1.1 Full
● http://rdfa.info/play/
Demos
rNews / SPARQL
PREFIX dc: <http://purl.org/dc/terms/>
PREFIX rnews: <http://iptc.org/std/rNews/2011-10-07#>
SELECT * WHERE { ?s a rnews:Article ; dc:title ?title . }
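A client could send this query to a site's /sparql endpoint roughly as sketched here. The endpoint URL is a placeholder, the function names are hypothetical, and no request is actually made: the sketch only builds the GET URL defined by the SPARQL protocol (`query` parameter) and reads titles out of a response in the standard SPARQL JSON results format.

```python
import json
from urllib.parse import urlencode

QUERY = """\
PREFIX dc: <http://purl.org/dc/terms/>
PREFIX rnews: <http://iptc.org/std/rNews/2011-10-07#>
SELECT * WHERE { ?s a rnews:Article ; dc:title ?title . }
"""


def build_query_url(endpoint, query):
    """Return the GET URL for a SPARQL protocol query request."""
    return endpoint + "?" + urlencode({"query": query})


def parse_titles(results_json):
    """Extract ?title values from a SPARQL JSON results document."""
    results = json.loads(results_json)
    return [row["title"]["value"] for row in results["results"]["bindings"]]


# Placeholder endpoint; a real site would expose its own /sparql path.
url = build_query_url("http://example.org/sparql", QUERY)
```

To run the query for real, the URL would be fetched with an `Accept: application/sparql-results+json` header and the response body passed to `parse_titles`.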
Demos
● Occupy Directory
– http://directory.occupy.net/occupations
– JSON-LD: http://directory.occupy.net/node/19652.jsonld
● Federated General Assembly
– Drupal distribution for occupy movement
– http://wiki.occupy.net/wiki/Federated_General_Assembly
DOMEO: a web-based tool for semantic annotation of online
documents
As (biomedical) scientists…
• We deal with an increasing amount of digital resources (documents, images, videos, datasets, databases… )
• We commonly use annotation but…
–are we really efficient?
–can we leverage machine computation?
–can we share it easily with our colleagues?
–can we capitalize on the work of colleagues?
–can we integrate it with other resources?
Annotation Framework (Components)
• Annotation Ontology (AO): OWL vocabulary for representing and sharing annotation of digital resources and their fragments
– Website http://purl.org/ao/home
– Paper http://www.jbiomedsem.com/content/2/S2/S4
• DOMEO client: web application for producing and sharing manual, semi-automatic and automatic annotation
– Website http://annotationframework.org
– Paper http://www.jbiomedsem.com/content/3/S1/S1
Annotation of digital resources
http://antibodyregistry.org/antibody17/antibodyform.html?gui_type=advanced&ab_id=2266850
antibodyregistry.org
Visually and effectively annotate - better semantically annotate - any digital resource and resource fragment, while performing our regular browsing/reading activities
Leverage text mining and community curation
Run text mining and entity recognition algorithms on scientific documents and persist the results in a standard format
Benefit from crowdsourcing by supporting curation of manual and automatic annotation
… and more
• Efficiently search and reuse the annotation
– Semantic inference
• Subscribe to feeds related to topics of interest
–Proteins, Cells, Authors, Papers…
• Retrieve additional content (mashups)
–Entrez gene, UniProt, …
Semantic tagging through ontologies
Semantic Tag
http://purl.obolibrary.org/obo/PR_000004168
Label ‘amyloid beta A4 protein’
Exact synonyms ‘APP’, ‘amyloidogenic glycoprotein’, …
Related synonyms ‘A4’, ‘ABPP’,
Is a http://purl.obolibrary.org/obo/PR_000000001
Label ‘protein’
Definition ‘An amino acid chain that… ’
Source: Protein Ontology (PRO) https://pir5.georgetown.edu/wiki/PRO
APPs for the Semantic Resources Project, May 2010
Zooming in
Annotation Ontology (AO)
OWL vocabulary for representing and sharing annotation of digital resources and their fragments
Not only for biomedicine!
–Website http://purl.org/ao/home
–Paper http://www.jbiomedsem.com/content/2/S2/S4
A simplified view of AO
AO allows annotating:
Resources: Documents (HTML, PDF, Word, Excel), Images, Databases, Web Services... (and their fragments)
Specifying (or not) an:
Annotation Type: through one of the already available types (errata, highlight, qualifiers...) or the ones the users will define.
With (or without) a:
Topic: free text, structured text, URIs, RDF entities, RDF graphs, domain ontologies…
Tracing:
Provenance: who created what, when, with which software, with what expectations…
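The simplified view can be read as a small data structure: a required resource and provenance, plus an optional fragment, type and topic. The sketch below models it with Python dataclasses; all class and field names are hypothetical and are not the property names defined by the AO vocabulary.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Provenance:
    """Who created the annotation, with which software, and when."""
    created_by: str
    created_with: str
    created_on: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


@dataclass
class Annotation:
    resource: str                           # URL of the annotated resource
    provenance: Provenance                  # always traced
    fragment: Optional[str] = None          # e.g. the selected text
    annotation_type: Optional[str] = None   # e.g. "highlight", "qualifier"
    topic: Optional[str] = None             # free text or an entity URI

    def is_semantic_tag(self):
        """True when the topic is a URI rather than free text."""
        return self.topic is not None and self.topic.startswith(
            ("http://", "https://")
        )
```

An annotation whose topic is `http://purl.obolibrary.org/obo/PR_000004168` would count as a semantic tag in this sketch, whereas one with a free-text topic would not.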
Annotating a document
AlzSWAN: http://tinyurl.com/18r
Annotating a document fragment
Protein Ontology – PRO: http://purl.org/obo/owl/PRO
SWAN Ontology 2.0: http://code.google.com/p/swan-ontology/
HyQue triples
Workflows
Experiments
Annotation Ontology Network
The Living Document Project
Biotea
Open Annotation Community Group
Annotation Ontology is going to be replaced in our applications by the Open Annotation Model developed through the W3C Open Annotation Community Group
–Website http://www.w3.org/community/openannotation/
–Core Model http://www.openannotation.org/spec/core/
–Extensions http://www.openannotation.org/spec/extension/
DOMEO: Document Metadata Organizer
Semantic Tags or Qualifiers [1]
Semantic Tags or Qualifiers [2]
Semantic Tags or Qualifiers [3]
Domeo and the NCBO Annotator
http://www.bioontology.org/annotator-service
Domeo allows automatic/manual annotation with terms coming from selected ontologies managed by the BioPortal
Running NCBO Annotator
Additional text mining services will be listed here
NCBO Annotator Results in Domeo
List of recognized entities
Results Curation
Customizable
Cumulative Results Curation
One item only
All instances with the same text match
All instances independently from the text match
Serialization in AO/RDF (Share)
UIMA, Clerezza and AO
AO RDF
http://www.slideshare.net/paolociccarese/domeo-and-text-mining
AO RDF
Applications
Publishing Text Mining Results
Curated Text Mining Results
Evaluating Performance
Comparing Algorithms
Learning…
Combining Disparate Sources of Data
http://annotationframework.org/
Demos
● Domeo + Drupal
– Data mash up from independent, but related sources
Thanks!
● Stéphane Corlosquet
– @scorlosquet
– http://openspring.net/
● Paolo Ciccarese