A Non-Technical, Example-Driven Introduction to Linked Data

How Linked Data and Semantic Web Technologies Foster the Publication, Retrieval, Reuse, and Integration of Data. A Non-Technical, Example-Driven Introduction to Linked Data for the UCSB Library.

Page 1: A Non-Technical, Example-Driven Introduction to Linked Data

LINKED DATA ONTOLOGIES APPLICATIONS (AT STKO)

HOW LINKED DATA AND SEMANTIC WEB

TECHNOLOGIES FOSTER THE

PUBLICATION, RETRIEVAL, REUSE, AND

INTEGRATION OF DATA

A NON-TECHNICAL, EXAMPLE-DRIVEN INTRODUCTION

Krzysztof Janowicz, Grant McKenzie, and Yingjie Hu

STKO Lab

University of California, Santa Barbara, USA

UCSB Library, April 2014

AN EXAMPLE-DRIVEN INTRODUCTION TO LINKED DATA JANOWICZ, MCKENZIE, AND HU

Page 2: A Non-Technical, Example-Driven Introduction to Linked Data

WHAT IS LINKED DATA?

LINKING DATA AS NEXT-GENERATION INFRASTRUCTURE

Data silos (Web services, databases, Web pages) hinder ad-hoc combination, enforce data models, and limit re-usability.

Page 3: A Non-Technical, Example-Driven Introduction to Linked Data

FROM DOCUMENTS TO DATA

FROM LINKED DOCUMENTS TO LINKED DATA

Use Uniform Resource Identifiers (URIs) to identify entities, link them to other entities, encode information about these entities using the machine-understandable RDF, and make them available on the Web.
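As a rough illustration of this idea, the following Python sketch stores statements about entities as subject-predicate-object triples, with URIs acting as names. The DBpedia-style resource URIs and the `http://example.org/property/` namespace are assumptions made for this example; they are not taken from the slides, and real RDF data would use an established vocabulary.

```python
# A triple is (subject, predicate, object); URIs name entities and relations.
DBR = "http://dbpedia.org/resource/"
EX = "http://example.org/property/"  # hypothetical vocabulary namespace

triples = [
    (DBR + "Santa_Barbara,_California", EX + "locatedIn", DBR + "California"),
    (DBR + "Santa_Barbara,_California", EX + "label", "Santa Barbara"),
    (DBR + "California", EX + "locatedIn", DBR + "United_States"),
]


def describe(uri, graph):
    """Return all (predicate, object) pairs stated about `uri`."""
    return [(p, o) for s, p, o in graph if s == uri]


def linked_entities(uri, graph):
    """Return the objects of `uri` that are themselves URIs, i.e. links."""
    return [o for _, o in describe(uri, graph) if o.startswith("http://")]
```

Calling `linked_entities` on the Santa Barbara URI yields the California URI, which can in turn be described and followed further.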

Page 4: A Non-Technical, Example-Driven Introduction to Linked Data

FROM DOCUMENTS TO DATA

BERNERS-LEE’S LINKED DATA PRINCIPLES AND STARS

Four Rules for Linked Data

1. Use URIs as names for things.
2. Use HTTP URIs so that people can look up those names.
3. When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL).
4. Include links to other URIs, so that they can discover more things.

Is your Linked Open Data 5 Star?

★ Available on the web (whatever format) but with an open licence, to be Open Data
★★ Available as machine-readable structured data (e.g. Excel instead of image scan of a table)
★★★ As (2) plus non-proprietary format (e.g. CSV instead of Excel)
★★★★ All the above, plus: use open standards from W3C (RDF and SPARQL) to identify things, so that people can point at your stuff
★★★★★ All the above, plus: link your data to other people’s data to provide context

See http://www.w3.org/DesignIssues/LinkedData.html

Page 5: A Non-Technical, Example-Driven Introduction to Linked Data

EXPLORING LINKED DATA

EXPLORING LINKED DATA RELATED TO SANTA BARBARA

Follow-your-nose: Explore information related to Santa Barbara using Linked Data (DBpedia).

Page 6: A Non-Technical, Example-Driven Introduction to Linked Data

EXPLORING LINKED DATA

FINDING RELATIONS BETWEEN PLACES

Finding Relations between places (via people, events, objects,...).

Page 7: A Non-Technical, Example-Driven Introduction to Linked Data

EXPLORING LINKED DATA

REASONATOR: BROWSE WIKIDATA

’Wikidata is a free knowledge base that can be read and edited by humans and machines alike.’

Page 8: A Non-Technical, Example-Driven Introduction to Linked Data

EXPLORING LINKED DATA

SEARCHING THE WEB OF DOCUMENTS

This is still how most (Web) search works today. 20 million results, no hit.

Page 9: A Non-Technical, Example-Driven Introduction to Linked Data

EXPLORING LINKED DATA

SEARCHING THE WEB OF (LINKED) DATA

Populated places have a population, are located, occupy a certain area, ...

Page 10: A Non-Technical, Example-Driven Introduction to Linked Data

EXPLORING LINKED DATA

GOOGLE’S KNOWLEDGE GRAPH

Page 11: A Non-Technical, Example-Driven Introduction to Linked Data

EXPLORING LINKED DATA

GOOGLE’S KNOWLEDGE GRAPH

Google’s Web search is changing towards query answering.

Page 12: A Non-Technical, Example-Driven Introduction to Linked Data

QUERYING AND INTEGRATION OF LINKED DATA

THE GLOBAL GRAPH OF LINKED DATA

The examples before involved only one data set, but there is much more...

Page 13: A Non-Technical, Example-Driven Introduction to Linked Data

QUERYING AND INTEGRATION OF LINKED DATA

THE GLOBAL GRAPH OF LINKED DATA

Page 14: A Non-Technical, Example-Driven Introduction to Linked Data

QUERYING AND INTEGRATION OF LINKED DATA

INTEGRATION AND QUERY FEDERATION

Integration works by searching for equivalent classes and/or the same features across data sets. This requires ontology matching and alignment.
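A minimal Python sketch of what such an alignment buys us is given below. It assumes two toy data sets that type their entities with different class names, plus a hand-written equivalence table standing in for the output of an ontology matching step. All identifiers (`a:City`, `b:Municipality`, and so on) are invented for illustration.

```python
# Two toy data sets typing their entities with different vocabularies.
# All names are invented for illustration.
dataset_a = [("a:Santa_Barbara", "a:City"), ("a:Pacific", "a:Ocean")]
dataset_b = [("b:Goleta", "b:Municipality"), ("b:Atlantic", "b:Sea")]

# Hypothetical alignment result: class in B -> equivalent class in A.
EQUIVALENT_CLASS = {"b:Municipality": "a:City"}


def instances_of(target_class, datasets, alignment):
    """Collect entities typed with `target_class` or an aligned equivalent."""
    found = []
    for dataset in datasets:
        for entity, cls in dataset:
            # Map the class through the alignment; unaligned classes stay as-is.
            if alignment.get(cls, cls) == target_class:
                found.append(entity)
    return found


cities = instances_of("a:City", [dataset_a, dataset_b], EQUIVALENT_CLASS)
```

Without the alignment table, a query for `a:City` would only see the first data set; with it, entities from both sources are returned by a single query.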

Page 15: A Non-Technical, Example-Driven Introduction to Linked Data

ONTOLOGIES IN COMPUTER/INFORMATION SCIENCE

ONTOLOGIES IN COMPUTER/INFORMATION SCIENCE

An ontology is an explicit specification of a conceptualization used to achieve a shared understanding of a particular domain of interest (adapted from Gruber (1993)).

Page 16: A Non-Technical, Example-Driven Introduction to Linked Data

ENTITIES, CONCEPTS AND CATEGORIES

ENTITIES, CONCEPTS AND CATEGORIES

An entity is an individual (real world) object.

Page 17: A Non-Technical, Example-Driven Introduction to Linked Data

ENTITIES, CONCEPTS AND CATEGORIES

ENTITIES, CONCEPTS AND CATEGORIES

A concept/class is a (mental) template used for grouping entities.

Page 18: A Non-Technical, Example-Driven Introduction to Linked Data

ENTITIES, CONCEPTS AND CATEGORIES

ENTITIES, CONCEPTS AND CATEGORIES

A category is the set of entities grouped using a particular concept.

Page 19: A Non-Technical, Example-Driven Introduction to Linked Data

SUBSUMPTION AND SIMILARITY

SUBSUMPTION

Concepts can be organized within hierarchical structures. [This subClassOf relation is transitive.]
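Transitivity means that if City is a subclass of PopulatedPlace and PopulatedPlace is a subclass of Place, then City is also a subclass of Place. The Python sketch below walks such a chain. The class names are made up for this example, and the sketch assumes each concept has at most one direct superclass, which real ontologies do not require.

```python
# Direct subClassOf statements: concept -> its direct superclass.
# Hypothetical hierarchy, assuming a single parent per concept.
SUBCLASS_OF = {
    "City": "PopulatedPlace",
    "PopulatedPlace": "Place",
    "Place": "Thing",
}


def superclasses(concept, hierarchy):
    """Return every superclass reachable by following subClassOf links."""
    result = []
    seen = {concept}
    while concept in hierarchy:
        concept = hierarchy[concept]
        if concept in seen:  # guard against accidental cycles
            break
        seen.add(concept)
        result.append(concept)
    return result


def is_subclass_of(sub, sup, hierarchy):
    """True if `sup` is a direct or inferred superclass of `sub`."""
    return sup in superclasses(sub, hierarchy)
```

Here `is_subclass_of("City", "Place", SUBCLASS_OF)` holds even though that statement was never written down explicitly.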

Page 20: A Non-Technical, Example-Driven Introduction to Linked Data

SUBSUMPTION AND SIMILARITY

SIMILARITY

Some concepts (and entities) are more similar than others.

Page 21: A Non-Technical, Example-Driven Introduction to Linked Data

ONTOLOGY LANGUAGES

UNDER THE HOOD OF A MAP LEGEND ONTOLOGY

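\begin{align}
N_C &= \{\mathit{LegendItem}, \mathit{Symbol}, \mathit{Label}, \mathit{FeatureType}\} \\
N_R &= \{\mathit{consistsOf}, \mathit{isLabelFor}, \mathit{isLabelOf}, \mathit{SymbolizedBy}\} \\
\top &\sqsubseteq \neg\exists N.\top \\
\mathit{LegendItem} &\sqsubseteq \exists \mathit{consistsOf}.\mathit{Symbol} \sqcup \exists \mathit{consistsOf}.\mathit{LegendItem} \\
\mathit{Label} &\sqsubseteq \exists \mathit{SymbolizedBy}.\mathit{Symbol} \sqcap \forall \mathit{SymbolizedBy}.\mathit{Symbol} \\
\top &\sqsubseteq\ \leq 1\, \mathit{isLabelFor}.\top \\
\top &\sqsubseteq\ \leq 1\, \mathit{isLabelOf}.\top \\
\top &\sqsubseteq\ \leq 1\, \mathit{SymbolizedBy}.\top \\
\mathit{Label} &\sqsubseteq \exists \mathit{isLabelFor}.\mathit{FeatureType} \\
\mathit{Label} \sqcap \mathit{Symbol} &\sqsubseteq \bot \quad \text{(also for Symbol, Label, FeatureType, LegendItem)} \\
\mathit{SymbolizedBy}^{-} \circ \mathit{isLabelFor} &\sqsubseteq \mathit{depictedBy}^{-} \\
\neg\exists \mathit{consistsOf}^{-} &\sqsubseteq \mathit{Legend} \\
&\ \ \vdots
\end{align}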

Page 22: A Non-Technical, Example-Driven Introduction to Linked Data

SEMANTIC INTEROPERABILITY

WHY NOT JUST STANDARDIZE MEANING? (CITY OR TOWN?)

Page 23: A Non-Technical, Example-Driven Introduction to Linked Data

SEMANTIC INTEROPERABILITY

WHY NOT JUST STANDARDIZE MEANING? (CITY OR TOWN?)

California: City ≡ Town

Utah: Town ≡ < (population, 1000)

Pennsylvania: Town ≡ {Bloomsburg}
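The three definitions above can be read as three different membership tests for the same word. The Python sketch below makes this explicit with a toy list of places. The population figures are rough illustrative values, "Smallville" is an invented place, and the California rule is simplified to "every place counts", which is an assumption made only to keep the example short.

```python
# Toy places; population figures are rough, illustrative values only.
places = [
    {"name": "Bloomsburg", "population": 14000},
    {"name": "Smallville", "population": 800},  # invented place
]


def utah_town(place):
    # Utah: a town is a place with fewer than 1000 inhabitants.
    return place["population"] < 1000


def pennsylvania_town(place):
    # Pennsylvania: Bloomsburg is the only town.
    return place["name"] == "Bloomsburg"


def california_town(place):
    # California: City and Town are equivalent; simplified here so that
    # every place in the toy list qualifies.
    return True


def category(concept, entities):
    """The category is the set of entities that satisfy the concept."""
    return {e["name"] for e in entities if concept(e)}
```

Applying `category` with each of the three concepts to the same list gives three different sets, which is why a single standardized meaning of "town" is hard to impose.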

Page 24: A Non-Technical, Example-Driven Introduction to Linked Data

SEMANTIC INTEROPERABILITY

SEMANTIC INTEROPERABILITY – MEANINGFUL LINKS

Unfortunately, our data sources use exactly the same terminology (e.g., connection) to talk about totally different and contradicting facts (e.g., separation).

While we can still syntactically integrate and reuse information, the results may be misleading or even meaningless.

We need heterogeneity-preserving semantic interoperability methods.

Page 26: A Non-Technical, Example-Driven Introduction to Linked Data

LINKED DATA AND VOCABULARIES

ONTOLOGIES TO MAKE YOUR DATA MORE USABLE

Five Stars of Linked Data Vocabulary Use

(no star) Linked Data without any vocabulary.
★ There is dereferencable human-readable information about the used vocabulary.
★★ The information is available as machine-readable explicit axiomatization of the vocabulary.
★★★ The vocabulary is linked to other vocabularies.
★★★★ Metadata about the vocabulary is available (in a dereferencable and machine-readable form).
★★★★★ The vocabulary is linked to by other vocabularies.

See http://semantic-web-journal.net/content/five-stars-linked-data-vocabulary-use

Page 27: A Non-Technical, Example-Driven Introduction to Linked Data

LINKED DATA AND MAPS

A TINY ONTOLOGY FOR ESRI’S ARCGIS ONLINE

This fragment of the ontology developed for ArcGIS Online defines the relations (e.g., isOwnerOf) between items (e.g., map services), users, and user groups.

Page 28: A Non-Technical, Example-Driven Introduction to Linked Data

LINKED DATA AND MAPS

WHICH BASEMAP IS MOST POPULAR (BASED ON VIEWS)?

SELECT DISTINCT ?baseMap ?numViews
WHERE {
  ?baseMap arcgis:isBaseMapOf ?item .
  ?baseMap arcgis:numViews ?numViews
}
ORDER BY DESC(?numViews)
LIMIT 10

Listing 1: SPARQL query for most popular maps based on views.

Page 29: A Non-Technical, Example-Driven Introduction to Linked Data

LINKED DATA AND MAPS

WHICH BASEMAP IS MOST POPULAR (BASED ON USAGE)?

SELECT ?baseMap (COUNT(DISTINCT ?item) AS ?usedTimes)
WHERE { ?baseMap arcgis:isBaseMapOf ?item }
GROUP BY ?baseMap
ORDER BY DESC(?usedTimes)
LIMIT 10

Listing 2: SPARQL query for most popular maps based on usage.
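For readers unfamiliar with SPARQL aggregates, the grouping and counting in Listing 2 can be mimicked in plain Python. The sketch below is only an analogy: the pair list stands in for the `?baseMap arcgis:isBaseMapOf ?item` statements, and the base map and item identifiers are invented for illustration.

```python
from collections import defaultdict

# (baseMap, item) pairs standing in for `?baseMap arcgis:isBaseMapOf ?item`.
# Identifiers are invented for illustration.
is_base_map_of = [
    ("Topographic", "item1"),
    ("Topographic", "item2"),
    ("Topographic", "item2"),  # duplicate statement, counted once
    ("Imagery", "item3"),
]


def most_used_base_maps(pairs, limit=10):
    """Count distinct items per base map, most used first."""
    items_per_map = defaultdict(set)  # GROUP BY ?baseMap
    for base_map, item in pairs:
        items_per_map[base_map].add(item)  # the set gives DISTINCT ?item
    counts = [(m, len(items)) for m, items in items_per_map.items()]
    counts.sort(key=lambda mc: mc[1], reverse=True)  # ORDER BY DESC
    return counts[:limit]  # LIMIT
```

The difference from Listing 1 is that popularity is derived here by counting usage links rather than by reading a stored view counter.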

Page 30: A Non-Technical, Example-Driven Introduction to Linked Data

LINKED DATA AND MAPS

LINKED DATA PORTAL FOR ARCGIS ONLINE (DEMO)

Semantics-enabled and Linked Data-driven similarity search interface for ArcGIS Online.

http://stko-exp.geog.ucsb.edu/linkedarcgis/

Page 31: A Non-Technical, Example-Driven Introduction to Linked Data

LINKED DATA AND GAZETTEERS

UCSB’S ALEXANDRIA DIGITAL LIBRARY GAZETTEER (2006)

Interface of the original Alexandria Digital Library Gazetteer.

Page 32: A Non-Technical, Example-Driven Introduction to Linked Data

LINKED DATA AND GAZETTEERS

ONTOLOGY-POWERED, LINKED DATA-DRIVEN ADL GAZETTEER (DEMO)

5 million places merged from multiple authoritative data sources. Contains multiple alternative (e.g., historic) names, provenance information, 1200 geographic feature classes, polygon data, GeoSPARQL endpoint, etc. [Still a lot of work to be done.]

Page 33: A Non-Technical, Example-Driven Introduction to Linked Data

LINKED DATA AND GAZETTEERS

WHY NOT JUST USE GEONAMES?

A SPARQL query for people living near the Gulf of Guinea will return about 7 billion! See http://stko.geog.ucsb.edu/location_linked_data for more examples.

Page 34: A Non-Technical, Example-Driven Introduction to Linked Data

LINKED DATA AND SCIENTOMETRICS

DEKDIV: EXPLORING BIBLIOGRAPHIC LINKED DATA (DEMO)

System: http://stko-exp.geog.ucsb.edu/lak/; paper: http://bit.ly/1dW2NER

Page 35: A Non-Technical, Example-Driven Introduction to Linked Data

LINKED DATA AND SCIENTOMETRICS

SEMANTIC WEB JOURNAL: LINKED SCIENTOMETRICS

System: http://semantic-web-journal.com/SWJPortal/; paper: http://bit.ly/1ilwbRU