Post on 10-May-2015
EarthCube Building Blocks:
OceanLink
Leveraging Semantics and Linked Data for
Geoscience Data Sharing and Discovery
Overview
Goal
• Enable discovery of geoscience data and knowledge, and ultimately, integration
Strategy
• Publish content from existing network of repositories as Linked Open Data (LOD)
• Enable horizontal semantic integration
• Provide tools + services useful to working scientists
Domain
Ocean Science
• Research vessels collect data from the solid earth, water column, atmosphere
• Many repositories already interoperate
• Approach is extensible to other geo domains
U.S. academic oceanographic research fleet (above), and recent expedition tracks (left)
Project Team
• Lamont-Doherty Earth Observatory: Robert Arko, Suzanne Carbotte
• Marymount University: Tom Narock (lead)
• University of Maryland, Baltimore County: Tim Finin
• Woods Hole Oceanographic Institution: Cynthia Chandler, Lisa Raymond, Adam Shepherd, Peter Wiebe
• Wright State University: Pascal Hitzler, Michelle Cheatham, Adila Krisnadhi
Collections
• Biological & Chemical Oceanography Data Management Office (BCO-DMO)
• Rolling Deck to Repository (R2R)
  – cruise catalog + underway environmental sensor data
• Marine Biological Laboratory / Woods Hole Oceanographic Institution (MBLWHOI) Library
  – published articles, theses, tech reports, datasets
• AGU meeting abstracts
• NSF funding award abstracts
ODPs
Ontology Design Patterns
• Core set of conceptual primitives from Ocean Science: Vessel, Cruise, Instrument, Dataset, Person, Organization, etc.
• Reuse standard vocabularies where they exist (DCAT, FOAF, PROV)
• Maximize reusability, minimize commitment
ODPs (cont.)
• Patterns published as OWL files with embedded axioms and local vocabularies, e.g.
  – A Cruise must have a Vessel
  – A Cruise may have a Person in the Role of Chief Scientist
• Leverage existing alignment among repositories that use, e.g., the NERC Vocabulary Server
• Inference to find relationships among cruises, datasets, people, publications, etc.
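The two example axioms above can be illustrated outside OWL with a small check over subject-predicate-object triples. This is a minimal sketch, not the project's OWL patterns: the property names `hasVessel` and `hasChiefScientist` and the identifiers `cruise:X01` and `person:1` are hypothetical. An OWL reasoner works under the open-world assumption, whereas this check simply reports cruises for which no vessel is recorded.

```python
# Toy triples; property names and most identifiers are made up for illustration.
TRIPLES = {
    ("cruise:AT22", "rdf:type", "Cruise"),
    ("cruise:AT22", "hasVessel", "vessel:Atlantis"),
    ("cruise:AT22", "hasChiefScientist", "person:1"),
    ("cruise:X01", "rdf:type", "Cruise"),
}


def objects(triples, subject, predicate):
    """Return all objects o such that (subject, predicate, o) is in triples."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def cruises_missing_vessel(triples):
    """'Cruise must have a Vessel': list cruises with no hasVessel triple."""
    cruises = {s for s, p, o in triples if p == "rdf:type" and o == "Cruise"}
    return sorted(c for c in cruises if not objects(triples, c, "hasVessel"))


def chief_scientists(triples, cruise):
    """'Cruise may have a Chief Scientist': optional, so possibly empty."""
    return sorted(objects(triples, cruise, "hasChiefScientist"))


print(cruises_missing_vessel(TRIPLES))
print(chief_scientists(TRIPLES, "cruise:AT22"))
print(chief_scientists(TRIPLES, "cruise:X01"))
```

The "must" axiom becomes a check that can fail, while the "may" axiom only ever returns a possibly empty list.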
Work Plan
1. Model, align, and run inference over existing LOD collections (BCO-DMO + R2R)
  • Develop use cases, e.g. “find publications related to cruises at the Bermuda Rise that produced CTD profiles and/or seafloor mapping data”
  • Develop ODPs
  • Map existing collections to ODPs
2. Publish LOD for other collections (Library, AGU, NSF) and map to ODPs
3. Prototype end-user tools and services
  • Search/browse across federated LOD collections
  • Edit ontologies
  • Annotate LOD resources, incl. provenance
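The Bermuda Rise use case from the work plan can be sketched as a join over a handful of triples. Everything in the data is hypothetical (the cruise and publication identifiers, and the property names `atFeature`, `producedDataType`, `relatedToCruise`); the point is only the shape of the query once cruises, datasets and publications share one graph.

```python
# Toy linked-data triples; identifiers and property names are hypothetical.
TRIPLES = [
    ("cruise:A", "atFeature", "Bermuda Rise"),
    ("cruise:A", "producedDataType", "CTD profile"),
    ("cruise:B", "atFeature", "Bermuda Rise"),
    ("cruise:B", "producedDataType", "gravity profile"),
    ("cruise:C", "atFeature", "Scotian Shelf"),
    ("cruise:C", "producedDataType", "seafloor mapping"),
    ("pub:1", "relatedToCruise", "cruise:A"),
    ("pub:2", "relatedToCruise", "cruise:B"),
    ("pub:3", "relatedToCruise", "cruise:C"),
]

WANTED_TYPES = {"CTD profile", "seafloor mapping"}


def subjects(triples, predicate, obj):
    """Return all subjects s such that (s, predicate, obj) is in triples."""
    return {s for s, p, o in triples if p == predicate and o == obj}


def publications_for(triples, feature, data_types):
    """Publications linked to cruises at `feature` with any of `data_types`."""
    at_feature = subjects(triples, "atFeature", feature)
    with_data = set()
    for data_type in data_types:
        with_data |= subjects(triples, "producedDataType", data_type)
    cruises = at_feature & with_data
    return sorted(s for s, p, o in triples
                  if p == "relatedToCruise" and o in cruises)


print(publications_for(TRIPLES, "Bermuda Rise", WANTED_TYPES))
```

Only `cruise:A` is both at the feature and a producer of a wanted data type, so only its publication is returned.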
Initial Results
“An Ontology Pattern for Oceanographic Cruises” (Krisnadhi et al.)
Technical Report and draft set of ODPs
Reuses existing patterns, including
• Semantic Trajectory (Janowicz et al.)
• Information Object
• Simple Event Model
to model a Cruise and the ship’s track
Figure: R/V Atlantis cruise AT22 (Scotian Shelf Survey, August 2012). Basemap: GMRT
Lessons
• Recurrent themes in EarthCube Workshop Reports, e.g.
  – Data are still difficult to discover and access
  – Data attribution and citation are critical
  – Reuse of data is still hampered by the need for implicit understanding
• Collaboration between Geoscience and Computer Science works best with Use Cases
• In-person working meetings are key to initial progress
  – Oct. 2013, Woods Hole
  – Nov. 2013, Baltimore
  – Jan. 2014, Washington
  – (probably more)
Acknowledgements
“EAGER: Collaborative Research: EarthCube Building Blocks, Leveraging Semantics and Linked Data for Geoscience Data Sharing and Discovery”
NSF Funding Awards:
• ICER 13-54990 (LDEO)
• ICER 13-54693 (UMBC)
• ICER 13-54778 (WSU)
• ICER 13-54107 (WHOI)
September 15, 2013 - August 31, 2014
Thank you.
www.oceanlink.org
March 2014 – Montana State University – Pascal Hitzler
Metadata Semantics: What Semantic Web technologies can contribute to scientific data and information sharing and discovery
Pascal Hitzler
DaSe Lab for Data Semantics, Wright State University
http://www.pascal-hitzler.de
Distributing scientific information
• Since the rise of the World Wide Web, the role of publishing houses and scientific libraries is changing.
• Scientific publishing houses are redefining their roles and are investigating new revenue models.
• What exactly is the role of libraries?
• What will the role of libraries be in, say, 20 years?
Library discovery issues
• “I’m looking for an easy but introductory text on discrete mathematics suitable for computer scientists, with high quality in the mathematical formalization and notation, and including (besides the usual stuff) at least brief treatments of Russell’s paradox and of countable versus uncountable sets, e.g. uncountability of the real numbers.”
• “I’m looking for a textbook for a second-year introductory class on logic for computer scientists. Formal treatment of mathematics, tableaux algorithms for propositional and predicate logic, and preferably some coverage of datalog.”
Semantic Web journal
• EiCs: Pascal Hitzler, Krzysztof Janowicz
• Established 2010. Going strong.
• We very much welcome contributions at the “rim” of traditional Semantic Web research – e.g., work which is strongly inspired by a different field.
• Non-standard (open & transparent) review process.
• http://www.semantic-web-journal.net/
[Figure-only slides: Semantic Web journal; Summary Statistics; Citation Maps; Collaboration Networks; Topic Trends]
Publications analysis
• Provide analysis of citations, topic trends, research networks, etc., which can be obtained from (suitable!) metadata.
• Establish the social, economic and computational infrastructure to provide such data: open access, legal reusability of text and data, rich metadata (citations and beyond)
Data Discovery
Scenario
Determine if a GMRT grid contains high-resolution data from a ship’s multibeam sonar in the proximity of a specified physiographic feature. Return the list of ship expeditions that contributed high-resolution data to those grid cells. For these expeditions, determine which, if any, are found in the R2R catalog and contain quality-controlled geophysical (gravity/magnetics) profiles along the same ship track. Further determine which investigators are linked to those expeditions; which expeditions and investigators are linked to journal publications and/or meeting abstracts that contain thematic keywords pertaining to the physiographic feature; and what other data are available from the same expeditions in other repositories such as BCO-DMO.
“Inside” and beyond the publications
• Make paper contents available through rich metadata.
• Combine papers with data and datasets, and with information from “outside” the publishing process proper, such as funding awards, geographical information, affiliations, etc.
• More importantly, help in providing a social, economic and technological infrastructure where such information is provided to scientists and students.
EarthCube
EarthCube: Developing a Community-Driven Data and Knowledge Environment for the Geosciences
“concepts and approaches to create integrated data management infrastructures across the Geosciences.”
“EarthCube aims to create a well-connected and facile environment to share data and knowledge in an open, transparent, and inclusive manner, thus accelerating our ability to understand and predict the Earth system.”
OceanLink
• An EarthCube Building Block
• Integrating ocean science repositories BCO-DMO and R2R, as well as datasets from the WHOI Library, AGU abstracts, NSF projects.
• Demonstrable added value (faceted integrated search).
• Key: extensible architecture that has the potential to grow to EarthCube size
Integration approach
• Well-established:
  – using controlled vocabularies
  – which are standardized through a social process
• How many vocabularies do you need to
  – answer circumstantial queries?
  – cover all scientific paper contents?
  – even just to cover the earth sciences?
• What do you do if scientific notions or perspectives change?
E.g., “Event”
[Diagram: Event → occursAtTime → xsd:dateTime; Event → occursAtPlace → xsd:string]
Better Event (more general)
[Diagram: Event → occursAtTime → <TemporalThing>; Event → occursAtPlace → <SpatialThing>]
But what about events taking place in Second Life?
Perhaps even …
[Diagram: Event → occursAtTime → <TemporalThing>; Event → occursAtPlace → <Place>; Event → hasParticipant → <Agent>]
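The progression across the three Event diagrams above, from concrete datatypes to abstract placeholders, can be sketched with Python types. This is only an analogy to the OWL modeling, under the assumption that `<TemporalThing>`, `<Place>` and `<Agent>` act as extension points; the concrete subclasses (`Instant`, `GeoPoint`, `VirtualPlace`, `Person`) and all sample values are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


class TemporalThing:
    """Anything that can answer 'when'; deliberately left abstract."""


class Place:
    """Anything that can answer 'where'; not necessarily physical space."""


class Agent:
    """Anything that can take part in an event."""


@dataclass
class Instant(TemporalThing):
    at: datetime


@dataclass
class GeoPoint(Place):
    lat: float
    lon: float


@dataclass
class VirtualPlace(Place):
    # Hypothetical: covers the 'Second Life' objection from the slide.
    world: str
    region: str


@dataclass
class Person(Agent):
    name: str


@dataclass
class Event:
    occurs_at_time: TemporalThing
    occurs_at_place: Place
    has_participant: List[Agent] = field(default_factory=list)


# Illustrative values only, not real survey data.
survey = Event(Instant(datetime(2012, 8, 1)), GeoPoint(43.0, -62.0))
meetup = Event(Instant(datetime(2014, 3, 1)),
               VirtualPlace("Second Life", "some region"),
               [Person("Jane Doe")])
print(type(survey.occurs_at_place).__name__)
print(type(meetup.occurs_at_place).__name__)
```

Because `Event` commits only to "some Place", a virtual location fits without changing the pattern, which a fixed `xsd:string` or a purely spatial type would not allow as cleanly.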
Different representations
[Diagram, R2R: person/101396 → type → foaf:Person; person/101396 → name → “Smith, John”]
[Diagram, BCO-DMO: Person_752 → type → foaf:Person; Person_752 → name → “John Smith”; Person_752 → familyName → “Smith”; Person_752 → givenName → “John”]
What about other countries?
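The mismatch between the two person records can be made concrete with a small normalization sketch. The dict layout and the functions `name_key` and `may_be_same_person` are assumptions for illustration, not part of either repository's software. The sketch also shows why the "other countries" question matters: a bare "John Smith" cannot be split reliably without knowing the naming convention.

```python
# Records shaped like the two diagrams; the dict layout is an assumption.
r2r_person = {"id": "person/101396", "type": "foaf:Person",
              "name": "Smith, John"}
bcodmo_person = {"id": "Person_752", "type": "foaf:Person",
                 "name": "John Smith",
                 "familyName": "Smith", "givenName": "John"}


def name_key(record):
    """Return a (family, given) key, lower-cased, or None if undecidable."""
    if "familyName" in record and "givenName" in record:
        family, given = record["familyName"], record["givenName"]
    elif "," in record.get("name", ""):
        # Assumes the 'Family, Given' convention seen in the R2R example.
        family, given = record["name"].split(",", 1)
    else:
        # A bare 'John Smith' is ambiguous across naming cultures.
        return None
    return (family.strip().lower(), given.strip().lower())


def may_be_same_person(a, b):
    """Candidate match only; equal names do not prove equal identity."""
    key_a, key_b = name_key(a), name_key(b)
    return key_a is not None and key_a == key_b


print(may_be_same_person(r2r_person, bcodmo_person))
```

A match here is only a candidate for alignment; two different people can share a name, so a real federation step would need further evidence.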
Semantic Web
• Research field in computer science.
• Took off around the year 2000.
• Significant funding, initially from DARPA, then large-scale in the EU.
• In the meantime, a large international effort, with significant investment by funding agencies and companies.
• The Semantic Web vision is about seamless integration of data, knowledge, and services. It is not restricted to the WWW.
• The Semantic Web approach has (whatever type of) formal knowledge representation as a key ingredient.
Knowledge Representation
• Vocabularies on steroids.
  – Complex relationships between notions are part of the formal and machine-processable vocabulary definitions, e.g. “Every cruise must have a chief scientist who is PI on one of the research awards which pay for the expenses of the cruise.”
• Standardization of languages for defining vocabularies, e.g., the Web Ontology Language OWL.
  – Rather than standardizing vocabularies themselves.
  – Requires establishing best practices for defining and sharing vocabularies.
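One possible first-order reading of the example constraint is sketched below. The predicate names (chiefScientistOf, piOf, funds) are hypothetical and chosen only for this illustration; the slides do not give a formalization.

```latex
\forall c\,\Bigl(\mathrm{Cruise}(c) \rightarrow
  \exists p\,\exists a\,\bigl(\mathrm{chiefScientistOf}(p, c)
  \land \mathrm{Award}(a)
  \land \mathrm{piOf}(p, a)
  \land \mathrm{funds}(a, c)\bigr)\Bigr)
```

The cruise variable c occurs twice on the right-hand side: the award must fund the same cruise that the person leads. A flat list of controlled terms cannot state a relationship of this kind.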
Libraries?
• Libraries could again be at the forefront of being providers for scientific information.
• Trends go towards integrated information spaces with a plethora of differing and heterogeneous information sources.
• How to organize this information space conceptually, technologically, and socially, is a key quest in the Big Data age.
Thanks!
OceanLink Collaborators
• Robert Arko, Columbia University
• Suzanne Carbotte, Columbia University
• Cynthia Chandler, Woods Hole Oceanographic Institution
• Michelle Cheatham, Wright State University
• Timothy Finin, University of Maryland, Baltimore County
• Pascal Hitzler, Wright State University
• Krzysztof Janowicz, University of California, Santa Barbara
• Adila Krisnadhi, Wright State University
• Thomas Narock, Marymount University
• Lisa Raymond, Woods Hole Oceanographic Institution
• Adam Shepherd, Woods Hole Oceanographic Institution
• Peter Wiebe, Woods Hole Oceanographic Institution
Some of the presented work is part of the NSF OceanLink project: EarthCube Building Blocks, Leveraging Semantics and Linked Data for Geoscience Data Sharing and Discovery
References
• Pascal Hitzler, Frank van Harmelen, A reasonable Semantic Web. Semantic Web 1 (1-2), 39-44, 2010.
• Prateek Jain, Pascal Hitzler, Peter Z. Yeh, Kunal Verma, Amit P. Sheth, Linked Data is Merely More Data. In: Dan Brickley, Vinay K. Chaudhri, Harry Halpin, Deborah McGuinness: Linked Data Meets Artificial Intelligence. Technical Report SS-10-07, AAAI Press, Menlo Park, California, 2010, pp. 82-86. ISBN 978-1-57735-461-1. Proceedings of LinkedAI at the AAAI Spring Symposium, March 2010.
• Pascal Hitzler, Krzysztof Janowicz, What’s Wrong with Linked Data? http://blog.semantic-web.at/2012/08/09/whats-wrong-with-linked-data/ , August 2012.
• Pascal Hitzler, Markus Krötzsch, Sebastian Rudolph, Foundations of Semantic Web Technologies. Chapman and Hall/CRC Press, 2009.
• Pascal Hitzler, Krzysztof Janowicz, Linked Data, Big Data, and the 4th Paradigm. Semantic Web 4 (3), 2013, 233-235.
• Krzysztof Janowicz, Pascal Hitzler, The Digital Earth as Knowledge Engine. Semantic Web 3 (3), 213-221, 2012.
• Gary Berg-Cross, Isabel Cruz, Mike Dean, Tim Finin, Mark Gahegan, Pascal Hitzler, Hook Hua, Krzysztof Janowicz, Naicong Li, Philip Murphy, Bryce Nordgren, Leo Obrst, Mark Schildhauer, Amit Sheth, Krishna Sinha, Anne Thessen, Nancy Wiegand, Ilya Zaslavsky, Semantics and Ontologies for EarthCube. In: K. Janowicz, C. Kessler, T. Kauppinen, D. Kolas, S. Scheider (eds.), Workshop on GIScience in the Big Data Age, In conjunction with the seventh International Conference on Geographic Information Science 2012 (GIScience 2012), Columbus, Ohio, USA. September 18th, 2012. Proceedings.
• Krzysztof Janowicz, Pascal Hitzler, Thoughts on the Complex Relation Between Linked Data, Semantic Annotations, and Ontologies. In: Paul N. Bennett, Evgeniy Gabrilovich, Jaap Kamps, Jussi Karlgren (eds.), Proceedings of the 6th International Workshop on Exploiting Semantic Annotation in Information Retrieval, ESAIR 2013, ACM, San Francisco, 2013, pp. 41-44.
• Prateek Jain, Pascal Hitzler, Amit P. Sheth, Kunal Verma, Peter Z. Yeh, Ontology Alignment for Linked Open Data. In P. Patel-Schneider, Y. Pan, P. Hitzler, P. Mika, L. Zhang, J. Pan, I. Horrocks, B. Glimm (eds.), The Semantic Web - ISWC 2010. 9th International Semantic Web Conference, ISWC 2010, Shanghai, China, November 7-11, 2010, Revised Selected Papers, Part I. Lecture Notes in Computer Science Vol. 6496. Springer, Berlin, 2010, pp. 402-417.
• Amit Krishna Joshi, Prateek Jain, Pascal Hitzler, Peter Z. Yeh, Kunal Verma, Amit P. Sheth, Mariana Damova, Alignment-based Querying of Linked Open Data. In: Meersman, R.; Panetto, H.; Dillon, T.; Rinderle-Ma, S.; Dadam, P.; Zhou, X.; Pearson, S.; Ferscha, A.; Bergamaschi, S.; Cruz, I.F. (eds.), On the Move to Meaningful Internet Systems: OTM 2012, Confederated International Conferences: CoopIS, DOA-SVI, and ODBASE 2012, Rome, Italy, September 10-14, 2012, Proceedings, Part II. Lecture Notes in Computer Science Vol. 7566, Springer, Heidelberg, 2012, pp. 807-824.
• Yingjie Hu, Krzysztof Janowicz, David Carral, Simon Scheider, Werner Kuhn, Gary Berg-Cross, Pascal Hitzler, Mike Dean, Dave Kolas, A Geo-Ontology Design Pattern for Semantic Trajectories. In: Thora Tenbrink, John G. Stell, Antony Galton, Zena Wood (Eds.): Spatial Information Theory - 11th International Conference, COSIT 2013, Scarborough, UK, September 2-6, 2013. Proceedings. Lecture Notes in Computer Science Vol. 8116, Springer, 2013, pp. 438-456.
• Yingjie Hu, Krzysztof Janowicz, Grant McKenzie, Kunal Sengupta, Pascal Hitzler, A Linked Data-driven Semantically-enabled Journal Portal for Scientometrics. In: H. Alani, L. Kagal, A. Fokoue, P. Groth, C. Biemann, J.X. Parreira, L. Aroyo, N. Noy, C. Welty, K. Janowicz (eds.), The Semantic Web - ISWC 2013. 12th International Semantic Web Conference, Sydney, NSW, Australia, October 21-25, 2013, Proceedings, Part II. Lecture Notes in Computer Science Vol. 8219, Springer, Heidelberg, 2013, pp. 114-129.
• Prateek Jain, Peter Z. Yeh, Kunal Verma, Reymonrod G. Vasquez, Mariana Damova, Pascal Hitzler, Amit P. Sheth, Contextual Ontology Alignment of LOD with an Upper Ontology: A Case Study with Proton. In: Grigoris Antoniou, Marko Grobelnik, Elena Paslaru Bontas Simperl, Bijan Parsia, Dimitris Plexousakis, Pieter De Leenheer, Jeff Pan (Eds.): The Semantic Web: Research and Applications - 8th Extended Semantic Web Conference, ESWC 2011, Heraklion, Crete, Greece, May 29-June 2, 2011, Proceedings, Part I. Lecture Notes in Computer Science 6643, Springer, 2011, pp. 80-92.
• Prateek Jain, Pascal Hitzler, Kunal Verma, Peter Yeh, Amit Sheth, Moving beyond sameAs with PLATO: Partonomy detection for Linked Data. In: Ethan V. Munson, Markus Strohmaier (Eds.): 23rd ACM Conference on Hypertext and Social Media, HT '12, Milwaukee, WI, USA, June 25-28, 2012. ACM, 2012, pp. 33-42.
• Sebastian Rudolph, Markus Krötzsch, Pascal Hitzler, Cheap Boolean Role Constructors for Description Logics. In: Steffen Hölldobler and Carsten Lutz and Heinrich Wansing (eds.), Proceedings of 11th European Conference on Logics in Artificial Intelligence (JELIA), volume 5293 of LNAI, pp. 362-374. Springer, September 2008.
• Adila Alfa Krisnadhi, Frederick Maier, Pascal Hitzler, OWL and Rules. In: A. Polleres, C. d'Amato, M. Arenas, S. Handschuh, P. Kroner, S. Ossowski, P.F. Patel-Schneider (eds.), Reasoning Web. Semantic Technologies for the Web of Data. 7th International Summer School 2011, Galway, Ireland, August 23-27, 2011, Tutorial Lectures. Lecture Notes in Computer Science Vol. 6848, Springer, Heidelberg, 2011, pp. 382-415.
• Adila Krisnadhi, Robert Arko, Suzanne Carbotte, Cynthia Chandler, Michelle Cheatham, Timothy Finin, Pascal Hitzler, Krzysztof Janowicz, Thomas Narock, Lisa Raymond, Adam Shepherd, Peter Wiebe, An Ontology Pattern for Oceanographic Cruises: Towards an Oceanographer's Dream of Integrated Knowledge Discovery. OceanLink Technical Report 2014.1.
January 2014 – Ontology Summit – Pascal Hitzler
Towards ontology patterns for ocean science repository integration
Pascal Hitzler
DaSe Lab for Data Semantics, Wright State University
http://www.pascal-hitzler.de
OceanLink and EarthCube
EarthCube: Developing a Community-Driven Data and Knowledge Environment for the Geosciences
“concepts and approaches to create integrated data management infrastructures across the Geosciences.”
“EarthCube aims to create a well-connected and facile environment to share data and knowledge in an open, transparent, and inclusive manner, thus accelerating our ability to understand and predict the Earth system.”
OceanLink
Bottom-up constructed project. Currently in its first phase:
• Integrating ocean science repositories BCO-DMO and R2R, as well as datasets from the WHOI Library, AGU abstracts, NSF projects.
• Demonstrable added value (faceted integrated search).
• Key: extensible architecture that has the potential to grow to EarthCube size
Logic
• Many axioms / strong theory: few models, many inferences
• Few axioms / weak theory: many models, few inferences
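The trade-off between axioms, models and inferences can be checked by brute force on a tiny propositional example. This is a sketch under simplifying assumptions: three made-up atoms stand in for a vocabulary, and axioms are plain Python functions over truth assignments rather than OWL axioms.

```python
from itertools import product

ATOMS = ("cruise", "event", "has_vessel")


def cruise_is_event(m):
    """Axiom: cruise implies event."""
    return (not m["cruise"]) or m["event"]


def cruise_has_vessel(m):
    """Axiom: cruise implies has_vessel."""
    return (not m["cruise"]) or m["has_vessel"]


def models(axioms):
    """Enumerate all truth assignments that satisfy every axiom."""
    result = []
    for values in product([False, True], repeat=len(ATOMS)):
        assignment = dict(zip(ATOMS, values))
        if all(axiom(assignment) for axiom in axioms):
            result.append(assignment)
    return result


def entails(axioms, conclusion):
    """A conclusion is entailed if it holds in every model of the axioms."""
    return all(conclusion(m) for m in models(axioms))


weak = [cruise_is_event]
strong = [cruise_is_event, cruise_has_vessel]

print(len(models(weak)), len(models(strong)))
print(entails(weak, cruise_has_vessel), entails(strong, cruise_has_vessel))
```

Adding the second axiom removes a model and makes one more conclusion derivable, which is the pattern the slide summarizes.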
Ontologies
• Strong / many ontological commitments: few models, many inferences, not very reusable
• Weak / few ontological commitments: many models, few inferences, more easily reusable
Ontology Design Patterns
• Strong / many ontological commitments: few models, many inferences, not very reusable
• Weak / few ontological commitments: many models, few inferences, more easily reusable
Ontology Design Patterns
“An ontology design pattern is a reusable successful solution to a recurrent modeling problem.”
So-called content patterns usually encode specific abstract notions, such as process, event, agent, etc.
E.g., “Event”
[Diagram: Event → occursAtTime → xsd:dateTime; Event → occursAtPlace → xsd:string]
Better Event (more general)
[Diagram: Event → occursAtTime → <TemporalThing>; Event → occursAtPlace → <SpatialThing>]
This is a pattern!
But what about events taking place in Second Life?
Perhaps even …
[Diagram: Event → occursAtTime → <TemporalThing>; Event → occursAtPlace → <Place>; Event → hasParticipant → <Agent>]
Shortcuts / views
[Diagram: Event → occursAtPlace → <Place> → hasName → xsd:string; shortcut: Event → occursAtPlace → xsd:string]
There are several things wrong here!
[Diagram: Event → a:occursAtPlace → <Place> → a:hasName → xsd:string; shortcut: Event → b:occursAtPlace → xsd:string]
Better, but …
[Two further builds of the same diagram, annotated:] The latter is not in OWL!
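One way to read the shortcut is as the composition of the two detailed properties: whenever an event occurs at a place and that place has a name, the event gets the name via `b:occursAtPlace`. That reading is an assumption, since the slides give only the diagram. The slide remarks that part of this relationship is "not in OWL", so the sketch states it as a plain rule in code rather than as an axiom. The sample events and places are hypothetical.

```python
# Hypothetical data; property names follow the a:/b: labels on the slide.
TRIPLES = {
    ("event:1", "a:occursAtPlace", "place:WoodsHole"),
    ("place:WoodsHole", "a:hasName", "Woods Hole"),
    ("event:2", "a:occursAtPlace", "place:Unnamed"),
}


def materialize_shortcut(triples):
    """Derive b:occursAtPlace by composing a:occursAtPlace with a:hasName."""
    derived = set()
    for event, p1, place in triples:
        if p1 != "a:occursAtPlace":
            continue
        for subject, p2, name in triples:
            if subject == place and p2 == "a:hasName":
                derived.add((event, "b:occursAtPlace", name))
    return derived


print(sorted(materialize_shortcut(TRIPLES)))
```

The rule only runs from the detailed view to the shortcut. Going the other way, from a bare string back to a place individual, would require inventing a new node, which this sketch does not attempt.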
Similar problem
Splitting a role: hasParent into hasFather and hasMother
Cruise
For us: ocean science cruise. A cruise is a type of event. But what kind of place does it occur at?
Cruise
[Diagram: Cruise → occursAtTime → <TemporalThing>; Cruise → occursAtPlace → <Place>; Cruise → hasParticipant → <Agent>]
Semantic Trajectories
[Hu, Janowicz, Carral, Scheider, Kuhn, Berg-Cross, Hitzler, Dean, COSIT2013]
[Figure-only slides: Semantic Trajectories; Semantics in OWL (two slides); Ocean Science Cruise (draft); Cruise trajectory (draft); Cruise trajectory (four further slides)]
Why ODPs?
Traditionally, ODPs are thought of as building blocks for ontology modeling. This idea is certainly valid in the context of special purpose ontology-based systems. However, it can be argued that ODPs can be much more than mere building blocks.
Horizontal alignment
“Horizontal” alignment via patterns
[Diagram: repeated occurrences of Pattern1, Pattern2 and Pattern3 aligned with one another]
OceanLink setup
[Diagram: repositories (R2R, BCO-DMO, MBLWHOI Library, AGU, NSF) connected via mappings to the OceanLink Patterns; UI Views sit between the patterns and the User Interface]
Other added values of patterns
• Pattern-driven GUIs
• Pattern-driven mapping tools
• Pattern-driven query rewriting
• Pattern-driven reasoning modularization
• …
OceanLink setup
[Diagram: the same layout with EarthCube Patterns: several repositories connected via mappings to the patterns; UI Views sit between the patterns and the User Interface]
Thanks!
References
• BCO-DMO: Biological & Chemical Oceanography Data Management Office, http://www.bco-dmo.org/
• R2R: Rolling Deck to Repository, http://www.rvdata.us
• OceanLink website and publications are forthcoming
• Yingjie Hu, Krzysztof Janowicz, David Carral, Simon Scheider, Werner Kuhn, Gary Berg-Cross, Pascal Hitzler, Mike Dean, Dave Kolas, A Geo-Ontology Design Pattern for Semantic Trajectories. In: Thora Tenbrink, John G. Stell, Antony Galton, Zena Wood (Eds.): Spatial Information Theory - 11th International Conference, COSIT 2013, Scarborough, UK, September 2-6, 2013. Proceedings. Lecture Notes in Computer Science Vol. 8116, Springer, 2013, pp. 438-456.
• http://ontologydesignpatterns.org
March 2014 – GeoVoCampSB – Pascal Hitzler
OceanLink: Using Patterns for Discovery in EarthCube
Pascal Hitzler
DaSe Lab for Data Semantics, Wright State University
http://www.pascal-hitzler.de
Classical ontology-based integration
[Diagram: Query → upper level ontology → LOD datasets (LOD IMDB dataset; LOD Wikipedia dataset, DBPedia) → Answer] [ODBASE 2012, JWS 2007]
Example querying LoD
“Identify congress members, who have voted “No” on pro environmental legislation in the past four years, with high-pollution industry in their congressional districts.”
In principle, all the knowledge is there:
• GovTrack
• GeoNames
• DBPedia
• US Census
But even with LoD we cannot answer this query.
Some missing puzzle pieces:
• Where is the data? GovTrack, GeoNames, US Census: requires intimate knowledge of the LoD data sets. (smart federation needed)
• Missing background (schema) knowledge. (enhancements of the LoD cloud)
• Crucial info still hidden in texts. (ontology learning from texts)
• Added reasoning capabilities (e.g., spatial). (new ontology language features)
Linked Data: Variety
“Nancy Pelosi voted in favor of the Health Care Bill.”
[Diagram: the RDF graph behind this statement. Nodes: Bills:h3962 (“H.R. 3962: Affordable Health Care for America Act”); Vote: 2009-887 (“On Passage: H R 3962 Affordable Health Care for America Act”); Votes:2009-887/+ (“Aye”); people/P000197 (“Nancy Pelosi”). Edge labels: vote:hasAction, vote:vote, vote:hasOption, vote:votedBy, dc:title, rdfs:label, name]
Querying approach
Works very well, but only in some very limited cases. Cannot deal with graph representations of even very minimal complexity.
Automated federation?
[Diagram, R2R: person/101396 → type → foaf:Person; person/101396 → name → “Smith, John”]
[Diagram, BCO-DMO: Person_752 → type → foaf:Person; Person_752 → name → “John Smith”; Person_752 → familyName → “Smith”; Person_752 → givenName → “John”]
Ways forward?
How to establish a flexible conceptual architecture using data and ontological modeling?
Ontology Design Patterns
“An ontology design pattern is a reusable successful solution to a recurrent modeling problem.”
So-called content patterns usually encode specific abstract notions, such as process, event, agent, etc.
Patterns provide modular, reusable, replaceable pieces.
By agreeing on reuse of generic patterns (but leaving the relationships between the patterns to a specific assembly for a special purpose), we can have reuse while preserving heterogeneity.
Ontology Design Patterns
• Bottom-up homogenization of data representation.
• Avoidance of strong ontological commitments.
• Avoidance of standardization of specific modeling details.
• Well thought-out patterns can be very strong and versatile, thus serve many needs.
We are currently establishing many geo-patterns in a series of
hands-on workshops, the GeoVoCamps, see http://vocamp.org/
Ontology Design Patterns
“Horizontal” alignment via patterns
[Figure: diagram with repeated boxes labeled Pattern1, Pattern2, and Pattern3.]
EarthCube
EarthCube: Developing a Community-Driven Data and Knowledge Environment for the Geosciences
“concepts and approaches to create integrated data management infrastructures across the Geosciences.”
“EarthCube aims to create a well-connected and facile environment to share data and knowledge in an open, transparent, and inclusive manner, thus accelerating our ability to understand and predict the Earth system.”
OceanLink
NSF EarthCube project “OceanLink”:
• Integration of existing ocean science data repositories.
• For faceted browsing and semantic search.
• To be done in a flexible, extendable, modular way.
• With minimal effort for additional data providers to integrate their content.
National Science Foundation award 1354778 "EAGER: Collaborative Research: EarthCube Building Blocks, Leveraging Semantics and Linked Data for Geoscience Data Sharing and Discovery."
OceanLink setup
[Figure: OceanLink architecture diagram. Labels: OceanLink Patterns; R2R, BCO-DMO, WHOI Library, AGU, NSF; mappings; additional application-specific modeling; User Interface.]
OceanLink patterns
Some central patterns: • Cruise • Trajectory • Person • Organization • Roles of Agents • Repository Object • Data Set • Document
We’re not starting from zero of course.
Ocean Science Cruise (draft)
Cruise trajectory (draft)
Ways forward
• Establish a flexible conceptual architecture using data and ontological modeling.
• A principled use of patterns, including
– the development of a theory of patterns and
– the provision of a critical amount of central patterns
may provide a primary path forward.
Thanks!
Semantic Trajectories
[Hu, Janowicz, Carral, Scheider, Kuhn, Berg-Cross, Hitzler, Dean, COSIT2013]
Semantics in OWL
References
• Pascal Hitzler, Frank van Harmelen, A reasonable Semantic Web. Semantic Web 1 (1-2), 39-44, 2010.
• Prateek Jain, Pascal Hitzler, Peter Z. Yeh, Kunal Verma, Amit P. Sheth, Linked Data is Merely More Data. In: Dan Brickley, Vinay K. Chaudhri, Harry Halpin, Deborah McGuinness: Linked Data Meets Artificial Intelligence. Technical Report SS-10-07, AAAI Press, Menlo Park, California, 2010, pp. 82-86. ISBN 978-1-57735-461-1. Proceedings of LinkedAI at the AAAI Spring Symposium, March 2010.
• Pascal Hitzler, Krzysztof Janowicz, What’s Wrong with Linked Data? http://blog.semantic-web.at/2012/08/09/whats-wrong-with-linked-data/ , August 2012.
• Pascal Hitzler, Markus Krötzsch, Sebastian Rudolph, Foundations of Semantic Web Technologies. Chapman and Hall/CRC Press, 2009.
• Pascal Hitzler, Krzysztof Janowicz, Linked Data, Big Data, and the 4th Paradigm. Semantic Web 4 (3), 2013, 233-235.
• Krzysztof Janowicz, Pascal Hitzler, The Digital Earth as Knowledge Engine. Semantic Web 3 (3), 213-221, 2012.
• Gary Berg-Cross, Isabel Cruz, Mike Dean, Tim Finin, Mark Gahegan, Pascal Hitzler, Hook Hua, Krzysztof Janowicz, Naicong Li, Philip Murphy, Bryce Nordgren, Leo Obrst, Mark Schildhauer, Amit Sheth, Krishna Sinha, Anne Thessen, Nancy Wiegand, Ilya Zaslavsky, Semantics and Ontologies for EarthCube. In: K. Janowicz, C. Kessler, T. Kauppinen, D. Kolas, S. Scheider (eds.), Workshop on GIScience in the Big Data Age, In conjunction with the seventh International Conference on Geographic Information Science 2012 (GIScience 2012), Columbus, Ohio, USA. September 18th, 2012. Proceedings.
• Krzysztof Janowicz, Pascal Hitzler, Thoughts on the Complex Relation Between Linked Data, Semantic Annotations, and Ontologies. In: Paul N. Bennett, Evgeniy Gabrilovich, Jaap Kamps, Jussi Karlgren (eds.), Proceedings of the 6th International Workshop on Exploiting Semantic Annotation in Information Retrieval, ESAIR 2013, ACM, San Francisco, 2013, pp. 41-44.
• Prateek Jain, Pascal Hitzler, Amit P. Sheth, Kunal Verma, Peter Z. Yeh, Ontology Alignment for Linked Open Data. In P. Patel-Schneider, Y. Pan, P. Hitzler, P. Mika, L. Zhang, J. Pan, I. Horrocks, B. Glimm (eds.), The Semantic Web - ISWC 2010. 9th International Semantic Web Conference, ISWC 2010, Shanghai, China, November 7-11, 2010, Revised Selected Papers, Part I. Lecture Notes in Computer Science Vol. 6496. Springer, Berlin, 2010, pp. 402-417.
• Amit Krishna Joshi, Prateek Jain, Pascal Hitzler, Peter Z. Yeh, Kunal Verma, Amit P. Sheth, Mariana Damova, Alignment-based Querying of Linked Open Data. In: Meersman, R.; Panetto, H.; Dillon, T.; Rinderle-Ma, S.; Dadam, P.; Zhou, X.; Pearson, S.; Ferscha, A.; Bergamaschi, S.; Cruz, I.F. (eds.), On the Move to Meaningful Internet Systems: OTM 2012, Confederated International Conferences: CoopIS, DOA-SVI, and ODBASE 2012, Rome, Italy, September 10-14, 2012, Proceedings, Part II. Lecture Notes in Computer Science Vol. 7566, Springer, Heidelberg, 2012, pp. 807-824.
• Yingjie Hu, Krzysztof Janowicz, David Carral, Simon Scheider, Werner Kuhn, Gary Berg-Cross, Pascal Hitzler, Mike Dean, Dave Kolas, A Geo-Ontology Design Pattern for Semantic Trajectories. In: Thora Tenbrink, John G. Stell, Antony Galton, Zena Wood (Eds.): Spatial Information Theory - 11th International Conference, COSIT 2013, Scarborough, UK, September 2-6, 2013. Proceedings. Lecture Notes in Computer Science Vol. 8116, Springer, 2013, pp. 438-456.
• Prateek Jain, Peter Z. Yeh, Kunal Verma, Reymonrod G. Vasquez, Mariana Damova, Pascal Hitzler, Amit P. Sheth, Contextual Ontology Alignment of LOD with an Upper Ontology: A Case Study with Proton. In: Grigoris Antoniou, Marko Grobelnik, Elena Paslaru Bontas Simperl, Bijan Parsia, Dimitris Plexousakis, Pieter De Leenheer, Jeff Pan (Eds.): The Semantic Web: Research and Applications - 8th Extended Semantic Web Conference, ESWC 2011, Heraklion, Crete, Greece, May 29-June 2, 2011, Proceedings, Part I. Lecture Notes in Computer Science 6643, Springer, 2011, pp. 80-92.
• Prateek Jain, Pascal Hitzler, Kunal Verma, Peter Yeh, Amit Sheth, Moving beyond sameAs with PLATO: Partonomy detection for Linked Data. In: Ethan V. Munson, Markus Strohmaier (Eds.): 23rd ACM Conference on Hypertext and Social Media, HT '12, Milwaukee, WI, USA, June 25-28, 2012. ACM, 2012, pp. 33-42.
• D. Oberle, A. Ankolekar, P. Hitzler, P. Cimiano, M. Sintek, M. Kiesel, B. Mougouie, S. Vembu, S. Baumann, M. Romanelli, P. Buitelaar, R. Engel, D. Sonntag, N. Reithinger, B. Loos, R. Porzel, H.-P. Zorn, V. Micelli, C. Schmidt, M. Weiten, F. Burkhardt, J. Zhou, DOLCE ergo SUMO: On Foundational and Domain Models in the SmartWeb Integrated Ontology (SWIntO). Journal of Web Semantics: Science, Services and Agents on the World Wide Web 5 (3), 2007, 156-174.
• Adila Krisnadhi, Robert Arko, Suzanne Carbotte, Cynthia Chandler, Michelle Cheatham, Timothy Finin, Pascal Hitzler, Krzysztof Janowicz, Thomas Narock, Lisa Raymond, Adam Shepherd, Peter Wiebe, An Ontology Pattern for Oceanographic Cruises: Towards an Oceanographer's Dream of Integrated Knowledge Discovery. OceanLink Technical Report 2014.1.
• Sebastian Rudolph, Markus Krötzsch, Pascal Hitzler, Cheap Boolean Role Constructors for Description Logics. In: Steffen Hölldobler and Carsten Lutz and Heinrich Wansing (eds.), Proceedings of 11th European Conference on Logics in Artificial Intelligence (JELIA), volume 5293 of LNAI, pp. 362-374. Springer, September 2008.
• Adila Alfa Krisnadhi, Frederick Maier, Pascal Hitzler, OWL and Rules. In: A. Polleres, C. d'Amato, M. Arenas, S. Handschuh, P. Kroner, S. Ossowski, P.F. Patel-Schneider (eds.), Reasoning Web. Semantic Technologies for the Web of Data. 7th International Summer School 2011, Galway, Ireland, August 23-27, 2011, Tutorial Lectures. Lecture Notes in Computer Science Vol. 6848, Springer, Heidelberg, 2011, pp. 382-415.
January 2014 – IBM – Pascal Hitzler
Ontologies in a Data-driven World
Pascal Hitzler DaSe Lab for Data Semantics
Wright State University http://www.pascal-hitzler.de
Textbook
Pascal Hitzler, Markus Krötzsch, Sebastian Rudolph Foundations of Semantic Web Technologies Chapman & Hall/CRC, 2010 Choice Magazine Outstanding Academic Title 2010 (one out of seven in Information & Computer Science) http://www.semantic-web-book.org
Textbook – Chinese translation
Pascal Hitzler, Markus Krötzsch, Sebastian Rudolph
语义Web技术基础 Tsinghua University Press (清华大学出版社), 2013. Translators: Yong Yu, Haofeng Wang, Guilin Qi (俞勇,王昊奋,漆桂林)
http://www.semantic-web-book.org
Semantic Web journal
• EiCs: Pascal Hitzler Krzysztof Janowicz
• New journal with significant uptake.
• We very much welcome contributions at the “rim” of traditional Semantic Web research – e.g., work which is strongly inspired by a different field.
• Non-standard (open & transparent) review process.
• http://www.semantic-web-journal.net/
Ontologies?
A Basic Idea of the Semantic Web
[Figure labels: Person 1, Person 2, Agent 1, Agent 2, HA1, HA2, MA1, MA2; Symbol (“Duck”), Concept, Thing; exchange of symbols; agreement; Ontology; Ontology description; Semantics; Specific Domain, e.g. Animals.]
A Basic Idea of the Semantic Web
Ontology represents general domain knowledge, e.g. every publication has an author.
Data, e.g. on Websites.
Reconciling OWL and Rules, Knorr, Hitzler, Maier, ECAI 2012
A Basic Idea of the Semantic Web
Reconciling OWL and Rules, Knorr, Hitzler, Maier, ECAI 2012
e.g. every publication has an author
[Figure labels: Publication, Event, Title, Author]
The ontology hype
• Large, well-thought-out ontologies (foundational/domain/etc).
• Networked, interlinked ontologies
• “You just have to get your formal definitions right, and a lot of the rest will just fall into place.”
– This does not even work for
  • scientists
  • wanting to share and reuse scientific data
  • through well-kept data repositories
– So how is this supposed to work for the web at large?
Multiple perspectives
• Try to find a universal definition for
– Forest
– Mountain
– City
– River
– Etc.
• The stronger our ontological commitments, the more we lose reusability.
• We need to accept that conceptualizations are often very local, resulting in “micro-ontologies”.
Multiple perspectives
Two ontologies. Left: transportation domain. Right: agriculture domain.
We cannot simply equate a:Canal and b:Canal!
The well-done ontologies
• Brittle • Expensive • Sometimes unintuitive • Unwieldy • Single-perspective • Difficult to reuse
• Work in some contexts. • Work if a lot of central control is imposed. • Take a lot of manpower.
Pre-LOD Semantic Web
• Foundational ontologies • Networked ontologies • Sophisticated ontology languages
Scientific Hypothesis: These will solve your data and information management problems.
Remember that scientific progress is fundamentally about falsification, not verification.
Linked Data?
The linked data counter-hype
• “Ontologies don’t work, let’s just link data”
• “Okay, with a little bit of ontologies on top.”
• “The Linked Data Web is the true Semantic Web.”
Linked Data 2011
DBpedia: LOTR page
Information as RDF graph
LOTR hasAuthor Tolkien .
Hobbit hasAuthor Tolkien .
LOTR hasCharacter Bilbo .
Hobbit hasCharacter Bilbo .
[Figure: the same four triples drawn as a graph with nodes LOTR, Hobbit, Tolkien, Bilbo and edges hasAuthor, hasCharacter.]
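The four statements on this slide can be held as subject-predicate-object tuples and queried by pattern matching. The following is a minimal Python sketch, not a real RDF library; the function name `match` and the use of `None` as a wildcard are assumptions made for illustration.

```python
# Each RDF statement is a (subject, predicate, object) triple.
TRIPLES = [
    ("LOTR", "hasAuthor", "Tolkien"),
    ("Hobbit", "hasAuthor", "Tolkien"),
    ("LOTR", "hasCharacter", "Bilbo"),
    ("Hobbit", "hasCharacter", "Bilbo"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples that agree with every non-None position of the pattern."""
    return [
        (ts, tp, to)
        for (ts, tp, to) in triples
        if (s is None or ts == s)
        and (p is None or tp == p)
        and (o is None or to == o)
    ]


# Which works did Tolkien write?
WORKS = [s for (s, _, _) in match(TRIPLES, p="hasAuthor", o="Tolkien")]
print(WORKS)
```

A pattern with every position left as `None` returns the whole graph; fixing more positions narrows the result, which is the basic building block of a triple-pattern query.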
Linked Data: Volume
Number of Datasets
2011-09-19  295
2010-09-22  203
2009-07-14  95
2008-09-18  45
2007-10-08  25
2007-05-01  12
Number of triples (Sept 2011): 31,634,213,770, with 503,998,829 out-links
From http://www4.wiwiss.fu-berlin.de/lodcloud/state/
Linked Data: Volume
[Figure: Geoindexed Linked Data, courtesy of Krzysztof Janowicz, http://stko.geog.ucsb.edu/location_linked_data]
Linked Data: Volume
October 2013: Ca. 25,000,000,000 schema.org references on the web. 15% of all pages now have schema.org markup. That’s just schema.org references …
Example querying LoD
“Identify congress members, who have voted “No” on pro environmental legislation in the past four years, with high-pollution industry in their congressional districts.”
In principle, all the knowledge is there: • GovTrack • GeoNames • DBPedia • US Census
But even with LoD we cannot answer this query.
Example querying LoD
“Identify congress members, who have voted “No” on pro environmental legislation in the past four years, with high-pollution industry in their congressional districts.”
Some missing puzzle pieces:
• Where is the data?
– GovTrack, GeoNames, US Census: requires intimate knowledge of the LoD data sets
Example querying LoD
“Identify congress members, who have voted “No” on pro environmental legislation in the past four years, with high-pollution industry in their congressional districts.”
Some missing puzzle pieces:
• Where is the data? (smart federation needed)
• Missing background (schema) knowledge. (enhancements of the LoD cloud)
• Crucial info still hidden in texts. (ontology learning from texts)
• Added reasoning capabilities (e.g., spatial). (new ontology language features)
Linked Data: Variety
“Nancy Pelosi voted in favor of the Health Care Bill.”
[Figure: RDF graph behind this statement. Resources: Bills:h3962, Vote: 2009-887, Votes:2009-887/+, people/P000197. Literals: “H.R. 3962: Affordable Health Care for America Act”, “On Passage: H R 3962 Affordable Health Care for America Act”, “Nancy Pelosi”, “Aye”. Edge labels: vote:hasAction, vote:vote, vote:hasOption, vote:votedBy, dc:title (twice), rdfs:label, name.]
Linked Data federated querying
[Figure labels: Query; Upper level ontology; LOD IMDB Dataset; LOD Wikipedia Dataset (DBPedia); Answer.]
Joshi, Jain, Hitzler et al. ODBASE 2012
Bootstrapping-based alignment
Jain, Hitzler et al, ISWC2010
ALOQUS Illustration
“Identify films, the nations where they were shot and the population of these countries”
SELECT ?film ?nation ?pop
WHERE {
  ?film protonu:ofCountry ?nation .
  ?film rdf:type protonu:Movie .
  ?film rdfs:label ?film_name .
  ?nation protont:populationCount ?pop .
}
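The query on this slide is phrased in upper-level (PROTON) terms, while the data lives in separate datasets. One way to picture alignment-based querying is as term substitution followed by grouping into per-dataset sub-queries. The sketch below is a simplified illustration of that general idea, not the ALOQUS implementation. The alignment table is invented: the `ex-imdb:` and `ex-dbp:` names are hypothetical placeholders, and `rewrite` and `plan` are names chosen here.

```python
# Hypothetical alignments: upper-level term -> (dataset, dataset-specific term).
# The "ex-imdb:" and "ex-dbp:" names are placeholders, not real vocabulary terms.
ALIGNMENTS = {
    "protonu:ofCountry": ("imdb", "ex-imdb:country"),
    "protonu:Movie": ("imdb", "ex-imdb:Film"),
    "protont:populationCount": ("dbpedia", "ex-dbp:population"),
}

# Three of the triple patterns from the query on the slide (the rdfs:label
# pattern uses no upper-level term and is left out of this sketch).
QUERY = [
    ("?film", "protonu:ofCountry", "?nation"),
    ("?film", "rdf:type", "protonu:Movie"),
    ("?nation", "protont:populationCount", "?pop"),
]


def rewrite(pattern):
    """Replace aligned terms; return (dataset, rewritten pattern).

    dataset is None when no term of the pattern has an alignment.
    Simplification: if several terms are aligned, the last one decides
    the dataset.
    """
    dataset = None
    rewritten = []
    for term in pattern:
        if term in ALIGNMENTS:
            dataset, term = ALIGNMENTS[term]
        rewritten.append(term)
    return dataset, tuple(rewritten)


def plan(query):
    """Group the rewritten patterns into one sub-query per dataset."""
    subqueries = {}
    for pattern in query:
        dataset, rewritten = rewrite(pattern)
        subqueries.setdefault(dataset, []).append(rewritten)
    return subqueries


SUBQUERIES = plan(QUERY)
for dataset, patterns in SUBQUERIES.items():
    print(dataset, patterns)
```

The sub-queries still share the variable `?nation`, so their results would have to be joined on it afterwards; that join, and the hard part of finding the alignments in the first place, are outside this sketch.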
Querying approach
Works very well, but only in some very limited cases. Cannot deal with graph representations of even very minimal complexity.
Automated federation?
R2R:
person/101396 type foaf:Person .
person/101396 name “Smith, John” .
BCO-DMO:
Person_752 type foaf:Person .
Person_752 name “John Smith” .
Person_752 familyName “Smith” .
Person_752 givenName “John” .
Automated federation?
[Figure: Copernicus lunar crater located on Earth (missing reference coordinate system), courtesy of Krzysztof Janowicz, http://stko.geog.ucsb.edu/location_linked_data]
The linked data counter-hype
• “Ontologies don’t work, let’s just link data”
• “Okay, with a little bit of ontologies on top.”
• But then we don’t even know how to effectively query over multiple linked datasets (without using a lot of manpower to manually integrate them).
• It seems rather obvious that we need to get ontologies into the picture, but how to do it while avoiding the drawbacks of strong ontological commitments?
So What Now?
Ways forward?
How to establish a flexible conceptual architecture using data and ontological modeling?
Ontology Design Patterns
“An ontology design pattern is a reusable successful solution to a recurrent modeling problem.” So-called content patterns usually encode specific abstract notions, such as process, event, agent, etc.
Ontology Design Patterns
• Bottom-up homogenization of data representation.
• Avoidance of strong ontological commitments.
• Avoidance of standardization of specific modeling details.
• Well thought-out patterns can be very strong and versatile, thus serve many needs.
We are currently establishing many geo-patterns in a series of
hands-on workshops, the GeoVoCamps, see http://vocamp.org/
Ontology Design Patterns
“Horizontal” alignment via patterns
[Figure: diagram with repeated boxes labeled Pattern1, Pattern2, and Pattern3.]
Semantic Trajectories
[Hu, Janowicz, Carral, Scheider, Kuhn, Berg-Cross, Hitzler, Dean, COSIT2013]
Semantics in OWL
Helpfulness of patterns
Even minimalistic reuse is helpful:
R2R:
person/101396 type foaf:Person .
person/101396 name “Smith, John” .
BCO-DMO:
Person_752 type foaf:Person .
Person_752 name “John Smith” .
Person_752 familyName “Smith” .
Person_752 givenName “John” .
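Because both repositories type their records as foaf:Person, a federation step can at least restrict itself to person records and then compare names once they are brought into a common form. The sketch below assumes the two records shown on this slide are available as plain Python dictionaries; the helper names `name_key` and `candidate_matches` are hypothetical, and the name parsing is a deliberately naive assumption.

```python
# The two records from the slide, as plain dictionaries keyed by identifier.
R2R = {
    "person/101396": {"type": "foaf:Person", "name": "Smith, John"},
}
BCO_DMO = {
    "Person_752": {
        "type": "foaf:Person",
        "name": "John Smith",
        "familyName": "Smith",
        "givenName": "John",
    },
}


def name_key(record):
    """Return a lowercased (family, given) pair for a person record."""
    if "familyName" in record and "givenName" in record:
        family, given = record["familyName"], record["givenName"]
    elif "," in record["name"]:
        # "Family, Given" form
        family, given = (part.strip() for part in record["name"].split(",", 1))
    else:
        # "Given Family" form: treat the last word as the family name
        given, _, family = record["name"].rpartition(" ")
    return (family.lower(), given.lower())


def candidate_matches(left, right):
    """Pair up foaf:Person records from two sources that share a name key."""
    pairs = []
    for left_id, left_rec in left.items():
        for right_id, right_rec in right.items():
            both_persons = left_rec["type"] == right_rec["type"] == "foaf:Person"
            if both_persons and name_key(left_rec) == name_key(right_rec):
                pairs.append((left_id, right_id))
    return pairs


MATCHES = candidate_matches(R2R, BCO_DMO)
print(MATCHES)
```

The result is only a list of candidates: different people can share a name, so a real integration would need further evidence before asserting that two records denote the same person.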
Patterns
• Help to focus when modeling (one key notion at a time).
• Good ontology modeling implicitly employs the patterns idea anyway. It’s just that you expose the patterns.
• An ontology composed of patterns exposes its internal conceptual structure (as a composition of formal vocabulary pieces).
• Well-designed patterns are widely reusable and adaptable.
• You don’t have to buy a whole ontology when you adopt a few patterns from it.
• You can easily modify a pattern without giving up on a lot of similarity to the original pattern (which can be leveraged for data integration).
• You can separate the patterns from specific (application-driven) modifications.
• You can separate the patterns from specific axiomatically defined “views”.
Patterns Example
NSF EarthCube project “OceanLink”:
• Integration of existing ocean science data repositories.
• For faceted browsing and semantic search.
• To be done in a flexible, extendable, modular way.
• With minimal effort for additional data providers to integrate their content.
National Science Foundation award 1354778 "EAGER: Collaborative Research: EarthCube Building Blocks, Leveraging Semantics and Linked Data for Geoscience Data Sharing and Discovery."
OceanLink and EarthCube
EarthCube: Developing a Community-Driven Data and Knowledge Environment for the Geosciences
“concepts and approaches to create integrated data management infrastructures across the Geosciences.”
“EarthCube aims to create a well-connected and facile environment to share data and knowledge in an open, transparent, and inclusive manner, thus accelerating our ability to understand and predict the Earth system.”
OceanLink setup
[Figure: OceanLink architecture diagram. Labels: OceanLink Patterns; R2R, BCO-DMO, WHOI Library, AGU, NSF; mappings; UI Views; User Interface.]
OceanLink patterns
Some central patterns: • Cruise • Trajectory • Person • Organization • Roles of Agents • Repository Object • Data Set • Document
We’re not starting from zero of course.
Ocean Science Cruise (draft)
Cruise trajectory (draft)
Ways forward
• Establish a flexible conceptual architecture using data and ontological modeling.
• A principled use of patterns, including
– the development of a theory of patterns and
– the provision of a critical amount of central patterns
may provide a primary path forward.
Thanks!
References
• Pascal Hitzler, Frank van Harmelen, A reasonable Semantic Web. Semantic Web 1 (1-2), 39-44, 2010.
• Prateek Jain, Pascal Hitzler, Peter Z. Yeh, Kunal Verma, Amit P. Sheth, Linked Data is Merely More Data. In: Dan Brickley, Vinay K. Chaudhri, Harry Halpin, Deborah McGuinness: Linked Data Meets Artificial Intelligence. Technical Report SS-10-07, AAAI Press, Menlo Park, California, 2010, pp. 82-86. ISBN 978-1-57735-461-1. Proceedings of LinkedAI at the AAAI Spring Symposium, March 2010.
• Pascal Hitzler, Krzysztof Janowicz, What’s Wrong with Linked Data? http://blog.semantic-web.at/2012/08/09/whats-wrong-with-linked-data/ , August 2012.
• Pascal Hitzler, Markus Krötzsch, Sebastian Rudolph, Foundations of Semantic Web Technologies. Chapman and Hall/CRC Press, 2009.
• Pascal Hitzler, Krzysztof Janowicz, Linked Data, Big Data, and the 4th Paradigm. Semantic Web 4 (3), 2013, 233-235.
• Krzysztof Janowicz, Pascal Hitzler, The Digital Earth as Knowledge Engine. Semantic Web 3 (3), 213-221, 2012.
• Gary Berg-Cross, Isabel Cruz, Mike Dean, Tim Finin, Mark Gahegan, Pascal Hitzler, Hook Hua, Krzysztof Janowicz, Naicong Li, Philip Murphy, Bryce Nordgren, Leo Obrst, Mark Schildhauer, Amit Sheth, Krishna Sinha, Anne Thessen, Nancy Wiegand, Ilya Zaslavsky, Semantics and Ontologies for EarthCube. In: K. Janowicz, C. Kessler, T. Kauppinen, D. Kolas, S. Scheider (eds.), Workshop on GIScience in the Big Data Age, In conjunction with the seventh International Conference on Geographic Information Science 2012 (GIScience 2012), Columbus, Ohio, USA. September 18th, 2012. Proceedings.
• Krzysztof Janowicz, Pascal Hitzler, Thoughts on the Complex Relation Between Linked Data, Semantic Annotations, and Ontologies. In: Paul N. Bennett, Evgeniy Gabrilovich, Jaap Kamps, Jussi Karlgren (eds.), Proceedings of the 6th International Workshop on Exploiting Semantic Annotation in Information Retrieval, ESAIR 2013, ACM, San Francisco, 2013, pp. 41-44.
• Prateek Jain, Pascal Hitzler, Amit P. Sheth, Kunal Verma, Peter Z. Yeh, Ontology Alignment for Linked Open Data. In P. Patel-Schneider, Y. Pan, P. Hitzler, P. Mika, L. Zhang, J. Pan, I. Horrocks, B. Glimm (eds.), The Semantic Web - ISWC 2010. 9th International Semantic Web Conference, ISWC 2010, Shanghai, China, November 7-11, 2010, Revised Selected Papers, Part I. Lecture Notes in Computer Science Vol. 6496. Springer, Berlin, 2010, pp. 402-417.
• Amit Krishna Joshi, Prateek Jain, Pascal Hitzler, Peter Z. Yeh, Kunal Verma, Amit P. Sheth, Mariana Damova, Alignment-based Querying of Linked Open Data. In: Meersman, R.; Panetto, H.; Dillon, T.; Rinderle-Ma, S.; Dadam, P.; Zhou, X.; Pearson, S.; Ferscha, A.; Bergamaschi, S.; Cruz, I.F. (eds.), On the Move to Meaningful Internet Systems: OTM 2012, Confederated International Conferences: CoopIS, DOA-SVI, and ODBASE 2012, Rome, Italy, September 10-14, 2012, Proceedings, Part II. Lecture Notes in Computer Science Vol. 7566, Springer, Heidelberg, 2012, pp. 807-824.
• Yingjie Hu, Krzysztof Janowicz, David Carral, Simon Scheider, Werner Kuhn, Gary Berg-Cross, Pascal Hitzler, Mike Dean, Dave Kolas, A Geo-Ontology Design Pattern for Semantic Trajectories. In: Thora Tenbrink, John G. Stell, Antony Galton, Zena Wood (Eds.): Spatial Information Theory - 11th International Conference, COSIT 2013, Scarborough, UK, September 2-6, 2013. Proceedings. Lecture Notes in Computer Science Vol. 8116, Springer, 2013, pp. 438-456.
• Prateek Jain, Peter Z. Yeh, Kunal Verma, Reymonrod G. Vasquez, Mariana Damova, Pascal Hitzler, Amit P. Sheth, Contextual Ontology Alignment of LOD with an Upper Ontology: A Case Study with Proton. In: Grigoris Antoniou, Marko Grobelnik, Elena Paslaru Bontas Simperl, Bijan Parsia, Dimitris Plexousakis, Pieter De Leenheer, Jeff Pan (Eds.): The Semantic Web: Research and Applications - 8th Extended Semantic Web Conference, ESWC 2011, Heraklion, Crete, Greece, May 29-June 2, 2011, Proceedings, Part I. Lecture Notes in Computer Science 6643, Springer, 2011, pp. 80-92.
• Prateek Jain, Pascal Hitzler, Kunal Verma, Peter Yeh, Amit Sheth, Moving beyond sameAs with PLATO: Partonomy detection for Linked Data. In: Ethan V. Munson, Markus Strohmaier (Eds.): 23rd ACM Conference on Hypertext and Social Media, HT '12, Milwaukee, WI, USA, June 25-28, 2012. ACM, 2012, pp. 33-42.
• Sebastian Rudolph, Markus Krötzsch, Pascal Hitzler, Cheap Boolean Role Constructors for Description Logics. In: Steffen Hölldobler and Carsten Lutz and Heinrich Wansing (eds.), Proceedings of 11th European Conference on Logics in Artificial Intelligence (JELIA), volume 5293 of LNAI, pp. 362-374. Springer, September 2008.
• Adila Alfa Krisnadhi, Frederick Maier, Pascal Hitzler, OWL and Rules. In: A. Polleres, C. d'Amato, M. Arenas, S. Handschuh, P. Kroner, S. Ossowski, P.F. Patel-Schneider (eds.), Reasoning Web. Semantic Technologies for the Web of Data. 7th International Summer School 2011, Galway, Ireland, August 23-27, 2011, Tutorial Lectures. Lecture Notes in Computer Science Vol. 6848, Springer, Heidelberg, 2011, pp. 382-415.