SWEET: Upper-Level Ontologies for Earth System Science OPeNDAP Meeting Feb 2007 Rob Raskin PO.DAAC Jet Propulsion Laboratory

Page 1: SWEET: Upper-Level Ontologies for Earth System Science OPeNDAP Meeting Feb 2007 Rob Raskin PO.DAAC Jet Propulsion Laboratory.

SWEET: Upper-Level Ontologies for Earth System Science

OPeNDAP Meeting, Feb 2007

Rob Raskin, PO.DAAC

Jet Propulsion Laboratory

Page 2:

Data to Knowledge

Data Information Knowledge

Basic Elements: Bytes, Numbers, Models, Facts

Services: Ingest, Archive, Visualize, Infer, Understand, Predict

Storage: File, Database, HDF-EOS, GIS/MIS, Ontology, Mind

Interoperability: Syntactic, OPeNDAP, WMS/WCS, Semantic

Volume/Density: High/Low, Low/High

Statistics: Checksum, Moments, Descriptive, Inferential

Analysis: Fourier, Wavelet, EOF, SSA

Methodology: Exploratory-analysis, Model-based-mining

Syntax Semantics

Page 3:

Semantics: Shared Understanding of Concepts

Provides a namespace for scientific terms… plus

Provides descriptions of how terms relate to one another

Example tags in markup language: subclass, subproperty, part of, same as, transitive property, cardinality, etc.

Enables object in “data space” to be associated formally with object in “science concept space”

“Shared understanding” enables software tools to find “meaning” in resources

Page 4:

Ontology Representation

W3C has adopted four XML-based standard ontology languages: RDF, OWL-Lite, OWL-DL, OWL Full

Basic building blocks: Class, subclass, property, subproperty, sameAs

Standard language enables anyone to extend an ontology

Knowledge built up incrementally
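
The building blocks named on this slide (class, subclass, sameAs) and the idea of incremental extension can be illustrated without any OWL tooling. The following is a minimal sketch in plain Python, not OWL semantics and not SWEET's actual content: the triples, the subclass chain, and the helper names `superclasses`, `aliases`, and `extend` are all assumptions made for illustration.

```python
# Toy triple store: (subject, predicate, object). Illustrative data only.
TRIPLES = {
    ("Troposphere", "subClassOf", "AtmosphereLayer"),
    ("AtmosphereLayer", "subClassOf", "PlanetaryLayer"),
    ("Troposphere", "sameAs", "LowerAtmosphere"),
}


def superclasses(cls, triples):
    """Return every class reachable from cls by following subClassOf."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if s == current and p == "subClassOf" and o not in found:
                found.add(o)
                frontier.append(o)
    return found


def aliases(name, triples):
    """Return names declared equivalent to name via sameAs (either direction)."""
    out = set()
    for s, p, o in triples:
        if p == "sameAs":
            if s == name:
                out.add(o)
            elif o == name:
                out.add(s)
    return out


def extend(triples, new_triples):
    """Extend an ontology by adding triples; existing triples are untouched."""
    return set(triples) | set(new_triples)
```

Here `superclasses("Troposphere", TRIPLES)` walks two subClassOf links, and `extend` models a third party adding a class such as Stratosphere without editing the original set.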

Page 5:

Why an Upper-Level Ontology for Earth System Science?

Many common concepts used across Earth Science disciplines (such as properties of the Earth)

Provides common definitions for terms used in multiple disciplines or communities

Provides common language in support of community and multidisciplinary activities

Reduced burden (and barrier to entry) on creators of specialized domain ontologies

Only need to create ontologies for incremental knowledge

Page 6:

Semantic Web for Earth & Environmental Terminology

(SWEET)

Ontology of Earth system science and data concepts

Provides a common semantic framework (or namespace) for describing Earth science information and knowledge

Emphasis on improving search for NASA Earth science data resources

Represented in OWL-DL

Page 7:

SWEET Ontologies

[Diagram: boxes grouped under "Integrative Ontologies" and "Faceted Ontologies". Box labels: Non-Living Substances, Living Substances, Physical Processes, Earth Realm, Physical Properties, Time, Natural Phenomena, Human Activities, Space, Data, Units, Numerics]

Page 8:

SWEET Supports Knowledge Reuse

SWEET is a concept space

Enables scalable classification of Earth science and data-related concepts

Enables object in data space to be mapped to science concept space

Concept space is translatable into other languages/cultures using “sameAs” notions

Page 9:

SWEET Science Ontologies

Earth Realms: Atmosphere, SolidEarth, Ocean, LandSurface, …

Physical Properties: temperature, composition, area, albedo, …

Substances: CO2, water, lava, salt, hydrogen, pollutants, …

Living Substances: Humans, fish, …

Page 10:

SWEET Conceptual Ontologies

Phenomena: ElNino, Volcano, Thunderstorm, Deforestation, Terrorism, physical processes (e.g., convection)

Each has associated EarthRealms, PhysicalProperties, spatial/temporal extent, etc.

Specific instances included, e.g., 1997-98 ElNino

Human Activities: Fisheries, IndustrialProcessing, Economics, …

Page 11:

SWEET Numerical Ontologies

SpatialEntities

Extents: country, Antarctica, equator, inlet, …

Relations: above, northOf, …

TemporalEntities

Extents: duration, century, season, …

Relations: after, before, …

Numerics

Extents: interval, point, 0, positiveIntegers, …

Relations: lessThan, greaterThan, …

Units

Extracted from Unidata’s UDUnits

Added SI prefixes

Multiplication of two quantities carries units
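
The point that multiplying two quantities carries units can be made concrete with a small sketch. This is not the UDUnits library or SWEET's OWL encoding; the `Quantity` class, the `quantity` and `prefixed` helpers, and the two-entry prefix table are hypothetical names chosen for illustration. Units are represented as exponents of base units, so multiplication adds exponents.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quantity:
    value: float
    units: tuple  # sorted tuple of (base_unit, exponent) pairs

    def __mul__(self, other):
        # Multiplying quantities adds the exponents of matching base units.
        exponents = dict(self.units)
        for unit, power in other.units:
            exponents[unit] = exponents.get(unit, 0) + power
        # Drop base units whose exponents cancel to zero.
        merged = tuple(sorted((u, p) for u, p in exponents.items() if p != 0))
        return Quantity(self.value * other.value, merged)


def quantity(value, **units):
    """Build a Quantity, e.g. quantity(3.0, m=1, s=-1) for 3 m/s."""
    return Quantity(value, tuple(sorted(units.items())))


# Small illustrative subset of SI prefixes (assumption, not the full table).
PREFIXES = {"kilo": 1000.0, "milli": 0.001}


def prefixed(prefix, value, **units):
    """Apply an SI prefix by scaling the value into the unprefixed unit."""
    return quantity(value * PREFIXES[prefix], **units)
```

For example, `quantity(3.0, m=1, s=-1) * quantity(2.0, s=1)` yields a quantity whose only remaining unit is metres, because the second exponents cancel.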

Page 12:

Numerical Ontologies

Numeric concepts defined in OWL only through standard XML XSD spec

Intervals defined as restrictions on real line

Added in SWEET

Numerical relations (lessThan, max, …)

Cartesian product (multidimensional spaces)

Numeric ontologies used to define spatial and temporal concepts
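
As a rough illustration of intervals as restrictions on the real line and of the Cartesian product building multidimensional spaces, consider the sketch below. It assumes closed intervals and uses hypothetical names (`Interval`, `Box`, `less_than`); SWEET expresses these ideas in OWL, not Python.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Closed interval [lower, upper] on the real line (closedness is an assumption)."""
    lower: float
    upper: float

    def contains(self, x):
        return self.lower <= x <= self.upper

    def less_than(self, other):
        """True when every point of self lies below every point of other."""
        return self.upper < other.lower


@dataclass(frozen=True)
class Box:
    """Cartesian product of intervals, one per dimension."""
    axes: tuple

    def contains(self, point):
        return len(point) == len(self.axes) and all(
            axis.contains(x) for axis, x in zip(self.axes, point)
        )


# A spatial extent built from numeric intervals (illustrative bounds).
latitude_band = Interval(-23.5, 23.5)
all_longitudes = Interval(-180.0, 180.0)
tropics = Box((latitude_band, all_longitudes))
```

This mirrors how a spatial concept can be defined on top of purely numeric ones: `tropics.contains((10.0, 45.0))` only consults the two underlying intervals.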

Page 13:

XSD: Datatypes

Numeric: boolean, decimal, float, double, integer, nonNegativeInteger, positiveInteger, nonPositiveInteger, negativeInteger, long, int, short, unsignedLong, unsignedInt, unsignedShort, unsignedByte, hexBinary, base64Binary

String: string, normalizedString, anyURI, token, language, NMTOKEN, Name, NCName

Date: dateTime, time, date, gYearMonth, gYear, gMonthDay, gDay, gMonth

Page 14:

Data and Services Ontology

Formats

Data models

Data Structures

Special values: Missing, land, sea, ice, etc.

Parameters: Scale factors, offsets, algorithms

Data Services: Subset, reproject

Page 15:

Example: AIRS Level 2 Dataset

Subset of Dataset where

DataModel= Level 2

Instrument= AIRS

HorizontalDimension= 2

VerticalDimension= 1

Format= HDF-EOS

Property= Temperature

Substance= Air
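
Describing a dataset class as "the subset of Dataset where each facet has a given value" behaves like a conjunctive filter. The sketch below shows that reading in plain Python; the facet values come from the slide, while the `matches` and `subset` helpers and the catalog records (including their `id` values) are made up for illustration.

```python
# Facet restrictions taken from the slide.
AIRS_LEVEL2_FACETS = {
    "DataModel": "Level 2",
    "Instrument": "AIRS",
    "HorizontalDimension": 2,
    "VerticalDimension": 1,
    "Format": "HDF-EOS",
    "Property": "Temperature",
    "Substance": "Air",
}


def matches(dataset, facets):
    """A dataset belongs to the class only if every facet restriction holds."""
    return all(dataset.get(key) == value for key, value in facets.items())


def subset(catalog, facets):
    """Return the datasets in catalog that satisfy all facet restrictions."""
    return [d for d in catalog if matches(d, facets)]


# Hypothetical catalog records, for illustration only.
CATALOG = [
    dict(AIRS_LEVEL2_FACETS, id="example-airs-l2-temperature"),
    dict(AIRS_LEVEL2_FACETS, id="example-airs-l3-temperature", DataModel="Level 3"),
    {"id": "example-incomplete", "Instrument": "AIRS"},
]
```

Records that lack a facet, or carry a different value for it, fall outside the class, which is the same behaviour a restriction-based class definition gives.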

Page 16:

Fragment of SWEET

[Diagram: class graph. Nodes: Atmosphere, AtmosphereLayer, Troposphere, Tropopause, Stratosphere, PlanetaryLayer, 3DLayer. Edge labels: isUpperBoundaryOf, isLowerBoundaryOf, subClassOf, partOf. Property values shown: upperBoundary=50 km, lowerBoundary=15 km, primarySubstance=“air”, sameAs=“LowerAtmosphere”]

Page 17:

How SWEET was Initially Populated

Initial sources

GCMD

Over 10,000 datasets

Over 1000 keywords

Data providers submit additional terms for “free-text” search

CF

Over 700 keywords

Very long term names, e.g. surface_downwelling_photon_spherical_irradiance_in_sea_water

Decomposed into facets

Property= spherical_irradiance

Substance= sea_water

Space= surface

Direction= down
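
One way to picture the decomposition of a long CF name into facets is a longest-phrase lookup against per-facet vocabularies. The sketch below is an assumption about how such a decomposition could be automated, not a description of how SWEET was actually populated; `FACET_TERMS` holds only the four terms needed for the slide's example, and `decompose` is a hypothetical helper.

```python
# Tiny per-facet vocabularies: phrase in the CF name -> facet value.
# Only the terms needed for the slide's example are included.
FACET_TERMS = {
    "Space": {"surface": "surface"},
    "Direction": {"downwelling": "down"},
    "Property": {"spherical_irradiance": "spherical_irradiance"},
    "Substance": {"sea_water": "sea_water"},
}


def lookup(phrase):
    """Return (facet, value) if phrase is a known term, else None."""
    for facet, terms in FACET_TERMS.items():
        if phrase in terms:
            return facet, terms[phrase]
    return None


def decompose(standard_name):
    """Split a CF-style name into facets, preferring the longest known phrase."""
    tokens = standard_name.split("_")
    facets, unmatched = {}, []
    i = 0
    while i < len(tokens):
        for j in range(len(tokens), i, -1):  # longest candidate first
            hit = lookup("_".join(tokens[i:j]))
            if hit:
                facets[hit[0]] = hit[1]
                i = j
                break
        else:
            # No known phrase starts here; keep the token for human review.
            unmatched.append(tokens[i])
            i += 1
    return facets, unmatched
```

Tokens that match no vocabulary entry (here "photon" and "in") are returned separately rather than guessed at, leaving them for a person to place.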

Page 18:

Collaboration Web Site

Discussion tools: Blog, wiki, moderated discussion board

Version Control/Configuration Management

Trace dependencies on external ontologies

Tools to search for existing concepts in registered ontologies

Ontology Validation Procedure

W3C note is formal submission method

Registry/discovery of ontologies

Support workflows/services for ontology development

Page 19:

Community Issues

Content

Maintain alignment given expansion of classes and properties

Standards and Conventions

Agreement on standards for use of OWL

Fuzzy representation conventions

Submit as standard to NASA Standards & Processes Working Group Review Board

Who will oversee and maintain in perpetuity (or at least through the next funding cycle)?

ESIP Federation? A new consortium?

Global Support

Provide tools to visualize and appreciate the big picture

Page 20:

Update/Matching Issues

No removal of terms except for spelling or factual errors

Subscription service to notify affected ontologies when changes made

Must avoid contradictions

Additions can create redundancy if sameAs not used

Humans must oversee “matching”

CF has established moderator to carry out analogous additions

OWL “import” imports entire file

Associate community with ontology terms

Community tagging

Page 21:

Best Practices

Keep ontologies small, modular

Be careful that “owl:imports” imports everything

Use higher level ontologies where possible

Identify hierarchy of concept spaces

Model schemas

Try to keep dependencies unidirectional
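
The warnings about imports pulling in everything and about keeping dependencies unidirectional can be checked mechanically on an import graph. The sketch below assumes the imports have already been extracted into a plain dictionary; the ontology names in `IMPORTS` and the helpers `import_closure` and `has_cycle` are hypothetical and do not describe SWEET's real file layout.

```python
# Hypothetical import graph: ontology name -> ontologies it imports directly.
IMPORTS = {
    "phenomena": ["earthrealm", "property"],
    "earthrealm": ["space"],
    "property": ["units"],
    "space": ["numerics"],
    "units": ["numerics"],
    "numerics": [],
}


def import_closure(name, imports):
    """Everything that gets loaded, directly or indirectly, when name is imported."""
    loaded = set()
    stack = [name]
    while stack:
        current = stack.pop()
        for dep in imports.get(current, []):
            if dep not in loaded:
                loaded.add(dep)
                stack.append(dep)
    return loaded


def has_cycle(imports):
    """True if any ontology transitively imports itself (dependencies not unidirectional)."""
    return any(name in import_closure(name, imports) for name in imports)
```

In this toy graph, importing "phenomena" loads five other ontologies, which shows why small modules matter; adding an import from "numerics" back to "phenomena" would make `has_cycle` return True.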

Page 22:

Web Sites

http://sweet.jpl.nasa.gov

http://PlanetOnt.org