Heterogeneity is Here to Stay and Semantics is Not About Agreement

Krzysztof Janowicz, STKO Lab, University of California, Santa Barbara, USA. EarthCube All-Hands Meeting, Addressing Data Heterogeneity 2014, June 2014.


EarthCube All-Hands 2014 Data Heterogeneity Panel; Semantics + Ontologies

Transcript of Heterogeneity is Here to Stay and Semantics is Not About Agreement

Page 1: Heterogeneity is Here to Stay and Semantics is Not About Agreement

Heterogeneity is Here to Stay and

Semantics is not about Agreement

Krzysztof Janowicz, STKO Lab, University of California, Santa Barbara, USA

EarthCube All-Hands Meeting, Addressing Data Heterogeneity 2014

June 2014

Heterogeneity is Here to Stay and Semantics is not about Agreement K. Janowicz

Page 2: Heterogeneity is Here to Stay and Semantics is Not About Agreement

The Data Retrieval Problem Is Real

Even the major data hubs such as Data.gov still rely on keyword-based search and have unreliable, incomplete, or missing metadata. For this type of retrieval problem, even ‘a little semantics goes a long way’ (Hendler 1997).
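To illustrate how a little semantics can help retrieval, here is a minimal sketch that contrasts plain keyword matching with a search that also follows a tiny term hierarchy. The catalog entries, the `BROADER` mapping, and all function names are assumptions made up for this example; they do not describe how Data.gov or any other hub actually works.

```python
# Hypothetical catalog entries; ids and keywords are invented for illustration.
CATALOG = [
    {"id": "d1", "keywords": {"river", "discharge"}},
    {"id": "d2", "keywords": {"creek", "flow"}},
    {"id": "d3", "keywords": {"lake", "temperature"}},
]

# Assumed mini-hierarchy: each term maps to its broader term.
BROADER = {
    "creek": "stream",
    "river": "stream",
    "stream": "waterbody",
    "lake": "waterbody",
}


def keyword_search(term):
    """Return ids of datasets tagged with exactly this keyword."""
    return [d["id"] for d in CATALOG if term in d["keywords"]]


def narrower_terms(term):
    """Return the term plus every term that is (transitively) narrower."""
    result = {term}
    changed = True
    while changed:
        changed = False
        for child, parent in BROADER.items():
            if parent in result and child not in result:
                result.add(child)
                changed = True
    return result


def semantic_search(term):
    """Return ids of datasets tagged with the term or any narrower term."""
    terms = narrower_terms(term)
    return [d["id"] for d in CATALOG if d["keywords"] & terms]


print(keyword_search("stream"))   # no dataset is tagged "stream" literally
print(semantic_search("stream"))  # also finds datasets tagged "river" or "creek"
```

Even this very small hierarchy lets a query for a broader term reach datasets that were tagged only with narrower ones, which exact keyword matching misses.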


Page 3: Heterogeneity is Here to Stay and Semantics is Not About Agreement

Sensemaking is Difficult – Fitness for Purpose is Key

There is no shortage of data, but finding data that is fit for a certain purpose is difficult.

Data as statements (think RDF) not as truth.

Heterogeneity is caused by cultural differences, progress in science, viewpoints, granularity, ...

Alchemist Fallacy [1]; semantics does not come for free.

Lack of provenance information

Sensemaking requires more powerful semantic technologies and ontologies (compared to IR).

[1] You cannot transmute base metals into gold and even if you could, gold would not be precious anymore. Recall the data citation discussion.
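The idea of treating data as statements rather than as truth, together with the role of provenance, can be sketched in a few lines. This is only an illustration under stated assumptions: the `Statement` tuple, the `ex:` identifiers, and the length values are all invented, and a real system would more likely use an RDF library than plain tuples.

```python
from collections import namedtuple

# A statement is a triple plus the source that asserted it (its provenance).
Statement = namedtuple("Statement", "subject predicate obj source")

# Hypothetical, conflicting claims about the same river from two providers.
statements = [
    Statement("ex:RiverA", "ex:lengthKm", 120, "ex:AgencyX"),
    Statement("ex:RiverA", "ex:lengthKm", 135, "ex:AgencyY"),
]


def claims(subject, predicate):
    """Return what each source states, without deciding who is right."""
    return {
        s.source: s.obj
        for s in statements
        if s.subject == subject and s.predicate == predicate
    }


print(claims("ex:RiverA", "ex:lengthKm"))
```

Both values are kept side by side and attributed to their sources, so a consumer can judge fitness for purpose instead of receiving a single number with no indication of where it came from.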


Page 4: Heterogeneity is Here to Stay and Semantics is Not About Agreement

Meaningful Analysis and Synthesis is Difficult

Ensuring that data is analyzed and combined in a meaningful way is far from trivial.

What if the information on how to use the data would come together with these data?

Focus on smart data instead of (merely on) smart applications.

The purpose of ontologies is not to agree on the meaning of terms but to make the data provider’s intended meaning explicit.

A little experiment: The statement "all rivers flow into other water bodies" is not useful because it is ‘true’ [2], but because...?

[2] It is not; rivers can flow into the ground or just dry up entirely before reaching another water body.
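As a rough sketch of what "smart data" could mean in practice, the example below lets each observation carry an explicit statement of what the provider meant (quantity and unit), so that a generic function can combine values meaningfully or refuse to. The `Observation` class, the quantity names, and the unit labels are assumptions chosen for illustration, not part of any EarthCube specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    value: float
    quantity: str  # the provider's intended meaning, e.g. "air_temperature"
    unit: str      # "degC" or "degF" in this sketch


def to_celsius(obs):
    """Convert an observation to degrees Celsius based on its declared unit."""
    if obs.unit == "degC":
        return obs.value
    if obs.unit == "degF":
        return (obs.value - 32.0) * 5.0 / 9.0
    raise ValueError(f"unknown unit: {obs.unit}")


def mean_celsius(observations):
    """Average observations only if they describe the same quantity."""
    quantities = {o.quantity for o in observations}
    if len(quantities) != 1:
        raise ValueError(
            f"refusing to combine different quantities: {sorted(quantities)}"
        )
    values = [to_celsius(o) for o in observations]
    return sum(values) / len(values)


same_quantity = [
    Observation(20.0, "air_temperature", "degC"),
    Observation(68.0, "air_temperature", "degF"),
]
print(mean_celsius(same_quantity))
```

The two providers never had to agree on one unit; each only had to state what its numbers mean. Passing an `air_temperature` and a `water_temperature` observation together raises an error instead of silently producing a meaningless average.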
