'The Why, What, and How of Geo-Information Observatories' GeoRich2014 Keynote

The Why, What, and How of Geo-Information Observatories. Krzysztof Janowicz, STKO Lab, University of California, Santa Barbara, USA. GeoRich 2014 Keynote, Snowbird, Utah, June 2014.

Transcript of 'The Why, What, and How of Geo-Information Observatories' GeoRich2014 Keynote

Why What How

The Why, What, and How of

Geo-Information Observatories

Krzysztof Janowicz, STKO Lab

University of California, Santa Barbara, USA

GeoRich 2014 Keynote, Snowbird, Utah, June 2014

The Why, What, and How of Geo-Information Observatories K. Janowicz

Why is this interesting?

Astronomical Observatories

The Griffith Observatory

Griffith donated funds and land to build the observatory to make astronomy accessible to the public. This was in clear contrast to the prevailing idea of locating observatories on remote mountaintops and restricting them to scientists. Today, our society is willing to invest billions to study phenomena that may not even exist anymore (e.g., the Pillars of Creation).

Observatories and Their Sensors

Whether on land or in space, observatories and their sensors serve different purposes and are most useful when they work together.

Spectral Signatures, Bands, and Remote Sensing

Spectral signatures are the combination of emitted, reflected, or absorbed electromagnetic radiation at varying wavelengths (bands) that uniquely identify a feature type.
Spectral libraries, the idea of sharing spectral signatures, have revolutionized remote sensing.

Astronomical Breakthrough: Hubble Deep Field

The universe is (mostly) homogenous and isotropic.

We will do such an experiment in a few minutes.

Observatories In Other Sciences

What do these observatories have in common? Why are they useful?
Physical location relative to the phenomenon, collaboration between observatories, tangible.

Observatories beyond Astronomy:
Ocean Observatories Initiative
Volcano observatories
Meteorological observatories
Geological observatories

Towards Information Observatories

Web Science Trust: A web observatory is a system that gives public access to some specific aspects of the WWW and provides the infrastructure and visualization techniques to support monitoring, analysis, and experiments.
Web Science Trust wants to establish a network of observatories.
The information universe has entered a phase of exponential growth but its foundations are still poorly understood.
We need observatories that are tangible (physical) installations; remember Griffith’s will.
{Web, Data, Information, Knowledge, Virtual Earth} Observatory?

How Does This Differ From the Digital Earth and CyberGIS?

The Digital Earth is a data archive to access and visualize data layers on a digital globe.
CyberGIS is mostly concerned with creating online workbenches for scientists to ease the storage of data on the cloud and to do complex spatial analysis on the cloud.
Recall Griffith’s vision of making observatories available to the public, not just scientists.
A way to handle some common sampling bias and quality arguments.
Most examples will relate the information universe back to the physical universe.
However, it is important to note that the information universe can also be studied in its own right.

Essentially, all models are wrong, but some are useful. (George E. P. Box)

What we know is an artifact of the technical infrastructure we use (e.g., sensors) and the models we develop.
The physical universe is governed by physical laws, constants, elementary particles, and so forth.
What about the information universe?
Are there laws of information?
Complex sociotechnical interactions.
Physical-Cyber-Social systems (cf. Sheth 2013).

What would we observe?

Is the Information Universe Homogenous and Isotropic?

Spatial Distribution of Data on the Social Web

In terms of geospatial distribution, the Social (media) Web is neither homogenous nor isotropic.

The Idealized Linked Data Cloud

A highly popular visualization of the Linked Data Cloud by Cyganiak and Jentzsch from Sept 2011. Is the LOD Cloud homogenous, isotropic?

A Linear Cluster Map Of The LOD Cloud

Credit: Gueret, Schlobach, Wang, Groth, van Harmelen (2011)

In terms of link structure, the Linked Data web is neither homogenous nor isotropic.

Are there Laws of the Information Universe?

A Law Of The Information Universe?

Terminological knowledge is orders of magnitude smaller than factual knowledge. (cf. van Harmelen, ISWC 2011)

What are the "Elementary Particles", "Constants" and "Laws" Governing the Information Universe?

Interestingly, the power law applies to terminological and factual knowledge.

Early Geo-Information Observatories

The Urban Observatory

’Urban Observatory – a live museum with a data pulse.’ (urbanobservatory.org)

POI Pulse: Point Of Interest Information Observatory

Analyze (zoom, change time, select categories, etc.) the pulse of a city via its Points of Interest and user behavior on social media (http://poipulse.com/).

Theory-driven upper-level categories and default behavior based on semantic signatures.

User interaction and fine-grained, data-driven categorization.

Burst mode adds real-time data; tweets [red circles] and Foursquare check-ins.

Frankenplace

Credit: Adams & McKenzie (2012)

Frankenplace and thematic signatures support the study of the geo-indicativeness of text and sense of place.
Note how POI Pulse and Frankenplace allow for observational and experimental research.

How could we do this?

Challenges for Information Observatories

Where Are The Information Observatories?

Prototypes Aside, Where Are The Information Observatories?

Well, it’s a difficult task:
Data Publishing
Data Retrieval
Data Synthesis
Data Reuse
Sensemaking

Semantic Web technologies and ontologies aim at exactly those challenges and we are beginning to see their wide-scale adoption.
However, we need to work on approaches that combine data-driven and theory-driven techniques.

The Data Retrieval Problem Is Real

Even the major data hubs such as Data.gov still rely on keyword-based search and have unreliable, incomplete, and missing metadata. For this type of retrieval problem, even a little semantics goes a long way (Hendler 1997).

Sensemaking is Difficult – Fitness for Purpose is Key

There is no shortage of data, but finding data that is fit for a certain purpose is difficult.
Data as statements (think RDF), not as truth.
Heterogeneity is caused by cultural differences, progress in science, viewpoints, granularity, etc.
Alchemist Fallacy [1]; semantics does not come for free.
Lack of provenance information.
Sensemaking requires more powerful semantic technologies and ontologies (compared to IR).

[1] You cannot transmute base metals into gold and even if you could, gold would not be precious anymore.

Meaningful Analysis and Synthesis is Difficult

Ensuring that data is analyzed and combined in a meaningful way is far from trivial.
What if the information on how to use the data came together with the data?
Focus on smart data instead of (merely on) smart applications.
The purpose of ontologies is not to agree on the meaning of terms but to make the data provider’s intended meaning explicit.

A little experiment: The statement "all rivers flow into other water bodies" is not useful because it is "true" [2], but because...?

[2] It is not; rivers can flow into the ground or just dry up entirely before reaching another water body.

Semantic Signatures

So What Are These Semantic Signatures?

Semantic signatures are an analogy to spectral signatures used in remote sensing.
Combine numerical and statistical models and data with ontologies to derive local primitives (reifications).
Multiple spectral bands → multiple semantic bands.
A shared semantic signatures library will hopefully have the same impact that spectral signatures had on remote sensing.

Semantic Signatures In POI Pulse

Semantic Signature

12 geospatial bands, based on geographic location:
ANND (1)
Ripley’s K Bins (10)
J Measure (1)

168 temporal bands, based on geo-social check-ins:
24 Hours × 7 Days

60 thematic bands, based on venue tips and reviews:
LDA topics

Makes use of data heterogeneity, social machines
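One way to picture this band structure is as a single feature vector per POI type. The following is only a minimal sketch: the band counts come from the slide, but the class name, field names, and the idea of simply concatenating the bands in a fixed order are assumptions made for illustration, not the POI Pulse implementation.

```python
from dataclasses import dataclass
from typing import List

# Band counts as listed on the slide.
N_GEOSPATIAL = 12   # ANND (1) + Ripley's K bins (10) + J measure (1)
N_TEMPORAL = 168    # 24 hours x 7 days
N_THEMATIC = 60     # LDA topics


@dataclass
class SemanticSignature:
    """Hypothetical container for the signature of one POI type."""
    geospatial: List[float]
    temporal: List[float]
    thematic: List[float]

    def __post_init__(self) -> None:
        # Reject inputs whose band counts differ from the slide's numbers.
        expected = (
            ("geospatial", self.geospatial, N_GEOSPATIAL),
            ("temporal", self.temporal, N_TEMPORAL),
            ("thematic", self.thematic, N_THEMATIC),
        )
        for name, values, size in expected:
            if len(values) != size:
                raise ValueError(f"{name}: expected {size} bands, got {len(values)}")

    def as_vector(self) -> List[float]:
        # Assumption: bands are concatenated in a fixed order.
        return list(self.geospatial) + list(self.temporal) + list(self.thematic)
```

Under this assumption each signature has 12 + 168 + 60 = 240 components, which makes signatures of different POI types directly comparable.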

Semantic Signatures Example: Thematic Bands

A thematic band can be computed out of unstructured text using latent Dirichlet allocation (LDA); data source: Wikipedia and travel blogs.
Non-georeferenced plain text is often still geo-indicative.
Different types (taken from DBpedia) of geographic features have different, diagnostic topics associated with them (out of 500 topics).

City topics: 204>450>104>282>267>497>443>484>277>97>...
Town topics: 425>450>419>367>104>429>266>69>204>308>...
Mountain topics: 27>110>5>172>208>459>232>398>453>183>...
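The rankings above already hint at how diagnostic the topics are. As a rough illustration, the sketch below compares only the ten topic IDs shown per type (the lists on the slide are truncated, so this says nothing about the full rankings). Using Jaccard overlap as the comparison is an assumption for this example, not the measure used in the talk.

```python
from typing import List

# Top-10 topic IDs per feature type, copied from the slide (truncated there).
CITY = [204, 450, 104, 282, 267, 497, 443, 484, 277, 97]
TOWN = [425, 450, 419, 367, 104, 429, 266, 69, 204, 308]
MOUNTAIN = [27, 110, 5, 172, 208, 459, 232, 398, 453, 183]


def shared_topics(a: List[int], b: List[int]) -> List[int]:
    """Topic IDs that appear in both top-10 lists, sorted by ID."""
    return sorted(set(a) & set(b))


def jaccard(a: List[int], b: List[int]) -> float:
    """Size of the intersection divided by size of the union."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb)


print(shared_topics(CITY, TOWN))      # topics shared by City and Town
print(jaccard(CITY, TOWN))            # overlap between City and Town
print(jaccard(CITY, MOUNTAIN))        # overlap between City and Mountain
```

City and Town share three of their ten listed topics, while Mountain shares none with either, which matches the intuition that settlement types resemble each other more than they resemble natural features.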

The IARPA Finder Challenge

Finder is like facial recognition for backgrounds ;-)

Estimate the location of pictures and videos without any explicit geolocation information.

The IM2GPS System

’Estimating geographic information from a single image’
’Purely data-driven scene matching’ (low-level features)

Big Data Check
Volume: 6 million (out of 6 billion) Flickr photos
Velocity: in theory, new pictures every second
Variety: single type of data

Our DiaLoc System: Exploiting Heterogeneity

Key Idea: Exploit the geo-indicativeness of thematic bands.

’market food street narrow dense populated asia economy air conditioning smog fog humid warm building construction skyscrapers skyline shipping export channel harbor transportation tram city advertisement’

Variety: Plain text, not image features, as data source

Estimation of Location And Type

[Figure: estimated values (axis from 0 to 0.5) for Cape Norman and Santa Barbara across the feature types City, Lake, Valley, Mountain, HistoricPlace, Town, WorldHeritageSite, ProtectedArea, Village, Cave, Island, Museum, Stream, Park, Theatre, Lighthouse, Stadium, Hotel, Restaurant, Airport, Hospital.]

Volume: > 500,000 Wikipedia articles & travel blog entries.
Velocity: in theory, new travel blog entries every minute.
IM2GPS and DiaLoc each exclude 99.9% of the land surface of the Earth; what if we combine them?
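The closing question can be made concrete with a small sketch. It assumes, purely for illustration, that each system outputs a set of candidate grid cells and that the two outputs are combined by intersection; the function names and the cell IDs are hypothetical. The second function gives an idealized best case that only holds if the two systems err independently, which the slide does not claim.

```python
from fractions import Fraction
from typing import Set


def combine_candidates(cells_a: Set[str], cells_b: Set[str]) -> Set[str]:
    """Keep only the cells that both estimators still consider plausible."""
    return cells_a & cells_b


def remaining_fraction_if_independent(excluded_a: Fraction,
                                      excluded_b: Fraction) -> Fraction:
    """Idealized share of land surface left when exclusions are independent."""
    return (1 - excluded_a) * (1 - excluded_b)


# Hypothetical candidate cells from an image-based and a text-based estimator.
image_based = {"cell_17", "cell_42", "cell_99"}
text_based = {"cell_42", "cell_63"}
print(combine_candidates(image_based, text_based))

# Each system excludes 99.9% of the land surface (figure from the slide).
excluded = Fraction(999, 1000)
print(remaining_fraction_if_independent(excluded, excluded))
```

In practice image-based and text-based evidence are likely correlated, so the real combined reduction would be smaller than this idealized product suggests.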

Thematic Semantic Signatures for DBpedia Classes

Geolocation APIs – Mapping Space to Place

Geolocation APIs map geographic coordinates, e.g., from a user’s smartphone, to an ordered set of nearby candidate POIs.
These services typically return the n nearest POIs within a certain radius and use spatial distance to the provided coordinates to determine their order.
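The purely spatial ranking described here can be sketched in a few lines. This is a generic illustration, not the behavior of any particular geolocation service: the function names, the tuple layout, and the use of great-circle (haversine) distance are assumptions.

```python
import math
from typing import List, Tuple

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters

Poi = Tuple[str, float, float]  # (name, latitude, longitude) in degrees


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two coordinates, in meters."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearest_pois(lat: float, lon: float, pois: List[Poi],
                 n: int, radius_m: float) -> List[Tuple[str, float]]:
    """Return up to n POIs within radius_m, ordered by distance only."""
    scored = [(haversine_m(lat, lon, plat, plon), name) for name, plat, plon in pois]
    in_range = sorted((d, name) for d, name in scored if d <= radius_m)
    return [(name, d) for d, name in in_range[:n]]
```

For example, `nearest_pois(0.0, 0.0, [("cafe", 0.0, 0.001), ("bar", 0.0, 0.002), ("museum", 0.0, 0.1)], n=5, radius_m=500)` keeps the cafe and the bar, in that order, and drops the museum because it lies outside the radius.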

Temporal Signatures: Combined Day + Hour Band for POI

When you are is what you are
Places can be semantically annotated based on geo-social check-ins.
Primitives: weekday vs. weekend, evening vs. morning, etc.
Sometimes day or hour bands alone are not indicative (e.g., university) but jointly form a signature.
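A combined day + hour band can be sketched as a 168-bin histogram over check-in times. This is a minimal illustration: the function name is hypothetical, and normalizing the counts to relative frequencies is an assumption rather than something stated on the slide.

```python
from datetime import datetime
from typing import Iterable, List

HOURS_PER_WEEK = 7 * 24  # 168 combined day + hour bins


def temporal_signature(checkins: Iterable[datetime]) -> List[float]:
    """Relative check-in frequency per (weekday, hour) bin, Monday 00:00 first."""
    counts = [0] * HOURS_PER_WEEK
    for ts in checkins:
        # datetime.weekday() returns 0 for Monday through 6 for Sunday.
        counts[ts.weekday() * 24 + ts.hour] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * HOURS_PER_WEEK
    return [c / total for c in counts]
```

Keeping the day and hour jointly, instead of two separate 7-bin and 24-bin histograms, is what allows a place such as a university to show a pattern that neither band reveals alone.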

Distort the POI Locations Based on Temporal Signatures

The likelihood of visiting a coffee shop, university, bakery, etc. at 7pm is rather low, while it is a peak hour for restaurants.
Modify the purely spatial ranking by pulling and pushing places based on their check-in probability.
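The pulling and pushing can be illustrated with a simple re-ranking. The slide does not give a formula, so the one below (distance divided by a smoothed check-in probability) is an assumption chosen for illustration, and the names, probabilities, and distances are made up.

```python
from typing import Dict, List, Tuple

Candidate = Tuple[str, str, float]  # (name, POI type, distance in meters)


def rerank(candidates: List[Candidate],
           checkin_prob: Dict[str, float],
           smoothing: float = 0.001) -> List[str]:
    """Order candidates by distance distorted by check-in probability.

    Likely POI types are pulled closer, unlikely ones are pushed away.
    The smoothing term avoids division by zero for unseen types.
    """
    def adjusted_distance(item: Candidate) -> float:
        _name, poi_type, distance_m = item
        return distance_m / (checkin_prob.get(poi_type, 0.0) + smoothing)

    return [name for name, _type, _dist in sorted(candidates, key=adjusted_distance)]


# Hypothetical example: the cafe is closer, but 7pm favors restaurants.
candidates = [("Cafe A", "coffee shop", 50.0), ("Bistro B", "restaurant", 80.0)]
prob_at_7pm = {"coffee shop": 0.002, "restaurant": 0.02}
print(rerank(candidates, prob_at_7pm))
```

With equal probabilities for all types the ordering falls back to the purely spatial one, so the temporal signature only changes the ranking where it carries information.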

Spatial-Semantic Bands and Signatures

POIs plotted by similarity to bar and post office in OSM data, London, UK
Local Reifications (Primitives): e.g., Uniform and Clumped
Bars (and similar features) tend to clump together
Post Offices (and similar features) are rather uniformly distributed
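The contrast between clumped and uniform patterns can be quantified with the average nearest neighbor distance (ANND), one of the geospatial bands listed earlier. The sketch below is a brute-force version that assumes planar coordinates in arbitrary units; the function name and the toy point sets are hypothetical and unrelated to the London data.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def average_nearest_neighbor_distance(points: List[Point]) -> float:
    """Mean distance from each point to its closest other point."""
    if len(points) < 2:
        raise ValueError("need at least two points")
    total = 0.0
    for i, p in enumerate(points):
        total += min(math.dist(p, q) for j, q in enumerate(points) if j != i)
    return total / len(points)


# Toy patterns with nine points each over roughly the same extent.
uniform = [(float(x), float(y)) for x in range(3) for y in range(3)]
clumped = [(cx + dx, cy + dy)
           for cx, cy in [(0.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
           for dx, dy in [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1)]]

print(average_nearest_neighbor_distance(uniform))
print(average_nearest_neighbor_distance(clumped))
```

The clumped pattern yields a much smaller ANND than the evenly spaced one, which is the kind of difference the bar versus post office example describes.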

Where you are is what you are

Dzero measures the likelihood of features of a certain type to co-occur within a specific semantic and spatial range.
User support: generate recommendations, and clean up data based on type likelihood. ’How likely is a post office directly next to an existing one?’

Backup Slides

When Do You Need Semantics?

Observation-Driven Ontology Engineering
