Dealing with Data Diversity in a Smart City Data Hub

Mathieu d'Aquin - @mdaquin - slideshare.net/mdaquin - Knowledge Media Institute, The Open University

description

Keynote speech at the Semantics for Smarter Cities workshop SSC 2014 at ISWC 2014

Transcript of Dealing with Data Diversity in a Smart City Data Hub

Page 1: Dealing with Data Diversity in a Smart City Data Hub

Dealing with Data Diversity in a Smart City Data Hub

Mathieu d'Aquin - @mdaquin

slideshare.net/mdaquin

Knowledge Media Institute, The Open University

Page 2: Dealing with Data Diversity in a Smart City Data Hub

where a penguin is a dataset

Diversity

Page 3: Dealing with Data Diversity in a Smart City Data Hub

Why should we care about diversity?

Because diversity is good, and what makes data diverse is not the same as what makes it more or less relevant

Page 4: Dealing with Data Diversity in a Smart City Data Hub

Why should we care about diversity?

Because it is hard to manage

How many species of penguins/animals/things?

How many biologists to classify them?

And that's purely static... unlike species, new data appears all the time...

Page 5: Dealing with Data Diversity in a Smart City Data Hub

Why should we care about diversity?

The Eskimo language has 255 different words for "visiting linguist"

Because we might have a lot of it, or what we need to manage is very granular

Page 6: Dealing with Data Diversity in a Smart City Data Hub

Data diversity in a Smart City

Example of the MK:Smart project in Milton Keynes, UK (mksmart.org)

Page 7: Dealing with Data Diversity in a Smart City Data Hub

Data diversity in a Smart City: Partners in the MK:Smart project

Page 8: Dealing with Data Diversity in a Smart City Data Hub

Data diversity in a Smart City: Areas of the MK:Smart project

Page 9: Dealing with Data Diversity in a Smart City Data Hub

Data diversity in a Smart City: MK Data Hub - Where diversity is handled

Page 10: Dealing with Data Diversity in a Smart City Data Hub

A concrete example: Wifi-based presence sensors

Page 11: Dealing with Data Diversity in a Smart City Data Hub

10-12 sensors can cover a reasonably large enclosed area (here, the refectory of the Open University).

A concrete example: Wifi-based presence sensors

Page 12: Dealing with Data Diversity in a Smart City Data Hub

Use triangulation to find the location of wifi-enabled devices.

A concrete example: Wifi-based presence sensors
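As a rough illustration of the positioning step, a device's location can be estimated from its distances to three sensors at known positions. This is a minimal sketch and not the project's actual method: the function name `trilaterate`, the sensor coordinates and the distances are all assumptions, and a real deployment would first have to estimate distances from signal strength and cope with noise.

```python
import math

def trilaterate(p1, r1, p2, r2, p3, r3):
    """Estimate (x, y) from three sensor positions and their distances to a device."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    # Subtracting the circle equations pairwise leaves two linear equations in x and y.
    a1, b1 = 2 * (x2 - x1), 2 * (y2 - y1)
    c1 = r1**2 - r2**2 - x1**2 + x2**2 - y1**2 + y2**2
    a2, b2 = 2 * (x3 - x2), 2 * (y3 - y2)
    c2 = r2**2 - r3**2 - x2**2 + x3**2 - y2**2 + y3**2
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-12:
        raise ValueError("sensors are collinear; position is ambiguous")
    # Solve the 2x2 linear system with Cramer's rule.
    return (c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det

# Hypothetical layout: three sensors in a 10 m x 10 m room, device at (3, 4).
sensors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
device = (3.0, 4.0)
dists = [math.hypot(device[0] - sx, device[1] - sy) for sx, sy in sensors]
estimate = trilaterate(sensors[0], dists[0], sensors[1], dists[1], sensors[2], dists[2])
```

With exact distances the estimate coincides with the true position; with noisy distances from 10-12 sensors, a least-squares fit over all sensors would be the more natural choice.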

Page 13: Dealing with Data Diversity in a Smart City Data Hub

Basic statistical analysis to extract patterns of usage of the facility

A concrete example: Wifi-based presence sensors

Page 14: Dealing with Data Diversity in a Smart City Data Hub

Basic statistical analysis to extract patterns of usage of the facility

A concrete example: Wifi-based presence sensors
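One simple form such a statistical analysis could take is counting the distinct devices seen in each hour and averaging over days. The sketch below is an assumption about what "patterns of usage" might mean here; the observation format (ISO timestamp, device identifier) and the function name `hourly_occupancy` are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

def hourly_occupancy(observations):
    """Average number of distinct devices seen per hour of day.

    The average is taken over the days on which at least one device
    was observed during that hour.
    """
    seen = defaultdict(set)  # (date, hour) -> set of device ids
    for timestamp, device_id in observations:
        t = datetime.fromisoformat(timestamp)
        seen[(t.date(), t.hour)].add(device_id)
    per_hour = defaultdict(list)  # hour -> list of daily distinct counts
    for (_, hour), devices in seen.items():
        per_hour[hour].append(len(devices))
    return {hour: sum(c) / len(c) for hour, c in sorted(per_hour.items())}

# Made-up observations for illustration only.
observations = [
    ("2014-10-13T12:05:00", "a"),
    ("2014-10-13T12:40:00", "b"),
    ("2014-10-13T12:50:00", "a"),
    ("2014-10-13T15:10:00", "c"),
    ("2014-10-14T12:15:00", "a"),
]
pattern = hourly_occupancy(observations)
```

Here the lunchtime hour stands out against the afternoon, which is the kind of pattern one would expect for a refectory.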

Page 15: Dealing with Data Diversity in a Smart City Data Hub

A concrete example: Diversity

Page 16: Dealing with Data Diversity in a Smart City Data Hub

A concrete example: Diversity

Page 17: Dealing with Data Diversity in a Smart City Data Hub

A concrete example: Diversity

Page 18: Dealing with Data Diversity in a Smart City Data Hub

A concrete example: Diversity

Page 19: Dealing with Data Diversity in a Smart City Data Hub

A concrete example: Diversity

Page 20: Dealing with Data Diversity in a Smart City Data Hub

A concrete example: Diversity

Page 21: Dealing with Data Diversity in a Smart City Data Hub

How do we usually deal with this?

For data heterogeneity, we use alignments, mappings, links, etc.

Example: The LinkedUp Catalogue of datasets for education includes mappings between the vocabularies of different datasets (data.linkededucation.org/linkedup/catalogue/).

Page 22: Dealing with Data Diversity in a Smart City Data Hub

What about diversity at the policy level?

Page 23: Dealing with Data Diversity in a Smart City Data Hub

What about diversity at the policy level?

Page 24: Dealing with Data Diversity in a Smart City Data Hub

What about diversity at the policy level?

Page 25: Dealing with Data Diversity in a Smart City Data Hub

What about diversity at the policy level?

Page 26: Dealing with Data Diversity in a Smart City Data Hub

VoID and DC to represent datasets, PROV-O for basic provenance.

More structured representation
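To make the combination of VoID, Dublin Core and PROV-O concrete, the sketch below describes one dataset with a handful of triples and prints them as N-Triples. The vocabulary terms used (`void:Dataset`, `dcterms:title`, `dcterms:license`, `prov:wasDerivedFrom`) are real; the `example.org` dataset IRIs, the choice of licence and the helper names are assumptions made for illustration and do not reflect how the MK Data Hub stores its catalogue.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
VOID = "http://rdfs.org/ns/void#"
DCT = "http://purl.org/dc/terms/"
PROV = "http://www.w3.org/ns/prov#"

def describe_dataset(iri, title, license_iri, derived_from=None):
    """Return triples describing a dataset; objects are (kind, value) pairs."""
    triples = [
        (iri, RDF_TYPE, ("iri", VOID + "Dataset")),
        (iri, DCT + "title", ("literal", title)),
        (iri, DCT + "license", ("iri", license_iri)),
    ]
    if derived_from:
        # Basic provenance: this dataset was computed from another one.
        triples.append((iri, PROV + "wasDerivedFrom", ("iri", derived_from)))
    return triples

def to_ntriples(triples):
    """Serialise the triples as N-Triples lines."""
    lines = []
    for s, p, (kind, value) in triples:
        if kind == "iri":
            obj = f"<{value}>"
        else:
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)

# Hypothetical datasets: aggregated statistics derived from raw sensor readings.
doc = to_ntriples(describe_dataset(
    "http://example.org/dataset/presence-stats",
    "Refectory presence statistics",
    "https://creativecommons.org/licenses/by/4.0/",
    derived_from="http://example.org/dataset/presence-raw",
))
print(doc)
```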

Page 27: Dealing with Data Diversity in a Smart City Data Hub

ODRL for the structured representation of policies and rights

More structured representation

Page 28: Dealing with Data Diversity in a Smart City Data Hub

With the tools to deal with it

More structured representation

Page 29: Dealing with Data Diversity in a Smart City Data Hub

And the processes

More structured representation

Page 30: Dealing with Data Diversity in a Smart City Data Hub

Requires an appropriate representation of dataflows

Reasoning on the way policy-information propagates

Page 31: Dealing with Data Diversity in a Smart City Data Hub

http://purl.org/datanode/ns/

An ontology of relationships between data artifacts (DataNodes).

DataNode

Page 32: Dealing with Data Diversity in a Smart City Data Hub

Captures the essence of dataflows rather than the process, as a basis for meta-information propagation.

DataNode

Page 33: Dealing with Data Diversity in a Smart City Data Hub

Propagating meta-information across dataflows

Examples of rules:

Duties such as attribution propagate over relations of derivation, but not necessarily over others.

Permissions such as the right to redistribute, however, do not propagate over relations of derivation, except in specific cases (e.g. copies).

Prohibitions such as preventing commercial exploitation propagate over derivations.
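The three example rules can be sketched as a small fixed-point computation over a graph of data artefacts. This is only an illustration under stated assumptions: the relation names `derivation` and `copy`, the policy kinds and the dataset names are hypothetical and are not terms from the Datanode ontology or ODRL, and a copy is treated as the one case of derivation over which permissions also carry.

```python
# Which kinds of policy statement carry over which kind of relation.
# Relation names are illustrative, not Datanode terms.
PROPAGATES = {
    "derivation": {"duty", "prohibition"},
    "copy": {"duty", "permission", "prohibition"},
}

def propagate(policies, edges):
    """Propagate policy statements along dataflow edges until nothing changes.

    policies: node -> set of (kind, action) statements
    edges: list of (source, relation, target)
    """
    result = {node: set(stmts) for node, stmts in policies.items()}
    changed = True
    while changed:
        changed = False
        for source, relation, target in edges:
            allowed = PROPAGATES.get(relation, set())
            inherited = {s for s in result.get(source, set()) if s[0] in allowed}
            current = result.setdefault(target, set())
            if not inherited <= current:
                current |= inherited
                changed = True
    return result

# Hypothetical dataflow: a sensor feed is mirrored, and statistics and a
# report are derived from it in turn.
policies = {
    "sensor_feed": {
        ("duty", "attribute"),
        ("permission", "redistribute"),
        ("prohibition", "commercial_use"),
    }
}
edges = [
    ("sensor_feed", "derivation", "stats"),
    ("sensor_feed", "copy", "mirror"),
    ("stats", "derivation", "report"),
]
effective = propagate(policies, edges)
```

In this example the mirror inherits all three statements, while the statistics and the report keep only the attribution duty and the commercial-use prohibition; whether they may be redistributed has to be decided separately.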

Page 34: Dealing with Data Diversity in a Smart City Data Hub

A lot of the semantics for Smart Cities work focuses on data heterogeneity.

There is a need to look at data diversity at the meta-information level (here we focus on policy-related information).

How to manage, catalogue, keep track of and manipulate a large number of datasets with diverse rights, access, validity, scope?

How do we help users/developers in exploring and exploiting this diversity?

Discussion/future

Page 35: Dealing with Data Diversity in a Smart City Data Hub

Discussion/future

Master of Datasets

Page 36: Dealing with Data Diversity in a Smart City Data Hub

Need for a clear, semantic (i.e. ontological) foundation for describing and defining data artefacts.

DataNode is a step towards defining their relationships. Vocabularies such as ODRL and VoID focus on specific aspects.

More is needed to formally represent the fundamental descriptors of data (scope, validity, policy, ...)

Discussion/future

Page 37: Dealing with Data Diversity in a Smart City Data Hub

Thanks!

Mathieu d'Aquin Alessandro Adamou Enrico Daga

Shuangyan Liu Keerthi Thomas Enrico Motta