Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013


description

Seminar on Open Data at Universidad Simon Bolivar presented by Lorenzino Vaccari. Authors: Juan Pane, Lorenzino Vaccari. Contributions (CC-BY) from Maurizio Napolitano: Slides 7,8, 55,56,57 and from 61 to 69 Five parts: 1. Open Data: introduction 2. Open Data: Issues 3. Open Data in Trentino Project 4. Open data: Applications 5. Open Data: Semantic Issues

Transcript of Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

Page 1: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

11/04/23 Lorenzino Vaccari, Juan Pane, http://dati.trentino.it

Open Government Data Seminar @USB*

*This presentation is taken from the “Open Government Data Tutorial” presented at CLEI2013

Lorenzino Vaccari (1), Juan Pane (2)

(1) Autonomous Province of Trento, Trento, Italy [email protected]

(2) University of Trento, Trento, Italy – Universidad Nacional de Asuncion, Asuncion, Paraguay [email protected]

Page 2: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

Goal of the Seminar
• Introduce Open Government Data
• Intro, Issues (Part 1)
• If you need it, how can you organize it?
• Real experience (Part 2)
• Methods for opening data
• Applications (Part 3)
• Semantic Issues (Part 4)

Page 3: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013


http://www.point-fort.com/index.php?2012/01/25/805-why-how-what

Page 4: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

What?

Open data “is data that can be freely used, reused and redistributed by anyone – subject only, at most, to the requirement to attribute and sharealike.” *

*(Source: http://www.opendefinition.org)

Page 5: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

“open” =
- use
- reuse
- redistribution
- commercial reuse
- derivative works

BUT, may require:
- attribution
- share alike

http://myfbcovers.com/uploads/covers/2012/06/09/16628a1094aa012f7c6e0025902480d2/watermarked_cover.jpg

J. Gray (OKF): http://www.slideshare.net/jwyg/open-government-data-what-why-how

Page 6: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013


The value is in its use

Page 7: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

Maurizio Napolitano: http://www.youtube.com/watch?v=YlkjrVAW43Q

Page 8: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013


Is open data useful?

Maurizio Napolitano: http://www.youtube.com/watch?v=YlkjrVAW43Q

Page 9: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

Open Data Benefits

Open data is the knowledge base to:

• Improve economic growth and entrepreneurship through digital services that reuse Public Sector Information

• Answer social needs through the publication of innovative services and applications

• Reduce the cost of public administration activities within Public–Private Partnerships (PPP)

• Improve the transparency of the activities of public institutions and the participation of citizens in these activities

Page 10: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

Principles

Tim Berners-Lee (5-Stars of Linked Open Data)
Vs.
Tim Davies (5-Stars of Open Data Engagement)

http://5stardata.info/

http://www.timdavies.org.uk/2012/01/21/5-stars-of-open-data-engagement/

Page 11: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

5 Stars of Linked Open Data
Tim Berners-Lee

http://5stardata.info

Page 12: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

5-Stars of Open Data Engagement

* Be demand driven
* * Provide context
* * * Support conversation
* * * * Build capacity & skills
* * * * * Collaborate with the community

Tim Davies

http://www.timdavies.org.uk/2012/01/21/5-stars-of-open-data-engagement/

Page 13: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

Create Community
http://msnbcmedia.msn.com/j/MSNBC/Components/Photo/_new/pb-121007-spain-tarragona-pyramid-nj-02.photoblog900.jpg

Page 14: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013


Open Government Data

Page 15: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

State of the Art
What is happening around us?
- Globally
- Europe
- Latin America

Page 16: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

Open Data Charter - G8
The principles are:
• Open Data by Default
• Quality and Quantity
• Usable by All
• Releasing Data for Improved Governance
• Releasing Data for Innovation

http://opensource.com/government/13/7/open-data-charter-g8

https://www.gov.uk/government/publications/open-data-charter/g8-open-data-charter-and-technical-annex

Page 17: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

http://opensource.com/government/13/7/open-data-charter-g8

http://census.okfn.org/

OGD around the world

Page 18: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

http://opensource.com/government/13/7/open-data-charter-g8

http://census.okfn.org/country/

Page 19: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013


OGD in Europe

http://open-data.europa.eu/

Page 20: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

OGD in Europe (screenshots)

http://epsiplatform.eu/content/european-psi-scoreboard

Page 21: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

OGD in Europe (table)

http://epsiplatform.eu/content/european-psi-scoreboard
http://epsiplatform.eu/content/psi-scoreboard-indicator-list

Page 22: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013


OGD in Italy

http://www.dati.gov.it/content/infografica

Page 23: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013


OGD in Latin America*

*In Venezuela some OD projects have been started by the USB

Page 24: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013


Questions?

OGD: Part 2 - Issues

Page 25: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

http://evian-thesource.com/kids-having-fun/

Page 26: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

Open Data. Oh ohh

Barriers
• Legal
• Organizational
• Technical
• Adoption
• Contextual

http://www.wallpapermania.eu/wallpaper/trick-or-treat-cute-pumpkins-lanterns-halloween-wallpaper

Page 27: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

http://de.straba.us/wp-content/uploads/2012/08/barrieres_for_implementation_of_ogd.png

Page 28: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

Organizational Barriers
• Not ready
• Lack of resources: IT, Human
• Don’t want to be ready

http://montcomediation.org/images/MCMC_MyWayYourWay.jpg

Page 29: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

Legal barriers

Open the Data: all the data that was produced using public money has to be made publicly available (with exceptions)

vs

Privacy: you cannot open data that could allow correlation of private personal data

Or the complete lack of legislation!

Page 30: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

Adoption barriers
• Data is not contextualized
• People are not informed
• Opening data is a complex task, opening cleaned data is even more complex.
• Unclear licenses

Page 31: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

Technical Barriers
• Access to data:
  • Organizational
  • Technical, downtimes, logins, payment fees
• Fragmentation, incomplete data, scattered
• Format
• Cataloging, indexing, search
• Lack of explicit semantics, metadata
• Data is not reliable
• Conflicting standards, models, ontologies

Page 32: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

Barriers
Zuiderwijk et al. 2010

Listed 118 socio-technical impediments for opening data in the literature.
• Findability
• Usability
• Understandability
• Quality
• Linking
• Comparability and compatibility
• Metadata
• …

http://www.ejeg.com/issue/download.html?idArticle=255

Page 33: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

Context Barriers
• Privileged access to data
• Other companies want to avoid privacy legislation.
• Transparency is bad for fraudulent business

http://img.gawkerassets.com/img/182n8vzdlg1iojpg/original.jpg

Page 34: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

http://netdna.webdesignerdepot.com/uploads/photo_manipulation/manipulation-9.jpg

Page 35: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

Questions?

Part 3 - Real Experience

Page 36: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

http://goo.gl/T2Xp80

Page 37: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

The “Open Data in Trentino” project

• The “Open Data in Trentino” project is a three-year initiative to develop an open data infrastructure that enhances service innovation in Trentino, following the PAT strategy for ICT-enabled service innovation. The project is developed in partnership between Trento RISE and the Autonomous Province of Trento (PAT), according to the PAT innovation model.

• Goals
  • Improved quality of life for citizens
  • Open Data and local businesses
  • Transparency
  • Improved efficiency and productivity

Page 38: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013


Workplan - Steps

Page 39: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

Name (Acronym) | Description | Data type | File extension

Comma Separated Value (CSV) | Text format for exchanging tables, in which rows correspond to lines and the values of the individual columns are separated by a comma (or semicolon) | Tabular data | .csv

Geography Markup Language (GML) | XML format for exchanging vector spatial data | Vector geographic data | .gml

Keyhole Markup Language (KML) | XML-based format created to handle three-dimensional spatial data in Google Earth and Google Maps | Vector geographic data | .kml

Open Document Format (ODF) | Format for storing and exchanging text documents, spreadsheets, charts and presentations | Tabular data | .odc

Resource Description Framework (RDF) | XML-based; the basic tool proposed by the World Wide Web Consortium (W3C) for encoding, exchanging and reusing structured metadata, enabling interoperability between applications that exchange information on the Web | Structured data | .rdf

ESRI Shapefile (SHP) | A popular vector format for geographic information systems. The geographic data is normally distributed as three or four files (four if the coordinate reference system is specified). ESRI released it as an (almost) open format | Vector geographic data | .shp, .shx, .dbf, .prj

Extensible Markup Language (XML) | A markup format, i.e. one based on a mechanism that defines and controls the meaning of the elements in a document or text through tags (markup) | Structured data | .xml
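As an illustration only, the format table above can be turned into a small lookup that classifies a file by its extension. The function name `classify` and the English data-type labels are assumptions made for this sketch, not part of the project's tooling.

```python
from pathlib import Path

# Extension -> (format acronym, data type), transcribed from the table above.
FORMATS = {
    ".csv": ("CSV", "tabular"),
    ".gml": ("GML", "vector geographic"),
    ".kml": ("KML", "vector geographic"),
    ".odc": ("ODF", "tabular"),
    ".rdf": ("RDF", "structured"),
    ".shp": ("SHP", "vector geographic"),
    ".shx": ("SHP", "vector geographic"),
    ".dbf": ("SHP", "vector geographic"),
    ".prj": ("SHP", "vector geographic"),
    ".xml": ("XML", "structured"),
}


def classify(filename):
    """Return (acronym, data type) for a file name, or None if the extension is not in the table."""
    return FORMATS.get(Path(filename).suffix.lower())
```

A catalog could use such a lookup to decide which importer handles a resource; anything that returns `None` would need manual inspection.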

Page 40: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

Technological platform
• Meteo (weather)
• GeoDati (geodata)
• Statistica (statistics)
• Comune Trento (Municipality of Trento)
• Trasporti (transport)
• Etc…

Page 41: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013


Catalog

The Open Knowledge Foundation (OKF) is a non-profit organisation founded in 2004 and dedicated to promoting open data and open content in all their forms – including government data, publicly funded research and public domain cultural content.

(2004)

http://okfn.org

Page 42: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013


http://dati.trentino.it*

Analysis: http://dati.trentino.it/stats Admin: http://dati.trentino.it/admin Harvesting: http://dati.trentino.it/harvest

* Available for all the data providers of Trentino  

Page 43: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013


Services

Page 44: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

Trentino is also going to launch a challenge to build software applications and creative products (multimedia, audiovisual products, posters, illustrations) based on the datasets published in the http://dati.trentino.it open data catalog.

 #ODTChallenge will be the official hashtag for our first open data challenge in Trentino! 

Page 45: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013


Page 46: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

7 months until now
• 68,555 visits
• 7,988 unique visits
• 2,516 downloads
• 37.36% returning visitors
• 62.64% new visitors

NOW
- ALL the departments demand to be involved
- Plus other local actors

567 datasets provided by 10 departments of PAT…
Agriculture, Culture, Geographical Data, Welfare, Weather Forecast, Social policies, Statistics, Transports… MUNICIPALITY OF TRENTO, and INFORMATICA TRENTINA

• 20 reporting errors
• 15 asking for new data
• 10 new suggestions
• 6 OD Applications

100% ENTHUSIASTIC REACTIONS

Page 47: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013


Want to Know & Learn more?

Page 48: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

http://www.theodi.org/

Page 49: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

http://schoolofdata.org/

Page 50: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

http://opendatahandbook.org/pt_BR/

Page 51: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

http://www.od4d.org/category/open-data/how-to/

Page 52: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

http://schoolofdata.org/online-resources/

Page 53: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

Thanks to the project team!
• General Manager: Isabella Bressan
• Project coordinator: Lorenzino Vaccari
• Organizational/Communication issues: Francesca Gleria, Roberto Cibin
• Data gatherer: Luca Paolazzi
• Catalog: Maurizio Napolitano, Samuele Santi
• Semantics: Juan Pane, David Leoni, Alberto Zanella
• Legal issues: Eleonora Bassi, Stefano Leucci
• Communities: Maurizio Napolitano, Francesca De Chiara
• System integration: Marco Combetto, Lorenzo Dallapè
• Statistical Linked Data: Pavel Shvaiko

Page 54: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013


Questions?

OGD: Part 4 - Applications

Page 55: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013


Apps4Italy

Page 56: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013


Best Application: http://parlamento17.openpolis.it/

Page 57: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013


Open Bilancio

Best Idea: http://opendata.comune.fi.it/open_bilancio/

Page 58: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013


What?

DAL America Latina (2012): http://desarrollandoamerica.org/aplicaciones-2012/

DAL America Latina (2013): http://2013.desarrollandoamerica.org/appschallenge/

Page 59: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013


http://limaio.innovacion.pe/ http://www.limaio.com/demo

Page 60: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

http://www.mysociety.org/2007/more-travel-maps/morehousing

Page 61: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

Johann MITTHEISZ (CIO of the City of Vienna)

http://www.slideshare.net/BrigitteLutz/keynote-mittheisz-cio-stadt-wien/16

Total hours to develop 38 applications: around 2,600

City of Vienna saved around 208,000 Euro
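The two figures above can be related with a quick calculation. The slide does not state an hourly rate; the value below is only what the reported numbers imply, and the variable names are illustrative.

```python
hours = 2600            # total development hours reported on the slide
savings_eur = 208000    # savings reported on the slide, in euros
applications = 38       # number of applications reported on the slide

implied_rate = savings_eur / hours     # euros per development hour implied by the two figures
hours_per_app = hours / applications   # average development effort per application
```

Under this reading, the saving corresponds to valuing each community development hour at 80 Euro, with an average of roughly 68 hours per application.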

Page 62: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

The Open Data Ecosystem (and the OpenStreetMap case)

Page 63: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013


Page 64: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

OpenStreetMap

The OpenStreetMap project creates and provides geographical data, such as road maps, freely available to anyone. The project was established and has grown in response to restrictions on the use or availability of map information across much of the world, and to the advent of inexpensive portable satellite navigation devices.

OpenStreetMap is a free map of the world, created by someone like you

Page 65: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

http://tools.geofabrik.de/mc/?mt0=mapnik&mt1=googlemap&lon=11.12042&lat=46.07224&zoom=18

Page 66: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

http://haiti.ushahidi.com

Page 67: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013


Watercolor maps

http://content.stamen.com/files/cartography/index_watercolor.html#18.00/46.07204/11.12097

Page 68: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013


From maps to blankets…

http://softcities.net

Page 69: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

Sharing Data Globally (the eHabitat example)

Page 70: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

21st Century Challenges

Source: http://www.slideshare.net/angeled/geoss © GEO secretariat

Page 71: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

The Group on Earth Observations

Source: http://www.slideshare.net/angeled/geoss © GEO secretariat

84 GEO members and 61 Participating organizations

Page 72: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013


GEOSS Data Sharing Principles

• Full and Open Exchange of Data, recognizing Relevant International Instruments and National Policies

• Data and Products at Minimum Time delay and Minimum Cost

• Free of Charge or minimal Cost for Research and Education

http://www.geoportal.org/web/guest/geo_home

Page 73: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

“Venezuela is considered a state with extremely high biodiversity, with habitats ranging from the Andes mountains in the west to the Amazon Basin rainforest in the south, via extensive llanos plains and Caribbean coast in the center and the Orinoco River Delta in the east.”

Source: Wikipedia

Page 74: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013


GEOSS for biodiversity

http://www.eurogeoss-broker.eu/

Page 75: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013


The eHabitat Model

http://ehabitat-wps.jrc.ec.europa.eu/ehabitat/

Page 76: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013


Questions?

OGD: Part 5 - Semantics

Page 77: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

• Available
• Structured
• Open formats
• Referenceable
• Linked

Linked Open Data

The best data is open data

Vs.

All data must be perfect
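The five levels listed on this slide build on each other, so a dataset's rating is the number of consecutive levels it satisfies. A minimal sketch, assuming each level can be reduced to a yes/no flag; the `Dataset` class and its field names are hypothetical:

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    available: bool       # published on the web under an open licence
    structured: bool      # machine-readable structure
    open_format: bool     # non-proprietary format
    referenceable: bool   # items identified by URIs
    linked: bool          # links to other data


def stars(ds: Dataset) -> int:
    """Count the consecutive levels satisfied, starting from 'available'."""
    levels = [ds.available, ds.structured, ds.open_format, ds.referenceable, ds.linked]
    count = 0
    for satisfied in levels:
        if not satisfied:
            break
        count += 1
    return count
```

Stopping at the first unmet level reflects the slide's message: an imperfect but published dataset already earns a star, and can climb later.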

Page 78: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

Lack of explicit semantics
The real meaning of the data was kept in the developer's mind when creating the data

http://goo.gl/npEHKr

Page 79: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

Lack of explicit semantics
Can lead to things like:

Page 80: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

Semantic heterogeneity
Difference in the meaning of local data

Page 81: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

Issues when Opening Trentino Data
• Each department has authority on only some part of the data.
• Dataset originally created for internal use only.
• Dataset created for a specific need.
• Dataset created with custom format:
  • For structure (some exceptions)
  • For data
• Lack of reuse -> duplication.
• Lack of programmers.
• We cannot TELL them what/how to do (always).
• Data changes

Page 82: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

• Available
• Structured
• Open formats
• Referenceable
• Linked

Entity Centric Semantic Layer

Data Catalog

Page 83: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

Entity centric: Added value
• Aggregated data
• Accurate data, manually curated
• Unique identifiers, distributed perspectives
• Re-think identifiers
• Semantified values

E1
name: Juan Pane
nationality: Italian
lives in: Trento
affiliation: Univ. Trento

E2
name: Ignacio P. F.
born in: Paraguay
date of birth: 1980
affiliation: PF-UNA

Page 84: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

Entities
• Real world: something that has a distinct, separate existence, although it need not be a material (physical) existence. It has a set of properties, which evolve over time.
• Mental: a personal (local) model, created and maintained by a person, that references and describes a real world entity.
• Digital: captures the semantics of real world entities, as provided by people.

Page 85: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

Entity Centric Semantic Layer:
• Address the integration problems due to semantic heterogeneity:
  • Different formats
  • Different identifiers
  • Implicit semantics
  • Homonyms, synonyms, aliases
  • Partial knowledge
  • Knowledge evolution

http://www.webfoundation.org/2011/11/5-star-open-data-initiatives/

Page 86: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

Entity-based Integration
• Focus on entities as first class citizens
• Entities are objects which are so important in our everyday life that they are referred to by name
• Each entity has its own metadata (e.g. name, latitude, longitude, …)
• Each entity is in relation with many other entities (e.g. Einstein was born in Ulm, his affiliation was Charles University, Ulm is a city in Germany)
• There are relatively “few” commonsense entity types (person, …, event)
• There are many domain specific entities (bus stops, cycling paths, ..)
• All components have explicit semantics: schema, entities, attributes, values
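The Einstein example on this slide can be written down as a tiny entity store. This is a sketch under assumptions: the `Entity` class, the identifier scheme and the relation names ("born in", "located in", "affiliation") are invented for illustration and are not the project's data model.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    id: str
    etype: str                                      # entity type, e.g. person, city
    attributes: dict = field(default_factory=dict)  # the entity's own metadata
    relations: dict = field(default_factory=dict)   # relation name -> target entity id


germany = Entity("e:germany", "country", {"name": "Germany"})
ulm = Entity("e:ulm", "city", {"name": "Ulm"}, {"located in": germany.id})
charles = Entity("e:charles-univ", "organization", {"name": "Charles University"})
einstein = Entity("e:einstein", "person", {"name": "Einstein"},
                  {"born in": ulm.id, "affiliation": charles.id})

store = {e.id: e for e in (germany, ulm, charles, einstein)}


def follow(entity_id, *path):
    """Follow a chain of relation names from an entity and return the entity reached."""
    current = store[entity_id]
    for relation in path:
        current = store[current.relations[relation]]
    return current
```

Because relations point to identifiers rather than to text, a question such as "in which country was Einstein born?" becomes a two-step traversal: `follow("e:einstein", "born in", "located in")`.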

Page 87: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

Importing pipeline, Macro Steps
1. Domain analysis
   • Study the needed entity types, adapt the knowledge base accordingly.
   • First time bootstrapping
2. Import entities
   • Semi-automatic tool.
   • Domain experts are expensive.
   • Human attention is a scarce resource.
   • Incremental enrichment and aggregation of entities.

Page 88: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

Open Data Peculiarities
• All data comes from a CKAN repository (DCAT).
• Process one data file at a time.
• Each data file can be represented as a table.
• Each row in the table represents a (partial) entity.
• The format of the values might not be enforced in the data files.
• Not all data is relevant.
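The peculiarities above suggest a simple reading step: load one tabular file and treat every row as a partial entity, keeping only the values that are actually present. A minimal sketch using Python's standard `csv` module; the function name and the sample data are illustrative.

```python
import csv
import io


def rows_as_partial_entities(text, delimiter=","):
    """Parse one tabular data file; each row becomes a dict holding only its non-empty values."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    entities = []
    for row in reader:
        # Skip overflow cells (key None) and missing or blank values.
        entity = {k: v.strip() for k, v in row.items()
                  if k is not None and v is not None and v.strip()}
        if entity:
            entities.append(entity)
    return entities


sample = "nome,provincia,funivie\nAndalo,Provincia di Trento,3\nCanazei,,2\n"
entities = rows_as_partial_entities(sample)
```

Values stay as strings at this stage, since the file does not enforce their format; typing and checking belong to a later validation step.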

Page 89: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013


Importing tool process

Page 90: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

1. Source Selection
Import one data file at a time

Page 91: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

2. Schema Matching
Select a target type of entity -> correspondences between the input columns and the output attributes

nome | provincia | descrizione | funivie | lat | long
Andalo (1047) | Provincia di Trento | Sorge su un'ampia sella prativa al centro... | 3 | 654463 | 712857
Canazei (1450) | Trento Prov. | Situato all'estremità settentrionale della... | 2 | 511504 | 147444
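Once correspondences between input columns and target attributes are known, applying them is a renaming step. The sketch below assumes the correspondences were already chosen (by a person or a matcher); the English attribute names are hypothetical, as is the extra `note` column used to show an unmatched input.

```python
# Hypothetical correspondences: input column -> attribute of the target entity type.
CORRESPONDENCES = {
    "nome": "name",
    "provincia": "province",
    "descrizione": "description",
    "funivie": "cable_cars",
    "lat": "latitude",
    "long": "longitude",
}


def apply_correspondences(row, correspondences):
    """Rename input columns to target attributes; columns without a correspondence are dropped."""
    return {correspondences[col]: value for col, value in row.items() if col in correspondences}


row = {"nome": "Andalo (1047)", "provincia": "Provincia di Trento", "funivie": "3",
       "lat": "654463", "long": "712857", "note": "n/a"}
entity = apply_correspondences(row, CORRESPONDENCES)
```

Dropping unmatched columns mirrors the earlier observation that not all data is relevant to the chosen entity type.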

Page 92: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

3. Data Validation
Applies format and structure validation and possible automatic transformations needed to have the input data in the expected format.
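A validation step of this kind might look as follows. This is a sketch, not the project's tool: the column names are taken from the sample table, the expected types are assumptions, and the only automatic transformation shown is turning a decimal comma into a decimal point.

```python
def validate_row(row):
    """Return (clean_row, errors): numeric fields are converted, unparseable values are reported."""
    clean, errors = dict(row), []
    # Assumed expected types for the sample columns.
    for name, caster in (("funivie", int), ("lat", float), ("long", float)):
        raw = str(row.get(name, "")).strip().replace(",", ".")
        try:
            clean[name] = caster(raw)
        except ValueError:
            errors.append(f"{name}: cannot parse {raw!r}")
    return clean, errors
```

For example, `validate_row({"funivie": "3", "lat": "654463", "long": "712857,5"})` converts all three values, while a row with `"funivie": "tre"` keeps the original text and reports one error for a person to review.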

Page 93: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

4. Semantic Enrichment (1/2)
Entity disambiguation: Transform text references into links to existing entities.
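In the sample table, "Provincia di Trento" and "Trento Prov." are two text references to the same entity. A minimal sketch of disambiguation through an alias index; the knowledge base content and the identifier `province/trento` are invented for illustration, and real disambiguation would need more than exact alias lookup.

```python
import re

# Hypothetical knowledge base: entity id -> known names and aliases.
KNOWN_ENTITIES = {
    "province/trento": ["Provincia di Trento", "Trento Prov.", "Provincia Autonoma di Trento"],
}


def normalize(text):
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    return " ".join(re.sub(r"[^\w\s]", " ", text.lower()).split())


ALIAS_INDEX = {normalize(alias): entity_id
               for entity_id, aliases in KNOWN_ENTITIES.items()
               for alias in aliases}


def disambiguate(reference):
    """Return the id of the entity a text reference points to, or None if it is unknown."""
    return ALIAS_INDEX.get(normalize(reference))
```

Both spellings from the table resolve to the same identifier, so the text value can be replaced by a link to that entity.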

Page 94: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

4. Semantic Enrichment (2/2)
Natural Language Processing: Extract concepts and entity references from free-text.

Page 95: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

5. Reconciliation
Run Identity Management Algorithms to identify each row as a new or existing entity.

Result:
• No Match
• Match
• Multiple Matches

Action:
• Use ID
• New ID
• Ignore Row
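The slide lists three possible results and three possible actions without pairing them explicitly. The sketch below assumes the natural pairing (no match: new ID; one match: use the existing ID; multiple matches: ignore the row) and uses a random UUID as a stand-in for the project's identifier service.

```python
import uuid


def reconcile(candidate_ids):
    """Map the matching result for one row to an (action, identifier) pair."""
    if len(candidate_ids) == 0:
        return "new_id", str(uuid.uuid4())   # no match: mint a new identifier
    if len(candidate_ids) == 1:
        return "use_id", candidate_ids[0]    # one match: reuse the existing identifier
    return "ignore_row", None                # several matches: ambiguous, skip the row
```

Ignoring ambiguous rows keeps the entity store clean at the cost of coverage; such rows could instead be queued for a domain expert.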

Page 96: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

6. Exporting
At this point:
• We know what to export.
• All values for target attributes conform to the expected format.
• All text has been semantified (NLP).
• All textual references to entities are converted to links.
• Each row has an identifier.

Page 97: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

7. Publishing
Put the semantified entities back into CKAN so that the entities can be Open Data and can be found in the same catalog as the original data.
Developers can find the data files of the cleaned, aggregated entities, but can also interact with the entities via the Entitypedia APIs.

8. Visualization
Search and Navigation

Page 98: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

Semantic Layer: Services
Tool for aiding the “semantification” of the datasets in the catalog based on:
• Schema matching services
• Identity Management services
  • Entity Matching services
  • Global Unique Identifier services
• Semantic search and indexing services
• Natural Language Processing
• Entity store

Page 99: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013


Our Goal

TN

UK

BEES

Page 100: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013

http://www.youtube.com/watch?v=Bq_ZWl1ZXA0

BEYOND

Page 101: Open Data Trentino - Seminar at Universidad Simon Bolivar - 15th October 2013


Gracias!

Grazie!

Merci!

Gràcies!

Gratias!

Thanks!

Danke!

Dank u!

Kiitos!

ευχαριστώ

We thank in particular CLEI 2013, Autonomous Province of Trento, TrentoRise association, Universidad Nacional de Asuncion, Universidad Simon Bolivar and University of Trento

Lorenzino Vaccari (1), Juan Pane (2)

(1) Autonomous Province of Trento, Trento, Italy [email protected]

(2) University of Trento, Trento, Italy – Universidad Nacional de Asuncion, Asuncion, Paraguay [email protected]