The Web of Data emerging industries

Post on 08-Feb-2016



Michalis Vafopoulos, vafopoulos.org, 2014

Creative Commons License: This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Contents

① The Web of documents vs. the Web of data
– Some technology
– Some economics
– ...and action
② PSNET project
③ and more…

The Data trilogy

① Open: access

for everyone to use and republish

② Big: scale

high volume, velocity and variety

③ Linked: use

publish once, use many times

The Web of Documents

• Simple, big and unstructured
• Organized in silos

But humans:
• are interested in Things, not documents, and these Things might be in docs or elsewhere
• have limited capacity to extract meaning...

The Web of Data

• Analogy: a global file system --> a global database
• Designed for: human consumption --> machines first, humans later
• Primary objects: documents --> things (or descriptions of things)
• Links between: documents --> things
• Degree of structure in objects: fairly low --> high
• Semantics of content and links: implicit --> explicit

(Tom Heath)

The Web of Data: why?

• encourages reuse
• reduces redundancy
• maximizes its (real and potential) inter-connectedness
• enables network effects to add value to data

The Web of Data: how?

– current state on the Web
• Relational Databases
• APIs
• XML
• CSV
• XLS

Computers can’t consume data because:
• Different formats & models
• Not inter-connected

The Web of Data: how?

– we need to create a standard way of publishing Data on the Web (like HTML for docs)

This is the Resource Description Framework (RDF)

(a simple example here from Juan F. Sequeda; more next semester!)

Resource Description Framework (RDF)

• A data model
– A way to model data
– Inspired by relational databases and logic
• RDF is a triple data model
• Labeled graph (semantic networks)
• Subject, Predicate, Object

<Isidoro> <was born in> <Chios>
<Chios> <is part of> <Greece>
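The triple model above can be sketched in a few lines of plain Python. This is only an illustration under simplifying assumptions: the helper name `match` is hypothetical, and plain strings stand in for the URIs that real RDF would use for subjects and predicates.

```python
# Each statement is a (subject, predicate, object) triple, as on the slide.
triples = [
    ("Isidoro", "was born in", "Chios"),
    ("Chios", "is part of", "Greece"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples that agree with every argument that is not None."""
    return [
        (ts, tp, to)
        for (ts, tp, to) in triples
        if (s is None or ts == s)
        and (p is None or tp == p)
        and (o is None or to == o)
    ]


# Walk two edges of the labeled graph: Isidoro -> Chios -> Greece.
birthplace = match(triples, s="Isidoro", p="was born in")[0][2]
country = match(triples, s=birthplace, p="is part of")[0][2]
print(birthplace, country)
```

Because the object of one triple can be the subject of another, a list of triples already forms a graph that can be traversed edge by edge.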

Example: Document on the Web

Databases back up documents

Isbn | Title | Author | PublisherID | ReleasedData
978-0-596-15381-6 | Programming the Semantic Web | Toby Segaran | 1 | July 2009
… | … | … | … | …

PublisherID | PublisherName
1 | O’Reilly Media
… | …

This is a THING: a book titled “Programming the Semantic Web” by Toby Segaran, …

THINGS have PROPERTIES: a book has a title, an author, …

Data representation in RDF

[Diagram: the table rows drawn as a labeled graph]

book --title--> "Programming the Semantic Web"
book --isbn--> "978-0-596-15381-6"
book --author--> "Toby Segaran"
book --publisher--> Publisher
Publisher --name--> "O’Reilly"
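The step from the relational tables to the graph can be sketched as follows. This is a minimal illustration, not the method used in the slides: the function name `rows_to_triples` and the `book/…` and `publisher/…` identifiers are hypothetical, and reading the book row as having PublisherID 1 is an interpretation of the flattened table. The release date is left out because the diagram does not show it.

```python
book_rows = [
    {
        "Isbn": "978-0-596-15381-6",
        "Title": "Programming the Semantic Web",
        "Author": "Toby Segaran",
        "PublisherID": 1,  # assumed reading of the flattened table
    },
]
publisher_rows = [{"PublisherID": 1, "PublisherName": "O'Reilly Media"}]


def rows_to_triples(book_rows, publisher_rows):
    """Turn each table cell into a (subject, predicate, object) triple."""
    triples = []
    for row in book_rows:
        book = "book/" + row["Isbn"]  # hypothetical local identifier
        publisher = "publisher/" + str(row["PublisherID"])
        triples.append((book, "isbn", row["Isbn"]))
        triples.append((book, "title", row["Title"]))
        triples.append((book, "author", row["Author"]))
        # The foreign key becomes an edge between two things.
        triples.append((book, "publisher", publisher))
    for row in publisher_rows:
        publisher = "publisher/" + str(row["PublisherID"])
        triples.append((publisher, "name", row["PublisherName"]))
    return triples


triples = rows_to_triples(book_rows, publisher_rows)
for triple in triples:
    print(triple)
```

Each row becomes a node, each column a labeled edge, and the join on PublisherID turns into a direct link from the book to its publisher.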

Everything on the Web is identified by a URI!

link the data to other data

http://…/isbn978 --title--> "Programming the Semantic Web"
http://…/isbn978 --isbn--> "978-0-596-15381-6"
http://…/isbn978 --author--> "Toby Segaran"
http://…/isbn978 --publisher--> http://…/publisher1
http://…/publisher1 --name--> "O’Reilly"

consider the data from Revyu.com

http://…/isbn978 --hasReview--> http://…/review1
http://…/review1 --description--> "Awesome Book"
http://…/review1 --reviewer--> http://…/reviewer
http://…/reviewer --name--> "Juan Sequeda"

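start to link data

Book dataset:
http://…/isbn978 --title--> "Programming the Semantic Web"
http://…/isbn978 --isbn--> "978-0-596-15381-6"
http://…/isbn978 --author--> "Toby Segaran"
http://…/isbn978 --publisher--> http://…/publisher1
http://…/publisher1 --name--> "O’Reilly"

Link between the datasets:
http://…/isbn978 (book dataset) --sameAs--> http://…/isbn978 (review dataset)

Review dataset:
http://…/isbn978 --hasReview--> http://…/review1
http://…/review1 --description--> "Awesome Book"
http://…/review1 --hasReviewer--> http://…/reviewer
http://…/reviewer --name--> "Juan Sequeda"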

Juan Sequeda publishes data too

http://juansequeda.com/id --name--> "Juan Sequeda"
http://juansequeda.com/id --livesIn--> http://dbpedia.org/Austin

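Let’s link more data

http://…/isbn978 --hasReview--> http://…/review1
http://…/review1 --description--> "Awesome Book"
http://…/review1 --hasReviewer--> http://…/reviewer
http://…/reviewer --name--> "Juan Sequeda"
http://…/reviewer --sameAs--> http://juansequeda.com/id
http://juansequeda.com/id --name--> "Juan Sequeda"
http://juansequeda.com/id --livesIn--> http://dbpedia.org/Austin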

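Linked data = internet + http + RDF

Book dataset:
http://…/isbn978 --title--> "Programming the Semantic Web"
http://…/isbn978 --isbn--> "978-0-596-15381-6"
http://…/isbn978 --author--> "Toby Segaran"
http://…/isbn978 --publisher--> http://…/publisher1
http://…/publisher1 --name--> "O’Reilly"

Review dataset:
http://…/isbn978 (book dataset) --sameAs--> http://…/isbn978 (review dataset)
http://…/isbn978 --hasReview--> http://…/review1
http://…/review1 --description--> "Awesome Book"
http://…/review1 --hasReviewer--> http://…/reviewer
http://…/reviewer --name--> "Juan Sequeda"

Reviewer's own data:
http://…/reviewer --sameAs--> http://juansequeda.com/id
http://juansequeda.com/id --name--> "Juan Sequeda"
http://juansequeda.com/id --livesIn--> http://dbpedia.org/Austin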
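How a sameAs link lets two independently published datasets answer one question can be sketched like this. The sketch rests on stated assumptions: the base URIs `http://books.example/` and `http://reviews.example/` are hypothetical stand-ins for the elided addresses in the slides, the helper names `merge` and `review_texts` are invented for illustration, and sameAs statements are assumed to form simple chains.

```python
BOOKS = "http://books.example/"      # hypothetical stand-in for the book dataset
REVIEWS = "http://reviews.example/"  # hypothetical stand-in for the review dataset

book_data = {
    (BOOKS + "isbn978", "title", "Programming the Semantic Web"),
    (BOOKS + "isbn978", "author", "Toby Segaran"),
}
review_data = {
    (REVIEWS + "isbn978", "hasReview", REVIEWS + "review1"),
    (REVIEWS + "review1", "description", "Awesome Book"),
    (REVIEWS + "isbn978", "sameAs", BOOKS + "isbn978"),
}


def merge(*datasets):
    """Union the datasets and rewrite every sameAs alias to its target URI."""
    triples = set().union(*datasets)
    alias = {s: o for (s, p, o) in triples if p == "sameAs"}

    def canon(term):
        seen = set()
        # Follow the alias chain; stop if a cycle is hit.
        while term in alias and term not in seen:
            seen.add(term)
            term = alias[term]
        return term

    return {(canon(s), p, canon(o)) for (s, p, o) in triples if p != "sameAs"}


def review_texts(triples, title):
    """Find the review descriptions of every book with the given title."""
    books = {s for (s, p, o) in triples if p == "title" and o == title}
    reviews = {o for (s, p, o) in triples if p == "hasReview" and s in books}
    return sorted(o for (s, p, o) in triples if p == "description" and s in reviews)


merged = merge(book_data, review_data)
print(review_texts(merged, "Programming the Semantic Web"))
```

Without the sameAs statement the same query over the plain union finds nothing, because the title and the review hang off two different URIs.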

Linked Data Principles

1. Use URIs as names for things.
2. Use HTTP URIs so that people can look up (dereference) those names.
3. When someone looks up a URI, provide useful information.
4. Include links to other URIs so that they can discover more things.
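Looking up a URI (principles 2 and 3) amounts to an ordinary HTTP GET that asks for a machine-readable format. The sketch below uses only Python's standard library. The helper names `build_lookup` and `look_up` are hypothetical, and whether a given server actually answers with Turtle for `text/turtle` is an assumption, not something the slides state.

```python
from urllib.request import Request, urlopen


def build_lookup(uri, media_type="text/turtle"):
    """Prepare an HTTP GET that asks for an RDF serialization of `uri`."""
    return Request(uri, headers={"Accept": media_type})


def look_up(uri, timeout=10):
    """Dereference `uri` and return the response body as text (needs network access)."""
    with urlopen(build_lookup(uri), timeout=timeout) as response:
        return response.read().decode("utf-8")


# Build the request without sending it; the URI is the one written on the slide.
request = build_lookup("http://dbpedia.org/Austin")
print(request.get_method(), request.full_url, request.get_header("Accept"))
```

The `Accept` header is what separates a lookup meant for machines from a browser visit: the same URI can serve HTML to people and RDF to programs.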

Web as a database

Linked Data makes the web exploitable as ONE GIANT HUGE GLOBAL DATABASE!

Is there any query language like SQL?

SPARQL…

Examples

Can you find the famous people born in Beirut before 1900?

Or whether the Greek Government buys sperm?
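The Beirut question could be phrased in SPARQL roughly as below. This is a sketch under assumptions: it targets DBpedia's public endpoint and uses the DBpedia properties `dbo:birthPlace` and `dbo:birthDate`, neither of which the slides name, and it does not try to capture "famous". The Python part only builds the request URL with the standard `query` parameter and sends nothing.

```python
from urllib.parse import urlencode

ENDPOINT = "https://dbpedia.org/sparql"  # assumed: DBpedia's public SPARQL endpoint

QUERY = """
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT ?person ?birthDate WHERE {
  ?person dbo:birthPlace dbr:Beirut ;
          dbo:birthDate ?birthDate .
  FILTER (?birthDate < "1900-01-01"^^xsd:date)
}
ORDER BY ?birthDate
"""


def query_url(endpoint, query):
    """Encode a SPARQL query as an HTTP GET URL."""
    return endpoint + "?" + urlencode({"query": query})


url = query_url(ENDPOINT, QUERY)
print(url)
```

As in SQL, the query names the shape of the data that is wanted, here a triple pattern, and leaves the lookup to the endpoint.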

Examples

#anoixtigenia, @vafopoulos


May 2007

What is a Linked Data application/service?

A software system that makes use of data on the Web from multiple datasets and that benefits from links between the datasets

Characteristics of Linked Data Applications

• Consume data that is published on the web following the Linked Data principles: an application should be able to request, retrieve and process the accessed data

• Discover further information by following the links between different data sources: the fourth principle enables this.

• Combine the consumed linked data with data from other sources (not necessarily Linked Data)

• Expose the combined data back to the web following the Linked Data principles

• Offer value to end-users

the 5 stars of open linked data

★ make your stuff available on the Web (whatever format)
★★ make it available as structured data (e.g. Excel instead of an image scan of a table)
★★★ use a non-proprietary format (e.g. CSV instead of Excel)
★★★★ use URLs to identify things, so that people can point at your stuff
★★★★★ link your data to other people’s data to provide context

http://lab.linkeddata.deri.ie/2010/star-scheme-by-example/
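The star scheme is cumulative: each star presupposes the ones before it. A minimal sketch of that reading, with a hypothetical function name and parameter names chosen here for illustration:

```python
def star_rating(on_web, structured=False, open_format=False,
                uses_uris=False, linked=False):
    """Count how many consecutive steps of the 5-star scheme are satisfied."""
    stars = 0
    for satisfied in (on_web, structured, open_format, uses_uris, linked):
        if not satisfied:
            break  # a missing step blocks all later stars
        stars += 1
    return stars


# A table scanned into an image, an Excel sheet and a CSV file, all on the Web.
scan = star_rating(on_web=True)
excel = star_rating(on_web=True, structured=True)
csv_file = star_rating(on_web=True, structured=True, open_format=True)
print(scan, excel, csv_file)
```

Under this reading a dataset that uses URIs but is published only in a proprietary format still stops at two stars.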

Two magics of Web Science: the case of Linked Data

The (practical) question

contextualized & hands-on experience in Semantic Web & Business 3.0 on a unique, fast evolving and semantified dataset


PSNET project: the answer

The first attempt to generate, curate, interlink and distribute daily updated public spending data in LOD formats that can be useful to both expert (i.e. scientists and professionals) and naïve users.


The context first…


Research question

Web economy: from potential to actual

Enable new virtuous cycles in the economy through Linked Open Data


EU Unification: the institutions

Best in theory – poor in practice

a (complicated) market example
• monetary policy, currency, eurozone
• European Single Market
• fiscal policy FORTHCOMING

EU Unification: the technology

Linked Data or Web of data
• “publish once, use many times”
• different consumers extract different slices of the data for different purposes
• publish in context: value & “meaning”

EU Unification: the technology

• Linked Data (LD) + Open Data = LOD

• Economic LOD as “data currency”

Why LOD?

• Transparency & innovation

Network effects: enabling
• bidirectional & massively processable interconnections among data
• re-use of the existing infrastructure in the government and business spheres

Economic LOD: the story so far

• Isolated/fragmented behind technological & institutional barriers

• General statistics: Eurostat etc.
• LOD2 case
• Some isolated projects

[Diagram labels: budget, tenders, spending, prices, business information, LOD graph; users, remix, analyze]

Follow public money all the way

Economic LOD: use cases

• Business applications on top
• Users: citizens, gov., EU, business
• track the life-cycle of every financial flow: evaluate budget allocation, tenders, spending and their efficiency
• pre-allocate resources on provisional public works
• receive & submit information in real-time

Economic LOD: engineering


Government Budget
• heterogeneous repositories & methods (mainly PDF)

Tenders
• Closed data in HTML
• Public Contracts Ontology (PCO), e.g.
– pco:Contract and pco:AwardCriterion
• Common Procurement Vocabulary
• now working on linking our ontology to:
– Payments Ontology
– GoodRelations
– FOAF

Spending
• most dynamic & open part
• increasing number of countries/cities
• raw & structured data
• leader: the Greek Clarity project
• spending decisions ex-ante to execution
• Actually every decision

Business Information

• Registries: mainly closed
• Key standards
– Classification of Products by Activity (CPA)
– eXtensible Business Reporting Language (XBRL)

CHECK OD BAROMETER – OD INDEX


The Transparency program in Greece (2010-2014)

o A revolution in open government
o ex-ante reporting of every state decision
o paradigm shift for 40K public servants

The Transparency program in Greece

o manifests the value of the procrastination principle (again)
o strong rival to the clientelistic state
o The new version is under beta testing (delivery: in 10 days!)

publicspending.net

2011: I believed that the Transparency program is the open data “gold” (& persuaded 7 more people)


publicspending.net

2012: …with some dust and rocks in a deep goldmine


2013: time to chisel some jewelry

2014: open data everywhere

Why public spending LOD

o more & better information
o objective and processable information for economic/political “dialogue”
• to promote competition
• to decrease cost
• to judge the efficiency of policy mixtures
• to enable participation

LOD in Greece

• in its infancy – few Apps yet
• 2-3 stars
• Open not Linked
• limited public awareness

LOD in Greece: why it is important

• quality of information during economic crisis

• transparency & efficiency in funding development


Issues

o how can we initiate the virtuous cycle of creation?
demonstrate LOD’s added value

o how to get the most out of data?
local & global interconnections

In a few words,

Apps, Apps, Apps…

Indexing, searching, global comparisons


Interlinking on a global scale

The future of the Web

• Data.gov: a paradigm shift
• Policy challenges are related to data
• Freedom, Privacy, Creativity

Policy framework

① Processing power
② Storage
③ Network access
④ Online data & services
⑤ Privacy

Personal grid workspace (g-work) for every citizen

New analysis: Web science

• a trans-disciplinary field
– Web as its primary object of study
– Web = techno-social artifact
• positive or negative? Transformative!

Web science

The envelope question

what technological and other changes need to be made in order for the Web to work better for more people?


The Web as a social machine

Being protected by digitizing


…challenges the basic aspects of human nature:

o Technology
o Body
o Moral Values
o Sociality
o Generations
o Economy

Humanizing the Web, Webizing Humanity

Successful business & science facilitate this dialogue

They not only give answers but also make the questions more concrete

Global initiatives

• OGP: how it works
• GIFT
• IBP - OBS Tracker
• Web index

Let us talk about projects
