Post on 08-Feb-2016
The Web of Data emerging industries
Michalis Vafopoulos,
vafopoulos.org 2014
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Contents
① The Web of documents vs. Web of data
– Some technology
– Some economics
– ...and action
② PSNET project
③ and more…
The Data trilogy
① Open: access
everyone to use and republish
② Big: scale
high volume, velocity and variety
③ Linked: use
publish once, use many times
The Web of Documents
• Simple, big and unstructured
• Organized in silos
But humans:
• are interested in Things, not documents, and these Things might be in docs or elsewhere
• have limited capacity to extract meaning...
The Web of Data
• Analogy: a global file system --> a global database
• Designed for: human consumption --> machines first, humans later
• Primary objects: documents --> things (or descriptions of things)
• Links between: documents --> things
• Degree of structure in objects: fairly low --> high
• Semantics of content and links: implicit --> explicit
(Tom Heath)
The Web of Data: why?
• encourages reuse
• reduces redundancy
• maximizes its (real and potential) inter-connectedness
• enables network effects to add value to data
The Web of Data: how?
– current state on the Web
• Relational Databases
• APIs
• XML
• CSV
• XLS
Computers can’t consume data because:
• Different formats & models
• Not inter-connected
The Web of Data: how?
– we need to create a standard way of publishing Data on the Web (like HTML for docs)
This is the Resource Description Framework (RDF)
(a simple example here from Juan F. Sequeda; more next semester!)
Resource Description Framework (RDF)
• A data model
– A way to model data
– Inspired by relational databases and logic
• RDF is a triple data model
• Labeled graph (semantic networks)
• Subject, Predicate, Object
<Isidoro> <was born in> <Chios>
<Chios> <is part of> <Greece>
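The triple model above can be sketched in a few lines of Python. This is only an illustration, not an RDF library: triples are plain tuples, plain strings stand in for URIs, and the helper name `match` is hypothetical.

```python
# A minimal sketch of the triple data model: each statement is a
# (subject, predicate, object) tuple. Plain strings stand in for URIs.
triples = {
    ("Isidoro", "was born in", "Chios"),
    ("Chios", "is part of", "Greece"),
}


def match(graph, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None matches anything."""
    return sorted(
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Where was Isidoro born?
birthplaces = [o for _, _, o in match(triples, s="Isidoro", p="was born in")]

# Follow the graph one step further: what is that place part of?
regions = [o for place in birthplaces
           for _, _, o in match(triples, s=place, p="is part of")]
```

Because the object of one triple can be the subject of another, the two statements form a small labeled graph that can be walked edge by edge.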
Example: Document on the Web
Databases back up documents
Isbn               Title                         Author        PublisherID  ReleasedData
978-0-596-15381-6  Programming the Semantic Web  Toby Segaran  1            July 2009
…                  …                             …             …            …

PublisherID  PublisherName
1            O’Reilly Media
…            …
This is a THING: a book titled “Programming the Semantic Web” by Toby Segaran, …
THINGS have PROPERTIES: a Book has a Title, an author, …
Data representation in RDF
[Figure: RDF graph of the book record. Nodes: book, “Programming the Semantic Web”, 978-0-596-15381-6, Toby Segaran, Publisher, O’Reilly. Edge labels: title, isbn, author, publisher, name. The two source tables are shown alongside the graph.]
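Everything on the web is identified by a URI!
link the data to other data
[Figure: the book graph with URIs as node identifiers: http://…/isbn978 for the book and http://…/publisher1 for the publisher. Literals: “Programming the Semantic Web”, 978-0-596-15381-6, Toby Segaran, O’Reilly. Edge labels: title, isbn, author, publisher, name.]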
consider the data from Revyu.com
[Figure: review graph. Nodes: http://…/isbn978, http://…/review1, http://…/reviewer. Literals: “Awesome Book”, “Juan Sequeda”. Edge labels: hasReview, reviewer, description, name.]
start to link data
[Figure: the book graph (http://…/isbn978, http://…/publisher1; edge labels title, isbn, author, publisher, name) joined by a sameAs edge to the review graph (http://…/isbn978, http://…/review1, http://…/reviewer; literals “Awesome Book”, “Juan Sequeda”; edge labels hasReview, hasReviewer, description, name).]
Juan Sequeda publishes data too
[Figure: node http://juansequeda.com/id with name “Juan Sequeda” and livesIn http://dbpedia.org/Austin.]
Let’s link more data
[Figure: the review graph (http://…/isbn978, http://…/review1, http://…/reviewer; literals “Awesome Book”, “Juan Sequeda”; edge labels hasReview, hasReviewer, description, name) joined by a sameAs edge to http://juansequeda.com/id (name “Juan Sequeda”, livesIn http://dbpedia.org/Austin).]
Linked data = internet + http + RDF
[Figure: the full linked graph. The book data (http://…/isbn978, http://…/publisher1), the review data (http://…/isbn978, http://…/review1, http://…/reviewer) and the personal data (http://juansequeda.com/id, http://dbpedia.org/Austin) are connected by sameAs edges. Edge labels: title, isbn, author, publisher, name, hasReview, hasReviewer, description, livesIn, sameAs.]
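The linking steps in the slides above can be sketched as follows. This is a rough illustration under stated assumptions: the `http://example.org/` URIs are placeholders introduced here for the elided `http://…/` addresses, the predicate names follow the slides, and `merge_same_as` and `objects` are hypothetical helpers rather than part of any library.

```python
# Three small datasets, each a set of (subject, predicate, object) triples.
# The example.org URIs are placeholders for the elided slide addresses.
BOOK = "http://example.org/books/isbn978"
REVYU_BOOK = "http://example.org/revyu/isbn978"
REVIEW = "http://example.org/revyu/review1"
REVIEWER = "http://example.org/revyu/reviewer"
JUAN = "http://juansequeda.com/id"

books = {
    (BOOK, "title", "Programming the Semantic Web"),
    (BOOK, "author", "Toby Segaran"),
}
reviews = {
    (REVYU_BOOK, "hasReview", REVIEW),
    (REVIEW, "description", "Awesome Book"),
    (REVIEW, "hasReviewer", REVIEWER),
    (REVIEWER, "name", "Juan Sequeda"),
}
personal = {
    (JUAN, "name", "Juan Sequeda"),
    (JUAN, "livesIn", "http://dbpedia.org/Austin"),
}
links = {
    (BOOK, "sameAs", REVYU_BOOK),
    (REVIEWER, "sameAs", JUAN),
}


def merge_same_as(graph):
    """Rewrite every node to one representative per sameAs group."""
    canon = {}

    def find(node):
        # Follow the chain of replacements to the representative node.
        while node in canon:
            node = canon[node]
        return node

    for s, p, o in graph:
        if p == "sameAs":
            a, b = find(s), find(o)
            if a != b:
                canon[b] = a
    return {(find(s), p, find(o)) for s, p, o in graph if p != "sameAs"}


def objects(graph, s, p):
    """All objects of triples with the given subject and predicate."""
    return sorted(o for s2, p2, o in graph if s2 == s and p2 == p)


merged = merge_same_as(books | reviews | personal | links)

# One question that needs all three datasets:
# where do the reviewers of this book live?
cities = [
    city
    for review in objects(merged, BOOK, "hasReview")
    for person in objects(merged, review, "hasReviewer")
    for city in objects(merged, person, "livesIn")
]
```

None of the three datasets can answer the question alone; the two `sameAs` links are what make the combined graph traversable.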
Linked Data Principles
1. Use URIs as names for things.
2. Use HTTP URIs so that people can look up (dereference) those names.
3. When someone looks up a URI, provide useful information.
4. Include links to other URIs so that they can discover more things.
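Looking up a name, as in principles 2 and 3, amounts to an HTTP request that asks for data rather than a page. The sketch below uses only the Python standard library and builds the request without sending it. The `text/turtle` media type is an assumption (servers differ in which RDF serializations they offer), `build_lookup` is a hypothetical helper, and the URI is the simplified one shown in the slides, which may not resolve as written.

```python
from urllib.request import Request


def build_lookup(uri, media_type="text/turtle"):
    """Prepare an HTTP GET that asks the server for an RDF description."""
    return Request(uri, headers={"Accept": media_type}, method="GET")


request = build_lookup("http://dbpedia.org/Austin")

# urllib.request.urlopen(request) would perform the actual lookup;
# it is left out so the sketch runs without network access.
```

The `Accept` header is how the same URI can serve an HTML page to a browser and structured data to a program.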
Web as a database
Linked Data makes the web exploitable as ONE GIANT HUGE GLOBAL DATABASE!
Is there any query language like SQL?
SPARQL…
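SPARQL queries are built from triple patterns in which some positions are variables, and shared variables play the role of SQL joins. To illustrate the idea only (this is neither SPARQL itself nor any library's API), here is a toy matcher over the book data from the earlier slides. The function name `query` and the node identifiers `book` and `publisher1` are illustrative.

```python
def query(graph, patterns):
    """Match a list of triple patterns; terms starting with '?' are variables.

    Returns one dict of variable bindings per solution, comparable to
    the rows of a SELECT result.
    """
    solutions = [{}]
    for pattern in patterns:
        extended = []
        for binding in solutions:
            for triple in graph:
                candidate = dict(binding)
                for term, value in zip(pattern, triple):
                    if term.startswith("?"):
                        # Bind the variable, or check an existing binding.
                        if candidate.setdefault(term, value) != value:
                            break
                    elif term != value:
                        break
                else:
                    extended.append(candidate)
        solutions = extended
    return solutions


graph = [
    ("book", "title", "Programming the Semantic Web"),
    ("book", "author", "Toby Segaran"),
    ("book", "publisher", "publisher1"),
    ("publisher1", "name", "O'Reilly"),
]

# Roughly: SELECT ?name WHERE {
#   ?book author "Toby Segaran" . ?book publisher ?pub . ?pub name ?name }
rows = query(graph, [
    ("?book", "author", "Toby Segaran"),
    ("?book", "publisher", "?pub"),
    ("?pub", "name", "?name"),
])
```

The variable `?pub` appears in two patterns, which does the same work as joining the book and publisher tables on PublisherID.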
Is it working?
• Current Employee Names, Salaries, and Position Titles
• The Open Database Of The Corporate World
• Crime map
• NHS efficiency savings: the role of prescribing analytics
• where public money goes worldwide
Examples
Can you find the famous persons born in Beirut before 1900?
Or if the Greek Government buys sperm?
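The first question above could be phrased as a SPARQL query against DBpedia. The slides do not give the query, so the following is a guess at its shape: the terms `dbo:birthPlace`, `dbo:birthDate` and `dbr:Beirut`, and the endpoint address, are assumptions about how DBpedia exposes this data. The sketch only builds the request URL and sends nothing.

```python
from urllib.parse import urlencode

# Assumed vocabulary: DBpedia ontology terms for birth place and date.
QUERY = """
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

SELECT ?person ?birth WHERE {
  ?person dbo:birthPlace dbr:Beirut ;
          dbo:birthDate ?birth .
  FILTER (?birth < "1900-01-01"^^xsd:date)
}
ORDER BY ?birth
"""

ENDPOINT = "https://dbpedia.org/sparql"  # assumed public endpoint


def build_query_url(endpoint, query):
    """Encode the query as the 'query' parameter of an HTTP GET."""
    return endpoint + "?" + urlencode({"query": query.strip()})


url = build_query_url(ENDPOINT, QUERY)
```

Fetching `url` with an `Accept` header for a SPARQL results format would return the matching rows, provided the assumed terms match the data.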
Examples
#anoixtigenia, @vafopoulos
May 2007
What is a Linked Data application/service?
A software system that makes use of data on the Web from multiple datasets and that benefits from links between the datasets
Characteristics of Linked Data Applications
• Consume data that is published on the web following the Linked Data principles: an application should be able to request, retrieve and process the accessed data
• Discover further information by following the links between different data sources: the fourth principle enables this.
• Combine the consumed linked data with data from other sources (not necessarily Linked Data)
• Expose the combined data back to the web following the Linked Data principles
• Offer value to end-users
the 5 stars of open linked data
★ make your stuff available on the Web (whatever format)
★★ make it available as structured data (e.g. excel instead of image scan of a table)
★★★ non-proprietary format (e.g. csv instead of excel)
★★★★ use URLs to identify things, so that people can point at your stuff
★★★★★ link your data to other people’s data to provide context
http://lab.linkeddata.deri.ie/2010/star-scheme-by-example/
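Moving from three stars (CSV) to four and five stars means giving each row a URI and then linking out. A rough sketch using the book row from the earlier example: the `http://example.org/` URIs, the column names and the `sameAs` target are illustrative assumptions, not addresses from the slides, and `rows_to_triples` is a hypothetical helper.

```python
import csv
import io

# Three stars: structured data in a non-proprietary format.
CSV_TEXT = """isbn,title,author
978-0-596-15381-6,Programming the Semantic Web,Toby Segaran
"""

BASE = "http://example.org/book/"  # hypothetical namespace


def rows_to_triples(text, key="isbn"):
    """Four stars: mint a URI per row and emit one triple per column."""
    triples = []
    for row in csv.DictReader(io.StringIO(text)):
        subject = BASE + row[key]
        for column, value in row.items():
            triples.append((subject, column, value))
    return triples


triples = rows_to_triples(CSV_TEXT)

# Five stars: link the new URI to someone else's data about the same thing.
triples.append((BASE + "978-0-596-15381-6", "sameAs",
                "http://example.org/reviews/isbn978"))
```

The CSV content is unchanged; what the extra stars add is a stable identifier others can point at and a link that gives the record context.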
Two magics of Web Science: the case of Linked Data
The (practical) question
contextualized & hands-on experience in Semantic Web & Business 3.0 on a unique, fast evolving and semantified dataset
PSNET project: the answer
The first attempt to generate, curate, interlink and distribute daily updated public spending data in LOD formats that can be useful to both expert (i.e. scientists and professionals) and naïve users.
The context first…
Research question
Web economy: from potential to actual
Enable new virtuous cycles in the economy through Linked Open Data
EU Unification: the institutions
Best in theory – poor in practice
a (complicated) market example
• monetary policy, currency, eurozone
• European Single Market
• fiscal policy FORTHCOMING
EU Unification: the technology
Linked Data or Web of data
• “publish once, use many times”
• different consumers extract different slices of the data for different purposes
• publish in context: value & “meaning”
EU Unification: the technology
• Linked Data (LD) + Open Data = LOD
• Economic LOD as “data currency”
Why LOD?
• Transparency & innovation
Network effects: enabling users to
• bidirectional & massively processable interconnections among data
• re-using the existing infrastructure in the government and business spheres
Economic LOD: the story so far
• Isolated/fragmented behind technological & institutional barriers
• General statistics: Eurostat etc.
• LOD2 case
• Some isolated projects
[Figure: LOD graph connecting budget, tenders, spending, business information and prices; users remix and analyze the data.]
Follow public money all the way
Economic LOD: use cases
• Business applications on top
• Users: citizens, gov., EU, business
• track the life-cycle of every financial flow: evaluate budget allocation, tenders, spending and their efficiency
• pre-allocate resources on provisional public works
• receive & submit information in real-time
Economic LOD: engineering
Government Budget
• heterogeneous repositories & methods (mainly PDF)
Tenders
• Closed data in HTML
• Public Contracts Ontology (PCO), e.g.
– pco:Contract and pco:AwardCriterion
• Common Procurement Vocabulary
• now working on linking our ontology to:
– Payments Ontology
– GoodRelations
– FOAF
Spending
• most dynamic & open part
• increasing number of countries/cities
• raw & structured data
• leader: the Greek Clarity project
• spending decisions ex-ante to execution
• Actually every decision
Business Information
• Registries: mainly closed
• Key standards
– Classification of Products by Activity (CPA)
– eXtensible Business Reporting Language (XBRL)
CHECK OD BAROMETER – OD INDEX
The Transparency program in Greece (2010-2014)
o A revolution in open government
o ex-ante reporting of every state decision
o paradigm shift for 40K public servants
The Transparency program in Greece
o manifests the value of the procrastination principle (again)
o strong rival to the Clientelistic state
o The new version under beta testing (delivery: in 10 days!)
publicspending.net
2011: I believed that the Transparency program is the open data “gold” (& persuaded 7 more people)
publicspending.net
2012: …with some dust and rocks in a deep goldmine
2013: time to chisel some jewelry
2014: open data everywhere
Why public spending LOD
o more & better information
o objective and processable information for economic/political “dialogue”
• to promote competition
• to decrease cost
• to judge the efficiency of policy mixtures
• to enable participation
LOD in Greece
• in its infancy – few Apps yet
• 2-3 stars
• Open not Linked
• limited public awareness
LOD in Greece: why it is important
• quality of information during economic crisis
• transparency & efficiency in funding development
Issues
o how can we initiate the virtuous cycle of creation?
demonstrate LOD’s added value
o how to get the most out of data?
local & global interconnections
In a few words,
Apps, Apps, Apps…..
Indexing, searching, global comparisons
Interlinking on a global scale
The future of the Web
• Data.gov: a paradigm shift
• Policy challenges are related to data
• Freedom, Privacy, Creativity
Policy framework
① Processing power
② Storage
③ Network access
④ Online data & services
⑤ Privacy
Personal grid workspace (g-work) for every citizen
New analysis: Web science
• a trans-disciplinary field
– Web as its primary object of study
– Web = techno-social artifact
• positive or negative? Transformative!
Web science
The envelope question
what technological and other changes need to be made in order for the Web to work better for more people?
The Web as a social machine
Being protected by digitizing
…challenges the basic aspects of human nature:
o Technology
o Body
o Moral Values
o Sociality
o Generations
o Economy
Humanizing the Web
Webizing Humanity
Successful business & science facilitate this dialogue
Not only give answers but make the questions more concrete
Global initiatives
• OGP how it works
• GIFT
• IBP - OBS Tracker
• Web index
Let us talk about projects