
Mash-up of Linked Government Data from http://data.gov

Li Ding, Jim Hendler and Deborah L. McGuinness

Tetherless World Constellation, Rensselaer Polytechnic Institute

June 24, 2010

Page 2

Raw Government Data Now

Timeline, 2009 to 2010:

• January 1, 2009: “Openness will strengthen our democracy and promote efficiency and effectiveness in Government.” --- President Obama
• May 21, 2009: data.gov online
• June 30, 2009: “Putting Government Data online”
• December 8, 2009: “Open Government Directive” released
• January 19, 2010: data.gov.uk online
• May 21, 2010: data.gov relaunch with semantic web featured

Page 3

Semantic Web featured at data.gov

• Leveraged contributions from the Tetherless World Constellation at RPI
• Published 6.4 billion triples (almost doubling the LOD cloud: 13 billion triples in total)
• Hosted a triple store (Virtuoso) and open source RDF mashups

http://www.data.gov/semantic/ http://www.data.gov/semantic/data/alpha

Page 4

The Data-gov Wiki: Innovations at RPI

The Data-gov Wiki explores and teaches the use of semantic web technologies, especially linked data, in producing, processing and using government data from data.gov.

Data, Demo, Tutorial, Video

The Data-gov Wiki is run by the Tetherless World Constellation at RPI, headed by Professors Jim Hendler and Deborah McGuinness and led by Li Ding. Student team members include: Dominic DiFranzo, Sarah Magidson, James Michaelis, Alvaro Graves, Adam Bell, Jin Guang Zheng, Xian Li, Tim Lebo, Gregory Todd Williams, Peter Coons, Zhenning Shangguan, Devin Gaffney, William Cooper, Brian Zaik, and Johanna Flores.

Page 6

Open Data => Visualization => More

• Open Data: available for public use
• Visualization: easy to understand
• Mashups: make it more meaningful
• Provenance: make it accountable

Figure panels: Table (raw data); Map (books per state); Map (books per capita per state)

Created by Xian Li, PhD student at RPI, http://data-gov.tw.rpi.edu/wiki/Demo:_Library_Books_Per_Capita,_by_State

Source: Dataset 353 (State Library Agency Survey: Fiscal Year 2006, Institute of Museum and Library Services)
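The step from "books per state" to "books per capita per state" is itself a small mashup: the book counts have to be joined with a second table of state populations. A minimal sketch in Python of that join, assuming both tables are keyed by the same state code; the function name, the state codes and all numbers are hypothetical placeholders, not values from Dataset 353.

```python
from typing import Dict


def books_per_capita(books: Dict[str, float],
                     population: Dict[str, float]) -> Dict[str, float]:
    """Join two per-state tables on the state key and divide.

    States missing from the population table, or with zero population,
    are skipped rather than guessed.
    """
    result = {}
    for state, count in books.items():
        people = population.get(state)
        if people:  # skips both None (missing) and 0
            result[state] = count / people
    return result


# Placeholder numbers for illustration only; not real survey or census data.
books_by_state = {"AA": 1_000_000, "BB": 300_000, "CC": 50_000}
population_by_state = {"AA": 500_000, "BB": 100_000}

per_capita = books_per_capita(books_by_state, population_by_state)
```

State "CC" drops out because the second table has no population for it, which is the kind of gap a mashup has to make visible instead of hiding.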

Page 7

Example Mashup

Sources:

• Data.gov: CASTNET Ozone (CSV)
• epa.gov: CASTNET Site (CSV)

Layers: Data Mashup, Web Application Mashup, Visualization Mashup (Exhibit Visualization API)

Steps:

1. Convert raw datasets into linkable RDF
2. Query multiple RDF datasets via a SPARQL endpoint
3. Drill down for details
4. Surf to EPA applications

Created by Dominic DiFranzo, PhD student at RPI, http://www.data.gov/semantic/Castnet/html/exhibit
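Converting a raw CSV dataset into linkable RDF can be sketched as follows. This is an illustrative conversion, not the converter used for the demo: the base URI `http://example.org/`, the column names `site_id` and `ozone`, and the sample rows are all assumptions. Each row becomes one subject URI built from its key column, and every other cell becomes one N-Triples statement.

```python
import csv
import io
from urllib.parse import quote


def literal(value: str) -> str:
    """Escape a string as an N-Triples plain literal."""
    escaped = (value.replace("\\", "\\\\")
                    .replace('"', '\\"')
                    .replace("\n", "\\n")
                    .replace("\r", "\\r"))
    return f'"{escaped}"'


def csv_to_ntriples(csv_text: str, base: str, key_column: str) -> list:
    """Turn each CSV row into one subject and each other cell into one triple."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # The key column decides the subject URI, so two files that share
        # the same key values end up describing the same resources.
        subject = f"<{base}site/{quote(row[key_column], safe='')}>"
        for column, value in row.items():
            if column == key_column or not value:
                continue
            predicate = f"<{base}property/{quote(column, safe='')}>"
            triples.append(f"{subject} {predicate} {literal(value)} .")
    return triples


# Hypothetical sample rows; not real CASTNET measurements.
sample = "site_id,ozone\nABC123,41\nXYZ789,38\n"
ntriples = csv_to_ntriples(sample, "http://example.org/", "site_id")
```

If a second file of site descriptions is converted with the same base URI and key column, its triples share subjects with the ozone triples, which is what lets one SPARQL query span both datasets.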

Page 8

Mashup: US and UK Foreign Aid

• Sources
– http://data.gov
– http://data.gov.uk

• Discovery and Explanation

• Pakistan (aid: US > UK)
– Major aid from US: Economic/Security Assistance, Development Assistance, …
– Major aid from UK: Health, Gov and Civil Society, …

• India (aid: UK > US)
– Major aid from US: Child Survival and Health
– Major aid from UK: Health, Economic, …

Created by James Michaelis, PhD student at RPI, http://data-gov.tw.rpi.edu/demo/linked/aidviz-1554-10030.html

Page 9

Adding Social Factor to Mashups

• Import socially contributed data, e.g. DBpedia
• Let users contribute
– links
– feedback

Diagram labels: Raw Data, RDF, User, Publish*, Enhance*, Consume*, Feedback, Import/export, Other Social Web Apps

Page 11

Social Mashup: Web 3.0 Linking

(a) White House visitor search for President Obama

(b) Web 3.0 site linking White House record to DBpedia

Wiki Text:

*[[skos:altLabel::POTUS]]
*[[foaf:firstName::Barack]] [[foaf:lastName::Obama]]
*[[owl:sameAs::http://dbpedia.org/resource/Barack_Obama]]
*[[owl:sameAs::http://rdf.freebase.com/ns/en.barack_obama]]
*[[owl:sameAs::http://data.nytimes.com/47452218948077706853]]

Diagram labels: White House Visitor Record (“POTUS”), Data-gov Wiki (Wiki Text, RDF data), DBpedia (dbpedia:Barack_Obama)

Created by Dominic DiFranzo, http://data-gov.tw.rpi.edu/demo/stable/white-house-visitor/top100-visitees.php
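The wiki text on this slide uses `[[property::value]]` annotations, and each one corresponds to a property/value pair about the wiki page. A minimal sketch in Python of pulling those pairs out with a regular expression; this only illustrates the syntax shown on the slide and is not the wiki's own parser, and the function names are hypothetical.

```python
import re

# Matches [[property::value]]. The property is everything up to the first
# "::", so prefixed names (foaf:firstName) and URL values both work.
ANNOTATION = re.compile(r"\[\[(.+?)::(.+?)\]\]")


def parse_annotations(wiki_text: str) -> list:
    """Return (property, value) pairs in the order they appear."""
    return [(m.group(1), m.group(2)) for m in ANNOTATION.finditer(wiki_text)]


def same_as_links(pairs: list) -> list:
    """Keep only the owl:sameAs targets, i.e. the links to other datasets."""
    return [value for prop, value in pairs if prop == "owl:sameAs"]


wiki_text = """
*[[skos:altLabel::POTUS]]
*[[foaf:firstName::Barack]] [[foaf:lastName::Obama]]
*[[owl:sameAs::http://dbpedia.org/resource/Barack_Obama]]
*[[owl:sameAs::http://rdf.freebase.com/ns/en.barack_obama]]
*[[owl:sameAs::http://data.nytimes.com/47452218948077706853]]
"""

pairs = parse_annotations(wiki_text)
links = same_as_links(pairs)
```

The `skos:altLabel` pair is what lets the visitor-record string “POTUS” resolve to this page, and the `owl:sameAs` pairs are the outbound links to DBpedia and the other datasets.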

Page 12

Social Mashup: User Feedback

• Mash up multiple time series
• Let users give feedback (contributing news)

Created by Sarah Magidson, http://data-gov.tw.rpi.edu/demo/linked/demo-401-usps-news.html

Page 13

More Mashups: Using Web Tools

SPARQL results (XML) can be converted into other formats (e.g. JSON, CSV) as input to other Web tools: Yahoo Pipes, IBM Many Eyes, Microsoft Web n-gram Service, …
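The conversion can be sketched with the Python standard library alone. The example below reads a document in the W3C SPARQL Query Results XML Format and emits CSV and a simple list-of-objects JSON (not the official SPARQL JSON results format). The sample result, its variable names `site` and `ozone`, and the `example.org` URI are assumptions made for illustration.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

NS = {"sr": "http://www.w3.org/2005/sparql-results#"}


def parse_sparql_xml(xml_text: str):
    """Return (variable names, list of row dicts) from a SPARQL XML result."""
    root = ET.fromstring(xml_text)
    variables = [v.attrib["name"] for v in root.findall("sr:head/sr:variable", NS)]
    rows = []
    for result in root.findall("sr:results/sr:result", NS):
        row = {name: "" for name in variables}  # unbound variables stay empty
        for binding in result.findall("sr:binding", NS):
            # Each binding wraps one <uri>, <literal> or <bnode> element.
            row[binding.attrib["name"]] = binding[0].text or ""
        rows.append(row)
    return variables, rows


def to_csv(variables, rows) -> str:
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=variables, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical query result; not output from a real endpoint.
sample = """<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head><variable name="site"/><variable name="ozone"/></head>
  <results>
    <result>
      <binding name="site"><uri>http://example.org/site/ABC123</uri></binding>
      <binding name="ozone"><literal>41</literal></binding>
    </result>
  </results>
</sparql>"""

variables, rows = parse_sparql_xml(sample)
csv_text = to_csv(variables, rows)
json_text = json.dumps(rows)
```

Datatypes and language tags on literals are dropped here, which is usually acceptable for feeding a visualization tool but loses information a second RDF consumer might need.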

Page 14

More Mashups: Provenance

• Critical to accountability
• Demo => Dataset => Agency
– Where do the data come from?
• Agency => Dataset => Comments
– Support users’ feedback

Figure labels: Demo, Dataset, Agency
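The Demo => Dataset => Agency chain can be followed in both directions once the two links are recorded. A minimal sketch in Python using plain dictionaries; the data structure and function names are hypothetical, and the single entry reuses the library demo and Dataset 353 named earlier in this deck.

```python
# Which datasets each demo uses, and which agency published each dataset.
demo_uses = {
    "Library Books Per Capita, by State": ["Dataset 353"],
}
dataset_agency = {
    "Dataset 353": "Institute of Museum and Library Services",
}


def agencies_for_demo(demo: str) -> list:
    """Demo => Dataset => Agency: where do the data come from?"""
    return sorted({dataset_agency[d]
                   for d in demo_uses.get(demo, [])
                   if d in dataset_agency})


def demos_for_agency(agency: str) -> list:
    """Agency => Dataset => Demo: which demos depend on this agency's data?"""
    datasets = {d for d, a in dataset_agency.items() if a == agency}
    return sorted(demo for demo, used in demo_uses.items()
                  if datasets.intersection(used))


sources = agencies_for_demo("Library Books Per Capita, by State")
dependents = demos_for_agency("Institute of Museum and Library Services")
```

In an RDF setting the same two links would be triples, and each lookup becomes a two-step graph pattern in SPARQL.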

Page 15

Even More …

• Applications
– More raw data, data catalog, links, hub datasets
– More tools, esp. visualization Web APIs
– Friendly UI

• Research
– Data integration: smart and scalable
– Data access: search, social interaction, …
– Provenance: source, versions, changes, …
– Reliability: trust, persistency, quality, …

Page 16

Conclusion

• 6.4 billion triples from data.gov

• “data + visualization + mashup” is powerful

• Low-cost prototypes are not difficult: undergraduates and webmasters can build them

• Open source tools, data, demos and tutorials are available for education

Raw Data Now!