The Linked Government Data Landscape Today data.gov and TWC LOGD Li Ding, Jim Hendler and Deborah L. McGuinness Tetherless World Constellation Rensselaer Polytechnic Institute Nov 8, 2010

Page 1:

The Linked Government Data Landscape Today

data.gov and TWC LOGD

Li Ding, Jim Hendler and Deborah L. McGuinness

Tetherless World Constellation, Rensselaer Polytechnic Institute

Nov 8, 2010

Page 2:

Data.gov and World-Wide Open Government Data Activities

Timeline, 2009 to 2010:

• January 1, 2009: “Openness will strengthen our democracy and promote efficiency and effectiveness in Government.” --- President Obama
• May 21, 2009: data.gov online
• June 30, 2009: Putting Government Data online
• January 19, 2010: data.gov.uk online
• May 21, 2010: data.gov relaunch with semantic web featured

Many countries
• US
• UK
• Australia
• New Zealand
• …

Page 3:

Semantic Web Featured at data.gov

http://www.data.gov/semantic/

• data.gov adopted Semantic Web technologies
• Web-based mashups
• Downloadable RDF data: 6.4 billion triples

Page 4:

Data-gov Wiki: Innovations at RPI

The Data-gov Wiki explores, and educates people about, the use of Semantic Web technologies, especially linked data, in producing, processing and utilizing government data from data.gov.

The Data-gov Wiki is run by the Tetherless World Constellation at RPI, headed by Professors Jim Hendler and Deborah McGuinness and led by Li Ding. Other student team members include: Dominic DiFranzo, Sarah Magidson, James Michaelis, Alvaro Graves, Adam Bell, Jin Guang Zheng, Xian Li, Tim Lebo, Gregory Todd Williams, Peter Coons, Zhenning Shangguan, Devin Gaffney, William Cooper, Brian Zaik, and Johanna Flores.

• 40+ Demos
• 400+ Datasets
• Tutorials & Videos

Page 5:

The Data-gov Wiki - Architecture

[Figure: architecture diagram. Labels: Conversion, LGD in RDF, Enhancement, Linked Data, Consume, Data Web, Knowledge Provenance]

LGD: Linked government data

Page 6:

Conversion: From Raw Tabular Data to RDF
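The slide names the conversion step but does not show how it is done. The following is only a minimal sketch of one common approach, assuming each table row becomes one RDF subject and each non-empty cell becomes one triple with the column name as predicate. The namespace `http://example.org/dataset/demo`, the function names, and the sample rows are hypothetical and are not taken from the TWC LOGD converter.

```python
import csv
import io
from urllib.parse import quote

# Hypothetical namespace, for illustration only (not a URI scheme from the slides).
BASE = "http://example.org/dataset/demo"


def escape_literal(value: str) -> str:
    """Escape a string so it can sit inside an N-Triples literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def csv_to_ntriples(csv_text: str) -> list[str]:
    """Map each CSV row to one subject and each non-empty cell to one triple."""
    triples = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=1):
        subject = f"<{BASE}/row/{row_number}>"
        for column, cell in row.items():
            if column is None or cell is None or cell.strip() == "":
                continue  # skip overflow cells and empty cells
            predicate = f"<{BASE}/vocab/{quote(column.strip(), safe='')}>"
            literal = escape_literal(cell.strip())
            triples.append(f'{subject} {predicate} "{literal}" .')
    return triples


# Made-up sample input in the spirit of a raw tabular dataset.
sample = "year,PHSY_ST,cost\n1998,,10.0\n2000,NY,8.3\n"
ntriples = csv_to_ntriples(sample)
for line in ntriples:
    print(line)
```

Keeping every value as a plain literal at this stage is a deliberate simplification: typing values and linking them to other resources can then be treated as a separate enhancement step.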

Page 7:

Enhancement: Linking Open Government Data

Example dataset:

ID   year   PHSY_ST   site-id   cost
     1998                       10.0
     1999             site123   11.3
     2000   NY                  8.3
     2001                       20

Linked table keyed by site-id:

site-id   Latitude   longitude
site123   43.993     -70.326

Linked table keyed by year:

Year   claims
2000   382

Metadata shown on the slide:

• PHSY_ST: state abbreviation
• ID: unique id
• cost: unit is million US dollars
• year: 1975-2008

Link types shown on the slide:

• Correlated dataset
• Complement dataset
• Metadata (field definition)
• Metadata (value definition)
• owl:sameAs (DS123:NY)
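The tables above suggest two kinds of enhancement: joining a record to other tables through shared keys (site-id, year), and linking a dataset-local value such as DS123:NY to an external resource with owl:sameAs. The sketch below illustrates both under stated assumptions. Only the two rows whose columns are unambiguous on the slide are used, and the `http://example.org/...` URIs (including the expansion of the `DS123:` prefix and the link target) are hypothetical placeholders.

```python
# Two records from the slide whose column assignment is unambiguous.
records = [
    {"year": 1999, "PHSY_ST": None, "site-id": "site123", "cost": 11.3},
    {"year": 2000, "PHSY_ST": "NY", "site-id": None, "cost": 8.3},
]

# Lookup tables from the slide, keyed by the shared column.
sites = {"site123": {"latitude": 43.993, "longitude": -70.326}}
claims_by_year = {2000: 382}

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
# Hypothetical expansion of the slide's "DS123:" prefix.
LOCAL_NS = "http://example.org/DS123/"
# Hypothetical external identifiers for state abbreviations.
EXTERNAL_STATE_URIS = {"NY": "http://example.org/places/NewYorkState"}


def enhance(record: dict) -> dict:
    """Return a copy of the record enriched through shared keys."""
    enriched = dict(record)
    site = sites.get(record["site-id"])
    if site is not None:
        enriched.update(site)  # add latitude and longitude
    if record["year"] in claims_by_year:
        enriched["claims"] = claims_by_year[record["year"]]
    return enriched


def same_as_links(rows: list[dict]) -> list[str]:
    """Emit one owl:sameAs N-Triple per distinct state value with a known target."""
    links = []
    for state in sorted({r["PHSY_ST"] for r in rows if r["PHSY_ST"]}):
        target = EXTERNAL_STATE_URIS.get(state)
        if target is not None:
            links.append(f"<{LOCAL_NS}{state}> <{OWL_SAME_AS}> <{target}> .")
    return links


enriched = [enhance(r) for r in records]
links = same_as_links(records)
print(enriched)
print(links)
```

In a linked-data setting the joins would normally be expressed as links between RDF resources rather than as dictionary lookups; the dictionaries here only stand in for that idea.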

Page 8:

Consumption: Mashing up LOGD Data

Sources:
• Data.gov: CASTNET Ozone (CSV)
• epa.gov: CASTNET Site (CSV)

Steps:
1. Convert raw dataset into linkable RDF
2. Query multiple RDF datasets via SPARQL endpoint
3. Drill down for details
4. Surf to EPA applications

Mashup types: Data Mashup, Web Application Mashup, Visualization Mashup

Exhibit Visualization API

Created by Dominic DiFranzo, PhD student at RPI, http://www.data.gov/semantic/Castnet/html/exhibit
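Querying several RDF datasets through a SPARQL endpoint can be sketched as follows. This is an illustration only: the endpoint URL, graph names, and the `ex:` vocabulary are hypothetical and are not the ones used by the CASTNET demo. The only standard-defined detail relied on is that the SPARQL Protocol accepts the query text in a `query` parameter of an HTTP GET request. The code builds the request URL and does not contact any server.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical endpoint; replace with a real SPARQL endpoint to run the query.
ENDPOINT = "http://example.org/sparql"

# Hypothetical graphs and vocabulary: one graph per converted CSV file,
# joined on a shared site identifier.
QUERY = """
PREFIX ex: <http://example.org/vocab/>
SELECT ?site ?ozone ?lat ?long
WHERE {
  GRAPH <http://example.org/graph/castnet-ozone> {
    ?obs ex:site_id ?site ; ex:ozone ?ozone .
  }
  GRAPH <http://example.org/graph/castnet-site> {
    ?loc ex:site_id ?site ; ex:latitude ?lat ; ex:longitude ?long .
  }
}
""".strip()


def build_request_url(endpoint: str, query: str) -> str:
    """Build a SPARQL Protocol GET URL with the query percent-encoded."""
    return endpoint + "?" + urlencode({"query": query})


url = build_request_url(ENDPOINT, QUERY)
# Round-trip check: decoding the URL recovers the original query text.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(url)
```

The result rows of such a query (site, ozone reading, latitude, longitude) are the kind of joined data a map-based visualization could consume.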

Page 9:

Provenance: Tracking Create, Derive, Revision Events

[Figure: provenance diagram. Stage labels: Access, Convert, Enhance, Version, SemDiff. Event labels: create, derive, revision]
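One way to picture the create, derive, and revision events is as a small event log that can be walked backwards from any artifact to its origin. The sketch below is an assumption-laden illustration, not the provenance model used by TWC LOGD: the class name, field names, and file names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProvenanceEvent:
    kind: str                     # "create", "derive", or "revision"
    artifact: str                 # the artifact this event produced
    source: Optional[str] = None  # the artifact it was produced from, if any


def lineage(artifact: str, events: list[ProvenanceEvent]) -> list[str]:
    """Follow source links from an artifact back to the one with no source."""
    by_artifact = {event.artifact: event for event in events}
    chain = [artifact]
    seen = {artifact}
    while True:
        event = by_artifact.get(chain[-1])
        if event is None or event.source is None or event.source in seen:
            break  # reached the origin, an unknown artifact, or a cycle
        chain.append(event.source)
        seen.add(event.source)
    return chain


# Hypothetical log: a raw file is created, converted, enhanced, and later revised.
events = [
    ProvenanceEvent("create", "raw.csv"),
    ProvenanceEvent("derive", "converted.nt", source="raw.csv"),
    ProvenanceEvent("derive", "enhanced.nt", source="converted.nt"),
    ProvenanceEvent("revision", "raw-v2.csv", source="raw.csv"),
]

chain = lineage("enhanced.nt", events)
print(chain)
```

Recording the source of every derived or revised artifact is what makes it possible to answer questions such as which raw file a given RDF dataset came from.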

Page 10:

TWC LOGD Status: Website Statistics

• Page Rank = 5
• Site Traffic
  – 378,128 page hits
  – 28,481 visits
  – 16,041 visitors
  – 4,126 cities
  – 34 countries

• Ranked 6th in Google “data gov” Search

Note: the above statistics are about http://data-gov.tw.rpi.edu. Dataset access not counted.

Page 11:

TWC LOGD Status: Part of LOD Cloud

[Figure: LOD cloud diagram, with the position of TWC LOGD marked "We are here"]

Page 12:

Conclusion and Future Work

• Now
  – 8.5 billion triples (previously 6.4 billion)
  – “data + visualization + mashup”
  – Low-cost solutions
  – Education

• New LOGD site is coming
  – More raw data, catalog, links
  – More technologies, RDFa
  – More tools, services
  – More demos, tutorials, videos
  – More domain applications

• Future Research
  – Data integration, linking, search
  – Social machine
  – Provenance, versions, trust
  – Usability and data quality
  – Scalability

http://logd.tw.rpi.edu