NYC Data Web (static version) - A Semantic, Open Public Data Exchange for NYC

Joel Natividad TCG Thursday, June 9, 2011 SemTech 2011 NYC DataWeb A platform for Integrating Public Data into NYC.gov

description

An Open Public Data Exchange for New York City, submitted to the NYCBigApps 2010 Challenge.

Transcript of NYC Data Web (static version) - A Semantic, Open Public Data Exchange for NYC

Page 1: NYC Data Web (static version) - A Semantic, Open Public Data Exchange for NYC

Joel Natividad, TCG

Thursday, June 9, 2011, SemTech 2011

NYC DataWeb: A platform for Integrating Public Data into NYC.gov


About Me

• TCG Software

• Software Services arm of “The Chatterjee Group”

• Several Portfolio companies in Life Sciences, Telecom, Aviation, Energy, Real Estate, & Info Technology

• Headquartered in NYC

• Delivery Centers in Bangalore, Kolkata & Mumbai

• I look after the Knowledge Engineering Practice of TCG


Background

Main Goals

• stimulate development of apps that improve access to info and govt transparency; and

• encourage innovation & the creation of new IP with commercial potential


CROWDSOURCING

• Wisdom of the Crowd

• Self-selecting, motivated developers

• Bang for the Buck

• Ignites Entrepreneurship

CROWDSOURCING: The Netflix Prize

• Challenge: Improve Recommendation Algorithm by 10%

• Dataset:

• 100 million ratings (training set)

• Half a million Users

• 18 thousand movies

• Prize: One million US Dollars

STATISTICS

• Just 6 days into the contest, Cinematch was bested by 1%

• 20,000 Teams, 150 countries

• Entrants:

• Bell Labs

• Opera Solutions

• Well-renowned universities
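
To make the 10% target concrete: the Netflix Prize scored entries by root-mean-square error (RMSE) on held-out ratings, and improvement was measured relative to Cinematch's RMSE. The sketch below shows only that arithmetic. The ratings, the predictions, and the helper names `rmse` and `improvement_pct` are illustrative assumptions, not contest data or contest code.

```python
import math

def rmse(predicted, actual):
    """Root-mean-square error between two equal-length rating lists."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))

def improvement_pct(baseline_rmse, candidate_rmse):
    """Percentage reduction in RMSE relative to the baseline."""
    return 100.0 * (baseline_rmse - candidate_rmse) / baseline_rmse

# Hypothetical held-out ratings (1-5 stars) and two sets of predictions.
actual = [4, 3, 5, 2, 4]
baseline = [3.0, 3.0, 4.0, 3.0, 3.0]    # stand-in for the incumbent algorithm
candidate = [3.5, 3.0, 4.5, 2.5, 3.5]   # stand-in for a contest entry

base = rmse(baseline, actual)
cand = rmse(candidate, actual)
print(f"baseline RMSE={base:.4f}  candidate RMSE={cand:.4f}  "
      f"improvement={improvement_pct(base, cand):.1f}%")
```

With these made-up numbers the candidate halves every error, so the improvement is far larger than anything seen in the contest; the point is only how the percentage is computed.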


• Washington DC CTO - Vivek Kundra


• First Federal CIO - Vivek Kundra

• Open Government Initiative

• Recovery.gov

• Data.gov

• USAspending.gov

• IT Dashboard

• Performance.gov

• Fedspace

• Citizen Services Dashboard

Life Support: Budget slashed from $34 million to $8 million


Open Data in NYC

Council Member Gale Brewer


$500 million!!!


Why $500 million?!?!


“Integrated” Inter-Agency System


Data Integration Alphabet Soup

SOA, EAI, ORB, SOAP, RPC, XML, XSLT, JMS, EJB, MOM, MDA, BPM, BPEL, POJO


Principles

• Cost Effective (NOT $500 million)

• Easy to Use (Developers/Publishers/Citizens)

• based on Open Standards

• Low Adoption Curve

• Help Accelerate Open Data Innovation

• Useable Data Now!



The Next Web of Open Linked Data

February 2009


Useable Data Now

• “Beautiful” Website

• Useable by Developers/Publishers/Citizens

• based on Open Standards

• Low Adoption Curve

• Help Accelerate Open Data Innovation

• Useable Data Now!


What NYCBigApps Developers were Doing

Siloed Data

ETL Processes

• Spent an inordinate amount of time interpreting data

• Massaged data was then staged locally

• Developers kept reinventing the wheel

• Limited data mashups

• Applications disconnected from NYCDatamine

Download & Decipher


There must be a Better Way


How it Started

• Oct 12, 2010 - NYCBigApps 2.0 announced

• Nov 9, 2010 - NYCBigApps 2.0 kickoff meeting

• late Nov 2010 - spoke with Revelytix/Spry about collaborating

• early Dec 2010 - started work on NYCDataWeb

• Jan 26, 2011 ~4:30p - submitted entry


What We Did

[Architecture diagram. Labels: Siloed Data; Mapping Ontology; Metadata Ontology; Domain Ontology; Definitions; Rules; Planner; Optimizer; Re-Writer; Indexes; Cache; Query & Results]


“Beautiful” Website

Three dashboards were built

• NYC Agile Analytics (Spry)

• NYCreation (SMW+) - visualized SPARQL query results

• NYCmantics (SMW+) - NYC datamine explorer
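
For readers new to SPARQL, the core idea behind the query results these dashboards visualized is pattern matching over subject-predicate-object triples. The following is a minimal sketch in plain Python and is not the NYCDataWeb implementation. The triples, the `nyc:` names, and the `match` and `query` helpers are all hypothetical.

```python
# Hypothetical triples (subject, predicate, object); not real NYC data.
TRIPLES = [
    ("nyc:PS123", "rdf:type", "nyc:School"),
    ("nyc:PS123", "nyc:inBorough", "nyc:Brooklyn"),
    ("nyc:Lib7", "rdf:type", "nyc:Library"),
    ("nyc:Lib7", "nyc:inBorough", "nyc:Brooklyn"),
    ("nyc:PS9", "rdf:type", "nyc:School"),
    ("nyc:PS9", "nyc:inBorough", "nyc:Queens"),
]

def match(pattern, triples, binding=None):
    """Yield variable bindings for one triple pattern.

    Terms starting with '?' are variables; other terms must match exactly.
    """
    binding = binding or {}
    for triple in triples:
        new = dict(binding)
        ok = True
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # Bind the variable, or check it against an earlier binding.
                if new.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            yield new

def query(patterns, triples):
    """Join several triple patterns, similar to a SPARQL basic graph pattern."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [b2 for b in bindings for b2 in match(pattern, triples, b)]
    return bindings

# Roughly: SELECT ?s WHERE { ?s rdf:type nyc:School . ?s nyc:inBorough nyc:Brooklyn }
schools_in_brooklyn = query(
    [("?s", "rdf:type", "nyc:School"), ("?s", "nyc:inBorough", "nyc:Brooklyn")],
    TRIPLES,
)
print(schools_in_brooklyn)
```

A real deployment would send SPARQL to an endpoint rather than scan a list; the sketch only shows why shared identifiers across datasets make mashups a matter of joining patterns.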


What’s Next?

Semantic Gap

Developers

?!?

3.0

JumpStart Semantics

The Computer for the rest of us.

Semantics for the rest of us.

Semantics for the REST of us.


Phase 2: Aug 2011 (Powered by NYCDataWeb)

• Hide Complexity (Simplicity = Adoption)

• Incorporate the whole NYC datamine

• Make it easier for Publishers

• Make it easier for Developers

• Make it easier for Citizens

• Open-source collaboration with vendors & other institutions

• Incorporate the best of Socrata and data.gov

• Improved Visualizations

• Position NYCDataWeb as the accelerated data mashup platform


Phase 3: Nov 2011 (NYCBigApps 2011)

• DataWeb Deployment Framework SMW bundle

• More Data Sources (Federator - Spinner)

• Linked Open Data

• Make it easier STILL for Publishers, Developers and Citizens

• Enable Widespread adoption of NYCDataWeb (NYCDataWeb bootcamp)


The Broader Vision

[Diagram: NYC Information Web. Labels: Agency Data; Web Pages; Sensors; Partners; Other Triplestores; RDF; Ontology; Domain Ontology; Query & Results]


Phase 4: Post NYC BigApps 2011

• Multiple solutions powered by NYCDataWeb

• <Your city/community/company here> DataWeb

• Help foster a viable ecosystem of Linked Data

• ... keep standing on the shoulders of giants


Semantic Web


Hans Rosling shows the best stats you've ever seen

February 2006


PUBLIC


We need your help & feedback

A Platform for Integrating Public Data into NYC.gov

Find out more at http://knoodl.com/ui/groups/NYC_Homepage


CREDITS

• Lego Faceparty picture by RichardAM (http://www.richard-am.net/)

• Lego Inauguration Pictures from various Flickr Users (sluggobear, Atwater, Dan Hontz)

• Lego Luke looses his Hand by Flickr user wwwayazdotcom

• Tim Berners-Lee highlight from TED (http://www.ted.com/talks/tim_berners_lee_on_the_next_web.html)

• Hans Rosling highlight from TED (http://www.ted.com/talks/hans_rosling_shows_the_best_stats_you_ve_ever_seen.html)

• FlowerPowerpont2.pptx provided by Anna Rosling Rönnlund of Gapminder

• “Star Wars Gangsta Rap” highlight, SizzlechestXXX (http://www.youtube.com/watch?v=Ij4w7ChpuaM)

• Various screenshots provided by Revelytix, Spry Inc. and TCG Software Services