Defense Strategies Institute Professional Educational Forum Harnessing the Power of Big Data for The...
Defense Strategies Institute Professional Educational Forum
Harnessing the Power of Big Data for The Intelligence Community
November 17-18, 2015
Mary M. Gates Learning Center, Alexandria, VA
National Priorities for Big Data
Dr. Brand Niemann
Founder and Co-Organizer, Federal Big Data Working Group Meetup
Director and Senior Data Scientist/Data Journalist, Semantic Community
Semantic Community
Federal Big Data Working Group Meetup
Data Science
November 17, 2015
Summary
• The White House Office of Science and Technology Policy (OSTP) and the National Science Foundation (NSF) convened a National Data Science Organizers Workshop, November 5-6, 2015, to discuss:
  1. Data Science for the Nation: National Priorities, Impacts of Big Data Science on National Priorities, and Using Meetups to Explore National Challenges;
  2. Exposing Data;
  3. Coordination and Support of Data Science Meetups; and
  4. The National Priority Challenge.
• The results of this workshop will be summarized along with highlights from the Federal Big Data Working Group Meetup, for which the presenter is the Founder and Co-Organizer.
• Examples will show what the Federal Big Data Working Group Meetup has done from 2014 to the present to provide big data science tutorials and Massive Open Online Courses (MOOCs), curated government datasets, and citizen science and crowdsourcing. This work supports the White House Open Science and Innovation: Of the People, By the People, For the People initiative, part of the President's 2013 Second Open Government National Action Plan.
http://www.nationalprioritychallenge.org/
Agenda
• Day 1: November 5, 2015:
  • 12:00pm – 1:00pm Lunch with the Big Data Regional Innovation Hubs Leaders (limited seating)
  • 2:00pm – 2:30pm Opening Keynote: What are the National Priorities? by Thomas Kalil
  • 2:30pm – 3:30pm Session 1: Leadership Panel on Data Science Innovation and Collaboration
  • 3:30pm – 5:00pm Grassroots Data Science Across the Nation with Lightning Talks
  • 5:00pm – 5:15pm Support of grassroots data science, crowd sourcing, and challenges
  • 6:00pm – 8:00pm PUBLIC: Data Drinks: National Data Community Happy Hour!
• Day 2: November 6, 2015:
  • 8:00am – 10:00am Session 2: Exposing Data
  • 10:30am – 11:30am Session 3: Coordination and Support of Data Science Meetups
  • 12:30pm – 1:30pm Lunch Keynote: Data Science in the Government by D.J. Patil
  • 1:30pm – 5:30pm Session 4: The National Priority Challenge
  • 5:45pm – 6:00pm Closing Remarks
http://www.nationalprioritychallenge.org/schedule/
Speakers
• Thomas Kalil, Deputy Director for Policy for the White House Office of Science and Technology Policy and Senior Advisor for Science, Technology and Innovation for the National Economic Council
  • Opening Keynote: What are the National Priorities? by Thomas Kalil
• Chaitan Baru, Senior Advisor for Data Science, National Science Foundation
  • Session 3: Coordination and Support of Data Science Meetups
• Dr. D.J. Patil, U.S. Chief Data Scientist and Deputy Chief Technology Officer at the White House Office of Science and Technology Policy
  • Lunch Keynote: Data Science in the Government by D.J. Patil
http://www.nationalprioritychallenge.org/speakers/
Purpose and Organizers
• National Data Science Organizers is a network of individuals who promote data science professionalism in their communities across the nation
  • Applications of data science to address major economic or societal challenges
  • Questions that meetups could explore
  • The Obama Administration has already launched a series of initiatives related to Big Data and open data
• About The Organizers
  • Institutional Support
    • National Science Foundation
    • University of Chicago
    • Data Science for Social Good
  • Steering Committee:
    • Representatives from: District Data Labs, DC2, Big Data Utah, NSF, Boston Predictive Analytics, SF Data Mining, NIH/NLM/NCBI, University of Chicago, etc.
http://www.nationalprioritychallenge.org/purpose/
http://www.nationalprioritychallenge.org/about-the-organizers/
Key Points
• Data science for the nation: Impacts of big data science on National priorities
• Data science for tackling the challenges of big data
• Developing people, processes, and products for the Federal Government
• Online Data Science Collaboration and Competition
  • Kaggle
  • DevPost (Formerly Challenge Post)
Because I am a Data Scientist and Data Journalist
• Who we are?
• What we do?
• Where we do it?
• When we do it?
• Why we do it?
• How we do it?
• Specific example: Data Science for the Map of Federal Crowdsourcing and Citizen Science Projects for the NDSO Challenge
Poynter: A Global Leader in Journalism: 6 questions that can help journalists find a focus, tell better stories
Who we are?: Definitions
• Federal: Supports the Federal Big Data Initiative, but not endorsed by the Federal Government or its Agencies;
• Big Data: Supports the Federal Digital Government Strategy which is "treating all content as data", so big data = all your content;
• Working Group: Data Science Teams composed of Federal Government and Non-Federal Government experts producing big data products; and
• Meetup: The world's largest network of local groups to revitalize local community and help people around the world self-organize like MOOCs (Massive Open On-line Courses) now endorsed by the White House
http://www.meetup.com/Federal-Big-Data-Working-Group/
http://semanticommunity.info/Data_Science/Federal_Big_Data_Working_Group_Meetup#Meetups
What we do?: October 19th Meetup
• This Meetup was organized for:
  • Robin Thottungal, Chief Data Scientist @ EPA, and Division Director, EAD, OIAA, OEI,
  • Greg Godbout, Chief Technology Officer, Environmental Protection Agency, and former Executive Director and Co-Founder of 18F, and
  • Jay Benforado, Director, National Center for Environmental Innovation at EPA, and Co-Chair, Federal Community of Practice for Crowdsourcing and Citizen Science
• for the National Data Science Organizers Workshop on November 5-6, 2015, as an example of:
  • data science for curated data sets,
  • user-centric digital services focused on the interaction between government and the people and businesses it serves, and
  • a Federal Community of Practice on Crowdsourcing and Citizen Science of Big Data that meets bi-monthly to share lessons learned and develop best practices for designing, implementing, and evaluating crowdsourcing and citizen science initiatives.
See Recording and Agenda: https://www.cubbyusercontent.com/pl/Instant+Meeting+2015-10-19.webm/_64a119c67b2b4103a43c2e43e356ab35
http://www.meetup.com/Federal-Big-Data-Working-Group/events/223605766/
Where we do it?: Locations
• Xcelerate Solutions
  • 8405 Greensboro Dr., Suite 930, McLean, VA 22102
• National Science Foundation
  • 4201 Wilson Blvd, Arlington, VA
• Eastern Foundry
  • 2011 Crystal Drive, 4th Floor, Arlington, VA 22202
• Marriott Wardman Park
  • 2600 Woodley Road NW, Washington, DC 20008
• Conferences, Workshops, etc.
When we do it?: Meetup Calendar Schedule
• September 28th, Climate Change & Genomic Data - Data Science Meetup of Meetups
• October 5th, Data Science for EPA & USGS Fracturing & Fracking Data (Dr. Sophia Liu, USGS and USGS Staff). See July 13th Meetup: Data Science for USGS Minerals Big Data
• October 19th, Sensing Our Air: The Quest for Big Data About Our Air Quality (EPA’s New Chief Data Scientist, Robin A Thottungal, Invited)
• November 2nd, Data Science for Random Forests: TIBCO Enterprise Runtime for R. See June 1st Meetup: Data Science for Homeless Data: QlikView, Tableau, & Spotfire Bakeoff
• November 5-6th, OSTP/NSF Data Science Meetup of Meetups, Ballston, VA
• November 16th, Data Science for the DataAct Datathon
• December 7th, Data Science for DoD Joint Doctrine
• January 4th, 2016, Data Science for Semantics: MarkLogic and Cray Graph Appliance Update
• February 1st, 2016, Data Science for Census American Community Survey
Why we do it?: Use Federal Big Data Examples and Technology
• Federal Big Data Examples:
  • White House Climate Change and Precision Medicine
  • NIH Genomic
  • EPA Air Quality
  • USGS Water Quality
  • Department of Commerce Census
  • Treasury DataAct
  • DoD Joint Doctrine
• Major Big Data Technologies:
  • TIBCO Enterprise Runtime for R (TERR)
  • MarkLogic Semantics
  • Cray Graph Appliance
How we do it?: Like the NIH Data Commons
• FAIR Principles:
  • Findable
  • Accessible
  • Interoperable
  • Reusable
• Cloud:
  • Data
  • Software
  • Results
• Federal Science Policy:
  • OSTP Public Access to Scientific Data Memo (February 2013)
  • New Program: Big-Data-to-Knowledge (2013)
  • New Position: Associate Director of Data Science (2014)
  • Digital Enterprise (2015): Data Commons
    • Metadata
    • Open APIs
    • Digital Objects
    • Containers
An NIH – Semantic Medline Data Science Data Publication Commons
How we do it?: OSTP/NSF National Data Science Organizers Workshop
• Week of November 2nd:
  • NSF Data Science/Big Data Principal Investigators (About 300)
  • NSF Data Hubs (4)
  • Organizers of Largest Data Science/Big Data Meetups (About 65)
• Pipeline for Return on Investment:
  • PIs put their data, tools and research results in the Data Hubs
  • Data Hubs provide those data, tools, and research results to the world, but especially to the Data Science/Big Data Meetups
  • Data Science/Big Data Meetups collaborate with PIs and Data Hubs to increase usage and feedback
How we do it?: We Already Do This!
• Semantic Community:
  • Provides a Community Sandbox that is like a GitHub, Data Hub, Data Commons, etc.
    • Metadata (MindTouch)
    • Open APIs (MindTouch)
    • Digital Objects (MindTouch)
    • Containers (Spotfire)
  • Organizes the Federal Big Data Working Group Meetup
  • Supports Agencies and Programs in Crowdsourcing Their Data Sets
  • Mentors Data Scientists (Tutorials and MOOCs) and Entrepreneurs (Eastern Foundry)
• Federal Big Data Working Group Meetup:
  • Federal: Supports the Federal Big Data Initiative, but not endorsed by the Federal Government or its Agencies;
  • Big Data: Supports the Federal Digital Government Strategy which is "treating all content as data", so big data = all your content;
  • Working Group: Data Science Teams composed of Federal Government and Non-Federal Government experts producing big data products; and
  • Meetup: The world's largest network of local groups to revitalize local community and help people around the world self-organize like MOOCs (Massive Open On-line Courses) now embraced by the White House.
How we do it?: Data Mining - Science - Questions - Publication Process
• Data Mining Process:
  • Business Understanding
  • Data Understanding
  • Data Preparation
  • Modeling
  • Evaluation
  • Deployment
• Data Science Process:
  • Data Preparation
  • Data Ecosystem
  • Data Story
• Data Science Questions:
  • How was the data collected?
  • Where is the data stored?
  • What are the data results? and
  • Why should we believe the data results?
• Data Science Data Publication:
  • Knowledge Base
  • Spreadsheet Index
  • Web & PDF Tables to Spreadsheet
  • Data Browser
  • Dynamically Linked Adjacent Visualizations
How we do it?: Collaboration for Data Science Win-Wins
• USDA Open Government Data Training, Innovation Competition, and Online Course in Data-Driven Farming:
  • http://semanticommunity.info/Data_Science/Big_Data_Science_for_Precision_Farming_Business#Story
• Many Curated Government Data Sets and Data Science Products:
  • http://semanticommunity.info
• Pick an Agency and/or a Data Set and Look for a Meetup on That:
  • http://www.meetup.com/Federal-Big-Data-Working-Group/
• Mentor Startups Partnership with Eastern Foundry:
  • http://www.meetup.com/Federal-Big-Data-Working-Group/events/223140032/
Specific Example: Data Science for the Map of Federal Crowdsourcing and Citizen Science Projects for the NDSO Challenge
• The National Data Science Organizers (NDSO) are looking for a set of meta-design categories for each challenge model so teams can find, gather, and share data.
• The Federal Crowdsourcing and Citizen Science Toolkit provides both a set of meta-design categories and agency partners to help teams find, gather, and share data.
• The Map of Federal Crowdsourcing and Citizen Science Projects has been converted to a data set of 102 projects that NDSO teams can use for the upcoming OSTP/NSF NDSO Workshop, November 5-6, 2015, and going forward through 2015-2016.
• This work also demonstrates a simple data science project for a hackathon challenge that shows how this map was created in Excel and visualized in Spotfire.
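A per-agency project count of the kind a hackathon team might run on such a data set can be sketched in a few lines of Python. This is only an illustration, not the Excel and Spotfire workflow used for the slides: the column names `project` and `agency` are assumptions about how the spreadsheet might be exported to CSV, and the sample rows are placeholders rather than records from the actual inventory.

```python
import csv
import io
from collections import Counter

# Placeholder rows for illustration only; these are NOT records from the
# real inventory. The column names "project" and "agency" are assumptions
# about how the spreadsheet might be exported to CSV.
SAMPLE_CSV = """project,agency
Example Project A,Agency X
Example Project B,Agency X
Example Project C,Agency Y
"""


def projects_per_agency(csv_text, agency_column="agency"):
    """Count rows per agency and return (agency, count) pairs, largest first."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter(row[agency_column].strip() for row in reader)
    return counts.most_common()


if __name__ == "__main__":
    for agency, count in projects_per_agency(SAMPLE_CSV):
        print(f"{agency}: {count}")
```

Run against a real export, the first pair returned would be the agency with the most projects, which is the kind of ranking a bar chart or map legend would then display.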
https://ccsinventory.wilsoncenter.org/#projectId/101
CCSInventory.xlsx
Spotfire Imports Boundary Files
Spotfire Geocodes Data
NOAA has the most projects: 26
Web Player
https://www.wilsoncenter.org/article/ppsr-core-metadata-standards
Goal: International network of citizen science data
https://inventory.cartodb.com/sessions/create
https://inventory.cartodb.com/dashboard/
https://inventory.cartodb.com/tables/database_federal_citizen_science/map
https://inventory.cartodb.com/tables/database_federal_citizen_science
https://inventory.cartodb.com/viz/a03a58b6-2488-11e5-8f29-0e4fddd5de28/table
https://inventory.cartodb.com/dashboard/datasets