EarthCube DDMA AGU

A Community Roadmap for Enabling Access to Geosciences Data. Tanu Malik and Ian Foster, Computation Institute, University of Chicago and Argonne National Lab. [email protected], [email protected]. www.ci.anl.gov, www.ci.uchicago.edu

Transcript of EarthCube DDMA AGU

Page 1: EarthCube DDMA AGU

A Community Roadmap for Enabling Access to Geosciences Data

Tanu Malik, Ian Foster
Computation Institute
University of Chicago and Argonne National Lab.
[email protected], [email protected]

Page 2: EarthCube DDMA AGU

Outline

• Access Workshop
• DataSpace
• Post-Charette EarthCube

Page 3: EarthCube DDMA AGU

Access is Vital for EarthCube’s Success

• The goal of EarthCube is to create a sustainable infrastructure that enables the sharing of all geosciences data, information, and knowledge in an open, transparent and inclusive manner.

I can't get access to *.

It is difficult for me to *.

I want to integrate data from other disciplines, but *.

Access refers to software and activities that make data and computational resources easily, efficiently and reliably available to scientists across disciplines.

Page 4: EarthCube DDMA AGU

Access Workshop Goals

• Encourage discussions on emergent issues:
– Use of cloud computing
– Exploiting the general principle of moving computation to data
– A technological and governance framework for cross-disciplinary access, service architecture, brokering principles, real-time data, uniform authentication and authorization environment, etc.
– Improving access to data in publications

• Bring some standardization to research data lifecycle issues:
– In general, data, once generated, follow a lifecycle: they are stored, described, processed, transformed, accessed, discovered, analyzed, and curated. In organized networks and campaigns, lifecycle stages are often documented and standardized, though they vary significantly across networks and campaigns. In individual initiatives, the lifecycle stages remain ad hoc and ill-defined. [RDLM-Workshop2011]

• Obtain community consensus on a few use cases
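The principle of moving computation to data, listed among the goals above, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `DataNode`, its `download` and `run` methods, and the synthetic records are hypothetical and do not describe any EarthCube service. The sketch only contrasts how many values must leave the repository under each approach.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, List


@dataclass
class DataNode:
    """Hypothetical repository that holds a dataset next to compute resources."""
    name: str
    records: List[float]

    def download(self) -> List[float]:
        # Moving data to computation: every record leaves the repository.
        return list(self.records)

    def run(self, func: Callable[[List[float]], float]) -> float:
        # Moving computation to data: the function runs where the records
        # live, and only its single result leaves the repository.
        return func(self.records)


# Synthetic stand-in for a large archive (assumed values, not real data).
node = DataNode("hypothetical-archive", [float(t) for t in range(100_000)])

# Approach 1: copy everything, then analyze locally.
local_copy = node.download()
result_after_download = mean(local_copy)
values_moved_by_download = len(local_copy)

# Approach 2: send the analysis to the data and receive one number back.
result_at_source = node.run(mean)
values_moved_by_remote_run = 1
```

Both approaches yield the same statistic; the difference is that the first moves 100,000 values and the second moves one, which is the motivation behind the principle when datasets are large relative to network bandwidth.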

Page 5: EarthCube DDMA AGU

Workshop Activity Outcomes

• Use Case 1: Can I access “not large” but “big data” to conduct statistical analysis?

• Use Case 2: I have a hypothesis not tied to a physical instrument or geophysical parameter. Can I still access all the data, in an “interactive” fashion to test my hypothesis?

• Use Case 3: The storm dust paper is vital to my research. Can I access the data in the publication and change parameters of experiments to understand the nature of storm dust?

Page 6: EarthCube DDMA AGU

Workshop Reflections

• It's all about data!

[Figure labels: Data; Resources, Services; People; Import; Export]

Page 7: EarthCube DDMA AGU

Workshop Reflections-2

• Discussing technology issues in isolation is a recipe for disaster.
– Access is closely aligned with other subgroups
– It is important to organize in functional units

Page 8: EarthCube DDMA AGU

Workshop Reflections-3

• Challenges will continue

Changing Requirements/Changing Technology

• Real-time data
• Cross-disciplinary data
• High dimensionality
• Network bandwidth, computational resource, and data management constraints

Adoption Culture

• Adoption is slow
• Sustainability
• Establishing practices

Social Challenges

• Transparency
• Openness
• Establishing social ties

Page 9: EarthCube DDMA AGU

Principles of Data Sharing in EarthCube

• Lowers the barrier to entry for data sharing and reuse
• Uses tenets like “metadata ASAP” to encourage submission of data
• Enables creation of “Curation Co-ops” among communities and sub-communities
• Serves the NSF DMP requirement
• Builds on a cloud-based infrastructure to support data discovery, access, and mining

Page 10: EarthCube DDMA AGU

Enabling A Data Sharing Space: The DataSpace

• Embrace a “semi-structured” notion
• Ingest data in raw form, followed by structuring and refinement of the data and metadata
• Open, extensible architecture that supports a Software as a Service (SaaS) model
• Process for vetting contributed services prior to their incorporation
• Based on on-demand resources
• Emphasis on usability instead of on developing technology/infrastructure
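The “semi-structured” idea in the bullets above (ingest raw data first, refine data and metadata later) can be sketched as follows. This is an illustrative sketch only: `DataSpaceEntry`, `ingest`, `refine`, and the sample payload are hypothetical names and values, not part of any DataSpace implementation described in the slides.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class DataSpaceEntry:
    """Hypothetical record: raw bytes plus whatever metadata is known so far."""
    raw: bytes
    metadata: Dict[str, Any] = field(default_factory=dict)

    def refine(self, **extra: Any) -> None:
        # Later structuring step: add or overwrite metadata fields
        # without touching the raw payload.
        self.metadata.update(extra)


def ingest(raw: bytes, **minimal_metadata: Any) -> DataSpaceEntry:
    # Accept data in raw form with only minimal metadata; nothing is
    # rejected for lacking a schema.
    return DataSpaceEntry(raw=raw, metadata=dict(minimal_metadata))


# Made-up sample payload and contributor, for illustration only.
entry = ingest(b"time,dust_ppm\n0,12.5\n1,14.1\n", contributor="hypothetical-user")

# Refinement happens after ingestion, once more is known about the data.
entry.refine(format="csv", columns=["time", "dust_ppm"])
```

Keeping the raw payload untouched while metadata accumulates is one way to lower the barrier to submission; a stricter schema-first design would trade that ease of entry for earlier consistency.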

[Figure labels: DataSpace; Data; Resources, Services; Import; Export]

Page 11: EarthCube DDMA AGU

Post-Charette

• 2 EarthCube PI meetings at University of Colorado, Boulder
– A Concept group meeting, with some representation from Community groups, July 10, 2012
– A Concept and Community group meeting, October 4-5, 2012

• Primary objective: Convergence
– Through roadmaps
– Architecture
– On future steps

Page 12: EarthCube DDMA AGU

Highlights: Summary of Roadmaps

• Workplace to collaborate
• Lower barriers for participation
• Openness and extensibility
• Feedback and reproducibility
• Discovery of materials held by long-tail scientists
• Education and reward system for scientists
• Cross-domain teams and broad collaboration
• A new community paradigm

Page 13: EarthCube DDMA AGU

Defining DataSpace: Architecture-1

[Figure labels: Data; Resources, Services; Import; Export]

Page 14: EarthCube DDMA AGU

Defining DataSpace: Architecture-2

Page 15: EarthCube DDMA AGU

Acknowledgements

• Don Middleton, NCAR
• Robert Gibb, New Zealand Landcare Research
• Jeff Heard, U. of North Carolina
• Doug Lindholm, U. of Colorado
• Joseph Baker, Virginia Tech
• Anne Wilson, U. of Colorado
• Chris Lynnes, NASA/ESIP Federation
• Karsten Steinhauser, U. of Minnesota
• Ruth Duerr, NSIDC
• Dave Fulker, OPeNDAP
• Amarnath Gupta, UCS
• Robert Jacob, ANL
• Chris Jenkins, JPL
• Craig Mattocks, U. Miami
• Beth Plale, Indiana Univ.
• Stephen M. Richard, AZGS
• Sameer Sirugeri, Microsoft
• Zhangfan Xing, JPL
• John Williams, NCAR

Page 16: EarthCube DDMA AGU

Thank You!

• Tanu Malik, [email protected]
• Ian Foster, [email protected]

• Questions?