Institute for Sustainable Earth and Environmental Software
ISEES
Matthew B. Jones
National Center for Ecological Analysis and Synthesis (NCEAS)
University of California Santa Barbara
ISEES Sustainability and Adoption Workshop
September 10-11, 2013
Introductions
• Your 140-character Twitter intro
Science and Synthesis
• Synthesis critical to advancing science
• Merger of synthesis with experimental and observational science
Ocean Health Index (OHI)
[Figure: Ocean Health Index; Halpern et al. 2012]
Software in the science lifecycle
From Reichman, Jones, and Schildhauer; doi:10.1126/science.1197962
Software for the Earth, Life, and Environmental Sciences
• Statistical analysis (e.g., R, SAS, Matlab, Systat, Excel)
• One-off models (by students, faculty, etc.)
• Custom analytics (e.g., Primer, MetaWin, MaxEnt)
• Modeling frameworks (e.g., ROMS)
• Community models (e.g., Century, Community Climate Model)
• Workflows (Kepler, VisTrails, …)
• Computing engines (e.g., Sun Grid Engine, Amazon EC2)
• Data management (DataONE, Metacat, DataUp)
• Service computing (Blast, WMS, WFS, …)
Software challenges
• Wide range of software types
• Code complexity and quality
• Reproducibility
• Systems integration
• Development and maintenance are labor intensive
– NSF not set up for infrastructure/maintenance
• Software lifetime long compared to hardware
• Under-appreciated value
ISEES Vision
• Massively accelerate science (Earth, environmental, and life science)
– Enable collaboration and integration across disciplines
– Invent, develop, integrate, mature, and sustain software used throughout the scientific lifecycle
Determining needs
• What needs to be improved?
• What challenges do we face?
• How do we solve these?
Any solution must…
• Provide value to participants in their reputation economy
• Enable participants, not compete with them
Can an Institute build it “for them”?
• No. Must empower community
– Scaling/leverage
– Creativity
– Knowledge of domain
• Community-driven initiative
– Model after synthesis centers
– Link to community initiatives such as ESIP
ISEES Steering Committee
• Matthew Jones (Cyberinfrastructure)
• Lee Allison (Geology)
• Daniel Ames (Hydrology)
• Bruce Caron (Collaboration)
• Scott Collins (Ecology)
• Patricia Cruse (Library)
• Peter Fox (CI & Semantics)
• Stephanie Hampton (Ecology)
• Chris Mattmann (JPL; Apache)
• Carol Meyer (ESIP Community)
• William Michener (DataONE)
• James Regetz (Analytics)
• Mark Schildhauer (Semantics)
Strategic planning approach
Two tracks
• Community Engagement, Sustainability, and Governance
– Stakeholders and community
– Sustainability
– Structure and Governance
• Workforce Development
– Knowledge and skills: content gaps
– Mechanisms for education
Strategic Recommendations
Collaborative Space
• Document sharing and wiki
– https://projects.nceas.ucsb.edu/isees/projects/software/
• Etherpad collaborative editing
– https://epad.nceas.ucsb.edu/
• Username/pw
– Username: isees
– Password: swforscience
ISEES Science Drivers Workshop
• Outcomes
– Science challenges limited by software
– Functional areas for ISEES
Burrows et al. 2011. Science 334:652-655
[Figure labels]
• Water dimension: fresh water availability (mean, extremes, uncertainty)
• Biological dimension: ecosystems (mean, extremes, uncertainty)
• Human dimension: human society (mean, extremes, uncertainty)
• Time, allocation, now
• Visualization
• Scenario prescription
Data resources
• CUAHSI HIS
• World water online
• GEOSS
• DataONE
• NASA/ESA/other
• NEON
• EarthCube
• NSW/WMO/other
• CoCoRaHS
• Water managers
• Army Corps
• Social media
Data types
• Precipitation
• Atmos. H2O
• Groundwater
• Reservoir storage
• River discharge
• Water quality
• Soil moisture
• Other climate
• LC/LU
• Built infrastructure
• Economic
• Population
• Ag/irrigation
• Sap flux/tower ET
• Human use
• Physical hydrology
New data initiatives
Data management
• Selection
• Provenance
• Rectification
Scenario support
• Simulation
• Historical
• Social science
Earth system models (CSDMS)
• CESM
• ESMICs
Data fusion
• Spatial statistics
• Assimilation
Data ingestion
Experimentation
Feedback analysis
Community input and refinement
Theory
Algorithms
Parameterizations
How will coupled human and biophysical systems shape and be shaped by water availability?
Sources
Transport
Recipient Systems
Resistor
Output
Visualization
Scenarios
Decision-Support tool
Climate change (model output)
Hydrological modifications
Population change (scenarios)
Land use and cover change (models, observations)
Archive, provenance, other considerations
Q: What are the controls, impacts, and societal responses to atmosphere–land–water transfer of pollutants, and how will they change under multiple, global-change stressors?
• Modularity: main program with modules (off/on in parameter file)
• Flexible I/O: OPeNDAP (Open-source Project for a Network Data Access Protocol)
• Storage: flexible output (netCDF, ASCII formats) and data archive system
• Existing pollutant transport models
– CMAQ annual deposition, Community Modeling and Analysis System (CMAS) Center
– SPARROW water quality model (USGS)
– NASA models of aerosol movement
– SMS and Delta3D for sediment transport
– CMS for CDOM transport and oil spills
• Landscape and habitat models (USGS, WRI)
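The modularity bullet above (a main program whose modules are switched off/on in a parameter file) can be illustrated with a minimal sketch. Everything here is hypothetical: the module names, the `[modules]` section, and the parameter file layout are assumptions for illustration, not taken from any of the models listed.

```python
import configparser

# Hypothetical parameter file: each key under [modules] switches one module on or off.
PARAMETER_FILE = """
[modules]
atmospheric_deposition = on
river_transport = on
sediment_transport = off
"""


# Placeholder modules: each records that it ran and returns the shared state.
def atmospheric_deposition(state):
    state["steps"].append("atmospheric_deposition")
    return state


def river_transport(state):
    state["steps"].append("river_transport")
    return state


def sediment_transport(state):
    state["steps"].append("sediment_transport")
    return state


# Registry of available modules; dictionary order is the run order.
MODULES = {
    "atmospheric_deposition": atmospheric_deposition,
    "river_transport": river_transport,
    "sediment_transport": sediment_transport,
}


def enabled_modules(parameter_text):
    """Return the names of modules switched on in the parameter text."""
    parser = configparser.ConfigParser()
    parser.read_string(parameter_text)
    # Modules missing from the file are treated as switched off.
    return [
        name
        for name in MODULES
        if parser.getboolean("modules", name, fallback=False)
    ]


def run(parameter_text):
    """Main program: run every enabled module in registry order."""
    state = {"steps": []}
    for name in enabled_modules(parameter_text):
        state = MODULES[name](state)
    return state


if __name__ == "__main__":
    print(run(PARAMETER_FILE)["steps"])
```

Keeping the on/off switches in a parameter file means a scenario can be changed without editing the main program.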
Software Needs for Data & Model Output Synthesis
Perturbations of IC (climate and land-use scenarios)
• New transport models: coupled atmospheric-ocean transport models
– High performance computing with multi-processors and MPI capabilities
– Multi-scale nesting capabilities
– Hind-cast and near-real-time capabilities
– Stochastic capabilities and ensemble simulations to quantify uncertainties
Output & Visualization Needs
• User interface, interactive scenarios
• Connectivity module linking sources to recipients: where does the pollution come from?
• Matlab 2D and 3D animations
• R statistical package
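The need for perturbed initial conditions and ensemble simulations listed above can be sketched as follows. This is a toy illustration only: `toy_transport` is a hypothetical first-order decay stand-in, not any of the transport models named in these slides, and the Gaussian perturbation of the initial load is an assumption.

```python
import random
import statistics


def toy_transport(initial_load, decay_rate, steps):
    """Hypothetical stand-in for a transport model: first-order decay of a load."""
    load = initial_load
    for _ in range(steps):
        load *= 1.0 - decay_rate
    return load


def run_ensemble(n_members, base_load, load_spread, decay_rate, steps, seed=0):
    """Run the toy model once per member with a perturbed initial condition."""
    rng = random.Random(seed)  # seeded so the ensemble is reproducible
    results = []
    for _ in range(n_members):
        # Gaussian perturbation of the initial load, clipped at zero.
        perturbed = max(0.0, rng.gauss(base_load, load_spread))
        results.append(toy_transport(perturbed, decay_rate, steps))
    return results


def summarize(results):
    """Summarize the ensemble spread as a simple uncertainty estimate."""
    return {
        "mean": statistics.mean(results),
        "stdev": statistics.stdev(results),
        "min": min(results),
        "max": max(results),
    }


if __name__ == "__main__":
    members = run_ensemble(50, base_load=100.0, load_spread=10.0,
                           decay_rate=0.1, steps=5)
    print(summarize(members))
```

The spread across members (here the standard deviation and range) is one simple way to express uncertainty in the model output; a real system would distribute members across processors, for example with MPI.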
Spatially and temporally predict carbon storage and flux globally at 1 km scales to 2300
What can ISEES do for you?
• Computation training for early-career, mid-career, and senior scientists (14)
• Assimilation and QA/QC tools for heterogeneous data (13)
• Provide a collaborative environment for ecologists, computing scientists, social scientists, etc. (10)
• Develop dynamic, flexible visualization tools (9)
• Support for software maintenance and sustainability, including software building blocks (e.g., modules) (9)
• Improved tools for capturing decisions and workflows in collaborative research projects (6)
• Software discovery: One-stop shopping for finding and characterizing software and models -- focus on users (6)
• Provide consultants, collaborators for software, CS, for researchers (6)
• Community hub for standards convergence (4)
• Facilitate merging of disparate software tools (3)
• Develop user-friendly interfaces to existing models (3)
• Provide a framework for multiscale, coupled modeling systems (2)
• Make high performance computing available to the average ecologist and environmental scientist (2)
• Software to help with uncertainty and error propagation in spatial models (2)
• Provide web-based software services, i.e. ability to run analyses on ISEES servers via accessible interfaces (2)
• Software vetting (check software being developed in-house) (1)
• Help me contribute to community software (1)
• Taxonomy scrubbing software (1)
• Improved model intercomparison (1)
Software Lifecycle and Components Workshop
• Goal: Envision a model for ISEES that enables efficient, reproducible, scalable, and impactful environmental science
– Identify *functions* that ISEES would be ideally suited to perform or coordinate
– Provide recommendations to the ISEES steering committee for our strategic plan
• Contribute to a paper outlining this vision for ISEES
• Stimulate amazing and fun discussions here and later about software in science
ISEES Software lifecycle model
Figure by M. B. Jones, NCEAS
Functional areas for ISEES
• Participants identified 4 functions
– Community building
– Training and Advocacy
– Consulting Services
– Infrastructural Services
Training and Advocacy
• Advocate for the benefit of Open Source
• On-site and remote courses on sw development for research sw developers
• Develop the policies that incentivize sharing and collaboration (licensing, attribution)
• Provide best practices for sw development aimed at scientific community
Community Building
• Operate community based support services for software
• SW dating game: match science with sw experts (e.g., DBA services, UX services)
• Support collaboration groups focused on sw projects
Consulting Services
• Assisting groups in software:
– Design and Architecture, Hardening, Maintenance, Preservation
• Consultation and mentoring services
– licensing, testing as a service, engineering, cost modeling
• Certify papers as reproducible
Infrastructural services
• Software discovery services
• Software review and certification program
• Provide software use and quality metrics
• Actively survey the community
Science & Software collaboration
• Science-directed hackathon-style working groups
– 8-12 member working groups
– co-directed by scientist and developer
– address one or more software problems that impede a grand challenge science question
• Advisory board composed equally of scientists and developers to select winning proposals
• Option for 1-2 developers to continue on problem between working groups
Questions?
• http://isees.nceas.ucsb.edu/
• http://www.nceas.ucsb.edu/ecoinfo/