Ecosystem Data and Australia’s TERN:
Making the most of TERN to benefit your research and data management!
A workshop for the “Genes to Geosciences” Series
Macquarie University, May 19, 2014: 1000 – 1500 hrs
Contents
1. Welcome and Introductions
2. TERN and the Research Cycle and Data Cycle
3. Australian Ecosystem Data
• what’s available
• data discovery
• evaluation of data – is it suitable for my needs?
• download and appropriate re-use
4. eMAST Example - New possibilities with ecosystem data
5. Data Management and Publishing
• why does it matter and how can it help you
• data management plans
• data publishing – what are your options and why does it matter
• data publishers – a continuum of approaches
• data publishing options with TERN
6. Wrap-up and Exit Survey
Who are we?
To understand your current practices and topics of interest we did a survey beforehand.
Have you previously searched for and accessed data from a public repository? Yes: 7 No: 5
Do you have a data management plan? Yes: 4 No: 8
Have you published data? Yes: 6 No: 6
Survey – your prior knowledge, experience, and requests for today
• To explain and demonstrate options available to the ecosystem science research community to use online resources for searching, evaluating, downloading, publishing and managing ecosystem data sets.
• Focus on activity and learning-by-doing, rather than too much talking
• To recognise the different needs of researchers at different positions and stages in their research careers.
1. Aims and outcomes
• What will you walk away with?
- Better understanding of the national research infrastructure available to you – TERN
- Sense of the kinds of ecosystem data that are available, and how you can get them
- Experience searching, assessing and downloading data for your research
- Understanding the principles of good data management and the benefits for you
- Appreciation of the options for data management
- Introduction to tools for managing your data, including TERN infrastructure
2. What is TERN?
• Infrastructure and networks to support coordinated, collaborative ecosystem science community
• Enabling sustained, long-term collection, storage, synthesis and sharing of ecosystem data
• Connecting science with policy and management
• TERN’s infrastructure for ecosystem science
Instruments + Sensors
Policy + Management
Analysis + Synthesis
Modelling
Data Searching
Data Sharing
Data Curation + Publishing
Data Storage
Processing + Analysis
Collection Methods
Research lifecycle
[Figure: the research lifecycle for Ecosystem Science. Stages: Knowledge gap: research questions; Proposal and planning; Data collection, verification, quality assurance and control; Data + meta-data, licensing; Storage, preservation and discoverability of data; Data analysis, integration and synthesis; Research output: new data and publications. Benefits shown: Efficiency gain; Increased effectiveness; Enables large scale and coordinated data collection, sharing and multiple re-uses; Enhanced ability to revise, question and expand knowledge.]
This morning
[Figure: the research lifecycle diagram, repeated to show the stages covered this morning.]
3. Australian Ecosystem Data
• Learning Objectives: To identify the following resources for Australian ecosystem science applications:
- ecosystem data stores
- meta-data portals
- data publishers
• Sections:
- 1030 – 1040 Data discovery
- 1040 – 1055 Data discovery - exercise
- 1055 – 1125 Evaluation of data – is it suitable for my needs?
- 1125 – 1145 Download and appropriate re-use
- 1145 – 1215 eMAST Possibilities
Data Discovery
Learning objectives:
To understand how to approach data discovery through systematic use of ecosystem data stores, portals and data journals.
• National infrastructure for Australian ecosystem data
TERN’s data portals and meta-data structure:
[Figure: facility portals shown: AusCover; OzFlux; AusPlots and Transects; Coasts; Soils; SuperSites Network and LTERN; eMAST; AeKOS (Eco-informatics); together with the TERN Data Discovery Portal.]
TERN Data:
TERN facility / Kind of data available / Where can I access [+ submit] data?
• AusCover
- Data: Remote sensing data and derived products covering: land cover; ecosystem variables; fire; surface radiation, meteorology; base satellite data and inputs to satellite processing; site-based datasets.
- Access: Via TDDP or AusCover portal: www.auscover.org.au/data/product-list [Submit - [email protected]]
• AusPlots
- Data: Vegetation and soil surveys and samples; photopoints. Over 330 sites sampled so far. As at March 2014: data from ~130 rangelands sites available, with more coming soon.
- Access: Via AEKOS data portal www.aekos.org.au or Soils to Satellites soils2sat.ala.org.au/ (In future will also be searchable from TDDP). Specimens (vegetation voucher samples and soils) [email protected]: Contact [email protected]
• ACEAS (Australian Centre for Ecological Analysis and Synthesis)
- Data: Synthesised data products from ACEAS working groups.
- Access: Via TDDP or ACEAS portal: aceas-data.science.uq.edu.au/portal/ [Submit – [email protected]]
TERN Data:
TERN facility / Kind of data available / Where can I access [+ submit] data?
• ACEF (Australian Coastal Ecosystems Facility)
- Data: Key datasets include coastal bathymetry, coastal habitats, water quality, beach morphology, turtle distribution and habitat
- Access: Via TDDP or ACEF portal: acef.tern.org.au/portal/ [Submit – [email protected]]
• Australian SuperSite Network (ASN)
- Data: Vegetation composition, structure and cover; fauna surveys; soil properties; gas and energy flux (see OzFlux below); meteorology; surface, ground and soil water
- Access: Via TDDP or ASN portal: www.tern-supersites.net.au/knb/ [Submit – [email protected]]
• Australian Transect Network (ATN)
- Data: Vegetation and soil surveys, including specimens.
- Access: Via AEKOS data portal www.aekos.org.au or Soils to Satellites soils2sat.ala.org.au/ (In future will also be searchable from TDDP). Specimens (vegetation voucher samples and soils) [email protected]
• Eco-Informatics
- Data: Ecological data from individual sites, and from broadscale surveys. Data from AusPlots and the Australian Transect Network, alongside key data from State and Federal partners. See AEKOS data publication schedule for more detail.
- Access: www.aekos.org.au (In progress of submitting metadata to TDDP) [submit - www.aekos.org.au/access_shared]
TERN Data:
TERN facility / Kind of data available / Where can I access [+ submit] data?
• eMAST (Ecosystem Modelling and Scaling Infrastructure)
- Data: Modelled climate and land surface data derived from surface observations.
- Access: Partially available via eMAST: www.tern.org.au/e-MAST-Data-Products-pg26355.html (In progress of submitting metadata to TDDP) [Submit - [email protected]]
• LTERN (Long-Term Ecological Research Network)
- Data: Vegetation composition, structure and cover; fauna surveys; surface, ground and soil water
- Access: Via TDDP or LTERN portal: www.ltern.org.au/knb/ [Contact [email protected]]
• OzFlux
- Data: CO2 and other gas concentration and fluxes; evapotranspiration; surface energy balance; carbon and water cycles
- Access: Via TDDP or OzFlux portal: ozflux.its.monash.edu.au/ecosystem/home [Submit [email protected]]
• Soil and Landscape Grid of Australia
- Data: Functional soil attributes and key landscape features.
- Access: Under development. Best available data products via TDDP: http://portal.tern.org.au/search#!/q=soils/p=1/tab=collection/group=Soils/num=10 [Submit - [email protected]]
• Other data stores and sources?
Data Discovery - Exercise
Exercise:
• Using the TERN Data Discovery Portal:
http://portal.tern.org.au
Data Download and Evaluation
Learning objective:
To understand how to effectively search, download and critically assess ecosystem data sets for use in your own work from: ecosystem data stores, portals and data journals.
Evaluation of data – is it suitable for my needs? Exercise
Exercise:
• Evaluating your chosen dataset:
• What is the metadata?
• What do different parts of the metadata mean?
• Is this going to be useful for you?
• Criteria to use for evaluation?
- Data format(s)
- Data currency
- Data collection methods
- Data QA/QC
- Data licence
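The evaluation criteria above can be turned into a quick completeness check on a metadata record. The sketch below is only an illustration: the field names and the example record are hypothetical and do not come from a TERN portal or any metadata standard.

```python
# Hypothetical field names, one per evaluation criterion listed above.
CRITERIA = ("format", "currency", "collection_methods", "qa_qc", "licence")


def missing_criteria(metadata):
    """Return the criteria that the metadata record leaves empty or omits."""
    return [key for key in CRITERIA if not str(metadata.get(key) or "").strip()]


# Illustrative record only; not taken from a real dataset.
record = {
    "format": "NetCDF",
    "currency": "2001-2012",
    "collection_methods": "",
    "qa_qc": "Gap-filled, outliers removed",
    "licence": "CC BY",
}

print(missing_criteria(record))  # ['collection_methods']
```

A record with gaps is not necessarily unusable, but each missing item is a question to put to the data provider before re-use.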
Download and Appropriate Re-use of Data
Learning Objective:
To understand what data “licensing” is from the points of view of the data producer, user and owner.
What do licences mean?
If you download data with a licence, what are your obligations for re-use?
TERN’s Data Licences: http://www.tern.org.au/datalicence
Licensing for Australian Data - www.ausgoal.gov.au
ecosystem Modelling And Scaling infrasTructure (eMAST)
Integrating multiple data sets
Presentation by Brad Evans based on contributions by Colin Prentice, Michael Hutchinson, Gab Abramowitz, Ben Evans, Rhys Whitley, Julie Pauwels
Land surface 101: Energy balance
Source: IPCC
Land surface 101: Carbon cycle
Source: NASA
eMAST Domain
Research domain: Impacts of rising CO2
Thus the ecosystem modeller seeks to:
1. Understand the effects of CO2 increases on ecosystems
2. Quantify negative feedbacks – the impact of rising CO2, land surface warming and extreme events on ecosystems
$6\,\mathrm{CO_2} + 6\,\mathrm{H_2O} \xrightarrow{\text{light energy; chlorophyll + nutrients}} \mathrm{C_6H_{12}O_6} + 6\,\mathrm{O_2}$
IPCC Consensus: CO2 Fertilization
WUE = GPP / ET
NPP = GPP - R
N & P
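The two ratios on this slide, water use efficiency (WUE = GPP / ET) and net primary production (NPP = GPP - R), can be worked through with a few lines of Python. This is a minimal sketch: the function names and the input values are illustrative assumptions, not eMAST code or measured data.

```python
def water_use_efficiency(gpp, et):
    """WUE = GPP / ET: carbon gained per unit of water lost."""
    return gpp / et


def net_primary_production(gpp, respiration):
    """NPP = GPP - R: carbon left after respiration losses."""
    return gpp - respiration


# Illustrative annual totals (assumed values, not observations):
gpp = 1200.0  # gross primary production, g C m-2 yr-1
et = 600.0    # evapotranspiration, mm yr-1
r = 600.0     # respiration, g C m-2 yr-1

print(water_use_efficiency(gpp, et))   # 2.0
print(net_primary_production(gpp, r))  # 600.0
```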
Land Surface Models -> Coupled to Climate Models
Other approaches
Observations, models and policy
(1) MORE Observations
(2) BETTER models are developed
(3) Models evaluated against observations
(4) EVEN BETTER Models
(5) BETTER Policy
A virtuous cycle
Unifying principles for ecosystem modellers
# 1: Observations, Models and Understanding: Integration of empirical science and modelling betters scientific understanding.
# 2: Transparency, Evaluation, Confidence : Reproducible models, evaluated with observations, enhance model efficacy.
# 3: Innovation, Standards, Simplicity: Continuous innovation, use standards, mitigate unnecessary complexity.
eMAST Observations and Models
[Figure: observation sources and infrastructure linked to Land Surface Models. OzFlux: CO2 and water fluxes. Plot Networks: vegetation observations via AeKOS and others. AusCover: remote sensing (satellite, in-situ & obs.). Bureau of Meteorology and Geoscience Australia. Soils: properties of soil. Infrastructure labels: dap.nci.org.au; geonetwork; TERN TDDP (tern.org.au); RDSI VMs; raijin@nci; INTERSECT; NeCTAR; PALS (evaluation); NeCTAR Virtual Labs.]
eMAST Delivers in 2014-2015: 1 of 3
Simple land surface process models
• eMAST R-Package (MQ & ANU): Bioclimate indices and surface processes
• eMAST Earth System Model Connex (C++ & FORTRAN) (MQ & ANU): Bioclimate indices and surface processes coupled to ACCESS and other Earth System Models
• ePiSaT R-Package: Continental Gross Primary Production (data model fusion)
• Community R-Packages: Hutchinson Drought & BoM Heatwave – in kind from Ivan Hanigan (ANU)
• pyeMAST: Python version of eMAST tools including big data services (connectivity with SPEDDEXES).
Statistical land surface models
• Data Assimilation: Ensemble Kalman Filter coupled to process based land surface model (Renzullo, CSIRO)
• Fubaar: Machine learning land surface model (in-kind MQ – Keenan)
Open Source !
Tools
eMAST Delivers in 2014-2015: 2 of 3
Observation assimilation into Models
• eMAST Ecosystem Model Parameters Database (EMP DB).
• NCAR’s Data Assimilation Research Testbed (DART)
- DART-CESM: In collaboration with NEON, Inc. (USA)
- DART-CABLE: In collaboration with the NCI, NCAR and CSIRO
- Assimilation of: fluxes, leaf properties, plot network observations
Modelled Data discovery and ACCESS Tools
• SPEDDEXES: A community based solution to (a) publishing big data (b) sharing big data (c) discovering big data and (d) programmatic access to big data on Australia’s eResearch infrastructure.
• SPEDDEXES@NeCTAR-VL’s: Collaborative extension of the SPEDDEXES tools to the NeCTAR Virtual Laboratories – embedding in the Climate and Weather Laboratory
Benchmarking and Evaluation
• eMAST@PALS: Development of the PALS system for eMAST and TERN data streams
• eMAST BENCH: International collaboration on benchmarking
Tools
eMAST Delivers in 2014-2015: 3 of 3
NEXT Generation of Ecosystem Models
• ARC DP on Australian Tropical Savannas: Past, Present and Future: Enhancing ecosystem models for Tropical Savannas
• ARC DP on the Next Generation of Ecosystem Models: Using plant trait observations to inform a new approach to ecosystem modelling.
• GePiSaT: Global version of the ePiSaT model (eMAST and Imperial College London)
• CAMELS: Coupling ACCESS with Models of Ecosystems and the Land Surface: Next generation approach to ecosystem and land surface modelling
Datasets from eMAST
• ANUClimate: An extension of past methods for gridding Climate and Weather for the Australian continent.
• eMAST Bioclimate
• eMAST Land Surface Modelling
Tools & Data
Climate and Bioclimate data Res. 0.01 degrees (nominally 1km) T, P, R + and 50 + indices
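The "nominally 1km" figure for a 0.01 degree grid can be checked with simple spherical arithmetic. The sketch below assumes a spherical Earth with a mean radius of 6371 km and uses 25°S purely as an illustrative latitude; neither value comes from the eMAST documentation.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius (spherical approximation)


def cell_size_km(resolution_deg, latitude_deg):
    """Return (north-south, east-west) grid cell size in km at a latitude."""
    km_per_degree = 2 * math.pi * EARTH_RADIUS_KM / 360
    north_south = resolution_deg * km_per_degree
    # Meridians converge towards the poles, so east-west spacing shrinks.
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


ns, ew = cell_size_km(0.01, -25.0)
print(f"{ns:.2f} km x {ew:.2f} km")  # 1.11 km x 1.01 km
```

So a 0.01 degree cell is a little over 1 km north to south everywhere, and close to 1 km east to west at mid-Australian latitudes.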
New approach for Big Data
It is no longer practical, let alone affordable, to continue to do data-intensive ecosystem science in the copy-and-work paradigm; a new approach to working with Big Data is required.
Think about network data access, not file downloads…
Cross-disciplinary use of file formats and services…
Open-source server technology and file formats…
Work with big data in a high performance facility
Big Data: eMAST’s collections
[Figure: bar chart, “Scientific Data for Research (NCI RDSI node) by 2015”. Vertical axis: Data Volumes (TB), logarithmic scale from 10 to 10000. Categories: Climate/Weather; Earth & Marine Observations; Geoscience Collections; Terrestrial Ecosystem; Water Mgmt, Hydrology.]
Three eMAST projects
1. Observations: The Ecosystem Model Parameters Database
2. Models: Ecosystem Production in Space and Time
3. Observations in Models: CABLE-DART Data assimilation on the NCI
Observations The Ecosystem Model Parameters Database
• Originally set up to generate continental scale surfaces of leaf properties (nitrogen, phosphorus etc.) using ANNs
• Adapted in April 2014 for use with Data assimilation
• Focal point for ecosystem scientists and plot networks to contribute observations for use in models
EMP DB
Example One
eMAST : Data assimilation
Example Two
eMAST: Data assimilation
Collaborative ‘Community’ approach: Work with international experts (Fox – NEON and Hoar – NCAR) and local champions Renzullo (CSIRO) and Evans. Open to community participation (Wang, Haverd and Trudinger, CSIRO)
Data assimilation: NEON Leaf Carbon
Fox et al. 2012
Ecosystem Production in Space and Time
Example Three
ePiSaT
Data filtering: Removal of outliers etc. Gap filling of PAR (PPFD) for GPP
Rectangular Hyperbola (3 parameter)
$F_C = R - \dfrac{A_{max}\,\Phi\,I}{A_{max} + \Phi\,I}$
Parameters: 1. $R$ = Respiration; 2. $A_{max}$ = Assimilation; 3. $\Phi$ = Quantum Efficiency
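The three-parameter rectangular hyperbola on this slide, FC = R - (Amax Φ I) / (Amax + Φ I), is easy to evaluate directly. The sketch below is not the ePiSaT implementation: the function names are hypothetical, the parameter values are illustrative, and it assumes that I is the incident light (PAR, expressed as PPFD).

```python
def gross_uptake(light, a_max, phi):
    """Light-limited uptake term: (Amax * phi * I) / (Amax + phi * I)."""
    return (a_max * phi * light) / (a_max + phi * light)


def net_co2_flux(light, respiration, a_max, phi):
    """FC = R - uptake: positive values mean a net release of CO2."""
    return respiration - gross_uptake(light, a_max, phi)


# Illustrative parameters only (assumed units: umol m-2 s-1 for R and Amax).
R, A_MAX, PHI = 2.0, 20.0, 0.05

for ppfd in (0.0, 500.0, 2000.0):
    print(ppfd, round(net_co2_flux(ppfd, R, A_MAX, PHI), 2))
```

In darkness the uptake term vanishes and the flux equals R; at very high light the uptake saturates towards Amax, so the flux approaches R - Amax.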
How does gross primary productivity (GPP) vary in space and time across Australia?
How can we ‘simply’ estimate GPP across Australia?
What data does TERN provide that might be useful for addressing this research question?
Ecosystem Production in Space and Time: ePiSaT
1. Choose the ePiSaT model from emast.org.au, TDDP or SPEDDEXES
2. Obtain OzFlux data via the TERN/OzFlux portals
3. Run the ePiSaT model – generate estimates of ecosystem parameters, evaluate them
4. Obtain climate (eMAST) and satellite data (AusCover) to scale the ePiSaT parameters
5. Produce continental scale estimates of GPP and evaluate them
Ecosystem Production in Space and Time: ePiSaT
This project is supported by the Australian National Data Service (ANDS). ANDS is supported by the Australian Government through the National Collaborative Research Infrastructure Strategy Program and the Education Investment Fund (EIF) Super Science Initiative. For more information visit the ANDS website ands.org.au and Research Data Australia services.ands.org.au.
Closing thoughts on data sharing…
Lunch
This afternoon
[Figure: the research lifecycle diagram, repeated to show the stages covered this afternoon.]
5. Data Management & Publishing
• Learning Objectives:
To understand recognised best practice in “data management” for ecosystem science data sets.
To understand what is required for “data publishing” in appropriate storage sites, portals and journals for specific research purposes – and to understand the diversity of options.
• Sections:
- 1305-1315 Why does data management + publishing matter and how can it help you?
- 1315-1330 Data management plans - exercise
- 1330-1340 Data publishing – your options and why does it matter
- 1340-1350 Data publishers – a continuum of approaches
- 1350-1430 Data publishing options with TERN
Data Management
Learning Objectives:
To understand recognised best practice in “data management” for ecosystem science data sets.
- Why is good data management beneficial?
- What is good data management?
Poor Data Management
Unusable Lost Re-collected
Image credits: www.shutterstock.com (54240301); http://360digest.com/2006/02/25/messy-office-contest/; TERN AusPlots
Personal Drivers
Increase efficiency of research
Guarantee the quality and authenticity of data
Enable exposure of research outcomes via collaborations and dissemination (40%)
Provide reproducibility of experimental and computational outcomes
Facilitate the validation and verification of results
Source: UQL-050112 – Research Data Management Fact Sheet 2
Survey on research data management 2012:
• 63% aware of Australian Code of Conduct
• 70% understand their data management responsibilities
• 70% don’t do data management plans
• 70% don’t keep a registry of research data collections
From Miller, C (2012). “Responses to interviews: University of Adelaide research data repository and metadata store”
• 82% agree data should be available to other researchers
• 81% would re-use another’s data
• 29% supported public access to their data
Data Management Plans - Exercise
Exercise:
Design of a “data management plan” to meet Australian Research Council requirements.
ARC Proposal Guidelines – Under “Project Description”:
“MANAGEMENT OF DATA Outline plans for the management of data produced as a result of the proposed research, including but not limited to storage, access and re-use arrangements.”
Data Publishing
Learning Objectives:
To understand what is required for “data publishing” in appropriate storage sites, portals and journals for specific research purposes – and to understand the diversity of options available.
To understand the different levels of publishing possible under the “data publishing continuum.”
Why should I publish data?
• replication and verification of work;
• formal and measurable recognition of data as a research output;
• a reduction in the duplication of data collection;
• re-use of data in multi- and interdisciplinary research;
• greater transparency in the research process.
High quality, well-described ecological data for 1000s of species occurring at 25,000 sites, with another 67,000 sites coming soon
Successful data publishers get noticed
Correlation between archived or open access data linked to published articles and citation impact (“Sharing detailed research data is associated with increased citation rate”: Piwowar et al. (2007))
Adopting good science practice
• Data are well-described and reproducible
• Apply NHMRC and ARC research ethics
• NHMRC Open Access policy came into effect from 1 July 2012
http://www.nhmrc.gov.au/grants/policy/dissemination-research-findings
• ARC Open Access policy came into effect from 1 January 2013.
http://www.arc.gov.au/applicants/open_access.htm
“A11.5.2. Researchers and institutions have an obligation to care for and maintain research data in accordance with the Australian Code for the Responsible Conduct of Research (2007). The ARC considers data management planning an important part of the responsible conduct of research and strongly encourages the depositing of data arising from a Project in an appropriate publically accessible subject and/or institutional repository. “
When not to publish data, or when to place restrictions
• Patent application
• Confidential human/individual details
• Confidential data due to commercial sponsorship arrangements
• Sensitive species declared by governments
• Sensitive location declared by governments
http://www.tern.org.au/Data-publishing-pg26249.html
Data Publishers – A Continuum
Data Publishing - Exercise
Exercise:
Identification and review of potential data publishers.
We will divide you into small groups to assess the approach to data publishing of a given data publisher in terms of:
- submission and review process;
- attributes required for re-use;
- capacity for re-use;
- costs; and
- ability to measure output and re-use.
Data Publishing with TERN
Learning Objectives: Identification of current and planned data publishing options in TERN.
To understand how you can publish your data with TERN
TERN’s data portals and meta-data structure:
[Figure: facility portals shown: AusCover; OzFlux; AusPlots and Transects; Coasts; Soils; SuperSites Network and LTERN; eMAST; AeKOS (Eco-informatics); together with the TERN Data Discovery Portal.]
Data Publication in TERN - SHaRED
- Metadata complying with ISO 19115 and 19139 international standards; specifically the ANZLIC Profile of those standards
- Easy to use
- Base template which can accommodate in depth details if needed
- *.xml format
Tool developed by ANZLIC - the Spatial Information Council
Data Publication in TERN - ACEF using ANZMet Lite
http://spatial.gov.au/sites/default/files/legacy/osdm.gov.au/Metadata/ANZLIC%2Bmetadata%2Bresources/default.html
Data Publication in TERN - Morpho
https://knb.ecoinformatics.org/#tools
Questions?
6. Wrap up
Outcomes?
- Better understanding of the national research infrastructure available to you – including TERN
- Knowledge of the kinds of ecosystem data that are available, and how you can get them
- Experience searching, assessing and downloading data for your research
- Understanding the principles of good data management and the benefits for you
- Appreciation of the options for data management
- Introduction to tools for managing your data, including TERN infrastructure
6. Wrap up
• Email exit survey tomorrow
• Presentations and materials online and links sent to you
• Please contact us with any questions or follow up items
International Partners
TERN is supported by the Australian Government through the National Collaborative Research Infrastructure Strategy and the Super Science Initiative