Hannu Saarenmaa – University of Eastern Finland

Organising data flows and modelling for the Essential Biodiversity Variables. Hannu Saarenmaa – University of Eastern Finland (GEO BON, WG8 – Data Integration and Interoperability; EU BON, WP2 – Data Integration and Interoperability; BioVeL, WP2 – Workflows for Scientific Research). GEO-X Plenary, Geneva, 14 January 2014.


Transcript of Hannu Saarenmaa – University of Eastern Finland

Page 1: Hannu Saarenmaa – University of Eastern Finland

Hannu Saarenmaa – University of Eastern Finland

• GEO BON, WG8 – Data Integration and Interoperability

• EU BON, WP2 – Data Integration and Interoperability

• BioVeL, WP2 – Workflows for Scientific Research

Organising data flows and modelling for the Essential Biodiversity Variables

GEO-X Plenary, Geneva, 14 January 2014

Page 2: Hannu Saarenmaa – University of Eastern Finland

Essential Biodiversity Variables

• Conceived by GEO BON collaborators (Pereira et al. (2013), “Essential Biodiversity Variables”, Science, Vol. 339, 18 Jan 2013).

• EBVs facilitate data integration by providing an intermediate abstraction layer between primary observations and indicators.

• Computed from a large number of inputs (monitoring/incidental data).

• EBVs aim to help observation communities harmonise monitoring, by identifying how variables should be sampled and measured.

• EBVs standardise an ontology for biodiversity and harmonise measurements, observations, and protocols.

• Endorsed by the Convention on Biological Diversity (CBD) and in line with the 2020 Aichi Targets.

• Provide focus for GEO BON and hence for the interoperability thrust within GEO BON.

• A use case that GEO BON, EU BON and BioVeL focus on.

Page 3: Hannu Saarenmaa – University of Eastern Finland
Page 4: Hannu Saarenmaa – University of Eastern Finland

Where does the data come from?

• In Europe there are about 2000 biodiversity observation networks (only 643 listed by EUMON).

• GBIF has 10,000 data sets, openly accessible, conforming to GEOSS Data Sharing Principles.

• LTER/DataONE has thousands of biodiversity datasets.

• EU BON is carrying out a gap analysis:

– There is massive duplication of effort in data management, and a lack of data sharing.

– There are very few data sets whose “quality” (coverage, accuracy, etc.) has been documented and guaranteed.

– The so-called “data core” in biodiversity has not yet been defined.

Page 5: Hannu Saarenmaa – University of Eastern Finland
Page 6: Hannu Saarenmaa – University of Eastern Finland

• “Workflows” (series of data analysis steps) make it possible to process vast amounts of data.

• Build your own workflow: select and apply successive “services” (data processing techniques).

• Import data from one’s own research and/or from existing libraries (e.g. GBIF, Catalogue of Life).

• Access a library of workflows and re-use existing workflows.

• Cut down research time and overhead expenses.
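
The idea of a workflow as successive services applied to data can be illustrated with a minimal Python sketch. This is not BioVeL code: the function names (`workflow`, `drop_missing_coordinates`, `deduplicate`) and the record layout are hypothetical, chosen only to show how small processing steps chain into a reusable workflow.

```python
from functools import reduce


def workflow(*services):
    """Chain services so that each one receives the previous one's output."""
    def run(data):
        return reduce(lambda value, service: service(value), services, data)
    return run


def drop_missing_coordinates(records):
    """Hypothetical service: keep only records that have both coordinates."""
    return [r for r in records if r["lat"] is not None and r["lon"] is not None]


def deduplicate(records):
    """Hypothetical service: drop exact repeats of (name, lat, lon)."""
    seen, unique = set(), []
    for r in records:
        key = (r["name"], r["lat"], r["lon"])
        if key not in seen:
            seen.add(key)
            unique.append(r)
    return unique


# A reusable workflow built from two services; the records are made up.
clean = workflow(drop_missing_coordinates, deduplicate)
records = [
    {"name": "species A", "lat": 60.0, "lon": 25.0},
    {"name": "species A", "lat": 60.0, "lon": 25.0},
    {"name": "species A", "lat": None, "lon": 25.0},
]
result = clean(records)
```

Because `clean` is an ordinary function, it can be stored and re-used on other record lists, which is the re-use property the slide attributes to a workflow library.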

Figure: Part of a workflow to study the ecological niche of the horseshoe crab

Biodiversity Virtual e-Laboratory

BioVeL processing services and workflows

Page 7: Hannu Saarenmaa – University of Eastern Finland

Aim: Predictive modelling of biodiversity change

Available tools from a growing family of ENM workflows – released to the public at www.biovel.eu

1. Data Refinement Workflow (DRW) for pre-processing
– Taxonomic name resolution / occurrence retrieval
– Geo-temporal data selection using ‘BioSTIF’
– Data quality checks / filtering using ‘Google Refine’

2. Ecological Niche Modelling Workflow (ENM)
– Classic ENM with 15 algorithms
– Separate BioClim workflow (requires special inputs)

3. ENM Statistical Workflow (ESW) for post-processing
– DIFF: extent and intensity of change
– STACK: extent, intensity, and a cumulated potential
– SHIFT: shift of the centre of gravity (direction and length, in kilometres)

The analytical cycle: Data discovery → Data assembly, cleaning, and refinement → Ecological Niche Modeling → Statistical analysis
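
The three ESW post-processing operations (DIFF, STACK, SHIFT) can be sketched on toy grids. This is an illustrative reading of the slide, not the BioVeL implementation: the `Grid` type, the function names, and the `cell_km` cell size are assumptions, and real ENM outputs would be georeferenced rasters rather than nested lists.

```python
from math import hypot

# Assumed toy representation: rows x columns of suitability values in [0, 1].
Grid = list[list[float]]


def diff(before: Grid, after: Grid) -> Grid:
    """DIFF: cell-by-cell change in suitability (after minus before)."""
    return [[a - b for b, a in zip(rb, ra)] for rb, ra in zip(before, after)]


def stack(grids: list[Grid]) -> Grid:
    """STACK: cell-by-cell sum over several grids (cumulated potential)."""
    rows, cols = len(grids[0]), len(grids[0][0])
    return [[sum(g[r][c] for g in grids) for c in range(cols)] for r in range(rows)]


def centroid(grid: Grid) -> tuple[float, float]:
    """Suitability-weighted centre of gravity as (row, column)."""
    total = sum(sum(row) for row in grid)
    r = sum(i * v for i, row in enumerate(grid) for v in row) / total
    c = sum(j * v for row in grid for j, v in enumerate(row)) / total
    return r, c


def shift(before: Grid, after: Grid, cell_km: float = 1.0):
    """SHIFT: displacement of the centre of gravity and its length in km."""
    (r0, c0), (r1, c1) = centroid(before), centroid(after)
    d_row, d_col = r1 - r0, c1 - c0
    return d_row, d_col, hypot(d_row, d_col) * cell_km


# Made-up 2 x 2 grids: suitability moves from the top-left to the bottom-right cell.
before = [[1.0, 0.0], [0.0, 0.0]]
after = [[0.0, 0.0], [0.0, 1.0]]
change = diff(before, after)
combined = stack([before, after])
d_row, d_col, length_km = shift(before, after, cell_km=10.0)
```

The sketch treats grid cells as equally sized squares; on real latitude/longitude rasters the distance would need a proper geodesic calculation.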

Page 8: Hannu Saarenmaa – University of Eastern Finland

Page 9: Hannu Saarenmaa – University of Eastern Finland

Seamless exchange of data layers

http://openmodeller.cria.org.br/

Page 10: Hannu Saarenmaa – University of Eastern Finland

Use case: The spruce bark beetle, Ips typographus, disturbance of forest ecosystems

Map panels: Pre-2002, Year 2050, Difference

• Statistical processing of the difference in Finland indicates that the susceptibility of spruce forests to Ips typographus damage will increase five-fold by 2050.

• Policy advice: stricter forest hygiene through tougher legislation, so that Ips populations are kept at a minimum, because of the increased risk.

• Papers for Silva Fennica and for the INTECOL session proceedings in Journal of Ecology.

Page 11: Hannu Saarenmaa – University of Eastern Finland

Outline of the use case

• Running the Ecological Niche Modeling (ENM) workflow for a large number of species
– Process data points for hundreds of species (e.g. plants, butterflies, …)
– Use data mostly from GBIF, but also from elsewhere
– Each individual species may have on the order of 10^5 data points
– Run openModeller-based ENM for all the data points
– Choose predictive layers from WorldClim and GEOSS sources

• Generate summary statistics that can answer questions such as:
– How many species are increasing? How many are decreasing?
– Is the flora/fauna moving in any direction?
– Is the distribution fragmenting? Is it shrinking?
– How many populations are becoming marginalised?

– Prototype automatic data processing for computing the Essential Biodiversity Variables (EBVs)

EBVs?
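
One way to answer "how many species are increasing, how many are decreasing" is to compare the predicted suitable area of each species before and after. The sketch below is only an illustration under stated assumptions: the 5 % `tolerance` threshold, the function names, and the area figures are all made up, and the slides do not specify how the categories are defined.

```python
from collections import Counter


def classify(area_before: float, area_after: float, tolerance: float = 0.05) -> str:
    """Label a species by the relative change in its predicted suitable area."""
    change = (area_after - area_before) / area_before
    if change > tolerance:
        return "increasing"
    if change < -tolerance:
        return "decreasing"
    return "stable"


def summarise(areas: dict[str, tuple[float, float]]) -> Counter:
    """Count species per category from (area_before, area_after) pairs."""
    return Counter(classify(before, after) for before, after in areas.values())


# Made-up areas in arbitrary units, for illustration only.
example = {
    "species A": (100.0, 150.0),
    "species B": (200.0, 120.0),
    "species C": (80.0, 82.0),
}
summary = summarise(example)
```

Questions about direction of movement or fragmentation would need the spatial outputs themselves rather than a single area figure per species.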

Page 12: Hannu Saarenmaa – University of Eastern Finland

Status of the current BioVeL ENM workflow

• Current openModeller-based ENM workflows work at a small scale, focusing on one or a few selected species

• Current workflow requires frequent interaction with the user (many clicks if we simply multiply runs)

• We need a system that is scalable and automated to run ENM for hundreds of species

• We need a system that can perform a summary analysis across all the species based on the individual ENM runs

• The second-generation BioVeL portal will provide the required capabilities.

• It is to be released publicly in January 2014 (currently in beta).

Page 13: Hannu Saarenmaa – University of Eastern Finland

Envisaged application structure

Data sources: GBIF query, LTER query, EUMON query, . . .

Processing chain: Selected species + ENM parameter sets for species → ENM workflows (one per species, . . .) → ENM output files → Summary analysis

• Multiple species may use the same ENM parameter set (e.g. Mediterranean dryland plants).

• Parameter sets are generated and tested with another workflow (see next slide).

• Some species may need other offline data, or private data (uploaded from the user side).

• One ENM workflow predicts the impact of environmental changes on the distribution of one species.

• The portal offers the ENM output files for download.

• The summary analysis is performed with an R-based custom tool outside the portal.

• EBVs are produced by combining data from different models.
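
The structure above (one ENM run per species, shared parameter sets, one summary step at the end) can be sketched as a small driver function. Everything named here is hypothetical: `run_batch`, `fake_enm`, and the parameter-set contents are stand-ins, and a real `run_enm` would submit a workflow job to the portal instead of returning a string.

```python
from typing import Any, Callable


def run_batch(
    species_to_set: dict[str, str],
    parameter_sets: dict[str, dict[str, Any]],
    run_enm: Callable[[str, dict[str, Any]], Any],
    summarise: Callable[[dict[str, Any]], Any],
) -> tuple[dict[str, Any], Any]:
    """Run one ENM job per species, then one summary over all outputs."""
    outputs = {}
    for species, set_name in species_to_set.items():
        # Several species may point at the same shared parameter set.
        outputs[species] = run_enm(species, parameter_sets[set_name])
    return outputs, summarise(outputs)


def fake_enm(species: str, params: dict[str, Any]) -> str:
    """Stand-in for a real ENM run so that the sketch is executable."""
    return f"{species}:{params['algorithm']}"


# Made-up inputs: two species sharing one parameter set.
parameter_sets = {"dryland_plants": {"algorithm": "example_algorithm"}}
species_to_set = {"species A": "dryland_plants", "species B": "dryland_plants"}
outputs, n_runs = run_batch(species_to_set, parameter_sets, fake_enm, len)
```

Passing `run_enm` and `summarise` in as arguments keeps the driver independent of where the jobs actually run, which matches the slide's note that the summary analysis happens outside the portal.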

Page 14: Hannu Saarenmaa – University of Eastern Finland

ENM parameter optimisation workflow

Processing chain: Selected species + Parameter matrix (possible parameter combinations) → Parameter test and selection jobs (several, . . .) → ENM parameter sets for species

• The resulting ENM parameter sets are the optimal parameter input for the large ENM workflow (see previous slide).
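
A parameter matrix of possible combinations, followed by test-and-select, can be sketched as an exhaustive grid search. This is an assumption about how the selection could work, not a description of the BioVeL workflow: the parameter names, the values, and `toy_score` are invented, and a real score would come from evaluating a fitted model.

```python
from itertools import product
from typing import Any, Callable


def parameter_matrix(options: dict[str, list]) -> list[dict[str, Any]]:
    """All combinations of the listed parameter values."""
    names = list(options)
    return [dict(zip(names, values)) for values in product(*(options[n] for n in names))]


def select_best(options: dict[str, list], score: Callable[[dict[str, Any]], float]) -> dict[str, Any]:
    """Test every combination and keep the highest-scoring one."""
    return max(parameter_matrix(options), key=score)


def toy_score(params: dict[str, Any]) -> float:
    """Made-up score so that the sketch runs; not a real model evaluation."""
    return (params["algorithm"] == "alg_b") + params["layers"] / 10


# Hypothetical options: two algorithms and two layer counts give four combinations.
options = {"algorithm": ["alg_a", "alg_b"], "layers": [3, 5]}
matrix = parameter_matrix(options)
best = select_best(options, toy_score)
```

Each combination is independent of the others, so the individual tests could run as separate jobs, as the diagram's repeated job boxes suggest.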

Page 15: Hannu Saarenmaa – University of Eastern Finland

Initialising the data sweep on portal

Page 16: Hannu Saarenmaa – University of Eastern Finland

Results of data sweep, ready to be mapped, and statistically analysed

Page 17: Hannu Saarenmaa – University of Eastern Finland

Example product: Accumulated invasive potential for ecological groups

Example: Stack of combined macrozoobenthic invasion heatmaps

Slide by Matthias Obst, BioVeL

20 blacklisted species divided into 4 ecological regimes:
– Zoobenthos
– Phytobenthos
– Zoopelagial
– Phytopelagial

Page 18: Hannu Saarenmaa – University of Eastern Finland

QUESTIONS?

www.earthobservations.org/geobon.shtml

www.eubon.eu

www.biovel.eu