Post on 04-Jul-2020
Species distribution models: where are we heading?
Munemitsu AKASAKA
NIES postdoctoral fellow
National Institute for Environmental Studies
First ASIAHORCs Joint Symposium: Topics in current trends
Predicted occurrence
What are species distribution models?
-Empirical models that relate field observations of a species to environmental predictors, based on statistically or theoretically derived response surfaces. (Guisan & Zimmermann 2000)
http://ja.wikipedia.org/
Observation
+Environmental data
Distribution model
Why model species distributions?
Y = f(aX1 + bX2 + cX3 … + d)
Prediction
•To identify the core area for conservation
•To predict the response of species to environmental change
Inference
•To understand the environmental factors responsible for the distribution
An example of species distribution modeling
Syartinilia & Tsuyuki (2008), Biological Conservation 141: 756-769
flickr.com
logit(Pr) = 0.23*slope - 0.01*elevation + 0.02*NDVI + 19.80*AUTOCOV - 6.86
[Figure: map of predicted probability of occurrence (Prob.)]
Target: Javan Hawk-Eagle, Spizaetus bartelsi (Threatened)
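The fitted equation on this slide can be turned into a probability by applying the inverse logit. The sketch below uses the coefficients exactly as printed; the predictor values for the example cell are hypothetical and their units are assumed, since the slide does not state them.

```python
import math

# Coefficients as printed on the slide (Syartinilia & Tsuyuki 2008).
COEF = {"slope": 0.23, "elevation": -0.01, "ndvi": 0.02, "autocov": 19.80}
INTERCEPT = -6.86


def occurrence_probability(cell):
    """Inverse logit of the linear predictor for one grid cell."""
    eta = INTERCEPT + sum(COEF[name] * cell[name] for name in COEF)
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical cell values, chosen only to illustrate the arithmetic.
cell = {"slope": 30.0, "elevation": 1200.0, "ndvi": 50.0, "autocov": 0.5}
p = occurrence_probability(cell)
print(round(p, 3))
```

Mapping this function over every grid cell would give a probability surface like the one shown on the slide.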
Purpose of this talk:
This talk will…
•Give an overview of recently developed distribution modeling methods from an end user's perspective
•Give an idea of how distribution modeling contributes to the evaluation and conservation of biodiversity
Recent developments have overcome several limitations of the conventional methods!
Types of data that would bother ecologists/biodiversity managers when building generalized linear models:
•Complex non-linear response of species
•Presence-only (occurrence) data
•Data with spatial autocorrelation
•Non-equilibrium distribution of species
•Interaction among predictor variables
•Observer bias
•Count data with lots of zeros …
Why is a complex non-linear response troublesome?
Y = -0.8x + 13.5
Y = -0.1x^2 + 0.6x + 11.2
Y = 0.001x^3 - 0.2x^2 + 0.7x + 11.2
・・・
OLS or GLM cannot represent the relationship properly!!
GAM, CART (BRT, BCT)
[Figure: species abundance (y-axis) vs. soil moisture (x-axis), annotated with Y = 1.9x + 3.4]
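A small numerical illustration of why a straight line can miss a non-linear response. This is a sketch on made-up data: the hump-shaped abundance curve and the helper names `ols_line` and `r_squared` are assumptions for illustration, not taken from the talk.

```python
def ols_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def r_squared(ys, fitted):
    """Share of the variance in ys explained by the fitted values."""
    my = sum(ys) / len(ys)
    ss_tot = sum((y - my) ** 2 for y in ys)
    ss_res = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    return 1.0 - ss_res / ss_tot


# Hypothetical response: abundance peaks at intermediate soil moisture.
moisture = [i * 0.5 for i in range(21)]          # 0.0, 0.5, ..., 10.0
abundance = [25.0 - (x - 5.0) ** 2 for x in moisture]

slope, intercept = ols_line(moisture, abundance)
r2 = r_squared(abundance, [intercept + slope * x for x in moisture])
print(slope, r2)
```

On this symmetric hump the fitted slope and the R-squared are both essentially zero, even though abundance depends entirely on moisture.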
Coping with complex non-linear response: GAM
GAM: Generalized Additive Model
Y = Σ f(x) + c
•Smoothing by non-linear functions (e.g. spline, lowess)
Strength:
•Capable of fitting complex responses
Limitation:
•Sensitive to outliers
[Figure: abundance of fish (y-axis) vs. water temperature (x-axis)]
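The additive form Y = Σ f(x) + c can be sketched with the classic backfitting loop: each smooth term is refitted in turn to the residuals left by the others. The example below is a simplified illustration, assuming a Gaussian kernel smoother in place of splines or lowess, two made-up predictors (`temp`, `depth`), and simulated data. It is not how the R packages `gam` or `mgcv` are implemented.

```python
import math
import random


def kernel_smooth(x, r, bandwidth):
    """Nadaraya-Watson smoother: Gaussian-weighted mean of r around each x."""
    out = []
    for xi in x:
        w = [math.exp(-0.5 * ((xi - xj) / bandwidth) ** 2) for xj in x]
        out.append(sum(wj * rj for wj, rj in zip(w, r)) / sum(w))
    return out


def fit_additive(columns, y, bandwidth=0.5, n_iter=10):
    """Backfitting for y = c + f1(x1) + f2(x2) + ...; columns holds one list per predictor."""
    n, k = len(y), len(columns)
    c = sum(y) / n
    f = [[0.0] * n for _ in range(k)]
    for _ in range(n_iter):
        for j, col in enumerate(columns):
            # Partial residual: remove the intercept and every other smooth term.
            partial = [y[i] - c - sum(f[m][i] for m in range(k) if m != j)
                       for i in range(n)]
            fj = kernel_smooth(col, partial, bandwidth)
            mean_fj = sum(fj) / n
            f[j] = [v - mean_fj for v in fj]   # centre each term so c stays identifiable
    fitted = [c + sum(f[m][i] for m in range(k)) for i in range(n)]
    return c, f, fitted


# Simulated data (hypothetical): non-linear effects of two predictors plus noise.
rng = random.Random(0)
temp = [rng.uniform(0, 10) for _ in range(100)]
depth = [rng.uniform(0, 5) for _ in range(100)]
abund = [10 + 3 * math.sin(t) - (d - 2.5) ** 2 + rng.gauss(0, 0.3)
         for t, d in zip(temp, depth)]

c, f, fitted = fit_additive([temp, depth], abund)
```

Plotting `f[0]` against `temp` would show the estimated smooth response, comparable in spirit to the curves in the example that follows.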
Example on implementation of GAM
Fukushima et al. 2007, Freshwater Biology 52: 1511-1524
[Figure: logit(Prob. Occur.) (y-axis) vs. years since dam construction (x-axis); panels: Masu salmon, Whitespotted char]
•Used a compiled database: approx. 8000 fish surveys
•Examined the influence of dam construction on the occurrence of 41 fish taxa
•Non-linear relationship between fish occurrence and years after dam construction
Predicted impact of dam construction
Coping with complex non-linear response: CART
CART: Classification And Regression Tree
•Creates split nodes to maximize the reduction in impurity
Strength:
•Visually understandable output
•Robust to outliers
•Inherent incorporation of interactions among the predictors
Limitation:
•Requires a sufficient amount of data

Name                  Type of response variable
Classification Tree   Categorical
Regression Tree       Numerical

(Boosted Regression Tree: BRT; Boosted Classification Tree: BCT)
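The core step of a regression tree, choosing the split that most reduces impurity, can be sketched as follows. This is a minimal illustration of a single split on one predictor, assuming sum of squared deviations as the impurity measure; the function names and the toy data are hypothetical.

```python
def impurity(ys):
    """Node impurity for a regression tree: sum of squared deviations from the mean."""
    if not ys:
        return 0.0
    m = sum(ys) / len(ys)
    return sum((y - m) ** 2 for y in ys)


def best_split(xs, ys):
    """Return (threshold, impurity_reduction) for the best single split on x."""
    parent = impurity(ys)
    pairs = sorted(zip(xs, ys))
    best = (None, 0.0)
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate identical x values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        gain = parent - impurity(left) - impurity(right)
        if gain > best[1]:
            best = ((pairs[i - 1][0] + pairs[i][0]) / 2, gain)
    return best


# Toy data with an abrupt jump between x = 4 and x = 5.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.0, 2.1, 1.9, 2.0, 8.0, 8.2, 7.9, 8.1]
threshold, gain = best_split(x, y)
print(threshold)
```

A full tree applies this search recursively to each child node and across all predictors, which is how interactions among predictors enter without being specified in advance.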
Data to be modeled:
•complex non-linear response of species
•presence-only (occurrence) data
•data with spatial autocorrelation
•non-equilibrium distribution of species
Presence-only data: a type of data that lacks information on where the species is absent.
Presence-only data are often found …
•Occurrence of highly mobile animals/insects
•Opportunistically collected data
•Museum or herbarium records
…
Absences cannot be inferred with certainty!
- Recursively sample the absence from background: MAXENT, GARP
- Use presence data only: BIOCLIM, DOMAIN, LIVES
For methods requiring absence
[Figure: map showing occurrence records, pseudo-absences (selected randomly), and cells with no data]
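Random selection of pseudo-absences can be sketched as drawing cells from the part of the study area that has no occurrence record. The grid size, the coordinates, and the function name `sample_pseudo_absences` are assumptions made up for this illustration.

```python
import random


def sample_pseudo_absences(all_cells, presence_cells, n, seed=0):
    """Draw n distinct cells at random from those without an occurrence record."""
    presence = set(presence_cells)
    candidates = [cell for cell in all_cells if cell not in presence]
    return random.Random(seed).sample(candidates, n)


# Hypothetical 10 x 10 study grid with three occurrence records.
grid = [(row, col) for row in range(10) for col in range(10)]
presences = [(2, 3), (2, 4), (7, 7)]
pseudo_absences = sample_pseudo_absences(grid, presences, n=20)
```

The presences and pseudo-absences together can then be fed to a method that needs a 0/1 response, such as a GLM or GAM. Because a pseudo-absence is not a confirmed absence, the fitted probabilities are best read as relative suitability.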
Evaluation of predictive performance on presence-only data
Wisz et al. 2008 Diversity and distributions 14: 763-773 Fig.3 (modified)
•Data: non-systematically sampled presence-only data; 41 species from 5 regions (birds, small vertebrates, and plants)
[Figure: predictive performance (median AUC; axis from 0.5 to 0.7, low to high) vs. sample size (100, 30, 10) for each method]

Method    Principle*
GBM       -p
MaxEnt    -b
GLM       -p
GAM       -p
GARP      -b
DOMAIN    -o
BIOCLIM   -o
LIVES     -o
*-p: pseudo-absence; -b: background sample; -o: presence only
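AUC, the performance measure used in this comparison, can be read as the probability that a randomly chosen presence site gets a higher predicted score than a randomly chosen absence site. A minimal sketch of that definition, with made-up scores:

```python
def auc(scores_presence, scores_absence):
    """Share of presence/absence pairs in which the presence site scores higher (ties count 0.5)."""
    wins = 0.0
    for p in scores_presence:
        for a in scores_absence:
            if p > a:
                wins += 1.0
            elif p == a:
                wins += 0.5
    return wins / (len(scores_presence) * len(scores_absence))


# Hypothetical predicted scores at evaluation sites.
pres = [0.9, 0.8, 0.6, 0.4]
absn = [0.7, 0.3, 0.2, 0.1]
value = auc(pres, absn)
print(value)  # 0.875
```

A value of 0.5 corresponds to a model that ranks sites no better than chance, and 1.0 to perfect ranking.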
Coping with presence-only data: MaxEnt
MaxEnt: Maximum Entropy modeling
•Recursively sample absence data from background
Strength:
•Capable of handling presence-only data
•Can model complex non-linear responses
•Capable of good prediction from small sample size
Limitation:
•Requires high computational power
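The maximum entropy idea can be sketched in a few lines: over the background cells, find the distribution of the form p(cell) ∝ exp(λ · features) whose average feature values match those observed at the presence sites. The example below is a deliberately stripped-down illustration, assuming one linear feature scaled to [0, 1], plain gradient ascent, and no regularization. It is not the algorithm implemented in the MaxEnt software, and all names and data are hypothetical.

```python
import math


def gibbs_distribution(background, lam):
    """p(cell) proportional to exp(lam . features), normalised over background cells."""
    w = [math.exp(sum(l * x for l, x in zip(lam, f))) for f in background]
    z = sum(w)
    return [wi / z for wi in w]


def fit_maxent(background, presence_idx, n_iter=500, lr=1.0):
    """Gradient ascent on the presence log-likelihood (features assumed scaled to [0, 1])."""
    k = len(background[0])
    lam = [0.0] * k
    # Empirical feature means at the presence cells are the constraints to match.
    target = [sum(background[i][j] for i in presence_idx) / len(presence_idx)
              for j in range(k)]
    for _ in range(n_iter):
        p = gibbs_distribution(background, lam)
        expected = [sum(pi * f[j] for pi, f in zip(p, background)) for j in range(k)]
        lam = [l + lr * (t - e) for l, t, e in zip(lam, target, expected)]
    return lam


# Hypothetical background: 11 cells along one environmental gradient.
values = [i / 10 for i in range(11)]
background = [[v] for v in values]
presence_idx = [7, 8, 9, 10]          # records only at the high end of the gradient

lam = fit_maxent(background, presence_idx)
p = gibbs_distribution(background, lam)
```

After fitting, the model's expected feature value matches the mean at the presence cells (0.85 here), and the relative suitability `p` increases along the gradient.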
Example on implementation of MaxEnt
Study area: Hakone (National Park)
Target: Rudbeckia laciniata (invasive)
*Invaded Hakone ca. 2000.
Akasaka & Osawa (unpubl.)
To exterminate R. laciniata: what landscape factors are related to its distribution?
Problem
Because this species has only recently invaded our study area, we had only 32 observational records….
Known occurrence: 32 sites
Predictors (map panels): road density, urban area density, solar radiation, TWI
[Figure: map of predicted Prob. Occur. for R. laciniata in Hakone, from 32 known occurrence sites and the four predictors]
AUC=0.88
Example on implementation of MaxEnt
Kadoya et al. 2009, Biological Conservation 142: 1011-1017
Target: Buff-tailed bumblebee, Bombus terrestris (invasive)
Species occurrence: volunteer-gathered data (present / no data)
+
•Proportion of woodland
•Water channel length
•Tomato production
Potential distribution
[Figure: map of Prob. Occur.; scale bar 0-200 km]
AUC=0.813
Spatial autocorrelation?
-nearby sites are similar
[Figure: presence/absence maps; panels: spatially autocorrelated, spatially random]
Caused by…
•Distance-related biological processes (e.g. dispersal, facilitation)
•Unmeasured important environmental factors
•etc.
Data points cannot be regarded as mutually independent!
Models correcting for spatial autocorrelation (CAR, SAR…)
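One common way to quantify whether nearby sites are similar is Moran's I, which the slides do not name but which fits the idea shown here. The sketch below assumes binary rook (edge-sharing) neighbour weights on a small made-up grid. It contrasts a clustered pattern with a checkerboard, the dispersed extreme.

```python
def rook_neighbors(nrow, ncol):
    """Map each grid cell to the cells sharing an edge with it."""
    nb = {}
    for r in range(nrow):
        for c in range(ncol):
            cand = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            nb[(r, c)] = [(i, j) for i, j in cand if 0 <= i < nrow and 0 <= j < ncol]
    return nb


def morans_i(values, neighbors):
    """Moran's I with binary weights; values maps cell -> observed value."""
    n = len(values)
    mean = sum(values.values()) / n
    dev = {cell: v - mean for cell, v in values.items()}
    num = sum(dev[i] * dev[j] for i in values for j in neighbors[i])
    w_sum = sum(len(neighbors[i]) for i in values)
    den = sum(d * d for d in dev.values())
    return (n / w_sum) * (num / den)


nb = rook_neighbors(4, 4)
# Hypothetical presence (1) / absence (0) maps on a 4 x 4 grid.
clustered = {(r, c): 1 if c < 2 else 0 for r in range(4) for c in range(4)}
checkerboard = {(r, c): (r + c) % 2 for r in range(4) for c in range(4)}

i_clustered = morans_i(clustered, nb)
i_checker = morans_i(checkerboard, nb)
print(i_clustered, i_checker)
```

Positive values indicate that neighbours tend to be alike, values near zero indicate a spatially random pattern, and negative values indicate that neighbours tend to differ.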
Conditional autoregressive models (CAR)
Y = f(Xβ + ρW(Y - Xβ) + ε)
X: explanatory variables; β: coefficients; ε: error
ρW(Y - Xβ): spatial random effect, composed of the information on the response variable in the neighboring cells.
•Relatively widely used in the field of ecology
Dormann et al. 2007 Ecography
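To make the spatial term ρW(Y - Xβ) concrete: with a row-standardised neighbour matrix W, the term for each cell is ρ times the mean residual of its neighbours. The sketch below only evaluates that term for given values and does not estimate ρ or β. The row standardisation, the five-cell transect, and all numbers are assumptions for illustration.

```python
def car_spatial_term(y, xb, neighbors, rho):
    """rho * W * (y - xb) with row-standardised W: rho times the mean neighbour residual."""
    resid = [yi - mi for yi, mi in zip(y, xb)]
    return [rho * sum(resid[j] for j in nb) / len(nb) for nb in neighbors]


# Hypothetical five-cell transect: each cell neighbours the adjacent cells.
y = [1.0, 1.2, 0.9, 0.1, 0.0]        # observed response
xb = [0.5] * 5                        # trend predicted from the covariates alone
neighbors = [[1], [0, 2], [1, 3], [2, 4], [3]]

term = car_spatial_term(y, xb, neighbors, rho=0.8)
print([round(t, 2) for t in term])
```

Cells surrounded by neighbours that sit above the covariate-only prediction get a positive adjustment, and cells surrounded by neighbours below it get a negative one.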
Example on implementation of CAR
Ishihama, Takeda, Oguma, & Takenaka (unpubl.)
Target: adder's tongue fern, Ophioglossum namegatae (endangered)
Distribution data: present / absent
+
Predictor variables:
•Max. grass height
•Relative elevation
•Pixel value of air photos
[Figure: maps of prob. of occurrence (low to high); panels: Logistic (DIC = 799), CAR (DIC = 51)]
From static to dynamic modeling
Most species distribution models include only abiotic environments as predictors.
Such models implicitly assume…
•Abiotic factors are the primary determinants
•The species has (nearly) reached equilibrium
× e.g. invasive species, species responding to changing environments
If violated…
•Low predictive power
•Poor characterization of the environmental response
Include a variable related to dispersal as a predictor
More detailed process-based model
Example of detailed process-based distribution model
Fukasawa et al. (in press), Ecological Research
Target: Bishop wood, Bischofia javanica (invasive)
Distribution: 1977 (dispersal source), 2003 (current distribution)
Abiotic environments:
•Elevation
•Summit plane elevation
•Slope
•Curvature
•Watershed area
•Skyline
Simultaneous model
Pr(y = 1) = p × q = 1 / (1 + exp(-(α + β1X1 + … + βnXn))) × exp(-γZ)
p (logistic regression model): evaluates habitat suitability
q (colonization kernel): evaluates seed dispersal
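The simultaneous model multiplies a habitat-suitability probability by a colonization kernel. The sketch below only evaluates that product for given parameters and does not fit the model. It assumes Z is a distance from the dispersal source, which the slide does not state, and all parameter values are hypothetical.

```python
import math


def habitat_suitability(x, alpha, betas):
    """Logistic part p: 1 / (1 + exp(-(alpha + sum(beta_i * x_i))))."""
    eta = alpha + sum(b * xi for b, xi in zip(betas, x))
    return 1.0 / (1.0 + math.exp(-eta))


def colonization_kernel(z, gamma):
    """Dispersal part q: exp(-gamma * z); z is assumed to be distance from the source."""
    return math.exp(-gamma * z)


def occurrence_probability(x, z, alpha, betas, gamma):
    """Pr(y = 1) = p * q."""
    return habitat_suitability(x, alpha, betas) * colonization_kernel(z, gamma)


# Hypothetical parameters and two cells with the same habitat but different distances.
alpha, betas, gamma = 0.0, [1.0], 0.5
near = occurrence_probability([0.0], 0.0, alpha, betas, gamma)
far = occurrence_probability([0.0], 2.0, alpha, betas, gamma)
print(near, round(far, 4))
```

Two cells with identical habitat get different occurrence probabilities once dispersal limitation is included, which a purely abiotic model would not capture.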
[Figure: predicted distribution of Bischofia javanica; panels: Logistic model, Colonization kernel, Simultaneous model]
Summary
•Recently developed methods and approaches can deal with the limitations of OLS and GLM

Character of data                          Modeling method & approach
Complex non-linear response of species     GAM, CART (BRT, BCT)
Presence-only data                         MaxEnt …
Spatially autocorrelated data              Models correcting for spatial autocorrelation (e.g. CAR)
Non-equilibrium distribution               Detailed process-based modeling
Application of the introduced models
Modeling method                              R library      Other free application
GLM: generalized linear models               base
GAM: generalized additive models             gam, mgcv
CART: classification and regression trees    tree, rpart
GBM: generalized boosted models              gbm
MaxEnt: maximum entropy model                -              MaxEnt software
CAR: conditional autoregressive models       spdep          WinBUGS
R: a freely available language and environment for statistical computing and graphics
http://cran.r-project.org/
Toward evaluation and conservation of biodiversity
1. Abundance and species richness can also be modeled by similar modeling methods
2. To develop a sensible model, the characteristics of the distribution data, the sample size, and the ecology of the target organism should be considered
3. Distribution models have an increasing role in biodiversity conservation
Process-based model
Let’s build distribution models together!!
Acknowledgements
Dr. Taku KADOYA
Dr. Fumiko ISHIHAMA
Mr. Keita FUKASAWA
Mr. Takeshi OSAWA
Dr. Akio TAKENAKA
Mr. Ehab Salah
JSPS
Organizing committee of ASIAHORCs Joint Symposium
Thank you for listening!