1
ResistanceGA: An R package for the optimization of resistance surfaces using genetic 1
algorithms 2
3
Running Title: Landscape resistance optimization 4
5
Author: William E. Peterman 6
7
Address: Illinois Natural History Survey, Prairie Research Institute, University of Illinois, 1816 8
S. Oak Street, Champaign, IL 61820 USA 9
10
Email: [email protected]; Phone: 217.244.8191 11
12
Word count: 3,000 13
14
.CC-BY-NC 4.0 International licenseacertified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under
The copyright holder for this preprint (which was notthis version posted July 29, 2014. ; https://doi.org/10.1101/007575doi: bioRxiv preprint
2
Summary 15
1. Understanding how landscape features affect functional connectivity among populations 16
is a cornerstone of landscape genetic analyses. However, parameterization of resistance 17
surfaces that best describe connectivity is largely a subjective process that explores a 18
limited parameter space. 19
2. ResistanceGA is a new R package that utilizes a genetic algorithm to optimize 20
resistance surfaces based on pairwise genetic distances and either CIRCUITSCAPE 21
resistance distances or cost distances calculated along least cost paths. Functions in this 22
package allow for the optimization of both categorical and continuous resistance surfaces, 23
as well as the simultaneous optimization of multiple resistance surfaces. 24
3. There is considerable controversy concerning the use of Mantel tests to accurately relate 25
pairwise genetic distances with resistance distances. Optimization in ResistanceGA 26
uses linear mixed effects models with the maximum likelihood population effects 27
parameterization to determine AICc, which is the fitness function for the genetic 28
algorithm. 29
4. ResistanceGA fills a void in the landscape genetic toolbox, allowing for unbiased 30
optimization of resistance surfaces and for the simultaneous optimization of multiple 31
resistance surfaces to create a novel composite resistance surface. 32
33
Key-words: CIRCUITSCAPE, circuit theory, cost distance, gene flow, genetic algorithm, 34
landscape genetics, least cost path, resistance distance, resistance optimization, resistance surface 35
.CC-BY-NC 4.0 International licenseacertified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under
The copyright holder for this preprint (which was notthis version posted July 29, 2014. ; https://doi.org/10.1101/007575doi: bioRxiv preprint
3
Introduction 36
First coined in 2003, landscape genetics has experienced rapid growth in both the number of 37
studies and range of analytical methods utilized (Manel et al. 2003; Storfer et al. 2010). This 38
integrative field draws on landscape ecology, spatial statistics, and population genetics to address 39
a wide range of questions. One of the primary goals of landscape genetic studies has been to 40
understand how landscape features affect spatial genetic structure (Manel et al. 2003; Storfer et 41
al. 2007). Rather than simply assessing isolation-by-distance, many landscape genetic studies 42
quantify the effective distance (i.e. resistance distance) between spatially distinct sample 43
locations as a function of the landscape matrix (Spear et al. 2005; McRae 2006). In the absence 44
of direct observation of movement or dispersal across the landscape, resistance distances are 45
often interpreted as functional connectivity (e.g., Cushman et al. 2006). Critically, to infer 46
functional connectivity and resistance distance, an appropriate resistance surface must be 47
parameterized. As defined by Spear et al. (2010), a resistance surface is a spatial layer that 48
assigns a value to each landscape or environmental feature, with values representing the extent to 49
which that feature impedes or facilitates connectivity for an organism. 50
Resistance values of surfaces have been determined using a variety of methods, 51
including: habitat suitability models (e.g., Wang et al. 2008), telemetry (e.g., Driezen et al. 52
2007), and most commonly expert opinion (Murray et al. 2009). Less often, parameterization is 53
informed by empirical movement studies (e.g., Stevens et al. 2006) or spatial prediction of 54
ecological processes (Peterman et al. 2014). All of these are acceptable approaches, but each 55
comes with its own caveats. Of particular concern is the fact that expert opinion often fails to 56
accurately describe the biological or ecological process(es) being modeled (Shirk et al. 2010; 57
Charney 2012). Even when biological or ecological processes are known and explicitly modeled, 58
.CC-BY-NC 4.0 International licenseacertified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under
The copyright holder for this preprint (which was notthis version posted July 29, 2014. ; https://doi.org/10.1101/007575doi: bioRxiv preprint
4
there is no guarantee that these processes will relate meaningfully to the movement of genes 59
across the landscape (Peterman et al. 2014). 60
Ultimately, the assignment and evaluation of resistance values generally covers only a 61
limited parameter space and remains a trial and error process to determine the best resistance 62
parameters. Graves, Beier and Royle (2013) attempted to alleviate the subjectivity of resistance 63
surface parameterization by using a search algorithm to maximize Mantel r correlation between 64
inter-individual genetic distance and least cost path distance. While their optimization procedures 65
recovered the maximum Mantel r when it existed, they found that resistance estimates were often 66
imprecise and much smaller than simulated resistance values. They also found that response 67
surfaces were quite flat, making identification of a global optimum difficult. In contrast, 68
Peterman et al. (2014) found well-defined global optima when using Ricker and monomolecular 69
data transformations in combination with optimization algorithms. However, the optimization 70
procedure utilized by Peterman et al. (2014) is limited to continuous resistance surfaces (e.g., 71
temperature, moisture, percent canopy cover) and requires an inefficient search of all possible 72
data transformations. 73
There are numerous challenges to optimizing resistance surfaces based on pairwise 74
(genetic) distance data. Among these challenges is the fact that resistance surfaces can have a 75
high dimensionality, as in land use, land cover surfaces. Another issue is that there currently is 76
no closed-form expression to determine the landscape resistance values that describe pairwise 77
distances, potentially making optimization intractable with gradient-based algorithms. Finally, 78
landscape features and environmental gradients do not exist in isolation. Therefore, an ideal 79
solution to resistance surface optimization will be to simultaneously optimize multiple resistance 80
surfaces to create a composite resistance surface. The ResistanceGA package for the R 81
.CC-BY-NC 4.0 International licenseacertified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under
The copyright holder for this preprint (which was notthis version posted July 29, 2014. ; https://doi.org/10.1101/007575doi: bioRxiv preprint
5
programming environment (R Core Team 2014) has been developed to address these issues, and 82
to fill a void in the landscape genetic toolbox. While the initial impetus for this package was 83
landscape genetic analyses, but any pairwise measures across the landscape (e.g., movement 84
rates) could be utilized to optimize resistance surfaces. 85
86
Description 87
Genetic algorithms 88
Genetic algorithms (GAs) represent a powerful and flexible stochastic optimization framework 89
for finding solutions to both discrete and continuous optimization problems (Holland 1975). 90
Inspired by biological principles, genetic algorithms create a population of individuals 91
(offspring) with traits (parameters to be optimized) encoded on “chromosomes”. The genotypes 92
(parameter combinations) of each individual solve the fitness function, and the fittest individuals 93
from each generation survive to reproduce (Sivanandam & Deepa 2007). The GA evolution 94
process is facilitated by exploration and exploitation (Scrucca 2013). Exploration of parameter 95
space occurs through random generation of new parameter values resulting from mutation, as 96
well as exchange of genetic information through crossover. Exploitation ultimately reduces 97
diversity in the population by selecting the fittest individuals each generation. The population 98
continues to evolve until a sufficient number of generations have passed without an improvement 99
in fitness (Scrucca 2013). 100
Resistance optimization 101
ResistanceGA utilizes the general-purpose genetic algorithm from the GA R package (ga 102
function; Scrucca 2013). Briefly, the optimization proceeds as follows: 103
.CC-BY-NC 4.0 International licenseacertified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under
The copyright holder for this preprint (which was notthis version posted July 29, 2014. ; https://doi.org/10.1101/007575doi: bioRxiv preprint
6
1) The original raster surface is imported into R. If the surface is continuous, it is rescaled to 104
range of 0–10, preserving the relative spacing between all levels. 105
2) The evolution process starts by generating a random initial population of size n. If a 106
continuous surface, the selected parameters determine (a) which of eight transformations 107
will be applied (Fig. 1); (b) the shape of the transformation; (c) the maximum value of the 108
transformation. If a categorical surface, each level of the resistance surface is reclassified 109
to the values of the parameters. 110
3) Using the spatial locations where genetic samples have been collected, either 111
CIRCUITSCAPE (McRae et al. 2008; McRae & Shah 2009) is called from R to calculate 112
pairwise resistance distances across the landscape created in step 2 above, or cost 113
distances are calculated from least cost paths using the R package gdistance (van 114
Etten 2014). 115
4) A linear mixed effects model with a maximum likelihood population effects 116
parameterization (MLPE) is fit to the data. Pairwise genetic distance is the response and 117
the scaled and centered pairwise resistance distance is the predictor. The MLPE mixed 118
effects parameterization is used to account for the non-independence among the pairwise 119
data (Clarke, Rothery & Raybould 2002). 120
5) The Akaike information criterion (AIC) is obtained from the fitted MLPE model. AICc is 121
then calculated by penalizing the model for the number of parameters used during 122
transformation/reclassification. AICc is the fitness measure that the genetic algorithm 123
seeks to maximize. Because the ga function works through maximization, the sign of 124
AICc is reversed. 125
.CC-BY-NC 4.0 International licenseacertified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under
The copyright holder for this preprint (which was notthis version posted July 29, 2014. ; https://doi.org/10.1101/007575doi: bioRxiv preprint
7
6) Steps 2–5 are repeated until the specified number of n individuals have been created. The 126
genetic algorithm then conducts selection on the population, and the individuals with the 127
best AICc are carried over to the next generation to form the reproducing population. A 128
new population is then formed through mutation and crossover. 129
7) Steps 2–6 are repeated until the specified number of generations have passed without 130
improvement to the AICc. 131
132
Continuous surfaces 133
There are eight transformations that can be applied to continuous surfaces (Fig. 1). Each 134
transformation relies on either the Ricker or monomolecular functions (Bolker 2008), as well as 135
rescaling functions to keep values in positive parameter space . The shape and magnitude of each 136
transformation are controlled by two parameters, which are adjusted during optimization 137
(Peterman et al. 2014). During optimization, the genetic algorithm searches different 138
combinations of transformations, scale parameters, and shape parameters. Linear transformations 139
are not explicitly included, but all of the monomolecular functions become linear as the shape 140
parameter increases in value. In this way, linear responses can effectively be modeled without 141
increasing the number of transformations for the genetic algorithm to test. 142
Categorical surfaces 143
Categorical or feature surfaces, such as land cover or roads, can also be optimized using 144
ResistanceGA. A surface is considered categorical in ResistanceGA if it contains 15 or fewer 145
levels. To make this process tractable, it is necessary to hold the value of one level constant 146
throughout optimization. Because pairwise resistance values are relative, failing to hold one level 147
.CC-BY-NC 4.0 International licenseacertified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under
The copyright holder for this preprint (which was notthis version posted July 29, 2014. ; https://doi.org/10.1101/007575doi: bioRxiv preprint
8
constant can result in multiple equivalent solutions, and the algorithm may fail to reach an 148
optimal solution (e.g., relative resistance values of 1, 5, and 10 are equivalent to 2, 10, and 20). 149
Combining resistance surfaces 150
Resistance surfaces are simultaneously optimized by sequentially modifying each surface to be 151
optimized. Continuous and categorical surfaces are each modified as described above, then all 152
modified surfaces are summed together to create a single, composite resistance surface, which is 153
then used to calculate pairwise resistance distances or cost distances. 154
155
Overview of ResistanceGA functions 156
The optimization functions passed to ga rely heavily on the R package raster (Hijmans 2014) 157
to import, export and manipulate spatial raster (.asc) files, as well as lme4 (Bates et al. 2014) to 158
fit mixed effects models. To visualize continuous surface transformations, ResistanceGA uses 159
the graphing features of ggplot2 (Wickham 2009). The functions available in ResistanceGA 160
are summarized in Table 1. 161
162
Implementation 163
Resistance surfaces can be optimized using cost distances (least cost path) or using circuit-based 164
resistance distances. Cost distances are calculated using the R package gdistance (van Etten 165
2014), while electrical current resistances are calculated using the free, open-source software 166
CIRCUITSCAPE (v4.0 or greater; McRae & Shah 2009). If using CIRCUITSCAPE, input data 167
formats must conform to those of CIRCUITSCAPE (McRae & Shah 2009). Currently, 168
CIRCUITSCAPE can only be executed from R through ResistanceGA on computers using 169
Windows operating systems. Prior to running any of the optimization functions (MS_optim, 170
.CC-BY-NC 4.0 International licenseacertified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under
The copyright holder for this preprint (which was notthis version posted July 29, 2014. ; https://doi.org/10.1101/007575doi: bioRxiv preprint
9
SS_optim, Resistance.Opt_multi or Resistance.Opt_single) the 171
CS.prep/gdist.prep, and GA.prep preparation functions must be run. These functions create 172
and format the inputs and data objects necessary to run CIRCUITSCAPE/gdistance, 173
parameterize the genetic algorithm, and fit MLPE mixed effects models. These functions have 174
been developed to require a minimum input from the user. For instance, at minimum, all that 175
needs to be specified to run GA.prep is the directory where the .asc files to be optimized reside. 176
However, all available arguments of the ga function can be set by the user to modify the genetic 177
algorithm. The arguments and default settings of the preparation functions are described in Table 178
2. 179
There is no established framework for optimization of resistance surfaces, but an 180
envisioned work flow is detailed in Figure 2. Genetic algorithms are stochastic optimization 181
procedures; therefore it is highly advised to run all optimizations at least twice to confirm 182
convergence and parameter estimates. Also, because boundaries are placed on the parameter 183
space searched, if the optimized resistance values are at or near the limits set, the optimization 184
should be rerun after expanding the search space. It is important to note that simultaneous 185
optimization of multiple surfaces often results in parameter values that differ from truth. 186
However, the relative resistance values of the resultant resistance surface are highly or perfectly 187
correlated with truth (see worked example), and the resistance surfaces are identical once they 188
are rescaled to a minimum resistance of one. Users of ResistanceGA should be aware of this 189
fact, and interpret results accordingly. 190
Genetic algorithms are effective at finding an optimal solution, but the can be 191
computationally intensive. To ensure that parameter space is adequately searched, the population 192
of individuals produced each generation must be of sufficient size. In ResistanceGA the default 193
.CC-BY-NC 4.0 International licenseacertified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under
The copyright holder for this preprint (which was notthis version posted July 29, 2014. ; https://doi.org/10.1101/007575doi: bioRxiv preprint
10
setting is to produce a population that is 15 times the number of parameters being optimized, up 194
to a maximum population size of 100 individuals. For example, when a continuous surface is 195
being optimized, 45 individuals (3 parameters x 15) will be produced each generation, and a 196
typical optimization takes 50–300 generations. This results in the creation of 2 250–13 500 197
resistance surfaces and CIRCUITSCAPE/gdistance runs. Therefore, the greatest impediment 198
to optimizing resistance surfaces is time. Both the spatial extent and the number of sample 199
locations in the analysis affect the time necessary to calculate pairwise resistance or cost 200
distances. On a computer with an Intel i7 3.4 GHz processor, 25 sample locations distributed 201
across a 250 x 250 cell resistance surface takes ~5 seconds to complete an iteration with 202
CIRCUITSCAPE. The same surface with 100 sample locations takes ~14 seconds. Finally, 25 203
sample locations on and a 1 000 x 1 000 cell surface can take up to 90 seconds to complete. 204
Because CIRCUITSCAPE calculates resistance over all possible pathways, it is more 205
computationally intensive than calculating least cost paths. On average, optimization with 206
gdistance is about three times faster than calculating resistance distances with 207
CIRCUITSCAPE. To further reduce the optimization time when using least cost paths, the 208
optimization can be run in parallel by setting parallel = TRUE in GA.prep. Additional 209
strategies to reduce the runtime of CIRCUITSCAPE/gdistance include modifying the 210
connection scheme (Neighbor.Connect/directions) from the default setting of 8 to 4, and 211
reducing the resolution of the resistance surfaces (i.e. increase cell size). The latter will reduce 212
the total number of cells on the surface, and generally has limited effect on the quantification of 213
relative resistances across the landscape (McRae et al. 2008). 214
215
Worked examples 216
.CC-BY-NC 4.0 International licenseacertified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under
The copyright holder for this preprint (which was notthis version posted July 29, 2014. ; https://doi.org/10.1101/007575doi: bioRxiv preprint
11
The examples below use simulated data provided with ResistanceGA. Code to generate the 217
example resistance surfaces and spatial point data can be found in Appendix S1. More extensive 218
examples and further descriptions of the functions included in ResistanceGA can be found in 219
the vignette included with the package. 220
221
Example 1: Single surface optimization 222
This example optimizes a single resistance surface using CIRCUITSCAPE. 223
# Get data 224
# Install 'devtools' package, if needed 225
if(!("devtools" %in% list.files(.libPaths()))) { 226
install.packages("devtools") 227
} 228
229
library(devtools) 230
install_github("wpeterman/ResistanceGA") # Download package 231
library(ResistanceGA) # Load package 232
rm(list = ls()) 233
234
# Create directory for reading/writing CIRCUITSCAPE files and results 235
if("ResistanceGA_Examples"%in%dir("C:/")==FALSE) 236
dir.create(file.path("C:/", "ResistanceGA_Examples")) 237
238
# Create a subdirectory for the first example 239
dir.create(file.path("C:/ ResistanceGA_Examples/","SingleSurface")) 240
241
# Directory to write .asc files and results 242
write.dir <- "C:/ ResistanceGA_Examples/SingleSurface/" 243
244
# Load example data, write continuous surface as `.asc` file 245
data(resistance_surfaces) 246
.CC-BY-NC 4.0 International licenseacertified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under
The copyright holder for this preprint (which was notthis version posted July 29, 2014. ; https://doi.org/10.1101/007575doi: bioRxiv preprint
12
continuous <- resistance_surfaces[[2]] 247
writeRaster(continuous,filename=paste0(write.dir,"cont.asc"),overwrite=TRUE) 248
249
# Load sample location data 250
data(samples) 251
write.table(samples, 252
file=paste0(write.dir,"samples.txt"), 253
sep="\t",col.names=F,row.names=F) 254
255
# Create a spatial points object for plotting 256
sample.locales <- SpatialPoints(samples[,c(2,3)]) 257
258
# Run preparation functions 259
GA.inputs <- GA.prep(ASCII.dir=write.dir, 260
max.cat=500, 261
max.cont=500, 262
seed=555, 263
quiet=TRUE) 264
265
CS.inputs <- CS.prep(n.POPS=length(sample.locales), 266
CS_Point.File=paste0(write.dir,"samples.txt"), 267
CS.program='"C:/Program Files/Circuitscape/cs_run.exe"') 268
269
# Transform the continuous surface 270
r.tran <- Resistance.tran(transformation="Monomolecular", 271
shape=2, 272
max=275, 273
r=continuous) 274
275
# Visualize the transformation (Fig. 3) 276
plot.t <- Plot.trans(PARM=c(2,275), 277
Resistance=continuous, 278
.CC-BY-NC 4.0 International licenseacertified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under
The copyright holder for this preprint (which was notthis version posted July 29, 2014. ; https://doi.org/10.1101/007575doi: bioRxiv preprint
13
transformation="Monomolecular") 279
280
# Calculate pairwise resistance values to use as the response 281
CS.response <- Run_CS(CS.inputs=CS.inputs, 282
GA.inputs=GA.inputs, 283
r=r.tran) 284
285
# Rerun `CS.prep` to include response 286
CS.inputs <- CS.prep(n.POPS=length(sample.locales), 287
response=CS.response, 288
CS_Point.File=paste0(write.dir,"samples.txt"), 289
CS.program='"C:/Program Files/Circuitscape/cs_run.exe"') 290
291
# Run optimization 292
SS_RESULTS <- SS_optim(CS.inputs=CS.inputs, 293
GA.inputs=GA.inputs) 294
295
# Compare results 296
SS_table <- data.frame(c("Monomolecular", 2.0, 275), 297
t(SS_RESULTS$ContinuousResults[c(3:5)])) 298
colnames(SS_table) <- c("Truth", "Optimized") 299
SS_table 300
Truth Optimized 301
Equation Monomolecular Monomolecular 302
shape 2 1.999999 303
max 275 274.9982 304
305
The exact transformation parameters were recovered in 147 iterations (Fig 3). 306
Example 2: Multisurface optimization 307
This example simultaneously optimizes three resistance surface using least cost paths. 308
# Get data 309
.CC-BY-NC 4.0 International licenseacertified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under
The copyright holder for this preprint (which was notthis version posted July 29, 2014. ; https://doi.org/10.1101/007575doi: bioRxiv preprint
14
data(resistance_surfaces) 310
data(samples) 311
312
# Create a spatial points object 313
sample.locales <- SpatialPoints(samples[,c(2,3)]) 314
315
# Create a subdirectory for the second example 316
dir.create(file.path("C:/ResistanceGA/","MultipleSurfaces")) 317
318
# Run `GA.prep` 319
GA.inputs <- GA.prep(ASCII.dir=resistance_surfaces, 320
Results.dir="C:/ResistanceGA/MultipleSurfaces/", 321
max.cat=500, 322
max.cont=500, 323
seed = 321, 324
parallel = 4) # Run on in parallel on 4 cores 325
326
# Run `CS.prep` functions 327
gdist.inputs<-gdist.prep(n.POPS=length(sample.locales), 328
samples=sample.locales) 329
330
# Set parameters for transforming and combining resistance surfaces 331
PARM <- c(1,250,75,6,3.5,150,1,350) 332
333
# PARM<- c(1, # First feature of categorical 334
# 250, # Second feature of categorical 335
# 75, # Third feature of categorical 336
# 6, # Transformation equation for continuous surface 337
# 3.5, # Shape parameter 338
# 150, # Scale parameter 339
# 1, # First feature of feature surface 340
# 350) # Second feature of feature surface 341
.CC-BY-NC 4.0 International licenseacertified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under
The copyright holder for this preprint (which was notthis version posted July 29, 2014. ; https://doi.org/10.1101/007575doi: bioRxiv preprint
15
342
# Combine resistance surfaces 343
Resist <- Combine_Surfaces(PARM=PARM, 344
gdist.inputs=gdist.inputs, 345
GA.inputs=GA.inputs, 346
out=NULL, 347
rescale=TRUE) 348
349
# Create the response surface 350
gd.response <- Run_gdistance(gdist.inputs=gdist.inputs, 351
GA.inputs=GA.inputs, 352
r=Resist) 353
354
# Re-run `gdist.prep` to include the response 355
gdist.inputs<-gdist.prep(n.POPS=length(sample.locales), 356
response=lower(as.matrix(gd.response)), 357
samples=sample.locales) 358
359
# Run `MS_optim` 360
Multi.Surface_optim.gd <- MS_optim(gdist.inputs=gdist.inputs, 361
GA.inputs=GA.inputs) 362
363
# View optimized parameter values with the true simulation values 364
Summary.table <- data.frame(PARM,round(t(Multi.Surface_optim.gd@solution),2)) 365
colnames(Summary.table)<-c("Truth", "Optimized") 366
row.names(Summary.table)<-c("Category1", "Category2", "Category3", 367
"Transformation", "Shape", "Max", "Feature1", "Feature2") 368
Summary.table 369
Truth Optimized 370
Category1 1.0 1.00 371
Category2 250.0 268.27 372
Category3 75.0 80.42 373
.CC-BY-NC 4.0 International licenseacertified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under
The copyright holder for this preprint (which was notthis version posted July 29, 2014. ; https://doi.org/10.1101/007575doi: bioRxiv preprint
16
Transformation 6.0 6.51 374
Shape 3.5 3.50 375
Max 150.0 161.11 376
Feature1 1.0 1.00 377
Feature2 350.0 375.56 378
379
# Compare the true and optimized resistance surfaces 380
optim.resist <- Combine_Surfaces(PARM=Multi.Surface_optim.gd@solution, 381
gdist.inputs = gdist.inputs, 382
GA.inputs = GA.inputs) 383
384
resist.stack <- stack(Resist,optim.resist) 385
names(resist.stack) <- c("Truth", "Optimized") 386
pairs(resist.stack) # Correlation plot (Fig. 4) 387
plot(resist.stack) # Resistance maps (Fig. 4) 388
389
This example took 210 iterations to optimize. As described above, the optimization has 390
recovered different parameter values than those that were simulated, but the relative values of the 391
simulated and optimized surfaces are identical (Fig. 4). 392
393
Obtaining ResistanceGA 394
ResistanceGA is hosted on GitHub, and can be downloaded using the install.github 395
function from the devtools package: 396
install.github(“wpeterman/ResistanceGA”) 397
Following download, the package can be loaded: 398
library(ResistanceGA) 399
View the vignette with more demonstrations of ResistanceGA functions: 400
vignette('ResistanceGA') 401
.CC-BY-NC 4.0 International licenseacertified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under
The copyright holder for this preprint (which was notthis version posted July 29, 2014. ; https://doi.org/10.1101/007575doi: bioRxiv preprint
17
Alternatively, the HTML vignette can be viewed here: http://goo.gl/n7VOZx 402
Acknowledgements 403
I thank G. Connette for many discussions concerning the development and implementation of 404
functions in ResistanceGA, and for comments on an early draft of this manuscript. 405
406
Data accessibility 407
Example data are provided with ResistanceGA or can be made using the code provided in 408
Appendix S1. 409
410
Supporting information 411
Appendix S1. R code to generate the example resistance surfaces provided with ResistanceGA 412
.CC-BY-NC 4.0 International licenseacertified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under
The copyright holder for this preprint (which was notthis version posted July 29, 2014. ; https://doi.org/10.1101/007575doi: bioRxiv preprint
18
References 413
Bates, D.M., Maechler, M., Bolker, B.M. & Walker, S. (2014) lme4: Linear mixed-effects 414
models using Eigen and S4. R package version 1.1-6. http://CRAN.R-415
project.org/package=lme4. 416
Bolker, B.M. (2008) Ecological Models and Data in R. Princeton University Press, Princeton, 417
NJ. 418
Charney, N.D. (2012) Evaluating expert opinion and spatial scale in an amphibian model. 419
Ecological Modelling, 242, 37–45. 420
Clarke, R.T., Rothery, P. & Raybould, A.F. (2002) Confidence limits for regression relationships 421
between distance matrices: Estimating gene flow with distance. Journal of Agricultural, 422
Biological, and Environmental Statistics, 7, 361–372. 423
Cushman, S.A., McKelvey, K.S., Hayden, J. & Schwartz, M.K. (2006) Gene flow in complex 424
landscapes: Testing multiple hypotheses with causal modeling. American Naturalist, 168, 425
486–499. 426
Driezen, K., Adriaensen, F., Rondinini, C., Doncaster, C.P. & Matthysen, E. (2007) Evaluating 427
least-cost model predictions with empirical dispersal data: A case-study using 428
radiotracking data of hedgehogs (Erinaceus europaeus). Ecological Modelling, 209, 314–429
322. 430
Graves, T.A., Beier, P. & Royle, J.A. (2013) Current approaches using genetic distances produce 431
poor estimates of landscape resistance to interindividual dispersal. Molecular Ecology, 432
22, 3888–3903. 433
Hijmans, R.J. (2014) raster: Geographic data analysis and modeling. R package version 2.2-31. 434
http://CRAN.R-project.org/package=raster. 435
.CC-BY-NC 4.0 International licenseacertified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under
The copyright holder for this preprint (which was notthis version posted July 29, 2014. ; https://doi.org/10.1101/007575doi: bioRxiv preprint
19
Holland, J.H. (1975) Adaptation In Natural And Artificial Systems: An Introductory Analysis 436
With Applications To Biology, Control, And Artificial Intelligence. University of 437
Michigan Press. 438
Manel, S., Schwartz, M.K., Luikart, G. & Taberlet, P. (2003) Landscape genetics: combining 439
landscape ecology and population genetics. Trends in Ecology and Evolution, 18, 189–440
197. 441
McRae, B.H. (2006) Isolation by resistance. Evolution, 60, 1551–1561. 442
McRae, B.H., Dickson, B.G., Keitt, T.H. & Shah, V.B. (2008) Using circuit theory to model 443
connectivity in ecology, evolution, and conservation. Ecology, 89, 2712–2724. 444
McRae, B.H. & Shah, V.B. (2009) Cicuitscape user's guide. ONLINE. The University of 445
California, Santa Barbara. Available at: http//www.circuitscape.org. 446
Murray, J.V., Goldizen, A.W., O’Leary, R.A., McAlpine, C.A., Possingham, H.P. & Choy, S.L. 447
(2009) How useful is expert opinion for predicting the distribution of a species within and 448
beyond the region of expertise? A case study using brush-tailed rock-wallabies Petrogale 449
penicillata. Journal of Applied Ecology, 46, 842–851. 450
Peterman, W.E., Connette, G.M., Semlitsch, R.D. & Eggert, L.S. (2014) Ecological resistance 451
surfaces predict fine-scale genetic differentiation in a terrestrial woodland salamander. 452
Molecular Ecology, 23, 2402–2413. 453
R Core Team (2014) R: A language and environment for statistical computing. R Foundation for 454
Statistical Computing, Vienna, Austria. http://www.R-project.org/. 455
Scrucca, L. (2013) GA: A package for genetic algorithms in R. Journal of Statistical Software, 456
53, 1–37. 457
.CC-BY-NC 4.0 International licenseacertified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under
The copyright holder for this preprint (which was notthis version posted July 29, 2014. ; https://doi.org/10.1101/007575doi: bioRxiv preprint
20
Shirk, A.J., Wallin, D.O., Cushman, S.A., Rice, C.G. & Warheit, K.I. (2010) Inferring landscape 458
effects on gene flow: A new model selection framework. Molecular Ecology, 19, 3603–459
3619. 460
Sivanandam, S. & Deepa, S. (2007) Introduction to Genetic Algorithms. Springer. 461
Spear, S.F., Balkenhol, N., Fortin, M.J., McRae, B.H. & Scribner, K. (2010) Use of resistance 462
surfaces for landscape genetic studies: considerations for parameterization and analysis. 463
Molecular Ecology, 19, 3576–3591. 464
Spear, S.F., Peterson, C.R., Matocq, M.D. & Storfer, A. (2005) Landscape genetics of the 465
blotched tiger salamander (Ambystoma tigrinum melanostictum). Molecular Ecology, 14, 466
2553–2564. 467
Stevens, V.M., Verkenne, C., Vandewoestijne, S., Wesselingh, R.A. & Baguette, M. (2006) 468
Gene flow and functional connectivity in the natterjack toad. Molecular Ecology, 15, 469
2333–2344. 470
Storfer, A., Murphy, M.A., Evans, J.S., Goldberg, C.S., Robinson, S., Spear, S.F., Dezzani, R., 471
Delmelle, E., Vierling, L. & Waits, L.P. (2007) Putting the 'landscape' in landscape 472
genetics. Heredity, 98, 128–142. 473
Storfer, A., Murphy, M.A., Spear, S.F., Holderegger, R. & Waits, L.P. (2010) Landscape 474
genetics: where are we now? Molecular Ecology, 19, 3496–3514. 475
van Etten, J. (2014) gdistance: distances and routes on geographical grids. R package version 476
1.1-5. http://CRAN.R-project.org/package=gdistance. 477
Wang, Y.-H., Yang, K.-C., Bridgman, C.L. & Lin, L.-K. (2008) Habitat suitability modelling to 478
correlate gene flow with landscape connectivity. Landscape Ecology, 23, 989–1000. 479
Wickham, H. (2009) ggplot2: Elegant graphics for data analysis. Springer, New York. 480
.CC-BY-NC 4.0 International licenseacertified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under
The copyright holder for this preprint (which was notthis version posted July 29, 2014. ; https://doi.org/10.1101/007575doi: bioRxiv preprint
21
Table 1. Functions of the ResistanceGA package 481
Function Returned objects Description Combine_Surfaces R raster object Combine multiple resistance surfaces into
a new composite surface based on specified parameters
CS.prep Named list This function prepares and bundles the inputs necessary to run CIRCUITSCAPE
Diagnostic.Plots .tif file This function returns a multipanel figure including the histogram of residuals and qqplot from a fitted mixed effects model
GA.prep Named list This function prepares and compiles the objects and commands needed to execute the genetic algorithm
gdist.prep Named list This function prepares and bundles the inputs necessary to run gdistance
Grid.Search contour plot, R data object
For a single continuous surface, this function can be used to visualize the AICc response surface
lower R data object Convenience function to obtain the lower half of a square distance matrix
MLPE.lmm lmer object Fits a maximum likelihood population effects mixed effects model in lme4
MS_optim GA object, diagnostic plots, .txt summary file
This is a wrapper function for simultaneously optimizing multiple surfaces.
Plot.trans ggplot object Implements and plots a continuous surface transformation
Resistance.Opt_multi AICc value This is the function passed to ga to optimize multiple resistance surfaces simultaneously
Resistance.Opt_single AICc value This is the function passed to ga to optimize resistance surfaces in isolation
Resistance.tran R raster object, .asc file
Applies specified transformation to continuous resistance surface and returns a raster object. Optionally, a .asc file can be exported
Run_CS R raster object or R data object
This function executes CIRCUITSCAPE from R and returns either the lower half of the pairwise resistance distance matrix or the cumulative current map produced by CIRCUITSCAPE
.CC-BY-NC 4.0 International licenseacertified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under
The copyright holder for this preprint (which was notthis version posted July 29, 2014. ; https://doi.org/10.1101/007575doi: bioRxiv preprint
22
Run_gdistance costDistance matrix object from gdistance
This function executes gdistance and returns the costDistance matrix object
SS_optim .tif files, .csv summary tables
This is a wrapper function for optimizing surfaces in isolation. All surfaces in a common directory will be optimized in turn, and numerous summary tables of optimized parameters and AICc values are produced
482
.CC-BY-NC 4.0 International licenseacertified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under
The copyright holder for this preprint (which was notthis version posted July 29, 2014. ; https://doi.org/10.1101/007575doi: bioRxiv preprint
23
Table 2. Arguments of the required preparation functions and their default settings. Only one of 483
either CS.prep or gdist.prep needs to be run, depending upon whether optimization will use 484
CIRCUITSCAPE or gdistance. 485
Function Arguments Defaults Comments CS.prep n.POPS Must be defined
response NULL Must be defined to run optimization
CS_Point.File Must be defined
CS.program "C:/Program Files/Circuitscape/cs_run.exe"' Default Windows installation location
Neighbor.Connect 8
gdist.prep n.POPS Must be defined
response NULL Must be defined to run optimization
samples Must be defined
transitionFunction function(x) 1/mean(x)
directions 8
longlat FALSE
GA.prep ASCII.dir Must be defined .asc files should be in
their own directory
Min.Max "max" Must be "max" when
optimization with ga
min.cat 0.0001
max.cat 2500
max.cont 2500
cont.shape NULL
pop.mult 15
percent.elite 0.05
type "real-valued" Must be "real-valued"
pcrossover 0.85
pmutation 0.1
maxiter 1000
run 25
keepBest TRUE
population "gareal_Population"
selection "gareal_lsSelection"
crossover "gareal_blxCrossover"
.CC-BY-NC 4.0 International licenseacertified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under
The copyright holder for this preprint (which was notthis version posted July 29, 2014. ; https://doi.org/10.1101/007575doi: bioRxiv preprint
24
mutation "gareal_raMutation"
seed NULL
quiet FALSE
486
.CC-BY-NC 4.0 International licenseacertified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under
The copyright holder for this preprint (which was notthis version posted July 29, 2014. ; https://doi.org/10.1101/007575doi: bioRxiv preprint
25
Figure 1. There are eight continuous resistance surface data transformations implemented in 487
ResistanceGA. Prior to transformation, the original continuous resistance surface had values 488
ranging from 0–10. The shape and magnitude of each transformation are each controlled by a 489
single parameter. All transformations in the figure have a shape parameter value of 3, and 490
maximum value parameter of 100. Linear relationships are not explicitly incorporated, but all 491
monomolecular functions become linear as the shape parameter increases. 492
493
Figure 2. Flow chart depicting a potential work flow for optimizing resistance surfaces. Analysis 494
begins by optimizing each resistance surface in isolation. If multiple surfaces are supported (e.g., 495
∆AICc ≤ 4), these surfaces can be simultaneously optimized to create a composite surface. If 496
desired, different combinations of the best supported single surfaces can be optimized and 497
compared. Inputs into the optimization are shown in yellow ovals, optimization steps are blue 498
rectangles, decision points are white diamonds, and final products from optimization are green 499
ovals. 500
Figure 3. Optimization of a continuous surface that was transformed using a monomolecular 501
function following Example 1 in the main text (transformation = Monomolecular, shape = 2.0, 502
maximum = 275). Optimization with CIRCUITSCAPE took 147 iterations of the genetic 503
algorithm and precisely recovered the transformation parameters resulting in perfectly correlated 504
resistance surfaces. 505
506
Figure 4. Simultaneous optimization of three resistance surfaces (Example 2 in the main text). 507
The original categorical, continuous, and feature resistance surfaces are shown on the left and 508
their actual resistances following transformation/rescaling are shown in the middle. When these 509
surfaces are added together to create a composite, the minimum resistance value is no longer 1, 510
so the surface is rescaled to present the relative resistance values (scaled composite resistance). 511
Optimization using least cost paths took 215 iterations of the genetic algorithm when run in 512
parallel on 4 cores. The relative resistance values of the optimized surface and the true resistance 513
surface are perfectly correlated. 514
.CC-BY-NC 4.0 International licenseacertified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under
The copyright holder for this preprint (which was notthis version posted July 29, 2014. ; https://doi.org/10.1101/007575doi: bioRxiv preprint
26
515
Figure 1 516
.CC-BY-NC 4.0 International licenseacertified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under
The copyright holder for this preprint (which was notthis version posted July 29, 2014. ; https://doi.org/10.1101/007575doi: bioRxiv preprint
27
517
Figure 2 518
.CC-BY-NC 4.0 International licenseacertified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under
The copyright holder for this preprint (which was notthis version posted July 29, 2014. ; https://doi.org/10.1101/007575doi: bioRxiv preprint
28
519
Figure 3 520
.CC-BY-NC 4.0 International licenseacertified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under
The copyright holder for this preprint (which was notthis version posted July 29, 2014. ; https://doi.org/10.1101/007575doi: bioRxiv preprint
29
521
Figure 4 522
.C
C-B
Y-N
C 4.0 International license
acertified by peer review
) is the author/funder, who has granted bioR
xiv a license to display the preprint in perpetuity. It is made available under
The copyright holder for this preprint (w
hich was not
this version posted July 29, 2014. ;
https://doi.org/10.1101/007575doi:
bioRxiv preprint
Top Related