Modeling Species Distribution with MaxEnt Bryce Maxell, Acting Director, Montana Natural Heritage Program & Scott Story, Nongame Data Manager, Montana Fish, Wildlife and Parks


Agenda - Wednesday
• 8-9 Introduction to MaxEnt
• 9:05-10 Reptile and Amphibian Model Examples
• 10:05-11 Installation and Walkthrough of MaxEnt
• 11:05-12 Preparation of Data
• 12-1 Lunch
• 1-1:55 Thresholds & Model Validation
• 2-3 Using models in your DSS
• 3-5 Hands-on Session
• Tomorrow 8-11 Hands-on, Data Prep, Questions & Discussion


• About to start again folks on the phone.

INSTALLATION
Installing and Running MaxEnt

Download & Install
• http://www.cs.princeton.edu/~schapire/maxent/
• Current MaxEnt Version = 3.3.3e
• Requires Java Version 1.4 or later
• Type java -version at command prompt
• http://www.java.com
• Extract the .zip file to a very simple directory
  – No spaces, no strange characters, short
  – C:\maxent
• Three files are installed
  – Maxent.bat
  – Maxent.jar
  – Readme.txt
  – Download the tutorial Word document


Check Java Version
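A minimal sketch of the same check done in a script, assuming the banner text has already been captured from java -version (which prints to stderr). The function names and the sample banner strings are illustrative, not part of MaxEnt.

```python
import re


def parse_java_version(banner):
    """Return (major, minor) from the banner printed by `java -version`."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', banner)
    if match is None:
        raise ValueError("no version string found in banner")
    major = int(match.group(1))
    minor = int(match.group(2) or 0)
    return major, minor


def meets_minimum(banner, minimum=(1, 4)):
    """True if the reported Java version is at least `minimum` (1.4 per the slide)."""
    return parse_java_version(banner) >= minimum


# Illustrative banner lines, not output copied from a real machine.
print(meets_minimum('java version "1.6.0_24"'))
print(meets_minimum('java version "1.3.1"'))
```

Tuples compare element by element, so (1, 6) >= (1, 4) holds and (1, 3) >= (1, 4) does not.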

Set PATH and customize .bat file
• My Computer > Properties > Advanced > Environment Variables > System Variables > PATH > Edit
• Add to end of the PATH ;c:\maxent
• Change the maxent.bat file
  – Change the extension to .txt so that you can edit it with Notepad
  – Change line reading java -mx512m -jar maxent.jar to…
  – java -mx512m -jar c:\maxent\maxent.jar
  – Change the extension back to .bat
  – Note that changing the 512 to a larger number allocates more memory (value is in MB)
      512 = 0.5 GB
      1024 = 1 GB
      1536 = 1.5 GB
      2048 = 2 GB
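The memory setting is just the number of megabytes placed after -mx. A small sketch of building that line for a given amount of memory follows; the helper names are hypothetical, and the jar path is the one used on this slide.

```python
def gb_to_mb(gigabytes):
    """Convert gigabytes to the megabyte count used after -mx (1 GB = 1024 MB)."""
    return int(gigabytes * 1024)


def maxent_command(memory_mb=512, jar_path=r"c:\maxent\maxent.jar"):
    """Build the line that goes in maxent.bat for a given memory allocation."""
    if memory_mb <= 0:
        raise ValueError("memory_mb must be positive")
    return "java -mx{}m -jar {}".format(memory_mb, jar_path)


print(maxent_command())               # default 512 MB
print(maxent_command(gb_to_mb(1.5)))  # 1.5 GB
```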

BASIC MODELING RUN
Running MaxEnt


Required Inputs

• Species presence localities (“samples”) file

• Environmental feature layers

• Output directory


MaxEnt – Main Screen


Supply presence localities

Supply folder containing environmental feature layers


Change variable types as necessary

Supply an output directory


Ready to Run

What MaxEnt Does
• Reads through each layer to
  – Determine type
  – Create .mxe file for each layer in maxent.cache
• Extracts the random background and sample data
  – You will get warnings about points that are “missing some environmental data”
• Calculates the gain until a threshold is reached
• Creates the output grids for each species (this takes the longest)
• Creates the thumbnail .png images

Time Required
• Ten feature layers (3 categorical)
  – 46 million pixels
• 2 Species
• Intel Core 2 Quad CPU (2.83 GHz)
• 4.00 GB RAM
• Windows 7
• 32-bit Operating System
• 512 MB of memory specified

Without maxent.cache = 38 minutes
With maxent.cache = 24 minutes

EXAMINING OUTPUT
Running MaxEnt

Output
• plots folder
• logfile
• maxentResults.csv
• For each species
  – .asc
  – .html
  – .lambdas
  – _omission.csv
  – _sampleAverages.csv
  – _samplePredictions.csv

Logfile
• Timestamp
• Version of MaxEnt
• Samples file name
• Warnings
• Command line to repeat
• Species
• Layers
• Layertypes
• Directories for: samples file, layers, output
• Number of samples
• Maximum gain

Gain
• Closely related to deviance, a measure of goodness of fit (GOF) in GAM and GLM
• Starts at zero and heads toward an asymptote
• MaxEnt is trying to come up with the best fit
• Average log probability of presence samples minus a constant
• Gain indicates how closely the model is concentrated around presence samples
• Avg likelihood of presence samples = exp(gain)

Gain Examples
• McCown’s Longspur
  – Resulting gain: 2.275
  – Average likelihood for presence points = 9.728
• Olive-sided Flycatcher
  – Resulting gain: 1.297
  – Average likelihood for presence points = 3.658
• Average likelihood of the presence sample is X times higher than that of a background pixel
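The two likelihood values above follow directly from exp(gain). A quick sketch to reproduce them; the function name is illustrative.

```python
import math


def likelihood_ratio(gain):
    """How many times more likely an average presence sample is than a background pixel."""
    return math.exp(gain)


for species, gain in [("McCown's Longspur", 2.275), ("Olive-sided Flycatcher", 1.297)]:
    print("{}: gain {} -> {:.3f}".format(species, gain, likelihood_ratio(gain)))
```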

Html
• Analysis of omission/commission
• Receiver Operating Characteristic (ROC) Curve (AUC calculated)
• Preset Thresholds
• Pictures of the Model
• Analysis of Variable Contributions
• Raw Outputs


Omission Rate vs. Cumulative Threshold

Receiver Operating Characteristic (ROC) Curve

Sample Predictions File
• Coordinates for all points
• Test or Training
• Predicted values
  – Raw
  – Cumulative
  – Logistic
• Use this file to calculate deviance
• Use samples procedure in ArcMap to extract the ones and zeros (above threshold or not)
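The ones and zeros can also be derived from the predictions file itself. This is a rough sketch, not the ArcMap procedure: the column names are assumed to match the header MaxEnt writes, the three data rows are made up, and the 0.351 threshold is the 10 percentile training presence logistic value from the workshop example.

```python
import csv
import io

# Hypothetical rows for illustration; column names are an assumption about the file header.
SAMPLE_PREDICTIONS = """X,Y,Test or train,Raw prediction,Cumulative prediction,Logistic prediction
500000,300000,train,0.00012,45.2,0.62
510000,310000,train,0.00002,8.1,0.21
520000,320000,test,0.00008,30.7,0.48
"""


def classify(csv_text, threshold, column="Logistic prediction"):
    """Return (test-or-train label, 1 or 0) per point: 1 if prediction >= threshold."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return [
        (row["Test or train"], 1 if float(row[column]) >= threshold else 0)
        for row in rows
    ]


flags = classify(SAMPLE_PREDICTIONS, threshold=0.351)
print(flags)
```

For a real run, open the species' _samplePredictions.csv and pass its text in place of the sample string.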


Sample Predictions File

Logistic Output

High probability of suitable conditions
Low predicted probability of suitable conditions
White dots = training (1059 points or 75%)
Purple dots = test (352 points or 25%)

Viewing Data in ArcMap
• Build Raster Attribute Table (Categorical)
  – .vat.dbf
• Build Histograms (Classified)
  – .aux
• Build Pyramids
  – .rrd
  – .xml
• For species output grids
  – Convert ASCII to Raster (Output Data Type = FLOATING)
  – Output as .bil (Band interleaved by line)

MORE ADVANCED PARAMETERS
Running MaxEnt

REPLICATE RUNS
Running MaxEnt

BATCH MODE
Running MaxEnt


Preparation of Data

Scott Story


Required Inputs

• Species presence localities (“samples”) file

• Environmental feature layers

• Output directory

Getting Feature Data Ready

• Same projection (coordinate system, units, datum)
• Same resolution
• Same extent
• ESRI ASCII format
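Resolution and extent can be compared before MaxEnt is ever started by reading the short header at the top of each ESRI ASCII grid. A minimal sketch, assuming the common header keywords (ncols, nrows, xllcorner, yllcorner, cellsize); the corner coordinates in the sample headers are invented, while the sizes come from the land cover and precipitation examples in this workshop.

```python
import io

HEADER_KEYS = ("ncols", "nrows", "xllcorner", "yllcorner", "cellsize")


def read_header(stream):
    """Read up to six 'keyword value' header lines from an ESRI ASCII grid."""
    header = {}
    for _ in range(6):
        parts = stream.readline().split()
        # Stop at the first line that is not a 'keyword value' pair (i.e. grid data).
        if len(parts) != 2 or not parts[0][0].isalpha():
            break
        header[parts[0].lower()] = float(parts[1])
    return header


def grids_match(a, b, tol=1e-9):
    """True if two headers agree on size, origin, and cell size."""
    for key in HEADER_KEYS:
        if key not in a or key not in b:
            return False
        if abs(a[key] - b[key]) > tol:
            return False
    return True


# Corner coordinates below are hypothetical.
LANDCOVER = "ncols 33005\nnrows 24008\nxllcorner 100000\nyllcorner 50000\ncellsize 30\nNODATA_value -9999\n"
PRECIP_RAW = "ncols 7025\nnrows 3105\nxllcorner -125.0\nyllcorner 24.0\ncellsize 0.0083333333\nNODATA_value -9999\n"

land = read_header(io.StringIO(LANDCOVER))
precip = read_header(io.StringIO(PRECIP_RAW))
print(grids_match(land, land), grids_match(land, precip))
```

This only checks the header numbers; it says nothing about whether the two grids share a projection.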

Two Raster Datasets

Land cover
• Source = Montana Natural Heritage Program
• Type = IMAGINE Image
• Cell size = 30 meters
• Columns & Rows = 33005, 24008
• Spatial Reference = Montana State Plane (NAD83)
• Pixel Type = Unsigned Integer (8-bit)

Precipitation
• Source = PRISM Climate Center
• Type = ASCII grid
• Cell size = 0.0083333333
• Columns & Rows = 7025, 3105
• Spatial Reference = undefined (see metadata)
• Pixel Type = Signed Integer (32-bit)


Two Raster Datasets

Land cover Precipitation

Making Rasters Match

• Define coordinate systems for both
• Set some environment variables
  – Tools > Options > Geoprocessing Tab > Environments
  – General Settings: Extent and Snap Raster
  – Raster Analysis Settings: Cell Size, Mask
• Project Raster
  – Select target raster to match for output cell size

Precipitation Reprojected & Resampled

• Same exact extent
• Same exact number of rows & columns
• Same exact cell size
• Real test is…does MaxEnt throw any errors?
• In this case…it worked!
• Getting all your data layers squared away will take some time!


Deriving New Raster Data - Ruggedness

Types of Environmental Features
• Continuous (Quantitative)
  – Interval-scale (interval data, order, linear scale)
  – Ordinal variables (scale unknown-transformed?, rank clear)
  – Ratio-scale (interval data, ordered, not on linear scale, e.g. temp on F or C scale)
• Categorical (Qualitative)
  – Nominal (e.g. gender)
  – Ordinal (has order, e.g. low to great)
  – Dummy variables from quantitative (classes)
• Name the ASCII files with CONT or CAT prefix
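With the CONT/CAT naming convention in place, the layer type can be read straight off the file name. A small sketch; the file names are hypothetical and the function is not part of MaxEnt.

```python
import os


def layer_type(filename):
    """Return 'continuous' or 'categorical' based on a CONT or CAT filename prefix."""
    name = os.path.basename(filename).upper()
    if name.startswith("CONT"):
        return "continuous"
    if name.startswith("CAT"):
        return "categorical"
    raise ValueError("layer name needs a CONT or CAT prefix: " + filename)


# Hypothetical layer names following the convention.
for layer in ["CONT_precip.asc", "CAT_landcover.asc"]:
    print(layer, "->", layer_type(layer))
```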

Preparing Point Data

• Create a separate file for each species
• Or combine all of them, or groups of them, into one file
• Probably want to retain a unique identifier
• May want to set up scripts in ArcGIS to extract presence data
• Might also want more control of how background data is selected
• Let’s look at an example script - ExtractModelInputData.py
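This is not ExtractModelInputData.py, only a rough sketch of the combining step: several species' records written to one samples file with species, x, and y as the first three columns. The header names, species labels, and coordinates are assumptions for illustration; the unique identifier stays in the source records rather than in the samples file.

```python
import csv
import io


def write_samples(records, stream):
    """Write one combined samples file: species name, x coordinate, y coordinate."""
    writer = csv.writer(stream, lineterminator="\n")
    writer.writerow(["species", "longitude", "latitude"])
    # Sort by species, then by identifier, so each species' rows sit together.
    for rec in sorted(records, key=lambda r: (r["species"], r["id"])):
        writer.writerow([rec["species"], rec["x"], rec["y"]])


# Hypothetical observation records; "id" is the unique identifier kept upstream.
records = [
    {"id": 2, "species": "olive_sided_flycatcher", "x": 410500.0, "y": 250300.0},
    {"id": 1, "species": "mccowns_longspur", "x": 612000.0, "y": 301750.0},
]

buffer = io.StringIO()
write_samples(records, buffer)
print(buffer.getvalue())
```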

Other “Feature” Layers
• Masks
  – useful if you want to train a model using only a subset of the region
  – mask.asc
  – containing a constant value (1, for example) in area of interest and no-data values everywhere else
• Bias
  – default assumption is that species occurrence data are unbiased
  – a bias layer requires a good understanding of the spatial pattern of sampling
  – values should indicate relative sampling effort
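A mask grid as described above is simple enough to write by hand. A minimal sketch that builds the text of a mask.asc from a grid of True/False cells, assuming the usual ESRI ASCII header and -9999 as the no-data value; the tiny 2 x 3 grid and its coordinates are made up.

```python
def mask_ascii(inside, xllcorner, yllcorner, cellsize, nodata=-9999):
    """Return ESRI ASCII text with 1 inside the area of interest and no-data elsewhere."""
    nrows = len(inside)
    ncols = len(inside[0])
    lines = [
        "ncols {}".format(ncols),
        "nrows {}".format(nrows),
        "xllcorner {}".format(xllcorner),
        "yllcorner {}".format(yllcorner),
        "cellsize {}".format(cellsize),
        "NODATA_value {}".format(nodata),
    ]
    for row in inside:
        lines.append(" ".join("1" if cell else str(nodata) for cell in row))
    return "\n".join(lines) + "\n"


# Hypothetical 2-row by 3-column area of interest.
area = [
    [True, True, False],
    [False, True, False],
]
text = mask_ascii(area, xllcorner=0, yllcorner=0, cellsize=30)
print(text)
```

In practice the mask must have the same extent, cell size, and dimensions as the other layers.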

THRESHOLDS
Representing the output


Logistic Output (Ranges 0-1)


Reclassify with ArcGIS

Preset MaxEnt Thresholds

Threshold | Cumulative Threshold | Logistic Threshold | Fractional Predicted Area | Training Omission Rate | Test Omission Rate
Fixed Cumulative Value 1 | 1 | 0.043 | 0.344 | 0.002 | 0.000
Fixed Cumulative Value 5 | 5 | 0.172 | 0.255 | 0.020 | 0.020
Fixed Cumulative Value 10 | 10 | 0.260 | 0.210 | 0.044 | 0.082
Minimum Training Presence | 0.699 | 0.029 | 0.365 | 0.000 | 0.000
10 Percentile Training Presence | 17.522 | 0.351 | 0.167 | 0.099 | 0.151
Equal Training Sensitivity & Specificity | 21.989 | 0.393 | 0.149 | 0.148 | 0.205
Maximum Training Sensitivity Plus Specificity | 9.201 | 0.248 | 0.216 | 0.035 | 0.065
Equal Test Sensitivity & Specificity | 18.603 | 0.361 | 0.162 | 0.106 | 0.162
Maximum Test Sensitivity Plus Specificity | 7.729 | 0.225 | 0.228 | 0.029 | 0.043
Balance Training Omission, Predicted Area, & Threshold Value | 1.054 | 0.047 | 0.342 | 0.002 | 0.000
Equate Entropy of Thresholded & Original Distributions | 5.465 | 0.182 | 0.250 | 0.021 | 0.026

Thresholds – Ends of Spectrum

Balance Training Omission, Predicted Area, & Threshold Value

Equal Training Sensitivity & Specificity

MODEL VALIDATION
Model Validation

Validation Metrics

• Receiver Operating Characteristic (ROC) Curve – obtained by plotting, for each threshold, the proportion of true positives against the proportion of false positives
• Area Under Curve (AUC) – the area under the ROC curve described above
• Deviance – minus 2 times the log probability of the test data
• Absolute Validation Index – the proportion of presence evaluation points falling above the threshold or within the GAP predicted distribution
• Point Biserial Correlation – the correlation between a model’s predictions and presence/absence in test data (regarded as a 0/1 variable)
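Three of these metrics are short enough to sketch directly from their definitions. This is an illustrative implementation, not how MaxEnt computes them internally: AUC is calculated here by the equivalent pairwise-ranking method rather than by integrating the plotted curve, the second group of scores may be true absences or background points, and all scores are made up.

```python
import math


def absolute_validation_index(presence_scores, threshold):
    """Proportion of presence evaluation points at or above the threshold."""
    hits = sum(1 for s in presence_scores if s >= threshold)
    return hits / len(presence_scores)


def auc(presence_scores, absence_scores):
    """Probability that a random presence outranks a random absence (ties count half)."""
    wins = 0.0
    for p in presence_scores:
        for a in absence_scores:
            if p > a:
                wins += 1.0
            elif p == a:
                wins += 0.5
    return wins / (len(presence_scores) * len(absence_scores))


def point_biserial(scores, labels):
    """Pearson correlation between predictions and a 0/1 presence/absence variable."""
    n = len(scores)
    mean_s = sum(scores) / n
    mean_l = sum(labels) / n
    cov = sum((s - mean_s) * (l - mean_l) for s, l in zip(scores, labels))
    var_s = sum((s - mean_s) ** 2 for s in scores)
    var_l = sum((l - mean_l) ** 2 for l in labels)
    return cov / math.sqrt(var_s * var_l)


# Hypothetical logistic predictions at test presences and at absence/background points.
presence = [0.9, 0.8, 0.4]
absence = [0.7, 0.3, 0.2, 0.1]

print(absolute_validation_index(presence, threshold=0.5))
print(auc(presence, absence))
print(point_biserial(presence + absence, [1, 1, 1, 0, 0, 0, 0]))
```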


_samplePredictions.csv


Discussion Point


Topics Left

• Data Prep
• Output
• Thresholds
• Validation
• Batch
• Replicates