Week 10 Lecture: Beyond Geoprocessing, Gluing Software Together With Python
Introduction to Programming for GIS & Remote Sensing
GEO6938-4172, GEO4938-4166
Reminder
The introduction to geoprocessing lab is due Thursday
Keep posting to the forum if you have questions or examples
This week’s lab will be time to prepare ideas and collect data for your projects
– Projects due April 8th, presentations on 8th and 10th
Re-cap From Past Lectures
We’ve discussed ArcGIS and geoprocessing
– Model builder
– Exporting/importing scripts
But Python (and other languages) can be used in many ways:
– Code blocks
– Independent processing, simulation
– Glue
Python Can Be Used Inside ArcGIS and the Modeler
Calculate Field for tables:
ArcUser, April-June 2007
Python Code Blocks Greatly Enhance Table/Field Editing
Code blocks to create functions:
ArcUser, April-June 2007
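As a rough illustration of a code block function, the sketch below defines a small classifier of the kind that could be pasted into the Calculate Field code block. The function name, the class breaks, and the ELEV field are assumptions made for this example; they do not come from the ArcUser article.

```python
# Hypothetical code block for the Calculate Field tool.
# In the tool dialog the expression would be something like:
#     classify_elevation(!ELEV!)
# where ELEV is an assumed field name.

def classify_elevation(elev):
    """Return a text class for an elevation value in meters."""
    if elev is None:
        return "unknown"   # guard against empty (null) field values
    if elev < 500:
        return "low"
    if elev < 1500:
        return "mid"
    return "high"

# Outside ArcGIS the same function can be checked on plain values.
for value in (120, 900, 1800, None):
    print(value, classify_elevation(value))
```

Keeping the logic in a plain function makes it easy to test outside ArcGIS before pasting it into the tool.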
Calculate Value is Used in Models:
ArcUser, April-June 2007
Sophisticated Models Can Be Constructed With Calculate Values:
ArcUser, April-June 2007
Calc. Value Must Return a Single Number, But Processing May Be Complex:
ArcUser, April-June 2007
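A minimal sketch of the "single number out, arbitrary processing inside" idea: the function below does a little arithmetic and returns one value that a model could pass on to another tool. The function name, its parameters, and the cell budget are hypothetical and only illustrate the pattern.

```python
import math

def suggest_cell_size(width, height, max_cells=1000000):
    """Return one whole-number cell size (map units) that keeps a
    width x height extent at roughly max_cells cells or fewer."""
    # Cell size at which width/cell * height/cell equals max_cells.
    cell = math.sqrt((width * height) / float(max_cells))
    # Round up to a whole number and never return less than 1.
    return max(1, int(math.ceil(cell)))

# Example: a 30 km x 20 km extent in meters.
print(suggest_cell_size(30000, 20000))
```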
Example of Python as Glue: ArcGIS and R
Habitat and Connectivity Modeler Toolboxes for ArcGIS
Benjamin D. Best, Dean L. Urban, Patrick N. Halpin, Song S. Qian
Duke University, NC USA
Overview
[Overview diagram. Habitat Modeler: point observations, random points, environmental rasters, multivariate (logistic) regression, habitat [0-1], binary habitat [0,1]. Connectivity Modeler: create network, least-cost paths, network (random points, edge lines).]
Goals
What:
1. Model habitat with multivariate regression
2. Model connectivity with graph theory
How:
ESRI ModelBuilder – scientific workflow
Interface to R statistics application
Utilize Python NetworkX module
Provide building block templates
1. Habitat Modeler
[Workflow diagram. Data: pts_obs, pts_rand, tbl_env, dir_plots, rstr_aspect, rstr_dem, rstr_landcov, rstr_tci, rstr_viable (<> Lakes), rstr_glm, rstr_glmroc. Tools: Random Points, Sample to Table, Statistical Plots, Multivariate Regression (GLM).]
2. Connectivity Modeler
[Workflow diagram. Data: poly_patches, poly_patchsm, pt_centroids, rstr_cost, tin, pt_nodes, ln_edges, ln_edgeslc, txt_network. Tools: Create Network, Network Least Cost Path, Network Centrality Metrics.]
Example Environmental Data
aspect
dem: digital elevation model
landcover
tci: topographic convergence index
Grandfather Mountain, NC
1.1. Random Points
R library spatstat (more point patterns possible)
grid mask for point generation (Rgdal)
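The toolbox generates its random points in R with spatstat. As a stand-in to show the idea of a grid mask, here is a minimal pure-Python sketch that draws points only inside cells flagged 1. The grid layout (row 0 at the top, origin at the lower-left corner) and all names are assumptions for this example.

```python
import random

def random_points_in_mask(mask, n, cell_size=1.0, seed=None):
    """Draw n random (x, y) points inside cells where mask == 1.

    mask is a list of rows; row 0 is the top row of the grid and the
    grid origin (0, 0) is assumed to be the lower-left corner.
    """
    rng = random.Random(seed)
    nrows = len(mask)
    valid = [(r, c) for r, row in enumerate(mask)
             for c, v in enumerate(row) if v == 1]
    if not valid:
        raise ValueError("mask has no valid cells")
    points = []
    for _ in range(n):
        r, c = rng.choice(valid)                      # pick a valid cell
        x = (c + rng.random()) * cell_size            # jitter inside the cell
        y = (nrows - 1 - r + rng.random()) * cell_size
        points.append((x, y))
    return points

mask = [[0, 1],
        [1, 1]]
print(random_points_in_mask(mask, 3, seed=42))
```

Because every cell has the same area, choosing a valid cell uniformly and then a uniform position inside it gives points that are uniform over the masked area.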
1.2. Sample to Table
File formats: DBF or MDB (geodatabase)
Presence: 1 = Observed, 0 = Random
Appended into single table
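The sampling step can be sketched in plain Python as follows: raster values are read at each point, a presence flag is attached, and observed and random records are appended into one table. The tiny grids, the coordinates, and the function name are made up for illustration; the toolbox itself writes DBF or MDB tables.

```python
def sample_to_table(points, presence, rasters, cell_size=1.0):
    """Return one record per point with the raster values at that point.

    rasters maps a name to a grid (list of rows, row 0 at the top,
    origin at the lower-left corner). presence is 1 or 0.
    """
    rows = []
    for x, y in points:
        record = {"x": x, "y": y, "presence": presence}
        for name, grid in rasters.items():
            col = int(x // cell_size)
            row = len(grid) - 1 - int(y // cell_size)
            record[name] = grid[row][col]
        rows.append(record)
    return rows

# Hypothetical 2 x 2 rasters and points.
rasters = {"dem": [[900, 950], [700, 800]],
           "tci": [[5, 7], [12, 9]]}
pts_obs = [(0.5, 1.5)]
pts_rand = [(1.5, 0.5), (0.2, 0.3)]

# Observed = 1, random = 0, appended into a single table.
tbl_env = sample_to_table(pts_obs, 1, rasters) + sample_to_table(pts_rand, 0, rasters)
for record in tbl_env:
    print(record)
```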
1.3. Statistical Plots
• Density Histograms
• Pairs Plot
1.4. Multivariate Regression, GLM
[Diagram labels: tbl_env, Fit Model, Model, rstr_env, Predict, map]
Regression Techniques
Source: Guisan and Zimmermann, 2000. “Predictive habitat distribution models in ecology.” Ecol. Mod.: 135.
Marine Reference: Redfern et al., 2006. “Techniques for Cetacean-habitat Modeling.” MEPS: 310.
1.4. Generalized Linear Model (GLM)
logit(y) ~ β0 + β1·x1 + β2·x2 + … + βm·xm
presence ~ β0 + β1·dem + β2·tci
• Generalizes OLS regression
• binary response: [0-1]
• inv.logit = 1 / (1 + exp(-x))
• categorical (factor), i.e. landcover -> dummy x variables [0,1]
• stepAIC for model selection of ‘best’ predictors
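Two of the bullets above can be made concrete with a short sketch: the logit link with its inverse, and dummy coding of a categorical predictor. The land cover level names are invented for the example, and treating the first level as the reference level is an assumption that mirrors R's default treatment contrasts.

```python
import math

def logit(p):
    """Log-odds of a probability p (0 < p < 1)."""
    return math.log(p / (1.0 - p))

def inv_logit(x):
    """Map a linear predictor x back to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def dummy_code(value, levels):
    """Return 0/1 dummy variables for value, one per non-reference level.

    The first entry of levels is treated as the reference level, so a
    factor with k levels yields k - 1 dummy variables.
    """
    return [1 if value == level else 0 for level in levels[1:]]

levels = ["water", "forest", "urban"]   # hypothetical land cover classes
print(dummy_code("forest", levels))
print(dummy_code("water", levels))
print(inv_logit(logit(0.25)))
```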
1.4. Multivariate Regression, GLM
*_summary.txt
GLM best model, using step-wise AIC selection of variables...
Call:
glm(formula = presence ~ dem + tci, family = binomial(link = "logit"),
    data = samples)
Deviance Residuals:
    Min       1Q   Median       3Q      Max
-3.0314  -0.4194   0.0467   0.6924   2.3991
Coefficients:
              Estimate  Std. Error  z value  Pr(>|z|)
(Intercept)   0.130559    1.461863    0.089     0.929
dem           0.006760    0.001025    6.597  4.19e-11 ***
tci          -0.108406    0.016632   -6.518  7.13e-11 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for binomial family taken to be 1)
    Null deviance: 454.70 on 327 degrees of freedom
Residual deviance: 250.21 on 325 degrees of freedom
AIC: 256.21
Number of Fisher Scoring iterations: 6
*_coefficients.csv
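To show how the fitted model is read, the sketch below plugs the coefficients from the summary into the inverse logit and checks the reported AIC. The dem and tci input values are hypothetical. The AIC check relies on the fact that, for a 0/1 response, the residual deviance equals -2 times the log-likelihood, so AIC = residual deviance + 2 × (number of coefficients).

```python
import math

# Coefficients copied from the GLM summary.
INTERCEPT = 0.130559
B_DEM = 0.006760
B_TCI = -0.108406

def predict_presence(dem, tci):
    """Predicted probability of presence for one cell."""
    eta = INTERCEPT + B_DEM * dem + B_TCI * tci   # linear predictor
    return 1.0 / (1.0 + math.exp(-eta))           # inverse logit

# Hypothetical cell: 1000 m elevation, tci of 10.
print(round(predict_presence(1000, 10), 3))

# AIC check: 3 coefficients (intercept, dem, tci).
residual_deviance = 250.21
n_coefficients = 3
aic = residual_deviance + 2 * n_coefficients
print(round(aic, 2))
```

The signs match the summary: higher elevation raises the predicted probability and higher tci lowers it.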
1.4.c. Generalized Additive Model (GAM)
1.4.b. Classification and Regression Tree (CART)
Marine Example: Benthic Habitat of Rockfish in Monterey (2005 Workshop)
Blue Rockfish
ROV Transects
Carmel Bay, CA
Source: M. Park, CDFG
Benthic Habitat Layers
Depth
Distance to shelf
Distance to kelp
Bottom complexity
Substrate type
Carmel Bay, CA
Benthic Habitat Prediction
1.4. Binary Habitat
Optional arguments:
1. binary map
2. binary threshold
or, use ROC optimum threshold
1.4. Receiver Operating Characteristic (ROC) Curve
Prediction performance
True positive rate (i.e. 1 - omission, or false negative, rate) vs. false positive rate (i.e. commission)
Optimize trade-off
Or assign more risk-averse threshold
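A minimal sketch of the threshold trade-off: for each candidate threshold, compute the true positive and false positive rates, then pick one threshold. The slides do not say which criterion the toolbox uses to optimize the trade-off; maximizing TPR minus FPR (Youden's J) is an assumption chosen here for illustration, and the scores and labels are made up.

```python
def roc_points(scores, labels, thresholds):
    """Return (threshold, tpr, fpr) tuples.

    A cell is predicted as habitat when its score >= threshold.
    labels are 1 (observed presence) or 0 (random/absence).
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    points = []
    for t in thresholds:
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((t, tp / float(n_pos), fp / float(n_neg)))
    return points

def best_threshold(points):
    """Pick the threshold maximizing TPR - FPR (Youden's J, an assumed criterion)."""
    return max(points, key=lambda p: p[1] - p[2])[0]

scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]   # hypothetical predicted probabilities
labels = [1, 1, 0, 1, 0, 0]
points = roc_points(scores, labels, [0.25, 0.5, 0.75])
for point in points:
    print(point)
print(best_threshold(points))
```

A more risk-averse analysis could instead pick the lowest threshold whose true positive rate stays above a chosen floor, accepting more commission error.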
1. Habitat -> 2. Connectivity: Patches
1.->2. Patches and Cost Surface
[Workflow diagram. Data: rstr_glmroc, rstr_regions, rstr_patche, poly_patches, rstr_dem, rstr_slope, rstr_cost. Tools: Region Group, Boundary Clean, Raster to Polygon, Slope, Cost Distance.]
Patches
Distinguish patches (Region Group)
Trim edges (Boundary Clean)
Cost Surface
Accumulate cost from patch
1.->2. Cumulative Cost
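The Region Group step, which distinguishes patches, amounts to labeling connected groups of habitat cells. The sketch below is a plain-Python flood fill over a small made-up binary grid using 4-neighbor connectivity; it illustrates the idea and is not the Spatial Analyst implementation.

```python
def region_group(grid):
    """Label 4-connected groups of 1-cells with ids 1, 2, ...

    Returns (labels, count); cells that are 0 in grid stay 0.
    """
    nrows, ncols = len(grid), len(grid[0])
    labels = [[0] * ncols for _ in range(nrows)]
    count = 0
    for r in range(nrows):
        for c in range(ncols):
            if grid[r][c] != 1 or labels[r][c] != 0:
                continue
            count += 1                      # start a new patch
            labels[r][c] = count
            stack = [(r, c)]
            while stack:                    # flood fill from the seed cell
                i, j = stack.pop()
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ni, nj = i + di, j + dj
                    if (0 <= ni < nrows and 0 <= nj < ncols
                            and grid[ni][nj] == 1 and labels[ni][nj] == 0):
                        labels[ni][nj] = count
                        stack.append((ni, nj))
    return labels, count

habitat = [[1, 1, 0, 0],
           [0, 1, 0, 1],
           [0, 0, 0, 1]]
labels, n_patches = region_group(habitat)
for row in labels:
    print(row)
print(n_patches)
```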
Graph Theory
Relationships between entities
Source: Treml & Halpin, 2006
Social sciences
• Small-world phenomenon
• Six degrees of separation
Complex systems
• Random network theory
• Neural networks
• Scale-free networks
Barabasi & Bonabeau, 2003
www.sojamo.de/iv/index.php
Ecology
• Flow of energy, water or materials
• Movement of individuals
• Habitat characteristic
Urban & Keitt, 2001
Graph Structure
Connectivity data
Source: Treml & Halpin, 2006
Data model
• Distance matrix [D]: rows = From Node i, columns = To Node j, d(i,j) = connectivity
or
• Edge list (from-id, to-id, distance)
• Adjacency matrix (1/0)
• Vertices/Nodes matrix (id, x, y)
• Node properties (area, density, quality, etc.)
Graph representation
• Nodes
• Edge or arc
• Clusters
• Node degree
• Hubs
• Path
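The data model options above are interchangeable. As a small sketch, the function below turns an edge list into a distance matrix [D] and an adjacency matrix, then reads node degree off the adjacency rows. Node ids 0..n-1, an undirected graph, and the sample distances are assumptions for this example.

```python
def edge_list_to_matrices(n_nodes, edges):
    """Build (distance_matrix, adjacency_matrix) from an edge list.

    edges holds (from_id, to_id, distance) tuples with ids 0..n_nodes-1.
    The graph is treated as undirected; unconnected pairs get infinity.
    """
    inf = float("inf")
    dist = [[0.0 if i == j else inf for j in range(n_nodes)]
            for i in range(n_nodes)]
    adj = [[0] * n_nodes for _ in range(n_nodes)]
    for i, j, d in edges:
        dist[i][j] = dist[j][i] = d
        adj[i][j] = adj[j][i] = 1
    return dist, adj

edges = [(0, 1, 2.5), (1, 2, 4.0)]       # hypothetical patch-to-patch distances
dist, adj = edge_list_to_matrices(3, edges)
degree = [sum(row) for row in adj]       # node degree = number of neighbors
print(dist)
print(adj)
print(degree)
```

The edge list is the compact form for sparse networks, while the matrix forms make lookups by node pair direct.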
Ecological Connectivity
Graph properties & behavior
Source: Treml & Halpin, 2006
Graph properties and metrics
• Neighborhood metrics
• Shortest paths
• Betweenness measures
• Identify likely/unlikely routes
• Robustness and resilience – node removal
• Analyze flow structure through network
• ‘Community’ structure, clusters & cliques
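One way to look at robustness under node removal is to count how many disconnected pieces the network breaks into when a patch is lost. The sketch below does this with a plain depth-first search; the four-patch chain is a made-up example.

```python
def count_components(nodes, edges, removed=()):
    """Count connected components after dropping the removed nodes."""
    removed = set(removed)
    neighbors = {n: set() for n in nodes if n not in removed}
    for a, b in edges:
        if a in neighbors and b in neighbors:   # skip edges touching removed nodes
            neighbors[a].add(b)
            neighbors[b].add(a)
    seen = set()
    components = 0
    for start in neighbors:
        if start in seen:
            continue
        components += 1
        seen.add(start)
        stack = [start]
        while stack:                            # depth-first search
            node = stack.pop()
            for other in neighbors[node]:
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
    return components

nodes = ["A", "B", "C", "D"]                    # hypothetical habitat patches
edges = [("A", "B"), ("B", "C"), ("C", "D")]
print(count_components(nodes, edges))                  # intact network
print(count_components(nodes, edges, removed=["B"]))   # lose an interior patch
print(count_components(nodes, edges, removed=["A"]))   # lose an end patch
```

Patches whose removal increases the component count are the ones holding the network together.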
2. Connectivity Modeling
2.1. Triangulated Irregular Network (TIN)
ArcMap ArcScene
Novel TIN Approach
Captures spatial (X,Y) and functional (Z) relationships
Edge length = cumulative cost
Fast
Complexity tweakable
– Max. number of nodes
– Max. allowable Z difference
2.2. Network Least Cost Paths
Dijkstra algorithm: highly efficient compared with the ArcGIS CostPath function
Future: create corridors with CostDistance from paths
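The toolbox relies on the Python NetworkX module for its network paths. To show the algorithm itself without that dependency, here is a minimal standard-library sketch of Dijkstra's algorithm on a weighted graph; the patch names and edge costs are invented for the example.

```python
import heapq

def least_cost_path(graph, source, target):
    """Dijkstra's algorithm on graph = {node: {neighbor: cost}}.

    Returns (total_cost, path); (inf, []) if target is unreachable.
    Edge costs must be non-negative.
    """
    best = {source: 0.0}
    previous = {}
    queue = [(0.0, source)]
    done = set()
    while queue:
        cost, node = heapq.heappop(queue)
        if node in done:
            continue                      # stale queue entry
        done.add(node)
        if node == target:
            break
        for neighbor, weight in graph.get(node, {}).items():
            new_cost = cost + weight
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                previous[neighbor] = node
                heapq.heappush(queue, (new_cost, neighbor))
    if target not in best:
        return float("inf"), []
    path = [target]
    while path[-1] != source:             # walk back to the source
        path.append(previous[path[-1]])
    path.reverse()
    return best[target], path

graph = {                                 # hypothetical edge costs between patches
    "p1": {"p2": 4.0, "p3": 1.0},
    "p2": {"p1": 4.0, "p3": 2.0, "p4": 5.0},
    "p3": {"p1": 1.0, "p2": 2.0, "p4": 8.0},
    "p4": {"p2": 5.0, "p3": 8.0},
}
print(least_cost_path(graph, "p1", "p4"))
```

Working on a network of a few hundred nodes and edges, rather than on every raster cell, is what makes this approach fast.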
Network Centrality Metrics
Degree, Closeness, Betweenness
Brandes, 2000. “Faster Evaluation of Shortest-Path Based Centrality Indices.” CiteSeer.
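Two of the three metrics are short enough to sketch with the standard library. The example below computes degree and closeness centrality on an unweighted, connected graph, using hop counts from a breadth-first search; betweenness is left out because it needs the shortest-path bookkeeping described by Brandes. The four-node star network is a made-up example, and the normalizations used (divide by n - 1) are one common convention.

```python
from collections import deque

def degree_centrality(neighbors):
    """Share of the other nodes each node is directly linked to."""
    n = len(neighbors)
    return {v: len(nbrs) / float(n - 1) for v, nbrs in neighbors.items()}

def closeness_centrality(neighbors):
    """(number of reachable nodes) / (sum of hop distances to them)."""
    result = {}
    for source in neighbors:
        dist = {source: 0}
        queue = deque([source])
        while queue:                       # breadth-first search
            v = queue.popleft()
            for w in neighbors[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total = sum(dist.values())
        result[source] = (len(dist) - 1) / float(total) if total else 0.0
    return result

# Hypothetical star network: B is the hub.
neighbors = {"A": ["B"], "B": ["A", "C", "D"], "C": ["B"], "D": ["B"]}
print(degree_centrality(neighbors))
print(closeness_centrality(neighbors))
```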
Software Requirements
Commercial: ArcGIS 9.0+
– ArcInfo
– Spatial Analyst
– 3D Analyst [CM]
Free/Open-Source
– Download: www.env.duke.edu/geospatial
– Python 2.3.5+ (www.python.org)
– Python NetworkX (networkx.lanl.gov) [CM]
– R 2.0.1+ (www.r-project.org) [HM]
• libraries: mass, rpart, mgcv, maptools, foreign, Rgdal, spatstat
• R COM connector
Developer Perspective
ArcCatalog – Add Script
Script Source
Getting Arguments in Python
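A script tool receives its parameters as command-line style arguments. The sketch below shows one plain way to unpack them; the parameter order, the default threshold, and the file names are assumptions for this example rather than the toolbox's actual interface.

```python
def get_arguments(argv):
    """Return (input_path, output_path, threshold) from an argument list.

    argv[0] is the script name, as in sys.argv. The threshold is
    optional and defaults to 0.5 (an assumed default).
    """
    if len(argv) < 3:
        raise ValueError("usage: script.py <input> <output> [threshold]")
    input_path = argv[1]
    output_path = argv[2]
    threshold = float(argv[3]) if len(argv) > 3 else 0.5
    return input_path, output_path, threshold

# In a real script tool this would be: get_arguments(sys.argv)
# Here a hypothetical argument list is passed in directly.
print(get_arguments(["habitat.py", "pts_obs.shp", "tbl_env.dbf", "0.6"]))
```

Arguments always arrive as text, so numeric parameters need an explicit conversion as shown.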
Python – Programming Glue
PythonWin IDE
R Sourcing
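The toolbox talks to R through the R COM connector. As a simpler stand-in for the same glue idea, the sketch below builds a command line for the Rscript executable and runs it with subprocess. The script name glm_fit.R and its arguments are hypothetical, and running it requires R to be installed and on the PATH.

```python
import subprocess

def build_r_command(script_path, args, rscript="Rscript"):
    """Return the command list that runs an R script with arguments."""
    return [rscript, "--vanilla", script_path] + [str(a) for a in args]

def run_r_script(script_path, args):
    """Run the R script and return its standard output.

    Raises subprocess.CalledProcessError if R exits with an error.
    """
    return subprocess.check_output(build_r_command(script_path, args))

# Only build and show the command here; nothing is executed.
command = build_r_command("glm_fit.R", ["tbl_env.dbf", "dir_plots"])
print(" ".join(command))
```

Inside the R script, commandArgs(trailingOnly = TRUE) would return the two arguments passed after the script name.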
R Spatial Libraries
R package    Functions
maptools     reading/writing shapefiles (points, lines, polygons)
foreign      reading/writing DBASE attribute tables
RODBC        reading/writing Access (geo)database tables
Rgdal        reading/writing ArcGIS rasters
spatstat     point pattern analysis for random sample creation
sp           common R spatial classes for referencing vector and raster objects
R Performance
Map Algebra formulas (GLM, CART) vs. prediction in R (GAM)
Future: simplify GAM prediction with table lookup values and Map Algebra
Works with shapefiles & geodatabases
Future
Open-Source Software Control Hosting with TRAC
GAM with Lookup Table
Improve Error Checking, auto-install libraries
Improve Documentation
Spatial weighted regression (or CAR)
Zero-Inflated Models
Bayesian statistics
Conclusions
Habitat and connectivity modeling accessible to the GIS masses
Provide templates/building blocks for analysis of habitat and connectivity
Framework for continuing to develop ArcGIS integration with R, Python tools
Download/Feedback
www.env.duke.edu/geospatial
Thanks to:
• Scott Loarie, Ben Poulter