Page 1: Parameter estimation for forest modeling Jeremy Jeremy ...

Parameter estimation for forest modeling

Jeremy Lichstein

[email protected]@ufl.edu

September 17, 2010

Princeton University, US Forest Service

Page 2: Parameter estimation for forest modeling Jeremy Jeremy ...

Parameter estimation for forest modeling

Ryan et al. 2010

Page 3: Parameter estimation for forest modeling Jeremy Jeremy ...

Parameter estimation for forest modeling

• Stand-level C pools/fluxes vs. stand age, soil, topography, climate, etc.

– Non-linear functions.

• Demographic parameters for dynamic models (SORTIE, FVS, many others)

– Growth, mortality, and reproductive rates are non-linear functions of environmental conditions (light, temperature, etc.).

• Individual allometry: parameters relating dbh to biomass (total, litter, roots, etc.)

– How to estimate parameters for rare species?

Page 4: Parameter estimation for forest modeling Jeremy Jeremy ...

Goals of today’s talk

• Introduce advanced methods for estimating parameters.

• Introduce free software to execute the methods.

• This is NOT a statistics course, so relax and enjoy!

• Implementing the methods will require a significant time investment.

Page 5: Parameter estimation for forest modeling Jeremy Jeremy ...

Outline

• Maximum Likelihood Estimation

• Non-linear regression with R

• Bayesian Hierarchical Models to estimate parameters for rare species

Page 6: Parameter estimation for forest modeling Jeremy Jeremy ...

Example: linear regression (σ² = 1.5; two unknown parameters)

• “Best fit” line minimizes Sum of Squared Residuals

• Equivalent to maximizing the Likelihood = Probability of the Data given the Model (intercept and slope) = P(y1)P(y2)…P(yN)
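The equivalence between least squares and maximum likelihood can be checked numerically. The sketch below is an illustration only: the data are simulated, the true intercept (2) and slope (0.5) are hypothetical values, and the names `nll.lm`, `fit.ml`, and `fit.ls` are not from the original talk. It assumes normal errors with the variance fixed at 1.5, so that only the intercept and slope are unknown.

```r
# Simulated data (hypothetical values, for illustration only)
set.seed(1)
x <- seq(1, 20, length.out = 50)
sigma2 <- 1.5                      # variance treated as known
y <- 2 + 0.5 * x + rnorm(length(x), sd = sqrt(sigma2))

# Negative log-likelihood for the intercept (par[1]) and slope (par[2])
nll.lm <- function(par) {
  mu <- par[1] + par[2] * x
  -sum(dnorm(y, mean = mu, sd = sqrt(sigma2), log = TRUE))
}

# Numerical search: optim() minimizes, so it is given the negative log-likelihood
fit.ml <- optim(par = c(0, 0), fn = nll.lm, method = "Nelder-Mead",
                control = list(reltol = 1e-12, maxit = 5000))

# Analytical least-squares solution for comparison
fit.ls <- coef(lm(y ~ x))

print(rbind(ML = fit.ml$par, LS = unname(fit.ls)))
```

The two rows should agree to several decimal places, because with a fixed variance the negative log-likelihood is just a rescaled sum of squared residuals plus a constant.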

Page 7: Parameter estimation for forest modeling Jeremy Jeremy ...

Example: linear regression (σ² = 1.5; two unknown parameters)

$$L = \prod_{i=1}^{N} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(y_i - \mu_i)^2}{2\sigma^2}\right)$$

$$\mu_i = \beta_0 + \beta_1 x_i$$

Page 8: Parameter estimation for forest modeling Jeremy Jeremy ...

• Many classical statistical analyses (ANOVA, t-test, linear regression, etc.) are special cases of Maximum Likelihood that can be solved analytically using Calculus.

• In other cases, we use numerical methods to obtain the Maximum Likelihood Estimates, confidence intervals, etc.

Page 9: Parameter estimation for forest modeling Jeremy Jeremy ...

Example: one-parameter likelihood function

10 trials per replicate

• Analytical method: p = (# successes)/(# trials) = 30/40 = 0.75.

• Numerical method: try different values of p between 0 and 1 to see which value maximizes the likelihood.

Bolker 2008
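A minimal sketch of the numerical method, using the pooled counts from this slide (30 successes in 40 trials). The function name `nll.binom` and the grid spacing are assumptions made for illustration; they are not from the original talk.

```r
successes <- 30
trials <- 40

# Negative log-likelihood of the binomial success probability p
nll.binom <- function(p) -dbinom(successes, size = trials, prob = p, log = TRUE)

# Brute force: evaluate the likelihood on a grid of candidate values
p.grid <- seq(0.01, 0.99, by = 0.01)
p.hat.grid <- p.grid[which.min(nll.binom(p.grid))]

# One-dimensional optimizer: same answer without choosing a grid
p.hat <- optimize(nll.binom, interval = c(0.001, 0.999))$minimum

print(c(grid = p.hat.grid, optimize = p.hat, analytical = successes / trials))
```

Both numerical estimates should match the analytical value of 0.75, up to the grid spacing and the optimizer's tolerance.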

Page 10: Parameter estimation for forest modeling Jeremy Jeremy ...

Example: two-parameter likelihood surface

Bolker 2008

Nelder-Mead simplex Metropolis

Page 11: Parameter estimation for forest modeling Jeremy Jeremy ...

Maximum Likelihood example

Basal area vs. stand age (FIA data: FL 1990s)

Variance

• Power law of mean: var = v1*mean^v2

Mean

• y-intercept

• asymptote

• curvature

• trend

Page 12: Parameter estimation for forest modeling Jeremy Jeremy ...

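Figure legend: 1970s, 1980s, 1990s, 2000s

$$\mu_i = \left(1 + \delta\,\frac{\mathrm{year}_i - 1990}{100}\right)\left(p\,b_{\max} + (1 - p)\,b_{\max}\,\frac{\mathrm{age}_i}{k + \mathrm{age}_i}\right)$$

$$\sigma_i^2 = v_1\,\mu_i^{v_2}$$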

Page 13: Parameter estimation for forest modeling Jeremy Jeremy ...

R code, example

library(bbmle) # provides mle2()

## negative log likelihood function
nll = function(biomass,age,yr,p=0.1,bmax=30,k=50,delta=0,v1=1,v2=1){
  mu = (1 + yr*delta/100)*(p*bmax + (1-p)*bmax*age/(k + age)) # mean: temporal trend x saturating age curve
  var = v1*mu^v2 # variance: power law of mean
  -sum(dnorm(biomass,mean=mu,sd=sqrt(var),log=TRUE)) # sum of minus logLik
}

## read FIA data
D = read.csv("c:/Documents/FIA/biomass_data.csv",header=TRUE,sep=",")
D = D[D$state=="FL",] # just Florida
rel.year = D$year - 1990 # measurement year relative to 1990

## fit the model by maximum likelihood
data.list=list(biomass=D$biomass,age=D$age,yr=rel.year) # data
start.list=list(p=0.1, bmax=30, k=100, delta=0, v1=1, v2=1) # initial values
mle.biomass.age = mle2(nll, data=data.list, start=start.list)
mle = coef(mle.biomass.age) # MLEs
ci = confint(mle.biomass.age) # confidence intervals

## plot data and smoother
plot(D$age,D$biomass) # plot data points
lines(lowess(D$age,D$biomass),col='green') # locally weighted regression

## plot curves using MLEs …

Page 14: Parameter estimation for forest modeling Jeremy Jeremy ...

• Free software for Unix, Windows, Mac

• Linear regression, ANOVA, etc.

• Multivariate, spatial, time-series analysis

• Likelihood and Bayesian estimation

• Error propagation (e.g., Monte Carlo)

• Publication-quality graphics

www.r-project.org

Main disadvantage: Not user friendly.

Page 15: Parameter estimation for forest modeling Jeremy Jeremy ...

Comparison of commonly used statistical software

[Figure: flexibility (y-axis) vs. time investment (training) (x-axis) for Excel, JMP, SPSS, SAS, and R]

Page 16: Parameter estimation for forest modeling Jeremy Jeremy ...

Free resources in Spanish and Portuguese

www.r-project.org

Spanish

•“R para Principiantes”

•A Spanish translation of “An Introduction to R”

•“Gráficos Estadísticos con R”

•“Cartas sobre Estadística de la Revista Argentina de Bioingeniería”

•“Introducción al uso y programación del sistema estadístico R”

•“Generacion automatica de reportes con R y LaTeX”

•“Metodos Estadisticos con R y R Commander”

Portuguese

•“Bioestatística usando R”

•“Introdução à Biometria utilizando R”

•“Introdução à Programação em R”

•“Tópicos de Estatística utilizando R”

•“Guia de instalação do R”

Page 17: Parameter estimation for forest modeling Jeremy Jeremy ...

“R Commander” Graphical User Interface for basic statistics

http://socserv.mcmaster.ca/jfox/Misc/Rcmdr/

Page 18: Parameter estimation for forest modeling Jeremy Jeremy ...

Bolker 2008

• Exploratory data analysis (graphics)

• Likelihood and Bayesian statistics

• Download text and example R code for free from Ben Bolker’s website

http://www.math.mcmaster.ca/~bolker/emdbook/index.html

• Or buy from amazon.com for $50

Page 19: Parameter estimation for forest modeling Jeremy Jeremy ...
Page 20: Parameter estimation for forest modeling Jeremy Jeremy ...

Estimating parameters for rare species: Bayesian Hierarchical Modeling

• Would like to have parameter estimates (e.g., allometric equations) for each species in each region, site, soil type, etc.

• Many species, particularly in diverse tropical forests, are rare, so sample sizes are small.

• Some alternatives:

– Group rare species into functional types.

– Assign rare species a mean value from common species.

– Hierarchical estimation: each parameter comes from a probability distribution that is informed by entire dataset (all species).

• All three alternatives combine data from multiple species. Hierarchical analysis provides a non-arbitrary way to do this. Hierarchical models are often fit in a Bayesian context.

Page 21: Parameter estimation for forest modeling Jeremy Jeremy ...

Bayesian vs. Maximum Likelihood

• Prior probability distribution of parameters: What we believe before looking at our data.

P(Model | Data) ∝ P(Data | Model) × P(Model)

Posterior ∝ Likelihood × Prior

• Posterior probability distribution of parameters: What we believe after looking at our data.

• If priors are non-informative:

– Posterior distribution depends primarily on the data.

– Posterior means ≈ MLEs
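To make "Posterior ∝ Likelihood × Prior" concrete, the sketch below reuses the binomial data from the one-parameter example (30 successes in 40 trials). The flat Beta(1, 1) prior is an assumption chosen for illustration, as are all variable names. The code computes the posterior twice: by multiplying likelihood and prior on a grid, and from the standard Beta-binomial conjugate result.

```r
successes <- 30
trials <- 40
a.prior <- 1                        # Beta(1, 1) = flat prior (assumption)
b.prior <- 1

# Grid approximation: posterior is proportional to likelihood x prior
step <- 0.001
p <- seq(step, 1 - step, by = step)
unnorm <- dbinom(successes, trials, p) * dbeta(p, a.prior, b.prior)
post.grid <- unnorm / sum(unnorm * step)      # rescale so the density integrates to ~1

# Conjugate result: the posterior is Beta(a + successes, b + failures)
a.post <- a.prior + successes
b.post <- b.prior + trials - successes
post.exact <- dbeta(p, a.post, b.post)

post.mean <- a.post / (a.post + b.post)                # 31/42
post.mode <- (a.post - 1) / (a.post + b.post - 2)      # 30/40, the MLE
cred.int <- qbeta(c(0.025, 0.975), a.post, b.post)     # 95% credible interval

print(c(mean = post.mean, mode = post.mode))
```

With the flat prior, the posterior mode equals the MLE (0.75) exactly. The posterior mean (31/42 ≈ 0.738) is close to the MLE and approaches it as the sample size grows.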

Page 22: Parameter estimation for forest modeling Jeremy Jeremy ...

Bolker 2008

Page 23: Parameter estimation for forest modeling Jeremy Jeremy ...

Bayesian statistics: Why Bother?

• Bayesian and Maximum Likelihood analyses yield similar inferences in many cases, but hierarchical models and other complex models are easier to fit in a Bayesian context.

• Markov chain Monte Carlo (MCMC): method for drawing random samples from a probability distribution. Can describe Bayesian posterior distribution (mean, percentiles, etc.) even if no analytical solutions are available.

• Can execute with R or WinBUGS (also free):

http://www.mrc-bsu.cam.ac.uk/bugs/
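A minimal sketch of one MCMC algorithm, a random-walk Metropolis sampler, written in base R. It is an illustration under stated assumptions, not the method used in the talk: the target is the posterior of a binomial success probability given 30 successes in 40 trials with a flat prior, and the proposal standard deviation (0.1), chain length, and burn-in are arbitrary choices.

```r
set.seed(42)
successes <- 30
trials <- 40

# Log of the unnormalized posterior: binomial log-likelihood + flat prior on (0, 1)
log.post <- function(p) {
  if (p <= 0 || p >= 1) return(-Inf)     # zero prior probability outside (0, 1)
  dbinom(successes, trials, p, log = TRUE)
}

n.iter <- 20000
chain <- numeric(n.iter)
chain[1] <- 0.5                          # arbitrary starting value
for (i in 2:n.iter) {
  proposal <- chain[i - 1] + rnorm(1, sd = 0.1)      # symmetric random-walk proposal
  log.ratio <- log.post(proposal) - log.post(chain[i - 1])
  # Accept with probability min(1, posterior ratio); otherwise keep the current value
  chain[i] <- if (log(runif(1)) < log.ratio) proposal else chain[i - 1]
}

samples <- chain[-(1:2000)]              # discard burn-in
print(mean(samples))
print(quantile(samples, c(0.025, 0.975)))
```

The sampler only needs the posterior up to a constant, which is why it works when no analytical solution exists. Here the exact posterior is Beta(31, 11), so the sample mean can be checked against 31/42.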

Page 24: Parameter estimation for forest modeling Jeremy Jeremy ...

Hierarchical Estimation

Each parameter is a sample from a probability distribution, whose parameters we must also estimate.

Parameter estimates for species X are a compromise between data for species X and the parameter sampling distributions.
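The "compromise" can be illustrated with the simplest hierarchical case, a normal-normal model in which the variances are treated as known. In a real analysis they would be estimated along with everything else. The function `shrink` and all numerical values below are hypothetical, chosen only to show how the weight on a species' own data depends on its sample size.

```r
# Posterior mean of a species-level parameter under a normal-normal model.
#   ybar   : sample mean for the species
#   n      : sample size for the species
#   sigma2 : within-species (observation) variance, assumed known
#   mu     : mean of the among-species distribution
#   tau2   : variance of the among-species distribution, assumed known
shrink <- function(ybar, n, sigma2, mu, tau2) {
  w <- (n / sigma2) / (n / sigma2 + 1 / tau2)   # weight on the species' own data
  c(weight = w, estimate = w * ybar + (1 - w) * mu)
}

# Same sample mean, very different sample sizes (hypothetical numbers)
common <- shrink(ybar = 1.2, n = 500, sigma2 = 1, mu = 0.8, tau2 = 0.1)
rare   <- shrink(ybar = 1.2, n = 3,   sigma2 = 1, mu = 0.8, tau2 = 0.1)

print(rbind(common, rare))
```

The common species keeps an estimate close to its own sample mean, while the rare species is pulled strongly toward the among-species mean.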

Page 25: Parameter estimation for forest modeling Jeremy Jeremy ...

Hierarchical Estimation: Example using taxonomic hierarchy

Each parameter is a sample from a probability distribution that may be informed by:

• Covariates

– Soil, topography, local climate, etc.

• Taxonomy: Rare species parameter estimate ≈

– genus-level mean if taxonomic signal is strong

– overall mean if taxonomic signal is weak

[Diagram: taxonomic hierarchy]

division

order 1, order 2, order 3, …

family 1, family 2, family 3, …

genus 1, genus 2, genus 3, …

species 1, species 2, species 3, …
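How the strength of the taxonomic signal moves a rare species' estimate can be sketched with a simplified two-level normal model. The simplifications are assumptions for illustration: only the genus and overall levels are used, all variances are known, and the rare species has no data of its own, so its estimate is the estimated genus mean. The function name and numbers are hypothetical.

```r
# Estimate for a data-poor species = posterior mean of its genus-level parameter.
#   genus.mean   : average parameter value of well-sampled congeners
#   overall.mean : mean across all genera
#   n.congeners  : number of well-sampled species in the genus
#   tau2         : among-genus variance (large = strong taxonomic signal)
#   omega2       : among-species variance within a genus
rare.species.estimate <- function(genus.mean, overall.mean, n.congeners, tau2, omega2) {
  w <- tau2 / (tau2 + omega2 / n.congeners)   # weight on the genus-level mean
  w * genus.mean + (1 - w) * overall.mean
}

# Strong signal: genera differ a lot, species within a genus are similar
strong <- rare.species.estimate(0.8, 0.5, n.congeners = 5, tau2 = 1, omega2 = 0.01)

# Weak signal: genera barely differ, species within a genus vary widely
weak <- rare.species.estimate(0.8, 0.5, n.congeners = 5, tau2 = 0.0001, omega2 = 1)

print(c(strong = strong, weak = weak))
```

Under a strong signal the estimate sits near the genus-level mean (0.8 here). Under a weak signal it falls back to the overall mean (0.5 here). In a full hierarchical analysis the variances are estimated from the data, so the analyst does not set this weight by hand.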

Page 26: Parameter estimation for forest modeling Jeremy Jeremy ...

Shade-tolerance index for 315 U.S. tree species: proportion of saplings in understory (FIA data)

rare species (n < 10): data not informative

Lichstein et al. 2010, Ecological Applications

Page 27: Parameter estimation for forest modeling Jeremy Jeremy ...

Height allometry for 315 U.S. tree species

[Figure: six panels (Magnolia sp., Magnolia grandiflora, Magnolia macrophylla, Magnolia acuminata, Magnolia virginiana, Magnolia fraseri); x-axis: log10[diameter (cm)]; y-axis: log10[height (m)]]

Page 28: Parameter estimation for forest modeling Jeremy Jeremy ...

Height allometry for 315 U.S. tree species

[Figure: six panels (Sabal sp., Quercus rubra, Quercus durandii, Sorbus americana, Quercus arizonica, Quercus lobata); x-axis: log10[diameter (cm)]; y-axis: log10[height (m)]]

Page 29: Parameter estimation for forest modeling Jeremy Jeremy ...

Hierarchical modeling allows you to use all available data without making arbitrary decisions about how to group rare species.

Page 30: Parameter estimation for forest modeling Jeremy Jeremy ...

Should you worry about all of this?

• Complex statistical tools are not always necessary or desirable.

• If you can answer the questions that you want using the methods you are familiar with… don’t worry, be happy.