Height–diameter models with stochastic differential equations and mixed-effects parameters


ORIGINAL ARTICLE

Height–diameter models with stochastic differential equations and mixed-effects parameters

Petras Rupšys

Received: 7 March 2013 / Accepted: 15 May 2014

© The Japanese Forest Society and Springer Japan 2014

Abstract Height–diameter modeling is most often performed using non-linear regression models based on ordinary differential equations. In this study, new models of tree height dynamics involving a stochastic differential equation and mixed-effects parameters are examined. We use a stochastic differential equation to describe the dynamics of the height of an individual tree. The first model is defined by a Gompertz shape stochastic differential equation. The second Gompertz shape stochastic differential equation model with a threshold parameter can be considered an extension of the three-parameter stochastic Gompertz process through the addition of a fourth parameter. The parameters are estimated through discrete sampling of diameter and height and through the maximum likelihood procedure. We use data from tropical Atlantic moist forest trees to validate our modeling technique. The results indicate that our models are able to capture tree height behavior quite accurately. All the results are implemented in the MAPLE symbolic algebra system.

Keywords Conditional density function · Diameter · Height · Stochastic differential equation · Threshold parameter

Introduction

Accurate information about total tree height (h) and diameter at breast height (1.37 m above ground, d) is essential for effective forest management, particularly in intensively managed production forests, and is used to predict total tree volume, aboveground biomass, and other important factors affecting forest growth and carbon budget models under a wide variety of conditions (Kenzo et al. 2009; Hosoda and Iehara 2010; Zeng et al. 2010). Diameter at breast height is the most commonly measured variable in forest inventories. In most applications, height–diameter models are used to predict the height of an individual tree when only diameter is known. These relationships vary substantially among different sites and stand conditions. The regression function of the height–diameter model is usually a parametric growth function, such as a Gompertz, logistic, Richards, or Weibull function, each of which prescribes monotonic growth. The efficacy of these S-shaped and concave-shaped models in forest modeling has been proven (Liang and Fei 2000; Temesgen and Gadow 2004; Lumbres et al. 2012; Scaranello et al. 2012). In even-aged stands, a height equation may predict the increase in height as a function of age (Garcia 1983; Vanclay 1995).

Over the last decade, height growth models with mixed-effects parameters have attracted increasing research attention (Calama and Montero 2004; VanderSchaaf 2012). These models are supplemented by a model of the between-stand variation in the model parameters and a model of the variation in the residuals that assumes independence and constant variance such that the residuals are uncorrelated. However, the variance over the full range of predicted values is not homogeneous, and it is well known that a violation of this basic statistical assumption may lead to erroneous estimates of tree height. The newly developed non-linear mixed-effects height–diameter models based on stochastic differential equations (SDEs) extend the usual non-linear mixed-effects regression models through the inclusion of system noise as an additional source of variation in the first-stage model. This extended model describes the within-stand variation in the data through two sources of noise: measurement noise, which represents the uncorrelated part of the residual variability associated with the assay and/or sampling errors, and system noise, which reflects the random fluctuations around the corresponding theoretical height–diameter model. If the magnitude of the parameter capturing system noise, $\sigma$, is zero, the entire system noise term vanishes, and the remaining part of the SDE is simply an ordinary differential equation, the solution to which is the regression term of the mixed-effects model. The pioneers of the SDE approach in forest growth modeling are Suzuki (1971) and Tanaka (1986, 1988). The aim of this paper is to model the between- and within-stand variations in tree height using an SDE belonging to the Ornstein–Uhlenbeck family (Uhlenbeck and Ornstein 1930).

P. Rupšys (✉)
Institute of Forest Management and Wood Sciences, Aleksandras Stulginskis University, Studentu 11, 53361 Kaunas, Lithuania
e-mail: [email protected]

J For Res
DOI 10.1007/s10310-014-0454-1

The modeling of the height–diameter process leads to an equation for the stochastic variable (height), such as an SDE, or an equation that predicts how the probability density function for height changes with respect to diameter. The focus of the present work is on a mixed-effects SDE with a drift term that depends on the random-effects parameters and a diffusion term that does not depend on any random-effects parameters. More precisely, this work considers $M$ real-valued stochastic processes of height, $H_i(d)$, $d \ge 0$, $i = 1, 2, \ldots, M$ ($M$ different stands), with dynamics that are ruled by a Gompertz shape SDE. The mixed-effects model can be written in the form of a two-stage model that explicitly specifies within- and between-stand variations. The mixed-effects stochastic height–diameter dynamical model allows us to reduce the unexplained variability in height. In recent decades, few models have explained the stochastic behavior of diameter and height (Rupšys et al. 2007; Rupšys and Petrauskas 2010a, b, 2012). These dynamics are basically the classical age-varying deterministic logistic growth dynamics extended by a level-dependent diffusion term. In reality, external factors, such as climate, terrain, the presence of other tree species, and indeed any factor that has an uncertain effect on the height of a tree, will also affect the intrinsic growth rate. Such factors can be modeled by adding an external random term to the intrinsic growth rate, $a$, that represents this environmental stochasticity. The basic stochastic dynamics model for the height process, $H_i(d)$, $d \ge 0$, of the $i$th stand ($i = 1, 2, \ldots, M$) can be described by the Ito (1942) univariate SDE,

$$dH_i(d) = \mu(H_i(d), \theta, \phi_i)\,dd + b(H_i(d), \theta)\,dW_i(d) \tag{1}$$

starting from an initial point $H_i(0) = 1.37$, where $W_i(d)$, $d \ge 0$, represents standard Brownian motion and $M$ is the total number of stands used for model fitting. Intuitively, in this work, the term $dW_i(\cdot)$ is interpreted as ecological and environmental noise. Parametric approaches assume that the drift $\mu(H(d), \theta, \phi_i)$ and diffusion $b(H(d), \theta)$ are known functions, with the exception of an unknown fixed-effects parameter vector $\theta$ and a random-effects parameter $\phi_i$. The random-effects parameter, $\phi_i$, $i = 1, 2, \ldots, M$, varies from stand to stand to account for the between-stand variation. We assume that the random-effects parameter, $\phi_i$, is normally distributed with a mean of 0 and a standard deviation $\sigma_\phi$. Because no repeated measurements of a tree are included in the database used for model fitting, it is assumed that trees from the same stand do not show any pattern of temporal correlation. Parametric SDEs often provide a convenient way to describe the dynamics of tree data (Rupšys and Petrauskas 2010a, b, 2012), and a great deal of effort has been expended searching for efficient ways to estimate model parameters. The maximum likelihood approach is typically the estimator of choice.
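To make the structure of Eq. 1 concrete, the following is a minimal sketch of how one height path could be simulated with the Euler–Maruyama scheme. The scheme itself, the function names, and the parameter values are assumptions made for illustration (the paper works with exact transition densities in MAPLE rather than with simulation); the drift and diffusion used here are the Gompertz forms introduced later as Eq. 5.

```python
import math
import random

def euler_maruyama(drift, diffusion, h0, d_max, n_steps, rng):
    """Simulate one path of dH = drift(H) dd + diffusion(H) dW on [0, d_max]."""
    step = d_max / n_steps
    h = h0
    path = [h]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(step))  # Brownian increment over one step
        h = h + drift(h) * step + diffusion(h) * dw
        path.append(h)
    return path

# Illustrative (assumed) values of a, b, sigma and the stand effect phi_i
a, b, sigma, phi_i = 0.37, 0.13, 0.15, 0.0

def gompertz_drift(h):
    # Drift of the Gompertz shape SDE: (a + phi_i) H - b H ln(H)
    return (a + phi_i) * h - b * h * math.log(h)

def gompertz_diffusion(h):
    # Level-dependent diffusion: sigma H
    return sigma * h

rng = random.Random(1)
path = euler_maruyama(gompertz_drift, gompertz_diffusion, 1.37, 60.0, 6000, rng)
print(round(path[-1], 2))  # simulated height (m) at d = 60 cm
```

Setting the diffusion to zero in this sketch reduces the path to a numerical solution of the deterministic Gompertz equation, which mirrors the remark that the system noise term vanishes when its magnitude is zero.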

Following the recent trend in SDEs, the focus is on developing a stochastic height–diameter model using a Gompertz shape SDE with a mixed-effects parameter. This SDE is reducible to an Ornstein–Uhlenbeck process (Uhlenbeck and Ornstein 1930). Multivariate models can address, for instance, multiple explanatory factors (e.g., diameter and density) in assessing tree height.

An aim of this study is to discuss the advantages of using SDEs with a mixed-effects parameter for the analysis of height–diameter relationships and to illustrate how an adequate model can be constructed. The greatest advantage of the mixed-effects modeling approach is the ability to calibrate the model's parameters using data independent of those data used for model fitting. The present work also discusses how a conditional density function can be used to construct maximum likelihood estimators and presents an application of the SDE approach for the study of the height–diameter dynamics of tropical trees. A MAPLE macro program is implemented to conduct the calculations required for the maximum likelihood methodology described in the Appendix.

Materials and methods

The focus of the present work is on the dynamics of height as a stochastic process, $H(d)$, with respect to diameter, $d$. In this study, we use the deterministic ordinary differential equation developed by Gompertz (1825) as the basis of the newly developed stochastic model. The changes in tree height, $h(d)$, are described using the ordinary differential equation

$$\frac{dh(d)}{dd} = a\,h(d) - b\,h(d)\ln(h(d)), \tag{2}$$

where $a$ is the intrinsic growth rate of the height and $b$ is the growth deceleration factor. The parameters $a$ and $b$ characterize the evolution of the height of different tree species and stands.

There are alternative ways of introducing stochasticity into the behavior of tree height. In this work, the randomness in the tree height function is defined by standard Brownian motion. Therefore, the complete deterministic model defined by Eq. 2 for tree height is converted to a stochastic model assuming that the intrinsic growth rate varies randomly around the mean

$$a(d) = a + \sigma e(d), \tag{3}$$

where $a$ is the constant mean value of $a(d)$, $\sigma$ is the diffusion coefficient, and $e(d)$ is a Gaussian white noise process. The relationship between total tree height and diameter is altered by environmental conditions. Stand-specific characteristics, such as soil type, nutrient status, and elevation, cause parameters to vary between different stands. In the case of between-stand variation, the parameters $a$ and $b$ vary from stand to stand and hence account for this variation. The interest of this study lies in the development of height–diameter models for a large geographic region rather than localized areas. Thus, specific stands may have what are generally termed "random parameters" in mixed-effects model terminology. To construct a mixed-effects model, one must first determine which parameters should be considered mixed and which should be considered purely fixed. The parameters with high variability could be considered mixed-effects parameters. The parameter $a$ exhibits high variation between stands (see Table 2, below) and thus can be altered by adding stand-specific random effects to the fixed-effects parameter to produce a stand-specific parameter of the following form:

$$a + \phi_i, \tag{4}$$

where $\phi_i$ ($i = 1, 2, \ldots, M$) are stand-specific random effects. It is assumed that the random effects $\phi_i$ ($i = 1, 2, \ldots, M$) are independent and normally distributed with a mean of 0 and a constant variance ($\phi_i \sim N(0, \sigma_\phi^2)$). Therefore, the total tree height, $H_i(d)$, $d \ge 0$, $i = 1, 2, \ldots, M$, is described using an SDE of the Gompertz form,

$$dH_i(d) = \left[(a + \phi_i) H_i(d) - b H_i(d) \ln(H_i(d))\right] dd + \sigma H_i(d)\, dW_i(d), \quad P(H_i(0) = 1.37) = 1, \quad d \in [0, D_0], \tag{5}$$

where $W_i(d)$, $d \ge 0$, are independent standard Brownian motions, $W_i(d)$ and $\phi_i$ are assumed to be mutually independent for all $1 \le i \le M$, and $M$ is the total number of stands used for model fitting. The term $P(H_i(0) = 1.37) = 1$ ensures that if $d = 0$, then $h = 1.37$. By Ito's (Ito 1942) lemma, Eq. 5 implies that the logarithmic transformation $h \mapsto \ln(h)$ follows an Ornstein–Uhlenbeck process. This transformation changes the state space $\mathbb{R}_+$ into $\mathbb{R}$ and allows us to obtain the conditional probability density function for the considered height process, yielding

$$f(h, d) = \frac{1}{h\sqrt{2\pi v(d)}} \exp\left(-\frac{1}{2v(d)}\left(\ln h - \mu(d)\right)^2\right), \tag{6}$$

which corresponds to a lognormal distribution, $\Lambda_1(\mu(d), v(d))$, where

$$\mu(d) = \ln(1.37)\,e^{-bd} + \left(a + \phi_i - \frac{\sigma^2}{2}\right)\frac{1 - e^{-bd}}{b}, \tag{7}$$

$$v(d) = \frac{1 - e^{-2bd}}{2b}\,\sigma^2. \tag{8}$$

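The conditional mean trend and variance functions, $h(\cdot)$ and $w(\cdot)$, of the stochastic height process are given by the following expressions (Rupšys and Petrauskas 2012):

$$h(d) = \exp\left(\ln(1.37)\,e^{-bd} + \frac{1 - e^{-bd}}{b}\left(a + \phi_i - \frac{\sigma^2}{2}\right) + \frac{\sigma^2}{4b}\left(1 - e^{-2bd}\right)\right), \tag{9}$$

$$w(d) = \exp\left(2\left(\ln(1.37)\,e^{-bd} + \frac{1 - e^{-bd}}{b}\left(a + \phi_i - \frac{\sigma^2}{2}\right)\right) + \frac{\sigma^2}{2b}\left(1 - e^{-2bd}\right)\right) \times \left(\exp\left(\frac{\sigma^2}{2b}\left(1 - e^{-2bd}\right)\right) - 1\right). \tag{10}$$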
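The mean and variance trends of the first model can be evaluated directly. The sketch below codes the lognormal parameters and the moment formulas (Eqs. 7 to 10) in plain Python; the function names are assumptions for illustration, and the parameter values are the pooled fixed-effects estimates reported for Eq. 5 in Table 2 ("All levels").

```python
import math

H0 = 1.37  # breast height (m), the starting point of every height path

def mu(d, a, b, sigma, phi=0.0):
    """Eq. 7: mean of ln H(d)."""
    decay = math.exp(-b * d)
    return math.log(H0) * decay + (a + phi - sigma**2 / 2) * (1 - decay) / b

def v(d, b, sigma):
    """Eq. 8: variance of ln H(d)."""
    return sigma**2 * (1 - math.exp(-2 * b * d)) / (2 * b)

def mean_height(d, a, b, sigma, phi=0.0):
    """Eq. 9: conditional mean, exp(mu + v / 2)."""
    return math.exp(mu(d, a, b, sigma, phi) + v(d, b, sigma) / 2)

def var_height(d, a, b, sigma, phi=0.0):
    """Eq. 10: conditional variance, exp(2 mu + v) (exp(v) - 1)."""
    vd = v(d, b, sigma)
    return math.exp(2 * mu(d, a, b, sigma, phi) + vd) * (math.exp(vd) - 1)

# Pooled fixed-effects estimates for Eq. 5 from Table 2 ("All levels")
a, b, sigma = 0.3698, 0.1326, 0.1521
for d in (5, 20, 50, 100):
    sd = math.sqrt(var_height(d, a, b, sigma))
    print(d, round(mean_height(d, a, b, sigma), 2), round(sd, 2))
```

At $d = 0$ the mean equals 1.37 m and the variance is zero, and for large $d$ both settle to constants, which is the stationary behavior expected of a process reducible to Ornstein–Uhlenbeck form.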

The next focus of this study is on the development of a new stochastic Gompertz-type height–diameter model with a fixed-effects threshold parameter $c$. This model can be considered an extension of the stochastic Gompertz height process with three fixed-effects parameters defined by Eq. 5 through the addition of a fourth fixed-effects threshold parameter (Gutierrez et al. 2006). Hence, the height, $H_i(d)$, $i = 1, 2, \ldots, M$, is described by an SDE of the form

$$dH_i(d) = \left[(a + \phi_i)(H_i(d) - c) - b\,(H_i(d) - c)\ln(H_i(d) - c)\right] dd + \sigma\,(H_i(d) - c)\, dW_i(d), \quad P(H_i(0) = 1.37) = 1, \quad d \in [0, D_0]. \tag{11}$$

The conditional probability density function for the considered height process (Eq. 11) is defined in the following form

$$f_t(h, d) = \frac{1}{(h - c)\sqrt{2\pi v_t(d)}} \exp\left(-\frac{1}{2v_t(d)}\left(\ln(h - c) - \mu_t(d)\right)^2\right), \tag{12}$$

which corresponds to a lognormal distribution, $\Lambda_1(\mu_t(d), v_t(d))$, where

$$\mu_t(d) = \ln(1.37 - c)\,e^{-bd} + \left(a + \phi_i - \frac{\sigma^2}{2}\right)\frac{1 - e^{-bd}}{b}, \tag{13}$$

$$v_t(d) = \frac{1 - e^{-2bd}}{2b}\,\sigma^2. \tag{14}$$

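The conditional mean trend and variance functions of the height process, $h_t(\cdot)$ and $w_t(\cdot)$, respectively, are given by the following expressions:

$$h_t(d) = c + \exp\left(\ln(1.37 - c)\,e^{-bd} + \frac{1 - e^{-bd}}{b}\left(a + \phi_i - \frac{\sigma^2}{2}\right) + \frac{\sigma^2}{4b}\left(1 - e^{-2bd}\right)\right), \tag{15}$$

$$w_t(d) = \exp\left(2\left(\ln(1.37 - c)\,e^{-bd} + \frac{1 - e^{-bd}}{b}\left(a + \phi_i - \frac{\sigma^2}{2}\right)\right) + \frac{\sigma^2}{2b}\left(1 - e^{-2bd}\right)\right) \times \left(\exp\left(\frac{\sigma^2}{2b}\left(1 - e^{-2bd}\right)\right) - 1\right). \tag{16}$$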
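As an illustration of the threshold model, the sketch below evaluates the shifted lognormal density of Eq. 12 and the mean trend of Eq. 15. The function names are assumptions for illustration, and the parameter values are the pooled fixed-effects estimates reported for Eq. 11 in Table 2 ("All levels"). The density is only defined for $d > 0$, because the variance in Eq. 14 is zero at $d = 0$.

```python
import math

H0 = 1.37  # starting height (m) at d = 0

def mu_t(d, a, b, sigma, c, phi=0.0):
    """Eq. 13: mean of ln(H(d) - c)."""
    decay = math.exp(-b * d)
    return math.log(H0 - c) * decay + (a + phi - sigma**2 / 2) * (1 - decay) / b

def v_t(d, b, sigma):
    """Eq. 14: variance of ln(H(d) - c)."""
    return sigma**2 * (1 - math.exp(-2 * b * d)) / (2 * b)

def density_t(h, d, a, b, sigma, c, phi=0.0):
    """Eq. 12: conditional density of height h at diameter d > 0 (zero for h <= c)."""
    if h <= c:
        return 0.0
    var = v_t(d, b, sigma)
    z = math.log(h - c) - mu_t(d, a, b, sigma, c, phi)
    return math.exp(-z * z / (2 * var)) / ((h - c) * math.sqrt(2 * math.pi * var))

def mean_t(d, a, b, sigma, c, phi=0.0):
    """Eq. 15: conditional mean height."""
    return c + math.exp(mu_t(d, a, b, sigma, c, phi) + v_t(d, b, sigma) / 2)

# Pooled fixed-effects estimates for Eq. 11 from Table 2 ("All levels")
a, b, sigma, c = 0.2337, 0.0709, 0.0560, -9.681
print(round(mean_t(30.0, a, b, sigma, c), 2))  # mean height (m) at d = 30 cm
```

Integrating `density_t` numerically over height should return a total probability of one and reproduce `mean_t`, which is a convenient sanity check on any reimplementation of Eqs. 12 to 15.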

Data

The next aim of this study is to model a tropical Atlantic forest tree dataset. The tropical forest tree height–diameter database published by Scaranello et al. (2012) is analyzed, which includes 280 individual tree height and diameter measurements across stands along an altitudinal gradient. The dataset contains both stand- and tree-level information. The stand-level information includes the altitude. A total of four different altitudes (stands) are used in this study. One objective is to improve the understanding of tropical tree variability and, using mixed-effects parameters, reduce the uncertainty in tree height estimates at the altitudinal scale. The statistics of the diameter outside the bark at breast height (d) and the total height (h) of all of the trees used for parameter estimation are summarized in Table 1.

Results and discussion

To examine the effect of random-effects parameters on height predictions, Eqs. 5 and 11 are initially fitted without the random-effects parameters using the MAPLE 11 computational algebra system (Monagan et al. 2007). This analysis is performed by assuming that the within-stand variance is homogeneous and that the residuals are uncorrelated. Models with fixed-effects and random-effects parameters are evaluated based on Akaike's information criterion (AIC), which is defined as

$$AIC = -2LL_a + 2p, \quad a = f, m, \tag{17}$$

where $LL_a$ is the log-likelihood function defined by Eq. 28 for the fixed-effects parameters ($a = f$) and by Eq. 30 for the mixed-effects parameters ($a = m$), and $p$ is the number of parameters in the model; the models are also evaluated based on numerical and graphical analyses of the residuals. The model with the smallest AIC value is considered to be the best. Using the estimation dataset summarized in Table 1, the parameters of the stochastic height–diameter models (Eqs. 5 and 11) are estimated by the maximum log-likelihood procedure (Eqs. 28 and 30) using the NLPSolve procedure in MAPLE 11. The parameter estimation results and the Akaike's information criterion are summarized in Table 2.
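The model comparison step can be sketched as follows. The log-likelihood of Eq. 28 is given in the Appendix and is not reproduced in this section, so the code assumes it is the sum of the log conditional densities of Eq. 6 over all measured trees, and it uses the standard definition AIC = −2LL + 2p. The function names and the five (d, h) pairs are hypothetical and serve only to make the example runnable.

```python
import math

H0 = 1.37  # starting height (m) at d = 0

def log_density(h, d, a, b, sigma):
    """Logarithm of the lognormal conditional density in Eq. 6 (requires d > 0)."""
    decay = math.exp(-b * d)
    mu = math.log(H0) * decay + (a - sigma**2 / 2) * (1 - decay) / b
    var = sigma**2 * (1 - math.exp(-2 * b * d)) / (2 * b)
    z = math.log(h) - mu
    return -math.log(h) - 0.5 * math.log(2 * math.pi * var) - z * z / (2 * var)

def log_likelihood(trees, a, b, sigma):
    """Assumed fixed-effects log-likelihood: sum of log densities over (d, h) pairs."""
    return sum(log_density(h, d, a, b, sigma) for d, h in trees)

def aic(ll, n_params):
    """Standard Akaike information criterion."""
    return -2.0 * ll + 2.0 * n_params

# Hypothetical (d in cm, h in m) pairs, invented for illustration only
trees = [(5.0, 4.5), (12.0, 9.0), (20.0, 11.5), (35.0, 14.0), (60.0, 18.0)]
ll = log_likelihood(trees, a=0.3698, b=0.1326, sigma=0.1521)
print(round(aic(ll, n_params=3), 2))
```

In practice the three parameters would be chosen by a numerical optimizer that maximizes `log_likelihood`, which is the role the NLPSolve procedure plays in the MAPLE implementation.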

Among the newly developed height–diameter models, the smallest AIC value is obtained for the stochastic Gompertz-type height–diameter model defined by Eq. 11 with threshold parameter $c$ and mixed-effects parameter $a$. Thus, the results of this study suggest that, in terms of AIC, the stochastic Gompertz-type height–diameter model with threshold parameter $c$ and mixed-effects parameter $a$ is superior to the other models used to estimate tree height.

The inclusion of the random-effects parameter, $\phi$, allows for the modeling of the variability among different stands, provides consistent estimates of the fixed-effects parameters, $a$, $b$, $\sigma$, and $c$, and improves the predictions if it is possible to estimate (calibrate) the random effects for a particular stand. The random parameter calibrated in such a way is added to the fixed parameter to obtain a localized parameter. The mean responses (population average trend dynamics) are obtained with the newly developed fixed-effects models and mixed-effects models by setting the random effects equal to zero, $E(\phi) = 0$. To understand the advantages of the newly developed height–diameter models in different regions (altitudes), the fixed-effects models, the mixed-effects models, and the mixed-effects models with random effects set to zero are used to predict the tree height over the entire dataset and to predict the tree height over each region.

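Table 1 Summary statistics of the dataset

| Altitude | Count | Variable | Min | Max | Mean | SD |
|---|---|---|---|---|---|---|
| Sea level | 61 | d (cm) | 4.8 | 76.9 | 20.4 | 15.3 |
| | | h (m) | 3.0 | 19.0 | 10.2 | 4.2 |
| 100 m | 73 | d (cm) | 6.0 | 75.1 | 30.5 | 20.0 |
| | | h (m) | 4.0 | 22.0 | 11.7 | 4.7 |
| 400 m | 77 | d (cm) | 4.9 | 79.0 | 30.6 | 20.7 |
| | | h (m) | 4.0 | 25.0 | 11.3 | 4.9 |
| 1,000 m | 79 | d (cm) | 4.9 | 100.4 | 28.6 | 24.3 |
| | | h (m) | 3.5 | 30.0 | 13.4 | 6.6 |
| All levels | 280 | d (cm) | 4.8 | 100.4 | 27.8 | 20.9 |
| | | h (m) | 3.0 | 30.0 | 11.7 | 5.3 |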


The performance statistics of the newly developed height equations include three statistical indices: the mean prediction bias ($B$), the root mean square error (RMSE), and an adjusted coefficient of determination ($\bar{R}^2$):

$$B = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right), \tag{18}$$

$$RMSE = \sqrt{\frac{1}{n - p}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}, \tag{19}$$

$$\bar{R}^2 = 1 - \frac{n - 1}{n - p}\cdot\frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}, \tag{20}$$

where $n$ is the total number of observations used to fit the height–diameter models, $p$ is the number of model parameters, and $y_i$, $\hat{y}_i$, and $\bar{y}$ are the measured, predicted, and average values of the dependent variable (total tree height), respectively.
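The three indices in Eqs. 18 to 20 are straightforward to compute. The following sketch is one possible implementation; the function name and the four observed and predicted heights are hypothetical and are not taken from the dataset.

```python
import math

def fit_statistics(observed, predicted, n_params):
    """Return (B, RMSE, adjusted R^2) as defined in Eqs. 18-20."""
    n = len(observed)
    residuals = [y - y_hat for y, y_hat in zip(observed, predicted)]
    sse = sum(r * r for r in residuals)            # sum of squared residuals
    y_bar = sum(observed) / n
    sst = sum((y - y_bar) ** 2 for y in observed)  # total sum of squares
    bias = sum(residuals) / n                                    # Eq. 18
    rmse = math.sqrt(sse / (n - n_params))                       # Eq. 19
    adj_r2 = 1.0 - (n - 1) / (n - n_params) * sse / sst          # Eq. 20
    return bias, rmse, adj_r2

# Hypothetical heights (m), invented for illustration only
observed = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 12.0, 13.0, 17.0]
bias, rmse, adj_r2 = fit_statistics(observed, predicted, n_params=1)
print(bias, rmse, adj_r2)
```

Note that the RMSE in Eq. 19 divides by $n - p$ rather than $n$, so it penalizes models with more parameters in the same way as the adjusted coefficient of determination.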

Table 3 displays the performance statistics for the height–diameter models defined by Eqs. 5 and 11 for all three scenarios (the fixed-effects models, the mixed-effects models, and the mixed-effects models with random effects set to zero), and the results illustrate the extent to which the inclusion of the random-effects parameter improves the statistical indices. Compared with the basic fixed-effects models, the mixed-effects models exhibit better performance with lower bias, lower root mean square error, and a higher adjusted coefficient of determination over the entire dataset. Both the mixed-effects models with random effects set to zero and the mixed-effects models show worse performance with greater bias, higher root mean square error, and a lower adjusted coefficient of determination for each regional dataset (altitude). In general, both SDE models with a mixed-effects parameter (Eqs. 5 and 11) produce relatively high root mean square errors (3.16 and 2.96 m, respectively) and explain a relatively low proportion (65.2 and 69.5 %, respectively) of the total variation in the observed values of tree height. Nevertheless, these results may not be surprising because the height–diameter relationships observed in the data are highly scattered. The plots of the residuals as a function of the height estimated over the entire dataset show that the residuals of the mixed-effects model defined by Eq. 11 are distributed more symmetrically around zero with an approximately constant variance compared with the other models.

For the evaluation of the goodness-of-fit of the newly developed stochastic Gompertz shape height–diameter models (Eqs. 5 and 11), the Shapiro–Wilk statistic and the normal probability plot are also used. For both mixed-effects models, the p value of the Shapiro–Wilk statistic exceeds 0.01. The normal probability plots of the pseudo-residuals, obtained using the estimates of the parameters presented in Table 2, show that the fits of both height–diameter models with mixed-effects parameters are satisfactory. This result does not indicate any serious violation of the assumption of normality of the residuals.

Table 2 Estimated parameters (standard deviations) of both models applied to the fitting dataset

| Model | Altitude | a | b | σ | c | σ_φ | AIC |
|---|---|---|---|---|---|---|---|
| Equation 5 | Sea level | 0.3872 (0.0269) | 0.1439 (0.0122) | 0.1502 (0.0147) | – | – | 379.58 |
| | 100 m | 0.3405 (0.0226) | 0.1236 (0.0094) | 0.1348 (0.0121) | – | – | 475.18 |
| | 400 m | 0.3491 (0.0250) | 0.1289 (0.0105) | 0.1456 (0.0129) | – | – | 529.38 |
| | 1,000 m | 0.3876 (0.0214) | 0.1299 (0.0085) | 0.1386 (0.0124) | – | – | 483.99 |
| | All levels | 0.3698 (0.0129) | 0.1326 (0.0054) | 0.1521 (0.0070) | – | – | 1,418.03 |
| | All levels | 0.3661 (0.0117) | 0.1313 (0.0049) | 0.1438 (0.0065) | – | 0.0139 (0.0049) | 1,399.87 |
| Equation 11 | Sea level | 0.2908 (0.0417) | 0.0939 (0.0242) | 0.0684 (0.0441) | -7.017 (2.879) | – | 378.76 |
| | 100 m | 0.2132 (0.0291) | 0.0496 (0.0060) | 0.0126 (0.0044) | -56.468 (22.688) | – | 460.75 |
| | 400 m | 0.2126 (0.0291) | 0.0646 (0.0140) | 0.0484 (0.0225) | -10.676 (5.055) | – | 518.12 |
| | 1,000 m | 0.2318 (0.0321) | 0.0671 (0.0139) | 0.0496 (0.0209) | -10.442 (4.596) | – | 473.92 |
| | All levels | 0.2337 (0.0175) | 0.0709 (0.0083) | 0.0560 (0.0136) | -9.681 (4.026) | – | 1,386.66 |
| | All levels | 0.2267 (0.0114) | 0.0667 (0.0045) | 0.0449 (0.0067) | -12.470 (2.834) | 0.0041 (0.0015) | 1,359.58 |

The coefficient of variation is typically used to indicate the relative dispersion of datasets and is also often used to compare numerical distributions measured at different scales. Tree-height-based quantification of the stand structural diversity can be performed using the coefficient of variation. The coefficient of variation reaches its maximum with two-storied stands, and the standard deviation measures the differences between the heights of individual trees and the mean (Staudhammer and LeMay 2001). The coefficient of variation of a tree height measures the variability of the tree height relative to its mean and relates the mean and standard deviation by expressing the standard deviation as a percentage of the mean. To further discuss the results of this study, the coefficient of variation, which may help examine the dispersion in tree heights occurring at diameter $d$, is defined by

$$CV(d) = \frac{\sqrt{w_t(d)}}{h_t(d)} \times 100. \tag{21}$$
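The coefficient of variation in Eq. 21 follows directly from the mean and variance trends (Eqs. 15 and 16). The sketch below evaluates it over a range of diameters; the function name is an assumption for illustration, the parameter values are the pooled fixed-effects estimates for Eq. 11 from Table 2, and setting `c = 0` recovers the first model (Eq. 5).

```python
import math

H0 = 1.37  # starting height (m) at d = 0

def cv_percent(d, a, b, sigma, c=0.0, phi=0.0):
    """Eq. 21: 100 * sqrt(w_t(d)) / h_t(d); c = 0 recovers the model of Eq. 5."""
    decay = math.exp(-b * d)
    mu = math.log(H0 - c) * decay + (a + phi - sigma**2 / 2) * (1 - decay) / b
    var = sigma**2 * (1 - math.exp(-2 * b * d)) / (2 * b)
    mean = c + math.exp(mu + var / 2)                        # Eq. 15
    variance = math.exp(2 * mu + var) * (math.exp(var) - 1)  # Eq. 16
    return 100.0 * math.sqrt(variance) / mean

# Pooled fixed-effects estimates for Eq. 11 from Table 2 ("All levels")
a, b, sigma, c = 0.2337, 0.0709, 0.0560, -9.681
for d in (5, 10, 20, 40, 80, 160):
    print(d, round(cv_percent(d, a, b, sigma, c), 2))
```

Because the exponential terms in $d$ die out, the printed values level off at large diameters, which corresponds to the stationary coefficient of variation.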

Figure 1 shows a plot of the coefficient of variation as a function of diameter using the population mean trend and variance functions (fixed-effects parameters model). In both cases, the coefficient of variation of the tree height monotonically evolves into a stationary coefficient of variation.

The coefficient of variation based on tree height decreases with increasing altitude and peaks at sea level. The height-based coefficient of variation increases significantly with diameter.

Calibration

The mixed-effects models more closely approximate the actual values for all altitudes, which indicates that the mixed-effects models describe the height–diameter relationship well. In forestry literature, calibration requires the prediction of the random-effects parameter using a supplementary sample of observations collected at the same sampling unit. The heights of trees in new stands can be predicted either by setting the random effects to zero or by adding random parameters predicted from prior observations.

When the diameter and height of a sub-sample of trees are known, the predicted random effects are added to the fixed parameters to obtain localized parameters for the corresponding stand. If a sub-sample of $r$ trees with height $h_i$ and diameter $d_i$, $i = 1, 2, \ldots, r$, is taken from a new stand, the random-effects parameter $\phi$ for the new stand and both developed models can be predicted using the best linear unbiased predictors derived for linear mixed-effects regression models (McCulloch and Neuhaus 2012) in the following form:

$$\hat{\phi} = \frac{\sigma_\phi^2\, r}{\sigma_e^2 + \sigma_\phi^2\, r}\left(\bar{y} - a + \frac{\sigma^2}{2}\right), \tag{22}$$

where

$$\sigma_e^2 = \frac{1}{r - 1}\sum_{i=1}^{r}\left(y_i - a + \frac{\sigma^2}{2}\right)^2, \tag{23}$$

$\bar{y}$ is the average of the $y_i$, and, for the stochastic Gompertz-type model defined by Eq. 5,

Fig. 1 Plot of the variation dynamics (Eq. 21) of a tree height process (Eq. 11) with fixed-effects parameters

Table 3 Fit statistics for all scenarios tested

| Model | Altitude | Fixed effects: B | Fixed effects: RMSE | Fixed effects: R̄² | Mixed effects: B | Mixed effects: RMSE | Mixed effects: R̄² | Mixed effects, φ_i = 0: B | Mixed effects, φ_i = 0: RMSE | Mixed effects, φ_i = 0: R̄² |
|---|---|---|---|---|---|---|---|---|---|---|
| Equation 5 | Sea level | -0.0052 | 2.6486 | 0.6264 | -0.1880 | 2.6924 | 0.6140 | -0.5586 | 2.7385 | 0.6006 |
| | 100 m | -0.0970 | 2.8260 | 0.6640 | -0.1319 | 2.8751 | 0.6522 | -0.6509 | 2.8712 | 0.6532 |
| | 400 m | -0.0545 | 3.3077 | 0.5701 | -0.1174 | 3.3204 | 0.5668 | -1.0112 | 3.3235 | 0.5660 |
| | 1,000 m | -0.0663 | 3.7131 | 0.7021 | 0.1942 | 3.7368 | 0.6982 | 2.0397 | 3.9688 | 0.6596 |
| | All levels | -1.1072 | 3.4910 | 0.5766 | -0.0604 | 3.1641 | 0.6522 | -0.0672 | 3.4620 | 0.5836 |
| Equation 11 | Sea level | -0.0057 | 2.6301 | 0.6314 | -0.0180 | 2.7192 | 0.6028 | -0.2334 | 2.7473 | 0.5981 |
| | 100 m | -0.0103 | 2.5634 | 0.7236 | -0.0874 | 2.6076 | 0.7139 | -0.6868 | 2.6197 | 0.7113 |
| | 400 m | 0.0047 | 3.0774 | 0.6279 | -0.0669 | 3.0897 | 0.6249 | -1.0458 | 3.1162 | 0.6185 |
| | 1,000 m | 0.0136 | 3.4555 | 0.7420 | 0.2114 | 3.4597 | 0.7413 | 1.9928 | 3.6464 | 0.7127 |
| | All levels | -1.4801 | 3.3811 | 0.6028 | 0.0071 | 2.9632 | 0.6950 | -0.0267 | 3.2593 | 0.6310 |

$$y_i = \frac{b}{1 - \exp(-b d_i)}\left(\ln(h_i) - \ln(1.37)\exp(-b d_i) - \frac{\sigma^2}{4b}\left(1 - \exp(-2 b d_i)\right)\right), \tag{24}$$

for the stochastic Gompertz-type model with a threshold parameter defined by Eq. 11,

$$y_i = \frac{b}{1 - \exp(-b d_i)}\left(\ln(h_i - c) - \ln(1.37 - c)\exp(-b d_i) - \frac{\sigma^2}{4b}\left(1 - \exp(-2 b d_i)\right)\right), \tag{25}$$

and $a$, $b$, $c$, $\sigma$, $\sigma_\phi$ are estimates of the parameters calculated by the maximum likelihood procedure for mixed-effects models. The height of another tree from the same stand can be estimated by adding the random-effects parameter predicted by Eq. 22 to parameter $a$.
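The calibration step can be sketched in a few lines. The code below transforms each sub-sample tree with Eq. 24 or 25 and then applies Eqs. 22 and 23 to predict the random effect for a new stand. The function names and the five (d, h) pairs are hypothetical; the parameter values are the mixed-effects estimates for Eq. 11 from Table 2. Because Eq. 23 divides by $r - 1$, the sketch assumes a sub-sample of at least two trees.

```python
import math

H0 = 1.37  # starting height (m) at d = 0

def transformed_obs(h, d, b, sigma, c=0.0):
    """Eq. 24 (c = 0) or Eq. 25: map one tree to the scale of a + phi - sigma^2 / 2."""
    decay = math.exp(-b * d)
    inner = (math.log(h - c) - math.log(H0 - c) * decay
             - sigma**2 / (4 * b) * (1 - math.exp(-2 * b * d)))
    return b / (1 - decay) * inner

def predict_phi(sample, a, b, sigma, sigma_phi, c=0.0):
    """Eqs. 22-23: predicted random effect from r >= 2 pairs (d, h) of a new stand."""
    r = len(sample)
    if r < 2:
        raise ValueError("Eq. 23 needs at least two trees")
    y = [transformed_obs(h, d, b, sigma, c) for d, h in sample]
    centre = a - sigma**2 / 2
    sigma_e2 = sum((yi - centre) ** 2 for yi in y) / (r - 1)   # Eq. 23
    y_bar = sum(y) / r
    shrink = sigma_phi**2 * r / (sigma_e2 + sigma_phi**2 * r)
    return shrink * (y_bar - centre)                            # Eq. 22

# Mixed-effects estimates for Eq. 11 from Table 2 ("All levels")
a, b, sigma, c, sigma_phi = 0.2267, 0.0667, 0.0449, -12.470, 0.0041
# Hypothetical sub-sample of (d in cm, h in m) pairs from a new stand
sample = [(8.0, 6.0), (15.0, 9.5), (25.0, 13.0), (40.0, 17.0), (55.0, 20.0)]
phi_hat = predict_phi(sample, a, b, sigma, sigma_phi, c)
print(phi_hat, a + phi_hat)  # predicted random effect and localized parameter
```

The factor multiplying the average deviation lies between 0 and 1, so the prediction is shrunk toward zero; it moves closer to the raw stand-level deviation as the sub-sample grows or as $\sigma_\phi$ increases relative to $\sigma_e$.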

Mixed-effects models incorporate the variability

between stands through the models’ parameters and in

terms of both fixed and random effects. Random effects are

conceptually random variables and can be simulated as

such in terms of their distribution. To this end, a random

component can also be added to the random-effects

parameter prediction, / and the height predictions, h. This

stochastic approach uses the distribution functions and

confidence intervals of random variables, /, and H(d). The

stochastic predictions of / and H(d) can be defined in the

following form, respectively:

/stoch ¼ /þ U�1U 0; r2

/

; ð26Þ

hstoch ¼ LN�1U lðdÞ; vðdÞð Þ; ð27Þ

where $\phi$ is the value of the random effects estimated by Eq. 22; $\sigma_\phi$ is the estimated standard deviation of the random effects; $\Phi_U^{-1}(0; \sigma_\phi^2)$ is the inverse of the normal distribution with a mean of 0 and a constant variance $\sigma_\phi^2$ for a uniform random variable, $U$, in the interval [0.05; 0.95]; $\mu(d)$ and $v(d)$ are the estimated trends of the mean and variance (calculated by Eqs. 7 and 8 or Eqs. 13 and 14) of the lognormal density of the height, respectively; and $LN_U^{-1}(\mu(d); v(d))$ is the inverse of the lognormal distribution with a mean of $\mu(d)$ and a variance of $v(d)$ for a uniform random variable, $U$, in the interval [0.05; 0.95].
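The two stochastic predictions of Eqs. 26 and 27 amount to inverse-CDF sampling with a truncated uniform variable. The following is a hedged sketch using only the Python standard library. The function names are illustrative, and the code assumes that the two arguments of the inverse lognormal are the mean and variance of $\ln H(d)$; if they are instead moments on the original height scale, they would need to be converted first.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for inverse-CDF sampling


def draw_u(rng, lo=0.05, hi=0.95):
    """Uniform random variable U restricted to the interval [lo, hi]."""
    return rng.uniform(lo, hi)


def phi_stoch(phi_hat, sigma_phi, rng):
    """Eq. 26: predicted random effect plus an inverse-normal draw N(0, sigma_phi^2)."""
    return phi_hat + NormalDist(0.0, sigma_phi).inv_cdf(draw_u(rng))


def h_stoch(mu_log, var_log, rng):
    """Eq. 27: inverse-lognormal draw.

    ASSUMPTION: mu_log and var_log are the mean and variance of ln H(d).
    """
    z = STD_NORMAL.inv_cdf(draw_u(rng))
    return math.exp(mu_log + math.sqrt(var_log) * z)
```

Because $U$ never leaves [0.05; 0.95], every draw stays inside the central 90 % band of the corresponding distribution, which is the design choice that keeps the simulated values away from the extreme tails.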

The functionality of the calibration is tested using the regional (by altitude) datasets and their measure of central tendency. Although the mean, median, and mode are all valid measures of central tendency, we prefer the median for model calibration because it is less affected by outliers and skewed data. The first calibration approach uses the full regional datasets, and the second approach uses the medians of the regional datasets (median diameter and height: $\tilde{d}$ and $\tilde{h}$, respectively). One aim is to evaluate the advantage of these calibration approaches; both are therefore compared with the non-calibrated fixed-effects and mixed-effects models. The results are presented in Table 4. The predictive ability of all of the calibrated models for the entire dataset is better than that of the fixed-effects model and, as expected, worse than that obtained by fitting each regional dataset individually.
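The comparison relies on three fit statistics: bias (B), root mean square error (RMSE), and the coefficient of determination ($R^2$). Their exact definitions are not restated in this section, so the sketch below uses common textbook forms as an assumption (for instance, no degrees-of-freedom correction and residuals taken as observed minus predicted); the function name is illustrative.

```python
import math


def fit_statistics(observed, predicted):
    """Return (bias, RMSE, R2) for paired observed and predicted heights.

    ASSUMED definitions (not restated in this section):
      bias = mean(observed - predicted)
      RMSE = sqrt(mean((observed - predicted)^2))
      R2   = 1 - SSE / SST
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equally long samples with at least 2 values")
    n = len(observed)
    residuals = [o - p for o, p in zip(observed, predicted)]
    bias = sum(residuals) / n
    sse = sum(r * r for r in residuals)          # residual sum of squares
    mean_obs = sum(observed) / n
    sst = sum((o - mean_obs) ** 2 for o in observed)  # total sum of squares
    return bias, math.sqrt(sse / n), 1.0 - sse / sst
```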

It is clear that the random effects predicted by Eq. 22 for

a particular stand are statistics and thus have a sampling

distribution for a particular sub-sample size, r. The interest

of the present study does not lie in capturing the variability

in the random-effects predictions for a particular stand

subject to size r of a sub-sample. The random effects

predicted using Eq. 22 for the regional datasets are very

similar to the estimates given by the maximum likelihood

procedure.

Next, the predicted stand-specific random effects, $\phi$, calculated by Eq. 22 are added to the population average parameter, $\alpha$ (estimated using the mixed-effects model), to determine the predicted stand-specific mixed-effects parameter for the stochastic prediction of the height, $h_{stoch}$. To validate the developed approaches (Eqs. 26, 27) for the stochastic prediction of random effects, $\phi$, and tree height, $h$, a large-scale simulation study is performed using the models defined by Eqs. 5 and 11. The results are compared using 90 % confidence intervals for each fit statistic. More precisely, for each height model (Eqs. 5 and 11), 100 stochastic predictions are generated by each approach (Eqs. 26, 27). The corresponding 90 % confidence interval for each fit statistic is summarized in Table 4. The stochastic prediction approaches show greater variability in the height predictions.
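How the 90 % intervals were computed from the 100 replicates is not spelled out here. One plausible reading, shown as an assumption rather than the author's procedure, is an empirical percentile interval over the replicated values of a fit statistic. The names `percentile_interval_90`, `simulate_interval`, and the `statistic` callable are hypothetical.

```python
import random
from statistics import quantiles


def percentile_interval_90(values):
    """Empirical 5th and 95th percentiles of replicated fit statistics."""
    cuts = quantiles(values, n=20)  # 19 cut points at 5 %, 10 %, ..., 95 %
    return cuts[0], cuts[-1]


def simulate_interval(statistic, n_rep=100, seed=0):
    """Run n_rep stochastic replicates and summarize them by a 90 % interval.

    statistic: hypothetical callable taking a random.Random instance and
               returning one replicate's fit statistic (e.g. an RMSE).
    """
    rng = random.Random(seed)
    return percentile_interval_90([statistic(rng) for _ in range(n_rep)])
```

In use, `statistic` would draw one set of stochastic predictions (Eq. 26 or Eq. 27), compute the fit statistic against the observed heights, and return it.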

Conclusion

New height–diameter models were developed using

Gompertz shape SDEs with one mixed-effects parameter.

The comparison of the predicted height values calculated

using the SDEs defined by Eqs. 5 and 11 with the observed

values revealed predictive power comparable to that of the

stochastic height model with the threshold parameter

(Eq. 11). In addition, the use of the mixed-effects model in

the analysis of a sub-sample of trees to determine height

allows for the maintenance of a simple model structure


without the inclusion of additional predictor variables. The

developed stochastic models may be recommended both

for the ease of their fitting procedures and the biological

interpretations of the relevant parameters.

The variance functions developed in this study can be

applied to generate weights in every linear and nonlinear

least-squares regression height model.

In summary, this paper demonstrates that the standard

approach of assuming normally distributed random effects

results in predicted values that exhibit good performance

across a wide range of situations presented by different

regional datasets.

Acknowledgments The author appreciates the anonymous review-

ers and the editor for their helpful comments on the manuscript.

Appendix

In the context of this study, there is only one height measurement for each tree. First, the maximum log-likelihood function is derived for fixed-effects models (in this case, the random-effects parameter, $\phi_i$, is assumed to be equal to its mean value $E(\phi_i) = 0$, $i = 1, \ldots, M$). Second, the maximum log-likelihood function is derived for mixed-effects models.

The fixed-effects parameters $\alpha$, $\beta$, $\gamma$, and $\sigma$ are estimated through the maximum likelihood procedure using discrete sampling and conditional probability density functions (Eqs. 6 and 12). Let us consider a discrete sample of the process $(h_1^i, h_2^i, \ldots, h_{n_i}^i)$ at the diameters $(d_1^i, d_2^i, \ldots, d_{n_i}^i)$, where $n_i$ is the number of observed trees of the $i$th stand, $i = 1, 2, \ldots, M$. Under the initial condition $P(H(0) = 1.37) = 1$, the associated log-likelihood function can be obtained by the following expression:

$$LL_f(\theta) = \sum_{i=1}^{M}\sum_{j=1}^{n_i} \ln f\left(h_j^i, d_j^i\right), \tag{28}$$

where the density function $f(h, d)$ takes the form of Eq. 6 or 12 and $\theta = (\alpha, \beta, \sigma)$ or $\theta = (\alpha, \beta, \sigma, \gamma)$, respectively, with the random-effects parameters $\phi_i \equiv 0$, $i = 1, 2, \ldots, M$.
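The double sum of Eq. 28 can be sketched in a few lines. Because Eq. 6 lies outside this section, the code assumes the standard lognormal transition density of a Gompertz diffusion, in which $\ln H(d)$ is normal with mean $\ln(1.37)e^{-\beta d} + (\alpha - \sigma^2/2)(1 - e^{-\beta d})/\beta$ and variance $\sigma^2(1 - e^{-2\beta d})/(2\beta)$. All function names are illustrative, and the threshold model (Eq. 12) is not covered.

```python
import math

H0 = 1.37  # initial height (m): P(H(0) = 1.37) = 1


def log_moments(d, alpha, beta, sigma):
    """ASSUMED mean and variance of ln H(d) for the Gompertz diffusion."""
    decay = math.exp(-beta * d)
    mu = math.log(H0) * decay + (alpha - sigma ** 2 / 2.0) * (1.0 - decay) / beta
    var = sigma ** 2 * (1.0 - math.exp(-2.0 * beta * d)) / (2.0 * beta)
    return mu, var


def log_density(h, d, alpha, beta, sigma):
    """ln f(h, d) for a lognormal density with the moments above."""
    mu, var = log_moments(d, alpha, beta, sigma)
    z = math.log(h) - mu
    return -math.log(h) - 0.5 * math.log(2.0 * math.pi * var) - z * z / (2.0 * var)


def log_likelihood_fixed(stands, alpha, beta, sigma):
    """Eq. 28: sum over stands i and trees j of ln f(h_j^i, d_j^i).

    stands: list of stands, each a list of (height, diameter) pairs.
    """
    return sum(
        log_density(h, d, alpha, beta, sigma) for trees in stands for h, d in trees
    )
```

Maximizing `log_likelihood_fixed` over `(alpha, beta, sigma)` with any numerical optimizer would then give the fixed-effects estimates under these assumptions.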

The maximum log-likelihood function for mixed-effects

models defined by Eqs. 5 and 11 takes the following form:

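$$LL_m(\theta, \sigma_\phi) = \sum_{i=1}^{M}\int_{\mathbb{R}}\left(\sum_{j=1}^{n_i} \ln f\left(h_j^i, d_j^i\right) + \ln p\left(\phi_i \mid \sigma_\phi\right)\right) d\phi_i, \tag{29}$$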

where $\theta = (\alpha, \beta, \gamma, \sigma)$ is the vector of the fixed-effects parameters (the same for all stands) and $\phi_i$ is the random-effects parameter (stand-specific), which is assumed to follow a univariate normal distribution, $p(\phi_i \mid \sigma_\phi)$, with a mean of 0 and constant variance $\sigma_\phi^2$.

Table 4 Fit statistics for the calibrated height–diameter models (for Eqs. 26 and 27, each cell gives the 90 % confidence interval)

| Model | Altitude | Eq. 22: B | Eq. 22: RMSE | Eq. 22: $R^2$ | Eq. 22 (with $\tilde{d}$, $\tilde{h}$): B | Eq. 22 (with $\tilde{d}$, $\tilde{h}$): RMSE | Eq. 22 (with $\tilde{d}$, $\tilde{h}$): $R^2$ | Eq. 26: B | Eq. 26: RMSE | Eq. 26: $R^2$ | Eq. 27: B | Eq. 27: RMSE | Eq. 27: $R^2$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Equation 5 | Sea level | 0.1422 | 2.6641 | 0.6220 | 0.2783 | 2.6560 | 0.6243 | [-0.0481, 0.2816] | [2.6212, 2.9867] | [0.5250, 0.6341] | [0.3486, 0.6312] | [2.5648, 2.8581] | [0.5650, 0.6497] |
| Equation 5 | 100 m | 0.3283 | 2.8938 | 0.6477 | 1.6647 | 3.0228 | 0.6156 | [0.1181, 0.4454] | [2.8914, 3.1294] | [0.5640, 0.6483] | [0.6264, 0.8935] | [2.8307, 3.1709] | [0.5770, 0.6629] |
| Equation 5 | 400 m | 0.2513 | 3.3329 | 0.5636 | 2.6887 | 3.5940 | 0.4925 | [0.0344, 0.4019] | [3.2448, 3.6443] | [0.4782, 0.5863] | [0.5257, 0.7703] | [3.2562, 3.5478] | [0.5055, 0.5834] |
| Equation 5 | 1,000 m | 0.2587 | 3.7405 | 0.6976 | 1.4313 | 3.8655 | 0.6771 | [0.0271, 0.4369] | [3.7051, 4.1774] | [0.6229, 0.7034] | [0.5621, 0.8718] | [3.6056, 4.0186] | [0.6510, 0.7191] |
| Equation 5 | All levels | 0.2520 | 3.1651 | 0.6520 | 1.6027 | 3.4087 | 0.5964 | [-0.5355, 0.9543] | [3.1884, 3.4915] | [0.5765, 0.6468] | [0.5913, 0.7298] | [3.2000, 3.3554] | [0.6089, 0.6443] |
| Equation 11 | Sea level | -0.1036 | 2.7297 | 0.6032 | 1.2842 | 2.6545 | 0.6248 | [-0.2763, 0.0626] | [2.6761, 3.0579] | [0.5020, 0.6186] | [-0.0115, 0.0851] | [2.6762, 2.7884] | [0.5859, 0.6186] |
| Equation 11 | 100 m | 0.0341 | 2.6087 | 0.7137 | 1.5753 | 2.7184 | 0.6891 | [-0.2031, 0.1507] | [2.5585, 2.9822] | [0.6259, 0.7246] | [0.1440, 0.2360] | [2.5762, 2.6656] | [0.7011, 0.7208] |
| Equation 11 | 400 m | 0.0043 | 3.0903 | 0.6248 | 1.5732 | 3.1869 | 0.6010 | [-0.2015, 0.2090] | [2.9949, 3.4552] | [0.5309, 0.6476] | [0.1114, 0.2092] | [3.0421, 3.1593] | [0.6078, 0.6364] |
| Equation 11 | 1,000 m | 0.0971 | 3.4573 | 0.7417 | 2.5565 | 3.7572 | 0.6950 | [-0.0960, 0.3029] | [3.3246, 3.9029] | [0.6708, 0.7612] | [0.2073, 0.3177] | [3.4223, 3.5400] | [0.7292, 0.7469] |
| Equation 11 | All levels | 0.0115 | 2.9634 | 0.6949 | 1.7717 | 3.1177 | 0.6623 | [-0.7274, 0.6573] | [2.9853, 3.2378] | [0.6358, 0.6904] | [0.1468, 0.1910] | [2.9416, 3.0100] | [0.6853, 0.6994] |

Unfortunately, the integral in Eq. 29 does not have a closed-form solution. Because the analytic expression for the integrand in Eq. 29 is known, the Laplace method may be used (Picchini et al. 2011). The log-likelihood function for the mixed-effects models defined by Eqs. 5 and 11 is approximately given by:

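$$LL_m(\theta, \sigma_\phi) \approx \sum_{i=1}^{M}\left[g\left(\hat{\phi}_i \mid \theta, \sigma_\phi\right) + \frac{1}{2}\ln(2\pi) - \frac{1}{2}\ln\left(-H\left(\hat{\phi}_i \mid \theta, \sigma_\phi\right)\right)\right], \tag{30}$$

where

$$g\left(\phi_i \mid \theta, \sigma_\phi\right) = \sum_{j=1}^{n_i} \ln f\left(h_j^i, d_j^i\right) + \ln p\left(\phi_i \mid \sigma_\phi\right),$$

$$H\left(\hat{\phi}_i \mid \theta, \sigma_\phi\right) = \left.\frac{\partial^2 g\left(\phi_i \mid \theta, \sigma_\phi\right)}{\partial \phi_i^2}\right|_{\phi_i = \hat{\phi}_i},$$

$$\hat{\phi}_i = \arg\max_{\phi_i}\, g\left(\phi_i \mid \theta, \sigma_\phi\right). \tag{31}$$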

The maximization of $LL_m(\theta, \sigma_\phi)$ is a nested optimization problem. The internal optimization step estimates $\hat{\phi}_i$ for every stand $i = 1, 2, \ldots, M$. The external optimization step maximizes $LL_m(\theta, \sigma_\phi)$ after substituting the value of $\hat{\phi}_i$ into Eq. 30.
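The nested scheme can be illustrated with a small, generic sketch. It assumes that each stand contributes the logarithm of the integral of $\exp(g)$ over $\phi_i$ and applies the terms of Eq. 30 at the mode found by the inner step. The function names, the use of Newton iterations, and the finite-difference derivatives are illustrative choices, not the MAPLE implementation of the study; the outer maximization over $(\theta, \sigma_\phi)$ is left to any numerical optimizer and is not shown.

```python
import math


def laplace_log_integral(g, phi0=0.0, step=1e-4, tol=1e-10, max_iter=50):
    """Approximate ln( integral of exp(g(phi)) d phi ) by the Laplace method.

    Inner step: Newton iterations locate the mode phi_hat of g.
    Returns (approximation, phi_hat). Derivatives use central differences.
    """
    def d1(x):
        return (g(x + step) - g(x - step)) / (2.0 * step)

    def d2(x):
        return (g(x + step) - 2.0 * g(x) + g(x - step)) / step ** 2

    phi = phi0
    for _ in range(max_iter):
        curvature = d2(phi)
        if curvature >= 0.0:
            raise ValueError("g is not concave at the current iterate")
        delta = d1(phi) / curvature
        phi -= delta  # Newton update towards the maximum of g
        if abs(delta) < tol:
            break
    hessian = d2(phi)  # H in Eq. 31, negative at a maximum
    value = g(phi) + 0.5 * math.log(2.0 * math.pi) - 0.5 * math.log(-hessian)
    return value, phi


def approx_mixed_log_likelihood(stand_objectives):
    """Outer objective of Eq. 30: sum of the per-stand Laplace terms.

    stand_objectives: one callable g_i(phi) per stand (hypothetical inputs).
    """
    return sum(laplace_log_integral(g)[0] for g in stand_objectives)
```

For a quadratic `g` the Laplace approximation coincides with the exact log-integral, which gives a convenient sanity check before plugging in a model-specific `g`.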

References

Calama R, Montero G (2004) Interregional nonlinear height–diameter model with random coefficients for stone pine in Spain. Can J For Res 34:150–163. doi:10.1139/03-199

Garcia O (1983) A stochastic differential equation model for the height growth of forest stands. Biometrics 39:1059–1072

Gompertz B (1825) On the nature of the function expressive of the law of human mortality, and on a new mode of determining the value of life contingencies. Philos Trans R Soc Lond B 115:513–585

Gutierrez R, Gutierrez-Sanchez R, Nafidi A, Ramos E (2006) A new stochastic Gompertz diffusion process with threshold parameter: computational aspects and applications. Appl Math Comput 183:738–747

Hosoda K, Iehara T (2010) Aboveground biomass equations for individual trees of Cryptomeria japonica, Chamaecyparis obtusa and Larix kaempferi in Japan. J For Res 15(5):299–306

Ito K (1942) On stochastic processes. Jpn J Math 18:261–301

Kenzo T, Furutani R, Hattori D, Kendawang JJ, Tanaka S, Sakurai K, Ninomiya I (2009) Allometric equations for accurate estimation of above-ground biomass in logged-over tropical rainforests in Sarawak, Malaysia. J For Res 14:365–372

Liang WM, Fei LX (2000) Research on nonlinear height–diameter models. J For Res 13(1):75–79

Lumbres RIC, Cabral DE, Parao MR, Seo YO, Lee YJ (2012) Evaluation of height–diameter models for three tropical plantation species in the Philippines. Asia Life Sci 21:455–468

McCulloch CE, Neuhaus JM (2012) Prediction of random effects in linear and generalized linear models under model misspecification. Biometrics 67:270–279

Monagan MB, Geddes KO, Heal KM, Labahn G, Vorkoetter SM, McCarron J, DeMarco P (2007) Maple advanced programming guide. Maplesoft, Canada

Picchini U, Ditlevsen S, De Gaetano A (2011) Practical estimation of high dimensional stochastic differential mixed-effects models. Comput Stat Data Anal 55:1426–1444

Rupsys P, Petrauskas E (2010a) The bivariate Gompertz diffusion model for tree diameter and height distribution. For Sci 56:271–280

Rupsys P, Petrauskas E (2010b) Quantifying tree diameter distributions with one-dimensional diffusion processes. J Biol Syst 18:205–221

Rupsys P, Petrauskas E (2012) Analysis of height curves by stochastic differential equations. Int J Biomath 5(5):1250045. doi:10.1142/S1793524511001878

Rupsys P, Petrauskas E, Mazeika J, Deltuvas R (2007) The Gompertz type stochastic growth law and a tree diameter distribution. Balt For 13:197–206

Scaranello MA, Alves LF, Vieira SA, Camargo PB, Joly CA, Martinelli LA (2012) Height–diameter relationships of tropical Atlantic moist forest trees in southeastern Brazil. Sci Agr 69:26–37

Staudhammer CL, LeMay VM (2001) Introduction and evaluation of possible indices of stand structural diversity. Can J For Res 31:1105–1115

Suzuki T (1971) Forest transition as a stochastic process. Mitt Forstl Bundesversuchsanstalt Wien 91:69–86

Tanaka K (1986) A stochastic model of diameter growth in an even-aged pure forest stand. J Jpn For Soc 68:226–236

Tanaka K (1988) A stochastic model of height growth in an even aged pure forest stand: why is the coefficient of variation of the height distribution smaller than that of the diameter distribution? J Jpn For Soc 70:20–29

Temesgen H, Gadow KV (2004) Generalized height–diameter models: an application for major tree species in complex stands of interior British Columbia. Eur J For Res 123:45–51

Uhlenbeck G, Ornstein LS (1930) On the theory of Brownian motion. Phys Rev 36:823–841

Vanclay JK (1995) Growth models for tropical forests: a synthesis of models and methods. For Sci 41:4–42

VanderSchaaf CL (2012) Mixed-effects height–diameter models for commercially and ecologically important conifers in Minnesota. North J Appl For 29:15–20

Zeng HQ, Liu QJ, Feng ZW, Ma ZQ (2010) Biomass equations for four shrub species in subtropical China. J For Res 15:83–90
