Research Article

Received: 28 December 2007, Accepted: 28 September 2009, Published online in Wiley Online Library: 9 December 2009

A robust algorithm for the sequential linear analysis of environmental radiological data with imprecise observations

Carlos Rivero (a) and Teofilo Valdes (b)*

(wileyonlinelibrary.com) DOI: 10.1002/env.1034

In this paper we present an algorithm suitable for analysing linear models under the following robust conditions: the data are not received in batch but sequentially; the dependent variables may be either non-grouped or grouped, that is, imprecisely observed; the distribution of the errors may be general, and thus not necessarily normal; and the variance of the errors is unknown. As a consequence of the sequential data reception, the algorithm focuses on updating the current estimates and inferences of the model parameters (slopes and error variance) as soon as a new datum is received. The update of the current estimate is simple and has modest computational requirements. The same holds for the inference processes, which are based on asymptotics. The algorithm, unlike its natural competitors, has some memory; therefore, the storage of the complete up-to-date data set is not needed. This fact is essential in terms of computational complexity, reducing both the computing time and the storage requirements of our algorithm compared with other alternatives. Copyright © 2009 John Wiley & Sons, Ltd.

Keywords: algorithmic estimation; stochastic approximation; linear model under robust conditions; conditional imputation techniques; asymptotics and simulation studies

AMS subject classifications: 62F10; 62F15

* Correspondence to: T. Valdes, Departamento de Estadística e I.O. I, Facultad de Matemáticas, Universidad Complutense de Madrid, 28040 Madrid, Spain. E-mail: [email protected]

a C. Rivero, Departamento de Estadística e I.O. II, Facultad de Ciencias Económicas y Empresariales, Universidad Complutense de Madrid, 28223 Pozuelo de Alarcón, Spain

b T. Valdes, Departamento de Estadística e I.O. I, Facultad de Matemáticas, Universidad Complutense de Madrid, 28040 Madrid, Spain

Environmetrics 2011; 22: 132–151

1. INTRODUCTION

This paper focuses on the statistical analysis of the linear model

$y_i = x_i' \beta + \sigma \varepsilon_i \qquad (1)$

where $\beta$ is the slope vector parameter, $x_i$ and $y_i$ are, respectively, the independent variable vector of order $m$ and the dependent variable of individual $i$ ($i = 1, \ldots, n$), the $\varepsilon_i$ denote the standardized random errors and $\sigma > 0$ is a scale parameter. The following robust conditions will be assumed in the sequel:

(a) The data are received sequentially instead of in batch. This usually occurs in the context of on-line transactions, continuous sampling or quality control, among others. Under these circumstances, it is desirable to advance partial analyses and to update them as soon as a new observation (or a group of them) is received. The great majority of the usual statistical procedures are completely memory-less with respect to the sequential data reception. This implies that, when a new observation is received, we must throw away the former estimations and inferences and all of the computations need to be repeated from the complete up-to-date data sets; therefore, a certain amount of computation is inevitably wasted. For its part, our procedure has some memory in the sense that it does not need the complete storage of the individual observations, but only a small part of them, and the update of the current estimates and inferences is computationally done in a rather effective way.

(b) Each dependent observation $y_i$ may be either ungrouped (with probability $p_0 > 0$) or grouped (with probability $p_1 = 1 - p_0 > 0$) with different classification intervals. This situation is typical of laterally censored data; however, interval-censored data (that is, grouped data) appear, as will be seen, in many situations related to the precision of the measuring apparatus. For simplicity of notation, we will assume that there exists a unique set of known classification intervals given by their extremes

$-\infty = c_0 < c_1 < \cdots < c_r = \infty. \qquad (2)$


When a grouped observation is within the interval $(c_{h-1}, c_h]$ its value is lost and only this interval is known. In spite of this simplification, the proposed algorithm is also capable of handling the following grouping mechanisms: (a) the set of classification intervals could, as was said, vary from one grouped observation to another, and (b) it could also be possible that the value $y_i$ is only lost if it falls within some known subset of the intervals $(c_{h-1}, c_h]$. Thus, some common cases of incomplete data are within the scope of this paper. Missing data, for example, is a particular case of grouped data for which there exists a unique classification interval equal to $(-\infty, \infty)$. Also, right (or left) censored data can be visualized as a grouping process with two classification intervals, $(-\infty, c]$ and $(c, \infty)$, in which each observation is only lost if it falls within one of them (see the short encoding sketch after this list).

(c) The distribution of the error components $\sigma\varepsilon_i$ may be general, with the sole restriction of being within the general class of the strongly unimodal distributions (see An (1998)) centred on zero (either symmetrical or non-symmetrical). Let $f > 0$ denote the density function of the standardized errors.

(d) The scale parameter $\sigma > 0$ is unknown and needs to be estimated jointly with the slope vector parameter $\beta$.
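For illustration, the following minimal Python fragment (ours, not the authors') shows one convenient encoding of the grouping mechanisms of condition (b): every imprecise observation is stored as an interval, with missing and one-sided censored data as special cases. The names are illustrative assumptions.

```python
# Minimal sketch (ours): encoding grouped/censored/missing responses
# as intervals (c_lo, c_hi]; exact observations carry their value.
import math

observations = [
    ("exact", 110.0),                     # fully observed y
    ("interval", (92.0, 98.0)),           # grouped: y in (92, 98]
    ("interval", (115.0, math.inf)),      # right censored: 115+
    ("interval", (-math.inf, 22.0)),      # left censored: 22-
    ("interval", (-math.inf, math.inf)),  # missing value
]
```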

Under these conditions it is clear that (1) the existence of grouped data makes OLS estimation and inference inapplicable even if the errors are assumed to be normally distributed, and (2) the non-normality of the errors reinforces the non-applicability of the OLS estimation and inference mentioned above. For its part, the algorithm proposed here is operative and easy to implement computationally under the robust conditions mentioned above. The maximum likelihood methods (implemented either by the EM algorithm or by direct optimization) are the natural alternative to our procedure. However, these methods have a much greater computational complexity with sequential data in the memory-less sense mentioned above, which is critical with massive data sets. For example, if the maximum likelihood estimation and inference is implemented through the EM algorithm, this, unlike ours, not only demands the complete storage of the individual data and the re-execution of its loops, but it also has the disadvantage, compared with our algorithm, of the awkward computations included in each loop. These affect both the quadrature and maximization processes (involved, respectively, in the E and M steps of the EM) by which $\beta$ and $\sigma$ are updated. For its part, direct maximum likelihood estimation not only has no memory with respect to the sequential data reception but, for a fixed sample size, it also needs to tackle the numerical maximization of the integrated likelihood function of the incomplete data described, which our algorithm avoids. Finally, it is important to highlight that the maximum likelihood $\beta$-inferences are based on the asymptotic distribution of the ML estimates as the sample size tends to infinity. This is typically normal, centred on the true value of $\beta$, and with a covariance matrix which needs to be estimated. The same occurs with the proposed algorithm, although the latter allows us to estimate the asymptotic covariance matrix of the slope parameter estimates more easily than when we employ either the direct maximum likelihood method or the EM algorithm (see Louis (1982) and Meng and Rubin (1991) in this respect).

The paper is organized as follows. Section 2 presents a motivating real life environmental case study with which we sum up the potentialities of the proposed algorithm. This has a direct antecedent, on which Section 3 focuses. Section 4 includes the rationale, the final loops and the convergence properties of the proposed algorithm. In Section 5 we present several exhaustive simulation studies, the intention being to analyse the performance and sensitivity of the algorithm estimates. The sequential analysis of the radiological measurements of the aforementioned case study is addressed in Section 6. Also in this section the proposed algorithm is compared with the two natural maximum likelihood alternatives: the EM algorithm and the direct optimization of the likelihood function. In this respect some technicalities have been annexed at the end of the paper for more interested readers. Finally, the comments and remarks of Section 7 bring the paper to a close.

2. A MOTIVATING REAL ENVIRONMENTAL CASE STUDY

In the following we show a real life case study in which the potentiality and effectiveness of the algorithm become evident. The data presented in Table 1 originally motivated our study and were provided by the Nuclear Security Council of Spain, which has funded a research contract tender to mitigate biases in the estimated levels of low environmental radiological contamination. Cell values of Table 1 are composed of two elements. On the one hand, the first cell elements show several radiological gamma emissions (in Bq/kg) recorded from samples of vegetables taken from sites around the Spanish nuclear power stations of Almaraz (A) and Trillo (T) at distances of 100, 600 and 1000 m. In compliance with the law, these samples are periodically taken and sent to different laboratories, each working with different exposure periods and with apparatus from several manufacturers, which may be calibrated differently. As a consequence, some of the recorded levels of radioactivity have been, on the one hand, submitted to upper and lower limits of detection and, on the other hand, some rough measures have been registered as interval censored (their extreme values written in square brackets). The signs + and - indicate, respectively, that their corresponding measurements are above or below the upper or lower limits of detection written to the left of the sign. The laboratories send their measurements on-line and, since the sampling is continuous, they are sequentially received. The second cell elements are shown, in circular brackets, to the right of the first, and they represent the order of reception of their corresponding radiological measurement. Thus, these second elements vary from 1 to 48, as this latter number agrees with the total number of radiological measurements received throughout the time period under study. In these circumstances, we will focus on the two following aspects, for simplicity:

(a) To determine, first, the extent to which the distance affects the gamma emissions through the simple covariance linear relation

$y_i = \mu_A u_i + \mu_T w_i + \gamma z_i + \sigma\varepsilon_i, \qquad (3)$

where $y_i$ represents the emission, $z_i$ the distance to the power station, $u_i$ is the dummy variable $I(y_i$ belongs to station A$)$ and, similarly, $w_i = 1 - u_i = I(y_i$ belongs to station T$)$ and, finally, $(\mu_A, \mu_T, \gamma)$ is the parameter vector. With double sub-indices this model is equivalent to

$y_{ij} = \mu_j + \gamma z_i + \sigma\varepsilon_{ij},$

where $i = 1, \ldots, 24$, $j = A, T$ and the rest of the elements are clearly identifiable from Equation (3). The common slope $\gamma$ and the different intercepts $\mu_A$ and $\mu_T$ are plausible from the fact that both power stations have similar characteristics and are subject to similar security controls, although their operating periods differ. Secondly, to test the hypothesis $\mu_A = \mu_T$.

Table 1. Radiological measurements (with their order of reception in circular brackets)

Station   100 m             600 m            1000 m
A         110 (1)           97 (4)           38 (5)
          108 (3)           [92,98] (9)      22- (12)
          [109,115] (13)    88 (11)          [32,37] (16)
          120+ (20)         87 (28)          43 (24)
          101 (21)          90 (33)          39 (25)
          102 (22)          94 (34)          20- (39)
          115+ (32)         [93,96] (42)     17- (40)
          102 (38)          90 (44)          29 (48)
T         103 (2)           91 (6)           27 (8)
          [95,99] (10)      79 (7)           [21,26] (15)
          115+ (14)         85 (19)          20- (18)
          98 (17)           [79,87] (23)     33 (26)
          110+ (30)         84 (29)          17- (27)
          99 (31)           [83,88] (41)     [19,23] (35)
          [92,100] (37)     81 (45)          [21,25] (36)
          89 (43)           [82,88] (47)     21 (46)


(b) To appraise whether or not the hypothesis that the measurements of the two power stations are similar is statistically plausible. With this intention we pose a 3 x 2 factorial experiment to determine the effects that the three distances (D) and the two power stations (P) have on the environmental radiological measurements. We will analyse the complete model with interactions

$y_{ijh} = \eta + D_i + P_j + DP_{ij} + \sigma\varepsilon_{ijh} \qquad (4)$

($i = 100, 600, 1000$ and $j = A, T$), with the usual constraints $\sum_i D_i = 0$, $\sum_j P_j = 0$, $\sum_i DP_{ij} = 0$ and $\sum_j DP_{ij} = 0$, where $y_{ijh}$ denote the radiological gamma emissions and $\varepsilon_{ijh}$ are error terms.

From a general linear model perspective, the errors $\varepsilon_i$ of Equation (3) and $\varepsilon_{ijh}$ of Equation (4) are usually assumed to follow a normal distribution. However, analysts of environmental radiological measurements prefer to assume that the error distribution is double exponential, due to the exponential decay of the emissions. Additionally, as was said, some values $y_i$ or $y_{ijh}$ are unknown, since they were registered as interval censored.

Let us observe that, even assuming that the error distribution is normal and that the data were received in batch, the parameters of models (3) and (4) cannot be inferred by ordinary least squares from the data of Table 1 unless we assign particular values to all the interval-censored data. However, the greater the length of the censoring interval of a grouped observation, the more unclear assigning a value to it becomes; thus, any assigned value may be questioned and may also have a determinant influence on the results. Although this influence becomes evident when the censoring interval is unbounded, contradictory upshots may be derived from assigning different values to the grouped observations with finite censoring intervals. The following cases show how contradictory statistical inferences can be obtained after assigning different values to the censored data of Table 1. For simplicity's sake, attention will be focussed on testing the hypotheses $H_0: \mu_A = \mu_T$ (in model (3)) and $H_0': P_A = P_T = 0$ (in model (4)) against their opposite alternatives, at the 5 per cent $\alpha$-level.

Case 1: $H_0$ and $H_0'$ are accepted and rejected, respectively, when each grouped observation is given a value equal to the finite extreme of its censoring interval, if this is unbounded; otherwise, the value is equal to the lower or upper extreme of its censoring interval, depending on whether the observation is from station A or T, respectively.

Case 2: Both $H_0$ and $H_0'$ result in being rejected if we assign the values mentioned above to the infinite interval-censored data, and the assignations given to the finite interval grouped data of stations A and T are the other way round compared with case 1, that is, the upper/lower extreme of the grouping intervals to the data of station A/T.

Case 3: Pivoting again from the assignation rules of case 1, we have given the values 12, 10 and 7 to the grouped data 22-, 20- and 17- of station A, and the values 130 and 125 to the grouped observations 115+ and 110+ of station T. Now we find that $\mu_A = \mu_T$ and $P_A = P_T = 0$ are both accepted.

Case 4: Finally, as an extreme case of this sensitivity analysis, it can be shown that a single assignation to an infinite interval-censored datum may completely modify the results. If we maintain the assignations of case 1 with the sole exception of giving the value 200 to the grouped datum 115+ of station T at distance 100, the hypotheses $H_0$ and $H_0'$ result in being rejected and accepted, respectively.

Aside from the assumption that the data were recorded in batch, it is important to highlight that the former standard statistical analyses were also made assuming that the errors follow a normal distribution; otherwise, it is well known that the t-F statistics are not applicable. Thus, if we assume, for instance, that the errors follow a Laplace distribution, as usually happens with radiological measures, a new problem arises, which adds to the statistical inconclusiveness, mentioned above, that is derived from the existence of grouped data. Finally, let the sequential data reception come into play. If, as said, we wish to put forward estimations and inferences after receiving a certain batch of data and update them as a new observation (or group of them) is received, then new conflicting elements appear in the sequential statistical analyses and their consistency. This joint situation is tackled in Section 6, where the data of Table 1 are analysed in detail, once our algorithm is stated. At this moment it is important to highlight that, for a fixed sample size, the asymptotic covariance matrix of the slope parameter estimate is easy to estimate with our algorithm, whereas its computation using ML techniques (either directly or through the EM algorithm) is far more complicated. Although this point will be clarified in Section 6, let us advance that in the first case only first derivatives are involved, while with the ML techniques the second derivatives that form part of the Hessian of the log-likelihood do not admit an explicit expression and need to be numerically evaluated. These comments sum up the potentialities of the algorithm proposed in this paper, which has a direct antecedent in Rivero and Valdes (2004), as will be explained in the next section.

3. REMOTE AND DIRECT ORIGINS OF OUR ALGORITHM

The remote precedents of the proposed algorithm to treat the type of data and models described above can be found, in chronological order, in (1) the procedure given in Healy and Westmacott (1956), (2) the missing information principle of Orchard and Woodbury (1972) and (3) the EM algorithm (Dempster et al., 1977) when the error distribution of the linear model is normal. More recent bibliographical antecedents are James and Smith (1984), Ritov (1990) and Anido et al. (2000). However, the direct precursor must undoubtedly be sought in Rivero and Valdes (2004) (Section 3, p. 471), where the authors suggest an estimating algorithm useful when the data are sequentially received and the scale parameter $\sigma$ of model (1) is assumed to be known. This algorithm only iterates as the sample size, $n$, increases. It starts from an arbitrary size, say $n_0$, and an associated guess of $\beta$ (let us denote this by $\beta^{n_0}$, which, in the absence of any other rational criterion, we suggest taking equal to the OLS estimate of $\beta$ based on the ungrouped data of the initial sample). Strictly speaking, the algorithm is formalized as follows:

Basic algorithm assuming that the scale parameter is known.

Initialization: Let $n_0$ and $\beta^{n_0}$ be the initial algorithm values (mentioned above).

Iteration: Assuming that $\beta^{n-1}$ is given, the iteration process runs through the following steps once the $n$th new datum is recorded:

(1) Conditional mean imputation step: For $i = 1, \ldots, n$, compute $y_i(\beta^{n-1}) = y_i$ if $y_i$ is an ungrouped observation; otherwise, $y_i(\beta^{n-1}) = x_i'\beta^{n-1} + E(\sigma\varepsilon_i \mid -x_i'\beta^{n-1} + c_{h-1} < \sigma\varepsilon_i \le -x_i'\beta^{n-1} + c_h)$ if $y_i$ is within the grouping interval $(c_{h-1}, c_h]$. Then, define $y^n(\beta^{n-1}) = (y_1(\beta^{n-1}), \ldots, y_n(\beta^{n-1}))$.

(2) Updating step (OLS projections): $\beta^n = (X_n'X_n)^{-1} X_n'\, y^n(\beta^{n-1})$.

(3) $\beta^{n-1} \leftarrow \beta^n$, and return to Step 1.

In this algorithm $E(\sigma\varepsilon_i \mid -x_i'\beta^n + c_{h-1} < \sigma\varepsilon_i \le -x_i'\beta^n + c_h)$ denotes the conditional expectation of the error term $\sigma\varepsilon_i$ assuming that its corresponding grouped datum $y_i$ is within the classification interval $(c_{h-1}, c_h]$ and, as usual, $X_n'X_n = \sum_{i=1}^{n} x_i x_i'$. The point $\beta^n = \beta^n(\sigma)$ defines the estimate of the slope parameter of the linear model (1) which corresponds to the sample size $n$.
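To fix ideas, here is a minimal Python sketch (ours, not the authors' code) of one loop of the basic algorithm for the particular case of standard normal errors, where the truncated conditional mean in the imputation step has the classical closed form; the function names and the per-observation interval representation are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm

def truncated_mean(a, b, sigma):
    """E(sigma*eps | a < sigma*eps <= b) for standardized eps ~ N(0, 1)."""
    za, zb = a / sigma, b / sigma
    p = norm.cdf(zb) - norm.cdf(za)
    return sigma * (norm.pdf(za) - norm.pdf(zb)) / p

def basic_update(X, y, grouped, intervals, beta_prev, sigma):
    """One loop: conditional mean imputation step, then OLS projection.
    grouped[i] marks y[i] as only observed through the interval
    intervals[i] = (c_lo, c_hi]; X is the n x m design matrix."""
    y_imp = np.array(y, dtype=float)
    for i in np.where(grouped)[0]:
        c_lo, c_hi = intervals[i]
        mu = X[i] @ beta_prev                      # x_i' beta^{n-1}
        y_imp[i] = mu + truncated_mean(c_lo - mu, c_hi - mu, sigma)
    return np.linalg.solve(X.T @ X, X.T @ y_imp)   # OLS projection (step 2)
```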

Remark 1: Although the mean imputation step resembles the expectation step of the EM algorithm, the latter and the basic algorithm are not comparable, since the EM iterates under the assumption that the sample size is fixed. This means, as was said, that if the data are sequentially recorded, the EM algorithm (as well as the direct numerical optimization of the maximum likelihood function) has to be rerun when new data are received. In this sense, the iterations of either the EM algorithm or the direct ML procedures are, in fact, nested in the sample size loops. This point will be addressed in detail later on.

Remark 2: Equally, in spite of the incomparability between the basic algorithm and the EM, the updating step of the basic algorithm looks like the maximization step of the EM if the error terms are normally distributed. Otherwise, this resemblance completely disappears.

Remark 3: For a fixed sample size $n$, the estimates $\beta^n$ and the maximum likelihood estimates differ, independently of the error distribution. In spite of this, if the error distribution is strongly unimodal (thus, not necessarily normal) and we assume some additional weak conditions, the asymptotic properties of both estimates are similar. Strictly speaking, this fact will be synthesized in Theorem 1.

Let us partition the set of indices $I_n = \{1, \ldots, n\}$ into the two subsets $I_n^g$ and $I_n^u$ which correspond to the indices of the grouped and ungrouped data, respectively, when the sample size is $n$. Let us assume that $X_n'X_n = \sum_{i \in I_n} x_i x_i'$ is a full rank matrix and let us decompose it into the two summands

$X_n'X_n = (X_n'X_n)^g + (X_n'X_n)^u,$

where $(X_n'X_n)^g = \sum_{i \in I_n^g} x_i x_i'$ and $(X_n'X_n)^u$ is defined in a similar way.

Theorem 1 ($\beta^n$-asymptotics, as $n \to \infty$). Let us assume that the error distribution is strongly unimodal. If, for all $n$, $(X_n'X_n)^u$ and $(X_n'X_n)^g$ are positive definite matrices, $\|x_n\| \le K < \infty$, and the minimum eigenvalue of $n^{-1}(X'X)^u$ is greater than some $\lambda > 0$, then $\beta^n$ is a consistent estimate of the slope parameter $\beta$ and

$\sqrt{n}\,(\beta^n - \beta) \xrightarrow{D} N(0, \Lambda) \quad \text{as } n \to \infty. \qquad (5)$

Additionally, the asymptotic covariance matrix $\Lambda$ can be consistently estimated by

$\Lambda_n = n\,(X_n' M_n X_n)^{-1}(X_n' R_n X_n)(X_n' M_n X_n)^{-1}. \qquad (6)$

The diagonal matrices $M_n = \operatorname{diag}(m_i^n)$ and $R_n = \operatorname{diag}(r_i^n)$, both of order $n$, are, respectively, given by $m_i^n = 1$ if $i \in I_n^u$; otherwise,

$m_i^n = \frac{\partial}{\partial a}\,E(\sigma\varepsilon_i \mid a + c_{h-1} < \sigma\varepsilon_i \le a + c_h)\Big|_{a = -x_i'\beta^n} \quad \text{if } i \in I_n^g \text{ and } y_i \in (c_{h-1}, c_h], \qquad (7)$

and

$r_i^n = \sigma^2 \text{ if } i \in I_n^u; \text{ otherwise, } r_i^n = \operatorname{Var}(\varepsilon_i^*) \text{ if } i \in I_n^g \text{ and } y_i \in (c_{h-1}, c_h], \qquad (8)$

where $\varepsilon_i^*$ is a discrete random variable which takes the values $E(\sigma\varepsilon_i \mid -x_i'\beta^n + c_{h-1} < \sigma\varepsilon_i \le -x_i'\beta^n + c_h)$ with probabilities $\Pr(-x_i'\beta^n + c_{h-1} < \sigma\varepsilon_i \le -x_i'\beta^n + c_h)$, for $h = 1, \ldots, r$. (Proof: see Rivero and Valdes, 2004, pp. 482-486.)

Observations about this theorem:

(1) It is clear that

$E(\sigma\varepsilon_i \mid -x_i'\beta^n + c_{h-1} < \sigma\varepsilon_i \le -x_i'\beta^n + c_h) = \dfrac{\sigma \int_{(-x_i'\beta^n + c_{h-1})\sigma^{-1}}^{(-x_i'\beta^n + c_h)\sigma^{-1}} x f(x)\,dx}{\Pr(-x_i'\beta^n + c_{h-1} < \sigma\varepsilon_i \le -x_i'\beta^n + c_h)} \qquad (9)$

and

$\Pr(-x_i'\beta^n + c_{h-1} < \sigma\varepsilon_i \le -x_i'\beta^n + c_h) = \int_{(-x_i'\beta^n + c_{h-1})\sigma^{-1}}^{(-x_i'\beta^n + c_h)\sigma^{-1}} f(x)\,dx = F\big((-x_i'\beta^n + c_h)\sigma^{-1}\big) - F\big((-x_i'\beta^n + c_{h-1})\sigma^{-1}\big), \qquad (10)$

where $F$ denotes the distribution function of the standardized errors.

(2) Clearly

$\frac{\partial}{\partial a}\left(\sigma \int_{(a + c_{h-1})\sigma^{-1}}^{(a + c_h)\sigma^{-1}} x f(x)\,dx\right)\Bigg|_{a = -x_i'\beta^n} = \sigma^{-1}\Big[(-x_i'\beta^n + c_h)\,f\big((-x_i'\beta^n + c_h)\sigma^{-1}\big) - (-x_i'\beta^n + c_{h-1})\,f\big((-x_i'\beta^n + c_{h-1})\sigma^{-1}\big)\Big]$

and

$\frac{\partial}{\partial a}\Pr(a + c_{h-1} < \sigma\varepsilon_i \le a + c_h)\Big|_{a = -x_i'\beta^n} = \sigma^{-1}\Big[f\big((-x_i'\beta^n + c_h)\sigma^{-1}\big) - f\big((-x_i'\beta^n + c_{h-1})\sigma^{-1}\big)\Big].$

Thus, the first derivative involved in (7) admits an explicit expression in terms of the density and distribution functions of the standardized errors (a computational sketch is given after these observations).

(3) It follows from (5) and (6) that, if $n$ is sufficiently large, $\beta^n$ approximately follows the multivariate normal distribution $N(\beta, n^{-1}\Lambda_n)$, which allows the use of standard procedures to carry out statistical inferences (in terms of either confidence regions or hypothesis testing) on the true slope parameter.
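As an illustration (ours, not from the paper), the coefficient $m_i^n$ of (7) can be evaluated by the quotient rule applied to the ratio (9)/(10), using the explicit derivatives of observation (2). The sketch below assumes standardized Laplace errors and uses numerical quadrature for the numerator of (9); all names are ours.

```python
import numpy as np
from scipy.integrate import quad

SQ2 = np.sqrt(2.0)
f = lambda x: np.exp(-SQ2 * abs(x)) / SQ2     # standardized Laplace density
F = lambda x: 0.5 * np.exp(SQ2 * x) if x < 0 else 1.0 - 0.5 * np.exp(-SQ2 * x)

def m_coefficient(a, c_lo, c_hi, sigma):
    """d/da E(sigma*eps | a + c_lo < sigma*eps <= a + c_hi), as in (7)."""
    lo, hi = (a + c_lo) / sigma, (a + c_hi) / sigma
    D = F(hi) - F(lo)                                # the probability (10)
    N = sigma * quad(lambda x: x * f(x), lo, hi)[0]  # the numerator of (9)
    # explicit derivatives from observation (2); terms vanish at infinity
    t_hi = (a + c_hi) * f(hi) / sigma if np.isfinite(c_hi) else 0.0
    t_lo = (a + c_lo) * f(lo) / sigma if np.isfinite(c_lo) else 0.0
    dD = (f(hi) - f(lo)) / sigma
    return ((t_hi - t_lo) * D - N * dD) / D**2       # quotient rule
```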

4. THE RATIONALE OF THE PROPOSED ALGORITHM: RESULTING LOOPS AND PROPERTIES

In model (1) with a fixed sample size and grouped and ungrouped data such as was mentioned in Section 1, if the true values of the parameters $\beta$ and $\sigma$ were known (which is not the case), the scale parameter could be consistently approximated from the data sample of size $n$, using standard variance decomposition techniques, by means of

$s_n^2(\beta, \sigma) = s_{wn}^2(\beta, \sigma) + s_{bn}^2(\beta, \sigma), \qquad (11)$

where the between and within variances $s_{bn}^2(\beta, \sigma)$ and $s_{wn}^2(\beta, \sigma)$ satisfy, respectively,

$n s_{bn}^2(\beta, \sigma) = \sum_{i \in I_n^u} (y_i - x_i'\beta)^2 + \sum_{i \in I_n^g} \sum_{h=1}^{r} I(c_{h-1} < y_i \le c_h)\, E^2(\sigma\varepsilon_i \mid -x_i'\beta + c_{h-1} < \sigma\varepsilon_i \le -x_i'\beta + c_h)$

and

$n s_{wn}^2(\beta, \sigma) = \sum_{i \in I_n^g} \sum_{h=1}^{r} I(c_{h-1} < y_i \le c_h)\, \operatorname{Var}(\sigma\varepsilon_i \mid -x_i'\beta + c_{h-1} < \sigma\varepsilon_i \le -x_i'\beta + c_h),$

with $E(\sigma\varepsilon_i \mid -x_i'\beta + c_{h-1} < \sigma\varepsilon_i \le -x_i'\beta + c_h)$ computed in a similar way to (9) and

$\operatorname{Var}(\sigma\varepsilon_i \mid -x_i'\beta + c_{h-1} < \sigma\varepsilon_i \le -x_i'\beta + c_h) = \dfrac{\sigma^2 \int_{(-x_i'\beta + c_{h-1})\sigma^{-1}}^{(-x_i'\beta + c_h)\sigma^{-1}} x^2 f(x)\,dx}{\Pr(-x_i'\beta + c_{h-1} < \sigma\varepsilon_i \le -x_i'\beta + c_h)} - E^2(\sigma\varepsilon_i \mid -x_i'\beta + c_{h-1} < \sigma\varepsilon_i \le -x_i'\beta + c_h). \qquad (12)$

Briefly, we can write

$n s_n^2(\beta, \sigma) = \sum_{i \in I_n^u} (y_i - x_i'\beta)^2 + \sigma^2 \sum_{i \in I_n^g} \sum_{h=1}^{r} I(c_{h-1} < y_i \le c_h)\, \dfrac{\int_{(-x_i'\beta + c_{h-1})\sigma^{-1}}^{(-x_i'\beta + c_h)\sigma^{-1}} x^2 f(x)\,dx}{\int_{(-x_i'\beta + c_{h-1})\sigma^{-1}}^{(-x_i'\beta + c_h)\sigma^{-1}} f(x)\,dx}. \qquad (13)$

Clearly, $E\big((y_i - x_i'\beta)^2\big) = \sigma^2$ if $i \in I_n^u$, whereas

$E\left(\sum_{h=1}^{r} I(c_{h-1} < y_i \le c_h)\, \dfrac{\int_{(-x_i'\beta + c_{h-1})\sigma^{-1}}^{(-x_i'\beta + c_h)\sigma^{-1}} x^2 f(x)\,dx}{\int_{(-x_i'\beta + c_{h-1})\sigma^{-1}}^{(-x_i'\beta + c_h)\sigma^{-1}} f(x)\,dx}\right) = 1$

if $y_i$ is grouped. Thus, $s_n^2(\beta, \sigma) \to \sigma^2$ a.e. as $n \to \infty$; therefore, with probability 1, the following equality holds in the limit:

$s_\infty(\beta, \sigma) = \sigma. \qquad (14)$

However, since the true slope and scale parameters are unknown, $s_n(\beta, \sigma)$ is incomputable. Nevertheless, these expressions together with the basic algorithm have induced us to extend expression (11) to any pair of possible values $(\beta^*, \sigma^*)$ of the parameters by means of

$n s_n^2(\beta^*, \sigma^*) = \sum_{i \in I_n^u} (y_i - x_i'\beta^*)^2 + \sigma^{*2} \sum_{i \in I_n^g} \sum_{h=1}^{r} I(c_{h-1} < y_i \le c_h)\, \dfrac{\int_{(-x_i'\beta^* + c_{h-1})\sigma^{*-1}}^{(-x_i'\beta^* + c_h)\sigma^{*-1}} x^2 f(x)\,dx}{\int_{(-x_i'\beta^* + c_{h-1})\sigma^{*-1}}^{(-x_i'\beta^* + c_h)\sigma^{*-1}} f(x)\,dx}. \qquad (15)$

As

$\sum_{i \in I_n^u} (y_i - x_i'\beta^*)^2 = \sum_{i \in I_n^u} (y_i - x_i'\beta)^2 + (\beta^* - \beta)'\Big[\sum_{i \in I_n^u} x_i x_i'\Big](\beta^* - \beta) + o(n) \quad \text{a.e.},$

reasoning as in (13), it is clear that

$s_n^2(\beta^*, \sigma^*) \xrightarrow[n \to \infty]{} \sigma^2 + p_0\,(\beta^* - \beta)'\,P\,(\beta^* - \beta) + p_1\,(\sigma^{*2} - \sigma^2) \qquad (16)$

with probability 1, where $P$ denotes the limit of the mean product matrix

$P = \lim_{n \to \infty} n_u^{-1}\,(X_n'X_n)^u$

and $n_u$ denotes the cardinal of $I_n^u$. Taking this into account, we propose the following estimating algorithm which, like the basic algorithm, starts with an initial sample size $n_0$ and whose loops are then executed as soon as a new observation is received:

Proposed robust estimating algorithm.

Initialization: Let $\beta^{n_0}$ and $\sigma^{n_0}$ be two arbitrary starting guesses of the slope and scale parameters, respectively. In the absence of any other external information, we suggest the use of the OLS estimates based on the ungrouped data of the initial sample of size $n_0$.

Iteration: Assuming that $\beta^{n-1}$ and $\sigma^{n-1}$ are known, let us update them through the following steps:

(1) Assuming that the scale parameter agrees with $\sigma^{n-1}$, run one loop of the basic algorithm given in Section 3 to update $\beta^{n-1}$. Therefore, with a notation similar to that used in the basic algorithm:

(a) First, compute the vector of conditional imputations

$y^n(\beta^{n-1}, \sigma^{n-1}) = \big(y_1(\beta^{n-1}, \sigma^{n-1}), \ldots, y_n(\beta^{n-1}, \sigma^{n-1})\big),$

where $y_i(\beta^{n-1}, \sigma^{n-1})$ is equal to $y_i$ if this datum is ungrouped; otherwise, it agrees with

$x_i'\beta^{n-1} + E(\sigma^{n-1}\varepsilon_i \mid -x_i'\beta^{n-1} + c_{h-1} < \sigma^{n-1}\varepsilon_i \le -x_i'\beta^{n-1} + c_h) \qquad (17)$

if the grouped datum $y_i$ is within the interval $(c_{h-1}, c_h]$.

(b) Secondly, use the OLS projections to update the current slope parameter, that is,

$\beta^n = (X_n'X_n)^{-1} X_n'\, y^n(\beta^{n-1}, \sigma^{n-1}). \qquad (18)$

(2) Update the scale parameter by

$\sigma^n = s_n(\beta^n, \sigma^{n-1}) \qquad (19)$

using (15).

(3) $\beta^{n-1} \leftarrow \beta^n$, $\sigma^{n-1} \leftarrow \sigma^n$, and return to step 1.

The point $(\beta^n, \sigma^n)$ will define our slope/scale estimates of model (1) for the sample size $n$. It is important to observe that, in harmony with the limit equality (14), this estimate fulfils

$\sigma^\infty = s_\infty(\beta^\infty, \sigma^\infty). \qquad (20)$
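Continuing the earlier sketch (again ours, with standard normal errors as the illustrative assumption and reusing truncated_mean and basic_update from above), one loop of the proposed algorithm adds the scale update (19), where the grouped terms contribute their conditional second moments as in (15):

```python
def truncated_second_moment(a, b, sigma):
    """E((sigma*eps)^2 | a < sigma*eps <= b) for eps ~ N(0, 1)."""
    za, zb = a / sigma, b / sigma
    p = norm.cdf(zb) - norm.cdf(za)
    ta = za * norm.pdf(za) if np.isfinite(za) else 0.0
    tb = zb * norm.pdf(zb) if np.isfinite(zb) else 0.0
    return sigma**2 * (1.0 + (ta - tb) / p)   # classical truncated moment

def proposed_update(X, y, grouped, intervals, beta_prev, sigma_prev):
    """One loop: slope steps (17)-(18), then the scale step (19) via (15)."""
    beta = basic_update(X, y, grouped, intervals, beta_prev, sigma_prev)
    ss = 0.0
    for i in range(len(y)):
        mu = X[i] @ beta
        if grouped[i]:
            c_lo, c_hi = intervals[i]
            ss += truncated_second_moment(c_lo - mu, c_hi - mu, sigma_prev)
        else:
            ss += (y[i] - mu) ** 2
    return beta, np.sqrt(ss / len(y))         # (beta^n, sigma^n)
```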

The following remarks synthesize the most important points about the proposed algorithm:

(1) After the enormous number of simulations included in the next section we are in a position to maintain that the proposed algorithm certainly stabilizes towards a point $(\beta^\infty, \sigma^\infty)$ which does depend on $\{(x_i, y_i)\}_{i=1,2,\ldots}$, although it is independent of both the starting values $(\beta^{n_0}, \sigma^{n_0})$ and the initial sample size $n_0$. This is in spite of the fact that both sequences $\{\beta^p\}$ and $\{\sigma^p\}$ are inter-linked and they differ when some of the initial values vary. This will be justified in terms of stochastic convergences below.

(2) The algorithm estimate $(\beta^\infty, \sigma^\infty)$ is, in fact, an M-estimator, since it fulfils the implicit relation (16). This means that we can expect its asymptotic distribution to be multivariate normal.

(3) The consistency of the estimates $(\beta^n, \sigma^n)$ can be proven under certain conditions. Although their strict formulations (see Fahrmeir and Kaufmann, 1985) are far from the scope of the present paper, the following arguments sketch the proof. As before, let $(\beta, \sigma)$ denote the true values of the model parameters. From (20) and (16) we can write the equality chain

$\sigma^{\infty 2} = s_\infty^2(\beta^\infty, \sigma^\infty) = p_0\,(\beta^\infty - \beta)'\,P\,(\beta^\infty - \beta) + p_0\,\sigma^2 + p_1\,\sigma^{\infty 2}.$

As the first term does not depend on $\beta^\infty$, necessarily $(\beta^\infty - \beta)'\,P\,(\beta^\infty - \beta) = 0$. Thus, if we assume that $P$ is a positive definite matrix, we can conclude that $\beta^\infty = \beta$ and $\sigma^\infty = \sigma$ with probability 1, in complete accordance with our empirical experience commented on in Point 1.

(4) From the last two points and taking into account the asymptotic distribution given in (5), our proposal is to use the following natural distributional approximation

$\beta^n \approx N(\beta, n^{-1}\Lambda_n), \qquad (21)$

where the covariance matrix $\Lambda_n = (\lambda_{ij}^n)$ is

$\Lambda_n = n\,(X_n' M_n X_n)^{-1}(X_n' R_n X_n)(X_n' M_n X_n)^{-1} \qquad (22)$

and the diagonal matrices $M_n = \operatorname{diag}(m_i^n)$ and $R_n = \operatorname{diag}(r_i^n)$ are defined as in (7) and (8), save that $\beta^n$ now stands for the slope estimate of the proposed algorithm (instead of Section 3's) and $\sigma^n$ substitutes $\sigma$. Finally, from (21), it is straightforward to make standard inferences on the true slope parameters. For example, at the 95% level, the approximate confidence interval that is derived from the proposed algorithm is given by

$\beta_j^n \pm \frac{1.96}{\sqrt{n}}\sqrt{\lambda_{jj}^n}. \qquad (23)$

Equally, tests of the hypothesis that some set of linear combinations of $\beta$ are all zero are carried out by the following standard procedure based on asymptotics (a computational sketch follows this remark). Let us suppose that $A$ is the $p \times m$ matrix specifying the $p$ linear combinations of $\beta$ that are to be tested and that $n$ is the current sample size. Under the null hypothesis $H_0: A\beta = 0$ (against $H_1$: not $H_0$),

the statistic $n\,\beta^{n\prime} A' (A\Lambda_n A')^{-1} A\beta^n$ approximately follows a (central) $\chi^2$-distribution with $p$ degrees of freedom. If $\chi_p^2(\alpha)$ is the value such that

$\Pr\{\chi_p^2 > \chi_p^2(\alpha)\} = \alpha,$

then, under $H_0$, $\Pr\{n\,\beta^{n\prime} A' (A\Lambda_n A')^{-1} A\beta^n > \chi_p^2(\alpha)\} \approx \alpha$. Thus, to test the null hypothesis, we use as our critical region

$R_\alpha^n = \big\{\beta^n : n\,\beta^{n\prime} A' (A\Lambda_n A')^{-1} A\beta^n > \chi_p^2(\alpha)\big\}. \qquad (24)$
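The following Python fragment is a sketch of ours (not the authors' code) of how the sandwich estimate (22) and the test (24) can be assembled once the diagonals of $M_n$ and $R_n$ have been computed as in (7) and (8); all names are illustrative.

```python
import numpy as np
from scipy.stats import chi2

def Lambda_n(X, m_diag, r_diag):
    """Sandwich estimate (22): n (X'MX)^-1 (X'RX) (X'MX)^-1."""
    XtMX = X.T @ (m_diag[:, None] * X)
    XtRX = X.T @ (r_diag[:, None] * X)
    B = np.linalg.inv(XtMX)
    return len(X) * B @ XtRX @ B

def wald_test(beta_n, Lam, A, n, alpha=0.05):
    """Reject H0: A beta = 0 when the statistic (24) exceeds chi2_p(alpha)."""
    stat = n * (A @ beta_n) @ np.linalg.solve(A @ Lam @ A.T, A @ beta_n)
    return stat, stat > chi2.ppf(1 - alpha, df=A.shape[0])
```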

(5) With respect to the partial memory capacity of the proposed estimating algorithm, the following point merits comment: only the individual values $(y_i, x_i)$ corresponding to the grouped $y_i$-values need to be stored, since for the rest there exist recurrence formulas by which the update of the current estimates can be implemented when a new observation is received (a sketch of such recurrences is given below). This is a relevant fact, since the natural opponents of our algorithm are completely memory-less. A detailed analysis of this fact is included in the Annexe at the end of this paper.
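A minimal sketch (ours) of the kind of recurrence involved: $X_n'X_n$ and the ungrouped part of $X_n'y$ grow by rank-one updates, so ungrouped rows can be discarded after processing, whereas grouped pairs must be kept because their imputations (17) change with every new $(\beta, \sigma)$.

```python
import numpy as np

class RunningOLSPieces:
    """Sketch (ours) of the recurrences behind the partial memory."""
    def __init__(self, m):
        self.XtX = np.zeros((m, m))
        self.Xty_u = np.zeros(m)
        self.kept = []                  # only grouped (x_i, interval) pairs

    def add(self, x, y=None, interval=None):
        self.XtX += np.outer(x, x)      # rank-one update of X'X
        if interval is None:
            self.Xty_u += y * x         # (x, y) need not be stored
        else:
            self.kept.append((x, interval))  # re-imputed at every loop
```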

5. SIMULATION STUDY ON THE PERFORMANCE OF THE ALGORITHM

In this section we present a large number of simulations made with the intention of analysing the performance of the proposed algorithm. We have considered the model

$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \sigma\varepsilon_i, \qquad (25)$

in which (a) we have fixed the slope parameter $\beta = (1, -4, 3)'$; (b) 200 independent variables $x_i = (x_{i1}, x_{i2})'$, $i = 1, \ldots, 200$, were selected from a uniform distribution on the square $[-5, 5]^2$; (c) the values $\sigma = 1, 2, 4$ were assigned to the scale parameter; (d) the errors $\varepsilon_i$ were generated from the following distributions, duly standardized: (i) Laplace(1), that is, with density function $f(x) = 2^{-1}\exp(-|x|)$, (ii) Logistic, that is, $f(x) = \exp(-x)(1 + \exp(-x))^{-2}$, and (iii) Standard Normal; and (e) we have computed the corresponding dependent variables $y_i$, after which they were grouped with probabilities $p_0$ equal to 0.2, 0.4 and 0.6 using the grouping intervals

$(-\infty, -7],\ (-7, -2],\ (-2, 3],\ (3, 6]\ \text{and}\ (6, \infty).$

For each combination of the values $\sigma$ and $p_0$ and each standardized error distribution, we have made 250 replications of the data $(\varepsilon_i, y_i)$ and of the grouping process. Then, for each replication $r$, we ran the proposed algorithm from the following starting values: $n_0 = 25$ and $\beta^{25}$ equal to the OLS estimate calculated from the ungrouped data of the initial sample. In this way, we obtained the parameter estimates $(\beta^n(r), \sigma^n(r))$ and the covariance matrix estimate $\Lambda_n(r)$ given in (22) for $n = 25, \ldots, 200$. With these replicated values we calculated:

(1) The empirical biases, variances and mean square errors of the estimates of the slope and scale parameters, given by

$B(\beta_j^n) = \big|E(\beta_j^n) - \beta_j\big| = \Big|250^{-1}\sum_{r=1}^{250} \beta_j^n(r) - \beta_j\Big|,$

$\operatorname{Var}(\beta_j^n) = E\big(\beta_j^n - E(\beta_j^n)\big)^2 = 250^{-1}\sum_{r=1}^{250}\big(\beta_j^n(r) - E(\beta_j^n)\big)^2 \qquad (26)$

and

$\widehat{\operatorname{MSE}}(\beta_j^n) = E\big(\beta_j^n - \beta_j\big)^2 = \operatorname{Var}(\beta_j^n) + B^2(\beta_j^n)$

($j = 1, 2, 3$), and similarly for $\sigma^n$ (see the computational sketch at the end of this item). With the objective of making comparisons, for each replication we have also computed two types of ordinary least squares parameter estimates. The first was based on the complete data, that is, without being submitted to the grouping process explained above; these estimates are denoted by $\beta^{OLS,n}$ and $\sigma^{OLS,n}$, respectively. For its part, the second type was only based on the non-grouped data once the grouping process was implemented and the resulting grouped data discarded; the estimates obtained are denoted by $\beta^{ols,n}$ and $\sigma^{ols,n}$ (with the super-index in lower case letters). Also, the empirical biases, variances and mean square errors of the OLS and ols estimates were calculated as in (26). For the different value combinations of $\sigma$ and $p_0$ mentioned above, we have assumed Laplacian errors in Table 2a, which shows the empirical biases and mean square errors of the slope and scale parameter estimates based on the complete sample of size 200 and their sequential precedents corresponding to the sample sizes 50 and 100. The equivalent results for logistic and normal errors are of similar orders to those shown in Table 2a. By way of proof we have partially included them in Table 2b, which has an easily recognizable structure having seen Table 2a, and shows biases and mean square errors only when $p_0 = 0.4$ (the authors commit themselves to send complete results upon request).
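As an illustration (ours, not the authors' code), the summaries in (26) aggregate over the $R = 250$ replications as follows; `betas` is assumed to be an $(R, 3)$ array of replicated slope estimates and `beta_true` the vector $(1, -4, 3)$.

```python
import numpy as np

def empirical_summary(betas, beta_true):
    mean = betas.mean(axis=0)
    bias = np.abs(mean - beta_true)             # |E(beta_j^n) - beta_j|
    var = ((betas - mean) ** 2).mean(axis=0)    # empirical variance
    mse = var + bias**2                         # MSE = Var + Bias^2
    return bias, var, mse
```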

On seeing these tables, the following general remarks can be made:

• Although the three slope estimates $\beta^n$, $\beta^{OLS,n}$ and $\beta^{ols,n}$ are consistent for the error distributions considered, only the first one is asymptotically unbiased (and normally distributed) independently of the distributions mentioned above. This last property can only be assured for $\beta^{OLS,n}$ and $\beta^{ols,n}$ if the error distribution is normal, in which case these estimates agree with the maximum likelihood estimates based on the complete data and on the non-discarded data, respectively. Despite the theoretical basis mentioned above, the similarity between the biases when the errors follow a Laplace distribution and those obtained with the other error distributions points to the fact that the asymptotic unbiasedness of the slope estimates when the error distribution is normal seems to be extendable to the rest of the error distributions considered in our simulation study.

Table 2a. Empirical biases and mean square errors of the model parameter estimates (Laplacian errors). Column groups: proposed estimates ($\beta_0^n$, $\beta_1^n$, $\beta_2^n$, $\sigma^n$) | OLS on complete data | ols on ungrouped data only.

Empirical biases
p1  s  n
0.2 1  50   -0.016  0.032 -0.005 -0.019 | -0.010  0.026 -0.003 -0.017 | -0.018  0.040 -0.005 -0.024
       100  -0.003  0.015 -0.010 -0.012 | -0.002  0.014 -0.015 -0.009 | -0.006  0.011 -0.005 -0.013
       200   0.016 -0.006 -0.003 -0.013 |  0.015 -0.003 -0.009 -0.009 |  0.012 -0.005 -0.001 -0.015
    2  50    0.029 -0.048  0.014 -0.017 |  0.028 -0.051  0.017 -0.004 |  0.048 -0.017 -0.013 -0.025
       100   0.016  0.005 -0.003 -0.005 |  0.009  0.003 -0.007  0.007 |  0.027 -0.013 -0.005 -0.006
       200   0.007 -0.002  0.016 -0.011 |  0.005 -0.004  0.018 -0.006 | -0.001 -0.001  0.015 -0.009
    4  50   -0.042  0.049 -0.017 -0.062 | -0.059  0.052 -0.002 -0.046 | -0.011  0.074 -0.047 -0.059
       100   0.061 -0.039 -0.007 -0.044 |  0.067 -0.041 -0.007 -0.034 |  0.061 -0.050 -0.005 -0.057
       200   0.033 -0.007 -0.025 -0.055 |  0.030 -0.005 -0.014 -0.028 |  0.054 -0.012 -0.031 -0.042
0.4 1  50    0.012 -0.025 -0.003 -0.031 |  0.016 -0.025 -0.013 -0.026 |  0.011 -0.020  0.003 -0.034
       100  -0.002 -0.006  0.012 -0.006 |  0.002 -0.003  0.004 -0.003 | -0.007 -0.001  0.012 -0.001
       200   0.013 -0.001 -0.017 -0.010 |  0.017 -0.004 -0.020 -0.003 |  0.019 -0.003 -0.019 -0.005
    2  50   -0.015 -0.020  0.043 -0.046 | -0.011 -0.012  0.042 -0.034 | -0.012 -0.027  0.025 -0.048
       100  -0.029 -0.010  0.031  0.006 | -0.036  0.003  0.024  0.012 | -0.007 -0.026  0.039  0.002
       200   0.015  0.001 -0.012  0.003 |  0.017 -0.005 -0.017  0.013 |  0.013  0.001 -0.010  0.014
    4  50   -0.059  0.064 -0.075 -0.002 | -0.049  0.065 -0.074  0.009 |  0.018 -0.045 -0.055 -0.014
       100  -0.068 -0.039  0.095 -0.103 | -0.067 -0.028  0.106 -0.077 | -0.739 -0.049  0.102 -0.089
       200  -0.052  0.058  0.056  0.003 | -0.057  0.050  0.044 -0.018 | -0.078  0.064  0.066  0.022
0.6 1  50   -0.015  0.036  0.012 -0.032 | -0.015  0.029  0.014 -0.022 | -0.016  0.015  0.013 -0.036
       100   0.003  0.009 -0.010 -0.006 |  0.007  0.003 -0.009 -0.002 | -0.014  0.017  0.014  0.005
       200   0.011 -0.002 -0.007 -0.020 |  0.014 -0.002 -0.005 -0.002 |  0.025 -0.011 -0.018 -0.014
    2  50   -0.079  0.060  0.035 -0.022 | -0.098  0.066  0.032 -0.016 | -0.058  0.068 -0.014 -0.057
       100  -0.013 -0.002  0.012 -0.029 | -0.013  0.001  0.012 -0.016 | -0.002 -0.004 -0.006 -0.043
       200   0.016 -0.010  0.002  0.017 |  0.015 -0.009 -0.006  0.033 | -0.007  0.022  0.001  0.028
    4  50    0.017  0.005 -0.034 -0.011 | -0.005  0.013 -0.015 -0.010 | -0.017 -0.011 -0.110 -0.057
       100  -0.039 -0.016  0.047 -0.065 | -0.031 -0.014  0.048 -0.003 |  0.026 -0.091  0.127 -0.086
       200  -0.009 -0.007  0.017 -0.064 | -0.010 -0.009  0.020 -0.032 | -0.052  0.030  0.036 -0.067

Empirical mean square errors
p1  s  n
0.2 1  50    0.040  0.028  0.033  0.028 |  0.035  0.024  0.030  0.025 |  0.049  0.033  0.039  0.031
       100   0.028  0.018  0.015  0.013 |  0.024  0.016  0.013  0.013 |  0.029  0.019  0.017  0.015
       200   0.011  0.009  0.009  0.007 |  0.010  0.007  0.007  0.007 |  0.012  0.010  0.011  0.009
    2  50    0.161  0.132  0.118  0.102 |  0.157  0.129  0.112  0.097 |  0.196  0.159  0.150  0.128
       100   0.081  0.058  0.057  0.049 |  0.075  0.056  0.052  0.043 |  0.098  0.063  0.068  0.062
       200   0.040  0.033  0.030  0.029 |  0.036  0.032  0.029  0.028 |  0.046  0.034  0.036  0.035
    4  50    0.582  0.399  0.448  0.525 |  0.565  0.393  0.429  0.519 |  0.714  0.529  0.555  0.622
       100   0.337  0.265  0.255  0.226 |  0.338  0.265  0.257  0.240 |  0.399  0.314  0.315  0.288
       200   0.135  0.116  0.122  0.106 |  0.137  0.118  0.120  0.098 |  0.165  0.154  0.142  0.141
0.4 1  50    0.041  0.039  0.037  0.032 |  0.033  0.032  0.028  0.026 |  0.059  0.048  0.049  0.044
       100   0.019  0.019  0.017  0.017 |  0.016  0.016  0.014  0.014 |  0.029  0.029  0.024  0.024
       200   0.010  0.010  0.009  0.009 |  0.007  0.007  0.009  0.006 |  0.013  0.012  0.012  0.013
    2  50    0.205  0.144  0.154  0.101 |  0.186  0.121  0.140  0.086 |  0.347  0.255  0.224  0.142
       100   0.077  0.065  0.077  0.062 |  0.067  0.057  0.068  0.054 |  0.109  0.095  0.126  0.087
       200   0.047  0.033  0.034  0.027 |  0.041  0.030  0.031  0.026 |  0.070  0.048  0.050  0.044
    4  50    0.678  0.552  0.429  0.446 |  0.605  0.531  0.407  0.443 |  1.115  0.820  0.744  0.733
       100   0.302  0.208  0.217  0.213 |  0.312  0.201  0.234  0.205 |  0.548  0.326  0.388  0.304
       200   0.156  0.140  0.124  0.109 |  0.150  0.128  0.119  0.097 |  0.286  0.230  0.184  0.187
0.6 1  50    0.054  0.044  0.041  0.035 |  0.036  0.028  0.031  0.022 |  0.112  0.098  0.078  0.068
       100   0.024  0.020  0.024  0.019 |  0.016  0.014  0.017  0.014 |  0.049  0.040  0.043  0.037
       200   0.013  0.013  0.009  0.011 |  0.009  0.007  0.006  0.007 |  0.027  0.020  0.017  0.017
    2  50    0.183  0.152  0.151  0.133 |  0.171  0.122  0.124  0.118 |  0.379  0.302  0.307  0.264
       100   0.076  0.059  0.073  0.064 |  0.065  0.051  0.065  0.051 |  0.197  0.148  0.161  0.125
       200   0.046  0.031  0.042  0.029 |  0.039  0.029  0.034  0.024 |  0.097  0.063  0.080  0.060
    4  50    0.623  0.465  0.492  0.541 |  0.625  0.438  0.493  0.494 |  1.435  1.310  1.263  1.238
       100   0.315  0.227  0.212  0.229 |  0.296  0.236  0.220  0.216 |  0.722  0.516  0.621  0.471
       200   0.165  0.107  0.114  0.112 |  0.159  0.098  0.109  0.090 |  0.350  0.292  0.292  0.260

• Also the mean square errors of the slope estimates are of similar orders independently of the error distributions considered. For all of these, our proposed slope estimates have an efficiency similar to that of the OLS estimates. This means that the proposed algorithm is useful to avoid the negative consequences that are derived from the existence of grouped or missing data. In all of the cases, the empirical mean square errors increase as the proportion of grouped data grows and as the sample size decreases, in agreement with what could be expected in advance.

• The former conclusions (in terms of biases and MSEs) about the slope parameter estimates can be straightforwardly extended to the scale parameter estimate.

(2) The empirical covariance matrices of the slope parameter estimates $\beta^n$, $\beta^{OLS,n}$ and $\beta^{ols,n}$. These are denoted by $G_n$, $G_n^{OLS}$ and $G_n^{ols}$, respectively, and in harmony with (22) they were computed by

$G_n = 250^{-1}\sum_{r=1}^{250}\big(\beta^n(r) - E(\beta^n)\big)\big(\beta^n(r) - E(\beta^n)\big)' \qquad (27)$

and similarly for $G_n^{OLS}$ and $G_n^{ols}$. For each replication $r$, the empirical matrix $G_n$ should be compared with $n^{-1}\Lambda_n(r)$, since $\Lambda_n(r)$ approximates the asymptotic covariance matrix of $\sqrt{n}(\beta^n - \beta)$, as was indicated. For its part, the comparison between the mean matrix

$G_n^* = n^{-1} E(\Lambda_n) = n^{-1}\,250^{-1}\sum_{r=1}^{250} \Lambda_n(r) \qquad (28)$

and $G_n$ will allow us to evaluate the biases of the different elements of the covariance matrix estimate of the slope parameter estimates obtained with the proposed algorithm. Table 3a includes the distinct elements of the matrices $G_n^*$, $G_n$, $G_n^{OLS}$ and $G_n^{ols}$, respectively, when the errors are Laplacian. With logistic and normal errors the equivalent matrices are rather similar. Again, these have been partly included in Table 3b (which is similar to Table 2b) as proof of it.

Briefly speaking, it can be stressed that the matrices $G_n$, $G_n^*$ and $G_n^{OLS}$ are quite similar, independent of the values of the percentage of grouped data, the true scale parameter, the sample size and, also, independent of the error distributions considered in this study. This fact seems to indicate that the proposed algorithm tends to eliminate the negative consequences that spring from the existence of grouped or missing data. Additionally, as could be foreseen, the empirical variances of the matrices $G_n^{ols}$, which are based on the non-grouped data only, are larger than those of the former matrices $G_n$, $G_n^*$ and $G_n^{OLS}$, and the observed differences increase as the percentage of grouped data and the true scale parameter increase and, also, as the sample size decreases.

(3) Empirical efficiencies of the algorithm confidence interval estimates. At the 95% level, the coverage probability of the confidence intervals given in (23) can be empirically assessed by the expression

$C(\beta_j) = 250^{-1}\sum_{r=1}^{250} I\left(\beta_j^n(r) - \frac{1.96}{\sqrt{n}}\sqrt{\lambda_{jj}^n} \le \beta_j \le \beta_j^n(r) + \frac{1.96}{\sqrt{n}}\sqrt{\lambda_{jj}^n}\right). \qquad (29)$

These empirical coverage probabilities are included in Table 4 for the three error distributions simulated in this study.

As can be seen, the orders of these empirical coverage probabilities are rather similar for the error distributions mentioned above. In all of the cases the most sensitive element seems to be the true scale parameter $\sigma$ and, in second place, the percentage of grouped data. Nevertheless, as these probabilities are close to 0.95, it can again be said that the empirical efficiency of the proposed algorithm confidence interval estimates is encouraging.
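A one-line illustration (ours) of how (29) aggregates over replications; `betas` and `lam_jj` are assumed length-$R$ arrays of replicated $\beta_j^n(r)$ and $\lambda_{jj}^n(r)$ values.

```python
import numpy as np

def empirical_coverage(betas, lam_jj, beta_true_j, n, z=1.96):
    """Sketch of (29): fraction of intervals (23) covering the true beta_j."""
    half = z * np.sqrt(lam_jj / n)
    return np.mean((betas - half <= beta_true_j) & (beta_true_j <= betas + half))
```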


Table 2b. Empirical biases and mean square errors of the model parameter estimates ($p_1 = 0.4$). Column groups as in Table 2a: proposed | OLS (complete data) | ols (ungrouped data only), each with $\beta_0$, $\beta_1$, $\beta_2$, $\sigma$.

Logistic errors, empirical biases
s  n
1  50   0.026 -0.013 -0.018 -0.027 |  0.025 -0.005 -0.026 -0.024 |  0.019  0.002 -0.006 -0.036
   100  0.002  0.005  0.000 -0.004 |  0.000  0.009  0.001 -0.006 | -0.007  0.009  0.002 -0.001
   200 -0.012  0.010  0.000 -0.018 | -0.012  0.010  0.003 -0.011 | -0.003  0.006 -0.012 -0.018
2  50   0.022 -0.007 -0.002 -0.020 |  0.024  0.002 -0.002 -0.006 |  0.080 -0.027 -0.036 -0.024
   100 -0.036 -0.003  0.025  0.004 | -0.046  0.004  0.020  0.007 | -0.003 -0.026  0.020  0.001
   200  0.009 -0.005  0.009 -0.036 |  0.010 -0.010  0.003 -0.030 |  0.006 -0.002  0.014 -0.022
4  50  -0.063  0.075 -0.005  0.032 | -0.070  0.093 -0.041  0.010 | -0.113  0.055  0.050  0.006
   100 -0.072 -0.039  0.067 -0.057 | -0.085 -0.028  0.075 -0.046 | -0.120  0.007  0.087 -0.063
   200 -0.031  0.029  0.012 -0.046 | -0.035  0.036  0.004 -0.049 | -0.039  0.036  0.003 -0.032

Logistic errors, empirical mean square errors
1  50   0.047  0.034  0.042  0.021 |  0.035  0.029  0.029  0.016 |  0.064  0.050  0.054  0.027
   100  0.021  0.019  0.016  0.011 |  0.018  0.017  0.014  0.010 |  0.028  0.027  0.024  0.016
   200  0.011  0.010  0.010  0.005 |  0.009  0.007  0.007  0.004 |  0.015  0.012  0.014  0.006
2  50   0.181  0.134  0.129  0.074 |  0.164  0.120  0.110  0.072 |  0.310  0.239  0.196  0.109
   100  0.070  0.054  0.074  0.042 |  0.060  0.045  0.065  0.039 |  0.107  0.092  0.106  0.058
   200  0.042  0.040  0.031  0.019 |  0.039  0.036  0.028  0.017 |  0.065  0.059  0.040  0.026
4  50   0.627  0.520  0.479  0.309 |  0.589  0.513  0.453  0.287 |  1.044  0.873  0.811  0.570
   100  0.326  0.250  0.218  0.143 |  0.327  0.244  0.217  0.131 |  0.620  0.377  0.403  0.184
   200  0.122  0.118  0.113  0.076 |  0.121  0.116  0.110  0.068 |  0.204  0.205  0.202  0.120

Standard normal errors, empirical biases
1  50   0.050  0.037  0.045  0.023 |  0.038  0.031  0.031  0.017 |  0.069  0.054  0.057  0.029
   100  0.023  0.021  0.017  0.011 |  0.019  0.018  0.015  0.010 |  0.030  0.029  0.025  0.017
   200  0.011  0.010  0.010  0.006 |  0.009  0.008  0.008  0.005 |  0.016  0.013  0.015  0.007
2  50   0.193  0.143  0.139  0.079 |  0.175  0.128  0.118  0.077 |  0.332  0.255  0.210  0.117
   100  0.074  0.057  0.079  0.045 |  0.064  0.048  0.070  0.041 |  0.114  0.098  0.113  0.062
   200  0.045  0.042  0.033  0.021 |  0.041  0.039  0.030  0.018 |  0.070  0.063  0.042  0.027
4  50   0.671  0.556  0.513  0.331 |  0.630  0.548  0.484  0.307 |  1.117  0.934  0.868  0.610
   100  0.349  0.268  0.234  0.153 |  0.350  0.261  0.232  0.140 |  0.663  0.403  0.432  0.197
   200  0.131  0.126  0.121  0.081 |  0.129  0.124  0.118  0.073 |  0.219  0.220  0.216  0.128

Standard normal errors, empirical mean square errors
1  50   0.048  0.032  0.043  0.016 |  0.040  0.025  0.030  0.012 |  0.059  0.048  0.058  0.021
   100  0.029  0.022  0.017  0.009 |  0.022  0.017  0.014  0.006 |  0.041  0.027  0.024  0.010
   200  0.011  0.010  0.009  0.004 |  0.009  0.007  0.006  0.003 |  0.014  0.012  0.012  0.005
2  50   0.165  0.113  0.127  0.059 |  0.150  0.103  0.108  0.050 |  0.266  0.190  0.225  0.077
   100  0.074  0.057  0.072  0.031 |  0.068  0.055  0.064  0.028 |  0.096  0.091  0.101  0.037
   200  0.040  0.034  0.029  0.014 |  0.034  0.031  0.026  0.012 |  0.063  0.055  0.044  0.020
4  50   0.534  0.563  0.432  0.250 |  0.522  0.533  0.422  0.230 |  0.977  0.910  0.749  0.365
   100  0.274  0.228  0.200  0.087 |  0.269  0.225  0.198  0.075 |  0.486  0.392  0.319  0.155
   200  0.150  0.141  0.118  0.059 |  0.147  0.140  0.116  0.048 |  0.233  0.197  0.219  0.068


6. CASE STUDY: THE PROPOSED ALGORITHM VERSUS ITS ALTERNATIVES

With the real life data of Table 1, we have used the proposed algorithm to estimate, first, the parameters $\beta = (\mu_A, \mu_T, \gamma)$ and $\sigma$ of model (3) and, secondly, the parameters $\eta$, $D_i$, $P_j$, $DP_{ij}$ and $\sigma$ of the ANOVA model (4), among other models which are not included in this paper. Although the final analyses should refer to the complete data, we were required to have updated forecasts which could be supplied on request. The partial analyses were initiated as soon as the current sample size was $n_0 = 16$ (the first one third of the complete sample received). In this section we will show the results obtained with the sample sizes 16, 32 and 48 (one, two and three thirds of the total dataset).

With respect to model (3), Table 5 exhibits the sequences $(\mu_A^n, \mu_T^n, \gamma^n, \sigma^n)$, for $n = 16, \ldots, 48$, generated by the proposed algorithm from three different starting points $(\beta^{16}, \sigma^{16})$. The first of them, (123.919, 117.788, -74.415, 14.788), agrees with the OLS estimates based on the ungrouped data of the initial sample.

Table 3a. Covariance matrices $G_n^*$, $G_n$, $G_n^{OLS}$ and $G_n^{ols}$ given in Section 5 (Laplacian errors). Each matrix is reported through its distinct elements $g_{00}$, $g_{11}$, $g_{22}$, $g_{01}$, $g_{02}$, $g_{12}$.

$G_n^*$ | $G_n$
p1  s  n
0.2 1  50    0.040 0.033 0.033 -0.016 -0.016  0.000 |  0.039 0.027 0.033 -0.012 -0.017  0.000
       100   0.020 0.016 0.016 -0.008 -0.008  0.000 |  0.028 0.017 0.015 -0.011 -0.010  0.000
       200   0.010 0.007 0.007 -0.004 -0.004  0.000 |  0.010 0.008 0.008 -0.003 -0.005  0.000
    2  50    0.149 0.125 0.121 -0.063 -0.057  0.000 |  0.159 0.129 0.117 -0.047 -0.065 -0.008
       100   0.073 0.059 0.059 -0.030 -0.029  0.000 |  0.081 0.058 0.056 -0.023 -0.025 -0.008
       200   0.036 0.030 0.029 -0.015 -0.014  0.000 |  0.039 0.033 0.030 -0.018 -0.017  0.002
    4  50    0.568 0.463 0.463 -0.233 -0.225 -0.001 |  0.578 0.395 0.445 -0.193 -0.248  0.017
       100   0.271 0.218 0.220 -0.109 -0.108  0.000 |  0.332 0.263 0.253 -0.121 -0.164 -0.005
       200   0.134 0.108 0.108 -0.055 -0.053  0.000 |  0.134 0.116 0.121 -0.048 -0.048 -0.010
0.4 1  50    0.045 0.037 0.036 -0.018 -0.017  0.000 |  0.040 0.037 0.037 -0.015 -0.018  0.000
       100   0.022 0.019 0.018 -0.010 -0.008  0.000 |  0.019 0.019 0.017 -0.008 -0.007  0.000
       200   0.011 0.008 0.008 -0.004 -0.004  0.000 |  0.008 0.010 0.008 -0.004 -0.004  0.001
    2  50    0.148 0.122 0.123 -0.061 -0.057  0.000 |  0.204 0.143 0.152 -0.091 -0.085  0.007
       100   0.075 0.061 0.061 -0.032 -0.029  0.000 |  0.076 0.066 0.076 -0.023 -0.029 -0.012
       200   0.037 0.031 0.030 -0.015 -0.015  0.000 |  0.047 0.033 0.034 -0.019 -0.018 -0.001
    4  50    0.535 0.445 0.446 -0.215 -0.212 -0.011 |  0.672 0.546 0.422 -0.319 -0.173 -0.034
       100   0.250 0.205 0.204 -0.105 -0.098  0.002 |  0.288 0.206 0.208 -0.092 -0.122  0.005
       200   0.127 0.105 0.105 -0.052 -0.050 -0.001 |  0.153 0.136 0.121 -0.081 -0.064  0.007
0.6 1  50    0.052 0.043 0.042 -0.022 -0.020  0.001 |  0.053 0.042 0.040 -0.018 -0.020 -0.001
       100   0.025 0.021 0.021 -0.011 -0.010  0.000 |  0.023 0.020 0.023 -0.012 -0.012  0.002
       200   0.012 0.011 0.010 -0.005 -0.004  0.000 |  0.013 0.013 0.008 -0.005 -0.004 -0.001
    2  50    0.154 0.128 0.127 -0.064 -0.058 -0.001 |  0.176 0.147 0.148 -0.070 -0.092  0.004
       100   0.075 0.061 0.061 -0.032 -0.029  0.000 |  0.075 0.058 0.072 -0.028 -0.039  0.002
       200   0.037 0.031 0.031 -0.016 -0.014  0.000 |  0.046 0.031 0.041 -0.018 -0.023  0.004
    4  50    0.505 0.417 0.421 -0.210 -0.195 -0.003 |  0.620 0.463 0.489 -0.232 -0.228 -0.024
       100   0.232 0.192 0.193 -0.096 -0.088 -0.001 |  0.312 0.226 0.209 -0.128 -0.103  0.002
       200   0.116 0.094 0.094 -0.048 -0.043 -0.001 |  0.164 0.107 0.113 -0.056 -0.061 -0.004

$G_n^{OLS}$ | $G_n^{ols}$
p1  s  n
0.2 1  50    0.035 0.022 0.030 -0.012 -0.015  0.000 |  0.049 0.032 0.038 -0.016 -0.022  0.003
       100   0.023 0.016 0.013 -0.008 -0.008  0.000 |  0.029 0.019 0.017 -0.012 -0.011  0.001
       200   0.008 0.007 0.007 -0.003 -0.004  0.000 |  0.012 0.010 0.011 -0.004 -0.006  0.001
    2  50    0.156 0.127 0.111 -0.049 -0.063 -0.004 |  0.193 0.158 0.149 -0.064 -0.082 -0.005
       100   0.074 0.055 0.052 -0.024 -0.023 -0.007 |  0.098 0.063 0.068 -0.027 -0.032 -0.007
       200   0.037 0.032 0.029 -0.017 -0.017  0.002 |  0.046 0.034 0.036 -0.019 -0.021  0.002
    4  50    0.559 0.388 0.427 -0.184 -0.223  0.011 |  0.710 0.520 0.551 -0.236 -0.292  0.016
       100   0.332 0.263 0.255 -0.128 -0.161 -0.008 |  0.394 0.310 0.314 -0.134 -0.198 -0.013
       200   0.136 0.118 0.120 -0.049 -0.049 -0.007 |  0.161 0.154 0.140 -0.059 -0.054 -0.019
0.4 1  50    0.033 0.032 0.028 -0.013 -0.015  0.001 |  0.059 0.048 0.049 -0.023 -0.025  0.002
       100   0.016 0.016 0.014 -0.008 -0.005  0.000 |  0.029 0.029 0.022 -0.015 -0.010  0.000
       200   0.007 0.007 0.007 -0.003 -0.003  0.000 |  0.013 0.012 0.012 -0.004 -0.006  0.000
    2  50    0.184 0.121 0.138 -0.075 -0.080  0.004 |  0.345 0.252 0.222 -0.166 -0.131  0.020
       100   0.066 0.056 0.067 -0.018 -0.027 -0.012 |  0.109 0.094 0.124 -0.033 -0.056 -0.007
       200   0.040 0.031 0.031 -0.016 -0.015 -0.002 |  0.069 0.048 0.051 -0.029 -0.023 -0.001
    4  50    0.599 0.524 0.400 -0.280 -0.144 -0.040 |  1.110 0.814 0.737 -0.404 -0.401 -0.032
       100   0.296 0.199 0.224 -0.092 -0.130  0.007 |  0.530 0.323 0.376 -0.163 -0.209 -0.002
       200   0.145 0.126 0.117 -0.074 -0.058  0.002 |  0.279 0.226 0.179 -0.139 -0.100  0.013
0.6 1  50    0.036 0.027 0.031 -0.013 -0.017  0.001 |  0.111 0.098 0.077 -0.052 -0.033 -0.002
       100   0.016 0.014 0.017 -0.007 -0.008  0.000 |  0.049 0.039 0.042 -0.020 -0.021  0.000
       200   0.007 0.007 0.006 -0.003 -0.002  0.000 |  0.025 0.020 0.017 -0.012 -0.008 -0.001
    2  50    0.161 0.118 0.122 -0.061 -0.085  0.003 |  0.374 0.296 0.305 -0.144 -0.167  0.006
       100   0.065 0.051 0.065 -0.021 -0.037  0.001 |  0.195 0.147 0.160 -0.067 -0.101 -0.001
       200   0.038 0.029 0.034 -0.016 -0.020  0.004 |  0.096 0.063 0.080 -0.036 -0.046  0.008
    4  50    0.622 0.436 0.491 -0.234 -0.234 -0.005 |  1.429 1.304 1.247 -0.656 -0.358 -0.148
       100   0.295 0.235 0.217 -0.125 -0.104  0.002 |  0.719 0.506 0.603 -0.255 -0.316  0.007
       200   0.159 0.099 0.108 -0.058 -0.057  0.001 |  0.346 0.290 0.289 -0.143 -0.138 -0.013

Table 3b. Covariance matrices G*_n, G_n, G_OLS,n and G_ols,n given in Section 5

                                G*_n                                                G_n
p1   s    n     g00    g11    g22    g01     g02     g12        g00    g11    g22    g01     g02     g12

Logistic errors
0.4  1    50    0.043  0.036  0.037  -0.018  -0.017   0.000     0.047  0.034  0.041  -0.019  -0.015  -0.001
          100   0.022  0.018  0.018  -0.010  -0.008   0.000     0.021  0.019  0.016  -0.010  -0.008   0.001
          200   0.011  0.008  0.008  -0.004  -0.004   0.000     0.011  0.010  0.010  -0.004  -0.004   0.000
     2    50    0.149  0.124  0.127  -0.063  -0.056  -0.001     0.179  0.133  0.129  -0.081  -0.061   0.004
          100   0.074  0.060  0.060  -0.031  -0.028   0.000     0.068  0.053  0.072  -0.022  -0.031  -0.005
          200   0.035  0.029  0.029  -0.015  -0.014   0.000     0.042  0.040  0.031  -0.020  -0.016  -0.002
     4    50    0.545  0.448  0.469  -0.219  -0.215  -0.008     0.621  0.512  0.477  -0.305  -0.211   0.001
          100   0.257  0.209  0.213  -0.106  -0.101  -0.001     0.320  0.248  0.213  -0.126  -0.122  -0.002
          200   0.126  0.103  0.104  -0.053  -0.048  -0.001     0.121  0.117  0.112  -0.050  -0.042  -0.014

Standard normal errors
0.4  1    50    0.045  0.037  0.037  -0.018  -0.018   0.000     0.048  0.032  0.042  -0.019  -0.020   0.003
          100   0.022  0.018  0.018  -0.010  -0.008   0.000     0.029  0.022  0.016  -0.012  -0.010  -0.001
          200   0.011  0.008  0.008  -0.004  -0.004   0.000     0.011  0.008  0.008  -0.005  -0.004   0.001
     2    50    0.155  0.129  0.130  -0.065  -0.058  -0.002     0.164  0.112  0.126  -0.068  -0.077   0.013
          100   0.074  0.060  0.061  -0.031  -0.029  -0.001     0.074  0.056  0.071  -0.027  -0.038   0.005
          200   0.036  0.030  0.030  -0.015  -0.014   0.000     0.039  0.035  0.029  -0.018  -0.014   0.000
     4    50    0.538  0.441  0.452  -0.228  -0.202  -0.013     0.527  0.561  0.430  -0.250  -0.129  -0.076
          100   0.270  0.222  0.223  -0.113  -0.104  -0.001     0.262  0.227  0.199  -0.087  -0.090  -0.013
          200   0.133  0.109  0.111  -0.055  -0.050  -0.001     0.147  0.139  0.117  -0.068  -0.051  -0.004

                                G_OLS,n                                             G_ols,n
p1   s    n     g00    g11    g22    g01     g02     g12        g00    g11    g22    g01     g02     g12

Logistic errors
0.4  1    50    0.034  0.029  0.029  -0.014  -0.011  -0.002     0.064  0.051  0.054  -0.029  -0.021  -0.001
          100   0.018  0.017  0.014  -0.008  -0.007   0.001     0.028  0.027  0.023  -0.014  -0.008   0.001
          200   0.008  0.007  0.007  -0.004  -0.003   0.000     0.015  0.012  0.013  -0.005  -0.006   0.000
     2    50    0.162  0.120  0.110  -0.074  -0.051   0.004     0.303  0.237  0.194  -0.123  -0.101   0.011
          100   0.057  0.045  0.065  -0.017  -0.028  -0.006     0.107  0.090  0.105  -0.039  -0.043  -0.013
          200   0.038  0.036  0.028  -0.018  -0.016  -0.001     0.065  0.058  0.039  -0.027  -0.023  -0.005
     4    50    0.582  0.502  0.449  -0.279  -0.205   0.000     1.028  0.866  0.806  -0.401  -0.393  -0.041
          100   0.319  0.243  0.211  -0.118  -0.125   0.001     0.604  0.374  0.394  -0.226  -0.216   0.011
          200   0.120  0.113  0.109  -0.050  -0.045  -0.010     0.201  0.204  0.201  -0.071  -0.077  -0.022

Standard normal errors
0.4  1    50    0.039  0.024  0.030  -0.014  -0.018   0.002     0.058  0.048  0.057  -0.023  -0.028   0.004
          100   0.022  0.017  0.014  -0.008  -0.007  -0.001     0.040  0.027  0.023  -0.015  -0.016  -0.002
          200   0.008  0.007  0.006  -0.004  -0.003   0.001     0.014  0.012  0.012  -0.007  -0.005   0.001
     2    50    0.148  0.101  0.107  -0.060  -0.065   0.010     0.265  0.189  0.224  -0.127  -0.118   0.022
          100   0.068  0.054  0.064  -0.027  -0.033   0.004     0.096  0.090  0.100  -0.032  -0.050  -0.007
          200   0.034  0.031  0.025  -0.015  -0.012   0.001     0.063  0.054  0.043  -0.031  -0.024   0.004
     4    50    0.515  0.530  0.420  -0.248  -0.152  -0.042     0.970  0.904  0.745  -0.393  -0.287  -0.060
          100   0.259  0.224  0.197  -0.082  -0.086  -0.023     0.475  0.389  0.318  -0.179  -0.156   0.001
          200   0.145  0.138  0.114  -0.071  -0.047  -0.006     0.231  0.196  0.216  -0.096  -0.098   0.002

Table 4. Empirical coverage probabilities of the algorithm confidence intervals of the slope estimates at the level of 95%

                Laplacian errors            Logistic errors             Standard normal errors
p0   s    n     b^n_0   b^n_1   b^n_2       b^n_0   b^n_1   b^n_2       b^n_0   b^n_1   b^n_2
0.2  1    50    0.960   0.980   0.954       0.965   0.965   0.949       0.934   0.939   0.954
          100   0.919   0.949   0.965       0.919   0.934   0.939       0.970   0.960   0.960
          200   0.970   0.954   0.939       0.960   0.965   0.965       0.929   0.954   0.980
     2    50    0.960   0.960   0.975       0.975   0.939   0.954       0.954   0.960   0.960
          100   0.954   0.965   0.970       0.949   0.970   0.944       0.954   0.939   0.975
          200   0.949   0.944   0.965       0.954   0.965   0.954       0.929   0.970   0.944
     4    50    0.960   0.990   0.944       0.949   0.924   0.965       0.934   0.965   0.929
          100   0.929   0.929   0.919       0.934   0.944   0.919       0.949   0.944   0.924
          200   0.985   0.975   0.949       0.929   0.949   0.954       0.975   0.954   0.954
0.4  1    50    0.980   0.960   0.954       0.949   0.965   0.939       0.934   0.980   0.939
          100   0.960   0.965   0.980       0.949   0.970   0.970       0.919   0.939   0.975
          200   0.965   0.954   0.970       0.960   0.944   0.944       0.970   0.949   0.954
     2    50    0.924   0.929   0.929       0.909   0.934   0.965       0.949   0.965   0.970
          100   0.960   0.934   0.914       0.980   0.970   0.939       0.939   0.975   0.934
          200   0.919   0.949   0.939       0.929   0.919   0.960       0.960   0.939   0.949
     4    50    0.924   0.929   0.949       0.934   0.960   0.944       0.944   0.939   0.970
          100   0.934   0.960   0.934       0.954   0.929   0.934       0.939   0.975   0.975
          200   0.909   0.914   0.944       0.975   0.949   0.954       0.965   0.899   0.960
0.6  1    50    0.949   0.934   0.965       0.960   0.944   0.965       0.944   0.944   0.954
          100   0.954   0.970   0.934       0.975   0.975   0.924       0.980   0.944   0.985
          200   0.960   0.929   0.970       0.954   0.949   0.960       0.975   0.939   0.949
     2    50    0.909   0.939   0.939       0.929   0.944   0.934       0.929   0.975   0.944
          100   0.954   0.975   0.965       0.975   0.944   0.919       0.960   0.944   0.960
          200   0.904   0.960   0.919       0.919   0.944   0.949       0.939   0.965   0.909
     4    50    0.934   0.944   0.914       0.919   0.929   0.949       0.939   0.985   0.904
          100   0.914   0.944   0.960       0.894   0.944   0.939       0.894   0.939   0.939
          200   0.889   0.949   0.939       0.924   0.949   0.929       0.914   0.954   0.929

Table 5 shows, as was indicated, that all of the sequences mentioned above quickly stabilize towards a limit point which is independent of the algorithm initial values. It can be seen that the $(m_A, m_T, \gamma, s)$-estimate based on the complete 48 observations (grouped and ungrouped) was

$$\left(m^{48}_A, m^{48}_T, \gamma^{48}, s^{48}\right) = (128.37,\ 120.66,\ -89.72,\ 16.52).$$

Also, we have used (22) to compute

$$48^{-1}L_{48} = \begin{pmatrix} 21.33 & 11.56 & -19.84 \\ 11.56 & 19.86 & -19.23 \\ -19.84 & -19.23 & 32.97 \end{pmatrix},$$

the approximate covariance matrix of $(m^{48}_A, m^{48}_T, \gamma^{48})$ given in (21). Finally, we used (23) to calculate the 95% confidence intervals for $m_A$, $m_T$ and $\gamma$, which turned out to be $[119.32, 137.42]$, $[111.93, 129.40]$ and $[-100.97, -78.46]$, respectively, from which individual hypothesis tests on these parameters follow straightaway. For its part, to test null linear hypotheses of the form $H_0\colon Ab = 0$ at a given $\alpha$-level, it is sufficient to use the critical region shown in (24). In particular, the null hypothesis $H_0\colon m_A = m_T$ tallies with the matrix $A = (1, -1, 0)$ of rank one and, at the level of 95%, the critical region (24) based on the complete sample agrees with

$$R^{48}_{0.05} = \left\{ b^{48} \;\middle|\; 48\left(m^{48}_A - m^{48}_T\right)\left(l^{48}_{11} + l^{48}_{22} - 2l^{48}_{12}\right)^{-1}\left(m^{48}_A - m^{48}_T\right) > 3.84 \right\},$$

since $\chi^2_1(0.05) = 3.84$; therefore, $H_0$ is accepted at the level cited above (the value of the statistic within $R^{48}_{0.05}$ being 3.29).

Table 5. Sequences $(m^n_A, m^n_T, \gamma^n, s^n)$ generated by the proposed algorithm from different starting points and $n_0 = 16$ (with data of Table 1)

Initial values $(m^{16}_A, m^{16}_T, \gamma^{16}, s^{16})$:
  Start 1: (123.919, 117.788, -74.415, 14.788)    Start 2: (0, 0, 0, 1)    Start 3: (-40, 40, -40, 320)

n    Start 1                                  Start 2                                  Start 3
17   127.218  121.929   -87.777   17.430     125.234  120.210   -88.706   92.936     148.342  171.933  -127.502  162.072
18   126.095  119.416   -85.897   16.247     130.550  129.833   -94.277   30.004     135.613  141.413  -103.272   50.612
19   128.213  118.909   -89.593   16.462     129.259  121.052   -91.740   18.974     130.911  124.570   -94.831   23.936
20   127.888  120.611   -89.027   16.551     128.073  120.963   -89.435   16.783     128.424  121.646   -90.175   17.365
21   130.146  121.604   -90.928   16.532     130.190  121.648   -90.989   16.563     130.300  121.759   -91.141   16.649
22   127.100  120.275   -88.375   16.703     127.106  120.281   -88.383   16.710     127.123  120.298   -88.406   16.730
23   124.939  119.353   -86.584   16.613     124.940  119.354   -86.585   16.614     124.943  119.358   -86.590   16.618
24   124.768  120.529   -86.211   16.489     124.768  120.529   -86.212   16.490     124.768  120.530   -86.212   16.490

From n = 25 onwards the three sequences coincide to the printed precision:

n    m^n_A    m^n_T     gamma^n   s^n
25   124.766  120.160   -85.523   16.073
26   124.704  120.109   -85.468   15.642
27   124.739  120.007   -85.564   15.244
28   125.931  119.529   -87.846   15.673
29   126.792  119.462   -87.711   15.571
30   126.777  120.672   -87.689   15.607
31   127.330  121.886   -88.724   15.547
32   126.606  120.300   -87.371   15.481
33   127.498  120.645   -88.070   15.390
34   128.267  120.490   -87.785   15.374
35   129.149  120.322   -87.479   15.469
36   129.754  120.265   -88.591   15.367
37   130.121  120.187   -89.280   15.183
38   129.431  118.646   -87.946   15.147
39   127.735  117.848   -86.544   15.224
40   127.512  119.177   -88.897   15.729
41   127.436  120.498   -91.189   16.207
42   127.459  121.517   -91.206   16.315
43   128.368  121.450   -91.055   16.478
44   127.505  119.463   -89.427   16.692
45   128.132  119.387   -89.297   16.666
46   128.074  120.008   -89.173   16.624
47   128.440  119.961   -89.845   16.486
48   128.371  120.662   -89.719   16.522

The analogous analyses based on the suggested initial guess $(b^{16}, s^{16}) = (123.919, 117.788, -74.415, 14.788)$ and the sample sizes 16 and 32 yield the following estimates:

$$\left(m^{16}_A, m^{16}_T, \gamma^{16}, s^{16}\right) = (123.92,\ 117.79,\ -74.41,\ 14.79), \qquad \left(m^{32}_A, m^{32}_T, \gamma^{32}, s^{32}\right) = (126.61,\ 120.30,\ -87.37,\ 15.48),$$

$$16^{-1}L_{16} = \begin{pmatrix} 45.41 & 20.96 & -43.69 \\ 20.96 & 39.69 & -33.19 \\ -43.69 & -33.19 & 69.48 \end{pmatrix} \quad\text{and}\quad 32^{-1}L_{32} = \begin{pmatrix} 23.87 & 11.91 & -21.42 \\ 11.91 & 24.61 & -22.79 \\ -21.42 & -22.79 & 40.97 \end{pmatrix}.$$

From them it can be verified, for example, that, at the significance level of 0.05, the hypothesis $H_0\colon m_A = m_T$ is accepted in both cases, since the values of the statistics that form part of $R^{16}_{0.05}$ and $R^{32}_{0.05}$ are 0.87 and 1.61, respectively.
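Numerically, the test in (24) is a one-line piece of linear algebra. The following Python sketch is our own illustration (the paper's implementation is in MATLAB, as noted in Section 7); it reproduces the n = 48 test of $H_0\colon m_A = m_T$ from the estimates and covariance matrix reported above, with variable names that are ours.

```python
import numpy as np
from scipy.stats import chi2

def wald_stat(b, cov_b, A):
    """Wald statistic for H0: A b = 0, with cov_b the approximate
    covariance matrix of the estimate b, i.e. n^{-1} L_n in the paper."""
    Ab = A @ b
    return float(Ab @ np.linalg.solve(A @ cov_b @ A.T, Ab))

# n = 48 estimates of (m_A, m_T, gamma) and their covariance 48^{-1} L_48
b48 = np.array([128.371, 120.662, -89.719])
cov48 = np.array([[21.33, 11.56, -19.84],
                  [11.56, 19.86, -19.23],
                  [-19.84, -19.23, 32.97]])

A = np.array([[1.0, -1.0, 0.0]])          # H0: m_A = m_T
stat = wald_stat(b48, cov48, A)           # ~ 3.29, as in the text
print(stat, stat > chi2.ppf(0.95, df=1))  # threshold 3.84: H0 not rejected
```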

For its part, the analysis of model (4) was tackled by means of its auxiliary reformulation $y_{ijh} = b_{ij} + s\varepsilon_{ijh}$ ($i = 100, 600, 1000$ and $j = A, T$), the slope parameters of which determine the free slope parameters of (4) through the equations

$$\eta = \bar b_{\cdot\cdot} = 6^{-1}\sum_{ij} b_{ij}, \quad D_{100} = \bar b_{100\cdot} - \bar b_{\cdot\cdot} = 2^{-1}(b_{100A} + b_{100T}) - \bar b_{\cdot\cdot}, \quad D_{600} = \bar b_{600\cdot} - \bar b_{\cdot\cdot}, \quad P_A = \bar b_{\cdot A} - \bar b_{\cdot\cdot},$$
$$DP_{100A} = b_{100A} - \bar b_{100\cdot} - \bar b_{\cdot A} + \bar b_{\cdot\cdot} \quad\text{and}\quad DP_{600T} = b_{600T} - \bar b_{600\cdot} - \bar b_{\cdot T} + \bar b_{\cdot\cdot}.$$

For $n = 16, \ldots, 48$, the proposed algorithm first supplies the estimates $s^n$, $b^n_{ij}$ and $n^{-1}L_n$, this last element being the approximate covariance matrix of the $b_{ij}$-estimates. Then we can use this matrix and (24) to test the typical ANOVA null hypotheses which identify the absence of main effects or interactions; in particular, (1) distance main effects, $H^{(1)}_0\colon D_{100} = D_{600} = D_{1000} = 0$; (2) power station main effects, $H^{(2)}_0\colon P_A = P_T = 0$; and (3) interaction between distance and power station, $H^{(3)}_0\colon DP_{ij} = 0$ for all possible values of $(i, j)$. It suffices to realise that, after denoting $b = (b_{100A}, b_{600A}, b_{1000A}, b_{100T}, b_{600T}, b_{1000T})'$, these three null hypotheses are linear and equivalent to $H^{(1)}_0\colon A_1 b = 0$, $H^{(2)}_0\colon A_2 b = 0$ and $H^{(3)}_0\colon A_3 b = 0$ for the particular matrices

$$A_1 = 6^{-1}\begin{pmatrix} 2 & -1 & -1 & 2 & -1 & -1 \\ -1 & 2 & -1 & -1 & 2 & -1 \end{pmatrix}, \quad A_2 = (1, 1, 1, -1, -1, -1) \quad\text{and}\quad A_3 = 6^{-1}\begin{pmatrix} 2 & -1 & -1 & -2 & 1 & 1 \\ -1 & 2 & -1 & 1 & -2 & 1 \end{pmatrix}.$$

With the complete sample, the auxiliary parameter estimates were $s^{48} = 8.78$ and $b^{48} = (110.26, 91.86, 28.28, 102.56, 84.18, 21.85)'$. The linear combinations of $b^{48}$ which form part of the null hypotheses $H^{(1)}_0$, $H^{(2)}_0$ and $H^{(3)}_0$ were $A_1 b^{48} = (33.25, 14.86)'$, $A_2 b^{48} = 21.81$ and $A_3 b^{48} = (0.21, 0.20)'$.

Finally, the observed values $48\,b^{48\prime}A'\left(A L_{48} A'\right)^{-1}A b^{48}$ which, in accordance with (24), need to be computed to find out whether $b^{48}$ belongs to the critical regions associated with the hypotheses $H^{(1)}_0$, $H^{(2)}_0$ and $H^{(3)}_0$ were 957.13, 9.75 and 0.07, respectively. At the $\alpha$-level of 0.05, each of these values must be compared with $\chi^2_2(0.05) = 5.99$, $\chi^2_1(0.05) = 3.84$ and $\chi^2_2(0.05) = 5.99$, respectively. Therefore, the hypotheses of null distance and power station main effects are rejected and the hypothesis of null interactions is accepted at the significance level mentioned above. These conclusions are in complete agreement with those derived from the sub-sample of size 32, for which the values of the statistic $32\,b^{32\prime}A'\left(A L_{32} A'\right)^{-1}A b^{32}$ were 542.67, 7.24 and 0.90 for $H^{(1)}_0$, $H^{(2)}_0$ and $H^{(3)}_0$, respectively. For its part, the equivalent values for sample size 16 were 290.96, 1.42 and 0.29; thus the conclusion about $H^{(2)}_0$ differs when using the suggested initial guesses $b^{16}$ and $s^{16}$ (the OLS estimates based on the non-grouped data of the first one third of the sample, as was indicated).
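The contrast matrices and linear combinations above are easy to check in a few lines. The sketch below is our own Python illustration; since the covariance matrix $n^{-1}L_n$ of the model (4) slopes is not reproduced in the text, the Wald statistics themselves are only indicated in a comment.

```python
import numpy as np

# Estimated slopes b48 = (b_100A, b_600A, b_1000A, b_100T, b_600T, b_1000T)'
b48 = np.array([110.26, 91.86, 28.28, 102.56, 84.18, 21.85])

# Contrast matrices of the three ANOVA null hypotheses
A1 = np.array([[2, -1, -1, 2, -1, -1],
               [-1, 2, -1, -1, 2, -1]]) / 6.0   # distance main effects
A2 = np.array([[1, 1, 1, -1, -1, -1]])          # power station main effects
A3 = np.array([[2, -1, -1, -2, 1, 1],
               [-1, 2, -1, 1, -2, 1]]) / 6.0    # distance x station interaction

for A in (A1, A2, A3):
    print(A @ b48)   # (33.25, 14.86), 21.81 and (0.21, 0.20), up to rounding

# Given the covariance matrix cov48 = 48^{-1} L_48 of b48 (supplied by the
# algorithm but not reported in the text), each Wald statistic would be
# wald_stat(b48, cov48, A) from the previous sketch, compared with a
# chi-square quantile with rank(A) degrees of freedom.
```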

The natural alternatives to the proposed algorithm are the maximum likelihood procedures. Thinking, for simplicity's sake, in terms of model (3) and denoting $\phi = (b, s)$ with $b = (m_A, m_T, \gamma)$, it holds that, for a given sample size $n$, the log-likelihood function (under the robust conditions considered in this paper) agrees with the integral function

$$l(\phi) = \sum_{i \in I^u_n} \log\!\left[ s^{-1} f\!\left(\frac{y_i - x_i'b}{s}\right)\right] + \sum_{i \in I^g_n} \log \sum_{h=1}^{r} I(c_{h-1} < y_i \le c_h) \int_{(-x_i'b + c_{h-1})s^{-1}}^{(-x_i'b + c_h)s^{-1}} f(x)\,dx. \qquad (30)$$
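As a concrete reading of (30), the following Python sketch (our own, assuming standard normal errors so that each grouped contribution reduces to a difference of distribution functions) evaluates the log-likelihood for mixed ungrouped and grouped data; all container names are illustrative.

```python
import numpy as np
from scipy.stats import norm

def loglik(b, s, X_u, y_u, X_g, intervals):
    """Log-likelihood (30) with standard normal errors f.
    X_u, y_u: design rows and exact responses of the ungrouped data;
    X_g, intervals: design rows and grouping intervals (c_{h-1}, c_h]
    of the grouped data."""
    # Ungrouped part: sum of log[s^{-1} f((y - x'b)/s)]
    ll = np.sum(norm.logpdf((y_u - X_u @ b) / s) - np.log(s))
    # Grouped part: log probability of the observed grouping interval
    for xi, (lo, hi) in zip(X_g, intervals):
        m = xi @ b
        ll += np.log(norm.cdf((hi - m) / s) - norm.cdf((lo - m) / s))
    return ll
```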

Its maximization can be carried out either directly, typically using the Newton–Raphson algorithm due to its quadratic convergence, or via the EM algorithm, given its extended use with incomplete data. If the Newton–Raphson procedure is employed, it starts from an initial guess of $\phi$ (let $\phi^{(0)}$ denote this) and, assuming that the current $\phi$-estimate is $\phi^{(k)}$, the iteration equation is

$$\phi^{(k+1)} = \phi^{(k)} + I^{-1}\!\left(\phi^{(k)}\right) S\!\left(\phi^{(k)}\right), \qquad (31)$$

where the vector $S(\phi)$ and the matrix $I(\phi)$ are, respectively, the gradient and the negative of the matrix of second-order partial derivatives of the log-likelihood function with respect to the elements of $\phi$, that is,

$$S(\phi) = \frac{\partial l(\phi)}{\partial \phi} \quad\text{and}\quad I(\phi) = -\frac{\partial^2 l(\phi)}{\partial \phi\,\partial \phi'}.$$

According to this, if the data is sequentially received and we want to advance partial estimations, as maintained in this paper, the iteration (31) needs to be conceived as nested in a primary iteration upon the sample size. Therefore, the sequential data reception version of the Newton–Raphson procedure is formalized as follows (from a given initial sample size $n_0$).

The Newton–Raphson algorithm adapted to the sequential data reception.

Initialization: Let $n_0$ and $\phi^{n_0}$ be a starting sample size and guess of $\phi$, respectively.
Iteration:
Step 1: Assuming that the current sample size, $n-1$, and $\phi$-value, $\phi^{n-1}$, are known, let us update $\phi^{n-1}$ by the following iterative process nested in the former one (as a new observation is received):
    Initialization: Let us take an initial $\phi^{(0),n}$ (a plausible option could be $\phi^{(0),n} = \phi^{n-1}$, to take advantage of the previous estimate).
    Iteration: Assuming that $\phi^{(k-1),n}$ is known, let us update it through the following steps:
        Step 1.1: $\phi^{(k),n} = \phi^{(k-1),n} + I^{-1}\!\left(\phi^{(k-1),n}\right) S\!\left(\phi^{(k-1),n}\right)$  (32)
        Step 1.2: $k-1 \leftarrow k$, and return to Step 1.1 until convergence is achieved; let $\phi^{(\infty),n}$ be the limit point.
Step 2: Let us define $\phi^n = \phi^{(\infty),n}$.
Step 3: $n-1 \leftarrow n$, and return to Step 1.
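The nesting of the two iterations can be sketched as follows (our own schematic in Python, with hypothetical `score` and `info` callables standing for $S$ and $I$). Note how the whole up-to-date data set must be retained and re-scanned at every new sample size, which is precisely the memory-less behaviour discussed in Section 1.

```python
import numpy as np

def sequential_newton_raphson(stream, phi0, score, info, tol=1e-8, max_iter=100):
    """Primary iteration over incoming observations; for each new sample
    size n the secondary Newton-Raphson loop (32) is run to convergence.
    score(phi, data) and info(phi, data) stand for S and I evaluated on
    the full up-to-date data set, which must therefore be stored."""
    data, phi = [], np.asarray(phi0, dtype=float)
    for obs in stream:                  # primary iteration: n-1 -> n
        data.append(obs)
        for _ in range(max_iter):       # secondary iteration: k-1 -> k
            step = np.linalg.solve(info(phi, data), score(phi, data))
            phi = phi + step            # Step 1.1, Equation (32)
            if np.linalg.norm(step) < tol:
                break                   # Step 1.2: convergence reached
        yield phi                       # Step 2: phi^n = the limit point
```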

The similar version of the EM algorithm simply substitutes the former Step 1.1 with the concatenation of the E and M steps of the EM (see, for example, McLachlan and Krishnan, 1997, p. 22). The E step obliges us to compute

$$Q\!\left(\phi, \phi^{(k-1),n}\right) = -n\log s + \sum_{i \in I^u_n} \log f\!\left(\frac{y_i - x_i'b}{s}\right) + \sum_{i \in I^g_n} \sum_{h=1}^{r} I(c_{h-1} < y_i \le c_h)\, E\!\left[\log f\!\left(\frac{y_i - x_i'b}{s}\right) \,\middle|\, \phi^{(k-1),n},\ c_{h-1} < y_i \le c_h\right], \qquad (33)$$

where

$$E\!\left[\log f\!\left(\frac{y_i - x_i'b}{s}\right) \,\middle|\, \phi^{(k-1),n},\ c_{h-1} < y_i \le c_h\right] = \frac{\displaystyle\int_{c_{h-1}}^{c_h} \log f\!\left(\frac{t - x_i'b}{s}\right) f\!\left(\frac{t - x_i'b^{(k-1),n}}{s^{(k-1),n}}\right)dt}{\displaystyle\int_{c_{h-1}}^{c_h} f\!\left(\frac{t - x_i'b^{(k-1),n}}{s^{(k-1),n}}\right)dt},$$

whereas the M step updates the current $\phi$-values of the secondary iteration with the maximum of $Q(\phi, \phi^{(k-1),n})$ as a function of $\phi$. Therefore, Step 1.1 of the EM algorithm adapted to the sequential data reception is

$$\text{Step 1.1:}\quad \phi^{(k),n} = \arg\max_{\phi}\, Q\!\left(\phi, \phi^{(k-1),n}\right). \qquad (34)$$

The proposed algorithm avoids the secondary iteration (defined by the former Steps 1.1 and 1.2), which is substituted with a single computation of the simple expressions (17), (18) and (19). This by itself represents a clear computational advantage of the proposed algorithm compared to the sequential adaptations of the Newton–Raphson and EM algorithms. There exist many other technical advantages that support our algorithm; however, we have decided to annex them at the end of the paper for interested readers.

7. CONCLUSIONS AND FINAL COMMENTS

This paper has presented an algorithm for linear model estimation and inference under robust conditions which include the sequential reception of data and the need to advance partial statistical analyses, the existence of grouped or missing data, and the possibility of general errors, not necessarily normal or symmetrical, with null means and unknown variances. As the sample size increases, the parameter estimate sequences generated by the algorithm stabilize towards a point which does not depend on the starting point. Additionally, for a fixed sample size, the asymptotic covariance matrix of the slope estimates can be consistently estimated by means of an explicit expression which is easy to implement computationally. In this sense the proposed algorithm presents a clear advantage over the maximum likelihood procedures, its natural competitors. With those procedures, implemented either directly or through the EM algorithm, the computation of the asymptotic covariance matrix of the estimates entails the evaluation of the Hessian matrix of the log-likelihood integral function associated with the robust conditions mentioned above, which can only be done numerically through quite unstable methods. This advantage, and others regarding the computational complexity and the memory capacity of the procedures, justifies our proposal. A last point, added to its simplicity, merits emphasis: the large variety of simulations carried out corroborates that, if the linear model is well specified, the capacity of the proposed algorithm for treating the incomplete data situations considered is remarkable. In fact, the statistical accuracy of our estimates (in terms of biases and mean square errors) is similar to that of the OLS estimates with complete data. As can be seen from Table 2, this occurs for all of the combinations of error distributions, proportions of grouped data, values of the scale parameter and sample sizes displayed there.

Finally, a brief computational observation brings the paper to a close: our implementation of the algorithm was written in MATLAB, and its source code is available from the authors on request.

Acknowledgements

This paper springs from research partially funded by MEC under grant MTM2004-05776.

REFERENCES

An, M. Y. 1998. Logconcavity versus logconvexity: a complete characterization. Journal of Economic Theory 80: 350–369.
Anido, C., Rivero, C., Valdes, T. 2000. Modal iterative estimation in linear models with unihumped errors and non-grouped and grouped data collected from different sources. Test 9: 393–416.
Dempster, A. P., Laird, N. M., Rubin, D. B. 1977. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society B 39: 1–22.
Fahrmeir, L., Kaufmann, H. 1985. Consistency and asymptotic normality of the maximum likelihood estimator in generalized linear models. Annals of Statistics 13: 342–368.
Healy, M. J. R., Westmacott, M. 1956. Missing values in experiments analysed on automatic computers. Applied Statistics 5: 203–206.
James, I. R., Smith, P. J. 1984. Consistency results for linear regression with censored data. Annals of Statistics 12: 590–600.
Louis, T. A. 1982. Finding the observed information matrix when using the EM algorithm. Journal of the Royal Statistical Society B 44: 226–233.
McLachlan, G. J., Krishnan, T. 1997. The EM Algorithm and Extensions. Wiley: New York.
Meilijson, I. 1989. A fast improvement of the EM algorithm on its own terms. Journal of the Royal Statistical Society B 51: 127–138.
Meng, X. L., Rubin, D. B. 1991. Using EM to obtain asymptotic variance-covariance matrices. Journal of the American Statistical Association 86: 899–909.
Orchard, T., Woodbury, M. A. 1972. A missing information principle: theory and applications. Proceedings of the 6th Berkeley Symposium on Mathematical Statistics and Probability, Vol. I: 697–715.
Ritov, Y. 1990. Estimation in a linear regression model with censored data. Annals of Statistics 18: 303–328.
Rivero, C., Valdes, T. 2004. Mean based iterative procedures in linear models with general errors and grouped data. Scandinavian Journal of Statistics 31: 469–486.

8. ANNEXE: TECHNICAL ADVANTAGES OF THE PROPOSED ALGORITHM COMPARED TO ITS SEQUENTIAL ALTERNATIVES

In addition to the clear computational advantage of the proposed algorithm (against the sequential adaptations of the Newton–Raphson and EM algorithms), which was commented on at the end of Section 6, many other technical reasons support the former. In particular the following:

(a) A sole iteration of Step 1.1 of the Newton–Raphson version is more complicated to compute than the expressions (17), (18) and (19), since it needs to evaluate numerically the first and second derivatives of the log-likelihood (30).

(b) The same occurs with the EM version. With general error distributions, the computational complexity of a sole iteration of (34) greatly surpasses that of the expressions (17), (18) and (19). Only when the errors are normally distributed does there exist a certain likeness between Step 1.1 of the EM and the expressions cited above. In this respect, if the $\varepsilon_i$-distribution is standard normal, it can be verified that (33) essentially agrees with

$$Q\!\left(\phi, \phi^{(k-1),n}\right) = -n\log s - \frac{1}{2s^2}\left\{ \sum_{i \in I^u_n} \left(y_i - x_i'b\right)^2 + \sum_{i \in I^g_n} E^*\!\left[\left(y_i - x_i'b\right)^2 \,\middle|\, \phi^{(k-1),n}\right]\right\},$$

where $E^*[r(y_i) \mid \phi^{(k-1),n}] = E[r(y_i) \mid \phi^{(k-1),n},\ c_{h-1} < y_i \le c_h]$ if $(c_{h-1}, c_h]$ is the actual grouping interval which overlaps the grouped observation $y_i$. The first order optimality condition

$$s^2\,\frac{\partial Q\!\left(\phi, \phi^{(k-1),n}\right)}{\partial b} = \sum_{i \in I^u_n} \left(y_i - x_i'b\right)x_i + \sum_{i \in I^g_n} \left( E^*\!\left[y_i \,\middle|\, \phi^{(k-1),n}\right] - x_i'b \right) x_i = 0$$

clearly indicates that, to update the current $b$-values of the secondary iteration, we need, first, to impute each grouped $y_i$-observation with its conditional expectation $E^*(y_i \mid \phi^{(k-1),n})$ and, secondly, to apply least squares to both the ungrouped data and the imputations mentioned above. With the notation used in Section 4 when describing the proposed robust estimating algorithm, this is equivalent to

$$b^{(k),n} = \left(X_n'X_n\right)^{-1}X_n'\,y_n\!\left(b^{(k-1),n}, s^{(k-1),n}\right), \qquad (35)$$


which resembles (18). Finally, from (35) and

$$\frac{\partial Q\!\left(\phi, \phi^{(k-1),n}\right)}{\partial s} = 0,$$

one can easily verify that the $s$-updating equation of the secondary loops is

$$s^{(k),n} = \left[ n^{-1}\left( \sum_{i \in I^u_n} \left(y_i - x_i'b^{(k),n}\right)^2 + \sum_{i \in I^g_n} E^*\!\left[\left(y_i - x_i'b^{(k),n}\right)^2 \,\middle|\, \phi^{(k-1),n}\right] \right)\right]^{1/2}, \qquad (36)$$

where the conditional expectation on the right can be calculated from

$$E^*\!\left[\left(y_i - x_i'b^{(k),n}\right)^2 \,\middle|\, \phi^{(k-1),n}\right] = E^*\!\left[\left(y_i - x_i'b^{(k-1),n}\right)^2 \,\middle|\, \phi^{(k-1),n}\right] + \left[x_i'\!\left(b^{(k),n} - b^{(k-1),n}\right)\right]^2 - 2\,x_i'\!\left(b^{(k),n} - b^{(k-1),n}\right) E^*\!\left[y_i - x_i'b^{(k-1),n} \,\middle|\, \phi^{(k-1),n}\right] \qquad (37)$$

and the following easily verifiable equalities

$$E^*\!\left[\left(y_i - x_i'b^{(k-1),n}\right)^2 \,\middle|\, \phi^{(k-1),n}\right] = \left(s^{(k-1),n}\right)^2\left[ 1 - \frac{m^{(k-1),n}_{ih}\,\varphi\!\left(m^{(k-1),n}_{ih}\right) - m^{(k-1),n}_{ih-1}\,\varphi\!\left(m^{(k-1),n}_{ih-1}\right)}{\Phi\!\left(m^{(k-1),n}_{ih}\right) - \Phi\!\left(m^{(k-1),n}_{ih-1}\right)} \right] \qquad (38)$$

and

$$E^*\!\left[y_i - x_i'b^{(k-1),n} \,\middle|\, \phi^{(k-1),n}\right] = -s^{(k-1),n}\,\frac{\varphi\!\left(m^{(k-1),n}_{ih}\right) - \varphi\!\left(m^{(k-1),n}_{ih-1}\right)}{\Phi\!\left(m^{(k-1),n}_{ih}\right) - \Phi\!\left(m^{(k-1),n}_{ih-1}\right)}, \qquad (39)$$

in which $\varphi$ and $\Phi$ denote the density and distribution functions of the standard normal, respectively, $m^{(k-1),n}_{ih} = \left(-x_i'b^{(k-1),n} + c_h\right)/s^{(k-1),n}$, $m^{(k-1),n}_{ih-1}$ is defined similarly with $c_{h-1}$ instead of $c_h$, and $(c_{h-1}, c_h]$ is the actual grouping interval of the grouped observation $y_i$.
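Both conditional moments are elementary functions of $\varphi$ and $\Phi$. A minimal Python sketch (our own illustration) computing (38) and (39) for one grouped observation could be:

```python
from scipy.stats import norm

def truncated_moments(xb, s, c_lo, c_hi):
    """Conditional moments (38)-(39) of a grouped observation y with
    y | x ~ N(x'b, s^2) restricted to the interval (c_lo, c_hi]."""
    m_lo = (c_lo - xb) / s          # m_{ih-1}
    m_hi = (c_hi - xb) / s          # m_{ih}
    z = norm.cdf(m_hi) - norm.cdf(m_lo)
    # (39): E*[y - x'b] = -s (phi(m_hi) - phi(m_lo)) / (Phi(m_hi) - Phi(m_lo))
    e1 = -s * (norm.pdf(m_hi) - norm.pdf(m_lo)) / z
    # (38): E*[(y - x'b)^2] = s^2 [1 - (m_hi phi(m_hi) - m_lo phi(m_lo)) / z]
    e2 = s**2 * (1 - (m_hi * norm.pdf(m_hi) - m_lo * norm.pdf(m_lo)) / z)
    return e1, e2
```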

Let us take into account that with normal errors (19) adopts the form

$$s^n = \left[ n^{-1}\left( \sum_{i \in I^u_n} \left(y_i - x_i'b^n\right)^2 + \left(s^{n-1}\right)^2 \sum_{i \in I^g_n} \left[ 1 - \frac{l^n_{ih}\,\varphi\!\left(l^n_{ih}\right) - l^n_{ih-1}\,\varphi\!\left(l^n_{ih-1}\right)}{\Phi\!\left(l^n_{ih}\right) - \Phi\!\left(l^n_{ih-1}\right)} \right] \right)\right]^{1/2}, \qquad (40)$$

where $l^n_{ih} = \left(-x_i'b^n + c_h\right)/s^{n-1}$, $l^n_{ih-1}$ is defined in a similar way with $c_h$ substituted by $c_{h-1}$, and $(c_{h-1}, c_h]$ is, as before, the actual

grouping interval of the grouped observation $y_i$. Therefore, the expressions (36) and (19) admit an explicit form and, although (36) is slightly more complicated than (19), their computational complexities are of similar orders. It follows from (35) and (36), likened with (18) and (19), respectively, that if, in the sequential data reception version of the EM with normal errors, one had decided to execute one single loop of the secondary iteration, the resulting hypothetical algorithm would have a complexity similar to that of the one proposed in this paper, although the two do not agree. In this sense, if $\phi^{*n} = (b^{*n}, s^{*n})$ denotes the sequence generated by the hypothetical algorithm, its updating equations would be (from (35) and (36))

$$b^{*n} = \left(X_n'X_n\right)^{-1}X_n'\,y_n\!\left(b^{*\,n-1}, s^{*\,n-1}\right), \qquad (41)$$

which agrees with (18), and

$$s^{*n} = \left[ n^{-1}\left( \sum_{i \in I^u_n} \left(y_i - x_i'b^{*n}\right)^2 + \sum_{i \in I^g_n} E^*\!\left[\left(y_i - x_i'b^{*n}\right)^2 \,\middle|\, \phi^{*\,n-1}\right] \right)\right]^{1/2}, \qquad (42)$$

where the computation of the conditional expectation on the right can be performed from the expressions (37), (38) and (39) conveniently adapted. Although (42) differs from (and is slightly more complicated than) (40), the normal-error version of (19), the most relevant drawback of the hypothetical EM version concerns the asymptotic distribution of its estimates, $b^{*n}$, as $n \to \infty$, which, in our opinion, is inscrutable and without which no inferences about the true slope parameter can be made. For its part, if the error distribution is non-normal, the similarity between Step 1.1 of the sequential data reception version of the EM and the expressions (18) and (19) of the proposed algorithm not only disappears but, additionally, the E and M steps of the EM do not usually admit a closed form. In that case, the E step requires the use of quadrature techniques, whereas the maximization involved in the M step has to be carried out numerically (through the Newton–Raphson procedure or alternative algorithms). This last point implies that, besides the secondary iteration of the sequential data reception version of the EM, a third nested iteration usually needs to be added to compute Step 1.1. For example, with the Laplacian errors which are typically assumed with the radiological data of Table 1, as was said, it can be verified that

$$Q\!\left(\phi, \phi^{(k-1),n}\right) = -n\log s - 2^{1/2} s^{-1}\left\{ \sum_{i \in I^u_n} \left|y_i - x_i'b\right| + \sum_{i \in I^g_n} E^*\!\left[\left|y_i - x_i'b\right| \,\middle|\, \phi^{(k-1),n}\right]\right\},$$

the maximization of which does not admit an explicit form. Thus, the third nested iteration mentioned above is required in this case.


In addition to the former comments, which concern parameter point estimation, there exists a second substantial statistical advantage of the proposed procedure (compared to the maximum likelihood methods). This affects parameter interval estimation and hypothesis testing which, in both cases, lean on asymptotics. As explained above, our algorithm tackles these inferences by means of the covariance matrix $n^{-1}L_n$, in which only first derivatives are involved. From expressions (8) and (9), as was said, these first derivatives admit an explicit expression in terms of the density and distribution functions of the standardized errors. For their part, the maximum likelihood methods (either direct or through the EM algorithm) need to estimate the Fisher information matrix evaluated at the maximum likelihood estimate. If the direct ML method is employed, this matrix needs to be numerically evaluated, as occurs in each of the Steps 1.1 of the secondary iteration. Within the context of the EM algorithm, the Hessian matrix of the log-likelihood function can be evaluated by several methods, among them direct computation/numerical differentiation (Meilijson, 1989), the Louis method (Louis, 1982) or using EM iterates (Meng and Rubin, 1991). Therefore, in none of the former ML methods does the estimate of the asymptotic covariance matrix of the $b$-estimates admit a closed form, and the numerical evaluation of the second derivatives of the log-likelihood is quite unstable. Undoubtedly, the equivalent estimation in our proposed algorithm is simpler if we recall that only first differentiations with analytical form are involved.

To finish, let us comment on the partial memory capacity of the proposed algorithm compared to its maximum likelihood alternatives. Neither the Newton–Raphson version nor the EM version (adapted to the sequential data reception) has the least memory with respect to the primary iteration upon the sample size. The reason is that the maximization of the sum of the $n$ functional summands (which form part of (30) or (33) when the sample size is $n$) does not hinge at all on the similar maximization of the sum of their first $n-1$ summands (which are associated with the sample size $n-1$). In fact, this means that the computation of the secondary iteration when the sample size is $n$ (a) requires the use of the complete up-to-date data set of individual grouped or ungrouped observations $(x_i, y_i)$, which, therefore, has to be completely stored, and (b) is utterly independent of the similar computation based on the previous sample size $n-1$: hence the dregs of wastefulness which the maximum likelihood methods, as well as other traditional statistical procedures, are pervaded with when we try to adapt them to a sequential data reception, as commented on in Section 1. For its part, the proposed algorithm (1) needs only the storage of the $(x_i, y_i)$-information corresponding to the grouped $y_i$-data, and (2) its computation admits a certain sequential organization. To see this, it suffices to observe, first, that the $(x_i, y_i)$-information with regard to the grouped $y_i$-data is needed to compute (17) within the proposed algorithm; secondly, the sequential computation of (18) and (19) can be made as follows. Let us rewrite (18) and (19) in the form

$$b^n = M_n^{-1}\left( N_n + \sum_{i \in I^g_n} x_i\,y_i\!\left(b^{n-1}, s^{n-1}\right)\right)$$

and

$$n\left(s^n\right)^2 = P_n + b^{n\prime} Q_n b^n - 2 R_n b^n + \left(s^{n-1}\right)^2 \sum_{i \in I^g_n} \sum_{h=1}^{r} I(c_{h-1} < y_i \le c_h)\, \frac{\displaystyle\int_{(-x_i'b^n + c_{h-1})/s^{n-1}}^{(-x_i'b^n + c_h)/s^{n-1}} x^2 f(x)\,dx}{\displaystyle\int_{(-x_i'b^n + c_{h-1})/s^{n-1}}^{(-x_i'b^n + c_h)/s^{n-1}} f(x)\,dx},$$

where

$$M_n = \sum_{i=1}^{n} x_i x_i', \quad N_n = \sum_{i \in I^u_n} x_i y_i, \quad P_n = \sum_{i \in I^u_n} y_i^2, \quad Q_n = \sum_{i \in I^u_n} x_i x_i' \quad\text{and}\quad R_n = \sum_{i \in I^u_n} y_i x_i'.$$

These latter matrices, along with the individual information concerning the grouped $y_i$-observations (that is, $x_i$ and the actual grouping interval $(c_{h-1}, c_h]$ which overlaps $y_i$), are the only elements that need to be stored. As soon as a new observation is received, the matrices can be sequentially updated (meaning by this that the update depends only on their stored values and the new observation) by means of

$$M_{n+1} = M_n + x_{n+1}x_{n+1}', \qquad N_{n+1} = N_n + I\!\left(n+1 \in I^u_{n+1}\right) x_{n+1}\,y_{n+1},$$

and the updating expressions for $P_{n+1}$, $Q_{n+1}$ and $R_{n+1}$, which are similar to the latter one. For their part, it is clear that the updating of the sums

$$\sum_{i \in I^g_n} x_i\,y_i\!\left(b^{n-1}, s^{n-1}\right) \qquad\text{and}\qquad \sum_{i \in I^g_n} \sum_{h=1}^{r} I(c_{h-1} < y_i \le c_h)\, \frac{\displaystyle\int_{(-x_i'b^n + c_{h-1})/s^{n-1}}^{(-x_i'b^n + c_h)/s^{n-1}} x^2 f(x)\,dx}{\displaystyle\int_{(-x_i'b^n + c_{h-1})/s^{n-1}}^{(-x_i'b^n + c_h)/s^{n-1}} f(x)\,dx}$$

does not admit a sequential representation but, on the contrary, requires the use of all of the individual information regarding the grouped data. Although the memory of the proposed algorithm is only partial, as it only affects the ungrouped data, there is no comparison with the memory of its alternatives, which is null, as was said. This represents a further advantage of our proposal in terms of storage requirements, to be added to those which, with regard to its statistical simplicity and computational complexity, were commented on above.
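In code, this partial memory amounts to a handful of running sufficient statistics plus the raw $(x_i, \text{interval})$ records of the grouped data only. The sketch below is our own Python illustration with hypothetical container names; it shows the sequential updates of $M_n$, $N_n$, $P_n$, $Q_n$ and $R_n$ described above.

```python
import numpy as np

class SequentialState:
    """Running statistics M_n, N_n, P_n, Q_n, R_n of the ungrouped data,
    plus the individual (x_i, interval) records of the grouped data."""
    def __init__(self, m):
        self.M = np.zeros((m, m))   # sum of x x' over all observations
        self.N = np.zeros(m)        # sum of x y over ungrouped data
        self.P = 0.0                # sum of y^2 over ungrouped data
        self.Q = np.zeros((m, m))   # sum of x x' over ungrouped data
        self.R = np.zeros(m)        # sum of y x over ungrouped data
        self.grouped = []           # (x_i, (c_lo, c_hi)) of grouped data

    def add(self, x, y=None, interval=None):
        x = np.asarray(x, dtype=float)
        self.M += np.outer(x, x)    # M_{n+1} = M_n + x x'
        if y is not None:           # ungrouped observation
            self.N += x * y
            self.P += y ** 2
            self.Q += np.outer(x, x)
            self.R += x * y
        else:                       # grouped observation: store its record
            self.grouped.append((x, interval))
```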
