UNIVERSITÀ CATTOLICA DEL SACRO CUORE
ISTITUTO DI STATISTICA
Angelo Zanella – Laura Deldossi
Combined arrays in Taguchi approach: testing the hypotheses on the fixed effects when the
estimated noise factor variances are taken into consideration
Serie E.P. N. 110 - Dicembre 2002
Combined arrays in Taguchi approach: testing the hypotheses on the fixed effects when the
estimated noise factor variances are taken into consideration(1)
Angelo Zanella(2) - Laura Deldossi Università Cattolica del S. Cuore, Istituto di Statistica, Milano
1. Introduction. A major and undeniable merit of Genichi Taguchi – a celebrated Japanese quality
expert, whose methods began to spread all over the world in the eighties – has been
that of giving new impetus to the use of statistical design of experiments in
technological investigations aimed at improving the field performance of a product, see for
example Giovagnoli et al. (1994), and Kackar (1985) for a general review of Taguchi’s
approach and methods. Following Shoemaker et al. (1991), for instance, the basic ideas
in Taguchi’s approach to industrial experimentation can be summarized as follows: a)
products and their manufacturing processes are not only influenced by factors
(operating variables, etc.) which are controlled by the designers (controllable factors)
but also by the so-called noise factors, the effects of which happen at random, and
which may express randomly variable process environmental conditions, raw material
properties, factors which depend on the use of the customers, etc.; b) the variability of a
product quality characteristic (response) is obviously also related to that of the noise
factors; however the novelty is that, in general, we have to suppose that there exist also
controllable factors which have an effect on the response variance. Thus, process
(1) This preliminary version was presented and discussed at the Meeting of 6 March 2003 of Istituto Lombardo – Accademia di Scienze e Lettere and accepted for publication on the “Rendiconti dell’Istituto Lombardo”. (2)Angelo Zanella is the author of the methodological part of the paper, especially of sections 1., 2a),b), 3., 4.a); Laura Deldossi developed all aspects of the numerical application presented in the paper and of the
designers and manufacturing operators can look for levels of the controllable factors that
not only ensure the target response mean value but also reduce
the overall response variance. This gives rise to the so-called “robust product and
process parameter design”, which has the goal of specifying operating conditions so that
the quality characteristics of interest are insensitive to variation due to noise factors. To
make the content of the paper easier to understand we shall consider a very simple
example (which can be considered a simplified version of the example developed later
on), that is appropriate to summarize the main traits and the related statistical methods
which are typical in the context under study.
a) We can start with the “product array” approach, from which the “combined array”
approach considered here will follow immediately. Let Y be a random variable
describing a product characteristic subject to random fluctuations (in the following
example, concerning the manufacturing of a rectangular plastic sheet, Y represents the
material strength loss under stress). Suppose that there is only one process controllable
factor x (for instance conditioning temperature) and that an experimental design is
carried out composed of m×n independent trials so that m levels x1, x2,…, xm, of the
controllable variable x are considered and for each condition xi, i=1,2,…,m, the
experiment is repeated n times, without any other known modification of the
experimental asset. Assume that the observable results Yij of the experiment can be
interpreted according to the following models:
Yij = β0 + β1xi + Zij + Eij = ηi + Zij + Eij (1.1)
i=1,2,…,m; j=1,2,…,n, where for simplicity we assume that Zij are stochastically
independent random variables, describing the effects of an unobservable noise factor Z,
with means zero and variances σ2Z(xi), ∀j, Eij are random variables (errors) independent
of each other and of Zij, with means zero and common variance σ2E, β0, β1 are real
unknown parameters.
The variance of Yij is obviously:
$$\sigma^2_Y(x_i) = \sigma^2_Z(x_i) + \sigma^2_E,$$
corresponding tables and graphs; she is the author of sections 2.c), 5. and contributed to section 4.b). All work was discussed and agreed on by both authors.
i=1,2,…,m, say, and if we indicate by $S_i^2 = \sum_{j=1}^{n} (Y_{ij}-\bar{Y}_i)^2/(n-1)$ the usual unbiased
estimate of $\sigma^2_Y(x_i)$ obtained through the n replications, model (1.1) can be completed, for
instance, by a relationship of the type:
$$\log_e S_i^2 = \gamma_0 + \gamma_1 x_i + \gamma_{11} x_i^2 + E^*_i \qquad (1.2)$$
i=1,2,…,m, where E*i still can be assumed to be zero mean independent random
variables, γ0, γ1, γ11 are unknown real parameters which allow us to connect the
controllable factor x with the variance of Y. Models (1.1), (1.2) give a simplified
example of the dual approach proposed by Vining et al. (1990), see also Magagnoli et
al. (1990) and Zanella (1992). In theory after fitting models (1.1), (1.2) to the
observations, typically by having recourse to the least square criterion, possibly
weighted, we can try to determine the level of the controlled factor x which minimizes
σ2Y(⋅) subject to an appropriate target constraint on the mean value η.
b) The original “product array” of Taguchi raised some criticisms, followed by
corresponding improvements which led to the “combined array approach”:
1) a major criticism is that the product array approach requires a full factorial
experimental design for combining the controllable factors levels with the noise factors,
since it requires for each of the m sets of experimental conditions on the controllable
factors – that is for each “row” of the so-called “inner array” of the experiment – to
carry on a complete replication of an “outer array” with n trials, in which random
variation of the noise factors effects can be assumed. Thus the required number of trials
is n×m and it can become very large (e.g. if the inner array corresponds to a fractional $2^{5-1}$
experiment in 5 controllable factors at 2 levels with 16 trials and an outer array also
with 16 trials is considered, to study the effects of say 5 additional noise factors, the
experiment would include 256 runs, etc.).
2) The product array approach allows what in my view is the most direct and coherent
use of the method, since the outer array may consist of n random independent replicates
for each set of conditions chosen for the controllable factors. It follows that the
statistical analysis of the observations can be based on two neatly distinct models –
named “dual models” in Vining et al. (1990) – a first model, like (1.1), for the study of
the effects of controllable factors on the mean value η of the response; a second model,
like (1.2), for the study of the effects of the controllable factors on the response
variability. However a drawback of a model like (1.2) is that it does not allow us to
describe the origin of the last-mentioned effects, which seem naturally related to
interactions between controllable and noise factors.
3) There is a surprising ambiguity in defining the nature of the noise factors: on one
hand they are presented as factors whose behaviour is “by definition unsystematic and
possibly random” see for instance Alt (1988), p.166; on the other hand it is supposed
that the experimenter is capable of setting up conditions so that a noise factor can either
produce a negative or an intermediate or a positive effect, see for instance Kackar
(1985), p. 185 and also Park (1996), p. 281: this latter is speaking about a noise factor of
“three levels or conditions: good, normal or bad”.
The “combined array approach” unifies both the controllable and the noise factors in a
single “design matrix”, which is possible under the assumption that the noise factors
can be reproduced as controllable factors in a laboratory or on the pilot plant level, see
Myers et al. (1997), p. 430, and also Khattree (1996), p. 189. In other words it is assumed
that for each guessed noise factor the experimenter can identify a controllable factor that
will reproduce the noise factor effect when its levels appear at random.
From the former point of view the study of both types of factors, controllable and noise
factors, can be conducted on the basis of a unique factorial design, typically fractional,
with a corresponding often drastic reduction of the number of trials and the possibility
of describing controllable-by-noise factor interactions explicitly (as a very recent
large-scale application of the “combined array approach”, problematic from a
certain point of view, we mention Dasgupta et al. (2002)).
If we adopt the combined array approach in the example examined above, for the
statistical interpretation and analysis of the experimental results obtained in laboratory
or on pilot plant level, model (1.1) has to be replaced by another one, for instance, of the
following type:
Yi = β0 + β1xi + γξi + δξixi + Ei (1.3)
i=1,2,…,n, where n is now the total number of trials, ξi the chosen level of a
controllable variable ξ, which is the systematic counterpart of the random noise Z, Ei
still are stochastically independent random errors, with zero means and variance σ2E, β0,
β1, γ, δ are unknown real parameters. Model (1.3) describes the results obtained in
“laboratory” through an appropriate designed experiment and it is instrumental in
pointing out the relevant factor effects. However, on the other hand, the former
experiment is conceptually related to a “second one” in which the levels of ξ happen at
random as values of an observable noise variable Z and which would produce some
results actually similar to those experienced for instance by a customer in the real life of
a product. As regards the simple example considered above a natural model for this
second experiment would be the following:
Yi = β0 + β1xi + γZi + δZixi + Ei (1.4)
where we preserve the assumptions on the stochastic independence of the random
variables Zi, Ei, etc. made in connection with (1.1), see Box and Jones (1992) for a general
presentation of this type of model. It is immediate to show that for model (1.3) the
response variance is
$$\mathrm{Var}(Y_i) = \sigma^2_Y = \sigma^2_E,$$
while for model (1.4) we obtain:
$$\mathrm{Var}(Y_i) = \sigma^2_Y(x_i) = \sigma^2_E + (\gamma + \delta x_i)^2 v^2, \qquad (1.5)$$
if we further assume that the sample noise variables Zi have constant variance v2,
besides having zero means. In the example presented in the paper one of the noise
variables is associated with irregularities in the plastic sheet from which the finished
product – a photographic film strip – is obtained. One of the sources of these irregularities
is related to an axial trend along the longer sheet dimension, that is to a spatial
coordinate ξ. When the product is retailed one can assume that the product unit received
by a customer corresponds to a strip for which the level of ξ has been chosen at random
and that it actually is the value of a random variable Z. With regard to model (1.3) it is
of primary importance also to assess whether the effect β1 of the controllable factor x is
clearly distinguishable from the random fluctuations which affect the response Y. In the
framework of the laboratory experiment if β̂1 is the usual least squares estimate of β1
this can be carried out, under suitable assumptions (in particular that the errors Ei have
also a normal distribution) by having recourse to the well-known T-test. However this
standard procedure would only point out whether β1 is sufficiently large with regard to
random fluctuations due to the errors Ei of variance $\sigma^2_E$. In the context of the combined
array approach what really matters is whether the effect β1 is also distinguishable
from the overall random errors including the noise effects, as shown by relationship
(1.4), whose corresponding possible variance increment is given by (1.5). This paper
has the purpose of going into the last subject. In section 2. the reference experimental
framework, based on a completely orthogonal design, and the statistical model used for
the interpretation of the corresponding observations are presented together with an
example which later on, in section 5. will allow us to illustrate the application of the
obtained theoretical results.
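As a numerical illustration of relationship (1.5), the following minimal sketch simulates model (1.4) at a fixed level x of the controllable factor and compares the empirical response variance with $\sigma^2_E + (\gamma + \delta x)^2 v^2$. All parameter values are hypothetical and chosen only for illustration, and normality of Z and E is assumed for the simulation; neither is taken from the paper's example.

```python
import random
import statistics


def simulate_response_variance(x, beta0, beta1, gamma, delta, sigma_e, v,
                               n_draws=100_000, seed=0):
    """Empirical Var(Y) under model (1.4) at a fixed level x of the controllable factor."""
    rng = random.Random(seed)
    ys = []
    for _ in range(n_draws):
        z = rng.gauss(0.0, v)          # noise factor Z, variance v^2
        e = rng.gauss(0.0, sigma_e)    # experimental error E, variance sigma_E^2
        ys.append(beta0 + beta1 * x + gamma * z + delta * z * x + e)
    return statistics.pvariance(ys)


def theoretical_response_variance(x, gamma, delta, sigma_e, v):
    """Right-hand side of (1.5): sigma_E^2 + (gamma + delta * x)^2 * v^2."""
    return sigma_e ** 2 + (gamma + delta * x) ** 2 * v ** 2


# Hypothetical parameter values (assumptions of this sketch).
params = dict(gamma=0.5, delta=-0.4, sigma_e=0.2, v=1.0)
for x in (-1.0, 0.0, 1.0):
    emp = simulate_response_variance(x, beta0=3.0, beta1=-0.3, **params)
    theo = theoretical_response_variance(x, **params)
    print(f"x={x:+.1f}  empirical={emp:.4f}  theoretical={theo:.4f}")
```

With these values the variance is smallest at x = +1, where γ + δx is closest to zero; this is the mechanism by which a controllable factor can reduce the response variance.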
In section 3. a generic controllable factor effect, regardless of its being either a direct or
an interaction effect, say $\beta_h$, is considered with the corresponding estimate $\hat\beta_h$ obtained
by the least squares criterion applied to the results of the first laboratory experiment.
Then the true overall variance ${}_h\sigma^2_T$ of $\hat\beta_h$ is considered, which takes the noise factor
effects into consideration and would arise if the observations were obtained
according to the second conceptual experiment. After some preparatory Lemmas an
unbiased estimate ${}_h\hat\sigma^2_T$ of ${}_h\sigma^2_T$ is established by having recourse to the sole first
experiment observations; adding the normality assumption for the random error
probability distribution, a condition is also pointed out under which the probability distribution
of ${}_h\hat\sigma^2_T$ becomes a noncentral chi-square distribution.
In section 4. two procedures, which have recourse to the observations of the first
laboratory experiment only, are proposed, aimed at assessing whether an effect $\beta_h$ is
distinguishable with respect to the real overall variability summarized by the standard
deviation ${}_h\sigma_T$ of the estimate $\hat\beta_h$. The first proposed method is a statistical test, whose
percent point, defining the acceptance/rejection regions of the hypothesis under study, is
the appropriate (1-α) percentile of a doubly noncentral F distribution. The second
method considers the least squares estimate $\hat{\hat\beta}_h$ of $\beta_h$ which would be obtained if the
second real experiment were also carried out. The method suggests having
recourse to a conditional indicator whose value approximates the probability of
accepting the hypothesis βh=0 when applying to $\hat{\hat\beta}_h$ the usual T-test, conditionally on the
results of the first experiment and, thus, in particular, on the estimate $\hat\beta_h$ obtained from
the same.
In section 5. the theoretical results are applied to the example already presented in
section 2.
2. A completely designed experiment and its conceptual counterpart. The
corresponding models.
a) First experiment: completely designed.
Consider an experimental design with n trials in p+q quantitative continuous
controllable factors aiming at the study of their direct and first order interaction effects
on a quantitative response Y. Denote by x1, x2, …, xp the variables describing the levels
of the first p factors and by ξ1, ξ2, …, ξq similar variables related to the other ones,
which in ordinary cases, that is outside the programmed experiment, would assume
random values and, thus, correspond to the so called noise factors (e.g. environment,
temperature, humidity, light intensity, etc.). Assume that the response value in a generic
trial can be expressed as follows:
$$Y_i = \beta_0 + \sum_{j=1}^{p} \beta_j x_{ji} + \sum_{j=1}^{p-1}\sum_{s=j+1}^{p} \beta_{js} x_{ji} x_{si} + \sum_{j=1}^{q} \gamma_j \xi_{ji} + \sum_{j=1}^{q}\sum_{s=1}^{p} \delta_{js} \xi_{ji} x_{si} + E_i, \qquad (2.1)$$
where β0, βj, βjs, γj, δjs denote sets of unknown real parameters, xji, xsi, ξji are the factor
levels in the i-th trial, i=1,2,…,n, n≥1+p⋅(p+1)/2+q⋅(p+1), and Ei are independent normal
variates with mean zero and constant variance $\sigma^2_E$, $E_i \sim N(0,\sigma^2_E)$, i=1,2,…,n.
Now define the column vectors ${}_h x = (x_{h1},\dots,x_{hn})'$, for h=1,2,…,p, and put ${}_h x = {}_j x * {}_s x$ for
h=p+s+(j-1)(j-2)/2, s=1,…,j-1, j=2,…,p, with * denoting Hadamard’s product of two
vectors (product component by component) and likewise ${}_h\xi = (\xi_{h1},\dots,\xi_{hn})'$, h=1,2,…,q,
and define ${}_h\xi = {}_j\xi * {}_s x$ for h=q+(j-1)p+s, s=1,2,…,p; j=1,2,…,q. In matrix notation (2.1)
becomes:
Y = Xβ + Ξ0γ + Ξ1δ1 + … + Ξqδq + E (2.2)
where the vectors and matrices are defined as follows: Y = (Y1,…,Yn)’, E = (E1,…,En)’;
if m=p⋅(p+1)/2, β=(β0,β1,…,βm)’ is a column vector (m+1)×1 summarizing the β’s
parameters: β=(β0,β1,…,βh,…,βm)’, for h≠0 with βh≡βjs defined and ordered according
to the above rule for h=p+1,…,m; γ=(γ1,…,γq)’; δh=(δh1,…,δhp)’, h=1,2,…,q; X is a
n×(1+m) matrix with the first column of unitary elements, say 0x, and the other columns
given by hx, h=1,2, …,m; Ξ0 is a n×q matrix with columns hξ, h=1,2,…,q, Ξj are n×p
matrices whose columns are the vectors hξ, h=q+(j-1)p+s, s=1,2,…,p, j=1,2,…,q,
respectively.
We state an assumption which will prove essential later.
Orthogonality assumption. The “design matrix”:
(X Ξ0 Ξ1 … Ξq)
has columns which are all mutually orthogonal.
As is well known, the efficient linear estimators of the unknown parameters of model (2.1)
are obtained by applying the least squares criterion, which leads to the expressions listed
below.
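By virtue of the assumed orthogonality we have precisely:
$$\hat\beta_h = \sum_{i=1}^{n} \Big( x_{hi} \Big/ \sum_{i=1}^{n} x_{hi}^2 \Big) Y_i = {}_h x'(X\beta + \Xi_0\gamma + \Xi_1\delta_1 + \dots + \Xi_q\delta_q + E)/\lVert {}_h x\rVert^2 = \beta_h + \sum_{i=1}^{n} \big( x_{hi}/\lVert {}_h x\rVert^2 \big) E_i = \beta_h + \sum_{i=1}^{n} a_{hi} E_i \qquad (2.3)$$
h=0,1,…,m, where $\lVert {}_h x\rVert$ denotes the modulus of the n×1 vector ${}_h x$, and we put $a_{hi} = x_{hi}/\lVert {}_h x\rVert^2$;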
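likewise:
$$\hat\gamma_j = \sum_{i=1}^{n} \Big( \xi_{ji} \Big/ \sum_{i=1}^{n} \xi_{ji}^2 \Big) Y_i = \gamma_j + \sum_{i=1}^{n} \big( \xi_{ji}/\lVert {}_j\xi\rVert^2 \big) E_i = \gamma_j + \sum_{i=1}^{n} b_{ji} E_i, \qquad (2.4)$$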
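j=1,2,…,q, where $\lVert {}_j\xi\rVert$ is the modulus of ${}_j\xi$ and we put $b_{ji} = \xi_{ji}/\lVert {}_j\xi\rVert^2$;
$$\hat\delta_{js} = \sum_{i=1}^{n} \Big( \xi_{hi} \Big/ \sum_{i=1}^{n} \xi_{hi}^2 \Big) Y_i = \delta_{js} + \sum_{i=1}^{n} \big( \xi_{hi}/\lVert {}_h\xi\rVert^2 \big) E_i = \delta_{js} + \sum_{i=1}^{n} c_{jsi} E_i \qquad (2.5)$$
where h=q+(j-1)p+s, j=1,2,…,q; s=1,2,…,p, $\lVert {}_h\xi\rVert$ is the modulus of ${}_h\xi$ and we put,
recalling the original definitions, $c_{jsi} = \xi_{ji} x_{si} \big/ \sum_{i=1}^{n} \xi_{ji}^2 x_{si}^2$.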
Thus any estimator equals the value of the corresponding parameter plus an “error”,
which is a zero mean normal random variable, since it is a linear combination of the
zero mean normal random variables Ei, i= 1,2,…,n which have constant variance σ2E; it
ensues from the orthogonality assumption that all these errors are stochastically
independent. Obviously for a given realization of the experimental design the
parameters’ estimates are obtained by substituting the obtained observations yi for Yi,
i=1,2,…,n, in (2.3), (2.4), (2.5).
b) Second experiment: a conceptual counterpart of the first experiment.
In the real world the factors ξ1, ξ2, …, ξq are noise factors the levels of which take on
values at random. Therefore we can conceptually associate the former experiment with
another one which is exactly the same as far as the levels of the controllable factors are
concerned, while the levels of the noise factors are realizations of some random
variables Z1, Z2, …, Zq. Let us assume that the latter are zero mean normal variables
with constant variance v2, also stochastically independent of each other and of the errors
Ei, i=1,2,…,n, and that the n trials originate a simple random sampling from them which
will be described by the variables Z1i, Z2i, …, Zqi, i=1,2,…,n. Therefore Zji∼N(0,v2), ∀i,j
and all the Zji are stochastically independent of each other and of the errors. We note
that assuming Var(Zj) = v2j = v2, j=1,2,…,q, doesn’t represent a major restriction. In the
designed experiment $\sum_{i=1}^{n} \xi_{ji} = 0$ ∀j, because of the orthogonality assumption. Thus the
levels ξji have to be thought of as differences from a mean value $\bar\xi_j$. For an appropriate
choice of these differences one has to guess the possible ranges 2Rj, say, so that ξj ∈
{-Rj, Rj}, j=1,2,…,q, by the symmetry of the noise factor distributions, and 2Rj will also
be the ranges of the latter. Then we can put, e.g., vj≅Rj/3 and if we re-scale the ξj
variables as 3ξj/Rj, we can assume that for the corresponding random variables Zj we
have Var(3Zj/Rj) ≅ 1 ∀j, when they are expressed on the same conventional scale by
putting 3Zj/Rj. We shall actually introduce a parameter $v^2$, which can also be different
from 1, in order to allow one to study the effects of possible adjustment of the scales
definition, in particular see later Remark 3.1. We can now summarize the preceding
point by the following assumption.
Scale assumption. The scales of the variables ξj related to the noise factors are chosen
in such a way that Var(Zj)=v2, ∀j = 1,2,…,q, with v2 a known constant.
For the second experiment we assume the following model:
$$Y_i = \beta_0 + \sum_{j=1}^{p} \beta_j x_{ji} + \sum_{j=1}^{p-1}\sum_{s=j+1}^{p} \beta_{js} x_{ji} x_{si} + \Big[ \sum_{j=1}^{q} \gamma_j Z_{ji} + \sum_{j=1}^{q}\sum_{s=1}^{p} \delta_{js} Z_{ji} x_{si} \Big] + \tilde E_i, \qquad (2.6)$$
i=1,2,…,n, where the parameters βj, γj and δjs are the same as in model (2.1);
furthermore the errors $\tilde E_i$ have the same stochastic structure as in (2.1) with $\mathrm{Var}(\tilde E_i) = \sigma^2_E$
∀i, and are assumed to be stochastically independent of Ei; furthermore, as stated above,
Zji, j=1,2,…,q, i=1,2,…,n, are zero mean independent normal variables with constant
variance $v^2$, independent of $\tilde E_i$ as well as of Ei, pertaining to the first experiment,
i=1,2,…,n.
The expression in square brackets of (2.6) shows the “increment” of random error which
arises in the second experiment.
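According to model (2.6) the ordinary least squares estimator of a parameter βh
becomes, in view of (2.3) but with reference to (2.6):
$$\hat{\hat\beta}_h = \sum_{i=1}^{n} \Big( x_{hi} \Big/ \sum_{i=1}^{n} x_{hi}^2 \Big) Y_i = \beta_h + \sum_{i=1}^{n} a_{hi} \tilde E_i + \sum_{i=1}^{n} a_{hi} \Big[ \sum_{j=1}^{q} Z_{ji} \Big( \gamma_j + \sum_{s=1}^{p} \delta_{js} x_{si} \Big) \Big] = \beta_h + \sum_{i=1}^{n} a_{hi} \tilde E_i + \sum_{i=1}^{n} a_{hi} Z_i'(\gamma + D x_i), \qquad (2.7)$$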
h=0,1,…,m, where in the last expression we introduced the vectors Zi = (Z1i,Z2i,…,Zqi)′,
xi = (x1i,x2i,…,xpi)′, γ and the matrix D with rows δ′h, h=1,…,q, see above (2.2). Now,
under the above assumptions and definitions we have:
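$$\mathrm{Var}\Big( \sum_{i=1}^{n} a_{hi} \tilde E_i \Big) = \sigma^2_E \sum_{i=1}^{n} a_{hi}^2 = \sigma^2_E / \lVert {}_h x\rVert^2 = {}_h\sigma^2_{E*}; \qquad (2.7')$$
$$\mathrm{Var}\Big[ \sum_{i=1}^{n} a_{hi} Z_i'(\gamma + D x_i) \Big] = \sum_{i=1}^{n} a_{hi}^2 (\gamma + D x_i)' E(Z_i Z_i') (\gamma + D x_i) = v^2 \sum_{i=1}^{n} \big( x_{hi}^2 / \lVert {}_h x\rVert^4 \big) (\gamma + D x_i)'(\gamma + D x_i) = {}_h\sigma^2_D, \qquad (2.7'')$$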
where E(⋅) indicates the expected value and the scale assumption is taken into
consideration. It follows from (2.7’), (2.7’’) that (2.7) can be given the following forms:
$$\hat{\hat\beta}_h = \hat\beta_h + \hat\beta^*_h = \beta_h + {}_h\sigma_{E*} U_1 + {}_h\sigma_D U_2 = \beta_h + \sqrt{{}_h\sigma^2_{E*} + {}_h\sigma^2_D}\; U \qquad (2.8)$$
where by virtue of the normality and independence assumptions U1, U2 are standardized
normal random variables, independent of each other and, thus, also U is a standardized
normal random variable. Finally we also have:
$$\mathrm{Var}(\hat{\hat\beta}_h) = {}_h\sigma^2_{E*} + {}_h\sigma^2_D = {}_h\sigma^2_T, \qquad (2.9)$$
h=0,1,…,m.
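The decomposition (2.9) can be checked numerically. The sketch below evaluates (2.7’), (2.7’’) and (2.9) for a given design and compares the result with a Monte Carlo estimate of the variance of the random part of (2.7). The design (a $2^2$ factorial run four times), the parameter values γ, D, $\sigma^2_E$, $v^2$ and the function names are all hypothetical choices made for this illustration.

```python
import math
import random


def overall_variance(x_h, x_rows, gamma, D, sigma2_e, v2):
    """Return (sigma2_Estar, sigma2_D, sigma2_T) following (2.7'), (2.7'') and (2.9).

    x_h    : column of X tied to the effect beta_h (length n)
    x_rows : the n vectors x_i = (x_1i, ..., x_pi)
    gamma  : (gamma_1, ..., gamma_q);  D : q rows delta_j' of length p
    """
    norm2 = sum(x * x for x in x_h)                      # ||hx||^2
    s2_estar = sigma2_e / norm2
    s2_d = 0.0
    for x_hi, x_i in zip(x_h, x_rows):
        g = [gj + sum(d * x for d, x in zip(dj, x_i)) for gj, dj in zip(gamma, D)]
        s2_d += (x_hi ** 2 / norm2 ** 2) * sum(c * c for c in g)
    s2_d *= v2
    return s2_estar, s2_d, s2_estar + s2_d


def simulated_variance(x_h, x_rows, gamma, D, sigma2_e, v2, n_rep=20_000, seed=1):
    """Monte Carlo variance of sum_i a_hi * (E~_i + Z_i'(gamma + D x_i)), see (2.7)."""
    rng = random.Random(seed)
    norm2 = sum(x * x for x in x_h)
    sd_e, sd_z = math.sqrt(sigma2_e), math.sqrt(v2)
    draws = []
    for _ in range(n_rep):
        total = 0.0
        for x_hi, x_i in zip(x_h, x_rows):
            e = rng.gauss(0.0, sd_e)
            noise = sum(rng.gauss(0.0, sd_z) * (gj + sum(d * x for d, x in zip(dj, x_i)))
                        for gj, dj in zip(gamma, D))
            total += (x_hi / norm2) * (e + noise)
        draws.append(total)
    mean = sum(draws) / n_rep
    return sum((d - mean) ** 2 for d in draws) / (n_rep - 1)


# Hypothetical setting: 2^2 design in (x1, x2), four runs per point (n = 16), effect of x1.
x_rows = [(x1, x2) for x2 in (-1, 1) for x1 in (-1, 1) for _ in range(4)]
x_h = [x[0] for x in x_rows]
gamma = (0.1, 0.2)                      # hypothetical gamma_1, gamma_2
D = [(0.05, -0.10), (-0.15, 0.10)]      # hypothetical rows delta_1', delta_2'
s2_estar, s2_d, s2_t = overall_variance(x_h, x_rows, gamma, D, sigma2_e=0.05, v2=1.0)
s2_mc = simulated_variance(x_h, x_rows, gamma, D, sigma2_e=0.05, v2=1.0)
print(s2_estar, s2_d, s2_t, s2_mc)
```

In this hypothetical setting the noise contribution ${}_h\sigma^2_D$ exceeds ${}_h\sigma^2_{E*}$, so a test based on $\sigma^2_E$ alone would overstate how distinguishable the effect is.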
Remark 2.1. With reference to (2.6) note that in the second experiment the observed
variables Yi are heteroscedastic. In fact it is:
$$\mathrm{Var}(Y_i) = \big[ \sigma^2_E + v^2 (\gamma + D x_i)'(\gamma + D x_i) \big] = 1/w_i^2, \qquad (2.10)$$
say, an expression which depends on the experiment conditions xi in the i-th run,
i=1,2,…,n.
As is well known, should Var(Yi) be known, efficient estimates of βh, h=0,1,…,m, could
be obtained by having recourse to weighted least squares instead of ordinary ones, that
is by minimizing (Y-Xβ)’W(Y-Xβ), where W is the diagonal matrix with non null
elements $w_i^2$, i=1,2,…,n, equal to variance reciprocals, instead of (Y-Xβ)’(Y-Xβ). The
corresponding estimators would be:
$$\hat\beta = (X'WX)^{-1}X'WY;$$
however the variances (2.10) are unknown and should be estimated, for instance by
using the results of the first experiment, so that the weights $w_i^2$ and the resulting
estimates would be only approximate. Moreover the observations of the second experiment
would have to be available: avoiding this requirement is precisely the purpose of the paper, and it turns out to be much
more easily done if we refer to ordinary least squares.
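For reference, the weighted least squares estimator mentioned in Remark 2.1 can be sketched as follows, under the assumption (not available in practice, as noted above) that the variances Var(Yi) are known. The small linear solver, the data and the variances are hypothetical and serve only to show the computation of $(X'WX)^{-1}X'WY$.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [u - f * w for u, w in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def weighted_least_squares(X, y, variances):
    """beta_hat = (X'WX)^{-1} X'WY with W = diag(1 / Var(Y_i))."""
    w = [1.0 / s for s in variances]
    k = len(X[0])
    xtwx = [[sum(wi * r[a] * r[b] for wi, r in zip(w, X)) for b in range(k)]
            for a in range(k)]
    xtwy = [sum(wi * r[a] * yi for wi, r, yi in zip(w, X, y)) for a in range(k)]
    return solve(xtwx, xtwy)


# Hypothetical data: intercept plus one two-level factor, unequal known variances.
X = [[1, -1], [1, -1], [1, 1], [1, 1]]
y = [3.2, 3.0, 2.1, 2.7]
variances = [0.05, 0.30, 0.05, 0.30]
beta_wls = weighted_least_squares(X, y, variances)
beta_ols = weighted_least_squares(X, y, [1.0] * len(y))   # equal weights give ordinary least squares
print(beta_wls, beta_ols)
```

Observations with smaller variance receive larger weight, so the weighted and ordinary estimates differ whenever the variances are unequal within a design point.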
c) An example
To illustrate the techniques here presented we consider the experiment reported in
Zanella and Cascini (1997) concerning the optimization of a mechanical characteristic
of a plastic film produced on industrial scale.
In particular the goal of the experiment was to assess the effects on the mean film
strength loss under stress η, measured in laboratory on a test strip as y, of two process
controllable factors: x1 (stretch ratio), x2 (conditioning temperature, °C), also having
regard to two between product noise factors z1 and z2, related to manufacturing
imperfections. More precisely we have to mention that the finished product is obtained
by cutting an appropriate strip from a many meters long film roll, “whole roll”, which
after unwinding can be represented by a rectangle as it is shown in Figure 1 where also
a corresponding coordinate system (ξ1,ξ2) is indicated.
Fig. 1 Geometric representation of a “whole roll” of plastic material
Now technical experience suggested that it is unrealistic to regard a whole roll, when
unwound, as equivalent to a uniform plastic sheet, and it seems more appropriate to
assume, with obvious simplification, that the local film strength loss under stress of a
small rectangular film strip depends linearly on the coordinates (ξ1,ξ2) of its centre. For
simplicity the test strip centre will be supposed to summarize the strip location within the
whole roll and to define a uniform modification of the film strength loss under stress
within a given strip, also when this corresponds to a finished product unit. When such
units are retailed we can assume that a customer receives a finished product unit
corresponding to a strip the location of which in the original whole roll was chosen at
random. Thus while the values of the coordinates (ξ1,ξ2) can be chosen as desired by the
experimenter in the first part of the investigation and thus correspond to two
controllable factors ξ1, ξ2 the values of ξ1, ξ2 will appear at random in the successive
real life of the product as values of two random variables Z1, Z2.
In conclusion the experimental setting we have just illustrated complies with the
conceptual framework assumed in §2.a) above, with x1, x2 as fixed effects controllable
factors and ξ1, ξ2 as controllable counterpart of the noise factors, which are actually
described by the random variables Z1, Z2. Correspondingly a $2^4$ full factorial design
with 16 runs in the four variables x1, x2, ξ1, ξ2 could be carried out, with conventional
units x1= (xA-1.425)/0.025; x2= (xB-80)/10 for the first two variables, while (-1,-1),
(-1,+1), (+1,-1), (+1,+1) express, in the conventional units chosen for ξ1, ξ2, the
coordinates of the vertices of the rectangle describing a whole roll, see Fig. 1.
According to Zanella and Cascini (1997) the completely designed experiment, with
conditions given in conventional units, together with the ensuing experimental results
are summarized in the following table.
Table 1: Experimental design and observed film strength loss under stress yi as reported in Zanella and Cascini (1997).
Design factors      Noise factors (ξ1, ξ2)
 x1    x2      (-1,-1)   (+1,-1)   (-1,+1)   (+1,+1)
 -1    -1       3.20      3.21      3.19      3.22
 +1    -1       3.10      3.08      3.11      3.12
 -1    +1       2.10      2.05      3.22      3.05
 +1    +1       2.66      2.65      2.64      2.65
1) First experiment: completely designed.
Model (2.1) for the example we are considering becomes
Yi=β0+β1x1i+β2x2i+β12x1ix2i+γ1ξ1i+ γ2ξ2i+δ11ξ1ix1i+δ12ξ1ix2i+δ21ξ2ix1i+δ22ξ2ix2i+Ei
where i=1,2,…,n=16 enumerates the trials. Later on we shall suppose that all
assumptions considered in §2.a) and §2.b) hold true. The total number of unknown real
parameters in this case is 1+p⋅(p+1)/2+q⋅(p+1)=10, which is less than the number of
trials (n=16), so that all the parameters in the model are estimable; in particular $\sigma^2_E$ is estimated with 6
degrees of freedom.
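The matrix X is
$$X = \begin{pmatrix}
+1 & -1 & -1 & +1 \\ +1 & -1 & -1 & +1 \\ +1 & -1 & -1 & +1 \\ +1 & -1 & -1 & +1 \\
+1 & +1 & -1 & -1 \\ +1 & +1 & -1 & -1 \\ +1 & +1 & -1 & -1 \\ +1 & +1 & -1 & -1 \\
+1 & -1 & +1 & -1 \\ +1 & -1 & +1 & -1 \\ +1 & -1 & +1 & -1 \\ +1 & -1 & +1 & -1 \\
+1 & +1 & +1 & +1 \\ +1 & +1 & +1 & +1 \\ +1 & +1 & +1 & +1 \\ +1 & +1 & +1 & +1
\end{pmatrix}$$
where the first column of unit elements corresponds to ${}_0x$, while the other columns are
respectively ${}_1x$, ${}_2x$, ${}_3x = {}_1x * {}_2x$; the matrices are $\Xi_0 = [{}_1\xi, {}_2\xi]$, $\Xi_1 = [{}_3\xi, {}_4\xi]$ and $\Xi_2 = [{}_5\xi, {}_6\xi]$
with ${}_3\xi = {}_1\xi * {}_1x$; ${}_4\xi = {}_1\xi * {}_2x$; ${}_5\xi = {}_2\xi * {}_1x$; ${}_6\xi = {}_2\xi * {}_2x$.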
Explicitly we have:
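$$\Xi_0 = \begin{pmatrix}
-1 & -1 \\ +1 & -1 \\ -1 & +1 \\ +1 & +1 \\
-1 & -1 \\ +1 & -1 \\ -1 & +1 \\ +1 & +1 \\
-1 & -1 \\ +1 & -1 \\ -1 & +1 \\ +1 & +1 \\
-1 & -1 \\ +1 & -1 \\ -1 & +1 \\ +1 & +1
\end{pmatrix}, \quad
\Xi_1 = \begin{pmatrix}
+1 & +1 \\ -1 & -1 \\ +1 & +1 \\ -1 & -1 \\
-1 & +1 \\ +1 & -1 \\ -1 & +1 \\ +1 & -1 \\
+1 & -1 \\ -1 & +1 \\ +1 & -1 \\ -1 & +1 \\
-1 & -1 \\ +1 & +1 \\ -1 & -1 \\ +1 & +1
\end{pmatrix}, \quad
\Xi_2 = \begin{pmatrix}
+1 & +1 \\ +1 & +1 \\ -1 & -1 \\ -1 & -1 \\
-1 & +1 \\ -1 & +1 \\ +1 & -1 \\ +1 & -1 \\
+1 & -1 \\ +1 & -1 \\ -1 & +1 \\ -1 & +1 \\
-1 & -1 \\ -1 & -1 \\ +1 & +1 \\ +1 & +1
\end{pmatrix}.$$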
Putting β=(β0,β1,β2,β3)’ where β3=β12, γ=(γ1,γ2)’, δ1=(δ11,δ12)’ and δ2=(δ21,δ22)’ the
above model in matrix notations becomes:
Y = η + E = Xβ + Ξ0γ + Ξ1δ1 + Ξ2δ2 + E.
It can easily be verified that the design matrix (16×10)
(X Ξ0 Ξ1 Ξ2)
has columns which are all mutually orthogonal, so that the “orthogonality assumption”
is satisfied.
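The orthogonality check can be carried out mechanically. The following sketch builds the 16×10 design matrix (X Ξ0 Ξ1 Ξ2) in the run order of Table 1 and verifies that its Gram matrix is diagonal; the variable names are our own and the run order (four noise combinations within each setting of x1, x2) is assumed from Table 1.

```python
from itertools import product

# Run order of Table 1: blocks of four runs per (x1, x2), (xi1, xi2) cycling inside each block.
control = [(-1, -1), (+1, -1), (-1, +1), (+1, +1)]
noise = [(-1, -1), (+1, -1), (-1, +1), (+1, +1)]

rows = []
for (x1, x2), (k1, k2) in product(control, noise):
    rows.append([
        1, x1, x2, x1 * x2,     # X:    0x, 1x, 2x, 3x = 1x * 2x
        k1, k2,                 # Xi_0: 1xi, 2xi
        k1 * x1, k1 * x2,       # Xi_1: 3xi, 4xi
        k2 * x1, k2 * x2,       # Xi_2: 5xi, 6xi
    ])

columns = list(zip(*rows))
# Gram matrix of the columns: orthogonality means all off-diagonal entries vanish.
gram = [[sum(a * b for a, b in zip(ci, cj)) for cj in columns] for ci in columns]
is_orthogonal = all(gram[i][j] == (16 if i == j else 0)
                    for i in range(10) for j in range(10))
print(is_orthogonal)
```

Each diagonal entry equals 16, the squared modulus of every column, which is the quantity dividing the estimators in (2.3), (2.4), (2.5).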
Applying the least squares criterion, we obtain the efficient estimators of the unknown
“linear” parameters of model (2.1) listed in Table 2.
From Table 2 we see that, if we choose α=0.10 as significance level, two main
effects, relating to “conditioning temperature” (x2) and to “vertical location of the test
strip in the roll” (ξ2), have an effect on the response mean value η. In addition there are
two large interactions referring to ξ2⋅x1 and ξ2⋅x2. Then, if the goal of the experiment is
to obtain a lower value of η, the best experimental conditions are high values of
conditioning temperature (x2) and choice of the test strip in the lower side of the roll. It
has to be remembered that only x1 and x2 are process factors, always controllable, while
ξ2 is a noise factor which is controllable only in the programmed experiment.
Table 2: Parameter estimates and corresponding T-test with regard to the complete model (2.1)
Parameter    Estimate     Standard error ${}_h\hat\sigma_{E*}$   t              p-value
β0            2.890625    0.056673177       51.00517       3.81e-09
β1           -0.014375    0.056673177       -0.25364773    0.80823595
β2           -0.263125    0.056673177       -4.642849*     0.0035299595
β3 = β12      0.036875    0.056673177        0.65066054    0.53936760
γ1           -0.011875    0.056673177       -0.20953475    0.84096424
γ2            0.134375    0.056673177        2.3710511*    0.055441083
δ11           0.010625    0.056673177        0.18747846    0.85746479
δ12          -0.015625    0.056673177       -0.27570362    0.79202514
δ21          -0.130625    0.056673177       -2.3048823*    0.060692958
δ22           0.128125    0.056673177        2.2607697*    0.064479931
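The estimates, standard error and t values of Table 2 can be recomputed from the observations of Table 1 with expressions (2.3), (2.4), (2.5). The sketch below does this under the run order assumed from Table 1; p-values are omitted because they would require the Student t distribution function, which is not in the Python standard library.

```python
import math
from itertools import product

control = [(-1, -1), (+1, -1), (-1, +1), (+1, +1)]   # (x1, x2): rows of Table 1
noise = [(-1, -1), (+1, -1), (-1, +1), (+1, +1)]     # (xi1, xi2): columns of Table 1
y = [3.20, 3.21, 3.19, 3.22,
     3.10, 3.08, 3.11, 3.12,
     2.10, 2.05, 3.22, 3.05,
     2.66, 2.65, 2.64, 2.65]

names = ["beta0", "beta1", "beta2", "beta12", "gamma1", "gamma2",
         "delta11", "delta12", "delta21", "delta22"]
design = [[1, x1, x2, x1 * x2, k1, k2, k1 * x1, k1 * x2, k2 * x1, k2 * x2]
          for (x1, x2), (k1, k2) in product(control, noise)]

n = len(y)
# With mutually orthogonal columns each estimate is column'y / ||column||^2.
estimates = [sum(row[h] * yi for row, yi in zip(design, y)) /
             sum(row[h] ** 2 for row in design)
             for h in range(len(names))]
fitted = [sum(c * b for c, b in zip(row, estimates)) for row in design]
dof = n - len(names)                                  # 16 - 10 = 6 degrees of freedom
sigma2_e = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted)) / dof
std_err = math.sqrt(sigma2_e / n)                     # ||column||^2 = 16 for every column

for name, est in zip(names, estimates):
    print(f"{name:8s} {est: .6f}  se={std_err:.6f}  t={est / std_err: .4f}")
```

Because every column has squared modulus 16, all ten estimates share the same standard error, as seen in Table 2.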
2) Second experiment: a conceptual counterpart of the first experiment.
In this second experiment we suppose that the factors (ξ1, ξ2) are noise factors the levels
of which take on values at random; for the data of our example it means that the test
strip is taken at random on the roll.
Then, the levels of the noise factors are realizations of two random variables (Z1,Z2) such
that $Z_{1i}\sim N(0,v^2)$, $Z_{2i}\sim N(0,v^2)$, ∀i, and the Z1i are stochastically independent of the Z2i and of the
errors.
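So the matrices $\Xi_0 = [{}_1\xi, {}_2\xi]$, $\Xi_1 = [{}_3\xi, {}_4\xi]$ and $\Xi_2 = [{}_5\xi, {}_6\xi]$ become
$$\Xi_0 = \begin{pmatrix}
z_{1,1} & z_{2,1} \\ z_{1,2} & z_{2,2} \\ z_{1,3} & z_{2,3} \\ z_{1,4} & z_{2,4} \\
z_{1,5} & z_{2,5} \\ z_{1,6} & z_{2,6} \\ z_{1,7} & z_{2,7} \\ z_{1,8} & z_{2,8} \\
z_{1,9} & z_{2,9} \\ z_{1,10} & z_{2,10} \\ z_{1,11} & z_{2,11} \\ z_{1,12} & z_{2,12} \\
z_{1,13} & z_{2,13} \\ z_{1,14} & z_{2,14} \\ z_{1,15} & z_{2,15} \\ z_{1,16} & z_{2,16}
\end{pmatrix}, \quad
\Xi_1 = \begin{pmatrix}
-z_{1,1} & -z_{1,1} \\ -z_{1,2} & -z_{1,2} \\ -z_{1,3} & -z_{1,3} \\ -z_{1,4} & -z_{1,4} \\
+z_{1,5} & -z_{1,5} \\ +z_{1,6} & -z_{1,6} \\ +z_{1,7} & -z_{1,7} \\ +z_{1,8} & -z_{1,8} \\
-z_{1,9} & +z_{1,9} \\ -z_{1,10} & +z_{1,10} \\ -z_{1,11} & +z_{1,11} \\ -z_{1,12} & +z_{1,12} \\
+z_{1,13} & +z_{1,13} \\ +z_{1,14} & +z_{1,14} \\ +z_{1,15} & +z_{1,15} \\ +z_{1,16} & +z_{1,16}
\end{pmatrix}, \quad
\Xi_2 = \begin{pmatrix}
-z_{2,1} & -z_{2,1} \\ -z_{2,2} & -z_{2,2} \\ -z_{2,3} & -z_{2,3} \\ -z_{2,4} & -z_{2,4} \\
+z_{2,5} & -z_{2,5} \\ +z_{2,6} & -z_{2,6} \\ +z_{2,7} & -z_{2,7} \\ +z_{2,8} & -z_{2,8} \\
-z_{2,9} & +z_{2,9} \\ -z_{2,10} & +z_{2,10} \\ -z_{2,11} & +z_{2,11} \\ -z_{2,12} & +z_{2,12} \\
+z_{2,13} & +z_{2,13} \\ +z_{2,14} & +z_{2,14} \\ +z_{2,15} & +z_{2,15} \\ +z_{2,16} & +z_{2,16}
\end{pmatrix}$$
while the matrix X is the same as in the first experiment.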
We have assumed that the variables (Z1, Z2) satisfy the “scale assumption”, that is
Var(Z1) = Var(Z2) = $v^2$ with $v^2$ a known constant. This assumption is not a restrictive one.
In fact suppose that the unwound roll of film is 5000 cm long and 50 cm high.
In this case in the original units we have -2500≤ξ1≤2500, -25≤ξ2≤25 and we can put
approximately $v_1 = R_1/3 = 833.3$, $v_2 = R_2/3 = 8.333$.
Then we can always transform the variables so that their standard deviations $v_1$ and $v_2$
are the same. For example, the variables $\xi_1/v_1$ and $\xi_2/v_2$ both have
variance equal to $v^2 = 1$.
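The re-scaling just described amounts to simple arithmetic, sketched below. The function names are hypothetical; the rule $v_j \cong R_j/3$ and the roll dimensions are those stated in the text.

```python
def noise_scale(half_range):
    """Approximate standard deviation v_j = R_j / 3 of a noise factor ranging over (-R_j, R_j)."""
    return half_range / 3.0


def to_conventional_units(xi, half_range):
    """Re-scale a level xi as xi / v_j = 3 * xi / R_j, so that the variance is about 1."""
    return xi / noise_scale(half_range)


R1, R2 = 2500.0, 25.0                 # half-ranges in cm for the 5000 cm x 50 cm roll
v1, v2 = noise_scale(R1), noise_scale(R2)
print(round(v1, 1), round(v2, 3))
print(to_conventional_units(R1, R1), to_conventional_units(-R2, R2))
```

On the conventional scale both coordinates range over approximately (-3, 3), i.e. three standard deviations on either side of the mean.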
3. Effects estimators in the second experiment: obtaining their variances through
the first experiment
The effects of the factors, which are not noise factors, are estimated by the ordinary
least squares criterion according to expressions (2.7), h=1,2,…,m, with corresponding
variances (2.9). We now shall establish a formula which gives unbiased estimators of
these variances by using the observations of the first experiment. With regard to (2.1) let
ηi = E(Yi), i=1,2,…,n, denote the expected value of a generic observed variable Yi and
$\hat\eta_i$ the corresponding estimator obtained by substituting the estimators $\hat\beta_h$ (2.3), $\hat\gamma_j$ (2.4),
$\hat\delta_{js}$ (2.5) for the unknown parameters in (2.1). For later reference we note that by virtue
of (2.4), (2.5) we have
E(γ̂j) = E
∑i=1
nbji (ηi+Ei) = ∑
i=1
nbjiηi = γj, (3.1)
E(δ̂js) = E
∑i=1
ncjsi (ηi+Ei) = ∑
i=1
ncjsiηi = δjs, (3.2)
j=1,2…,q; s=1,2,…,p.
1) Since the errors $\tilde E_i$ in the second experiment have the same variance $\sigma^2_E$ as the errors $E_i$ in the first one, as an unbiased estimator of $\sigma^2_E$ we shall assume the usual one, based on the residuals after fitting model (2.3) to the data of the first experiment, that is:
$$\hat\sigma^2_E = \sum_{i=1}^{n} (Y_i - \hat\eta_i)^2 / \nu \qquad (3.3)$$
where $\nu = n - [1 + p(p+1)/2 + q(p+1)]$. It ensues, according to (2.7’), that:
$${}_h\hat\sigma^2_{E*} = \hat\sigma^2_E / {}_hx^2 . \qquad (3.4)$$
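As a quick check of the parameter count behind (3.3), the following sketch computes the residual degrees of freedom and the rescaled error variance (3.4). The function names are hypothetical; the values n=16, p=2, q=2 are those of the example used later in the paper.

```python
# Illustrative sketch (hypothetical names) of the bookkeeping in (3.3)-(3.4).

def n_fixed_params(p):
    """m + 1 = 1 + p(p+1)/2: mean, p main effects, p(p-1)/2 interactions."""
    return 1 + p * (p + 1) // 2

def n_noise_params(p, q):
    """q(p+1): each noise factor has one gamma_j and p coefficients delta_js."""
    return q * (p + 1)

def residual_df(n, p, q):
    """nu = n - [1 + p(p+1)/2 + q(p+1)], the divisor in (3.3)."""
    return n - (n_fixed_params(p) + n_noise_params(p, q))

def scaled_error_variance(sigma2_E_hat, x_h):
    """(3.4): divide the residual variance by the sum of squares of column x_h."""
    return sigma2_E_hat / sum(x * x for x in x_h)

nu = residual_df(n=16, p=2, q=2)                      # 16 - (4 + 6)
var_E_star = scaled_error_variance(2.0, [1.0] * 16)   # 2 / 16
```

With n=16, p=2, q=2 only six degrees of freedom remain for the error variance, which is why the later discussion of pooling non-significant terms matters in practice.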
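2) To estimate ${}_h\sigma^2_D$, (2.7’’), we first consider the “natural” estimator defined by replacing the unknown parameters by their estimators. Thus we obtain from (2.4), (2.5), by taking (3.1), (3.2) into consideration:
$${}_h\hat\sigma^2_D = \frac{v^2}{{}_hx^4} \sum_{i=1}^{n} x^2_{hi} \sum_{j=1}^{q} (\hat\gamma_j + \hat\delta_j' x_i)^2 = \frac{v^2}{{}_hx^4} \sum_{i=1}^{n} x^2_{hi} \sum_{j=1}^{q} \Big\{ \gamma_j + \delta_j' x_i + \sum_{k=1}^{n} \Big[ b_{jk} + \sum_{s=1}^{p} c_{jsk} x_{si} \Big] E_k \Big\}^2 =$$
$$= \frac{v^2}{{}_hx^4} \sum_{j=1}^{q} \sum_{i=1}^{n} \Big\{ \sum_{k=1}^{n} x_{hi} \Big[ b_{jk} + \sum_{s=1}^{p} c_{jsk} x_{si} \Big] (\eta_k + E_k) \Big\}^2 = \frac{v^2}{{}_hx^4} \sum_{j=1}^{q} Q_j^2 , \qquad (3.5)$$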
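say, h=0,1,…,m; j=1,2,…,q. We want to find the expected value of $Q_j^2$, ∀ j=1,2,…,q.
In order to do so it is convenient to write the expressions in braces that define $Q_j^2$ in matrix form:
$$\mathrm{Diag}(x_{h1}, \dots, x_{hi}, \dots, x_{hn}) \cdot \left\{ \begin{bmatrix} b_{j1} & b_{j2} & \cdots & b_{jn} \\ b_{j1} & b_{j2} & \cdots & b_{jn} \\ \vdots & \vdots & & \vdots \\ b_{j1} & b_{j2} & \cdots & b_{jn} \end{bmatrix} + \begin{bmatrix} x_{11} & x_{21} & \cdots & x_{p1} \\ \vdots & \vdots & & \vdots \\ x_{1i} & x_{2i} & \cdots & x_{pi} \\ \vdots & \vdots & & \vdots \\ x_{1n} & x_{2n} & \cdots & x_{pn} \end{bmatrix} \cdot \begin{bmatrix} c_{j11} & c_{j12} & \cdots & c_{j1k} & \cdots & c_{j1n} \\ c_{j21} & c_{j22} & \cdots & c_{j2k} & \cdots & c_{j2n} \\ \vdots & \vdots & & \vdots & & \vdots \\ c_{jp1} & c_{jp2} & \cdots & c_{jpk} & \cdots & c_{jpn} \end{bmatrix} \right\} (\eta + E) =$$
$$= A \cdot (B_j + X^* C_j)(\eta + E) = G_j (\eta + E) \qquad (3.6)$$
with obvious notations and where the $G_j$ are n×n square matrices, and η, E are the n×1 vectors summarizing the expected values of the $Y_i$ and the errors $E_i$ respectively.
Then we can write:
$$Q_j^2 = (\eta + E)' G_j' G_j (\eta + E), \qquad (3.7)$$
j=1,2,…,q. We can now prove some lemmas that will be useful later on.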
Lemma 3.1. i. If all elements $x_{hi}$, i=1,2,…,n, are different from zero, the rank of $G_j$ in (3.6) is $r_j = p+1$; otherwise $r_j \le p+1$.
ii. The probabilistic structure of any $Q_j^2$ is the following:
$$Q_j^2 = \sigma^2_E \sum_{i=1}^{r_j} {}_j\lambda_i^2 \; {}_jT_i^2 \qquad (3.8)$$
j=1,2,…,q, where the ${}_j\lambda_i^2$ are the $r_j$ non-null eigenvalues of $G_j'G_j$, assumed to be the first $r_j$ ones without loss of generality; the ${}_jT_i^2$ are independent noncentral chi-square random variables, each with one degree of freedom and noncentrality parameter ${}_j\tau_i^2 = (\eta' o_i / \sigma_E)^2$, where $o_i$ is the corresponding column of an n×n orthogonal matrix O such that $O' G_j' G_j O = \mathrm{Diag}({}_j\lambda_1^2, \dots, {}_j\lambda_{r}^2, 0, \dots, 0)$.
Proof. i) Note that, as is well known, the rank of the n×n matrix $X^*C_j$ defined by (3.6) is equal to that of $C_j' X^{*\prime} X^* C_j = C_j' \, \mathrm{Diag}({}_1x^2, {}_2x^2, \dots, {}_px^2) \, C_j$, which in turn is equal to that of $C_j'$, namely p if ${}_jx \ne 0$, j=1,2,…,p. This follows from the orthogonality assumption which, in particular, ensures that the p columns of $C_j'$ are linearly independent. On the other hand $B_j'$ has only one distinct column, which, once more by the orthogonality assumption, is orthogonal to the p-dimensional linear manifold spanned by the columns of $C_j' X^{*\prime}$; thus the mentioned column of $B_j'$ cannot be linearly dependent on the latter ones and it follows that the rank of $B_j' + C_j' X^{*\prime}$, and likewise of $B_j + X^* C_j$, is p+1. When A has all diagonal elements different from zero it has rank n and consequently $G_j$ has rank p+1; in the other cases, by the orthogonality condition, the rank r of A must be at least 2, and the product leading to $G_j$ has at most a rank equal to the smaller of the ranks of the two factors, which cannot exceed p+1.
ii) Since $G_j'G_j$ is a symmetric matrix, it is possible, as is well known, to find an orthogonal matrix $O = (o_1, o_2, \dots, o_n)$ so that:
$$Q_j^2 = \sigma^2_E \, [(\eta + E)/\sigma_E]' \, O \, [O' G_j' G_j O] \, O' \, [(\eta + E)/\sigma_E] = \sigma^2_E \sum_{i=1}^{r_j} {}_j\lambda_i^2 \; {}_jT_i^2 \qquad (3.10)$$
where the ${}_j\lambda_i^2$ are the non-null eigenvalues of $G_j'G_j$, and ${}_jT_i = [(\eta + E)'/\sigma_E] \, o_i$, i=1,2,…,$r_j$, are independent normal random variables with means $(\eta' o_i)/\sigma_E$ and unit variance, since the $o_i$ are columns of an orthogonal matrix and we assumed that the errors $E_1, E_2, \dots, E_n$ are zero mean normal random variables with variance $\sigma^2_E$, independent of each other.
Thus the ${}_jT_i^2$ are noncentral chi-square random variables with one degree of freedom and noncentrality parameter $(\eta' o_i)^2/\sigma^2_E$, and relationship (3.8) is proved.
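The diagonalization used in (3.10) can be checked numerically. The sketch below uses an arbitrary rank-3 matrix in place of $G_j$ and an arbitrary vector in place of η+E (both are assumptions made only for illustration) and verifies that the quadratic form equals the weighted sum of squared rotated coordinates.

```python
import numpy as np

# Illustrative check of (3.10) with stand-in data (not the paper's G_j).
rng = np.random.default_rng(0)
n, r = 6, 3

# A generic n x n matrix of rank r, playing the role of G_j.
G = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))
y = rng.standard_normal(n)            # plays the role of eta + E

# Spectral decomposition of the symmetric matrix G'G: columns of O are the o_i.
lam2, O = np.linalg.eigh(G.T @ G)     # lam2 holds the eigenvalues lambda_i^2
T = O.T @ y                           # rotated coordinates o_i' y

Q_direct = float(y @ G.T @ G @ y)     # quadratic form (3.7)
Q_spectral = float(np.sum(lam2 * T**2))

# Only r eigenvalues are non-null, so only r terms contribute to the sum.
n_nonzero = int(np.sum(lam2 > 1e-10 * lam2.max()))
```

Dividing y by σ_E before rotating gives the unit-variance variables ${}_jT_i$ of the lemma; the algebraic identity is unaffected by that scaling.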
Lemma 3.2. Suppose, for simplicity, that the matrices $G_j$, j=1,2,…,q, have rank r=(p+1). Then all the $Q_j^2$, j=1,2,…,q, are stochastically independent of each other and also independent of $\hat\sigma^2_E$, (3.3), and of $\hat\beta_h$, (2.3), i.e. of the unbiased estimators of $\sigma^2_E$ and $\beta_h$ in the first experiment.
Proof. 1) By assumption the n×n square matrix $G_j = (G_{j1}, G_{j2}, \dots, G_{jn})'$, where $G_{ji}'$, i=1,2,…,n, are the vectors defined by the rows of $G_j$, has rank r=p+1. Thus in $\Re^n$ these vectors define a linear vector space $V_{(r)}$ of dimension r. With reference to point ii. of Lemma 3.1 it is easily seen that any vector of $V_{(r)}$ is orthogonal to any vector of the linear vector space $V_{(n-r)}$ spanned by the (n-r) columns $o_i$, i=r+1,…,n, of the orthogonal matrix O. In fact we have:
$$O' G_j' G_j O = \begin{bmatrix} o_1' \\ \vdots \\ o_i' \\ \vdots \\ o_n' \end{bmatrix} \cdot [\, G_{j1}, \dots, G_{ji}, \dots, G_{jn} \,] \cdot \begin{bmatrix} G_{j1}' \\ \vdots \\ G_{ji}' \\ \vdots \\ G_{jn}' \end{bmatrix} \cdot [\, o_1, o_2, \dots, o_n \,] = \mathrm{Diag}({}_j\lambda_1^2, \dots, {}_j\lambda_r^2, 0, \dots, 0), \qquad (3.11)$$
which implies that the resulting diagonal terms must vanish for i>r, that is:
$$\sum_{k=1}^{n} (G_{jk}' o_{r+1})^2 = \dots = \sum_{k=1}^{n} (G_{jk}' o_i)^2 = \dots = \sum_{k=1}^{n} (G_{jk}' o_n)^2 = 0, \qquad (3.12)$$
and (3.12) ensures that $G_{jk}' o_i = 0$, i=r+1,…,n; k=1,2,…,n, that is, the vectors $o_i$, i=r+1,…,n, and any linear combination of them, are orthogonal to the vectors $G_{jk}$ and to any linear combination of the latter, which belongs to $V_{(r)}$ by definition. Now, as is well known, the other columns $o_i$, i=1,2,…,r, of matrix O span the orthocomplement of $V_{(n-r)}$, say $V^*_{(r)}$, which is the r-dimensional vector space of $\Re^n$ that includes all vectors orthogonal to $V_{(n-r)}$. It follows that $V_{(r)} \subset V^*_{(r)}$, but, since the dimension of $V_{(r)}$ is also r, this implies that any basis for $V_{(r)}$ is also a basis for $V^*_{(r)}$, i.e. $V_{(r)} \supset V^*_{(r)}$ and, thus, $V_{(r)} = V^*_{(r)}$.
2) Now consider the vectors of $\Re^n$ corresponding to two generic rows of two different matrices $G_j$, $G_{j'}$, j≠j′, say $G_{ji}'$, $G_{j't}'$, i, t=1,2,…,n. According to (3.5) their scalar product is:
$$G_{ji}' G_{j't} = \sum_{k=1}^{n} x_{hi} x_{ht} \Big( b_{jk} + \sum_{s=1}^{p} c_{jsk} x_{si} \Big) \Big( b_{j'k} + \sum_{s=1}^{p} c_{j'sk} x_{st} \Big) = 0,$$
for a given pair i, t, since by virtue of the orthogonality assumption:
$$\sum_{k=1}^{n} b_{jk} b_{j'k} = \sum_{k=1}^{n} b_{jk} c_{j'sk} = \sum_{k=1}^{n} c_{jsk} b_{j'k} = \sum_{k=1}^{n} c_{jsk} c_{j'sk} = 0$$
∀s, j≠j′. It follows that the vector spaces spanned by $G_{1i}', G_{2i}', \dots, G_{qi}'$, i=1,2,…,n, say ${}_jV_{(r)}$, j=1,2,…,q, are all mutually orthogonal. Let us denote by ${}_jo_i'$, i=1,2,…,r, the set of rows of the orthogonal matrix ${}_jO'$ (the matrix that, following (3.11), puts $Q_j^2$ in diagonal form) which correspond to the non-null eigenvalues of $G_j'G_j$ and represent a basis for ${}_jV_{(r)}$.
3) From the orthogonality assumption and with reference to (2.2) and (2.3) it also ensues that:
$$x_{hi} \sum_{k=1}^{n} a_{hk} \Big( b_{jk} + \sum_{s=1}^{p} c_{jsk} x_{si} \Big) = 0$$
h=0,1,…,m; j=1,2,…,q; i=1,2,…,n, which implies that also the vectors of coefficients $a_{hk} = x_{hk}/{}_hx^2$, k=1,2,…,n, whose linear combinations define the random components of the estimators $\hat\beta_h$, are orthogonal to the vector spaces ${}_jV_{(r)}$, j=1,2,…,q, besides being orthogonal to each other by the orthogonality assumption. Let us denote by $\tilde o_h'$, h=0,1,…,m, the corresponding row vectors obtained through the normalization $\tilde o_{hk} = a_{hk}/{}_ha = x_{hk}/{}_hx$, with ${}_ha = \big( \sum_{k=1}^{n} a^2_{hk} \big)^{1/2}$.
4) Further note that the (p+1) n×1 vectors of $\Re^n$ whose components are $b_{jk}$, $c_{jsk}$, k=1,2,…,n; s=1,2,…,p, say $b_j$, $c_{js}$, are orthogonal to each other by the orthogonality assumption, and thus they are linearly independent; in particular each set $(b_j, c_{js})$, s=1,2,…,p, for a given j, spans a vector space of dimension (p+1) of $\Re^n$. When the rank of $G_j$ is (p+1), which requires, in particular, that $x_{hi} \ne 0$, i=1,2,…,n, it is easily seen that they too form a basis for ${}_jV_{(r)}$. In fact, as shown by (3.5), (3.6), we have $G_{ji} = x_{hi} \big( b_j + \sum_{s=1}^{p} c_{js} x_{si} \big)$, i=1,2,…,n, that is, for each j the vectors $G_{ji}$ are linear combinations of the vectors $b_j$, $c_{js}$ and, by the hypothesis on the rank of $G_j$, there are (p+1) linearly independent vectors $G_{ji}$, which, as we saw, span ${}_jV_{(r)} = {}_jV_{(p+1)}$ in this case. It follows that also the (p+1) vectors $b_j$, $c_{js}$, s=1,2,…,p, form a basis for the vector space ${}_jV_{(p+1)}$, thus belonging to it, and this holds for j=1,2,…,q.
5) Now consider the following n×n orthogonal matrix:
$$\tilde O' = \begin{bmatrix} \tilde o_0' \\ \vdots \\ \tilde o_m' \\ {}_1o_1' \\ \vdots \\ {}_1o_{p+1}' \\ \vdots \\ {}_qo_1' \\ \vdots \\ {}_qo_{p+1}' \\ O_*' \end{bmatrix} = \begin{bmatrix} \tilde O_1' \\ O_*' \end{bmatrix} \qquad (3.13)$$
where the first m+1+q(p+1) = 1+p(p+1)/2+q(p+1) rows are the orthogonal vectors defined above and their number equals that of the parameters in model (2.1); $O_*'$ is a further set of ν = n-[m+1+q(p+1)] rows which are orthogonal to each other and to the other ones. We note that in (3.10) we can assume $O' = \tilde O'$, (3.13), to put $Q_j^2$ in diagonal form, whichever j is, because the orthogonal rows ${}_jo_i'$, i=p+2,…,n, which are the eigenvectors of matrix $G_j'G_j$ corresponding to the null eigenvalues, can be arbitrarily chosen, provided that they are orthogonal to ${}_jo_i'$, i=1,2,…,p+1, and the change of order only induces a corresponding displacement of the non-null diagonal elements of $O' G_j' G_j O$ (see e.g. Zurmühl (1961), p.167-168, 190). Thus we can conclude that: 1) $\tilde O'(\eta + E)$ still is a set of n independent normal random variables; 2) correspondingly the quadratic forms $Q_j^2$, j=1,2,…,q, are stochastically independent, since they are functions of q independent sets of independent normal variables; 3) they are also independent of any of the quantities ${}_hx\hat\beta_h$, h=0,1,…,m, (2.3), obtained from the first (m+1) rows of matrix $\tilde O'$, since these quantities are normal variables independent of each other and also of any of the sets of normal variables used in the definitions of the quantities $Q_j^2$, j=1,2,…,q, which implies the same property for the $\hat\beta_h$; 4) finally the rows of matrix $\tilde O_1'$ span the vector space to which the mean values $\eta_i$, i=1,2,…,n, are constrained by the linear model (2.1) (recall the remark on the vectors $b_j$, $c_{js}$), so that $O_*'\eta = 0$. This implies that (3.3) is equivalent to:
$$\hat\sigma^2_E = E' O_* O_*' E / \nu$$
which is distributed (as a $(\sigma^2_E/\nu)\chi^2_\nu$ random variable) independently of $Q_j^2$, j=1,2,…,q, and of $\hat\beta_h$, h=0,1,2,…,m. Thus the statements of the lemma are proved.
Lemma 3.3. If $|x_{hi}| = 1$, i=1,2,…,n, $\sum_{i=1}^{n} b^2_{ji} = 1/{}_j\xi^2 = 1/n$, $\sum_{i=1}^{n} c^2_{jsi} = 1/{}_{j_s}\xi^2 = 1/{}_sx^2$, $j_s = q+(j-1)p+s$, s=1,2,…,p, j∈{1,2,…,q}, see (2.4), (2.5), the matrix $G_j'G_j$ is idempotent of rank (p+1) and $Q_j^2/\sigma^2_E$ has a noncentral chi-square distribution with (p+1) degrees of freedom and noncentrality parameter:
$$\tau_j^2 = \sum_{i=1}^{n} (\gamma_j + \delta_j' x_i)^2 / \sigma^2_E . \qquad (3.9)$$
Proof. By using the notations defined through (3.6) we obtain:
$$G_j' G_j = (B_j' + C_j' X^{*\prime}) A'A (B_j + X^* C_j) = B_j' B_j + B_j' X^* C_j + C_j' X^{*\prime} B_j + C_j' X^{*\prime} X^* C_j = B_j' B_j + C_j' X^{*\prime} X^* C_j , \qquad (3.14)$$
since the diagonal matrix $A'A$ has unit elements, $x^2_{hi} = 1$, i=1,2,…,n, thus coinciding with the n×n identity matrix, and all elements of matrix $B_j' X^*$, n×p, are zero, because, by (3.6), any row of $B_j'$ has constant elements and is multiplied by a column ${}_hx$, h=1,2,…,p, of $X^*$ which is orthogonal to the vector with unit elements ${}_0x$; obviously the same holds for the transpose $X^{*\prime} B_j$. From (3.14) it ensues that:
$$G_j' G_j \cdot G_j' G_j = (B_j' B_j + C_j' X^{*\prime} X^* C_j) \cdot (B_j' B_j + C_j' X^{*\prime} X^* C_j) = B_j' (B_j B_j') B_j + C_j' X^{*\prime} X^* (C_j C_j')(X^{*\prime} X^*) C_j , \qquad (3.15)$$
because all elements of matrix $B_j C_j'$, n×p, are zero, as the rows of $B_j$ are all “orthogonal” to the columns of $C_j'$, owing to the orthogonality condition, and obviously the same holds for the transpose $C_j B_j'$.
Now by the definition of $B_j$ given in (3.6) and recalling (2.4) we have:
$$(B_j B_j') B_j = \begin{bmatrix} b_{j1} & \cdots & b_{jn} \\ \vdots & & \vdots \\ b_{j1} & \cdots & b_{jn} \end{bmatrix} \begin{bmatrix} b_{j1} & \cdots & b_{j1} \\ \vdots & & \vdots \\ b_{jn} & \cdots & b_{jn} \end{bmatrix} B_j = (1/{}_j\xi^2) \, U^*_n B_j = (n/{}_j\xi^2) \, B_j = B_j \qquad (3.16)$$
if $U^*_n$ denotes an n×n matrix with all elements equal to unity and we take the hypothesis ${}_j\xi^2 = n$ into consideration. Furthermore by the definitions of $C_j$ and $X^*$ given in (3.6) and the orthogonality condition, it follows that:
$$(C_j C_j')(X^{*\prime} X^*) = \mathrm{Diag}(1/{}_{j_1}\xi^2, \dots, 1/{}_{j_p}\xi^2) \cdot \mathrm{Diag}({}_1x^2, \dots, {}_px^2) = I_p , \qquad (3.17)$$
if $I_p$ denotes the p×p identity matrix and the hypothesis ${}_{j_s}\xi^2 = {}_sx^2$ is taken into account.
By inserting (3.16), (3.17) into (3.15) and in view of (3.14) the desired result:
$$(G_j' G_j)^2 = G_j' G_j \cdot G_j' G_j = G_j' G_j$$
is thus obtained, that is, $G_j'G_j$ is an idempotent matrix of rank (p+1), since $|x_{hi}| = 1 \ne 0$ (see the remarks at the end of point i. of the proof of Lemma 3.1).
Finally, as $Y_i = \eta_i + E_i$, i=1,2,…,n, are assumed to be independent normal random variables with constant variance $\sigma^2_E$ and mean $\eta_i$, a well-known result, see e.g. Graybill (1961), p. 83, ensures that $Q_j^2/\sigma^2_E$ has a noncentral chi-square distribution with (p+1) degrees of freedom and noncentrality parameter:
$$\tau_j^2 = (\eta' G_j' G_j \eta)/\sigma^2_E = \sum_{i=1}^{n} x^2_{hi} (\gamma_j + \delta_j' x_i)^2 / \sigma^2_E , \qquad (3.18)$$
if we recall (3.1), (3.2) and the second and third relationships in (3.5). Since by hypothesis $x^2_{hi} = 1$, ∀ i, (3.9) follows from (3.18), and the proof of the lemma is complete.
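Lemma 3.3 and the orthogonality used in Lemma 3.2 can be checked numerically on a 2^4 full factorial with ±1 coding. The sketch below assumes that the weights in (2.4), (2.5), which are not reproduced in this section, are the usual least squares weights for orthogonal columns, $b_{jk} = \xi_{jk}/\sum_k \xi_{jk}^2$ and $c_{jsk} = \xi_{jk}x_{sk}/\sum_k (\xi_{jk}x_{sk})^2$; the function `build_G` and the variable names are hypothetical.

```python
import itertools
import numpy as np

# 2^4 full factorial, +/-1 coding: two controllable factors, two noise factors.
design = np.array(list(itertools.product([-1.0, 1.0], repeat=4)))
n = design.shape[0]                  # 16 runs
X_star = design[:, :2]               # n x p matrix of controllable factors
noise = design[:, 2:]                # n x q matrix of noise-factor levels
p = X_star.shape[1]

def build_G(xi, x_h):
    """G_j = A (B_j + X* C_j), (3.6), under the assumed least squares weights."""
    b = xi / np.sum(xi**2)                        # weights defining gamma_hat_j
    B = np.tile(b, (n, 1))                        # n identical rows (b_j1 ... b_jn)
    inter = X_star * xi[:, None]                  # columns xi_j * x_s
    C = (inter / np.sum(inter**2, axis=0)).T      # p x n weights of delta_hat_js
    A = np.diag(x_h)
    return A @ (B + X_star @ C)

x_h = X_star[:, 0]                   # a column with |x_hi| = 1 for every run
G1 = build_G(noise[:, 0], x_h)
G2 = build_G(noise[:, 1], x_h)
M1 = G1.T @ G1

is_idempotent = np.allclose(M1 @ M1, M1)          # Lemma 3.3
trace_M1 = float(np.trace(M1))                    # equals the rank p + 1
rows_orthogonal = np.allclose(G1 @ G2.T, 0.0)     # step 2) of Lemma 3.2
```

Under these assumptions $G_1'G_1$ is the orthogonal projector onto the span of $\xi_1$, $\xi_1 x_1$, $\xi_1 x_2$, which is why its trace equals p+1 = 3.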
Now we can obtain a first main result.
Proposition 3.1. i. With reference to relationships (3.5), (3.6), (2.7’), (2.7’’) and the related hypotheses, the expected value of ${}_h\hat\sigma^2_D$ has the following form:
$$E({}_h\hat\sigma^2_D) = (v^2/{}_hx^4) \sum_{j=1}^{q} E(Q_j^2) = (v^2/{}_hx^4) \sum_{j=1}^{q} \Big[ \sum_{i=1}^{n} x^2_{hi} (\gamma_j + \delta_j' x_i)^2 + \sigma^2_E \, \mathrm{tr}(G_j' G_j) \Big] =$$
$$= {}_h\sigma^2_D + {}_h\sigma^2_{E*} \, (v^2/{}_hx^2) \Big[ \sum_{j=1}^{q} \mathrm{tr}(G_j' G_j) \Big] , \qquad (3.19)$$
where tr(⋅) denotes the trace of a matrix (sum of the diagonal elements).
ii. It follows that an unbiased estimator, ${}_h\hat\sigma^2_T$, say, of $Var(\hat{\hat\beta}_h) = {}_h\sigma^2_{E*} + {}_h\sigma^2_D$, (2.9), which is the variance of the ordinary least squares estimator $\hat{\hat\beta}_h$, (2.7), of the fixed effect parameter $\beta_h$ in the second experiment, is given by:
$${}_h\hat\sigma^2_T = {}_h\hat\sigma^2_{E*} \Big[ 1 - (v^2/{}_hx^2) \sum_{j=1}^{q} \mathrm{tr}(G_j' G_j) \Big] + {}_h\hat\sigma^2_D =$$
$$= \frac{\hat\sigma^2_E}{{}_hx^2} \cdot \Big[ 1 - (v^2/{}_hx^2) \sum_{j=1}^{q} \mathrm{tr}(G_j' G_j) \Big] + \frac{v^2}{{}_hx^4} \sum_{j=1}^{q} \sum_{i=1}^{n} x^2_{hi} (\hat\gamma_j + \hat\delta_j' x_i)^2 \qquad (3.20)$$
where relationships (3.3), (3.4), (3.5) are taken into account and we can suppose that $[1 - (v^2/{}_hx^2) \sum_{j=1}^{q} \mathrm{tr}(G_j' G_j)] \ge 0$, since $v^2$ is substantially arbitrary, see Remark 3.1 below, so that the estimator is certainly positive, i.e. admissible. Furthermore, assuming for simplicity $r_j = p+1$ ∀j, that is, that the rank of any matrix $G_j'G_j$ is maximum, Lemma 3.2 ensures that the estimator ${}_h\hat\sigma^2_T$ is stochastically independent of $\hat{\hat\beta}_h = \hat\beta_h + \hat\beta^*_h$, (2.8), even if we assume that $\hat\beta_h$ is calculated by means of the results of the first experiment through (2.3).
Proof. i. From (3.5), (3.10) it follows that:
$$E({}_h\hat\sigma^2_D) = \frac{v^2}{{}_hx^4} \sum_{j=1}^{q} E(Q_j^2) = k^2 \sigma^2_E \sum_{j=1}^{q} \sum_{i=1}^{r_j} {}_j\lambda_i^2 \, E({}_jT_i^2) = k^2 \sigma^2_E \sum_{j=1}^{q} \sum_{i=1}^{r_j} {}_j\lambda_i^2 \, [1 + (\eta' {}_jo_i/\sigma_E)^2] =$$
$$= k^2 \sigma^2_E \Big\{ \sum_{j=1}^{q} \Big[ \sum_{i=1}^{r_j} {}_j\lambda_i^2 \, (\eta' {}_jo_i/\sigma_E)^2 \Big] + \sum_{j=1}^{q} \mathrm{tr}(G_j' G_j) \Big\} = k^2 \Big[ \sigma^2_E \sum_{j=1}^{q} (\eta' G_j' G_j \eta)/\sigma^2_E + \sigma^2_E \sum_{j=1}^{q} \mathrm{tr}(G_j' G_j) \Big] , \qquad (3.21)$$
where $k^2 = v^2/{}_hx^4$; the notation ${}_jo_i$ is used, cfr. Lemma 3.2, 2), to indicate the eigenvector of matrix $G_j'G_j$ pertaining to ${}_j\lambda_i^2$; the fourth expression in (3.21) is obtained by recalling that the expected value of a noncentral chi-square variate with ν degrees of freedom and noncentrality parameter $\tau_j^2$ is $\nu + \tau_j^2$ and that in our case ν=1, $\tau_j^2 = (\eta' {}_jo_i/\sigma_E)^2$; the fifth expression in (3.21) takes into account that the trace of a matrix equals the sum of its eigenvalues; in the last expression the original coordinate system is restored, see (3.10), which, for E=0, leads to the equivalent form used in (3.21).
On the other hand:
$$\sum_{j=1}^{q} (\eta' G_j' G_j \eta) = \sum_{j=1}^{q} \sum_{i=1}^{n} x^2_{hi} (\gamma_j + \delta_j' x_i)^2 \qquad (3.22)$$
according to (3.1), (3.2), see also (3.5). Combining (3.21) with (3.22) we obtain (3.19) as desired.
ii. Consider the expectation $E({}_h\hat\sigma^2_T)$. Remembering that $E({}_h\hat\sigma^2_{E*}) = \sigma^2_E/{}_hx^2$ and that $E({}_h\hat\sigma^2_D)$ has the expression given in (3.19), we see that the bias of ${}_h\hat\sigma^2_D$ and the term $\sigma^2_E (v^2/{}_hx^4) \sum_{j=1}^{q} \mathrm{tr}(G_j' G_j)$ cancel out so that:
$$E({}_h\hat\sigma^2_T) = {}_h\sigma^2_{E*} + {}_h\sigma^2_D = Var(\hat{\hat\beta}_h)$$
as required.
With reference to (2.7), (2.8), we see that $\hat\beta_h$, even if calculated by means of the results of the first experiment (the $E_i$ replace the $\tilde E_i$, i=1,2,…,n), is stochastically independent of ${}_h\hat\sigma^2_T$, since by Lemma 3.2 the latter quantity is a function of random variables independent of $\hat\beta_h$. On the other hand $\hat\beta^*_h$ is a function of the noise variables $Z_i$, which are assumed to be independent of the errors $\tilde E_i$, $E_i$, i=1,2,…,n; thus $\hat\beta^*_h$ is independent of ${}_h\hat\sigma^2_T$ (and of $\hat\beta_h$, whichever is the reference experiment). It follows obviously that the sum $\hat{\hat\beta}_h = \hat\beta_h + \hat\beta^*_h$ remains stochastically independent of ${}_h\hat\sigma^2_T$. This completes the proof of the proposition.
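The estimator (3.20) is a direct plug-in formula, and a small sketch may make its ingredients explicit. The function name and argument layout below are hypothetical; the inputs are the quantities named in the text (the residual variance, the conventional variance v², the column $x_h$, the traces of $G_j'G_j$ and the estimated noise-factor coefficients).

```python
import numpy as np

def sigma2_T_hat(sigma2_E_hat, v2, x_h, traces, gamma_hat, delta_hat, X_star):
    """Plug-in evaluation of (3.20) (hypothetical signature).

    x_h       : length-n column of the design matrix for effect h
    traces    : length-q sequence of tr(G_j' G_j)
    gamma_hat : length-q vector of estimates gamma_hat_j
    delta_hat : q x p matrix of estimates delta_hat_js
    X_star    : n x p matrix of controllable-factor levels
    """
    hx2 = float(np.sum(x_h**2))
    # Bias-correcting factor applied to the error-variance term.
    correction = 1.0 - (v2 / hx2) * float(np.sum(traces))
    # lin[i, j] = gamma_hat_j + delta_hat_j' x_i
    lin = gamma_hat[None, :] + X_star @ delta_hat.T
    sigma2_D_hat = (v2 / hx2**2) * float(np.sum((x_h**2)[:, None] * lin**2))
    return (sigma2_E_hat / hx2) * correction + sigma2_D_hat

# Toy inputs (made up for illustration only): n = 16, p = q = 2, |x_hi| = 1.
n = 16
value = sigma2_T_hat(
    sigma2_E_hat=2.0,
    v2=16.0 / 12.0,
    x_h=np.ones(n),
    traces=[3.0, 3.0],
    gamma_hat=np.array([1.0, 0.0]),
    delta_hat=np.zeros((2, 2)),
    X_star=np.ones((n, 2)),
)
```

With these toy inputs the correction factor is 1/2 and the noise term is 1/12, so the estimate is 0.0625 + 1/12; the numbers have no empirical meaning and only exercise the formula.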
The following corollary to Proposition 3.1 shows that, in particular, by an appropriate choice of the scales for the noise factors, the estimator ${}_h\hat\sigma^2_T$ can turn out to be proportional to a noncentral chi-square variable. We shall use the notations $\chi^2_\nu$, $\chi^2_\nu(\tau^2)$ to indicate chi-square random variables with ν degrees of freedom, respectively central and noncentral with noncentrality parameter $\tau^2$.
Corollary 3.1. Under the hypotheses of Lemma 3.3, which in particular ensure that $Q_j^2/\sigma^2_E$ is a $\chi^2_{p+1}(\tau_j^2)$ variate, suppose that the scales of the noise factors $\xi_1, \xi_2, \dots, \xi_q$ are chosen in such a way that the constant variance $v^2$ of the noise variables $Z_j$, ∀j, assured by the scale assumption, has the value:
$$v^{*2} = n/(n-m-1) \qquad (3.23)$$
where n and m+1 are the number of observations and the number of parameters $\beta_h$ pertaining to the fixed effect factors, respectively. Then we have:
$${}_h\hat\sigma^2_T = (\sigma^2_E/n) \, [\chi^2_{\nu^*}(\tau^2)]/\nu^* \qquad (3.24)$$
that is, the unbiased estimator ${}_h\hat\sigma^2_T$, (3.20), is proportional to a noncentral chi-square variate with ν*=(n-m-1) degrees of freedom and noncentrality parameter $\tau^2$ defined by:
$$\tau^2 = \sum_{j=1}^{q} \sum_{i=1}^{n} (\gamma_j + \delta_j' x_i)^2 / \sigma^2_E . \qquad (3.25)$$
Proof. By Lemmas 3.2 and 3.3, since, in particular, ${}_hx^2 = \sum_{i=1}^{n} x^2_{hi} = n$ and, as is well known, the trace of an idempotent matrix of rank (p+1), as are the matrices $G_j'G_j$, is p+1, we can give (3.20) the following form:
$${}_h\hat\sigma^2_T = \frac{\sigma^2_E}{n\nu} \Big[ 1 - \frac{v^2}{n} q(p+1) \Big] \chi^2_\nu + \frac{v^2 \sigma^2_E}{n^2} \, [\chi^2_{q(p+1)}(\tau^2)], \qquad (3.26)$$
where, because of (3.3), ν=n-[1+p(p+1)/2+q(p+1)]=n-[m+1+q(p+1)] and the second term on the right side of (3.26) follows from the fact that independent noncentral chi-square variates, as are the $Q_j^2/\sigma^2_E$ by Lemma 3.2, also have a distribution that is reproductive under convolution. This implies that their sum leads to a similar variate with degrees of freedom and noncentrality parameter given by the corresponding sums, so that in our case the values are q(p+1) and $\tau^2$, (3.25), respectively.
By (3.23) we obtain:
$$\Big[ 1 - \frac{v^{*2}}{n} q(p+1) \Big] = \Big[ 1 - \frac{n}{n(n-m-1)} q(p+1) \Big] = \frac{n-m-1-q(p+1)}{n-m-1} = \frac{\nu}{n-m-1} > 0 ; \qquad \frac{v^{*2}}{n^2} = \frac{n}{n^2(n-m-1)} = \frac{1}{n(n-m-1)} , \qquad (3.26’)$$
and (3.26) becomes
$${}_h\hat\sigma^2_T = (\sigma^2_E/n) \, [\chi^2_\nu + \chi^2_{q(p+1)}(\tau^2)]/(n-m-1) = (\sigma^2_E/n) \, [\chi^2_{\nu^*}(\tau^2)]/\nu^*$$
using once more the property that the chi-square distribution is reproductive under convolution and thus ν* = n-[m+1+q(p+1)] + q(p+1) = n-m-1, while $\tau^2$, (3.25), remains unchanged, since the first variable in the sum is a central chi-square. Thus (3.24) holds and the corollary is proved.
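The arithmetic in (3.26’) can be verified exactly with rational numbers. The helper below is a hypothetical sketch; it assumes m+1 = 1+p(p+1)/2 as in the text and uses the example values n=16, p=2, q=2.

```python
from fractions import Fraction

def corollary_constants(n, p, q):
    """Exact evaluation of the two constants in (3.26') (hypothetical helper)."""
    m_plus_1 = 1 + p * (p + 1) // 2          # number of fixed-effect parameters
    nu_star = n - m_plus_1                   # nu* = n - m - 1
    nu = nu_star - q * (p + 1)               # residual degrees of freedom, (3.3)
    v_star2 = Fraction(n, nu_star)           # (3.23)
    weight_error = 1 - v_star2 * q * (p + 1) / n   # multiplies the chi2_nu term
    weight_noise = v_star2 / n**2                  # multiplies the noncentral term
    return nu, nu_star, v_star2, weight_error, weight_noise

nu, nu_star, v_star2, w_err, w_noise = corollary_constants(n=16, p=2, q=2)
```

For these values the first weight reduces to ν/ν* = 6/12 and the second to 1/(n ν*) = 1/192, so both chi-square terms in (3.26) end up with the common factor $\sigma^2_E/(n\nu^*)$.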
Remark 3.1. The further constraints on the scales of the noise factors introduced in Lemma 3.3 and Corollary 3.1 deserve some comments. In general the noise factors correspond to certain variables with values $\xi^*_j$, j=1,2,…,q, expressed on “physical scales” according to appropriate measurement units, like centimeter, degree, etc. The scale assumption of § 2, b) leads us to introduce for each noise factor a conventional unit (indicated as $u^{conv}_j$ later on) defined through the relationship:
$$v \times u^{conv}_j = R_j/3, \qquad (3.27)$$
j=1,2,…,q, where v is a positive constant, which represents the approximate conventional standard deviation common to all noise factors when their values happen at random as values of the variables $Z_j$; $R_j$ is the half range of the expected differences $(\xi^*_j - \xi^*_{j0})$ from a central value $\xi^*_{j0}$. It follows from (3.27) that
$$u^{conv}_j = R_j/3v, \qquad (3.28)$$
j=1,2,…,q, express the conventional units through the original physical ones. In the first experiment the conventional levels of the noise factors are defined as:
$$\xi_{ji} = \frac{\xi^*_{ji} - \xi^*_{j0}}{R_j/3v} \qquad (3.29)$$
j=1,2,…,q; i=1,2,…,n, with $\xi^*_{j0}$ coinciding with the average of the $\xi^*_{ji}$ values, i=1,2,…,n, because of the orthogonality condition. Note that the range of the possible conventional values $\xi_{ji}$ is {-3v, 3v}, which corresponds to {-$R_j$, $R_j$}, j=1,2,…,q, in the original units by (3.27). Now consider the constraints ${}_j\xi^2 = n$, j=1,2,…,q, imposed by Lemma 3.3, which are tantamount to requiring that:
$$\Big( \sum_{i=1}^{n} \xi^2_{ji} / n \Big)^{1/2} = 1 \qquad (3.30)$$
j=1,2,…,q, that is, that the standard deviation of the conventional levels of the noise factors is equal to 1. This only apparently puts a limit on the spread of the real factor levels if the choice of v remains arbitrary. In fact a change in v, from v to v*, say, only represents a different definition of the conventional units according to (3.28) (and correspondingly the values of the parameters $\gamma_j$, $\delta_{js}$ in model (2.6) will be multiplied by (v/v*)), so that the model is actually unchanged. Thus if, for example, we choose v=1/3, the conventional ranges become {-1,1} and the constraints (3.30) allow one to choose the real levels of the noise factors in such a way that they cover their physical ranges. The situation is different if we have to fix the value of v as required by Corollary 3.1, which imposes the choice $v = \sqrt{n/(n-m-1)}$, see (3.23). This value is larger than 1, which ensures that the constraints (3.30) are compatible with the allowed range {-3v, 3v}. However in this case they impose a true restriction on the spread of the real levels of the noise factors, since, in consequence of conditions (3.30) and of (3.29), their standard deviations in the original units must be $(1) \cdot \sqrt{(n-m-1)/n} \, (R_j/3)$, that is, smaller than those, equal to $R_j/3$ in the same units, assigned through (3.27) to the corresponding random variables $Z_j$. This is not a serious drawback if the model is exactly linear in the variables $\xi^*_j$. In the example presented in the paper n=16, m+1=4 and we get $0.289 R_j$.
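The figure $0.289 R_j$ quoted at the end of the remark follows from (3.23) and (3.28). A minimal sketch, with hypothetical function names, reproduces it:

```python
import math

def conventional_v(n, m_plus_1):
    """v = sqrt(n / (n - m - 1)), the value imposed by Corollary 3.1."""
    return math.sqrt(n / (n - m_plus_1))

def sd_fraction_of_half_range(n, m_plus_1):
    """Standard deviation of the design levels in physical units, as a
    fraction of the half range R_j: one conventional unit is R_j / (3 v)."""
    return 1.0 / (3.0 * conventional_v(n, m_plus_1))

v = conventional_v(16, 4)                        # about 1.155
fraction = sd_fraction_of_half_range(16, 4)      # about 0.289
```

The same quantity without the constraint of Corollary 3.1 would be 1/3 ≈ 0.333, so the restriction shrinks the spread of the design levels by a factor of about 0.87 in this example.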
4. Testing whether the effects of the systematic factors may be detected when the
noise factors variability is taken into account.
We shall discuss two proposals.
a) The case when Corollary 3.1 holds. Consider the statistic:
$$\Big( \frac{\hat\beta_h}{{}_h\hat\sigma_T} \Big)^2 \cong \frac{(n-m-1) \, \chi^2_1(n\beta_h^2/\sigma^2_E)}{\chi^2_{n-m-1}(\tau^2)} = F_{1,\nu^*}\Big( \frac{n\beta_h^2}{\sigma^2_E}, \tau^2 \Big), \qquad (4.1)$$
ν*=n-m-1, with regard to (2.3), in particular to the implied assumption $x_{hi}=1$, i=1,2,…,n, and to (3.23), (3.24) with the corresponding notations, also remembering that $\hat\beta_h$ is a normal variate with mean $\beta_h$ and variance $\sigma^2_E/n$.
Since, according to Lemma 3.2, $\hat\beta_h^2$ and ${}_h\hat\sigma^2_T$ are independent random variables, the variable (4.1) follows a so-called doubly noncentral F-distribution with 1, ν* degrees of freedom and noncentrality parameters $\tilde\beta_h^2 = n\beta_h^2/\sigma^2_E$, say, and $\tau^2$, (3.25). For a review of the subject see Johnson and Kotz (1992), Chp. 30, § 7, p.499; more specifically in the following, for the numerical applications, we shall have recourse to the software DATAPLOT, and in particular to the commands DNFCDF, DNFPPF written on the basis of the paper of Revee (1986), who refers to the series representation given in Bulgren (1971).
We note that under the assumptions of Corollary 3.1 the variance of $\hat{\hat\beta}_h$, (2.9), related to the second experiment is:
$$Var(\hat{\hat\beta}_h) = (\sigma^2_E/n\nu^*)(\nu^* + \tau^2) = (\sigma^2_E/n)(1 + \tau^2/\nu^*) = (\sigma^2_E/n)\Big( 1 + \frac{n \, {}_h\sigma^2_D}{\sigma^2_E} \Big) = (\sigma^2_E/n)(1 + \zeta^2), \qquad (4.2)$$
say; this follows directly by recalling the definitions (2.7’), (2.7’’), (2.9) and expressing ${}_h\sigma^2_T$ as the expected value of the unbiased estimate ${}_h\hat\sigma^2_T$ obtained from (3.26). We are interested in assessing whether the effect $\beta_h$ can be distinguished from random fluctuations when their variance is the real one, which is present in the second experiment and is given by (4.2). Thus, for instance, we are led to test a hypothesis like the following:
$$H_0 = \left\{ \begin{array}{l} \dfrac{|\beta_h|}{\sqrt{\sigma^2_E/n + {}_h\sigma^2_D}} = \dfrac{\sqrt{n} \, |\beta_h/\sigma_E|}{\sqrt{1+\zeta^2}} \ge c_0 \\ |\beta_h|/\sigma_E \ge |\beta_{h0}|/\sigma_E \\ 0 \le \zeta^2 \le \zeta_0^2 \end{array} \right. \qquad (4.3)$$
where $c_0$ is a chosen positive “threshold value of distinguishability”, for instance $c_0$ = 1.5 or 2; $\zeta_0^2 = {}_h\sigma^2_D / (\sigma^2_E/n)$ is a chosen variance increment for the estimator of the effect $\beta_h$ when the noise factors happen at random; $\beta_{h0}/\sigma_E = \sqrt{1+\zeta_0^2} \; c_0/\sqrt{n}$ is the standardized threshold value of a distinguishable factor effect when $\zeta^2 = \zeta_0^2$, see Fig. 2.
Fig. 2 Graphic illustration of the hypothesis H0 in the parameter space: dashed strip. The
region above the thin line corresponds to a subset of the parameter space where the
probability of accepting H0 is larger than 1-α=0.90; it follows that, if we assume the
Neyman-Pearson approach, the test is biased in the regions A, B. The bold line is the contour
of the region defined by considering in (4.3) the first condition only, for c0=1 and n=16.
For testing the hypothesis (4.3) we propose a test of significance which accepts $H_0$ if:
$$\Big( \frac{\hat\beta_h}{{}_h\hat\sigma_T} \Big)^2 \ge F_{1,\nu^*,1-\alpha}(\tilde c_0^2, \nu^*\zeta_0^2), \qquad (4.4)$$
and otherwise rejects it, where $F_{1,\nu^*,1-\alpha}(\cdot, \cdot)$ is the value which is exceeded with probability (1-α) by the doubly noncentral F-variate with 1, ν*=n-m-1 degrees of freedom and values of the noncentrality parameters defined as $\tilde\beta_h^2 = n(\beta_{h0}/\sigma_E)^2 = (1+\zeta_0^2)c_0^2 = \tilde c_0^2$ and $\tau^2 = \nu^* \, {}_h\sigma^2_D/(\sigma^2_E/n) = \nu^*\zeta_0^2$ respectively, see (4.3). The test (4.4) is justified by the following proposition.
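The critical value in (4.4) is the lower α-quantile of a doubly noncentral F-variate. The paper obtains it with DATAPLOT; the sketch below is only a Monte Carlo stand-in (an assumption of this illustration, not the authors' method) that simulates the ratio in (4.1) directly. The function name, the number of draws and the seed are arbitrary choices.

```python
import numpy as np

def dnf_lower_quantile(alpha, nu_star, c0, zeta0_sq, n_draws=400_000, seed=12345):
    """Monte Carlo estimate of the value exceeded with probability 1 - alpha
    by a doubly noncentral F(1, nu*) variate with the noncentralities of (4.4)."""
    rng = np.random.default_rng(seed)
    c_tilde_sq = (1.0 + zeta0_sq) * c0**2      # numerator noncentrality
    tau_sq = nu_star * zeta0_sq                # denominator noncentrality
    num = rng.noncentral_chisquare(1, c_tilde_sq, size=n_draws)
    den = rng.noncentral_chisquare(nu_star, tau_sq, size=n_draws) / nu_star
    return float(np.quantile(num / den, alpha))

# Example setting: n = 16, nu* = 12, c0 = 1 and a variance ratio of 0.25,
# so that zeta0^2 = n * 0.25 = 4.
crit_05 = dnf_lower_quantile(alpha=0.05, nu_star=12, c0=1.0, zeta0_sq=16 * 0.25)
crit_10 = dnf_lower_quantile(alpha=0.10, nu_star=12, c0=1.0, zeta0_sq=16 * 0.25)
```

For this setting the simulated value at α=0.05 should land close to the corresponding entry of Table I (0.07515), up to Monte Carlo error; an exact series evaluation, as in the DATAPLOT commands cited in the text, is preferable for tabulation.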
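[Fig. 2 annotations in the original: ζ0 = 4; $\tilde c_0 = \sqrt{n}\,\beta_{h0}/\sigma_E = 4.12$; regions A and B.]
Proposition 4.1. Relationship (4.4) defines a test of significance of size α for testing $H_0$ since
$$\sup_{\theta \in \Theta} P\Big[ \Big( \frac{\hat\beta_h}{{}_h\hat\sigma_T} \Big)^2 < F_{1,\nu^*,1-\alpha}(\tilde c_0^2, \nu^*\zeta_0^2) \Big] = \alpha \qquad (4.5)$$
where P(⋅) denotes the probability, θ is the parameter vector $[(\beta_h/\sigma_E), \zeta]$, and Θ is the domain in the parameter space defined by $(\beta_h/\sigma_E) \ge (\beta_{h0}/\sigma_E)$, $0 \le \zeta \le \zeta_0$.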
Proof. With reference to our notations, the theorem given by Scheffé (1959), p.136, ensures that the probability:
$$P\Big[ \frac{\nu^* \chi^2_1(\tilde\beta_h^2)}{\chi^2_{\nu^*}(\tau^2)} > F_{1,\nu^*,1-\alpha}(\tilde c_0^2, \nu^*\zeta_0^2) \Big] = P(E_0) \qquad (4.6)$$
for a fixed $\tilde\beta_h^2$, is a strictly decreasing function of $\tau^2$, that is of $\zeta^2$, according to (4.2). Thus, if we consider the lower boundary of the domain Θ, defined by $(\beta_h/\sigma_E) = (\beta_{h0}/\sigma_E)$, $0 \le \zeta \le \zeta_0$, see Fig. 2, we have that the probability of the complementary event $\bar E_0$, defined by considering ≤ instead of > in (4.6), is an increasing function of $\tau^2$. Correspondingly the supremum of $P(\bar E_0)$ is the value of $P(\bar E_0)$ at ζ=ζ0, which is α because of the choice of $F_{1,\nu^*,1-\alpha}(\tilde c_0^2, \nu^*\zeta_0^2)$.
It follows that for any 0≤ζ<ζ0: $1 - P(E) = P(\bar E) < \alpha$. Now for such a ζ suppose that $\beta_{h0}/\sigma_E$ is increased to obtain a larger value $\beta_h/\sigma_E$, with a corresponding increment of the noncentrality parameter, say from $\tilde c_0^2$ to $\tilde{\tilde\beta}_h^2$. We now extend Scheffé's argument as follows. The corresponding probability (4.6) can be equivalently written as:
$$P[(W + \tilde{\tilde\beta}_h)^2 > F_{1,\nu^*,1-\alpha}(\tilde c_0^2, \nu^*\zeta_0^2) \, \chi^2_{\nu^*}(\nu^*\zeta^2)/\nu^*] \qquad (4.7)$$
where W is a normal random variable with mean zero and unit variance, independent of $\chi^2_{\nu^*}(\cdot)$. For a given value of this latter variable, (4.7) expresses a conditional probability, which is the same as the unconditional probability because of the independence we have just remarked. For simplicity let us denote the expression on the right side of the inequality in (4.7) by $a^2$. Thus:
$$P(\tilde E) = P[(W + \tilde{\tilde\beta}_h)^2 > a^2] = P\{(W + \tilde{\tilde\beta}_h) \notin [-a, a]\} \qquad (4.8)$$
is the probability that a normal variate of unit variance and “centered” at $\tilde{\tilde\beta}_h$ assumes values outside of the given interval [-a,a] and it is known that this is a function which increases strictly when $|\tilde{\tilde\beta}_h|$ increases. So if we multiply the probability (4.8) by the probability density of $\chi^2_{\nu^*}(\nu^*\zeta^2)$, which does not depend on $\beta_h/\sigma_E$, and integrate over all possible values, to obtain the unconditional probability, we find a value $P(\tilde E) > P(E)$ and, a fortiori, $1 - P(\tilde E) = P(\bar{\tilde E}) < \alpha$. Thus at any point of Θ which is not on the boundary we have a probability of rejecting $H_0$ smaller than α. By the above and by continuity considerations, which allow replacing < with ≤ in (4.5), we can conclude that this probability has α as its supremum and (4.5) holds, as was to be proved.
Remark 4.1. A thorough investigation of the properties of the test defined by (4.4), when the Neyman-Pearson approach is considered, will be presented in another paper.
We only point out that the results obtained in the proof of Proposition 4.1 allow one to forecast that regions like A and B of Fig. 2 will correspond to a subset of the parameter space where the test is biased, that is, where the probability of falsely accepting $H_0$ could be larger than (1-α). How extensive this subset is will be one of the subjects of future investigation. However Fig. 2 lets one guess that the region where the test is unbiased will not differ much from the set of true interest in the parameter space, which is composed of all distinguishable values and is defined by considering in (4.3) the first condition only (see bold line in Fig. 2).
The following Table I gives an example of the elements which are required for the practical application of the test (4.4). It refers to a two-level experimental design $2^4$ of n=16 runs in 4 factors, of which 2 are noise factors. The critical values $F_{1,\nu^*,1-\alpha}(\tilde c_0^2, \nu^*\zeta_0^2)$ are given for the values $c_0 = \tilde c_0/\sqrt{1+\zeta_0^2}$ = 1, 1.5, 2 of the “distinguishability threshold” and the values $\zeta_0^2/n = {}_h\sigma^2_D/\sigma^2_E$ = 0.25, 0.7, 1, 2, 4 of the error variance increments; the levels of significance α = 0.05, 0.10 are considered. We recall that, since we are applying Corollary 3.1, the scale of the noise factors is supposed to be chosen in such a way that the corresponding random variables $Z_j$ have common variance $v^2 = n/\nu^*$. The different degrees of freedom ν* considered in Table I, with an obvious modification of the general definition ν*=n-m-1, also allow you to take into account the cases when in the first experiment not all estimates of the parameters $\beta_h$ are statistically significant, typically on the basis of a T-test, so that by “pooling” we can increase the value ν*.

Table I. Critical values $F_{1,\nu^*,1-\alpha}(\tilde c_0^2, \nu^*\zeta_0^2)$ needed for applying test (4.4); $\tilde c_0 = c_0\sqrt{1+\zeta_0^2}$, $\zeta_0^2 = n({}_h\sigma^2_D/\sigma^2_E)$; n=16, p=2, q=2; $v = \sqrt{n/\nu^*}$; α = 0.05, 0.10. Column headings 0.25, 0.7, 1, 2, 4 are the values of ${}_h\sigma^2_D/\sigma^2_E$.

α = 0.05
               c0 = 1                                      c0 = 1.5                                c0 = 2
 ν*   v      0.25      0.7     1       2       4       0.25    0.7     1       2       4       0.25    0.7     1       2       4
 12   1.16   0.07515   0.2763  0.3561  0.5023  0.6261  0.2368  0.5564  0.6677  0.8641  1.0259  0.4506  0.8654  1.0035  1.2432  1.4380
 13   1.11   0.07513   0.2766  0.3565  0.5028  0.6267  0.2370  0.5572  0.6687  0.8653  1.0271  0.4514  0.8672  1.0055  1.2453  1.4399
 14   1.07   0.07512   0.2768  0.3569  0.5032  0.6272  0.2372  0.5580  0.6696  0.8664  1.0281  0.4520  0.8687  1.0072  1.2472  1.4416
 15   1.03   0.07511   0.2770  0.3571  0.5036  0.6276  0.2373  0.5587  0.6704  0.8673  1.0290  0.4525  0.8701  1.0087  1.2488  1.4431

α = 0.10
               c0 = 1                                      c0 = 1.5                                c0 = 2
 ν*   v      0.25      0.7     1       2       4       0.25    0.7     1       2       4       0.25    0.7     1       2       4
 12   1.16   0.183315  0.3961  0.4692  0.5967  0.7006  0.4194  0.7228  0.8207  0.9880  1.1220  0.6932  1.0718  1.1910  1.3928  1.5529
 13   1.11   0.183346  0.3965  0.4696  0.5972  0.7011  0.4197  0.7237  0.8218  0.9892  1.1230  0.6943  1.0736  1.1930  1.3947  1.5555
 14   1.07   0.183372  0.3968  0.4699  0.5977  0.7016  0.4201  0.7246  0.8227  0.9902  1.1239  0.6952  1.0752  1.1946  1.3964  1.5560
 15   1.03   0.183396  0.3970  0.4703  0.5980  0.7019  0.4204  0.7253  0.8235  0.9910  1.1247  0.6959  1.0766  1.1961  1.3978  1.5573

We note that also some estimates of parameters relating to the noise factors, like $\gamma_j$, $\delta_{js}$, could be statistically not significant and be neglected in calculating the estimator ${}_h\hat\sigma_T$ through formula (3.20). However the degrees of freedom ν* remain unchanged; what changes is the detailed definition of the noncentrality parameter $\tau^2$, (3.25), which might only lose some addends, leaving unchanged the meaning of the ratio ${}_h\sigma^2_D/\sigma^2_E = \tau^2/(n\nu^*)$.
After fixing the “level of distinguishability” $c_0$, to apply the test (4.4) you also have to choose a value $\zeta_0^2 = n \, {}_h\sigma^2_D/\sigma^2_E$, that is, a value of the ratio ${}_h\sigma^2_D/\sigma^2_E$.
A procedure to direct the use of test (4.4) for a given $c_0$ can be the following.
For each value $\zeta_0^2$ considered, see for example Table I, in view of (4.3) and by taking (3.26), (3.26’) into consideration, you can check whether $\zeta_0^2$ is consistent or not with the data by having recourse to the test that accepts $H_0$: $\zeta^2 \le \zeta_0^2$, with $\nu^*\zeta^2 = \nu^* n \, {}_h\sigma^2_D/\sigma^2_E$ in consequence of (3.25), (3.26), (3.26’) and (2.7’’), if:
$$F = (\nu^* n/\nu_0)({}_h\hat\sigma^2_D/\hat\sigma^2_E) = \frac{\chi^2_{\nu_0}(\nu^*\zeta^2)/\nu_0}{\chi^2_\nu/\nu} \le F_{\nu_0,\nu,\alpha}(\nu^*\zeta_0^2) \qquad (4.9)$$
and otherwise rejects it and accepts $H_1$: $\zeta^2 > \zeta_0^2$, where $F_{\nu_0,\nu,\alpha}(\cdot)$ is the value which is exceeded with probability α by the noncentral F-variate with $\nu_0 \le q(p+1)$, ν degrees of freedom and noncentrality parameter of value $\nu^*\zeta_0^2$; ν*=n-m-1, as we said before, might be augmented to take into account the possible elimination from the model of some parameters $\beta_h$ whose estimates did not turn out to be statistically significant; $\nu_0 \le q(p+1)$ is the number of degrees of freedom of ${}_h\hat\sigma^2_D$, see (3.26), equal to the number of parameters related to the noise factors, that is in the sets {$\gamma_j$, $\delta_{js}$}, s=1,…,p, j=1,…,q, retained in the model since in each of them at least one of the corresponding estimates was found statistically significant; ν is the value of the degrees of freedom of the error variance estimate, (3.3), in the first experiment, which likewise can be larger than that referring to the complete model in order to take into account possible non-significant estimates of some parameters $\beta_h$ as well as of some $\gamma_j$, $\delta_{js}$. The stochastic independence of numerator and denominator in (4.9) is once more ensured by Lemma 3.2. The test (4.9) is unbiased; this follows from the result given in Scheffé (1959), p. 136, according to which the probability that F satisfies (4.9) is a continuously decreasing function of $\zeta^2$.
Table II shows some critical values which can be used to carry out test (4.9). Then, with regard to Table I, for a given $c_0$ you can use the minimum value $\zeta_0^2$ which still passes test (4.9) to carry out the test (4.4). Conceptually this procedure corresponds to having recourse to an approximate lower bound of a confidence set for $\zeta^2$, defined through the test (4.9), cfr. Lehmann (1986), p. 90-91.
b) A conditional indicator for more general settings: all assumptions of Corollary 3.1 might not be satisfied. Consider the estimator $\hat{\hat\beta}_h=\hat\beta_h+\hat\beta_h^*$, (2.8), which we could obtain by carrying out the second experiment. An unbiased estimator ${}_h\hat\sigma_T^2$ of $\mathrm{Var}(\hat{\hat\beta}_h)$, defined through the results of the first experiment, was given in Proposition 3.1, ii., formula (3.20). Now let us approximate the probability distribution of ${}_h\hat\sigma_T^2$ by means of a central chi-square distribution as ${}_h\hat\sigma_T^2\cong\pi\,\chi^2_{\tilde\nu}/\tilde\nu$, where the constant $\pi$ and the degrees of freedom $\tilde\nu$ are to be appropriately chosen. For this purpose we consider the two-moment approximation, which makes the mean and variance of the two variables in the former relationship coincide. Precisely we put:
$$E({}_h\hat\sigma_T^2)={}_h\sigma_T^2=\mathrm{Var}(\hat{\hat\beta}_h)=\pi\,E(\chi^2_{\tilde\nu}/\tilde\nu)=\pi,\qquad(4.10')$$
$$\mathrm{Var}({}_h\hat\sigma_T^2)=\pi^2\,\mathrm{Var}(\chi^2_{\tilde\nu}/\tilde\nu)=2\pi^2/\tilde\nu$$
from which we obtain:
$$\tilde\nu=\frac{2\,[\mathrm{Var}(\hat{\hat\beta}_h)]^2}{\mathrm{Var}({}_h\hat\sigma_T^2)}.\qquad(4.10'')$$
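The two-moment matching in (4.10'), (4.10'') can be sketched numerically as follows. This is only an illustration under stated assumptions: the function names are hypothetical, and the example input (a sum of two independent scaled central chi-squares) is chosen for simplicity and is not the noncentral combination treated in the paper.

```python
def two_moment_match(mean, variance):
    """Return (pi, nu) such that pi * chi2_nu / nu has the given mean and variance.

    E(pi * chi2_nu / nu) = pi and Var(pi * chi2_nu / nu) = 2 * pi**2 / nu,
    so pi = mean and nu = 2 * mean**2 / variance, as in (4.10'), (4.10'').
    """
    pi = mean
    nu = 2.0 * mean ** 2 / variance
    return pi, nu


def combination_moments(weights, dofs):
    """Mean and variance of sum_k w_k * chi2_{d_k} / d_k (independent, central)."""
    mean = sum(weights)
    variance = sum(2.0 * w ** 2 / d for w, d in zip(weights, dofs))
    return mean, variance


# Illustrative input (an assumption, not data from the paper):
# two independent components, each chi2_5 / 5 with unit weight.
pi, nu_tilde = two_moment_match(*combination_moments([1.0, 1.0], [5, 5]))
print(pi, nu_tilde)
```

For a single scaled chi-square the matching returns its own degrees of freedom, which is a quick sanity check on the formula.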
Table II. Critical values $F_{\nu_0,\nu,\alpha}(\nu^*\zeta_0^2)$ needed for applying test (4.9); $\lambda=\nu^*\zeta_0^2=\nu^*\cdot n\cdot{}_h\sigma_D^2/\sigma_E^2$; n=16, p=2, q=2; α=0.05.

hσ²D/σ²E |  0.25                        |  0.70                        |  1.00                          |  2.00                          |  4.00
ν*       |  12     13     14     15     |  12     13     14     15     |  12      13      14      15    |  12      13      14      15    |  12      13      14      15
λ        |  48     52     56     60     |  134.4  145.6  156.8  168    |  192     208     224     240   |  384     416     448     480   |  768     832     896     960
ν0  ν    |
1   13   |  22.61  24.89  27.17  29.46  |  72.39  78.88  85.37  91.87  |  105.79  115.08  124.37  133.66|  217.31  235.91  254.5   273.1 |  440.5   477.7   514.92  552.1
1   12   |  22.32  24.56  26.8   29.05  |  71.23  77.6   83.98  90.36  |  104.03  113.15  122.27  131.39|  213.52  231.78  250.04  268.3 |  432.64  469.17  505.68  542.2
1   11   |  22     24.19  26.39  28.6   |  69.95  76.19  82.44  88.69  |  102.09  111.02  119.96  128.89|  209.35  227.24  245.13  263.01|  424     459.78  495.57  531.35
2   12   |  11.47  12.59  13.71  14.83  |  35.91  39.1   42.28  45.47  |  52.31   56.87   61.43   65.99 |  107.05  116.18  125.31  134.44|  216.61  234.87  253.13  271.39
2   11   |  11.3   12.4   13.5   14.6   |  35.26  38.39  41.51  44.63  |  51.33   55.8    60.26   64.73 |  104.96  113.91  122.85  131.79|  212.28  230.18  248.06  265.95
2   10   |  11.11  12.18  13.26  14.34  |  34.54  37.59  40.64  43.7   |  50.24   54.61   58.97   63.34 |  102.65  111.39  120.12  128.86|  207.5   224.99  242.46  259.94
3   11   |  7.73   8.46   9.2    9.93   |  23.7   25.78  27.86  29.95  |  34.41   37.39   40.37   43.34 |  70.16   76.12   82.09   88.05 |  141.71  153.64  165.56  177.49
3   10   |  7.6    8.32   9.03   9.75   |  23.22  25.25  27.28  29.32  |  33.68   36.59   39.5    42.41 |  68.61   74.44   80.26   86.09 |  138.52  150.17  161.82  173.48
3   9    |  7.45   8.15   8.85   9.55   |  22.67  24.65  26.63  28.61  |  32.86   35.7    38.53   41.37 |  66.89   72.56   78.23   83.9  |  134.96  146.31  157.66  169
4   10   |  5.85   6.38   6.92   7.46   |  17.55  19.08  20.6   22.13  |  25.4    27.58   29.76   31.95 |  51.6    55.97   60.34   64.7  |  104.03  112.77  121.51  130.25
4   9    |  5.73   6.25   6.77   7.3    |  17.14  18.62  20.11  21.6   |  24.78   26.91   29.03   31.16 |  50.3    54.55   58.81   63.06 |  101.36  109.87  118.38  126.89
4   8    |  5.6    6.1    6.61   7.12   |  16.67  18.11  19.55  20.99  |  24.08   26.14   28.21   30.27 |  48.83   52.96   57.08   61.21 |  98.35   106.6   114.85  123.11
5   9    |  4.7    5.12   5.54   5.95   |  13.82  15     16.2   17.39  |  19.94   21.64   23.34   25.04 |  40.35   43.75   47.16   50.51 |  81.2    88      94.82   101.62
5   8    |  4.59   4.99   5.4    5.81   |  13.44  14.59  15.75  16.9   |  19.37   21.02   22.67   24.32 |  39.17   42.47   45.77   49.07 |  78.78   85.38   91.99   98.59
5   7    |  4.46   4.85   5.24   5.64   |  13     14.12  15.23  16.34  |  18.73   20.32   21.91   23.5  |  37.82   41      44.19   47.38 |  76.04   82.41   88.76   95.14
6   8    |  3.91   4.25   4.59   4.93   |  11.29  12.25  13.21  14.17  |  16.23   17.61   18.98   20.35 |  32.73   35.48   38.23   40.98 |  65.74   71.24   76.75   82.25
6   7    |  3.8    4.13   4.46   4.78   |  10.92  11.85  12.78  13.7   |  15.7    17.01   18.34   19.67 |  31.6    34.26   36.91   39.56 |  63.45   68.75   74.06   79.37
6   6    |  3.67   3.98   4.3    4.61   |  10.49  11.38  12.27  13.16  |  15.06   16.33   17.6    18.87 |  30.3    32.84   35.38   37.92 |  60.79   65.87   70.96   76.04

Critical values in grey cells correspond to those combinations of ν0, ν and λ that cannot be realized in our example.
ν*: (number of observations n) - (number of parameters βh retained in the model as being significant);
ν: number of observations on which the variance estimate σ̂²E is based: (number of observations n) - (number of all parameters βh, γj, δjs retained in the model as being significant);
ν0: number of terms of type γj+δ′j xi retained in the model as being significant.
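The noncentrality values λ listed in the header of Table II follow directly from λ = ν*·n·(hσ²D/σ²E) with n=16. A minimal sketch that regenerates that header row (the function name is a hypothetical helper, not notation from the paper):

```python
def noncentrality(nu_star, n, variance_ratio):
    """lambda = nu* * zeta_0^2 = nu* * n * (h_sigma2_D / sigma2_E), as in the caption of Table II."""
    return nu_star * n * variance_ratio


n = 16
ratios = [0.25, 0.70, 1.00, 2.00, 4.00]   # values of h_sigma2_D / sigma2_E used in Table II
nu_stars = [12, 13, 14, 15]

# One list of lambda values per variance ratio, ordered by nu* = 12, ..., 15.
lambda_table = {r: [round(noncentrality(ns, n, r), 1) for ns in nu_stars] for r in ratios}
for r in ratios:
    print(r, lambda_table[r])
```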
It is easy to show that, if the matrices $G_j'G_j$, (3.7), are idempotent of rank (p+1) with $|x_{hi}|=1$, ∀i, see Lemma 3.3, and we refer to the complete model (2.6), it follows from (3.25), (3.26) that:
$$\tilde\nu=\frac{\left[1+n\,{}_h\sigma_D^2/\sigma_E^2\right]^2}{\dfrac{v^4}{n^2}\,q(p+1)+\dfrac{1}{\nu}\left[1-\dfrac{v^2}{n}\,q(p+1)\right]^2+2\,\dfrac{v^2}{n}\,\left(n\,{}_h\sigma_D^2/\sigma_E^2\right)}\qquad(4.11)$$
where, in particular, ν is defined with reference to (3.26). If in expression (4.11) we put $\zeta^2=n\,({}_h\sigma_D^2/\sigma_E^2)$ and consider the derivative of the corresponding function with respect to $\zeta^2$, it is easily shown that $\tilde\nu$, (4.11), is a monotonic increasing function of $\zeta^2$ for $\zeta^2=n\,({}_h\sigma_D^2/\sigma_E^2)>1$.
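As a numerical check of (4.11), the sketch below evaluates the approximate degrees of freedom for the settings of the example discussed in § 5 (n=16, q(p+1)=6, ν=6) and for the estimated values of ζ² reported there. The function and argument names are illustrative assumptions.

```python
def nu_tilde(n, v2, nu0, nu, zeta2):
    """Approximate degrees of freedom from (4.11).

    n     : number of observations
    v2    : variance v^2 of the standardized noise variables
    nu0   : q * (p + 1), number of noise-related parameters (complete model)
    nu    : degrees of freedom of the error variance estimate
    zeta2 : n * h_sigma2_D / sigma2_E
    """
    numerator = (1.0 + zeta2) ** 2
    denominator = (v2 ** 2 / n ** 2) * nu0 \
        + (1.0 / nu) * (1.0 - (v2 / n) * nu0) ** 2 \
        + 2.0 * (v2 / n) * zeta2
    return numerator / denominator


# zeta2 values are the method-of-moments estimates quoted in the results of Section 5 b).
for v2, zeta2 in [(4 / 3, 0.85004257), (1.0, 0.63753193), (1 / 9, 0.070836881)]:
    print(v2, nu_tilde(16, v2, 6, 6, zeta2))
```

Under these inputs the function returns values close to 15.21, 15.94 and 7.43, in line with the ν̃ column reported for the three choices of v² in § 5.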
With regard to (4.11), a guide to a choice of $\zeta^2$ which is consistent with the data may be obtained by applying the test which accepts $H_0$: $\zeta^2\ge\zeta_0^2$, see Lemma 3.3 and (3.26), if:
$$F=\frac{n^2}{v^2\nu_0}\,\frac{{}_h\hat\sigma_D^2}{\hat\sigma_E^2}=\frac{\chi^2_{\nu_0}\!\left(n\zeta^2/v^2\right)/\nu_0}{\chi^2_{\nu}/\nu}\ge F_{\nu_0,\nu;\,1-\alpha}\!\left(\frac{n\zeta_0^2}{v^2}\right),\qquad(4.12)$$
and otherwise rejects it and accepts $H_1$: $\zeta^2<\zeta_0^2$, where $F_{\nu_0,\nu;\,1-\alpha}(\cdot)$ is the value exceeded with probability (1-α) by the noncentral F-variate with $\nu_0\le q(p+1)$ and ν degrees of freedom, which are defined as in (4.9), and noncentrality parameter $n\zeta_0^2/v^2$, with regard to which we recall that in general we put $\zeta^2=n\,{}_h\sigma_D^2/\sigma_E^2$. The test (4.12), which has a probabilistic structure similar to that of test (4.9), is also unbiased. With regard to (4.12), we might consider a set of reasonable values for the ratio ${}_h\sigma_D^2/\sigma_E^2$, see Table I, and, for calculating $\tilde\nu$, (4.11), use the largest value which still leads to accepting $H_0$ according to the test (4.12).
Alternatively we can estimate the ratio ${}_h\sigma_D^2/\sigma_E^2$ by the method of moments, replacing the unknown parameters with the corresponding unbiased estimates of ${}_h\sigma_D^2$, see (3.19), (3.20) and (4.15) below, and of $\sigma_E^2$, (3.3). Now, assuming the approximation defined through (4.10'), (4.10''), (4.11), with regard to (2.8) and (2.9), and remembering that the random variable U is stochastically independent of ${}_h\hat\sigma_T^2$ by virtue of Proposition 3.1, ii., we have that:
$$T=\frac{\hat{\hat\beta}_h}{{}_h\hat\sigma_T}=\frac{\hat\beta_h+\hat\beta_h^*}{{}_h\hat\sigma_T}=\frac{\beta_h+{}_h\sigma_{E^*}U_1+{}_h\sigma_D U_2}{{}_h\hat\sigma_T}=\frac{\beta_h+\sqrt{{}_h\sigma_{E^*}^2+{}_h\sigma_D^2}\;U}{{}_h\hat\sigma_T}=\frac{(\beta_h/{}_h\sigma_T)+U}{\sqrt{\chi^2_{\tilde\nu}/\tilde\nu}}\qquad(4.13)$$
follows a T-distribution, in general noncentral, with $\tilde\nu$ degrees of freedom and noncentrality parameter $|\beta_h|/{}_h\sigma_T$, regardless of whether $\hat{\hat\beta}_h$, and thus $\hat\beta_h$, $\hat\beta_h^*$, are obtained together through the second experiment, or $\hat\beta_h$ is actually obtained from the first experiment while $\hat\beta_h^*$ would be implicitly obtained only by carrying out the second experiment, see (2.7).
Let $t_{\tilde\nu,\alpha/2}$ be the value exceeded with probability α/2 by the central T random variable with $\tilde\nu$ degrees of freedom. Correspondingly, under $H_0$: $\beta_h=0$, consider the probability that a possible value $|\hat{\hat\beta}_h|/{}_h\hat\sigma_T$, where the numerator would actually be available according to (2.7) when carrying out also the second experiment, is not larger than $t_{\tilde\nu,\alpha/2}$, conditional on the results of the first experiment, that is on $\hat\beta_h$, (2.3), and ${}_h\hat\sigma_T^2$, (3.20). Note that these estimates respectively give information on the "size" of the $\beta_h$ effect and on the error variance increment, leading from ${}_h\sigma_{E^*}^2$ to ${}_h\sigma_T^2$, which arises when the noise factor levels truly happen at random. If we take (2.8), (4.10') and Proposition 3.1, ii. into consideration, in explicit terms the above probability is given by:
$$P\!\left(-t_{\tilde\nu,\alpha/2}\,{}_h\hat\sigma_T\le\hat{\hat\beta}_h=\hat\beta_h+\hat\beta_h^*\le t_{\tilde\nu,\alpha/2}\,{}_h\hat\sigma_T\ \middle|\ \hat\beta_h,\,{}_h\hat\sigma_T\right)$$
$$=P\!\left[-\frac{t_{\tilde\nu,\alpha/2}\,{}_h\hat\sigma_T+\hat\beta_h}{{}_h\sigma_D}\le U_2\le\frac{t_{\tilde\nu,\alpha/2}\,{}_h\hat\sigma_T-\hat\beta_h}{{}_h\sigma_D}\ \middle|\ \hat\beta_h,\,{}_h\hat\sigma_T\right]$$
$$=\frac{1}{\sqrt{2\pi}}\int_{-(t_{\tilde\nu,\alpha/2}\,{}_h\hat\sigma_T+\hat\beta_h)/{}_h\sigma_D}^{(t_{\tilde\nu,\alpha/2}\,{}_h\hat\sigma_T-\hat\beta_h)/{}_h\sigma_D}\exp(-u_2^2/2)\,du_2=1-\alpha(\hat\beta_h,\,{}_h\hat\sigma_T,\,{}_h\sigma_D)\qquad(4.14)$$
where we took into consideration the normality assumptions on the errors $E_{2i}$, from which it followed that $U_2$, with ${}_h\sigma_D U_2=\hat\beta_h^*$, is a standardized normal variate, see (2.8), stochastically independent of the variates $\hat\beta_h$, ${}_h\hat\sigma_T$. This, in particular, ensues from the stochastic independence of the random variables $\tilde E_i$, $Z_{ji}$ among themselves and with respect to the errors $E_i$ of the first experiment, see § 2., b), and from Lemma 3.2. We underline once more that the overall distributional characteristics of $\hat\beta_h$, $\hat\beta_h^*$, ${}_h\hat\sigma_T$ do not change if, besides ${}_h\hat\sigma_T$, we consider $\hat\beta_h$ as obtained from the first experiment too, while $\hat\beta_h^*$ is meant to pertain to the second one.
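The conditional probability (4.14) only involves the standard normal distribution function, so it can be evaluated with the error function. The sketch below is an illustration under explicit assumptions: the critical value `t_crit` is passed in as an input (the standard library has no Student-t quantile), and the numerical inputs of the usage example are not given in this form in the text. They are backed out from the quantities of § 5 for v²=4/3: |β̂₂| is recovered from (β̂₂/hσ̂T)²=11.652, hσD is replaced by the square root of the unbiased estimate 0.0027302083, and t ≈ 2.129 is an approximate two-sided 5% point for about 15.2 degrees of freedom.

```python
import math


def normal_cdf(x):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def conditional_acceptance_probability(beta_hat, sigma_t_hat, sigma_d, t_crit):
    """1 - alpha(beta_hat, sigma_T_hat, sigma_D) as in (4.14).

    Probability that |beta_hat + beta_star| <= t_crit * sigma_t_hat when
    beta_star = sigma_d * U2 with U2 standard normal, given beta_hat and sigma_t_hat.
    """
    upper = (t_crit * sigma_t_hat - beta_hat) / sigma_d
    lower = -(t_crit * sigma_t_hat + beta_hat) / sigma_d
    return normal_cdf(upper) - normal_cdf(lower)


# Inputs reconstructed from Section 5 (assumptions, see the lead-in above).
sigma_t_hat = math.sqrt(0.0059420573)
sigma_d_hat = math.sqrt(0.0027302083)
beta2_abs = math.sqrt(11.652) * sigma_t_hat
t_crit = 2.129  # approximate, supplied by hand

p_accept = conditional_acceptance_probability(beta2_abs, sigma_t_hat, sigma_d_hat, t_crit)
print(p_accept)
```

With these reconstructed inputs the result is close to 0.029, the order of magnitude reported for β̂₂ with v²=4/3 in § 5; the probability depends on β̂h only through its absolute value.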
As an indicator of distinguishability of an effect $\beta_h$, on the basis of the first experiment only, we propose to consider the value $\hat\alpha(\hat\beta_h,\,{}_h\hat\sigma_T,\,{}_h\hat\sigma_D)$ obtained from (4.14) by replacing the unknown parameter ${}_h\sigma_D$ with the square root of the unbiased estimator of ${}_h\sigma_D^2$, which can be obtained from (3.19) as:
$${}_h\hat{\hat\sigma}_D^2={}_h\hat\sigma_D^2-{}_h\hat\sigma_{E^*}^2\,\frac{v^2}{{}_hx^2}\left[\sum_{j=1}^{q}\mathrm{tr}(G_j'G_j)\right]={}_h\hat\sigma_D^2-{}_h\hat\sigma_{E^*}^2\,\frac{v^2}{{}_hx^2}\,[q(p+1)]\qquad(4.15)$$
where the last expression holds when all matrices $G_j'G_j$, (3.7), are idempotent and the complete model (2.6) is considered (for possible simplifications see the comments on $\nu_0$ following (4.9)).
The meaning of the proposed indicator is the following. Suppose that the second experiment is also carried out. Then we could test the hypothesis $H_0$: $\beta_h=0$ by applying the usual two-sided T-test to the ratio $\hat{\hat\beta}_h/{}_h\hat\sigma_T$, where $\hat{\hat\beta}_h$ is the estimator (2.7) of $\beta_h$ obtained from the second experiment and ${}_h\hat\sigma_T^2$ is the estimator of ${}_h\sigma_T^2$ obtained from the first; thus, we would accept $H_0$ if $|\hat{\hat\beta}_h|/{}_h\hat\sigma_T\le t_{\tilde\nu,\alpha/2}$. Instead of actually carrying out the second experiment, you can ask what the probability is, conditional on the results of the first experiment leading to $\hat\beta_h$, ${}_h\hat\sigma_T$, that the results of the second one would produce non-significant values for the T-test. This probability is given by (4.14). Consequently, if ${}_h\sigma_D^2$ were known, the indicator α(⋅) would express the probability of rejecting $H_0$ through the above test, that is, of distinguishing the effect $\beta_h$ from zero also in the presence of the noise factor variability, given the values $\hat\beta_h$, ${}_h\hat\sigma_T$ observed in the first experiment and for a generic random contribution $\hat\beta_h^*$ of the second one, if the latter were carried out. If α(⋅) is large, say, larger than the level of significance α chosen for the T-test, this can be considered as an indication that, in the light of the first experiment and without the need to carry out the second one, the effect $\beta_h$ can be considered as distinguishable from zero, even if the noise factor levels happen at random. The proposed indicator $\hat\alpha(\cdot)$, which requires the estimation of ${}_h\sigma_D^2$, see (4.15), obviously represents an approximation of the former one.
5. Completing the discussion of the case studied in the light of the theoretical results - Conclusion.
Coming back to the example presented in § 2.c), if we want to obtain an unbiased estimator of $\mathrm{Var}(\hat{\hat\beta}_h)=\sigma_{E^*}^2+{}_h\sigma_D^2={}_h\sigma_T^2$, (2.9), we need unbiased estimators of both $\sigma_{E^*}^2$ and ${}_h\sigma_D^2$.
From (3.4) we have ${}_h\hat\sigma_{E^*}^2=\hat\sigma_E^2/{}_hx^2$, which, for the complete model and for the data of the example, leads to:
${}_h\hat\sigma_{E^*}^2=\hat\sigma_E^2/{}_hx^2=\hat\sigma_E^2/16=0.051389583/16=0.003211848938$, ∀ h=1,…,m=10, with 6 degrees of freedom.
From (3.5), see also (3.19), we have that the "natural" biased estimator of ${}_h\sigma_D^2$ is
$${}_h\hat\sigma_D^2=\frac{v^2}{{}_hx^4}\sum_{j=1}^{q}Q_j^2,$$
where $Q_j^2=(\eta+E)'G_j'G_j(\eta+E)$, (3.7), and $G_j=A\cdot(B_j+X^*C_j)$, (3.6). To obtain ${}_h\hat\sigma_D^2$, for example for h=1, by using the elements of Table 1, we first have to write down $G_1$ and $G_2$. By taking (3.6) into consideration we obtain:
$G_1=A\cdot(B_1+X^*C_1)=\frac{1}{16}M_1$, where $M_1$ is the 16×16 matrix whose rows are:

rows 1-4:    +3 -3 +3 -3   +1 -1 +1 -1   +1 -1 +1 -1   -1 +1 -1 +1
rows 5-8:    -1 +1 -1 +1   -3 +3 -3 +3   +1 -1 +1 -1   -1 +1 -1 +1
rows 9-12:   +1 -1 +1 -1   -1 +1 -1 +1   +3 -3 +3 -3   +1 -1 +1 -1
rows 13-16:  +1 -1 +1 -1   -1 +1 -1 +1   -1 +1 -1 +1   -3 +3 -3 +3

(the four rows within each group are identical),
since, according to the definitions of $b_{ji}$, $c_{jsi}$ which follow (2.4), (2.5),
• $B_1=\frac{1}{16}\begin{pmatrix}-1&+1&-1&+1&\cdots&-1&+1\\ \vdots&\vdots&\vdots&\vdots&&\vdots&\vdots\\ -1&+1&-1&+1&\cdots&-1&+1\end{pmatrix}$
with $B_1$ a square matrix, 16×16, with 16 equal rows whose elements are those of ${}_1\xi$, that is the elements of the first column of $\Xi_0$;
• $X^*=[{}_1x,\ {}_2x]$
with $X^*$ a 16×2 matrix where ${}_1x$, ${}_2x$ equal respectively the second and the third column of X;
• $C_1=\frac{1}{16}\begin{pmatrix}+1&-1&+1&-1&-1&+1&-1&+1&+1&-1&+1&-1&-1&+1&-1&+1\\ +1&-1&+1&-1&+1&-1&+1&-1&-1&+1&-1&+1&-1&+1&-1&+1\end{pmatrix}$,
with $C_1$ a 2×16 matrix where the elements of the two rows equal those of ${}_3\xi$ and ${}_4\xi$, respectively, that is the elements of the two columns of $\Xi_1$;
for h=1, we also have:
• $A=\mathrm{diag}(x_{11},\dots,x_{1i},\dots,x_{1n})$,
where $[x_{11},\dots,x_{1i},\dots,x_{1n}]'$ is the vector ${}_1x$, which equals the second column of X.
Likewise we obtain:
$G_2=A\cdot(B_2+X^*C_2)=\frac{1}{16}M_2$, where $M_2$ is the 16×16 matrix whose rows are:

rows 1-4:    +3 +3 -3 -3   +1 +1 -1 -1   +1 +1 -1 -1   -1 -1 +1 +1
rows 5-8:    -1 -1 +1 +1   -3 -3 +3 +3   +1 +1 -1 -1   -1 -1 +1 +1
rows 9-12:   +1 +1 -1 -1   -1 -1 +1 +1   +3 +3 -3 -3   +1 +1 -1 -1
rows 13-16:  +1 +1 -1 -1   -1 -1 +1 +1   -1 -1 +1 +1   -3 -3 +3 +3

(the four rows within each group are identical), with
• $B_2=\frac{1}{16}\begin{pmatrix}-1&-1&+1&+1&\cdots&+1&+1\\ \vdots&\vdots&\vdots&\vdots&&\vdots&\vdots\\ -1&-1&+1&+1&\cdots&+1&+1\end{pmatrix}$
where the rows of the matrix are all equal to ${}_2\xi$,
• $C_2=\frac{1}{16}\begin{pmatrix}+1&+1&-1&-1&-1&-1&+1&+1&+1&+1&-1&-1&-1&-1&+1&+1\\ +1&+1&-1&-1&+1&+1&-1&-1&-1&-1&+1&+1&-1&-1&+1&+1\end{pmatrix}$,
where the two rows of the matrix are equal to ${}_5\xi$ and ${}_6\xi$, respectively, and A and $X^*$ are the same as for $G_1$.
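The construction of G₁ and G₂ from (3.6) can be reproduced in a few lines. The sketch below assumes the run order implied by the printed vectors (₁ξ alternating fastest, then ₂ξ, then ₁x, then ₂x) and takes the rows of C_j to be the elementwise products ₁x·_jξ and ₂x·_jξ, which agrees with the rows listed for C₁ and C₂; all function names are illustrative.

```python
from itertools import product

# Assumed run order: tuples are (x2, x1, xi2, xi1), last entry varying fastest.
runs = list(product((-1, 1), repeat=4))
x2 = [r[0] for r in runs]
x1 = [r[1] for r in runs]
xi = {1: [r[3] for r in runs], 2: [r[2] for r in runs]}
n = len(runs)  # 16


def build_g(j):
    """G_j = A (B_j + X* C_j) for h = 1, written entry by entry.

    B_j has all rows equal to xi_j / n; C_j has rows (x1 * xi_j) / n and (x2 * xi_j) / n;
    A = diag(x1) multiplies row i by x1[i].
    """
    z = xi[j]
    return [[x1[i] * (z[k] + x1[i] * x1[k] * z[k] + x2[i] * x2[k] * z[k]) / n
             for k in range(n)] for i in range(n)]


def gram(g):
    """Return G'G."""
    return [[sum(g[i][k] * g[i][l] for i in range(n)) for l in range(n)] for k in range(n)]


def matmul(a, b):
    return [[sum(a[i][t] * b[t][k] for t in range(n)) for k in range(n)] for i in range(n)]


g1, g2 = build_g(1), build_g(2)
p1, p2 = gram(g1), gram(g2)
trace1 = sum(p1[k][k] for k in range(n))
trace2 = sum(p2[k][k] for k in range(n))
first_row_g1 = [round(n * value) for value in g1[0]]
print(first_row_g1, trace1, trace2)
```

Both Gram matrices come out idempotent with trace 3, which is the property used in the text when the ranks and eigenvalues of G′₁G₁ and G′₂G₂ are discussed.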
With the data of the example we obtain, when $v^2=1$:
$Q_1^2=(\eta+E)'G_1'G_1(\eta+E)=y'G_1'G_1y=0.00796875$
$Q_2^2=(\eta+E)'G_2'G_2(\eta+E)=y'G_2'G_2y=0.82456875$
and ${}_h\hat\sigma_D^2=\dfrac{v^2}{{}_1x^4}\sum_{j=1}^{q}Q_j^2=0.0032520996$, where ${}_1x^4=n^2=16^2=256$, a result which can be shown to hold not only for h=1 but also for h=2,3.
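The two variance component estimates obtained so far are simple arithmetic on the quantities quoted above; a minimal sketch (variable names are illustrative):

```python
# Quantities quoted in the text for the example (complete model, h = 1).
sigma2_E_hat = 0.051389583      # error variance estimate from the first experiment
n = 16
h_x2 = n                        # h_x^2 = sum of squared levels, all equal to +/-1
q1_sq, q2_sq = 0.00796875, 0.82456875
v2 = 1.0

# (3.4): variance of the effect estimate in the first experiment.
sigma2_Estar_hat = sigma2_E_hat / h_x2

# (3.5): "natural" (biased) estimate of the noise-induced variance component.
sigma2_D_hat = v2 / h_x2 ** 2 * (q1_sq + q2_sq)

print(sigma2_Estar_hat, sigma2_D_hat)
```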
Following Lemma 3.1, it can easily be verified that the ranks $r_1$ of $G_1$ and $r_2$ of $G_2$ are both equal to p+1=3; in fact, ∀ h=1,..,3, all elements $x_{hi}$, i=1,2,…,n, are different from zero, since they are the levels of the controlled factors recorded as -1,+1 in conventional units (see point i.). In this case, by Lemma 3.2, $Q_1^2$ and $Q_2^2$ are stochastically independent and also independent of $\hat\sigma_E^2$, $\hat\beta_1$, $\hat\beta_2$ and $\hat\beta_{12}$.
The $r_1=3$ non-zero eigenvalues of $G_1'G_1$ and of $G_2'G_2$ are all equal to 1. This holds since, for the experimental design of our example, ∀ h=1,..,3, the conditions of Lemma 3.3 are satisfied. In fact: 1) $|x_{hi}|=1$, i=1,2,…,16; 2) $1/{}_1\xi^2=1/{}_2\xi^2=1/16$; 3) $1/{}_3\xi^2=1/{}_5\xi^2=1/{}_1x^2=1/16$; $1/{}_4\xi^2=1/{}_6\xi^2=1/{}_2x^2=1/16$. So the matrices $G_1'G_1$ and $G_2'G_2$ are idempotent of rank 3, and $\sum_{j=1}^{2}Q_j^2/\sigma_E^2$ is the sum of two independent noncentral chi-square random variables, each with 3 degrees of freedom; thus, given the reproductivity of the corresponding distributions, it has a noncentral chi-square distribution with 6 degrees of freedom and noncentrality parameter $\tau_1^2+\tau_2^2$, with $\tau_j^2$, j=1,2, defined by (3.18).
We note that, as shown in Proposition 3.1, the natural estimator ${}_h\hat\sigma_D^2$ of ${}_h\sigma_D^2$ is biased. However, an unbiased estimator ${}_h\hat\sigma_T^2$ of the overall variance ${}_h\sigma_T^2$, (2.9), of the estimate $\hat{\hat\beta}_h$ is given in point ii. of the same Proposition, formula (3.20). We recall that above we obtained the values ${}_h\hat\sigma_{E^*}^2=0.003211848938$ and ${}_h\hat\sigma_D^2=\dfrac{v^2}{{}_1x^4}\sum_{j=1}^{q}Q_j^2=0.0032520996$; to apply formula (3.20) we note that $\left[1-(v^2/{}_hx^2)\sum_{j=1}^{2}\mathrm{tr}(G_j'G_j)\right]=0.625$ for $v^2=1$, recalling that the trace of an idempotent matrix is equal to its rank, which is 3 for both matrices of interest. For the case under consideration with $v^2=1$, we finally obtain ${}_h\hat\sigma_T^2=0.0052595052$, h=1,2,3. From Lemma 3.2 it also follows that ${}_h\hat\sigma_T^2$ is stochastically independent of the effect estimates $\hat{\hat\beta}_h$, ∀h=1,…,3.
In view of Corollary 3.1, we remark that if we choose $v^{*2}=n/(n-m-1)=16/(16-3-1)=4/3$, we obtain ${}_h\hat\sigma_T^2=(\sigma_E^2/n)[\chi^2_{\nu^*}(\tau^2)]/\nu^*$, (3.24), that is, ${}_h\hat\sigma_T^2$ follows, up to a scale factor, a noncentral chi-square distribution with $\nu^*=16-3-1=12$ degrees of freedom and noncentrality parameter $\tau^2$ given by (3.25). In fact, with any other value of $v^2$, ${}_h\hat\sigma_T^2$ is a linear combination of noncentral chi-square random variables, which can only be approximated by a noncentral chi-square distribution.
What are the implications of fixing the value $v^{*2}=n/(n-m-1)=4/3$? As pointed out in Remark 3.1, this condition, added to (3.30), imposes a true restriction on the spread of the real levels of the noise factors. In fact their standard deviations in the original units are not equal to the conventional standard deviations $v_1=R_1/3=833.3$ and $v_2=R_2/3=8.333$ (where, as mentioned before, $R_1$, $R_2$ are respectively the lengths of the longer and shorter side of the rectangle describing the unwound whole roll) but turn out to be smaller, respectively equal to $v_1=\sqrt{3/4}\,(R_1/3)=721.688$ and $v_2=\sqrt{3/4}\,(R_2/3)=7.21688$, which correspond to $0.289R_j$ instead of the conventional $0.333R_j$. We note that this means conventionally assigning a negligible probability to noise factor levels which correspond to points outside a rectangle a little smaller than the real one related to a whole roll.
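The rescaled standard deviations quoted above can be checked directly. In this sketch the side lengths R₁=2500 and R₂=25 are not stated in this passage; they are backed out from R₁/3=833.3 and R₂/3=8.333 and should be read as assumptions.

```python
import math

# Side lengths inferred from R1/3 = 833.3 and R2/3 = 8.333 (assumption).
R1, R2 = 2500.0, 25.0
v_star_sq = 16 / (16 - 3 - 1)   # v*^2 = n / (n - m - 1) = 4/3

conventional = [R1 / 3, R2 / 3]
# Dividing by v* = sqrt(4/3) is the same as multiplying by sqrt(3/4).
rescaled = [s / math.sqrt(v_star_sq) for s in conventional]
fraction_of_R = rescaled[0] / R1

print(conventional, rescaled, fraction_of_R)
```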
For $v^2=4/3$ we obtain the new estimates ${}_h\hat\sigma_T^2=0.0059420573$, as $\left[1-(v^2/{}_hx^2)\sum_{j=1}^{q}\mathrm{tr}(G_j'G_j)\right]=0.5$, and ${}_h\hat\sigma_D^2=0.0043361328$.
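The estimates of the total variance for v²=1 and v²=4/3 can be reproduced as follows. Formula (3.20) itself is not restated in this section; the sketch assumes it has the form hσ̂²T = hσ̂²D + hσ̂²E*·[1 − (v²/hx²)Σ tr(G′jGj)], which is the form consistent with the bracketed factor and the numbers quoted in the text. Names are illustrative.

```python
sigma2_E_hat = 0.051389583
n = 16
h_x2 = n
sum_q_sq = 0.00796875 + 0.82456875   # Q_1^2 + Q_2^2
trace_sum = 3 + 3                    # tr(G_1'G_1) + tr(G_2'G_2), both idempotent of rank 3


def total_variance_estimate(v2):
    """Assumed form of (3.20): biased D-estimate plus a shrunken E*-estimate."""
    sigma2_Estar_hat = sigma2_E_hat / h_x2
    sigma2_D_hat = v2 / h_x2 ** 2 * sum_q_sq
    bracket = 1.0 - (v2 / h_x2) * trace_sum
    return sigma2_D_hat + sigma2_Estar_hat * bracket, sigma2_D_hat, bracket


for v2 in (1.0, 4 / 3):
    print(v2, total_variance_estimate(v2))
```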
In the second experiment, when the noise factor levels happen at random, the variance of the effect estimate $\hat{\hat\beta}_h$ is $\mathrm{Var}(\hat{\hat\beta}_h)=\sigma_{E^*}^2+{}_h\sigma_D^2$ or, under the hypotheses of Lemma 3.3 and of Corollary 3.1, it assumes the form $(\sigma_E^2/n)\left(1+n\,{}_h\sigma_D^2/\sigma_E^2\right)$, see (4.2).
The variance of the parameter estimate thus becomes greater than that of the first experiment, where we have $\mathrm{Var}(\hat\beta_h)=\sigma_{E^*}^2=\sigma_E^2/n$. Now it is of interest to assess whether the effect $\beta_h$ is actually distinguishable from random fluctuations even when the variance of its estimate is the real one: so, for example, we may wonder whether conditioning temperature actually shows a significant effect on the response η even if the test strip is chosen at random in the roll, as we assume to be the case in the real utilization of the product. Then the variance to be considered is not ${}_h\hat\sigma_{E^*}^2=0.003211848938$ but ${}_h\hat\sigma_T^2=0.0059420573$.
The theory developed above in § 4 allows us to answer such a question in two ways.
a) The case when Corollary 3.1 holds ($v^2=n/(n-m-1)$ and $G_j'G_j$ is idempotent ∀j=1,..,q).
The answer to the question we are interested in can be obtained by testing the following hypothesis (4.3):
$$H_0:\quad\frac{|\beta_h|}{\sqrt{\sigma_E^2/n+{}_h\sigma_D^2}}=\frac{\sqrt{n}\,|\beta_h/\sigma_E|}{\sqrt{1+\zeta^2}}\ge c_0,\qquad|\beta_h|/\sigma_E\ge|\beta_{h0}|/\sigma_E,\qquad0\le\zeta^2\le\zeta_0^2$$
where we recall that $c_0$ is a chosen threshold of distinguishability of the effect $\beta_h$ of interest with respect to random fluctuations when the variance of the least squares estimate $\hat\beta_h$ of $\beta_h$ in the first experiment is augmented to take the noise factors also into account; $\zeta_0^2=n\,{}_h\sigma_D^2/\sigma_E^2$ is a chosen upper bound for said variance increment.
According to (4.4) we accept $H_0$, (4.3), at a significance level α if:
$$\left(\frac{\hat\beta_h}{{}_h\hat\sigma_T}\right)^2\ge F_{1,\nu^*,1-\alpha}(\tilde c_0^2,\ \nu^*\zeta_0^2),$$
where ${}_h\hat\sigma_T^2$ is the unbiased estimate of the total variance obtained by (3.20), leading to (3.24) under the assumed hypotheses, $F_{1,\nu^*,1-\alpha}(\cdot,\cdot)$ is the (1-α) critical value of the doubly noncentral F-distribution, with $(1+\zeta_0^2)c_0^2$ and $\nu^*\zeta_0^2$ the values of the noncentrality parameters, and degrees of freedom 1 and ν* = (n - number of significant estimates of the parameters $\beta_h$ retained in the model). In the sequel, for simplicity, we shall refer to the complete model; thus for the case under consideration we have ν*=n-m-1=16-4=12, since there are n=16 observations and 4 parameters $\beta_h$.
To choose $\zeta_0^2$ we can follow the procedure suggested in § 4, a). We have to consider the test (4.9):
$$\frac{\nu^* n}{\nu_0}\,\frac{{}_h\hat\sigma_D^2}{\hat\sigma_E^2}\le F_{\nu_0,\nu,\alpha}(\nu^*\zeta_0^2)$$
which leads us to accept the hypothesis $H_0$: $\zeta^2\le\zeta_0^2$ when the former relationship is satisfied and otherwise to reject it, where $F_{\nu_0,\nu,\alpha}(\cdot)$ is the critical value at a significance level α of the noncentral F distribution. In the case of a complete model this distribution has degrees of freedom $\nu_0=q(p+1)$, equal to the number of parameters pertaining to the noise factors ($\nu_0=2\cdot3=6$ in the example), and ν, which expresses the number of degrees of freedom left for the estimation of the variance $\sigma_E^2$ after fitting the complete model to the data of the first experiment (ν=16-10=6 in the example), while the value of the noncentrality parameter is set equal to $\nu^*\zeta_0^2$.
With the help of a table like Table II and with reference to test (4.9) recalled above, one may, for example, choose as $\zeta_0^2$, or equivalently as the corresponding ratio ${}_h\sigma_D^2/\sigma_E^2$, the smallest value considered in the table which still allows us to accept $H_0$, as suggested in § 4, a). In the example we are considering we obtain
$$\frac{\nu^* n}{\nu_0}\,\frac{{}_h\hat\sigma_D^2}{\hat\sigma_E^2}=\frac{12}{6}\,\frac{{}_h\hat\sigma_D^2}{{}_h\hat\sigma_{E^*}^2}=2\cdot\frac{0.004336}{0.003211}=2.7.$$
In Table II for ν*=12 and ν=ν0=6 we read the critical value 3.67>2.7, corresponding to a variance increment ${}_h\sigma_D^2/\sigma_E^2=0.25$.
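The comparison just described can be written out as a short check. The critical values below are simply read from the row ν0=6, ν=6 and the columns ν*=12 of Table II; they are not recomputed from the noncentral F distribution. The selection rule (smallest tabulated ratio for which H0 is still accepted) follows § 4, a); names are illustrative.

```python
n, nu_star, nu0 = 16, 12, 6
sigma2_E_hat = 0.051389583
sigma2_D_hat = 0.0043361328      # biased estimate for v^2 = 4/3

# Statistic of test (4.9): (nu* n / nu0) * (h_sigma2_D_hat / sigma2_E_hat).
statistic = (nu_star * n / nu0) * sigma2_D_hat / sigma2_E_hat

# Table II, row nu0 = 6, nu = 6, columns nu* = 12 (transcribed, not recomputed).
critical_values = {0.25: 3.67, 0.70: 10.49, 1.00: 15.06, 2.00: 30.3, 4.00: 60.79}

accepted = [ratio for ratio, crit in sorted(critical_values.items()) if statistic <= crit]
chosen_ratio = accepted[0] if accepted else None
zeta0_sq = n * chosen_ratio if chosen_ratio is not None else None

print(round(statistic, 2), chosen_ratio, zeta0_sq)
```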
We can now apply test (4.4), for example, to test the distinguishability from zero of the conditioning temperature effect $\beta_2$ also when we have a variance increment of the random errors equal to ${}_h\sigma_D^2=0.25\,\sigma_E^2$. For this purpose we have to consider the value of the statistic $\left(\hat\beta_2/{}_h\hat\sigma_T\right)^2=11.652$ and compare it with the critical values given in Table I. We see that for ν*=12, v=1.16 (note that this parameter has little influence on the upper percent points), ${}_2\sigma_D^2/\sigma_E^2=0.25$ (corresponding to $\zeta_0^2=16\cdot0.25=4$), α=0.10 and for any value $c_0=1, 1.5, 2$, the value 11.652 is much larger than the corresponding upper percent points and allows us to accept the hypothesis $H_0$ that the effect $\beta_2$ is distinguishable even if the variability of its estimate is increased to take the random effects of the noise factors into consideration. According to the results given in Table II, the same conclusion is reached even if, instead of the complete model, we eliminate all non-significant effects.
b) A conditional indicator for more general settings ($v^2\ne n/(n-m-1)$ and $G_j'G_j$ is idempotent ∀j=1,..,q).
The tests of significance discussed in a) rest essentially on the assumption that the random errors $E_i$, (2.1), $\tilde E_i$, (2.6), i=1,2,…,n, follow the same zero-mean normal distribution and are stochastically independent of each other and of the noise variables $Z_{ji}$, j=1,2,…,q, i=1,2,…,n; for the latter, only their stochastic independence plays a role, not the type of their distribution. This second method, on the contrary, requires that the variables $Z_{ji}$ also follow the same zero-mean normal distribution. Thus the second method does not seem to be completely appropriate for the example considered in the paper, since two independent uniform distributions on the intervals $(-R_1,R_1)$, $(-R_2,R_2)$ seem more suitable to describe the modification of the response η connected with a random choice of a film strip from the whole roll. Nevertheless the second procedure will also be applied to the data of the example considered above for illustrative purposes. So we can now consider the indicator of distinguishability 1-α(⋅) of an effect $\beta_h$ defined by (4.14). We recall that α(⋅) expresses the probability of rejecting $H_0$: $\beta_h=0$, that is, of distinguishing the effect $\beta_h$ from zero also in the presence of the noise factor variability, that is when carrying out also the second experiment, given the values $\hat\beta_h$, ${}_h\hat\sigma_T$ observed in the first experiment. By applying (4.14) and resorting to the unbiased estimates of the variance components, see (3.4), (3.20) and (4.15), in the case of the complete model and for different values of $v^2$, we obtained the following results for the example under study:
v²=4/3 (∗)
        hσ̂²T            hσ̂̂²D            hσ̂̂²D/hσ̂²E*     ν̃            1-α̂(⋅)
β̂1     0.0059420573    0.0027302083    0.85004257     15.211331    0.99759854
β̂2     0.0059420573    0.0027302083    0.85004257     15.211331    0.029022987
β̂12    0.0059420573    0.0027302083    0.85004257     15.211331    0.99248746

v²=1 (∗)
        hσ̂²T            hσ̂̂²D            hσ̂̂²D/hσ̂²E*     ν̃            1-α̂(⋅)
β̂1     0.0052595052    0.0020476562    0.63753193     15.939253    0.99886658
β̂2     0.0052595052    0.0020476562    0.63753193     15.939253    0.0078390572
β̂12    0.0052595052    0.0020476562    0.63753193     15.939253    0.99509812

v²=1/9 (∗)
        hσ̂²T            hσ̂̂²D            hσ̂̂²D/hσ̂²E*     ν̃            1-α̂(⋅)
β̂1     0.0034393663    0.00022751736   0.070836881    7.4296307    1
β̂2     0.0034393663    0.00022751736   0.070836881    7.4296307    0
β̂12    0.0034393663    0.00022751736   0.070836881    7.4296307    1

(*) see Table 2, where:
• ${}_h\hat\sigma_T^2$ is the unbiased estimator of $\mathrm{Var}(\hat{\hat\beta}_h)$ observed in the first experiment and given in (3.20);
• ${}_h\hat{\hat\sigma}_D^2$ is the unbiased estimator of ${}_h\sigma_D^2$ given in (4.15), which represents the added variability due to the second experiment;
• ${}_h\hat{\hat\sigma}_D^2/{}_h\hat\sigma_{E^*}^2$ is the estimator of the error variance increment ${}_h\sigma_D^2/\sigma_{E^*}^2$ obtained by the method of moments, that is by replacing the unknown parameters with the corresponding unbiased estimators;
• $\tilde\nu$ is the approximate value of the degrees of freedom obtained by substituting the above estimator for ${}_h\sigma_D^2/{}_h\sigma_{E^*}^2$ in (4.11).
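The second and third columns of the results above follow from (4.15) with q(p+1)=6 and hx²=n=16. A minimal sketch that recomputes them for the three values of v² (names are illustrative; the inputs are the Q²j and σ̂²E values quoted earlier in this section):

```python
sigma2_E_hat = 0.051389583
n = 16
h_x2 = n
sum_q_sq = 0.00796875 + 0.82456875   # Q_1^2 + Q_2^2
nu0 = 6                              # q * (p + 1) for the complete model

sigma2_Estar_hat = sigma2_E_hat / h_x2


def unbiased_noise_component(v2):
    """(4.15): biased estimate minus its bias term; returns (estimate, ratio to E*-estimate)."""
    sigma2_D_hat = v2 / h_x2 ** 2 * sum_q_sq
    unbiased = sigma2_D_hat - sigma2_Estar_hat * (v2 / h_x2) * nu0
    return unbiased, unbiased / sigma2_Estar_hat


results = {label: unbiased_noise_component(v2)
           for label, v2 in (("4/3", 4 / 3), ("1", 1.0), ("1/9", 1 / 9))}
for label, (unbiased, ratio) in results.items():
    print(label, unbiased, ratio, unbiased + sigma2_Estar_hat)
```

The last printed quantity, the unbiased noise component plus hσ̂²E*, coincides with the hσ̂²T column, which is a convenient consistency check between (3.20) and (4.15).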
In the last column of the former table we can read the value of 1-α̂(⋅), which represents an approximation to the proposed indicator (4.14), which expresses the probability of accepting $H_0$: $\beta_h=0$, that is, of not distinguishing the effect $\beta_h$ from zero when carrying out the second experiment, that is in the presence of the noise factor variability, conditional on the results obtained from the first experiment alone. Then, if 1-α̂(⋅) is small, or at least smaller than 1-α, where α is the level of significance chosen for the underlying T-test, we can conclude that $\beta_h$ can be considered distinguishable from zero.
We can see that the results previously obtained for case a) are confirmed by this new indicator. In fact the factor $x_2$, "conditioning temperature", turns out to affect the response η even in the presence of noise factors, that is when the test strip is chosen at random on the whole roll. We can also notice that the principal characteristic of this indicator is its ability to discriminate the relevant factors from the others: in fact the value of 1-α̂(⋅) for factor $x_2$ is much smaller than those regarding $x_1$ and the interaction $x_1\cdot x_2$, see Table 2.
Furthermore, the indicator is not sensitive to changes in the value of $v^2$ ($v^2=4/3$ is considered to allow a comparison of these results with those of case a)).
Conclusion.
In conclusion, both methods lead us to accept that $x_2$, that is the conditioning temperature, has an evident effect on the mean value η of the response, strength loss under stress, Y, which is clearly distinguishable from zero even when it is assessed in relation to the overall random variability which includes the noise factor effects. In the opposite case, a possibly costly effort to control this factor at its higher value during production might appear meaningless, in view of the subsequent real utilization and behaviour of the finished product.
References
Alt, F.B., (1988), “Taguchi method for off-line quality control”, in Encyclopedia of
Statistical Sciences, Kotz & Johnson Eds., vol. 9, pp. 165-167.
Box, G., Jones, S. (1992), “Designing products that are robust to the environment”,
Total Quality Management, 3, 3, pp. 265-283.
Bulgren, W.G., (1971), “On Representations of the Doubly Non-Central F
Distribution”, Journal of the American Statistical Association, 66, 333, pp. 184-
186.
Dasgupta, T., Sarkar, N.R., Tamankar, K.G., (2002), “Using Taguchi methods to
improve a control scheme by adjustment of changeable settings: A case study”,
Total Quality Management, 13, 6, pp. 863-876.
Giovagnoli, A., Merola, G., (1994) “Parameter design for the production of unleaded
petrol”, Statistica Applicata, 6, 2, pp. 177-193.
Graybill, F.A., (1961), An introduction to linear statistical models, McGraw-Hill, New
York.
Johnson, N.L., Kotz, S., (1992), Continuous univariate distributions, Wiley, New York.
Kackar, R.N., (1985), “Off-line Quality Control, Parameter Design, and the Taguchi
Methods (with discussion)”, Journal of Quality Technology, 17, 4, pp. 176-188.
Khattree, R., (1996), “Robust Parameter Design: A Response Surface Approach”,
Journal of Quality Technology, 28, 2, pp. 187-198.
Lehmann, E. L., (1986), Testing statistical hypotheses, Wiley, New York.
Magagnoli, U., Vedaldi, R., (1990), “I metodi di Taguchi: valutazioni sui fondamenti
metodologici”. Atti della XXXV Riunione Scientifica della SIS, Cedam, Padova,
pp. 243-254.
Myers, R.H., Kim, Y., Griffiths, K., (1997), “Response Surface Methods and the use of
Noise Variables”, Journal of Quality Technology, 29, pp. 429-441.
Park, S.H., (1996), “Simultaneous multiresponse optimisation for robust designs in
quality engineering”, Atti della XXXVIII Riunione Scientifica della SIS, Maggioli
Editore, Rimini, vol. 1, pp. 273-288.
Revee, C., (1986), “An algorithm for Computing the Doubly Non-Central F C.D.F. to a
Specified Accuracy”, SED, Note 86-4, November.
Scheffé, H., (1959), The analysis of variance, Wiley, New York.
Shoemaker, A.C., Tsui, K.L., Wu, C.F.J., (1991), “Economical Experimentation
Methods for Robust Design”, Technometrics, 33, pp. 415-427.
Vining, G.G., Myers, R.H., (1990), “Combining Taguchi and Response Surface
Philosophies: A Dual Response Approach”, Journal of Quality Technology, 22,
pp. 38-45.
Zanella, A., (1992), “L’approccio statistico nell’ottimizzazione del controllo dei sistemi
produttivi”, Quaderni di Statistica e Matematica Applicata alle Scienze
Economiche e Sociali, Università degli Studi di Trento, vol. XIV, 5, pp. 157-191.
Zanella, A., Cascini, E., (1997), “I modelli di base per la progettazione ed il controllo: la
sperimentazione condotta con criteri statistici”, in Atti del XIX Convegno
Nazionale AICQ “Qualità: sfida al sistema Italia”, Milano, vol. E, pp. 43-71.
Zurmühl, R., (1961), Matrizen und ihre technischen Anwendungen, Springer-Verlag,
Berlin.