7/28/2019 The Behavior of Some Standard Statistical Tests Under
http://slidepdf.com/reader/full/the-behavior-of-some-standard-statistical-tests-under 1/70
THE BEHAVIOR OF SOME STANDARD STATISTICAL TESTS UNDER
NON-STANDARD CONDITIONS
by
Harold Hotelling
University of North Carolina,
Chapel Hill
Contents
1. Introduction and summary
2. Geometry of the Student ratio and projections on a sphere
3. Probabilities of large values of t in non-standard cases
4. Samples from a Cauchy distribution
5. Accuracy of approximation. Inequalities
6. Exact probabilities for N = 3. Cauchy distribution
7. The double and one-sided exponential distributions
8. Student ratios from certain other distributions
9. Conditions for approximation to Student distribution
10. Correlated and heteroscedastic observations. Time series
11. Sample variances and correlations
12. Inequalities for variance distributions
13. What can be done about non-standard conditions?
Bibliography
Research carried out in part under Office of Naval Research Project NR 042 031, Contract Nonr-855(06) and its predecessor contract. Reproduction is permitted for any purpose of the United States Government.
1. Introduction and Summary
In the burst of new statistical developments that have followed the
work of Sir Ronald Fisher in introducing and popularizing methods involving
exact probability distributions in place of the old approximations, questions
as to the effects of departures from the assumed normality, independence and
uniform variance have often been subordinated. It is true that much recognition
has been given to the existence of serial correlation in time series,
with the resultant vitiation of statistical tests carried out in the absence
of due precautions, and to some other special situations, such as manifestly
unequal variances in least-squares problems. Wassily Hoeffding has established
some general considerations on the role of hypotheses in statistical decisions
[...]. Also, there have been many studies of distributions of the Student
ratio, the sample variance, variance ratio, and correlation coefficient in
samples from non-normal populations. (Some are cited at the end of this
paper.) These efforts have encountered formidable mathematical difficulties,
since the distribution functions sought cannot usually, except in trivial
cases, be easily specified or calculated in terms of familiar or tabulated
functions. Because of these difficulties, mathematics has in some such
studies been supplemented or replaced by experiment ([20], [58], [54], etc.)
or, as in some important work [...], has led to approximations for which
definite error bounds are not at hand.
Practical statisticians have tended to disregard non-normality, partly
for lack of an adequate body of mathematical theory to which an appeal can
be made, partly because they think it is too much trouble, and partly because
of a hazy tradition that all mathematical ills arising from non-normality will be
cured by sufficiently large numbers. This last idea presumably stems from
central limit theorems, or rumors or inaccurate recollections of them.
Central limit theorems have usually dealt only with linear functions
of a large number of variates, and under various assumptions have proved
convergence to normality as the number increases. For a large but fixed
number the approximation of the distribution to normality is typically close
within a restricted portion of its range, but bad in the tails. Yet it is
the tails that are used in tests of significance, and the statistic used
for a test is seldom a linear function of the observations. The non-linear
statistics t, F and r in common use can be shown [9, p. 266] under
certain wide sets of assumptions to have distributions approaching the
normal form, though slowly in the tails. But in the absence of a normal
distribution in the basic population, with independence, there is a lack of
convincing evidence that the errors in these approximations are less than
in the nineteenth-century methods now supplanted, on grounds of greater
accuracy, by the new statistics.
In tails corresponding to small probabilities the behavior of t,
r and various related statistics may be examined with the help of a method
of tubes and spheres proposed in [10], [11] and [12], and applied by
Daisy Starkey in 1939 to periodogram analysis [57]; by Bradley
[4, 5, 6, 7] to t, F and multivariate $T^2$; by E. S. Keeping [29] to testing
the fit of exponential regressions; and by Siddiqui [55, 56] to the
distribution of a serial correlation coefficient.
The distribution of the Student-Fisher t, as used for testing deviations
of means in samples from various non-normal populations, and in certain
cases of observations of unequal variances and inter-correlations, will be
examined in the next few sections, with special reference to large values of
t. New exact distributions of t will be found for a few cases. A limit
for increasing t, with wide applicability, will be obtained for the ratio
of two probabilities of t being exceeded, one probability for the standard
normal theory with independence on which the tables in use are based, the
other pertaining to various cases of non-normality, dependence, and
heteroscedasticity. This limit depends on the sample size, which as it
increases provides a double limit for the ratio. This double limit, for
random samples from a non-normal population, is for many commonly considered
populations either 0 or 1. After considering the accuracy of the first limit
and exact distributions of t in certain cases of small samples, we obtain
the conditions on the population that the double limit be unity, a situation
favorable to the use of the standard table. The nature of the conditions
leads to certain remarks on proposals to modify the t-test for non-normality.
The marked effects of non-normality on the distributions of sample
standard deviations and correlation coefficients will then be shown. The
paper will conclude with a discussion of the dilemmas with which statisticians
are confronted by the anomalous behavior in many cases of
statistical methods that have become standard, and of possible means of
escape from such difficulties.
2. Geometry of the Student Ratio and Projections on a Sphere
To test the deviation of the mean $\bar{x}$ of observations $x_1, \ldots, x_N$
from a hypothetical value, which is here assumed without loss of essential
generality to be zero, the statistic appropriate in case the observations
constitute a random sample from a normal population, i.e., are independently
distributed with the same normal distribution, is
(2.1)  $t = \bar{x} N^{1/2} / s,$
where s is the positive square root of
(2.2)  $s^2 = S(x - \bar{x})^2 / n;$
S stands for summation over the sample, $\bar{x} = Sx/N$, and n = N - 1. We shall
continue to adhere to Fisher's convention that S shall denote summation
over a sample and $\Sigma$ other kinds of summation.
In space of Cartesian coordinates $x_1, \ldots, x_N$ let O be the origin
and X a random point whose coordinates are the sample values. Let $\theta$ be the
angle, between 0 and $\pi$ inclusive, made by OX with the positive extension of
the equiangular line, i.e. the line on which every point has all its
coordinates equal. It will soon be shown that
(2.3)  $t = n^{1/2} \cot\theta.$
We deal only with population distributions specified everywhere by a
joint density function. This we call $f(x_1, x_2, \ldots, x_N)$. On changing to
spherical polar coordinates by means of the N equations $x_i = \rho\xi_i$, where
$\xi_1, \ldots, \xi_N$ are functions of angular coordinates $\phi_1, \ldots, \phi_{N-1}$
and satisfy identically
(2.4)  $\xi_1^2 + \cdots + \xi_N^2 = 1,$
while the radius vector $\rho$ is the positive square root of
(2.5)  $\rho^2 = x_1^2 + \cdots + x_N^2;$
the Jacobian by which f is multiplied is of the form $\rho^{N-1}$ multiplied by a
function of the angular coordinates. Thus the element of probability is
(2.6)  $f(\rho\xi_1, \ldots, \rho\xi_N)\, \rho^{N-1}\, d\rho\, dS,$
where dS is the element of area on the unit sphere.
Let the random point X be projected onto the unit sphere by the
radius vector OX into the unique point $\xi$ on the same side of O as X.
The Cartesian coordinates of $\xi$ are $\xi_1, \ldots, \xi_N$, and its probability
density on the sphere is
(2.7)  $D_N(\xi) = \int_0^\infty f(\rho\xi_1, \ldots, \rho\xi_N)\, \rho^{N-1}\, d\rho,$
in accordance with (2.6).
Any statistic that is a continuous function of the N observations,
that is, of the coordinates throughout a region of the N-space, determines
through each point of this region what Fisher has called an isostatistical
surface, on all points of which the statistic takes the same constant value.
If, as in the cases of t, F and r, the statistic remains invariant when all
observations are multiplied by any one positive constant, the isostatistical
surface is a cone with vertex at the origin, or a nappe of such a cone. The
probability that the statistic lies between specified values is the integral
of (2.7) over the portion of the unit sphere between its intersections with
the corresponding cones.
Since t is unaffected by the division of each $x_i$ by $\rho$, and therefore
by its replacement by the corresponding $\xi_i$, we may rewrite (2.1) and (2.2) in
terms of the $\xi_i$'s and then simplify with the help of (2.4). We also use
the notation
(2.8)  $a = \xi_1 + \cdots + \xi_N = S\xi,$
and obtain
(2.9)  $t = a (N - a^2)^{-1/2} (N - 1)^{1/2}.$
Thus the locus on the unit sphere for which t is constant is also a locus
on which $S\xi$ has the constant value a. This locus is therefore a sub-sphere
of intrinsic dimensionality N - 2 = n - 1, that of the original unit sphere
being n = N - 1.
The distance between O and the hyperplane (2.8) is $a N^{-1/2}$, and also
equals $\cos\theta$. From these facts (2.3) easily follows.
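The equivalence of the three expressions for t in this section can be checked numerically. The following is a minimal sketch, not part of the original derivation: the function names are illustrative, and the relation t = n^(1/2) cot(theta) is the form of (2.3) that agrees with (2.9) and with the later special case for n = 2.

```python
import math
import random

def student_t(x):
    """Student ratio (2.1): mean * sqrt(N) / s, with s^2 = S(x - mean)^2 / (N - 1)."""
    N = len(x)
    mean = sum(x) / N
    s2 = sum((xi - mean) ** 2 for xi in x) / (N - 1)
    return mean * math.sqrt(N) / math.sqrt(s2)

def t_from_angle(x):
    """sqrt(n) * cot(theta), theta being the angle between OX and the equiangular line."""
    N = len(x)
    rho = math.sqrt(sum(xi * xi for xi in x))
    a = sum(x) / rho                       # a = S xi_i for the projected point
    cos_theta = a / math.sqrt(N)           # distance from O to the hyperplane (2.8)
    sin_theta = math.sqrt(1.0 - cos_theta ** 2)
    return math.sqrt(N - 1) * cos_theta / sin_theta

def t_from_a(x):
    """Formula (2.9): t = a * sqrt(N - 1) / sqrt(N - a^2)."""
    N = len(x)
    rho = math.sqrt(sum(xi * xi for xi in x))
    a = sum(x) / rho
    return a * math.sqrt(N - 1) / math.sqrt(N - a * a)

random.seed(0)
sample = [random.gauss(0.3, 1.0) for _ in range(6)]   # arbitrary illustrative sample
print(student_t(sample), t_from_angle(sample), t_from_a(sample))
```

The three printed values should agree to rounding error for any sample whose observations are not all equal.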
The probability that t is greater than a constant T, which we shall
write P(t > T), is the integral of (2.7) over a spherical cap. In the
case of independent observations all having the same central normal
distribution, a case for which we attach a star to each of the symbols f,
D and P already introduced, we readily find
(2.10)  $D_N^* = \Gamma(N/2) / (2\pi^{N/2}),$
independently of $\xi$.
The constancy of probability density over the sphere in this case,
with the necessity that the integral over the whole sphere must in every case
be unity, provides a ready derivation of the (N - 1)-dimensional "area" of
a unit sphere, which is at once seen to be $1/D_N^*$, that is
(2.11)  $2\pi^{N/2} / \Gamma(N/2).$
This is one of several contributions of statistics to geometry arising in
connection with contributions of geometry to statistics.
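As an illustration of this derivation, the constant density on the sphere can be obtained by integrating the normal density radially and then inverted to give the area of the unit sphere. This is a sketch under stated assumptions: unit variance is taken for convenience (the result does not depend on it), the midpoint rule and the cutoff `upper` are choices made here, and the closed forms are the standard ones for the normal case.

```python
import math

def sphere_area(N):
    """(N-1)-dimensional area of the unit sphere in N-space: 2 pi^(N/2) / Gamma(N/2)."""
    return 2.0 * math.pi ** (N / 2) / math.gamma(N / 2)

def normal_projected_density(N, steps=20000, upper=40.0):
    """Midpoint-rule value of integral_0^inf (2 pi)^(-N/2) exp(-rho^2 / 2) rho^(N-1) d rho.

    This is the radial integral (2.7) for N independent standard normal
    observations; the integrand does not depend on the direction xi.
    """
    h = upper / steps
    total = 0.0
    for k in range(steps):
        rho = (k + 0.5) * h
        total += math.exp(-0.5 * rho * rho) * rho ** (N - 1)
    return total * h * (2.0 * math.pi) ** (-N / 2)

for N in (2, 3, 5):
    # density times area should be 1, since the density is constant on the sphere
    print(N, normal_projected_density(N) * sphere_area(N))
```

For N = 2 and N = 3 the area function returns the familiar circumference and surface area of the unit circle and unit sphere.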
The N-dimensional "volume" enclosed by such a sphere is found by
integrating with respect to r from 0 to 1 the product of (2.11) by $r^{N-1}$,
and therefore equals
(2.12)  $2\pi^{N/2} / [N \Gamma(N/2)].$
Evaluation of the probability of t being exceeded in this standard
normal case is equivalent to finding the area of a spherical cap, a problem
solved by Archimedes for the case N = 3. A geometrical solution in any
number of dimensions may be obtained by noticing that the subsphere of
N - 2 = n - 1 dimensions considered above for a particular value of t has
radius $\sin\theta$ and that its "area" is proportional to the (n - 1)st power of
this radius. By considering a "zone" of breadth $d\theta$ obtained by
differentiating (2.3), the Student distribution may be reached in the form obtained
by Fisher [12] in 1926 by quite different reasoning and for a broader
class of uses,
(2.13)  $g^*(t)\, dt = \frac{\Gamma(N/2)}{(n\pi)^{1/2}\, \Gamma(n/2)} (1 + t^2/n)^{-N/2}\, dt.$
The integral of this from T to $\infty$ is the probability P(t > T). This must
of course be doubled to get the usual two-tailed probability.
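A short numerical sketch of the standard Student density and its upper tail may help fix the notation. It assumes the usual form of the density with n = N - 1 degrees of freedom; the substitution t = T + u/(1 - u) and the midpoint rule are conveniences chosen here, not part of the original argument.

```python
import math

def student_density(t, N):
    """Standard Student density for samples of N, i.e. n = N - 1 degrees of freedom."""
    n = N - 1
    c = math.gamma(N / 2) / (math.sqrt(n * math.pi) * math.gamma(n / 2))
    return c * (1.0 + t * t / n) ** (-N / 2)

def upper_tail(T, N, steps=20000):
    """One-tailed probability P(t > T), by the midpoint rule with t = T + u / (1 - u)."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        u = (k + 0.5) * h
        t = T + u / (1.0 - u)
        total += student_density(t, N) / (1.0 - u) ** 2   # dt = du / (1 - u)^2
    return total * h

# Two-tailed probability at the tabulated 5 per cent point for N = 2
print(2.0 * upper_tail(12.706, 2))
```

For N = 2 the density reduces to 1 / (pi (1 + t^2)), so the one-tailed probability can be compared with arctan(1/T) / pi.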
3. Probabilities of large values of t in non-standard cases.
The probability P(t > T) when T > 0 is the integral of the n-dimensional
density $D_N(\xi)$ over a spherical cap. This cap is the locus of points
on the unit sphere in N-space whose distance from the equiangular line is
less than $\sin\theta$, where $T = n^{1/2} \cot\theta$ and $0 \leq \theta < \pi/2$.
Central to such a cap is the point A on the unit sphere at which all the
Cartesian coordinates are equal and positive, and are therefore all equal to
$N^{-1/2}$. The projected density (2.7) thus takes at A the value
(3.1)  $D_N(A) = \int_0^\infty f(\rho N^{-1/2}, \ldots, \rho N^{-1/2})\, \rho^{N-1}\, d\rho.$
A similar situation holds at the point A' diametrically opposite to A,
whose coordinates are all $-N^{-1/2}$.
We consider the approximation to P(t > T) consisting of the product
of the n-dimensional area of the cap centered at A by the density at A.
If corresponding to two different population distributions both
the $D_N$ functions are continuous, the limit of the ratio of these as T
increases, if it exists, equals the limit of the ratios of the probabilities
in the two cases of the same event t > T. This also equals the limit of
the ratio of the two probability densities of t when t increases, provided
that this last limit exists, as L'Hospital's rule shows.
We are particularly concerned with comparing P(t > T) for
various populations with the probability of the same event for a normally
distributed population with zero mean and a fixed variance. Using a star
attached to P or D to identify this case of normality, we introduce for
each alternative population distribution for which the indicated limits
exist the following notation:
(3.2)  $R_N = \lim_{T \to \infty} \frac{P(t > T)}{P^*(t > T)} = \frac{D_N(A)}{D_N^*} = \frac{2\pi^{N/2} D_N(A)}{\Gamma(N/2)},$
in accordance with (2.10). By L'Hospital's rule,
(3.3)  $R_N = \lim_{t \to \infty} g(t) / g^*(t),$
where g(t) is the probability density of t in the non-standard case and
g*(t) is given by the standard t density (2.13).
We also put
$R = \lim_{N \to \infty} R_N.$
If $R_N$ is less than unity, a standard one-tailed t-test of significance
on a sufficiently stringent level, i.e. corresponding to a sufficiently
small probability of rejecting a null hypothesis because t exceeds
a standard level given in tables of the Student distribution, will in
samples of N overstate the probability of the Type I error of wrongfully
rejecting the null hypothesis. If $R_N > 1$, the probability of such an error
is understated. Such biases exist even for very large samples if $R \neq 1$.
Negative values of t may be treated just as are positive ones above,
replacing A by the diametrically opposite point A', with only trivially
different results. For symmetrical two-tailed t-tests, $D_N(A)$ is to be
replaced by its average with $D_N(A')$.
In random samples of N from a population with density f(x), i.e.
with independence, (3.1) takes the form
(3.4)  $D_N(A) = \int_0^\infty [f(\rho N^{-1/2})]^N \rho^{N-1}\, d\rho = N^{N/2} \int_0^\infty [f(z)]^N z^{N-1}\, dz.$
Substituting this in (3.2) yields
(3.5)  $R_N = 2 (\pi N)^{N/2} [\Gamma(N/2)]^{-1} \int_0^\infty [f(z)]^N z^{N-1}\, dz.$
4. Samples from a Cauchy distribution
When the Cauchy density
(4.1)  $f(x) = \pi^{-1} (1 + x^2)^{-1}$
is substituted in (3.5) the integral may be reduced to a beta function by
substitution of a new letter for $z^2/(1 + z^2)$, and we find
(4.2)  $R_N = (N/\pi)^{N/2}\, \Gamma(N/2) / \Gamma(N).$
For large samples we find with the help of Stirling's formula that $R_N$
approximates
$2^{1/2} (e / 2\pi)^{N/2}$
and thus approaches zero. For N = 2, 3, 4, the respective values of $R_N$ are
(4.3)  .6366, .4134, .2703.
The fact that these are markedly less than unity points to substantial
overestimation of probabilities of getting large values of t when a population
really of the Cauchy type is treated as if it were normal.
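The limiting ratio for the Cauchy case can be evaluated both from the closed form (4.2) and by direct numerical integration of the general expression (3.5). The sketch below is illustrative only: the function names, the change of variable z = u/(1 - u) and the midpoint rule are assumptions made here. As a sanity check, a normal density passed to the same routine should give a ratio of 1.

```python
import math

def r_cauchy_closed(N):
    """Formula (4.2): R_N = (N / pi)^(N/2) * Gamma(N/2) / Gamma(N)."""
    return (N / math.pi) ** (N / 2) * math.gamma(N / 2) / math.gamma(N)

def r_numeric(f, N, steps=50000):
    """Formula (3.5) by the midpoint rule, with z = u / (1 - u) mapping (0, 1) onto (0, inf)."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        u = (k + 0.5) * h
        z = u / (1.0 - u)
        total += f(z) ** N * z ** (N - 1) / (1.0 - u) ** 2   # dz = du / (1 - u)^2
    return 2.0 * (math.pi * N) ** (N / 2) / math.gamma(N / 2) * total * h

def cauchy(x):
    """Cauchy density (4.1)."""
    return 1.0 / (math.pi * (1.0 + x * x))

def normal(x):
    """Standard normal density, used only as a check that R_N = 1 in the standard case."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

for N in (2, 3, 4):
    print(N, r_cauchy_closed(N), r_numeric(cauchy, N), r_numeric(normal, N))
```

The same `r_numeric` routine can be applied to any other population density for which the integral in (3.5) converges.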
A desirable but difficult enterprise is to provide for these
approximations, which are limits as t increases to infinity, definite and
close upper and lower bounds for a fixed t. This can be done for samples
of any size from a Cauchy distribution by a method that will be outlined
below, but the algebra is lengthy except for very small samples. The case
of the double exponential, or Laplace, distribution is more tractable, as
will be seen later.
For N = 2 the exact distribution of t, which then equals
$(x_1 + x_2) / |x_1 - x_2|$, is readily found directly by change of variables in
$f(x_1) f(x_2)$ and integration. On the basis of the Cauchy population (4.1)
this yields
(4.4)  $\frac{1}{\pi^2 t} \log \left| \frac{t + 1}{t - 1} \right| dt.$
To prove this, we observe that |t| has the same distribution as |u|,
where
$u = (x + y)/(x - y)$
and the joint distribution of x and y is specified by
$\pi^{-2} (1 + x^2)^{-1} (1 + y^2)^{-1}\, dx\, dy.$
Substituting
$x = y(u + 1)/(u - 1), \quad dx = -2y (u - 1)^{-2}\, du,$
we have for the joint distribution of y and u,
$2\pi^{-2} |y| [(u - 1)^2 + y^2 (u + 1)^2]^{-1} (1 + y^2)^{-1}\, dy\, du.$
In spite of a singularity at the point u = 1, y = 0 this may be integrated
with respect to y over all real values. The result, when t is substituted
for u, is (4.4).
The density given by (4.4) is infinite for $t = \pm 1$, but is elsewhere
continuous, and is integrable over any interval. Upon expanding in a series
of powers of $t^{-1}$ and integrating, it is seen that the probability that any
t > 1 should be exceeded in absolute value is given by the series
(4.5)  $\frac{4}{\pi^2} \left( \frac{1}{t} + \frac{1}{3^2 t^3} + \frac{1}{5^2 t^5} + \cdots \right),$
which converges even for t = 1, and then takes the same value 1/2* as the
corresponding probability for t = 1 based on a fundamentally normal population,
for which the distribution (2.13) becomes
(4.6)  $\frac{dt}{\pi (1 + t^2)}.$
Expansion of (4.6) in powers of $t^{-1}$ and integration gives for the
probability corresponding to (4.5),
(4.7)  $\frac{2}{\pi} \left( \frac{1}{t} - \frac{1}{3 t^3} + \frac{1}{5 t^5} - \cdots \right).$
* The formula $\Sigma\, n^{-2} = \pi^2/8$, with n running over the odd positive integers, used here may be derived by integrating from 0 to $\pi/2$ the well known Fourier series $\sin x + (\sin 3x)/3 + (\sin 5x)/5 + \cdots = \pi/4$.
The ratio of (4.5) to (4.7), or of (4.4) to (4.6), approaches $2/\pi$ when t
increases, agreeing with the result for N = 2 in (4.3), but exceeds this
limit for finite values of t.
The 5 per cent point for t based on normal theory, i.e., the value
of t for which the probability (4.7) equals .05, is 12.706. When this
value of t is substituted in (4.5) the resulting exact probability is found
to be .0218. The approximation obtained by multiplying $R_2$ from (4.3) by
the originally chosen probability .05 is .0318.
The true 5 per cent point when the underlying population is of the
Cauchy form is 8.119.
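The percentage points quoted here can be recovered from the two tail series by a simple root search. The sketch below is a hedged illustration: it assumes the two-tailed series (4.5) has coefficient 4/pi^2 with terms 1/((2k+1)^2 t^(2k+1)), uses the closed form (2/pi) arctan(1/T) for the sum of the normal-theory series (4.7), and the bisection routine and its bracketing values are choices made here.

```python
import math

def cauchy_two_tail(T, terms=200):
    """Series (4.5): P(|t| > T) for samples of two from a Cauchy population, T > 1."""
    x = 1.0 / T
    power = x                      # holds T^-(2k+1), updated term by term
    total = 0.0
    for k in range(terms):
        total += power / (2 * k + 1) ** 2
        power *= x * x
    return 4.0 / math.pi ** 2 * total

def normal_two_tail(T):
    """Sum of series (4.7): P(|t| > T) under normal theory with n = 1."""
    return 2.0 / math.pi * math.atan(1.0 / T)

def percent_point(prob, tail, lo=1.0001, hi=1.0e6):
    """Bisection for the T at which a decreasing tail function equals prob."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if tail(mid) > prob:
            lo = mid               # tail still too large: the root lies further out
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(percent_point(0.05, normal_two_tail))   # normal-theory 5 per cent point
print(percent_point(0.05, cauchy_two_tail))   # 5 per cent point for the Cauchy case
for T in (5.0, 50.0, 5000.0):
    # ratio of the two tail probabilities, to be compared with 2 / pi
    print(T, cauchy_two_tail(T) / normal_two_tail(T))
```

The ratio in the last loop stays above 2/pi and decreases toward it as T grows, in line with the remark that the limit is exceeded for finite t.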
When we go further out in the tail we may expect a better approximation
in using the product of $R_2$ by the "normal theory" probability to
estimate the true "Cauchy" theory probability, and indeed this is soon
verified: the one per cent point from standard normal theory tables for
n = 1, i.e. N = 2 in the present case, is 63.657. The approximation to the
true Cauchy probability obtained on multiplying the chosen "normal theory"
probability .01 by $R_2$ is .006366. The true value given by (4.5) is .006364.
Thus the approximations by the $R_2$ method are rough at the five per cent
point, but quite satisfactory at the one per cent point or beyond.
5. Accuracy of approximation. Inequalities
Various inequalities may be obtained relevant to the accuracy of the
approximations that must usually be used to estimate the probabilities
associated with standard statistical tests under non-standard conditions.
For the Cauchy distribution and others similarly expressible by
rational functions, and some others, the theorem that the geometric mean of
positive quantities cannot exceed the arithmetic mean can be brought into
play. Thus when the n-dimensional density $D(\xi)$ on the unit sphere is
specified by substituting in (2.7) for f the product of N Cauchy functions
(4.1) with different arguments, the result,
(5.1)  $D(\xi) = \pi^{-N} \int_0^\infty [(1 + \rho^2\xi_1^2) \cdots (1 + \rho^2\xi_N^2)]^{-1} \rho^{N-1}\, d\rho,$
in which $S\xi^2 = 1$ and $S\xi = a$, can be shown to exceed D(A), in which each
$\xi_i^2$ in (5.1) is replaced by $N^{-1}$ and which is given by (3.5). This readily
follows from the fact that the product within the square brackets in
(5.1) is the Nth power of the geometric mean of these binomials, and is
therefore not greater than the Nth power of their arithmetic mean $1 + \rho^2/N$. Since
the density on the sphere thus takes a minimum value at A, the estimate
of the tail probability obtained by multiplying the area of the polar cap
by the density at its center A, equivalent to multiplying the tail probability
given by "normal theory" by the factors $R_N$ given in (4.2) and (4.3), is
an underestimate. A numerical example of this is at the end of Sec. 4.
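The minimum property of the density at A can be illustrated numerically. The following sketch evaluates the integral (5.1) for N = 4 at A and at a few random points of the unit sphere; the quadrature (midpoint rule after the substitution rho = u/(1 - u)), the choice N = 4, and the random seed are all assumptions of this illustration. The value at A is compared with the closed form implied by (3.4) and the beta integral used for (4.2).

```python
import math
import random

def cauchy_sphere_density(xi, steps=20000):
    """Midpoint-rule value of (5.1), with rho = u / (1 - u) mapping (0, 1) onto (0, inf)."""
    N = len(xi)
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        u = (k + 0.5) * h
        rho = u / (1.0 - u)
        prod = 1.0
        for x in xi:
            prod *= 1.0 + (rho * x) ** 2
        total += rho ** (N - 1) / prod / (1.0 - u) ** 2   # d rho = du / (1 - u)^2
    return total * h / math.pi ** N

def random_unit_vector(N, rng):
    """A point on the unit sphere in N-space, from normalized Gaussian coordinates."""
    v = [rng.gauss(0.0, 1.0) for _ in range(N)]
    r = math.sqrt(sum(c * c for c in v))
    return [c / r for c in v]

N = 4
density_at_A = cauchy_sphere_density([N ** -0.5] * N)
# Closed form at A: N^(N/2) pi^(-N) Gamma(N/2)^2 / (2 Gamma(N))
closed_at_A = N ** (N / 2) * math.gamma(N / 2) ** 2 / (2 * math.gamma(N) * math.pi ** N)

rng = random.Random(7)
others = [cauchy_sphere_density(random_unit_vector(N, rng)) for _ in range(4)]
print(density_at_A, closed_at_A, others)
```

Every value in `others` should be at least as large as the value at A, which is the arithmetic-geometric mean argument in numerical form.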
Another kind of inequality may be used to set upper bounds on the
difference between the probability densities on the sphere at different
points, and from these may be derived others setting lower bounds. Consider
for example samples of three from a Cauchy population, and instead of
(5.1) use the notation $D_{000}$. This will be changed to $D_{100}$ if $\xi_1$ is
changed to a new value $\xi_1'$ while $\xi_2$ and $\xi_3$ remain unchanged. In relation
to a new set of values $\xi_1'$, $\xi_2'$, $\xi_3'$ and the old set $\xi_1$, $\xi_2$, $\xi_3$, we shall
use D with three subscripts, each subscript equal to 0 if the corresponding
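argument is an old $\xi$, or to 1 if it is a new $\xi'$. Now
$D_{100} - D_{000} = \pi^{-N} (\xi_1^2 - {\xi_1'}^2) \int_0^\infty [(1 + \rho^2{\xi_1'}^2)(1 + \rho^2\xi_1^2) \cdots (1 + \rho^2\xi_N^2)]^{-1} \rho^{N+1}\, d\rho.$
This difference is of the same sign as $\xi_1^2 - {\xi_1'}^2$. The integrand on the right
is the product of that of $D_{000}$ by $\rho^2/(1 + \rho^2{\xi_1'}^2)$. This last factor is less
than ${\xi_1'}^{-2}$ for all positive values of $\rho$. Thus we find
$|D_{100} - D_{000}| < |\xi_1^2 - {\xi_1'}^2|\, {\xi_1'}^{-2} D_{000}.$
In the same way,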
and
Combining these results gives
and
6. Exact probabilities for N = 3: Cauchy distribution
For samples of three or more from non-normal populations the
direct method used for samples of two in Section 4 leads usually to almost
inextricable difficulties. Another method, particularly simple for finding
the portion of the distribution of t pertaining to values of t greater
than n, will be applied in this section to samples of three from the
Cauchy distribution, and in the next section to samples from the double
exponential. We continue to use what may seem an odd dualism of notation,
with N for the sample number and n = N - 1, in dealing with the Student ratio,
partly in deference to differing traditions, but primarily because some
of the algebra is simpler in terms of the sample number and some in terms
of the number of degrees of freedom.
It is easy to prove the following lemma, in which we continue
to use the notation $x_i$ for the ith observation, $x_i = \rho\xi_i$ (i = 1, ..., N),
$\rho > 0$, $S\xi^2 = 1$, $a = S\xi$:
LEMMA: If $a \geq n^{1/2}$, then all $\xi_i \geq 0$. On the other hand, if all
$\xi_i \geq 0$, then $a \geq 1$.
If the first statement were not true, one or more of the $\xi$'s
must be negative; then since the sum of the $\xi$'s is positive, there must
be a subset of them, say $\xi_1, \ldots, \xi_r$, with $1 \leq r < N$, such that
$a < \xi_1 + \cdots + \xi_r, \quad \xi_1^2 + \cdots + \xi_r^2 < 1.$
The maximum possible value of this upper bound for a, subject to the
last condition, is $r^{1/2}$. Since r is an integer less than N, this leads
to $a < n^{1/2}$, which contradicts the hypothesis. The second part of the
lemma follows from the relations
$a^2 = (S\xi)^2 = S\xi^2 + 2S\xi_i\xi_j, \quad S\xi^2 = 1,$
and, because all the $\xi$'s are non-negative, $S\xi_i\xi_j \geq 0$.
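Both halves of the lemma are easy to test on randomly generated points of the unit sphere. The sketch below is only a numerical illustration, not a proof: the sample size N = 4, the two sampling schemes (uniform directions, and directions clustered around the equiangular line so that large values of a actually occur), the seed and the tolerance are assumptions made here.

```python
import math
import random

def unit_vector(v):
    """Scale v to unit length, giving a point xi on the unit sphere."""
    r = math.sqrt(sum(c * c for c in v))
    return [c / r for c in v]

def check_lemma(xi, tol=1e-12):
    """True if xi is consistent with both statements of the lemma."""
    n = len(xi) - 1
    a = sum(xi)
    # First statement: a >= sqrt(n) implies every coordinate is non-negative.
    first = a < math.sqrt(n) - tol or all(x >= -tol for x in xi)
    # Second statement: all coordinates non-negative implies a >= 1.
    second = (not all(x >= 0.0 for x in xi)) or a >= 1.0 - tol
    return first and second

rng = random.Random(1)
N = 4
points = [unit_vector([rng.gauss(0.0, 1.0) for _ in range(N)]) for _ in range(20000)]
# Points near the equiangular line, where a is close to its maximum sqrt(N)
points += [unit_vector([1.0 + 0.4 * rng.gauss(0.0, 1.0) for _ in range(N)])
           for _ in range(20000)]

print(all(check_lemma(p) for p in points))
print(sum(1 for p in points if sum(p) >= math.sqrt(N - 1)), "points with a >= sqrt(n)")
```

The second printed line confirms that the first statement of the lemma is actually exercised by the sample rather than holding vacuously.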
The Cauchy distribution (4.1) generates on the unit sphere
in N-space the n-dimensional density
(6.1)  $D_N(\xi) = \pi^{-N} \int_0^\infty [(1 + \rho^2\xi_1^2) \cdots (1 + \rho^2\xi_N^2)]^{-1} \rho^{N-1}\, d\rho.$
This may be decomposed into partial fractions and integrated by elementary
methods. The process is slightly more troublesome for even than
for odd values of N because of the infinite limit, but the integral
converges to a finite positive value for every positive integral N.
For N = 3 the expression under the integral sign equals
$a_1 (1 + \rho^2\xi_1^2)^{-1} + a_2 (1 + \rho^2\xi_2^2)^{-1} + a_3 (1 + \rho^2\xi_3^2)^{-1},$
where
(6.2)  $a_1 = -\xi_1^2 / [(\xi_1^2 - \xi_2^2)(\xi_1^2 - \xi_3^2)],$
and $a_2$ and $a_3$ may be obtained from this by cyclic permutations.
Since
$\int_0^\infty (1 + \rho^2\xi^2)^{-1}\, d\rho = \tfrac{1}{2}\pi / |\xi|,$
we find
(6.3)  $D_3(\xi) = (2\pi^2)^{-1} (a_1/|\xi_1| + a_2/|\xi_2| + a_3/|\xi_3|).$
When this sum is evaluated with the help of (6.2) and reduced to a single
fraction, both numerator and denominator equal determinants of the form
known as alternants, and their quotient is a symmetric function of the
form called by Muir a bialternant; this gives a form of (6.1) for all odd
values of N. In the present case we obtain after cancellation of the
common factors,
(6.4)  $D_3(\xi) = \{2\pi^2 [(|\xi_1| + |\xi_2|)(|\xi_1| + |\xi_3|)(|\xi_2| + |\xi_3|)]\}^{-1}$
as the density on an ordinary sphere. This density is continuous excepting
at the six singular points where two observations, and hence two $\xi$'s, are
zero. At each of these six points, $t = \pm 1$.
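The closed form of the density for N = 3 can be compared with a direct quadrature of the integral (6.1). In this sketch the closed form is taken to be the reciprocal of 2 pi^2 times the product of the pairwise sums of the absolute values of the coordinates, which is what the partial-fraction reduction described above yields; the test point, the quadrature and the function names are assumptions of the illustration.

```python
import math

def density_closed(xi):
    """Closed form for N = 3: 1 / (2 pi^2 (|x1|+|x2|)(|x1|+|x3|)(|x2|+|x3|))."""
    p, q, r = (abs(x) for x in xi)
    return 1.0 / (2.0 * math.pi ** 2 * (p + q) * (p + r) * (q + r))

def density_numeric(xi, steps=50000):
    """Midpoint-rule value of (6.1) for N = 3, with rho = u / (1 - u)."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        u = (k + 0.5) * h
        rho = u / (1.0 - u)
        prod = 1.0
        for x in xi:
            prod *= 1.0 + (rho * x) ** 2
        total += rho ** 2 / prod / (1.0 - u) ** 2
    return total * h / math.pi ** 3

xi = (0.6, 0.48, 0.64)            # on the unit sphere: 0.36 + 0.2304 + 0.4096 = 1
print(density_closed(xi), density_numeric(xi))

# At the point A the density, multiplied by the sphere area 4 pi, gives R_3
A = (3 ** -0.5,) * 3
print(4.0 * math.pi * density_closed(A), 3 ** 1.5 / (4.0 * math.pi))
```

The last line reproduces the value of the limiting ratio for N = 3 obtained from (4.2).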
We shall use on the unit sphere polar coordinates $\theta$, $\phi$ with the
equiangular line as polar axis, and $\theta$ the angle between this axis and a
random point having the density (6.4). For samples of three, n = 2, and
so we have from (2.3),
(6.5)  $t = 2^{1/2} \cot\theta.$
The element of area on the sphere is $\sin\theta\, d\theta\, d\phi$. When this is multiplied
by the density (6.4), with the $\xi$'s replaced by appropriate functions of
the polar coordinates, and then integrated with respect to $\phi$ from 0 to
$2\pi$, the result will be the element of probability for $\theta$, which will be
transformed through (6.5) into that for t.
Let $y_1$, $y_2$, $y_3$ be rectangular coordinates of which the third is
measured along the old equiangular line. These are related to the old
coordinates $\xi$ by an orthogonal transformation
(6.6)  $\xi_i = c_{i1} y_1 + c_{i2} y_2 + c_{i3} y_3, \quad i = 1, 2, 3,$
in which, since $y_3$ is measured along the equiangular line, $c_{i3} = 3^{-1/2}$.
The orthogonality of the transformation implies the following conditions,
in which the sums are with respect to k from 1 to 3:
(6.7)  $\Sigma c_{kj}^2 = 1, \quad \Sigma c_{kj} = 0, \quad \Sigma c_{k1} c_{k2} = 0 \quad (j = 1, 2).$
The orthogonality also implies that
(6.8)  $c_{i1} c_{j1} + c_{i2} c_{j2} + c_{i3} c_{j3} = 0 \quad (i \neq j).$
On the sphere, whose equation in terms of the y's is of course $\Sigma y_k^2 = 1$,
the transformation to the polar coordinates may be written
(6.9)  $y_1 = \sin\phi \sin\theta, \quad y_2 = \cos\phi \sin\theta, \quad y_3 = \cos\theta.$
The five conditions (6.7) on the six unspecified $c_{ij}$'s are sufficient
for the orthogonality of the transformation (6.6), so one degree of
freedom remains in choosing the $c_{ij}$'s. This degree of freedom permits
adding an arbitrary constant to $\phi$, thus changing at will the point on a
circle $\theta$ = constant from which $\phi$ is measured. In going around one of
these circles, $\phi$ always ranges over an interval of length $2\pi$. This degree
of freedom will be used later to simplify a complicated expression by taking
$c_{32} = 0$.
We shall now derive as a single integral and as a series the probability
$P(t \geq T)$ that the Student t should in a random sample of three
from the Cauchy distribution (4.1) exceed a number T which is itself
greater than 2. We thus confine attention to values of t greater than 2,
and these correspond to values of a greater than $2^{1/2}$, as is seen by
putting N = 3 and $a = 2^{1/2}$ in (2.9), which specifies a monotonic increasing
relation between a and t. Then from the lemma at the beginning of this
section it follows that all the observations and all the $\xi$'s are positive
in the samples with which we are now concerned, as well as in some others.
The positiveness of the $\xi$'s permits for the present purpose the
discarding of the absolute value signs in (6.4). The resulting cubic
expression may, since
(6.10)  $\xi_1 + \xi_2 + \xi_3 = a,$
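be written
(6.11)  $(a - \xi_1)(a - \xi_2)(a - \xi_3).$
With the help of (6.10) and the relation, derived from (6.10) and $S\xi^2 = 1$,
(6.12)  $\xi_1\xi_2 + \xi_1\xi_3 + \xi_2\xi_3 = \tfrac{1}{2}(a^2 - 1),$
the cubic (6.11) reduces to
(6.13)  $\tfrac{1}{2} a (a^2 - 1) - \xi_1\xi_2\xi_3.$
With the new symbols
(6.14)  $u_i = c_{i1} y_1 + c_{i2} y_2 \quad (i = 1, 2, 3),$
the transformation (6.6) can be rewritten
(6.15)  $\xi_i = u_i + 3^{-1/2} y_3 = u_i + a/3,$
the last equation holding because solving (6.6) for $y_3$ gives
$y_3 = 3^{-1/2} (\xi_1 + \xi_2 + \xi_3) = 3^{-1/2} a.$
From (6.14), (6.7) and (6.9) it will be seen that
(6.16)  $\Sigma u_k = 0, \quad \Sigma u_k^2 = y_1^2 + y_2^2 = \sin^2\theta.$
From (6.16), by reasoning analogous to that leading to (6.12), we find
(6.17)  $u_1 u_2 + u_1 u_3 + u_2 u_3 = -\tfrac{1}{2} \sin^2\theta.$
From (6.14),
(6.18)  $u_1 u_2 u_3 = k_0 y_1^3 + k_1 y_1^2 y_2 + k_2 y_1 y_2^2 + k_3 y_2^3,$
where
$k_0 = c_{11} c_{21} c_{31}, \quad k_1 = c_{11} c_{21} c_{32} + c_{11} c_{22} c_{31} + c_{12} c_{21} c_{31},$
$k_2 = c_{11} c_{22} c_{32} + c_{12} c_{21} c_{32} + c_{12} c_{22} c_{31}, \quad k_3 = c_{12} c_{22} c_{32}.$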
In each term of $k_1$, the first two factors may be eliminated with the
help of (6.8), which may also be written
$c_{i1} c_{j1} + c_{i2} c_{j2} = -\tfrac{1}{3} \quad (i, j = 1, 2, 3;\ i \neq j).$
The linear terms thus introduced cancel out in accordance with the second
of (6.7). The three remaining terms are each equal to $-k_3$. Thus
(6.19)  $k_1 = -3k_3.$
Likewise, $k_2 = -3k_0$.
At this point a substantial simplification is introduced by using
the one available degree of freedom to make $c_{32} = 0$, thereby making $k_3$
and $k_1$ vanish. The orthogonal matrix in which $c_{ij}$ is the element in the
ith row and jth column is now
$\begin{pmatrix} 6^{-1/2} & -2^{-1/2} & 3^{-1/2} \\ 6^{-1/2} & 2^{-1/2} & 3^{-1/2} \\ -2 \times 6^{-1/2} & 0 & 3^{-1/2} \end{pmatrix}.$
Thus $k_0 = -2^{-1/2} 3^{-3/2}$, $k_3 = 0$. From (6.18) we now find, with the help of
(6.9),
$u_1 u_2 u_3 = k_0 (y_1^3 - 3 y_1 y_2^2) = k_0 \sin^3\theta (\sin^3\phi - 3 \sin\phi \cos^2\phi)$
$= 2^{-1/2} 3^{-3/2} \sin^3\theta \sin 3\phi.$
From (6.15), (6.16), (6.17), and (6.19) we find
$\xi_1 \xi_2 \xi_3 = (a/3 + u_1)(a/3 + u_2)(a/3 + u_3)$
$= a^3/27 - (a/6) \sin^2\theta + 2^{-1/2} 3^{-3/2} \sin^3\theta \sin 3\phi.$
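Substituting this in (6.13) and simplifying gives the expression that
appears in the denominator of the density (6.4) in the form
$\tfrac{1}{2} a (a^2 - 1) - a^3/27 + (a/6) \sin^2\theta - 2^{-1/2} 3^{-3/2} \sin^3\theta \sin 3\phi.$
This may be put into terms of t and $\phi$ alone by substituting in it the
expressions, derivable when N = 3 from (2.9) and (6.5),
(6.20)  $a = 3^{1/2} t (2 + t^2)^{-1/2}, \quad \cos\theta = t (2 + t^2)^{-1/2}, \quad \sin\theta = 2^{1/2} (2 + t^2)^{-1/2}.$
The result is
(6.21)  $2 \cdot 3^{-3/2} (2 + t^2)^{-3/2} (4t^3 - 3t - \sin 3\phi),$
and this equals the content of the square bracket in (6.4) when all
the $\xi$'s are positive, as they are for t > 2.
In the probability element $D_3(\xi) \sin\theta\, d\theta\, d\phi$, the factor $\sin\theta\, d\theta$
may be replaced by $|d \cos\theta|$, that is, from (6.20), by $2 (2 + t^2)^{-3/2}\, dt$.
Substituting (6.21) in (6.4) and multiplying by the element of area as
thus transformed we have as the element of probability
(6.22)  $3^{3/2} (2\pi^2)^{-1} (4t^3 - 3t - \sin 3\phi)^{-1}\, dt\, d\phi.$
The probability element for t when it is greater than 2 will be
found by integrating (6.22) with respect to $\phi$ from 0 to $2\pi$. We put
(6.23)  $w = 4t^3 - 3t,$
and observe that
(6.24)  $w - 1 = (t - 1)(2t + 1)^2, \quad w + 1 = (t + 1)(2t - 1)^2.$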
From the first of these relations it follows, since $t \ge 2$, that $w \ge 25$,
and therefore that the integral
(6.25) $$J = \int_0^{2\pi} (w - \sin 3\phi)^{-1} \, d\phi$$
exists. If the substitution $\phi' = 3\phi$ is made in this integral, a factor
1/3 is introduced by the differential, but the range of integration is
tripled, and we have
$$J = \int_0^{2\pi} (w - \sin\phi')^{-1} \, d\phi'.$$
This is evaluated by a well known device. For the complex variable
$z = e^{i\phi'}$, the path of integration runs counterclockwise along the unit
circle C around 0, and $d\phi' = dz/(iz)$. Since $\sin\phi' = (z - z^{-1})/(2i)$, we
find
Poles are at
of which only the latter is within the unit circle. Hence
Substitution in this from (6.24) yields a function of t alone. When this
is multiplied by the factors of (6.22) not included in the integrand of (6.25)
the element of probability g(t) dt is obtained, where
(6.26) (t > 2).
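The integral J can also be evaluated numerically as a check on the residue argument. The sketch below compares a midpoint-rule value with $2\pi (w^2 - 1)^{-1/2}$, which is the standard closed form of this integral for $w > 1$; that closed form is assumed here rather than quoted from the text, and the function names are illustrative only.

```python
import math


def w_of_t(t):
    """The quantity w = 4 t^3 - 3 t."""
    return 4 * t ** 3 - 3 * t


def J_numeric(w, steps=20000):
    """Midpoint-rule value of the integral of 1 / (w - sin(3 phi)) over 0 .. 2 pi."""
    h = 2 * math.pi / steps
    return sum(h / (w - math.sin(3 * (k + 0.5) * h)) for k in range(steps))


def J_closed(w):
    """Standard closed form 2 pi / sqrt(w^2 - 1), valid for w > 1 (an assumption here)."""
    return 2 * math.pi / math.sqrt(w * w - 1)


if __name__ == "__main__":
    for t in (2, 3, 5):
        w = w_of_t(t)
        print(t, w, J_numeric(w), J_closed(w))
```

Because $w^2 - 1 = (t^2 - 1)(4t^2 - 1)^2$ by the two factorizations above, the closed form is an elementary function of t.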
A check on the reasoning of this whole section is provided by dividing
this expression for g(t) by the corresponding density of t as given by
normal theory, that is, by g*(t) as given by (2.13), and letting t increase.
The limit, $3^{3/2} (4\pi)^{-1}$, agrees with $R_3$ as given by (4.2), in accordance
with (3.3).
The Laurent series obtained by expanding (6.25) in inverse powers of t
is
and converges uniformly for all values of $t \ge 2$. For any such value it may
therefore be integrated to infinity term by term. After doubling to include
the like probability for negative values from the symmetric distribution
this gives
a rapidly convergent series.
This probability is also available in closed form. Indeed, the
integral of (6.26) may be evaluated after rationalizing the integrand by
means of the substitution $t = (u^2 + 1)/(u^2 - 1)$. The indefinite integral then
becomes
$$3\pi^{-1} \left( \tan^{-1} 3^{1/2} u - \tan^{-1} 3^{-1/2} u \right) + \text{constant}.$$
The two inverse tangents may be combined by the usual formula, and the
probability may be expressed in various ways, of which the most compact
appears to be
(6.28) $$P(|t| \ge T) = 1 - 6\pi^{-1} \arctan\left[ 3^{-1/2} \, T^{-1} (T^2 - 1)^{1/2} \right], \qquad (T \ge 2).$$
One consequence is that $P(|t| \ge 2)$ is about .115.
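The closed form (6.28) is easy to evaluate and to compare with a simulation. The following is a minimal sketch, assuming the usual Student ratio $t = \bar{x} N^{1/2} / s$ with $s^2$ computed on N - 1 degrees of freedom; the function names, the seed and the number of trials are arbitrary choices made for the illustration.

```python
import math
import random


def tail_prob(T):
    """P(|t| >= T) for samples of three from a Cauchy population, per (6.28); needs T >= 2."""
    if T < 2:
        raise ValueError("(6.28) is stated only for T >= 2")
    return 1 - (6 / math.pi) * math.atan(math.sqrt(T * T - 1) / (math.sqrt(3) * T))


def simulate_tail(T, trials=200_000, seed=0):
    """Monte Carlo estimate of P(|t| >= T) for N = 3 standard Cauchy observations."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Inverse-CDF sampling of a standard Cauchy variate.
        x = [math.tan(math.pi * (rng.random() - 0.5)) for _ in range(3)]
        mean = sum(x) / 3
        s = math.sqrt(sum((v - mean) ** 2 for v in x) / 2)
        t = mean * math.sqrt(3) / s
        if abs(t) >= T:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print(tail_prob(2.0))
    print(simulate_tail(2.0))
```

The two printed values should agree to within ordinary sampling error, the first being the figure of about .115 quoted above.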
This equation may be solved for T to obtain a "percentage point"
with specified probability P of being exceeded in absolute value when the
null hypothesis is true. We have thus the formula
which is true only if the value of |T| obtained from it is at least 2.
Taking in turn P as .05 and .01, we find from (6.29):
7. The double and one-sided exponential distributions
We now derive for random samples of N = n + 1 the probability
that t should exceed any value $T \ge n$, when the population sampled has a
frequency function of either of the forms
(7.1) $f(x) = k e^{-kx}$ for $x \ge 0$, $= 0$ for $x < 0$;
(7.2) $f(x) = \tfrac{1}{2} k e^{-k|x|}$ for all real x;
where k is in each case a positive scale factor. Since t is invariant
under changes of scale, k does not enter into its distribution and we
shall take k = 1 without any loss of generality. Thus in the following
we shall use the simpler frequency functions
(7.3) $f_1(x) = e^{-x}$ for $x \ge 0$, $= 0$ for $x < 0$;
(7.4) $f_2(x) = \tfrac{1}{2} e^{-|x|}$ for all real x;
and the results will be applicable without change to (7.1) and (7.2).
We are free also to interchange the signs $\ge$ and $>$, and to say "positive"
for "non-negative" without changing our probabilities. The limitation of
T and therefore t to values not less than n implies, in accordance with
the lemma at the beginning of Section 6, that all observations are
non-negative in the samples with which we are dealing.
Such samples are represented by points in the "octant" of the
sample space for which all coordinates are positive, and in this part of
the space the density, which is the product of N values of the frequency
function for independent variates, is $e^{-Sx}$ for (7.3) and $2^{-N} e^{-Sx}$ for
(7.4). Both these densities are constant over each n-dimensional simplex
(generalized triangle or tetrahedron) defined, for c > 0, by an equation
of the form
(7.5) $Sx = x_1 + \cdots + x_N = c$
and the N inequalities $x_i \ge 0$. Since this definition is unchanged by
permuting the x's, the simplex is regular. Its n-dimensional volume,
which we designate $v_n$, may be determined by noticing that this simplex
is a face, which may be called the "base," of an N-dimensional simplex
with vertices at the origin and at the N points where the coordinate
axes meet the hyperplane (7.5). These last N points are all at distance
c from the origin. A generalization, easily established inductively
by integration, of the ordinary formulae for the area of a triangle and
the volume of a pyramid, shows that the volume of an N-dimensional simplex
is the product of the n-dimensional area of any face, which may be
designated the "base," by one-Nth of the length of a perpendicular ("altitude")
dropped upon this base from the opposite vertex. Since the perpendicular
distance from the origin to the hyperplane (7.5) is $c N^{-1/2}$, the volume
of the N-dimensional simplex enclosed between this hyperplane and the
coordinate hyperplanes is $v_n c N^{-3/2}$. But this N-dimensional volume may
also be computed by taking another face as base, with c as altitude,
evaluating the n-dimensional area of this base by the same method, and
so on through all lower dimensions. This gives $c^N / N!$ as the
N-dimensional volume. Equating, we find
$$v_n = c^n N^{1/2} / n!.$$
The samples for which t exceeds T are represented by points within
a right circular cone with vertex at the origin, axis on the equiangular
line, and semivertical angle $\theta$, where $T = n^{1/2} \cot\theta$. This cone
intersects the hyperplane (7.5) in a sphere whose radius we may call r. Then
since the distance of the hyperplane from the origin is $c N^{-1/2}$, we find
$r = c N^{-1/2} \tan\theta = c\, n^{1/2} N^{-1/2} T^{-1}$. The n-dimensional volume enclosed by this
sphere is within the simplex and, in accordance with (2.12), equals
$$\pi^{n/2} r^n / \Gamma(\tfrac{1}{2} n + 1).$$
When the expression for r is inserted here and the result is divided
by the volume $v_n$ of the simplex, the result is
(7.6) $$\frac{\pi^{n/2}\, n^{n/2}\, n!}{N^{N/2}\, T^n\, \Gamma(\tfrac{1}{2} n + 1)}.$$
This is the ratio of the n-dimensional volume of the sphere to that of
the simplex, and is independent of c. It is thus the same on all the
hyperplanes specified by (7.5) with positive values of c. Because
of the constancy of the density on each of these hyperplanes, (7.6) is
the probability that $t \ge T$ when it is given that the probability is unity
that all observations are positive, as is the case when the frequency
function (7.1) or (7.3) is the true one. But if (7.2) or (7.4) holds,
(7.6) is only the conditional probability that $t \ge T$, and to give the
unconditional probability must be multiplied by the probability of the
condition, that is, by $2^{-N}$.
If a two-tailed t-test of the null hypothesis that the mean is
zero is applied and the double exponential (7.2) is the actual population
distribution, the probability of a verdict of significance because
of |t| exceeding T is double the foregoing, and therefore $2^{-n}$
times (7.6).
With a slight change in the latter through reducing the argument of the
gamma function, this gives
(7.7)
The obvious relation
(7.8)
which is a special case of the duplication formula for the gamma function,
makes it possible to write (7.7) in the simpler form
Such probabilities of errors of the first kind in a two-tailed
t-test are illustrated in Table 1 for the values of T found in tables
based on normal theory corresponding to the frequently used .05, .01
and .001 probabilities. One of these probabilities appears at the top left
of each of the three main panels into which the table is divided, and
is to be compared with the exact probabilities in the second column of
the panel, computed as indicated above for the values t* of T obtained
from a table of the Student distribution and often referred to as "percentage
points" or "levels." It will at once be evident that the normal-theory
probabilities at the heads of the panels are materially greater than the
respective probabilities of the same events when the basic distribution of
the observed variate is of the double exponential type (7.2).
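The exact two-tailed probabilities described here can be reproduced directly from the geometric argument of this section: the ratio of the volume of the n-dimensional sphere to that of the simplex, multiplied by $2^{-n}$. The sketch below is an illustration under stated assumptions; it takes the simplex volume to be $c^n N^{1/2}/n!$, which is what equating the two expressions for the N-dimensional volume gives, and all function names are hypothetical.

```python
import math


def simplex_volume(n, c=1.0):
    """n-dimensional volume of the simplex x_i >= 0, x_1 + ... + x_N = c, with N = n + 1."""
    N = n + 1
    return c ** n * math.sqrt(N) / math.factorial(n)


def ball_volume(n, r):
    """Volume enclosed by a sphere of radius r in n dimensions."""
    return math.pi ** (n / 2) * r ** n / math.gamma(n / 2 + 1)


def conditional_prob(n, T, c=1.0):
    """P(t >= T) given that all N = n + 1 observations are positive; requires T >= n."""
    if T < n:
        raise ValueError("the construction applies only for T >= n")
    N = n + 1
    r = c * math.sqrt(n / N) / T  # radius of the section of the cone by the hyperplane
    return ball_volume(n, r) / simplex_volume(n, c)


def two_tailed_double_exponential(N, T):
    """P(|t| > T) for samples of N from the double exponential, valid for T >= N - 1."""
    n = N - 1
    return 2.0 ** (-n) * conditional_prob(n, T)


if __name__ == "__main__":
    for N, T in [(2, 12.706), (3, 4.303), (3, 9.925), (4, 5.841)]:
        print(N, T, round(two_tailed_double_exponential(N, T), 6))
```

For N = 2 and N = 3 at the 5 per cent normal-theory points this gives approximately .03935 and .03265. The ratio does not depend on c, which can be confirmed by passing different values of `c` to `conditional_prob`.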
This table also illustrates the comparison of the exact probabilities
with the approximations introduced in Section 3 for large values of T.
These approximations, presented in the bottom row of each panel, are
obtained by multiplying the normal-theory probability at the head of the
panel by the expression $R_N$ of (3.6) as adapted to the particular case.
For the double exponential (7.4) we find
(7.10)
This may, with the help of (7.8), be put in the simpler alternative form
(7.11)
The table illustrates the manner in which the approximation slowly improves
as T increases for a fixed n but grows poorer when n increases and T
remains fixed.
TABLE 1
PROBABILITIES FOR t IN SAMPLES FROM THE DOUBLE EXPONENTIAL
Stars refer to normal theory.

Sample size N                  2          3          4          5          6

P = .05
  t.05*                   12.706      4.303      3.182      2.776      2.571
  P(|t| > t.05*):
    Exact                 .03935     .03265    .031051        ...        ...
    Approximate           .03927     .03023    .023132    .017656    .013458

P = .01
  t.01*                   63.657      9.925      5.841      4.604      4.032
  P(|t| > t.01*):
    Exact                .007855    .006138    .005120    .004715        ...
    Approximate          .007854    .006046    .004626    .003531    .002692

P = .001
  t.001*                 636.619     31.598     12.941      8.610      6.859
  P(|t| > t.001*):
    Exact                .000799    .000606    .000471    .000386    .000337
    Approximate          .000785    .000605    .000463    .000353    .000269
The blank spaces in Table 1 correspond to cases in which T < n,
to which our formulae do not apply. Expressions of different analytical
forms could be, but have not been, derived for such cases.
All the probabilities in the body of Table 1 are less than the
normal-theory probabilities .05, .01 and .001 at the heads of the respective
panels, illustrating the greater concentration of the new distribution
about its center in comparison with the familiar Student distribution.
This concentration is further illustrated by the circumstance that $R_N < 1$
for all values of N. Indeed, $R_2 = \pi/4$ and $R_3 = \pi/3^{3/2}$ are obviously less
than unity, and for all N,
$$\frac{R_{N+2}}{R_N} = \frac{\pi}{2} \cdot \frac{N+1}{N+2} \cdot \left( \frac{N}{N+2} \right)^{N/2},$$
a product of three factors which is less than unity, since the last factor
does not exceed 1/2 when $N \ge 2$. As N increases this
approaches $\pi/(2e) < 1$, and R, the limit of $R_N$, is zero.
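These statements about $R_N$ can be checked numerically. The sketch below assumes the closed form $R_N = \pi^{(N-1)/2}\, \Gamma(\tfrac{1}{2}(N+1)) / N^{N/2}$ for the double exponential; this form is a reconstruction, adopted because it reproduces $R_2 = \pi/4$, $R_3 = \pi/3^{3/2}$ and the approximate rows of Table 1, and the function names are illustrative.

```python
import math


def R_double_exponential(N):
    """Assumed closed form pi^((N-1)/2) * Gamma((N+1)/2) / N^(N/2), on the log scale."""
    log_r = (
        ((N - 1) / 2) * math.log(math.pi)
        + math.lgamma((N + 1) / 2)
        - (N / 2) * math.log(N)
    )
    return math.exp(log_r)


def ratio_formula(N):
    """R_{N+2} / R_N written as (pi/2) * (N+1)/(N+2) * (N/(N+2))^(N/2)."""
    return (math.pi / 2) * ((N + 1) / (N + 2)) * (N / (N + 2)) ** (N / 2)


if __name__ == "__main__":
    for N in range(2, 7):
        # Second column: approximation to the two-tailed probability at the .05 level.
        print(N, round(R_double_exponential(N), 5), round(0.05 * R_double_exponential(N), 6))
    print(ratio_formula(10 ** 6), math.pi / (2 * math.e))
```

The last line illustrates the approach of the ratio to $\pi/(2e)$, about .578, so that $R_N$ decreases roughly geometrically.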
Percentage points, the values that t has assigned probabilities
P of exceeding, can be found for the double exponential population by
solving (7.7) or (7.9) for T, provided the value thus found is not less
than n. For samples of three (n = 2) the 5% point thus found is
3.48 and the 1% point is 7.78. Each of these is between the corresponding
points for the normal and the Cauchy parent distributions, the latter of
which were found at the end of Sec. 6. These results are summarized in
Table 2, along with the values of $R_3$ for the same populations, which
provide good approximations when t is very large. The degree of
concentration of t about zero is in the same order by all three of the measures
in the columns of Table 2.
TABLE 2
PERCENTAGE POINTS OF t AND MULTIPLIERS OF SMALL EXTREME-TAIL NORMAL-THEORY
PROBABILITIES FOR SAMPLES OF THREE

Population              T.05     T.01     R3
Normal                  4.30     9.92    1.000
Double exponential      3.48     7.78     .785
Cauchy                  2.95     3.69     .413
For a fixed basic distribution and a fixed probability of T being
exceeded by t, T is a function of n alone. The behavior of this function as
n increases depends on the basic distribution. If this is normal, a
percentage point T approaches a fixed value, the corresponding percentage
point of a standard normal distribution of unit variance. If the basic
distribution is the double exponential, it is deducible from (7.9) with
the help of Stirling's formula that $T = O(n^{1/2})$ when P is fixed.
8. Student ratios from certain other distributions
The distribution of the Student ratio in samples from the
"rectangular" distribution with density constant between two limits symmetric
about zero, and elsewhere zero, has been the object of considerable
attention. The exact distribution for samples of two is easily found by a
geometrical method. Perlo in 1931 obtained the exact distribution for
samples of three in terms of elementary functions (not the same analytic
function over the whole range), again by geometry, but with such greatly
increased complexity attendant on the extension from two to three
dimensions as to discourage attempts to go on in this way to samples larger than
three. Rider gave the distribution for N = 2, and also, by enumeration
of all possibilities, investigated the distribution in samples of two and
of four from a discrete distribution with equal probabilities at points
uniformly spaced along an interval; the result presumably resembles that for
samples from the continuous rectangular distribution.
The distribution of t is independent of the scale of the original
variate, and we take this to have density $\tfrac{1}{2}$ from -1 to 1, and zero
outside these limits. In an N-dimensional sample space the density is then
$2^{-N}$ within a cube of vertices whose coordinates are all $\pm 1$, and zero
outside it. The projection of the volume of this cube from the center onto
a surrounding sphere will obviously produce a concentration of density at
the points A and A' on the sphere where the coordinates all have equal values,
greater than at the points of minimum density in the ratio of the length of
the diagonal of the cube to an edge, that is, of $N^{1/2}$ to 1. Substituting
the frequency function
(8.1) $f(x) = \tfrac{1}{2}$ for $-1 < x < 1$, $= 0$ otherwise,
in (3.6) gives
(8.2) $$R_N = (\pi N/4)^{N/2} / \Gamma(\tfrac{1}{2} N + 1).$$
This increases rapidly with N and without bound; indeed, $R_{N+2}/R_N$ is nearly
$\tfrac{1}{2} \pi e$ for large N, so $R = \infty$. The lowest values are
This provides a sharp contrast with the cases of the Cauchy and the
single and double exponential distributions, for which R = 0. There seems
to be a real danger that a statistician applying the standard t-test to a
sample of moderate size from a population which, unknown to him, is
rectangular, will mistakenly declare the deviation in the mean
significant. The opposite was true for the populations considered earlier;
that is, use of the standard normal-theory tables of t where the Cauchy or
exponential distribution is the actual one, with central value zero the
hypothesis to be tested, will not lead to rejection of this hypothesis as
often as expected.
An interesting generalization of the rectangular distribution is the
Pearson Type II frequency function
$$f(x) = c_p (1 - x^2)^{p-1}, \qquad p > 0, \quad -1 < x < 1,$$
which reduces to the rectangular for p = 1. Values of p greater than unity
determine frequency curves looking something like normal curves, but the
induced distribution of t is very different in its tails. These are given
approximately for large t by the product of the corresponding normal theory
probability by
$$R_N = N^{N/2} \left[ \Gamma(p + \tfrac{1}{2}) / \Gamma(p) \right]^N \Gamma(pN - N + 1) / \Gamma(pN - \tfrac{1}{2} N + 1).$$
This is a complicated function of which only an incomplete exploration has
been made; it appears generally to increase rapidly with N, less rapidly
with p; whether it has an upper bound has not been determined. If it
approaches a definite limit as N increases, the tail distribution of t has a
feature not yet encountered in other examples: the probabilities of the
extreme tails on this Type II hypothesis, divided by their probabilities
on a normal hypothesis, may have a definite positive limit as the sample
size increases. Such limits of ratios have in previous examples been
zero or infinity.
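A short computation makes the behavior of this multiplier easier to explore. The sketch below evaluates the Type II expression for $R_N$ on the log scale and checks that it reduces to the rectangular case (8.2) when p = 1. It is restricted to $p \ge 1$, where every gamma-function argument is positive; the function names are illustrative only.

```python
import math


def R_type2(N, p):
    """R_N = N^(N/2) [G(p+1/2)/G(p)]^N G(pN-N+1)/G(pN-N/2+1), for p >= 1 (G = gamma)."""
    if p < 1:
        raise ValueError("this log-gamma evaluation assumes p >= 1")
    log_r = (
        (N / 2) * math.log(N)
        + N * (math.lgamma(p + 0.5) - math.lgamma(p))
        + math.lgamma(p * N - N + 1)
        - math.lgamma(p * N - N / 2 + 1)
    )
    return math.exp(log_r)


def R_rectangular(N):
    """R_N of (8.2): (pi N / 4)^(N/2) / Gamma(N/2 + 1)."""
    return math.exp((N / 2) * math.log(math.pi * N / 4) - math.lgamma(N / 2 + 1))


if __name__ == "__main__":
    for N in (2, 3, 4, 6, 10):
        print(N, R_rectangular(N), R_type2(N, 1.0), R_type2(N, 2.0))
    # Ratio of successive rectangular multipliers, compared with pi * e / 2.
    print(R_rectangular(202) / R_rectangular(200), math.pi * math.e / 2)
```

With p = 1 the two functions agree, and the ratio $R_{N+2}/R_N$ for the rectangular case comes out close to $\tfrac{1}{2}\pi e$, about 4.27, for large N.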
Values of p between 0 and 1 (not inclusive) yield U-shaped frequency
curves. The expression just given for $R_N$ may then present difficulties
because of zero or negative values of the gamma functions. The unusual
nature of the function and the variety of possibilities connected with it
appear particularly when studying large samples, as the application of
Stirling's formula is not quite so straightforward when the arguments of
the gamma functions are not all positive; and the factor in square brackets
with the exponent N may be either greater or less than unity.
Skew distributions and distributions symmetric about values different
from zero generate distributions of t which can in many cases be studied by
the methods used in this paper, particularly by means of $R_N$, which is often
easy to compute and gives approximations good for sufficiently large values
of t, even with skew distributions. In particular, power functions of
one-sided or two-sided t-tests can readily be studied in this way for numerous
non-normal populations.
9. Conditions for approximation to Student distribution
We now derive conditions on probability density functions of a wide
class, necessary and sufficient for the distribution of t in random samples
to converge to the Student distribution in the special sense that R = 1; that
is, that
$$R = \lim_{N \to \infty} R_N = 1, \qquad R_N = \lim_{T \to \infty} P/P^*,$$
where P is the probability that t exceeds T in a random sample of N from
a distribution of density f(x), and P* is the probability of the same event
on the hypothesis of a normal distribution with zero mean. A like result
will hold, with only trivial and obvious modifications, for the limiting
form when t approaches $-\infty$. It will be seen from the results of this section
that the conditions for this kind of approach to the Student distribution
are of a different character, and are far more restrictive than those
treated in central limit theorems for approach to a central normal
distribution, which is also approached by a Student distribution, with increasing
sample size within a fixed finite interval.
The probability density (frequency) function we suppose such that
the probability that x is positive is itself positive. In the opposite case
the trivial variation mentioned above comes into play. We rewrite (3.6)
in the form
where
and consider only cases in which this last integral exists and can be
approximated asymptotically by a method of Laplace, of which many accounts are
available, including those of D. V. Widder in The Laplace Transform and of
A. Erdélyi in Asymptotic Expansions. Direct application of this method
requires that xf(x), and therefore
$$g(x) = \log x + \log f(x),$$
shall be twice differentiable and shall attain its maximum value for a
single positive value $x_0$ of x; and that this be an ordinary maximum in the
sense that the first derivative is zero and the second derivative is
negative at this point. It is further required that, for all positive $x \ne x_0$,
the strong inequality $g(x) < g(x_0)$ shall hold. Variations of the method
may be applied when these conditions are somewhat relaxed; for instance,
when there are several equal maxima, the interval of integration may be broken
into subintervals each satisfying the conditions just stated; but we shall
not in this paper consider any such variations.
Thus we deal with cases in which there is a uniquely determined
$x_0 > 0$ for which $g(x_0) > g(x)$ when $x_0 \ne x > 0$, $g'(x_0) = 0$, $g''(x_0) < 0$, and
$$g(x) = g(x_0) + \tfrac{1}{2} (x - x_0)^2 g''(x_0) + o(x - x_0)^2.$$
Then (9.2) may be written
$$I_N = \int_0^\infty e^{N g(x)} x^{-1} \, dx = e^{N g(x_0)} \int_0^\infty e^{\frac{1}{2} N g''(x_0)(x - x_0)^2 + N\, o(x - x_0)^2} x^{-1} \, dx.$$
When f, g and their derivatives appear without explicit arguments the
argument $x_0$ will be understood. With this convention, a new variable of
integration
$$z = (-N g'')^{1/2} (x - x_0)$$
leads to the form
$$I_N = e^{N g} (-N g'')^{-1/2} \int_{-x_0 (-N g'')^{1/2}}^{\infty} e^{-z^2/2 + N\, o(z^2/N)} \left[ x_0 + O(N^{-1/2}) \right]^{-1} dz.$$
An asymptotic approximation to $I_N$, which by definition has a ratio to $I_N$
that approaches unity as N increases, is found here by letting the lower
limit tend to $-\infty$ and dropping the terms $o(z^2/N)$ and $O(N^{-1/2})$, which are small
in comparison with the terms to which they are added. This gives
$$I_N \sim (2\pi)^{1/2} x_0^{-1} (-N g'')^{-1/2} e^{N g}.$$
With the understanding that the argument is $x_0$ in (9.3) and its
first and second derivatives, we find:
(9.6) $$g = \log x_0 + \log f, \qquad g' = x_0^{-1} + f'/f = 0, \qquad g'' = -x_0^{-2} + f''/f - (f'/f)^2,$$
whence
(9.7) $$e^g = x_0 f, \qquad x_0 = -f/f', \qquad g'' = f''/f - 2 (f'/f)^2.$$
By Stirling's formula, $\Gamma(\tfrac{1}{2} N)$ is asymptotically equivalent to
$$(2\pi)^{1/2} \left[ (N - 2)/2 \right]^{(N-1)/2} e^{-(N-2)/2}.$$
This contains the factor $(1 - 2/N)^{N/2 - 1/2}$, which may be replaced by $e^{-1}$.
In this way we obtain
Making both this substitution and (9.5) in (9.1) and simplifying gives
(9.8) $$R_N \sim 2^{1/2} (-g'')^{-1/2} x_0^{-1} \left( 2\pi e^{2g+1} \right)^{N/2}.$$
Since $g'' < 0$ by assumption, it is clear from (9.8) that the limit
R of $R_N$ as N increases is zero, a positive number, or infinite, according
as $2\pi e^{2g+1}$ is less than, equal to, or greater than unity. Since this
critical quantity is, by the first of (9.7), equal to $2\pi e (x_0 f)^2$, we have
the following result:
THEOREM I. For a frequency function f(x) satisfying the general
conditions for application of the Laplace asymptotic approximation to xf(x),
a necessary and sufficient condition that the corresponding distribution of
the Student ratio have the property that R is a positive constant is that
$x_0 f = (2\pi e)^{-1/2}$ is the absolute maximum of xf(x), taken only where $x = x_0$, and
is an ordinary maximum.
If this condition is satisfied, then it is obvious from (9.8) that
a further necessary and sufficient condition for R to be unity is that
$2^{1/2} (-g'')^{-1/2} x_0^{-1} = 1$, that is, $x_0^2 g'' + 2 = 0$. In this we replace $x_0$ and $g''$ by
the expressions for them in the second and third equations respectively in
(9.7) and simplify. The result is simply $f'' = 0$. Hence
THEOREM II. In order that R = 1 for the distribution of the Student
ratio in random samples from a distribution of x of the kind just described,
it is necessary and sufficient that the conditions of Theorem I hold and that
$f''(x_0) = 0$.
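The two theorems lend themselves to a direct numerical check for particular densities. The sketch below locates the maximizer $x_0$ of xf(x) by golden-section search, evaluates the critical quantity $2\pi e (x_0 f)^2$, and estimates $f''(x_0)$ by a central difference. The search interval, step size and function names are assumptions of the illustration, and the search presumes a single interior maximum on that interval.

```python
import math


def argmax_xf(f, lo=1e-6, hi=20.0, iters=200):
    """Golden-section search for the maximizer of x * f(x) on (lo, hi)."""
    g = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    c, d = b - g * (b - a), a + g * (b - a)
    for _ in range(iters):
        if c * f(c) > d * f(d):
            b, d = d, c
            c = b - g * (b - a)
        else:
            a, c = c, d
            d = a + g * (b - a)
    return (a + b) / 2


def critical_quantity(f):
    """Return x0 and 2*pi*e*(x0*f(x0))^2; R is zero, finite positive, or infinite
    according as the second value is below, equal to, or above unity."""
    x0 = argmax_xf(f)
    return x0, 2 * math.pi * math.e * (x0 * f(x0)) ** 2


def second_derivative(f, x, h=1e-4):
    """Central-difference estimate of f''(x)."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)


def normal(x):
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)


def double_exponential(x):
    return 0.5 * math.exp(-abs(x))


def cauchy(x):
    return 1 / (math.pi * (1 + x * x))


if __name__ == "__main__":
    for name, f in [("normal", normal), ("double exponential", double_exponential), ("Cauchy", cauchy)]:
        x0, q = critical_quantity(f)
        print(name, round(x0, 5), round(q, 5), round(second_derivative(f, x0), 5))
```

For the normal density the critical quantity is unity and $f''(x_0)$ vanishes at $x_0 = 1$, so both theorems are satisfied. For the double exponential and Cauchy densities the critical quantity falls below unity, in agreement with R = 0 for those populations.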
A graphic version of these conditions is as follows. Under the
portion of the smooth frequency curve y = f(x) for which x is positive let a
rectangle be inscribed, with two sides on the coordinate axes and the opposite
vertex at a point L on the curve. Let L be so chosen that the rectangle has
the greatest possible area. Let the tangent to the curve at L meet the
x-axis at a point M. Then the triangle formed by L and M with the origin O
is isosceles, since the first-order condition for maximization, the second
equation of (9.6) or (9.7), states that the subtangent at L equals its
abscissa and is measured in the opposite direction.
The area of the isosceles triangle OLM equals that of the maximum
rectangle. The necessary and sufficient condition that R be a positive constant
is that this area have the same value $(2\pi e)^{-1/2}$ as for a normal curve
symmetric about the y-axis. If this is satisfied, the further necessary and
sufficient condition that R = 1 is that L be a point of inflection of the
frequency curve for which x is negative.
It is particularly to be noticed that these conditions, which indicate
the circumstances under which the familiar Student distribution can be
trusted for large values of t and large samples, have nothing to do with
the behavior of the distribution of x for extremely large or extremely small
values of this, the observed variate, nor with its moments, but are solely
concerned with certain relations of the frequency function and its first
and second derivatives at points of inflection.
Proposals to "correct" t for non-normality by means of higher moments
of x (up to the seventh, according to one suggestion), encounter the
difficulty that such moments reflect primarily the behavior of the distribution
for very large values, whereas the mainly relevant criteria seem from the
results of this section to concern rather the vicinity of the points of
inflection, which for a normal distribution are at a distance of one standard
deviation from the center. It is such intermediate or "shoulder" portions
of a frequency curve that seem chiefly to call for exploration when the
suitability of the t tables for a particular application is in question,
especially for stringent tests with large samples. Moments of high order,
even if known exactly, do not appear to be at all sensitive to variations of
frequency curves in the shoulders. Moreover, if moments are estimated from
samples, their standard errors tend to increase rapidly with the order of the
moment. More efficient statistics for the purpose are evidently to be sought
by other methods.
10. Correlated and heteroscedastic observations. Time series
Much was written in the nineteenth century, principally by astronomers
and surveyors, about the problem of unequal variances of errors of
observation, and hence different weights, which are inversely proportional to the
variances. Much has been written in the twentieth century, largely by
economists and others concerned with time series, about the problems raised by
the lack of independence of observations, particularly by the tendency
for observations close to each other in time or space to be more alike
than observations remote from each other. The two problems are closely
related, though this fact is seldom mentioned in the two separate
literatures. They combine into one problem where the observations, or the errors
in them, have jointly a multivariate normal distribution in which both
correlations and unequal variances may be present. Problems of estimating
or testing means and regression coefficients may in such cases easily be
reduced by linear transformation to the standard forms in which the
observations are independent with a uniform variance, provided only that the
correlations and the ratios of the variances are known, since these
parameters enter into the transformation.
Thus, if a multivariate normal distribution is assumed, the leading
methodological problem is the determination of this set of parameters, which
collectively are equivalent to the covariance matrix or its inverse, apart from
a common factor. The number of independent parameters needed to reduce the
problem to one of linear estimation of standard type is less by unity than
the total number of variances and correlations of the N observations, and
therefore equals $N + \tfrac{1}{2} N(N-1) - 1 = \tfrac{1}{2} (N-1)(N+2)$. Since this exceeds the
number of observations, often greatly, there is no hope of obtaining reasonable
estimates from these N observations alone. Numerous expedients, none of
them universally satisfactory, are available for arriving at values for
these parameters in various applications. Thus astronomers add estimated
variances due to different sources, one of which, the "personal equation,"
involves a sum of squares of estimated errors committed by an individual
observer. For observations made at different times, a maintained assumption,
ostensibly a priori knowledge, that the observations or their errors
constitute a stationary stochastic process of a particular sort, often reduces
the number of independent parameters enough so that, with large samples,
estimates can be made with small calculated standard error, but valid
only if the maintained assumption is close to the real truth. Another
expedient, often in time series the most satisfactory of all, is to adjust
the observations by means of concurrent variables, either supplementary
observations, such as temperature and barometric pressure in biological
experiments, or variables known a priori, such as the time as used in
estimating seasonal variation and secular trend with the help of a number of
estimated parameters sufficiently small to leave an adequate number of
degrees of freedom for the effects principally to be examined. The simplest
of all such expedients is merely to ignore any possible differences of
variances or of intercorrelations that may exist.
If methods of this kind leave something to be desired, the statistician
using them may be well advised at the next stage, when estimating the
expected value common to all the observations, or that of a specified linear
function of all the observations, or testing a hypothesis that such a linear
function has a specified expected value, to modify or supplement methods
based on the Student distribution so as to bring out as clearly as possible
how the probability used is affected by variations in the covariance matrix.
In addition, when as here the maintained hypotheses about the ancillary
parameters of the exact model used are somewhat suspect, a general protective
measure, in the nature of insurance against too-frequent assertions that
particular effects are significant, is simply to require for such a judgment
of significance an unusually small value of the probability of greater
discrepancy than that actually found for the sample. Thus instead of such
conventional values of this probability P as .05 and .01, it may be
reasonable, where some small intercorrelations or differences in variances in
the observations are suspected but are not well enough known to be used in
the calculation, to disclaim any finding of significance unless P is below
some smaller value such as .001 or .0001. Such circumstances point to use
of the t-statistic, with a new distribution based on a distribution of
observations which, though multivariate normal, differs from that usually
assumed in that the several observations may be intercorrelated and may
have unequal variances. Moreover, the tails of the new t-distribution
beyond large values of |t| are here of particular interest. The probabilities
of such tails will now be approximated by methods similar to those already
used in this paper when dealing with t in samples from non-normal
distributions.
Consider N observations, whose deviations from their respective
expectations will be called $x_1, \ldots, x_N$, having a non-singular
multivariate normal distribution whose density throughout an N-dimensional
euclidean space is

$$f(x_1, \ldots, x_N) = (2\pi)^{-N/2} |\Lambda|^{1/2} \exp\Big(-\tfrac12 \sum_i \sum_j \lambda_{ij} x_i x_j\Big), \tag{10.1}$$

where $\Lambda$ is the inverse of the covariance matrix and is symmetric,
$|\Lambda|$ is its determinant, and $\lambda_{ij} = \lambda_{ji}$ is the element
in the ith row and jth column of $\Lambda$. Let the Student t be computed from
these $x_i$ by the formula (2.1), just as if $\Lambda$ were a scalar matrix as in
familiar usage, with the object of testing whether the expectations of the
x's, assumed equal to each other, were zero. Then, as before,
$t = (N-1)^{1/2} \cot\theta$, where $\theta$ is the angle between the
equiangular line $x_1 = \cdots = x_N$ and the line through the origin O and
the point x whose coordinates are $x_1, \ldots, x_N$. When the points x are
projected onto the (N-1)-dimensional unit sphere about O, the density of the
projected points $\xi = (\xi_1, \ldots, \xi_N)$ is given by (2.7), which for the
particular distribution (10.1) becomes

$$D_N(\xi) = \int_0^\infty f(\rho\xi_1, \ldots, \rho\xi_N)\, \rho^{N-1}\, d\rho = (2\pi)^{-N/2} |\Lambda|^{1/2} \int_0^\infty \rho^{N-1} \exp\Big(-\tfrac12 \rho^2 \sum_i \sum_j \lambda_{ij} \xi_i \xi_j\Big)\, d\rho. \tag{10.2}$$

When $\Lambda = 1$ (the identity matrix), the double sum in the exponent
reduces to unity, since $\xi_1^2 + \cdots + \xi_N^2 = 1$, and in this case

$$D_N(\xi) = \tfrac12\, \Gamma(\tfrac12 N)\, \pi^{-N/2}. \tag{10.3}$$

The ratio of (10.2) to (10.3) is

$$|\Lambda|^{1/2} \Big(\sum_i \sum_j \lambda_{ij} \xi_i \xi_j\Big)^{-N/2}, \tag{10.4}$$

and is the ratio of the density in our case to that in the standard case
at point $\xi$ on the sphere. If $\xi$ moves to either of the points A, where all
$\xi_i = N^{-1/2}$, or A', where all $\xi_i = -N^{-1/2}$, (10.4) approaches

$$R_N = |\Lambda|^{1/2} \Big(N^{-1} \sum_i \sum_j \lambda_{ij}\Big)^{-N/2}; \tag{10.5}$$

this we call simply $R_N$. For large values of t, (10.5) approximates both the
ratio of the probability densities and the ratio of the probabilities of a
greater value of t in our case to that in the standard case.
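As a numerical companion to (10.5), the following Python sketch computes the
limiting ratio from a covariance matrix. It assumes that (10.5) reads
$R_N = |\Lambda|^{1/2} (N^{-1} \sum\sum \lambda_{ij})^{-N/2}$, with $\Lambda$
the inverse of the covariance matrix. The names `invert_with_det` and
`tail_ratio` are hypothetical, and the Gauss-Jordan routine is meant only for
small, well-conditioned matrices.

```python
from math import sqrt

def invert_with_det(a):
    """Gauss-Jordan inverse and determinant of a small square matrix (list of lists)."""
    n = len(a)
    # Augment each row with the matching row of the identity matrix.
    m = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(a)]
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det                      # a row swap changes the sign
        p = m[col][col]
        det *= p
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [row[n:] for row in m], det

def tail_ratio(cov):
    """R_N = |Lambda|^(1/2) * (sum of lambda_ij / N)^(-N/2), Lambda = inverse of cov."""
    n = len(cov)
    lam, det_cov = invert_with_det(cov)
    det_lam = 1.0 / det_cov
    total = sum(sum(row) for row in lam)
    return sqrt(det_lam) * (total / n) ** (-n / 2)

print(tail_ratio([[1.0, 0.0], [0.0, 1.0]]))   # standard case
print(tail_ratio([[1.0, 0.0], [0.0, 4.0]]))   # unequal variances
```

Values below unity indicate that the tabulated tail probability overstates the
true one for large t, and values above unity the reverse.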
A small bias may here be mentioned in our earlier method of
approximating the distribution of t in random samples, a bias generally absent
from the applications in the present section. The approximation is, as
was seen earlier, equivalent to replacing the integral of the density over
a spherical cap centered at A or A' by the product of D(A) or D(A')
respectively by the (N-1)-dimensional area of the cap. In the applications we
have considered to the distribution of t in random samples from non-normal
populations, A and A' are likely to be maximum or minimum points of the
density on the sphere, in which case the approximation will tend to be biased,
with the product of $R_N$ by the probability in the standard case of t taking a
greater value tending in case of a maximum to overestimate slightly the
probability of this event and in case of a minimum to underestimate it. In
non-random sampling of the kind dealt with in this section, maxima and minima
of the density on the sphere can occur only at points that are also on latent
vectors of the quadratic form. A and A' will be such points only for a
special subset of the matrices $\Lambda$. Apart from these special cases, the
approximation should usually be more accurate in a certain qualitative sense
with correlated and unequally variable observations than with random samples
from non-normal distributions.
The simplest and oldest case is that of heteroscedasticity with
independence. Here both the covariance matrix and its inverse are of
diagonal form. Let $w_1, \ldots, w_N$ be the principal diagonal elements of
$\Lambda$, while all its other elements are zero. These w's are the reciprocals
of the variances, and are therefore the true weights of the observations.
From (10.5),

$$R_N = \frac{(w_1 w_2 \cdots w_N)^{1/2}}{[(w_1 + \cdots + w_N)/N]^{N/2}} = \Big(\frac{\text{geometric mean}}{\text{arithmetic mean}}\Big)^{N/2} \le 1. \tag{10.6}$$

Hence, for sufficiently large T,

$$P^*(t > T) \le P(t > T), \tag{10.7}$$

where the right-hand probability is that found in familiar tables.
Moreover, the ratio of the true probability to the standard one is seen
from (10.6) to approach zero if the observations are replicated more and
more times with the same unequal but correct weights, and with concurrent
appropriate decreases in the critical probability used. The equality signs
in (10.6) and (10.7) hold only if the true weights are all equal.
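The behavior described for (10.6) can be checked numerically. The sketch
below assumes (10.6) in the form (geometric mean / arithmetic mean)^(N/2) of
the true weights; the function name `heteroscedastic_ratio` and the example
weights are illustrative only.

```python
from math import exp, log

def heteroscedastic_ratio(weights):
    """(geometric mean / arithmetic mean) ** (N/2) of the true weights."""
    n = len(weights)
    log_gm = sum(log(w) for w in weights) / n
    am = sum(weights) / n
    return exp((n / 2) * (log_gm - log(am)))

# Replicating the same unequal weights drives the ratio toward zero.
base = [1.0, 0.25, 4.0]
for copies in (1, 2, 4, 8):
    print(copies, heteroscedastic_ratio(base * copies))
```

Working with logarithms avoids overflow or underflow in the product of many
weights.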
In contrast to the foregoing cases are those of correlated normal
variates with equal variances. The common variance may without affecting the
distribution of t be taken as unity, and the covariances then become the
correlations. We consider two special cases.
A simple model useful in time series analysis has as the correlation
between the ith and jth observations $\rho^{|i-j|}$, with $\rho$ of course the
serial correlation. From this we find

$$\Lambda = (1-\rho^2)^{-1} \begin{pmatrix} 1 & -\rho & 0 & \cdots & 0 & 0 \\ -\rho & 1+\rho^2 & -\rho & \cdots & 0 & 0 \\ 0 & -\rho & 1+\rho^2 & \cdots & 0 & 0 \\ \vdots & & & \ddots & & \vdots \\ 0 & 0 & 0 & \cdots & 1+\rho^2 & -\rho \\ 0 & 0 & 0 & \cdots & -\rho & 1 \end{pmatrix}.$$

The determinant of $\Lambda$ is of course the reciprocal of that of the
correlation matrix, and therefore equals $(1 - \rho^2)^{-N+1}$. Substituting in
(10.5) gives

$$R_N = (1-\rho^2)^{1/2} (1-\rho)^{-N} \big[1 + 2\rho N^{-1} (1-\rho)^{-1}\big]^{-N/2}. \tag{10.8}$$

If N increases this becomes infinite when $\rho$ is positive, in contrast with
the case of independent observations of unequal variances, where the limit was
zero. $R_N$ also becomes infinite as $\rho$ approaches 1 with N fixed. But when
$\rho$ tends to -1 with N fixed, $R_N$ vanishes.
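A short Python check of the serial-correlation case follows. It assumes the
closed form (10.8) as $(1-\rho^2)^{1/2}(1-\rho)^{-N}[1 + 2\rho/(N(1-\rho))]^{-N/2}$
and compares it with a direct evaluation of (10.5), using the standard
tridiagonal inverse of the correlation matrix $\rho^{|i-j|}$. Both function
names are hypothetical.

```python
from math import sqrt

def markoff_ratio_closed(n, rho):
    """Closed form assumed for (10.8)."""
    return (sqrt(1 - rho ** 2) * (1 - rho) ** (-n)
            * (1 + 2 * rho / (n * (1 - rho))) ** (-n / 2))

def markoff_ratio_direct(n, rho):
    """R_N from (10.5), summing the tridiagonal inverse matrix (n >= 2)."""
    # (1 - rho^2) * Lambda has diagonal 1, 1+rho^2, ..., 1+rho^2, 1
    # and -rho on the two adjacent diagonals.
    diag_sum = 2 + (n - 2) * (1 + rho ** 2)
    offdiag_sum = -2 * (n - 1) * rho
    total = (diag_sum + offdiag_sum) / (1 - rho ** 2)
    det_lam = (1 - rho ** 2) ** (-(n - 1))
    return sqrt(det_lam) * (total / n) ** (-n / 2)

for n in (5, 10, 20, 40):
    print(n, markoff_ratio_closed(n, 0.3), markoff_ratio_direct(n, 0.3))
```

With a positive serial correlation the printed ratios grow rapidly with N,
which is the behavior the text contrasts with the heteroscedastic case.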
As another example, consider N observations, each with correlation $\rho$
with each of the others, and all with equal variances. The correlation
matrix has a determinant equal to

$$(1-\rho)^{N-1} [1 + (N-1)\rho],$$

which must be positive, implying that $\rho > -(N-1)^{-1}$; and the inverse
matrix may be written

$$\Lambda = (1-\rho)^{-1} \begin{pmatrix} 1+g & g & \cdots & g \\ g & 1+g & \cdots & g \\ \vdots & & \ddots & \vdots \\ g & g & \cdots & 1+g \end{pmatrix},$$

where

$$g = -\rho [1 + (N-1)\rho]^{-1}.$$

A short calculation now gives

$$R_N = [1 + N\rho/(1-\rho)]^{(N-1)/2}.$$

As in the previous example, $R_N$ becomes infinite if N increases with $\rho$
fixed, or if $\rho$ tends to unity with N fixed. The possibility that $\rho$
approaches -1 cannot here arise.
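The equal-correlation result can be verified in the same way. The sketch below
assumes $R_N = [1 + N\rho/(1-\rho)]^{(N-1)/2}$ and recomputes it from (10.5)
using the determinant and the element sum of the inverse matrix; the function
names are illustrative.

```python
from math import sqrt

def equicorrelated_ratio(n, rho):
    """R_N = [1 + N*rho/(1 - rho)] ** ((N - 1)/2) for N equally correlated observations."""
    if not -1.0 / (n - 1) < rho < 1.0:
        raise ValueError("rho must lie between -1/(N-1) and 1")
    return (1 + n * rho / (1 - rho)) ** ((n - 1) / 2)

def equicorrelated_ratio_direct(n, rho):
    """Same quantity from the determinant and element sum of the inverse matrix."""
    g = -rho / (1 + (n - 1) * rho)
    total = (n + g * n * n) / (1 - rho)      # sum of all elements of Lambda
    det_lam = 1.0 / ((1 - rho) ** (n - 1) * (1 + (n - 1) * rho))
    return sqrt(det_lam) * (total / n) ** (-n / 2)

for n in (3, 5, 10, 20):
    print(n, equicorrelated_ratio(n, 0.2), equicorrelated_ratio_direct(n, 0.2))
```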
A further application of the methods of this section having some
importance is to situations in which the mean is in question of a normally
distributed set of observations whose correlations and relative variances
are known approximately but not exactly. An exact knowledge of these
parameters would provide transformations of the observations
$x_1, \ldots, x_N$ into new variates $x_1^*, \ldots, x_N^*$, independently and
normally distributed with equal variance. If now t is computed from the
starred quantities, it will have exactly the same distribution as in the
standard case. But errors in the estimates used of the correlations and
relative variances will generate errors in the transformation obtained by
means of them, and therefore in the distribution of t. The true and erroneous
values of the parameters together determine a matrix $\Lambda$ which, with
(10.4) or (10.5), gives indications regarding the distribution of t, from
which approximate percentage points may be obtained, with an accuracy
depending on that of the original estimates of the correlations and relative
variances. In this way general theory and independent observations leading to
knowledge or estimates of these parameters may be used to improve the
accuracy of the probabilities employed in a t-test. If a confidence region of
known probability is available for the parameters, an interesting set of
values of $R_N$ corresponding to the boundary of the confidence region might
be considered.
11. Sample variances and correlations
The sensitiveness to non-normality of the Student t-test is,
as we have seen, chiefly a matter of the very non-standard forms
of tails distant from the center of the true t-distribution to an
extent increasing with the sample number. Between fixed limits, or
for fixed probability levels, it is easy to deduce from known limit
theorems of probability that the Student distribution is still often
reasonably accurate for sufficiently large samples. However these
theorems do not apply in all cases; for instance, they do not hold
if the basic population has the Cauchy distribution; and unequal
variances and intercorrelations of observations are notorious sources of
fallacies. Moreover, the whole point in using the Student distribution
instead of the older and simpler procedure of ascribing the normal
distribution to the Student ratio is to improve the accuracy in a way
having importance only for small samples from a normal distribution
with independence and uniform variance. Deviations from these standard
conditions may well, for small samples with fixed probability levels
such as .05, or for larger samples with more stringent levels, produce
much greater errors than the introduction of the Student distribution
was designed to avoid. The limit theorems are of little use for small
samples. To substitute actual observations uncritically in the beautiful
formulae produced by modern mathematical statistics, without any
examination of the applicability of the basic assumptions in the
particular circumstances, may be straining at a gnat and swallowing a
camel.
Even the poor consolation provided by limit theorems in the
central part of some of the distributions of t is further weakened when
we pass to more complicated statistics such as sample variances and
correlation coefficients. Here the moments of the statistic in samples
from non-normal populations betray the huge errors that arise so easily
when non-normality is neglected. Thus the familiar expression for the
variance of the sample variance,

$$\sigma^2_{s^2} = (\mu_4 - \sigma^4)/n, \tag{11.1}$$

when divided by the squared population variance yields a quotient two and
a half times as great for a normal as for a rectangular population, and has
no general upper bound.
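The factor of two and a half can be reproduced with a few lines of Python.
The sketch assumes that (11.1) has the form $(\mu_4 - \sigma^4)/n$ and uses
the standard moments of the normal and rectangular populations; the function
name `scaled_variance_of_s2` is hypothetical.

```python
def scaled_variance_of_s2(mu4, sigma2):
    """n * var(s^2) / sigma^4, taking var(s^2) = (mu4 - sigma^4) / n."""
    return (mu4 - sigma2 ** 2) / sigma2 ** 2

# Normal population with unit variance: mu4 = 3.
normal = scaled_variance_of_s2(3.0, 1.0)

# Rectangular population on (-h, h): sigma^2 = h^2 / 3 and mu4 = h^4 / 5.
h = 1.0
rectangular = scaled_variance_of_s2(h ** 4 / 5, h ** 2 / 3)

print(normal, rectangular, normal / rectangular)
```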
The asymptotic standard error of the correlation coefficient
(see, e.g., Kendall, vol. 1, p. 211) reduces for a bivariate normal
population to the well known approximation

$$\sigma_r^2 = (1 - \rho^2)^2 / n, \tag{11.2}$$

where $\rho$ is the correlation in the population. This may now be compared
with the asymptotic variance of the correlation coefficient between x
and y when the density of points representing these variates is uniform
within an ellipse in the xy-plane and zero outside it. The population
correlation $\rho$ and the moments of the distribution are determined by the
shape of the ellipse and the inclinations of its axes to the coordinate
axes. On substituting the moments in the general asymptotic formula for
the standard error of the correlation coefficient r, the result reduces
to

$$\sigma_r^2 = 2(1 - \rho^2)^2 / (3n), \tag{11.3}$$

only two-thirds of the standard expression (11.2). It is noteworthy
that in the special case $\rho = 0$, in which the ellipse reduces to
a circle, the variance is only 2/(3n), differing from 1/n, the exact
variance of r in samples from a bivariate normal distribution with
$\rho = 0$, and the asymptotic variance of r in samples from any population
in which there is independence between x and y, or even in which the
moments of fourth and lower orders satisfy the conditions for
independence.
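The value 2/(3n) for the circular case can be checked numerically. The sketch
below assumes the standard reduction of the general asymptotic formula for
uncorrelated variates, namely that n times the variance of r tends to
$\mu_{22}/(\mu_{20}\mu_{02})$, and evaluates the moments of a uniform density
on the unit disk by a midpoint rule. The function name `disk_moment` and the
grid size are illustrative choices.

```python
from math import cos, pi, sin

def disk_moment(px, py, steps=200):
    """E[x^px * y^py] for a uniform density on the unit disk (midpoint rule, polar form)."""
    total = 0.0
    dr, dt = 1.0 / steps, 2 * pi / steps
    for i in range(steps):
        r = (i + 0.5) * dr
        for j in range(steps):
            t = (j + 0.5) * dt
            # The extra factor r is the Jacobian of polar coordinates.
            total += (r * cos(t)) ** px * (r * sin(t)) ** py * r
    return total * dr * dt / pi

mu20 = disk_moment(2, 0)
mu02 = disk_moment(0, 2)
mu22 = disk_moment(2, 2)
print(mu22 / (mu20 * mu02))   # n times the asymptotic variance of r
```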
Fisher's ingenious use [14, Sec. 32] of the transformation

$$z = f(r) = \tfrac12 \log(1+r) - \tfrac12 \log(1-r), \qquad \zeta = f(\rho), \tag{11.4}$$

whose surprising accuracy has been verified in a study by Florence David
[ i ~ 7, and which has since been studied and modified [ 2 ~, owes its value
primarily to the properties that, for large and even moderately large
samples, z has a nearly normal distribution with mean close to $\zeta$ and
variance $n^{-1}$, apart from terms of higher order.
This transformation may be derived (though it was not by Fisher) from
the criterion that z shall be such a function of r as to have the
asymptotic variance $n^{-1}$, independently of the parameter $\rho$. This
problem has been studied L02§7, and an asymptotic series
for the solution has been found to have z as its leading term; two
additional terms, multiples respectively of $n^{-1}$ and $n^{-2}$, have been
calculated, with resultant modifications of z that presumably improve
somewhat the accuracy of the procedures using it.
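A minimal Python sketch of the transformation (11.4) and of the approximate
interval it yields is given below. It takes the variance of z as 1/n, as
stated in the text, and uses the conventional two-sided normal point 1.96;
both function names are hypothetical and the example values are arbitrary.

```python
from math import atanh, log, sqrt, tanh

def fisher_z(r):
    """z = (1/2) log(1 + r) - (1/2) log(1 - r), as in (11.4)."""
    return 0.5 * log(1 + r) - 0.5 * log(1 - r)

def approx_interval(r, n, z_crit=1.96):
    """Approximate limits for rho, treating z as normal with variance 1/n."""
    half_width = z_crit / sqrt(n)
    z = fisher_z(r)
    # tanh is the inverse of the transformation.
    return tanh(z - half_width), tanh(z + half_width)

print(fisher_z(0.5), atanh(0.5))
print(approx_interval(0.5, 25))
```

The interval is valid only under the bivariate normal assumptions from which
the variance formula is derived.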
All uses of z, however, depend for the accuracy of their
probabilities on the applicability of the familiar formula (11.2)
for the variance of r, which has been derived only from assumptions
including a bivariate normal population, and even then, only as an
approximation; cf. /J§]. If the bivariate distribution is
not normal, and in spite of this fact the variance (11.2) is used
either directly, or indirectly through the z-transformation, the
errors in the final conclusions may be very substantial. This
transformation does not in any way atone for non-normality or any other
deviation from the standard conditions assumed in its derivation.
A different transformation having the same desirable asymptotic
properties as Fisher's may be obtained for correlations in
random samples from a non-normal bivariate distribution of known form,
or even one whose moments of order not exceeding four are all known,
provided certain weak conditions are satisfied. In these cases the
asymptotic standard error $\sigma_r$ of r is of order $n^{-1/2}$ and may be
found in the usual way as a function of $\rho$. If z = f(r) and
$\zeta = f(\rho)$, with f a sufficiently regular increasing function, a
time-honored method of getting approximate standard errors may be applied:
Expand $z - \zeta$ in a series of powers of $r - \rho$; square, and take the
mathematical expectation; neglect terms of the resulting series that are of
order higher than the first in $n^{-1}$, thus ordinarily retaining only the
first term, which emerges as asymptotically equivalent to the variance of r.
A theorem justifying this procedure under suitable conditions of regularity
and boundedness is given by Cramer 19 _7, but the technique has been in
frequent use for more than a century without the benefit of a careful proof
such as Cramer's. In the present case, after taking a square root, it yields

$$\sigma_z \sim \sigma_r\, (d\zeta / d\rho).$$

If now we put $\sigma_z = n^{-1/2}$, integrate, and impose the additional
condition that f(0) = 0, we find:

$$\zeta \sim n^{-1/2} \int_0^\rho \sigma_r^{-1}\, d\rho. \tag{11.5}$$

If we wish to obtain a transformation resembling Fisher's in being
independent of n, as well as in other respects, we may replace the
sign of asymptotic approximation in (11.5) by one of equality.
Substituting in (11.5) the reciprocal of the square root of the
asymptotic variance (11.2) of r in samples from a normal distribution
leads to Fisher's transformation (11.4). But for the bivariate
distribution having uniform density in an ellipse we substitute from (11.3)
instead of (11.2), thus redefining z and $\zeta$ by multiplying Fisher's
values by $(3/2)^{1/2}$, while retaining the approximate variance $n^{-1}$.
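The integration leading from the standard error of r to the transformation
can be illustrated numerically. The sketch below assumes (11.5) in the form
$\zeta = n^{-1/2} \int_0^\rho d\rho/\sigma_r$ and integrates by the midpoint
rule for the two variance formulas (11.2) and (11.3). The function names are
hypothetical.

```python
from math import atanh, sqrt

def zeta(rho, scaled_sigma_r, steps=2000):
    """Midpoint-rule integral of 1 / (sqrt(n) * sigma_r) from 0 to rho."""
    h = rho / steps
    return sum(h / scaled_sigma_r((k + 0.5) * h) for k in range(steps))

def normal_case(p):
    """sqrt(n) * sigma_r from (11.2)."""
    return 1 - p ** 2

def ellipse_case(p):
    """sqrt(n) * sigma_r from (11.3)."""
    return sqrt(2 / 3) * (1 - p ** 2)

rho = 0.6
print(zeta(rho, normal_case), atanh(rho))                 # Fisher's value
print(zeta(rho, ellipse_case), sqrt(1.5) * atanh(rho))    # scaled by (3/2)^(1/2)
```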
12. Inequalities for variance distributions
Light may often be thrown on distributions of statistics by
means of inequalities determined by maximum and minimum values, even
where it is excessively difficult to calculate exact distributions.
An example of this approach is provided by the following brief study
of the sample variance in certain non-normal populations. It will
be seen that this method sometimes yields relevant information where
none can be obtained from the asymptotic standard error formula
because the fourth moment is infinite.
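One family of populations presenting special interest for the
study of sample variances is specified by the Pearson Type VII
frequency curves

$$y = y_0 (1 + x^2/a^2)^{-p}, \tag{12.1}$$

where

$$y_0 = \frac{\Gamma(p)}{a\, \pi^{1/2}\, \Gamma(p - \tfrac12)} \qquad (p \ge 1). \tag{12.2}$$

This family of symmetric distributions may be said to interpolate
between the Cauchy and normal forms, which it takes respectively
for p = 1 and p = $\infty$.
The even moments, so far as they exist, are given by

$$\mu_{2k} = a^{2k}\, \frac{\Gamma(k + \tfrac12)\, \Gamma(p - k - \tfrac12)}{\Gamma(\tfrac12)\, \Gamma(p - \tfrac12)}. \tag{12.3}$$

Putting k = 1, 2 we find

$$\sigma^2 = \mu_2 = a^2/(2p-3) \quad \text{if } p > 3/2, \qquad \mu_4 = 3a^4 (2p-3)^{-1} (2p-5)^{-1} \quad \text{if } p > 5/2. \tag{12.4}$$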
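The moments in (12.4) can be checked against the general gamma-function
expression. The sketch assumes that the even moments of the Type VII curve
are $\mu_{2k} = a^{2k}\,\Gamma(k+\tfrac12)\Gamma(p-k-\tfrac12) /
[\Gamma(\tfrac12)\Gamma(p-\tfrac12)]$ for $p > k + \tfrac12$; the function
name `type7_even_moment` is hypothetical.

```python
from math import gamma

def type7_even_moment(k, p, a=1.0):
    """Moment of order 2k of the Type VII curve with parameters p and a."""
    if p <= k + 0.5:
        raise ValueError("the moment of order 2k exists only for p > k + 1/2")
    return (a ** (2 * k) * gamma(k + 0.5) * gamma(p - k - 0.5)
            / (gamma(0.5) * gamma(p - 0.5)))

p, a = 3.0, 2.0
print(type7_even_moment(1, p, a), a ** 2 / (2 * p - 3))
print(type7_even_moment(2, p, a), 3 * a ** 4 / ((2 * p - 3) * (2 * p - 5)))
```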
We consider the case in which the population mean is known but
methods appropriate to a normal distribution are used in connection
with an estimate of $\sigma$ and tests of hypotheses concerning $\sigma$. In a
random sample let $x_1, \ldots, x_n$ be the deviations from the known
population mean; then n is both the sample number and the number of degrees
of freedom; and $s^2 = \sum x^2 / n$ is an unbiased estimate of the variance
$\sigma^2$ in any population for which the latter exists. But despite this fact
and the similarity of the Type VII and normal curves, the accuracy
of $s^2$ as an estimate of $\sigma^2$ differs widely for the different
populations, even if $\sigma$ has the same value.
When the formula (11.1) for the variance (i.e. squared standard
error) of the sample variance $s^2$ is applied to a normal distribution,
the result is $2\sigma^4/n$, as is well known. But when the moments (12.4)
of the Type VII distribution are inserted in the same formula, the
result is

$$\sigma^2_{s^2} = \frac{4a^4 (p-1)}{n (2p-3)^2 (2p-5)} \qquad (p > 5/2).$$

In this, $a^2$ may by (12.4) be replaced by $(2p-3)\sigma^2$, yielding

$$\sigma^2_{s^2} = 4\sigma^4 (p-1) [n(2p-5)]^{-1}.$$

The variance of $s^2$ is infinite when $p \le 5/2$, for instance when p = 1 and
the Type VII takes the Cauchy form. When p > 5/2 the ratio of the
variance of $s^2$ in samples from a Type VII to that in samples from a
normal population of the same variance is equal to 1 + 3/(2p-5);
this may take any value greater than unity.
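The ratio 1 + 3/(2p-5) follows from the moments (12.4), as the following
sketch confirms. It assumes that the variance of the sample variance is
$(\mu_4 - \mu_2^2)/n$ with known population mean, so that the normal value is
$2\mu_2^2/n$; the function name `variance_ratio` is hypothetical.

```python
def variance_ratio(p):
    """var(s^2) under Type VII divided by var(s^2) under normality, equal sigma."""
    if p <= 2.5:
        raise ValueError("the fourth moment is infinite unless p > 5/2")
    a2 = 1.0                                   # the scale a^2 cancels in the ratio
    mu2 = a2 / (2 * p - 3)
    mu4 = 3 * a2 ** 2 / ((2 * p - 3) * (2 * p - 5))
    return (mu4 - mu2 ** 2) / (2 * mu2 ** 2)

for p in (2.6, 3.0, 5.0, 50.0):
    print(p, variance_ratio(p), 1 + 3 / (2 * p - 5))
```

The ratio is large when p is only slightly above 5/2 and approaches unity as p
grows.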
In studying the distributions of s let us put

$$u^2 = ns^2 = \sum x_i^2 \quad (u \ge 0), \qquad Q = \sum \log(1 + x_i^2/a^2).$$

Then in the sample space u is the radius of a sphere whose
(n-1)-dimensional surface has, by (2.11), the "area"

$$2\pi^{n/2} u^{n-1} / \Gamma(\tfrac12 n).$$

Also, let us denote by $q_m(u)$, $q(u)$ and $q_M(u)$, or simply $q_m$, q and
$q_M$, the minimum, mean and maximum respectively of Q over the spherical
(n-1)-dimensional surface of radius u. Thus

$$q_m(u) < q(u) < q_M(u). \tag{12.6}$$

The density in the sample space found by multiplying together
n values of (12.1) with independent arguments is, in the new notation,

$$y_0^n e^{-pQ}. \tag{12.7}$$

The probability element for u is seen, on considering a thin spherical
shell, to be

$$\frac{2\pi^{n/2}}{\Gamma(\tfrac12 n)}\, y_0^n\, u^{n-1} e^{-p\,q(u)}\, du; \tag{12.8}$$

upper and lower bounds will be found by replacing q(u) here by $q_m$
and $q_M$ respectively.
At the extremes of Q for a fixed value of u, differentiation
yields

$$x_i (1 + x_i^2/a^2)^{-1} = \lambda x_i \qquad (i = 1, 2, \ldots, n),$$

where $\lambda$ is a Lagrange multiplier. This shows that all non-zero
coordinates of an extreme point must have equal squares. Let K be
the number of non-zero coordinates at a point satisfying the equations.
Since the sum of the squares is $u^2$, each of these coordinates must
be $\pm K^{-1/2} u$; and if $u \ne 0$, K can have only the values
1, 2, ..., n. At such a critical point,

$$Q = K \log(1 + u^2 K^{-1} a^{-2}).$$

This is a monotonic increasing function of K. Hence,

$$q_m = \log(1 + u^2 a^{-2}), \qquad q_M = \log(1 + u^2 n^{-1} a^{-2})^n. \tag{12.10}$$

A slightly larger but simpler upper bound is also available; for the
monotonic character of the function shows that $q_M < u^2/a^2$.
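The bounds in (12.10) are easy to tabulate. The sketch below assumes that Q at
a critical point with K non-zero coordinates equals
$K \log(1 + u^2 K^{-1} a^{-2})$, so that the minimum occurs at K = 1 and the
maximum at K = n; the function names and the sample values of n, u and a are
illustrative.

```python
from math import log

def q_critical(k, u, a=1.0):
    """Value of Q where k coordinates equal +/- u / sqrt(k) and the rest vanish."""
    return k * log(1 + u * u / (k * a * a))

def q_bounds(n, u, a=1.0):
    """Minimum and maximum of Q on the sphere of radius u, from K = 1 and K = n."""
    return q_critical(1, u, a), q_critical(n, u, a)

n, u, a = 6, 2.5, 1.3
for k in range(1, n + 1):
    print(k, q_critical(k, u, a))
print(q_bounds(n, u, a), u * u / (a * a))   # last value is the simpler upper bound
```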
The well known and tabulated distribution of $\chi^2 = ns^2/\sigma^2$, based
on normal theory, is used for various purposes, one of which is to
test the hypothesis that $\sigma$ has a certain value $\sigma_0$ against larger
values of $\sigma$. Such a test might for example help to decide whether to
launch an investigation of possible excessive errors of measurement.
The simplest equivalent of the standard distribution is that of

$$w = \tfrac12 \chi^2 = ns^2 / (2\sigma^2), \tag{12.11}$$

of which the element of probability is

$$g_n(w)\, dw = [\Gamma(\tfrac12 n)]^{-1} w^{n/2 - 1} e^{-w}\, dw. \tag{12.12}$$

We now inquire how this must be modified if the population is really
of Type VII rather than normal.
Let us first consider the cases for which the Type VII
distribution has a variance, that is, for which p > 3/2; then by
(12.4), $a^2 = (2p-3)\sigma^2$. Eliminating a between this and (12.2) gives

$$y_0 = c_p (2\pi)^{-1/2} \sigma^{-1}, \tag{12.13}$$

where

$$c_p = (p - 3/2)^{-1/2}\, \Gamma(p) / \Gamma(p - \tfrac12).$$

When p increases, $c_p$ tends to unity through values greater than unity.
Also, $c_3$ is approximately 1.23.
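The constant can be evaluated with the standard library, as in the sketch
below. It assumes $c_p = (p - 3/2)^{-1/2}\,\Gamma(p)/\Gamma(p - \tfrac12)$ and
uses log-gamma values so that large p causes no overflow; the function name
`c_p` simply mirrors the symbol in the text.

```python
from math import exp, lgamma, log

def c_p(p):
    """c_p = Gamma(p) / (sqrt(p - 3/2) * Gamma(p - 1/2)), defined for p > 3/2."""
    if p <= 1.5:
        raise ValueError("c_p requires p > 3/2")
    return exp(lgamma(p) - lgamma(p - 0.5) - 0.5 * log(p - 1.5))

for p in (2.0, 3.0, 10.0, 100.0, 1000.0):
    print(p, c_p(p))
```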
Since $ns^2 = \sum x^2 = u^2$, we find from (12.11) that
$u = \sigma (2w)^{1/2}$. This and (12.13) we substitute in the distribution
(12.8), which may then be written with the help of the notation (12.12) in
the form

$$h(w)\, dw = c_p^n\, e^{w - p q}\, g_n(w)\, dw. \tag{12.14}$$

Making the substitutions for u and a also in (12.10), where $u^2/a^2$
becomes 2w/(2p-3), and referring to (12.6), we observe that the
q function in (12.14) is greater than

$$\log[1 + 2w (2p-3)^{-1}]$$

and less than

$$n \log[1 + 2w\, n^{-1} (2p-3)^{-1}].$$

Consequently, for all w > 0,

$$h'(w) < h(w) < h''(w),$$

where

$$h'(w) = c_p^n \big[(2p-3)/(2p-3+2n^{-1}w)\big]^{np} e^w g_n(w), \qquad h''(w) = c_p^n \big[(2p-3)/(2p-3+2w)\big]^{p} e^w g_n(w).$$

If p increases while n and w remain fixed, both $h'(w)$ and $h''(w)$ tend
to $g_n(w)$, and their ratio tends to unity. Thus the ordinary normal
theory tables and usages of $\chi^2$ give substantially correct results
when applied to samples from a Type VII population if only p is large
enough. This of course might have been expected since the Type VII
distribution itself approaches the normal form when p grows large; but
the bounds just determined for the error of assuming normality may
often be useful in doubtful cases, or where p is not very large.
In all cases, $h'(w)$ and $h''(w)$ may be integrated by elementary
methods. If n=1, $h'(w)$ coincides with $h''(w)$.
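The two bounding densities can be compared with the normal-theory density
numerically. The sketch assumes the forms
$h'(w) = c_p^n [1 + 2w/(n(2p-3))]^{-np} e^w g_n(w)$ and
$h''(w) = c_p^n [1 + 2w/(2p-3)]^{-p} e^w g_n(w)$, with
$g_n(w) = w^{n/2-1} e^{-w} / \Gamma(n/2)$. The function names `g`, `log_c` and
`density_bounds` are hypothetical.

```python
from math import exp, lgamma, log

def g(n, w):
    """Normal-theory density of w = chi^2 / 2 with n degrees of freedom."""
    return exp((n / 2 - 1) * log(w) - w - lgamma(n / 2))

def log_c(p):
    """Logarithm of c_p = Gamma(p) / (sqrt(p - 3/2) * Gamma(p - 1/2))."""
    return lgamma(p) - lgamma(p - 0.5) - 0.5 * log(p - 1.5)

def density_bounds(n, p, w):
    """Lower and upper bounds on the density of w under a Type VII population."""
    common = n * log_c(p) + w
    lower = exp(common - n * p * log(1 + 2 * w / (n * (2 * p - 3)))) * g(n, w)
    upper = exp(common - p * log(1 + 2 * w / (2 * p - 3))) * g(n, w)
    return lower, upper

n, w = 5, 3.0
for p in (3.0, 10.0, 100.0, 1000.0):
    lower, upper = density_bounds(n, p, w)
    print(p, lower, g(n, w), upper)
```

For moderate p the interval between the bounds is wide at large w, so the
bounds are most informative when p is large or w is small.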
13. What can be done about non-standard conditions?
Standard statistical methods involving probabilities calculated
on standard assumptions of normality, independence and uniform variance
of the observed variates lead to many fallacies when any of these assumed
conditions are not fully present, and adequate corrections, often
requiring additional knowledge, are not made. The object of this paper is
to clarify the behavior of these aberrant probabilities with the help
of some new mathematical techniques, and to take some soundings in the
vast ocean of possible cases.
We have concentrated for the most part on the effects of
non-standard conditions on the distribution of Student's t, and especially
on the portions of this distribution for which |t| is rather large. It
is such probabilities (or their complements) that appear to supply the
main channel for passing from numerical values of t to decisions and
actions. A knowledge of the "tail" areas of the frequency distribution
of t (as also of F, r and other statistics) seems therefore of greater
value than information about relative probabilities of different regions
all so close to the center that no one would ever be likely to notice
them, to say nothing of distinguishing between them. Thus instead of
moments, which describe a distribution in an exceedingly diffuse manner
and are often not sufficiently accurate when only conventional numbers
of them are used, it seems better to utilize for the present purpose a
functional only of outer portions of the distribution of t. Such a
measure, here called $R_N$ for samples of N, and R in the limit for very
large samples, is the limit as T increases of the ratio of two
probabilities of the same event, such as t > T or |t| > T, with the
denominator probability based on standard normal central random sampling
and the numerator on some other condition which the investigator wishes
to study. Additional methods are here developed for evaluating the
exact probabilities of tail areas for which T is at least as great
as the number of degrees of freedom n, in such leading cases as the
Cauchy and the single and double simple exponential distributions.
These findings could be pushed further inward toward the center, but
with increasing labor and, at least in some cases, diminishing utility.
To heal the ills that result from attempts to absorb into the
smooth routine of standard normal theory a mass of misbehaving data of
doubtful antecedents, an inquiry into the nature and circumstances
of the trouble is a natural part of the preparations for a prescription.
Such an inquiry may in difficult cases require the combined efforts of
a mathematical statistician, a specialist in the field of application,
and a psychiatrist. But many relatively simple situations yield to
knowledge obtainable with a few simple instruments. One of these
instruments is $R_N$, which in case of erratically high or low values of
t will eliminate many possible explanations of the peculiarity. Positive
serial correlations in time series, one of Markoff type and the other
with equal correlations, tend to make $R_N$ and t too large, whereas
independence without accurate weights makes them too small. The Cauchy
distribution makes them too low, some others too high. Seldom will
$R_N$ be close to unity, especially in small samples, unless the standard
conditions come close to being satisfied.
The conditions that R, which is 1 for random sampling from a
normal distribution, be also 1 with independent random sampling, but
without actual normality, are very special, and are developed in Section
9. These are rather picturesque, and show that this kind of approximate
normality has absolutely nothing to do with moments, or with the behavior
of the frequency curve at infinity, or very near the center of symmetry
if there is one, but are solely concerned with what happens in a small
neighborhood of each point of inflection. Any correction of t for
non-normality should evidently be responsive to deviations from the
conditions of Section 9.
In some cases the trouble can be dispelled as an illusion,
generated perhaps by too few or too ill controlled observations.
When a remedy is required for the errors resulting from
non-normality, a first procedure sometimes available is to transform the
variate to a normal form, or some approximation thereto. The transformation
must be based on some real or assumed form of the population distribution,
and this should be obtained from some positive evidence, which may be
in an empirical good fit or in reasoning grounded on an acceptable body
of theory regarding the nature of the variates observed. One form of
the first, practiced by at least some psychologists, consists of new
scores constituting a monotonic function of the raw scores but adjusted
by means of a large sample so as to have a nearly normal distribution,
thus bringing the battery of tests closer to the domain of normal
correlation analysis. This is satisfactory from the standpoint of
univariate normal distributions, though for consistency of multivariate
normal theory additional conditions are necessary, which may or may not
be contradicted by the facts.
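One way such a monotonic rescaling of raw scores might be carried out is
sketched below in Python. The choice of plotting position (mid-rank minus one
half, divided by the sample size) is an assumption made for illustration and
is not taken from the text; the function name `normal_scores` is hypothetical.

```python
from statistics import NormalDist

def normal_scores(raw):
    """Replace raw scores by normal quantiles of their mid-ranks (ties share a score)."""
    n = len(raw)
    standard_normal = NormalDist()

    def midrank(value):
        below = sum(1 for x in raw if x < value)
        equal = sum(1 for x in raw if x == value)
        return below + (equal + 1) / 2

    # (midrank - 0.5) / n always lies strictly between 0 and 1.
    return [standard_normal.inv_cdf((midrank(v) - 0.5) / n) for v in raw]

print(normal_scores([12, 7, 30, 19, 25]))
```

The rescaled scores are normal only marginally; as the text notes, joint
normality of several such scores is a further condition that the data may or
may not satisfy.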
The replacement of price relatives by their logarithms appears
to be a sound practice. Indeed, a positive factor, the general level,
persisting through all prices, has a meaning and movement of its own;
and with it are independent fluctuations of a multiplicative character.
All this points to the logarithms of the prices as having the right
general properties of approximate symmetry and normality instead of
the very skew distributions of price relatives themselves, taken in
some definite sense such as a price divided by the price of the same
good at a definite earlier time. There is also some empirical evidence
that logarithms of such price relatives have normal, or at least
symmetric, distributions. In preparing a program of logarithmic
transformations, it is desirable to provide for suitable steps in the rare
instances of a price becoming zero, infinite or negative. The
advantages of replacing each price, with these rare exceptions, by its
logarithm (perhaps 4-place logarithms are best) at the very beginning
of a study are considerable, and extend also to some other time
variables in economics.
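A minimal sketch of such a provision is given below. It assumes common
(base 10) logarithms rounded to four places and simply flags exceptional
prices by returning `None`, leaving the choice of remedy to the
investigator; the function name `log_price` is hypothetical.

```python
from math import isfinite, log10

def log_price(price):
    """Four-place common logarithm of a price, or None for an exceptional price."""
    if not isfinite(price) or price <= 0:
        return None   # zero, negative, infinite or undefined: handle separately
    return round(log10(price), 4)

prices = [100.0, 2.5, 0.0, -1.0, float("inf")]
print([log_price(p) for p in prices])
```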
Non-parametric methods, including rank correlation and th e use
of contingency tables and of order sta t i s t ics such as quantiles, prOVide
a means of escape from such errors as applying normal theory, for
example through the Student rat io, to d is tr ibu tions that are not normal.
Getting suitable exact probabilities based purely on ranks is acom-
binatorial problem after a decision has been reached in detai l as to
the alternative hypotheses to be compared, and the steps to be taken
after each relevant set of observations. These however ar e often diff i
cult decisions, lacking such definite cr i ter ia as occur naturally in
7/28/2019 The Behavior of Some Standard Statistical Tests Under
http://slidepdf.com/reader/full/the-behavior-of-some-standard-statistical-tests-under 63/70
62
parametric situations like those involving correlation coefficients
in normal distributions. Changing from measures to ranks often implies
some sacrifice of information. For example, in a sample from a
bivariate normal distribution, it is possible to estimate the correlation
parameter either by the sample product-moment correlation or by a
function of the rank correlation coefficient; but the latter has a
higher standard error.
In spite of such strictures, non-parametric methods may suitably
be preferred in some situations. The median is a definitely
useful statistic, and so, for some purposes, is M. G. Kendall's rank
correlation coefficient.
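The two estimates may be compared in a small sampling experiment, sketched here under assumptions not made in the text: the rank coefficient is taken to be Kendall's tau, converted to an estimate of the correlation parameter by the relation rho = sin(pi tau / 2), which holds for the bivariate normal distribution; ties are assumed absent; and all function names are hypothetical.

```python
import math
import random
import statistics


def pearson_r(x, y):
    """Sample product-moment correlation."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def kendall_tau(x, y):
    """Kendall's rank correlation (tau-a); assumes no tied observations."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (x[i] - x[j]) * (y[i] - y[j])
            score += (product > 0) - (product < 0)  # +1 concordant, -1 discordant
    return 2.0 * score / (n * (n - 1))


def rho_from_tau(tau):
    """Estimate of the normal correlation parameter from Kendall's tau."""
    return math.sin(math.pi * tau / 2.0)


def simulate_standard_errors(rho, n, reps, seed=0):
    """Empirical standard errors of the two estimates of rho.

    Returns (product-moment estimate, rank-based estimate) over `reps`
    samples of size n from a bivariate normal population.
    """
    rng = random.Random(seed)
    c = math.sqrt(1.0 - rho * rho)
    by_moments, by_ranks = [], []
    for _ in range(reps):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        y = [rho * a + c * rng.gauss(0.0, 1.0) for a in x]
        by_moments.append(pearson_r(x, y))
        by_ranks.append(rho_from_tau(kendall_tau(x, y)))
    return statistics.stdev(by_moments), statistics.stdev(by_ranks)
```

A call such as `simulate_standard_errors(0.5, 20, 2000)` returns the two empirical standard errors for comparison; no particular numerical outcome is asserted here.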
One caution about non-parametric methods should be kept in mind:
Their use escapes the possible errors due to non-normality, but does
nothing to avoid those of falsely assumed independence, or of
heteroscedasticity, or of bias; and all these may be even greater threats
to sound statistical inference than is non-normality. These prevalent
sources of trouble, and the relative inefficiency of non-parametric
methods manifested in their higher standard errors where they compete
with parametric methods based on correct models, must be weighed against
their advantage of not assuming any particular type of population.
We have not until the last paragraph even mentioned biased errors
of observation. Such errors, though troublesome and common enough, are
in great part best discussed in connection with specific applications
rather than general theory. However, the statistical theory of design
of experiments does provide very material help in combatting the
age-old menace of bias, and at the same time contributes partial answers to
the question at the head of this section.
Objective randomization in the allocation of treatments to
experimental units, for example with the help of mechanisms of games of
chance, serves an important function that has been described in at
least two different ways. It may be called elimination of bias,
since in a balanced and randomized experiment any bias tends to be
distributed equally in opposite directions, and thereby to be
transformed into an increment of variance. Randomization may also be
regarded as a restoration of independence impaired by special identifiable
features of individual units, as in shuffling cards to destroy any
relations between their positions that may be known or suspected. This
aspect of randomization offers a substantial opportunity to get rid of
one of the three chief kinds of non-standard conditions that weaken the
applicability of standard procedures of mathematical statistics.
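A balanced random allocation of this kind may be sketched as follows. The sketch substitutes a pseudo-random number generator for the physical mechanism of a game of chance mentioned above, and the function name and arguments are hypothetical.

```python
import random


def randomize_balanced(units, treatments, seed=None):
    """Allocate treatments to units at random, each treatment equally often.

    Assumption: the number of units is a multiple of the number of
    treatments, so that exact balance is possible.
    """
    if len(units) % len(treatments) != 0:
        raise ValueError("number of units must be a multiple of the number of treatments")
    rng = random.Random(seed)
    # Write each treatment label equally often, then shuffle the labels like a deck of cards.
    labels = list(treatments) * (len(units) // len(treatments))
    rng.shuffle(labels)
    return dict(zip(units, labels))
```

Because the labels are shuffled without reference to any feature of the units, no identifiable peculiarity of a unit can be systematically associated with a particular treatment.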
Normality, independence and uniformity of variance can all be
promoted by attention to their desirability during careful design and
execution of experiments and analogous investigations, such as sample surveys.
Emphasis during such research planning should be placed on features
tending to normality, including use of composite observations, such as mental
test scores, formed by addition of individual item scores; and the removal
of major individual causes of variance, leaving as dominant a usually larger
number of minor causes, which by their convolutions produce a tendency
to normality.
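The tendency to normality of a composite score can be made quantitative in the simplest case. The sketch below assumes, beyond anything stated in the text, that the k item scores are independent and identically distributed; the skewness of their sum is then that of a single item divided by the square root of k, and the excess of kurtosis is that of a single item divided by k. The function names are hypothetical.

```python
import math


def shape_of_sum(skewness, excess_kurtosis, k):
    """Skewness and excess kurtosis of a sum of k independent, identically
    distributed items, given the same two measures for a single item."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return skewness / math.sqrt(k), excess_kurtosis / k


def bernoulli_shape(p):
    """Skewness and excess kurtosis of a single item scored 1 with
    probability p and 0 otherwise (0 < p < 1)."""
    q = 1.0 - p
    return (q - p) / math.sqrt(p * q), (1.0 - 6.0 * p * q) / (p * q)
```

For example, an item passed with probability 0.2 has skewness 1.5, while a score formed from 100 such items has skewness 0.15; both measures vanish for the normal distribution.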
After factorial and other modern balanced experiments, the unknown
quantities are commonly estimated by linear functions of a considerable
number of observations, frequently with equal coefficients. In addition
to the other benefits of such designs, this feature is conducive to close
approach to normality in the distributions of these estimates. A case
in point is in weighing several light objects, which is much better done
together in various combinations than separately one at a time, especially
if both pans of the scale can be used, weighing some combinations against
others. In the development of this subject (Yates [62], Hotelling [25],
Kishen [31], Mood [33]) the primary motives were cancellation of bias
and reduction of variance, followed by advancement of the combinatorial
theory. But from the point of view of the present inquiry, an important
part of the gain is the increase in the number of independent observations
combined linearly to estimate each unknown weight. Errors in weights
arrived at through such experiments must have not only sharply reduced
bias and variance, but much closer adherence to the normal form.
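The gain may be illustrated by a small example. The particular design below, four objects weighed four times on a two-pan balance according to the rows of a 4 x 4 matrix of +1 and -1 with mutually orthogonal columns, is chosen here for illustration and is not taken from the text; errors of weighing are assumed independent with a common variance, and the function names are hypothetical.

```python
# Entry +1 places the object in the left pan, -1 in the right pan.
# Each row is one weighing; each column corresponds to one object.
DESIGN = [
    [1, 1, 1, 1],
    [1, -1, 1, -1],
    [1, 1, -1, -1],
    [1, -1, -1, 1],
]


def weigh(design, weights, errors):
    """Readings of the balance: signed sum of the weights plus an error."""
    return [
        sum(sign * w for sign, w in zip(row, weights)) + e
        for row, e in zip(design, errors)
    ]


def estimate(design, readings):
    """Least-squares estimates; valid only when the columns of the
    design are mutually orthogonal with entries +1 or -1."""
    n = len(design)
    return [
        sum(design[i][j] * readings[i] for i in range(n)) / n
        for j in range(len(design[0]))
    ]


def estimator_variance(design, sigma2=1.0):
    """Variance of each estimate when every reading has variance sigma2."""
    n = len(design)
    return [
        sum((design[i][j] / n) ** 2 for i in range(n)) * sigma2
        for j in range(len(design[0]))
    ]
```

Each estimated weight is here a linear function, with coefficients equal in absolute value, of four independent readings, so that its variance is one fourth of that of a single weighing; four separate weighings of the four objects would give each estimate the full variance of one reading and an error distribution no nearer to normality than that of a single reading.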
One other protection is available from the dangerous fallacies
that have been the main subject of this paper. It lies in continual
scrutiny of the sources of the observations in order to understand as
fully as possible the nature of the random elements in them, as well as
of the biases, to the end that the most accurate and reasonable models
possible be employed. To employ the new models most effectively will
also call frequently for dealing in an informed and imaginative manner
with new problems of mathematical statistics.
BIBLIOGRAPHY
The references below have been selected mainly because of their
bearing on the problems of distributions of t and F in non-standard
cases, with a few dealing instead with distributions of correlation
and serial correlation coefficients and other statistics. The many
papers on distributions of means alone or of standard deviations
alone have been barred, save in a few cases such as Rietz's derivation
of the distribution of s in samples of three from a rectangular
distribution, which throws light on the nature of the difficulty
of the problem for larger samples. A few papers on weighing
designs have been included for a reason arising in the final
paragraphs of the paper.
An effort has been made to list all studies giving particular
attention to extreme tail probabilities of the several statistics;
as to this approach, which is also much used in the present paper,
it is possible the list is complete. But in the general field
indicated by the title, completeness has not been attained, and could
not be without a truly enormous expansion. It is with great regret
that many meritorious papers have had to be omitted. But references
are abundant, and well-selected lists can be found in many of the
publications listed below. The short bibliographies of Bradley and
Rietz are extremely informative regarding the condition of our
subject when they wrote. Kendall's is virtually all-inclusive, and when
taken with appropriate parts of his text gives the most nearly
adequate conspectus of the great and devoted efforts that have gone
into the field since the first World War.
A relatively new non-standard situation for application of
standard methods is that in which different members of a sample are drawn
from normal populations with different means. This is not treated
in the present paper, but is in papers by Hyrenius, Quensel, Robbins
and Zackrisson listed below.
[1] G. A. Baker, "Distribution of the mean divided by the standard deviations of samples from non-homogeneous populations," Ann. Math. Stat., vol. 3 (1932), pp. 1-9.
[2] M. S. Bartlett, "The effects of non-normality on the t distribution," Proc. Camb. Phil. Soc., vol. 31 (1935), pp. 223-231.
[3] G. E. P. Box, "Non-normality and tests on variances," Biometrika, vol. 40 (1953), pp. 318-335.
[4] Ralph A. Bradley, "Sampling distributions for non-normal universes," Ph.D. Thesis, University of North Carolina Library, Chapel Hill, 1949.
[5] Ralph A. Bradley, "The distribution of the t and F statistics for a class of non-normal populations," Virginia Journal of Science, vol. 3 (NS), No. 1 (Jan. 1952), pp. 1-32.
[6] Ralph A. Bradley, "Corrections for non-normality in the use of the two-sample t and F tests at high significance levels," Ann. Math. Stat., vol. 23 (1952), pp. 103-113.
[7] Chung, Kai Lai, "The approximate distribution of Student's statistic," Ann. Math. Stat., vol. 17 (1946), pp. 447-465.
[8] A. T. Craig, "The simultaneous distribution of mean and standard deviation in small samples," Ann. Math. Stat., vol. 3 (1932), pp. 126-140.
[9] Harald Cramér, Mathematical Methods of Statistics, Princeton, 1946.
[10] F. N. David, Tables of the ordinates and probability integral of the distribution of the correlation coefficient in small samples. London, Biometrika Office. 1935.
[11] F. N. David, "The moments of the z and F distributions," Biometrika, vol. 36 (1949), pp. 394-403.
[12] F. N. David and N. L. Johnson, "The effect of non-normality on the power function of the F-test in the analysis of variance," Biometrika, vol. 38 (1951), pp. 43-57.
[13] A. Erdelyi, Asymptotic expansions. New York, Dover. 1956.
[14] R. A. (Sir Ronald) Fisher, Statistical Methods for Research Workers. Edinburgh, Oliver and Boyd. 1st ed. 1925.
[15] R. A. (Sir Ronald) Fisher, "Applications of Student's distribution," Metron, vol. 5, no. 3 (1925), pp. 90-104.
[16] A. K. Gayen, "The distribution of 'Student's' t in random samples of any size drawn from non-normal universes," Biometrika, vol. 36 (1949), pp. 353-369.
[17] A. K. Gayen, "The distribution of the variance ratio in random samples of any size drawn from non-normal universes," Biometrika, vol. 37 (1950), pp. 236-255.
[18] A. K. Gayen, "The frequency distribution of the product-moment correlation coefficient in random samples of any size drawn from non-normal universes," Biometrika, vol. 38 (1951), pp. 219-247.
[19] R. C. Geary, "The distribution of Student's ratio for non-normal samples," Supp. Journ. Roy. Stat. Soc., vol. 3 (1946), pp. 178-184.
[20] G. B. Hey, "A new method of experimental sampling illustrated in certain non-normal populations," Biometrika, vol. 30 (1938), pp. 8-80. (Contains an important bibliography.)
[21] Wassily Hoeffding, "The role of hypotheses in statistical decisions," Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability, vol. 1, pp. 105-116 (1956).
[22] Harold Hotelling, "An application of analysis situs to statistics," Bull. Amer. Math. Soc., vol. 33 (1927), pp. 467-476.
[23] Harold Hotelling, "Tubes and spheres in n-spaces and a class of statistical problems," Amer. Journ. of Math., vol. 61 (1939), pp. 440-460.
[24] Harold Hotelling, "Effects of non-normality at high significance levels," (abstract), Ann. Math. Stat., vol. 18 (1947), p. 608.
[25] Harold Hotelling, "Some improvements in weighing and other experimental techniques," Ann. Math. Stat., vol. 15 (1944), pp. 297-306.
[26] Harold Hotelling, "New light on the correlation coefficient and its transforms," Journ. Roy. Stat. Soc., Series B, vol. 15 (1953), pp. 193-23
[27] Hannes Hyrenius, "Sampling distributions from a compound normal parent-population," Skand. Aktuar. Tidskrift, vol. 32 (1949), pp. 180-187.
[28] Hannes Hyrenius, "Distribution of 'Student'-Fisher's t in samples from compound normal functions," Biometrika, vol. 37 (1950), pp. 429-442.
[29] E. S. Keeping, "A significance test for exponential regression," Ann. Math. Stat., vol. 22 (1950), pp. 180-198.
[30] M. G. Kendall, The advanced theory of statistics, 2 vols. London, Griffin, 1948.
[31] K. Kishen, "On the design of experiments for weighing and making other types of measurements," Ann. Math. Stat., vol. 16 (1945), pp. 294-306.
[32] Jack Laderman, "The distribution of 'Student's' ratio for samples of two items drawn from non-normal universes," Ann. Math. Stat., vol. 10 (1939), pp. 376-379.
[33] Alexander M. Mood, "On Hotelling's weighing problem," Ann. Math. Stat., vol. 17 (1946), pp. 432-446.
[34] Jerzy Neyman, "On the correlation of the mean and variance in samples drawn from an infinite population," Biometrika, vol. 18 (1926), pp. 401-413.
[35] Jerzy Neyman and E. S. Pearson, "On the use and interpretation of certain test criteria for purposes of statistical inference," Biometrika, vol. 20A (1928), pp. 175-240.
[36] Egon S. Pearson, assisted by N. K. Adayantha and others, "The distribution of frequency constants in small samples from non-normal symmetrical and skew populations," Biometrika, vol. 21 (1929), pp. 259-286.
[37] Egon S. Pearson, "The analysis of variance in cases of non-normal variation," Biometrika, vol. 23 (1931), pp. 114-133.
[38] Leone Chesire, Elena Oldis, and Egon S. Pearson, "Further experiments on the sampling distribution of the correlation coefficient," Journ. Amer. Stat. Assoc., vol. 27 (1932), pp. 121-128.
[39] Egon S. Pearson, "The test of significance for the correlation coefficient: some further results," Journ. Amer. Stat. Assoc., vol. 27 (1932), pp. 424-426.
[40] Egon S. Pearson, "The test of significance for the correlation coefficient," Journ. Amer. Stat. Assoc., vol. 26 (1931), pp. 128-134.
[41] Victor Perlo, "On the distribution of Student's ratio for samples of three drawn from a rectangular distribution," Biometrika, vol. 25 (1933), pp. 203-204.
[42] Carl-Erik Quensel, "An extension of the validity of Student-Fisher's law of distribution," Skandinavisk Aktuarietidskrift, vol. 26 (1943), pp. 210-219.
     The distribution is examined of t for normally distributed observations having a common mean but different variances, and is found to be more closely concentrated about zero than in the standard case.
[43] Carl-Erik Quensel, "The validity of the z-criterion when the variates are taken from different normal populations," Skandinavisk Aktuarietidskrift, vol. 30 (1947), pp. 44-55.
     Generalizes the foregoing to analysis of variance, with similar results.
[44] Carl-Erik Quensel, "The distribution of the second-order moments in random samples from non-normal multivariate universes," Lund Univ. Arsskrift, N. F. Avd. 2, Bd. 48, No. 4, 1952.
[45] Paul R. Rider, "On the distribution of the ratio of mean to standard deviation in small samples from certain non-normal universes," Biometrika, vol. 21 (1929), pp. 124-143.
[46] Paul R. Rider, "On small samples from certain non-normal universes," Ann. Math. Stat., vol. 2 (1931), pp. 48-65.
[47] Paul R. Rider, "A note on small sample theory," Journ. Amer. Stat. Assn., vol. 26 (1931), pp. 172-174.
[48] Paul R. Rider, "On the distribution of the correlation coefficients in small samples," Biometrika, vol. 24 (1932), pp. 382-403.
[49] H. L. Rietz, "Note on the distribution of the standard deviation of sets of three variates drawn at random from a rectangular distribution," Biometrika, vol. 23 (1931), pp. 424-426.
[50] H. L. Rietz, "Comments on applications of recently developed theory of small samples," Journ. Amer. Stat. Assoc., vol. 26 (1931), pp. 36-44.
[51] H. L. Rietz, "Some topics in sampling theory," Bull. Amer. Math. Soc., vol. 43 (1937), pp. 209-230.
[52] H. L. Rietz, "On the distribution of the 'Student' ratio for small samples from certain non-normal populations," Ann. Math. Stat., vol. 10 (1939), pp. 265-274.
[53] Herbert Robbins, "The distribution of Student's t when the population means are unequal," Ann. Math. Stat., vol. 19 (1948), pp. 406-410.
[54] W. A. Shewhart and F. W. Winters, "Small samples - new experimental results," Journ. Amer. Stat. Assn., vol. 23 (1928), pp. 144-153.
     This article combines experimental with other approaches, especially that of Neyman [34].
[55] M. M. Siddiqui, "Distribution of a serial correlation coefficient near the ends of the range," Ann. Math. Stat., vol. 29 (1958), pp. 852-861.
[56] M. M. Siddiqui, Distributions of some serial correlation coefficients. University of North Carolina Ph.D. thesis, 1957. Chapel Hill: University of North Carolina Library.
[57] D. M. Starkey, "The distribution of the multiple correlation coefficient in periodogram analysis," Ann. Math. Stat., vol. 10 (1939), pp. 327-336.
[58] Martin Weibull, "The distribution of the t and z variables in the case of stratified sample with individuals taken from normal parent populations with varying means," Skandinavisk Aktuarietidskrift, vol. 33 (1950), pp. 137-167.
[59] Martin Weibull, "The regression problem involving non-random variates in the case of stratified sample from normal parent populations with varying regression coefficients," Skandinavisk Aktuarietidskrift, vol. 34 (1951), pp. 53-71.
[60] Martin Weibull, "The distribution of t- and F-statistics and of correlation and regression coefficients in stratified samples from normal populations with different means," Skand. Aktuar. Tidskrift, vol. 36 (1953), pp. 106, Supp. 1-2.
[61] D. V. Widder, The Laplace Transform. Princeton, 1941.
[62] F. Yates, "Complex experiments," Supp. Journ. Roy. Stat. Soc., vol. 2 (1935), pp. 181-247.