ASYMPTOTIC EXPANSIONS OF THE Paa:R … EXPANSIONS OF THE Paa:R FUNCTIONS OF THE LIKELIHOOD RATIO...
Transcript of ASYMPTOTIC EXPANSIONS OF THE Paa:R … EXPANSIONS OF THE Paa:R FUNCTIONS OF THE LIKELIHOOD RATIO...
ASYMPTOTIC EXPANSIONS OF THE Paa:R FUNCTIONS OFTHE LIKELIHOOD RATIO TESTS FOR MULTIVARIATE
LIHEAR HYPOTHESIS AND IRDEPDDEllCE
by
NariakiSugiuraUniversity of NOrth Carolina
and Hiroshima University
Institute of Statistics Mimeo Series No. 563
January, 1968
This research was supported by theNational Science Foundation GrantNo. 00-2059 and the Sakko-kai Foundation.
DEPARTMENT OF STATISfiCSUJlIVERSITY OF NORTH CAROLIRA
Chapel Hill, N. C.
SUMMARY. Asymptotic non-null distribution of the likelihood ratio
criterion for testing the linear hypothesis in multivariate analysis
-1is obtained up to the order N , "Where N means the sample size, by
using the characteristic function expressed by the hypergeometric
function of matrix argument. This result holds without any assumpticn
on the rank of noncentrality matrix. Asymptotic non-null distribution
of the likelihood ratio test for independence between two sets of
variates is also obtained up to the order N-t .
~~t~~. Let each column vector of p xN matrix X be distributed
independently according to p-variate normal distribution with the
comnxm covariance matrix 1: and E(X) = SA, Where A is a known s X N
matrix of rank s and e is unknown p X s matrix. Then the multivariate
linear hypothesis is defined by asking, under this model, Whether the
parameters 8 satisfies the hypothesis H: 9B=O, where B is a known s X b
matrix (b < s) and 8B is assumed to be estimable. By making an appro-
priate orthogonal transformation from X to Y by Y = X!, we can obtain
the canonical form of the linear hypothesis. Each column vector of
Y = (Yl
, ••• , YN) is distributed independently according to the normal
distribution with the COlDlOOn covariance matrix 1:. The hypothesis H
and alternatives K are specified by
HI E(Yj
) = 0 j=1,2, ••• ,b
(Ll)
K: E(Yj
) = 0 j=s+l, ••• ,N.
The likelihood ratio test for this problem is expressed by
(1.2) " = (Is I / Is +Sh I f/2e e
The matrix S is the sum ofe
square due to error and the mtrix Sh is the sum of square due to the
hypothesis. Hence under the alternative K, Se is distributed accordjng
to the Wishart distribution with N-s degrees of freedom and Sh is
distributed according to the noncentral Wishart distribution with b
degrees of freedom and noncentrality matrix of n = (!)M'1:-1 where
A =E(Yl , ••• , Yb), as in Constantine [4].
Posten and Bargmann [8] obtained the asymptotic expansion of the
power function P(-2p log" < x) up to the order N-2 under the
assumption that the noncentrality matrix n is of rank 2, where p is a
correction factor such that under H the first remainder term of lowest
degree will disappear if we approximate the statistic -2p log " by lvariate with bp degrees of freedom, that is, p is determined by pH = N-s
+ (b-p-l)/2 (Anderson [1, p. 208]).
On the other hand Constantine [4] showed that the hth moment of
the ratio of the determinants Isel / \Se+shl under K could be expressed
by the hypergeometric function of matrix argument. His result can be
expressed by our notation as follows.
(1.3) F ( h N-s+b _ n),I 1 h; + 2 ; u
where rp(x) and the hypergeometric function lFl are defined by
(1.4 ) r (x) = ~(p-l)/4 n p r(x-(a-l)/2)p a=l
2
IF1 (a;b ;Z)
The function C (Z) is called a zonal polynomial of symmetric matrix ZX
corresponding to the partition K = (kl,k2, ••• ,kp),kl+k2+••• +kp=k,
ki~ 0, i = 1, ••• , p and )' means the sum of all such partition~ It
(K)is a kth degree homogeneous symmetric polynomial of p characteristic
roots of Z.
By using this result we shall show the asymptotic expansion of
PK(-2p log A< x) up to the order N-l without restricting the rank of
n. We further require some formula for zonal polynomials in Constantine
[4].ClO
(1.5) II-zl-a= I
~k=O
•(1.6) c!o(Z)
= k~ ~
(a) C (Z) I k!K K
C (Z) I k!K
= etr z.
Put m = PN = N-s+(b-p-l)/2 and let m tend to infinity instead of N as
in Posten and Bargmann [8], we can express the characteristic function
of -2p log A as
C(t) = E[e-2it p log A ]
= E[ ISe ,-itm liSe+Sh I-itm]
3
r (~ (1-2it)- ~) r (!!l +~)- ;p 2 P 2 IFl(-itm; 2~1-2it)+~;-0)- r (!!l _ b-p-L)r (~(1-2it)+ b+;p+l) ~
P 2 4 P 2 4
Under the hypothesis H, the noncentrality matrix 0 = 0 and the
hypergeometric function IF1 is equal to unity. So the first four r
functions give us the characteristic function under H, which we shall
denote by Cl(t) and IFI by C2 (t). Cl(t) can be expanded by the usual
manner due to Box [3] (Anderson [1, p. 204]). Applying the formula
m (_l)rB (h)(1. 8) log r(x+h) = log ,J2; + (x+h-!) log x-x- I r;~1 +o( Ix I-m-~
r=lr(r+l)x
which holds for large Ix I and fixed h with the Bernoulli polynomial
2Br(h) of defree r, B2(h) = h -h+(1/6), to Cl(t), we have
log Cl (t)~-1l0g (l-2it)-iI {1l2(b:Pt'2a)_~(-b-pi",2a)l~~t -to{m-2).
Cl=l
The second term of the above expression vanishes to get
This is the reason Why we use the correction factor p. Now we shall
consider the second term C2(t) of (1.7).
(-imt) C (-0)K K
TTm{1-2it)J/2 + {b+P+l)/4)K. k!
This follows from the definition (1.4). The coefficient of each term
can be arranged by the order of m.
4
(loll) P ( a-1X a-l) ( a-I )(-itm) = n_ -itm - --- -itm - ---2 +1 •• ~-itm - --- + k -1K <X=L 2 2 a
(1.12) ([m(1-2it)/2] + (b+p+l)/4)K
P= fm(1-2it) lk
[l + 2. f~ k _ ,ka(a-ka )1 2 J m{1-21t) L~ ~ 2a=l
+ O(~2)} ] .
differentiating
Hence we can write C2(t) as
Now we shall evaluate the each term of the above infinite series. By00
(1. 6) we can see exp(x tr Z) =I }'xkcK(Z) /k!,
~o ~1this equality with respect to x to get
(1.14)
00
(tr z) etr Z =I }' CK(Z)/(k-l)! •
k=l 6<5This formula is used by Fujikoshi [5], in deriving the asymptotic
expansion of generalized variance under the noncentral case. By (1. 5)
we have
5
(1.15)
co
=I )'k::O (;<)
which holds for any number n and a, Z is a p X P positive definite
matrix such that all characteristic root of Z/n lies in the interval
(0,1). This is satisfied, if we consider n morderately large. The
left handside can be expanded asymptotically in another way by the
formula -log II - n-1zl = tr(Z/n) + tr(z/n)2/2 + 0(n-3).
(1.16) II - n-lzjna =exp (-na log II - n-lzl)
= (etr a Z) (1 + (a/2n)tr Z2 + 0(n-2».Comparing the coefficients of each term of the order n-1 in (1.15) and
(1.16), we can get
p
I ka(ka-<X)}= a2tr z2(etr a z).
a=1.
Applying the formula (1.14) and (1.17) to the expression of C2(t) in
(1.13), we have
(1.18) C2 (t) = exp [(2it / (1-2it» tr oJ
X [1 - 1. {(b+P+l)it tr 0 + 2it tr 02 } + 0(-m12) J.. m (l_2it)2 (1_2it)3
Combining this with the expression of C1(t) in (1.9), we can obtain
asymptotic expansion of the characteristic fUnction C(t) in terms of
noncentral X2 distribution.
6
(1.19) C(t) = (1_2it)-bP/2 exp [{2it/ (1-2it)} tr 0]
[1+ 1 {(b(i+l)tr 0 _ 1 (l>+P+l tr 0 _ tr 0.2)
X m 2 1-2it) (1-2it)2 2
tr 0.2
} ~ ]- :3 + ~m2) •(1-2it)
By inverting this characteristic function with the well known result
that (1_2it)-f/2 exp [2it ~2/(1-2it)] is the characteristic fUnction
of the noncentral X2 distribution with f degrees of freedom and
noncentrality parameter ~2, we can finally obtain the following
theorem.
Theorem 1.1. The non-null distribution of the likelihood
ratio criterion (1.2) for multivariate linear hypothesis defined by
(1.1) can be approximated asymptotically up to the order m-l by
( 2(2) ) !Jb+p+l (2 (2)p( -2p log A < x) = P Xr ~ < x + iill 2 tr 0 P Xr+2 t> < x)
_(b+~+l trn _ tro2)p(~+4(52) < x) _ trn2p(~+6(t>2) < x)} + o(m-2),
where
m = PN = N - s + (b-p-l)/2, f = bp/2, e2 = trO = trAAt Z-l /2
and ~(52) means the noncentral l variate with f degrees of freedom
and noncentrality parameter t:P.If we specialize the rank of 0. to two, we can easily recognize
the agreement of the result by Posten and Bargmann [8] and ours,
after minor change of notation.
close connection between multivariate linear hypothesis and the test
7
for indepence between two sets of variates. Because we can reduce
the test for independence to that of the linear hypothesis by taking
conditional distribution of one set of variate for given another set.
MOnotonicity property of the power function of the test criteria
was proved by this fact in Anderson and Gupta [2]. Thus we can ex-
pect that the same is true in asymptotic expansion.
Let pX 1 vectors Xl"",XN be a random sample from multivariate
normal distribution with mean vector J.I. and covariance DRtrix E. Put
N N
S = I (Xex-X) (Xex-X) t, X = N-l I Xex and let us partition 1; and S into
ex=l cx=l
Pl and P2 rows and columns (Pl + P2 = p) as
E=(I:u ~),~l ~2
Without loss of generality we can assume Pl .5. P2 • The likelihood
ratio test for the hypothesis of independence H ~ = O(Pl XP2)
against all alternatives K:~ f 0 is given by
A = (Is IIlsllIIS22I)N/2 = II - Sll-l 812 822-1
S21 1N/2,
Pl 2 N/2 .This can be also expressed as n
j=l (l-r j ) by us~ng the sample
canonical correlations defined by the Pl characteristic roots of
-1 -1Sll S12 S22 S2l' First we shall show the mment of the statistic
A under K in the convenient form for the expansion,
Theorem 2.1. Under the alternative K, the mment of likelihood
ratio statistic can be expressed as
8
(2.2)
is a P:t X P.t diagonal. matrix,
PI2 h N-l 2n (l-Pj ) 2FI(h,h;h+ -2;P ),
j=l
2 2 2 )where P = diag (~ "'" PPI
diagonal. element P~ means the polulation canonical correlation and
the hypergeometric function 2FI(~,a2;b;Z) is defined by
Proof. Considering the conditional distribution of the first PI
components of the sample for given P2 second components in the canonical
set up of Z, Constantine [4] showed that the joint distribution of
I - r1
2, ••• , I - r 2 is the same as that of PI characteristic rootsPI
of WI (W 1+ WI) -1, where PI X PI matrices W' and W' have the Wishart
distribution for given Y(P2
X (N-I)) with cozmoon covariance matrix
r= diag (1-P12, ••• ,1-P 2) such that VV' is central with N-P2-1PI
degrees of freedom, WI is noncentral with P2 degrees of freedom and
noncentrality matrix n = r-l JftrI' A' /2 and that they are distributed
independently for given fi' (P2 X P2), which has the Wishart distribution
with N-l degrees of freedom, covariance matrix I, and the PI X P2 matrix
It. is given by
A.
Hence we can express the conditional moment of Isl/(Islll IS22 I) for
given yy' from (1.3) by simple change of notation.
9
Applying Kummer transformation formula IFl(a;b;Z)=(etr Z) IF1(b-a;b;-Z)
by Hertz [6] to the second expression and taking expectation by YY'
with the Wishart distribution W (N-l,I), we can writeP2
(2.4)
CKi"Nr-1AYI')d(YI') •
By the fOrmula r {etr(_RS»)lslt -(P+l)/2C (ST)dS=r (t)(t) IRI-tc (TR-l ),"..13>0 K P K K
which holds for p X P positive definite matrix R, S and P X P any
symmetric matrix T, established by Constantine [4], we have
Applying again the Kummer transformation formula 't!1 (al ,a2 ;b;Z) =b-~-a
II - ZI 2 't!l(b-~,b-~;b;Z) due to Hertz [6] and the identity
II + A'r-1AI = II - p2 1-1 to the above expression, we have
10
Combining this result with (2.3), we can get the theorem.
From this theorem, we shall expand the non-null distribution of
the likelihood ratio statistic -2 log ~ asymptotically. It is known
by Olkin and Siotani [7] that the limiting distribution of
.[N {( Is II Isll lls22 I) -11: II 1~111~2I) is normal with mea.n 0 and variance
4( 11: 1/ 1~111~21)2tr 1:u-1~2~ -l~l' so -2W-1(10g ~ - log n~:l
(1_Pj2)N/2) has the same limiting distribution with mean 0 and
variance 4 I;:l P~. This situation is somewhat different from the
previous section. Under the hypothesis the test statistic -2p log ~
is recommended instead of -2 log A, where the correction factor p
is determined so tha.t the first remainder term disappears in the
asymptotic expansion. We have pN=N-(3/2)-(Pl+P2)/2 (Anderson [1, p.
239]).
Put m = pN and let m tends to infinity instead of N. Then the
characteristic function of the statistic -2pm-1(10g ~ - log
n;:1(1"P~)N/2) is obtained by putting h = .. iWm in (2.2).
P -P +1 P +p +1r (! _ .[mit + 1 2 )r (! + 1 2 )Pl 2 4 PI 2 4
(2.7) C(t) = P _p +1 P +p +1r (! + 1 2 )r (! _ .[mit + 1 2 ).Pl 2 4 Pl 2 4
(..,[mi t k (..,[mit) K •
P +p +1(! r: 0t + 1 2 )2 .....,nu. 4 I(
k!
Applying the asymptotic formula for log r(x+h) in (1.8) to each of
the four r fUnctions, we get
11
(2.8) First Factor = 1 + !fro P1P2 +0(12).m
By the same way as (1.11) and (1.12),
Pl
(-Jmit) K = (-Jmit)k[l+(~t) I ka(a-kJ + o(m-l )]
a.l
(2.9)P +p +1 .
(~_ J"mit + 1 2 ) = (~)k[l _~ k + 0(1)].2 . 4 K 2 "m m
Hence we have (-J"mit)K(-J"mit)j<<m/2)-J"mit + (Pl+P2+l)/4)K =Pl
(_2t2)k[1+m-!(2itk + (it)-l I ka(a-ka )} + O(m-l )], which gives,
CX=1
with the formula (1.14) and (~17), the second factor of (2.7).
(2.10) 'tll(-J"mit, -J"mit; (m/2)-J"mit + (Pl+P2+l)/4;p2)
= (eKP(_2t2tr p2)} [1 + 4m-!(it)3 (tr p2 _ tr p4) + o(m-l )].
Combining this expansion with (2.b), we have the following expansion
01' tne cnaracteristic tunctlon.
22 1 .3 2 4~c.ll) ~(t) = (exp(-2t tr P »[1 +Ji (it P1P2 + 4(lt) (tr P -tr P )}
which implies the following theorem.
Theorem 2.2. The power fUnction of the likelihood r~tio test
for testing the independence between two sets of variates can be
expanded asymptotically up to the order m-l in the following way.
12
Pl
m = N - (3/2 ) - (Pl +P2)/2 and T =~ = 2<L p~r~.. Then we have
j=l
Pl1 {PIP2 1 ~ \' 4 } 1(2.12) p(~ < x) = ~(x) -rm --:r- ~t(x)+ ~ ~t"(x)(T-4/,p) +O(m-),
j=l
where ~(x) means the distribution fUnction of the standard normal
distribution and ~Xx), ~ttXx) are its derivatives.
In case p = 2 the problem reduces to test that the correlation
coefficient vanishes. It would be of interest to compare numerically
with another formula in Ruben [9].
13
REFERENCES
[1]. Anderson, T. W. (1958). An Introduction to MUltivariate
Statistical Analysis. Wiley, New York.
[2]. Anderson, T. W. and Das Gupta, S. (1964). MOnotonicity of the
power fUnctions of some tests of independence between two
sets of variates. Ann. Math. Statist. 35, 206-208.
[3]. Box, G. E. P. (1949). A general distribution theory for a
class of likelihood criteria. Biometrika, 36, 317-346.
[J~]. Constal1tine, A. G. (1963). Some non-central distribution
problems in multivariate analysis. Ann. Math. Statist.
34, 1270-1285.
[5]. Fujikoshi, Y. (1967). Asymptotic expansion of the generalized
variance. (Reported in the Meeting of Math. Society of
Japan. OCtober, 1967).
[6]. Hertz, C. S. (1955). Bessel functions of matrix argument.
Ann. Math. 61, 474-523.
[7]. OD~in, I. and Siotani, M. (1964). Asymptotic distribution of
functions of a correlation matrix. Stanford University,
Technical Report No.6.
[8]. Posten, H. o. and Bargmann, R. E. (1964). Power of the likelihood
ratio test of the general linear hypothesis in multivariate
analysis. Biometrika. 51, 467-480.
[9]. Ruben, H. (1966). Some new results on the distribution of the
sample correlation coefficient. J.R. Statist. Soc. B 28, 513
525.