Likelihood Ratio Tests for Covariance Matrices of High-Dimensional
Normal Distributions
Dandan Jiang¹, Tiefeng Jiang² and Fan Yang³
Abstract. For a random sample of size n obtained from a p-variate normal population, the likelihood ratio test (LRT) for the covariance matrix being equal to a given matrix is considered. By using the Selberg integral, we prove that the LRT statistic converges to a normal distribution under the assumption p/n → y ∈ (0, 1]. The result for y = 1 differs markedly from the case y ∈ (0, 1). Another test is studied: given two sets of random observations of sample sizes n1 and n2 from two p-variate normal distributions, we study the LRT for testing whether the two normal distributions have equal covariance matrices. It is shown through a corollary of the Selberg integral that the LRT statistic has an asymptotic normal distribution under the assumptions p/n1 → y1 ∈ (0, 1] and p/n2 → y2 ∈ (0, 1]. The case max{y1, y2} = 1 again differs markedly from the case max{y1, y2} < 1.
1 Introduction
In their pioneering work, Bai, Jiang, Yao and Zheng [2] studied two likelihood ratio tests (LRT) by using Random Matrix Theory, and derived the limiting distributions of the LRT statistics. This paper has two purposes. We first use the Selberg integral, a different method, to revisit the two problems. We then prove two theorems that cover the critical cases not studied in [2]. We now review the two tests and present our results.
Let x1, · · · ,xn be i.i.d. Rp-valued random variables with normal distribution Np(µ,Σ), where
µ ∈ Rp is the mean vector and Σ is the covariance matrix. Consider the test:
H0 : Σ = Ip vs Ha : Σ ≠ Ip (1.1)
with µ unspecified. Any test H0 : Σ = Σ0 with known non-singular Σ0 and unspecified µ can be reduced to (1.1) by transforming the data y_i = Σ0^{−1/2} x_i for i = 1, 2, · · · , n (then y1, · · · , yn are i.i.d. with distribution Np(µ̃, Ip), where µ̃ = Σ0^{−1/2} µ). Recall
x̄ = (1/n) ∑_{i=1}^{n} x_i and S = (1/n) ∑_{i=1}^{n} (x_i − x̄)(x_i − x̄)*. (1.2)
¹School of Mathematics, Jilin University, Changchun 130012, China, [email protected].
²Supported in part by NSF #DMS-0449365. School of Statistics, University of Minnesota, 224 Church Street, MN 55455, [email protected].
³School of Statistics, University of Minnesota, 224 Church Street, MN 55455, [email protected].
Key Words: High-dimensional data, testing on covariance matrices, Selberg integral, Gamma function.
AMS (2000) subject classifications: Primary 62H15; secondary 62H10.
Of course S is a p × p matrix. After scaling and taking the logarithm, an LRT statistic for (1.1) is chosen to be of the following form:
L*_n = tr(S) − log |S| − p = (1/n) ∑_{i=1}^{p} (λ_i − n log λ_i) + p log n − p, (1.3)
where λ1, · · · , λp are the eigenvalues of nS. See, for example, p. 355 from [14] for this. The notation
log above stands for the natural logarithm loge throughout the paper.
For fixed p, it is known from classical multivariate analysis theory that a (constant) linear transform of nL*_n converges to χ²_{p(p+1)/2} as n → ∞. See, e.g., p. 359 of [14]. When p is large, particularly when n → ∞ and p/n → y ∈ (0, 1), there are some results on improving this convergence; see, e.g., [3]. Data whose dimension p is large and proportional to the sample size n are common in modern applications. A failure of a similar LRT in the high-dimensional case (p large) was observed by Dempster [8] as early as 1958. It is for this reason that Bai, Jiang, Yao and Zheng [2] studied the statistic L*_n in (1.3) when both n and p are large and proportional to each other.
We now state our results.
THEOREM 1 Let x1, · · · , xn be i.i.d. random vectors with normal distribution Np(µ, Σ). Let L*_n be as in (1.3). Assume H0 in (1.1) holds. If n > p = p_n and lim_{n→∞} p/n = y ∈ (0, 1], then (L*_n − µ_n)/σ_n converges in distribution to N(0, 1) as n → ∞, where
µ_n = (n − p − 3/2) log(1 − p/n) + p − y and σ_n² = −2[p/n + log(1 − p/n)].
A simulation study was carried out for the quantity (L*_n − µ_n)/σ_n in Theorem 1. We chose p/n = 0.9 in Figure 1 with different values of n. The figure shows that the approximation becomes more accurate as n increases. To see the convergence rate for the case y = 1, we chose an extreme scenario with p = n − 4 in Figure 2. As n increases, the convergence rate also appears quite decent.
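Both quantities in Theorem 1 are elementary to evaluate in practice. A minimal sketch (plain Python; the function name is ours, and the limit y is approximated by p/n) of computing the centering µ_n and scaling σ_n:

```python
import math

def theorem1_center_scale(n: int, p: int):
    """Centering mu_n and scaling sigma_n from Theorem 1 (y taken as p/n)."""
    y = p / n
    log1m = math.log(1.0 - p / n)   # log(1 - p/n), finite since n > p
    mu_n = (n - p - 1.5) * log1m + p - y
    sigma_n = math.sqrt(-2.0 * (p / n + log1m))  # positive since log(1-x) < -x
    return mu_n, sigma_n

# The p/n = 0.9 setting of Figure 1:
mu, sigma = theorem1_center_scale(1000, 900)
```

An observed value of the statistic would then be standardized as (L*_n − mu)/sigma and compared with standard normal quantiles.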
Now note that σ_n² → −2y − 2 log(1 − y) if p/n → y ∈ (0, 1). We obviously have the following corollary.
COROLLARY 1.1 Let x1, · · · , xn be i.i.d. random vectors with normal distribution Np(µ, Σ). Let L*_n be as in (1.3). Assume H0 in (1.1) holds. If n > p = p_n and lim_{n→∞} p/n = y ∈ (0, 1), then L*_n − µ_n converges in distribution to N(0, σ²) as n → ∞, where σ² = −2y − 2 log(1 − y) and
µ_n = (n − p) log(1 − p/n) + p − y − (3/2) log(1 − y).
Looking at Theorem 1, it is obvious that σ_n² ∼ −2 log(1 − p/n) as p/n → 1. We then get the following.
COROLLARY 1.2 Assume all the conditions in Theorem 1 hold with y = 1. Let r_n = (−log(1 − p/n))^{1/2}. Then
(L*_n − p − (p − n + 1.5) r_n²) / (√2 r_n) converges in distribution to N(0, 1) as n → ∞.
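Corollary 1.2 replaces σ_n by √2 r_n, which is legitimate because σ_n²/(2r_n²) → 1 when p/n → 1. A small numerical sketch of this ratio (plain Python; the choice p = n − 4 mirrors Figure 2, and the function name is ours):

```python
import math

def scale_ratio(n: int) -> float:
    """sigma_n^2 / (2 r_n^2) for p = n - 4; tends to 1 as n grows."""
    p = n - 4
    rn2 = -math.log(1.0 - p / n)                     # r_n^2
    sigma2 = -2.0 * (p / n + math.log(1.0 - p / n))  # sigma_n^2
    return sigma2 / (2.0 * rn2)

# Equal to 1 - (p/n)/r_n^2, so the approach to 1 is only logarithmic in n.
```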
Figure 1: Histograms based on 10,000 simulations of the normalized likelihood ratio statistic (L*_n − µ_n)/σ_n of Theorem 1 under the null hypothesis Σ = Ip with p/n = 0.9. The curve on top of each histogram is the standard normal density.
Figure 2: Histograms based on 10,000 simulations of the normalized likelihood ratio statistic (L*_n − µ_n)/σ_n of Theorem 1 under the null hypothesis Σ = Ip with p = n − 4. The curve on top of each histogram is the standard normal density.
The above result covers the critical case y = 1, which is not treated in [2]. In fact, the random matrix tool of Bai and Silverstein [4] is used to derive the results in [2], and that tool fails when y = 1.
For a practical testing procedure, we would use Theorem 1 directly instead of Corollaries 1.1 and 1.2, which treat the cases y ∈ (0, 1) and y = 1 separately. This is because, for a real data set, it is sometimes hard to judge whether p/n tends to 1 or to a number less than 1.
Now we study another likelihood ratio test. For two p-dimensional normal distributions Np(µ_k, Σ_k), k = 1, 2, where Σ1 and Σ2 are non-singular and unknown, we wish to test
H0 : Σ1 = Σ2 vs Ha : Σ1 ≠ Σ2 (1.4)
with unspecified µ1 and µ2. The data are given as follows: x1, · · · , x_{n1} is a random sample from Np(µ1, Σ1); y1, · · · , y_{n2} is a random sample from Np(µ2, Σ2); and the two sets of random vectors are independent. The two relevant covariance matrices are
A = (1/n1) ∑_{i=1}^{n1} (x_i − x̄)(x_i − x̄)* and B = (1/n2) ∑_{i=1}^{n2} (y_i − ȳ)(y_i − ȳ)* (1.5)
where
x̄ = (1/n1) ∑_{i=1}^{n1} x_i and ȳ = (1/n2) ∑_{i=1}^{n2} y_i. (1.6)
Let N = n1 + n2 and c_k = n_k/N for k = 1, 2. The likelihood ratio test statistic is
T_N = −2 log L1, where L1 = |A|^{n1/2} · |B|^{n2/2} / |c1 A + c2 B|^{N/2}. (1.7)
See, e.g., section 8.2 from [14] for this. The second main result in this paper is as follows.
THEOREM 2 Let n_i > p for i = 1, 2 and T_N be as in (1.7). Assume H0 in (1.4) holds. If n1 → ∞, n2 → ∞ and p → ∞ with p/n_i → y_i ∈ (0, 1] for i = 1, 2, then
(1/σ_n)(T_N/N − µ_n) converges in distribution to N(0, 1),
where
µ_n = (p − N + 2.5) log(1 − p/N) − ∑_{i=1}^{2} (p − n_i + 1.5)(n_i/N) log(1 − p/n_i);
σ_n² = 2 log(1 − p/N) − 2 ∑_{i=1}^{2} (n_i²/N²) log(1 − p/n_i). (1.8)
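As with Theorem 1, the centering and scaling in (1.8) are straightforward to compute. A minimal sketch (plain Python; the function name is ours):

```python
import math

def theorem2_center_scale(n1: int, n2: int, p: int):
    """Centering mu_n and scaling sigma_n from (1.8); requires n1 > p, n2 > p."""
    N = n1 + n2
    lN = math.log(1.0 - p / N)
    l1 = math.log(1.0 - p / n1)
    l2 = math.log(1.0 - p / n2)
    mu_n = (p - N + 2.5) * lN \
        - (p - n1 + 1.5) * (n1 / N) * l1 \
        - (p - n2 + 1.5) * (n2 / N) * l2
    sigma2 = 2.0 * lN - 2.0 * ((n1 / N) ** 2 * l1 + (n2 / N) ** 2 * l2)
    return mu_n, math.sqrt(sigma2)   # sigma2 > 0, as shown later in Lemma 3.5

# The p/n1 = p/n2 = 0.9 setting of Figure 3:
mu, sigma = theorem2_center_scale(1000, 1000, 900)
```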
We did some simulations for the statistic (T_N/N − µ_n)/σ_n in Theorem 2. In Figure 3 we chose p/n1 = p/n2 = 0.9; the picture shows that the convergence is quite robust as n1, n2 and p increase, even though the ratio 0.9 is close to 1. To see the convergence rate for the case max{y1, y2} = 1, we chose an extreme situation with p = n1 − 4 = n2 − 4 in Figure 4. The convergence also looks good, although it is not as fast as in the case p/n1 = p/n2 = 0.9.
According to the notation in Theorem 2, we know that p/N = (n1/p + n2/p)^{−1} → y1 y2/(y1 + y2) and n_i/N = (n_i/p) · (n1/p + n2/p)^{−1} → y_i^{−1}/(y1^{−1} + y2^{−1}) for i = 1, 2. We easily get the following corollary.
Figure 3: Histograms based on 10,000 simulations of the normalized likelihood ratio statistic (T_N/N − µ_n)/σ_n of Theorem 2 under the null hypothesis Σ1 = Σ2 with p/n1 = p/n2 = 0.9. The curve on top of each histogram is the standard normal density.
Figure 4: Histograms based on 10,000 simulations of the normalized likelihood ratio statistic (T_N/N − µ_n)/σ_n of Theorem 2 under the null hypothesis Σ1 = Σ2 with p = n1 − 4 = n2 − 4. The curve on top of each histogram is the standard normal density.
COROLLARY 1.3 Let n_i > p for i = 1, 2 and T_N be as in (1.7). Assume H0 in (1.4) holds. If n1 → ∞, n2 → ∞ and p → ∞ with p/n_i → y_i ∈ (0, 1) for i = 1, 2, then
T_N/N − ν_n converges in distribution to N(µ, σ²),
where
µ = (1/2)[5 log(1 − y) − 3γ1 log(1 − y1) − 3γ2 log(1 − y2)];
σ² = 2[log(1 − y) − γ1² log(1 − y1) − γ2² log(1 − y2)]; (1.9)
ν_n = (p − N) log(1 − p/N) − (p − n1)(n1/N) log(1 − p/n1) − (p − n2)(n2/N) log(1 − p/n2)
with γ1 = y2(y1 + y2)^{−1}, γ2 = y1(y1 + y2)^{−1} and y = y1 y2 (y1 + y2)^{−1}.
Our method of proving the above results is quite different from that of [2]. The random matrix theories developed by Bai and Silverstein [4] for Wishart matrices and by Zheng [15] for F-matrices are used in [2]. Those tools are universal in the sense that no normality assumption is needed. However, the requirements y < 1 as in Corollary 1.1 and max{y1, y2} < 1 as in Corollary 1.3 are crucial. Technically, the study of the critical cases y = 1 and max{y1, y2} = 1 is more challenging. Under the normality assumption, without relying on random matrix theories similar to those of Bai and Silverstein [4] and Zheng [15], we are able to use analysis tools. In fact, the Selberg integral is used in the proofs of both theorems. Through the Selberg integral, closed forms of the moment generating functions of the two likelihood ratio test statistics are obtained. We then study these moment generating functions to derive central limit theorems for the two statistics. In particular, our results cover the cases y ≤ 1 and max{y1, y2} ≤ 1. As shown in Corollary 1.2, the result for y = 1 and the result for y ∈ (0, 1) are quite different. The same applies to the second test.
We develop a tool for products of series of Gamma functions (Proposition 2.1). It is powerful in analyzing the moment generating functions of the two log-likelihood ratio statistics studied in this paper.
The rest of the paper is organized as follows. In Section 2, we derive a tool to study the product of a series of Gamma functions. The proofs of the main theorems stated above are given in Section 3.
2 Auxiliary Results
PROPOSITION 2.1 Let n > p = p_n and r_n = (−log(1 − p/n))^{1/2}. Assume that p/n → y ∈ (0, 1] and t = t_n = O(1/r_n) as n → ∞. Then, as n → ∞,
log ∏_{i=n−p}^{n−1} Γ(i/2 − t)/Γ(i/2) = pt(1 + log 2) − pt log n + r_n² (t² + (p − n + 1.5)t) + o(1).
The proposition is proved through the following three lemmas.
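Both sides of Proposition 2.1 can be evaluated directly, the left side exactly through the log-gamma function, so the expansion admits a quick numerical sanity check; a sketch (plain Python with `math.lgamma`; the sample values of n, p and t are ours):

```python
import math

def lhs(n: int, p: int, t: float) -> float:
    """log prod_{i=n-p}^{n-1} Gamma(i/2 - t) / Gamma(i/2), computed exactly."""
    return sum(math.lgamma(i / 2 - t) - math.lgamma(i / 2)
               for i in range(n - p, n))

def rhs(n: int, p: int, t: float) -> float:
    """The right-hand side of Proposition 2.1, dropping the o(1) term."""
    rn2 = -math.log(1.0 - p / n)   # r_n^2
    return (p * t * (1 + math.log(2)) - p * t * math.log(n)
            + rn2 * (t * t + (p - n + 1.5) * t))

n, p, t = 4000, 3600, 0.3          # p/n = 0.9, t of order 1/r_n
gap = abs(lhs(n, p, t) - rhs(n, p, t))   # the o(1) discrepancy
```

Here both sides are of order 10³ in absolute value, while the gap is a small fraction of one.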
LEMMA 2.1 Let b := b(x) be a real-valued and bounded function defined on (0, ∞). Then
log Γ(x + b)/Γ(x) = b log x + (b² − b)/(2x) + O(1/x²)
as x → +∞, where Γ(x) is the gamma function.
Proof. Recall the Stirling formula (see, e.g., p. 368 of [6] or (37) on p. 204 of [1]):
log Γ(z) = z log z − z − (1/2) log z + log √(2π) + 1/(12z) + O(1/x³)
as x = Re(z) → +∞. It follows that
log Γ(x + b)/Γ(x) = (x + b) log(x + b) − x log x − b − (1/2)(log(x + b) − log x) + (1/12)(1/(x + b) − 1/x) + O(1/x³) (2.1)
as x → +∞. First, use the fact that log(1 + t) = t − t²/2 + O(t³) as t → 0 to get
(x + b) log(x + b) − x log x = (x + b)(log x + log(1 + b/x)) − x log x
= (x + b)(log x + b/x − b²/(2x²) + O(x^{−3})) − x log x
= b log x + b + b²/(2x) + O(1/x²)
as x → +∞. Evidently,
log(x + b) − log x = log(1 + b/x) = b/x + O(1/x²) and 1/(x + b) − 1/x = O(1/x²)
as x → +∞. Plugging these two assertions into (2.1), we have
log Γ(x + b)/Γ(x) = b log x + (b² − b)/(2x) + O(1/x²)
as x → +∞. □
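Lemma 2.1 can also be checked directly against `math.lgamma`; a sketch (plain Python; the values of x and b are ours — b here is a constant, whereas the lemma allows any bounded b(x)):

```python
import math

def exact(x: float, b: float) -> float:
    """log(Gamma(x + b) / Gamma(x)), computed exactly."""
    return math.lgamma(x + b) - math.lgamma(x)

def expansion(x: float, b: float) -> float:
    """b log x + (b^2 - b)/(2x), the expansion of Lemma 2.1."""
    return b * math.log(x) + (b * b - b) / (2.0 * x)

x, b = 500.0, -0.7   # b = -t matches how the lemma is used later in the paper
gap = abs(exact(x, b) - expansion(x, b))   # O(1/x^2): tiny at x = 500
```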
LEMMA 2.2 Let n > p = p_n. Assume that lim_{n→∞} p/n = y ∈ (0, 1) and {t_n; n ≥ 1} is bounded. Then, as n → ∞,
log ∏_{i=n−p}^{n−1} Γ(i/2 − t_n)/Γ(i/2) = p t_n (1 + log 2) − t_n n log n + t_n (n − p) log(n − p) − (t_n² + 3t_n/2) log(1 − y) + o(1). (2.2)
Proof. Since p/n → y ∈ (0, 1), we have n − p → +∞ as n → ∞. By Lemma 2.1, there exists an integer C1 ≥ 2 such that
log Γ(i/2 − t)/Γ(i/2) = −t log(i/2) + (t² + t)/i + φ(i) and |φ(i)| ≤ C1/i²
for all i ≥ n − p when n is sufficiently large, where here and later in this proof we write t for t_n for short. Notice −t log(i/2) = t log 2 − t log i. Then,
∑_{i=n−p}^{n−1} log Γ(i/2 − t)/Γ(i/2) = pt log 2 − t ∑_{i=n−p}^{n−1} log i + (t² + t) ∑_{i=n−p}^{n−1} 1/i + ∑_{i=n−p}^{n−1} φ(i)
= pt log 2 + (t² + t) ∑_{i=n−p}^{n−1} 1/i − t log(n!/(n − p)!) + t log(n/(n − p)) + O(1/n)
= pt log 2 + (t² + t) ∑_{i=n−p}^{n−1} 1/i − t log(1 − y) − t log(n!/(n − p)!) + o(1) (2.3)
since ∑_{i=n−p}^{n−1} φ(i) = O(1/n) and log(n/(n − p)) → −log(1 − y) as n → ∞. First,
∑_{i=n−p}^{n−1} 1/i ≤ ∑_{i=n−p}^{n−1} ∫_{i−1}^{i} (1/x) dx = ∫_{n−p−1}^{n−1} (1/x) dx.
By working on the lower bound similarly, we have
log(n/(n − p)) = ∫_{n−p}^{n} (1/x) dx ≤ ∑_{i=n−p}^{n−1} 1/i ≤ ∫_{n−p−1}^{n−1} (1/x) dx = log((n − 1)/(n − p − 1)).
This implies, by the assumption p/n → y, that
∑_{i=n−p}^{n−1} 1/i → −log(1 − y) (2.4)
as n → ∞. Second, by the Stirling formula (see, e.g., p. 210 of [11]), there are some θ_n, θ'_n ∈ (0, 1) such that
log(n!/(n − p)!) = log[√(2πn) n^n e^{−n + θ_n/(12n)}] − log[√(2π(n − p)) (n − p)^{n−p} e^{−n+p + θ'_n/(12(n−p))}]
= n log n − (n − p) log(n − p) − p + (1/2) log(n/(n − p)) + o(1)
= n log n − (n − p) log(n − p) − p − (1/2) log(1 − y) + o(1)
as n → ∞. Joining this with (2.3) and (2.4), we arrive at
log ∏_{i=n−p}^{n−1} Γ(i/2 − t)/Γ(i/2) = pt log 2 − (t² + t) log(1 − y) − t log(1 − y) − tn log n + t(n − p) log(n − p) + tp + (t/2) log(1 − y) + o(1)
= pt(1 + log 2) − (t² + 3t/2) log(1 − y) − tn log n + t(n − p) log(n − p) + o(1)
as n → ∞. The proof is then complete. □
LEMMA 2.3 Let n > p = p_n and r_n = (−log(1 − p/n))^{1/2}. Assume that lim_{n→∞} p/n = 1 and t = t_n = O(1/r_n) as n → ∞. Then, as n → ∞,
log ∏_{i=n−p}^{n−1} Γ(i/2 − t)/Γ(i/2) = pt(1 + log 2) − pt log n + r_n² (t² + (p − n + 1.5)t) + o(1).
Proof. Obviously, lim_{n→∞} r_n = +∞. Hence {t_n; n ≥ 2} is bounded. By Lemma 2.1, there exist integers C1 ≥ 2 and C2 ≥ 2 such that
log Γ(i/2 − t)/Γ(i/2) = −t log(i/2) + (t² + t)/i + φ(i) and |φ(i)| ≤ C1/i² (2.5)
for all i ≥ C2.
We will use (2.5) to estimate ∏_{i=n−p}^{n−1} Γ(i/2 − t)/Γ(i/2). However, when n − p is small, say 2 or 3 (which is possible since p/n → 1), the identity (2.5) cannot be applied directly to every term in the product. We next use a truncation to solve this problem, thanks to the fact that Γ(i/2 − t)/Γ(i/2) → 1 as n → ∞ for each fixed i.
Fix M ≥ C2. Write
a_i = Γ(i/2 − t)/Γ(i/2) for i ≥ 1 and γ_n = 1 if n − p ≥ M; γ_n = ∏_{i=n−p}^{M−1} a_i if n − p < M.
Then,
∏_{i=n−p}^{n−1} Γ(i/2 − t)/Γ(i/2) = γ_n · ∏_{i=(n−p)∨M}^{n−1} Γ(i/2 − t)/Γ(i/2). (2.6)
Easily,
(min_{1≤i≤M} (1 ∧ a_i))^M ≤ γ_n ≤ (max_{1≤i≤M} (1 ∨ a_i))^M
for all n ≥ 1. Note that, for each i ≥ 1, a_i → 1 as n → ∞ since lim_{n→∞} t_n = 0. Thus, since M is fixed, the two bounds above go to 1 as n → ∞. Consequently, lim_{n→∞} γ_n = 1. This and (2.6) say that
∏_{i=n−p}^{n−1} Γ(i/2 − t)/Γ(i/2) ∼ ∏_{i=(n−p)∨M}^{n−1} Γ(i/2 − t)/Γ(i/2) (2.7)
as n → ∞. By (2.5), for n sufficiently large we know
log ∏_{i=(n−p)∨M}^{n−1} Γ(i/2 − t)/Γ(i/2) = ∑_{i=(n−p)∨M}^{n−1} (−t log(i/2) + (t² + t)/i + φ(i))
with |φ(i)| ≤ C1 i^{−2} for i ≥ C2. Write −t log(i/2) = −t log i + t log 2. It follows that
log ∏_{i=(n−p)∨M}^{n−1} Γ(i/2 − t)/Γ(i/2) = (n − (n − p) ∨ M) t log 2 − t ∑_{i=(n−p)∨M}^{n−1} log i + (t² + t) ∑_{i=(n−p)∨M}^{n−1} 1/i + ∑_{i=(n−p)∨M}^{n−1} φ(i)
:= A_n − B_n + C_n + D_n (2.8)
for n sufficiently large. Now we analyze the four terms above.
By distinguishing the cases n − p > M and n − p ≤ M, we get
|A_n − pt log 2| ≤ (t log 2) · |n − p − M| · I(n − p ≤ M) ≤ (M log 2) t. (2.9)
Now we estimate B_n. By the same argument as in (2.9), we get
|∑_{i=(n−p)∨M}^{n−1} h(i) − ∑_{i=n−p}^{n−1} h(i)| ≤ ∑_{i=1}^{M} |h(i)| (2.10)
for h(x) = log x or h(x) = 1/x on x ∈ (0, ∞). By the Stirling formula (see, e.g., p. 210 of [11]), n! = √(2πn) n^n e^{−n + θ_n/(12n)} with θ_n ∈ (0, 1) for all n ≥ 1. It follows that for some θ_n, θ'_n ∈ (0, 1),
∑_{i=n−p}^{n−1} log i = log(n!/(n − p)!) + log((n − p)/n)
= log[√(2πn) n^n e^{−n + θ_n/(12n)}] − log[√(2π(n − p)) (n − p)^{n−p} e^{−n+p + θ'_n/(12(n−p))}] + log((n − p)/n)
= n log n − (n − p) log(n − p) − p + (1/2) log((n − p)/n) + R_n
with |R_n| ≤ 1 for n sufficiently large. Recall B_n = t ∑_{i=(n−p)∨M}^{n−1} log i. We know from (2.10) that
|B_n − (tn log n − t(n − p) log(n − p) − tp + (t/2) log((n − p)/n))| ≤ Ct, (2.11)
where C here and later stands for a constant that can be different from line to line.
Now we estimate C_n. Recall the identity s_n := ∑_{i=1}^{n} 1/i = log n + c_n for all n ≥ 1, where lim_{n→∞} c_n = c and c ≈ 0.577 is the Euler constant. Thus, |(s_n − s_{n−p}) − log(n/(n − p))| ≤ c_n + c_{n−p}. Moreover,
∑_{i=n−p+1}^{n} 1/i = s_n − s_{n−p} and |∑_{i=n−p}^{n−1} 1/i − ∑_{i=n−p+1}^{n} 1/i| ≤ 1.
Therefore,
|∑_{i=n−p}^{n−1} 1/i − log(n/(n − p))| ≤ C.
Consequently, since C_n = (t² + t) ∑_{i=(n−p)∨M}^{n−1} 1/i, we know from (2.10) that
|C_n − (t² + t) log(n/(n − p))| ≤ (t² + t) C. (2.12)
Finally, it is easy to see from the second fact in (2.5) that
|D_n| ≤ C1 ∑_{i=M}^{∞} 1/i² (2.13)
for all n ≥ 2. Now, recalling that t = t_n → 0 as n → ∞, we have from (2.7), (2.8), (2.9), (2.11) and (2.12) that, for fixed integer M > 0,
A_n − B_n + C_n + D_n = pt log 2 − (tn log n − t(n − p) log(n − p) − tp + (t/2) log((n − p)/n)) + (t² + t) log(n/(n − p)) + D_n + o(1)
= pt(1 + log 2) + (t² + 3t/2 − nt) log n − (t² + 3t/2 − (n − p)t) log(n − p) + D_n + o(1)
=: E_n + D_n + o(1)
as n → ∞. Write log(n − p) = log n − r_n². Then
E_n = pt(1 + log 2) − pt log n + r_n² (t² + 3t/2 − (n − p)t).
From (2.13) we have
limsup_{n→∞} |(A_n − B_n + C_n + D_n) − E_n| ≤ C1 ∑_{i=M}^{∞} 1/i²
for any M ≥ C2. Recalling (2.7) and (2.8) and letting M → ∞, we eventually obtain the desired conclusion. □
Proof of Proposition 2.1. The conclusion for the case y = 1 follows from Lemma 2.3. If y ∈ (0, 1), then lim_{n→∞} r_n = (−log(1 − y))^{1/2}, and hence {t_n; n ≥ 1} is bounded. It follows that
pt(1 + log 2) − pt log n + r_n² (t² + (p − n + 1.5)t) = pt(1 + log 2) − pt log n − t(p − n) log(1 − p/n) − (t² + 3t/2) log(1 − p/n).
The last term above equals −(t² + 3t/2) log(1 − y) + o(1) since p/n → y as n → ∞. Moreover,
−pt log n − t(p − n) log(1 − p/n) = −pt log n + t(n − p)(log(n − p) − log n) = −nt log n + t(n − p) log(n − p).
The above assertions give
pt(1 + log 2) − pt log n + r_n² (t² + (p − n + 1.5)t) = pt(1 + log 2) − nt log n + (n − p)t log(n − p) − (t² + 3t/2) log(1 − y) + o(1)
as n → ∞. This is exactly the right hand side of (2.2). □
3 Proof of Main Results
We first prove Theorem 1. To do so, we need some preparation. Assume that x1, · · · , xn are Rp-valued random variables. Recall
S = (1/n) ∑_{i=1}^{n} (x_i − x̄)(x_i − x̄)* where x̄ = (1/n) ∑_{i=1}^{n} x_i. (3.1)
The following is from Theorem 3.1.2 and Corollary 3.2.19 in [14].
LEMMA 3.1 Assume n > p. Let x1, · · · , xn be i.i.d. Rp-valued random variables with distribution Np(µ, Ip). Then nS and Z*Z have the same distribution, where Z := (z_{ij})_{(n−1)×p} and the z_{ij}'s are i.i.d. with distribution N(0, 1). Further, the eigenvalues λ1, · · · , λp of nS have joint density function
f(λ1, · · · , λp) = Const · ∏_{1≤i<j≤p} |λ_i − λ_j| · ∏_{i=1}^{p} λ_i^{(n−p−2)/2} · e^{−(1/2) ∑_{i=1}^{p} λ_i}
for all λ1 > 0, λ2 > 0, · · · , λp > 0.
Recall the β-Laguerre ensemble:
f_{β,a}(λ1, · · · , λp) = c_L^{β,a} · ∏_{1≤i<j≤p} |λ_i − λ_j|^β · ∏_{i=1}^{p} λ_i^{a−q} · e^{−(1/2) ∑_{i=1}^{p} λ_i} (3.2)
for all λ1 > 0, λ2 > 0, · · · , λp > 0, where
c_L^{β,a} = 2^{−pa} ∏_{j=1}^{p} Γ(1 + β/2) / [Γ(1 + (β/2)j) Γ(a − (β/2)(p − j))], (3.3)
β > 0, p ≥ 2, a > (β/2)(p − 1) and q = 1 + (β/2)(p − 1). See, e.g., [9, 12] for further details. It is known that f_{β,a}(λ1, · · · , λp) is a probability density function, i.e.,
∫ · · · ∫_{[0,∞)^p} f_{β,a}(λ1, · · · , λp) dλ1 · · · dλp = 1.
See (17.6.5) from [13] (which is essentially a corollary of the Selberg integral in (3.23) below). Evidently, the density function in Lemma 3.1 corresponds to the β-Laguerre ensemble in (3.2) with
β = 1, a = (1/2)(n − 1) and q = 1 + (1/2)(p − 1). (3.4)
LEMMA 3.2 Let n > p and L*_n be as in (1.3). Assume λ1, · · · , λp have density function f_{β,a}(λ1, · · · , λp) as in (3.2) with a = (β/2)(n − 1) and q = 1 + (β/2)(p − 1). Then
E e^{t L*_n} = e^{(log n − 1) pt} · (1 − 2t/n)^{p(t − (β/2)(n−1))} · 2^{−pt} · ∏_{j=0}^{p−1} Γ(a − t − (β/2)j)/Γ(a − (β/2)j)
for any t ∈ (−β/2, (1/2)(β ∧ n)).
Proof. Recall
L*_n = (1/n) ∑_{j=1}^{p} (λ_j − n log λ_j) + p log n − p.
We then have
E e^{t L*_n} = e^{(log n − 1) pt} ∫_{[0,∞)^p} e^{(t/n) ∑_{j=1}^{p} λ_j} · ∏_{j=1}^{p} λ_j^{−t} · f_{β,a}(λ1, · · · , λp) dλ1 · · · dλp
= e^{(log n − 1) pt} · c_L^{β,a} ∫_{[0,∞)^p} e^{−(1/2 − t/n) ∑_{j=1}^{p} λ_j} · ∏_{j=1}^{p} λ_j^{(a−t)−q} · ∏_{1≤k<l≤p} |λ_k − λ_l|^β dλ1 · · · dλp. (3.5)
For t ∈ (−β/2, (1/2)(β ∧ n)), we know 1/2 − t/n > 0. Make the change of variables µ_j = (1 − 2t/n) λ_j for 1 ≤ j ≤ p. It follows that the above is identical to
e^{(log n − 1) pt} · c_L^{β,a} · (1 − 2t/n)^{−p(a−t−q) − (β/2)p(p−1) − p} · ∫_{[0,∞)^p} e^{−(1/2) ∑_{j=1}^{p} µ_j} · ∏_{j=1}^{p} µ_j^{(a−t)−q} · ∏_{1≤k<l≤p} |µ_k − µ_l|^β dµ1 · · · dµp. (3.6)
Since t ∈ (−β/2, (1/2)(β ∧ n)) and n − p ≥ 1, we know
t < β/2 ≤ (β/2)(n − p) = (β/2)(n − 1) − (β/2)(p − 1) = a − (β/2)(p − 1).
That is, a − t > (β/2)(p − 1). Therefore the integral in (3.6) is equal to 1/c_L^{β,a−t} by (3.2) and (3.3). It then follows from (3.5) and (3.6) that
E e^{t L*_n} = e^{(log n − 1) pt} · (1 − 2t/n)^{−p(a−t−q) − (β/2)p(p−1) − p} · c_L^{β,a}/c_L^{β,a−t}
= e^{(log n − 1) pt} · (1 − 2t/n)^{−p(a−t−q) − (β/2)p(p−1) − p} · 2^{−pt} · ∏_{j=1}^{p} Γ(a − t − (β/2)(p − j))/Γ(a − (β/2)(p − j)).
Now use a = (β/2)(n − 1) and q = 1 + (β/2)(p − 1) to obtain
E e^{t L*_n} = e^{(log n − 1) pt} · (1 − 2t/n)^{p(t − (β/2)(n−1))} · 2^{−pt} · ∏_{j=0}^{p−1} Γ(a − t − (β/2)j)/Γ(a − (β/2)j).
The proof is complete. □
Let {Z, Z_n; n ≥ 1} be random variables. It is known that
Z_n converges to Z in distribution if lim_{n→∞} E e^{t Z_n} = E e^{tZ} < ∞ (3.7)
for all t ∈ (−t0, t0), where t0 > 0 is a constant. See, e.g., page 408 of [5].
Proof of Theorem 1. First, since log(1 − x) < −x for all 0 < x < 1, we know σ_n² > 0 for all n > p ≥ 1. Now, by assumption, it is easy to see that
lim_{n→∞} σ_n² = −2[y + log(1 − y)] if y ∈ (0, 1), and lim_{n→∞} σ_n² = +∞ if y = 1. (3.8)
Trivially, the limit is always positive. Consequently,
δ0 := inf{σ_n; n > p ≥ 1} > 0.
To finish the proof, by (3.7) it is enough to show that
E exp{(L*_n − µ_n) s / σ_n} → e^{s²/2} = E e^{s N(0,1)} (3.9)
as n → ∞ for all s with |s| < δ0/2.
Fix s with |s| < δ0/2 and set t = t_n = s/σ_n. Then |t_n| < 1/2 for all n > p ≥ 1. In Lemma 3.2, take β = 1 and a = (n − 1)/2; by (3.4),
E e^{t L*_n} = e^{(log n − 1) pt} · (1 − 2t/n)^{pt − np/2 + p/2} · 2^{−pt} · ∏_{j=0}^{p−1} Γ((n − j − 1)/2 − t)/Γ((n − j − 1)/2).
Letting i = n − j − 1, we get
E e^{t L*_n} = 2^{−pt} · e^{(log n − 1) pt} · (1 − 2t/n)^{pt − np/2 + p/2} · ∏_{i=n−p}^{n−1} Γ(i/2 − t)/Γ(i/2) (3.10)
for n > p. Then
log E e^{t L*_n} = pt(log n − 1 − log 2) + p(t + (1 − n)/2) log(1 − 2t/n) + log ∏_{i=n−p}^{n−1} Γ(i/2 − t)/Γ(i/2).
Now use the identity log(1 − x) = −x − x²/2 + O(x³) as x → 0 to get
p(t + (1 − n)/2) log(1 − 2t/n) = p(t + (1 − n)/2)(−2t/n − 2t²/n² + O(1/n³))
= −(2pt/n)(t + (1 − n)/2)(1 + t/n) + o(1)
= −(2pt/n)(t/2 + (1 − n)/2 + O(1/n)) + o(1)
= −(p/n) t² + pt − yt + o(1)
as n → ∞. Recall r_n = (−log(1 − p/n))^{1/2}. We know t = t_n = s/σ_n = O(1/r_n) as n → ∞. By Proposition 2.1,
log ∏_{i=n−p}^{n−1} Γ(i/2 − t)/Γ(i/2) = pt(1 + log 2) − pt log n + r_n² (t² + (p − n + 1.5)t) + o(1)
as n → ∞. Joining all the assertions from (3.10) onward, we obtain
log E e^{t L*_n} = pt(log n − 1 − log 2) − (p/n) t² + pt − yt + pt(1 + log 2) − pt log n + r_n² (t² + (p − n + 1.5)t) + o(1)
= (−p/n + r_n²) t² + [p + r_n² (p − n + 1.5) − y] t + o(1) (3.11)
as n → ∞. Noticing
p + r_n² (p − n + 1.5) − y = (n − p − 3/2) log(1 − p/n) + p − y = µ_n
and, from the definition of σ_n and the notation t = s/σ_n, that (−p/n + r_n²) t² = s²/2, it follows from (3.11) that
log E exp{(L*_n − µ_n) s / σ_n} = log E e^{t L*_n} − µ_n t → s²/2
as n → ∞. This implies (3.9). The proof is completed. □
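Since Lemma 3.2 gives the moment generating function of L*_n in closed form, the limit in the proof above can be confirmed numerically without simulating any matrices. A sketch (plain Python; the values of n, p and s are ours) computing log E exp{(L*_n − µ_n)s/σ_n}, which should approach s²/2:

```python
import math

def log_mgf(n: int, p: int, t: float) -> float:
    """log E exp(t L*_n) from (3.10) with beta = 1; valid for |t| < 1/2."""
    val = (p * t * (math.log(n) - 1 - math.log(2))
           + p * (t + (1 - n) / 2) * math.log(1 - 2 * t / n))
    return val + sum(math.lgamma(i / 2 - t) - math.lgamma(i / 2)
                     for i in range(n - p, n))

def standardized_log_mgf(n: int, p: int, s: float) -> float:
    """log E exp((L*_n - mu_n) s / sigma_n); tends to s^2/2 by Theorem 1."""
    l = math.log(1 - p / n)
    mu = (n - p - 1.5) * l + p - p / n      # mu_n with y taken as p/n
    sigma = math.sqrt(-2 * (p / n + l))
    t = s / sigma                            # here |t| < 1/2
    return log_mgf(n, p, t) - mu * t

val = standardized_log_mgf(20000, 18000, 0.8)   # close to 0.8**2 / 2 = 0.32
```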
Now we start to prove Theorem 2. The following lemma says that the distribution of L1 in (1.7) does not depend on the mean vectors or covariance matrices of the population distributions from which the random samples x_i and y_j come.
LEMMA 3.3 Let L1 be defined as in (1.7) with n1 > p and n2 > p. Then, under H0 in (1.4), L1 and
L̃1 := [(n1 + n2)^{(n1+n2)p/2} / (n1^{n1 p/2} n2^{n2 p/2})] · |C|^{n1/2} · |I − C|^{n2/2} (3.12)
have the same distribution, where
C = (U*U + V*V)^{−1/2} (U*U) (U*U + V*V)^{−1/2} (3.13)
with U = (u_{ij})_{(n1−1)×p} and V = (v_{ij})_{(n2−1)×p}, and {u_{ij}, v_{kl}} are i.i.d. random variables with distribution N(0, 1).
Proof. Recall that x1, · · · , x_{n1} is a random sample from the population Np(µ1, Σ1), y1, · · · , y_{n2} is a random sample from the population Np(µ2, Σ2), and the two sets of random variables are independent. Under H0 in (1.4), Σ1 = Σ2 = Σ and Σ is non-singular. Set
x̃_i = Σ^{−1/2} x_i and ỹ_j = Σ^{−1/2} y_j
for 1 ≤ i ≤ n1 and 1 ≤ j ≤ n2. Then {x̃_i; 1 ≤ i ≤ n1} are i.i.d. with distribution Np(µ̃1, Ip) where µ̃1 = Σ^{−1/2} µ1; {ỹ_j; 1 ≤ j ≤ n2} are i.i.d. with distribution Np(µ̃2, Ip) where µ̃2 = Σ^{−1/2} µ2. Further, {x̃_i; 1 ≤ i ≤ n1} and {ỹ_j; 1 ≤ j ≤ n2} are obviously independent. Similar to (1.5) and (1.6), define, with x̄̃ and ȳ̃ the sample means of the x̃_i and the ỹ_i,
Ã = (1/n1) ∑_{i=1}^{n1} (x̃_i − x̄̃)(x̃_i − x̄̃)* and B̃ = (1/n2) ∑_{i=1}^{n2} (ỹ_i − ȳ̃)(ỹ_i − ȳ̃)* (3.14)
where
x̄̃ = (1/n1) ∑_{i=1}^{n1} x̃_i and ȳ̃ = (1/n2) ∑_{i=1}^{n2} ỹ_i. (3.15)
It is easy to check that
A = Σ^{1/2} Ã Σ^{1/2} and B = Σ^{1/2} B̃ Σ^{1/2}. (3.16)
By Lemma 3.1,
n1 Ã =_d U*U and n2 B̃ =_d V*V (3.17)
where U = (u_{ij})_{(n1−1)×p} and V = (v_{ij})_{(n2−1)×p}, and {u_{ij}, v_{kl}; i, j, k, l ≥ 1} are i.i.d. random variables with distribution N(0, 1). Reviewing (1.7),
L1 = |A|^{n1/2} · |B|^{n2/2} / |c1 A + c2 B|^{N/2} = [N^{Np/2} / (n1^{n1p/2} n2^{n2p/2})] · |n1 A|^{n1/2} · |n2 B|^{n2/2} / |n1 A + n2 B|^{N/2}
= [N^{Np/2} / (n1^{n1p/2} n2^{n2p/2})] · |n1 Ã|^{n1/2} · |n2 B̃|^{n2/2} / |n1 Ã + n2 B̃|^{N/2} (3.18)
since |n1 A| = |n1 Ã| · |Σ|, |n2 B| = |n2 B̃| · |Σ| and
|n1 A + n2 B| = |Σ^{1/2} (n1 Ã + n2 B̃) Σ^{1/2}| = |n1 Ã + n2 B̃| · |Σ|
by (3.16); hence the factor |Σ|^{(n1+n2)/2} in the numerator cancels |Σ|^{N/2} in the denominator. Define C̃ = (n1 Ã + n2 B̃)^{−1/2} (n1 Ã) (n1 Ã + n2 B̃)^{−1/2}. From the independence between n1 Ã and n2 B̃ and the independence between U*U and V*V, we see that
C̃ =_d C, (3.19)
where C is as in (3.13). It is obvious that
|C̃| = |n1 Ã| · |n1 Ã + n2 B̃|^{−1} and |I − C̃| = |n2 B̃| · |n1 Ã + n2 B̃|^{−1}.
Hence we have from (3.18) that
L1 = [N^{Np/2} / (n1^{n1p/2} n2^{n2p/2})] · |C̃|^{n1/2} · |I − C̃|^{n2/2}. (3.20)
Finally, we get the desired conclusion from (3.19) and (3.20). �
Let λ1, · · · , λp be the eigenvalues of the β-Jacobi ensemble (the β-MANOVA matrix); that is, they have the joint probability density function
f(λ1, · · · , λp) = c_J^{β,a1,a2} ∏_{1≤i<j≤p} |λ_i − λ_j|^β · ∏_{i=1}^{p} λ_i^{a1−q} (1 − λ_i)^{a2−q} (3.21)
for 0 ≤ λ1, · · · , λp ≤ 1, where a1, a2 > (β/2)(p − 1) are parameters, q = 1 + (β/2)(p − 1), and
c_J^{β,a1,a2} = ∏_{j=1}^{p} Γ(1 + β/2) Γ(a1 + a2 − (β/2)(p − j)) / [Γ(1 + (β/2)j) Γ(a1 − (β/2)(p − j)) Γ(a2 − (β/2)(p − j))] (3.22)
with a1 = (β/2)(n1 − 1) and a2 = (β/2)(n2 − 1). The fact that f(λ1, · · · , λp) is a probability density function follows from the Selberg integral (see, e.g., [10, 13]):
∫_{[0,1]^p} ∏_{1≤i<j≤p} |λ_i − λ_j|^β · ∏_{i=1}^{p} λ_i^{a1−q} (1 − λ_i)^{a2−q} dλ1 · · · dλp = 1/c_J^{β,a1,a2}. (3.23)
It is known that the eigenvalues of C defined in (3.13) have the density function f(λ1, · · · , λp) in (3.21) with
β = 1, a1 = (1/2)(n1 − 1), a2 = (1/2)(n2 − 1) and q = 1 + (1/2)(p − 1). (3.24)
See, for example, [7, 14] for this fact.
LEMMA 3.4 Let T_N be as in (1.7). Assume n1 > p and n2 > p. Then
E e^{t T_N} = C_{n1,n2} · U_n(t) · V_{1,n}(t)^{−1} · V_{2,n}(t)^{−1}
for all t < (1/2)(1 − p/(n1 ∧ n2)), where
C_{n1,n2} = n1^{n1pt} n2^{n2pt} / (n1 + n2)^{(n1+n2)pt}, U_n(t) = ∏_{i=N−p−1}^{N−2} Γ(i/2)/Γ(i/2 − Nt),
V_{1,n}(t) = ∏_{i=n1−p}^{n1−1} Γ(i/2)/Γ(i/2 − n1 t) and V_{2,n}(t) = ∏_{i=n2−p}^{n2−1} Γ(i/2)/Γ(i/2 − n2 t). (3.25)
Proof. From (1.7), e^{t T_N} = (L1)^{−2t} for any t ∈ R. Therefore, by Lemma 3.3,
E e^{t T_N} = C_{n1,n2} · E(|C|^{−n1 t} · |I − C|^{−n2 t}) = C_{n1,n2} · E(∏_{j=1}^{p} λ_j^{−n1 t} (1 − λ_j)^{−n2 t})
where λ1, · · · , λp are the eigenvalues of C in (3.13). Write c_J^{a1,a2} = c_J^{1,a1,a2}. By (3.22) and (3.24),
E e^{t T_N} = C_{n1,n2} · c_J^{a1,a2} · ∫_{[0,1]^p} ∏_{j=1}^{p} λ_j^{a1−n1t−q} (1 − λ_j)^{a2−n2t−q} · ∏_{1≤i<j≤p} |λ_i − λ_j| dλ1 · · · dλp
= C_{n1,n2} · c_J^{a1,a2} / c_J^{a1−n1t, a2−n2t} (3.26)
since f(λ1, · · · , λp) is a probability density function. Of course, recalling a_i = (1/2)(n_i − 1) for i = 1, 2 and the assumption t < (1/2)(1 − p/(n1 ∧ n2)), we know
a1 − n1 t > (1/2)(p − 1) and a2 − n2 t > (1/2)(p − 1),
which are required in (3.21). From (3.26), we see
E e^{t T_N} = C_{n1,n2} · ∏_{j=1}^{p} Γ(a1 + a2 − (1/2)(p − j))/Γ(a1 + a2 − Nt − (1/2)(p − j)) · [∏_{j=1}^{p} Γ(a1 − (1/2)(p − j))/Γ(a1 − n1 t − (1/2)(p − j))]^{−1} · [∏_{j=1}^{p} Γ(a2 − (1/2)(p − j))/Γ(a2 − n2 t − (1/2)(p − j))]^{−1}
=: C_{n1,n2} · Ũ_n(t) · Ṽ_{1,n}(t)^{−1} · Ṽ_{2,n}(t)^{−1}. (3.27)
Now use a_i = (1/2)(n_i − 1) for i = 1, 2 again to get
a1 − (1/2)(p − j) = (1/2)(n1 − p + j − 1); a2 − (1/2)(p − j) = (1/2)(n2 − p + j − 1);
a1 + a2 − (1/2)(p − j) = (1/2)(N − p + j − 2).
Thus, by setting i = N − p + j − 2 for j = 1, 2, · · · , p, we have
Ũ_n(t) = ∏_{j=1}^{p} Γ(a1 + a2 − (1/2)(p − j))/Γ(a1 + a2 − Nt − (1/2)(p − j)) = ∏_{i=N−p−1}^{N−2} Γ(i/2)/Γ(i/2 − Nt) = U_n(t).
Similarly, Ṽ_{i,n}(t) = V_{i,n}(t) for i = 1, 2. Combining these with (3.27) yields the desired result. □
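Lemma 3.4 likewise makes the moment generating function of T_N computable exactly, so the convergence claimed in Theorem 2 can be checked numerically; a sketch (plain Python; the sample sizes and s are ours) of log E exp{(T_N/N − µ_n)s/σ_n}, which should approach s²/2:

```python
import math

def log_mgf_T(n1: int, n2: int, p: int, t: float) -> float:
    """log E exp(t T_N) from Lemma 3.4; needs t < (1 - p/min(n1, n2))/2."""
    N = n1 + n2
    val = p * t * (n1 * math.log(n1) + n2 * math.log(n2) - N * math.log(N))
    val += sum(math.lgamma(i / 2) - math.lgamma(i / 2 - N * t)
               for i in range(N - p - 1, N - 1))        # log U_n(t)
    for m in (n1, n2):                                  # minus log V_{i,n}(t)
        val -= sum(math.lgamma(i / 2) - math.lgamma(i / 2 - m * t)
                   for i in range(m - p, m))
    return val

def standardized_log_mgf_T(n1: int, n2: int, p: int, s: float) -> float:
    """log E exp((T_N/N - mu_n) s / sigma_n); tends to s^2/2 by Theorem 2."""
    N = n1 + n2
    lN = math.log(1 - p / N)
    l1 = math.log(1 - p / n1)
    l2 = math.log(1 - p / n2)
    mu = (p - N + 2.5) * lN - (p - n1 + 1.5) * (n1 / N) * l1 \
                            - (p - n2 + 1.5) * (n2 / N) * l2
    sigma = math.sqrt(2 * lN - 2 * ((n1 / N) ** 2 * l1 + (n2 / N) ** 2 * l2))
    t = s / sigma
    return log_mgf_T(n1, n2, p, t / N) - mu * t

val = standardized_log_mgf_T(4000, 4000, 3600, 0.8)   # near 0.8**2 / 2 = 0.32
```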
LEMMA 3.5 Let T_N be as in (1.7). Assume n_i > p and p/n_i → y_i ∈ (0, 1] for i = 1, 2. Recall σ_n² in (1.8). Then 0 < σ_n < ∞ for all n1 ≥ 2, n2 ≥ 2, and E exp{(T_N/(N σ_n)) t} < ∞ for all t ∈ R when n1 and n2 are sufficiently large.
Proof. First, we claim that
σ² := 2[log(1 − y) − γ1² log(1 − y1) − γ2² log(1 − y2)] > 0 (3.28)
for all y1, y2 ∈ (0, 1), where γ1 = y2(y1 + y2)^{−1}, γ2 = y1(y1 + y2)^{−1} and y = y1 y2 (y1 + y2)^{−1}. In fact, consider h(x) = −log(1 − x) for x < 1. Then h''(x) = (1 − x)^{−2} > 0 for x < 1; that is, h(x) is strictly convex. Take γ3 = 2 y1 y2/(y1 + y2)². Then γ1² + γ2² + γ3 = 1. Hence, by the convexity,
−γ1² log(1 − y1) − γ2² log(1 − y2) = −γ1² log(1 − y1) − γ2² log(1 − y2) − γ3 log(1 − 0) > −log(1 − (γ1² y1 + γ2² y2 + γ3 · 0)) = −log(1 − y),
where the strict inequality holds since y1 ≠ 0 and y2 ≠ 0.
Now, taking y_i = p/n_i ∈ (0, 1) for i = 1, 2 in (3.28), we get
γ1 = y2/(y1 + y2) = n1/N, γ2 = y1/(y1 + y2) = n2/N and y = y1 y2/(y1 + y2) = p/N.
Evidently, n1/N, n2/N, p/N ∈ (0, 1). Then, by (3.28), we know 0 < σ_n < ∞ for all n1 ≥ 2, n2 ≥ 2.
Second, noting that
{t : t/(N σ_n) < (1/2)(1 − p/(n1 ∧ n2))} = (−∞, (1/2)(1 − p/(n1 ∧ n2)) N σ_n),
to prove the second part it suffices, by Lemma 3.4, to show that
lim_{n1,n2→∞} (1 − p/(n1 ∧ n2)) N σ_n = +∞. (3.29)
Case 1: y1 < 1, y2 < 1. Recall σ² in (3.28). Evidently, σ_n² → σ² ∈ (0, ∞) as n1, n2 → +∞. Hence (3.29) follows since 1 − p/(n1 ∧ n2) → 1 − y1 ∨ y2 > 0.
Case 2: max{y1, y2} = 1. This implies σ_n² → +∞ as n1, n2 → ∞ because log(1 − p/N) → log(1 − y) ∈ (−∞, 0) and the sum of the last two terms on the right hand side of (1.8) goes to +∞. Further, the given conditions say that n_i − 1 ≥ p, and hence 1 − p/n_i ≥ 1/n_i ≥ 1/N for i = 1, 2. Thus,
(1 − p/(n1 ∧ n2)) N σ_n = min{1 − p/n1, 1 − p/n2} · N σ_n ≥ σ_n → +∞
as n1, n2 → ∞. We get (3.29). The proof is complete. □
Proof of Theorem 2. By Lemma 3.5 we may assume, without loss of generality, that E exp{(T_N/(N σ_n)) t} < ∞ for all n1 ≥ 2, n2 ≥ 2 and t ∈ R. Fix t ∈ R and set t_n = t_{n1,n2} = t/σ_n for n1, n2 ≥ 2. From the condition p/n_i → y_i for i = 1, 2 as p ∧ n1 ∧ n2 = p → ∞ by the assumptions n1 > p and n2 > p (we will simply say "p → ∞" in similar situations later), we know σ_n² has a positive limit (possibly +∞) as p → ∞. It follows that {t_n; n1, n2 ≥ 2} is bounded. By Lemma 3.4,
log E exp{(T_N/(N σ_n)) t} = −log V_{1,n}(t_n/N) − log V_{2,n}(t_n/N) + log U_n(t_n/N) + (p t_n/N)(n1 log n1 + n2 log n2 − N log N). (3.30)
Set γ1 = y2(y1 + y2)^{−1}, γ2 = y1(y1 + y2)^{−1} and y = y1 y2 (y1 + y2)^{−1}. Easily,
n_i/N → γ_i ∈ (0, 1), p/(N − 1) → y ∈ (0, 1) and 2 log(1 − p/N) → 2 log(1 − y) ∈ (−∞, 0)
as p → ∞. Then, from (1.8) we know that
(n_i/N) t_n ∼ γ_i t · (1/σ_n) = O((−log(1 − p/n_i))^{−1/2}) and t_n = O((−log(1 − p/(N − 1)))^{−1/2}) (3.31)
for i = 1, 2 as p → ∞. Replacing "t" in Proposition 2.1 with "n1 t_n/N", we have
−log V_{1,n}(t_n/N) = log ∏_{i=n1−p}^{n1−1} Γ(i/2 − (n1/N) t_n)/Γ(i/2)
= (n1 p t_n/N)(1 + log 2) − (n1 p t_n/N) log n1 + r_{n,1}² ((n1²/N²) t_n² + (p − n1 + 1.5)(n1/N) t_n) + o(1) (3.32)
as p → ∞, where
r_{n,i} := (−log(1 − p/n_i))^{1/2}, i = 1, 2. (3.33)
Similarly,
−log V_{2,n}(t_n/N) = log ∏_{i=n2−p}^{n2−1} Γ(i/2 − (n2/N) t_n)/Γ(i/2)
= (n2 p t_n/N)(1 + log 2) − (n2 p t_n/N) log n2 + r_{n,2}² ((n2²/N²) t_n² + (p − n2 + 1.5)(n2/N) t_n) + o(1) (3.34)
as p → ∞. By the same argument, using (3.31) we see
−log U_n(t_n/N) = log ∏_{i=(N−1)−p}^{(N−1)−1} Γ(i/2 − t_n)/Γ(i/2) = p t_n (1 + log 2) − p t_n log(N − 1) + R_n² (t_n² + (p − N + 2.5) t_n) + o(1) (3.35)
as p → ∞, where
R_n = (−log(1 − p/(N − 1)))^{1/2}. (3.36)
From (3.32) and (3.34),
−log V_{i,n}(t_n/N) + (p t_n/N) n_i log n_i = (n_i p t_n/N)(1 + log 2) + r_{n,i}² ((n_i²/N²) t_n² + (p − n_i + 1.5)(n_i/N) t_n) + o(1)
= (n_i p t_n/N)(1 + log 2) + (n_i² r_{n,i}²/N²) t_n² + ((p − n_i + 1.5) n_i r_{n,i}²/N) t_n + o(1) (3.37)
as p → ∞ for i = 1, 2. Since {t_n} is bounded, use log(1 + x) = x + O(x²) as x → 0 to see
p t_n log N − p t_n log(N − 1) = p t_n log(1 + 1/(N − 1)) = y t_n + o(1)
as p → ∞, where lim p/(N − 1) = y1 y2/(y1 + y2) = y < 1. Therefore, by (3.35) and the fact N = n1 + n2,
−log U_n(t_n/N) + p t_n log N = p t_n (1 + log 2) + y t_n + R_n² (t_n² + (p − N + 2.5) t_n) + o(1)
= ((n1 p t_n + n2 p t_n)/N)(1 + log 2) + R_n² t_n² + (y + (p − N + 2.5) R_n²) t_n + o(1) (3.38)
as p → ∞. Joining (3.30) with (3.37) and (3.38), we obtain
log E e^{t_n T_N/N} = ((n1²/N²) r_{n,1}² + (n2²/N²) r_{n,2}² − R_n²) t_n² + ρ_n t_n + o(1) (3.39)
as p → ∞, where
ρ_n = (1/N)((p − n1 + 1.5) n1 r_{n,1}² + (p − n2 + 1.5) n2 r_{n,2}²) − (p − N + 2.5) R_n² − y. (3.40)
By using the fact log(1 + x) = x + O(x²) again, we have
log((N − 1)/N · (N − p)/(N − p − 1)) = log(1 − 1/N) − log(1 − 1/(N − p)) = p/(N(N − p)) + O(1/N²)
as p → ∞. Reviewing (3.36), we have
R_n² = −log(1 − p/(N − 1)) = −log(1 − p/N) + log((N − 1)/N · (N − p)/(N − p − 1)) = r_n² + p/(N(N − p)) + O(1/N²) (3.41)
as p → ∞, where
r_n = (−log(1 − p/N))^{1/2}.
In particular, since {t_n} is bounded,
R_n² t_n² = r_n² t_n² + o(1) (3.42)
as p → ∞. By (3.41), recalling p/N → y, we get
(p − N + 2.5) R_n² = (p − N + 2.5) r_n² − p/N + o(1) = (p − N + 2.5) r_n² − y + o(1)
as p → ∞. Plugging this into (3.40) gives
ρ_n = (1/N)((p − n1 + 1.5) n1 r_{n,1}² + (p − n2 + 1.5) n2 r_{n,2}²) − (p − N + 2.5) r_n² + o(1) (3.43)
as p → ∞. Now plug the above and (3.42) into (3.39); since {t_n} is bounded, we have
log E e^{t_n T_N/N} = ((n1²/N²) r_{n,1}² + (n2²/N²) r_{n,2}² − r_n²) t_n² + µ_n t_n + o(1) (3.44)
as p → ∞ with
µ_n = (1/N)((p − n1 + 1.5) n1 r_{n,1}² + (p − n2 + 1.5) n2 r_{n,2}²) − (p − N + 2.5) r_n².
Using t_n = t/σ_n and the definition of σ_n, we get
((n1²/N²) r_{n,1}² + (n2²/N²) r_{n,2}² − r_n²) t_n² = t_n² (log(1 − p/N) − (n1²/N²) log(1 − p/n1) − (n2²/N²) log(1 − p/n2)) → t²/2
as p → ∞. This and (3.44) conclude that
log E exp{((T_N − N µ_n)/N) t_n} = log E e^{t_n T_N/N} − µ_n t_n → t²/2
as p → ∞, which is equivalent to
E exp{(1/σ_n)(T_N/N − µ_n) t} → e^{t²/2} = E e^{t N(0,1)}
as p → ∞ for any t ∈ R. The proof is completed by using (3.7). □
Acknowledgements. We thank Danning Li very much for checking our proofs and for many good suggestions. We also thank an anonymous referee for very helpful comments on the revision.
References
[1] Ahlfors, L. V. (1979). Complex Analysis, 3rd Edition. McGraw-Hill, Inc.
[2] Bai, Z., Jiang, D., Yao, J. and Zheng, S. (2009). Corrections to LRT on large-dimensional covariance matrix by RMT. Ann. Stat. 37(6B), 3822-3840.
[3] Bai, Z. and Saranadasa, H. (1996). Effect of high dimension: by an example of a two sample problem. Statist. Sinica 6, 311-329.
[4] Bai, Z. and Silverstein, J. (2004). CLT for linear spectral statistics of large-dimensional sample covariance matrices. Ann. Probab. 32, 553-605.
[5] Billingsley, P. (1986). Probability and Measure, 2nd Edition. Wiley Series in Probability and Mathematical Statistics.
[6] Gamelin, T. W. (2001). Complex Analysis, 1st Edition. Springer.
[7] Constantine, A. (1963). Some non-central distribution problems in multivariate analysis. Ann. Math. Stat. 34, 1270-1285.
[8] Dempster, A. (1958). A high-dimensional two sample significance test. Ann. Math. Statist. 29, 995-1010.
[9] Dumitriu, I. and Edelman, A. (2002). Matrix models for beta-ensembles. J. Math. Phys. 43(11), 5830-5847.
[10] Forrester, P. and Warnaar, S. (2008). The importance of the Selberg integral. Bull. Amer. Math. Soc. 45(4), 489-534.
[11] Freitag, E. and Busam, R. (2005). Complex Analysis. Springer.
[12] Jiang, T. Limit Theorems on Beta-Jacobi Ensembles. http://arxiv.org/abs/0911.2262.
[13] Mehta, M. L. (2004). Random Matrices, 3rd Edition. Pure and Applied Mathematics (Amsterdam), 142. Elsevier/Academic Press, Amsterdam.
[14] Muirhead, R. J. (1982). Aspects of Multivariate Statistical Theory. Wiley, New York.
[15] Zheng, S. (2008). Central limit theorem for linear spectral statistics of large dimensional F-matrix. Preprint, Northeast Normal Univ., Changchun, China.