Quasi-Maximum Likelihood Estimators For Spatial Dynamic
Panel Data With Fixed Effects When Both n and T Are Large:
A Nonstationary Case*
Jihai Yu, Robert de Jong, Lung-fei Lee†
Department of Economics
The Ohio State University
August 25, 2007
Abstract
Yu, de Jong and Lee (2006) established asymptotic properties of quasi-maximum likelihood estimators for spatial dynamic panel data with fixed effects when both the number of individuals $n$ and the number of time periods $T$ are large. This paper covers a nonstationary case where there are unit roots in the data generating process (DGP). When not all the roots in the DGP are unit, the estimators' rates of convergence will be the same as in the stationary case, and the estimators can be asymptotically normal. The presence of the nonstationary components, however, will make the estimators' asymptotic variance matrix singular. Consequently, a linear combination of the spatial and dynamic effects can converge at a higher rate. We also propose a bias correction for our estimator. When $T$ grows faster than $n^{1/3}$, the correction will asymptotically eliminate the bias and yield a centered confidence interval.
JEL classification: C13; C23
Keywords: Spatial autoregression, Dynamic panels, Fixed effects, Quasi-maximum likelihood estimation, Bias correction, Unit root, Nonstationarity
* We would like to thank participants of the Econometrics Seminar at The Ohio State University (March 2007) and the Third Symposium on Econometric Theory and Applications at HKUST (April 2007) for helpful comments.
† Lee acknowledges financial support for his research from NSF under Grant No. SES-0519204.
1 Introduction
This paper investigates the properties of maximum likelihood (ML) estimators and quasi-maximum likelihood (QML) estimators for spatial dynamic panel data models with individual fixed effects when both the number of individuals $n$ and the number of time periods $T$ are large, for a nonstationary case.
In Yu, de Jong and Lee (2006), the consistency and asymptotic distribution of the QML estimators are established for the stationary case. Also, a bias correction procedure for the estimators is proposed. It is shown that as long as $T$ grows faster than $n^{1/3}$, the correction will asymptotically eliminate the bias and yield a centered confidence interval. When there are unit roots in the process, the assumption of absolute summability in Yu, de Jong and Lee (2006) does not hold, so the analysis of the estimators for the stationary case, which is crucially based on that condition, is no longer valid. In this paper, we show that when the spatial weights matrix is row normalized from a symmetric matrix, we can still obtain the consistency and asymptotic normality of the ML and QML estimators with the same rate of convergence as in the stationary case. The difference is that the variance matrix differs from that of the stationary case, and it is singular in the limit. Also, for this nonstationary case, there is a linear combination of common parameters that has a higher rate of convergence.
The nonstationary case we consider is relevant in empirical applications. In Yu (2006), a spatial dynamic panel data model is applied to study the growth convergence of the 48 contiguous states. In the estimation results, the spatial effects are significant and the sum of the estimated spatial and dynamic effects is nearly equal to 1. This implies that there may be nonstationary components in the DGP (see the discussion in Section 2.1 for details), which motivates deriving asymptotic theory for the estimators under nonstationarity. Also, in Tao's (2006) study of the education spending of local school districts using a spatial dynamic panel model, the spatial effects are significant and the sum of the estimated spatial and dynamic effects is nearly equal to 1.
There has been growing research interest in nonstationary panels in recent years. For independent panels, we have Maddala and Wu (1999), Levin, Lin and Chu (2002), Im, Pesaran and Shin (2003), etc. For cross-sectionally correlated panels, we have Pesaran (2003), Phillips and Sul (2003), Moon and Perron (2004), etc., where the cross sectional dependence is specified by common factors. This paper covers a case of nonstationary panel data where the cross sectional dependence is specified directly by spatial correlation among units. There are already extensive empirical applications for nonstationary panel data.¹ We expect that our model can shed light on existing nonstationary panel data models and empirical applications.
This paper is organized as follows. In Section 2, the model is introduced. We then explain our method of estimation, which is a concentrated QML estimation. Several lemmas on matrix algebra and a central limit theorem are stated. Section 3 derives the consistency and asymptotic distribution of the spatial effect parameter. Using the results of Section 3, we establish the asymptotic distribution of the common parameters in Section 4. Also, a bias correction procedure is proposed and simulation results are reported. Section 5 concludes the paper. Some useful lemmas and proofs are collected in the Appendix.
¹ The applications include purchasing power parity, growth and convergence, money demand, monetary exchange rate model, inflation-rate convergence, interest rate, health care expenditure, hysteresis in unemployment, etc. See Choi (2004) for more details.
2 The Model and The Likelihood Function
2.1 The Model
The model considered in this paper is
$$Y_{nt} = \lambda_0 W_n Y_{nt} + \gamma_0 Y_{n,t-1} + \rho_0 W_n Y_{n,t-1} + X_{nt}\beta_0 + c_{n0} + V_{nt}, \quad t = 1, 2, \dots, T, \tag{2.1}$$
where $Y_{nt} = (y_{1t}, y_{2t}, \dots, y_{nt})'$ and $V_{nt} = (v_{1t}, v_{2t}, \dots, v_{nt})'$ are $n \times 1$ column vectors and $v_{it}$ is i.i.d. across $i$ and $t$ with zero mean and variance $\sigma_0^2$; $W_n$ is a known $n \times n$ spatial weights matrix which is nonstochastic and generates the spatial dependence between cross sectional units $y_{it}$; $X_{nt}$ is an $n \times k_x$ matrix of nonstochastic regressors; and $c_{n0}$ is an $n \times 1$ column vector of fixed individual effects. Therefore, the total number of parameters in this model is equal to the number of individuals $n$ plus the dimension of the common parameters $(\gamma, \rho, \beta', \lambda, \sigma^2)'$, which is $k_x + 4$. $W_n$ is usually row normalized from a symmetric matrix such that its $i$th row is $[c_{n,i1}, c_{n,i2}, \dots, c_{n,in}] / \sum_{j=1}^{n} c_{n,ij}$, where $c_{n,ij}$ represents a function of the spatial distance between units $i$ and $j$ in some space. As a normalization, $c_{n,ii} = 0$. It is a common practice in empirical work that $W_n$ is row normalized, which ensures that all the weights are between 0 and 1 and that weighting operations can be interpreted as an average of the neighboring values. Also, a weights matrix row normalized from a symmetric matrix has real eigenvalues, all less than or equal to one in absolute value, and its largest eigenvalue is always 1 (see Ord (1975)). Such a spatial weights matrix is also diagonalizable (see Proposition B.1 in Appendix B).
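The eigenvalue properties just described can be checked numerically. The following Python sketch is only an illustration: the 4-unit symmetric matrix `C` and the helper name `row_normalize` are assumptions made for the example and do not come from the paper.

```python
import numpy as np

def row_normalize(C):
    """Divide each row of a nonnegative matrix C by its row sum."""
    C = np.asarray(C, dtype=float)
    return C / C.sum(axis=1, keepdims=True)

# Hypothetical symmetric "closeness" matrix with zero diagonal (c_{n,ii} = 0).
C = np.array([[0.0, 1.0, 0.5, 0.0],
              [1.0, 0.0, 1.0, 0.5],
              [0.5, 1.0, 0.0, 1.0],
              [0.0, 0.5, 1.0, 0.0]])
W = row_normalize(C)

# W = D^{-1} C is similar to the symmetric matrix D^{-1/2} C D^{-1/2},
# so its eigenvalues are real; eigvalsh returns them in ascending order.
deg = C.sum(axis=1)
eigvals = np.linalg.eigvalsh(C / np.sqrt(np.outer(deg, deg)))

print(W.sum(axis=1))  # every row sums to one
print(eigvals)        # all within [-1, 1], the largest equal to 1
```

Working with the similar symmetric matrix rather than with `W` directly is a design choice of this sketch: it guarantees real output and avoids spurious tiny imaginary parts.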
Define $S_n(\lambda) = I_n - \lambda W_n$ and denote $S_n \equiv S_n(\lambda_0) = I_n - \lambda_0 W_n$. Then, presuming $S_n$ is invertible and denoting $A_n = S_n^{-1}(\gamma_0 I_n + \rho_0 W_n)$, (2.1) can be rewritten as
$$Y_{nt} = A_n Y_{n,t-1} + S_n^{-1} X_{nt}\beta_0 + S_n^{-1} c_{n0} + S_n^{-1} V_{nt}. \tag{2.2}$$
A nonstationary case occurs if some eigenvalues $d_{ni}$ of $A_n$ are equal to 1, i.e., $d_{ni} = \frac{\gamma_0 + \rho_0 \varpi_{ni}}{1 - \lambda_0 \varpi_{ni}} = 1$ for some $i$, where $\varpi_{ni}$ is an eigenvalue of $W_n$. For the nonstationary case, we can decompose $Y_{nt}$ into a stationary part and a nonstationary part. To do that, we first diagonalize² $A_n$ as $A_n = R_n D_n R_n^{-1}$, where the columns of $R_n$ are the eigenvectors of $A_n$ and $D_n = \mathrm{diag}(d_{n1}, d_{n2}, \dots, d_{nn})$, where the $d_{ni}$'s are the eigenvalues of $A_n$. When $d_{n,\max} = 1$ and $d_{n,\min} > -1$, where $d_{n,\max}$ and $d_{n,\min}$ are respectively the largest and smallest eigenvalues of $A_n$, suppose without loss of generality that $d_{ni} = 1$ for $i = 1, 2, \dots, m_n$ and $|d_{ni}| < 1$ for $m_n + 1 \le i \le n$, where $m_n$ is the number of unit roots. Let $B_n = R_n \tilde D_n R_n^{-1}$ with $\tilde D_n = \mathrm{diag}(0, \dots, 0, d_{n,m_n+1}, \dots, d_{nn})$, so that $D_n = J_n + \tilde D_n$, where $J_n = \mathrm{diag}(1_{m_n}', 0, \dots, 0)$ with $1_{m_n}$ being an $m_n \times 1$ vector of ones. As $J_n$ is idempotent and $J_n \tilde D_n = 0$, $A_n^h = M_n + B_n^h$ for any $h = 1, 2, \dots$, where $M_n = R_n J_n R_n^{-1}$. Then (see Proposition B.5 in Appendix B), for $t \ge 0$, we can decompose $Y_{nt}$ into the sum of a stationary part and a nonstationary part:
$$Y_{nt} = Y_{nt}^u + Y_{nt}^s, \tag{2.3}$$
where
$$Y_{nt}^u = M_n\left(Y_{n,-1} + \frac{c_{n0}\, t}{1-\lambda_0} + \frac{\sum_{h=0}^{t-1} X_{nh}\beta_0}{1-\lambda_0} + \frac{\sum_{h=0}^{t-1} V_{nh}}{1-\lambda_0}\right), \tag{2.4}$$
$$Y_{nt}^s = \sum_{h=0}^{\infty} B_n^h S_n^{-1} c_{n0} + \sum_{h=0}^{\infty} B_n^h S_n^{-1} X_{n,t-h}\beta_0 + \sum_{h=0}^{\infty} B_n^h S_n^{-1} V_{n,t-h}. \tag{2.5}$$
Compared to the stationary case, the model has a time trend attachment $M_n \frac{c_{n0}\, t}{1-\lambda_0} + M_n \frac{\sum_{h=0}^{t-1} X_{nh}\beta_0}{1-\lambda_0}$, a random walk attachment $M_n \frac{\sum_{h=0}^{t-1} V_{nh}}{1-\lambda_0}$ and a nonstationary initial value component $M_n Y_{n,-1}$.
² See Proposition B.2 in Appendix B for diagonalizability of $A_n$.
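Using (2.3), (2.4) and (2.5), we have³
$$\tilde Y_{nt} = \tilde Y_{nt}^u + \tilde Y_{nt}^s, \quad t = 0, 1, \dots, T, \tag{2.6}$$
where
$$\tilde Y_{nt}^u = \frac{1}{1-\lambda_0} M_n\left(c_{n0}\tilde t + \tilde{\mathbb{X}}_{nt}\beta_0 + \tilde\xi_{nt}\right), \quad \tilde Y_{nt}^s = \tilde{\mathcal{X}}_{nt}^s\beta_0 + \tilde U_{nt}^s, \tag{2.7}$$
with $\tilde t = t - \frac{T+1}{2}$, $\mathbb{X}_{nt} = \sum_{h=0}^{t-1} X_{nh}$, $\xi_{nt} = \sum_{h=0}^{t-1} V_{nh}$, $\mathcal{X}_{nt}^s = \sum_{h=0}^{\infty} B_n^h S_n^{-1} X_{n,t-h}$ and $U_{nt}^s = \sum_{h=0}^{\infty} B_n^h S_n^{-1} V_{n,t-h}$.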
To analyze the model, the following assumptions are needed.
Assumption 1. $W_n$ is a nonstochastic spatial weights matrix, row normalized from a symmetric weights matrix.
Assumption 2. The disturbances $\{v_{it}\}$, $i = 1, 2, \dots, n$ and $t = 1, 2, \dots, T$, are i.i.d. across $i$ and $t$ with zero mean, variance $\sigma_0^2$ and $E|v_{it}|^{4+\eta} < \infty$ for some $\eta > 0$.
Assumption 3. $n$ is a nondecreasing function of $T$.
Assumption 4. The elements of $X_{nt}$ and $c_{n0}$ are nonstochastic and bounded, uniformly in $n$ and $t$, and $\lim_{T\to\infty} \frac{1}{nT}\sum_{t=1}^{T} \tilde X_{nt}'\tilde X_{nt}$ exists and is nonsingular. Also, $\lim_{T\to\infty} \frac{1}{nT^3}\sum_{t=1}^{T} (c_{n0}\tilde t + \tilde{\mathbb{X}}_{nt}\beta_0)' M_n' M_n (c_{n0}\tilde t + \tilde{\mathbb{X}}_{nt}\beta_0) \neq 0$.
Assumption 5. $S_n(\lambda)$ is invertible for all $\lambda \in \Lambda$. Furthermore, $\Lambda$ is compact⁴ and the true parameter $\lambda_0$ with $|\lambda_0| < 1$ is in the interior of $\Lambda$.
Assumption 6. $\lambda_0 + \gamma_0 + \rho_0 = 1$ with $\gamma_0 \neq 1$. Also, $d_{n,\max} = 1$ and $d_{n,\min} > -1$, where $d_{n,\max}$ and $d_{n,\min}$ are the largest and smallest eigenvalues of $A_n$.
Assumption 7. The row and column sums of $W_n$ and $S_n^{-1}(\lambda)$ are bounded uniformly⁵ in $n$, and also uniformly in $\lambda \in \Lambda$ for $S_n^{-1}(\lambda)$.
Assumption 8. The row and column sums of $\sum_{h=1}^{\infty} \mathrm{abs}(B_n^h)$ are bounded uniformly in $n$, where $[\mathrm{abs}(B_n)]_{ij} = |B_{n,ij}|$.
³ For notational purposes, for any $n \times 1$ vector $\nu_{nt}$ at period $t$, we define $\tilde\nu_{nt} = \nu_{nt} - \bar\nu_{nT}$ and $\tilde{\tilde\nu}_{n,t-1} = \nu_{n,t-1} - \bar\nu_{nT,-1}$ for $t = 1, 2, \dots, T$, where $\bar\nu_{nT} = \frac{1}{T}\sum_{t=1}^{T}\nu_{nt}$ and $\bar\nu_{nT,-1} = \frac{1}{T}\sum_{t=1}^{T}\nu_{n,t-1}$.
⁴ Note that in the literature, $\Lambda$ is typically assumed to be a compact subset of $(-1, 1)$.
⁵ We say the row and column sums of a (sequence of $n \times n$) matrix $P_n$ are uniformly bounded in $n$ if $\sup_{1 \le i \le n,\, n \ge 1} \sum_{j=1}^{n} |p_{ij,n}| < \infty$ and $\sup_{1 \le j \le n,\, n \ge 1} \sum_{i=1}^{n} |p_{ij,n}| < \infty$.
Assumptions 1 and 2 provide essential features of the weights matrix and disturbances of the model.
Assumption 3 allows two cases: (i) $n \to \infty$ as $T \to \infty$; (ii) $n$ is fixed as $T \to \infty$. For case (i), we say that $n, T \to \infty$ simultaneously. When exogenous variables $X_{nt}$ are included in the model, it is convenient to assume that the exogenous regressors are uniformly bounded, as is done in Assumption 4. Also, we assume that either $c_{n0}$ or $X_{nt}$ is relevant in the model. A simple consequence of Assumption 5 is that, for the system (2.1), $Y_{nt}$ can be solved in terms of $c_{n0}$, $X_{nt}$ and $V_{nt}$. Assumption 6 specifies that some roots of $A_n$ are equal to 1, while the other roots are less than 1 in absolute value. The first part of this assumption explicitly rules out the pure unit root time series case without spatial interaction; more generally, it rules out the case where $\gamma_0 = 1$ and $\lambda_0 + \rho_0 = 0$. A sufficient condition for Assumption 6 is $\rho_0 < 1$ with $|\gamma_0| < 1$ and $|\lambda_0| < 1$ under $\lambda_0 + \gamma_0 + \rho_0 = 1$ (see Proposition B.3 in Appendix B). Assumption 7 originates with Kelejian and Prucha (1998, 2001). The uniform boundedness of $W_n$ and $S_n^{-1}(\lambda)$ is a condition that limits the spatial correlation to a manageable degree. Assumption 8 combines the absolute summability condition with the row and column sum boundedness condition, and it plays an important role in deriving the asymptotic properties of the QML estimators. This assumption is essential for the model because it limits the dependence over time and across cross sectional units for the stationary component $Y_{nt}^s$ of the process. To justify the absolute summability of $B_n$ in (2.5) and Assumption 8, a sufficient condition is $\|B_n\| < 1$ for a matrix norm (see Horn and Johnson (1985), Corollary 5.6.16) that satisfies $\|B_n\| = \|\mathrm{abs}(B_n)\|$. When $\|B_n\| < 1$, $\sum_{h=0}^{\infty} B_n^h$ exists and can be defined as $(I_n - B_n)^{-1}$ (see Appendix B.1 for an example where $A_n$ has some eigenvalues equal to one but others strictly less than one in absolute value).
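The sufficient condition for absolute summability can be illustrated with a short Python sketch. It uses the maximum absolute row sum norm, which satisfies $\|B\| = \|\mathrm{abs}(B)\|$; the random matrix `B` and the bound 0.8 are hypothetical choices for illustration, not quantities from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
B = rng.uniform(-1.0, 1.0, size=(n, n))
B *= 0.8 / np.abs(B).sum(axis=1).max()   # rescale so the max absolute row sum is 0.8

# Partial sums of sum_{h>=0} B^h and of sum_{h>=0} abs(B^h).
partial = np.zeros((n, n))
abs_partial = np.zeros((n, n))
power = np.eye(n)
for _ in range(200):
    partial += power
    abs_partial += np.abs(power)
    power = power @ B

neumann_gap = np.abs(partial - np.linalg.inv(np.eye(n) - B)).max()
max_abs_row_sum = abs_partial.sum(axis=1).max()   # cannot exceed 1 / (1 - 0.8) = 5
print(neumann_gap, max_abs_row_sum)
```

The row sums of the absolute series are bounded by the geometric series in the norm, which is the sense in which a norm below one delivers the boundedness required by Assumption 8 for row sums; the same argument with the maximum absolute column sum norm covers column sums.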
2.2 Concentrated Likelihood Function
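Denote $Z_{nt} = (Y_{n,t-1}, W_n Y_{n,t-1}, X_{nt})$ and $\theta = (\delta', \lambda, \sigma^2)'$ where $\delta = (\gamma, \rho, \beta')'$. The log likelihood function of (2.1) is
$$\ln L_{n,T}(\theta, c_n) = -\frac{nT}{2}\ln 2\pi - \frac{nT}{2}\ln\sigma^2 + T\ln|S_n(\lambda)| - \frac{1}{2\sigma^2}\sum_{t=1}^{T} V_{nt}'(\zeta)V_{nt}(\zeta), \tag{2.8}$$
where $V_{nt}(\zeta) = S_n(\lambda)Y_{nt} - Z_{nt}\delta - c_n$ and $\zeta = (\delta', \lambda, c_n')'$. The QML estimators $\hat\theta_{nT}$ and $\hat c_{nT}$ are the extremum estimators derived from the maximization of (2.8). When the $V_{nt}$'s are normally distributed, $\hat\theta_{nT}$ and $\hat c_{nT}$ are the ML estimators; when the $V_{nt}$'s are not normally distributed, $\hat\theta_{nT}$ and $\hat c_{nT}$ are QML estimators. As the number of parameters goes to infinity when $n$ goes to infinity, it is convenient to use the concentrating approach. We will concentrate $c_n$ and $\delta$ out and focus the asymptotic analysis on the estimator of $\lambda_0$ via the concentrated likelihood function.⁶ For the concentrated likelihood function, the dimension of the parameter space does not change as $n$ and/or $T$ increase.
From (2.8), using the first order conditions, we can get the concentrated estimators given $\lambda$:
$$\hat\delta_{nT}(\lambda) = \left[\frac{1}{nT}\sum_{t=1}^{T}\tilde Z_{nt}'\tilde Z_{nt}\right]^{-1}\left[\frac{1}{nT}\sum_{t=1}^{T}\tilde Z_{nt}' S_n(\lambda)\tilde Y_{nt}\right], \quad \hat c_{nT}(\lambda) = \frac{1}{T}\sum_{t=1}^{T}\left(S_n(\lambda)Y_{nt} - Z_{nt}\hat\delta_{nT}(\lambda)\right),$$
$$\hat\sigma_{nT}^2(\lambda) = \frac{1}{nT}\sum_{t=1}^{T}\left(S_n(\lambda)\tilde Y_{nt} - \tilde Z_{nt}\hat\delta_{nT}(\lambda)\right)'\left(S_n(\lambda)\tilde Y_{nt} - \tilde Z_{nt}\hat\delta_{nT}(\lambda)\right), \tag{2.9}$$
and the concentrated likelihood is
$$\ln L_{n,T}(\lambda) = -\frac{nT}{2}(\ln 2\pi + 1) - \frac{nT}{2}\ln\hat\sigma_{nT}^2(\lambda) + T\ln|S_n(\lambda)|. \tag{2.10}$$
The QML estimator $\hat\lambda_{nT}$ maximizes the concentrated likelihood function (2.10), and the QML estimators of $\delta_0$, $\sigma_0^2$ and $c_{n0}$ are $\hat\delta_{nT}(\hat\lambda_{nT})$, $\hat\sigma_{nT}^2(\hat\lambda_{nT})$ and $\hat c_{nT}(\hat\lambda_{nT})$.
⁶ The reason to concentrate out $\delta$ is to avoid technical complication in the consistency proof and in deriving the asymptotic distribution jointly for the common parameters. See footnote 15 for details.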
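The concentrated estimation in (2.9) and (2.10) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `concentrated_loglik`, the circular weights matrix, the grid search over the spatial coefficient, and all simulated values are hypothetical choices made for the example.

```python
import numpy as np

def concentrated_loglik(lam, Y, X, W):
    """Concentrated log likelihood (2.10) and the estimators in (2.9).

    Y: (T+1, n) array with rows Y_{n0}, ..., Y_{nT}; X: (T, n, kx) for t = 1..T; W: (n, n).
    """
    T, n = Y.shape[0] - 1, Y.shape[1]
    S = np.eye(n) - lam * W
    Ylag = Y[:-1]                                    # Y_{n,t-1} for t = 1..T
    Z = np.concatenate([Ylag[:, :, None], (Ylag @ W.T)[:, :, None], X], axis=2)
    SY = Y[1:] @ S.T                                 # S_n(lam) Y_nt
    Zt = (Z - Z.mean(axis=0)).reshape(T * n, -1)     # deviations from time means
    SYt = (SY - SY.mean(axis=0)).reshape(T * n)
    delta = np.linalg.solve(Zt.T @ Zt, Zt.T @ SYt)
    resid = SYt - Zt @ delta
    sigma2 = resid @ resid / (n * T)
    _, logdet = np.linalg.slogdet(S)
    ll = -0.5 * n * T * (np.log(2 * np.pi) + 1) - 0.5 * n * T * np.log(sigma2) + T * logdet
    return ll, delta, sigma2

# Simulated example; every numeric value below is an illustrative assumption.
rng = np.random.default_rng(0)
n, T, burn = 30, 40, 20
C = np.zeros((n, n))
for i in range(n):                                   # circular neighbours i-1 and i+1
    C[i, (i - 1) % n] = C[i, (i + 1) % n] = 1.0
W = C / C.sum(axis=1, keepdims=True)
lam0, gam0, rho0, beta0 = 0.4, 0.4, 0.2, 1.0         # lam0 + gam0 + rho0 = 1
S_inv = np.linalg.inv(np.eye(n) - lam0 * W)
c0 = rng.normal(size=n)
Xall = rng.normal(size=(burn + T + 1, n, 1))
y = rng.normal(size=n)
path = []
for t in range(burn + T + 1):
    y = S_inv @ (gam0 * y + rho0 * (W @ y) + Xall[t, :, 0] * beta0 + c0 + rng.normal(size=n))
    path.append(y)
Y = np.array(path[burn:])                            # T + 1 rows: Y_0, ..., Y_T
X = Xall[burn + 1:]                                  # regressors for t = 1..T

grid = np.linspace(-0.9, 0.9, 181)
lam_hat = grid[np.argmax([concentrated_loglik(l, Y, X, W)[0] for l in grid])]
_, delta_hat, sigma2_hat = concentrated_loglik(lam_hat, Y, X, W)
print(lam_hat, delta_hat, sigma2_hat)
```

A grid search is used only because the concentrated problem is one-dimensional; a numerical optimizer could replace it. The fixed effects never need to be estimated explicitly, since they drop out once every variable is expressed in deviations from its time mean.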
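Also, the reduced form of (2.1) can be represented as
$$Y_{nt} = S_n^{-1}(Z_{nt}\delta_0 + c_{n0} + V_{nt}) = Z_{nt}\delta_0 + \lambda_0 G_n Z_{nt}\delta_0 + S_n^{-1}(c_{n0} + V_{nt}), \quad t = 0, 1, \dots, T, \tag{2.11}$$
because $I_n + \lambda_0 G_n = S_n^{-1}$, where $G_n = W_n S_n^{-1}$. Denote
$$H_{nT} = \frac{1}{nT}\sum_{t=1}^{T}(\tilde Z_{nt}, G_n\tilde Z_{nt}\delta_0)'(\tilde Z_{nt}, G_n\tilde Z_{nt}\delta_0) = \begin{pmatrix} H_{1,nT} & H_{2,nT} \\ H_{2,nT}' & H_{3,nT} \end{pmatrix},$$
where $H_{1,nT} = \frac{1}{nT}\sum_{t=1}^{T}\tilde Z_{nt}'\tilde Z_{nt}$, $H_{2,nT} = \frac{1}{nT}\sum_{t=1}^{T}\tilde Z_{nt}' G_n\tilde Z_{nt}\delta_0$ and $H_{3,nT} = \frac{1}{nT}\sum_{t=1}^{T}\delta_0'\tilde Z_{nt}' G_n' G_n\tilde Z_{nt}\delta_0$. Hence, $H_{nT}$ is the covariance matrix of the explanatory variables of the reduced form (2.11) after taking deviations from the time average. It is crucial for our asymptotic analysis of the QML estimators because $\hat\sigma_{nT}^2(\lambda)$ in (2.10) involves the $H_{i,nT}$ terms for $i = 1, 2, 3$. To study $H_{nT}$, it is desirable to decompose $\tilde Z_{nt}$ into a stationary part and a nonstationary part such that $\tilde Z_{nt} = \tilde Z_{nt}^u + \tilde Z_{nt}^s$, where
$$\tilde Z_{nt}^u = (\tilde{\tilde Y}{}_{n,t-1}^u,\; W_n\tilde{\tilde Y}{}_{n,t-1}^u,\; 0_{n\times k_x}), \quad \tilde Z_{nt}^s = (\tilde{\tilde Y}{}_{n,t-1}^s,\; W_n\tilde{\tilde Y}{}_{n,t-1}^s,\; \tilde X_{nt}). \tag{2.12}$$
As (see Proposition B.4 in Appendix B) $\tilde{\tilde Y}{}_{n,t-1}^u = W_n\tilde{\tilde Y}{}_{n,t-1}^u = G_n\tilde Z_{nt}^u\delta_0$, we have $\tilde Z_{nt}^u = \tilde{\tilde Y}{}_{n,t-1}^u \cdot c'$ and $(\tilde Z_{nt}^u, G_n\tilde Z_{nt}^u\delta_0) = \tilde{\tilde Y}{}_{n,t-1}^u \cdot c^{*\prime}$, where $c = (1, 1, 0_{1\times k_x})'$ and $c^* = (c', 1)'$. Hence, denoting $H_{nT}^s = \frac{1}{nT}\sum_{t=1}^{T}(\tilde Z_{nt}^s, G_n\tilde Z_{nt}^s\delta_0)'(\tilde Z_{nt}^s, G_n\tilde Z_{nt}^s\delta_0)$, we can express $H_{nT}$ in terms of vectors such that
$$H_{nT} \equiv \omega_{nT}\left(T^2 c^* c^{*\prime} + T\, d_{nT} c^{*\prime} + T\, c^* d_{nT}' + H_{nT}^s/\omega_{nT}\right), \tag{2.13}$$
where $\omega_{nT} = \frac{1}{nT^3}\sum_{t=1}^{T}\tilde{\tilde Y}{}_{n,t-1}^{u\prime}\tilde{\tilde Y}{}_{n,t-1}^u$ and $d_{nT} = \frac{1}{\omega_{nT}}\frac{1}{nT^2}\sum_{t=1}^{T}(\tilde Z_{nt}^s, G_n\tilde Z_{nt}^s\delta_0)'\tilde{\tilde Y}{}_{n,t-1}^u$. Similarly, we can express $H_{i,nT}$ in terms of vectors such that
$$H_{1,nT} = \omega_{nT}\left(T^2 cc' + T\, d_{1,nT} c' + T\, c\, d_{1,nT}' + H_{1,nT}^s/\omega_{nT}\right), \tag{2.14a}$$
$$H_{2,nT} = \omega_{nT}\left(T^2 c + T\, d_{1,nT} + T\, d_{2,nT}\, c + H_{2,nT}^s/\omega_{nT}\right), \tag{2.14b}$$
$$H_{3,nT} = \omega_{nT}\left(T^2 + 2T\, d_{2,nT} + H_{3,nT}^s/\omega_{nT}\right), \tag{2.14c}$$
where $d_{1,nT} = \frac{1}{\omega_{nT}}\frac{1}{nT^2}\sum_{t=1}^{T}\tilde Z_{nt}^{s\prime}\tilde{\tilde Y}{}_{n,t-1}^u$ and $d_{2,nT} = \frac{1}{\omega_{nT}}\frac{1}{nT^2}\sum_{t=1}^{T}(G_n\tilde Z_{nt}^s\delta_0)'\tilde{\tilde Y}{}_{n,t-1}^u$. We notice that the elements of $H_{nT}$ are of the order $O(T^2)$ and $T^{-2}H_{nT}$ is singular in the limit. However, because of the pattern of the nonstationary component, $H_{nT}^{-1}$ exists and $H_{nT}^{-1}\cdot c^*$ has a lower order of $O(T^{-1})$ from Proposition 2.1 below.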
2.3 Two Technical Propositions
To study the asymptotic behavior of $H_{nT}^{-1}$, we need the following proposition about matrix algebra.
Proposition 2.1 Let $K_T = T^2 c_T c_T' + T(c_T d_T' + d_T c_T') + A_T$, where $c_T$, $d_T$ are $m$-dimensional column random vectors, $\mathrm{plim}_{T\to\infty}\, c_T \neq 0$ and is nonstochastic, $A_T$ is positive definite for large enough $T$ with probability one, and $\mathrm{plim}_{T\to\infty} A_T$ exists and is an $m \times m$ positive definite matrix. Denote $\Delta_T = 1 - \left(d_T' A_T^{-1} d_T - \frac{(d_T' A_T^{-1} c_T)^2}{c_T' A_T^{-1} c_T}\right)$. Under the assumption that $\mathrm{plim}_{T\to\infty}\Delta_T \neq 0$, the sequence $\{K_T\}$ has the following properties:
(a) the elements of $K_T^{-1}$ are $O_p(1)$;
(b) the elements of $K_T^{-1} c_T$ are $O_p(T^{-1})$;
(c) $T^2 c_T' K_T^{-1} c_T = 1 + O_p(T^{-1})$.
Proof. See the proof for Proposition B.13 in Appendix B.3.
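Properties (a) and (b) can be illustrated numerically. In the Python sketch below, the vectors `c`, `d` and the matrix `A` are hypothetical fixed values chosen only for the example (they give a nonzero value of the quantity required to be nonzero in the proposition); the function name `scaled_norms` is likewise an assumption.

```python
import numpy as np

c = np.array([1.0, 1.0, 0.0])
d = np.array([0.1, -0.2, 0.3])
A = np.eye(3)

def scaled_norms(T):
    """Return (largest |entry| of K_T^{-1}, T * ||K_T^{-1} c||) for K_T = T^2 cc' + T(cd' + dc') + A."""
    K = T**2 * np.outer(c, c) + T * (np.outer(c, d) + np.outer(d, c)) + A
    K_inv = np.linalg.inv(K)
    return np.abs(K_inv).max(), T * np.linalg.norm(K_inv @ c)

for T in (10, 100, 1000):
    print(T, scaled_norms(T))
```

As $T$ grows, the largest entry of the inverse stays bounded while the inverse applied to `c` shrinks proportionally to $1/T$, so the rescaled norm in the second output settles at a constant. This is the mechanism that gives the direction $c^*$ its lower order in the application to $H_{nT}$.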
In our application, we can apply $K_T$ to $H_{nT}$ in (2.13) and $H_{1,nT}$ in (2.14). To apply Proposition 2.1, we need an additional assumption.
Assumption 9. $H_{nT}^s$ is nonsingular for large enough $T$ with probability one, and $\mathrm{plim}_{T\to\infty} H_{nT}^s$ exists and is nonsingular.
As $H_{nT}^s = \frac{1}{nT}\sum_{t=1}^{T}(\tilde Z_{nt}^s, G_n\tilde Z_{nt}^s\delta_0)'(\tilde Z_{nt}^s, G_n\tilde Z_{nt}^s\delta_0)$ is always positive semidefinite, with Assumption 9, $H_{nT}^s$ is positive definite for large enough $T$ and $\mathrm{plim}_{T\to\infty} H_{nT}^s$ will also be positive definite.
In this paper, we need a central limit theorem for linear and quadratic forms of $V_{nt}$. Denote $Q_{nT} = Q_{nT}^s + Q_{nT}^u$, where
$$Q_{nT}^s = \sum_{t=1}^{T}\left(U_{n,t-1}' V_{nt} + D_{nt}' V_{nt} + V_{nt}' B_n V_{nt} - \sigma_0^2\,\mathrm{tr} B_n\right), \tag{2.15a}$$
$$Q_{nT}^u = \frac{k_T}{T}\sum_{t=1}^{T}\left[M_n\left(c_{n0}\tilde t_{-1} + \tilde{\mathbb{X}}_{n,t-1}\beta_0 + \xi_{n,t-1}\right)\right]' V_{nt}. \tag{2.15b}$$
Here, $U_{nt} = \sum_{h=1}^{\infty} P_{nt,h} V_{n,t+1-h}$, where $\{P_{nt,h}\}_{h=1}^{\infty}$ is a sequence of $n \times n$ nonstochastic square matrices; $D_{nt}$ is an $n \times 1$ vector, which is nonstochastic and bounded, uniformly in $n$ and $t$; $B_n$ is a nonstochastic $n \times n$ symmetric matrix⁷ whose row and column sums are bounded uniformly in $n$; and $k_T$ is $O(1)$. Denoting the mean and variance of $Q_{nT}$ as $\mu_{Q_{nT}}$ and $\sigma_{Q_{nT}}^2$ respectively, with $\mu_{Q_{nT}} = 0$, we have the following proposition.
Assumption A1. The disturbances $\{v_{it}\}$, $i = 1, 2, \dots, n$ and $t = 1, 2, \dots, T$, are i.i.d. across $i$ and $t$ with zero mean, variance $\sigma_0^2$ and $E|v_{it}|^{4+\eta} < \infty$ for some $\eta > 0$.
Assumption A2. The row and column sums of $\sum_{h=1}^{\infty}\mathrm{abs}(P_{nt,h})$ are bounded uniformly in $n$ and $t$.
Assumption A3. The elements of the $n \times 1$ vector $D_{nt}$ are nonstochastic and bounded, uniformly in $n$ and $t$.
Assumption A4. $n$ is a nondecreasing function of $T$.
Proposition 2.2 Assume that the row and column sums of $B_n$ are bounded uniformly in $n$ and that the sequence $\frac{1}{nT}\sigma_{Q_{nT}}^2$ is bounded away from zero. Then under Assumptions A1, A2, A3 and A4, $\frac{Q_{nT}}{\sigma_{Q_{nT}}} \xrightarrow{d} N(0, 1)$.
Proof. See Appendix B.4.
⁷ The assumption that $B_n$ is symmetric is maintained w.l.o.g. since $V_{nt}' B_n V_{nt} = V_{nt}'[(B_n + B_n')/2]V_{nt}$.
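3 Consistency and Asymptotic Distribution of $\hat\lambda_{nT}$
We have the Taylor expansion
$$\sqrt{nT}(\hat\lambda_{nT} - \lambda_0) = \left(-\frac{1}{nT}\frac{\partial^2\ln L_{n,T}(\bar\lambda)}{\partial\lambda^2}\right)^{-1}\left(\frac{1}{\sqrt{nT}}\frac{\partial\ln L_{n,T}(\lambda_0)}{\partial\lambda}\right),$$
where $\bar\lambda$ lies between $\hat\lambda_{nT}$ and $\lambda_0$. From the concentrated likelihood function (2.10):
$$\frac{1}{nT}\frac{\partial\ln L_{n,T}(\lambda)}{\partial\lambda} = -\frac{1}{2\hat\sigma_{nT}^2(\lambda)}\frac{\partial\hat\sigma_{nT}^2(\lambda)}{\partial\lambda} - \frac{1}{n}\mathrm{tr}\, G_n(\lambda), \tag{3.16a}$$
$$\frac{1}{nT}\frac{\partial^2\ln L_{n,T}(\lambda)}{\partial\lambda^2} = -\frac{1}{2\hat\sigma_{nT}^4(\lambda)}\left(\frac{\partial^2\hat\sigma_{nT}^2(\lambda)}{\partial\lambda^2}\hat\sigma_{nT}^2(\lambda) - \left(\frac{\partial\hat\sigma_{nT}^2(\lambda)}{\partial\lambda}\right)^2\right) - \frac{1}{n}\mathrm{tr}(G_n^2(\lambda)). \tag{3.16b}$$
The terms $\hat\sigma_{nT}^2(\lambda)$, $\frac{\partial\hat\sigma_{nT}^2(\lambda)}{\partial\lambda}$ and $\frac{\partial^2\hat\sigma_{nT}^2(\lambda)}{\partial\lambda^2}$ have the explicit forms (see (B.49), (B.50) and (B.51)) implied by (2.9). Using Proposition B.14, we have (derived in Appendix B.5)
$$\hat\sigma_{nT}^2(\lambda) = \sigma_0^2 + |\lambda - \lambda_0|\cdot O_p(1) + O_p\left(\max\left(\frac{1}{\sqrt{nT}}, \frac{1}{T}\right)\right), \tag{3.17a}$$
$$\frac{\partial\hat\sigma_{nT}^2(\lambda)}{\partial\lambda} = -\frac{2}{n}\sigma_0^2\,\mathrm{tr}\, G_n + |\lambda - \lambda_0|\cdot O_p(1) + O_p\left(\max\left(\frac{1}{\sqrt{nT}}, \frac{1}{T}\right)\right), \tag{3.17b}$$
$$\frac{\partial^2\hat\sigma_{nT}^2(\lambda)}{\partial\lambda^2} = 2\left(H_{3,nT} - H_{2,nT}' H_{1,nT}^{-1} H_{2,nT}\right) + 2\sigma_0^2\frac{1}{n}\mathrm{tr}\, G_n' G_n + O_p\left(\max\left(\frac{1}{\sqrt{nT}}, \frac{1}{T}\right)\right), \tag{3.17c}$$
$$\sqrt{nT}\frac{\partial\hat\sigma_{nT}^2(\lambda_0)}{\partial\lambda} = -\frac{2}{\sqrt{nT}}\sum_{t=1}^{T}\tilde V_{nt}' G_n'\tilde V_{nt} - \frac{2}{\sqrt{nT}}\sum_{t=1}^{T}\left(\delta_0'\tilde Z_{nt}' G_n' - H_{2,nT}' H_{1,nT}^{-1}\tilde Z_{nt}'\right)\tilde V_{nt} + O_p\left(\max\left(\frac{1}{\sqrt{nT}}, \frac{1}{T}, \sqrt{\frac{n}{T^3}}\right)\right), \tag{3.17d}$$
where the $O_p(1)$, $O_p\left(\max\left(\frac{1}{\sqrt{nT}}, \frac{1}{T}\right)\right)$ and $O_p\left(\max\left(\frac{1}{\sqrt{nT}}, \frac{1}{T}, \sqrt{\frac{n}{T^3}}\right)\right)$ terms are uniform in $\lambda$. (3.16) through (3.17) will be used to derive the consistency and asymptotic distribution of the estimator of the spatial effect parameter $\lambda$.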
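3.1 Consistency of $\hat\lambda_{nT}$
For the log likelihood function (2.10) divided by the sample size $nT$, we have the corresponding $Q_{n,T}(\lambda) = \max_{\delta, c_n, \sigma^2} E\frac{1}{nT}\ln L_{n,T}(\theta, c_n)$, and the optimal solution to this problem is
$$\delta_{nT}^*(\lambda) = \left[E\frac{1}{nT}\sum_{t=1}^{T}\tilde Z_{nt}'\tilde Z_{nt}\right]^{-1}\left[E\frac{1}{nT}\sum_{t=1}^{T}\tilde Z_{nt}' S_n(\lambda)\tilde Y_{nt}\right], \quad c_{nT}^*(\lambda) = E\frac{1}{T}\sum_{t=1}^{T}\left(S_n(\lambda)Y_{nt} - Z_{nt}\delta_{nT}^*(\lambda)\right),$$
$$\sigma_{nT}^{*2}(\lambda) = E\frac{1}{nT}\sum_{t=1}^{T}\left(S_n(\lambda)\tilde Y_{nt} - \tilde Z_{nt}\delta_{nT}^*(\lambda)\right)'\left(S_n(\lambda)\tilde Y_{nt} - \tilde Z_{nt}\delta_{nT}^*(\lambda)\right). \tag{3.18}$$
Hence,
$$Q_{n,T}(\lambda) = -\frac{1}{2}(\ln 2\pi + 1) - \frac{1}{2}\ln\sigma_{nT}^{*2}(\lambda) + \frac{1}{n}\ln|S_n(\lambda)|. \tag{3.19}$$
Claim 3.1 Under Assumptions 1-9, $\frac{1}{nT}\ln L_{n,T}(\lambda) - Q_{n,T}(\lambda) \xrightarrow{p} 0$ uniformly in $\lambda$ in any compact parameter space $\Lambda$, and $Q_{n,T}(\lambda)$ is uniformly equicontinuous for $\lambda \in \Lambda$.
Proof. See Appendix C.1.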
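From (3.19), we have
$$\frac{\partial^2 Q_{n,T}(\lambda)}{\partial\lambda^2} = -\frac{1}{2\sigma_{nT}^{*4}(\lambda)}\left(\frac{\partial^2\sigma_{nT}^{*2}(\lambda)}{\partial\lambda^2}\sigma_{nT}^{*2}(\lambda) - \left(\frac{\partial\sigma_{nT}^{*2}(\lambda)}{\partial\lambda}\right)^2\right) - \frac{1}{n}\mathrm{tr}(G_n^2(\lambda)). \tag{3.20}$$
Using (B.60) about $\sigma_{nT}^{*2}(\lambda)$, $\frac{\partial\sigma_{nT}^{*2}(\lambda)}{\partial\lambda}$ and $\frac{\partial^2\sigma_{nT}^{*2}(\lambda)}{\partial\lambda^2}$, we have
$$\frac{\partial^2 Q_{n,T}(\lambda_0)}{\partial\lambda^2} = -\frac{1}{\sigma_0^2}\left(EH_{3,nT} - EH_{2,nT}'(EH_{1,nT})^{-1}EH_{2,nT}\right) - \frac{1}{n}\left(\mathrm{tr}\, G_n' G_n + \mathrm{tr}\, G_n^2 - \frac{2(\mathrm{tr}\, G_n)^2}{n}\right) + O\left(\frac{1}{T}\right), \tag{3.21}$$
and its limit will be negative if $\lim_{T\to\infty}\left(H_{3,nT} - H_{2,nT}' H_{1,nT}^{-1} H_{2,nT}\right) \neq 0$ or $\lim_{n\to\infty}\frac{1}{n}\mathrm{tr}(C_n + C_n')(C_n + C_n')' \neq 0$, where $C_n = G_n - \frac{\mathrm{tr}\, G_n}{n} I_n$ (see Appendix C.2). Claim 3.1 is the uniform convergence condition; combined with identification, it yields the consistency of the QML estimators.
Theorem 3.2 Under Assumptions 1-9, $\lambda_0$ is globally identified and $\hat\lambda_{nT}$ is consistent.
Proof. See Appendix C.3.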
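3.2 Distribution of $\hat\lambda_{nT}$
Plugging (3.17) into $\frac{\partial\ln L_{n,T}(\lambda)}{\partial\lambda}$ in (3.16a), we have
$$\frac{1}{\sqrt{nT}}\frac{\partial\ln L_{n,T}(\lambda_0)}{\partial\lambda} = \frac{1}{\hat\sigma_{nT}^2(\lambda_0)}\left(\frac{1}{\sqrt{nT}}\sum_{t=1}^{T}\tilde V_{nt}'\left(G_n' - \frac{1}{n}\mathrm{tr}\, G_n\cdot I_n\right)\tilde V_{nt} + \frac{1}{\sqrt{nT}}\sum_{t=1}^{T}\left(\delta_0'\tilde Z_{nt}' G_n' - H_{2,nT}' H_{1,nT}^{-1}\tilde Z_{nt}'\right)\tilde V_{nt}\right) + O_p\left(\max\left(\frac{1}{\sqrt{nT}}, \frac{1}{T}, \sqrt{\frac{n}{T^3}}\right)\right). \tag{3.22}$$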
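As $\tilde Z_{nt}$ has stationary and nonstationary parts (see (2.12)), we can decompose $\frac{1}{\sqrt{nT}}\frac{\partial\ln L_{n,T}(\lambda_0)}{\partial\lambda}$ into two parts accordingly such that
$$\frac{1}{\sqrt{nT}}\frac{\partial\ln L_{n,T}(\lambda_0)}{\partial\lambda} = \frac{1}{\sqrt{nT}}\frac{\partial\ln L_{n,T}^s(\lambda_0)}{\partial\lambda} + \frac{1}{\sqrt{nT}}\frac{\partial\ln L_{n,T}^u(\lambda_0)}{\partial\lambda} + O_p\left(\max\left(\frac{1}{\sqrt{nT}}, \frac{1}{T}, \sqrt{\frac{n}{T^3}}\right)\right),$$
where $\frac{1}{\sqrt{nT}}\frac{\partial\ln L_{n,T}^s(\lambda_0)}{\partial\lambda}$ is the stationary part and $\frac{1}{\sqrt{nT}}\frac{\partial\ln L_{n,T}^u(\lambda_0)}{\partial\lambda}$ is the nonstationary part, as defined via (C.5)-(C.9). The stationary part itself has two parts, $\frac{1}{\sqrt{nT}}\frac{\partial\ln L_{n,T}^s(\lambda_0)}{\partial\lambda} = \frac{1}{\sqrt{nT}}\frac{\partial\ln L_{n,T}^{s*}(\lambda_0)}{\partial\lambda} - \Delta_{\lambda_0,nT}$ (defined in (C.5) and (C.6) respectively), where $\frac{1}{\sqrt{nT}}\frac{\partial\ln L_{n,T}^{s*}(\lambda_0)}{\partial\lambda}$ has zero mean and $\Delta_{\lambda_0,nT}$ has nonzero mean because the latter involves $\bar V_{nT}$. The nonstationary part also has two parts, $\frac{1}{\sqrt{nT}}\frac{\partial\ln L_{n,T}^u(\lambda_0)}{\partial\lambda} = \frac{1}{\sqrt{nT}}\frac{\partial\ln L_{n,T}^{u*}(\lambda_0)}{\partial\lambda} - N_{\lambda_0,nT}$ (defined in (C.8) and (C.9) respectively), where $\frac{1}{\sqrt{nT}}\frac{\partial\ln L_{n,T}^{u*}(\lambda_0)}{\partial\lambda}$ has zero mean and $N_{\lambda_0,nT}$ has nonzero mean. To study the asymptotic behavior of $\frac{1}{\sqrt{nT}}\frac{\partial\ln L_{n,T}(\lambda_0)}{\partial\lambda}$, we will first study $\frac{1}{\sqrt{nT}}\frac{\partial\ln L_{n,T}^{s*}(\lambda_0)}{\partial\lambda} + \frac{1}{\sqrt{nT}}\frac{\partial\ln L_{n,T}^{u*}(\lambda_0)}{\partial\lambda}$ (using Proposition 2.2), then $\Delta_{\lambda_0,nT} + N_{\lambda_0,nT}$ (using Lemma B.11).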
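Theorem 3.3 Under Assumptions 1-9⁸,
$$\frac{1}{\sqrt{nT}}\frac{\partial\ln L_{n,T}(\lambda_0)}{\partial\lambda} + \sqrt{\frac{n}{T}}\left(a_{\lambda_0,nT}^s + \frac{m_n}{n}\cdot a_{\lambda_0,nT}^u\right) + O_p\left(\max\left(\sqrt{\frac{n}{T^3}}, \frac{1}{\sqrt{T}}\right)\right) \xrightarrow{d} N(0, \Sigma_{\lambda_0} + \Omega_{\lambda_0}), \tag{3.23}$$
where
$$\Sigma_{\lambda_0} = \frac{1}{\sigma_0^2}\lim_{T\to\infty}\left(H_{3,nT} - H_{2,nT}' H_{1,nT}^{-1} H_{2,nT}\right) + \lim_{n\to\infty}\frac{1}{n}\left(\mathrm{tr}\, G_n' G_n + \mathrm{tr}\, G_n^2 - \frac{2(\mathrm{tr}\, G_n)^2}{n}\right), \tag{3.24}$$
$$\Omega_{\lambda_0} = \frac{\mu_4 - 3\sigma_0^4}{\sigma_0^4}\lim_{n\to\infty}\sum_{i=1}^{n} G_{n,ii}^2, \tag{3.25}$$
$$a_{\lambda_0,nT}^s = \frac{1}{n}\mathrm{tr}\left[\left(G_n\gamma_0 - (H_{1,nT}^{-1}H_{2,nT})_1 I_n\right)\left(\sum_{h=0}^{\infty} B_n^h\right)S_n^{-1}\right] + \frac{1}{n}\mathrm{tr}\left[\left(G_n\rho_0 - (H_{1,nT}^{-1}H_{2,nT})_2 I_n\right)\left(\sum_{h=0}^{\infty} W_n B_n^h\right)S_n^{-1}\right], \tag{3.26}$$
$$a_{\lambda_0,nT}^u = T\cdot\left(1 - c' H_{1,nT}^{-1} H_{2,nT}\right)\cdot\frac{1}{2(1-\lambda_0)}. \tag{3.27}$$
Proof. See Appendix C.4.
⁸ Only parts of Assumptions 5 and 7 are required. Specifically, $S_n$ is invertible, and the row and column sums of $W_n$ and $S_n^{-1}$ are uniformly bounded in $n$.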
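Also, we have the following claims.
Claim 3.4 Under Assumptions 1-9, $\frac{1}{nT}\frac{\partial^2\ln L_{n,T}(\lambda)}{\partial\lambda^2} - \frac{1}{nT}\frac{\partial^2\ln L_{n,T}(\lambda_0)}{\partial\lambda^2} = |\lambda - \lambda_0|\cdot O(1) + O_p\left(\max\left(\frac{1}{\sqrt{nT}}, \frac{1}{T}\right)\right)$.
Proof. See Appendix C.5.
Claim 3.5 Under Assumptions 1-9, $\frac{1}{nT}\frac{\partial^2\ln L_{n,T}(\lambda_0)}{\partial\lambda^2} - \frac{\partial^2 Q_{n,T}(\lambda_0)}{\partial\lambda^2} = O_p\left(\max\left(\frac{1}{\sqrt{nT}}, \frac{1}{T}\right)\right)$.
Proof. See Appendix C.6.
Using Theorem 3.3, Claim 3.4 and Claim 3.5, we have the following theorem:
Theorem 3.6 Under Assumptions 1-9,
$$\sqrt{nT}(\hat\lambda_{nT} - \lambda_0) + \sqrt{\frac{n}{T}}\, b_{\lambda_0,nT} + O_p\left(\max\left(\sqrt{\frac{n}{T^3}}, \frac{1}{\sqrt{T}}\right)\right) \xrightarrow{d} N(0, \Sigma_{\lambda_0}^{-1} + \Sigma_{\lambda_0}^{-2}\Omega_{\lambda_0}), \tag{3.28}$$
where
$$b_{\lambda_0,nT} = \Sigma_{\lambda_0}^{-1}\left(a_{\lambda_0,nT}^s + \frac{m_n}{n}a_{\lambda_0,nT}^u\right). \tag{3.29}$$
When $\frac{n}{T} \to 0$,
$$\sqrt{nT}(\hat\lambda_{nT} - \lambda_0) \xrightarrow{d} N(0, \Sigma_{\lambda_0}^{-1} + \Sigma_{\lambda_0}^{-2}\Omega_{\lambda_0}). \tag{3.30}$$
When $\frac{n}{T} \to k$,
$$\sqrt{nT}(\hat\lambda_{nT} - \lambda_0) + \sqrt{k}\, b_{\lambda_0,nT} \xrightarrow{d} N(0, \Sigma_{\lambda_0}^{-1} + \Sigma_{\lambda_0}^{-2}\Omega_{\lambda_0}). \tag{3.31}$$
When $\frac{n}{T} \to \infty$,
$$T(\hat\lambda_{nT} - \lambda_0) + b_{\lambda_0,nT} \xrightarrow{p} 0. \tag{3.32}$$
Additionally, if $v_{it}$ is normal, (3.28) becomes
$$\sqrt{nT}(\hat\lambda_{nT} - \lambda_0) + \sqrt{\frac{n}{T}}\, b_{\lambda_0,nT} + O_p\left(\max\left(\sqrt{\frac{n}{T^3}}, \frac{1}{\sqrt{T}}\right)\right) \xrightarrow{d} N(0, \Sigma_{\lambda_0}^{-1}). \tag{3.33}$$
Proof. See Appendix C.7.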
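4 Distribution of QML Estimator $\hat\theta_{nT}$ and Bias Corrected $\hat\theta_{nT}^1$
4.1 QML Estimator $\hat\theta_{nT}$
After we get the distribution of $\hat\lambda_{nT}$, the distribution of $\hat\delta_{nT} = \hat\delta_{nT}(\hat\lambda_{nT})$, $\hat\sigma_{nT}^2 = \hat\sigma_{nT}^2(\hat\lambda_{nT})$ and $\hat c_{nT} = \hat c_{nT}(\hat\lambda_{nT})$ can be derived from (2.9). As is derived in Appendix C.8,
$$\sqrt{nT}(\hat\theta_{nT} - \theta_0) = \Sigma_{\theta_0,nT}^{-1}\cdot\frac{1}{\sqrt{nT}}\frac{\partial\ln L_{nT}(\theta_0)}{\partial\theta} + O_p\left(\max\left(\frac{1}{\sqrt{nT}}, \frac{1}{T}, \sqrt{\frac{n}{T^3}}\right)\right), \tag{4.1}$$
where
$$\frac{1}{\sqrt{nT}}\frac{\partial\ln L_{nT}(\theta_0)}{\partial\theta} = \begin{pmatrix} \frac{1}{\sigma_0^2}\frac{1}{\sqrt{nT}}\sum_{t=1}^{T}\tilde Z_{nt}'\tilde V_{nt} \\ \frac{1}{\sigma_0^2}\frac{1}{\sqrt{nT}}\sum_{t=1}^{T}\left(\tilde V_{nt}' G_n\tilde V_{nt} - \sigma_0^2\,\mathrm{tr}\, G_n\right) + \frac{1}{\sigma_0^2}\frac{1}{\sqrt{nT}}\sum_{t=1}^{T}(G_n\tilde Z_{nt}\delta_0)'\tilde V_{nt} \\ \frac{1}{2\sigma_0^4}\frac{1}{\sqrt{nT}}\sum_{t=1}^{T}\left(\tilde V_{nt}'\tilde V_{nt} - n\sigma_0^2\right) \end{pmatrix},$$
$$\Sigma_{\theta_0,nT} = \frac{1}{\sigma_0^2}\begin{pmatrix} EH_{nT} & 0 \\ 0 & 0 \end{pmatrix} + \begin{pmatrix} 0 & 0 & 0 \\ 0 & \frac{1}{n}\left(\mathrm{tr}(G_n' G_n) + \mathrm{tr}(G_n^2)\right) & \frac{1}{\sigma_0^2 n}\mathrm{tr}(G_n) \\ 0 & \frac{1}{\sigma_0^2 n}\mathrm{tr}(G_n) & \frac{1}{2\sigma_0^4} \end{pmatrix}.$$
Using the central limit theorem for martingale difference arrays (see Proposition 2.2), we have the joint distribution of the common parameters in the following theorem. Denote
$$\Omega_{\theta_0,n} = \frac{\mu_4 - 3\sigma_0^4}{\sigma_0^4}\begin{pmatrix} 0 & 0 & 0 \\ 0 & \frac{1}{n}\sum_{i=1}^{n} G_{n,ii}^2 & \frac{1}{2\sigma_0^2 n}\mathrm{tr}\, G_n \\ 0 & \frac{1}{2\sigma_0^2 n}\mathrm{tr}\, G_n & \frac{1}{4\sigma_0^4} \end{pmatrix}, \tag{4.2}$$
$$b_{\theta_0,nT} \equiv \Sigma_{\theta_0,nT}^{-1}\cdot a_{\theta_0,nT}, \tag{4.3}$$
where $a_{\theta_0,nT} = a_{\theta_0,n}^s + \frac{m_n}{n}a_{\theta_0,T}^u$ with
$$a_{\theta_0,n}^s = \begin{pmatrix} \frac{1}{n}\mathrm{tr}\left(\left(\sum_{h=0}^{\infty} B_n^h\right)S_n^{-1}\right) \\ \frac{1}{n}\mathrm{tr}\left(W_n\left(\sum_{h=0}^{\infty} B_n^h\right)S_n^{-1}\right) \\ 0 \\ \frac{1}{n}\gamma_0\,\mathrm{tr}\left(G_n\left(\sum_{h=0}^{\infty} B_n^h\right)S_n^{-1}\right) + \frac{1}{n}\rho_0\,\mathrm{tr}\left(G_n W_n\left(\sum_{h=0}^{\infty} B_n^h\right)S_n^{-1}\right) + \frac{1}{n}\mathrm{tr}\, G_n \\ \frac{1}{2\sigma_0^2} \end{pmatrix}, \tag{4.4}$$
$$a_{\theta_0,T}^u = T\cdot\frac{1}{2(1-\lambda_0)}\cdot(c^{*\prime}, 0)'. \tag{4.5}$$
Theorem 4.1 Under Assumptions 1-9,
$$\sqrt{nT}(\hat\theta_{nT} - \theta_0) + \sqrt{\frac{n}{T}}\, b_{\theta_0,nT} + O_p\left(\max\left(\frac{1}{\sqrt{T}}, \sqrt{\frac{n}{T^3}}\right)\right) \xrightarrow{d} N\left(0, \lim_{T\to\infty}\Sigma_{\theta_0,nT}^{-1} + \lim_{T\to\infty}\Sigma_{\theta_0,nT}^{-1}\Omega_{\theta_0,n}\Sigma_{\theta_0,nT}^{-1}\right). \tag{4.6}$$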
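When $\frac{n}{T} \to 0$,
$$\sqrt{nT}(\hat\theta_{nT} - \theta_0) \xrightarrow{d} N\left(0, \lim_{T\to\infty}\Sigma_{\theta_0,nT}^{-1} + \lim_{T\to\infty}\Sigma_{\theta_0,nT}^{-1}\Omega_{\theta_0,n}\Sigma_{\theta_0,nT}^{-1}\right). \tag{4.7}$$
When $\frac{n}{T} \to k < \infty$,
$$\sqrt{nT}(\hat\theta_{nT} - \theta_0) + \sqrt{k}\, b_{\theta_0,nT} \xrightarrow{d} N\left(0, \lim_{T\to\infty}\Sigma_{\theta_0,nT}^{-1} + \lim_{T\to\infty}\Sigma_{\theta_0,nT}^{-1}\Omega_{\theta_0,n}\Sigma_{\theta_0,nT}^{-1}\right). \tag{4.8}$$
When $\frac{n}{T} \to \infty$,
$$T(\hat\theta_{nT} - \theta_0) + b_{\theta_0,nT} \xrightarrow{p} 0. \tag{4.9}$$
Additionally, if $v_{it}$ is normal, (4.6) becomes
$$\sqrt{nT}(\hat\theta_{nT} - \theta_0) + \sqrt{\frac{n}{T}}\, b_{\theta_0,nT} + O_p\left(\max\left(\frac{1}{\sqrt{T}}, \sqrt{\frac{n}{T^3}}\right)\right) \xrightarrow{d} N\left(0, \lim_{T\to\infty}\Sigma_{\theta_0,nT}^{-1}\right). \tag{4.10}$$
Proof. See Appendix C.9.
Hence, $\hat\theta_{nT}$ has a bias of the order $O(T^{-1})$. Also, the asymptotic variance matrix of $\sqrt{nT}\,\hat\theta_{nT}$ is singular because $\Sigma_{\theta_0,nT}^{-1}\cdot(c^{*\prime}, 0)' = O(T^{-1})$. This implies that we have a different rate of convergence for $(c^{*\prime}, 0)\cdot(\hat\theta_{nT} - \theta_0) = \hat\lambda_{nT} + \hat\gamma_{nT} + \hat\rho_{nT} - 1$, using $H_{nT}^{-1}\cdot c^* = O(T^{-1})$ from Proposition 2.1.
Theorem 4.2 Under Assumptions 1-9,
$$\sqrt{nT^3}\,(c^{*\prime}, 0)(\hat\theta_{nT} - \theta_0) + \sqrt{\frac{n}{T}}\left(T(c^{*\prime}, 0)\, b_{\theta_0,nT}\right) + O_p\left(\max\left(\frac{1}{\sqrt{T}}, \sqrt{\frac{n}{T^3}}\right)\right) \xrightarrow{d} N\left(0, \lim_{T\to\infty}\omega_{nT}^{-1} + \lim_{T\to\infty} T^2 (c^{*\prime}, 0)\left(\lim_{T\to\infty}\Sigma_{\theta_0,nT}^{-1}\Omega_{\theta_0,n}\Sigma_{\theta_0,nT}^{-1}\right)(c^{*\prime}, 0)'\right). \tag{4.11}$$
Proof. See Appendix C.10.
The estimators of the fixed effects are $\sqrt{T}$ consistent and asymptotically centered normal, as shown below.
Theorem 4.3 Under Assumptions 1-9, if $(Y_{n,-1}/T)_i - E(Y_{n,-1}/T)_i = o_p(1)$ and $E(Y_{n,-1}/T)_i = O(1)$ uniformly in $n$ and $i$, then, for $i = 1, 2, \dots, n$, $\sqrt{T}(\hat c_{i,nT} - c_{i,0}) \xrightarrow{d} N(0, \Sigma_{n,c_i})$, where $\Sigma_{n,c_i}$ is in (C.40). When $n$ also goes to infinity, $\sqrt{T}(\hat c_{i,nT} - c_{i,0}) \xrightarrow{d} N(0, \sigma_0^2)$.
Proof. See Appendix C.11.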
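4.2 Bias Corrected Estimators $\hat\theta_{nT}^1$
From (4.6), the QML estimator has the bias $-\frac{1}{T}b_{\theta_0,nT}$, where $b_{\theta_0,nT} \equiv \Sigma_{\theta_0,nT}^{-1}\cdot\left(a_{\theta_0,n}^s + \frac{m_n}{n}a_{\theta_0,T}^u\right)$, and the confidence interval is not centered when $\frac{n}{T} \to k$ where $0 < k < \infty$. Furthermore, when $T$ is small relative to $n$ in the sense that $\frac{n}{T} \to \infty$, the presence of $b_{\theta_0,nT}$ causes $\hat\theta_{nT}$ to have the slower rate of convergence $T$ in (4.9). An analytical bias reduction procedure is to correct the bias $B_{nT} = -b_{\theta_0,nT}$ by constructing an estimator $\hat B_{nT}$ and defining the bias corrected estimator as
$$\hat\theta_{nT}^1 = \hat\theta_{nT} - \frac{\hat B_{nT}}{T}. \tag{4.12}$$
From Theorem 4.1, $B_{nT} = -\Sigma_{\theta_0,nT}^{-1}\cdot\left(a_{\theta_0,n}^s + \frac{m_n}{n}a_{\theta_0,T}^u\right)$ and a natural candidate for $\hat B_{nT}$ is $\left[-\Sigma_{\theta,nT}^{-1}\cdot a_{\theta,nT}\right]\big|_{\theta=\hat\theta_{nT}}$. As $\Sigma_{\hat\theta_{nT},nT}^{-1}$ involves $EH_{nT}(\hat\theta_{nT})$ (see (B.47)), which is hard to evaluate, our alternative estimate is
$$\hat B_{nT} = \left[-\hat\Sigma_{\theta,nT}^{-1}\cdot a_{\theta,nT}\right]\Big|_{\theta=\hat\theta_{nT}}, \tag{4.13}$$
where $\hat\Sigma_{\theta,nT}^{-1}$ is defined in (B.36), with $EH_{nT}(\hat\theta_{nT})$ in $\Sigma_{\hat\theta_{nT},nT}$ replaced by $H_{nT}(\hat\theta_{nT})$. We show that when $n/T^3 \to 0$, $\hat\theta_{nT}^1$ is $\sqrt{nT}$ consistent and asymptotically centered normal even when $n/T \to \infty$.
To show our result for the bias corrected estimators, we need the following additional assumption.
Assumption 10. Either the row sums or the column sums of $\sum_{h=0}^{\infty} B_n^h(\theta)$ and $\sum_{h=1}^{\infty} h B_n^{h-1}(\theta)$ are bounded uniformly in $n$ and in a neighborhood of $\theta_0$.
Assumption 10 can be verified through the following lemma.
Lemma 4.4 If $\sup_n\{\|B_n(\theta_0)\|_\infty\} < 1$ (resp. $\sup_n\{\|B_n(\theta_0)\|_1\} < 1$), then the row sums (resp. column sums) of $\sum_{h=0}^{\infty} B_n^h(\theta)$ and $\sum_{h=1}^{\infty} h B_n^{h-1}(\theta)$ are bounded uniformly in $n$ and in a neighborhood of $\theta_0$.
Proof. This is Lemma 3.9 in Yu, de Jong and Lee (2006).
Our result for the bias corrected estimator is as follows.
Theorem 4.5 Under Assumptions 1-10, if $\frac{n}{T^3} \to 0$,
$$\sqrt{nT}(\hat\theta_{nT}^1 - \theta_0) \xrightarrow{d} N\left(0, \lim_{T\to\infty}\Sigma_{\theta_0,nT}^{-1} + \lim_{T\to\infty}\Sigma_{\theta_0,nT}^{-1}\Omega_{\theta_0,n}\Sigma_{\theta_0,nT}^{-1}\right). \tag{4.14}$$
Additionally,
$$\sqrt{nT^3}\,(c^{*\prime}, 0)(\hat\theta_{nT}^1 - \theta_0) \xrightarrow{d} N\left(0, \lim_{T\to\infty}\omega_{nT}^{-1} + \lim_{T\to\infty} T^2 (c^{*\prime}, 0)\left(\Sigma_{\theta_0,nT}^{-1}\Omega_{\theta_0,n}\Sigma_{\theta_0,nT}^{-1}\right)(c^{*\prime}, 0)'\right). \tag{4.15}$$
Proof. See Appendix C.12.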
4.3 Monte Carlo Results
We conduct a small Monte Carlo experiment to evaluate the performance of our ML estimators and
the bias corrected estimators. We generate samples from (2.1) using �a0 = (0:4; 0:2; 1; 0:4; 1)0 and �b0 =
(0:6;�0:4; 1; 0:8; 1)0 where �0 = ( 0; �0; �00; �0; �
20)0, and Xnt; cn0 and Vnt are generated from independent
normal distributions9 and the spatial weights matrix we use is a rook matrix. We use T = 10, 50 and n = 49,
196. For each set of generated sample observations, we calculate the ML estimator �nT and evaluate the
bias �nT � �0; we then construct the bias corrected estimator �1
nT and evaluate the bias �1
nT � �0. We dothis for 1000 times to see if the bias is reduced on average by using the analytical bias correction procedure,
9We generated the spatial panel data with 20 + T periods and then take the last T periods as our sample. And the initial
value is generated as N(0; In) in the simulation. We have also generated the data with a much longer history 1000+ T and the
results are similar. Also, in our example, the second largest eigenvalue of Wn is 0.94107. If we count it as a unit root, the bias
corrected estimator does not change much.
13
Table 1: Performance of QMLs and Their Bias Corrected Estimators: Biases
Case Bias of �nT (1st line) and �1
nT (2nd line)
T n �0 � � � �2
(1) 10 49 �a0 �0.0758 0.0187 �0.0135 -0.0107 -0.1211
-0.0021 0.0161 0.0015 -0.0042 -0.0346
(2) 10 49 �b0 -0.0939 0.0785 -0.0180 -0.0087 -0.1234
-0.0050 0.0124 0.0026 -0.0063 -0.0374
(3) 10 196 �a0 -0.0749 0.0160 -0.0135 -0.0108 -0.1147
-0.0019 0.0163 0.0015 -0.0039 -0.0276
(4) 10 196 �b0 -0.0919 0.0745 -0.0184 0.0071 -0.1179
-0.0042 0.0119 0.0020 -0.0046 -0.0312
(5) 50 49 �a0 -0.0139 0.0081 -0.0009 -0.0018 -0.0219
0.0004 0.0024 -0.0000 -0.0030 -0.0020
(6) 50 49 �b0 -0.0170 0.0172 -0.0003 -0.0029 -0.0204
-0.0002 0.0031 0.0008 -0.0030 -0.0008
(7) 50 196 �a0 -0.0142 0.0087 -0.0005 -0.0019 -0.0208
0.0002 0.0040 0.0004 -0.0031 -0.0010
(8) 50 196 �b0 -0.0172 0.0166 -0.0003 -0.0019 -0.0202
-0.0004 0.0028 0.0008 -0.0023 -0.0003
Note: �a0 = (0:4, 0:2, 1, 0:4, 1) and �b0 = (0:6, �0:4, 1, 0:8, 1).
i.e., to compare 11000
P1000i=1 (�nT � �0)i with 1
1000
P1000i=1 (�
1
nT � �0)i. With two di¤erent values of �0 for eachn and T , �nite sample properties of both estimators are summarized in Table 1 and Table 2, where Table 1
is for the biases and Table 2 is for the standard errors of estimators.
We see that both estimators have some biases, but the bias corrected estimators reduce those biases. This is consistent with our asymptotic analysis, because the bias corrected estimators will eliminate the bias of order $O(T^{-1})$. Also, the bias reduction is achieved while there is no significant increase in the variance of the estimators, as can be seen from Table 2.
For different cases of $n$ and $T$, we see that for each given $n$, when $T$ is larger, the biases of two sets of estimators will be smaller and the variances will be smaller; for each given $T$, when $n$ is larger, the biases of two sets of estimators will be nearly the same, but the variances will be smaller. This is consistent with our theoretical prediction.
Table 2: Performance of QMLs and Their Bias Corrected Estimators: Standard Errors
S.E. of $\hat\theta_{nT}$ (1st line) and $\hat\theta^1_{nT}$ (2nd line)

Case  T    n    $\theta_0$      $\gamma$    $\rho$      $\beta$     $\lambda$   $\sigma^2$
(1)   10   49   $\theta_0^a$     0.0320     0.0534     0.0454     0.0426     0.0568
                                 0.0336     0.0572     0.0476     0.0428     0.0625
(2)   10   49   $\theta_0^b$     0.0312     0.0415     0.0460     0.0237     0.0582
                                 0.0327     0.0441     0.0482     0.0237     0.0639
(3)   10   196  $\theta_0^a$     0.0160     0.0276     0.0227     0.0221     0.0286
                                 0.0168     0.0296     0.0238     0.0222     0.0315
(4)   10   196  $\theta_0^b$     0.0156     0.0214     0.0230     0.0126     0.0292
                                 0.0163     0.0228     0.0241     0.0126     0.0321
(5)   50   49   $\theta_0^a$     0.0136     0.0219     0.0203     0.0184     0.0283
                                 0.0137     0.0222     0.0205     0.0185     0.0289
(6)   50   49   $\theta_0^b$     0.0124     0.0165     0.0206     0.0102     0.0290
                                 0.0125     0.0167     0.0208     0.0103     0.0296
(7)   50   196  $\theta_0^a$     0.0068     0.0113     0.0102     0.0095     0.0142
                                 0.0069     0.0114     0.0103     0.0096     0.0144
(8)   50   196  $\theta_0^b$     0.0062     0.0085     0.0103     0.0054     0.0145
                                 0.0062     0.0086     0.0104     0.0055     0.0148
5 Conclusion
In this paper, we derived the properties of QML estimators of a nonstationary spatial dynamic panel data model with fixed effects when both $n$ and $T$ are large. For the distribution of the common parameters, when $T$ is asymptotically large relative to $n$, the estimators are $\sqrt{nT}$ consistent and asymptotically normal, with the limit distribution centered around 0; when $n$ is asymptotically proportional to $T$, the estimators are $\sqrt{nT}$ consistent and asymptotically normal, but the limit distribution is not centered around 0; and when $n$ is large relative to $T$, the estimators are consistent with rate $T$, and have a degenerate limit distribution. Compared to Yu, de Jong and Lee (2006), the estimators' rate of convergence will be the same, but the asymptotic variance matrix will be driven by the nonstationary component and it is singular. Also, the sum of the spatial effect coefficients and dynamic effect coefficient will have a higher rate of convergence. We also propose a bias correction for our estimators. We show that as long as $T$ grows faster than $n^{1/3}$, the correction will eliminate the bias of order $O(T^{-1})$ and yield a centered confidence interval.
Appendices
A Notations
The following list summarizes some frequently used notations in the text:
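$S_n(\lambda) = I_n - \lambda W_n$ for any possible $\lambda$.
$S_n = I_n - \lambda_0 W_n$.
$G_n = W_n S_n^{-1}$. $A_n = S_n^{-1}(\gamma_0 I_n + \rho_0 W_n)$.
$A_n = R_n D_n R_n^{-1}$ where $R_n$ is the matrix of eigenvectors and $D_n$ is the diagonal matrix of eigenvalues.
$J_n = \mathrm{Diag}\{1'_{m_n}, 0, \cdots, 0\}$ where $1_{m_n}$ is an $m_n \times 1$ vector of ones. $M_n = R_n J_n R_n^{-1}$.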
c = (1; 1;01�kx)0 and c� = (c0; 1)0.
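$Z_{nt} = (Y_{n,t-1}, W_n Y_{n,t-1}, X_{nt})$.
$\theta = (\delta', \lambda, \sigma^2)'$ where $\delta = (\gamma, \rho, \beta')'$.
$\ln L_{n,T}(\theta, c_n)$ is the log-likelihood of $\theta$ and $c_n$.
$\ln L_{n,T}(\theta)$ is the concentrated log-likelihood of $\theta$.
$Q_{n,T}(\lambda) = \max_{\delta, c_n, \sigma^2} E\frac{1}{nT}\ln L_{n,T}(\theta, c_n)$.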
~�nt = �nt � ��nT and��n;t�1 = �n;t�1 � ��nT;�1 where ��nT = 1
T
TPt=1�nt and ��nT;�1 = 1
T
TPt=1�n;t�1.
$\omega_{nT} = \frac{1}{nT^3}\sum_{t=1}^{T} \tilde{\tilde Y}^{u\prime}_{n,t-1}\tilde{\tilde Y}^{u}_{n,t-1}$ (see (2.7) for $\tilde{\tilde Y}^{u}_{n,t-1}$).
B Algebra for the Nonstationary Case
B.1 An Example to Justify the Assumptions
Consider the group case with equal weights for peers, i.e., $W_n$ is a block diagonal matrix with its $j$th block being $W_{jn} = \frac{1}{n_j-1}[l_{n_j}l'_{n_j} - I_{n_j}]$, $j = 1, \cdots, R$, where $R$ is the total number of groups.
The eigenvalues are roots of the characteristic polynomial
$$|W_{jn} - \mu I_{n_j}| = \left|\frac{1}{n_j-1}l_{n_j}l'_{n_j} - \left(\mu + \frac{1}{n_j-1}\right)I_{n_j}\right| = (-1)^{n_j}\left(\mu + \frac{1}{n_j-1}\right)^{n_j-1}(\mu - 1),$$
by using the property of a determinant that $|A + \alpha bd'| = |A|(1 + \alpha d'A^{-1}b)$ (Proposition 31 in Dhrymes (1978)). Hence the eigenvalues of $W_{jn}$ are a single root equal to one, and $(n_j - 1)$ multiple roots with the value $-\frac{1}{n_j-1}$ for the $j$th group. As $W_n$ is a block diagonal matrix, its determinant is the product of the determinants of the diagonal block matrices. It follows that there are $R$-multiple roots of the unit, and $(n_j - 1)$-multiple roots of the value $-\frac{1}{n_j-1}$ for each $j = 1, \cdots, R$.
As the total number of unit eigenvalues of $W_n$ is $R$, the corresponding orthonormal matrix of eigenvectors of $W_n$ is $R_n = (R_{n,R}, R_{n,n-R})$, where
$$R_{n,R} = \begin{pmatrix} \frac{l_{n_1}}{\sqrt{n_1}} & 0 & \cdots & 0 \\ 0 & \frac{l_{n_2}}{\sqrt{n_2}} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \frac{l_{n_R}}{\sqrt{n_R}} \end{pmatrix}.$$
As $J_n = \begin{pmatrix} I_R & 0 \\ 0 & 0 \end{pmatrix}$, we have
$$R_n J_n R'_n = (R_{n,R}, R_{n,n-R})J_n(R_{n,R}, R_{n,n-R})' = R_{n,R}R'_{n,R} = \begin{pmatrix} \frac{l_{n_1}l'_{n_1}}{n_1} & \cdots & 0 \\ 0 & \ddots & 0 \\ 0 & \cdots & \frac{l_{n_R}l'_{n_R}}{n_R} \end{pmatrix},$$
which is uniformly bounded in both row and column sums.
The matrix $A_n = (I_n - \lambda_0 W_n)^{-1}(\gamma_0 I_n + \rho_0 W_n)$ for this group setting is a block diagonal matrix. Because $B_n = A_n - R_n J_n R'_n$ as defined, $B_n$ is also a block diagonal matrix. Consider the first diagonal block of $A_n$, which is $A_{n1} = (I_{n_1} - \lambda_0 W_{1n})^{-1}(\gamma_0 I_{n_1} + \rho_0 W_{1n})$. Note that
$$(I_{n_1} - \lambda_0 W_{1n})^{-1} = \frac{n_1-1}{n_1-1+\lambda_0}\left(I_{n_1} - \frac{\lambda_0}{n_1-1+\lambda_0}l_{n_1}l'_{n_1}\right)^{-1} = \frac{n_1-1}{n_1-1+\lambda_0}\left(I_{n_1} + \frac{\lambda_0}{(n_1-1)(1-\lambda_0)}l_{n_1}l'_{n_1}\right).$$
As $\gamma_0 + \rho_0 = 1 - \lambda_0$, it follows that
$$A_{n1} = (I_{n_1} - \lambda_0 W_{1n})^{-1}(\gamma_0 I_{n_1} + \rho_0 W_{1n}) = \frac{n_1-1}{n_1-1+\lambda_0}\left\{\left(\gamma_0 - \frac{\rho_0}{n_1-1}\right)I_{n_1} + \frac{\rho_0+\lambda_0}{n_1-1}l_{n_1}l'_{n_1}\right\}.$$
Hence,
$$B_{n1} = A_{n1} - \frac{l_{n_1}l'_{n_1}}{n_1} = \left(\frac{n_1\gamma_0 - 1 + \lambda_0}{n_1 - 1 + \lambda_0}\right)\left(I_{n_1} - \frac{1}{n_1}l_{n_1}l'_{n_1}\right).$$
Because $(I_{n_1} - \frac{1}{n_1}l_{n_1}l'_{n_1})$ is an idempotent matrix, it follows that for any positive integer $h$,
$$B^h_{n1} = \left(\frac{n_1\gamma_0 - 1 + \lambda_0}{n_1 - 1 + \lambda_0}\right)^h\left(I_{n_1} - \frac{1}{n_1}l_{n_1}l'_{n_1}\right).$$
The matrix $(I_{n_1} - \frac{1}{n_1}l_{n_1}l'_{n_1})$ is uniformly bounded in both row and column sums, so $\sum_{h=0}^{\infty}\mathrm{abs}(B^h_{n1})$ will be uniformly bounded in both row and column sums if $\left|\frac{n_1\gamma_0 - 1 + \lambda_0}{n_1 - 1 + \lambda_0}\right| < 1$. The corresponding $B_n$ will be so if $\max_{j=1,\cdots,R}\left|\frac{n_j\gamma_0 - 1 + \lambda_0}{n_j - 1 + \lambda_0}\right| < 1$. A sufficient condition for this to occur is that $|\lambda_0| < 1$, $\gamma_0 < 1$ and $\rho_0 < 1$. This is so as follows. Define the function $f(x) = \frac{x\gamma_0 - 1 + \lambda_0}{x - 1 + \lambda_0}$. The derivative of $f(x)$ is $\frac{df(x)}{dx} = \frac{(1-\lambda_0)(1-\gamma_0)}{(x-1+\lambda_0)^2}$, which will be positive if $1 > \lambda_0$ and $1 > \gamma_0$. Since $f$ is increasing, its upper bound is $\gamma_0 < 1$ and its lower bound is $f(2) = \frac{2\gamma_0 - 1 + \lambda_0}{1+\lambda_0} = \frac{1 - \lambda_0 - 2\rho_0}{1+\lambda_0} > -1$, because $1 + \lambda_0 > 0$, $1 > \rho_0$ and $x \geq 2$. Under this situation, we can justify Assumption 8 in the text for this example.
The same consideration will also justify that the smallest eigenvalue of $A_n$ is less than one in absolute value in Assumption 6 (see footnote 10). Because $A_n$ is a block diagonal matrix, it is sufficient to consider the eigenvalues of each block $A_{nj}$. An eigenvalue $\mu$ of $A_{nj}$ for some $j$ will also be an eigenvalue of $A_n$. This is so because if $\mu$ is an eigenvalue of $A_{nj}$ with eigenvector $x_{nj}$ such that $A_{nj}x_{nj} = \mu x_{nj}$, then $A_n x_n = \mu x_n$ where $x_n = (0, \cdots, 0, x'_{nj}, 0, \cdots, 0)'$. Consider the eigenvalue $-\frac{1}{n_1-1}$ of $W_{1n}$ (the remaining eigenvalue is one) and the corresponding eigenvector $x_1$. As
$$A_{n1}x_1 = (I_{n_1} - \lambda_0 W_{1n})^{-1}(\gamma_0 I_{n_1} + \rho_0 W_{1n})x_1 = \left(\frac{n_1\gamma_0 - 1 + \lambda_0}{n_1 - 1 + \lambda_0}\right)x_1,$$
the corresponding eigenvalue of $A_n$ is $\frac{n_1\gamma_0 - 1 + \lambda_0}{n_1 - 1 + \lambda_0}$, which lies in $(-1, \gamma_0)$ with $\gamma_0 < 1$, as previously shown.
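The eigenvalue claims of this example can be checked exactly on one small group. The sketch below is only an illustration: the group size $n = 5$ and the parameter values (chosen so that $\lambda_0 + \gamma_0 + \rho_0 = 1$) are hypothetical, and the helper names `group_block` and `matvec` are ours. Rather than inverting $I - \lambda_0 W$, it verifies $A x = d x$ in the equivalent form $(\gamma_0 I + \rho_0 W)x = d(I - \lambda_0 W)x$.

```python
from fractions import Fraction as F

def group_block(n):
    """Equal-weight peer block W = (l l' - I) / (n - 1) for one group of size n."""
    return [[F(0) if i == j else F(1, n - 1) for j in range(n)] for i in range(n)]

def matvec(M, x):
    """Matrix-vector product with exact rational arithmetic."""
    return [sum(a * b for a, b in zip(row, x)) for row in M]

n = 5
W = group_block(n)
ones = [F(1)] * n                              # eigenvector for the unit root
contrast = [F(1), F(-1)] + [F(0)] * (n - 2)    # orthogonal to the vector of ones

W_ones = matvec(W, ones)                       # should equal ones
W_contrast = matvec(W, contrast)               # should equal -contrast / (n - 1)

# Hypothetical parameter values with lambda0 + gamma0 + rho0 = 1.
gamma0, rho0, lam0 = F(2, 5), F(1, 5), F(2, 5)
d = (n * gamma0 - 1 + lam0) / (n - 1 + lam0)   # claimed non-unit eigenvalue of A_n1

# A x = d x is equivalent to (gamma0 I + rho0 W) x = d (I - lambda0 W) x.
lhs = [gamma0 * x + rho0 * wx for x, wx in zip(contrast, W_contrast)]
rhs = [d * (x - lam0 * wx) for x, wx in zip(contrast, W_contrast)]
```

With these values the two sides agree exactly and $d$ lies strictly between $-1$ and $\gamma_0$, in line with the bound derived from $f(x)$.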
B.2 Some Basic Lemmas
Proposition B.1 Suppose that $W_n$ is a weights matrix row normalized from a symmetric matrix $C_n$, i.e., $W_n = \Lambda_n^{-1}C_n$, where $\Lambda_n$ is a diagonal matrix with its diagonal elements formed by the row sums of $C_n$. Then, the eigenvalues of $W_n$ are all real and $W_n$ is diagonalizable.
Proposition B.2 Suppose that $A_n = (I_n - \lambda_0 W_n)^{-1}(\gamma_0 I_n + \rho_0 W_n)$, where $W_n$ is the row normalized weights matrix in Proposition B.1. Then, $A_n$ is diagonalizable with all real eigenvalues. If $W_n$ is diagonalizable as $W_n = R_n D^*_n R_n^{-1}$, then $A_n$ can be diagonalized as $A_n = R_n D_n R_n^{-1}$, with its eigenvalue matrix $D_n = (I_n - \lambda_0 D^*_n)^{-1}(\gamma_0 I_n + \rho_0 D^*_n)$.
Proposition B.3 Denote by $d_{n,i}$'s the eigenvalues of $A_n$. Under Assumption 1 for $W_n$, $|\lambda_0| < 1$ and $\lambda_0 + \gamma_0 + \rho_0 = 1$, (1) if $\rho_0 + \gamma_0\lambda_0 > 0$ and $\frac{\gamma_0 - \rho_0}{1 + \lambda_0} > -1$, we have $d_{n,\max} = 1$ and $d_{n,\min} > -1$; (2) when $\lambda_0 + \gamma_0 + \rho_0 = 1$, "$\rho_0 < 1$, $|\gamma_0| < 1$ and $|\lambda_0| < 1$" implies "$\rho_0 + \gamma_0\lambda_0 > 0$ and $\frac{\gamma_0 - \rho_0}{1 + \lambda_0} > -1$".
Proposition B.4 (1) Suppose that $|\lambda_0| < 1$ and $\gamma_0 \neq 1$, then the unit eigenvalues of $W_n$ correspond to unit eigenvalues of $A_n$ via the relation $\frac{\gamma_0 + \rho_0\varpi_{ni}}{1 - \lambda_0\varpi_{ni}}$, if and only if $\lambda_0 + \gamma_0 + \rho_0 = 1$.
(2) $A_n R_n J_n R_n^{-1} = R_n J_n R_n^{-1} A_n = R_n J_n R_n^{-1}$.
(3) Assuming that the unit eigenvalues of $W_n$ correspond to unit eigenvalues of $A_n$, then,
(3i) $W_n R_n J_n R_n^{-1} = R_n J_n R_n^{-1}$;
(3ii) $S_n^{-1} R_n J_n R_n^{-1} = R_n J_n R_n^{-1} S_n^{-1} = \frac{1}{1-\lambda_0} R_n J_n R_n^{-1}$.
10 See also sufficient conditions on parameters in Proposition B.3, which guarantee Assumption 6.
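Proposition B.5 Under Assumptions 5 and 6, for $Y_{nt}$ in (2.1), $Y_{nt} = Y^u_{nt} + Y^s_{nt}$ where
$$Y^u_{nt} = (R_n J_n R_n^{-1})\left(Y_{n,-1} + c_{n0}\frac{t}{(1-\lambda_0)} + \frac{\sum_{h=0}^{t-1}X_{nh}\beta_0}{(1-\lambda_0)} + \frac{\sum_{h=0}^{t-1}V_{nh}}{(1-\lambda_0)}\right), \tag{B.1}$$
$$Y^s_{nt} = \sum_{h=0}^{\infty}B^h_n S_n^{-1}c_{n0} + \sum_{h=0}^{\infty}B^h_n S_n^{-1}X_{n,t-h}\beta_0 + \sum_{h=0}^{\infty}B^h_n S_n^{-1}V_{n,t-h}. \tag{B.2}$$
Furthermore,
$$\tilde{\tilde Y}^u_{n,t-1} = \frac{R_n J_n R_n^{-1}}{(1-\lambda_0)}\left(c_{n0}\left[(t-1) - \left(\frac{T-1}{2}\right)\right] + \sum_{h=0}^{t-2}(X_{nh}\beta_0 + V_{nh}) - \frac{1}{T}\sum_{h=0}^{T-2}(T-1-h)(X_{nh}\beta_0 + V_{nh})\right),$$
$$\tilde{\tilde Y}^s_{n,t-1} = S_n^{-1}\sum_{h=0}^{\infty}B^h_n\left[\left(X_{n,t-1-h} - \frac{1}{T}\sum_{t=0}^{T-1}X_{n,t-h}\right)\beta_0 + \left(V_{n,t-1-h} - \frac{1}{T}\sum_{t=0}^{T-1}V_{n,t-h}\right)\right].$$
Proposition B.6 Under Assumptions 5 and 6, for the nonstationary part $Y^u_{n,t-1}$ of $Y_{n,t-1}$,
$$W_n Y^u_{n,t-1} = Y^u_{n,t-1}, \quad G_n Y^u_{n,t-1} = \frac{1}{1-\lambda_0}Y^u_{n,t-1}. \tag{B.3}$$
Also, for the nonstationary part $Z^u_{nt}$ of $Z_{nt}$, denote $c = (1, 1, 0_{1\times k_x})'$, we have
$$Z^u_{nt} = (Y^u_{n,t-1}, W_n Y^u_{n,t-1}, 0_{1\times k_x}) = Y^u_{n,t-1}\cdot c', \quad G_n Z^u_{nt}\delta_0 = Y^u_{n,t-1}. \tag{B.4}$$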
Denote �nt =Pt�1
h=0 Vnh, Xnt =Pt�1
h=0Xnh, Unt =P1
h=1 Pnt;hVn;t+1�h and Wnt =P1
h=1Qnt;hVn;t+1�h
where Pnt;h and Qnt;h are n�n nonstochastic matrices and the row and column sums ofP1
h=1 abs(Pnt;h) andP1h=1 abs(Qnt;h) are bounded uniformly in n and t. Also, n� 1 vector Dnt are nonstochastic and bounded,
uniformly in n and t. We note that ��nT =1T
PTt=1 �nt =
PTh=1
hT Vn;T�h. Also, as
�UnT =�PT
t=1 Unt�=T ,
we have �UnT =P1
h=1�PnT;hVn;T+1�h, where
�PnT;h =
8<: 1T (PnT;1 + PnT;2 + � � �+ PnT;h) =
1T
Phg=1 Pn;T�h+g;g for h � T
1T
Phg=h�T+1 Pn;T�h+g;g for h > T .
(B.5)
Lemma B.7 Under Assumption 2, for t � s,
E(UntW0ns) = �
20
1Xh=1
Pnt;t�s+hQ0ns;h
!, E(U0ntWns) = �
20tr
1Xh=1
P 0nt;t�s+hQns;h
!, (B.6)
Cov(U0ntWnt;U0nsWns) = (�4 � 3�40)1Xh=1
nXi=1
(P 0nt;t�s+hQnt;t�s+h)ii(P0ns;hQns;h)ii +
�40tr
" 1Xh=1
Pns;hP0nt;t�s+h
! 1Xh=1
Qnt;t�s+hQ0ns;h
!+
1Xh=1
Qns;hP0nt;t�s+h
! 1Xh=1
Qnt;t�s+hP0ns;h
!#Lemma B.8 11Denote Bn an n � n nonstochastic matrix which is row sum and column sum bounded uni-
formly in n. Under Assumptions 1-8, for �nt, cn0, Xnt, Unt, Dnt and their cross products, we have1
nT
XT
t=1(cn0~t+ ~Xnt�0)0Bn(cn0~t+ ~Xnt�0) = O
�T 2�; (B.7)
11For Mn = RnJnR�1n = An � Bn, as An = S�1n ( 0In + �0Wn) is row sum and column sum bounded, and Bn is also row
sum and column sum bounded implied by Assumption 8, Mn is also row sum and column sum bounded. Hence, we can replace
Bn with Mn or M 0nMn to apply following lemmas.
1
nT
XT
t=1(cn0~t+ ~Xnt�0)0Bn~�nt = Op
rT 3
n
!with zero mean; (B.8)
1
nT
XT
t=1~�0ntBn~�nt � E(
1
nT
XT
t=1~�0ntBn~�nt) = Op
�Tpn
�(B.9)
where E( 1nT
PTt=1
~�0ntBn~�nt) = O(T );
1
nT
XT
t=1(cn0~t+ ~Xnt�0)0BnDnt = O (T ) ; (B.10)
1
nT
XT
t=1(cn0~t+ ~Xnt�0)0 ~Unt = Op
rT
n
!with mean zero; (B.11)
1
nT
XT
t=1~�0ntBnDnt = Op
rT
n
!with mean zero; (B.12)
1
nT
XT
t=1~�0nt~Unt � E(
1
nT
XT
t=1~�0nt~Unt) = Op
rT
n
!, (B.13)
where E( 1nT
PTt=1
~�0nt~Unt) = O(1) and
1
nT
XT
t=1~U0nt ~Wnt � E(
1
nT
XT
t=1~U0nt ~Wnt) = O
�1pnT
�, (B.14)
where E( 1nT
PTt=1
~U0nt ~Wnt) = O(1).
Lemma B.9 Denote Bn an n�n nonstochastic matrix which is row sum and column sum bounded uniformlyin n. Under Assumptions 1-8, for cn0, Xnt, �nt, Vnt, Unt and their cross products,
1
nT
XT
t=1(cn0~t�1 + ~Xn;t�1�0)0BnVnt = Op
rT
n
!; (B.15)
1
nT
XT
t=1�0n;t�1BnVnt = Op
�1pn
�; (B.16)
1
n��0n;T�1Bn �VnT � E
1
n��0n;T�1Bn �VnT = Op
�1pn
�, (B.17)
where E 1n��0n;T�1Bn �VnT = O(1) and for Bn =M 0
n, E1n (Mn
��n;T�1)0 �VnT = �
20(T�1)(T�2)mn
2T 2n = O(mn
n );
1
n�U0n;T�1 �VnT � E
1
n�U0n;T�1 �VnT = Op
�1pn
�, (B.18)
where E 1n�U0n;T�1 �VnT = O(
1T );
1
nT
TXt=1
~V 0ntBn ~Vnt = (1�1
T)�20
1
ntr(Bn) +Op
�1pnT
�. (B.19)
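Lemma B.10 Denote $B_n$ an $n \times n$ nonstochastic matrix which is row sum and column sum bounded uniformly in $n$. Under Assumptions 1-8,
$$\frac{1}{nT}\sum_{t=1}^{T}\tilde{\tilde Y}^{u\prime}_{n,t-1}B_n\tilde{\tilde Y}^{u}_{n,t-1} - E\frac{1}{nT}\sum_{t=1}^{T}\tilde{\tilde Y}^{u\prime}_{n,t-1}B_n\tilde{\tilde Y}^{u}_{n,t-1} = O_p\left(T\cdot\sqrt{\frac{T}{n}}\right), \tag{B.20}$$
$$\frac{1}{nT}\sum_{t=1}^{T}\tilde{\tilde Y}^{u\prime}_{n,t-1}B_n\tilde{\tilde Y}^{s}_{n,t-1} - E\frac{1}{nT}\sum_{t=1}^{T}\tilde{\tilde Y}^{u\prime}_{n,t-1}B_n\tilde{\tilde Y}^{s}_{n,t-1} = O_p\left(\sqrt{\frac{T}{n}}\right), \tag{B.21}$$
and
$$\frac{1}{nT}\sum_{t=1}^{T}\tilde{\tilde Y}^{s\prime}_{n,t-1}B_n\tilde{\tilde Y}^{s}_{n,t-1} - E\frac{1}{nT}\sum_{t=1}^{T}\tilde{\tilde Y}^{s\prime}_{n,t-1}B_n\tilde{\tilde Y}^{s}_{n,t-1} = O_p\left(\frac{1}{\sqrt{nT}}\right), \tag{B.22}$$
where $E\frac{1}{nT}\sum_{t=1}^{T}\tilde{\tilde Y}^{u\prime}_{n,t-1}B_n\tilde{\tilde Y}^{u}_{n,t-1} = O(T^2)$, $E\frac{1}{nT}\sum_{t=1}^{T}\tilde{\tilde Y}^{u\prime}_{n,t-1}B_n\tilde{\tilde Y}^{s}_{n,t-1} = O(T)$ and $E\frac{1}{nT}\sum_{t=1}^{T}\tilde{\tilde Y}^{s\prime}_{n,t-1}B_n\tilde{\tilde Y}^{s}_{n,t-1} = O(1)$.
Lemma B.11 Under Assumptions 1-8 and $B_n$ is an $n \times n$ nonstochastic matrix which is row sum and column sum bounded uniformly in $n$,
$$\frac{1}{nT}\sum_{t=1}^{T}\tilde{\tilde Y}^{u\prime}_{n,t-1}B_n V_{nt} - E\frac{1}{nT}\sum_{t=1}^{T}\tilde{\tilde Y}^{u\prime}_{n,t-1}B_n V_{nt} = O_p\left(\sqrt{\frac{T}{n}}\right), \tag{B.23}$$
$$\frac{1}{nT}\sum_{t=1}^{T}\tilde{\tilde Y}^{s\prime}_{n,t-1}B_n V_{nt} - E\frac{1}{nT}\sum_{t=1}^{T}\tilde{\tilde Y}^{s\prime}_{n,t-1}B_n V_{nt} = O_p\left(\frac{1}{\sqrt{nT}}\right), \tag{B.24}$$
where $E\frac{1}{nT}\sum_{t=1}^{T}\tilde{\tilde Y}^{u\prime}_{n,t-1}B_n V_{nt} = O(1)$ and $E\frac{1}{nT}\sum_{t=1}^{T}\tilde{\tilde Y}^{s\prime}_{n,t-1}B_n V_{nt} = O(\frac{1}{T})$. For the special case with $B_n = I_n$, we have $E\frac{1}{nT}\sum_{t=1}^{T}\tilde{\tilde Y}^{u\prime}_{n,t-1}V_{nt} = \sigma_0^2\frac{(T-1)(T-2)m_n}{2T^2 n}\frac{1}{1-\lambda_0} = O(\frac{m_n}{n})$ where $m_n$ is the number of unit roots.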
Proposition B.12 Consider the $m \times m$ square matrix $H_T = I_m + T(g_T d'_T + h_T b'_T)$, where $g_T$, $h_T$, $b_T$, and $d_T$ are all $m$-dimensional column vectors. Then, under the assumption $\Delta_T \neq 0$,
$$H_T^{-1} = I_m - \frac{T}{\Delta_T}B_T, \tag{B.25}$$
where
$$\Delta_T = 1 + T(b'_T h_T + d'_T g_T) - T^2\,\mathrm{Det}\left((b_T, d_T)'(g_T, h_T)\right), \tag{B.26}$$
and
$$B_T = (h_T b'_T + g_T d'_T) - T[(d'_T h_T)g_T b'_T + (b'_T g_T)h_T d'_T - (d'_T g_T)h_T b'_T - (b'_T h_T)g_T d'_T]. \tag{B.27}$$
Proposition B.13 Consider the $m \times m$ stochastic matrix $K_T$,
$$K_T = T^2 c_T c'_T + T(b_T d'_T + d_T b'_T) + A_T,$$
where $c_T$, $b_T$ and $d_T$ are $m$-dimensional column random vectors with $c_T$ proportional to $b_T$ such that $c_T = \omega_T \cdot b_T$, where $\omega_T$ is a nonzero random variable with probability one. Suppose that, as $T \to \infty$, $c_T$, $b_T$, $d_T$, and $\omega_T$ converge in probability, respectively, to finite limits $c$, $b$, $d$ and $\omega$ where $c$ and $\omega$ are nonstochastic and nonzero. Assume that $A_T$ is positive definite for large enough $T$ with probability one and its limit $A$ exists and is also a positive definite matrix. Then, under the condition that $\Delta = 1 - \frac{1}{\omega^2}\left[d'A^{-1}d - \frac{(d'A^{-1}b)^2}{b'A^{-1}b}\right] \neq 0$,
(a) the limit of $K_T^{-1}$ is a finite matrix $L_k$, where
$$L_k = \left(A^{-1} - \frac{1}{b'A^{-1}b}A^{-1}bb'A^{-1}\right) + \frac{1}{\omega^2\Delta}A^{-1}\left(d - \frac{d'A^{-1}b}{b'A^{-1}b}b\right)\left(d - \frac{d'A^{-1}b}{b'A^{-1}b}b\right)'A^{-1};$$
(b) $K_T^{-1}\cdot c_T = O_p(T^{-1})$;
(c) $c'_T K_T^{-1}c_T = O_p(T^{-2})$ and $T^2 c'_T K_T^{-1}c_T = 1 + O_p(T^{-1})$.
Proposition B.14 Assuming $\mathrm{plim}_{T\to\infty}H^s_{nT}$ is positive definite, we have
$$H^{-1}_{1,nT}c = O_p(T^{-1}), \tag{B.28}$$
$$c'H^{-1}_{1,nT}c = O_p(T^{-2}), \tag{B.29}$$
$$H^{-1}_{1,nT}H_{2,nT} = O_p(1), \tag{B.30}$$
$$1 - c'H^{-1}_{1,nT}H_{2,nT} = O_p(T^{-1}), \tag{B.31}$$
$$\left(H_{3,nT} - H'_{2,nT}H^{-1}_{1,nT}H_{2,nT}\right)^{-1} \text{ exists and is } O_p(1), \tag{B.32}$$
$$\mathrm{plim}_{T\to\infty}\left(H_{3,nT} - H'_{2,nT}H^{-1}_{1,nT}H_{2,nT}\right)^{-1} \neq 0, \tag{B.33}$$
$$\left(H_{3,nT} - H'_{2,nT}H^{-1}_{1,nT}H_{2,nT}\right) - \left(EH_{3,nT} - EH'_{2,nT}(EH_{1,nT})^{-1}EH_{2,nT}\right) \overset{p}{\to} 0, \tag{B.34}$$
$$\mathrm{plim}_{T\to\infty}\left(H_{3,nT} - H'_{2,nT}H^{-1}_{1,nT}H_{2,nT}\right) = \lim_{T\to\infty}\left(EH_{3,nT} - EH'_{2,nT}(EH_{1,nT})^{-1}EH_{2,nT}\right). \tag{B.35}$$
Proposition B.15 For QML estimator �nT in Theorem 4.1, de�ne
���nT ;nT =1
�2nT
0@ HnT (�nT ) 0
0 0
1A+0BBB@0 0 0
0 1n
htr(G0n(�nT )Gn(�nT )) + tr(G
2n(�nT ))
i1
�2nTntr(Gn(�nT ))
0 1�2nTn
tr(Gn(�nT ))1
2�4nT
1CCCA ,(B.36)
where HnT (�nT ) is HnT (�) (see (B.47)) evaluated at �nT , then,
���1�nT ;nT
� ��1�0;nT = Op�max
�1pnT;1
T
��, (B.37)
T �h���1�nT ;nT
� ��1�0;nTi(c�0; 0)0 = Op
�max
�1pnT;1
T
��. (B.38)
B.3 Proofs for Basic Lemmas
Proof for Proposition B.1: The first part is known in Ord (1975). To show that $W_n$ is diagonalizable, note that as $W_n = \Lambda_n^{-1}C_n$, it implies that $\Lambda_n^{1/2}W_n\Lambda_n^{-1/2} = \Lambda_n^{-1/2}C_n\Lambda_n^{-1/2}$, which is a symmetric matrix. Let $D^*_n$ be the eigenvalue matrix of $\Lambda_n^{-1/2}C_n\Lambda_n^{-1/2}$, which is real; and let $R^*_n$ be the corresponding orthonormal matrix such that $\Lambda_n^{-1/2}C_n\Lambda_n^{-1/2} = R^*_n D^*_n R^{*\prime}_n$. Hence, $W_n = \Lambda_n^{-1/2}(R^*_n D^*_n R^{*\prime}_n)\Lambda_n^{1/2} = R_n D^*_n R_n^{-1}$ where $R_n = \Lambda_n^{-1/2}R^*_n$ is an eigenvector matrix of $W_n$, and $D^*_n$ is the eigenvalue matrix for $W_n$. $\square$
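The similarity argument in this proof can be illustrated numerically. The sketch below is an assumption-laden toy: the $3 \times 3$ symmetric matrix `C` is hypothetical, and the code only checks that the row-normalized matrix, which is not itself symmetric, becomes symmetric after the square-root scaling by the row sums used in the proof.

```python
import math

# Hypothetical symmetric contiguity-type matrix with zero diagonal.
C = [[0.0, 1.0, 2.0],
     [1.0, 0.0, 1.0],
     [2.0, 1.0, 0.0]]
n = len(C)
row_sums = [sum(row) for row in C]

# Row normalization: each row of C is divided by its row sum.
W = [[C[i][j] / row_sums[i] for j in range(n)] for i in range(n)]

# Similarity transform: scale row i by sqrt(row_sums[i]) and column j by 1/sqrt(row_sums[j]).
sym = [[math.sqrt(row_sums[i]) * W[i][j] / math.sqrt(row_sums[j]) for j in range(n)]
       for i in range(n)]

w_is_symmetric = all(W[i][j] == W[j][i] for i in range(n) for j in range(n))
max_asymmetry = max(abs(sym[i][j] - sym[j][i]) for i in range(n) for j in range(n))
```

Because the transformed matrix is symmetric and similar to the row-normalized one, the two share the same real eigenvalues, which is the point of the proof.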
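Proof for Proposition B.2: Because $W_n = R_n D^*_n R_n^{-1}$ from Proposition B.1, it follows that
$$\begin{aligned} A_n &= (I_n - \lambda_0 W_n)^{-1}(\gamma_0 I_n + \rho_0 W_n) \\ &= (I_n - \lambda_0 R_n D^*_n R_n^{-1})^{-1}(\gamma_0 I_n + \rho_0 R_n D^*_n R_n^{-1}) \\ &= R_n(I_n - \lambda_0 D^*_n)^{-1}R_n^{-1}\cdot R_n(\gamma_0 I_n + \rho_0 D^*_n)R_n^{-1} \\ &= R_n(I_n - \lambda_0 D^*_n)^{-1}(\gamma_0 I_n + \rho_0 D^*_n)R_n^{-1}. \end{aligned}$$
Note that $D_n = (I_n - \lambda_0 D^*_n)^{-1}(\gamma_0 I_n + \rho_0 D^*_n)$ is a diagonal real matrix because $D^*_n$ is diagonal and real, and $(I_n - \lambda_0 D^*_n)$ is invertible because $(I_n - \lambda_0 W_n)$ is assumed to be invertible to begin with. $\square$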
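Proof for Proposition B.3: For (1): The eigenvalue of $A_n$ has the formula $\frac{\gamma_0 + \rho_0\varpi_{ni}}{1 - \lambda_0\varpi_{ni}}$ where $\varpi_{ni}$ is an eigenvalue of $W_n$ with $|\varpi_{ni}| \leq 1$ for all $i$ and $\varpi_{ni} = 1$ for some $i$ (see Ord (1975)). Because $\frac{\partial}{\partial\varpi_{ni}}\left(\frac{\gamma_0 + \rho_0\varpi_{ni}}{1 - \lambda_0\varpi_{ni}}\right) = \frac{\rho_0 + \gamma_0\lambda_0}{(1 - \lambda_0\varpi_{ni})^2}$ and $|\varpi_{ni}| \leq 1$ for all $i$, $\rho_0 + \gamma_0\lambda_0 > 0$ will imply that $\frac{\gamma_0 + \rho_0\varpi_{ni}}{1 - \lambda_0\varpi_{ni}}$ is an increasing function of $\varpi_{ni}$. As $\varpi_{n,\max} = 1$, the maximum value of $\frac{\gamma_0 + \rho_0\varpi_{ni}}{1 - \lambda_0\varpi_{ni}}$ will be achieved at $\varpi_{ni} = 1$; additionally, as $\varpi_{n,\min} \geq -1$, $\frac{\gamma_0 - \rho_0}{1 + \lambda_0} > -1$ will assure that the minimum value of $\frac{\gamma_0 + \rho_0\varpi_{ni}}{1 - \lambda_0\varpi_{ni}}$ will be greater than $-1$. For (2): If $\lambda_0 + \gamma_0 + \rho_0 = 1$, then $\rho_0 + \gamma_0\lambda_0 > 0$ is equivalent to $(1 - \gamma_0)(1 - \lambda_0) > 0$; also, $\frac{\gamma_0 - \rho_0}{1 + \lambda_0} > -1$ if $\gamma_0 + \lambda_0 > 0$ and $\lambda_0 > -1$. Under $\lambda_0 + \gamma_0 + \rho_0 = 1$, $\gamma_0 + \lambda_0 > 0$ is equivalent to $\rho_0 < 1$. The conclusion in (2) follows. $\square$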
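Proof for Proposition B.4: (1) An eigenvalue $d_{ni}$ of $A_n$ has the form $d_{ni} = \frac{\gamma_0 + \rho_0\varpi_{ni}}{1 - \lambda_0\varpi_{ni}}$ for some eigenvalue $\varpi_{ni}$ of $W_n$. Thus, $d_{ni} = 1$ is equivalent to $\gamma_0 + \rho_0\varpi_{ni} = 1 - \lambda_0\varpi_{ni}$ when $|\lambda_0| < 1$. It is apparent that $\varpi_{ni} = 1$ is equivalent to $d_{ni} = 1$ when $\lambda_0 + \gamma_0 + \rho_0 = 1$ and $\gamma_0 \neq 1$. That $\lambda_0 + \gamma_0 + \rho_0 = 1$ is a necessary condition is trivial. (2) Because $A_n = R_n D_n R_n^{-1}$ and $D_n J_n = (J_n + \tilde D_n)J_n = J_n$, we have $A_n R_n J_n R_n^{-1} = R_n D_n J_n R_n^{-1} = R_n J_n R_n^{-1}$. Note that because $J_n$ and $D_n$ are diagonal matrices, $R_n J_n R_n^{-1}A_n = R_n J_n D_n R_n^{-1} = R_n D_n J_n R_n^{-1} = A_n R_n J_n R_n^{-1}$. (3) From Proposition B.1, $W_n = R_n D^*_n R_n^{-1}$. Hence, $W_n R_n J_n R_n^{-1} = R_n D^*_n J_n R_n^{-1} = R_n J_n R_n^{-1}$ as $D^*_n J_n = J_n$ when the unit eigenvalues of $W_n$ correspond to unit eigenvalues of $A_n$. As $S_n^{-1} = R_n(I_n - \lambda_0 D^*_n)^{-1}R_n^{-1}$, we have $S_n^{-1}R_n J_n = R_n(I_n - \lambda_0 D^*_n)^{-1}J_n = \frac{1}{1-\lambda_0}R_n J_n$ because $(I_n - \lambda_0 D^*_n)^{-1}J_n = \frac{1}{1-\lambda_0}J_n$. It follows that $S_n^{-1}R_n J_n R_n^{-1} = \frac{1}{1-\lambda_0}R_n J_n R_n^{-1}$. Furthermore, $R_n J_n R_n^{-1}S_n^{-1} = R_n J_n(I_n - \lambda_0 D^*_n)^{-1}R_n^{-1} = R_n(I_n - \lambda_0 D^*_n)^{-1}J_n R_n^{-1} = S_n^{-1}R_n J_n R_n^{-1}$. $\square$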
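Proof for Proposition B.5: Suppose that the number of unit roots of $A_n$ is $m_n$, then $D_n = J_n + \tilde D_n$ where $J_n = \mathrm{Diag}\{1'_{m_n}, 0, \cdots, 0\}$ and $\tilde D_n = \mathrm{Diag}\{0, \cdots, 0, d_{n,m_n+1}, \cdots, d_{nn}\}$ with $|d_{nj}| < 1$ for all $j = m_n+1, \cdots, n$. As $J_n$ is idempotent and $J_n\cdot\tilde D_n = 0$, we have $A^h_n = R_n J_n R_n^{-1} + B^h_n$ where $B^h_n = R_n\tilde D^h_n R_n^{-1}$ for any $h = 1, 2, 3, \cdots$.
Because $Y_{nt} = A_n Y_{n,t-1} + S_n^{-1}(X_{nt}\beta_0 + c_{n0} + V_{nt})$, we can decompose $Y_{nt}$ as $Y_{nt} = Y^u_{nt} + Y^s_{nt}$ where $Y^u_{nt} = R_n J_n R_n^{-1}Y_{n,t-1}$ and $Y^s_{nt} = B_n Y_{n,t-1} + S_n^{-1}(X_{nt}\beta_0 + c_{n0} + V_{nt})$. By using $B_n A_n = B^2_n$ and $B_n S_n^{-1} = S_n^{-1}B_n$, $Y^s_{nt}$ can be written as an infinite sum of the past by recursive induction for any integer $t$:
$$\begin{aligned} Y^s_{nt} &= B_n Y_{n,t-1} + S_n^{-1}(X_{nt}\beta_0 + c_{n0} + V_{nt}) \\ &= B_n[A_n Y_{n,t-2} + S_n^{-1}(X_{n,t-1}\beta_0 + c_{n0} + V_{n,t-1})] + S_n^{-1}(X_{nt}\beta_0 + c_{n0} + V_{nt}) \\ &= S_n^{-1}\left(\sum_{h=0}^{\infty}B^h_n\right)c_{n0} + S_n^{-1}\sum_{h=0}^{\infty}B^h_n(X_{n,t-h}\beta_0 + V_{n,t-h}). \end{aligned}$$
For $Y^u_{nt}$, there are two versions which will be useful. By using $R_n J_n R_n^{-1}A_n = R_n J_n R_n^{-1}$ and $R_n J_n R_n^{-1}S_n^{-1} = S_n^{-1}R_n J_n R_n^{-1}$,
$$\begin{aligned} Y^u_{nt} &= R_n J_n R_n^{-1}Y_{n,t-1} \\ &= R_n J_n R_n^{-1}[A_n Y_{n,t-2} + S_n^{-1}(X_{n,t-1}\beta_0 + c_{n0} + V_{n,t-1})] \\ &= R_n J_n R_n^{-1}Y_{n,t-2} + S_n^{-1}R_n J_n R_n^{-1}(X_{n,t-1}\beta_0 + c_{n0} + V_{n,t-1}) \\ &= R_n J_n R_n^{-1}Y_{n,0} + (t-1)S_n^{-1}R_n J_n R_n^{-1}c_{n0} + S_n^{-1}R_n J_n R_n^{-1}\sum_{h=1}^{t-1}(X_{nh}\beta_0 + V_{nh}), \end{aligned}$$
for $t = 1, 2, \cdots$, where $\sum_{h=1}^{0}$ is zero as a convention. Another version is to expand $Y^u_{nt}$ to $Y_{n,-1}$ as
$$Y^u_{nt} = R_n J_n R_n^{-1}Y_{n,-1} + tS_n^{-1}R_n J_n R_n^{-1}c_{n0} + S_n^{-1}R_n J_n R_n^{-1}\sum_{h=0}^{t-1}(X_{nh}\beta_0 + V_{nh}), \tag{B.39}$$
for $t = 0, 1, 2, \cdots$.
Using $\frac{1}{T}\sum_{t=1}^{T}(t-1) = \frac{1}{T}\sum_{t=1}^{T-1}t = \frac{T-1}{2}$, $\frac{1}{T}\sum_{t=1}^{T-1}\sum_{h=0}^{t-1}z_h = \frac{1}{T}\sum_{t=1}^{T-1}(T-t)z_{t-1}$ and $\frac{1}{T}\sum_{t=2}^{T}\sum_{h=1}^{t-1}z_h = \frac{1}{T}\sum_{t=1}^{T-1}(T-t)z_t$, it follows that
$$\begin{aligned} \bar Y^u_{nT} &= \frac{1}{T}\sum_{t=1}^{T}Y^u_{nt} \\ &= R_n J_n R_n^{-1}Y_{n0} + S_n^{-1}R_n J_n R_n^{-1}c_{n0}\frac{1}{T}\sum_{t=1}^{T}(t-1) + S_n^{-1}R_n J_n R_n^{-1}\frac{1}{T}\sum_{t=2}^{T}\sum_{h=1}^{t-1}(X_{nh}\beta_0 + V_{nh}) \\ &= R_n J_n R_n^{-1}Y_{n0} + S_n^{-1}R_n J_n R_n^{-1}c_{n0}\left(\frac{T-1}{2}\right) + S_n^{-1}R_n J_n R_n^{-1}\frac{1}{T}\sum_{t=1}^{T-1}(T-t)(X_{nt}\beta_0 + V_{nt}); \end{aligned}$$
and
$$\begin{aligned} \bar Y^u_{nT,-1} &= \frac{1}{T}\sum_{t=0}^{T-1}Y^u_{nt} \\ &= R_n J_n R_n^{-1}Y_{n,-1} + S_n^{-1}R_n J_n R_n^{-1}c_{n0}\frac{1}{T}\sum_{t=1}^{T-1}t + S_n^{-1}R_n J_n R_n^{-1}\frac{1}{T}\sum_{t=1}^{T-1}\sum_{h=0}^{t-1}(X_{nh}\beta_0 + V_{nh}) \\ &= R_n J_n R_n^{-1}Y_{n,-1} + S_n^{-1}R_n J_n R_n^{-1}c_{n0}\left(\frac{T-1}{2}\right) + S_n^{-1}R_n J_n R_n^{-1}\frac{1}{T}\sum_{t=0}^{T-2}(T-1-t)(X_{nt}\beta_0 + V_{nt}). \end{aligned}$$
Hence,
$$\begin{aligned} \tilde{\tilde Y}^u_{n,t-1} &= Y^u_{n,t-1} - \bar Y^u_{nT,-1} \\ &= S_n^{-1}R_n J_n R_n^{-1}\left\{c_{n0}\left[(t-1) - \left(\frac{T-1}{2}\right)\right] + \sum_{h=0}^{t-2}(X_{nh}\beta_0 + V_{nh}) - \frac{1}{T}\sum_{h=0}^{T-2}(T-1-h)(X_{nh}\beta_0 + V_{nh})\right\} \\ &= \frac{R_n J_n R_n^{-1}}{(1-\lambda_0)}\left\{c_{n0}\left[(t-1) - \left(\frac{T-1}{2}\right)\right] + \sum_{h=0}^{t-2}(X_{nh}\beta_0 + V_{nh}) - \frac{1}{T}\sum_{h=0}^{T-2}(T-1-h)(X_{nh}\beta_0 + V_{nh})\right\} \end{aligned}$$
because $S_n^{-1}R_n J_n R_n^{-1} = \frac{1}{1-\lambda_0}R_n J_n R_n^{-1}$ from Proposition B.4 (3ii). For the stationary component,
$$\bar Y^s_{nT,-1} = \frac{1}{T}\sum_{t=0}^{T-1}Y^s_{nt} = S_n^{-1}\left(\sum_{h=0}^{\infty}B^h_n\right)c_{n0} + S_n^{-1}\sum_{h=0}^{\infty}B^h_n\frac{1}{T}\sum_{t=0}^{T-1}(X_{n,t-h}\beta_0 + V_{n,t-h});$$
and
$$\tilde{\tilde Y}^s_{n,t-1} = Y^s_{n,t-1} - \bar Y^s_{nT,-1} = S_n^{-1}\sum_{h=0}^{\infty}B^h_n\left[\left(X_{n,t-1-h} - \frac{1}{T}\sum_{t=0}^{T-1}X_{n,t-h}\right)\beta_0 + \left(V_{n,t-1-h} - \frac{1}{T}\sum_{t=0}^{T-1}V_{n,t-h}\right)\right]. \quad \square$$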
Proof for Proposition B.6: We use the result of Proposition B.4 to prove the result here. Conditions there are satisfied under Assumptions 5 and 6. That $W_n Y^u_{n,t-1} = Y^u_{n,t-1}$ follows from (B.1) of Proposition B.5 using $W_n R_n J_n R_n^{-1} = R_n J_n R_n^{-1}$ from Proposition B.4. For $G_n Y^u_{n,t-1} = \frac{1}{1-\lambda_0}Y^u_{n,t-1}$, this is so because (1) $S_n^{-1}Y^u_{n,t-1} = \frac{1}{1-\lambda_0}Y^u_{n,t-1}$ using $S_n^{-1}R_n J_n R_n^{-1} = \frac{1}{1-\lambda_0}R_n J_n R_n^{-1}$ and (2) $G_n = W_n S_n^{-1} = S_n^{-1}W_n$. Also, as $Z^u_{nt} = (Y^u_{n,t-1}, W_n Y^u_{n,t-1}, 0_{1\times k_x}) = Y^u_{n,t-1}(1, 1, 0_{1\times k_x})$, we have $G_n Z^u_{nt}\delta_0 = Y^u_{n,t-1}$. This follows because $G_n Z^u_{nt}\delta_0 = G_n Y^u_{n,t-1}(\gamma_0 + \rho_0)$ and $\gamma_0 + \rho_0 = 1 - \lambda_0$. $\square$
Proof for Lemma B.7: See Lemma A.2 and A.4 in Yu, de Jong and Lee (2006). $\square$
Proof for Lemma B.8:
Equation (B.7): Let �(Bn) be its spectral radius (the largest eigenvalue in absolute value) and jj � jj be amatrix norm. It is known from matrix theory that �(Bn) � jjBnjj (see Horn and Johnson (1985)). Takingk�k to be either k�k1 or k�k1, it follows that fkBnkg is bounded because Bn is row sum and column sum
bounded. With the above settings,����� 1nTTXt=1
(cn0~t+ ~Xnt�0)0Bn(cn0~t+ ~Xnt�0)�����
� �(Bn) ������ 1nT
TXt=1
(cn0~t+ ~Xnt�0)0(cn0~t+ ~Xnt�0)
������ kBnk �
����� 1nTTXt=1
(cn0~t+ ~Xnt�0)0(cn0~t+ ~Xnt�0)
�����= jjBnjj �
1
nT
TXt=1
(c0n0cn0~t2 + 2c0n0
~Xnt�0~t+ (~Xnt�0)0(~Xnt�0)).
Because 1T
PTt=1 t
2 = 16 (T + 1)(2T + 1) = O(T 2),
���~Xnt�0��� = ���Pt�1h=0
~Xnh�0
��� � t � supn;t��� ~Xnt�0���, ~Xnt is
bounded uniformly in all n and t, and elements of cn0 are also uniformly bounded, we have the result that��� 1nT PTt=1(cn0~t+
~Xnt�0)0Bn(cn0~t+ ~Xnt�0)��� = O(T 2).
Equation (B.8): As E�nt�0ns = �
20minft; sgIn, we have
V ar(1
nT
TXt=1
(cn0~t+ ~Xnt�0)0Bn~�nt) = V ar(1
nT
TXt=1
(cn0~t+ ~Xnt�0)0Bn�nt)
=1
n2T 2
TXt=1
TXs=1
(cn0~t+ ~Xnt�0)0BnE�nt�0nsB0n(cn0~s+ ~Xns�0)
=�20n2T 2
TXt=1
TXs=1
minfs; tg(cn0~t+ ~Xnt�0)0B2n(cn0~s+ ~Xns�0)
� �20n2T 2
TXt=1
t(cn0~t+ ~Xnt�0)
!B2n
TXs=1
�cn0~s+ ~Xns�0
�!
=�20T
3
n
1
n
1
T 3
TXt=1
t(cn0~t+ ~Xnt�0)
!B2n
1
T 2
TXs=1
�cn0~s+ ~Xns�0
�!= O(
T 3
n)
by the uniform boundedness elements of cn0 and Xnt, and the uniform boundedness of B2n in row and columnsums. The result follows.
Equation (B.9): We have 1nT
PTt=1
~�0ntBn~�nt = 1
nT
PTt=1(�
0ntBn�nt)� 1
n��0nTBn��nT .
For the �rst part, E( 1nT
PTt=1(�
0ntBn�nt)) = �20tr(Bn)
�1nT
PTt=1 t
�= O(T ) and
V ar( 1nT
PTt=1 �
0ntBn�nt) = 1
n2T 2
PTt=1
PTs=1 Cov(�
0ntBn�nt; �0nsBn�ns). Using Lemma B.7 for covariance be-
tween U0ntWnt and U0ntWnt (in our case here, Unt =P1
h=1 Pnt;hVn;t+1�h and Wnt =P1
h=1Qnt;hVn;t+1�h,
where Pnt;h = In andQnt;h = Bn for h � t, and Pnt;h = Qnt;h = 0 for h > t.), we have V ar( 1nT
PTt=1 �
0ntBn�nt) =
T 2
n .
For the second part, E( 1n��0nTBn��nT ) = E( 1n
�U0nT �WnT ) where �UnT =P1
h=1�PnT;hVn;T�h and �WnT =P1
h=1�QnT;hVn;T�h with
�PnT;h =
8<: InhT for h � T
0 for h > Tand �QnT;h =
8<: Bn hT for h � T0 for h > T
. (B.40)
Then, using Lemma B.7, E( 1n��0nTBn��nT ) = O(T ) and V ar( 1n
��0nTBn��nT ) = 1
n2Cov(�U0nT �WnT ; �U0nT �WnT ) =
O(T2
n ) becauseP1
h=1�PnT;h �P
0nT;h =
PTh=1(
hT )
2In,P1
h=1�Q0nT;h
�PnT;h =PT
h=1(hT )
2B0n,P1
h=1�QnT;h �Q
0nT;h =PT
h=1(hT )
2BnB0n andPT
h=1 h2 = O(T 3).
Equation (B.10): Because of the uniform boundedness of cn0, ~Dnt and Bn, there exist �nite constants c1
and c2 such that����� 1nTTXt=1
(cn0~t+ ~Xnt�0)0Bn ~Dnt
������ 1
T
TXt=1
�����c0n0Bn ~Dntn
����� � ��~t��+ 1
T
TXt=1
����� (~Xnt�0)0Bn ~Dntn
����� � c1T
TXt=1
j~tj+ c2T
TXt=1
j~tj = O(T ):
Equation (B.11): 1nT
PTt=1(cn0~t+
~Xnt�0)0 ~Unt = 1n
PTt=1(cn0
~tT+
1T~Xnt�0)0 ~Unt = T [ 1nT
PTt=1(cn0
~tT+
1T~Xnt�0)0 ~Unt].
As ~tT and 1
T~Xnt�0 are bounded, using Theorem A.8 in Yu, de Jong and Lee (2006), 1
nT
PTt=1(cn0
~tT +
1T~Xnt�0)0 ~Unt = Op( 1p
nT). Hence, 1
nT
PTt=1(cn0~t+
~Xnt�0)0 ~Unt = Op(q
Tn ).
Equation (B.12): As E�nt�0ns = �
20minft; sgIn, we have
V ar(1
nT
XT
t=1~�0ntBn ~Dnt)
= V ar(1
nT
XT
t=1(�0ntBn ~Dnt) =
1
n2T 2
XT
t=1
XT
s=1( ~D0
ntB0n(E(�nt�0ns))Bn ~Dns)
=�20n2T 2
XT
t=1
XT
s=1minfs; tg( ~D0
ntB0nBn ~Dns) = O(T
n).
Equation (B.13): We have 1nT
PTt=1
~�0nt~Unt = 1
nT
PTt=1
~�0nt~Unt � 1
n��0nT�UnT .
For the �rst part, using Lemma B.7,
E(1
nT
XT
t=1�0ntUnt) =
1
nT
XT
t=1
��20tr(
Xt
h=1Pnt;h)
�= �20
1
nT
�tr(XT
t=1
Xt
h=1Pnt;h)
�= O(1),
becauseP1
h=1 abs(Pnt;h) is row sum and column sum bounded. Also,
V ar(1
nT
XT
t=1�0ntUnt) =
1
n2T 2
XT
t=1
XT
s=1Cov(�0ntUnt; �
0nsUns) = O(
T
n).
This is so as follows. As �nt =P1
h=1Qnt;hVn;t�h where Qnt;h =
8<: In for h = 1; 2; � � � ; t0 for h � t+ 1
, we have
V ar( 1nT
PTt=1 �
0ntUnt) = O(Tn ) using Lemma B.7 because the leading factor
P1h=1Qns;hQ
0ns;h =
Psh=1 In =
s � In andPT
h=1
Pts=1 s = O(T
3).
For the second part, E( 1n��0nT�UnT ) = �20
nT tr�PT
h=1 h � �PnT;h�where �PnT;h is speci�ed in (B.5). So,
E( 1n��0nT�UnT ) = O(1). Also, V ar( 1n��
0nT�UnT ) = 1
n2Cov(�W0nT�UnT ; �W0
nT�UnT ) where �UnT =
P1h=1
�PnT;hVn;t+1�h
and �WnT =P1
h=1�PnT;hVn;t+1�h with �PnT;h speci�ed in (B.5) and �PnT;h speci�ed in (B.40). Then, using
Lemma B.7, we have V ar( 1n��0nT�UnT ) = O(Tn ).
Equation (B.14): This is Theorem A.7 in Yu, de Jong and Lee (2006).
Proof for Lemma B.9
Equation (B.15): Denote �(BnB0nB0n) be the spectral radius of BnB0n. Then,
V ar(1
nT
XT
t=1(cn0~t�1 + ~Xn;t�1�0)0BnVnt)
=�20n2T 2
XT
t=1(cn0~t�1 + ~Xn;t�1�0)0BnB0n(cn0~t�1 + ~Xn;t�1�0)
� �20n2T 2
� �(BnB0n) �����XT
t=1(cn0~t�1 + ~Xn;t�1�0)0(cn0~t�1 + ~Xn;t�1�0)
����� �20
n2T 2� kBnB0nk1 �
����XT
t=1(cn0~t�1 + ~Xn;t�1�0)0(cn0~t�1 + ~Xn;t�1�0)
���� = O(Tn ),because
PTt=1 t
2 = O(T 3) in the leading term.
Equation (B.16):
V ar
�1
nT
XT
t=1�0n;t�1BnVnt
�=
�20n2T 2
XT
t=1E(�0n;t�1BnB0n�n;t�1) =
�20n2T 2
XT
t=1tr�E(BnB0n�n;t�1�0n;t�1)
�=
�40n2T 2
� tr(BnB0n) �XT
t=1(t� 1) = O( 1
n).
Then, 1nT
PTt=1 �
0n;t�1BnVnt = Op
�1pn
�with mean zero.
Equation (B.17): As ��n;T�1 =1T
PT�1t=0 �nt =
1T
PT�1t=0
Pt�1h=0 Vnh =
1T
PT�1t=1 (T � t)Vn;t�1, we have
E(��0n;T�1Bn �VnT ) =
1
T 2E[(XT�1
t=1(T � t)Vn;t�1)0Bn
XT
t=1Vnt] = �
20
(T � 1)(T � 2)2T 2
tr(Bn). (B.41)
For the special case where Bn = Mn, E(��0n;T�1Mn
�VnT ) = �20(T�1)(T�2)
2T 2 tr(Mn) = �20(T�1)(T�2)
2T 2 mn because
tr(Mn) = tr(RnJnR�1n ) = tr(Jn) = mn. Also,
V ar(��0n;T�1Bn �VnT )
= V ar(1
T 2(XT�1
t=1(T � t)Vn;t�1)0Bn
XT
t=1Vnt) =
1
T 4V ar(
XT
t=1V 0ntB0n(
XT�1
t=1(T � t)Vn;t�1))
=1
T 4V ar(U0nT;�1WnT;�1),
where UnT;�1 =P1
h=1 Pnt;hVn;t+1�h with Pnt;h =
8<: In for h � T0 for h > T
and WnT;�1 =P1
h=1Qnt;hVn;t+1�h
with Qnt;h =
8<: B0n � (h� 1) for h � T0 for h > T
. From Lemma B.7,
V ar(��0n;T�1Bn �VnT )
=1
T 4�40tr
�In � T � B0nBn �
XT
h=1(h� 1)2 +
XT
h=1(h� 1) �
XT
h=1(h� 1)B0nB0n
�+1
T 4(�4 � 3�40)
XT
h=1(h� 1)2 �
Xn
i=1(Bn)ii(Bn)ii
=1
T 4�40
�tr(B0nBn) � T �
XT
h=1(h� 1)2 + (tr(B0nB0n) �
(T � 1)2T 24
�+Xn
i=1(Bn)ii(Bn)ii �
1
T 4(�4 � 3�40)
XT
h=1(h� 1)2 = O(n).
So, E(��0n;T�1Bn �VnT ) = O(n) and V ar(��
0n;T�1Bn �VnT ) = O(n).
Equation (B.18): This is implied by Theorem A.11 in Yu, de Jong and Lee (2006). $\square$
Proof for Lemma B.10: Using Lemma B.8 and that $Y_{nt}$ has those components, we have the result. For (B.20), we use (B.7), (B.8) and (B.9) in Lemma B.8. For (B.21), we use (B.10), (B.11), (B.12) and (B.13) in Lemma B.8. For (B.22), it is implied by Lemma B.1 in Yu, de Jong and Lee (2006). $\square$
Proof for Lemma B.11: Using Lemma B.9 and that $Y_{nt}$ has those components, we have the result. For (B.23), we use (B.15), (B.16) and (B.17) in Lemma B.9. For (B.24), it is in Lemma B.1 in Yu, de Jong and Lee (2006). $\square$
Proof for Proposition B.12: The form of the inverse of $H_T$ can be checked by direct multiplication of $H_T$ with the right hand side matrix expression (of $H_T^{-1}$), which will result in an identity matrix. The explicit expression of $H_T^{-1}$ is complicated, but it can be derived as follows. Define $Q_T = I_m + Tg_T d'_T$ and $R_T = I_m + TQ_T^{-1}h_T b'_T$. It follows, by construction, that $H_T = Q_T R_T$. If both $R_T$ and $Q_T$ are invertible, then $H_T$ must be invertible. By the familiar pattern of $Q_T$, its inverse will have the form (see Dhrymes (1978)) $Q_T^{-1} = I_m - \frac{T}{1 + Td'_T g_T}g_T d'_T$, and also, the inverse of $R_T$ has the form $R_T^{-1} = I_m - \frac{T}{1 + Tb'_T Q_T^{-1}h_T}Q_T^{-1}h_T b'_T$. The final expression of $H_T^{-1} = R_T^{-1}Q_T^{-1}$ can be derived by exploring the explicit expressions of $Q_T^{-1}$, $R_T^{-1}$ and their multiplication. $\square$
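The direct-multiplication check mentioned at the start of this proof can be carried out exactly on a small instance. The sketch below is an illustration only: the vectors, the value $T = 7$, the dimension $m = 3$ and all helper names are hypothetical choices, and `Delta` and `B` stand for the scalar in (B.26) and the matrix in (B.27).

```python
from fractions import Fraction as F

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def outer(u, v):
    return [[a * b for b in v] for a in u]

def add(*mats):
    """Entrywise sum of any number of equally sized matrices."""
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*mats)]

def scale(s, M):
    return [[s * a for a in row] for row in M]

def matmul(A, B):
    return [[dot(row, col) for col in zip(*B)] for row in A]

def eye(m):
    return [[F(int(i == j)) for j in range(m)] for i in range(m)]

m, T = 3, F(7)
g = [F(1), F(2), F(0)]
h = [F(0), F(1), F(-1)]
b = [F(3), F(0), F(1)]
d = [F(1), F(-1), F(2)]

H = add(eye(m), scale(T, add(outer(g, d), outer(h, b))))

# (B.26): 1 + T(b'h + d'g) - T^2 det((b, d)'(g, h)).
det2 = dot(b, g) * dot(d, h) - dot(b, h) * dot(d, g)
Delta = 1 + T * (dot(b, h) + dot(d, g)) - T**2 * det2

# (B.27): the matrix multiplying T / Delta in the inverse.
B = add(outer(h, b), outer(g, d),
        scale(-T, add(scale(dot(d, h), outer(g, b)),
                      scale(dot(b, g), outer(h, d)),
                      scale(-dot(d, g), outer(h, b)),
                      scale(-dot(b, h), outer(g, d)))))

H_inv = add(eye(m), scale(-T / Delta, B))   # (B.25)
product = matmul(H, H_inv)                  # should be the identity matrix
```

Exact rational arithmetic is used so that the product can be compared with the identity matrix without a numerical tolerance.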
Proof for Proposition B.13: The following proof is for the case that KT is nonrandom. After we get the
result, it can be extended to the case that KT is random as long as AT is nonsingular with probability 1.
Using the notations in the Proposition B.12, KT = PTHT and K�1T = H�1
T P�1T , where H�1T is in (B.25)
and
P�1T = (T 2cT c0T +AT )
�1 = A�1T � T 2
1 + T 2c0TA�1T cT
A�1T cT c0TA
�1T . (B.42)
Furthermore, denote hT = P�1T dT and gT = P�1T bT , where hT and gT are in Proposition B.12. As cT is
proportional to bT and the explicit inverse formula of PT involves cT , gT = P�1T bT =1
1+T 2c0TA�1T cT
A�1T bT .
This implies the following scalar values: b0T gT =b0TA
�1T bT
1+T 2c0TA�1T cT
, d0T gT = b0ThT =
b0TA�1T dT
1+T 2c0TA�1T cT
, and d0ThT =
d0TA�1T dT � T 2(c0TA
�1T dT )
2
1+T 2c0TA�1T cT
. In terms of orders of magnitude, we have d0ThT = O(1), b0T gT = O(
1T 2 ), d
0T gT =
O( 1T 2 ), and b0ThT = O(
1T 2 ).
With these, one can evaluate �T in H�1T and its limit. The two terms of �T are T (b0ThT + d
0T gT ) =
2b0TA
�1T dT
T (T�2+c0TA�1T cT )
= O( 1T ); and
T 2(b0T gT � d0ThT � b0ThT � d0T gT )
= (b0TA
�1T bT
T�2 + c0TA�1T cT
)(d0TA�1T dT �
(c0TA�1T dT )
2
T�2 + c0TA�1T cT
)� ( Tb0TA�1T dT
1 + T 2c0TA�1T cT
)2
�! 1
!2(d0A�1d� (c
0A�1d)2
c0A�1c):
Hence,
� = limT!1
�T = 1�1
!2
�d0A�1d� (d
0A�1b)2
b0A�1b
�:
As K�1T = H�1
T P�1T = (Im� TBT
�T)P�1T , it remains to consider the limiting behavior of TBT �P�1T where BT is
in (B.27) and P�1T is in (B.42). As b0TP�1T = 1
1+T 2c0TA�1T cT
b0TA�1T and d0TP
�1T = (dT � T 2c0TA
�1T dT
1+T 2c0TA�1T cT
cT )0A�1T ,
these imply the following matrices
gT b0TP
�1T = P�1T bT b
0TP
�1T =
1
(1 + T 2c0TA�1T cT )2
A�1T bT b0TA
�1T ;
hT d0TP
�1T = P�1T dT d
0TP
�1T = A�1T (dT �
T 2c0TA�1T dT
1 + T 2c0TA�1T cT
cT )(dT �T 2c0TA
�1T dT
1 + T 2c0TA�1T cT
cT )0A�1T ;
and
gT d0TP
�1T = (hT b
0TP
�1T )0 = P�1T bT d
0TP
�1T =
1
(1 + T 2c0TA�1T cT )
A�1T bT (dT �T 2cTA
�1T dT
1 + T 2c0TA�1T cT
cT )0A�1T :
In terms of orders of magnitude, hT d0TP�1T = O(1), hT b0TP
�1T = O( 1T 2 ), gT d
0TP
�1T = O( 1T 2 ) and gT b
0TP
�1T =
O( 1T 4 ). Therefore, for TBT �P�1T where BT is in (B.27) and P
�1T is in (B.42), we have T (hT b0T +gT d
0T )P
�1T =
O( 1T ) and
T 2[(d0ThT )gT b0T + (b
0T gT )hT d
0T � (d0T gT )hT b0T � (b0ThT )gT d0T ]P�1T
= T 2(b0T gT )hT d0TP
�1T +O(
1
T 2)
=T 2b0TA
�1T bT
1 + T 2c0TA�1T cT
A�1T (dT �T 2d0TA
�1T cT
1 + T 2c0TA�1T cT
cT )(dT �T 2d0TA
�1T cT
1 + T 2c0TA�1T cT
cT )0A�1T +O(
1
T 2)
�! 1
!2A�1(d� d
0A�1b
b0A�1bb)(d� d
0A�1b
b0A�1bb)0A�1;
because c = !b, which is the limit of (�TBTP�1T ).
Thus, K�1T = (Im � TBT
�T)P�1T = P�1T � 1
�TTBTP
�1T converges to Lk, where
Lk = (A�1 � 1
b0A�1bA�1bb0A�1) +
1
!2�A�1(d� d
0A�1b
b0A�1bb)(d� d
0A�1b
b0A�1bb)0A�1:
For (b) and (c), we have K�1T = (PTHT )
�1 where P�1T = A�1T � T 2
1+T 2c0TA�1T cT
A�1T cT c0TA
�1T and H�1
T =
Im� T�TBT with �T = 1+T (b0ThT+d
0T gT )�T 2j(bT ; dT )0(gT ; hT )j and BT = (hT b0T+gT d0T )�T [(d0ThT )gT b0T+
(b0T gT )hT d0T � (d0T gT )hT b0T � (b0ThT )gT d0T ]. Hence,
K�1T = P�1T � T
�TBTP
�1T (B.43)
= P�1T � T
�T((hT b
0T + gT d
0T )� T [(d0ThT )gT b0T + (b0T gT )hT d0T � (d0T gT )hT b0T � (b0ThT )gT d0T ])P�1T
= P�1T � T
�T(hT b
0T + gT d
0T )P
�1T +
T 2
�T[(d0ThT )gT b
0T + (b
0T gT )hT d
0T � (d0T gT )hT b0T � (b0ThT )gT d0T ]P�1T
= P�1T � T
�T(P�1T dT b
0TP
�1T + P�1T bT d
0TP
�1T )
+T 2
�T[(d0TP
�1T dT )P
�1T bT b
0TP
�1T + (b0TP
�1T bT )P
�1T dT d
0TP
�1T ]
� T2
�T[(d0TP
�1T bT )P
�1T dT b
0TP
�1T + (b0TP
�1T dT )P
�1T bT d
0TP
�1T ].
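As
\[
c_T'P_T^{-1}c_T = \frac{c_T'A_T^{-1}c_T}{1 + T^2 c_T'A_T^{-1}c_T} = O(T^{-2}), \qquad c_T'P_T^{-1}d_T = \frac{c_T'A_T^{-1}d_T}{1 + T^2 c_T'A_T^{-1}c_T} = O(T^{-2}), \qquad\text{(B.44a)}
\]
\[
d_T'P_T^{-1}d_T = d_T'A_T^{-1}d_T - \frac{T^2 (c_T'A_T^{-1}d_T)^2}{1 + T^2 c_T'A_T^{-1}c_T} = O(1) \qquad\text{(B.44b)}
\]
and $b_T$ is proportional to $c_T$, we have
\[
\begin{aligned}
c_T'K_T^{-1}c_T &= c_T'P_T^{-1}c_T - \frac{2T}{\Delta_T}\big(c_T'P_T^{-1}d_T\, b_T'P_T^{-1}c_T\big) \\
&\quad + \frac{T^2}{\Delta_T}\big[(d_T'P_T^{-1}d_T)\,c_T'P_T^{-1}b_T\, b_T'P_T^{-1}c_T + (b_T'P_T^{-1}b_T)\,c_T'P_T^{-1}d_T\, d_T'P_T^{-1}c_T\big] \\
&\quad + \frac{2T^2}{\Delta_T}\big[-(d_T'P_T^{-1}b_T)\,c_T'P_T^{-1}d_T\, b_T'P_T^{-1}c_T\big].
\end{aligned}
\]
Using (B.44), we have $c_T'K_T^{-1}c_T = O(T^{-2})$ and, similarly, $K_T^{-1}c_T = O(T^{-1})$. Also, we have that $T^2 c_T'K_T^{-1}c_T = T^2 c_T'P_T^{-1}c_T + O(T^{-1})$. This is so because $T^2 c_T'P_T^{-1}c_T = \frac{T^2 c_T'A_T^{-1}c_T}{1 + T^2 c_T'A_T^{-1}c_T}$ and $1 - \frac{T^2 c_T'A_T^{-1}c_T}{1 + T^2 c_T'A_T^{-1}c_T} = O(T^{-2})$.
When this Proposition is applied to Proposition 2.1 and B.14, we have $b_T = c_T$. Also, it can be extended to the case where $d_T$ and $A_T$ are stochastic. $\blacksquare$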
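Proof for Proposition B.14: We are going to use Proposition 2.1 to prove Proposition B.14. First, we need to show that
\[
\Delta_T = 1 - \Big(d_T'A_T^{-1}d_T - \frac{(d_T'A_T^{-1}c_T)^2}{c_T'A_T^{-1}c_T}\Big)
\]
has the property that $\Delta \equiv \operatorname{plim}_{T\to\infty}\Delta_T \neq 0$.
In our paper,
\[
\Delta = 1 - \Big[d'A^{-1}d - \frac{(d'A^{-1}c)^2}{c'A^{-1}c}\Big] = 1 - d'A^{-1}d\,\Big[1 - \frac{(d'A^{-1}c)^2}{c'A^{-1}c\; d'A^{-1}d}\Big].
\]
Using the Cauchy inequality, $\frac{(d'A^{-1}c)^2}{c'A^{-1}c\; d'A^{-1}d} \le 1$; using positive definiteness of $A$, $\frac{(d'A^{-1}c)^2}{c'A^{-1}c\; d'A^{-1}d} \ge 0$. Hence, $0 \le 1 - \frac{(d'A^{-1}c)^2}{c'A^{-1}c\; d'A^{-1}d} \le 1$. Also, $d'A^{-1}d < 1$ in our application because it is equivalent to $\lim_{T\to\infty} (d_{nT})'(H^s_{nT}/\omega_{nT})^{-1}(d_{nT}) < 1$, where
\[
d_{nT} = \frac{1}{\omega_{nT}}\Big(\frac{1}{nT^2}\sum_{t=1}^T (\tilde Z^s_{nt}, G_n\tilde Z^s_{nt}\delta_0)'\,\tilde Y^u_{n,t-1}\Big), \qquad
H^s_{nT} = \frac{1}{nT}\sum_{t=1}^T (\tilde Z^s_{nt}, G_n\tilde Z^s_{nt}\delta_0)'(\tilde Z^s_{nt}, G_n\tilde Z^s_{nt}\delta_0), \qquad
\omega_{nT} = \frac{1}{nT^3}\sum_{t=1}^T \tilde Y^{u\prime}_{n,t-1}\tilde Y^u_{n,t-1}.
\]
This is so, as we have
\[
\Big(\frac{1}{nT^2}\sum_{t=1}^T \tilde Y^{u\prime}_{n,t-1}(\tilde Z^s_{nt}, G_n\tilde Z^s_{nt}\delta_0)\Big)
\Big(\frac{1}{nT^3}\sum_{t=1}^T \tilde Y^{u\prime}_{n,t-1}\tilde Y^u_{n,t-1}\Big)^{-1}
\Big(\frac{1}{nT}\sum_{t=1}^T (\tilde Z^s_{nt}, G_n\tilde Z^s_{nt}\delta_0)'(\tilde Z^s_{nt}, G_n\tilde Z^s_{nt}\delta_0)\Big)^{-1}
\Big(\frac{1}{nT^2}\sum_{t=1}^T \tilde Y^{u\prime}_{n,t-1}(\tilde Z^s_{nt}, G_n\tilde Z^s_{nt}\delta_0)\Big)' < 1
\]
and $d'A^{-1}d < 1$ because of the generalized Schwarz inequality and because, for large enough $T$, $(\tilde Y^u_{n,t-1})_i$ is not a linear function of $(\tilde Z^s_{nt}, G_n\tilde Z^s_{nt}\delta_0)_i$ with some positive probability. Hence, combining $0 \le 1 - \frac{(d'A^{-1}c)^2}{c'A^{-1}c\; d'A^{-1}d} \le 1$ with $d'A^{-1}d < 1$, we get $\Delta > 0$. $\blacksquare$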
Proof for (B.28) and (B.29): This is implied by (b) and (c) in Proposition 2.1 when $K_T$ there is taken to be $H_{1,nT}$ here. $\blacksquare$
Proof for (B.30): To prove it, we take $H_{1,nT}/\omega_{nT}$ to be $K_T$. As $K_T^{-1} = O(1)$ and $K_T^{-1}c_T = O_p(T^{-1})$, we need to show $K_T^{-1}(T^2 c_T + T d_T) = O_p(1)$.
From (B.43), we have
\[
\begin{aligned}
K_T^{-1} = P_T^{-1} - \frac{1}{\Delta_T}\Big\{ & T P_T^{-1}(d_T b_T' + b_T d_T')P_T^{-1} - T^2\big[(d_T'P_T^{-1}d_T)P_T^{-1}b_T b_T'P_T^{-1} \\
& + (b_T'P_T^{-1}b_T)P_T^{-1}d_T d_T'P_T^{-1} - (b_T'P_T^{-1}d_T)P_T^{-1}(d_T b_T' + b_T d_T')P_T^{-1}\big]\Big\},
\end{aligned}
\]
where $\Delta_T = 1 + T(b_T'P_T^{-1}d_T + d_T'P_T^{-1}b_T) - T^2[(b_T'P_T^{-1}b_T)(d_T'P_T^{-1}d_T) - (b_T'P_T^{-1}d_T)(d_T'P_T^{-1}b_T)]$. It follows that
\[
K_T^{-1}d_T = \frac{1}{\Delta_T}\big\{P_T^{-1}d_T + T(b_T'P_T^{-1}d_T)P_T^{-1}d_T - T(d_T'P_T^{-1}d_T)P_T^{-1}b_T\big\},
\]
and
\[
K_T^{-1}b_T = \frac{1}{\Delta_T}\big\{P_T^{-1}b_T + T(b_T'P_T^{-1}d_T)P_T^{-1}b_T - T(b_T'P_T^{-1}b_T)P_T^{-1}d_T\big\}. \qquad\text{(B.45)}
\]
As $b_T = c_T$ in our case, after arrangement of terms,
\[
K_T^{-1}(T^2 c_T + T d_T) = \frac{1}{\Delta_T}\big\{T^2(1 + T c_T'P_T^{-1}d_T - d_T'P_T^{-1}d_T)P_T^{-1}c_T + T(1 + T c_T'P_T^{-1}d_T - T^2 c_T'P_T^{-1}c_T)P_T^{-1}d_T\big\}.
\]
The first part on the right hand side is of order $O_p(1)$ because $P_T^{-1}c_T = O_p(T^{-2})$ and $c_T'P_T^{-1}d_T = O_p(T^{-2})$. For the second part, note that
\[
T + T^2 c_T'P_T^{-1}d_T - T^3 c_T'P_T^{-1}c_T
= T\Big[1 + \frac{T}{1 + T^2 c_T'A_T^{-1}c_T}\,c_T'A_T^{-1}d_T - \frac{T^2}{1 + T^2 c_T'A_T^{-1}c_T}\,c_T'A_T^{-1}c_T\Big]
= \frac{T(1 + T c_T'A_T^{-1}d_T)}{1 + T^2 c_T'A_T^{-1}c_T} = O_p(1),
\]
so $K_T^{-1}(T^2 c_T + T d_T) = O_p(1)$. $\blacksquare$
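The closed-form inverse in (B.43) and the boundedness claim in (B.30) can be illustrated numerically. The sketch below assumes $b_T = c_T$, a fixed symmetric positive definite $A$, and fixed vectors $c$, $d$ (all randomly generated and purely illustrative); the function names are hypothetical. It builds $K_T = T^2 cc' + T(cd' + dc') + A$ directly, compares its numerical inverse with the formula, and evaluates $K_T^{-1}(T^2 c + T d)$ for growing $T$.

```python
import numpy as np

def k_inverse_closed_form(A, c, d, T):
    """Inverse of K_T = T^2 cc' + T(cd' + dc') + A via (B.43), taking b_T = c_T."""
    A_inv = np.linalg.inv(A)
    # Sherman-Morrison step for P_T = A + T^2 cc'.
    s = 1.0 + T**2 * (c @ A_inv @ c)
    P_inv = A_inv - (T**2 / s) * np.outer(A_inv @ c, A_inv @ c)
    g = P_inv @ c  # g_T = P_T^{-1} b_T with b_T = c_T
    h = P_inv @ d  # h_T = P_T^{-1} d_T
    cPc, cPd, dPd = c @ g, c @ h, d @ h
    delta = 1.0 + 2.0 * T * cPd - T**2 * (cPc * dPd - cPd**2)
    cross = np.outer(h, g) + np.outer(g, h)
    bracket = dPd * np.outer(g, g) + cPc * np.outer(h, h) - cPd * cross
    return P_inv - (T * cross - T**2 * bracket) / delta

def direct_K(A, c, d, T):
    return T**2 * np.outer(c, c) + T * (np.outer(c, d) + np.outer(d, c)) + A

def make_example(seed=0, m=4):
    """Hypothetical inputs: A positive definite, d small so that d'A^{-1}d < 1."""
    rng = np.random.default_rng(seed)
    M = rng.standard_normal((m, m))
    A = M @ M.T + m * np.eye(m)
    c = rng.standard_normal(m)
    d = 0.1 * rng.standard_normal(m)
    return A, c, d

if __name__ == "__main__":
    A, c, d = make_example()
    for T in (10, 100, 1000):
        K_inv = k_inverse_closed_form(A, c, d, T)
        err = np.max(np.abs(K_inv - np.linalg.inv(direct_K(A, c, d, T))))
        size = np.linalg.norm(K_inv @ (T**2 * c + T * d))
        print(T, err, size)
```

The printed norm should settle down as $T$ grows rather than diverge, which is what the $O_p(1)$ statement asserts in this deterministic toy setting.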
Proof for (B.31): As $H_{1,nT}$ and $H_{nT}$ have the form specified in (2.14), we need to prove that for $K_T = T^2 c_T c_T' + T(c_T d_T' + d_T c_T') + A_T$, we have $1 - c_T'K_T^{-1}(T^2 c_T + T d_T + T d_{2,nT} c_T + H^s_{2,nT}/\omega_{nT}) = O_p(T^{-1})$, where $d_{2,nT}$ and $H^s_{2,nT}/\omega_{nT}$ are defined in (2.14). As $c_T'K_T^{-1} = O_p(T^{-1})$ and $c_T'K_T^{-1}c_T = O_p(T^{-2})$, we need to show $1 - c_T'K_T^{-1}(T^2 c_T + T d_T) = O_p(T^{-1})$.
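From (B.43),
\[
\begin{aligned}
T^2 c_T'K_T^{-1}c_T &= T^2 c_T'P_T^{-1}c_T - \frac{T^3}{\Delta_T}\big(c_T'P_T^{-1}d_T\, b_T'P_T^{-1}c_T + c_T'P_T^{-1}b_T\, d_T'P_T^{-1}c_T\big) \\
&\quad + \frac{T^4}{\Delta_T}\big[(d_T'P_T^{-1}d_T)\,c_T'P_T^{-1}b_T\, b_T'P_T^{-1}c_T + (b_T'P_T^{-1}b_T)\,c_T'P_T^{-1}d_T\, d_T'P_T^{-1}c_T\big] \\
&\quad + \frac{T^4}{\Delta_T}\big[-(d_T'P_T^{-1}b_T)\,c_T'P_T^{-1}d_T\, b_T'P_T^{-1}c_T - (b_T'P_T^{-1}d_T)\,c_T'P_T^{-1}b_T\, d_T'P_T^{-1}c_T\big] \\
&= \frac{T^2 c_T'A_T^{-1}c_T}{1 + T^2 c_T'A_T^{-1}c_T} + \frac{1}{\Delta_T}\big(T^4 c_T'P_T^{-1}b_T\, b_T'P_T^{-1}c_T\big)(d_T'P_T^{-1}d_T) + O_p(T^{-1})
\end{aligned}
\]
by using (B.44) because $b_T = c_T$ for our case, and, similarly,
\[
T c_T'K_T^{-1}d_T = -\frac{1}{\Delta_T}\big(T^2 c_T'P_T^{-1}b_T\big)(d_T'P_T^{-1}d_T) + O_p(T^{-1}).
\]
Hence,
\[
\begin{aligned}
& T^2 c_T'K_T^{-1}c_T + T c_T'K_T^{-1}d_T \\
&\quad = \frac{T^2 c_T'A_T^{-1}c_T}{1 + T^2 c_T'A_T^{-1}c_T} + \frac{1}{\Delta_T}\big(T^4 c_T'P_T^{-1}b_T\, b_T'P_T^{-1}c_T\big)(d_T'P_T^{-1}d_T) - \frac{1}{\Delta_T}\big(T^2 c_T'P_T^{-1}b_T\big)(d_T'P_T^{-1}d_T) + O_p(T^{-1}) \\
&\quad = \frac{T^2 c_T'A_T^{-1}c_T}{1 + T^2 c_T'A_T^{-1}c_T} + \frac{d_T'P_T^{-1}d_T}{\Delta_T}\big(T^2 c_T'P_T^{-1}b_T\big)\big((T^2 c_T'P_T^{-1}b_T) - 1\big) + O_p(T^{-1}).
\end{aligned}
\]
As $b_T = c_T$, $(T^2 c_T'P_T^{-1}b_T) - 1 = -\frac{1}{1 + T^2 c_T'A_T^{-1}c_T} = O_p(T^{-2})$. Using the fact that $\Delta_T$ and $d_T'P_T^{-1}d_T$ are $O_p(1)$, we have $1 - c_T'K_T^{-1}(T^2 c_T + T d_T) = O_p(T^{-1})$. $\blacksquare$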
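Proof for (B.32):
\[
\begin{pmatrix} H_{1,nT} & H_{2,nT} \\ H_{2,nT}' & H_{3,nT} \end{pmatrix}
= \omega_{nT}\big(T^2\, c^{*}c^{*\prime} + T\, d_{nT}\, c^{*\prime} + T\, c^{*}\, d_{nT}' + H^s_{nT}/\omega_{nT}\big),
\]
where $c^{*} = (c', 1)'$ and $d_{nT} = (d_{1,nT}', d_{2,nT})'$. As we have already established that $\Delta > 0$, by using Proposition B.13, the inverse of $\begin{pmatrix} H_{1,nT} & H_{2,nT} \\ H_{2,nT}' & H_{3,nT} \end{pmatrix}$ exists and is $O_p(1)$. Using the formula for inverting a partitioned matrix, it follows that $(H_{3,nT} - H_{2,nT}'H_{1,nT}^{-1}H_{2,nT})^{-1}$ exists and is $O_p(1)$. $\blacksquare$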
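Proof for (B.33): To prove $\operatorname{plim}_{T\to\infty}(H_{3,nT} - H_{2,nT}'H_{1,nT}^{-1}H_{2,nT})^{-1} \neq 0$, we will make use of the matrix algebra result of Proposition B.13 we have developed. Denoting $H_{nT} = \begin{pmatrix} H_{1,nT} & H_{2,nT} \\ H_{2,nT}' & H_{3,nT} \end{pmatrix}$ and $L_H = \operatorname{plim}_{T\to\infty} H_{nT}^{-1}$, we are going to prove that $e'L_H e \neq 0$, where $e$ is a unit vector such that $e = (0, \cdots, 0, 1)'$. Here, $H_{nT}$ takes the form $H_{nT} = \omega_{nT}\big(T^2\, c^{*}c^{*\prime} + T\, d_{nT}\, c^{*\prime} + T\, c^{*}\, d_{nT}' + H^s_{nT}/\omega_{nT}\big)$. From Proposition B.13, for $K_T = T^2\, c_T c_T' + T\, d_{nT}\, c_T' + T\, c_T\, d_{nT}' + A_T$, the limit of $K_T^{-1}$ is
\[
L_k = \Big(A^{-1} - \frac{1}{c'A^{-1}c}\,A^{-1}cc'A^{-1}\Big) + \frac{1}{\Delta}\,A^{-1}\Big(d - \frac{d'A^{-1}c}{c'A^{-1}c}\,c\Big)\Big(d - \frac{d'A^{-1}c}{c'A^{-1}c}\,c\Big)'A^{-1},
\]
where $\Delta = 1 - \Big[d'A^{-1}d - \frac{(d'A^{-1}c)^2}{c'A^{-1}c}\Big]$. Then,
\[
e'L_k e = \Big(e'A^{-1}e - \frac{(e'A^{-1}c)^2}{c'A^{-1}c}\Big) + \frac{1}{\Delta}\Big(e'A^{-1}d - \frac{d'A^{-1}c}{c'A^{-1}c}\,e'A^{-1}c\Big)^2.
\]
The Cauchy inequality guarantees that $e'A^{-1}e - \frac{(e'A^{-1}c)^2}{c'A^{-1}c} > 0$, as $e$ and $c$ are not proportional. The second part of $e'L_k e$ will be nonnegative if $\Delta > 0$, where $\Delta = 1 - \Big[d'A^{-1}d - \frac{(d'A^{-1}c)^2}{c'A^{-1}c}\Big] = 1 - d'A^{-1}d\,\Big[1 - \frac{(d'A^{-1}c)^2}{c'A^{-1}c\; d'A^{-1}d}\Big]$, which is proved in the beginning of the proof for Proposition B.14. $\blacksquare$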
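As an illustration of the limit used in this proof, the sketch below compares the numerical inverse of $K_T = T^2 cc' + T(dc' + cd') + A$ at a large $T$ with the limiting matrix $L_k$, and evaluates $e'L_k e$ for $e = (0, \dots, 0, 1)'$. The matrix $A$ and the vectors $c$, $d$ are randomly generated stand-ins, not quantities from the paper, and the helper names are hypothetical.

```python
import numpy as np

def limit_Lk(A, c, d):
    """L_k = (A^{-1} - A^{-1}cc'A^{-1}/(c'A^{-1}c)) + (1/Delta) v v',
    with v = A^{-1}(d - (d'A^{-1}c / c'A^{-1}c) c)."""
    A_inv = np.linalg.inv(A)
    a = A_inv @ c
    cAc = c @ a
    dAc = d @ a
    dAd = d @ A_inv @ d
    delta = 1.0 - (dAd - dAc**2 / cAc)
    v = A_inv @ (d - (dAc / cAc) * c)
    return A_inv - np.outer(a, a) / cAc + np.outer(v, v) / delta

def direct_K(A, c, d, T):
    return T**2 * np.outer(c, c) + T * (np.outer(d, c) + np.outer(c, d)) + A

def make_example(seed=1, m=4):
    """Hypothetical inputs: A positive definite, d small so that d'A^{-1}d < 1."""
    rng = np.random.default_rng(seed)
    M = rng.standard_normal((m, m))
    A = M @ M.T + m * np.eye(m)
    c = rng.standard_normal(m)
    d = 0.1 * rng.standard_normal(m)
    return A, c, d

if __name__ == "__main__":
    A, c, d = make_example()
    Lk = limit_Lk(A, c, d)
    e = np.zeros(len(c)); e[-1] = 1.0
    for T in (1e2, 1e3, 1e4):
        gap = np.max(np.abs(np.linalg.inv(direct_K(A, c, d, T)) - Lk))
        print(T, gap)
    print("e'L_k e =", e @ Lk @ e, " |L_k c| =", np.linalg.norm(Lk @ c))
```

In this toy setting the gap shrinks roughly like $1/T$, $e'L_k e$ is strictly positive, and $L_k c$ is numerically zero, in line with $K_T^{-1}c_T = O(T^{-1})$.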
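Proof for (B.34): From (B.32) and (B.33), $H_{3,nT} - H_{2,nT}'H_{1,nT}^{-1}H_{2,nT}$ is $O_p(1)$ and is a function of $\omega_{nT}$ and $d_{nT}$; more explicitly, of $\omega_{nT}$ and $\omega_{nT}d_{nT}$ in (2.14). As we have $\omega_{nT} - E\omega_{nT} \overset{p}{\to} 0$ and $\omega_{nT}d_{nT} - E\omega_{nT}d_{nT} \overset{p}{\to} 0$ (by using Lemma B.10), where $\omega_{nT}$ and $d_{nT}$ are $O_p(1)$, we have the result. $\blacksquare$
Proof for (B.35): (B.33) states that $\operatorname{plim}_{T\to\infty}(H_{3,nT} - H_{2,nT}'H_{1,nT}^{-1}H_{2,nT})^{-1} \neq 0$. As $H_{3,nT} - H_{2,nT}'H_{1,nT}^{-1}H_{2,nT}$ is a scalar, $\operatorname{plim}_{T\to\infty}(H_{3,nT} - H_{2,nT}'H_{1,nT}^{-1}H_{2,nT})$ also exists. Similarly, $\operatorname{plim}_{T\to\infty}(EH_{3,nT} - EH_{2,nT}'(EH_{1,nT})^{-1}EH_{2,nT})$ also exists. Using (B.34), we have the result. $\blacksquare$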
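Proof for Proposition B.15: From (B.45), using $b_T = c_T$ for our case, we have
\[
T K_T^{-1}c_T = \frac{1}{\Delta_T}\big\{T P_T^{-1}c_T + (T^2 c_T'P_T^{-1}d_T)P_T^{-1}c_T - (T^2 c_T'P_T^{-1}c_T)P_T^{-1}d_T\big\}.
\]
As $\Delta_T = O_p(1)$ and is bounded away from zero, $c_T'P_T^{-1}d_T = O_p(T^{-2})$ and $P_T^{-1}c_T = O_p(T^{-2})$, we have $T K_T^{-1}c_T = -\frac{1}{\Delta_T}(T^2 c_T'P_T^{-1}c_T)P_T^{-1}d_T + O_p(T^{-1})$. Also, as $P_T^{-1} = A_T^{-1} - \frac{T^2}{1 + T^2 c_T'A_T^{-1}c_T}A_T^{-1}c_T c_T'A_T^{-1}$ and $T^2 c_T'P_T^{-1}c_T = \frac{T^2 c_T'A_T^{-1}c_T}{1 + T^2 c_T'A_T^{-1}c_T}$, we have
\[
T K_T^{-1}c_T = -\frac{1}{\Delta_T}\,\frac{T^2 c_T'A_T^{-1}c_T}{1 + T^2 c_T'A_T^{-1}c_T}\Big(A_T^{-1}d_T - \frac{T^2}{1 + T^2 c_T'A_T^{-1}c_T}A_T^{-1}c_T c_T'A_T^{-1}d_T\Big) + O_p(T^{-1}). \qquad\text{(B.46)}
\]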
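Also, for $K_T^{-1}$, using (B.43), it is $O_p(1)$ and is just a function of $A_T^{-1}$, $c_T$ and $d_T$. We are going to apply the above results to $\Sigma_{\theta,nT}$, where
\[
\Sigma_{\theta,nT} = \frac{1}{\sigma^2}\begin{pmatrix} EH_{nT}(\theta) & 0 \\ 0 & 0 \end{pmatrix}
+ \begin{pmatrix} 0 & 0 & 0 \\ 0 & \frac{1}{n}\big[\operatorname{tr}(G_n'(\lambda)G_n(\lambda)) + \operatorname{tr}(G_n^2(\lambda))\big] & \frac{1}{\sigma^2 n}\operatorname{tr}(G_n(\lambda)) \\ 0 & \frac{1}{\sigma^2 n}\operatorname{tr}(G_n(\lambda)) & \frac{1}{2\sigma^4} \end{pmatrix},
\]
and
\[
H_{nT}(\theta) = \frac{1}{nT}\sum_{t=1}^T (\tilde Z_{nt}, G_n(\lambda)\tilde Z_{nt}\delta)'(\tilde Z_{nt}, G_n(\lambda)\tilde Z_{nt}\delta). \qquad\text{(B.47)}
\]
As $\tilde{\tilde Y}^u_{n,t-1} = W_n\tilde{\tilde Y}^u_{n,t-1} = G_n\tilde Z^u_{nt}\delta_0$ (see Proposition B.4 in Appendix B), we have $\tilde Z^u_{nt} = \tilde{\tilde Y}^u_{n,t-1}\cdot c'$, where $c = (1, 1, 0_{1\times k_x})'$, and $G_n(\lambda)\tilde Z^u_{nt}\delta = \frac{\gamma + \rho}{1 - \lambda}\tilde{\tilde Y}^u_{n,t-1}$. Hence, we have $(\tilde Z^u_{nt}, G_n(\lambda)\tilde Z^u_{nt}\delta) = \tilde{\tilde Y}^u_{n,t-1}\cdot\big(c', \frac{\gamma + \rho}{1 - \lambda}\big)'$.
Therefore, denoting $c(\theta) = \big(c', \frac{\gamma + \rho}{1 - \lambda}, 0\big)'$, we can write $\Sigma_{\theta,nT}$ as
\[
\Sigma_{\theta,nT} = (E\omega_{nT})\Big(T^2\, c(\theta)c'(\theta) + T\,(Ed_{nT}(\theta))\, c'(\theta) + T\, c'(\theta)\,(Ed_{nT}(\theta))' + \Sigma^s_{\theta,nT}/E\omega_{nT}\Big),
\]
where $c(\theta) = \big(1, 1, 0_{1\times k_x}, \frac{\gamma + \rho}{1 - \lambda}, 0\big)'$, $d_{nT}(\theta) = \frac{1}{E\omega_{nT}}\Big(\frac{1}{nT^2}\sum_{t=1}^T (\tilde Z^s_{nt}, G_n(\lambda)\tilde Z^s_{nt}\delta, 0)'\,\tilde{\tilde Y}^u_{n,t-1}\Big)'$,
\[
\Sigma^s_{\theta,nT} = \frac{1}{\sigma^2}\begin{pmatrix} EH^s_{nT}(\theta) & 0 \\ 0 & 0 \end{pmatrix}
+ \begin{pmatrix} 0 & 0 & 0 \\ 0 & \frac{1}{n}\big[\operatorname{tr}(G_n'(\lambda)G_n(\lambda)) + \operatorname{tr}(G_n^2(\lambda))\big] & \frac{1}{\sigma^2 n}\operatorname{tr}(G_n(\lambda)) \\ 0 & \frac{1}{\sigma^2 n}\operatorname{tr}(G_n(\lambda)) & \frac{1}{2\sigma^4} \end{pmatrix}
\]
and $H^s_{nT}(\theta) = \frac{1}{nT}\sum_{t=1}^T (\tilde Z^s_{nt}, G_n(\lambda)\tilde Z^s_{nt}\delta)'(\tilde Z^s_{nt}, G_n(\lambda)\tilde Z^s_{nt}\delta)$.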
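For $c(\theta)$ evaluated at $\theta_0$ and $\hat\theta_{nT}$, we have $c(\theta_0) = \big(c', \frac{\gamma_0 + \rho_0}{1 - \lambda_0}, 0\big)' = (c', 1, 0)'$ (because $\gamma_0 + \rho_0 + \lambda_0 = 1$ under Assumption 6) and $c(\hat\theta_{nT}) = \big(c', \frac{\hat\gamma_{nT} + \hat\rho_{nT}}{1 - \hat\lambda_{nT}}, 0\big)'$. Also, $\frac{\hat\gamma_{nT} + \hat\rho_{nT}}{1 - \hat\lambda_{nT}} = 1 + O_p\big(\max\big(\frac{1}{\sqrt{nT^3}}, \frac{1}{T^2}\big)\big)$. This is so as follows. From Theorem 4.2, $\hat\gamma_{nT} + \hat\rho_{nT} + \hat\lambda_{nT} - 1 = O_p\big(\max\big(\frac{1}{\sqrt{nT^3}}, \frac{1}{T^2}\big)\big)$. As $\hat\lambda_{nT} - 1 \neq 0$ for large enough $T$ with probability close to one$^{12}$, $\frac{\hat\gamma_{nT} + \hat\rho_{nT}}{1 - \hat\lambda_{nT}} = 1 + \frac{1}{1 - \hat\lambda_{nT}}\,O_p\big(\max\big(\frac{1}{\sqrt{nT^3}}, \frac{1}{T^2}\big)\big)$. Hence, $\frac{\hat\gamma_{nT} + \hat\rho_{nT}}{1 - \hat\lambda_{nT}} = 1 + O_p\big(\max\big(\frac{1}{\sqrt{nT^3}}, \frac{1}{T^2}\big)\big)$.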
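Hence, for $\Sigma_{\theta_0,nT}$, we have
\[
\Sigma_{\theta_0,nT} = (E\omega_{nT})\Big(T^2\, c(\theta_0)c'(\theta_0) + T\, Ed_{nT}(\theta_0)\, c'(\theta_0) + T\, c'(\theta_0)\,(Ed_{nT}(\theta_0))' + \Sigma^s_{\theta_0,nT}/E\omega_{nT}\Big),
\]
where $c(\theta_0) = (c^{*\prime}, 0)'$, $d_{nT}(\theta_0) = \frac{1}{E\omega_{nT}}\Big(\frac{1}{nT^2}\sum_{t=1}^T (\tilde Z^s_{nt}, G_n\tilde Z^s_{nt}\delta_0, 0)'\,\tilde{\tilde Y}^u_{n,t-1}\Big)'$ and
\[
\Sigma^s_{\theta_0,nT} = \frac{1}{\sigma_0^2}\begin{pmatrix} EH^s_{nT} & 0 \\ 0 & 0 \end{pmatrix}
+ \begin{pmatrix} 0 & 0 & 0 \\ 0 & \frac{1}{n}\big[\operatorname{tr}(G_n'G_n) + \operatorname{tr}(G_n^2)\big] & \frac{1}{\sigma_0^2 n}\operatorname{tr}(G_n) \\ 0 & \frac{1}{\sigma_0^2 n}\operatorname{tr}(G_n) & \frac{1}{2\sigma_0^4} \end{pmatrix}.
\]
For $\hat\Sigma_{\hat\theta_{nT},nT}$ (see (B.36)), we have
\[
\hat\Sigma_{\hat\theta_{nT},nT} = \omega_{nT}\Big(T^2\, c(\hat\theta_{nT})c'(\hat\theta_{nT}) + T\, d_{nT}(\hat\theta_{nT})\, c'(\hat\theta_{nT}) + T\, c'(\hat\theta_{nT})\, d_{nT}'(\hat\theta_{nT}) + \hat\Sigma^s_{\hat\theta_{nT},nT}/\omega_{nT}\Big), \qquad\text{(B.48)}
\]
where $c(\hat\theta_{nT}) = \big(1, 1, 0, \frac{\hat\gamma_{nT} + \hat\rho_{nT}}{1 - \hat\lambda_{nT}}, 0\big)'$, $d_{nT}(\hat\theta_{nT}) = \frac{1}{\omega_{nT}}\Big(\frac{1}{nT^2}\sum_{t=1}^T (\tilde Z^s_{nt}, G_n(\hat\lambda_{nT})\tilde Z^s_{nt}\hat\delta_{nT}, 0)'\,\tilde{\tilde Y}^u_{n,t-1}\Big)'$ and
\[
\hat\Sigma^s_{\hat\theta_{nT},nT} = \frac{1}{\hat\sigma^2_{nT}}\begin{pmatrix} H^s_{nT}(\hat\theta_{nT}) & 0 \\ 0 & 0 \end{pmatrix}
+ \begin{pmatrix} 0 & 0 & 0 \\ 0 & \frac{1}{n}\big[\operatorname{tr}(G_n'(\hat\lambda_{nT})G_n(\hat\lambda_{nT})) + \operatorname{tr}(G_n^2(\hat\lambda_{nT}))\big] & \frac{1}{\hat\sigma^2_{nT} n}\operatorname{tr}(G_n(\hat\lambda_{nT})) \\ 0 & \frac{1}{\hat\sigma^2_{nT} n}\operatorname{tr}(G_n(\hat\lambda_{nT})) & \frac{1}{2\hat\sigma^4_{nT}} \end{pmatrix}.
\]
As $\frac{\hat\gamma_{nT} + \hat\rho_{nT}}{1 - \hat\lambda_{nT}} = 1 + O_p\big(\max\big(\frac{1}{\sqrt{nT^3}}, \frac{1}{T^2}\big)\big)$, we have $c(\hat\theta_{nT}) - c(\theta_0) = \big(0, 0, 0, O_p\big(\max\big(\frac{1}{\sqrt{nT^3}}, \frac{1}{T^2}\big)\big), 0\big)'$. For $d_{nT}(\hat\theta_{nT}) - Ed_{nT}(\theta_0) = [d_{nT}(\hat\theta_{nT}) - d_{nT}(\theta_0)] + [d_{nT}(\theta_0) - Ed_{nT}(\theta_0)]$, the term $d_{nT}(\hat\theta_{nT}) - d_{nT}(\theta_0)$ is $O_p\big(\max\big(\frac{1}{\sqrt{nT}}, \frac{1}{T}\big)\big)$, as $\hat\theta_{nT} - \theta_0 = O_p\big(\max\big(\frac{1}{\sqrt{nT}}, \frac{1}{T}\big)\big)$ from (4.6); also, $d_{nT}(\theta_0) - Ed_{nT}(\theta_0) = O_p\big(\frac{1}{\sqrt{nT}}\big)$ using (B.21) in Lemma B.10. Hence, $d_{nT}(\hat\theta_{nT}) - Ed_{nT}(\theta_0) = O_p\big(\max\big(\frac{1}{\sqrt{nT}}, \frac{1}{T}\big)\big)$. From Equations (C.9) and (C.10) in Yu, de Jong and Lee (2006), $(\hat\Sigma^s_{\hat\theta_{nT},nT})^{-1} - (\Sigma^s_{\theta_0,nT})^{-1} = O_p\big(\max\big(\frac{1}{\sqrt{nT}}, \frac{1}{T}\big)\big)$. Also, from (B.8) and (B.9) in Lemma B.8, $\omega_{nT} - E\omega_{nT} = O_p\big(\max\big(\frac{1}{\sqrt{nT}}, \frac{1}{T}\big)\big)$.
$^{12}$ From (3.28), $\hat\lambda_{nT} - \lambda_0 = O_p\big(\max\big(\frac{1}{\sqrt{nT}}, \frac{1}{T}\big)\big)$; this implies that $1 - \hat\lambda_{nT} = 1 - \lambda_0 + O_p\big(\max\big(\frac{1}{\sqrt{nT}}, \frac{1}{T}\big)\big)$. As $\lambda_0 \neq 1$ under Assumption 5 (if $\lambda_0 = 1$, $S_n(\lambda_0) = I_n - W_n$ would not be invertible because $W_n$ is row normalized under Assumption 1), $1 - \hat\lambda_{nT} \neq 0$ for large enough $T$ with probability close to one.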
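To prove (B.37): From (B.43), $\hat\Sigma^{-1}_{\hat\theta_{nT},nT}$ is $O_p(1)$ and is a simple function of $\omega_{nT}$, $c(\hat\theta_{nT})$, $d_{nT}(\hat\theta_{nT})$ and $(\hat\Sigma^s_{\hat\theta_{nT},nT})^{-1}$. Similarly, $\Sigma^{-1}_{\theta_0,nT}$ is $O(1)$ and is a simple function of $E\omega_{nT}$, $c(\theta_0)$, $Ed_{nT}(\theta_0)$ and $(\Sigma^s_{\theta_0,nT})^{-1}$. As $\omega_{nT} - E\omega_{nT}$, $c(\hat\theta_{nT}) - c(\theta_0)$, $d_{nT}(\hat\theta_{nT}) - Ed_{nT}(\theta_0)$ and $(\hat\Sigma^s_{\hat\theta_{nT},nT})^{-1} - (\Sigma^s_{\theta_0,nT})^{-1}$ are all of order at most $O_p\big(\max\big(\frac{1}{\sqrt{nT}}, \frac{1}{T}\big)\big)$, the elements of $\hat\Sigma^{-1}_{\hat\theta_{nT},nT} - \Sigma^{-1}_{\theta_0,nT}$ are of order $O_p\big(\max\big(\frac{1}{\sqrt{nT}}, \frac{1}{T}\big)\big)$.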
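To prove (B.38): We are going to show first that $T\cdot\big[\hat\Sigma^{-1}_{\hat\theta_{nT},nT}c(\hat\theta_{nT}) - \Sigma^{-1}_{\theta_0,nT}c(\theta_0)\big] = O_p\big(\max\big(\frac{1}{\sqrt{nT}}, \frac{1}{T}\big)\big)$. From (B.46), $T\cdot\hat\Sigma^{-1}_{\hat\theta_{nT},nT}\cdot c(\hat\theta_{nT})$ is a simple function of $\omega_{nT}$, $c(\hat\theta_{nT})$, $d_{nT}(\hat\theta_{nT})$ and $(\hat\Sigma^s_{\hat\theta_{nT},nT})^{-1}$, and $T\cdot\Sigma^{-1}_{\theta_0,nT}\cdot c(\theta_0)$ is a simple function of $E\omega_{nT}$, $c(\theta_0)$, $Ed_{nT}(\theta_0)$ and $(\Sigma^s_{\theta_0,nT})^{-1}$. As $\omega_{nT} - E\omega_{nT}$, $c(\hat\theta_{nT}) - c(\theta_0)$, $d_{nT}(\hat\theta_{nT}) - Ed_{nT}(\theta_0)$ and $(\hat\Sigma^s_{\hat\theta_{nT},nT})^{-1} - (\Sigma^s_{\theta_0,nT})^{-1}$ are all of order at most $O_p\big(\max\big(\frac{1}{\sqrt{nT}}, \frac{1}{T}\big)\big)$, the elements of $T\cdot\big[\hat\Sigma^{-1}_{\hat\theta_{nT},nT}c(\hat\theta_{nT}) - \Sigma^{-1}_{\theta_0,nT}c(\theta_0)\big]$ are of order $O_p\big(\max\big(\frac{1}{\sqrt{nT}}, \frac{1}{T}\big)\big)$. As
\[
T\cdot\big[\hat\Sigma^{-1}_{\hat\theta_{nT},nT} - \Sigma^{-1}_{\theta_0,nT}\big]\cdot c(\theta_0) = T\cdot\hat\Sigma^{-1}_{\hat\theta_{nT},nT}\big(c(\theta_0) - c(\hat\theta_{nT})\big) + T\cdot\big[\hat\Sigma^{-1}_{\hat\theta_{nT},nT}c(\hat\theta_{nT}) - \Sigma^{-1}_{\theta_0,nT}c(\theta_0)\big],
\]
(B.38) follows because $T\cdot(c(\hat\theta_{nT}) - c(\theta_0)) = O_p\big(\max\big(\frac{1}{\sqrt{nT}}, \frac{1}{T}\big)\big)$ and $\hat\Sigma^{-1}_{\hat\theta_{nT},nT}$ is $O_p(1)$. $\blacksquare$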
B.4 Proof for Proposition 2.2
We have already proved the central limit theorem for statistics like QsnT for stationary case (see Theorem
2.4 in Yu, de Jong and Lee (2006)). As QnT = QsnT +QunT speci�ed in (2.15), we can prove that QnT still
behaves like QsnT . Rewrite QnT as
QnT =
TXt=1
�Un;t�1 +
kTT(RnJnR
�1n )�n;t�1
�0� Vnt
+TXt=1
�Dnt +
kTT(RnJnR
�1n )
�cn0~t�1 + ~Xn;t�1�0
��0� Vnt
+TXt=1
�V 0ntBnVnt � �20trBn
�,
then QnT has just another form of QsnT so that the central limit theorem in Yu, de Jong and Lee (2006) is
applicable. To con�rm this, we need to show that
(1) For Wnt = Unt+ kTT (RnJnR
�1n )�n;t =
P1h=1Qnt;hVn;t+1�h,
P1h=1 abs(Qnt;h) is row sum and column
sum bounded uniformly in n and t;
(2) For Dnt = Dnt+ kTT (RnJnR
�1n )
�cn0~t�1 + ~Xn;t�1�0
�, elements of Dnt is bounded uniformly in n and
t.
For (1), as �nt =Pt�1
h=0 Vnh, we have Qnt;h =
8<: Pnt;h +kTT (RnJnR
�1n ) for h � t� 1
Pnt;h for h > t� 1. Hence,
1Xh=1
abs(Qnt;h) =t�1Xh=1
abs(Pnt;h +kTT(RnJnR
�1n )) +
1Xh=t
abs(Pnt;h).
This implies that kP1
h=1 abs(Qnt;h)k � kP1
h=1 abs(Pnt;h)k + Pt�1
h=1 abs(kTT (RnJnR
�1n ))
, where k�k rep-resents either the row or column sum norm. Here,
P1h=1 abs(Pnt;h) is row sum and column sum bounded
uniformly in n and t. Also, Pt�1
h=1 abs(kTT (RnJnR
�1n ))
is row sum and column sum bounded uniformly in
n and t because RnJnR�1n is row sum and column sum bounded and KT is O(1).
For (2), Dnt = Dnt + kTT (RnJnR
�1n )
�cn0~t�1 + ~Xn;t�1�0
�where ~Xn;t =
Pt�1h=0
~Xnt. As elements of cn0,
Xnt and Dnt are uniformly bounded in n and t and RnJnR�1n is row sum and column sum bounded, usingtT � 1 for t = 1; 2; :::; T , elements of Dnt are uniformly bounded. �
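B.5 Proof about $\hat\sigma^2_{nT}(\lambda)$ and $\hat\sigma^2_{nT}(\lambda_0)$
Using (2.9), we have
\[
\begin{aligned}
\hat\sigma^2_{nT}(\lambda) &= (\lambda_0 - \lambda)^2(H_{3,nT} - H_{2,nT}'H_{1,nT}^{-1}H_{2,nT}) + \frac{1}{nT}\sum_{t=1}^T \tilde V_{nt}'S_n'^{-1}S_n'(\lambda)S_n(\lambda)S_n^{-1}\tilde V_{nt} \qquad\text{(B.49)} \\
&\quad + 2(\lambda_0 - \lambda)\frac{1}{nT}\sum_{t=1}^T (\delta_0'\tilde Z_{nt}'G_n' - H_{2,nT}'H_{1,nT}^{-1}\tilde Z_{nt}')S_n(\lambda)S_n^{-1}\tilde V_{nt} \\
&\quad - \Big(\frac{1}{nT}\sum_{t=1}^T \tilde Z_{nt}'S_n(\lambda)S_n^{-1}\tilde V_{nt}\Big)'H_{1,nT}^{-1}\Big(\frac{1}{nT}\sum_{t=1}^T \tilde Z_{nt}'S_n(\lambda)S_n^{-1}\tilde V_{nt}\Big),
\end{aligned}
\]
\[
\begin{aligned}
\frac{\partial\hat\sigma^2_{nT}(\lambda)}{\partial\lambda} &= 2(\lambda - \lambda_0)(H_{3,nT} - H_{2,nT}'H_{1,nT}^{-1}H_{2,nT}) - \frac{2}{nT}\sum_{t=1}^T \tilde V_{nt}'G_n'S_n(\lambda)S_n^{-1}\tilde V_{nt} \qquad\text{(B.50)} \\
&\quad - 2(\lambda_0 - \lambda)\frac{1}{nT}\sum_{t=1}^T (\delta_0'\tilde Z_{nt}'G_n' - H_{2,nT}'H_{1,nT}^{-1}\tilde Z_{nt}')G_n\tilde V_{nt} \\
&\quad - \frac{2}{nT}\sum_{t=1}^T (\delta_0'\tilde Z_{nt}'G_n' - H_{2,nT}'H_{1,nT}^{-1}\tilde Z_{nt}')S_n(\lambda)S_n^{-1}\tilde V_{nt} \\
&\quad + 2\Big(\frac{1}{nT}\sum_{t=1}^T \tilde Z_{nt}'S_n(\lambda)S_n^{-1}\tilde V_{nt}\Big)'H_{1,nT}^{-1}\Big(\frac{1}{nT}\sum_{t=1}^T \tilde Z_{nt}'G_n\tilde V_{nt}\Big),
\end{aligned}
\]
\[
\begin{aligned}
\frac{\partial^2\hat\sigma^2_{nT}(\lambda)}{\partial\lambda^2} &= 2(H_{3,nT} - H_{2,nT}'H_{1,nT}^{-1}H_{2,nT}) + \frac{2}{nT}\sum_{t=1}^T \tilde V_{nt}'G_n'G_n\tilde V_{nt} \qquad\text{(B.51)} \\
&\quad + \frac{4}{nT}\sum_{t=1}^T (\delta_0'\tilde Z_{nt}'G_n' - H_{2,nT}'H_{1,nT}^{-1}\tilde Z_{nt}')G_n\tilde V_{nt} \\
&\quad - 2\Big(\frac{1}{nT}\sum_{t=1}^T \tilde Z_{nt}'G_n\tilde V_{nt}\Big)'H_{1,nT}^{-1}\Big(\frac{1}{nT}\sum_{t=1}^T \tilde Z_{nt}'G_n\tilde V_{nt}\Big).
\end{aligned}
\]
To study $\hat\sigma^2_{nT}(\lambda)$, $\frac{\partial\hat\sigma^2_{nT}(\lambda)}{\partial\lambda}$ and $\frac{\partial^2\hat\sigma^2_{nT}(\lambda)}{\partial\lambda^2}$, we need the following (B.52):
\[
\frac{1}{nT}\sum_{t=1}^T \tilde Z^{u\prime}_{nt}B_n\tilde V_{nt} = \Big(\frac{1}{nT}\sum_{t=1}^T \tilde Y^{u\prime}_{n,t-1}B_n\tilde V_{nt}\Big)\cdot c, \qquad\text{(B.52a)}
\]
\[
G_n\tilde Z^u_{nt}\delta_0 - \tilde Z^u_{nt}H_{1,nT}^{-1}H_{2,nT} = \tilde Y^{u\prime}_{n,t-1}\cdot(1 - c'H_{1,nT}^{-1}H_{2,nT}). \qquad\text{(B.52b)}
\]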
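For $\hat\sigma^2_{nT}(\lambda)$ in (B.49), using (B.32) in Proposition B.14, $(\lambda_0 - \lambda)^2(H_{3,nT} - H_{2,nT}'H_{1,nT}^{-1}H_{2,nT})$ is $|\lambda - \lambda_0|\cdot O_p(1)$. Using (B.19) and $S_n(\lambda)S_n^{-1} = I_n - (\lambda - \lambda_0)G_n$, $\frac{1}{nT}\sum_{t=1}^T \tilde V_{nt}'S_n'^{-1}S_n'(\lambda)S_n(\lambda)S_n^{-1}\tilde V_{nt} = \sigma_0^2 + |\lambda - \lambda_0|\cdot O_p(1) + O_p\big(\max\big(\frac{1}{\sqrt{nT}}, \frac{1}{T}\big)\big)$. The term $2(\lambda_0 - \lambda)\frac{1}{nT}\sum_{t=1}^T (\delta_0'\tilde Z_{nt}'G_n' - H_{2,nT}'H_{1,nT}^{-1}\tilde Z_{nt}')S_n(\lambda)S_n^{-1}\tilde V_{nt}$ can be rewritten as the sum of a stationary component and a nonstationary component. The stationary component $2(\lambda_0 - \lambda)\frac{1}{nT}\sum_{t=1}^T (\delta_0'\tilde Z^{s\prime}_{nt}G_n' - H_{2,nT}'H_{1,nT}^{-1}\tilde Z^{s\prime}_{nt})S_n(\lambda)S_n^{-1}\tilde V_{nt}$ is, using (B.24), $|\lambda - \lambda_0|\cdot O_p\big(\max\big(\frac{1}{\sqrt{nT}}, \frac{1}{T}\big)\big)$. The nonstationary component $2(\lambda_0 - \lambda)\frac{1}{nT}\sum_{t=1}^T (\delta_0'\tilde Z^{u\prime}_{nt}G_n' - H_{2,nT}'H_{1,nT}^{-1}\tilde Z^{u\prime}_{nt})S_n(\lambda)S_n^{-1}\tilde V_{nt}$ is equal to
\[
2(\lambda_0 - \lambda)(1 - c'H_{1,nT}^{-1}H_{2,nT})\frac{1}{nT}\sum_{t=1}^T \tilde Y^{u\prime}_{n,t-1}S_n(\lambda)S_n^{-1}\tilde V_{nt}
\]
using (B.52). Then, using (B.23) and $(1 - c'H_{1,nT}^{-1}H_{2,nT}) = O_p(T^{-1})$ from (B.31), the nonstationary component is $|\lambda - \lambda_0|\cdot O_p\big(\max\big(\frac{1}{\sqrt{nT}}, \frac{1}{T}\big)\big)$. For the last term of $\hat\sigma^2_{nT}(\lambda)$ in (B.49), we have $\frac{1}{nT}\sum_{t=1}^T \tilde Z_{nt}'S_n(\lambda)S_n^{-1}\tilde V_{nt} = \frac{1}{nT}\sum_{t=1}^T \tilde Z^{u\prime}_{nt}S_n(\lambda)S_n^{-1}\tilde V_{nt} + \frac{1}{nT}\sum_{t=1}^T \tilde Z^{s\prime}_{nt}S_n(\lambda)S_n^{-1}\tilde V_{nt}$. As $\frac{1}{nT}\sum_{t=1}^T \tilde Z^{s\prime}_{nt}S_n(\lambda)S_n^{-1}\tilde V_{nt} = O_p\big(\max\big(\frac{1}{\sqrt{nT}}, \frac{1}{T}\big)\big)$ from (B.24) and $\frac{1}{nT}\sum_{t=1}^T \tilde Z^{u\prime}_{nt}S_n(\lambda)S_n^{-1}\tilde V_{nt} = c\cdot O_p\big(\max\big(1, \sqrt{\frac{T}{n}}\big)\big)$ from (B.52) and (B.23), and using that $H_{1,nT}^{-1}$ is $O_p(1)$, $H_{1,nT}^{-1}c$ is $O_p(T^{-1})$ and $c'H_{1,nT}^{-1}c$ is $O_p(T^{-2})$ from Proposition B.14, we have
\[
\Big(\frac{1}{nT}\sum_{t=1}^T \tilde Z_{nt}'S_n(\lambda)S_n^{-1}\tilde V_{nt}\Big)'H_{1,nT}^{-1}\Big(\frac{1}{nT}\sum_{t=1}^T \tilde Z_{nt}'S_n(\lambda)S_n^{-1}\tilde V_{nt}\Big) = O_p\Big(\max\Big(\frac{1}{nT}, \frac{1}{\sqrt{nT^3}}, \frac{1}{T^2}\Big)\Big). \qquad\text{(B.53)}
\]
Hence, $\hat\sigma^2_{nT}(\lambda) = \sigma_0^2 + |\lambda - \lambda_0|\cdot O_p(1) + O_p\big(\max\big(\frac{1}{\sqrt{nT}}, \frac{1}{T}\big)\big)$. Also,
\[
\hat\sigma^2_{nT}(\lambda) = (\lambda - \lambda_0)^2(H_{3,nT} - H_{2,nT}'H_{1,nT}^{-1}H_{2,nT}) + \sigma_0^2\frac{1}{n}\operatorname{tr}(S_n'^{-1}S_n'(\lambda)S_n(\lambda)S_n^{-1}) + O_p\Big(\max\Big(\frac{1}{T}, \frac{1}{\sqrt{nT}}\Big)\Big). \qquad\text{(B.54)}
\]
Similarly, we can get the behavior of $\frac{\partial\hat\sigma^2_{nT}(\lambda)}{\partial\lambda}$ and $\frac{\partial^2\hat\sigma^2_{nT}(\lambda)}{\partial\lambda^2}$ by using Lemma B.11, (B.52) and Proposition B.14. The results are summarized as follows.
\[
\hat\sigma^2_{nT}(\lambda) = \sigma_0^2 + |\lambda - \lambda_0|\cdot O_p(1) + O_p\Big(\max\Big(\frac{1}{\sqrt{nT}}, \frac{1}{T}\Big)\Big), \qquad\text{(B.55a)}
\]
\[
\frac{\partial\hat\sigma^2_{nT}(\lambda)}{\partial\lambda} = -\sigma_0^2\frac{2}{n}\operatorname{tr}G_n + |\lambda - \lambda_0|\cdot O_p(1) + O_p\Big(\max\Big(\frac{1}{\sqrt{nT}}, \frac{1}{T}\Big)\Big), \qquad\text{(B.55b)}
\]
\[
\frac{\partial^2\hat\sigma^2_{nT}(\lambda)}{\partial\lambda^2} = 2(H_{3,nT} - H_{2,nT}'H_{1,nT}^{-1}H_{2,nT}) + 2\sigma_0^2\frac{1}{n}\operatorname{tr}G_n'G_n + O_p\Big(\max\Big(\frac{1}{\sqrt{nT}}, \frac{1}{T}\Big)\Big). \qquad\text{(B.55c)}
\]
Furthermore, at $\lambda = \lambda_0$, we have, from (B.50),
\[
\sqrt{nT}\,\frac{\partial\hat\sigma^2_{nT}(\lambda_0)}{\partial\lambda} = -\frac{2}{\sqrt{nT}}\sum_{t=1}^T \tilde V_{nt}'G_n'\tilde V_{nt} - \frac{2}{\sqrt{nT}}\sum_{t=1}^T (\delta_0'\tilde Z_{nt}'G_n' - H_{2,nT}'H_{1,nT}^{-1}\tilde Z_{nt}')\tilde V_{nt}
+ 2\Big(\frac{1}{\sqrt{nT}}\sum_{t=1}^T \tilde Z_{nt}'\tilde V_{nt}\Big)'H_{1,nT}^{-1}\Big(\frac{1}{nT}\sum_{t=1}^T \tilde Z_{nt}'G_n\tilde V_{nt}\Big).
\]
Using (B.53), we have
\[
\sqrt{nT}\,\frac{\partial\hat\sigma^2_{nT}(\lambda_0)}{\partial\lambda} = -\frac{2}{\sqrt{nT}}\sum_{t=1}^T \tilde V_{nt}'G_n'\tilde V_{nt} - \frac{2}{\sqrt{nT}}\sum_{t=1}^T (\delta_0'\tilde Z_{nt}'G_n' - H_{2,nT}'H_{1,nT}^{-1}\tilde Z_{nt}')\tilde V_{nt} + O_p\Big(\max\Big(\frac{1}{\sqrt{nT}}, \frac{1}{T}, \sqrt{\frac{n}{T^3}}\Big)\Big). \qquad\text{(B.56)}
\]
$\blacksquare$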
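B.6 Proof about $\sigma^{*2}_{nT}(\lambda)$ and $\sigma^{*2}_{nT}(\lambda_0)$
From (3.18), we have
\[
\begin{aligned}
\sigma^{*2}_{nT}(\lambda) &= (\lambda_0 - \lambda)^2(EH_{3,nT} - EH_{2,nT}'(EH_{1,nT})^{-1}EH_{2,nT}) + \frac{1}{nT}E\sum_{t=1}^T \tilde V_{nt}'S_n'^{-1}S_n'(\lambda)S_n(\lambda)S_n^{-1}\tilde V_{nt} \\
&\quad + 2(\lambda_0 - \lambda)\frac{1}{nT}E\sum_{t=1}^T (\delta_0'\tilde Z_{nt}'G_n' - EH_{2,nT}'(EH_{1,nT})^{-1}\tilde Z_{nt}')S_n(\lambda)S_n^{-1}\tilde V_{nt} \\
&\quad - \Big(\frac{1}{nT}E\sum_{t=1}^T \tilde Z_{nt}'S_n(\lambda)S_n^{-1}\tilde V_{nt}\Big)'(EH_{1,nT})^{-1}\Big(E\frac{1}{nT}\sum_{t=1}^T \tilde Z_{nt}'S_n(\lambda)S_n^{-1}\tilde V_{nt}\Big), \qquad\text{(B.57)}
\end{aligned}
\]
\[
\begin{aligned}
\frac{\partial\sigma^{*2}_{nT}(\lambda)}{\partial\lambda} &= 2(\lambda - \lambda_0)(EH_{3,nT} - EH_{2,nT}'(EH_{1,nT})^{-1}EH_{2,nT}) - \frac{2}{nT}E\sum_{t=1}^T \tilde V_{nt}'G_n'S_n(\lambda)S_n^{-1}\tilde V_{nt} \\
&\quad - 2(\lambda_0 - \lambda)\frac{1}{nT}E\sum_{t=1}^T (\delta_0'\tilde Z_{nt}'G_n' - EH_{2,nT}'(EH_{1,nT})^{-1}\tilde Z_{nt}')G_n\tilde V_{nt} \\
&\quad - \frac{2}{nT}E\sum_{t=1}^T (\delta_0'\tilde Z_{nt}'G_n' - EH_{2,nT}'(EH_{1,nT})^{-1}\tilde Z_{nt}')S_n(\lambda)S_n^{-1}\tilde V_{nt} \\
&\quad + 2\Big(E\frac{1}{nT}\sum_{t=1}^T \tilde Z_{nt}'S_n(\lambda)S_n^{-1}\tilde V_{nt}\Big)'(EH_{1,nT})^{-1}\Big(E\frac{1}{nT}\sum_{t=1}^T \tilde Z_{nt}'G_n\tilde V_{nt}\Big), \qquad\text{(B.58)}
\end{aligned}
\]
\[
\begin{aligned}
\frac{\partial^2\sigma^{*2}_{nT}(\lambda)}{\partial\lambda^2} &= 2(EH_{3,nT} - EH_{2,nT}'(EH_{1,nT})^{-1}EH_{2,nT}) + \frac{2}{nT}E\sum_{t=1}^T \tilde V_{nt}'G_n'G_n\tilde V_{nt} \\
&\quad + \frac{4}{nT}E\sum_{t=1}^T (\delta_0'\tilde Z_{nt}'G_n' - EH_{2,nT}'(EH_{1,nT})^{-1}\tilde Z_{nt}')G_n\tilde V_{nt} \\
&\quad - 2\Big(\frac{1}{nT}E\sum_{t=1}^T \tilde Z_{nt}'G_n\tilde V_{nt}\Big)'(EH_{1,nT})^{-1}\Big(\frac{1}{nT}E\sum_{t=1}^T \tilde Z_{nt}'G_n\tilde V_{nt}\Big). \qquad\text{(B.59)}
\end{aligned}
\]
Using Lemma B.11, (B.52) and Proposition B.14, in the same way as we derived (B.55), we have
\[
\sigma^{*2}_{nT}(\lambda) = \sigma_0^2 + |\lambda - \lambda_0|\cdot O_p(1) + O\Big(\frac{1}{T}\Big), \qquad\text{(B.60a)}
\]
\[
\frac{\partial\sigma^{*2}_{nT}(\lambda)}{\partial\lambda} = -\sigma_0^2\frac{2}{n}\operatorname{tr}G_n + |\lambda - \lambda_0|\cdot O_p(1) + O\Big(\frac{1}{T}\Big), \qquad\text{(B.60b)}
\]
\[
\frac{\partial^2\sigma^{*2}_{nT}(\lambda)}{\partial\lambda^2} = 2(EH_{3,nT} - EH_{2,nT}'(EH_{1,nT})^{-1}EH_{2,nT}) + 2\sigma_0^2\frac{1}{n}\operatorname{tr}G_n'G_n + O\Big(\frac{1}{T}\Big). \qquad\text{(B.60c)}
\]
Furthermore,
\[
\sigma^{*2}_{nT}(\lambda) = (\lambda - \lambda_0)^2(EH_{3,nT} - EH_{2,nT}'(EH_{1,nT})^{-1}EH_{2,nT}) + \sigma_0^2\frac{1}{n}\operatorname{tr}(S_n'^{-1}S_n'(\lambda)S_n(\lambda)S_n^{-1}) + O\Big(\frac{1}{T}\Big). \qquad\text{(B.61)}
\]
$\blacksquare$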
C Proof for Theorems
C.1 Proof of Claim 3.1
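To prove $\frac{1}{nT}\ln L_{n,T}(\lambda) - Q_{n,T}(\lambda) \overset{p}{\to} 0$ uniformly in $\lambda$ in any compact parameter space $\Lambda$:
As $\ln L_{n,T}(\lambda) = -\frac{nT}{2}(\ln 2\pi + 1) - \frac{nT}{2}\ln\hat\sigma^2_{nT}(\lambda) + T\ln|S_n(\lambda)|$ and $Q_{n,T}(\lambda) = -\frac{1}{2}(\ln 2\pi + 1) - \frac{1}{2}\ln\sigma^{*2}_{nT}(\lambda) + \frac{1}{n}\ln|S_n(\lambda)|$ ((2.10) and (3.19)), $\frac{1}{nT}\ln L_{n,T}(\lambda) - Q_{n,T}(\lambda) = \frac{1}{2}\ln\sigma^{*2}_{nT}(\lambda) - \frac{1}{2}\ln\hat\sigma^2_{nT}(\lambda)$. By the mean value theorem, $\frac{1}{nT}\ln L_{n,T}(\lambda) - Q_{n,T}(\lambda) = -\frac{1}{2}\frac{1}{\tilde\sigma^2_{nT}(\lambda)}(\hat\sigma^2_{nT}(\lambda) - \sigma^{*2}_{nT}(\lambda))$, where $\tilde\sigma^2_{nT}(\lambda)$ lies between $\hat\sigma^2_{nT}(\lambda)$ and $\sigma^{*2}_{nT}(\lambda)$. We need to show that (1) $\hat\sigma^2_{nT}(\lambda) - \sigma^{*2}_{nT}(\lambda) \overset{p}{\to} 0$ uniformly in $\lambda$ and (2) $\tilde\sigma^2_{nT}(\lambda)$ is bounded away from zero uniformly in $\lambda$ with probability one.
To prove (1): We have $\hat\sigma^2_{nT}(\lambda)$ and $\sigma^{*2}_{nT}(\lambda)$ in (B.54) and (B.61). Using (B.34) in Proposition B.14, $\hat\sigma^2_{nT}(\lambda) - \sigma^{*2}_{nT}(\lambda) \overset{p}{\to} 0$ uniformly in $\lambda$.
To prove (2): As $\tilde\sigma^2_{nT}(\lambda)$ lies between $\hat\sigma^2_{nT}(\lambda)$ and $\sigma^{*2}_{nT}(\lambda)$, we have $\frac{1}{\tilde\sigma^2_{nT}(\lambda)} \le \max\big\{\frac{1}{\hat\sigma^2_{nT}(\lambda)}, \frac{1}{\sigma^{*2}_{nT}(\lambda)}\big\}$. Denote $\sigma^2_n(\lambda) = \sigma_0^2\frac{1}{n}\operatorname{tr}(S_n'^{-1}S_n'(\lambda)S_n(\lambda)S_n^{-1})$; then $\sigma^2_n(\lambda)$ is uniformly bounded away from zero$^{13}$. As $H_{3,nT} - H_{2,nT}'H_{1,nT}^{-1}H_{2,nT}$ is nonnegative$^{14}$, $\hat\sigma^2_{nT}(\lambda)$ and $\sigma^{*2}_{nT}(\lambda)$ are uniformly bounded away from zero. So, $\frac{1}{\tilde\sigma^2_{nT}(\lambda)}$ is uniformly bounded.
Combining (1) $\hat\sigma^2_{nT}(\lambda) - \sigma^{*2}_{nT}(\lambda) \overset{p}{\to} 0$ uniformly in $\lambda$ and (2) $\frac{1}{\tilde\sigma^2_{nT}(\lambda)}$ is $O_p(1)$ uniformly in $\lambda$, we have $\frac{1}{nT}\ln L_{n,T}(\lambda) - Q_{n,T}(\lambda) \overset{p}{\to} 0$ uniformly in $\lambda$.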
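To prove $Q_{n,T}(\lambda)$ is uniformly equicontinuous in $\lambda$ in any compact parameter space $\Lambda$:
To prove this property, from the expression of $Q_{n,T}(\lambda)$, the following are sufficient: (1) $\frac{1}{n}\ln|S_n(\lambda)|$ is uniformly equicontinuous; (2) $(\lambda - \lambda_0)^2(EH_{3,nT} - EH_{2,nT}'(EH_{1,nT})^{-1}EH_{2,nT})$ is uniformly equicontinuous; (3) $\sigma^2_n(\lambda)$ is uniformly equicontinuous.
For (1), $\frac{1}{n}\ln|S_n(\lambda_2)| - \frac{1}{n}\ln|S_n(\lambda_1)| = -\frac{1}{n}\operatorname{tr}\big(W_nS_n^{-1}(\bar\lambda)\big)(\lambda_2 - \lambda_1)$, where $\bar\lambda$ lies between $\lambda_2$ and $\lambda_1$. As $S_n^{-1}(\lambda)$ is uniformly bounded in row and column sums, uniformly in $\lambda \in \Lambda$, $\frac{1}{n}\operatorname{tr}\big(W_nS_n^{-1}(\bar\lambda)\big)$ is bounded, so $\frac{1}{n}\ln|S_n(\lambda)|$ is uniformly equicontinuous. For (2), because $\Lambda$ is bounded and because $EH_{3,nT} - EH_{2,nT}'(EH_{1,nT})^{-1}EH_{2,nT}$ is $O(1)$ according to Proposition B.14, the result follows. For (3), $\sigma^2_n(\lambda_2) - \sigma^2_n(\lambda_1) = \frac{\sigma_0^2}{n}\operatorname{tr}(S_n'^{-1}S_n'(\lambda_2)S_n(\lambda_2)S_n^{-1}) - \frac{\sigma_0^2}{n}\operatorname{tr}(S_n'^{-1}S_n'(\lambda_1)S_n(\lambda_1)S_n^{-1})$. Using $S_n(\lambda)S_n^{-1} = I_n - (\lambda - \lambda_0)G_n$,
\[
\sigma^2_n(\lambda_2) - \sigma^2_n(\lambda_1) = \sigma_0^2\Big[(\lambda_2 - \lambda_1)(\lambda_2 + \lambda_1 - 2\lambda_0)\frac{\operatorname{tr}G_n'G_n}{n} - (\lambda_2 - \lambda_1)\frac{\operatorname{tr}(G_n' + G_n)}{n}\Big].
\]
As elements of $G_n'G_n$ and $G_n$ are uniformly bounded, $\sigma^2_n(\lambda)$ is uniformly equicontinuous. $\blacksquare$
$^{13}$ See the supplement to Lee (2004), page 8, for the proof of consistency, available at http://economics.sbs.ohio-state.edu/lee/.
$^{14}$ Here, $H_{3,nT} - H_{2,nT}'H_{1,nT}^{-1}H_{2,nT} \ge 0$ because of the Cauchy inequality.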
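The two algebraic facts used for step (3), the identity $S_n(\lambda)S_n^{-1} = I_n - (\lambda - \lambda_0)G_n$ and the closed form for $\sigma^2_n(\lambda_2) - \sigma^2_n(\lambda_1)$, can be verified on a small example. The sketch below assumes a randomly generated row-normalized weights matrix and arbitrary values of $\lambda_0$, $\lambda_1$, $\lambda_2$; these inputs and the function names are illustrative only.

```python
import numpy as np

def row_normalized_W(n, seed=0):
    """Hypothetical spatial weights: nonnegative, zero diagonal, rows sum to one."""
    rng = np.random.default_rng(seed)
    W = rng.random((n, n))
    np.fill_diagonal(W, 0.0)
    return W / W.sum(axis=1, keepdims=True)

def S(W, lam):
    return np.eye(W.shape[0]) - lam * W

def sigma2_n(W, lam, lam0, sigma0_sq=1.0):
    """sigma_n^2(lam) = sigma0^2 * tr(S^{-1}' S(lam)' S(lam) S^{-1}) / n, with S = S(lam0)."""
    n = W.shape[0]
    R = S(W, lam) @ np.linalg.inv(S(W, lam0))
    return sigma0_sq * np.trace(R.T @ R) / n

def sigma2_difference_formula(W, lam1, lam2, lam0, sigma0_sq=1.0):
    """Closed form for sigma_n^2(lam2) - sigma_n^2(lam1) in terms of G = W S^{-1}."""
    n = W.shape[0]
    G = W @ np.linalg.inv(S(W, lam0))
    quad = (lam2 - lam1) * (lam2 + lam1 - 2.0 * lam0) * np.trace(G.T @ G) / n
    lin = (lam2 - lam1) * np.trace(G.T + G) / n
    return sigma0_sq * (quad - lin)

if __name__ == "__main__":
    W = row_normalized_W(6)
    lam0, lam1, lam2 = 0.3, 0.1, 0.5
    direct = sigma2_n(W, lam2, lam0) - sigma2_n(W, lam1, lam0)
    print(direct, sigma2_difference_formula(W, lam1, lam2, lam0))
```

Because the difference is a polynomial in $(\lambda_1, \lambda_2)$ with coefficients given by bounded traces, equicontinuity on a bounded set follows directly from this form.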
C.2 Proof of Nonsingularity of Information Matrix (Scalar)
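From (3.21), $\frac{\partial^2 Q_{n,T}(\lambda_0)}{\partial\lambda^2} = -\frac{1}{\sigma_0^2}(EH_{3,nT} - EH_{2,nT}'(EH_{1,nT})^{-1}EH_{2,nT}) - \frac{1}{n}\Big(\operatorname{tr}G_n'G_n + \operatorname{tr}G_n^2 - \frac{2(\operatorname{tr}G_n)^2}{n}\Big) + O_p(T^{-1})$. Then, using (B.35) in Proposition B.14,
\[
\lim_{T\to\infty}\frac{\partial^2 Q_{n,T}(\lambda_0)}{\partial\lambda^2} = -\frac{1}{\sigma_0^2}\operatorname{plim}_{T\to\infty}(H_{3,nT} - H_{2,nT}'H_{1,nT}^{-1}H_{2,nT}) - \lim_{n\to\infty}\frac{1}{n}\Big[\operatorname{tr}G_n'G_n + \operatorname{tr}G_n^2 - \frac{2(\operatorname{tr}G_n)^2}{n}\Big].
\]
If $\operatorname{plim}_{T\to\infty}(H_{3,nT} - H_{2,nT}'H_{1,nT}^{-1}H_{2,nT}) \neq 0$ or $\lim_{n\to\infty}\frac{1}{n}\big(\operatorname{tr}G_n'G_n + \operatorname{tr}G_n^2 - \frac{2(\operatorname{tr}G_n)^2}{n}\big) \neq 0$, then $\lim_{T\to\infty} -\frac{\partial^2 Q_{n,T}(\lambda_0)}{\partial\lambda^2}$ is positive. Here, $(H_{3,nT} - H_{2,nT}'H_{1,nT}^{-1}H_{2,nT}) > 0$ for large enough $T$ because of the Cauchy inequality; also, denoting $C_n = G_n - \frac{\operatorname{tr}G_n}{n}I_n$, we have $\frac{1}{n}\big\{\operatorname{tr}G_n'G_n + \operatorname{tr}G_n^2 - \frac{2(\operatorname{tr}G_n)^2}{n}\big\} = \frac{1}{2n}\operatorname{tr}(C_n + C_n')(C_n + C_n')' \ge 0$. $\blacksquare$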
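The nonnegativity of the trace term can be confirmed numerically. The sketch below assumes an arbitrary square matrix in place of $G_n$ (randomly generated, purely illustrative) and compares the two sides. Expanding $\operatorname{tr}(C_n + C_n')(C_n + C_n')' = 2\operatorname{tr}C_n^2 + 2\operatorname{tr}C_n'C_n$ gives a factor $\frac{1}{2n}$ on the right-hand side, which is the normalization used here; the function name is hypothetical.

```python
import numpy as np

def info_trace_terms(G):
    """Return (lhs, rhs): lhs = (tr G'G + tr G^2 - 2 (tr G)^2 / n) / n,
    rhs = tr[(C + C')(C + C')'] / (2n) with C = G - (tr G / n) I."""
    n = G.shape[0]
    lhs = (np.trace(G.T @ G) + np.trace(G @ G) - 2.0 * np.trace(G) ** 2 / n) / n
    C = G - (np.trace(G) / n) * np.eye(n)
    sym = C + C.T
    rhs = np.trace(sym @ sym.T) / (2.0 * n)
    return lhs, rhs

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for n in (3, 8, 20):
        G = rng.standard_normal((n, n))  # stand-in for G_n
        print(n, info_trace_terms(G))
```

Since the right-hand side is a scaled squared Frobenius norm, it is zero only when $C_n + C_n' = 0$.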
C.3 Proof of Theorem 3.2
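We have
\[
Q_{n,T}(\lambda) = -\frac{1}{2}(\ln 2\pi + 1) - \frac{1}{2}\ln\sigma^{*2}_{nT}(\lambda) + \frac{1}{n}\ln|S_n(\lambda)| \qquad\text{(C.1)}
\]
where $\sigma^{*2}_{nT}(\lambda) = (\lambda_0 - \lambda)^2(EH_{3,nT} - EH_{2,nT}'(EH_{1,nT})^{-1}EH_{2,nT}) + \frac{1}{n}\sigma_0^2\operatorname{tr}(S_n'^{-1}S_n'(\lambda)S_n(\lambda)S_n^{-1}) + O(\frac{1}{T})$ and the $O(\frac{1}{T})$ term is uniform in $\lambda$. At $\lambda = \lambda_0$, $Q_{n,T}(\lambda_0) = -\frac{1}{2}(\ln 2\pi + 1) - \frac{1}{2}\ln\sigma^{*2}_{nT}(\lambda_0) + \frac{1}{n}\ln|S_n(\lambda_0)|$. We are going to prove that $\lim_{T\to\infty}Q_{n,T}(\lambda) < \lim_{T\to\infty}Q_{n,T}(\lambda_0)$ for any $\lambda \neq \lambda_0$.
\[
\begin{aligned}
Q_{n,T}(\lambda) - Q_{n,T}(\lambda_0) &= -\frac{1}{2}\big[\ln\sigma^{*2}_{nT}(\lambda) - \ln\sigma^{*2}_{nT}(\lambda_0)\big] + \frac{1}{n}\ln|S_n(\lambda)| - \frac{1}{n}\ln|S_n(\lambda_0)| \\
&= T_{1,nT} - T_{2,nT} + O\Big(\frac{1}{T}\Big)
\end{aligned}
\]
where
\[
T_{1,nT} = -\frac{1}{2}\Big[\ln\Big\{\frac{1}{n}\sigma_0^2\operatorname{tr}(S_n'^{-1}S_n'(\lambda)S_n(\lambda)S_n^{-1})\Big\} - \ln\sigma^{*2}_{nT}(\lambda_0)\Big] + \frac{1}{n}\ln|S_n(\lambda)| - \frac{1}{n}\ln|S_n(\lambda_0)|,
\]
\[
T_{2,nT} = \frac{1}{2}\ln\Big(1 + \frac{(\lambda_0 - \lambda)^2(EH_{3,nT} - EH_{2,nT}'(EH_{1,nT})^{-1}EH_{2,nT})}{\sigma_0^2\operatorname{tr}(S_n'^{-1}S_n'(\lambda)S_n(\lambda)S_n^{-1})/n}\Big).
\]
Consider the pure spatial dynamic panel process $Y_{nt} = \lambda_0W_nY_{nt} + c_{n0} + V_{nt}$. The log likelihood function of this process is
\[
\ln L_{p,n,T}(\theta) = -\frac{nT}{2}\ln 2\pi - \frac{nT}{2}\ln\sigma^2 + T\ln|S_n(\lambda)| - \frac{1}{2\sigma^2}\sum_{t=1}^T (S_n(\lambda)Y_{nt} - c_{n0})'(S_n(\lambda)Y_{nt} - c_{n0}), \qquad\text{(C.2)}
\]
and the concentrated likelihood is
\[
\ln L_{p,n,T}(\lambda) = -\frac{nT}{2}(\ln 2\pi + 1) - \frac{nT}{2}\ln\hat\sigma^2_{p,nT}(\lambda) + T\ln|S_n(\lambda)|, \qquad\text{(C.3)}
\]
where $\hat c_{p,nT}(\lambda) = \frac{1}{T}\sum_{t=1}^T S_n(\lambda)Y_{nt}$ and $\hat\sigma^2_{p,nT}(\lambda) = \frac{1}{nT}\sum_{t=1}^T (S_n(\lambda)\tilde Y_{nt})'S_n(\lambda)\tilde Y_{nt}$. Then, $E\ln L_{p,n,T}(\lambda) - E\ln L_{p,n,T}(\lambda_0)$ would be equal to $T_{1,nT}$. By the information inequality, $E\ln L_{p,n,T}(\lambda) - E\ln L_{p,n,T}(\lambda_0) \le 0$. Thus, $T_{1,nT} \le 0$ for any $\lambda$. Also, $\lim_{T\to\infty}T_{2,nT} > 0$ as long as $\lim_{T\to\infty}(EH_{3,nT} - EH_{2,nT}'(EH_{1,nT})^{-1}EH_{2,nT}) \neq 0$. Under Assumption 9, $\lim_{T\to\infty}(EH_{3,nT} - EH_{2,nT}'(EH_{1,nT})^{-1}EH_{2,nT}) \neq 0$ from Proposition B.14. This proves the global identification. The consistency then follows from the global identification, uniform convergence and uniform equicontinuity in Claim 3.1. $\blacksquare$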
C.4 Proof of Theorem 3.3
As ~Znt has stationary and nonstationary parts (see (2.12)), we can decompose 1pnT
@ lnLn;T (�0)@� from
(3.22) into two parts accordingly:
1pnT
@ lnLn;T (�0)
@�=
1pnT
@ lnLsn;T (�0)
@�+
1pnT
@ lnLun;T (�0)
@�+Op
�max
�1pnT;1
T;
rn
T 3
��, (C.4)
where 1pnT
@ lnLsn;T (�0)
@� is the stationary part and 1pnT
@ lnLun;T (�0)
@� is the nonstationary part as de�ned via
(C.5)-(C.9). The 1pnT
@ lnLsn;T (�0)
@� has two parts, namely, 1pnT
@ lnLsn;T (�0)
@� = 1pnT
@ lnLs�n;T (�0)
@� ���;nT where
1pnT
@ lnLs�n;T (�0)
@�=
1
�2nT (�0)
1pnT
TXt=1
V 0nt(G0n �
1
ntrGn � In)Vnt +
1pnT
TXt=1
(�00�Zs�0nt G
0n �H0
2;nTH�11;nT
�Zs�0nt )Vnt
!(C.5)
and
��0;nT =1
�2n;T (�0)
rT
n�V 0nT (G
0n �
1
ntrGn � In) �VnT
!(C.6)
+1
�2n;T (�0)
rT
n(�00( �U
snT;�1;Wn
�UsnT;�1;0n�kx)0G0n �H0
2;nTH�11;nT (
�UsnT;�1;Wn�UsnT;�1;0n�kx)
0) �VnT
!,
Here, �Zs�nt is the component of ~Zsnt, which is uncorrelated with Vnt such that
~Zsnt = �Zs�nt � ( �UsnT;�1; Wn�UsnT;�1; 0n�kx) (C.7)
and �Zs�nt = (( ~~X sn;t�1�0 + U
sn;t�1); (Wn
~~X sn;t�1�0 + WnU
sn;t�1);
~Xnt) with~~X sn;t�1 = X s
n;t�1 � �X snT;�1. For
1pnT
@ lnLun;T (�0)
@� , it also has two parts 1pnT
@ lnLun;T (�0)
@� = 1pnT
@ lnLu�n;T (�0)
@� � N�0;nT where
1pnT
@ lnLu�n;T (�0)
@�=
1
�2n;T (�0)
((1� c0H�1
1;nTH2;nT ) �1pnT
TXt=1
�Y u�0n;t�1Vnt
)(C.8)
with �Y u�n;t�1 =1
(1��0)Mn
�cn0~t�1 +
�Xn;t�1�0 + �n;t�1
�, ~t�1 = (t� 1)� T�1
2 and
N�0;nT =T (1� c0H�1
1;nTH2;nT )
�2n;T (�0)(1� �0)
�rn
T
1
n(Mn
��n;T�1)0 � �VnT
�. (C.9)
For 1pnT
@ lnL�n;T (�0)
@� = 1pnT
@ lnLs�n;T (�0)
@� + 1pnT
@ lnLu�n;T (�0)
@� from (C.5) and (C.8), denote
�Z�nt = �Zs�nt + �Zu�nt , (C.10)
we have
1pnT
@ lnL�n;T (�0)
@�=
1
�2nT (�0)
1pnT
TXt=1
V 0nt(G0n �
1
ntrGn � In)Vnt +
1pnT
TXt=1
(�00 �Z�0ntG
0n �H0
2;nTH�11;nT
�Z�0nt)Vnt
!.
Proposition 2.2 implies that it will be asymptotically normally distributed because 1pnT
@ lnLs�n;T (�0)
@� from
(C.5) and 1pnT
@ lnLu�n;T (�0)
@� from (C.8) are counterparts of QsnT and QunT in Proposition 2.2. To calculate the
limit variance, using uncorrelatedness of �Z�nt and Vnt, we have
Cov
1
�20
1pnT
TXt=1
(V 0nt(G0n � �20
1
ntrGn)Vnt);
1
�20
1pnT
TXt=1
(�0 �Z�0ntG
0n �H0
2;nTH�11;nT
�Z�0nt)Vnt
!
=�3�40
1
nT
nXi=1
(Gn;ii �1
ntrGn)
E
TXt=1
(Gn �Z�nt�0 � �Z�ntH�1
1;nTH2;nT )
!i
= 0
because ETPt=1Gn �Z
�nt�0 = 0 and E
TPt=1
�Z�nt = 0. Hence,
1pnT
@ lnLs�n;T (�0)
@�+
1pnT
@ lnLu�n;T (�0)
@�
d! N(0;��0 +�0) (C.11)
where
��0 =1
�20limT!1
(H3;nT �H02;nTH�1
1;nTH2;nT ) + limn!1
1
n(trG0nGn + trG
2n �
2(trGn)2
n), (C.12)
�0 =�4 � 3�40�40
limn!1
nXi=1
G2n;ii. (C.13)
For ��0;nT , using the results in Yu, de Jong and Lee (2006) (Theorem A.11, page 19), we have ��0;nT ��2nT (�0)
�20=p
nT a
s�0;nT
+Op
�max
�pnT 3 ;
1pT
��where
as�0;nT =1
ntr�Gn 0 � (H�1
1;nTH2;nT )1In
��X1
h=0Bhn
�S�1n
+1
ntr�Gn�0 � (H�1
1;nTH2;nT )2In
��X1
h=0WnB
hn
�S�1n
is O(1). As �2nT (�0) = �20 + Op
�max
�1pnT; 1T
��from (3.17) so that �20
�2nT (�0)= 1 + Op
�max
�1pnT; 1T
��,
��0;nT =p
nT a
s�0;nT
+ Op
�max
�pnT 3 ;
1pT
��. Also, using (B.17) and (C.9), we have N�0;nT �
�2nT (�0)
�20=p
nT �
mn
n � au�0;nT + Op�max
�pnT 3 ;
1pT
��where au�0;nT = T � (1 � c0H�1
1;nTH2;nT )1
2(1��0) . As�20
�2nT (�0)=
1 +Op
�max
�1pnT; 1T
��, N�0;nT =
pnT �
mn
n � au�0;nT +Op�max
�pnT 3 ;
1pT
��. Hence,
��0;nT + N�0;nT =rn
T� (as�0;nT +
mn
n� au�0;nT ) +Op
�max
�rn
T 3;1pT
��. � (C.14)
C.5 Proof of Claim 3.4
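First, by the mean value theorem, $\operatorname{tr}(G_n^2(\lambda)) = \operatorname{tr}(G_n^2) + 2\operatorname{tr}(G_n^3(\bar\lambda))(\lambda - \lambda_0)$, where $\bar\lambda$ lies between $\lambda$ and $\lambda_0$. So, $\frac{1}{n}\operatorname{tr}(G_n^2(\lambda)) = \frac{1}{n}\operatorname{tr}(G_n^2) + |\lambda - \lambda_0|\cdot O(1)$, as $\frac{1}{n}\operatorname{tr}(G_n^3(\bar\lambda))$ is uniformly bounded (see Lee (2001), Lemma A.8 on page 22). Second, using (3.16), we can express $\frac{1}{nT}\frac{\partial^2\ln L_{n,T}(\lambda)}{\partial\lambda^2}$ in terms of $\hat\sigma^2_{nT}(\lambda)$, $\frac{\partial\hat\sigma^2_{nT}(\lambda)}{\partial\lambda}$ and $\frac{\partial^2\hat\sigma^2_{nT}(\lambda)}{\partial\lambda^2}$. Then, using (3.17), we have the result that $\frac{1}{nT}\frac{\partial^2\ln L_{n,T}(\lambda)}{\partial\lambda^2} - \frac{1}{nT}\frac{\partial^2\ln L_{n,T}(\lambda_0)}{\partial\lambda^2} = |\lambda - \lambda_0|\cdot O_p(1) + O_p\big(\max\big(\frac{1}{\sqrt{nT}}, \frac{1}{T}\big)\big)$. $\blacksquare$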
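The mean value step above relies on $\frac{d}{d\lambda}\operatorname{tr}(G_n^2(\lambda)) = 2\operatorname{tr}(G_n^3(\lambda))$, which follows from $\frac{d}{d\lambda}G_n(\lambda) = G_n^2(\lambda)$ for $G_n(\lambda) = W_nS_n^{-1}(\lambda)$. A minimal finite-difference check, assuming a randomly generated row-normalized $W_n$ (illustrative only, with hypothetical function names):

```python
import numpy as np

def row_normalized_W(n, seed=0):
    """Hypothetical spatial weights: nonnegative, zero diagonal, rows sum to one."""
    rng = np.random.default_rng(seed)
    W = rng.random((n, n))
    np.fill_diagonal(W, 0.0)
    return W / W.sum(axis=1, keepdims=True)

def G(W, lam):
    """G_n(lam) = W_n (I_n - lam W_n)^{-1}."""
    return W @ np.linalg.inv(np.eye(W.shape[0]) - lam * W)

def tr_G2(W, lam):
    Gl = G(W, lam)
    return np.trace(Gl @ Gl)

def d_tr_G2(W, lam):
    """Analytic derivative of tr(G^2(lam)): 2 tr(G^3(lam))."""
    Gl = G(W, lam)
    return 2.0 * np.trace(Gl @ Gl @ Gl)

if __name__ == "__main__":
    W = row_normalized_W(7)
    lam, h = 0.4, 1e-5
    finite_diff = (tr_G2(W, lam + h) - tr_G2(W, lam - h)) / (2.0 * h)
    print(finite_diff, d_tr_G2(W, lam))
```

The central difference and the analytic expression should agree up to discretization and rounding error.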
C.6 Proof of Claim 3.5
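Using (3.16), we can express $\frac{1}{nT}\frac{\partial^2\ln L_{n,T}(\lambda_0)}{\partial\lambda^2}$ in terms of $\hat\sigma^2_{nT}(\lambda_0)$, $\frac{\partial\hat\sigma^2_{nT}(\lambda_0)}{\partial\lambda}$ and $\frac{\partial^2\hat\sigma^2_{nT}(\lambda_0)}{\partial\lambda^2}$ where, using (3.17),
\[
\hat\sigma^2_{nT}(\lambda_0) = \sigma_0^2 + O_p\Big(\max\Big(\frac{1}{\sqrt{nT}}, \frac{1}{T}\Big)\Big), \qquad
\frac{\partial\hat\sigma^2_{nT}(\lambda_0)}{\partial\lambda} = -2\frac{1}{n}\sigma_0^2\operatorname{tr}G_n + O_p\Big(\max\Big(\frac{1}{\sqrt{nT}}, \frac{1}{T}\Big)\Big),
\]
\[
\frac{\partial^2\hat\sigma^2_{nT}(\lambda_0)}{\partial\lambda^2} = 2(H_{3,nT} - H_{2,nT}'H_{1,nT}^{-1}H_{2,nT}) + 2\sigma_0^2\frac{1}{n}\operatorname{tr}G_n'G_n + O_p\Big(\max\Big(\frac{1}{\sqrt{nT}}, \frac{1}{T}\Big)\Big).
\]
Similarly, we can express $\frac{\partial^2 Q_{n,T}(\lambda_0)}{\partial\lambda^2}$ in terms of $\sigma^{*2}_{nT}(\lambda_0)$, $\frac{\partial\sigma^{*2}_{nT}(\lambda_0)}{\partial\lambda}$ and $\frac{\partial^2\sigma^{*2}_{nT}(\lambda_0)}{\partial\lambda^2}$ via (3.20) where, using (B.60),
\[
\sigma^{*2}_{nT}(\lambda_0) = \sigma_0^2 + O\Big(\frac{1}{T}\Big), \qquad
\frac{\partial\sigma^{*2}_{nT}(\lambda_0)}{\partial\lambda} = -2\sigma_0^2\frac{1}{n}\operatorname{tr}G_n + O\Big(\frac{1}{T}\Big),
\]
\[
\frac{\partial^2\sigma^{*2}_{nT}(\lambda_0)}{\partial\lambda^2} = 2(EH_{3,nT} - EH_{2,nT}'(EH_{1,nT})^{-1}EH_{2,nT}) + 2\sigma_0^2\frac{1}{n}\operatorname{tr}G_n'G_n + O\Big(\frac{1}{T}\Big).
\]
Hence, we have the result $\frac{1}{nT}\frac{\partial^2\ln L_{n,T}(\lambda_0)}{\partial\lambda^2} - \frac{\partial^2 Q_{n,T}(\lambda_0)}{\partial\lambda^2} = O_p\big(\max\big(\frac{1}{\sqrt{nT}}, \frac{1}{T}\big)\big)$. $\blacksquare$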
C.7 Proof of Theorem 3.6
(3.28) follows from the Taylor expansion (�nT � �0) = (�@2 lnLn;T (��nT )
@�2)�1
@ lnLn;T (�0)@� where ��nT lies
between �0 and �nT . Note that because
� 1
nT
@2 lnLn;T (��nT )
@�2=
�� 1
nT
@2 lnLn;T (��nT )
@�2��� 1
nT
@2 lnLn;T (�0)
@�2
��+
�� 1
nT
@2 lnLn;T (�0)
@�2� ��0;nT
�+��0;nT
where ��0;nT � �@2Qn;T (�0)
@�2, we have �@2 lnLn;T (��nT )
@�2= ��0;nT +
����nT � �0��� �Op(1) +Op �max� 1pnT; 1T
��according to Claim C.5 and C.6. Because
����nT � �0��� = op(1) as �nT is consistent and ��0;nT is positive inthe limit from Appendix C.2, we have �@2 lnLn;T (��nT )
@�2is invertible for large T and
�� 1nT
@2 lnLn;T (��nT )@�@�0
��1is Op(1).
According to the Taylor expansion,pnT (�nT��0) =
�� 1nT
@2 lnLn;T (��nT )
@�2
��1��
1pnT
@ lnL�n;T (�0)
@� ���0;nT � N�0;nT�
where 1pnT
@ lnL�n;T (�0)
@�
d! N(0;��0 + �0) from (C.11) and ��0;nT + N�0;nT =p
nT � (a
s�0;nT
+ mn
n �au�0;nT ) + Op
�max
�pnT 3 ;
1pT
��with as�0;nT +
mn
n � au�0;nT = O(1) from (C.14). Then,pnT (�nT � �0) =
Op(1) ��Op(1) +O
�pnT
��, which implies that �nT � �0 = Op
�max
�1pnT; 1T
��. Hence,
pnT (�nT � �0) =
�� 1
nT
@2 lnLn;T (��nT )
@�2
��1��
1pnT
@ lnL�n;T (�0)
@����0;nT � N�0;nT
�(C.17)
=
���0;nT +Op
�max
�1pnT;1
T
����1��
1pnT
@ lnL�n;T (�0)
@����0;nT � N�0;nT
�using Claim C.6. Using the fact that�
��0;nT +Op
�max
�1pnT;1
T
����1= ��1�0;nT +Op
�max
�1pnT;1
T
��(C.18)
given that ��0;nT is positive in the limit, we have
pnT (�nT � �0) =
���1�0;nT +Op
�max
�1pnT;1
T
�����
1pnT
@ lnL�n;T (�0)
@����0;nT � N�0;nT
�= ��1�0;nT �
1pnT
@ lnL�n;T (�0)
@�+Op
�max
�1pnT;1
T
��� 1pnT
@ lnL�n;T (�0)
@�
���1�0;nT � (��0;nT + N�0;nT )�Op�max
�1pnT;1
T
��� (��0;nT + N�0;nT ),
which implies that
pnT (�nT � �0) + ��1�0;nT � (��0;nT + N�0;nT ) +Op
�max
�1pnT;1
T
��� (��0;nT + N�0;nT )
= (��1�0;nT + op(1)) �1pnT
@ lnL�n;T (�0)
@�. (C.19)
As ��0 = limT!1��0;nT exists, then using Theorem 3.3 and that ��0;nT + N�0;nT =p
nT � (a
s�0;nT
+
mn
n � au�0;nT ) + Op�max
�pnT 3 ;
1pT
��, we have
pnT (�nT � �0) +
pnT b�0;nT + Op
�max
�pnT 3 ;
1pT
��d!
N(0;��1�0 +��2�0�0). The results in (3.30)-(3.32) are immediate consequences of (3.28). �
C.8 Proof for (4.1)15
From concentrated estimators ((2.9)), �nT (�) = �0�(���0)H�11;nTH2;nT+H�1
1;nT
�1nT
TPt=1
~Z 0ntSn(�)S�1n~Vnt
�.
Using Sn(�)S�1n = In � (�� �0)Gn,
pnT��n;T (�nT )� �0
�= �
pnT (�nT � �0)
H�11;nTH2;nT +H�1
1;nT
1
nT
TXt=1
~Z 0ntGn ~Vnt
!(C.20)
+H�11;nT
1pnT
TXt=1
~Z 0nt ~Vnt
!
= �pnT (�nT � �0)H�1
1;nTH2;nT +H�11;nT
1pnT
TXt=1
~Z 0nt ~Vnt
!+R�nT
where R�nT = �pnT (�nT � �0)H�1
1;nT1nT
TPt=1
~Z 0ntGn ~Vnt.
For the termH�11;nT
1nT
TPt=1
~Z 0ntGn ~Vnt = H�11;nT
1nT
TPt=1
~Zs0ntGn ~Vnt+H�11;nT
1nT
TPt=1
~Zu0ntGn ~Vnt, we have1nT
TPt=1
~Zs0ntGn ~Vnt =
Op
�max
�1T ;
1pnT
��from Theorem A.7 in Yu, de Jong and Lee (2006) and H�1
1;nT1nT
TPt=1
~Zu0ntGn ~Vnt =
H�11;nT c � 1
nT
TPt=1
~~Y u0n;t�1Gn~Vnt = Op
�max
�1T ;
1pnT
��because 1
nT
TPt=1
~~Y u0n;t�1Gn~Vnt = Op
�max
�1;q
Tn
��from Lemma B.11 and H�1
1;nT � c = Op(T�1). Then, because �nT � �0 = Op
�max
�1pnT; 1T
��from
15Note that the derivation of (4.1) is built up from the estimates of various components of �nT in (2.9). The reason is that the
conventional mean value theorem can not be directly applied to the @2 lnLnT (�)@�@�0 at �0 for analysis due to technical complication.
(3.28), H�11;nT � c = Op(T
�1) and c0H�11;nT � c = Op(T
�2), we have R�nT = Op
�max
�1T ;
1pnT;p
nT 3
��and
c0 �R�nT = Op�max
�1T 2 ;
1pnT 3
;p
nT 5
��.
Also, from (B.49) and �nT � �0 = Op�max
�1pnT; 1T
��,
pnT��2n;T (�nT )� �20
�=
1pnT
TXt=1
~V 0nt ~Vnt � n�20
!� 2�20
1
ntrGn
�pnT (�nT � �0)
�+R�2nT ,(C.21a)
R�2nT = Op
�max
�1
T;1pnT;
rn
T 3
��. (C.21b)
From the Taylor expansion,pnT��nT � �0
�=��@2 lnLn;T (��nT )
@�2
��11pnT
@ lnLn;T (�0)@� where ��nT lies between
�0 and �nT . From Claim 3.4, Claim 3.5 and (C.18),��@2 lnLn;T (��nT )
@�2
��1= ��1�0 + Op
�max
�1pnT; 1T
��.
Using Theorem 3.3,
pnT��nT � �0
�= ��1�0
1pnT
@ lnLn;T (�0)
@�+R�nT and R�nT = Op
�max
�1
T;1pnT;
rn
T 3
��. (C.22a)
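Hence, we have
\[
\sqrt{nT}
\begin{pmatrix}
I_{k_x+2} & H_{1,nT}^{-1}H_{2,nT} & 0\\
0 & 1 & 0\\
0 & \frac{2}{n}\sigma_0^2\,\mathrm{tr}G_n & 1
\end{pmatrix}
\begin{pmatrix}
\hat\delta_{n,T}(\hat\lambda_{nT})-\delta_0\\
\hat\lambda_{nT}-\lambda_0\\
\hat\sigma^2_{nT}(\hat\lambda_{nT})-\sigma_0^2
\end{pmatrix}
\]
\[
=
\begin{pmatrix}
H_{1,nT}^{-1}\left(\frac{1}{\sqrt{nT}}\sum_{t=1}^{T}\tilde Z_{nt}'\tilde V_{nt}\right)\\[4pt]
\Sigma_{\lambda_0}^{-1}\left(\frac{1}{\sigma_0^2}\frac{1}{\sqrt{nT}}\sum_{t=1}^{T}\tilde V_{nt}'\left(G_n'-\frac{1}{n}\mathrm{tr}G_n\, I_n\right)\tilde V_{nt}
+\frac{1}{\sigma_0^2}\frac{1}{\sqrt{nT}}\sum_{t=1}^{T}\left(\delta_0'\tilde Z_{nt}'G_n'-H_{2,nT}'H_{1,nT}^{-1}\tilde Z_{nt}'\right)\tilde V_{nt}\right)\\[4pt]
\frac{1}{\sqrt{nT}}\sum_{t=1}^{T}\left(\tilde V_{nt}'\tilde V_{nt}-n\sigma_0^2\right)
\end{pmatrix}
+\left(R_{\delta nT}',R_{\lambda nT},R_{\sigma^2 nT}\right)'
\]
\[
=
\begin{pmatrix}
H_{1,nT}^{-1}\sigma_0^2 & 0 & 0\\
0 & \Sigma_{\lambda_0}^{-1} & 0\\
0 & 0 & 2\sigma_0^4
\end{pmatrix}
\cdot
\begin{pmatrix}
\frac{1}{\sigma_0^2}\frac{1}{\sqrt{nT}}\sum_{t=1}^{T}\tilde Z_{nt}'\tilde V_{nt}\\[4pt]
\left(\begin{array}{l}
\frac{1}{\sigma_0^2}\frac{1}{\sqrt{nT}}\sum_{t=1}^{T}\left(\tilde V_{nt}'G_n'\tilde V_{nt}-\sigma_0^2\mathrm{tr}G_n\right)
-\frac{1}{\sigma_0^2}\frac{1}{n}\mathrm{tr}G_n\frac{1}{\sqrt{nT}}\sum_{t=1}^{T}\left(\tilde V_{nt}'\tilde V_{nt}-n\sigma_0^2\right)\\
\quad+\frac{1}{\sigma_0^2}\frac{1}{\sqrt{nT}}\sum_{t=1}^{T}\left(\delta_0'\tilde Z_{nt}'G_n'-H_{2,nT}'H_{1,nT}^{-1}\tilde Z_{nt}'\right)\tilde V_{nt}
\end{array}\right)\\[4pt]
\frac{1}{2\sigma_0^4}\frac{1}{\sqrt{nT}}\sum_{t=1}^{T}\left(\tilde V_{nt}'\tilde V_{nt}-n\sigma_0^2\right)
\end{pmatrix}
+\left(R_{\delta nT}',R_{\lambda nT},R_{\sigma^2 nT}\right)'
\]
\[
=
\begin{pmatrix}
H_{1,nT}^{-1}\sigma_0^2 & 0 & 0\\
0 & \Sigma_{\lambda_0}^{-1} & 0\\
0 & 0 & 2\sigma_0^4
\end{pmatrix}
\cdot
\begin{pmatrix}
I_{k_x+2} & 0 & 0\\
-H_{2,nT}'H_{1,nT}^{-1} & 1 & -\frac{2}{n}\sigma_0^2\,\mathrm{tr}G_n\\
0 & 0 & 1
\end{pmatrix}
\cdot
\begin{pmatrix}
\frac{1}{\sigma_0^2}\frac{1}{\sqrt{nT}}\sum_{t=1}^{T}\tilde Z_{nt}'\tilde V_{nt}\\[4pt]
\frac{1}{\sigma_0^2}\frac{1}{\sqrt{nT}}\sum_{t=1}^{T}\left(\tilde V_{nt}'G_n'\tilde V_{nt}-\sigma_0^2\mathrm{tr}G_n\right)
+\frac{1}{\sigma_0^2}\frac{1}{\sqrt{nT}}\sum_{t=1}^{T}\left(\delta_0'\tilde Z_{nt}'G_n'\right)\tilde V_{nt}\\[4pt]
\frac{1}{2\sigma_0^4}\frac{1}{\sqrt{nT}}\sum_{t=1}^{T}\left(\tilde V_{nt}'\tilde V_{nt}-n\sigma_0^2\right)
\end{pmatrix}
+\left(R_{\delta nT}',R_{\lambda nT},R_{\sigma^2 nT}\right)'.
\]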
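Hence,
\[
\sqrt{nT}
\begin{pmatrix}
\hat\delta_{n,T}(\hat\lambda_{nT})-\delta_0\\
\hat\lambda_{nT}-\lambda_0\\
\hat\sigma^2_{nT}(\hat\lambda_{nT})-\sigma_0^2
\end{pmatrix}
= C_{1,nT}\cdot
\begin{pmatrix}
\frac{1}{\sigma_0^2}\frac{1}{\sqrt{nT}}\sum_{t=1}^{T}\tilde Z_{nt}'\tilde V_{nt}\\[4pt]
\frac{1}{\sigma_0^2}\frac{1}{\sqrt{nT}}\sum_{t=1}^{T}\left(\tilde V_{nt}'G_n'\tilde V_{nt}-\sigma_0^2\mathrm{tr}G_n\right)
+\frac{1}{\sigma_0^2}\frac{1}{\sqrt{nT}}\sum_{t=1}^{T}\left(\delta_0'\tilde Z_{nt}'G_n'\right)\tilde V_{nt}\\[4pt]
\frac{1}{2\sigma_0^4}\frac{1}{\sqrt{nT}}\sum_{t=1}^{T}\left(\tilde V_{nt}'\tilde V_{nt}-n\sigma_0^2\right)
\end{pmatrix}
+
\begin{pmatrix}
I_{k_x+2} & H_{1,nT}^{-1}H_{2,nT} & 0\\
0 & 1 & 0\\
0 & \frac{2}{n}\sigma_0^2\,\mathrm{tr}G_n & 1
\end{pmatrix}^{-1}
\cdot\left(R_{\delta nT}',R_{\lambda nT},R_{\sigma^2 nT}\right)'
\]
where
\[
C_{1,nT} =
\begin{pmatrix}
I_{k_x+2} & H_{1,nT}^{-1}H_{2,nT} & 0\\
0 & 1 & 0\\
0 & \frac{2}{n}\sigma_0^2\,\mathrm{tr}G_n & 1
\end{pmatrix}^{-1}
\begin{pmatrix}
H_{1,nT}^{-1}\sigma_0^2 & 0 & 0\\
0 & \Sigma_{\lambda_0}^{-1} & 0\\
0 & 0 & 2\sigma_0^4
\end{pmatrix}
\cdot
\begin{pmatrix}
I_{k_x+2} & 0 & 0\\
-H_{2,nT}'H_{1,nT}^{-1} & 1 & -\frac{2}{n}\sigma_0^2\,\mathrm{tr}G_n\\
0 & 0 & 1
\end{pmatrix}
\]
\[
=
\begin{pmatrix}
I_{k_x+2} & -H_{1,nT}^{-1}H_{2,nT} & 0\\
0 & 1 & 0\\
0 & -\frac{2}{n}\sigma_0^2\,\mathrm{tr}G_n & 1
\end{pmatrix}
\cdot
\begin{pmatrix}
H_{1,nT}^{-1}\sigma_0^2 & 0 & 0\\
0 & \Sigma_{\lambda_0}^{-1} & 0\\
0 & 0 & 2\sigma_0^4
\end{pmatrix}
\cdot
\begin{pmatrix}
I_{k_x+2} & 0 & 0\\
-H_{2,nT}'H_{1,nT}^{-1} & 1 & -\frac{2}{n}\sigma_0^2\,\mathrm{tr}G_n\\
0 & 0 & 1
\end{pmatrix}
= \Sigma_{\theta_0,nT}^{-1}.
\]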
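The step that replaces the inverse of the block matrix by the same matrix with its off-diagonal entries negated can be checked numerically. The following is a minimal sketch, not part of the original derivation: the block size `k`, the vector `a` (standing in for $H_{1,nT}^{-1}H_{2,nT}$) and the scalar `b` (standing in for $\frac{2}{n}\sigma_0^2\mathrm{tr}G_n$) are arbitrary hypothetical values.

```python
import numpy as np

rng = np.random.default_rng(0)
k = 4  # hypothetical stand-in for k_x + 2

# Hypothetical stand-ins: a for H_{1,nT}^{-1} H_{2,nT}, b for (2/n) sigma_0^2 tr(G_n).
a = rng.normal(size=k)
b = 0.7


def block_matrix(a, b):
    """Identity of size k+2 whose middle column carries a (top block) and b (last row)."""
    k = a.shape[0]
    m = np.eye(k + 2)
    m[:k, k] = a
    m[k + 1, k] = b
    return m


M = block_matrix(a, b)
M_inv_claimed = block_matrix(-a, -b)  # off-diagonal entries negated

# The off-diagonal part E = M - I satisfies E @ E = 0, so (I + E)^{-1} = I - E.
print(np.allclose(M @ M_inv_claimed, np.eye(k + 2)))
print(np.allclose(np.linalg.inv(M), M_inv_claimed))
```

The identity holds because all off-diagonal entries sit in the column of the $\lambda$ component, whose own row is a unit row, so the off-diagonal part is nilpotent of order two.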
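We note that, from the log likelihood in (2.8), by concentrating out $c_n$ in terms of $\theta = (\delta',\lambda,\sigma^2)'$, the concentrated likelihood of $\theta$ is
\[
\ln L_{n,T}(\theta) = -\frac{nT}{2}\ln 2\pi - \frac{nT}{2}\ln\sigma^2 + T\ln|S_n(\lambda)| - \frac{1}{2\sigma^2}\sum_{t=1}^{T}\tilde V_{nt}'(\zeta)\tilde V_{nt}(\zeta)
\tag{C.23}
\]
where $\tilde V_{nt}(\zeta) = S_n(\lambda)\tilde Y_{nt}-\tilde Z_{nt}\delta$. It follows that the first derivative of $\ln L_{n,T}(\theta)$ with $\theta$ evaluated at $\theta_0$ is
\[
\frac{1}{\sqrt{nT}}\frac{\partial \ln L_{nT}(\theta_0)}{\partial\theta} =
\begin{pmatrix}
\frac{1}{\sigma_0^2}\frac{1}{\sqrt{nT}}\sum_{t=1}^{T}\tilde Z_{nt}'\tilde V_{nt}\\[4pt]
\frac{1}{\sigma_0^2}\frac{1}{\sqrt{nT}}\sum_{t=1}^{T}\left(\tilde V_{nt}'G_n'\tilde V_{nt}-\sigma_0^2\mathrm{tr}G_n\right)
+\frac{1}{\sigma_0^2}\frac{1}{\sqrt{nT}}\sum_{t=1}^{T}\left(\delta_0'\tilde Z_{nt}'G_n'\right)\tilde V_{nt}\\[4pt]
\frac{1}{2\sigma_0^4}\frac{1}{\sqrt{nT}}\sum_{t=1}^{T}\left(\tilde V_{nt}'\tilde V_{nt}-n\sigma_0^2\right)
\end{pmatrix}.
\tag{C.24}
\]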
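The closed form of the score in (C.24) can be compared with a finite-difference gradient of the concentrated log likelihood (C.23). The sketch below is an illustration under stated assumptions rather than the authors' computation: the sizes, the weights matrix `W`, the parameter values and the simulated arrays are all hypothetical, the arrays `Y`, `Z`, `V0` simply play the role of the tilde (time-demeaned) quantities, and the score is left unscaled (the factor $1/\sqrt{nT}$ is omitted).

```python
import numpy as np

rng = np.random.default_rng(1)
n, T, kz = 5, 6, 3  # kz is a hypothetical stand-in for k_x + 2

# Hypothetical row-normalized spatial weights matrix with zero diagonal.
W = rng.uniform(size=(n, n))
np.fill_diagonal(W, 0.0)
W /= W.sum(axis=1, keepdims=True)

# Hypothetical true parameters theta_0 = (delta_0', lambda_0, sigma_0^2)'.
delta0 = np.array([0.4, 0.2, 1.0])
lam0, sig20 = 0.3, 1.5
S0 = np.eye(n) - lam0 * W
G0 = W @ np.linalg.inv(S0)

# Simulated arrays standing in for the tilde quantities; row t holds period t.
Z = rng.normal(size=(T, n, kz))
V0 = rng.normal(scale=np.sqrt(sig20), size=(T, n))
Y = (Z @ delta0 + V0) @ np.linalg.inv(S0).T  # Y_t = S0^{-1} (Z_t delta_0 + V_t)


def loglik(theta):
    """Concentrated log likelihood in the form of (C.23)."""
    delta, lam, sig2 = theta[:kz], theta[kz], theta[kz + 1]
    S = np.eye(n) - lam * W
    V = Y @ S.T - Z @ delta  # row t is (S Y_t - Z_t delta)'
    _, logdet = np.linalg.slogdet(S)
    return (-0.5 * n * T * np.log(2 * np.pi) - 0.5 * n * T * np.log(sig2)
            + T * logdet - 0.5 * np.sum(V ** 2) / sig2)


# Score at theta_0 written as in (C.24), without the 1/sqrt(nT) scaling.
d_delta = np.einsum("tnk,tn->k", Z, V0) / sig20
d_lam = (np.sum((V0 @ G0.T) * V0) - T * sig20 * np.trace(G0)
         + np.sum(((Z @ delta0) @ G0.T) * V0)) / sig20
d_sig2 = (np.sum(V0 ** 2) - n * T * sig20) / (2 * sig20 ** 2)
score_c24 = np.concatenate([d_delta, [d_lam, d_sig2]])


def num_grad(f, x, h=1e-6):
    """Central finite-difference gradient of f at x."""
    g = np.zeros_like(x)
    for j in range(x.size):
        e = np.zeros_like(x)
        e[j] = h
        g[j] = (f(x + e) - f(x - e)) / (2 * h)
    return g


theta0 = np.concatenate([delta0, [lam0, sig20]])
print(np.allclose(score_c24, num_grad(loglik, theta0), rtol=1e-5, atol=1e-5))
```

The $\lambda$ component uses $W_n\tilde Y_{nt} = G_n(\tilde Z_{nt}\delta_0+\tilde V_{nt})$ at $\theta_0$, which is what turns the generic derivative into the quadratic-plus-linear form shown in (C.24).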
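Hence,
\[
\sqrt{nT}
\begin{pmatrix}
\hat\delta_{n,T}(\hat\lambda_{nT})-\delta_0\\
\hat\lambda_{nT}-\lambda_0\\
\hat\sigma^2_{nT}(\hat\lambda_{nT})-\sigma_0^2
\end{pmatrix}
= \Sigma_{\theta_0,nT}^{-1}\cdot\frac{1}{\sqrt{nT}}\frac{\partial \ln L_{nT}(\theta_0)}{\partial\theta}
+
\begin{pmatrix}
I_{k_x+2} & H_{1,nT}^{-1}H_{2,nT} & 0\\
0 & 1 & 0\\
0 & \frac{2}{n}\sigma_0^2\,\mathrm{tr}G_n & 1
\end{pmatrix}^{-1}
\cdot\left(R_{\delta nT}',R_{\lambda nT},R_{\sigma^2 nT}\right)'.
\tag{C.25}
\]
As $H_{1,nT}^{-1}H_{2,nT} = O_p(1)$ from Proposition B.14 and the elements of $R_{\delta nT}'$, $R_{\lambda nT}$, $R_{\sigma^2 nT}$ are $O_p\left(\max\left(\frac{1}{\sqrt{nT}},\frac{1}{T},\sqrt{\frac{n}{T^{3}}}\right)\right)$, we have
\[
\sqrt{nT}\left(\hat\theta_{nT}-\theta_0\right) = \Sigma_{\theta_0,nT}^{-1}\cdot\frac{1}{\sqrt{nT}}\frac{\partial \ln L_{nT}(\theta_0)}{\partial\theta} + O_p\left(\max\left(\frac{1}{\sqrt{nT}},\frac{1}{T},\sqrt{\frac{n}{T^{3}}}\right)\right).
\]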
C.9 Proof for Theorem 4.1
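As $\tilde Z_{nt}$ has stationary and nonstationary parts (see (2.12)), we can decompose $\frac{1}{\sqrt{nT}}\frac{\partial \ln L_{nT}(\theta_0)}{\partial\theta}$ into two parts:
\[
\frac{1}{\sqrt{nT}}\frac{\partial \ln L_{n,T}(\theta_0)}{\partial\theta}
= \frac{1}{\sqrt{nT}}\frac{\partial \ln L^{s}_{n,T}(\theta_0)}{\partial\theta}
+ \frac{1}{\sqrt{nT}}\frac{\partial \ln L^{u}_{n,T}(\theta_0)}{\partial\theta}
\tag{C.26}
\]
where $\frac{1}{\sqrt{nT}}\frac{\partial \ln L^{s}_{n,T}(\theta_0)}{\partial\theta}$ is the stationary part and $\frac{1}{\sqrt{nT}}\frac{\partial \ln L^{u}_{n,T}(\theta_0)}{\partial\theta}$ is the nonstationary part as follows. The stationary part itself has two parts, $\frac{1}{\sqrt{nT}}\frac{\partial \ln L^{s}_{n,T}(\theta_0)}{\partial\theta} = \frac{1}{\sqrt{nT}}\frac{\partial \ln L^{s*}_{n,T}(\theta_0)}{\partial\theta} - \Delta_{\theta_0,nT}$, where
\[
\frac{1}{\sqrt{nT}}\frac{\partial \ln L^{s*}_{nT}(\theta_0)}{\partial\theta} =
\begin{pmatrix}
\frac{1}{\sigma_0^2}\frac{1}{\sqrt{nT}}\sum_{t=1}^{T}\bar Z^{s*\prime}_{nt}V_{nt}\\[4pt]
\frac{1}{\sigma_0^2}\frac{1}{\sqrt{nT}}\sum_{t=1}^{T}\left(V_{nt}'G_n'V_{nt}-\sigma_0^2\mathrm{tr}G_n\right)
+\frac{1}{\sigma_0^2}\frac{1}{\sqrt{nT}}\sum_{t=1}^{T}\left(\delta_0'\bar Z^{s*\prime}_{nt}G_n'\right)V_{nt}\\[4pt]
\frac{1}{2\sigma_0^4}\frac{1}{\sqrt{nT}}\sum_{t=1}^{T}\left(V_{nt}'V_{nt}-n\sigma_0^2\right)
\end{pmatrix}
\tag{C.27}
\]
and
\[
\Delta_{\theta_0,nT} =
\begin{pmatrix}
\frac{1}{\sigma_0^2}\sqrt{\frac{T}{n}}\left(\bar U^{s}_{nT,-1},W_n\bar U^{s}_{nT,-1},0_{n\times k_x}\right)'\bar V_{nT}\\[4pt]
\frac{1}{\sigma_0^2}\sqrt{\frac{T}{n}}\,\bar V_{nT}'G_n'\bar V_{nT}
+\frac{1}{\sigma_0^2}\sqrt{\frac{T}{n}}\left(\delta_0'\left(\bar U^{s}_{nT,-1},W_n\bar U^{s}_{nT,-1},0_{n\times k_x}\right)'G_n'\right)\bar V_{nT}\\[4pt]
\frac{1}{2\sigma_0^4}\sqrt{\frac{T}{n}}\,\bar V_{nT}'\bar V_{nT}
\end{pmatrix}.
\tag{C.28}
\]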
For 1pnT
@ lnLun;T (�0)
@� , it also has two parts 1pnT
@ lnLun;T (�0)
@� = 1pnT
@ lnLu�n;T (�0)
@� � N�0;nT where
1pnT
@ lnLu�n;T (�0)
@�=1
�20
1pnT
TXt=1
�Y u�0n;t�1Vnt � (c�0; 0)0 (C.29)
with �Y u�n;t�1 =1
(1��0) (RnJnR�1n )
�cn0~t�1 +
�Xn;t�1�0 + �n;t�1
�, ~t�1 = (t� 1)� T�1
2 and
N�0;nT =1
�20
(1
(1� �0)
rT
n(Mn
��n;T�1)0 � �VnT
)� (c�0; 0)0. (C.30)
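Let $\frac{1}{\sqrt{nT}}\frac{\partial \ln L^{*}_{n,T}(\theta_0)}{\partial\theta} = \frac{1}{\sqrt{nT}}\frac{\partial \ln L^{s*}_{n,T}(\theta_0)}{\partial\theta} + \frac{1}{\sqrt{nT}}\frac{\partial \ln L^{u*}_{n,T}(\theta_0)}{\partial\theta}$. Consider $\Sigma_{\theta_0,nT}^{-1}\cdot\frac{1}{\sqrt{nT}}\frac{\partial \ln L^{*}_{n,T}(\theta_0)}{\partial\theta}$. Because $\Sigma_{\theta_0,nT}^{-1}\cdot(c',1,0)' = O(T^{-1})$ according to Proposition 2.1, it has the form required by the CLT in Proposition 2.2 and is asymptotically normal. For its variance, we can write $E\left(\frac{1}{\sqrt{nT}}\frac{\partial \ln L^{*}_{n,T}(\theta_0)}{\partial\theta}\cdot\frac{1}{\sqrt{nT}}\frac{\partial \ln L^{*}_{n,T}(\theta_0)}{\partial\theta'}\right) =$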
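\[
E\,\frac{1}{nT}
\begin{pmatrix}
\frac{1}{\sigma_0^4}\left(\sum_{t=1}^{T}\bar Z^{*\prime}_{nt}V_{nt}\right)\left(\sum_{t=1}^{T}\bar Z^{*\prime}_{nt}V_{nt}\right)' & * & *\\[4pt]
\frac{1}{\sigma_0^4}\left(\sum_{t=1}^{T}(G_n\bar Z^{*}_{nt}\delta_0)'V_{nt}+\sum_{t=1}^{T}(V_{nt}'G_n'V_{nt}-\sigma_0^2\mathrm{tr}G_n)\right)\left(\sum_{t=1}^{T}\bar Z^{*\prime}_{nt}V_{nt}\right)' & 0 & 0\\[4pt]
\frac{1}{2\sigma_0^6}\left(\sum_{t=1}^{T}(V_{nt}'V_{nt}-n\sigma_0^2)\right)\left(\sum_{t=1}^{T}\bar Z^{*\prime}_{nt}V_{nt}\right)' & 0 & 0
\end{pmatrix}
\]
\[
+\,E\,\frac{1}{nT}
\begin{pmatrix}
0 & 0 & 0\\[4pt]
0 & \frac{1}{\sigma_0^4}\left(\sum_{t=1}^{T}(G_n\bar Z^{*}_{nt}\delta_0)'V_{nt}+\sum_{t=1}^{T}(V_{nt}'G_n'V_{nt}-\sigma_0^2\mathrm{tr}G_n)\right)^{2} & *\\[4pt]
0 & \frac{1}{2\sigma_0^6}\left(\sum_{t=1}^{T}(G_n\bar Z^{*}_{nt}\delta_0)'V_{nt}+\sum_{t=1}^{T}(V_{nt}'G_n'V_{nt}-\sigma_0^2\mathrm{tr}G_n)\right)\left(\sum_{t=1}^{T}(V_{nt}'V_{nt}-n\sigma_0^2)\right)' & 0
\end{pmatrix}
\]
\[
+\,E\,\frac{1}{nT}
\begin{pmatrix}
0 & 0 & 0\\
0 & 0 & 0\\
0 & 0 & \frac{1}{4\sigma_0^8}\left(\sum_{t=1}^{T}(V_{nt}'V_{nt}-n\sigma_0^2)\right)\left(\sum_{t=1}^{T}(V_{nt}'V_{nt}-n\sigma_0^2)\right)'
\end{pmatrix}.
\]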
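As $\bar Z^{*}_{nt}$ is uncorrelated with $V_{nt}$, we have $E\left(\frac{1}{\sqrt{nT}}\frac{\partial \ln L^{*}_{n,T}(\theta_0)}{\partial\theta}\cdot\frac{1}{\sqrt{nT}}\frac{\partial \ln L^{*}_{n,T}(\theta_0)}{\partial\theta'}\right)$
\[
=
\begin{pmatrix}
\frac{1}{\sigma_0^2 nT}E\sum_{t=1}^{T}\bar Z^{*\prime}_{nt}\bar Z^{*}_{nt} & \frac{1}{\sigma_0^2 nT}E\sum_{t=1}^{T}\bar Z^{*\prime}_{nt}G_n\bar Z^{*}_{nt}\delta_0 & 0\\[4pt]
\frac{1}{\sigma_0^2 nT}E\sum_{t=1}^{T}(G_n\bar Z^{*}_{nt}\delta_0)'\bar Z^{*}_{nt} & \frac{1}{\sigma_0^2 nT}E\sum_{t=1}^{T}(G_n\bar Z^{*}_{nt}\delta_0)'G_n\bar Z^{*}_{nt}\delta_0+\frac{1}{n}\left(\mathrm{tr}(G_n'G_n)+\mathrm{tr}(G_n^2)\right) & \frac{1}{\sigma_0^2 n}\mathrm{tr}(G_n)\\[4pt]
0 & \frac{1}{\sigma_0^2 n}\mathrm{tr}(G_n) & \frac{1}{2\sigma_0^4}
\end{pmatrix}
\]
\[
+
\begin{pmatrix}
0 & * & *\\[4pt]
\frac{\mu_3}{\sigma_0^4 nT}\sum_{i=1}^{n}G_{n,ii}E\left(\sum_{t=1}^{T}\bar Z^{*}_{nt}\right)_i & \frac{2\mu_3}{\sigma_0^4 nT}\sum_{i=1}^{n}G_{n,ii}E\left(\sum_{t=1}^{T}G_n\bar Z^{*}_{nt}\delta_0\right)_i+\frac{\mu_4-3\sigma_0^4}{\sigma_0^4 n}\sum_{i=1}^{n}G_{n,ii}^2 & *\\[4pt]
\frac{\mu_3}{2\sigma_0^6 nT}l_n'E\sum_{t=1}^{T}\bar Z^{*}_{nt} & \frac{1}{2\sigma_0^6 nT}\mu_3 l_n'E\sum_{t=1}^{T}G_n\bar Z^{*}_{nt}\delta_0+\frac{\mu_4-3\sigma_0^4}{2\sigma_0^6 n}\mathrm{tr}G_n & \frac{\mu_4-3\sigma_0^4}{4\sigma_0^8}
\end{pmatrix}.
\]
As $E\sum_{t=1}^{T}\bar Z^{*}_{nt} = 0$ and $E\sum_{t=1}^{T}G_n\bar Z^{*}_{nt}\delta_0 = 0$, the second matrix equals
\[
\Omega_{\theta_0,n} = \frac{\mu_4-3\sigma_0^4}{\sigma_0^4}
\begin{pmatrix}
0 & 0 & 0\\
0 & \frac{1}{n}\sum_{i=1}^{n}G_{n,ii}^2 & \frac{1}{2\sigma_0^2 n}\mathrm{tr}G_n\\
0 & \frac{1}{2\sigma_0^2 n}\mathrm{tr}G_n & \frac{1}{4\sigma_0^4}
\end{pmatrix}.
\]
When $V_{nt}$ are normally distributed, $\Omega_{\theta_0,n} = 0$ because $\mu_4-3\sigma_0^4 = 0$ for a normal distribution. For the first matrix, premultiplying and postmultiplying it with $\Sigma_{\theta_0,nT}^{-1}$ will yield $\Sigma_{\theta_0,nT}^{-1} + O\left(\frac{1}{T}\right)$. To see this, denote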
�z;nT = ( �Usn;T�1;Wn
�Usn;T�1;0n�kx) and Nz;nT =�
11��0Mn
��n;T�1
�� c0, then, ~Znt = �Z�nt ��z;nT � Nz;nT .
This implies that E( 1pnT
@ lnL�n;T (�0)
@� � 1pnT
@ lnL�n;T (�0)
@�0 ) = ��0;nT +�0;n + ��0;nT where
��0;nT
=
0BBBBB@1
�20nTE
TPt=1
~Z 0nt(Nz;nT +�z;nT ) 1�20nT
ETPt=1
~Z 0ntGn(Nz;nT +�z;nT )�0 0
1�20nT
ETPt=1(Gn ~Znt�0)
0(Nz;nT +�z;nT ) 2�20nT
ETPt=1(Gn ~Znt�0)
0Gn(Nz;nT +�z;nT )�0 0
0 0 0
1CCCCCA
+
0BBBBB@1
�20nTE
TPt=1(Nz;nT +�z;nT )0 ~Znt 1
�20nTE
TPt=1(Nz;nT +�z;nT )0Gn ~Znt�0 0
1�20nT
ETPt=1(Gn(Nz;nT +�z;nT )�0)0 ~Znt 0 0
0 0 0
1CCCCCA
+
0BBBBB@1
�20nTE
TPt=1(Nz;nT +�z;nT )0(Nz;nT +�z;nT ) 1
�20nTE
TPt=1(Nz;nT +�z;nT )0Gn(Nz;nT +�z;nT )�0 0
1�20nT
ETPt=1(Gn(Nz;nT +�z;nT )�0)0(Nz;nT +�z;nT ) 1
�20nTE
TPt=1(Gn(Nz;nT +�z;nT )�0)0Gn(Nz;nT +�z;nT )�0 0
0 0 0
1CCCCCA
=
0BBBBB@1
�20nTE
TPt=1(Nz;nT +�z;nT )0(Nz;nT +�z;nT ) 1
�20nTE
TPt=1(Nz;nT +�z;nT )0Gn(Nz;nT +�z;nT )�0 0
1�20nT
ETPt=1(Gn(Nz;nT +�z;nT )�0)0(Nz;nT +�z;nT ) 1
�20nTE
TPt=1(Gn(Nz;nT +�z;nT )�0)0Gn(Nz;nT +�z;nT )�0 0
0 0 0
1CCCCCAbecause the expectations in the �rst two matrices are all zero.
We have
E1
�20nTE
TXt=1
(Nz;nT +�z;nT )0(Nz;nT +�z;nT )
= c ��
1
(1� �0)21
�20nE��
0n;T�1M
0nMn
��n;T�1
�� c0 +
�1
�20nE�0z;nT�z;nT
�:
+c ��� 1
(1� �0)1
�20nE��
0n;T�1M
0n�z;nT
�+
�� 1
1� �01
�20nE�0z;nTMn
��n;T�1
�c0.
Similarly we can expand 1�20nT
ETPt=1(Nz;nT+�z;nT )0Gn(Nz;nT+�z;nT )�0 and 1
�20nTE
TPt=1(Gn(Nz;nT+�z;nT )�0)0Gn(Nz;nT+
�z;nT )�0. Using the orders of relevant terms from Lemma B.8 and Lemma B.10, we have 1nE�0z;nTBn��n;T�1 =
O(1), 1nE�0z;nTBn�z;nT = O(T�1) and 1
nE��0n;T�1Bn��n;T�1 = O(T ) where Bn is a row sum and column sum
bounded matrix. Using ��1�0;nT � (c�0; 0)0 = O(T�1) and (c�0; 0) � ��1�0;nT � (c
�0; 0)0 = O(T�2) (see Proposition
2.1) and the above, it follows that ��1�0;nT � ��0;nT � ��1�0;nT
= O(T�1). Hence,
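\[
\Sigma_{\theta_0,nT}^{-1}\,E\left(\frac{1}{\sqrt{nT}}\frac{\partial \ln L^{*}_{n,T}(\theta_0)}{\partial\theta}\cdot\frac{1}{\sqrt{nT}}\frac{\partial \ln L^{*}_{n,T}(\theta_0)}{\partial\theta'}\right)\Sigma_{\theta_0,nT}^{-1}
= \Sigma_{\theta_0,nT}^{-1} + \Sigma_{\theta_0,nT}^{-1}\Omega_{\theta_0,n}\Sigma_{\theta_0,nT}^{-1} + O\left(T^{-1}\right).
\tag{C.31}
\]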
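Therefore,
\[
\Sigma_{\theta_0,nT}^{-1}\left(\frac{1}{\sqrt{nT}}\frac{\partial \ln L^{*}_{n,T}(\theta_0)}{\partial\theta}\right)
\xrightarrow{d} N\left(0,\ \lim_{T\to\infty}\Sigma_{\theta_0,nT}^{-1} + \lim_{T\to\infty}\Sigma_{\theta_0,nT}^{-1}\Omega_{\theta_0,n}\Sigma_{\theta_0,nT}^{-1}\right).
\tag{C.32}
\]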
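The second variance term in (C.32) is scaled by $\mu_4-3\sigma_0^4$, which is zero for normal disturbances and nonzero in general. The sketch below illustrates this with exact moments; it is an illustration under assumptions, not part of the proof. The helper names `normal_fourth_moment` and `omega`, the weights matrix `W`, the value 0.3 used for $\lambda_0$ and the uniform comparison distribution are all hypothetical choices.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss


def normal_fourth_moment(sigma2, deg=10):
    """E[v^4] for v ~ N(0, sigma2), by Gauss-Hermite quadrature (exact for polynomials)."""
    x, w = hermegauss(deg)  # nodes and weights for the weight function exp(-x^2 / 2)
    return sigma2 ** 2 * np.sum(w * x ** 4) / np.sqrt(2 * np.pi)


def omega(G, sigma2, mu4, k):
    """Matrix of the form of Omega_{theta_0,n}; k is the dimension of delta."""
    n = G.shape[0]
    out = np.zeros((k + 2, k + 2))
    out[k, k] = np.sum(np.diag(G) ** 2) / n
    out[k, k + 1] = out[k + 1, k] = np.trace(G) / (2 * sigma2 * n)
    out[k + 1, k + 1] = 1 / (4 * sigma2 ** 2)
    return (mu4 - 3 * sigma2 ** 2) / sigma2 ** 2 * out


rng = np.random.default_rng(4)
n, k = 6, 3
W = rng.uniform(size=(n, n))          # hypothetical weights matrix
np.fill_diagonal(W, 0.0)
W /= W.sum(axis=1, keepdims=True)
G = W @ np.linalg.inv(np.eye(n) - 0.3 * W)  # G_n = W_n S_n^{-1} with lambda_0 = 0.3

sigma2 = 1.5
omega_normal = omega(G, sigma2, normal_fourth_moment(sigma2), k)

# Uniform on [-a, a]: variance a^2 / 3 and fourth moment a^4 / 5.
a = np.sqrt(3 * sigma2)
omega_uniform = omega(G, sigma2, a ** 4 / 5, k)

print(np.allclose(omega_normal, 0.0))
print(np.allclose(omega_uniform, 0.0))
```

With uniform disturbances the scaling factor is $(a^4/5 - a^4/3)/\sigma_0^4 = -6/5$, so the extra variance term does not vanish and the sandwich form in (C.32) is needed.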
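Using the results in Yu, de Jong and Lee (2006) (Theorem A.11, page 19), we have $\Delta_{\theta_0,nT} = \sqrt{\frac{n}{T}}\,a^{s}_{\theta_0,n} + O_p\left(\max\left(\sqrt{\frac{n}{T^{3}}},\frac{1}{\sqrt{T}}\right)\right)$. Using (B.17) and (C.30), we have $N_{\theta_0,nT} = \sqrt{\frac{n}{T}}\cdot\frac{m_n}{n}\cdot a^{u}_{\theta_0,T} + T\cdot(c',1,0)'\cdot O_p\left(\max\left(\sqrt{\frac{n}{T^{3}}},\frac{1}{\sqrt{T}}\right)\right)$ where
\[
a^{s}_{\theta_0,n} =
\begin{pmatrix}
\frac{1}{n}\mathrm{tr}\left(\left(\sum_{h=0}^{\infty}B_n^{h}\right)S_n^{-1}\right)\\[4pt]
\frac{1}{n}\mathrm{tr}\left(W_n\left(\sum_{h=0}^{\infty}B_n^{h}\right)S_n^{-1}\right)\\[4pt]
0\\[4pt]
\frac{1}{n}\gamma_0\mathrm{tr}\left(G_n\left(\sum_{h=0}^{\infty}B_n^{h}\right)S_n^{-1}\right)+\frac{1}{n}\rho_0\mathrm{tr}\left(G_nW_n\left(\sum_{h=0}^{\infty}B_n^{h}\right)S_n^{-1}\right)+\frac{1}{n}\mathrm{tr}G_n\\[4pt]
\frac{1}{2\sigma_0^2}
\end{pmatrix},
\tag{C.33}
\]
\[
a^{u}_{\theta_0,T} = T\cdot\frac{1}{2(1-\lambda_0)}\cdot(c',1,0)'.
\]
Hence,
\[
\Delta_{\theta_0,nT}+N_{\theta_0,nT} = \sqrt{\frac{n}{T}}\cdot\left(a^{s}_{\theta_0,n}+a^{u}_{\theta_0,T}\frac{m_n}{n}\right)
+ T\cdot(c',1,0)'\cdot O_p\left(\max\left(\sqrt{\frac{n}{T^{3}}},\frac{1}{\sqrt{T}}\right)\right)
+ O_p\left(\max\left(\sqrt{\frac{n}{T^{3}}},\frac{1}{\sqrt{T}}\right)\right).
\tag{C.34}
\]
Using $\Sigma_{\theta_0,nT}^{-1}\cdot(c',1,0)' = O(T^{-1})$, we have $\Sigma_{\theta_0,nT}^{-1}\left(\Delta_{\theta_0,nT}+N_{\theta_0,nT}\right) = \sqrt{\frac{n}{T}}\cdot b_{\theta_0,nT} + O_p\left(\max\left(\sqrt{\frac{n}{T^{3}}},\frac{1}{\sqrt{T}}\right)\right)$. Hence, combining (4.1), (C.32) and the equation above, we have the result. ∎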
C.10 Proof for Theorem 4.2
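For the remainder term in (C.25), using (C.20), (C.22) and $1-c'H_{1,nT}^{-1}H_{2,nT} = O(T^{-1})$ from Proposition B.14, we have
\[
(c',1,0)
\begin{pmatrix}
I_{k_x+2} & H_{1,nT}^{-1}H_{2,nT} & 0\\
0 & 1 & 0\\
0 & \frac{2}{n}\sigma_0^2\,\mathrm{tr}G_n & 1
\end{pmatrix}^{-1}
\begin{pmatrix}
R_{\delta nT}\\ R_{\lambda nT}\\ R_{\sigma^2 nT}
\end{pmatrix}
= (c',1,0)
\begin{pmatrix}
I_{k_x+2} & -H_{1,nT}^{-1}H_{2,nT} & 0\\
0 & 1 & 0\\
0 & -\frac{2}{n}\sigma_0^2\,\mathrm{tr}G_n & 1
\end{pmatrix}
\begin{pmatrix}
R_{\delta nT}\\ R_{\lambda nT}\\ R_{\sigma^2 nT}
\end{pmatrix}
\]
\[
= c'R_{\delta nT} + \left(1-c'H_{1,nT}^{-1}H_{2,nT}\right)R_{\lambda nT}
= O_p\left(\max\left(\frac{1}{T^{2}},\frac{1}{\sqrt{nT^{3}}},\sqrt{\frac{n}{T^{5}}}\right)\right).
\]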
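Hence,
\[
\sqrt{nT^{3}}\,(c',1,0)(\hat\theta_{nT}-\theta_0)
= T(c',1,0)\Sigma_{\theta_0,nT}^{-1}\cdot\frac{1}{\sqrt{nT}}\frac{\partial \ln L_{nT}(\theta_0)}{\partial\theta}
+ O_p\left(\max\left(\frac{1}{T},\frac{1}{\sqrt{nT}},\sqrt{\frac{n}{T^{3}}}\right)\right)
\tag{C.35}
\]
\[
= T(c',1,0)\Sigma_{\theta_0,nT}^{-1}\cdot\frac{1}{\sqrt{nT}}\frac{\partial \ln L^{s*}_{nT}(\theta_0)}{\partial\theta}
+ T(c',1,0)\Sigma_{\theta_0,nT}^{-1}\cdot\frac{1}{\sqrt{nT}}\frac{\partial \ln L^{u*}_{nT}(\theta_0)}{\partial\theta}
- T(c',1,0)\Sigma_{\theta_0,nT}^{-1}\left(\Delta_{\theta_0,nT}+N_{\theta_0,nT}\right)
+ O_p\left(\max\left(\frac{1}{T},\frac{1}{\sqrt{nT}},\sqrt{\frac{n}{T^{3}}}\right)\right)
\]
where $\frac{1}{\sqrt{nT}}\frac{\partial \ln L^{s*}_{nT}(\theta_0)}{\partial\theta}$ is in (C.27), $\frac{1}{\sqrt{nT}}\frac{\partial \ln L^{u*}_{nT}(\theta_0)}{\partial\theta}$ is in (C.29), $\Delta_{\theta_0,nT}$ is in (C.28) and $N_{\theta_0,nT}$ is in (C.30). We shall investigate the orders of those terms.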
For stationary terms, from Yu, de Jong and Lee (2006) (Claim 3.4, page 10), 1pnT
@ lnLs�nT (�0)@� has the
typical Op(1) and ��0;nT � E(��0;nT ) = Op( 1pT) where E(��0;nT ) = O(
pnT ). For the nonstationary term
T (c�0; 0)��1�0;nTN�0;nT = T (c�0; 0)��1�0;nT (c
�0; 0)n
1�20(1��0)
qTn (Mn
��n;T�1)0 � �VnT
o, we have T (c�0; 0)��1�0;nTN�0;nT�
E�T (c�0; 0)��1�0;nTN�0;nT
�= Op(
1pT) where E
�T (c�0; 0)��1�0;nTN�0;nT
�= O(
pnT ) by using (B.17) in Lemma
B.9 and (c�0; 0)��1�0;nT (c�0; 0)0 = O(T�2). For nonstationary term T (c�0; 0)��1�0;nT �
1pnT
@ lnLu�nT (�0)@� , we have
1pnT
@ lnLu�nT (�0)
@�=
1pnT
TXt=1
1
�20(1� �0)Mn
�cn0~t�1 +
�Xn;t�1�0 + �n;t�1
�0Vnt � (c�0; 0)0:
Hence,
pnT 3(c�0; 0)(�nT � �0)
= T (c�0; 0)��1�0;nT �1pnT
@ lnLs�nT (�0)
@�
+T 2(c�0; 0)��1�0;nT (c�0; 0)0 � 1p
nT
TXt=1
1
�20(1� �0)Mn
1
T
�cn0~t�1 +
�Xn;t�1�0 + �n;t�1
�0Vnt
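$-\,T(c',1,0)\Sigma_{\theta_0,nT}^{-1}E\left(\Delta_{\theta_0,nT}+N_{\theta_0,nT}\right) + O_p\left(\frac{1}{\sqrt{T}}\right) + O_p\left(\max\left(\frac{1}{T},\frac{1}{\sqrt{nT}},\sqrt{\frac{n}{T^{3}}}\right)\right)$
where $T(c',1,0)\Sigma_{\theta_0,nT}^{-1}E\left(\Delta_{\theta_0,nT}+N_{\theta_0,nT}\right) = O\left(\sqrt{\frac{n}{T}}\right)$ represents the asymptotic bias term and the first two terms will be asymptotically jointly normally distributed. As $(c',1,0)(\hat\theta_{nT}-\theta_0) = \hat\lambda_{nT}+\hat\gamma_{nT}+\hat\rho_{nT}-1$, the rate of convergence of $\hat\lambda_{nT}+\hat\gamma_{nT}+\hat\rho_{nT}$ to unity is of the higher order $O\left(\frac{1}{\sqrt{nT^{3}}}\right)$ as long as $\frac{n}{T^{3}}\to 0$.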
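So, we have
\[
\sqrt{nT^{3}}\,(c',1,0)(\hat\theta_{nT}-\theta_0) + \sqrt{\frac{n}{T}}\,T(c',1,0)\,b_{\theta_0,nT} + O_p\left(\max\left(\frac{1}{\sqrt{T}},\sqrt{\frac{n}{T^{3}}}\right)\right)
\xrightarrow{d} N\left(0,\ \lim_{T\to\infty}T^{2}(c',1,0)\left(\Sigma_{\theta_0,nT}^{-1}+\lim_{T\to\infty}\Sigma_{\theta_0,nT}^{-1}\Omega_{\theta_0,n}\Sigma_{\theta_0,nT}^{-1}\right)(c',1,0)'\right).
\tag{C.36}
\]
Also, using Proposition 2.1, $\lim_{T\to\infty}T^{2}(c',1,0)\Sigma_{\theta_0,nT}^{-1}(c',1,0)' = \lim_{T\to\infty}\omega_{nT}^{-1}$, where $\omega_{nT}$ is defined in (2.13).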
C.11 Proof for Theorem 4.3
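From the first order condition that $\frac{\partial \ln L_{n,T}(\theta,c_n)}{\partial c_n} = \frac{1}{\sigma^2}\sum_{t=1}^{T}V_{nt}(\theta) = 0$, we have $\hat c_{nT}(\theta) = \frac{1}{T}\sum_{t=1}^{T}\left(S_n(\lambda)Y_{nt}-Z_{nt}\delta\right)$. As $S_nY_{nt} = Z_{nt}\delta_0+c_{n0}+V_{nt}$ and $S_n(\lambda)S_n^{-1} = I_n-(\lambda-\lambda_0)G_n$, this implies that $\hat c_{nT}(\theta) = \frac{1}{T}\sum_{t=1}^{T}\left(\left(I_n-(\lambda-\lambda_0)G_n\right)\left(Z_{nt}\delta_0+c_{n0}+V_{nt}\right)-Z_{nt}\delta\right)$. Hence,
\begin{align*}
\hat c_{nT}(\theta)-c_{n0}
&= \frac{1}{T}\sum_{t=1}^{T}\left(\left(I_n-(\lambda-\lambda_0)G_n\right)\left(Z_{nt}\delta_0+c_{n0}+V_{nt}\right)-Z_{nt}\delta\right)-c_{n0}\\
&= -\frac{1}{T}\sum_{t=1}^{T}\left[Z_{nt}(\delta-\delta_0)+(\lambda-\lambda_0)\left(G_nc_{n0}+G_nZ_{nt}\delta_0\right)-\left(I_n-(\lambda-\lambda_0)G_n\right)V_{nt}\right]\\
&= -\frac{1}{T}\sum_{t=1}^{T}\left[Z^{s}_{nt}(\delta-\delta_0)+(\lambda-\lambda_0)\left(G_nc_{n0}+G_nZ^{s}_{nt}\delta_0\right)-\left(I_n-(\lambda-\lambda_0)G_n\right)V_{nt}\right]\\
&\quad -\frac{1}{T}\sum_{t=1}^{T}\left[Z^{u}_{nt}(\delta-\delta_0)+(\lambda-\lambda_0)\left(G_nZ^{u}_{nt}\delta_0\right)\right].
\end{align*}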
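The two algebraic facts used in this step, $S_n(\lambda)S_n^{-1} = I_n-(\lambda-\lambda_0)G_n$ and the resulting expression for $\hat c_{nT}(\theta)-c_{n0}$ before it is split into stationary and nonstationary parts, can be verified on simulated arrays. This is a minimal sketch under assumptions: the sizes, the weights matrix `W` and all parameter values are hypothetical, and no stationary/nonstationary split is attempted.

```python
import numpy as np

rng = np.random.default_rng(2)
n, T, kz = 4, 5, 3  # kz is a hypothetical stand-in for k_x + 2

# Hypothetical row-normalized weights matrix with zero diagonal.
W = rng.uniform(size=(n, n))
np.fill_diagonal(W, 0.0)
W /= W.sum(axis=1, keepdims=True)

lam0, lam = 0.30, 0.45                       # true and trial values of lambda
delta0 = rng.normal(size=kz)                 # true delta
delta = delta0 + 0.1 * rng.normal(size=kz)   # trial delta
c0 = rng.normal(size=n)                      # true fixed effects c_{n0}

I = np.eye(n)
S0, S = I - lam0 * W, I - lam * W
G = W @ np.linalg.inv(S0)                    # G_n = W_n S_n^{-1}

# Identity S_n(lambda) S_n^{-1} = I_n - (lambda - lambda_0) G_n.
A = I - (lam - lam0) * G
print(np.allclose(S @ np.linalg.inv(S0), A))

# Data from S_n Y_nt = Z_nt delta_0 + c_n0 + V_nt; row t holds period t.
Z = rng.normal(size=(T, n, kz))
V = rng.normal(size=(T, n))
Y = (Z @ delta0 + c0 + V) @ np.linalg.inv(S0).T

# Estimated fixed effects from the first order condition.
c_hat = (Y @ S.T - Z @ delta).mean(axis=0)

# Right-hand side of the second line of the display above.
rhs = -(Z @ (delta - delta0)
        + (lam - lam0) * ((Z @ delta0 + c0) @ G.T)
        - V @ A.T).mean(axis=0)
print(np.allclose(c_hat - c0, rhs))
```

The first identity is just $S_n(\lambda) = S_n-(\lambda-\lambda_0)W_n$ postmultiplied by $S_n^{-1}$, and the second follows by substituting it into the first order condition.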
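As $\frac{1}{T}\sum_{t=1}^{T}\left[Z^{u}_{nt}(\delta-\delta_0)+(\lambda-\lambda_0)\left(G_nZ^{u}_{nt}\delta_0\right)\right] = \left(\frac{1}{T}\sum_{t=1}^{T}Y^{u}_{n,t-1}\right)(\gamma+\rho+\lambda-1)$, we have
\[
\frac{1}{T}\sum_{t=1}^{T}\left[Z^{u}_{nt}\left(\hat\delta_{nT}-\delta_0\right)+\left(\hat\lambda_{nT}-\lambda_0\right)\left(G_nZ^{u}_{nt}\delta_0\right)\right]
= \left(\frac{1}{T^{2}}\sum_{t=1}^{T}Y^{u}_{n,t-1}\right)T\cdot\left(\hat\gamma_{nT}+\hat\rho_{nT}+\hat\lambda_{nT}-1\right).
\]
From Theorem 4.2, $T\cdot\left(\hat\gamma_{nT}+\hat\rho_{nT}+\hat\lambda_{nT}-1\right) = O_p\left(\max\left(\frac{1}{T},\frac{1}{\sqrt{nT}}\right)\right)$. From (2.4), the elements of $\frac{1}{T^{2}}\sum_{t=1}^{T}Y^{u}_{n,t-1}$ are $O_p(1)$ if the elements of $\frac{Y_{n,-1}}{T}$ are $O_p(1)$. Then, for each fixed effect, we have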
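\[
\hat c_{i,nT}(\hat\theta_{nT})-c_{i,0}
= -\frac{1}{T}\sum_{t=1}^{T}\left(\left(G_nc_{n0}+G_nZ^{s}_{nt}\delta_0\right)_i,\ \left(Z^{s}_{nt}\right)_i\right)\cdot
\begin{pmatrix}\hat\lambda_{nT}-\lambda_0\\ \hat\delta_{nT}-\delta_0\end{pmatrix}
+ \frac{1}{T}\sum_{t=1}^{T}\left\{\left(I_n-(\hat\lambda_{nT}-\lambda_0)G_n\right)V_{nt}\right\}_i
+ O_p\left(\max\left(\frac{1}{T},\frac{1}{\sqrt{nT}}\right)\right),
\tag{C.37}
\]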
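where $(Z^{s}_{nt})_i$ is the $i$th row of $Z^{s}_{nt}$ and $(G_nc_{n0}+G_nZ^{s}_{nt}\delta_0)_i$ is the $i$th element of $(G_nc_{n0}+G_nZ^{s}_{nt}\delta_0)$. As the elements of $\frac{1}{T}\sum_{t=1}^{T}\left((G_nc_{n0}+G_nZ^{s}_{nt}\delta_0)_i,(Z^{s}_{nt})_i\right)$ are $O_p(1)$ uniformly in $n$ and $i$, as implied by Lemma B.4 of Yu, de Jong and Lee (2006), and $\hat\theta_{nT}-\theta_0 = O_p\left(\max\left(\frac{1}{\sqrt{nT}},\frac{1}{T}\right)\right)$ by Theorem 3.6, the dominant term of $\sqrt{T}\left(\hat c_{i,nT}(\hat\theta_{nT})-c_{i,0}\right)$ would be $\frac{1}{\sqrt{T}}\sum_{t=1}^{T}v_{it}+O_p\left(\frac{1}{\sqrt{n}}\right)$ when $T\to\infty$, where the $O_p\left(\frac{1}{\sqrt{n}}\right)$ term is the $\left(\frac{1}{T^{2}}\sum_{t=1}^{T}Y^{u}_{n,t-1}\right)$ term multiplied by the distribution part of $T\cdot\left(\hat\gamma_{nT}+\hat\rho_{nT}+\hat\lambda_{nT}-1\right)$, which is $T(c',1,0)\Sigma_{\theta_0,nT}^{-1}\cdot\frac{1}{n\sqrt{T}}\frac{\partial \ln L^{*}_{nT}(\theta_0)}{\partial\theta}$ (see (C.35)). So, for each fixed effect,
\[
\sqrt{T}\left(\hat c_{i,nT}(\hat\theta_{nT})-c_{i,0}\right)
= \frac{1}{\sqrt{T}}\sum_{t=1}^{T}v_{it}
+ \frac{1}{\sqrt{n}}\left(\frac{1}{T^{2}}\sum_{t=1}^{T}Y^{u}_{n,t-1}\right)_i\cdot\left(\left[T(c',1,0)\Sigma_{\theta_0,nT}^{-1}\right]\cdot\frac{1}{\sqrt{nT}}\frac{\partial \ln L^{*}_{nT}(\theta_0)}{\partial\theta}\right)
+ O_p\left(\frac{1}{\sqrt{T}}\right)
\tag{C.38}
\]
where $\frac{1}{\sqrt{nT}}\frac{\partial \ln L^{*}_{nT}(\theta_0)}{\partial\theta} = \frac{1}{\sqrt{nT}}\frac{\partial \ln L^{s*}_{nT}(\theta_0)}{\partial\theta}+\frac{1}{\sqrt{nT}}\frac{\partial \ln L^{u*}_{nT}(\theta_0)}{\partial\theta}$ (defined in (C.27) and (C.29)) and $\left[T(c',1,0)\Sigma_{\theta_0,nT}^{-1}\right]\cdot\frac{1}{\sqrt{nT}}\frac{\partial \ln L^{*}(\theta_0)}{\partial\theta}$ is normally distributed asymptotically with the variance specified in (4.11). Hence, $\sqrt{T}\left(\hat c_{i,nT}(\hat\theta_{nT})-c_{i,0}\right)$ is a linear and quadratic form of $V_{nt}$ and it will be normally distributed asymptotically by the central limit theorem in Proposition 2.2. We need to calculate its variance.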
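Under the assumption that $(Y_{n,-1}/T)_i - E(Y_{n,-1}/T)_i = o_p(1)$ and $E(Y_{n,-1}/T)_i = O(1)$ uniformly in $n$ and $i$, we have $\frac{1}{T^{2}}\sum_{t=1}^{T}(Y^{u}_{n,t-1})_i = E\frac{1}{T^{2}}\sum_{t=1}^{T}(Y^{u}_{n,t-1})_i + o_p(1)$ where $E\frac{1}{T^{2}}\sum_{t=1}^{T}(Y^{u}_{n,t-1})_i$ is $O(1)$. Hence,
\[
\sqrt{T}\left(\hat c_{i,nT}(\hat\theta_{nT})-c_{i,0}\right)
= \frac{1}{\sqrt{T}}\sum_{t=1}^{T}v_{it}
+ \frac{1}{\sqrt{n}}\left(E\frac{1}{T^{2}}\sum_{t=1}^{T}Y^{u}_{n,t-1}\right)_i\cdot\left(\left[T(c',1,0)\Sigma_{\theta_0,nT}^{-1}\right]\cdot\frac{1}{\sqrt{nT}}\frac{\partial \ln L^{*}_{nT}(\theta_0)}{\partial\theta}\right)
+ o_p(1).
\tag{C.39}
\]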
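As
\[
\frac{1}{\sqrt{nT}}\frac{\partial \ln L^{*}_{nT}(\theta_0)}{\partial\theta} =
\begin{pmatrix}
\frac{1}{\sigma_0^2}\frac{1}{\sqrt{nT}}\sum_{t=1}^{T}\bar Z^{*\prime}_{nt}V_{nt}\\[4pt]
\frac{1}{\sigma_0^2}\frac{1}{\sqrt{nT}}\sum_{t=1}^{T}\left(V_{nt}'G_n'V_{nt}-\sigma_0^2\mathrm{tr}G_n\right)
+\frac{1}{\sigma_0^2}\frac{1}{\sqrt{nT}}\sum_{t=1}^{T}\left(\delta_0'\bar Z^{*\prime}_{nt}G_n'\right)V_{nt}\\[4pt]
\frac{1}{2\sigma_0^4}\frac{1}{\sqrt{nT}}\sum_{t=1}^{T}\left(V_{nt}'V_{nt}-n\sigma_0^2\right)
\end{pmatrix}
\]
where $\bar Z^{*}_{nt}$ is defined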
in (C.10), the asymptotic variance ofpT�ci;nT (�nT )� ci;0
�would be �n;ci where
�n;ci = �20 +2
n
E1
T 2
TXt=1
Y un;t�1
!i
��3
�[T (c�0; 0)��1�0;nT ] � [0; Gii; 1]
0��
(C.40)
+2
n
E1
T 2
TXt=1
Y un;t�1
!i
�20
[T (c�0; 0)��1�0;nT ] � [(
TXt=1
E �Z�nt; )i; (TXt=1
EGn �Z�nt�0; )i; 0]
0
!!
+1
n
0@ E 1
T 2
TXt=1
Y un;t�1
!2i
�limT!1
!�1nT + limT!1
T 2(c�0; 0)( limT!1
��1�0;nT�0;nT��1�0;nT
)(c�0; 0)0�1A .
When n!1, we have �n;ci ! �20. �
C.12 Proof for Theorem 4.5
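Theorem 4.1 states that $\sqrt{nT}(\hat\theta_{nT}-\theta_0)+\sqrt{\frac{n}{T}}\,b_{\theta_0,nT}+O_p\left(\max\left(\frac{1}{T},\sqrt{\frac{n}{T^{3}}}\right)\right)\xrightarrow{d} N\left(0,\ \lim_{T\to\infty}\Sigma_{\theta_0,nT}^{-1}+\lim_{T\to\infty}\Sigma_{\theta_0,nT}^{-1}\Omega_{\theta_0,n}\Sigma_{\theta_0,nT}^{-1}\right)$. As the bias corrected estimator is $\hat\theta^{1}_{nT} = \hat\theta_{nT}+\frac{1}{T}b_{\hat\theta_{nT},nT}$, we have $\sqrt{nT}(\hat\theta^{1}_{nT}-\theta_0)\xrightarrow{d} N\left(0,\ \lim_{T\to\infty}\Sigma_{\theta_0,nT}^{-1}+\lim_{T\to\infty}\Sigma_{\theta_0,nT}^{-1}\Omega_{\theta_0,n}\Sigma_{\theta_0,nT}^{-1}\right)$ if $\sqrt{\frac{n}{T}}\left(b_{\hat\theta_{nT},nT}-b_{\theta_0,nT}\right)\xrightarrow{p}0$ and $\frac{n}{T^{3}}\to 0$.

So, given $\frac{n}{T^{3}}\to 0$, we are going to prove that $\sqrt{\frac{n}{T}}\left(b_{\hat\theta_{nT},nT}-b_{\theta_0,nT}\right)\xrightarrow{p}0$ where $b_{\theta_0,nT} = \Sigma_{\theta_0,nT}^{-1}\cdot\left(a^{s}_{\theta_0,n}+a^{u}_{\theta_0,T}\frac{m_n}{n}\right)$ and $b_{\hat\theta_{nT},nT} = \hat\Sigma_{\hat\theta_{nT},nT}^{-1}\cdot\left(a^{s}_{\hat\theta_{nT},n}+a^{u}_{\hat\theta_{nT},T}\frac{m_n}{n}\right)$. As $\Sigma_{\theta_0,nT}^{-1} = \hat\Sigma_{\hat\theta_{nT},nT}^{-1}+O_p\left(\max\left(\frac{1}{\sqrt{nT}},\frac{1}{T}\right)\right)$ and $T\cdot\left[\hat\Sigma_{\hat\theta_{nT},nT}^{-1}-\Sigma_{\theta_0,nT}^{-1}\right]\cdot(c',1,0)' = O_p\left(\max\left(\frac{1}{\sqrt{nT}},\frac{1}{T}\right)\right)$ from Proposition B.15, $\sqrt{\frac{n}{T}}\left(b_{\hat\theta_{nT},nT}-b_{\theta_0,nT}\right)\xrightarrow{p}0$ is reduced to
\[
\sqrt{\frac{n}{T}}\left(\Sigma_{\theta_0,nT}^{-1}a^{u}_{\hat\theta_{nT},T}-\Sigma_{\theta_0,nT}^{-1}a^{u}_{\theta_0,T}\right)\xrightarrow{p}0
\tag{C.41}
\]
and
\[
\sqrt{\frac{n}{T}}\left(a^{s}_{\hat\theta_{nT},n}-a^{s}_{\theta_0,n}\right)\xrightarrow{p}0.
\tag{C.42}
\]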
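For (C.41), as $a^{u}_{\theta_0,T} = T\frac{1}{2(1-\lambda_0)}(c',1,0)'$ with $\Sigma_{\theta_0,nT}^{-1}\cdot(c',1,0)' = O(T^{-1})$,
\[
\sqrt{\frac{n}{T}}\left(\Sigma_{\theta_0,nT}^{-1}a^{u}_{\hat\theta_{nT},T}-\Sigma_{\theta_0,nT}^{-1}a^{u}_{\theta_0,T}\right)
= \sqrt{\frac{n}{T}}\left(T\cdot\Sigma_{\theta_0,nT}^{-1}(c',1,0)'\right)\cdot\left(\frac{1}{2(1-\hat\lambda_{nT})}-\frac{1}{2(1-\lambda_0)}\right)
= \sqrt{\frac{n}{T}}\left(T\cdot\Sigma_{\theta_0,nT}^{-1}(c',1,0)'\right)\cdot\left(\frac{\hat\lambda_{nT}-\lambda_0}{2(1-\hat\lambda_{nT})(1-\lambda_0)}\right).
\]
As $\hat\lambda_{nT}-\lambda_0 = O_p\left(\max\left(\frac{1}{T},\frac{1}{\sqrt{nT}}\right)\right)$, we have $\sqrt{\frac{n}{T}}\left(\Sigma_{\theta_0,nT}^{-1}a^{u}_{\hat\theta_{nT},T}-\Sigma_{\theta_0,nT}^{-1}a^{u}_{\theta_0,T}\right)\xrightarrow{p}0$ if $\frac{n}{T^{3}}\to 0$.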
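For (C.42), as $\hat\theta_{nT}-\theta_0 = O_p\left(\max\left(\frac{1}{T},\frac{1}{\sqrt{nT}}\right)\right)$ and $a^{s}_{n}(\theta_0)$ is $O(1)$ where $a^{s}_{n}(\theta_0)\equiv a^{s}_{\theta_0,n}$, according to the Taylor expansion of $a^{s}_{n}(\hat\theta_{nT})$ around $a^{s}_{n}(\theta_0)$, proving (C.42) is reduced to proving that the elements of $\frac{\partial a^{s}_{n}(\bar\theta_{nT})}{\partial\theta'}$ are $O_p(1)$ where $\bar\theta_{nT}$ lies between $\hat\theta_{nT}$ and $\theta_0$ and
\[
a^{s}_{n}(\theta) =
\begin{pmatrix}
\frac{1}{n}\mathrm{tr}\left(\left(\sum_{h=0}^{\infty}B_n^{h}(\theta)\right)S_n^{-1}(\lambda)\right)\\[4pt]
\frac{1}{n}\mathrm{tr}\left(W_n\left(\sum_{h=0}^{\infty}B_n^{h}(\theta)\right)S_n^{-1}(\lambda)\right)\\[4pt]
0\\[4pt]
\frac{1}{n}\gamma\,\mathrm{tr}\left(G_n(\lambda)\left(\sum_{h=0}^{\infty}B_n^{h}(\theta)\right)S_n^{-1}(\lambda)\right)+\frac{1}{n}\rho\,\mathrm{tr}\left(G_n(\lambda)W_n\left(\sum_{h=0}^{\infty}B_n^{h}(\theta)\right)S_n^{-1}(\lambda)\right)+\frac{1}{n}\mathrm{tr}G_n(\lambda)\\[4pt]
\frac{1}{2\sigma^2}
\end{pmatrix}.
\]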
From Proposition B.2, for An(�) = (In��Wn)�1( In+�Wn) whereWn is diagonalizable asWn = RnD
�nR
�1n ,
we have that An(�) is diagonalizable as An(�) = RnDn(�)R�1n , with its eigenvalue matrix Dn(�) = (In �
�D�n)�1( In + �D
�n). As Bn(�) = Rn ~Dn(�)R
�1n with ~Dn(�) = Diag(0; � � � ; 0; dn;mn+1; � � � ; dnn) so that
Dn(�) = Jn + ~Dn(�) where Jn = Diagf10mn; 0; � � � ; 0g, we have
Bn(�) = Rn (Dn(�)� Jn)R�1n = Rn(In � �D�n)�1( In + �D
�n)R
�1n �RnJnR�1n .
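With $B_n(\theta)$ as an explicit function of $\theta$, $\frac{\partial B_n(\theta)}{\partial\theta'}$ can be easily evaluated. Because $\frac{\partial B_n^{h}(\theta)}{\partial\theta'} = hB_n^{h-1}(\theta)\frac{\partial B_n(\theta)}{\partial\theta'}$ for $h\geq 1$ (see footnote 9 in Yu, de Jong and Lee (2006)), we have $\sum_{h=1}^{\infty}\frac{\partial B_n^{h}(\theta)}{\partial\theta'} = \sum_{h=1}^{\infty}hB_n^{h-1}(\theta)\frac{\partial B_n(\theta)}{\partial\theta'}$. As (1) $\sum_{h=0}^{\infty}B_n^{h}(\theta)$ and $\sum_{h=1}^{\infty}hB_n^{h-1}(\theta)$ are uniformly bounded in either row sum or column sum, uniformly in a neighborhood of $\theta_0$, (2) $S_n^{-1}(\lambda)$ is uniformly bounded in both row and column sums, also uniformly in $\lambda$ in a neighborhood of $\lambda_0$, and (3) $W_n$ is uniformly bounded in both row and column sums, we have the result that the elements of $\frac{\partial a^{s}_{n}(\theta)}{\partial\theta'}$ will be uniformly bounded in $n$ in a neighborhood of $\theta_0$. As $\bar\theta_{nT}$ converges in probability to $\theta_0$, we conclude that the elements of $\frac{\partial a^{s}_{n}(\bar\theta_{nT})}{\partial\theta'}$ are $O_p(1)$.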
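The derivative rule for matrix powers used above relies on $B_n(\theta)$ commuting with its own derivative, which holds because both are functions of $W_n$. The sketch below checks the rule by finite differences for the derivative with respect to $\gamma$, and also checks the standard closed form $\sum_{h\geq 1}hB^{h-1} = (I-B)^{-2}$ for a matrix with spectral radius below one. It is an illustration under assumptions: for simplicity it uses a stable $A_n(\theta) = (I_n-\lambda W_n)^{-1}(\gamma I_n+\rho W_n)$ with $\gamma+\rho+\lambda<1$ as a hypothetical stand-in for $B_n(\theta)$, rather than the paper's $B_n$ with the unit-eigenvalue component removed.

```python
import numpy as np
from numpy.linalg import inv, matrix_power

rng = np.random.default_rng(3)
n = 5

# Hypothetical row-normalized weights matrix with zero diagonal.
W = rng.uniform(size=(n, n))
np.fill_diagonal(W, 0.0)
W /= W.sum(axis=1, keepdims=True)
I = np.eye(n)


def A(gamma, rho, lam):
    """A_n(theta) = (I - lam W)^{-1} (gamma I + rho W), used here in place of B_n(theta)."""
    return inv(I - lam * W) @ (gamma * I + rho * W)


gamma, rho, lam = 0.2, 0.1, 0.3   # gamma + rho + lam < 1: a stable stand-in
B = A(gamma, rho, lam)
dB_dgamma = inv(I - lam * W)      # derivative of A with respect to gamma

# d(B^h)/d(gamma) = h B^{h-1} dB/d(gamma), checked by central differences.
h, eps = 4, 1e-6
numeric = (matrix_power(A(gamma + eps, rho, lam), h)
           - matrix_power(A(gamma - eps, rho, lam), h)) / (2 * eps)
analytic = h * matrix_power(B, h - 1) @ dB_dgamma
print(np.allclose(numeric, analytic, atol=1e-7))

# sum_{h>=1} h B^{h-1} = (I - B)^{-2} when the spectral radius of B is below one.
series = sum(j * matrix_power(B, j - 1) for j in range(1, 200))
closed_form = inv(I - B) @ inv(I - B)
print(np.allclose(series, closed_form))
```

Because the series has a finite closed form whenever the spectral radius is below one, its row and column sums stay bounded, which is the property the argument needs in a neighborhood of $\theta_0$.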
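For (4.15), we can start from (4.11). Similarly, we can prove $\sqrt{\frac{n}{T}}\,T(c',1,0)\left(b_{\hat\theta_{nT},nT}-b_{\theta_0,nT}\right)\xrightarrow{p}0$. Hence, $\sqrt{nT^{3}}\,(c',1,0)(\hat\theta^{1}_{nT}-\theta_0)\xrightarrow{d} N\left(0,\ \lim_{T\to\infty}T^{2}(c',1,0)\left(\Sigma_{\theta_0,nT}^{-1}+\Sigma_{\theta_0,nT}^{-1}\Omega_{\theta_0,n}\Sigma_{\theta_0,nT}^{-1}\right)(c',1,0)'\right)$ under $\frac{n}{T^{3}}\to 0$, where $\lim_{T\to\infty}T^{2}(c',1,0)\Sigma_{\theta_0,nT}^{-1}(c',1,0)' = \lim_{T\to\infty}\omega_{nT}^{-1}$ using Proposition 2.1. ∎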
References
[1] Dhrymes, P. (1978), Mathematics for Econometrics, Springer-Verlag.
[2] Choi, I. (2004), Nonstationary Panels, Palgrave Handbooks of Econometrics, Vol. 1, forthcoming (Jul.,
2004).
[3] Horn, R. and C. Johnson (1985), Matrix Analysis, Cambridge University Press.
[4] Im, K.S., M.H. Pesaran, S. Shin (2003), Testing for Unit Roots in Heterogeneous Panels, Journal of
Econometrics, 115, 53-74
[5] Kelejian, H.H. and I.R. Prucha (1998), A Generalized Spatial Two-Stage Least Squares Procedure for
Estimating a Spatial Autoregressive Model with Autoregressive Disturbance, Journal of Real Estate
Finance and Economics, Vol. 17:1, 99-121.
[6] Kelejian, H.H. and I.R. Prucha (2001), On the Asymptotic Distribution of the Moran I Test Statistic
With Applications, Journal of Econometrics, 104, 219-257.
[7] Lee, L.F. (2001), Asymptotic Distributions of Quasi-Maximum Likelihood Estimators for Spatial Econometric Models I: Spatial Autoregressive Process, Working Paper, The Ohio State University.
[8] Lee, L.F. (2004), Asymptotic Distributions of Quasi-Maximum Likelihood Estimators for Spatial Econometric Models, Econometrica, Vol. 72, No. 6, 1899-1925.
[9] Levin, A., C-F. Lin and C-S.J. Chu (2002), Unit Root tests in Panel Data: Asymptotic and Finite
Sample Properties, Journal of Econometrics, 108, 1-24.
[10] Maddala, G.S. and S. Wu (1999), A Comparative Study of Unit Root Tests With Panel Data and A
New Simple Test, Oxford Bulletin of Economics and Statistics, 61, 631-652.
[11] Moon, H.R. and B. Perron (2004), Testing for A Unit Root in Panels With Dynamic Factors, Journal
of Econometrics, 122, 81-126.
[12] Ord, J.K. (1975), Estimation Methods for Models of Spatial Interaction, Journal of the American Statistical Association, 70, 120-126.
[13] Pesaran, M.H. (2003), A Simple Panel Unit Root Test in the Presence of Cross Section Dependence,
Mimeo, Trinity College, Cambridge.
[14] Phillips, P.C.B. and D. Sul (2003), Dynamic Panel Estimation and Homogeneity Testing Under Cross
Section Dependence, Econometrics Journal, 6, 217-259.
[15] Tao, J. (2006), Analyzing Local School Expenditure in A Dynamic Game, Working Paper, Shanghai
University of Finance and Economics.
[16] Rothenberg, T.J. (1971), Identification in Parametric Models, Econometrica, Vol. 39, No. 3, 577-591.
[17] Yu, J. (2006), Convergence: A Spatial Dynamic Panel Data Approach, Working Paper, The Ohio State
University.
[18] Yu, J., R. de Jong and L-F. Lee (2006), Quasi-Maximum Likelihood Estimators For Spatial Dynamic Panel Data With Fixed Effects When Both n and T Are Large, Working Paper, The Ohio State University.