Post on 01-Apr-2015
Part 11: Heterogeneity [ 1/36]
Econometric Analysis of Panel Data
William Greene
Department of Economics
Stern School of Business
Part 11: Heterogeneity [ 2/36]
Agenda
Random Parameter Models
    Fixed effects
    Random effects
Heterogeneity in Dynamic Panels
Random Coefficient Vectors: Classical vs. Bayesian
    General RPM
    Swamy/Hsiao/Hildreth/Houck
Hierarchical and “Two Step” Models
‘True’ Random Parameter Variation
    Discrete – Latent Class
    Continuous
        Classical
        Bayesian
Part 11: Heterogeneity [ 3/36]
A Capital Asset Pricing Model
$R_{it} = \beta_{0t} + \beta_{1t}\beta_i + \beta_{2t}\beta_i^2 + \beta_{3t}s_i + \varepsilon_{it}$
$R_{it}$ = one period percentage return
$\beta_{0t}$ = expected return on a riskless security (stochastic)
$\beta_{1t}$ = expected premium on the 'market' portfolio, $R_{Mt} - R_{0t}$
$\beta_{2t}$ = "nonlinear" risk effect
$\beta_{3t}$ = "nonbeta risk" term
Data are $[R_{it}, \beta_i, \beta_i^2, s_i]$, generated by auxiliary regressions.
Coefficients are 'random' through time.
Fama and MacBeth, "Risk, Return, and Equilibrium: Empirical Tests," Journal of Political Economy, 1973.
Part 11: Heterogeneity [ 4/36]
Heterogeneous Production Model
$\text{Health}_{i,t} = \alpha_i + \beta_i\,\text{HEXP}_{i,t} + \gamma_i\,\text{EDUC}_{i,t} + \varepsilon_{i,t}$
i = country, t = year
Health = health care outcome, e.g., life expectancy
HEXP = health care expenditure
EDUC = education
Parameter heterogeneity:
Discrete? AIDS dominated vs. QOL dominated
Continuous? Cross cultural heterogeneity
World Health Organization, "The 2000 World Health Report"
Part 11: Heterogeneity [ 5/36]
Parameter Heterogeneity
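Unobserved Effects: Random Constants
$y_{it} = c_i + \mathbf{x}_{it}'\boldsymbol{\beta} + \varepsilon_{it}$
$y_{it} = \alpha_i + \mathbf{x}_{it}'\boldsymbol{\beta} + \varepsilon_{it}$
$\alpha_i = \alpha + u_i$,
$E[u_i \mid \mathbf{X}_i] = 0$ --> Random effects
$E[u_i \mid \mathbf{X}_i] \neq 0$ --> Fixed effects
$E_X E[u_i \mid \mathbf{X}_i] = 0$.
$\text{Var}[u_i \mid \mathbf{X}_i]$ not yet defined - so far, constant.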
Part 11: Heterogeneity [ 6/36]
Parameter Heterogeneity
Generalize to Random Parameters
$y_{it} = \mathbf{x}_{it}'\boldsymbol{\beta}_i + \varepsilon_{it}$
$\boldsymbol{\beta}_i = \boldsymbol{\beta} + \mathbf{u}_i$
$E[\mathbf{u}_i \mid \mathbf{X}_i]$ zero or nonzero - to be defined
$E_X[E[\mathbf{u}_i \mid \mathbf{X}_i]] = \mathbf{0}$
$\text{Var}[\mathbf{u}_i \mid \mathbf{X}_i]$ to be defined, constant or variable
"The Pooling Problem": What is the consequence of estimating under the erroneous assumption of constant parameters? (Theil, 1960, "The Aggregation Problem") (Maddala, 1970s-1990s, "The Pooling Problem")
Part 11: Heterogeneity [ 7/36]
Fixed Effects (Hildreth, Houck, Hsiao, Swamy)
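$y_{it} = \mathbf{x}_{it}'\boldsymbol{\beta}_i + \varepsilon_{it}$, each observation
$\mathbf{y}_i = \mathbf{X}_i\boldsymbol{\beta}_i + \boldsymbol{\varepsilon}_i$, T observations
$\boldsymbol{\beta}_i = \boldsymbol{\beta} + \mathbf{u}_i$
Assume (temporarily) T > K.
$E[\mathbf{u}_i \mid \mathbf{X}_i] = g(\mathbf{X}_i)$ (conditional mean)
$P[\mathbf{u}_i \mid \mathbf{X}_i] = (\mathbf{X}_i - E[\mathbf{X}_i])\boldsymbol{\theta}$ (projection)
$E_X[E[\mathbf{u}_i \mid \mathbf{X}_i]] = E_X[P[\mathbf{u}_i \mid \mathbf{X}_i]] = \mathbf{0}$
$\text{Var}[\mathbf{u}_i \mid \mathbf{X}_i] = \boldsymbol{\Gamma}$, constant but nonzero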
Part 11: Heterogeneity [ 8/36]
OLS and GLS Are Inconsistent
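$\mathbf{y}_i = \mathbf{X}_i\boldsymbol{\beta}_i + \boldsymbol{\varepsilon}_i$, T observations
$\boldsymbol{\beta}_i = \boldsymbol{\beta} + \mathbf{u}_i$
$\mathbf{y}_i = \mathbf{X}_i\boldsymbol{\beta} + \mathbf{X}_i\mathbf{u}_i + \boldsymbol{\varepsilon}_i$, T observations
$\mathbf{y}_i = \mathbf{X}_i\boldsymbol{\beta} + \mathbf{w}_i$
$E[\mathbf{w}_i \mid \mathbf{X}_i] = \mathbf{X}_i E[\mathbf{u}_i \mid \mathbf{X}_i] + E[\boldsymbol{\varepsilon}_i \mid \mathbf{X}_i] \neq \mathbf{0}$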
Part 11: Heterogeneity [ 9/36]
Estimating the Fixed Effects Model
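$$\begin{bmatrix}\mathbf{y}_1\\ \mathbf{y}_2\\ \vdots\\ \mathbf{y}_N\end{bmatrix} = \begin{bmatrix}\mathbf{X}_1 & \mathbf{0} & \cdots & \mathbf{0}\\ \mathbf{0} & \mathbf{X}_2 & \cdots & \mathbf{0}\\ \vdots & \vdots & \ddots & \vdots\\ \mathbf{0} & \mathbf{0} & \cdots & \mathbf{X}_N\end{bmatrix}\begin{bmatrix}\boldsymbol{\beta}_1\\ \boldsymbol{\beta}_2\\ \vdots\\ \boldsymbol{\beta}_N\end{bmatrix} + \begin{bmatrix}\boldsymbol{\varepsilon}_1\\ \boldsymbol{\varepsilon}_2\\ \vdots\\ \boldsymbol{\varepsilon}_N\end{bmatrix}$$
Estimator: Equation by equation OLS or (F)GLS
Estimate $\boldsymbol{\beta}$? $\frac{1}{N}\sum_{i=1}^{N}\hat{\boldsymbol{\beta}}_i$ is consistent for $E[\boldsymbol{\beta}_i]$ in N.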
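The averaging step on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `mean_group` and the simulated panel (N groups, T observations each, K regressors, the particular parameter values) are hypothetical and are not part of the lecture.

```python
import numpy as np

def mean_group(y_groups, X_groups):
    """Equation-by-equation OLS, then the simple average of the b_i."""
    b = np.array([np.linalg.lstsq(X, y, rcond=None)[0]
                  for y, X in zip(y_groups, X_groups)])
    return b, b.mean(axis=0)

# Hypothetical simulated panel: beta_i = beta_bar + u_i for each group.
rng = np.random.default_rng(0)
N, T, K = 200, 20, 2
beta_bar = np.array([1.0, -0.5])
y_groups, X_groups = [], []
for _ in range(N):
    X = np.column_stack([np.ones(T), rng.normal(size=T)])
    beta_i = beta_bar + rng.normal(scale=0.3, size=K)
    X_groups.append(X)
    y_groups.append(X @ beta_i + rng.normal(scale=0.5, size=T))

b_i, b_bar = mean_group(y_groups, X_groups)
print(b_bar)  # should be near beta_bar when N is large
```

Each b_i is noisy because T is small, but the noise averages out across groups, which is why consistency is "in N".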
Part 11: Heterogeneity [ 10/36]
Partial Fixed Effects Model
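Some individual specific parameters
$\mathbf{y}_i = \mathbf{D}_i\boldsymbol{\alpha}_i + \mathbf{X}_i\boldsymbol{\beta} + \boldsymbol{\varepsilon}_i$, T observations
Use OLS and Frisch-Waugh
$\hat{\boldsymbol{\beta}} = \left[\sum_{i=1}^{N}\mathbf{X}_i'\mathbf{M}_D^i\mathbf{X}_i\right]^{-1}\left[\sum_{i=1}^{N}\mathbf{X}_i'\mathbf{M}_D^i\mathbf{y}_i\right]$, $\mathbf{M}_D^i = \mathbf{I} - \mathbf{D}_i(\mathbf{D}_i'\mathbf{D}_i)^{-1}\mathbf{D}_i'$
$\hat{\boldsymbol{\alpha}}_i = [\mathbf{D}_i'\mathbf{D}_i]^{-1}\mathbf{D}_i'(\mathbf{y}_i - \mathbf{X}_i\hat{\boldsymbol{\beta}})$
E.g., Individual specific time trends,
$y_{it} = \alpha_{i0} + \alpha_{i1}t + \mathbf{x}_{it}'\boldsymbol{\beta} + \varepsilon_{it}$; Detrend individual data, then OLS
E.g., Individual specific constant terms,
$y_{it} = \alpha_{i0} + \mathbf{x}_{it}'\boldsymbol{\beta} + \varepsilon_{it}$; Individual group mean deviations, then OLS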
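A minimal sketch of the Frisch-Waugh computation for the time trend example, where each D_i holds a constant and a trend. The function name `partial_fe` and the simulated data are assumptions made for illustration; they do not come from the lecture.

```python
import numpy as np

def partial_fe(y_groups, X_groups, D_groups):
    """beta from partialled-out data, then alpha_i from the group residuals."""
    K = X_groups[0].shape[1]
    XMX, XMy = np.zeros((K, K)), np.zeros(K)
    for y, X, D in zip(y_groups, X_groups, D_groups):
        # M_i = I - D_i (D_i'D_i)^{-1} D_i' removes the individual specific part
        M = np.eye(len(y)) - D @ np.linalg.solve(D.T @ D, D.T)
        XMX += X.T @ M @ X
        XMy += X.T @ M @ y
    beta = np.linalg.solve(XMX, XMy)
    alpha = np.array([np.linalg.solve(D.T @ D, D.T @ (y - X @ beta))
                      for y, X, D in zip(y_groups, X_groups, D_groups)])
    return beta, alpha

# Hypothetical data: individual constant and trend, one common slope of 2.0.
rng = np.random.default_rng(3)
N, T = 50, 10
D = np.column_stack([np.ones(T), np.arange(1.0, T + 1)])
y_groups, X_groups, D_groups = [], [], []
for _ in range(N):
    a = rng.normal(size=2)                      # (alpha_i0, alpha_i1)
    X = rng.normal(size=(T, 1)) + a[0]          # regressor correlated with alpha_i0
    y_groups.append(D @ a + 2.0 * X[:, 0] + rng.normal(scale=0.1, size=T))
    X_groups.append(X)
    D_groups.append(D)

beta_hat, alpha_hat = partial_fe(y_groups, X_groups, D_groups)
print(beta_hat)
```

With D_i reduced to a single column of ones, the same code computes the group mean deviations estimator.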
Part 11: Heterogeneity [ 11/36]
Heterogeneous Dynamic Models
$\log Y_{i,t} = \alpha_i + \gamma_i \log Y_{i,t-1} + \beta_i x_{it} + \varepsilon_{i,t}$
long run effect of interest is $\beta_i / (1 - \gamma_i)$
See:
Pesaran, H., Smith, R., Im, K., "Estimating Long-Run Relationships From Dynamic Heterogeneous Panels," Journal of Econometrics, 1995.
(Repeated with further study in Matyas and Sevestre, The Econometrics of Panel Data.)
Smith, J., notes, Applied Econometrics, Dynamic Panel Data Models, University of Warwick.
http://www2.warwick.ac.uk/fac/soc/economics/staff/faculty/jennifersmith/panel/
Weinhold, D., "A Dynamic "Fixed Effects" Model for Heterogeneous Panel Data," London School of Economics, 1999.
Part 11: Heterogeneity [ 12/36]
Random Effects and Random Parameters
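THE Random Parameters Model
$y_{it} = \mathbf{x}_{it}'\boldsymbol{\beta}_i + \varepsilon_{it}$, each observation
$\mathbf{y}_i = \mathbf{X}_i\boldsymbol{\beta}_i + \boldsymbol{\varepsilon}_i$, T observations
$\boldsymbol{\beta}_i = \boldsymbol{\beta} + \mathbf{u}_i$
Assume (temporarily) T > K.
$E[\mathbf{u}_i \mid \mathbf{X}_i] = \mathbf{0}$
$\text{Var}[\mathbf{u}_i \mid \mathbf{X}_i] = \boldsymbol{\Gamma}$, constant but nonzero
We differentiate the classical and Bayesian interpretations.
Randomness here is heterogeneity, not "uncertainty".
Bayesian approach to be considered later.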
Part 11: Heterogeneity [ 13/36]
Estimating the Random Parameters Model
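$\mathbf{y}_i = \mathbf{X}_i\boldsymbol{\beta}_i + \boldsymbol{\varepsilon}_i$, T observations
$\boldsymbol{\beta}_i = \boldsymbol{\beta} + \mathbf{u}_i$
$\mathbf{y}_i = \mathbf{X}_i\boldsymbol{\beta} + \mathbf{X}_i\mathbf{u}_i + \boldsymbol{\varepsilon}_i$, T observations
$\mathbf{y}_i = \mathbf{X}_i\boldsymbol{\beta} + \mathbf{w}_i$
$E[\mathbf{w}_i \mid \mathbf{X}_i] = \mathbf{X}_i E[\mathbf{u}_i \mid \mathbf{X}_i] + E[\boldsymbol{\varepsilon}_i \mid \mathbf{X}_i] = \mathbf{0}$
$\text{Var}[\mathbf{w}_i \mid \mathbf{X}_i] = \mathbf{X}_i\boldsymbol{\Gamma}\mathbf{X}_i' + \sigma^2_{\varepsilon,i}\mathbf{I}$ <== Should $\sigma^2_{\varepsilon,i}$ vary by i?
Objects of estimation: $\boldsymbol{\beta}$, $\boldsymbol{\Gamma}$, $\sigma^2_{\varepsilon,i}$
Second level estimation: $\boldsymbol{\beta}_i$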
Part 11: Heterogeneity [ 14/36]
Estimating the Random Parameters Model by OLS
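$\mathbf{y}_i = \mathbf{X}_i\boldsymbol{\beta}_i + \boldsymbol{\varepsilon}_i$, T observations
$\boldsymbol{\beta}_i = \boldsymbol{\beta} + \mathbf{u}_i$
$\mathbf{y}_i = \mathbf{X}_i\boldsymbol{\beta} + \mathbf{X}_i\mathbf{u}_i + \boldsymbol{\varepsilon}_i$, T observations
$\mathbf{y}_i = \mathbf{X}_i\boldsymbol{\beta} + \mathbf{w}_i$
$\mathbf{b} = \left[\sum_{i=1}^{N}\mathbf{X}_i'\mathbf{X}_i\right]^{-1}\left[\sum_{i=1}^{N}\mathbf{X}_i'\mathbf{y}_i\right] = \boldsymbol{\beta} + \left[\sum_{i=1}^{N}\mathbf{X}_i'\mathbf{X}_i\right]^{-1}\left[\sum_{i=1}^{N}\mathbf{X}_i'\mathbf{w}_i\right]$
$\text{Var}[\mathbf{b} \mid \mathbf{X}] = \left[\sum_{i=1}^{N}\mathbf{X}_i'\mathbf{X}_i\right]^{-1}\left[\sum_{i=1}^{N}\mathbf{X}_i'(\mathbf{X}_i\boldsymbol{\Gamma}\mathbf{X}_i' + \sigma^2_{\varepsilon,i}\mathbf{I})\mathbf{X}_i\right]\left[\sum_{i=1}^{N}\mathbf{X}_i'\mathbf{X}_i\right]^{-1}$
$= \left[\sum_{i=1}^{N}\mathbf{X}_i'\mathbf{X}_i\right]^{-1}\left[\sum_{i=1}^{N}\sigma^2_{\varepsilon,i}\mathbf{X}_i'\mathbf{X}_i\right]\left[\sum_{i=1}^{N}\mathbf{X}_i'\mathbf{X}_i\right]^{-1} + \left[\sum_{i=1}^{N}\mathbf{X}_i'\mathbf{X}_i\right]^{-1}\left[\sum_{i=1}^{N}(\mathbf{X}_i'\mathbf{X}_i)\boldsymbol{\Gamma}(\mathbf{X}_i'\mathbf{X}_i)\right]\left[\sum_{i=1}^{N}\mathbf{X}_i'\mathbf{X}_i\right]^{-1}$
= the usual + the variation due to the random parameters
Robust estimator
$\text{Est.Var}[\mathbf{b}] = \left[\sum_{i=1}^{N}\mathbf{X}_i'\mathbf{X}_i\right]^{-1}\left[\sum_{i=1}^{N}\mathbf{X}_i'\hat{\mathbf{w}}_i\hat{\mathbf{w}}_i'\mathbf{X}_i\right]\left[\sum_{i=1}^{N}\mathbf{X}_i'\mathbf{X}_i\right]^{-1}$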
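The robust estimator on this slide is a sandwich built from group level sums of X_i'ŵ_i. The sketch below is illustrative only: the function name `pooled_ols_robust` and the simulated random parameters panel are hypothetical, and the comparison with the conventional s²(X'X)⁻¹ is added here to show the "usual + variation due to the random parameters" decomposition at work.

```python
import numpy as np

def pooled_ols_robust(y_groups, X_groups):
    """Pooled OLS b and the sandwich estimator built from X_i' w_i-hat."""
    XX = sum(X.T @ X for X in X_groups)
    Xy = sum(X.T @ y for y, X in zip(y_groups, X_groups))
    b = np.linalg.solve(XX, Xy)
    meat = np.zeros_like(XX)
    for y, X in zip(y_groups, X_groups):
        g = X.T @ (y - X @ b)        # X_i' w_i-hat
        meat += np.outer(g, g)       # X_i' w_i-hat w_i-hat' X_i
    XX_inv = np.linalg.inv(XX)
    return b, XX_inv @ meat @ XX_inv

# Hypothetical random parameters panel.
rng = np.random.default_rng(1)
N, T = 200, 20
beta_bar = np.array([1.0, -0.5])
y_groups, X_groups = [], []
for _ in range(N):
    X = np.column_stack([np.ones(T), rng.normal(size=T)])
    beta_i = beta_bar + rng.normal(scale=0.3, size=2)
    X_groups.append(X)
    y_groups.append(X @ beta_i + rng.normal(scale=0.5, size=T))

b, V_robust = pooled_ols_robust(y_groups, X_groups)

# Conventional s^2 (X'X)^{-1}, which ignores the X_i u_i part of w_i.
X_all, y_all = np.vstack(X_groups), np.concatenate(y_groups)
e = y_all - X_all @ b
V_conv = (e @ e / (len(y_all) - 2)) * np.linalg.inv(X_all.T @ X_all)
print(np.sqrt(np.diag(V_robust)), np.sqrt(np.diag(V_conv)))
```

In this simulated design the conventional standard errors come out well below the robust ones, because they miss the within group correlation induced by u_i.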
Part 11: Heterogeneity [ 15/36]
Estimating the Random Parameters Model by GLS
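$\mathbf{y}_i = \mathbf{X}_i\boldsymbol{\beta}_i + \boldsymbol{\varepsilon}_i$, T observations
$\boldsymbol{\beta}_i = \boldsymbol{\beta} + \mathbf{u}_i$
$\mathbf{y}_i = \mathbf{X}_i\boldsymbol{\beta} + \mathbf{X}_i\mathbf{u}_i + \boldsymbol{\varepsilon}_i$, T observations
$\mathbf{y}_i = \mathbf{X}_i\boldsymbol{\beta} + \mathbf{w}_i$, $\text{Var}[\mathbf{w}_i \mid \mathbf{X}_i] = \boldsymbol{\Omega}_i = (\mathbf{X}_i\boldsymbol{\Gamma}\mathbf{X}_i' + \sigma^2_{\varepsilon,i}\mathbf{I})$
$\hat{\boldsymbol{\beta}} = \left[\sum_{i=1}^{N}\mathbf{X}_i'\boldsymbol{\Omega}_i^{-1}\mathbf{X}_i\right]^{-1}\left[\sum_{i=1}^{N}\mathbf{X}_i'\boldsymbol{\Omega}_i^{-1}\mathbf{y}_i\right]$
For FGLS, we need $\hat{\boldsymbol{\Gamma}}$ and $\hat{\sigma}^2_{\varepsilon,i}$.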
Part 11: Heterogeneity [ 16/36]
Estimating the RPM
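$\mathbf{b}_i = \boldsymbol{\beta} + (\mathbf{X}_i'\mathbf{X}_i)^{-1}\mathbf{X}_i'\mathbf{w}_i$, $\mathbf{w}_i = \mathbf{X}_i\mathbf{u}_i + \boldsymbol{\varepsilon}_i$
$\phantom{\mathbf{b}_i} = \boldsymbol{\beta} + \mathbf{u}_i + (\mathbf{X}_i'\mathbf{X}_i)^{-1}\mathbf{X}_i'\boldsymbol{\varepsilon}_i$
$\text{Var}[\mathbf{b}_i \mid \mathbf{X}_i] = \boldsymbol{\Gamma} + \sigma^2_{\varepsilon,i}(\mathbf{X}_i'\mathbf{X}_i)^{-1}$
$\hat{\sigma}^2_{\varepsilon,i} = \dfrac{\sum_{t=1}^{T}(y_{it} - \mathbf{x}_{it}'\mathbf{b}_i)^2}{T - K}$ is unbiased
(but not consistent because T is fixed).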
Part 11: Heterogeneity [ 17/36]
An Estimator for Γ
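$E[\mathbf{b}_i \mid \mathbf{X}_i] = \boldsymbol{\beta}$
$\text{Var}[\mathbf{b}_i \mid \mathbf{X}_i] = \boldsymbol{\Gamma} + \sigma^2_{\varepsilon,i}(\mathbf{X}_i'\mathbf{X}_i)^{-1}$
$\text{Var}[\mathbf{b}_i] = \text{Var}_X E[\mathbf{b}_i \mid \mathbf{X}_i] + E_X \text{Var}[\mathbf{b}_i \mid \mathbf{X}_i]$
$= \mathbf{0} + E_X[\boldsymbol{\Gamma} + \sigma^2_{\varepsilon,i}(\mathbf{X}_i'\mathbf{X}_i)^{-1}]$
$= \boldsymbol{\Gamma} + E_X[\sigma^2_{\varepsilon,i}(\mathbf{X}_i'\mathbf{X}_i)^{-1}]$
Estimate $\text{Var}[\mathbf{b}_i]$ with $\frac{1}{N}\sum_{i=1}^{N}(\mathbf{b}_i - \bar{\mathbf{b}})(\mathbf{b}_i - \bar{\mathbf{b}})'$
Estimate $E_X[\sigma^2_{\varepsilon,i}(\mathbf{X}_i'\mathbf{X}_i)^{-1}]$ with $\frac{1}{N}\sum_{i=1}^{N}\hat{\sigma}^2_{\varepsilon,i}(\mathbf{X}_i'\mathbf{X}_i)^{-1}$
$\hat{\boldsymbol{\Gamma}} = \frac{1}{N}\sum_{i=1}^{N}(\mathbf{b}_i - \bar{\mathbf{b}})(\mathbf{b}_i - \bar{\mathbf{b}})' - \frac{1}{N}\sum_{i=1}^{N}\hat{\sigma}^2_{\varepsilon,i}(\mathbf{X}_i'\mathbf{X}_i)^{-1}$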
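As a rough sketch of this estimator of Γ, the code below computes the between group variation of the b_i and subtracts the average estimated sampling variance, using the 1/N divisors from the slide. The function name `swamy_gamma` and the simulated panel are assumptions for illustration, not part of the lecture.

```python
import numpy as np

def swamy_gamma(y_groups, X_groups):
    """Gamma-hat = (1/N) sum (b_i - b_bar)(b_i - b_bar)' - (1/N) sum s_i^2 (X_i'X_i)^{-1}."""
    N = len(y_groups)
    b, V = [], []
    for y, X in zip(y_groups, X_groups):
        T, K = X.shape
        XX_inv = np.linalg.inv(X.T @ X)
        b_i = XX_inv @ X.T @ y
        e = y - X @ b_i
        s2_i = e @ e / (T - K)            # unbiased for sigma^2_{eps,i}
        b.append(b_i)
        V.append(s2_i * XX_inv)
    b = np.array(b)
    dev = b - b.mean(axis=0)
    return dev.T @ dev / N - sum(V) / N

# Hypothetical panel with a known Gamma.
rng = np.random.default_rng(4)
N, T = 500, 20
Gamma = np.diag([0.09, 0.09])
beta_bar = np.array([1.0, -0.5])
y_groups, X_groups = [], []
for _ in range(N):
    X = np.column_stack([np.ones(T), rng.normal(size=T)])
    beta_i = beta_bar + rng.multivariate_normal(np.zeros(2), Gamma)
    X_groups.append(X)
    y_groups.append(X @ beta_i + rng.normal(scale=0.5, size=T))

Gamma_hat = swamy_gamma(y_groups, X_groups)
print(Gamma_hat)
```

Because the result is a difference of two matrices, nothing forces it to be positive definite in a given sample.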
Part 11: Heterogeneity [ 18/36]
A Positive Definite Estimator for Γ
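$\hat{\boldsymbol{\Gamma}} = \frac{1}{N}\sum_{i=1}^{N}(\mathbf{b}_i - \bar{\mathbf{b}})(\mathbf{b}_i - \bar{\mathbf{b}})' - \frac{1}{N}\sum_{i=1}^{N}\hat{\sigma}^2_{\varepsilon,i}(\mathbf{X}_i'\mathbf{X}_i)^{-1}$
May not be positive definite. What to do?
(1) The second term converges (in theory) to 0 in T. Drop it.
(2) Various Bayesian "shrinkage" estimators,
(3) An ML estimator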
Part 11: Heterogeneity [ 19/36]
Estimating βi
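$\hat{\boldsymbol{\beta}}_{GLS} = \sum_{i=1}^{N}\mathbf{W}_i\mathbf{b}_{i,OLS}$
$\mathbf{W}_i = \left\{\sum_{i=1}^{N}[\boldsymbol{\Gamma} + \sigma^2_{\varepsilon,i}(\mathbf{X}_i'\mathbf{X}_i)^{-1}]^{-1}\right\}^{-1}[\boldsymbol{\Gamma} + \sigma^2_{\varepsilon,i}(\mathbf{X}_i'\mathbf{X}_i)^{-1}]^{-1}$
Best linear unbiased predictor based on GLS is
$\hat{\boldsymbol{\beta}}_i = \mathbf{A}_i\hat{\boldsymbol{\beta}}_{GLS} + (\mathbf{I} - \mathbf{A}_i)\mathbf{b}_{i,OLS} = \mathbf{b}_{i,OLS} + \mathbf{A}_i(\hat{\boldsymbol{\beta}}_{GLS} - \mathbf{b}_{i,OLS})$
$\mathbf{A}_i = \left\{\boldsymbol{\Gamma}^{-1} + [\sigma^2_{\varepsilon,i}(\mathbf{X}_i'\mathbf{X}_i)^{-1}]^{-1}\right\}^{-1}\boldsymbol{\Gamma}^{-1}$
$\text{Var}[\hat{\boldsymbol{\beta}}_i \mid \text{all data}] = \begin{bmatrix}\mathbf{A}_i & (\mathbf{I} - \mathbf{A}_i)\end{bmatrix}\text{Var}\begin{bmatrix}\hat{\boldsymbol{\beta}}_{GLS}\\ \mathbf{b}_{i,OLS}\end{bmatrix}\begin{bmatrix}\mathbf{A}_i'\\ (\mathbf{I} - \mathbf{A}_i)'\end{bmatrix}$
$\text{Var}\begin{bmatrix}\hat{\boldsymbol{\beta}}_{GLS}\\ \mathbf{b}_{i,OLS}\end{bmatrix} = \begin{bmatrix}\text{Var}[\hat{\boldsymbol{\beta}}_{GLS}] & \mathbf{W}_i\text{Var}[\mathbf{b}_{i,OLS}]\\ \text{Var}[\mathbf{b}_{i,OLS}]\mathbf{W}_i' & \text{Var}[\mathbf{b}_{i,OLS}]\end{bmatrix}$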
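The matrix weighted average and the predictor of β_i can be sketched as follows, taking Γ and V_i = σ²(X_i'X_i)⁻¹ as given. The function name `gls_and_blup` and the small set of input numbers are hypothetical and serve only to exercise the formulas; in practice Γ and V_i would be replaced by estimates.

```python
import numpy as np

def gls_and_blup(b, V, Gamma):
    """b: (N,K) OLS b_i; V: (N,K,K) with V_i = sigma_i^2 (X_i'X_i)^{-1}; Gamma: (K,K)."""
    P = np.linalg.inv(Gamma + V)                       # [Gamma + V_i]^{-1} for each i
    W = np.linalg.inv(P.sum(axis=0)) @ P               # W_i = {sum_j P_j}^{-1} P_i
    beta_gls = np.einsum('ijk,ik->j', W, b)            # sum_i W_i b_i
    G_inv = np.linalg.inv(Gamma)
    A = np.linalg.inv(G_inv + np.linalg.inv(V)) @ G_inv   # A_i
    beta_i = b + np.einsum('ijk,ik->ij', A, beta_gls - b) # b_i + A_i (beta_gls - b_i)
    return beta_gls, W, A, beta_i

# Hypothetical inputs.
rng = np.random.default_rng(2)
N, K = 6, 2
Gamma = np.diag([0.09, 0.04])
b = rng.normal(size=(N, K))
V = np.array([s * np.eye(K) for s in rng.uniform(0.01, 0.2, size=N)])

beta_gls, W, A, beta_i = gls_and_blup(b, V, Gamma)
print(beta_gls)
print(beta_i)
```

The predictor pulls each b_i toward the GLS estimate, and more strongly when V_i is large relative to Γ, that is, when the group's own estimate is imprecise.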
Part 11: Heterogeneity [ 20/36]
Baltagi and Griffin’s Gasoline Data
World Gasoline Demand Data, 18 OECD Countries, 19 years
Variables in the file are
COUNTRY = name of country
YEAR = year, 1960-1978
LGASPCAR = log of consumption per car
LINCOMEP = log of per capita income
LRPMG = log of real price of gasoline
LCARPCAP = log of per capita number of cars
See Baltagi (2001, p. 24) for analysis of these data. The article on which the analysis is based is Baltagi, B. and Griffin, J., "Gasoline Demand in the OECD: An Application of Pooling and Testing Procedures," European Economic Review, 22, 1983, pp. 117-137. The data were downloaded from the website for Baltagi's text.
Part 11: Heterogeneity [ 21/36]
OLS and FGLS Estimates
+----------------------------------------------------+
| Overall OLS results for pooled sample.             |
| Residuals    Sum of squares       =  14.90436      |
|              Standard error of e  =  .2099898      |
| Fit          R-squared            =  .8549355      |
+----------------------------------------------------+
+---------+--------------+----------------+--------+---------+
|Variable | Coefficient  | Standard Error |b/St.Er.|P[|Z|>z] |
+---------+--------------+----------------+--------+---------+
 Constant     2.39132562      .11693429     20.450    .0000
 LINCOMEP      .88996166      .03580581     24.855    .0000
 LRPMG        -.89179791      .03031474    -29.418    .0000
 LCARPCAP     -.76337275      .01860830    -41.023    .0000
+------------------------------------------------+
| Random Coefficients Model                      |
| Residual standard deviation       =    .3498   |
| R squared                         =    .5976   |
| Chi-squared for homogeneity test  = 22202.43   |
| Degrees of freedom                =       68   |
| Probability value for chi-squared =  .000000   |
+------------------------------------------------+
+---------+--------------+----------------+--------+---------+
|Variable | Coefficient  | Standard Error |b/St.Er.|P[|Z|>z] |
+---------+--------------+----------------+--------+---------+
 CONSTANT     2.40548802      .55014979      4.372    .0000
 LINCOMEP      .39314902      .11729448      3.352    .0008
 LRPMG        -.24988767      .04372201     -5.715    .0000
 LCARPCAP     -.44820927      .05416460     -8.275    .0000
Part 11: Heterogeneity [ 22/36]
Country Specific Estimates
Part 11: Heterogeneity [ 23/36]
Estimated Γ
Part 11: Heterogeneity [ 24/36]
Two Step Estimation (Saxonhouse)
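A Fixed Effects Model
$y_{it} = \alpha_i + \mathbf{x}_{it}'\boldsymbol{\beta} + \varepsilon_{it}$
Secondary Model
$\alpha_i = \mathbf{z}_i'\boldsymbol{\delta}$
Two approaches
(1) Reduced form is a linear model with time constant $\mathbf{z}_i$
$y_{it} = \mathbf{x}_{it}'\boldsymbol{\beta} + \mathbf{z}_i'\boldsymbol{\delta} + \varepsilon_{it}$
(2) Two step
(a) FEM at step 1
(b) $a_i = \alpha_i + (a_i - \alpha_i) = \mathbf{z}_i'\boldsymbol{\delta} + v_i$
$\text{Var}[v_i] = \sigma^2_{\varepsilon}\left[\dfrac{1}{T} + \bar{\mathbf{x}}_i'(\mathbf{X}'\mathbf{M}_D\mathbf{X})^{-1}\bar{\mathbf{x}}_i\right]$
Use weighted least squares regression of $a_i$ on $\mathbf{z}_i$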
Part 11: Heterogeneity [ 25/36]
A Hierarchical Model
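Fixed Effects Model
$y_{it} = \alpha_i + \mathbf{x}_{it}'\boldsymbol{\beta} + \varepsilon_{it}$
Secondary Model
$\alpha_i = \mathbf{z}_i'\boldsymbol{\delta} + u_i$ <========
Two approaches
(1) Reduced form is an REM with time constant $\mathbf{z}_i$
$y_{it} = \mathbf{x}_{it}'\boldsymbol{\beta} + \mathbf{z}_i'\boldsymbol{\delta} + u_i + \varepsilon_{it}$
(2) Two step
(a) FEM at step 1
(b) $a_i = \alpha_i + (a_i - \alpha_i) = \mathbf{z}_i'\boldsymbol{\delta} + u_i + v_i$
$\text{Var}[u_i + v_i] = \sigma^2_u + \sigma^2_{\varepsilon}\left[\dfrac{1}{T} + \bar{\mathbf{x}}_i'(\mathbf{X}'\mathbf{M}_D\mathbf{X})^{-1}\bar{\mathbf{x}}_i\right]$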
Part 11: Heterogeneity [ 26/36]
Analysis of Fannie Mae
Fannie Mae
The Funding Advantage
The Pass Through
Passmore, W., Sherlund, S., Burgess, G., “The Effect of Housing Government-Sponsored Enterprises on Mortgage Rates,” 2005, Federal Reserve Board and Real Estate Economics
Part 11: Heterogeneity [ 27/36]
Two Step Analysis of Fannie-Mae
Fannie Mae's GSE Funding Advantage and Pass Through
$RM_{i,s,t} = \beta^0_{s,t} + \beta^1_{s,t}(LTV) + \beta^2_{s,t}\,Small_{i,s,t} + \beta^3_{s,t}\,Fees_{i,s,t} + \beta^4_{s,t}\,New_{i,s,t} + \beta^5_{s,t}\,MtgCo_{i,s,t} + \alpha_{s,t}\,J_{i,s,t} + \varepsilon_{i,s,t}$
i,s,t = individual, state, month
1,036,252 observations in 370 state,months.
RM = mortgage rate
LTV = 3 dummy variables for loan to value
Small = dummy variable for small loan
Fees = dummy variable for whether fees paid up front
New = dummy variable for new home
MtgCo = dummy variable for mortgage company
J = dummy variable for whether this is a JUMBO loan
THIS IS THE COEFFICIENT OF INTEREST: $\alpha_{s,t}$, the coefficient on J.
Part 11: Heterogeneity [ 28/36]
Average of 370 First Step Regressions
Symbol  Variable     Mean   S.D.   Coeff  S.E.
RM      Rate %       7.23   0.79
J       Jumbo        0.06   0.23   0.16   0.05
LTV1    75%-80%      0.36   0.48   0.04   0.04
LTV2    81%-90%      0.15   0.35   0.17   0.05
LTV3    >90%         0.22   0.41   0.15   0.04
New     New Home     0.17   0.38   0.05   0.04
Small   < $100,000   0.27   0.44   0.14   0.04
Fees    Fees paid    0.62   0.52   0.06   0.03
MtgCo   Mtg. Co.     0.67   0.47   0.12   0.05
R2 = 0.77
Part 11: Heterogeneity [ 29/36]
Second Step
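$\alpha_{s,t} = \beta_0$
+ $\beta_1$ (GSE Funding Advantage)$_{s,t}$ - estimated separately
+ $\beta_2$ (Risk free cost of credit)$_{s,t}$
+ $\beta_3$ (Corporate debt spreads)$_{s,t}$ - estimated 4 different ways
+ $\beta_4$ (Prepayment spread)$_{s,t}$
+ $\beta_5$ (Maturity mismatch risk)$_{s,t}$
+ $\beta_6$ (Aggregate Demand)$_{s,t}$
+ $\beta_7$ (Long term interest rate)$_{s,t}$
+ $\beta_8$ (Market Capacity)$_{s,t}$
+ $\beta_9$ (Time trend)$_{s,t}$
+ $\beta_{10-13}$ (4 dummy variables for CA, NJ, MD, VA)$_{s,t}$
+ $\beta_{14-16}$ (3 dummy variables for calendar quarters)$_{s,t}$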
Part 11: Heterogeneity [ 30/36]
Estimates of β1
Second step based on 370 observations. Corrected for
"heteroscedasticity, autocorrelation, and monthly clustering."
Four estimates based on different estimates of corporate
credit spread:
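0.07 (0.11)   0.31 (0.11)   0.17 (0.10)   0.10 (0.11)
Reconcile the 4 estimates with a minimum distance estimator
Minimize $\begin{bmatrix}(\hat{\beta}_1^1 - \beta_1), & (\hat{\beta}_1^2 - \beta_1), & (\hat{\beta}_1^3 - \beta_1), & (\hat{\beta}_1^4 - \beta_1)\end{bmatrix}\boldsymbol{\Omega}^{-1}\begin{bmatrix}(\hat{\beta}_1^1 - \beta_1)\\ (\hat{\beta}_1^2 - \beta_1)\\ (\hat{\beta}_1^3 - \beta_1)\\ (\hat{\beta}_1^4 - \beta_1)\end{bmatrix}$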
Estimated mortgage rate reduction: About 7 basis points. .07%.
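Because the four estimates share a single scalar $\beta_1$, the minimum distance problem above has a closed form. The following is a sketch under stated assumptions: the symbols $\hat{\mathbf{b}}$ (the 4×1 vector of estimates) and $\mathbf{i}$ (a 4×1 column of ones) are introduced here for illustration and do not appear on the slide, and $\boldsymbol{\Omega}$ is treated as known.

```latex
\begin{aligned}
q(\beta_1) &= (\hat{\mathbf{b}} - \mathbf{i}\beta_1)'\,\boldsymbol{\Omega}^{-1}(\hat{\mathbf{b}} - \mathbf{i}\beta_1) \\
\frac{\partial q}{\partial \beta_1} &= -2\,\mathbf{i}'\boldsymbol{\Omega}^{-1}(\hat{\mathbf{b}} - \mathbf{i}\beta_1) = 0 \\
\hat{\beta}_{1,MD} &= \left(\mathbf{i}'\boldsymbol{\Omega}^{-1}\mathbf{i}\right)^{-1}\mathbf{i}'\boldsymbol{\Omega}^{-1}\hat{\mathbf{b}},
\qquad
\text{Var}[\hat{\beta}_{1,MD}] = \left(\mathbf{i}'\boldsymbol{\Omega}^{-1}\mathbf{i}\right)^{-1}
\end{aligned}
```

The estimator is a weighted average of the four estimates whose weights sum to one. Since the four estimates come from the same 370 observations, $\boldsymbol{\Omega}$ would generally have nonzero off-diagonal elements, so the weights are not simply proportional to the inverse squared standard errors.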
Part 11: Heterogeneity [ 31/36]
A Hierarchical Linear Model
German Health Data
$\text{Hsat}_{it} = \beta_1 + \beta_2\,\text{AGE}_{it} + \gamma_i\,\text{EDUC}_{it} + \beta_4\,\text{MARRIED}_{it} + \varepsilon_{it}$
$\gamma_i = \alpha_1 + \alpha_2\,\text{FEMALE}_i + u_i$
Sample ; all $
Reject ; _Groupti < 7 $
Regress ; Lhs = newhsat ; Rhs = one,age,educ,married
        ; RPM = female ; Fcn = educ(n)
        ; pts = 25 ; halton ; pds = _groupti ; Parameters $
Sample ; 1 - 887 $
Create ; betaeduc = beta_i $
Dstat ; rhs = betaeduc $
Histogram ; Rhs = betaeduc $
Part 11: Heterogeneity [ 32/36]
OLS Results
OLS Starting values for random parameters model...
Ordinary     least squares regression ............
LHS=NEWHSAT  Mean                 =        6.69641
             Standard deviation   =        2.26003
             Number of observs.   =           6209
Model size   Parameters           =              4
             Degrees of freedom   =           6205
Residuals    Sum of squares       =    29671.89461
             Standard error of e  =        2.18676
Fit          R-squared            =         .06424
             Adjusted R-squared   =         .06378
Model test   F[  3,  6205] (prob) =   142.0(.0000)
--------+---------------------------------------------------------
        |                  Standard            Prob.      Mean
 NEWHSAT|  Coefficient       Error       z    z>|Z|       of X
--------+---------------------------------------------------------
Constant|    7.02769***      .22099    31.80  .0000
     AGE|    -.04882***      .00307   -15.90  .0000     44.3352
 MARRIED|     .29664***      .07701     3.85  .0001      .84539
    EDUC|     .14464***      .01331    10.87  .0000     10.9409
--------+---------------------------------------------------------
Part 11: Heterogeneity [ 33/36]
Maximum Simulated Likelihood
Normal exit:  27 iterations. Status=0. F=    12584.28
------------------------------------------------------------------
Random Coefficients  LinearRg Model
Dependent variable              NEWHSAT
Log likelihood function    -12583.74717
Estimation based on N =   6209, K =   7
Unbalanced panel has    887 individuals
LINEAR regression model
Simulation based on  25 Halton draws
--------+---------------------------------------------------------
        |                  Standard            Prob.      Mean
 NEWHSAT|  Coefficient       Error       z    z>|Z|       of X
--------+---------------------------------------------------------
        |Nonrandom parameters
Constant|    7.34576***      .15415    47.65  .0000
     AGE|    -.05878***      .00206   -28.56  .0000     44.3352
 MARRIED|     .23427***      .05034     4.65  .0000      .84539
        |Means for random parameters
    EDUC|     .16580***      .00951    17.43  .0000     10.9409
        |Scale parameters for dists. of random parameters
    EDUC|    1.86831***      .00179  1044.68  .0000
        |Heterogeneity in the means of random parameters
cEDU_FEM|    -.03493***      .00379    -9.21  .0000
        |Variance parameter given is sigma
Std.Dev.|    1.58877***      .00954   166.45  .0000
--------+---------------------------------------------------------
Part 11: Heterogeneity [ 34/36]
“Individual Coefficients”
--> Sample ; 1 - 887 $
--> create ; betaeduc = beta_i $
--> dstat ; rhs = betaeduc $
Descriptive Statistics
All results based on nonmissing observations.
==============================================================================
Variable      Mean       Std.Dev.     Minimum      Maximum     Cases  Missing
==============================================================================
All observations in current sample
--------+---------------------------------------------------------------------
BETAEDUC|   .161184      .132334     -.268006      .506677       887        0
[Figure: histogram of BETAEDUC; vertical axis Frequency, horizontal axis BETAEDUC from -.268 to .507]
Part 11: Heterogeneity [ 35/36]
A Hierarchical Linear Model
A hedonic model of house values
Beron, K., Murdoch, J., Thayer, M., “Hierarchical Linear Models with Application to Air Pollution in the South Coast Air Basin,” American Journal of Agricultural Economics, 81, 5, 1999.
Part 11: Heterogeneity [ 36/36]
HLM
$y_{ijk}$ = log of home sale price i, neighborhood j, community k.
$y_{ijk} = \sum_{m=1}^{M}\beta^m_{jk}x^m_{ijk} + \varepsilon_{ijk}$ (linear regression model)
$x^m_{ijk}$ = sq.ft, #baths, lot size, central heat, AC, pool, good view, age, distance to beach
Random coefficients
$\beta^m_{jk} = \sum_{q=1}^{Q}\alpha^{qm}_j N^q_{jk} + w^m_{jk}$
$N^q_{jk}$ = %population poor, race mix, avg age, avg. travel to work, FBI crime index, school avg. CA achievement test score
$\alpha^{qm}_j = \sum_{s=1}^{S}\gamma^{qm}_s E^s_j + v^{qm}_j$
$E^s_j$ = air quality measure, visibility