
regression and causality

FE and IV models: regression and causality

The Experimental Ideal

Fundamentals of Regression Analysis

IV and Causality

Reading Material

Introduction to Econometrics (2nd ed.), by J. H. Stock and M. W. Watson (2007).

Mostly Harmless Econometrics: An Empiricist’s Companion, by J. D. Angrist and J.-S. Pischke (2009).

Instruments, Randomization, and Learning about Development, by A. Deaton (2010). Journal of Economic Literature, Vol. 48.

Seminar in Econometrics, autumn term (2015 – 2016), ULB. Instructor: Paulo Santos Monteiro.

the experimental ideal

To undertake a successful empirical research project you must start with the following four questions:

1. What is the causal relationship of interest?

2. What would be the ideal experimental setting? More often than not, this is a Platonic concept...

3. What is your identification strategy? How are you going to make use of observational data to approximate an ideal experiment?

4. What is your mode of statistical inference? The answer to this question describes the population to be studied, the sample to be used, and the assumptions made when constructing standard errors.

Lunchtime homework: come up with a research question and go through each of these questions. We will discuss it this afternoon.

but what is an ideal experiment?

Consider the following research question, from Grumbach et al. (1993):

Is the use of emergency hospital care for nonemergency conditions among the elderly clinically appropriate?

To answer the question, the authors use survey data from the US National Health Interview Survey (NHIS).

group          sample size    mean health    std. error
hospital             7,774           3.21         0.014
no hospital         90,049           3.93         0.003

The difference in means is large and significant.
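The claim can be checked directly from the table. The sketch below computes the difference in means and a t-statistic, under the assumption (made here, not stated on the slide) that the two groups are independent samples, so that the variances of the two means add.

```python
import math

# Figures taken from the NHIS table above.
mean_hosp, se_hosp = 3.21, 0.014
mean_no_hosp, se_no_hosp = 3.93, 0.003

# Difference in mean health status between the two groups.
diff = mean_hosp - mean_no_hosp

# Standard error of the difference, assuming independent samples.
se_diff = math.sqrt(se_hosp**2 + se_no_hosp**2)

# t-statistic for the null hypothesis of equal means.
t_stat = diff / se_diff

print(f"difference = {diff:.2f}, s.e. = {se_diff:.4f}, t = {t_stat:.1f}")
```

The t-statistic is far beyond any conventional critical value, which is what "large and significant" refers to. It says nothing yet about causality.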

selection bias

Does this result immediately imply that emergency treatment is bad for the health of the elderly?

Maybe not: adverse selection may bias the results ⇒ selection bias.

Suppose that we describe hospital treatment as a binary random variable $D_i \in \{0, 1\}$, and that the health outcome for individual $i$ is denoted $Y_i$.

For each individual $i$ there are two potential outcomes:

$$ \text{potential outcome} = \begin{cases} Y_{1i} & \text{if } D_i = 1 \\ Y_{0i} & \text{if } D_i = 0 \end{cases} $$

The causal effect of hospital treatment on individual $i$ is $Y_{1i} - Y_{0i}$.

But only one of the two potential outcomes is observed, $Y_i$, for each $i$...

The observed outcome can be expressed as

$$ Y_i = Y_{0i} + \underbrace{(Y_{1i} - Y_{0i})}_{\text{causal effect}} D_i . $$

We cannot expect to find the (causal) treatment effect for each individual, which is very likely heterogeneous over the population. But we may hope to recover the average treatment effect.

$$ \underbrace{E[Y_i \mid D_i = 1] - E[Y_i \mid D_i = 0]}_{\text{observed difference in average health}} = \underbrace{E[Y_{1i} \mid D_i = 1] - E[Y_{0i} \mid D_i = 1]}_{\text{average treatment effect on the treated}} + \underbrace{E[Y_{0i} \mid D_i = 1] - E[Y_{0i} \mid D_i = 0]}_{\text{selection bias}} $$

If, for example, the sick are more likely to be hospitalized, the selection bias will be negative ⇒ there are unobserved confounding factors.
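The decomposition can be illustrated with simulated data, where both potential outcomes are visible. The sketch below is a hypothetical setup, not taken from the slides: the treatment effect of 0.5, the health distribution, and the selection rule are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical potential outcomes: y0 is health without treatment,
# and treatment raises health by 0.5 for everyone (an assumed effect).
y0 = rng.normal(loc=3.5, scale=1.0, size=n)
y1 = y0 + 0.5

# Selection: individuals in poor health (low y0) are more likely to be treated.
p_treat = 1.0 / (1.0 + np.exp(2.0 * (y0 - 3.0)))
d = rng.random(n) < p_treat

# Observed outcome: Y = Y0 + (Y1 - Y0) * D.
y = np.where(d, y1, y0)

observed_diff = y[d].mean() - y[~d].mean()
att = (y1[d] - y0[d]).mean()
selection_bias = y0[d].mean() - y0[~d].mean()

print(f"observed difference: {observed_diff:.3f}")
print(f"ATT:                 {att:.3f}")
print(f"selection bias:      {selection_bias:.3f}")
```

The observed difference equals the sum of the other two terms by construction. With this selection rule the bias is negative and large enough to flip the sign of a positive treatment effect.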

random assignment to solve the selection bias

Random assignment of treatment $D_i$ solves the selection bias because it makes the treatment independent of the potential outcomes.

To see this, simply notice that if $D_i$ is randomly assigned, then

$$ E[Y_{0i} \mid D_i = 0] = E[Y_{0i} \mid D_i = 1] . $$

But then we have that

$$ \underbrace{E[Y_i \mid D_i = 1] - E[Y_i \mid D_i = 0]}_{\text{observed difference in average health}} = E[Y_{1i} \mid D_i = 1] - E[Y_{0i} \mid D_i = 0] = \underbrace{E[Y_{1i} \mid D_i = 1] - E[Y_{0i} \mid D_i = 1]}_{\text{average treatment effect on the treated}} . $$

With random assignment there are no confounding factors!
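A small simulation makes the same point. This is a sketch under assumed values (a constant treatment effect of 0.5 and a fair coin flip for assignment), neither of which comes from the slides.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Hypothetical potential outcomes with an assumed constant effect of 0.5.
y0 = rng.normal(loc=3.5, scale=1.0, size=n)
y1 = y0 + 0.5

# Random assignment: a coin flip that ignores y0 and y1.
d = rng.random(n) < 0.5
y = np.where(d, y1, y0)

# With random assignment the untreated outcome is balanced across groups,
# so the observed difference recovers the treatment effect.
selection_bias = y0[d].mean() - y0[~d].mean()
observed_diff = y[d].mean() - y[~d].mean()

print(f"selection bias:      {selection_bias:.3f}")
print(f"observed difference: {observed_diff:.3f}")
```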

But random assignment is often difficult in microeconomics and outright impossible in macroeconomics...

Nevertheless, we hope to find natural or quasi-experiments that mimic a randomized trial by changing the variable of interest independently of all other confounding factors.

The first question to ask about a randomized experiment is:

Did randomization successfully balance subjects’ characteristics across the treatment and control groups?

To assess this, it is common to look at pre-treatment outcomes or other covariates across groups.
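A balance check of this kind can be sketched as a two-sample t-statistic on a pre-treatment covariate. The helper name `balance_t_stat`, the covariate `age`, and both assignment rules below are hypothetical and only serve the illustration.

```python
import numpy as np

def balance_t_stat(x, d):
    """t-statistic for equal means of covariate x in treatment and control."""
    x_t, x_c = x[d], x[~d]
    diff = x_t.mean() - x_c.mean()
    se = np.sqrt(x_t.var(ddof=1) / x_t.size + x_c.var(ddof=1) / x_c.size)
    return diff / se

rng = np.random.default_rng(2)
n = 10_000
age = rng.normal(40.0, 10.0, size=n)               # hypothetical pre-treatment covariate
d_random = rng.random(n) < 0.5                     # randomized treatment
d_selected = age + rng.normal(0.0, 10.0, n) > 45   # treatment that depends on age

print("randomized:", balance_t_stat(age, d_random))
print("selected:  ", balance_t_stat(age, d_selected))
```

Under randomization the statistic behaves like a draw from a standard normal distribution. When treatment depends on the covariate, the imbalance shows up as a very large statistic.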

an example: Krueger (1999), STAR study.

experiments as regression analysis

The experimental design can usefully be described using regression analysis.

Suppose first that the treatment effect is the same for everyone,

$$ Y_{1i} - Y_{0i} = \rho, \quad \text{for all } i, $$

or in other words, there are no heterogeneous treatment effects.

Then we have that

$$ Y_i = Y_{0i} + (Y_{1i} - Y_{0i}) D_i = \alpha + \rho D_i + \eta_i , $$

with $\alpha = E[Y_{0i}]$ and $\eta_i = Y_{0i} - E[Y_{0i}]$.

The conditional expectations of the regression equation, conditional on the binary treatment $D_i$, are

$$ E[Y_i \mid D_i = 1] = \alpha + \rho + E[\eta_i \mid D_i = 1] $$

$$ E[Y_i \mid D_i = 0] = \alpha + E[\eta_i \mid D_i = 0] $$

and, hence, we have that

$$ E[Y_i \mid D_i = 1] - E[Y_i \mid D_i = 0] = \underbrace{\rho}_{\text{treatment effect}} + \underbrace{E[\eta_i \mid D_i = 1] - E[\eta_i \mid D_i = 0]}_{\text{selection bias}} $$

Thus, there is selection bias if there’s a correlation between the regressor $D_i$ and the error term $\eta_i$. This is a familiar result!
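The link between the regression on a treatment dummy and the difference in group means can be verified numerically. The following sketch uses simulated data with assumed values $\alpha = 1$ and $\rho = 2$; these numbers are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5_000
d = (rng.random(n) < 0.4).astype(float)
y = 1.0 + 2.0 * d + rng.normal(size=n)   # assumed alpha = 1, rho = 2

# OLS of y on a constant and the treatment dummy.
X = np.column_stack([np.ones(n), d])
alpha_hat, rho_hat = np.linalg.lstsq(X, y, rcond=None)[0]

# The slope coincides with the difference in group means,
# and the intercept with the mean of the untreated group.
diff_in_means = y[d == 1].mean() - y[d == 0].mean()

print(rho_hat, diff_in_means)
```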

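recalling the OLS estimator, $\hat\beta_{ols} = \left( \sum_i X_i X_i' \right)^{-1} \sum_i X_i Y_i$

The observed outcomes are $Y_i = X_i'\beta + \underbrace{(Y_i - X_i'\beta)}_{e_i}$, and the OLS estimator is unbiased if

$$ E\left[ \left( \sum_i X_i X_i' \right)^{-1} \sum_i X_i Y_i \right] = \beta, $$

$$ E\left[ \left( \sum_i X_i X_i' \right)^{-1} \sum_i X_i \left( X_i'\beta + (Y_i - X_i'\beta) \right) \right] = \beta, $$

$$ E\left[ \left( \sum_i X_i X_i' \right)^{-1} \sum_i X_i e_i \right] = 0, \quad \rightarrow \quad E[e_i \mid X_i] = 0. $$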

In the experimental context, recall

$$ \eta_i = Y_{0i} - E[Y_{0i}] $$

and, hence, we have that

$$ \text{selection bias} = E[\eta_i \mid D_i = 1] - E[\eta_i \mid D_i = 0] = E[Y_{0i} \mid D_i = 1] - E[Y_{0i} \mid D_i = 0] . $$

Thus, selection bias exists if there are differences in the potential outcome in the absence of treatment between individuals who receive treatment and individuals who do not.

an example: Krueger (1999), STAR study continued.

TABLE V (CONTINUED)

                                 OLS: actual class size           Reduced form: initial class size
Explanatory variable            (1)      (2)      (3)      (4)      (5)      (6)      (7)      (8)

C. Second grade

Small class                    5.93     6.33     5.83     5.79     5.31     5.52     5.27     5.26
                             (1.97)   (1.29)   (1.23)   (1.23)   (1.70)   (1.16)   (1.10)   (1.10)
Regular/aide class             1.97     1.88     1.64     1.58      .47     1.44     1.16     1.18
                             (2.05)   (1.10)   (1.07)   (1.06)   (1.23)   (0.87)   (0.81)   (0.81)
White/Asian (1 = yes)                            6.35     6.36                       6.27     6.29
                                               (1.20)   (1.19)                     (1.21)   (1.20)
Girl (1 = yes)                                   3.48     3.45                       3.48     3.44
                                                (.60)    (.60)                      (.60)    (.60)
Free lunch (1 = yes)                           -13.61   -13.61                     -13.75   -13.77
                                                (.72)    (.72)                      (.73)    (.73)
White teacher                                              .39                                 .43
                                                        (1.75)                              (1.76)
Male teacher                                              1.32                                 .82
                                                        (3.96)                              (4.23)
Teacher experience                                         .10                                 .10
                                                         (.06)                               (.07)
Master's degree                                          -1.06                               -1.16
                                                        (1.06)                              (1.05)
School fixed effects             No      Yes      Yes      Yes       No      Yes      Yes      Yes
R^2                             .01      .22      .28      .28      .01      .21      .28      .28

You may want to expand the list of covariates in your regression analysis of experimental data beyond the random assignment dummy variable $D_i$, and estimate the long regression

$$ Y_i = \alpha + \rho D_i + X_i'\gamma + \eta_i . $$

Covariates play two roles in regression analysis of experimental data:

1. The assignment of treatment may be random within groups but not across groups. Then you want to include group fixed effects. For example, in Krueger’s paper there are school fixed effects.

2. To verify that the covariates are balanced across treated and untreated observations. If this is the case, then the additional control variables must be uncorrelated with $D_i$, and the long and short regressions should yield roughly the same estimate for $\rho$. Inclusion of these covariates should yield more precise estimates → smaller standard errors.
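The second point can be illustrated by comparing a short and a long regression on simulated experimental data. In the sketch below the helper `ols`, the covariate `x`, and the coefficient values are assumptions made for illustration, and the standard errors are the conventional homoskedastic ones.

```python
import numpy as np

def ols(X, y):
    """Return OLS coefficients and conventional (homoskedastic) standard errors."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    sigma2 = resid @ resid / (X.shape[0] - X.shape[1])
    se = np.sqrt(sigma2 * np.diag(XtX_inv))
    return beta, se

rng = np.random.default_rng(4)
n = 5_000
d = (rng.random(n) < 0.5).astype(float)            # randomly assigned treatment
x = rng.normal(size=n)                             # covariate, independent of d
y = 1.0 + 0.5 * d + 2.0 * x + rng.normal(size=n)   # assumed rho = 0.5

const = np.ones(n)
beta_short, se_short = ols(np.column_stack([const, d]), y)
beta_long, se_long = ols(np.column_stack([const, d, x]), y)

print("short: rho =", beta_short[1], "s.e. =", se_short[1])
print("long:  rho =", beta_long[1], "s.e. =", se_long[1])
```

Because `x` is unrelated to `d`, both regressions estimate roughly the same $\rho$. The long regression absorbs part of the residual variance, so its standard error on $\rho$ is smaller.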

fundamentals of regression analysis

Random assignment is rare in the social sciences. Most studies are carried out on observational data, and without random assignment regression estimates may or may not have a causal interpretation.

What are the universal features of regression analysis? To answer such a question, we begin by introducing the notion of conditional expectation.

Even when the relationship between two variables is not causal, one variable may help predict the other in a statistical sense. This predictive power is summarized by the conditional expectation function (CEF).

The CEF for a variable $Y_i$ given a list of covariates $X_i$ is

$$ E[Y_i \mid X_i] = \int y \, f(y \mid X_i) \, dy , $$

with $f(y \mid X_i)$ the probability density function of $Y_i$ conditional on $X_i$.

the CEF of earnings.

the CEF decomposition theorem

For any population $(Y_i, X_i)$ the CEF has the property that

$$ Y_i = E[Y_i \mid X_i] + \varepsilon_i , $$

with $E[\varepsilon_i \mid X_i] = 0$, and $\varepsilon_i$ uncorrelated with any function of $X_i$, $h(X_i)$.

To prove this result, simply write $\varepsilon_i = Y_i - E[Y_i \mid X_i]$, so that

$$ E[\varepsilon_i \mid X_i] = E\left[ Y_i - E[Y_i \mid X_i] \mid X_i \right] = E[Y_i \mid X_i] - E[Y_i \mid X_i] = 0, $$

and also $E[\varepsilon_i h(X_i)] = E\left[ E[\varepsilon_i \mid X_i] \, h(X_i) \right] = 0$, where we’ve used the law of iterated expectations.
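With a discrete covariate the sample CEF is just the mean of $Y_i$ within each cell of $X_i$, and the decomposition property holds exactly in sample. The sketch below uses a hypothetical schooling variable and an assumed nonlinear CEF; neither is taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 100_000

# Hypothetical discrete covariate (years of schooling) and a nonlinear CEF.
x = rng.integers(8, 17, size=n)            # values 8, ..., 16
y = 0.02 * x**2 + rng.normal(size=n)

# Sample CEF: the mean of y within each cell of x, mapped back to observations.
cef = np.bincount(x, weights=y)[x] / np.bincount(x)[x]

# CEF residual.
eps = y - cef

# The residual is orthogonal to any function of x, not only linear ones.
print(np.mean(eps * x))
print(np.mean(eps * np.log(x)))
print(np.mean(eps * x**2))
```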

the CEF best prediction property

The CEF decomposition theorem implies that the CEF is the best predictor of $Y_i$ given $X_i$ in the sense that it solves the minimum mean squared error (MMSE) problem.

Thus, the CEF prediction has the property that

$$ E[Y_i \mid X_i] = \arg\min_{m(X_i)} E\left[ (Y_i - m(X_i))^2 \right] . $$

To prove this result, write

$$ (Y_i - m(X_i))^2 = \left[ (Y_i - E[Y_i \mid X_i]) + (E[Y_i \mid X_i] - m(X_i)) \right]^2 $$

$$ = \underbrace{(Y_i - E[Y_i \mid X_i])^2}_{\text{is not dependent on } m(X_i)} + \underbrace{2 \varepsilon_i (E[Y_i \mid X_i] - m(X_i))}_{\text{is } \varepsilon_i h(X_i) \text{ and so has expectation } 0} + \underbrace{(E[Y_i \mid X_i] - m(X_i))^2}_{\text{is minimized at } m(X_i) = E[Y_i \mid X_i]} . $$

linear regression and the CEF

Consider the population linear least squares problem

$$ \beta = \arg\min_b E\left[ (Y_i - X_i'b)^2 \right] . $$

The first-order condition solving this problem is

$$ E\left[ X_i (Y_i - X_i'b) \right] = 0 $$

and, thus, the solution to this problem is $\beta = E[X_i X_i']^{-1} E[X_i Y_i]$.

A natural upshot from the CEF decomposition theorem is that if the CEF is linear, then the population regression function is the CEF.

To show this result, simply notice that if the CEF is linear, then (for some $\beta^\star$) $Y_i - X_i'\beta^\star = \varepsilon_i$. But then,

$$ E\left[ X_i (Y_i - X_i'\beta^\star) \right] = E[\varepsilon_i X_i] = 0, $$

from the CEF decomposition theorem and, hence, $\beta = \beta^\star$.

Even if the CEF is not linear, the population least squares projection $X_i'\beta$ provides the MMSE linear approximation to the true CEF, so that

$$ \beta = \arg\min_b E\left[ (E[Y_i \mid X_i] - X_i'b)^2 \right] . $$

To prove this result, write

$$ (Y_i - X_i'b)^2 = \left[ (Y_i - E[Y_i \mid X_i]) + (E[Y_i \mid X_i] - X_i'b) \right]^2 $$

$$ = \underbrace{(Y_i - E[Y_i \mid X_i])^2}_{\text{is not dependent on } b} + (E[Y_i \mid X_i] - X_i'b)^2 + \underbrace{2 (Y_i - E[Y_i \mid X_i]) (E[Y_i \mid X_i] - X_i'b)}_{\text{is } \varepsilon_i h(X_i) \text{ and so has expectation } 0 \text{ for all } b} , $$

and, hence, taking expectations, $\beta$ must be the arg min of $E\left[ (E[Y_i \mid X_i] - X_i'b)^2 \right]$, since $\beta$ is defined as the arg min of $E\left[ (Y_i - X_i'b)^2 \right]$.
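In sample, this result means that regressing $Y_i$ on $X_i$ and regressing the cell means of $Y_i$ on $X_i$ give the same coefficients. The sketch below checks this with a discrete covariate and an assumed CEF, $E[Y_i \mid X_i] = \sqrt{X_i}$, chosen only because it is visibly nonlinear.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 50_000
x = rng.integers(0, 10, size=n)            # discrete covariate, values 0, ..., 9
y = np.sqrt(x) + rng.normal(size=n)        # assumed nonlinear CEF: E[Y|X] = sqrt(X)

# Sample CEF: cell means of y for each value of x, mapped back to observations.
cef = np.bincount(x, weights=y)[x] / np.bincount(x)[x]

X = np.column_stack([np.ones(n), x])
beta_y = np.linalg.lstsq(X, y, rcond=None)[0]      # regress Y on X
beta_cef = np.linalg.lstsq(X, cef, rcond=None)[0]  # regress E[Y|X] on X

print(beta_y)
print(beta_cef)
```

The two coefficient vectors agree up to floating-point error, because the CEF residual is orthogonal to every function of `x`, including the regressors.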

asymptotic OLS inference

In practice, we don’t know the CEF or the population regression vector $\beta$. Thus, we must make statistical inferences about these moments using finite samples.

The sample analog of the population regression vector $\beta$ is

$$ \hat\beta_{ols} = \left( \sum_i X_i X_i' \right)^{-1} \sum_i X_i Y_i , $$

obtained by replacing expectations $E[\,\cdot\,]$ with sample moments $N^{-1} \sum_i$. This is the OLS estimator.

Thus, the OLS estimator is a method of moments estimator.

By the law of large numbers, sample moments converge in probability to the population moments, and by the central limit theorem (CLT), sample moments are asymptotically normally distributed.
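The method of moments reading of OLS can be made concrete by building the estimator from sample moments and comparing it with a standard least squares routine. The data-generating values below are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 2_000
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
beta_true = np.array([1.0, -0.5, 2.0])     # assumed population coefficients
y = X @ beta_true + rng.normal(size=n)

# Sample moments replacing E[X X'] and E[X Y].
Sxx = X.T @ X / n
Sxy = X.T @ y / n

# Method of moments estimator: solve the sample first-order condition.
beta_mm = np.linalg.solve(Sxx, Sxy)

# Least squares solution from NumPy, for comparison.
beta_ls = np.linalg.lstsq(X, y, rcond=None)[0]

print(beta_mm)
print(beta_ls)
```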

Slutsky’s theorem

Let $a_N$ be a statistic with an asymptotic distribution and $b_N$ a statistic with the probability limit $b$. Then

1. the statistics $a_N + b_N$ and $a_N + b$ have the same asymptotic distribution.

2. the statistics $a_N b_N$ and $a_N b$ have the same asymptotic distribution.

the Delta method

Let $b_N$ be a vector-valued random variable that is asymptotically normally distributed with $\operatorname{plim} b_N = b$, and $h(\cdot)$ a continuous, differentiable scalar function. The asymptotic distribution of $\sqrt{N} \, [h(b_N) - h(b)]$ is normal with variance $\nabla h(b)' \, \Omega \, \nabla h(b)$, where $\Omega$ is the asymptotic covariance matrix of $b_N$.
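As an illustration of the Delta method, the sketch below takes $h$ to be a ratio of two means and compares the Delta-method variance with a Monte Carlo estimate. The population means, the identity covariance matrix, and the sample sizes are all assumptions made for this example.

```python
import numpy as np

def h(b):
    """Scalar function of the moment vector: a ratio of two means."""
    return b[..., 0] / b[..., 1]

b = np.array([2.0, 4.0])                   # assumed population means
omega = np.eye(2)                          # assumed asymptotic covariance of b_N

# Gradient of h at b and the Delta-method variance.
grad = np.array([1.0 / b[1], -b[0] / b[1] ** 2])
delta_var = grad @ omega @ grad

# Monte Carlo check: many samples of size N, each giving one b_N.
rng = np.random.default_rng(8)
N, reps = 400, 5_000
draws = b + rng.normal(size=(reps, N, 2))
b_N = draws.mean(axis=1)                   # shape (reps, 2)
mc_var = np.var(np.sqrt(N) * (h(b_N) - h(b)))

print("Delta method:", delta_var)
print("Monte Carlo: ", mc_var)
```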

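We make use of Slutsky’s theorem and the CLT to derive the asymptotic distribution of the OLS estimator

$$ \hat\beta_{ols} = \left( \sum_i X_i X_i' \right)^{-1} \sum_i X_i Y_i = \beta + \left( \sum_i X_i X_i' \right)^{-1} \sum_i X_i e_i . $$

The asymptotic distribution of

$$ \sqrt{N} \left( \hat\beta_{ols} - \beta \right) = \left( \frac{\sum_i X_i X_i'}{N} \right)^{-1} \frac{1}{\sqrt{N}} \sum_i X_i e_i $$

is the same as that of $E[X_i X_i']^{-1} \left( 1/\sqrt{N} \right) \sum_i X_i e_i$, by Slutsky’s theorem and, by the CLT, $\left( 1/\sqrt{N} \right) \sum_i X_i e_i$ is asymptotically normally distributed with covariance matrix $E\left[ X_i X_i' e_i^2 \right]$.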

heteroscedasticity consistent standard errors (White’s std. errors)

$\hat\beta_{ols}$ is asymptotically normally distributed, with probability limit $\beta$, and the asymptotic covariance matrix of $\sqrt{N} (\hat\beta_{ols} - \beta)$ is

$$ E\left[ X_i X_i' \right]^{-1} E\left[ X_i X_i' e_i^2 \right] E\left[ X_i X_i' \right]^{-1} . $$

The theoretical standard errors used to construct t-statistics are the square roots of the diagonal elements of this matrix, divided by $N$ (White, 1980).

In practice, these standard errors are estimated by substituting $e_i$ with the estimated residuals $\hat e_i = Y_i - X_i'\hat\beta_{ols}$ and using the sample moments (by replacing the expectations with sums).

These standard errors are robust as they require a minimal set of assumptions about the data and the model.
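The sandwich formula translates almost line by line into code. The sketch below implements the basic version without finite-sample corrections (often labelled HC0); the function name `ols_robust` and the heteroscedastic data-generating process are assumptions for illustration.

```python
import numpy as np

def ols_robust(X, y):
    """OLS coefficients with heteroscedasticity-consistent (White) standard errors."""
    n = X.shape[0]
    beta = np.linalg.solve(X.T @ X, X.T @ y)
    e_hat = y - X @ beta
    Sxx_inv = np.linalg.inv(X.T @ X / n)          # sample analog of E[X X']^{-1}
    meat = (X * e_hat[:, None] ** 2).T @ X / n    # sample analog of E[X X' e^2]
    avar = Sxx_inv @ meat @ Sxx_inv               # sandwich formula
    se = np.sqrt(np.diag(avar) / n)
    return beta, se

rng = np.random.default_rng(9)
n = 5_000
x = rng.normal(size=n)
e = rng.normal(size=n) * (0.5 + np.abs(x))        # error variance depends on x
y = 1.0 + 2.0 * x + e                             # assumed intercept 1, slope 2
X = np.column_stack([np.ones(n), x])

beta_hat, se_robust = ols_robust(X, y)
print(beta_hat, se_robust)
```

Statistical packages usually offer variants with degrees-of-freedom adjustments, so their output may differ slightly from this plain version in small samples.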

IV and causality

Suppose that potential outcomes can be written

$$ Y_{si} \equiv f_i(s) , $$

and that $f_i(s) = \alpha + \rho s + \eta_i$.

Assume also that $\eta_i = A_i'\gamma + \nu_i$, with $E[S_i \nu_i] = 0$.

Then, the long regression equation

$$ Y_i = \alpha + \rho S_i + A_i'\gamma + \nu_i $$

is a linear causal model, given that $Y_{si} \perp S_i \mid A_i$, which corresponds to the conditional independence assumption (CIA).

But how can we estimate the coefficient $\rho$ when $A_i$ is unobserved?

instrumental variables

If we have access to a variable $Z_i$ (the instrumental variable) that is

1. relevant: correlated with the variable of interest $S_i$,

2. but satisfies the exclusion restriction: is uncorrelated with any other determinant of the dependent variable,

then

$$ \rho = \frac{\overbrace{\operatorname{cov}(Y_i, Z_i) / V(Z_i)}^{\text{reduced form regression}}}{\underbrace{\operatorname{cov}(S_i, Z_i) / V(Z_i)}_{\text{first stage regression}}} , $$

and the IV estimator is the sample analog of this expression.

Where to find good instruments? This is an art and not a science.

As an example, Angrist and Krueger (1991) use variation in compulsory schooling laws to estimate the causal effect of schooling on earnings.
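The ratio of the reduced form to the first stage can be computed directly from sample covariances. The simulation below is a hypothetical setup, not the Angrist and Krueger data: the binary instrument, the unobserved confounder `a`, and the true $\rho = 0.1$ are assumptions.

```python
import numpy as np

rng = np.random.default_rng(10)
n = 100_000

a = rng.normal(size=n)                              # unobserved confounder (e.g. ability)
z = (rng.random(n) < 0.5).astype(float)             # hypothetical binary instrument
s = 10.0 + 1.0 * z + 1.0 * a + rng.normal(size=n)   # schooling depends on z and a
y = 1.0 + 0.1 * s + 0.5 * a + rng.normal(size=n)    # assumed rho = 0.1

def cov(u, v):
    """Sample covariance of two arrays."""
    return np.mean((u - u.mean()) * (v - v.mean()))

reduced_form = cov(y, z) / cov(z, z)
first_stage = cov(s, z) / cov(z, z)
rho_iv = reduced_form / first_stage

# OLS of y on s, for comparison: biased because s is correlated with a.
rho_ols = cov(y, s) / cov(s, s)

print("IV: ", rho_iv)
print("OLS:", rho_ols)
```

In this setup OLS overstates the effect because the confounder raises both schooling and the outcome, while the IV ratio stays close to the assumed value.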

an example: Angrist and Krueger (1991), compulsory schooling study.

the Two-Stage Least Squares (2SLS) estimator

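Consider the first-stage and the reduced-form regression equations

$$ S_i = X_i'\pi_{10} + \pi_{11} Z_i + \xi_{1i} , $$

$$ Y_i = X_i'\pi_{20} + \pi_{21} Z_i + \xi_{2i} . $$

The reduced-form equation is derived by substituting the first-stage equation into the structural equation. We have

$$ \begin{aligned} Y_i &= X_i'\alpha + \rho S_i + \eta_i \\ &= X_i'\alpha + \rho \left( X_i'\pi_{10} + \pi_{11} Z_i + \xi_{1i} \right) + \eta_i \\ &= X_i'(\alpha + \rho\pi_{10}) + \rho\pi_{11} Z_i + \rho\xi_{1i} + \eta_i \\ &= X_i'\pi_{20} + \pi_{21} Z_i + \xi_{2i} , \end{aligned} $$

so that $\pi_{20} = \alpha + \rho\pi_{10}$, $\pi_{21} = \rho\pi_{11}$ and $\xi_{2i} = \rho\xi_{1i} + \eta_i$.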

Define the sample first-stage fitted value of $S_i$ as

$$ \hat S_i = X_i'\hat\pi_{10} + \hat\pi_{11} Z_i , $$

then the 2SLS estimator can be obtained by OLS estimation of the “second-stage equation”,

$$ Y_i = \alpha'X_i + \rho \hat S_i + \left[ \eta_i + \rho \left( S_i - \hat S_i \right) \right] . $$

The 2SLS estimator is consistent because the covariates $X_i$ and the fitted values $\hat S_i$ are uncorrelated with both $\eta_i$ and $\left( S_i - \hat S_i \right)$.

But the standard errors resulting from the OLS estimation of the “second-stage equation” will not be right, as the first stage is ignored...
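The two steps can be sketched with two least squares fits. The simulated covariate, instrument, and coefficient values below are assumptions for illustration. The last lines compare the naive second-stage residuals with the residuals built from the actual $S_i$, which are the ones needed for valid standard errors.

```python
import numpy as np

def ols_fit(X, y):
    """Least squares coefficients of y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

rng = np.random.default_rng(11)
n = 50_000
x = rng.normal(size=n)                     # exogenous covariate
z = rng.normal(size=n)                     # hypothetical instrument
a = rng.normal(size=n)                     # unobserved confounder
s = 0.5 * x + 0.8 * z + a + rng.normal(size=n)
y = 1.0 + 0.3 * x + 0.1 * s + 0.5 * a + rng.normal(size=n)  # assumed rho = 0.1

const = np.ones(n)

# First stage: regress S on the covariates and the instrument, keep fitted values.
first_stage_X = np.column_stack([const, x, z])
s_hat = first_stage_X @ ols_fit(first_stage_X, s)

# Second stage: regress Y on the covariates and the fitted values.
V = np.column_stack([const, x, s_hat])
beta_2sls = ols_fit(V, y)

# Residuals for inference use the actual S, not the fitted values.
W = np.column_stack([const, x, s])
eta_hat = y - W @ beta_2sls
naive_resid = y - V @ beta_2sls

print("rho (2SLS):", beta_2sls[2])
print("residual variance, actual S:", eta_hat.var())
print("residual variance, fitted S:", naive_resid.var())
```

The two residual variances differ, which is why standard errors read off the second-stage OLS output are not the right ones.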

asymptotic 2SLS inference

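Let $V_i \equiv \left[ X_i', \hat S_i \right]'$ denote the vector of regressors in the 2SLS second stage.

The 2SLS estimator is given by

$$ \begin{aligned} \hat\beta_{2sls} &= \left( \sum_i V_i V_i' \right)^{-1} \sum_i V_i Y_i \\ &= \beta + \left( \sum_i V_i V_i' \right)^{-1} \sum_i V_i \left[ \eta_i + \rho \left( S_i - \hat S_i \right) \right] \\ &= \beta + \left( \sum_i V_i V_i' \right)^{-1} \sum_i V_i \eta_i . \end{aligned} $$

Notice that $\sum_i V_i \rho \left( S_i - \hat S_i \right) = 0$, as $S_i - \hat S_i$ is orthogonal to $V_i$ in sample!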

Thus, analogously to the OLS estimator, the 2SLS estimator has an asymptotic distribution with probability limit $\beta$ and covariance

$$ E\left[ V_i V_i' \right]^{-1} E\left[ V_i V_i' \eta_i^2 \right] E\left[ V_i V_i' \right]^{-1} , $$

the familiar sandwich formula.

overidentification and the GMM estimator

Constant effects models with more instruments than endogenous variables are said to be overidentified.

With more instruments than needed to identify the parameters of interest, overidentified models allow for a process of specification testing.

Let $Z_i = \left[ X_i', z_{1i}, \ldots, z_{qi} \right]'$ denote the $(k + q) \times 1$ vector of exogenous covariates and instrumental variables, and $W_i = \left[ X_i', S_i \right]'$ the $(k + 1) \times 1$ vector of exogenous covariates and the endogenous variable.

The coefficient vector is $\beta = [\alpha', \rho]'$ and the residual from the second-stage regression is

$$ \eta_i(\beta) = Y_i - \beta'W_i = Y_i - \left[ \alpha'X_i + \rho S_i \right] . $$

The exclusion restriction requires $E[Z_i \eta_i(\beta)] = 0$.

The sample analog of the population moment restrictions $E[Z_i \eta_i(\beta)] = 0$ is given by

$$ m_N(\beta) = \frac{1}{N} \sum_i Z_i \eta_i(\beta) = 0 . $$

But in sample, these conditions will never hold exactly, as there are $(k + q)$ equations and only $(k + 1)$ unknowns.

The instrumental variables estimator can be implemented as a generalized method of moments (GMM) estimator that chooses the value $\hat\beta_{gmm}$ that makes $m_N(\hat\beta_{gmm})$ as close as possible to 0.

By the CLT, the sample moment vector $\sqrt{N} \, m_N(\beta)$ has the asymptotic covariance matrix equal to $\Lambda = E\left[ Z_i Z_i' \eta_i(\beta)^2 \right]$.

The GMM estimator is the minimizer of the quadratic form

$$ J_N(\hat\beta) = N \, m_N(\hat\beta)' \, \Lambda^{-1} \, m_N(\hat\beta) . $$

In the homoscedastic case, $E\left[ Z_i Z_i' \eta_i^2 \right] = \sigma_\eta^2 E[Z_i Z_i']$, we get $\hat\beta_{gmm} = \hat\beta_{2sls}$.

the overidentification test statistic

We have seen that with more instruments than endogenous variables, the sample moment vector will not be exactly 0. But we can test if it is close enough to 0. This is the basis for the test of overidentification, a model specification test.

Under the null hypothesis that the model is well specified, the minimum of $J_N(\hat\beta)$ has, asymptotically, a $\chi^2_{q-1}$ distribution.

We can therefore compare the empirical value of the GMM minimand with the chi-square table, in a formal test (Hansen, 1982) of

$$ H_0 : E[Z_i \eta_i(\beta)] = 0 . $$

In the homoscedastic case, $E\left[ Z_i Z_i' \eta_i^2 \right] = \sigma_\eta^2 E[Z_i Z_i']$, the minimized 2SLS minimand is $N \times R^2$, with $R^2$ taken from the regression of the 2SLS residuals on the instruments. This is the Hausman test (Hausman, 1983).
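The $N \times R^2$ version of the statistic is straightforward to compute. The sketch below simulates a model with two valid instruments and one endogenous variable, so $q - 1 = 1$; all variable names and coefficient values are assumptions. With one degree of freedom the chi-square tail probability has a closed form, which avoids a dependency on a statistics library.

```python
import math
import numpy as np

def ols_fit(X, y):
    """Least squares coefficients of y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

rng = np.random.default_rng(12)
n = 20_000
z1 = rng.normal(size=n)                    # two hypothetical instruments (q = 2)
z2 = rng.normal(size=n)
a = rng.normal(size=n)                     # unobserved confounder
s = 0.7 * z1 + 0.5 * z2 + a + rng.normal(size=n)
y = 1.0 + 0.1 * s + 0.5 * a + rng.normal(size=n)   # assumed rho = 0.1

const = np.ones(n)
Z = np.column_stack([const, z1, z2])       # exogenous covariates and instruments

# 2SLS: first-stage fitted values, then the second stage.
s_hat = Z @ ols_fit(Z, s)
beta_2sls = ols_fit(np.column_stack([const, s_hat]), y)

# 2SLS residuals are built from the actual S.
eta_hat = y - np.column_stack([const, s]) @ beta_2sls

# Regress the residuals on the instruments and compute N * R^2.
fitted = Z @ ols_fit(Z, eta_hat)
r2 = 1.0 - np.sum((eta_hat - fitted) ** 2) / np.sum((eta_hat - eta_hat.mean()) ** 2)
stat = max(n * r2, 0.0)                    # guard against tiny negative rounding error

# For 1 degree of freedom: P(chi2_1 > stat) = erfc(sqrt(stat / 2)).
p_value = math.erfc(math.sqrt(stat / 2.0))

print("N * R^2 =", stat, "p-value =", p_value)
```

A small p-value would be evidence against the joint validity of the instruments. The test cannot tell which instrument fails, and it has no power if all instruments are invalid in the same direction.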

bibliography

Angrist, J. D. and A. B. Krueger (1991). Does compulsory school attendance affect schooling and earnings? The Quarterly Journal of Economics 106(4), 979–1014.

Grumbach, K., D. Keane, and A. Bindman (1993). Primary care and public emergency department overcrowding. American Journal of Public Health 83(3), 372–378.

Hansen, L. P. (1982). Large sample properties of generalized method of moments estimators. Econometrica 50(4), 1029–1054.

Hausman, J. A. (1983). Specification and estimation of simultaneous equation models. Handbook of Econometrics 1, 391–448.

Krueger, A. B. (1999). Experimental estimates of education production functions. The Quarterly Journal of Economics 114(2), 497–532.

White, H. (1980). A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica 48(4), 817–838.
