Instrumental Variables I


Objective

• We are trying to learn the effect of education on income.
• We have Card (1993)’s data on years of schooling, wages, proximity to a four-year college, and various other controls.
• We will obtain OLS and IV estimates of the returns to education and discuss any problems, both in this particular context and in general.

OLS Results

. reg lwage educ exper expersq black smsa smsa66 south reg66*, robust

Linear regression                                      Number of obs =    3010
                                                       F( 15,  2994) =   91.31
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.2998
                                                       Root MSE      =  .37228

------------------------------------------------------------------------------
             |               Robust
       lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   .0746933   .0036462    20.48   0.000     .0675439    .0818427
       exper |    .084832   .0067548    12.56   0.000     .0715875    .0980765
     expersq |   -.002287   .0003194    -7.16   0.000    -.0029133   -.0016608
       black |  -.1990123   .0181644   -10.96   0.000    -.2346282   -.1633964
        smsa |   .1363845   .0192172     7.10   0.000     .0987042    .1740648
      smsa66 |   .0262417   .0185908     1.41   0.158    -.0102102    .0626937
       south |   -.147955   .0280346    -5.28   0.000     -.202924    -.092986
      reg661 |  -.1405174   .0451252    -3.11   0.002     -.228997   -.0520378
      reg662 |  -.0441502   .0372945    -1.18   0.237    -.1172756    .0289751
……
------------------------------------------------------------------------------

Are you surprised? What is the OLS identification assumption? What sources of bias are likely to be present? In which direction are they likely to bias our estimates?
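One way to organise the last two questions is the textbook bias expression below. It is a sketch under simplifying assumptions that are not on the slide: a single regressor x (education), an outcome y (log wage), an error e, and the other controls already partialled out.

```latex
y = \beta x + e
\qquad\Longrightarrow\qquad
\operatorname{plim}\,\hat\beta_{OLS}
  = \frac{\operatorname{cov}(x, y)}{\operatorname{var}(x)}
  = \beta + \frac{\operatorname{cov}(x, e)}{\operatorname{var}(x)}
```

Under this sketch OLS recovers β only when cov(x, e) = 0, and the sign of the bias is the sign of cov(x, e). For example, an omitted factor that raises both schooling and wages gives cov(x, e) > 0 and pushes the OLS estimate upward, while classical measurement error in schooling pushes it toward zero.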

What do we require for an instrument to be valid?

1. Relevance: cov(z, x) ≠ 0
– Important because if the instrument isn’t correlated with the endogenous variable, then knowing the value of the instrument doesn’t tell us anything about the endogenous variable.
– Do we care about the unconditional correlation or the correlation conditional on the other controls? Why?
– Can we test this? How?

2. Exogeneity: cov(z, e) = 0
– Important because we want the instrument to affect y only through x.
– Can we test this? If not, what do we do instead?
– How does this assumption relate to the key OLS identification assumption?
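The role of each condition can be seen in a short derivation. This is a sketch for the simplest case, assuming a single endogenous regressor x, a single instrument z, an outcome y = βx + e (the symbol y is introduced here for the outcome), and no other controls.

```latex
\operatorname{cov}(z, y)
  = \beta\,\operatorname{cov}(z, x) + \operatorname{cov}(z, e)
\qquad\Longrightarrow\qquad
\frac{\operatorname{cov}(z, y)}{\operatorname{cov}(z, x)}
  = \beta + \frac{\operatorname{cov}(z, e)}{\operatorname{cov}(z, x)}
```

Relevance keeps the denominator away from zero, and exogeneity makes the second term vanish, so the ratio cov(z, y) / cov(z, x) identifies β. When cov(z, x) is small, even a slight violation of exogeneity is magnified, which is one reason weak instruments are a concern.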

Testing Relevance

How can we test the relevance of an instrument?

1. Calculate cor(x, z)
– Better than nothing but not ideal. Why?

2. Run the ‘first stage’ regression
– What should we include?
– What do we look at?
– What if we have more than one instrument?
– What if we have more than one endogenous variable?

3. Use the post-estimation commands after estimating our main regression.

We’ll do (2) today.
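For reference, options (1) and (3) might look like the following in Stata. This is a minimal sketch, assuming the Card data are already loaded with the variable names used in this deck; no output is shown because it was not part of the original slides.

```stata
* (1) Raw, unconditional correlation between the endogenous variable and the instrument
correlate educ nearc4

* (3) Post-estimation route: fit the 2SLS model first, then request first-stage diagnostics
ivregress 2sls lwage (educ = nearc4) exper expersq black smsa smsa66 south reg66*, robust
estat firststage
```

Note that `correlate` ignores the controls, whereas `estat firststage` reports statistics conditional on the included exogenous variables.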

1st Stage Results

. reg educ nearc4 exper expersq black smsa smsa66 south reg66*, robust
note: reg666 omitted because of collinearity

Linear regression                                      Number of obs =    3010
                                                       F( 15,  2994) =  244.92
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.4771
                                                       Root MSE      =  1.9405

------------------------------------------------------------------------------
             |               Robust
        educ |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      nearc4 |   .3198989   .0850763     3.76   0.000      .153085    .4867128
       exper |  -.4125334   .0320751   -12.86   0.000    -.4754249   -.3496418
     expersq |   .0008686   .0017076     0.51   0.611    -.0024795    .0042167
...

Where do we look to test the Relevance condition? Is it satisfied?

First-Stage F

A ‘first-stage F-statistic’ in excess of 10 is often used as the threshold for satisfying the relevance condition.
• What do we mean by a first-stage F-statistic?
• Can we see it on the previous slide?
– We can, but not directly. In general you can use Stata’s ‘test’ command.
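A minimal sketch of that check, assuming the Card data are in memory. The first-stage F-statistic is the F-test that the excluded instrument (here only nearc4) has a zero coefficient in the first stage; it is not the overall regression F shown in the output header.

```stata
* First stage: endogenous variable on the instrument plus all exogenous controls
regress educ nearc4 exper expersq black smsa smsa66 south reg66*, robust

* F-test of the excluded instrument; list several variables here if there are several instruments
test nearc4

* With a single instrument, the same F is the squared t-statistic on nearc4
display (_b[nearc4] / _se[nearc4])^2
```

With one instrument, the F can be read off the first-stage table as the square of the reported t-statistic: 3.76² ≈ 14.1, which is above the rule-of-thumb value of 10.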

How plausible is it that nearc4 is exogenous?

IV Results

. ivregress 2sls lwage (educ=nearc4) exper expersq black smsa smsa66 south reg66*, robust
note: reg669 omitted because of collinearity

Instrumental variables (2SLS) regression               Number of obs =    3010
                                                       Wald chi2(15) =  840.83
                                                       Prob > chi2   =  0.0000
                                                       R-squared     =  0.2382
                                                       Root MSE      =   .3873

------------------------------------------------------------------------------
             |               Robust
       lwage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        educ |   .1315038   .0539995     2.44   0.015     .0256667     .237341
       exper |   .1082711   .0233466     4.64   0.000     .0625127    .1540295
     expersq |  -.0023349   .0003478    -6.71   0.000    -.0030167   -.0016532
       black |  -.1467757   .0523622    -2.80   0.005    -.2494038   -.0441477
        smsa |   .1118083   .0310619     3.60   0.000      .050928    .1726886
      smsa66 |   .0185311   .0205103     0.90   0.366    -.0216684    .0587306
       south |  -.1446715   .0290653    -4.98   0.000    -.2016385   -.0877045
      reg661 |  -.1078142   .0409668    -2.63   0.008    -.1881077   -.0275208

How have the results changed? Are they what you expect? What explanations could there be for the differences?
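One way to see where the 2SLS coefficient on educ comes from: with one instrument and one endogenous regressor, it equals the reduced-form coefficient on the instrument divided by the first-stage coefficient. The sketch below assumes the Card data are in memory; the scalar names rf and fs are illustrative and not from the slides.

```stata
* Reduced form: outcome on the instrument and the same controls
regress lwage nearc4 exper expersq black smsa smsa66 south reg66*, robust
scalar rf = _b[nearc4]

* First stage: endogenous variable on the instrument and the same controls
regress educ nearc4 exper expersq black smsa smsa66 south reg66*, robust
scalar fs = _b[nearc4]

* In this just-identified case the ratio should reproduce the 2SLS coefficient on educ
display rf / fs
```

Because the first-stage coefficient sits in the denominator, a modest first stage (about 0.32 years of schooling here) scales up both the point estimate and its standard error relative to OLS.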

Does the exclusion of IQ break the exogeneity condition?

. reg IQ nearc4

      Source |       SS       df       MS              Number of obs =    2061
-------------+------------------------------           F(  1,  2059) =   12.13
       Model |  2869.62905     1  2869.62905           Prob > F      =  0.0005
    Residual |  487188.423  2059  236.614096           R-squared     =  0.0059
-------------+------------------------------           Adj R-squared =  0.0054
       Total |  490058.052  2060  237.892258           Root MSE      =  15.382

------------------------------------------------------------------------------
          IQ |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      nearc4 |     2.5962   .7454966     3.48   0.001     1.134195    4.058206
       _cons |   100.6106   .6274557   160.35   0.000     99.38014    101.8412
------------------------------------------------------------------------------

How about now?

. reg IQ nearc4 smsa66 reg662-reg669

      Source |       SS       df       MS              Number of obs =    2061
-------------+------------------------------           F( 10,  2050) =   13.70
       Model |  30699.1017    10  3069.91017           Prob > F      =  0.0000
    Residual |  459358.951  2050  224.077537           R-squared     =  0.0626
-------------+------------------------------           Adj R-squared =  0.0581
       Total |  490058.052  2060  237.892258           Root MSE      =  14.969

------------------------------------------------------------------------------
          IQ |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      nearc4 |   .3478974   .8144087     0.43   0.669    -1.249257    1.945052
      smsa66 |   1.089165   .8086998     1.35   0.178    -.4967934    2.675124
      reg662 |   1.099282   1.649748     0.67   0.505    -2.136074    4.334639
      reg663 |  -1.559295   1.622997    -0.96   0.337    -4.742191      1.6236
      reg664 |  -.5425011   1.916258    -0.28   0.777    -4.300517    3.215515
      reg665 |   -8.47546   1.665513    -5.09   0.000    -11.74173   -5.209185
      reg666 |  -7.421172   1.973869    -3.76   0.000    -11.29217   -3.550175
      reg667 |   -8.39441   1.829768    -4.59   0.000    -11.98281   -4.806013
      reg668 |  -2.924975    2.34463    -1.25   0.212     -7.52308     1.67313
      reg669 |  -2.891917   1.797382    -1.61   0.108    -6.416801    .6329674
       _cons |   104.7735   1.624972    64.48   0.000     101.5867    107.9602
------------------------------------------------------------------------------

Do we believe the IV results?