Lecture 4 – More on instrumental variables - heterogeneity and the Roy model
Economics 2123, George Washington University
Instructor: Prof. Ben Williams
Outline
• a model of heterogeneous effects
• the Roy model (selection model/control function approach)
• IV in models with heterogeneity
Suppose $D_i$ denotes a binary “treatment” and $(Y_{0i}, Y_{1i})$ the potential outcomes.
• The observed outcome is
$$Y_i = Y_{1i} D_i + Y_{0i}(1 - D_i)$$
• If $Y_{di} = \mu_d + U_{di}$ then
$$Y_i = \mu_0 + (\mu_1 - \mu_0) D_i + U_{0i} + (U_{1i} - U_{0i}) D_i$$
• If $Z_i$ is independent of $(U_{0i}, U_{1i})$, is $\beta = \mu_1 - \mu_0$ identified?
• Typically, no.
• The Roy model (and its extensions) helps us think about this more carefully.
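To see why the answer is typically no, it can help to write out the IV estimand. The following is a sketch under added assumptions that are not stated on the slide: $Z_i$ is scalar, $\operatorname{Cov}(Z_i, D_i) \neq 0$, and $\beta$ denotes $\mu_1 - \mu_0$.

```latex
\begin{aligned}
\frac{\operatorname{Cov}(Z_i, Y_i)}{\operatorname{Cov}(Z_i, D_i)}
  &= (\mu_1 - \mu_0)
   + \frac{\operatorname{Cov}(Z_i, U_{0i})}{\operatorname{Cov}(Z_i, D_i)}
   + \frac{\operatorname{Cov}\bigl(Z_i, (U_{1i} - U_{0i}) D_i\bigr)}{\operatorname{Cov}(Z_i, D_i)} \\
  &= (\mu_1 - \mu_0)
   + \frac{\operatorname{Cov}\bigl(Z_i, (U_{1i} - U_{0i}) D_i\bigr)}{\operatorname{Cov}(Z_i, D_i)}
\end{aligned}
```

Independence of $Z_i$ and $U_{0i}$ removes the middle term. The last term survives in general, because $D_i$ depends on $Z_i$ and may also depend on the gain $U_{1i} - U_{0i}$, so the product $(U_{1i} - U_{0i}) D_i$ can be correlated with $Z_i$ even though $U_{1i} - U_{0i}$ alone is not.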
Review of IV Roy Model Generalized Roy Model MTE
Roy Model
• Two-sector economy (e.g., think of college graduates and high school graduates).
• The wage differential between sectors does not, in general, give the (counterfactual) wage effect of switching sectors.
• Sector choice causes wage gains (or losses), but wage differentials also induce sector choice.
Economics 379 George Washington University
Lecture 4
Roy Model
Suppose $Y_d = \mu_d + U_d$ for $d = 0, 1$ and that $D = 1(Y_1 - Y_0 \geq c)$.
Then $E(Y_0 \mid D = 1) \neq E(Y_0 \mid D = 0)$ (selection bias) and $E(Y_1 - Y_0 \mid D = 1) \neq E(Y_1 - Y_0)$ (sorting gains).
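A small simulation can make the two inequalities concrete. This is a minimal sketch, not part of the lecture: the function name `simulate_roy` and every parameter value are hypothetical, and joint normality of the unobservables is assumed for convenience.

```python
import random


def simulate_roy(n=200_000, mu0=0.0, mu1=0.2, c=0.1,
                 sd0=1.0, sd1=1.5, corr=0.3, seed=0):
    """Simulate the Roy model D = 1(Y1 - Y0 >= c) with jointly normal (U0, U1).

    Returns (selection_bias, sorting_gain) estimated from n draws.
    All parameter values are hypothetical, chosen only for illustration.
    """
    rng = random.Random(seed)
    y0_treated, y0_untreated, gain_treated, gain_all = [], [], [], []
    for _ in range(n):
        e0, e1 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        u0 = sd0 * e0
        # Build U1 so that Corr(U0, U1) = corr and Var(U1) = sd1 ** 2.
        u1 = sd1 * (corr * e0 + (1.0 - corr ** 2) ** 0.5 * e1)
        y0, y1 = mu0 + u0, mu1 + u1
        gain = y1 - y0
        gain_all.append(gain)
        if gain >= c:      # sector 1 is chosen
            y0_treated.append(y0)
            gain_treated.append(gain)
        else:              # sector 0 is chosen
            y0_untreated.append(y0)

    def mean(values):
        return sum(values) / len(values)

    # E(Y0 | D = 1) - E(Y0 | D = 0)
    selection_bias = mean(y0_treated) - mean(y0_untreated)
    # E(Y1 - Y0 | D = 1) - E(Y1 - Y0)
    sorting_gain = mean(gain_treated) - mean(gain_all)
    return selection_bias, sorting_gain


if __name__ == "__main__":
    bias, gain = simulate_roy()
    print(f"selection bias: {bias:.3f}")
    print(f"sorting gain:   {gain:.3f}")
```

With these illustrative values, $\operatorname{Cov}(U_0, U_1 - U_0) < 0$, so the selection bias comes out negative, while the sorting gain is positive because people who choose sector 1 are exactly those with large gains.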
Roy Model
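Sector 1 observed earnings:
$$E(Y_1 \mid D = 1) = \mu_1 + E(U_1 \mid U_1 - U_0 > -(\mu_1 - \mu_0 - c))$$
This is not equal to $\mu_1$ because $U_1$ and $V = U_1 - U_0$ are correlated.
Suppose $(U_1, U_0)$ are jointly normal with mean 0, covariance denoted by $\sigma_{10}$, and variances $\sigma_d^2 = \operatorname{Var}(U_d)$.
Due to normality, $U_1 = \rho \frac{\sigma_1}{\sigma_V} V + \varepsilon$ where
• $\rho$ is the correlation between $U_1$ and $V$
• $\sigma_V$ is the standard deviation of $V$
• $\varepsilon$ and $V$ are independent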
Roy Model
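As a result, observed sector 1 earnings are
$$E(Y_1 \mid D = 1) = \mu_1 + \frac{\operatorname{Var}(U_1) - \operatorname{Cov}(U_1, U_0)}{\sqrt{\operatorname{Var}(U_1 - U_0)}} \, \lambda\left(-\frac{\mu_1 - \mu_0 - c}{\sqrt{\operatorname{Var}(U_1 - U_0)}}\right)$$
$\lambda(\cdot)$ is the inverse Mills ratio: $\lambda(z) = E(Z \mid Z > z) = \phi(z)/(1 - \Phi(z))$ for $Z \sim N(0, 1)$.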
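The closed form for observed sector 1 earnings can be checked numerically. The sketch below uses only the Python standard library; the function names and the parameter values in the usage example are hypothetical, and the direct `pdf / (1 - cdf)` ratio is an illustrative choice that becomes numerically unstable for large `z`.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # N(0, 1)


def inverse_mills(z):
    """lambda(z) = E(Z | Z > z) = phi(z) / (1 - Phi(z)) for Z ~ N(0, 1)."""
    return STD_NORMAL.pdf(z) / (1.0 - STD_NORMAL.cdf(z))


def sector1_mean_formula(mu0, mu1, c, var0, var1, cov10):
    """Closed-form E(Y1 | D = 1) under joint normality of (U1, U0)."""
    sd_v = (var1 + var0 - 2.0 * cov10) ** 0.5  # sd of V = U1 - U0
    return mu1 + (var1 - cov10) / sd_v * inverse_mills(-(mu1 - mu0 - c) / sd_v)


def sector1_mean_simulated(mu0, mu1, c, var0, var1, cov10,
                           n=100_000, seed=0):
    """Monte Carlo estimate of E(Y1 | D = 1) with D = 1(Y1 - Y0 >= c)."""
    rng = random.Random(seed)
    sd0, sd1 = var0 ** 0.5, var1 ** 0.5
    corr = cov10 / (sd0 * sd1)
    total, count = 0.0, 0
    for _ in range(n):
        e0, e1 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        u0 = sd0 * e0
        u1 = sd1 * (corr * e0 + (1.0 - corr ** 2) ** 0.5 * e1)
        y0, y1 = mu0 + u0, mu1 + u1
        if y1 - y0 >= c:
            total += y1
            count += 1
    return total / count


if __name__ == "__main__":
    params = dict(mu0=0.0, mu1=0.2, c=0.1, var0=1.0, var1=2.25, cov10=0.45)
    print(sector1_mean_formula(**params))
    print(sector1_mean_simulated(**params))
```

The two printed numbers should agree up to simulation error. A useful special case: with $\mu_1 = \mu_0$, $c = 0$, unit variances, and zero covariance, the formula reduces to the expected maximum of two independent standard normals, $1/\sqrt{\pi}$.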
Roy Model
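Some empirical implications:
• positive selection in both sectors if the two distributions are uncorrelated
• in general, cannot have negative selection in both sectors
• cost increase:
$$\frac{\partial E(Y_1 \mid D = 1)}{\partial c} = \frac{\operatorname{Var}(U_1) - \operatorname{Cov}(U_1, U_0)}{\operatorname{Var}(U_1 - U_0)} \, \lambda'(x^*)$$
where $\lambda' > 0$ and $x^* = -(\mu_1 - \mu_0 - c)/\sqrt{\operatorname{Var}(U_1 - U_0)}$ is the point at which $\lambda$ is evaluated in the sector 1 earnings formula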
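The claim $\lambda' > 0$ follows from a short calculation. As a sketch, using only $\phi'(z) = -z\phi(z)$ and the definition $\lambda(z) = \phi(z)/(1 - \Phi(z))$:

```latex
\begin{aligned}
\lambda'(z)
  &= \frac{-z\,\phi(z)\,\bigl(1 - \Phi(z)\bigr) + \phi(z)^2}{\bigl(1 - \Phi(z)\bigr)^2}
   = \lambda(z)\,\bigl(\lambda(z) - z\bigr) > 0,
   \quad \text{since } \lambda(z) = E(Z \mid Z > z) > z, \\
\frac{\partial E(Y_1 \mid D = 1)}{\partial c}
  &= \frac{\operatorname{Var}(U_1) - \operatorname{Cov}(U_1, U_0)}{\sqrt{\operatorname{Var}(U_1 - U_0)}}
     \,\lambda'(x^*)\,\frac{\partial x^*}{\partial c},
  \qquad
  \frac{\partial x^*}{\partial c} = \frac{1}{\sqrt{\operatorname{Var}(U_1 - U_0)}}
\end{aligned}
```

So the sign of the cost effect on observed sector 1 earnings is the sign of $\operatorname{Var}(U_1) - \operatorname{Cov}(U_1, U_0)$: a higher cost raises mean observed earnings in sector 1 exactly when selection into sector 1 is positive.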
Roy Model
More results:
• Also, $\partial \operatorname{Var}(Y_1 \mid D = 1)/\partial c$ has the opposite sign.
• negative selection in sector 0 requires $\operatorname{Var}(U_0) < \operatorname{Cov}(U_0, U_1) < \operatorname{Var}(U_1)$
• selection reduces inequality
• if the distributions are not normal these things don't have to be true
Roy Model
Identification:
• If $U_1, U_0$ are jointly normal then the unknown parameters are $\mu_1, \mu_0, \operatorname{Var}(U_1), \operatorname{Var}(U_0), \operatorname{Cov}(U_1, U_0)$.
• Three different observation schemes:
  1. Observe $(Y_i, D_i)$ for an iid sample $i = 1, \ldots, n$
  2. Observe $Y_i$ for an iid sample $i = 1, \ldots, n$
  3. Observe $(Y_i, D_i)$ for an iid sample $i = 1, \ldots, n$, where $Y_i$ is missing if $D_i = 0$
Roy Model
Identification:
• If $U_1, U_0$ are jointly normal then all parameters are identified in the first sampling scheme.
• Partial identification in the other two sampling schemes.
• Suppose there are regressors, $\mu_d = \beta_d' X$ for $d = 0, 1$. Then
$$E(Y_1 \mid D = 1, X = x) = \beta_1' x + E(U_1 \mid U_1 - U_0 > -z, X = x)$$
$$E(Y_0 \mid D = 0, X = x) = \beta_0' x + E(U_0 \mid U_1 - U_0 < -z, X = x)$$
$$\Pr(D = 1 \mid X = x) = \Pr(U_1 - U_0 > -z \mid X = x)$$
where $z = (\beta_1 - \beta_0)' x - c$
Roy Model
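Now suppose $U_1, U_0$ are jointly normal, conditional on $X$.
Then
$$U_1 = \frac{\sigma_1^2 - \sigma_{10}}{\sigma_V^2} (U_1 - U_0) + \varepsilon \quad \text{where } \varepsilon \perp\!\!\!\perp U_1 - U_0 \mid X$$
Therefore,
$$E(Y_1 \mid D = 1, X = x) = \beta_1' x + \frac{\sigma_1^2 - \sigma_{10}}{\sigma_V} \lambda(-z)$$
where $z = (\beta_1' x - \beta_0' x - c)/\sigma_V$, with $\sigma_V$ the standard deviation of $U_1 - U_0$, and the conditional probability of being in sector 1 is
$$\Pr(D = 1 \mid X = x) = \Phi(z) \quad \text{(propensity score)}$$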
Roy Model
• Based on these two conditional moments, $\beta_1$ is identified, as well as some combinations of other parameters.
• The direction of selection is identified.
• $\beta_{1,k} - \beta_{0,k}$ is identified up to scale.
Roy Model
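If we have “complete” data, we can also use
$$E(Y_0 \mid D = 0, X = x) = \beta_0' x + \frac{\sigma_0^2 - \sigma_{10}}{\sigma_V} \lambda(z)$$
where $z = (\beta_1' x - \beta_0' x - c)/\sigma_V$. In that case everything is identified.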
Roy Model
Counterfactuals:
• the distribution of potential wage gains, $Y_1 - Y_0$
• the proportion of the population who benefit, $\Pr(Y_1 > Y_0)$
• the effect of a policy of subsidizing cost for those with $Y_0$ below a cutoff value, $y_0$
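Under joint normality the first two counterfactuals have simple closed forms once the parameters are known: $Y_1 - Y_0 \sim N(\mu_1 - \mu_0, \sigma_V^2)$, so $\Pr(Y_1 > Y_0) = \Phi((\mu_1 - \mu_0)/\sigma_V)$. The sketch below illustrates this; the function names and the numbers in the usage example are hypothetical, and the parameters are taken as given rather than estimated.

```python
from statistics import NormalDist


def gains_distribution(mu0, mu1, var0, var1, cov10):
    """Distribution of Y1 - Y0 when (U1, U0) are jointly normal with mean 0."""
    sd_v = (var1 + var0 - 2.0 * cov10) ** 0.5  # sd of U1 - U0
    return NormalDist(mu1 - mu0, sd_v)


def share_who_benefit(mu0, mu1, var0, var1, cov10):
    """Pr(Y1 > Y0) = 1 - F_gain(0)."""
    return 1.0 - gains_distribution(mu0, mu1, var0, var1, cov10).cdf(0.0)


if __name__ == "__main__":
    # Hypothetical values: mean gain 1 and Var(U1 - U0) = 1.
    gains = gains_distribution(mu0=0.0, mu1=1.0, var0=1.0, var1=1.0, cov10=0.5)
    print("median gain:", gains.median)
    print("share with positive gain:",
          share_who_benefit(0.0, 1.0, 1.0, 1.0, 0.5))
```

Note that both quantities depend on $\operatorname{Cov}(U_1, U_0)$, which is why they require the joint distribution of the potential outcomes and not just the two marginal distributions.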
Generalized Roy Model
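Generalized Roy Model:
Let $Y_d = \mu_d(X) + U_d$ and $D = 1(\mu_D(X, Z) \geq V)$, where $(U_1, U_0, V) \perp\!\!\!\perp (X, Z)$.
• special case: $\mu_D(X, Z) = \mu_1(X) - \mu_0(X) - \mu_C(X, Z)$ and $V = -(U_1 - U_0 - U_C)$, so that $D = 1$ when the gain $Y_1 - Y_0$ is at least as large as the cost $\mu_C(X, Z) + U_C$
• what is identified in this case?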
Generalized Roy Model
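Generalized Roy Model: Assumptions
• HV1: $(U_1, U_0, V) \perp\!\!\!\perp (X, Z)$
• HVN1: $(U_1, U_0, V)$ is normally distributed
Then
$$E(Y_1 \mid D = 1, X = x, Z = z) = \beta_1' x - \frac{\operatorname{Cov}(U_1, V)}{\sigma_V} \lambda\left(-\frac{\mu_D(x, z)}{\sigma_V}\right)$$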
Generalized Roy Model
Identification:
• $\beta_1, \beta_0$ are identified with data on $(Y, D, X, Z)$
• suppose $\mu_D(x, z) = \beta_1' x - \beta_0' x + \gamma_x' x + \gamma_z' z$
• if there is a component of $x$ with $\gamma_x$ set to 0 then $\sigma_V, \gamma_x, \gamma_z$ are identified
• regardless, the sorting gains and the selection bias can be identified
Generalized Roy Model
Generalized Roy Model without normality
• Heckman and Honoré (1990) study the Roy model with and without regressors under nonnormality
• Let's consider the generalized Roy model under nonnormality
• In this case,
$$E(Y_1 \mid D = 1, X = x, Z = z) = \beta_1' x + E(U_1 \mid \mu_D(x, z) \geq V)$$
Generalized Roy Model
Generalized Roy Model without normality
• The propensity score: $P(x, z) := \Pr(D = 1 \mid X = x, Z = z) = F_V(\mu_D(x, z))$
• Index sufficiency: $E(U_1 \mid \mu_D(x, z) \geq V) = K_1(P(x, z))$
• $K_1$ is called a control function
Generalized Roy Model
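Generalized Roy Model without normality
• $\beta_1$ is identified if $\lim_{z \to \infty} P(x, z) = 1$ (or $\lim_{z \to -\infty} P(x, z) = 1$)
• because $\lim_{p \to 1} K_1(p) = 0$: when almost everyone selects into sector 1, the selected mean of $U_1$ approaches its unconditional mean of zero
• “identification at infinity”
• HV1 is needed for this argument but HVN1 is replaced by the support condition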
MTE
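Marginal Treatment Effect
• Define $U_D = F_V(V)$
• Then $D = 1(P(X, Z) \geq U_D)$
• what is the distribution of $U_D$? (If $V$ is continuously distributed, $U_D$ is uniform on $[0, 1]$.)
• $MTE(x, u) = E(Y_1 - Y_0 \mid X = x, U_D = u)$ (the concept originates with Björklund and Moffitt (1987))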
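The change of variables $U_D = F_V(V)$ can be checked by simulation. This is a sketch under an added assumption, namely that $V$ is normal with a hypothetical standard deviation; the function name and all numeric values are illustrative only.

```python
import random
from statistics import NormalDist


def check_threshold_crossing(n=100_000, mu_d=0.4, sd_v=2.0, seed=1):
    """Compare D = 1(mu_D >= V) with D = 1(P >= U_D), where U_D = F_V(V).

    Returns (share of draws where the two rules agree,
             sample mean of U_D,
             share of draws with U_D <= 0.25).
    """
    rng = random.Random(seed)
    f_v = NormalDist(0.0, sd_v)   # assumed distribution of V
    p = f_v.cdf(mu_d)             # propensity score P = F_V(mu_D)
    agree, ud_sum, below_quarter = 0, 0.0, 0
    for _ in range(n):
        v = rng.gauss(0.0, sd_v)
        u_d = f_v.cdf(v)          # probability integral transform of V
        agree += int((mu_d >= v) == (p >= u_d))
        ud_sum += u_d
        below_quarter += int(u_d <= 0.25)
    return agree / n, ud_sum / n, below_quarter / n


if __name__ == "__main__":
    print(check_threshold_crossing())
```

The two selection rules should agree on every draw, and the sample moments of $U_D$ should be close to those of a uniform variable on $[0, 1]$ (mean 1/2, and a quarter of the mass below 0.25). This is what allows the MTE to be indexed by a quantile $u$ of the unobserved resistance to treatment.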
MTE
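Identification:
$$\begin{aligned}
\frac{\partial E(Y \mid X = x, P(X, Z) = p)}{\partial p}
&= \frac{\partial E(Y_0 \mid X = x, P(X, Z) = p)}{\partial p} + \frac{\partial E(D(Y_1 - Y_0) \mid X = x, P(X, Z) = p)}{\partial p} \\
&= 0 + \frac{\partial}{\partial p} \int_0^p MTE(x, u)\,du \\
&= MTE(x, p)
\end{aligned}$$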
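The middle step of this derivation can be spelled out. The sketch below relies on HV1, on $D = 1(U_D \leq p)$ when $P(X, Z) = p$, and on $U_D$ being uniform on $[0, 1]$:

```latex
\begin{aligned}
E(Y_0 \mid X = x, P(X, Z) = p) &= \mu_0(x) + E(U_0) = \mu_0(x)
  \quad \text{(does not vary with } p\text{)} \\
E\bigl(D (Y_1 - Y_0) \mid X = x, P(X, Z) = p\bigr)
  &= E\bigl(1(U_D \leq p)\,(Y_1 - Y_0) \mid X = x\bigr) \\
  &= \int_0^1 1(u \leq p)\, E(Y_1 - Y_0 \mid X = x, U_D = u)\,du \\
  &= \int_0^p MTE(x, u)\,du
\end{aligned}
```

The first line explains the zero in the derivation, and differentiating the last line with respect to $p$ gives $MTE(x, p)$. This is why the derivative of the conditional mean of $Y$ with respect to the propensity score recovers the MTE at every $p$ in the support of $P(x, Z)$.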
MTE
Marginal Treatment Effect
• many parameters of interest can be written as
$$\int_0^1 MTE(x, u)\,\omega(x, u)\,du, \qquad \int_0^1 \omega(x, u)\,du = 1$$
• for example,
$$ATE(x) := E(Y_1 - Y_0 \mid X = x), \qquad \omega_{ATE}(x, u) = 1 \text{ for } u \in [0, 1]$$
$$TT(x) := E(Y_1 - Y_0 \mid D = 1, X = x), \qquad \omega_{TT}(x, u) \propto \Pr(P(x, Z) > u \mid X = x)$$
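A short numerical example shows how different weights applied to the same MTE give different parameters. Everything specific here is a hypothetical assumption for illustration: a linear MTE, $MTE(u) = 0.8 - u$ with $x$ suppressed, and a propensity score that is uniform on $[0, 1]$ so that $\Pr(P > u) = 1 - u$.

```python
def weighted_mte(mte, weight, grid=10_000):
    """Midpoint-rule value of int MTE(u) w(u) du / int w(u) du over [0, 1].

    Dividing by the integral of the weight handles weights that are only
    specified up to proportionality.
    """
    step = 1.0 / grid
    numerator = 0.0
    denominator = 0.0
    for i in range(grid):
        u = (i + 0.5) * step
        w = weight(u)
        numerator += mte(u) * w
        denominator += w
    return numerator / denominator


def mte(u):
    """Hypothetical MTE, declining in u (x suppressed)."""
    return 0.8 - 1.0 * u


def weight_ate(u):
    """ATE weight: constant on [0, 1]."""
    return 1.0


def weight_tt(u):
    """TT weight, proportional to Pr(P > u); here P ~ Uniform(0, 1)."""
    return 1.0 - u


ate = weighted_mte(mte, weight_ate)
tt = weighted_mte(mte, weight_tt)

if __name__ == "__main__":
    print(f"ATE = {ate:.4f}, TT = {tt:.4f}")
```

In this example TT exceeds ATE, because the TT weights put more mass on low values of $u$, where the hypothetical MTE is largest. That is the sorting-on-gains pattern of the Roy model expressed in terms of the MTE.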
MTE
weights for IV
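• the IV estimand is
$$\Delta^{IV}(x) = \frac{\operatorname{Cov}(J(Z), Y \mid X = x)}{\operatorname{Cov}(J(Z), D \mid X = x)}$$
where $J = J(Z)$ is some function of the instruments that may depend implicitly on $X$ as well.
• then $\Delta^{IV}(x) = \int_0^1 MTE(x, u)\,\omega_{IV}(x, u)\,du$ where
$$\omega_{IV}(x, u) = \frac{E(J - E(J) \mid X = x, P \geq u)\,\Pr(P \geq u \mid X = x)}{\operatorname{Cov}(J, P \mid X = x)}$$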
MTE
weights for IV
• what if $J = P$?
• if $Z$ is scalar and binary then
$$\Delta^{IV}(x) = \int_{p_0}^{p_1} MTE(x, u)\,\frac{1}{p_1 - p_0}\,du$$
where $p_s = \Pr(D = 1 \mid Z = s, X = x)$ for $s = 0, 1$
• in general, weights are not always positive!
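The binary-instrument result can be illustrated by simulation. The data-generating process below is hypothetical and chosen only so the answer is known in closed form: $U_D$ is uniform, the gain is $Y_1 - Y_0 = a - b\,U_D$ so that $MTE(u) = a - bu$, there are no covariates, and $Z$ shifts the propensity score from $p_0$ to $p_1$.

```python
import random


def wald_vs_mte(n=200_000, p0=0.2, p1=0.5, a=0.8, b=1.0, seed=2):
    """Return (Wald/IV estimate, average MTE over [p0, p1], ATE)."""
    rng = random.Random(seed)
    # Per-instrument-value accumulators: [sum of Y, sum of D, count].
    sums = {0: [0.0, 0.0, 0], 1: [0.0, 0.0, 0]}
    for _ in range(n):
        z = 1 if rng.random() < 0.5 else 0
        u_d = rng.random()                    # U_D ~ Uniform(0, 1)
        d = 1 if u_d <= (p1 if z else p0) else 0
        y0 = rng.gauss(0.0, 0.5)
        y1 = y0 + a - b * u_d                 # gain declines in U_D
        y = y1 if d else y0
        cell = sums[z]
        cell[0] += y
        cell[1] += d
        cell[2] += 1
    mean_y = {z: s[0] / s[2] for z, s in sums.items()}
    mean_d = {z: s[1] / s[2] for z, s in sums.items()}
    wald = (mean_y[1] - mean_y[0]) / (mean_d[1] - mean_d[0])
    late = a - b * (p0 + p1) / 2.0            # (1/(p1-p0)) * int_{p0}^{p1} (a - b u) du
    ate = a - b / 2.0                         # int_0^1 (a - b u) du
    return wald, late, ate


if __name__ == "__main__":
    wald, late, ate = wald_vs_mte()
    print(f"Wald = {wald:.3f}, average MTE on [p0, p1] = {late:.3f}, ATE = {ate:.3f}")
```

The Wald estimate should be close to the average of the MTE over $[p_0, p_1]$ and, in this design, clearly different from the ATE. A different instrument that moves the propensity score over a different interval would recover a different average of the same MTE.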
MTE
LATE
• Let $D(z)$ denote the value $D$ takes when $Z$ takes the value $z$.
• Imbens and Angrist (1994) showed that the IV estimand in the binary case takes the form $E(Y_1 - Y_0 \mid D(z) - D(z') = 1)$
• This is called the local average treatment effect
• it represents the average effect of treatment for individuals induced to receive treatment when $Z$ changes from 0 to 1
$$MTE(P(z)) = \lim_{z' \to z} LATE(z, z')$$
MTE
Carneiro, Heckman, and Vytlacil (2011)
• data from NLSY
• Y is log wage in 1991 (individuals are between 28 and 34), D represents college attendance, X contains usual controls
• instruments: (i) distance to college, (ii) local wage, (iii) local unemployment, (iv) average local public tuition
MTE
Carneiro et al.: Estimating Marginal Returns to Education (Vol. 101, No. 6, p. 2771)
mean values in the sample. As above, we annualize the MTE. Our estimates show that, in agreement with the normal model, $E(U_1 - U_0 \mid U_S = u_S)$ is declining in $u_S$, i.e., students with high values of $U_S$ have lower returns than those with low values of $U_S$.
Even though the semiparametric estimate of the MTE has larger standard errors than the estimate based on the normal model, we still reject the hypothesis that its slope is zero. We have already discussed the rejection of the hypothesis that MTE is constant in $U_S$, based on the test results reported in Table 4, panel A. But we can also directly test whether the semiparametric MTE is constant in $U_S$ or not. We evaluate the MTE at 26 points, equally spaced between 0 and 1 (with intervals of 0.04). We construct pairs of nonoverlapping adjacent intervals (0–0.04, 0.08–0.12, 0.16–0.20, 0.24–0.28, …), and we take the mean of the MTE for each pair. These are LATEs defined over different sections of the MTE. We compare adjacent LATEs. Table 4, panel B, reports the outcome of these comparisons. For example, the first column reports that
$$E(Y_1 - Y_0 \mid X = \bar{x},\ 0 \leq U_S \leq 0.04) - E(Y_1 - Y_0 \mid X = \bar{x},\ 0.08 \leq U_S \leq 0.12) = 0.0689.$$
[Figure 4. $E(Y_1 - Y_0 \mid X, U_S)$ with 90 Percent Confidence Interval: Locally Quadratic Regression Estimates. Horizontal axis: $U_S$, from 0 to 1; vertical axis: MTE, from −0.6 to 0.8.]
Notes: To estimate the function plotted here, we first use a partially linear regression of log wages on polynomials in X, interactions of polynomials in X and P, and K(P), a locally quadratic function of P (where P is the predicted probability of attending college), with a bandwidth of 0.32; X includes experience, current average earnings in the county of residence, current average unemployment in the state of residence, AFQT, mother’s education, number of siblings, urban residence at 14, permanent local earnings in the county of residence at 17, permanent unemployment in the state of residence at 17, and cohort dummies. The figure is generated by evaluating the derivative of (9) at the average value of X. Ninety percent standard error bands are obtained using the bootstrap (250 replications).
MTE
MTE assumptions:
• HV1
• the distribution of $\mu_D(x, Z)$ conditional on $X = x$ is not degenerate (exclusion restriction)
• $0 < \Pr(D = 1 \mid X = x) < 1$ for each $x$
• $X$ is invariant to counterfactual manipulations ($X_1 = X_0$)
Vytlacil shows that these are equivalent to the conditions of Imbens and Angrist (1994).
• uniformity/monotonicity: $\Pr(D(z) \geq D(z'))$ is equal to 1 or 0 for each pair $z, z'$; this is implied by separability in the HV model.
Summarizing,
• In models with essential heterogeneity, IV does not estimate an economically interesting parameter.
• Instead, IV estimates a LATE, or a weighted average of the MTE.
• Under the more restrictive assumptions of Imbens and Angrist (1994), this is the “treatment effect for those induced to switch by an increase in Z”.
• More generally, the weights may be negative.
• Different instruments identify different parameters.
• The MTE itself can be identified so we can do more.