Selection Bias with Linear Probability Models
![Page 1: Selection Bias with Linear Probability Models](https://reader033.fdocuments.us/reader033/viewer/2022051300/5883b1c21a28ab3b488b71f1/html5/thumbnails/1.jpg)
Selection Bias with Linear Probability Models (LPM)
Suneel Chatla, Galit Shmueli
Institute of Service Science, National Tsing Hua University, Taiwan
![Page 2: Selection Bias with Linear Probability Models](https://reader033.fdocuments.us/reader033/viewer/2022051300/5883b1c21a28ab3b488b71f1/html5/thumbnails/2.jpg)
Outline
• Introduction to self selection
• Popular methods for selection bias correction
  o Two step methods (2SLS)
  o Matching methods (PSM)
• Incorporating LPM into 2SLS and PSM
• Simulation study
• Conclusions
![Page 3: Selection Bias with Linear Probability Models](https://reader033.fdocuments.us/reader033/viewer/2022051300/5883b1c21a28ab3b488b71f1/html5/thumbnails/3.jpg)
Quasi-experiments
Designs that test causal hypotheses like randomized experiments but lack random assignment (i.e., units self-select into treatment)
![Page 4: Selection Bias with Linear Probability Models](https://reader033.fdocuments.us/reader033/viewer/2022051300/5883b1c21a28ab3b488b71f1/html5/thumbnails/4.jpg)
Why do we need quasi-experiments?
Pros
• Usable when random assignment is impractical and/or unethical
• Easier to set up, greater external validity
• Minimize threats to ecological validity
Cons
• Estimates are subject to contamination by confounding variables (biased)
• No total control over extraneous variables
![Page 5: Selection Bias with Linear Probability Models](https://reader033.fdocuments.us/reader033/viewer/2022051300/5883b1c21a28ab3b488b71f1/html5/thumbnails/5.jpg)
Two Methods for Addressing Selection Bias
![Page 7: Selection Bias with Linear Probability Models](https://reader033.fdocuments.us/reader033/viewer/2022051300/5883b1c21a28ab3b488b71f1/html5/thumbnails/7.jpg)
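Two step methods: Heckman vs Olsen

| Method | Stage 1: Selection model (T) | Adjustment | Stage 2: Outcome model (Y) |
|---|---|---|---|
| Heckman (1977), Probit | $E[T \mid X] = \Phi(X\gamma)$ | $IMR = \dfrac{\phi(X\gamma)}{\Phi(X\gamma)}$ | $Y = X\beta + \delta\, IMR + \varepsilon$ |
| Olsen (1980), LPM | $E[T \mid X] = X\gamma$ | $\lambda = X\gamma - 1$ | $Y = X\beta + \delta\lambda + \varepsilon$ |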
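The two adjustment terms above can be sketched in a few lines. This is only an illustration of the formulas on the slide, not the authors' code; the function names `heckman_imr` and `olsen_lambda` are hypothetical, and the inputs are assumed to be the already-fitted first-stage index $X\gamma$ (probit) or fitted value $X\gamma$ (LPM).

```python
from statistics import NormalDist

# Standard normal distribution (mean 0, sd 1), used for phi and Phi.
_STD_NORMAL = NormalDist()


def heckman_imr(probit_index: float) -> float:
    """Inverse Mills ratio phi(X*gamma) / Phi(X*gamma) for a fitted probit index."""
    return _STD_NORMAL.pdf(probit_index) / _STD_NORMAL.cdf(probit_index)


def olsen_lambda(lpm_fitted: float) -> float:
    """Olsen's correction term: the LPM fitted value X*gamma minus 1."""
    return lpm_fitted - 1.0


# Illustrative inputs only (not taken from the slides).
for index in (-1.0, 0.0, 1.0):
    print("probit index", index, "-> IMR", round(heckman_imr(index), 4))
for fitted in (0.25, 0.5, 0.75):
    print("LPM fitted value", fitted, "-> lambda", olsen_lambda(fitted))
```

Either term would then enter the second-stage regression as one extra regressor with coefficient $\delta$.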
![Page 8: Selection Bias with Linear Probability Models](https://reader033.fdocuments.us/reader033/viewer/2022051300/5883b1c21a28ab3b488b71f1/html5/thumbnails/8.jpg)
In Short: For Continuous Outcome
Heckman’s
• Bivariate normality
• Inconsistent second stage standard errors
• Identification issues
• Expensive computation
• Convergence issues
Olsen’s
• Linear conditional expectation
• Inconsistent second stage standard errors
• Identification issues
• Cheaper computation
• No convergence issues
![Page 9: Selection Bias with Linear Probability Models](https://reader033.fdocuments.us/reader033/viewer/2022051300/5883b1c21a28ab3b488b71f1/html5/thumbnails/9.jpg)
Open Research Questions
1. Selection model with unequal sample sizes (treat/control) - continuous outcome
2. Binary outcome model – coefficient consistency
3. Selection model with unequal sample sizes (treat/control) + binary outcome model with unequal sample sizes
![Page 10: Selection Bias with Linear Probability Models](https://reader033.fdocuments.us/reader033/viewer/2022051300/5883b1c21a28ab3b488b71f1/html5/thumbnails/10.jpg)
Simulation Design
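Selection model: $S^* = -0.5 + 0.5x_1 - 0.5x_2 + 1.5x_3 - x_4 + \omega$

$T = \begin{cases} 1 & \text{if } S^* > 0 \\ 0 & \text{otherwise} \end{cases}$

Continuous Outcome model: $Y = 0.5 - 1.5x_1 + 0.5x_2 + x_3 + \varepsilon$

Binary Outcome: $Y_O = \begin{cases} 1 & \text{if } Y > 0 \\ 0 & \text{otherwise} \end{cases}$

Errors: $(\omega, \varepsilon) \sim N\left(\begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 0.5 & -0.4 \\ -0.4 & 0.5 \end{pmatrix}\right)$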
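A minimal sketch of this data-generating process, using only the Python standard library. The slide does not state how the covariates are distributed, so drawing $x_1, \dots, x_4$ as independent standard normals is an assumption made here for illustration; the function name `simulate_selection_data` is likewise hypothetical.

```python
import math
import random


def simulate_selection_data(n: int, seed: int = 0) -> list:
    """Draw n observations from the selection and outcome models on the slide."""
    rng = random.Random(seed)
    # Cholesky factor of the error covariance [[0.5, -0.4], [-0.4, 0.5]].
    l11 = math.sqrt(0.5)
    l21 = -0.4 / l11
    l22 = math.sqrt(0.5 - l21 ** 2)
    rows = []
    for _ in range(n):
        # ASSUMPTION: covariates are independent N(0, 1); not stated in the source.
        x1, x2, x3, x4 = (rng.gauss(0.0, 1.0) for _ in range(4))
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        omega = l11 * z1              # selection error
        eps = l21 * z1 + l22 * z2     # outcome error, correlated with omega
        s_star = -0.5 + 0.5 * x1 - 0.5 * x2 + 1.5 * x3 - x4 + omega
        y = 0.5 - 1.5 * x1 + 0.5 * x2 + x3 + eps
        rows.append({
            "x": (x1, x2, x3, x4),
            "omega": omega,
            "eps": eps,
            "t": 1 if s_star > 0 else 0,      # treatment indicator T
            "y": y,                           # continuous outcome
            "y_bin": 1 if y > 0 else 0,       # binary outcome
        })
    return rows


sample = simulate_selection_data(5, seed=42)
print([(row["t"], row["y_bin"]) for row in sample])
```

Note that $x_4$ appears only in the selection equation, so it plays the role of an exclusion restriction in this sketch.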
![Page 11: Selection Bias with Linear Probability Models](https://reader033.fdocuments.us/reader033/viewer/2022051300/5883b1c21a28ab3b488b71f1/html5/thumbnails/11.jpg)
Q1: Continuous outcome: treat/control sample size ratio has no influence
![Page 12: Selection Bias with Linear Probability Models](https://reader033.fdocuments.us/reader033/viewer/2022051300/5883b1c21a28ab3b488b71f1/html5/thumbnails/12.jpg)
Q2: Binary outcome - coefficients inconsistent
How about marginals?
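As a reminder of what "marginals" refers to here, the block below writes out the marginal effect of a covariate $x_j$ under the two usual binary-outcome specifications. It assumes the correction term enters the index as one more regressor and is held fixed when differentiating; the slide does not spell out the specification, so this is a sketch rather than the authors' derivation. $p(X)$ denotes the probability that the binary outcome equals 1.

$$
\begin{aligned}
\text{Probit:}\quad & p(X) = \Phi(X\beta + \delta\lambda) &&\Rightarrow\quad \frac{\partial p}{\partial x_j} = \phi(X\beta + \delta\lambda)\,\beta_j \\
\text{LPM:}\quad & p(X) = X\beta + \delta\lambda &&\Rightarrow\quad \frac{\partial p}{\partial x_j} = \beta_j
\end{aligned}
$$

Probit coefficients are identified only up to scale, so they are not directly comparable with LPM coefficients, whereas marginal effects are on the same probability scale in both models.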
![Page 13: Selection Bias with Linear Probability Models](https://reader033.fdocuments.us/reader033/viewer/2022051300/5883b1c21a28ab3b488b71f1/html5/thumbnails/13.jpg)
Q3: Binary outcome - divergence of marginals with imbalance ratio
[Figure: 3 × 3 grid of panels. Columns: outcome cut-off 50%, 25%, 5%. Rows: selection cut-off 50%, 25%, 5%.]
![Page 14: Selection Bias with Linear Probability Models](https://reader033.fdocuments.us/reader033/viewer/2022051300/5883b1c21a28ab3b488b71f1/html5/thumbnails/14.jpg)
Summary: Heckman vs Olsen
• Continuous outcome: Heckman and Olsen corrections are similar, even when unbalanced
• Binary outcome: marginal effects from Heckman and Olsen corrections diverge with imbalance
• LPM in both stages provides consistent estimates (OLS)
• But how about Probit?
![Page 15: Selection Bias with Linear Probability Models](https://reader033.fdocuments.us/reader033/viewer/2022051300/5883b1c21a28ab3b488b71f1/html5/thumbnails/15.jpg)
Two Methods for Addressing Selection Bias
![Page 16: Selection Bias with Linear Probability Models](https://reader033.fdocuments.us/reader033/viewer/2022051300/5883b1c21a28ab3b488b71f1/html5/thumbnails/16.jpg)
Matching Methods
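| Method | Stage 1: Selection model (T) | Covariate balance | Stage 2: Outcome model (Y) |
|---|---|---|---|
| Rosenbaum and Rubin (1985) | $\text{logit}(E[T \mid X]) = X\gamma$ | $\lvert p(T=1) - p(T=0) \rvert < \varepsilon$ | $Y = X\beta + \varepsilon$ |
| LPM | $E[T \mid X] = X\gamma$ | $\lvert p(T=1) - p(T=0) \rvert < \varepsilon$ | $Y = X\beta + \varepsilon$ |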
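The covariate-balance condition can be read as a caliper rule: a treated unit is paired with a control unit only if their propensity scores differ by less than a tolerance. A minimal sketch follows, assuming nearest-neighbour matching with replacement; the slides do not say which matching variant was used, and `caliper_match` is a hypothetical name. The scores can come from either a logit or an LPM first stage.

```python
def caliper_match(treated_scores, control_scores, caliper):
    """Pair each treated unit with its nearest control (with replacement).

    Returns a list of (treated_index, control_index) pairs; a treated unit is
    dropped when even its nearest control is not strictly within the caliper.
    """
    pairs = []
    if not control_scores:
        return pairs
    for i, score_t in enumerate(treated_scores):
        # Index of the control whose propensity score is closest to score_t.
        j_best = min(
            range(len(control_scores)),
            key=lambda j: abs(score_t - control_scores[j]),
        )
        if abs(score_t - control_scores[j_best]) < caliper:
            pairs.append((i, j_best))
    return pairs


# Illustrative scores only: the second treated unit has no control nearby.
print(caliper_match([0.30, 0.90], [0.28, 0.50], caliper=0.05))
```

Treated units that fall outside the caliper are discarded, which is where insufficient overlap between treatment and control shows up in practice.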
![Page 17: Selection Bias with Linear Probability Models](https://reader033.fdocuments.us/reader033/viewer/2022051300/5883b1c21a28ab3b488b71f1/html5/thumbnails/17.jpg)
Propensity Score Matching (PSM)
• Only accounts for observable/observed covariates
• Requires large samples and substantial overlap between treatment and control
• What happens to ATE if we use LPM for matching?
![Page 18: Selection Bias with Linear Probability Models](https://reader033.fdocuments.us/reader033/viewer/2022051300/5883b1c21a28ab3b488b71f1/html5/thumbnails/18.jpg)
Simulation Design
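Selection model: $T \sim \text{Bernoulli}\left(\dfrac{1}{1 + e^{(\ldots)}}\right)$

Outcome model: $Y = T + X\beta + \varepsilon$, with $\varepsilon \sim N(0, \sigma^2)$ and $\sigma \in \{0.1, 1, 5\}$

$X \sim N(0, 1)$ and $\beta = 1$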
![Page 19: Selection Bias with Linear Probability Models](https://reader033.fdocuments.us/reader033/viewer/2022051300/5883b1c21a28ab3b488b71f1/html5/thumbnails/19.jpg)
Identical ATE from Logit and LPM matching
• Sample size: 1000
• Standard deviation: 0.1, 1, 5
• Bootstrap: 50
• ATE: $\text{mean}(Y_{T=1}) - \text{mean}(Y_{T=0})$
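A rough sketch of the LPM side of this experiment, in plain Python. Several pieces are assumptions for illustration and are labeled as such: the selection probability is taken to be logistic in $X$ with index equal to $X$ (the exact index is not legible in the source), matching is nearest-neighbour with replacement and no caliper, and a single draw replaces the 50 bootstrap repetitions. The function names are hypothetical.

```python
import math
import random


def simulate_psm(n=1000, sigma=1.0, seed=0):
    """Draw (x, t, y) with Y = T + X*beta + eps, beta = 1, eps ~ N(0, sigma^2)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        # ASSUMPTION: P(T = 1 | x) is logistic with index x.
        p = 1.0 / (1.0 + math.exp(-x))
        t = 1 if rng.random() < p else 0
        y = t + 1.0 * x + rng.gauss(0.0, sigma)
        data.append((x, t, y))
    return data


def lpm_scores(data):
    """Propensity scores from an OLS fit of T on (1, X)."""
    n = len(data)
    mean_x = sum(x for x, _, _ in data) / n
    mean_t = sum(t for _, t, _ in data) / n
    sxx = sum((x - mean_x) ** 2 for x, _, _ in data)
    sxt = sum((x - mean_x) * (t - mean_t) for x, t, _ in data)
    slope = sxt / sxx
    intercept = mean_t - slope * mean_x
    return [intercept + slope * x for x, _, _ in data]


def naive_difference(data):
    """Unmatched mean(Y | T = 1) - mean(Y | T = 0)."""
    y1 = [y for _, t, y in data if t == 1]
    y0 = [y for _, t, y in data if t == 0]
    return sum(y1) / len(y1) - sum(y0) / len(y0)


def matched_difference(data, scores):
    """Mean outcome of treated units minus mean outcome of their matched controls."""
    treated = [(s, y) for (_, t, y), s in zip(data, scores) if t == 1]
    controls = [(s, y) for (_, t, y), s in zip(data, scores) if t == 0]
    diffs = []
    for s_t, y_t in treated:
        # Nearest control on the propensity score, with replacement.
        _, y_c = min(controls, key=lambda c: abs(c[0] - s_t))
        diffs.append(y_t - y_c)
    return sum(diffs) / len(diffs)


data = simulate_psm(n=1000, sigma=1.0, seed=0)
print("naive difference:  ", round(naive_difference(data), 3))
print("matched difference:", round(matched_difference(data, lpm_scores(data)), 3))
```

The true treatment effect in this design is 1. The naive difference is pushed upward because units with larger $X$ are both more likely to be treated and have larger $Y$, while matching on the LPM score should bring the estimate much closer to 1. Swapping `lpm_scores` for a logit fit would give the comparison reported on the slide.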
![Page 20: Selection Bias with Linear Probability Models](https://reader033.fdocuments.us/reader033/viewer/2022051300/5883b1c21a28ab3b488b71f1/html5/thumbnails/20.jpg)
Summary & Future Research
• LPM similar to logit in terms of estimated Average Treatment Effect
• Ongoing work: what about binary outcome models?
• Logit faces problems if insufficient overlap between treat/control
• Ongoing work: does LPM have overlap issues?
![Page 21: Selection Bias with Linear Probability Models](https://reader033.fdocuments.us/reader033/viewer/2022051300/5883b1c21a28ab3b488b71f1/html5/thumbnails/21.jpg)
Thank you!