Matching models

Bill Evans
ECON 60303

Treatment: private college
Control: public

Busso et al.

Outcome: root MSE of TOT across replications

Pair matching does well, but it is harder to implement

IPW, even without rescaling of weights, does well and is easiest to estimate

Dehejia and Wahba, 1999

Not the easiest table to read. The numbers in brackets are std errors on the difference from the NSW treatment sample. For example, EDUCATION in NSW is 10.35 and the same value in MCPS3 is 10.69, a difference of 0.34. The std error on that difference is 0.48.

Treatment effect from RCT

Regression-based adjustments

Propensity score estimates

Differences in samples

• NSW enrolled people April 75-April 77
• DW – wanted two years of pre-program earnings
– Survey asked for earnings in 1974 so they delete anyone enrolled after April 1976
– But they also include people w. zero earnings 13-24 months prior to enrollment, for those enrolled after April 1976

DW sample: results from JASA

Lalonde sample, DW methods: not even close

RA sample, DW methods: looks awful

Example: matching1.do

• workplace1.do: data on indoor workers, their smoking habits and whether they are subject to a workplace smoking ban
– Y: smoker (=1 if yes, =0 if no)
– D: worka (work area smoking ban, =1 if yes)
– X: ln(income) and age plus dummies for male, black, hispanic, hsgrad, somecol, college
• Sparse set of controls – this is just to illustrate the procedure

* run the propensity score;
probit worka age incomel male black hispanic hsgrad somecol college;
predict pscore, pr;

* trim the sample to have common support;
* the span of the propensity scores is the same for;
* treatment=1 and treatment=0;
gen ps_y1=pscore;
replace ps_y1=. if worka==0;
gen ps_y0=pscore;
replace ps_y0=. if worka==1;
egen ps1max=max(ps_y1);
egen ps1min=min(ps_y1);
egen ps0max=max(ps_y0);
egen ps0min=min(ps_y0);
drop if pscore<max(ps1min,ps0min);
drop if pscore>min(ps1max,ps0max);

Use a probit for the propensity score, and use the trimming procedure from DW to impose common support
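Written as a formula, the two `drop` statements in the code above keep observation i only when its estimated propensity score falls inside the overlap of the treated and comparison ranges. The notation is introduced here for illustration and is not from the slides: \hat p_i stands for pscore and D_i for worka.

```latex
\[
\max\Bigl(\min_{j:\,D_j=1}\hat p_j,\ \min_{j:\,D_j=0}\hat p_j\Bigr)
\;\le\; \hat p_i \;\le\;
\min\Bigl(\max_{j:\,D_j=1}\hat p_j,\ \max_{j:\,D_j=0}\hat p_j\Bigr)
\]
```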

[Figure: Distribution of Propensity Score, workplace1.do. Back-to-back histograms for worka=1 and worka=0. Vertical axis: propensity score (0.49 to 0.81). Horizontal axis: counts.]

* generate weights;
* ipw1 is the original weight;
gen ipw1=1;
replace ipw1=(1-worka)*pscore/(1-pscore) if worka==0;

* now construct ipw2 -- re-weight so that the;
* new weights sum to the number of observations;
* in the comparison sample;
egen ipw1s=sum(ipw1) if worka==0;
egen nobs_y0=sum(1-worka) if worka==0;
gen ipw2=ipw1;
replace ipw2=nobs_y0*ipw1/ipw1s if worka==0;
sort worka;
by worka: sum ipw1 ipw2;

. sort worka;
. by worka: sum ipw1 ipw2;

-------------------------------------------------------------------------------
-> worka = 0

    Variable |       Obs        Mean    Std. Dev.        Min        Max
-------------+---------------------------------------------------------
        ipw1 |      5118    2.178971    .6443429    1.020956   5.388466
        ipw2 |      5118           1    .2957097    .4685496   2.472941

-------------------------------------------------------------------------------
-> worka = 1

    Variable |       Obs        Mean    Std. Dev.        Min        Max
-------------+---------------------------------------------------------
        ipw1 |     11138           1           0           1          1
        ipw2 |     11138           1           0           1          1

Average weight of IPW2 should be 1 in the comparison sample – the rescaled weights sum to the number of observations in the comparison sample.
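The two weights built in the code above can be summarized in a formula. The symbols are assumptions added here for illustration: \hat p_i is pscore, D_i is worka, and N_0 is the number of comparison observations (5118 in the output above).

```latex
\[
w^{(1)}_i =
\begin{cases}
1 & \text{if } D_i = 1,\\[4pt]
\dfrac{\hat p_i}{1-\hat p_i} & \text{if } D_i = 0,
\end{cases}
\qquad
w^{(2)}_i =
\begin{cases}
1 & \text{if } D_i = 1,\\[4pt]
N_0\,\dfrac{w^{(1)}_i}{\sum_{j:\,D_j=0} w^{(1)}_j} & \text{if } D_i = 0.
\end{cases}
\]
```

By construction the comparison-group values of w^{(2)} sum to N_0, so their mean is 1. Dividing the ipw1 minimum by the ipw1 mean in the output above, 1.020956 / 2.178971, gives about 0.4685, which matches the reported ipw2 minimum.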

* check whether covariates are balanced;
* use ipw2 as the weight;
by worka: sum age incomel male black hispanic hsgrad somecol college [aw=ipw2];
reg age worka [aw=ipw2];
reg incomel worka [aw=ipw2];
reg male worka [aw=ipw2];
reg black worka [aw=ipw2];
reg hispanic worka [aw=ipw2];
reg hsgrad worka [aw=ipw2];
reg somecol worka [aw=ipw2];
reg college worka [aw=ipw2];

* run the outcome regression weighted by ipw1, ask for robust std errors;
reg smoker worka [aw=ipw1], robust;

* run the outcome regression weighted by ipw2, ask for robust std errors;
reg smoker worka [aw=ipw2], robust;

* just for fun, compare to the OLS estimates;
reg smoker worka age incomel male black hispanic hsgrad somecol college, robust;

Balancing Test

Variable    Mean | D=0    Mean | D=1    Diff (P-value)
Age         38.54         38.61          0.07  (0.74)
Incomel     10.43         10.43          0.005 (0.72)
Male        0.366         0.368          0.002 (0.78)
Black       0.121         0.118         -0.004 (0.55)
Hispanic    0.064         0.062         -0.002 (0.69)
Hsgrad      0.315         0.316          0.001 (0.90)
Somecol     0.273         0.273          0.000 (0.98)
College     0.353         0.350         -0.002 (0.78)

. * run the outcome regression weighted by ipw2, ask for robust std errors;
. reg smoker worka [aw=ipw2], robust;
(sum of wgt is   1.6256e+04)

Linear regression                               Number of obs     =      16256
                                                F(  1, 16254)     =      70.79
                                                Prob > F          =     0.0000
                                                R-squared         =     0.0048
                                                Root MSE          =     .42926

------------------------------------------------------------------------------
             |               Robust
      smoker |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       worka |  -.0638802   .0075925    -8.41   0.000    -.0787624    -.048998
       _cons |   .2890552   .0064792    44.61   0.000     .2763553    .3017552
------------------------------------------------------------------------------

. * just for fun, compare to the OLS estimates;
. reg smoker worka age incomel male black hispanic hsgrad somecol college, rob
> ust;

------------------------------------------------------------------------------
             |               Robust
      smoker |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       worka |  -.0660522    .007488    -8.82   0.000    -.0807296   -.0513749
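The full sequence (probit, common-support trim, reweighting, weighted regression) can be tried without the workplace data. The sketch below is a minimal illustration on simulated data, not the course do-file: the variable names x, d, y, p and w, the seed, and the data-generating process are all assumptions made up for this example. It uses Stata's default line delimiter rather than the `;` delimiter used in the slides.

```stata
* Hypothetical simulated data; nothing here comes from workplace1.do
clear
set seed 12345
set obs 5000
gen x = rnormal()                      // single covariate
gen d = (0.5*x + rnormal() > 0)        // treatment depends on x
gen y = 1 + 2*d + x + rnormal()        // outcome; treatment effect is 2 by construction

* propensity score from a probit
probit d x
predict p, pr

* common-support trim: keep the overlap of the two score ranges
quietly summarize p if d==1
scalar lo1 = r(min)
scalar hi1 = r(max)
quietly summarize p if d==0
scalar lo0 = r(min)
scalar hi0 = r(max)
drop if p < max(scalar(lo1), scalar(lo0))
drop if p > min(scalar(hi1), scalar(hi0))

* TOT weights: 1 for treated, p/(1-p) for comparisons
gen w = 1
replace w = p/(1-p) if d==0

* rescale comparison weights to have mean 1 (they then sum to the comparison N)
quietly summarize w if d==0
replace w = w/r(mean) if d==0

* weighted regression of the outcome on the treatment dummy
regress y d [aw=w], vce(robust)
```

Because the regression contains only a constant and the treatment dummy, the coefficient on d is the treated mean of y minus the weighted comparison mean, so rescaling the comparison weights by a constant should not change the point estimate.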

How to use propensity scores

Early discharges

• Recommended postpartum stay
– 2 days for normal vaginal birth
– 4 days for uncomplicated c-section
• Rise of managed care reduced average length of postpartum stay
– By mid 1990s, 80% of births were under recommended stay
• “drive through deliveries”

Legislative response

• States adopted mandatory minimum postpartum stays
– Insurance must be offered
– Patient can leave after 1 day
• Federal law
– Passed in 1996, effective January 1, 1998
– Exempted Medicaid
• CA state law
– Passed and effective on Aug 17, 1997
– Expanded to Medicaid January 1, 1999

Research question

• Does more medical care generate better outcomes?
• Problem: most births are uncomplicated so the law should have little impact on those
• Is there a way to measure how complicated the birth is?
– Different diagnoses – but there are many
– Alternative – use PS as a measure of difficulty
