Matching models

Bill Evans, ECON 60303

Page 1: Matching models

Page 2: Matching models

Page 3: Matching models

Page 4: Matching models

Treatment: private college
Control: public college

Page 5: Matching models

Busso et al.

Outcome: root MSE of the TOT (treatment on the treated) estimate across replications

Pair matching does well, but it is harder to implement.

IPW does well even without rescaling the weights, and it is the easiest to estimate.
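
The outcome measure on this slide can be illustrated with a minimal sketch of a root-MSE calculation across simulation replications. The helper name `root_mse`, the true effect of 1.0, and the four estimates are made up for illustration; they are not values from Busso et al.

```python
import math

def root_mse(estimates, true_effect):
    """Root mean squared error of replication estimates around the true effect."""
    squared_errors = [(est - true_effect) ** 2 for est in estimates]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical TOT estimates from four replications (assumed true TOT = 1.0).
tot_estimates = [0.8, 1.1, 1.3, 0.9]
rmse = root_mse(tot_estimates, true_effect=1.0)
print(round(rmse, 4))
```

A lower root MSE means the estimator's replications sit closer to the true effect, combining bias and variance in one number.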

Page 6: Matching models

Dehejia and Wahba, 1999

Page 7: Matching models

Page 8: Matching models

Not the easiest table to read. The numbers in brackets are standard errors on the difference from the NSW treatment sample. For example, EDUCATION in NSW is 10.35 and the same value in MCPS3 is 10.69, a difference of 0.34. The standard error on that difference is 0.48.
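
The arithmetic in this example can be checked directly. The sketch below uses only the three numbers quoted above; the helper name `diff_and_ratio` is hypothetical.

```python
def diff_and_ratio(mean_comparison, mean_treated, se_diff):
    """Return the difference in means and its ratio to the standard error."""
    diff = mean_comparison - mean_treated
    return diff, diff / se_diff

# EDUCATION: 10.69 in MCPS3, 10.35 in the NSW treatment sample, std error 0.48.
diff, ratio = diff_and_ratio(10.69, 10.35, 0.48)
print(round(diff, 2), round(ratio, 2))
```

The ratio is about 0.71, so this particular difference is well under two standard errors.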

Page 9: Matching models

Treatment effect from RCT

Regression-based adjustments

Propensity score estimates

Page 10: Matching models

Differences in samples

• NSW enrolled people April 1975 to April 1977
• DW wanted two years of pre-program earnings
  – Survey asked for earnings in 1974, so they delete anyone enrolled after April 1976
  – But they also include people with zero earnings 13-24 months prior to enrollment, for those enrolled after April 1976

Page 11: Matching models

Page 12: Matching models

Page 13: Matching models

DW sample: results from JASA

Lalonde sample, DW methods: not even close

RA sample, DW methods: looks awful

Page 14: Matching models

Example: matching1.do

• workplace1.do: data on indoor workers, their smoking habits and whether they are subject to a workplace smoking ban
  – Y: smoker (=1 if yes, =0 if no)
  – D: worka (work area smoking ban, =1 if yes)
  – X: ln(income) and age plus dummies for male, black, hispanic, hsgrad, somecol, college
• Sparse set of controls: this is just to illustrate the procedure

Page 15: Matching models

* run the propensity score;
probit worka age incomel male black hispanic hsgrad somecol college;
predict pscore, pr;

* trim the sample to have common support;
* the span of the propensity scores is the same for;
* treatment=1 and treatment=0;
gen ps_y1=pscore;
replace ps_y1=. if worka==0;
gen ps_y0=pscore;
replace ps_y0=. if worka==1;
egen ps1max=max(ps_y1);
egen ps1min=min(ps_y1);
egen ps0max=max(ps_y0);
egen ps0min=min(ps_y0);
drop if pscore<max(ps1min,ps0min);
drop if pscore>min(ps1max,ps0max);

Use a probit for the propensity score, use trimming procedure from DW
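
The trimming rule can also be sketched outside Stata. The Python below is a minimal illustration on made-up scores, not the course data: it keeps observations whose propensity score lies between the larger of the two group minima and the smaller of the two group maxima, mirroring the two `drop` commands. The function name `trim_to_common_support` is hypothetical.

```python
def trim_to_common_support(records):
    """Keep records whose score lies in the overlap of the two groups' ranges.

    records: list of (treated, pscore) tuples, with treated equal to 0 or 1.
    """
    treated = [p for d, p in records if d == 1]
    control = [p for d, p in records if d == 0]
    lower = max(min(treated), min(control))  # larger of the two minima
    upper = min(max(treated), max(control))  # smaller of the two maxima
    return [(d, p) for d, p in records if lower <= p <= upper]

# Made-up scores for illustration only.
sample = [(1, 0.55), (1, 0.70), (1, 0.90), (0, 0.40), (0, 0.60), (0, 0.80)]
trimmed = trim_to_common_support(sample)
print(trimmed)
```

Here the treated unit at 0.90 and the comparison unit at 0.40 fall outside the overlap and are dropped.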

Page 16: Matching models

[Figure: Distribution of Propensity Score, workplace1.do. Back-to-back histograms for worka=1 and worka=0. Vertical axis: propensity score (0.49 to 0.81). Horizontal axis: counts (0 to 1,500 on each side).]

Page 17: Matching models

* generate weights;
* ipw1 is the original weight;
gen ipw1=1;
replace ipw1=(1-worka)*pscore/(1-pscore) if worka==0;

* now construct ipw2 -- re-weight so that the;
* new weights sum to the number of observations;
* in the comparison sample;
egen ipw1s=sum(ipw1) if worka==0;
egen nobs_y0=sum(1-worka) if worka==0;
gen ipw2=ipw1;
replace ipw2=nobs_y0*ipw1/ipw1s if worka==0;

sort worka;
by worka: sum ipw1 ipw2;
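
The same weight construction can be sketched in a few lines of Python. This is an illustration on made-up propensity scores, with the hypothetical helper `tot_weights`: treated units keep weight 1, comparison units get p/(1-p), and the rescaled version makes the comparison weights sum to the number of comparison observations.

```python
def tot_weights(records):
    """Return (ipw1, ipw2) lists for (treated, pscore) records.

    Treated units get weight 1. Comparison units get p/(1-p) in ipw1;
    ipw2 rescales those so they sum to the number of comparison units.
    """
    ipw1 = [1.0 if d == 1 else p / (1.0 - p) for d, p in records]
    control_sum = sum(w for (d, _), w in zip(records, ipw1) if d == 0)
    n_control = sum(1 for d, _ in records if d == 0)
    ipw2 = [w if d == 1 else n_control * w / control_sum
            for (d, _), w in zip(records, ipw1)]
    return ipw1, ipw2

# Made-up (treated, pscore) pairs for illustration only.
sample = [(1, 0.7), (1, 0.6), (0, 0.5), (0, 0.6), (0, 0.8)]
ipw1, ipw2 = tot_weights(sample)
control_ipw2 = [w for (d, _), w in zip(sample, ipw2) if d == 0]
print(sum(control_ipw2) / len(control_ipw2))
```

By construction the rescaled comparison weights average to 1, up to floating-point rounding.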

Page 18: Matching models

. sort worka;
. by worka: sum ipw1 ipw2;

-------------------------------------------------------------------------------
-> worka = 0

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
        ipw1 |      5118    2.178971    .6443429   1.020956   5.388466
        ipw2 |      5118           1    .2957097   .4685496   2.472941

-------------------------------------------------------------------------------
-> worka = 1

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
        ipw1 |     11138           1           0          1          1
        ipw2 |     11138           1           0          1          1

The average of ipw2 should be 1 in the comparison sample, because the weights sum to the number of observations in the comparison sample.

Page 19: Matching models

* check whether covariates are balanced;
* use ipw2 as the weight;
by worka: sum age incomel male black hispanic hsgrad somecol college [aw=ipw2];
reg age worka [aw=ipw2];
reg incomel worka [aw=ipw2];
reg male worka [aw=ipw2];
reg black worka [aw=ipw2];
reg hispanic worka [aw=ipw2];
reg hsgrad worka [aw=ipw2];
reg somecol worka [aw=ipw2];
reg college worka [aw=ipw2];

* run propensity score by ipw1, ask for robust std errors;
reg smoker worka [aw=ipw1], robust;
* run propensity score by ipw2, ask for robust std errors;
reg smoker worka [aw=ipw2], robust;

* just for fun, compare to the OLS estimates;
reg smoker worka age incomel male black hispanic hsgrad somecol college, robust;
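
Each weighted regression of a variable on worka alone has a simple reading: with a constant and one binary regressor, the coefficient on worka equals the weighted mean for worka=1 minus the weighted mean for worka=0. The Python sketch below shows that difference on made-up data; the helper names `weighted_mean` and `weighted_diff` are hypothetical, and it does not reproduce the robust standard errors.

```python
def weighted_mean(values, weights):
    """Weighted average of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

def weighted_diff(outcome, treated, weights):
    """Weighted mean of outcome for treated units minus that for comparison units."""
    groups = {0: ([], []), 1: ([], [])}
    for y, d, w in zip(outcome, treated, weights):
        groups[d][0].append(y)
        groups[d][1].append(w)
    return weighted_mean(*groups[1]) - weighted_mean(*groups[0])

# Made-up data: three treated units (weight 1) and three reweighted comparison units.
smoker = [1, 0, 0, 1, 1, 0]
worka = [1, 1, 1, 0, 0, 0]
weights = [1.0, 1.0, 1.0, 0.5, 1.5, 1.0]
effect = weighted_diff(smoker, worka, weights)
print(round(effect, 4))
```

Applied to a covariate such as age instead of the outcome, the same difference is the quantity a balance check looks at.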

Page 20: Matching models

Balancing Test

Variable    Mean | D=0    Mean | D=1    Diff      (P-value)
Age         38.54         38.61          0.07     (0.74)
Incomel     10.43         10.43          0.005    (0.72)
Male        0.366         0.368          0.002    (0.78)
Black       0.121         0.118         -0.004    (0.55)
Hispanic    0.064         0.062         -0.002    (0.69)
Hsgrad      0.315         0.316          0.001    (0.90)
Somecol     0.273         0.273          0.000    (0.98)
College     0.353         0.350         -0.002    (0.78)

Page 21: Matching models

. * run propensity score by ipw2, ask for robust std errors;
. reg smoker worka [aw=ipw2], robust;
(sum of wgt is   1.6256e+04)

Linear regression                                      Number of obs =   16256
                                                       F(  1, 16254) =   70.79
                                                       Prob > F      =  0.0000
                                                       R-squared     =  0.0048
                                                       Root MSE      =  .42926

------------------------------------------------------------------------------
             |               Robust
      smoker |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       worka |  -.0638802   .0075925    -8.41   0.000    -.0787624    -.048998
       _cons |   .2890552   .0064792    44.61   0.000     .2763553    .3017552
------------------------------------------------------------------------------

. * just for fun, compare to the OLS estimates;
. reg smoker worka age incomel male black hispanic hsgrad somecol college, rob
> ust;

------------------------------------------------------------------------------
             |               Robust
      smoker |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       worka |  -.0660522    .007488    -8.82   0.000    -.0807296   -.0513749

Page 22: Matching models

How to use propensity scores

Page 23: Matching models

Early discharges

• Recommended postpartum stay
  – 2 days for normal vaginal birth
  – 4 days for uncomplicated c-section
• Rise of managed care reduced average length of postpartum stay
  – By the mid-1990s, 80% of births were under the recommended stay
• “Drive-through deliveries”

Page 24: Matching models

Legislative response

• States adopted mandatory minimum postpartum stays
  – Insurance must be offered
  – Patient can leave after 1 day
• Federal law
  – Passed in 1996, effective January 1, 1998
  – Exempted Medicaid
• CA state law
  – Passed and effective on Aug 17, 1997
  – Expanded to Medicaid January 1, 1999

Page 25: Matching models

Research question

• Does more medical care generate better outcomes?

• Problem: most births are uncomplicated so the law should have little impact on those

• Is there a way to measure how complicated the birth is?
  – Different diagnoses, but there are many
  – Alternative: use the propensity score (PS) as a measure of difficulty

Page 26: Matching models

Page 27: Matching models

Page 28: Matching models

Page 29: Matching models

Page 30: Matching models

Page 31: Matching models

Page 32: Matching models

Page 33: Matching models