Kaplan-Meier methods and Parametric Regression methods
Kristin Sainani Ph.D.
http://www.stanford.edu/~kcobb
Stanford University, Department of Health Research and Policy


More on Kaplan-Meier estimator of S(t) (“product-limit estimator” or “KM estimator”)

When there are no censored data, the KM estimator is simple and intuitive: Estimated S(t) = proportion of observations with failure times > t.

For example, if you are following 10 patients, and 3 of them die by the end of the first year, then your best estimate of S(1 year) = 70%.

When there are censored data, KM provides an estimate of S(t) that takes censoring into account (see last week’s lecture).

If the censored observation had actually been a failure: S(1 year)=4/5*3/4*2/3=2/5=40%

KM estimator is defined only at times when events occur! (empirically defined)


KM (product-limit) estimator, formally

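$$\hat{S}(t) = \prod_{j:\, t_j \le t} \left[ 1 - \frac{d_j}{n_j} \right]$$

where $t_1 < \dots < t_j < \dots < t_k$ are the k distinct event times; at each event time $t_j$ there are $n_j$ individuals at risk; and $d_j$ is the number who have the event at time $t_j$.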


S(t) represents estimated survival probability at time t: P(T>t)

Observed event times

Typically dj= 1 person, unless data are grouped in time intervals (e.g., everyone who had the event in the 3rd month).

The risk set nj at time tj consists of the original sample minus all those who have been censored or had the event before tj

This formula gives the product-limit estimate of survival at each time an event happens.

dj/nj=proportion that failed at the event time tj

1- dj/nj=proportion surviving the event time

Multiply the probability of surviving event time t with the probabilities of surviving all the previous event times.
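The product-limit calculation can be sketched in a few lines of Python. This is a minimal illustration, assuming each subject is recorded as a time plus an event flag (1 = event, 0 = censored); the function name `kaplan_meier` and the five-patient data set are hypothetical and not part of the lecture. Subjects censored at an event time are kept in the risk set for that time.

```python
def kaplan_meier(times, events):
    """Return [(t_j, n_j, d_j, S(t_j))] at each distinct event time."""
    data = list(zip(times, events))
    surv = 1.0
    rows = []
    for t in sorted({u for u, e in data if e == 1}):
        n_j = sum(1 for u, _ in data if u >= t)             # at risk at t_j
        d_j = sum(1 for u, e in data if u == t and e == 1)  # events at t_j
        surv *= 1.0 - d_j / n_j                             # multiply in 1 - d_j/n_j
        rows.append((t, n_j, d_j, surv))
    return rows


# Hypothetical data: 5 patients, events at times 1, 2, 3; two censored at time 4.
toy_times = [1, 2, 3, 4, 4]
toy_events = [1, 1, 1, 0, 0]
toy_km = kaplan_meier(toy_times, toy_events)
for t, n_j, d_j, s in toy_km:
    print(t, n_j, d_j, round(s, 3))
```

With these toy data the final estimate is 4/5*3/4*2/3 = 2/5, the same product form as the example earlier in the lecture.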


Example 1: time-to-conception for subfertile women

“Failure” here is a good thing.

38 women (in 1982) were treated for infertility with laparoscopy and hydrotubation.

All women were followed for up to 2 years to describe time-to-conception.

The event is conception, and women "survived" until they conceived.

Example from: BMJ, Dec 1998; 317: 1572 - 1580.


Data from: Luthra P, Bland JM, Stanton SL. Incidence of pregnancy after laparoscopy and hydrotubation. BMJ 1982; 284: 1013-1014

Raw data: Time (months) to conception or censoring in 38 sub-fertile women after laparoscopy and hydrotubation (1982 study)

Conceived (event):           1 1 1 1 1 1 2 2 2 2 2 3 3 3 4 4 4 6 6 9 9 9 10 13 16

Did not conceive (censored): 2 3 4 7 7 8 8 9 9 9 11 24 24


Corresponding Kaplan-Meier Curve

S(t) is estimated at 9 event times.

(step-wise function)


6 women conceived in 1st month (1st menstrual cycle). Therefore, 32/38 “survived” pregnancy-free past 1 month.


S(t=1) = 32/38 = 84.2%

S(t) represents estimated survival probability: P(T>t). Here, P(T>1).


Important detail of how the data were coded: Censoring at t=2 indicates survival PAST the 2nd cycle (i.e., we know the woman “survived” her 2nd cycle pregnancy-free).

Thus, for calculating KM estimator at 2 months, this person should still be included in the risk set.

Think of it as 2+ months, e.g., 2.1 months.


5 women conceive in 2nd month.

The risk set at event time 2 included 32 women.

Therefore, 27/32=84.4% “survived” event time 2 pregnancy-free.

S(t=2) = ( 84.2%)*(84.4%)=71.1%

Can get an estimate of the hazard rate here, h(t=2)= 5/32=15.6%. Given that you didn’t get pregnant in month 1, you have an estimated 5/32 chance of conceiving in the 2nd month.

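And estimate of density (marginal probability of conceiving in month 2): f(2)=h(2)*S(1)=(.156)*(.842)=13%, i.e., 5/38. In discrete time the hazard multiplies the probability of still being at risk entering the month.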


Risk set at 3 months includes 26 women


S(t=3) = ( 84.2%)*(84.4%)*(88.5%)=62.8%

3 women conceive in the 3rd month.

The risk set at event time 3 included 26 women.

23/26=88.5% “survived” event time 3 pregnancy-free.


Risk set at 4 months includes 22 women


S(t=4) = ( 84.2%)*(84.4%)*(88.5%)*(86.4%)=54.2%

3 women conceive in the 4th month, and 1 was censored between months 3 and 4.

The risk set at event time 4 included 22 women.

19/22=86.4% “survived” event time 4 pregnancy-free.

Hazard rates (conditional chances of conceiving, e.g. 100%-84%) look similar over time.

And estimate of density (marginal probability of conceiving in month 4): f(4)=h(4)*S(3)=(.136)*(.628)=8.6%


Risk set at 6 months includes 18 women


S(t=6) = (54.2%)*(88.9%)=48.2%

2 women conceive in the 6th month of the study, and one was censored between months 4 and 6.

The risk set at event time 5 included 18 women.

16/18=88.9% “survived” event time 5 pregnancy-free.


Skipping ahead to the 9th and final event time (months=16)…

S(t=13) ≈ 22% (“eyeball” approximation)


2 remaining at 16 months (9th event time)


S(t=16) =( 22%)*(2/3)=15%

Tail here just represents that the final 2 women did not conceive (cannot make many inferences from the end of a KM curve)!


Kaplan-Meier: SAS output

The LIFETEST Procedure

Product-Limit Survival Estimates

                                  Survival
                                  Standard    Number    Number
     time   Survival    Failure      Error    Failed      Left
   0.0000     1.0000          0          0         0        38
   1.0000      .          .          .             1        37
   1.0000      .          .          .             2        36
   1.0000      .          .          .             3        35
   1.0000      .          .          .             4        34
   1.0000      .          .          .             5        33
   1.0000     0.8421     0.1579     0.0592         6        32
   2.0000      .          .          .             7        31
   2.0000      .          .          .             8        30
   2.0000      .          .          .             9        29
   2.0000      .          .          .            10        28
   2.0000     0.7105     0.2895     0.0736        11        27
   2.0000*     .          .          .            11        26
   3.0000      .          .          .            12        25
   3.0000      .          .          .            13        24
   3.0000     0.6285     0.3715     0.0789        14        23
   3.0000*     .          .          .            14        22
   4.0000      .          .          .            15        21
   4.0000      .          .          .            16        20
   4.0000     0.5428     0.4572     0.0822        17        19
   4.0000*     .          .          .            17        18
   6.0000      .          .          .            18        17
   6.0000     0.4825     0.5175     0.0834        19        16
   7.0000*     .          .          .            19        15
   7.0000*     .          .          .            19        14
   8.0000*     .          .          .            19        13
   8.0000*     .          .          .            19        12
   9.0000      .          .          .            20        11
   9.0000      .          .          .            21        10
   9.0000     0.3619     0.6381     0.0869        22         9
   9.0000*     .          .          .            22         8
   9.0000*     .          .          .            22         7
   9.0000*     .          .          .            22         6
  10.0000     0.3016     0.6984     0.0910        23         5
  11.0000*     .          .          .            23         4
  13.0000     0.2262     0.7738     0.0944        24         3
  16.0000     0.1508     0.8492     0.0880        25         2
  24.0000*     .          .          .            25         1
  24.0000*     .          .          .            25         0

NOTE: The marked survival times are censored observations.
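As a cross-check on the listing, the Survival column can be recomputed by hand. The sketch below is a plain-Python illustration (not SAS, and not the lecture's own code); the function name `km_survival` is hypothetical, and the two lists are transcribed from the raw-data table for the 38 women.

```python
# Times (months) transcribed from the raw-data table for the 38 women.
conceived = [1] * 6 + [2] * 5 + [3] * 3 + [4] * 3 + [6] * 2 + [9] * 3 + [10, 13, 16]
censored = [2, 3, 4, 7, 7, 8, 8, 9, 9, 9, 11, 24, 24]


def km_survival(event_times, censor_times):
    """Product-limit estimate at each distinct event time.

    A woman censored at time t still counts as at risk at t
    (the "2+ months" coding described in the lecture).
    """
    everyone = event_times + censor_times
    surv, table = 1.0, []
    for t in sorted(set(event_times)):
        n_j = sum(1 for u in everyone if u >= t)  # risk set at t
        d_j = event_times.count(t)                # conceptions at t
        surv *= 1 - d_j / n_j
        table.append((t, n_j, d_j, surv))
    return table


pregnancy_km = km_survival(conceived, censored)
for t, n_j, d_j, s in pregnancy_km:
    print(f"{t:>2} months: n={n_j:>2} d={d_j} S(t)={s:.4f}")
```

The nine printed rows correspond to the nine event times, and the risk sets (38, 32, 26, 22, 18, ...) are the ones walked through slide by slide above.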


Not so easy to get a plot of the actual hazard function!

In SAS, need a complicated MACRO, and depends on assumptions…here’s what I get from Paul Allison’s macro for these data…


At best, you can get the cumulative hazard function…

$$S(t) = e^{-\int_0^t h(u)\,du}$$

$$-\log S(t) = \int_0^t h(u)\,du$$

See lecture 1 if you want more math!

Linear cumulative hazard function indicates a constant hazard.


Cumulative Hazard Function

If the hazard function is constant, e.g. h(t)=k, then the cumulative hazard function will be linear (and higher hazards will have steeper slopes):

$$\int_0^t k\,du = kt$$

If the hazard function is increasing with time, e.g. h(t)=kt, then the cumulative hazard function will be curved up; for example, h(t)=kt gives a quadratic:

$$\int_0^t ku\,du = \frac{kt^2}{2}$$

If the hazard function is decreasing over time, e.g. h(t)=k/t, then the cumulative hazard function should be curved down, for example (taking the lower limit as 1, because k/u cannot be integrated from 0):

$$\int_1^t \frac{k}{u}\,du = k\log(t)$$

In every case:

$$-\log S(t) = \int_0^t h(u)\,du$$
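The relationship between S(t) and the cumulative hazard can be applied directly to the KM estimates. The sketch below is illustrative only: it takes the Survival column reported in the LIFETEST output for the pregnancy data, computes -log S(t) at each event time, and divides by t as a rough check on whether the cumulative hazard is close to linear. The variable names are hypothetical.

```python
import math

# KM survival estimates at the nine event times (from the LIFETEST output).
km_estimates = {1: 0.8421, 2: 0.7105, 3: 0.6285, 4: 0.5428, 6: 0.4825,
                9: 0.3619, 10: 0.3016, 13: 0.2262, 16: 0.1508}

# Estimated cumulative hazard: H(t) = -log S(t).
cum_hazard = {t: -math.log(s) for t, s in km_estimates.items()}

for t, h in cum_hazard.items():
    # If the hazard were exactly constant, H(t)/t would be the same at every t.
    print(f"t={t:>2}  H(t)={h:.3f}  H(t)/t={h / t:.3f}")
```

A roughly constant H(t)/t is what a linear cumulative hazard looks like in a table; a plot of H(t) against t gives the same information graphically.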


Kaplan-Meier: example 2

Researchers randomized 44 patients with chronic active hepatitis to receive prednisolone or no treatment (control), then compared survival curves.

Example from: BMJ 1998;317:468-469 ( 15 August )


Survival times (months) of 44 patients with chronic active hepatitis randomised to receive prednisolone or no treatment. (*=censored)

Prednisolone (n=22)    Control (n=22)
        2                     2
        6                     3
       12                     4
       54                     7
       56*                   10
       68                    22
       89                    28
       96                    29
       96                    32
      125*                   37
      128*                   40
      131*                   41
      140*                   54
      141*                   61
      143                    63
      145*                   71
      146                   127*
      148*                  140*
      162*                  146*
      168                   158*
      173*                  167*
      181*                  182*

Data from: BMJ 1998;317:468-469 ( 15 August )


Kaplan-Meier: example 2

Are these two curves different?

Misleading to the eye—apparent convergence by end of study. But this is due to 6 controls who survived fairly long, and 3 events in the treatment group when the sample size was small.

Big drops at the end of the curve indicate few patients left. E.g., only 2/3 (67%) survived this drop.


Control group:

                                  Survival
                                  Standard    Number    Number
     time   Survival    Failure      Error    Failed      Left
    0.000     1.0000          0          0         0        22
    2.000     0.9545     0.0455     0.0444         1        21
    3.000     0.9091     0.0909     0.0613         2        20
    4.000     0.8636     0.1364     0.0732         3        19
    7.000     0.8182     0.1818     0.0822         4        18
   10.000     0.7727     0.2273     0.0893         5        17
   22.000     0.7273     0.2727     0.0950         6        16
   28.000     0.6818     0.3182     0.0993         7        15
   29.000     0.6364     0.3636     0.1026         8        14
   32.000     0.5909     0.4091     0.1048         9        13
   37.000     0.5455     0.4545     0.1062        10        12
   40.000     0.5000     0.5000     0.1066        11        11
   41.000     0.4545     0.5455     0.1062        12        10
   54.000     0.4091     0.5909     0.1048        13         9
   61.000     0.3636     0.6364     0.1026        14         8
   63.000     0.3182     0.6818     0.0993        15         7
   71.000     0.2727     0.7273     0.0950        16         6
  127.000*     .          .          .            16         5
  140.000*     .          .          .            16         4
  146.000*     .          .          .            16         3
  158.000*     .          .          .            16         2
  167.000*     .          .          .            16         1
  182.000*     .          .          .            16         0

6 controls made it past 100 months.


Treated group:

                                  Survival
                                  Standard    Number    Number
     time   Survival    Failure      Error    Failed      Left
    0.000     1.0000          0          0         0        22
    2.000     0.9545     0.0455     0.0444         1        21
    6.000     0.9091     0.0909     0.0613         2        20
   12.000     0.8636     0.1364     0.0732         3        19
   54.000     0.8182     0.1818     0.0822         4        18
   56.000*     .          .          .             4        17
   68.000     0.7701     0.2299     0.0904         5        16
   89.000     0.7219     0.2781     0.0967         6        15
   96.000      .          .          .             7        14
   96.000     0.6257     0.3743     0.1051         8        13
  125.000*     .          .          .             8        12
  128.000*     .          .          .             8        11
  131.000*     .          .          .             8        10
  140.000*     .          .          .             8         9
  141.000*     .          .          .             8         8
  143.000     0.5475     0.4525     0.1175         9         7
  145.000*     .          .          .             9         6
  146.000     0.4562     0.5438     0.1285        10         5
  148.000*     .          .          .            10         4
  162.000*     .          .          .            10         3
  168.000     0.3041     0.6959     0.1509        11         2
  173.000*     .          .          .            11         1
  181.000*     .          .          .            11         0

5/6 of 54% rapidly drops the curve to 45%.

2/3 of 45% rapidly drops the curve to 30%.


Point-wise confidence intervals

We will not worry about mathematical formula for confidence bands. The important point is that there is a confidence interval for each estimate of S(t). (SAS uses Greenwood’s formula.)
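For readers who do want to see the formula at work, here is a minimal sketch of Greenwood's standard error, SE[S(t)] = S(t) * sqrt(sum of d_j / (n_j (n_j - d_j)) over event times up to t), applied to the treated group of the hepatitis trial. The function name `km_greenwood` is hypothetical, and the plain "estimate ± 1.96 SE" interval is an assumption made for illustration; SAS offers other interval transformations and its default may differ.

```python
import math

# Treated (prednisolone) group: (months, 1 = death, 0 = censored).
treated = [(2, 1), (6, 1), (12, 1), (54, 1), (56, 0), (68, 1), (89, 1),
           (96, 1), (96, 1), (125, 0), (128, 0), (131, 0), (140, 0),
           (141, 0), (143, 1), (145, 0), (146, 1), (148, 0), (162, 0),
           (168, 1), (173, 0), (181, 0)]


def km_greenwood(data, z=1.96):
    """Return [(t, S, SE, lower, upper)] at each distinct event time."""
    surv, gw_sum, rows = 1.0, 0.0, []
    for t in sorted({u for u, e in data if e == 1}):
        n_j = sum(1 for u, _ in data if u >= t)
        d_j = sum(1 for u, e in data if u == t and e == 1)
        surv *= 1 - d_j / n_j
        if n_j > d_j:  # Greenwood term is undefined once everyone has failed
            gw_sum += d_j / (n_j * (n_j - d_j))
        se = surv * math.sqrt(gw_sum)
        rows.append((t, surv, se, max(0.0, surv - z * se), min(1.0, surv + z * se)))
    return rows


treated_km = km_greenwood(treated)
for t, s, se, lo, hi in treated_km:
    print(f"{t:>3}  S={s:.4f}  SE={se:.4f}  95% CI=({lo:.3f}, {hi:.3f})")
```

The S and SE columns can be compared against the Survival and Survival Standard Error columns of the LIFETEST listing for the treated group.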


Log-rank test

Test of Equality over Strata

                                   Pr >
Test         Chi-Square    DF    Chi-Square
Log-Rank         4.6599     1        0.0309
Wilcoxon         6.5435     1        0.0105
-2Log(LR)        5.4096     1        0.0200

Chi-square test (with 1 df) of the (overall) difference between the two groups.

Groups appear significantly different.


Log-rank test is just a Cochran-Mantel-Haenszel chi-square test!

Anyone remember (know) what this is?

CMH test of conditional independence

K strata = unique event times. Each stratum k is a 2x2 table with total $N_k$:

            Event    No Event
Group 1       a         b
Group 2       c         d

$$\frac{\left[\sum_{k=1}^{K} \left(a_k - E(a_k)\right)\right]^2}{\sum_{k=1}^{K} Var(a_k)} \sim \chi^2_1$$

$$E(a_k) = \frac{(a_k+b_k)(a_k+c_k)}{N_k}$$

$$Var(a_k) = \frac{(a_k+b_k)(c_k+d_k)(a_k+c_k)(b_k+d_k)}{N_k^2\,(N_k-1)}$$

In terms of the row and column totals of each stratum's 2x2 table:

$$E(a_k) = \frac{row1_k \cdot col1_k}{N_k}$$

$$Var(a_k) = \frac{row1_k \cdot row2_k \cdot col1_k \cdot col2_k}{N_k^2\,(N_k-1)}$$

$$\frac{\left[\sum_{k \in \text{event times}} \left(\text{events}_k - E(\text{events}_k)\right)\right]^2}{\sum_{k \in \text{event times}} Var(\text{events}_k)} = \left(\frac{\text{observed} - \text{expected}}{\text{standard deviation}}\right)^2 = Z^2 \sim \chi^2_1$$

How do you know that this is a chi-square with 1 df? Why is this the expected value in each stratum?

Variance is the variance of a hypergeometric distribution


Event time 1 (2 months), control group: At risk=22

1st event at month 2.


Event time 1 (2 months), treated group: At risk=22

1st event at month 2.


Stratum 1 = event time 1

            Event    No Event
treated       1         21
control       1         21

Event time 1: 1 died from each group. (22 at risk in each group)

$$a_1 = 1$$

$$E(a_1) = \frac{(22)(2)}{44} = 1$$

$$Var(a_1) = \frac{(22)(22)(2)(42)}{44^2\,(43)} = .488$$


Event time 2 (3 months), control group: At risk=21

Next event at month 3.


Event time 2 (3 months), treated group: At risk=21

No events at 3 months


Stratum 2 = event time 2

            Event    No Event
treated       0         21
control       1         20

Event time 2: At 3 months, 1 died in the control group. At that time 21 from each group were at risk.

$$a_2 = 0$$

$$E(a_2) = \frac{(21)(1)}{42} = .5$$

$$Var(a_2) = \frac{(21)(21)(1)(41)}{42^2\,(41)} = .25$$


Event time 3 (4 months), control group: At risk=20

1 event at month 4.


Event time 3 (4 months), treated group: At risk=21


Stratum 3 = event time 3 (4 months)

            Event    No Event
treated       0         21
control       1         19

Event time 3: At 4 months, 1 died in the control group. At that time 21 from the treated group and 20 from the control group were at risk (41 in total).

$$a_3 = 0$$

$$E(a_3) = \frac{(21)(1)}{41} = .51$$

$$Var(a_3) = \frac{(21)(20)(1)(40)}{41^2\,(40)} = .25$$


Etc.

$$\chi^2_1 = \frac{\left[\sum_{k} \left(a_k - E(a_k)\right)\right]^2}{\sum_{k} Var(a_k)} = \frac{\left[(1-1) + (0-.5) + (0-.51) + \dots\right]^2}{.488 + .25 + .25 + \dots} = 4.66$$
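The stratum-by-stratum sum can be carried through all event times with a short script. This is a sketch in plain Python rather than SAS; the function name `logrank` is hypothetical, the data are transcribed from the hepatitis table, and the p-value uses the identity P(chi-square with 1 df > x) = erfc(sqrt(x/2)).

```python
import math

# (months, 1 = death, 0 = censored), transcribed from the hepatitis data table.
treated = [(2, 1), (6, 1), (12, 1), (54, 1), (56, 0), (68, 1), (89, 1),
           (96, 1), (96, 1), (125, 0), (128, 0), (131, 0), (140, 0),
           (141, 0), (143, 1), (145, 0), (146, 1), (148, 0), (162, 0),
           (168, 1), (173, 0), (181, 0)]
control = [(2, 1), (3, 1), (4, 1), (7, 1), (10, 1), (22, 1), (28, 1),
           (29, 1), (32, 1), (37, 1), (40, 1), (41, 1), (54, 1), (61, 1),
           (63, 1), (71, 1), (127, 0), (140, 0), (146, 0), (158, 0),
           (167, 0), (182, 0)]


def logrank(group1, group2):
    """Sum observed, expected and hypergeometric variance over event times."""
    observed = expected = variance = 0.0
    for t in sorted({u for u, e in group1 + group2 if e == 1}):
        n1 = sum(1 for u, _ in group1 if u >= t)   # at risk, group 1
        n2 = sum(1 for u, _ in group2 if u >= t)   # at risk, group 2
        d1 = sum(1 for u, e in group1 if u == t and e == 1)
        d2 = sum(1 for u, e in group2 if u == t and e == 1)
        n, d = n1 + n2, d1 + d2
        observed += d1                             # a_k
        expected += d * n1 / n                     # row1 * col1 / N
        if n > 1:
            variance += n1 * n2 * d * (n - d) / (n ** 2 * (n - 1))
    chi2 = (observed - expected) ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2))       # chi-square, 1 df
    return observed, expected, variance, chi2, p_value


obs, exp_, var, chi2, p_value = logrank(treated, control)
print(f"observed={obs:.0f} expected={exp_:.2f} variance={var:.2f}")
print(f"chi-square={chi2:.4f} p={p_value:.4f}")
```

The chi-square and p-value can be compared with the Log-Rank row of the "Test of Equality over Strata" output.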


Log-rank test, et al.

Test of Equality over Strata

                                   Pr >
Test         Chi-Square    DF    Chi-Square
Log-Rank         4.6599     1        0.0309
Wilcoxon         6.5435     1        0.0105
-2Log(LR)        5.4096     1        0.0200

Likelihood Ratio test is not ideal here because it assumes exponential distribution (constant hazard).

Wilcoxon is just a version of the log-rank test that weights strata by their size (giving more weight to earlier time points).

More sensitive to differences at earlier time points.

Log-rank test has most power to test differences that fit the proportional hazards model—so works well as a set-up for subsequent Cox regression.


Estimated –log(S(t))

Maybe hazard function decreases a little then increases a little? Hard to say exactly…


One more graph from SAS…

log(-log(S(t))) = log(cumulative hazard)

If group plots are parallel, this indicates that the proportional hazards assumption is valid.

Necessary assumption for calculation of Hazard Ratios…


Uses of Kaplan-Meier

• Commonly used to describe survivorship of study population/s.
• Commonly used to compare two study populations.
• Intuitive graphical presentation.


Limitations of Kaplan-Meier

• Mainly descriptive

• Doesn’t control for covariates

• Requires categorical predictors

• SAS does let you easily discretize continuous variables for KM methods, for exploratory purposes.

• Can’t accommodate time-dependent variables


Parametric Models for the hazard/survival function

The class of regression models estimated by PROC LIFEREG is known as the accelerated failure time models.


Recall: two parametric models

Components:

•A baseline hazard function (that may change over time).

•A linear function of a set of k fixed covariates that when exponentiated (and a few other things) gives the relative risk.

$$\log h_i(t) = \mu + \beta_1 x_{i1} + \dots + \beta_k x_{ik}$$

Exponential model assumes fixed baseline hazard that we can estimate.

$$\log h_i(t) = \mu + \alpha \log t + \beta_1 x_{i1} + \dots + \beta_k x_{ik}$$

Weibull model models the baseline hazard as a function of time. Two parameters (baseline hazard and scale) must be estimated to describe the underlying hazard function over time.


To get Hazard Ratios (relative risk)…

•Weibull (and thus exponential) are proportional hazards models, so hazard ratio can be calculated.

•For other parametric models, you cannot calculate hazard ratio (hazards are not necessarily proportional over time).

Exponential model: $HR = e^{-\beta}$

Weibull model: $HR = e^{-\beta/\text{scale}}$

More tricky to get confidence intervals here!
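As a small worked illustration of the two conversions, the sketch below turns a LIFEREG-style coefficient into a hazard ratio. The helper name `hazard_ratio` is hypothetical; the input values are the `group` estimates and the Weibull scale reported in the LIFEREG output for the hepatitis trial later in this lecture. No confidence interval is attempted here.

```python
import math


def hazard_ratio(beta, scale=1.0):
    """HR = exp(-beta / scale); scale = 1 gives the exponential model."""
    return math.exp(-beta / scale)


# Estimates taken from the LIFEREG output for the hepatitis trial.
hr_exponential = hazard_ratio(0.9008)
hr_weibull = hazard_ratio(1.0544, scale=1.2673)

print(f"exponential HR = {hr_exponential:.3f}")
print(f"Weibull HR     = {hr_weibull:.3f}")
```

The sign flip comes from the accelerated failure time parameterization: a positive coefficient lengthens survival time, which corresponds to a hazard ratio below 1.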


What’s a hazard ratio?

Distinction between hazard/rate ratio and odds ratio/risk ratio:

Hazard/rate ratio: ratio of incidence rates

Odds/risk ratio: ratio of proportions


Example 1

Using data from pregnancy study…

Recall: roughly, hazard rates were similar over time

(implies exponential model should be a good fit).


The LIFEREG Procedure

Analysis of Parameter Estimates

                              Standard    95% Confidence       Chi-
Parameter       DF  Estimate     Error        Limits         Square   Pr > ChiSq
Intercept        1    2.2636    0.2049    1.8621    2.6651   122.08       <.0001
Scale            1    1.0217    0.1638    0.7462    1.3987
Weibull Shape    1    0.9788    0.1569    0.7149    1.3401

Scale of 1.0 makes a Weibull an exponential, so looks exponential.
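For the exponential model the maximum likelihood estimate has a closed form, which makes a quick sanity check possible: the estimated rate is the number of events divided by the total follow-up time. The sketch below is illustrative (not LIFEREG itself); it assumes the accelerated failure time intercept for an exponential model equals -log(rate), and the variable names are hypothetical.

```python
import math

# Pregnancy data, transcribed from the raw-data table (months).
conceived = [1] * 6 + [2] * 5 + [3] * 3 + [4] * 3 + [6] * 2 + [9] * 3 + [10, 13, 16]
censored = [2, 3, 4, 7, 7, 8, 8, 9, 9, 9, 11, 24, 24]

n_events = len(conceived)
person_months = sum(conceived) + sum(censored)

rate = n_events / person_months      # exponential MLE: events per month at risk
intercept = -math.log(rate)          # AFT-style intercept, log(1 / rate)


def exp_survival(t):
    """Fitted exponential survival function S(t) = exp(-rate * t)."""
    return math.exp(-rate * t)


print(f"events={n_events} person-months={person_months}")
print(f"rate={rate:.4f} per month, intercept={intercept:.4f}")
print(f"fitted S(6 months)={exp_survival(6):.3f}")
```

The intercept obtained this way is close to, but not identical with, the 2.2636 in the table above, because that fit also estimates a Weibull scale (1.02) rather than fixing it at 1.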


Parametric estimates of survival function based on a Weibull model (left) and exponential (right).

Compare to KM:


Example 2: 2 groups

Using data from hepatitis trial, I fit exponential and Weibull models in SAS using LIFEREG (Weibull is default in LIFEREG)…


The LIFEREG Procedure

Dependent Variable Log(time)

Right Censored Values 17

Left Censored Values 0

Interval Censored Values 0

Name of Distribution Exponential

Log Likelihood -68.03461345

Analysis of Parameter Estimates

                              Standard    95% Confidence       Chi-
Parameter       DF  Estimate     Error        Limits         Square   Pr > ChiSq
Intercept        1    4.4886    0.2500    3.9986    4.9786   322.37       <.0001
group            1    0.9008    0.3917    0.1332    1.6685     5.29       0.0214
Scale            0    1.0000    0.0000    1.0000    1.0000
Weibull Shape    0    1.0000    0.0000    1.0000    1.0000

P-value for group very similar to p-value from log-rank test.

Scale parameter is set to 1, because it’s exponential.

-2Log Likelihood = 2*68.03 = 136

Hazard ratio (treated vs. control):

e^(-0.9008) = .406

Interpretation: median time to death was about 2.5 times longer in the treated group (e^(0.9008) = 2.46); or, equivalently, mortality rate is 60% lower in treated group.


Model Information

Dependent Variable Log(time)

Right Censored Values 17

Left Censored Values 0

Interval Censored Values 0

Name of Distribution Weibull

Log Likelihood -66.94904552

Analysis of Parameter Estimates

                              Standard    95% Confidence       Chi-
Parameter       DF  Estimate     Error        Limits         Square   Pr > ChiSq
Intercept        1    4.4811    0.3169    3.8601    5.1022   200.00       <.0001
group            1    1.0544    0.5096    0.0556    2.0533     4.28       0.0385
Scale            1    1.2673    0.2139    0.9103    1.7643
Weibull Shape    1    0.7891    0.1332    0.5668    1.0985

Hazard ratio (treated vs. control):

e^(-1.05/1.267) = .43

P-value for group very similar to p-value from log-rank test and exponential model.

Scale parameter is greater than 1, indicating decreasing hazard with time.

-2Log Likelihood = 2*66.95 = 134

Shape parameter is just 1/scale parameter!

Comparison of models using Likelihood Ratio test:

-2LogLikelihood(simpler model) - (-2LogLikelihood(more complex)) = chi-square with 1 df (1 extra parameter estimated for Weibull model).

= 136.07 - 133.90 = 2.17

NS (below the 3.84 critical value for 1 df at the 0.05 level)

No evidence that Weibull model is much better than exponential.
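The likelihood ratio comparison can be reproduced from the two log likelihoods reported by LIFEREG. This is a minimal sketch with hypothetical variable names; the p-value again uses the identity P(chi-square with 1 df > x) = erfc(sqrt(x/2)).

```python
import math

# Log likelihoods reported in the two LIFEREG outputs.
loglik_exponential = -68.03461345   # simpler model (scale fixed at 1)
loglik_weibull = -66.94904552       # more complex model (scale estimated)

# Likelihood ratio statistic: difference in -2 log likelihood.
lr_stat = (-2 * loglik_exponential) - (-2 * loglik_weibull)

# One extra parameter, so compare against chi-square with 1 df.
p_value = math.erfc(math.sqrt(lr_stat / 2))

print(f"LR statistic = {lr_stat:.2f}, p = {p_value:.3f}")
```

A statistic of about 2.2 falls short of the 3.84 needed for significance at the 0.05 level, which matches the conclusion above.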


Parametric estimates of cumulative survival based on Weibull model (left) and exponential (right), by group.

Compare to KM:


Compare to Cox regression:

              Parameter   Standard                              Hazard   95% Hazard Ratio
Variable  DF   Estimate      Error   Chi-Square   Pr > ChiSq     Ratio   Confidence Limits
group      1   -0.83230    0.39739       4.3865       0.0362     0.435    0.200     0.948


References

Paul Allison. Survival Analysis Using SAS. SAS Institute Inc., Cary, NC: 2003.