Yisen Lin's Term 2 Presentation

University of Toronto. Cox Proportional Hazards Model: Modeling Overall Survival Rates with Multi-Covariates. Supervisor: Lisa Wang (Biostatistician). Presentation at DLSPH by Yisen Lin (MSc Candidate), April 7th, 2016.

Transcript of Yisen Lin's Term 2 Presentation

Page 1: Yisen Lin's Term 2 Presentation

University of Toronto | 1

Cox Proportional Hazards Model: Modeling Overall Survival Rates with Multi-Covariates

Supervisor: Lisa Wang (Biostatistician)

Presentation at DLSPH: Yisen Lin (MSc Candidate)

April 7th, 2016

Page 2: Yisen Lin's Term 2 Presentation

Content

• Research Site: Princess Margaret Cancer Centre

• General Outline

• Model Development:
  - Overview of the Data
  - Cox PH Model Introduction
  - Covariates Selection
  - Model Validation

• Conclusion

• Acknowledgement

Page 3: Yisen Lin's Term 2 Presentation

Princess Margaret Cancer Centre:
• Largest cancer centre in Canada
• One of the 5 largest cancer centres in the world

Biostatistics Department at PMCC:
• Supporting approx. 150 biostatistical requests
• Closely affiliated with DLSPH at the University of Toronto

Page 4: Yisen Lin's Term 2 Presentation

General Outline

1. Read/Understand Completed Clinical Trial Protocol

2. Reproduce the Results of the Data Analysis Part of the Completed Clinical Trial Study

3. Model Development and Validation on the Given Datasets

My second-term work focuses mainly on Part 3.

Page 5: Yisen Lin's Term 2 Presentation

Model Development: Overview of the data

2 datasets for one model development

Covariates Listing:
1. Hb: male 140-180 g/L, female 120-160 g/L (looking for the effect below the lower limit of normal, LLN)
2. Plt: 150-400 ×10^9/L (looking for the effect above the upper limit of normal, ULN)
3. Na: 135-145 mmol/L (looking for LLN)
4. Albumin: 38-50 g/L (looking for LLN)
5. LDH: 125-220 U/L (looking for ULN)

Note: the continuous covariates above (1-5) are converted to binary covariates in the model.

6. Age
7. Gender
8. ECOG
9. No. of metastatic sites
10. No. of systemic therapies

Goal: find predictors of overall survival using the Cox PH model.

Page 6: Yisen Lin's Term 2 Presentation

Model Development: Overview of the data

Study No. | Start Date | DOB | Gender (1=M, 0=F) | ECOG | Hb | Platelet | Neutrophil | Lymphocyte | NLR | Na | Albumin | LDH | Metastatic sites no. | No. Systemic Tx | Start Date | Stop date | Best response by RECIST | Date of progression | Last date of follow-up | Death
1 | 02-Apr-14 | 27-Aug-55 | 1 | 1 | 128 | 226 | 2.6 | 1.2 | 2.17 | 137 | 43 | 173 | 2 | 0 | 02-Apr-14 | Ongoing | PR | NA | 15-Dec-15 | Alive
2 | 02-Apr-14 | 10-May-47 | 1 | 1 | 135 | 187 | 5.3 | 1.8 | 2.94 | 141 | 41 | 230 | 3 | 1 | 02-Apr-14 | 19-Aug-14 | SD | 19-Aug-14 | 15-Dec-15 | Alive
3 | 03-Apr-14 | 21-Sep-54 | 0 | 1 | 98 | 306 | 7.9 | 0.2 | 39.50 | 133 | 32 | 338 | 4 | 1 | 03-Apr-14 | 23-Apr-14 | PD | 22-Apr-15 | 05-May-14 | Deceased
4 | 03-Apr-14 | 19-Mar-55 | 1 | 1 | 132 | 173 | 3 | 0.4 | 7.50 | 139 | 44 | 228 | 2 | 2 | 03-Apr-14 | 24-Jul-14 | SD | 24-Jul-14 | 09-Sep-14 | Deceased
5 | 07-Apr-14 | 19-Feb-60 | 0 | 1 | 105 | 367 | 6.4 | 1.3 | 4.92 | 135 | 38 | 248 | 2 | 1 | 07-Apr-14 | 08-Jul-14 | SD | 08-Jul-14 | 14-Apr-15 | Deceased
6 | 08-Apr-14 | 06-Dec-61 | 0 | 1 | 120 | 345 | 5.2 | 0.8 | 6.50 | 132 | 37 | 230 | 4 | 6 | 08-Apr-14 | 11-Aug-14 | SD | 11-Aug-14 | 04-Sep-14 | Deceased
7 | 16-Apr-14 | 13-May-52 | 0 | 1 | 103 | 144 | 3.2 | 0.7 | 4.57 | 137 | 36 | 318 | 3 | 2 | 16-Apr-14 | 22-Sep-14 | PR | NA | 15-Dec-15 | Alive
8 | 22-Apr-14 | 08-Mar-66 | 1 | 0 | 110 | 104 | 1.5 | 0.4 | 3.75 | 139 | 39 | 206 | 4 | 2 | 22-Apr-14 | 23-Jul-14 | PD | 23-Jul-14 | 15-Dec-15 | Alive
9 | 23-Apr-14 | 20-Mar-51 | 0 | 1 | 97 | 332 | 4.5 | 0.3 | 15.00 | 137 | 34 | 594 | 4 | 2 | 23-Apr-14 | 03-Jul-14 | PD | 03-Jul-14 | 27-Sep-14 | Deceased
10 | 24-Apr-14 | 11-Dec-45 | 0 | 1 | 122 | 243 | 5 | 1.6 | 3.13 | 134 | 39 | 401 | 1 | 1 | 24-Apr-14 | 23-Jun-14 | PD | 23-Jun-14 | 06-Jul-14 | Deceased
11 | 28-Apr-14 | 10-Apr-71 | 1 | 1 | 106 | 246 | 9.5 | 1 | 9.50 | 142 | 35 | 436 | 3 | 2 | 28-Apr-14 | 21-May-14 | NE | 21-May-14 | 10-Jun-14 | Deceased
12 | 09-May-14 | 10-Jan-73 | 0 | 1 | 122 | 247 | 4.7 | 0.8 | 5.88 | 137 | 44 | 413 | 2 | 0 | 09-May-14 | 27-Oct-14 | PD | 27-Oct-14 | 29-Oct-14 | Alive
13 | 15-May-14 | 16-Aug-57 | 0 | 0 | 112 | 235 | 2.8 | 0.9 | 3.11 | 138 | 44 | 476 | 5 | 4 | 15-May-14 | 08-Aug-14 | PD | 08-Aug-14 | 05-May-15 | Alive
14 | 27-May-14 | 06-Apr-69 | 1 | 1 | 123 | 178 | 4.1 | 1.6 | 2.56 | 142 | 43 | 178 | 0 | 3 | 27-May-14 | 09-Mar-15 | SD | 09-Mar-15 | 13-Apr-15 | Alive
15 | 27-May-14 | 18-Feb-66 | 1 | 1 | 154 | 163 | 2.3 | 1.2 | 1.92 | 149 | 44 | 194 | 3 | 1 | 27-May-14 | 13-Jan-15 | SD | 13-Jan-15 | 15-Dec-15 | Alive
16 | 03-Jun-14 | 09-Jan-43 | 0 | 1 | 92 | 212 | 3.5 | 2 | 1.75 | 137 | 36 | 479 | 1 | 1 | 03-Jun-14 | 01-Aug-14 | PD | 01-Aug-14 | 20-Sep-15 | Alive
17 | 05-Jun-14 | 08-Sep-70 | 0 | 1 | 118 | 141 | 3 | 0.6 | 5.00 | 142 | 41 | 214 | 2 | 3 | 05-Jun-14 | 13-Aug-14 | PD | 13-Aug-14 | 19-Jan-15 | Deceased
18 | 05-Jun-14 | 02-Apr-59 | 0 | 0 | 123 | 161 | 2.4 | 3 | 0.80 | 143 | 43 | 353 | 3 | 1 | 05-Jun-14 | 10-Nov-14 | PD | 10-Nov-14 | 14-Apr-15 | Deceased
19 | 05-Jun-14 | 21-Jul-79 | 0 | 0 | 129 | 154 | 3 | 0.3 | 10.00 | 139 | 38 | 138 | 0 | 1 | 05-Jun-14 | 07-Aug-14 | SD | 07-Aug-14 | 27-Aug-14 | Deceased
20 | 11-Jun-14 | 29-Jan-60 | 1 | 0 | 138 | 184 | 4.2 | 2 | 2.10 | 146 | 43 | 226 | 2 | 1 | 11-Jun-14 | 23-Jul-14 | PD | 23-Jul-14 | 30-Aug-15 | Alive
21 | 19-Jun-14 | 08-Jan-70 | 1 | 0 | 151 | 286 | 2.3 | 1.2 | 1.92 | 137 | 42 | 210 | 2 | 3 | 19-Jun-14 | Ongoing | PR | NA | 15-Dec-15 | Alive

Major Data Manipulation: convert the original continuous covariates to binary covariates.

For example: Albumin 38-50 g/L (looking for LLN) means assigning 1 to patients with albumin level < 38 g/L and 0 otherwise.
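The conversion described above could be sketched as follows. This is a minimal illustration, not the code used in the analysis: the helper names `below_lln` and `above_uln` are made up here, and the sample values are the albumin and LDH entries of the first three patients in the table.

```python
def below_lln(value, lln):
    """Return 1 if the value is below the lower limit of normal, else 0."""
    return 1 if value < lln else 0


def above_uln(value, uln):
    """Return 1 if the value is above the upper limit of normal, else 0."""
    return 1 if value > uln else 0


# Albumin (g/L) and LDH (U/L) for patients 1-3 in the table
albumin = [43, 41, 32]
ldh = [173, 230, 338]

# Limits taken from the covariate listing: albumin LLN = 38, LDH ULN = 220
albumin_low = [below_lln(v, 38) for v in albumin]
ldh_high = [above_uln(v, 220) for v in ldh]
```

For Hb the lower limit depends on gender (140 g/L for males, 120 g/L for females), so the limit passed in would be chosen per patient.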

Page 7: Yisen Lin's Term 2 Presentation

Kaplan-Meier Plot for OS

A general look at the survival proportion before any model fitting
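For readers who want to see what the Kaplan-Meier curve computes, here is a minimal pure-Python sketch of the product-limit estimator. The function name `kaplan_meier` and the toy follow-up times are illustrative assumptions, not the study data or the software used for the plot.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of S(t).

    times  : follow-up time for each patient
    events : 1 if death was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each distinct death time.
    """
    pairs = sorted(zip(times, events))
    n_at_risk = len(pairs)
    surv = 1.0
    curve = []
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose follow-up ends exactly at time t
        while i < len(pairs) and pairs[i][0] == t:
            deaths += pairs[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            surv *= 1 - deaths / n_at_risk
            curve.append((t, surv))
        # Deaths and censored patients both leave the risk set
        n_at_risk -= leaving
    return curve


# Toy example (made-up numbers): 5 patients, 2 of them censored
toy_curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
```

Censored patients contribute to the risk set up to their last follow-up but never cause a drop in the curve.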

Page 8: Yisen Lin's Term 2 Presentation

Cox Proportional Hazards Model Introduction

Why this model?

1. We want to assess the effect of multiple covariates on overall survival (we have 10), and the Cox proportional hazards model is the most commonly used multivariable survival method.
2. Robustness: a safe choice in many situations.
3. The estimated hazard is always non-negative.
4. Hazard rate and survival rate can be estimated under minimal assumptions.
5. Unlike a logistic model, it takes censoring information and survival time into account. (We are going to use a logistic model for 90-day mortality for score testing.)
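For reference, the standard form of the Cox model is shown below. The symbols are not from the slides: h_0(t) denotes the unspecified baseline hazard, x the covariate vector, and β the coefficients.

```latex
h(t \mid x) = h_0(t)\,\exp\!\left(\beta_1 x_1 + \dots + \beta_p x_p\right),
\qquad
\frac{h(t \mid x_a)}{h(t \mid x_b)} = \exp\!\left(\beta^{\top}(x_a - x_b)\right)
```

Because the exponential is always positive, the hazard is non-negative for any β (point 3). The ratio on the right does not depend on t, which is the proportional hazards assumption, and h_0(t) is left unspecified, which is why few assumptions are needed (point 4).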

Page 9: Yisen Lin's Term 2 Presentation

Covariates Selection

1. Univariate analysis (covariates with p-value ≥ 0.05 are excluded as non-significant):

Covariate | P-value
Hb | 0.0527
Plt | 0.011
Na | 0.0779
Albumin | 2.07e-5
LDH | 0.0122
Age | 0.766
Gender | 0.746
Meta.no. | 0.00237
Therapy.no. | 0.0158
ECOG | 4.09e-9

(Hb, Na, Gender and Age have been removed from the model, and a multivariate model is fitted with all the covariates having p < 0.05.)
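The screening step can be reproduced directly from the p-values reported on this slide. The snippet below is only a sketch of the filtering rule; the variable names are illustrative.

```python
# Univariate p-values as reported on the slide
p_values = {
    "Hb": 0.0527,
    "Plt": 0.011,
    "Na": 0.0779,
    "Albumin": 2.07e-5,
    "LDH": 0.0122,
    "Age": 0.766,
    "Gender": 0.746,
    "Meta.no.": 0.00237,
    "Therapy.no.": 0.0158,
    "ECOG": 4.09e-9,
}

ALPHA = 0.05  # significance threshold used for screening

# Covariates carried forward to the multivariate model
kept = [name for name, p in p_values.items() if p < ALPHA]
# Covariates removed at the univariate stage
dropped = [name for name, p in p_values.items() if p >= ALPHA]
```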

Page 10: Yisen Lin's Term 2 Presentation

Covariates Selection

2. Apply both forward and backward selection (selection criteria: smaller AIC and p-value < 0.05)

Both methods give the same result, which suggests the model is stable.
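As a rough illustration of AIC-driven forward selection, the sketch below adds one covariate at a time while AIC keeps decreasing. It is an assumption-laden outline: `aic_of` stands for any user-supplied function that fits a model on a list of covariates and returns its AIC, and the toy AIC used in the example is invented purely to exercise the loop. The p-value criterion from the slide is not implemented here.

```python
def forward_select(candidates, aic_of):
    """Greedy forward selection: add the covariate that lowers AIC the most,
    and stop when no addition lowers AIC further."""
    selected = []
    best_aic = aic_of(selected)
    improved = True
    while improved:
        improved = False
        trials = [(aic_of(selected + [c]), c)
                  for c in candidates if c not in selected]
        if trials:
            aic, cov = min(trials)
            if aic < best_aic:
                selected.append(cov)
                best_aic = aic
                improved = True
    return selected, best_aic


# Toy AIC (hypothetical): each covariate has a made-up "gain",
# and every extra parameter costs 2, as in AIC = 2k - 2 log L.
toy_gain = {"a": 10, "b": 1, "c": 5}


def toy_aic(covs):
    return 100 - sum(toy_gain[c] for c in covs) + 2 * len(covs)


chosen, final_aic = forward_select(["a", "b", "c"], toy_aic)
```

Backward selection runs the same idea in reverse, starting from the full model and dropping the covariate whose removal lowers AIC the most.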

Covariates in Final Model

Albumin 38-50 g/L (LLN)

Number of Metastatic Sites

ECOG

For the final model, the C-index is 0.724.
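The C-index measures how often the model ranks a pair of patients correctly. Below is a minimal sketch of Harrell's concordance index in pure Python; the function name and the toy inputs are illustrative, and statistical packages may handle tied times slightly differently.

```python
def harrell_c_index(times, events, risk_scores):
    """Fraction of comparable pairs in which the patient who died earlier
    has the higher predicted risk (ties in risk count as 0.5)."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i has an observed death
            # strictly before patient j's follow-up time
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / comparable


# Toy data (made up): higher risk score should mean shorter survival
toy_times = [1, 2, 3, 4]
toy_events = [1, 1, 0, 1]
perfect = harrell_c_index(toy_times, toy_events, [4, 3, 2, 1])
reversed_order = harrell_c_index(toy_times, toy_events, [1, 2, 3, 4])
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking.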

Page 11: Yisen Lin's Term 2 Presentation

Covariates Selection: Model Diagnostics

The residual plot is roughly symmetric around 0, which suggests that the linearity assumption is reasonable.

Page 12: Yisen Lin's Term 2 Presentation

Covariates Selection: Model Diagnostics

Testing the proportionality assumption for all 3 covariates:

The p-values are all large, so there is no evidence of non-proportional hazards. The proposed Cox PH model therefore appears appropriate.

Page 13: Yisen Lin's Term 2 Presentation

Model Validation on 90-Day Mortality

Using the same covariates as the Cox PH model for overall survival, on the same dataset

The area under the curve is 0.6764, which is acceptable.
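The area under the ROC curve can be read as the probability that a randomly chosen patient who died within 90 days receives a higher predicted risk than a randomly chosen patient who did not. A minimal sketch of that pairwise definition follows; the function name and toy inputs are illustrative, not the study data.

```python
def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.
    labels: 1 = died within 90 days, 0 = did not; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (made-up predicted probabilities)
toy_auc = auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

For a binary outcome this quantity is the same as the c-index of the logistic model.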

Page 14: Yisen Lin's Term 2 Presentation

Model Validation on 90-Day Mortality

1. Note that the c-index for the logistic model is 0.724, which is already quite close to 0.8 (we consider 0.8 the threshold for a model with good predictive ability).

2. To correct for over-optimism, we use the bootstrap to obtain the adjusted c-index (0.731), which is slightly higher.
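One common way to correct for over-optimism is Harrell's bootstrap procedure: refit the model on each bootstrap sample, measure how much better it scores on that sample than on the original data, and subtract the average difference from the apparent performance. The sketch below assumes this is the kind of adjustment meant; the slides do not state the exact procedure. `fit` and `score` are placeholders for a model-fitting function and a performance measure such as the c-index.

```python
import random


def optimism_corrected(data, fit, score, n_boot=200, seed=0):
    """Bootstrap optimism correction.

    data  : list of patient records
    fit   : function(records) -> fitted model
    score : function(model, records) -> performance (e.g. c-index)
    """
    rng = random.Random(seed)
    apparent = score(fit(data), data)
    optimism = 0.0
    for _ in range(n_boot):
        # Resample patients with replacement
        boot = [rng.choice(data) for _ in data]
        model = fit(boot)
        # Performance on the bootstrap sample minus performance on the original data
        optimism += score(model, boot) - score(model, data)
    return apparent - optimism / n_boot
```

The seed is fixed only to make the resampling reproducible.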

Page 15: Yisen Lin's Term 2 Presentation

Conclusion: Current work

In addition to the work presented above, I also take the covariates and the parameter estimates obtained from dataset 1 and validate the model on dataset 2. (The procedure is very similar to what I did on dataset 1.)

Results will be presented in the report.

A glance at dataset 2

To keep the covariate definitions consistent, we need to adjust the corresponding covariates in dataset 2.
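Validating on dataset 2 with parameters fixed from dataset 1 amounts to computing each new patient's linear predictor with the dataset 1 coefficients, without refitting. The sketch below shows only that step. The coefficient values are placeholders and not estimates from this study, and the covariate names are hypothetical.

```python
def risk_score(coefs, patient):
    """Linear predictor: sum of coefficient times covariate value."""
    return sum(beta * patient[name] for name, beta in coefs.items())


# Placeholder coefficients (NOT results from the study), standing in for
# the estimates obtained on dataset 1
coefs_from_dataset1 = {"albumin_low": 0.5, "n_met_sites": 0.2, "ecog": 0.8}

# A dataset 2 patient, with covariates recoded to match the dataset 1 definitions
patient = {"albumin_low": 1, "n_met_sites": 3, "ecog": 1}

score = risk_score(coefs_from_dataset1, patient)
```

The scores for all dataset 2 patients could then be compared with their observed survival, for example through the c-index.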

Page 16: Yisen Lin's Term 2 Presentation

Thanks For Listening