Joint Modelling - Challenges and Future Directions

Joint Modelling of Longitudinal and Time-to-Event Data: Challenges and Future Directions

Dimitris Rizopoulos

Department of Biostatistics, Erasmus Medical Center, the Netherlands

Italian Statistical Society Annual Meeting

June 16th, 2010, Padova

Outline

• Basics of the Joint Modelling Framework

• Residuals

• Dynamic Predictions

SIS 2010, Padova

1.1 Introduction

• Over the last 10-15 years, increasing interest in joint modelling of longitudinal and time-to-event data (Tsiatis & Davidian, Stat. Sinica, 2004; Yu et al., Stat. Sinica, 2004)

• Wide range of applications

◃ HIV studies

◃ cancer studies

◃ surrogate markers

◃ . . .


1.2 When Are These Models Applicable?

• In many cases, we are interested in the effect of time-dependent covariates on survival times

◃ treatment changes with time (e.g., dose)

◃ time-dependent exposure (e.g., smoking)

◃ longitudinal measurements on the patient level (e.g., blood values)

◃ . . .


1.3 Case Study – AIDS Data

• Longitudinal study on 467 HIV-infected patients who had failed or were intolerant of zidovudine therapy

• Outcomes

◃ Time to death

◃ randomized treatment: didanosine (ddI) and zalcitabine (ddC)

◃ longitudinal measurements of CD4 cell counts


1.3 Case Study – AIDS Data (cont’d)

[Figure: CD4 cell count against time for a single patient, with the time to death marked; the patient died, with time to death = 2.9 months]

[Figure: CD4 cell count against months (0, 2, 6, 12, 18), in separate panels for ddC and ddI]

[Figure: survival against months (0 to 20) for the ddC and ddI groups. Died: 188 (40.3%)]


2.1 Requirement for Joint Modelling

• Aim: we want to measure the effect of a time-dependent covariate on the hazard for an event

• Problem: CD4 cell count is an internal time-dependent covariate with a stochastic nature

◃ we do not observe the true values, but rather values contaminated with measurement error

◃ the complete history is not available


2.1 Requirement for Joint Modelling (cont’d)

[Figure: CD4 cell count against time up to the event time, contrasting the values measured with error with the underlying true evolution]


2.1 Requirement for Joint Modelling (cont’d)

• Solution: Joint Modelling of Longitudinal and Time-to-Event Data


2.2 Joint Modelling Framework

• Step 1: let’s assume that we know $m_i(t)$, i.e., the true and unobserved value of the CD4 cell count at time $t$

• Then, we can define a standard relative risk model

$$h_i(t \mid \mathcal{M}_i(t)) = h_0(t) \exp\{\gamma^\top w_i + \alpha m_i(t)\},$$

where

◃ $\mathcal{M}_i(t) = \{m_i(s), 0 \le s < t\}$: CD4 cell count history

◃ $\alpha$: quantifies the effect of CD4 cell count on the hazard for death

◃ $w_i$: baseline covariates
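A minimal numeric sketch of the relative risk model above may help fix ideas. The constant baseline hazard, the linear marker trajectory, and all parameter values are illustrative assumptions, not estimates from the AIDS data; the function name `hazard` is hypothetical.

```python
import math

def hazard(t, w, gamma, alpha, m, h0):
    """Relative risk hazard h0(t) * exp(gamma'w + alpha * m(t))."""
    linear_predictor = sum(g * x for g, x in zip(gamma, w)) + alpha * m(t)
    return h0(t) * math.exp(linear_predictor)

# Hypothetical ingredients, chosen only for illustration.
h0 = lambda t: 0.05                 # constant baseline hazard
m = lambda t: 3.0 - 0.1 * t         # true (unobserved) marker trajectory
gamma, alpha = [0.3], -1.0          # baseline-covariate and marker effects

h_treated = hazard(2.0, [1.0], gamma, alpha, m, h0)
h_control = hazard(2.0, [0.0], gamma, alpha, m, h0)
hazard_ratio = h_treated / h_control   # equals exp(gamma) for a binary covariate
```

Because the marker trajectory is shared by both calls, the ratio isolates the baseline-covariate effect, while a negative alpha lowers the hazard as the marker increases.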


2.2 Joint Modelling Framework (cont’d)

• Step 2: from the observed longitudinal response $y_i(t)$ reconstruct the covariate history for each subject

• Mixed effects model

$$y_i(t) \mid b_i = m_i(t) + \varepsilon_i(t) = x_i^\top(t)\beta + z_i^\top(t) b_i + \varepsilon_i(t), \qquad \varepsilon_i(t) \sim \mathcal{N}(0, \sigma^2)$$

where

◃ $x_i(t)$ and $\beta$: fixed effects part

◃ $z_i(t)$ and $b_i$: random effects part
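To make the distinction between the true trajectory and the observed response concrete, the sketch below simulates one subject from a linear mixed model with a random intercept and a random slope. It assumes independent random effects (a diagonal covariance matrix) and uses hypothetical parameter values; `simulate_subject` is an illustrative name.

```python
import random

def simulate_subject(times, beta, D_sd, sigma, rng):
    """Simulate y(t) = (beta0 + b0) + (beta1 + b1) * t + eps for one subject."""
    b0 = rng.gauss(0.0, D_sd[0])   # random intercept
    b1 = rng.gauss(0.0, D_sd[1])   # random slope (assumed independent of b0)
    true_path = [(beta[0] + b0) + (beta[1] + b1) * t for t in times]
    observed = [m + rng.gauss(0.0, sigma) for m in true_path]  # add measurement error
    return true_path, observed

rng = random.Random(2010)
times = [0, 2, 6, 12, 18]
true_path, observed = simulate_subject(times, beta=(2.5, -0.04),
                                       D_sd=(0.9, 0.04), sigma=0.4, rng=rng)
```

Only `observed` would be available in practice; `true_path` plays the role of the unobserved trajectory that enters the hazard.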



• Step 3: the two processes are associated ⇒ define a model for their joint distribution

• Joint models for such joint distributions are of the following form (Tsiatis & Davidian, Stat. Sinica, 2004)

$$p(y_i, T_i, \delta_i) = \int p(y_i \mid b_i) \left\{ h_i(T_i \mid b_i)^{\delta_i} S_i(T_i \mid b_i) \right\} p(b_i) \, db_i$$

where

◃ $y_i$: longitudinal measurements; $T_i$: time to event; $\delta_i$: event indicator

◃ $b_i$: a vector of random effects that explains the interdependencies

◃ $p(\cdot)$: density function; $S_i(\cdot)$: survival function



• Assumption for the baseline hazard function $h_0(t)$: a choice that often works satisfactorily is a piecewise-constant baseline hazard

$$h_0(t) = \sum_{q=1}^{Q} \xi_q I(v_{q-1} < t \le v_q)$$

where $0 = v_0 < v_1 < \cdots < v_Q$ denotes a split of the time scale

• For the random effects: $b_i \sim \mathcal{N}(0, D)$

◃ sensitivity to misspecification: minimal, especially as $n_i$ increases (Rizopoulos, Verbeke & Molenberghs, Biometrika, 2008)
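The piecewise-constant baseline hazard described above can be sketched as follows. The knots and hazard levels are hypothetical; the second function computes the cumulative baseline hazard, which is available in closed form for this choice.

```python
def baseline_hazard(t, knots, xi):
    """h0(t) = xi[q] for knots[q] < t <= knots[q + 1]; knots[0] must be 0."""
    for q in range(len(xi)):
        if knots[q] < t <= knots[q + 1]:
            return xi[q]
    return 0.0  # outside (0, v_Q]; the model is not defined there

def cumulative_baseline_hazard(t, knots, xi):
    """Integral of h0 from 0 to t: sum of xi[q] times the overlap with interval q."""
    total = 0.0
    for q in range(len(xi)):
        overlap = min(t, knots[q + 1]) - knots[q]
        if overlap > 0:
            total += xi[q] * overlap
    return total

knots = [0.0, 6.0, 12.0, 18.0]   # hypothetical split 0 = v0 < v1 < v2 < v3
xi = [0.02, 0.03, 0.05]          # hypothetical hazard level in each interval
```

For example, `cumulative_baseline_hazard(9.0, knots, xi)` adds 6 months at level 0.02 and 3 months at level 0.03.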


2.3 Estimation

• Estimation of joint models is a computationally challenging task

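◃ computation of survival function: integration with respect to time

$$S_i(t \mid b_i) = \exp\left( -\int_0^t h_0(s) \exp\{\gamma^\top w_i + \alpha m_i(s)\} \, ds \right)$$

◃ computation of log-likelihood: integration with respect to the random effects

$$\ell(\theta) = \sum_i \log \int p(y_i \mid b_i; \theta) \left\{ p(T_i \mid b_i; \theta)^{\delta_i} S_i(T_i \mid b_i; \theta)^{1-\delta_i} \right\} p(b_i; \theta) \, db_i$$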
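As a rough illustration of the first integration step, the sketch below evaluates the survival function by integrating the hazard over time with the composite Simpson rule. Simpson's rule is used here only for simplicity and is not claimed to be what any particular software does; the constant baseline hazard, the linear trajectory, and the parameter values are assumptions chosen so that a closed form exists for checking.

```python
import math

def survival(t, hazard, n_steps=1000):
    """S(t) = exp(-integral_0^t hazard(s) ds), via the composite Simpson rule."""
    if n_steps % 2:
        n_steps += 1                      # Simpson needs an even number of steps
    step = t / n_steps
    total = hazard(0.0) + hazard(t)
    for k in range(1, n_steps):
        total += (4 if k % 2 else 2) * hazard(k * step)
    return math.exp(-total * step / 3.0)

# Hypothetical subject: constant baseline hazard, linear true trajectory m(s).
h0, gamma_w, alpha = 0.05, 0.3, -1.0
b0, b1 = 3.0, -0.1                        # subject-specific intercept and slope
hazard = lambda s: h0 * math.exp(gamma_w + alpha * (b0 + b1 * s))

# With these choices the integral has a closed form, used here as a check.
c = alpha * b1
closed_form = math.exp(-h0 * math.exp(gamma_w + alpha * b0) * (math.exp(c * 2.0) - 1.0) / c)
numeric = survival(2.0, hazard)
```

In a real joint model the trajectory is rarely this simple, which is why the time integral generally has to be approximated numerically.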


2.3 Estimation (cont’d)

• Combination of optimization and numerical integration

• Numerical integration

◃ Gaussian quadrature

◃ Monte Carlo

◃ Laplace (especially useful in high-dimensional settings) (Rizopoulos, Verbeke & Lesaffre, JRSSB, 2009)

• Optimization

◃ EM algorithm

◃ Newton-Raphson or quasi-Newton algorithm

◃ hybrid algorithms (combination of EM & quasi-Newton)
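To illustrate the numerical integration over the random effects, the sketch below computes one subject's likelihood contribution in a deliberately simplified setting: a single random intercept, a marker level that is constant in time, and a constant hazard given the random effect. It compares a Monte Carlo average with a fine-grid trapezoid rule. All names and parameter values are hypothetical.

```python
import math
import random

def normal_pdf(x, mean, sd):
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def conditional_likelihood(b, y, T, delta, beta0, sigma, h0, alpha):
    """p(y | b) * h(T | b)^delta * S(T | b) for a random-intercept model."""
    m = beta0 + b                                  # time-constant true marker level
    p_y = math.prod(normal_pdf(v, m, sigma) for v in y)
    h = h0 * math.exp(alpha * m)                   # constant hazard given b
    return p_y * (h ** delta) * math.exp(-h * T)

def marginal_monte_carlo(n_draws, d_sd, rng, **kw):
    """Average the conditional likelihood over draws b ~ N(0, d_sd^2)."""
    draws = (rng.gauss(0.0, d_sd) for _ in range(n_draws))
    return sum(conditional_likelihood(b, **kw) for b in draws) / n_draws

def marginal_grid(d_sd, n_points=4001, width=8.0, **kw):
    """Trapezoid rule over b in [-width * d_sd, width * d_sd]."""
    lo, step = -width * d_sd, 2 * width * d_sd / (n_points - 1)
    vals = [conditional_likelihood(lo + k * step, **kw) * normal_pdf(lo + k * step, 0.0, d_sd)
            for k in range(n_points)]
    return step * (sum(vals) - 0.5 * (vals[0] + vals[-1]))

subject = dict(y=[2.9, 2.7, 2.8], T=10.0, delta=1, beta0=2.5, sigma=0.4, h0=0.5, alpha=-1.0)
mc = marginal_monte_carlo(50_000, d_sd=0.9, rng=random.Random(1), **subject)
grid = marginal_grid(d_sd=0.9, **subject)
```

A grid is feasible here only because the random effect is one-dimensional; with several random effects the cost of grid or quadrature rules grows quickly, which is the motivation for the Monte Carlo and Laplace options listed above.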


3.1 A Joint Model for the AIDS Data

• Longitudinal submodel

◃ fixed effects: time effect + interaction of treatment with time

◃ random effects: intercept + time effect

• Survival submodel

◃ treatment effect + underlying CD4 cell count effect

◃ piecewise-constant baseline hazard in 7 intervals


3.1 A Joint Model for the AIDS Data (cont’d)

Survival                     Longitudinal                     Random Effects
            value (std.err)                value (std.err)             value
Treat        0.35 (0.15)     Inter          2.56 (0.04)       Inter     0.87
CD4         −1.10 (0.12)     Time          −0.04 (0.005)      Time      0.04
                             Treat:Time     0.01 (0.01)       ρ         0.07
                             σ              0.38

• Highly significant association between the underlying CD4 cell count at time t and the risk for death at t


3.2 A Comparison of JM vs naive TD Cox

• To illustrate the virtues of joint modelling, we compare with the standard time-dependent Cox model

◃ i.e., we ignore the measurement error in the CD4 cell count

          Joint Model        Naive TD Cox
          value (std.err)    value (std.err)
Treat      0.35 (0.15)        0.33 (0.15)
CD4       −1.10 (0.12)       −0.72 (0.08)

• Clearly, ignoring the measurement error has a considerable effect, especially on the estimated effect of CD4!
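The size of the difference is easier to read on the hazard ratio scale. The short computation below takes the CD4 estimates and standard errors from the table above and derives hazard ratios, Wald statistics, and the ratio of the two coefficients. The interpretation "per unit of CD4" refers to whatever scale the marker enters the model on, which the slides do not state.

```python
import math

# Estimates and standard errors for the CD4 association, copied from the table above.
estimates = {"joint model": (-1.10, 0.12), "naive TD Cox": (-0.72, 0.08)}

summary = {}
for name, (coef, se) in estimates.items():
    summary[name] = {
        "hazard_ratio": math.exp(coef),   # multiplicative change in hazard per unit of CD4
        "wald_z": coef / se,              # Wald statistic for H0: coefficient = 0
    }

# Relative size of the naive coefficient compared with the joint-model one.
attenuation = estimates["naive TD Cox"][0] / estimates["joint model"][0]
```

The hazard ratio is about 0.33 under the joint model against about 0.49 under the naive time-dependent Cox model, so the naive coefficient is roughly 65% of the joint-model one.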


3.2 A Comparison of JM vs naive TD Cox (cont’d)

[Figure: CD4 cell count against time up to the event time; legend: joint model, time-dependent Cox]


4.1 Extensions

• Joint modelling of longitudinal and time-to-event data is a very active area of current Biostatistics research

• Several extensions of the standard joint model have been proposed

◃ recurrent events

◃ competing risks

◃ semiparametric modelling of longitudinal profiles

◃ semiparametric modelling of random effects distribution

◃ . . .


4.2 Challenges: Checking Model Assumptions

• Various types of residuals have been proposed to check the fit of mixed models and survival models separately

• However, not all of these can be directly used for joint models

• Problems arise due to nonrandom dropout, i.e., the longitudinal evolutions that we end up observing are not a random sample of the target population

⇓

The reference distribution of the observed residuals is not directly evident



• $y_i^o$: longitudinal measurements before $T_i$; $y_i^m$: longitudinal measurements after $T_i$

• Missing data mechanism:

$$p(T_i \mid y_i^o, y_i^m) = \int p(T_i \mid b_i) \, p(b_i \mid y_i^o, y_i^m) \, db_i$$

• This still depends on $y_i^m$, which means nonrandom dropout

• Observed measurements are not a random sample of the target population
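A small simulation can make this selection effect visible. The sketch assumes a random-intercept marker that is constant in time, an exponential event time whose hazard decreases with the marker level, and hypothetical parameter values; it then compares the marker level of all subjects with that of the subjects still at risk at a later visit.

```python
import math
import random

def simulate_dropout(n, alpha, h0, t_obs, rng):
    """Return the random intercepts of all subjects and of those still at risk at t_obs."""
    everyone, at_risk = [], []
    for _ in range(n):
        b = rng.gauss(0.0, 1.0)                    # subject-specific marker level
        hazard = h0 * math.exp(alpha * b)          # constant hazard given b
        event_time = rng.expovariate(hazard)       # exponential event time
        everyone.append(b)
        if event_time > t_obs:                     # marker at t_obs is observed
            at_risk.append(b)
    return everyone, at_risk

everyone, at_risk = simulate_dropout(n=20_000, alpha=-1.0, h0=0.2, t_obs=5.0,
                                     rng=random.Random(42))
mean_all = sum(everyone) / len(everyone)
mean_observed = sum(at_risk) / len(at_risk)
```

With a negative association, subjects with low marker levels tend to leave the study early, so `mean_observed` sits clearly above `mean_all`. Residuals computed from the observed data alone inherit this shift.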



• Little work has been done on checking the assumptions of joint models

• Two approaches

◃ Conditional residuals (Dobson & Henderson, Biometrics, 2003)

◃ Multiple Imputation residuals (Rizopoulos, Verbeke & Molenberghs, Biometrics, 2010)

* appropriately multiply impute $y_i^m$

* complete data ⇒ use standard residual plots

• More work is required


4.3 Challenges: Dynamic Prediction

• Increasing interest in individualized predictions

◃ e.g., how treatment will work on a specific patient (with a specific medical history)

• Within the Joint Modelling Framework

◃ estimate the probability for an event for a specific patient

◃ how this probability changes as longitudinal measurements are collected

• Dynamic Prediction: update probabilities for an event dynamically as longitudinal information is recorded


4.3 Challenges: Dynamic Prediction (cont’d)

• Example: we take two patients who provided 4 CD4 cell count measurements each



[Figure: CD4 cell count (2.0 to 4.0) against months (0, 2, 6, 12), in separate panels for Patient 20 and Patient 7]



• We are interested in comparing predicted survival probabilities for these two patients



• More formally, we have available measurements up to time point $t$

$$\mathcal{Y}_i(t) = \{y_i(s), 0 \le s \le t\}$$

and we are interested in

$$\pi_i(u \mid t) = \Pr\{T_i \ge u \mid T_i > t, \mathcal{Y}_i(t), \mathcal{D}_n\}$$

where

◃ $u > t$, and

◃ $\mathcal{D}_n$ denotes the sample on which the joint model was fitted



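• It is convenient to proceed using a Bayesian formulation of the problem. More specifically, $\pi_i(u \mid t)$ can be written as

$$\Pr\{T_i \ge u \mid T_i > t, \mathcal{Y}_i(t), \mathcal{D}_n\} = \int \Pr\{T_i \ge u \mid T_i > t, \mathcal{Y}_i(t); \theta\} \, p(\theta \mid \mathcal{D}_n) \, d\theta$$

• The first part of the integrand reduces to

$$\Pr\{T_i \ge u \mid T_i > t, \mathcal{Y}_i(t); \theta\} = \int \frac{S_i\{u \mid \mathcal{M}_i(u, b_i, \theta); \theta\}}{S_i\{t \mid \mathcal{M}_i(t, b_i, \theta); \theta\}} \, p(b_i \mid T_i > t, \mathcal{Y}_i(t); \theta) \, db_i$$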
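The inner integral can be sketched for a simplified model. The code below assumes a single random intercept, a linear trajectory, a constant baseline hazard, and parameters held fixed at hypothetical values, so the outer integration over the posterior of the parameters is skipped. The integral over the random intercept is approximated on a grid, which is a convenience for this one-dimensional case and not the approach described in the slides.

```python
import math

def normal_pdf(x, mean, sd):
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def cumulative_hazard(t, b, th):
    """Integral over [0, t] of h0 * exp(alpha * (beta0 + beta1 * s + b)) ds."""
    c = th["alpha"] * th["beta1"]
    scale = th["h0"] * math.exp(th["alpha"] * (th["beta0"] + b))
    return scale * (t if c == 0 else (math.exp(c * t) - 1.0) / c)

def conditional_survival(u, t, times, y, th, n_points=2001, width=6.0):
    """Plug-in estimate of Pr(T >= u | T > t, y) with the parameters held fixed."""
    lo = -width * th["d_sd"]
    step = 2 * width * th["d_sd"] / (n_points - 1)
    numerator = denominator = 0.0
    for k in range(n_points):
        b = lo + k * step
        # Unnormalised posterior of b given survival up to t and the observed marker values.
        weight = normal_pdf(b, 0.0, th["d_sd"]) * math.exp(-cumulative_hazard(t, b, th))
        for s, v in zip(times, y):
            weight *= normal_pdf(v, th["beta0"] + th["beta1"] * s + b, th["sigma"])
        # S(u | b) / S(t | b) = exp(-(H(u) - H(t))).
        ratio = math.exp(-(cumulative_hazard(u, b, th) - cumulative_hazard(t, b, th)))
        numerator += weight * ratio
        denominator += weight
    return numerator / denominator

theta = dict(beta0=2.5, beta1=-0.04, sigma=0.4, d_sd=0.9, h0=0.5, alpha=-1.0)  # hypothetical
times = [0, 2, 6, 12]
low_marker = conditional_survival(14, 12, times, [2.0, 1.9, 1.7, 1.5], theta)
high_marker = conditional_survival(14, 12, times, [3.6, 3.5, 3.4, 3.3], theta)
```

Both calls ask for the probability of surviving two more months after the 12-month visit. With a negative association parameter, the subject with the higher marker values gets the higher predicted survival, and the prediction can be recomputed each time a new measurement arrives.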



[Figure: probability of surviving an extra 2 months (axis from 0.6 to 1.0), evaluated at baseline and after 2, 6 and 12 months; panel shown for Patient 20, with a second panel of the same layout]




5 Software

• Software: R package JM freely available via http://cran.r-project.org/

◃ http://rwiki.sciviews.org/doku.php?id=packages:cran:jm

• Longitudinal process

◃ linear mixed effects model

• Survival process

◃ Wulfsohn and Tsiatis model (Biometrics, 1997)

◃ Weibull accelerated failure time & PH model

◃ PH model with piecewise-constant baseline hazard

◃ PH model with spline-approximated baseline hazard


Thank you for your attention!

