Chapter 4 Parametric modelling - Lancaster University
Chapter 4 Parametric modelling
Parametric models
• “regression: obs = predictor + error”;
(or signal + noise)
The big intellectual step forward:
• (i) predictor model, contains covariates,
explanatory variables;
• (ii) error model, model for random variation,
e.g. normal.
Generalise: glms are built from
a linear predictor, which gives a regression-type
term, and
an exponential family (EF) model for the uncertainty or natural
random variation.
Exercise 4.1 Interest lies in comparing the
survival of two groups: is this modelled within the
predictor or within the error?
Sol: 4.1 In parametric survival modelling a
group effect may be represented in either.
e.g. different mean lifetimes in a linear predictor
e.g. different survival functions in the error
distribution.
Exercise 4.2 Is a model for a lifetime variable a
model for the predictor or for the error or for the
observed value?
Sol: 4.2 Both.
Need to think about what modelling means.
Exercise 4.3 How are the hazard and survivor
functions expressed within this framework:
observation, predictor or error?
Sol: 4.3
obs = data + model → h
Probability distributions for lifetime
variables
Exponential distribution
The exponential distribution is the ‘canonical
model’ for survival analysis.
lifetime T ∼ Exp(λ) λ > 0, T ≥ 0
pdf f(t) = λ exp(−λt), t ≥ 0;
survivor function S(t) = exp(−λt), t ≥ 0;
integrated hazard H(t) = λt, t ≥ 0;
hazard function h(t) = λ, t ≥ 0.
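A minimal R sketch of these four relationships, using base R's dexp and pexp. The rate 0.5 and the time grid are illustrative assumptions, not values from the notes.

```r
# Exponential lifetime functions; lambda and the grid are illustrative assumptions
lambda <- 0.5
t <- seq(0, 10, by = 0.5)

f <- dexp(t, rate = lambda)                      # pdf
S <- pexp(t, rate = lambda, lower.tail = FALSE)  # survivor function
H <- -log(S)                                     # integrated hazard
h <- f / S                                       # hazard function

# H(t) should equal lambda * t and h(t) should be constant at lambda
max_err_H <- max(abs(H - lambda * t))
max_err_h <- max(abs(h - lambda))
print(c(max_err_H, max_err_h))
```

Both printed errors should be at rounding level, reflecting H(t) = −log S(t) and h(t) = f(t)/S(t).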
Exercise 4.4 Suggest an appropriate scale to
plot the survivor function.
Sol: 4.4
log S(t) = −λt is linear in t,
so plot (t, log S(t)).
Exercise 4.5 Suggest how covariates can be
represented in this model.
Sol: 4.5
Through λ, note E(T ) = 1/λ.
i.e. this is an instance of a glm.
Weibull distribution
The Weibull distribution has scale parameter
λ > 0 and shape parameter α > 0.
lifetime T ∼ Weibull(α, λ), λ > 0, α > 0, T ≥ 0
pdf f(t) = α λ^α t^{α−1} exp{−(λt)^α}, t ≥ 0;
survivor function S(t) = exp{−(λt)^α}, t ≥ 0;
integrated hazard H(t) = (λt)^α, t ≥ 0;
hazard function h(t) = α λ^α t^{α−1}, t ≥ 0.
Exercise 4.6 Suggest an appropriate scale to
plot the survivor function.
Sol: 4.6
log S(t) = −(λt)^α and
log[−log S(t)] = α log λ + α log(t) is linear in log(t):
so plot (log(t), log[−log S(t)]).
The linear plot allows rough estimates of (α, λ)
to be read from the graph: the slope is α and the intercept is α log λ.
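For a Weibull lifetime, log[−log S(t)] = α log λ + α log t, so a straight-line fit on this scale recovers both parameters. The R sketch below illustrates this on exact survivor values; the choices α = 2 and λ = 0.1 and the time grid are assumptions for illustration only.

```r
# Weibull linearising transformation; alpha and lambda are illustrative assumptions
alpha <- 2
lambda <- 0.1
t <- seq(1, 30, length.out = 50)

S <- exp(-(lambda * t)^alpha)        # Weibull survivor function
x <- log(t)
y <- log(-log(S))                    # complementary log-log scale

fit <- lm(y ~ x)                     # intercept alpha*log(lambda), slope alpha
alpha_hat <- unname(coef(fit)[2])
lambda_hat <- unname(exp(coef(fit)[1] / alpha_hat))
print(c(alpha = alpha_hat, lambda = lambda_hat))
```

With real data, S would be replaced by an estimate such as the Kaplan-Meier curve, and the recovered values would only be rough.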
The parameters:
λ is the scale parameter, and
α is the shape parameter.
• α > 1, increasing hazard function
• α < 1, decreasing hazard function
• α = 1, Weibull(1, λ) = Exp(λ).
The scale of the hazard is determined by λ.
Looking back, for example, at the bearings
data, we have near linearity on the complementary
log–log scale but not the logarithmic scale.
The Weibull distribution is a better model for the
data than the exponential distribution.
Weibull moments
The Weibull(α, λ) distribution has the following
moments:
Expectation E(T) = Γ(1 + 1/α)/λ;
Variance
var(T) = [Γ(1 + 2/α) − Γ(1 + 1/α)^2]/λ^2.
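The moment formulas can be checked by numerical integration. In this sketch the parameter values are arbitrary, and the mapping to R's dweibull (shape = α, scale = 1/λ) is the assumption that links R's parameterisation to the one used here.

```r
# Check Weibull moment formulas by numerical integration; parameter values are illustrative
alpha <- 1.5
lambda <- 0.2

# R's dweibull uses shape = alpha and scale = 1/lambda in this parameterisation
dens <- function(t) dweibull(t, shape = alpha, scale = 1 / lambda)

mean_formula <- gamma(1 + 1 / alpha) / lambda
var_formula  <- (gamma(1 + 2 / alpha) - gamma(1 + 1 / alpha)^2) / lambda^2

mean_num <- integrate(function(t) t * dens(t), 0, Inf)$value
m2_num   <- integrate(function(t) t^2 * dens(t), 0, Inf)$value
var_num  <- m2_num - mean_num^2

print(c(mean_formula, mean_num, var_formula, var_num))
```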
One way of understanding the behaviour of a
variable T which has the Weibull(α, λ)
distribution is via the following exercise.
Exercise 4.7 Prove using the survivor functions
that if T ∼ Weibull(α, λ) and Z = T^α then
Z ∼ Exp(λ^α).
Also show that (λT)^α ∼ Exp(1).
Sol: 4.7
S_Z(z) = P(Z > z)
= P(T^α > z)
= P(T > z^{1/α})
= S_T(z^{1/α})
= exp{−(λ z^{1/α})^α}
= exp{−λ^α z},
which is the survivor function of the Exp(λ^α)
distribution.
The second part of the lemma is proved similarly
(and just as easily). Homework.
The extreme value distribution
The extreme value distribution has parameters µ
and σ:
lifetime T ∼ EV(µ, σ), σ > 0, µ ∈ R, T ∈ R
pdf f(t) = σ^{−1} exp{(t − µ)/σ} exp[−exp{(t − µ)/σ}], −∞ < t < ∞;
survivor function S(t) = exp[−exp{(t − µ)/σ}], −∞ < t < ∞;
integrated hazard H(t) = exp{(t − µ)/σ}, −∞ < t < ∞;
hazard function h(t) = σ^{−1} exp{(t − µ)/σ}, −∞ < t < ∞.
EV moments
Expectation E(T ) = µ − γσ where
γ = 0.5772 . . . is Euler’s constant.
Variance var(T) = π^2 σ^2/6 ≈ 1.645 σ^2.
The extreme value distribution is obtained by a
logarithm transformation of an exponential
random variable.
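The EV moment formulas can be checked numerically from the pdf f(t) = σ^{−1} exp{(t − µ)/σ − exp[(t − µ)/σ]}. This is a sketch only: µ = 2 and σ = 0.5 are arbitrary, and Euler's constant is obtained as −digamma(1).

```r
# Numerical check of the EV(mu, sigma) mean and variance; mu and sigma are illustrative
mu <- 2
sigma <- 0.5

ev_pdf <- function(t) {
  z <- (t - mu) / sigma
  exp(z - exp(z)) / sigma
}

euler_gamma <- -digamma(1)                 # Euler's constant, 0.5772...

mean_num <- integrate(function(t) t * ev_pdf(t), -Inf, Inf)$value
m2_num   <- integrate(function(t) t^2 * ev_pdf(t), -Inf, Inf)$value
var_num  <- m2_num - mean_num^2

print(c(mean_num, mu - euler_gamma * sigma))
print(c(var_num, pi^2 * sigma^2 / 6))
```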
Exercise 4.8 Show that if Z ∼ Exp(1) and
T = µ + σ log (Z) then T ∼ EV(µ, σ).
Sol: 4.8
S_T(t) = P(T > t) = P(µ + σ log(Z) > t)
= P(Z > exp{(t − µ)/σ})
= S_Z(exp{(t − µ)/σ})
= exp[−exp{(t − µ)/σ}],
which is the survivor function of the EV(µ, σ)
distribution as claimed.
The standard EV distribution has µ = 0 and
σ = 1 with pdf exp(t − exp(t)).
Like the Weibull distribution, there is some asymptotic
argument (beyond the scope of this course), analogous to the
central limit theorem for sample means, which suggests that
the extreme value distribution may be appropriate for
modelling lifetimes in some special circumstances.
However, unlike the Weibull distribution, the domain of T is
(−∞,∞), so that negative lifetimes will have a non–zero
probability.
This is obviously a limitation in the use of the model, though it
is still often used.
In particular, P(T < 0) = 1 − exp[−exp(−µ/σ)], which can be
made very small for suitable values of µ and σ, so in practice
the failings of the model are not so great.
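To see how small the probability of a negative lifetime can be, P(T < 0) = 1 − exp[−exp(−µ/σ)] can be evaluated directly. The helper name p_negative and the two parameter settings are hypothetical, chosen only to contrast a large and a small ratio µ/σ.

```r
# P(T < 0) = 1 - exp(-exp(-mu/sigma)) under EV(mu, sigma); parameter values are illustrative
p_negative <- function(mu, sigma) -expm1(-exp(-mu / sigma))

p_large_ratio <- p_negative(mu = 5, sigma = 1)   # mu large relative to sigma
p_small_ratio <- p_negative(mu = 1, sigma = 1)   # mu comparable to sigma
print(c(p_large_ratio, p_small_ratio))
```

The first probability is well under 1%, while the second is far from negligible, so the approximation depends on µ being several multiples of σ.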
Other distributions
Other distributions are used: these include the
log–normal,
gamma and
log–logistic distributions,
which have the following density functions.
The Lognormal distribution
• T ∼ lognormal(µ, σ^2), σ > 0, t > 0
• T = exp(X) where X ∼ Normal(µ, σ^2)
• log T ∼ Normal(µ, σ^2)
• T = e^µ (e^Z)^σ where Z ∼ Normal(0, 1)
• f(t) = (2πσ^2 t^2)^{−1/2} exp{−(log t − µ)^2/(2σ^2)}.
E[T] = exp(µ + σ^2/2).
Basic functions for the standard log–normal
distribution
[Figure: five panels plotting f(x), F(x), S(x), h(x) and H(x) against x for 0 ≤ x ≤ 15.]
Density, distribution, survivor, hazard and
cumulative hazard functions for the standard
lognormal distribution.
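The five functions for the standard lognormal can be computed with base R's dlnorm and plnorm. This is a sketch, not the code behind the original figure; the grid and the helper name plot_panels are assumptions.

```r
# Standard lognormal (mu = 0, sigma = 1); the grid is an illustrative choice
x <- seq(0.05, 15, length.out = 300)

f  <- dlnorm(x)                        # density
Fx <- plnorm(x)                        # distribution function
S  <- plnorm(x, lower.tail = FALSE)    # survivor function
h  <- f / S                            # hazard
H  <- -log(S)                          # cumulative hazard

# Draw the five panels on one page
plot_panels <- function() {
  op <- par(mfrow = c(2, 3)); on.exit(par(op))
  plot(x, f,  type = "l", ylab = "f(x)")
  plot(x, Fx, type = "l", ylab = "F(x)")
  plot(x, S,  type = "l", ylab = "S(x)")
  plot(x, h,  type = "l", ylab = "h(x)")
  plot(x, H,  type = "l", ylab = "H(x)")
}
```

Calling plot_panels() draws the panels. The lognormal hazard rises to a peak and then declines, unlike the monotone Weibull hazard.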
The Gamma distribution
• T ∼ Gamma(α, λ), λ > 0, α > 0, t > 0
• f(t) = λ^α t^{α−1} exp(−λt)/Γ(α)
• E[T] = α/λ, var[T] = α/λ^2
• X_1, . . . , X_n iid Exp(λ)
⇒ T = ∑_{i=1}^n X_i ∼ Gamma(n, λ)
The log-logistic distribution
The logistic distribution
X ∼ logistic(µ, σ), σ > 0, −∞ < x < ∞.
It looks like Normal(µ, σ^2) but has a simple
closed form S_X(x) = 1/[1 + exp{(x − µ)/σ}].
• T ∼ log-logistic(µ, σ), σ > 0, t > 0,
- T = exp(X) where X ∼ logistic(µ, σ),
- log T ∼ logistic(µ, σ).
• S_T(t) = 1/[1 + (t exp(−µ))^{1/σ}],
pdf log–logistic
f(t) = {1/(σt)} exp{(log t − µ)/σ}/[1 + exp{(log t − µ)/σ}]^2, t > 0.
The logistic distribution is similar in shape to the
normal distribution.
It and the log–logistic are more manageable than
the normal and log–normal when we encounter
censored data since their survivor functions have
simple closed forms.
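The closed form S_T(t) = 1/[1 + (t exp(−µ))^{1/σ}] can be compared with R's logistic distribution function applied to log t. The values of µ, σ and the grid are illustrative assumptions.

```r
# Log-logistic survivor function in closed form versus R's logistic cdf; mu, sigma illustrative
mu <- 1
sigma <- 0.5
t <- seq(0.1, 20, length.out = 50)

S_closed <- 1 / (1 + (t * exp(-mu))^(1 / sigma))
S_plogis <- plogis(log(t), location = mu, scale = sigma, lower.tail = FALSE)

max_diff <- max(abs(S_closed - S_plogis))
print(max_diff)
```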
MLE
We fit probability models to censored lifetime
data using maximum likelihood.
A big plus for ml is that it can deal with
censoring naturally, and the
analysis extends to include covariate effects.
Assume n lifetime measurements t_1, t_2, . . . , t_n
with associated censoring indicators δ_1, δ_2, . . . , δ_n.
Assume iid: that is, the t_i derive from a common
distribution, with unknown parameters θ.
We derive the likelihood function.
Exercise 4.9 Give a generic definition of the
likelihood function.
Sol: 4.9
L(paras) = P(realised values| paras)
or
L(θ) = P(T = t|θ).
Obvious problem: this probability is 0 for continuous models.
Fudge: we want L(θ) ∝ P(T = t|θ), so define
L(θ) = f_T(t|θ).
But how does this work with censored
observations?
Sol: 4.9 (continued) Take right censoring.
Observe t∗; then in the model T ≥ t∗,
so L(paras) = P(realised value | paras)
becomes
L(θ) = P(T ≥ t∗|θ) = S_T(t∗|θ).
If t_i corresponds to an uncensored observation,
then the likelihood contribution is the pdf f(t_i|θ);
if t_i corresponds to a censored lifetime, then
T_i ≥ t_i, and the likelihood contribution is S(t_i|θ).
By independence the overall likelihood is:
L(θ) = ∏_{δ_i=1} f(t_i|θ) ∏_{δ_i=0} S(t_i|θ)
= ∏_{δ_i=1} h(t_i|θ) S(t_i|θ) ∏_{δ_i=0} S(t_i|θ)
= ∏_{i=1}^n h(t_i|θ)^{δ_i} S(t_i|θ).
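The two ways of writing the censored likelihood (pdf for failures and survivor function for censored values, or hazard to the power δ times survivor function for every observation) can be compared numerically. The data and the Weibull parameters below are made up for illustration.

```r
# Censored log-likelihood written two equivalent ways; data and parameters are made up
t     <- c(5, 7, 11, 3, 9)
delta <- c(1, 0, 1, 1, 0)        # 1 = observed failure, 0 = right censored
alpha <- 1.3; lambda <- 0.1      # illustrative Weibull parameters

dens <- dweibull(t, shape = alpha, scale = 1 / lambda)
surv <- pweibull(t, shape = alpha, scale = 1 / lambda, lower.tail = FALSE)
haz  <- dens / surv

# Form 1: pdf for failures, survivor function for censored observations
loglik_fS <- sum(log(dens[delta == 1])) + sum(log(surv[delta == 0]))
# Form 2: hazard^delta times survivor function for every observation
loglik_hS <- sum(delta * log(haz) + log(surv))

print(c(loglik_fS, loglik_hS))
```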
This calculation requires the independence of the
censoring mechanism and the lifetimes,
otherwise the likelihood is invalid.
Likelihood inference now proceeds in the standard
way, giving
maximum likelihood estimates,
standard errors and confidence intervals
(asymptotic).
For most lifetime models the procedures cannot
be implemented analytically and so numerical
techniques are required;
but for the exponential distribution . . .
Exponential mle
The likelihood becomes
L(λ) = ∏_{i=1}^n h(t_i|λ)^{δ_i} S(t_i|λ) = ∏_{i=1}^n λ^{δ_i} exp(−λ t_i),
ℓ(λ) = ∑_{i=1}^n δ_i log λ − ∑_{i=1}^n λ t_i
= m log λ − λ ∑_{i=1}^n t_i
where m is the number uncensored.
Setting the score to 0 gives the mle λ̂ = m / ∑_{i=1}^n t_i.
With uncensored data we get the usual mle 1/t̄.
As
var(λ̂) ≈ (−d^2ℓ/dλ^2)^{−1}, evaluated at λ = λ̂,
the approximate variance is
var(λ̂) ≈ λ̂^2/m.
Approx ci: λ̂ ± 1.96 √var(λ̂).
Exercise 4.10 With lifetime data 5, 7∗, 11 find
the mle and an asymptotic 95% ci.
Sol: 4.10 λ̂ = 2/23 = 0.08695652,
1.96 √(λ̂^2/m) = 1.96 λ̂/√m = 0.1205156,
gives ci 0.087 ± 0.121.
Not great as it covers negative values.
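The calculation for the data 5, 7∗, 11 can be reproduced in a few lines of R. The interval on the log scale at the end is an alternative suggested here, not a method used in the notes; it relies on the delta-method approximation se(log λ̂) ≈ 1/√m.

```r
# Exponential mle and Wald interval for the data 5, 7*, 11 (* = censored)
t     <- c(5, 7, 11)
delta <- c(1, 0, 1)

m <- sum(delta)                       # number of uncensored lifetimes
lambda_hat <- m / sum(t)              # mle
se <- lambda_hat / sqrt(m)            # from var ~ lambda_hat^2 / m
ci_wald <- lambda_hat + c(-1, 1) * 1.96 * se

# Interval built on the log scale, which stays positive (an assumption of this sketch)
ci_log <- lambda_hat * exp(c(-1, 1) * 1.96 / sqrt(m))

print(lambda_hat); print(ci_wald); print(ci_log)
```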
Exercise 4.11 Find the Fisher information for
this exponential likelihood.
Why does it not depend on the lifetimes?
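Sol: 4.11
ℓ(λ) = m log λ − λ ∑_{i=1}^n t_i
ℓ′(λ) = m/λ − ∑_{i=1}^n t_i
ℓ′′(λ) = −m/λ^2.
So the observed information is −ℓ′′(λ) = m/λ^2.
It does not depend on the lifetimes because,
mathematically, they enter the score function only through an
additive constant, which vanishes on differentiating again.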
Exercise 4.12 Write down the log–likelihood for
the Weibull lifetime distribution.
Sol: 4.12
L(λ) = ∏_{i=1}^n h(t_i|λ)^{δ_i} S(t_i|λ)
= ∏_{i=1}^n [α λ^α t_i^{α−1}]^{δ_i} exp{−(λ t_i)^α}
ℓ(λ) = ∑_{i=1}^n { δ_i log[α λ^α t_i^{α−1}] − (λ t_i)^α }
= (α log λ + log α) ∑_{i=1}^n δ_i
+ (α − 1) ∑_{i=1}^n δ_i log(t_i) − λ^α ∑_{i=1}^n t_i^α.
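The Weibull log-likelihood, (α log λ + log α) ∑ δ_i + (α − 1) ∑ δ_i log t_i − λ^α ∑ t_i^α, has no closed-form maximiser, so a numerical optimiser is needed. The sketch below codes it directly and maximises with base R's optim. The data are made up, and optimising over (log α, log λ) is a choice made here to keep both parameters positive.

```r
# Weibull log-likelihood with right censoring; data are made up for illustration
weibull_loglik <- function(alpha, lambda, t, delta) {
  m <- sum(delta)
  (alpha * log(lambda) + log(alpha)) * m +
    (alpha - 1) * sum(delta * log(t)) -
    lambda^alpha * sum(t^alpha)
}

t     <- c(5, 7, 11, 3, 9, 14, 6)
delta <- c(1, 0, 1, 1, 0, 1, 1)

# Maximise over (log alpha, log lambda) so both parameters stay positive
negll <- function(par) -weibull_loglik(exp(par[1]), exp(par[2]), t, delta)
start <- c(0, log(1 / mean(t)))
fit <- optim(start, negll)
est <- exp(fit$par)
print(c(alpha = est[1], lambda = est[2]))
```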
Exercise 4.13 The standard errors computed
from the Fisher information are asymptotic in the
sense that for large n the coverage probability of
a confidence interval is numerically accurate.
Briefly outline the mathematical argument that
justifies this statement.
Illustrate with the exponential distribution.
Sol: 4.13 The exponential log-likelihood is
ℓ(λ|t) = m log λ − λ ∑_{i=1}^n t_i,
and, regarded as a function of the random lifetimes,
ℓ(λ|T) = m log λ − λ ∑_{i=1}^n T_i.
But the rhs is a
sum of independent random variables
with finite second moment and, if
appropriately standardised,
is asymptotically Normal by the CLT.
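The asymptotic claim can be illustrated by simulation: for uncensored exponential data, the coverage of the interval λ̂ ± 1.96 λ̂/√n should approach 95% as n grows. The sample sizes, the true rate, the number of replicates and the helper name coverage are all assumptions of this sketch.

```r
# Simulated coverage of the Wald interval for uncensored exponential data
# (sample sizes, rate, and number of replicates are illustrative choices)
set.seed(1)
coverage <- function(n, rate = 2, reps = 2000) {
  t_sum <- rowSums(matrix(rexp(n * reps, rate = rate), nrow = reps))
  lambda_hat <- n / t_sum                     # mle with m = n
  half <- 1.96 * lambda_hat / sqrt(n)         # 1.96 * sqrt(lambda_hat^2 / m)
  mean(lambda_hat - half <= rate & rate <= lambda_hat + half)
}

cov_small <- coverage(5)
cov_large <- coverage(200)
print(c(n5 = cov_small, n200 = cov_large))
```

Comparing the two printed values shows how the nominal 95% level is only an approximation at small n.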
mle using R
The functions Surv and survreg in the
R-package survival provide the means to
numerically find mle for parametric models.
Example 4.14
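library(survival) # provides Surv() and survreg()
load("./m353data.Rdata") # course data file holding the bearings data
attach(bearings) ; class(bearings)
survtime = Surv(time,cens) ; class(survtime) # lifetimes with censoring indicator
survtime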
Surv(time,cens) contains the censoring
information.
Example 4.15
survreg(survtime~1,data=bearings,dist='exponential')
Coefficients: (Intercept) 4.279729
Scale fixed at 1
Loglik(model)= -121.4
sum(cens)/sum(time) # analytic mle of lambda
log(sum(cens)/sum(time)) # log of the analytic mle
The constant model 1 fits the intercept but no
covariates.
There appears to be a mismatch of scale between
the two mles?
Fitting the Weibull
Consider the lifetime variable T of an individual with
covariates x (e.g. age, sex, . . . ).
We require a linear predictor and link function.
Within the Weibull family it is natural to model
the parameter λ in terms of x.
As λ > 0 it is also natural to use a logarithmic
link
λ = exp(β′x)
for some parameter vector β.
Recall that if T ∼ Weibull(α, λ), then
T^α ∼ Exp(λ^α) and (λT)^α ∼ Exp(1).
So letting E = (λT)^α and taking logs,
log T ∼ −log λ + (1/α) log E.
The R function survreg takes the default
logarithmic link and fits models of the form
log T ∼ β′x + σ log E.
Comparing these expressions, in R
• −log λ = β′x is the linear predictor;
• the R-scale parameter σ = 1/α is the reciprocal
of the Weibull shape parameter.
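Because survreg's linear predictor is −log λ, the intercept of an intercept-only exponential fit should equal minus the log of the analytic mle m/∑ t_i. The sketch below checks this on simulated data, since the course data file is not assumed to be available; the rate, sample size and fixed censoring time are illustrative choices.

```r
library(survival)

# Simulated censored exponential lifetimes; rate, censoring time and n are illustrative
set.seed(42)
n <- 200
true_time <- rexp(n, rate = 0.02)
cens_time <- 80                                   # fixed censoring time (an assumption)
time <- pmin(true_time, cens_time)
cens <- as.numeric(true_time <= cens_time)        # 1 = failure observed

fit <- survreg(Surv(time, cens) ~ 1, dist = "exponential")

lambda_hat <- sum(cens) / sum(time)               # analytic mle
print(c(intercept = unname(coef(fit)), minus_log_lambda = -log(lambda_hat)))
```

The two printed numbers should agree to the optimiser's tolerance, which explains the apparent difference in scale between the survreg intercept and the log of the analytic mle.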
Exercise 4.16 Fitting the bearings data with the
Weibull distribution gives
survreg(survtime~1,data=bearings,dist='weibull')
Coefficients:(Intercept) 4.405188
Scale= 0.4757721
Loglik(model)= -113.7
Interpret the estimates in terms of the standard
Weibull parameters.
Sol: 4.16
−log λ̂ = 4.405188 so that
λ̂ = 0.0122 = 1/81.87,
1/α̂ = 0.4757721 so that
α̂ = 2.10.
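The conversion from survreg output to the Weibull (α, λ) parameterisation is a two-line calculation. The sketch uses the intercept and scale reported for the bearings fit, and adds the fitted mean from E(T) = Γ(1 + 1/α)/λ as an extra illustration.

```r
# Convert survreg output (intercept, scale) to Weibull (alpha, lambda) as used in these notes
intercept <- 4.405188     # estimate of -log(lambda) from the survreg output
r_scale   <- 0.4757721    # estimate of sigma = 1/alpha from the survreg output

lambda_hat <- exp(-intercept)
alpha_hat  <- 1 / r_scale

# Fitted mean lifetime from E(T) = Gamma(1 + 1/alpha) / lambda
mean_hat <- gamma(1 + 1 / alpha_hat) / lambda_hat
print(c(lambda = lambda_hat, alpha = alpha_hat, mean = mean_hat))
```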
Exercise 4.17 Use this code to compare the
KM estimate of the survivor function with the
parametric estimate from the fitted Weibull
distribution.
plot(survfit(Surv(time,cens)~1,data=bearings),col='blue') # KM estimate
t = seq(0.001,150,length=100)
H = (exp(-4.405188)*t)^(1/0.4757721) # integ hazard
S = exp(-H)
lines(t,S,type="l",col='red') ; grid() # fitted Weibull survivor function
Sol: 4.17 Fitted T ∼ Weibull(2.10, 0.0122).
From the graph:
excellent fit.
Model comparison
Does the Weibull distribution provide a better fit to
these data than the exponential distribution?
The exponential distribution is obtained by fixing
α = 1, or the R scale = 1, within the Weibull
distribution.
Under the null hypothesis, that α = 1, standard
likelihood theory tells us that, approximately,
2(ℓ1 − ℓ2) ∼ χ^2_1,
where ℓ1 and ℓ2 are the maximised log-likelihoods of the
Weibull and exponential models.
There is 1 degree of freedom as there is a single
constraint in the sub–model α = 1.
R output:
2(ℓ1 − ℓ2) = 2(−113.7 − (−121.4)) = 15.4,
which is significant when compared with the χ^2_1
distribution.
We conclude that the Weibull distribution gives a
much improved fit.
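The comparison with the χ^2 distribution on 1 degree of freedom can be made explicit with pchisq and qchisq, using the two log-likelihoods reported for the bearings data. The variable names are chosen here for illustration.

```r
# Likelihood ratio test of exponential (alpha = 1) against Weibull, from the reported log-likelihoods
loglik_weibull <- -113.7
loglik_exp     <- -121.4

lrt <- 2 * (loglik_weibull - loglik_exp)
p_value <- pchisq(lrt, df = 1, lower.tail = FALSE)
crit_5pc <- qchisq(0.95, df = 1)      # 5% critical value, about 3.84

print(c(statistic = lrt, critical = crit_5pc, p_value = p_value))
```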
Fitting covariates
We look at two examples of fitting models of the
form
log T ∼ β′x + σ log E
which includes both exponential and Weibull
models.
Example: leukaemia data
There are n = 33 observations,
none are censored, and
2 explanatory variables: wbc and ag.
The first few observations are
leuk
   wbc      ag time
1 2300 present   65
2  750 present  156
3 4300 present  100
hist(wbc)
wbc measures white blood cell count;
its distribution is skewed, suggesting a logarithmic
transformation of this variable.
ag is a 2–level factor indicating presence or
absence of the compound in the patient’s blood.
First, we fit an additive model in which the linear
predictor − logλ has a different intercept for the
two ag groups, but the effect of log(wbc) is
constant within each group.
This corresponds to the model formula
log(wbc)+ag.
attach(leuk)
out = survreg(
Surv(time)~log(wbc)+ag,leuk,dist='weibull')
summary(out)
              Value Std.Error      z        p
(Intercept)  5.8524     1.323  4.425 9.66e-06
log(wbc)    -0.3103     0.131 -2.363 1.81e-02
agpresent    1.0206     0.378  2.699 6.95e-03
Log(scale)   0.0399     0.139  0.287 7.74e-01
Scale= 1.04
Loglik(model)= -146.5
Loglik(intercept only)= -153.6
summary(out)$loglik # -146.4988
Exercise 4.18 Write out the fitted version of
the model
log T ∼ −log λ + (1/α) log E,
−log λ = β′x.
Does a larger wbc lead to longer remission times?
Sol: 4.18
log T ∼ −log λ̂ + 1.04 log E,
−log λ̂ = 5.85 − 0.31x + 1.02 I(ag)
where x represents log(wbc) and the fitted R scale 1.04 estimates 1/α.
A larger wbc leads to shorter remission times as
the coefficient of x is negative.
The fit of the model suggests that an exponential
lifetime distribution might be adequate.
out = survreg(
Surv(time)~log(wbc)+ag,leuk,dist='expon')
summary(out)
             Value Std.Error     z        p
(Intercept)  5.815     1.263  4.60 4.15e-06
log(wbc)    -0.304     0.124 -2.45 1.44e-02
agpresent    1.018     0.364  2.80 5.14e-03
Scale fixed at 1
Exponential distribution
Loglik(model)= -146.5 Loglik(intercept only)= -155.5
summary(out)$loglik # -146.5405
The deviance between the two models is
2(−146.4988 − (−146.5405)) = 0.0834, which is
tiny compared to χ^2_1: the hypothesis α = 1 is not rejected.
The regression coefficients are clearly significant.
Hence, this exponential model with formula
log(wbc)+ag represents our ‘best’ model.
The estimated lifetime distribution for a patient is
T ∼ Exp(λ̂)
where for a patient without ag
−log λ̂ = 5.815 − 0.304 log(wbc),
while for a patient with ag present
−log λ̂ = 5.815 + 1.018 − 0.304 log(wbc).
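Under the fitted exponential model the mean lifetime is 1/λ = exp(linear predictor), so the coefficients translate directly into fitted means. The helper mean_lifetime and the patient with wbc = 2300 are hypothetical, introduced only to illustrate the calculation.

```r
# Fitted exponential model for the leukaemia data: mean lifetime 1/lambda = exp(linear predictor)
coefs <- c(intercept = 5.815, log_wbc = -0.304, ag_present = 1.018)   # from the fitted output

mean_lifetime <- function(wbc, ag_present) {
  eta <- coefs["intercept"] + coefs["log_wbc"] * log(wbc) +
    coefs["ag_present"] * as.numeric(ag_present)    # eta = -log(lambda)
  unname(exp(eta))
}

# Hypothetical patients with wbc = 2300, chosen only for illustration
m_present <- mean_lifetime(2300, TRUE)
m_absent  <- mean_lifetime(2300, FALSE)
print(c(present = m_present, absent = m_absent, ratio = m_present / m_absent))
```

The ratio of the two means is exp(1.018) whatever the value of wbc, which is one way to read the ag coefficient.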
Diagnostics
The two models under consideration are the
exponential and the Weibull.
Assessing model fit needs to adjust for the
covariate effects.
We have
T_i ∼ Exp(λ_i) implies λ_i T_i ∼ Exp(1),
T_i ∼ Weibull(α, λ_i) implies (λ_i T_i)^α ∼ Exp(1),
and the rhs is the same for all i.
So use standardised lifetimes λ̂_i t_i (raised to the power α̂ for the Weibull) for diagnostics.
out = survreg(Surv(time)~log(wbc) + ag, leuk, dist='exp')
ntimes <- time*exp(-out$linear.predictors)
plot(survfit(Surv(ntimes)~1),xlab='t', ylab='S(t)', # KM
col='blue', log=T); grid()
[Figure: KM estimate of S(t), on a logarithmic vertical axis from 0.05 to 1.00, against standardised time t from 0 to 4.]
Residual plot of exponential fit to leuk data.
The plot is reasonably linear, except at long
lifetimes where there is some evidence of
curvature.
However there is less data there.
Example: gehan data
This diagnostic procedure works equally well with
censored data.
Consider the gehan data, fit the treatment group
as a factor and use a Weibull distribution.
attach(gehan)
out = survreg(Surv(time,cens)~treat,gehan,dist='weibull')
Coefficients: (Intercept) treatcontrol
                 3.515687    -1.267335
Scale= 0.7321944
Loglik(model)= -106.6 Loglik(intercept only)= -116.4
ntimes = time*exp(-out$linear.predictors)
ntimes = (ntimes)^(1/out$scale) # alpha = 1/Rscale
plot(survfit(Surv(ntimes,cens)~1,data=gehan),
col='blue', log=T)
Exercise 4.19 Comment on the diagnostic plot
and interpret the fitted model.
Sol: 4.19 The diagnostic based on (λ_i T_i)^α
suggests the Weibull is adequate.
The fitted lifetime distribution is
T ∼ Weibull(1/0.73, λi)
where for a patient in the treatment group
− log (λ) = 3.516
while for a patient in the control group
− log (λ) = 3.516 − 1.267 = 2.25.
The lifetimes are increased by taking the
treatment.
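The size of the treatment effect can be read off on the time scale. Using the reported gehan coefficients, the sketch below compares fitted median lifetimes in the two groups, with the Weibull median obtained by solving S(t) = 0.5. The helper median_time is a name chosen here, not part of the notes.

```r
# Gehan fit: treatment effect on the time scale, from the reported coefficients
intercept     <- 3.515687
treat_control <- -1.267335
r_scale       <- 0.7321944

alpha_hat <- 1 / r_scale                          # Weibull shape
lambda_treat   <- exp(-intercept)
lambda_control <- exp(-(intercept + treat_control))

# Weibull median: S(t) = 0.5 gives t = log(2)^(1/alpha) / lambda
median_time <- function(lambda) log(2)^(1 / alpha_hat) / lambda

time_ratio <- median_time(lambda_treat) / median_time(lambda_control)
print(c(alpha = alpha_hat, treat = median_time(lambda_treat),
        control = median_time(lambda_control), ratio = time_ratio))
```

Because the shape α is common to both groups, the ratio of medians equals exp(1.267335), the multiplicative effect of treatment on lifetime.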