
Page 1: 7071

Validation of predictive regression models

Ewout W. Steyerberg, PhD

Clinical epidemiologist

Frank E. Harrell, PhD

Biostatistician

Page 2: 7071

Personal background

Ewout Steyerberg: Erasmus MC, Rotterdam, the Netherlands

Frank Harrell: Health Evaluation Sciences,

Univ of Virginia, Charlottesville, VA, USA

“Validation of predictions from

regression models is of

paramount importance”

Page 3: 7071

Learning objectives: knowledge of

common types of regression models

fundamental assumptions of regression models

performance criteria of predictive models

principles of different types of validation

Page 4: 7071

Performance objectives

To be able to explain why validation is

necessary for predictive models

To be able to judge the adequacy of a

validation procedure

Page 5: 7071

Predictive models provide quantitative estimates of an outcome, e.g.

Quality of life one year after surgery

Death at 30 days after surgery

Long term survival

Page 6: 7071

Predictive models are often based on regression analysis

y ~ a + sum(bi*xi)

y: outcome variable

a: intercept

bi: regression coefficient i

xi: predictor variable i

i in [1,many], usually 2 to 20

Page 7: 7071

3 examples of regression

Quality of life one year after surgery:

continuous outcome, linear regression

Death at 30 days after surgery:

binary outcome, logistic regression

Long term survival:

time-to-outcome, Cox regression

Page 8: 7071

Predictive models make assumptions

Distribution

Linearity of continuous variables

Additivity of effects

Page 9: 7071

Example: a simple logistic regression model

30-day mortality ~ a + b1*sex + b2*age

Assumptions:

Distribution of 30-day mortality is binomial

Age has a linear effect

The effects of sex and age can be added
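
As a minimal sketch of how this model turns a patient's sex and age into a predicted probability: the linear predictor is passed through the inverse logit. The coefficient values below are hypothetical and serve only to illustrate the calculation; they are not the fitted values from the slides.

```python
import math

def predicted_risk(a, b_sex, b_age, sex, age):
    """Predicted probability of 30-day mortality from the logistic model."""
    # Additive effects of sex and age, with age entering linearly
    linear_predictor = a + b_sex * sex + b_age * age
    # Inverse logit maps the linear predictor to a probability in (0, 1)
    return 1.0 / (1.0 + math.exp(-linear_predictor))

# Hypothetical coefficients, chosen only to illustrate the calculation
A, B_SEX, B_AGE = -7.0, 0.3, 0.07

risk_60 = predicted_risk(A, B_SEX, B_AGE, sex=1, age=60)
risk_80 = predicted_risk(A, B_SEX, B_AGE, sex=1, age=80)
print(f"age 60: {risk_60:.3f}, age 80: {risk_80:.3f}")
```

With a positive age coefficient, the predicted risk rises with age; a linear predictor of zero corresponds to a risk of 0.5.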

Page 10: 7071

Assessing model assumptions

Examine model residuals

Perform specific tests

add nonlinear terms, e.g. age + age^2

add interaction terms, e.g. sex*age
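
One way to carry out such a test is sketched here, under the assumption that a likelihood ratio test is used (the slides do not name a specific test). The simple model is extended with age^2 and sex*age, and the gain in log-likelihood is compared with a chi-square distribution. The function names and the log-likelihood values are hypothetical.

```python
import math

def extended_terms(sex, age):
    """Predictor values for the extended model: adds age^2 and sex*age."""
    return [sex, age, age ** 2, sex * age]

def likelihood_ratio_p_value(loglik_simple, loglik_extended, df):
    """P-value of a likelihood ratio test for df added terms (df = 1 or 2)."""
    statistic = max(0.0, 2.0 * (loglik_extended - loglik_simple))
    if df == 1:
        return math.erfc(math.sqrt(statistic / 2.0))  # chi-square tail, 1 df
    if df == 2:
        return math.exp(-statistic / 2.0)  # chi-square tail, 2 df
    raise ValueError("this sketch only handles df = 1 or df = 2")

# Hypothetical log-likelihoods of the simple and the extended fitted model
p_value = likelihood_ratio_p_value(-250.0, -249.2, df=2)
print(f"p-value for adding age^2 and sex*age: {p_value:.3f}")
```

A large p-value gives no evidence against linearity and additivity; it does not prove that the assumptions hold.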

Page 11: 7071

Model assumptions and predictions

Better predictions if assumptions are met

Some violation inherent in empirical data

Evaluate predictions in new data

Page 12: 7071

Evaluation of predictions

Calibration

average of predictions correct?

low and high predictions correct?

Discrimination

distinguish low risk from high risk

patients?
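
The two criteria above can be sketched in a few lines. Calibration is checked by comparing mean predicted risk with the observed event rate, overall and in groups of increasing predicted risk; discrimination is summarized by the concordance (c) statistic, which for a binary outcome equals the area under the ROC curve. The function names and the four-patient data set are hypothetical.

```python
def c_statistic(predictions, outcomes):
    """Fraction of (event, non-event) pairs in which the event has the
    higher predicted risk; ties count as one half."""
    events = [p for p, y in zip(predictions, outcomes) if y == 1]
    nonevents = [p for p, y in zip(predictions, outcomes) if y == 0]
    if not events or not nonevents:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for p_event in events:
        for p_nonevent in nonevents:
            if p_event > p_nonevent:
                concordant += 1.0
            elif p_event == p_nonevent:
                concordant += 0.5
    return concordant / (len(events) * len(nonevents))

def calibration_by_group(predictions, outcomes, n_groups=2):
    """(mean predicted risk, observed event rate) per group, from the
    lowest to the highest predicted risks."""
    pairs = sorted(zip(predictions, outcomes))
    n = len(pairs)
    if n < n_groups:
        raise ValueError("need at least one patient per group")
    rows = []
    for g in range(n_groups):
        chunk = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        mean_predicted = sum(p for p, _ in chunk) / len(chunk)
        observed = sum(y for _, y in chunk) / len(chunk)
        rows.append((mean_predicted, observed))
    return rows

# Hypothetical predictions and outcomes (1 = died within 30 days)
predictions = [0.1, 0.2, 0.3, 0.4]
outcomes = [0, 0, 1, 0]

auc = c_statistic(predictions, outcomes)
groups = calibration_by_group(predictions, outcomes)
mean_predicted = sum(predictions) / len(predictions)
observed_rate = sum(outcomes) / len(outcomes)
print(auc, mean_predicted, observed_rate, groups)
```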

Page 13: 7071

Example: predicted probabilities

[Figure: actual 30-day mortality (y-axis, 0.0 to 0.4) plotted against predicted probability of 30-day mortality (x-axis, 0.0 to 0.4)]

Area under ROC: 0.77

Calibration: OK

Page 14: 7071

3 types of validation

Apparent: performance on sample used to

develop model

Internal: performance on population

underlying the sample

External: performance on related but

slightly different population

Page 15: 7071

Apparent validity

Easy to calculate

Results in optimistic performance

estimates

Page 16: 7071

Apparent estimates optimistic since same data used for:

Definition of model structure:

e.g. selection and coding of variables

Estimation of model parameters:

e.g. regression coefficients

Evaluation of model performance:

e.g. calibration and discrimination

Page 17: 7071

Internal validity

More difficult to calculate

Test model in new data, random from

underlying population

Page 18: 7071

Why internal validation?

Honest estimate of performance should

be obtained, at least for a population

similar to the development sample

Internally validated performance sets an

upper limit to what may be expected in

other settings (external validity)

Page 19: 7071

External validity

Moderately easy to calculate when new

data are available

Test model in new data, different from

development population

Page 20: 7071

Why external validation?

Various factors may differ from

development population, including

different selection of patients

different definitions of variables

different diagnostic or therapeutic

procedures

Page 21: 7071

Internal validation techniques

Split-sample:

development / validation

Cross-validation:

alternating development / validation

extreme: n-1 develop / 1 validate

(‘jack-knife’)

Bootstrap
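
The split-sample and cross-validation schemes above differ only in how patients are assigned to development and validation. A minimal sketch of the assignment step follows, using patient indices 0..n-1; the function names and the seed argument are assumptions for illustration.

```python
import random

def split_sample(n, fraction_development=0.5, seed=0):
    """One random split into development and validation indices."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    cut = int(n * fraction_development)
    return indices[:cut], indices[cut:]

def cross_validation_folds(n, k, seed=0):
    """Yield (development, validation) index lists for k-fold
    cross-validation; every patient is used for validation exactly once."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        development = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield development, validation

development, validation = split_sample(10)
five_fold = list(cross_validation_folds(10, 5))
# k = n gives the extreme case: n-1 patients to develop, 1 to validate
leave_one_out = list(cross_validation_folds(6, 6))
```

In each scheme the model is refit on the development indices and its performance is measured on the validation indices only.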

Page 22: 7071

Bootstrap is the preferred internal validation technique

bootstrap sample for model development:

n patients drawn with replacement

original sample for validation: n patients

difference: optimism

efficiency: development and validation on n

patients
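
The procedure on this slide can be sketched generically: develop the model in each bootstrap sample, measure its performance in that sample and in the original sample, and average the difference as the optimism. To stay self-contained, the demonstration swaps in a toy one-predictor least-squares model with R^2 as the performance measure, on simulated data; this is not the logistic model or the ROC area of the slides. All function names are hypothetical.

```python
import random

def bootstrap_optimism(data, fit, performance, n_boot=200, seed=0):
    """Return (apparent, mean optimism, optimism-corrected) performance."""
    rng = random.Random(seed)
    n = len(data)
    apparent = performance(fit(data), data)
    total_optimism = 0.0
    for _ in range(n_boot):
        # Bootstrap sample: n patients drawn with replacement
        sample = [data[rng.randrange(n)] for _ in range(n)]
        model = fit(sample)
        # Performance where developed minus performance in original sample
        total_optimism += performance(model, sample) - performance(model, data)
    optimism = total_optimism / n_boot
    return apparent, optimism, apparent - optimism

def fit_line(data):
    """Toy model: least-squares line y = a + b*x for (x, y) pairs."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in data)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in data)
    b = sxy / sxx if sxx > 0 else 0.0
    return mean_y - b * mean_x, b

def r_squared(model, data):
    """Toy performance measure: explained variation of the fitted line."""
    a, b = model
    mean_y = sum(y for _, y in data) / len(data)
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in data)
    ss_tot = sum((y - mean_y) ** 2 for _, y in data)
    return 1.0 - ss_res / ss_tot if ss_tot > 0 else 0.0

# Simulated data, for illustration only
rng = random.Random(1)
data = []
for _ in range(40):
    x = rng.uniform(0, 10)
    data.append((x, 2.0 + 0.5 * x + rng.gauss(0, 2)))

apparent, optimism, corrected = bootstrap_optimism(data, fit_line, r_squared)
print(f"apparent {apparent:.3f}, optimism {optimism:.3f}, corrected {corrected:.3f}")
```

Any model-building step that used the data, such as variable selection, would need to be repeated inside `fit` for the optimism estimate to be honest.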

Page 23: 7071

Example: bootstrap results for logistic regression model

30-day mortality ~ a + b1*sex + b2*age

Apparent area under the ROC curve: 0.77

Mean area of 200 bootstrap samples: 0.772

Mean area of 200 tests in original: 0.762

Optimism in apparent performance: 0.01

Optimism-corrected area: 0.76
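
The arithmetic behind these figures, written out with the numbers from this slide (the variable names are illustrative):

```python
apparent_auc = 0.77          # model developed and tested in the original sample
mean_auc_bootstrap = 0.772   # developed and tested in each bootstrap sample
mean_auc_original = 0.762    # developed in bootstrap sample, tested in original

# Optimism: how much performance drops when moving to the original sample
optimism = mean_auc_bootstrap - mean_auc_original
# Optimism-corrected estimate: apparent performance minus the optimism
corrected_auc = apparent_auc - optimism

print(round(optimism, 3), round(corrected_auc, 2))
```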

Page 24: 7071

External validation techniques

Temporal validation: same

investigators, validate in recent years

Spatial validation (other place): same

investigators, cross-validate in centers

Fully external: other investigators, other

centers
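
For the first two techniques, the split of the data can be sketched as below, assuming each patient record carries a year and a center. The field names `year` and `center`, the function names, and the example records are hypothetical.

```python
def temporal_split(records, first_validation_year):
    """Develop on earlier patients, validate on more recent ones."""
    development = [r for r in records if r["year"] < first_validation_year]
    validation = [r for r in records if r["year"] >= first_validation_year]
    return development, validation

def leave_one_center_out(records):
    """Yield (center, development, validation): each center in turn is
    held out for validation and the model is developed on the others."""
    centers = sorted({r["center"] for r in records})
    for center in centers:
        development = [r for r in records if r["center"] != center]
        validation = [r for r in records if r["center"] == center]
        yield center, development, validation

# Hypothetical records, for illustration only
records = [
    {"year": 1998, "center": "A"},
    {"year": 1999, "center": "B"},
    {"year": 2000, "center": "A"},
    {"year": 2001, "center": "C"},
]

early, recent = temporal_split(records, first_validation_year=2000)
by_center = list(leave_one_center_out(records))
```

Fully external validation has no such split: the model is applied unchanged to data collected by other investigators.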

Page 25: 7071

Example: external validity of logistic regression model

30-day mortality ~ a + b1*sex + b2*age

Apparent area in 785 patients: 0.77

Tested in 20,318 other patients: 0.74

Tested by other investigators: ?

Page 26: 7071

Example: external validation

[Figure: actual 30-day mortality (y-axis, 0.0 to 0.4) plotted against predicted probability of 30-day mortality (x-axis, 0.0 to 0.4)]

Area under ROC: 0.74

Calibration: reasonable

Page 27: 7071

Summary

Apparent validity gives an optimistic

estimate of model performance

Internal validity may be estimated by

bootstrapping

External validity should be determined

in other populations

Page 28: 7071

Key references

tutorial and book on multivariable models (Harrell 1996: Stat Med 15: 361-87;

Harrell: Regression modeling strategies, Springer 2001)

empirical evaluations of strategies (Steyerberg 2000: Stat Med 19: 1059-79)

internal validation (Steyerberg 2001: JCE 54: 774-81)

external validation (Justice 1999: Ann Intern Med 130: 515-24;

Altman 2000: Stat Med 19: 453-73)

Page 29: 7071

Links

Interactive text book on predictive modeling

http://www.neri.org/symptom/mockup/Chapter_8/

Harrell’s Regression modeling strategies

http://hesweb1.med.virginia.edu/biostat/rms/