Neil Ferguson
MRC Centre for Outbreak Analysis and Modelling
Dept. of Infectious Disease Epidemiology
Faculty of Medicine
Imperial College
What is a model and
why use one?
Why use a model?
• Many uncertainties about emergence/spread of pathogens.
• Often limited historical data.
• Hence models necessarily simplify, make assumptions.
• So why model?
• Because without a model, judgements are made on the basis of
qualitative evidence/opinion/prejudice…
• Models at least have the benefit of
Making assumptions explicit.
Making best use of limited data.
Highlighting key factors determining policy needs.
Being quantitative (e.g. how many doses needed?)
What do infectious diseases
have in common?
• Transmission.
• Via
Aerosol/droplets (measles, mumps,
influenza, pertussis…).
Faecal-oral, water-borne/environmental
(Enteroviruses, Rotaviruses, Typhoid,
Cholera, Dysentery, tapeworms,
nematodes).
Sexual contact (HIV, gonorrhoea, syphilis,
chlamydia, HBV)
Vectors (dengue, malaria, onchocerciasis,
nosocomial infections…)
Intermediate hosts (schistosomiasis).
One person gets infected.
That person infects others.
They infect more.
Giving a chain reaction.
Exponential growth
• The most important quantity governing an epidemic is how
many other people one person infects.
• = the Basic Reproduction Number of an epidemic – R0.
• Needs to be >1 for an epidemic to take off.
• Other quantities – e.g. Generation time=Tg – also important.
But the end result is the same…
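The chain reaction can be sketched numerically. This is a minimal illustration only, assuming every case infects exactly R0 others in each generation, with no randomness and no depletion of susceptibles; the function name `cases_per_generation` is hypothetical and not from the slides.

```python
def cases_per_generation(r0, generations):
    """Return new cases in each generation, starting from a single case.

    Assumption: each case infects exactly r0 others, and the supply of
    susceptible people never runs out (early-epidemic approximation).
    """
    cases = [1.0]
    for _ in range(generations):
        cases.append(cases[-1] * r0)
    return cases


if __name__ == "__main__":
    # R0 below 1 fades out; R0 above 1 grows geometrically.
    for r0 in (0.9, 2.0, 4.0):
        print(r0, cases_per_generation(r0, 5))
```

Each generation takes roughly one generation time Tg, so a larger R0 or a shorter Tg both give faster growth in calendar time.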
[Figure: exponential growth of Y against time t.]
When does exponential spread stop?
[Figure: rate of new infections against time. Phases: establishment (random effects), exponential growth with $y \sim e^{(R_0-1)t/T_g}$, exhaustion of susceptibles, then endemicity (equilibrium, or recurrent epidemics).]
• Epidemic eventually begins to run out of people to infect.
• Then the number of secondary cases per case drops below R0 –
instead defined by R, the effective reproduction number = s×R0
(s = proportion still susceptible).
• Once s<1/R0 (so R<1), the epidemic goes into decline.
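The relationship R = s×R0 and the threshold s < 1/R0 can be checked with a few lines of Python. This is a sketch under the slide's own simplification (homogeneous mixing); the function names are illustrative.

```python
def effective_r(r0, s):
    """Effective reproduction number R = s * R0, where s is the
    proportion of the population still susceptible."""
    return s * r0


def susceptible_threshold(r0):
    """Proportion susceptible below which R < 1 and the epidemic declines."""
    return 1.0 / r0


if __name__ == "__main__":
    r0 = 4.0
    for s in (1.0, 0.5, 0.25, 0.2):
        r = effective_r(r0, s)
        trend = "growing" if r > 1 else "not growing"
        print(f"s={s:.2f}  R={r:.2f}  {trend}")
    print("threshold s =", susceptible_threshold(r0))
```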
Controlling infectious diseases
- what does it take (in theory)?
• To control an epidemic a policy needs to
reduce R<1 – so transmission cannot
sustain itself.
• So need to eliminate a fraction 1-1/R0 of
transmission – i.e. 50% for R0 =2, 75% for
R0 =4, 90% for R0 =10.
• This can be achieved by:
Reducing contacts
(quarantine, social distancing).
Reducing susceptibility
(vaccination, prophylaxis).
Reducing infectiousness
(e.g. treatment).
• Key issues are who is targeted, how much
effort is needed, and how fast?
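The fraction of transmission that must be eliminated, 1-1/R0, is easy to tabulate. This is a minimal sketch; the function name `critical_fraction` is illustrative, and returning 0 for R0 ≤ 1 is an added convention (no control is needed if the epidemic cannot take off).

```python
def critical_fraction(r0):
    """Fraction of transmission to eliminate so that R falls below 1.

    Uses 1 - 1/R0; returns 0.0 when r0 <= 1 (assumed convention).
    """
    if r0 <= 1.0:
        return 0.0
    return 1.0 - 1.0 / r0


if __name__ == "__main__":
    # The slide's own examples: R0 = 2, 4 and 10.
    for r0 in (2.0, 4.0, 10.0):
        print(f"R0={r0:g}: eliminate {critical_fraction(r0):.0%} of transmission")
```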
[Figure: critical fraction pc = 1-1/R0 (0% to 100%) plotted against R0 (0 to 20), with regions labelled ‘eradication’ and ‘persistence’.]
Epidemic models
• Just capture these ideas mathematically.
• A couple of minor challenges:
How do we estimate R0 (and Tg) for a particular disease and
population?
How do we estimate the effect of control measures on these
parameters?
Deconstructing R0
• Not a fundamental biological constant.
• Determined by:
Pathogen biology (pathogenesis, lifecycle, variability).
Host factors (genetics, nutrition, age, co-morbidities).
Population structure (demography, contact patterns).
• Understanding these at a level which lets R0 be estimated is
what a lot of quantitative infectious disease epidemiology is about.
• Need mechanistic understanding (not just curve fitting) to predict
impact of controls.
• Need DATA.
Simple example
R0 = D × C × p
D = mean length of time infectious
C = rate at which contacts occur
p = probability of transmission per contact
- Highly simplified, as only applies if all contacts have an
equal risk of infection, and if contacts are not repeated.
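As a worked illustration of the product formula, the sketch below multiplies the three factors. The numerical values are made up for illustration and are not estimates for any real disease; the function name `basic_reproduction_number` is hypothetical.

```python
def basic_reproduction_number(d, c, p):
    """R0 = D * C * p under the slide's simplifying assumptions.

    d: mean length of time infectious (e.g. days)
    c: rate at which contacts occur (contacts per unit of the same time)
    p: probability of transmission per contact
    """
    return d * c * p


if __name__ == "__main__":
    # Illustrative values only (assumptions, not data).
    r0 = basic_reproduction_number(d=5.0, c=10.0, p=0.04)
    print("R0 =", r0)
    # Halving the contact rate halves R0 in this simple formula.
    print("R0 with half the contacts =", basic_reproduction_number(5.0, 5.0, 0.04))
```

Because the formula is a simple product, reducing any one factor by a given proportion reduces R0 by the same proportion.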
Data: natural history - not just SIR
• In reality, diseases develop gradually – need to allow for incubation
period (no symptoms) , variable infectiousness, morbidity/mortality.
• e.g. Smallpox:
The 2-week
incubation period is
what allowed smallpox to
be eradicated.
[Figure: stages of smallpox infection, ending in ‘Removed’ (immune or dead).]
Data: transmission
• Almost never observed.
• Little quantitative data on
mechanisms.
• Some estimates of
transmission rates for small
groups (e.g. households),
derived via painstaking cohort
studies.
• But mostly transmission
parameters have to be
estimated by matching models
to surveillance data.
[Figure: household data for flu. Four panels showing proportion (0 to 1.00) against number of people infected (0 to 6).]
Data: surveillance
[Figure: influenza-like illness (ICD9 487), first (F) and new (N) episodes. Incidence rates per 100,000 (total), 1965 to 2005.]
• e.g. For flu
• GP consultation rates
for E&W (RCGP).
• Affected by healthcare
seeking behaviour.
• Often not flu (e.g.
RSV).
• Only measures
disease, not infection.
• Unknown
ascertainment.
Data: contact patterns
Defining ‘relevant’ contacts often a challenge – STIs the easiest:
Gregson et al, Lancet 2002
Genetic/antigenic data
• Increasing volumes of
pathogen sequence
data.
• Population structure
and polymorphisms still
often not well
understood.
• Antigenic (strain) data
also often available –
and linked to genetic
data for RNA viruses,
but not for many more
complex pathogens.
• Molecular basis of
transmissibility very
poorly understood.
[Figure: phylogenetic tree of viral isolates (mostly Thai ‘ThD1’ strains, plus isolates from other countries), with clades labelled I, II and III and groups labelled ‘Thai strains 1980-1983’, ‘Thai strains 1980-1994’ and ‘Thai strains 1990-2002’. Scale bar: 0.005 substitutions/site.]
Data: interventions
• Trials (e.g.
efficacy/effectiveness).
• Observational
studies.
• Extrapolation
(nearly) always
needed to predict
population effects.
[Figure: e.g. impact of antivirals on HIV viral load.]
Knowledge synthesis
[Diagram: the model at the centre.
Inputs: Natural history, Epidemiology, Demography, Contact patterns, Interventions, Evolution.
Outputs: Fundamental parameters, Detailed predictions, Control policy optimisation, Insight into mechanism.]
Not all models are mathematical!
Roles of mathematical modelling
• Quantifying risk.
• Knowledge synthesis:
Data analysis.
Extrapolating to the future.
Optimising control policies.
• Has benefit of:
making assumptions explicit.
being testable/disprovable.
• Not all-knowing: can’t predict with no data!
Model complexity
• Many possible choices: deterministic/stochastic,
compartmental/individual-based, spatial/non-spatial,
age-structured...?
• Fundamentally, complexity should be driven by need
– what does the model need to do?
• And by data
– what assumptions/level of detail can be justified?
The art of modelling is knowing what to leave out.
\[
\frac{dX}{dt} = -\beta\frac{XY}{N}, \qquad
\frac{dY}{dt} = \beta\frac{XY}{N} - \gamma Y, \qquad
\frac{dZ}{dt} = \gamma Y
\]
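A simple compartmental (SIR) model of this kind can be simulated in a few lines. This is a sketch, not the lecturer's code. It reads the slide's symbols as X (susceptible), Y (infectious), Z (removed) and N (population size), and it assumes the standard parameterisation with a transmission rate `beta` and a removal rate `gamma`, so that R0 = beta/gamma. Simple Euler time-stepping is used to keep it dependency-free.

```python
def simulate_sir(r0, infectious_period, n, y0, days, dt=0.05):
    """Simulate a deterministic SIR model with Euler steps.

    Assumed equations (standard SIR, symbols as read from the slide):
        dX/dt = -beta*X*Y/N
        dY/dt =  beta*X*Y/N - gamma*Y
        dZ/dt =  gamma*Y
    with gamma = 1/infectious_period and beta = r0*gamma.
    Returns a list of (time, X, Y, Z) tuples.
    """
    gamma = 1.0 / infectious_period
    beta = r0 * gamma
    x, y, z = float(n - y0), float(y0), 0.0
    history = [(0.0, x, y, z)]
    steps = int(round(days / dt))
    for step in range(1, steps + 1):
        new_infections = beta * x * y / n * dt
        removals = gamma * y * dt
        x -= new_infections
        y += new_infections - removals
        z += removals
        history.append((step * dt, x, y, z))
    return history


if __name__ == "__main__":
    # Illustrative parameter values only (assumptions, not estimates).
    run = simulate_sir(r0=2.0, infectious_period=5.0, n=1_000_000, y0=1, days=300)
    peak_time, _, peak_y, _ = max(run, key=lambda row: row[2])
    print(f"peak of {peak_y:.0f} infectious at day {peak_time:.0f}")
    print(f"final proportion ever infected: {run[-1][3] / 1_000_000:.2f}")
```

Even this three-compartment model reproduces the qualitative pattern of take-off, peak and decline as susceptibles are exhausted.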
Good news: models can be
(much) simpler than reality and still work
e.g. Measles dynamics - UK
[Figure: Y against time (years, 0 to 10), with curves labelled ‘Very simple seasonal SIR model’ and ‘SIR model with age structure’.]
More complex model
• Modelling pandemic emergence in Indonesia.
• Simulation of 230 million people, with detailed
representation of population.
• But ‘only’ 5 transmission parameters.
[Figure: daily cases (0 to 2,500,000) against day (0 to 150), R0=2.]
Model validation
• Key parameters should be
estimated from data.
• Models should reproduce past
epidemics (goodness-of-fit).
• But rarely get comparable
‘training’ and ‘validation’
datasets – no 2 epidemics are
quite alike.
• Sensitivity analysis important
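One simple form of sensitivity analysis is to vary each input in turn and record how the output changes. The sketch below is one generic way to do this, not the method used in any analysis cited in these slides. The helper `one_at_a_time_sensitivity`, the toy model `r0_model` (the product formula R0 = D × C × p) and the baseline values are all illustrative assumptions.

```python
def r0_model(d, c, p):
    """Toy model for illustration: R0 = D * C * p."""
    return d * c * p


def one_at_a_time_sensitivity(model, baseline, rel_change=0.2):
    """Scale each parameter down and up by rel_change, holding the others
    at baseline, and return {name: (output_low, output_high)}."""
    results = {}
    for name, value in baseline.items():
        low = dict(baseline, **{name: value * (1.0 - rel_change)})
        high = dict(baseline, **{name: value * (1.0 + rel_change)})
        results[name] = (model(**low), model(**high))
    return results


if __name__ == "__main__":
    # Illustrative baseline values only (assumptions, not data).
    baseline = {"d": 5.0, "c": 10.0, "p": 0.04}
    for name, (low, high) in one_at_a_time_sensitivity(r0_model, baseline).items():
        print(f"{name}: R0 ranges from {low:.2f} to {high:.2f}")
```

For this toy model every parameter gives the same range because the output is a simple product; in a realistic model the ranges differ and show which parameters most need better data.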
Trends in modelling
• Traditionally, most focus was on endemic diseases (childhood diseases,
parasitic infections) – because equilibrium properties of models could be
determined analytically, and because the aim was long-term control (e.g. vaccination).
• HIV and later emerging epidemics – and more powerful computers – have
moved field towards modelling dynamics of (novel) epidemics.
• Foot and Mouth Disease and SARS (& HIV/BSE!) showed potential of real-
time modelling.
• For endemic diseases, more focus on seasonal and spatial dynamics.
• Much more attention to rigorous model fitting/parameter estimation.
• Integrating genetics and epidemic modelling.
• And being relevant to public/veterinary health.
Emerging infections – why worry?
• Pandemic = global epidemic of a
new disease.
• Starts with a zoonosis mutating to
be transmissible.
• SARS – near-pandemic.
• H5N1/Nipah/VHFs/???... – the
next pandemic?
• Can profoundly affect society.
• Risk may be increasing –
encroachment on habitats, higher
human/livestock densities…
• Black Death and syphilis
• Influenza and HIV/AIDS
Detecting emergence
• Need to detect growing
clusters of cases of new
disease.
• Need innovative
surveillance (e.g. electronic
syndromic surveillance,
web crawling).
• Need new analytical
methods to analyse cluster
data.
• And rapid field
investigation.
Outbreak analysis & modelling:
past examples
• UK Foot and Mouth Disease livestock
epidemic (2001) – modelling guided
control policy.
[Figure: confirmed daily case incidence against date (18 Feb to 8 Jul), showing data up to 29 March, data from 30 March, and model predictions for three scenarios:
A: Several days to slaughter
B: Slaughter on infected premises within 24 hours
C: Slaughter on infected and neighbouring farms within 24 and 48 hours, respectively
Model predictions by Dr Neil Ferguson, Dr Christl Donnelly & Prof. Roy Anderson, Imperial College]
• SARS 2003 – estimates of
transmissibility (R0~3) and mortality
(~15%).
Pandemic modelling
2-4 months to peak at
source, 1-3 months to
spread to West.
Travel restrictions
would only buy a few
weeks at most.
1/3 of UK population
would become ill, 0.5-
1 million new sick
people per day at
peak.
1st wave over ~3
months after 1st UK
case.
[Figure panels: Thailand, GB.]
Modelling and preparedness:
assessing control options
Treatment & prophylaxis
School closure
Vaccination
Containment at source (i.e. stopping spread when
there are only a few tens of
cases)
Opposite scenario: eradication
- why is polio holding on in India?
• New analyses by Nick Grassly at
Imperial (published in Science, Lancet)
showed that the key problem was poor
vaccine efficacy in some parts of India.
• Trivalent oral vaccine only giving ~9%
protection in Uttar Pradesh – less than
half that achieved in the rest of India.
• So children were getting 15 doses and
still getting polio.
• Poor efficacy linked to environmental
factors (competing infections with cross-
immunity).
• Now switching to new high potency
monovalent vaccine.
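A rough calculation shows why low efficacy leaves children unprotected even after many doses. This is a sketch under two loudly stated assumptions that are not in the slides: the ~9% figure is read as the chance that a single dose protects, and doses are treated as independent. The function name `protection_after_doses` is illustrative.

```python
def protection_after_doses(per_dose_efficacy, doses):
    """Probability of being protected after a number of doses.

    Assumptions (not from the slides): each dose protects independently
    with probability per_dose_efficacy, and protection never wanes.
    """
    return 1.0 - (1.0 - per_dose_efficacy) ** doses


if __name__ == "__main__":
    for doses in (1, 3, 5, 10, 15):
        protected = protection_after_doses(0.09, doses)
        print(f"{doses:2d} doses: {protected:.0%} protected")
```

Under these assumptions roughly a quarter of children would still be unprotected after 15 doses, which is consistent with the observation that heavily vaccinated children were still getting polio.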
Impact of new vaccine on Type 1 Polio
Uttar Pradesh, India
[Figure: monthly panels for Sep 06, Oct 06, Nov 06, Dec 06, Jan 07, Feb 07, Mar 07, Apr 07 and May 07.]
* data as on 28th June 2007
Inferring the effectiveness of public
health measures from observational data
• Data on public health
measures often very limited.
• e.g. no data for masks.
• Can we use historical data
to reduce the uncertainty?
• We asked whether public health
interventions provide a
plausible quantitative
explanation of the variation
between US cities.
[Figure: weekly excess mortality/100k against days since Sept 7 1918 (0 to 270), for St Louis and Philadelphia.]
Correlations
• Both peak and total
mortality weakly correlated
with timing of pandemic wave
and previous year’s mortality.
• Peak mortality correlated
with ‘early’ interventions.
• Peak mortality strongly
correlated with presence of 2
autumn peaks, total mortality
weakly so.
Results in agreement with 2
other analyses.
[Figure: four scatter plots.
a: total mortality against 1917 mortality (R² = 0.19)
b: x-axis is first week where mortality > 20/100,000 (R² = 0.24)
c: total mortality against mortality to 12 days after intervention start (R² = 0.69)
d: peak weekly mortality against mortality to 12 days after intervention start (R² = 0.71)]
Results of 1918 analysis
• Public health measures explain 1918 pattern well.
• Transmission cut by >50% in some cities.
• But measures often started too late, always lifted too early.
• Evidence of spontaneous behaviour change.
Estimating the impact of
school closure
• New analyses of seasonal flu
surveillance data allow the effect of school
closure to be estimated.
• Have looked for evidence of changes
in transmission in different age groups in
and out of school terms in sentinel
surveillance data.
• Fitted stochastic model with schools
and households to the surveillance data:
Schools account for 16.5% of
transmission overall.
School closure in a pandemic
might reduce overall attack rates by ~5 percentage
points (from 32% to 27%) – but reduces
attack rates in children by a quarter.
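The change in overall attack rate from 32% to 27% can be expressed either as an absolute or as a relative reduction, and the two are easy to confuse. The short sketch below computes both; the function name `attack_rate_reduction` is illustrative.

```python
def attack_rate_reduction(baseline, with_closure):
    """Return (absolute, relative) reduction in attack rate.

    absolute: difference in proportion of the population infected
    relative: that difference as a fraction of the baseline attack rate
    """
    absolute = baseline - with_closure
    relative = absolute / baseline
    return absolute, relative


if __name__ == "__main__":
    absolute, relative = attack_rate_reduction(0.32, 0.27)
    print(f"absolute reduction: {absolute:.1%} of the population")
    print(f"relative reduction: {relative:.1%} of baseline cases")
```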
[Figure: panels for Paris, Aix-Marseille and Lille in 1985, 1989, 1997 and 2001.]
Infectious disease modelling
- future challenges
General:
More mobile, more populous world –
diseases spread faster so need faster/better
responses.
Prioritising/targeting – emerging infections
vs the rest, insufficient resources overall.
Modelling has to deliver health benefits.
Technical:
Better natural history / transmission
models.
Quantifying and validating proxy
measures of ‘infectious contact’ patterns.
Inference methods
Data on transmission/interventions.
Maintaining simplicity.