Parameterizing State-space Models for Infectious Disease...

Parameterizing State-space Models for Infectious Disease

Dynamics by Generalized Profiling: Measles in Ontario

Giles Hooker∗, Stephen P. Ellner†, Laura De Vargas Roditi‡, David J.D. Earn§

Abstract

Parameter estimation for infectious disease models is important for basic understanding (e.g., to

identify major transmission pathways), for forecasting emerging epidemics, and for designing con-

trol measures. Differential equation models are often used, but statistical inference for differential

equations suffers from numerical challenges and poor agreement between observational data and

deterministic models. Accounting for these departures via stochastic model terms requires full speci-

fication of the probabilistic dynamics, and computationally demanding estimation methods. Here we

demonstrate the utility of an alternative approach, generalized profiling, which provides robustness

to violations of a deterministic model without needing to specify a complete probabilistic model. We

introduce novel means for estimating the robustness parameters and for statistical inference in this

framework. The methods are applied to a model for pre-vaccination measles incidence in Ontario,

and we demonstrate the statistical validity of our inference through extensive simulation. The results

confirm that school term versus summer drives seasonality of transmission, but we find no effects of

short school breaks and the estimated basic reproductive ratio R0 greatly exceeds previous estimates.

The approach applies naturally to any system for which candidate differential equations are available,

and avoids many challenges that have limited Monte Carlo inference for state-space models.

Keywords: differential equation model, generalized profiling, state-space model,

measles

1 Introduction

Mathematical models are now widely recognized as an important tool in efforts to understand, manage,and contain infectious diseases in humans, other animals, and plants. For some problems, general modelsmay yield broadly applicable solutions. For example, very simple models that distinguished between“core” and “non-core” groups (having high and low contact rates, respectively) led to the successfuladoption of contact tracing for the control of gonorrhea in the US (Hethcote and Yorke 1984). RecentlyWallinga et al. (2010) showed that in a broad range of situations, vaccinating individuals in the groupexperiencing the highest force of infection would be most effective for reducing the transmission of a novelinfection.

But in many cases progress requires a model tailored to a specific situation with parameters esti-mated as accurately as possible. Fitting models to data is important for basic epidemiology, becausecomparison of alternative models is a powerful tool for discriminating between different hypothesesabout underlying mechanisms (e.g., Koelle and Pascual 2004; Webb et al. 2006; Wearing and Rohani2006; Miller et al. 2006; King et al. 2008; New et al. 2009) or potential environmental drivers of diseaseprocesses (e.g., Pascual et al. 2000; Koelle et al. 2005; King et al. 2008; Ferrari et al. 2008; Shaman et al.

∗Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY, 14850, USA†Department of Ecology and Evolutionary Biology, Cornell University, Ithaca, NY, 14850, USA‡Weill Cornell Medical College, New York, NY, 10065, USA§Department of Mathematics & Statistics and M.G. DeGroote Institute for Infectious Disease Research, McMaster

University, Hamilton, ON, L8S 4K1, Canada

1

2010). And in many cases, the relative gains from different management options depend on quanti-tative values of parameters or on case-specific features such as multiple transmission routes and theirrelative importance (e.g., Blower et al. 2000; Keeling 2005; Longini and Halloran 2005; Miller et al. 2006;Ferguson et al. 2006; Truscott and Ferguson 2009; Medlock and Galvani 2009).

An essential requirement for predicting the course of an outbreak and the responses to possibleinterventions is estimation of transmission rates – who infects whom, when, and at what rate? Butestimating transmission is also especially challenging because transmission is usually unobserved. Wemay know (at best) when a particular host individual contracted the infection, and in what circum-stances, but (apart from sexually transmitted diseases in humans) it is rare to know with any degreeof certainty which other individual passed the infection to them. Consequently, a fundamental goal offitting dynamic models to epidemic data is to uncover the transmission process, and derive predictivemodels of transmission: at what rate will hosts become infected, and what factors (in the host populationand its environment) influence that rate (e.g., Fine and Clarkson 1982; Finkenstaedt and Grenfell 2000;Xia et al. 2005; King et al. 2008; Metcalf et al. 2009)? That problem is the topic of this paper. Wepresent a new statistical method that can be used to fit infectious disease models, generalized profiling(Ramsay, Hooker, Campbell, and Cao, 2007), and illustrate its performance using case-report data onpre-vaccination era measles in Ontario, Canada.

The data, plotted in Figure 1, present a number of challenges for time series analysis.

1. There is substantial measurement error. Measles was contracted by essentially all children becauseof its high infectivity, but the number of reported cases is smaller than the number of births duringthe same time period by factor of ≈ 6, implying very large systematic under-reporting. In addition,an examination of a 24-month moving average of case reports indicates that there is no long-termtrend in the number of case reports, even though the population increased substantially (as seen inthe rising birth rate), suggesting a substantial trend in reporting rate. The low reporting rate alsoimplies that random sampling error will be too large to ignore.

2. We have data on only two processes (birth rate,and the rate of new infections) in a system withmany state variables and other disease processes.

3. The underlying process governing measles dynamics in Ontario, whose parameters we want toestimate, is nonlinear and stochastic, with events occurring continuously in time.

The state-of-the-art methods for this sort of fitting problem are Monte Carlo methods for state-spacemodels, either Bayesian methods using Markov Chain Monte Carlo (e.g., Morton and Finkenstaedt2005; Cauchemez and Ferguson 2008; New et al. 2009 and references therein) or Sequential Monte Carlo(Ionides et al. 2006; King et al. 2008; He et al. 2010). The method that we present here is completelydifferent. The fitting criterion (explained in detail below) combines the least-squares trajectory matching(Banks et al. 1981; Bock 1983; Biegler et al. 1986; Arora and Biegler 2004; Ciupe et al. 2006) and gra-dient matching (e.g., Swartz and Bremermann 1975; Varah 1982; Ellner et al. 2002; Brunel 2008) fittingcriteria, using basis expansion methods to reconstruct the full set of state variables. The underlyingmodel is still a continuous time state-space model, but very few model solutions at candidate parametervalues are required to converge to an excellent fit.

As a result, generalized profiling has some computational and other practical advantages. Mostimportantly, it has proven very challenging to construct MCMC proposal distributions (from which todraw new candidate parameter values) whose acceptance and “mixing” rates are both high enough for theposterior distribution to be sampled well (e.g., to find all modes in the posterior) within a practical amountof time. Off-the-shelf methods have often been unsuccessful, for example New et al. (2009) fitted threealternative state-space models for red grouse population cycles using WinBUGS, one of which involvedgrouse-parasite interactions. They observed poor mixing of the MCMC sampler for two out of the threemodels, and cautioned that “posterior densities may not accurately represent the target distribution,leading to questionable inference”. Sequential Monte Carlo methods may require careful case-specific

2

●

●●●

●

●

●

●

●

●

●

●●

●●

●

●

●

●●

●●

●

●●

●●

●

●

●●●●●●●●●●●

●●●●

●●●●●●

●

●

●●●

●

●

●●

●●

●

●

●

●●

●

●

●●

●

●●●●

●●●●●

●●●●●●●●●

●●●●●●

●●

●

●

●

●

●

●

●●

●

●

●

●

●

●

●●

●

●

●

●

●

●

●●●●

●

●

●

●

●

●●

●

●

●●●●●●●●●●●●●

●●●●●

●●●●●●●●●●●●●●●●●●●●●●●

●

●●●

●●●●

●●●

●●●

●

●●●●●●●●●●●●●●●●●

●

●●●●●●●●●●

●

●●●●

●●

●

●

●

●

●

●●

●

●

●

●

●

●

●●●

●●●●●●●●●●

●●●

●

●●

●●●●●

●

●

●

●●

●●●

●●●●●

●●●

●

●

●●●●●

●

●

●●

●●●●●●●●●●●●●●

●●●●●

●●●●●●●●●●●

●●●●●●●●●●●●●●

●●●●

●●●

●

●●●●●●●●●●●

●●●●

●●●●●●

●

●●●

●

●

●

●

●

●

●●

●

●

●

●●●●

●

●

●

●

●

●

●

●

●

●

●

●●

●

●●

●●

●●●●●●●●●●●●●●●●●●●

●

●

●●●●●●●●●●

●

●●●●●

●●●

●●

●●●●●

●●●●●●●●●●●●●●●

●●●

●●●●●●●

●

●

●

●●

●

●

●●●

●

●●

●

●●

●●

●●

●

●

●●

●

●●

●●

●●

●●●●●●●●●●●●●●●●●●●●●

●●●

●●●

●

●

●

●●

●

●●●●●

●●●●●

●●●

●●●

●●●●●●●●●●●●●●

●●●●●●●●●●

●

●●●

●●

●

●

●●●●

●

●●●

●

●●●

●

●

●

●

●

●

●●

●

●●●●●●●●●●

●●●●●

●

●

●

●

●●

●

●

●

●

●

●

●●●

●

●

●●

●●

●

●

●

●

●

●

●

●●●

●●●

●●●●●●●●●●●●●●●●

●●●●●●

●●●●●●

●

●

●

●

●

●●●

●

●

●

●

●

●●●●●

●●

●●●●●●●

●●

●●●●●●●●●●●●●

●●●●

●●

●

●

●

●

●●

●●●●●●

●

●

●●

●

●

●●●●

●●●●●

●

●

●

●●●●●●●●●●●●●●

●●●●●●●●●●●

●●●●●●

●●●●●

●

●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●

●

●●

●●●●●●●

●

●

●

●●●●●

●

●

●●●●

●

●●

●

●●

●●●

●●●●●●●●●●

●●●●

●

●

●●●

●

●

●

●

●

●

●●

●●

●

●

●

●

●●●●

●

●

●

●

●

●

●

●●●

●

●●●●●●●●

●●●●

●

●

●●●

●

●

●●

●

●

●●

●●●●

●●●●●

●

●●●

●●

●

●

●

●

●●●●

●

●●

●●●

●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●

●●●●●●●●

●●

●●●●●●

●●●●●

●●

●●●●●●●●●●●●●

●●●

●●

●●●

●

●●

●

●

●●

●

●●●

●

●

●

●

●●●●●●●●●●

●

●

●

●●

●●●●●●●●●●●

●●●●●●●●●●●●

●●●●●●●●●●●

●

●

●

●

●

●

●●

●●●●●●●●●

●

●●●●●●●●●●●●●●●

●

●●●●●●●●●●

●●●

●

●●

●

●

●

●

●

●

●

●

●

●●

●●●

●

●●

●●●

●

●

●●

●●●●●●●●●●●●

●

●

●

●●●●●

●

●

●●●●

●

●●

●

●

●●●

●

●●

●●

●

●

●●●

●●●●●

●●●●●●●●●●●●●●●

●●●●●●

●●

●

●

●

●

●

●●●●

●●●

●

●

●

●

●●

●

●●●

●●

●●

●●●

●●●●●●●●●●●●●●●●

●

●●●●

●●

●

●

●

●●●

●

●

●●

●

●

●

●●

●

●

●●

●

●

●

●

●

●●

●

●

●

●●●●●●●●●●●●

●●

●●●●●●●

●●●

●●

●●●●

●●

●

●

●●●●

●

●

●

●

●

●

●●

●

●●

●

●

●

●

●●●●●●●●●●●●●●●●●●●●●

●

●●●●

●●●

1940 1945 1950 1955 1960 1965

010

0020

00

Cas

e re

port

s/w

k (a)

●

●●●

●

●

●

●

●

●

●

●●

●●

●

●

●

●●

●●

●

●●

●●

●

●

●●●●●●●●●●●

●●●●

●●●●●●

●

●

●●●

●

●

●●

●●

●

●

●

●●

●

●

●●

●

●●●●

●●●●●

●●●●●●●●●

●●●●●●

●●

●

●

●

●

●

●

●●

●

●

●

●

●

●

●●

●

●

●

●

●

●

●●●●

●

●

●

●

●

●●

●

●

●●●●●●●●●●●●●

●●●●●

●●●●●●●●●●●●●●●●●●●●●●●

●

●●●

●●●●

●●●

●●●

●

●●●●●●●●●●●●●●●●●

●

●●●●●●●●●●

●

●●●●

●●

●

●

●

●

●

●●

●

●

●

●

●

●

●●●

●●●●●●●●●●

●●●

●

●●

●●●●●

●

●

●

●●

●●●

●●●●●

●●●

●

●

●●●●●

●

●

●●

●●●●●●●●●●●●●●

●●●●●

●●●●●●●●●●●

●●●●●●●●●●●●●●

●●●●

●●●

●

●●●●●●●●●●●

●●●●

●●●●●●

●

●●●

●

●

●

●

●

●

●●

●

●

●

●●●●

●

●

●

●

●

●

●

●

●

●

●

●●

●

●●

●●

●●●●●●●●●●●●●●●●●●●

●

●

●●●●●●●●●●

●

●●●●●

●●●

●●

●●●●●

●●●●●●●●●●●●●●●

●●●

●●●●●●●

●

●

●

●●

●

●

●●●

●

●●

●

●●

●●

●●

●

●

●●

●

●●

●●

●●

●●●●●●●●●●●●●●●●●●●●●

●●●

●●●

●

●

●

●●

●

●●●●●

●●●●●

●●●

●●●

●●●●●●●●●●●●●●

●●●●●●●●●●

●

●●●

●●

●

●

●●●●

●

●●●

●

●●●

●

●

●

●

●

●

●●

●

●●●●●●●●●●

●●●●●

●

●

●

●

●●

●

●

●

●

●

●

●●●

●

●

●●

●●

●

●

●

●

●

●

●

●●●

●●●

●●●●●●●●●●●●●●●●

●●●●●●

●●●●●●

●

●

●

●

●

●●●

●

●

●

●

●

●●●●●

●●

●●●●●●●

●●

●●●●●●●●●●●●●

●●●●

●●

●

●

●

●

●●

●●●●●●

●

●

●●

●

●

●●●●

●●●●●

●

●

●

●●●●●●●●●●●●●●

●●●●●●●●●●●

●●●●●●

●●●●●

●

●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●●

●●●●●●●●

●

●●

●●●●●●●

●

●

●

●●●●●

●

●

●●●●

●

●●

●

●●

●●●

●●●●●●●●●●

●●●●

●

●

●●●

●

●

●

●

●

●

●●

●●

●

●

●

●

●●●●

●

●

●

●

●

●

●

●●●

●

●●●●●●●●

●●●●

●

●

●●●

●

●

●●

●

●

●●

●●●●

●●●●●

●

●●●

●●

●

●

●

●

●●●●

●

●●

●●●

●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●

●●●●●●●●

●●

●●●●●●

●●●●●

●●

●●●●●●●●●●●●●

●●●

●●

●●●

●

●●

●

●

●●

●

●●●

●

●

●

●

●●●●●●●●●●

●

●

●

●●

●●●●●●●●●●●

●●●●●●●●●●●●

●●●●●●●●●●●

●

●

●

●

●

●

●●

●●●●●●●●●

●

●●●●●●●●●●●●●●●

●

●●●●●●●●●●

●●●

●

●●

●

●

●

●

●

●

●

●

●

●●

●●●

●

●●

●●●

●

●

●●

●●●●●●●●●●●●

●

●

●

●●●●●

●

●

●●●●

●

●●

●

●

●●●

●

●●

●●

●

●

●●●

●●●●●

●●●●●●●●●●●●●●●

●●●●●●

●●

●

●

●

●

●

●●●●

●●●

●

●

●

●

●●

●

●●●

●●

●●

●●●

●●●●●●●●●●●●●●●●

●

●●●●

●●

●

●

●

●●●

●

●

●●

●

●

●

●●

●

●

●●

●

●

●

●

●

●●

●

●

●

●●●●●●●●●●●●

●●

●●●●●●●

●●●

●●

●●●●

●●

●

●

●●●●

●

●

●

●

●

●

●●

●

●●

●

●

●

●

●●●●●●●●●●●●●●●●●●●●●

●

●●●●

●●●

●●●

●

●

●●●●●●●

●●●●

●●●●

●●●●●●●●

●●●●●

●●●●●●●

●●●●●●●●

●●●●

●●●●●

●●●

●●●●

●●

●●●

●●●

●

●●

●

●●●●●

●●●

●●●

●

●●

●●●●●●

●●●

●

●

●

●●●●●

●

●●●●

●●●●

●●●●

●●●

●

●

●

●

●

●

●●●

●●

●

●

●●

●●●

●●

●

●●●

●

●●

●●●

●●●

●●

●●

●

●

●●●●●●

●

●●

●

●

●

●●●●●

●

●●●

●

●●●●

●●●

●

●●●

●

●●

●

●●

●●●

●

●●●

●

●

●

●

●●●●

●

●●

●

●●

●●●

●

●●

●●●

●

●●

●

●●●

●●

●●●

●

●

●

●●

●●●

●

●●●

●

●●●●

●●●

●

●●●

●

●

●

●

●

●

●●●

●

●●

●

●

●●

●

●●●●

●●●

●

●●

●

●

●●●●

●●●●

●

●

●●●

●

●●

●●●

●

●

●

●●●

●●

●

●●●

1940 1945 1950 1955 1960 1965

6000

1200

0

Date

Bir

ths/

mon

th

(b)

Figure 1: (a) Weekly measles case reports for the province of Ontario, Canada, from 1939 though 1965.Data were digitized from unpublished, handwritten spreadsheets obtained from the Ontario Ministry ofHealth. The light grey line represents a moving average over the 24 months preceding and followingeach date. (b) Monthly live births for the province of Ontario, Canada, from 1939 through 1965. Dataobtained from: Government of Canada, Dominion Bureau of Statistics, “Vital Statistics 1921-1966”.Both the weekly case reports and the monthly birth time series can be downloaded from the InternationalInfectious Disease Data Archive (http://iidda.mcmaster.ca).

fine-tuning to avoid “particle depletion”, the analog of poor mixing (see section 3.2 of Newman et al.2009). Fitting by generalized profiling only requires nonlinear optimization. Optimization can also becomputationally challenging in general, but the profiling objective function is often well-behaved becauseof regularization by the gradient matching component of the function (see Ramsay et al. (2007), section2.8).

The generalized profiling method was first described by Ramsay et al. (2007), and it has since beenapplied in several case studies. Here, in addition to challenging the method with epidemic data, we addtwo methodological developments: a data-based method for choosing the “smoothing” parameter thatsets the relative weighting of trajectory and gradient matching (previously there has been no objectiveway to choose the smoothing parameter), and an improved method for calculating asymptotic confidenceintervals on parameter estimates.

Measles modeling is a well-ploughed field. We revisit measles because the degree of knowledgeabout this pathogen makes it possible to test our method in a simulation study that closely mimics thereal-world data, by generating artificial data from a biologically realistic model with biologically realisticparameters. Our main “target” was the pattern of seasonal variation in the transmission rate parameter,

3

which is a key unknown in the transmission process and determines many of the most important featuresof the dynamics (Earn et al. 2000; Earn 2009).

The results from the simulation study are encouraging in three respects: the estimated seasonalvariation was very close to the actual variation in the data-generating model, the variability acrossreplicates was similar to the width of the confidence bands on the estimate from the real data, and thecomputations could be completed in a reasonable amount of time on a current desktop workstation.The seasonal variability that we estimated from the real data is similar to the commonly used “termtime” model (high transmission when schools are in session, low when schools are out). However, therewas some estimated variability within school terms, and no evidence of major decreases during shortschool holidays such as the Christmas break. In addition, the estimated mean transmission rate implies(with high confidence) a considerably higher value of R0 higher than previously estimated for measles inOntario.

2 Background

2.1 The data

In Ontario, notifications of measles cases were tabulated weekly from 1939 to 1989. Although neverpublished, the Ontario Ministry of Health (OMoH) retained the original handwritten spreadsheets, whichwe digitized in 2001 for other work (Bauch and Earn, 2003). For the present analysis, we restrictedattention to the pre-vaccine era.

2.2 Model, Measurement and Parameterizations

As our model for the epidemic process we use a version of the SEIR equations (Bauch and Earn, 2003)with mortality assumed to be negligible in the age-group where infection occurs:

S = ρ(t)− β(t)(I + v)S (1a)

E = β(t)(I + v)S − σE (1b)

I = σE − γI (1c)

where S denotes a susceptible population (number of individuals), with recruitment rate ρ(t). Susceptiblesbecome exposed (E, infected with the disease but not infectious) at rate β(t)(I +v). Exposed individualsmove into the infectious class I at rate σ and move out as they recover, at rate γ. We assume zeromortality in the population of interest. The recovered class (R) does not enter the equations for S, E andI, becausse we are assuming lifelong immunity, so there is no need to include a fourth equation (whichwould be R = γI). The rate of infection is modified by a visiting impact v, which allows a small numberof infectious individuals outside the modeled population to pass the infection to susceptibles. This hasbeen included to remove the potential for extinction of an epidemic in the (stochastic) simulation studiesbelow.

Our main goal is to estimate the time-varying transmission rate parameter β(t), in particular tosee how variation in β(t) relates to the structure of the school year. To represent this time-varying ratewe used a basis expansion

β(t) = β0 + β1(t− 1952) +k∑

i=1

φi(mod(t, 1))β2i (2)

where the φi(t) were taken to be cyclic cubic B-splines with knots taken at intervals of 1/12 years. Thisnumber of knots is sufficient to represent both smooth seasonal variation and the sharper on/off variationof the school year versus summer holidays, and there is no evidence of shorter time-scale features (such

4

as the Christmas holiday) in the case report data (see Figure 6 below)1. The initial linear term allows fora long-term trend in the transmission rate, in addition to the annual cycle parameterized by the φi(t).Linear combinations of B-spline bases can represent constant functions such as β0 meaning that (2) isnot identifiable as written. The β2i were therefore constrained to enforce

∫ 1

0

k∑

i=1

φi(t)β2i dt = 0,

allowing this representation to have the effect of being a variation about an overall average for the year.The linear term was centered at 1952, the mid-point of the dates of our observations [1939, 1965] in orderto stabilize the estimate of β0 and allow its interpretation as the over-all transmission rate at the startof 1952.

In addition to disease processes, we also need to model the observation process. Weekly measlescases reported are assumed to be measurements Io

i = p(ti)I(ti), i = 1, . . . , N for some reporting fractionp(t) and the ti occurring 52 equal intervals across the year. The reporting ratio is also allowed to changelinearly over time, represented by p(t) = p0 +p1(t−1952), where the centering term has been used in thesame manner as for the parametrization of β(t). We observe that (1) can be re-parameterized in termsof I∗ = p(t)I so that all parameters appear in the differential equation:

S = ρ(t)− β(t)(

I∗

p(t)+ v

)S (3a)

E = β(t)(

I∗

p(t)+ v

)S − σE (3b)

I∗ = p(t)σE − γI∗ (3c)

and we can set Ioi = I∗(ti). Note that (3c) is p(t)I; the term p(t)I is negligibly small (i.e., the mean

reporting rate is nearly constant, relative to the epidemic dynamics that occur on a much shorter timescale) so for fitting purposes that term was omitted. It is often numerically easier to take the log of eachstate when solving equations such as (3) and we have followed this convention. We have also used thelog of the observed data. In the simulation study below we employ a stochastic finite-population process(Gillespie algorithm, Kurtz (1980); Gillespie (1976)) corresponding to Eqs. (1) where the log transformalso helps to make the variation in the paths more stable.

2.3 Initial Parameter Values

Some of the parameters in Eqs. (3) were held fixed, because they are directly observable and so their valuesare known with effectively perfect precision relative to the uncertainties in the unobserved transmissionand reporting processes. The recruitment rate ρ(t) is interpolated from the monthly birth rate data (fig.1) at a five-year lag, which approximates the age of entry into primary school. The time period betweenexposure and infectiousness (the latent period) is known to be around 8 days, giving a value σ = 365/8.Similarly, the mean infectious period is around 5 days, implying γ = 365/5.

Initial values also need to be specified for v, β(t) and p(t). We initialized the reporting rate to beconstant (p1 = 0) and obtained an initial p0 from the ratio of total measles cases 1951-1960 to total births1946-1955. This calculation assumes that all children are eventually infected so that the ratio of totalobserved cases to total births five years earlier provides an approximate proportion of reported infections.The resulting estimate is 16%.

1When we used twice as many knots, the estimated β(t) contained only one additional feature: a hint of a brief decrease inthe month prior to the Christmas holiday, which has been observed in other childhood diseases and interpreted as resultingfrom a delay in reporting caused by slower mail delivery. As this is not a property of the actual epidemic dynamics, weconsider 1/12 year knot intervals to be sufficient for representing the actual β(t) process.

5

β(t) was initialized to approximate a step function providing one level at June, July and August andanother during the rest of the year. Initial values are taken from Bauch and Earn (2003). v is initializedat 10−6, or about 10 infectious visitors continuously present.

3 The Generalized Profiling Procedure

Ramsay et al. (2007) describe a method for estimating parameters in ordinary differential equations suchas (3) that is robust to model misspecification. In particular, it allows for small-variance stochasticforcing of a system, which would be expected to result from environmental and demographic stochas-ticity. The method is based on combining a collocation approach to estimating differential equations(Deuflhard and Bornemann, 2000) and allowing deviations from differential equation models in a mannersimilar to the multiple shooting methods of Bock (1983) and the trust region methods in Arora and Biegler(2004).

To facilitate the exposition we use somewhat more general notation to write the state vector asx(t) = (S(t), E(t), I(t)) = (xS(t), xE(t), xI(t)) and refer to the components of x by xi(t) for i ∈ {S,E, I}.In this short-hand, we re-write Eqs. (3) as x = f(x; t, θ) where θ collects all the unknown parameters. Inour model, θ =(p0, p1, v, β0, β1, β2,1, . . . , β2,12). Elements of f will be referenced in the same manner aselements of x.

We begin by representing x(t) in terms of a basis expansion

x(t) = Φ(t)C (4)

for Φ(t) = (φ1(t), . . . , φk(t)) a vector of basis functions and coefficient matrices C = (CS CE CI).Throughout we have used as Φ the cubic B-spline basis with knots at weekly intervals over 1939-1965.Parameters are then estimated through a two-level optimization. First, for any value of the parametervector θ, C is chosen by

C(θ) = argminC

N∑

i=1

(Ioi − Φ(ti)CI)

2 + λ∑

j∈{S,E,I}wj

∫ (Φ(t)Cj − fj(Φ(t)C; t, θ)

)2

dt

. (5)

The first term on the right-hand side of (5) is the sum of squared deviations between the observed case-report data (scaled to compensate for under-reporting) and the estimated trajectory Φ(ti)CI . This is thefitting criterion for estimating parameters by least-squares trajectory matching. The second term is aweighted sum of squared deviations between the gradients (time derivatives) of the estimated trajectoriesfor each state variable, Φ(t)Ci, and the corresponding gradients generated by the model, fi(Φ(t)C; t, θ).This is the fitting criterion for estimating parameters by gradient matching. In contrast with previousapproaches, the gradient matching term includes unmeasured state variables, because the complete systemstate is estimated along with model parameters (a property shared by Monte Carlo methods for state-space models). The parameter λ determines the relative importance given to trajectory-matching andgradient-matching components of the fitting criterion.

The combined fitting criterion (5) means that the estimated trajectories only approximately satisfyEqs. (3) This contrasts with trajectory matching, in which estimated trajectories are model solutions.In this way, (5) allows the data to impose deviations from the model if appropriate. This allows param-eter estimates to be robust to departures from the model, either through imperfect specification of thefunctional forms of process rates, or stochastic departures from the deterministic paths predicted by thedifferential equations (for example, under suitable technical assumptions, the methods in Brunel (2008)can be extended to show that gradient matching remains valid when the dynamics are stochastic withf(x; t, θ) being the expected value of x given x(t)).

Formally, (5) takes the form of a nonparametric penalized likelihood estimate for the noisily-observed state vector. Replacing the second term in with λ

∫(d2Φ(t)C/dt2)2dt corresponds exactly to

6

standard smoothing spline regression models, as studied for example in Wahba (1990). In that situation,the value of λ controls the smoothness of the resulting regression function. Higher values of λ lead tosmoother function estimates, so the optimal value of λ becomes small as sample size increases. In our ap-plication, we think of (5) as allowing the data to impose deviations from the differential equation model,if appropriate. We thus default to thinking of λ as large, rather than small, so that estimated rates ofchange in state variables are always close to what the model predicts, given the estimated system state.

The weights wi allow us to place more or less emphasis on each of equations (3a-3c). We haveplaced relative weight 100 on the equation for S. This corresponds approximately to the relative ordersof magnitudes of the numbers of susceptibles and of those who are exposed and infected. We wouldtherefore expect a corresponding decrease in demographic stochasticity in this variable. Note that whileonly I is measured, all state variables appear in the second term and so the minimization problem isfeasible in principle.

In practice, the integral in the second term of (5) cannot be computed analytically and instead itis approximated by a numerical quadrature rule. We have used a rule that places one quadrature pointtq at the midpoint between knots – i.e., halfway through each week. This allows the basis expansionto exactly solve Eqs. (3) at those points, making the second term in (5) zero if need be. The resultingobjective is written as

N∑

i=1

(Ioi − Φ(ti)CI)

2 + λ∑

j∈{S,E,I}

wj

K

K∑q=1

(Φ(tq)C − fj(Φ(tq)C; tq, θ)

)2

.

This is now a sum of squares criterion which can be minimized by a Newton-Raphson algorithm.Using the definition (5), C is represented as a function of θ. We think of C(θ) as defining, for each

θ, an estimated system trajectory in terms of the basis expansion (4). For choosing θ it is then natural toapply a sum of squares criterion, as would be used in nonlinear regression (e.g. Bates and Watts, 1988):

θ = argminθ

N∑

i=1

(Ioi − Φ(t)CI(θ)

)2

. (6)

This sum-of-squares criteria can be minimized by a Gauss-Newton algorithm with gradients of CI(θ)obtained through the implicit function theorem.

Hooker (2007) demonstrated that in the limit λ →∞, the generalized profiling method is equivalentto fitting both parameters and initial conditions to the data by ordinary least squares. However, forfinite values of λ, the method provides robustness to model misspecification and to disturbances of thedynamical system. It also results in optimization surfaces that have fewer local maxima than attemptingto fit the data by trajectory matching to solutions of Eqs. (3).

4 Choosing the Smoothing Parameter

While the methodology in Ramsay et al. (2007) allows for computationally efficient and statisticallyrobust parameter estimation, no guide was provided on how to choose the “smoothing parameter” λ.As discussed above, it is natural to think of λ as being large, meaning that standard cross-validationmethods that are designed to set λ → 0 are unlikely to yield appropriate values.

Ellner (2007) suggested the approach of forwards cross-validation for these models. This methodcombines both estimates C(θ) and θ to predict future responses. Specifically, let x(t, θ,x0) denote thesolution, at time t, of Eqs. (3) with x(0) = x0. Then we define the forward prediction error

FPE1(λ) =N∑

i=1

(Ioi − xI(h, θλ, Φ(ti − h)CI(θλ))

)2

. (7)

7

1943 1943.1 1943.2 1943.3 1943.4 1943.5 1943.6−12

−10

−8dataestimated trajectoryODE solution

1943 1943.1 1943.2 1943.3 1943.4 1943.5 1943.6−12

−10

−8

Figure 2: An example of forwards cross validation from a particular point, in this case t0 = 1943. Dashedlines represent the estimated trajectory. Solid lines show the solution to Eqs. (3) starting at the estimatedsystem state at t = 1943. Discrepancy between the data (stars) and the solid lines represent forwardsprediction error. Top panel: with λ = 1 the estimated trajectory is close to the data, but the solutionto Eqs. (3) diverges quickly from the estimated trajectory. Bottom panel: with λ = 106 the solutionto Eqs. (3) remain near the estimated trajectory, but the estimated trajectory does not follow the dataclosely.

The terms in FPE1 are the (squared) errors made in predicting the observation at time ti by startingfrom the estimated trajectory at time ti−h, and solving the differential equation model from time ti−h

to time t.Forwards cross-validation is illustrated graphically in Figure 2. The dots are the data, and the solid

line is the estimated trajectory obtained from the generalized profiling approach for an overly small (toppanel) and an overly large (bottom panel) value of λ. The dotted curve is a solution of the model withthe corresponding parameter estimates θ(λ), starting from the estimated trajectory at t0 = 1943. In thetop panel, because λ is very small, the generalized profiling criterion emphasizes trajectory matching,so the estimated trajectory is close to the data. However, the estimated trajectory is not very close tobeing a solution of the model, so the model solution diverges from the trajectory (and from the data),generating large forward prediction errors. In the bottom panel, λ is large, so the estimated trajectoryis very close to being a solution of the model. However, the estimated trajectory is far from the data,so again the forward prediction errors are large. A good choice of λ will lie somewhere between theseextremes.

Because of the computational cost of solving the differential equations, it will frequently be usefulto use a small set of K starting observation times tik

, k = 1, . . . , K and to solve Eqs. (3) over a longertime interval. A single solution can then be compared to the next L observations. These solutions may

8

overlap so that some observations are included more than once. For example, taking every second startingpoint t0, t2, . . . , tN−2 we could solve Eqs. (3) for the next two observations. If ti+1 − ti = h, this gives us

FPE3(λ) =(N−2)/2∑

i=1

(I02i+1 − xI(h, θλ, Φ(t2i+1)C(θλ))

)2

+(I02i+2 − xI(2h, θλ, Φ(t2i+1)C(θλ))

)2

(8)

For our data, we took starting times tikat the beginning of each quarter and solved Eqs. (3) for the

following year, or 52 observations. We have chosen starting points at observation times and assumeda constant interval between observations for the sake of notational convenience; clearly neither choiceis strictly necessary. In practice forwards prediction error can be minimized by a grid search, which istypically best carried out on a logarithmic scale.

Morton et al. (2009) suggested a similar approach for choosing smoothing parameters in nonpara-metric regression, based on predicting forward from an estimate of both the curve and its derivative.Hooker and Ellner (2009) demonstrated that this approach will not, in general, yield consistent estima-tors of non-parametric functions. However, choosing λ to minimize FPEL(λ) for a fixed h will yieldconsistent estimates of θ when the differential equation model is correct (Hooker and Ellner (2009)). Thechoice of L and of the tik

allow the user to choose a level of robustness in parameter estimation thatoptimizes for a desired range of extrapolation forwards in time.

5 Confidence Intervals

To estimate the variability of parameter estimates resulting from the equations above, we use a modi-fication of the confidence intervals suggested in Ramsay et al. (2007). An initial covariance estimate isgiven by

V −10 ≈ σ2

[J(θ)T J(θ)

]−1

(9)

where J is a matrix with rowsJi(θ) = Φ(ti)

d

dθCI(θ).

Here, σ2 is the residual variance. This formulation is derived from asymptotic results in nonlinearregression and is based on the assumption that the errors εi = Io

i −Φ(ti)CI(θ) are independent. Becausethe smoothing process above can create dependence among the residuals at finite samples, we insteadintroduce a correction based on a Newey-West covariance estimate to allow for dependence induced by theestimate CI(θ). We derive this estimate by a modification of asymptotic results for maximum-likelihoodestimates (e.g. Lehmann and Casella, 1997). Observing that

J(θ)T ε = 0

a Taylor expansion results in

(θ − θ) =[J(θ∗)T J(θ∗) +

d

dθJ(θ∗)T ε

]−1

J(θ) ≈ V0J(θ)

from which we obtain a covariance estimate

Cov(θ) ≈ V −10 Cov(J)V −1

0 .

In non-linear regression, the rows of J are considered independent, resulting in (9). In order to account forpotential serial dependence, we employ a Newey-West estimate (Newey and West, 1987) for the covarianceof J ,

Cov(J) ≈ V0 +m∑

k=1

(1− k

m + 1

) (Vk + V T

k

), Vk =

σ2

n

n∑

i=k+1

Ji−kJTi .

9

1952 1952.2 1952.4 1952.6 1952.8 195310

15

20

25

30

35

40

45

β(t)

/γ

Figure 3: Estimated seasonal variation in the transmission rate (plotted for the year 1952–1953; otheryears will be shifted slightly upward or downward due to the long-term linear trend β1). Solid lines showthe generalized profiling estimate, while the dashed lines show pointwise 95% confidence intervals.

Effectively, these estimates allow for local dependence between the εi. Guided by the asymptotic theory(Newey and West, 1987) the number of terms m was chosen as the maximum of 5 and n1/4; empirically,the resulting confidence intervals are insensitive to the choice of m.

6 Results

The statistical methods described above were implemented in Matlab, using the software tools provided inHooker (2006). Matlab code to run this analysis is available as supplementary material. The same func-tionality can be found in the CollocInfer package (Hooker et al., 2010) for the R statistical programminglanguage.

The estimated seasonal pattern of variation in the transmission parameter β(t)/γ is plotted inFigure 3, along with pointwise confidence intervals. A distinct reduction in the transmission rate isevident over the summer holidays. The pronounced peak around the start of the school year may be dueto a higher rate of mixing, or possibly an artifact to do with the influx of new susceptibles (which wasnot modeled here). He et al. (2010) included a pulse of recruitment at the start of the school year, butassumed that contact rate was a step-function taking one value during school terms and another duringholidays. As noted above, the slight decrease towards the end of the year and subsequent small peakmay be artifacts due to delayed reporting in pre-Christmas period. We estimate that the over-all level ofβ(t)/γ to be β0/γ = 27.0±2.91 in 1953 and increased at rate β1/γ = 0.244±0.115 per year. As expected,we estimated a substantial trend in the reporting rate (p0 = 26.9±1.9% and p1 = −0.0097±0.002%/year).The flow of visitors into the infectious pool was estimated as essentially zero, but the objective functionwas effectively flat as a function of visiting rate at the point estimate of the parameters, suggesting thatthe data contain very little information about this parameter. For the purposes of numerical stability, v

was therefore removed from the estimate of covariance used to compute confidence intervals.A further advantage of the methods outlined above is that they provide readily accessible diagnostics

10

1940 1945 1950 1955 1960 1965

10^5.5

10^5.7

10^5.6S

Fit to Data

1940 1945 1950 1955 1960 1965100

1000

10000

E

1940 1945 1950 1955 1960 1965

100

1000

I

year

1940 1945 1950 1955 1960 1965

−0.8−0.6−0.4−0.2

00.20.4

dLog

(S)/

dt

Fit to Model

1940 1945 1950 1955 1960 1965

−10

0

10

dLog

(E)/

dt

1940 1945 1950 1955 1960 1965

−10

0

10

dLog

(I)/

dt

year

Figure 4: Top three panels: a comparison of estimated trajectories (lines) and observed data (points).Note that observations are available only for the I component of the state vector. Bottom three panels:the derivative of the estimated trajectory (dark) compared with the derivative calculated from the righthand side of Eqs. (3) (light).

11

−14 −12 −10 −8−2

−1

0

1

2log Observations

resi

dual

−0.8−0.6−0.4−0.2 0 0.2 0.4−0.2

−0.1

0

0.1

0.2dLog(S)/dt

−10 0 10−1

0

1

2dLog(E)/dt

predicted

resi

dual

−10 0 10−1

−0.5

0

0.5

1dLog(I)/dt

predicted

Figure 5: Residual plots for the estimated trajectory of I(t) (top left panel) and for the estimatedgradients of S, E and I, all on log scale. For I(t), the predicted values are the estimated trajectory,and the residuals are the departures of the weekly case report data from the estimated trajectory, i.e.the trajectory-matching component of the fitting criterion (5). For the gradients, predicted values arethe model gradient at the estimated trajectory (fi(Φ(t)C; t, θ); see equation (5)) and the residuals areΦ(t)Ci − fi(Φ(t)C; t, θ), the gradient-matching component of the fitting criterion (5).

for assessing model fit, based on the two components of the objective function: how well the estimatedtrajectories match the data, and how well the estimated gradients match the model. In Figure 4 the topthree panels show the comparison between the data and the estimated trajectories; there are no obviousdiscrepancies (note that data is available only on I). The bottom three panels in Fig. 4 show the time-derivative of the estimated trajectories and the value of the right hand side of Eqs. (3) at each time point.Here, while the agreement is not perfect, there are no evident consistent patterns in the discrepancy.Residual plots (Figure 5) confirm that there are no systematic patterns of error in the trajectories orgradients, so we can interpret the discrepancies as stochastic departures of the real system from Eqs. (3).

The estimated trend in the number of susceptibles (top panel of Fig. 4), is due to the estimatedlinear trend β1 in the transmission parameter β(t). In both cases, the rate of change is about 1% peryear (1% per year decrease in S, 1% per year increase in the mean of β(t)). However, there is very highuncertainty regarding the trend in β (e.g., a tenfold smaller trend is within two standard errors of thepoint estimate), so we cannot be confident that the estimated trends in S and β1 are real.

In Figure 6 we look more closely at how well the fitted model captures the seasonal variation indisease incidence, by replotting the data as a function of week (within the year) and comparing this withmodel solutions. Because of the high year-to-year variability, the case reports for each week from 1940 to1965 were expressed as fractions of the total number of reports for that calendar year, for both the data(box plots in Fig. 6) and a forward solution of the deterministic model (1) starting from the estimated

12

system state at the start of 1940 (solid and dashed curves). Whereas the estimated trajectories (shown inFig. 4) are “pulled” towards the data by the trajectory matching component of the fitting criterion (5),forward solutions of (1) are free to go wherever they are pushed by the estimated seasonal variation inβ(t), so this comparison is a genuine test of the fit. Forward solutions of the deterministic model followfairly well the observed seasonal pattern of incidence. The main discrepancy is that the decrease andincrease in incidence at the start and end, respectively, of the summer school holiday occur somewhatfaster in model solutions than in the data. Otherwise, with the estimated seasonality of transmission rate(Fig. 3) the deterministic model appears to capture all of the qualititative properties of the observedseasonal pattern.

7 A Simulation Study

Our specific goal in this paper was show how the the transmission rate β(t) could be reconstructedfrom a real measles time series using generalized profiling. While we have achieved that goal, it isultimately impossible to know whether real processes that we have not modelled have influenced thesuccess of our approach. In addition, any real incidence time series represents a single realization of a(very large) stochastic process, and we do not know how our results might vary for other realizationsof the same process. Therefore, to validate the generalized profiling method for epidemiological timeseries, we conducted a simulation study based on a finite population model for an SEIR process. For thesimulations, there is no uncertainty about the processes that generated the data, and we can examinethe variability of our results across many realizations of the identical process.

Specifically, a rough approximation for the estimated parameters from the Ontario measles datawas used in a Gillespie algorithm simulation over the same time period. Observations were generated bybinomial samples with probability p(t) from transitions E → I over the previous week. 250 replicatedhistories were created, each requiring about 10 CPU hours on a current desktop workstation (one PentiumD 3GHz processor). For each replicate, the parameters in the model were estimated using our methods.Each estimation, including a search over λ to minimize forward prediction error, required less than 2 CPUhours on the same workstation, and could be made considerably shorter by more careful initialization ofthe optimization and reducing the number of λ values considered.

We are interested in both the bias of the estimates and their variability. Two potential sourcesof bias should be distinguished: bias due to the use of the generalized profiling objective function, andthe bias associated with the use of a finite collocation basis to approximate system trajectories. Thelatter of these can be assessed by choosing a very large value of λ, and estimating parameters fromerror-free deterministic solutions to Eqs. (3). Because a large value of λ places almost all emphasis onthe second term in (5), this is equivalent to defining C(θ) as the coefficients that give the best possibleapproximation by the basis expansion (4) of exact solutions to the deterministic model (see Hooker,2007). Minimizing (6) is then a straightforward nonlinear least squares criterion. Since the data beingfit in this procedure come from exact solutions of the differential equation without noise, the parameterswill be estimated perfectly except for bias due to approximation of solutions by a finite basis expansion(4). The discrepancy between the original parameters and these can therefore be labeled collocationbias. Additional discrepancy between the parameters estimated by the method just described, and thoseestimated by generalized profiling on the Gillespie simulation data, are estimation bias resulting from theobjective function.

Figures 7 and 8 present estimates of the parameters on simulated data. In Fig. 7 the collocationbias is given by the difference between the solid and dashed-dot lines, while the estimation bias is thedifference between the dashed line and the dashed-dot line. Both biases are small relative to the truemagnitude of β(t), and the collocation bias is dominant, indicating that the bias could be reduced byusing a larger number of terms in the basis expansion for trajectories. The spread of β(t) estimatesfrom different stochastic simulations correponds well to the confidence intervals in Fig. 3. There was

13

10 20 30 40 500

0.01

0.02

0.03

0.04

0.05

0.06

0.07

0.08

Week

Pro

port

ion

of C

ases

Figure 6: Comparison of observed and model seasonal variation in measles incidence. The boxplotsrepresent the week-by-week trend and variability in weekly case reports over the period 1940–1965,expressed as fractions of the total number of reports for the corresponding calendar year (i.e, all weeklycase reports from 1940 were divided by the total number of case reports for all of 1940, and so on). Thesolid and dashed curves show the 25th, 50th, and 75th percentiles (across years) of the same quantity in aforward solution of the deterministic model (1) starting from the estimated system state at the start of1940.

collocation bias present in most of the other parameters (Fig. 8), and the trend in reporting rate showssome additional bias resulting from the profiling objective function. The collocation bias was highest forthe visiting rate parameter, but (as noted above) this parameter was largely not identifiable.

8 Discussion

Parameter estimation for continuous-time epidemic models from typically available data, such as a time se-ries of case reports, is complicated by the nonlinear stochastic nature of the epidemic process, and by sam-pling variability and reporting bias in the observation process. Until very recently, these difficulties havebeen dealt with most effectively either by using a discrete-time approximation to the model, e.g. the TSIRmodel (Finkenstaedt and Grenfell 2000; Morton and Finkenstaedt 2005; Xia et al. 2005; Metcalf et al.2009), or through the use of quasilikelihood indirect inference methods (Gouriroux and Monfort, 1996).

We have presented a method for fitting differential equation models that can deal with these chal-lenges, without having to specify and parameterize a model for the stochastic deviations between thedeterministic model and real-world outcomes. Like Monte Carlo methods (Bayesian MCMC and se-quential Monte Carlo), our method involves estimating the trajectories of all state variables, measuredand unmeasured, along with estimating the model parameters that are our main interest. But unlike

14

1952 1952.2 1952.4 1952.6 1952.8 195310

15

20

25

30

35

Est

imat

edβ(

t)/γ

year

Figure 7: Estimates of β(t) from simulated data. Solid line: the β(t) function used in generating thesimulated data. Grey lines: estimates for 250 stochastic simulations of the epidemic process (Gillespiealgorithm). Dashed line: mean estimate from the stochastic simulations. Dash-dot line: estimate with avery large λ from an exact solution to Eqs. (3).

Monte Carlo methods, this is accomplished by numerical optimization (producing a point estimate andconfidence region for the coefficients specifying the trajectories), rather than by repeated sampling of pos-sible trajectories to approximate the joint posterior distribution of parameters and trajectories. A naiveMCMC approach to fitting the model in our simulation studies would have been impossible because ofthe computation time: ≈ 10 hours simply to generate one proposal for state trajectories, so it would havetaken several months to fit each artificial data set. For this reason, successful applications of Monte Carlomethods to epidemic models have required considerable ingenuity and effort to make the computationsfeasible, often requiring approximations or simplifications to the underlying model (Ionides et al. 2006;Cauchemez and Ferguson 2008; He et al. 2010). Our methods avoid the computational cost of generatingmany proposals, and also the difficult problem of specifying a proposal distribution for trajectories andparameters that efficiently explores the full posterior distribution. Instead, inference can be based onasymptotic variance-covariance matrices for parameters and reconstructed trajectories. An importantdirection for further research is to compare the performance of generalized profiling with Monte Carlomethods.

This study is the first empirical application of forward cross-validation (FCV) for choosing thesmoothing parameter λ in the generalized profiling method; theoretical properties of FCV are presentedelsewhere (Hooker and Ellner 2009). FCV brings generalized profiling one step closer to being fully data-driven, and thus objective in the sense that results are not affected by “tuning parameters” that a usercan set at will. However, the complexity of the seasonal variation model for β(t) (the number of splineknots) was specified a priori based on the expected time scales of possible variation, as in other recentstudies (Finkenstaedt and Grenfell 2000; Morton and Finkenstaedt 2005; Xia et al. 2005; Ionides et al.2006; Cauchemez and Ferguson 2008; Metcalf et al. 2009; He et al. 2010) We have supported our choicein two ways: increasing the number of knots to verify that relevant features were not missed, and providingconfidence bands (Fig 3) that indicate which features are well supported by the data. However, a fully

15

0.24 0.26 0.280

20

40

60Estimated Reporting Rate

Fre

quen

cy

p0

−11 −10 −9 −8 −7

x 10−3

0

20

40

60Change in Reporting Rate

Fre

quen

cy

p1

0 0.5 1

x 10−6

0

20

40

60

80Estimated Visiting Rate

Fre

quen

cy

v5 10 15

0

20

40

60Trend in β(t)

Fre

quen

cy

β1

Figure 8: Estimated parameter values from 250 stochastic simulations of the epidemic process (Gillespiealgorithm). Top left: reporting rate in 1952. Top right: yearly change in reporting rate. Bottom left:estimated visiting rate. Bottom right: linear year-to-year trend in β(t). Solid vertical lines give theparameter values used in generating the simulated data. Dashed lines give values estimated with a verylarge λ for an exact solution to Eqs. (3).

data-driven method for choosing the number of knots for β(t) would be advantageous.Our results for measles in Ontario provide general support for the “term time” model of seasonal

variation, in which the dominant factor is assumed to differences in contact rate between school terms andschool holidays. However, our estimate of β(t) differs from the term time model in two respects. First,there were no detectable decreases in transmission during short school holidays, including the Christmasbreak, even though biweekly data have sufficient resolution for such effects to be evident. Second, ourestimate includes some variation during the school year, in particular the confidence bands in Fig. 3imply that transmission is highest at the start of each school year and then declines, possibly with asecondary peak shortly after the new year. As we noted above, this within-term trend may be due to the“staggered” recruitment of individuals into the susceptible pool (a new cohort at the start of each schoolyear), which our model omits. In addition to estimating the seasonal trend, our estimate of β(t) leadsto an estimate of R0, roughly given by the mean of β(t)/γ. Here, our results are somewhat suprising.The published estimate is R0 ≈ 11 − 12 based on age-structured case reports from London, Ontario in1912-1913 (Anderson and May 1991, 1985); Bauch and Earn (2003) assumed this value in their analysisof incidence patterns in childhood diseases. Our estimate is considerably higher than that (Fig. 3),

16

but it is within the range of values estimated for developed-world cities prior to vaccination (Table 4 inAppendix to Bauch and Earn (2003)). Similarly, He et al. (2010) estimated values of R0 for measles inBritain that were roughly double previous estimates based on serological and age-dependent incidencedata. Their model, like ours, makes a number of simplifying assumptions (homogeneous mixing, neglectof age structure, etc.) so these estimates should be interpreted cautiously until they are supported bymore extensive studies using more realistic models.

Future work will apply this method to other disease time series and address biological issues. Forexample, we have weekly case reports for all reportable infectious diseases in Ontario over the periodexamined in this paper. If we consider childhood diseases with the same mean age of infection then therelevant contact pattern should be identical in all cases. If the estimated transmission rate variation isdifferent for different diseases then we might infer that seasonal effects other than contact rate variationinfluence the seasonality of transmission differently for different diseases (e.g., one can imagine thathumidity or temperature changes affect some pathogens more than others).

We have applied our methods to a recurrent infectious disease, but inferring transmission rate vari-ation from case report data is also important for real-time estimation in the context of disease invasions,where one is often interested in understanding how various control measures are impacting transmission(or, in principle, in detecting evolution of transmissibility of the causative agent). In this situation, itis typical now to have daily case reports so methods have recently been developed that exploit dailydata to estimate β(t) without relying on a specific epidemiological model such as the SEIR framework(Wallinga and Teunis, 2004; Goldstein et al., 2009). With daily case data, and an estimate of the dis-tribution of the serial interval (the time from acquiring to transmitting the infection), one can associatereported cases with transmission events in the past, and consequently estimate the daily reproductiveratio R(t) (or, equivalently, β(t) ' R(t)/Tinf, where Tinf is the mean infectious period). The extent towhich these methods can be adapted usefully for weekly time series of recurrent epidemics remains tobe seen. Conversely, generalized profiling can be applied to daily time series, and this would provide auseful comparison for results obtained with the Wallinga-Teunis method, and other methods to estimateR(t) for disease invasions (e.g., Pourbohloul et al. (2009); Fraser et al. (2009); Yang et al. (2009)).

Acknowledgements This research was supported by NSF grant DEB-0813734 (SPE and GJH), grantsfrom the James S. McDonnell Foundation (SPE and DJDE), by the Cornell University AgriculturalExperiment Station federal formula funds Project No. 150446 (GJH), and CIHR and NSERC (DJDE).

References

Anderson, R. M. and R. M. May (1985). Age-related changes in the rate of disease transmission: Impli-cations for the design of vaccination programmes. The Journal of Hygiene 94 (3), 365–436.

Anderson, R. M. and R. M. May (1991). Infectious Diseases of Humans: Dynamics and Control. Oxford:Oxford University Press.

Arora, N. and L. T. Biegler (2004). A trust region SQP algorithm for equality constrained parameterestimation with simple parametric bounds. Computational Optimization and Applications 28, 51–86.

Banks, H. T., J. A. Burns, and E. M. Cliff (1981). Parameter estimation and identification for systemswith delays. SIAM Journal on Control and Optimization 19 (6), 791–828.

Bates, D. M. and D. B. Watts (1988). Nonlinear Regression Analysis and Its Applications. New York:Wiley.

Bauch, C. T. and D. J. D. Earn (2003). Transients and attractors in epidemics. Proc. R. Soc. Lond.B 270, 1573–1578.

17

Biegler, L., J. J. Damiano, and G. E. Blau (1986). Nonlinear parameter estimation: a case studycomparison. AIChE Journal 32 (1), 29–45.

Blower, S. M., H. B. Gershengorn, and R. M. Grant (2000). A tale of two futures: Hiv and antiretroviraltherapy in San Francisco. Science 287 (5453), 650–654.

Bock, H. G. (1983). Recent advances in parameter identification techniques for ODE. In P. Deuflhardand E. Harrier (Eds.), Numerical Treatment of Inverse Problems in Differential and Integral Equations,pp. 95–121. Basel: Birkhauser.

Brunel, J.-B. (2008). Parameter estimation of odes via nonparametric estimators. Electronic Journal ofStatistics 2, 1242–1267.

Cauchemez, S. and N. M. Ferguson (2008). Likelihood-based estimation of continuous-time epidemicmodels from time-series data: application to measles transmission in London. Journal of the RoyalSociety Interface 5 (25), 885–897.

Ciupe, M. S., B. L. Bivort, D. M. Bortz, and P. W. Nelson (2006). Estimating kinetic parameters fromHIV primary infection data through the eyes of three different mathematical models. MathematicalBiosciences 200 (1), 1–27.

Deuflhard, P. and F. Bornemann (2000). Scientific Computing with Ordinary Differential Equations. NewYork: Springer-Verlag.

Earn, D. J. D. (2009). Mathematical epidemiology of infectious diseases. In M. A. Lewis, M. A. J.Chaplain, J. P. Keener, and P. K. Maini (Eds.), Mathematical Biology, Volume 14 of IAS/ Park CityMathematics Series, pp. 151–186. American Mathematical Society.

Earn, D. J. D., P. Rohani, B. M. Bolker, and B. T. Grenfell (2000). A simple model for complex dynamicaltransitions in epidemics. Science 287 (5453), 667–670.

Ellner, S. P., Y. Seifu, and R. H. Smith (2002). Fitting population dynamic models to time-series databy gradient matching. Ecology 83 (8), 2256–2270.

Ellner, S. P. E. (2007). Commentary on “parameter estimation in differential equations: A generalizedsmoothing approach”. Journal of the Royal Statistical Society, Series B 16, 741–796.

Ferguson, N. M., D. A. T. Cummings, C. Fraser, J. C. Cajka, P. C. Cooley, and D. S. Burke (2006).Strategies for mitigating an influenza pandemic. Nature 442 (7101), 448–452.

Ferrari, M. J., R. F. Grais, N. Bharti, A. J. K. Conlan, O. N. Bjornstad, L. J. Wolfson, P. J. Guerin,A. Djibo, and B. T. Grenfell (2008). The dynamics of measles in sub-saharan Africa. Nature 451 (7179),679–684.

Fine, P. and J. Clarkson (1982). Measles in England and Wales-1: An analysis of factors underlyingseasonal patterns. International Journal of Epidemiology 11, 5–14.

Finkenstaedt, B. F. and B. T. Grenfell (2000). Time series modelling of childhood diseases: a dynamicalsystems approach. Journal of the Royal Statistical Society Series C-Applied Statistics 49, 187–205.

Fraser, C., C. A. Donnelly, S. Cauchemez, W. P. Hanage, M. D. Van Kerkhove, T. D. Hollingsworth,J. Griffin, R. F. Baggaley, H. E. Jenkins, E. J. Lyons, T. Jombart, W. R. Hinsley, N. C. Grassly,F. Balloux, A. C. Ghani, N. M. Ferguson, A. Rambaut, O. G. Pybus, H. Lopez-Gatell, C. M. Alpuche-Aranda, I. Bojorquez Chapela, E. Palacios Zavala, D. M. Espejo Guevara, F. Checchi, E. Garcia,S. Hugonnet, and C. Roth (2009). Pandemic potential of a strain of influenza A (H1N1): Earlyfindings. Science 324, 1557–1561.

18

Gillespie, D. T. (1976). A general method for numerically simulating the stochastic time evolution ofcoupled chemical reactions. Journal of Computational Physics 22, 403–434.

Goldstein, E., J. Dushoff, J. Ma, J. Plotkin, D. J. D. Earn, and M. Lipsitch (2009). Reconstructinginfluenza incidence by deconvolution of daily mortality times series. PNAS 106, 21825–21829.

Gouriroux, C. and A. Monfort (1996). Simulation-Based Econometric Methods. CORE Lectures. Oxford:Oxford University Press.

He, D. H., E. L. Ionides, and A. A. King (2010). Plug-and-play inference for disease dynamics: measlesin large and small populations as a case study. Journal of the Royal Society Interface 7 (43), 271–283.

Hethcote, H. W. and J. A. Yorke (1984). Gonorrhea Transmission Dynamics and Control. New York:Springer.

Hooker, G. (2006). Matlab Functions for the Profiled Estimation of Differential Equations. Dept. Bio.Stat. and Comp. Bio., Cornell University.

Hooker, G. (2007). Theorems and calculations for smoothing-based profiled estimation of differentialequations. Technical Report BU-1671-M, Dept. Bio. Stat. and Comp. Bio., Cornell University.

Hooker, G. and S. Ellner (2009). On forwards prediction error: Nonparametric smoothing and robustnessto parametric models. in preparation.

Hooker, G., L. Xiao, and J. Ramsay (2010). CollocInfer: Collocation Inference for Dynamic Systems. Rpackage version 0.1.0.

Ionides, E. L., C. Breto, and A. A. King (2006). Inference for nonlinear dynamical systems. Proceedingsof the National Academy of Sciences of the United States of America 103 (49), 18438–18443.

Keeling, M. J. (2005). Models of foot-and-mouth disease. Proceedings of the Royal Society B-BiologicalSciences 272 (1569), 1195–1202.

King, A. A., E. L. Ionides, M. Pascual, and M. J. Bouma (2008). Inapparent infections and choleradynamics. Nature 454 (7206), 877–U29.

Koelle, K. and M. Pascual (2004). Disentangling extrinsic from intrinsic factors in disease dynamics: Anonlinear time series approach with an application to cholera. American Naturalist 163 (6), 901–913.

Koelle, K., X. Rodo, M. Pascual, M. Yunus, and G. Mostafa (2005). Refractory periods and climateforcing in cholera dynamics. Nature 436 (7051), 696–700.

Kurtz, T. G. (1980). Relationships between stochastic and deterministic population models. LectureNotes in Biomathematics 38, 449–467.

Lehmann, E. L. and G. Casella (1997). Theory of Point Estimation. New York: Springer.

Longini, I. M. and M. E. Halloran (2005). Strategy for distribution of influenza vaccine to high-riskgroups and children. American Journal of Epidemiology 161 (4), 303–306.

Medlock, J. and A. P. Galvani (2009). Optimizing influenza vaccine distribution. Science 325 (5948),1705–1708.

Metcalf, C. J. E., O. N. Bjornstad, B. T. Grenfell, and V. Andreasen (2009). Seasonality and comparativedynamics of six childhood infections in pre-vaccination Copenhagen. Proceedings of the Royal SocietyB-Biological Sciences 276 (1676), 4111–4118.

19

Miller, M. W., N. T. Hobbs, and S. J. Tavener (2006). Dynamics of prion disease transmission in muledeer. Ecological Applications 16 (6), 2208–2214.

Morton, A. and B. Finkenstaedt (2005). Discrete time modelling of disease incidence time series byusing Markov Chain Monte Carlo methods. Journal of the Royal Statistical Society Series C-AppliedStatistics 54 (3), 575–594.

Morton, R., E. L. Kang, and B. L. Henderson (2009). Smoothing splines for trend estimation andprediction in time series. Environmetrics 20, 249–259.

New, L. F., J. Matthiopoulos, S. Redpath, and S. T. Buckland (2009). Fitting models of multiplehypotheses to partial population data: Investigating the causes of cycles in red grouse. AmericanNaturalist 174 (3), 399–412.

Newey, W. K. and K. D. West (1987). A simple, positive semi-definite, heteroskdasticity and autocorre-lation consistent covariance matrix. Econometrica 55, 703–708.

Newman, K. B., C. Fernndez, L. Thomas, and S. T. Buckland (2009). Monte carlo inference for state-spacemodels of wild animal populations. Biometrics 65 (2), 572–583.

Pascual, M., X. Rodo, S. P. Ellner, R. Colwell, and M. J. Bouma (2000). Cholera dynamics and ElNino-southern oscillation. Science 289 (5485), 1766–1769.

Pourbohloul, B., A. Ahued, B. Davoudi, B. Meza, L. A. Meyers, D. M. Skowronski, I. Villasenor, F. Gal-van, P. Cravioto, D. J. D. Earn, J. Dushoff, D. F. Fisman, W. J. Edmunds, N. Hupert, S. V. Scarpino,J. Trujillo, M. Lutzow, J. Morales, A. Contreras, C. Chavez, D. M. Patrick, and R. C. Brunham (2009).Initial human transmission dynamics of a novel swine-origin influenza A (H1N1) virus (S-OIV) in NorthAmerica, 2009. Influenza and Other Respiratory Viruses 3, 215–222.

Ramsay, J., G. Hooker, D. Campbell, and J. Cao (2007). Parameter estimation for differential equations:a generalized smoothing approach. Journal of the Royal Statistical Society B 69 (5), 741–770.

Shaman, J., V. E. Pitzer, C. Viboud, B. T. Grenfell, and M. Lipsitch (2010). Absolute humidity and theseasonal onset of influenza in the continental united states. PLoS Biology 8 (2).

Swartz, J. and H. Bremermann (1975). Discussion of parameter estimation in biological modelling:algorithms for estimation and evaluation of the estimates. Journal of Mathematical Biology 1, 241–257.

Truscott, J. E. and N. M. Ferguson (2009). Control of scrapie in the UK sheep population. Epidemiologyand Infection 137 (6), 775–786.

Varah, J. (1982). A spline least squares method for numerical parameter estimation in differential equa-tions. Siam Journal on Scientific and Statistical Computing 3 (1), 28–46.

Wahba, G. (1990). Spline Models for Observational Data. Philadelphia: SIAM CBMS-NSF RegionalConference Series in Applied Mathematics.

Wallinga, J. and P. Teunis (2004). Different epidemic curves for severe acute respiratory syndrome revealsimilar impacts of control measures. American Journal of Epidemiology 160 (6), 509–516.

Wallinga, J., M. van Boven, and M. Lipsitch (2010). Optimizing infectious disease interventions dur-ing an emerging epidemic. Proceedings of the National Academy of Sciences of the United States ofAmerica 107 (2), 923–928.

Wearing, H. J. and P. Rohani (2006). Ecological and immunological determinants of dengue epidemics.Proceedings of the National Academy of Sciences of the United States of America 103 (31), 11802–11807.

20

Webb, C. T., C. P. Brooks, K. L. Gage, and M. F. Antolin (2006). Classic flea-borne transmission doesnot drive plague epizootics in prairie dogs. Proceedings of the National Academy of Sciences of theUnited States of America 103 (16), 6236–6241.

Xia, Y., J. R. Gog, and B. T. Grenfell (2005). Semiparametric estimation of the duration of immunityfrom infectious disease time series: influenza as a case-study. Journal of the Royal Statistical SocietySeries C-Applied Statistics 54 (3), 659–672.

Yang, Y., J. D. Sugimoto, M. E. Halloran, N. E. Basta, D. L. Chao, L. Matrajt, G. Potter, E. Kenah,and I. M. Longini Jr (2009). The transmissibility and control of pandemic influenza A (H1N1) virus.Science 326, 729–733.

21

Parameterizing State-space Models for Infectious Disease...

Documents

Transcript of Parameterizing State-space Models for Infectious Disease...