Evolution, geographic spreading, and demographic ... · 1/10/2020  · Evolution, geographic...



Evolution, geographic spreading, and demographic distribution of Enterovirus D68

Emma B. Hodcroft,1,2,∗ Robert Dyrdak,3,4 Cristina Andrés,5 Adrian Egli,6,7 Josiane Reist,6,7 Diego García Martínez de Artola,8 Julia Alcoba Flórez,8 Hubert G. M. Niesters,9 Andrés Antón,5 Randy Poelman,9 Marijke Reynders,10 Elke Wollants,11 Richard A. Neher,1,2 and Jan Albert3,4

1 Biozentrum, University of Basel, Basel, Switzerland
2 Swiss Institute of Bioinformatics, Basel, Switzerland
3 Department of Clinical Microbiology, Karolinska University Hospital, Stockholm, Sweden
4 Department of Microbiology, Tumor and Cell Biology, Karolinska Institute, Stockholm, Sweden
5 Respiratory Viruses Unit, Virology Section, Microbiology Department, Hospital Universitari Vall d’Hebron, Universitat Autónoma de Barcelona, Vall Hebron Research Institute, Barcelona, Spain
6 Clinical Bacteriology and Mycology, University Hospital Basel, Basel, Switzerland
7 Applied Microbiology Research, Department of Biomedicine, University of Basel, Basel, Switzerland
8 Department of Clinical Microbiology, Hospital Universitario Nuestra Señora de Candelaria, Tenerife, Spain
9 University of Groningen, University Medical Center Groningen, Department of Medical Microbiology, Division of Clinical Virology, Groningen, The Netherlands
10 Unit of Molecular Microbiology, Medical Microbiology, Department of Laboratory Medicine, AZ Sint-Jan Brugge AV, Bruges, Belgium
11 KU Leuven, Rega Institute, Department of Microbiology, Immunology and Transplantation, Laboratory of Clinical & Epidemiological Virology, Leuven, Belgium

Background Worldwide outbreaks of enterovirus D68 (EV-D68) in 2014 and 2016 have caused serious respiratory and neurological disease.

Methods We collected samples from several European countries during the 2018 outbreak and determined 53 near full-length genome (‘whole genome’) sequences. These sequences were combined with 718 whole genome and 1,987 VP1-gene publicly available sequences.

Findings In 2018, circulating strains clustered into multiple subgroups in the B3 and A2 subclades, with different phylogenetic origins. Clusters in subclade B3 emerged from strains circulating primarily in the US and Europe in 2016, though some had deeper roots linking to Asian strains, while clusters in A2 traced back to strains detected in East Asia in 2015-2016. In 2018, all sequences from the USA formed a distinct subgroup, containing only three non-US samples. Alongside the varied origins of seasonal strains, we found that diversification of these variants begins up to 18 months prior to the first diagnostic detection during an EV-D68 season. EV-D68 displays strong signs of continuous antigenic evolution and all 2018 A2 strains had novel patterns in the putative neutralizing epitopes in the BC- and DE-loops. The pattern in the BC-loop of the USA B3 subgroup had not been detected on that continent before. Patients with EV-D68 in subclade A2 were significantly older than patients with a B3 subclade virus. In contrast to other subclades, the age distribution of A2 is distinctly bimodal and was found primarily among children and in the elderly.

Interpretation We hypothesize that EV-D68’s rapid evolution of surface proteins, extensive diversity, and high rate of geographic mixing could be explained by substantial reinfection of adults.

Funding University of Basel and Swedish Foundation for Research and Development in Medical Microbiology

INTRODUCTION

Enterovirus D68 (EV-D68) has caused worldwide outbreaks of serious respiratory and neurological disease in 2014 and thereafter. In particular, EV-D68 infection has been associated with acute flaccid myelitis (AFM) (Hixon et al., 2019). EV-D68 was first isolated in 1962 (Schieble et al., 1967) but rarely reported until the recent outbreaks (Centers for Disease Control and Prevention (CDC), 2011; Holm-Hansen et al., 2016; Khetsuriani et al., 2006; Rahamat-Langendoen et al., 2011). Yet, an almost ubiquitous presence of specific neutralizing antibodies indicates that infection with the virus has been very common before the recent outbreaks (Vogt and Crowe Jr, 2018).

∗ Correspondence to: Dr. Emma Hodcroft, Biozentrum, Klingelbergstrasse 70, 4056, Basel, Switzerland. [email protected]

In many countries, particularly in Europe and North America, EV-D68 has exhibited a biennial pattern, with peaks in the late summers and autumns of even-numbered years (2014, 2016, 2018) (Khetsuriani et al., 2006; Kramer et al., 2018; Messacar et al., 2019; Poelman et al., 2015b; Pons-Salort et al., 2018; Shen et al., 2019; Uprety et al., 2019). Other countries have reported odd-year outbreaks, such as Thailand in 2009-2011 (Linsuwanon et al., 2012), Australia in 2011 and 2013 (peaking in Southern Hemisphere winter to spring) (Levy et al., 2015), and Japan in 2015 (Funakoshi et al., 2019). In 2018, circulation was reported from Europe in Wales (Cottrell et al., 2018), Italy (Pellegrinelli et al., 2019), France (Bal et al., 2019), and in the US (Kujawski et al., 2019; Messacar et al., 2019).

bioRxiv preprint, doi: https://doi.org/10.1101/2020.01.10.901553; this version posted January 18, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.

Phylogenetic analysis of EV-D68 sequences has revealed extensive diversity (Tokarz et al., 2012). This diversity is grouped into the major clades A, B, and C, with a most recent common ancestor (MRCA) in the mid-1990s. Clades A and B are further divided into subclades A1, A2, B1, B2, and B3. Most previous phylogenetic analyses have focused on the capsid protein VP1 sequence (Du et al., 2015; Tokarz et al., 2012). VP1 is one of the more variable proteins and contains important receptor binding sites and putative neutralizing epitopes, such as the hypervariable BC- and DE-loops. Different subclades differ at several positions in these loops, and previous studies show that they have an elevated rate of amino acid substitutions (Dyrdak et al., 2019) and appear to be under positive selection (Du et al., 2015; Dyrdak et al., 2019; Imamura et al., 2014; Lau et al., 2016; Linsuwanon et al., 2012).

The great majority of reported EV-D68 cases are pediatric (Holm-Hansen et al., 2016), and sero-positivity to EV-D68 increases rapidly during childhood, reaching ubiquity in adults (Harrison et al., 2019; Karelehto et al., 2019; Sun et al., 2018; Xiang et al., 2017). Relatively little is known about immunity to EV-D68, but infection appears to elicit long-lasting strain-specific immunity, as evidenced by the high prevalence among adults of neutralising antibodies to the prototype Fermon strain from 1962 (Harrison et al., 2019; Karelehto et al., 2019; Smura et al., 2010; Xiang et al., 2017). While some studies indicate strain-specific differences in neutralizing titers in different age-groups, the degree to which EV-D68 evolution is driven by immune escape or whether incidence patterns are determined by pre-existing immunity is unclear.

Here, we report sequences from samples collected across Europe in late-summer and autumn 2018 and combine them with publicly available data to comprehensively investigate the global phylodynamics and phylogeography of EV-D68. These data and phylogenies are available on the Nextstrain platform at nextstrain.org/enterovirus, which provides a comprehensive phylogenetic analysis pipeline, a visualization tool and a public web-interface (Hadfield et al., 2018).

MATERIALS & METHODS

The study aimed to characterize the genetic diversity and geographic distribution of EV-D68 variants in Europe during the 2018 season and to compare these variants with EV-D68 variants collected earlier across the world. Respiratory samples (n=55) that had tested positive for EV-D68 were obtained from six clinical virology laboratories (Stockholm, Sweden; Groningen, the Netherlands; Leuven, Belgium; Barcelona, Spain; Basel, Switzerland; and Tenerife, Canary Islands, Spain). For a more detailed description of screening, isolation, and ethical approvals, see supplementary methods I.A.

Samples were collected from 29 Aug to 28 Nov 2018 (mean 4 Oct, median 1 Oct, interquartile range 22 Sept to 15 Oct, see Supp. Fig. 1). Thirty-two of the 53 patients were male (60.4%). The median age was 3.0 years (mean 20.8 years, range 1 month to 94 years), with a majority of children being five years or younger (34 of 36 children), and a majority of adults being older than 50 years (13 of 17 adults). Thirty-nine of 43 patients with clinical information presented with respiratory illness as the main symptom. Of 38 patients with admission status information, 32 were hospitalized (5 admitted to an intensive care unit (ICU)) and 6 were out-patients. There were no records of AFM among the study patients. Detailed information about the samples is available in Supp. Table I.

Near full-length genomes (NFLG) were obtained by sequencing four overlapping fragments on the Illumina next-generation sequencing (NGS) platform as described previously (Dyrdak et al., 2019) (53 of 55 samples were successfully sequenced and included in the further analyses). Sequences were aligned with and annotated according to the 1962 Fermon strain (GenBank accession AY426531). See supplementary methods I.B for details.

We combined the 53 new EV-D68 genomes with sequences available from GenBank and consolidated available metadata. This included 718 genome sequences with length ≥ 6,000 bp and 1,987 sequences covering ≥ 700 bp of the VP1 region of the genome (see supplementary methods I.C). We analyzed the combined sequence data set with the Nextstrain pipeline (Hadfield et al., 2018). The detailed analysis workflow is described in supplementary methods I.E.

Role of the funding source

The funding sources had no involvement in any part of this study. The corresponding author had full access to all the data in the study and had the final responsibility for the decision to submit for publication.

RESULTS

Interactive near real-time phylogenetic analysis with Nextstrain

To enable global genomic surveillance of EV-D68, we implemented an automated phylogenetic analysis pipeline using Nextstrain which generates an interactive


[Figure 1: time-scaled phylogeny (time axis 1990-2020) with legends for clade (A, A1, A2, B, B1, B2, B3, C), region (Japan Korea, North America, Oceania, Europe, China, Southeast Asia, South Asia, Africa), and country (Italy, UK, Switzerland, Belgium, Netherlands, Sweden, Spain, France, Senegal, USA, Taiwan, China); insets I, II, and III cover 2015-2019 with the 2014 and 2016 seasons marked.]

FIG. 1 Time-scaled phylogeny of Enterovirus D68. A time-scaled phylogeny of VP1 segments, colored by region, is shown top-left. Key clusters (I, II, III) from the 2018 season have been highlighted on the top-right, colored by country. To display these clusters at a high resolution, the whole genome phylogenies are used. The 2014 and 2016 EV-D68 seasons are shown in orange and red boxes. Below, a map shows the distribution of subclades by region from 2014-2018.

visualization integrating a phylogeny with sample metadata such as geographic location or host age. This analysis is available at nextstrain.org/enterovirus. It was updated whenever new data became available, and we intend to keep it up to date going forward. This rolling analysis revealed a dynamic picture of diverse clades of EV-D68 circulating globally.

Fig. 1 shows a time-scaled phylogenetic tree of all available VP1 sequences (≥ 700 bp) collected since 1990. Since 2014, the global circulation of EV-D68 has been dominated by viruses from the B1 and B3 subclades, with A1 and A2 accounting for about 5 to 30% of viruses sampled in China, Europe, and Africa (see bottom of Fig. 1 and Fig. 3B). Both our 53 new sequences and other available EV-D68 variants from 2018 grouped into the B3 and A2 subclades.

Multiple independent origins of the 2018 season

As shown in Fig. 1, VP1 sequences collected in 2018 fell into multiple distinct subgroups with MRCAs in 2016 or more recently. Six of these subgroups and one singleton fell into subclade B3, while another four subgroups were part of subclade A2. The largest B3 subgroup (zoomed at the top of Fig. 1) consisted of 80 samples from the USA and 3 Belgian samples, while most remaining subgroups were smaller and dominated by European samples.

The B3 subgroups from 2018 mostly originate from within the EV-D68 sequence diversity that circulated during the 2016 outbreak. This is expected as the B3 subclade was well-sampled in Europe, North America, China, and Japan during 2015-2016. The 2018 subgroups did not necessarily have ancestors in 2016 from the same geographic regions. Instead, some subgroups sampled in


Europe in 2018 had their closest relatives in 2016 in Asia, and vice versa. All 2018 A2 subclade samples traced back to strains circulating in China and Taiwan in 2015-2016. These observations suggest relatively rapid global mixing of EV-D68.

Undetected EV-D68 diversification

Our phylogenetic analysis suggests that many of the 2018 subgroups started to diversify as early as mid-2017 but were not sampled until about one year later. Both of the large, predominantly European 2018 subgroups have estimated MRCAs in the middle to end of 2017, with extensive diversification until the first samples were collected in August of 2018. The large American 2018 subgroup began diversifying in early 2017. Though the ancestors of these subgroups must have been circulating, there are no corresponding samples. This pattern is not unique to the B3 subclade or to 2018. Similar patterns of ‘hidden’ diversification can be seen in the two large 2018 clusters in the A2 subclade, and in the 2014 and 2016 outbreaks, with the ancestors of samples taken during these outbreaks beginning to diversify up to a year beforehand.

We quantified this diversification over time in Fig. 2B, where the number of lineages which lead to the samples taken during each season is plotted over time (Nee et al., 1995) (see supplementary methods I.F). Despite a variable number of lineages present at the end of each outbreak (363 for 2014, 119 for 2016, and 122 for 2018), each season traces back to only 10-18 lineages two years prior, reflecting a marked diversification which began about 18 months before the first sequenced samples of each season were collected. Four years prior to the outbreak year the number of lineages dropped further, to 2-7, representing deep splits in the tree.
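The lineage counts described above can be illustrated with a minimal sketch. It assumes that a time-scaled tree is available as a list of branches, each given by the decimal-year dates of its parent and child nodes. The `Branch` layout, the function name `lineages_at`, and the toy tree with its dates are hypothetical illustrations, not the workflow or data used in the study.

```python
from typing import List, Tuple

# A branch is (parent_date, child_date) in decimal years (hypothetical layout).
Branch = Tuple[float, float]

def lineages_at(branches: List[Branch], t: float) -> int:
    """Count the branches that cross time t, i.e. the lineages alive at t
    that are ancestral to at least one tip of the tree."""
    return sum(1 for parent, child in branches if parent < t <= child)

# Toy tree with invented dates: the root in 2014.0 splits into two lineages,
# each of which splits again during 2017 and is sampled in late 2018.
toy_tree = [
    (2014.0, 2017.2), (2014.0, 2017.6),  # two deep lineages
    (2017.2, 2018.7), (2017.2, 2018.8),  # tips descending from the first
    (2017.6, 2018.7), (2017.6, 2018.9),  # tips descending from the second
]

counts = {t: lineages_at(toy_tree, t) for t in (2015.0, 2017.4, 2018.5)}
print(counts)  # {2015.0: 2, 2017.4: 3, 2018.5: 4}
```

Reading the counts backwards in time gives the shape plotted in a lineages-through-time curve: many lineages close to the sampling period, collapsing to a few deep lineages several years earlier.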

When the change in number of lineages is plotted alongside sampling times (Fig. 2C), peak diversification (black) appears before peak sampling periods (purple). Both plots illustrate that the majority of diversification occurs prior to when samples are taken (i.e. ‘hidden’ diversification), and slows down during the outbreak. This pattern is consistent with the expectation that lineages split in expanding populations and little coalescence is observed when populations are large (Kingman, 1982). In the 2014 and 2018 seasons, the majority of diversification immediately preceded the sampling period, evident in the steep slopes in the first half of the outbreak year in Fig. 2B, and in the sharp peaks during the outbreak year in Fig. 2C. In the 2016 outbreak, however, the diversification appears to have been slower, shown by the shallower but more stable slope in Fig. 2B and the broad peak of lineage change in Fig. 2C. These patterns are recapitulated in the inferred pattern of the per-lineage coalescence rate through time (‘skyline,’ see Supp. Fig. 3).

The well-sampled seasons in 2014, 2016, and 2018 show a common pattern: diversity within an outbreak traces back to 10-18 lineages 2 years prior and 2-7 lineages 4 years before the outbreak.

Overall, our results suggest that EV-D68 diversity is maintained through multiple outbreak years by significant unreported year-round circulation.

Migration of EV-D68 between countries and continents

The relationship between 2016 and 2018 viruses suggested that EV-D68 transmits and migrates rapidly enough that viruses sampled in one continent in 2018 often have ancestors detected in 2016 on different continents. To quantify the geographical mixing we investigated the viral migrations between countries and continents during the comparatively well-sampled outbreaks of 2014, 2016 and 2018 (see supplementary methods I.F).

Of the 10 outbreak clusters in 2018, all but two were predominantly European. Within these clusters, strains from different European countries were thoroughly intermixed, with little sign of within-country clustering (see insets in Fig. 1). Consistent with this qualitative observation, the maximum likelihood estimate of the overall migration rate between European countries in 2018 was approximately 2/year. Repeating the same analysis on the 2014 and 2016 outbreaks yielded about 2- to 3-fold lower estimates, possibly due to less representative sampling in these years. Discrete trait analyses make a number of assumptions and the migration rate estimates should be viewed as an empirical summary of transitions observed on the tree rather than unbiased quantitative estimates.

At the level of broader geographic regions, mixing was less rapid. This is qualitatively evident from the clustering of regions in the phylogeny in Fig. 1. The maximum likelihood migration rate between Europe, North America, and China estimated using a discrete state model fitted to years 2014-2018 was found to be 0.24/year. This means that any given lineage has a chance of about 1 in 4-5 to switch regions in one year. After the two-year inter-outbreak interval, about half the viral lineages would be expected to have spread to other geographic regions. Notably, though samples from the 2014 and 2016 outbreaks showed considerable mixing between European and North American sequences (Dyrdak et al., 2019), all 77 samples from the US in 2018 formed one distinct subgroup, containing only three samples from elsewhere (Belgium). Since the subgroup dominating the US in 2018 harbors substantial diversity and since both the US and Europe have been thoroughly sampled, these divergent patterns are difficult to explain with sampling biases alone.
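The step from a rate of 0.24/year to a per-year chance of switching regions can be sketched as follows. The sketch assumes, as a simplification not stated in the text, that region switches along a lineage arrive as a Poisson process with the reported rate as the total rate of leaving the current region, and it ignores lineages that later migrate back. The function name is a hypothetical helper.

```python
import math

def prob_at_least_one_switch(rate_per_year: float, years: float) -> float:
    """P(at least one region switch within `years`), assuming switches
    arrive as a Poisson process with a constant rate (an assumption)."""
    return 1.0 - math.exp(-rate_per_year * years)

rate = 0.24  # inter-regional migration rate per year, as reported in the text
p_one_year = prob_at_least_one_switch(rate, 1.0)
p_two_years = prob_at_least_one_switch(rate, 2.0)
print(f"1 year: {p_one_year:.2f}, 2 years: {p_two_years:.2f}")
# prints "1 year: 0.21, 2 years: 0.38"
```

Under these assumptions the one-year probability of about 0.21 corresponds to the chance of roughly 1 in 4-5 quoted above. The two-year value of about 0.38 is of the same order as, though somewhat below, the "about half" in the text; the exact figure depends on how the rate is defined in the discrete state model.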


[Figure 2: (A) zoomed view of top of B3 subclade, 2014-2019, colored by region (Japan Korea, North America, Oceania, Europe, China); (B) number of lineages remaining (log scale) against years before end of epidemic year (2019, 2017, 2015) for the 2014-5, 2016-7, and 2018-9 seasons; (C) percent of mean total lineages coalescing per month and mean number of samples per month, 2014-2019.]

FIG. 2 Persistence and diversification of EV-D68 since 2014. Multiple EV-D68 lineages persist from one biennial outbreak to another. (A) Zoomed in view of the whole genome EV-D68 tree (for better time-resolution), showing 2018 subgroups in the B3 subclade. (B) Lineages which lead to the samples in the 2014, 2016, and 2018 seasons diversify over time. The dotted black lines show the IQR of all samples taken during 2014, 2016, and 2018. (C) The change in number of lineages (as % of total lineages) per month for each season (left y-axis) and the mean number of samples per month (right y-axis).

[Figure 3: (A) age density (ages 0-100) by subclade (A1, A2, B1, B2, B3, C); (B) percent of samples by age group (<1y, 1-5y, 6-17y, 18-64y, >=65y, unknown) and by clade (A1, A2, B1, B2, B3) for 2014, 2016, and 2018.]

FIG. 3 Age distribution of EV-D68 samples. (A) The cumulative distribution over age separated by subclade; A2 being significantly more often detected in older persons. (B) The age of patients and subclade of their sample, for samples taken during 2014, 2016, and 2018 for which ‘age range 1’ was available.

Subclade A2 is over-represented in elderly

We noted that many samples in the A2 subclade were obtained from adults and the elderly, in contrast to other subclades. Similar observations have been made in some previous studies (Bal et al., 2019; Böttcher et al., 2016; Lau et al., 2016; Schuffenecker et al., 2016). Therefore, we investigated the age distribution in different subclades using a comprehensive data set of 743 samples with exact (non-range) age data and unambiguous subclade designation (see supplementary methods I.D). Almost all sequences assigned to B or C clades were sampled from children, while subclade A2 viruses showed a bimodal age distribution (Fig. 3A), with 40% of A2 samples from children and 39% from the elderly (>60 years old). Subclade A1 was sampled predominantly in children with a minority of sequences sampled from individuals between 20 and 60 years of age. The distribution of age categories in each subclade is shown in Supp. Fig. 4.

These associations are confirmed by statistical analysis. Both the A1 and A2 subclades were associated with significantly older ages compared to the B3 subclade (linear model, p=0.001 and p<0.001, respectively), whereas individuals infected with B2, B1, and C subclades did not have significantly different ages (Table I). While these p-values might be inflated due to regional differences in sampling different age groups, the difference in age distribution between the A2 and B subclades was robust to removal of countries by bootstrap analysis (Supp. Fig. 2). In addition to the age differences between the A and B subclades, the age distributions of the A1 and A2 subclades also differed significantly (p < 0.001, Kolmogorov-Smirnov test).

The higher prevalence of the A2 subclade in 2018 as compared to previous outbreaks results in a significant skew towards older age groups in 2018 compared to 2014 and 2016 (linear model, p=0.005 and p=0.001, respectively) (Fig. 3B).

Antigenic evolution of EV-D68

Previous molecular analyses of EV-D68 showed the BC- and DE-loops in the VP1 gene have elevated rates of amino acid substitutions (Dyrdak et al., 2019) and indications of positive selection (Du et al., 2015; Imamura et al., 2014; Lau et al., 2016; Linsuwanon et al., 2012). We examined the sequences of these putative neutralizing antibody targets (Liu et al., 2015a) over time, comparing the sequences of the loops at the root of the tree in 1990, at the base of the A and B clades, and from several sequences sampled in 2018 (Fig. 4B). Both loops have changed their sequence multiple times in each of these comparisons.

Still, even outside of these loops the surface of the virus has changed rapidly, suggesting further antigenic evolution. Fig. 4A shows the 5-fold symmetric arrangement of the crystal structure of the virus capsid protomer consisting of VP1, VP2, VP3, and VP4 (Liu et al., 2015b). Much of the virion surface is variable (panel A, subunit 5), and almost all variable residues in VP1 and VP2 are in surface-exposed patches of the genome (panel C). A particularly dense cluster of variable positions is formed by the C-terminus of VP1 and central region of VP2. The C-terminus of VP1 was identified by Mishra et al. (2019) as a potential EV-D68 specific epitope.

The epitope turnover dynamics are more readily apparent when integrated with the phylogenetic tree, and trees colored by the epitope patterns found in the BC- and DE-loops reveal signs of rapid and putatively antigenic evolution in EV-D68 (Fig. 4D & E). The BC-loop in the A2 subclade, for example, has changed 3 times since 2010. All 2018 A2 samples have novel epitope patterns in the BC-loop (the two small grey ‘other’ clusters also have distinct novel patterns). The predominantly American B3 2018 subgroup also has novel BC-loop patterns, mostly pattern ‘DTTQTF’. A total of 144 of 197 samples from 2018 have BC-loop sequences different from those observed in 2016 or earlier. The DE-loop has undergone less rapid change in the recent past, but all major clades differ in their DE-loop sequence. The predominant epitope pattern in the mostly-American B3 subgroup (‘NGSNNNTTYV’) was present in 25 European sequences in 2016, but does not appear to have previously been seen in the USA. Notably, the B1 and B3 variants dominating the large outbreaks in 2014 and 2016 show little variation in either the BC- or DE-loops. Supp. Fig. 5 plots the mean number of mutations in the BC- and DE-loops acquired over time and shows that different subclades have acquired mutations at different rates. The patterns of molecular evolution revealed here are compatible with the notion of rapid antigenic evolution with rapid turnover of epitopes and frequent parallel changes in different clades, as observed for seasonal influenza viruses (Petrova and Russell, 2018).
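The notion of a "novel" epitope pattern used above can be made concrete with a small sketch: a pattern counts as novel for a season if it occurs in that season but in no sample from the cutoff year or earlier. The record layout, the function name `novel_patterns`, and the toy records are hypothetical; only the pattern string ‘DTTQTF’ and the counts 144 and 197 come from the text.

```python
from typing import Iterable, Set, Tuple

# (sampling year, BC-loop pattern); a hypothetical record layout.
Record = Tuple[int, str]

def novel_patterns(records: Iterable[Record], season: int, cutoff: int) -> Set[str]:
    """Patterns seen in `season` that were never observed in `cutoff` or earlier."""
    records = list(records)
    seen_before = {pattern for year, pattern in records if year <= cutoff}
    in_season = {pattern for year, pattern in records if year == season}
    return in_season - seen_before

# Toy records: 'AAAAAA' and 'BBBBBB' are invented placeholder patterns.
toy_records = [
    (2014, "AAAAAA"), (2016, "AAAAAA"), (2016, "BBBBBB"),
    (2018, "BBBBBB"), (2018, "DTTQTF"),
]
print(novel_patterns(toy_records, season=2018, cutoff=2016))  # {'DTTQTF'}

# Share of 2018 samples with a novel BC-loop sequence, from the counts in the text.
print(f"{144 / 197:.1%}")  # 73.1%
```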

TABLE I Age and Clade/Subclade

(Sub)Clade       Number of Samples   Mean Age   Differs from Intercept
B3 (Intercept)   371                 7.33
B2               59                  6.54       NS (p = 0.74)
B1               73                  6.64       NS (p = 0.75)
A2               79                  38.13      *** (p < 2 × 10^-16)
A1               88                  13.66      ** (p = 0.002)
C                75                  4.9        NS (p = 0.25)

Linear model of subclade on age, showing the ages of patients in subclades A2 and A1 are significantly different from those in subclade B3. Subclades B2, B1, and C do not differ significantly from B3. Number of samples refers to the number of samples within a subclade for which exact age data was available.
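As a reading aid for Table I: in a linear model with subclade as the only (categorical) predictor and B3 as the reference level, the intercept equals the mean age in B3 and each coefficient equals the difference between a subclade's mean age and the B3 mean. The sketch below only redoes this arithmetic from the tabulated means; it does not refit the model, and the function and variable names are illustrative.

```python
# Mean ages per (sub)clade as reported in Table I.
mean_age = {"B3": 7.33, "B2": 6.54, "B1": 6.64, "A2": 38.13, "A1": 13.66, "C": 4.9}

REFERENCE = "B3"  # the intercept level of the linear model


def offsets_from_reference(means, reference):
    """Difference between each group's mean and the reference mean.

    With a single categorical predictor and treatment coding, these
    differences are the coefficients a linear model would report.
    """
    base = means[reference]
    return {g: round(m - base, 2) for g, m in means.items() if g != reference}


offsets = offsets_from_reference(mean_age, REFERENCE)
for clade, diff in sorted(offsets.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{clade}: {diff:+.2f} years relative to {REFERENCE}")
```

Read this way, the A2 row corresponds to patients who are on average about 31 years older than B3 patients, while the B1, B2, and C offsets are all within a few years of zero.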

DISCUSSION

Through a European multi-center collaboration and use of publicly available sequences and metadata in GenBank, we have shown that the circulation of EV-D68 in 2018 was not a single outbreak but consisted of smaller clusters of diverse viruses from the A2 and B3 subclades. We demonstrated a robust association of A2 viruses with infections in the elderly and provide evidence for rapid antigenic evolution of the EV-D68 surface proteins.

Recent EV-D68 circulation. We compared genetic diversity of viruses circulating in 2018 with those from the major US/European outbreaks in 2014 and 2016. In each of these seasons, sampled diversity traced back to 10-18 lineages at the time of the previous season two years earlier. Four to five years prior to a season, the number of ancestral lineages is reduced to 2-7. This suggests that EV-D68 has undergone waves of diversification about 1 year prior to the recognised outbreaks and that coalescence occurred on a 2-4 year time scale. Similar patterns can be observed for seasons prior to 2014, but the scarce sequence availability makes quantification of these patterns less reliable.

[Figure 4, panels A-E. Panel A: capsid structure with subunits 1-5, showing VP1, VP2, and VP3, substitutions in VP1-4, the C-terminal epitope, and the partial BC- and DE-loops. Panel C: variability versus position along the polyprotein (VP4, VP2, VP3, VP1). Panels D and E: VP1 trees over 1985-2015 colored by BC-loop and DE-loop pattern, respectively.]

Panel B:
         BC-Loop          DE-Loop
Fermon   ..A..S.G......   ..NND....
NY93     NHTSSDARTHKNFF   NGSSNSTYM
A        .....E...D....   -........
A2       .....E..VD...Y   -..N.....
A1       ..A..E.Q.D....   -S......T
B        D....A.Q.D....   .....N...
B3-US    D....T.Q.D....   ...N.N..V
B3-EU    D....A.QAD....   .....N..V

FIG. 4 Molecular evolution of the EV-D68 capsid. Panel A) Rendering of 5-fold symmetric arrangement of the crystal structure of the capsid protomer consisting of VP1, VP2, VP3 and VP4 (Liu et al., 2015b), using different copies of the protomer to highlight different aspects of the capsid organization, evolution, and immunogenicity. The five subunits are labelled 1-5 (circled numbers). Subunits 1 and 4 are present in dark grey to complete the structure. Subunit 2 shows the surface exposed proteins in purple (VP1), sand (VP2), and green (VP3), subunit 3 displays the different putative epitopes (Mishra et al., 2019) (note that the most variable parts of the BC- and DE-loops are not shown, as their structure is not resolved), and subunit 5 highlights variable positions in red, with the different proteins forming the surface indicated by pale colors. Panel B) The hypervariable BC- and DE-loops (partly missing from the structure in Panel A) in clusters and subclusters accumulated multiple changes since the root of the tree (the inferred root sequence patterns match those of the NY93 strain shown here). (BC-loop is AA positions 90-103; DE-loop is AA positions 140-148). Patterns for the C-terminus and a region of VP2 are shown in Supplementary Table II. Panel C) Variable positions often coincide with surface exposed parts of the protein. Genes are highlighted by blue (VP4), orange (VP2), green (VP3), and purple (VP1). Grey boxes show, from left to right, the BC- and DE-loops, and C-terminus regions. Panels D&E) Phylogenetic VP1 trees are colored by the most common epitope patterns of the 6 and 8 most variable amino-acid positions in the BC- and DE-loops, respectively. Particularly noteworthy is the rapid turnover of BC-loop variants in the recent evolution of the A2 subclade. Patterns for the C-terminus region are shown in Supp. Fig. 6.
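The loop patterns in panel B of Fig. 4 are written relative to the NY93 row. The sketch below shows how such a dotted pattern can be expanded into a full residue string and encoded again, assuming (as is conventional, but not stated explicitly in the caption) that a dot denotes identity with NY93. The helper names are hypothetical, and the B3-US BC-loop row from panel B serves as the example.

```python
NY93_BC_LOOP = "NHTSSDARTHKNFF"  # VP1 amino-acid positions 90-103 (Fig. 4B)


def to_dot_notation(seq, ref):
    """Keep only residues that differ from the reference; '.' marks identity."""
    if len(seq) != len(ref):
        raise ValueError("sequence and reference must have equal length")
    return "".join("." if s == r else s for s, r in zip(seq, ref))


def from_dot_notation(pattern, ref):
    """Expand a dotted pattern back into the full residue string."""
    if len(pattern) != len(ref):
        raise ValueError("pattern and reference must have equal length")
    return "".join(r if p == "." else p for p, r in zip(pattern, ref))


b3_us_pattern = "D....T.Q.D...."  # B3-US row of Fig. 4B
b3_us_full = from_dot_notation(b3_us_pattern, NY93_BC_LOOP)
n_changes = sum(ch != "." for ch in b3_us_pattern)

print(b3_us_full)
print(f"{n_changes} of {len(NY93_BC_LOOP)} BC-loop positions differ from NY93")
```

Counting the non-dot characters in each row gives a quick measure of how far a clade's loop has moved from the inferred root pattern.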

Geographic mixing. All major subgroups circulating in Europe in 2018 were sampled in multiple countries, despite the fact that the inferred TMRCA of most subgroups was only about one year prior to sampling. Consistent with this, we inferred rapid migration of lineages between European countries at rates of about 2/year. While the great majority of sequences in this dataset were obtained from respiratory samples, Majumdar et al. (2019) reported 27 EV-D68 sequences from wastewater samples from the UK (accession numbers MN018235-MN018261). These sequences were well mixed with the European respiratory samples, indicating that the viruses detected in sampling in medical settings fairly accurately reflect the circulating strains.

At the level of continents or major geographic regions, migration was slower and accurate quantification difficult. Discrete trait analysis suggested migration rates between continents to be 5-10 fold lower than between European countries. While we observed rapid mixing between sequences sampled in the US and Sweden during the 2016 outbreak (Dyrdak et al., 2019), sequences sampled in the US in 2018 formed a largely US-specific cluster.
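To give these rates some intuition, migration of a lineage can be treated as a memoryless (Poisson) process, which is the usual assumption behind discrete trait models. The sketch below is a back-of-the-envelope calculation under that assumption, reading "about 2/year" as the total rate at which a lineage leaves its current country. It is not the inference procedure used in this study, and the function names are illustrative.

```python
import math


def prob_no_migration(rate_per_year, years):
    """Probability that a lineage has not migrated after `years` (Poisson process)."""
    return math.exp(-rate_per_year * years)


def mean_waiting_time(rate_per_year):
    """Expected time until the next migration event, in years."""
    return 1.0 / rate_per_year


between_countries = 2.0  # per year, as inferred between European countries
# Rates between continents were estimated to be 5-10 fold lower.
between_continents = [between_countries / fold for fold in (5, 10)]

print(f"Europe: mean wait {mean_waiting_time(between_countries):.1f} y, "
      f"P(no move in 1 y) = {prob_no_migration(between_countries, 1.0):.2f}")
for rate in between_continents:
    print(f"Continents, rate {rate:.1f}/y: mean wait {mean_waiting_time(rate):.1f} y, "
          f"P(no move in 1 y) = {prob_no_migration(rate, 1.0):.2f}")
```

Under these assumptions a lineage stays in one European country for about half a year on average, compared with roughly 2.5 to 5 years on one continent.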

Despite the (currently) monophyletic origin of 2018 EV-D68 in the US, EV-D68 strains are remarkably well-mixed given that the majority of cases were in young children.

Age distributions. We undertook a thorough literature review and reached out to many authors to obtain information on the age of the sampled individuals. In total, we were able to assign ages to 743 sampled patients which allowed us to investigate age distributions with much higher accuracy than previous studies. We found that subclade A2 was sampled significantly more often in the elderly than other clades and subclades, which is consistent with results from previous smaller studies (Bal et al., 2019; Bottcher et al., 2016; Lau et al., 2016; Schuffenecker et al., 2016). More specifically, we found that A2 samples had a pronounced bimodal age distribution (Fig. 3A) with about 40% of the samples from children and 40% from the elderly above 60 years. Notably, the A1 subclade was also moderately over-represented in adults, but not specifically in the elderly.

The fact that almost all samples from the elderly population fell into the A2 subclade could have several reasons. (i) Individuals in this age group could be more prone to infection by A2 subclade viruses, possibly due to lower subclade-specific immunity. This possibility is supported by Harrison et al. (2019) who showed lower prevalence of neutralizing antibodies among adults to the A2 subclade than to B1, B2 and Fermon. (ii) Alternatively, A2 viruses could be more virulent/symptomatic in the elderly (irrespective of pre-existing immunity), but there are no data directly supporting this possibility.

Evidence for continuous antigenic evolution. Given that most EV-D68 cases are pediatric (Holm-Hansen et al., 2016) and that almost everyone has high EV-D68 titers by the age of ten (Harrison et al., 2019; Karelehto et al., 2019; Sun et al., 2018; Xiang et al., 2017), one might expect that selection for antigenic change is of minor importance for the molecular evolution of EV-D68. However, we found rapid evolution in putative neutralizing epitopes, including the previously studied BC- and DE-loops (Du et al., 2015; Dyrdak et al., 2019). In particular, the C-terminus of VP1 and a surface exposed patch in VP2 are accumulating amino acid substitutions at a high rate with substantial parallelism between clades. The C-terminus of VP1 was recently described as an EV-D68 specific immunoreactive epitope (Mishra et al., 2019). In the 3D-structure, the C-terminus of VP1 and VP2 form a contiguous ridge, which is a plausible target for antibody binding that displays substantial differences between the A2 and B3 subclades.

Antigenic evolution is further supported by serology. Harrison et al. (2019) used a panel of sera collected in 2012 (prior to the large clade B outbreak in 2014) and measured neutralizing antibody titers against the prototypic Fermon strain from 1962 and against viruses from subclades B1, B2, and A2 sampled in 2014. Individuals older than 50 (i.e. born before 1962) have uniformly high titers against the Fermon strain (median log2 titer 10.5), while children below the age of 15 have low titers (log2 titer 6). In contrast, titer distributions against representatives from subclades B1 and B2 were similar for individuals above the age of 5 (median between 7.8 and 9.5). Young children below the age of 5, however, had markedly lower titers (median 5.8 (B1) and 3.2 (B2)), suggesting that a substantial fraction of those under 5 years old are still naive. Pregnant women tested from 1983 to 2002 showed continuously high EV-D68 seroprevalence to Fermon, but on a population level titers declined over the study period (Smura et al., 2010). Taken together, this implies that individuals can retain high titers against a strain they were exposed to decades earlier (as shown by the titers of those over the age of 50 against Fermon, which has not circulated for many years), but that infection with recent EV-D68 strains does not elicit antibodies against Fermon; that is, the virus has evolved antigenically. Consistent with this, Imamura et al. (2014) showed that antisera raised in guinea pigs against viruses from subclades A2, B2, and C did not neutralize Fermon.

Antigenic evolution could be driven by substantial re-infection of adults. While EV-D68 has historically been primarily diagnosed in children, unrecognised (re-)infection of adults could explain two striking features of EV-D68 diversity and evolution: (i) consistent patterns of molecular evolution suggesting rapid immune-driven antigenic evolution (as observed in influenza A/H3N2 (Petrova and Russell, 2018)) and (ii) the rapid geographic mixing of different EV-D68 variants. The speed of antigenic evolution of influenza viruses has been negatively associated with the fraction of infection in children (Bedford et al., 2015), and adults travel much more than children, allowing more migration opportunities.

For pre-existing immunity to be a driving force of viral evolution, reinfection of older, previously exposed individuals has to be sufficiently common. In adults with high antibody titers to EV-D68, post-outbreak titers against the circulating strain increase, implying both that existing immunity is not fully protective and that it can be boosted by subsequent re-exposure (Xiang et al., 2017). Adults are less likely to attend medical care than children and infections in adults may be milder, further reducing the likelihood that they are sampled and explaining why diagnoses and sequences come predominantly from children. Less-symptomatic infections in adults may be due to both physiology (e.g. children have smaller respiratory tracts) and partially protective immunity from previous EV-D68 infection.

Increases in EV-D68 infection in particular groups of adults, as we see in the A2 subclade and individuals over 60, may be due to ‘original antigenic sin,’ whereby immune response is primarily based on the first pathogen encountered. If new EV-D68 strains evolve to be sufficiently distant from the initial EV-D68 strain that infected a particular age cohort, these individuals could be particularly vulnerable to re-infection and/or more severe infections.

While existing serological data are consistent with continuous antigenic evolution and frequent infection of adults, currently available serological studies typically include only a small number of EV-D68 strains (antigens), making systematic comparisons of preexisting immunity to different EV-D68 clades difficult.

Conclusion. Platforms like nextstrain.org allow the near real-time analysis and dissemination of sequence data integrated with relevant metadata, which we hope will be a valuable resource to the community. This has enabled our comprehensive analysis of new and existing EV-D68 data. We reveal a very dynamic picture of EV-D68 circulation, characterized by rapid mixing at the level of countries within Europe, rapid evolution of surface proteins, and divergent and at times bimodal age distributions of different subclades. Combined with published serological data, our findings suggest a substantial contribution of adults in the global dispersion and continuous antigenic evolution of EV-D68.

ACKNOWLEDGEMENTS

We gratefully acknowledge expert sequencing service and advice by Tanja Normark, Cecilia Svensson, Anna Lyander, Emilia Ottosson Laakso and Valtteri Wirta at the Clinical Genomics Unit at Science for Life Laboratory (SciLifeLab), and expert technical assistance by Lina Thebo at the Karolinska Institute. The team at the University Hospital Basel thanks Daniela Lang for excellent technical assistance. We would like to thank Moira Zuber for her work on implementing subclade-specific mapping and Adam Mazur for his assistance in using PyMOL to generate the crystal structure in Figure 4.

We are very grateful to researchers who shared their non-published age data with us: Amary Falls, Institut Pasteur de Dakar, Senegal; Sindy Bottcher & Sabine Diedrich, Robert Koch Institute, Berlin, Germany; Katzumi Mizuta, Yamagata Prefectural Institute of Public Health, Yamagata, Japan; Sofie Midgley, Statens Serum Institut, Copenhagen, Denmark.

Analyses were run using the sciCORE (scicore.unibas.ch) scientific computing core facility at the University of Basel.

Finally, we are grateful to scientists and public health authorities globally for sharing EV-D68 sequences and metadata openly and without delay.

This study was funded by the University of Basel through core funding and a grant from the Swedish Foundation for Research and Development in Medical Microbiology.

AUTHORS’ CONTRIBUTIONS

EBH, RAN, RD, and JA contributed to the study conception and design, interpreted the data, reviewed the relevant literature, and prepared the manuscript. RD and JA coordinated sample collection, selection, transport, and sequencing. EBH collected and curated publicly available data and scraped manuscripts for age data. RD contacted authors of previous publications to obtain further age information, curated study metadata, and conceptualized age and sampling figures. EBH and RAN assembled the sequences and performed the phylogenetic, migration, age, and antigenic evolution analyses. CA, AE, JR, DGMA, JAF, HGMN, AA, RP, MR, and EW coordinated ethics approval, metadata and sample collection, and sample testing, extraction, and shipping from their respective centers. All authors reviewed, edited, and commented on the draft manuscript and contributed to the final version.

DECLARATION OF INTERESTS

We declare no competing interests.


TABLE II 2018 Samples Sequenced

Lab                               Country       Number of Samples
Hubert Niesters et al.            Netherlands   10
Elke Wollants et al.              Belgium       10
Adrian Egli et al.                Switzerland   6
Andres Pagarolas et al.           Spain         14
Diego García Martínez de Artola   Spain         8
Jan Albert et al.                 Sweden        5

This refers to samples included in analyses and may not include samples received that did not achieve sufficient coverage to be included.

REFERENCES

Altschul, S. F., W. Gish, W. Miller, E. W. Myers, and D. J. Lipman (1990), Journal of Molecular Biology 215 (3), 403.

Andres, C., J. Vila, L. Gimferrer, M. Pinana, J. Esperalba, M. G. Codina, M. Barnes, M. C. Martin, F. Fuentes, S. Rubio, P. Alcubilla, C. Rodrigo, T. Pumarola, and A. Anton (2019), J. Clin. Virol. 110, 29.

Bal, A., M. Sabatier, T. Wirth, M. Coste-Burel, M. Lazrek, K. Stefic, K. Brengel-Pesce, F. Morfin, B. Lina, I. Schuffenecker, and L. Josset (2019), Euro Surveill. 24 (3).

Bedford, T., S. Riley, I. G. Barr, S. Broor, M. Chadha, N. J. Cox, R. S. Daniels, C. P. Gunasekaran, A. C. Hurt, A. Kelso, A. Klimov, N. S. Lewis, X. Li, J. W. McCauley, T. Odagiri, V. Potdar, A. Rambaut, Y. Shu, E. Skepner, D. J. Smith, M. A. Suchard, M. Tashiro, D. Wang, X. Xu, P. Lemey, and C. A. Russell (2015), Nature 523 (7559), 217.

Bottcher, S., C. Prifert, B. Weissbrich, O. Adams, S. Aldabbagh, A. M. Eis-Hubinger, and S. Diedrich (2016), Euro Surveill. 21 (19).

Centers for Disease Control and Prevention (CDC), (2011), MMWR. Morbidity and mortality weekly report 60 (38), 1301.

Cottrell, S., C. Moore, M. Perry, E. Hilvers, C. Williams, and A. G. Shankar (2018), Euro Surveill. 23 (46).

Du, J., B. Zheng, W. Zheng, P. Li, J. Kang, J. Hou, R. Markham, K. Zhao, and X.-F. Yu (2015), PloS one 10 (12), e0144208.

Dyrdak, R., M. Grabbe, B. Hammas, J. Ekwall, K. E. Hansson, J. Luthander, P. Naucler, H. Reinius, M. Rotzen-Ostlund, and J. Albert (2016), Eurosurveillance 21 (46).

Dyrdak, R., M. Mastafa, E. B. Hodcroft, R. A. Neher, and J. Albert (2019), Virus Evol 5 (1), vez007.

Funakoshi, Y., K. Ito, S. Morino, K. Kinoshita, Y. Morikawa, T. Kono, Y. H. Doan, H. Shimizu, N. Hanaoka, M. Konagaya, T. Fujimoto, A. Suzuki, T. Chiba, T. Akiba, Y. Tomaru, K. Watanabe, N. Shimizu, and Y. Horikoshi (2019), Pediatr Int 61 (8), 768.

Hadfield, J., C. Megill, S. M. Bell, J. Huddleston, B. Potter, C. Callender, P. Sagulenko, T. Bedford, and R. A. Neher (2018), Bioinformatics 10.1093/bioinformatics/bty407.

Harrison, C. J., W. C. Weldon, B. A. Pahud, M. A. Jackson, M. S. Oberste, and R. Selvarangan (2019), Emerging Infect. Dis. 25 (3), 585.

Hixon, A. M., J. Frost, M. J. Rudy, K. Messacar, P. Clarke, and K. L. Tyler (2019), Viruses 11 (9), 10.3390/v11090821.

Holm-Hansen, C. C., S. E. Midgley, and T. K. Fischer (2016), The Lancet Infectious Diseases 16 (5), e64.

Imamura, T., M. Okamoto, S.-i. Nakakita, A. Suzuki, M. Saito, R. Tamaki, S. Lupisan, C. N. Roy, H. Hiramatsu, K.-e. Sugawara, K. Mizuta, Y. Matsuzaki, Y. Suzuki, and H. Oshitani (2014), Journal of Virology 88 (5), 2374.

Karelehto, E., G. Koen, K. Benschop, F. van der Klis, D. Pajkrt, and K. Wolthers (2019), Euro Surveill 24 (35), 10.2807/1560-7917.Es.2019.24.35.1800671.

Katoh, K., K. Misawa, K.-i. Kuma, and T. Miyata (2002), Nucleic Acids Research 30 (14), 3059.

Khetsuriani, N., A. Lamonte-Fowlkes, S. Oberst, M. A. Pallansch, and Centers for Disease Control and Prevention (2006), Morbidity and Mortality Weekly Report. Surveillance Summaries (Washington, D.C.: 2002) 55 (8), 1.

Kingman, J. F. C. (1982), Stochastic Processes and their Applications 13 (3), 235.

Kramer, R., M. Sabatier, T. Wirth, M. Pichon, B. Lina, I. Schuffenecker, and L. Josset (2018), Euro Surveill. 23 (37).

Kroneman, A., H. Vennema, K. Deforche, H. Avoort, S. Penaranda, M. Oberste, J. Vinje, and M. Koopmans (2011), Journal of Clinical Virology 51 (2), 121.

Kujawski, S. A., C. M. Midgley, B. Rha, J. Y. Lively, W. A. Nix, A. T. Curns, D. C. Payne, J. A. Englund, J. A. Boom, J. V. Williams, G. A. Weinberg, M. A. Staat, R. Selvarangan, N. B. Halasa, E. J. Klein, L. C. Sahni, M. G. Michaels, L. Shelley, M. McNeal, C. J. Harrison, L. S. Stewart, A. S. Lopez, J. A. Routh, M. Patel, M. S. Oberste, J. T. Watson, and S. I. Gerber (2019), MMWR Morb. Mortal. Wkly. Rep. 68 (12), 277.

Lau, S. K., C. C. Yip, P. S. Zhao, W. N. Chow, K. K. To, A. K. Wu, K. Y. Yuen, and P. C. Woo (2016), Sci Rep 6, 25147.

Levy, A., J. Roberts, J. Lang, S. Tempone, A. Kesson, A. Dofai, A. J. Daley, B. Thorley, and D. J. Speers (2015), Journal of Clinical Virology 69, 117.

Linsuwanon, P., J. Puenpa, K. Suwannakarn, V. Auksornkitti, P. Vichiwattana, S. Korkong, A. Theamboonlers, and Y. Poovorawan (2012), PloS one 7 (5), e35190.

Liu, Y., J. Sheng, A. Fokine, G. Meng, W.-H. Shin, F. Long, R. J. Kuhn, D. Kihara, and M. G. Rossmann (2015a), Science 347 (6217), 71.

Liu, Y., J. Sheng, A. Fokine, G. Meng, W.-H. Shin, F. Long, R. J. Kuhn, D. Kihara, and M. G. Rossmann (2015b), Science (New York, N.Y.) 347 (6217), 71.

Majumdar, M., T. Wilton, Y. Hajarha, D. Klapsa, and J. Martin (2019), bioRxiv, 738948.

Messacar, K., K. Pretty, S. Reno, and S. R. Dominguez (2019), J. Clin. Virol. 113, 24.

Mishra, N., T. F. F. Ng, R. L. Marine, K. Jain, J. Ng, R. Thakkar, A. Caciula, A. Price, J. A. Garcia, J. C. Burns, K. T. Thakur, K. L. Hetzler, J. A. Routh, J. L. Konopka-Anstadt, W. A. Nix, R. Tokarz, T. Briese, M. S. Oberste, and W. I. Lipkin (2019), mBio 10 (4), e01903.

Nee, S., E. C. Holmes, A. Rambaut, and P. H. Harvey (1995), Philosophical Transactions of the Royal Society of London. Series B: Biological Sciences 349 (1327), 25.

Nguyen, L.-T., H. A. Schmidt, A. von Haeseler, and B. Q. Minh (2015), Molecular Biology and Evolution 32 (1), 268.

Nix, W. A., M. S. Oberste, and M. A. Pallansch (2006), J. Clin. Microbiol. 44 (8), 2698.

Pellegrinelli, L., F. Giardina, G. Lunghi, S. C. Uceda Renteria, L. Greco, A. Fratini, C. Galli, A. Piralla, S. Binda, E. Pariani, and F. Baldanti (2019), Euro Surveill. 24 (7).

Petrova, V. N., and C. A. Russell (2018), Nature Reviews Microbiology 16 (1), 47.

Pickett, B. E., E. L. Sadat, Y. Zhang, J. M. Noronha, R. B. Squires, V. Hunt, M. Liu, S. Kumar, S. Zaremba, Z. Gu, et al. (2011), Nucleic acids research 40 (D1), D593.

Piralla, A., A. Girello, M. Grignani, M. Gozalo-Marguello, A. Marchi, G. Marseglia, and F. Baldanti (2014), Journal of medical virology 86 (9), 1590.

Poelman, R., E. H. Scholvinck, R. Borger, H. G. Niesters, and C. van Leer-Buter (2015a), Journal of Clinical Virology 62, 1.

Poelman, R., I. Schuffenecker, C. Van Leer-Buter, L. Josset, H. G. Niesters, B. Lina, E.-E. E.-D. study group, et al. (2015b), Journal of Clinical Virology 71, 1.

Pons-Salort, M., M. S. Oberste, M. A. Pallansch, G. R. Abedi, S. Takahashi, B. T. Grenfell, and N. C. Grassly (2018), Proc. Natl. Acad. Sci. U.S.A. 115 (12), 3078.

R Core Team, (2014), R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing, Vienna, Austria.

Rahamat-Langendoen, J., A. Riezebos-Brilman, R. Borger, R. van der Heide, A. Brandenburg, E. Scholvinck, and H. G. Niesters (2011), Journal of Clinical Virology 52 (2), 103.

Sagulenko, P., V. Puller, and R. A. Neher (2018), Virus Evolution 4 (1), 10.1093/ve/vex042.

Schieble, J. H., V. L. Fox, E. H. Lennette, et al. (1967), American journal of epidemiology 85 (2), 297.

Schrodinger, LLC, (2015), “The PyMOL molecular graphics system, version 2,”.

Schuffenecker, I., A. Mirand, L. Josset, C. Henquell, D. Hecquet, L. Pilorge, J. Petitjean-Lecherbonnier, C. Manoha, J. Legoff, C. Deback, et al. (2016), Eurosurveillance 21 (19).

Sedlazeck, F. J., P. Rescheneder, and A. von Haeseler (2013), Bioinformatics 29 (21), 2790.

Shen, L., C. Gong, Z. Xiang, T. Zhang, M. Li, A. Li, M. Luo, and F. Huang (2019), Scientific reports 9 (1), 6073.

Smura, T., P. Ylipaasto, P. Klemola, S. Kaijalainen, L. Kyllonen, V. Sordi, L. Piemonti, and M. Roivainen (2010), Journal of medical virology 82 (11), 1940.

Sun, S., F. Gao, Y. Hu, L. Bian, X. Wu, Y. Su, R. Du, Y. Fu, F. Zhu, Q. Mao, and Z. Liang (2018), Emerging Microbes & Infections 7 (1), 99.

Tokarz, R., C. Firth, S. A. Madhi, S. R. Howie, W. Wu, S. Haq, T. Briese, W. I. Lipkin, et al. (2012), Journal of general virology 93 (9), 1952.

Uprety, P., D. Curtis, M. Elkan, J. Fink, R. Rajagopalan, C. Zhao, K. Bittinger, S. Mitchell, E. R. Ulloa, S. Hopkins, and E. H. Graf (2019), Emerg Infect Dis 25 (9), 1676.

Vogt, M. R., and J. E. Crowe Jr (2018), Journal of the Pediatric Infectious Diseases Society 7 (suppl 2), S49.

Wisdom, A., E. C. Leitch, E. Gaunt, H. Harvala, and P. Simmonds (2009), J. Clin. Microbiol. 47 (12), 3958.

Wollants, E., L. Beller, K. Beuselinck, M. Bloemen, K. Lagrou, M. Reynders, and M. Van Ranst (2019), J. Clin. Virol. 121, 104205.

Xiang, Z., L. Li, L. Ren, L. Guo, Z. Xie, C. Liu, T. Li, M. Luo, G. Paranhos-Baccala, W. Xu, and J. Wang (2017), Emerg Microbes Infect 6 (5), e32.


I. SUPPLEMENTARY MATERIALS AND METHODS

A. Patient population/samples.

Respiratory samples positive for EV-D68 were collected from six virology laboratories (see Table II). Ethical approvals and consents were obtained as required by local regulations (see below).

In Stockholm, EV-D68 testing was done on all respiratory samples positive for enterovirus from 1 August to 31 October 2018 at the Karolinska University Laboratory, which serves 6 of 8 emergency hospitals in the Stockholm county. An additional 25 enterovirus-positive samples from the pediatric ICU at Karolinska University Hospital collected during November 2018 were also tested. Specific EV-D68 testing was done using a published real-time PCR (Dyrdak et al., 2016). All samples positive with an in-house EV-D68-specific qPCR (Dyrdak et al., 2016) were included. Eluates were obtained from extraction on an automated system (MagNA Pure [Roche]). RNA extraction of samples shipped to Stockholm from the other study centers was done manually using the RNeasy Lipid Tissue Mini Kit (Qiagen cat. No. 74804). The study was reviewed and approved by the Regional Ethical Review Board in Stockholm, Sweden (registration no. 2017/1317–32).

In Groningen, testing was done routinely (year-round) using a specific assay for rhinovirus or enterovirus, or a combination of rhinovirus/enterovirus (FilmArray RP2, BioFire). If positive, a specific enterovirus D68 assay was used. A selection of 10 samples with low Ct-values of a total of 21 positive samples in 2018 were sent to Stockholm (Ethical Approval METc 2017/278).

In Belgium, for the first time, there was an upsurge of EV-D68 in 2018 (Wollants et al., 2019). At Leuven, 10 positive EV-D68 samples with low Ct-values, sent from AZ Sint-Jan Brugge, were confirmed with a nested VP4/VP2 PCR (Wisdom et al., 2009). This hospital in Bruges detected 83 positive EV-D68 samples in 7,986 respiratory samples in 2018 by using TAC (Taqman Array Card) technology for broad respiratory screening. In 2015 a specific real-time PCR for EV-D68 was integrated on this microarray card (Poelman et al., 2015a). In the context of a national reference laboratory for enteroviruses, no ethical approval was needed.

In Basel, 157 samples were screened using an in-house PCR based on primers by Piralla et al. (2014). The six samples sent to Stockholm represent all positive samples. The catchment area of the laboratory did not include pediatric departments. For University Hospital Basel samples, only anonymized samples without additional patient data were used, which does not require a specific ethical evaluation in accordance with correspondence with Swissethics.

In Barcelona, detection of EV in respiratory specimens was performed by specific real-time multiplex RT-PCR assay (Allplex Respiratory Panel Assay, Seegene, Korea). EV were characterized by phylogenetic analyses based on VP1 sequence, as previously described (Andres et al., 2019). A selection of 14 positive samples with low Ct-values of a total of 44 EV-D68 positive specimens from 2018 were used in this study. Institutional Review Board approval (PR(AG)173/2017) was obtained from the HUVH Clinical Research Ethics Committee.

In Tenerife, primary detection of enterovirus was done by specific real-time multiplex RT-PCR assay (Allplex Respiratory Panel Assay, Seegene, Korea) with subsequent typing of positive samples by VP1 sequencing (Nix et al., 2006). Consent for further analysis was obtained for nine of a total of twelve positive samples. Eluates of the nine samples, obtained from extraction on an automated system (EasyMag-EMag [Biomerieux]), were sent to Stockholm. Local Research Ethics Committee approval was obtained (code CHUNSC 2019 02).

B. Sequencing and bioinformatic processing

Near full-length genome sequencing (‘whole genome’ sequencing) was performed as previously described (Dyrdak et al., 2019). Briefly, the genome was amplified in duplicate by one-step RT-PCR in four overlapping fragments. Duplicates of each fragment were pooled and purified using AGENCOURT AMPure XP PCR purification kit and quantified with Qubit assays (Q32851, Life Technologies). Purified DNA from each fragment was diluted to the same concentration, pooled and sent to the Clinical Genomics Unit at Science for Life Laboratory for library preparation and sequencing (SciLifeLab, Stockholm, Sweden).

In total 1 µl of DNA (∼0.5–2.0 ng/µl) was used in the tagmentation reaction using Nextera chemistry (Illumina) to yield fragments >150 bp. The tagmented library underwent eleven cycles of PCR with single-end indexed primers (IDT Technologies) followed by purification using Seramag beads. The library was quantified using Quant-iT dsDNA High-Sensitivity Assay Kit and Tecan Spark 10 M or FLUOstar Omega plate reader. The library was then paired-end sequenced to a depth of 100,000–1,500,000 reads per sample on either HiSeq 2500 (2 × 101 bp) or NovaSeq 6000 (2 × 151 bp) Illumina sequencers. Base calling and demultiplexing were done using bcl2fastq v1.87, without allowing any mismatch in the index sequence. Assembly was done as described previously (Dyrdak et al., 2019), but to improve mapping sensitivity, we replaced BWA by NextGenMap (Sedlazeck et al., 2013) and used mapping references from the same subclade as the sample. The scripts implementing this workflow are available on GitHub at github.com/neherlab/EV-D68_sequence_mapping. Fifty-two of 55 samples were sequenced with a coverage of >100× in all four fragments and were included in the further analysis along with one sample with a coverage of >10× in one fragment and >100× in the other three fragments, giving a total of 53 successfully sequenced samples. The consensus sequences for these 53 samples have been deposited in GenBank (accession numbers MN245396-MN245448). The raw reads have been deposited in the Short Read Archive (BioProject number PRJNA525063, BioSample accession numbers SAMN13745166-SAMN13745216). A list of accession numbers, along with metadata, is available as Supp. Table I.

C. Whole Genome and VP1 Sequence Data Sets

The consensus sequences from the 53 samples sequenced in this study were combined with whole genomes of length >6000 bp available in the Virus Pathogen Resource (ViPR) (Pickett et al., 2011) (as of 2019-09-12), as well as samples matching this criterion manually curated from GenBank. Of all the sequences available in GenBank (n=4,259, on 1 Nov 2019), 70% were annotated with an isolation source, of which the majority were respiratory specimens (88%).

To conduct additional analyses on as large and representative a data set as possible, a further data set of VP1 sequences was assembled. All EV-D68 sequences in ViPR were downloaded and BLASTed (Altschul et al., 1990) against a 927 bp reference VP1 alignment (KX675261). Only matching regions of at least 700 bp with an Expect value (E-value) of 0.005 or lower were included. All sequences from the whole genome data set were included in the VP1 dataset.
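The length and E-value thresholds above can be illustrated with a small filter over tabular BLAST output. This is a minimal sketch, assuming BLAST was run with the default 12-column tabular format (`-outfmt 6`), in which the fourth column is the alignment length and the eleventh the E-value. The function name and the choice of query versus subject are assumptions and are not taken from the study's scripts.

```python
MIN_LENGTH = 700     # minimum matching region in bp (threshold from the text)
MAX_EVALUE = 0.005   # maximum Expect value (threshold from the text)


def filter_blast_hits(lines):
    """Return (query_id, subject_id, length, evalue) for hits passing both thresholds.

    Assumes default `-outfmt 6` columns:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    kept = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        length = int(fields[3])
        evalue = float(fields[10])
        if length >= MIN_LENGTH and evalue <= MAX_EVALUE:
            kept.append((fields[0], fields[1], length, evalue))
    return kept
```

A sequence with several hits would appear more than once in the result; how such cases were resolved in the study is not stated here, so the sketch leaves them as they are.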

To counter over-representation of countries with high sample numbers during some time periods, the VP1 dataset was down-sampled using the ‘augur’ ‘filter’ command, randomly selecting at most 20 samples per month, per year, per country.
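The down-sampling rule can be sketched in plain Python as follows. This is not augur's implementation, only an illustration of the grouping logic under assumed inputs: each record is a dictionary with hypothetical keys `strain`, `country`, and `date` (formatted `YYYY-MM-DD`, with the month possibly missing). In augur itself, this kind of grouping is typically expressed with the `--group-by` and `--sequences-per-group` options of `augur filter`. The exact invocation used in the study is in the authors' workflow and is not reproduced here.

```python
import random
from collections import defaultdict


def downsample(records, max_per_group=20, seed=0):
    """Randomly keep at most `max_per_group` records per (country, year, month)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    groups = defaultdict(list)
    for rec in records:
        parts = rec["date"].split("-")
        year = parts[0]
        month = parts[1] if len(parts) > 1 else "XX"  # unknown month gets its own group
        groups[(rec["country"], year, month)].append(rec)

    kept = []
    for key in sorted(groups):
        members = groups[key]
        if len(members) > max_per_group:
            members = rng.sample(members, max_per_group)  # sample without replacement
        kept.extend(members)
    return kept
```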

Though the 3 whole genome and 7 VP1 sequences sampled prior to 1990 fit the estimated molecular clock well when included in the analyses, they were omitted for figure clarity. Samples without a year of sampling were also excluded (3 from the whole genome run; 6 from the VP1 run), as were mouse-adapted samples and extreme outliers (6 from the whole genome run; 31 from the VP1 run). Using ‘augur’ ‘refine’ to generate time-resolved trees, branches more than 5 interquartile distances from the substitution rate regression were pruned, removing 3 and 4 sequences from the whole genome and VP1 datasets, respectively. The final number of sequences included in the whole genome and VP1 phylogenies was 813 and 1,654, respectively. Accession numbers and author/publication details for each sequence included in the analyses are available at nextstrain.org/enterovirus/d68/genome and nextstrain.org/enterovirus/d68/vp1. A list of accession numbers, along with metadata, is available as Supp. Tables III and IV.
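The interquartile-distance criterion used for pruning can be illustrated with a simplified calculation. The sketch below assumes an ordinary least-squares regression of root-to-tip divergence on sampling date and flags tips whose residual exceeds 5 interquartile distances. TreeTime's own regression accounts for the tree structure, so this is an approximation of the idea rather than the pipeline's code. In augur the threshold is set with the `--clock-filter-iqd` option of `augur refine`. The function name and inputs are hypothetical.

```python
import numpy as np


def clock_outliers(dates, divergences, n_iqd=5.0):
    """Flag tips whose residual from a linear clock fit exceeds n_iqd interquartile distances.

    dates: sampling dates in decimal years; divergences: root-to-tip distances.
    Returns a boolean array, True for outliers.
    """
    dates = np.asarray(dates, dtype=float)
    divergences = np.asarray(divergences, dtype=float)
    # Simplifying assumption: ordinary least squares, ignoring shared ancestry
    slope, intercept = np.polyfit(dates, divergences, 1)
    residuals = divergences - (slope * dates + intercept)
    q75, q25 = np.percentile(residuals, [75, 25])
    iqd = q75 - q25  # interquartile distance of the residuals
    return np.abs(residuals) > n_iqd * iqd
```

Because the fit includes the outliers themselves, a very distorted sample can shift the regression line; applying the filter once, as here, is the simplest variant.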

D. Age Data

In order to analyze associations between patient age and EV-D68 clade and subclade, roughly 500 ages or age ranges were collected manually from over 40 papers. Over 100 additional ages were provided by authors to whom we reached out. Combined with age data available on GenBank, this resulted in approximately 900 VP1 sequences and over 450 whole genome sequences with some kind of age information.

As some age information was available only as an age range, age data was automatically parsed to create an age range variable. ‘Age’ contains the exact age in decimal years, where available, and ‘age range 1’ consists of five categories (<1yr, 1-5yrs, 6-17yrs, 18-64yrs, and >=65yrs). The borderline age ranges given as ‘0-1’ and ‘0-18’ were interpreted as <1yr and <18yrs, respectively. Exact age and ‘age range 1’ were available for 778 and 792 VP1 samples, respectively, and for 378 whole genome samples each.
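The binning into age categories can be sketched as below. The category labels and the ‘0-1’ rule come from the text. Everything else is an assumption made for illustration: the function names, the treatment of decimal ages between bins (for example, 5.5 years is placed in 1-5yrs), and the decision to leave a reported range unassigned when its two ends fall in different categories, as happens for ‘0-18’.

```python
def age_category(age_years):
    """Map an exact age in decimal years to one of the five categories."""
    if age_years < 1:
        return "<1yr"
    if age_years < 6:      # assumption: ages in [1, 6) count as 1-5yrs
        return "1-5yrs"
    if age_years < 18:
        return "6-17yrs"
    if age_years < 65:
        return "18-64yrs"
    return ">=65yrs"


# Borderline case stated in the text; '0-18' (<18yrs) spans several categories
BORDERLINE = {"0-1": "<1yr"}


def range_category(text):
    """Map a reported range such as '2-4' to a category, or None if it spans several."""
    text = text.strip()
    if text in BORDERLINE:
        return BORDERLINE[text]
    low, high = (float(part) for part in text.split("-"))
    low_cat, high_cat = age_category(low), age_category(high)
    return low_cat if low_cat == high_cat else None
```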

The effect of subclade designation on age was examined by fitting simple linear models with the lm function in R (R Core Team, 2014).
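For readers unfamiliar with a linear model that has a categorical predictor, the following sketch shows the kind of fit involved. It reproduces only the point estimates of a model of the form `age ~ subclade` using dummy coding and least squares. The analysis in the paper was done with R's `lm`, which also reports standard errors and p-values. The choice of reference subclade and the function name are assumptions.

```python
import numpy as np


def fit_age_by_subclade(ages, subclades, reference):
    """Least-squares fit of age on subclade with dummy (treatment) coding.

    Returns the intercept (mean age of the reference subclade) and, for every
    other subclade, its difference in mean age from the reference.
    """
    ages = np.asarray(ages, dtype=float)
    others = sorted(set(subclades) - {reference})
    design = np.zeros((len(ages), 1 + len(others)))
    design[:, 0] = 1.0  # intercept column
    for j, level in enumerate(others, start=1):
        design[:, j] = [1.0 if s == level else 0.0 for s in subclades]
    coef, *_ = np.linalg.lstsq(design, ages, rcond=None)
    return dict(zip(["(Intercept)"] + others, coef))
```

With this coding, a positive coefficient for a subclade means its samples are on average older than those of the reference subclade.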

E. Phylogenetic analysis

We used the augur pipeline (Hadfield et al., 2018) to analyze the whole genome and VP1 data sets. Briefly, sequences were aligned using mafft (Katoh et al., 2002) and annotated according to the 1962 Fermon strain (GenBank accession AY426531), a phylogenetic tree was inferred using IQ-TREE (Nguyen et al., 2015), and maximum likelihood time trees were inferred using TreeTime (Sagulenko et al., 2018). Samples deviating from the estimated clock rate by more than 5 interquartile distances were removed during this step. Classification into clades and subclades was automated using the augur ‘clades’ command, based on mutations which matched the typing assigned to sequences by the Enterovirus Genotyping Tool 0.1 at https://www.rivm.nl/mpf/typingtool/enterovirus/ (Kroneman et al., 2011) (see Supp. Tables V and VI for mutations used to classify). The scripts implementing this workflow are available on GitHub at github.com/nextstrain/enterovirus_d68. The scripts to produce the further analyses and figures for this paper are available at github.com/neherlab/2018_evd68_paneurope_analysis.


F. Diversification, Persistence, and Migration

To calculate the number of lineages leading to the samples taken during each season, a season-specific tree containing only tips sampled during the outbreak year under investigation was created from the time-resolved, whole genome phylogeny. Trees were assumed to be ultrametric, so the number of lineages at the end of the season is equal to the number of samples taken that year. Working backwards from the most recent sample, each coalescence event and the time it occurred were recorded, until only one lineage remained. To help show the similarity in lineage change between the seasons, they are plotted as years prior to the end of the outbreak year in Fig. 2B.
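The backwards count of lineages can be sketched as follows. The sketch assumes that the coalescence times (the dates of the internal nodes of the season-specific tree) have already been extracted, and that the tree is strictly bifurcating, so that every coalescence reduces the number of lineages by exactly one. The function name and inputs are hypothetical.

```python
def lineages_through_time(n_samples, coalescence_times, season_end):
    """Count lineages backwards in time from the end of the season.

    n_samples: number of tips sampled in the outbreak year (tree treated as ultrametric).
    coalescence_times: decimal-year dates of internal nodes (bifurcating tree assumed).
    Returns a list of (years before season_end, number of lineages).
    """
    lineages = n_samples
    trajectory = [(0.0, lineages)]
    for time in sorted(coalescence_times, reverse=True):  # most recent event first
        lineages -= 1  # two lineages merge into one
        trajectory.append((season_end - time, lineages))
        if lineages == 1:
            break
    return trajectory
```

For example, four samples with coalescence events at 2018.5, 2017.0, and 2015.0 and a season ending at 2019.0 give 4, 3, 2, and 1 lineages at 0, 0.5, 2, and 4 years before the end of the season.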

Maximum likelihood estimation of migration rates was performed with TreeTime v0.7.0 using the command-line interface, as documented in the Snakefile. To minimize bias from countries, regions, and years with very few samples, we estimated migration rates from VP1 trees pruned to include only tips from the outbreak year (2014, 2016, or 2018) under investigation, and metadata was masked to include only countries (in Europe) and regions with a relatively similar number of samples across outbreaks. For between-country migration estimates, the countries were thus France, Spain, Germany, Italy, Sweden, “rest of Europe”, and “rest of world”, and for regions, China, Europe, North America, and “rest of world.” However, estimates did not differ greatly from those obtained when all tips, countries, and regions were used. Similarly, estimation of the coalescent rate through time (aka “skyline”) was performed with TreeTime using n = 150 bins and restricting to full genome sequences after Jan 2011.

To color the VP1 tree by epitope patterns, amino acids at the specified locations were concatenated, and only those patterns appearing more than 7 times (6 times for the C-terminus) were displayed. The C-terminus was not included in a large fraction of sequences, resulting in many undetermined amino acids. Tips with more than 7 missing amino acid positions were grouped into a ‘many X’ category; missing amino acids were inferred from the parental sequence for patterns with 6 or fewer missing sites. The crystal structure in Fig. 4 was generated using PyMOL (Schrödinger, LLC, 2015).
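The pattern construction can be sketched as follows. The sketch concatenates the amino acids at a list of 1-based positions, groups tips with too many undetermined residues (‘X’) into the ‘many X’ category, and relabels rare patterns. Several details are assumptions: the label "other" for patterns that are not displayed, the function name, and the omission of the step that infers missing residues from the parental sequence.

```python
from collections import Counter


def epitope_patterns(sequences, positions, min_count=7, max_missing=7):
    """Assign each tip an epitope pattern label.

    sequences: dict of tip name -> amino acid sequence (string).
    positions: 1-based amino acid positions to concatenate.
    Patterns seen min_count times or fewer are labelled 'other' (hypothetical label).
    """
    raw = {}
    for name, protein in sequences.items():
        pattern = "".join(protein[p - 1] for p in positions)
        if pattern.count("X") > max_missing:
            pattern = "many X"  # too many undetermined residues
        raw[name] = pattern

    counts = Counter(raw.values())
    return {
        name: pattern if pattern == "many X" or counts[pattern] > min_count else "other"
        for name, pattern in raw.items()
    }
```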


[Figure S 1 image: histogram titled “Sample Date of 53 Included 2018 Samples”. x-axis: week beginning (27 Aug to 26 Nov); y-axis: number of samples (0–5); bars colored by country (Belgium, Netherlands, Spain, Sweden, Switzerland).]

Figure S 1 Histogram of the sampling dates, by country, for the 53 samples generated in this study that were included in the final analyses.

Table S II AA Mutations in BC and DE Loops, C-Terminus, and VP2. This table extends from Fig 4 panel B, but includes the pattern differences for the C-terminus and VP2 AA positions 135-156.

Strain  BC loop         DE loop    C-Terminus                      VP2
Fermon  ..A..S.G......  ..NND....  K.R.T............A..T........X  HD.T...G........R....N
root    NHTSSDARTHKNFF  NGSSNSTYM  RGKDRAPNTLNAIIGNRESVKTMPHNIVTT  YNTNTSPEFNDIMKGEEGGTFN
A       .....E...D....  -........  ...E....A.....................  .........T...........S
A2      .....E..VD...Y  -..N.....  K..E....A..........I.....D..N.  H..T...G.D............
A1      ..A..E.Q.D....  -S......T  ...E....A........D............  .D.T...G.T...........S
B       D....A.Q.D....  .....N...  K..E....A........D............  .......G.D............
B3-US   D....T.Q.D....  ...N.N..V  K..E....A........D............  H..T...G.D......A.....
B3-EU   D....A.QAD....  .....N..V  K..E....A........D............  H..T...G.D......A.....


[Figure S 2 image: density plot titled “Age Distribution by Clade of 100 Bootstraps”. x-axis: age (0–80); y-axis: density (0.00–0.08); curves for clades B3, A2, and A1.]

Figure S 2 For the VP1 dataset, samples were sampled with replacement by country 100 times. For each bootstrapped dataset, the same linear regression shown in Table I was performed, and the age distributions for the A1, A2, and B3 clades were plotted. The ages of the A2 clade were significantly different from those of the B3 clade in all bootstrap replicate linear regressions after Bonferroni correction.
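One reading of "sampled with replacement by country" is a stratified bootstrap, in which each country's samples are resampled with replacement to the original number for that country. The sketch below implements that reading. It is an interpretation rather than the authors' code, and the record key `country` and the function name are hypothetical.

```python
import random
from collections import defaultdict


def bootstrap_by_country(records, n_replicates=100, seed=0):
    """Return bootstrap replicates, resampling with replacement within each country."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    by_country = defaultdict(list)
    for rec in records:
        by_country[rec["country"]].append(rec)

    replicates = []
    for _ in range(n_replicates):
        sample = []
        for country in sorted(by_country):
            members = by_country[country]
            # draw as many records as the country originally contributed
            sample.extend(rng.choices(members, k=len(members)))
        replicates.append(sample)
    return replicates
```

Each replicate could then be passed to the same regression as the full dataset, with the significance threshold adjusted for the number of comparisons.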


[Figure S 3 image: x-axis: year (2010–2018); y-axis: inverse coalescent rate (N_e), logarithmic scale from 10^2 to 10^4.]

Figure S 3 Inferred inverse rate of coalescence per lineage through time.

Table S V Clade definitions for VP1 dataset

Clade  Nucleotide Position  Base
A      207                  A
A      291                  A
A      834                  C
A1     201                  T
A1     533                  C
A1     891                  T
A2     132                  T
A2     201                  T
A2     345                  G
B      198                  A
B      390                  G
B      807                  G
B1     687                  T
B1     702                  T
B2     132                  A
B2     343                  T
B2     459                  A
B3     102                  A
B3     270                  C
B3     468                  G
C      684                  T
C      798                  C


Table S VI Clade definitions for whole genome dataset

Clade  Gene or Nucleotide  Position  Base
A      nuc                 156       T
A      nuc                 333       A
A      nuc                 675       G
A1     nuc                 204       T
A1     nuc                 3864      C
A2     nuc                 711       G
A2     nuc                 721       C
B      nuc                 744       G
B      nuc                 942       T
B      nuc                 1062      C
B      nuc                 1122      T
B1     nuc                 957       A
B1     nuc                 3090      T
B1     nuc                 4290      C
B2     nuc                 711       G
B2     nuc                 999       C
B2     nuc                 1038      C
B2     nuc                 1137      A
B3     nuc                 831       T
B3     nuc                 924       A
C      2B                  26        R
C      nuc                 1299      T
C      nuc                 1971      T


[Figure S 4 image: bar chart. x-axis: clade, with sample counts A1 (N=88), A2 (N=79), B1 (N=73), B2 (N=59), B3 (N=371), C (N=75); y-axis: percent of samples (0–100); age categories <1y, 1−5y, 6−17y, 18−64y, >=65y.]

Figure S 4 For VP1 dataset samples with ‘age range 1’ data, for all years, the proportion of samples in each age category for each clade. The over-representation of adults and the elderly in the A2 subclade can be seen clearly, along with the over-representation of adults (and to a lesser extent, the elderly) in the A1 subclade.


[Figure S 5 image: three panels sharing an x-axis of time (1990–2020), with lines for clades B3, B2, A1, and A2. Top: cumulative mutations in the BC loop (0–6). Middle: cumulative mutations in the DE loop (0–3). Bottom: sum of BC and DE loop (0–8).]

Figure S 5 Mean number of AA mutations in the BC and DE loops over time, colored by clade. In the BC loop, AA positions 90, 92, 95, 97, 98, and 103 were used, and in the DE loop, AA positions 140-146 and 149 were used. The BC-loop plot (top) shows that the B3 clade had around 4 mutations between 2000 and 2008, then about 7 years without mutations. In contrast, the A2 clade had only one mutation prior to 2010, but then had two between 2011 and 2015. Mean mutation count lines have been plotted slightly above and below their true value so that all lines can be seen when they share the same value. Calculated for the whole genome dataset.


Figure S 6 Most common C-terminus epitope patterns on the VP1 tree. This figure extends from Fig 4 panels D & E, but shows the most common C-terminus patterns. As in the BC-loop in Fig 4, the A2 subclade shows substantial recent evolution in this region.
