
Page 1: Genomic Epidemiology of Penicillin-Nonsusceptible Pneumococci … · 2020. 4. 29. · Hanage,whanage@hsph.harvard.edu. EPIDEMIOLOGY crossm April 2017 Volume 55 Issue 4 Journal of

Genomic Epidemiology of Penicillin-Nonsusceptible Pneumococci with Nonvaccine Serotypes Causing Invasive Disease in the United States

Cheryl P. Andam,a Patrick K. Mitchell,a Alanna Callendrello,a Qiuzhi Chang,a Jukka Corander,b Chrispin Chaguza,c,d Lesley McGee,e Bernard W. Beall,e William P. Hanage a

Department of Epidemiology, Center for Communicable Disease Dynamics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA a; Department of Mathematics and Statistics, University of Helsinki, Helsinki, Finland b; Malawi-Liverpool-Wellcome Trust Clinical Research Programme, Queen Elizabeth Central Hospital, Blantyre, Malawi c; Institute of Infection and Global Health, University of Liverpool, Liverpool, United Kingdom d; Respiratory Diseases Branch, Division of Bacterial Diseases, National Center for Immunization and Respiratory Diseases, Centers for Disease Control and Prevention, Atlanta, Georgia, USA e

ABSTRACT Conjugate vaccination against seven pneumococcal serotypes (PCV7) reduced disease prevalence due to antibiotic-resistant strains throughout the 2000s. However, diseases caused by resistant nonvaccine type (NVT) strains increased. Some of these emerging strains were derived from vaccine types (VT) that had changed their capsule by recombination. The introduction of a vaccine targeting 13 serotypes (PCV13) in 2010 has led to concern that this scenario will repeat itself. We generated high-quality draft genomes from 265 isolates of NVT pneumococci not susceptible to penicillin (PNSP) in 2009 and compared them with the genomes of 581 isolates from 2012 to 2013 collected by the Active Bacterial Core surveillance (ABCs) of the Centers for Disease Control and Prevention (CDC). Of the seven sequence clusters (SCs) identified, three SCs fell into a single lineage associated with serogroup 23, which had an origin in 1908 as dated by coalescent analysis and included isolates with a divergent 23B capsule locus. Three other SCs represented relatively deep-branching lineages associated with serotypes 35B, 15A, and 15BC. In all cases, the resistant clones originated prior to 2010, indicating that PNSP are at present dominated by descendants of NVT clones present before vaccination. With one exception (15BC/ST3280), these SCs were related to clones identified by the Pneumococcal Molecular Epidemiology Network (PMEN). We conclude that postvaccine diversity in NVT PNSP between 2009 and 2013 was driven mainly by the persistence of preexisting strains rather than through de novo adaptation, with few cases of serotype switching. Future surveillance is essential for documenting the long-term dynamics and resistance of NVT PNSP.

KEYWORDS genomic epidemiology, nonvaccine serotype, penicillin, vaccine

Received 10 December 2016 Returned for modification 6 January 2017 Accepted 11 January 2017

Accepted manuscript posted online 18 January 2017

Citation Andam CP, Mitchell PK, Callendrello A, Chang Q, Corander J, Chaguza C, McGee L, Beall BW, Hanage WP. 2017. Genomic epidemiology of penicillin-nonsusceptible pneumococci with nonvaccine serotypes causing invasive disease in the United States. J Clin Microbiol 55:1104–1115. https://doi.org/10.1128/JCM.02453-16.

Editor Sandra S. Richter, Cleveland Clinic

Copyright © 2017 American Society for Microbiology. All Rights Reserved.

Address correspondence to Cheryl P. Andam, [email protected], or William P. Hanage, whanage@hsph.harvard.edu.

EPIDEMIOLOGY

Journal of Clinical Microbiology, April 2017, Volume 55, Issue 4, p. 1104 (jcm.asm.org)

The best characterized virulence factor of pneumococcus is its polysaccharide capsule, of which there are at least 90 serologically distinct variants or serotypes (1), which vary in terms of their prevalence in carriage and disease, antibiotic resistance, and clinical manifestation (2). In 2000, a seven-valent pneumococcal conjugate vaccine (PCV7) was introduced for the routine immunization of children that specifically targeted seven serotypes responsible for 70 to 80% of invasive pneumococcal disease (IPD) in the United States (3). After the use of PCV7 was implemented, carriage prevalence and rates of IPD caused by vaccine serotypes (VT) were substantially reduced (4). However, the elimination of VT due to PCV7 was followed by the expansion of nonvaccine serotypes (NVT), plausibly due to the removal of competition from VT. This change in serotype distribution at the population level is called serotype replacement, and has been documented in multiple settings and across different disease manifestations (5–7). During the post-PCV7 era, this phenomenon started with the replacement of penicillin-susceptible lineages within the NVTs with pneumococci not susceptible to penicillin (PNSP), which caused a change in the population structure within individual serotypes (8). In 2010, expanded vaccine formulations (PCV10 and PCV13) were introduced to target serotypes commonly isolated from IPD in developing countries, as well as those that had become common in vaccinated communities following PCV7 use.

Another challenge in IPD treatment and prevention is the pneumococcus’ ability to obtain genetic material through recombination, which can lead to rapid acquisition of antibiotic resistance (9). It can also generate novel vaccine escape genotypes (10–12) through serotype or capsular switching, a process whereby strains substitute the genes encoding one type of capsule with genes encoding another (13). The properties of the novel recombinant strains produced by capsular switching may be quite different from their ancestors and not necessarily easy to predict.

Conjugate vaccination can be considered an ecological experiment imposing a defined selective pressure on a fraction of pneumococcal lineages. Here, we leverage this ecological experiment to investigate the emergence of antibiotic resistance in NVT causing IPD using genomic analyses of 846 PNSP isolates obtained from IPD cases sampled across the United States before and after PCV13 introduction.

RESULTS

Population structure and phylogeny. We obtained a total of 881 isolates from the pneumococcal collection of the Active Bacterial Core surveillance (ABCs) of the CDC. Of these, we retrieved high-quality draft genome sequences from 846 isolates, comprising 265 genomes from 2009, 328 from 2012, and 253 from 2013 (see Table S1 and Fig. S1 in the supplemental material). De novo genome assembly produced sequences ranging from 1.96 to 2.26 Mb (Table S1). Assembled genomes were annotated, revealing a total of 8,131 clusters of orthologous genes (COGs), of which 1,170 are present in 99 to 100% of the isolates, making up the core genome. The addition of Streptococcus pneumoniae ATCC 700669 (used to root the core genome tree) reduced the number of single-copy core COGs to 719.
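The core-genome definition used above (COGs present in 99 to 100% of the isolates) can be illustrated with a minimal Python sketch. The presence/absence layout, the function name `core_cogs`, and the toy data are assumptions made for illustration only; they do not reproduce the annotation pipeline used in this study.

```python
from typing import Dict, List, Set


def core_cogs(presence: Dict[str, Set[str]], n_isolates: int,
              threshold: float = 0.99) -> List[str]:
    """Return the COGs carried by at least `threshold` of all isolates.

    `presence` maps a COG identifier to the set of isolate names that
    carry it (a hypothetical layout chosen for this example).
    """
    return sorted(
        cog for cog, isolates in presence.items()
        if len(isolates) / n_isolates >= threshold
    )


# Toy data with 4 isolates; the real data set has 846.
toy = {
    "cog_A": {"i1", "i2", "i3", "i4"},  # present everywhere: core
    "cog_B": {"i1", "i2", "i3"},        # present in 75%: accessory
    "cog_C": {"i2"},                    # present in 25%: accessory
}
print(core_cogs(toy, n_isolates=4))
```

With the default 99% cutoff only `cog_A` qualifies; lowering `threshold` shows how sensitive the size of the core genome is to that choice.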

To investigate the genetic structure of the NVT PNSP population, we extracted the sequences of the core genome and identified seven distinct sequence clusters (SCs) using hierBAPS (14), ranging in size from 41 to 233 isolates (Fig. 1). SCs are groups of related strains with similar or closely related genotypes as identified by the hierBAPS software. There is relatively little structure related to sampling location and year of collection; all SCs contain isolates from at least eight sites, although there was variation in proportions across the different SCs (Fig. 1). Each SC is represented by isolates from both pre- and post-PCV13 time periods.

Prior work on PCV13 serotypes found PNSP in multiple different lineages of the same serotype (15). In contrast, six of the seven clusters here consist almost exclusively of a single sequence type (ST) and serotype (Fig. 1), with five STs (ST338, ST558, ST63, ST1373, and ST3280) comprising 76.24% of the sample. Some of the most dominant SCs are known clones from the Pneumococcal Molecular Epidemiology Network (PMEN) (16), which currently includes 26 multidrug-resistant and 17 susceptible pneumococcal clones (http://www.pneumogen.net/pmen/).

The largest cluster is SC3, associated with a 35B capsule and ST558, which was widespread in ABCs before and after PCV7 implementation (17). SC3 is closely related to the PMEN clone Utah35B-24 (ST377), differing only in one of the seven multilocus sequence typing (MLST) loci (17). Isolates in SC3 have high MIC values relative to other clusters (mean MIC50: SC3, 1.94; other SC, 0.37; Kruskal-Wallis test, P < 0.0001). SC2 (serotype 15A) contains members of another PMEN clone, Sweden15A-25 (ST63). SCs 1, 4, and 5 taken together make up 302 isolates (Fig. 1E) and are all closely related to ST338, PMEN clone Colombia23F-26. Hence, we considered these SCs together in our temporal analysis. While SC1 and SC4 are predominantly 23A, SC5 isolates have serotype 23B. SC1 and SC4 are separated on the phylogeny (Fig. 2A) by SC5, which appears to have emerged from an ancestor in SC4. This is supported by PHYLOViZ analysis (see Fig. S2) and hence, SC4 is not monophyletic. The greater resolution offered by genomic data enabled us to separate SC4 and SC1 (both ST338) into the two distinct clusters observed here (Fig. 2).

The 23B isolates in SC5 were all part of the ST1373 lineage. It was recently reported that isolates typed as 23B can show substantial diversity at their capsule loci, such that a sequence subtype 23B1 has been proposed (18). Examining the 23B capsule loci in SC5, we found that there is very little variation within SC5, but that the capsule locus is highly divergent from other previously published 23B loci (Fig. 2; see also Fig. S2). To confirm the presence of 23B1 in NVT PNSP, we used PneumoCaT to accurately identify this subtype in our data set. Except for the two 23B isolates in SC7, the majority of the 23B isolates were identified as 23B1 by PneumoCaT (Table S1). This plainly indicates the importance of this recently described variant as a cause of nonsusceptible invasive disease.

Isolates in SC7 comprised low frequency genotypes that are highly divergent from the rest of the sample and from each other (Fig. 2). SC7 represents a known tendency of the Bayesian analysis of population structure (BAPS) method to cluster isolates together on the basis of being divergent with no close relatives (analogous to long-branch attraction) (14). The rare genotypes found in this group have the potential to increase in the population with a change in ecological conditions and hence are worthy of study. Within SC7 are two prevaccine and six postvaccine genotypes that have allelic profiles identical to those of five multidrug-resistant PMEN clones and nine that differ at 1 to 2 loci from other PMEN clones (Fig. 3).

FIG 1 Distribution of the 846 isolates across sampling sites, year, ST, and serotype. (A) Map of the United States showing the 10 sampling sites. The proportions of isolates for each SC are calculated based on sampling site, with colors corresponding to the colors on the map (B), year of collection (C), ST (D), and serotype (E). The number of isolates and the r/m ratio per SC are also indicated.

The mean estimated substitution rates in SC1 to SC6 fell within the range of 4.9 × 10^-5 to 1.32 × 10^-6 substitutions per site per year with overlapping 95% credibility intervals (see Table S2), comparable to those reported in pneumococcal populations from other geographical regions (19–21).

FIG 2 Core genome phylogeny and distribution of genes coding for resistance against other antibiotic classes. (A) The maximum-likelihood tree was generated using the concatenated alignment of 719 core genes, using S. pneumoniae ATCC 700669 as an outgroup to root the tree. The inner ring delineates the seven SCs identified using hierBAPS. The heights of the bars in the outer ring correspond to MIC values for penicillin, with bars scaled with respect to the highest value of 8 μg/ml. (B) The maximum-likelihood tree is identical to the phylogeny in panel A. The branches are colored according to the hierBAPS membership. Outer rings show the presence (colored) or absence (gray) of the resistance gene. Shown are the distributions of genes conferring resistance to aminoglycosides (aphIII and sat4A), macrolides-lincosamides-streptogramin (ermB/C, msrD, and mefA), phenicols (cat), and tetracyclines (tetM). ermC was detected in only a single isolate and was therefore included in the ermB ring.

FIG 3 Serotype switching in PMEN clones in SC7. (Left) Maximum-likelihood phylogeny of SC7 on the basis of point mutations, with polymorphic sites due to recombination excluded. Colored branches indicate isolates that have an ST identical to that of recognized PMEN clones. The serotypes of these isolates are indicated on the right of the tree. (Middle) Sampling sites and years of collection for each of the isolates represented by the colored branches on the tree. The colors on the table correspond to the colored branches on the tree. (Right) Nomenclature of PMEN clones that have STs identical to those found in the NVT PNSP population. The names indicate the country or region in which the clone was first identified and its serotype at that time.


Comparison of pre- and postvaccine populations. The NVT PNSP population contains 70 distinct STs with allelic profiles already present in the MLST database (www.pubmlst.org/spneumoniae) and 25 new STs (Table S1). SC7 was the most diverse, having 37 distinct STs and nine new types that are present in low frequencies. The Simpson’s diversity index, which measures both the number (richness) and relative abundance (evenness) of each ST, showed that the diversity of the population in terms of STs did not change between the two periods (2009: 0.80, 95% confidence interval [CI], 0.77 to 0.82; 2012 to 2013: 0.82, 95% CI, 0.80 to 0.84). While two samples could be similarly diverse but composed of different STs, we found no significant evidence that this was the case (classification index [48]; P = 0.288).
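To make the diversity comparison concrete, the following Python sketch computes a Simpson's index of diversity from a list of sequence types. It assumes the finite-sample form D = 1 - Σ n_i(n_i - 1) / (N(N - 1)), which is commonly used for typing data; the text does not state which estimator was applied here, and the ST counts below are invented for illustration.

```python
from collections import Counter
from typing import Iterable


def simpson_diversity(sequence_types: Iterable[str]) -> float:
    """Probability that two isolates drawn without replacement differ in ST."""
    counts = Counter(sequence_types)
    n = sum(counts.values())
    if n < 2:
        raise ValueError("need at least two isolates")
    same = sum(c * (c - 1) for c in counts.values())
    return 1.0 - same / (n * (n - 1))


# Hypothetical sample: 4 isolates of ST558, 3 of ST63, 3 of ST338.
sample = ["ST558"] * 4 + ["ST63"] * 3 + ["ST338"] * 3
print(round(simpson_diversity(sample), 3))
```

A value near 0 means one ST dominates the sample, and a value near 1 means isolates are spread evenly over many STs, so similar values for two periods indicate similar richness and evenness but say nothing about whether the same STs are involved.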

Variation at the relA locus. The relA gene product catalyzes the synthesis and degradation of the signal molecule guanosine tetraphosphate, which is involved in the stress response (22, 23). Variation at the relA locus in pneumococcus was shown to contribute to the resistance to neutrophil killing and may plausibly impact fitness (24). We sought to determine if there is any indication of similar variation in our samples and how it was distributed among SCs. We detected two highly divergent sequences of relA, which we have termed relA-1 and relA-2. Found in only 11 pre- and 25 postvaccine isolates in SC4 (serotype 23A), relA-2 exhibits 94% sequence identity with the more common relA-1, with the latter also found in the reference genome of S. pneumoniae ATCC 700669 (Fig. S2). We compared the sequence of relA-2 with that found in a modified laboratory strain of S. pneumoniae TIGR4 (GenBank accession no. NC_003028) experimentally shown to exhibit reduced resistance to neutrophil-mediated killing and decreased competitiveness in colonization due to a single nucleotide polymorphism (SNP) in the relA locus (24). A sequence comparison showed that relA-2 does not show close sequence identity (94%) to the relA gene of the TIGR4 variant strain.

Recombination and serotype switching. To test for the presence of recombination, we calculated the number of polymorphisms accumulated through recombination relative to those generated through mutation (r/m) for each cluster. The total number of SNPs introduced through recombination ranged from 3,562 in SC6 (representing 79.0% of the total number of SNPs identified in that SC) to 64,966 (90.9%) in SC2 (see Fig. S3). The estimates for the per-site r/m ranged from 1.77 (SC6) to 8.46 (SC4) and varied significantly among clusters (Kruskal-Wallis test, H = 25.752, df = 5, P < 0.0001) (Fig. 1B). We also calculated the ratios of the number of recombination events to the number of mutations (21). The ratios were less than one in all six SCs, suggesting that recombination occurred less frequently than single nucleotide substitutions, but when they occurred, they introduced more SNPs. This rate was also significantly different between SCs (Kruskal-Wallis test, P < 0.0001). Our results are consistent with previously reported recombination rates in carried pneumococci (19–21).
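The distinction between the two ratios above can be shown with a short Python sketch. The counts are hypothetical and chosen only to show how the ratio of recombination events to mutations can be below one while r/m is above one; the function name and input layout are likewise assumptions and do not represent the software used in this study.

```python
from typing import List, Tuple


def recombination_summary(snps_per_event: List[int],
                          mutation_snps: int) -> Tuple[float, float]:
    """Return (r/m, events-to-mutations ratio).

    `snps_per_event` holds the number of SNPs introduced by each inferred
    recombination event; `mutation_snps` is the number of SNPs attributed
    to point mutation.
    """
    if mutation_snps <= 0:
        raise ValueError("mutation_snps must be positive")
    r_over_m = sum(snps_per_event) / mutation_snps
    events_over_mutations = len(snps_per_event) / mutation_snps
    return r_over_m, events_over_mutations


# Hypothetical cluster: 5 recombination events importing 150 SNPs in total,
# against 50 SNPs that arose by point mutation.
r_m, events_ratio = recombination_summary([40, 25, 60, 15, 10], mutation_snps=50)
print(r_m, events_ratio)
```

In this toy case recombination events are ten times rarer than mutations, yet they account for three times as many SNPs, which is the pattern described for all six SCs.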

Two distinct modes of recombination have been proposed to occur in the pneumococcus: microrecombination (frequent replacements of short DNA fragments) and macrorecombination (rare larger replacements, usually associated with major phenotypic changes, such as capsule switching) (25). In the NVT PNSP, the lengths of the recombination fragments greatly varied, ranging in size from 5 bp to 109,471 bp (see Fig. S4). Overall, the sizes of recombination events follow a geometric distribution, with a majority of the recombination encompassing short DNA segments and mean sizes of 5,076 to 9,731 bp. Large recombination events (>30,000 bp) occurred less frequently, with the longest recombination block detected in a postvaccine isolate from SC3 (109,471 bp). The genes for which we identified recombination were mostly consistent between clusters and may represent recombination hotspots resulting from natural selection that contain virulence factors, such as the cell wall protein encoded by pspA, mobile elements, and antibiotic resistance genes (pbp and tetM) (Fig. S3).

The genetic diversity generated through recombination may represent independent acquisitions or may be passed on to descendants through clonal descent. If it is through clonal descent, this may suggest that the actual rate of recombination is very low. Therefore, we calculated the ratios of the recombination events that are unique to an isolate versus those that are shared between multiple isolates within each SC (Fig. S4). Ratios greater than 1 suggest that most recombinations have occurred recently in extant taxa and have been acquired from outside the cluster, while those below 1 indicate that most recombinations have occurred on internal branches. The six SCs vary in terms of the unique/shared ratios, with the highest observed in SC5 (6.13) and the lowest observed in SC4 (0.91); the rest have values between 1.74 and 2.81.

Serotype or capsule switching involves the substitution of genes encoding one type of capsule with genes for another through homologous recombination of the genes flanking the capsular (cps) locus. Across the six clusters, we detected only five plausible capsule-switching events occurring both within and between serogroups: 23A → 15A, 23A → 15BC, 15A → 15BC, 15A → 23B, and 15BC → 21 (Fig. 4; see also Fig. S5). The eight SC7 isolates that have ST profiles identical to those of five PMEN clones (Fig. 3) also appear to have emerged through VT-to-NVT switching (23F → 15B and 3 → 23A, as well as 14 → 24F and 9V → 35B, which were also reported in other studies) (26). We did not observe any increase in the incidence of serotype switches after PCV13 was introduced. It is unknown whether any of the capsular-switched strains originated at the time of vaccine pressure, even if selection by the vaccine was integral to their subsequent success.

Resistance to multiple antibiotic classes. Other non-penicillin antibiotic resistance (ABR) genes may also be present in PNSP, which may result in the emergence of multidrug resistance. Therefore, we considered bioinformatic evidence for this in our sample. A total of 507 genomes (representing 60% of the population; Fig. 2B and see Table S3) contained loci known to be associated with resistance to antimicrobial classes other than beta-lactams. Other ABR genes we detected include sat4A and aphIII (aminoglycosides), ermC and mefA (macrolides), and cat (chloramphenicol). The distributions of these ABR genes varied substantially among the SCs. For example, all isolates from SC2 harbor ermB and tetM, while the majority of isolates in SC3 and SC6 carry the msrD and mef genes (Fig. 2B). As expected, mefA was found only in isolates that also carry msrD (and vice versa), as these efflux systems appear to be complementary. It must be noted that the size of the database used for comparison is an important limitation in bioinformatic searches for resistance determinants.

FIG 4 Bayesian phylogeny and population dynamics of SC1. Bayesian maximum clade credibility phylogeny of SC1 based on nonrecombinogenic regions of the core genome. Divergence date (median estimate with 95% highest posterior density dates in brackets) is indicated in blue on the tree. Sampling sites, year of isolation, and serotypes of each isolate are shown on the right of the tree. Inset: a Bayesian skyline plot showing changes in effective population size Ne over time (median is in black and 95% confidence intervals are in blue). Results for the other SCs are shown in Fig. S5 in the supplemental material.

Estimating the date of clonal origin. To estimate the date of clonal origin of each SC, we first determined the presence of a temporal signal in our sample using Path-O-Gen and estimated the time to the most recent common ancestor (tMRCA) of all of the monophyletic clusters (SC1, SC2, SC3, SC5, SC6, and combined SC1, SC4, and SC5). A significant positive correlation between the dates of isolation and root-to-tip distances was observed for SC1, SC2, SC3, and SC5 (see Fig. S6). While the P value of the combined SC1, SC4, and SC5 representing serogroup 23 was not significant (P = 0.0625), its tMRCA must have occurred several decades prior to PCV13 introduction given that the subclades SC1 and SC5 were estimated to have originated in the 1980s and 1990s, respectively, as described below. We used BEAST to further analyze the times of the clonal origins. Of the SCs with a clock-like signal (SC1, SC2, SC3, and SC5), their tMRCA values as calculated by BEAST were estimated to have occurred in the 1970s to mid-1980s and early 1990s (Fig. 4 and see Fig. S5). All four SCs also showed an increase in effective population size (Ne) from the mid- to late-1990s until the mid- to late-2000s.
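The temporal-signal check described above rests on a regression of root-to-tip distance on isolation date: the slope estimates the substitution rate and the x-intercept gives a rough date for the root. The Python sketch below shows that calculation with ordinary least squares. It is a simplified illustration under stated assumptions and not a reimplementation of Path-O-Gen or BEAST; the function name and the toy dates and distances are invented.

```python
from typing import Sequence, Tuple


def root_to_tip_regression(dates: Sequence[float],
                           distances: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of distance = rate * (date - root_date).

    Returns (rate, root_date). Raises ValueError when the slope is not
    positive, i.e. when there is no usable temporal signal.
    """
    n = len(dates)
    mean_x = sum(dates) / n
    mean_y = sum(distances) / n
    sxx = sum((x - mean_x) ** 2 for x in dates)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dates, distances))
    rate = sxy / sxx
    if rate <= 0:
        raise ValueError("no positive temporal signal")
    intercept = mean_y - rate * mean_x
    return rate, -intercept / rate


# Toy isolates sampled in 2009, 2012 and 2013, lying exactly on a clock
# with rate 2e-6 substitutions per site per year and a root in 1985.
dates = [2009.0, 2009.0, 2012.0, 2013.0]
distances = [2e-6 * (d - 1985.0) for d in dates]
rate, root_date = root_to_tip_regression(dates, distances)
print(rate, round(root_date, 2))
```

Real data scatter around the line, so the fit is only a screening step; the tMRCA values reported here come from the Bayesian analysis.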

Geographical distribution of clones. To test whether a signal indicating transmission within sites can be detected from the genomic data, we compared the pairwise genetic distances obtained from recombination-free phylogenies with the sampling sites from which the isolates were recovered. The probability that the members of a pair came from the same location decreased exponentially with the genetic distance (in point mutations) between them (Fig. 5), suggesting that more closely related isolates were much more likely to have been recovered from the same location than would be expected by chance. The distribution of pairwise genetic distances within each sampling site also provides insight into the composition of genotypes present in each location (see Fig. S7). Notable are Minnesota, Georgia, and New York, where a bimodal distribution of pairwise distances is observed and may indicate the presence of at least two cocirculating genotypes in the population.

FIG 5 Geographical structure of NVT PNSP. Pairwise genetic distances, which delineate separation on the basis of point mutations alone, were calculated between isolates within the same SC from phylogenetic trees in Fig. 3 and Fig. S7 in the supplemental material. The proportions of all pairwise comparisons of isolates originating from the same location were calculated and plotted against the genetic distances. The resulting values are plotted as black data points, with the blue lines representing the curves with approximately exponential decays. The red data points represent the outcomes of 100 permutations in which the same statistics were calculated when the locations of the isolates were randomized.
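The pairwise comparison and the 100 location permutations described for the geographical analysis can be sketched in Python as follows. The data layout, function names, distance cutoff, and toy isolates are assumptions made for illustration; the sketch shows the general permutation idea rather than the exact procedure used to produce Fig. 5.

```python
import random
from typing import Dict, List, Tuple

Pair = Tuple[str, str]


def same_site_fraction(dist: Dict[Pair, float], site: Dict[str, str],
                       max_dist: float) -> float:
    """Fraction of isolate pairs within `max_dist` that share a sampling site."""
    close = [pair for pair, d in dist.items() if d <= max_dist]
    if not close:
        return float("nan")
    return sum(site[a] == site[b] for a, b in close) / len(close)


def permuted_fractions(dist: Dict[Pair, float], site: Dict[str, str],
                       max_dist: float, n_perm: int = 100,
                       seed: int = 0) -> List[float]:
    """Recompute the statistic after shuffling site labels among isolates."""
    rng = random.Random(seed)
    names = list(site)
    labels = [site[name] for name in names]
    results = []
    for _ in range(n_perm):
        rng.shuffle(labels)
        results.append(same_site_fraction(dist, dict(zip(names, labels)), max_dist))
    return results


# Toy data: two close pairs, each recovered within one state.
site = {"a": "MN", "b": "MN", "c": "GA", "d": "GA"}
dist = {("a", "b"): 2, ("c", "d"): 3, ("a", "c"): 50,
        ("a", "d"): 55, ("b", "c"): 52, ("b", "d"): 54}
observed = same_site_fraction(dist, site, max_dist=10)
null = permuted_fractions(dist, site, max_dist=10)
print(observed, sum(v >= observed for v in null) / len(null))
```

Repeating the calculation over a range of `max_dist` values would trace out a curve of the kind plotted in Fig. 5, with the permuted values giving the expectation when location carries no information.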

DISCUSSION

We have shown that postvaccine diversity in the NVT PNSP population is driven mainly by the expansion of standing variation, which refers to the outgrowth of lineages that were present prior to PCV13 introduction rather than through de novo adaptation. The major clones were already present in the population decades before both vaccination programs (PCV7 and PCV13) were implemented in the country. Within each of the six SCs, postvaccine clones exhibited the same combinations of serotype and nonsusceptibility as exhibited by the prevaccine population. We also found considerable evidence for local geographic spread and an indication of the presence of local outbreaks or recent introductions in some states (MN, GA, and NY).

A majority of the isolates (82.51%; SC1 to SC5) are variants of known disease-causing, multidrug-resistant PMEN reference clones. SC5, which we estimate to have originated in the 1990s, is notable for containing a 23B capsule that is apparently divergent from previously sequenced isolates with the same serotype. This is hard to interpret without a more rigorous and systematic sampling of 23B isolates from multiple sources, as it is possible that such diversity is typical within a capsular locus and does not lead to phenotypic consequences. However, the recent report of a 23B1 subtype (18) is consistent with our observation. The 23B1 subtype was reported to only appear in samples from 2007 onwards (18). In contrast, our analysis suggests this lineage arose somewhat earlier, but this is not inconsistent, as the increase of this lineage could be related to PCV7 in some way.

The reasons for the comparative successes or failures of resistant clones are unclear. The variation at the relA locus, while interesting, is only one of the many possibly relevant changes. Future experimental work is needed to validate the specific function of relA-2 in IPD. Another feature of the pneumococcus that was previously suggested to assist in adaptation and resistance is recombination. An analysis of genome variation among the most common clones in this sample found recombination rates ranging from 1.77 to 8.46, which are not unusual in comparison with previous estimates for other clones (20, 21, 25).

SC7 includes multiple examples of PMEN clones previously found with VT, and these rare genotypes are, at present, “hopeful monsters” that have either acquired a new NVT or are an NVT that recently acquired resistance and has not yet spread (56). Their future is not clear, and they may not persist in the population. The lineages of most concern here are the 35B variant of PMEN3 (ST156) and the 23A variant of PMEN31 (ST180). During the post-PCV7 era, ST156 was associated mostly with the highly resistant serotype 19A, and 35B is now becoming the most successful serotype in this lineage post-PCV13 (27). It was recently reported that the increase of 35B is directly related to the expansion of the clonal complex of ST558 and the emergence of vaccine escape recombinant 35BST156 due to capsular switching (27). Hence, our work further highlights the importance of 35B in causing IPD over the next several years and the potential for additional cases of 35B capsular switches. This and the high prevalence of 35B among PNSP, which is reported in this study, warrant its inclusion in future conjugate vaccines. While excessive speculation is premature, continued surveillance is important for defining the spread and importance of these lineages.

MATERIALS AND METHODS

Bacterial isolates. A total of 881 isolates of NVT PNSP were collected from IPD cases from all age groups by the Active Bacterial Core surveillance (ABCs) system, a population- and laboratory-based collaborative system between the Centers for Disease Control and Prevention (CDC) and state health departments and academic institutions from 10 states across the country (California [CA], Colorado [CO], Connecticut [CT], Georgia [GA], Maryland [MD], Minnesota [MN], New Mexico [NM], New York [NY], Oregon [OR], and Tennessee [TN]). These surveillance areas represented an estimated total population of 29,206,528 persons, 30,356,544 persons, and 30,604,240 persons in 2009, 2012, and 2013, respectively. Antibiotic resistance testing by broth microdilution (28) and serotyping using the Quellung reaction were performed by the CDC. Draft genome sequencing was performed on the 881 NVT PNSP isolates, comprising 285 isolates from 2009, 339 from 2012, and 257 from 2013 (see Table S1 in the supplemental material). These isolates represent the NVT PNSP population from IPD cases before and after the initial implementation of PCV13 in the country in 2010. These include serotypes other than those that were included in PCV7 (4, 6B, 9V, 14, 18C, 19F, and 23F) and the additional serotypes in PCV13 (1, 3, 5, 6A, 7F, and 19A). We considered samples to be penicillin-nonsusceptible based on the meningitis breakpoint (MIC > 0.06 μg/ml), as recommended by the Clinical and Laboratory Standards Institute (CLSI) (28, 29). Serotypes 15B and 15C were grouped together as 15BC because of the reported reversible switching between the two serotypes, which makes it difficult to precisely differentiate them (30, 31).

DNA preparation, sequencing, and typing. Cultures were grown in Todd-Hewitt medium with 0.5% yeast extract (THY; Becton, Dickinson and Company, Sparks, MD) at 37°C in 5% CO2 for 24 h. DNA was extracted and purified from cultures using a DNeasy blood and tissue kit (Qiagen, Valencia, CA). DNA concentration was measured using a Qubit fluorometer (Invitrogen, Grand Island, NY), and the DNA was diluted to 0.2 ng/μl. DNA libraries were prepared using the Nextera XT protocol (as per the manufacturer’s instructions) with 1 ng of genomic DNA/isolate. Samples were sequenced as multiplexed libraries on the Illumina MiSeq platform operated per the manufacturer’s instructions to produce paired-end reads of either 100 (n = 116) or 150 (n = 730) nucleotides in length. Samples were used only when they had at least 30-fold coverage of the reference genome (S. pneumoniae ATCC 700669 [GenBank accession no. NC_011900]). After filtering out the genomes with low coverage and of poor quality, a total of 846 genomes were used for downstream analyses (2009, n = 265; 2012, n = 328; and 2013, n = 253) (Table S1). The sequence type (ST) of each isolate was confirmed using the program Short Read Sequence Typing (SRST2) (32), which extracts the sequences of seven housekeeping genes (aroE, gdh, gki, recP, spi, xpt, and ddl) (33) from the Illumina raw data and compares them to the S. pneumoniae MLST database (www.mlst.net). We also used SRST2 to confirm the serotypes by calculating the highest scoring matches to a pneumococcal capsule reference sequence database (1, 19). To confirm the presence of the serotype variant 23B1, we used the program PneumoCaT, which is more sensitive in identifying new subtypes of more common serotypes (18).
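The 30-fold coverage criterion can be illustrated with a rough depth estimate: total sequenced bases divided by the reference length. The text does not state how coverage was computed, so the sketch below is an assumption. The function names are hypothetical, and the reference length is passed in by the caller rather than taken from the accession.

```python
def mean_coverage(n_read_pairs: int, read_length: int, reference_length: int) -> float:
    """Approximate mean depth for paired-end data.

    Assumes every read has the nominal length and all reads map,
    so this is an upper bound on the depth actually achieved.
    """
    if reference_length <= 0:
        raise ValueError("reference_length must be positive")
    # Two reads per pair, each contributing read_length bases.
    return 2 * n_read_pairs * read_length / reference_length


def passes_coverage(n_read_pairs: int, read_length: int,
                    reference_length: int, min_fold: float = 30.0) -> bool:
    """Apply the 30-fold inclusion threshold described in the text."""
    return mean_coverage(n_read_pairs, read_length, reference_length) >= min_fold
```

For example, 300,000 pairs of 150-nucleotide reads against a reference of about 2.2 Mb gives roughly 41-fold depth, which would pass the threshold.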

De novo genome assembly, annotation, and core genome identification. Reads were assembled into contigs using the de novo assembler SPAdes v.3.5.0 (34). The resulting contigs were annotated using Prokka, a stand-alone tool specifically developed for annotating bacterial genomes (35). Any assemblies with an N50 of <10,000 were excluded from further analysis. This gave us a total of 846 genomes for downstream analysis, with the numbers of contigs ranging from 82 to 588 and N50 from 15,716 to 96,119 bp (see Fig. S1). We then used the clustering algorithm Best Directional BLAST Hits (BDBH) implemented in GET_HOMOLOGUES to identify and cluster orthologous genes (36) with S. pneumoniae ATCC 700669 as a reference. Best hits were identified using the default parameters of 75% minimum pairwise alignment coverage and an E value cutoff of 1e-05. The sequences range in length from 1.97 Mb to 2.26 Mb. Using the strain ATCC 700669 as a reference, we identified a total of 719 clusters of orthologous genes (COGs) that were present in single copies in all genomes and were used to generate a 606,993-bp codon alignment. We also used the program Roary (37) to characterize the pan-genome of the 846 NVT PNSP isolates, consisting of the core genes (present in 99 to 100% of isolates), soft core genes (present in 95 to <99% of isolates), shell genes (present in 15 to <95% of isolates), and cloud genes (present in <15% of isolates).
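Two of the thresholds in this paragraph are simple enough to sketch directly: the N50 filter applied to assemblies and the pan-genome frequency categories. The code below is an illustrative sketch only. The function names are hypothetical, and it does not reproduce how SPAdes or Roary compute these values internally.

```python
def n50(contig_lengths):
    """Return N50: the contig length at which the cumulative length of
    contigs, taken from longest to shortest, reaches half the assembly."""
    if not contig_lengths:
        raise ValueError("contig_lengths must not be empty")
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def keep_assembly(contig_lengths, min_n50=10_000):
    """Exclude assemblies with an N50 below the cutoff used in the text."""
    return n50(contig_lengths) >= min_n50


def pangenome_category(n_present, n_isolates):
    """Classify a gene by the fraction of isolates carrying it, using the
    core / soft core / shell / cloud boundaries given in the text."""
    fraction = n_present / n_isolates
    if fraction >= 0.99:
        return "core"
    if fraction >= 0.95:
        return "soft core"
    if fraction >= 0.15:
        return "shell"
    return "cloud"
```

With 846 isolates, a gene found in 838 genomes falls in the core category, whereas one found in 837 genomes is soft core.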

Phylogenetic and population structure analyses. Each single-copy orthologous gene family obtained from GET_HOMOLOGUES was aligned using MAFFT (38). The alignments were concatenated to give a single core alignment, and a maximum-likelihood phylogeny was then generated using the program Randomized Axelerated Maximum Likelihood (RAxML) v.8.1.15 (39) with a general time-reversible (GTR) nucleotide substitution model (40) and four gamma categories for rate heterogeneity. Genetic population structure analysis was performed using hierarchical Bayesian analysis of population structure (hierBAPS) with the core genome alignment as input (14). hierBAPS fits lineages to genome data using nested clustering and has been shown to efficiently estimate bacterial population structures from limited core genome variation and draft-genome sequence data (41–44). We used PHYLOViZ to visualize allelic profiles from MLST data (45).

ST diversity. The diversities of the samples pre- and post-PCV13 were estimated using Simpson’s index of diversity D (46), defined here as

\[ D = \left(1 - \sum_{i=1}^{m} x_i^{2}\right)\left(\frac{N}{N-1}\right) \]

where x_i is the fraction of the sample with ST i, m is the total number of STs, and N is the sample size. Variance and 95% confidence intervals were calculated as previously described (47). To estimate the differences in the ST compositions between the two time periods, the classification index was calculated and significance was assessed using a permutation method (48).
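The definition of D above translates directly into a few lines of code. The following is a minimal sketch, assuming the input is a list with one ST label per isolate. The function name is hypothetical, and the variance and confidence-interval calculation (47) is not included.

```python
from collections import Counter


def simpsons_diversity(st_labels):
    """Simpson's index of diversity with the N / (N - 1) correction.

    st_labels: one sequence type label per isolate in the sample.
    """
    n = len(st_labels)
    if n < 2:
        raise ValueError("at least two isolates are required")
    counts = Counter(st_labels)
    # Sum of squared ST fractions, x_i = count_i / N.
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return (1.0 - sum_sq) * n / (n - 1)
```

D is 0 when every isolate shares one ST and 1 when every isolate has a different ST, so higher values indicate a more diverse sample.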

Estimating recombination rates. Recombination events were detected using the program Genealogies Unbiased By RecomBination in Nucleotide Substitutions (GUBBINS) (49). GUBBINS uses an iterative approach to identify loci containing elevated densities of base substitutions and subsequently builds a maximum-likelihood phylogeny based on point mutations alone. We used the default parameters of five iterations, a minimum of 3 base substitutions within a 500-bp sliding window to define a recombination event, and the weighted Robinson-Foulds distance to estimate convergence. The cutoff value of 3 single nucleotide polymorphisms (SNPs) was selected because approximately 2 to 4 SNPs are introduced within a pneumococcal genome per year (19). To identify the genes affected by recombination, we first reassembled the genomes using SMALT v.3.1.1 (https://www.sanger.ac.uk/resources/software/smalt/) with S. pneumoniae ATCC 700669 as a reference. SNPs were called using SAMtools (50) and VCFtools (51). The alignment from the SMALT assemblies for each BAPS cluster was then used as the input for GUBBINS. RAxML was used to generate the initial tree based on all the SNPs and the subsequent iterations of tree reconstructions based on SNPs due to point mutations alone. All phylogenies were visualized using FigTree (http://tree.bio.ed.ac.uk/software/figtree/) and Interactive Tree of Life (http://itol.embl.de). The Kruskal-Wallis test was used to determine significant differences in nucleotide substitutions and recombination rates between clusters.
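The density criterion described above (at least 3 base substitutions within a 500-bp window) can be illustrated with a simple scan over substitution positions. This is a deliberately simplified sketch and not the GUBBINS algorithm, which evaluates substitution density statistically on each branch of the phylogeny. The function name and the interval-merging rule are assumptions made for illustration.

```python
def dense_windows(snp_positions, window=500, min_snps=3):
    """Return merged (start, end) intervals in which at least `min_snps`
    substitutions fall within a span of fewer than `window` bp."""
    positions = sorted(set(snp_positions))
    intervals = []
    left = 0
    for right, pos in enumerate(positions):
        # Shrink the window from the left until it spans < `window` bp.
        while pos - positions[left] >= window:
            left += 1
        if right - left + 1 >= min_snps:
            start, end = positions[left], pos
            if intervals and start <= intervals[-1][1]:
                # Overlaps the previous dense region: extend it.
                intervals[-1] = (intervals[-1][0], end)
            else:
                intervals.append((start, end))
    return intervals
```

Positions are treated as coordinates along a single reference genome; isolated substitutions that never reach the density threshold are left unflagged and would be interpreted as point mutations under this simplification.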

Detection of antibiotic resistance genes. We screened all of the genomes for known accessory element resistance genes using a direct read mapping approach implemented in SRST2 (32). The ABR allele sequences used for comparison were retrieved from the ARG-ANNOT database (52) available from the SRST2 website.

Identification of spatial and temporal signals. Pairwise genetic distances were extracted from phylogenies generated for each cluster using R v.3.0.2 (53). These were analyzed separately for each cluster and were pooled across all clusters following the method described in reference 20. Over a series of threshold genetic distances, the proportion of pairs separated by less than or equal to each threshold distance that came from the same site was calculated and plotted. An exponential curve was fitted to the relationship between threshold distance and the probability of sharing the same site. One hundred permutations of sampling sites were then performed to generate a null expectation.
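The same-site proportion and its permutation null can be sketched as follows. This is an illustrative outline under stated assumptions rather than the analysis code used in the study, which was run in R. The function names and input layout (a list of isolate pairs with distances, plus a mapping from isolate to site) are hypothetical, and the exponential curve fit is omitted because its functional form is not given in the text.

```python
import random


def same_site_proportion(pairs, site_of, threshold):
    """Among pairs with distance <= threshold, return the fraction whose
    two isolates were sampled at the same site (NaN if no such pairs).

    pairs: iterable of (isolate_a, isolate_b, distance) tuples.
    site_of: dict mapping isolate name to sampling site.
    """
    close = [(a, b) for a, b, d in pairs if d <= threshold]
    if not close:
        return float("nan")
    same = sum(1 for a, b in close if site_of[a] == site_of[b])
    return same / len(close)


def permuted_null(pairs, site_of, threshold, n_perm=100, seed=0):
    """Shuffle site labels across isolates n_perm times and recompute the
    proportion each time to obtain a null expectation."""
    rng = random.Random(seed)
    isolates = list(site_of)
    labels = [site_of[i] for i in isolates]
    null = []
    for _ in range(n_perm):
        shuffled = labels[:]
        rng.shuffle(shuffled)
        null.append(same_site_proportion(pairs, dict(zip(isolates, shuffled)), threshold))
    return null
```

An observed proportion that lies above most of the permuted values at small thresholds would suggest that closely related isolates share a site more often than expected by chance.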

Using the recombination-free phylogenies generated by GUBBINS for each SC, Path-O-Gen was used for examining signs of a temporal signal (http://tree.bio.ed.ac.uk/software/pathogen/). When a significant positive correlation between the dates of isolation and root-to-tip divergence was observed, the alignment of polymorphisms caused by point mutations was analyzed using Bayesian Evolution Analysis Utility (BEAUti) v.1.8.2 and Bayesian Evolution Analysis by Sampling Trees (BEAST) v.1.8.2 (54, 55). These were supplemented by the numbers of invariant A, C, G, and T nucleotides and were considered in the Bayesian estimates. SNPs were extracted from the recombination-free core genome alignment, and mutation rates were calculated with BEAST using the skyline population size prior, a relaxed lognormal (uncorrelated) clock model, and a GTR model of nucleotide substitution. The clock rate was estimated from the data. We ran the chains for 350 million generations, sampling every 35,000 generations. The initial 10% of the samples from the beginning of each run were treated as burn-in and removed from the analysis. The output for each chain was checked using Tracer (http://tree.bio.ed.ac.uk/software/tracer/) to ensure that effective sample size (ESS) values for all parameters were greater than 200. The maximum clade credibility tree was generated using TreeAnnotator v.1.8.2 (as implemented in BEAST) and visualized using FigTree. The changes in the effective population size for each cluster were estimated using the Bayesian skyline plot.
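The temporal-signal check amounts to a linear regression of root-to-tip divergence on isolation date, and the chain settings imply a fixed number of retained samples. The sketch below illustrates both. It is an assumption-laden outline, not the Path-O-Gen or BEAST implementation, and the function names are hypothetical.

```python
def root_to_tip_regression(dates, divergences):
    """Ordinary least-squares fit of divergence on date.

    Returns (slope, intercept, r): the slope approximates the substitution
    rate per unit time and r is the Pearson correlation coefficient.
    """
    n = len(dates)
    if n < 3 or n != len(divergences):
        raise ValueError("need at least three paired observations")
    mean_x = sum(dates) / n
    mean_y = sum(divergences) / n
    sxx = sum((x - mean_x) ** 2 for x in dates)
    syy = sum((y - mean_y) ** 2 for y in divergences)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dates, divergences))
    if sxx == 0 or syy == 0:
        raise ValueError("dates and divergences must both vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / (sxx * syy) ** 0.5
    return slope, intercept, r


def retained_samples(chain_length, sample_every, burnin_percent):
    """Number of posterior samples left after discarding the burn-in."""
    total = chain_length // sample_every
    return total - (total * burnin_percent) // 100
```

With the settings reported above, 350 million generations sampled every 35,000 generations yields 10,000 samples, of which 9,000 remain after the 10% burn-in is removed.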

Accession number(s). Sequence data have been deposited in the European Nucleotide Archive (ENA) under study accession number ERP015405 (http://www.ebi.ac.uk/ena/) as listed in Table S1. Allelic profiles of the 25 novel STs from this collection were submitted to the pneumococcal database of the MLST website (www.pubmlst.org/spneumoniae).

SUPPLEMENTAL MATERIAL

Supplemental material for this article may be found at https://doi.org/10.1128/JCM.02453-16.

TEXT S1, PDF file, 1.2 MB.
DATA SET S1, XLSX file, 0.1 MB.
DATA SET S2, XLSX file, 0.08 MB.

ACKNOWLEDGMENTS

We thank the principal investigators and surveillance officers at the 10 participating ABCs sites and the ABCs epidemiology and Streptococcal Laboratory teams at the CDC. We also thank C. Thompson and L. Kagedan for their assistance with the MiSeq. Computations were performed on the Odyssey cluster supported by the FAS Division of Science Research Computing group at Harvard University.

Author contributions were as follows: W.P.H. conceived of and supervised the study. C.P.A. and A.C. undertook the DNA extraction, library preparation, and genome sequencing. C.P.A., P.K.M., Q.C., J.C., C.C., and W.P.H. analyzed the data. C.P.A., L.M., B.W.B., and W.P.H. wrote the manuscript. All authors read and approved the manuscript.

The authors declare no conflict of interests.

This work was supported by the National Institute of Allergy and Infectious Diseases of the National Institutes of Health (NIH) under award no. R01 AI106786-01 (to W.P.H.). P.K.M. was supported by the NIH Initiative to Maximize Student Diversity under award no. GM055353-14. The content is solely the responsibility of the authors and does not necessarily represent the official views of the US NIH or the Centers for Disease Control and Prevention.

REFERENCES

1. Bentley SD, Aanensen DM, Mavroidi A, Saunders D, Rabbinowitsch E, Collins M, Donohoe K, Harris D, Murphy L, Quail MA, Samuel G, Skovsted IC, Kaltoft MS, Barrell B, Reeves PR, Parkhill J, Spratt BG. 2006. Genetic analysis of the capsular biosynthetic locus from all 90 pneumococcal serotypes. PLoS Genet 2:e31. https://doi.org/10.1371/journal.pgen.0020031.

2. Hausdorff WP, Feikin DR, Klugman KP. 2005. Epidemiological differences among pneumococcal serotypes. Lancet Infect Dis 5:83–93. https://doi.org/10.1016/S1473-3099(05)70083-9.

3. Hsu KK, Pelton SI. 2003. Heptavalent pneumococcal conjugate vaccine: current and future impact. Expert Rev Vaccines 2:619–631. https://doi.org/10.1586/14760584.2.5.619.

4. Pilishvili T, Lexau C, Farley MM, Hadler J, Harrison LH, Bennett NM, Reingold A, Thomas A, Schaffner W, Craig AS, Smith PJ, Beall BW, Whitney CG, Moore MR, Active Bacterial Core Surveillance/Emerging Infections Program Network. 2010. Sustained reductions in invasive pneumococcal disease in the era of conjugate vaccine. J Infect Dis 201:32–41. https://doi.org/10.1086/648593.

5. Hanage WP, Finkelstein JA, Huang SS, Pelton SI, Stevenson AE, Kleinman K, Hinrichsen VL, Fraser C. 2010. Evidence that pneumococcal serotype replacement in Massachusetts following conjugate vaccination is now complete. Epidemics 2:80–84. https://doi.org/10.1016/j.epidem.2010.03.005.

6. Pichon B, Ladhani SN, Slack MPE, Segonds-Pichon A, Andrews NJ, Waight PA, Miller E, George R. 2013. Changes in molecular epidemiology of Streptococcus pneumoniae causing meningitis following introduction of pneumococcal conjugate vaccination in England and Wales. J Clin Microbiol 51:820–827. https://doi.org/10.1128/JCM.01917-12.

7. Gladstone RA, Jefferies JM, Tocheva AS, Beard KR, Garley D, Chong WW, Bentley SD, Faust SN, Clarke SC. 2015. Five winters of pneumococcal serotype replacement in UK carriage following PCV introduction. Vaccine 33:2015–2021. https://doi.org/10.1016/j.vaccine.2015.03.012.

8. Gertz RE, Li Z, Pimenta FC, Jackson D, Juni BA, Lynfield R, Jorgensen JH, Carvalho Mda G, Beall BW, Active Bacterial Core Surveillance Team. 2010. Increased penicillin nonsusceptibility of nonvaccine-serotype invasive pneumococci other than serotypes 19A and 6A in post-7-valent conjugate vaccine era. J Infect Dis 201:770–775. https://doi.org/10.1086/650496.

9. Hanage WP, Fraser C, Tang J, Connor TR, Corander J. 2009. Hyper-recombination, diversity, and antibiotic resistance in pneumococcus. Science 324:1454–1457. https://doi.org/10.1126/science.1171908.

10. Brueggemann AB, Pai R, Crook DW, Beall B. 2007. Vaccine escape recombinants emerge after pneumococcal vaccination in the United States. PLoS Pathog 3:e168. https://doi.org/10.1371/journal.ppat.0030168.

11. Ansaldi F, Canepa P, de Florentiis D, Bandettini R, Durando P, Icardi G. 2011. Increasing incidence of Streptococcus pneumoniae serotype 19A and emergence of two vaccine escape recombinant ST695 strains in Liguria, Italy, 7 years after implementation of the 7-valent conjugated vaccine. Clin Vaccine Immunol 18:343–345. https://doi.org/10.1128/CVI.00383-10.

12. Golubchik T, Brueggemann AB, Street T, Gertz RE, Jr, Spencer CCA, Ho T, Giannoulatou E, Link-Gelles R, Harding RM, Beall B, Peto TEA, Moore MR, Donnelly P, Crook DW, Bowden R. 2012. Pneumococcal genome sequencing tracks a vaccine escape variant formed through a multi-fragment recombination event. Nat Genet 44:352–355. https://doi.org/10.1038/ng.1072.

13. Wyres KL, Lambertsen LM, Croucher NJ, McGee L, von Gottberg A, Liñares J, Jacobs MR, Kristinsson KG, Beall BW, Klugman KP, Parkhill J, Hakenbeck R, Bentley SD, Brueggemann AB. 2013. Pneumococcal capsular switching: a historical perspective. J Infect Dis 207:439–449. https://doi.org/10.1093/infdis/jis703.

14. Cheng L, Connor TR, Sirén J, Aanensen DM, Corander J. 2013. Hierarchical and spatially explicit clustering of DNA sequences with BAPS software. Mol Biol Evol 30:1224–1228. https://doi.org/10.1093/molbev/mst028.

15. Serrano I, Melo-Cristino J, Carriço JA, Ramirez M. 2005. Characterization of the genetic lineages responsible for pneumococcal invasive disease in Portugal. J Clin Microbiol 43:1706–1715. https://doi.org/10.1128/JCM.43.4.1706-1715.2005.

16. McGee L, McDougal L, Zhou J, Spratt BG, Tenover FC, George R, Hakenbeck R, Hryniewicz W, Lefévre JC, Tomasz A, Klugman KP. 2001. Nomenclature of major antimicrobial-resistant clones of Streptococcus pneumoniae defined by the pneumococcal molecular epidemiology network. J Clin Microbiol 39:2565–2571. https://doi.org/10.1128/JCM.39.7.2565-2571.2001.

17. Beall B, McEllistrem MC, Gertz RE, Boxrud DJ, Besser JM, Harrison LH, Jorgensen JH, Whitney CG, Active Bacterial Core Surveillance/Emerging Infections Program Network. 2002. Emergence of a novel penicillin-nonsusceptible, invasive serotype 35B clone of Streptococcus pneumoniae within the United States. J Infect Dis 186:118–122. https://doi.org/10.1086/341072.

18. Kapatai G, Sheppard CL, Al-Shahib A, Litt DJ, Underwood AP, Harrison TG, Fry NK. 2016. Whole genome sequencing of Streptococcus pneumoniae: development, evaluation and verification of targets for serogroup and serotype prediction using an automated pipeline. PeerJ 4:e2477. https://doi.org/10.7717/peerj.2477.

19. Croucher NJ, Harris SR, Fraser C, Quail MA, Burton J, van der Linden M, McGee L, von Gottberg A, Song JH, Ko KS, Pichon B, Baker S, Parry CM, Lambertsen LM, Shahinas D, Pillai DR, Mitchell TJ, Dougan G, Tomasz A, Klugman KP, Parkhill J, Hanage WP, Bentley SD. 2011. Rapid pneumococcal evolution in response to clinical interventions. Science 331:430–434. https://doi.org/10.1126/science.1198545.

20. Croucher NJ, Finkelstein JA, Pelton SI, Mitchell PK, Lee GM, Parkhill J, Bentley SD, Hanage WP, Lipsitch M. 2013. Population genomics of post-vaccine changes in pneumococcal epidemiology. Nat Genet 45:656–663. https://doi.org/10.1038/ng.2625.

21. Chewapreecha C, Harris SR, Croucher NJ, Turner C, Marttinen P, Cheng L, Pessia A, Aanensen DM, Mather AE, Page AJ, Salter SJ, Harris D, Nosten F, Goldblatt D, Corander J, Parkhill J, Turner P, Bentley SD. 2014. Dense genomic sampling identifies highways of pneumococcal recombination. Nat Genet 46:305–309. https://doi.org/10.1038/ng.2895.

22. Brown DR, Barton G, Pan Z, Buck M, Wigneshweraraj S. 2014. Nitrogen stress response and stringent response are coupled in Escherichia coli. Nat Commun 5:4115. https://doi.org/10.1038/ncomms5115.

23. Hauryliuk V, Atkinson GC, Murakami KS, Tenson T, Gerdes K. 2015. Recent functional insights into the role of (p)ppGpp in bacterial physiology. Nat Rev Microbiol 13:298–309. https://doi.org/10.1038/nrmicro3448.

24. Li Y, Croucher NJ, Thompson CM, Trzcinski K, Hanage WP, Lipsitch M. 2015. Identification of pneumococcal colonization determinants in the stringent response pathway facilitated by genomic diversity. BMC Genomics 16:369. https://doi.org/10.1186/s12864-015-1573-6.

25. Mostowy R, Croucher NJ, Hanage WP, Harris SR, Bentley S, Fraser C. 2014. Heterogeneity in the frequency and characteristics of homologous recombination in pneumococcal evolution. PLoS Genet 10:e1004300. https://doi.org/10.1371/journal.pgen.1004300.

26. Metcalf BJ, Gertz RE, Gladstone RA, Walker H, Sherwood LK, Jackson D, Li Z, Law C, Hawkins PA, Chochua S, Sheth M, Rayamajhi N, Bentley SD, Kim L, Whitney CG, McGee L, Beall B, Active Bacterial Core Surveillance Team. 2016. Strain features and distributions in pneumococci from children with invasive disease before and after 13-valent conjugate vaccine implementation in the USA. Clin Microbiol Infect 22:60.e9–60.e29. https://doi.org/10.1016/j.cmi.2015.08.027.

27. Olarte L, Kaplan SL, Barson WJ, Romero JR, Lin PL, Tan TQ, Hoffman JA, Bradley JS, Givner LB, Mason EO, Hultén KG. 9 November 2016. Emergence of multidrug-resistant pneumococcal serotype 35B among U.S. children. J Clin Microbiol https://doi.org/10.1128/JCM.01778-16.

28. Clinical and Laboratory Standards Institute. 2008. Performance standards for antimicrobial susceptibility testing, 16th informational supplement M100-S16. Clinical and Laboratory Standards Institute, Wayne, PA.

29. Weinstein MP, Klugman KP, Jones RN. 2009. Rationale for revised penicillin susceptibility breakpoints versus Streptococcus pneumoniae: coping with antimicrobial susceptibility in an era of resistance. Clin Infect Dis 48:1596–1600. https://doi.org/10.1086/598975.

30. Venkateswaran PS, Stanton N, Austrian R. 1983. Type variation of strains of Streptococcus pneumoniae in capsular serogroup 15. J Infect Dis 147:1041–1054. https://doi.org/10.1093/infdis/147.6.1041.

31. van Selm S, van Cann LM, Kolkman MAB, van der Zeijst BAM, van Putten JPM. 2003. Genetic basis for the structural difference between Streptococcus pneumoniae serotype 15B and 15C capsular polysaccharides. Infect Immun 71:6192–6198. https://doi.org/10.1128/IAI.71.11.6192-6198.2003.

32. Inouye M, Dashnow H, Raven L-A, Schultz MB, Pope BJ, Tomita T, Zobel J, Holt KE. 2014. SRST2: rapid genomic surveillance for public health and hospital microbiology labs. Genome Med 6:90. https://doi.org/10.1186/s13073-014-0090-6.

33. Enright MC, Spratt BG. 1998. A multilocus sequence typing scheme for Streptococcus pneumoniae: identification of clones associated with serious invasive disease. Microbiology 144(Part 11):3049–3060. https://doi.org/10.1099/00221287-144-11-3049.

34. Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS, Lesin VM, Nikolenko SI, Pham S, Prjibelski AD, Pyshkin AV, Sirotkin AV, Vyahhi N, Tesler G, Alekseyev MA, Pevzner PA. 2012. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J Comput Biol 19:455–477. https://doi.org/10.1089/cmb.2012.0021.

35. Seemann T. 2014. Prokka: rapid prokaryotic genome annotation. Bioinformatics 30:2068–2069. https://doi.org/10.1093/bioinformatics/btu153.

36. Contreras-Moreira B, Vinuesa P. 2013. GET_HOMOLOGUES, a versatile software package for scalable and robust microbial pangenome analysis. Appl Environ Microbiol 79:7696–7701. https://doi.org/10.1128/AEM.02411-13.

37. Page AJ, Cummins CA, Hunt M, Wong VK, Reuter S, Holden MT, Fookes M, Keane JA, Parkhill J. 2015. Roary: rapid large-scale prokaryote pan genome analysis. Bioinformatics 31:3691–3693. https://doi.org/10.1093/bioinformatics/btv421.

38. Katoh K, Misawa K, Kuma K, Miyata T. 2002. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res 30:3059–3066. https://doi.org/10.1093/nar/gkf436.

39. Stamatakis A. 2014. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30:1312–1313. https://doi.org/10.1093/bioinformatics/btu033.

40. Tavaré S. 1986. Some probabilistic and statistical problems in the analysis of DNA sequences, p 57–86. In Miura R (ed), Lectures on mathematics in the life sciences. American Mathematical Society, Providence, RI.

41. Corander J, Marttinen P, Sirén J, Tang J. 2008. Enhanced Bayesian modelling in BAPS software for learning genetic structures of populations. BMC Bioinformatics 9:539. https://doi.org/10.1186/1471-2105-9-539.

42. Castillo-Ramírez S, Corander J, Marttinen P, Aldeljawi M, Hanage WP, Westh H, Boye K, Gulay Z, Bentley SD, Parkhill J, Holden MT, Feil EJ. 2012. Phylogeographic variation in recombination rates within a global clone of methicillin-resistant Staphylococcus aureus. Genome Biol 13:R126. https://doi.org/10.1186/gb-2012-13-12-r126.

43. Willems RJL, Top J, van Schaik W, Leavis H, Bonten M, Sirén J, Hanage WP, Corander J. 2012. Restricted gene flow among hospital subpopulations of Enterococcus faecium. mBio 3:e00151-12. https://doi.org/10.1128/mBio.00151-12.

44. McNally A, Cheng L, Harris SR, Corander J. 2013. The evolutionary path to extraintestinal pathogenic, drug-resistant Escherichia coli is marked by drastic reduction in detectable recombination within the core genome. Genome Biol Evol 5:699–710. https://doi.org/10.1093/gbe/evt038.

45. Francisco AP, Vaz C, Monteiro PT, Melo-Cristino J, Ramirez M, Carriço JA. 2012. PHYLOViZ: phylogenetic inference and data visualization for sequence based typing methods. BMC Bioinformatics 13:87. https://doi.org/10.1186/1471-2105-13-87.

46. Simpson EH. 1949. Measurement of diversity. Nature 163:688. https://doi.org/10.1038/163688a0.

47. Grundmann H, Hori S, Tanner G. 2001. Determining confidence intervals when measuring genetic diversity and the discriminatory abilities of typing methods for microorganisms. J Clin Microbiol 39:4190–4192. https://doi.org/10.1128/JCM.39.11.4190-4192.2001.

48. Jolley KA, Wilson DJ, Kriz P, McVean G, Maiden MCJ. 2005. The influence of mutation, recombination, population history, and selection on patterns of genetic diversity in Neisseria meningitidis. Mol Biol Evol 22:562–569. https://doi.org/10.1093/molbev/msi041.

49. Croucher NJ, Page AJ, Connor TR, Delaney AJ, Keane JA, Bentley SD, Parkhill J, Harris SR. 2014. Rapid phylogenetic analysis of large samples of recombinant bacterial whole-genome sequences using Gubbins. Nucleic Acids Res 43:e15. https://doi.org/10.1093/nar/gku1196.

50. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R, 1000 Genome Project Data Processing Subgroup. 2009. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25:2078–2079. https://doi.org/10.1093/bioinformatics/btp352.

51. Danecek P, Auton A, Abecasis G, Albers CA, Banks E, DePristo MA, Handsaker RE, Lunter G, Marth GT, Sherry ST, McVean G, Durbin R, 1000 Genomes Project Analysis Group. 2011. The variant call format and VCFtools. Bioinformatics 27:2156–2158. https://doi.org/10.1093/bioinformatics/btr330.

52. Gupta SK, Padmanabhan BR, Diene SM, Lopez-Rojas R, Kempf M, Landraud L, Rolain J-M. 2014. ARG-ANNOT, a new bioinformatic tool to discover antibiotic resistance genes in bacterial genomes. Antimicrob Agents Chemother 58:212–220. https://doi.org/10.1128/AAC.01310-13.

53. R Core Team. 2013. R: a language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria.

54. Drummond AJ, Rambaut A. 2007. BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol Biol 7:214. https://doi.org/10.1186/1471-2148-7-214.

55. Drummond AJ, Suchard MA, Xie D, Rambaut A. 2012. Bayesian phylogenetics with BEAUti and the BEAST 1.7. Mol Biol Evol 29:1969–1973. https://doi.org/10.1093/molbev/mss075.

56. Croucher NJ, Klugman KP. 2014. The emergence of bacterial “hopeful monsters”. mBio 5:e01550-14. https://doi.org/10.1128/mBio.01550-14.
