PATHOGENOMICS: PUBLIC HEALTH APPLICATIONSbioinformatica.uab.cat/.../Pathogenomics2016...6_5.pdf ·...

15
PATHOGENOMICS: PUBLIC HEALTH APPLICATIONS By: Andrea Cardenas

Transcript of PATHOGENOMICS: PUBLIC HEALTH APPLICATIONSbioinformatica.uab.cat/.../Pathogenomics2016...6_5.pdf ·...

Page 1: PATHOGENOMICS: PUBLIC HEALTH APPLICATIONSbioinformatica.uab.cat/.../Pathogenomics2016...6_5.pdf · Public health applications ... HISTORY OF PATHOGENOMICS ... is a powerful tool for

PATHOGENOMICS: PUBLIC HEALTH APPLICATIONS

By: Andrea Cardenas

Page 2: PATHOGENOMICS: PUBLIC HEALTH APPLICATIONSbioinformatica.uab.cat/.../Pathogenomics2016...6_5.pdf · Public health applications ... HISTORY OF PATHOGENOMICS ... is a powerful tool for

OUTLINE

➤ Introduction

➤ History

➤ Sequencing

➤ Public health applications

➤ Case studies

➤ Challenges

Page 3: PATHOGENOMICS: PUBLIC HEALTH APPLICATIONSbioinformatica.uab.cat/.../Pathogenomics2016...6_5.pdf · Public health applications ... HISTORY OF PATHOGENOMICS ... is a powerful tool for

WHAT IS PATHOGENOMICS?

➤ The application of genome sequencing technologies to the characterization and analysis of pathogenic organisms

➤ Aims to understand:

➤ Microbe diversity

➤ Microbe interactions

➤ Host-microbe interactions and their involvement in disease states

Page 4: PATHOGENOMICS: PUBLIC HEALTH APPLICATIONSbioinformatica.uab.cat/.../Pathogenomics2016...6_5.pdf · Public health applications ... HISTORY OF PATHOGENOMICS ... is a powerful tool for

HISTORY OF PATHOGENOMICS

➤ 1995: Haemophilus influenzae is the first free-living orgnanism to have its genome sequenced by The Institute for Genomic Research (TIGR)

➤ One of the first examples proving random shotgun sequencing could be used for whole genomes

➤ 1.8 Mb genome

➤ Found to contain 1740 protein-coding genes, 2 transfer RNA genes, and 18 other RNA genes

Inu::cs':SASwl3) The two X libraries constructed from

H. influenzae genomic DNA were probedwith oligonucleotides designed from theends of contig groups (27). The positiveplaques were then used to prepare tem-plates, and the sequence was determinedfrom each end of the X clone insert. Thesesequence fragments were searched withGRASTA against a database of all contigs.Two contigs that matched the sequencefrom the opposite ends of the same X clonewere ordered. The X clone then providedthe template for closure of the sequence gapbetween the adjacent contigs.

4) To confirm the order of contigs foundby the other approaches and establish theorder of the remaining contigs, we per-formed amplifications by polymerase chainreaction (PCR), both standard and longrange (XL) (28). Although a PCR reactionwas done for essentially every combinationof physical gap ends, techniques such asDNA fingerprinting, database matching,and the probing of large insert clones wereparticularly valuable in ordering contigs ad-jacent to each other and reducing the num-ber of combinatorial PCRs necessary toachieve complete gap closure. Use of thesestrategies to an even greater extent in futuregenome projects will increase the overallefficiency of complete genome closure. Inthe program ASM_ALIGN Southern anal-ysis data, identification of peptide links,forward and reverse sequence data from Xclones, and PCR data are used to establishthe relative order of the contigs separatedby physical gaps. The number of physicalgaps ordered and closed by each of thesetechniques is summarized in Table 2.

Lambda clones were a central feature forcompletion of the genome sequence andassembly. It was probable that some frag-ments of the H. influenzae genome would benonclonable in a high copy plasmid becausethey would produce deleterious proteins inthe E. coli host cells. Lytic X clones wouldprovide DNA for these segments becausesuch genes would not inhibit plaque pro-duction. Furthermore, sequence informa-tion from the ends of 15- to 20-kb clones isparticularly suitable for gap closure and pro-viding general confirmation of genome as-sembly. Because of their size, they would belikely to span any physical gap. Approxi-mately 100 random plaques were pickedfrom the amplified X library, templates wereprepared, and sequence information was ob-tained from each end. These sequenceswere searched (GRASTA) against the con-tigs and linked in the database to theirappropriate contig, thus providing a scaf-folding of X clones that contributed addi-tional support to the accuracy of the ge-nome assembly (Fig. 1). In addition to con-firmation of the contig structure, the Xclones provided closure for 23 physical gaps.

Approximately 78 percent of the genomewas covered by X clones.

The X clones were particularly useful forsolving repeat structures. All repeat struc-tures identified in the genome were smallenough to be spanned by a single clonefrom the random insert library, except forthe six ribosomal RNA (rRNA) operonsand one repeat (two copies) that was 5340bp in length. The ability to distinguish andassemble the six rRNA operons of H. influ-enzae (each containing in order 16S, 23S,and 5S subunit genes) was a test of ouroverall strategy to sequence and assemble acomplex genome that might contain a sig-nificant number of repeat regions. The highdegree of sequence similarity and the lengthof the six operons caused the assembly pro-cess to cluster all the underlying sequencesinto a few indistinguishable contigs. To de-

termine the correct placement of the oper-ons in the sequence, unique sequences wereidentified at the 5S ends. Oligonucleotideprimers were designed from these six flank-ing regions and used to probe the two Xlibraries. For five of the six rRNA operonsat least one positive plaque was identifiedthat completely spanned the rRNA operonand contained uniquely identifying flankingsequence at the 16S and 5S ends. Theseplaques provided the templates for obtain-ing the sequence for these rRNA operons.For rrnA a plaque was identified that con-tained the particular 5S end and terminatedin the 16S end. The 16S end of rrnA wasobtained by PCR from H. influenzae Rdgenomic DNA.An additional confirmation of the global

structure of the assembled circular genomewas obtained by comparing a computer-

Smal1

Rsr II

1500000 ,

1400000-Sma -

1300000

Rsr II I900000

Fig. 1. A circular representation of the H. influenzae Rd chromosome illustrating the location of eachpredicted coding region containing a database match as well as selected global features of the genome.Outer perimeter: The location of the unique Not restriction site (designated as nucleotide 1), the Rsr IIsites, and the Sma sites. Outer concentric circle: Coding regions for which a gene identification wasmade. Each coding region location is classified as to role according to the color code in Fig. 2. Secondconcentric circle: Regions of high G+C content (>42 percent, red; >40 percent, blue) and high A+Tcontent (>66 percent, black; >64 percent, green). Third concentric circle: Coverage by X clones (blue).More than 300 X clones were sequenced from each end to confirm the overall structure of the genomeand identify the six ribosomal operons. Fourth concentric circle: The locations of the six ribosomaloperons (green), the tRNAs (black) and the cryptic mu-like prophage (blue). Fifth concentric circle: Simpletandem repeats. The locations of the following repeats are shown: CTGGCT, GTCT, ATT, AATGGC,TTGA, TTGG, TTTA, TTATC ,TGAC, TCGTC, AACC, TTGC, CAAT, CCAA. The putative origin ofreplication is illustrated by the outward pointing arrows (green) originating near base 603,000. Twopotential termination sequences are shown near the opposite midpoint of the circle (red).

400000

500000

SCIENCE * VOL. 269 * 28 JULY 1995 507

on

June

6, 2

012

ww

w.s

cien

cem

ag.o

rgD

ownl

oade

d fro

m

Fleischmann R. Science 269, no. 5223 (1995): 496.

Page 5: PATHOGENOMICS: PUBLIC HEALTH APPLICATIONSbioinformatica.uab.cat/.../Pathogenomics2016...6_5.pdf · Public health applications ... HISTORY OF PATHOGENOMICS ... is a powerful tool for

HOW ARE PATHOGEN GENOMES SEQUENCED?

revealed evidence for parallel adaptive evolution [35]. Seventeengenes accrued three or more mutations across the 14 patients, ofwhich a significant excess altered the encoded protein. Some ofthese mutations affected important phenotypes, including oxygen-dependent gene regulation—which may be pertinent to lunginfection—antibiotic resistance, and outer membrane synthesis.Mutations not previously implicated in pathogenesis present noveltherapeutic targets.

In hosts colonized multiple times by distinct genotypes, wholegenome sequencing affords an opportunity to investigate recom-bination in vivo. Horizontal gene transfer, also known asrecombination, is a fundamental process that generates diversityand facilitates the spread of advantageous genes [36,37]. Alongitudinal study of the highly promiscuous gut pathogenHelicobacter pylori identified recombination events from the cluster-ing of SNP differences introduced by the import of DNA from one

strain to another [28]. Surprisingly, multiple fragments of around400 bases appeared to be simultaneously imported in a span up to20 kilobases long. This pattern of integration was implied by theresults of a similar study in Streptococcus pneumoniae [38], demon-strating the power of whole genome sequencing to illuminatemolecular mechanisms.

Detection of Transmission Events

Whole genome sequencing offers unprecedented resolution todistinguish degrees of relatedness among bacterial isolates, and thisis a powerful tool for microbial forensics [39]. Genome sequencingcomplements existing epidemiological tools by providing a meansto reconstruct recent chains of transmission, identify sequentialacquisition of strains by persistent carriers, and identify crypticoutbreaks that might otherwise go unnoticed.

Figure 1. An example workflow for high-throughput whole genome sequencing in bacteria. Sample collection. A biological sample (e.g.,blood) is collected. Culture. Bacterial colonies are isolated from the sample by culturing on appropriate media. DNA Preparation. DNA is extracted fromthe colonies and a DNA library is prepared ready for sequencing. High-Throughput Sequencing. Millions of short sequence reads are yielded, typicallyseveral hundred nucleotides long or less. To reconstruct the genome, one of two approaches is generally adopted. Mapping to Reference Genome. Inreference-based mapping, the short sequences are mapped (i.e., aligned) to a reference genome using an algorithm (e.g., [73,74]). Preferably thereference genome is high quality, complete, and closely related. The pie chart illustrates that not all reads necessarily map to the reference genome(e.g., because of novel regions not present in the reference). Filtering. Short reads cannot be mapped reliably to repetitive regions of the referencegenome, so these are identified and filtered out. Sites that are problematic for other reasons (e.g., because too few reads have mapped or becausethe consensus nucleotide is ambiguous) are also filtered out. The pie chart illustrates that some portion of the reference genome does not get calleddue to filtering. In the mapped genome, these positions will receive an ambiguity code (i.e., N rather than A, C, G, or T). De novo Assembly of Contigs.An alternative to mapping is de novo assembly, in which no reference genome is used. An algorithm (e.g., [75,76]) is used to assemble short readsinto longer sequences known as contigs. The number and length of contigs will depend on general factors such as the length of sequence reads andthe total amount of DNA sequence produced, as well as local factors such as the presence of repetitive regions. The pie chart shows an example ofthe proportion of all reads that assemble into contigs of a given length. Alignment. For further analysis, it is necessary to align local regions (e.g.,genes) or whole genomes using appropriate algorithms (e.g., [77–79]). There is a trade-off in computational terms between the length of region andthe number of sequences that can be aligned. Sequence Analysis. The two approaches produce sequence alignments that represent pairwisealignments against a reference (mapping) or multiple alignments one to another (de novo assembly). These alignments can be analyzed directly, orprocessed further to detect variants such as single nucleotide polymorphisms, insertions, and deletions. The pie charts are meant to be illustrativeonly, and were produced from data in [27].doi:10.1371/journal.ppat.1002874.g001

PLOS Pathogens | www.plospathogens.org 3 September 2012 | Volume 8 | Issue 9 | e1002874

➤ Illumina- HiSeq and MiSeq

➤ Semi-conductor sequencing

➤ Single molecule sequencing

Wilson, D. J. I. PLoS Pathog. 8, (2012).

Page 6: PATHOGENOMICS: PUBLIC HEALTH APPLICATIONSbioinformatica.uab.cat/.../Pathogenomics2016...6_5.pdf · Public health applications ... HISTORY OF PATHOGENOMICS ... is a powerful tool for

WHAT CAN WE DO WITH THIS INFORMATION???

Page 136 | Pathogen Genomics Into Practice

Patient diagnosis

Local & national surveillance

Tracking pathogen spread & movement

Improved knowledge

Local outbreak detection or exclusion

Vaccine & therapeutic development

pathogen sequence data; for example through improved understanding of pathogens (e.g. their evolution, transmission), or for vaccine and therapeutic development.

Figure 15.1 The multifaceted applications of pathogen genomic data

15.2 Why genomic data sharing and aggregation is essential for developing and delivering maximally effective services to manage infectious disease

The theoretical advantage offered by the use of genomic data for microbiological investigations is that it is capable of providing a higher resolution and more accurate description of the clinically and epidemiologically relevant properties of an individual pathogen, and also an understanding of how these relate to the properties of other relevant pathogen samples; for example within a patient at a single point in time; within a patient over time; in an outbreak; or within a community / population. It is important to appreciate, however, that the extent to which this theoretical benefit is translated into a real world advantage over existing methodologies in terms of the sensitivity and specificity of diagnostic or discriminatory microbiological tests depends on the depth and breadth of genomic data available to support the investigation being undertaken.

DATA

Pathogen Genomics Into Practice PHG Foundation (2015)

Page 7: PATHOGENOMICS: PUBLIC HEALTH APPLICATIONSbioinformatica.uab.cat/.../Pathogenomics2016...6_5.pdf · Public health applications ... HISTORY OF PATHOGENOMICS ... is a powerful tool for

APPLICATION: OUTBREAK INVESTIGATION AND CONTROL

➤ Retrospective analysis of isolates from an MRSA outbreak in an NICU

➤ Problem: Isolates of MRSA of the same lineage were not distinguishable by current genotyping techniques

➤ Aim: to see if high throughput sequencing technology with a clinically relevant turnaround time could be used to define transmission pathways and characterize outbreaks

T h e n e w e ngl a nd j o u r na l o f m e dic i n e

n engl j med 366;24 nejm.org june 14, 2012 2267

original article

Rapid Whole-Genome Sequencing for Investigation of a Neonatal MRSA OutbreakClaudio U. Köser, B.A., Matthew T.G. Holden, Ph.D., Matthew J. Ellington, D.Phil.,

Edward J.P. Cartwright, M.B., B.S., Nicholas M. Brown, M.D., Amanda L. Ogilvy-Stuart, F.R.C.P., Li Yang Hsu, M.R.C.P.,

Claire Chewapreecha, B.A., Nicholas J. Croucher, M.A., Simon R. Harris, Ph.D., Mandy Sanders, B.Sc., Mark C. Enright, Ph.D.,

Gordon Dougan, Ph.D., Stephen D. Bentley, Ph.D., Julian Parkhill, Ph.D., Louise J. Fraser, Ph.D., Jason R. Betley, Ph.D., Ole B. Schulz-Trieglaff, Ph.D.,

Geoffrey P. Smith, Ph.D., and Sharon J. Peacock, Ph.D., F.R.C.P.

From the University of Cambridge, Cam-bridge (C.U.K., E.J.P.C., S.J.P.), Health Protection Agency, Cambridge (C.U.K., M.J.E., E.J.P.C., N.M.B., S.J.P.), Wellcome Trust Sanger Institute, Hinxton (M.T.G.H., C.C., N.J.C., S.R.H., M.S., G.D., S.D.B., J.P., S.J.P.), Cambridge University Hospi-tals National Health Service Foundation Trust, Cambridge (N.M.B., A.L.O.-S., S.J.P.), Illumina (Cambridge), Little Ches-terford (L.J.F., J.R.B., O.B.S.-T., G.P.S.), and AmpliPhi Biosciences, Sharnbrook (M.C.E.) — all in the United Kingdom; and the National University Health Sys-tem, Singapore (L.Y.H.). Address reprint requests to Dr. Peacock at the Depart-ment of Medicine, University of Cam-bridge, Level 5, Box 157, Addenbrooke’s Hospital, Cambridge CB2 0QQ, United Kingdom, or at [email protected].

Mr. Köser and Drs. Holden, Ellington, and Cartwright contributed equally to this article.

N Engl J Med 2012;366:2267-75.Copyright © 2012 Massachusetts Medical Society.

A bs tr ac t

BackgroundIsolates of methicillin-resistant Staphylococcus aureus (MRSA) belonging to a single lineage are often indistinguishable by means of current typing techniques. Whole-genome sequencing may provide improved resolution to define transmission path-ways and characterize outbreaks.

MethodsWe investigated a putative MRSA outbreak in a neonatal intensive care unit. By using rapid high-throughput sequencing technology with a clinically relevant turnaround time, we retrospectively sequenced the DNA from seven isolates associated with the outbreak and another seven MRSA isolates associated with carriage of MRSA or bacteremia in the same hospital.

ResultsWe constructed a phylogenetic tree by comparing single-nucleotide polymorphisms (SNPs) in the core genome to a reference genome (an epidemic MRSA clone, EMRSA-15 [sequence type 22]). This revealed a distinct cluster of outbreak isolates and clear separation between these and the nonoutbreak isolates. A previously missed trans-mission event was detected between two patients with bacteremia who were not part of the outbreak. We created an artificial “resistome” of antibiotic-resistance genes and demonstrated concordance between it and the results of phenotypic sus-ceptibility testing; we also created a “toxome” consisting of toxin genes. One outbreak isolate had a hypermutator phenotype with a higher number of SNPs than the other outbreak isolates, highlighting the difficulty of imposing a simple threshold for the number of SNPs between isolates to decide whether they are part of a recent transmission chain.

ConclusionsWhole-genome sequencing can provide clinically relevant data within a time frame that can influence patient care. The need for automated data interpretation and the provision of clinically meaningful reports represent hurdles to clinical implementa-tion. (Funded by the U.K. Clinical Research Collaboration Translational Infection Research Initiative and others.)

The New England Journal of Medicine Downloaded from nejm.org at UAB-SERVEI DE BIBLIOTEQUES on December 21, 2016. For personal use only. No other uses without permission.

Copyright © 2012 Massachusetts Medical Society. All rights reserved.

Page 8: PATHOGENOMICS: PUBLIC HEALTH APPLICATIONSbioinformatica.uab.cat/.../Pathogenomics2016...6_5.pdf · Public health applications ... HISTORY OF PATHOGENOMICS ... is a powerful tool for

➤ Sequenced 14 MRSA isolates in hospital using MiSeq

➤ 10/14 were of same sequence type

➤ Separated into 2 groups by 102 snps

➤ 6 outbreak isolates clustered together —> matched profile of first patient

Whole-Genome Sequencing for MRSA Outbreak

n engl j med 366;24 nejm.org june 14, 2012 2273

antimicrobial-susceptibility testing. Antimicrobial phenotypes and genotypes showed concordance, providing proof of principle that whole-genome sequencing could be used (after a more extensive evaluation of larger, more diverse strain collec-tions) to guide therapy and represents a power-ful tool for the discovery of new drug-resistance mechanisms.21 We also created a toxome allowing for immediate identification of the toxin genes present in the study isolates that currently have to be determined with the use of multiple poly-merase-chain-reaction (PCR) reactions at reference laboratories.22 One of the isolates in the study was a small-colony variant, which are known to be associated with persistence and treatment fail-ure.23 The genetic defect for this phenotype was

rapidly identified, indicating that it is possible to construct large genetic data sets to better under-stand and clinically study important microbial variants. Finally, the isolation of a hypermutator strain highlights that it is not possible to use a simple cutoff of SNPs between isolates to decide whether or not they are part of a recent trans-mission chain; this information must be deter-mined from the topologic characteristics of the phylogenetic tree.

Figure 2 (facing page). Phenotypic Antibiograms, Resistome, and Toxome.

Panel A shows the antibiogram, or antimicrobial suscep-tibility pattern, for each of the 14 isolates sequenced, and Panel B shows the resistome and toxome. The sam-ples are identified with the patient number followed by a B for bacteremia or a C for carriage of MRSA, along with the sequence type (ST). For each antimicrobial drug, resistance is indicated with an R; its absence indi-cates susceptibility. The staphylococcal cassette chromo-some (SCC) mec types are also shown. In Panel B, at the top are genes responsible for drug resistance (the resistome) and toxin genes (the toxome). The toxin genes considered were limited to those that are tested for in the United Kingdom (staphylococcal enterotoxins [sea and sej], exfoliative toxins [eta and etd], toxic shock syn-drome toxin [tst], and Panton–Valentine leukocidin [PVL] [lukS-PV and lukF-PV ]). Only those genes present in at least 1 isolate are shown. Underneath these is a “heat map” showing the presence (red) or absence (blue) of the relevant genes for each of the samples (sequences used in the resistome and toxome are listed in Table S2 in the Supplementary Appendix). Antimicrobial resis-tance in MRSA can arise by way of either gene acquisi-tion or chromosomal mutations. Gene acquisition ac-counted for resistance to 10 antimicrobial drugs in the 14 isolates studied. Consistent with a prior study in our hospital of sequence type 1 community-associated MRSA,15 both transmission isolates 16B and 17B were PVL-negative and carried sea and seh. Not shown are the chromosomal mutations detected that accounted for resistance to rifampin (RIF) or ciprofloxacin (CIP): mutations in rpoB (Leu466Ser and His481Asn) account-ed for RIF resistance in isolate 14C16 and two mutations (Ser80Phe mutation in grlA and Ser84Leu in gyrA) were responsible for CIP resistance in all ST 22 and ST 36 isolates,17 and no mutations responsible for linezolid (LIN) resistance were found. CLIN denotes clindamy-cin, CXT cefoxitin, ERY erythromycin, FUS fusidic acid, GEN gentamicin, KAN kanamycin, MUP mupirocin, TET tetracycline, TMP trimethoprim, and TOB tobramycin.

19B

20B

15C

HO

509

6 04

12

12C

8C

6C

1B10C

11C

7C

Outbreak Isolates

25 SNPs

100

686896

9679

76

Figure 3. Results of Phylogenetic Analysis of the 10 MRSA Isolates of Sequence Type 22.

An unrooted maximum likelihood tree of the 10 sequence type 22 MRSA iso-lates identified, together with a reference sequence type 22 isolate (HO 5096 0412), is shown. The isolates are identified with the patient number followed by a B for bacteremia or a C for carriage of MRSA. Bootstrap values are shown in red. Outbreak isolates are circled in blue. Six of the seven outbreak iso-lates clustered closely together. The seventh isolate (6C) had an extended branch length that could be explained by the fact that it had a hypermuta-tor phenotype. The remaining sequence type 22 isolates associated with carriage by an infant in the NICU (15C) or bacteremia in patients on other wards of the same hospital (19B, 20B) were distantly related to the outbreak isolates and to each other. SNP denotes single-nucleotide polymorphism.

The New England Journal of Medicine Downloaded from nejm.org at UAB-SERVEI DE BIBLIOTEQUES on December 21, 2016. For personal use only. No other uses without permission.

Copyright © 2012 Massachusetts Medical Society. All rights reserved.

N Engl J Med 366, 2267-75 (2012)

Page 9: PATHOGENOMICS: PUBLIC HEALTH APPLICATIONSbioinformatica.uab.cat/.../Pathogenomics2016...6_5.pdf · Public health applications ... HISTORY OF PATHOGENOMICS ... is a powerful tool for

T h e n e w e ngl a nd j o u r na l o f m e dic i n e

n engl j med 366;24 nejm.org june 14, 20122272

have previously been reported to determine resis-tance to these agents.16,17 We found complete concordance between the genotypic evidence for resistance and the phenotypically derived antibi-ogram. We also assembled a toxome, by using all the toxin genes tested for at reference laborato-ries in the United Kingdom to identify those present in at least one isolate (Fig. 2B).

Discussion

The primary objective of this study was to deter-mine whether whole-genome sequencing could

distinguish between MRSA isolates associated and not associated with a putative outbreak, with the use of a sequencing platform with a clinically rel-evant turnaround time. This was achieved by recon-structing a phylogenetic tree that showed a clear distinction between isolates in the outbreak and nonoutbreak groups. This finding alone is valu-able; we also showed that the data enabled addi-tional analyses of immediate clinical value for no additional cost.

We were able to assemble a resistome of genes coding for antibiotic resistance and compare this with the resistance pattern defined by standard

B Resistome and Toxome

A Phenotypic Antibiograms

mecA ermA

ermC

aacA

-aphD

aadD

tetK

dfrG

fusC

ileS-2

sea seb sec seg seh sei tst

0 2500 5000 7500 10,000 12,500 15,000

TMP

KAN & TOB

CXT ERY &

CLIN

GEN

TET FU

S

MUP

Toxins

10C

11C

12C

6C

8C

7C

1B

19B

20B

15C

14C

17B

16B

18B

Sample ST SCCmec CXT ERY CLIN CIP GEN KAN TOB TET TMP RIF FUS MUP LIN1B 22 IVh R R R R R R R R6C 22 IVh R R R R R R R R7C 22 IVh R R R R R R R R8C 22 IVh R R R R R R R R Outbreak10C 22 IVh R R R R R R R R11C 22 IVh R R R R R R R R12C 22 IVh R R R R R R R R14C 5 IVa R R R15C 22 IVh R R R R16B 1 IVa R R R R17B 1 IVa R R R R Nonoutbreak18B 36 II R R R R R R19B 22 IVh R R R R20B 22 IVh R R R R R

Outbreak

Nonoutbreak

SampleNucleotides

The New England Journal of Medicine Downloaded from nejm.org at UAB-SERVEI DE BIBLIOTEQUES on December 21, 2016. For personal use only. No other uses without permission.

Copyright © 2012 Massachusetts Medical Society. All rights reserved.

T h e n e w e ngl a nd j o u r na l o f m e dic i n e

n engl j med 366;24 nejm.org june 14, 20122272

have previously been reported to determine resis-tance to these agents.16,17 We found complete concordance between the genotypic evidence for resistance and the phenotypically derived antibi-ogram. We also assembled a toxome, by using all the toxin genes tested for at reference laborato-ries in the United Kingdom to identify those present in at least one isolate (Fig. 2B).

Discussion

The primary objective of this study was to deter-mine whether whole-genome sequencing could

distinguish between MRSA isolates associated and not associated with a putative outbreak, with the use of a sequencing platform with a clinically rel-evant turnaround time. This was achieved by recon-structing a phylogenetic tree that showed a clear distinction between isolates in the outbreak and nonoutbreak groups. This finding alone is valu-able; we also showed that the data enabled addi-tional analyses of immediate clinical value for no additional cost.

We were able to assemble a resistome of genes coding for antibiotic resistance and compare this with the resistance pattern defined by standard

B Resistome and Toxome

A Phenotypic Antibiograms

mecA ermA

ermC

aacA

-aphD

aadD

tetK

dfrG

fusC

ileS-2

sea seb sec seg seh sei tst

0 2500 5000 7500 10,000 12,500 15,000

TMP

KAN & TOB

CXT ERY &

CLIN

GEN

TET FU

S

MUP

Toxins

10C

11C

12C

6C

8C

7C

1B

19B

20B

15C

14C

17B

16B

18B

Sample ST SCCmec CXT ERY CLIN CIP GEN KAN TOB TET TMP RIF FUS MUP LIN1B 22 IVh R R R R R R R R6C 22 IVh R R R R R R R R7C 22 IVh R R R R R R R R8C 22 IVh R R R R R R R R Outbreak10C 22 IVh R R R R R R R R11C 22 IVh R R R R R R R R12C 22 IVh R R R R R R R R14C 5 IVa R R R15C 22 IVh R R R R16B 1 IVa R R R R17B 1 IVa R R R R Nonoutbreak18B 36 II R R R R R R19B 22 IVh R R R R20B 22 IVh R R R R R

Outbreak

Nonoutbreak

SampleNucleotides

The New England Journal of Medicine Downloaded from nejm.org at UAB-SERVEI DE BIBLIOTEQUES on December 21, 2016. For personal use only. No other uses without permission.

Copyright © 2012 Massachusetts Medical Society. All rights reserved.

N Engl J Med 366, 2267-75 (2012)

- Assembled resistome by looking for snps at sites previously reported to determine resistance

- Toxome also assembled to allow for quick identification of toxins present in strain

Page 10: PATHOGENOMICS: PUBLIC HEALTH APPLICATIONSbioinformatica.uab.cat/.../Pathogenomics2016...6_5.pdf · Public health applications ... HISTORY OF PATHOGENOMICS ... is a powerful tool for

IMPLICATIONS

➤ Can give indications of transmission patterns

➤ Elimination of unrelated isolates can prevent unnecessary ward closures and disruptions

➤ Can be used to guide control measures

➤ Good turnaround time and cost-effective: after extracting DNA from culture, took 1.5 days; $150 per isolate (equivalent to price of 2 PCR tests for MRSA)

Page 11: PATHOGENOMICS: PUBLIC HEALTH APPLICATIONSbioinformatica.uab.cat/.../Pathogenomics2016...6_5.pdf · Public health applications ... HISTORY OF PATHOGENOMICS ... is a powerful tool for

APPLICATION: VACCINE DEVELOPMENT

➤ Via a reverse-vaccinology approach

➤ First successfuly used to identify suitable vaccine candidates for Neisseria meningitidis

➤ Analyzed the whole genome of a bacteria to predict undiscovered vaccine candidates using a specific set of criteria

➤ Resulted in identification of 600 potential candidates—>29 of which could elicit a complement-antibody mediated response

Page 12: PATHOGENOMICS: PUBLIC HEALTH APPLICATIONSbioinformatica.uab.cat/.../Pathogenomics2016...6_5.pdf · Public health applications ... HISTORY OF PATHOGENOMICS ... is a powerful tool for

APPLICATION: VACCINE DEVELOPMENT

Page 13: PATHOGENOMICS: PUBLIC HEALTH APPLICATIONSbioinformatica.uab.cat/.../Pathogenomics2016...6_5.pdf · Public health applications ... HISTORY OF PATHOGENOMICS ... is a powerful tool for

CHALLENGES

➤ Culturing

➤ Cost

➤ Dose sensitivity

➤ Time

➤ Data interpretation

Page 14: PATHOGENOMICS: PUBLIC HEALTH APPLICATIONSbioinformatica.uab.cat/.../Pathogenomics2016...6_5.pdf · Public health applications ... HISTORY OF PATHOGENOMICS ... is a powerful tool for

CONCLUSIONS

➤ Whole genome sequencing can be used to discriminate between pathogens with greater sensitivity and specificity than other methods

➤ Improved management of infectious disease—via improved diagnosis, detection, tracking of antimicrobial resistance and outbreak control

➤ However, several considerations before large scale implementation

Page 15: PATHOGENOMICS: PUBLIC HEALTH APPLICATIONSbioinformatica.uab.cat/.../Pathogenomics2016...6_5.pdf · Public health applications ... HISTORY OF PATHOGENOMICS ... is a powerful tool for

BIBLIOGRAPHY

1. Luheshi, L. et al. Pathogen Genomics Into Practice. PHG Found. (2015).

2. Köser, C. U. et al. Routine Use of Microbial Whole Genome Sequencing in Diagnostic and Public Health Microbiology. PLoS Pathog. 8, (2012).

3. Pallen, M. J. & Wren, B. W. Bacterial pathogenomics. Nature 449, 835–42 (2007).

4. Köser, C. U. et al. Rapid whole-genome sequencing for investigation of a neonatal MRSA outbreak. N.Engl.J.Med. 366, 2267–2275 (2012).

5. Wilson, D. J. Insights from Genomics into Bacterial Pathogen Populations. PLoS Pathog. 8, (2012).

6. Muzzi, A., Masignani, V. & Rappuoli, R. The pan-genome: towards a knowledge-based discovery of novel targets for vaccines and antibacterials. Drug Discov. Today 12, 429–439 (2007).