Faculty of Medicine and Health Sciences
Genomic insights into the emergence and spread of ‘high-risk’
Klebsiella pneumoniae and Acinetobacter baumannii clones
Thesis submitted for the degree of doctor in Medical Sciences at the
University of Antwerp to be defended by
Mattia PALMIERI
Supervisors:
Prof. Herman Goossens
Prof. Alex van Belkum
Dr. Pieter Moons
Antwerp, 2020
Doctoral committee:
Promotors:
Prof. Herman Goossens
Prof. Alex van Belkum
Dr. Pieter Moons
Counsellor:
Dr. Caroline Mirande
Internal jury, Universiteit Antwerpen:
Dr. Arvid Suls
Prof. Annelies Van Rie
External jury:
Prof. Christian Giske
Prof. Derrick Crook
Prof. Marco Maria D’Andrea
Index of contents
Abstract
Samenvatting
List of abbreviations
List of figures
Preface
CHAPTER 1 : General introduction and aims
1.1 The antimicrobial resistance crisis
1.2 The ESKAPE pathogens
1.3 Whole Genome Sequencing (WGS): a disruptive diagnostic tool
1.4 Aims
1.5 References
CHAPTER 2 : Genomic epidemiology of carbapenem- and colistin-resistant Klebsiella pneumoniae isolates from Serbia: predominance of ST101 strains carrying a novel OXA-48 plasmid
2.1 Abstract
2.2 Introduction
2.3 Materials and methods
2.4 Results
2.5 Discussion
2.6 References
CHAPTER 3 : Abundance of colistin-resistant, OXA-23- and ArmA-producing Acinetobacter baumannii belonging to International Clone 2 in Greece
3.1 Abstract
3.2 Introduction
3.3 Materials and methods
3.4 Results
3.5 Discussion
3.6 References
CHAPTER 4 : Genomic evolution and local epidemiology of Klebsiella pneumoniae from the Beijing Hospital 301 over a fifteen-year period: dissemination of known and novel high-risk clones
4.1 Introduction
4.2 Materials and methods
4.3 Results and discussion
4.4 Conclusions
4.5 References
CHAPTER 5 : Interpreting k-mer based signatures for antibiotic resistance prediction
5.1 Abstract
5.2 Introduction
5.3 Methods
5.4 Results
5.5 Discussion
5.6 References
CHAPTER 6 : PFM-like, a novel family of subclass B2 metallo β-lactamase from Pseudomonas synxantha belonging to the Pseudomonas fluorescens complex
6.1 Abstract
6.2 Main text
6.3 Data availability
6.4 References
CHAPTER 7 : Summary and perspectives
7.1 Summary
7.2 General discussion and future perspectives
7.3 References
Acknowledgments
Abstract
While antibiotics still represent the major antibacterial agents for the treatment of bacterial
infections, an increasing number of bacteria are becoming multidrug-resistant (MDR), complicating
the treatment of infections. Carbapenems are highly effective antibiotics commonly used for the
treatment of severe infections caused by MDR bacteria that are resistant to first-line antibiotics.
Of major concern, carbapenem resistance is on the rise, and in some countries it is so high that other
drugs, usually reserved as last options, are widely used. As an example, colistin, an old drug that was
essentially unused due to its toxicity, is now commonly used in some countries, and resistance
toward this antibiotic is on the rise.
Of the several pathogens associated with MDR, carbapenem-resistant K. pneumoniae and A.
baumannii represent major concerns. Both pathogens frequently cause outbreaks of infections, while
strains which are resistant to all available antibiotics are emerging. Concerning K. pneumoniae, a
novel kind of superbug has recently emerged. While MDR K. pneumoniae clones causing
hospital outbreaks and hypervirulent, drug-susceptible clones causing severe community-acquired
infections used to be two separate concerns, strains showing convergence of the two traits are now
emerging. Acquisition of hypervirulence and resistance genes has been observed in MDR and
hypervirulent clones, respectively, especially in Asia. Tracking the emergence and evolution of such
novel clones, which cause severe infections with limited treatment options, is fundamental.
The decreasing cost of Whole Genome Sequencing (WGS) is allowing its increased implementation in
bacterial diagnostics. However, there is still a lack of surveillance investigations for last-line resistance
mechanisms and for convergence of resistance and hypervirulence traits. Moreover, while
phenotype prediction from genomic data has shown encouraging results, the understanding of the
genetic resistance mechanisms for some drugs, such as colistin, is still limited, and novel in silico tools
for phenotype prediction are needed.
We employed WGS and bioinformatics, together with phenotypic techniques, to address different
problems: i) to decipher the colistin resistance mechanisms and the genomic epidemiology of clinical
isolates of K. pneumoniae and A. baumannii from countries where carbapenem resistance is sky-high
and colistin represents a life-saving agent; ii) to explore the longitudinal population dynamics of K.
pneumoniae in a major Chinese hospital, focusing on the simultaneous carriage of resistance and
hypervirulence genes; iii) to predict the phenotype of K. pneumoniae strains from their genomes; and iv) to
study a novel carbapenemase-encoding gene obtained from environmental bacteria.
Samenvatting
Although antibiotics are the most important antibacterial agents for the treatment of bacterial
infections, an increasing number of bacterial species are becoming multidrug-resistant (MDR), which
complicates the treatment of infections. Carbapenems are highly effective antibiotics that are often
used to treat severe infections caused by MDR bacteria that proved resistant to first-line antibiotics.
Worryingly, carbapenem resistance is increasing, and in some countries it is so high that other
drugs, usually reserved as a last option, are used on a large scale. Colistin, an old drug that was
mostly left unused because of its toxicity, is now commonly used in some countries, and resistance
to this antibiotic is increasing.
Among the various MDR pathogens, carbapenem-resistant Klebsiella pneumoniae and
Acinetobacter baumannii are clinically important examples. Both pathogens frequently cause
outbreaks of infections, while strains that are resistant to all available antibiotics are emerging.
In the case of K. pneumoniae, a new kind of superbug has recently been observed.
Whereas MDR and hypervirulence were normally observed separately in K. pneumoniae clones,
clones showing convergence of these two traits have now been identified.
Acquisition of hypervirulence and resistance genes has been seen especially in Asia. Tracking the
emergence and evolution of such new clones, which cause severe infections with limited
treatment options, is of fundamental importance.
The decreasing cost of Whole Genome Sequencing (WGS) makes it possible to accelerate its
implementation in routine bacterial diagnostics of infectious diseases. However, there is still
a lack of surveillance of existing and new resistance mechanisms and of the
convergence of resistance and hypervirulence traits. Moreover, although phenotype
prediction from genomic data has shown encouraging results, the understanding of
resistance mechanisms for some drugs, such as colistin, is still limited, and
new bioinformatic in silico tools for phenotype prediction are needed.
In my thesis I used WGS and bioinformatics, together with phenotypic techniques, to
address several problems. First, I investigated colistin resistance mechanisms and the genomic
epidemiology of clinical isolates of K. pneumoniae and A. baumannii from countries where
carbapenem resistance is sky-high. Second, I studied the longitudinal
population dynamics of K. pneumoniae in a large Chinese hospital, with emphasis on the analysis
of the local and international spread of resistance and hypervirulence genes. I analysed and
developed methods to predict the phenotype of K. pneumoniae strains from their genomes.
Finally, I studied a new carbapenemase-encoding gene that was found in
environmental bacteria. The results of these investigations are summarised in this thesis.
List of abbreviations
Abbreviations Full description
ACL adaptive cluster lasso
AMR antimicrobial resistance
AST antimicrobial susceptibility testing
AUC area under the curve
bACC balanced accuracy
CC clonal complex
cDBG compacted De Bruijn Graph
CG clonal group
cKp classical K. pneumoniae
colR/ColR colistin-resistant
ColS colistin-susceptible
cps capsular polysaccharide
CRAB carbapenem-resistant A. baumannii
CRKP/CR-Kp carbapenem-resistant K. pneumoniae
dNTP deoxyribonucleotide triphosphate
ESBL extended spectrum β-lactamase
GI gastro-intestinal
GWAS genome-wide association studies
HAI hospital acquired infection
hvKp hyper-virulent K. pneumoniae
IC international clone
ICU intensive care unit
IS insertion sequence
KPC Klebsiella pneumoniae carbapenemases
L-Ara4N L-aminoarabinose
LD linkage disequilibrium
LPS lipopolysaccharide
MAF minor allele frequency
MALDI-TOF MS matrix-assisted laser desorption/ionization–time of flight mass spectrometry
MBL metallo-β-lactamase
MDR multidrug-resistant
MIC minimum inhibitory concentration
ML machine learning
MLST multi-locus sequence typing
NGS Next-Generation Sequencing
NS non-susceptible
OCL outer core locus
ONT Oxford Nanopore Technologies
PBS phosphate-buffered saline
pEtN phosphoethanolamine
PFGE pulsed-field gel electrophoresis
ROC Receiver Operating Characteristic
S susceptible
SMRT single-molecule real-time
SNP single nucleotide polymorphism
UTI urinary tract infection
VNTR variable-number tandem repeat
WGS whole genome sequencing
WHO World Health Organization
ZMW zero-mode waveguide
List of figures
Figure 1. Antibiotic resistance strategies in bacteria. From Erik Gullberg, 2014.
Figure 2. Predicted global deaths due to antimicrobial-resistant infections every year, compared to
other major diseases. From O’Neill, 2014.
Figure 3. WHO priority pathogens list for R&D of new antibiotics. *Enterobacteriaceae include: K.
pneumoniae, E. coli, Enterobacter spp., Serratia spp., Proteus spp., Providencia spp. and
Morganella spp. From Tacconelli et al., 2018.
Table 1. β-lactamases types, including some examples of clinically relevant enzymes.
Figure 4. Regulation pathways of LPS modifications in Klebsiella pneumoniae. From Poirel et al.,
2017.
Figure 5. Four well-characterized virulence factors in classical and hypervirulent K. pneumoniae
strains. From Paczosa and Mecsas, 2016.
Figure 6. Schematic representation of A. baumannii colistin resistance mechanisms. From Trebosc
et al., 2019.
Figure 7. A schematic representation of the hypothetical workflow after adoption of WGS, with low
complexity and an expected turnaround time within one day. Adapted from Didelot et al., 2012.
Figure 8. Overview of the three generations of sequencing technologies, with examples of the
major sequencing platforms. From Loman and Pallen, 2015.
Preface
This preface provides an overview of the contents of each chapter in this thesis, lists the chapters
that are included as publications, and details the direct contributions of the author of this thesis
to each chapter.
Chapter 1: General introduction and aims
This is an original overview of the background, key concepts and objectives of this thesis.
Chapter 2: Genomic epidemiology of carbapenem- and colistin-resistant Klebsiella pneumoniae
isolates from Serbia: predominance of ST101 strains carrying a novel OXA-48 plasmid
This chapter is an original work that resulted in a publication in Frontiers in Microbiology (DOI:
10.3389/fmicb.2020.00294). I was first author and the main contributor of the work presented in this
publication.
The nature and extent of my contributions to this chapter are detailed below:
• I contributed to the design of this published study and interpretation with Prof. Alex van Belkum,
Prof. Marco Maria D’Andrea and Prof. Gian Maria Rossolini.
• I performed all wet lab experiments, including antimicrobial susceptibility testing, MALDI-TOF MS
and DNA extraction.
• I performed library preparations for Nanopore long-read sequencing under supervision by and
assistance from Franck Tarendeau (bioMérieux Grenoble).
• I conducted all epidemiological, phylogenetic, and genomic analysis with Prof. Marco Maria
D’Andrea.
• I was responsible for the planning, drafting, editing, and submission of the manuscript, though all
co-authors also edited the manuscript.
Chapter 3: Abundance of colistin-resistant, OXA-23- and ArmA-producing Acinetobacter baumannii
belonging to International Clone 2 in Greece
This chapter is an original work that resulted in a publication in Frontiers in Microbiology (DOI:
10.3389/fmicb.2020.00668). I was first author and the main contributor of the work presented in this
publication.
The nature and extent of my contributions to this chapter are detailed below:
• I contributed to the design of this published study and interpretation with Prof. Alex van Belkum,
Prof. Marco Maria D’Andrea and Prof. Gian Maria Rossolini. Dr. Nikos Legakis was responsible for the
collection, initial characterization and shipment of the strains. I verified some of the strain
characteristics for reasons of quality control.
• I performed MALDI-TOF MS under supervision by and assistance from Nadine Perrot.
• I performed all wet lab experiments, including antimicrobial susceptibility testing and DNA
extraction.
• I conducted all epidemiological, phylogenetic, and genomic analysis with input from Prof. Marco
Maria D’Andrea.
• I was responsible for the planning, drafting, editing, and submission of the manuscript, though all
co-authors also edited the manuscript.
Chapter 4: Genomic evolution and local epidemiology of Klebsiella pneumoniae from the Beijing
Hospital 301 over a fifteen-year period: dissemination of known and novel high-risk clones
This chapter is an original work that resulted in an in-progress manuscript, soon to be submitted for
publication. I was first author and the main contributor of the work presented in this manuscript.
The nature and extent of my contributions to this chapter are detailed below:
• I conducted all epidemiological, phylogenetic, and genomic analysis together with Dr. Kelly L. Wyres.
• I wrote the first draft of the manuscript and consolidated the editing suggestions made by the co-
authors.
Chapter 5: Interpreting k-mer based signatures for antibiotic resistance prediction
This chapter is an original work that resulted in a submitted manuscript, under revision at the time of
submission of this thesis. I was second author.
The nature and extent of my contributions to this chapter are detailed below:
• I contributed to the design of this study and performed data interpretation with
Dr. Pierre Mahé, Dr. Magali Jaillard and Prof. Alex van Belkum.
• I built the K. pneumoniae database used to test the machine learning algorithm.
• I contributed to the analysis of the data.
• I contributed to the initial writing and editing of the manuscript.
Chapter 6: PFM-like, a novel family of subclass B2 metallo β-lactamase from Pseudomonas
synxantha belonging to the Pseudomonas fluorescens complex
This chapter is an original work that resulted in a publication in Antimicrobial Agents and
Chemotherapy (DOI: 10.1128/AAC.01700-19). I was second author and the main contributor of the
experimental work presented in this publication.
The nature and extent of my contributions to this chapter are detailed below:
• I performed most of the wet lab experiments, including antimicrobial susceptibility testing, gene
cloning, enzyme purification and kinetic analysis of hydrolysis.
• I conducted all bioinformatics analyses.
• I wrote the first draft of the manuscript.
Chapter 7: Summary and perspectives
This is an original summary of the implications and significance of the work presented in this thesis,
together with a brief general discussion and future perspectives.
CHAPTER 1 : General introduction and aims
1.1 The antimicrobial resistance crisis
The discovery of antibiotics in the early part of the previous century was one of the most important
developments in medicine and a milestone in the history of modern human society. Before the
introduction of antibiotics, infectious diseases were a major cause of mortality due to systemic
infections, sepsis resulting from wound infections, pneumonia and common infections
surrounding childbirth. In the absence of antibiotics, routine clinical practices such as organ
transplants, surgery and cancer chemotherapy would be impossible 1.
As soon as antibiotics were introduced in clinical practice, clinically-relevant antibiotic resistant
bacterial strains were described. These strains emerged due to their ability to rapidly evolve via both
vertical and horizontal inheritance 2.
Moreover, antibiotics have been used inappropriately, in particular outside healthcare settings and
especially in low-income countries. The misuse and overuse of antibiotics is not only a
problem in human clinical settings, but also a frequent practice in agriculture, aquaculture and
animal farming. Alarmingly, these drugs are largely used for disease prophylaxis and as growth promoters 3.
This situation has led to selection and propagation of antibiotic resistant strains in many
environments, turning them into reservoirs that contribute to storage, transmission and selection of
new superbugs. Consequently, some infections previously easily manageable are now difficult or
impossible to treat 4. Infections caused by a pathogen resistant to the drug of treatment generally
have a poorer clinical outcome (possibly even death) and are also linked to a greater overall
consumption of healthcare resources, when compared to infections caused by antibiotic-susceptible
organisms 1.
Members of a bacterial species can all be naturally resistant to a specific drug (intrinsic resistance), or
resistance traits can be acquired by susceptible microorganisms (acquired resistance). On a
genetic level, resistance may arise i) endogenously, through random chromosomal point mutations,
often when sub-therapeutic concentrations of antibiotics increase mutability and specifically select
for resistant strains, or ii) exogenously, through horizontal gene transfer, when foreign DNA is
mobilized via conjugative plasmids (conjugation), bacteriophages (transduction), transposons,
insertion sequences and naked DNA (transformation), eventually leading to the recombination of
acquired DNA into the chromosome 2. Concerning the endogenous mechanisms, the process toward
high-level resistance is usually stepwise. The antibiotic selection pressure enriches for bacterial cells with an
initial mutation that enhances their survival, followed by subsequent additional mutations that
confer increased resistance levels during further antibiotic therapy. Though mutation frequencies can
be as low as 10^-8, this is offset by the huge numbers of cells in bacterial colonies 5. Concerning
exogenous mechanisms, the major genetic elements associated with resistance genes are plasmids.
These are nearly ideal carriers for the acquisition and dissemination of resistance genes, followed by
transposons, which can move genes between plasmids or chromosomes, and integrons, which can
ease the recruitment and expression of resistance determinants. These elements are widely present
among both Gram-negative and Gram-positive bacterial species and play a crucial role in the
dissemination of resistance determinants 6.
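The remark that a low mutation frequency is offset by large cell numbers can be made concrete with a short calculation. The sketch below is illustrative only: it assumes a single resistance mutation arising independently in each cell at the frequency of 10^-8 quoted above, and the population sizes are hypothetical values chosen for the example, not figures taken from the cited reference.

```python
import math


def expected_mutants(population_size: float, mutation_frequency: float) -> float:
    """Expected number of cells carrying the mutation (N * f)."""
    return population_size * mutation_frequency


def prob_at_least_one_mutant(population_size: float, mutation_frequency: float) -> float:
    """Probability that at least one cell carries the mutation.

    Computes 1 - (1 - f)^N, assuming cells mutate independently.
    log1p/expm1 keep the result accurate when f is very small.
    """
    return -math.expm1(population_size * math.log1p(-mutation_frequency))


# Mutation frequency quoted in the text; population sizes are hypothetical.
MUTATION_FREQUENCY = 1e-8

if __name__ == "__main__":
    for cells in (1e6, 1e8, 1e10):
        mean = expected_mutants(cells, MUTATION_FREQUENCY)
        prob = prob_at_least_one_mutant(cells, MUTATION_FREQUENCY)
        print(f"N = {cells:.0e}: expected mutants = {mean:g}, "
              f"P(at least one) = {prob:.4f}")
```

Under these assumptions, a pre-existing mutant is unlikely in a small population but becomes close to certain once the population exceeds the inverse of the mutation frequency by a few orders of magnitude, which is when antibiotic selection can enrich it.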
From a biochemical point of view, five major mechanisms of resistance can occur in bacteria: i)
decreased antibiotic uptake associated with reduction of membrane permeability (e.g. resistance to
tetracyclines and quinolones); ii) enzymatic inhibition/inactivation of the antibiotic (e.g. resistance to
β-lactams by β-lactamases); iii) rapid efflux of the antibiotic from the cell (e.g. resistance to
tetracyclines and macrolides); iv) target alteration: modification of the cellular structure (receptor) that
the antibiotic targets (e.g. resistance to oxacillin and methicillin mediated by the mecA gene, or
mutations in DNA gyrase resulting in resistance to several fluoroquinolones); and v) acquisition of
one or more alternative metabolic pathways to supplement those inhibited by antibiotics (e.g.
resistance to sulfonamides) 7 (Figure 1). These resistance mechanisms can be present together in
different combinations in a single bacterial cell, potentially allowing high-level resistance to multiple
antibiotic compounds simultaneously 8.
Figure 1. Antibiotic resistance strategies in bacteria 9
Ever-growing levels of antimicrobial resistance (AMR) threaten the health benefits achieved with
antibiotics, and this phenomenon is recognised as a global crisis 10. With an estimated 50,000 deaths
across the US and Europe every year attributable to AMR, urgent international action is needed
to preserve the efficacy of modern antibiotic treatments.
Without proactive solutions to prevent the continued escalation of antibiotic resistance, it is
estimated that by 2050 approximately 10 million people will die annually of antimicrobial-resistant
infections, which is more than the number of people dying today from any other type of
disease 1 (Figure 2).
Figure 2. Predicted global deaths due to antimicrobial-resistant infections every year, compared to other major diseases 1
1.2 The ESKAPE pathogens
The ESKAPE pathogens (Enterococcus faecium, Staphylococcus aureus, Klebsiella pneumoniae,
Acinetobacter baumannii, Pseudomonas aeruginosa and Enterobacter species), although not the only
worrisome pathogens, have been labelled as requiring special attention since they are responsible
for the majority of hospital acquired infections (HAIs), concurrently showing a high prevalence of
AMR 11. The World Health Organization (WHO) has also recently listed twelve bacterial species
against which new antibiotics are urgently needed 12. The list defines three categories of pathogens,
namely critical, high and medium priority, according to the urgency of the need for new antibiotics
(Figure 3). Carbapenem-resistant A. baumannii and P. aeruginosa, along with extended-spectrum
β-lactamase (ESBL)-producing or carbapenem-resistant Enterobacteriaceae (including K. pneumoniae), were listed
in the critical priority category.
Figure 3. WHO priority pathogens list for R&D of new antibiotics. *Enterobacteriaceae include: K. pneumoniae, E. coli, Enterobacter spp., Serratia spp., Proteus spp., Providencia spp. and Morganella spp. 12
1.2.1 Klebsiella pneumoniae
K. pneumoniae, belonging to the Enterobacteriaceae family, was first isolated in the late 19th century
and was initially known as Friedlaender’s bacterium 13. From a clinical point of view, the species K.
pneumoniae is the most important member of the genus Klebsiella, which also includes other
clinically relevant species such as K. oxytoca and, to a lesser extent, K. rhinoscleromatis and K.
ozaenae 14. Klebsiella spp. are Gram-negative, encapsulated, non-motile bacteria that are able to
readily colonize human mucosal surfaces, including the gastro-intestinal (GI) tract and oropharynx,
although this colonization appears benign 15. From these sites, this opportunistic pathogen can gain
entry to other tissues where it can cause severe infections in humans. Major diseases include urinary
tract infections, lower respiratory tract infections, intraabdominal infections and bloodstream
infections. Other diseases, such as meningitis and wound infections, are less common 16.
As the best-known member of the genus, K. pneumoniae is a common opportunistic, mostly nosocomial
pathogen, accounting for about one third of all Gram-negative HAIs 17. It is also an important
cause of serious community-onset infections such as necrotizing pneumonia, pyogenic liver abscesses
and endogenous endophthalmitis 14.
In healthcare settings, K. pneumoniae infections commonly occur among patients who already suffer
from serious underlying clinical conditions, often together with a state of general immunodeficiency.
Risk factors for K. pneumoniae infections include extremes of age, presence of malignancy, diabetes,
chronic liver disease, recent solid-organ transplantation, and chronic dialysis 18. Other risk factors for
nosocomial infections by K. pneumoniae are treatment with corticosteroids, chemotherapy, organ
transplantation, or other treatments or conditions resulting in neutropenia 19.
Over the last few decades, there has been a concerning rise in the acquisition of resistance to a wide
range of antibiotic classes by “classical” K. pneumoniae strains 20. Consequently, simple infections
such as UTIs have become hard to treat, while more serious infections such as pneumonia and
bacteremia have become increasingly life-threatening 21.
Since the mid-1980s, a novel type of community-acquired invasive K. pneumoniae infection, primarily
in the form of pyogenic liver abscesses, has emerged, mostly in Asian countries 22. K. pneumoniae
strains causing these invasive infections are defined as hyper-virulent and express a distinct
hyper-mucoviscous phenotype when grown on agar plates 23.
Very recently, strains with a hyper-virulent phenotype have been found to carry antimicrobial
resistance genes, including carbapenemases 24, as well as mechanisms of resistance against last-resort
antibiotics such as colistin 25, leading to an alarming scenario given the lack of novel approaches to
treat this kind of superbug.
1.2.1.1 Antimicrobial resistance in K. pneumoniae: the β-lactamases
K. pneumoniae can produce various enzymes that hydrolyze the four-membered ring of β-lactams
and inactivate them. These enzymes include ESBLs, oxacillinases, carbapenemases (including metallo-
and serine-β-lactamases), among others (Table 1). Genes encoding such enzymes are generally
present on plasmids which K. pneumoniae seems to readily acquire. Such plasmids often carry other
genes conferring resistance to other antibiotic classes including aminoglycosides, chloramphenicol,
sulfonamides, trimethoprim, and tetracyclines. Thus, bacteria containing these plasmids are often
multidrug-resistant (MDR) 26.
Type                            | Ambler class | Features                                          | Enzymes
Narrow-spectrum β-lactamases    | A            | Hydrolyze penicillins                             | TEM-1, TEM-2, SHV-1
Extended-spectrum β-lactamases  | A            | Hydrolyze narrow and extended-spectrum β-lactams  | SHV-2, CTX-M-15, VEB-1, PER-1
Serine carbapenemases           | A            | Hydrolyze carbapenems                             | KPC-2, KPC-3, IMI-1
Metallo β-lactamases            | B            | Hydrolyze carbapenems                             | NDM-1, VIM-1, IMP-1
Cephalosporinases               | C            | Hydrolyze cephamycins and some oxyimino β-lactams | AmpC, CMY-2, FOX-1
OXA-type enzymes                | D            | Hydrolyze carbapenems                             | OXA-48, OXA-232
Table 1. β-lactamase types, including some examples of clinically relevant enzymes.
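As an illustration only, the content of Table 1 can be held in a small lookup structure, for instance to annotate a list of detected enzymes with their Ambler class. The sketch below is a minimal example; the names BETA_LACTAMASES, classify and carbapenemases are hypothetical and the enzyme lists are limited to the examples given in the table.

```python
# Rows of Table 1: (type, Ambler class, hydrolyzed substrates, example enzymes).
# Only the example enzymes listed in the table are included (not exhaustive).
BETA_LACTAMASES = [
    ("Narrow-spectrum beta-lactamases", "A", "penicillins",
     ["TEM-1", "TEM-2", "SHV-1"]),
    ("Extended-spectrum beta-lactamases", "A",
     "narrow and extended-spectrum beta-lactams",
     ["SHV-2", "CTX-M-15", "VEB-1", "PER-1"]),
    ("Serine carbapenemases", "A", "carbapenems",
     ["KPC-2", "KPC-3", "IMI-1"]),
    ("Metallo beta-lactamases", "B", "carbapenems",
     ["NDM-1", "VIM-1", "IMP-1"]),
    ("Cephalosporinases", "C", "cephamycins and some oxyimino beta-lactams",
     ["AmpC", "CMY-2", "FOX-1"]),
    ("OXA-type enzymes", "D", "carbapenems",
     ["OXA-48", "OXA-232"]),
]


def classify(enzyme):
    """Return (type, Ambler class, substrates) for an enzyme in Table 1, else None."""
    for type_name, ambler, substrates, enzymes in BETA_LACTAMASES:
        if enzyme in enzymes:
            return type_name, ambler, substrates
    return None


def carbapenemases():
    """List the example enzymes whose table row reports carbapenem hydrolysis."""
    return [e for _, _, substrates, enzymes in BETA_LACTAMASES
            if substrates == "carbapenems" for e in enzymes]
```

For example, classify("NDM-1") returns the metallo β-lactamase row with Ambler class B, while an enzyme absent from the table returns None.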
Two major types of antibiotic resistance have been commonly described in K. pneumoniae, both
involving the production of β-lactamases. The first mechanism, initially described in the late 1980s
concomitantly in Europe 27 and in the US 28, is the production of variants of the SHV-1 or TEM-1 β-lactamases,
in which the substitution of only one or two amino acids led to the appearance of
variants that have been termed ESBLs. ESBLs are chromosomally or plasmid-encoded enzymes that
mediate resistance to penicillins, extended-spectrum (third generation) cephalosporins (e.g.
ceftazidime, cefotaxime, and ceftriaxone) and monobactams (e.g. aztreonam), but do not affect
cephamycins (e.g. cefoxitin and cefotetan) or carbapenems (e.g. meropenem and imipenem) 29. The
early SHV and TEM variants have been largely replaced by the CTX-M family of ESBLs, which were identified in
the early 1990s in Western Europe and South America and are currently the most common type
of ESBL in enteric bacteria 30.
The second major mechanism of resistance is the expression of carbapenemases, which renders K.
pneumoniae resistant to all β-lactams, including the carbapenems. Carbapenemases can be classified
on the basis of their amino acid sequence into different molecular classes: class A (e.g. IMI-, SME- and KPC-type
enzymes), class B (of which the main representatives in clinical isolates are the NDM-, IMP- and
VIM-types) and class D β-lactamases (e.g. OXA-48-types, OXA-232-types) 31.
Klebsiella pneumoniae carbapenemases (KPCs) represent the clinically most relevant mechanism of
acquired antimicrobial resistance observed in K. pneumoniae during recent years. This is due to their
very wide range of activity against several β-lactam families, including penicillins, older and newer
cephalosporins, aztreonam and carbapenems 32.
Several different KPC variants (KPC-2 to KPC-22) have been described, although KPC-2 and KPC-3 are
the most widespread. KPCs are mostly plasmid-encoded enzymes and bacteria carrying these
plasmids are often susceptible to only a few antibiotics such as colistin, aminoglycosides, and
tigecycline.
1.2.1.2 Antimicrobial resistance in K. pneumoniae: colistin resistance
Polymyxins have represented the major antimicrobial therapeutic option against carbapenem-resistant K.
pneumoniae infections over the last decades. Indeed, polymyxin E (colistin) is considered a “last
resort” antimicrobial for the treatment of MDR K. pneumoniae infections, being essentially the only drug
that reaches adequate serum levels exceeding the minimum inhibitory concentration (MIC)
of the infecting strain 33.
Consequently, the increasing prevalence of colistin-resistant K. pneumoniae is a major concern,
considering the scarcity of the alternative treatment options and the high mortality rate associated
with carbapenem- and colistin-resistant K. pneumoniae infections 34.
The target of colistin is the outer membrane of Gram-negative bacteria. An electrostatic interaction
occurs between the positively charged colistin molecule on the one side and the phosphate groups of
the negatively charged lipid A on the other side. Divalent cations (Ca2+ and Mg2+) are consequently
displaced from the negatively charged phosphate groups of membrane lipids 35. Then, the
lipopolysaccharide (LPS) is destabilized, the permeability of the bacterial membrane is increased, and
cytoplasmic leakage ultimately causes cell death 36. Even though LPS is the initial target, the exact
colistin mode of action is still uncertain 37.
Similar to what is observed in bacteria that are naturally resistant to colistin, LPS modification via
addition of cationic groups, i.e. L-aminoarabinose (L-Ara4N) and phosphoethanolamine (pEtN), is
responsible for colistin resistance in K. pneumoniae. A large panel of genes and operons is involved in
qualitative modification of the LPS (Figure 4). The pmrCAB operon encodes the pEtN
phosphotransferase PmrC, the response regulator PmrA, and the sensor kinase protein PmrB. The
pEtN phosphotransferase PmrC adds a pEtN group to the LPS. Environmental stimuli such as ferric
(Fe3+) iron, aluminium (Al3+), and low pH (e.g., pH 5.5) activate PmrB through its periplasmic domain.
The histidine kinase PmrB in turn activates PmrA by phosphorylation. Finally, PmrA activates the
transcription of the pmrCAB operon itself, and also of the pmrHFIJKLM operon and the pmrE gene
which are also involved in LPS modifications. Specific PmrA/B mutations are responsible for
constitutive activation of the PmrAB two-component system, and have been described as being
responsible for colistin resistance in K. pneumoniae 38.
The pmrHFIJKLM operon encodes seven proteins, and together with the pmrE gene they are
responsible for the synthesis of the L-Ara4N and its coupling to lipid A. The phoPQ operon encodes
the regulator protein PhoP and the sensor protein kinase PhoQ. In a similar way to PmrB, PhoQ
senses environmental stimuli such as low magnesium (Mg2+) and low pH (e.g., pH 5.5), which mediate
PhoQ activation through its periplasmic domain. PhoQ in turn activates PhoP by phosphorylation.
Finally, PhoP activates the transcription of the pmrHFIJKLM operon, mediating the addition of L-Ara4N
to the LPS. PhoP can also activate the PmrA protein, either directly or indirectly via the PmrD
connector protein, causing the LPS modification via pEtN addition. Several mutations in the phoP/Q
genes are responsible for constitutive activation of the PhoPQ two-component system and
consequently colistin resistance in K. pneumoniae 38.
MgrB is a small transmembrane protein that acts as a negative regulator of the PhoPQ two-
component system. Inactivation of the mgrB gene leads to overexpression of the phoPQ operon and
consequently colistin resistance. Several missense mutations resulting in amino acid substitutions
and nonsense mutations leading to a truncated MgrB protein have been observed. Insertional
inactivation caused by different insertion sequences (IS), belonging to several families and inserted at
different locations within the mgrB gene, is often responsible for colistin resistance in K. pneumoniae
39,40.
The crrAB operon encodes the regulatory protein CrrA and the sensor protein kinase CrrB, which
regulate the pmrAB expression. Inactivation of the crrB gene leads to overexpression of the pmrAB
operon, finally resulting in colistin resistance 41.
Finally, the plasmid-mediated mcr-1 gene is responsible for horizontal transfer of colistin resistance.
It was initially described in E. coli and K. pneumoniae isolates from Chinese patients between 2011
and 2014 42. The encoded MCR-1 protein is a pEtN transferase, and its acquisition results in the
addition of pEtN to lipid A, similarly to the chromosomal mutations mentioned above. Following mcr-
1, several other variants, up to mcr-9, have been described 43–50.
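The regulatory relationships described above can be summarized as a small directed graph. The following sketch is a toy encoding of the statements made in this section, not a quantitative or complete model; the dictionary names and the function lps_modifications are hypothetical, and treating CrrB as a negative regulator is a simplification based only on the observation that its inactivation leads to pmrAB overexpression.

```python
# Activating edges (regulator or operon -> what it switches on), as stated in the text.
ACTIVATES = {
    "PmrB": ["PmrA"],
    "PmrA": ["pmrCAB", "pmrHFIJKLM", "pmrE"],
    "PhoQ": ["PhoP"],
    "PhoP": ["pmrHFIJKLM", "PmrD", "PmrA"],
    "PmrD": ["PmrA"],
    "pmrCAB": ["pEtN addition"],
    "pmrHFIJKLM": ["L-Ara4N addition"],
    "pmrE": ["L-Ara4N addition"],
    "mcr-1": ["pEtN addition"],
}

# Negative regulators: losing them de-represses the listed nodes (simplified).
DEREPRESSED_BY_LOSS = {
    "MgrB": ["PhoQ"],
    "CrrB": ["PmrB"],
}


def lps_modifications(element, kind):
    """Return the LPS modifications reachable from a genetic event.

    kind is "activation" (constitutive activation or gene acquisition)
    or "inactivation" (loss of a negative regulator).
    """
    if kind == "inactivation":
        stack = list(DEREPRESSED_BY_LOSS.get(element, []))
    else:
        stack = [element]
    seen = set()
    while stack:  # depth-first traversal; 'seen' guards against loops
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(ACTIVATES.get(node, []))
    return sorted(n for n in seen if n.endswith("addition"))
```

In this encoding, lps_modifications("mcr-1", "activation") yields only the pEtN addition, whereas lps_modifications("MgrB", "inactivation") reaches both the pEtN and the L-Ara4N additions through PhoPQ.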
Figure 4. Regulation pathways of LPS modifications in Klebsiella pneumoniae 37
1.2.1.3 Hyper-virulent K. pneumoniae
Despite rendering bacterial infections more difficult to treat, MDR does not enhance the virulence of
K. pneumoniae strains. However, starting from the 1980s, K. pneumoniae strains with the ability to
cause severe infections in apparently healthy individuals emerged. These strains are defined as
hyper-virulent K. pneumoniae (hvKp) compared to classical K. pneumoniae (cKp) strains as they are
able to infect both healthy and immunocompromised individuals, with resulting infections which are
generally invasive.
Infections were first described in Taiwan and are common on the Asian Pacific Rim. However, new
cases have recently been reported on a more global scale. In contrast to the infections caused by cKp,
most hvKp infections originate in the community 51. While pyogenic liver abscesses represent the
major disease, hvKp strains can also cause pneumonia and lung abscesses, among others 52.
Bacteremia is frequent among hvKp-infected patients and is correlated with a significantly poorer
prognosis 53.
Several virulence factors have been reported and studied in hvKp strains. The capsule is a polysaccharide
matrix that overlays the cell and it is fundamental for K. pneumoniae virulence. hvKp strains are
characterized by hyper-capsulation which consists of an extensive mucoviscous exopolysaccharide
coating that is thicker and more robust than that of the typical capsule. This hyper-capsule
contributes significantly to the pathogenicity of hvKp 20.
Most hvKp are associated with only two of the 130 reported capsular serotypes, K1 and K2, that were
shown to be particularly anti-phagocytic and serum resistant 20,54. hvKp are also associated with
several other key virulence factors (Figure 5): the rmpA and rmpA2 genes, which upregulate capsule
expression, thereby aiding the formation of a hyper-capsule that is linked to the hyper-mucoviscous
phenotype; the colibactin genotoxin, which induces eukaryotic cell death and promotes bacterial
transfer from the intestines into the blood; and the yersiniabactin, aerobactin and salmochelin
siderophores, which enhance survival in the blood by promoting iron scavenging 20. Yersiniabactin
synthesis is encoded by the ybt locus that is generally mobilized by an integrative, conjugative
element termed ICEKp. Its prevalence is about 40% in K. pneumoniae and it is frequently acquired
and lost from MDR clones 55. Conversely, the salmochelin (iro), aerobactin (iuc) and rmpA/rmpA2 loci
are usually co-harbored by a virulence plasmid 56. The prevalence of that virulence plasmid is less
than 10% in the K. pneumoniae population, and until recently it was rarely reported among cKp
strains 57.
hvKp strains are generally susceptible to most antimicrobials. However, the last few years have seen
an increasing number of reports of ‘convergent’ K. pneumoniae strains that are both hyper-virulent
(carrying the iuc aerobactin locus, which is recognized as the single most important feature of hvKp
strains 58) and ESBL/carbapenemase producers. The majority of these reports represent sporadic
isolations, but in 2017 Gu and colleagues reported a fatal outbreak in a Chinese hospital caused by a
hyper-virulent carbapenemase-producing K. pneumoniae isolate 59.
Figure 5. Four well-characterized virulence factors in classical and hypervirulent K. pneumoniae strains 20
1.2.2 Acinetobacter baumannii
Acinetobacter baumannii is a Gram-negative coccobacillus recognized as an important opportunistic
human pathogen causing infections of the urinary tract, skin, bloodstream, and soft tissues 60. The
majority of A. baumannii infections occur among critically ill patients in the intensive care unit (ICU)
setting, accounting for as much as 20% of infections in ICUs worldwide 61. MDR phenotypes due to
the acquisition of antibiotic resistance mechanisms represent a major factor of the success of A.
baumannii in hospital environments. Antibiotic modifying enzymes, decreased permeability to
antibiotic molecules, and active efflux pumps are among the major AMR mechanisms. Apart from its
multidrug resistance, the success of A. baumannii can also be attributed to its ability to survive in the
hospital environment 62. Examples of the challenges that A. baumannii faces as an opportunistic
human pathogen include the survival at low temperatures, the exposure to antiseptics and
desiccating agents and the rapid changes of environmental and nutritional conditions when
transferred into the human body from the hospital environment. Therefore, A. baumannii needs to
sense and adapt to these changes in an efficient and prompt manner. A. baumannii also has the
ability to colonize the skin of patients or healthy individuals without causing any apparent illness.
However, transmission of such colonizing bacteria to a susceptible patient can result in immediate
infection.
1.2.2.1 Multidrug-Resistant A. baumannii
The major mechanism of β-lactam resistance in A. baumannii is enzymatic degradation by β-
lactamases. A. baumannii strains are characterized by chromosomally encoded AmpC
cephalosporinases, which are also known as Acinetobacter-derived cephalosporinases (ADCs). The
overexpression of such enzymes in A. baumannii is regulated by the presence of an upstream
insertion sequence (IS) element, the major representative being ISAba1. The presence of this
element correlates with resistance to extended-spectrum cephalosporins due to the increased ADC
production. Cefepime and carbapenems are not hydrolyzed by these enzymes.
ESBLs of the VEB-, PER-, TEM- and CTX-M-type have also been reported in A. baumannii. However,
the assessment of their prevalence is hindered by difficulties with laboratory detection in the
presence of ADCs 60.
The β-lactamases with carbapenemase activity are of major concern and include the serine
oxacillinases (Ambler class D OXA type) and the metallo-β-lactamases (MBLs) (Ambler class B).
The second intrinsic β-lactamase produced by A. baumannii is an oxacillinase, represented by the
OXA-51/69 variants. The OXA-51-like-encoding genes are chromosomally located in A. baumannii and
the carbapenemase activities of OXA-51/69 enzymes have been studied in detail 63,64. However, the
level of expression of the corresponding genes is quite low in most cases, resulting in a minor impact
on β-lactam susceptibility 65.
Identification of a carbapenem-hydrolyzing oxacillinase-encoding gene was first reported in A.
baumannii in 1995 and named blaOXA-23. This enzyme type now represents the major carbapenem
resistance determinant in A. baumannii on a global scale. Two other acquired OXA-type genes giving
rise to the production of proteins with carbapenemase activity have been reported, the blaOXA-24-like
and the blaOXA-58-like carbapenemase genes 65.
IS elements play an important role in oxacillinase-mediated carbapenem resistance in A. baumannii.
These elements provide two major functions. First, they encode a transposase, allowing the
mobilization of the carbapenemase-encoding gene. Second, they can contain promoter regions that
lead to overexpression of downstream genes. IS elements have been frequently described upstream
of blaOXA-23 and blaOXA-58 genes, but they may also promote carbapenem resistance in association with
intrinsic genes such as blaOXA-51. Some IS elements, in particular ISAba1, are relatively unique to A.
baumannii 60.
Aminoglycoside resistance in A. baumannii is mediated by acetyltransferase-, nucleotidyltransferase-,
and phosphotransferase-encoding genes. More alarmingly, 16S rRNA methylation is becoming
common in A. baumannii due to the expression of the armA gene. This resistance mechanism
protects the 30S ribosomal subunit from aminoglycoside binding, conferring high-level resistance to
all clinically useful aminoglycosides, including gentamicin, tobramycin, and amikacin 66.
The major fluoroquinolone resistance mechanism depends on modifications of DNA gyrase or
topoisomerase IV through mutations in the gyrA and parC genes. Such mutations modify the
fluoroquinolone’s target binding site 60.
1.2.2.2 Colistin resistance in A. baumannii
The main mechanism of colistin resistance in A. baumannii corresponds to the addition of cationic
groups to the LPS (Figure 6). Colistin resistance may also be the consequence of a complete loss of
LPS production. However, LPS loss is associated with growth defects and decreased virulence, and for
these reasons very few clinical isolates are LPS deficient 67.
Colistin resistance has been linked to mutations in the two-component transcriptional regulator
genes pmrA/B and consequent pmrC overexpression in most instances. The pEtN phosphotransferase
PmrC adds a pEtN group to the lipid A of the lipopolysaccharide, lowering the net negative charge of
the cell membrane, thus impacting the binding of colistin and preventing the cell membrane leakage.
The complete loss of LPS is caused by alterations of the lipid A biosynthesis genes, namely the lpxA,
lpxC, and lpxD genes. Mutations identified in those genes were either substitutions, truncations,
frameshifts, or insertional inactivation by the insertion sequence ISAba11 37.
Colistin resistance may also result from the overexpression of eptA, a pmrC homolog. This is
mediated by insertional inactivation of a gene encoding an H-NS family transcriptional regulator 68 or
by integration of insertion sequence elements upstream of the eptA gene itself 69–71.
Figure 6. Schematic representation of A. baumannii colistin resistance mechanisms 69.
1.3 Whole Genome Sequencing (WGS): a disruptive diagnostic tool
The current methods of clinical microbiology diagnostics mainly consist of conventional culturing of
clinical samples on different agar plates, followed by antimicrobial susceptibility testing (AST) and
further characterization on a case-by-case basis. The major steps in processing a sample are isolating
a pathogen, determining its species, testing antimicrobial susceptibility and virulence and, in specific
settings, intra-species typing for epidemiological purposes. The first three steps are crucial for the
treatment and management of an infected patient, while the last step is valuable for identifying
outbreaks and improving surveillance. Depending on the pathogen, this practice usually takes one
to two days for culturing, an additional one to two days for species identification and susceptibility
testing, and several days for typing 72. While the species identification and AST can be performed
significantly faster, for example by employing MALDI-TOF MS and rapid disk diffusion after 4-6 hours
of culture 73,74, the overall diagnostic process, including typing, remains complex, time-consuming
and difficult to automate 72.
Several methods for rapid diagnostic testing have been developed and evaluated. Molecular
methods, such as PCR, microarray, and nucleic acid sequencing, have been widely adopted in the
clinical laboratory. These methods are able to identify microorganisms, genes and genetic
polymorphisms with high sensitivity and specificity through detection of specific nucleic acid targets.
Regardless of methodology, molecular diagnostics have the capability to reduce the time to results
and provide more accurate diagnosis. Despite these clear advantages, molecular diagnostic methods
are still expensive, and AST is limited to the detection of few resistance markers 75.
WGS has all the essentials to dramatically revolutionize bacterial diagnosis and surveillance by
replacing current time-consuming and labour-intensive techniques with a single and rapid diagnostic
test (Figure 7). Over the past two decades, huge progress was made in the field of high-throughput
sequencing technologies, and nowadays sequencing the full genome of a bacterial pathogen is
considered neither challenging nor particularly expensive anymore. As a result, WGS is regarded as
the obvious and inevitable future of diagnostics in multiple reviews and opinion articles 72,75–79.
Figure 7. A schematic representation of the hypothetical workflow after adoption of WGS, with low complexity and an expected turnaround time within one day (Adapted from 72).
However, WGS diagnostics is still not widely adopted in clinical microbiology, which may seem to
contrast with the number of applications for which WGS has huge potential, and which are already
widely used in academic research 80.
Some major applications of WGS in diagnosing infectious diseases include:
i) Strain identification and typing. WGS data can be exploited to obtain information concerning the
bacterial species and subtype. WGS can also allow the phylogenetic placement of a given sequence
relative to an existing set of isolates for which the complete genome sequence is also known. WGS-
based strain identification offers a greater resolution compared to current genetic marker-based
approaches such as multi-locus sequence typing (MLST), pulsed-field gel electrophoresis (PFGE), and
variable-number tandem repeat (VNTR) profiling. The greater resolution offered by WGS is also of
major significance for bacteria with large accessory genomes. While the core genome contains the
essential housekeeping genes which are present in all members of a lineage, the accessory genome is
defined as the genome fraction containing nonessential genes. In K. pneumoniae and A. baumannii
most of the relevant genes, like those encoding resistance or virulence, are located in the
accessory genome.
ii) Phenotype prediction. WGS data provide a rich resource that can be exploited to predict the
pathogen’s phenotype. The major bacterial traits of clinical relevance are AMR and virulence, but
may also include other traits such as the ability to form biofilms or survival in the environment.
Concerning AMR prediction, several databases and bioinformatics tools were developed to detect
known genes and mutations associated with a resistance phenotype 81. More recently, the use of
machine learning (ML) techniques was assessed for the antimicrobial susceptibility prediction
without any previous knowledge of the actual AMR determinants involved 82. In general, ML
algorithms work by finding the relevant features in a complex data set that enable strong and reliable
prediction 83. ML algorithms are used to select the genomic features that are relevant to a given
antibiotic susceptibility profile. These relevant genomic features are then used as a phenotype
“classifier” for unknown genomes and as a source for identifying important genomic regions. From a
practical point of view, the counts of overlapping k-mers (subsequences of length ‘k’ contained
within a biological sequence) are computed and combined with the clinical laboratory-generated
phenotypic data for each antibiotic to form one large matrix containing both the k-mers and
antibiotics as features. Different algorithms (boosting algorithms, penalized regression models,
decision trees, random forest, neural networks or set cover machines) are then used to build a
predictive model 82.
iii) Tracking outbreaks and identifying sources of recurrent infections. WGS can identify isolates
which are part of an outbreak and, by combining epidemiological data with phylogenetic information,
detect putative transmission events between patients or between patients and the environment.
WGS was successfully employed to reconstruct outbreaks within hospitals and the community
caused by pathogens belonging to several species, including carbapenem-resistant K. pneumoniae 84–86
and A. baumannii 87. A recent review summarizes the major bioinformatics tools for outbreak
investigations 88.
iv) Improved surveillance. Molecular surveillance and real-time tracking of bacterial disease are
among the major promises of WGS implementation. In order to achieve this, the genomes sequenced
each year together with their metadata (e.g. sampling date, geographic location, isolation host) need
to be shared and methodically archived in an exploitable form. With such data, surveillance
initiatives have the capability to identify the likely geographic origin of emerging bacteria and AMR
genes, to group seemingly unrelated cases into outbreaks, and to clearly identify the emergence of
new clones. In a hospital environment, surveillance can help to detect cross-transmission events
between the hospital and the community and to improve antimicrobial stewardship; on a wider scale,
it can anticipate emerging worldwide trends, thereby enabling proactive policy decisions.
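The k-mer counting step mentioned under phenotype prediction (ii) can be illustrated with a minimal sketch. It assumes that each genome is available as a plain nucleotide string; the function names kmer_counts and build_matrix, the isolate names and the toy sequences are hypothetical, and real pipelines would typically use much longer k-mers and dedicated counting tools.

```python
from collections import Counter


def kmer_counts(sequence, k):
    """Count the overlapping k-mers of one sequence."""
    sequence = sequence.upper()
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def build_matrix(genomes, k):
    """Build a genome-by-k-mer count matrix.

    genomes maps an isolate name to its sequence. Returns the sorted k-mer
    vocabulary (column order) and, per isolate, the row of counts.
    """
    counts = {name: kmer_counts(seq, k) for name, seq in genomes.items()}
    vocabulary = sorted(set().union(*counts.values()))
    rows = {name: [c[kmer] for kmer in vocabulary] for name, c in counts.items()}
    return vocabulary, rows


# Toy example with two hypothetical isolates and k = 3.
vocabulary, rows = build_matrix({"isolate_A": "ACGTAC", "isolate_B": "ACGGAC"}, 3)
```

Each row of such a matrix would then be paired with the phenotypic AST result of the corresponding isolate for a given antibiotic, and the resulting table passed to one of the learning algorithms listed above.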
Despite the potential of WGS, there are some major bottlenecks to its implementation as a routine
clinical microbiology diagnostic tool. Major limitations include: the cost of performing WGS, which is
still high but keeps falling; a lack of clinical microbiologists with bioinformatics skills; a lack of the
necessary computational infrastructure in most medical settings; the incompleteness of reference
microbial genomics databases required for AMR and virulence determinants detection; and the lack
of standardized, effective and easy to use bioinformatics protocols 75,80.
1.3.1 Different WGS platforms
From 2005, novel sequencing technologies emerged under the name of second (or next) generation
sequencing platforms, as opposed to the automated Sanger method, which is a first-generation
technology (Figure 8). Three major technologies, Illumina, SOLiD and 454, were employed to
generate bacterial genomes. From 2011, Illumina displaced the other competitors, and nowadays it
represents the major sequencing platform 89.
Illumina sequencing is based on the sequencing-by-synthesis principle to elucidate the sequence of
DNA. Briefly, DNA polymerases catalyse the incorporation of fluorescently labelled deoxyribonucleotide
triphosphates (dNTPs) into a DNA template strand during successive cycles of DNA synthesis. During
each cycle, at the point of incorporation, the nucleotides are identified by fluorophore excitation.
This process takes place across millions of fragments in a massively parallel fashion. The size of the
Illumina reads (the fragments of DNA that are sequenced by the instrument) is up to 300 bases. With
appropriate multiplexing, the ordinary coverage for a bacterial genome sequence project is between
30 and 100 reads per base. Illumina read accuracy is typically around 99.9%, although
systematic biases related to GC-rich regions and some specific DNA motifs exist 90. Illumina has
developed several instruments ranging from low-throughput benchtop machines (MiniSeq, MiSeq) to
ultra-high-throughput instruments (HiSeq, NovaSeq). Illumina sequencing is considered as short-read
sequencing. Such reads are too short to span repeat elements such as transposons
and insertion sequences, which usually mobilize resistance and virulence determinants.
Consequently, short-read genome assemblies are fragmented and can consist of up to hundreds of
DNA fragments, called contigs. Sequencing technologies producing longer reads can cover such
repeats allowing the complete assembly of bacterial genomes.
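The coverage figures quoted above follow from simple arithmetic: mean coverage is the total number of sequenced bases divided by the genome size. The sketch below illustrates this relationship; the function names are hypothetical, and the genome size of 5.5 Mb used in the example is an assumed round value for a K. pneumoniae genome rather than a figure taken from this chapter.

```python
import math


def mean_coverage(n_reads, read_length, genome_size):
    """Mean number of reads covering each base (total bases / genome size)."""
    return n_reads * read_length / genome_size


def reads_needed(target_coverage, read_length, genome_size):
    """Smallest number of reads that reaches the target mean coverage."""
    return math.ceil(target_coverage * genome_size / read_length)


GENOME_SIZE = 5_500_000  # assumed genome size in bases (illustrative)

# One million 300-base reads on the assumed genome: about 54.5-fold coverage.
coverage = mean_coverage(1_000_000, 300, GENOME_SIZE)

# Reads of 300 bases required for 50-fold coverage of the assumed genome.
n_reads = reads_needed(50, 300, GENOME_SIZE)
```

This estimate ignores reads lost to quality filtering and uneven coverage across the genome, so the number of reads planned per isolate in a multiplexed run is usually set with some margin.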
In 2011, the first single-molecule, third generation long-read sequencing technology was released by
Pacific Biosciences (PacBio), while in 2014 Oxford Nanopore Technologies (ONT) released the MinION
instrument. PacBio’s single-molecule real-time (SMRT) sequencing is also based on the sequencing-by-synthesis
principle, as it detects sequence information during the replication process of the target
DNA molecule. The method is based on the optical observation of the polymerase-mediated
synthesis in real time. A zero-mode waveguide (ZMW), a hole less than half the wavelength of light,
limits fluorescent excitation to only a single polymerase together with its template. Consequently,
only fluorescently labelled nucleotides integrated into the growing DNA chain emit signals of
sufficient duration to be read 91.
SMRT sequencers (RSII, Sequel and Sequel II) have fast run times, typically less than three hours, and
the long reads produced can be longer than 80 kb. The raw base-called error rate has decreased over
recent years, and is now reduced to < 1% 92. However, the high cost per base compared
with Illumina technologies and the substantial cost of a PacBio sequencer represent major obstacles for
the implementation of this technology in the clinical microbiology laboratory 93.
The ONT sequencing principle is based on the passage of single-stranded DNA through a nanopore across which
a voltage is continuously applied. The current through the nanopore changes depending on which
base is passing through it. Such changes can be processed and translated to obtain the sequence of
the DNA molecule that passes through the pore 94. The MinION, the main ONT device, is a small
and portable sequencer that can be used outside of traditional laboratories. Its throughput is up to
30 Gb per run, and it can produce reads longer than 200 kb. The raw base-called error rate is claimed
to have been reduced to < 5% for nanopore sequences 95. An important feature of the MinION
sequencer is that the output can be analysed during its generation. This allows strain identification
within 30 minutes and prediction of the antibiotic resistance profile within 10 hours after the start of
a run 89.
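The error rates quoted for the three platforms can be compared on the logarithmic Phred scale commonly used for base qualities, where Q = -10 log10(p) and p is the probability of an incorrect base call. The Phred scale is not introduced in this chapter and is used here only as an illustrative conversion; the function names are hypothetical.

```python
import math


def phred_from_error(error_rate):
    """Phred quality score Q = -10 * log10(p) for an error probability p."""
    return -10 * math.log10(error_rate)


def error_from_phred(q):
    """Inverse conversion: error probability for a Phred score."""
    return 10 ** (-q / 10)


# Error rates as quoted in the text (upper bounds for PacBio and ONT).
quoted = {"Illumina": 0.001, "PacBio": 0.01, "ONT": 0.05}
scores = {platform: round(phred_from_error(p), 1) for platform, p in quoted.items()}
```

With these inputs, an accuracy of 99.9% corresponds to roughly Q30, an error rate of 1% to Q20, and an error rate of 5% to about Q13.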
Figure 8. Overview of the three generations of sequencing technologies, with examples of the major sequencing platforms 96.
1.4 Aims
Antimicrobial resistance is a severe threat to public health worldwide, leading to growing costs,
treatment failure, morbidity and mortality. Nowadays, the antibiotic resistance level of bacterial
strains can be assessed by simple, mostly culture-based clinical AST methods. Although the classic
tests are reliable, they require extensive manual laboratory work and results are normally obtained
only after several days. WGS is a high-throughput DNA sequencing strategy that can produce a large
amount of data in a single reaction. WGS could potentially reduce the turnaround time for laboratory
results and allow clinically actionable information to be obtained sooner than traditional laboratory
diagnostic tests. However, translating genomic information to AST results is challenging. Moreover,
WGS allows for high resolution epidemiologic investigations, fundamental to track the spread and
the evolution of novel ‘high-risk’ clones.
This research project focuses on the use of WGS in order to study collections of MDR strains obtained
from countries with high AMR rates. The general aim is to study the AMR mechanisms at the
genomic level, with particular focus on last line drugs, such as colistin, and to perform
epidemiological investigations about the nosocomial spread focusing mainly on clinical A. baumannii
and K. pneumoniae strains.
The research was part of an initiative to define new diagnostic routing in infectious disease under the
name of ND4ID (Novel Diagnostics for Infectious Diseases). This project received funding from the
European Union’s Horizon 2020 research and innovation program under the Marie Sklodowska-Curie
grant agreement No 675412.
The specific aims of this thesis are:
1. To investigate the genetic mechanisms of colistin resistance in K. pneumoniae (CHAPTER 2)
and A. baumannii (CHAPTER 3) from two countries facing high AMR levels. Resistance
mechanism analysis of other antimicrobials, plasmid analysis and genomic epidemiology
investigations were also performed.
2. To study the population of K. pneumoniae isolates collected over a 15-year period in the
Beijing hospital H301 (CHAPTER 4). WGS was employed to decipher the genomic
epidemiology, the AMR and virulence determinants, as well as the emergence of novel ‘high-
risk’ clones, characterized by hyper-virulence and MDR.
3. To build and evaluate a machine learning algorithm for the prediction of antimicrobial
susceptibilities from genomic data (CHAPTER 5), and to test the algorithm’s performance in the
phenotype prediction of K. pneumoniae genomes.
4. To perform classical molecular and enzymology techniques for the cloning, expression and
enzymatic activity testing of a novel carbapenemase. WGS was employed to detect the
putative determinant of carbapenem resistance and its genetic environment and to perform
phylogenetic analysis (CHAPTER 6).
1.5 References
1. O’Neill J. Review on Antimicrobial Resistance. Antimicrobial Resistance: Tackling a Crisis for the
Health and Wealth of Nations, 2014. 2014; 4.
2. Davies J, Davies D. Origins and evolution of antibiotic resistance. Microbiol Mol Biol rev 2010; 74:
417–33.
3. Aarestrup FM, Wegener HC, Collignon P. Resistance in bacteria of the food chain: Epidemiology
and control strategies. Expert Rev Anti Infect Ther 2008; 6: 733–50.
4. Rice LB. The clinical consequences of antimicrobial resistance. Curr Opin Microbiol 2009; 12: 476–
81.
28
5. Drlica K, Perlin DS. Antibiotic Resistance: Understanding and Responding to an Emerging Crisis.
Emerg Infect Dis 2011; 17: 1984–1984.
6. Partridge SR, Kwong SM, Firth N, Jensen SO. Mobile Genetic Elements Associated with
Antimicrobial Resistance. Clin Microbiol Rev 2018; 31: 1–61.
7. Munita JM, Arias CA. Mechanisms of Antibiotic Resistance. Microbiol Spectr 2016; 4: 464–72.
8. Nikaido H. Multidrug Resistance in Bacteria. Annu Rev Biochem 2009; 78: 119–46.
9. Erik Gullberg. Selection of Resistance at very low Antibiotic Concentrations. PhD thesis Uppsqle
Univ 2014; ISBN 978-9.
10. Ventola CL. The antibiotic resistance crisis: causes and threats. P T J 2015; 40: 277–83.
11. Rice LB. Federal Funding for the Study of Antimicrobial Resistance in Nosocomial Pathogens: No
ESKAPE. J Infect Dis 2008; 197: 1079–81.
12. Tacconelli E, Carrara E, Savoldi A, et al. Discovery, research, and development of new antibiotics:
the WHO priority list of antibiotic-resistant bacteria and tuberculosis. Lancet Infect Dis 2018; 18:
318–27.
13. Friedlaender C. Ueber die Schizomyceten bei der acuten fibrösen Pneumonie. Arch für Pathol
Anat und Physiol und für Klin Med 1882; 87: 319–24.
14. Podschun R, Ullmann U. Klebsiella spp. as nosocomial pathogens: Epidemiology, taxonomy,
typing methods, and pathogenicity factors. Clin Microbiol Rev 1998; 11: 589–603.
15. Bagley ST. Habitat association of Klebsiella species. Infect Control 1985; 6: 52–8.
16. Bengoechea JA SPJ. Klebsiella pneumoniae infection biology: living to counteract host defences.
FEMS Microbiol Rev 2019; 43(2):123-.
17. Navon-Venezia S, Kondratyeva K, Carattoli A. Klebsiella pneumoniae: a major worldwide source
and shuttle for antibiotic resistance. FEMS Microbiol Rev 2017; 013: 252–75.
18. Meatherall BL, Gregson D, Ross T, Pitout JDD, Laupland KB. Incidence, Risk Factors, and Outcomes
of Klebsiella pneumoniae Bacteremia. Am J Med 2009; 122: 866–73.
19. Tsay RW, Siu LK, Fung CP, Chang FY. Characteristics of bacteremia between community-acquired
and nosocomial Klebsiella pneumoniae infection: Risk factor for mortality and the impact of capsular
serotypes as a herald for community-acquired infection. Arch Intern Med 2002; 162: 1021–7.
29
20. Paczosa MK, Mecsas J. Klebsiella pneumoniae: Going on the Offense with a Strong Defense.
Microbiol Mol Biol Rev 2016; 80: 629–61.
21. Boucher HW, Talbot GH, Bradley JS, et al. Bad Bugs, No Drugs: No ESKAPE! An Update from the
Infectious Diseases Society of America. Clin Infect Dis 2009; 48: 1–12.
22. Liu YC, Cheng DL, Lin CL. Klebsiella pneumoniae Liver Abscess Associated With Septic
Endophthalmitis. Arch Intern Med 1986; 146: 1913–6.
23. Shon AS, Bajwa RPS, Russo TA. Hypervirulent (hypermucoviscous) Klebsiella Pneumoniae: A new
and dangerous breed. Virulence 2013; 4: 107–18.
24. Chen L, Kreiswirth BN. Convergence of carbapenem-resistance and hypervirulence in Klebsiella
pneumoniae. Lancet Infect Dis 2018; 18: 2–3.
25. Arena F, Henrici De Angelis L, D’Andrea MM, et al. Infections caused by carbapenem-resistant
Klebsiella pneumoniae with hypermucoviscous phenotype: A case report and literature review.
Virulence 2017; 8: 1900–8.
26. Jacoby GA, Sutton L. Properties of plasmids responsible for production of extended-spectrum
beta-lactamases. Antimicrob Agents Chemother 1991; 35: 164–9.
27. Sirot J, Chanal C, Petit A, Sirot D, Labia R, Gerbaud G. Klebsiella pneumoniae and other
Enterobacteriaceae producing novel plasmid-mediated β-lactamases markedly active against third-
generation cephalosporins: Epidemiologic studies. Clin Infect Dis 1988; 10: 850–9.
28. Jacoby GA, Medeiros AA, O’brien TF, Pinto ME, Jiang H. Broad-Spectrum, Transmissible β-
Lactamases. N Engl J Med 1988; 319: 723–4.
29. Paterson DL, Bonomo RA. Extended-Spectrum β-Lactamases: a Clinical Update. Clin Microbiol Rev
2005; 18: 657–86.
30. Cantón R, González-Alba JM, Galán JC. CTX-M enzymes: Origin and diffusion. Front Microbiol 2012;
3.
31. Queenan AM, Bush K. Carbapenemases: the versatile beta-lactamases. Clin Microbiol Rev 2007;
20: 440–58.
32. Tzouvelekis LS, Markogiannakis A, Psichogiou M, Tassios PT, Daikos GL. Carbapenemases in
Klebsiella pneumoniae and other Enterobacteriaceae: An evolving crisis of global dimensions. Clin
Microbiol Rev 2012; 25: 682–707.
30
33. Arnold RS, Thom KA, Sharma S, Phillips M, Kristie Johnson J, Morgan DJ. Emergence of Klebsiella
pneumoniae carbapenemase-producing bacteria. South Med J 2011; 104: 40–5.
34. Capone A, Giannella M, Fortini D, et al. High rate of colistin resistance among patients with
carbapenem-resistant Klebsiella pneumoniae infection accounts for an excess of mortality. Clin
Microbiol Infect 2013; 19.
35. Dixon RA, Chopra I. Leakage of periplasmic proteins from Escherichia coli mediated by polymyxin
B nonapeptide. Antimicrob Agents Chemother 1986; 29: 781–8.
36. Li J, Nation RL, Turnidge JD, et al. Colistin: the re-emerging antibiotic for multidrug-resistant
Gram-negative bacterial infections. Lancet Infect Dis 2006; 6: 589–601.
37. Poirel L, Jayol A, Nordmann P. Polymyxins: Antibacterial Activity, Susceptibility Testing, and
Resistance Mechanisms Encoded by Plasmids or Chromosomes. Clin Microbiol Rev 2017; 30: 557–96.
38. Cheng H-Y, Chen Y-F, Peng H-L. Molecular characterization of the PhoPQ-PmrD-PmrAB mediated
pathway regulating polymyxin B resistance in Klebsiella pneumoniae CG43. J Biomed Sci 2010; 17: 60.
39. Cannatelli A, D’Andrea MM, Giani T, et al. In vivo emergence of colistin resistance in Klebsiella
pneumoniae producing KPC-type carbapenemases mediated by insertional inactivation of the
PhoQ/PhoP mgrB regulator. Antimicrob Agents Chemother 2013; 57: 5521–6.
40. Cannatelli A, Giani T, D’Andrea MM, et al. MgrB inactivation is a common mechanism of colistin
resistance in KPC-producing klebsiella pneumoniae of clinical origin. Antimicrob Agents Chemother
2014; 58: 5696–703.
41. Wright MS, Suzuki Y, Jones MB, et al. Genomic and transcriptomic analyses of colistin-resistant
clinical isolates of Klebsiella pneumoniae reveal multiple pathways of resistance. Antimicrob Agents
Chemother 2015; 59: 536–43.
42. Liu YY, Wang Y, Walsh TR, et al. Emergence of plasmid-mediated colistin resistance mechanism
MCR-1 in animals and human beings in China: A microbiological and molecular biological study.
Lancet Infect Dis 2016; 16: 161–8.
43. Xavier BB, Lammens C, Ruhal R, et al. Identification of a novel plasmid-mediated colistin-
resistance gene, mcr-2, in Escherichia coli, Belgium, June 2016. Euro Surveill 2016; 21: 30280.
44. Wenjuan Yin A, Hui Li, a Yingbo Shen, a Zhihai Liu A, Shaolin Wang A, et al. Novel Plasmid-
Mediated Colistin Resistance Gene mcr-3 in Escherichia coli. MBio 2017.
31
45. Carattoli A, Villa L, Feudi C, et al. Novel plasmid-mediated colistin resistance mcr-4 gene in
Salmonella and Escherichia coli , Italy 2013, Spain and Belgium, 2015 to 2016. Eurosurveillance 2017;
22: 30589.
46. Borowiak M, Fischer J, Hammerl JA, Hendriksen RS, Szabo I, Malorny B. Identification of a novel
transposon-associated phosphoethanolamine transferase gene, mcr-5, conferring colistin resistance
in d-tartrate fermenting Salmonella enterica subsp. enterica serovar Paratyphi B. J Antimicrob
Chemother 2017: 3317–24.
47. Yang Y, Li Y, Lei C, Zhang A, Wang H. Novel plasmid-mediated colistin resistance gene mcr-7.1 in
Klebsiella pneumoniae. 2018: 5–9.
48. Wang X, Wang Y, Zhou Y, et al. Emergence of a novel mobile colistin resistance gene , mcr-8 , in
NDM-producing Klebsiella pneumoniae. Emerg Microbes Infect 2018: 1–9.
49. Lima WG, Alves MC, Cruz WS, Paiva MC. Chromosomally encoded and plasmid-mediated
polymyxins resistance in Acinetobacter baumannii: a huge public health threat. Eur J Clin Microbiol
Infect Dis 2018; 37: 1009–19.
50. Carroll LM, Gaballa A, Guldimann C, Sullivan G, Henderson LO, Wiedmann M. Identification of
novel mobilized colistin resistance gene mcr-9 in a multidrug-resistant, colistin-susceptible
Salmonella enterica serotype typhimurium isolate. MBio 2019; 10.
51. Russo TA, Marr CM. Hypervirulent Klebsiella pneumoniae. Clin Microbiol Rev 2019; 32: 1–42.
52. Ko WC, Paterson DL, Sagnimeni AJ, et al. Community-acquired Klebsiella pneumoniae bacteremia:
Global differences in clinical patterns. Emerg Infect Dis 2002; 8: 160–6.
53. Wang J, Chen K, Fang C, Hsueh P, Yang P, Chang S. Changing Bacteriology of Adult Community‐
Acquired Lung Abscess in Taiwan: Klebsiella pneumoniae versus Anaerobes . Clin Infect Dis 2005; 40:
915–22.
54. Kabha K, Nissimov L, Athamna A, et al. Relationships among capsular structure, phagocytosis, and
mouse virulence in Klebsiella pneumoniae. Infect Immun 1995; 63: 847–52.
55. Lam MMC, Wick RR, Wyres KL, et al. Genetic diversity, mobilisation and spread of the
yersiniabactin-encoding mobile element ICEKp in Klebsiella pneumoniae populations. Microb
Genomics 2018; 4.
56. Lam MMC, Wyres KL, Judd LM, et al. Tracking key virulence loci encoding aerobactin and
32
salmochelin siderophore synthesis in Klebsiella pneumoniae. Genome Med 2018; 10: 77.
57. Holt KE, Wertheim H, Zadoks RN, et al. Genomic analysis of diversity, population structure,
virulence, and antimicrobial resistance in Klebsiella pneumoniae, an urgent threat to public health.
Proc Natl Acad Sci 2015; 112: E3574–81.
58. Russo TA, Olson R, Fang CT, et al. Identification of biomarkers for differentiation of hypervirulent
Klebsiella pneumoniae from classical K. pneumoniae. J Clin Microbiol 2018; 56.
59. Gu D, Dong N, Zheng Z, et al. A fatal outbreak of ST11 carbapenem-resistant hypervirulent
Klebsiella pneumoniae in a Chinese hospital: A molecular epidemiological study. Lancet Infect Dis
2017; 18: 37–46.
60. Peleg AY, Seifert H, Paterson DL. Acinetobacter baumannii: Emergence of a Successful Pathogen.
Clin Microbiol Rev 2008; 21: 538–82.
61. Vincent JL, Rello J, Marshall J, et al. International study of the prevalence and outcomes of
infection in intensive care units. JAMA - J Am Med Assoc 2009; 302: 2323–9.
62. Jawad A, Seifert H, Snelling AM, Heritage J, Hawkey PM. Survival of Acinetobacter baumannii on
dry surfaces: Comparison of outbreak and sporadic isolates. J Clin Microbiol 1998; 36: 1938–41.
63. Héritier C, Poirel L, Fournier PE, Claverie JM, Raoult D, Nordmann P. Characterization of the
naturally occurring oxacillinase of Acinetobacter baumannii. Antimicrob Agents Chemother 2005; 49:
4174–9.
64. Brown S, Young HK, Amyes SGB. Characterisation of OXA-51, a novel class D carbapenemase
found in genetically unrelated clinical strains of Acinetobacter baumannii from Argentina. Clin
Microbiol Infect 2005; 11: 15–23.
65. Poirel L, Nordmann P. Carbapenem resistance in Acinetobacter baumannii: mechanisms and
epidemiology. Clin Microbiol Infect 2006; 12: 826–36.
66. Doi Y, Wachino J ichi, Arakawa Y. Aminoglycoside Resistance: The Emergence of Acquired 16S
Ribosomal RNA Methyltransferases. Infect Dis Clin North Am 2016; 30: 523–37.
67. Carretero-Ledesma M, García-Quintanilla M, Martín-Peña R, Pulido MR, Pachón J, McConnell MJ.
Phenotypic changes associated with colistin resistance due to lipopolysaccharide loss in
Acinetobacter baumannii. Virulence 2018; 9: 930–42.
68. Lucas DD, Crane B, Wright A, et al. Emergence of high-level colistin resistance in an Acinetobacter
33
baumannii clinical isolate mediated by inactivation of the global regulator H-NS. AAC 2018; 30: 1–17.
69. Trebosc V, Gartenmann S, Tötzl M, et al. Dissecting Colistin Resistance Mechanisms in Extensively
Drug-Resistant Acinetobacter baumannii Clinical Isolates. MBio 2019; 10.
70. Gerson S, Betts JW, Lucaßen K, et al. Investigation of Novel pmrB and eptA Mutations in Isogenic
Acinetobacter baumannii Isolates Associated with Colistin Resistance and Increased Virulence in vivo .
Antimicrob Agents Chemother 2019; 63: 1–15.
71. Potron A, Vuillemenot J-B, Puja H, et al. ISAba1-dependent overexpression of eptA in clinical
strains of Acinetobacter baumannii resistant to colistin. J Antimicrob Chemother 2019; 74: 2544–50.
72. Didelot X, Bowden R, Wilson DJ, Peto TEA, Crook DW. Transforming clinical microbiology with
bacterial genome sequencing. Nat Rev Genet 2012; 13: 601–12.
73. Fröding I, Vondracek M, Giske CG. Rapid EUCAST disc diffusion testing of MDR Escherichia coli
and Klebsiella pneumoniae: Inhibition zones for extended-spectrum cephalosporins can be reliably
read after 6 h of incubation. J Antimicrob Chemother 2017; 72: 1094–102.
74. Jonasson E, Matuschek E, Kahlmeter G. The EUCAST rapid disc diffusion method for antimicrobial
susceptibility testing directly from positive blood culture bottles. J Antimicrob Chemother 2020; 75:
968–78.
75. van Belkum A, Burnham C-AD, Rossen JWA, Mallard F, Rochas O, Dunne WM. Innovative and
rapid antimicrobial susceptibility testing systems. Nat Rev Microbiol 2020.
76. Pallen MJ, Loman NJ, Penn CW. High-throughput sequencing and clinical microbiology: Progress,
opportunities and challenges. Curr Opin Microbiol 2010; 13: 625–31.
77. Köser CU, Ellington MJ, Cartwright EJP, et al. Routine Use of Microbial Whole Genome
Sequencing in Diagnostic and Public Health Microbiology. PLoS Pathog 2012; 8.
78. Fricke WF, Rasko D a. Bacterial genome sequencing in the clinic: bioinformatic challenges and
solutions. Nat Rev Genet 2014; 15: 49–55.
79. Dunne Jr WM, Jaillard M, Rochas O, Van Belkum A. Microbial genomics and antimicrobial
susceptibility testing. Expert Rev Mol Diagn 2017; 17: 257–69.
80. Balloux F, Brynildsrud OB, Van Dorp L, et al. From Theory to Practice: Translating Whole-Genome
Sequencing (WGS) into the Clinic. Trends Microbiol 2018; xx: 1–14.
34
81. McArthur AG, Tsang KK. Antimicrobial resistance surveillance in the genomic age. Ann N Y Acad
Sci 2017; 1388: 78–91.
82. Su M, Satola SW, Read TD. Genome-based prediction of bacterial antibiotic resistance. J Clin
Microbiol 2019; 57: 1–15.
83. Libbrecht MW, Noble WS. Machine learning applications in genetics and genomics. Nat Rev Genet
2015; 16: 321–32.
84. Jiang Y, Wei Z, Wang Y, Hua X, Feng Y, Yu Y. Tracking a hospital outbreak of KPC-producing ST11
Klebsiella pneumoniae with whole genome sequencing. Clin Microbiol Infect 2015; 21: 1001–7.
85. Sheppard AE, Stoesser N, Wilson DJ, et al. Nested Russian doll-like genetic mobility drives rapid
dissemination of the carbapenem resistance gene blaKPC. Antimicrob Agents Chemother 2016; 60:
3767–78.
86. Yang S, Hemarajata P, Hindler J, et al. Evolution and Transmission of Carbapenem-Resistant
Klebsiella pneumoniae Expressing the blaOXA-232 Gene During an Institutional Outbreak Associated
With Endoscopic Retrograde Cholangiopancreatography. Clin Infect Dis 2017; 64: 894–901.
87. Fitzpatrick MA, Ozer EA, Hauser AR. Utility of Whole-Genome Sequencing in Characterizing
Acinetobacter Epidemiology and Analyzing Hospital Outbreaks. J Clin Microbiol 2016; 54: 593–612.
88. Quainoo S, Coolen JPM, van Hijum SAFT, et al. Whole-genome sequencing of bacterial pathogens:
The future of nosocomial outbreak analysis. Clin Microbiol Rev 2017; 30: 1015–63.
89. Schürch AC, van Schaik W. Challenges and opportunities for whole-genome sequencing–based
surveillance of antibiotic resistance. Ann N Y Acad Sci 2017; 1388: 108–20.
90. Schirmer M, Ijaz UZ, D’Amore R, Hall N, Sloan WT, Quince C. Insight into biases and sequencing
errors for amplicon sequencing with the Illumina MiSeq platform. Nucleic Acids Res 2015; 43: e37.
91. Eid J, Fehr A, Gray J, et al. Real-time DNA sequencing from single polymerase molecules. Science
(80- ) 2009; 323: 133–8.
92. Wenger AM, Peluso P, Rowell WJ, et al. Accurate circular consensus long-read sequencing
improves variant detection and assembly of a human genome. Nat Biotechnol 2019; 37: 1155–62.
93. Amarasinghe SL, Su S, Dong X, Zappia L, Ritchie ME, Gouil Q. Opportunities and challenges in
long-read sequencing data analysis. Genome Biol 2020; 21: 30.
35
94. Deamer D, Akeson M, Branton D. Three decades of nanopore sequencing. Nat Biotechnol 2016;
34: 518–24.
95. Jain M, Koren S, Miga KH, et al. Nanopore sequencing and assembly of a human genome with
ultra-long reads. Nat Biotechnol 2018; 36: 338–45.
96. Loman NJ, Pallen MJ. Twenty years of bacterial genome sequencing. Nat Rev Microbiol 2015; 13:
787–94.
36
CHAPTER 2 : Genomic epidemiology of carbapenem- and colistin-
resistant Klebsiella pneumoniae isolates from Serbia: predominance of
ST101 strains carrying a novel OXA-48 plasmid
Mattia Palmieri1, Marco Maria D’Andrea2,3, Andreu Coello Pelegrin1, Caroline Mirande4, Snezana
Brkic5, Ivana Cirkovic6, Herman Goossens7, Gian Maria Rossolini8,9, Alex van Belkum1
1bioMérieux, Data Analytics Unit, La Balme Les Grottes, France.
2Department of Biology, University of “Tor Vergata”, Rome, Italy.
3Department of Medical Biotechnologies, University of Siena, Siena, Italy.
4bioMérieux, R&D Microbiology, La Balme Les Grottes, France.
5Institute for Laboratory Diagnostics Konzilijum, Belgrade, Serbia.
6Institute of Microbiology and Immunology, Faculty of Medicine, University of Belgrade, Serbia.
7Laboratory of Medical Microbiology, Vaccine and Infectious Disease Institute, University of Antwerp,
Belgium.
8Microbiology and Virology Unit, Florence Careggi University Hospital, Florence, Italy.
9Department of Experimental and Clinical Medicine, University of Florence, Florence, Italy
Published in Frontiers in Microbiology, 21 February 2020, doi: 10.3389/fmicb.2020.00294
2.1 Abstract
Klebsiella pneumoniae is a major cause of severe healthcare-associated infections and often shows
MDR phenotypes. Carbapenem resistance is frequent, and colistin represents a key molecule to treat
infections caused by such isolates. Here we evaluated the antimicrobial resistance mechanisms and
the genomic epidemiology of clinical K. pneumoniae isolates from Serbia. Consecutive non-replicate
K. pneumoniae clinical isolates (n=2,298) were collected from seven hospitals located in five Serbian
cities and tested for carbapenem resistance by disk diffusion. Isolates resistant to at least one
carbapenem (n=426) were further tested for colistin resistance with Etest or Vitek2. Broth
microdilution (BMD) was performed to confirm the colistin resistance phenotype, and colistin-
resistant isolates (N=45, 10.6%) were characterized by Vitek2 and whole genome sequencing. Three
different clonal groups (CGs) were observed: CG101 (ST101, N=38), CG258 (ST437, N=4; ST340, N=1;
ST258, N=1) and CG17 (ST336, N=1). mcr genes, which encode acquired colistin resistance, were not
observed, while all the genomes presented mutations previously associated with colistin resistance.
In particular, all strains had a mutated MgrB, with MgrB C28S being the prevalent mutation and
associated with ST101. Isolates belonging to ST101 harbored the carbapenemase OXA-48, which is
generally encoded by an IncL/M plasmid; no such plasmid was detected in our isolates. MinION sequencing
was performed on a representative ST101 strain, and the obtained long reads were assembled
together with the Illumina high quality reads to decipher the blaOXA-48 genetic background. The blaOXA-
48 gene was located in a novel IncFIA-IncR hybrid plasmid, also containing the extended spectrum β-
lactamase-encoding gene blaCTX-M-15 and several other antimicrobial resistance genes. Non-ST101
isolates presented different MgrB alterations (C28S, C28Y, K2*, K3*, Q30*, adenine deletion leading
to frameshift and premature termination, IS5-mediated inactivation) and expressed different
carbapenemases: OXA-48 (ST437 and ST336), NDM-1 (ST437 and ST340) and KPC-2 (ST258). Our
study reports the clonal expansion of the newly emerging ST101 clone in Serbia. This high-risk clone
appears adept at acquiring resistance, and efforts should be made to contain the spread of this
clone.
2.2 Introduction
Klebsiella pneumoniae has emerged as one of the most challenging antibiotic-resistant pathogens,
since it can cause a variety of infections, including pneumonia and bloodstream infections, and
exhibits a remarkable propensity to acquire antimicrobial resistance (AMR) traits. In particular,
carbapenem-resistant K. pneumoniae (CRKP) are challenging pathogens due to the limited treatment
options, high mortality rates, and potential for rapid dissemination in health care settings (Paczosa
and Mecsas, 2016).
Treatment options for CRKP infections are usually limited to aminoglycosides, tigecycline, fosfomycin
and colistin. Novel β-lactam/β-lactamase inhibitor combinations, such as ceftazidime-avibactam and
meropenem-vaborbactam, have represented a major breakthrough for the treatment of some CRKP (e.g.
those producing KPC-type and OXA-48-like enzymes), but unfortunately they do not cover strains
producing metallo-carbapenemases (Bassetti et al., 2018). Colistin, despite its nephrotoxicity and
neurotoxicity, remains a key component of some anti-CRKP regimens (Karaiskos et al., 2017).
Colistin resistance (colR) is mainly mediated by modifications of the lipid A moiety of the bacterial
lipopolysaccharide (LPS) by addition of positively charged 4-amino-4-deoxy-L-arabinose (LAra4N)
and/or phosphoethanolamine (pEtN) residues. A large panel of genes and operons is involved in
modifications of the LPS, and mutations conferring colistin resistance have mainly been observed in
mgrB, phoP/phoQ, pmrA/pmrB, and crrB genes (Cheng et al., 2010; Cannatelli et al., 2013, 2014a;
Wright et al., 2015). Recently, several plasmid-mediated colistin resistance genes, named mcr,
encoding pEtN transferases, have also been reported in E. coli and other members of
Enterobacterales, including K. pneumoniae (Sun et al., 2018).
Global dissemination of CRKP is mainly caused by the spread of a few successful clones. Major
representatives of these high-risk clonal lineages include the Clonal Group (CG) 11, CG15, CG307,
CG17, CG37, CG101 and CG147 strains. CG258 strains, and in particular those of ST258, are major
players in the worldwide spread of KPC-type carbapenemases, and are responsible for 68% of the
CRKP outbreaks (Navon-Venezia et al., 2017). CG101 strains harbor different clinically-relevant
resistance determinants, such as carbapenemases of the KPC, OXA-48, VIM and NDM types, and
virulence genes, such as an integrative conjugative element carrying the yersiniabactin siderophore
(ICEKp3), the fimbriae cluster (mrkABCDFHIJ), the ferric uptake system (kfuABC), a capsular K type
K17, and an O antigen type of O1 (Roe et al., 2019). These features, together with the ability to
produce biofilm, are likely major factors in the ecological success of CG101 strains. Indeed, the spread
of this clone is on the rise (Navon-Venezia et al., 2017).
Multidrug resistance (MDR) prevalence in clinical isolates of K. pneumoniae, including resistance to
third-generation cephalosporins, fluoroquinolones and aminoglycosides, may be as high as 50% in
Southern Europe, and even higher proportions have been observed in Eastern Europe. In Serbia, in
2016, MDR K. pneumoniae accounted for 63% of all K. pneumoniae infections in humans, of which
35% were also carbapenem resistant (WHO Regional Office for Europe, 2017). Previous studies
reported that NDM-1 was the main K. pneumoniae-associated carbapenemase observed in Serbia in
the period 2013-2014 followed by OXA-48, while KPC was only sporadically reported (Grundmann et
al., 2017; Trudic et al., 2017). Novović et al. performed a molecular epidemiology study of
carbapenem- and colistin-resistant strains from Serbia, showing prevalence of CG258 and CG101
strains, producing NDM-1 and OXA-48 carbapenemases, respectively. However, the proportion of
colistin resistance among those isolates was not reported, and the mechanisms of colistin resistance
of those isolates were not elucidated (Novović et al., 2017).
In this study, we used whole genome sequencing (WGS) to study the genomic epidemiology and
antimicrobial resistance mechanisms of colR K. pneumoniae isolates from Serbia, including some
representatives of the previously described collection (Novović et al., 2017) as a reference to study
dynamic changes in population structure.
2.3 Materials and methods
Bacterial isolates and susceptibility testing. In the period between November 2013 and May 2017, K.
pneumoniae isolates were obtained from routine microbiological cultures of clinical samples (e.g.
urine, blood, skin, bronchial aspirate) from seven Serbian medical centers distributed in five Serbian
cities (Niš, Novi Sad, Belgrade, Kraljevo and Subotica). Bacteria were not isolated by the authors but
provided by the respective medical centers. Therefore, an ethics approval was not required as per
institutional and national guidelines and regulations. Information about patients' antimicrobial
treatment was not available. Identification at the species level was performed by MALDI-TOF MS
(Vitek MS, bioMérieux, Marcy l’Etoile, France), and carbapenem susceptibility was determined by
disk diffusion and interpreted according to the EUCAST breakpoints (EUCAST, 2019). Isolates non-
susceptible to at least one carbapenem (ertapenem, meropenem and imipenem) were tested for
colistin resistance by Vitek2 or Etest (bioMérieux, Marcy l’Etoile, France) according to manufacturer’s
instructions (note that the warning by EUCAST about colistin susceptibility testing was only issued in
July 2016, and for this reason the above methods were used for colistin susceptibility testing of the
isolates collected in this study). Antimicrobial susceptibility testing of the colR isolates was
performed using the Vitek2 automated system, and results were interpreted according to EUCAST
breakpoints (EUCAST, 2019). Colistin minimum inhibitory concentrations (MICs) were confirmed
using the broth microdilution method performed according to the CLSI guidelines (CLSI, 2019) and
interpreted by using the EUCAST breakpoints (EUCAST, 2019). For carbapenems (ertapenem,
imipenem and meropenem), MICs were obtained by using Etests (bioMérieux, Marcy l’Etoile, France).
Of note, 25 colR isolates were from the collection previously described by Novović et al. (2017), and were
included in this study for comparative purposes.
Mass spectrometry analysis of lipid A. Preparations of lipid A were obtained as previously described
(Kocsis et al., 2017). An aliquot of 0.7 µL of each preparation was spotted on a matrix-assisted laser
desorption/ionization–time of flight mass spectrometry (MALDI-TOF MS) sample plate, mixed with an
isovolume of norharmane matrix (Sigma-Aldrich, St Louis, Missouri) and then air-dried. Samples were
analyzed with a Vitek MS instrument (bioMérieux, Marcy l’Étoile, France) in the negative-ion mode.
DNA extraction and Whole Genome Sequencing. Genomic DNA was extracted with the DNeasy
UltraClean kit (Qiagen, Hilden, Germany), quantified by using the Qubit fluorometer (Thermo Fisher
Scientific, USA) and quality checked by using the 260/280 ratio absorbance parameter as determined
by the DS-11 FX+ instrument (DeNovix, Wilmington, USA). Sequencing was performed using a
NextSeq platform (Illumina, Inc., San Diego, USA) and a 2x150 bp paired-end approach. Raw data
from paired-end sequencing were quality checked with the FastQC tool (v.0.11.6) and assembled
with SPAdes (v.3.10.1) (Bankevich et al., 2012). One representative strain (KB-2017-139) was also
sequenced with the MinION sequencer (ONT, Oxford, UK) using an R9.5.1 flow cell and the protocol
1D Genomic DNA by Ligation (SQK-LSK109). Illumina and Nanopore raw data from KB-2017-139 were
assembled with a hybrid approach using Unicycler (Wick et al., 2017). Whole genome sequencing
data of the 45 clinical isolates have been deposited under BioProject PRJNA449293
(www.ncbi.nlm.nih.gov/bioproject/PRJNA449293). The complete sequence of the plasmid
pSRB_OXA-48 obtained by Illumina and Nanopore sequencing was deposited on GenBank under
accession number MN218814.
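The read quality check and assembly steps described above can be sketched as a small Python helper. This is only an illustration under stated assumptions: the file naming scheme (sample_R1.fastq.gz, sample_R2.fastq.gz), the output directory layout and the function name are hypothetical, and only the basic input and output options of FastQC, SPAdes and Unicycler are used. Any further settings applied in the study are not stated in the text and are not reproduced here. The function only builds the command lines; it does not run them.

```python
from pathlib import Path


def assembly_commands(sample, reads_dir, out_dir, long_reads=None):
    """Build command lines for read QC and assembly of one isolate.

    File names and directory layout are hypothetical placeholders.
    """
    r1 = Path(reads_dir) / f"{sample}_R1.fastq.gz"
    r2 = Path(reads_dir) / f"{sample}_R2.fastq.gz"
    out = Path(out_dir) / sample

    commands = [
        # Quality check of the raw paired-end reads
        ["fastqc", str(r1), str(r2)],
        # Short-read assembly of the Illumina data
        ["spades.py", "-1", str(r1), "-2", str(r2), "-o", str(out / "spades")],
    ]
    if long_reads is not None:
        # Hybrid assembly when Nanopore reads are also available
        commands.append(
            ["unicycler", "-1", str(r1), "-2", str(r2),
             "-l", str(long_reads), "-o", str(out / "unicycler")]
        )
    return commands


if __name__ == "__main__":
    for cmd in assembly_commands("KB-2017-139", "reads", "assemblies",
                                 long_reads="reads/KB-2017-139_ont.fastq.gz"):
        print(" ".join(cmd))
```

Each returned list could be passed to subprocess.run once the tools are installed; keeping command construction separate from execution makes the steps easy to inspect and log.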
Bioinformatics analysis. MLST was performed in silico by using the tool mlst
(https://github.com/tseemann/mlst) and the Pasteur database (https://bigsdb.pasteur.fr/). BLAST+
(2.7.1) was used to detect mutations in genes potentially involved in colistin resistance (mgrB,
pmrA/B, phoP/Q, crrA/B), and only mutations leading to amino acid variations were considered. For
the characterization of colistin resistance mechanisms, strains of CG258, ST101 and ST336 were
compared to colistin-susceptible reference strains of the same CG, i.e. NJST258_2 (accession no.
NZ_CP006918.1), BA33875 (NEWA00000000) and MGH-78578 (NC_009648.1), respectively.
Phylogenetic relatedness was investigated with the parsnp tool (v1.2) (Treangen et al., 2014) by using
default parameters and the strain NTUH-K2044 (accession no. NC_012731.1) as reference. The
phylogenetic tree obtained was visualized with the online tool iTol (Letunic and Bork, 2016). The
ABRicate tool (https://github.com/tseemann/abricate) was used to detect acquired antimicrobial
resistance genes using the ResFinder database (Zankari et al., 2012), while plasmid replicons were
predicted by PlasmidFinder (Carattoli et al., 2014). Kaptive was used for the capsular type detection
(Wyres et al., 2016). Comparative analysis of plasmids was performed with BLAST Ring Image
Generator (Alikhan et al., 2011) and Easyfig (Sullivan et al., 2011).
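The step of retaining only mutations that lead to amino acid variations can be illustrated with a short Python sketch. It is a simplification and not the BLAST+ workflow used in the study: it assumes the reference and query proteins are already translated, gap-free and aligned position by position, and the sequences below are toy placeholders, not the real MgrB sequence. Changes are named in the same style as in the text (e.g. C28S, K3*).

```python
def amino_acid_changes(reference, query):
    """List amino acid differences between two aligned, gap-free proteins.

    Changes are reported as <ref residue><1-based position><query residue>;
    '*' denotes a stop codon. Reporting ends at the first premature stop,
    since downstream residues would not be translated.
    """
    changes = []
    for position, (ref_aa, query_aa) in enumerate(zip(reference, query), start=1):
        if ref_aa != query_aa:
            changes.append(f"{ref_aa}{position}{query_aa}")
            if query_aa == "*":
                break
    return changes


if __name__ == "__main__":
    # Toy 30-residue reference with a cysteine at position 28 (hypothetical)
    toy_reference = "MKK" + "A" * 24 + "CDQ"
    print(amino_acid_changes(toy_reference, "MKK" + "A" * 24 + "SDQ"))  # ['C28S']
    print(amino_acid_changes(toy_reference, "MK*" + "A" * 24 + "CDQ"))  # ['K3*']
```

Insertions, deletions and insertion-sequence disruptions, such as the frameshift and IS5-mediated inactivation reported in this chapter, need a real alignment and are outside the scope of this sketch.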
For the comparative genomic analysis of ST101 isolates, on 31 October 2018 all the K. pneumoniae
genomes available on NCBI (N=5,820) were downloaded with the ncbi-genome-download tool
(https://github.com/kblin/ncbi-genome-download). MLST was performed and all ST101 (N=195)
(Table S2) together with ST101 strains from this study were used for phylogenetic investigations by
using parsnp and the closed ST101 chromosome from Kp_Goe_121641 (accession no.
NZ_CP018735.1) as reference.
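Selecting the ST101 genomes from the downloaded collection can be sketched as follows. The sketch assumes that the in silico MLST results are available as tab-separated text with the file name in the first column, the scheme in the second and the sequence type in the third, which is the layout documented for the mlst tool. The file names and scheme label in the example are invented for illustration, and the allele columns are omitted.

```python
import csv
import io


def genomes_with_st(mlst_tsv, sequence_type):
    """Return file names whose sequence type (third column) matches."""
    wanted = str(sequence_type)
    reader = csv.reader(io.StringIO(mlst_tsv), delimiter="\t")
    return [row[0] for row in reader if len(row) >= 3 and row[2] == wanted]


if __name__ == "__main__":
    # Hypothetical MLST output; allele columns are left out for brevity
    example = (
        "genome_A.fasta\tkpneumoniae\t101\n"
        "genome_B.fasta\tkpneumoniae\t258\n"
        "genome_C.fasta\tkpneumoniae\t101\n"
        "genome_D.fasta\tkpneumoniae\t-\n"
    )
    print(genomes_with_st(example, 101))  # ['genome_A.fasta', 'genome_C.fasta']
```

Comparing sequence types as strings keeps untypeable genomes, reported here with a dash, from raising conversion errors.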
2.4 Results
K. pneumoniae isolates and antimicrobial susceptibilities. In the period between November 2013
and May 2017, a total of 2,298 clinical isolates of K. pneumoniae were isolated from patients
admitted to seven medical settings located in five Serbian cities. Among those, 426 isolates (18.5%)
were non-susceptible to at least one carbapenem by disk diffusion, and were tested for colistin
resistance. A total of 45 strains (10.6%) out of this subset showed a colistin resistant phenotype. At
the time of the collection, colistin susceptibility testing was routinely performed with the Vitek2
instrument or Etest, although these methods had several limitations (Tan and Ng, 2007). Thus, the
number of colR isolates may be underestimated.
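The proportions reported above follow directly from the isolate counts. The short Python check below reproduces them, with the counts taken from the text and percentages rounded to one decimal place; the helper name is arbitrary.

```python
def percentage(part, whole):
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


total_isolates = 2298             # all K. pneumoniae clinical isolates
carbapenem_non_susceptible = 426  # non-susceptible to at least one carbapenem
colistin_resistant = 45           # colistin resistant within that subset

if __name__ == "__main__":
    print(percentage(carbapenem_non_susceptible, total_isolates))  # 18.5
    print(percentage(colistin_resistant, carbapenem_non_susceptible))  # 10.6
```

The 10.6% figure is therefore relative to the carbapenem non-susceptible subset, not to the whole collection.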
All the strains were confirmed as colistin resistant by the broth microdilution method (considering
the EUCAST susceptibility breakpoint of 2 mg/L) with MICs that ranged between 8 and 32 mg/L
(Table S1). Etest results for carbapenems showed that all the strains were resistant to ertapenem,
while meropenem and imipenem had susceptibility rates of 93.3% and 91.1%, respectively. Vitek2
results showed that none of the fluoroquinolones, penicillins combined with β-lactamase inhibitors
and cephalosporins (including cefoxitin and the 4th generation cephalosporin cefepime) were
effective against the 45 colR isolates. Conversely, amikacin (86% susceptibility) and
trimethoprim/sulfamethoxazole (78% susceptibility) were the most active agents together with
imipenem and meropenem (Table S1).
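The interpretation of a colistin MIC against the breakpoint mentioned above can be written as a one-line rule. The sketch assumes the 2 mg/L breakpoint is read as susceptible at or below 2 mg/L and resistant above it; the function name and the "S"/"R" labels are illustrative choices.

```python
def interpret_colistin_mic(mic_mg_per_l, breakpoint_mg_per_l=2.0):
    """Return 'R' when the MIC exceeds the breakpoint, otherwise 'S'."""
    return "R" if mic_mg_per_l > breakpoint_mg_per_l else "S"


if __name__ == "__main__":
    # The confirmed MICs in this collection ranged from 8 to 32 mg/L
    for mic in (8, 16, 32):
        print(mic, interpret_colistin_mic(mic))
```

With this rule, every MIC in the reported 8 to 32 mg/L range is classified as resistant, consistent with the broth microdilution confirmation described in the text.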
Genomic epidemiology. Genome sequence data were used to investigate the population structure of
the colR K. pneumoniae strains circulating in Serbia. Five different STs were detected among the
investigated collection (ST101, ST437, ST258, ST336 and ST340), with the majority of strains
belonging to ST101 (N=38) or CG258 (ST258, N=1; ST340, N=1 and ST437, N=4) (Figure 1). The
remaining strain belonged to CG17 and was typed as ST336. Isolates of ST101 were closely related to
each other (single nucleotide polymorphism (SNP) variation: 5–893, mean 107, median 61), with only
two of them (i.e. KV-2017-142 and KV-2017-143) having more than 200 SNPs when compared to
other ST101 isolates and to each other. The ST101 isolates were detected in all the cities involved in
this study, except Niš, thus demonstrating the endemicity at the national level of this clone.
Moreover, the isolates did not cluster clearly by hospital of origin, suggesting
inter-hospital cross-infection.
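The SNP summary statistics given above (range, mean, median) can be derived from pairwise comparisons of aligned sequences. The following is a minimal sketch, assuming a core-genome alignment in which all sequences have the same length; the function names and the toy sequences are illustrative and do not reproduce the parsnp-based workflow of this study.

```python
from itertools import combinations
from statistics import mean, median


def snp_distance(a: str, b: str) -> int:
    """Count differing positions in two aligned sequences, ignoring gaps and Ns."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    skip = {"-", "N"}
    return sum(
        1
        for x, y in zip(a.upper(), b.upper())
        if x != y and x not in skip and y not in skip
    )


def summarize(alignment: dict[str, str]) -> dict[str, float]:
    """Return min, max, mean and median of all pairwise SNP distances."""
    dists = [
        snp_distance(alignment[i], alignment[j])
        for i, j in combinations(sorted(alignment), 2)
    ]
    return {
        "min": min(dists),
        "max": max(dists),
        "mean": mean(dists),
        "median": median(dists),
    }


# Toy alignment (hypothetical isolates, 10 aligned positions).
toy = {
    "iso1": "ACGTACGTAC",
    "iso2": "ACGTACGTAT",
    "iso3": "ACGAACGTAT",
    "iso4": "TCGAACGTNT",
}
print(summarize(toy))
```

Positions carrying a gap or an ambiguous base in either sequence are skipped, so that missing data do not inflate the distances.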
Figure 1. Phylogenetic tree of the colR K. pneumoniae isolates from Serbia. For each isolate, the medical setting (CN, Clinical center of Niš, Niš; CV, Clinical center of Vojvodina, Novi Sad; KB, Konzilijum, Belgrade; DM, University hospital center “Dr Dragiša Mišovic-Dedinje”, Belgrade; KV, The General hospital “Studenica”, Kraljevo; GZ, The Institute of Public health of Belgrade, Belgrade; SU, General Hospital Subotica, Subotica), the year of isolation and the sample number are reported. Colored nodes indicate MLST, while the presence/absence of ESBLs, carbapenemases, resistance genes (black) and plasmid replicons is indicated by filled boxes.
The genomes of the ST101 Serbian isolates were compared with 195 ST101 genomes available in the
NCBI databases, and their phylogenetic relation is shown in Figure 2. Strains from our study (red
lines) cluster together in the tree in a well-defined branch containing other strains from Serbia,
Slovenia, Turkey and Greece. Overall, the number of SNPs among all analyzed ST101 isolates ranged
between 1 and 1,547 (mean 195, median 135), and two major lineages within this group can be
observed. The majority of SNPs separating these two lineages fell in the cps gene cluster, and this
was consistent with the previous observations that strains of ST101 are characterized by two
different K-loci, KL17 and KL106, associated with wzi alleles 137 and 29, respectively (Roe et al.,
2019). While KL17 is prevalent among ST101 strains, KL106 is less frequent but, interestingly, it is the
second most abundant capsular variant of CG258 (Wyres et al., 2015), reinforcing the hypothesis that
capsular exchange in K. pneumoniae is a common event (Chen et al., 2014; Bowers et al., 2015).
All non-ST101 isolates (excluding KB-2015-119) were part of a single monophyletic subclade within
the CG258 (Bowers et al., 2015) and produced different carbapenemases or were carbapenemase
negative (Figure 1), while the remaining isolate of ST336 was an OXA-48 producer and harbored the
KL25 capsular type.
Figure 2. Phylogenetic tree of the ST101 K. pneumoniae isolates from this study (red lines) in comparison to ST101 isolates retrieved from NCBI (black lines). The two types of capsular polysaccharides (KL17 and KL106) are indicated by colored ranges. Two datasets are also present, indicating the type of carbapenemase (inner circle) and the country of origin (outer circle).
Colistin resistance mechanisms. No mcr genes were observed in the genomes of the colR isolates.
Conversely, all of them showed alterations in the PhoP/PhoQ regulator mgrB gene. These alterations
were mainly SNPs, with the majority of ST101 isolates from this study characterized by the mutation
MgrBC28S (N=37; 97.4%). Although different substitutions of the cysteine amino acid at position 28
have already been described (e.g. MgrBC28F and MgrBC28Y), and their role in colistin resistance has
been experimentally demonstrated (Cannatelli et al., 2014b; Olaitan et al., 2014; Cheng et al., 2015;
Wright et al., 2015), the MgrBC28S substitution is described here for the first time. This cysteine residue has been previously
shown to be involved in a key disulfide bond relevant to MgrB function (Lippa and Goulian, 2012),
thus its substitution by serine or by any other amino acid is expected to interfere with the ability to
repress PhoQ, leading to the overexpression of the pmrHFIJKLM operon and to a colistin resistance
phenotype. The isolate CN-2013-099, belonging to ST340, displayed the previously studied MgrBC28Y
substitution (Cheng et al., 2015). Mutations leading to premature stop codons were MgrBK2*
in the ST101 isolate KV-2017-143, first described here, MgrBK3* in the ST437 isolate GZ-2017-145
(Nordmann et al., 2016) and MgrBQ30* in the ST336 strain KB-2015-119 (Nordmann et al., 2016). The
ST258 isolate was characterized by an insertion sequence of the family IS5 which interrupted the
mgrB gene at nucleotide 75. Disruption of the mgrB gene by insertion sequences has been shown to be
a common mechanism of colistin resistance in KPC harboring strains (Cannatelli et al., 2014b). Three
ST437 strains were characterized by an adenine deletion within the polyadenine region present from
nucleotide 4 to 9 in mgrB, resulting in a frameshift mutation. Collectively, the results of these
analyses demonstrated that all colistin resistant strains investigated in this study were characterized
by genetic alterations in the mgrB gene.
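The notation used above for the mgrB alterations (amino acid substitutions, premature stop codons and frameshifts) can be illustrated with a short sketch that compares a reference coding sequence with a mutated one. The sequences below are toy examples and not the real mgrB gene; the function names are hypothetical, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds: str) -> str:
    """Translate a coding sequence, stopping at the first stop codon (*)."""
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        aa = CODON_TABLE[cds[i:i + 3].upper()]
        protein.append(aa)
        if aa == "*":
            break
    return "".join(protein)


def describe_change(ref_cds: str, alt_cds: str) -> str:
    """Describe the first protein-level difference between two coding sequences."""
    if len(ref_cds) != len(alt_cds):
        shifted = (len(ref_cds) - len(alt_cds)) % 3 != 0
        return "frameshift" if shifted else "in-frame indel"
    ref, alt = translate(ref_cds), translate(alt_cds)
    for pos, (r, a) in enumerate(zip(ref, alt), start=1):
        if r != a:
            return f"{r}{pos}{a}"
    return "no change"


# Toy reference encoding M-K-K-C-Q-stop (NOT the real mgrB sequence).
REF = "ATGAAAAAATGTCAATAA"
print(describe_change(REF, "ATGAAAAAATCTCAATAA"))  # cysteine codon TGT -> TCT
print(describe_change(REF, "ATGTAAAAATGTCAATAA"))  # lysine codon AAA -> TAA
print(describe_change(REF, "ATGAAAAATGTCAATAA"))   # one adenine deleted
```

A single-base deletion in a homopolymer run changes the reading frame, which is why it is reported as a frameshift and not as a single substitution.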
Other genetic alterations potentially involved in colistin resistance were: PmrAE57G (KB-2015-119,
ST336), PmrBT157P (CCV-2015-105, ST101) and PhoQV446G (CCDM-2017-135, ST258). Among these, only
PmrBT157P was previously reported, and its role in reducing colistin susceptibility was demonstrated
(Jayol et al., 2014). Accordingly, the ST101 isolate CV-2015-105 having PmrBT157P together with
MgrBC28S, showed a colistin MIC 1- to 2-fold higher than isogenic strains carrying only MgrBC28S.
Mass spectrometry of lipid A was performed on a subset of isolates representative of the different
alterations potentially involved in colistin resistance. Compared to the colistin susceptible reference
ATCC11296 strain, colR isolates showed an additional peak at 1,971 m/z resulting from the addition
of a 4-amino-4-deoxy-L-arabinose moiety (131 m/z) to lipid A (peak at 1,840 m/z), as previously
reported (Leung et al., 2017) (results not shown). This supports the role of the observed mutations in
the overexpression of the pmrHFIJKLM operon and consequent lipid A modification, leading to
reduced colistin interactions. Moreover, no addition of pEtN moieties to lipid A was observed,
consistent with the absence of mcr-like genes (Liu et al., 2017).
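The interpretation of the lipid A spectra rests on simple mass arithmetic: the native lipid A peak at 1,840 m/z plus the 131 m/z shift of a 4-amino-4-deoxy-L-arabinose moiety gives the additional peak at 1,971 m/z. The sketch below makes this check explicit. The pEtN shift of 123 m/z is an assumption taken from general knowledge and is not stated in the text above; the function name and the tolerance value are likewise illustrative.

```python
L_ARA4N_SHIFT = 131.0  # m/z shift reported in the text for L-Ara4N addition
PETN_SHIFT = 123.0     # assumed m/z shift for pEtN addition (not from the text)


def annotate_peaks(base_mz: float, observed: list[float],
                   tolerance: float = 1.0) -> dict[str, bool]:
    """Report which lipid A modifications are supported by the observed peaks."""
    shifts = {"L-Ara4N": L_ARA4N_SHIFT, "pEtN": PETN_SHIFT}
    return {
        name: any(abs(mz - (base_mz + shift)) <= tolerance for mz in observed)
        for name, shift in shifts.items()
    }


# Peaks described for the colR isolates: native lipid A and the additional peak.
print(annotate_peaks(1840.0, [1840.0, 1971.0]))
# A spectrum showing only the native lipid A peak.
print(annotate_peaks(1840.0, [1840.0]))
```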
Of note, our findings concerning MgrB alterations differ from those previously reported by Novovic
et al., as they did not detect significant MgrB alterations for most of the isolates. This underlines the
importance of using well characterized colistin susceptible reference isolates, as the one used in the
mentioned study was not characterized with reference methods for colistin susceptibility testing
(Mirovic et al., 2012).
Other antibiotic resistance mechanisms. All strains were positive for an ESBL-encoding gene, with
blaCTX-M-15 harbored by all strains except the only ST258, which carried a blaSHV-12 gene. Analysis of the
ompK35 gene, encoding a major outer membrane protein, showed that all non-ST258 strains had
deletions leading to frameshift and premature stop codons, while the ompK36 gene was intact in all
the genomes. Outer membrane impermeability most likely explains resistance to cefoxitin (a
cephamycin) and to ertapenem for those isolates negative for a carbapenemase encoding gene
(Ardanuy et al., 1998). Two ST437 and the ST336 isolate harbored the 16S rRNA methylase gene
armA, which confers high level resistance to aminoglycosides. Several other antimicrobial resistance
genes were observed for the following antimicrobial classes: aminoglycosides (presence of aac-,
aad-, aph- and ant-type modifying enzymes), fluoroquinolones (oqxAB, qnrB1, aac(6’)-Ib-cr, parCS80I,
gyrAS83Y-S83I-D87G-D87N), phenicol (floR, catA1 and catB4 genes), sulfonamide (sul1 and sul2 genes),
tetracycline (tetA and tetD genes) and trimethoprim (dfrA).
Novel IncR/IncFIA OXA-48 plasmid within ST101 isolates. The production of OXA-48 was at the basis
of carbapenem resistance in the ST101 K. pneumoniae isolates analyzed in this study. For this reason, we
investigated the genetic context of this gene in depth. The spread of the blaOXA-48 gene among
Enterobacterales is mainly related to the dissemination of a single ~62-kb IncL/M-like conjugative
plasmid (Poirel et al., 2012). However, PlasmidFinder analysis did not detect any IncL/M replicon
among ST101 isolates from Serbia. Therefore, MinION sequencing was performed on one
representative strain (KB-2017-139) with the aim to fully characterize the genomic background of the
blaOXA-48 gene.
The blaOXA-48 gene was located on a plasmid of 83,654 bp, named pSRB_OXA-48, carrying both the
IncR and the IncFIA type replicons, the blaCTX-M-15 and several other antimicrobial resistance genes
(tet(D), aac(6')-Ib-cr, blaOXA-1, catB3-like, aac(3)-IIa and dfrA14). A BLAST analysis showed that
pSRB_OXA-48 is a hybrid plasmid composed of i) a fragment having 99.7% identity with the IncFIA-IncR
pKp_Goe_641-1 plasmid (CP018737.1) and carrying the blaCTX-M-15 gene and several other
antimicrobial resistance genes (aac(3)-IIa, catB3, blaOXA-1, aac(6')-Ib-cr, aac(6')-Ib, ant(3'')-Ia, blaOXA-9,
blaTEM-1A, dfrA14), and ii) a fragment identical to the IncL/M plasmid pKp_Goe_641-2 (CP018736.1)
carrying the blaOXA-48 gene (Figure 3). Both these plasmids have been described in K. pneumoniae
strain Kp_Goe_121641 (accession no. NZ_CP018735.1), isolated from a refugee from North Africa
hospitalized in Germany, in 2013. The latter strain belongs to ST101 and has a median of 142 SNPs
(min 134, max 601) compared to the Serbian ST101 isolates from this study. Collectively these results
suggest that pSRB_OXA-48 likely originated by recombination events between two plasmids within
an ST101 strain related to Kp_Goe_121641. In order to elucidate the recombination mechanisms at
the origin of pSRB_OXA-48, we compared this plasmid to pKp_Goe_641-1 and to pRA35
(LN864821.1), an IncL/M plasmid similar to pKp_Goe_641-2 but with an intact structure of the
transposon Tn6237 carrying blaOXA-48 (Beyrouthy et al., 2014) (Figure 3). A detailed analysis showed
that pSRB_OXA-48 contained a copy of Tn6237 which was disrupted by an IS26 composite transposon
of 73.7 kbp sharing similarity with pKp_Goe_641-1. This hypothesis was corroborated by the
presence of 8-bp target site duplication sequences (5’-GCGAATAA-3’) flanking the composite
transposon region (Figure 4). The results of reads-mapping performed against pSRB_OXA-48 using
Illumina short-reads from the other ST101/OXA-48 strains were consistent with the presence of a
pSRB_OXA-48-related plasmid in all the ST101/OXA-48 isolates. Non-ST101 OXA-48 strains (ST336
KB-2015-119 and ST437 GZ-2017-145) had the IncL/M replicon, while lacking the IncFIA and IncR
replicons, suggesting that the blaOXA-48 gene was located in a classic IncL/M plasmid and not in a
pSRB_OXA-48-like plasmid (Figure 1).
Figure 3. BLAST Ring Image Generator output of the OXA-48 plasmid pSRB_OXA-48 from the ST101 isolate KB-2017-139 (violet) against the two major plasmids from the ST101 isolate Kp_Goe_121641 (pKp_Goe_641-1, in red and pKp_Goe_641-2 in green). Only identities >95% are indicated. Antimicrobial resistance genes are indicated in red, plasmid replicons in blue and all other genes in black.
Figure 4. Comparison of plasmids pSRB_OXA-48, pKp_Goe_641-1 and pRA35. Antimicrobial resistance genes, plasmid replicons and mobile elements are also indicated. TSD: target site duplication.
2.5 Discussion
This study exploited WGS to characterize a collection of colR CRKP isolates obtained from seven
medical settings and five Serbian cities over a nearly four-year period. Results showed that all the
isolates presented alterations in the PhoP/PhoQ regulator MgrB, confirming its major role in colR in K.
pneumoniae. Lipid A alterations associated with colR were also studied with MALDI-TOF MS. The
analysis revealed the addition of a 4-amino-4-deoxy-L-arabinose moiety to lipid A, but no addition of
pEtN moieties, for all isolates tested. These results support the role of the MgrB mutations in colistin
resistance, and also confirm the absence of mcr-like genes.
The predominant ST observed was ST101, an emerging high-risk clone detected worldwide and
associated with different carbapenemases and high mortality (Navon-Venezia et al., 2017; Can et al.,
2018). In a recent European survey of CRKP isolates, including 244 hospitals in 32 countries, four
major clonal lineages accounted for roughly 70% of the carbapenemase-producing isolates, including
ST 11, 15, 101, 258/512 and their derivatives (David et al., 2019). The first ST101 strain from Serbia
was isolated in 2013, and coproduced the OXA-48 and the NDM-1 carbapenemases (Seiffert et al.,
2014). Most of the colR ST101 from this study were carbapenemase-producers, and OXA-48 was the
only carbapenemase expressed. ST101/OXA-48 has been frequently reported, and in an 11-year
epidemiology study of OXA-48 producers among European and North African countries, a quarter of
the OXA-48 K. pneumoniae isolates belonged to ST101 (Potron et al., 2013). Outbreaks of
ST101/OXA-48 were also described, with reports from Spain (Pitart et al., 2011; Cubero et al., 2015),
Algeria (Loucif et al., 2016), Czech Republic (Skálová et al., 2016) and Greece (Avgoulea et al., 2018).
The challenging phenotypic detection of OXA-48 carbapenemases and the rapid horizontal transfer
of OXA-48-encoding plasmids favor hospital outbreaks linked to patient transfer (Skálová et al., 2016)
and draw attention to the need for continuous and meticulous surveillance, as well as timely
investigation.
The blaOXA-48 gene spread is mainly related to the dissemination of a single ~62-kb IncL/M-like
conjugative plasmid that does not carry additional resistance determinants (Poirel et al., 2012).
Conversely, ST101/OXA-48 isolates from this study carried a novel hybrid plasmid (pSRB_OXA-48)
with replicons IncR and IncFIA and encoding OXA-48, the CTX-M-15 ESBL and several other
antimicrobial resistance genes. Such plasmids confer an MDR phenotype which limits the use of most
β-lactams, including carbapenems. In fact, even though most isolates (91%) were susceptible to imipenem,
carbapenems have proven ineffective in an in vivo murine model (Wiskirchen et al.,
2014). Moreover, there have been a number of case reports and series describing treatment failures
with carbapenem-containing regimens in the treatment of OXA-48-producing bacterial infections
(Stewart et al., 2018). Ceftazidime-avibactam may represent an effective alternative against such
isolates, as previously reported (Kazmierczak et al., 2018).
Similarities among the Serbian ST101 strains, supported by the limited number of SNPs observed and
the presence of the same alteration in the mgrB gene, suggest a clonal expansion of this clone among
Serbian medical settings. This observation underscores the need to strengthen contact precautions
for patients diagnosed with or suspected of having CRKP infections to limit the diffusion of colR CRKP
of ST101.
Of note, colR ST101 strains have recently been associated with high mortality rates. Indeed, a
prospective cohort study showed that among colR isolates, ST101 was found to be a significant
independent predictor of patient mortality, with a 30-day patient mortality of 72% (Can et al., 2018).
In conclusion, this work represents the first genomic investigation of colistin resistance in K.
pneumoniae isolates from Serbia. The major role of MgrB mutations in colistin resistance in K.
pneumoniae, observed in strains of CG258, is here confirmed for those of ST101. We also report the
full sequence of a novel plasmid, pSRB_OXA-48, conferring an MDR phenotype and encoding the
ESBL CTX-M-15 and the carbapenemase OXA-48.
2.6 References
Alikhan, N.-F., Petty, N. K., Ben Zakour, N. L., and Beatson, S. A. (2011). BLAST Ring Image Generator
(BRIG): simple prokaryote genome comparisons. BMC Genomics 12, 402. doi:10.1186/1471-
2164-12-402.
Ardanuy, C., Liñares, J., Domínguez, M. A., Hernández-Allés, S., Benedí, V. J., and Martínez-Martínez,
L. (1998). Outer membrane profiles of clonally related Klebsiella pneumoniae isolates from
clinical samples and activities of cephalosporins and carbapenems. Antimicrob. Agents
Chemother. 42, 1636–40. Available at: http://www.ncbi.nlm.nih.gov/pubmed/9660996.
Avgoulea, K., Di Pilato, V., Zarkotou, O., Sennati, S., Politi, L., Cannatelli, A., et al. (2018).
Characterization of extensively- or pandrug-resistant ST147 and ST101 OXA-48-producing
Klebsiella pneumoniae isolates causing bloodstream infections in ICU patients. Antimicrob.
Agents Chemother., AAC.02457-17. doi:10.1128/AAC.02457-17.
Bankevich, A., Nurk, S., Antipov, D., Gurevich, A. A., Dvorkin, M., Kulikov, A. S., et al. (2012). SPAdes:
A New Genome Assembly Algorithm and Its Applications to Single-Cell Sequencing. J. Comput.
Biol. 19, 455–477. doi:10.1089/cmb.2012.0021.
Bassetti, M., Righi, E., Carnelutti, A., Graziano, E., and Russo, A. (2018). Multidrug-resistant Klebsiella
pneumoniae: challenges for treatment, prevention and infection control. Expert Rev. Anti.
Infect. Ther. 00, 14787210.2018.1522249. doi:10.1080/14787210.2018.1522249.
Beyrouthy, R., Robin, F., Delmas, J., Gibold, L., Dalmasso, G., Dabboussi, F., et al. (2014). IS1R-
Mediated Plasticity of IncL/M Plasmids Leads to the Insertion of blaOXA-48 into the Escherichia
coli Chromosome. Antimicrob. Agents Chemother. 58, 3785. doi:10.1128/AAC.02669-14.
Bowers, J. R., Kitchel, B., Driebe, E. M., MacCannell, D. R., Roe, C., Lemmer, D., et al. (2015). Genomic
Analysis of the Emergence and Rapid Global Dissemination of the Clonal Group 258 Klebsiella
pneumoniae Pandemic. PLoS One 10, e0133727. doi:10.1371/journal.pone.0133727.
Can, F., Menekse, S., Ispir, P., Atac, N., Albayrak, O., Demir, T., et al. (2018). Impact of the ST101
clone on fatality among patients with colistin-resistant Klebsiella pneumoniae infection. J.
Antimicrob. Chemother., 1–7. doi:10.1093/jac/dkx532.
Cannatelli, A., D’Andrea, M. M., Giani, T., Di Pilato, V., Arena, F., Ambretti, S., et al. (2013). In vivo
emergence of colistin resistance in Klebsiella pneumoniae producing KPC-type carbapenemases
mediated by insertional inactivation of the PhoQ/PhoP mgrB regulator. Antimicrob. Agents
Chemother. 57, 5521–5526. doi:10.1128/AAC.01480-13.
Cannatelli, A., Di Pilato, V., Giani, T., Arena, F., Ambretti, S., Gaibani, P., et al. (2014a). In vivo
evolution to Colistin resistance by PmrB sensor kinase mutation in KPC-producing Klebsiella
pneumoniae is associated with low-dosage colistin treatment. Antimicrob. Agents Chemother.
58, 4399–4403. doi:10.1128/AAC.02555-14.
Cannatelli, A., Giani, T., D’Andrea, M. M., Pilato, V. Di, Arena, F., Conte, V., et al. (2014b). MgrB
inactivation is a common mechanism of colistin resistance in KPC-producing Klebsiella
pneumoniae of clinical origin. Antimicrob. Agents Chemother. 58, 5696–5703.
doi:10.1128/AAC.03110-14.
Carattoli, A., Zankari, E., García-Fernández, A., Larsen, M. V., Lund, O., Villa, L., et al. (2014). In Silico
detection and typing of plasmids using plasmidfinder and plasmid multilocus sequence typing.
Antimicrob. Agents Chemother. 58, 3895–3903. doi:10.1128/AAC.02412-14.
Chen, L., Mathema, B., Pitout, J. D. D., DeLeo, F. R., and Kreiswirth, B. N. (2014). Epidemic Klebsiella
pneumoniae ST258 Is a Hybrid Strain. mBio 5, e01355-14. doi:10.1128/mBio.01355-14.
Cheng, H.-Y., Chen, Y.-F., and Peng, H.-L. (2010). Molecular characterization of the PhoPQ-PmrD-
PmrAB mediated pathway regulating polymyxin B resistance in Klebsiella pneumoniae CG43. J.
Biomed. Sci. 17, 60. doi:10.1186/1423-0127-17-60.
Cheng, Y. H., Lin, T. L., Pan, Y. J., Wang, Y. P., Lin, Y. T., and Wang, J. T. (2015). Colistin resistance
mechanisms in Klebsiella pneumoniae strains from Taiwan. Antimicrob. Agents Chemother. 59,
2909–2913. doi:10.1128/AAC.04763-14.
CLSI (2019). CLSI. Performance Standards for Antimicrobial Susceptibility Testing. 29th ed. CLSI
supplement M100. Wayne, PA: Clinical and Laboratory Standards Institute; 2019.
Cubero, M., Cuervo, G., Dominguez, M. Á., Tubau, F., Martí, S., Sevillano, E., et al. (2015).
Carbapenem-resistant and carbapenem-susceptible isogenic isolates of Klebsiella pneumoniae
ST101 causing infection in a tertiary hospital. BMC Microbiol. 15, 177. doi:10.1186/s12866-015-
0510-9.
David, S., Reuter, S., Harris, S. R., Glasner, C., Feltwell, T., Argimon, S., et al. (2019). Epidemic of
carbapenem-resistant Klebsiella pneumoniae in Europe is driven by nosocomial spread. Nat.
Microbiol. doi:10.1038/s41564-019-0492-8.
EUCAST (2019). The European Committee on Antimicrobial Susceptibility Testing. Breakpoint tables
for interpretation of MICs and zone diameters. Version 9.0, 2019. http://www.eucast.org.
Grundmann, H., Glasner, C., Albiger, B., Aanensen, D. M., Tomlinson, C. T., Andrasević, A. T., et al.
(2017). Occurrence of carbapenemase-producing Klebsiella pneumoniae and Escherichia coli in
the European survey of carbapenemase-producing Enterobacteriaceae (EuSCAPE): a prospective,
multinational study. Lancet Infect. Dis. 17, 153–163. doi:10.1016/S1473-3099(16)30257-2.
Jayol, A., Poirel, L., Brink, A., Villegas, M. V., Yilmaz, M., and Nordmann, P. (2014). Resistance to
colistin associated with a single amino acid change in protein PmrB among Klebsiella
pneumoniae isolates of worldwide origin. Antimicrob. Agents Chemother. 58, 4762–4766.
doi:10.1128/AAC.00084-14.
Karaiskos, I., Souli, M., Galani, I., and Giamarellou, H. (2017). Colistin: still a lifesaver for the 21st
century? Expert Opin. Drug Metab. Toxicol. 13, 59–71. doi:10.1080/17425255.2017.1230200.
Kazmierczak, K. M., Bradford, P. A., Stone, G. G., de Jonge, B. L. M., and Sahm, D. F. (2018). In Vitro
Activity of Ceftazidime-Avibactam and Aztreonam-Avibactam against OXA-48-Carrying
Enterobacteriaceae Isolated as Part of the International Network for Optimal Resistance
Monitoring (INFORM) Global Surveillance Program from 2012 to 2015. Antimicrob. Agents
Chemother. 62. doi:10.1128/AAC.00592-18.
Kocsis, B., Kilár, A., Péter, S., Dörnyei, Á., Sándor, V., and Kilár, F. (2017). “Mass Spectrometry for
Profiling LOS and Lipid A Structures from Whole-Cell Lysates: Directly from a Few Bacterial
Colonies or from Liquid Broth Cultures,” in Methods in molecular biology (Clifton, N.J.), 187–198.
doi:10.1007/978-1-4939-6958-6_17.
Letunic, I., and Bork, P. (2016). Interactive tree of life (iTOL) v3: an online tool for the display and
annotation of phylogenetic and other trees. Nucleic Acids Res. 44, W242-5.
doi:10.1093/nar/gkw290.
Leung, L. M., Fondrie, W. E., Doi, Y., Johnson, J. K., Strickland, D. K., Ernst, R. K., et al. (2017).
Identification of the ESKAPE pathogens by mass spectrometric analysis of microbial membrane
glycolipids. Sci. Rep. 7, 6403. doi:10.1038/s41598-017-04793-4.
Lippa, A. M., and Goulian, M. (2012). Perturbation of the Oxidizing Environment of the Periplasm
Stimulates the PhoQ/PhoP System in Escherichia coli. J. Bacteriol. 194, 1457–1463.
doi:10.1128/JB.06055-11.
Liu, Y.-Y., Chandler, C. E., Leung, L. M., McElheny, C. L., Mettus, R. T., Shanks, R. M. Q., et al. (2017).
Structural Modification of Lipopolysaccharide Conferred by mcr-1 in Gram-Negative ESKAPE
Pathogens. Antimicrob. Agents Chemother. 61, e00580-17. doi:10.1128/AAC.00580-17.
Loucif, L., Kassah Laouar, A., Saidi, M., Messala, A., Chelaghma, W., and Rolain, J.-M. (2016).
Outbreak of OXA-48-producing Klebsiella pneumoniae involving an ST 101 clone in Batna
University Hospital, Algeria. Antimicrob. Agents Chemother. 60, AAC.00525-16.
doi:10.1128/AAC.00525-16.
Mirovic, V., Tomanovic, B., Lepsanovic, Z., Jovcic, B., and Kojic, M. (2012). Isolation of Klebsiella
pneumoniae Producing NDM-1 Metallo-β-Lactamase from the Urine of an Outpatient Baby Boy
Receiving Antibiotic Prophylaxis. Antimicrob. Agents Chemother. 56, 6062–6063.
doi:10.1128/AAC.00838-12.
Navon-Venezia, S., Kondratyeva, K., and Carattoli, A. (2017). Klebsiella pneumoniae: a major
worldwide source and shuttle for antibiotic resistance. FEMS Microbiol. Rev. 013, 252–275.
doi:10.1093/femsre/fux013.
Nordmann, P., Jayol, A., and Poirel, L. (2016). Rapid Detection of Polymyxin Resistance in
Enterobacteriaceae. Emerg. Infect. Dis. 22, 1038–1043. doi:10.3201/eid2206.151840.
Novović, K., Trudić, A., Brkić, S., Vasiljević, Z., Kojić, M., Medić, D., et al. (2017). Molecular
Epidemiology of Colistin-Resistant, Carbapenemase-Producing Klebsiella pneumoniae in Serbia
from 2013 to 2016. Antimicrob. Agents Chemother. 61, e02550-16. doi:10.1128/AAC.02550-16.
Olaitan, A. O., Diene, S. M., Kempf, M., Berrazeg, M., Bakour, S., Gupta, S. K., et al. (2014). Worldwide
emergence of colistin resistance in Klebsiella pneumoniae from healthy humans and patients in
Lao PDR, Thailand, Israel, Nigeria and France owing to inactivation of the PhoP/PhoQ regulator
mgrB: an epidemiological and molecular study. Int. J. Antimicrob. Agents 44, 500–507.
doi:10.1016/j.ijantimicag.2014.07.020.
Paczosa, M. K., and Mecsas, J. (2016). Klebsiella pneumoniae: Going on the Offense with a Strong
Defense. Microbiol. Mol. Biol. Rev. 80, 629–61. doi:10.1128/MMBR.00078-15.
Pitart, C., Solé, M., Roca, I., Fàbrega, A., Vila, J., and Marco, F. (2011). First Outbreak of a Plasmid-
Mediated Carbapenem-Hydrolyzing OXA-48 β-Lactamase in Klebsiella pneumoniae in Spain.
Antimicrob. Agents Chemother. 55, 4398–4401. doi:10.1128/AAC.00329-11.
Poirel, L., Bonnin, R. A., and Nordmann, P. (2012). Genetic features of the widespread plasmid coding
for the carbapenemase OXA-48. Antimicrob. Agents Chemother. 56, 559–62.
doi:10.1128/AAC.05289-11.
Potron, A., Poirel, L., Rondinaud, E., and Nordmann, P. (2013). Intercontinental spread of OXA-48
beta-lactamase-producing Enterobacteriaceae over a 11-year period, 2001 to 2011. Euro
Surveill. 18. doi:10.2807/1560-7917.es2013.18.31.20549.
Roe, C. C., Vazquez, A. J., Esposito, E. P., Zarrilli, R., and Sahl, J. W. (2019). Diversity, Virulence, and
Antimicrobial Resistance in Isolates From the Newly Emerging Klebsiella pneumoniae ST101
Lineage. Front. Microbiol. 10, 1–13. doi:10.3389/fmicb.2019.00542.
Seiffert, S. N., Marschall, J., Perreten, V., Carattoli, A., Furrer, H., and Endimiani, A. (2014). Emergence
of Klebsiella pneumoniae co-producing NDM-1, OXA-48, CTX-M-15, CMY-16, QnrA and ArmA in
Switzerland. Int. J. Antimicrob. Agents 44, 260–262. doi:10.1016/j.ijantimicag.2014.05.008.
Skálová, A., Chudějová, K., Rotová, V., Medvecky, M., Študentová, V., Chudáčková, E., et al. (2016).
Molecular characterization of OXA-48-like-producing Enterobacteriaceae in the Czech Republic:
evidence for horizontal transfer of pOXA-48-like plasmids. Antimicrob. Agents Chemother. 61,
AAC.01889-16. doi:10.1128/AAC.01889-16.
Stewart, A., Harris, P., Henderson, A., and Paterson, D. (2018). Treatment of Infections by OXA-48-
Producing Enterobacteriaceae. Antimicrob. Agents Chemother. 62. doi:10.1128/AAC.01195-18.
Sullivan, M. J., Petty, N. K., and Beatson, S. A. (2011). Easyfig: a genome comparison visualizer.
Bioinformatics 27, 1009–1010. doi:10.1093/bioinformatics/btr039.
Sun, J., Zhang, H., Liu, Y.-H., and Feng, Y. (2018). Towards Understanding MCR-like Colistin Resistance.
Trends Microbiol. 26, 794–808. doi:10.1016/j.tim.2018.02.006.
Tan, T. Y., and Ng, S. Y. (2007). Comparison of Etest, Vitek and agar dilution for susceptibility testing
of colistin. Clin. Microbiol. Infect. 13, 541–544. doi:10.1111/j.1469-0691.2007.01708.x.
Treangen, T. J., Ondov, B. D., Koren, S., and Phillippy, A. M. (2014). The Harvest suite for rapid core-
genome alignment and visualization of thousands of intraspecific microbial genomes. Genome
Biol. 15, 524. doi:10.1186/s13059-014-0524-x.
Trudic, A., Jelesic, Z., Mihajlovic-Ukropina, M., Medic, D., Zivlak, B., Gusman, V., et al. (2017).
Carbapenemase production in hospital isolates of multidrug-resistant Klebsiella pneumoniae
and Escherichia coli in Serbia. Vojnosanit. Pregl. 74, 715–721. doi:10.2298/VSP150917260T.
WHO Regional Office for Europe (2017). Central Asian and Eastern European Surveillance of
Antimicrobial Resistance. doi:10.2307/3395557.
Wick, R. R., Judd, L. M., Gorrie, C. L., and Holt, K. E. (2017). Unicycler: Resolving bacterial genome
assemblies from short and long sequencing reads. PLoS Comput. Biol. 13, e1005595.
doi:10.1371/journal.pcbi.1005595.
Wiskirchen, D. E., Nordmann, P., Crandon, J. L., and Nicolau, D. P. (2014). Efficacy of Humanized
Carbapenem and Ceftazidime Regimens against Enterobacteriaceae Producing OXA-48
Carbapenemase in a Murine Infection Model. Antimicrob. Agents Chemother. 58, 1678–1683.
doi:10.1128/AAC.01947-13.
Wright, M. S., Suzuki, Y., Jones, M. B., Marshall, S. H., Rudin, S. D., van Duin, D., et al. (2015). Genomic
and transcriptomic analyses of colistin-resistant clinical isolates of Klebsiella pneumoniae reveal
multiple pathways of resistance. Antimicrob. Agents Chemother. 59, 536–43.
doi:10.1128/AAC.04037-14.
Wyres, K. L., Gorrie, C., Edwards, D. J., Wertheim, H. F. L., Hsu, L. Y., Van Kinh, N., et al. (2015).
Extensive capsule locus variation and large-scale genomic recombination within the Klebsiella
pneumoniae clonal group 258. Genome Biol. Evol. 7, 1267–1279. doi:10.1093/gbe/evv062.
Wyres, K. L., Wick, R. R., Gorrie, C., Jenney, A., Follador, R., Thomson, N. R., et al. (2016).
Identification of Klebsiella capsule synthesis loci from whole genome data. Microb. Genomics 2,
e000102. doi:10.1099/mgen.0.000102.
Zankari, E., Hasman, H., Cosentino, S., Vestergaard, M., Rasmussen, S., Lund, O., et al. (2012).
Identification of acquired antimicrobial resistance genes. J. Antimicrob. Chemother. 67, 2640–
2644. doi:10.1093/jac/dks261.
CHAPTER 3 : Abundance of colistin-resistant, OXA-23- and ArmA-
producing Acinetobacter baumannii belonging to International Clone 2
in Greece
Mattia Palmieri1, Marco Maria D’Andrea2,3, Andreu Coello Pelegrin1, Nadine Perrot4, Caroline
Mirande4, Bernadette Blanc4, Nicholas Legakis5*, Herman Goossens6, Gian Maria Rossolini7,8 and
Alex van Belkum1
1bioMérieux, Data Analytics Unit, La Balme Les Grottes, France
2Department of Medical Biotechnologies, University of Siena, Siena, Italy.
3Department of Biology, University of Rome “Tor Vergata”, Rome, Italy.
4bioMérieux, R&D Microbiology, La Balme Les Grottes, France
5Central Laboratories, IASO Group Hospitals, Athens, Greece
6Laboratory of Medical Microbiology, Vaccine and Infectious Disease Institute, University of Antwerp, Antwerp, Belgium
7Department of Experimental and Clinical Medicine, University of Florence, Florence, Italy
8Clinical Microbiology and Virology Unit, Florence Careggi University Hospital, Florence, Italy
Published in Frontiers in Microbiology, 15 April 2020, doi: 10.3389/fmicb.2020.00668
3.1 Abstract
Carbapenem resistant Acinetobacter baumannii (CRAB) represents one of the most challenging
pathogens in clinical settings. Colistin is routinely used for treatment of infections by this pathogen,
but increasing colistin resistance has been reported. We obtained 122 CRAB isolates from nine Greek
hospitals between 2015 and 2017, and the colistin-resistant (ColR) isolates (N=40, 32.8%) were whole
genome sequenced, together with two colistin-susceptible (ColS) isolates for comparison. All ColR
isolates were characterized by a previously described mutation, PmrBA226V, which was associated with
low-level colistin resistance. Some isolates were characterized by additional mutations in PmrB
(E140V or L178F) or PmrA (K172I or D10N), first described here, and higher colistin MICs, up to 64
mg/L. Mass spectrometry analysis of lipid A showed the presence of a phosphoethanolamine (pEtN)
moiety on lipid A, likely resulting from the PmrA/B-induced pmrC overexpression. Interestingly,
the two ColS isolates also had the same lipid A modification, suggesting that not all lipid A modifications
lead to colistin resistance or that other factors could contribute to the resistance phenotype. Most of
the isolates (N=37, 92.5%) belonged to the globally distributed international clone (IC) 2 and
comprised four different sequence types (STs) as defined by using the Oxford scheme (ST 425, 208,
451 and 436). Three isolates belonged to IC1 and ST1567. All the genomes harbored an intrinsic
blaOXA-51 group carbapenemase gene, where blaOXA-66 and blaOXA-69 were associated with IC2 and IC1,
respectively. Carbapenem resistance was due to the most commonly reported acquired
carbapenemase gene blaOXA-23, with ISAba1 located upstream of the gene and likely increasing its
expression. The armA gene, associated with high-level resistance to aminoglycosides, was detected
in 87.5 % of isolates. Collectively, these results revealed a convergent evolution of different clonal
lineages towards the same colistin resistance mechanism, thus limiting the effective therapeutic
options for the treatment of CRAB infections.
3.2 Introduction
Acinetobacter baumannii is now recognized as a major hospital pathogen owing to its ability to resist major
antimicrobials and to survive in the healthcare environment (Peleg et al., 2008). Currently,
carbapenem resistant A. baumannii (CRAB) is widespread, with rates reaching or exceeding 90% in
some clinical settings in Southern and Eastern European countries (ECDC, 2018) and elsewhere
(https://resistancemap.cddep.org), and mortality rates for the most common CRAB infections such as
bloodstream infections and hospital-acquired pneumonia approaching 60% (Wong et al., 2017).
OXA-type carbapenemases constitute the most prevalent mechanism of carbapenem resistance in
this species, with OXA-23, OXA-24 and OXA-58 being the most prevalent enzymes (Poirel and
Nordmann, 2006). Molecular epidemiological studies usually revealed an oligoclonal distribution of
CRAB, with outbreak strains mostly belonging to international clones (IC) 1 and 2 (Zarrilli et al., 2013).
In Greece, since their first emergence in 2000, CRAB have become endemic, and the percentage of
carbapenem resistance reached 94% in 2017 (Tsakris et al., 2003; ECDC, 2018). Regarding the CRAB
clonal nature and carbapenemase gene content, a study conducted from 2000 to 2009 in Greece
showed that CRAB were harboring only the OXA-58 carbapenemase gene; while IC1 was prevalent
until 2004, IC2 became dominant during 2005–2009 (Gogou et al., 2011). Between 2009 and 2011,
OXA-23 producers emerged and replaced the previously predominant OXA-58 producing A.
baumannii strains (Liakopoulos et al., 2012). Recently, a molecular epidemiological study on
contemporary CRAB clinical isolates derived from hospitals throughout Greece demonstrated the
predominance of OXA-23 producers belonging to IC2 (Pournaras et al., 2017).
Colistin-based treatment often represents the only therapeutic option for CRAB infections (Viehman
et al., 2014). However, CRAB isolates that are also ColR are being reported more frequently. Data
from the EARS-Net study in 2016 collected from 30 European countries showed that 4.0 % of the
tested isolates were resistant to colistin, with the vast majority (70.7 %) of the resistant isolates
reported from Greece and Italy (ECDC, 2017). A study from Greece reported an increase in colistin
resistance from 1% in 2012 to 21.1% in 2014 (Oikonomou et al., 2015), while Pournaras et al.
reported a resistance rate of 27.3% in 2015 (Pournaras et al., 2017). More alarmingly, the colistin
resistance rate was 56.8% in isolates collected from patients with ventilator-associated pneumonia in
Greece during 2015 (Nowak et al., 2017). Colistin resistance has been linked to mutations in the two-
component transcriptional regulator genes pmrA/B and consequent pmrC overexpression in most
instances. The phosphoethanolamine phosphotransferase PmrC adds a pEtN group to the lipid A of
the lipopolysaccharide, lowering the net negative charge of the cell membrane, which reduces
colistin binding and prevents membrane leakage (Poirel et al., 2017). Colistin resistance
may also result from the overexpression of eptA, a pmrC homolog. This is mediated by insertional
inactivation of a gene encoding an H-NS family transcriptional regulator (Lucas et al., 2018) or by
integration of insertion sequence elements upstream of the eptA gene itself (Gerson et al., 2019;
Potron et al., 2019; Trebosc et al., 2019).
In this study, 40 ColR and two ColS CRAB isolates collected from nine Greek hospitals between 2015
and 2017 were studied. Whole genome sequencing was performed to investigate the mechanisms of
antibiotic resistance as well as the genomic relatedness between the strains.
3.3 Materials and methods
Bacterial strains and antimicrobial susceptibility testing. In the period 2015-2017, a total of 122
consecutive non-duplicate clinical CRAB isolates were obtained from routine microbiological cultures
of clinical samples (e.g. urine, blood, skin, bronchial aspirate) from different patients admitted to
nine Greek hospitals involved in this study (Figure 1). Bacteria were not isolated by the authors but
provided by the respective medical centers. Therefore, an ethics approval was not required as per
institutional and national guidelines and regulations. Antimicrobial susceptibility testing was
performed using the Vitek2 instrument (bioMérieux, Marcy l’Étoile, France) and the results were
interpreted following the EUCAST breakpoints (EUCAST, 2019). Since EUCAST does not provide
breakpoints for cephalosporins in Acinetobacter spp., CLSI breakpoints were used for those
antibiotics (CLSI, 2019). Colistin minimum inhibitory concentrations (MICs) were obtained by broth
microdilution following the CLSI guidelines (CLSI, 2019), and the results were interpreted following
the EUCAST susceptibility breakpoint of 2 mg/L (EUCAST, 2019). Only the ColR CRAB isolates plus two
randomly selected ColS CRAB isolates were retained for further experiments.
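The interpretation rule above can be sketched in a few lines. This is a minimal illustration, assuming that an MIC above the 2 mg/L susceptibility breakpoint is read as resistant; the function name and the example MIC values are hypothetical and are not taken from the study workflow.

```python
# Colistin susceptibility breakpoint cited in the text (mg/L).
COLISTIN_BREAKPOINT_MG_L = 2.0


def interpret_colistin_mic(mic_mg_l: float) -> str:
    """Return 'S' when the MIC is at or below the breakpoint, otherwise 'R'.

    Assumption: resistance is defined as MIC > 2 mg/L; no intermediate
    category is modelled in this sketch.
    """
    if mic_mg_l <= 0:
        raise ValueError("MIC must be a positive concentration")
    return "S" if mic_mg_l <= COLISTIN_BREAKPOINT_MG_L else "R"


if __name__ == "__main__":
    # Illustrative values only.
    for mic in (0.5, 2, 4, 64):
        print(mic, interpret_colistin_mic(mic))
```

Keeping the breakpoint as a named constant makes it straightforward to re-interpret the same MIC data if the breakpoint table is revised.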
Genome sequencing and assembly. Total genomic DNA of the selected CRAB isolates was extracted using
the QIAGEN UltraClean Microbial kit and sequenced with a NovaSeq sequencer (Illumina, USA),
generating paired end reads of 100 bp. Raw reads were assembled using SPAdes v.3.11.1 (Bankevich
et al., 2012) and annotated with Prokka (Seemann, 2014). Whole genome sequencing data have been
deposited under BioProject PRJNA578598.
Bioinformatics analysis. Sequence types (STs) were assigned by the mlst tool
(https://github.com/tseemann/mlst) by using the Oxford (gltA, gyrB, gdhB, recA, cpn60, gpi and rpoD
genes) and the Pasteur (cpn60, fusA, gltA, pyrG, recA, rplB and rpoB genes) schemes available on
pubMLST.org. The ABRicate tool (https://github.com/tseemann/abricate) was used for the detection
of antimicrobial resistance genes, by using the ResFinder (Zankari et al., 2012), CARD (Jia et al., 2017),
BLDB (Naas et al., 2017) and ARG-ANNOT (Gupta et al., 2014) databases. The minimum percentage of
coverage and identity used were 60 % and 90 %, respectively. The Kaptive tool was used to detect
the KL and OC loci (Wyres et al., 2019). BLAST+ (2.7.1) was used to detect mutations in genes
previously demonstrated to be potentially involved in colistin resistance (i.e. pmrCAB, eptA), and only
those leading to amino acid variations were considered. The pmrA/B/C and eptA genes were
compared to the reference genome ACICU (accession no. CP031380.1). The presence of insertion
sequence elements in the 500 bp region upstream of the blaADC, blaOXA-23, armA, eptA and pmrC genes
was determined using the ISfinder tool (Siguier et al., 2006). Core genes were defined by Roary
(v3.12.0) (Page et al., 2015) by using the annotated genomes, and genomes belonging to different
international clones (ICs) were treated separately. The alignment of these genes was screened for
further recombination using Gubbins (v2.3.4) (Croucher et al., 2015), while an ML phylogeny was
obtained by using RAxML (v8.2.12) (Stamatakis, 2014) with the GTRGAMMA model and 100
bootstrap replicates. The phylogenetic tree was visualized together with associated metadata using
Microreact (v7.0.0) (Argimón et al., 2016). Single nucleotide polymorphisms (SNPs) were obtained
with the snp-dists tool (https://github.com/tseemann/snp-dists) by using the Roary core genes
alignment as input.
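To make the pairwise SNP counts concrete, the following sketch computes distances from an aligned set of sequences. It is not the snp-dists implementation: it assumes that only positions where both sequences carry A, C, G or T are compared, and the three toy sequences are invented for illustration.

```python
from itertools import combinations
from typing import Dict, Tuple

# Only unambiguous bases are compared (assumption of this sketch).
VALID_BASES = set("ACGT")


def snp_distance(seq_a: str, seq_b: str) -> int:
    """Count positions that differ between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in VALID_BASES and b in VALID_BASES and a != b
    )


def pairwise_snp_distances(alignment: Dict[str, str]) -> Dict[Tuple[str, str], int]:
    """Return the SNP distance for every unordered pair of isolates."""
    return {
        (x, y): snp_distance(alignment[x], alignment[y])
        for x, y in combinations(sorted(alignment), 2)
    }


# Hypothetical toy alignment, not real core-gene data.
TOY_ALIGNMENT = {
    "iso1": "ACGTACGT",
    "iso2": "ACGTACGA",
    "iso3": "ACNTTCGA",
}

if __name__ == "__main__":
    for pair, dist in pairwise_snp_distances(TOY_ALIGNMENT).items():
        print(pair, dist)
```

Summary statistics such as the minimum, median and maximum reported in the Results can then be taken over the values of the returned dictionary.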
Analysis of Lipid A. Lipid A was extracted using an acetic acid-based procedure as previously
described (Kocsis et al., 2017). Once extracted, 0.7 µL of the concentrate was spotted on a matrix-
assisted laser desorption/ ionization–time of flight mass spectrometry (MALDI-TOF MS) plate
followed by 0.7 µL of norharmane matrix (Sigma-Aldrich, St Louis, Missouri) and then air-dried. The
samples were analyzed on a Vitek MS instrument (bioMérieux, Marcy l’Étoile, France) in the
negative-ion mode. The resulting spectra were compared to that obtained for the ColS reference
strain A. baumannii ATCC 19606.
3.4 Results
Bacterial strains and antimicrobial susceptibilities. Of the 122 CRAB isolates, 40 (32.8%) were also
ColR, with colistin MICs ranging from 4 to 64 mg/L. All following data concern only the ColR isolates.
Antimicrobial susceptibility testing revealed that all isolates were resistant to cephalosporins
(ceftazidime and cefepime), carbapenems (imipenem and meropenem), fluoroquinolones
(ciprofloxacin and levofloxacin) and tobramycin. Resistance rates for gentamicin and
trimethoprim/sulfamethoxazole were 87.5% (N=35) and 92.5% (N=37), respectively. The two ColS
CRAB isolates, included in this study for comparative purposes, had a colistin MIC of 0.5 mg/L (Table
S1).
Genomic epidemiology. The majority of the ColR CRAB isolates (N=37, 92.5 %) were sequence type
(ST) 2, belonging to the previously described IC2 as defined by the Pasteur MLST scheme (Diancourt
et al., 2010) (Figure 1). The Oxford MLST scheme further differentiated the IC2 isolates into four
different STs, all belonging to clonal complex (CC) 208: the majority of isolates (N=29, 78.4%)
belonged to ST425, while ST208, ST451 and ST436 accounted for 8.1% (N=3), 8.1% (N=3) and 5.4%
(N=2), respectively. These four STs shared six of the seven alleles and differed only at the gpi gene.
Because gpi is one of the capsular polysaccharide synthesis genes and is prone to homologous
recombination, the Oxford MLST scheme suffers from limitations (Gaiarsa et al., 2019).
From a total of 4,612 different genes detected in all the isolates, 3,031 (65.7%) were core genes. Core
gene SNPs among IC2 genomes varied between 2 and 1,652 (mean: 569, median: 531). The
phylogenetic analysis of IC2 isolates shows two major clusters of ST425, well differentiated in the
tree. These two clusters were characterized by two different capsular polysaccharides, KL4 and KL40.
Different capsular polysaccharides were observed in the other IC2 isolates (Figure 1), while all the IC2
isolates were characterized by the lipooligosaccharide outer core (OC) locus 1 (OCL1). Isolates of
ST425:KL4 were only observed in Athens, within two hospitals in 2015 (Aglaia Kyriakoy and Agia Olga)
and one isolate in the Thriassio General hospital in 2017. The thirteen isolates obtained from the
Aglaia Kyriakoy hospital had an average of 12 core SNPs, suggesting cross-transmission of isolates
between different patients. Isolates of ST425:KL40 were retrieved between 2015 and 2016 from four
hospitals in Athens and one isolate from the University Hospital in Patras (200 km west of Athens).
This underscores the local endemicity of this clone and suggests inter-hospital
cross-infection, given the absence of a clear clustering in the tree of isolates obtained from different
hospitals.
The remaining three isolates belonged to ST1 (IC1) and ST1567 according to the Pasteur and Oxford
MLST schemes, respectively, and harbored a capsule and lipooligosaccharide of type KL40 and OCL2.
Core gene SNPs varied between 29 and 305.
The two ColS CRAB isolates belonged to IC2, or ST208 (isolate PU_2016_41) and ST195 (GE_2017_62)
by using the Oxford MLST scheme, and had a median of 706 (min: 8, max: 1,568) and 1,175 SNPs
(min: 725, max: 1,652) compared to the ColR isolates, respectively.
Figure 1. Phylogenetic tree of the A. baumannii clinical strains belonging to IC2. cps: capsular polysaccharides.
Colistin resistance mechanisms. Several chromosomal mutations in genes potentially involved in
colistin resistance were detected, in comparison with the ACICU ColS reference genome. The
mutation PmrBA138T was detected in all ColR and ColS isolates, indicating that it may not contribute
significantly to the resistance phenotype, as previously reported (Oikonomou et al., 2015).
Conversely, the mutation A226V in the histidine kinase A (phosphoacceptor) domain of PmrB was
observed in all ColR isolates, and not in the ColS ones (Figure 1). This mutation has been described in
several prior studies, associated with ColR strains (Arroyo et al., 2011; Mavroidi et al., 2015, 2017;
Dortet et al., 2018; Trebosc et al., 2019).
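Substitution names such as A226V follow the pattern reference residue, position, observed residue. The sketch below shows how such names could be derived from a reference and a query protein sequence that are already aligned. It is only an illustration: the function name is hypothetical, gaps are simply skipped, and the short sequences are invented and do not correspond to PmrA or PmrB.

```python
from typing import List


def call_substitutions(reference: str, query: str) -> List[str]:
    """List amino acid substitutions between two aligned protein sequences.

    Positions are 1-based and follow the reference. Alignment gaps ('-')
    are ignored, so insertions and deletions are not reported here.
    """
    if len(reference) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    calls = []
    for position, (ref_aa, query_aa) in enumerate(zip(reference, query), start=1):
        if "-" in (ref_aa, query_aa):
            continue
        if ref_aa != query_aa:
            calls.append(f"{ref_aa}{position}{query_aa}")
    return calls


if __name__ == "__main__":
    # Invented five-residue sequences for illustration only.
    print(call_substitutions("MKALV", "MKVLV"))
```

In practice the aligned protein sequences would come from translating the nucleotide alignment against the reference, which is outside the scope of this sketch.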
Isolates with PmrBA226V without other alterations had colistin MICs ranging from 4 to 8 mg/L. When
an additional mutation in PmrB occurred (PmrBE140V in AK_2015_33 and PmrBL178F in SI_2017_69),
strains showed a colistin MIC of 16 mg/L. Two strains belonging to IC1 (FK_2016_46 and FK_2016_47)
had an additional K172I mutation in the transcriptional regulatory protein C-terminal domain of
PmrA, and showed colistin MICs of 32 mg/L. Finally, the strain AO_2015_54 had an additional D10N
mutation in the CheY-homologous receiver PmrA domain and was associated with colistin MIC of 64
mg/L. All these additional mutations are, to the best of our knowledge, first described here.
The susceptible strain GE_2017_62 had no additional mutations in pmrA/B genes. However, it had an
ISAba1 positioned 110 bp upstream of the pmrC gene, in reverse orientation. This is, to the best of
our knowledge, the first report of an insertion sequence transposition upstream of the pmrC gene.
However, this transposition event does not seem to alter the colistin susceptibility of this isolate. The
second susceptible strain PU_2016_41 had PmrAM12V and PmrBR181H+Y388N. These mutations are first
described here, and in this strain they do not seem to affect colistin susceptibility.
The pmrC homolog eptA was detected in all the isolates of the IC2 except the susceptible isolate
GE_2017_62, while it was absent in the IC1 isolates. The obtained eptA gene sequences were
identical to that of the susceptible reference ACICU, and did not present insertion elements in the
upstream region.
The mcr genes, encoding for acquired colistin resistance, have not been described in A. baumannii
yet, and were not detected in our strain collection.
Lipid A modifications. Increased expression of pmrC or eptA results in the addition of pEtN to
lipid A. The lipid A of the ColR and ColS CRAB isolates was extracted and analyzed by MALDI-TOF MS,
and the resulting spectra were compared to that of the ColS reference strain A. baumannii ATCC
19606. Several lipid A species were detected in the reference strain ATCC 19606 and in all clinical
isolates: hepta-acylated lipid A (m/z 1,910), hexa-acylated lipid A (m/z 1,728) and tetra-acylated lipid
A (m/z 1,404). The addition of pEtN (m/z 124) to lipid A was shown by the mass at m/z 2,034, and
unexpectedly it was observed in all the clinical strains, including the ColS ones (Figure 2). Isolates
with colistin MICs higher than 8 mg/L also showed the peak at m/z 1,954, representing the pEtN-modified
hepta-acylated lipid A (m/z 2,034) minus one phosphate group (m/z 80), as previously
reported (Kim et al., 2014). The addition of galactosamine to lipid A, which is indicated by a mass at
m/z 2,071 (Pelletier et al., 2013), was not observed in any isolate.
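The peak assignments above follow from simple mass arithmetic on the values quoted in the text: 1,910 + 124 = 2,034 and 2,034 - 80 = 1,954. The sketch below encodes that arithmetic and labels an observed peak; the matching tolerance of 1 m/z unit and the function names are assumptions made for illustration, not parameters of the Vitek MS analysis.

```python
from typing import Dict, Optional

# Nominal m/z values quoted in the text.
HEPTA_ACYLATED_LIPID_A = 1910
PETN_SHIFT = 124       # shift attributed to the pEtN addition
PHOSPHATE_SHIFT = 80   # mass of one phosphate group


def expected_peaks(base_mz: int = HEPTA_ACYLATED_LIPID_A) -> Dict[str, int]:
    """Derive the pEtN-related peaks expected from the unmodified species."""
    petn_modified = base_mz + PETN_SHIFT
    return {
        "hepta-acylated lipid A": base_mz,
        "pEtN-modified hepta-acylated lipid A": petn_modified,
        "pEtN-modified hepta-acylated lipid A minus phosphate": petn_modified - PHOSPHATE_SHIFT,
    }


def annotate_peak(observed_mz: float, tolerance: float = 1.0) -> Optional[str]:
    """Return the first expected species within the tolerance, else None.

    The tolerance is a hypothetical value chosen for this sketch.
    """
    for label, mz in expected_peaks().items():
        if abs(observed_mz - mz) <= tolerance:
            return label
    return None


if __name__ == "__main__":
    for label, mz in expected_peaks().items():
        print(mz, label)
```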
Figure 2. Mass spectrometry analysis of Lipid A. From the bottom, isolate (colistin MIC, resistant/susceptible): ATCC-19606 (0.5, S), TR_2016_35 (4, R), SI_2017_69 (16, R), AO_2015_54 (64, R) and GE_2017_62 (0.5, S).
Antimicrobial resistance mechanisms and phenotype correlation. All CRAB genomes harbored a
chromosomal blaADC cephalosporinase, an intrinsic blaOXA-51 group carbapenemase and an acquired
blaOXA-23. IC2 genomes contained the blaADC-73 (accession no. KP881233), a variant of blaADC with a
sequence identity of 1,151/1,152 nucleotides compared to that of blaADC-30, and previously observed
in IC2 isolates (Karah et al., 2016). An ISAba1 element was present 9 bp upstream of the blaADC-73
gene in reverse orientation in all IC2 genomes, where it is known to increase cephalosporinase
gene expression (Héritier et al., 2006). Conversely, IC1 genomes contained blaADC-175 (MH594297)
with an ISAba125 element positioned 66 bp upstream of the gene in reverse orientation, as also
previously reported (Lopes and Amyes, 2012). ISAba125 was shown to increase the cephalosporinase
expression 6 times more than ISAba1 (Lopes and Amyes, 2012). The allelic variants of the intrinsic
blaOXA-51-like carbapenemase genes were blaOXA-66 and blaOXA-69, associated with IC2 and IC1,
respectively, as previously observed (Zander et al., 2012). All the blaOXA-23 genes were characterized
by the presence of an ISAba1 located upstream of the gene, which has been previously
demonstrated to increase its expression (Turton et al., 2006). In particular, the blaOXA-23 gene was part
of a Tn2006 transposon in the IC2 genomes. Conversely, a Tn2008 embedded within a TnaphA6 was
found in the three IC1 genomes, matching 100% with the sequence of plasmid pABKp1 (KP074966.1)
obtained from A. baumannii isolates from Romania (Gheorghe et al., 2014). Consistent with the
mentioned genes and their genetic environments, all isolates were resistant to cephalosporins,
including ceftazidime (3rd generation) and cefepime (4th generation), and carbapenems (imipenem
and meropenem).
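Gene calls of this kind depend on the coverage and identity cut-offs stated in the Methods (60 % and 90 %). A minimal sketch of such a filter is given below, assuming a tab-separated report with columns named GENE, %COVERAGE and %IDENTITY. The column names are an assumption about the report layout, and the rows and gene names are invented placeholders, not results from this study.

```python
import csv
import io
from typing import List

# Thresholds stated in the Methods.
MIN_COVERAGE = 60.0
MIN_IDENTITY = 90.0


def filter_hits(tsv_text: str) -> List[str]:
    """Return gene names whose coverage and identity meet both thresholds."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [
        row["GENE"]
        for row in reader
        if float(row["%COVERAGE"]) >= MIN_COVERAGE
        and float(row["%IDENTITY"]) >= MIN_IDENTITY
    ]


# Hypothetical report with placeholder gene names.
EXAMPLE_REPORT = (
    "GENE\t%COVERAGE\t%IDENTITY\n"
    "geneA\t100.00\t99.80\n"
    "geneB\t55.00\t98.00\n"
    "geneC\t80.00\t85.00\n"
)

if __name__ == "__main__":
    print(filter_hits(EXAMPLE_REPORT))
```

Hits retained by several databases would still need to be reconciled manually, since the same gene can appear under different names.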
Several aminoglycoside resistance genes were observed among the isolates, namely aac(3)-I, aac(3)-Ia,
ant(3”)-Ia, aph(3’)-Ia, aph(3’)-VIa, aph(6)-Id, armA and strA (Table S1). ArmA is a 16S ribosomal
RNA methyltransferase that protects the 30S ribosomal subunit from aminoglycoside binding,
conferring high aminoglycoside MICs. Consistently, all the strains carrying armA (35/40, 87.5 %)
were resistant to gentamicin and tobramycin. The armA gene was located in the chromosome
on the widely disseminated Tn1548, and it was found downstream of a cluster of genes
encoding proteins annotated as paraquat-inducible protein A and protein B, as previously described
for ST195 strain AC29 (Lean et al., 2016).
All strains contained substitutions within the QRDR, namely GyrAS83L and ParCS80L, previously
associated with quinolone resistance (Vila et al., 1995, 1997). As expected, all strains were
non-susceptible to ciprofloxacin and levofloxacin.
3.5 Discussion
Carbapenems represent first-line agents for the treatment of A. baumannii infections; consequently,
the rise of infections due to carbapenem-resistant strains is of major concern. Carbapenem
resistance in the isolates described here was associated with the ISAba1-mediated overexpression of
blaOXA-23 located either in Tn2006 (IC2 isolates) or Tn2008 (IC1) transposons. Previous studies
reported that OXA-23 producers emerged and replaced the previously predominant OXA-58 A.
baumannii isolates (Liakopoulos et al., 2012), and this phenomenon could be linked to the stronger
hydrolytic activity of OXA-23 compared to OXA-58 (Peleg et al., 2008). Most CRAB isolates are
susceptible to only 1 or 2 agents, making them extensively drug-resistant (XDR) pathogens (Viehman
et al., 2014).
Because of the increasing use of colistin, resistance to this antibiotic has rapidly increased, especially
in CRAB isolates (Giamarellou, 2016; Jeannot et al., 2017), and now reached critical levels in some
countries (Nowak et al., 2017). Across the nine hospitals involved in this study, 32.8% of the CRAB
isolates were also ColR. These results suggest that colistin resistance rates among CRAB isolates from
Greece are on the rise, as a previous study reported a resistance rate of 27.3% in 2015 (Pournaras et al.,
2017). While the mcr genes, encoding for acquired colistin resistance, were absent among our
isolates, we found several mutations in the pmrCAB operon associated with the colistin resistance
phenotype. Interestingly, the previously described PmrBA226V mutation, which has been associated with
low-level colistin resistance, was detected in all the ColR isolates but not in the ColS ones. In a recent
study, Trebosc et al. investigated the colistin resistance mechanisms of 12 clinical A. baumannii
strains. The authors concluded that colistin resistance was conferred, in most cases, by mutations in
the PmrB sensor kinase that led to PmrC overexpression. Two of those strains were isolated in
Greece in 2012, belonged to either IC1 or IC2 and had the mutation PmrBA226V. Such findings support
the important role of the mentioned PmrB mutation in the colistin resistance phenotype. Moreover,
a similar substitution of the alanine in position 226 of PmrB was reported to confer stable colistin
resistance in clinical A. baumannii isolates (Charretier et al., 2018). Some of our isolates had
additional mutations in either PmrB or PmrA, and were associated with higher colistin MICs, up to 64
mg/L. Multiple mutations may result in an increased expression of pmrC, as recently shown by RNA-
Seq experiments (Wright et al., 2017) and by qRT-PCR (Gerson et al., 2020). However, the same
studies reported clinical isolates characterized by pmrC overexpression due to PmrA/B mutations,
but with an unexpected colistin susceptible phenotype. Similarly, the two ColS isolates from our
study had pmrCAB alterations and a modified lipid A, as observed with mass spectrometry. All these
observations support the hypothesis that additional and still unknown factors are involved in colistin
resistance of clinical A. baumannii isolates (Jeannot et al., 2017; Gerson et al., 2019, 2020).
Determination of the cell-envelope charge could be useful in the elucidation process of the complex
mechanism of colistin resistance in A. baumannii (Cafiso et al., 2019).
In this study, we observed a clear predominance of IC2, which is globally distributed (Higgins et al.,
2010) and which is gradually replacing IC1 (Gogou et al., 2011; Villalon et al., 2011). The major
sequence type within IC2 was ST425, as defined by the Oxford MLST scheme. To the best of our
knowledge, only one study reported such ST, with one clinical isolate collected in 2002 in Sydney,
Australia (Nigro and Hall, 2016). However, WGS data were not provided. Both capsular
polysaccharides reported within our ST425 isolates, KL4 and KL40, were rarely observed (0.2%) or
completely absent, respectively, within IC2 genomes in a recent study where 3,416 publicly available
A. baumannii genomes were analyzed (Wyres et al., 2019). Conversely, KL4 and KL40 represented the
second (20.1%) and third (11.9%) most common capsular types observed within IC1 genomes. It is
conceivable that ST425 resulted from homologous recombination between a CC208 and an IC1
genome, in which the IC1 capsular polysaccharide genes were acquired by the CC208 strain, as this
region was previously shown to be a frequent subject of homologous recombination (Adams et al.,
2008; Snitkin et al., 2011; Kenyon and Hall, 2013).
In conclusion, genomic analysis of ColR CRAB isolates from different Greek hospitals revealed a
convergent evolution of different clonal lineages towards the same colistin resistance mechanism,
characterized by the mutation PmrBA226V. The prevalence of ColR CRAB isolates belonging to IC2 and
expressing OXA-23 and ArmA is increasing and represents a major threat in clinical settings,
given the very limited number of effective agents for the treatment of infections caused by such isolates.
3.6 References
Adams, M. D., Goglin, K., Molyneaux, N., Hujer, K. M., Lavender, H., Jamison, J. J., et al. (2008).
Comparative genome sequence analysis of multidrug-resistant Acinetobacter baumannii. J.
Bacteriol. 190, 8053–8064. doi:10.1128/JB.00834-08.
Argimón, S., Abudahab, K., Goater, R. J. E., Fedosejev, A., Bhai, J., Glasner, C., et al. (2016). Microreact:
visualizing and sharing data for genomic epidemiology and phylogeography. Microb. genomics 2,
e000093. doi:10.1099/mgen.0.000093.
Arroyo, L. A., Herrera, C. M., Fernandez, L., Hankins, J. V., Trent, M. S., and Hancock, R. E. W. (2011).
The pmrCAB Operon Mediates Polymyxin Resistance in Acinetobacter baumannii ATCC 17978
and Clinical Isolates through Phosphoethanolamine Modification of Lipid A. Antimicrob. Agents
Chemother. 55, 3743–3751. doi:10.1128/AAC.00256-11.
Bankevich, A., Nurk, S., Antipov, D., Gurevich, A. A., Dvorkin, M., Kulikov, A. S., et al. (2012). SPAdes:
A New Genome Assembly Algorithm and Its Applications to Single-Cell Sequencing. J. Comput.
Biol. 19, 455–477. doi:10.1089/cmb.2012.0021.
Cafiso, V., Stracquadanio, S., Lo Verde, F., Gabriele, G., Mezzatesta, M. L., Caio, C., et al. (2019).
Colistin Resistant A. baumannii: Genomic and Transcriptomic Traits Acquired Under Colistin
Therapy. Front. Microbiol. 9, 3195. doi:10.3389/fmicb.2018.03195.
Charretier, Y., Diene, S. M., Baud, D., Chatellier, S., Santiago-Allexant, E., Van Belkum, A., et al. (2018).
Colistin heteroresistance and involvement of the PmrAB regulatory system in Acinetobacter
baumannii. Antimicrob. Agents Chemother. 62. doi:10.1128/AAC.00788-18.
CLSI (2019). CLSI. Performance Standards for Antimicrobial Susceptibility Testing. 29th ed. CLSI
supplement M100. Wayne, PA: Clinical and Laboratory Standards Institute; 2019.
Croucher, N. J., Page, A. J., Connor, T. R., Delaney, A. J., Keane, J. A., Bentley, S. D., et al. (2015). Rapid
phylogenetic analysis of large samples of recombinant bacterial whole genome sequences using
Gubbins. Nucleic Acids Res. 43, e15. doi:10.1093/nar/gku1196.
Diancourt, L., Passet, V., Nemec, A., Dijkshoorn, L., and Brisse, S. (2010). The Population Structure of
Acinetobacter baumannii: Expanding Multiresistant Clones from an Ancestral Susceptible
Genetic Pool. PLoS One 5, e10034. doi:10.1371/journal.pone.0010034.
Dortet, L., Potron, A., Bonnin, R. A., Plesiat, P., Naas, T., Filloux, A., et al. (2018). Rapid detection of
colistin resistance in Acinetobacter baumannii using MALDI-TOF-based lipidomics on intact
bacteria. Sci. Rep. 8, 16910. doi:10.1038/s41598-018-35041-y.
ECDC (2017). European Centre for Disease Prevention and Control. Surveillance of antimicrobial
resistance in Europe 2016. Annual Report of the European Antimicrobial Resistance Surveillance
Network (EARS-Net). Stockholm: ECDC; 2017.
ECDC (2018). European Centre for Disease Prevention and Control. Surveillance of antimicrobial
resistance in Europe – Annual report of the European Antimicrobial Resistance Surveillance
Network (EARS-Net) 2017. Stockholm: ECDC; 2018.
EUCAST (2019). The European Committee on Antimicrobial Susceptibility Testing. Breakpoint tables
for interpretation of MICs and zone diameters. Version 9.0, 2019. http://www.eucast.org.
Gaiarsa, S., Batisti Biffignandi, G., Esposito, E. P., Castelli, M., Jolley, K. A., Brisse, S., et al. (2019).
Comparative analysis of the two Acinetobacter baumannii multilocus sequence typing (MLST)
schemes. Front. Microbiol. 10. doi:10.3389/fmicb.2019.00930.
Gerson, S., Betts, J. W., Lucaßen, K., Nodari, C. S., Wille, J., Josten, M., et al. (2019). Investigation of
Novel pmrB and eptA Mutations in Isogenic Acinetobacter baumannii Isolates Associated with
Colistin Resistance and Increased Virulence In Vivo. Antimicrob. Agents Chemother. 63, 1–15.
doi:10.1128/aac.01586-18.
Gerson, S., Lucaßen, K., Wille, J., Nodari, C. S., Stefanik, D., Nowak, J., et al. (2020). Diversity of amino
acid substitutions in PmrCAB associated with colistin resistance in clinical isolates of
Acinetobacter baumannii. Int. J. Antimicrob. Agents. doi:10.1016/j.ijantimicag.2019.105862.
Gheorghe, I., Novais, Â., Grosso, F., Rodrigues, C., Chifiriuc, M. C., Lazar, V., et al. (2014). Snapshot on
carbapenemase-producing Pseudomonas aeruginosa and Acinetobacter baumannii in bucharest
hospitals reveals unusual clones and novel genetic surroundings for blaOXA-23. J. Antimicrob.
Chemother. 70, 1016–1020. doi:10.1093/jac/dku527.
Giamarellou, H. (2016). Epidemiology of infections caused by polymyxin-resistant pathogens. Int. J.
Antimicrob. Agents 48, 614–621. doi:10.1016/j.ijantimicag.2016.09.025.
Gogou, V., Pournaras, S., Giannouli, M., Voulgari, E., Piperaki, E.-T., Zarrilli, R., et al. (2011). Evolution
of multidrug-resistant Acinetobacter baumannii clonal lineages: a 10 year study in Greece
(2000-09). J. Antimicrob. Chemother. 66, 2767–2772. doi:10.1093/jac/dkr390.
Gupta, S. K., Padmanabhan, B. R., Diene, S. M., Lopez-Rojas, R., Kempf, M., Landraud, L., et al. (2014).
ARG-annot, a new bioinformatic tool to discover antibiotic resistance genes in bacterial
genomes. Antimicrob. Agents Chemother. 58, 212–220. doi:10.1128/AAC.01310-13.
Héritier, C., Poirel, L., and Nordmann, P. (2006). Cephalosporinase over-expression resulting from
insertion of ISAba1 in Acinetobacter baumannii. Clin. Microbiol. Infect. 12, 123–130.
doi:10.1111/j.1469-0691.2005.01320.x.
Higgins, P. G., Dammhayn, C., Hackel, M., and Seifert, H. (2010). Global spread of carbapenem-
resistant Acinetobacter baumannii. J. Antimicrob. Chemother. 65, 233–238.
doi:10.1093/jac/dkp428.
Jeannot, K., Bolard, A., and Plésiat, P. (2017). Resistance to polymyxins in Gram-negative organisms.
Int. J. Antimicrob. Agents 49, 526–535. doi:10.1016/j.ijantimicag.2016.11.029.
Jia, B., Raphenya, A. R., Alcock, B., Waglechner, N., Guo, P., Tsang, K. K., et al. (2017). CARD 2017:
expansion and model-centric curation of the comprehensive antibiotic resistance database. Nucleic Acids Res. 45,
566–573. doi:10.1093/nar/gkw1004.
Karah, N., Dwibedi, C. K., Sjöström, K., Edquist, P., Johansson, A., Wai, S. N., et al. (2016). Novel
Aminoglycoside Resistance Transposons and Transposon-Derived Circular Forms Detected in
Carbapenem-Resistant Acinetobacter baumannii Clinical Isolates. Antimicrob. Agents
Chemother. 60, 1801–1818. doi:10.1128/AAC.02143-15.
Kenyon, J. J., and Hall, R. M. (2013). Variation in the Complex Carbohydrate Biosynthesis Loci of
Acinetobacter baumannii Genomes. PLoS One 8, e62160. doi:10.1371/journal.pone.0062160.
Kim, Y., Bae, I. K., Lee, H., Jeong, S. H., Yong, D., and Lee, K. (2014). In vivo emergence of colistin
resistance in Acinetobacter baumannii clinical isolates of sequence type 357 during colistin
treatment. Diagn. Microbiol. Infect. Dis. 79, 362–366. doi:10.1016/j.diagmicrobio.2014.03.027.
Kocsis, B., Kilár, A., Péter, S., Dörnyei, Á., Sándor, V., and Kilár, F. (2017). “Mass Spectrometry for
Profiling LOS and Lipid A Structures from Whole-Cell Lysates: Directly from a Few Bacterial
Colonies or from Liquid Broth Cultures,” in Methods in molecular biology (Clifton, N.J.), 187–198.
doi:10.1007/978-1-4939-6958-6_17.
Lean, S. S., Yeo, C. C., Suhaili, Z., and Thong, K. L. (2016). Comparative genomics of two ST 195
carbapenem-resistant Acinetobacter baumannii with different susceptibility to polymyxin
revealed underlying resistance mechanism. Front. Microbiol. 6, 1–17.
doi:10.3389/fmicb.2015.01445.
Liakopoulos, A., Miriagou, V., Katsifas, E. A., Karagouni, A. D., Daikos, G. L., Tzouvelekis, L. S., et al.
(2012). Identification of OXA-23-producing Acinetobacter baumannii in Greece, 2010 to 2011.
Euro Surveill. 17. Available at: http://www.ncbi.nlm.nih.gov/pubmed/22449866 [Accessed
February 7, 2019].
Lopes, B. S., and Amyes, S. G. B. (2012). Role of ISAba1 and ISAba125 in governing the expression of
blaADC in clinically relevant Acinetobacter baumannii strains resistant to cephalosporins. J. Med.
Microbiol. 61, 1103–1108. doi:10.1099/jmm.0.044156-0.
Lucas, D. D., Crane, B., Wright, A., Han, M.-L., Moffatt, J., Bulach, D., et al. (2018). Emergence of high-
level colistin resistance in an Acinetobacter baumannii clinical isolate mediated by inactivation
of the global regulator H-NS. Antimicrob. Agents Chemother. 30, 1–17. doi:10.1128/AAC.02442-17.
Mavroidi, A., Katsiari, M., Palla, E., Likousi, S., Roussou, Z., Nikolaou, C., et al. (2017). Investigation of
Extensively Drug-Resistant blaOXA-23-Producing Acinetobacter baumannii Spread in a Greek
Hospital. Microb. Drug Resist. 23, 488–493. doi:10.1089/mdr.2016.0101.
Mavroidi, A., Likousi, S., Palla, E., Katsiari, M., Roussou, Z., Maguina, A., et al. (2015). Molecular
identification of tigecycline- and colistin-resistant carbapenemase-producing Acinetobacter
baumannii from a Greek hospital from 2011 to 2013. J. Med. Microbiol. 64, 993–997.
doi:10.1099/jmm.0.000127.
Naas, T., Oueslati, S., Bonnin, R. A., Dabos, M. L., Zavala, A., Dortet, L., et al. (2017). Beta-lactamase
database (BLDB) – structure and function. J. Enzyme Inhib. Med. Chem. 32, 917–919.
doi:10.1080/14756366.2017.1344235.
Nigro, S. J., and Hall, R. M. (2016). Loss and gain of aminoglycoside resistance in global clone 2
Acinetobacter baumannii in Australia via modification of genomic resistance islands and
acquisition of plasmids. J. Antimicrob. Chemother. 71, 2432–40. doi:10.1093/jac/dkw176.
Nowak, J., Zander, E., Stefanik, D., Higgins, P. G., Roca, I., Vila, J., et al. (2017). High incidence of
pandrug-resistant Acinetobacter baumannii isolates collected from patients with ventilator-
associated pneumonia in Greece, Italy and Spain as part of the MagicBullet clinical trial. J.
Antimicrob. Chemother. 72, 3277–3282. doi:10.1093/jac/dkx322.
Oikonomou, O., Sarrou, S., Papagiannitsis, C. C., Georgiadou, S., Mantzarlis, K., Zakynthinos, E., et al.
(2015). Rapid dissemination of colistin and carbapenem resistant Acinetobacter baumannii in
Central Greece: Mechanisms of resistance, molecular identification and epidemiological data.
BMC Infect. Dis. 15, 13–18. doi:10.1186/s12879-015-1297-x.
Page, A. J., Cummins, C. A., Hunt, M., Wong, V. K., Reuter, S., Holden, M. T. G., et al. (2015). Roary:
rapid large-scale prokaryote pan genome analysis. Bioinformatics 31, 3691–3693.
doi:10.1093/bioinformatics/btv421.
Peleg, A. Y., Seifert, H., and Paterson, D. L. (2008). Acinetobacter baumannii: Emergence of a
Successful Pathogen. Clin. Microbiol. Rev. 21, 538–582. doi:10.1128/CMR.00058-07.
Pelletier, M. R., Casella, L. G., Jones, J. W., Adams, M. D., Zurawski, D. V., Hazlett, K. R. O., et al.
(2013). Unique Structural Modifications Are Present in the Lipopolysaccharide from Colistin-
Resistant Strains of Acinetobacter baumannii. Antimicrob. Agents Chemother. 57, 4831–4840.
doi:10.1128/AAC.00865-13.
Poirel, L., Jayol, A., and Nordmann, P. (2017). Polymyxins: Antibacterial Activity, Susceptibility Testing,
and Resistance Mechanisms Encoded by Plasmids or Chromosomes. Clin. Microbiol. Rev. 30,
557–596. doi:10.1128/CMR.00064-16.
Poirel, L., and Nordmann, P. (2006). Carbapenem resistance in Acinetobacter baumannii: mechanisms
and epidemiology. Clin. Microbiol. Infect. 12, 826–836. doi:10.1111/j.1469-0691.2006.01456.x.
Potron, A., Vuillemenot, J.-B., Puja, H., Triponney, P., Bour, M., Valot, B., et al. (2019). ISAba1-
dependent overexpression of eptA in clinical strains of Acinetobacter baumannii resistant to
colistin. J. Antimicrob. Chemother. 74, 2544–2550. doi:10.1093/jac/dkz241.
Pournaras, S., Dafopoulou, K., Del Franco, M., Zarkotou, O., Dimitroulia, E., Protonotariou, E., et al.
(2017). Predominance of international clone 2 OXA-23-producing- Acinetobacter baumannii
clinical isolates in Greece, 2015: results of a nationwide study. Int. J. Antimicrob. Agents 49,
749–753. doi:10.1016/j.ijantimicag.2017.01.028.
Seemann, T. (2014). Prokka: rapid prokaryotic genome annotation. Bioinformatics 30, 2068–2069.
doi:10.1093/bioinformatics/btu153.
Siguier, P., Perochon, J., Lestrade, L., Mahillon, J., and Chandler, M. (2006). ISfinder: the reference
centre for bacterial insertion sequences. Nucleic Acids Res. 34, D32-6. doi:10.1093/nar/gkj014.
Snitkin, E. S., Zelazny, A. M., Montero, C. I., Stock, F., Mijares, L., Mullikin, J., et al. (2011). Genome-
wide recombination drives diversification of epidemic strains of Acinetobacter baumannii. Proc.
Natl. Acad. Sci. U. S. A. 108, 13758–13763. doi:10.1073/pnas.1104404108.
Stamatakis, A. (2014). RAxML version 8: A tool for phylogenetic analysis and post-analysis of large
phylogenies. Bioinformatics 30, 1312–1313. doi:10.1093/bioinformatics/btu033.
Trebosc, V., Gartenmann, S., Tötzl, M., Lucchini, V., Schellhorn, B., Pieren, M., et al. (2019). Dissecting
Colistin Resistance Mechanisms in Extensively Drug-Resistant Acinetobacter baumannii Clinical
Isolates. MBio 10. doi:10.1128/mBio.01083-19.
Tsakris, A., Tsioni, C., Pournaras, S., Polyzos, S., Maniatis, A. N., and Sofianou, D. (2003). Spread of
low-level carbapenem-resistant Acinetobacter baumannii clones in a tertiary care Greek
hospital. J. Antimicrob. Chemother. 52, 1046–1047. doi:10.1093/jac/dkg470.
Turton, J. F., Ward, M. E., Woodford, N., Kaufmann, M. E., Pike, R., Livermore, D. M., et al. (2006).
The role of ISAba1 in expression of OXA carbapenemase genes in Acinetobacter baumannii.
FEMS Microbiol. Lett. 258, 72–77. doi:10.1111/j.1574-6968.2006.00195.x.
Viehman, J. A., Nguyen, M. H., and Doi, Y. (2014). Treatment Options for Carbapenem-Resistant and
Extensively Drug-Resistant Acinetobacter baumannii Infections. Drugs 74, 1315–1333.
doi:10.1007/s40265-014-0267-8.
Vila, J., Ruiz, J., Goñi, P., and Jimenez de Anta, T. (1997). Quinolone-resistance mutations in the
topoisomerase IV parC gene of Acinetobacter baumannii. J. Antimicrob. Chemother. 39, 757–62.
Available at: http://www.ncbi.nlm.nih.gov/pubmed/9222045 [Accessed October 17, 2018].
Vila, J., Ruiz, J., Goñi, P., Marcos, A., and Jimenez de Anta, T. (1995). Mutation in the gyrA gene of
quinolone-resistant clinical isolates of Acinetobacter baumannii. Antimicrob. Agents Chemother.
39, 1201–3. Available at: http://www.ncbi.nlm.nih.gov/pubmed/7625818 [Accessed October 17,
2018].
Villalon, P., Valdezate, S., Medina-Pascual, M. J., Rubio, V., Vindel, A., and Saez-Nieto, J. A. (2011).
Clonal Diversity of Nosocomial Epidemic Acinetobacter baumannii Strains Isolated in Spain. J.
Clin. Microbiol. 49, 875–882. doi:10.1128/JCM.01026-10.
Wong, D., Nielsen, T. B., Bonomo, R. A., Pantapalangkoor, P., Luna, B., and Spellberg, B. (2017).
Clinical and Pathophysiological Overview of Acinetobacter Infections: a Century of Challenges.
Clin. Microbiol. Rev. 30, 409–447. doi:10.1128/CMR.00058-16.
Wright, M. S., Jacobs, M. R., Bonomo, R. A., and Adams, M. D. (2017). Transcriptome Remodeling of
Acinetobacter baumannii during Infection and Treatment. MBio 8, e02193-16.
doi:10.1128/mBio.02193-16.
Wyres, K. L., Cahill, S. M., Holt, K. E., Hall, R. M., and Kenyon, J. J. (2019). Identification of
Acinetobacter baumannii loci for capsular polysaccharide (KL) and lipooligosaccharide outer
core (OCL) synthesis in genome assemblies using curated reference databases compatible with
Kaptive. bioRxiv 1, 869370. doi:10.1101/869370.
Zander, E., Nemec, A., Seifert, H., and Higgins, P. G. (2012). Association between β-lactamase-
encoding blaOXA-51 variants and DiversiLab rep-PCR-based typing of Acinetobacter baumannii
isolates. J. Clin. Microbiol. 50, 1900–4. doi:10.1128/JCM.06462-11.
Zankari, E., Hasman, H., Cosentino, S., Vestergaard, M., Rasmussen, S., Lund, O., et al. (2012).
Identification of acquired antimicrobial resistance genes. J. Antimicrob. Chemother. 67, 2640–
2644. doi:10.1093/jac/dks261.
Zarrilli, R., Pournaras, S., Giannouli, M., and Tsakris, A. (2013). Global evolution of multidrug-resistant
Acinetobacter baumannii clonal lineages. Int. J. Antimicrob. Agents 41, 11–19.
doi:10.1016/j.ijantimicag.2012.09.008.
CHAPTER 4 : Genomic evolution and local epidemiology of Klebsiella
pneumoniae from the Beijing Hospital 301 over a fifteen-year period:
dissemination of known and novel high-risk clones
Mattia Palmieri1, Kelly L. Wyres2, Andreu Coello Pelegrin1, Caroline Mirande3, Zhao Qiang4, Ye
Liyan4, Chen Gang4, Herman Goossens5, Kathryn E. Holt2, Alex van Belkum1, Luo Yan Ping4.
1bioMérieux, Data Analytics Unit, La Balme Les Grottes, France.
2Department of Infectious Diseases, Monash University, Melbourne, Victoria, Australia.
3bioMérieux, R&D Microbiology, La Balme Les Grottes, France.
4Chinese PLA General Hospital 301, BJ 301 clinical hospital laboratory, Beijing, China.
5Laboratory of Medical Microbiology, Vaccine and Infectious Disease Institute, University of Antwerp, Belgium.
Manuscript in preparation
4.1 Introduction
Klebsiella pneumoniae is one of the greatest infectious threats amongst Gram-negative pathogens.
Multidrug-resistant (MDR) strains causing hospital outbreaks and hypervirulent strains causing
severe community-acquired infections are of major concern (Paczosa & Mecsas 2016). In China,
hypervirulent K. pneumoniae (hvKp), primarily of clonal group (CG) 23, and carbapenem-resistant K.
pneumoniae (CR-Kp), mostly belonging to CG258, represent the two major clinically significant
lineages of K. pneumoniae (Struve et al. 2015; Zhang et al. 2017).
HvKp infections are characterized by high morbidity and mortality and they are mainly associated
with severe life-threatening liver abscesses, pneumonia, meningitis, and endophthalmitis in young
and healthy individuals (Shon et al. 2013). Several virulence factors have been reported in hvKp
strains. The capsular polysaccharide (cps) is a major virulence factor, and hvKp strains are usually
associated with K1 or K2 capsular serotypes, which were shown to be particularly anti-phagocytic and
to provide serum resistance (Kabha et al. 1995; Paczosa & Mecsas 2016). hvKp also harbor other
virulence genes: i) the rmpA and rmpA2 genes that upregulate capsule expression, ii) the colibactin
(clb) genotoxin that induces eukaryotic cell death and promotes bacterial translocation from the gut
to the blood, and iii) the yersiniabactin (ybt), aerobactin (iuc) and salmochelin (iro) siderophores that
enhance survival in the blood by promoting iron scavenging (Paczosa & Mecsas 2016). While the ybt
locus is generally mobilized by an integrative, conjugative element termed ICEKp (Lam, Wick, et al.
2018), the iro, iuc and rmpA/rmpA2 loci are usually co-located on a virulence plasmid (Lam, Wyres, et
al. 2018). CG23 strains are usually susceptible to most antibiotics (Siu et al. 2012); however, the last
few years have seen the emergence of MDR strains, including those resistant to carbapenems,
namely CR-hvKp (Bialek-Davenet et al. 2014; Liu et al. 2017; Shen et al. 2019; Dong, Lin, et al. 2018;
Chen et al. 2020).
Carbapenem resistance is rapidly increasing in China, and the CHINET surveillance network showed
that the resistance rate of K. pneumoniae to imipenem and meropenem increased from 3.0% and
2.9% in 2005 to 25% and 26.3% in 2018, respectively, resulting in a more than 8-fold increase (Hu et
al. 2016, 2019). KPC-2 is the most prevalent enzyme among CR-Kp in China, with 77% KPC-2 positive
strains among the carbapenemase producers reported in a recent study (Zhang et al. 2018). CG258 is
recognized worldwide as the most common clinical carbapenem-resistant clone and the major vector
of KPC-2, with ST258 being most prevalent in Europe and the USA (Chen et al. 2014) and ST11
accounting for 75% of CR-Kp in China (Zhang et al. 2018). Genomic studies revealed that most of the
ST11 CR-Kp strains in China harbour a capsule of type KL47 or the recently emerging KL64 (Dong,
Zhang, et al. 2018; Zhou et al. 2020). Recently, CR-Kp ST11 strains with a hypervirulent phenotype,
as defined by the carriage of the iuc aerobactin locus, have emerged (Gu et al. 2017; Yao et al. 2018;
Wong et al. 2018; Dong, Zhang, et al. 2018; Xu et al. 2019; Zhang et al. 2019; Yang et al. 2020; Zhou
et al. 2020). While the majority of these reports represent sporadic isolations, in 2017 a fatal
outbreak was caused by a CR-Kp ST11-KL47 strain harbouring a virulence plasmid containing the iuc
and the rmpA2 genes (Gu et al. 2017). Further retrospective investigations revealed that similar
strains were already circulating within China before the initial report (Gu et al. 2017; Yao et al. 2018).
Numerous studies have investigated the genetic epidemiology of CR-Kp in China (Van Dorp et al.
2019; Yang et al. 2020; Zhou et al. 2020). Here, we study a large collection of serially selected K.
pneumoniae strains obtained from patients in Hospital 301 (H301) in Beijing during the period 2002-2016.
Phenotypic antimicrobial susceptibility testing and whole genome sequencing (WGS) were employed to obtain a global
picture of the strains circulating within the hospital during the study period. Focusing on the broad
population, rather than on CR-Kp alone, allows us to trace the evolution towards MDR,
including ESBL production, and hypervirulence, as well as the convergence of the two traits.
4.2 Materials and methods
Bacterial isolates and antimicrobial susceptibility. Bacterial isolates were obtained from the 4,000-
bed Hospital 301 in Beijing, China. A total of 300 K. pneumoniae isolates were collected from routine
microbiological cultures of clinical samples (urine, blood, sputum, tissues, etc.) within the period 2002-2016.
Of those, 200 were randomly selected from different patients over the study period; they
represented 3% of the K. pneumoniae isolates collected during the study period and were used for
genomic epidemiology investigations. The additional 100 isolates were selected based on different
criteria (e.g. carbapenem-resistance, isolates from outbreak) and were included to enrich the analysis
of the major clones. Antimicrobial susceptibility testing was performed for all isolates with the Vitek2
automated system (bioMérieux, Marcy-l’Étoile, France), and results were interpreted according to
the EUCAST breakpoints (EUCAST 2019). Antimicrobials tested were: amikacin, aztreonam, cefepime,
ceftazidime, ciprofloxacin, ertapenem, gentamicin, imipenem, levofloxacin, piperacillin/tazobactam,
tobramycin and trimethoprim/sulfamethoxazole. We defined MDR as non-susceptibility to three
or more classes of antimicrobials, as described by Magiorakos et al. (2012).
Data were analysed with Python (v3.7.4), and statistical analyses were conducted using a linear
regression method from the scikit-learn package.
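To illustrate the MDR definition above, the following minimal Python sketch counts the antimicrobial classes for which an isolate shows at least one non-susceptible result. The grouping of the twelve tested agents into classes is an assumption based on the categories of Magiorakos et al. (2012) and is not stated in the text; the function names, the "S"/"I"/"R" coding and the example isolate are hypothetical.

```python
# Assumed grouping of the tested agents into antimicrobial classes
# (after the categories of Magiorakos et al. 2012; not given in the text).
DRUG_CLASS = {
    "amikacin": "aminoglycosides",
    "gentamicin": "aminoglycosides",
    "tobramycin": "aminoglycosides",
    "aztreonam": "monobactams",
    "cefepime": "extended-spectrum cephalosporins",
    "ceftazidime": "extended-spectrum cephalosporins",
    "ciprofloxacin": "fluoroquinolones",
    "levofloxacin": "fluoroquinolones",
    "ertapenem": "carbapenems",
    "imipenem": "carbapenems",
    "piperacillin/tazobactam": "penicillins + beta-lactamase inhibitors",
    "trimethoprim/sulfamethoxazole": "folate pathway inhibitors",
}


def nonsusceptible_classes(ast):
    """Return the classes with at least one non-susceptible (I or R) result."""
    return {DRUG_CLASS[drug] for drug, sir in ast.items() if sir in ("I", "R")}


def is_mdr(ast, min_classes=3):
    """MDR = non-susceptibility to three or more antimicrobial classes."""
    return len(nonsusceptible_classes(ast)) >= min_classes


# Hypothetical isolate: non-susceptible in three classes, so it counts as MDR.
example = {
    "ciprofloxacin": "R",
    "levofloxacin": "R",
    "gentamicin": "R",
    "ceftazidime": "I",
    "imipenem": "S",
}
print(sorted(nonsusceptible_classes(example)), is_mdr(example))
```

Counting classes rather than individual drugs means that resistance to both ciprofloxacin and levofloxacin contributes only once to the MDR call.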
Whole genome sequencing and assembly. Genomic DNA was extracted with the DNeasy UltraClean
kit (Qiagen, Hilden, Germany), quantified using the Qubit fluorometer (Thermo Fisher Scientific,
USA) and quality checked using the 260/280 absorbance ratio as determined by the
DS-11 FX+ instrument (DeNovix, Wilmington, USA). Sequencing was performed using a HiSeq
platform (Illumina, Inc., San Diego, USA) and a 2x150 bp paired-end approach. Raw data from paired-
end sequencing were quality checked with the FastQC tool (v.0.11.6) and assembled with SPAdes
(v.3.11.1) (Bankevich et al. 2012). Assemblies were inspected with Bandage (v0.8.1) (Wick et al.
2015).
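As a rough sketch of the read quality control and assembly steps described above, the Python snippet below only builds the FastQC and SPAdes command lines for one isolate and does not run them. The tools' default settings are assumed, because the text does not report the options used; the file names and output layout are hypothetical.

```python
def qc_and_assembly_commands(r1, r2, outdir):
    """Build (but do not run) the FastQC and SPAdes commands for one isolate.

    Default tool settings are assumed; the options actually used in the
    study are not reported in the text.
    """
    # FastQC writes its reports into an existing directory given with -o.
    fastqc = ["fastqc", r1, r2, "-o", f"{outdir}/fastqc"]
    # SPAdes takes the paired-end reads via -1/-2 and an output folder via -o.
    spades = ["spades.py", "-1", r1, "-2", r2, "-o", f"{outdir}/spades"]
    return [fastqc, spades]


# Hypothetical file names for one isolate.
for cmd in qc_and_assembly_commands("K089_R1.fastq.gz", "K089_R2.fastq.gz", "results/K089"):
    print(" ".join(cmd))
```

The returned lists can be passed to `subprocess.run(cmd, check=True)` once the tools are installed and the output directories exist.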
Bioinformatics analysis. Sequence types (STs) were assigned by the mlst tool
(github.com/tseemann/mlst) by using the Pasteur database (bigsdb.pasteur.fr/). The ABRicate tool
(github.com/tseemann/abricate) was used to detect acquired antimicrobial resistance genes using
the ResFinder database (Zankari et al. 2012), while plasmid replicons were predicted by
PlasmidFinder (Carattoli et al. 2014). Kaptive was used for the capsular type detection (Wyres et al.
2016a). Kleborate (github.com/katholt/Kleborate) was used for the species identification, detection
of ICEKp associated virulence loci (yersiniabactin (ybt), colibactin (clb)), virulence plasmid associated
loci (salmochelin (iro), aerobactin (iuc), hypermucoidy (rmpA, rmpA2)) and for checking the
ompK35/36 gene integrity. Phylogenetic analyses of CG258, CG23 and ST383 genomes were
performed by mapping the respective reads to the reference genomes GD4 (accession
no. CP025951), SGH10 (CP025080) and KpvST383_NDM_OXA-48 (CP034200), respectively. Snippy
was used for read mapping (github.com/tseemann/snippy). The whole genome alignments
obtained were screened for recombination using Gubbins (v2.3.4) (Croucher et al. 2015), while a
maximum likelihood phylogeny was obtained by using RAxML (v8.2.12) (Stamatakis 2014) with the
GTRGAMMA model and 100 bootstrap replicates. Core genome single nucleotide polymorphism (SNP)
distances were obtained with the snp-dists tool (github.com/tseemann/snp-dists) by using the Gubbins
output. The phylogenetic trees were visualized together with associated metadata using Microreact
(v7.0.0) (Argimón et al. 2016) or Phandango (Hadfield et al. 2018). The Harvest suite was used to align
and visualize genomes of CG23 and ST35 strains in order to decipher the recombination events
within ST1265 (Treangen et al. 2014).
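The pairwise core SNP distances referred to above can be illustrated with a short, dependency-free Python sketch. It is not the snp-dists implementation; it only mimics the idea of counting differing positions in a recombination-filtered alignment, under the assumption that gaps and ambiguous bases are ignored. The toy alignment and strain names are hypothetical.

```python
from itertools import combinations
from statistics import mean, median

VALID_BASES = set("ACGT")


def snp_distance(seq_a, seq_b):
    """Count positions where both sequences carry an unambiguous, different base."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in VALID_BASES and b in VALID_BASES and a != b
    )


def pairwise_summary(alignment):
    """Summarise all pairwise SNP distances of a {name: aligned sequence} dict."""
    distances = [
        snp_distance(alignment[x], alignment[y])
        for x, y in combinations(sorted(alignment), 2)
    ]
    return {
        "mean": mean(distances),
        "median": median(distances),
        "min": min(distances),
        "max": max(distances),
    }


# Toy alignment; the N in K003 is skipped in every comparison.
toy = {
    "K001": "ACGTACGTAC",
    "K002": "ACGTACGTAT",
    "K003": "ACCTNCGTAT",
}
print(pairwise_summary(toy))
```

A summary of this kind (mean, median, minimum and maximum) corresponds to how SNP distances within clades are reported in the results.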
4.3 Results and discussion
A total of 300 K. pneumoniae isolates were successfully sequenced. One isolate was subsequently identified
as K. michiganensis and was excluded, leaving 299 isolates. Of those, 200 were randomly selected
over the 15-year period (2002-2016) and were considered for longitudinal and epidemiological
investigations. In silico species identification reported the presence of the four major K. pneumoniae
species (Figure 7), with a predominance of K. pneumoniae sensu stricto (N=177, 88.5%) followed by K.
quasipneumoniae subsp. similipneumoniae (N=11, 5.5%), K. quasipneumoniae subsp.
quasipneumoniae (N=8, 4%) and K. variicola (N=4, 2%). No particular trends in terms of species
abundance were observed.
Figure 7. Phylogenetic analysis of the whole K. pneumoniae collection, showing the different K. pneumoniae species.
4.3.1 Antimicrobial susceptibility.
Phenotypic results highlighted imipenem as the most effective drug, with 94.5% susceptibility,
followed by amikacin and ertapenem (both at 87.5% susceptibility) (Table 2). By clustering the strains
in 5-year groups (2002-2006, 2007-2011 and 2012-2016), we observed a decrease in susceptibility
rates for most of the drugs. The observed trends were statistically significant for imipenem
(p value=0.024), ertapenem (0.048) and ceftazidime (0.045). Data from the China Antimicrobial
Resistance Surveillance System (CARSS) revealed that the resistance rates of K. pneumoniae were on
a rising trend and reached 34.5% and 8.7% in 2016 for third-generation cephalosporins and
carbapenems, respectively (CARSS). In line with such results, K. pneumoniae resistance rates reached
51.0 and 4.1% in 2016 for ceftazidime and imipenem, respectively, within H301.
Overall, the majority of the strains were classified as MDR (N=118, 59%), with an increase from 44.8%
through 51.2% to 64.8% over the three 5-year periods.
Period | AK | ATM | FEP | CAZ | CIP | ETP | GEN | IPM | LEV | TZP | TOB | SXT
2002-2006 | 86.2 | 55.2 | 79.3 | 72.4 | 39.3 | 96.6 | 73.1 | 100 | 63.0 | 89.3 | 65.5 | 75.0
2007-2011 | 90.2 | 62.8 | 79.1 | 69.8 | 37.2 | 90.7 | 58.1 | 97.7 | 65.1 | 74.4 | 51.2 | 65.1
2012-2016 | 87.5 | 46.1 | 72.7 | 50.8 | 35.2 | 84.4 | 54.7 | 92.2 | 62.5 | 79.5 | 51.2 | 45.3
Total | 87.5 | 51.0 | 75.0 | 58.0 | 36.1 | 87.5 | 57.8 | 94.5 | 63.1 | 79.8 | 53.3 | 53.8
Table 2. Percentages of susceptibility to the tested drugs, by 5-year period. AK: amikacin, ATM: aztreonam, FEP: cefepime, CAZ: ceftazidime, CIP: ciprofloxacin, ETP: ertapenem, GEN: gentamicin, IPM: imipenem, LEV: levofloxacin, TZP: piperacillin/tazobactam, TOB: tobramycin, SXT: trimethoprim/sulfamethoxazole.
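The direction and size of the susceptibility trends in Table 2 can be approximated with an ordinary least-squares slope over the three 5-year periods. The sketch below uses plain Python instead of scikit-learn to remain dependency-free, and it only estimates slopes: it does not reproduce the p values reported in the text, which were presumably derived from more granular data. The function name is hypothetical.

```python
# Susceptibility (%) per 5-year period, copied from Table 2 for three drugs.
PERIODS = ["2002-2006", "2007-2011", "2012-2016"]
SUSCEPTIBILITY = {
    "IPM": [100.0, 97.7, 92.2],
    "ETP": [96.6, 90.7, 84.4],
    "CAZ": [72.4, 69.8, 50.8],
}


def ols_slope(values):
    """Least-squares slope of the values against the period index 0, 1, 2, ..."""
    n = len(values)
    x = list(range(n))
    x_mean = sum(x) / n
    y_mean = sum(values) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, values))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    return sxy / sxx


for drug, values in SUSCEPTIBILITY.items():
    print(f"{drug}: {ols_slope(values):+.2f} percentage points per period")
```

A negative slope indicates declining susceptibility; with only three time points, such a fit describes the trend but cannot by itself support a significance claim.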
4.3.2 Genomic epidemiology
Considering the random collection of 200 strains, 98 different STs were observed, including 27 novel
STs. The majority of STs (72.4%) were represented by only a single strain, highlighting the diversity
within the K. pneumoniae population. Eight clonal groups (CGs) were represented by at least five
strains, including the most frequent CG258 (N=28), CG23 (N=14), CG37 (N=13), CG14 (N=10), CG65
(N=9), CG15 (N=8), CG147 (N=8) and CG307 (N=7); Figure 8 and Table 3 summarize their major
features. Strains belonging to CG258 represented 14% of the population overall, and 60% of all
carbapenemase producers.
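The diversity figures quoted above (number of distinct STs and the share of STs represented by a single strain) can be derived from a per-strain list of ST assignments. The following minimal sketch assumes such a list is available; the function name and the toy input are hypothetical and do not reproduce the study data.

```python
from collections import Counter


def st_diversity(sts):
    """Summarise ST diversity from one ST label per strain."""
    counts = Counter(sts)
    singletons = sum(1 for n in counts.values() if n == 1)
    return {
        "distinct_sts": len(counts),
        "singleton_sts": singletons,
        # Share of STs (not of strains) that occur exactly once.
        "singleton_pct": round(100 * singletons / len(counts), 1),
    }


# Toy input with eight strains.
toy_sts = ["ST11", "ST11", "ST11", "ST23", "ST23", "ST37", "ST307", "ST15"]
print(st_diversity(toy_sts))
```

Note that the percentage is computed over STs rather than over strains, which matches the way the singleton share is phrased in the text.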
A total of 73 different K loci were detected, with 60 of them represented by a maximum of three
strains. The major K loci were KL2 (N=21, including ST14, ST65, ST380, ST375, ST86 and ST25), KL1
(N=17, including ST23, ST367 and two novel STs) and KL107 (N=10, including ST15 and 5 other less
represented STs). CG258 strains had the highest number of K loci, with 12 different ones detected, of
which 11 were detected in ST11 strains. CG37 was the second clonal group by K locus diversity, with
eight different ones detected. Conversely, CG23 and CG65, the hypervirulent clones, had K locus type
KL1 and KL2 only, respectively (Table 3).
Table 3. Features of the major CGs observed. The brackets enclose the percentages.
CG | Count | MLST (STs) | K loci | MDR | ESBL | Carbapenemase | ybt | iuc | clb | rmpA | rmpA2
CG258 | 28 | ST11, ST11-1LV, ST1264, ST340, ST437 | KL105, KL110, KL111, KL14, KL141, KL142, KL15, KL22, KL25, KL36, KL39, KL47, KL64 | 25 (89.3) | 15 (53.6) | 9 (32.1) | 16 (57.1) | 2 (7.1) | 0 | 0 | 2 (7.1)
CG23 | 14 | ST23 | KL1 | 2 (14.3) | 1 (7.1) | 1 (7.1) | 14 (100.0) | 14 (100.0) | 14 (100.0) | 13 (92.9) | 14 (100.0)
CG37 | 13 | ST309, ST37, ST726, ST727 | KL118, KL12, KL122, KL128, KL15, KL21, KL23, KL42 | 6 (46.2) | 8 (61.5) | 1 (7.7) | 1 (7.7) | 0 | 0 | 0 | 0
CG14 | 10 | ST14 | KL16, KL2 | 5 (50.0) | 2 (20.0) | 0 | 0 | 0 | 0 | 0 | 0
CG65 | 9 | ST375, ST65 | KL2 | 0 | 2 (22.2) | 0 | 5 (55.6) | 8 (88.9) | 5 (55.6) | 8 (88.9) | 6 (66.7)
CG147 | 8 | ST147, ST273 | KL14, KL64, KL74, KL81 | 7 (87.5) | 4 (50.0) | 2 (25.0) | 1 (12.5) | 1 (12.5) | 0 | 1 (12.5) | 1 (12.5)
CG15 | 8 | ST15 | KL107, KL19, KL24, KL48 | 8 (100.0) | 6 (75.0) | 0 | 0 | 0 | 0 | 0 | 0
CG307 | 7 | ST307 | KL102 | 6 (85.7) | 7 (100.0) | 0 | 1 (14.3) | 0 | 0 | 0 | 0
Figure 8. Features of the major CGs observed among the 200 randomly collected strains. The prevalence of MDR vs MDS (A) and the types of ESBLs (B), carbapenemases (C) and capsular types (D) observed within the major CGs.
4.3.3 Antimicrobial resistance determinants.
More than half of the strains (N=110, 55%) harboured an ESBL-encoding gene, with 13 strains
harbouring more than one such gene (up to four genes per strain). The most common ESBLs observed
were of the CTX-M type, with CTX-M-14 (N=35), CTX-M-3 (N=26) and CTX-M-15 (N=22) being the
most prevalent. CG307 strains had the highest prevalence of ESBLs, with all strains encoding
either CTX-M-15 or CTX-M-14.
Four different carbapenemase-encoding genes were observed: blaKPC-2 (N=10), blaIMP-4 (N=2), blaOXA-48
(N=2) and blaIMP-30 (N=1). Strains belonging to ST11 carried most of the blaKPC-2 genes (90%), while the
remaining gene was found in an ST37 strain. The blaIMP-4 genes were observed in a hypervirulent
ST23 strain and in an ST337 strain. Two ST147 strains had either blaOXA-48 or blaIMP-30, and an ST383
strain had blaOXA-48.
Mutations in ompK genes were observed in 42 strains (21%) and consisted of insertions and deletions
leading to premature termination of OmpK35, which in a few cases (N=9) were combined with
simultaneous ompK36 alterations. Such porin deficiencies were mainly observed within CG258, with
22 mutated strains out of 28 (78.6%). No porin alterations were observed for hypervirulent CG23
and CG65 strains.
Genes encoding 16S rRNA methyltransferases, associated with high-level aminoglycoside resistance,
were observed, with 13 strains harbouring armA, 11 harbouring rmtB and 2 strains harbouring
both armA and rmtB. Such genes were mainly observed in strains belonging to ST11 (N=9) and ST15
(N=4).
Several chromosomal mutations associated with known fluoroquinolone resistance were observed,
the most common being ParC80I (N=57), GyrA83I (N=47) and GyrA83F (N=11). Overall, 65 strains (32.5%)
had at least one ParC or GyrA mutation, the most common combination being GyrA83I-ParC80I (N=37),
and all 65 strains had a high ciprofloxacin MIC (≥4 mg/L). Concerning the acquired fluoroquinolone
resistance mechanisms, QnrS1 (N=65), Aac(6')-Ib-cr (N=61) and QnrB4 (N=33) were the most
prevalent. Overall, 150 strains had at least one mechanism of fluoroquinolone resistance.
Genes encoding resistance to trimethoprim (dfrA) and sulfonamides (sul) were observed in 138
strains, with 100 carrying both genes and showing trimethoprim/sulfamethoxazole resistance.
Acquired mechanisms of colistin resistance were also observed. The mcr-1.1 gene was observed in
the K. pneumoniae ST231 strain K089 isolated in 2015. The gene was carried by a plasmid with an
IncX4 replicon that was identical to plasmid pMCR_WCHEC1618 (accession no. KY463454.1) obtained
from an E. coli strain from China in 2015 (Zhao et al. 2017). Strain K089 also encoded the ESBL CTX-M-27,
as well as fluoroquinolone, trimethoprim and sulfonamide resistance mechanisms. Two mcr-9.1
genes were detected in K. quasipneumoniae subsp. quasipneumoniae strains K7029 and K7030,
both belonging to ST1681 and collected in 2005. Unfortunately, relying on Illumina short reads
alone, we were not able to determine the genetic context of the mcr-9.1 genes.
4.3.4 Hypervirulent K loci and acquired virulence genes.
The K. pneumoniae capsule is a major virulence factor, and the capsule synthesis locus shows considerable
genetic diversity between clonal groups (DeLeo et al. 2014; Wyres et al. 2015; Holt et al. 2015; Wyres
et al. 2016b). The hypervirulence-associated KL1 and KL2 represented the two most common capsular
polysaccharides within our collection. KL2 was associated with CG14 and CG65 strains (N=9 each),
and three more strains belonging to ST380, ST86 and ST25. KL1 was strictly linked to ST23 in K.
pneumoniae sensu stricto (N=14). KL1 was also observed in an ST367 K. quasipneumoniae subsp.
similipneumoniae strain, in a novel ST (double locus variant of ST367) belonging to K. quasipneumoniae subsp.
similipneumoniae, and in a novel ST (single locus variant of ST527) belonging to K. variicola.
Siderophore gene acquisition was recently recognised as an important contributor to severe K.
pneumoniae invasive disease (Holt et al. 2015; Lam, Wick, et al. 2018). Lam et al. reported that the
ybt locus was present in 40.0% of CG258 strains, 87.8% of hypervirulent CG23 strains, and
32.2% of the wider population. In our collection, yersiniabactin-encoding genes were observed in 61
strains (30.5%), and were located on eight different chromosomally integrated ICEKp mobile elements
and one plasmid. The major mobile elements were ICEKp10 (N=22) and ICEKp3 (N=17). While
ICEKp10 was linked to hypervirulent clones (CG23, N=14; CG65, N=5), ICEKp3 was mostly associated
with CG258 (N=9) and other non-hypervirulent clones. We observed ybt genes in 57.1% and 100% of
CG258 and CG23 strains, respectively, which is higher than previously reported (Lam, Wick, et al.
2018).
Plasmid-related iuc, iro, clb, rmpA and rmpA2 genes were also observed (iuc, 17%; iro, 16.5%; rmpA,
16%; rmpA2, 15%; clb, 11%), mostly associated with CG23 and CG65 (Figure 9). Because of its crucial
role in hypervirulence, aerobactin (iuc) positivity was considered a defining genetic trait for hvKp
(Russo et al. 2014). iuc1 was the most prevalent iuc lineage (N=32), and was linked to CG23 (N=14),
CG65 (N=8) and six other less represented CGs, including ‘classic’ clones and a K.
quasipneumoniae subsp. similipneumoniae strain. iuc1 is usually located within the KpVP-1 virulence
plasmid (Lam, Wyres, et al. 2018) together with the previously mentioned virulence genes. We found
iuc1 together with iro1 (N=28), clb2 (N=14), clb3 (N=4), rmpA (N=28) and rmpA2 (N=29). Other iuc
lineages observed were iuc2, which is associated with KpVP-2 (Lam, Wyres, et al. 2018) and was observed in
an ST380 strain, and iuc5, observed in an ST107 strain.
Figure 9. Percentages of virulence genes within the major CGs.
4.3.5 Comparative genomics of CG258 strains: cps diversity and hypervirulence
Figure 10. Phylogenetic analysis of CG258 strains, including 48 strains from this study and 18 strains from previous studies (Gu et al. 2017; Dong, Zhang, et al. 2018; Zhou et al. 2020). The fatal outbreak clone reported in China in 2017 (Gu et al. 2017) is highlighted on the tree. Aerobactin and salmochelin are not shown in the legend as they were of the type iuc1 and iro1 only, respectively. Chromosomal regions characterized by high SNP density are reported on the right and their locations are shown relative to the reference GD4 genome (CP025951). Red blocks indicate predicted recombinations occurring on an internal branch, which are therefore shared by multiple isolates through common descent. Blue blocks represent recombinations that occur on terminal branches, which are unique to individual isolates.
Considering all 299 genomes, we obtained 48 non-duplicate CG258 genomes (ST11, N=40; ST11-1LV,
N=3; ST395, ST437, ST1264, ST340, ST1326, N=1 each). The rapid evolution within CG258 was
emphasized by the number of different capsular polysaccharides detected (N=17), of which 11 were
detected in ST11 only, and by the high evolutionary rate (~15 SNPs/genome/year) reported in
previous studies (Wyres et al. 2015; Zhou et al. 2020).
Figure 10 shows the phylogenetic relations of the 48 strains together with other ST11 strains
sequenced in previous studies. Two major clades were formed, with clade 1 consisting of ST11-KL47
and ST11-KL64 only, and clade 2 consisting of six different STs and 15 different cps types. The average
core SNP distance between clade 1 strains was 23, ranging from 0 to 60. Consistent with previous
studies, the major CG258 clone was ST11-KL47-KPC-2, which was similar to strains recently described
in China and causing outbreaks, including the fatal one that caused 5 deaths in 2017 (Gu et al. 2017;
Dong, Zhang, et al. 2018; Zhou et al. 2020). All strains from this clade harboured blaKPC-2 and carried
the ybt9 locus on an ICEKp3 element. Two of our ST11-KL47 strains were CR-hvKp and carried blaKPC-2
plus a pLVPK-like plasmid containing iuc1 and a truncated rmpA2. Retrospective studies have shown
that ST11-KL47 CR-hvKp emerged before 2015 and has since become detectable in different Asian
countries, including China, Hong Kong and India, suggesting that CR-hvKp may undergo worldwide
dissemination in the near future (Shankar et al. 2016; Wong et al. 2017; Du et al. 2018).
Clade 2 strains had 47 core SNPs on average, ranging from 0 to 123 (median 45). Recent studies
revealed the emergence and predominance of a novel ST11 clone, harbouring KL64, KPC-2 and the
hypervirulence plasmid in some instances (Zhou et al. 2020; Yang et al. 2020). Genomic analysis
revealed that this clone originated from ST11-KL47 after recombination of the cps genes around 2011
(Zhou et al. 2020). Of note, ST11-KL64 strains from this study did not cluster in clade 1 together with
previously reported ST11-KL64 strains, but they were located within clade 2. Analysis of
recombination sites revealed that such strains had two major regions of recombination, the cps
genes and the ICEKpnHS11286-1 region. Conversely, ST11-KL64 strains described by Zhou et al. only
showed recombination within the cps biosynthesis genes. Such findings suggest a different
evolutionary origin of ST11-KL64 strains from this study compared to the emerging clone described
by Zhou et al. The three ST11-KL64 strains in our collection were isolated in 2006 and 2007; they
lacked the blaKPC-2 gene and the ybt locus, which is normally present in the ICEKpnHS11286-1
recombinant region. Strain ST11-KL64 K7069, isolated in 2007, carried a pLVPK-like plasmid
containing iuc1 and a truncated rmpA2 and also co-harboured blaCTX-M-3, armA and several other AMR
genes (Table S1). Only three strains out of the 28 composing the lower clade harboured blaKPC-2. Also,
the prevalence of yersiniabactin-encoding genes was lower compared to that of clade 1, with twelve
strains carrying either ybt9, ybt10, ybt13 or ybt14.
4.3.6 Phylogenetic analysis of the hypervirulent CG23
Figure 11. Comparative genomics of CG23 strains from the present study. STs are indicated by coloured tips, with yellow and green indicating ST23 and ST1265, respectively. All strains also contained the cps KL1, ybt1, clb2 and a truncated rmpA2. *replicons IncFIB(K) and IncHI1B of the pLVPK-like plasmid were observed in all strains.
A total of 19 non-duplicate CG23 strains were sequenced over the study period (Figure 11). All
belonged to ST23, except strain K7159 which belonged to ST1265. Average core SNPs observed were
186, ranging from 49 to 288 (median 188). All genomes contained the KL1 capsular locus, the
chromosomally encoded ybt1 embedded in ICEkp10 and the colibactin locus clb2. The hypervirulent
plasmid with IncFIB(K) and IncHI1B replicons was observed in all strains, containing iuc1, iro1, rmpA
and rmpA2 in most instances (Figure 11).
Strain K7159 (ST1265) shared six MLST alleles with ST23, differing only at phoE, which is allele 9 in
ST23 and allele 10 in ST1265. ST1265 was first described in Beijing in 2010, associated
with KL1 cps type, rmpA and a negative string test (Liu et al. 2014). Recombination analysis revealed
that strain K7159 had a ~750 Kbp recombinant region which also contained the phoE gene. Genomic
comparison revealed that this region likely originated from an ST35 genome (Figure 12).
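The single-locus relationship between ST23 and ST1265 described above can be checked with a few lines of code. The sketch below is illustrative only: the locus names are those of the seven-gene K. pneumoniae MLST scheme, the phoE alleles (9 and 10) come from the text, and the allele numbers of the six shared loci are placeholders, not the real profile.

```python
# Seven loci of the K. pneumoniae MLST scheme.
MLST_LOCI = ("gapA", "infB", "mdh", "pgi", "phoE", "rpoB", "tonB")


def differing_loci(profile_a, profile_b):
    """Return the loci at which two allelic profiles differ."""
    return [locus for locus in MLST_LOCI if profile_a[locus] != profile_b[locus]]


# Placeholder allele number (0) for the six shared loci: only the phoE
# alleles (9 in ST23, 10 in ST1265) are taken from the text.
shared = {locus: 0 for locus in MLST_LOCI if locus != "phoE"}
st23 = {**shared, "phoE": 9}
st1265 = {**shared, "phoE": 10}

diff = differing_loci(st23, st1265)
is_single_locus_variant = len(diff) == 1
print(diff, is_single_locus_variant)
```

A single differing locus is what makes ST1265 a single-locus variant of ST23, and therefore a member of CG23.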
Figure 12. Whole genome alignment of ST1265 in comparison to ST23 and ST35 genomes. The SGH10 chromosome was used as reference for the alignment. Pink lines indicate SNPs identified with the Harvest suite. The MLST gene phoE position is indicated, as well as the ~750 Kb region of divergence of ST1265 strains originating from ST35 genomes.
Strain K7159 was nearly identical to strain 11420 (GCA_009497755.1) isolated in Beijing in 2014 (Li et
al. 2020). Strain 11420 consists of a chromosome of length 5’438’591 bp, a pLVPK-like plasmid of size
229’796 bp and a KPC-2 plasmid of size 81’180 bp, containing the replicon IncN without additional
AMR genes. Read mapping analysis showed that our ST1265 genome also contained two plasmids
with identical organization and 99.9% nucleotide identity compared to plasmids from strain 11420.
Three additional cases of genomic convergence of MDR and hypervirulence were observed. Strains
K931 and K862 both carried a ~50 Kbp IncN plasmid similar to pIMP-HZ1 (KU886034.1) described in
IMP-4-producing Enterobacteriaceae from China (Wang et al. 2017). While K862 carried a plasmid
identical to pIMP-HZ1, the IncN plasmid from strain K931 had blaCTX-M-3 and blaTEM-1 replacing the
blaIMP-4 gene. Strain K7046 had a plasmid identical to pCTX-M-3 (AF550415) described in C. freundii in
Poland (Gołȩbiewski et al. 2007). It is a ~90 Kbp IncL/M plasmid carrying blaCTX-M-3, armA, blaTEM-1,
aac(3)-IId, mph(E), msr(E), sul1, aadA2 and dfrA12 genes.
4.3.7 Global comparison of ST383: an emerging high-risk clone
Figure 13. Phylogenetic tree of ST383 genomes from this study in comparison with publicly available ST383 genomes. Coloured leaves indicate different capsular polysaccharides, where yellow is for KL30 and green for KL15.
We investigated the ST383 strains in depth because several of them were CR-hvKp.
ST383 is an emerging clone that was first observed in Greek hospitals during 2009-2010 and strains
belonging to this clone were co-harbouring blaVIM-4, blaKPC-2 and blaCMY-4 β-lactamases (Papagiannitsis
et al. 2010). Figure 13 shows the phylogenetic relatedness of our ST383 together with publicly
available ST383 genomes. Only ten genomes were available, with most of them originating from
Greece. Strain KpvST383_NDM_OXA-48 from the UK had a complete genome and it was used as
reference for the phylogeny (Turton et al. 2019). Genomic relatedness showed strains from Europe
clustering together, the strain from the UK positioned apart from the rest of the tree, and the
Chinese strains from this study clustering together. Overall, an average of 158 core SNPs was
observed (min: 4, max: 627, median: 157), which decreases to 53 (min: 4, max: 182, median: 40) if we
only consider the strains from China. Two different K loci were observed, with the strain from
Belgium carrying KL15 and all other strains carrying KL30. Gubbins analysis revealed that the capsular
polysaccharide genes represented the major recombinant region. A second recombination concerned
a ~12 Kbp region consisting of mercury resistance genes and several transposases. No other major
recombination events were observed. Several carbapenemase-encoding genes were observed,
comprising the major clinically relevant KPC, OXA-48, NDM and VIM types, with two strains co-
harbouring two different carbapenemase genes. All strains from China carried the blaOXA-48 gene and
had an IncL/M plasmid replicon. ESBL-encoding genes were blaCTX-M-14, observed in all strains from
China, and strain K57 additionally had blaCTX-M-55.
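The core SNP figures quoted above (average, minimum, maximum, median) are summary statistics over SNP distances between genomes. As a minimal sketch, assuming they are computed over all pairs of strains in a core-genome alignment, the calculation could look as follows. The toy alignment and strain names are invented for illustration and are not data from this study.

```python
from itertools import combinations
from statistics import mean, median


def snp_distance(seq_a, seq_b):
    """Count aligned positions that differ, ignoring gaps and ambiguous bases."""
    ignored = {"-", "N"}
    return sum(
        1
        for a, b in zip(seq_a, seq_b)
        if a != b and a not in ignored and b not in ignored
    )


def summarize_pairwise_snps(alignment):
    """Return mean, min, max and median over all pairwise SNP distances."""
    dists = [
        snp_distance(alignment[x], alignment[y])
        for x, y in combinations(sorted(alignment), 2)
    ]
    return {
        "mean": mean(dists),
        "min": min(dists),
        "max": max(dists),
        "median": median(dists),
    }


# Invented toy core alignment (one sequence per strain, equal lengths).
toy_alignment = {
    "strain_A": "ACGTACGTAC",
    "strain_B": "ACGTACGTAT",
    "strain_C": "ACGAACGTAT",
    "strain_D": "TCGAACGNAT",
}
summary = summarize_pairwise_snps(toy_alignment)
print(summary)
```

Restricting the dictionary to a subset of strains (for example, only those from one country) before calling the function gives the subset statistics in the same way.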
Concerning virulence factors, yersiniabactin-encoding genes were not observed. Conversely, the
hypervirulent pLVPK-like plasmid was observed in some strains from China and in the strain from the
UK. Although it was not possible to fully reconstruct the hv plasmid sequences from our short-read
sequence data, we detected iuc1 on a contig that matches a 45 Kb region of pLVPK and also carries
rmpA and rmpA2.
Strains belonging to ST383 and carrying OXA-48 plasmids were previously reported, with reports
from the UK (Dimou et al. 2012) and from China (Guo et al. 2016). In the latter study, Guo et al.
reported an outbreak caused by ST383 strains carrying a 70 Kb IncL/M OXA-48 plasmid. ST383 strains
carrying hypervirulence genes were also reported from the UK; they harboured the iuc and rmpA/A2 genes
together with carbapenemase-encoding genes of type blaOXA-48, sometimes in combination with
blaNDM (Turton et al. 2017, 2019).
4.3.8 Simultaneous carriage of acquired AMR and hypervirulence genes
We detected eleven examples of genomic convergence of hypervirulence, indicated by the presence
of the aerobactin locus (iuc), and MDR, indicated by the presence of either an ESBL- or a
carbapenemase-encoding gene, in our 200 randomly selected strains (5.5%), spanning eight different
STs. Similarly, in a recent study from South and Southeast Asia aiming at studying the population
structure of bloodstream infection isolates, the prevalence of convergent strains was 7.3%, with
seven different STs observed (Wyres et al. 2020). By considering our complete collection of genomes
after exclusion of duplicates, we ended with 25 cases of genomic MDR-hv convergence (Table S2).
The occurrence of such convergent strains is on the rise, with 80% of them being detected in the
period 2012-2016. Among the convergent strains, the major ST reported was ST383, with 6 cases,
followed by ST11 and ST23 (3 cases each), ST29 (two cases) and eleven other STs with only one case.
Most cases of convergence (N=21) were characterized by the presence of a pLVPK-like plasmid. Such
a plasmid is common within hypervirulent clones such as CG23 and CG65, and we observed more
than 80% of its sequence within our CG23 and CG65 convergent strains. Conversely, variable portions
of the virulent plasmid were observed in normally non-hypervirulent clones (Table S2).
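The convergence criterion used above (an aerobactin locus together with an ESBL- or carbapenemase-encoding gene) is a simple rule over the gene content of each genome. The sketch below is a simplified illustration: the prefix lists are a small assumed subset of gene families, the strain names and gene lists are invented, and a real analysis would rely on curated typing tools rather than string prefixes.

```python
# Assumed, non-exhaustive gene-family prefixes, for illustration only.
CARBAPENEMASE_PREFIXES = ("blaKPC", "blaNDM", "blaVIM", "blaIMP", "blaOXA-48")
ESBL_PREFIXES = ("blaCTX-M",)


def is_convergent(genes):
    """True if a genome carries iuc plus an ESBL or carbapenemase gene."""
    has_iuc = any(g.startswith("iuc") for g in genes)
    has_mdr = any(g.startswith(CARBAPENEMASE_PREFIXES + ESBL_PREFIXES) for g in genes)
    return has_iuc and has_mdr


def prevalence(n_convergent, n_total):
    """Prevalence of convergent strains, as a percentage."""
    return 100 * n_convergent / n_total


# Invented gene content for five hypothetical strains.
strains = {
    "s1": ["iuc1", "rmpA2", "blaKPC-2"],
    "s2": ["iuc1", "iro1", "rmpA"],
    "s3": ["blaCTX-M-15", "blaTEM-1"],
    "s4": ["iuc3", "blaCTX-M-3", "blaIMP-4"],
    "s5": ["iuc5", "blaTEM-1"],
}
convergent = sorted(name for name, genes in strains.items() if is_convergent(genes))
print(convergent, prevalence(len(convergent), len(strains)))

# The figure reported in the text: 11 convergent strains out of 200.
print(prevalence(11, 200))
```

Note that blaTEM-1 alone does not count towards MDR here, since it is neither an ESBL nor a carbapenemase.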
Aerobactin loci detected were of type 1 (N=21), 3 (N=3) and 5 (N=1). Most of the iuc1 convergent
strains belonged to ST383, ST23 and ST11 and were previously described. In some cases we were
able to detect the genetic background of the hv and MDR genes. The K. quasipneumoniae subsp.
similipneumoniae strain K898 belonged to ST367 and had a hypervirulent capsule of the KL1 type. It
carried a blaCTX-M-15 gene in an IncFII plasmid together with blaTEM-1. This IncFII plasmid is ~95 Kbp and
is identical to pL22-5 (CP031262.1) obtained from an ST367 from Beijing. The pLVPK-like plasmid was
characterized by the presence of the replicon IncFIB(K) and by the virulence genes iuc1, iro1, rmpA
and a truncated rmpA2. Strain K7058 belonged to ST65 and carried a pLVPK-like plasmid plus an ~70
Kb IncFII plasmid harbouring blaCTX-M-15 and no other AMR genes.
Three strains carried iuc3 which was associated with IncFIBK and IncFII plasmids similar to NCTC11676
(NZ_UGMR01000002.1). Two of those strains also carried iro3 and an ICEKp1 element containing
ybt2 and rmpA. All three strains carried multiple ESBL-encoding genes, and strain K7156 additionally
harboured a blaIMP-4 carbapenemase-encoding gene.
The strain K7146 belonged to ST107 and carried iuc5 together with iro5, which have been previously
detected in E. coli plasmids such as p3PCN033 (CP006635.1). Read mapping revealed that our ST107
strain contained a plasmid with 90% coverage and 99.5% identity compared to p3PCN033, including
the plasmid replicons IncFIB, IncFIC and IncQ1 and several AMR genes (aph(3')-Ia, aph(6)-Id, aph(3'')-
Ib, sul2, oqxA/B, dfrA17, blaTEM-1B, tet(B)). K7146 also carried the ESBL-encoding gene blaCTX-M-3 on a
plasmid with replicons IncN and IncU, also containing additional AMR genes (aac(6')-Ib-cr, ARR-3,
qnrS1, catA1, mph(A), dfrA14).
4.4 Conclusions
This study aimed to investigate the longitudinal population of K. pneumoniae clinical isolates from
the Hospital 301 (People's Liberation Army General Hospital) in Beijing, China. The major focus was
directed towards the investigation of ‘high-risk’ clones, those characterized by the simultaneous
carriage of AMR and hypervirulence genes and potentially able to cause serious infections with
limited treatment options. A major limitation was that the sample size was small, especially if we
consider that it was spread over a long time frame. While some sporadic clones may have been
missed from our collection, the major K. pneumoniae clones, as described in previous reports from
China (Zhang et al. 2016; Van Dorp et al. 2019; Yang et al. 2020; Zhou et al. 2020), were observed.
While we did not obtain a complete picture of the complex K. pneumoniae population, we were able to
detect the major AMR and virulence determinants and, in some cases, their genetic environment. We
detected three major high-risk clones, characterized by ESBL and/or carbapenemase production or
hypervirulence, with some strains expressing both features simultaneously. Strains belonging to
CG258, the globally dominant clinical K. pneumoniae clone, were the most represented and showed
high diversity. However, one clone, ST11-KL47, represented the majority of strains, and was highly
associated with KPC-2 and several virulence factors. CG23 still remains the dominant hvKp clone.
While it is usually susceptible to multiple antibiotics, we found some strains harbouring MDR
plasmids encoding for ESBLs and carbapenemases. Moreover, we found a strain belonging to the
recently described ST1265 and showed that it is a hybrid strain originating from an ST23 and an
ST35. The simultaneous carriage of the cps KL1, the hypervirulence plasmid and a KPC-2 plasmid
underscores the importance of tracking the spread of such a novel clone. We also reported the
emergence of a recently described high-risk clone, ST383. In contrast to strains belonging to CG258,
which are usually associated with KPC-2, ST383 strains seem to readily acquire carbapenemases of
different types, sometimes harbouring two at once. Moreover, we found several ST383
strains carrying the hypervirulent plasmid. The combination of carbapenem resistance and
hypervirulence significantly reduces the antimicrobial options for treating the life-threatening
infections caused by such strains and therefore represents a major urgent challenge for clinical
treatment, infection control and public health (Chen & Kreiswirth 2017).
4.5 References
Argimón S et al. 2016. Microreact: visualizing and sharing data for genomic epidemiology and
phylogeography. Microb. genomics. 2:e000093. doi: 10.1099/mgen.0.000093.
Bialek-Davenet S et al. 2014. Genomic definition of hypervirulent and multidrug-resistant Klebsiella
pneumoniae clonal groups. Emerg. Infect. Dis. 20:1812–20. doi: 10.3201/eid2011.140206.
Carattoli A et al. 2014. In Silico detection and typing of plasmids using plasmidfinder and plasmid
multilocus sequence typing. Antimicrob. Agents Chemother. 58:3895–3903. doi: 10.1128/AAC.02412-
14.
CARSS. National Health and Family Planning Commission of the People’s Republic of China (2017).
Report on Current Status of Antimicrobial Agent Management and Antimicrobial Resistance in China.
Beijing: Beijing Union Medical University Press.
Chen L et al. 2014. Carbapenemase-producing Klebsiella pneumoniae: molecular and genetic
decoding. Trends Microbiol. 22:686–696. doi: 10.1016/j.tim.2014.09.003.
Chen L, Kreiswirth BN. 2017. Convergence of carbapenem-resistance and hypervirulence in Klebsiella
pneumoniae. Lancet Infect. Dis. 3099:9–10. doi: 10.1016/S1473-3099(17)30517-0.
Chen Y et al. 2020. Acquisition of Plasmid with Carbapenem-Resistance Gene blaKPC2 in Hypervirulent
Klebsiella pneumoniae, Singapore. Emerg. Infect. Dis. 26:549–559. doi: 10.3201/eid2603.191230.
Croucher NJ et al. 2015. Rapid phylogenetic analysis of large samples of recombinant bacterial whole
genome sequences using Gubbins. Nucleic Acids Res. 43:e15. doi: 10.1093/nar/gku1196.
DeLeo FR et al. 2014. Molecular dissection of the evolution of carbapenem-resistant multilocus
sequence type 258 Klebsiella pneumoniae. Proc. Natl. Acad. Sci. 111:4988–4993. doi:
10.1073/pnas.1321364111.
Dimou V, Dhanji H, Pike R, Livermore DM, Woodford N. 2012. Characterization of Enterobacteriaceae
producing OXA-48-like carbapenemases in the UK. J. Antimicrob. Chemother. 67:1660–1665. doi:
10.1093/jac/dks124.
Dong N, Zhang R, et al. 2018. Genome analysis of clinical multilocus sequence Type 11 Klebsiella
pneumoniae from China. Microb. Genomics. doi: 10.1099/mgen.0.000149.
Dong N, Lin D, Zhang R, Chan EW-C, Chen S. 2018. Carriage of blaKPC-2 by a virulence plasmid in
hypervirulent Klebsiella pneumoniae. J. Antimicrob. Chemother. 73:3317–3321. doi:
10.1093/jac/dky358.
Van Dorp L et al. 2019. Rapid phenotypic evolution in multidrug-resistant Klebsiella pneumoniae
hospital outbreak strains. Microb. Genomics. 5:1–11. doi: 10.1099/mgen.0.000263.
Du P, Zhang Y, Chen C. 2018. Emergence of carbapenem-resistant hypervirulent Klebsiella
pneumoniae. Lancet Infect. Dis. 18:23–24. doi: 10.1016/S1473-3099(17)30625-4.
EUCAST. 2019. The European Committee on Antimicrobial Susceptibility Testing. Breakpoint tables
for interpretation of MICs and zone diameters. Version 9.0, 2019. http://www.eucast.org.
Gołȩbiewski M et al. 2007. Complete nucleotide sequence of the pCTX-M3 plasmid and its
involvement in spread of the extended-spectrum β-lactamase gene blaCTX-M-3. Antimicrob. Agents
Chemother. 51:3789–3795. doi: 10.1128/AAC.00457-07.
Gu D et al. 2017. A fatal outbreak of ST11 carbapenem-resistant hypervirulent Klebsiella pneumoniae
in a Chinese hospital: A molecular epidemiological study. Lancet Infect. Dis. 18:37–46. doi:
10.1016/S1473-3099(17)30489-9.
Guo L et al. 2016. Nosocomial Outbreak of OXA-48-Producing Klebsiella pneumoniae in a Chinese
Hospital: Clonal Transmission of ST147 and ST383. PLoS One. 11:e0160754. doi:
10.1371/journal.pone.0160754.
Hadfield J et al. 2018. Phandango: an interactive viewer for bacterial population genomics.
Bioinformatics. 34:292–293. doi: 10.1093/bioinformatics/btx610.
Holt KE et al. 2015. Genomic analysis of diversity, population structure, virulence, and antimicrobial
resistance in Klebsiella pneumoniae, an urgent threat to public health. Proc. Natl. Acad. Sci.
112:E3574–E3581. doi: 10.1073/pnas.1501049112.
Hu F et al. 2019. Resistance reported from China antimicrobial surveillance network (CHINET) in 2018.
Eur. J. Clin. Microbiol. Infect. Dis. 38:2275–2281. doi: 10.1007/s10096-019-03673-1.
Hu FP et al. 2016. Resistance trends among clinical isolates in China reported from CHINET
surveillance of bacterial resistance, 2005-2014. Clin. Microbiol. Infect. 22:S9–S14. doi:
10.1016/j.cmi.2016.01.001.
Kabha K et al. 1995. Relationships among capsular structure, phagocytosis, and mouse virulence in
Klebsiella pneumoniae. Infect. Immun. 63:847–52. http://www.ncbi.nlm.nih.gov/pubmed/7868255.
Lam MMC, Wick RR, et al. 2018. Genetic diversity, mobilisation and spread of the yersiniabactin-
encoding mobile element ICEKp in klebsiella pneumoniae populations. Microb. Genomics. 4. doi:
10.1099/mgen.0.000196.
Lam MMC, Wyres KL, et al. 2018. Tracking key virulence loci encoding aerobactin and salmochelin
siderophore synthesis in Klebsiella pneumoniae. Genome Med. 10:77. doi: 10.1186/s13073-018-
0587-5.
Li C et al. 2020. A rare carbapenem-resistant hypervirulent K1/ST1265 Klebsiella pneumoniae with an
untypeable blaKPC-harbored conjugative plasmid. J. Glob. Antimicrob. Resist. doi:
10.1016/j.jgar.2020.04.009.
Liu Y et al. 2017. Capsular Polysaccharide Types and Virulence-Related Traits of Epidemic KPC-
Producing Klebsiella pneumoniae Isolates in a Chinese University Hospital. Microb. Drug Resist.
23:901–907. doi: 10.1089/mdr.2016.0222.
Liu YM et al. 2014. Clinical and molecular characteristics of emerging hypervirulent Klebsiella
pneumoniae bloodstream infections in mainland China. Antimicrob. Agents Chemother. 58:5379–85.
doi: 10.1128/AAC.02523-14.
Magiorakos AP et al. 2012. Multidrug-resistant, extensively drug-resistant and pandrug-resistant
bacteria: An international expert proposal for interim standard definitions for acquired resistance.
Clin. Microbiol. Infect. 18:268–281. doi: 10.1111/j.1469-0691.2011.03570.x.
Paczosa MK, Mecsas J. 2016. Klebsiella pneumoniae: Going on the Offense with a Strong Defense.
Microbiol. Mol. Biol. Rev. 80:629–61. doi: 10.1128/MMBR.00078-15.
Papagiannitsis CC et al. 2010. Emergence of Klebsiella pneumoniae of a novel sequence type (ST383)
producing VIM-4, KPC-2 and CMY-4 β-lactamases. Int. J. Antimicrob. Agents. 36:573–574. doi:
10.1016/j.ijantimicag.2010.07.018.
Russo TA et al. 2014. Aerobactin mediates virulence and accounts for increased siderophore
production under iron-limiting conditions by hypervirulent (hypermucoviscous) Klebsiella
pneumoniae. Infect. Immun. 82:2356–2367. doi: 10.1128/IAI.01667-13.
Shankar C et al. 2016. Draft Genome Sequences of Three Hypervirulent Carbapenem-Resistant
Klebsiella pneumoniae Isolates from Bacteremia. Genome Announc. 4. doi: 10.1128/genomeA.01081-
16.
Shen D et al. 2019. Emergence of a multidrug-resistant hypervirulent klebsiella pneumoniae
sequence type 23 strain with a rare blaCTX-M-24-harboring virulence plasmid. Antimicrob. Agents
Chemother. 63. doi: 10.1128/AAC.02273-18.
Shon AS, Bajwa RPS, Russo TA. 2013. Hypervirulent (hypermucoviscous) Klebsiella pneumoniae.
Virulence. 4:107–118. doi: 10.4161/viru.22718.
Siu LK, Yeh KM, Lin JC, Fung CP, Chang FY. 2012. Klebsiella pneumoniae liver abscess: A new invasive
syndrome. Lancet Infect. Dis. 12:881–887. doi: 10.1016/S1473-3099(12)70205-0.
Stamatakis A. 2014. RAxML version 8: A tool for phylogenetic analysis and post-analysis of large
phylogenies. Bioinformatics. 30:1312–1313. doi: 10.1093/bioinformatics/btu033.
Struve C et al. 2015. Mapping the Evolution of Hypervirulent Klebsiella pneumoniae. MBio. 6:e00630.
doi: 10.1128/mBio.00630-15.
Treangen TJ, Ondov BD, Koren S, Phillippy AM. 2014. The Harvest suite for rapid core-genome
alignment and visualization of thousands of intraspecific microbial genomes. Genome Biol. 15:524.
doi: 10.1186/s13059-014-0524-x.
Turton JF et al. 2019. Hybrid resistance and virulence plasmids in “high-risk” clones of klebsiella
pneumoniae, including those carrying blaNDM-5. Microorganisms. 7. doi:
10.3390/microorganisms7090326.
Turton JF et al. 2017. Virulence genes in isolates of Klebsiella pneumoniae from the UK during 2016,
including among carbapenemase gene-positive hypervirulent K1-ST23 and ‘non-hypervirulent’ types
ST147, ST15 and ST383. J. Med. Microbiol. doi: 10.1099/jmm.0.000653.
Wang Y et al. 2017. IncN ST7 epidemic plasmid carrying blaIMP-4 in Enterobacteriaceae isolates with
epidemiological links to multiple geographical areas in China. J. Antimicrob. Chemother. 72:99–103.
doi: 10.1093/jac/dkw353.
Wick RR, Schultz MB, Zobel J, Holt KE. 2015. Bandage: interactive visualization of de novo genome
assemblies. Bioinformatics. 31:3350–2. doi: 10.1093/bioinformatics/btv383.
Wong MHY et al. 2018. Emergence of carbapenem-resistant hypervirulent Klebsiella pneumoniae.
Lancet Infect. Dis. 18:24. doi: 10.1016/S1473-3099(17)30629-1.
Wong MHY et al. 2017. Emergence of carbapenem-resistant hypervirulent Klebsiella pneumoniae.
Lancet Infect. Dis. 3099:5–6. doi: 10.1016/S1473-3099(17)30629-1.
Wyres KL et al. 2015. Extensive capsule locus variation and large-scale genomic recombination within
the Klebsiella pneumoniae clonal group 258. Genome Biol. Evol. 7:1267–1279. doi:
10.1093/gbe/evv062.
Wyres KL et al. 2020. Genomic surveillance for hypervirulence and multi-drug resistance in invasive
Klebsiella pneumoniae from South and Southeast Asia. Genome Med. 12:11. doi: 10.1186/s13073-
019-0706-y.
Wyres KL et al. 2016. Identification of Klebsiella capsule synthesis loci from whole genome data.
Microb. Genomics. 2:e000102. doi: 10.1099/mgen.0.000102.
Xu M et al. 2019. High prevalence of KPC-2-producing hypervirulent Klebsiella pneumoniae causing
meningitis in Eastern China. Infect. Drug Resist. 12:641–653. doi: 10.2147/IDR.S191892.
Yang Q et al. 2020. Emergence of ST11-K47 and ST11-K64 hypervirulent carbapenem-resistant
Klebsiella pneumoniae in bacterial liver abscesses from China: a molecular, biological, and
epidemiological study. Emerg. Microbes Infect. 9:320–331. doi: 10.1080/22221751.2020.1721334.
Yao H, Qin S, Chen S, Shen J, Du XD. 2018. Emergence of carbapenem-resistant hypervirulent
Klebsiella pneumoniae. Lancet Infect. Dis. 18:25. doi: 10.1016/S1473-3099(17)30628-X.
Zankari E et al. 2012. Identification of acquired antimicrobial resistance genes. J. Antimicrob.
Chemother. 67:2640–2644. doi: 10.1093/jac/dks261.
Zhang R et al. 2017. Nationwide Surveillance of Clinical Carbapenem-resistant Enterobacteriaceae
(CRE) Strains in China. EBioMedicine. 19:98–106. doi: 10.1016/j.ebiom.2017.04.032.
Zhang Y et al. 2016. High Prevalence of Hypervirulent Klebsiella pneumoniae Infection in China:
Geographic Distribution, Clinical Characteristics, and Antimicrobial Resistance. Antimicrob. Agents
Chemother. 60:6115–6120. doi: 10.1128/AAC.01127-16.
Zhang Y et al. 2018. Epidemiology of carbapenem-resistant Enterobacteriaceae infections: Report
from the China CRE Network. Antimicrob. Agents Chemother. 62. doi: 10.1128/AAC.01882-17.
Zhang Y et al. 2019. Evolution of hypervirulence in carbapenem-resistant Klebsiella pneumoniae in
China: a multicentre, molecular epidemiological analysis. J. Antimicrob. Chemother. doi:
10.1093/jac/dkz446.
Zhao F, Feng Y, Lü X, McNally A, Zong Z. 2017. Remarkable diversity of Escherichia coli carrying mcr-1
from hospital sewage with the identification of two new mcr-1 variants. Front. Microbiol. 8:2094. doi:
10.3389/fmicb.2017.02094.
Zhou K et al. 2020. Novel subclone of carbapenem-resistant klebsiella pneumoniae sequence type 11
with enhanced virulence and transmissibility, China. Emerg. Infect. Dis. 26:289–297. doi:
10.3201/eid2602.190594.
CHAPTER 5 : Interpreting k-mer based signatures for antibiotic
resistance prediction
Magali Jaillard1, Mattia Palmieri1, Alex van Belkum1 and Pierre Mahé1
1bioMérieux, Marcy l’Etoile, France
Submitted to GigaScience
5.1 Abstract
Background. Recent years witnessed the development of several k-mer-based approaches aiming to
predict phenotypic traits of bacteria based on their whole-genome sequences. While often
convincing in terms of predictive performance, the underlying models are in general not
straightforward to interpret, the interplay between the actual genetic determinant and its translation
as k-mers being generally hard to decipher.
Results. We propose a simple and computationally efficient strategy allowing one to cope with the
high correlation inherent to k-mer-based representations in supervised machine learning models,
leading to concise and easily interpretable signatures. We demonstrate the benefit of this approach
on the task of predicting the antibiotic resistance profile of a Klebsiella pneumoniae strain from its
genome, where our method leads to signatures defined as weighted linear combinations of genetic
elements that can easily be identified as genuine antibiotic resistance determinants, with state of the
art predictive performance.
Conclusions. By enhancing the interpretability of genomic k-mer-based antibiotic resistance
prediction models, our approach improves their clinical utility, hence will facilitate their adoption in
routine diagnostics by clinicians and microbiologists. While antibiotic resistance was the motivating
application, the method is generic and can be transposed to any other bacterial trait.
5.2 Introduction
Antimicrobial resistance (AMR) is a global healthcare problem and rapid diagnostics are needed to
select the right treatment, to follow the route to cure and to monitor and prevent community- and
hospital-acquired outbreaks of infections. Next-Generation Sequencing (NGS) is a disruptive
technology that is potentially able to supplement or even replace the current plethora of diagnostic
tests with a single, most probably affordable and faster solution. Inferring the antibiotic
resistance profile from a bacterial genome is challenging. However, good results have been obtained
for several species [1-7], including Klebsiella pneumoniae [8]. Su et al. [9] discussed the challenges of
NGS-based antibiotic susceptibility testing (AST) and provided a comprehensive review of the current
state of the art in this field.
Early approaches relied on the detection of known resistance markers to claim resistance, a strategy
sometimes referred to as direct association analysis [10]. While effective when the genetic bases of
antibiotic resistance are well known, which is the case for instance for most antibiotic resistance
mechanisms in the highly clonal species M. tuberculosis [11, 12] and Salmonella typhi [13], this
approach suffers from several limitations. First and foremost, it intrinsically relies on prior knowledge
of the precise nature of the resistance determinants, which may not be available for all species and
drugs. Secondly, it is not able to account for the fact that these markers can have different levels of
predictive power [14, 15], that they can act in a multi-factorial fashion through epistasis [16, 17], or
that resistance can result from the accumulation of several different mutations [18, 19]. Last but not
least, it is hazardous to predict susceptibility when no marker is detected, since the resistance marker
may be novel and databases incomplete. This issue is increasingly addressed from the supervised
machine learning (ML) standpoint: given a set of genomes with associated reference phenotypes
(provided by phenotypic AST methods [20]), one seeks a prediction rule that infers the
resistance or susceptibility of a novel strain from genomic features. Even for M. tuberculosis, where
the antibiotic resistance knowledge is probably among the most thorough and complete, recent
studies showed that performance of direct association strategies can still be significantly improved
by ML models [10, 17].
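The direct association strategy, and the reason it should not be read as a susceptibility call, can be made concrete with a short sketch. The marker catalogue, drug name and marker names below are hypothetical placeholders and do not come from any real database.

```python
# Hypothetical catalogue: drug -> markers assumed to confer resistance.
KNOWN_MARKERS = {
    "drug_X": {"marker_1", "marker_2"},
}


def direct_association(drug, detected_markers):
    """Call resistance when a catalogued marker is detected.

    When no catalogued marker is found, the function deliberately does not
    return "susceptible": the resistance mechanism may be novel or the
    catalogue incomplete.
    """
    hits = sorted(KNOWN_MARKERS.get(drug, set()) & set(detected_markers))
    if hits:
        return "resistant", hits
    return "no known marker detected", []


print(direct_association("drug_X", ["marker_2", "unrelated_gene"]))
print(direct_association("drug_X", ["unrelated_gene"]))
```

Such a rule also treats every marker as equally predictive and ignores interactions between markers, which are the limitations that supervised ML models aim to address.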
A great variety of ML strategies have been explored, taking into account several parameters. First,
regarding the nature of the genomic features considered: supervised ML models can indeed operate
from known markers like the ones involved in direct association strategies, offering the possibility to
discover more complex and multivariate marker combinations better predicting resistance
phenotypes [3, 10, 17], or directly using the raw sequences represented as k-mers [4, 8, 21-23]. The
latter approach offers several advantages: it does not require prior knowledge about the underlying
resistance mechanisms, can capture various types of genomic determinants (including the
acquisition of genes or point mutations), and does not require aligning the genomes to a common
reference which may be hard to define for some species, especially the less clonal ones [24, 25].
Second, regarding the type of ML algorithms. Boosting algorithms [4, 8, 21], penalized regression
models [10, 17, 23], decision trees [26], random forest [10, 27], neural networks [17] or set cover
machines [22, 26] have already been successfully deployed in this context. While each algorithm has
its own merits and shortcomings, several studies reported comparable global performance for
various algorithms, with specific variations by drug and microbial species [10, 17, 28]. Finally,
different kinds of antibiotic susceptibility information can be considered: either discrete when the
objective is to distinguish susceptible from resistant (or non-susceptible) strains [10, 17, 21, 22], or
continuous, where one seeks to predict the minimum inhibitory concentration (MIC) of the
antimicrobial agent itself [3, 4, 8].
A critical challenge for the adoption of such predictive ML models by clinicians and microbiologists
resides in their level of interpretability and, ultimately, clinical action-driving ability. While the notion
of interpretability is somewhat ill-defined, a natural requirement for the end-user would be to
achieve the prediction from a limited number of genomic features, that can be easily and
unambiguously interpreted as actual genetic determinants [25, 26]. This challenge is particularly
important using k-mer-based representations, for several reasons. Firstly, k-mers covering conserved
genomic regions are redundant and can be easily detected and filtered [29], but they define groups
of equivalent k-mers which are not always straightforward to interpret as genomic determinants [21-23,
26]. Secondly, k-mers may not be specific to a given genomic region, hence may be hard to
annotate. This is especially the case for short k-mers, e.g., when k = 8 or k = 10 [4, 8]. Last but not
least, the k-mer-based representation of genomes intrinsically leads to very high-dimensional feature
spaces, with strongly correlated variables. Using k = 31 for instance, and depending on the bacterial
species considered, it is common to end up working with 10^5 to 10^6 (non-redundant) k-mers, many of
which are observed in almost the same sets of genomes, hence bringing almost the same
information regarding the studied phenotype.
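The redundancy described above can be illustrated on a toy example: a single nucleotide change alters every k-mer that overlaps it, so one variant is represented by up to k perfectly correlated features. The sketch below uses a randomly generated sequence (not real genomic data) and canonical k-mers, i.e. the lexicographically smaller of a k-mer and its reverse complement. This is a common convention and is assumed here for illustration.

```python
import random

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def canonical_kmers(seq, k):
    """Set of canonical k-mers (min of the k-mer and its reverse complement)."""
    kmers = set()
    for start in range(len(seq) - k + 1):
        kmer = seq[start:start + k]
        kmers.add(min(kmer, reverse_complement(kmer)))
    return kmers


K = 31
rng = random.Random(0)
reference = "".join(rng.choice("ACGT") for _ in range(200))

# Introduce one SNP in the middle of the toy sequence.
snp_pos = 100
new_base = "ACGT"[("ACGT".index(reference[snp_pos]) + 1) % 4]
variant = reference[:snp_pos] + new_base + reference[snp_pos + 1:]

ref_kmers = canonical_kmers(reference, K)
var_kmers = canonical_kmers(variant, K)
variant_only = var_kmers - ref_kmers      # k-mers created by the SNP
reference_only = ref_kmers - var_kmers    # k-mers destroyed by the SNP
print(len(variant_only), len(reference_only))
```

All k-mers in `variant_only` are present in exactly the genomes that carry the variant, so they carry the same information about the phenotype.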
We propose to rely on the adaptive cluster lasso (ACL) [30], an extension of the approach of Bühlmann et al. [31]
tailored to the high-dimensional setting by means of a prior screening of variables. We implemented, in
an R package, a simple and efficient ACL-inspired strategy able to cope with the very high dimension
and strong correlations of k-mer-based representations, leading to sparse and interpretable genomic
signatures. This approach compared favorably to the standard lasso on a systematic validation study
focusing on K. pneumoniae. It provided a comparable level of performance while offering better
interpretability of the genomic determinants involved in the models. We could identify known and
potentially novel resistance determinants from the corresponding k-mer signatures, which allowed us to
extract meaningful scientific insights.
5.3 Methods
5.3.1 Datasets
Training dataset. We gathered the assembled genomes, provided as contigs, of 1665 strains to
develop MIC prediction models for K. pneumoniae [8]. This set of genomes defines our training
dataset. We focused on the 10 clinically most relevant antibiotics listed in Table 1 which belong to
seven different antibiotic classes. The reference MICs were cast into resistant, susceptible and
intermediate according to the Clinical and Laboratory Standards Institute (CLSI) breakpoints. The
intermediate and resistant strains were finally merged into a common category, to define a binary
classification problem aiming to distinguish susceptible (S) from non-susceptible (NS) strains. Table 1
provides the number of S/NS phenotypes available for each selected drug.
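The casting of MICs into a binary S/NS phenotype can be illustrated with a minimal Python sketch. The function name and the breakpoint values used in the usage lines are hypothetical placeholders chosen for illustration; they are not taken from the CLSI tables, and real breakpoints are drug-specific.

```python
def to_binary_phenotype(mic, susceptible_max, resistant_min):
    """Cast a MIC into 0 (susceptible, S) or 1 (non-susceptible, NS).

    susceptible_max and resistant_min are illustrative breakpoints:
    MIC <= susceptible_max is S, MIC >= resistant_min is R,
    anything in between is intermediate (I).
    """
    if mic <= susceptible_max:
        category = "S"
    elif mic >= resistant_min:
        category = "R"
    else:
        category = "I"
    # Intermediate and resistant strains are merged into the NS class.
    return 0 if category == "S" else 1


# Placeholder breakpoints (S <= 1, R >= 4), for illustration only.
phenotypes = [to_binary_phenotype(m, 1, 4) for m in (0.5, 2, 8)]
```

With these placeholder breakpoints, a MIC of 2 falls in the intermediate range and is therefore counted as non-susceptible.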
Table 1. Dataset constitution. This table provides the number of susceptible (S) and non-susceptible (NS) strains available in the training and test dataset for the various antibiotics considered. piper.tazo stands for piperacillin/tazobactam. Note that a limited number of susceptible strains is available in the test dataset for aztreonam, and to a lesser extent cefepime and meropenem.
k-merization of the training dataset. The k-merization was computed from the contigs of all training
genomes, using the DBGWAS software [25], with a k-mer size of 31 and filtering patterns with a
minor allele frequency (MAF) below 1%. DBGWAS allows for the deduplication of the strictly
equivalent k-mers by compacting overlapping non-branching paths of k-mers into unitigs, thanks to
the use of a compacted De Bruijn Graph (cDBG) (Figure 1 A). DBGWAS stores the profiles of
presence/absence of each unitig in the training genomes in a matrix V such that Vi,j = 1 if the j-th unitig
is present in the i-th input genome and Vi,j = 0 otherwise (Figure 1, B1). Each column vector V.,j is then
transformed according to its allele frequency: if its allele frequency exceeds 0.5, meaning that it is
observed in more than 50% of the panel genomes, it is inverted as Vi,j = |1–Vi,j| so that its MAF
corresponds to its average value. This transformation makes two originally complementary
vectors identical. Keeping only the unique patterns then leads to an optimal reduction of the number of
features, without modifying the intrinsic statistical signal (Figure 1 B2). These unique, MAF-filtered,
patterns define the final variant matrix X, where Xi,j = 1 if the j-th pattern is found in the i-th genome,
and 0 otherwise. This process is described in detail in Jaillard et al. [25]. The DBGWAS files
describing the cDBG are kept for the further interpretation of the genomic signatures, so that the
unitigs of the selected patterns can be visualized within their genomic environment.
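The allele-frequency transformation, MAF filtering and deduplication described above can be sketched as follows. This is a minimal pure-Python illustration, not the DBGWAS implementation; the function name minor_allele_patterns and the list-of-lists input format are assumptions made for the example.

```python
def minor_allele_patterns(V, min_maf=0.01):
    """Build the pattern matrix X from a binary unitig matrix V.

    V is a list of rows (one per genome), each a list of 0/1 values
    (one per unitig). Returns (X, members), where X has one column per
    unique minor-allele pattern and members[k] lists the unitigs
    sharing pattern k.
    """
    n = len(V)
    index_of = {}   # pattern -> column index in X
    patterns = []   # unique patterns, in order of first appearance
    members = []    # unitig indices attached to each pattern
    for j in range(len(V[0])):
        column = [row[j] for row in V]
        count = sum(column)
        if 2 * count > n:
            # Allele frequency above 0.5: invert so the mean equals the MAF.
            column = [1 - v for v in column]
            count = n - count
        if count < min_maf * n:
            continue  # MAF below the threshold: pattern filtered out
        key = tuple(column)
        if key not in index_of:
            index_of[key] = len(patterns)
            patterns.append(key)
            members.append([])
        members[index_of[key]].append(j)
    X = [[pattern[i] for pattern in patterns] for i in range(n)]
    return X, members


# Toy panel: 4 genomes, 4 unitigs.
V = [[1, 0, 1, 1],
     [0, 1, 1, 1],
     [0, 1, 0, 1],
     [0, 1, 0, 1]]
X, members = minor_allele_patterns(V)
```

In this toy panel, unitigs 0 and 1 are complementary and collapse into a single pattern after inversion, while unitig 3, present in every genome, has a MAF of zero and is filtered out.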
In practice we carry out this k-merization process for each antibiotic separately, processing solely the
strains that have been phenotypically tested. The output of this k-merization step is a sparse variant
matrix X with, for instance in the case of the cefoxitin antibiotic, N = 1643 rows for the N
cefoxitin-phenotyped strains of the training panel and p = 1,234,397 columns representing the p distinct
patterns of presence/absence retained by DBGWAS. The matrix X is binary as DBGWAS only encodes
the presence or absence of patterns in the genomes. It is sparse as only around 13% of the values are non-zero.
Figure 1. K-merization of the training genomes. Illustration of the DBGWAS process of k-merization and variant matrix construction. Refer to Jaillard et al. [25] for further details.
Test dataset. To validate the predictive performance of the models, we built an independent test
dataset involving 634 strains, including 114 strains from our bioMérieux collection (NCBI Bioproject
PRJNA449293 and PRJNA597427) and 520 strains from the PATRIC database (https://www.patricbrc.org/).
These strains were mostly from the USA, the UK, Serbia, Greece and other European countries
and the MICs were obtained with either agar dilution, broth microdilution or Vitek 2 (bioMérieux,
Marcy l’Étoile, France) (see Supplementary Section S1). Table 1 provides the number of S/NS
phenotypes available in the test dataset.
5.3.2 Coping with highly correlated genomic features
Logistic regression is a widely used generalized linear model addressing binary classification problems.
In our case, it consists of building a linear function defined for a strain represented by a vector x ∈ {0,
1}^p as:
$$f(x) = \beta_0 + \sum_{j=1}^{p} \beta_j x_j$$
where p corresponds to the number of distinct patterns identified by DBGWAS, x encodes their
presence/absence in the strain genome, and β0 and β = (β1, ..., βp) are the model coefficients. To estimate the model coefficients and simultaneously
select a limited number of patterns from a training panel of n strains, one can rely on the L1 or lasso
penalty and consider the following optimization problem:
$$\min_{\beta_0, \beta} \; \frac{1}{n} \sum_{i=1}^{n} L\big(y_i, f(X_{i,\cdot})\big) + \lambda \sum_{j=1}^{p} |\beta_j| \qquad (1)$$
where yi = 0 if the ith strain, stored in the ith row of the training matrix X, is susceptible and 1
otherwise. The function L is the logistic loss function, which quantifies the discrepancy between the
true phenotypes yi of the strains and the predictions f(Xi,.) obtained by the model. The λ parameter
achieves a trade-off between this empirical error and the lasso regularization term, and is usually
optimized by cross-validation.
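To make the trade-off between the empirical error and the lasso penalty concrete, the sketch below evaluates a lasso-penalized logistic regression objective for given coefficients. It is illustrative only: the function names are hypothetical, the empirical error is assumed to be averaged over the n strains, and no optimization is performed (in this work the fitting itself is delegated to glmnet).

```python
import math


def linear_score(x, beta0, beta):
    """Linear function f(x) = beta0 + sum_j beta_j * x_j."""
    return beta0 + sum(b * xj for b, xj in zip(beta, x))


def logistic_loss(y, score):
    """Logistic loss for a label y in {0, 1} and a linear score."""
    # Equals -[y*log(p) + (1-y)*log(1-p)] with p = 1 / (1 + exp(-score)).
    return math.log1p(math.exp(score)) - y * score


def lasso_objective(X, y, beta0, beta, lam):
    """Average logistic loss plus lam times the L1 norm of beta."""
    n = len(X)
    empirical_error = sum(
        logistic_loss(yi, linear_score(xi, beta0, beta))
        for xi, yi in zip(X, y)
    ) / n
    penalty = lam * sum(abs(b) for b in beta)
    return empirical_error + penalty
```

With all coefficients set to zero, every strain gets a predicted probability of 0.5, so the objective reduces to log(2) whatever the value of λ. Larger values of λ make non-zero coefficients more costly and therefore favor sparser models.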
The feature selection ability of the lasso penalty is notoriously unstable in the presence of strong
correlation between features. This is particularly the case using k-mer based representations, making
it difficult to derive meaningful interpretations from the features selected by the model, and their
associated coefficients. We propose a simple and efficient three-step strategy to identify sparse and
interpretable genomic signatures.
Screening step. In this step, we screen features. For this purpose, we first fit a standard lasso-
penalized regression model on the original feature matrix X for several values of the regularization
parameter λ, and extract the set of features that are selected at some point on this regularization
path. Formally, let (λ1, ..., λm) be the m values of the considered grid of λ, and B the p x m matrix
containing the model coefficients obtained by solving Equation 1 for each of these values. We define the set a of active features as:
$$a = \{\, j \in \{1, \dots, p\} : \exists\, k \in \{1, \dots, m\},\ B_{j,k} \neq 0 \,\}$$
and let pa = |a| be their number. Since the lasso cannot select more features than observations, we
typically end up with pa in the order of N (i.e., 10^3 in our case). We then extract the features which
are strongly correlated to the active ones from the entire feature matrix. For this purpose, we
compute a pa x p matrix G containing the pairwise correlations between the pa active features
identified beforehand and the p original ones. Formally, Gi,j = cor(X.,ai , X.,j), where cor is the standard
Pearson correlation between vectors of MAF patterns across the genomes, and is a classical criterion
to quantify linkage disequilibrium (LD) between genomic features [32]. Since we rely on binary
variables encoding the presence/absence of features in the genomes, Gi,j quantifies the extent to
which features i and j co-occur in the genomes. As pa is typically much smaller than p (on the order
of 10^3 versus 10^6 in our case), computing this matrix is much easier than computing the entire p x p
correlation matrix. Finally, we extract the set e of features that are strongly correlated to at least one
active feature as:
$$e = \{\, j \in \{1, \dots, p\} : \exists\, i \in \{1, \dots, p_a\},\ G_{i,j} \geq s_1 \,\}$$
where the hyperparameter s1 controls the minimum level of correlation required, and is referred to
as the screening threshold. This operation defines a set of pe = |e| features, called the set of
extended features. Obviously, we have pa ≤ pe ≤ p. In our context, we typically end up with a few
thousand extended features, hence pa < pe << p.
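The screening step can be sketched in pure Python as follows. This is a toy illustration under stated assumptions, not the code of the R package: the coefficient matrix B is assumed to be given as a list of p rows of m values, features with zero variance are arbitrarily assigned a correlation of 0, and the function names are hypothetical.

```python
import math


def pearson(u, v):
    """Standard Pearson correlation between two equal-length vectors."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    if var_u == 0 or var_v == 0:
        return 0.0  # assumption: constant features are treated as uncorrelated
    return cov / math.sqrt(var_u * var_v)


def active_set(B):
    """Features with a non-zero coefficient somewhere on the path."""
    return [j for j, row in enumerate(B) if any(c != 0 for c in row)]


def extended_features(X, active, s1=0.95):
    """Features whose correlation with at least one active feature is >= s1."""
    columns = list(zip(*X))  # one tuple per feature
    return [
        j for j, col in enumerate(columns)
        if any(pearson(columns[a], col) >= s1 for a in active)
    ]


# Toy example: 4 genomes, 4 patterns; pattern 1 duplicates pattern 0.
X = [[1, 1, 0, 1],
     [0, 0, 1, 1],
     [0, 0, 1, 0],
     [0, 0, 0, 0]]
B = [[0.0, 0.7], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
a = active_set(B)
e = extended_features(X, a)
```

Here the lasso path only activates pattern 0, but the screening also recovers pattern 1, which is perfectly correlated with it and would otherwise be ignored.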
Clustering step. While the screening step identifies a limited number of features deemed sufficiently
correlated to the features identified by a standard lasso, the second step aims to explicitly define
groups, or clusters, of strongly correlated variables. We rely for this purpose on a bottom-up
agglomerative clustering procedure, as suggested by Bühlmann et al. [31]. More precisely, we first
define a pe x pe distance matrix D between extended features, defined as Di,j = |1 – cor(X.,ei , X.,ej )|.
This matrix is then used to carry out a hierarchical clustering, implemented in R by the hclust function,
using a minimum (single) linkage criterion. The resulting dendrogram is finally cut at a height of 1–s2, the
second hyperparameter s2, called the clustering threshold, controlling the level of within-cluster
correlation.
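The clustering step can be illustrated without a full dendrogram. With a minimum (single) linkage criterion, cutting the dendrogram at height h yields the connected components of the graph linking features whose distance is at most h, so a union-find structure is enough for a sketch. The code below relies on this equivalence; it is not the hclust-based implementation, and the function names are hypothetical.

```python
import math


def pearson(u, v):
    """Standard Pearson correlation (0.0 for constant vectors, by assumption)."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    if var_u == 0 or var_v == 0:
        return 0.0
    return cov / math.sqrt(var_u * var_v)


def single_linkage_clusters(columns, s2=0.95):
    """Group features connected by chains of distances |1 - cor| <= 1 - s2."""
    parent = list(range(len(columns)))

    def find(i):
        # Root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    cut = 1.0 - s2
    for i in range(len(columns)):
        for j in range(i + 1, len(columns)):
            distance = abs(1.0 - pearson(columns[i], columns[j]))
            if distance <= cut:
                parent[find(i)] = find(j)  # merge the two clusters

    clusters = {}
    for i in range(len(columns)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Four extended features observed in four genomes.
columns = [(1, 0, 0, 0), (1, 0, 0, 0), (1, 1, 0, 0), (0, 0, 1, 1)]
clusters = single_linkage_clusters(columns)
```

Features 0 and 1 are identical and end up in the same cluster, while features 2 and 3 remain singletons. Note that the pairwise loop is quadratic in the number of extended features, which is acceptable here because the screening step reduces it to a few thousand.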
Learning step. Finally, we summarize each identified cluster as a new composite variable, defined as
the average of the original variables defining the cluster, and carry out a standard lasso at the cluster
level. Since in our case the original variables encode the presence/absence of a given DBGWAS
pattern in the genomes, these composite variables correspond to the proportion of patterns involved
in a cluster that are present/absent in the genomes. Figure 2 summarizes this three-step method.
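The construction of the composite variables amounts to averaging the columns of each cluster, as in the short sketch below. The function name and the representation of clusters as lists of column indices are assumptions made for the example.

```python
def cluster_features(X, clusters):
    """Average the binary columns of each cluster, genome by genome.

    X is a list of rows (genomes) of 0/1 values; clusters is a list of
    lists of column indices. The result has one column per cluster,
    holding the proportion of the cluster's patterns present in the genome.
    """
    return [
        [sum(row[j] for j in members) / len(members) for members in clusters]
        for row in X
    ]


X = [[1, 1, 0],
     [1, 0, 1],
     [0, 0, 0]]
Z = cluster_features(X, [[0, 1], [2]])
```

In the second genome only one of the two patterns of the first cluster is present, so the corresponding composite variable takes the value 0.5.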
Figure 2. Three-step process. Illustration of the proposed three-step procedure.
5.3.3 Model selection
Our approach involves three hyperparameters that must be optimized for each antibiotic: the
screening and clustering thresholds s1 and s2 used to build the clusters of correlated variables, and
the regularization parameter λ involved in the final cluster-level lasso model. We relied on the
glmnet software [33] to fit the lasso models involved in both the screening and learning steps. We
used the default heuristic proposed by the software to define the grids of candidate values for the
regularization parameters. The screening and clustering thresholds were both systematically set to
0.95 based on preliminary experiments (see Supplementary Section S2), and we relied on a 10-fold
cross-validation procedure to optimize the regularization parameter involved in the final cluster-level
lasso model, as we now describe.
We first split the training dataset into ten folds, stratified by sequence type and phenotype. For each
of the ten folds, nine tenths of the dataset were used to screen variables and identify clusters. The final
cluster-level lasso model was then fit and applied to the held-out strains, for each candidate value of
the regularization parameter. Our model selection strategy aimed to simultaneously maximize its
sensitivity and specificity, respectively defined as the fractions of correctly classified non-susceptible
and susceptible strains. For this purpose, a Receiver Operating Characteristic (ROC) curve was built
for each candidate regularization parameter after completion of the cross-validation procedure, and
the point closest to the optimal one (defined by a true positive rate of 1 and a false positive rate of 0)
was used to define the optimal sensitivity/specificity trade-off. Following Hicks et al. [28], we refer to
the average of the optimal sensitivity and specificity as balanced accuracy (bACC). Finally, we
selected the sparsest model whose balanced accuracy was within one point of the maximum, in
order to reduce the risk of overfitting. In practice, this cross-validation procedure was repeated three
times and the selection was based on average balanced accuracy values obtained across the three
repetitions. Supplementary Figure S4 illustrates this model selection strategy.
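The two ingredients of this model selection strategy can be sketched as follows: picking the ROC point closest to the optimal corner, and choosing the sparsest model among those with a near-maximal balanced accuracy. This is a hedged illustration: the function names and the tuple layout are hypothetical, and "one point" is read here as one percentage point of balanced accuracy.

```python
import math


def best_tradeoff(scores, labels):
    """Sensitivity and specificity of the ROC point closest to (FPR=0, TPR=1).

    labels are 1 for non-susceptible and 0 for susceptible strains;
    a strain is predicted non-susceptible when its score is >= threshold.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for threshold in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 0)
        sens, spec = tp / n_pos, tn / n_neg
        distance = math.hypot(1 - sens, 1 - spec)
        if best is None or distance < best[0]:
            best = (distance, sens, spec)
    return best[1], best[2]


def select_sparsest(candidates, tolerance=1.0):
    """Pick the smallest support among models within `tolerance` of the best bACC.

    candidates is a list of (lambda, support_size, bacc_in_percent) tuples.
    """
    best_bacc = max(bacc for _, _, bacc in candidates)
    eligible = [c for c in candidates if c[2] >= best_bacc - tolerance]
    return min(eligible, key=lambda c: c[1])


sens, spec = best_tradeoff([0.1, 0.4, 0.35, 0.8, 0.7], [0, 0, 1, 1, 1])
bacc = (sens + spec) / 2

chosen = select_sparsest([
    (0.100, 5, 88.0),
    (0.050, 12, 91.2),
    (0.010, 40, 91.9),
    (0.001, 150, 91.5),
])
```

In the second example, the model with 40 features has the highest balanced accuracy, but the model with 12 features is within one point of it and is therefore preferred.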
5.3.4 Interpretation of the predictive signature
We use the DBGWAS software to interpret the genomic signatures, based on the cDBG built during
the k-merization step. The unitigs defining the patterns involved in the final model are visualized
within their neighborhood in the cDBG, which represents their genomic environment and hence provides
insight into the type of variant involved, typically a plasmid-based acquired gene versus a local
mutation (single nucleotide polymorphism (SNP) or indel) in a chromosomal region.
5.3.5 Evaluation of the computational requirements
We evaluate the computational requirements of the standard lasso and cluster-lasso procedures by
measuring the time and memory required to compute a regularization path involving 100 values of
the regularization parameter. For the standard lasso, this simply amounts to calling the glmnet
function of the glmnet R package, using the variant matrix provided by DBGWAS. For the cluster-
lasso procedure, this amounts to:
i. making the same call to glmnet to identify the set of active variables,
ii. computing the pa x p correlation matrix G in order to identify the set of extended features,
iii. building the clusters of correlated variables,
iv. making a second call to glmnet, using the variant matrix defined at the cluster-level.
This procedure is repeated five times for each drug, using a single Xeon E5-2690-V3 CPU.
5.4 Results
5.4.1 Cross-validation results
Table 2 provides the results obtained in terms of cross-validation performance and support size of
the models. The predictive performance is measured by the area under the ROC curve (AUC) and
balanced accuracy. Additional performance indicators are provided in Supplementary Table S1. The
support size of a model is defined as the number of features it involves, which respectively
corresponds to individual or clusters of DBGWAS patterns, for the lasso and cluster-lasso strategies.
We also report the overall number of unitigs involved, which is only slightly higher than the number
of features for the lasso and corresponds to unitigs in total LD. In contrast, this overall number is
markedly higher for the cluster-lasso strategy, because of the pattern clustering.
Table 2. Cross-validation results. This table summarizes the cross-validation results obtained by the lasso and cluster-lasso strategies for the 10 antibiotics, in terms of balanced accuracy (bACC), AUC, support size, overall number of unitigs involved and maximal number of unitigs associated to a single pattern or cluster (between brackets).
Both strategies show similar performance in terms of both balanced accuracy and AUC, confirming
that taking into account, or not, the correlation between features has a limited impact in terms of
predictive performance. We also note that the model support is often slightly smaller with cluster-
lasso (for 8 drugs out of 10), suggesting that several features selected separately with the lasso
ended up merged in a single cluster by the cluster-lasso. As expected, the overall number of unitigs
involved in a cluster-lasso model is significantly larger. Interestingly, it is not evenly distributed across
its features. In the meropenem model, for instance, 159 out of the 164 unitigs defining the model
features are associated to a single feature, suggesting that it corresponds to the presence of a gene,
as confirmed in the interpretation analysis depicted in the next section.
Finally, Figure 3 provides a graphical representation of the lasso and cluster-lasso signatures obtained
for ceftazidime, which are of moderate complexity. The heatmap shows the correlation between the
patterns involved in one signature and/or the other, and highlights the 8 major clusters identified by
the cluster-lasso strategy (clusters including more than 10 patterns). While all the patterns defining a
cluster have by construction a similar level of predictive power, the lasso model usually selected a
single one of them. There is an exception for the 3rd cluster, shown in green in the zoomed area of
Figure 3, where two patterns were selected as distinct features of the lasso model.
By explicitly reconstructing and providing these clusters of correlated features to the learning
algorithm, the cluster-lasso strategy leads to a more meaningful characterization of the genetic
determinants involved, as we describe below.
Figure 3. Correlation within features selected in the signatures. This heatmap shows the correlation matrix built from the features selected by the lasso and the cluster-lasso (identified by the orange and blue bars shown above the heatmap, respectively), for ceftazidime. The corresponding values of model coefficients are represented by green bars. The 8 major clusters (involving more than 10 patterns) of the cluster-lasso signatures are identified by a dedicated color ranging from red to grey. A zoom of the top left side of the figure allows a better reading of the colored bars for the major clusters 1, 3, 7 and 8.
5.4.2 Model interpretation
We focus on two drugs to illustrate the improved interpretability offered by cluster-lasso signatures:
meropenem, where the interpretation is straightforward, and cefoxitin, which is among the
signatures of highest support. Additional results obtained for the remaining drugs are deferred to
Supplementary Materials, Section S5.
As shown in Table 2, the lasso and cluster-lasso meropenem models involve 8 and 3 features,
respectively. As shown in Figure 4(B), each lasso feature corresponds to a single unitig, while the
cluster-lasso signature involves a large cluster of unitigs (159 out of the 164 involved). Figure 4(A)
shows the magnitude of the model coefficients. It reveals that the cluster-lasso signature is
essentially driven by a single prominent feature, while 4 to 5 features of the lasso signature have a
non-negligible weight. The major feature of the cluster-lasso signature corresponds to the large
cluster of correlated patterns, and the DBGWAS visualization (Figure 4(C)) shows that the
corresponding unitigs are organized as a long linear path in the cDBG. This suggests that this cluster
corresponds to an entire gene. The annotation provided by DBGWAS shows the gene to be the Class
A beta-lactamase blaKPC. The DBGWAS visualization obtained for the lasso signature indicates that 3
of the 8 features – features 1, 2 and 4 – are also co-located in a region of the cDBG annotated as
blaKPC. The fact that the lasso selected these specific unitigs within the blaKPC gene suggests that the
resistance determinants involved are SNPs or indels. While the gene-level annotation is the same as
that obtained with the cluster-lasso, the interpretation of the signature in terms of genetic variants is
therefore radically different. A closer look at the lasso signature reveals that the 3 blaKPC features are
actually strongly correlated: they are often observed together. Unsurprisingly, they belong to the
largest cluster involved in the cluster-lasso signature, and interestingly, their cumulative weight is
approximately equal to that of the cluster-lasso feature (3.4 instead of 3.3). By explicitly detecting
that these features are correlated, and merging them into a single feature, together with additional
correlated features not even involved in the lasso signature, the cluster-lasso leads to a more
meaningful interpretation of the underlying prediction model, in two aspects. Firstly, it captures the
true nature of the genomic determinant involved: the presence of the blaKPC gene, as opposed to
mutations within the gene. Secondly, it assesses the overall contribution of the gene presence in the
decision rule, while, in the lasso signature, this contribution is shared by several distinct yet
correlated features.
Figure 4. Interpretation of the meropenem signatures. This figure provides a detailed comparison of the lasso (left) and cluster-lasso (right) signatures. A) Absolute value of the coefficients of the models. B) Number of unitigs involved in the features of the models. C) Visualization of the first subgraph obtained by DBGWAS for each signature. Nodes of the graphs correspond to unitigs of the cDBG built by DBGWAS from the training panel of genomes, as illustrated in Figure 1 and detailed in [25]. Colors identify which unitigs of the graphs in panel C are related to which features of the models in panels A and B.
Likewise, Figure 5 presents the DBGWAS analysis of the lasso and cluster-lasso signatures obtained
for cefoxitin. We focused on the two first subgraphs provided by the software, which represent the
two genomic neighbourhoods of the most important patterns, or clusters of patterns, involved in the
models. The subgraphs are indeed ordered according to the maximal absolute value of model
coefficients among all patterns or clusters involved in the subgraph. While DBGWAS identifies the
same resistance genes in both methods (the porin gene ompK36 and blaKPC), the nature of the
underlying resistance determinants cannot be deduced from the lasso signature. The ompK36-
annotated subgraph obtained for the cluster-lasso signature (top-right panel of Figure 5) involves 2
clusters gathering 9 unitigs (clusters 1 and 3), and presents a topology attributable to a local
polymorphism: a complex bubble, with a fork separating susceptible (blue) and resistant (red) strains,
as described in [25]. The corresponding lasso subgraph, shown on the top-left panel, includes 4
patterns (patterns 1, 2, 32 and 56) each having its proper value of model coefficient, represented by
4 shades of colors ranging from blue to red. These distinct model coefficient values can lead to wrong
conclusions regarding the individual importance of the corresponding unitig sequences. Indeed,
aligning these unitigs with annotated ompK36 sequences reveals that features 2 and 56 both
represent the wild type, while features 1 and 32 align to the insertion of two amino acids in the L3
loop, as described in Novais et al. [34] (Supplementary Figure S6). The second lasso subgraph
(bottom-left panel of Figure 5) includes a single feature of the signature (shown in purple),
surrounded by seven nodes (shown in grey), among which two are annotated as blaKPC. The node of
the signature is however not annotated itself, hence the subgraph could be interpreted as a local
polymorphism in the promoter region of the blaKPC gene. The cluster-lasso subgraph shown on the
bottom-right panel reveals however that this unitig was selected by the lasso among hundreds of
highly correlated unitigs. They all belong to cluster 2, which includes the complete blaKPC gene (shown
between brackets) and plasmid sequences in strong LD.
Figure 5. DBGWAS visualizations for the interpretation of the cefoxitin signatures. This figure presents the two first subgraphs obtained by DBGWAS for the lasso and cluster-lasso signatures. The DBGWAS subgraphs are ordered by decreasing maximal absolute value of model coefficient among all patterns/clusters involved in the subgraph. Likewise, pattern and cluster identifiers are ordered by decreasing absolute value of model coefficient, meaning for instance that pattern/cluster #1 has a greater weight in the model than pattern/cluster #2. The nodes (unitigs) belonging to patterns/clusters of the signatures are colored by the value of their model coefficients (from blue to red, indicating negative and positive values, respectively). The grey nodes/unitigs, not involved in the models, represent their genomic neighbourhood. The nodes for which an annotation related to antibiotic resistance was found are surrounded by a black circle. Bold brackets are used on the bottom right subgraph to highlight these black-circled nodes. This particular subgraph gathers 7 clusters, whose identifiers are reported on the picture. Cluster 2 is the largest one, and includes the blaKPC-annotated nodes. The dashed arrow shows which node of the cluster-lasso blaKPC subgraph corresponds to the one selected by the lasso.
Performance on the test set
Table 3 shows the predictive performance obtained on the test set by the lasso and cluster-lasso
signatures, as well as the models defined by Nguyen et al. [8], in terms of sensitivity, specificity and
balanced accuracy.
Table 3. Test set results. This table summarizes the results obtained on the test dataset by the lasso, cluster-lasso and Nguyen et al. [8] models for the 10 antibiotics, in terms of sensitivity, specificity and balanced accuracy (bACC). The MIC predicted by the Nguyen et al. [8] models were converted into S/NS categorical phenotypes according to the CLSI breakpoints.
We first noted that the lasso and cluster-lasso strategies reached a similar level of balanced accuracy
for most drugs, although they did not always achieve the same trade-off in terms of sensitivity and
specificity. We noted however that the confidence intervals of the corresponding sensitivities and
specificities largely overlapped for all drugs but ceftazidime (Figure 6 and Supplementary Figure S8),
indicating that they were not significantly different between lasso and cluster-lasso, except for one
drug.
Figure 6. Test set results. This figure represents the ROC curves obtained for cefepime, cefoxitin, ceftazidime and meropenem by the lasso (red) and cluster-lasso (blue) signatures, as well as their associated sensitivities / specificities and that of the Nguyen et al. [8] models, with their 95% confidence intervals.
We also noted that the models proposed by Nguyen et al. [8] usually achieved a lesser level of
balanced accuracy. This was the case for all drugs but cefepime, imipenem and meropenem, where
the performance remained comparable. Apart from these three drugs, the loss ranged from 6.6
points for piperacillin-tazobactam to 23.6 points for aztreonam. Strikingly, these models usually
achieved a much lower level of specificity than the lasso and cluster-lasso ones. This was especially
the case for ceftazidime, piperacillin-tazobactam, tetracycline and aztreonam, where the specificity
fell below 50%. In the latter case, every single strain was actually classified as resistant, hence the
specificity was null. As can be seen from Figure 6 and Supplementary Figure S8, however, the
confidence intervals of their sensitivities and specificities often overlapped with the ROC curves of
the lasso and cluster-lasso models. Note that these models were however trained to predict MICs, which
we subsequently cast into S/NS categories according to the CLSI breakpoints. While this strategy may
not be optimal to evaluate the ability of these models to accurately predict MICs, we noted that the
agreement between reference and predicted MICs was much lower on this dataset than reported
in the original publication (see Supplementary Table S3).
We often observed a serious drop between the predictive performance estimated by cross-validation
and that observed for the test set: more than 5 points of balanced accuracy for 6 drugs out of 10, and
about 10 points or more for amikacin, cefoxitin, imipenem and meropenem (13.4, 10.2, 10.9 and 9.9
points, respectively). This suggested that the training dataset taken from Nguyen et al. [8] could not
account for the entire diversity displayed by K. pneumoniae. A simple resistome-based analysis done
using the kleborate software revealed indeed that the prevalence of well-known resistance genes
was sometimes very different in the two panels. This is illustrated in Figure 7 for amikacin and
imipenem, which suffered from the highest performance drop. Redesigning the training and test
datasets by shuffling the original ones in order to obtain a homogeneous split fixed this
generalization issue (Supplementary Section S9). This illustrates that while machine learning models
can indeed succeed in learning accurate prediction rules, they fail to generalize when the dataset
they are trained on does not account for the overall diversity of the bacterial species.
Figure 7. Resistome analysis. This figure compares the training and test panels of genomes in terms of predictive performance and resistome constitution for the drugs amikacin (top) and imipenem (bottom). Left: predictive performance in terms of sensitivity, specificity, bACC and AUC estimated by cross-validation on the training set and measured on the test set, using the lasso signatures. Right: comparison of the resistome constitutions. Each kleborate resistance marker is represented by its prevalence in the resistant strains of the training (x-axis) and test (y-axis) panels.
Finally, Table 3 and Supplementary Figure S9 show an uneven level of prediction performance
among the ten antibiotics considered. The best performances were obtained for ciprofloxacin and
ceftazidime, with an AUC around 95% using either the original or the redesigned datasets
(Supplementary Figure S9). The poorest performances were obtained for two beta-lactams: cefepime,
a 4th-generation cephalosporin, and the monobactam aztreonam. This may be due to a reduced
penetrance of their genetic determinants, as described in human genetics [35], because more
complex resistance mechanisms are involved, including efflux pumps, gene regulation, or plasmid
copy number [36-38].
5.4.3 Computational requirements
Figure 8 indicates that while the duration of the cluster-lasso was on average about three times
longer than the lasso (571 vs 180 seconds), it took only about 10 minutes to obtain an entire
regularization path defined at the cluster-level. Optimizing the regularization parameter using our
cross-validation process therefore took approximately 5 hours on a single CPU. We noted that while
the time required by the lasso was relatively homogeneous across drugs, it was more variable for the
cluster-lasso. This variability was due to the fact that the lasso used in the first step identified a
variable number of active features, which directly impacted the time required to screen the
remaining ones. This is illustrated in Supplementary Figure S7.
Figure 8. Time and memory requirements. The boxplots represent the variability of the time (panel A) and maximum memory (panel B) required to generate a lasso or cluster-lasso regularization path for the ten antibiotics.
In terms of memory, we noted that the cluster-lasso procedure led to an overhead of about 2 GB
with respect to the lasso, which was related to the computation of the correlation matrix G. In
practice, we limited this overhead by computing this matrix by slices, considering subsets of p’ =
10,000 features and computing pa x p’ matrices instead of the entire pa x p matrix at once. Altogether,
this led to a computationally efficient procedure, allowing the identification of cluster-level signatures in a few
hours, for a limited memory footprint. We note that it could be straightforwardly parallelized, using
several CPUs to compute the various slices of the correlation matrix G.
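The slice-wise computation of the correlation matrix can be sketched as follows: only a pa x p' block of G is held in memory at any time, and the extended features are collected block by block. This is an illustrative pure-Python sketch, not the code of the R package; the function names are hypothetical and constant features are arbitrarily assigned a correlation of 0.

```python
import math


def pearson(u, v):
    """Standard Pearson correlation (0.0 for constant vectors, by assumption)."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    if var_u == 0 or var_v == 0:
        return 0.0
    return cov / math.sqrt(var_u * var_v)


def column_slices(p, slice_size=10_000):
    """Yield (start, stop) bounds covering the p columns in slices of p' columns."""
    for start in range(0, p, slice_size):
        yield start, min(start + slice_size, p)


def screen_by_slices(columns, active, s1=0.95, slice_size=10_000):
    """Collect extended features one slice of the correlation matrix at a time."""
    extended = []
    for start, stop in column_slices(len(columns), slice_size):
        # pa x p' block of G, discarded once the slice has been screened.
        block = [
            [pearson(columns[a], columns[j]) for j in range(start, stop)]
            for a in active
        ]
        for offset in range(stop - start):
            if any(row[offset] >= s1 for row in block):
                extended.append(start + offset)
    return extended


columns = [(1, 0, 0, 0), (1, 0, 0, 0), (1, 1, 0, 0), (0, 0, 1, 1)]
extended = screen_by_slices(columns, active=[0], slice_size=2)
```

Since the slices are independent of one another, each call producing a block could be dispatched to a separate CPU, which is the parallelization mentioned above.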
5.5 Discussion
Representing bacterial genomes using k-mers leads to very high-dimensional representations with
strong correlation structures. This may hinder a meaningful interpretation of predictive models built
by sparse ML strategies like lasso-penalized regressions [39] or decision trees-based algorithms [40],
which are known to be unstable in this case: when some features are strongly correlated, they tend
to pick one, or few ones, out of them arbitrarily [41]. This instability may not be an issue in terms of
predictive performance: as long as one feature among a group of correlated ones appears in the
model, the prediction may be unchanged. It may however have a severe impact in terms of
interpretability, as the features selected by the model may provide an incomplete or erroneous
characterization of the causal resistance determinant.
We propose a simple and computationally efficient strategy to cope with the strong correlation
structures inherent to k-mer-based representations, and build sparse and meaningful genomic
signatures. While performing a systematic study on thousands of strains of K. pneumoniae, our
approach compared favorably to the state of the art, providing indeed a comparable level of
performance, while offering a greater interpretability of the genomic features involved in the models.
On this challenging genetically flexible bacterial species with significant accessory genome
components, this new approach allowed us to extract meaningful scientific insights from the identified
signatures, as further detailed in Section S5 of the Supplementary Materials.
Central to our approach is a three-step strategy, where a sparse ML algorithm is first used to screen
features in a generic manner, which are then extended to clusters of strongly correlated features,
ultimately considered as candidate features to be included in the final antibiotic resistance prediction
model. While we here relied on lasso-penalized logistic regression for both the screening and final
learning stages, this principle is generic and could readily be transposed to other sparse ML
algorithms, like xgboost [4, 8] or set cover machines [26]. Likewise, it could straightforwardly be
extended to handle MICs or other phenotypic traits, as well as other types of genomic features (e.g.,
relying on SNPs instead of k-mers).
Several alternative strategies could be considered to handle correlations between k-mers. Most
related to our approach are the elastic-net and the group-lasso strategies, which also rely on logistic
regression – and more generally on generalized linear models – but with alternative regularization
penalties. The elastic-net penalty combines the lasso and the ridge penalties, which leads to sparse
models with a grouping mechanism: correlated features tend to be selected together [42]. This
approach was recently shown to be efficient in the context of bacterial genome-wide association
studies (GWAS), providing increased statistical power for the identification of genotype-phenotype
associations and accurate prediction rules [43]. As we demonstrate in Supplementary Section S10,
however, it remains limited in its ability to provide interpretable predictive signatures, for several
reasons. First, while it has the effect of stabilizing the lasso solution and of simultaneously activating
groups of correlated features, these groups are not defined explicitly, which intrinsically makes the
interpretation of the model difficult. Moreover, while the parameter controlling the trade-off
between the lasso and ridge penalties had a direct impact on the number of selected features, it had
little impact on the predictive performance of the model, thereby making it difficult to optimize
objectively. Finally, we empirically observed that it led to a partial and heterogeneous reconstruction
of the genomic features obtained by the cluster-lasso: a significant fraction of the cluster members
were not selected by the elastic-net, and the individual weights associated with the selected ones
varied greatly, although their levels of predictive power were comparable.
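The grouping mechanism of the elastic-net can be illustrated with a short Python sketch. It assumes one common parametrization of the penalty, lam * (alpha * L1 norm + (1 - alpha) / 2 * squared L2 norm), where alpha is the trade-off parameter between the lasso and ridge terms; the function name and the numerical values are hypothetical and serve only as an illustration.

```python
def elastic_net_penalty(weights, lam, alpha):
    """Elastic-net penalty: lam * (alpha * ||w||_1 + (1 - alpha) / 2 * ||w||_2^2)."""
    l1 = sum(abs(w) for w in weights)
    l2_squared = sum(w * w for w in weights)
    return lam * (alpha * l1 + 0.5 * (1.0 - alpha) * l2_squared)

# Two perfectly correlated features carrying a total weight of 1:
concentrated = [1.0, 0.0]   # all the weight on one feature
shared = [0.5, 0.5]         # weight split between the two features

# Pure lasso (alpha = 1) does not distinguish the two solutions.
lasso_concentrated = elastic_net_penalty(concentrated, lam=1.0, alpha=1.0)
lasso_shared = elastic_net_penalty(shared, lam=1.0, alpha=1.0)

# With a ridge component (alpha = 0.5), sharing the weight is cheaper.
enet_concentrated = elastic_net_penalty(concentrated, lam=1.0, alpha=0.5)
enet_shared = elastic_net_penalty(shared, lam=1.0, alpha=0.5)
```

Because both solutions give the same predictions when the two features are identical, the ridge component is what favors selecting correlated features together, without defining the groups explicitly.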
The group-lasso penalty leverages predefined groups of features, ensuring that all features of a given
group are either active or inactive simultaneously [44]. This strategy was for instance considered in
human GWAS, using groups of SNPs defined spatially to account for their LD [45]. Transposing this
idea to bacterial genomes is challenging since no such prior information is available to define groups,
as LD can be genome-wide [29]. A solution could be to identify clusters of correlated k-mers using
agglomerative strategies [31], but this is hard to carry out in practice on the high-dimensional datasets
involving 10^5 to 10^6 features encountered in our application. Our approach can therefore be seen as a
simple and efficient strategy to approximate such a group-lasso process in very high-dimensional
settings. Instead of collapsing groups of correlated features into composite variables, a natural
extension of our method would however be to rely on a group-lasso penalized regression defined at
the cluster level. Each feature would then be granted its own weight, which could better
reflect its individual predictive power. We empirically observed, however, that the variability of the weights within a
cluster was very small, as shown in Supplementary Figure S13, which indicates that
keeping the features separate or averaging them is essentially equivalent. In practice, we find it
easier to explicitly collapse each cluster into a single composite variable to interpret the model
parameters.
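The following Python sketch illustrates both points under stated assumptions. The penalty is written in a common form of the group-lasso, in which each group contributes the Euclidean norm of its weights scaled by the square root of the group size; the function names and all numerical values are hypothetical.

```python
from math import sqrt

def group_lasso_penalty(weights, groups, lam):
    """lam * sum over groups of sqrt(group size) * Euclidean norm of the group's weights."""
    total = 0.0
    for members in groups:
        norm = sqrt(sum(weights[j] ** 2 for j in members))
        total += sqrt(len(members)) * norm
    return lam * total

def score_separate(x, w):
    """Linear score with one weight per feature of a cluster."""
    return sum(wj * xj for wj, xj in zip(w, x))

def score_collapsed(x, composite_weight):
    """Linear score with a single weight applied to the cluster average."""
    return composite_weight * sum(x) / len(x)

# An inactive group (all weights zero) contributes nothing to the penalty.
weights = [3.0, 4.0, 0.0, 0.0]
groups = [[0, 1], [2, 3]]
penalty = group_lasso_penalty(weights, groups, lam=1.0)  # sqrt(2) * 5

# With equal weights inside a cluster of four features, the per-feature model
# (weight 0.5 each) and the collapsed model (weight 2.0 on the average) agree.
x = [1, 1, 0, 1]
same_score = score_separate(x, [0.5] * 4) == score_collapsed(x, 2.0)
```

When the weights within a cluster are nearly identical, as reported above, the collapsed model therefore loses little compared with the per-feature model.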
On the practical side, our method involves two hyper-parameters, besides the regularization
parameter, to identify active variables and to build the final model. Although these so-called
screening and clustering thresholds did not have a strong influence in this study (Supplementary
Section S2), they may be cumbersome to optimize in practice for other applications. A natural
extension to our method would be to consider re-sampling strategies in the clustering step, in order
to identify stable clusters, whose constitution would be robust to the precise definition of the
clustering threshold [46]. Alternatively, one could rely on tree-guided lasso penalization to leverage
the entire dendrogram during the final learning step, which would then simultaneously identify
clusters and learn the prediction model [47].
Regarding AMR prediction, our study on K. pneumoniae confirms several recent observations,
namely that k-mer-based approaches can learn sparse prediction rules without any prior
information, that predictions are more accurate with R/S models than with MICs, and that the level of
predictive performance can vary by antibiotic [26, 28]. Importantly, our study involved a novel panel
of 634 K. pneumoniae strains for the validation of the prediction models and suggested that the
problem is more challenging than reported in Nguyen et al. [8]. The figures reported in that study
were indeed probably optimistic because the genome panel considered did not account for the
overall genomic diversity of the K. pneumoniae species (Supplementary Section S1). The 634
additional strains with genomes and phenotypes considered in this study will help in learning more
accurate and generalizable prediction models.
Finally, the ML methods developed in this study are available in a generic R package that can be
easily transposed to other applications, not necessarily involving k-mers or AMR phenotypes. On
the challenging dataset considered in this study, involving more than a thousand strains and more
than a million genomic features, the computational requirements remained limited and the
signatures could be identified in a few hours on a standard workstation. Coupled with the enriched
level of interpretability they offer, we believe our approach will help define prediction models
amenable to routine diagnostics.
5.6 References
1. Gordon NC, Price JR, Cole K, Everitt R, Morgan M, Finney F, et al. Prediction of Staphylococcus
aureus Antimicrobial Resistance by Whole-Genome Sequencing. Journal of Clinical Microbiology
2014;52(4):1182–1191.
2. Walker TM, Kohl TA, Omar SV, Hedge J, Elias CDO, Bradley P, et al. Whole-genome sequencing for
prediction of Mycobacterium tuberculosis drug susceptibility and resistance: a retrospective cohort
study. The Lancet Infectious Diseases 2015;15:1193–1202.
3. Eyre DW, De Silva D, Cole K, Peters J, Cole MJ, Grad YH, et al. WGS to predict antibiotic MICs for
Neisseria gonorrhoeae. The Journal of Antimicrobial Chemotherapy 2017;72(7):1937–1947.
4. Nguyen M, Long SW, McDermott PF, Olsen RJ, Olson R, Stevens RL, et al. Using Machine Learning
To Predict Antimicrobial MICs and Associated Genomic Features for Nontyphoidal Salmonella.
Journal of Clinical Microbiology 2019;57(2).
5. Tyson GH, McDermott PF, Li C, Chen Y, Tadesse DA, Mukherjee S, et al. WGS accurately predicts
antimicrobial resistance in Escherichia coli. Journal of Antimicrobial Chemotherapy 2015;70(10).
6. Moradigaravand D, Palm M, Farewell A, Mustonen V, Warringer J, Parts L. Prediction of antibiotic
resistance in Escherichia coli from large-scale pan-genome data. PLOS Computational Biology
2018;14(12):1–17.
7. Deng X, Memari N, Teatero S, Athey T, Isabel M, Mazzulli T, et al. Whole-genome Sequencing for
Surveillance of Invasive Pneumococcal Diseases in Ontario, Canada: Rapid Prediction of Genotype,
Antibiotic Resistance and Characterization of Emerging Serotype 22F. Frontiers in Microbiology
2016;7:2099.
8. Nguyen M, Brettin T, Long SW, Musser JM, Olsen RJ, Olson R, et al. Developing an in silico
minimum inhibitory concentration panel test for Klebsiella pneumoniae. Scientific reports
2018;8(1):421.
9. Su M, Satola SW, Read TD. Genome-Based Prediction of Bacterial Antibiotic Resistance. Journal of
Clinical Microbiology 2019;57(3).
10. Yang Y, Niehaus KE, Walker TM, Iqbal Z, Walker AS, Wilson DJ, et al. Machine Learning for
Classifying Tuberculosis Drug-Resistance from DNA Sequencing Data. Bioinformatics 2017;p. btx801.
11. Coll F, McNerney R, Preston MD, Guerra-Assunção JA, Warry A, Hill-Cawthorne G, et al. Rapid
determination of anti-tuberculosis drug resistance from whole-genome sequences. Genome
Medicine 2015;7(1):51.
12. Bradley P, Gordon NC, Walker TM, Dunn L, Heys S, Huang B, et al. Rapid antibiotic-resistance
predictions from genome sequence data for Staphylococcus aureus and Mycobacterium tuberculosis.
Nature Communications 2015;6:10063.
13. Tanmoy AM, Westeel E, De Bruyne K, Goris J, Rajoharison A, Sajib MS, et al. Salmonella enterica
Serovar Typhi in Bangladesh: exploration of genomic diversity and antimicrobial resistance. mBio
2018;9(6):e02112–18.
14. Miotto P, Tessema B, Tagliani E, Chindelevitch L, Starks AM, Emerson C, et al. A standardised
method for interpreting the association between mutations and phenotypic drug resistance in
Mycobacterium tuberculosis. European Respiratory Journal 2017;50(6).
15. Mahé P, El Azami M, Barlas P, Tournoud M. A large scale evaluation of TBProfiler and Mykrobe for
antibiotic resistance prediction in Mycobacterium tuberculosis. PeerJ 2019 May;7:e6857.
16. Gygli SM, Borrell S, Trauner A, Gagneux S. Antimicrobial resistance in Mycobacterium tuberculosis:
mechanistic and evolutionary perspectives. FEMS Microbiology Reviews 2017 03;41(3):354–373.
17. Chen ML, Doddi A, Royer J, Freschi L, Schito M, Ezewudo M, et al. Beyond multidrug resistance:
Leveraging rare variants with machine and statistical learning models in Mycobacterium tuberculosis
resistance prediction. EBioMedicine 2019.
18. Palomino JC, Martin A. Drug resistance mechanisms in Mycobacterium tuberculosis. Antibiotics
2014;3:317–340.
19. Palmer AC, Kishony R. Understanding, predicting and manipulating the genotypic evolution of
antibiotic resistance. Nature Reviews Genetics 2013;14:243–248.
20. van Belkum A, Burnham CAD, Rossen JWA, Mallard F, Rochas O, Dunne Jr WM. Innovative and
rapid antimicrobial susceptibility testing systems. Nature Reviews Microbiology 2020.
21. Davis JJ, Boisvert S, Brettin T, Kenyon RW, Mao C, Olson R, et al. Antimicrobial Resistance
Prediction in PATRIC and RAST. Scientific Reports 2016;6:27930.
22. Drouin A, Giguère S, Déraspe M, Marchand M, Tyers M, Loo VG, et al. Predictive computational
phenotyping and biomarker discovery using reference-free genome comparisons. BMC Genomics
2016;17(1):754.
23. Mahé P, Tournoud M. Predicting bacterial resistance from whole-genome sequences using k-
mers and stability selection. BMC Bioinformatics 2018 Oct;19(1):383.
24. Lees JA, Vehkala M, Välimäki N, Harris SR, Chewapreecha C, Croucher NJ, et al. Sequence element
enrichment analysis to determine the genetic basis of bacterial phenotypes. Nature Communications
2016;7(12797).
25. Jaillard M, Lima L, Tournoud M, Mahé P, van Belkum A, Lacroix V, et al. A fast and agnostic
method for bacterial genome-wide association studies: Bridging the gap between k-mers and genetic
events. PLOS Genetics 2018 11;14(11):1–28.
26. Drouin A, Letarte G, Raymond F, Marchand M, Corbeil J, Laviolette F. Interpretable genotype-to-
phenotype classifiers with performance guarantees. Scientific Reports 2019 dec;9(1).
27. Farhat MR, Sultana R, Iartchouk O, Bozeman S, Galagan J, Sisk P, et al. Genetic determinants of
drug resistance in Mycobacterium tuberculosis and their diagnostic value. Am J Respir Crit Care Med
2016 Sep 1;194(5):621–30.
28. Hicks AL, Wheeler N, Sanchez-Buso L, Rakeman JL, Harris SR, Grad YH. Evaluation of parameters
affecting performance and reliability of machine learning-based antibiotic susceptibility testing from
whole genome sequencing data. PLOS Computational Biology 2019;15(9):e1007349.
29. Earle SG, Wu CH, Charlesworth J, Stoesser N, Gordon NC, Walker TM, et al. Identifying lineage
effects when controlling for population structure improves power in bacterial association studies.
Nature Microbiology 2016;1(16041).
30. Gauraha N, Parui SK. Efficient clustering of correlated variables and variable selection in high-dimensional
linear models. arXiv preprint arXiv:1603.03724 2016.
31. Bühlmann P, Rütimann P, van de Geer S, Zhang CH. Correlated variables in regression: Clustering
and sparse estimation. Journal of Statistical Planning and Inference 2013;143:1835–1858.
32. Slatkin M. Linkage disequilibrium: understanding the evolutionary past and mapping the medical
future. Nature reviews genetics 2008;9(6):477.
33. Friedman J, Hastie T, Tibshirani R. Regularization Paths for Generalized Linear Models via
Coordinate Descent. Journal of Statistical Software 2010;33(1):1–22.
34. Novais A, Rodrigues C, Branquinho R, Antunes P, Grosso F, Boaventura L, et al. Spread of an
OmpK36-modified ST15 Klebsiella pneumoniae variant during an outbreak involving multiple
carbapenem-resistant Enterobacteriaceae species and clones. European journal of clinical
microbiology & infectious diseases 2012;31(11):3057–3063.
35. Cooper DN, Krawczak M, Polychronakos C, Tyler-Smith C, Kehrer-Sawatzki H. Where genotype is
not predictive of phenotype: towards an understanding of the molecular basis of reduced penetrance
in human inherited disease. Human genetics 2013;132(10):1077–1130.
36. Hocquet D, Nordmann P, El Garch F, Cabanne L, Plésiat P. Involvement of the MexXY-OprM efflux
system in emergence of cefepime resistance in clinical strains of Pseudomonas aeruginosa.
Antimicrobial agents and chemotherapy 2006;50(4):1347–1351.
37. Pages JM, Lavigne JP, Leflon-Guibout V, Marcon E, Bert F, Noussair L, et al. Efflux pump, the
masked side of ß-lactam resistance in Klebsiella pneumoniae clinical isolates. PLoS One
2009;4(3):e4817.
38. Kitchel B, Rasheed JK, Endimiani A, Hujer AM, Anderson KF, Bonomo RA, et al. Genetic factors
associated with elevated carbapenem resistance in KPC-producing Klebsiella pneumoniae.
Antimicrobial agents and chemotherapy 2010;54(10):4201–4207.
39. Tibshirani R. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical
Society: Series B (Methodological) 1996;58(1):267–288.
40. Chen T, Guestrin C. Xgboost: A scalable tree boosting system. In: Proceedings of the 22nd acm
sigkdd international conference on knowledge discovery and data mining ACM; 2016. p. 785–794.
41. Hastie T, Tibshirani R, Wainwright M. Statistical Learning with Sparsity: The Lasso and
Generalizations. Chapman & Hall/CRC; 2015.
42. Zou H, Hastie T. Regularization and variable selection via the elastic net. Journal of the royal
statistical society: series B (statistical methodology) 2005;67(2):301–320.
43. Lees JA, Tien Mai T, Galardini M, Wheeler NE, Corander J. Improved inference and prediction of
bacterial genotype-phenotype associations using pangenome-spanning regressions. bioRxiv
2019;https://www.biorxiv.org/content/early/2019/11/23/852426.
44. Yuan M, Lin Y. Model selection and estimation in regression with grouped variables. Journal of
the Royal Statistical Society: Series B (Statistical Methodology) 2006;68(1):49–67.
45. Dehman A, Ambroise C, Neuvial P. Performance of a blockwise approach in variable selection
using linkage disequilibrium information. BMC bioinformatics 2015;16(1):148.
46. Kimes PK, Liu Y, Hayes DN, Marron JS. Statistical significance for hierarchical clustering.
Biometrics 2014;73(3):811–821.
47. Kim S, Xing EP. Tree-guided group lasso for multi-task regression with structured sparsity. In:
International Conference on Machine Learning; 2010. p. 543–550.
CHAPTER 6 : PFM-like, a novel family of subclass B2 metallo β-
lactamase from Pseudomonas synxantha belonging to the
Pseudomonas fluorescens complex
Laurent Poirel1,2,3, Mattia Palmieri1,4, Michael Brilhante5, Amandine Masseron1, Vincent Perreten5,
Patrice Nordmann1,2,3,6
1Microbiology Unit, Department of Medicine, Faculty of Science, University of Fribourg, Fribourg,
Switzerland.
2INSERM European Unit (IAME, France), University of Fribourg, Fribourg, Switzerland.
3Swiss National Reference Center for Emerging Antibiotic Resistance (NARA), University of Fribourg,
Fribourg, Switzerland.
4bioMérieux, Data Analytics Unit, La Balme Les Grottes, France.
5Institute of Veterinary Bacteriology, Vetsuisse Faculty, University of Bern, Bern, Switzerland.
6Institute for Microbiology, University of Lausanne and University Hospital Centre, Lausanne,
Switzerland
Published in Antimicrobial Agents and Chemotherapy, 27 January 2020, doi: 10.1128/AAC.01700-19
6.1 Abstract
A carbapenem-resistant Pseudomonas synxantha isolate recovered from chicken meat produced the
novel carbapenemase PFM-1. That subclass B2 metallo-β-lactamase shared 71% amino acid identity
with β-lactamase Sfh-1 from Serratia fonticola. The blaPFM-1 gene was chromosomally located and
likely acquired. Variants of PFM-1 sharing 90% to 92% amino acid identity were identified in bacterial
species belonging to the Pseudomonas fluorescens complex, including Pseudomonas libanensis
(PFM-2) and Pseudomonas fluorescens (PFM-3), highlighting that these species constitute reservoirs of
PFM-like encoding genes.
6.2 Main text
Metallo-β-lactamases (MBLs) are zinc-dependent enzymes that can catalyze the hydrolysis of
virtually all β-lactam antibiotics (including carbapenems) except for monobactams and that are
resistant to the β-lactamase inhibitors clavulanate, tazobactam, and avibactam (1). They constitute a
highly diverse family of enzymes and can be categorized into three subclasses, namely, B1, B2, and
B3 (2). The subclass B1 enzymes are the most clinically important since they comprise MBLs such as
IMP-1, NDM-1, SPM-1, KHM-1, VIM-1, and VIM-2 (3), widely identified in Enterobacteriaceae,
Acinetobacter spp., and Pseudomonas spp. Subclass B2 includes CphA (4, 5), ImiS (6, 7), and AsbM1
(8), which are intrinsic enzymes in Aeromonas spp., and Sfh-I (9) from the occasionally pathogenic
species Serratia fonticola. These carbapenemases are monozinc enzymes that usually show much
higher hydrolysis rates against carbapenem substrates than the other β-lactams (9).
Production of MBLs in the Pseudomonas genus is frequently observed, with acquired MBL-encoding
genes (blaIMP, blaVIM, blaSPM) being reported worldwide mainly in Pseudomonas aeruginosa and, to a
lesser extent, in Pseudomonas fluorescens (10, 11). In addition, intrinsic MBL genes encoding subclass
B3 POM-1-like and PAM-1-like enzymes have been identified in Pseudomonas otitidis and
Pseudomonas alcaligenes, respectively (12–14).
P. fluorescens and related species belonging to the same complex are rarely associated with infections
in human medicine (15). Nevertheless, P. fluorescens can cause bloodstream infections in humans,
and most reported cases have been iatrogenic (16). Few studies have focused on the β-lactamase
gene content of the P. fluorescens complex. While P. fluorescens possesses a chromosomally located
and inducible Ambler class C β-lactamase gene (17), the acquired but chromosomally located blaBIC-1
gene encoding an Ambler class A carbapenemase was previously identified as a source of
carbapenem resistance in P. fluorescens isolates recovered from the Seine River, Paris (18).
Here, we analyzed a carbapenem-resistant Pseudomonas sp. isolate that had been recovered during
a survey aimed at studying the spread of multidrug-resistant Gram-negative organisms among food
varieties and food-producing animals in Switzerland in 2018. Isolate MCP-106 was recovered from
chicken meat after an 18-h preenrichment in LB broth and subsequent selection on ChromID
CarbaSmart (bioMérieux, La Balme-les-Grottes, France). Carbapenemase production was tested
using the Rapid Carba NP test (19). Matrix-assisted laser desorption ionization–time of flight (MALDI-
TOF) analysis assigned the strain to the Pseudomonas synxantha species, and that assignment was
further confirmed by analysis of the rpoB and rpoD gene sequences (Fig. 1). P. synxantha, which
belongs to the P. fluorescens complex (20), is an environmental species that reduces and
accumulates the heavy metal chromium (21, 22) and is pathogenic to nematode eggs; it may
therefore be used as a nematicidal agent (23). Susceptibility testing performed for β-lactams by disk
diffusion showed that P. synxantha strain MCP-106 was resistant to amino- and carboxypenicillins,
broad-spectrum cephalosporins, aztreonam, and carbapenems. Whole-genome sequencing was
performed using an Illumina MiSeq platform (2 × 150-bp paired ends) to assess the genetic
determinants of carbapenem resistance. The obtained reads were trimmed using trimmomatic 0.36,
assembled with SPAdes version 3.11.1 (24), and annotated with PROKKA version 1.12. TBLASTN
analysis of the DNA contigs using VIM as a reference revealed a chromosomally located MBL protein
that was named PFM-1 (Pseudomonas fluorescens metallo-β-lactamase). PFM-1 (encoded by the
blaPFM-1 gene) consisted of a β-lactamase with 253 amino acids and a relative molecular mass of 28.5
kDa.
FIG 1. Dendrogram performed by using the seven genes from the multilocus sequence typing (MLST) analysis in comparison with representative genes from other Pseudomonas species, in particular, the most closely related ones, which are Pseudomonas fluorescens and Pseudomonas synxantha. The alignment used for the tree calculation was performed with the Clustal Omega program.
A BLASTN analysis against the NCBI database revealed the presence of a blaPFM-like gene (named
blaPFM-2, with PFM-2 sharing 92% amino acid identity with PFM-1) in Pseudomonas libanensis strain
CIP105460 (GenBank accession no. GCA_001439685.1) (25), which belongs to the
P. fluorescens complex. In addition, genes encoding PFM-like products were also
identified in the genomes of a single P. fluorescens strain (WP_050516231.1) and two Pseudomonas
brenneri strains, sharing 90% amino acid identity with PFM-1 (WP_128593843.1 and OAE14554.1).
Furthermore, a gene encoding a more distantly related enzyme (75% amino acid identity) was found
in the genome of a Pseudomonas chlororaphis strain (WP_038635452.1). However, no other
blaPFM-like gene was identified in any other P. fluorescens genomes (or in any genomes of species belonging
to the same complex), despite numerous genomes of strains belonging to the P. fluorescens complex
(n = 145) having been fully sequenced. We then screened 10 P. fluorescens strains from our
laboratory collection, all of which had been recovered from human, animal, or environmental
samples. A PCR-based approach using primer pair PFM-1-Fw (5’-GTTACGCCTGATGGACTTTG-3’) and
PFM-1-Rv (5’-CTTAGAAGCATGTCAGTGCG-3’) for blaPFM-1 and primer pair PFM-2-Fw (5’-
CTGATCAGAAAATGTGGGGC-3’) and PFM-2-Rw (5’-GACACGCCGTGTTTCTATATC-3’) for blaPFM-2 was
employed. A single strain gave a positive result, and Sanger sequencing identified a blaPFM-like gene
(blaPFM-3) encoding a protein sharing 91% amino acid identity with PFM-1. The blaPFM-3 gene was
identified from P. fluorescens PF1, an isolate recovered from a water sample from the Seine River in
Paris, France, and also producing the Ambler class A carbapenemase BIC-1 (18). PFM-2 and PFM-3
differed by five amino acids.
Pairwise alignment of the PFM-like amino acid sequences with those of other MBLs
revealed that these newly identified enzymes were most closely related to the subclass B2 MBL
enzymes. PFM-1 shares 71% amino acid identity with Sfh-1, originally identified in Serratia fonticola
strain UTAD54 (9), and 53% identity with CphA-1 from Aeromonas hydrophila (26). It shares very low
identity with subclass B1 MBLs such as NDM-1 (17%) and VIM-1 and IMP-1 (22%) (Fig. 2). Protein
alignments of the β-lactamase PFM-1 with representative subclass B2 MBLs revealed the presence of
conserved amino acid residues known to be involved in zinc binding in class B β-lactamases (BBL)
(27) (Fig. 3). The motif Asn-Tyr-His-Thr-Asp (positions 116 to 120 [BBL nomenclature]), a
distinctive feature of subclass B2 MBLs presumably involved in the coordination of the two zinc
ions found in the active site of these enzymes, was identified in PFM-like enzymes. Amino acids
Asp-120, Cys-221, and His-263, presumably involved in the binding of the second zinc ion in subclass
B2 MBLs, were also conserved in the PFM-like proteins.
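As an illustration of how a percentage of amino acid identity can be derived from a pairwise alignment, the minimal Python sketch below counts identical residues over the aligned columns that contain no gap. This is only one of several conventions for the denominator and is an assumption here, as the exact convention used for the values above is not stated; the function name and the two short aligned strings are hypothetical and are not PFM sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of identical residues over gap-free columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # columns with a gap in either sequence are ignored
        compared += 1
        if res_a == res_b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0

# Toy aligned fragments (made up): one substitution and one gapped column.
identity = percent_identity("NYHTD-KL", "NYHSDAKL")  # 6 matches over 7 compared columns
```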
FIG 2. Dendrogram of PFM-1, PFM-2, and PFM-3 in comparison with representative class B β-lactamases, obtained by neighbor-joining analysis. The alignment used for the tree calculation was performed with the Clustal Omega program. Numbers in parentheses indicate percentages of amino acid identity with PFM-1. The β-lactamases used for the comparisons (GenBank accession numbers) were Sfh-1 (NZ_AUZV01000091.1), CphA-1 (X57102), ImiS (Y10415), ImiH (AJ548797), VIM-1 (AJ278514), IMP-1 (EF027105), NDM-1 (KJ018857), POM-1 (EU315252), and PAM-1 (AB858498).
FIG 3. Alignment of the amino acid sequences of subclass B2 MBLs. Residues conserved in the enzymes are indicated by asterisks; colons indicate conservation between groups with strongly similar properties; dots indicate conservation between groups with weakly similar properties. The BBL numbering scheme (in bold) is used for residues conserved in MBLs.
In order to gain insight into the β-lactam resistance phenotype conferred by the corresponding
proteins, the blaPFM-1, blaPFM-2, and blaPFM-3 genes of P. synxantha strain MCP-106, P. libanensis strain
CIP105460, and P. fluorescens PF1 were cloned into plasmid pTOPO (Invitrogen, Illkirch, France) and
expressed in Escherichia coli. Cloning experiments were performed using the pCR-blunt TOPO cloning
kit (Invitrogen, Illkirch, France) after amplification of the genes with primers PFM-1-Fw and PFM-1-Rv
for blaPFM-1 and with primers PFM-2-Fw and PFM-2-Rw for blaPFM-2 and blaPFM-3. The resulting
recombinant plasmids were transformed into chemically competent E. coli TOP10 cells. When
expressed in E. coli TOP10, the different PFM variants conferred similar resistance phenotypes,
with reduced susceptibility to carbapenems (Table 1) but, paradoxically, no effect on
the other β-lactams tested, such as amoxicillin, ticarcillin, cefoxitin, cefotaxime, and ceftazidime (data
not shown). MICs of carbapenems were determined by Etest and showed values for the PFM-3-
producing recombinant strain that were higher than those obtained with the PFM-1-producing and
PFM-2-producing recombinant strains, particularly for imipenem (Table 1).
Table 1. MICs of carbapenems for E. coli TOP10 recipient strain with and without the blaPFM genes and for Pseudomonas isolates.
aCIP105460 was originally described by Dabboussi et al. (25).
bPF1 was originally described by Girlich et al. (18).
cClavulanic acid was used at a concentration of 2 µg/ml.
dTazobactam was used at a concentration of 4 µg/ml.
Purification of the PFM-1 enzyme was performed using a four-liter LB broth culture of E. coli TOP10
(pTOPO-blaPFM-1) recombinant strain supplemented with kanamycin (50 µg/ml) and incubated for 24
h at 37°C under shaking conditions. The bacterial culture was centrifuged, and the pellet was
resuspended in Tris-HCl buffer (50 mM Tris-HCl, 100 µM ZnCl2, pH 8.5) and sonicated using a Vibra-
Cell 75186 sonicator (Thermo Fisher Scientific). After filtration using a 0.22-µm pore size
nitrocellulose filter, the crude extract was loaded in a Q-Sepharose column connected to an
ÄKTAprime chromatography system (GE Healthcare, Glattbrugg, Switzerland) and eluted with a linear
NaCl gradient. The presence of the β-lactamase was monitored using the Rapid Carba NP test (19),
and the fractions showing the highest β-lactamase activity were pooled and dialyzed against 100 mM
phosphate buffer (pH 7.0), prior to 10-fold concentration performed with a Vivaspin 20 concentrator
(GE Healthcare). The purified β-lactamase extract was immediately used for enzymatic
determinations.
The protein concentrations were measured using Bradford reagent (Sigma-Aldrich, Buchs,
Switzerland), and the purity of the enzyme was estimated by SDS-PAGE analysis (GenScript, NJ, USA).
The purity of PFM-1 was estimated to be >95%, with a single dominant band visible on the SDS-
polyacrylamide gel. Kinetic measurements were performed at room temperature using phosphate-
buffered saline (PBS) buffer (0.1 M, pH 7) supplemented with ZnSO4 (5 µM) using a UV/visible
Ultrospec 2100 Pro spectrophotometer (Amersham Biosciences, Buckinghamshire, United Kingdom).
This kinetic analysis confirmed that PFM-1 hydrolyzed carbapenems; however, the catalytic efficiency
was slightly lower than that seen with the previously described subclass B2 MBLs (Table 2). In
contrast, hydrolysis of other β-lactam substrates such as benzylpenicillin or cefotaxime was not
detected (kcat value < 0.01 s⁻¹). This study therefore characterized a novel family of subclass B2
MBLs with substantial carbapenemase activity. Compared to other subclass B2 MBLs, PFM-1
hydrolysis is limited to carbapenems, and the catalytic efficiency is lower.
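For readers less familiar with these kinetic parameters, the Python sketch below shows how the catalytic efficiency (kcat/Km) and a Michaelis-Menten initial rate follow from kcat and Km. The function names and the numerical values are hypothetical and are not taken from Table 2; only the 0.01 s⁻¹ detection limit comes from the text above.

```python
def catalytic_efficiency(kcat_per_s, km_um):
    """Catalytic efficiency kcat/Km, in µM^-1 s^-1 when Km is given in µM."""
    return kcat_per_s / km_um

def michaelis_menten_rate(kcat_per_s, km_um, enzyme_um, substrate_um):
    """Initial rate v = kcat * [E] * [S] / (Km + [S]), in µM per second."""
    return kcat_per_s * enzyme_um * substrate_um / (km_um + substrate_um)

def hydrolysis_detected(kcat_per_s, detection_limit=0.01):
    """Hydrolysis is reported as not detected when kcat falls below the limit."""
    return kcat_per_s >= detection_limit

# Hypothetical enzyme: kcat = 10 s^-1, Km = 100 µM.
efficiency = catalytic_efficiency(10.0, 100.0)             # 0.1 µM^-1 s^-1
rate = michaelis_menten_rate(10.0, 100.0, 0.01, 100.0)     # half of Vmax at [S] = Km
```

A lower kcat/Km for a given substrate, as reported for PFM-1 relative to other subclass B2 enzymes, means that the enzyme turns over that substrate less efficiently at low substrate concentrations.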
Table 2. Kinetic parameters of purified β-lactamase PFM-1 and comparison with other B2 MBLs. Kinetic data are displayed for Sfh-1, CphA, and AsbM1 as reported previously by Fonseca et al. (30), Vanhove et al. (31), and Yang and Bush (8), respectively. ImiS kinetic values are presented for imipenem and meropenem as reported by Sharma et al. (32) and Crawford et al. (6), respectively. NR, not reported.
The levels of G+C content of blaPFM-1 (50%) and blaPFM-2/-3 (52%) differed from the expected range of
the G+C content of Pseudomonas genes (ca. 60%); in addition, the fact that no other blaPFM-like genes
were identified in several fully sequenced genomes of P. fluorescens strains available in the GenBank
databases further suggests a non-Pseudomonas origin. However, no obvious genetic element that
could have been involved in the acquisition of that gene was observed in its nearby genetic
environment. Similarly, no mobile genetic elements were identified in their upstream vicinity by analyzing
the genes showing significant identities with blaPFM-1 in the GenBank database. It may be speculated
that those genes have been acquired by transformation since P. fluorescens strains, as with many
other Gram-negative nonfermenters, are spontaneously transformable at high frequency (28).
However, a discrepancy was consistently noticed between the putative MBL-encoding genes
(including blaPFM-1) and the surrounding chromosomal sequences in terms of G+C content (ca. 50%
versus ca. 60%), suggesting a foreign origin (data not shown).
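The G+C comparison underlying this argument can be sketched in a few lines of Python. The function names, the toy sequence and the 5-point gap used to flag a deviation are hypothetical choices made for illustration; only the approximate values of 50% and 60% come from the text.

```python
def gc_content(sequence):
    """Percentage of G and C among the unambiguous bases (A, C, G, T) of a DNA sequence."""
    counted = [base for base in sequence.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no A, C, G or T")
    gc = sum(1 for base in counted if base in "GC")
    return 100.0 * gc / len(counted)

def deviates_from_background(gene_gc, background_gc, min_gap=5.0):
    """Flag a gene whose G+C content differs from the background by at least min_gap points."""
    return abs(gene_gc - background_gc) >= min_gap

toy_gc = gc_content("ATGCATGCAT")                    # 4 G/C out of 10 bases
flagged = deviates_from_background(50.0, 60.0)       # values quoted in the text
```

Such a deviation is only suggestive of a foreign origin, since G+C content also varies among native genes.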
This work underlines that P. fluorescens-like species may possess class B β-lactamase genes that are,
however, not systematically present in their genomes. Although strains belonging to the P.
fluorescens complex are rarely involved in human infections, they are widely disseminated in the
environment, are part of the human microbiota, and can also be found in chicken meat (16). Those
bacterial species may therefore constitute reservoirs of antimicrobial resistance genes (29).
6.3 Data availability
The sequences of PFM-1, PFM-2, and PFM-3 have been deposited in the NCBI database under GenBank
accession numbers MN065826 (PFM-1), MN080496 (PFM-2), and MN080497 (PFM-3). The sequence
of the whole genome of P. synxantha strain MCP-106 has been deposited under GenBank accession
number VSRO00000000.1, BioProject accession no. PRJNA561277, and BioSample accession no.
SAMN12612925.
6.4 References
1. Jeon J, Lee JH, Lee JJ, Park KS, Karim AM, Lee CR, Jeong BC, Lee SH. 2015. Structural basis for
carbapenem-hydrolyzing mechanisms of carbapenemases conferring antibiotic resistance. Int J Mol
Sci 16:9654–9692. https://doi.org/10.3390/ijms16059654.
2. Palzkill T. 2013. Metallo-β-lactamase structure and function. Ann N Y Acad Sci 1277:91–104.
https://doi.org/10.1111/j.1749-6632.2012.06796.x.
3. Cornaglia G, Giamarellou H, Rossolini GM. 2011. Metallo-β-lactamases: a last frontier for β-lactams?
Lancet Infect Dis 11:381–393. https://doi.org/10.1016/S1473-3099(11)70056-1.
4. Hernandez Valladares M, Felici A, Weber G, Adolph HW, Zeppezauer M, Rossolini GM, Amicosante
G, Frère JM, Galleni M. 1997. Zn(II) dependence of the Aeromonas hydrophila AE036 metallo-β-lactamase
activity and stability. Biochemistry 36:11534–11541. https://doi.org/10.1021/bi971056h.
5. Segatore B, Massidda O, Satta G, Setacci D, Amicosante G. 1993. High specificity of cphA-encoded
metallo-β-lactamase from Aeromonas hydrophila AE036 for carbapenems and its contribution to β-lactam
resistance. Antimicrob Agents Chemother 37:1324–1328. https://doi.org/10.1128/aac.37.6.1324.
6. Crawford PA, Sharma N, Chandrasekar S, Sigdel T, Walsh TR, Spencer J, Crowder MW. 2004. Over-
expression, purification, and characterization of metallo-β-lactamase ImiS from Aeromonas veronii bv.
sobria. Protein Expr Purif 36:272–279. https://doi.org/10.1016/j.pep.2004.04.017.
7. Walsh TR, Gamblin S, Emery DC, MacGowan AP, Bennett PM. 1996. Enzyme kinetics and
biochemical analysis of the Imis, the metallo-β-lactamase from Areonomas sobria. J Antimicrob
Chemother 37:423– 441. https://doi.org/10.1093/jac/37.3.423.
8. Yang Y, Bush K. 1996. Biochemical characterization of the carbapenem- hydrolyzing β-lactamase
AsbM1 from Aeromonas sobria AER 14M: a member of a novel subgroup of metallo-β-lactamases.
FEMS Microbiol Lett 137:193–200. https://doi.org/10.1111/j.1574-6968.1996.tb08105.x.
9. Saavedra MJ, Peixe L, Sousa JC, Henriques I, Alves A, Correia A. 2003. Sfh-I, a subclass B2
metallo-β-lactamase from a Serratia fonticola environmental isolate. Antimicrob Agents Chemother
47:2330–2333. https://doi.org/10.1128/aac.47.7.2330-2333.2003.
10. Koh TH, Wang GCY, Sng L-H. 2004. IMP-1 and a novel metallo-β-lactamase, VIM-6, in fluorescent
pseudomonads isolated in Singapore. Antimicrob Agents Chemother 48:2334–2336.
https://doi.org/10.1128/AAC.48.6.2334-2336.2004.
11. Pellegrini C, Mercuri PS, Celenza G, Galleni M, Segatore B, Sacchetti E, Volpe R, Amicosante G,
Perilli M. 2009. Identification of blaIMP-22 in Pseudomonas spp. in urban wastewater and nosocomial
environments: biochemical characterization of a new IMP metallo-enzyme variant and its genetic
location. J Antimicrob Chemother 63:901–908. https://doi.org/10.1093/jac/dkp061.
12. Lee K, Kim CK, Yong D, Yum JH, Chung MH, Chong Y, Thaller MC, Rossolini GM. 2012. POM-1
metallo-β-lactamase-producing Pseudomonas otitidis isolate from a patient with chronic otitis media.
Diagn Microbiol Infect Dis 72:295–296. https://doi.org/10.1016/j.diagmicrobio.2011.11.007.
13. Borgianni L, De Luca F, Thaller MC, Chong Y, Rossolini GM, Docquier JD. 2015. Biochemical
characterization of the POM-1 metallo-β-lactamase from Pseudomonas otitidis. Antimicrob Agents
Chemother 59:1755–1758. https://doi.org/10.1128/AAC.03843-14.
14. Suzuki M, Suzuki S, Matsui M, Hiraki Y, Kawano F, Shibayama K. 2014. A subclass B3
metallo-β-lactamase found in Pseudomonas alcaligenes. J Antimicrob Chemother 69:1430–1432.
https://doi.org/10.1093/jac/dkt498.
15. Garrido-Sanz D, Arrebola E, Martínez-Granero F, García-Méndez S, Muriel C, Blanco-Romero E,
Martín M, Rivilla R, Redondo-Nieto M. 2017. Classification of isolates from the Pseudomonas
fluorescens complex into phylogenomic groups based in group-specific markers. Front Microbiol
8:413. https://doi.org/10.3389/fmicb.2017.00413.
16. Scales BS, Dickson RP, LiPuma JJ, Huffnagle GB. 2014. Microbiology, genomics, and clinical
significance of the Pseudomonas fluorescens species complex, an unappreciated colonizer of humans.
Clin Microbiol Rev 27:927–948. https://doi.org/10.1128/CMR.00044-14.
17. Pierrard A, Ledent P, Docquier JD, Feller G, Gerday C, Frère JM. 1998. Inducible class C
β-lactamases produced by psychrophilic bacteria. FEMS Microbiol Lett 161:311–315.
https://doi.org/10.1111/j.1574-6968.1998.tb12962.x.
18. Girlich D, Poirel L, Nordmann P. 2010. Novel Ambler class A carbapenem-hydrolyzing β-lactamase
from a Pseudomonas fluorescens isolate from the Seine River, Paris, France. Antimicrob Agents
Chemother 54:328–332. https://doi.org/10.1128/AAC.00961-09.
19. Nordmann P, Poirel L, Dortet L. 2012. Rapid detection of carbapenemase-producing
Enterobacteriaceae. Emerg Infect Dis 18:1503–1507. https://doi.org/10.3201/eid1809.120355.
20. Anzai Y, Kim H, Park JY, Wakabayashi H, Oyaizu H. 2000. Phylogenetic affiliation of the
pseudomonads based on 16S rRNA sequence. Int J Syst Evol Microbiol 50:1563–1589.
https://doi.org/10.1099/00207713-50-4-1563.
21. McLean JS, Beveridge TJ, Phipps D. 2000. Isolation and characterization of a chromium-reducing
bacterium from a chromated copper arsenate-contaminated site. Environ Microbiol 2:611–619.
https://doi.org/10.1046/j.1462-2920.2000.00143.x.
22. Badar U, Ahmed N, Beswick AJ, Pattanapipitpaisal P, Macaskie LE. 2000. Reduction of chromate
by microorganisms isolated from metal contaminated sites of Karachi, Pakistan. Biotechnol Lett
22:829–836. https://doi.org/10.1023/A:1005649113190.
23. Wechter WP, Begum D, Presting G, Kim JJ, Wing RA, Kluepfel DA. 2002. Physical mapping,
BAC-end sequence analysis, and marker tagging of the soilborne nematicidal bacterium, Pseudomonas
synxantha BG33R. OMICS 6:11–21. https://doi.org/10.1089/15362310252780807.
24. Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS, Lesin VM, Nikolenko SI,
Pham S, Prjibelski AD, Pyshkin AV, Sirotkin AV, Vyahhi N, Tesler G, Alekseyev MA, Pevzner PA. 2012.
SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J Comput
Biol 19:455–477. https://doi.org/10.1089/cmb.2012.0021.
25. Dabboussi F, Hamze M, Elomari M, Verhille S, Baida N, Izard D, Leclerc H. 1999. Pseudomonas
libanensis sp. nov., a new species isolated from Lebanese spring waters. Int J Syst Bacteriol
49:1091–1101. https://doi.org/10.1099/00207713-49-3-1091.
26. Massidda O, Rossolini GM, Satta G. 1991. The Aeromonas hydrophila cphA gene: molecular
heterogeneity among class B metallo-β-lactamases. J Bacteriol 173:4611–4617.
https://doi.org/10.1128/jb.173.15.4611-4617.1991.
27. Garau G, García-Sáez I, Bebrone C, Anne C, Mercuri P, Galleni M, Frère JM, Dideberg O. 2004.
Update of the standard numbering scheme for class B β-lactamases. Antimicrob Agents Chemother
48:2347–2349. https://doi.org/10.1128/AAC.48.7.2347-2349.2004.
28. Nielsen KM, Smalla K, van Elsas JD. 2000. Natural transformation of Acinetobacter sp. strain
BD413 with cell lysates of Acinetobacter sp., Pseudomonas fluorescens, and Burkholderia cepacia in
soil microcosms. Appl Environ Microbiol 66:206–212.
https://doi.org/10.1128/aem.66.1.206-212.2000.
29. D’Costa VM, King CE, Kalan L, Morar M, Sung WWL, Schwarz C, Froese D, Zazula G, Calmels F,
Debruyne R, Golding GB, Poinar HN, Wright GD. 2011. Antibiotic resistance is ancient. Nature
477:457–461. https://doi.org/10.1038/nature10388.
30. Fonseca F, Sarmento AC, Henriques I, Samyn B, van Beeumen J, Domingues P, Domingues MR,
Saavedra MJ, Correia A. 2011. Biochemical characterization of Sfh-I, a subclass B2
metallo-β-lactamase from Serratia fonticola UTAD54. Antimicrob Agents Chemother 55:5392–5395.
https://doi.org/10.1128/AAC.00429-11.
31. Vanhove M, Zakhem M, Devreese B, Franceschini N, Anne C, Bebrone C, Amicosante G, Rossolini
GM, Van Beeumen J, Frère JM, Galleni M. 2003. Role of Cys221 and Asn116 in the zinc-binding sites
of the Aeromonas hydrophila metallo-β-lactamase. Cell Mol Life Sci 60:2501–2509.
https://doi.org/10.1007/s00018-003-3092-x.
32. Sharma NP, Hajdin C, Chandrasekar S, Bennett B, Yang KW, Crowder MW. 2006. Mechanistic
studies on the mononuclear ZnII-containing metallo-β-lactamase ImiS from Aeromonas sobria.
Biochemistry 45:10729–10738. https://doi.org/10.1021/bi060893t.
CHAPTER 7 : Summary and perspectives
7.1 Summary
A general description of the main topics in this work is provided in CHAPTER 1: the AMR challenge,
with particular focus on the species K. pneumoniae and A. baumannii, and the potential role of WGS
in improving diagnostics and surveillance.
While antibiotics still represent the major antibacterial agents for the treatment of bacterial
infections, an increasing number of bacteria are becoming resistant to them, complicating
treatment. Carbapenems are highly effective antibiotics commonly used for the
treatment of severe infections caused by MDR bacteria, which are resistant to first-line antibiotics.
Of major concern, carbapenem resistance is on the rise, and strains carrying mobile genetic
determinants of carbapenem resistance are the leading cause of nosocomial outbreaks. In some
countries the prevalence of carbapenem resistance is so high that other drugs, usually reserved as last
options, are widely used. As an example, colistin, an old drug that had fallen out of use due to its
toxicity, is now commonly used in some countries, and resistance to this antibiotic is on the rise.
Of the several pathogens associated with AMR, carbapenem-resistant K. pneumoniae and A.
baumannii represent major concerns. Both pathogens frequently cause outbreaks of infections, while
strains that are resistant to all available antibiotics are emerging. Concerning K. pneumoniae, a
novel kind of superbug has recently emerged. While MDR K. pneumoniae clones causing hospital
outbreaks and hypervirulent, drug-susceptible clones causing severe community-acquired infections
used to be two separate concerns, the two traits are now converging. Acquisition of
hypervirulence genes by MDR clones and of resistance genes by hypervirulent clones has been
observed, especially in Asia. Tracking the emergence and evolution of such novel clones, which cause
severe infections with limited treatment options, is fundamental.
The decreasing cost of WGS is allowing its increasing implementation in bacterial diagnostics.
Surveillance, outbreak investigation and phenotype prediction, in particular for AMR and virulence
determinants, are some of the major applications of WGS in the clinical microbiology laboratory.
Despite an increasing number of studies, there is still a lack of surveillance investigations for last-line
resistance mechanisms and for convergence of resistance and hypervirulence traits. Moreover, while
phenotype prediction from genomic data has shown encouraging results, the understanding of
the genetic mechanisms of resistance to some drugs, such as colistin, is still limited, and novel in silico
tools for phenotype prediction are needed.
The first aim of this work was to employ WGS to characterize collections of clinical colistin-resistant
isolates from countries where the carbapenem resistance rate is very high and colistin often
represents the last treatment option. In CHAPTER 2 we analysed forty-five colistin-resistant K.
pneumoniae strains from Serbia, collected during 2013-2017 from seven Serbian medical settings
covering the entire country. WGS showed the absence of acquired colistin resistance mechanisms,
while alterations in the mgrB gene, involved in LPS modifications, were observed in all strains. Such
modifications were confirmed by mass spectrometry, which revealed addition of L-Ara4N to the LPS,
consistent with MgrB inactivation. Genomic epidemiology investigations revealed the abundance of
an emerging ‘high-risk’ clone, ST101, which was observed in most of the cities involved in the study,
demonstrating its high endemicity. Interestingly, ST101 strains carried the carbapenemase-encoding
gene blaOXA-48; however, this gene was not embedded in the usual OXA-48 plasmid. In order to
decipher the genetic background of blaOXA-48, we performed long-read sequencing with the ONT
MinION instrument and obtained the full plasmid sequence. This plasmid was novel and likely
resulted from the recombination of two previously described plasmids. Compared to a classic
OXA-48 plasmid, it had different plasmid replicons and carried several other AMR genes, including the
ESBL-encoding gene blaCTX-M-15.
In CHAPTER 3 we studied a collection of carbapenem- and colistin-resistant A. baumannii clinical
isolates obtained during 2015-2017 from several Greek hospitals. WGS revealed that the strains
belonged to one of the two major international clones (IC1 or IC2), with a clear predominance of IC2,
which is replacing IC1 globally. Interestingly, we observed the same colistin resistance-associated
mutation in all the strains from both ICs, namely an amino acid substitution in PmrB, a
regulator involved in LPS modifications. This mutation was associated with low-level colistin
resistance. In some strains, additional mutations in either PmrA or PmrB further decreased
colistin susceptibility, leading to high-level colistin resistance. Interestingly, mass spectrometry
analysis of LPS detected modifications in both colistin-resistant and susceptible strains. This finding
indicates that other, still unknown factors may be needed for a resistant phenotype. Overall, we
observed a convergent evolution of different clonal lineages towards the same colistin resistance
mechanism, indicating that this mechanism may not have a major impact on the strains’ fitness.
Given the frequent emergence of novel AMR mechanisms and high-risk clones from Asia, in CHAPTER
4 we employed WGS to study the evolution and epidemiology of a large collection (N=300) of K.
pneumoniae isolates from the H301 hospital in Beijing, China. The isolates were of clinical origin and
obtained during the period 2002-2016. Of those, 200 were randomly selected in order to study the
population structure within the hospital during this period. We observed an increase in
carbapenem resistance during the study period, driven by carbapenemase production by strains
mostly belonging to the globally dominant CG258 clone. Hypervirulent strains causing severe
infections were also observed, and were mainly represented by the CG23 clone. Interestingly, we
also detected eleven cases, corresponding to 5.5% overall, of simultaneous carriage of AMR genes
(ESBLs or carbapenemases) and hypervirulence genes. Tracking the emergence and evolution of such
strains, which cause severe infections with limited treatment options, is fundamental in order to
understand their origin and possible further evolution, and to limit their spread.
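As an illustration of how such convergent isolates can be flagged from gene detection results, the following minimal Python sketch marks an isolate as convergent when it carries at least one AMR gene and at least one hypervirulence marker. The marker lists, the function names and the toy isolates are assumptions made for this example only; they are not the screening panel or the pipeline used in Chapter 4.

```python
# Illustrative marker sets (assumptions for this sketch, not the panel used in the thesis).
HYPERVIRULENCE_MARKERS = {"iucA", "iroB", "rmpA", "rmpA2"}
AMR_PREFIXES = ("blaKPC", "blaNDM", "blaOXA-48", "blaCTX-M")


def is_convergent(genes):
    """Return True if the gene set holds at least one AMR gene and one hypervirulence marker."""
    has_amr = any(g.startswith(AMR_PREFIXES) for g in genes)
    has_hv = any(g in HYPERVIRULENCE_MARKERS for g in genes)
    return has_amr and has_hv


def convergence_rate(isolates):
    """Return (count, percentage) of convergent isolates in a {name: gene set} mapping."""
    count = sum(is_convergent(genes) for genes in isolates.values())
    pct = 100.0 * count / len(isolates) if isolates else 0.0
    return count, pct


# Hypothetical toy input: isolate name -> set of detected genes.
toy = {
    "iso1": {"blaKPC-2", "iucA"},  # AMR gene and hypervirulence marker
    "iso2": {"blaCTX-M-15"},       # AMR gene only
    "iso3": {"rmpA", "iroB"},      # hypervirulence markers only
    "iso4": set(),                 # neither
}
print(convergence_rate(toy))
```

Applied to the figures reported above, the same calculation gives 100 × 11 / 200 = 5.5% for eleven convergent isolates among the 200 randomly selected ones.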
In CHAPTER 5 we described the performance and interpretability of a machine learning (ML)
algorithm for the genome-based prediction of antimicrobial susceptibilities. While several algorithms
with high predictive performance have been built and successfully tested on different bacterial species,
such methods are often not interpretable, as the predictive genetic features are not revealed. Our
approach was tested on a panel of K. pneumoniae genomes and achieved state-of-the-art predictive
performance while also revealing the underlying resistance mechanisms. By enhancing the
interpretability of in silico prediction models, this approach improves their clinical utility, hence
facilitating their adoption in routine diagnostics by clinicians and microbiologists.
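To make the notion of interpretability concrete, the sketch below implements a deliberately simple interpretable model, a single-feature rule (decision stump) over gene presence/absence, which reports the feature it relies on together with its accuracy. This is not the algorithm described in Chapter 5; the function name, the feature names and the toy data are hypothetical and serve only to show what "revealing the predictive genetic feature" means.

```python
def best_single_feature(samples):
    """Find the single feature whose presence best predicts resistance.

    samples: list of (set_of_features, is_resistant) pairs.
    Returns (feature, accuracy) for the rule "resistant if the feature is present".
    """
    # Collect every feature seen in the data, in a deterministic order.
    features = sorted(set().union(*(feats for feats, _ in samples)))
    best_feature, best_accuracy = None, 0.0
    for feature in features:
        # A sample is classified correctly when presence of the feature matches its label.
        correct = sum((feature in feats) == resistant for feats, resistant in samples)
        accuracy = correct / len(samples)
        if accuracy > best_accuracy:
            best_feature, best_accuracy = feature, accuracy
    return best_feature, best_accuracy


# Hypothetical toy data: detected genes and carbapenem resistance phenotype.
toy_samples = [
    ({"blaKPC-2", "blaSHV-11"}, True),
    ({"blaSHV-11"}, False),
    ({"blaKPC-2"}, True),
    (set(), False),
]
print(best_single_feature(toy_samples))
```

The returned feature name is itself the explanation of the prediction, which is the property that black-box models lack.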
Finally, in CHAPTER 6 we employed WGS to decipher the carbapenem resistance mechanism of an
environmental P. fluorescens isolate. WGS revealed the presence of a putative
carbapenemase-encoding gene. The gene was cloned in a plasmid vector and the carbapenemase, PFM-1, was
expressed in an E. coli laboratory strain. The hydrolytic activity of the carbapenemase was also tested and
compared with that of similar carbapenemases. Bioinformatics analysis further revealed
the presence of this carbapenemase-encoding gene in other environmental strains, and allowed
its genetic environment to be studied. Although the gene was observed in environmental isolates, it could
be mobilized onto successful mobile genetic elements and spread to clinically relevant pathogens, as
previously reported for the most common clinically relevant AMR mechanisms.
7.2 General discussion and future perspectives
WGS is a powerful tool both for practical monitoring and for more theoretical study of bacterial
epidemiology, since it provides a comprehensive picture of bacterial populations in a single uniform
assay that can be used for all microbial species indiscriminately. WGS allows the simultaneous
detection and identification of species, distinction of strains within a species, characterisation of
lineages, indirect assessment of capsular and other phenotypes, and definition of core and accessory
AMR and virulence determinants. It also allows the unravelling of high-resolution strain
relatedness with which to assess evidence of microbial transmission.
Given the genetic diversity and complexity of K. pneumoniae and A. baumannii as clinical pathogens,
WGS is rapidly becoming a fundamental tool for epidemiology and surveillance of such pathogens.
Genomic surveillance has already elucidated patterns of clonal spread at local, regional and global levels
and uncovered insights into the burden and geographic spread of K. pneumoniae and A. baumannii
HAIs 1–6. In particular, detecting novel AMR mechanisms, monitoring the emergence and
dissemination of high-risk clones, tracking the convergence of AMR and hypervirulence traits, and
exploring links between clinical infections and potential ecological reservoirs (the environment and
zoonoses being the two main examples of such reservoirs) are among the major WGS-resolvable
issues for genomic surveillance.
We successfully exploited WGS in order to study the presence and dynamics of AMR and virulence
genes, the population structure and the genomic epidemiology of clinical isolates of the two
aforementioned species from European and Asian countries, especially in regions and institutes
where AMR levels are very high. A major limitation of our studies was that in some cases the
selection criteria for isolates were not properly designed and our study design had to be somewhat
opportunistic. For instance, studies of colistin-resistant isolates should also have included several
colistin-susceptible isolates from the same time period and geographic locales for comparative purposes.
Concerning the longitudinal study of K. pneumoniae isolates within the Chinese hospital H301, randomly selecting
isolates during a time frame of 15 years led to an overly diverse population, including several isolates
with little clinical relevance. An alternative would have been to select only MDR
pathogens and hypermucoviscous pathogens, the latter being positive in a simple string test. Such a
collection would be focused on the characterisation of hypervirulence, MDR and the convergence of
the two traits, which is an emerging and serious threat, especially in Asia 3.
Another limitation of our studies was that patient information was generally scarce, and this was
not solely due to patient data privacy constraints. For instance, data about colistin
administration were not available for patients included in either our Greek or Serbian studies on
colistin resistance (Chapter 2 and 3). Concerning the Chinese H301 study (Chapter 4), the scarcity of
patient data did not allow investigation of outbreaks within the hospital, a task for which WGS has
proven to be well suited 7.
The usefulness of WGS for outbreak investigations transcends its initial purpose. Local hospital
outbreak studies have shown, for example, that MDR strains are more transmissible than susceptible
ones in hospitals 2,8 and revealed hospital plumbing systems as a source of prolonged outbreaks 9,
providing useful information for intervention and prevention strategies.
WGS investigations also underscored the importance of the dynamics of plasmids and other mobile
genetic elements in hospital outbreaks 10. In particular, K. pneumoniae was shown to play a leading
role as both donor and recipient for mobilisable AMR genes 9–12. The increasing awareness of the
potential for so-called plasmid outbreaks is transforming our view of how screening and
surveillance of pathogens should be handled, moving the focus from individual pathogens to the
enzymes and plasmids that make the pathogen a threat.
Therefore, plasmid analysis is of major importance, and resolving the dynamics of mobile genetic
elements represents the next step to further improve genomic surveillance. Obtaining complete
assemblies is a major prerequisite for studying plasmids. The low cost and high accuracy of
Illumina short-read sequencing technology make it well suited for high-throughput bacterial
genomics. This led Illumina sequencing to become the dominant technology for WGS of bacterial
pathogens 13. However, short-read sequencing cannot resolve all genomic repeats, which are
particularly abundant in mobile genetic elements. Therefore, fragmented assemblies of hundreds of
contigs are often the best possible outcome. Where needed, short-read sequencing can be
combined with long-read sequencing data obtained with technologies such as PacBio SMRT and ONT
nanopore sequencing 14,15, resulting in more complete genomes. We obtained the
complete genome of a K. pneumoniae strain by combining Illumina short reads with ONT
MinION long reads (Chapter 2). While this approach is known to be quite costly, multiplexing can
result in a final complete assembly for about 150 USD per strain 15. Therefore, the selection
of a subset of representative strains for long-read sequencing in genomic surveillance studies,
especially those involving outbreaks, should be a major priority. This will limit the overall number of contigs
and facilitate inter-genomic comparisons, since alignment between genomes will become
easier.
Currently, WGS in clinical or public health microbiology laboratories is mainly used as a typing tool to
inform on the genetic relatedness of strains, which can be used for local and global epidemiological
studies. However, WGS also holds considerable promise for AST 16. Sequence data can be queried to
identify the presence of both acquired antibiotic resistance genes and chromosomal mutations that
contribute to antibiotic resistance.
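A minimal sketch of such a query is given below: a rule table maps each drug to known acquired genes and chromosomal loci, and an isolate is reported as resistant when any of them is detected. The rule table is illustrative and far from exhaustive, and the names RULES and predict are assumptions for this example rather than part of any published tool. Because the mechanisms for some drugs are only partly understood, the absence of a hit is reported as "no known determinant" rather than as susceptibility.

```python
# Illustrative rule table (an assumption for this sketch, not an exhaustive catalogue).
RULES = {
    "carbapenems": {"acquired_genes": {"blaOXA-48", "blaKPC-2"}, "mutated_loci": set()},
    "colistin": {"acquired_genes": {"mcr-1"}, "mutated_loci": {"mgrB", "pmrB"}},
}


def predict(drug, acquired_genes, mutated_loci):
    """Return (call, evidence) for one drug.

    acquired_genes: set of acquired resistance genes detected in the genome.
    mutated_loci: set of chromosomal loci carrying resistance-associated alterations.
    """
    rule = RULES[drug]
    hits = (rule["acquired_genes"] & acquired_genes) | (rule["mutated_loci"] & mutated_loci)
    evidence = sorted(hits)
    call = "R" if evidence else "no known determinant"
    return call, evidence


# Hypothetical isolate: carries blaOXA-48 and an altered mgrB gene.
genes = {"blaOXA-48", "blaCTX-M-15"}
loci = {"mgrB"}
for drug in RULES:
    print(drug, predict(drug, genes, loci))
```

Returning the evidence alongside the call keeps the prediction traceable to the genetic determinant that triggered it.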
We were able to correlate the AST data with genomic data, also for some drugs where the underlying
genetic AMR mechanisms are still not completely understood, such as in the case of colistin. ML
algorithms for in silico prediction of antimicrobial susceptibilities are also promising, and we showed
that the data interpretability could still be augmented (Chapter 5). Unfortunately, we were only able
to test our ML algorithm on a single species. Extending the test to other bacterial species is an urgent
and mandatory next step, and focusing on A. baumannii will be a priority, given that only a few
reports on genomic AMR detection exist for this pathogen 17,18.
An overall limitation of our work (but also in general for similar work by third parties) was that the
genome collections were relatively small and focused on limited geographical locations. A further
improvement would be to expand the genome collection. ML algorithms generally require big
databases where genome sequences are combined with their respective AST data 19. While there is a
lot of public data available for several species/drug combinations 20,21, this is not the case for recently
introduced drugs or for drugs requiring a testing method that cannot be automated. Therefore,
testing ML algorithms for the prediction of susceptibility to such molecules is currently not
possible. Efforts should be made to collect existing and prospective data on such drugs in order to
create an exhaustive database suitable for ML testing, also potentially enabling the discovery of
novel AMR mechanisms.
Finally, a next step in our work will be to translate the pipeline from the R language to the Python
language. While R is one of the most popular programming languages, it is rapidly being replaced by Python
among researchers. More generally, bioinformatics is a young scientific domain. There are no globally
accepted pipelines, either for epidemiological analyses or for establishing resistance and
virulence profiles. Different researchers use different systems, which hinders
standardisation and the common use of certain tools. The few comparative studies available show that
this is risky: different bioinformatic tools applied to the same dataset can lead
to different interpretations 22. This aspect should be taken very seriously and further research in tool
comparison is much needed. Resolving this irreproducibility will ultimately allow the development of
diagnostic assays that will withstand scrutiny by regulatory agencies such as the FDA.
WGS is enabling the discovery of novel AMR mechanisms with ease and at an unprecedented speed,
also uncovering novel reservoirs of AMR genes in the environment and the animal domain. We
identified and characterized a novel family of carbapenemase enzymes produced by environmental
isolates (Chapter 6). It is well known that environmental bacterial species constitute an important
reservoir of antimicrobial resistance genes 23. More research is needed to fill knowledge gaps and
assess the potential risk antimicrobials and resistant bacteria in the environment pose to human
health and the broader environmental ecosystem. Though this was not the main purpose of this
thesis, a next step should be to exploit metagenomics, the study of genetic material recovered
directly from environmental samples or essentially any niche-specific sample containing microbes,
which has already been demonstrated to be a fundamental tool in elucidating the strong
correlation between AMR and the environment 24.
Finally, the novel insights obtained in the present thesis add to existing knowledge,
allowing a step forward towards a complete implementation of WGS in clinical microbiology
laboratories and towards a WGS-based antimicrobial susceptibility test. The major obstacles that still
exist include the current lack of rapidity, the still elevated costs of sequencing-based assays, the
fact that the technologies are still developing rapidly and, consequently, the lack of generally
accepted FDA-approved tests. The diagnostic microbiology community is aware of these fundamental
problems and many initiatives to solve them are underway. The coming 5 years will
see solutions appear and there will be rapid, cheap and reproducible tests available for all to use.
Such tests are going to be based solely upon NGS technology and will have a major positive impact
on the quality of the diagnostic laboratory.
7.3 References
1. Long SW, Olsen RJ, Eagar TN, et al. Population genomic analysis of 1,777 extended-spectrum beta-
lactamase-producing Klebsiella pneumoniae isolates, Houston, Texas: Unexpected abundance of
clonal group 307. MBio 2017; 8.
2. David S, Reuter S, Harris SR, et al. Epidemic of carbapenem-resistant Klebsiella pneumoniae in
Europe is driven by nosocomial spread. Nat Microbiol 2019.
3. Wyres KL, Nguyen TNT, Lam MMC, et al. Genomic surveillance for hypervirulence and multi-drug
resistance in invasive Klebsiella pneumoniae from South and Southeast Asia. Genome Med 2020; 12:
11.
4. Heinz E, Brindle R, Morgan-McCalla A, Peters K, Thomson NR. Caribbean multi-centre study of
Klebsiella pneumoniae: whole-genome sequencing, antimicrobial resistance and virulence factors.
Microb Genomics 2019.
5. Ellington MJ, Heinz E, Wailan AM, et al. Contrasting patterns of longitudinal population dynamics
and antimicrobial resistance mechanisms in two priority bacterial pathogens over 7 years in a single
center. Genome Biol 2019; 20: 1–16.
6. Wright MS, Haft DH, Harkins DM, et al. New Insights into Dissemination and Variation of the
Health Care-Associated Pathogen Acinetobacter baumannii from Genomic Analysis. 2014; 5: 1–13.
7. Quainoo S, Coolen JPM, van Hijum SAFT, et al. Whole-genome sequencing of bacterial pathogens:
The future of nosocomial outbreak analysis. Clin Microbiol Rev 2017; 30: 1015–63.
8. Gorrie CL, Mirceta M, Wick RR, et al. Antimicrobial-Resistant Klebsiella Pneumoniae Carriage and
Infection in Specialized Geriatric Care Wards Linked to Acquisition in the Referring Hospital. Clin
Infect Dis 2018; Jul 15; 67.
9. Weingarten RA, Johnson RC, Conlan S, et al. Genomic analysis of hospital plumbing reveals diverse
reservoir of bacterial plasmids conferring carbapenem resistance. MBio 2018; 9.
10. Martin J, Phan HTT, Findlay J, et al. Covert dissemination of carbapenemase-producing Klebsiella
pneumoniae (KPC) in a successfully controlled outbreak: Long- and short-read whole-genome
sequencing demonstrate multiple genetic modes of transmission. J Antimicrob Chemother 2017; 72:
3025–34.
11. Conlan S, Park M, Deming C, et al. Plasmid dynamics in KPC-positive Klebsiella pneumoniae during
long-term patient colonization. MBio 2016; 7.
12. Sheppard AE, Stoesser N, Wilson DJ, et al. Nested Russian doll-like genetic mobility drives rapid
dissemination of the carbapenem resistance gene blaKPC. Antimicrob Agents Chemother 2016; 60:
3767–78.
13. Kwong JC, McCallum N, Sintchenko V, Howden BP. Whole genome sequencing in clinical and
public health microbiology. Pathology 2015; 47: 199–210.
14. Zhang L, Li Y, Shen W, Wang SM, Wang G, Zhou Y. Whole-genome sequence of a carbapenem-
resistant hypermucoviscous Klebsiella pneumoniae isolate SWU01 with capsular serotype K47
belonging to ST11 from a patient in China. J Glob Antimicrob Resist 2017; 11: 87–9.
15. Wick RR, Judd LM, Gorrie CL, Holt KE. Completing bacterial genome assemblies with multiplex
MinION sequencing. Microb Genomics 2017: 0–6.
16. Schürch AC, van Schaik W. Challenges and opportunities for whole-genome sequencing–based
surveillance of antibiotic resistance. Ann N Y Acad Sci 2017; 1388: 108–20.
17. Ellington MJ, Ekelund O, Aarestrup FM, et al. The role of whole genome sequencing in
antimicrobial susceptibility testing of bacteria: report from the EUCAST Subcommittee. Clin Microbiol
Infect 2017; 23: 2–22.
18. Drouin A, Letarte G, Raymond F, Marchand M, Corbeil J, Laviolette F. Interpretable genotype-to-
phenotype classifiers with performance guarantees. Sci Rep 2019; 9.
19. Su M, Satola SW, Read TD. Genome-based prediction of bacterial antibiotic resistance. J Clin
Microbiol 2019; 57: 1–15.
20. Walker TM, Kohl TA, Omar SV, et al. Whole-genome sequencing for prediction of
Mycobacterium tuberculosis drug susceptibility and resistance: A retrospective cohort study. Lancet
Infect Dis 2015; 15: 1193–202.
21. Gordon NC, Price JR, Cole K, et al. Prediction of Staphylococcus aureus Antimicrobial Resistance
by Whole-Genome Sequencing. J Clin Microbiol 2014; 52: 1182–91.
22. Doyle RM, O’Sullivan DM, Aller SD, et al. Discordant bioinformatic predictions of
antimicrobial resistance from whole-genome sequencing data of bacterial isolates: an
inter-laboratory study. Microb Genomics 2020; 6: e000335.
23. D’Costa VM, King CE, Kalan L, et al. Antibiotic resistance is ancient. Nature 2011; 477: 457–61.
24. Forbes JD, Knox NC, Ronholm J, Pagotto F, Reimer A. Metagenomics: The next culture-
independent game changer. Front Microbiol 2017; 8.
Acknowledgments
Firstly, I would like to express my sincere gratitude to my supervisors Prof. Alex van Belkum and Dr.
Caroline Mirande for their continuous support of my PhD study and related research, and for their
patience, motivation, and valuable suggestions. Their guidance helped me throughout the research
and writing of this thesis. I would also like to thank the bioMérieux employees who made my access
to the research facilities and laboratory simpler.
I thank the members of the doctoral committee, Dr. Arvid Suls, Prof. Annelies Van Rie, Prof. Herman
Goossens and Dr. Pieter Moons, for reading my work and for their time. Special thanks to Prof.
Herman Goossens, my promotor, for allowing me to get through my PhD and for his support, and to
Dr. Pieter Moons, for the great organization of the ND4ID project. I also thank all the people involved in the
ND4ID project, including PIs and PhD students.
I thank the other PhD students in bioMérieux, Andreu, Manisha and Rucha for their support and
chats.
My sincere thanks also go to Prof. Marco Maria D’Andrea and Prof. Gian Maria Rossolini for the
nice collaborations, for all the suggestions and writing tips.
I thank very much Dr. Kelly Wyres and Prof. Kathryn Holt for the wonderful collaboration, for their
great suggestions and for helping me make sense of the research results.
I am also pleased to thank Dr. Laurent Poirel and Prof. Patrice Nordmann, who gave
me the opportunity to join their team for a few months, and who gave me access to the laboratory and
research facilities. Thanks also to all the lab members who helped me during that time period.
I thank Dr. Pierre Mahé and Dr. Magali Jaillard for involving me in a wonderful project and teaching
me new things.
I thank Prof. Ivana Cirkovic, Dr. Nicholas Legakis and Prof. Luo Yan Ping for providing me with the
clinical isolates and for their involvement in the writing of the manuscripts.
Finally, I am grateful to my family, friends and acquaintances for supporting me throughout
the writing of this thesis and in my life in general. Special thanks to my girlfriend, Amandine, for always
being there.