Predictions of antibiotic resistance genes in the human intestinal microbiota
1
Journée des bioinformaticiens de Jouy-en-Josas February 5, 2014
Etienne Ruppé
2
Not considered as ARDs: gene(s) which, when mutated, cause a decrease in the susceptibility to at least one antibiotic (e.g. mutated DNA gyrases cause fluoroquinolone resistance)
Adapted from Levy SB, et al. 2004. Nat Med. 10:S122-129.
Figure labels: antibiotic efflux pump; antibiotic-altering enzyme; antibiotic-degrading enzyme; antibiotic-unsusceptible metabolic pathway; target site protection; chromosome; mobile genetic element
Considered ARDs: gene(s) which, when expressed, cause a decrease in the susceptibility to at least one antibiotic (e.g. beta-lactamases hydrolyze beta-lactams)
Figure labels: mutation; porin
What do we mean by antibiotic resistance determinants (ARDs)?
3
Any ARDs in the intestinal microbiota?
Only a few hundred ARDs among millions of genes?
> Issue of the ARD database (ARDB not curated)
> Issue of the search method
Forslund, K. et al.(2013) Genome Res; Ghosh, TS. et al. (2013) PloS One; Hu, Y. et al. (2013) Nat Comm
Study | Forslund, K. et al. | Ghosh, TS. et al. | Hu, Y. et al.
Journal | Genome Research | PLoS One | Nature Communications
Published in | 2013 | 2013 | 2013
N individuals | 252 | 257 | 162
Origin of individuals | American, Danish, Spanish | American, Danish, French, Italian, Japanese, Spanish, Indian, Chinese | Danish, Spanish, Chinese
ARD reference database | ARDB (enriched in-house) | ARDB | ARDB
Search algorithm | Blastn | Blastx | Blastp
N unique ARDs (>95%) | 100 | 157 | 156
Beta-lactamases | TEM, SHV, AmpC_E. coli, CCRA, CBLA, CFXA, CEPA | TEM, LEN, SHV, OXY, CTX-M, CFXA, CBLA, CEPA, AmpC_E. coli, CMY-2 | KPC, ROB, TEM, CTX-M, OXY, PER, SHV, CARB, PSE, LCR, OXA-1, OXA, SME, L1, IMP
4
1Sommer, M.O., et al. Science 325, 1128-1131 (2009).
Figure: amino acid identity (NCBI); categories: metagenome, aerobic culture, both
A hunch from functional metagenomics… ARDs from the intestinal microbiota might differ from
those in the databases
Non-cultivable: under the radar
Cultivable: ARD databases
If one dimension cannot make it, try 3D
5
Indeed, proteins with low shared identity can have the same structure
PER-1 and CTX-M-15 share no more than 21% amino acid identity (one dimension)
Alignment of CTX-M-15 and PER-1, positions 1 to 140:
CTX_M_15_JQ686199  MVKKSLRQFTLMATATVTLLLGSVPLYAQTADVQQKLAELERQSGGRLGVALINTADNSQILYRADERFAMCSTSKVMAAAAVLKKSESEPNLLNQRVEIKKSDLVN--YNPIAEKHVN--GTMSLAELSAAALQYSDNVAMNKLIAHVGGPASVTA
PER_1_EF5356       --MNVIIKAVVTASTLLMVSFSSFETSAQSPLLKEQIESIVIGKKATVGVAVWGPDDLEPLLINPFEKFPMQSVFKLHLAMLVLHQVDQGKLDLNQTVIVNRAKVLQNTWAPIMKAYQGDEFSVPVQQLLQYSVSHSDNVACDLLFELVGGPAALHD
Consensus          I L AS L L S AQS L I I A LGVAL D IL EKF M S KL A VL D LNQ V I KA LLN W PI H SM L L AL HSDNVA L VGGPAAL
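A percent identity such as the 21% quoted for PER-1 and CTX-M-15 can be computed from a gapped pairwise alignment. The sketch below is only illustrative: it assumes identity is counted over columns where both sequences carry a residue (other tools divide by a different length), and the function name and toy sequences are made up for the example.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" or res_b == "-":
            continue  # gap columns are not counted (assumption of this sketch)
        compared += 1
        if res_a == res_b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


# Toy alignment, not the real CTX-M-15 / PER-1 sequences:
# 5 ungapped columns, 3 of them identical.
print(percent_identity("MKV-LA", "MRVQLG"))  # 60.0
```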
But they do share high structural homology in three dimensions!
Figure: structural alignment of CTX-M-15 and PER-1 (TM-score > 0.9)
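The TM-score quoted for this structural alignment ranges from 0 to 1, with 1 for identical structures. The sketch below assumes the standard TM-score definition and assumes that the distances between aligned residue pairs (in angstroms, after superposition) are already available. The real score is the maximum over all superpositions, which this sketch does not search for. The function names and the lower bound on d0 are choices made for the example.

```python
def tm_d0(target_length: int) -> float:
    """Length-dependent distance scale d0 (angstroms) of the TM-score."""
    if target_length <= 15:
        raise ValueError("target_length must be greater than 15")
    d0 = 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8
    return max(d0, 0.5)  # guard for very short chains (assumption of this sketch)


def tm_score(distances, target_length: int) -> float:
    """TM-score of one given superposition.

    distances: distance of each aligned residue pair, in angstroms.
    target_length: number of residues of the target protein (normalisation).
    """
    d0 = tm_d0(target_length)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length


# A perfect superposition of all 100 residues scores 1.0;
# aligning only half of the residues perfectly scores 0.5.
print(tm_score([0.0] * 100, 100))
print(tm_score([0.0] * 50, 100))
```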
6
Homology modeling
MNIIDIVAIIPYFITLGAT
Template alignment
Backbone generation
Loop modeling
Side chain modeling
Model optimization
7
8
So why not use 3D instead of 1D to identify ARDs in the intestinal microbiota?
ISSUES
1. Which 3D scores/cut-offs should be used to identify ARDs among candidates?
2. Online tools (I-TASSER) do not allow high throughput: one model per 24 h, for thousands of candidates
3. 3D modeling requires substantial computational resources
SOLUTIONS
1. Use a pairwise comparative process
2. Use software that performs fast and accurate modeling
3. Use local/national clusters
9
Concept of pairwise comparative modeling (PCM)
Inputs: ARD reference sequences and templates, negative reference sequences and templates, protein catalogue
1. Find sequence homologues in the protein catalogue: Hmmer, Blastp, SSearch
2. Filtering candidates: size (75% < X < 125% of reference mean size)
3. Homology modeling of candidates
4. Candidates vs ARD reference templates, candidates vs negative templates
5. Score analysis
6. Prediction
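The size filter in this workflow (75% < X < 125% of reference mean size) can be sketched as follows. This is a minimal illustration, assuming sizes are protein lengths in amino acids and that both bounds are strict, as the slide writes them; the function name and the example lengths are hypothetical.

```python
from statistics import mean


def passes_size_filter(candidate_length, reference_lengths, lower=0.75, upper=1.25):
    """True if the candidate size lies strictly between 75% and 125%
    of the mean size of the reference sequences."""
    reference_mean = mean(reference_lengths)
    return lower * reference_mean < candidate_length < upper * reference_mean


# Hypothetical reference lengths (mean 290): accepted window is 217.5 to 362.5.
references = [280, 290, 300]
print(passes_size_filter(286, references))  # inside the window
print(passes_size_filter(150, references))  # too short
print(passes_size_filter(400, references))  # too long
```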
Positive reference
sequences
Functional annotation with eggNOG v3
Subtraction of the COG/NOG of the positive reference sequences
Candidates
Selection of the ‘leading’ proteins of
the remaining groups
Curation
Negative reference sequences
Selection of the negative references
10
Positive and negative
references
Homology modelling with ARD reference
Homology modelling with negative reference
Structural model candidate vs
reference
Structural model candidate vs negative
reference
Model construction with logistic regression and machine learning1
Universal model construction
1. 1000 bootstraps, 10-fold cross-validation
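As a rough illustration of the logistic regression step, the sketch below fits a two-feature model by plain gradient descent. It assumes each protein is described by two hypothetical structural similarity scores, one against an ARD reference template and one against a negative reference template; the slides do not specify the features actually used. The training values are invented, and bootstrapping and cross-validation are left out.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_probability(weights, bias, x):
    """Predicted probability that x belongs to the positive class (ARD)."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


def fit_logistic(features, labels, learning_rate=0.5, epochs=2000):
    """Fit logistic regression by batch gradient descent on the log-loss."""
    n_samples = len(features)
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(features, labels):
            error = predict_probability(weights, bias, x) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - learning_rate * g / n_samples for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n_samples
    return weights, bias


# Invented scores: (score vs ARD template, score vs negative template).
training_scores = [(0.90, 0.30), (0.85, 0.35), (0.80, 0.40), (0.95, 0.25),
                   (0.30, 0.80), (0.40, 0.85), (0.35, 0.90), (0.45, 0.75)]
training_labels = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = ARD reference, 0 = negative reference

weights, bias = fit_logistic(training_scores, training_labels)
print(predict_probability(weights, bias, (0.88, 0.32)))  # closer to the ARD template
print(predict_probability(weights, bias, (0.33, 0.82)))  # closer to the negative template
```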
11
Statistical model: high discrimination between ARDs and negative references
AUC = 0.98 (662 ARD references, 522 negative references)
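The AUC reported here measures how often a randomly chosen ARD reference receives a higher prediction score than a randomly chosen negative reference. A minimal sketch of that pairwise definition follows; the function name and the scores are invented for the example, and ties count as one half.

```python
def roc_auc(scores_positive, scores_negative):
    """Area under the ROC curve as the fraction of (positive, negative)
    pairs in which the positive has the higher score (ties count 0.5)."""
    wins = 0.0
    for p in scores_positive:
        for n in scores_negative:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_positive) * len(scores_negative))


# Invented scores: 8 of the 9 pairs are ranked correctly.
ard_scores = [0.9, 0.8, 0.4]
negative_scores = [0.5, 0.3, 0.2]
print(roc_auc(ard_scores, negative_scores))
```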
12
Statistical model
13
14
An example of prediction: class A beta-lactamase
TEM (PDB 1TEM)
Candidate from Akkermansia (36% identity with VEB-1): 86% predicted as class A beta-lactamase
Figure panels: Blaa reference templates; negative reference templates
15
Qnr (PDB 2W7Z)
Candidate from Enterococcus (21% identity with QnrA4): 95% predicted as Qnr
An example of prediction: Qnr
Figure panels: Qnr reference templates; negative reference templates
Prediction score and identity with reference ARD
16
17
Validation with an external dataset
1Forsberg, KJ., et al. Nature, 29; 509 (2014)
2Gibson, MK., et al. ISME J. 1;207 (2015)
3The same ARD families as those considered in the PCM and the conventional method were searched. Candidates outside the size ranges were discarded.
Functional metagenomic study from soils: 4654 inserts containing ARD1, 12,904 ORFs
Flat search: 1674 hits
- 1391 candidates for PCM
- 210 from inserts with no identified ARDs
- 73 from inserts with >1 annotated ARDs
PCM: 1391 candidates, 1380 (99.2%) predicted as ARDs
Conventional method (80% identity): 1391 candidates, 9 predicted as ARDs
Resfams2: 1345 predicted as ARDs3
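The counts on this slide can be cross-checked with simple arithmetic. The sketch below assumes one reading of the flowchart: 9 predictions belong to the conventional method (80% identity), 1345 to Resfams, and the three groups of hits add up to the flat-search total. Only the 99.2% figure is stated on the slide; the other percentages are derived here.

```python
def percent(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


flat_search_hits = 1674
pcm_candidates = 1391
hits_from_inserts_without_ard = 210
hits_from_inserts_with_several_ards = 73

# The three groups of hits add up to the flat-search total.
assert (pcm_candidates + hits_from_inserts_without_ard
        + hits_from_inserts_with_several_ards) == flat_search_hits

print(percent(1380, pcm_candidates))  # PCM: 99.2, as stated on the slide
print(percent(1345, pcm_candidates))  # Resfams, under the reading assumed above
print(percent(9, pcm_candidates))     # conventional method, under the same reading
```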
18
Worth a second round?
Class A beta-lactamases: 204 predictions added to the reference dataset
New PCM round
1334 candidates, 933 after removing 1st round candidates
70 predicted as class A beta-lactamases
863 not predicted as class A beta-lactamases
TEM-1 (green) aligned with a blaa obtained during the 2nd round
19
Influence of the size
Figure (axis 0 to 3500). Labels: Total number of PCM; Candidates; Predicted as ARDs
Summary of the predictions for December 2014
20
~18,000 PCM performed (36,000 structures)
~2 years full time on an 8-CPU cluster
Partners involved:
- The local MetaGenoPolis cluster
- The Jouy-en-Josas INRA cluster (Migale)
- The Toulouse INRA cluster (Genotoul)
- The Roscoff INRA cluster
- The Genouest cluster
21
Do people cluster according to their resistome? Concept of ‘resistotypes’
Ding & Schloss. Nature. 2014 May 15;509(7500)
Figure: resistotypes 1 to 6
22
Resistotypes are linked to gene richness, as enterotypes are
23
Resistotype 4 is associated with low gene richness
Resistotypes are not associated with age, gender, or body mass index
MetaHIT subjects ARDs
24
RLQ ordination method. Dray, S., et al. Ecology. 2014 Jan;95(1):14-21.
Figure labels: ARD families aac2, aac6, ant, aph, blaa, blab1, blab3, blac, blad, qnr, van; time points: before ATB, right after ATB, at distance from ATB
Preliminary results from Utrecht
25
What is the interplay between the resistome and the effect of antibiotics on
the intestinal microbiota?
Figure labels: resistotypes 1 to 6; antibiotic exposure; altered microbiota
26
Some commensal bacteria do provide protection against antibiotics
Stiefel, U. et al. 2014 Aug;58(8):4535-42
27
28
Perspectives
What if ARDs from the dominant microbiota were actually good for us? If so, which genes/species exert this altruistic effect?
Data from EvoTAR cohorts should contain all the answers!
Thank you for your attention!
Resistant commensals can protect against colonization by exogenous pathogens/resistant bacteria
ARDs from the dominant microbiota have barely been isolated in pathogens despite close contact and antibiotic (ATB) pressure