A systematic evaluation of Mycobacterium tuberculosis Genome-Scale
Metabolic Networks
Víctor A López-Agudelo 1,2, Emma Laing 3, Tom A Mendum 3, Andres Baena 2,4, Luis F
Barrera 2,5, Dany JV Beste 3,O and Rigoberto Rios-Estepa 1,*
1 Grupo de Bioprocesos, Departamento de Ingeniería Química, Universidad de Antioquia UdeA,
Calle 70 No. 52-21, Medellín, Colombia
2 Grupo de Inmunología Celular e Inmunogenética (GICIG), Facultad de Medicina, Universidad
de Antioquia UdeA, Calle 70 No. 52-21, Medellín, Colombia
3 Department of Microbial Sciences, Faculty of Health and Medical Sciences, University of
Surrey, Guildford GU2 7XH, UK
4 Departamento de Microbiología y Parasitología, Facultad de Medicina, Universidad de
Antioquia UdeA, Calle 70 No. 52-21, Medellín, Colombia
5 Instituto de Investigaciones Médicas, Facultad de Medicina, Universidad de Antioquia UdeA,
Calle 70 No. 52-21, Medellín, Colombia
o Corresponding Author: [email protected]
* Corresponding Author: [email protected]
Abstract
The metabolism of the causative agent of TB, Mycobacterium tuberculosis (Mtb), has recently
re-emerged as an attractive drug target. A powerful approach to study Mtb metabolism is to use a
systems biology framework, such as a Genome-Scale Metabolic Network (GSMN), that allows the
dynamic interactions of the many individual components of metabolism to be interrogated together.
Several GSMNs have been constructed for Mtb and used to study the complex relationship
between Mtb genotype and phenotype. However, their utility is hampered by the existence of
multiple models of varying properties and performances. Here we systematically evaluate eight
recently published metabolic models of Mtb-H37Rv to facilitate model choice. The best performing
models, sMtb2018 and iEK1011, were refined and improved for use in future studies by the TB
research community.
bioRxiv preprint doi: https://doi.org/10.1101/837401; this version posted November 10, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license (http://creativecommons.org/licenses/by/4.0/).
Introduction
Mycobacterium tuberculosis (Mtb) is the causative bacterial agent of the global tuberculosis (TB)
epidemic, which is now the biggest infectious disease killer worldwide, causing 1.6 million deaths in
2017 alone [1]. Mtb is an unusual bacterial pathogen, as it is able to cause both acute life-threatening
disease and clinically latent infections that can persist for the lifetime of the human host [2,3].
Metabolic reprogramming in response to the host niche during both the acute and the chronic phase
of TB infections is a crucial determinant of virulence [4–7]. With the worldwide spread of multi- and
extensively drug-resistant strains of Mtb thwarting the control of this global emergency, new drugs against
Mtb are urgently needed and metabolism presents an attractive target [8,9].
Genome-scale constraint-based modelling has proved to be a powerful method to probe the
metabolism of Mtb. The first Genome Scale Metabolic Networks (GSMNs) of Mtb were published
in 2007 by Beste (GSMN-TB) [10] and Jamshidi (iNJ661) [11]. They have been used as a platform for
interrogating high-throughput ‘omics’ data, simulating bacterial growth, generating hypotheses and
drug discovery, and are also the backbone for subsequent model iterations [12–20]. In 2014, Rienksma
and colleagues developed a genome-scale model of Mtb metabolism (sMtb) building upon three
previously published models [15]. More recently, Rienksma and colleagues utilized RNA sequence
data to provide condition-specific biomass reactions and identify metabolic drug responses during
Mtb infection of human macrophages [21,22], thus providing a model more suited to exploring the
metabolism of Mtb during human infection. In 2018 the first consolidated genome-scale metabolic
network, iEK1011, was constructed using standardized nomenclature of metabolites and reactions
from the BiGG model database [23,24]. The utility of Mtb GSMNs will continue to increase as more
biological data are published and existing models are updated [25].
With so many well-annotated GSMNs of Mtb available (Fig 1), a crucial first step in any
genome-scale exploration of the metabolism of Mtb is the selection of an appropriate model. Here, we
systematically evaluate the performance of eight recently published GSMNs of Mtb-H37Rv. In
addition to comparing the metrics of the models descriptively in terms of size, connectivity, number
of blocked reactions and gaps in the network, we also identify the thermodynamically infeasible
cycles and energy generating cycles that could significantly affect the accuracy of flux simulations. Using
Flux Balance Analysis (FBA) we performed growth analysis and compared the models’ ability to
predict gene essentiality when grown on different carbon and nitrogen sources, including growth on
cholesterol, which is thought to be the major carbon source for intracellular Mtb within its human
host macrophage.
This work provides an inventory of the available Mtb GSMNs and their utility in recapitulating aspects
of Mtb metabolism. In addition, we present updated versions of the highest performing models,
iEK1011 and sMtb2018 (iEK1011_2.0 and sMtb2.0), which provide multi-scale simulation platforms
for the TB research community and will therefore inform future studies on the metabolism
of this deadly pathogen.
Fig 1. The evolution of Genome Scale Metabolic models of Mtb. Models highlighted in
gray were analysed in this study. Numbers denote genes/intracellular reactions. Black
circles are indicative of merged Mtb models.
Results and Discussion
Descriptive evaluation of the models
Each of the GSMNs analysed in this study (Fig 1, S1 Appendix) combines knowledge from genome
annotations, literature and measured biochemical compositions of Mtb. The complex linkage between
genotype and phenotype is made by gene-protein-reaction (GPR) associations, implemented as
Boolean rules in order to connect gene functions to enzyme complexes, isozymes or promiscuous
enzymes, and then to biochemical reactions [26]. Using set theory, we computed the intersection
between all sets of the models’ genes (Fig 2 and S1 Table). In accordance with expectations, the
pairwise matrix (Fig 2) demonstrates that Mtb models constructed from the same ancestor (iNJ661 or
GSMN-TB) are more similar (Fig 1, Fig 2). By contrast the consolidated models iEK1011 and
sMtb2018 share gene similarities (>60%, 60%,
biosynthesis. Likewise, the iNJ661v_modified model (Fig 1) has poor annotation of genes involved
in lipid metabolism (e.g. β-oxidation, cholesterol degradation, fatty acid biosynthesis, lipid
biosynthesis and mycolic acid biosynthesis). These models therefore have limitations for in silico
simulation of Mtb growing on these physiologically relevant lipid sources and for modelling growth
in vivo.
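The pairwise gene comparison summarised in Fig 2 can be illustrated with a minimal sketch. This is not the script used in the study; the model names and gene lists below are hypothetical toy values, and the function name pairwise_shared_genes is an assumption made for illustration.

```python
from itertools import combinations


def pairwise_shared_genes(model_genes):
    """For every pair of models return the number of shared genes and the
    percentage of each model's gene set that the shared genes represent."""
    result = {}
    for a, b in combinations(sorted(model_genes), 2):
        shared = model_genes[a] & model_genes[b]  # set intersection
        result[(a, b)] = (
            len(shared),
            100.0 * len(shared) / len(model_genes[a]),
            100.0 * len(shared) / len(model_genes[b]),
        )
    return result


# Hypothetical toy gene sets (not taken from any published model).
toy_models = {
    "model_A": {"Rv0467", "Rv0896", "Rv1617"},
    "model_B": {"Rv0467", "Rv0896", "Rv0951", "Rv0952"},
}

shared = pairwise_shared_genes(toy_models)
n_shared, pct_of_a, pct_of_b = shared[("model_A", "model_B")]
print(n_shared, round(pct_of_a, 1), round(pct_of_b, 1))
```

Reporting the overlap as a percentage of each model separately keeps the comparison meaningful when the two gene sets differ in size.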
Fig 2. Pairwise matrix of shared genes among Mtb models. Values in black (genes in
common between the models), blue (genes specific to the model indicated), and red (genes
specific to the model indicated) represent the number and percentage of specific model
genes specified in the y- and x-axis.
Checking mass and charge balances of biochemical reactions
Currency metabolites like water, protons and ATP, and cofactors like NADH, NADPH, FADH2 and
CoA, are ubiquitous and essential for metabolism. The addition of these cofactor metabolites
to GSMNs and to the biomass reaction considerably improves phenotype predictions and is a
hallmark of good quality reconstructions [19,30]. To check this, we converted the
GSMNs into substance graphs (using a local script) where metabolites (nodes) are connected by
edges (undirected and unweighted) if they appear in the same reaction [31] and computed node
degrees (number of edges connected to the node) (Table 2 and S4 Table).
Table 2. Degree values for currency metabolites of Mtb GSMNs. PI: Phosphate, PPI:
diphosphate, ACP: acyl-carrier protein, MK: menaquinone.
Currency metabolite  GSMN-TB1.1  iOSDD890  sMtb  iCG760  iSM810  iNJ661v_mod  iEK1011  sMtb2018
H                    72          701       512   102     127     667          741      511
CO2                  150         202       266   146     160     193          204      268
H2O                  5           507       624   0       5       472          569      624
ATP                  236         353       320   264     242     295          315      320
AMP                  101         170       115   103     103     117          132      115
PI                   189         293       266   213     196     268          293      267
PPI                  175         235       180   177     180     168          196      181
COA                  175         163       230   175     190     142          194      230
ACP                  98          48        136   94      103     48           57       136
NADH                 102         141       262   104     113     111          176      263
NADPH                133         151       192   132     146     147          150      192
FADH2                40          47        55    42      48      21           62       55
MK                   37          20        11    37      37      20           18       20
O2                   37          76        64    40      41      68           91       65
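The degree values in Table 2 come from a substance graph, which can be sketched as follows. This is a minimal illustration rather than the local script used in the study. It assumes that the degree of a metabolite is the number of distinct metabolites it shares at least one reaction with; the reaction and metabolite identifiers are hypothetical toy values.

```python
from collections import defaultdict
from itertools import combinations


def substance_graph_degrees(reactions):
    """Build an undirected, unweighted substance graph in which two
    metabolites are linked if they co-occur in a reaction, and return the
    degree (number of distinct neighbours) of every metabolite."""
    neighbours = defaultdict(set)
    for metabolites in reactions.values():
        for m1, m2 in combinations(set(metabolites), 2):
            neighbours[m1].add(m2)
            neighbours[m2].add(m1)
    return {met: len(linked) for met, linked in neighbours.items()}


# Hypothetical toy network: reaction id -> participating metabolites.
toy_reactions = {
    "HEX": ["glc", "atp", "g6p", "adp", "h"],
    "PGI": ["g6p", "f6p"],
    "PFK": ["f6p", "atp", "fdp", "adp", "h"],
}

degrees = substance_graph_degrees(toy_reactions)
print(sorted(degrees.items(), key=lambda item: -item[1]))
```

In this toy network ATP has the highest degree, which mirrors the way currency metabolites dominate the degree distribution of a well-annotated reconstruction.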
From this analysis the degree values indicated that the GSMN-TB1.1, iCG760 and iSM810 models
have the lowest participation of currency metabolites (Table 2). Water and protons were the most
underrepresented metabolites (low degree values), indicating that these models may not be
correctly balanced. It is important that biochemical reactions are charge and mass balanced:
unbalanced reactions in GSMNs may allow proton or ATP production out of nothing [32]. In
order to test whether the Mtb GSMNs are mass and charge balanced, we used the COBRA
Toolbox function “checkMassChargeBalance” [33]. Unfortunately, we were not able to perform
this analysis for GSMN-TB1.1, iCG760 and iSM810 due to the lack of standard metabolite
formulas in these models (Table 3). iEK1011 has the lowest number of imbalanced reactions (4)
compared with sMtb2018 (8), sMtb (12), iNJ661v_modified (13) and iOSDD890 (78). The
majority of the imbalanced reactions belonged to cell wall biosynthetic pathways, including
arabinogalactan, peptidoglycan, and mycolic acid biosynthesis (S5 Table), reflecting the
difficulties in rebuilding accurate metabolite formulas for complex cell wall components.
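The idea behind a mass and charge balance check can be shown with a small stand-alone sketch. It is not the COBRA Toolbox implementation of checkMassChargeBalance; the function names, the simple formula parser (no brackets or generic R groups) and the hexokinase example with its formulas and charges are assumptions chosen for illustration.

```python
import re
from collections import Counter


def parse_formula(formula):
    """Count the atoms in a simple chemical formula such as 'C6H12O6'."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def imbalance(stoichiometry, formulas, charges):
    """Return (unbalanced elements, net charge) for one reaction.
    Substrates have negative coefficients, products positive ones.
    A balanced reaction gives ({}, 0)."""
    elements = Counter()
    net_charge = 0
    for met, coeff in stoichiometry.items():
        for element, n in parse_formula(formulas[met]).items():
            elements[element] += coeff * n
        net_charge += coeff * charges[met]
    return {e: n for e, n in elements.items() if n != 0}, net_charge


# Illustrative formulas and charges for a hexokinase-like reaction.
formulas = {"glc": "C6H12O6", "atp": "C10H12N5O13P3", "g6p": "C6H11O9P",
            "adp": "C10H12N5O10P2", "h": "H"}
charges = {"glc": 0, "atp": -4, "g6p": -2, "adp": -3, "h": 1}

hexokinase = {"glc": -1, "atp": -1, "g6p": 1, "adp": 1, "h": 1}
missing_proton = {"glc": -1, "atp": -1, "g6p": 1, "adp": 1}

print(imbalance(hexokinase, formulas, charges))
print(imbalance(missing_proton, formulas, charges))
```

Leaving out the proton produces both a hydrogen and a charge imbalance, which is the kind of error that a low proton degree in Table 2 hints at. The check cannot run at all when metabolite formulas are missing, as was the case for GSMN-TB1.1, iCG760 and iSM810.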
Another potential error in GSMNs is the molecular weight of biomass, which should be defined as
1 g/mmol. Discrepancy in biomass weight can arise as a result of unbalanced reactions, which affects
the reliability of flux predictions using Flux Balance Analysis (FBA). This effect can be amplified
when host-pathogen interactions are simulated by integrating host and pathogen metabolic models
[34]. Chang and colleagues (2017) developed a systematic algorithm for checking whether the biomass
molecular weight from GSMNs deviates from 1 g/mmol. Using this algorithm [34] we found
deviations of
Of those models that contained a pathway for cholesterol degradation (GSMN-TB1.1, iCG760,
iSM810, iEK1011, sMtb, and sMtb2018), the GSMN-TB1.1 cholesterol degrading pathway
contained a number of dead-end metabolites, making this model unsuitable for exploring the
metabolism of this important in vivo carbon source.
Table 3. Global features of the Mtb metabolic models analyzed in this study. DEM: Dead-End
Metabolite, Metabolites wo F: Metabolites without formula, Imbal. Reactions: Imbalanced Reactions,
URs: Unbounded Reactions, TICs: Thermodynamically Infeasible Cycles, diss. Flux: dissipation
flux, MW: Molecular Weight, UD: Undetermined.
                         GSMN-TB1.1   iOSDD890     sMtb         iCG760       iSM810       iNJ661v_mod  iEK1011      sMtb2018
Reactions                876          1152         1311         965          938          1054         1228         1321
Intracellular Reactions  876          1055         1192         864          938          956          1118         1200
Metabolites              667          961          1047         754          724          840          998          1049
DEMs                     25           162          33           84           53           90           110          34
Blocked Reactions        98 (11%)     290 (25%)    92 (7%)      89 (9%)      117 (12%)    153 (14%)    138 (11%)    92 (7%)
Metabolites wo F         667          0            2            754          724          0            4            2
Imbal. Reactions         UD           78           12           UD           UD           13           4            8
URs                      27 (3%)      52 (5%)      70 (6%)      45 (5%)      33 (3.5%)    67 (7%)      20 (2%)      75 (6%)
TICs                     8            17           16           8            8            23           4            17
ATP diss. Flux           0            0            0            0.33         0            0            0            0
GTP diss. Flux           0            0            0            0.33         0            0            0            0
CTP diss. Flux           0            0            0            0.33         0            0            0            0
UTP diss. Flux           0            0            0            0.33         0            0            0            0
Biomass MW               UD           1.0070       1.0126       UD           UD           1.0094       1.0937       1.0126
Thermodynamic and energetic properties
Integrating thermodynamic data into GSMNs is extremely useful in order to check the feasibility of
reactions and their directionality [43,44]. Although Mtb GSMNs have been built using
thermodynamic information, current Mtb GSMNs have never been checked for infeasible internal
flux cycles (also named type III pathways), which are sets of reactions that do not exchange metabolites with
the surroundings and therefore violate the second law of thermodynamics [43,45,46]. A tractable way
to identify reactions participating in these thermodynamically infeasible cycles (TICs) is by defining
the set of reactions required for an unbounded metabolic flux under finite or even zero substrate
uptake inputs. Using Flux Variability Analysis (FVA), the Unbounded Reactions (URs) can be
identified as those reactions with fluxes at the upper and/or lower bound constraints. Thus we
identified the thermodynamically infeasible cycles (TICs) using a local script following methodologies
based on FVA and the analysis of the null space of the stoichiometric matrices (S7 Table) [44,47].
Using this approach we show that models descended from GSMN-TB (GSMN-TB1.1, iCG760, and
iSM810) have a lower percentage of unbounded reactions as compared with models descended from
iNJ661 (iOSDD890, iNJ661v_modified). Interestingly, the sMtb2018 model has an increased number of
unbounded reactions as compared to the original sMtb (Table 3).
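The null-space part of this analysis can be illustrated on a toy network. The sketch below is not the local script used in the study. It assumes a stoichiometric matrix that contains internal reactions only (exchange reactions removed), and it ignores reaction directionality and flux bounds, which the FVA step accounts for. The metabolites A to D and the reactions R1 to R4 are hypothetical.

```python
from fractions import Fraction


def null_space(matrix):
    """Basis of {v : matrix * v = 0}, by exact Gauss-Jordan elimination."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_cols = len(rows[0])
    pivots = []
    r = 0
    for c in range(n_cols):
        if r == len(rows):
            break
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot here: column c is a free variable
        rows[r], rows[pivot] = rows[pivot], rows[r]
        pivot_value = rows[r][c]
        rows[r] = [x / pivot_value for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    basis = []
    for free in (c for c in range(n_cols) if c not in pivots):
        vector = [Fraction(0)] * n_cols
        vector[free] = Fraction(1)
        for row_index, p in enumerate(pivots):
            vector[p] = -rows[row_index][free]
        basis.append(vector)
    return basis


def reactions_in_internal_cycles(stoichiometry, reaction_ids):
    """Reactions that can carry flux in a steady state without any exchange."""
    basis = null_space(stoichiometry)
    return [rid for j, rid in enumerate(reaction_ids)
            if any(vector[j] != 0 for vector in basis)]


# Toy internal network: R1: A -> B, R2: B -> C, R3: C -> A, R4: A -> D.
reaction_ids = ["R1", "R2", "R3", "R4"]
S = [
    [-1, 0, 1, -1],  # A
    [1, -1, 0, 0],   # B
    [0, 1, -1, 0],   # C
    [0, 0, 0, 1],    # D
]
loop_reactions = reactions_in_internal_cycles(S, reaction_ids)
print(loop_reactions)
```

R1, R2 and R3 form a closed loop that can carry arbitrary flux with no uptake, whereas R4 leads to a dead end and is excluded. In a real model each candidate loop would still have to be checked against reaction reversibility before being counted as a TIC.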
Fritzemeier and colleagues demonstrated that over 85% of genome-scale models that lack exhaustive
manual curation contain Energy Generating Cycles (EGCs) [48,49]. These cyclic net fluxes are
entirely independent of nutrient uptakes (exchange fluxes) and therefore have a substantial effect on
the predictions of constraint-based analyses, as they effectively generate energy out of nothing. Using
FBA with zero nutrient uptake [49] while maximizing energy dissipation reactions for ATP, GTP, CTP
and UTP, we show that iCG760 is the only Mtb genome-scale model that contains EGCs (Table 3).
Gene essentiality metrics
An effective and commonly employed predictive metric for GSMNs is the ability to reproduce
high-throughput gene essentiality data [50]. Several high-throughput transposon mutagenesis screens have
been performed for Mtb [51–57] in different in vitro conditions. To compare our models we used a
transposon insertion sequencing dataset produced by Griffin et al. [54], in which genes that were
differentially required for growth on cholesterol as compared with glycerol were identified.
Cholesterol is an important intracellular source of carbon when Mtb is growing within its host and
cholesterol metabolism has been highlighted as a potential drug target [58]. Whilst Griffin and
colleagues identified differentially required genes, these data were not used to identify the genes
required for growth on cholesterol as an individual condition. We reanalyzed Griffin’s transposon
sequencing data with the Bayesian/Gumbel method incorporated into the software
TRANSIT [59] to identify genes required for growth on glycerol or growth on cholesterol (S8 and
S9 Tables). Only genes categorized as essential (ES) and non-essential (NE) were considered for this
analysis.
We evaluated the overall predictive power of all the Mtb GSMNs against four high-throughput gene
essentiality datasets [54,56,57] by computing the Area Under the Curve (AUC) of the Receiver Operating
Characteristic (ROC) (S1 Fig). The predictive power of the six GSMNs which contain the cholesterol
degradation pathway (GSMN-TB1.1, iCG760, iSM810, sMtb, sMtb2018, and iEK1011) showed that,
for both cholesterol and glycerol minimal media, the models derived from GSMN-TB [10] as a core
metabolic network have better predictive capacities than those using iNJ661 [11] (S10 and S11
Tables). However, the recently curated model iEK1011 had the highest predictive capability overall.
The superior performance of iEK1011 was also confirmed by comparing the predictive power of the models
for Mtb grown in standard Middlebrook 7H9 OADC medium [56] (S1C Fig and S12 Table).
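As an illustration of the AUC metric, the sketch below scores each gene by the predicted growth defect of its knockout and computes the ROC AUC as the probability that a randomly chosen essential gene receives a higher score than a randomly chosen non-essential gene. This scoring scheme is an assumption made for illustration and may differ from the procedure behind S1 Fig; the growth ratios and labels are hypothetical.

```python
def roc_auc(scores, labels):
    """ROC AUC via pairwise comparison: the fraction of (positive, negative)
    pairs in which the positive has the higher score (ties count 0.5)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical knockout growth rates as a fraction of wild-type growth.
growth_ratio = [0.0, 1.0, 0.9, 0.02, 0.5]
# Hypothetical experimental calls: 1 = essential (ES), 0 = non-essential (NE).
essential = [1, 1, 0, 0, 1]

# A larger growth defect means a higher predicted likelihood of essentiality.
defect_scores = [1.0 - ratio for ratio in growth_ratio]
auc = roc_auc(defect_scores, essential)
print(auc)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of essential from non-essential genes, so the value is independent of any single growth threshold.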
We further used the essentiality dataset generated by Minato and colleagues, who identified
conditionally essential genes on different in vitro media including a rich medium called “MtbYM”,
which contains several carbon sources, nitrogen sources, amino acids, nucleotide bases, cofactors,
and other nutrients [57]. Although this medium supported growth similarly to 7H9 medium, the overall
predictive power of the Mtb GSMNs in this medium (S1D Fig and S13 Table) was 10% lower
compared to the other media (S1A-S1C Figs), probably because the biomass objective functions of the
Mtb GSMN ancestors were built and validated using growth on 7H9, 7H10, Youman’s, CAMR, and
Roisin’s minimal medium [10,11]. However, these analyses were able to demonstrate the
ability of these models to accurately predict gene essentiality even under unanticipated nutritional
conditions.
We identified the genes for which all of the GSMNs were unable to correctly assign essentiality (S14
Table, Fig 3). Using a fixed threshold value of 5% of the maximum wild-type growth rate (WTGR)
we identified in silico essential and non-essential genes and, using experimental high-throughput gene
essentiality data, identified True Positives (TP), True Negatives (TN), False Positives (FP: a gene
which is essential for in silico growth but non-essential by Tn-seq) and False Negatives (FN: in silico
the gene is non-essential but the biological data indicate essentiality). FN genes include genes known
to have a major role in Mtb central carbon metabolism e.g., icl (Rv0467, isocitrate lyase), gltA
(Rv0896, citrate synthase), glpD2 (Rv3302c, glycerol-3-phosphate dehydrogenase), pyk (Rv1617,
pyruvate kinase), sucC and sucD (Rv0951, Rv0952, succinyl-CoA ligase), among others (S14A
Table). Some of these genes, e.g. icl and gltA, are considered conditionally essential genes in the
Online GEne Essentiality (OGEE) database [60], because they are classified as NE genes in 7H10
medium but ES in minimal medium [53,55]. This may reflect the presence of alternative routes in
silico that are not feasible in vivo due to regulatory constraints. However, they may also reflect
inaccuracies in the predictive abilities of transposon mutagenesis studies. Some of the FP genes are
involved in mycolic acid biosynthesis (S14B Table), e.g., mmaA2 (Rv0644c, cyclopropane mycolic
acid synthase) and mas (Rv2940c, mycocerosic acid synthase). These genes were
classified as ES in silico but as NE experimentally. This is a direct reflection of our incomplete
knowledge of these mycolate-producing pathways and the essentiality of the various mycolates
produced.
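The classification into TP, TN, FP and FN described above can be sketched as follows. The 5% wild-type growth rate threshold and the definitions of the four classes follow the text; the function name, the strict less-than comparison, and the gene identifiers and growth values are assumptions made for illustration.

```python
def classify_genes(mutant_growth, wild_type_growth, experimental, threshold=0.05):
    """Compare in silico knockout growth with experimental essentiality calls.
    A gene is predicted essential if its knockout grows below
    threshold * wild_type_growth. Experimental calls are 'ES' or 'NE'."""
    cutoff = threshold * wild_type_growth
    outcome = {}
    for gene, growth in mutant_growth.items():
        predicted_essential = growth < cutoff
        observed_essential = experimental[gene] == "ES"
        if predicted_essential and observed_essential:
            outcome[gene] = "TP"
        elif predicted_essential:
            outcome[gene] = "FP"  # essential in silico, NE by Tn-seq
        elif observed_essential:
            outcome[gene] = "FN"  # non-essential in silico, ES by Tn-seq
        else:
            outcome[gene] = "TN"
    return outcome


# Hypothetical knockout growth rates (wild type normalised to 1.0).
mutant_growth = {"gene_1": 0.0, "gene_2": 1.0, "gene_3": 0.01, "gene_4": 0.8}
experimental = {"gene_1": "ES", "gene_2": "NE", "gene_3": "NE", "gene_4": "ES"}

calls = classify_genes(mutant_growth, 1.0, experimental)
print(calls)
```

Genes that fall into the FN or FP class in every model and every medium are the ones collected in S14 Table and Fig 3.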
Fig 3. False Negative and False Positive predictions in the evaluated media. (a) Venn
diagram for predicted false negative (FN) genes; (b) Venn diagram for predicted false
positive (FP) genes. Genes in the gray box represent the intersection of all FN and FP genes
in the four media. Genes are classified as True positives (TP) if model simulation predicts
no growth when essential genes are deleted, False positives (FP) if model simulation
predicts no growth when non-essential genes are deleted, True negatives (TN) if model
simulation predicts growth when non-essential genes are deleted and False negatives (FN) if
model simulation predicts growth when essential genes are deleted.
Growth metrics
Mtb is able to metabolise several carbon and nitrogen sources both in vitro and when growing in the
host [61–64], and therefore we evaluated the growth metrics of Mtb GSMNs on 30 sole carbon and
17 sole nitrogen sources (Fig 4). The in silico results were compared with available in vitro
experimental data from Biolog Phenotype microarray and minimal media data [14,65]. Interestingly,
the recent consolidated models, iEK1011 and sMtb, had the poorest performance of all the models in
predicting growth of Mtb on sole carbon and nitrogen sources (Fig 4A and 4B). A fundamental
issue with the Mtb models descended from iNJ661 is that they all require glycerol for growth as this
is present in the biomass formulation. Both iEK1011 and sMtb were unable to grow in silico on
cholesterol, acetate, oleate, palmitate and propionate as sole carbon sources. We hypothesised that this
results from inaccuracies in the reactions associated with redox metabolism and oxidative
phosphorylation, and specifically that including menaquinone-dependent reactions such as fumarate
reductase and succinate dehydrogenase would correct this anomaly. To test this hypothesis we added
a menaquinone-dependent succinate dehydrogenase reaction into sMtb (Q[c] + SUCC[c] -> QH2[c]
+ FUM[c]). In support of our hypothesis, this corrected the in silico growth phenotype of Mtb growing
on acetate, cholesterol, propionate and fatty acids (Fig 4A and S15 Table). Although iEK1011
contains a fumarate reductase reaction that is linked to menaquinone/demethylmenaquinone, it does
not contain a menaquinone-dependent succinate dehydrogenase (the reverse reaction). As was the
case for sMtb, the addition of a new menaquinone-dependent succinate dehydrogenase reaction
(mqn8[c] + succ[c] -> fum[c] + mql8[c]) to iEK1011 significantly improves its growth predictions
on sole carbon sources (S15I and S15J Tables). These simulations are supported by experimental data
demonstrating that fumarate reductase and succinate dehydrogenase are essential for the survival of
Mtb on glycolytic and non-glycolytic substrates [66,67]. Succinate dehydrogenase is a bifunctional
enzyme that is part of the TCA cycle and complex II of the electron transport chain, coupling the
oxidation of succinate to fumarate with the corresponding reduction of membrane-localized quinone
electron carriers [67,68]. Mtb has multiple succinate dehydrogenases and fumarate reductases that are
essential for the survival of Mtb during hypoxia [66,67,69–71]. Succinate is central to much
of Mtb’s lipid metabolism: host derived cholesterol, uneven chain length fatty acids or methyl
branched amino acids all generate propionyl-CoA that can be channeled into the methylcitrate cycle
to produce succinate (S2 Fig), while acetyl-CoA produced by β-oxidation of host derived even-chain
fatty acids is metabolized through the glyoxylate shunt to also produce succinate (S2 Fig). Succinate
oxidation by succinate dehydrogenases is therefore a critical step, as the enzyme couples the TCA
cycle with the electron transport chain and oxidative phosphorylation [69]. Having multiple succinate
dehydrogenases provides Mtb with the metabolic flexibility to survive within the human host where
the redox and nutritional environment are liable to vary.
Whilst carbon metabolism has been intensively studied in vitro and ex vivo, attention has only recently
been directed to nitrogen metabolism [72–75]. As with carbon sources, iEK1011 and sMtb
were poor at predicting growth on sole nitrogen sources (Fig 4B, Matthews Correlation Coefficient
(MCC) = 0.54 and 0.48, respectively). However, as with carbon, the addition of the menaquinone-linked
succinate dehydrogenase reaction into iEK1011 and sMtb significantly improves the predictive power
of these models on sole nitrogen sources (S16I and S16J Tables). Specifically, correct growth
predictions were obtained for Mtb growing on branched-chain amino acids (isoleucine and valine) and
proline (Fig 4B). A possible explanation for these correct predictions is that complete degradation
of these amino acids can converge on succinate via the methylcitrate cycle (degradation of isoleucine
and valine) or the GABA shunt (degradation of proline), which couples the TCA cycle with oxidative
phosphorylation by succinate dehydrogenase.
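The Matthews Correlation Coefficient quoted above is computed from the four entries of the confusion matrix as MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)). A minimal sketch follows; the predicted and observed growth calls are hypothetical and are not taken from Fig 4.

```python
from math import sqrt


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from confusion-matrix counts.
    Returns 0.0 when the denominator is zero (a common convention)."""
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


def confusion_counts(predicted, observed):
    """Count TP, TN, FP, FN for binary growth (1) / no growth (0) calls."""
    tp = sum(1 for p, o in zip(predicted, observed) if p == 1 and o == 1)
    tn = sum(1 for p, o in zip(predicted, observed) if p == 0 and o == 0)
    fp = sum(1 for p, o in zip(predicted, observed) if p == 1 and o == 0)
    fn = sum(1 for p, o in zip(predicted, observed) if p == 0 and o == 1)
    return tp, tn, fp, fn


# Hypothetical growth calls on six substrates (1 = growth, 0 = no growth).
predicted = [1, 1, 0, 0, 1, 0]
observed = [1, 0, 0, 0, 1, 1]

tp, tn, fp, fn = confusion_counts(predicted, observed)
mcc = matthews_corrcoef(tp, tn, fp, fn)
print(tp, tn, fp, fn, round(mcc, 3))
```

The MCC ranges from -1 to 1 and, unlike plain accuracy, remains informative when growth and no-growth substrates are present in unequal numbers.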
Fig 4. Predictive capacity of Mtb genome-scale models for the utilization of sole carbon
and nitrogen sources. (a) Growth predictions of Mtb genome-scale models using sole
carbon sources; (b) Growth predictions of Mtb genome-scale models using sole nitrogen
sources. Model performance was evaluated by computation of the Matthews Correlation
Coefficient (MCC). Experimental growth data were obtained from [14,65]. The 1 and 0
values represent growth and no growth on a specific substrate, respectively. Carbon or
nitrogen substrates are classified as TP if model simulation predicts growth while growth
is observed experimentally, FP if model simulation predicts growth while no growth is
observed experimentally, TN if model simulation predicts no growth while no growth is
observed experimentally and FN if model simulation predicts no growth while growth is
observed experimentally.
Refining Mtb GSMNs
iEK1011 and sMtb2018 were the best overall GSMNs in terms of genetic background,
network topology, number of blocked reactions, mass- and charge-balanced reactions and gene
essentiality predictions (Fig 2, Table 2, Table 3, and S1 Fig) and therefore we selected these models
to refine further. iEK1011 has the advantage of containing standardized BiGG nomenclature of
metabolites and therefore can easily be integrated into the human GSMN Recon3D [76] to simulate
intracellular growth, while sMtb2018 has the advantage of supporting in silico growth in a
wider variety of nutritional conditions. Our analysis also highlighted some fundamental
issues with these models, which we addressed in order to improve the performance of these exemplar
GSMNs.
As already discussed, the new sMtb2018 includes an updated respiratory chain with menaquinone
and menaquinol as electron carriers in all respiratory chain reactions, and selected
ubiquinone-dependent reactions in sMtb were changed to menaquinone-dependent. Six new
menaquinone-dependent reactions were included in the sMtb model, e.g., succinate dehydrogenase,
menaquinone-dependent cytochrome bc1, fumarate reductase, and malate dehydrogenase [22]. This improved the
predictive growth metric of sMtb and importantly allowed in silico growth on cholesterol (see
sMtb2018, Fig 4A). Similarly, we added to iEK1011 a menaquinone-dependent succinate
dehydrogenase to improve the performance of this model when growing on fatty acids and cholesterol.
Further improvements were also made to cholesterol metabolism by updating both models to include
the biochemical degradation of the C and D rings of cholesterol, which was not known when these
models were reconstructed (S17A and S17B Tables) [77].
Although the functionality of the vitamin B12 biosynthetic pathway remains undefined, elevated non-synonymous nucleotide substitution rates recently found in cobalamin-dependent enzymes across 3798 Mtb strains suggest positive selection on these genes and a possibly functional biosynthetic pathway [41]. We therefore facilitated cofactor metabolism in the models by unblocking B12 synthesis and including the cofactors biotin and pyridoxal-5-phosphate to enhance the phenotype predictions of sMtb2018 and iEK1011, as recommended by Xavier and colleagues [19]. As iEK1011 contains glycerol within its biomass formulation, which prevents in silico growth in glycerol-replete media, we incorporated the iNJ661v_mod biomass objective function into this model (S17B Table). We also added genes to sMtb2018 based on iOSDD890 genes that were classified as true negatives and true positives by the gene essentiality analysis (S18 Table). The GPR annotation was improved for fifty-one reactions belonging to pathways such as glycolysis, gluconeogenesis, the TCA cycle, amino acid metabolism and the mycolic acid pathway. All updates for the new models sMtb2.0 and iEK1011_2.0 are summarized in S17 Table.
The sMtb2.0 and iEK1011_2.0 models were then curated to remove reactions that participate in TICs (see Methods; S19 Table). The stoichiometric matrix associated with these reactions was built and its null space basis was computed (S20 Table) [44,47], identifying twelve TICs within sMtb2.0 (Fig 5A, S2 Appendix) and seven TICs in iEK1011_2.0 (Fig 5B, S3 Appendix). The major TIC of sMtb2.0 lay within the folate metabolism reactions catalysed by the thymidylate synthase (thyA and thyX) and dihydrofolate reductase (DFRA) enzymes, which constitute an essential step for de novo glycine and purine biosynthesis and for the conversion of deoxyuridine monophosphate (dUMP) to deoxythymidine monophosphate (dTMP) (Fig 5A). As recommended by Pereira and colleagues [30], we modified the model to use NADPH/NADP+ by retaining the DFRA2 and DFRA4 reactions and eliminating the NADH/NAD+-dependent reactions DFRA1 and DFRA3. Our thermodynamic calculations suggested that THYA ($\Delta_r G_{\min} = -123$ kJ/mol, $\Delta_r G_{\max} = -9.3$ kJ/mol) and THYX ($\Delta_r G_{\min} = -160$ kJ/mol, $\Delta_r G_{\max} = -33$ kJ/mol) are irreversible in the forward direction (S21 Table).
Our analysis showed that two ubiquinone oxidoreductases (QRr, NADH2r) and a transhydrogenase reaction (NADTRHD) formed a TIC within iEK1011_2.0 because both QRr and NADH2r were reversible. The Gibbs free energy computations suggest that these reactions should be unidirectional in the direction of ubiquinol production ($\Delta_r G_{\min} = -134.9$ kJ/mol, $\Delta_r G_{\max} = -20.7$ kJ/mol and $\Delta_r G_{\min} = -131.7$ kJ/mol, $\Delta_r G_{\max} = -20.1$ kJ/mol, respectively), and we therefore changed the model accordingly to eliminate this TIC.
Figure 5. Thermodynamically infeasible cycles found in sMtb2.0 and iEK1011_2.0 with proposed modifications. (a) TIC in folate metabolism in sMtb2.0; (b) TIC affecting ubiquinone oxidoreductases in iEK1011_2.0.

sMtb2.0 encompasses 1321 reactions, 1054 metabolites and 989 genes, while iEK1011_2.0 comprises 1237 reactions, 977 metabolites and 1013 genes.
Table 4. Genome-scale model features for sMtb2.0 and iEK1011_2.0.
Metrics                               sMtb2.0      iEK1011_2.0
Genes                                 989          1013
Reactions                             1321         1237
Intracellular Reactions               1198         1119
Exchange Reactions                    122          119
Metabolites                           1054         977
Unbounded Reactions, n (%)            49 (4 %)     20 (1.8 %)
MCC Cholesterol minimal medium        0.55         0.63
  Evaluated Genes                     834          866
  True Positive Genes                 209          210
  True Negative Genes                 449          503
  False Positive Genes                58           21
  False Negative Genes                118          132
MCC Glycerol minimal medium           0.54         0.62
  Evaluated Genes                     834          866
  True Positive Genes                 207          206
  True Negative Genes                 449          503
  False Positive Genes                58           21
  False Negative Genes                120          136
MCC 7H9 medium                        0.56         0.69
  Evaluated Genes                     984          1006
  True Positive Genes                 202          216
  True Negative Genes                 600          667
  False Positive Genes                96           45
  False Negative Genes                86           78
MCC MtbYM medium                      0.43         0.52
  Evaluated Genes                     827          858
  True Positive Genes                 153          152
  True Negative Genes                 458          509
  False Positive Genes                51           17
  False Negative Genes                165          180
MCC Unique Carbon Source              0.58         0.67
  Evaluated Metabolites               30           30
  True Positive Metabolites           20           20
  True Negative Metabolites           5            6
  False Positive Metabolites          4            3
  False Negative Metabolites          1            1
MCC Unique Nitrogen Source            0.75         0.75
  Evaluated Metabolites               17           17
  True Positive Metabolites           11           11
  True Negative Metabolites           4            4
  False Positive Metabolites          2            2
  False Negative Metabolites          0            0
As in the sections above, the predictive capability of the modified models was evaluated by comparing simulated gene essentiality predictions with experimental data. We used 5% of the wild-type specific growth rate as the essentiality threshold: if a knockout yielded a specific growth rate of no more than 5% of that of the wild type, the gene was considered essential; otherwise it was considered non-essential (S22 Table). iEK1011_2.0 has the highest predictive performance for gene essentiality in the four media conditions tested (glycerol and cholesterol minimal medium, Middlebrook 7H9, and MtbYM medium) compared with all the Mtb GSMNs evaluated, including sMtb2.0 (MCC 0.08 higher in cholesterol and glycerol minimal medium, and 0.13 higher in 7H9 medium; Table 4).
The ability of these models to predict growth on sole carbon and nitrogen sources is also better in the updated GSMNs than in their predecessors. These results show that iEK1011_2.0 and sMtb2.0 are currently the most suitable models for studying host-pathogen interactions, as they correctly predict growth on several important carbon sources that Mtb uses intracellularly [62].

By systematically evaluating eight of the recent Mtb GSMNs, we have identified a number of areas in which model veracity can be improved. Mtb models descended from GSMN-TB (GSMN-TB1.1, iSM810 and iCG760) contain many reactions that are unbalanced in charge and mass, often because protons and water have not been accounted for. We observed that all of the Mtb models have dead-end metabolites, particularly in cofactor metabolism and related pathways. The best performing models were identified as sMtb2018 and iEK1011 by virtue of higher gene coverage, fewer blocked reactions, better mass and charge balances, and the best overall predictive power for gene essentiality data.

Whilst our analysis identified iEK1011 and sMtb2018 as the best performing GSMNs overall, we also highlighted areas for improvement. For example, the iEK1011 model had poor predictive power for growth on sole carbon and nitrogen sources and, unlike Mtb, cannot grow when cholesterol and fatty acids are provided as sole carbon sources. We thus updated these two models by the addition of new reactions, gap filling of cofactor metabolism, and the identification and curation of TICs, to give sMtb2.0 (derived from sMtb2018) and iEK1011_2.0 (derived from iEK1011).

The improved GSMNs, sMtb2.0 and iEK1011_2.0, are now available (S1 File) to simulate and predict the metabolic adaptation of Mtb (by the integration of omics data) in a wide range of in vitro and in vivo intracellular conditions. We encourage researchers to continue to curate these models as new data and methods become available. Improved GSMNs, including macrophage-Mtb models, provide a platform for more reliable simulations and ultimately a better understanding of the underlying biology of Mtb.
Methods
All simulations were conducted on a laptop running Windows 10 (Microsoft) using MATLAB 2016a (MathWorks Corporation, Natick, Massachusetts, USA), COBRA Toolbox version 3.0 [33] and Gurobi Optimizer version 7.5.2 (Gurobi Optimization, Inc., Houston, Texas, USA). All code written for this study is available in the supplementary information (S2-S5 Files).
Genome-scale models of Mtb
Models were obtained from the supplementary information of published papers and modified as follows:
GSMN-TB1.1 – from [14] supplementary info.
iOSDD890 – from [16] supplementary info.
sMtb – from [15] supplementary info. Modifications included the addition of exchange reactions to allow constraints by growth medium components.
iCG760 – from [17] supplementary info.
iSM810 – from [18] supplementary info.
iNJ661v_modified – from [19] supplementary info.
sMtb2018 – from [22] supplementary info.
iEK1011 – from [24] supplementary info.
Network connectivity evaluation
GSMNs of Mtb were transformed into substrate networks by local scripts after eliminating the biomass reaction. Node-specific topology metrics were calculated using the NetworkAnalyzer plugin [78] in Cytoscape 3.4 [79]. Two main topological parameters were evaluated for each model: 1) the node degree of each metabolite and 2) the clustering coefficient (S4 Table).
The MC3 Consistency Checker algorithm [38] was used to identify single-connected and dead-end metabolites, and zero-flux reactions, in each metabolic network model of Mtb. This algorithm uses a stoichiometry-based identification of metabolites connected only once in each metabolic network and uses flux variability analysis (FVA) to identify reactions that cannot carry flux [80].
Biomass Molecular Weight Check
The consistency of the biomass molecular weight was tested by running the script of Chan and colleagues [34]. A biomass reaction is not standardized when the molecular weight of the biomass formula is not equal to 1 g/mmol. However, the accuracy of the results relies on correct chemical formulae for the metabolites in the tested GSMNs.
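The idea behind the molecular weight check can be illustrated with a minimal Python sketch. This is not the script of Chan and colleagues; it only assumes that the biomass molecular weight is the net mass consumed by the biomass reaction, and all metabolite names, coefficients and molecular weights below are invented for illustration.

```python
def biomass_molecular_weight(coefficients, molecular_weights):
    """Return the biomass molecular weight in g/mmol.

    coefficients: metabolite -> stoichiometric coefficient in the biomass
        reaction (mmol/gDW; negative = consumed, positive = released).
    molecular_weights: metabolite -> molecular weight in g/mol.
    """
    # Net mass drawn into biomass, in mg per gDW (mmol * g/mol = mg).
    net_mass_mg = -sum(coef * molecular_weights[met]
                       for met, coef in coefficients.items())
    return net_mass_mg / 1000.0


def is_standardized(coefficients, molecular_weights, tolerance=1e-3):
    """True when the biomass molecular weight is 1 g/mmol within tolerance."""
    mw = biomass_molecular_weight(coefficients, molecular_weights)
    return abs(mw - 1.0) <= tolerance


# Toy biomass reaction: every name and number here is hypothetical.
toy_weights = {"protein": 110.0, "rna": 330.0, "lipid": 750.0, "carbohydrate": 160.0}
toy_biomass = {"protein": -5.0, "rna": -0.5, "lipid": -0.3, "carbohydrate": -0.375}
toy_mw = biomass_molecular_weight(toy_biomass, toy_weights)

# Lowering one coefficient leaves part of the gram of biomass unaccounted for.
toy_unbalanced = dict(toy_biomass, protein=-4.0)
```

As noted above, a result of this kind is only as reliable as the metabolite formulae (and hence molecular weights) supplied to it.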
Identification of Unbounded Reactions (URs)
A straightforward way to identify all reactions that participate in one or more TICs is to perform flux variability analysis (FVA). Infeasible loops appear as sets of reactions able to carry an unbounded metabolic flux under finite or even zero substrate uptake. URs are those reactions whose fluxes, under FVA [80], reach the values defined by the upper and/or lower bound constraints [47]. We therefore performed FVA on the eight Mtb GSMNs with all medium uptake constraints set to 1.0 mmol/gDW/h.
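Once FVA ranges are available, flagging URs amounts to a simple filter. The Python sketch below assumes that FVA has already been run elsewhere and that reactions carry a default flux bound of 1000 mmol/gDW/h; the bound, the tolerance, the reaction names and the flux ranges are all assumptions made for illustration, not values taken from this study.

```python
def find_unbounded_reactions(fva_ranges, flux_bound=1000.0, tolerance=1e-6):
    """Return the reactions whose FVA minimum or maximum reaches the flux bound.

    fva_ranges: reaction id -> (minimum flux, maximum flux) from FVA.
    flux_bound: assumed default bound on reaction fluxes (hypothetical value).
    """
    unbounded = []
    for reaction, (v_min, v_max) in fva_ranges.items():
        hits_upper = v_max >= flux_bound - tolerance
        hits_lower = v_min <= -flux_bound + tolerance
        if hits_upper or hits_lower:
            unbounded.append(reaction)
    return sorted(unbounded)


# Invented FVA output with uptake limited to 1.0 mmol/gDW/h.
toy_fva = {
    "R_loop_a": (-1000.0, 1000.0),   # reversible member of a loop
    "R_loop_b": (0.0, 1000.0),       # irreversible member of a loop
    "R_uptake_limited": (0.0, 1.0),  # flux capped by substrate uptake
    "R_blocked": (0.0, 0.0),         # cannot carry flux
}
flagged = find_unbounded_reactions(toy_fva)
```

Reactions whose range is limited by the uptake constraint are not flagged, which is why a small uptake rate makes the loop members stand out.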
Identification of the core set of TICs
Schellenberger and colleagues [81] described a methodology for identifying the core set of TICs, which forms the basis of all such possible cycles. This core can be obtained by computing a basis of the null space of the stoichiometric matrix (all possible thermodynamically infeasible cycles lie in the null space of the stoichiometric matrix). Consequently, the set of all reactions previously identified as participating in TICs was used to build a stoichiometric matrix. The null space basis of this matrix was then computed, and the cycles composed of two or more reactions were identified by a local script.
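A small Python sketch can make the null space step concrete. It is not the local script used in this study: it computes a rational null space basis by Gaussian elimination for an invented three-reaction network (R1: A -> B, R2: B -> A, R3: A -> B), and then reads each basis vector as a candidate cycle.

```python
from fractions import Fraction


def null_space_basis(matrix):
    """Return a basis of the null space of `matrix` using exact arithmetic."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_cols = len(rows[0])
    pivots = []
    r = 0
    # Reduce to reduced row echelon form, recording the pivot columns.
    for c in range(n_cols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        pivot_value = rows[r][c]
        rows[r] = [x / pivot_value for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == len(rows):
            break
    # Each free column yields one basis vector of the null space.
    free_columns = [c for c in range(n_cols) if c not in pivots]
    basis = []
    for free in free_columns:
        vector = [Fraction(0)] * n_cols
        vector[free] = Fraction(1)
        for row_index, pivot_column in enumerate(pivots):
            vector[pivot_column] = -rows[row_index][free]
        basis.append(vector)
    return basis


# Hypothetical network: rows are metabolites A and B, columns are R1, R2, R3.
reactions = ["R1", "R2", "R3"]
S = [[-1, 1, -1],
     [1, -1, 1]]
basis = null_space_basis(S)
cycles = [[rxn for rxn, coef in zip(reactions, vec) if coef != 0] for vec in basis]
```

For this toy matrix the two basis vectors correspond to the loop R1 + R2 and to R3 running against R1, that is, two cycles of two reactions each. The sign of each entry indicates the direction in which the reaction would have to run for the cycle to carry flux.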
Checking the existence of Energy Generating Cycles (EGCs)
Energy generating cycles (EGCs) can exist in metabolic networks and can charge energy metabolites such as ATP, GTP, CTP, and UTP without any input of nutrients; their elimination is therefore essential for correcting energy metabolism [49,82]. Fritzemeier and colleagues developed a methodology for identifying whether genome-scale models contain EGCs [49]. We applied it in two steps: 1) addition of energy dissipation reactions (EDRs) for ATP, GTP, CTP and UTP of the form H2O[c] + XTP[c] -> H[c] + XDP[c] + Pi[c], and 2) maximization of each EDR flux $v_d$ while no substrate uptake is allowed into the model, as follows:

\begin{align}
\max \quad & v_d \tag{1} \\
\text{s.t.} \quad & S v = 0 \tag{2} \\
& \forall i \notin E: \; v_i^{LB} \le v_i \le v_i^{UB} \tag{3} \\
& \forall i \in E: \; v_i = 0 \tag{4}
\end{align}

Here, $S$ is the stoichiometric matrix, $v$ the vector of fluxes, $d$ the index of the EDR, $v_i^{LB}$ and $v_i^{UB}$ the lower and upper bounds of flux $i$, respectively, and $E$ the set of indices of all exchange reactions of the model. If the optimal value of this optimization is $v_d > 0$, the genome-scale model contains at least one EGC that is able to generate energy metabolites such as ATP, GTP, CTP, or UTP.
Curation of TICs
Two types of modification were performed on the curated Mtb genome-scale metabolic networks in order to eliminate TICs [47].
i) TICs formed by linearly dependent reversible reactions: these usually arise when two reactions (NAD+- and NADP+-dependent) have the same catalytic activity. In this instance, we forced the use of NADPH/NADP+ in anabolic reactions and NADH/NAD+ in catabolic reactions, as recommended by Pereira and colleagues [30]. If two irreversible reactions catalysing the forward and backward directions exist, both reactions (and their GPR rules) are lumped together into a single reversible reaction.
ii) TICs formed by erroneous directionality assignments: we restricted the reaction directionality based on the Gibbs free energy change (NExT algorithm) [83,84], as long as gene essentiality predictions were not compromised.
The latter modification was based on the NExT (network-embedded thermodynamic analysis) algorithm [83,84]. This algorithm allows new irreversible reactions to be identified by calculating the thermodynamically feasible ranges of the Gibbs energies of reaction and of metabolite concentrations. NExT was run in MATLAB [84] for the reactions participating in TICs, with physiological conditions adapted for Mtb (Table 5). The standard Gibbs energy of formation ($\Delta_f G_i$, in kJ/mol), number of hydrogen atoms, and charge of all metabolites involved in TICs were obtained from the biochemical thermodynamics calculator eQuilibrator [85,86].
Table 5. Biophysical properties and concentration ranges for intracellular Mtb.
Properties                                  Values          Reference
Redox potential, cytosol                    -275 mV         Bhaskar et al., 2014
pH, intracellular                           5.7             Zhang et al., 2003
pH, extracellular (activated macrophage)    4.5             Rohde et al., 2007; Vandal et al., 2009
Ionic strength                              0.15 M          Kümmel et al., 2006; Martínez et al., 2014
Oxygen concentration (mM)                   0.0001 – 0.1    Martínez et al., 2014
[CO2], [Pi] (mM)                            1 – 100         Kümmel et al., 2006; Haraldsdóttir et al., 2012
Other metabolites (mM)                      0.0001 – 0.1    Martínez et al., 2014
NADH/NAD                                    0.0001 – 0.1    Martínez et al., 2014
NADPH/NADP                                  0.0001 – 0.1    Martínez et al., 2014
If a reaction specified as reversible in the set of TICs has a maximum $\Delta_r G$ calculated to be negative, the reaction is considered to occur in the forward direction. In contrast, if the minimum $\Delta_r G$ is positive, the reaction is considered to occur in the reverse direction. No direction can be inferred when the minimum $\Delta_r G$ is negative and the maximum is positive. Changes in the directionality of reactions were made only when gene essentiality predictions in the curated genome-scale model were not compromised.
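The directionality rule above can be written as a short decision function. The Python sketch below is only an illustration of that rule: the function name is hypothetical, the four named reactions use the $\Delta_r G$ ranges reported in the Results, and the last two entries are invented to show the other two outcomes.

```python
def infer_direction(dg_min, dg_max):
    """Infer reaction direction from the feasible range of Delta_r G (kJ/mol)."""
    if dg_max < 0:
        return "forward"       # Delta_r G is negative over the whole range
    if dg_min > 0:
        return "reverse"       # Delta_r G is positive over the whole range
    return "undetermined"      # the range spans zero, so no direction is inferred


# (Delta_r G min, Delta_r G max) in kJ/mol.
dg_ranges = {
    "THYA": (-123.0, -9.3),
    "THYX": (-160.0, -33.0),
    "QRr": (-134.9, -20.7),
    "NADH2r": (-131.7, -20.1),
    "hypothetical_reverse": (5.0, 40.0),         # invented example
    "hypothetical_reversible": (-12.0, 18.0),    # invented example
}
directions = {rxn: infer_direction(lo, hi) for rxn, (lo, hi) in dg_ranges.items()}
```

In the curation itself a change suggested by this rule was applied only when gene essentiality predictions were not compromised, which this sketch does not check.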
Gene Essentiality Analysis
To identify essential genes of Mtb grown under individual conditions (cholesterol minimal medium and glycerol minimal medium) from the data of Griffin and colleagues [54], we used the Bayesian/Gumbel method of TRANSIT, version 2.02 [59]. The Bayesian/Gumbel method determines the posterior probability of essentiality of each gene (zbar). When zbar is 1 or close to 1, the gene is considered essential (ES); when zbar is 0 or close to 0, the gene is considered non-essential (NE); uncertain (U) genes are those with intermediate zbar values between 0 and 1; and genes that are too small (S) are assigned a zbar of -1. After loading the TA count files (replicates for cholesterol and glycerol) and the gene annotation file into TRANSIT and running the Gumbel method with default parameters, we obtained an output file with the essentiality results (Table S9 and S10). Uncertain (U) and too small (S) genes were not taken into account in the in silico essentiality analysis. Minato and colleagues used the same statistical method to classify essential genes [57]. In contrast, DeJesus and colleagues [56] used a Hidden Markov Model-based statistical method to classify genes into four essentiality states: essential (ES), growth defect (GD), non-essential (NE), and growth advantage (GA). To evaluate the performance of the Mtb GSMNs in predicting gene essentiality, we used only binary classifiers; we therefore reclassified these genes into two groups: NE genes comprised the NE and GA genes, and ES genes comprised the GD and ES genes.
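The two preprocessing rules described above (dropping U and S calls from the Gumbel output, and collapsing the four HMM states into two) can be sketched in Python as follows. The function names and gene identifiers are hypothetical, and the sketch assumes that the calls have already been parsed from the TRANSIT output into dictionaries.

```python
# Four HMM states collapsed into the two classes used for model evaluation.
HMM_TO_BINARY = {"ES": "ES", "GD": "ES", "NE": "NE", "GA": "NE"}


def binarize_hmm_calls(hmm_calls):
    """Map HMM essentiality states (ES, GD, NE, GA) onto ES or NE."""
    return {gene: HMM_TO_BINARY[state] for gene, state in hmm_calls.items()}


def drop_unresolved_gumbel_calls(gumbel_calls):
    """Keep only ES and NE calls; uncertain (U) and too small (S) genes are excluded."""
    return {gene: call for gene, call in gumbel_calls.items() if call in ("ES", "NE")}


# Invented example calls; the gene names are placeholders, not Mtb loci.
toy_hmm = {"geneA": "ES", "geneB": "GD", "geneC": "NE", "geneD": "GA"}
toy_gumbel = {"geneA": "ES", "geneB": "U", "geneC": "NE", "geneD": "S"}

binary_hmm = binarize_hmm_calls(toy_hmm)
resolved_gumbel = drop_unresolved_gumbel_calls(toy_gumbel)
```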
For the in silico gene essentiality analysis, we set the simulation conditions (asparagine, phosphate, sodium, ammonium, citrate, sulfate, zinc, calcium, chloride, Fe3+, Fe2+, and glycerol or cholesterol) according to Griffin minimal medium [54], 7H9 OADC medium, and “MtbYM” medium, and an FBA-based gene essentiality analysis was performed on the eight Mtb models using the single gene deletion function of the COBRA Toolbox. Default maximization of the biomass objective function was used to predict growth in all models. If a knockout yielded a specific growth rate of no more than 5% of that of the wild type, the gene was considered essential (in silico); otherwise it was considered non-essential.
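The 5% threshold can be expressed as a small Python function applied to the growth rates returned by a single gene deletion analysis. The sketch assumes those growth rates are already available; the gene names and growth rates are invented.

```python
def classify_knockouts(wild_type_growth, knockout_growth, threshold=0.05):
    """Label each gene as essential or non-essential in silico.

    A gene is essential when its knockout grows at no more than
    `threshold` times the wild-type specific growth rate.
    """
    cutoff = threshold * wild_type_growth
    return {
        gene: "essential" if growth <= cutoff else "non-essential"
        for gene, growth in knockout_growth.items()
    }


# Invented growth rates (1/h) for a wild type and three knockouts.
toy_wild_type = 0.040
toy_knockouts = {"geneA": 0.0, "geneB": 0.002, "geneC": 0.039}
toy_calls = classify_knockouts(toy_wild_type, toy_knockouts)
```

Here the cutoff is 0.002 1/h, so a knockout growing at exactly 5% of the wild-type rate is still called essential, matching the "no more than 5%" wording.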
In silico gene essentiality predictions were categorized as true-positive, false-positive, true-negative, or false-negative when the in silico data were compared with experimental essentiality data.
TP (true-positive): model simulation predicts no growth when essential genes are deleted.
FP (false-positive): model simulation predicts no growth when non-essential genes are deleted.
TN (true-negative): model simulation predicts growth when non-essential genes are deleted.
FN (false-negative): model simulation predicts growth when essential genes are deleted.
To evaluate the performance of the Mtb GSMNs we used the sensitivity, specificity, accuracy, and Matthews correlation coefficient (MCC) metrics:

\begin{align}
\text{sensitivity} &= \frac{TP}{TP + FN} \tag{5} \\
\text{specificity} &= \frac{TN}{TN + FP} \tag{6} \\
\text{accuracy} &= \frac{TP + TN}{TP + FP + TN + FN} \tag{7} \\
\text{MCC} &= \frac{(TP \times TN) - (FP \times FN)}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \tag{8}
\end{align}
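As a worked example of equations 5 to 8, the Python sketch below evaluates the four metrics for the sMtb2.0 gene essentiality counts in cholesterol minimal medium listed in Table 4 (TP = 209, TN = 449, FP = 58, FN = 118). The function name is hypothetical, and returning an MCC of 0 when the denominator vanishes is a convention assumed here, not something stated in the text.

```python
import math


def classification_metrics(tp, tn, fp, fn):
    """Sensitivity, specificity, accuracy and MCC from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: MCC is reported as 0 when the denominator is zero.
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Counts for sMtb2.0 in cholesterol minimal medium, taken from Table 4.
smtb_cholesterol = classification_metrics(tp=209, tn=449, fp=58, fn=118)
```

The resulting MCC rounds to the 0.55 reported for sMtb2.0 in Table 4; the same function applied to the iEK1011_2.0 counts (TP = 210, TN = 503, FP = 21, FN = 132) rounds to 0.63.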
Utilization of carbon and nitrogen sources
The methodology for modelling the effect of different carbon and nitrogen sources on Mtb growth was adapted from Lofthouse and colleagues [14]. The classification of the Biolog Phenotype MicroArray experiments was also obtained from Lofthouse and colleagues, who classified growth and no growth on different carbon and nitrogen sources using the original Biolog data of Khatri and colleagues and Roisin's minimal media [14,65]. In addition, we used the Roisin's minimal media data obtained by Lofthouse and colleagues.

To model the carbon source experiment, we simulated the medium as a modified form of Roisin's minimal medium containing unlimited quantities of ammonia, phosphate, iron, sulfate and carbon dioxide, and a Biolog carbon source influx of 1 mmol/gDW/h. Similarly, the nitrogen source experiment was simulated using a modified form of Roisin's medium in which ammonia was replaced with 1 mmol/gDW/h of the Biolog nitrogen source and pyruvate was used as the carbon source (influx of 1 mmol/gDW/h).

To compare the utilization of carbon and nitrogen sources in all Mtb models with experimental data, we used the Matthews correlation coefficient (equation 8).
In silico growth predictions on carbon and nitrogen sources were also categorized as true-positive, false-positive, true-negative, or false-negative.
TP (true-positive): model simulation predicts growth while growth is observed experimentally (or respiration is observed in Biolog phenotype microarrays) in the presence of a unique carbon or nitrogen source.
FP (false-positive): model simulation predicts growth while no growth is observed experimentally in the presence of a unique carbon or nitrogen source.
TN (true-negative): model simulation predicts no growth while no growth is observed experimentally in the presence of a unique carbon or nitrogen source.
FN (false-negative): model simulation predicts no growth while growth is observed experimentally in the presence of a unique carbon or nitrogen source.
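The four categories above can be tallied with a few lines of Python. This sketch assumes that simulated and experimental growth have already been reduced to booleans per substrate; the function names, substrate labels and outcomes are invented for illustration.

```python
def classify_growth_prediction(predicted_growth, observed_growth):
    """Return TP, FP, TN or FN for one substrate."""
    if predicted_growth and observed_growth:
        return "TP"
    if predicted_growth and not observed_growth:
        return "FP"
    if not predicted_growth and not observed_growth:
        return "TN"
    return "FN"


def confusion_counts(predicted, observed):
    """Count TP, FP, TN and FN over all substrates present in `predicted`."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for substrate, grows_in_silico in predicted.items():
        category = classify_growth_prediction(grows_in_silico, observed[substrate])
        counts[category] += 1
    return counts


# Hypothetical substrates: True means growth, False means no growth.
toy_predicted = {"substrate_1": True, "substrate_2": True,
                 "substrate_3": False, "substrate_4": False}
toy_observed = {"substrate_1": True, "substrate_2": False,
                "substrate_3": False, "substrate_4": True}
toy_counts = confusion_counts(toy_predicted, toy_observed)
```

Counts obtained in this way are the inputs to equation 8.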
Supporting Information
S1 Appendix. Overview of the eight constraint-based models of Mtb used in this study. A document providing additional detail about the eight Mtb GSMNs used in this study. (DOCX)
S2 Appendix. Curation of additional Thermodynamically Infeasible Cycles in sMtb2.0. A document providing a detailed description of the curation of TICs in sMtb2.0. (DOCX)
S3 Appendix. Curation of additional Thermodynamically Infeasible Cycles in iEK1011_2.0. A document providing a detailed description of the curation of TICs in iEK1011_2.0. (DOCX)
S1 Fig. ROC curves of gene essentiality predictions for Mtb GSMNs. Receiver operating characteristic curves for the gene essentiality predictions in (a) cholesterol minimal medium, (b) glycerol minimal medium, (c) 7H9 Middlebrook OADC medium, and (d) MtbYM medium. (TIF)
S2 Fig. Central role of succinate dehydrogenase in oxidation of odd-chain substrates. (TIF)
S1 Table. Set analysis of genes from all eight Mtb GSMNs. A table with pairwise comparisons, unions, and intersections of genes annotated in the eight GSMNs of Mtb. (XLSX)
S2 Table. Metabolic pathways associated with all the intersected gene sets of Mtb GSMNs. (XLSX)
S3 Table. Genes and metabolic pathways shared between GSMNs of Mtb. (XLSX)
S4 Table. Network topological properties of Mtb GSMNs. (XLSX)
S5 Table. List of unbalanced reactions and metabolites without formulas in the Mtb GSMNs. (XLSX)
S6 Table. List of blocked reactions and dead-end metabolites. (XLSX)
S7 Table. List of thermodynamically infeasible cycles identified for the eight Mtb GSMNs. (XLSX)
S8 Table. Transposon sequencing analysis of Mtb genes required for growth on minimal medium plus cholesterol. A list of Mtb genes classified as essential, non-essential, or uncertain for growth in cholesterol minimal medium. The essentiality analysis was obtained by applying the Bayesian/Gumbel method incorporated into the software TRANSIT. (XLSX)
S9 Table. Transposon sequencing analysis of Mtb genes required for growth on minimal medium plus glycerol. A list of Mtb genes classified as essential, non-essential, or uncertain for growth in glycerol minimal medium. The essentiality analysis was obtained by applying the Bayesian/Gumbel method incorporated into the software TRANSIT. (XLSX)
S10 Table. Predictive power of Mtb GSMNs to classify essential and non-essential genes on cholesterol minimal medium. Genes whose in silico knockouts give growth rates lower than 5% of the maximum growth rate are classified as essential; otherwise they are classified as non-essential. (XLSX)
S11 Table. Predictive power of Mtb GSMNs to classify essential and non-essential genes on glycerol minimal medium. Genes whose in silico knockouts give growth rates lower than 5% of the maximum growth rate are classified as essential; otherwise they are classified as non-essential. (XLSX)
S12 Table. Predictive power of Mtb GSMNs to classify essential and non-essential genes on 7H9 OADC medium. Genes whose in silico knockouts give growth rates lower than 5% of the maximum growth rate are classified as essential; otherwise they are classified as non-essential. (XLSX)
S13 Table. Predictive power of Mtb GSMNs to classify essential and non-essential genes on cholesterol minimal medium. Genes whose in silico knockouts give growth rates lower than 5% of the maximum growth rate are classified as essential; otherwise they are classified as non-essential. (XLSX)
S14 Table. List of common False Positive and False Negative genes of all the Mtb GSMNs during gene essentiality predictions. (XLSX)
S15 Table. Predictive power of Mtb GSMNs growing on sole carbon sources. (XLSX)
S16 Table. Predictive power of Mtb GSMNs growing on sole nitrogen sources. (XLSX)
S17 Table. Newly added reactions in the sMtb and iEK1011 models. (XLSX)
S18 Table. List of reactions with modified gene annotation in sMtb2.0. (XLSX)
S19 Table. List of reactions that participate in Thermodynamically Infeasible Cycles. (XLSX)
S20 Table. Null space of the stoichiometric matrix formed by unbounded reactions of sMtb2.0 and iEK1011_2.0. (XLSX)
S21 Table. Gibbs free energy change of Unbounded Reactions of updated Mtb GSMNs. (XLSX)
S22 Table. Gene essentiality predictions for iEK1011_2.0 and sMtb2.0 on four Mtb media. (XLSX)
S23 Table. Growth phenotypes of iEK1011_2.0 and sMtb2.0 on unique carbon and nitrogen sources. (XLSX)
S1 File. Updated Mtb models sMtb2.0 and iEK1011_2.0 in .mat and .xlsx format. (ZIP)
S2 File. Matlab script for checking the charge and mass balance of Mtb GSMNs. (ZIP)
S3 File. Matlab scripts for identifying unbounded reactions and thermodynamically infeasible cycles in Mtb GSMNs. (ZIP)
S4 File. Matlab scripts for identifying Mtb GSMNs with Energy Generating Cycles. (ZIP)
S5 File. Matlab scripts for gene essentiality analysis of Mtb GSMNs on four media conditions. (ZIP)
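The essentiality rule stated for S13 Table (a gene is essential when its in silico knockout grows at less than 5% of the maximum growth rate) can be sketched as follows. This is a minimal illustration in Python, not the authors' Matlab scripts from S5 File. The function name `classify_essentiality`, the gene identifiers, and the growth-rate values are hypothetical, and taking the unperturbed model's optimum as the maximum growth rate is an assumption.

```python
from typing import Dict


def classify_essentiality(
    knockout_growth: Dict[str, float],
    max_growth: float,
    threshold: float = 0.05,
) -> Dict[str, str]:
    """Label each gene from the growth rate predicted for its knockout.

    knockout_growth: gene -> growth rate of the single-gene knockout.
    max_growth: maximum growth rate (assumed here to be the unperturbed optimum).
    threshold: fraction of max_growth below which a gene is called essential.
    """
    if max_growth <= 0:
        raise ValueError("max_growth must be positive")
    cutoff = threshold * max_growth
    # Strictly below the cutoff is essential; everything else is non-essential.
    return {
        gene: "essential" if rate < cutoff else "non-essential"
        for gene, rate in knockout_growth.items()
    }


# Hypothetical values, for illustration only.
example = {"geneA": 0.0, "geneB": 0.001, "geneC": 0.05}
labels = classify_essentiality(example, max_growth=0.06)
print(labels)
```

The comparison is strict, so a knockout growing at exactly 5% of the maximum is labelled non-essential, matching the "less than 5%" wording of the caption.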
Author Contributions
Conceptualization: Víctor A. López-Agudelo, Dany JV Beste, Rigoberto Rios-Estepa.
Data curation: Víctor A. López-Agudelo.
Formal analysis: Víctor A. López-Agudelo, Emma Laing, Tom Mendum, Dany JV Beste, Rigoberto Rios-Estepa.
Funding acquisition: Dany JV Beste, Rigoberto Rios-Estepa.
Investigation: Víctor A. López-Agudelo, Emma Laing, Tom Mendum, Andres Baena, Luis F. Barrera, Dany JV Beste, Rigoberto Rios-Estepa.
Methodology: Víctor A. López-Agudelo, Dany JV Beste, Rigoberto Rios-Estepa.
Project administration: Rigoberto Rios-Estepa, Dany JV Beste.
Resources: Víctor A. López-Agudelo, Emma Laing, Tom Mendum, Andres Baena, Luis F. Barrera, Dany JV Beste, Rigoberto Rios-Estepa.
Software: Víctor A. López-Agudelo, Tom Mendum, Dany JV Beste.
Supervision: Dany JV Beste, Rigoberto Rios-Estepa.
Validation: Tom Mendum, Dany JV Beste, Rigoberto Rios-Estepa.
Visualization: Víctor A. López-Agudelo.
Writing – original draft: Víctor A. López-Agudelo, Andres Baena, Luis F. Barrera, Dany JV Beste, Rigoberto Rios-Estepa.
Writing – review & editing: Víctor A. López-Agudelo, Tom Mendum, Dany JV Beste, Rigoberto Rios-Estepa.

Funding
This work was supported by grants from COLCIENCIAS (1115-5693-3520) and the Medical Research Council (MR/K01224X/1).

Acknowledgments
Víctor A. López-Agudelo acknowledges the financial support provided by the Administrative Department of Science, Technology and Innovation (COLCIENCIAS), Colombia, during his Ph.D. studies (National Ph.D. scholarship Conv. 727-2015).
References 651
1. World Health Organisation. Global Health TB Report. 2018. 652
2. Deng M, Lv X-D, Fang Z-X, Xie X-S, Chen W-Y. The blood transcriptional signature for 653
active and latent tuberculosis. Infect Drug Resist. 2019;12: 321. 654
3. MacLean E, Broger T, Yerlikaya S, Fernandez-Carballo BL, Pai M, Denkinger CM. A 655
systematic review of biomarkers to detect active tuberculosis. Nat Microbiol. 2019;4: 748. 656
4. Cumming BM, Addicott KW, Adamson JH, Steyn AJC. Mycobacterium tuberculosis induces 657
decelerated bioenergetic metabolism in human macrophages. Elife. 2018;7: e39169. 658
5. Shi L, Jiang Q, Bushkin Y, Subbian S, Tyagi S. Biphasic Dynamics of Macrophage 659
Immunometabolism during Mycobacterium tuberculosis Infection. MBio. 2019;10: e02550-660
18. 661
6. Kalia NP, Shi Lee B, Ab Rahman NB, Moraski GC, Miller MJ, Pethe K. Carbon metabolism 662
modulates the efficacy of drugs targeting the cytochrome bc1:aa3 in Mycobacterium 663
tuberculosis. Sci Rep. 2019;9: 8608. doi:10.1038/s41598-019-44887-9 664
7. Howard NC, Marin ND, Ahmed M, Rosa BA, Martin J, Bambouskova M, et al. 665
Mycobacterium tuberculosis carrying a rifampicin drug resistance mutation reprograms 666
macrophage metabolism through cell wall lipid changes. Nat Microbiol. 2018;3: 1099. 667
8. Murima P, McKinney JD, Pethe K. Targeting bacterial central metabolism for drug 668
development. Chem Biol. 2014;21: 1423–1432. 669
.CC-BY 4.0 International licenseavailable under awas not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprint (whichthis version posted November 10, 2019. ; https://doi.org/10.1101/837401doi: bioRxiv preprint
https://doi.org/10.1101/837401http://creativecommons.org/licenses/by/4.0/
-
33 of 39
9. Bald D, Villellas C, Lu P, Koul A. Targeting Energy Metabolism in Mycobacterium 670
tuberculosis , a New Paradigm in Antimycobacterial Drug Discovery . MBio. 2017;8: e00272-671
17. doi:10.1128/mbio.00272-17 672
10. Beste DJ V, Hooper T, Stewart G, Bonde B, Avignone-Rossa C, Bushell ME, et al. GSMN-673
TB: a web-based genome-scale network model of Mycobacterium tuberculosis metabolism. 674
Genome Biol. 2007;8: R89. 675
11. Jamshidi N, Palsson BØ. Investigating the metabolic capabilities of Mycobacterium 676
tuberculosis H37Rv using the in silico strain iNJ 661 and proposing alternative drug targets. 677
BMC Syst Biol. 2007;1: 26. 678
12. Chindelevitch L, Stanley S, Hung D, Regev A, Berger B. MetaMerge: scaling up genome-679
scale metabolic reconstructions with application to Mycobacterium tuberculosis. Genome 680
Biol. 2012;13: r6. 681
13. Fang X, Wallqvist A, Reifman J. Modeling phenotypic metabolic adaptations of 682
Mycobacterium tuberculosis H37Rv under hypoxia. PLoS Comput Biol. 2012;8: e1002688. 683
14. Lofthouse EK, Wheeler PR, Beste DJV, Khatri BL, Wu H, Mendum TA, et al. Systems-based 684
approaches to probing metabolic variation within the Mycobacterium tuberculosis complex. 685
PLoS One. 2013;8: e75913. doi:10.1371/journal.pone.0075913 686
15. Rienksma RA, Suarez-Diez M, Spina L, Schaap PJ, dos Santos VAPM. Systems-level 687
modeling of mycobacterial metabolism for the identification of new (multi-) drug targets. 688
Seminars in immunology. Elsevier; 2014. pp. 610–622. 689
16. Brahmachari SK, Bhat AG, Kushwaha S, Bhardwaj A. Systems level mapping of metabolic 690
complexity in Mycobacterium tuberculosis to identify high-value drug targets. J Transl Med. 691
2014;12: 263. doi:10.1186/s12967-014-0263-5 692
17. Garay CD, Dreyfuss JM, Galagan JE. Metabolic modeling predicts metabolite changes in 693
Mycobacterium tuberculosis. BMC Syst Biol. 2015;9: 57. 694
18. Ma S, Minch KJ, Rustad TR, Hobbs S, Zhou S-L, Sherman DR, et al. Integrated modeling of 695
gene regulatory and metabolic networks in Mycobacterium tuberculosis. PLoS Comput Biol. 696
2015;11: e1004543. 697
19. Xavier JC, Patil KR, Rocha I. Integration of biomass formulations of genome-scale metabolic 698
models with experimental data reveals universally essential cofactors in prokaryotes. Metab 699
Eng. 2017;39: 200–208. 700
20. López-Agudelo VA, Baena A, Ramirez-Malule H, Ochoa S, Barrera LF, Ríos-Estepa R. 701
Metabolic adaptation of two in silico mutants of Mycobacterium tuberculosis during infection. 702
BMC Syst Biol. 2017;11. doi:10.1186/s12918-017-0496-z 703
21. Rienksma RA, Schaap PJ, Dos Santos VAPM, Suarez-Diez M. Modeling host-pathogen 704
.CC-BY 4.0 International licenseavailable under awas not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprint (whichthis version posted November 10, 2019. ; https://doi.org/10.1101/837401doi: bioRxiv preprint
https://doi.org/10.1101/837401http://creativecommons.org/licenses/by/4.0/
-
34 of 39
interaction to elucidate the metabolic drug response of intracellular mycobacterium 705
tuberculosis. Front Cell Infect Microbiol. 2019;9. doi:10.3389/fcimb.2019.00144 706
22. Rienksma RA, Schaap PJ, Martins dos Santos VAP, Suarez-Diez M. Modeling the metabolic 707
state of Mycobacterium tuberculosis upon infection. Front Cell Infect Microbiol. 2018;8: 264. 708
23. King ZA, Lu J, Dräger A, Miller P, Federowicz S, Lerman JA, et al. BiGG Models: A platform 709
for integrating, standardizing and sharing genome-scale models. Nucleic Acids Res. 2015;44: 710
D515–D522. 711
24. Kavvas ES, Seif Y, Yurkovich JT, Norsigian C, Poudel S, Greenwald WW, et al. Updated and 712
standardized genome-scale reconstruction of Mycobacterium tuberculosis H37Rv, iEK1011, 713
simulates flux states indicative of physiological conditions. BMC Syst Biol. 2018;12: 25. 714
25. Gu C, Kim GB, Kim WJ, Kim HU, Lee SY. Current status and applications of genome-scale 715
metabolic models. Genome Biol. 2019;20: 121. doi:10.1186/s13059-019-1730-3 716
26. Machado D, Herrgård MJ, Rocha I. Stoichiometric representation of gene–protein–reaction 717
associations leverages constraint-based analysis from reaction to gene-level phenotype 718
prediction. PLoS Comput Biol. 2016;12: e1005140. 719
27. Hädicke O, Klamt S. EColiCore2: a reference network model of the central metabolism of 720
Escherichia coli and relationships to its genome-scale parent model. Sci Rep. 2017;7: 39647. 721
28. Madigan CA, Martinot AJ, Wei J-R, Madduri A, Cheng T-Y, Young DC, et al. Lipidomic 722
analysis links mycobactin synthase K to iron uptake and virulence in M. tuberculosis. PLoS 723
Pathog. 2015;11: e1004792. 724
29. Chao A, Sieminski PJ, Owens CP, Goulding CW. Iron Acquisition in Mycobacterium 725
tuberculosis. Chem Rev. 2018;119: 1193–1220. 726
30. Pereira R, Nielsen J, Rocha I. Improving the flux distributions simulated with genome-scale 727
metabolic models of Saccharomyces cerevisiae. Metab Eng Commun. 2016;3: 153–163. 728
doi:10.1016/j.meteno.2016.05.002 729
31. Huss M, Holme P. Currency and commodity metabolites: Their identification and relation to 730
the modularity of metabolic networks. IET Syst Biol. 2006;1: 280–285. doi:10.1049/iet-731
syb:20060077 732
32. Thiele I, Palsson BØ. A protocol for generating a high-quality genome-scale metabolic 733
reconstruction. Nat Protoc. 2010;5: 93. 734
33. Heirendt L, Arreckx S, Pfau T, Mendoza SN, Richelle A, Heinken A, et al. Creation and 735
analysis of biochemical constraint-based models using the COBRA Toolbox v. 3.0. Nat 736
Protoc. 2019;14: 639. 737
34. Chan SHJ, Cai J, Wang L, Simons-Senftle MN, Maranas CD. Standardizing biomass reactions 738
.CC-BY 4.0 International licenseavailable under awas not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprint (whichthis version posted November 10, 2019. ; https://doi.org/10.1101/837401doi: bioRxiv preprint
https://doi.org/10.1101/837401http://creativecommons.org/licenses/by/4.0/
-
35 of 39
and ensuring complete mass balance in genome-scale metabolic models. Bioinformatics. 739
2017;33: 3603–3609. doi:10.1093/bioinformatics/btx453 740
35. Kumar VS, Dasika MS, Maranas CD. Optimization based automated curation of metabolic 741
reconstructions. BMC Bioinformatics. 2007;8: 212. 742
36. Thiele I, Vlassis N, Fleming RMT. fastGapFill: efficient gap filling in metabolic networks. 743
Bioinformatics. 2014;30: 2529–2531. 744
37. Benedict MN, Mundy MB, Henry CS, Chia N, Price ND. Likelihood-based gene annotations 745
for gap filling and quality assessment in genome-scale metabolic models. PLoS Comput Biol. 746
2014;10: e1003882. 747
38. Yousofshahi M, Ullah E, Stern R, Hassoun S. MC 3: a steady-state model and constraint 748
consistency checker for biochemical networks. BMC Syst Biol. 2013;7: 129. 749
39. Gopinath K, Moosa A, Mizrahi V, Warner DF. Vitamin B12 metabolism in Mycobacterium 750
tuberculosis. Future Microbiol. 2013;8: 1405–1418. doi:10.2217/fmb.13.113 751
40. Gopinath K, Venclovas Č, Ioerger TR, Sacchettini JC, McKinney JD, Mizrahi V, et al. A 752
vitamin B12 transporter in Mycobacterium tuberculosis. Open Biol. 2013;3: 120175. 753
doi:10.1098/rsob.120175 754
41. Minias A, Minias P, Czubat B, Dziadek J. Purifying selective pressure suggests the 755
functionality of a vitamin B12 biosynthesis pathway in a global population of mycobacterium 756
tuberculosis. Genome Biol Evol. 2018;10: 2326–2337. doi:10.1093/gbe/evy153 757
42. Young DB, Comas I, de Carvalho LPS. Phylogenetic analysis of vitamin B12-related 758
metabolism in Mycobacterium tuberculosis. Front Mol Biosci. 2015;2: 6. 759
doi:10.3389/fmolb.2015.00006 760
43. Noor E. Removing both Internal and Unrealistic Energy-Generating Cycles in Flux Balance 761
Analysis. arXiv Prepr arXiv180304999. 2018. 762
44. Schellenberger J, Lewis NE, Palsson B. Erratum: Elimination of thermodynamically infeasible 763
loops in steady-state metabolic models (Biophysical Journal (2010) 100 (544-553)). Biophys 764
J. 2011;100: 1381. doi:10.1016/j.bpj.2011.02.005 765
45. Price ND, Famili I, Beard DA, Palsson BØ. Extreme pathways and Kirchhoff’s second law. 766
Biophys J. 2002;83: 2879–2882. 767
46. Beard DA, Liang S, Qian H. Energy balance for analysis of complex metabolic networks. 768
Biophys J. 2002;83: 79–86. 769
47. Maranas CD, Zomorrodi AR. Optimization Methods in Metabolic Networks. Optimization 770
Methods in Metabolic Networks. John Wiley & Sons; 2016. doi:10.1002/9781119188902 771
.CC-BY 4.0 International licenseavailable under awas not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprint (whichthis version posted November 10, 2019. ; https://doi.org/10.1101/837401doi: bioRxiv preprint
https://doi.org/10.1101/837401http://creativecommons.org/licenses/by/4.0/
-
36 of 39
48. Fleming RMT, Thiele I, Provan G, Nasheuer HP. Integrated stoichiometric, thermodynamic 772
and kinetic modelling of steady state metabolism. J Theor Biol. 2010;264: 683–692. 773
49. Fritzemeier CJ, Hartleb D, Szappanos B, Papp B, Lercher MJ. Erroneous energy-generating 774
cycles in published genome scale metabolic networks: Identification and removal. PLoS 775
Comput Biol. 2017;13: e1005494. doi:10.1371/journal.pcbi.1005494 776
50. Basler G. Computational prediction of essential metabolic genes using constraint-based 777
approaches. Methods in Molecular Biology. Springer; 2015. pp. 183–204. doi:10.1007/978-778
1-4939-2398-4_12 779
51. Sassetti CM, Rubin EJ. Genetic requirements for mycobacterial survival during infection. Proc 780
Natl Acad Sci. 2003;100: 12989–12994. doi:10.1073/pnas.2134250100 781
52. Lamichhane G, Zignol M, Blades NJ, Geiman DE, Dougherty A, Grosset J, et al. A 782
postgenomic method for predicting essential genes at subsaturation levels of mutagenesis: 783
Application to Mycobacterium tuberculosis. Proc Natl Acad Sci. 2003;100: 7213–7218. 784
doi:10.1073/pnas.1231432100 785
53. Zhang YJ, Ioerger TR, Huttenhower C, Long JE, Sassetti CM, Sacchettini JC, et al. Global 786
Assessment of Genomic Regions Required for Growth in Mycobacterium tuberculosis. PLoS 787
Pathog. 2012;8: e1002946. doi:10.1371/journal.ppat.1002946 788
54. Griffin JE, Gawronski JD, DeJesus MA, Ioerger TR, Akerley BJ, Sassetti CM. High-789
resolution phenotypic profiling defines genes essential for mycobacterial growth and 790
cholesterol catabolism. PLoS Pathog. 2011;7: e1002251. doi:10.1371/journal.ppat.1002251 791
55. DeJesus MA, Zhang YJ, Sassetti CM, Rubin EJ, Sacchettini JC, Ioerger TR. Bayesian analysis 792
of gene essentiality based on sequencing of transposon insertion libraries. Bioinformatics. 793
2013;29: 695–703. doi:10.1093/bioinformatics/btt043 794
56. DeJesus MA, Gerrick ER, Xu W, Park SW, Long JE, Boutte CC, et al. Comprehensive 795
Essentiality Analysis of the Mycobacterium tuberculosis Genome via Saturating Transposon 796
Mutagenesis . MBio. 2017;8: e02133-16. doi:10.1128/mbio.02133-16 797
57. Minato Y, Gohl DM, Thiede JM, Chacón JM, Harcombe WR, Maruyama F, et al. Genome-798
wide assessment of Mycobacterium tuberculosis conditionally essential metabolic pathways. 799
BioRxiv. 2019;4: 534289. 800
58. VanderVen BC, Fahey RJ, Lee W, Liu Y, Abramovitch RB, Memmott C, et al. Novel 801
Inhibitors of Cholesterol Degradation in Mycobacterium tuberculosis Reveal How the 802
Bacterium’s Metabolism Is Constrained by the Intracellular Environment. PLoS Pathog. 803
2015;11: e1004679. doi:10.1371/journal.ppat.1004679 804
59. DeJesus MA, Ambadipudi C, Baker R, Sassetti C, Ioerger TR. TRANSIT - A Software Tool 805
for Himar1 TnSeq Analysis. PLoS Comput Biol. 2015;11: e1004401. 806
.CC-BY 4.0 International licenseavailable under awas not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made
The copyright holder for this preprint (whichthis version posted November 10, 2019. ; https://doi.org/10.1101/837401doi: bioRxiv preprint
https://doi.org/10.1101/837401http://creativecommons.org/licenses/by/4.0/
-
37 of 39
doi:10.1371/journal.pcbi.1004401 807
60. Chen WH, Lu G, Chen X, Zhao XM, Bork P. OGEE v2: An update of the online gene 808
essentiality database with special focus on differentially essential genes in human cancer cell 809
lines. Nucleic Acids Res. 2017;45: D940–D944. doi:10.1093/nar/gkw1013 810
61. de Carvalho LPS, Fischer SM, Marrero J, Nathan C, Ehrt S, Rhee KY. Metabolomics of 811
Mycobacterium tuberculosis reveals compartmentalized co-catabolism of carbon substrates 812
(Maybe Helpful Article about M. tuberculosis Untargeted Metabolomic of Profiling of C13-813
Labelle