A systematic evaluation of Mycobacterium tuberculosis ...Dany JV Beste 3,O and Rigoberto Rios-Estepa...

bioRxiv preprint doi: https://doi.org/10.1101/837401; this version posted November 10, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.

  • A systematic evaluation of Mycobacterium tuberculosis Genome-Scale Metabolic Networks

    Víctor A López-Agudelo 1,2, Emma Laing 3, Tom A Mendum 3, Andres Baena 2,4, Luis F Barrera 2,5, Dany JV Beste 3,o and Rigoberto Rios-Estepa 1,*

    1 Grupo de Bioprocesos, Departamento de Ingeniería Química, Universidad de Antioquia UdeA, Calle 70 No. 52-21, Medellín, Colombia

    2 Grupo de Inmunología Celular e Inmunogenética (GICIG), Facultad de Medicina, Universidad de Antioquia UdeA, Calle 70 No. 52-21, Medellín, Colombia

    3 Department of Microbial Sciences, Faculty of Health and Medical Sciences, University of Surrey, Guildford GU2 7XH, UK

    4 Departamento de Microbiología y Parasitología, Facultad de Medicina, Universidad de Antioquia UdeA, Calle 70 No. 52-21, Medellín, Colombia

    5 Instituto de Investigaciones Médicas, Facultad de Medicina, Universidad de Antioquia UdeA, Calle 70 No. 52-21, Medellín, Colombia

    o Corresponding Author: [email protected]

    * Corresponding Author: [email protected]

    Abstract

    The metabolism of Mycobacterium tuberculosis (Mtb), the causative agent of TB, has recently re-emerged as an attractive drug target. A powerful approach to studying Mtb metabolism is to use a systems biology framework, such as a Genome-Scale Metabolic Network (GSMN), that allows the dynamic interactions of the many individual components of metabolism to be interrogated together. Several GSMNs have been constructed for Mtb and used to study the complex relationship between Mtb genotype and phenotype. However, their utility is hampered by the existence of multiple models of varying properties and performance. Here we systematically evaluate eight recently published metabolic models of Mtb-H37Rv to facilitate model choice. The best-performing models, sMtb2018 and iEK1011, were refined and improved for use in future studies by the TB research community.

    Introduction

    Mycobacterium tuberculosis (Mtb) is the causative agent of the global tuberculosis (TB) epidemic, which is now the biggest infectious disease killer worldwide, causing 1.6 million deaths in 2017 alone [1]. Mtb is an unusual bacterial pathogen, as it is able to cause both acute life-threatening disease and clinically latent infections that can persist for the lifetime of the human host [2,3]. Metabolic reprogramming in response to the host niche during both the acute and the chronic phases of TB infection is a crucial determinant of virulence [4–7]. With the worldwide spread of multi- and extensively drug-resistant strains of Mtb thwarting the control of this global emergency, new drugs against Mtb are urgently needed, and metabolism presents an attractive target [8,9].

    Genome-scale constraint-based modelling has proved to be a powerful method to probe the metabolism of Mtb. The first Genome-Scale Metabolic Networks (GSMNs) of Mtb were published in 2007 by Beste (GSMN-TB) [10] and Jamshidi (iNJ661) [11]. They have been used as a platform for interrogating high-throughput ‘omics’ data, simulating bacterial growth, generating hypotheses and drug discovery, and are also the backbone for subsequent model iterations [12–20]. In 2014, Rienksma and colleagues developed a genome-scale model of Mtb metabolism (sMtb) building upon three previously published models [15]. More recently, Rienksma and colleagues utilized RNA sequencing data to provide condition-specific biomass reactions and identify metabolic drug responses during Mtb infection of human macrophages [21,22], thus providing a model better suited to exploring the metabolism of Mtb during human infection. In 2018 the first consolidated genome-scale metabolic network, iEK1011, was constructed using standardized nomenclature of metabolites and reactions from the BiGG model database [23,24]. The utility of Mtb GSMNs will continue to increase as more biological data are published and existing models are updated [25].

    With so many well-annotated GSMNs of Mtb available (Fig 1), a crucial first step in any genome-scale exploration of the metabolism of Mtb is the selection of an appropriate model. Here, we systematically evaluate the performance of eight recently published GSMNs of Mtb-H37Rv. In addition to comparing the models descriptively in terms of size, connectivity, number of blocked reactions and gaps in the network, we also identify thermodynamically infeasible cycles and energy-generating cycles that could significantly affect the accuracy of flux simulations. Using Flux Balance Analysis (FBA), we performed growth analysis and compared the models’ ability to predict gene essentiality when grown on different carbon and nitrogen sources, including growth on cholesterol, which is thought to be the major carbon source for intracellular Mtb within its human host macrophage.

    This work provides an inventory of the available GSMNs of Mtb and their utility in recapitulating aspects of Mtb metabolism. In addition, we present updated versions of the highest-performing models, iEK1011 and sMtb2018 (iEK1011_2.0 and sMtb2.0), which provide multi-scale simulation platforms for the TB research community and will therefore inform future studies on the metabolism of this deadly pathogen.

    Fig 1. The evolution of Genome-Scale Metabolic models of Mtb. Models highlighted in gray were analysed in this study. Numbers denote genes/intracellular reactions. Black circles indicate merged Mtb models.

    Results and Discussion

    Descriptive evaluation of the models

    Each of the GSMNs analysed in this study (Fig 1, S1 Appendix) combines knowledge from genome annotations, literature and measured biochemical compositions of Mtb. The complex linkage between genotype and phenotype is made by gene-protein-reaction (GPR) associations, implemented as Boolean rules that connect gene functions to enzyme complexes, isozymes or promiscuous enzymes, and then to biochemical reactions [26]. Using set theory, we computed the intersection between all sets of the models’ genes (Fig 2 and S1 Table). In accordance with expectations, the pairwise matrix (Fig 2) demonstrates that Mtb models constructed from the same ancestor (iNJ661 or GSMN-TB) are more similar (Fig 1, Fig 2). By contrast, the consolidated models iEK1011 and sMtb2018 share gene similarities (>60%, 60%,

    biosynthesis. Likewise, the iNJ661v_modified model (Fig 1) has poor annotation of genes involved in lipid metabolism (e.g. β-oxidation, cholesterol degradation, fatty acid biosynthesis, lipid biosynthesis and mycolic acid biosynthesis). These models therefore have limitations for in silico simulation of Mtb growing on these physiologically relevant lipid sources and for modelling growth in vivo.

    Fig 2. Pairwise matrix of shared genes among Mtb models. Values in black (genes in common between the models), blue (genes specific to the model indicated), and red (genes specific to the model indicated) represent the number and percentage of genes of the models specified on the y- and x-axes.
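The set-based comparison behind Fig 2 can be illustrated with a minimal sketch. The function name and the toy gene sets below are hypothetical, chosen only for illustration; in practice each set would be read from the corresponding model file, and this is not the authors' script.

```python
from itertools import combinations


def pairwise_gene_overlap(model_genes):
    """Return {(a, b): (shared, only_a, only_b)} for every pair of models."""
    overlap = {}
    for a, b in combinations(sorted(model_genes), 2):
        genes_a, genes_b = model_genes[a], model_genes[b]
        overlap[(a, b)] = (
            len(genes_a & genes_b),  # genes in common
            len(genes_a - genes_b),  # genes specific to model a
            len(genes_b - genes_a),  # genes specific to model b
        )
    return overlap


# Hypothetical toy gene sets (illustration only, not the real model contents).
toy_models = {
    "model_A": {"Rv0467", "Rv0896", "Rv1617"},
    "model_B": {"Rv0467", "Rv0896", "Rv3302c", "Rv0951"},
    "model_C": {"Rv0467", "Rv2940c"},
}
toy_overlap = pairwise_gene_overlap(toy_models)
```

Dividing each count by the size of the relevant gene set would give percentages of the kind reported alongside the counts in the pairwise matrix.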

    Checking mass and charge balances of biochemical reactions

    Currency metabolites such as water, protons and ATP, and cofactors such as NADH, NADPH, FADH2 and CoA, are ubiquitous and essential for metabolism. The inclusion of these metabolites in GSMNs and in the biomass reaction considerably improves phenotype predictions and is a hallmark of good-quality reconstructions [19,30]. To check this, we converted the GSMNs into substance graphs (using a local script), in which metabolites (nodes) are connected by undirected, unweighted edges if they appear in the same reaction [31], and computed node degrees (the number of edges connected to each node) (Table 2 and S4 Table).

    Table 2. Degree values for currency metabolites of Mtb GSMNs. PI: phosphate, PPI: diphosphate, ACP: acyl-carrier protein, MK: menaquinone.

    Currency metabolite  GSMN-TB1.1  iOSDD890  sMtb  iCG760  iSM810  iNJ661v_mod  iEK1011  sMtb2018
    H                    72          701       512   102     127     667          741      511
    CO2                  150         202       266   146     160     193          204      268
    H2O                  5           507       624   0       5       472          569      624
    ATP                  236         353       320   264     242     295          315      320
    AMP                  101         170       115   103     103     117          132      115
    PI                   189         293       266   213     196     268          293      267
    PPI                  175         235       180   177     180     168          196      181
    COA                  175         163       230   175     190     142          194      230
    ACP                  98          48        136   94      103     48           57       136
    NADH                 102         141       262   104     113     111          176      263
    NADPH                133         151       192   132     146     147          150      192
    FADH2                40          47        55    42      48      21           62       55
    MK                   37          20        11    37      37      20           18       20
    O2                   37          76        64    40      41      68           91       65
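The substance-graph construction described above (metabolites as nodes, an undirected and unweighted edge between any two metabolites that appear in the same reaction) can be sketched as follows. This is a minimal illustration and not the authors' local script; the function name and the three-reaction toy network are assumptions.

```python
from collections import defaultdict
from itertools import combinations


def substance_graph_degrees(reactions):
    """Degree of each metabolite in an undirected, unweighted substance graph.

    `reactions` maps a reaction ID to the metabolites that take part in it.
    Two metabolites are neighbours if they co-occur in at least one reaction.
    """
    neighbours = defaultdict(set)
    for metabolites in reactions.values():
        for m1, m2 in combinations(set(metabolites), 2):
            neighbours[m1].add(m2)
            neighbours[m2].add(m1)
    return {met: len(adj) for met, adj in neighbours.items()}


# Hypothetical three-reaction network (not taken from any of the models).
toy_reactions = {
    "HEX1": ["glc", "atp", "g6p", "adp", "h"],
    "PGI": ["g6p", "f6p"],
    "PFK": ["f6p", "atp", "fdp", "adp", "h"],
}
toy_degrees = substance_graph_degrees(toy_reactions)
```

Because neighbours are stored in sets, a pair of metabolites that co-occurs in several reactions still contributes a single edge, which matches the unweighted definition used for Table 2.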

    The degree values indicate that the GSMN-TB1.1, iCG760 and iSM810 models have the lowest participation of currency metabolites (Table 2). Water and protons were the most underrepresented metabolites (low degree values), suggesting that these models may not be correctly balanced. It is important that biochemical reactions are charge and mass balanced, because unbalanced reactions in GSMNs may allow protons or ATP to be produced out of nothing [32]. To test whether the Mtb GSMNs are mass and charge balanced, we used the COBRA Toolbox function “checkMassChargeBalance” [33]. Unfortunately, we were not able to perform this analysis for GSMN-TB1.1, iCG760 and iSM810 because these models lack standard metabolite formulas (Table 3). iEK1011 has the lowest number of imbalanced reactions (4), compared with sMtb2018 (8), sMtb (12), iNJ661v_modified (13) and iOSDD890 (78). The majority of the imbalanced reactions belong to cell wall biosynthetic pathways, including arabinogalactan, peptidoglycan and mycolic acid biosynthesis (S5 Table), reflecting the difficulty of reconstructing accurate metabolite formulas for complex cell wall components.
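To make the idea of a mass and charge balance check concrete, the sketch below tests a single reaction in plain Python. It is a stand-in written for illustration and is not the COBRA Toolbox implementation of “checkMassChargeBalance”; the function names are assumptions, and the parser only handles simple formulas without brackets or generic R groups, which is exactly where cell wall components become difficult.

```python
import re
from collections import Counter


def parse_formula(formula):
    """Count atoms in a simple formula such as 'C6H12O6' (no brackets or R groups)."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def reaction_imbalance(stoichiometry, formulas, charges):
    """Return (element imbalance, net charge) for one reaction.

    `stoichiometry` maps metabolite -> coefficient (negative for substrates).
    A balanced reaction returns ({}, 0).
    """
    elements = Counter()
    net_charge = 0
    for met, coeff in stoichiometry.items():
        for element, n in parse_formula(formulas[met]).items():
            elements[element] += coeff * n
        net_charge += coeff * charges[met]
    return {e: n for e, n in elements.items() if n != 0}, net_charge


# Hexokinase written with charged formulas: glc + atp -> g6p + adp + h
formulas = {"glc": "C6H12O6", "atp": "C10H12N5O13P3", "g6p": "C6H11O9P",
            "adp": "C10H12N5O10P2", "h": "H"}
charges = {"glc": 0, "atp": -4, "g6p": -2, "adp": -3, "h": 1}
hex1 = {"glc": -1, "atp": -1, "g6p": 1, "adp": 1, "h": 1}
hex1_no_proton = {"glc": -1, "atp": -1, "g6p": 1, "adp": 1}  # proton omitted

balanced = reaction_imbalance(hex1, formulas, charges)
unbalanced = reaction_imbalance(hex1_no_proton, formulas, charges)
```

The second call shows how a missing proton, the kind of omission suggested by the low proton degrees in Table 2, appears as one missing hydrogen and one missing unit of charge.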

    Another potential error in GSMNs is the molecular weight of biomass, which should be defined as 1 g/mmol. A discrepancy in biomass weight can arise from unbalanced reactions and affects the reliability of flux predictions made using Flux Balance Analysis (FBA). This effect can be amplified when host-pathogen interactions are simulated by integrating host and pathogen metabolic models [34]. Chang and colleagues (2017) developed a systematic algorithm for checking whether the biomass molecular weight of a GSMN deviates from 1 g/mmol. Using this algorithm [34] we found deviations of

    Of those models that contained a pathway for cholesterol degradation (GSMN-TB1.1, iCG760, iSM810, iEK1011, sMtb and sMtb2018), the GSMN-TB1.1 cholesterol degradation pathway contained a number of dead-end metabolites, making this model unsuitable for exploring the metabolism of this important in vivo carbon source.

    Table 3. Global features of the Mtb metabolic models analyzed in this study. DEM: Dead-End Metabolite, Metabolites wo F: Metabolites without formula, Imbal. Reactions: Imbalanced Reactions, URs: Unbounded Reactions, TICs: Thermodynamically Infeasible Cycles, diss. Flux: dissipation flux, MW: Molecular Weight, UD: Undetermined.

    Feature                  GSMN-TB1.1   iOSDD890     sMtb         iCG760       iSM810       iNJ661v_mod  iEK1011      sMtb2018
    Reactions                876          1152         1311         965          938          1054         1228         1321
    Intracellular Reactions  876          1055         1192         864          938          956          1118         1200
    Metabolites              667          961          1047         754          724          840          998          1049
    DEMs                     25           162          33           84           53           90           110          34
    Blocked Reactions        98 (11%)     290 (25%)    92 (7%)      89 (9%)      117 (12%)    153 (14%)    138 (11%)    92 (7%)
    Metabolites wo F         667          0            2            754          724          0            4            2
    Imbal. Reactions         UD           78           12           UD           UD           13           4            8
    URs                      27 (3%)      52 (5%)      70 (6%)      45 (5%)      33 (3.5%)    67 (7%)      20 (2%)      75 (6%)
    TICs                     8            17           16           8            8            23           4            17
    ATP diss. Flux           0            0            0            0.33         0            0            0            0
    GTP diss. Flux           0            0            0            0.33         0            0            0            0
    CTP diss. Flux           0            0            0            0.33         0            0            0            0
    UTP diss. Flux           0            0            0            0.33         0            0            0            0
    Biomass MW               UD           1.0070       1.0126       UD           UD           1.0094       1.0937       1.0126

    Thermodynamic and energetic properties

    Integrating thermodynamic data into GSMNs is extremely useful for checking the feasibility of reactions and their directionality [43,44]. Although Mtb GSMNs have been built using thermodynamic information, current Mtb GSMNs have never been checked for infeasible internal flux cycles (also named type III pathways), which are sets of reactions that do not exchange metabolites with the surroundings and therefore violate the second law of thermodynamics [43,45,46]. A tractable way to identify reactions participating in these thermodynamically infeasible cycles (TICs) is to define the set of reactions that can carry an unbounded metabolic flux under finite or even zero substrate uptake. Using Flux Variability Analysis (FVA), the Unbounded Reactions (URs) can be identified as those reactions with fluxes at the upper and/or lower bound constraints. We therefore identified the TICs using a local script following methodologies based on FVA and analysis of the null space of the stoichiometric matrices (S7 Table) [44,47]. Using this approach we show that models descended from GSMN-TB (GSMN-TB1.1, iCG760 and iSM810) have a lower percentage of unbounded reactions than models descended from iNJ661 (iOSDD890, iNJ661v_modified). Interestingly, the sMtb2018 model has an increased number of unbounded reactions compared with the original sMtb (Table 3).
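The null-space part of this analysis can be illustrated on a toy network. The sketch below is an assumption-laden illustration, not the authors' local script: it computes an exact basis of the right null space of a small internal stoichiometric matrix (exchange reactions excluded) using rational arithmetic. Each basis vector is a candidate internal loop; whether a loop can actually carry flux also depends on reaction directionality, which is not modelled here.

```python
from fractions import Fraction


def null_space(matrix):
    """Basis of the right null space of `matrix` (list of rows), by exact Gauss-Jordan."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_cols = len(rows[0])
    pivots = []
    r = 0
    for c in range(n_cols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column: it is a free variable
        rows[r], rows[pivot] = rows[pivot], rows[r]
        rows[r] = [x / rows[r][c] for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == len(rows):
            break
    basis = []
    for free in (c for c in range(n_cols) if c not in pivots):
        vec = [Fraction(0)] * n_cols
        vec[free] = Fraction(1)
        for row_idx, p in enumerate(pivots):
            vec[p] = -rows[row_idx][free]
        basis.append(vec)
    return basis


# Hypothetical internal network: R1: A -> B, R2: B -> C, R3: C -> A
# Rows are metabolites A, B, C; columns are reactions R1, R2, R3.
S_internal = [
    [-1, 0, 1],
    [1, -1, 0],
    [0, 1, -1],
]
cycles = null_space(S_internal)
```

Here the single basis vector assigns equal flux to R1, R2 and R3, that is, the closed loop A -> B -> C -> A that exchanges nothing with the surroundings.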

    Fritzemeier and colleagues demonstrated that over 85% of genome-scale models that lack exhaustive manual curation contain Energy Generating Cycles (EGCs) [48,49]. These cyclic net fluxes are entirely independent of nutrient uptake (exchange fluxes) and therefore have a substantial effect on the predictions of constraint-based analyses, as they effectively generate energy out of nothing. Using FBA with zero nutrient uptake [49] and maximizing energy dissipation reactions for ATP, GTP, CTP and UTP, we show that iCG760 is the only Mtb genome-scale model that contains EGCs (Table 3).
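The EGC test described above can be written as a linear programme. The notation below is an assumption introduced for illustration: S is the stoichiometric matrix extended with an energy dissipation reaction (for example, ATP + H2O -> ADP + Pi + H+), v is the flux vector, E is the set of exchange reactions, and lb and ub are the flux bounds of the remaining reactions.

```latex
\begin{aligned}
\max_{v}\quad & v_{\mathrm{diss}} \\
\text{s.t.}\quad & S\,v = 0, \\
& v_j = 0 \quad \forall j \in E, \\
& lb_j \le v_j \le ub_j \quad \forall j \notin E .
\end{aligned}
```

Under these assumptions, an optimal dissipation flux greater than zero means that the network can charge the energy carrier without taking up any nutrient. This corresponds to the non-zero dissipation fluxes reported for iCG760 in Table 3, whereas the other models return zero.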

    Gene essentiality metrics

    An effective and commonly employed predictive metric for GSMNs is the ability to reproduce high-throughput gene essentiality data [50]. Several high-throughput transposon mutagenesis screens have been performed for Mtb [51–57] under different in vitro conditions. To compare the models we used a transposon insertion sequencing dataset produced by Griffin et al. [54], in which genes that were differentially required for growth on cholesterol as compared with glycerol were identified. Cholesterol is an important intracellular source of carbon when Mtb is growing within its host, and cholesterol metabolism has been highlighted as a potential drug target [58]. Whilst Griffin and colleagues identified differentially required genes, these data were not used to identify the genes required for growth on cholesterol as an individual condition. We reanalyzed Griffin’s transposon sequencing data with the Bayesian/Gumbel method incorporated into the software TRANSIT [59] to identify genes required for growth on glycerol or on cholesterol (S8 and S9 Tables). Only genes categorized as essential (ES) or non-essential (NE) were considered for this analysis.

    We evaluated the overall predictive power of all the Mtb GSMNs against four high-throughput gene essentiality datasets [54,56,57] by computing the Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) (S1 Fig). For the six GSMNs that contain the cholesterol degradation pathway (GSMN-TB1.1, iCG760, iSM810, sMtb, sMtb2018 and iEK1011), the models that use GSMN-TB [10] as a core metabolic network had better predictive capacities on both cholesterol and glycerol minimal media than those using iNJ661 [11] (S10 and S11 Tables). However, the recently curated model iEK1011 had the highest predictive capability overall. The superiority of iEK1011 was also confirmed by comparing the predictive power of the models for Mtb grown in standard Middlebrook 7H9 OADC medium [56] (S1C Fig and S12 Table).

    We further used the essentiality dataset generated by Minato and colleagues, who identified conditionally essential genes on different in vitro media, including a rich medium called “MtbYM”, which contains several carbon sources, nitrogen sources, amino acids, nucleotide bases, cofactors and other nutrients [57]. Although this medium supported growth similarly to 7H9 medium, the overall predictive power of the Mtb GSMNs in this medium (S1D Fig and S13 Table) was 10% lower than in the other media (S1A-S1C Figs), probably because the biomass objective functions of the ancestral Mtb GSMNs were built and validated using growth on 7H9, 7H10, Youman’s, CAMR and Roisin’s minimal media [10,11]. Nevertheless, these analyses demonstrate the ability of these models to predict gene essentiality accurately even under unanticipated nutritional conditions.

    We identified the genes whose essentiality none of the GSMNs was able to assign correctly (S14 Table, Fig 3). Using a fixed threshold of 5% of the maximum wild-type growth rate (WTGR), we identified in silico essential and non-essential genes and, using experimental high-throughput gene essentiality data, identified True Positives (TP), True Negatives (TN), False Positives (FP: a gene that is essential for in silico growth but non-essential by Tn-seq) and False Negatives (FN: a gene that is non-essential in silico but essential according to the biological data). FN genes include genes known to have a major role in Mtb central carbon metabolism, e.g. icl (Rv0467, isocitrate lyase), gltA (Rv0896, citrate synthase), glpD2 (Rv3302c, glycerol-3-phosphate dehydrogenase), pyk (Rv1617, pyruvate kinase), and sucC and sucD (Rv0951 and Rv0952, succinyl-CoA ligase), among others (S14A Table). Some of these genes, e.g. icl and gltA, are considered conditionally essential in the Online GEne Essentiality (OGEE) database [60], because they are classified as NE in 7H10 medium but ES in minimal medium [53,55]. These false negatives may reflect the presence of alternative routes in silico that are not feasible in vivo because of regulatory constraints. However, they may also reflect inaccuracies in the predictive abilities of transposon mutagenesis studies. Some of the FP genes are involved in mycolic acid biosynthesis (S14B Table), e.g. mmaA2 (Rv0644c, cyclopropane mycolic acid synthase) and mas (Rv2940c, mycocerosic acid synthase). These genes were classified as ES in silico but as NE experimentally. This is a direct reflection of our incomplete knowledge of these mycolate-producing pathways and of the essentiality of the various mycolates produced.
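The classification rule described above can be sketched in a few lines. The function name, the gene labels and the growth values below are hypothetical, and the use of a strict "less than" comparison at the 5% threshold is an assumption, since the text does not state how ties are handled.

```python
def classify_essentiality(ko_growth, wt_growth, experimental, threshold=0.05):
    """Label each gene TP, TN, FP or FN using the conventions stated in the text.

    ko_growth: gene -> simulated growth rate of the single-gene knockout.
    experimental: gene -> "ES" (essential) or "NE" (non-essential).
    A gene is essential in silico if knockout growth < threshold * wt_growth.
    """
    labels = {}
    for gene, call in experimental.items():
        essential_in_silico = ko_growth[gene] < threshold * wt_growth
        if essential_in_silico:
            labels[gene] = "TP" if call == "ES" else "FP"
        else:
            labels[gene] = "FN" if call == "ES" else "TN"
    return labels


# Hypothetical knockout growth rates relative to a wild-type rate of 1.0.
toy_ko_growth = {"geneA": 0.0, "geneB": 0.9, "geneC": 0.01, "geneD": 0.8}
toy_experimental = {"geneA": "ES", "geneB": "NE", "geneC": "NE", "geneD": "ES"}
toy_labels = classify_essentiality(toy_ko_growth, 1.0, toy_experimental)
```

In this toy case geneC is a false positive (no growth in silico, non-essential experimentally) and geneD is a false negative (growth in silico, essential experimentally), mirroring the two error classes discussed in the paragraph.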

    Fig 3. False Negative and False Positive predictions in the evaluated media. (a) Venn diagram for predicted false negative (FN) genes; (b) Venn diagram for predicted false positive (FP) genes. Genes in the gray box represent the intersection of all FN and FP genes in the four media. Genes are classified as true positives (TP) if model simulation predicts no growth when essential genes are deleted, false positives (FP) if model simulation predicts no growth when non-essential genes are deleted, true negatives (TN) if model simulation predicts growth when non-essential genes are deleted, and false negatives (FN) if model simulation predicts growth when essential genes are deleted.

    Growth metrics

    Mtb is able to metabolise several carbon and nitrogen sources both in vitro and when growing in the host [61–64], and we therefore evaluated the growth metrics of the Mtb GSMNs on 30 sole carbon and 17 sole nitrogen sources (Fig 4). The in silico results were compared with available in vitro experimental data from Biolog Phenotype MicroArrays and minimal media [14,65]. Interestingly, the recent consolidated models, iEK1011 and sMtb, had the poorest performance of all the models in predicting growth of Mtb on sole carbon and nitrogen sources (Fig 4A and 4B). A fundamental issue with the Mtb models descended from iNJ661 is that they all require glycerol for growth, as this is present in the biomass formulation. Both iEK1011 and sMtb were unable to grow in silico on cholesterol, acetate, oleate, palmitate or propionate as sole carbon sources. We hypothesised that this results from inaccuracies in the reactions associated with redox metabolism and oxidative phosphorylation, and specifically that including menaquinone-dependent reactions such as fumarate reductase and succinate dehydrogenase would correct this anomaly. To test this hypothesis we added a menaquinone-dependent succinate dehydrogenase reaction to sMtb (Q[c] + SUCC[c] -> QH2[c] + FUM[c]). In support of our hypothesis, this corrected the in silico growth phenotype of Mtb growing on acetate, cholesterol, propionate and fatty acids (Fig 4A and S15 Table). Although iEK1011 contains a fumarate reductase reaction that is linked to menaquinone/demethylmenaquinone, it does not contain a menaquinone-dependent succinate dehydrogenase (the reverse reaction). As was the case for sMtb, the addition of a new menaquinone-dependent succinate dehydrogenase reaction (mqn8[c] + succ[c] -> fum[c] + mql8[c]) to iEK1011 significantly improved its growth predictions on sole carbon sources (S15I and S15J Tables). These simulations are supported by experimental data demonstrating that fumarate reductase and succinate dehydrogenase are essential for the survival of Mtb on glycolytic and non-glycolytic substrates [66,67]. Succinate dehydrogenase is a bifunctional enzyme that is part of both the TCA cycle and complex II of the electron transport chain, coupling the oxidation of succinate to fumarate with the corresponding reduction of membrane-localized quinone electron carriers [67,68]. Mtb has multiple succinate dehydrogenases and fumarate reductases that are essential for its survival during hypoxia [66,67,69–71]. Succinate is central to much of Mtb’s lipid metabolism: host-derived cholesterol, odd-chain fatty acids and methyl-branched amino acids all generate propionyl-CoA, which can be channeled into the methylcitrate cycle to produce succinate (S2 Fig), while acetyl-CoA produced by β-oxidation of host-derived even-chain fatty acids is metabolized through the glyoxylate shunt, also producing succinate (S2 Fig). Succinate oxidation by succinate dehydrogenases is therefore a critical step, as the enzyme couples the TCA cycle with the electron transport chain and oxidative phosphorylation [69]. Having multiple succinate dehydrogenases provides Mtb with the metabolic flexibility to survive within the human host, where the redox and nutritional environments are liable to vary.

    Whilst carbon metabolism has been intensively studied in vitro and ex vivo, attention has only recently been directed to nitrogen metabolism [72–75]. As with carbon sources, iEK1011 and sMtb were poor at predicting growth on sole nitrogen sources (Fig 4B, Matthews Correlation Coefficient (MCC) = 0.54 and 0.48, respectively). However, as with carbon, adding the menaquinone-linked succinate dehydrogenase reaction to iEK1011 and sMtb significantly improved the predictive power of these models on sole nitrogen sources (S16I and S16J Tables). Specifically, correct growth predictions were obtained for Mtb growing on the branched-chain amino acids isoleucine and valine and on proline (Fig 4B). A possible explanation for these correct predictions is that complete degradation of these amino acids converges on succinate, via the methylcitrate cycle (isoleucine and valine) or the GABA shunt (proline), and succinate dehydrogenase then couples the TCA cycle with oxidative phosphorylation.


    Figure 4. Predictive capacity of Mtb genome-scale models for the utilization of sole carbon and nitrogen sources; (a) Growth predictions of Mtb genome-scale models using sole carbon sources; (b) Growth predictions of Mtb genome-scale models using sole nitrogen sources. Model performance was evaluated by computing the Matthews Correlation Coefficient (MCC). Experimental growth data were obtained from [14,65]. The 1 and 0


    values represent growth and no growth on a specific substrate, respectively. Carbon or nitrogen substrates are classified as TP if model simulation predicts growth while growth is observed experimentally, FP if model simulation predicts growth while no growth is observed experimentally, TN if model simulation predicts no growth while no growth is observed experimentally, and FN if model simulation predicts no growth while growth is observed experimentally.

    Refining Mtb GSMNs

    Overall, iEK1011 and sMtb2018 were the best GSMNs in terms of genetic background, network topology, number of blocked reactions, mass- and charge-balanced reactions and gene essentiality predictions (Fig 2, Table 2, Table 3, and S1 Fig), and we therefore selected these models to refine further. iEK1011 has the advantage of using standardized BiGG nomenclature for metabolites and can therefore easily be integrated with the human GSMN Recon3D [76] to simulate intracellular growth, while sMtb2018 supports in silico growth in a wider variety of nutritional conditions. Our analysis also highlighted some fundamental issues with these models, which we addressed in order to improve the performance of these exemplar GSMNs.

    As already discussed, the new sMtb2018 includes an updated respiratory chain with menaquinone and menaquinol as electron carriers in all respiratory chain reactions, and selected ubiquinone-dependent reactions in sMtb were changed to menaquinone-dependent. Six new menaquinone-dependent reactions were included in the sMtb model, e.g., succinate dehydrogenase, cytochrome bc1, fumarate reductase, and malate dehydrogenase [22]. This improved the predictive growth metric of sMtb and, importantly, allowed in silico growth on cholesterol (see sMtb2018, Fig 4A). Similarly, we added a menaquinone-dependent succinate dehydrogenase to iEK1011 to improve the performance of this model when growing on fatty acids and cholesterol. Further improvements were made to cholesterol metabolism by updating both models to include the biochemical degradation of the C and D rings of cholesterol, which was not known when these models were reconstructed (S17A and S17B Tables) [77].


    Although the functionality of the vitamin B12 biosynthetic pathway remains undefined, elevated non-synonymous nucleotide substitution rates recently found in cobalamin-dependent enzymes of 3798 Mtb strains suggest positive selection on these genes and a possibly functional biosynthetic pathway [41]. Therefore, we facilitated co-factor metabolism in the models by unblocking B12 synthesis and including the co-factors biotin and pyridoxal-5-phosphate to enhance the phenotype prediction of sMtb2018 and iEK1011, as recommended by Xavier and colleagues [19]. As iEK1011 contains glycerol within its biomass formulation, which prevents in silico growth in glycerol-replete media, we incorporated the iNJ661v_mod biomass objective function into this model (S17B Table). We also added genes to sMtb2018 based on iOSDD890 genes that were classified as true negatives and true positives by the gene essentiality analysis (S18 Table). Fifty-one reactions belonging to pathways such as glycolysis, gluconeogenesis, the TCA cycle, amino acid metabolism and the mycolic acid pathway were improved in terms of GPR annotation. All updates for the new models sMtb2.0 and iEK1011_2.0 are summarized in S17 Table.

    The sMtb2.0 and iEK1011_2.0 models were curated to remove reactions that may participate in TICs (see Methods; S19 Table). The stoichiometric matrix associated with these reactions was built and its null space basis was computed (S20 Table) [44,47], identifying twelve TICs in sMtb2.0 (Fig 5A, S2 Appendix) and seven TICs in iEK1011_2.0 (Fig 5B, S3 Appendix). The major TIC of sMtb2.0 was within the reactions of folate metabolism catalysed by the thymidylate synthase (thyA and thyX) and dihydrofolate reductase (DFRA) enzymes, a step that is essential for de novo glycine and purine biosynthesis and for the conversion of deoxyuridine monophosphate (dUMP) to deoxythymidine monophosphate (dTMP) (Fig 5A). As recommended by Pereira and colleagues [30], we modified the model to use NADPH/NADP+ by retaining the DFRA2 and DFRA4 reactions and eliminating the NADH/NAD+-dependent reactions DFRA1 and DFRA3. Our thermodynamic calculations suggested that THYA and THYX ($\Delta_r G_{min} = -123$ kJ/mol, $\Delta_r G_{max} = -9.3$ kJ/mol and $\Delta_r G_{min} = -160$ kJ/mol, $\Delta_r G_{max} = -33$ kJ/mol, respectively) are irreversible in the forward direction (S21 Table).


    Our analysis showed that two ubiquinone oxidoreductases (QRr and NADH2r) and a transhydrogenase reaction (NADTRHD) formed a TIC within iEK1011_2.0 because both QRr and NADH2r were reversible. The Gibbs free energy computations suggest that these reactions should be unidirectional in the direction of ubiquinol production ($\Delta_r G_{min} = -134.9$ kJ/mol, $\Delta_r G_{max} = -20.7$ kJ/mol and $\Delta_r G_{min} = -131.7$ kJ/mol, $\Delta_r G_{max} = -20.1$ kJ/mol, respectively), and we therefore changed the model accordingly to eliminate this TIC.

    Figure 5. Thermodynamically infeasible cycles (TICs) found in sMtb2.0 and iEK1011_2.0 with proposed modifications; (a) TIC in folate metabolism in sMtb2.0; (b) TIC affecting ubiquinone oxidoreductases in iEK1011_2.0.

    The sMtb2.0 encompasses 1321 reactions, 1054 metabolites and 989 genes, while iEK1011_2.0 comprises 1237 reactions, 977 metabolites and 1013 genes.


    Table 4. Genome-scale model features for sMtb2.0 and iEK1011_2.0.

    | Metrics | sMtb2.0 | iEK1011_2.0 |
    |---|---|---|
    | Genes | 989 | 1013 |
    | Reactions | 1321 | 1237 |
    | Intracellular Reactions | 1198 | 1119 |
    | Exchange Reactions | 122 | 119 |
    | Metabolites | 1054 | 977 |
    | Unbounded Reactions, n (%) | 49 (4 %) | 20 (1.8 %) |
    | MCC Cholesterol minimal medium | 0.55 | 0.63 |
    | Evaluated Genes | 834 | 866 |
    | True Positive Genes | 209 | 210 |
    | True Negative Genes | 449 | 503 |
    | False Positive Genes | 58 | 21 |
    | False Negative Genes | 118 | 132 |
    | MCC Glycerol minimal medium | 0.54 | 0.62 |
    | Evaluated Genes | 834 | 866 |
    | True Positive Genes | 207 | 206 |
    | True Negative Genes | 449 | 503 |
    | False Positive Genes | 58 | 21 |
    | False Negative Genes | 120 | 136 |
    | MCC 7H9 medium | 0.56 | 0.69 |
    | Evaluated Genes | 984 | 1006 |
    | True Positive Genes | 202 | 216 |
    | True Negative Genes | 600 | 667 |
    | False Positive Genes | 96 | 45 |
    | False Negative Genes | 86 | 78 |
    | MCC MtbYM medium | 0.43 | 0.52 |
    | Evaluated Genes | 827 | 858 |
    | True Positive Genes | 153 | 152 |
    | True Negative Genes | 458 | 509 |
    | False Positive Genes | 51 | 17 |
    | False Negative Genes | 165 | 180 |
    | MCC Unique Carbon Source | 0.58 | 0.67 |
    | Evaluated Metabolites | 30 | 30 |
    | True Positive Metabolites | 20 | 20 |
    | True Negative Metabolites | 5 | 6 |
    | False Positive Metabolites | 4 | 3 |
    | False Negative Metabolites | 1 | 1 |
    | MCC Unique Nitrogen Source | 0.75 | 0.75 |
    | Evaluated Metabolites | 17 | 17 |
    | True Positive Metabolites | 11 | 11 |
    | True Negative Metabolites | 4 | 4 |
    | False Positive Metabolites | 2 | 2 |
    | False Negative Metabolites | 0 | 0 |

    As in the sections above, the predictive capability of the modified models was evaluated by comparing simulated gene essentiality predictions with experimental data. We used 5% of the wild-type specific growth rate as the essentiality threshold: if deletion of a gene gave a specific growth rate of no more than 5% of the wild-type, the gene was considered essential; otherwise it was considered non-essential (S22 Table). iEK1011_2.0 has the highest predictive performance for gene essentiality in the four media conditions tested (glycerol and cholesterol minimal medium, Middlebrook 7H9, and MtbYM


    medium) compared with all the Mtb GSMNs evaluated, including sMtb2.0 (MCC higher by 0.08 in cholesterol and glycerol minimal media, and by 0.13 in 7H9 medium; Table 4).

    The ability of these models to predict growth on sole carbon and nitrogen sources is also better in the updated GSMNs than in their predecessors. These results show that iEK1011_2.0 and sMtb2.0 are currently the most suitable models for studying host-pathogen interactions, as they are able to correctly predict growth on several important carbon sources that Mtb uses intracellularly [62].

    By systematically evaluating eight recent Mtb GSMNs, we have identified a number of areas in which model veracity can be improved. Mtb models descended from GSMN-TB (GSMN-TB1.1, iSM810 and iCG760) contain many reactions that are unbalanced by charge and mass, often because protons and water have not been accounted for. We observed that all of the Mtb models have dead-end metabolites, particularly in cofactor metabolism and related pathways. The best performing models were identified as sMtb2018 and iEK1011 by virtue of higher gene coverage, fewer blocked reactions, better mass and charge balances, and the best overall predictive power for gene essentiality data.

    Whilst our analysis identified iEK1011 and sMtb2018 as the best performing GSMNs overall, we also highlighted areas for improvement. For example, the iEK1011 model had poor predictive power for growth on sole carbon and nitrogen sources and, unlike Mtb, cannot grow when cholesterol and fatty acids are provided as sole carbon sources. We thus updated these two models by the addition of new reactions, gap filling of cofactor metabolism, and the identification and curation of TICs, to give sMtb2.0 (derived from sMtb2018) and iEK1011_2.0 (derived from iEK1011).

    The improved GSMNs, sMtb2.0 and iEK1011_2.0, are now available (S1 File) to simulate and predict the metabolic adaptation of Mtb (by the integration of omics data) in a wide range of in vitro and in vivo intracellular conditions. We encourage researchers to continue to curate these models as new data and methods become available. Improved GSMNs, including macrophage-Mtb models, provide a platform for more reliable simulations and ultimately a better understanding of the underlying biology of Mtb.

    Methods


    All simulations were conducted on a laptop running Windows 10 (Microsoft) using MATLAB 2016a (MathWorks Corporation, Natick, Massachusetts, USA), COBRA Toolbox version 3.0 [33] and Gurobi Optimizer version 7.5.2 (Gurobi Optimization, Inc., Houston, Texas, USA). All code written for this study is available in supplementary information (S2-S5 Files).

    Genome-scale models of Mtb

    Models were obtained from the supplementary information of published papers and modified as follows:

    GSMN-TB1.1 – from [14] supplementary info.

    iOSDD890 – from [16] supplementary info.

    sMtb – from [15] supplementary info. Modifications included the addition of exchange reactions to allow constraints by growth medium components.

    iCG760 – from [17] supplementary info.

    iSM810 – from [18] supplementary info.

    iNJ661v_modified – from [19] supplementary info.

    sMtb2018 – from [22] supplementary info.

    iEK1011 – from [24] supplementary info.

    Network connectivity evaluation

    GSMNs of Mtb were transformed into substrate networks by local scripts after removing the biomass reaction. Node-specific topology metrics were computed using the NetworkAnalyzer plugin [78] in Cytoscape 3.4 [79]. Two main topological parameters were evaluated for each model: 1) the node degree of each metabolite and 2) the clustering coefficient (S4 Table).


    The MC3 Consistency Checker algorithm [38] was used to identify singly connected and dead-end metabolites, and zero-flux reactions, in each metabolic network model of Mtb. The algorithm uses stoichiometry to identify metabolites that are connected only once in each metabolic network, and uses flux variability analysis (FVA) to identify reactions that cannot carry flux [80].

    Biomass Molecular Weight Check

    The consistency of the biomass molecular weight was tested by running the script of Chan and colleagues [34]. A biomass reaction is not standardized when the molecular weight of the biomass formula is not equal to 1 g/mmol. However, the accuracy of the result relies on correct chemical formulae for the metabolites in the tested GSMNs.

    Identification of Unbounded Reactions (URs)

    A straightforward way to identify all reactions that participate in one or more TICs is to perform flux variability analysis (FVA). Infeasible loops appear as sets of reactions able to carry an unbounded metabolic flux under finite, or even zero, substrate uptake. The URs are those reactions whose fluxes, under FVA [80], reach the values set by the upper and/or lower bound constraints [47]. We therefore performed FVA on the eight Mtb GSMNs with all medium uptake constraints set to 1.0 mmol/gDW/h.
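The UR criterion described above can be illustrated with a minimal Python sketch. It is not the study's MATLAB code: it assumes the FVA minimum and maximum fluxes have already been computed, that unconstrained reactions carry a default bound of ±1000 (an assumption, not a value stated in the text), and the reaction identifiers are hypothetical.

```python
def find_unbounded_reactions(fva_ranges, big_bound=1000.0, tol=1e-6):
    """Return sorted IDs of reactions whose FVA range reaches +/- big_bound.

    fva_ranges maps a reaction ID to (min_flux, max_flux) from FVA.
    big_bound is the assumed default bound of unconstrained reactions.
    """
    unbounded = []
    for rxn_id, (flux_min, flux_max) in fva_ranges.items():
        hits_lower = flux_min <= -big_bound + tol
        hits_upper = flux_max >= big_bound - tol
        if hits_lower or hits_upper:
            unbounded.append(rxn_id)
    return sorted(unbounded)


# Hypothetical FVA output in mmol/gDW/h (invented IDs and values).
toy_fva = {
    "LOOP_A": (-1000.0, 1000.0),  # reversible loop member, hits both bounds
    "LOOP_B": (0.0, 1000.0),      # irreversible loop member, hits upper bound
    "EX_glycerol": (-1.0, 0.0),   # uptake limited to 1.0, not unbounded
    "BIOMASS": (0.0, 0.05),       # finite growth flux
}
unbounded = find_unbounded_reactions(toy_fva)
print(unbounded)
```

Reactions limited by a finite uptake constraint are not flagged, because only fluxes that reach the large default bound are treated as unbounded.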

    Identification of the core set of TICs

    Schellenberger and colleagues [81] described a methodology for identifying the core set of TICs, which forms the basis of all such possible cycles. This core can be obtained by computing a basis for the null space of the stoichiometric matrix (all possible thermodynamically infeasible cycles lie in the null space of the stoichiometric matrix). Consequently, the set of all reactions that we had previously identified as participating in TICs was used to build a stoichiometric matrix. The null space basis of this matrix was then computed, and the different cycles composed of two or more reactions were identified by a local script.
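As an illustration of the null space step, the following Python sketch computes an exact null space basis for a small stoichiometric matrix by Gauss-Jordan elimination and lists the reactions with non-zero coefficients in each basis vector. The four-reaction network and its identifiers are hypothetical, and this is only a sketch of the idea, not the local script used in the study; note also that basis vectors returned this way need not be elementary cycles in larger networks.

```python
from fractions import Fraction


def null_space_basis(matrix):
    """Return a basis of the null space of `matrix` (list of rows) via exact RREF."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_cols = len(rows[0])
    pivots = []
    r = 0
    for c in range(n_cols):
        if r == len(rows):
            break
        pivot_row = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot_row is None:
            continue  # no pivot in this column: it is a free column
        rows[r], rows[pivot_row] = rows[pivot_row], rows[r]
        pivot_value = rows[r][c]
        rows[r] = [x / pivot_value for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    basis = []
    for free in (c for c in range(n_cols) if c not in pivots):
        vec = [Fraction(0)] * n_cols
        vec[free] = Fraction(1)
        for row_idx, p in enumerate(pivots):
            vec[p] = -rows[row_idx][free]
        basis.append(vec)
    return basis


# Hypothetical internal reactions: R1: A->B, R2: B->C, R3: C->A, R4: A->B (duplicate of R1).
reactions = ["R1", "R2", "R3", "R4"]
S = [
    [-1, 0, 1, -1],  # metabolite A
    [1, -1, 0, 1],   # metabolite B
    [0, 1, -1, 0],   # metabolite C
]
basis = null_space_basis(S)
cycles = [[reactions[j] for j, coef in enumerate(vec) if coef != 0] for vec in basis]
print(cycles)
```

In this toy case one basis vector corresponds to the three-reaction loop and the other to the two-reaction loop formed by the duplicated A to B conversion, with R1 running in reverse.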

    Checking the existence of Energy Generating Cycles (EGCs)


    Energy generating cycles (EGCs) can exist in metabolic networks and can charge energy metabolites such as ATP, GTP, CTP, and UTP without any input of nutrients; their elimination is therefore essential for correcting energy metabolism [49,82]. Fritzemeier and colleagues developed a methodology for identifying whether genome-scale models contain EGCs [49]. We applied it in two steps: 1) addition of energy dissipation reactions (EDRs) for ATP, GTP, CTP and UTP in the form H2O[c] + XTP[c] -> H[c] + XDP[c] + Pi[c], and 2) maximization of each EDR flux $v_d$ while no substrate uptake is allowed into the model, as follows:

    $$
    \begin{aligned}
    \max \quad & v_d && (1) \\
    \text{s.t.} \quad & S v = 0 && (2) \\
    & \forall i \notin E: \; v_i^{LB} \le v_i \le v_i^{UB} && (3) \\
    & \forall i \in E: \; v_i = 0 && (4)
    \end{aligned}
    $$

    Here, $S$ is the stoichiometric matrix, $v$ the vector of fluxes, $d$ the index of the EDR, $v_i^{LB}$ and $v_i^{UB}$ the lower and upper bounds of reaction $i$, respectively, and $E$ the set of indices of all exchange reactions of the model. If the optimal value of this optimization is $v_d > 0$, the genome-scale model contains at least one EGC that is able to generate energy metabolites such as ATP, GTP, CTP, or UTP.

    Curation of TICs

    Two types of modifications were performed on the curated Mtb genome-scale metabolic network in order to eliminate TICs [47].

    i) TICs formed by linearly dependent reversible reactions: usually, these arise when there are two reactions (NAD+- and NADP+-dependent) with the same catalytic activity. In this instance, we forced the use of NADPH/NADP+ in anabolic reactions and NADH/NAD+ in catabolic reactions, as recommended by Pereira and colleagues [30]. If two irreversible


    reactions that catalyze the forward and backward directions exist, both reactions (and their GPR rules) are lumped together into a single reversible reaction.

    ii) TICs formed by erroneous directionality assignments: we restricted the reaction directionality based on the Gibbs free energy change (NExT algorithm) [83,84], as long as gene essentiality predictions were not compromised.

    The latter modification was based on the NExT (network-embedded thermodynamic analysis) algorithm [83,84]. This algorithm identifies new irreversible reactions by calculating the thermodynamically feasible ranges of the Gibbs energies of reaction and of metabolite concentrations. NExT was applied in MATLAB [84] to the reactions participating in TICs, with physiological conditions adapted for Mtb (Table 5). The standard Gibbs energy of formation ($\Delta_f G_i$, in kJ/mol), number of hydrogen atoms, and charge of all metabolites involved in TICs were obtained from the biochemical thermodynamics calculator eQuilibrator [85,86].

    Table 5. Biophysical properties and concentration ranges for intracellular Mtb.

    | Properties | Values | Reference |
    |---|---|---|
    | Redox potential, cytosol | -275 mV | Bhaskar et al., 2014 |
    | pH, intracellular | 5.7 | Zhang et al., 2003 |
    | pH, extracellular (activated macrophage) | 4.5 | Rohde et al., 2007; Vandal et al., 2009 |
    | Ionic strength | 0.15 M | Kümmel et al., 2006; Martínez et al., 2014 |
    | Oxygen concentration (mM) | 0.0001 – 0.1 | Martínez et al., 2014 |
    | [CO2], [Pi] (mM) | 1 – 100 | Kümmel et al., 2006; Haraldsdóttir et al., 2012 |
    | Other metabolites (mM) | 0.0001 – 0.1 | Martínez et al., 2014 |
    | NADH/NAD | 0.0001 – 0.1 | Martínez et al., 2014 |
    | NADPH/NADP | 0.0001 – 0.1 | Martínez et al., 2014 |


    If a reaction specified as reversible in the set of TICs has a maximum $\Delta_r G$ calculated to be negative, the reaction is considered to occur in the forward direction. In contrast, if the minimum $\Delta_r G$ is positive, the reaction is considered to occur in the reverse direction. No direction can be inferred when the minimum $\Delta_r G$ is negative and the maximum is positive. Changes in reaction directionality were made only when gene essentiality predictions in the curated genome-scale model were not compromised.
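The directionality rule above can be written as a short Python sketch. The function name is hypothetical, and the sketch covers only the sign test on the feasible Gibbs energy range; the additional check that gene essentiality predictions are not compromised is not modelled here. The example values are the ranges reported in the Results for THYA and for the first ubiquinone oxidoreductase.

```python
def infer_direction(dg_min, dg_max):
    """Infer reaction direction from the feasible range of the Gibbs energy (kJ/mol)."""
    if dg_max < 0:
        return "forward"       # the whole range is negative
    if dg_min > 0:
        return "reverse"       # the whole range is positive
    return "undetermined"      # the range spans zero


print(infer_direction(-123.0, -9.3))    # range reported for THYA
print(infer_direction(-134.9, -20.7))   # range reported for a ubiquinone oxidoreductase
print(infer_direction(-15.0, 12.0))     # invented range that spans zero
```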

    Gene Essentiality Analysis

    To identify essential genes of Mtb grown under individual conditions (cholesterol minimal medium and glycerol minimal medium) from the data of Griffin and colleagues [54], we used the Bayesian/Gumbel method of TRANSIT, version 2.02 [59]. The Bayesian/Gumbel method determines the posterior probability of essentiality of each gene (zbar). When zbar is 1, or close to 1, the gene is considered essential (ES); when zbar is 0, or close to 0, the gene is considered non-essential (NE); uncertain (U) genes are those with intermediate zbar values between 0 and 1; and for genes that are too small (S), zbar is -1. After loading the TA count files (replicates for cholesterol and glycerol) and the gene annotation file into TRANSIT and running the Gumbel method with default parameters, we obtained an output file with essentiality results (S9 and S10 Tables). Uncertain (U) and too small (S) genes were not taken into account in the in silico essentiality analysis. Minato and colleagues used the same statistical method to classify essential genes [57]. In contrast, DeJesus and colleagues [56] used a Hidden Markov Model-based statistical method to classify genes into four essentiality states: essential (ES), growth defect (GD), non-essential (NE), and growth advantage (GA). Because we used only binary classifiers to evaluate how well the Mtb GSMNs predict gene essentiality data, we reclassified these genes into two groups: NE genes included NE and GA genes, and ES genes included GD and ES genes.

    For the in silico gene essentiality analysis, we set the simulation conditions (asparagine, phosphate, sodium, ammonium, citrate, sulfate, zinc, calcium, chloride, Fe3+, Fe2+, and glycerol or cholesterol) according to Griffin minimal medium [54], 7H9 OADC medium, and “MtbYM” medium, and an FBA-based gene essentiality analysis was performed on the eight Mtb models using the “single gene


    deletion” function of the COBRA Toolbox. Default maximization of the biomass objective function was used to predict growth in all models. If deletion of a gene gave a specific growth rate of no more than 5% of the wild-type, the gene was considered essential in silico; otherwise it was considered non-essential.

    In silico gene essentiality predictions were categorized as true-positive, false-positive, true-negative, or false-negative by comparing the in silico data with experimental essentiality data:

    TP (true-positive): model simulation predicts no growth when an essential gene is deleted.

    FP (false-positive): model simulation predicts no growth when a non-essential gene is deleted.

    TN (true-negative): model simulation predicts growth when a non-essential gene is deleted.

    FN (false-negative): model simulation predicts growth when an essential gene is deleted.
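The 5% growth threshold and the four categories above can be combined in a small Python sketch. It assumes that knockout growth rates have already been simulated; the function names, gene labels and growth values are all hypothetical and serve only to show the classification logic.

```python
from collections import Counter


def is_essential_in_silico(ko_growth, wt_growth, threshold=0.05):
    """A gene is essential in silico if knockout growth is <= threshold * wild-type growth."""
    return ko_growth <= threshold * wt_growth


def confusion_label(predicted_essential, observed_essential):
    """Map a (prediction, observation) pair to TP, FP, TN or FN."""
    if predicted_essential and observed_essential:
        return "TP"
    if predicted_essential and not observed_essential:
        return "FP"
    if not predicted_essential and not observed_essential:
        return "TN"
    return "FN"


wt_growth = 0.05  # hypothetical wild-type specific growth rate (1/h)
ko_growth = {"geneA": 0.0, "geneB": 0.001, "geneC": 0.05, "geneD": 0.04}   # invented
observed = {"geneA": True, "geneB": False, "geneC": False, "geneD": True}  # invented

labels = {
    gene: confusion_label(is_essential_in_silico(rate, wt_growth), observed[gene])
    for gene, rate in ko_growth.items()
}
counts = Counter(labels.values())
print(labels)
print(dict(counts))
```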

    To evaluate the performance of the Mtb GSMNs we used the sensitivity, specificity, accuracy, and Matthews correlation coefficient (MCC) metrics:

    $$\mathrm{sensitivity} = \frac{TP}{TP + FN} \quad (5)$$

    $$\mathrm{specificity} = \frac{TN}{TN + FP} \quad (6)$$

    $$\mathrm{accuracy} = \frac{TP + TN}{TP + FP + TN + FN} \quad (7)$$

    $$\mathrm{MCC} = \frac{(TP \cdot TN) - (FP \cdot FN)}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \quad (8)$$
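Equations 5 to 8 can be checked numerically with a short Python sketch. The function name is hypothetical; the counts used in the example are the sole carbon and sole nitrogen source counts reported for sMtb2.0 in Table 4. Returning 0.0 when a denominator is zero is a convention assumed here, not something specified in the text.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Compute sensitivity, specificity, accuracy and MCC from confusion counts."""
    def ratio(num, den):
        return num / den if den else 0.0  # assumed convention for empty classes

    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "mcc": ratio(tp * tn - fp * fn, mcc_den),
    }


# Counts for sMtb2.0 taken from Table 4.
carbon = classification_metrics(tp=20, fp=4, tn=5, fn=1)
nitrogen = classification_metrics(tp=11, fp=2, tn=4, fn=0)
print(round(carbon["mcc"], 2), round(nitrogen["mcc"], 2))
```

Rounded to two decimals, both values agree with the MCC entries listed for sMtb2.0 in Table 4.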


    Utilization of carbon and nitrogen sources

    The methodology for modeling the effect of different carbon and nitrogen sources on Mtb growth was adapted from Lofthouse and colleagues [14]. The classification of the Biolog Phenotype MicroArray experiments was also obtained from Lofthouse and colleagues, 2013, who classified growth and no growth on different carbon and nitrogen sources from the original Biolog data of Khatri and colleagues and from Roisin’s minimal media [14,65]. In addition, we used the Roisin’s minimal media data that were also obtained by Lofthouse and colleagues.

    To model the carbon source experiment, we simulated the media as a modified form of Roisin’s minimal media containing unlimited quantities of ammonia, phosphate, iron, sulfate and carbon dioxide, and a Biolog carbon source influx of 1 mmol/gDW/h. Similarly, the nitrogen source experiment was simulated using a modified form of Roisin’s media in which ammonia was replaced with 1 mmol/gDW/h of the Biolog nitrogen source and pyruvate was used as the carbon source (influx of 1 mmol/gDW/h).

    To compare the utilization of carbon and nitrogen sources in all Mtb models with experimental data, we used the Matthews correlation coefficient (equation 8).

    In silico growth predictions on carbon and nitrogen sources were also categorized as true-positive, false-positive, true-negative, or false-negative:

    TP (true-positive): model simulation predicts growth while growth is observed experimentally (or respiration is observed in Biolog phenotype microarrays) in the presence of a unique carbon or nitrogen source.

    FP (false-positive): model simulation predicts growth while no growth is observed experimentally in the presence of a unique carbon or nitrogen source.

    TN (true-negative): model simulation predicts no growth while no growth is observed experimentally in the presence of a unique carbon or nitrogen source.


    FN (false-negative): model simulation predicts no growth while growth is observed experimentally in the presence of a unique carbon or nitrogen source.

    Supporting Information

    S1 Appendix. Overview of the eight constraint-based models of Mtb used in this study. A document providing additional detail about the eight Mtb GSMNs used in this study. (DOCX)

    S2 Appendix. Curation of additional Thermodynamic Infeasible Cycles in sMtb2.0. A document providing a detailed description of the curation of TICs in sMtb2.0. (DOCX)

    S3 Appendix. Curation of additional Thermodynamic Infeasible Cycles in iEK1011_2.0. A document providing a detailed description of the curation of TICs in iEK1011_2.0. (DOCX)

    S1 Fig. ROC curves of gene essentiality predictions for Mtb GSMNs. (a) Receiver operating characteristic curve for the gene essentiality predictions in cholesterol minimal medium; (b) receiver operating characteristic curve for the gene essentiality predictions in glycerol minimal medium; (c) receiver operating characteristic curve for gene essentiality predictions in 7H9 Middlebrook OADC medium; (d) receiver operating characteristic curve for gene essentiality predictions in MtbYM medium. (TIF)

    S2 Fig. Central role of succinate dehydrogenase in oxidation of odd-chain substrates. (TIF)

    S1 Table. Set analysis of genes from all the eight Mtb GSMNs. A table with pairwise comparisons, unions, and intersections of genes annotated in the eight GSMNs of Mtb. (XLSX)

    S2 Table. Metabolic pathways associated with all the intersected gene sets of Mtb GSMNs. (XLSX)

    S3 Table. Genes and metabolic pathways shared between GSMNs of Mtb. (XLSX)

    S4 Table. Network topological properties of Mtb GSMNs. (XLSX)


    S5 Table. List of unbalanced reactions and metabolites without formulas in the Mtb GSMNs. (XLSX)

    S6 Table. List of blocked reactions and dead-end metabolites. (XLSX)

    S7 Table. List of thermodynamically infeasible cycles identified for the eight Mtb GSMNs. (XLSX)

    S8 Table. Transposon sequencing analysis of Mtb genes required for growth on minimal medium plus cholesterol. A list of Mtb genes classified as essential, non-essential, or uncertain for growth in cholesterol minimal medium. The essentiality analysis was performed with the Bayesian/Gumbel method implemented in the TRANSIT software. (XLSX)

    S9 Table. Transposon sequencing analysis of Mtb genes required for growth on minimal medium plus glycerol. A list of Mtb genes classified as essential, non-essential, or uncertain for growth in glycerol minimal medium. The essentiality analysis was performed with the Bayesian/Gumbel method implemented in the TRANSIT software. (XLSX)

    S10 Table. Predictive power of Mtb GSMNs for classifying essential and non-essential genes on cholesterol minimal medium. Genes whose in silico knockout gives a growth rate below 5% of the maximum growth rate are classified as essential; all others are classified as non-essential. (XLSX)

    S11 Table. Predictive power of Mtb GSMNs for classifying essential and non-essential genes on glycerol minimal medium. Genes whose in silico knockout gives a growth rate below 5% of the maximum growth rate are classified as essential; all others are classified as non-essential. (XLSX)

    S12 Table. Predictive power of Mtb GSMNs for classifying essential and non-essential genes on 7H9 OADC medium. Genes whose in silico knockout gives a growth rate below 5% of the maximum growth rate are classified as essential; all others are classified as non-essential. (XLSX)


    S13 Table. Predictive power of Mtb GSMNs for classifying essential and non-essential genes on cholesterol minimal medium. Genes whose in silico knockout gives a growth rate below 5% of the maximum growth rate are classified as essential; all others are classified as non-essential. (XLSX)

    S14 Table. List of false-positive and false-negative genes common to all the Mtb GSMNs in gene essentiality predictions. (XLSX)

    S15 Table. Predictive power of Mtb GSMNs growing on sole carbon sources. (XLSX)

    S16 Table. Predictive power of Mtb GSMNs growing on sole nitrogen sources. (XLSX)

    S17 Table. Reactions newly added to the sMtb and iEK1011 models. (XLSX)

    S18 Table. List of reactions with modified gene annotation in sMtb2.0. (XLSX)

    S19 Table. List of reactions that participate in thermodynamically infeasible cycles. (XLSX)

    S20 Table. Null space of the stoichiometric matrix formed by the unbounded reactions of sMtb2.0 and iEK1011_2.0. (XLSX)

    S21 Table. Gibbs free energy change of the unbounded reactions of the updated Mtb GSMNs. (XLSX)

    S22 Table. Gene essentiality predictions for iEK1011_2.0 and sMtb2.0 on four Mtb media. (XLSX)

    S23 Table. Growth phenotypes of iEK1011_2.0 and sMtb2.0 on sole carbon and nitrogen sources. (XLSX)

    S1 File. Updated Mtb models sMtb2.0 and iEK1011 in .mat and .xlsx formats. (ZIP)

    S2 File. MATLAB script for checking the charge and mass balance of Mtb GSMNs. (ZIP)

    S3 File. MATLAB scripts for identifying unbounded reactions and thermodynamically infeasible cycles in Mtb GSMNs. (ZIP)


    S4 File. MATLAB scripts for identifying Mtb GSMNs with Energy Generating Cycles. (ZIP)

    S5 File. MATLAB scripts for gene essentiality analysis of Mtb GSMNs on four media conditions. (ZIP)

    Author Contributions

    Conceptualization: Víctor A. López-Agudelo, Dany JV Beste, Rigoberto Rios-Estepa.

    Data curation: Víctor A. López-Agudelo.

    Formal analysis: Víctor A. López-Agudelo, Emma Laing, Tom Mendum, Dany JV Beste, Rigoberto Rios-Estepa.

    Funding acquisition: Dany JV Beste, Rigoberto Rios-Estepa.

    Investigation: Víctor A. López-Agudelo, Emma Laing, Tom Mendum, Andres Baena, Luis F. Barrera, Dany JV Beste, Rigoberto Rios-Estepa.

    Methodology: Víctor A. López-Agudelo, Dany JV Beste, Rigoberto Rios-Estepa.

    Project administration: Rigoberto Rios-Estepa, Dany JV Beste.

    Resources: Víctor A. López-Agudelo, Emma Laing, Tom Mendum, Andres Baena, Luis F. Barrera, Dany JV Beste, Rigoberto Rios-Estepa.

    Software: Víctor A. López-Agudelo, Tom Mendum, Dany JV Beste.

    Supervision: Dany JV Beste, Rigoberto Rios-Estepa.

    Validation: Tom Mendum, Dany JV Beste, Rigoberto Rios-Estepa.

    Visualization: Víctor A. López-Agudelo.

    Writing – original draft: Víctor A. López-Agudelo, Andres Baena, Luis F. Barrera, Dany JV Beste, Rigoberto Rios-Estepa.

    Writing – review & editing: Víctor A. López-Agudelo, Tom Mendum, Dany JV Beste, Rigoberto Rios-Estepa.


    Funding

    This work was supported by grants from COLCIENCIAS (1115-5693-3520) and the Medical Research Council (MR/K01224X/1).

    Acknowledgments

    Víctor A. López-Agudelo acknowledges the financial support provided by the Administrative Department of Science, Technology and Innovation (COLCIENCIAS), Colombia, during his Ph.D. studies (National Ph.D. scholarship Conv. 727-2015).
