The Eli and Edythe L. Broad Institute A Collaboration of Massachusetts Institute of Technology,...

  • Slide 1

The Eli and Edythe L. Broad Institute
A Collaboration of Massachusetts Institute of Technology, Harvard University and affiliated Hospitals, and Whitehead Institute for Biomedical Research
Lessons learned from the genome-scale metabolic reconstruction and curation of Neurospora crassa
Jeremy Zucker, Jonathan Dreyfuss, Heather Hood, James Galagan

  • Slide 2

Capture Metabolic Knowledge
Pathway Tools/BioCyc
KEGG
Reactions
Interactions
Literature

  • Slide 3

Visualizing omics Data
Provide a visually intuitive, metabolic framework for interpreting large omics datasets

  • Slide 4

in silico Predictions
Algorithmically Interpret Expression Data in a Metabolic Context?

  • Slide 5

Example: Plasmodium
Validation
    KO phenotype predictions: 90% accuracy
    External metabolite changes: 70% accuracy
New predictions
    40 enzymatic drug targets
    Experimental validation of novel target
E-Flux*
*Colijn, C., A. Brandes, J. Zucker, et al. (2009). PLoS Comput Biol

  • Slide 6

Modeling in the Neurospora P01
[Diagram labels: Clock; Visualization and Analysis; Profiling (RNA-Seq, ChIP-Seq); Interpretation of Expression Profiling and Regulatory Network Data in a Metabolic Context; Inform Experiments]

  • Slide 7

BUILDING THE MODEL

  • Slide 8

Manual reconstruction protocol
Nature Protocols, Vol. 5, No. 1. (07 January 2010), pp. 93-121.

  • Slide 9

Automated Model SEED reconstruction pipeline
Nature Biotechnology, Vol. 28, No. 9. (29 September 2010), pp. 977-982

  • Slide 10

Genome sequence to metabolic model
[Diagram labels: Pathways; Literature; Nutrient media (Vogel's); NeurosporaCyc; Elements; Metadata; Complexes; Reactions; Transporters; Biomass composition]

  • Slide 11

EFICAz2 predicts enzymes
[Diagram labels: Decision tree; Databases; HMMs; FDR; SVM]
9934 protein sequences
1993 enzymes
1770 reactions
BMC Bioinformatics 2009, 10:107

  • Slide 12

Protein Complex editor
182 reactions with isozymes or complexes
31 complexes experimentally validated through literature search
Identify multiple genes of reaction
Allow curator to validate potential complexes
Present all possible combinations of complexes
Examples shown:
    2-oxoisovalerate complex: 2-oxoisovalerate alpha subunit, 2-oxoisovalerate beta subunit
    Fatty acid synthase complex: fatty acid synthase beta subunit dehydratase, fatty acid synthase alpha subunit reductase

  • Slide 13

Transport inference parser (TIP)
9934 free-text protein annotations
176 transporters assigned to 97 transport reactions
Filter proteins for transporters
Infer multimeric complex
Infer substrate
Infer energy-coupling mechanism
Example annotations: MFS glucose transporter, ATP synthase, sucrose transporter
Bioinformatics (2008) 24 (13): i259-i267.

  • Slide 14

PathoLogic predicts pathways
1770 enzyme-catalyzed reactions
265 pathways
X = number of reactions in the MetaCyc pathway
Y = number of reactions with enzyme evidence
Z = number of unique reactions in the pathway
P(X|Y|Z) = probability of the pathway in Neurospora
Science 293:2040-4, 2001.
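The Protein Complex editor slide describes presenting all possible combinations of a reaction's genes so that a curator can validate them. A minimal sketch of that enumeration step, assuming a candidate complex is any subset of two or more gene products assigned to the same reaction. The function name `candidate_complexes` and the data layout are hypothetical and are not taken from Pathway Tools.

```python
from itertools import combinations


def candidate_complexes(gene_products):
    """Return every subset of two or more gene products, smallest subsets first.

    Hypothetical helper: it only enumerates candidates; a curator would still
    decide which of them are real complexes.
    """
    products = sorted(set(gene_products))
    return [
        combo
        for size in range(2, len(products) + 1)
        for combo in combinations(products, size)
    ]


# Subunit names are taken from the slide; grouping them under a single
# reaction is an illustrative assumption.
subunits = [
    "2-oxoisovalerate alpha subunit",
    "2-oxoisovalerate beta subunit",
]

for combo in candidate_complexes(subunits):
    print(" + ".join(combo))
```

With n gene products the number of candidates is 2^n - n - 1, so a real tool would probably need to prune this list for reactions with many genes.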
  • Slide 15

Literature curation validates predictions
1212 citations associated with
    307 pathways
    31 complexes
    168 genes

  • Slide 16

Neurospora Cellular overview

  • Slide 17

NEUROSPORACYC

  • Slide 18

New feature on Broad website

  • Slide 19

NeurosporaCyc Cellular overview

  • Slide 20

NeurosporaCyc cellular overview

  • Slide 21

Google Maps-like zoomable interface

  • Slide 22

Highlight genes on overview

  • Slides 23-24

[No text extracted]

  • Slide 25

NeurosporaCyc Omics Viewer

  • Slide 26

Omics data mapped onto metabolism

  • Slides 27-28

[No text extracted]

  • Slide 29

Omics data mapped onto Genome

  • Slides 30-31

[No text extracted]

  • Slide 32

DEBUGGING THE BUG

  • Slide 33

The problem with EC numbers

Reaction class                         Number of reactions: Neurospora (MetaCyc)
Balanced normal reactions              993 (4585)
Generic reactions                      198 (688)
Protein modification reactions         82 (469)
Reactions with instanceless classes    80 (228)
Generic redox reactions                36 (212)
Polymeric reactions                    24 (91)
Polymerization pathway reactions       11 (17)

  • Slide 34

Generic Reactions

  • Slide 35

3.6.1.42 instance of 3.6.1.6?

  • Slide 36

Protein Modification reactions

  • Slide 37

Reactions with instanceless classes

  • Slide 38

Solution: Instantiate classes

  • Slide 39

Generic Redox reactions

  • Slide 40

Polymeric reactions

  • Slide 41

Polymerization Pathway reactions

  • Slide 42

Solution: Instantiate polymerization steps
POLYMER-INST-Fatty-Acids-C16 + coenzyme A + ATP -> POLYMER-INST-Saturated-Fatty-Acyl-CoA-C16 + diphosphate + AMP + H+
POLYMER-INST-Fatty-Acids-C14 + coenzyme A + ATP -> POLYMER-INST-Saturated-Fatty-Acyl-CoA-C14 + diphosphate + AMP + H+
POLYMER-INST-Fatty-Acids-C0 + coenzyme A + ATP -> POLYMER-INST-Saturated-Fatty-Acyl-CoA-C0 + diphosphate + AMP + H+

  • Slide 43

What happens when the metabolic network is infeasible?
Add a reaction with the smallest number of reactants and products that results in a feasible model:

$$\min_{v,\,r}\ \operatorname{card}(r) \quad \text{subject to} \quad Sv + r = 0, \quad l \le v \le u$$

  • Slide 44

Fast Automated Reconstruction of Metabolism
Input:
    EFICAz probabilities for each reaction
    Biomass components
    Experimental growth / no growth phenotypes in different nutrient conditions
    Gene essentiality
    Manual curation of pathways
Output:
    Metabolic network of MetaCyc reactions maximally consistent with input

  • Slide 45

VALIDATING THE MODEL WITH IN SILICO KNOCKOUT PREDICTIONS

  • Slide 46

Neurospora phenotypes for validation
Neurospora e-Compendium
    29 mutants essential on minimal media
    Non-essential on supplemental media
P01 Phenotype Collection
    79 non-essential KOs under minimal media
    Additional phenotypes are observed.
Used FBA with Neurospora model to simulate gene knockouts in minimal medium

  • Slide 47

Neurospora phenotype prediction results

                          Predicted Essential    Predicted Non-Essential
Observed Essential        22 (TN)                7 (FP)
Observed Non-Essential    14 (FN)                65 (TP)

Precision      TP / (TP + FP)                     90%
Recall         TP / (TP + FN)                     82%
Specificity    TN / (TN + FP)                     76%
Accuracy       (TP + TN) / (TP + TN + FP + FN)    81%

  • Slide 48

Comparison of model organisms under minimal media

                                 Yeast (iND750) [1]    E. coli (iAF1260) [2]    Neurospora
Viable, predicted/observed       439/455 = 96%         993/1022 = 97%           65/79 = 82%
Essential, predicted/observed    35/109 = 32%          159/238 = 67%            22/29 = 76%
Overall accuracy                 84%                   91%                      81%

[1] Genome Res. 2004. 14: 1298-1309
[2] Molecular Systems Biology 2007 3:121

  • Slide 49

MODELING THE EFFECT OF OXYGEN LIMITATION ON XYLOSE FERMENTATION

  • Slide 50

Biofuels from Neurospora?
Growing interest in obtaining biofuels from fungi
Neurospora crassa has more cellulolytic enzymes than Trichoderma reesei
N. crassa can degrade cellulose and hemicellulose to ethanol [Rao83]
Simultaneous saccharification and fermentation means that N. crassa is a possible candidate for consolidated bioprocessing
[Diagram labels: Xylose; Ethanol]

  • Slide 51

Effects of Oxygen limitation on Xylose fermentation in Neurospora crassa
Zhang, Z., Qu, Y., Zhang, X., Lin, J., March 2008. Effects of oxygen limitation on xylose fermentation, intracellular metabolites, and key enzymes of Neurospora crassa AS3.1602. Applied Biochemistry and Biotechnology 145 (1-3), 39-51.
[Diagram labels: Xylose; Glycolysis; Pyruvate; Respiration; TCA; Fermentation; Ethanol]
[Plot: ethanol conversion (%) versus oxygen level (mmol/L*g), with regions labeled Low O2, Intermediate O2, High O2]

  • Slide 52

Model of Xylose Fermentation
[Diagram labels: Pentose phosphate; Aerobic respiration; Fermentation; TCA Cycle; Xylose; Oxygen; Ethanol; ATP]
Two paths from xylose to xylitol

  • Slide 53

High Oxygen
Oxygen=5, ATP=16.3
[Diagram labels: Pentose phosphate; Aerobic respiration; Fermentation; TCA Cycle; NADPH Regeneration; NADPH & NAD+ Utilization; NAD+ Regeneration]

  • Slide 54

Low Oxygen
Oxygen=0
[Diagram labels: Pentose phosphate; Aerobic respiration; Fermentation; TCA Cycle; Ethanol]

  • Slide 55

Intermediate Oxygen: Optimal Ethanol
Oxygen=0.5, ATP=2.8
All O2 used to regenerate NAD used in first step
[Diagram labels: Pentose phosphate; Aerobic respiration; Fermentation; TCA Cycle; Ethanol; NADPH & NAD Utilization; NAD Regeneration; NADPH Regeneration]

  • Slide 56

Intermediate Oxygen: Optimal Ethanol
Oxygen=0.5, ATP=2.8
All O2 used to regenerate NAD used in first step
Bottleneck: Pyruvate decarboxylase
Improve NADH enzyme
[Diagram labels: Pentose phosphate; Aerobic respiration; Fermentation; TCA Cycle; Ethanol; NADPH & NAD Utilization; NAD Regeneration; NADPH Regeneration]

  • Slide 57

USING E-FLUX TO PREDICT DRUG TARGETS BY INTEGRATING EXPRESSION DATA WITH FBA

  • Slide 58

E-Flux explanation

  • Slide 59

Application of E-Flux to TB

  • Slides 60-61

[No text extracted]

  • Slide 62

Next Steps
Annotation: use phenotype predictions to improve model
NeurosporaCyc: use E-Flux to interpret the effect of the clock genetic regulatory program on metabolism
Validation: add additional phenotypes

  • Slide 63

Acknowledgements
Neurospora P01 Project: Heather Hood, Jonathan Dreyfuss, James Galagan
SRI: Peter Karp, Mario Latendresse, Markus Krummenacker, Ingrid Keseler, Tomer Altman, Suzanne Paley, Ron Caspi, Mike Travers

  • Slide 64

Fast Automated Reconstruction of Metabolism (FARM)
[Diagram labels: Gene Calls (Broad); Protein Complex prediction; Transport predictor (TIP); Pathway prediction (PathoLogic); Enzyme prediction (EFICAz); Literature curation (CAP); Nutrient media (Vogel's); NeurosporaCyc]

  • Slide 65

Fast Automated Reconstruction of Metabolism (FARM)
846 Reactions
640 Metabolites
564 Genes
[Diagram labels: EFICAz predictions; Pathway predictions; Nutrient conditions; Biomass composition; Protein complexes; Transport]
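The knockout validation percentages on Slide 47 can be recomputed from the four confusion-matrix counts given there. The sketch below does this in Python, treating "non-essential" as the positive class, which is how the slide's TP/TN labels read. The function name `knockout_metrics` is hypothetical, and specificity is computed as TN / (TN + FP), the standard definition and the one that reproduces the slide's 76%.

```python
# Confusion-matrix counts as reported on Slide 47.
# Positive class = non-essential (viable knockout), per the slide's labels.
TP, TN, FP, FN = 65, 22, 7, 14


def knockout_metrics(tp, tn, fp, fn):
    """Return precision, recall, specificity, and accuracy as fractions."""
    return {
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }


metrics = knockout_metrics(TP, TN, FP, FN)
for name, value in metrics.items():
    # Rounded to whole percentages to match the slide's presentation.
    print(f"{name}: {value:.0%}")
```

The same function applied to the yeast and E. coli counts on Slide 48 would give the per-organism figures in that comparison, since "viable predicted/observed" is recall and "essential predicted/observed" is specificity under this labeling.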