Building blocks for automated elucidation of metabolites...
![Page 1: Building blocks for automated elucidation of metabolites ...acscinf.org/docs/meetings/237nm/presentations/237nm80.pdf · Building blocks for automated elucidation of metabolites:](https://reader034.fdocuments.us/reader034/viewer/2022042311/5ed944bf6714ca7f4769730d/html5/thumbnails/1.jpg)
Christoph Steinbeck European Bioinformatics Institute (EBI) Slide 1 07:05:37
Building blocks for automated elucidation of metabolites:
Machine learning methods for NMR prediction
Stefan Kuhn¹, Björn Egert², Steffen Neumann², Christoph Steinbeck
¹ European Bioinformatics Institute (EBI), Chemoinformatics and Metabolism Team, Wellcome Trust Genome Campus, Cambridge, CB10 1SD, United Kingdom
² Research Group for Molecular Informatics, Cologne University Bioinformatics Center (CUBIC), Zuelpicher Str. 47, D-50674 Cologne, Germany
Metabolomics @ CUBIC
• Experiment:
  • Fast quenching of metabolism
  • Cell lysis and extraction
  • Derivatisation
  • Detection via GC/MS
[Figure: GC chromatogram, signal intensity (0 to 600000) vs. t [min] (0 to 12); labelled peaks: trehalose, glutamate, lactate]
• Ca. 1000 compounds visible in GC
• 400 derivatives can be reproducibly quantified
• 240 compounds identified
Metabolomics @ CUBIC
Procedure:
• Extraction of bacterial cells with methanol
• Derivatisation
• Separation of compounds by gas chromatography
• Analysis by mass spectrometry after electron impact ionization
[Figure: gas chromatography (GC) feeding a mass spectrometer; mass spectrometry (MS) panel with peaks labelled 73.07, 156.11, 245.19 and 347.20]
De novo Elucidation of Biomarkers and Metabolites: Computer-Assisted Structure Elucidation (CASE)
The Chemistry Development Kit (CDK)
• Java library for chemoinformatics
• Open Source, LGPL (permits commercial use)
• >50 developers, core team of 10-20 people
• >50 academic and industrial projects worldwide
CDK Functionality
Input/Output
• I/O (CML, MDL Molfile, SDF, PDB)
• SMILES
• InChI
Visualization
• Structure Diagram Layout (SDG)
• 2D Rendering
• 3D Rendering
Modelling
• 3D ModelBuilder
• Atom Typing
• Force Field
• Representation of Biomolecular Structures
Chemical Graphs
• Isomorphism detection
• Maximum Common Substructure searches
• SMARTS and substructure searches
• Ring searches
• Aromaticity detection
Library Enumeration
• Deterministic isomer generator
• Stochastic structure generators via Simulated Annealing, Genetic Algorithms
Properties
• Fingerprinting
• >70 QSAR descriptors
• QSAR model building
Characterizing Biomarkers and Metabolites
NMRShiftDB (http://www.nmrshiftdb.org)
[1] Steinbeck, C.; Kuhn, S.; Krause, S. J. Chem. Inf. Comput. Sci. 2003, 43, 1733-1739.
[2] Steinbeck, C.; Kuhn, S. Phytochemistry 2004, 65, 2711-2717.
21500
25000
• Open Access
• Open Submission
• Open Source
2D NMR Data for CASE
Steinbeck, C. Computer-Assisted Structure Elucidation. In Handbook of Chemoinformatics; Gasteiger, J., Ed.; Wiley-VCH: Weinheim, 2003; Vol. 2, pp 1378-1406.
CASE with Simulated Annealing
[Figure: structure of polycarpol (C30H48O2)]
Steinbeck, C. Journal of Chemical Information & Computer Sciences 2001, 41, 1500-1507.
Fitness Evaluation (Scoring)
$S_{\text{total}} = S_{\text{NMR-HMBC}} + S_{\text{NMR-HHCOSY}} + S_{\text{NMRShift}} + S_{\text{Symmetry}} + S_{\text{MassSpec}} + \dots + S_{\text{Features}}$
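The additive fitness on this slide can be illustrated with a small simulated-annealing loop. This is only a sketch under stated assumptions: the candidate is a toy tuple of atom labels rather than a molecular graph, the two score terms, the swap move, the geometric cooling schedule, and all names (`total_score`, `anneal`, `swap_two`, `TARGET`) are hypothetical, and nothing here reproduces the scoring functions of the cited work.

```python
import math
import random


def total_score(candidate, terms):
    """Add up the individual score terms for one candidate (S_total = S_1 + S_2 + ...)."""
    return sum(term(candidate) for term in terms)


def anneal(start, mutate, terms, steps=2000, t_start=1.0, t_end=0.01, seed=0):
    """Maximise total_score with a basic Metropolis acceptance rule."""
    rng = random.Random(seed)
    current, current_score = start, total_score(start, terms)
    best, best_score = current, current_score
    for step in range(steps):
        # Geometric cooling from t_start to t_end (an assumed schedule).
        t = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        candidate = mutate(current, rng)
        score = total_score(candidate, terms)
        delta = score - current_score
        # Always accept improvements; accept worse moves with probability exp(delta / t).
        if delta >= 0 or rng.random() < math.exp(delta / t):
            current, current_score = candidate, score
            if score > best_score:
                best, best_score = candidate, score
    return best, best_score


# Toy stand-in for a structure: a tuple of atom labels whose order is searched.
TARGET = ("C", "C", "O", "C", "N", "C", "O", "H")


def first_half_matches(candidate):
    """Hypothetical score term: agreement with the first half of TARGET."""
    return sum(a == b for a, b in zip(candidate[:4], TARGET[:4]))


def second_half_matches(candidate):
    """Hypothetical score term: agreement with the second half of TARGET."""
    return sum(a == b for a, b in zip(candidate[4:], TARGET[4:]))


def swap_two(candidate, rng):
    """Hypothetical move: swap two randomly chosen positions."""
    i, j = rng.sample(range(len(candidate)), 2)
    items = list(candidate)
    items[i], items[j] = items[j], items[i]
    return tuple(items)


start = tuple(sorted(TARGET))
terms = [first_half_matches, second_half_matches]
best, best_score = anneal(start, swap_two, terms)
print(best, best_score)
```

Keeping the score as a plain sum of independent terms means a further evidence source can be added by appending one more function to `terms`, without touching the search loop.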
How far do we get with 1D NMR?
Deterministic Structure Generators work ...
... quite nicely for small molecules even with very simple fitness functions
• For around 10 heavy atoms, we have been able to find the correct solution based solely on ¹³C shift prediction and comparison with the measured spectrum.
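A very simple fitness function of this kind can be sketched as follows: each candidate structure gets a list of predicted ¹³C shifts, and candidates are ranked by how closely those shifts match the measured ones. The sketch assumes that shifts are paired by sorted order and scored by mean absolute deviation; the function names and the shift values are illustrative placeholders, not data or methods from the talk.

```python
def shift_deviation(predicted, measured):
    """Mean absolute deviation (ppm) between sorted predicted and measured shifts."""
    if not measured:
        raise ValueError("measured spectrum must contain at least one shift")
    if len(predicted) != len(measured):
        # Different numbers of signals: treat as the worst possible match.
        return float("inf")
    pairs = zip(sorted(predicted), sorted(measured))
    return sum(abs(p - m) for p, m in pairs) / len(measured)


def rank_candidates(candidates, measured):
    """Return (name, deviation) pairs, best match first."""
    scored = [(name, shift_deviation(pred, measured)) for name, pred in candidates.items()]
    return sorted(scored, key=lambda item: item[1])


# Invented example values (ppm), for illustration only.
measured_shifts = [14.0, 22.9, 32.1, 62.0]
candidate_predictions = {
    "candidate_A": [14.1, 22.7, 31.9, 62.5],
    "candidate_B": [18.0, 25.0, 40.0, 70.0],
    "candidate_C": [14.0, 23.0, 62.0],
}

ranking = rank_candidates(candidate_predictions, measured_shifts)
for name, deviation in ranking:
    print(name, deviation)
```

Pairing by sorted order avoids needing an atom-to-signal assignment, which is usually unknown for a candidate structure at this stage.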
1D Proton NMR Prediction
Methods trained on CDK descriptors (in random order)
• J48
• HOSE codes
• Support Vector Machines
• M5'
• PRISM
• Naïve Bayes
• Linear Regression
• K-Means Clustering
Descriptors (416 / 100%)
• Spatial (105 / 25.24%)
• Physicochemical (242 / 57.93%)
• Exp. Conditions (3 / 0.72%): Temperature, Frequency, Solvent
• Topological (66 / 15.86%)
Descriptor labels from the diagram:
• RDF GH, GD [9]
• Van der Waals [11]
• Valence Electrons [11]
• Electronegativity [9] (Sigma, Pi)
• Period [11]
• Hybridization [11]
• RDF GS [9]
• Distance [11] (Heavy Atom, Hydrogen; Min, Avg)
• RDF GHtopol [9]
• Pi-contact [11]
• BondsToAtom [11]
• Charge [9] (Sigma, Pi)
330 descriptors in total
Random Forest, real vs predicted, 18672 protons
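A real-vs-predicted comparison like this is often summarised by a single error figure. The slide does not state which measure was used, so the sketch below simply assumes two common choices, mean absolute error and root mean square error, with hypothetical function names and invented shift values.

```python
import math


def _check(real, predicted):
    """Reject empty or mismatched inputs before computing an error measure."""
    if not real or len(real) != len(predicted):
        raise ValueError("real and predicted must be non-empty and of equal length")


def mean_absolute_error(real, predicted):
    """Average absolute difference between measured and predicted shifts (ppm)."""
    _check(real, predicted)
    return sum(abs(r - p) for r, p in zip(real, predicted)) / len(real)


def root_mean_square_error(real, predicted):
    """Square root of the average squared difference (ppm)."""
    _check(real, predicted)
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(real, predicted)) / len(real))


# Invented proton shifts (ppm), for illustration only.
real_shifts = [1.0, 2.0, 3.0]
predicted_shifts = [1.5, 2.0, 2.0]

mae = mean_absolute_error(real_shifts, predicted_shifts)
rmse = root_mean_square_error(real_shifts, predicted_shifts)
print(mae, rmse)
```

The root mean square error weights large misses more heavily than the mean absolute error, so reporting both gives a rough idea of how much outliers contribute.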
Kuhn, S.; Egert, B.; Neumann, S.; Steinbeck, C. BMC Bioinformatics 2008, 9, 400.
Acknowledgement
Stefan Kuhn
Steffen Neumann
Björn Egert
Egon Willighagen
All collaborators at Cologne University Bioinformatics Center (CUBIC), EBI and the CDK team
Prof. Peter Murray-Rust (Unilever Center for Molecular Informatics, Cambridge, UK)
Dr. William Hull, Dr. Willi von der Lieth (DKFZ, Heidelberg)
Dr. Kämpchen (Universität Marburg)
Dr. Heinz Kolshorn (Universität Mainz)
DFG, BMBF, DAAD
Roche Diagnostics, Penzberg
Orion Pharma, Finland