Introduction to...
Cheminformatics (also known as chemoinformatics and chemical informatics) is the use of computer and informational techniques, applied to a range of problems in the field of chemistry.
These in silico techniques are used in pharmaceutical companies in the process of drug discovery.
In the U.S., recent NIH emphasis has been placed on developing public domain Cheminformatics research by creating six Exploratory Centers for Cheminformatics Research (ECCRs) as part of the NIH Molecular Libraries Initiative.
Definition (wikipedia)
CHEMOINFORMATICS
Target Protein
Large libraries of molecules
High Throughput Screening
Hit
experimental
computational
Virtual Screening
Filtering, QSAR, Docking
Small Library of selected hits
In silico design
Storage and Search of chemical information
Structure-Property Modeling
Major applications of Chemoinformatics
Chemoinformatics ‐ Why?
• amount of information
many millions of compounds and reactions
many millions of publications
Chemical Databases
Storage, organization and search of experimental data
Problem: Flood of Information
• > 47 million compounds
• 5-7 million new compounds / year
• 800,000 publications / year
[Figure: number of structures by year, 1965 to 2000; vertical axis from 0 to 30 000 000]
=> can anyone read 4,000 publications / day?
Problem: Not Enough Information
• > 47,000,000 chemical compounds
• ~ 500,000 3D structures in the Cambridge Crystallographic File
we have 3D structures for only about 1 % of all compounds (500,000 / 47,000,000)
Chemoinformatics ‐ Why?
• complex relationships: structure - biological activity, chemical reactivity
In silico design of new compounds
Prediction of physical, chemical and biological properties
"The most fundamental and lasting objective of synthesis is not production of new compounds but production of properties."
George S. Hammond, Norris Award Lecture, 1968
Chemoinformatics ‐ How?
Prediction of physical, chemical and biological properties
Storage, organization and search of experimental data
Encoding molecular structures by descriptors
Example 1: Hansch Analysis
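• Hansch’s descriptors can be broadly classified into three general types:
• Electronic (σ)
• Steric (Es)
• Hydrophobic (log P)
Biological Activity = f (Descriptors) + constant
log(1/C) = a (log P)² + b log P + ρσ + δEs + const
where C is the concentration that produces the biological effect, and a, b, ρ, δ and const are coefficients fitted to experimental data.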
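A minimal sketch of how the Hansch equation above could be evaluated once its coefficients are known. The function names and all numeric coefficient values are hypothetical, chosen only for illustration; in practice the coefficients come from a regression on measured activities.

```python
def hansch_activity(log_p, sigma, e_s, a, b, rho, delta, const):
    """Return log(1/C) = a*(log P)^2 + b*log P + rho*sigma + delta*Es + const."""
    return a * log_p ** 2 + b * log_p + rho * sigma + delta * e_s + const


def optimal_log_p(a, b):
    """log P at which the parabolic hydrophobic term peaks (needs a < 0)."""
    if a >= 0:
        raise ValueError("a must be negative for the parabola to have a maximum")
    return -b / (2 * a)


# Hypothetical coefficients, NOT taken from any published model.
coeffs = {"a": -0.5, "b": 2.0, "rho": 1.0, "delta": 0.5, "const": 3.0}

# Hypothetical descriptor values for one compound.
predicted = hansch_activity(log_p=2.0, sigma=0.2, e_s=-1.0, **coeffs)
best_log_p = optimal_log_p(coeffs["a"], coeffs["b"])
```

The quadratic term in log P is what gives the model an optimum lipophilicity: with a negative `a`, activity rises with log P up to `-b / (2a)` and falls beyond it.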
Example 2: Lipinski rule of five
Poor absorption or permeation is more likely when:
• There are more than 5 H‐bond donors.
• The molecular weight is over 500.
• The LogP is over 5.
• There are more than 10 H‐bond acceptors.
The molecule is represented by 4 parameters:
- the number of H-bond donor groups;
- the number of H-bond acceptor groups;
- molecular weight;
- logP
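The four conditions of the rule translate directly into a simple filter. The sketch below assumes the four parameters have already been computed by some other tool; the class name, function name and the two example molecules are hypothetical and serve only as illustration.

```python
from dataclasses import dataclass


@dataclass
class MoleculeProfile:
    """The four parameters that represent a molecule in the rule of five."""
    h_bond_donors: int
    h_bond_acceptors: int
    molecular_weight: float
    log_p: float


def lipinski_violations(mol):
    """Return the list of rule-of-five conditions that the molecule violates."""
    checks = {
        "more than 5 H-bond donors": mol.h_bond_donors > 5,
        "molecular weight over 500": mol.molecular_weight > 500,
        "logP over 5": mol.log_p > 5,
        "more than 10 H-bond acceptors": mol.h_bond_acceptors > 10,
    }
    return [name for name, violated in checks.items() if violated]


# Hypothetical parameter values, not measurements of real compounds.
small_polar = MoleculeProfile(h_bond_donors=1, h_bond_acceptors=4,
                              molecular_weight=180.2, log_p=1.2)
large_greasy = MoleculeProfile(h_bond_donors=2, h_bond_acceptors=12,
                               molecular_weight=734.0, log_p=6.1)

small_polar_issues = lipinski_violations(small_polar)
large_greasy_issues = lipinski_violations(large_greasy)
```

Returning the violated conditions rather than a single yes/no keeps the filter transparent: a screening pipeline can then decide for itself how many violations it tolerates.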
Chemoinformatics ‐ definition
Chemoinformatics is a field dealing with molecular objects (graphs, vectors) in multidimensional chemical space
Theoretical chemistry
Quantum Chemistry
Force Field Molecular Modelling
Chemoinformatics
- Molecular model
- Basic concepts
- Major applications
- Learning approaches
Molecular Model
Quantum Chemistry: electrons and nuclei
Force Field Molecular Modelling: atoms and bonds
Chemoinformatics: objects in chemical space (graphs, vectors)
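To make the idea of "objects in chemical space (graphs, vectors)" concrete, here is a small sketch. It stores a molecule as a graph of heavy atoms and bonds, maps that graph to a descriptor vector, and measures the distance between two molecules in the resulting space. The four count descriptors and the Euclidean distance are assumptions made for this illustration; real applications use far richer descriptor sets and other similarity measures.

```python
import math

# Molecular graphs: heavy atoms as nodes, bonds as index pairs (hydrogens omitted).
ethanol = {"atoms": ["C", "C", "O"], "bonds": [(0, 1), (1, 2)]}
dimethyl_ether = {"atoms": ["C", "O", "C"], "bonds": [(0, 1), (1, 2)]}
propane = {"atoms": ["C", "C", "C"], "bonds": [(0, 1), (1, 2)]}


def descriptor_vector(mol):
    """Map a molecular graph to the vector [n_C, n_O, n_CC_bonds, n_CO_bonds]."""
    atoms, bonds = mol["atoms"], mol["bonds"]

    def bond_count(x, y):
        # Count bonds whose two end atoms are the elements x and y.
        return sum(1 for i, j in bonds if {atoms[i], atoms[j]} == {x, y})

    return [atoms.count("C"), atoms.count("O"),
            bond_count("C", "C"), bond_count("C", "O")]


def distance(mol_a, mol_b):
    """Euclidean distance between two molecules in descriptor space."""
    va, vb = descriptor_vector(mol_a), descriptor_vector(mol_b)
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(va, vb)))


d_isomers = distance(ethanol, dimethyl_ether)   # same formula, different graph
d_alkane = distance(ethanol, propane)           # different composition
```

Ethanol and dimethyl ether share the same atom counts, so only the bond-type descriptors separate them. This shows why the choice of descriptors determines which molecules end up close together in chemical space.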
Learning approach
Quantum Chemistry: deductive >> inductive
Force Field Molecular Modelling: deductive ≅ inductive
Chemoinformatics: deductive << inductive
Chemoinformatics: From Data to Knowledge
knowledge
information
data
generalization
context
measurement, calculation
deductive learning
inductive learning
They are complementary
Quantum Chemistry
Force Field Modeling
Chemoinformatics
… but Chemoinformatics is the most suitable one for quantitative predictions of properties
Which approach is more useful for the theoretical design of compounds possessing desired properties?
Chemoinformatics ‐ definition
• "Chemoinformatics is a generic term that encompasses the design, creation, organization, management, retrieval, analysis, dissemination, visualization, and use of chemical information"
G. Paris, 1998.
• "Chemoinformatics is the application of informatics methods to solve chemical problems"
J. Gasteiger, 2004
• "Chemoinformatics is the mixing of those information resources to transform data into information and information into knowledge for the intended purpose of making better decisions faster in the area of drug lead identification and optimization"
F.K. Brown, 1998
• "Chemoinformatics is a field dealing with molecular objects (graphs, vectors) in multidimensional chemical space"
A. Varnek, 2007
Recommended reading
Chemoinformatics - A Textbook, Johann Gasteiger and Thomas Engel, Wiley-VCH 2003.
Handbook of Chemoinformatics, Johann Gasteiger, Wiley-VCH 2003.
An Introduction to Chemoinformatics, Andrew R. Leach, Valerie J. Gillet, Springer 2007.
Short courses in chemoinformatics, 1 – 5 June 2009
Day 1
Morning: Computer representation of chemical structures (A. Varnek)
Afternoon: Creation and management of chemical databases (G. Marcou, A. Varnek). Tutorials with the ChemAxon software
Day 2
Morning: Molecular Descriptors (A. Varnek)
Afternoon: Force Field approach. Conformational sampling (D. Horvath, A. Varnek). Tutorials with MOE, Codessa Pro
Day 3
Morning: Pharmacophores (T. Langer, D. Horvath)
Afternoon: Chemical space, similarity/diversity and chemical library design (J. Bajorath). Tutorials with MOE