Virtual Libraries and Virtual Screening in
Drug Discovery Processes using KNIME
Iván Solt
Solutions for Cheminformatics
Drug Discovery Strategies for known targets
High-Throughput Screening (HTS)
• Cells or recombinant protein
• Fluorescent or luminescent readout
• Automated, miniaturized
• Thousands of samples / day
• Number of primary actives: ~1%
Virtual Screening (VS)
• Ligand or structure based
• Virtual or real libraries
• Similarity search, 2D or 3D
• Can lead to thousands of possible actives: further processing needed
• Measurement: enrichment ratio, ROC curves for known actives
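The two measurements named above can be sketched in a few lines. This is a minimal illustration, assuming each compound has a screening score (higher is better) and a known active/inactive label; the function names `enrichment_factor` and `roc_auc` are hypothetical and not part of any tool mentioned in these slides.

```python
from typing import Sequence


def enrichment_factor(scores: Sequence[float], is_active: Sequence[bool],
                      fraction: float) -> float:
    """Hit rate in the top `fraction` of the ranking divided by the overall hit rate."""
    if len(scores) != len(is_active) or not scores:
        raise ValueError("scores and is_active must be non-empty and equally long")
    total_actives = sum(1 for a in is_active if a)
    if total_actives == 0:
        raise ValueError("at least one known active is required")
    # Rank compounds from best to worst score.
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, round(fraction * len(ranked)))
    hits_top = sum(1 for _, active in ranked[:n_top] if active)
    return (hits_top / n_top) / (total_actives / len(ranked))


def roc_auc(scores: Sequence[float], is_active: Sequence[bool]) -> float:
    """Area under the ROC curve, computed as the probability that a random
    active outscores a random inactive (ties count as one half)."""
    actives = [s for s, a in zip(scores, is_active) if a]
    inactives = [s for s, a in zip(scores, is_active) if not a]
    if not actives or not inactives:
        raise ValueError("need at least one active and one inactive")
    wins = sum(1.0 if a > d else 0.5 if a == d else 0.0
               for a in actives for d in inactives)
    return wins / (len(actives) * len(inactives))


if __name__ == "__main__":
    # Made-up scores and labels, for illustration only.
    scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
    labels = [True, False, True, False, False, False, False, False, False, False]
    print("EF at 20%:", enrichment_factor(scores, labels, 0.2))
    print("ROC AUC:", roc_auc(scores, labels))
```

An enrichment factor above 1 means the top of the ranking contains more actives than random selection would; a ROC AUC of 0.5 corresponds to random ranking.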
Virtual Library Design Workflow
• Databases (DB)
– Reactions
– Molecules
– Queries
• Compound selection
– Similarity searches
– Substructure searches
• Enumeration
– Fuse fragments
– R-group composition
– Reaction enumeration
• Library analysis
– Clustering
– 2D similarity screen
– 3D shape similarity screen
• Fragmentation
– R-group decomposition
– Fragmentation
– Reagent clipping
Find or Virtually Create Candidates
• Virtual screening of existing compounds
Pros:
– Fast
– Hits are readily available for in vitro experiments
Cons:
– Limitation on available compounds
• De novo design
Pros:
– No limitation on virtual compound space
– Structural novelty
Cons:
– Are hits synthetically available?
Virtual Screening Workflow
DB: molecules, in-house or commercially available
1. Reactions: virtual synthetic path
   Result: synthetically accessible compounds
2. Filtering
3. Similarity search
4. 3D alignment
5. Clustering
in vivo experiment?
Step 1: Reaction Enumeration
• Reaction schema for accessible syntheses
• Combinatorial or sequential enumeration
• Reaction rules: express and apply public and in-house chemical knowledge
– Selectivity with tolerance
– Reactivity
– Exclusion rules
EXCLUDE: match(reactant(1), "[Cl,Br,I]C(=[O,S])C=C") or
match(reactant(0), "[H][O,S]C=[O,S]") or
match(reactant(0), "[P][H]") or
(max(pka(reactant(0), filter(reactant(0),
"match('[O,S;H1]')"), "acidic")) > 14.5) or
(max(pka(reactant(0), filter(reactant(0),
"match('[#7:1][H]', 1)"), "basic")) > 0)
Step 1: Reaction Enumeration
Reaction rules ON
• Fewer results than the theoretical maximum
• Infeasible starting materials eliminated
• Feasible products only
• Custom rules can be added to increase selectivity
Reaction rules OFF
• More results
• Best for debugging purposes
• Products may be incorrect because chemical rules are neglected
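The difference between combinatorial and sequential enumeration, and the effect of switching an exclusion rule on or off, can be sketched as below. This is an illustration only: reactants are plain string labels rather than structures, and `enumerate_products`, `combine` and `exclude` are hypothetical names, not the Reactor API.

```python
from itertools import product
from typing import Callable, List, Optional, Sequence, Tuple

Reactants = Tuple[str, ...]


def enumerate_products(reactant_lists: Sequence[Sequence[str]],
                       combine: Callable[[Reactants], str],
                       exclude: Optional[Callable[[Reactants], bool]] = None,
                       mode: str = "combinatorial") -> List[str]:
    """Enumerate products from lists of reactants.

    combinatorial: every reactant of list 1 with every reactant of list 2, ...
    sequential:    i-th reactant of list 1 with i-th reactant of list 2, ...
    `exclude` plays the role of a reaction rule; pass None for "rules OFF".
    """
    if mode == "combinatorial":
        tuples = product(*reactant_lists)
    elif mode == "sequential":
        tuples = zip(*reactant_lists)
    else:
        raise ValueError("mode must be 'combinatorial' or 'sequential'")
    results = []
    for reactants in tuples:
        if exclude is not None and exclude(reactants):
            continue  # rule says this combination is not feasible
        results.append(combine(reactants))
    return results


if __name__ == "__main__":
    # Hypothetical reactant labels; a real workflow would use structures.
    acids = ["acid1", "acid2"]
    amines = ["amine1", "amine2", "amine3"]
    join = lambda r: "-".join(r)
    # Hypothetical rule: pretend amine2 is unreactive.
    rule = lambda r: r[1] == "amine2"
    print(enumerate_products([acids, amines], join))                     # rules OFF
    print(enumerate_products([acids, amines], join, exclude=rule))       # rules ON
    print(enumerate_products([acids, amines], join, mode="sequential"))
```

With the rule switched on, the result list is shorter than the full combinatorial product, which mirrors the "fewer results than theoretical" point above.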
Step 2: Filtering
• Lead likeness, drug likeness
– Chemical Terms
• Could it fit into the active centre?
– Basic analysis: size, mass...
• Could it get to the active centre?
– ADME properties: solubility, pKa, polar surface, partition coefficients...
• Structural filtering
– e.g. reactive groups
• Toxicity, environmental concerns, etc.
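As one concrete example of a drug-likeness filter, the sketch below applies Lipinski's rule of five to precomputed properties. The slides do not say which rules were used, so this is a swapped-in, commonly used filter; the class and function names are hypothetical, and the property values in the example are made up for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CompoundProperties:
    name: str
    mol_weight: float      # molecular weight, g/mol
    logp: float            # octanol/water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def rule_of_five_violations(p: CompoundProperties) -> int:
    """Count how many of Lipinski's four criteria the compound violates."""
    return sum([
        p.mol_weight > 500,
        p.logp > 5,
        p.h_bond_donors > 5,
        p.h_bond_acceptors > 10,
    ])


def filter_drug_like(compounds: List[CompoundProperties],
                     max_violations: int = 1) -> List[CompoundProperties]:
    """Keep compounds with at most `max_violations` rule-of-five violations."""
    return [c for c in compounds if rule_of_five_violations(c) <= max_violations]


if __name__ == "__main__":
    # Made-up property values, for illustration only.
    library = [
        CompoundProperties("cmpd_A", 320.4, 2.1, 2, 5),
        CompoundProperties("cmpd_B", 612.8, 6.3, 1, 8),
        CompoundProperties("cmpd_C", 480.0, 5.4, 3, 9),
    ]
    for c in filter_drug_like(library):
        print(c.name, rule_of_five_violations(c))
```

In a real workflow the properties would come from a property calculator and the thresholds would be tuned to the project; structural and toxicity filters would be applied as separate steps.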
Calculator plugins
• Elemental Analysis
– Elemental Analysis
• IUPAC Name
– Structure to Name
• Protonation
– pKa
– Microspecies
– Isoelectric Point
• Partitioning
– logP
– logD
• Charge
– Charge
– Polarizability
– Orbital Electronegativity
• Isomers
– Tautomerization
– Stereoisomer
• Conformation
– Conformer
– Flexible 3D Alignment
– Molecular Dynamics
• Geometry
– Topology Analysis
– Geometry
– Polar Surface Area (2D)
– Molecular Surface Area (3D)
• Markush
– Markush Enumeration
• Other
– Hydrogen Bond Donor-Acceptor
– Huckel Analysis
– Refractivity
– Structural Framework
– Resonance
Step 3: Similarity search
Screen 2D +
Descriptor package
Screen against known
bioactives
• Chemical Fingerprints: topology
• Pharmacophore Fingerprints: custom atomic properties + their topological relationship
– H-bond donors / acceptors
– Cationic / anionic groups
– Hydrophobic groups
– Aromatic groups
– etc.
• ECFP/FCFP
Similarity searches
• Tanimoto, Euclidean, Tversky metrics
• Metrics optimization
[Figure: example similarity values (0.47, 0.55, 0.57, 0.28, 0.20, 0.06) comparing regular Tanimoto with optimized Tanimoto]
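The three metrics named on this slide have standard textbook definitions for binary fingerprints, sketched below with each fingerprint represented as the set of its "on" bit positions. This shows only the unweighted forms; the "optimized" variants mentioned above are not reproduced here, and the function names are illustrative.

```python
import math
from typing import Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """|A and B| / |A or B|; defined here as 0.0 when both sets are empty."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def tversky(a: Set[int], b: Set[int], alpha: float, beta: float) -> float:
    """|A and B| / (|A and B| + alpha*|A - B| + beta*|B - A|).

    alpha = beta = 1 gives Tanimoto; unequal weights make the measure asymmetric.
    """
    common = len(a & b)
    denom = common + alpha * len(a - b) + beta * len(b - a)
    return common / denom if denom else 0.0


def euclidean(a: Set[int], b: Set[int]) -> float:
    """Euclidean distance between two binary vectors: sqrt of differing bits."""
    return math.sqrt(len(a ^ b))


if __name__ == "__main__":
    # Made-up bit sets, for illustration only.
    query = {1, 2, 3, 4}
    candidate = {3, 4, 5}
    print(tanimoto(query, candidate))
    print(tversky(query, candidate, alpha=1.0, beta=1.0))
    print(tversky(query, candidate, alpha=0.9, beta=0.1))
    print(euclidean(query, candidate))
```

Note that Tanimoto and Tversky are similarities (higher means more alike), while Euclidean is a distance (lower means more alike).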
Step 4: Screen 3D
• Align the candidates to the known active in 3D
• Treat the candidate as flexible!
• Consider pharmacophore atom types
(align cationic to cationic, etc.)!
• Problem: complicated conformational space
Step 4: Screen 3D
Simple sampling of the conformational space:
minimum and maximum distance between atom pairs in the full torsion space
Select atoms
• Colors (e.g. pharmacophore types)
• Topological features (e.g. longest chain start/end/center)
• Ring centers (aromatic, aliphatic)
Calculate
• Min/max internal distance ranges
• Distance histograms for selected atoms
• Only once for each molecule
Step 4: Screen 3D
"Hybrid" alignment: separate translation & rotation from torsions
• Robust and fast
• Needs a good guess for the atom-atom mapping:
– Same colors
– Distance ranges must be allowed for all mapped pairs
– Triangle inequality must be fulfilled for any atom triplet
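One possible reading of the three mapping conditions above is sketched below. It assumes that each molecule is described by a color per selected atom and a precomputed (min, max) distance range per atom pair, and that a mapping is "allowed" when the ranges of every mapped pair overlap. These data structures and function names are assumptions for illustration, not the Screen3D implementation.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Tuple

Range = Tuple[float, float]               # (min distance, max distance)
RangeTable = Dict[FrozenSet[str], Range]  # keyed by unordered atom pair


def pair(a: str, b: str) -> FrozenSet[str]:
    return frozenset((a, b))


def ranges_overlap(r1: Range, r2: Range) -> bool:
    return max(r1[0], r2[0]) <= min(r1[1], r2[1])


def mapping_is_feasible(mapping: List[Tuple[str, str]],
                        query_colors: Dict[str, str],
                        cand_colors: Dict[str, str],
                        query_ranges: RangeTable,
                        cand_ranges: RangeTable) -> bool:
    """Check colors and distance ranges for a proposed atom-atom mapping."""
    # 1. Mapped atoms must have the same color (e.g. pharmacophore type).
    for q, c in mapping:
        if query_colors[q] != cand_colors[c]:
            return False
    # 2. For every pair of mapped atoms, the distance ranges must overlap.
    for (q1, c1), (q2, c2) in combinations(mapping, 2):
        if not ranges_overlap(query_ranges[pair(q1, q2)], cand_ranges[pair(c1, c2)]):
            return False
    return True


def satisfies_triangle_inequality(d_ab: float, d_bc: float, d_ac: float,
                                  tol: float = 1e-9) -> bool:
    """Three distances can belong to a real atom triplet only if each one
    is no longer than the sum of the other two."""
    return (d_ab <= d_bc + d_ac + tol and
            d_bc <= d_ab + d_ac + tol and
            d_ac <= d_ab + d_bc + tol)
```

A rigid query can be represented with min equal to max for every pair. Because the ranges are computed once per molecule, checking a candidate mapping involves only cheap comparisons.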
Screen 3D: Test on DUD
[Bar chart: average of 1% enrichments; y-axis: % of the actives retrieved, 0 to 30]
Giganti et al. J. Chem. Inf. Model. 2010, 50, 992
Screen 3D: Test on DUD
[Bar chart: average of 10% enrichments; y-axis: % of the actives retrieved, 0 to 100]
Giganti et al. J. Chem. Inf. Model. 2010, 50, 992
Screen 3D: Test on DUD
Speed: average time per compound (without precalculations)
Method               Time
ChemAxon Screen3D    0.07
ROCS                 0.5
FRED                 1.0
ICMsim               2.4
Surflex-sim          6.7
FlexS                6.9
Surflex-dock         14.6
FLEXX                15.6
ICM                  17.7
Intel Xeon 2.4 GHz
Intel Q6600 2.4 GHz
Giganti et al. J. Chem. Inf. Model. 2010, 50, 992
Step 5: Clustering, library analysis
Wide range of methods
• Unsupervised, agglomerative
clustering
• Hierarchical and non-hierarchical
methods
• Similarity based and structure
based techniques
Flexible search options
• Tanimoto and Euclidean metrics,
weighting
• Maximum common substructure
identification
• chemical property matching
including atom type, bond type,
hybridization, charge
JKlustor
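A minimal sketch of one of the approaches listed above, similarity-based agglomerative clustering, follows. It uses single linkage cut at a Tanimoto threshold, which is equivalent to finding connected groups of compounds linked by pairs at or above the threshold. This is a generic textbook variant chosen for brevity, not a description of how JKlustor works; fingerprints are represented as sets of "on" bits.

```python
from typing import Dict, List, Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def single_linkage_clusters(fingerprints: List[Set[int]],
                            threshold: float) -> List[List[int]]:
    """Group compound indices so that any two compounds with Tanimoto
    similarity >= threshold end up in the same cluster (single linkage)."""
    n = len(fingerprints)
    parent = list(range(n))  # union-find forest: each compound starts alone

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Merge the clusters of every sufficiently similar pair.
    for i in range(n):
        for j in range(i + 1, n):
            if tanimoto(fingerprints[i], fingerprints[j]) >= threshold:
                root_i, root_j = find(i), find(j)
                if root_i != root_j:
                    parent[root_i] = root_j

    clusters: Dict[int, List[int]] = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


if __name__ == "__main__":
    # Made-up fingerprints, for illustration only.
    fps = [{1, 2, 3}, {1, 2, 3, 4}, {10, 11}, {10, 11, 12}, {20}]
    print(single_linkage_clusters(fps, threshold=0.6))
```

The pairwise loop is quadratic in the number of compounds, so this form only suits small libraries; it is meant to show the idea rather than to scale.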
JChem Extensions in KNIME
• Workflow management in KNIME
• JChem extension nodes developed by InfoCom, Japan
• Constantly developing palette of available JChem tools
JChem Extensions in KNIME
• IO: molecule and reaction import, export, drawing
• Visualization
• Manipulators
– Calculator plugins
– Reactor
– Similarity and structure-based search
– Fingerprint calculation
– Fragmentation
– Clustering
– R-group composition, decomposition
– Standardization
– ...
• Database management
• Molecular format conversion
• Web search services
Step 1: Reaction Enumeration
Step 2: Filtering
Step 3: Similarity search
JChem Extensions in KNIME
DB
1. Reactions: virtual synthetic path
   Result: synthetically accessible compounds
2. Filtering
3. Similarity search
4. 3D alignment
in vivo experiment?
1. Import reactants
2. Enumerate reaction
• Carry out topology
analysis
3. Calculate properties
• Filter
4. Screen for similarity against
known active
5. Export results
Conclusions
• Virtual libraries and virtual screening are essential tools in modern Drug Discovery
• No special hardware, short experiment cycles, variety of approaches
• Database of synthetically accessible compounds can be designed with reaction libraries and custom in-house synthetic knowledge
• Powerful 3D alignment techniques allow high-throughput conformational screening with great efficiency
• Straightforward integration into KNIME
Contributors
• Tímea Polgár
• Attila Tajti
www.chemaxon.com