Post on 02-Jun-2015
1
Assessing the similarity of compound collections using molecular fields: Does it add value?
Tim Cheeseright, Mark Mackey, Rob Scoffin, Martin Slater
2
Conclusions
> It works brilliantly
> All synthetic steps gave yields of 100%
> All enrichments were perfect
> All new molecules were sub nM
> All QSARs were totally predictive, q2 = 1.0
> We expect the call from Sweden any day now
3
Conclusions
> Work in progress
> 3D similarity can add value to compound selection
> Full matrix of similarities possibly unnecessary
> Using probes looks like a possible solution
> Not a panacea
4
Agenda & Background
> Fields & similarity
> Generating screening compounds using Fields
> Selecting a 10K “diverse” library for screening from commercial compounds
> Initial thoughts
> Problems
> More Initial thoughts
> A solution but not a complete one
> Conclusions
5
Field Points
Condensed representation of electrostatic, hydrophobic and shape properties (“protein’s view”)
> Molecular Field Extrema (“Field Points”)
[Figure: a 2D structure, its 3D Molecular Electrostatic Potential (MEP) and the resulting Field Points. Legend: Positive, Negative, Shape, Hydrophobic]
6
Improved MM Electrostatics
> Field patterns from XED force field reproduce experimental results
Interaction of Acetone and Any-OH from small molecule crystal structures
[Figure: three panels labelled Experimental, Using XEDs and Not using XEDs, with numeric charge labels on the C, O and H atoms]
XED adds ‘p-orbitals’ to get a better representation of atoms
7
Non-Classical Comparisons
8
Molecular Alignment
[Figure: molecular alignments labelled with scores 0.66, 0.98 and 0.82]
Cheeseright et al., J. Chem. Inf. Model., 2006, 665
9
Using Fields
> Bioisosteric groups
> Virtual Screening
> Pharmacophore hypothesis
> Qualitative SAR interpretation
> 3D QSAR
> Library Design
10
Field based library design success
11
Libraries from Fields
> Small, custom synthesised libraries (~100s - 1000s compds)
> Low scaffold diversity
> Highly targeted
> Lots of manual design
12
An Opportunity & a Challenge
> Provide a small, diverse screening library (10K compounds) for a small biotech company
> Diversity in potential biological targets to be hit
> Minimum redundancy in the set
> Maximum chance of success in finding a lead within available budget and screening resources
13
Initial thoughts
> Customised design not an option - commercial compounds only
> Fields have been used successfully many times to select compounds for screening
> Virtual screening
> Always in a specific biological context
> What about using Fields to choose a ‘diverse’ set?
> Possible problem with numbers
> A 10,000 compound library is small
> 9,000,000 commercially available molecules is very large for 3D diversity
14
Initial thoughts
> Compare 3D and 2D similarities for compound collections - are we wasting our time?
> Take a small compound collection
> Full NxN calculation
> 3D method = Fields & Shape
> 2D method = atom pairs
> Compare and Contrast
15
Conformations
> 3D Method requires conformations - which one(s) to use?
> What is the similarity of 2 compounds in 3D?
> Context is important!
> Highest across all conformations?
> Average?
> Lowest?
> For 3D, similarity calculation is Nconfs x Nconfs
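The three options above (highest, average, lowest) can be sketched as different summaries of the same Nconfs x Nconfs score table. This is a minimal sketch only: `toy_sim` and the single-number "conformers" are hypothetical stand-ins, not the Field and Shape scoring used in the study.

```python
from statistics import mean


def conformer_pair_scores(confs_a, confs_b, sim):
    """Score every conformer of molecule A against every conformer of B."""
    return [sim(a, b) for a in confs_a for b in confs_b]


def summarise(scores):
    """Collapse the conformer-pair scores into the three candidate summaries."""
    return {
        "highest": max(scores),
        "average": mean(scores),
        "lowest": min(scores),
    }


def toy_sim(a, b):
    """Hypothetical similarity: 1.0 for identical values, decaying with distance."""
    return 1.0 / (1.0 + abs(a - b))


# Hypothetical molecules: each conformer is reduced to one number for illustration.
confs_a = [0.0, 1.0, 2.0]
confs_b = [0.0, 4.0]

scores = conformer_pair_scores(confs_a, confs_b, toy_sim)
summary = summarise(scores)
print(len(scores), summary)
```

With 3 and 2 conformers the table has 6 entries, and the three summaries can differ widely for the same pair, which is why the choice depends on context.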
17
Compound Collection
> BIONET 'Rule of Three' ('Ro3') Fragment Library: “7,907 'Ro3'-compliant fragments”
> Conformation hunt on every fragment
> Maximum of 5 conformations (!)
> Full N x N similarity matrix, 3D & 2D (~60 million data points)
> ~30 compounds failed conformation hunting
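As a rough check on the scale quoted above, the arithmetic can be written out. This is a sketch that assumes all 7,907 fragments are kept and that every fragment reaches the 5 conformer cap, so the last figure is an upper bound rather than a number reported on the slide.

```python
n_fragments = 7907   # library size quoted on the slide
max_confs = 5        # conformer cap quoted on the slide

# Every cell of the full N x N matrix, including the diagonal and both triangles.
full_matrix_cells = n_fragments * n_fragments

# Unique unordered pairs, if the similarity is symmetric.
unique_pairs = n_fragments * (n_fragments - 1) // 2

# Upper bound on conformer-level comparisons for the unique pairs (assumption:
# every fragment has the maximum of 5 conformers).
conformer_comparisons = unique_pairs * max_confs * max_confs

print(full_matrix_cells, unique_pairs, conformer_comparisons)
```

The full matrix comes to about 62.5 million cells, in line with the figure of roughly 60 million data points.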
18
Problems
> 400 MB of data
> Tedious to use and examine
Pilot study just using the first 500 compounds
> Some chemical families in this area
> Still a large dataset to deal with (250,000 data points)
> 2D similarities and fragments
> Small changes cause disproportionately large changes
> Atom pairs particularly bad
> Switch to KNIME fingerprints
All 2D values lower than ‘normal’
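One way to see why small changes hit fragments so hard is a Tanimoto coefficient on feature sets. The sketch below is illustrative only: the feature sets are hypothetical integers, not atom pairs or KNIME fingerprints, and the sizes 10 and 40 are assumptions chosen to contrast a fragment with a larger molecule.

```python
def tanimoto(features_a, features_b):
    """Tanimoto coefficient: shared features divided by total distinct features."""
    a, b = set(features_a), set(features_b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Hypothetical fragment with 10 features; the analogue swaps 2 of them.
fragment = set(range(10))
fragment_analogue = (fragment - {8, 9}) | {100, 101}

# Hypothetical larger molecule with 40 features; the same 2-feature swap.
larger = set(range(40))
larger_analogue = (larger - {38, 39}) | {100, 101}

fragment_sim = tanimoto(fragment, fragment_analogue)   # 8 shared / 12 total
larger_sim = tanimoto(larger, larger_analogue)         # 38 shared / 42 total
print(round(fragment_sim, 2), round(larger_sim, 2))
```

The same two-feature change costs the fragment far more similarity than the larger molecule, which fits the observation that 2D values for fragments run lower than ‘normal’.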
19
Comparing 2D and 3D metrics
Agreement
22
Example - Similar Scores
[Figure: two structures, labelled 101 and 104]
2D sim = 0.9
3D field sim = 0.87
23
Example - Higher 3D Sim
[Figure: two structures, not recoverable from the extracted text]
2D sim = 0.1 (other methods = 0.3)
3D field sim = 0.82
24
Example - Higher 3D Sim
[Figure: two structures, labelled 141 and 454]
2D sim = 0.2
3D sim = 0.7
25
Example - Higher 3D Sim
[Figure: two structures, labelled 437 and 440]
2D sim = 0.3 (other methods = 0.55)
3D field sim = 0.8
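The four example pairs from slides 22 to 25 can be screened for cases where the 3D score is well above the 2D score. This is a small sketch: the 2D and 3D values are copied from the slides, while `GAP_THRESHOLD` is a hypothetical cut-off chosen for illustration.

```python
# (slide number, 2D similarity, 3D similarity), copied from slides 22 to 25.
slide_examples = [
    (22, 0.9, 0.87),
    (23, 0.1, 0.82),
    (24, 0.2, 0.7),
    (25, 0.3, 0.8),
]

GAP_THRESHOLD = 0.3  # hypothetical cut-off, not taken from the slides


def higher_in_3d(examples, threshold=GAP_THRESHOLD):
    """Return slide numbers where 3D similarity exceeds 2D by at least threshold."""
    return [
        slide
        for slide, sim_2d, sim_3d in examples
        if sim_3d - sim_2d >= threshold
    ]


flagged = higher_in_3d(slide_examples)
print(flagged)
```

Pairs flagged this way are the ones a 2D-only selection would treat as unrelated even though they look similar in Field space.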
26
So…
> Pilot study suggests some added value
> Full matrix painful even if we could calculate it
> What about a reduced matrix?
> Use ‘Probe’ compounds to tease out molecules that are different in Field space
> How many probes?
> Across how many molecules?
> We were running out of time…
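The reduced matrix idea can be sketched as scoring each molecule against a short list of probes, giving an N x P table instead of N x N. This is a minimal sketch under stated assumptions: molecules are hypothetical 2D points and `toy_similarity` is a stand-in, not a Field similarity calculation.

```python
import math


def toy_similarity(a, b):
    """Hypothetical similarity: 1.0 for identical points, decaying with distance."""
    return 1.0 / (1.0 + math.dist(a, b))


def probe_fingerprint(molecule, probes, sim=toy_similarity):
    """One row of the reduced matrix: the molecule scored against every probe."""
    return [sim(molecule, probe) for probe in probes]


# Hypothetical molecules and probes, each reduced to a 2D point for illustration.
molecules = [(0.0, 0.0), (0.1, 0.0), (3.0, 4.0), (3.0, 4.2)]
probes = [(0.0, 0.0), (3.0, 4.0)]

matrix = [probe_fingerprint(m, probes) for m in molecules]

full_cells = len(molecules) ** 2               # N x N comparisons
reduced_cells = len(molecules) * len(probes)   # N x P comparisons
print(full_cells, reduced_cells)
```

The cost grows linearly with the number of molecules for a fixed probe set, so the open questions are how many probes are needed and how they are chosen.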
27
Compound selection by Field Diversity
> Proposed workflow for generation of a field diverse library:
9M commercial compounds
Property Filters
1.2M
Calc. Shape Diversity by PMI
Pick 20K sub-set
Pick 200 sub-set
Calc. 200 x 200 2D similarity matrix
Pick 100 Diverse Field probes
Calc. 20K x 100 Field similarity matrix
3D PCA on Field matrix
Pick 12K Field Diverse set
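The final picking step could be sketched as a MaxMin selection over rows of the molecule-by-probe similarity matrix. The slides do not say how the diverse set was chosen, so MaxMin is a swapped-in technique here, and the fingerprints below are hypothetical values rather than Field similarities.

```python
import math


def maxmin_pick(fingerprints, n_pick, seed_index=0):
    """Greedy MaxMin: repeatedly add the row farthest from everything picked so far."""
    picked = [seed_index]
    remaining = set(range(len(fingerprints))) - {seed_index}
    # Distance from each row to its nearest already-picked row.
    nearest = [math.dist(fp, fingerprints[seed_index]) for fp in fingerprints]
    while remaining and len(picked) < n_pick:
        candidate = max(sorted(remaining), key=lambda i: nearest[i])
        picked.append(candidate)
        remaining.remove(candidate)
        for i in remaining:
            d = math.dist(fingerprints[i], fingerprints[candidate])
            nearest[i] = min(nearest[i], d)
    return picked


# Hypothetical probe-similarity fingerprints: 5 molecules scored against 2 probes.
fingerprints = [
    [1.0, 0.1],
    [0.9, 0.1],
    [0.1, 1.0],
    [0.1, 0.9],
    [0.5, 0.5],
]

picked = maxmin_pick(fingerprints, n_pick=3)
print(picked)
```

On this toy data the pick takes one molecule from each of the two tight groups plus the intermediate one, which is the behaviour wanted from a diverse sub-set.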
30
Field Diverse library: Outcome
12K ‘Field Diverse’ library mapped by 3D PCA on the 100 x 20,000 ‘Field Similarity Fingerprint’
Distinct separation of charged species within this space
[Plot labels: Ammoniums; Piperidines; Benzoic and aliphatic acids]
….so what!!
31
Field Diverse library: Outcome
12K ‘Field Diverse’ library mapped by 3D PCA
Distinct separation of molecules by size within this space
[Plot label: Decreasing size]
….so what!!
32
Deeper - Moderate ‘Field Similarity’
Alignment to ‘template1’
33
Deeper - Moderate ‘Field Similarity’
Alignment to ‘template1’
Random selection of mols
35
Deeper - Moderate ‘Field Similarity’
Alignment to ‘template’
36
Is the chemical space sensible?
Small sulphonamides
Large esters
Two example clusters
37
Conclusions
> Work in progress
> Full similarity matrix shows potential of 3D sim to add value
> Full matrix difficult to handle and possibly unnecessary
> Using probes looks like a possible solution
> Not a panacea - still need to play the numbers game
38
Acknowledgements
> Cresset
> Martin Slater
> Rob Scoffin
> Mark Mackey
> James Melville
> Mission Therapeutics
> Keith Menear