Characterization of Pharmacophore Multiplet Fingerprints...

Post on 08-Jun-2020


Characterization of Pharmacophore Multiplet Fingerprints as Molecular Descriptors

Robert D. Clark
Tripos, Inc.

©2004 Tripos, Inc.
bclark@tripos.com

Outline

• Background
  o history
  o mechanics
• Finding appropriate binning ranges
  o biased conformer generation
• Similarity measures
  o stochastic similarity
• Hypothesis generation
  o asymmetric similarity
• Conclusions

History of Pharmacophore Multiplets

A.C. Good and I.D. Kuntz; J. Comput.-Aided Mol. Design 1995, 9, 373-379.

X. Chen, A. Rusinko, and S.S. Young; J. Chem. Inf. Comput. Sci. 1998, 38, 1054-1062.

J.S. Mason, I. Morize, P.R. Menard, D.L. Cheney, C. Hulme & R.F. Labaudiniere; J. Med. Chem. 1999, 42, 3251-3264.

M.J. McGregor & S.M. Muskal; J. Chem. Inf. Comput. Sci. 1999, 39, 569-574.

H. Matter and T. Pötter; J. Chem. Inf. Comput. Sci. 1999, 39, 1211-1225.

J.S. Mason and B.R. Beno; J. Mol. Graphics Mod. 2000, 18, 438-451.

E. Abrahamian, P.C. Fox, L. Nærum, I.T. Christensen, H. Thøgersen & R.D. Clark; J. Chem. Inf. Comput. Sci. 2003, 43, 458-468.

Novo Nordisk / Tripos Tuplets Collaboration

• 2-year collaboration to develop and extend existing SYBYL triplet (PDT) technology
• Incorporate pair, triplet and quartet (‘Tuplet) technology
• Augmented ‘Tuplets and support for privileged substructures
• Conformers generated on-the-fly or retrieved
• Bitmaps created, stored and manipulated in compressed format
  o four 1.8 × 10^9 bit bitmaps stored as ~80 kb file
  o 0.01-0.5 seconds/molecule

Type III antiarrhythmic: UK 66914

[Figure: structure of UK 66914 with its pharmacophoric features labeled: acceptor atoms, donor/acceptor atoms, donor atom, two hydrophobic centers, positive nitrogen]

Multiplet Fingerprints

… 000010001010000000100100001110100001110000111000000000011001...

Indexing Triplets

[Figure: triangle with vertices D, A and H and edge bins 2, 3 and 5; callout: vertex joining longest and shortest edges]

Bin: 5, 3, 2
Triplet: H-A-D
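One way to read the slide's callout is sketched below: start at the vertex that joins the longest and shortest edges, walk along the longest edge first, and close the triangle with the shortest edge. The function name `triplet_key`, the input layout, and the alphabetical tie-break are assumptions made for illustration; the edge assignment in the example (H-A = 5, A-D = 3, D-H = 2) is chosen to reproduce the slide's labels and is not stated in the source.

```python
from itertools import permutations

def triplet_key(features, bins):
    """Return a canonical (label, edge-bin) key for a feature triplet.

    features: labels of the three vertices, e.g. ("D", "A", "H").
    bins: dict mapping frozenset({i, j}) -> distance bin between vertices i and j.
    """
    candidates = []
    for a, b, c in permutations(range(3)):
        e_ab = bins[frozenset((a, b))]
        e_bc = bins[frozenset((b, c))]
        e_ca = bins[frozenset((c, a))]
        # Keep walks that traverse the longest edge first and the shortest last,
        # so the start vertex joins the longest and shortest edges.
        if e_ab >= e_bc >= e_ca:
            labels = (features[a], features[b], features[c])
            candidates.append((labels, (e_ab, e_bc, e_ca)))
    # Assumed tie-break: when edge bins are equal, pick the alphabetically first labels.
    labels, edge_bins = min(candidates)
    return "-".join(labels), edge_bins

# Assumed edge assignment for the slide's example triangle.
example_bins = {
    frozenset((0, 1)): 3,  # D-A
    frozenset((1, 2)): 5,  # A-H
    frozenset((0, 2)): 2,  # D-H
}
print(triplet_key(("D", "A", "H"), example_bins))  # ('H-A-D', (5, 3, 2))
```

At least one walk always satisfies the ordering condition, because the longest edge can be traversed in either direction and one of the two directions continues along the middle edge.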

Indexing Tetrahedra

Problems:
• Need a unique mapping
• Must deal with chirality
• Literally dozens of possible permutations
• Mapping must be based on bins and features

[Figure: four tetrahedra with vertices labeled A-D and edge bins 2, 3 and 4; one pair is captioned "Plane of symmetry implies no chirality", the other pair "Chiral tetrahedra"]

Mapping Quartet Bits

Mapping for 7 bins and 3 features (D, A, H)

...DDDD DDDA DDDH ... HHHH
...542333* ... 666665 666666
...000001000000

Bitmap Size = 7^6 * 3^4 = 9,529,569 bits

* 542333 specifies the + enantiomer; 245333 specifies the - enantiomer
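The bitmap size on this slide follows from simple counting: a multiplet with k points has k(k-1)/2 edges, each falling in one of the distance bins, and k vertices, each carrying one of the feature types. The sketch below reproduces that arithmetic; the function name `bitmap_size` is an assumption for illustration, and the count ignores any reduction the real implementation may make for symmetry-equivalent multiplets.

```python
def bitmap_size(n_bins, n_features, n_points):
    """Number of bits needed when every (bin, feature) combination gets its own bit."""
    n_edges = n_points * (n_points - 1) // 2  # 1 for pairs, 3 for triplets, 6 for quartets
    return n_bins ** n_edges * n_features ** n_points

print(bitmap_size(7, 3, 4))  # 9529569, the quartet size quoted on the slide
print(bitmap_size(7, 3, 3))  # 9261 for triplets
```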

Distribution of Distances Between Features

[Figure: three histograms of frequency vs. edge length (Å, 0-15), one each for beta blockers, K+ channel openers and Type I antiarrhythmics]

Cumulative Distributions across Classes

[Figure: two panels, "100 Conformer By Class" and "1 Conformer By Class", plotting frequency vs. edge length (Å, 0-15) for Estrogen Antagonists, Type III Antiarrhythmics, Benzamides, Phenothiazines, Beta Blockers, Type I Antiarrhythmics and K Channel Openers]

Effect of Biased Conformer Generation

[Figure: two panels, "100 Confort Conformer By Class" and "100 Systematic Search Conformers By Class", plotting frequency vs. edge length (Å, 0-15) for Estrogen Antagonists, Type III Antiarrhythmics, Benzamides, Phenothiazines, Beta Blockers, Type I Antiarrhythmics and K Channel Openers]

Hypothesis Fingerprint Creation

                               DDD000  DDD001  DDA200  DAA210  DDH210  DAH331  DHH333  HHH433
Binary Compound Fingerprints      0       1       0       0       0       1       1       0
                                  0       1       0       1       0       0       1       0
                                  1       0       0       1       0       1       1       1
                                  0       1       1       1       0       0       1       0
Vector Sum Fingerprint            1       3       1       3       0       2       4       1
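The vector sum fingerprint is the column-wise sum of the binary compound fingerprints, so each entry counts how many compounds set that bit. A minimal sketch using the four 8-bit fingerprints from the slide; the function name `vector_sum` is an assumption for illustration.

```python
def vector_sum(fingerprints):
    """Column-wise sum of equal-length binary fingerprints."""
    return [sum(column) for column in zip(*fingerprints)]

# The four binary compound fingerprints shown on the slide.
compound_fingerprints = [
    [0, 1, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 1, 0, 0, 1, 0],
    [1, 0, 0, 1, 0, 1, 1, 1],
    [0, 1, 1, 1, 0, 0, 1, 0],
]
print(vector_sum(compound_fingerprints))  # [1, 3, 1, 3, 0, 2, 4, 1]
```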

Hypothesis Fingerprint Creation

                               DDD111  DDD211  DDA311  DAA321  DDH321  DAH442  DHH444  HHH544
Binary Compound Fingerprints      0       1       0       0       0       1       1       0
                                  0       1       0       1       0       0       1       0
                                  1       0       0       1       0       1       1       1
                                  0       1       1       1       0       0       1       0
Vector Sum Fingerprint            1       3       1       3       0       2       4       1
Feature Weights                   3       3       3       3       4       4       5       6
Bin Weights                       3       4       5       6       6      10      12      13
Bit Score                         9      36      15      54      24      80     240      78

Weighting Bits for Hypothesis Generation

$$S_b = f_b \times \sum_{i=1}^{nf} fw_i \times \sum_{j=1}^{nd} dw_j$$

S_b is the score for the bit
f_b is the frequency of the bit
fw_i is the weight of the feature type
dw_j is the weight of the distance bin

[Figure: triangle with features f1, f2, f3 and distances d1, d2, d3]

⇒ Construct an hypothesis from the highest scoring bits.
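The scoring and selection steps can be sketched as follows. The per-feature weights (D = 1, A = 1, H = 2) and the use of the bin number as the bin weight are assumptions inferred from the Feature Weights and Bin Weights rows of the Hypothesis Fingerprint Creation table; the names `bit_score` and `top_bits` are hypothetical. With these inputs the product reproduces the table's bit scores except for DDH321, whose frequency is 0: the product gives 0 there, whereas the table lists 24.

```python
# Assumed per-feature weights, inferred from the slide's Feature Weights row.
FEATURE_WEIGHT = {"D": 1, "A": 1, "H": 2}

def bit_score(label, frequency):
    """Score a triplet bit labeled like 'DAH442' (3 features, then 3 bin digits)."""
    features, bin_digits = label[:3], label[3:]
    feature_sum = sum(FEATURE_WEIGHT[f] for f in features)
    bin_sum = sum(int(d) for d in bin_digits)  # assumed: bin weight = bin number
    return frequency * feature_sum * bin_sum

def top_bits(frequencies, n_bits):
    """Return the n_bits highest-scoring bit labels as the hypothesis."""
    ranked = sorted(frequencies, key=lambda lab: bit_score(lab, frequencies[lab]),
                    reverse=True)
    return ranked[:n_bits]

# Bit frequencies taken from the slide's vector sum fingerprint.
frequencies = {
    "DDD111": 1, "DDD211": 3, "DDA311": 1, "DAA321": 3,
    "DDH321": 0, "DAH442": 2, "DHH444": 4, "HHH544": 1,
}
print(bit_score("DHH444", 4))    # 240
print(top_bits(frequencies, 3))  # ['DHH444', 'DAH442', 'HHH544']
```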

Sanity Checker

[Slide repeats the Hypothesis Fingerprint Creation table (bits DDD111 to HHH544 with vector sum, feature weights, bin weights and bit scores) and adds an equation giving S as a ratio of N_t and N_tn]

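Similarity Measures

• Tanimoto coefficient

$$t(A,B) = \frac{|pdt(A) \cap pdt(B)|}{|pdt(A) \cup pdt(B)|}$$

• Cosine coefficient

$$C_c(a,b) = \frac{|pdt(a) \cap pdt(b)|}{\sqrt{|pdt(a)| \times |pdt(b)|}}$$

• Stochastic cosine coefficient

$$s(A,B) = \frac{E\left[|pdt(A) \cap pdt^*(B)|\right]}{\sqrt{E\left[|pdt(A) \cap pdt^*(A)|\right] \times E\left[|pdt(B) \cap pdt^*(B)|\right]}}$$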
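The three coefficients can be sketched on fingerprints stored as Python sets of set-bit indices. The stochastic version below assumes that pdt and pdt* denote fingerprints from independent conformer-sampling runs of the same molecule, and it estimates each expectation by averaging overlaps across the supplied runs; that reading, the function names, and the averaging scheme are assumptions, not the author's implementation.

```python
from math import sqrt

def tanimoto(a, b):
    """Shared bits divided by bits set in either fingerprint."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def cosine(a, b):
    """Shared bits divided by the geometric mean of the two bit counts."""
    denom = sqrt(len(a) * len(b))
    return len(a & b) / denom if denom else 0.0

def mean_overlap(runs_x, runs_y, skip_same_index=False):
    """Average |x & y| over pairs of runs; optionally skip pairing a run with itself."""
    overlaps = [len(x & y)
                for i, x in enumerate(runs_x)
                for j, y in enumerate(runs_y)
                if not (skip_same_index and i == j)]
    if not overlaps:
        raise ValueError("need at least two independent runs per molecule")
    return sum(overlaps) / len(overlaps)

def stochastic_cosine(runs_a, runs_b):
    """Estimate s(A, B) from lists of independently generated fingerprints."""
    cross = mean_overlap(runs_a, runs_b)
    self_a = mean_overlap(runs_a, runs_a, skip_same_index=True)
    self_b = mean_overlap(runs_b, runs_b, skip_same_index=True)
    denom = sqrt(self_a * self_b)
    return cross / denom if denom else 0.0

print(tanimoto({1, 2, 3}, {2, 3, 4}))                                  # 0.5
print(stochastic_cosine([{1, 2, 3}, {1, 2}], [{2, 3}, {2, 3, 4}]))      # 0.75
```

Normalizing by the run-to-run self-overlap rather than by the raw bit counts means that a molecule whose fingerprint varies from run to run is not penalized for that sampling noise.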

Effect of Conformer Count on Stochastic Cosine Similarity

[Figure: similarity (0-0.6) vs. conformer count (max) (0-1000); class and non-class similarity curves for Estrogen_Antagonist, K_openers and benzamides]

Effect of Conformer Count on Stochastic Cosine Discrimination

[Figure: discrimination ratio (0-14) vs. conformer count (max) (1-1000, log scale) for I_Antiarrhythmics, III_Antiarrhythmics, Phenothiazines, beta Blocker, Benzamides, K_openers and Estrogen_Antagonist]

Discrimination and Similarity Measure

[Figure: two panels, "simple cosine" and "Tanimoto", plotting discrimination ratio vs. conformer count (max) (1-1000, log scale) for I_Antiarrhythmics, III_Antiarrhythmics, Phenothiazines, beta Blocker, Benzamides, K_openers and Estrogen_Antagonist]

Discrimination and Conformer Bias

[Figure: two panels, "CONFORT" and "systematic search", plotting discrimination ratio (0-14) vs. conformer count (max) (1-1000, log scale) for I_Antiarrhythmics, III_Antiarrhythmics, Phenothiazines, beta Blocker, Benzamides, K_openers and Estrogen_Antagonist]

Symmetric Similarity Measures

• Symmetric stochastic cosine

$$s(A,B) = \frac{E\left[|pdt(A) \cap pdt^*(B)|\right]}{\sqrt{E\left[|pdt(A) \cap pdt^*(A)|\right] \times E\left[|pdt(B) \cap pdt^*(B)|\right]}}$$

• Asymmetric stochastic cosine

$$s^*(h,t) = \frac{E\left[|pdt(h) \cap pdt(t)|\right]}{E\left[|pdt(h) \cap pdt^*(h)|\right]}$$
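The asymmetric form normalizes only by the hypothesis h, so a target t that covers every hypothesis bit scores highly however many extra bits it sets. The sketch below assumes, as a reading of the slide rather than a documented definition, that each expectation can be estimated by averaging overlaps across independently generated fingerprints; the function name and input layout are hypothetical.

```python
def asymmetric_stochastic_cosine(hyp_runs, target_runs):
    """Estimate s*(h, t): overlap with the target, normalized by the hypothesis alone.

    hyp_runs, target_runs: lists of fingerprints (sets of bit indices), one per
    independent run; at least two hypothesis runs are needed for the denominator.
    """
    cross = [len(h & t) for h in hyp_runs for t in target_runs]
    self_h = [len(x & y)
              for i, x in enumerate(hyp_runs)
              for j, y in enumerate(hyp_runs)
              if i != j]
    if not cross or not self_h:
        raise ValueError("need target runs and at least two hypothesis runs")
    denom = sum(self_h) / len(self_h)
    return (sum(cross) / len(cross)) / denom if denom else 0.0

small = [{1, 2}, {1, 2}]
large = [{1, 2, 3, 4}, {1, 2, 3, 4}]
print(asymmetric_stochastic_cosine(small, large))  # 1.0
print(asymmetric_stochastic_cosine(large, small))  # 0.5
```

Swapping the roles of hypothesis and target changes the result, which is the asymmetry the slide's title refers to.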

Effect of Hypothesis Size (Type III antiarrhythmics)

[Figure: two panels, "CONFORT" and "systematic search", plotting average similarity (0-0.6) vs. bits in hypothesis (0-1000), within class and without class; legend entries: 100 Conformers, 1000 Conformers, asymmetric stochastic cosine, symmetric cosine]

Conclusions

• Compression is cool
• Natural binning does make sense
  o 1.75 3 4 5 6 7 8 8.75 9.75 10.75 11.75 13 15 >15 Å
  o at least for triplets
• Systematic bias increases discrimination
  o rule-based conformational bias can be useful
  o caveat: it may limit lead-hopping
• More is not necessarily better
  o true in terms of conformation count
  o true in terms of multiplet hypothesis size
• A little asymmetry can be a good thing
• Compression is still cool
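The "natural" bin boundaries listed above can be applied to an inter-feature distance with a binary search. This is a minimal sketch: the slide gives only the boundary values, so treating each boundary as the inclusive upper edge of its bin (so that exactly 15 Å falls below the ">15 Å" bin) and placing distances under 1.75 Å in bin 0 are assumptions, as is the function name.

```python
from bisect import bisect_left

# Boundary values from the slide, in Å.
BIN_EDGES = [1.75, 3, 4, 5, 6, 7, 8, 8.75, 9.75, 10.75, 11.75, 13, 15]

def distance_bin(distance):
    """Map a distance in Å to a bin index; the last index (13) means > 15 Å."""
    return bisect_left(BIN_EDGES, distance)

print(distance_bin(2.0))   # 1
print(distance_bin(3.5))   # 2
print(distance_bin(15.0))  # 12
print(distance_bin(20.0))  # 13
```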

www.tripos.com

Acknowledgements

Novo Nordisk A/S (Denmark)
  Lars Nærum*
  Henning Thøgersen*

Tripos, Inc.
  Edmond Abrahamian
  Peter Fox
  Trevor Heritage

May the multiplets be with you...

What a Protein “Sees”

(electrostatic field at 0.5 Å resolution, 80 and 30% contours)

What the Chemist Sees

[Figure: 2D structures of two compounds, a tetrahydrophthalimide (American Cyanamid) and a trifluorotoluidide pyrazole ether (Monsanto)]

Pharmacophoric Features

[Figure: the same two structures with their pharmacophoric features marked: hydrophobic centers, hydrogen bond acceptors, hydrogen bond donor]

Conformational Sampling*

*diverse conformers obtained using CONFORT

Mapping Multiplets

Mapping for 7 bins and 3 features (D, A, H)*

...DDD DDA DDH ... HHH
...532000 ... 665 666
...001    (callout: 1 bit)

Bitmap Size = 7^3 * 3^3 = 9261 bits

* Features are handled in the order supplied by the application.
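One simple way to realize such a mapping is a mixed-radix index: the feature labels form a base-3 number that selects a block (DDD first, HHH last) and the bin digits form a base-7 offset within that block (000 first, 666 last). The sketch below illustrates that idea only; the function name, the zero-based bins, and the D, A, H ordering are assumptions, and the actual SYBYL mapping may differ, for example in how symmetry-equivalent multiplets are collapsed.

```python
def multiplet_bit_index(features, bins, feature_order="DAH", n_bins=7):
    """Map a triplet's feature labels and bin digits to a bit position.

    features: sequence of labels, e.g. ("D", "D", "A").
    bins: sequence of zero-based bin numbers, e.g. (5, 3, 2).
    """
    feature_code = 0
    for label in features:
        feature_code = feature_code * len(feature_order) + feature_order.index(label)
    bin_code = 0
    for b in bins:
        if not 0 <= b < n_bins:
            raise ValueError(f"bin {b} outside 0..{n_bins - 1}")
        bin_code = bin_code * n_bins + b
    # Each feature combination owns a contiguous block of n_bins ** len(bins) bits.
    return feature_code * n_bins ** len(bins) + bin_code

print(multiplet_bit_index(("D", "D", "D"), (0, 0, 0)))  # 0
print(multiplet_bit_index(("H", "H", "H"), (6, 6, 6)))  # 9260, the last of 9261 bits
```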

Hypothesis Generation

Multiple methods implemented for hypothesis generation

o From a collection of known actives
o From a user-defined UNITY® query
o From a single molecule pharmacophore map
  a) Single or multiple generated conformers
o From user-specified residues in receptor cavity

Privileged Substructures: Augmented Triplets

[Figure: triplet with vertices HY, HY and AA, augmented with a DS feature]

@_AUGMENTED
# name mnemonic xref weight min_dist max_dist
DONOR_SITE DS AA 3.0 2.5 3.5
.=NULL.

Effect of Conformer Count on Cosine Coefficient Similarity

[Figure: similarity (0-0.6) vs. conformer count (max) (0-1000); class and non-class similarity curves for Estrogen_Antagonist, K_openers and benzamides]

[Figure: discrimination ratio (0-14) plotted against a logarithmic axis running from 1 to 1000 for I_Antiarrhythmics, III_Antiarrhythmics, Phenothiazines, beta Blocker, Benzamides, K_openers and Estrogen_Antagonist]