Post on 17-Dec-2015

Varun Khanna and Shoba Ranganathan
Macquarie University, Sydney, Australia

vkhanna@cbms.mq.edu.au

Physicochemical property space distribution among human metabolites, drugs and toxins

2

Outline of the presentation

• Introduction
• Chemoinformatics and current drug discovery approach
• Drug-likeness and related measures
• Molecular bioactivity space
• Results
• Conclusion

3

Chemoinformatics

Chemistry + Informatics = Chemoinformatics (Brown 1998; Willett 2007)

Involves many sub-disciplines today, such as:

• Similarity and diversity analysis
• CASD: Computer Aided Synthesis Design
• CASE: Computer Aided Structure Elucidation
• QSAR: Quantitative Structure Activity Relationship

4

Current drug discovery process: 10-15 years & US $1 billion

Disease → Target identification → Lead identification → Preclinical testing → Human clinical trials → Approved by regulatory authorities → Market

5

Toxicity: major cause of drug failures

Schuster D, Laggner C, Langer T: Why drugs fail - a study on side effects in new chemical entities. Curr Pharm Des 2005, 11(27):3545-3559.

Gut J, Bagatto D: Theragenomic knowledge management for individualised safety of drugs, chemicals, pollutants and dietary ingredients. Expert Opin Drug Metab Toxicol 2005, 1(3):537-554.

However, toxins have not yet been compared with drugs or any other drug-like set of molecules.

Data resources available:

• Distributed Structure-Searchable Toxicity (DSSTox)
• Carcinogenic Potency Database (potency.berkeley.edu)

7

A very brief history of “drug-likeness”

Lipinski’s Rule of Five (Ro5) has dominated drug design and discovery since 1997.

A molecule is “non-drug-like” if it has >5 hydrogen bond donors, >10 hydrogen bond acceptors, molecular mass >500 Da and lipophilicity (measured as AlogP) >5.
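The four Ro5 limits can be written as a small check. The sketch below is illustrative only: it assumes the descriptors have already been computed by some chemistry toolkit, the `MolProps` container and the example values are hypothetical, and it simply reports which limits a molecule exceeds rather than deciding how many violations make a molecule “non-drug-like”.

```python
from dataclasses import dataclass


@dataclass
class MolProps:
    """Precomputed descriptors for one molecule (hypothetical container)."""
    hbd: int      # number of hydrogen bond donors
    hba: int      # number of hydrogen bond acceptors
    mass: float   # molecular mass in Da
    alogp: float  # lipophilicity (AlogP)


def ro5_violations(m: MolProps) -> list[str]:
    """Return the names of the Ro5 limits that the molecule exceeds."""
    checks = {
        "hbd > 5": m.hbd > 5,
        "hba > 10": m.hba > 10,
        "mass > 500": m.mass > 500,
        "alogp > 5": m.alogp > 5,
    }
    return [name for name, violated in checks.items() if violated]


# Illustrative values only, not taken from any of the datasets.
small = MolProps(hbd=1, hba=4, mass=180.2, alogp=1.2)
large = MolProps(hbd=7, hba=12, mass=650.0, alogp=2.0)
print(ro5_violations(small))  # []
print(ro5_violations(large))  # ['hbd > 5', 'hba > 10', 'mass > 500']
```

Returning the list of exceeded limits, rather than a single yes/no flag, leaves the choice of cut-off to the caller.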

More recently, metabolite-likeness has become important for designing targeted drugs that act on specific metabolic pathways (Dobson et al., 2009).

Data resources available:

• Human Metabolome Database (www.hmdb.ca)
• DrugBank (www.drugbank.ca)

8

Molecular bioactivity space

[Figure: diagram of molecular bioactivity space with regions labelled DRUGS, TOXINS, METABOLITES and NP.]

9

Large scale physicochemical property comparison

[Figure: Venn diagram of the clustered datasets: Metabolites (4568), Drugs (3248) and Toxins (995), with overlap regions labelled 186 (4.38 %), 228 (2.91 %) and 92 (1.65 %).]

In this paper, we present:

• Comprehensive analysis of drugs, metabolites and toxins
• Comparison of Ro5, 1D and 3D properties
• Clustered (or representative) vs. unclustered (or raw) datasets (for the first time)

10

Clustered and unclustered (raw) datasets

Dataset       Metabolites   Drugs      Toxins
Unclustered   M: 6582       D: 4829    T: 1448
Clustered     CM: 4568      CD: 3248   CT: 995
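As a quick arithmetic check on the table, the share of each raw dataset that survives clustering can be computed directly from the counts. This is only a sketch: the helper name `retained_percent` is made up, and it assumes the clustered sets are subsets (representatives) of the raw ones.

```python
# Dataset sizes from the table: (unclustered, clustered).
sizes = {
    "Metabolites": (6582, 4568),
    "Drugs": (4829, 3248),
    "Toxins": (1448, 995),
}


def retained_percent(raw: int, clustered: int) -> float:
    """Percentage of the raw dataset kept after clustering."""
    return 100.0 * clustered / raw


for name, (raw, clustered) in sizes.items():
    print(f"{name}: {retained_percent(raw, clustered):.1f}% retained")
```

On these counts, roughly two thirds of each dataset remains after clustering.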

11

Properties of drug-like molecules

• Lipinski properties (Ro5)
• 1D properties: number of atoms, number of nitrogen and oxygen atoms, number of rings, number of rotatable bonds
• 3D properties: molecular volume, molecular surface area, molecular polar surface area, molecular solvent accessible surface area

Analysis

• Software: SciTegic Pipeline Pilot (accelrys.com/products/scitegic)
• Clustering: “Cluster Clara” algorithm, employing ECFP_4 fingerprints as molecular descriptors
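CLARA is a sampling-based k-medoids method, so its core step is assigning each molecule to the most similar medoid. The sketch below shows that step under stated assumptions: fingerprints are represented as sets of integer feature ids (the ids are made up and are not real ECFP_4 output), Tanimoto similarity is assumed as the comparison measure, and the medoids are given by hand. It is not the Pipeline Pilot implementation, whose internals the slides do not describe.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto similarity between two fingerprints given as sets of feature ids."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def assign_to_medoids(fps, medoid_idx):
    """Map each medoid index to the indices of the fingerprints most similar to it."""
    clusters = {m: [] for m in medoid_idx}
    for i, fp in enumerate(fps):
        # Pick the medoid with the highest similarity to this fingerprint.
        best = max(medoid_idx, key=lambda m: tanimoto(fp, fps[m]))
        clusters[best].append(i)
    return clusters


# Toy fingerprints with made-up feature ids.
fps = [
    frozenset({1, 2, 3, 4}),
    frozenset({1, 2, 3, 5}),
    frozenset({10, 11, 12}),
    frozenset({10, 11, 13}),
]
clusters = assign_to_medoids(fps, medoid_idx=[0, 2])
print(clusters)  # {0: [0, 1], 2: [2, 3]}
```

A full CLARA run would repeat medoid selection on random samples of the dataset and keep the best set; keeping one medoid per cluster is one way to obtain a “representative” dataset.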

13

“Rule of five” analysis

Lipinski properties

Dataset                  Molecular weight <500 Da   H-bond donors <=5   H-bond acceptors <=10   Log P <5
HMDB (Metabolites)       34%                        84%                 84%                     35%
DDB (Drugs)              84%                        86%                 87%                     92%
CPDB (Toxic molecules)   94%                        98%                 97%                     92%
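Assuming each cell in the table is the percentage of molecules in a dataset that meets the criterion in its column, such figures could be produced along the following lines. The dictionary keys, the `pass_rates` helper and the four toy molecules are hypothetical and are not drawn from HMDB, DrugBank or CPDB.

```python
# Criteria as listed in the table header.
CRITERIA = {
    "MW < 500 Da": lambda m: m["mw"] < 500,
    "HBD <= 5": lambda m: m["hbd"] <= 5,
    "HBA <= 10": lambda m: m["hba"] <= 10,
    "LogP < 5": lambda m: m["logp"] < 5,
}


def pass_rates(molecules):
    """Percentage of molecules meeting each criterion, keyed by criterion name."""
    n = len(molecules)
    return {
        name: 100.0 * sum(1 for m in molecules if test(m)) / n
        for name, test in CRITERIA.items()
    }


# Four made-up molecules, for illustration only.
toy = [
    {"mw": 180.0, "hbd": 2, "hba": 4, "logp": 1.0},
    {"mw": 320.0, "hbd": 1, "hba": 6, "logp": 3.5},
    {"mw": 760.0, "hbd": 8, "hba": 14, "logp": 6.2},
    {"mw": 540.0, "hbd": 3, "hba": 9, "logp": 4.1},
]
rates = pass_rates(toy)
print(rates)
```

Each criterion is evaluated independently, which matches a table that reports one percentage per property rather than a single overall Ro5 pass rate.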

14

Lipinski properties

[Figure: four histograms comparing Metabolites, Drugs and Toxins; y-axis: percentage of molecules. Panels: a. Molecular weight; b. AlogP; c. Lipinski hydrogen bond donor; d. Lipinski hydrogen bond acceptor.]
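The panels on this slide plot the percentage of molecules falling into each property bin. A minimal sketch of that binning is given below; the function name, the bin width and the sample weights are assumptions for illustration, not values from the study.

```python
from collections import Counter


def percent_histogram(values, bin_width, start=0.0):
    """Percentage of values falling in each [lo, lo + bin_width) bin."""
    # Floor division also places negative values (e.g. AlogP) in the right bin.
    counts = Counter(int((v - start) // bin_width) for v in values)
    total = len(values)
    return {
        (start + k * bin_width, start + (k + 1) * bin_width): 100.0 * c / total
        for k, c in sorted(counts.items())
    }


# Made-up molecular weights in Da.
weights = [95.0, 150.0, 180.0, 310.0, 450.0]
hist = percent_histogram(weights, bin_width=100)
for (lo, hi), pct in hist.items():
    print(f"{lo:.0f}-{hi:.0f}: {pct:.0f}%")
```

Normalising to percentages rather than raw counts is what allows datasets of different sizes to be overlaid on the same axes.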

15

1D property comparison

[Figure: four histograms comparing Metabolites, Drugs and Toxins; y-axis: percentage of molecules. Panels: a. Number of atoms; b. Number of carbon atoms; c. Number of nitrogen atoms; d. Number of oxygen atoms.]

16

3D property comparison

[Figure: four histograms comparing Metabolites, Drugs and Toxins; y-axis: percentage of molecules. Panels: a. Molecular surface area; b. Molecular volume; c. Molecular polar surface area; d. Molecular solvent accessible volume.]

Clustered vs. raw datasets - I

[Figure: histograms (y-axis: percentage) comparing clustered and raw datasets for metabolites (M), drugs (D) and toxins (T). Panels: a. Number of oxygen atoms; b. Number of rings. Annotations on the slide: ~10 % drop; ~9 % drop (twice).]

Clustered vs. raw datasets - II

[Figure: histograms (y-axis: percentage) comparing clustered and raw datasets for metabolites (M), drugs (D) and toxins (T). Panels: a. Molecular solubility; b. Molecular polar surface area. Annotations on the slide: ~10 % drop; ~15 % rise.]

19

Functional group analysis

Functional group    Metabolite dataset   Drugs dataset   Toxin dataset
Aromatic atom       17.4%                70.6%           62.3%
Benzene             10.3%                56.0%           53.0%
HBA Ester           56.3%                13.8%           15.4%
Primary amine       28.0%                14.4%           12.0%
Secondary amine     11.4%                64.0%           41.2%
Tertiary amine      44.6%                80.0%           60.0%
Quaternary amine    15.3%                2.1%            0.5%
Primary amide       1.5%                 4.5%            3.9%
Secondary amide     11.4%                31.0%           14.5%
Tertiary amide      2.8%                 16.8%           9.2%
Alkyl halide        ~0.5%                ~0.5%           3.2%
Azo                 0.0%                 ~0.5%           3.4%
Nitroso             ~0.5%                0.6%            8.4%

21

Conclusions

70% of the metabolites lie outside the Lipinski universe, whereas 90% of the toxins abide by Lipinski’s rule.

Ro5 does not explicitly take toxicity into account, and present-day drugs are therefore more akin to toxins than to metabolites.

Empirical rules like the “Ro5” can be refined to increase the coverage of drugs or drug-like molecules that are clearly not close to toxic compounds.

Clustered and unclustered datasets are very similar, except for the number of oxygen atoms, the molecular polar surface area and the number of rings.

22

Related work

Customary medicinal plant database:
Gaikwad J, Khanna V, Vemulpad S, Jamie J, Kohen J, Ranganathan S: CMKb: a web-based prototype for integrating Australian Aboriginal customary medicinal plant knowledge. BMC Bioinformatics 2008, 9 Suppl 12:S25.

Invited Chemoinformatics book chapter:
Khanna V, Ranganathan S: In Silico Methods for the Analysis of Metabolites and Drug Molecules, in Algorithms in Computational Molecular Biology: Techniques, Approaches and Applications, eds. M. Elloumi and A.Y. Zomaya, Wiley, 2009, accepted.

23

Acknowledgement

VK is grateful to:

• Macquarie University for the award of a Research Excellence Scholarship (MQRES)
• PhD Supervisor and Co-supervisors @ MQ
• Colleagues and friends @ MQ
• InCoB2009 Program and Organizing Committee members

24

Thank you and

Questions

25

Supplementary data