Date posted: 20-Dec-2015

Quantitative Structure-Activity Relationships (QSAR)

Comparative Molecular Field Analysis (CoMFA)

Gijs Schaftenaar

Outline

• Introduction

• Structures and activities

• Analysis techniques: Free-Wilson, Hansch

• Regression techniques: PCA, PLS

• Comparative Molecular Field Analysis

QSAR: The Setting

Quantitative structure-activity relationships are used when there is little or no receptor information, but there are measured activities of (many) compounds.

From Structure to Property

[Figure: 16 compound structures and a plot of EC50 against compound number]

From Structure to Property

[Figure: 16 compound structures; property shown: LD50]

From Structure to Property

[Figure: 16 compound structures]

QSAR: Which Relationship?

Quantitative structure-activity relationships correlate chemical/biological activities with structural features or atomic, group or molecular properties, within a range of structurally similar compounds.

Free Energy of Binding and Equilibrium Constants

The free energy of binding is related to the reaction constants of ligand-receptor complex formation:

$\Delta G_{\mathrm{binding}} = -2.303\,RT \log K = -2.303\,RT \log (k_{\mathrm{on}} / k_{\mathrm{off}})$

Equilibrium constant: $K$

Rate constants: $k_{\mathrm{on}}$ (association) and $k_{\mathrm{off}}$ (dissociation)
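As a worked illustration of the relation above, the following minimal sketch converts rate constants into a binding free energy. The temperature of 298.15 K, the function name, and the example rate constants are assumptions chosen for illustration; they are not values from the slides.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol*K)

def binding_free_energy(k_on, k_off, temperature=298.15):
    """Return Delta G_binding in kJ/mol from association and dissociation rate constants."""
    K = k_on / k_off  # equilibrium (association) constant
    return -2.303 * R * temperature * math.log10(K)

# Hypothetical rate constants: k_on = 1e6 1/(M*s), k_off = 1e-3 1/s, so K = 1e9 1/M
dg = binding_free_energy(1e6, 1e-3)
print(f"Delta G_binding = {dg:.1f} kJ/mol")
```

Under these assumptions each factor of 10 in K changes the free energy by about 5.7 kJ/mol, which is why activities are usually compared on a logarithmic scale.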

Concentration as Activity Measure

• A critical molar concentration C that produces the biological effect is related to the equilibrium constant K

• Usually log (1/C) is used (cf. pH)

• For meaningful QSARs, activities need to be spread out over at least 3 log units

Free Energy of Binding

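$\Delta G_{\mathrm{binding}} = \Delta G_0 + \Delta G_{\mathrm{hb}} + \Delta G_{\mathrm{ionic}} + \Delta G_{\mathrm{lipo}} + \Delta G_{\mathrm{rot}}$

Term      Contribution                         Energy
ΔG0       entropy loss (translat. + rotat.)    +5.4
ΔGhb      ideal hydrogen bond                  –4.7
ΔGionic   ideal ionic interaction              –8.3
ΔGlipo    lipophilic contact                   –0.17
ΔGrot     entropy loss (rotat. bonds)          +1.4

(Energies in kJ/mol per unit feature)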
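The additive scheme above can be sketched as a simple sum over feature counts. The per-feature energies are the ones listed on the slide; the function name and the feature counts of the example ligand are hypothetical.

```python
# Per-feature contributions in kJ/mol, as listed on the slide
CONTRIBUTIONS = {
    "hbond": -4.7,   # ideal hydrogen bond
    "ionic": -8.3,   # ideal ionic interaction
    "lipo": -0.17,   # lipophilic contact
    "rot": 1.4,      # entropy loss per rotatable bond
}
DG0 = 5.4  # entropy loss (translation + rotation), counted once

def estimate_binding_energy(counts):
    """Sum DG0 and the per-feature contributions weighted by their counts."""
    return DG0 + sum(CONTRIBUTIONS[name] * n for name, n in counts.items())

# Hypothetical ligand: 3 hydrogen bonds, 1 ionic interaction,
# 100 units of lipophilic contact, 4 rotatable bonds
ligand = {"hbond": 3, "ionic": 1, "lipo": 100, "rot": 4}
dg = estimate_binding_energy(ligand)
print(f"Estimated Delta G_binding = {dg:.1f} kJ/mol")
```

For this made-up ligand the terms add up to 5.4 - 14.1 - 8.3 - 17.0 + 5.6 = -28.4 kJ/mol.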

Basic Assumption in QSAR

The structural properties of a compound

contribute

in a linearly additive way to its biological

activity

provided there are no non-linear dependencies of

transport or binding on some properties

An Example: Capsaicin Analogs

X        EC50 (µM)   log(1/EC50)
H        11.80       4.93
Cl       1.24        5.91
NO2      4.58        5.34
CN       26.50       4.58
C6H5     0.24        6.62
NMe2     4.39        5.36
I        0.35        6.46
NHCHO    ?           ?

[Structure: capsaicin analog with variable substituent X]
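The activity column can be reproduced from the concentration column. This minimal sketch assumes the EC50 values are given in micromolar, which is the unit that reproduces the tabulated log(1/EC50) values; the function and variable names are illustrative.

```python
import math

# EC50 values from the table, assumed to be in micromolar
ec50_uM = {"H": 11.80, "Cl": 1.24, "NO2": 4.58, "CN": 26.50,
           "C6H5": 0.24, "NMe2": 4.39, "I": 0.35}

def log_inverse_conc(ec50_micromolar):
    """Return log10(1/C), with C converted from micromolar to molar."""
    return -math.log10(ec50_micromolar * 1e-6)

activities = {x: round(log_inverse_conc(c), 2) for x, c in ec50_uM.items()}
for x, value in activities.items():
    print(f"{x:6s} {value:.2f}")
```

The most and least active analogs (C6H5 and CN) differ by about 2 log units.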

An Example: Capsaicin Analogs

X        log(1/EC50)   MR      π       σ       Es
H        4.93          1.03    0.00    0.00    0.00
Cl       5.91          6.03    0.71    0.23    -0.97
NO2      5.34          7.36    -0.28   0.78    -2.52
CN       4.58          6.33    -0.57   0.66    -0.51
C6H5     6.62          25.36   1.96    -0.01   -3.82
NMe2     5.36          15.55   0.18    -0.83   -2.90
I        6.46          13.94   1.12    0.18    -1.40
NHCHO    ?             10.31   -0.98   0.00    -0.98

MR = molar refractivity (polarizability) parameter; π = hydrophobicity parameter; σ = electronic sigma constant (para position); Es = Taft size parameter

An Example: Capsaicin Analogs

[Structure: capsaicin analog with variable substituent X]

$\log(1/EC_{50}) = -0.89 + 0.019\,MR + 0.23\,\pi - 0.31\,\sigma - 0.14\,E_s$
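A regression of this form can be obtained by ordinary least squares on the descriptor table. The sketch below is one way to do that with NumPy, not the procedure used for the slide, and the coefficients it yields need not match the ones quoted above. With seven compounds and five parameters the fit is heavily over-parameterized, so the prediction for NHCHO should be read as an illustration only.

```python
import numpy as np

# Columns: MR, pi, sigma, Es (descriptor values from the slide's table)
descriptors = np.array([
    [1.03, 0.00, 0.00, 0.00],     # H
    [6.03, 0.71, 0.23, -0.97],    # Cl
    [7.36, -0.28, 0.78, -2.52],   # NO2
    [6.33, -0.57, 0.66, -0.51],   # CN
    [25.36, 1.96, -0.01, -3.82],  # C6H5
    [15.55, 0.18, -0.83, -2.90],  # NMe2
    [13.94, 1.12, 0.18, -1.40],   # I
])
activity = np.array([4.93, 5.91, 5.34, 4.58, 6.62, 5.36, 6.46])

# Design matrix with a leading column of ones for the intercept
X = np.column_stack([np.ones(len(activity)), descriptors])
coeffs, *_ = np.linalg.lstsq(X, activity, rcond=None)

fitted = X @ coeffs
ss_res = np.sum((activity - fitted) ** 2)
ss_tot = np.sum((activity - activity.mean()) ** 2)
r2 = 1 - ss_res / ss_tot

# Predict the unmeasured NHCHO analog from its descriptor values
nhcho = np.array([1.0, 10.31, -0.98, 0.00, -0.98])
predicted = float(nhcho @ coeffs)

print("intercept, MR, pi, sigma, Es:", np.round(coeffs, 3))
print("R^2:", round(float(r2), 3))
print("predicted log(1/EC50) for NHCHO:", round(predicted, 2))
```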


First Approaches: The Early Days

• Free-Wilson Analysis

• Hansch Analysis

Free-Wilson Analysis

$\log(1/C) = \sum_i a_i x_i + \mu$

$x_i$: presence of group i (0 or 1)

$a_i$: activity contribution of group i

$\mu$: activity value of unsubstituted compound
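The Free-Wilson model is a linear regression on 0/1 indicator variables. The following sketch uses a hypothetical series with three substituent groups whose activities were constructed to be exactly additive, so the fit should recover the contributions that went in. None of the numbers come from the slides.

```python
import numpy as np

# Hypothetical series: each row is a compound, each column marks the
# presence (1) or absence (0) of one substituent group.
groups = ["Cl", "Me", "OMe"]
x = np.array([
    [0, 0, 0],  # unsubstituted parent
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [1, 1, 0],
    [1, 0, 1],
    [0, 1, 1],
])

# Hypothetical, exactly additive activities: mu = 5.0 and
# group contributions a = (0.6, 0.2, -0.3)
log_inv_c = 5.0 + x @ np.array([0.6, 0.2, -0.3])

# Least-squares fit of log(1/C) = sum_i a_i x_i + mu
design = np.column_stack([np.ones(len(x)), x])
params, *_ = np.linalg.lstsq(design, log_inv_c, rcond=None)
mu, a = params[0], params[1:]

print("mu:", round(float(mu), 3))
for name, value in zip(groups, a):
    print(f"a_{name}: {value:.3f}")
```

A group that never occurs in the training series has no column in `x`, which is why predictions are limited to substituents already included.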

Free-Wilson Analysis

+ Computationally straightforward

– Predictions only for substituents already included

– Requires large number of compounds

Hansch Analysis

Drug transport and binding affinity depend nonlinearly on lipophilicity:

$\log(1/C) = a (\log P)^2 + b \log P + c\,\sigma + k$

P: n-octanol/water partition coefficient

σ: Hammett electronic parameter

a, b, c: regression coefficients

k: constant term
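Because the Hansch equation is quadratic in log P, a negative coefficient a implies an optimal lipophilicity at log P = -b / (2a). The sketch below evaluates the equation and that optimum; the coefficient values are hypothetical and the function names are illustrative.

```python
def hansch_activity(log_p, sigma, a, b, c, k):
    """Evaluate log(1/C) = a*(log P)^2 + b*log P + c*sigma + k."""
    return a * log_p ** 2 + b * log_p + c * sigma + k

def optimal_log_p(a, b):
    """Return the log P at the vertex of the parabola (a maximum only if a < 0)."""
    if a >= 0:
        raise ValueError("a must be negative for an activity maximum")
    return -b / (2 * a)

# Hypothetical regression coefficients
a, b, c, k = -0.25, 1.0, 0.5, 3.0

best = optimal_log_p(a, b)
print("optimal log P:", best)
print("log(1/C) at optimum, sigma = 0:", hansch_activity(best, 0.0, a, b, c, k))
```

With these made-up coefficients the optimum lies at log P = 2, and activity falls off on both sides of it.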

Hansch Analysis

+ Fewer regression coefficients needed for correlation

+ Interpretation in physicochemical terms

+ Predictions for other substituents possible

Molecular Descriptors

• Simple counts of features, e.g. of atoms, rings, H-bond donors, molecular weight

• Physicochemical properties, e.g. polarisability, hydrophobicity (logP), water-solubility

• Group properties, e.g. Hammett and Taft constants, volume

• 2D Fingerprints based on fragments

• 3D Screens based on fragments

2D Fingerprints

[Structure: capsaicin analog with X = Br]

C   N   O   P   S   X   F   Cl  Br  I   Ph  CO  NH  OH  Me  Et  Py  CHO SO  C=C C≡C C=N Am  Im
1   1   1   0   0   1   0   0   1   0   1   1   1   1   1   0   0   0   0   1   0   0   1   0
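A fragment fingerprint of this kind is a fixed-order list of keys with one bit per key. The sketch below rebuilds the bit string shown on the slide from the set of fragments present. The key order follows the slide; the function name is an assumption, and the carbon-carbon triple bond key is written as "C#C" to keep the source ASCII-only. Real fingerprint software detects the fragments from the structure, whereas here they are listed by hand.

```python
# Fragment keys in the order used on the slide
KEYS = ["C", "N", "O", "P", "S", "X", "F", "Cl", "Br", "I", "Ph", "CO",
        "NH", "OH", "Me", "Et", "Py", "CHO", "SO", "C=C", "C#C", "C=N",
        "Am", "Im"]

def fingerprint(present):
    """Return a list of 0/1 bits, one per key, for the fragments present."""
    present = set(present)
    unknown = present - set(KEYS)
    if unknown:
        raise ValueError(f"unknown fragment keys: {sorted(unknown)}")
    return [1 if key in present else 0 for key in KEYS]

# Fragments marked as present for the bromo analog on the slide
bromo_analog = {"C", "N", "O", "X", "Br", "Ph", "CO", "NH", "OH", "Me",
                "C=C", "Am"}

bits = fingerprint(bromo_analog)
print(" ".join(str(b) for b in bits))
```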

Regression Techniques

• Principal Component Analysis (PCA)

• Partial Least Squares (PLS)

Principal Component Analysis (PCA)

• Many (>3) variables to describe objects = high dimensionality of descriptor data

• PCA is used to reduce dimensionality

• PCA extracts the most important factors (principal components or PCs) from the data

• Useful when correlations exist between descriptors

• The result is a new, small set of variables (PCs) which explain most of the data variation

PCA – From 2D to 1D

PCA – From 3D to 3D-

Different Views on PCA

• Statistically, PCA is a multivariate analysis technique closely related to eigenvector analysis

• In matrix terms, PCA is a decomposition of matrix X into two smaller matrices plus a set of residuals: $X = TP^{T} + R$

• Geometrically, PCA is a projection technique in which X is projected onto a subspace of reduced dimensions
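The matrix view can be made concrete with a singular value decomposition, which is one common way to compute PCA. In this sketch T holds the scores, P the loadings, and R the residuals; the descriptor matrix is random, hypothetical data with two pairs of strongly correlated columns.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical descriptor matrix: 10 compounds x 4 descriptors, where the
# last two columns are near-copies of the first two (correlated descriptors)
base = rng.normal(size=(10, 2))
X_raw = np.column_stack([base, base + 0.01 * rng.normal(size=(10, 2))])
X = X_raw - X_raw.mean(axis=0)  # centre each column before PCA

U, s, Vt = np.linalg.svd(X, full_matrices=False)

n_pc = 2
T = U[:, :n_pc] * s[:n_pc]  # scores: compounds in PC coordinates
P = Vt[:n_pc].T             # loadings: PCs in descriptor coordinates
R = X - T @ P.T             # residuals not captured by the first n_pc PCs

explained = (s[:n_pc] ** 2).sum() / (s ** 2).sum()
print("fraction of variance explained by 2 PCs:", round(float(explained), 4))
```

Because the four descriptors carry only two independent sources of variation, two PCs explain almost all of the variance and R is close to zero.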

Partial Least Squares (PLS)

$y_1 = a_0 + a_1 x_{11} + a_2 x_{12} + a_3 x_{13} + \dots + e_1$   (compound 1)

$y_2 = a_0 + a_1 x_{21} + a_2 x_{22} + a_3 x_{23} + \dots + e_2$   (compound 2)

$y_3 = a_0 + a_1 x_{31} + a_2 x_{32} + a_3 x_{33} + \dots + e_3$   (compound 3)

$y_n = a_0 + a_1 x_{n1} + a_2 x_{n2} + a_3 x_{n3} + \dots + e_n$   (compound n)

$Y = XA + E$

X = independent variables

Y = dependent variables
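The slides do not say how the coefficients A are obtained. One standard choice for a single dependent variable is the NIPALS form of PLS (often called PLS1), sketched below with NumPy. The function name and the random test data are assumptions made for illustration.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit a PLS1 model with NIPALS; return (coefficients, intercept)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    weights, loadings, y_loadings = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk                  # direction of maximal covariance with y
        w = w / np.linalg.norm(w)
        t = Xk @ w                     # scores along that direction
        tt = t @ t
        p = Xk.T @ t / tt              # X loadings
        c = yk @ t / tt                # y loading
        Xk = Xk - np.outer(t, p)       # deflate X
        yk = yk - c * t                # deflate y
        weights.append(w)
        loadings.append(p)
        y_loadings.append(c)
    W = np.array(weights).T
    P = np.array(loadings).T
    c_vec = np.array(y_loadings)
    coef = W @ np.linalg.solve(P.T @ W, c_vec)
    intercept = y_mean - x_mean @ coef
    return coef, intercept

# Hypothetical data: 12 compounds, 3 descriptors
rng = np.random.default_rng(1)
X = rng.normal(size=(12, 3))
y = 1.0 + X @ np.array([0.5, -1.0, 2.0]) + 0.1 * rng.normal(size=12)

coef, intercept = pls1_fit(X, y, n_components=2)
y_hat = intercept + X @ coef
print("coefficients:", np.round(coef, 3), "intercept:", round(float(intercept), 3))
```

When the number of components equals the number of descriptors (and X has full rank), PLS1 reproduces ordinary least squares; using fewer components is what makes it usable with many correlated descriptors.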

PLS – Cross-validation

• Squared correlation coefficient $R^2$

• Value between 0 and 1 (> 0.9 desired)

• Indicates the explanatory power of the regression equation

With cross-validation:

• Cross-validated squared correlation coefficient $Q^2$

• Value at most 1, and possibly negative for a poor model (> 0.5 desired)

• Indicates the predictive power of the regression equation
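The difference between the two statistics can be sketched with leave-one-out cross-validation. To keep the example short, the model here is an ordinary least-squares fit rather than PLS; the cross-validation loop is the same for either. The data are random and hypothetical, and Q² is computed as 1 - PRESS / SS using the overall mean of y, which is one of several conventions.

```python
import numpy as np

def fit_predict(X_train, y_train, X_test):
    """Least-squares fit with intercept on the training set, then predict."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    coeffs, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return np.column_stack([np.ones(len(X_test)), X_test]) @ coeffs

def r_squared(X, y):
    """Explanatory power: every compound is used for fitting and evaluation."""
    residuals = y - fit_predict(X, y, X)
    return 1 - residuals @ residuals / np.sum((y - y.mean()) ** 2)

def q_squared_loo(X, y):
    """Predictive power: each compound is predicted by a model that never saw it."""
    press = 0.0
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        prediction = fit_predict(X[keep], y[keep], X[i:i + 1])[0]
        press += (y[i] - prediction) ** 2
    return 1 - press / np.sum((y - y.mean()) ** 2)

# Hypothetical data: 20 compounds, 2 descriptors
rng = np.random.default_rng(2)
X = rng.normal(size=(20, 2))
y = 3.0 + X @ np.array([1.0, -0.5]) + 0.2 * rng.normal(size=20)

r2 = r_squared(X, y)
q2 = q_squared_loo(X, y)
print("R^2:", round(float(r2), 3), "Q^2:", round(float(q2), 3))
```

For a least-squares model Q² never exceeds R², and a large gap between the two is a sign of overfitting.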

PCA vs PLS

• PCA: The principal components describe the variance in the independent variables (descriptors)

• PLS: The principal components describe the variance in both the independent variables (descriptors) and the dependent variable (activity)

Comparative Molecular Field Analysis (CoMFA)

• Set of chemically related compounds

• Common substructure required

• 3D structures needed (e.g., Corina-generated)

• Bioactive conformations of the active compounds are to be aligned

CoMFA Alignment

[Figure: alignment of three structures with rings labeled A to D; shared points L and the distances d1, d2, d3 between them define the "Pharmacophore"]

CoMFA Grid and Field Probe

(Only one molecule shown for clarity)

Electrostatic Potential Contour Lines

CoMFA Model Derivation

• Molecules are positioned in a regular grid according to alignment

• Probes are used to determine the molecular field:

Van der Waals field (probe is neutral carbon)

$E_{\mathrm{vdw}} = \sum_i (A_i r_{ij}^{-12} - B_i r_{ij}^{-6})$

Electrostatic field (probe is charged atom)

$E_c = \sum_i q_i q_j / (D r_{ij})$
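The field calculation can be sketched as a sum over atoms for every grid point. Everything numeric below is hypothetical: the two-atom "molecule", its charges and A/B parameters, the 2 Å grid spacing, and the probe charge of +1. Physical constants are omitted, as in the formulas on the slide, so the energies are in arbitrary units.

```python
import numpy as np

# Hypothetical molecule: atom coordinates (angstrom), partial charges,
# and Lennard-Jones-type parameters A_i and B_i (illustrative values)
coords = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0]])
charges = np.array([-0.4, 0.4])
A = np.array([1.0e5, 1.0e5])
B = np.array([1.0e2, 1.0e2])

def comfa_fields(coords, charges, A, B, grid_points, probe_charge=1.0, D=1.0):
    """Return (E_vdw, E_c) at each grid point, summed over all atoms."""
    # r[j, i] = distance between grid point j and atom i
    r = np.linalg.norm(grid_points[:, None, :] - coords[None, :, :], axis=2)
    e_vdw = np.sum(A / r ** 12 - B / r ** 6, axis=1)
    e_c = np.sum(charges * probe_charge / (D * r), axis=1)
    return e_vdw, e_c

# Regular grid around the molecule, offset so that no grid point
# coincides with an atom (r = 0 would make both fields diverge)
axis = np.arange(-3.0, 5.0, 2.0)  # -3, -1, 1, 3
grid = np.array([[x, y, z] for x in axis for y in axis for z in axis])

e_vdw, e_c = comfa_fields(coords, charges, A, B, grid)
print("grid points:", len(grid))
print("E_vdw range:", e_vdw.min(), e_vdw.max())
print("E_c range:", e_c.min(), e_c.max())
```

Each aligned molecule yields one such pair of vectors; stacking them row by row gives the descriptor matrix that a regression method such as PLS then relates to the measured activities.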

3D Contour Map for Electronegativity

CoMFA Pros and Cons

+ Suitable to describe receptor-ligand interactions

+ 3D visualization of important features

+ Good correlation within related set

+ Predictive power within scanned space

– Alignment is often difficult

– Training required