Page 1: Virtual Screening at the post-genomic era

CENTRE NATIONAL DE LA RECHERCHE SCIENTIFIQUE

Dr. Didier ROGNAN

Bioinformatic Group

UMR CNRS 7081

Illkirch, France

[email protected]

Page 2: Virtual screening: Definition

Searching electronic databases (2D, 3D) for molecules fitting:

- a pharmacophore
- an active site

Walters et al., Drug Discovery Today 1998, 3, 160-178.

Schneider et al., Drug Discovery Today 2002, 7, 64-70.

Page 3: Importance of virtual screening

Scientific reasons

1. Increasing number of interesting macromolecular targets (500 → 10,000)
2. Increasing number of protein 3-D structures (X-ray, NMR)
3. Better knowledge of protein-ligand interactions
4. Development of chem- and bio-informatic methods
5. Increasing computing facilities

Economic reasons

1. High cost of high-throughput screening (HTS): 0.2 € / molecule
2. Increase the ratio (# of active molecules (hits)) / (# of tested molecules)

Applications

1. Identifying the very first ligands of orphan targets
2. Identifying/optimizing new chemical scaffolds
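The economic argument can be made concrete with a little arithmetic. The sketch below computes the hits/tested ratio and the screening cost per hit at the 0.2 € per molecule quoted on the slide; the function names and the two campaign sizes are hypothetical and chosen only for illustration.

```python
def hit_rate(n_hits: int, n_tested: int) -> float:
    """Ratio of active molecules (hits) to tested molecules."""
    if n_tested <= 0:
        raise ValueError("n_tested must be positive")
    return n_hits / n_tested


def cost_per_hit(n_hits: int, n_tested: int, cost_per_molecule: float = 0.2) -> float:
    """Screening cost in euros per hit, at 0.2 euro per tested molecule."""
    if n_hits <= 0:
        raise ValueError("n_hits must be positive")
    return n_tested * cost_per_molecule / n_hits


# Hypothetical campaigns (hits, tested); the counts are illustrative only.
campaigns = {"full HTS": (100, 100_000), "focused subset": (50, 1_000)}
for name, (hits, tested) in campaigns.items():
    print(f"{name}: hit rate {hit_rate(hits, tested):.3%}, "
          f"{cost_per_hit(hits, tested):.0f} euro per hit")
```

Testing fewer molecules with a higher hit rate lowers the cost per hit, which is the point of increasing the ratio.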

Page 4: Protein-based virtual screening

Inputs: Database (3-D) and Target (3-D!)

1. Orientation ("docking") → Target-Ligand Complex
2. Evaluation ("scoring") → Hit list

Hit list

Mol #      ΔGbind
11121      -44.51
222        -42.21
3563       -41.50
6578       -40.31
25639      -40.28
...        ...
100000      22.54
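The two-step workflow (dock each database entry, score the complex, sort into a hit list) can be sketched as a short loop. This is only a schematic: `virtual_screen`, `dock` and `score` are hypothetical names, the docking step is a placeholder that returns the ligand unchanged, and the scores are simply the values shown in the hit list above.

```python
from typing import Callable, Iterable, List, Tuple


def virtual_screen(
    database: Iterable[Tuple[int, object]],
    dock: Callable[[object], object],
    score: Callable[[object], float],
    top_n: int,
) -> List[Tuple[int, float]]:
    """Dock and score every molecule, then return the top_n best-scored ones."""
    results = []
    for mol_id, ligand in database:
        pose = dock(ligand)                 # step 1: orientation ("docking")
        results.append((mol_id, score(pose)))  # step 2: evaluation ("scoring")
    results.sort(key=lambda item: item[1])  # most negative binding energy first
    return results[:top_n]


# Placeholder data: molecule numbers and energies taken from the hit list.
slide_scores = {11121: -44.51, 222: -42.21, 3563: -41.50,
                6578: -40.31, 25639: -40.28, 100000: 22.54}
database = [(mol_id, mol_id) for mol_id in slide_scores]

hit_list = virtual_screen(
    database,
    dock=lambda ligand: ligand,             # stand-in for a real docking program
    score=lambda pose: slide_scores[pose],  # stand-in for a real scoring function
    top_n=3,
)
print(hit_list)
```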

Page 5: Docking

Goal

Quickly find (1-2 min/molecule):

- the orientation of the ligand in the active site
- the protein-bound conformation

Methods

Orientation

- Surface complementarity
- Complementarity of intermolecular interactions

Conformational freedom

- Incremental construction
- Conformational sampling (Monte Carlo, genetic algorithm, simulated annealing)

Abagyan et al., Curr. Opin. Chem. Biol. 2001, 5, 375-382.

Page 6: Docking: Orientation

Surface-based orientation (e.g. DOCK)

1. 3D structure
2. Molecular surface (active site)
3. Filling the surface by overlapping spheres
4. Matching sphere centers with atoms
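Step 4 amounts to a distance-compatibility search: a set of ligand atoms can sit on a set of sphere centers if all pairwise distances agree within a tolerance. The brute-force sketch below illustrates that idea only; it is not DOCK's actual matching algorithm, and the function name, the tolerance and the toy coordinates are assumptions.

```python
import math
from itertools import combinations, permutations


def match_spheres_to_atoms(spheres, atoms, tol=0.5, size=3):
    """Return sphere/atom index pairings whose internal distances agree.

    Every combination of `size` sphere centers is compared with every ordered
    selection of `size` atoms; a pairing is kept when each sphere-sphere
    distance differs from the matching atom-atom distance by at most `tol`.
    """
    matches = []
    for s_idx in combinations(range(len(spheres)), size):
        for a_idx in permutations(range(len(atoms)), size):
            compatible = all(
                abs(math.dist(spheres[s_idx[i]], spheres[s_idx[j]])
                    - math.dist(atoms[a_idx[i]], atoms[a_idx[j]])) <= tol
                for i, j in combinations(range(size), 2)
            )
            if compatible:
                matches.append(list(zip(s_idx, a_idx)))
    return matches


# Toy example: the ligand atoms form the same 3-4-5 triangle as the spheres,
# translated by (10, 10, 10).
spheres = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
atoms = [(10.0, 10.0, 10.0), (13.0, 10.0, 10.0), (10.0, 14.0, 10.0)]
print(match_spheres_to_atoms(spheres, atoms, tol=0.1))
```

Each accepted pairing would then define a rigid-body transformation that places the ligand in the site; the exhaustive loop scales poorly and is only meant to show the matching criterion.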

Page 7: Docking: Orientation

Interaction-based orientation (e.g. FlexX, http://cartan.gmd.de/flexx)

- Statistical rules for locating ligand atoms
- Overall placement of a base fragment by triangulation

Page 8: Docking: Ligand flexibility

- by preselecting several conformers per molecule
- by incremental construction

Incremental construction: Ligand → Fragment decomposition → Reading preferred torsion values → Selecting the "best" base fragment → Adding the 1st peripheral fragment → Adding the 2nd peripheral fragment → Termination

Page 9: Docking: Ligand flexibility

- by a genetic algorithm (e.g. Gold, http://www.ccdc.cam.ac.uk/prods/gold/)

Chromosome = Ligand (orientation, conformation)

Gene: x, y, z coordinates, torsion angles, orientation, …

Cycle: Initial population → Selection of parents → Genetic operators → Selection of children → New population → Convergence test (→ new generation)

Parameters: population size, survival rate, crossing-over rate, mutation rate, # of evolutions

Selection of parents (roulette wheel with sectors B, A, C, D):

Parent   Score
A        2.5
B        5.0
C        1.5
D        1.0

Genetic operators (bit-string illustrations):

- crossing over: 100110010010010011, 100110011010011010
- mutation: 100110010, 100101010
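The genetic-algorithm cycle can be illustrated with a minimal bit-string sketch. This is not Gold's implementation: the chromosome is a plain 18-bit string rather than an encoded ligand pose, the fitness function (`one_max`, the number of 1 bits) is a hypothetical stand-in for a docking score, and the default parameter values are arbitrary assumptions.

```python
import random


def survival_shares(scores):
    """Roulette-wheel shares: each parent's score divided by the total."""
    total = sum(scores.values())
    return {name: s / total for name, s in scores.items()}


def crossover(a, b, point):
    """One-point crossing over of two equal-length bit strings."""
    return a[:point] + b[point:], b[:point] + a[point:]


def mutate(chrom, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return "".join(
        ("1" if bit == "0" else "0") if rng.random() < rate else bit
        for bit in chrom
    )


def evolve(fitness, n_bits=18, pop_size=20, generations=30,
           crossover_rate=0.8, mutation_rate=0.02, seed=0):
    """Run the cycle: selection of parents, genetic operators, new population."""
    rng = random.Random(seed)
    pop = ["".join(rng.choice("01") for _ in range(n_bits))
           for _ in range(pop_size)]
    for _ in range(generations):
        # Fitness-proportional (roulette-wheel) weights; fitness must be >= 0.
        weights = [fitness(c) + 1e-9 for c in pop]
        children = [max(pop, key=fitness)]  # the best parent always survives
        while len(children) < pop_size:
            a, b = rng.choices(pop, weights=weights, k=2)
            if rng.random() < crossover_rate:
                a, b = crossover(a, b, rng.randrange(1, n_bits))
            children.extend([mutate(a, mutation_rate, rng),
                             mutate(b, mutation_rate, rng)])
        pop = children[:pop_size]
    return max(pop, key=fitness)


def one_max(chrom):
    """Toy fitness: number of 1 bits (stand-in for a docking score)."""
    return chrom.count("1")


print(survival_shares({"A": 2.5, "B": 5.0, "C": 1.5, "D": 1.0}))
print(evolve(one_max))
```

With the parent scores from the slide, parent B receives half of the wheel, so it is the most likely to be selected for reproduction.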

Page 10: Docking Accuracy

Analysing 100 high-resolution PDB complexes (Paul, N. and Rognan, D., Proteins, in press)

[Figure: Accuracy of the best possible pose (n = 30). x-axis: rmsd, Å (0-14); y-axis: % of complexes (0-100); series: Dock, FlexX, Gold, ConsDock]

Finding a reliable pose out of a set of 30-50 solutions is feasible!

Page 11: Docking Accuracy

Analysing 100 high-resolution PDB complexes (Paul, N. and Rognan, D., Proteins, in press)

[Figure: Accuracy of the top-ranked pose. x-axis: rmsd, Å (0-14); y-axis: % of complexes (0-100); series: Dock, FlexX, Gold, ConsDock]

Ranking the most reliable solution at the top of the list is still an issue!
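Both accuracy plots report the percentage of complexes whose docked pose falls within a given rmsd of the crystal structure. A minimal sketch of that measure follows; it assumes pose and reference atoms are listed in the same order and in the same coordinate frame (no superposition, no symmetry correction), and the 2.0 Å default cutoff is a common convention rather than a value taken from the slides.

```python
import math


def rmsd(pose, reference):
    """Root-mean-square deviation (in Å) between two matched coordinate lists."""
    if len(pose) != len(reference) or not pose:
        raise ValueError("pose and reference must be non-empty and equally long")
    squared = sum(
        (p - q) ** 2
        for atom_a, atom_b in zip(pose, reference)
        for p, q in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(pose))


def fraction_within(rmsds, cutoff=2.0):
    """Fraction of complexes whose pose is within `cutoff` Å of the reference."""
    return sum(r <= cutoff for r in rmsds) / len(rmsds)


# Toy coordinates: the pose is the reference shifted by 1 Å along z.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
pose = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(pose, reference))
print(fraction_within([0.5, 1.9, 3.0, 8.0]))
```

The "best possible pose" curve would use the minimum rmsd over all saved solutions for each complex, while the "top-ranked pose" curve uses only the rmsd of the best-scored solution.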

Page 12: Sources of docking errors

- Nature of the active site (flat vs. cavity)
- Missed influence of water
- Ligand flexibility
- Inaccuracy of the scoring function
- Unusual binding mode/interactions
- Inadequate set of protein coordinates
- Wrong atom typing

Difficulty scale: Impossible, Difficult, Easy

Page 13: Scoring

Methods (with the number of molecules each can handle):

- Thermodynamic methods: FEP, TI (2)
- Force fields (10-100)
- QSAR, 3D-QSAR (100-1,000)
- Empirical scoring functions (>100,000)

[Figure: accuracy of scoring methods. x-axis: # of molecules (2, 1000, 100,000); y-axis: error, kJ/mol (2, 10)]

Page 14: Scoring

- First-principles methods: sum of physically meaningful terms
- Regression-based free energy approximations: sum of regression-weighted terms
- Potentials of mean force: distance-dependent, atom pair-weighted Helmholtz free energies

Gohlke et al., Curr. Opin. Struct. Biol. 2001, 11, 231-235.

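Page 15: Empirical scoring function

Fresno: Rognan et al. (1999) J. Med. Chem., 42, 4650-4658.

$$
\Delta G_{binding} = \Delta G_{0}
+ \Delta G_{hb} \sum_{hb} g_1(\Delta r)\, g_2(\Delta\alpha)
+ \Delta G_{lipo} \sum_{lL} f(r_{lL})
+ \Delta G_{bp} \Big[ \sum_{lP} f(r_{lP}) + \sum_{pL} f(r_{pL}) \Big]
+ \Delta G_{rot}\, H_{rot}
+ \Delta G_{desolv}\, \Delta G_{reac}
$$

Terms, in order: constant, H-bond term, lipophilic term, buried-polar repulsive term, rotational term, desolvation term.

$$
g_1(\Delta r) =
\begin{cases}
1 & \text{if } \Delta r \le 0.25\ \text{Å} \\
1 - (\Delta r - 0.25)/0.4 & \text{if } 0.25\ \text{Å} < \Delta r \le 0.65\ \text{Å} \\
0 & \text{if } \Delta r > 0.65\ \text{Å}
\end{cases}
$$

$$
g_2(\Delta\alpha) =
\begin{cases}
1 & \text{if } \Delta\alpha \le 30^\circ \\
1 - (\Delta\alpha - 30)/50 & \text{if } 30^\circ < \Delta\alpha \le 80^\circ \\
0 & \text{if } \Delta\alpha > 80^\circ
\end{cases}
$$

$$
f(r) =
\begin{cases}
1 & \text{if } r \le R_1 \\
1 - (r - R_1)/3.0 & \text{if } R_1 < r \le R_2 \\
0 & \text{if } r > R_2
\end{cases}
$$

$$
H_{rot} = 1 + \left(1 - \frac{1}{N_{rot}}\right) \sum_{r} \frac{P_p(r) + P'_p(r)}{2}
$$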
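The piecewise functions on this slide translate directly into code. The sketch below implements g1, g2, f and H_rot as written; the regression weights (ΔG_hb, ΔG_lipo, and so on) are not given on the slide and are therefore left out. Treating P_p(r) and P'_p(r) as the fractions of polar heavy atoms on either side of rotatable bond r is an assumption, and all function and argument names are hypothetical.

```python
def g1(delta_r: float) -> float:
    """Distance penalty for an H-bond; delta_r is the deviation in Å."""
    if delta_r <= 0.25:
        return 1.0
    if delta_r > 0.65:
        return 0.0
    return 1.0 - (delta_r - 0.25) / 0.4


def g2(delta_alpha: float) -> float:
    """Angular penalty for an H-bond; delta_alpha is the deviation in degrees."""
    if delta_alpha <= 30.0:
        return 1.0
    if delta_alpha > 80.0:
        return 0.0
    return 1.0 - (delta_alpha - 30.0) / 50.0


def f(r: float, r1: float, r2: float) -> float:
    """Contact function: 1 up to R1, linear decay, 0 beyond R2."""
    if r <= r1:
        return 1.0
    if r > r2:
        return 0.0
    return 1.0 - (r - r1) / 3.0


def h_rot(polar_fractions) -> float:
    """H_rot from one (P_p, P'_p) pair per frozen rotatable bond."""
    n_rot = len(polar_fractions)
    if n_rot == 0:
        raise ValueError("the formula needs at least one rotatable bond")
    total = sum((p + p_prime) / 2.0 for p, p_prime in polar_fractions)
    return 1.0 + (1.0 - 1.0 / n_rot) * total


def hbond_sum(hbonds) -> float:
    """Sum of g1 * g2 over (delta_r, delta_alpha) pairs, before weighting."""
    return sum(g1(dr) * g2(da) for dr, da in hbonds)


print(hbond_sum([(0.1, 10.0), (0.45, 55.0)]))
print(h_rot([(0.5, 0.5), (1.0, 0.0)]))
```

Note that f(r) reaches zero continuously at R2 only when R2 = R1 + 3.0 Å, since the slope is fixed by the 3.0 in the denominator.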

Page 16: Scoring Accuracy

- Current accuracy: 5-10 kJ/mol (1-2 pK units)
- Weak point of all docking programs
- Entropic contributions are difficult to handle!
- Workaround: use of consensus scoring functions
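One simple form of consensus scoring keeps only the molecules that several scoring functions agree on. The sketch below uses an intersection of top-ranked sets; this particular voting scheme, the 5% default threshold, and the function and molecule names are assumptions for illustration, not the protocol used by the author.

```python
def consensus_hits(score_tables, top_fraction=0.05):
    """Molecules ranked in the top fraction by every scoring function.

    score_tables maps a scoring-function name to {molecule: score},
    where a lower (more negative) score means better predicted binding.
    """
    selected = None
    for scores in score_tables.values():
        ranked = sorted(scores, key=scores.get)        # best score first
        n_keep = max(1, int(len(ranked) * top_fraction))
        top = set(ranked[:n_keep])
        selected = top if selected is None else selected & top
    return selected or set()


# Hypothetical scores from three scoring functions for four molecules.
tables = {
    "f1": {"m1": -40.0, "m2": -35.0, "m3": -10.0, "m4": -5.0},
    "f2": {"m1": -8.0, "m2": -3.0, "m3": -9.0, "m4": -1.0},
    "f3": {"m1": -50.0, "m2": -20.0, "m3": -45.0, "m4": -10.0},
}
print(consensus_hits(tables, top_fraction=0.5))
```

Because each function has different weaknesses, a molecule that scores well with all of them is less likely to be an artefact of any single one.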

Page 17: Library set-up

Isis/Base → 2-D structure in line notation:

C[1](=C(C(=CS@1(=O)=O)SC[9]:C:C:C(:C:C:@9)Br)C[16]:C:C:C:C:C@16)N

→ 2-D Fingerprint → Full database

Filtering (chemical reactivity, pharmacokinetics, drug-likeness) → Filtered database

2D → 3D conversion (hydrogens, ionisation) → 3-D Database
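The filtering stage can be sketched as a predicate applied to every database entry. The slide does not say which drug-likeness criteria were used; the example below assumes Lipinski's rule of five with at most one violation allowed, plus a precomputed reactivity flag. The `Entry` fields and the sample values are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Entry:
    mol_id: str
    mol_weight: float   # g/mol
    logp: float         # calculated octanol/water partition coefficient
    h_donors: int
    h_acceptors: int
    reactive: bool      # flagged by a separate reactive-group check


def rule_of_five_violations(entry: Entry) -> int:
    """Count violated Lipinski criteria (assumed drug-likeness filter)."""
    return sum([
        entry.mol_weight > 500,
        entry.logp > 5,
        entry.h_donors > 5,
        entry.h_acceptors > 10,
    ])


def filter_library(entries: List[Entry], max_violations: int = 1) -> List[Entry]:
    """Keep unreactive entries with at most `max_violations` violations."""
    return [
        e for e in entries
        if not e.reactive and rule_of_five_violations(e) <= max_violations
    ]


# Hypothetical full database.
full_database = [
    Entry("cpd-1", 320.4, 2.1, 2, 5, False),
    Entry("cpd-2", 712.9, 6.3, 4, 12, False),   # three violations
    Entry("cpd-3", 280.3, 1.5, 1, 4, True),     # reactive group
]
filtered_database = filter_library(full_database)
print([e.mol_id for e in filtered_database])
```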

Page 18: Applications

High-resolution X-ray structures (enzymes)

Target         Ligands       Database size   Hit rate   Reference
CD4-gp120      inhibitors    150,000         9.7 %      Li et al., PNAS (1997)
gp41           inhibitors    20,000          12.5 %     Debnath et al., J. Med. Chem. (1999)
FT             inhibitors    219,000         19.0 %     Perola et al., J. Med. Chem. (2000)
kinesin        inhibitors    20,000          12.5 %     Hopkins et al., Biochemistry (2000)
HIV1 Tar-Tat   inhibitors    153,000         25.0 %     Filikov et al., JCAMD (2000)
Bcl-2          inhibitors    207,000         20.0 %     Enyedi et al., J. Med. Chem. (2001)
HCA-II         inhibitors    90,000          61.0 %     Grüneberg et al., Angew. (2001)
RAR            agonists      250,000         6.6 %      Shapira et al., BMC Struct. Biol. (2001)
TPI            inhibitors    108,000         20.0 %     Joubert et al., Proteins (2001)
ER             antagonists   1,500,000       72.0 %     Shapira et al., IBM Sys. J. (2001)

FT: farnesyltransferase, HCA: human carbonic anhydrase, RAR: retinoic acid receptor, ER: estrogen receptor, TPI: triosephosphate isomerase, PEP: phosphoenolpyruvate

Page 19: Conclusions

What is possible?

- Discriminating true hits from random ligands
- Enriching a reduced library by a factor of 20
- Retrieving about 50% of all true hits
- Prioritizing ligands for synthesis and experimental screening
- Using virtual screening for lead finding

What remains to be improved?

- Predicting the exact orientation
- Predicting the absolute binding free energy
- Discriminating true hits from "similar inactives"
- Catching all hits
- Using virtual screening for lead optimization
- Throughput (100K mols/day → 1M/day?)
- Pre- and post-processing of vHTS
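The enrichment factor and the fraction of true hits retrieved can be computed as below. The formulas are the usual definitions (hit rate in the selected subset divided by hit rate in the whole library, and hits retrieved divided by all hits); the library and subset sizes are hypothetical numbers chosen so that the result matches the factor of 20 and the roughly 50% retrieval quoted on the slide.

```python
def enrichment_factor(hits_selected: int, n_selected: int,
                      hits_total: int, n_total: int) -> float:
    """Hit rate in the selected subset divided by hit rate in the full library."""
    if min(n_selected, hits_total, n_total) <= 0:
        raise ValueError("counts must be positive")
    # Written as a single ratio of integer products to limit rounding error.
    return (hits_selected * n_total) / (n_selected * hits_total)


def recall(hits_selected: int, hits_total: int) -> float:
    """Fraction of all true hits that the selection retrieves."""
    if hits_total <= 0:
        raise ValueError("hits_total must be positive")
    return hits_selected / hits_total


# Hypothetical screen: 10,000 molecules containing 100 true hits;
# the top-scored 250 molecules contain 50 of them.
print(enrichment_factor(50, 250, 100, 10_000))
print(recall(50, 100))
```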

Page 20: Virtual screening at the genomic scale

Workflow: GPCRs of the human genome → Primary sequence → 3-D model (GPCR-Gen) → Virtual hits (vHTS) → True hits (validation) → Optimisation

e-Libraries:

- "Bioinfo" (350,000)
- "RCPG" (30,000)
- "Endo" (2,000)

Available analogues, focussed libraries

Selectivity (vs. enzymes (PDB library), vs. GPCRs (RCPG library)), affinity, ADME/Tox

Page 21: Virtual screening: Tomorrow

10^12 molecules (virtual library)
→ ADME/Tox → 10^9
→ 2-D similarity → 10^7
→ 3-D conformations → 10^7 (10^8 conformations)
→ 3-D similarity → 10^5 (10^6 conformations)
→ Docking → 10^4
→ Scoring → 10^3
→ Experimental validation → 100 true hits