Ligand Solvation in Molecular Docking



RESEARCH ARTICLES


Brian K. Shoichet,1* Andrew R. Leach,2 and Irwin D. Kuntz2*

1 Department of Molecular Pharmacology and Biological Chemistry, Northwestern University Medical School, Chicago, Illinois
2 Department of Pharmaceutical Chemistry, University of California, San Francisco, California

ABSTRACT  Solvation plays an important role in ligand-protein association and has a strong impact on comparisons of binding energies for dissimilar molecules. When databases of such molecules are screened for complementarity to receptors of known structure, as often occurs in structure-based inhibitor discovery, failure to consider ligand solvation often leads to putative ligands that are too highly charged or too large. To correct for the different charge states and sizes of the ligands, we calculated electrostatic and non-polar solvation free energies for molecules in a widely used molecular database, the Available Chemicals Directory (ACD). A modified Born equation treatment was used to calculate the electrostatic component of ligand solvation. The non-polar component of ligand solvation was calculated based on the surface area of the ligand and parameters derived from the hydration energies of apolar ligands. These solvation energies were subtracted from the ligand-receptor interaction energies. We tested the usefulness of these corrections by screening the ACD for molecules that complemented three proteins of known structure, using a molecular docking program. Correcting for ligand solvation improved the rankings of known ligands and discriminated against molecules with inappropriate charge states and sizes. Proteins 1999;34:4–16. 1999 Wiley-Liss, Inc.

Key words: solvation; molecular docking; database search; structure-based drug design; computer-aided drug design

INTRODUCTION

Given the structure of a biological receptor, it should be possible to design or discover molecules that will bind to it. Using atomic resolution structures and computational techniques, investigators have attempted to design [1–3] or discover [4–8] novel inhibitors for biological receptors. Such putative ligands have been selected for their complementarity to the structure of the receptor. When the energy of the solvated state is not considered in these calculations, the ligands that are selected often bear high formal charge or are larger than expected. This is particularly true when comparing many potential ligands that differ in polarity and size.

The binding affinity of a ligand for a receptor depends on the interaction free energy of the two molecules relative to their free energies in solution:

$$
\Delta G_{\mathrm{bind}} = \Delta G_{\mathrm{interact}} - \Delta G_{\mathrm{solv,L}} - \Delta G_{\mathrm{solv,R}} \tag{1}
$$

where $\Delta G_{\mathrm{interact}}$ is the interaction free energy of the complex, $\Delta G_{\mathrm{solv,L}}$ is the free energy of desolvating the ligand, and $\Delta G_{\mathrm{solv,R}}$ is the free energy of occluding the receptor site from solvent. Various methods have been proposed to evaluate or to estimate these terms; the problem is difficult because the energy of each component on the right-hand side of Equation 1 is large while the difference between them is small.

The most accurate way to calculate relative binding energies is with free-energy perturbation techniques [9–11]. These techniques are usually restricted to calculating the differential binding of similar compounds, and require extensive computation [12–14]. In novel inhibitor design and discovery, many different candidate molecules are evaluated, making free-energy perturbation impractical as an initial screen.

Methods that consider many different possible ligand-receptor complexes are necessarily less accurate than perturbation techniques. The free energy of interaction ($\Delta G_{\mathrm{interact}}$) is usually approximated as an enthalpy and is calculated with a receptor potential function, often derived from molecular mechanics [15–17]. Ligand and receptor solvation contributions ($\Delta G_{\mathrm{solv,L}}$ and $\Delta G_{\mathrm{solv,R}}$) are usually calculated as free energies; efforts range from empirical scales [18,19] to parameterization for specific functional groups [20–24] to detailed theoretical treatments [25–27]. Several investigators have implicitly included solvation in their affinity calculations by parameterizing the affinity of individual functional groups based on the binding of sets of known ligands to different receptors [28–32]. This overcomes the problem of subtracting the explicitly calculated solvation and interaction energies, both of which are typically large. The energies used in the parameterized functional group methods have no necessary relationship to discrete atomic terms, such as electrostatic and van der Waals forces. We are only interested in such atomic terms here, and will consider only force-field-based calculations of affinities.

Grant sponsor: PhRMA Foundation; Grant sponsor: Animal Alternatives Program of Procter and Gamble; Grant sponsor: National Institutes of Health; Grant number: GM31497; Grant sponsor: Science and Engineering Research Council (UK) under the NATO postdoctoral fellowship program.

Andrew R. Leach's present address is Department of Chemistry, Glaxo-Wellcome, Southampton, Hampshire SO9 5NH, United Kingdom.

*Correspondence to: Brian K. Shoichet, Department of Molecular Pharmacology and Biological Chemistry, Northwestern University Medical School, 303 East Chicago Avenue, Chicago, IL 60611–3008. E-mail: [email protected], or to Irwin D. Kuntz, Department of Pharmaceutical Chemistry, University of California, San Francisco, CA 94143–0446. E-mail: [email protected].

Received 3 February 1998; Accepted 27 August 1998

PROTEINS: Structure, Function, and Genetics 34:4–16 (1999)

1999 WILEY-LISS, INC.

Several authors have described force fields that consider the bound and solvated states. Amino acids have been parameterized using experimental vapor-partitioning coefficients [20] for the ECEPP potential function [23] to allow for solvation correction. Similar corrections have been used in ligand discovery methods [2,33,34]. These methods successfully predicted new ligands and also the structures of ligand-receptor complexes [33]. It seems clear that the solvation correction terms improved the evaluation of the putative complexes. On the other hand, the relative magnitudes of the solvation and interaction terms in these studies remained problematic, as the experimental partitioning numbers are often small compared with the interaction energies one might expect with a force field such as ECEPP or AMBER. Also, the experimental solvation numbers used by these authors are best characterized, in the context of a macromolecular potential function, for peptides and nucleic acids, and less well characterized for other small organic molecules. Many inhibitor design algorithms, including several of our own, have ignored solvation and have used only the interaction energy for evaluating complexes, treating it as a proxy for the free energy of binding of different putative ligands or functional groups [1,3,5,6,35–37].

Here we attempt to balance protein-ligand interaction energies with ligand solvation energies in docking studies of a molecular database. The molecular docking program DOCK was used to screen the Available Chemicals Directory (ACD) database for compounds that complemented the enzymes thymidylate synthase (TS), the L99A mutant of T4 lysozyme, and dihydrofolate reductase (DHFR). TS binds pyrimidine nucleotides bearing one or two formal negative charges; the "cavity" mutant of T4 lysozyme binds neutral aromatic hydrocarbons; DHFR binds pteridines bearing either one or no positive formal charge. By choosing these three receptors, we hoped to span a range of ligand charge and size, testing the solvation corrections in very different receptor environments. The ACD includes known ligands for each receptor, but most of the molecules in the ACD are not thought to bind to these enzymes.

We corrected the electrostatic interaction energy ($\Delta G_{\mathrm{elec,interact}}$) with a ligand electrostatic solvation energy ($\Delta G_{\mathrm{elec,L,solv}}$) and the van der Waals component of the interaction energy ($E_{\mathrm{vdw,interact}}$) with a non-polar component of ligand solvation ($\Delta G_{\mathrm{np,L,solv}}$). Dividing the solvation and interaction terms into electrostatic and non-electrostatic components is not rigorously correct [13], but the approximation is often reasonable [24]. The binding energy score equation becomes:

$$
\Delta G_{\mathrm{bind}} = \Delta G_{\mathrm{elec,interact}} + E_{\mathrm{vdw,interact}} - \Delta G_{\mathrm{elec,L,solv}} - \Delta G_{\mathrm{np,L,solv}} \tag{2}
$$

Equation 2 lacks many of the terms that are considered important for determining binding affinities. Since we will be concerned only with relative binding affinities of the ligands, some of these other terms will cancel. We assume that: every ligand pays the same configurational entropy cost for binding; the receptor adopts the same conformation in each complex; every ligand desolvates the receptor equally; the ligand is completely desolvated on binding; each ligand has only one bound conformation. These assumptions limit the accuracy of our results. The energies returned by the docking calculations are prone to deviations, sometimes enormous deviations, from true binding energies, and we will refer to energy "scores" rather than simply energies in recognition of this. Still, for inhibitor discovery and database screening applications, the first task to perform is to screen out unlikely ligands, and highlight likely ones. Equation 2 significantly improves our ability to do so, over the case where solvation is ignored, as we will show.
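The cancellation argument can be written out explicitly. What follows is only a sketch under the assumptions listed above: $S(X)$ denotes the Equation 2 score of ligand $X$, and the constant $C$ is a symbol introduced here for illustration, collecting the terms assumed to be identical for every ligand (configurational entropy cost, receptor conformational energy, receptor desolvation).

```latex
\begin{align*}
\Delta G_{\mathrm{bind}}(A) &\approx S(A) + C, \qquad
\Delta G_{\mathrm{bind}}(B) \approx S(B) + C \\
\Delta\Delta G_{\mathrm{bind}}(A,B)
  &= \Delta G_{\mathrm{bind}}(A) - \Delta G_{\mathrm{bind}}(B)
  \approx S(A) - S(B)
\end{align*}
```

Because $C$ drops out of the difference, the ranking of ligands by score is unaffected by its value, provided the assumptions hold for every ligand being compared.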

METHODS

Approach

We first outline the general procedure for the DOCK database screens and then provide the values of the variables that were used in the various calculations.

The binding site of the protein is defined using spheres [38] that complement the molecular surface [39] of the protein or points in the site where ligand atoms are experimentally known to bind [7]. These spheres and points can be thought of as pseudo-atom positions onto which DOCK superimposes atoms of the database molecules to generate a ligand orientation in the binding site. For any given ligand, multiple orientations are generated, depending on the correspondence of the internal distances of its atoms with those of the receptor spheres [35,38,40].

Once oriented in the site, the molecule is screened for steric complementarity. For orientations that pass this screen, an interaction energy is calculated based on electrostatic and van der Waals complementarity to the protein. The electrostatic potential of the protein is calculated using the finite-difference Poisson-Boltzmann method implemented in DelPhi [41]. The electrostatic component of the interaction energy comes from multiplying ligand partial atomic charges by the receptor potential at the atom positions of a given ligand orientation [36]. Partial atomic charges are calculated with the Gasteiger algorithm [42] implemented in the program SYBYL (Tripos Associates, 1991) [36]. The van der Waals potential of the protein site is calculated with CHEMGRID [36]. The van der Waals component of the interaction energy comes from multiplying ligand van der Waals parameters, assigned by DOCK, by the receptor potential at the various atom positions of a given ligand orientation [36]. For any orientation of a molecule in a binding site, the interaction energy is:

$$
\Delta G_{\mathrm{interact}} = \sum_i \left( q_i P_i + v_i P_v \right) \tag{3}
$$

where $q_i$ is the charge of atom $i$ of the ligand, $P_i$ is the electrostatic potential of the receptor at the position occupied by atom $i$, $v_i$ is the van der Waals atomic parameter of atom $i$, and $P_v$ is the van der Waals potential at position $i$.
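As a rough illustration of Equation 3, the sketch below sums the per-atom products of ligand parameters and pre-computed receptor potentials. It assumes a deliberately simplified grid (a dictionary keyed by integer lattice indices, read at the nearest lattice point); the `Atom` class, the grid layout, and all numbers are hypothetical and are not taken from DOCK, DelPhi, or CHEMGRID.

```python
from dataclasses import dataclass

@dataclass
class Atom:
    position: tuple  # (x, y, z) in Angstroms
    charge: float    # partial atomic charge, q_i
    vdw: float       # van der Waals atomic parameter, v_i

def nearest_grid_value(grid, position, spacing):
    """Read the grid value stored at the lattice point nearest to `position`."""
    key = tuple(round(coord / spacing) for coord in position)
    return grid[key]

def interaction_score(atoms, elec_grid, vdw_grid, spacing):
    """Sum q_i * P_i + v_i * P_v over all ligand atoms (form of Equation 3)."""
    total = 0.0
    for atom in atoms:
        p_elec = nearest_grid_value(elec_grid, atom.position, spacing)
        p_vdw = nearest_grid_value(vdw_grid, atom.position, spacing)
        total += atom.charge * p_elec + atom.vdw * p_vdw
    return total

# Made-up two-point grids and a two-atom "ligand", for illustration only.
elec_grid = {(0, 0, 0): -10.0, (1, 0, 0): 5.0}
vdw_grid = {(0, 0, 0): -0.5, (1, 0, 0): -0.2}
ligand = [Atom((0.1, 0.0, 0.0), charge=-1.0, vdw=2.0),
          Atom((0.9, 0.2, 0.0), charge=0.5, vdw=1.0)]

score = interaction_score(ligand, elec_grid, vdw_grid, spacing=1.0)
print(round(score, 3))
```

Because the receptor potentials are stored once on a grid, scoring an orientation costs only one lookup and two multiplications per atom, which is what makes screening a large database practical.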

To calculate a free energy of binding, we subtract an electrostatic and a non-polar solvation energy from the interaction energy. The electrostatic component of ligand solvation is calculated with the continuum electrostatic method of Rashin [27,43], implemented in the program HYDREN. The non-polar solvation energy is also calculated by HYDREN and is derived from the surface area of the ligand [25,27]. Both electrostatic and non-polar terms are calculated once and stored in a look-up table. The energy score calculated by DOCK becomes:

$$
E_{\mathrm{bind}} = \sum_i \left( q_i P_i + v_i P_v \right) - E_{\mathrm{solv,elec}} - E_{\mathrm{solv,np}} \tag{4}
$$

This energy is used to rank the database molecules in their complexes with the protein.
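The effect of the look-up-table correction on ranking can be sketched as follows. The ligand names and every energy value are invented for illustration and are not results from this study; the sketch assumes that solvation energies are negative (favorable) numbers, so subtracting them penalizes strongly solvated ligands.

```python
def corrected_score(interaction, solv_elec, solv_np):
    """Subtract both ligand solvation terms from the interaction score (form of Equation 4)."""
    return interaction - solv_elec - solv_np

def rank_ligands(interaction_scores, solvation_table=None):
    """Return ligand names ordered from most to least favorable score.

    interaction_scores: {name: interaction energy score}
    solvation_table: optional {name: (electrostatic, non-polar)} look-up table;
                     when omitted, the uncorrected interaction score is used.
    """
    scores = {}
    for name, e_interact in interaction_scores.items():
        if solvation_table is None:
            scores[name] = e_interact
        else:
            e_elec, e_np = solvation_table[name]
            scores[name] = corrected_score(e_interact, e_elec, e_np)
    return sorted(scores, key=scores.get)

# Hypothetical values in kcal/mol, chosen only to show the re-ranking effect.
interaction_scores = {"dianion": -60.0, "monoanion": -45.0, "neutral": -25.0}
solvation_table = {
    "dianion": (-50.0, -2.0),
    "monoanion": (-20.0, -2.0),
    "neutral": (-3.0, -2.0),
}

print(rank_ligands(interaction_scores))
print(rank_ligands(interaction_scores, solvation_table))
```

In this toy case the most highly charged ligand scores best before correction and worst after it, because its large electrostatic solvation energy outweighs its interaction energy.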

Ligand Solvation

Continuum electrostatic methods for calculating molecular electrostatic hydration energies derive from the Born equation:

$$
\Delta G_{\mathrm{solv}} = \frac{q^2}{2r}\left(\frac{1}{D_0} - \frac{1}{D_w}\right) \tag{5}
$$

where $q$ is the charge, $r$ is the radius of the charged group, $D_0$ is the dielectric of the phase to which the ligand is being transferred, and $D_w$ is the dielectric of water. This equation, with several correction factors, has been used to accurately calculate the solvation energies of spherically distributed charges [44]. Recent work has extended the approach to non-spherical charged and polar groups [25,27,45,46].
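A minimal numerical sketch of the uncorrected Born expression is given below. It is not the HYDREN method; it only evaluates the Born form for a single spherical charge moved between two dielectric media. The conversion factor of about 332.06 kcal·Å/(mol·e²), the function name, and the example ion are assumptions introduced here. The sign convention is stated explicitly: the result is the free energy of moving the charge from `d_from` to `d_to`.

```python
COULOMB_KCAL = 332.06  # approx. kcal*Angstrom/(mol*e^2); assumed unit conversion

def born_transfer_energy(charge, radius, d_from, d_to):
    """Born free energy (kcal/mol) of moving a spherical charge between media.

    charge: net charge in units of the electron charge.
    radius: Born radius of the charged group in Angstroms.
    d_from, d_to: dielectric constants of the initial and final phases.
    """
    return 0.5 * COULOMB_KCAL * charge ** 2 / radius * (1.0 / d_to - 1.0 / d_from)

# Hypothetical monovalent ion of radius 2 Angstroms.
hydration = born_transfer_energy(1.0, 2.0, d_from=1.0, d_to=78.0)    # vacuum to water
desolvation = born_transfer_energy(1.0, 2.0, d_from=78.0, d_to=2.0)  # water to low dielectric

print(round(hydration, 2))    # negative: hydration is favorable
print(round(desolvation, 2))  # positive: desolvation costs energy
```

The quadratic dependence on charge is the point relevant to docking: doubling the formal charge quadruples the Born term, so ignoring solvation favors highly charged ligands.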

The method used here [27,43] employs a reaction field approach. The solvent polarization charge induced by a point charge embedded within a sphere of defined dielectric is calculated. The enthalpy is the interaction energy between the induced polarization charge and the original charge of the atom. The equations used have the same form as the Born equation (Equation 5) [43]. A boundary element method is used to define the border between the dielectric of the atom and that of the solvent. Different parts of the solute are represented as continuum dielectrics, or as sets of atoms that are characterized by isotropic atomic polarisabilities. The dielectric boundary is identified with the molecular surface and is determined using the program MS [39]. This surface is divided into discrete smooth boundary elements. The charge distribution of the solute is represented by point charges. Polarization effects are represented by polarization charge densities at dielectric boundaries and by induced dipoles at polarisable atoms. The polarization charge density is assumed to be constant within each boundary element. This enables a system of linear equations to be established that are solved to give the polarization charge densities from which various thermodynamic values can be derived. The method can treat charged or partially charged atoms.

The basic approach was attractive to us because it is a continuum dielectric method that seemed physically consistent with DelPhi, which we use to calculate the receptor potential. The particular implementation in HYDREN was attractive because the code was available to us, the program required few external parameters, and could be applied to the large numbers of disparate compounds found in molecular databases. Several other implementations of continuum solvation approaches have been published [25,26,45,46]. Although they differ in the treatment of the solvent boundary and in several correction terms, each appears to give similar results for similar molecules; presumably any of these methods could be used to calculate solvation energies similar to those used here.

The non-polar component is calculated based on the surface area of the ligand [43]. For non-polar solutes, the hydration enthalpy may be modeled using the size of the cavity created in the solvent to accommodate the ligand, and the surface area of the ligand in the solvent. For most ligands the cavitation term scales as the square of the size of the ligand, i.e., with the surface area [43]. Thus the overall energy of hydration of a non-polar ligand can be related to the surface area of that ligand. If one can split the hydration of a ligand into a polar and a non-polar term, then for all ligands the non-polar component of hydration should scale linearly with the surface area of the ligand. Rashin and co-workers [43] have used this relation to calculate non-polar components of solvation for ligands of arbitrary shape by fitting experimental hydration enthalpies for non-polar ligands to their calculated surface areas. The non-polar component of solvation enthalpy is calculated by HYDREN [43] as:

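$$
\Delta H_{\mathrm{np}} = 8638.59 - 96.518 \times \mathrm{area}, \qquad \mathrm{area} < 131.11\ \text{Å}^2
$$

$$
\Delta H_{\mathrm{np}} = -621.48 - 25.890 \times \mathrm{area}, \qquad \mathrm{area} > 131.11\ \text{Å}^2
$$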

where $\Delta H_{\mathrm{np}}$ is the non-polar hydration enthalpy in cal/mol of a molecule with the cavity surface area in Å². Here too, similar corrections and methods for calculating them have been proposed by other authors [25]; we presume that these methods would do as well as the one used here.
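The two-branch surface-area relation can be sketched as a piecewise-linear function. The coefficients and the 131.11 Å² breakpoint are those quoted above; the signs and the direction of the inequalities are an assumption, chosen so that the two branches meet at the breakpoint and the enthalpy becomes more negative as the area grows. The function name is hypothetical, and the breakpoint itself is assigned to the second branch arbitrarily.

```python
BREAKPOINT_A2 = 131.11  # cavity surface area (square Angstroms) where the branches meet

def nonpolar_hydration_enthalpy(area):
    """Non-polar hydration enthalpy in cal/mol for a cavity surface area in square Angstroms.

    Signs and branch assignment are assumed (see the accompanying text).
    """
    if area < BREAKPOINT_A2:
        return 8638.59 - 96.518 * area
    return -621.48 - 25.890 * area

# The two branches agree at the breakpoint to within rounding of the coefficients.
left = 8638.59 - 96.518 * BREAKPOINT_A2
right = nonpolar_hydration_enthalpy(BREAKPOINT_A2)
print(round(left, 1), round(right, 1))

# Example: a hypothetical ligand with a 200 square-Angstrom cavity surface.
print(round(nonpolar_hydration_enthalpy(200.0) / 1000.0, 2), "kcal/mol")
```

Under this sign assumption, subtracting the term in Equation 4 adds a penalty that grows with ligand size, which is consistent with the stated aim of discriminating against ligands that are too large.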

Parameters Used in All Docking Calculations

All calculations were performed with DOCK3.5, modified to read in and correct for the pre-calculated electrostatic and van der Waals components of solvation. The polar and non-polar close contact limits used in the steric grids were 2.4 and 2.8 Å [35]. The AMBER united atom charge set, distributed with DelPhi, was used for all receptor electrostatic calculations. All heteroatom hydrogens used in the DelPhi calculations were placed using the EDIT program in AMBER, unless otherwise noted. CHEMGRID was used to calculate a van der Waals potential for the enzymes using standard parameters [36]. Chemical labeling was used [47] in the matching calculation. This involves labeling site positions or atoms by chemical properties to speed the docking calculation. Here, five labels were employed: positive, negative, hydrogen-bond donor, hydrogen-bond acceptor, and neutral. Except in the DHFR calculation, where one bound water was included, we did not include structural water molecules or counter-ions in either the electrostatic or steric calculations of enzyme potential grids.

We used a dielectric of 2 for the protein interior [26] and 78 [27] for the water phase in the DelPhi calculations. The internal and external dielectrics in the hydration calculations were also set to 2 and 78 [27]. In the DelPhi calculation the probe size was set to 1.4 Å; in the HYDREN calculations the probe radius was 0.8 Å [27]. Atomic van der Waals radii for the protein and the ligand were taken from Rashin [27]. In the DelPhi calculation, the ionic exclusion radius was set to 2 Å and the ionic molarity was set to 0.1 M. The proper values of ligand and protein dielectrics, probe, van der Waals, and ionic radii are active areas of research; we have not tried to optimize these terms.

In the receptor potential calculation three-step focusing [41] was used with protein containment iteratively set to 20, 60, and 90 percent within a 65³ Å³ lattice. In the HYDREN calculations the maximum number of iterations was 10, and convergence was set to 0.001, with low and high density surface numbers set to 2 and 10. Density numbers were automatically set lower for large ligands that exceeded array bounds when running the programs.

To consider the possibility that using a higher dielectric constant might obviate the need for a formal solvation correction term, we performed a calculation with TS that used a dielectric constant of 20 instead of 2. In this calculation, no solvation correction term was applied. We also conducted calculations on the ligand-bound and unbound conformations of TS and DHFR, to investigate the effect that conformational change would have on the magnitude of the interaction energies.

Except where noted, all database searches used the same 153,536 compound subset of the 1995/2 release of the ACD [48]. These molecules were selected based on our ability to calculate partial atomic charges [36], and included most of the molecules in the ACD-3D.

Fig. 1. The variation of DOCK rank with ligand charge for TS. The top ranking 400 molecules out of 153,536 screened by DOCK are shown. The arrows mark the charge state of known nucleotide ligands of TS. a. Solvation uncorrected search. b. Solvation corrected search. c. Solvation uncorrected search using a protein dielectric of 20 instead of 2.

Fig. 2. The docked orientation of thymidine monophosphate in the molecular surface of the TS. As seen in crystallographic complexes with nucleotides and TS, Tyr261 makes a hydrogen bond with the O3' hydroxyl of the ligand (2.8 Å), and Arg178' makes a hydrogen bond with a phosphate oxygen (2.6 Å). The conformation of the pyrimidine moiety differs from crystallographic complexes of nucleotides with TS owing to the lack of ligand flexibility in the current docking algorithm.

TABLE I. Number of Known Ligands in the Top 400 Ranked Molecules, Out of 153,536 Molecules Searched, With and Without Solvation

Enzyme   Known ligands in top 400                 Rank of characteristic ligands
         With solvation    Without solvation      With solvation    Without solvation
         correction        correction             correction        correction
TS       21                11                     7^b               285^b
                                                  144^c             575^c
L99A     28 (74)^a         0 (39)^a               29^d              175^d
                                                  141^e             459^e
DHFR     13 (37)           3 (7)^a                2^f               48^f
                                                  172^g             2047^g

^a Number of molecules that correspond to known ligands (close analogs for which binding data could not be found).
^b Pyridoxal phosphate.
^c dUMP.
^d Indene.
^e Toluene.
^f 2,4-Diaminopteridine.
^g 2,4-Diamino-6,7-dimethylpteridine.

TS Docking Calculations

We used the structure of TS from Lactobacillus casei determined in the presence of phosphate [49]. The phosphate was deleted from the site to allow database molecules to fit. To allow for the close approach of nucleotides to the catalytic Cys198, the Sγ of this residue was deleted from the structure used to calculate potential energy maps. Docking spheres were generated as described [35]. We used a node limit of five, bin sizes and overlaps of 0.2 Å for the receptor, and a bin size of 0.2 Å and an overlap of 0.1 Å for the ligand. The distance tolerance for ligand atom-receptor sphere matching (dislim) was 1.5 Å. Thirty steps of rigid body minimization were conducted for the docked molecules [50].

To compare the effects of enzyme conformational change on the calculated interaction energies, we also performed docking calculations on the TS from Escherichia coli in its unbound conformation (in the absence of nucleotide, PDB accession number 3tms) and in its bound, ternary form (in the presence of dUMP and a folate inhibitor, PDB accession number 1syn). The coordinates of dUMP from 1syn were used as matching spheres in both searches; for the calculation against 3tms this was achieved by RMS-fitting the Cα coordinates of 3tms onto the 1syn structure. All other parameters were as described above.

DHFR Docking Calculations

The structure of DHFR from E. coli bound to methotrexate was used (PDB [51] structure 3dfr [52]). The methotrexate was deleted from the structure to allow database molecules to fit into the site. The locations of methotrexate atoms and those of folate fit into the site from a folate complex with DHFR were used as proxies for receptor sphere centers [47]. The docking calculation used bin sizes of 0.5 and 0.5 Å for the ligand and the receptor, with no overlaps and a distance tolerance (dislim) of 1.0 Å. We included the structurally conserved water 253 from the 3dfr structure as part of the protein for the DISTMAP and DelPhi calculations [47]. Also, the Oγ hydrogen dihedral angle of Thr116, which is set to a default value of 180 degrees by AMBER, was rotated to 120 degrees, allowing this residue to better complement pteridine ring nitrogens [47]. No heavy atoms were moved in making this change. Five hundred steps of rigid body minimization were conducted for the docked molecules [50].

To compare the effects of enzyme conformational change on the calculated interaction energies, we also performed docking calculations on DHFR in its unbound conformation, in the absence of methotrexate (PDB accession number 6dfr). This calculation proceeded as described above.

T4 Lysozyme Docking Calculations

We used the benzene-bound complex structure of L99A (PDB code 181L). The ligand was deleted from the structure to allow database molecules to fit into the site. The atomic positions of other known ligands for the L99A binding site, determined in complex with lysozyme by X-ray crystallography, were fit into this structure by overlapping common receptor atoms. Ligand atoms that did not make steric contacts with L99A in its conformation bound to benzene were used as proxies for receptor sphere centers. Several spheres from the SPHGEN program were also used; in total, 39 potential atom sites were used for these calculations. A distance tolerance (dislim) for matching of 0.75 Å was used. Bin sizes and overlaps were set at 0.2 for both ligand and receptor. One hundred steps of rigid body minimization were conducted for the docked molecules.50
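The role of the distance tolerance (dislim) can be illustrated with a minimal Python sketch. It assumes, as a simplification of the sphere matching in DOCK, that a set of ligand-atom/sphere pairings is accepted only if every ligand interatomic distance agrees with the corresponding sphere-sphere distance to within the tolerance. The function name and all coordinates are hypothetical, and the binning of distances is omitted.

```python
import math
from itertools import combinations

def is_compatible_match(ligand_atoms, sphere_centers, pairs, dislim):
    """Return True if all pairwise distances agree within `dislim` (in Å).

    `pairs` is a list of (ligand_index, sphere_index) pairings.
    Hypothetical helper: a simplified stand-in for a sphere-matching test.
    """
    for (l1, s1), (l2, s2) in combinations(pairs, 2):
        d_ligand = math.dist(ligand_atoms[l1], ligand_atoms[l2])
        d_sphere = math.dist(sphere_centers[s1], sphere_centers[s2])
        if abs(d_ligand - d_sphere) > dislim:
            return False
    return True

# Illustrative coordinates (Å); not taken from any crystal structure.
ligand = [(0.0, 0.0, 0.0), (1.4, 0.0, 0.0), (0.0, 2.0, 0.0)]
spheres = [(5.0, 5.0, 5.0), (6.9, 5.0, 5.0), (5.0, 7.3, 5.0)]
pairs = [(0, 0), (1, 1), (2, 2)]

# Tolerance quoted above for the L99A search.
accepted_l99a = is_compatible_match(ligand, spheres, pairs, dislim=0.75)
# A hypothetical, stricter tolerance rejects the same pairing.
accepted_strict = is_compatible_match(ligand, spheres, pairs, dislim=0.25)
```

In this toy case the largest distance mismatch is about 0.54 Å, so the pairing passes at 0.75 Å but fails at 0.25 Å; a looser tolerance admits more candidate orientations at the cost of more scoring work.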

Fig. 3. The variation of DOCK search rank with ligand charge for DHFR. The top ranking 400 molecules out of 153,536 screened by DOCK are shown. The arrow marks the charge state of known ligands for this site. a. Solvation uncorrected search. b. Solvation corrected search.

TABLE II. Energy Scores of Known Inhibitors from Docking Searches Against the Bound and Free Conformations of TS and DHFR

Ligand                 Enzyme   Energy in bound conformation   Energy in unbound conformation
                                of enzyme (kcal/mol)           of enzyme (kcal/mol)
dUMP                   TS       −97                            −8.4
Pyridoxal phosphate    TS       −118                           −16.8
2,4-Diaminopteridine   DHFR     −26                            −7.6
6-Methylpterin         DHFR     −18.7                          −11.9


RESULTS

Database searches were run against TS, DHFR, and T4 lysozyme with and without ligand solvation correction. Each database molecule was fit by DOCK into the binding site defined by the spheres or pseudo-atoms. Orientations were evaluated for electrostatic and van der Waals fit.36 In solvation corrected calculations, the electrostatic and non-polar components of solvation were subtracted from the electrostatic and van der Waals interaction energies to get a binding energy score. These solvation energy terms were pre-calculated for each molecule before the docking calculation, and were stored in a look-up table.
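The scoring step described above can be sketched in a few lines of Python. This is an illustration only, not the DOCK implementation: the molecule names, energies, and the `Solvation` container are hypothetical, and favorable energies are assumed to be negative.

```python
from dataclasses import dataclass

@dataclass
class Solvation:
    electrostatic: float  # kcal/mol, pre-calculated once per molecule
    nonpolar: float       # kcal/mol, pre-calculated once per molecule

def corrected_score(e_elec, e_vdw, solv):
    """Interaction energy minus the ligand's full solvation energy."""
    return (e_elec + e_vdw) - (solv.electrostatic + solv.nonpolar)

# Hypothetical look-up table keyed by molecule identifier.
solvation_table = {
    "mol_A": Solvation(electrostatic=-250.0, nonpolar=-3.0),  # highly charged
    "mol_B": Solvation(electrostatic=-60.0, nonpolar=-2.0),   # modestly charged
}

# Hypothetical (electrostatic, van der Waals) interaction energies from docking.
interaction = {"mol_A": (-280.0, -20.0), "mol_B": (-120.0, -18.0)}

uncorrected = {m: e + v for m, (e, v) in interaction.items()}
corrected = {
    m: corrected_score(e, v, solvation_table[m])
    for m, (e, v) in interaction.items()
}

# The most favorable (most negative) score ranks first.
best_uncorrected = min(uncorrected, key=uncorrected.get)
best_corrected = min(corrected, key=corrected.get)
```

With these made-up numbers the highly charged molecule ranks first before correction and second after it. Because the solvation terms depend only on the ligand, the look-up costs one subtraction per scored orientation.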

Docking Screens Against TS

The molecules of the ACD were screened for complementarity to the nucleotide-binding site of TS. This site binds pyrimidine monophosphates and monophosphate esters with Kd values in the 0.1–100 µM range.53 Several other monophosphates also inhibit the enzyme, including pyridoxal phosphate54 and phenolphthalein monophosphate (BKS, unpublished results). The enzyme does not bind nucleotide di- or triphosphates. When solvation was not considered, the compounds with the best interaction energies had high net negative formal charges, ranging from −2 to −12, with the majority having a formal charge of −4 (Figure 1a). This calculation was repeated but the cost of desolvating each ligand was now considered. This resulted in top ranking compounds with net formal charges of −2 or −1 (Figure 1b). When solvation correction was used in the docking calculation, the ranks of known ligands were improved compared to when solvation correction was not applied (Table I, Figure 2). For example, in the solvation corrected search, pyridoxal phosphate ranked 7th and dUMP ranked 144th out of 153,536 molecules searched. When solvation was not corrected for, these molecules ranked 255th and 562nd, respectively. When solvation was not considered, nucleotide di- and triphosphates scored well and ranked highly in the dock calculation. This reduced the relative ranks of the nucleotide monophosphates. For example, thymidine 5'-triphosphate was ranked second in the solvation uncorrected calculation but 82,570th in the solvation corrected calculation. All of the di- and triphosphates scored poorly in the solvation corrected calculation, as is appropriate.

To consider the possibility that using a higher dielectric constant might obviate the need for a formal solvation correction term, we performed a calculation with TS that used a protein dielectric constant of 20 instead of 2 (Figure 1c); the water dielectric was kept at 78. In this calculation, no solvation correction term was applied. The higher dielectric constant resulted in high-ranking ligands with reduced energy scores. The rankings of compounds resembled those from the calculation performed with a dielectric of 2 and no solvation. Compounds with high formal charges were ranked better than known ligands that had lower formal charges.

Fig. 4. The docked geometry of 2,4-diaminopteridine (carbons magenta, nitrogen green) overlaid upon the crystallographic configuration of methotrexate (red) in the molecular surface of the DHFR binding site. The surface of the site has been z-clipped to show the ligands. Asp26, Leu4, and Ala97 are shown making hydrogen bond interactions with 2,4-diaminopteridine.

To investigate the effect of conformational change on the magnitude of the interaction energies, we also conducted calculations against a ligand-bound form of TS and a ligand unbound form of the enzyme. Both calculations were solvation corrected. For easy comparison we chose the TS from E. coli bound to dUMP and a folate inhibitor (PDB code 1syn) and the same enzyme unbound to any ligand except for phosphate ion; the enzyme has been determined to good resolution in both forms. Known monophosphate inhibitors ranked well in both searches, but their energies of interaction were much reduced in the screen against the unbound receptor (Table II). Fewer known inhibitors were found in the unbound conformation of TS than in the bound conformation of TS.

Docking Screens Against DHFR

The molecules of the ACD database were screened for complementarity to the folate-binding site of DHFR. This site is known to bind neutral and charged diaminopteridines and diaminoquinazolines with dissociation constants that vary from 10 mM to less than 1 nM.55 When the docking calculations did not correct for ligand solvation, the top ranking ligands bore formal charges ranging from 1 to 4 (Figure 3a). This calculation was repeated but the electrostatic and van der Waals solvation terms for each ligand were now considered (Figure 3b). The best ranking compounds in this second search were either neutral or bore a charge of 1, consistent with the known ligand binding data.55 When ligand solvation was considered in the docking calculations, thirteen of the top 400 molecules corresponded to known pteridine ligands (Table I, Figure 4). Among these thirteen, both neutral and positive species were found, as appropriate for this site. When ligand solvation was not corrected for, only 3 known ligands were found among the top 400 molecules, and each of these was charged. Analogs of known ligands were also found for this site. These included molecules such as 2,4-diamino-6-hydroxymethylpteridine, which is a close analog of the known ligand 2,4-diamino-6-formylpteridine, but for which binding data could not be found. More of these "reasonable" analogs were found for the docking calculation that corrected for ligand solvation than for the docking calculation that ignored ligand solvation.

To investigate the effect of conformational change on the magnitude of the interaction energies, we conducted calculations against a ligand-unbound conformation of DHFR. This calculation was solvation corrected. As in the TS search, known inhibitors still ranked well against the unbound conformation of the receptor, but their energies of interaction were much reduced (Table II). Fewer known inhibitors were found in the unbound conformation of DHFR.

Docking Screens Against L99A

Fig. 5. The variation of DOCK search rank with ligand charge for the core binding site of the mutant lysozyme L99A. The top ranking 400 molecules out of 153,536 screened by DOCK are shown. The arrow marks the charge state of known ligands for this site. a. Solvation uncorrected search. b. Solvation corrected search.

A docking search was conducted for the cavity binding site of the L99A mutant of T4 lysozyme. The site was created by replacing a leucine in the core of the enzyme with an alanine.56 This resulted in a cavity in the enzyme that was buried from water. The cavity binds small, neutral ligands with affinities that range from 10 to 1,000 µM;57 charged molecules, or overly polar molecules, have not been observed to bind to this site. The molecules of the ACD database were docked into the cavity without correcting for solvation. The best ranking compounds bore formal net charges of 1 (Figure 5a). This calculation was repeated but ligand solvation was now considered (Figure 5b). The best ranking compounds in this second search were neutral, consistent with the known ligand binding data.57 When solvation correction was used, 28 molecules of the top 400 corresponded to known ligands (Table I, Figure 6). If we include close analogs of known ligands that have not been tested for binding to date, this number rises to 102 molecules in the top 400. For instance, DOCK finds that chlorobenzene fits well into the site. It is not known whether this molecule actually binds to L99A, but both iodobenzene and fluorobenzene bind, and it seemed reasonable to include chlorobenzene among the 74 molecules that are listed as close analogs of known ligands. When solvation correction is not used, the number of ligands in the top 400 drops to 0, or 39 if analogs of known ligands are included. Without solvation correction, the ranks of the known ligands are reduced (Table I). For example, indene, which binds with a Kd of 190 µM, was ranked 29th in the solvation corrected calculation, but ranked only 1482nd in the solvation uncorrected calculation. Similarly, toluene, which binds with a Kd of 100 µM, was ranked 141st in the solvation corrected screen but ranked only 3,806th in the screen where solvation was not considered. When ligand solvation was neglected, many of the known ligands were replaced in the top ranking 400 molecules with charged or polar isosteres.
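The dissociation constants quoted for indene and toluene can be put on an energy scale with the standard relation ΔG = RT ln Kd (1 M reference state). The sketch below is a worked example only; the temperature of 298.15 K is an assumption, since the experimental temperature is not stated in this section.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def binding_free_energy(kd_molar, temperature_k=298.15):
    """Standard binding free energy (kcal/mol) from a dissociation constant in mol/L."""
    return R_KCAL * temperature_k * math.log(kd_molar)

dg_indene = binding_free_energy(190e-6)   # Kd = 190 µM
dg_toluene = binding_free_energy(100e-6)  # Kd = 100 µM
```

Under this assumption the two ligands come out at roughly −5.1 and −5.5 kcal/mol, so the whole 10 to 1,000 µM affinity range of the cavity spans less than 3 kcal/mol. Errors of a few kcal/mol in a docking score are therefore large on the scale that separates these ligands.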

To address the effect of non-polar solvation on the docking calculations, the molecules of the ACD were docked into DHFR, with and without the non-polar solvation term (Eq. 4). When non-polar solvation was not considered (Figure 7a), the 400 most complementary molecules in the database were typically larger than when non-polar solvation was considered (Figure 7b). When non-polar solvation was neglected, the top ranking molecules included portions that had few or no contacts with the enzyme, leaving them exposed to solvent (Figure 8). When non-polar solvation was included, the best ranking molecules were more completely surrounded by the enzyme, with few moieties exposed to solvent.

DISCUSSION

Including ligand solvation in molecular docking dramatically changes the relative ranking of compounds in database screens. Since it is this effect that will most interest the general reader, we will consider it first. We will then discuss the absolute energies that are returned by the docking calculations, and several of the approximations that we have made in this implementation of ligand solvation.

Ligand Rankings

When ligand solvation is not considered in molecular docking, there is no penalty for placing a charged ligand atom in a region where the receptor potential only weakly complements it. In this situation, a highly charged molecule will receive a better interaction energy than a true ligand. The true ligand, bearing less formal charge, will have a less favorable interaction energy with the receptor potential. This result does not depend on the dielectric properties of the enzyme and binding site (e.g., Figure 1c); dampening the electrostatic interaction energy does not obviate the need to consider ligand desolvation. Neglecting the desolvation of highly charged molecules in database searches results in reduced relative ranks for true ligands.

Fig. 6. The docked geometry of indole (magenta), from the database screen, overlaid upon the crystallographic configuration of indole (red) in the molecular surface39 of the L99A binding site. In both structures the nitrogen is colored green. The surface of the site has been z-clipped to show the ligands. Met102, which makes a hydrogen-bond interaction with the indole nitrogen, is shown at right.

Considering the cost for electrostatic desolvation lowers the scores of highly charged molecules relative to the known inhibitors. When a charged molecule moves from water to a binding site it exchanges a high dielectric for a low dielectric environment. This increases its self-energy. In contrast to the database screens that did not correct for ligand solvation (Figures 1a, 3a, and 5a), most of the top scoring ligands in the solvation corrected screens have the correct charge state (Figures 1b, 3b and 5b). By considering the cost of moving a charged species from a high to a low dielectric environment, the bias toward molecules bearing high charge is eliminated.
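The charge dependence of this desolvation cost can be illustrated with the classical Born model, in which the self-energy of an ion scales with the square of its charge. The sketch below is not the modified Born treatment used in this work: it treats the ligand as a single sphere with a hypothetical effective radius and compares transfer from water (dielectric 78) into media with dielectric constants of 2 and 20.

```python
COULOMB_KCAL = 332.06  # Coulomb constant in kcal*Å/(mol*e^2)

def born_transfer_energy(charge, radius, eps_from=78.0, eps_to=2.0):
    """Born self-energy change (kcal/mol) on moving a spherical ion of
    `charge` (e) and `radius` (Å) from a medium of dielectric `eps_from`
    to one of dielectric `eps_to`."""
    prefactor = COULOMB_KCAL * charge**2 / (2.0 * radius)
    return prefactor * (1.0 / eps_to - 1.0 / eps_from)

radius = 3.0  # Å, hypothetical effective radius
for q in (-1, -2, -4):
    cost_eps2 = born_transfer_energy(q, radius, eps_to=2.0)
    cost_eps20 = born_transfer_energy(q, radius, eps_to=20.0)
    print(f"q={q:+d}: eps=2 cost {cost_eps2:7.1f}, eps=20 cost {cost_eps20:6.1f} kcal/mol")
```

In this simple model the penalty is always positive when the destination dielectric is lower than that of water, and doubling the charge quadruples it. A higher protein dielectric shrinks the penalty but does not remove its quadratic growth with charge.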

When non-polar solvation is not considered, the top scoring molecules are typically larger than they should be. In the docked complexes, these molecules often have fragments that are poorly complemented by the binding site (Figure 8). Here again, no cost is assessed for removing the compound from the solvent, and so even loose contacts make for a better interaction energy. This biases the calculation toward larger molecules. When non-polar solvation is considered, molecules that make few favorable interactions with the enzyme are disfavored relative to molecules that are well complemented by the binding site. The non-polar solvation term acts as a balance to the van der Waals term in the interaction energy, leading to complexes with a higher proportion of interacting surfaces. The molecules found in such complexes more often correspond to known ligands than when non-polar solvation is ignored.
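The balance between the van der Waals term and the non-polar solvation term can be shown with a toy calculation. Everything here is an assumption made for illustration: the non-polar solvation energy is taken to be favorable and proportional to ligand surface area, the coefficient `GAMMA` is hypothetical rather than the fitted parameter used in this work, and the two ligands are invented.

```python
GAMMA = 0.03  # kcal/(mol*Å^2), hypothetical surface-area coefficient

def nonpolar_solvation(surface_area):
    """Assumed non-polar solvation energy (kcal/mol), proportional to surface area (Å^2)."""
    return -GAMMA * surface_area

def score(e_vdw, surface_area, correct=True):
    """van der Waals interaction energy, optionally minus non-polar solvation."""
    penalty = nonpolar_solvation(surface_area) if correct else 0.0
    return e_vdw - penalty

# Hypothetical ligands: (van der Waals energy in kcal/mol, surface area in Å^2).
compact = (-25.0, 250.0)  # well buried, most of its surface in contact
large = (-30.0, 500.0)    # extra fragments that make only loose contacts

prefers_large_uncorrected = score(*large, correct=False) < score(*compact, correct=False)
prefers_compact_corrected = score(*compact) < score(*large)
```

Without the correction the larger molecule wins on its few extra contacts; with it, the added surface costs more than the loose contacts return, and the compact ligand scores better.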

In summary, ignoring ligand solvation in screens of diverse molecular databases leads to pathologies in docking calculations. Neglecting the electrostatic component of ligand solvation energy results in compounds with high formal charges that rank higher than the known inhibitors for these enzymes (Figures 1, 3, and 5). Neglecting the non-polar component of ligand solvation biases the results towards larger compounds that, overall, complement the binding site worse than the known, smaller ligands.

Energies

In the scoring function used here (Eq. 4), a van der Waals energy derived from the AMBER potential36 is added to an electrostatic energy calculated using the DelPhi potential.36 Ligand solvation is subtracted using a Born-derived energy and a surface area-derived non-polar term. The AMBER and DelPhi potentials were derived independently, as were the HYDREN solvation terms (although the HYDREN and DelPhi energies are based on the same physical considerations). Though each method attempts to model physical interactions from first-principles, it is easy to imagine how the individual energy terms might not have the same magnitudes. Since we have made no effort to weight any term in Eq. 4, it is appropriate to ask how well the terms balance in the final binding energy, and how well the predicted energies, as opposed to the rankings, correspond with experiment.

In docking screens against the bound conformation of TS, typical interaction energies, after correcting for solvation, for high-scoring ligands were on the order of −100 kcal/mol (−97 kcal/mol for dUMP). In docking screens against the bound conformation of DHFR, typical energies for high-scoring ligands were on the order of −25 kcal/mol (−26 kcal/mol for 2,4-diaminopteridine). The magnitudes of these energies are unreasonably high. Partly this owes to inaccuracies in the scoring scheme, to which we shall return, and the inherent problem of subtracting two large numbers to get a small one. Partly, the high magnitudes of the interaction energies arise because we are not considering the cost of desolvating the enzymes prior to ligand binding. In effect, we are docking to naked binding sites that have been stripped of their solvating waters. Also, we are docking to enzymes that are in their ligand-bound conformations, without considering the energy cost of changing conformations from the unbound to the bound forms.

By itself, docking to the bound, pre-organized conformation of the enzymes can have a large effect on calculated interaction energies. When the docking calculations were repeated against the unbound conformations of TS and DHFR, the energy scores of the high-ranking ligands were globally reduced and those of known inhibitors moved much closer to their true binding energies (Table II). In the bound conformations of the enzyme active sites, residues are in ideal positions to complement the ligand. In the unbound conformations of the enzymes several residues relax away from their ligand-bound conformations. For instance, in TS two of the four arginines that interact with the nucleotide phosphate in the ternary form of the enzyme have moved out of the phosphate-binding site in the unbound form. This leads to lower interaction energies with phosphate groups in the docking calculations. It also leads to some known ligands not being identified. In the calculation against the unbound conformation of DHFR, the known ligand 2,4-diamino-6,7-dimethylpteridine has a positive (unfavorable) binding energy score whereas in the bound conformation calculation this ligand was highly ranked with a favorable energy. We note that efforts to parameterize empirical scoring schemes for docking calculations sometimes use experimental structures of ligand-enzyme complexes, which are of course in the bound conformations of the enzymes. This may lead to biases in the parameter sets.

Fig. 7. The effect of correcting for non-polar solvation in a screen of the ACD against DHFR. The distributions of number of atoms within the 400 top scoring molecules out of 153,536 screened are shown. a. Without considering non-polar solvation. b. Including non-polar solvation. The number of atoms reported on the x-axis excludes hydrogens.

Compared to the DHFR and TS sites, the L99A binding site allows us to consider the magnitude of errors of our energy terms with less ambiguity. This site is completely buried from solvent, and no ordered waters have been observed in this site by X-ray crystallography. There is little conformational adaptation by the site to the binding of small ligands such as benzene (for larger alkyl benzenes some accommodation is observed).57 The L99A binding site is thus in many ways a naked binding site. The energies that we calculate for this site are more likely to reflect affinities and not only rankings.

Fig. 8. The effect of non-polar ligand solvation on the size of high-scoring ligands. 4,4'-sulfonyl-bis(N-(5-indanylmethylene)aniline) (dark gray) is shown bound to the molecular surface of DHFR. This molecule ranked 239th when non-polar solvation was not used, and 10,910th when non-polar solvation was used. All molecular graphics were rendered with neon in MidasPlus;61 all molecular surfaces were calculated with MS.39

The DOCK energies for known ligands are four to five kcal/mol greater in magnitude than those determined experimentally (Table III). There are several possible explanations for this difference. The calculations do not correct for lost degrees of rotational and translational freedom on binding, nor do they consider gains in vibrational entropy of the system on ligand binding; the calculations ignore hydrophobicity. We have not investigated how these terms add up. It is clear that we are not penalizing for the desolvation of hydrogen bonding groups adequately. For instance, the calculated electrostatic component of desolvating phenol is 0.84 kcal/mol, whereas that for desolvating the isosteric toluene is 0.13 kcal/mol. Apart from absolute errors, the difference in these energies inadequately accounts for the cost of desolvating the polar hydroxyl of phenol. This allows phenol to achieve a DOCK energy score of −9.4 kcal/mol, when in fact this ligand was not observed to bind to the cavity.57
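The size of the discrepancy can be checked directly from the Table III values. The short Python calculation below takes the solvation corrected DOCK energies and the experimental energies, with favorable energies written as negative numbers (an assumption about sign convention), and reports by how much each DOCK score exceeds experiment in magnitude.

```python
# (solvation corrected DOCK energy, experimental energy) in kcal/mol, from Table III.
table_iii = {
    "indene":   (-10.5, -5.13),
    "o-xylene": (-9.7, -4.6),
    "indole":   (-9.5, -4.89),
    "toluene":  (-9.3, -5.52),
    "benzene":  (-8.7, -5.19),
}

# Excess magnitude of the DOCK score over the experimental binding energy.
overestimate = {
    name: abs(dock) - abs(experiment)
    for name, (dock, experiment) in table_iii.items()
}
mean_overestimate = sum(overestimate.values()) / len(overestimate)

for name, excess in overestimate.items():
    print(f"{name:9s} {excess:5.2f} kcal/mol")
print(f"mean      {mean_overestimate:5.2f} kcal/mol")
```

The individual differences run from about 3.5 kcal/mol for benzene to about 5.4 kcal/mol for indene, with a mean near 4.5 kcal/mol.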

Several algorithm changes might improve performance. Our failure to adequately penalize neutral polarity may stem from the use of an inductive method for calculating partial atomic charges.42 Using quantum mechanically-derived partial atomic charges may improve matters.27 Until recently this has been unfeasible for the large database of ligands that we wished to consider, but improvements in hardware and software will make this more practical in the future. Another gross approximation in the method is the subtraction of the full solvation energy of the ligand from each orientation's interaction energy (Eq. 2). This over-penalizes ligands, since even fully buried polar groups retain some interaction with solvent.26 Calculating desolvation penalties that reflected the degree of burial for each orientation of each ligand would improve matters. Of course, this might be computationally expensive. Our reliance on fixed ligand conformations is another source of error in this work; results improve when we allow for ligand conformational flexibility.58
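One way to picture an orientation-dependent penalty is to scale the ligand solvation energy by the fraction of it that is lost on binding. The sketch below is purely illustrative and is not a method from this work: the linear scaling, the `buried_fraction` estimate, and the energies are all hypothetical.

```python
def desolvation_penalty(solvation_energy, buried_fraction):
    """Portion of the ligand solvation energy (kcal/mol) lost on binding.

    `buried_fraction` in [0, 1] is a hypothetical per-orientation estimate,
    for example the fraction of ligand surface occluded by the receptor.
    """
    if not 0.0 <= buried_fraction <= 1.0:
        raise ValueError("buried_fraction must lie between 0 and 1")
    return buried_fraction * solvation_energy

def score(interaction_energy, solvation_energy, buried_fraction=1.0):
    """Interaction energy minus the (possibly partial) desolvation term."""
    return interaction_energy - desolvation_penalty(solvation_energy, buried_fraction)

# Hypothetical orientation: interaction -30 kcal/mol, ligand solvation -20 kcal/mol.
full = score(-30.0, -20.0)                          # full subtraction of solvation
partial = score(-30.0, -20.0, buried_fraction=0.6)  # partially buried orientation
```

With full subtraction the orientation scores −10 kcal/mol; charging only 60% of the solvation energy gives −18 kcal/mol. The extra cost of such a scheme lies in estimating the buried fraction for every orientation, which a per-molecule look-up table cannot provide.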

Even with such changes our energy calculations will remain very approximate. Calculating the absolute magnitude of interaction energies remains an unsolved problem for screens against large databases of diverse molecules, such as the ACD, due to the many degrees of geometrical and chemical freedom and to inaccuracies in the force-fields.13 It may be interesting to explore more computationally intensive scoring schemes for a small number of docked complexes. Progress has been reported in calculating the absolute magnitudes of binding affinities for ligands whose complex to a receptor has been determined to high resolution.59,60 Although these methods are too computationally intensive to apply to the entire database, they may be useful for detailed calculations against a smaller set of putative ligands after the initial screen. We would be pleased to provide such pre-screened, high-scoring docked ligands, in their complexes with various enzymes, to investigators interested in testing more detailed energy evaluation schemes against the fairly diverse set of ligands returned by structure-based database screens.

Even in the absence of more intensive, detailed energy evaluation schemes, it is clear that fairly simple considerations can dramatically improve our ability to distinguish likely from unlikely ligands. One such is balancing the calculated interaction energy between a ligand and a receptor with a ligand solvation term. If this term is ignored, calculations comparing the binding affinities of dissimilar ligands will be biased towards overly charged and overly large molecules. Correcting for solvation helps us to recognize inhibitors, familiar and yet to be discovered, in database screens of receptors of known structure.

ACKNOWLEDGMENTS

We thank C. Corwin, E. Meng, D. Bodian, and R. Lewis for enlightening conversations, and D. Lorber, B. Beadle, and A. Patera for reading this manuscript. This research was partly supported by the PhRMA Foundation and the Animal Alternatives Program of Procter and Gamble (to BKS), NIH grant GM31497 (to IDK) and the Science and Engineering Research Council (UK) under the NATO postdoctoral fellowship program (to ARL). We thank MDL Inc. (San Leandro, CA) for supplying the Available Chemicals Database.

TABLE III. Comparing Experimental and Computed Energies for Ligands Binding in the L99A Site

Compound    DOCK energy,            DOCK energy,          Experimental energy
            solvation uncorrected   solvation corrected   (kcal/mol)
            (kcal/mol)              (kcal/mol)
Indene      −18.0                   −10.5                 −5.13
o-Xylene    −17.2                   −9.7                  −4.6
Indole      −17.1                   −9.5                  −4.89
Toluene     −17.1                   −9.3                  −5.52
Benzene     −15.0                   −8.7                  −5.19

REFERENCES

1. Boobbyer DN, Goodford PJ, McWhinnie PM, Wade RC. New hydrogen-bond potentials for use in determining energetically favorable binding sites on molecules of known structure. J Med Chem 1989;32:1083–1094.

2. Moon JB, Howe WJ. Computer design of bioactive molecules: A method for receptor-based de novo ligand design. Proteins 1991;11:314–328.

3. Miranker A, Karplus M. Functionality maps of binding sites: A multicopy simultaneous search method. Proteins 1991;11:29–34.

4. Bartlett PA, Shea GT, Telfer ST, Waterman S. CAVEAT: A program to facilitate the structure-derived design of biologically active molecules. In: Roberts SM, editor. Molecular Recognition. London: Royal Society of Chemistry; 1989. p 182–196.

5. DesJarlais RL, Seibel GL, Kuntz ID et al. Structure-based design of nonpeptide inhibitors specific for the human immunodeficiency virus 1 protease. Proc Natl Acad Sci USA 1990;87:6644–6648.

6. Lawrence MC, Davis PC. CLIX: A search algorithm for finding novel ligands capable of binding proteins of known three-dimensional structure. Proteins 1992;12:31–41.

7. Shoichet BK, Perry KM, Santi DV, Stroud RM, Kuntz ID. Structure-based discovery of inhibitors of thymidylate synthase. Science 1993;259:1445–1450.

8. Bodian DL, Yamasaki RB, Buswell RL, Stearns JF, White JM, Kuntz ID. Inhibition of the fusion-inducing conformational change of the influenza hemagglutinin by benzoquinones and hydroquinones. Biochemistry 1993;32:2967–2978.

9. Lybrand TP, Ghose I, McCammon JA. Hydration of chloride and bromide anions - determination of relative free-energy by computer-simulation. J Am Chem Soc 1985;107:7793–7794.

10. Bash PA, Singh UC, Langridge R, Kollman PA. Free energy calculations by computer simulation. Science 1987;236:564–568.

11. Warshel A, Sussman F, Hwang J-K. Evaluation of Catalytic Free Energies in Genetically Modified Proteins. J Mol Biol 1988;201:139–159.

12. Straatsma TP, McCammon JA. Theoretical calculations of relative affinities of binding. Methods Enzymol 1991;202:497–511.

13. Shi YY, Mark AE, Wang CX, Huang F, Berendsen HJ, Gunsteren WFv. Can the stability of protein mutants be predicted by free energy calculations? Protein Eng 1993;6:289–295.

14. Pearlman DA, Connelly PR. Determination of the differential effects of hydrogen bonding and water release on the binding of FK506 to native and Tyr82→Phe82 FKBP-12 proteins using free energy simulations. J Mol Biol 1995;248:696–717.

15. Brooks BR, Bruccoleri RE, Olafson BD, States DJ, Swaminathan S, Karplus M. CHARMM: A program for macromolecular energy, minimization, and dynamics calculations. J Comp Chem 1983;4:187–217.

16. Weiner SJ, Kollman PA, Case DA, Singh UC, Ghio C, Alagona G, Profeta S, Weiner P. A new force field for molecular mechanical simulation of nucleic acids and proteins. J Am Chem Soc 1984;106:765–784.

1 7. G u n s t e r en WF v, B e r e n ds e n H J C . G R O M O S L i br a r y M a n u a l .Gr oningen: B iomos; 1987.

18. Ha nsch C, LeoAJ . Substituent Consta nts for Correlat ion Analysisin Chemistry and Biology. New York: J ohn Wiley & Sons; 1979.339 p.

19. Nozaki Y, Tan ford C. The solubility of amino a cids and tw o glycine

peptides in aq ueous etha nol and dioxane solutions. Esta blishmentof a hy drophobicity sca le. J B iol Chem 1971;246:2211–2217.

20. Wolfenden R. Affinities of amino acid side chain s for solvent w at er.B iochemist ry 1981;20:849–855.

21. Fa uchere J L, Cha rt on M, Kier LB, Verloop A, P liska V. Amino acids i de c ha i n p a r a m e t er s f or c or r e la t i on s t u d ie s i n b iol og y a n dpharmacology. Int J Pept Protein Res 1988;32:269–278.

22. Eisenberg D , McLachlan AD. Solvat ion energy in protein foldingan d binding . Na ture 1986;319:199.

23. Ooi T, Oobata ke M, Nemethy G , Schera ga H. Accessible surfaceareas as a measure of the thermodynamic parameters of hydra-tion of peptides. Proc Natl Acad Sci USA1987;84:3086–3090.

24. Tokarski J S, Hopfing er AJ . P rediction of l igand-receptor bindingthermodynamics by free energy force fi eld (FEFF) 3D-QSARan alysis: a pplication to a set of peptidomimetic renin inhibitors. JChem Inf Comput Sci 1997;37:792–811.

25. Still WC, Tempczyk A, Ha wley RC , Hendr ickson T. ASemia na lyti-cal treat ment of solvation for molecular mecha nics and dyna mics.

J Am Chem Soc 1990;112:6127.26. Gi lson MK, Honig B . The inclusion of electrostat ic hy drat ion

energies in molecular mechanics calculations. J Comput AidedMol Des 1991;5:5–20.

27. Ras hin AA. Hyd ra tion phenomena, classical electrostat ics an d theboundary element method. J P hys Chem 1990;94:1725–1733.

28. Andrews PR, Craik DJ , Ma rt in J L . Funct ional group contribu-tions to d rug-receptor intera ctions. J Med Ch em 1984;27:1648–1657.

29. William s DH, C ox J PL, Doig AJ , Ga rdner M, Gerh ar d U, Ka ye PT,Lal AR, Nicolls IA, Sa lte CJ , Mitchell RC. Towa rds t he semiqua n-t i tat ive est imat ion of binding constan ts. Guides for pept ide-peptide binding in aqueous solution. J Am Chem Soc 1991;113:7020–7030.

30. Head RD , Smyte ML, Oprea TI, Wal ler CL, Green SM, Marsha l lGR. VALIDATE: A new method for the receptor-based predictionof binding affinities of novel l igands. J Am Chem Soc 1996;118:3959–3969.

31. Bohm H-J . The development of a simple empirical scoring func-tion to estimat e the binding consta nt for a protein-ligand complexof known three-dimensional structure. J Comput Aided Mol Des1994;8:243–256.

32. DeWitte RS, Shakhnovich EI. SMoG: de novo design method based on simple, fast, and accurate free energy estimates. 1. Methodology and supporting evidence. J Am Chem Soc 1996;118:11733–11744.

33. Wilson C, Mace JE, Agard DA. A computational method for designing enzymes with altered substrate specificity. J Mol Biol 1991;220:495–506.

34. Wilson C, Gregoret LM, Agard DA. Modeling side-chain conformation for homologous proteins using an energy-based rotamer search. J Mol Biol 1993;229:996–1006.

35. Shoichet B, Bodian DL, Kuntz ID. Molecular docking using sphere descriptors. J Comp Chem 1992;13:380–397.

36. Meng EC, Shoichet B, Kuntz ID. Automated docking with grid-based energy evaluation. J Comp Chem 1992;13:505–524.

37. Ho CMW, Marshall GR. FOUNDATION: A program to retrieve all possible structures containing a user defined minimum number of matching query elements from 3-dimensional databases. J Comput Aided Mol Des 1993;7:3–22.

38. Kuntz ID, Blaney JM, Oatley SJ, Langridge R, Ferrin TE. A geometric approach to macromolecule-ligand interactions. J Mol Biol 1982;161:269–288.

39. Connolly ML. Solvent-accessible surfaces of proteins and nucleic acids. Science 1983;221:709–713.

40. Ewing TJA, Kuntz ID. Critical evaluation of search algorithms for automated molecular docking and database screening. J Comp Chem 1997;18:1175–1189.

41. Gilson MK, Honig BH. Calculation of electrostatic potentials in an enzyme active site. Nature 1987;330:84–86.

42. Gasteiger J, Marsili M. Electrostatic charges. Tetrahedron 1980;36:3219.

43. Rashin AA, Namboodiri K. A simple method for the calculation of hydration enthalpies of polar molecules with arbitrary shapes. J Phys Chem 1987;91:6003–6012.

44. Rashin AA, Honig B. Reevaluation of the Born Model of Ion Hydration. J Phys Chem 1985;89:5588.

45. Jean-Charles A, Nichols A, Sharp K et al. Electrostatic contributions to solvation energies: Comparison of free energy perturbation and continuum calculations. J Am Chem Soc 1991;113:1454–1455.

46. Cramer CJ, Truhlar DG. AM1-SM2 and PM3-SM3 parameterized SCF solvation models for free energies in aqueous solution. J Comput Aided Mol Des 1992;6:629–666.

47. Shoichet BK, Kuntz ID. Matching chemistry and shape in molecular docking. Protein Eng 1993;6:723–732.

48. Guner OF, Hughes DW, Dumont LM. An integrated approach to three dimensional information management with MACCS-3D. J Chem Inf Comput Sci 1991;31:408–414.

49. Finer-Moore J, Fauman EB, Foster PG, Perry KM, Santi DV, Stroud RM. Refined structures of substrate-bound and phosphate-bound thymidylate synthase from Lactobacillus casei. J Mol Biol 1993;232:1101–1116.

50. Meng EC, Gschwend DC, Blaney JM, Kuntz ID. Orientational sampling and rigid-body minimization in molecular docking. Proteins 1993;17:266–278.

51. Bernstein FC, Koetzle TF, Williams GJB et al. The protein data bank: A computer-based archival file for macromolecular structure. J Mol Biol 1977;112:535–542.

52. Bolin JT, Filman DJ, Matthews DA, Hamlin RC, Kraut J. Crystal structures of E. coli and L. casei DHFR refined to 1.7 angstrom resolution. 1. General features and binding of methotrexate. J Biol Chem 1982;257:13650–13662.

53. Santi DV, Danenberg PV. In: Blakely RL, Benkovic SJ, editors. Folates and Pterins. Vol. 1. New York: John Wiley & Sons, Inc.; 1984. p 345–398.

54. Santi DV, Ouyang TM, Tan AK, Gregory DH, Scanlan T, Carreras CW. Interaction of thymidylate synthase with pyridoxal-5’-phosphate as studied by UV/visible difference spectroscopy and molecular modeling. Biochemistry 1993;32:11819–11824.

55. Blaney JM, Hansch C, Silipo C, Vittoria A. Structure-activity relationships of dihydrofolate reductase inhibitors. Chem Rev 1984;84:333–407.

56. Eriksson AE, Baase WA, Wosniak JA, Matthews BW. A cavity-containing mutant of T4 lysozyme is stabilized by buried benzene. Nature 1992;355:371–373.

57. Morton A, Matthews BW. Specificity of ligand binding in a buried nonpolar cavity of T4 lysozyme: Linkage of dynamics and structural plasticity. Biochemistry 1995;34:8576–8588.

58. Lorber DM, Shoichet BK. Flexible ligand docking using conformational ensembles. Protein Sci 1998;7:151–158.

59. Bardi JS, Luque I, Freire E. Structure-based thermodynamic analysis of HIV-1 protease inhibitors. Biochemistry 1997;36:6588–6596.

60. Luque I, Gomez J, Semo N, Freire E. Structure-based thermodynamic design of peptide ligands: Application to peptide inhibitors of the aspartic protease endothiapepsin. Proteins 1998;30:74–85.

61. Ferrin TE, Huang CC, Jarvis LE, Langridge R. The MIDAS Display Program. J Mol Graphics 1988;6:13–27.
