Computational Approaches to Structure Based Ligand Design 1999 Pharmacology & Therapeutics


Pharmacology & Therapeutics 84 (1999) 179–191

0163-7258/99/$ – see front matter © 1999 Elsevier Science Inc. All rights reserved.

PII:S0163-7258(99)00031-5

Associate editor: E. Lolis

Computational approaches to structure-based ligand design

Diane Joseph-McCarthy*

Wyeth Research, Biological Chemistry Department, 87 CambridgePark Drive, Cambridge, MA 02140, USA

Abstract

The first computational structure-based drug design methods came into existence in the early 1980s and are, to an extent, still in their infancy. There have been a few successes to date. With dramatic increases in computer speed, improved accuracy in ligand scoring functions, and the advent of combinatorial chemistry, there promise to be many more. In addition, the virtual explosion in the amount of available sequence and structural information has increased the need to develop these computational techniques to exploit this vast body of information. In this review, recent advances in computational methods for database searching and docking, de novo drug design, and estimation of ligand binding affinities are discussed. © 1999 Elsevier Science Inc. All rights reserved.

Keywords: Computer-aided drug design; De novo design; Database searching; Docking; Virtual combinatorial library screening; Binding affinity prediction

Abbreviations: FEP, free energy perturbation; HIV, human immunodeficiency virus; MC, Monte Carlo; MD, molecular dynamics; QSAR, quantitative

structure-activity relationships; vdW, van der Waals.

Contents

1. Introduction

2. Database searching and docking methods

3. Computational de novo drug design methods

3.1. Fragment positioning methods

3.2. Molecule growth methods

3.3. Fragment methods coupled to database searches

3.4. Virtual library construction and screening

4. Ligand-binding scoring functions

5. Summation and future outlook

Acknowledgments

References

* Tel.: 617-665-8933; fax: 617-665-8993.

E-mail address: [email protected] (D. Joseph-McCarthy)

1. Introduction

Structure-based drug design, or rational drug design, as it

is sometimes called, refers to the intricate process of using

the information contained in the three-dimensional structure

of a macromolecular target and of related ligand-target complexes to design novel drugs for important human diseases.

Computational methods are required to extract all of the rel-

evant information from the available structures and to use it

in an efficient and intelligent manner to design improved

ligands for the target. There are approximately 6000 drugs

on the market today (Comprehensive Medicinal

Chemistry Database, Release 94.1, available from MDL In-

formation Systems, Inc., San Leandro, CA, USA) (Bemis &

Murcko, 1996) for on the order of 500 disease or molecular

targets (Drews, 1996). Due to genome sequencing projects,

the number of known sequences is increasing at a rapid rate

(Andrade & Sander, 1997). New target identification strate-

gies and associated bio-informatic technologies are being

developed to categorize this vast body of information (Col-

lins et al., 1998; Kingsbury, 1997). In particular, many peo-

ple are working on ways to try to predict the three-dimen-

sional structure of a protein from its one-dimensional amino

acid sequence (Dunbrack et al., 1997; Onuchic et al., 1997;

Westhead & Thornton, 1998). There is also a worldwide

effort in functional genomics to determine as many three-

dimensional structures of proteins as possible or to develop

computational approaches to cluster sequences into families

of related proteins and then select and solve the three-

dimensional structure of a representative sequence from each

family (Rost, 1998). As a result, in 10 years' time, there



should be a very large number of good homology models

and known structures for medically relevant targets. The

questions important for drug design then will be: What is

expected to bind to a given structure and how will this inter-

action change the structure? Computational methods are

needed to exploit the structural information to understand

specific molecular recognition events and to elucidate the

function of the target macromolecule (Fig. 1). This informa-

tion should ultimately lead to the design of small molecule

ligands for the target, which will block its normal function

and thereby act as improved drugs.

Most of the drugs currently on the market have been

found through large-scale random screening of compounds

for activity against a target, for which no three-dimensional

structural information was available. That is, thousands of

compounds (all of the compounds a company has in its

deck, for example) are screened for activity. High-through-

put robotic screening methods (Houston & Banks, 1997) ac-

celerate this process. In the end, it is hoped that at least a

small number of compounds will be active against the target. A good lead compound is active at concentrations of 10 μM or less (Verlinde & Hol, 1994).

As the first step in structure-based drug design (Fig. 2),

the three-dimensional structure of the target macromolecule

(protein or nucleic acid) is determined by X-ray crystallog-

raphy or NMR. In a few instances, a homology model (Ring

et al., 1993) has been used as the starting point, but, in gen-

eral, the more accurate the structural information, the more

predictive the computational results will be. Once a lead

compound has been found by some means, an iterative pro-

cess begins that involves solving the three-dimensional

structure of the lead compound bound to the target, examin-

ing that structure and characterizing the types of interac-

tions the bound ligand makes, and using computational

methods to design improvements to the compound. This last

step, designing improvements to existing lead compounds, is the point at which computational methods have played an important role in the drug discovery process during the last 5–10 years. A small subset of the most promising proposed compounds is then synthesized and tested.

For those compounds that do have improved activity, the

three-dimensional structure of the improved compound bound to the target is determined. There are two problems

with using screening to find an initial lead compound fol-

lowed by structure-based optimization of that compound:

(1) if the initial compound does not already exist, it will

never be found; and (2) in this process, a great deal of time

and effort goes into refining a few lead compounds, and

thereby many of the resulting drug candidates for a given

target are chemically similar to one another. More recently,

pharmaceutical companies have used combinatorial chemistry, either in house or by contracting out to smaller technology companies, to synthesize large numbers of new compounds simultaneously (Borman, 1997; Wilson, 1997). In combinatorial chemistry, libraries or mixtures of compounds are simultaneously synthesized from all possible

combinations of up to hundreds of molecular fragments.

The newer computational methods are aimed at using the

information contained in the three-dimensional structure of

the unliganded target to design entirely new lead com-

pounds de novo, as well as to construct large virtual combi-

natorial libraries of compounds that then can be screened

computationally before going to the effort and expense of

actually synthesizing and testing them. Even after many cy-

cles of the structure-based design process, when a compound

that binds to the target with a very high level of activity (typ-

ically at nanomolar concentrations) has been developed, it is

still a long way from being a drug on the market. The com-

pound still has to pass through animal and clinical trials,

where factors that have not been considered, such as toxicity,

bioavailability, and resistance, often determine its fate. There

is now a greater emphasis on incorporating some of these

factors in the initial screening and optimization process that

leads to a drug. On average, it can take 15 years and 350–500 million dollars for a drug to reach the market (http://www.lilly.com/company/about/highlights.html) (Petsko, 1996). The

computational methods that will be described in this review

are expected to accelerate and reduce the cost of the drug discovery process. This approach is now feasible due to dramatic increases in computer power (Buzbee, 1993; Couzin, 1998), developments in the computational methodologies, and improvements in the accuracy of the empirical energy functions (Cornell et al., 1995; Halgren, 1996; Lii & Allinger, 1991; MacKerell et al., 1998; Maxwell et al., 1995) used to model atomic interactions in large biological systems.

Fig. 1. Sequence information can lead to enhanced target selection and structure prediction. Structural information about a given macromolecular target leads to a better understanding of its specific function and enables the design of small molecule ligands that can bind to the target. An α-carbon trace of the X-ray structure of RNase A with formate bound in the active site is shown (Fedorov et al., 1996).

Three general areas of computational drug design will be

discussed: database searching and docking methods, de

novo drug design methods, and ligand scoring functions.

This article is not intended to give an exhaustive review of

all available drug design algorithms and related programs,

but rather to illustrate the general concepts and the capabili-

ties of the existing technology. To this end, within each spe-

cific category, one or two methods will be described in

some detail; often these will be the methods with which the

author has the most familiarity.

2. Database searching and docking methods

The ability to rapidly and accurately dock large numbers

of small molecules into the binding site of a target macro-

molecule, such that the compounds are rank-ordered with

respect to their goodness of fit, is a key component of lead

generation in structure-based drug design (Kuntz, 1992).

One of the older and more widely used computational dock-

ing methods is the program DOCK (Kuntz et al., 1982;

Meng et al., 1992; Shoichet et al., 1993), which has been

and continues to be developed by Kuntz and co-workers at

the University of California, San Francisco and elsewhere.

DOCK systematically attempts to fit each compound from a

database into the binding site of the target structure such

that three or more of the atoms in the database molecule

overlap with a set of predefined site points (or a clique) in

the target binding site. The default method for site point

generation involves creating an inverse surface of the bind-

ing site. This is defined by the set of overlapping spheres

that fill the binding site and touch the molecular surface at only two points. The sphere centers (for all spheres with radii within a specified range) are used as site points. Crystallographic water molecules or experimental positions of

known ligand atoms are also often taken as site points. A

site point can be assigned a color that specifies the type of

atom that it is allowed to match, and it can be required that

at least one site point from a subset, or a critical cluster, be

matched (see Fig. 3 for an example).
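The matching step described above can be illustrated with a minimal Python sketch. It is not DOCK's implementation: it assumes a brute-force search over atom and site-point triplets rather than the clique detection mentioned in the text, and the function name `matching_triplets` and the tolerance `tol` are hypothetical. A triplet of ligand atoms is considered matched to a triplet of site points when all three internal distances agree within the tolerance.

```python
import itertools
import math


def matching_triplets(ligand_atoms, site_points, tol=0.5):
    """Return (atom indices, site indices) triplets with consistent distances.

    Assumption: brute-force enumeration; real programs use graph-based
    clique detection and also check chirality, which is ignored here.
    """
    matches = []
    pairs = ((0, 1), (0, 2), (1, 2))
    for atoms in itertools.combinations(range(len(ligand_atoms)), 3):
        for sites in itertools.permutations(range(len(site_points)), 3):
            consistent = all(
                abs(
                    math.dist(ligand_atoms[atoms[i]], ligand_atoms[atoms[j]])
                    - math.dist(site_points[sites[i]], site_points[sites[j]])
                )
                <= tol
                for i, j in pairs
            )
            if consistent:
                matches.append((atoms, sites))
    return matches
```

Each match defines a rigid-body superposition of the ligand onto the site points, which would then be scored; site-point colors and critical clusters could be added as extra filters on the candidate pairs.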

Fig. 2. Outline of the structure-based drug design process.

Fig. 3. An example of a DOCK search. (a) The set of site points used for a DOCK calculation on the structure of the CalB domain of phospholipase A2 (Xu et al., 1998; W. Somers, unpublished). An α-carbon trace is shown for the protein, with the two bound calcium ions drawn as the purple spheres. (b) The resulting 200 best scoring database molecules shown superimposed in green.

Often, the Available Chemicals Database is screened because individual compounds within this database are commercially available. The database can be obtained in a format that is searchable by DOCK. The three-dimensional structures of compounds in the database (Ricketts et al., 1993) have typically been generated by the program CONCORD (Tripos Associates, 1995, St. Louis, MO, USA) (Pearlman, 1987), which uses a combination of geometry rules and optimization procedures to select the lowest energy conformer of the molecule for inclusion in the database. (An input SMILES string is used to identify the cyclic and acyclic portions of the molecule. These are separately built and then connected together, relieving bad contacts by optimizing torsions.) Each match or docking of a molecule is scored on a grid throughout the binding site of the macromolecular target using precalculated values for the protein part of the interaction energy. A number of different energy functions can be employed: molecular mechanics force fields such as Amber (Cornell et al., 1995) or CHARMM (MacKerell et al., 1998; Neria et al., 1996), contact scoring functions, or Delphi electrostatic potential maps (Gilson et al., 1988; Nicholls & Honig, 1991; Sharp & Honig, 1990). In customized versions of DOCK, a solvation correction for the database compound can be added to the score (Shoichet et al.,

1999).
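The idea of scoring on a precalculated grid can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not DOCK's scoring code: the receptor contribution is a toy 12-6 Lennard-Jones-like term with hypothetical parameters `eps` and `rmin`, every ligand atom is treated as the same probe type, and each atom reads the nearest grid point instead of interpolating between points.

```python
import math


def build_grid(receptor_atoms, origin, spacing, shape, eps=0.2, rmin=4.0):
    """Precompute the receptor part of the energy at every grid point."""
    grid = {}
    for i in range(shape[0]):
        for j in range(shape[1]):
            for k in range(shape[2]):
                point = (
                    origin[0] + i * spacing,
                    origin[1] + j * spacing,
                    origin[2] + k * spacing,
                )
                energy = 0.0
                for atom in receptor_atoms:
                    # Clamp short distances so the repulsive term stays finite.
                    r = max(math.dist(point, atom), 0.5)
                    energy += eps * ((rmin / r) ** 12 - 2.0 * (rmin / r) ** 6)
                grid[(i, j, k)] = energy
    return grid


def score_pose(ligand_atoms, grid, origin, spacing, shape, outside_penalty=100.0):
    """Sum grid energies at the nearest grid point of each ligand atom."""
    total = 0.0
    for atom in ligand_atoms:
        index = tuple(round((c - o) / spacing) for c, o in zip(atom, origin))
        if all(0 <= n < size for n, size in zip(index, shape)):
            total += grid[index]
        else:
            # Atoms that leave the grid are penalized (hypothetical choice).
            total += outside_penalty
    return total
```

The grid is built once per target, so scoring each docked orientation costs only one lookup per ligand atom, which is what makes screening a large database tractable.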

DOCK has been used to generate lead compounds for a

number of important biological targets, including human

immunodeficiency virus (HIV)-1 protease (Friedman et al.,

1998; Rose & Craik, 1994), dihydrofolate reductase (Gsch-

wend et al., 1997), B-form DNA (Grootenhuis et al., 1994),

RNA (Chen et al., 1997), hemagglutinin (Hoffman et al.,

1997), a malaria protease (Li et al., 1996), and thymidylate

synthase (Shoichet et al., 1993). In an attempt to account for

ligand flexibility, DOCK databases recently have been constructed with multiple conformations for each molecule, or

ensembles of superimposed conformations. In the first case,

each conformation of a molecule is docked separately,

while in the other case, either the largest rigid fragment of a

molecule (Lorber & Shoichet, 1998) or its largest three-

dimensional pharmacophore (Thomas et al., 1999) can be

used to overlay and dock the ensemble of conformations.

The newest version of the program, DOCK 4.0, can exhaus-

tively search all possible matches of each entry in the data-

base and can be run in a flexible ligand mode (Makino &

Kuntz, 1997), although both are computationally intensive.

The success of DOCK 4.0 and the new multi-conformation databases has yet to be fully tested.

Other methods for flexible ligand docking include FLO98

(McMartin & Bohacek, 1997), AUTODOCK (Goodsell et

al., 1996; Morris et al., 1996), Hammerhead (Welch et al.,

1996), and FLEXX (Kramer et al., 1997; Rarey et al.,

1996). The FLO98 algorithm involves Monte Carlo (MC)

perturbation (wide-angle torsional Metropolis perturbation,

as well as translation and rotation of ligand atoms) followed

by energy minimization in Cartesian space for flexible

ligand binding to a target structure; therefore, there is full

flexibility for cyclic and acyclic molecules. For the initial

MC docking, the AMBER potential is evaluated on a grid

surrounding the binding site using relatively short, non-

standard, cutoffs for the nonbonded energy terms and with a

smoothly rising potential wall around the target binding site.

If the interaction energy of the ligand with the binding site

drops below a specified cutoff, the ligand position is fully

energy minimized. In general, the FLO98 package is rela-

tively easy for nonexperts to use to rapidly dock a large

number of ligand molecules into a given target structure and

graphically view the results. The method has been shown to

reproduce the X-ray structure of known complexes in most

cases, although a large enough number of docking cycles

must be carried out to ensure sufficient sampling. For cer-

tain ligands with high barriers to interconversion between

stereoisomers, it may be best to dock a few alternate low-

energy conformations of the molecule. AUTODOCK (Good-

sell et al., 1996; Morris et al., 1996) employs simulated an-

nealing in torsion space and, therefore, is best suited for

ligands with only a few rotatable bonds. Hammerhead

(Welch et al., 1996) also searches torsion space, but uses a

genetic algorithm approach. It is very fast and does well at

reproducing X-ray structures of ligand-protein complexes,

but, like AUTODOCK, does not include conformational

searching of cyclic molecules.
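The Metropolis acceptance test and the annealing schedule that underlie these torsion-space searches can be sketched as below. This is a generic illustration, not the FLO98 or AUTODOCK algorithm: the energy function `toy_energy`, the geometric cooling schedule, and all parameter values are assumptions made for the example.

```python
import math
import random


def metropolis_accept(delta_e, temperature, rng):
    """Accept downhill moves always, uphill moves with Boltzmann probability."""
    if delta_e <= 0.0:
        return True
    return rng.random() < math.exp(-delta_e / temperature)


def anneal_torsions(energy, torsions, steps=2000, t_start=5.0, t_end=0.05,
                    max_step=60.0, seed=0):
    """Simulated annealing over a list of torsion angles given in degrees."""
    rng = random.Random(seed)
    current = list(torsions)
    e_current = energy(current)
    best, e_best = list(current), e_current
    for step in range(steps):
        # Geometric cooling from t_start down to t_end (assumed schedule).
        temperature = t_start * (t_end / t_start) ** (step / (steps - 1))
        trial = list(current)
        i = rng.randrange(len(trial))
        trial[i] = (trial[i] + rng.uniform(-max_step, max_step)) % 360.0
        e_trial = energy(trial)
        if metropolis_accept(e_trial - e_current, temperature, rng):
            current, e_current = trial, e_trial
            if e_current < e_best:
                best, e_best = list(current), e_current
    return best, e_best


def toy_energy(torsions):
    """Hypothetical threefold torsion term with minima at 60, 180, 300 degrees."""
    return sum(1.0 + math.cos(math.radians(3.0 * t)) for t in torsions)
```

Because each move perturbs one torsion at a time, the cost of adequate sampling grows quickly with the number of rotatable bonds, which is consistent with the remark that such searches suit ligands with only a few rotatable bonds.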

FLEXX (Rarey et al., 1996) is more distinct from the

other docking methods in that it first decomposes the ligand

into fragments by breaking all single acyclic, nonterminal

bonds. A hashing pattern recognition technique is then used

to dock a set of base fragments into the binding site. Base

fragments are docked by matching three ligand-interaction

centers to three interaction points on the receptor surface.

The ligand is incrementally built up starting from the position of a base fragment. The set of allowed interaction types or physicochemical properties and the empirical scoring

function are defined as in the program LUDI (see Section 3

for a more detailed description of this method) (Bohm,

1994a), with slight modifications. This model of discrete

conformational flexibility for the ligand, with finite sets of

allowed torsional angles for single acyclic bonds and pre-

computed conformations for ring systems, allows the dock-

ing to be fast. If the ligand-bound conformation of the re-

ceptor and a base fragment that binds with high specificity

are used, the method can reproduce the X-ray structures of

known complexes. In a recent blind test of the method (at

the CASP2 meeting), FLEXX predicted two of seven selected ligand complexes correctly, found parts of the solution for four of them, and failed at one (Kramer et al., 1997).
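The fragmentation rule described for FLEXX (breaking all single, acyclic, nonterminal bonds) can be illustrated on a simple molecular graph. The sketch below is an assumption-laden illustration rather than the FLEXX code: it works on heavy atoms only, takes bonds as hypothetical `(i, j, order)` tuples, and treats a bond as terminal when either atom has no other heavy-atom neighbor.

```python
from collections import defaultdict


def _connected(adjacency, start, goal, skip):
    """Depth-first search from start to goal that ignores the bond `skip`."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        if node == goal:
            return True
        for neighbor in adjacency[node]:
            if {node, neighbor} == skip or neighbor in seen:
                continue
            seen.add(neighbor)
            stack.append(neighbor)
    return False


def fragment(n_atoms, bonds):
    """Split a molecule at single, acyclic, nonterminal bonds."""
    adjacency = defaultdict(set)
    for i, j, _ in bonds:
        adjacency[i].add(j)
        adjacency[j].add(i)

    kept = defaultdict(set)
    for i, j, order in bonds:
        in_ring = _connected(adjacency, i, j, skip={i, j})
        terminal = len(adjacency[i]) == 1 or len(adjacency[j]) == 1
        if order == 1 and not in_ring and not terminal:
            continue  # this bond is cut
        kept[i].add(j)
        kept[j].add(i)

    # Fragments are the connected components that remain after cutting.
    seen, fragments = set(), []
    for atom in range(n_atoms):
        if atom in seen:
            continue
        seen.add(atom)
        component, stack = [], [atom]
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbor in kept[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        fragments.append(sorted(component))
    return fragments
```

For the heavy atoms of propylbenzene, for example, this yields the ring as one fragment and splits the side chain at its two nonterminal single bonds; one of the resulting fragments would then serve as the base fragment for incremental construction.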

3. Computational de novo drug design methods

There are three basic classes of computational methods

for the de novo design of structure-based ligands: fragment

positioning methods, molecule growth methods, and frag-

ment methods coupled to database searches. In each cate-

gory, there are a number of software packages, available

commercially or from academic groups (Caflisch & Karplus, 1995). Also, in some cases, pharmaceutical companies have developed their own in-house software. For each type,

two or three methods are highlighted below and recent ap-

plications are discussed. The advantages and disadvantages

of the three general strategies are assessed.

3.1. Fragment positioning methods

Of the fragment positioning methods, two well-known

programs are GRID (Goodford, 1985) and MCSS (Multiple

Copy Simultaneous Search) (Evensen et al., 1997; Miranker

& Karplus, 1991). These methods determine energetically

favorable binding site positions for various functional group




types or chemical fragments. The program GRID calculates

protein interaction energies for functional groups repre-

sented as single-sphere probes on a grid surrounding the tar-

get structure. The GRID nonbonded interaction energy in-

cludes an explicit hydrogen bonding term (Boobbyer et al.,

1989; Wade et al., 1993; Wade & Goodford, 1993) in addi-

tion to electrostatic and van der Waals (vdW) terms. The re-

sulting grid contour map for a given probe looks like elec-

tron density into which fragments of that probe type can be

built. Therefore, GRID should be fairly intuitive for a crys-

tallographer to use, and is particularly useful for designing

modifications to existing lead compounds. As an example,

GRID was used to suggest the replacement of a single hy-

droxyl by an amino group in an existing inhibitor of influenza

virus sialidase (2-deoxy-2,3-didehydro-N-acetylneuraminic acid) that led to an inhibitor (4-amino-Neu5Ac2en) with

dramatically improved binding affinity (two orders of mag-

nitude improvement in Ki) (von Itzstein et al., 1996). In the

newer versions of GRID, the ability to create multi-sphere

probes is available, but at least three atoms in the multi-sphere probe must be capable of making hydrogen bonds

and must not be in a linear arrangement (so a multi-sphere

phenol group, for example, cannot be created). In contrast,

with the MCSS program, the probes are fully flexible and

individual atoms are represented using the CHARMM

(Brooks et al., 1983) potential energy function. In its stan-

dard single atom probe mode, GRID is fast, but gives much

less detailed information than MCSS. A detailed compari-

son of the two methods (R. Putzer, D. Joseph-McCarthy,

J. M. Hogle, & M. Karplus, in preparation) has shown that

the time required for a typical MCSS calculation for meth-

ane, for example, is approximately 2.5 times that required

for the corresponding GRID calculation, although neither

time is prohibitive and the results are similar. For larger

functional groups (such as phenol), the MCSS calculation

takes significantly longer than the corresponding GRID sin-

gle-sphere probe calculation (an aromatic hydroxyl), but the

results are effective at indicating where in the binding site

the group can be accommodated (Fig. 4). The resulting MCSS

maps are more analogous to experimental mapping of a pro-

tein surface by determining its three-dimensional structure

in various organic solvents (Allen et al., 1996; Joseph-

McCarthy et al., 1996; Shuker et al., 1996). MCSS has been

used to suggest improvements to HIV-1 protease inhibitors

(Caflisch et al., 1993) and thrombin inhibitors (Grootenhuis

& Karplus, 1996) and to design novel picornavirus capsid-binding ligands (D. Joseph-McCarthy, unpublished data).

Other related methods include HIPPO (Gillet et al., 1995)

and the fragment positioning mode of LUDI (Bohm, 1992).
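A single-sphere probe map of the kind described for GRID can be sketched as follows. The functional forms here are illustrative assumptions, not the GRID force field: a 12-6 van der Waals term, a Coulomb term with a distance-dependent dielectric of 4r, and a Gaussian well standing in for the explicit hydrogen bonding term. All parameter values, and the tuple layout `(x, y, z, charge, is_donor)` for target atoms, are hypothetical.

```python
import math


def probe_energy(point, atoms, probe_charge=-0.3, eps=0.15, rmin=3.5,
                 hb_depth=2.0):
    """Toy probe energy (kcal/mol): vdW + electrostatic + hydrogen bond."""
    total = 0.0
    for x, y, z, charge, is_donor in atoms:
        r = max(math.dist(point, (x, y, z)), 0.8)  # clamp very short distances
        total += eps * ((rmin / r) ** 12 - 2.0 * (rmin / r) ** 6)
        total += 332.0 * probe_charge * charge / (4.0 * r * r)
        if is_donor:
            # Gaussian well centered near a typical hydrogen bond distance.
            total -= hb_depth * math.exp(-((r - 2.9) ** 2) / 0.1)
    return total


def favorable_points(atoms, origin, spacing, shape, cutoff):
    """Return (point, energy) pairs at or below the cutoff, best first."""
    hits = []
    for i in range(shape[0]):
        for j in range(shape[1]):
            for k in range(shape[2]):
                point = (
                    origin[0] + i * spacing,
                    origin[1] + j * spacing,
                    origin[2] + k * spacing,
                )
                energy = probe_energy(point, atoms)
                if energy <= cutoff:
                    hits.append((point, energy))
    return sorted(hits, key=lambda hit: hit[1])
```

Selecting the grid points below an energy cutoff plays the role of contouring the map: the surviving points outline regions where a group of that probe type could be built into a ligand.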

Fig. 4. Comparison of an MCSS functional group map for phenol and a GRID map for the aromatic hydroxyl probe, both calculated for the poliovirus capsid protein. MCSS phenol minima with E 12 kcal/mol are shown colored by element, and the GRID density contoured at E 4 kcal/mol is in magenta.

Fragment positioning methods can be considered as the first step in a three-step approach to de novo drug design. The second step in the process involves clustering and connecting the optimally placed molecular fragments to form

chemically sensible candidate ligands. The third step in-

volves estimating how well the proposed compounds should

bind relative to one another and to existing drugs (see Sec-

tion 4). Several different approaches can be employed for

the second step, and a number of groups have been develop-

ing ways to automate the process. In one application, MC

minimizations were performed using a pseudo-potential energy function to connect N-methyl acetamide minima in the

binding site to form peptide backbones (Caflisch et al.,

1993). In another application, a link procedure involving the

optimization of linker carbon positions and their connectiv-

ity to selected functional group minima was used to con-

struct nonpeptide small molecules (Joseph-McCarthy et al.,

1997). The newer program, OLIGO (E. Evensen & M. Kar-

plus, unpublished), can similarly construct peptide back-

bones using a simulated annealing MC minimization proce-

dure and a pseudo potential. In this case, however, each MC

move is the substitution of one backbone monomer fragment (an N-methyl acetamide minimum position) for another in the chain. Allowed side chains (in their optimal positions in the binding site) are then automatically and

exhaustively added to these backbones. Two related dynam-

ical approaches are DLD (Dynamic Ligand Design)

(Miranker & Karplus, 1995) and CONCERTS (Creation Of

Novel Compounds by Evaluation of Residues at Target

Sites) (Pearlman & Murcko, 1996). DLD saturates the tar-

get binding site with sp3 carbons, which can connect to each

other or to functional group minima (as determined by

MCSS or a related method) to form molecules with the cor-

rect stereochemistry using a pseudo-energy function. This

potential function depends on the Cartesian coordinates of the atoms, as well as their occupancies and types. In the

present implementation, it is sampled and optimized using

MC simulated annealing. CONCERTS saturates the binding

site with multiple copies of various molecular fragments

and does both the fragment positioning and connection us-

ing molecular dynamics (MD) with the AMBER potential

energy function. The fragments are fully flexible during the

minimization, and only connected fragments interact with

each other. Connections can occur along user-specified

bonds to hydrogen in each fragment; when an inter-frag-

ment bond is formed, two hydrogens (one belonging to each

fragment) are deleted. During the optimization procedure,

bonds can break, as well as form, if the result lowers the

overall energy of the molecule or macro-fragment. With

both DLD and CONCERTS, multiple molecules are simul-

taneously formed and scored.

3.2. Molecule growth methods

In molecule growth methods, a seed atom (or fragment)

is first placed in the binding site of the target structure. A

ligand molecule is successively built by bonding another

atom (or fragment) to it. There are a number of molecule

growth methods available, including SMoG (Small Mole-

cule Growth) (DeWitte et al., 1997; DeWitte & Shakhnov-

ich, 1996), GrowMol (Bohacek & McMartin, 1994, 1995),

GenStar (Rotstein & Murcko, 1993a), GroupBuild (Rotstein

& Murcko, 1993b), and GROW (Moon & Howe, 1991).

GROW starts with a user-selected residue position and con-

structs a peptide by sequentially adding residues. Amino

acid conformations are selected from a large predefined li-

brary. Peptides are scored as they are being constructed, us-

ing a molecular mechanics force field. GroupBuild is simi-

lar in that it uses a predefined library of chemical fragments

and scores candidate fragment positions based on a molecu-

lar mechanics force field to generate candidate small mole-

cule ligands fragment-by-fragment. In contrast, GenStar se-

quentially grows structures composed of only sp3 carbons,

starting from either the position of user-selected seed atoms

in the target structure or a docked ligand core onto which at-

oms are to be built. For each new atom, several hundred

candidate positions are generated based on geometry con-

siderations, and then scored using a simple contact function.

The new atom position is randomly chosen from among the best-scoring candidate positions. Branching is allowed, and

ring formation is favored to generate structures that fill the

binding site. Similarly, GrowMol sequentially builds up

ligand structures from a library of allowed atom types (in-

cluding oxygen, nitrogen, negatively charged oxygen, and

hydrogen, in addition to sp3 carbon), as well as small func-

tional group types. In this case, each new atom (or func-

tional group) position is scored based on its chemical com-

plementarity with nearby atoms in the binding site of the

target structure. The Metropolis criterion with this complementarity score taken as the energy is used to accept or reject the candidate atom position. The complementarity score is determined by a grid surrounding the target binding site

with grid points designated as binding-site forbidden (too

close to the target structure), hydrogen bond acceptor, hy-

drogen bond donor, or neutral. SMoG uses a coarse-grained

knowledge-based potential that is based on statistical analy-

sis of crystal structures of small molecule-protein complexes

to estimate the binding affinity of molecules as they are

grown. Molecules are built by joining small, rigid fragments

together with standard bond lengths and angles and optimal

torsions. Functional group additions are accepted based on a

Metropolis MC method. The disadvantages of this general

approach are that the final results depend a great deal on the

position of the seed atom in the binding site and that many

of the resulting molecules may be too difficult to synthesize.

In future implementations, additional chemistry rules need

to be considered when growing the molecules.
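The accept/reject step used during growth can be sketched in the spirit of the GrowMol description, where a complementarity score from a labeled grid is used as the energy in a Metropolis test. The sketch is hypothetical throughout: the score table `COMPLEMENTARITY`, the atom type names, and the convention that a grid label describes the nearby target group (so that a donor site favors an acceptor atom) are assumptions, not details taken from the program.

```python
import math
import random

# Hypothetical pseudo-energies for (grid label, ligand atom type) pairs.
COMPLEMENTARITY = {
    ("acceptor", "donor_N"): -1.0,
    ("donor", "acceptor_O"): -1.0,
    ("neutral", "sp3_C"): -0.5,
}
MISMATCH = 1.0  # assumed penalty for any pairing not listed above


def complementarity_score(grid_label, atom_type):
    """Score a candidate atom against the label of its grid point."""
    if grid_label == "forbidden":
        return math.inf  # too close to the target structure
    return COMPLEMENTARITY.get((grid_label, atom_type), MISMATCH)


def metropolis_accept(score, temperature, rng):
    """Metropolis test with the complementarity score taken as the energy."""
    if score <= 0.0:
        return True
    if math.isinf(score):
        return False
    return rng.random() < math.exp(-score / temperature)


def grow(candidates, n_atoms, temperature=0.6, seed=0):
    """Add candidate atoms one at a time until n_atoms have been accepted."""
    rng = random.Random(seed)
    ligand = []
    for grid_label, atom_type in candidates:
        if len(ligand) == n_atoms:
            break
        score = complementarity_score(grid_label, atom_type)
        if metropolis_accept(score, temperature, rng):
            ligand.append((grid_label, atom_type))
    return ligand
```

Because each accepted atom constrains where the next one can go, the outcome depends strongly on the first placements, which mirrors the seed-dependence noted above as a disadvantage of growth methods.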

3.3. Fragment methods coupled to database searches

Fragment positioning methods can also be coupled to da-

tabase searching techniques either to extract those existing

molecules from a database that can be docked into the bind-

ing site with the desired fragments in their optimal positions

or for de novo design. The program HOOK (Eisen et al.,




1994), for example, can be used to do both. In its de novo

design mode, HOOK first creates a database of molecular

skeletons by stripping off all the functional groups on the

database molecules and then searches this database for

those molecular skeletons that can be fit into the target bind-

ing site in such a way that two MCSS functional group min-

ima can be attached or hooked onto them. After this initial

docking by geometrical superposition of two designated

hooks (methyl groups and attached atoms) in the skeletal

molecules and in two functional group minima, the fit of the

skeleton in the binding site is scored using a simplified, in-

verted Lennard-Jones type contact potential. If the fit is ac-

ceptable, secondary searches are carried out to attach addi-

tional MCSS minima to the skeleton, possibly through an

extra carbon. CAVEAT (Lauri & Bartlett, 1994) is similar

in that it searches a database of three-dimensional structures

of small molecules (often cyclic molecules) to use as molec-

ular frameworks to connect fragments already optimally

placed in the binding site. For each molecule in the database, specific bonds are represented as vectors, and the molecule is represented as a set of pairwise combinations of

these bond vectors. CAVEAT matches specified pairs of

bond vectors from the fragments (or the query) and the data-

base molecules to retrieve compounds. It is fast because the

interaction between the skeletal molecule and the binding

site is only considered in a post-processing step. As with

HOOK and CAVEAT, LUDI (Bohm, 1992, 1994b) can be

used either for database searching or de novo design. For de

novo design, LUDI uses either statistical data from small-

molecule crystal structures, geometric rules, or output from

the program GRID to identify interaction sites in the target

binding site. Molecular fragments (taken from a library of hundreds) are then placed in binding site positions, where

they can connect up to four of these favorable hydrogen-

bonding or hydrophobic interaction sites. Smaller linker

groups such as CH2 are used interactively to connect these

larger, optimally placed fragments into candidate ligands.

LUDI's empirical scoring function takes into account hydrogen bonds, ionic interactions, the lipophilic protein-ligand contact surface, and the number of rotatable bonds in a ligand. It was calibrated by fitting to experimental binding affinities for 45 protein-ligand complexes to obtain the individual energy contributions for an ideal neutral hydrogen bond (−4.7 kJ/mol), an ideal ionic hydrogen bond (−8.3 kJ/mol), a lipophilic contact (−0.17 kJ/mol per Å²), and one rotatable bond in the ligand (+1.4 kJ/mol). Deviations from ideal geometry reduce these contributions, and the sum of all interactions gives an estimate of the free energy of binding for a

given protein-ligand complex. Since its scoring function is

based solely on geometric considerations, LUDI is very fast

and can be used interactively to predict protein-ligand com-

plex structures, but it may sometimes miss optimal positions

that are due to more delocalized electrostatic and vdW inter-

actions. Instead of docking molecular fragments from a li-

brary, LUDI can similarly be used to dock and score mole-

cules from a large database.
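The additive form of such an empirical score can be illustrated with a minimal Python sketch. It is not LUDI's implementation: it uses only the four contributions quoted above, takes the hydrogen-bond, ionic, and lipophilic terms as favorable (negative) and the rotatable-bond term as a penalty (positive), assumes the lipophilic term is per square angstrom of contact surface, and replaces the geometric penalty function (not given here) with a hypothetical quality factor between 0 and 1 for each interaction. Any further terms of the published function are left out.

```python
# Per-interaction contributions quoted in the text (kJ/mol). The signs are an
# assumption: favorable terms negative, the rotatable-bond penalty positive.
DG_NEUTRAL_HBOND = -4.7
DG_IONIC_HBOND = -8.3
DG_LIPO_PER_A2 = -0.17  # assumed to be per square angstrom of lipophilic contact
DG_ROTATABLE_BOND = 1.4


def ludi_like_score(hbond_quality, ionic_quality, lipo_area_a2, n_rotatable):
    """Sum the contributions into a binding free energy estimate (kJ/mol).

    hbond_quality / ionic_quality: one factor in [0, 1] per interaction,
    1.0 for ideal geometry and smaller for distorted geometry (a hypothetical
    stand-in for the geometric penalty function).
    """
    for q in list(hbond_quality) + list(ionic_quality):
        if not 0.0 <= q <= 1.0:
            raise ValueError("geometry factors must lie in [0, 1]")
    return (
        DG_NEUTRAL_HBOND * sum(hbond_quality)
        + DG_IONIC_HBOND * sum(ionic_quality)
        + DG_LIPO_PER_A2 * lipo_area_a2
        + DG_ROTATABLE_BOND * n_rotatable
    )


# Two ideal neutral hydrogen bonds, one distorted ionic hydrogen bond,
# 100 A^2 of lipophilic contact, and three rotatable bonds.
score = ludi_like_score([1.0, 1.0], [0.5], 100.0, 3)
print(f"estimated binding free energy: {score:.2f} kJ/mol")
```

Because every term depends only on counts and local geometry, a score of this kind can be evaluated quickly enough for interactive use, which is the trade-off described above.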

Fragment positioning methods can also be used to deter-

mine or combinatorially generate possible structure-based

pharmacophores. Traditionally, a pharmacophore is the set

of features common to a series of active molecules. A three-

dimensional pharmacophore specifies the spatial relation-

ship between the groups or features, often defining dis-

tances or distance ranges between groups, angles between

groups or planes, and exclusion spheres (see Fig. 5) (Leach,

1996). Structural information about the target can also be

used to help align ligand molecules to obtain better pharma-

cophores. Programs such as Catalyst can be used to generate

a pharmacophore by aligning and overlaying a set of ligand

structures. Catalyst can also use a pharmacophore to search

a database for new molecules that possess that pharmaco-

phore (Sprague, 1995). ISIS (MDL Information Systems

Inc., 1997) and UNITY (Tripos Associates, 1995) are two

other popular programs for searching a database for two- or

three-dimensional pharmacophores.
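As a rough illustration of what searching for a three-dimensional pharmacophore involves, the following Python sketch tests whether one conformation of a candidate molecule contains features of the required types at distances inside the allowed ranges. It is not the algorithm of Catalyst, ISIS, or UNITY; the feature names, distance ranges, and data layout are hypothetical, and angle constraints and exclusion spheres are omitted.

```python
import math
from itertools import permutations

# Hypothetical three-point pharmacophore: feature types plus allowed distance
# ranges (angstroms) between pairs of features. The numbers are illustrative.
PHARMACOPHORE = {
    "features": ["donor", "acceptor", "hydrophobe"],
    "distances": {(0, 1): (2.5, 3.5), (0, 2): (4.0, 6.0), (1, 2): (5.0, 7.0)},
}


def matches(pharmacophore, candidate):
    """Return True if some assignment of candidate features satisfies every
    feature type and distance range of the pharmacophore.

    candidate: list of (feature_type, (x, y, z)) tuples for one conformation.
    """
    wanted = pharmacophore["features"]
    for combo in permutations(range(len(candidate)), len(wanted)):
        # Skip assignments whose feature types do not line up.
        if any(candidate[c][0] != w for c, w in zip(combo, wanted)):
            continue
        if all(
            lo <= math.dist(candidate[combo[i]][1], candidate[combo[j]][1]) <= hi
            for (i, j), (lo, hi) in pharmacophore["distances"].items()
        ):
            return True
    return False


hit = [("donor", (0, 0, 0)), ("acceptor", (3, 0, 0)), ("hydrophobe", (0, 5, 0))]
miss = [("donor", (0, 0, 0)), ("acceptor", (3, 0, 0)), ("hydrophobe", (0, 9, 0))]
print(matches(PHARMACOPHORE, hit), matches(PHARMACOPHORE, miss))
```

A database search would apply such a test to each stored conformation of each molecule; the brute-force enumeration of assignments used here is only practical for small feature sets.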

3.4. Virtual library construction and screening

The de novo design methods described in Sections 3.1–3.3 can be used to suggest individual molecules or to construct large virtual combinatorial libraries of compounds

that can be screened computationally. MCSS functional

group maps, for example, have been used to design large

structure-based libraries for major histocompatibility Class

II molecules (E. Evensen, D. Joseph-McCarthy, G. Weiss,

S. Schreiber, & M. Karplus, in preparation), and small di-

rected libraries of poliovirus capsid-binding ligands (D. Jo-

seph-McCarthy, J. M. Hogle, & M. Karplus, in preparation).

An automated method, CCLD, for generating combinatorial

libraries by iteratively and exhaustively connecting MCSS minima has also been developed (Caflisch, 1996). Starting

with the MCSS minimum with the lowest approximated

binding free energy, small linker units (with from 0 to 3 co-

valent bonds) are used to add additional fragment minima.

The calculation is fast because a list of mutually excluding

(overlapping) fragment pairs and of possible bonding fragment pairs is pre-computed. Also, ligand growth is stopped if the average value of the binding free energy of its fragments exceeds a specified cutoff. In addition, HOOK can be used with a database of all allowed conformations of a scaffold or a set of scaffolds, with only positions that can be combinatorialized designated as hooks. FLO98 can also be used to generate and score combinatorial libraries in an automated manner.

Fig. 5. An example of a three-dimensional pharmacophore.
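The growth procedure described above for CCLD can be pictured with a much-simplified Python sketch. It is not the CCLD program: fragment minima are reduced to indices with an approximate binding free energy each, the pre-computed lists are plain sets of index pairs, linker geometry is ignored, and the function name, arguments, and example numbers are all hypothetical.

```python
def grow_ligands(energies, bondable, excluded, cutoff, max_fragments=3):
    """Enumerate candidate ligands as sorted tuples of fragment-minimum indices.

    energies: approximate binding free energy per minimum (lower is better)
    bondable: set of frozenset({i, j}) pairs that a linker could connect
    excluded: set of frozenset({i, j}) pairs that overlap in space
    cutoff:   growth stops when the mean fragment energy exceeds this value
    """
    # Start from the minimum with the lowest approximate binding free energy.
    start = min(range(len(energies)), key=lambda i: energies[i])
    found = set()

    def extend(ligand):
        key = frozenset(ligand)
        if key in found:  # same fragment set already reached by another order
            return
        found.add(key)
        if len(ligand) == max_fragments:
            return
        for j in range(len(energies)):
            if j in ligand:
                continue
            if any(frozenset((i, j)) in excluded for i in ligand):
                continue  # overlaps a fragment already in the ligand
            if not any(frozenset((i, j)) in bondable for i in ligand):
                continue  # no linker can attach it
            grown = ligand + [j]
            if sum(energies[i] for i in grown) / len(grown) > cutoff:
                continue  # average fragment energy is too unfavorable
            extend(grown)

    extend([start])
    return sorted(tuple(sorted(lig)) for lig in found)


energies = [-10.0, -8.0, -2.0, -7.0]
bondable = {frozenset(p) for p in [(0, 1), (1, 2), (0, 3), (1, 3)]}
excluded = {frozenset((1, 3))}
library = grow_ligands(energies, bondable, excluded, cutoff=-6.0)
print(library)
```

Looking up pairs in pre-computed sets keeps each growth step cheap, which reflects the reason given above for the speed of the calculation.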

4. Ligand-binding scoring functions

The success of docking molecules into a target site, de-

signing ligands de novo, or constructing and screening large

virtual combinatorial libraries is ultimately dependent on

the accuracy of the scoring function that ranks the com-

pounds or how well the corresponding relative binding af-

finities can be predicted. Ligand binding is governed by ki-

netic and thermodynamic principles. Factors that contribute

to ligand binding include the hydrophobic effect, vdW and

dispersion interactions, hydrogen bonding, other electrostatic interactions, and solvation effects (Ajay & Murcko, 1995). If the change in free energy associated with complex

formation is negative, the association will be favorable.

Once a candidate ligand is constructed, its interaction en-

ergy with the protein is calculated and compared with that

for other proposed compounds and existing ligands.

In order of increasing complexity, the various ap-

proaches for estimating binding affinities include scoring

functions based on the statistical analysis of known struc-

tures of protein-ligand complexes (Koppensteiner & Sippl,

1998), physicochemical properties (Bohm & Klebe, 1996),

molecular mechanics force-field calculations, force-field

calculations with added solvation corrections, and free energy perturbation (FEP) calculations (Gilson et al., 1997a).

The SMoG pseudo-energy function is an example of a scor-

ing function based on the statistical analysis of high resolu-

tion X-ray structures. The simplest physicochemical scoring

functions include those that count the number of receptor

atom contacts within specified distances or that scale these

counts depending on the distance from the ligand, as HOOK

does. More complicated ones include the LUDI energy

function and similar empirical scoring functions (Eldridge

et al., 1997; Jain, 1996). Molecular mechanics force-field

calculations attempt to model explicitly the atomic inter-

actions in the system. The resulting interaction energies rep-

resent the enthalpic contribution to the free energy. The

simplest force-field calculations are performed with the

ligand-target complex in vacuum using truncation schemes

for the nonbonded interactions. The calculated ligand-target

interaction energies include electrostatic and vdW interac-

tions between the ligand and target, and often also include

the internal energy (bond, angle, and torsion terms) of the

ligand or a ligand strain term (the internal energy of the

ligand in its bound conformation minus a reference energy

for the ligand in an unbound conformation). In a number of

cases of sets of related compounds, a reasonable correlation

exists between the vdW interaction energy alone and bind-

ing affinities (Caflisch & Karplus, 1995; Grootenhuis & van

Galen, 1995; Grootenhuis & Van Helden, 1994; Holloway

et al., 1995; Joseph-McCarthy et al., 1997; Kurinov & Har-

rison, 1994).
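A vacuum ligand-target interaction energy with simple truncation might be computed along the following lines. This Python sketch is illustrative only: the 12-6 Lennard-Jones and Coulomb forms, the combination rules, the approximate conversion constant, and the atom dictionary layout are assumptions rather than the conventions of any particular force field discussed in this review, and internal (strain) terms are not included.

```python
import math

# Approximate Coulomb conversion factor in kcal*angstrom/(mol*e^2); the exact
# value differs slightly between force-field packages.
COULOMB_CONSTANT = 332.06


def interaction_energy(ligand, target, cutoff=12.0, dielectric=1.0):
    """Return (vdW, electrostatic) ligand-target energies in kcal/mol.

    Each atom is a dict with keys 'xyz', 'charge' (e), 'epsilon' (kcal/mol,
    well depth) and 'rmin_half' (angstrom). Pairs beyond `cutoff` are dropped,
    i.e. the nonbonded interactions are simply truncated.
    """
    e_vdw = 0.0
    e_elec = 0.0
    for a in ligand:
        for b in target:
            r = math.dist(a["xyz"], b["xyz"])
            if r > cutoff:
                continue
            eps = math.sqrt(a["epsilon"] * b["epsilon"])  # geometric mean
            rmin = a["rmin_half"] + b["rmin_half"]        # arithmetic sum
            ratio6 = (rmin / r) ** 6
            e_vdw += eps * (ratio6 * ratio6 - 2.0 * ratio6)
            e_elec += COULOMB_CONSTANT * a["charge"] * b["charge"] / (dielectric * r)
    return e_vdw, e_elec


lig = [{"xyz": (0.0, 0.0, 0.0), "charge": 0.5, "epsilon": 0.1, "rmin_half": 2.0}]
tgt = [{"xyz": (4.0, 0.0, 0.0), "charge": -0.5, "epsilon": 0.1, "rmin_half": 2.0}]
vdw, elec = interaction_energy(lig, tgt)
print(f"vdW = {vdw:.3f} kcal/mol, electrostatic = {elec:.3f} kcal/mol")
```

Returning the two terms separately makes it easy to test, for a series of related compounds, whether the vdW term alone tracks the measured affinities, as reported in the studies cited above.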

A mean force-field approximation or continuum repre-

sentation for solvent can be used to calculate an electrostatic

term that is substituted for the molecular mechanics Cou-

lombic term to estimate the electrostatic contribution to the

free energy. This continuum treatment of long-range elec-

trostatic interactions involves first calculating the electro-

static potential for the final state and the individual refer-

ence states, using a finite difference approach to solve the

linearized Poisson-Boltzmann equation, as implemented in

UHBD (Davis et al., 1991; Madura et al., 1995) or Delphi

(Gilson et al., 1988; Nicholls & Honig, 1991). Calculation

of the electrostatic energy from the electrostatic potential is

trivial, and for ligand binding, the difference in the electro-

static energy approximates the difference in the electrostatic

contribution to the free energy (that is, for the binding of

ligand L to protein P, $\Delta G_{elec} \approx \Delta U = U_{PL} - U_{P} - U_{L}$). To account further for solvation, the solvent-accessible surface

area can be calculated for the ligand, the protein, and the

ligand-protein complex. The surface area buried upon com-

plex formation can be related to the free energy of nonpolar

solvation or the hydrophobic effect associated with ligand

binding (Eisenberg & McLachlan, 1986; Ooi et al., 1987).

A number of groups have used a weighted sum of a contin-

uum electrostatic term and a buried surface area term, some-

times with the addition of a ligand internal energy term, to

predict binding affinities with some success (Caflisch,

1996; Froloff et al., 1997; Novotny et al., 1997; Simonson

et al., 1997). Another approach is to incorporate an implicit solvation term directly into the molecular mechanics force

field. For example, an excluded volume-implicit solvation

model can be used that assumes that the solvation free en-

ergy for each group or residue in the system is equal to the

calculated solvation free energy for that group in a small

model compound less the amount of solvation lost due to

solvent exclusion by the other atoms of the macromolecular

system (Lazaridis & Karplus, 1999).
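The weighted-sum estimate built from a continuum electrostatic term and a buried surface area term can be written down in a few lines. In the Python sketch below, the electrostatic energies and surface areas are taken as given (in practice they would come from a Poisson-Boltzmann solver and a surface-area program), and the weights and input numbers are hypothetical placeholders, not fitted values from any of the cited studies.

```python
def binding_estimate(u_complex, u_protein, u_ligand,
                     sasa_complex, sasa_protein, sasa_ligand,
                     w_elec, w_nonpolar):
    """Return (total, electrostatic difference, buried area).

    Electrostatic term: U_PL - U_P - U_L, each computed in continuum solvent.
    Nonpolar term: proportional to the solvent-accessible surface area buried
    on complex formation, counted here as favorable (negative).
    """
    d_elec = u_complex - u_protein - u_ligand
    buried = sasa_protein + sasa_ligand - sasa_complex
    total = w_elec * d_elec - w_nonpolar * buried
    return total, d_elec, buried


# Illustrative inputs: energies in kcal/mol, areas in square angstroms.
total, d_elec, buried = binding_estimate(
    u_complex=-120.0, u_protein=-100.0, u_ligand=-15.0,
    sasa_complex=9500.0, sasa_protein=9000.0, sasa_ligand=800.0,
    w_elec=1.0, w_nonpolar=0.025,  # hypothetical weights
)
print(f"electrostatic: {d_elec:.1f}, buried area: {buried:.0f}, total: {total:.2f}")
```

A ligand internal energy term, where used, would enter as one more weighted summand.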

The only rigorous way to predict relative or absolute

binding free energies is an FEP calculation with explicit solvent. FEP MD is difficult due to problems with sampling

and the accuracy of the empirical potential used, but it does

allow the free energy contributions to be examined on an

atomic level (Beveridge & DiCapua, 1989; Brooks et al.,

1988; Kollman, 1993; Straatsma & McCammon, 1992).

Furthermore, component analysis of the results can aid in

understanding the relative contribution of various parts of

the system (i.e., of the ligand or protein) to the free energy

(Boresch et al., 1994; Boresch & Karplus, 1995; Gao et al.,

1989). There are a number of approaches for calculating rel-

ative free energies with explicit solvent present (Pearlman,

1994), but all are computer intensive, and for ligand bind-

ing, the calculations are limited to very small changes in the

ligand structure. The standard free energy cycle for determining the relative binding free energy ($\Delta\Delta G_{bind}$) for ligand 1 (L1) versus ligand 2 (L2) is

$$
\begin{array}{ccc}
L1(w) & \xrightarrow{\;\Delta G_{bind}(L1)\;} & L1(p) \\
\Big\updownarrow {\scriptstyle \Delta G_{sol}^{w}} & & \Big\updownarrow {\scriptstyle \Delta G_{sol}^{p}} \\
L2(w) & \xrightarrow{\;\Delta G_{bind}(L2)\;} & L2(p)
\end{array}
$$

where, for example, L1(w) is ligand 1 in water, L1(p) is ligand 1 bound to the protein, and $\Delta G_{sol}^{w}$ is the relative free energy of solvation of L1 vs. L2 in water. This cycle results in $\Delta\Delta G_{bind} = \Delta G_{bind}(L1) - \Delta G_{bind}(L2) = \Delta G_{sol}^{p} - \Delta G_{sol}^{w}$. (If L1 is taken as a nil particle, in principle, this cycle would yield the absolute binding free energy for L2.) With the FEP method, the free energies associated with the two nonphysical paths corresponding to mutating L1 into L2 (in water and bound to protein, respectively) are calculated. Often the simulation is carried out by varying a mixing constant $\lambda$ from 0 to 1 in set increments, where the potential energy for the system is represented as $(1 - \lambda)$ of the potential energy for L1 and $\lambda$ of that for L2 [i.e., $V(r^N, \lambda) = (1 - \lambda) V_{L1}(r^N) + \lambda V_{L2}(r^N)$]. Mutating one ligand into another almost always involves the creation and annihilation of atoms, as well as the redistribution of molecular charges, processes that converge very slowly during a simulation, even on current computers.
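The staged variation of the mixing constant can be demonstrated on a toy problem. The Python sketch below is an assumption-laden illustration, not a ligand simulation: L1 and L2 are replaced by two hypothetical one-dimensional harmonic wells, sampling is done by drawing directly from the known Boltzmann distribution instead of running MD or MC, and each window is evaluated with the standard exponential-averaging estimator, which is not spelled out in the text above. For this model the exact free energy difference equals the energy offset between the wells (1.0 in the units used), so the estimate can be checked.

```python
import math
import random

KT = 0.593  # kcal/mol, roughly room temperature


def v_l1(x, k=10.0):
    """Toy potential standing in for ligand 1."""
    return 0.5 * k * x * x


def v_l2(x, k=10.0, shift=0.5, offset=1.0):
    """Toy potential standing in for ligand 2 (shifted and raised well)."""
    return 0.5 * k * (x - shift) ** 2 + offset


def fep_free_energy(n_windows=10, n_samples=5000, k=10.0, shift=0.5, seed=0):
    """Sum exponential-average estimates over windows from lambda = 0 to 1."""
    rng = random.Random(seed)
    sigma = math.sqrt(KT / k)
    d_lambda = 1.0 / n_windows
    total = 0.0
    for w in range(n_windows):
        lam = w * d_lambda
        # For these wells the mixed potential (1 - lam) * V_L1 + lam * V_L2 is
        # itself harmonic and centered at lam * shift, so it is sampled exactly.
        acc = 0.0
        for _ in range(n_samples):
            x = rng.gauss(lam * shift, sigma)
            # Energy change on stepping lambda by d_lambda at fixed coordinates.
            dv = d_lambda * (v_l2(x, k, shift) - v_l1(x, k))
            acc += math.exp(-dv / KT)
        total += -KT * math.log(acc / n_samples)
    return total


estimate = fep_free_energy()
print(f"estimated free energy change: {estimate:.3f} (exact value 1.0)")
```

In a real ligand mutation the two end states overlap far less than these wells do, which is one way to see why many windows and long sampling are needed.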

In practice, for drug design applications, large sets of candidate ligand molecules that differ considerably need to

be compared. In an attempt to circumvent this problem,

Aqvist and co-workers have developed a semi-empirical

method for calculating absolute-binding free energies from

MD simulations of the two physical paths (Aqvist et al.,

1994; Hansson et al., 1998; Marelius et al., 1998). In this

semi-empirical approach, a linear approximation of the po-

lar and nonpolar free energy contributions is estimated from

averages of MD simulations of the ligand in water and of

the ligand-protein complex in water. That is,

$\Delta G_{bind} = \frac{1}{2}\langle \Delta V^{el}_{L-s} \rangle + \alpha \langle \Delta V^{vdW}_{L-s} \rangle$, where $V^{el}_{L-s}$ is, for example, the solute-solvent electrostatic term, $\Delta$ refers to the difference between the protein and water environments for the ligand, and $\alpha$ is a parameter determined by empirical calibration with a series of ligand-protein complexes with known binding affinities.
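Once the two MD simulations have been run, the linear estimate reduces to simple arithmetic on averaged energies. The following Python sketch assumes that per-frame ligand-surroundings electrostatic and vdW energies have already been extracted from the bound and free trajectories; the function name, the sample numbers, and the value of the vdW coefficient are hypothetical and chosen only to show the calculation.

```python
from statistics import mean


def lie_binding_free_energy(el_bound, el_free, vdw_bound, vdw_free, alpha):
    """Linear estimate: half the change in the mean electrostatic energy plus
    alpha times the change in the mean vdW energy (bound minus free)."""
    d_el = mean(el_bound) - mean(el_free)
    d_vdw = mean(vdw_bound) - mean(vdw_free)
    return 0.5 * d_el + alpha * d_vdw


# Illustrative per-frame energies in kcal/mol (not from a real simulation).
el_bound = [-52.0, -50.0, -48.0]
el_free = [-41.0, -40.0, -39.0]
vdw_bound = [-31.0, -30.0, -29.0]
vdw_free = [-16.0, -15.0, -14.0]

dg = lie_binding_free_energy(el_bound, el_free, vdw_bound, vdw_free, alpha=0.16)
print(f"estimated binding free energy: {dg:.2f} kcal/mol")
```

Only the two physical end states are simulated, so ligands that differ considerably can be compared without defining a mutation path between them.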

New approaches to address some of the problems associ-

ated with sampling in standard FEP calculations are also be-

ing developed (Cieplak & Kollman, 1996; Gerber et al.,

1993; Gilson et al., 1997b; Guo et al., 1998; Liu et al., 1996;

Tidor, 1993). One such approach by Brooks and co-workers

(Guo et al., 1998) involves an extended Hamiltonian

method whereby the mixing parameter $\lambda$ is treated as a dynamic variable that is propagated (as if it were a particle) along with the atomic coordinates for the system according to Newton's equations of motion. A series of related ligands can be simultaneously simulated using a set of $\lambda$s, with the

variant parts of the ligand interacting with the target struc-

ture, but not with each other, to calculate relative binding

free energies. Since this allows for more efficient sampling,

the calculations are faster and larger differences in ligand

structures can be examined. Future improvements in the

force fields likely will involve the inclusion of polarization

(Liu et al., 1998) and should also lead to more accurate

binding energy calculations.

5. Summation and future outlook

The greatest successes of computer-aided structure-based drug design to date are the HIV-1 protease inhibitors that recently have been approved by the United States Food and Drug Administration and reached the market (Wlodawer & Vondrasek, 1998). With the development of new computational drug design technologies and their use in connection with combinatorial chemistry, there promise to be many

more successes. Improved scoring functions, faster comput-

ers, and better database storage methods will aid in the pro-

cess. These improvements will be particularly relevant to the

areas of virtual library screening and quantitative structure-

activity relationships (QSAR). QSAR approaches, which are

used by all pharmaceutical companies, involve the statistical

analysis of a set of properties or descriptors for a series of bi-

ologically active molecules in order to predict the activity of

additional compounds. QSAR methods that also take into ac-

count available structural information on the protein, as well

as the ligands, are now being developed (So & Karplus, 1999), and represent a way of systematically taking into account all information available for a given pharmaceutical

target to predict binding of new compounds. Expert systems

for organic synthesis, such as LHASA (Corey et al., 1992;

Long & Kappos, 1994) or WODCA (Fick et al., 1995), may

be used either to map out potential synthetic routes or possi-

bly to assess the ease or feasibility of synthesis for a set of

compounds. The construction of large virtual libraries based

on available chemistry or a set of existing combinatorial scaf-

folds and the use of the structure of a macromolecular target

to screen computationally will also be a major focus of future

drug discovery efforts.

Acknowledgments

The author thanks Juan C. Alvarez, Bert E. Thomas, and

Paul D. Lyne for helpful discussions and Erik Evensen,

Ryan Putzer, and Martin Karplus for allowing the discus-

sion of results prior to publication.

References

Ajay, & Murcko, M. A. (1995). Computational methods to predict binding free energy in ligand-receptor complexes. J Med Chem 38, 4953–4967.

Allen, K. N., Bellamacina, C. R., Ding, X. C., Jeffery, C. J., Mattos, C., Petsko, G. A., & Ringe, D. (1996). An experimental approach to mapping the binding surfaces of crystalline proteins. J Phys Chem 100, 2605–2611.

Andrade, M. A., & Sander, C. (1997). Bioinformatics: from genome data to biological knowledge. Curr Opin Biotechnol 8, 675–683.

Aqvist, J., Medina, C., & Samuelsson, J. E. (1994). New method for predicting binding-affinity in computer-aided drug design. Protein Eng 7, 385–391.

Bemis, G. W., & Murcko, M. A. (1996). The properties of known drugs. 1. Molecular frameworks. J Med Chem 39, 2887–2893.

Beveridge, D. L., & DiCapua, F. M. (1989). Free energy via molecular simulation: applications to chemical and biomolecular systems. Annu Rev Biophys Biophys Chem 18, 431–492.

Bohacek, R. S., & McMartin, C. (1994). Multiple highly diverse structures complementary to enzyme binding-sites – results of extensive application of a de-novo design method incorporating combinatorial growth. J Am Chem Soc 116, 5560–5571.

Bohacek, R. S., & McMartin, C. (1995). De-novo design of highly diverse structures complementary to enzyme binding-sites – application to thermolysin. Comput Aided Mol Des 589, 82–97.

Bohm, H. J. (1992). Ludi – rule-based automatic design of new substituents for enzyme-inhibitor leads. J Comput Aided Mol Des 6, 593–606.

Bohm, H. J. (1994a). The development of a simple empirical scoring function to estimate the binding constant for a protein ligand complex of known 3-dimensional structure. J Comput Aided Mol Des 8, 243–256.

Bohm, H. J. (1994b). On the use of Ludi to search the Fine Chemicals Directory for ligands of proteins of known 3-dimensional structure. J Comput Aided Mol Des 8, 623–632.

Bohm, H. J., & Klebe, G. (1996). What can we learn from molecular recognition in protein-ligand complexes for the design of new drugs. Angew Chem Int Ed Engl 35, 2588–2614.

Boobbyer, D., Goodford, P., McWhinnie, P., & Wade, R. (1989). New hydrogen-bond potentials for use in determining energetically favorable binding sites on molecules of known structure. J Med Chem 32, 1083–1094.

Boresch, S., & Karplus, M. (1995). The meaning of component analysis – decomposition of the free-energy in terms of specific interactions. J Mol Biol 254, 801–807.

Boresch, S., Archontis, G., & Karplus, M. (1994). Free-energy simulations – the meaning of the individual contributions from a component analysis. Proteins 20, 25–33.

Borman, S. (1997). Combinatorial chemistry. Chem Eng News 75, 43–53.

Brooks, B. R., Bruccoleri, R. E., Olafson, B. D., States, D. J., Swaminathan, S., & Karplus, M. (1983). CHARMM: a program for macromolecular energy, minimization, and dynamics calculations. J Comput Chem 4, 187–217.

Brooks, C. L., Karplus, M., & Pettitt, B. M. (1988). Proteins: A Theoretical Perspective of Dynamics, Structure, and Thermodynamics. New York: John Wiley & Sons.

Buzbee, B. (1993). Workstation clusters rise and shine (Computing in Science: Perspective). Science 261, 852–853.

Caflisch, A. (1996). Computational combinatorial ligand design: application to human alpha-thrombin. J Comput Aided Mol Des 10, 372–396.

Caflisch, A., & Karplus, M. (1995). Computational combinatorial chemistry for de novo ligand design: review and assessment. Perspect Drug Discov Des 3, 51–84.

Caflisch, A., Miranker, A., & Karplus, M. (1993). Multiple copy simultaneous search and construction of ligands in binding sites. J Med Chem 36, 2142–2167.

Chen, Q., Shafer, R. H., & Kuntz, I. D. (1997). Structure-based discovery of ligands targeted to the RNA double helix. Biochemistry 36, 11402–11407.

Cieplak, P., & Kollman, P. A. (1996). A technique to study molecular recognition in drug design: preliminary application of free energy derivatives to inhibition of a malarial cysteine protease. J Mol Recognit 9, 103–112.

Collins, F. S., Patrinos, A., Jordan, E., Chakravarti, A., Gesteland, R., Walters, L., Fearon, E., Hartwell, L., Langley, C. H., Mathies, R. A., Olson, M., Pawson, A. J., Pollard, T., Williamson, A., Wold, B., Buetow, K., Branscomb, E., Capecchi, M., Church, G., Garner, H., Gibbs, R. A., Hawkins, T., Hodgson, K., Knotek, M., Meisler, M., Rubin, G. M., Smith, L. M., Smith, R. F., Westerfield, M., Clayton, E. W., Fisher, N. L., Lerman, C. E., McInerney, J. D., Nebo, W., Press, N., & Valle, D. (1998). New goals for the US Human Genome Project: 1998–2000. Science 282, 682–689.

Corey, E. J., Long, A. K., Lotto, G. I., & Rubenstein, S. D. (1992). Computer-assisted synthetic analysis – quantitative assessment of transform utilities. Recl Trav Chim Pays-Bas J R Nether Chem Soc 111, 304–309.

Cornell, W. D., Cieplak, P., Bayly, C. I., Gould, I. R., Merz, K. M., Ferguson, D. M., Spellmeyer, D. C., Fox, T., Caldwell, J. W., & Kollman, P. A. (1995). A 2nd generation force-field for the simulation of proteins, nucleic-acids, and organic-molecules. J Am Chem Soc 117, 5179–5197.

Couzin, J. (1998). Supercomputing – computer experts urge new federal initiative. Science 281, 762.

Davis, M. E., Madura, J. D., Luty, B. A., & McCammon, J. A. (1991). Electrostatics and diffusion of molecules in solution – simulations with the University-of-Houston-Brownian Dynamics program. Comput Phys Commun 62, 187–197.

DeWitte, R., & Shakhnovich, E. (1996). SMoG: de novo design method based on simple, fast, and accurate free energy estimates. 1. Methodology and supporting evidence. J Am Chem Soc 118, 11733–11744.

DeWitte, R., Ishchenko, A., & Shakhnovich, E. (1997). SMoG: de novo design method based on simple, fast, and accurate free energy estimates. 2. Case studies on molecular design. J Am Chem Soc 119, 4608–4617.

Drews, J. (1996). Genomic sciences and the medicine of tomorrow. Nat Biotechnol 14, 1516–1518.

Dunbrack, R. L., Gerloff, D. L., Bower, M., Chen, X. W., Lichtarge, O., & Cohen, F. E. (1997). Meeting review: The Second Meeting on the Critical Assessment of Techniques for Protein Structure Prediction (CASP2), Asilomar, California, December 13–16, 1996. Fold Des 2, R27–R42.

Eisen, M. B., Wiley, D. C., Karplus, M., & Hubbard, R. E. (1994). HOOK: a program for finding novel molecular architectures that satisfy the chemical and steric requirements of a macromolecule binding site. Proteins 19, 199–221.

Eisenberg, D., & McLachlan, A. D. (1986). Solvation energy in protein folding and binding. Nature 319, 199–203.

Eldridge, M. D., Murray, C. W., Auton, T. R., Paolini, G. V., & Mee, R. P. (1997). Empirical scoring functions. 1. The development of a fast empirical scoring function to estimate the binding affinity of ligands in receptor complexes. J Comput Aided Mol Des 11, 425–445.

Evensen, E., Joseph-McCarthy, D., & Karplus, M. (1997). MCSSv2. Cambridge: Harvard University.

Fedorov, A. A., Joseph-McCarthy, D., Fedorov, E., Sirakova, D., Graf, I., & Almo, S. C. (1996). Ionic interactions in crystalline bovine pancreatic ribonuclease A. Biochemistry 35, 15962–15979.

Fick, R., Ihlenfeldt, W. D., & Gasteiger, J. (1995). Computer-assisted design of syntheses for heterocyclic-compounds. Heterocycles 40, 993–1007.

Friedman, S. H., Ganapathi, P. S., Rubin, Y., & Kenyon, G. L. (1998). Optimizing the binding of fullerene inhibitors of the HIV-1 protease through predicted increases in hydrophobic desolvation. J Med Chem 41, 2424–2429.

Froloff, N., Windemuth, A., & Honig, B. (1997). On the calculation of binding free energies using continuum methods: application to MHC class I protein-peptide interactions. Protein Sci 6, 1293–1301.

Gao, J., Kuczera, K., Tidor, B., & Karplus, M. (1989). Hidden thermodynamics of mutant proteins – a molecular-dynamics analysis. Science 244, 1069–1072.

Gerber, P. R., Mark, A. E., & van Gunsteren, W. F. (1993). An approximate but efficient method to calculate free-energy trends by computer-simulation – application to dihydrofolate-reductase inhibitor complexes. J

Gillet, V. J., Myatt, G., Zsoldos, Z., & Johnson, A. P. (1995). SPROUT, HIPPO and CAESA: tools for de novo structure generation and estimation of synthetic accessibility. Perspect Drug Discov Des 3, 34–50.

Gilson, M. K., Sharp, K. A., & Honig, B. H. (1988). Calculating the electrostatic potential of molecules in solution: method and error assessment. J Comput Chem 9, 327–335.

Gilson, M. K., Given, J. A., Bush, B. L., & McCammon, J. A. (1997a). The statistical-thermodynamic basis for computation of binding affinities: a critical review. Biophys J 72, 1047–1069.

Gilson, M. K., Given, J. A., & Head, M. S. (1997b). A new class of models for computing receptor-ligand binding affinities. Chem Biol 4, 87–92.

Goodford, P. (1985). A computational procedure for determining energetically favorable binding sites on biologically important macromolecules. J Med Chem 28, 849–857.

Goodsell, D. S., Morris, G. M., & Olson, A. J. (1996). Automated docking of flexible ligands: applications of AutoDock. J Mol Recognit 9, 1–5.

Grootenhuis, P. D. J., & Karplus, M. (1996). Functionality map analysis of the active site cleft of human thrombin. J Comput Aided Mol Des 10, 1–10.

Grootenhuis, P. D. J., & van Galen, P. J. M. (1995). Correlation of binding affinities with nonbonded interaction energies of thrombin-inhibitor complexes. Acta Crystallogr D 51, 560–566.

Grootenhuis, P. D. J., & Van Helden, S. P. (1994). Rational approaches towards protease inhibition: predicting the binding of thrombin inhibitors. In G. Wipff (Ed.), Computational Approaches in Supramolecular Chemistry (pp. 137–149). Dordrecht: Kluwer Academic Press.

Grootenhuis, P. D. J., Roe, D. C., Kollman, P. A., & Kuntz, I. D. (1994). Finding potential DNA-binding compounds by using molecular shape. J Comput Aided Mol Des 8, 731–750.

Gschwend, D. A., Sirawaraporn, W., Santi, D. V., & Kuntz, I. D. (1997). Specificity in structure-based drug design: Identification of a novel, selective inhibitor of Pneumocystis carinii dihydrofolate reductase. Proteins 29, 59–67.

Guo, Z., Brooks, C. L., & Kong, X. (1998). Efficient and flexible algorithm for free energy calculations using the lambda-dynamics approach. J Phys Chem B 102, 2032–2036.

Halgren, T. A. (1996). Merck Molecular Force Field. I. Basis, form, scope, parameterization, and performance of MMFF94. J Comput Chem 17, 490–519.

Hansson, T., Marelius, J., & Aqvist, J. (1998). Ligand binding affinity prediction by linear interaction energy methods. J Comput Aided Mol Des 12, 27–35.

Hoffman, L. R., Kuntz, I. D., & White, J. M. (1997). Structure-based identification of an inducer of the low-pH conformational change in the influenza virus hemagglutinin: irreversible inhibition of infectivity. J Virol 71, 8808–8820.

Holloway, M. K., Wai, J. M., Halgren, T. A., Fitzgerald, P. M. D., Vacca, J. P., Dorsey, B. D., Levin, R. B., Thompson, W. J., Chen, L. J., deSolms, S. J., Gaffin, N., Ghosh, A. K., Giuliani, E. A., Graham, S. L., Guare, J. P., Hungate, R. W., Lyle, T. A., Sanders, W. M., Tucker, T. J., Wiggins, M., Wiscount, C. M., Woltersdorf, O. W., Young, S. D., Darke, P. L., & Zugay, J. A. (1995). A priori prediction of activity for HIV-1 protease inhibitors employing energy minimization in the active site. J Med Chem 38, 305–317.

Houston, J. G., & Banks, M. (1997). The chemical-biological interface: developments in automated and miniaturised screening technology. Curr Opin Biotechnol 8, 734–740.

Jain, A. N. (1996). Scoring noncovalent protein-ligand interactions: a continuous differentiable function tuned to compute binding affinities. J

Joseph-McCarthy, D., Fedorov, A. A., & Almo, S. C. (1996). Comparison of experimental and computational functional group mapping of an RNase A structure: implications for computer-aided drug design. Protein Eng 9, 773–780.

Joseph-McCarthy, D., Hogle, J. M., & Karplus, M. (1997). Use of the multiple copy simultaneous search (MCSS) method to design a new class of picornavirus capsid binding drugs. Proteins 29, 32–58.

Kingsbury, D. T. (1997). Bioinformatics in drug discovery. Drug Dev Res 41, 120–128.

Kollman, P. (1993). Free energy calculations: applications to chemical and biochemical phenomena. Chem Rev 93, 2395–2417.

Koppensteiner, W. A., & Sippl, M. J. (1998). Knowledge-based potentials – back to the roots. Biochemistry Moscow 63, 247–252.

Kramer, B., Rarey, M., & Lengauer, T. (1997). CASP2 experiences with docking flexible ligands using FLEXX. Proteins (suppl. 1), 221–225.

Kuntz, I. (1992). Structure-based strategies for drug design and discovery. Science 257, 1078–1082.

Kuntz, I. D., Blaney, J. M., Oatley, S. J., Langridge, R., & Ferrin, T. E. (1982). A geometric approach to macromolecule-ligand interactions. J Mol Biol 161, 269–288.

Kurinov, I. V., & Harrison, R. W. (1994). Prediction of new serine proteinase inhibitors. Nature Struct Biol 1, 735–743.

Lauri, G., & Bartlett, P. A. (1994). Caveat – a program to facilitate the design of organic-molecules. J Comput Aided Mol Des 8, 51–66.

Lazaridis, T., & Karplus, M. (1999). Effective energy function for proteins in solution. Proteins 35, 133–152.

Leach, A. R. (1996). Molecular modelling: principles and applications. Essex: Addison Wesley Longman Ltd.

Li, R. S., Chen, X. W., Gong, B. Q., Selzer, P. M., Li, Z., Davidson, E., Kurzban, G., Miller, R. E., Nuzum, E. O., McKerrow, J. H., Fletterick, R. J., Gillmor, S. A., Craik, C. S., Kuntz, I. D., Cohen, F. E., & Kenyon, G. L. (1996). Structure-based design of parasitic protease inhibitors. Bioorg Med Chem 4, 1421–1427.

Lii, J. H., & Allinger, N. L. (1991). The MM3 force-field for amides, polypeptides and proteins. J Comput Chem 12, 186–199.

Liu, H. Y., Mark, A. E., & van Gunsteren, W. F. (1996). Estimating the relative free energy of different molecular states with respect to a single reference state. J Phys Chem 100, 9485–9494.

Liu, Y. P., Kim, K., Berne, B. J., Friesner, R. A., & Rick, S. W. (1998). Constructing ab initio force fields for molecular dynamics simulations. J Chem Phys 108, 4739–4755.

Long, A. K., & Kappos, J. C. (1994). Computer-assisted synthetic analysis – performance of tactical combinations of transforms. J Chem Inf Comput Sci 34, 915–921.

Lorber, D. M., & Shoichet, B. K. (1998). Flexible ligand docking using conformational ensembles. Protein Sci 7, 938–950.

MacKerell, A. D., Bashford, D., Bellott, M., Dunbrack, R. L., Evanseck, J. D., Field, M. J., Fischer, S., Gao, J., Guo, H., Ha, S., Joseph-McCarthy, D., Kuchnir, L., Kuczera, K., Lau, F. T. K., Mattos, C., Michnick, S., Ngo, T., Nguyen, D. T., Prodhom, B., Reiher, W. E., Roux, B., Schlenkrich, M., Smith, J. C., Stote, R., Straub, J., Watanabe, M., Wiorkiewicz-Kuczera, J., Yin, D., & Karplus, M. (1998). All-atom empirical potential for molecular modeling and dynamics studies of proteins. J Phys Chem B 102, 3586–3616.

Madura, J. D., Briggs, J. M., Wade, R. C., Davis, M. E., Luty, B. A., Ilin, A., Antosiewicz, J., Gilson, M. K., Bagheri, B., Scott, L. R., & McCammon, J. A. (1995). Electrostatics and diffusion of molecules in solution – simulations with the University-of-Houston Brownian Dynamics program. Comput Phys Commun 91, 57–95.

Makino, S., & Kuntz, I. D. (1997). Automated flexible ligand docking method and its application for database search. J Comput Chem 18, 1812–1825.

Marelius, J., Hansson, T., & Aqvist, J. (1998). Calculation of ligand binding free energies from molecular dynamics simulations. Int J Quantum Chem 69, 77–88.

Maxwell, D. S., Tirado-Rives, J., & Jorgensen, W. L. (1995). A comprehensive study of the rotational energy profiles of organic-systems by ab-initio MO theory, forming a basis for peptide torsional parameters. J Comput Chem 16, 984–1010.

McMartin, C., & Bohacek, R. S. (1997). QXP: Powerful, rapid computer algorithms for structure-based drug design. J Comput Aided Mol Des 11, 333–344.




Meng, E. C., Shoichet, B. K., & Kuntz, I. D. (1992). Automated docking

with grid-based energy evaluation.J Comput Chem 13, 505524.

Miranker, A., & Karplus, M. (1991). Functionality maps of binding sites: a

multiple copy simultaneous search method. Proteins 11, 2934.

Miranker, A., & Karplus, M. (1995). An automated method for dynamic

ligand design. Proteins 23, 472490.

Moon, J., & Howe, W. (1991). Computer design of bioactive molecules: a method for receptor-based de novo ligand design. Proteins 11, 314–328.

Morris, G. M., Goodsell, D. S., Huey, R., & Olson, A. J. (1996). Distributed automated docking of flexible ligands to proteins: parallel applications of AutoDock 2.4. J Comput Aided Mol Des 10, 293–304.

Neria, E., Fischer, S., & Karplus, M. (1996). Simulation of activation energies in molecular systems. J Chem Phys 105, 1902–1921.

Nicholls, A., & Honig, B. (1991). A rapid finite-difference algorithm, utilizing successive over-relaxation to solve the Poisson-Boltzmann equation. J Comput Chem 12, 435–445.

Novotny, J., Bruccoleri, R. E., Davis, M., & Sharp, K. A. (1997). Empirical free energy calculations: a blind test and further improvements to the method. J Mol Biol 268, 401–411.

Onuchic, J. N., Luthey-Schulten, Z., & Wolynes, P. G. (1997). Theory of protein folding: the energy landscape perspective. Annu Rev Phys Chem 48, 545–600.

Ooi, T., Oobatake, M., Nemethy, G., & Scheraga, H. A. (1987). Accessible surface areas as a measure of the thermodynamic parameters of hydration of peptides. Proc Natl Acad Sci USA 84, 3086–3090.

Pearlman, D. A. (1994). A comparison of alternative approaches to free energy calculations. J Phys Chem 98, 1487–1493.

Pearlman, D. A., & Murcko, M. (1996). CONCERTS: dynamic connection of fragments as an approach to de novo ligand design. J Med Chem 39, 1651–1663.

Pearlman, R. S. (1987). Rapid generation of high quality approximate 3D molecular structures. Chem Des Aut News 2, 1–6.

Petsko, G. A. (1996). For medicinal purposes. Nature 384, 7–9.

Rarey, M., Wefing, S., & Lengauer, T. (1996). Placement of medium-sized molecular fragments into active sites of proteins. J Comput Aided Mol Des 10, 41–54.

Ricketts, E. M., Bradshaw, J., Hann, M., Hayes, F., Tanna, N., & Ricketts, D. M. (1993). Comparison of conformations of small-molecule structures from the protein data-bank with those generated by Concord, Cobra, Chemdbs-3d, and Converter and those extracted from the Cambridge Structural Database. J Chem Inf Comput Sci 33, 905–925.

Ring, C., Sun, E., McKerrow, J., Lee, G., Rosenthal, P., Kuntz, I., & Cohen, F. (1993). Structure-based inhibitor design by using protein models for the development of antiparasitic agents. Proc Natl Acad Sci USA 90, 3583–3587.

Rose, J. R., & Craik, C. S. (1994). Structure-assisted design of nonpeptide human immunodeficiency virus-1 protease inhibitors. Am J Respir Crit Care Med 150, S176–S182.

Rost, B. (1998). Marrying structure and genomics. Structure 6, 259–263.

Rotstein, S. H., & Murcko, M. A. (1993a). Genstar: a method for de novo drug design. J Comput Aided Mol Des 7, 23–43.

Rotstein, S. H., & Murcko, M. A. (1993b). Groupbuild: a fragment-based method for de novo drug design. J Med Chem 36, 1700–1710.

Sharp, K. A., & Honig, B. (1990). Electrostatic interactions in macromolecules: theory and applications. Annu Rev Biophys Biophys Chem 19, 301–332.

Shoichet, B. K., Stroud, R. M., Santi, D. V., Kuntz, I. D., & Perry, K. M. (1993). Structure-based discovery of inhibitors of thymidylate synthase. Science 259, 1445–1450.

Shoichet, B. K., Leach, A. R., & Kuntz, I. D. (1999). Ligand solvation in molecular docking. Proteins 34, 4–16.

Shuker, S. B., Hajduk, P. J., Meadows, R. P., & Fesik, S. W. (1996). Discovering high-affinity ligands for proteins: SAR by NMR. Science 274, 1531–1534.

Simonson, T., Archontis, G., & Karplus, M. (1997). Continuum treatment of long-range interactions in free energy calculations. Application to protein-ligand binding. J Phys Chem B 101, 8349–8362.

So, S. S., & Karplus, M. (1999). A comparative study of ligand-receptor complex binding affinity prediction methods based on glycogen phosphorylase inhibitors. J Comput Aided Mol Des 13, 243–258.

Sprague, P. W. (1995). Automated chemical hypothesis generation and database searching with Catalyst. Perspect Drug Discov Des 3, 1–20.

Straatsma, T. P., & McCammon, J. A. (1992). Computational alchemy. Annu Rev Phys Chem 43, 407–435.

Thomas, B. E., IV, Joseph-McCarthy, D., & Alvarez, J. C. (1999). Pharmacophore-based molecular docking. In O. F. Güner (Ed.), Pharmacophore perception, development, and use in drug design. La Jolla: International University Press, in press.

Tidor, B. (1993). Simulated annealing on free-energy surfaces by a combined molecular-dynamics and Monte-Carlo approach. J Phys Chem 97, 1069–1073.

Verlinde, C. L. M. J., & Hol, W. G. J. (1994). Structure-based drug design: progress, results and challenges. Structure 2, 577–587.

von Itzstein, M., Dyason, J. C., Oliver, S. W., White, H. F., Wu, W. Y., Kok, G. B., & Pegg, M. S. (1996). A study of the active site of influenza virus sialidase: an approach to the rational design of novel anti-influenza drugs. J Med Chem 39, 388–391.

Wade, R., & Goodford, P. (1993). Further development of hydrogen bond functions for use in determining energetically favorable binding sites on molecules of known structure. 2. Ligand probe groups with the ability to form more than two hydrogen bonds. J Med Chem 36, 148–156.

Wade, R., Clark, K., & Goodford, P. (1993). Further development of hydrogen bond functions for use in determining energetically favorable binding sites on molecules of known structure. 1. Ligand probe groups with the ability to form two hydrogen bonds. J Med Chem 36, 140–147.

Welch, W., Ruppert, J., & Jain, A. N. (1996). Hammerhead: fast, fully automated docking of flexible ligands to protein binding sites. Chem Biol 3, 449–462.

Westhead, D. R., & Thornton, J. M. (1998). Protein structure prediction. Curr Opin Biotechnol 9, 383–389.

Wilson, E. K. (1997). Combinatorial chemistry. Chem Eng News 75, 24–25.

Wlodawer, A., & Vondrasek, J. (1998). Inhibitors of HIV-1 protease: a major success of structure-assisted drug design. Annu Rev Biophys Biomol Struct 27, 249–284.

Xu, G. Y., McDonagh, T., Yu, H. A., Nalefski, E. A., Clark, J. D., & Cumming, D. A. (1998). Solution structure and membrane interactions of the C2 domain of cytosolic phospholipase A(2). J Mol Biol 280, 485–500.
