Computational Approaches to Structure Based Ligand Design 1999 Pharmacology & Therapeutics
Pharmacology & Therapeutics 84 (1999) 179-191
0163-7258/99/$ - see front matter © 1999 Elsevier Science Inc. All rights reserved.
PII:S0163-7258(99)00031-5
Associate editor: E. Lolis
Computational approaches to structure-based ligand design
Diane Joseph-McCarthy*
Wyeth Research, Biological Chemistry Department, 87 CambridgePark Drive, Cambridge, MA 02140, USA
Abstract
The first computational structure-based drug design methods came into existence in the early 1980s and are, to an extent, still in their
infancy. There have been a few successes to date. With dramatic increases in computer speed, improved accuracy in ligand scoring functions,
and the advent of combinatorial chemistry, there promise to be many more. In addition, the virtual explosion in the amount of
available sequence and structural information has increased the need to develop these computational techniques to exploit this vast body
of information. In this review, recent advances in computational methods for database searching and docking, de novo drug design, and
estimation of ligand binding affinities are discussed. © 1999 Elsevier Science Inc. All rights reserved.
Keywords: Computer-aided drug design; De novo design; Database searching; Docking; Virtual combinatorial library screening; Binding affinity prediction
Abbreviations: FEP, free energy perturbation; HIV, human immunodeficiency virus; MC, Monte Carlo; MD, molecular dynamics; QSAR, quantitative
structure-activity relationships; vdW, van der Waals.
Contents
1. Introduction
2. Database searching and docking methods
3. Computational de novo drug design methods
3.1. Fragment positioning methods
3.2. Molecule growth methods
3.3. Fragment methods coupled to database searches
3.4. Virtual library construction and screening
4. Ligand-binding scoring functions
5. Summation and future outlook
Acknowledgments
References
* Tel.: 617-665-8933; fax: 617-665-8993.
E-mail address: [email protected] (D. Joseph-McCarthy)
1. Introduction
Structure-based drug design, or rational drug design, as it
is sometimes called, refers to the intricate process of using
the information contained in the three-dimensional structure
of a macromolecular target and of related ligand-target complexes
to design novel drugs for important human diseases.
Computational methods are required to extract all of the rel-
evant information from the available structures and to use it
in an efficient and intelligent manner to design improved
ligands for the target. Approximately 6000 drugs are
currently on the market (Comprehensive Medicinal
Chemistry Database, Release 94.1, available from MDL Information
Systems, Inc., San Leandro, CA, USA) (Bemis &
Murcko, 1996) for roughly 500 disease or molecular
targets (Drews, 1996). Due to genome sequencing projects,
the number of known sequences is increasing at a rapid rate
(Andrade & Sander, 1997). New target identification strate-
gies and associated bio-informatic technologies are being
developed to categorize this vast body of information (Col-
lins et al., 1998; Kingsbury, 1997). In particular, many peo-
ple are working on ways to try to predict the three-dimen-
sional structure of a protein from its one-dimensional amino
acid sequence (Dunbrack et al., 1997; Onuchic et al., 1997;
Westhead & Thornton, 1998). There is also a worldwide
effort in functional genomics to determine as many three-
dimensional structures of proteins as possible or to develop
computational approaches to cluster sequences into families
of related proteins and then select and solve the three-
dimensional structure of a representative sequence from each
family (Rost, 1998). As a result, in 10 years' time, there
should be a very large number of good homology models
and known structures for medically relevant targets. The
questions important for drug design then will be: What is
expected to bind to a given structure and how will this inter-
action change the structure? Computational methods are
needed to exploit the structural information to understand
specific molecular recognition events and to elucidate the
function of the target macromolecule (Fig. 1). This informa-
tion should ultimately lead to the design of small molecule
ligands for the target, which will block its normal function
and thereby act as improved drugs.
Most of the drugs currently on the market have been
found through large-scale random screening of compounds
for activity against a target, for which no three-dimensional
structural information was available. That is, thousands of
compounds (all of the compounds a company has in its
deck, for example) are screened for activity. High-through-
put robotic screening methods (Houston & Banks, 1997) ac-
celerate this process. In the end, it is hoped that at least a
small number of compounds will be active against the target.
A good lead compound is active at concentrations of 10
μM or less (Verlinde & Hol, 1994).
As the first step in structure-based drug design (Fig. 2),
the three-dimensional structure of the target macromolecule
(protein or nucleic acid) is determined by X-ray crystallog-
raphy or NMR. In a few instances, a homology model (Ring
et al., 1993) has been used as the starting point, but, in gen-
eral, the more accurate the structural information, the more
predictive the computational results will be. Once a lead
compound has been found by some means, an iterative pro-
cess begins that involves solving the three-dimensional
structure of the lead compound bound to the target, examin-
ing that structure and characterizing the types of interac-
tions the bound ligand makes, and using computational
methods to design improvements to the compound. This last
step, designing improvements to existing lead compounds,
is the point at which computational methods have
played an important role in the drug discovery process during
the last 5-10 years. A small subset of the most promising
proposed compounds is then synthesized and tested.
For those compounds that do have improved activity, the
three-dimensional structure of the improved compound
bound to the target is determined. There are two problems
with using screening to find an initial lead compound fol-
lowed by structure-based optimization of that compound:
(1) if the initial compound does not already exist, it will
never be found; and (2) in this process, a great deal of time
and effort goes into refining a few lead compounds, and
thereby many of the resulting drug candidates for a given
target are chemically similar to one another. More recently,
pharmaceutical companies have used combinatorial chemistry,
either in house or by contracting out to smaller technology
companies, to synthesize large numbers of new compounds
simultaneously (Borman, 1997; Wilson, 1997). In
combinatorial chemistry, libraries or mixtures of compounds
are simultaneously synthesized from all possible
combinations of up to hundreds of molecular fragments.
The newer computational methods are aimed at using the
information contained in the three-dimensional structure of
the unliganded target to design entirely new lead com-
pounds de novo, as well as to construct large virtual combi-
natorial libraries of compounds that then can be screened
computationally before going to the effort and expense of
actually synthesizing and testing them. Even after many cy-
cles of the structure-based design process, when a compound
that binds to the target with a very high level of activity (typ-
ically at nanomolar concentrations) has been developed, it is
still a long way from being a drug on the market. The com-
pound still has to pass through animal and clinical trials,
where factors that have not been considered, such as toxicity,
bioavailability, and resistance, often determine its fate. There
is now a greater emphasis on incorporating some of these
factors in the initial screening and optimization process that
leads to a drug. On average, it can take 15 years and 350-500
million dollars for a drug to reach the market
(http://www.lilly.com/company/about/highlights.html) (Petsko, 1996). The
computational methods that will be described in this review
are expected to accelerate and reduce the cost of the drug
discovery process. This approach is now feasible due to dramatic
increases in computer power (Buzbee, 1993; Couzin,
1998), developments in the computational methodologies,
and improvements in the accuracy of the empirical energy
functions (Cornell et al., 1995; Halgren, 1996; Lii & Allinger,
1991; MacKerell et al., 1998; Maxwell et al., 1995)
used to model atomic interactions in large biological systems.

Fig. 1. Sequence information can lead to enhanced target selection and
structure prediction. Structural information about a given macromolecular
target leads to a better understanding of its specific function and enables
the design of small molecule ligands that can bind to the target. An α-carbon
trace of the X-ray structure of RNase A with formate bound in the
active site is shown (Fedorov et al., 1996).

Three general areas of computational drug design will be
discussed: database searching and docking methods, de
novo drug design methods, and ligand scoring functions.
This article is not intended to give an exhaustive review of
all available drug design algorithms and related programs,
but rather to illustrate the general concepts and the capabili-
ties of the existing technology. To this end, within each spe-
cific category, one or two methods will be described in
some detail; often these will be the methods with which the
author has the most familiarity.
2. Database searching and docking methods
The ability to rapidly and accurately dock large numbers
of small molecules into the binding site of a target macro-
molecule, such that the compounds are rank-ordered with
respect to their goodness of fit, is a key component of lead
generation in structure-based drug design (Kuntz, 1992).
One of the older and more widely used computational dock-
ing methods is the program DOCK (Kuntz et al., 1982;
Meng et al., 1992; Shoichet et al., 1993), which has been
and continues to be developed by Kuntz and co-workers at
the University of California, San Francisco and elsewhere.
DOCK systematically attempts to fit each compound from a
database into the binding site of the target structure such
that three or more of the atoms in the database molecule
overlap with a set of predefined site points (or a clique) in
the target binding site. The default method for site point
generation involves creating an inverse surface of the bind-
ing site. This is defined by the set of overlapping spheres
that fill the binding site and touch the molecular surface at
only two points. The sphere centers (for all spheres with radii
within a specified range) are used as site points. Crystallographic
water molecules or experimental positions of
known ligand atoms are also often taken as site points. A
site point can be assigned a color that specifies the type of
atom that it is allowed to match, and it can be required that
at least one site point from a subset, or a critical cluster, be
matched (see Fig. 3 for an example).
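The matching step described above can be illustrated with a minimal sketch. It is not the DOCK implementation: it replaces clique detection with a brute-force enumeration, and the function name `find_matches`, the tolerance `tol`, and the toy coordinates are all assumptions made for illustration. It pairs ligand atoms with site points and keeps a pairing only when every inter-atom distance agrees with the corresponding inter-site-point distance.

```python
import math
from itertools import combinations, permutations


def find_matches(atoms, site_points, tol=0.5, size=3):
    """Enumerate pairings of `size` ligand atoms with `size` site points.

    A pairing is kept when all internal distances agree within `tol`.
    This is a brute-force stand-in for a clique search (an assumption,
    not the algorithm used by the actual program).
    """
    matches = []
    for atom_idx in combinations(range(len(atoms)), size):
        for site_idx in permutations(range(len(site_points)), size):
            consistent = all(
                abs(
                    math.dist(atoms[atom_idx[a]], atoms[atom_idx[b]])
                    - math.dist(site_points[site_idx[a]], site_points[site_idx[b]])
                )
                <= tol
                for a, b in combinations(range(size), 2)
            )
            if consistent:
                matches.append(list(zip(atom_idx, site_idx)))
    return matches


# Hypothetical coordinates (in angstroms) for a three-atom ligand and four site points.
ligand_atoms = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
sites = [(1.0, 1.0, 1.0), (4.0, 1.0, 1.0), (1.0, 5.0, 1.0), (10.0, 10.0, 10.0)]
matches = find_matches(ligand_atoms, sites, tol=0.1)
```

Each surviving pairing would then be used to orient the ligand in the site and pass it to scoring. Distance agreement alone cannot distinguish a pairing from its mirror image, so a real implementation needs an additional chirality check.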
Often, the Available Chemicals Database is screened be-
cause individual compounds within this database are com-
mercially available. The database can be obtained in a for-
mat that is searchable by DOCK. The three-dimensional
structures of compounds in the database (Ricketts et al.,
1993) have typically been generated by the program CONCORD
(Tripos Associates, 1995, St. Louis, MO, USA)
(Pearlman, 1987), which uses a combination of geometry
rules and optimization procedures¹ to select the lowest energy
conformer of the molecule for inclusion in the database.
Each match or docking of a molecule is scored on a
grid throughout the binding site of the macromolecular target
using precalculated values for the protein part of the interaction
energy. A number of different energy functions
can be employed: molecular mechanics force fields such as
Amber (Cornell et al., 1995) or CHARMM (MacKerell et
al., 1998; Neria et al., 1996), contact scoring functions, or
Delphi electrostatic potential maps (Gilson et al., 1988;
Nicholls & Honig, 1991; Sharp & Honig, 1990). In customized
versions of DOCK, a solvation correction for the database
compound can be added to the score (Shoichet et al.,
1999).

¹ An input SMILES string is used to identify the cyclic and acyclic portions of the molecule. These are separately built and then connected together, relieving bad contacts by optimizing torsions.

Fig. 2. Outline of the structure-based drug design process.

Fig. 3. An example of a DOCK search. (a) The set of site points used for a DOCK calculation on the structure of the CalB domain of phospholipase A2 (Xu et
al., 1998; W. Somers, unpublished). An α-carbon trace is shown for the protein, with the two bound calcium ions drawn as the purple spheres. (b) The resulting
200 best scoring database molecules shown superimposed in green.
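The grid-based scoring described above can be sketched as follows. This is an illustrative toy, not DOCK's scoring code: the 12-6 Lennard-Jones form, the parameters `eps` and `rmin`, the nearest-grid-point lookup, and the `outside_penalty` value are all assumptions. The point it shows is that the protein contribution is computed once per grid point, after which each ligand pose is scored by table lookup.

```python
import math


def build_grid(protein_atoms, origin, spacing, shape, eps=0.2, rmin=3.5):
    """Precompute a 12-6 Lennard-Jones probe energy at every grid point.

    eps (well depth) and rmin (distance of the minimum) are hypothetical
    values applied to every protein atom.
    """
    nx, ny, nz = shape
    grid = {}
    for i in range(nx):
        for j in range(ny):
            for k in range(nz):
                point = (
                    origin[0] + i * spacing,
                    origin[1] + j * spacing,
                    origin[2] + k * spacing,
                )
                energy = 0.0
                for atom in protein_atoms:
                    r = max(math.dist(point, atom), 0.5)  # avoid division by zero
                    ratio6 = (rmin / r) ** 6
                    energy += eps * (ratio6 * ratio6 - 2.0 * ratio6)
                grid[(i, j, k)] = energy
    return grid


def score_pose(ligand_atoms, grid, origin, spacing, shape, outside_penalty=1000.0):
    """Score a pose by summing the precomputed value at each atom's nearest grid point."""
    total = 0.0
    for atom in ligand_atoms:
        idx = tuple(round((atom[d] - origin[d]) / spacing) for d in range(3))
        if all(0 <= idx[d] < shape[d] for d in range(3)):
            total += grid[idx]
        else:
            total += outside_penalty  # atom lies outside the gridded region
    return total


# Toy system: one protein atom at the origin, a one-dimensional grid along x.
origin, spacing, shape = (0.0, 0.0, 0.0), 0.5, (12, 1, 1)
grid = build_grid([(0.0, 0.0, 0.0)], origin, spacing, shape)
well_score = score_pose([(3.5, 0.0, 0.0)], grid, origin, spacing, shape)
clash_score = score_pose([(2.0, 0.0, 0.0)], grid, origin, spacing, shape)
```

Because the loop over protein atoms happens only in `build_grid`, the per-pose cost depends on the number of ligand atoms alone, which is what makes screening a large database tractable.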
DOCK has been used to generate lead compounds for a
number of important biological targets, including human
immunodeficiency virus (HIV)-1 protease (Friedman et al.,
1998; Rose & Craik, 1994), dihydrofolate reductase (Gsch-
wend et al., 1997), B-form DNA (Grootenhuis et al., 1994),
RNA (Chen et al., 1997), hemagglutinin (Hoffman et al.,
1997), a malaria protease (Li et al., 1996), and thymidylate
synthase (Shoichet et al., 1993). In an attempt to account for
ligand flexibility, DOCK databases recently have been constructed
with multiple conformations for each molecule, or
ensembles of superimposed conformations. In the first case,
each conformation of a molecule is docked separately,
while in the other case, either the largest rigid fragment of a
molecule (Lorber & Shoichet, 1998) or its largest three-
dimensional pharmacophore (Thomas et al., 1999) can be
used to overlay and dock the ensemble of conformations.
The newest version of the program, DOCK 4.0, can exhaus-
tively search all possible matches of each entry in the data-
base and can be run in a flexible ligand mode (Makino &
Kuntz, 1997), although both are computationally intensive.
The success of DOCK 4.0 and the new multi-conformation
databases has yet to be fully tested.
Other methods for flexible ligand docking include FLO98
(McMartin & Bohacek, 1997), AUTODOCK (Goodsell et
al., 1996; Morris et al., 1996), Hammerhead (Welch et al.,
1996), and FLEXX (Kramer et al., 1997; Rarey et al.,
1996). The FLO98 algorithm involves Monte Carlo (MC)
perturbation (wide-angle torsional Metropolis perturbation,
as well as translation and rotation of ligand atoms) followed
by energy minimization in Cartesian space for flexible
ligand binding to a target structure; therefore, there is full
flexibility for cyclic and acyclic molecules. For the initial
MC docking, the AMBER potential is evaluated on a grid
surrounding the binding site using relatively short, non-
standard, cutoffs for the nonbonded energy terms and with a
smoothly rising potential wall around the target binding site.
If the interaction energy of the ligand with the binding site
drops below a specified cutoff, the ligand position is fully
energy minimized. In general, the FLO98 package is rela-
tively easy for nonexperts to use to rapidly dock a large
number of ligand molecules into a given target structure and
graphically view the results. The method has been shown to
reproduce the X-ray structure of known complexes in most
cases, although a large enough number of docking cycles
must be carried out to ensure sufficient sampling. For cer-
tain ligands with high barriers to interconversion between
stereoisomers, it may be best to dock a few alternate low-
energy conformations of the molecule. AUTODOCK (Good-
sell et al., 1996; Morris et al., 1996) employs simulated an-
nealing in torsion space and, therefore, is best suited for
ligands with only a few rotatable bonds. Hammerhead
(Welch et al., 1996) also searches torsion space, but uses a
genetic algorithm approach. It is very fast and does well at
reproducing X-ray structures of ligand-protein complexes,
but, like AUTODOCK, does not include conformational
searching of cyclic molecules.
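The Metropolis Monte Carlo search over torsion angles that underlies several of these programs can be sketched as below. This is a generic illustration under stated assumptions, not the FLO98 or AUTODOCK code: `toy_energy`, the step size `max_step`, the temperature factor `kT`, and the step count are hypothetical, and the energy minimization that FLO98 applies after each perturbation is omitted.

```python
import math
import random


def metropolis_accept(delta_e, kT, rng):
    """Accept downhill moves always, uphill moves with probability exp(-dE/kT)."""
    if delta_e <= 0.0:
        return True
    return rng.random() < math.exp(-delta_e / kT)


def mc_torsion_search(energy, torsions, n_steps=2000, max_step=120.0, kT=0.6, seed=0):
    """Perturb one torsion at a time and keep the lowest-energy state visited."""
    rng = random.Random(seed)
    current = list(torsions)
    e_current = energy(current)
    best, e_best = list(current), e_current
    for _ in range(n_steps):
        trial = list(current)
        i = rng.randrange(len(trial))
        # Wide-angle perturbation of a single torsion, wrapped to [0, 360).
        trial[i] = (trial[i] + rng.uniform(-max_step, max_step)) % 360.0
        e_trial = energy(trial)
        if metropolis_accept(e_trial - e_current, kT, rng):
            current, e_current = trial, e_trial
            if e_current < e_best:
                best, e_best = list(current), e_current
    return best, e_best


def toy_energy(torsions):
    """Hypothetical energy surface with a minimum at 180 degrees per torsion."""
    return sum(1.0 + math.cos(math.radians(t)) for t in torsions)


best_torsions, best_energy = mc_torsion_search(toy_energy, [0.0, 0.0])
```

Simulated annealing, as used by AUTODOCK, follows the same accept/reject rule but lowers `kT` gradually over the run.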
FLEXX (Rarey et al., 1996) is more distinct from the
other docking methods in that it first decomposes the ligand
into fragments by breaking all single acyclic, nonterminal
bonds. A hashing pattern recognition technique is then used
to dock a set of base fragments into the binding site. Base
fragments are docked by matching three ligand-interaction
centers to three interaction points on the receptor surface.
The ligand is incrementally built up starting from the position
of a base fragment. The set of allowed interaction types
or physicochemical properties and the empirical scoring
function are defined as in the program LUDI (see Section 3
for a more detailed description of this method) (Bohm,
1994a), with slight modifications. This model of discrete
conformational flexibility for the ligand, with finite sets of
allowed torsional angles for single acyclic bonds and pre-
computed conformations for ring systems, allows the dock-
ing to be fast. If the ligand-bound conformation of the re-
ceptor and a base fragment that binds with high specificity
are used, the method can reproduce the X-ray structures of
known complexes. In a recent blind test of the method (at
the CASP2 meeting), FLEXX predicted two of seven selected
ligand complexes correctly, found parts of the solution
for four of them, and failed at one (Kramer et al., 1997).
3. Computational de novo drug design methods
There are three basic classes of computational methods
for the de novo design of structure-based ligands: fragment
positioning methods, molecule growth methods, and frag-
ment methods coupled to database searches. In each cate-
gory, there are a number of software packages, available
commercially or from academic groups (Caflisch & Karplus,
1995). Also, in some cases, pharmaceutical companies
have developed their own in-house software. For each type,
two or three methods are highlighted below and recent ap-
plications are discussed. The advantages and disadvantages
of the three general strategies are assessed.
3.1. Fragment positioning methods
Of the fragment positioning methods, two well-known
programs are GRID (Goodford, 1985) and MCSS (Multiple
Copy Simultaneous Search) (Evensen et al., 1997; Miranker
& Karplus, 1991). These methods determine energetically
favorable binding site positions for various functional group
types or chemical fragments. The program GRID calculates
protein interaction energies for functional groups repre-
sented as single-sphere probes on a grid surrounding the tar-
get structure. The GRID nonbonded interaction energy in-
cludes an explicit hydrogen bonding term (Boobbyer et al.,
1989; Wade et al., 1993; Wade & Goodford, 1993) in addi-
tion to electrostatic and van der Waals (vdW) terms. The re-
sulting grid contour map for a given probe looks like elec-
tron density into which fragments of that probe type can be
built. Therefore, GRID should be fairly intuitive for a crys-
tallographer to use, and is particularly useful for designing
modifications to existing lead compounds. As an example,
GRID was used to suggest the replacement of a single hy-
droxyl by an amino group in an existing inhibitor of influenza
virus sialidase (2-deoxy-2,3-didehydro-N-acetylneuraminic
acid) that led to an inhibitor (4-amino-Neu5Ac2en) with
dramatically improved binding affinity (two orders of mag-
nitude improvement in Ki) (von Itzstein et al., 1996). In the
newer versions of GRID, the ability to create multi-sphere
probes is available, but at least three atoms in the multi-sphere probe must be capable of making hydrogen bonds
and must not be in a linear arrangement (so a multi-sphere
phenol group, for example, cannot be created). In contrast,
with the MCSS program, the probes are fully flexible and
individual atoms are represented using the CHARMM
(Brooks et al., 1983) potential energy function. In its stan-
dard single atom probe mode, GRID is fast, but gives much
less detailed information than MCSS. A detailed compari-
son of the two methods (R. Putzer, D. Joseph-McCarthy,
J. M. Hogle, & M. Karplus, in preparation) has shown that
the time required for a typical MCSS calculation for meth-
ane, for example, is approximately 2.5 times that required
for the corresponding GRID calculation, although neither
time is prohibitive and the results are similar. For larger
functional groups (such as phenol), the MCSS calculation
takes significantly longer than the corresponding GRID sin-
gle-sphere probe calculation (an aromatic hydroxyl), but the
results are effective at indicating where in the binding site
the group can be accommodated (Fig. 4). The resulting MCSS
maps are more analogous to experimental mapping of a pro-
tein surface by determining its three-dimensional structure
in various organic solvents (Allen et al., 1996; Joseph-
McCarthy et al., 1996; Shuker et al., 1996). MCSS has been
used to suggest improvements to HIV-1 protease inhibitors
(Caflisch et al., 1993) and thrombin inhibitors (Grootenhuis
& Karplus, 1996) and to design novel picornavirus capsid-binding ligands (D. Joseph-McCarthy, unpublished data).
Other related methods include HIPPO (Gillet et al., 1995)
and the fragment positioning mode of LUDI (Bohm, 1992).
Fig. 4. Comparison of an MCSS functional group map for phenol and a GRID map for the aromatic hydroxyl probe, both calculated for the poliovirus
capsid protein. MCSS phenol minima with E 12 kcal/mol are shown colored by element, and the GRID density contoured at E 4 kcal/mol is in
magenta.

Fragment positioning methods can be considered as the
first step in a three-step approach to de novo drug design.
The second step in the process involves clustering and connecting
the optimally placed molecular fragments to form
chemically sensible candidate ligands. The third step in-
volves estimating how well the proposed compounds should
bind relative to one another and to existing drugs (see Sec-
tion 4). Several different approaches can be employed for
the second step, and a number of groups have been develop-
ing ways to automate the process. In one application, MC
minimizations were performed using a pseudo-potential energy
function to connect N-methyl acetamide minima in the
binding site to form peptide backbones (Caflisch et al.,
1993). In another application, a link procedure involving the
optimization of linker carbon positions and their connectiv-
ity to selected functional group minima was used to con-
struct nonpeptide small molecules (Joseph-McCarthy et al.,
1997). The newer program, OLIGO (E. Evensen & M. Kar-
plus, unpublished), can similarly construct peptide back-
bones using a simulated annealing MC minimization proce-
dure and a pseudo potential. In this case, however, each MC
move is the substitution of one backbone monomer fragment
(an N-methyl acetamide minimum position) for another
in the chain. Allowed side chains (in their optimal positions
in the binding site) are then automatically and
exhaustively added to these backbones. Two related dynam-
ical approaches are DLD (Dynamic Ligand Design)
(Miranker & Karplus, 1995) and CONCERTS (Creation Of
Novel Compounds by Evaluation of Residues at Target
Sites) (Pearlman & Murcko, 1996). DLD saturates the tar-
get binding site with sp3 carbons, which can connect to each
other or to functional group minima (as determined by
MCSS or a related method) to form molecules with the cor-
rect stereochemistry using a pseudo-energy function. This
potential function depends on the Cartesian coordinates of
the atoms, as well as their occupancies and types. In the
present implementation, it is sampled and optimized using
MC simulated annealing. CONCERTS saturates the binding
site with multiple copies of various molecular fragments
and does both the fragment positioning and connection us-
ing molecular dynamics (MD) with the AMBER potential
energy function. The fragments are fully flexible during the
minimization, and only connected fragments interact with
each other. Connections can occur along user-specified
bonds to hydrogen in each fragment; when an inter-frag-
ment bond is formed, two hydrogens (one belonging to each
fragment) are deleted. During the optimization procedure,
bonds can break, as well as form, if the result lowers the
overall energy of the molecule or macro-fragment. With
both DLD and CONCERTS, multiple molecules are simul-
taneously formed and scored.
3.2. Molecule growth methods
In molecule growth methods, a seed atom (or fragment)
is first placed in the binding site of the target structure. A
ligand molecule is successively built by bonding another
atom (or fragment) to it. There are a number of molecule
growth methods available, including SMoG (Small Mole-
cule Growth) (DeWitte et al., 1997; DeWitte & Shakhnov-
ich, 1996), GrowMol (Bohacek & McMartin, 1994, 1995),
GenStar (Rotstein & Murcko, 1993a), GroupBuild (Rotstein
& Murcko, 1993b), and GROW (Moon & Howe, 1991).
GROW starts with a user-selected residue position and con-
structs a peptide by sequentially adding residues. Amino
acid conformations are selected from a large predefined li-
brary. Peptides are scored as they are being constructed, us-
ing a molecular mechanics force field. GroupBuild is simi-
lar in that it uses a predefined library of chemical fragments
and scores candidate fragment positions based on a molecu-
lar mechanics force field to generate candidate small mole-
cule ligands fragment-by-fragment. In contrast, GenStar se-
quentially grows structures composed of only sp3 carbons,
starting from either the position of user-selected seed atoms
in the target structure or a docked ligand core onto which at-
oms are to be built. For each new atom, several hundred
candidate positions are generated based on geometry con-
siderations, and then scored using a simple contact function.
The new atom position is randomly chosen from among the
best-scoring candidate positions. Branching is allowed, and
ring formation is favored to generate structures that fill the
binding site. Similarly, GrowMol sequentially builds up
ligand structures from a library of allowed atom types (in-
cluding oxygen, nitrogen, negatively charged oxygen, and
hydrogen, in addition to sp3 carbon), as well as small func-
tional group types. In this case, each new atom (or func-
tional group) position is scored based on its chemical com-
plementarity with nearby atoms in the binding site of the
target structure. The Metropolis criterion with this complementarity
score taken as the energy is used to accept or reject
the candidate atom position. The complementarity score
is determined by a grid surrounding the target binding site
with grid points designated as binding-site forbidden (too
close to the target structure), hydrogen bond acceptor, hy-
drogen bond donor, or neutral. SMoG uses a coarse-grained
knowledge-based potential that is based on statistical analy-
sis of crystal structures of small molecule-protein complexes
to estimate the binding affinity of molecules as they are
grown. Molecules are built by joining small, rigid fragments
together with standard bond lengths and angles and optimal
torsions. Functional group additions are accepted based on a
Metropolis MC method. The disadvantages to this general
approach are that the final results depend a great deal on the
position of the seed atom in the binding site and that many
of the resulting molecules may be too difficult to synthesize.
In future implementations, additional chemistry rules need
to be considered when growing the molecules.
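The atom-by-atom growth loop described for GenStar can be sketched as follows. This is a loose illustration, not the published algorithm: the contact function, the distance thresholds, the number of candidates, and the `top_k` cutoff are hypothetical, and bond-angle geometry and ring formation are ignored. It generates candidate positions one bond length from an existing atom, scores them with a simple contact count, and picks at random among the best-scoring candidates.

```python
import math
import random


def random_unit_vector(rng):
    """Draw a uniformly distributed direction by rejection sampling in a cube."""
    while True:
        v = [rng.uniform(-1.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(x * x for x in v))
        if 1e-6 < norm <= 1.0:
            return [x / norm for x in v]


def contact_score(pos, site_atoms, clash=3.0, contact=5.0):
    """Reward site atoms within contact range and penalize clashes (toy function)."""
    score = 0.0
    for s in site_atoms:
        d = math.dist(pos, s)
        if d < clash:
            score -= 10.0
        elif d <= contact:
            score += 1.0
    return score


def grow(seed, site_atoms, n_atoms=5, n_candidates=200, top_k=10,
         bond=1.5, min_sep=2.2, rng_seed=0, max_rounds=1000):
    """Grow a ligand of n_atoms positions starting from a seed atom."""
    rng = random.Random(rng_seed)
    ligand = [tuple(seed)]
    for _ in range(max_rounds):
        if len(ligand) >= n_atoms:
            break
        parent = rng.choice(ligand)  # branching: any existing atom may be extended
        candidates = []
        for _ in range(n_candidates):
            u = random_unit_vector(rng)
            pos = tuple(parent[d] + bond * u[d] for d in range(3))
            # Keep the new atom away from all ligand atoms except its parent.
            if all(a == parent or math.dist(pos, a) >= min_sep for a in ligand):
                candidates.append((contact_score(pos, site_atoms), pos))
        if not candidates:
            continue
        candidates.sort(key=lambda c: c[0], reverse=True)
        ligand.append(rng.choice(candidates[:top_k])[1])
    return ligand


# Hypothetical binding-site atoms and seed position.
site = [(4.0, 0.0, 0.0), (0.0, 4.0, 0.0), (4.0, 4.0, 0.0)]
grown = grow((0.0, 0.0, 0.0), site)
```

Changing `rng_seed` or the seed position gives a different ligand, which reflects the dependence on the starting atom noted in the text.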
3.3. Fragment methods coupled to database searches
Fragment positioning methods can also be coupled to da-
tabase searching techniques either to extract those existing
molecules from a database that can be docked into the bind-
ing site with the desired fragments in their optimal positions
or for de novo design. The program HOOK (Eisen et al.,
1994), for example, can be used to do both. In its de novo
design mode, HOOK first creates a database of molecular
skeletons by stripping off all the functional groups on the
database molecules and then searches this database for
those molecular skeletons that can be fit into the target bind-
ing site in such a way that two MCSS functional group min-
ima can be attached or hooked onto them. After this initial
docking by geometrical superposition of two designated
hooks (methyl groups and attached atoms) in the skeletal
molecules and in two functional group minima, the fit of the
skeleton in the binding site is scored using a simplified, in-
verted Lennard-Jones type contact potential. If the fit is ac-
ceptable, secondary searches are carried out to attach addi-
tional MCSS minima to the skeleton, possibly through an
extra carbon. CAVEAT (Lauri & Bartlett, 1994) is similar
in that it searches a database of three-dimensional structures
of small molecules (often cyclic molecules) to use as molec-
ular frameworks to connect fragments already optimally
placed in the binding site. For each molecule in the database,
specific bonds are represented as vectors, and the molecule
is represented as a set of pairwise combinations of
these bond vectors. CAVEAT matches specified pairs of
bond vectors from the fragments (or the query) and the data-
base molecules to retrieve compounds. It is fast because the
interaction between the skeletal molecule and the binding
site is only considered in a post-processing step. As with
HOOK and CAVEAT, LUDI (Böhm, 1992, 1994b) can be
used either for database searching or de novo design. For de
novo design, LUDI uses either statistical data from small-
molecule crystal structures, geometric rules, or output from
the program GRID to identify interaction sites in the target
binding site. Molecular fragments (taken from a library of hundreds) are then placed in binding site positions, where
they can connect up to four of these favorable hydrogen-
bonding or hydrophobic interaction sites. Smaller linker
groups such as CH2 are used interactively to connect these
larger, optimally placed fragments into candidate ligands.
LUDI's empirical scoring function takes into account hydrogen
bonds, ionic interactions, the lipophilic protein-ligand
contact surface, and the number of rotatable bonds in
a ligand. It was calibrated by fitting to experimental binding
affinities for 45 protein-ligand complexes to obtain the individual
energy contributions for an ideal neutral hydrogen
bond (−4.7 kJ/mol), an ideal ionic hydrogen bond (−8.3 kJ/mol),
a lipophilic contact (−0.17 kJ/mol), and one rotatable
bond in the ligand (+1.4 kJ/mol). Deviations from ideal geometry
reduce these contributions, and the sum of all interactions
gives an estimate of the free energy of binding for a
given protein-ligand complex. Since its scoring function is
based solely on geometric considerations, LUDI is very fast
and can be used interactively to predict protein-ligand com-
plex structures, but it may sometimes miss optimal positions
that are due to more delocalized electrostatic and vdW inter-
actions. Instead of docking molecular fragments from a li-
brary, LUDI can similarly be used to dock and score mole-
cules from a large database.
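The additive form of the LUDI scoring function described above can be sketched in a few lines. This is only an illustration under stated assumptions, not LUDI's actual implementation: the signs (favorable terms negative, rotatable-bond penalty positive), the per-interaction "quality" factors between 0 and 1 that stand in for the geometric penalty, and the treatment of the lipophilic term as proportional to a contact measure are all assumptions, and any constant offset in the published function is omitted.

```python
from dataclasses import dataclass, field

# Contributions in kJ/mol as quoted in the text. The signs are an
# assumption of this sketch: favorable terms negative, penalty positive.
DG_HBOND = -4.7   # ideal neutral hydrogen bond
DG_IONIC = -8.3   # ideal ionic hydrogen bond
DG_LIPO = -0.17   # per unit of lipophilic contact (unit assumed)
DG_ROT = 1.4      # per rotatable bond in the ligand


@dataclass
class ComplexFeatures:
    """Hypothetical summary of one protein-ligand complex."""
    hbond_quality: list = field(default_factory=list)  # 1.0 = ideal geometry
    ionic_quality: list = field(default_factory=list)  # 1.0 = ideal geometry
    lipophilic_contact: float = 0.0
    rotatable_bonds: int = 0


def estimate_binding_energy(c: ComplexFeatures) -> float:
    """Sum the individual contributions; non-ideal geometry scales a term down."""
    return (
        DG_HBOND * sum(c.hbond_quality)
        + DG_IONIC * sum(c.ionic_quality)
        + DG_LIPO * c.lipophilic_contact
        + DG_ROT * c.rotatable_bonds
    )
```

Because every term is a simple count or area scaled by a constant, scoring a candidate costs almost nothing, which is consistent with the speed the text attributes to this kind of function.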
Fragment positioning methods can also be used to deter-
mine or combinatorially generate possible structure-based
pharmacophores. Traditionally, a pharmacophore is the set
of features common to a series of active molecules. A three-
dimensional pharmacophore specifies the spatial relation-
ship between the groups or features, often defining dis-
tances or distance ranges between groups, angles between
groups or planes, and exclusion spheres (see Fig. 5) (Leach,
1996). Structural information about the target can also be
used to help align ligand molecules to obtain better pharma-
cophores. Programs such as Catalyst can be used to generate
a pharmacophore by aligning and overlaying a set of ligand
structures. Catalyst can also use a pharmacophore to search
a database for new molecules that possess that pharmaco-
phore (Sprague, 1995). ISIS (MDL Information Systems
Inc., 1997) and UNITY (Tripos Associates, 1995) are two
other popular programs for searching a database for two- or
three-dimensional pharmacophores.
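A three-dimensional pharmacophore search of the kind described above can be illustrated with a small distance-range check. The sketch below is hypothetical throughout: the feature names, the distance ranges, and the `matches` function are invented for illustration and do not reproduce how Catalyst, ISIS, or UNITY work. Angle constraints and exclusion spheres are left out for brevity.

```python
import math
from itertools import permutations

# Hypothetical three-point pharmacophore: feature types plus allowed
# distance ranges (in angstroms) between pairs of features.
PHARMACOPHORE = {
    "features": ["donor", "acceptor", "hydrophobe"],
    "distances": {(0, 1): (2.5, 3.5), (0, 2): (5.0, 7.0), (1, 2): (4.0, 6.0)},
}


def matches(pharmacophore, atoms):
    """Return True if some assignment of atoms to features fits all ranges.

    atoms is a list of (feature_type, (x, y, z)) tuples for one conformation.
    """
    feats = pharmacophore["features"]
    for combo in permutations(range(len(atoms)), len(feats)):
        # Each chosen atom must carry the required feature type.
        if any(atoms[a][0] != f for a, f in zip(combo, feats)):
            continue
        # Every constrained pair must fall inside its distance range.
        if all(
            lo <= math.dist(atoms[combo[i]][1], atoms[combo[j]][1]) <= hi
            for (i, j), (lo, hi) in pharmacophore["distances"].items()
        ):
            return True
    return False
```

A database search would apply such a test to every stored conformation of every molecule; real programs use precomputed keys or indices to avoid the brute-force enumeration used here.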
3.4. Virtual library construction and screening
The de novo design methods described in Sections 3.1–3.3
can be used to suggest individual molecules or to construct
large virtual combinatorial libraries of compounds
that can be screened computationally. MCSS functional
group maps, for example, have been used to design large
structure-based libraries for major histocompatibility Class
II molecules (E. Evensen, D. Joseph-McCarthy, G. Weiss,
S. Schreiber, & M. Karplus, in preparation), and small di-
rected libraries of poliovirus capsid-binding ligands (D. Jo-
seph-McCarthy, J. M. Hogle, & M. Karplus, in preparation).
An automated method, CCLD, for generating combinatorial
libraries by iteratively and exhaustively connecting MCSS minima has also been developed (Caflisch, 1996). Starting
with the MCSS minimum with the lowest approximated
binding free energy, small linker units (with from 0 to 3 co-
valent bonds) are used to add additional fragment minima.
The calculation is fast because a list of mutually excluding
(overlapping) fragment pairs and of possible bonding fragment
pairs is pre-computed. Also, ligand growth is stopped
if the average value of the binding free energy of its fragments
exceeds a specified cutoff. In addition, HOOK can be
used with a database of all allowed conformations of a scaffold
or a set of scaffolds, with only positions that can be
combinatorialized designated as hooks. FLO98 can also be
used to generate and score combinatorial libraries in an automated
manner.
Fig. 5. An example of a three-dimensional pharmacophore.
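The growth procedure attributed to CCLD above (seed with the best fragment minimum, extend through precomputed pair lists, stop when the average fragment energy exceeds a cutoff) can be sketched as follows. This is a simplified illustration, not the published algorithm: the function name, the data layout, and the `max_size` limit are assumptions, and linker geometry is reduced to a precomputed "bondable" pair list.

```python
def grow_ligands(energies, bondable, excluded, cutoff, max_size=3):
    """Enumerate fragment combinations grown from the lowest-energy fragment.

    energies: dict mapping fragment id -> approximate binding free energy
    bondable, excluded: sets of frozenset({a, b}) pairs, computed once up front
    cutoff: growth stops when the mean fragment energy exceeds this value
    """
    seed = min(energies, key=energies.get)
    results = set()
    stack = [(seed,)]
    while stack:
        ligand = stack.pop()
        if ligand in results:
            continue
        results.add(ligand)
        if len(ligand) == max_size:
            continue
        for frag in energies:
            if frag in ligand:
                continue
            # Reject fragments that overlap any fragment already placed.
            if any(frozenset((frag, m)) in excluded for m in ligand):
                continue
            # Require a possible linkage to at least one placed fragment.
            if not any(frozenset((frag, m)) in bondable for m in ligand):
                continue
            candidate = tuple(sorted(ligand + (frag,)))
            mean_energy = sum(energies[f] for f in candidate) / len(candidate)
            if mean_energy > cutoff:
                continue  # growth along this branch is stopped
            stack.append(candidate)
    return results
```

Looking pairs up in precomputed sets keeps each extension step cheap, which is the point the text makes about why the exhaustive enumeration remains fast.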
4. Ligand-binding scoring functions
The success of docking molecules into a target site, de-
signing ligands de novo, or constructing and screening large
virtual combinatorial libraries is ultimately dependent on
the accuracy of the scoring function that ranks the com-
pounds or how well the corresponding relative binding af-
finities can be predicted. Ligand binding is governed by ki-
netic and thermodynamic principles. Factors that contribute
to ligand binding include the hydrophobic effect, vdW and
dispersion interactions, hydrogen bonding, other electrostatic
interactions, and solvation effects (Ajay & Murcko, 1995). If the change in free energy associated with complex
formation is negative, the association will be favorable.
Once a candidate ligand is constructed, its interaction en-
ergy with the protein is calculated and compared with that
for other proposed compounds and existing ligands.
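As a worked illustration of the sign convention above, the standard thermodynamic relation between the binding free energy and the dissociation constant can be written out. This relation is textbook thermodynamics rather than something stated in the review, and the 1 nM ligand and the temperature of 298 K are chosen only for illustration.

```latex
\Delta G^{\circ}_{\mathrm{bind}} = -RT \ln K_a = RT \ln\!\left(\frac{K_d}{c^{\circ}}\right),
\qquad c^{\circ} = 1\ \mathrm{M}

% Example: K_d = 1 nM at T = 298 K, where RT \approx 2.48 kJ/mol
\Delta G^{\circ}_{\mathrm{bind}} \approx 2.48\ \mathrm{kJ/mol} \times \ln\!\left(10^{-9}\right)
\approx -51.4\ \mathrm{kJ/mol}

% A tenfold change in K_d shifts the free energy by RT \ln 10
RT \ln 10 \approx 5.7\ \mathrm{kJ/mol}
```

On this scale, ranking two ligands whose affinities differ by a factor of ten requires a scoring function that resolves differences of roughly 6 kJ/mol.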
In order of increasing complexity, the various ap-
proaches for estimating binding affinities include scoring
functions based on the statistical analysis of known struc-
tures of protein-ligand complexes (Koppensteiner & Sippl,
1998), physicochemical properties (Böhm & Klebe, 1996),
molecular mechanics force-field calculations, force-field
calculations with added solvation corrections, and free energy perturbation (FEP) calculations (Gilson et al., 1997a).
The SMoG pseudo-energy function is an example of a scor-
ing function based on the statistical analysis of high resolu-
tion X-ray structures. The simplest physicochemical scoring
functions include those that count the number of receptor
atom contacts within specified distances or that scale these
counts depending on the distance from the ligand, as HOOK
does. More complicated ones include the LUDI energy
function and similar empirical scoring functions (Eldridge
et al., 1997; Jain, 1996). Molecular mechanics force-field
calculations attempt to model explicitly the atomic inter-
actions in the system. The resulting interaction energies rep-
resent the enthalpic contribution to the free energy. The
simplest force-field calculations are performed with the
ligand-target complex in vacuum using truncation schemes
for the nonbonded interactions. The calculated ligand-target
interaction energies include electrostatic and vdW interac-
tions between the ligand and target, and often also include
the internal energy (bond, angle, and torsion terms) of the
ligand or a ligand strain term (the internal energy of the
ligand in its bound conformation minus a reference energy
for the ligand in an unbound conformation). In a number of
cases of sets of related compounds, a reasonable correlation
exists between the vdW interaction energy alone and bind-
ing affinities (Caflisch & Karplus, 1995; Grootenhuis & van
Galen, 1995; Grootenhuis & Van Helden, 1994; Holloway
et al., 1995; Joseph-McCarthy et al., 1997; Kurinov & Har-
rison, 1994).
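The simplest force-field estimate described above, a vacuum ligand-target interaction energy with truncated nonbonded terms, can be sketched as a double loop over atom pairs. The functional forms here (Coulomb's law plus a 12-6 Lennard-Jones term with simple combining rules), the units, and the atom dictionary layout are assumptions made for illustration; they do not correspond to any particular force field or program cited in the text.

```python
import math

# Approximate conversion factor giving kcal/mol for charges in units of e
# and distances in angstroms.
COULOMB_CONSTANT = 332.06


def interaction_energy(ligand, target, cutoff=12.0, dielectric=1.0):
    """Return (electrostatic, vdw) ligand-target interaction energies.

    Each atom is a dict with keys 'xyz' (angstroms), 'charge' (e),
    'epsilon' (well depth, kcal/mol) and 'rmin_half' (angstroms).
    Pairs farther apart than the cutoff are ignored (plain truncation).
    """
    elec = 0.0
    vdw = 0.0
    for a in ligand:
        for b in target:
            r = math.dist(a["xyz"], b["xyz"])
            if r > cutoff:
                continue
            elec += COULOMB_CONSTANT * a["charge"] * b["charge"] / (dielectric * r)
            eps = math.sqrt(a["epsilon"] * b["epsilon"])  # geometric mean
            rmin = a["rmin_half"] + b["rmin_half"]        # arithmetic sum
            ratio6 = (rmin / r) ** 6
            vdw += eps * (ratio6 * ratio6 - 2.0 * ratio6)
    return elec, vdw
```

Returning the two terms separately makes it easy to test the observation in the text that, for some series of related compounds, the vdW term alone correlates with binding affinity.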
A mean force-field approximation or continuum repre-
sentation for solvent can be used to calculate an electrostatic
term that is substituted for the molecular mechanics Cou-
lombic term to estimate the electrostatic contribution to the
free energy. This continuum treatment of long-range elec-
trostatic interactions involves first calculating the electro-
static potential for the final state and the individual refer-
ence states, using a finite difference approach to solve the
linearized Poisson-Boltzmann equation, as implemented in
UHBD (Davis et al., 1991; Madura et al., 1995) or Delphi
(Gilson et al., 1988; Nicholls & Honig, 1991). Calculation
of the electrostatic energy from the electrostatic potential is
trivial, and for ligand binding, the difference in the electro-
static energy approximates the difference in the electrostatic
contribution to the free energy (that is, for the binding of
ligand L to protein P, $\Delta G_{\mathrm{elec}} \approx \Delta U = U_{PL} - U_{P} - U_{L}$).
To account further for solvation, the solvent-accessible surface
area can be calculated for the ligand, the protein, and the
ligand-protein complex. The surface area buried upon com-
plex formation can be related to the free energy of nonpolar
solvation or the hydrophobic effect associated with ligand
binding (Eisenberg & McLachlan, 1986; Ooi et al., 1987).
A number of groups have used a weighted sum of a contin-
uum electrostatic term and a buried surface area term, some-
times with the addition of a ligand internal energy term, to
predict binding affinities with some success (Caflisch,
1996; Froloff et al., 1997; Novotny et al., 1997; Simonson
et al., 1997). Another approach is to incorporate an implicit solvation term directly into the molecular mechanics force
field. For example, an excluded volume-implicit solvation
model can be used that assumes that the solvation free en-
ergy for each group or residue in the system is equal to the
calculated solvation free energy for that group in a small
model compound less the amount of solvation lost due to
solvent exclusion by the other atoms of the macromolecular
system (Lazaridis & Karplus, 1999).
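The weighted-sum estimate described above, a continuum electrostatic difference plus a term proportional to the surface area buried on complex formation, reduces to simple arithmetic once the component energies and areas are available. In the sketch below the function name, the weights, and the surface tension coefficient `gamma` are placeholders chosen for illustration; published studies fit or choose these values for their own systems.

```python
def continuum_binding_estimate(u_complex, u_protein, u_ligand,
                               sasa_complex, sasa_protein, sasa_ligand,
                               w_elec=1.0, w_nonpolar=1.0, gamma=0.025):
    """Weighted sum of a continuum electrostatic and a buried-area term.

    u_*: electrostatic energies of the complex and of the isolated partners,
         e.g. from a Poisson-Boltzmann solver (same energy unit throughout)
    sasa_*: solvent-accessible surface areas in square angstroms
    gamma: placeholder energy per unit of buried area
    """
    # Electrostatic contribution: complex minus the separated partners.
    dg_elec = u_complex - u_protein - u_ligand
    # Area removed from solvent contact when the complex forms.
    buried_area = sasa_protein + sasa_ligand - sasa_complex
    # Burying nonpolar surface is treated as favorable (negative).
    dg_nonpolar = -gamma * buried_area
    return w_elec * dg_elec + w_nonpolar * dg_nonpolar
```

A ligand internal energy term, which the text notes is sometimes added, would enter as one more weighted summand.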
The only rigorous way to predict relative or absolute
binding free energies is a FEP calculation with explicit sol-
vent. FEP MD is difficult due to problems with sampling
and the accuracy of the empirical potential used, but it does
allow the free energy contributions to be examined on an
atomic level (Beveridge & DiCapua, 1989; Brooks et al.,
1988; Kollman, 1993; Straatsma & McCammon, 1992).
Furthermore, component analysis of the results can aid in
understanding the relative contribution of various parts of
the system (i.e., of the ligand or protein) to the free energy
(Boresch et al., 1994; Boresch & Karplus, 1995; Gao et al.,
1989). There are a number of approaches for calculating rel-
ative free energies with explicit solvent present (Pearlman,
1994), but all are computer intensive, and for ligand bind-
ing, the calculations are limited to very small changes in the
ligand structure. The standard free energy cycle for determining
the relative binding free energy ($\Delta\Delta G_{\mathrm{bind}}$) for ligand
1 (L1) versus ligand 2 (L2) is

$$
\begin{array}{ccc}
L1(w) & \xrightarrow{\;\Delta G_{\mathrm{bind}}(L1)\;} & L1(p) \\
\updownarrow\ \Delta\Delta G_{\mathrm{sol}}^{w} & & \updownarrow\ \Delta\Delta G_{\mathrm{sol}}^{p} \\
L2(w) & \xrightarrow{\;\Delta G_{\mathrm{bind}}(L2)\;} & L2(p)
\end{array}
$$

where, for example, L1(w) is ligand 1 in water, L1(p) is
ligand 1 bound to the protein, and $\Delta\Delta G_{\mathrm{sol}}^{w}$ is the relative
free energy of solvation of L1 vs. L2 in water. This cycle results
in $\Delta\Delta G_{\mathrm{bind}} = \Delta G_{\mathrm{bind}}(L1) - \Delta G_{\mathrm{bind}}(L2) = \Delta\Delta G_{\mathrm{sol}}^{p} - \Delta\Delta G_{\mathrm{sol}}^{w}$.
(If L1 is taken as a nil particle, in principle, this cycle
would yield the absolute binding free energy for L2.)
With the FEP method, the free energies associated with the two nonphysical paths corresponding to mutating L1 into L2 (in water and bound to protein, respectively) are calculated.
Often the simulation is carried out by varying a mixing constant
$\lambda$ from 0 to 1 in set increments, where the potential energy
for the system is represented as $(1 - \lambda)$ of the potential
energy for L1 and $\lambda$ of that for L2 [i.e., $V(r^N, \lambda) = (1 - \lambda) V_{L1}(r^N) + \lambda V_{L2}(r^N)$].
Mutating one ligand into another almost
always involves the creation and annihilation of atoms,
as well as the redistribution of molecular charges, processes
that converge very slowly during a simulation, even
on current computers.
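The idea of stepping a mixing constant from 0 to 1 and accumulating the free energy over the increments can be demonstrated on a toy system with a known answer. The sketch below is an assumption-laden illustration rather than a ligand calculation: the two "states" are one-dimensional harmonic wells, sampling uses exact Gaussian draws instead of MD, and the exponential-averaging estimator is one common choice that the text does not itself specify.

```python
import math
import random


def fep_free_energy(k1, k2, kT=0.6, windows=10, samples=20000, seed=0):
    """Free energy of changing a harmonic well from force constant k1 to k2.

    The mixed potential is V(x, lam) = (1 - lam) * V1(x) + lam * V2(x),
    with V1 = 0.5*k1*x**2 and V2 = 0.5*k2*x**2.
    """
    rng = random.Random(seed)
    total = 0.0
    for i in range(windows):
        lam, lam_next = i / windows, (i + 1) / windows
        k_here = (1 - lam) * k1 + lam * k2
        k_next = (1 - lam_next) * k1 + lam_next * k2
        sigma = math.sqrt(kT / k_here)  # Boltzmann distribution of x at lam
        acc = 0.0
        for _ in range(samples):
            x = rng.gauss(0.0, sigma)
            dv = 0.5 * (k_next - k_here) * x * x  # V(x, lam_next) - V(x, lam)
            acc += math.exp(-dv / kT)
        # Exponential average of the energy difference for this window.
        total += -kT * math.log(acc / samples)
    return total


def exact_free_energy(k1, k2, kT=0.6):
    """Analytic result for the same harmonic transformation."""
    return 0.5 * kT * math.log(k2 / k1)
```

Even in this trivially sampled case the estimate carries statistical noise; in a real ligand mutation, where configurations come from MD and atoms appear or vanish, the same averages converge far more slowly, as the text notes.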
In practice, for drug design applications, large sets of candidate ligand molecules that differ considerably need to
be compared. In an attempt to circumvent this problem,
Åqvist and co-workers have developed a semi-empirical
method for calculating absolute binding free energies from
MD simulations of the two physical paths (Åqvist et al.,
1994; Hansson et al., 1998; Marelius et al., 1998). In this
semi-empirical approach, a linear approximation of the polar
and nonpolar free energy contributions is estimated from
averages of MD simulations of the ligand in water and of
the ligand-protein complex in water. That is,
$\Delta G_{\mathrm{bind}} = \frac{1}{2}\langle \Delta V_{l-s}^{\mathrm{el}} \rangle + \alpha \langle \Delta V_{l-s}^{\mathrm{vdW}} \rangle$, where $\langle V_{l-s}^{\mathrm{el}} \rangle$ is, for example,
the solute-solvent electrostatic term, $\Delta$ refers to the difference
between the protein and water environments for the
ligand, and $\alpha$ is a parameter determined by empirical calibration
with a series of ligand-protein complexes with
known binding affinities.
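The linear interaction energy estimate described above needs only averages of two energy terms from two simulations, so the arithmetic is short. In this sketch the function name and the input layout (per-frame ligand-surroundings energies collected into lists) are hypothetical, and the value of `alpha` passed in the usage example is purely illustrative, since the text states that it comes from empirical calibration.

```python
from statistics import mean


def lie_binding_free_energy(bound, free, alpha):
    """Estimate the binding free energy from two sets of simulation averages.

    bound, free: dicts with per-frame lists under the keys 'el' and 'vdw',
                 holding ligand-surroundings energies for the ligand in the
                 solvated complex and for the ligand alone in water
    alpha: empirically calibrated weight for the van der Waals term
    """
    # Difference between protein and water environments for each term.
    d_el = mean(bound["el"]) - mean(free["el"])
    d_vdw = mean(bound["vdw"]) - mean(free["vdw"])
    return 0.5 * d_el + alpha * d_vdw
```

For example, `lie_binding_free_energy(bound, free, alpha=0.16)` combines the two averaged differences; because only the physical end states are simulated, ligands of quite different structure can be compared without defining a mutation path between them.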
New approaches to address some of the problems associ-
ated with sampling in standard FEP calculations are also be-
ing developed (Cieplak & Kollman, 1996; Gerber et al.,
1993; Gilson et al., 1997b; Guo et al., 1998; Liu et al., 1996;
Tidor, 1993). One such approach by Brooks and co-workers
(Guo et al., 1998) involves an extended Hamiltonian
method whereby the mixing parameter $\lambda$ is treated as a dynamic
variable that is propagated (as if it were a particle)
along with the atomic coordinates for the system according
to Newton's equations of motion. A series of related ligands
can be simultaneously simulated using a set of $\lambda$s, with the
variant parts of the ligand interacting with the target structure,
but not with each other, to calculate relative binding
free energies. Since this allows for more efficient sampling,
the calculations are faster and larger differences in ligand
structures can be examined. Future improvements in the
force fields likely will involve the inclusion of polarization
(Liu et al., 1998) and should also lead to more accurate
binding energy calculations.
5. Summation and future outlook
The greatest successes of computer-aided structure-based
drug design to date are the HIV-1 protease inhibitors that recently
have been approved by the United States Food and
Drug Administration and reached the market (Wlodawer &
Vondrasek, 1998). With the development of new computational
drug design technologies and their use in connection with combinatorial chemistry, there promise to be many
more successes. Improved scoring functions, faster comput-
ers, and better database storage methods will aid in the pro-
cess. These improvements will be particularly relevant to the
areas of virtual library screening and quantitative structure-
activity relationships (QSAR). QSAR approaches, which are
used by all pharmaceutical companies, involve the statistical
analysis of a set of properties or descriptors for a series of bi-
ologically active molecules in order to predict the activity of
additional compounds. QSAR methods that also take into ac-
count available structural information on the protein, as well
as the ligands, are now being developed (So & Karplus, 1999), and represent a way of systematically taking into account
all information available for a given pharmaceutical
target to predict binding of new compounds. Expert systems
for organic synthesis, such as LHASA (Corey et al., 1992;
Long & Kappos, 1994) or WODCA (Fick et al., 1995), may
be used either to map out potential synthetic routes or possi-
bly to assess the ease or feasibility of synthesis for a set of
compounds. The construction of large virtual libraries based
on available chemistry or a set of existing combinatorial scaf-
folds and the use of the structure of a macromolecular target
to screen computationally will also be a major focus of future
drug discovery efforts.
Acknowledgments
The author thanks Juan C. Alvarez, Bert E. Thomas, and
Paul D. Lyne for helpful discussions and Erik Evensen,
Ryan Putzer, and Martin Karplus for allowing the discus-
sion of results prior to publication.
References
Ajay, & Murcko, M. A. (1995). Computational methods to predict binding free energy in ligand-receptor complexes. J Med Chem 38, 4953–4967.
Allen, K. N., Bellamacina, C. R., Ding, X. C., Jeffery, C. J., Mattos, C., Petsko, G. A., & Ringe, D. (1996). An experimental approach to mapping the binding surfaces of crystalline proteins. J Phys Chem 100, 2605–2611.
Andrade, M. A., & Sander, C. (1997). Bioinformatics: from genome data to biological knowledge. Curr Opin Biotechnol 8, 675–683.
Åqvist, J., Medina, C., & Samuelsson, J. E. (1994). New method for predicting binding-affinity in computer-aided drug design. Protein Eng 7, 385–391.
Bemis, G. W., & Murcko, M. A. (1996). The properties of known drugs. 1. Molecular frameworks. J Med Chem 39, 2887–2893.
Beveridge, D. L., & DiCapua, F. M. (1989). Free energy via molecular simulation: applications to chemical and biomolecular systems. Annu Rev Biophys Biophys Chem 18, 431–492.
Bohacek, R. S., & McMartin, C. (1994). Multiple highly diverse structures complementary to enzyme binding-sites: results of extensive application of a de-novo design method incorporating combinatorial growth. J Am Chem Soc 116, 5560–5571.
Bohacek, R. S., & McMartin, C. (1995). De-novo design of highly diverse structures complementary to enzyme binding-sites: application to thermolysin. Comput Aided Mol Des 589, 82–97.
Böhm, H. J. (1992). Ludi: rule-based automatic design of new substituents for enzyme-inhibitor leads. J Comput Aided Mol Des 6, 593–606.
Böhm, H. J. (1994a). The development of a simple empirical scoring function to estimate the binding constant for a protein ligand complex of known 3-dimensional structure. J Comput Aided Mol Des 8, 243–256.
Böhm, H. J. (1994b). On the use of Ludi to search the Fine Chemicals Directory for ligands of proteins of known 3-dimensional structure. J Comput Aided Mol Des 8, 623–632.
Böhm, H. J., & Klebe, G. (1996). What can we learn from molecular recognition in protein-ligand complexes for the design of new drugs. Angew Chem Int Ed Engl 35, 2588–2614.
Boobbyer, D., Goodford, P., McWhinnie, P., & Wade, R. (1989). New hydrogen-bond potentials for use in determining energetically favorable binding sites on molecules of known structure. J Med Chem 32, 1083–1094.
Boresch, S., & Karplus, M. (1995). The meaning of component analysis: decomposition of the free-energy in terms of specific interactions. J Mol Biol 254, 801–807.
Boresch, S., Archontis, G., & Karplus, M. (1994). Free-energy simulations: the meaning of the individual contributions from a component analysis. Proteins 20, 25–33.
Borman, S. (1997). Combinatorial chemistry. Chem Eng News 75, 43–53.
Brooks, B. R., Bruccoleri, R. E., Olafson, B. D., States, D. J., Swaminathan, S., & Karplus, M. (1983). CHARMM: a program for macromolecular energy, minimization, and dynamics calculations. J Comput Chem 4, 187–217.
Brooks, C. L., Karplus, M., & Pettitt, B. N. (1988). Proteins: A Theoretical Perspective of Dynamics, Structure, and Thermodynamics. New York: John Wiley & Sons.
Buzbee, B. (1993). Workstation clusters rise and shine (Computing in Science: Perspective). Science 261, 852–853.
Caflisch, A. (1996). Computational combinatorial ligand design: application to human alpha-thrombin. J Comput Aided Mol Des 10, 372–396.
Caflisch, A., & Karplus, M. (1995). Computational combinatorial chemistry for de novo ligand design: review and assessment. Perspect Drug Discov Des 3, 51–84.
Caflisch, A., Miranker, A., & Karplus, M. (1993). Multiple copy simultaneous search and construction of ligands in binding sites. J Med Chem 36, 2142–2167.
Chen, Q., Shafer, R. H., & Kuntz, I. D. (1997). Structure-based discovery of ligands targeted to the RNA double helix. Biochemistry 36, 11402–11407.
Cieplak, P., & Kollman, P. A. (1996). A technique to study molecular recognition in drug design: preliminary application of free energy derivatives to inhibition of a malarial cysteine protease. J Mol Recognit 9, 103–112.
Collins, F. S., Patrinos, A., Jordan, E., Chakravarti, A., Gesteland, R., Walters, L., Fearon, E., Hartwelt, L., Langley, C. H., Mathies, R. A., Olson, M., Pawson, A. J., Pollard, T., Williamson, A., Wold, B., Buetow, K., Branscomb, E., Capecchi, M., Church, G., Garner, H., Gibbs, R. A., Hawkins, T., Hodgson, K., Knotek, M., Meisler, M., Rubin, G. M., Smith, L. M., Smith, R. F., Westerfield, M., Clayton, E. W., Fisher, N. L., Lerman, C. E., McInerney, J. D., Nebo, W., Press, N., & Valle, D. (1998). New goals for the US Human Genome Project: 1998–2000. Science 282, 682–689.
Corey, E. J., Long, A. K., Lotto, G. I., & Rubenstein, S. D. (1992). Computer-assisted synthetic analysis: quantitative assessment of transform utilities. Recl Trav Chim Pays-Bas J R Nether Chem Soc 111, 304–309.
Cornell, W. D., Cieplak, P., Bayly, C. I., Gould, I. R., Merz, K. M., Ferguson, D. M., Spellmeyer, D. C., Fox, T., Caldwell, J. W., & Kollman, P. A. (1995). A 2nd generation force-field for the simulation of proteins, nucleic-acids, and organic-molecules. J Am Chem Soc 117, 5179–5197.
Couzin, J. (1998). Supercomputing: computer experts urge new federal initiative. Science 281, 762.
Davis, M. E., Madura, J. D., Luty, B. A., & McCammon, J. A. (1991). Electrostatics and diffusion of molecules in solution: simulations with the University-of-Houston-Brownian Dynamics program. Comput Phys Commun 62, 187–197.
DeWitte, R., & Shakhnovich, E. (1996). SMoG: de novo design method based on simple, fast, and accurate free energy estimates. 1. Methodology and supporting evidence. J Am Chem Soc 118, 11733–11744.
DeWitte, R., Ishchenko, A., & Shakhnovich, E. (1997). SMoG: de novo design method based on simple, fast, and accurate free energy estimates. 2. Case studies on molecular design. J Am Chem Soc 119, 4608–4617.
Drews, J. (1996). Genomic sciences and the medicine of tomorrow. Nat Biotechnol 14, 1516–1518.
Dunbrack, R. L., Gerloff, D. L., Bower, M., Chen, X. W., Lichtarge, O., & Cohen, F. E. (1997). Meeting review: The Second Meeting on the Critical Assessment of Techniques for Protein Structure Prediction (CASP2), Asilomar, California, December 13–16, 1996. Fold Des 2, R27–R42.
Eisen, M. B., Wiley, D. C., Karplus, M., & Hubbard, R. E. (1994). HOOK: a program for finding novel molecular architectures that satisfy the chemical and steric requirements of a macromolecule binding site. Proteins 19, 199–221.
Eisenberg, D., & McLachlan, A. D. (1986). Solvation energy in protein folding and binding. Nature 319, 199–203.
Eldridge, M. D., Murray, C. W., Auton, T. R., Paolini, G. V., & Mee, R. P. (1997). Empirical scoring functions. 1. The development of a fast empirical scoring function to estimate the binding affinity of ligands in receptor complexes. J Comput Aided Mol Des 11, 425–445.
Evensen, E., Joseph-McCarthy, D., & Karplus, M. (1997). MCSSv2. Cambridge: Harvard University.
Fedorov, A. A., Joseph-McCarthy, D., Fedorov, E., Sirakova, D., Graf, I., & Almo, S. C. (1996). Ionic interactions in crystalline bovine pancreatic ribonuclease A. Biochemistry 35, 15962–15979.
Fick, R., Ihlenfeldt, W. D., & Gasteiger, J. (1995). Computer-assisted design of syntheses for heterocyclic-compounds. Heterocycles 40, 993–1007.
Friedman, S. H., Ganapathi, P. S., Rubin, Y., & Kenyon, G. L. (1998). Optimizing the binding of fullerene inhibitors of the HIV-1 protease through predicted increases in hydrophobic desolvation. J Med Chem 41, 2424–2429.
Froloff, N., Windemuth, A., & Honig, B. (1997). On the calculation of binding free energies using continuum methods: application to MHC class I protein-peptide interactions. Protein Sci 6, 1293–1301.
Gao, J., Kuczera, K., Tidor, B., & Karplus, M. (1989). Hidden thermodynamics of mutant proteins: a molecular-dynamics analysis. Science 244, 1069–1072.
Gerber, P. R., Mark, A. E., & van Gunsteren, W. F. (1993). An approximate but efficient method to calculate free-energy trends by computer-simulation: application to dihydrofolate-reductase inhibitor complexes. J Comput Aided Mol Des 7, 305–323.
Gillet, V. J., Myatt, G., Zsoldos, Z., & Johnson, A. P. (1995). SPROUT, HIPPO and CAESA: tools for de novo structure generation and estimation of synthetic accessibility. Perspect Drug Discov Des 3, 34–50.
Gilson, M. K., Sharp, K. A., & Honig, B. H. (1988). Calculating the electrostatic potential of molecules in solution: method and error assessment. J Comput Chem 9, 327–335.
Gilson, M. K., Given, J. A., Bush, B. L., & McCammon, J. A. (1997a). The statistical-thermodynamic basis for computation of binding affinities: a critical review. Biophys J 72, 1047–1069.
Gilson, M. K., Given, J. A., & Head, M. S. (1997b). A new class of models for computing receptor-ligand binding affinities. Chem Biol 4, 87–92.
Goodford, P. (1985). A computational procedure for determining energetically favorable binding sites on biologically important macromolecules. J Med Chem 28, 849–857.
Goodsell, D. S., Morris, G. M., & Olson, A. J. (1996). Automated docking of flexible ligands: applications of AutoDock. J Mol Recognit 9, 1–5.
Grootenhuis, P. D. J., & Karplus, M. (1996). Functionality map analysis of the active site cleft of human thrombin. J Comput Aided Mol Des 10, 1–10.
Grootenhuis, P. D. J., & van Galen, P. J. M. (1995). Correlation of binding affinities with nonbonded interaction energies of thrombin-inhibitor complexes. Acta Crystallogr D 51, 560–566.
Grootenhuis, P. D. J., & Van Helden, S. P. (1994). Rational approaches towards protease inhibition: predicting the binding of thrombin inhibitors. In G. Wipff (Ed.), Computational Approaches in Supramolecular Chemistry (pp. 137–149). Dordrecht: Kluwer Academic Press.
Grootenhuis, P. D. J., Roe, D. C., Kollman, P. A., & Kuntz, I. D. (1994). Finding potential DNA-binding compounds by using molecular shape. J Comput Aided Mol Des 8, 731–750.
Gschwend, D. A., Sirawaraporn, W., Santi, D. V., & Kuntz, I. D. (1997). Specificity in structure-based drug design: Identification of a novel, selective inhibitor of Pneumocystis carinii dihydrofolate reductase. Proteins 29, 59–67.
Guo, Z., Brooks, C. L., & Kong, X. (1998). Efficient and flexible algorithm for free energy calculations using the lambda-dynamics approach. J Phys Chem B 102, 2032–2036.
Halgren, T. A. (1996). Merck Molecular Force Field. I. Basis, form, scope, parameterization, and performance of MMFF94. J Comput Chem 17, 490–519.
Hansson, T., Marelius, J., & Åqvist, J. (1998). Ligand binding affinity prediction by linear interaction energy methods. J Comput Aided Mol Des 12, 27–35.
Hoffman, L. R., Kuntz, I. D., & White, J. M. (1997). Structure-based identification of an inducer of the low-pH conformational change in the influenza virus hemagglutinin: irreversible inhibition of infectivity. J Virol 71, 8808–8820.
Holloway, M. K., Wai, J. M., Halgren, T. A., Fitzgerald, P. M. D., Vacca, J. P., Dorsey, B. D., Levin, R. B., Thompson, W. J., Chen, L. J., deSolms, S. J., Gaffin, N., Ghosh, A. K., Giuliani, E. A., Graham, S. L., Guare, J. P., Hungate, R. W., Lyle, T. A., Sanders, W. M., Tucker, T. J., Wiggins, M., Wiscount, C. M., Woltersdorf, O. W., Young, S. D., Darke, P. L., & Zugay, J. A. (1995). A priori prediction of activity for HIV-1 protease inhibitors employing energy minimization in the active site. J Med Chem 38, 305–317.
Houston, J. G., & Banks, M. (1997). The chemical-biological interface: developments in automated and miniaturised screening technology. Curr Opin Biotechnol 8, 734–740.
Jain, A. N. (1996). Scoring noncovalent protein-ligand interactions: a continuous differentiable function tuned to compute binding affinities. J Comput Aided Mol Des 10, 427–440.
Joseph-McCarthy, D., Fedorov, A. A., & Almo, S. C. (1996). Comparison of experimental and computational functional group mapping of an RNase A structure: implications for computer-aided drug design. Protein Eng 9, 773–780.
Joseph-McCarthy, D., Hogle, J. M., & Karplus, M. (1997). Use of the multiple copy simultaneous search (MCSS) method to design a new class of picornavirus capsid binding drugs. Proteins 29, 32–58.
Kingsbury, D. T. (1997). Bioinformatics in drug discovery. Drug Dev Res 41, 120–128.
Kollman, P. (1993). Free energy calculations: applications to chemical and biochemical phenomena. Chem Rev 93, 2395–2417.
Koppensteiner, W. A., & Sippl, M. J. (1998). Knowledge-based potentials: back to the roots. Biochemistry Moscow 63, 247–252.
Kramer, B., Rarey, M., & Lengauer, T. (1997). CASP2 experiences with docking flexible ligands using FLEXX. Proteins (suppl. 1), 221–225.
Kuntz, I. (1992). Structure-based strategies for drug design and discovery. Science 257, 1078–1082.
Kuntz, I. D., Blaney, J. M., Oatley, S. J., Langridge, R., & Ferrin, T. E. (1982). A geometric approach to macromolecule-ligand interactions. J Mol Biol 161, 269–288.
Kurinov, I. V., & Harrison, R. W. (1994). Prediction of new serine proteinase inhibitors. Nature Struct Biol 1, 735–743.
Lauri, G., & Bartlett, P. A. (1994). Caveat: a program to facilitate the design of organic-molecules. J Comput Aided Mol Des 8, 51–66.
Lazaridis, T., & Karplus, M. (1999). Effective energy function for proteins in solution. Proteins 35, 133–152.
Leach, A. R. (1996). Molecular modelling: principles and applications. Essex: Addison Wesley Longman Ltd.
Li, R. S., Chen, X. W., Gong, B. Q., Selzer, P. M., Li, Z., Davidson, E.,Kurzban, G., Miller, R. E., Nuzum, E. O., McKerrow, J. H., Fletterick,
R. J., Gillmor, S. A., Craik, C. S., Kuntz, I. D., Cohen, F. E., &
Kenyon, G. L. (1996). Structure-based design of parasitic protease in-
hibitors.Bioorg Med Chem 4, 14211427.
Lii, J. H., & Allinger, N. L. (1991). The MM3 force-field for amides,
polypeptides and proteins.J Comput Chem 12, 186199.
Liu, H. Y., Mark, A. E., & vanGunsteren, W. F. (1996). Estimating the rel-
ative free energy of different molecular states with respect to a single
reference state.J Phys Chem 100, 94859494.
Liu, Y. P., Kim, K., Berne, B. J., Friesner, R. A., & Rick, S. W. (1998).
Constructing ab initio force fields for molecular dynamics simulations.
J Chem Phys 108, 47394755.
Long, A. K., & Kappos, J. C. (1994). Computer-assisted synthetic analy-
sisperformance of tactical combinations of transforms. J Chem Inf
Comput Sci 34, 915921.
Lorber, D. M., & Shoichet, B. K. (1998). Flexible ligand docking using
conformational ensembles. Protein Sci 7, 938950.
MacKerell, A. D., Bashford, D., Bellott, M., Dunbrack, R. L., Evanseck,
J. D., Field, M. J., Fischer, S., Gao, J., Guo, H., Ha, S., Joseph-McCar-
thy, D., Kuchnir, L., Kuczera, K., Lau, F. T. K., Mattos, C., Michnick,
S., Ngo, T., Nguyen, D. T., Prodhom, B., Reiher, W. E., Roux, B.,
Schlenkrich, M., Smith, J. C., Stote, R., Straub, J., Watanabe, M.,
Wiorkiewicz-Kuczera, J., Yin, D., & Karplus, M. (1998). All-atom em-
pirical potential for molecular modeling and dynamics studies of pro-
teins.J Phys Chem B 102, 35863616.
Madura, J. D., Briggs, J. M., Wade, R. C., Davis, M. E., Luty, B. A., Ilin,
A., Antosiewicz, J., Gilson, M. K., Bagheri, B., Scott, L. R., & Mc-
Cammon, J. A. (1995). Electrostatics and diffusion of molecules in so-
lutionsimulations with the University-of-Houston Brownian Dynam-
ics program. Comput Phys Commun 91, 5795.
Makino, S., & Kuntz, I. D. (1997). Automated flexible ligand docking
method and its application for database search. J Comput Chem 18,
18121825.
Marelius, J., Hansson, T., & Aqvist, J. (1998). Calculation of ligand bind-
ing free energies from molecular dynamics simulations.Int J Quantum
Chem 69, 7788.
Maxwell, D. S., Tirado-Rives, J., & Jorgensen, W. L. (1995). A
comprehensive study of the rotational energy profiles of organic-systems by
ab-initio MO theory, forming a basis for peptide torsional parameters. J
Comput Chem 16, 984-1010.
McMartin, C., & Bohacek, R. S. (1997). QXP: Powerful, rapid computer
algorithms for structure-based drug design. J Comput Aided Mol Des
11, 333-344.
Meng, E. C., Shoichet, B. K., & Kuntz, I. D. (1992). Automated docking
with grid-based energy evaluation. J Comput Chem 13, 505-524.
Miranker, A., & Karplus, M. (1991). Functionality maps of binding sites: a
multiple copy simultaneous search method. Proteins 11, 29-34.
Miranker, A., & Karplus, M. (1995). An automated method for dynamic
ligand design. Proteins 23, 472-490.
Moon, J., & Howe, W. (1991). Computer design of bioactive molecules: a
method for receptor-based de novo ligand design. Proteins 11, 314-328.
Morris, G. M., Goodsell, D. S., Huey, R., & Olson, A. J. (1996).
Distributed automated docking of flexible ligands to proteins: parallel
applications of AutoDock 2.4. J Comput Aided Mol Des 10, 293-304.
Neria, E., Fischer, S., & Karplus, M. (1996). Simulation of activation
energies in molecular systems. J Chem Phys 105, 1902-1921.
Nicholls, A., & Honig, B. (1991). A rapid finite-difference algorithm,
utilizing successive over-relaxation to solve the Poisson-Boltzmann
equation. J Comput Chem 12, 435-445.
Novotny, J., Bruccoleri, R. E., Davis, M., & Sharp, K. A. (1997). Empirical
free energy calculations: a blind test and further improvements to the
method. J Mol Biol 268, 401-411.
Onuchic, J. N., Luthey-Schulten, Z., & Wolynes, P. G. (1997). Theory of
protein folding: the energy landscape perspective. Annu Rev Phys
Chem 48, 545-600.
Ooi, W., Oobataki, M., Nemethy, G., & Scheraga, H. A. (1987). Accessible
surface areas as a measure of the thermodynamic parameters of hydration
of peptides. Proc Natl Acad Sci USA 84, 3086-3090.
Pearlman, D. A. (1994). A comparison of alternative approaches to free
energy calculations. J Phys Chem 98, 1487-1493.
Pearlman, D. A., & Murcko, M. (1996). CONCERTS: dynamic connection
of fragments as an approach to de novo ligand design. J Med Chem 39,
1651-1663.
Pearlman, R. S. (1987). Rapid generation of high quality approximate 3D
molecular structures. Chem Des Aut News 2, 1-6.
Petsko, G. A. (1996). For medicinal purposes. Nature 384, 7-9.
Rarey, M., Wefing, S., & Lengauer, T. (1996). Placement of medium-sized
molecular fragments into active sites of proteins. J Comput Aided Mol
Des 10, 41-54.
Ricketts, E. M., Bradshaw, J., Hann, M., Hayes, F., Tanna, N., & Ricketts,
D. M. (1993). Comparison of conformations of small-molecule structures
from the protein data-bank with those generated by Concord, Cobra,
Chemdbs-3d, and Converter and those extracted from the Cambridge
Structural Database. J Chem Inf Comput Sci 33, 905-925.
Ring, C., Sun, E., McKerrow, J., Lee, G., Rosenthal, P., Kuntz, I., &
Cohen, F. (1993). Structure-based inhibitor design by using protein
models for the development of antiparasitic agents. Proc Natl Acad Sci USA
90, 3583-3587.
Rose, J. R., & Craik, C. S. (1994). Structure-assisted design of nonpeptide
human immunodeficiency virus-1 protease inhibitors. Am J Respir Crit
Care Med 150, S176-S182.
Rost, B. (1998). Marrying structure and genomics. Structure 6, 259-263.
Rotstein, S. H., & Murcko, M. A. (1993a). Genstar: a method for de novo
drug design. J Comput Aided Mol Des 7, 23-43.
Rotstein, S. H., & Murcko, M. A. (1993b). Groupbuild: a fragment-based
method for de novo drug design. J Med Chem 36, 1700-1710.
Sharp, K. A., & Honig, B. (1990). Electrostatic interactions in
macromolecules: theory and applications. Annu Rev Biophys Biophys Chem 19,
301-332.
Shoichet, B. K., Stroud, R. M., Santi, D. V., Kuntz, I. D., & Perry, K. M.
(1993). Structure-based discovery of inhibitors of thymidylate
synthase. Science 259, 1445-1450.
Shoichet, B. K., Leach, A. R., & Kuntz, I. D. (1999). Ligand solvation in
molecular docking. Proteins 34, 4-16.
Shuker, S. B., Hajduk, P. J., Meadows, R. P., & Fesik, S. W. (1996).
Discovering high-affinity ligands for proteins: SAR by NMR. Science 274,
1531-1534.
Simonson, T., Archontis, G., & Karplus, M. (1997). Continuum treatment
of long-range interactions in free energy calculations. Application to
protein-ligand binding. J Phys Chem B 101, 8349-8362.
So, S. S., & Karplus, M. (1999). A comparative study of ligand-receptor
complex binding affinity prediction methods based on glycogen
phosphorylase inhibitors. J Comput Aided Mol Des 13, 243-258.
Sprague, P. W. (1995). Automated chemical hypothesis generation and
database searching with Catalyst. Perspect Drug Discov Des 3, 1-20.
Straatsma, T. P., & McCammon, J. A. (1992). Computational alchemy.
Annu Rev Phys Chem 43, 407-435.
Thomas, B. E., IV, Joseph-McCarthy, D., & Alvarez, J. C. (1999).
Pharmacophore-based molecular docking. In O. F. Guner (Ed.),
Pharmacophore perception, development, and use in drug design. La Jolla:
International University Press, in press.
Tidor, B. (1993). Simulated annealing on free-energy surfaces by a
combined molecular-dynamics and Monte-Carlo approach. J Phys Chem
97, 1069-1073.
Verlinde, C. L. M. J., & Hol, W. G. J. (1994). Structure-based drug design:
progress, results and challenges. Structure 2, 577-587.
von Itzstein, M., Dyason, J. C., Oliver, S. W., White, H. F., Wu, W. Y.,
Kok, G. B., & Pegg, M. S. (1996). A study of the active site of
influenza virus sialidase: an approach to the rational design of novel
anti-influenza drugs. J Med Chem 39, 388-391.
Wade, R., & Goodford, P. (1993). Further development of hydrogen bond
functions for use in determining energetically favorable binding sites
on molecules of known structure. 2. Ligand probe groups with the
ability to form more than two hydrogen bonds. J Med Chem 36, 148-156.
Wade, R., Clark, K., & Goodford, P. (1993). Further development of
hydrogen bond functions for use in determining energetically favorable
binding sites on molecules of known structure. 1. Ligand probe groups
with the ability to form two hydrogen bonds. J Med Chem 36, 140-147.
Welch, W., Ruppert, J., & Jain, A. N. (1996). Hammerhead: fast, fully
automated docking of flexible ligands to protein binding sites. Chem Biol
3, 449-462.
Westhead, D. R., & Thornton, J. M. (1998). Protein structure prediction.
Curr Opin Biotechnol 9, 383-389.
Wilson, E. K. (1997). Combinatorial chemistry. Chem Eng News 75, 24-25.
Wlodawer, A., & Vondrasek, J. (1998). Inhibitors of HIV-1 protease: a
major success of structure-assisted drug design. Annu Rev Biophys Biomol
Struct 27, 249-284.
Xu, G. Y., McDonagh, T., Yu, H. A., Nalefski, E. A., Clark, J. D., &
Cumming, D. A. (1998). Solution structure and membrane interactions of the
C2 domain of cytosolic phospholipase A(2). J Mol Biol 280, 485-500.