Molecular Modeling The compendium of methods for mimicking the behavior of molecules or molecular...
-
Upload
ethan-marshall -
Category
Documents
-
view
214 -
download
0
Transcript of Molecular Modeling The compendium of methods for mimicking the behavior of molecules or molecular...
Molecular Modeling
The compendium of methods for mimicking the behavior of molecules or molecular systems
Points for Consideration
Remember• Molecular modeling forms a model of the real world• Thus we are studying the model, not the world• A model is valid as long as it reproduces the real world
Why Use Molecular Modeling?(and not deal directly with the real world?)
Fast, accurate and relatively cheap way to:• Study molecular properties• Rationalize and interpret experimental results• Make predictions for yet unstudied systems• Study hypothetical systems• Design new molecules
And• Understand
reactants products?
probe
Some Molecular PropertiesEnergy
• Free energy (G)• Enthalpy (H)• Entropy (S)• Steric energy (G)
Molecular Spectroscopy• NMR (Nuclear Magnetic
Resonance)• IR (Infra Red)• UV (Ultra Violate)• MW (Microwave)
Kinetics• Reaction Mechanisms• Rate constants
Electronic Properties• Molecular orbitals• Charge Distributions• Dipole Moments
3D Structure• Distances• Angles• Torsions
Molecular Structure: Saccharin
C7H5NO3SA single 1D structure
S
N
O
O O
HA single 2D structure
Many 3D structures
Descriptors1D: e.g., Molecular weight2D: e.g., # of rotatable bonds3D: e.g., Molecular volume
StructureProperty
ActivityCell PermeabilityToxicity
Molecular Structure and Molecular Properties
Molecular mechanics
• Ball and spring description of molecules• Better representation of equilibrium geometries than
plastic models• Able to compute relative strain energies• Cheap to compute• Lots of empirical parameters that have to be carefully
tested and calibrated• Limited to equilibrium geometries• Does not take electronic interactions into account • No information on properties or reactivity• Cannot readily handle reactions involving the making
and breaking of bonds
Energy as a function of geometry• Polyatomic molecule:
– N-degrees of freedom– N-dimensional potential energy surface
http://www.chem.wayne.edu/~hbs/chm6440/PES.html
A Molecule is a Collection of Atoms Held Together by ForcesIntuitively forces act between bonded atomsForces act to return structural parameters to their equilibrium values
stretch
bend torsion
And More Forces…
Forces also act between non-bonded atomsCross terms couple the different types of interactions
Stretch-bend Non-bonded
Forces are Described by Potential Energy Functions
20 )(
2)( llk
lv
20 )(
2)( kv )]cos(1(
2)(
0
nV
vN
n
n
And More Functions…
))](()[(2
),,( 00,220,112121 llll
kllv ll
N
i
N
ij
ij
ji
ij
ij
ij
ijij
r
rrv
1 1
0
612
4
4
A Force Field is a Collection of Potential Energy Functions
A force field is defined by the functional forms of the energy functions and by the values of their parameters.
))cos(1(2
)(2
)(2
)( 20,
20, n
Vkll
krV
torsions
nii
angles
iii
bonds
iN
termscross4
41 1 0
612
N
i
N
ij ij
ji
ij
ij
ij
ijij r
rr
Molecular Mechanics Energy (Steric Energy) is Calculated from the Force
FieldA molecular mechanics program will return an energy value for every conformation of the system.
Steric energy is the energy of the system relative to a reference point. This reference point depends on the bonded interactions and is both force field dependent and molecule dependent.
Thus, steric energy can only be used to compare the relative stabilities of different conformations of the same molecule and can not be used to compare the relative stabilities of different molecules.
Further, all conformational energies must be calculated with the same force field.
Steric Energy
Steric Energy = Estretch + Ebend + Etorsion + EVdW + Eelectrostatic +
Estretch-bend + Etorsion-stretch + …
Estretch Stretch energy (over all bonds)Ebend Bending energy (over all angles)Etorsion Torsional (dihedral) energy (over all dihedral angles)EVdW Van Der Waals energy (over all atom pairs > 1,3)Eelectrostatic Electrostatic energy (over all charged atom pairs >1,3)Estretch-bend Stretch-bend energyEtorsion-stretch Torsion-stretch energy
EVdW + Eelectrostatic are often referred to as non-bonded energies
Force Field and Potential Energy Surface
A force field defines for each molecule a unique PES.Each point on the PES represents a molecular conformation characterized by its structure and energy.Energy is a function of the coordinates.Coordinates are function of the energy.
ener
gy
coordinates
CH3
CH3
CH3
Moving on (Sampling) the PES
Each point on the PES is represents a molecular conformation characterized by its structure and energy.
By sampling the PES we can derive molecular properties.
Sampling energy minima only (energy minimization) will lead to molecular properties reflecting the enthalpy only.
Sampling the entire PES (molecular simulations) will lead to molecular properties reflecting the free energy.
In both cases, molecular properties will be derived from the PES.
Force Fields: General Features
Force field definition• Functional form (usually a compromise between accuracy and
ease of calculation.• Parameters (transferability assumed).
Force fields are empirical• There is no “correct” form of a force field.• Force fields are evaluated based solely on their performance.
Force field are parameterized for specific properties• Structural properties• Energy• Spectra
Force Fields: Atom Types
C
CMM2 type 2
O
C MM2 type 3
In molecular mechanics atoms are given types - there are often several types for each element.
Atom types depend on:• Atomic number (e.g., C, N, O, H).• Hybridization (e.g., SP3, SP2, SP).• Environment (e.g., cyclopropane, cyclobutane).
“Transferability” is assumed - for example that a C=O bond will behave more or less the same in all molecules.
A General Force Field Calculation
Input• Atom Types• Starting geometry• Connectivity
Energy minimization / geometry optimizationCalculate molecular properties at final geometryOutput
• Molecular structure• Molecular energy• Dipole moments• etc. etc. etc.
A Simple Molecular Mechanics Force Field Calculation
))cos(1(2
)(2
)(2
)( 20,
20, n
Vkll
krV
torsions
nii
angles
iii
bonds
iN
ij
ji
ij
ij
ij
ijN
i
N
ijij r
rr 0
612
1 1 44
Torsions• H-C-C-H x 12• H-C-C-C x 6
Non-bonded• H-H x 21• H-C x 6
Bonds• C-C x 2• C-H x 8
Angles• C-C-C x 1• C-C-H x 10• H-C-H x 7
Bond Stretching
20 )]}(exp[1{)( llaDelv
Dea 2
k
Morse potential
E
r
• Accurate.• Computationally inefficient (form, parameters).• Catastrophic bond elongation.
De = Depth of the potential energy minimuml0 = Equilibrium bond length (v(l)=0)= Frequency of the bond vibration = Reduced massk = Stretch constant
Bond Stretching
• Inaccurate.• Coincides with the Morse potential at the bottom of the well.• Computationally efficient.
20 )(
2)( llk
lv
Harmonic potential (AMBER)
r
E
Bond Stretching
30
20 )(
22
)(21
)( llk
llk
lv
Cubic (MM2) and quadratic (MM3) potentials
40
30
20 )(
23
)(22
)(21
)( llk
llk
llk
lv
harmoniccubicquadratic
Bond Stretching Parameters (MM2)
Bond l0 (A) k (kcal mol-1 A-2)
Csp3-Csp3 1.523 317
Csp3-Csp2 1.497 317
Csp2=Csp2 1.337 690
Csp2=O 1.208 777
Csp3-Nsp3 1.438 367C-N (amide) 1.345 719
Hard mode.Bond types correlate with l0 and k values.A 0.2Å deviation from l0 when k=300 leads to an energy increase of 12 kcal/mol.
ener
gy
angle
Angle Bending
MM3MM2AMBER
20 )(
2)( kvAMBER:
60
20 )(
2
2)(
2
1)( kk
vMM2:
60
50
40
30
20 )(
25
)(24
)(23
)(22
)(21
)( kkkkkvMM3:
Angle Bending Parameters (MM2)
Hard mode (softer than bond stretching)
Angle 0 k (kcal mol-1 deg-1)
Csp3-Csp3-Csp3 109.47 0.0099
Csp3-Csp3-H 109.47 0.0079
H-Csp3-H 109.47 0.007
Csp3-Csp2-Csp3 117.2 0.0099
Csp3-Csp2=Csp2 121.4 0.0121
Csp3-Csp2=O 122.5 0.0101
Torsional (Dihedral) Terms
Reflect the existence of barriers to rotation around chemical bonds.
Used to set the relative energies of the rotational minima and maxima.
Together with the non-bonded terms are responsible for most of the structural and energetic changes (soft mode).
Usually parameterized last.
Soft mode.
General Functional Form)]cos(1(
2)(
0
nV
vN
n
n
: Torsional angle.n (multiplicity): Number of minima in a 360º cycle.Vn: Correlates with the barrier height.(phase factor): Determines where the torsion passes through its minimum value.
Vn = 4, n = 2, = 180
0
0.5
1
1.5
2
2.5
3
3.5
4
4.5
0 60 120 180 240 300 360
Torsion angle
To
rsio
n e
ne
rgy
Vn = 2, n = 3, = 60
Functional Form: AMBER
)]cos(1(2
)(0
nV
vN
n
n
Preference to single terms.Usage of general torsional parameters (i.e., torsional potential solely depends on the two central atoms of the torsion).
• C-C-C-C = O-C-C-C = O-C-C-N = …
Example: Butane
# Type V1 V2 V31 C-C-C–C 0.200 0.270 0.0934 C-C-C–H 0.000 0.000 0.2674 H-C-C–H 0.000 0.000 0.237
The barrier to rotation around the C-C bond in butane is ~20kJ/mol.All 9 torsional interactions around the central C-C bond should be considered for an appropriate reproduction of the torsional barrier.
Out-of-Plane BendingAllows for non-planarity when required (e.g., cyclobutanone).Prevents inversion about chiral centers (e.g., for united atoms). Induces planarity when required.
United AtomsAbsorb non-polar hydrogen atoms into respective carbons.Greatly reduces computational cost.
O
O
correct
Out-of-Plane Bending
i
j
k
l
j
k
i
l
Wilson Angle
Improper torsion
Pyramidal Distance
20 )()( kv
20 )()( ddkdv
)]cos(1[)( 0 nkv
Between ijk plane and i-l bond
1-2-3-4 torsion
Between atom i and jkl plane
Cross Terms: Stretch-BendCoupling between internal coordinates.Important for reproducing structures of unusual (e.g., highly strained ) systems and of vibrational spectra.
Stretch-Bend: As a bond angle decreases, the adjacent bonds stretch to reduce the interaction between the 1,3 atoms.
))](()[(2
),,( 00,220,112121 llll
kllv ll
Cross Terms: Stretch-Torsion
Stretch-Torsion: For an A-B-C-D torsion, the central B-C bond elongates in response to eclipsing of the A-B and C-D terminal bonds.
nllklv cos)(),( 0
Non-Bonded Interactions
Operate within molecules and between molecules.
Through space interactions.
Modeled as a function of an inverse power of the distance.
Soft mode.
Divided into:• Electrostatic interactions.• VdW interactions.
Electrostatic Interactions
A
i
B
j ij
jilarintermolec r
qqV
1 1 04
A
i
A
ij ij
jilarintramolec r
qqV
1 1 04
qi, qj are point charges.Charge-charge interactions are long ranged (decay as r-1).When qi, qj are centered on the nuclei they are referred to as partial atomic charges.
• Fit to known electric moments (e.g., dipole, quadrupole etc.)• Fit to thermodynamic properties.• Ab initio Calculations
– Electrostatic potential
Rapid Methods for Calculating Atomic Charges
Partial Equalization of Orbital Electronegativity (Gasteiger and Marsili)
http://www2.chemie.uni-erlangen.de/software/petra/manual/manual.pdf
Electronegativity (Pauling): “The power of an atom to attract an electron to itself”
Orbital electronegativity depends upon:• Valence state (e.g., sp > sp3).• Occupancy (e.g., empty > single > double).• Charges in other orbitals.
Electrons flow from the less electronegative atoms to the more electronegative atoms thereby equalizing the electronegativities.
For the dependency of orbital electronegativity on charge assume:
Theoretically correct for orbitals but formally applied to atoms.Values of a, b and c were obtained for common elements in their usual valence states (e.g., for atom types).
Apply an interactive process:• Assign each atom its formal charge.• Calculate atomic electronegativities based on above assumption.• Calculate the electron charge transferred from atom A to the
more electronegative atom B bonded to it by:
Partial Equalization of Orbital Electronegativity
2AAAAAA QcQba
k
A
kA
kBkQ
)()(
)(Dumping factor
Cation Electronegativity of the less electronegative atom
Solvent Dielectric Models
A
i
A
ij ij
ji
r
qqV
1 1 04
0 = Dielectric constant of vacuum (1).For a given set of charges and distance, 0 determines the strength of the electrostatic interactions.Solvent effect dampen the electrostatic interactions and so can be modeled by varying 0: eff = 0r
r(protein interior) = 2-4 r(water) = 80
Solvent Dielectric Models: Distance Dependence
A
i
A
ij ij
jieffeff r
qqVrr
1 12
04)(
rSrreff erSrS ]22)[(
21 2
Smooth increase in eff form 1 to r as the distance increases
Dielectric constant increases as a function of distance
distance
eff
eff varies from 1 at zero separation to the bulk permitivity of the solvent at large separation. As the separation between the interacting atoms increases, more solvent can “penetrate” between them.
Van Der Waals (VdW) Interactions
Electrostatic interactions can’t account for all non-bonded interactions within a system (e.g., rare gases).
VdW interactions:• Attractive (dispersive) contribution (London forces)
– Instantaneous dipoles due to electron cloud fluctuations.– Decays as r6.
• Repulsive contribution– Nuclei repulsion.– At short distances (r < 1) rises as 1/r.– At large distances decays as exp(-2r/a0); a0 the Bohr radius.
Van Der Waals (VdW) Interactions
The observed VdW potential results from a balance between the attractive and repulsive forces.
Modeling VdW Interactions
radiiVdW of sum2 61 mr
})(2){()( 612 rrrrrv mm 612)( rCrArv 1212 4 mrA 66 42 mrC
Alternative forms
separationen
ergy
rm
612
4)(rr
rv
Lennard-Jones Potential
• Rapid to calculate• Attractive part theoretically sound• Repulsive part easy to calculate but
too steep
Hydrogen Bonding
H-bond geometry independent:
1012)(rC
rA
rv
• r is the distance between the H-bond donor and H-bond acceptor.
H-bond geometry dependent (co-linearity with lp preferred):
LPAccHAccHDonAccHAccH
HB rA
rA
v ......4
......2
10...
12...
coscos
N
H
OC DonAcc
Force Field ParameterizationChoosing values for the parameters in the potential function equations to best reproduce experimental data.
Parameterization techniques• Trial and error• Least square methods
Types of parameters• Stretch: natural bond length (l0) and force constants (k).• Bend: natural bond angles (0) and force constants (k).• Torsions: Vi’s.• VdW: (, VdW radii).• Electrostatic: Partial atomic charges.• Cross-terms: Cross term parameters.
Parameters - Quantity
For N atom types require:• N non-bonded parameters• N*N stretch parameters• N*N*N bend parameters• N*N*N*N torsion parameters
ExampleMacroModel MM2*: 39 atom types
Required ActualStretchBendTorsion
152158,3192,313,441
164357508
Parameters - Source
Experiment: geometries and non-bonded parameters • X-Ray crystallography• Electron diffraction• Microwave spectroscopy• Lattice energies
Advantages• Real
Disadvantages• Hard to obtain• Non-uniform• Limited availability
A force field parameterized according to data from one source (e.g., experimental gas phase, experimental solid phase, ab initio) will fit data from other sources only qualitatively.
Parameterization - Example
0.00
1.00
2.00
3.00
4.00
5.00
0 90 180 270 360torsion
En
ergy
(k
cal/m
ol)
0.00
1.00
2.00
3.00
4.00
5.00
0 90 180 270 360torsion
En
ergy
(k
cal/m
ol)
relative energy (kcal/mol)
ab initio -0.69 ABMER -1.25parameterized AMBER -0.68
O
OH
O
O
H
Transferability of ParametersAssumption: The same set of parameters can be used to model a related series of compounds:
• Prediction• Missing parameters
– Educated guess– Parameters derived from atomic properties (e.g., UFF)
• Reducing number of parameters– Generalized torsions (e.g., depend only on two central atoms)– Generalized VdW parameters (same for SP, SP2, SP3)
Above assumption breaks down for close interaction function groups:
OO O
X
Existing Force FieldsAMBER (Assisted Model Building with Energy Refinement)
• Parameterized specifically for proteins and nucleic acids.• Uses only 5 bonding and non-bonding terms along with a
sophisticated electrostatic treatment.• No cross terms are included.• Results can be very good for proteins and nucleic acids, less so
for other systems.
CHARMM (Chemistry at Harvard Macromolecular Mechanics)• Originally devised for proteins and nucleic acids.• Now used for a range of macromolecules, molecular dynamics,
solvation, crystal packing, vibrational analysis and QM/MM studies.
• Uses 5 valence terms, one of which is electrostatic term.• Basis for other force fields (e.g., MOIL).
Existing Force FieldsGROMOS (Gronigen molecular simulation)
• Popular for predicting the dynamical motion of molecules and bulk liquids.
• Also used for modeling biomolecules.• Uses 5 valence terms, one of which is an electrostatic term.
MM1, 2, 3, 4• General purpose force fields for (mono-functional) organic
molecules.• MM2 was parameterized for a lot of functional groups.• MM3 is probably one of the most accurate ways of modeling
hydrocarbons.• MM4 is very new and little is known about its performance.• The force fields use 5 to 6 valence terms, one of which is an
electrostatic term and one to nine cross terms.
Existing Force FieldsMMFF (Merck Molecular Force Field)
• General purpose force fields mainly for organic molecules.• MMFF94 was originally designed for molecular dynamics
simulations but is also widely used for geometry optimization.• Uses 5 valence terms, one of which is an electrostatic term and
one cross term.• MMFF was parameterized based on high level ab initio
calculations.
OPLS (Optimized Potential for Liquid Simulations)• Designed for modeling bulk liquids.• Has been extensively used for modeling the molecular dynamics
of biomolecules.• Uses 5 valence terms, one of which is an electrostatic term but no
cross terms.
Existing Force FieldsTripos (SYBYL force field)
• Designed for modeling organic and biomolecules.• Often used for CoMFA analysis (QSAR method).• Uses 5 valence terms, one of which is an electrostatic term.
MOMEC• A force field for describing transition metals coordination
compounds.• Originally parameterized to use 4 valence terms but not an
electrostatic term.• Metal-ligand interactions consist of bond stretch only.• Coordination sphere is maintained by non-bonding interactions
between ligands.• Work reasonable well for octahedrally coordinated compounds.
Existing Force FieldsUFF (Universal Force Field)
• Designed for coverage of the entire periodic table.• All force field parameters are atomic based.• Parameters for specific interactions are derived from atomic
parameters based on a series of mixing rules, thereby addressing the problem of parameter explosion.
• Uses 4 valence terms but not an electrostatic term.
YATI• Designed for the accurate representation of non-bonded
interactions.• Most often used for modeling interactions between biomolecules
and small substrate molecules.
Existing Force Fields
CVFF (Consistent Valence Force Field)• Parameterized for small organic (amides, carboxylic acids, etc.)
crystals and gas phase structures. • Handles peptides, proteins, and a wide range of organic
systems. • Primarily intended for studies of structures and binding
energies, although it predicts vibrational frequencies and conformational energies reasonably well.
Existing Force FieldsCFF, CFF91, CFF95 (Consistent Force Field)
• Parameterized (based on quantum mechanics calculations and molecular simulations) for acetals, acids, alcohols, alkanes, alkenes, amides, amines, aromatics, esters, and ethers.
• Useful for predicting:– Small molecules: gas-phase geometries, vibrational
frequencies, conformational energies, torsion barriers, crystal structures.
– Liquids: cohesive energy densities; for crystals: lattice parameters, RMS atomic coordinates, sublimation energies.
– Macromolecules: protein crystal structures, polycarbonates and polysaccharides (CFF91,95).
Existing Force Fields
Dreiding• Parameters derived by a rule based approach.• Good coverage of the periodic table.• Good coverage for organic, biological and main-group inorganic
molecules. • Only moderately accurate for geometries, conformational energies,
intermolecular binding energies, and crystal packing.
IntroductionA system of N atoms is defined by 3N Cartesian coordinates or 3N-6 internal coordinates. These define a multi-dimensional potential energy surface (PES).
A PES is characterized by stationary points:
• Minima (stable conformations)• Maxima• Saddle points (transition states)
ener
gy
coordinates
Classification of Stationary Points
0.0
4.0
8.0
12.0
16.0
20.0
0 90 180 270 360
transition state
local minimum
global minimum
ener
gy
coordinate
TypeMinimum MaximumSaddle point
1st Derivative000
2nd Derivative*positivenegative
1 negative
* Refers to the eigenvalues of the second derivatives (Hessian) matrix
Population of Minima
Most minimization method can only go downhill and so locate the closest (downhill sense) minimum.No minimization method can guarantee the location of the global energy minimum.No method has proven the best for all problems.
Global minimum
Most populated minimum
Active Structure
Sequential Univariate MethodFor each coordinate:
• Generate two new points (xi+xi, xi+2xi) and calculate their energies.
• Fit a parabola to xi (1), xi+xi (2), xi+2xi (3) and locate its minimum (4).
• Set the coordinate of xi to the parabola minimum.• When the change in all coordinates is small enough, stop.
This method requires fewer function evaluations than the Simplex method but is still slow to converge.
SummarySimplex, Sequential Univariate (memory requirements scale as N)
• Robust (i.e., can be initiated from poor starting geometries).• Slow to converge even on quadratic surfaces.• Require many function evaluations (especially Simplex).
Steepest Descent (memory requirements scale as 3N)• Robust.• Convergence guaranteed on quadratic surfaces.• Slow to converge especially near the minimum.
Conjugate gradient, Quasi NR (memory requirements scale as (3N)2)• Converges in approximately N steps for N degrees of freedom.• Convergence guaranteed on quadratic surfaces.• Unstable away from the minimum (i.e., when surface not quadratic).• CG is the method of choice for large systems.
Newton-Raphson (memory requirements scale as (3N)2)• Converges in one step on quadratic surfaces.• Unstable away from the minimum.• Computationally expensive.