Introduction to Classical Molecular Dynamics

Post on 25-Nov-2021

4 views 0 download

Transcript of Introduction to Classical Molecular Dynamics

Introduction to Classical Molecular Dynamics Alex de Vries, MD group

Introduction: material properties & simulations What goes in: Positions, boundary conditions, potentials, forces What comes out: Time-series of positions (and more...) What do we get out of it: Distributions! Introduction of the simulation software GROMACS

› Atte Sillanpää • and all CSC staff for organizing this workshop

› Sander Pronk and Justin Lemkul • GROMACS Basic & Intermediate Exercises

› Tsjerk Wassenaar • facilitating multiscaling and back-mapping • GROMACS Advanced Exercise

› Molecular Dynamics Group Groningen

Acknowledgements

Biomolecular simulation: really?!?

› Lipid bilayer phase transformation: liquid crystal to ripple (gel) phasede Vries et al. P.N.A.S. 102, 5392 (2005)

› Matter is characterized by • Structure

• Spatial relationships between building blocks • ORGANIZATION

• Dynamics • Temporal behavior of structure • FLUCTUATIONS • RESPONSE

Material Properties

› All properties really are dynamic • On (very) different time-scales • Motion of the particles involved determines the time-

scales

Time-scales

› Collect statistics to characterize structure • From dynamical simulation

Time-trace and Distribution

› Collect statistics to characterize structure • For all kinds of structural elements - YOU CHOOSE!

Time-trace and Distribution

› Collect statistics to characterize structure • Time-scale influences CONVERGENCE

Time-trace and Distribution

Ingredients for Molecular Modeling

Van Gunsteren et al. Angew. Chem. Int. Ed. 45, 4064 (2006)

ENERGY

MOTION

BUILDING BLOCKS

SYSTEM

› What goes in...

Popular Molecular Models› All Atom Models

(OPLS-AA/L, CHARMM, AMBER) • Each atom is treated as a particle

› United Atom Models (GROMOS, OPLS) • Each atom is treated as a particle

› Coarse Grained Models (MARTINI) • A group of (~4 for Martini) united

atoms is treated as a particle › DPPC molecule (a lipid) in UA and CG (MARTINI) representations

Common Potentials in Molecular Models› BONDED: Simple harmonic/cosine bonds & angles

V φ( ) = Kn cos nφ +δ( )n∑V d( ) = k

2d − d0( )2 V θ( ) = kθ

2θ −θ0( )2

› Close link between potential and distribution

• Can be used to extract potential from structural data

• Inverse Boltzmann techniques

Potential and Distribution

W R( ) = Exp[−V R( ) / kBT ]Z

; V R( )∝−kBT lnW R( )

V R( )W R( )

V R( ) = 12k R −R0( )2

R

› Close link between potential and frequency

Potential and Time

V R( ) = 12k R −R0( )2

R

ω = kmeff

meff

Etot

• Time required to traverse potential depends on shape of the potential and the mass

• For harmonic oscillator, the characteristic frequency can be used to gauge time

› Usually simple functions with some physical justification

› NON-BONDED: Coulomb interaction between partial charges (electrostatics) and Lennard-Jones potential (long-range dispersion and short-range repulsion)

› distance ›

ener

gy

› ε determines strength of the interaction

› σ determines optimal distance of the interaction ECoulomb =

q1q2r12

ELJ = Edispersion + Erepulsion

= 4ε −σ6

r126 + σ 12

r1212

⎛⎝⎜

⎞⎠⎟

Common Potentials in Molecular Models

› Full descriptions of force fields implemented in GROMACS are described in the manual, Chapter 4

Classical Molecular Dynamics› Numerical integration of Newton’s Law: F = m*a

› t=0

› forces

› Force = Negative of Change in Potential with Position

› For pair-wise additive potentials

Fi t( ) = −∇riV r1 ,r2 ,...,ri ,rj ...;t( )

Fi t( ) = −∇ri

V ri ,rj ;t( )j∑

FC r1( ) = −∇r1

q1qjr1 − rjj

∑ › (for example Coulomb interaction)

Fi = −

∂V∂xi∂V∂yi∂V∂zi

⎜⎜⎜⎜⎜⎜⎜⎜

⎟⎟⎟⎟⎟⎟⎟⎟

= −∇riV

Basic MD Algorithm› Numerically integrate Newton’s Law: F = m*a

› t=0

› t=1

› Forces & velocities

› displacements

› new displacements, etc...

ma = m!!r = F r( ) = −∇E r( ) Δr = !rΔt + F

2mΔt( )2

Verlet Integrator

Taylor expansion of the position in time

F = ma = m dvdt

= m ddt

drdt

⎛⎝⎜

⎞⎠⎟ = m

d 2rdt 2

r t + Δt( ) = r t( ) + drdt t

Δt + 12!d 2rdt 2 t

Δt( )2 + ...

r t − Δt( ) = r t( )− drdt t

Δt + 12!d 2rdt 2 t

Δt( )2 − ...

r t + Δt( ) + r t − Δt( ) = 2r t( ) + d2rdt 2 t

Δt( )2 +O( Δt( )4 )

⇒ r t + Δt( ) + r t − Δt( ) ≈ 2r t( ) + Fm

Δt( )2

Verlet Integrator Depicted

r t + Δt( )− r t( ) ≈ r t( )− r t − Δt( ) + Fm

Δt( )2

r t + Δt( )− r t( )

Fm

Δt( )2

r t( )− r t − Δt( )r t( )

r t − Δt( )r t + Δt( )

r t + Δt( ) + r t − Δt( ) ≈ 2r t( ) + Fm

Δt( )2

Verlet Integrator: Velocity Reconstruction

Fm

Δt( )2

r t + Δt( )r t − Δt( )

v t( )

v t( )

v t( ) ≡ r t + Δt( )− r t − Δt( )2Δt

Note: GROMACS implements the Leap-Frog and velocity Verlet algorithms. They are mathematically equivalent to Verlet but Leap-Frog uses velocities at half time steps. See GROMACS manual 3.4.4

A Very Simple System

2 particles

vacuum

constant E

V R( ) = 12k R −R0( )2

› What goes in...

› Single oscillator treated classically in vacuo

› Single oscillator treated classically in vacuo A Very Simple System

coordinates

vacuum

constant E

V R( ) = 12k R −R0( )2

› What goes in...

.gro or .pdb

.top or .itp

.mdp

also algorithm for integration, number of steps, time step Δt, how often should we write output, and many other options!

definition of the system and of the molecules and their bonded and nonbonded interactions

Van Gunsteren et al. Angew. Chem. Int. Ed. 45, 4064 (2006)

GROMACS .files and programs .top

.itp.gro

.mdp

gmx grompp gmx mdrun.tpr

› Descriptions of the system • Positions

• Velocities • Forces

• Energies • Other system information

Time-traces

› What comes out...

.edr

.xtc and/or .trr

› The user must then extract this information and visualize it • Gromacs has many tools for this! • Always VISUALLY inspect your simulation

(ngmx, VMD, Pymol, ...)

› Time-trace data can extracted and visualized

Visualization of some resultsgmx distance

Bond distance and relative velocity as a function of time

› Time-trace data can be collected in a distributionSampling a Distribution

Distribution of distances in harmonic oscillator model

Bond distance and relative velocity as a function of time

gmx analyze -dist

DistributionsO N =

niOi Δii=1

nbin

ni Δii=1

nbin

O =f q( )O q( )dq∫f q( )dq∫

› Distributions can be used to calculate characteristics (observables) efficiently.

› The average (or mean) is:

› Measurements of comparable values can be collected in the same bins of a histogram.

› In the limit of infinitesimally small intervals, one speaks of a probability distribution function (say f) of a variable (say q).

O =ψ * q( )O q( )ψ q( )dq∫

ψ * q( )ψ q( )dq∫cf. to expectation value

› Relevant or interesting degrees of freedom › Harmonic oscillator traces out a simple path

described by • Distance • Relative velocity

Useful and Informative Description(s)

› What comes out...

Distance (d) versus relative velocity (v) for a simple harmonic oscillator.

For an oscillator at constant energy, the ellipse constitutes its PHASE SPACE. If the energy is not constant, many ellipses span its phase space, characterized by different amplitudes. (The path is traversed in counter clock-wise fashion.)

› Describes possible states of the system • Dimension 2*Ndof (R,P) or Ndof (R) • A single system, in time, follows an allowed path

through phase space visiting allowed states (R,P)

A Tasting (Sampling) of Classical Phase Space

dof=degree(s) of freedom; R is the collection of positions, P is the collection of momenta

› What comes out...

› ARA peptide

DSSP analysis: Kabsch and Sander Biopolymers 22, 2577 (1983)

Timeline in VMD showing secondary structure elements

A Summary of a Path Through Phase Space

time→

resi

due→

› Ramachandran plot and DSSP analysis

DSSP analysis: Kabsch and Sander Biopolymers 22, 2577 (1983) Pictures: Wikipedia and Kleywegt & Jones Structure 4, 1395 (1996)

Region in Ramachandran plot plus hydrogen bonding between residues defines whether a residue is involved in a helix, sheet, or coil region - or variations thereof...

sheet

α-helix

L-helix

Most populated Ramachandran angle combinations in the Protein Data Bank

gmx rama

Definition of Ramachandran angles

Secondary Structure Summarizes Structure

DSSP visualization from GROMACS (Basic Exercise, Manual 8.14)

› Each time-frame represents one structure (realization, configuration, conformation) out of all possible structures

› The collection of all possible structures is called an ensemble

A Summary of a Path Through Phase Spacegmx do_dssp

requires third-party code (dssp_cmbi), downloads free from CMBI

› For larger systems, phase space is very extended • Aim is to obtain a REPRESENTATIVE SAMPLE of the

REALIZATIONS of the system • Structural and/or dynamic

› Phase space (R, P) to be sampled depends on degrees of freedom and research question

• WE DEFINE THE SYSTEM OF INTEREST AND CHOOSE A MODEL TO YIELD INFORMATION ON A CERTAIN LENGTH AND TIME SCALE

• Realism depends on judiciousness of our choice

Aim and Scope of Simulation

SUMMARY› Molecular Dynamics

• Generate representative ensemble of structures connected through representative dynamics

› Molecular Model • Should reflect research question • Potentials determine distributions and dynamics • Complexity and costs increase with level of detail

Practical aspects of MD simulationsAlex de Vries, MD group

Molecular Dynamics: Equations of motion algorithm, approximations Harmonic oscillator & time-step Cut-offs in condensed-phase systems Battling failure to conserve energy: thermostats

Ingredients for Molecular Modeling

Van Gunsteren et al. Angew. Chem. Int. Ed. 45, 4064 (2006)

ENERGY

MOTION

BUILDING BLOCKS

SYSTEM

› What goes in...

› We want to simulate behavior of the molecules of biology • Need to generate distributions at (sub)atomic level

• Characterize structure & dynamics • Think about spatial and temporal resolution

• Fairly many particles (at least ~10,000 atoms) and processes take time (at least microseconds)

• Drastic approximations required • Computational limits: computing power & disk space

• Simulation time: accuracy of algorithm • Simulation volume: number of particles (interactions!) • Analysis: storing the information

A Moment for Reflection

› At atomistic level, time step for fastest vibrations

• Rule of thumb: need 10 points on the harmonic potential to characterize it well. This leads to a time step ~1 fs.

Appropriate Time Step for Sampling

1/ Δtvib = ν IR = c !ν IR ≈

3×108m ⋅s-1 × 3,000cm-1

≈1014HzW R( )

Etot

.mdp

Limiting the Computational Effort› Pairwise-additive potential: number of interactions N*(N-1)

› In gas: all relevant interactions can be calculated (N≈103 particles)

› Pairwise-additive potential: number of interactions N*(N-1)

› In condensed system (even 1 µl): too many interactions (N≈1018 particles)

Limiting the Computational Effort

› Pairwise-additive potential: number of interactions N*(N-1)

Rc Rc

› Using cut-off reduces to N*n: local interactions only

.mdpLimiting the Computational Effort

› GROMACS Manual 3.2.2

Periodic Boundary Conditions› Pairwise-additive potential: number of interactions N*(N-1)

Rc

› Pretend system repeats itself (periodic boundary conditions): N*n (N is number of particles in unit cell)

Rc

Rc

.mdp

› GROMACS Manual 3.2

Cut-off noise› Particles start/stop interacting at certain distance:

energy is not conserved

RcRc

Full Electrostatics› Cut-off is not necessary in periodic systems!

› Ewald summation converges but may lead to overstructuring › Numerically more efficient method Particle Mesh Ewald

uses grids to represent the electric field

.mdp

› GROMACS Manual 4.8

Conservation of Energy› In practice, even with small time-steps,

conservation of energy is difficult to achieve

ΔV = −W = −F•ΔR

.mdp

ΔV = −W = − F• dR∫

Too large a time-step takes the system away from its potential energy surface and leads to failure of energy conservation.

› assuming constant force

Conservation of Energy› Repair for any energy gain or loss by

coupling equations of motion to a heat bath

› Couple to heat bath to dissipate or gain heat › Popular methods are velocity rescaling (e.g. Berendsen) or

extended ensembles (e.g. Nose-Hoover) › Strictly speaking, we are not doing Newtonian mechanics › Time step and bath coupling strengths are part of the

parameter set!

ΔV = −W = −F•ΔR

ΔV = −W = − F• dR∫

.mdp

› GROMACS Manual 3.4.8

› Limited numerical accuracy of integrator • Constant energy difficult to attain • Diverging trajectories

› Limited realism of potentials & system • Coarse-grained over electrons and some nuclei • Analytical functions, e.g. no proper dissociation • Artificial periodicity (limited size) • Ensembles not always strictly well-defined

› Absence of quantum dynamical effects • Classical statistics • Continuous energy (cf Car-Parrinello)

Sources of error and approximation

Free Energy Differences from Simulations

The basis to understand free energy is the Boltzmann distribution Free energy differences from counting Potentials and experiment Thermodynamic integration

› Boltzmann distribution is observed for non-harmonically coupled systems

Potential and Distribution

p E( )∝ e−E kBT

› Describes possible states of the system • Dimension 2*Ndof (R,P) or Ndof (R) • A single system, in time, follows an allowed path

through phase space visiting allowed states (R,P)

A Tasting of Classical Phase Space

› What comes out...

dof=degree(s) of freedom; R is the collection of positions, P is the collection of momenta

› As long as you are clear about your states! • Regions in phase space define a (thermodynamic)

state • A thermodynamic state is a (large) collection of

microstates (realizations)

Free Energy is NOT Vague...

› What comes out...

Free Energy Differences from Simulations› Direct by counting

› A thermodynamic “state” is in fact a collection of configurations!

1 2Q12

eq = K12 =p2eq

p1eq =

e− Ei −E0( ) kBT

i∈2∑e− Ej −E

0( ) kBT

j∈1∑

= e− G20−G1

0( ) RT

GS0 ≡ −RT ln e− Ek−E

0( ) kBT

k∈S∑

Q12 =a2a1

≈ c2c1

= N2

V2× V1N1!p2p1

› Q: reaction quotient › K: equilibrium constant

› Boltzmann weights!

› Model • Parameters should give correct properties

• Interactions lead to density, heat of vaporization, solvation free energy, hydrophobic/hydrophilic partitioning, viscosity, etc etc etc

Realism of Molecular Modeling & MD

› distance

›en

ergy

› ε determines strength of the interaction

› σ determines optimal distance of the interaction

ELJ = 4ε −σ6

r126 + σ 12

r1212

⎛⎝⎜

⎞⎠⎟Van Gunsteren et al. Angew. Chem. Int. Ed.

45, 4064 (2006)

Use Free Energies for Parameterization› EXTENDING APPLICABILITY & TRANSFERABILITY › GROMOS 53A6 FINE-GRAINED (united atom)

Biomolecules, mostly proteins (peptides) USES THERMODYNAMIC INTEGRATION

› MARTINI COARSE-GRAINED (‘4-to-1’) Lipids, proteins, sugars USES DIRECT OBSERVATION OF PARTITIONING

Oostenbrink et al. J. Comput. Chem. 25, 1656 (2004); Marrink et al. J. Phys. Chem. B 108, 750 (2004); J. Phys. Chem. B 111, 7812 (2007)

+ + +

+

Free energy differences from Simulations

+ + +

+

› Indirect by restricted sampling › Thermodynamic integration:

switch interactions (represented by H) on/off (represented by λ)

K12 =p2eq

p1eq = e

−ΔG120 RT ⇒ΔG12

0 = −RT ln p2eq

p1eq

ΔG12 =∂G λ( )∂λ

dλ0

1

∫ =∂H λ( )∂λ λ

dλ0

1

∫λ = 0 :state 1, e.g. solute in solventλ = 1:state 2, e.g. pure solvent and solute in vauo

› Intermediate Exercise

Thermodynamic Integration› Thermodynamic integration: switch interactions

(represented by H) on/off (represented by λ)

ΔG12 =∂G λ( )∂λ

dλ0

1

∫ =∂H λ( )∂λ λ

dλ0

1

› Intermediate Exercise

G λ( ) ≡ −kBT ln e−Ek λ( ) kBT

k∑ ⇒

∂G λ( )∂λ

= −kBTe−En λ( ) kBT

n∑

e−Ek λ( ) kBT −1kBT

∂Ek λ( )∂λk

∑ =

pk∂Ek λ( )∂λk

∑ =∂Ek λ( )∂λ

› Strictly speaking, E includes pV and therefore what is written here as E is more commonly written as H

Thermodynamic IntegrationΔG12 =

∂H λ( )∂λ λ0

1

+ + +

+

› GROMACS topologies enable definition of A and B states; interactions between solvent and solute can be switched off by defining solute particles as type DUM in the B state

› λ = 0 › λ = 1

› Excerpt from topology for alchemical transformation from propanol to pentane (GROMACS manual Section 5.8.4)

.top/.itp

+ + +

+

› Thermodynamic integration: switch interactions (represented by H) on/off (represented by λ)

› λ = 0 › λ = 1

gmx barThermodynamic Integration

.mdp

ΔG12 =∂H λ( )∂λ λ

dλ0

1

Thermodynamic CyclesΔG12 =

∂H λ( )∂λ λ

dλ0

1

+

+

?

ΔGsolv1

ΔGsolv2

ΔGvac+liquid = 0

ΔGtotcycle = 0 = +ΔGtransfer

2→1 − ΔGsolv1 + ΔGvac+liquid + ΔGsolv

2

› Accessibility of Phase Space

• All relevant conformations should be accessible during the simulation, i.e. the barriers between them should not be too large

• Use tricks to search for other local minimum free energy regions (more on this on Friday!)

Realism of Molecular Modeling & MD

› Quality and Resolution of Experimental Data • Experiment often needs interpretation using a model because data

themselves offer too little resolution (in space and/or time)

Realism of Molecular Modeling & MD

› E.g. What does it mean that Circular Dichroism measurements of a peptide show that it is 60% helical?

A Quick Word on Hybrid Models

The best of ≥2 worlds Combining atomistic and particle-based models

› QM-MM models do local quantum chemistry under the influence of a lower-level molecular model • Quite well established after learning how to deal with the coupling between

the models

Do We Really Need All Detail?

› The Nobel Prize in Chemistry 2013 was awarded to Karplus, Levitt and Warshell for their work in developing multiscale modeling methods (picture from Nobel Prize press release shows QM-MM-continuum model of catalytic cleavage of polysaccharide chain by lysozyme).

› Hybrid molecular models combine atomistic with coarse-grained models • Exciting development

with lots of issues to be resolved

• GROMACS facilitates hybrid models through VIRTUAL SITES and TABULATED POTENTIALS

• Advanced Tutorial treats something much simpler...

Concurrent multiscaling

› UA MscL in Martini lipid bilayer (in Martini water) Movie by Helgi Ingolfsson

› Sequential multiscaling • Switching between resolutions • Run a coarse-grained model to find out about the global extent and shape of

phase space • Put in atomistic details in interesting regions: minima and transitions

› Requires a back-mapping procedure • Several on the market, but recent method by Wassenaar et al. is very user

friendly and general

Possibly the Best Approachbackward.py

Wassenaar et al. J. Chem. Theory Comput. 10, 676 (2014)

› Advanced Exercise

Adapted from Van Gunsteren et al. Angew. Chem. Int. Ed. 45, 4064 (2006)

GROMACS .files and programs .top

.itp.gro

.mdp

gmx grompp gmx mdrun.tpr

Analysis uses gmx ... and scripts on .xtc, .trr, and .edrCHECK!!!!

Thank you for your attention

› To compute a new position after a time interval ∆t

› The momentum follows from integration of the force

› Combining these from t = 0, assuming the force constant

Integrating Equations of Motionpi t( ) = mi

dri t( )dt

; Fi t( ) = dpi t( )dt

ri t + Δt( ) = ri t( ) + pi τ( )mi

dτt

t+Δt

pi t + Δt( ) = pi t( ) + Fi τ( )dτt

t+Δt

ri Δt( ) = ri 0( ) + 1mi

pi 0( ) + Fi ′τ( )d ′τ0

τ

∫⎡⎣⎢

⎤⎦⎥dτ

0

Δt

≈ ri 0( ) + pi 0( )mi

Δt + Fi 0( )2mi

Δt( )2

Mass-weighted covariance matrix

Baron et al J. Phys. Chem. B 110, 15602 (2006)

Strue ≤ S =

kB2ln 1+ kBTe

2

!2D

D =

m1

Kr1k − r1( )2

k∑ …

1K

r1k − r1( ) rNk − rN( )

k∑

! " !1K

rNk − rN( ) r1k − r1( )

k∑ #

mN

KrNk − rN( )2

k∑

⎜⎜⎜⎜⎜⎜

⎟⎟⎟⎟⎟⎟

r1k

m1

› Here, K is the number of conformations › is the position of atom/bead number 1 in frame k › is the mass of atom/bead 1 › Note similarity to variance!

gmx covar

1N f

rki − ri

ref( )2k∑

› Root-Mean-Square Fluctuation › average over time for each atom (or residue)

RMSF: structural mobility/flexibility

› Here, Nf is the number of frames in the trajectory

› is the position of particle i in the reference structure

› is the position of particle i in frame k

riref

rki

gmx rmsf