Lecture 19: Free Energies in Modern Computational Statistical Thermodynamics: WHAM and Related Methods

Dr. Ronald M. [email protected]

Statistical Thermodynamics

Definitions

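• Canonical ensemble:

$$A(N,V,T|\lambda) = -k_B T \ln Q(N,V,T|\lambda)$$

$$Q(N,V,T|\lambda) = \frac{1}{N_a!\,N_b!\cdots\Lambda_a^{3N_a}\Lambda_b^{3N_b}\cdots}\int dq_a\,dq_b\cdots\, e^{-\beta U(q_a,q_b,\ldots|\lambda)}$$

λ: system parameters and/or constraints

We focus on free energy changes that do not involve changes in the number of particles, their velocities, the volume, or the temperature. We can therefore work with the configurational partition function:

$$Z(N,V,T|\lambda) = Z(\lambda) = \int dq_a\,dq_b\cdots\, e^{-\beta U(q_a,q_b,\ldots|\lambda)}$$

$$\Delta A = A(\lambda_2) - A(\lambda_1) = -k_B T \ln\frac{Z(\lambda_2)}{Z(\lambda_1)}$$

Or equivalently:

$$e^{-\beta\Delta A} = \frac{Z(\lambda_2)}{Z(\lambda_1)}$$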

Example 1: Configurational Free Energy Changes

λ controls a conformational constraint (distance between two particles, dihedral angle, etc.).

Call the constrained property f(q): only conformations that satisfy the constraint f(q) = λ (within a tolerance dλ) are allowed:

$$Z(\lambda) = \int dq\,\delta[f(q)-\lambda]\,e^{-\beta U(q)}$$

Now:

$$\int d\lambda\,Z(\lambda) = \int dq\left[\int d\lambda\,\delta[f(q)-\lambda]\right]e^{-\beta U(q)} = \int dq\,e^{-\beta U(q)} = Z$$

where the integral in square brackets equals 1. So: integration of the constrained Z over all the possible constraints gives the unconstrained Z.

Free energy work for imposing the constraint:

$$e^{-\beta A(\lambda)} = \frac{Z(\lambda)}{Z} = \frac{\int dq\,\delta[f(q)-\lambda]\,e^{-\beta U(q)}}{\int dq\,e^{-\beta U(q)}} = \langle\delta[f(q)-\lambda]\rangle_Z$$

It is basically an average of a delta function over the unconstrained ensemble.

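The delta function is non-zero only exactly at λ. In actual numerical applications we would consider some finite interval Δλ around a discrete set of λ_i.

Consider:

$$\delta_{\Delta\lambda}(x) = \begin{cases} 1/\Delta\lambda & -\Delta\lambda/2 < x < \Delta\lambda/2 \\ 0 & \text{otherwise} \end{cases}$$

$$\int dx\,\delta_{\Delta\lambda}(x) = \Delta\lambda\,\frac{1}{\Delta\lambda} = 1, \qquad \delta(x) = \lim_{\Delta\lambda\to 0}\delta_{\Delta\lambda}(x)$$

Starting from $e^{-\beta A(\lambda)} = \langle\delta[f(q)-\lambda]\rangle_Z$, the finite-interval version is:

$$e^{-\beta\Delta A(\lambda_i)} = \Delta\lambda\,\langle\delta_{\Delta\lambda}[f(q)-\lambda_i]\rangle_Z = \langle\theta[f(q)-\lambda_i]\rangle_Z$$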

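Free energy for imposing the constraint in interval Δλ:

$$\theta(x) = \begin{cases} 1 & -\Delta\lambda/2 < x < \Delta\lambda/2 \\ 0 & \text{otherwise} \end{cases}$$

$$e^{-\beta\Delta A(\lambda_i)} = \langle\theta[f(q)-\lambda_i]\rangle_Z = \frac{1}{N_{\text{samples}}}\sum_{\text{sample }k}\theta[f(q_k)-\lambda_i] = \frac{1}{N_{\text{samples}}}(0+0+0+1+0+1+0+\ldots) = \frac{n_i}{N_{\text{samples}}} = p_i$$

where:

n_i: number of times f(q) is within Δλ of λ_i (a.k.a. histogram)
p_i: probability of finding f(q) within Δλ of λ_i

therefore:

$$\Delta A(\lambda_i) = -k_B T \ln p_i$$

At the Δλ → 0 continuous limit:

$$A(\lambda) = -k_B T \ln[p(\lambda)\,d\lambda]$$

where p(λ) is a probability density.

The zero of the free energy is arbitrary:

discrete λ:

$$\Delta A_{ji} = \Delta A(\lambda_j) - \Delta A(\lambda_i) = -k_B T \ln\frac{p_j}{p_i}$$

continuous λ:

$$\Delta A_{ji} = A(\lambda_j) - A(\lambda_i) = -k_B T \ln\frac{p(\lambda_j)\,d\lambda}{p(\lambda_i)\,d\lambda} = -k_B T \ln\frac{p(\lambda_j)}{p(\lambda_i)}$$

A(λ) is the potential of mean force along λ:

$$p(\lambda)\,d\lambda = d\lambda\,\langle\delta[f(q)-\lambda]\rangle_Z$$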

Lesson #1: configurational free energies can be computed by counting – that is by collecting histograms or probability densities over unrestricted trajectories.

Probably most used method.

Not necessarily the most efficient.
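As a minimal sketch of "free energies by counting", the snippet below histograms a set of samples and converts bin probabilities to free energies with A_i = -k_B T ln p_i. The function name `pmf_from_samples`, the Gaussian test data, and the bin settings are illustrative assumptions, not part of the lecture.

```python
import math
import random

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)

def pmf_from_samples(values, lo, hi, n_bins, temperature=300.0):
    """Histogram `values` on [lo, hi) and return (centers, A) with
    A_i = -k_B T ln p_i, shifted so that the lowest bin is zero."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        if lo <= v < hi:
            counts[min(int((v - lo) / width), n_bins - 1)] += 1
    total = sum(counts)  # normalization over the binned range only
    kt = K_B * temperature
    centers, energies = [], []
    for i, n in enumerate(counts):
        if n == 0:
            continue  # an empty bin has no finite free energy estimate
        centers.append(lo + (i + 0.5) * width)
        energies.append(-kt * math.log(n / total))
    shift = min(energies)  # the zero of the free energy is arbitrary
    return centers, [a - shift for a in energies]

# Hypothetical data: samples of f(q) drawn from a unit Gaussian,
# standing in for an unrestricted trajectory.
random.seed(0)
samples = [random.gauss(0.0, 1.0) for _ in range(50_000)]
centers, pmf = pmf_from_samples(samples, -3.0, 3.0, 12)
```

Bins that were never visited are simply dropped, which is the practical face of the limited free energy range that counting can reach.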

Achievable range of free energies is limited. Consider N samples, with N−1 in bin 1 and one sample in bin 2:

$$p_1 = \frac{N-1}{N} \approx 1, \qquad p_2 = \frac{1}{N}$$

$$\Delta A_{\max} = -k_B T \ln\frac{p_2}{p_1} \approx -k_B T \ln\frac{1}{N} = k_B T \ln N$$

For N = 10,000, ΔA_max ~ 5 kcal/mol at room temperature.

In practice, many more samples than this minimum are needed to achieve sufficient precision.
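The estimate ΔA_max = k_B T ln N is easy to check numerically. This is a small sketch; the helper name `max_free_energy_range` and the choice of 298.15 K for "room temperature" are assumptions made for illustration.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)

def max_free_energy_range(n_samples, temperature=298.15):
    """Largest free energy difference resolvable by counting:
    one sample in the rare bin, all others in the common bin."""
    return K_B * temperature * math.log(n_samples)

delta_a_max = max_free_energy_range(10_000)  # about 5.5 kcal/mol
```

Because the range grows only logarithmically with N, a tenfold increase in sampling adds roughly 1.4 kcal/mol at this temperature.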

Method #2: Biased Sampling

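“Umbrella” potential w(x), which depends on the coordinates only through x = x(q):

$$w(q) = w[x(q)]$$

Any thermodynamic observable can be “unbiased”:

$$\langle O\rangle_0 = \frac{\int dq\,O(q)\,e^{-\beta U(q)}}{\int dq\,e^{-\beta U(q)}} = \frac{\int dq\,O(q)\,e^{+\beta w(x)}\,e^{-\beta[U(q)+w(x)]}}{\int dq\,e^{+\beta w(x)}\,e^{-\beta[U(q)+w(x)]}} = \frac{Z_{U+w}\,\langle O(q)\,e^{\beta w(x)}\rangle_{U+w}}{Z_{U+w}\,\langle e^{\beta w(x)}\rangle_{U+w}} = \frac{\langle O(q)\,e^{\beta w(x)}\rangle_{U+w}}{\langle e^{\beta w(x)}\rangle_{U+w}}$$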

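Example: Unbiased probability for being near x_i (here ⟨…⟩_w denotes an average over the biased U+w ensemble):

$$p_i^\circ = \langle\theta(x-x_i)\rangle_0 = \frac{\langle\theta(x-x_i)\,e^{\beta w(x)}\rangle_w}{\langle e^{\beta w(x)}\rangle_w}$$

The denominator is a ratio of partition functions:

$$\langle e^{\beta w(x)}\rangle_w = \frac{\int dq\,e^{\beta w(x)}\,e^{-\beta U(q)}\,e^{-\beta w(x)}}{Z_{U+w}} = \frac{Z_U}{Z_{U+w}} = e^{\beta\Delta A_w}$$

ΔA_w = free energy of the U+w (biased) state relative to the unbiased state.

Therefore:

$$p_i^\circ = e^{-\beta\Delta A_w}\,\langle\theta[x(q)-x_i]\,e^{\beta w(x)}\rangle_w \approx e^{-\beta\Delta A_w}\,e^{\beta w(x_i)}\,\frac{n_i}{N} = e^{-\beta\Delta A_w}\,e^{\beta w(x_i)}\,p_i^w$$

This works only if the biasing potential depends only on x.

n_i = number of samples in bin i in the biased ensemble
N = total number of samples from the biased ensemble
p_i^w = biased probability at x_i

Equivalently:

$$p_i^\circ \approx \frac{n_i}{N\,e^{\beta\Delta A_w}\,e^{-\beta w(x_i)}}$$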
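A minimal sketch of unbiasing a single umbrella simulation follows. It multiplies each histogram count by e^{βw(x_i)} and then normalizes over the binned range, which stands in for the unknown factor e^{-βΔA_w}/N. The harmonic test system (U = x²/2 in units where β = 1), the spring constant, and all names are assumptions chosen so that the biased distribution is an exact Gaussian that can be sampled directly.

```python
import math
import random

def unbias_histogram(samples, bias, lo, hi, n_bins, beta=1.0):
    """Return (centers, p0): unbiased bin probabilities from biased samples.
    p0_i is proportional to n_i * exp(+beta * w(x_i)); the normalization
    over the binned range replaces the factor exp(-beta * dA_w) / N."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for x in samples:
        if lo <= x < hi:
            counts[min(int((x - lo) / width), n_bins - 1)] += 1
    centers = [lo + (i + 0.5) * width for i in range(n_bins)]
    raw = [n * math.exp(beta * bias(c)) for n, c in zip(counts, centers)]
    norm = sum(raw)
    return centers, [r / norm for r in raw]

# Hypothetical test system: U(x) = x^2/2, umbrella w(x) = k (x - x0)^2 / 2.
K_SPRING, X0 = 1.0, 1.0

def umbrella(x):
    return 0.5 * K_SPRING * (x - X0) ** 2

# For this U and w, the biased ensemble is Gaussian with the mean and
# standard deviation below, so it can be sampled without a simulation.
random.seed(1)
mean = K_SPRING * X0 / (1.0 + K_SPRING)
sd = math.sqrt(1.0 / (1.0 + K_SPRING))
biased = [random.gauss(mean, sd) for _ in range(200_000)]

centers, p0 = unbias_histogram(biased, umbrella, -1.0, 2.0, 12)
# Free energy difference between bins centered at 1.125 and 0.125;
# the exact value for U = x^2/2 is (1.125^2 - 0.125^2) / 2 = 0.625.
delta_a = -math.log(p0[8] / p0[4])
```

Normalizing over the sampled bins only recovers the unbiased distribution within the region the umbrella actually visited, which is why several overlapping umbrellas are usually needed.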

[Figure: distribution with biasing potential 1 (histogram probabilities p_{1i}) and the corresponding unbiased distribution p°_i; biasing potentials w_1, w_2, w_3.]

How do we “merge” data from multiple biased simulations?

[Figure: unbiased distributions p°_{1i}, p°_{2i}, p°_{3i} obtained from the simulations with biasing potentials w_1, w_2, w_3.]

$$p_i^\circ = \sum_s u_{si}\,p_{si}^\circ$$

Any set of weights u_si (normalized so that Σ_s u_si = 1) gives the right answer, but what is the best set of u_si for a given finite sample size?

[Figure: multiple biased simulations to cover conformational space; biasing potentials w_s(x) plotted against x.]

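Weighted Histogram Analysis Method (WHAM):

Optimal way of combining multiple sources of biased data

Let's assume the unbiased probability is known; we can then predict the biased distribution obtained with biasing potential w_s:

$$p_{si} = e^{\beta\Delta A_s}\,e^{-\beta w_s(x_i)}\,p_i^\circ$$

Here p_si is the biased probability, e^{βΔA_s} is the free energy factor, e^{−βw_s(x_i)} is the bias of bin i in simulation s, and p°_i is the unbiased probability. In compact form:

$$p_{si} = f_s\,c_{si}\,p_i^\circ, \qquad f_s = e^{\beta\Delta A_s}, \qquad c_{si} = e^{-\beta w_s(x_i)}$$

Recall the combination of per-simulation estimates:

$$p_i^\circ = \sum_s u_{si}\,p_{si}^\circ$$

Any set of weights u_si gives the right answer, but what is the optimal set of u_si (the one giving the most accurate estimate) for a given finite sample size?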

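Likelihood of the histogram at s given the probabilities at s (multinomial distribution):

$$P(n_{s1},\ldots,n_{sm}\,|\,p_{s1},\ldots,p_{sm}) = \frac{N_s!}{n_{s1}!\cdots n_{sm}!}\,p_{s1}^{n_{s1}}\cdots p_{sm}^{n_{sm}}$$

n_si = histogram of x from simulation s
N_s = Σ_i n_si = total number of samples from simulation s

In terms of unbiased probabilities:

$$P(n_{s1},\ldots,n_{sm}\,|\,p_1^\circ,\ldots,p_m^\circ) = \text{const.}\times\prod_{i=1}^{m}\left(f_s\,c_{si}\,p_i^\circ\right)^{n_{si}}$$

Joint likelihood of the histograms from all simulations:

$$P(n_{11},\ldots,n_{1m};\ldots;n_{S1},\ldots,n_{Sm}\,|\,p_1^\circ,\ldots,p_m^\circ) \propto \prod_{s=1}^{S}\prod_{i=1}^{m}\left(f_s\,c_{si}\,p_i^\circ\right)^{n_{si}}$$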

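Log likelihood:

$$\ln P = \sum_{s=1}^{S}\sum_{i=1}^{m} n_{si}\ln\left(f_s\,c_{si}\,p_i^\circ\right) + \text{const.}$$

Max likelihood principle: choose the p°_i that maximize the likelihood of the observed histograms.

Need to express f_s in terms of p°_i:

$$f_s^{-1} = e^{-\beta\Delta A_s} = \frac{Z_s}{Z_0} = \frac{1}{Z_0}\int dq\,e^{-\beta U(q)}\,e^{-\beta w_s[x(q)]} = \frac{1}{Z_0}\int dx\,e^{-\beta w_s(x)}\int dq\,\delta[x(q)-x]\,e^{-\beta U(q)}$$

$$= \int dx\,e^{-\beta w_s(x)}\,p^\circ(x) \approx \sum_i e^{-\beta w_s(x_i)}\,p^\circ(x_i)\,\Delta x = \sum_i c_{si}\,p_i^\circ$$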

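Expanding the log likelihood:

$$\ln P = \sum_{s=1}^{S}\sum_{i=1}^{m} n_{si}\ln\left(f_s\,c_{si}\,p_i^\circ\right) + \text{const.} = \sum_{s,i}\left[n_{si}\ln f_s + n_{si}\ln p_i^\circ + n_{si}\ln c_{si}\right] + \text{const.} = \sum_s N_s\ln f_s + \sum_i n_i\ln p_i^\circ + \text{const.}$$

n_i = Σ_s n_si = total number of samples in bin i from all simulations
N_s = total number of samples from simulation s

Since $f_s^{-1} = \sum_i c_{si}\,p_i^\circ$, we have $\partial f_s/\partial p_k^\circ = -f_s^2\,c_{sk}$. Setting the derivative with respect to p°_k to zero:

$$\frac{d\ln P}{dp_k^\circ} = \frac{\partial\ln P}{\partial p_k^\circ} + \sum_s\frac{\partial\ln P}{\partial f_s}\frac{\partial f_s}{\partial p_k^\circ} = \frac{n_k}{p_k^\circ} + \sum_s\frac{N_s}{f_s}\left(-f_s^2\,c_{sk}\right) = \frac{n_k}{p_k^\circ} - \sum_s N_s\,f_s\,c_{sk} = 0$$

$$p_k^\circ = \frac{n_k}{\sum_s N_s\,f_s\,c_{sk}}$$

Thus (WHAM equations):

$$p_i^\circ = \frac{n_i}{\sum_s N_s\,f_s\,c_{si}}, \qquad f_s^{-1} = \sum_i c_{si}\,p_i^\circ$$

Solved by iteration until convergence.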
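The self-consistent iteration can be sketched as follows. This is an illustrative implementation under stated assumptions, not the lecture's own code: the function name `wham`, the plain fixed-point update, the explicit renormalization of p° at every step (which fixes the overall scale of the f_s), and the harmonic test system with β = 1 are all choices made here.

```python
import math
import random

def wham(counts, beta_bias, n_iter=5000, tol=1e-12):
    """Iterate the WHAM equations.
    counts[s][i]    : histogram n_si of simulation s
    beta_bias[s][i] : beta * w_s(x_i) at the center of bin i
    Returns (p0, f) with p0 normalized to 1 and f_s = exp(beta * dA_s)."""
    n_sims, n_bins = len(counts), len(counts[0])
    c = [[math.exp(-bw) for bw in row] for row in beta_bias]
    n_tot = [sum(counts[s][i] for s in range(n_sims)) for i in range(n_bins)]
    n_s = [sum(row) for row in counts]
    f = [1.0] * n_sims
    p = [0.0] * n_bins
    for _ in range(n_iter):
        # p0_i = n_i / sum_s N_s f_s c_si
        p_new = [n_tot[i] / sum(n_s[s] * f[s] * c[s][i] for s in range(n_sims))
                 for i in range(n_bins)]
        norm = sum(p_new)
        p_new = [x / norm for x in p_new]
        # 1/f_s = sum_i c_si p0_i
        f = [1.0 / sum(c[s][i] * p_new[i] for i in range(n_bins))
             for s in range(n_sims)]
        change = max(abs(a - b) for a, b in zip(p, p_new))
        p = p_new
        if change < tol:
            break
    return p, f

# Hypothetical test system: U(x) = x^2/2 with five harmonic umbrellas.
# Each biased ensemble is an exact Gaussian, so it is sampled directly.
random.seed(2)
K_SPRING = 2.0
WINDOWS = [-2.0, -1.0, 0.0, 1.0, 2.0]
LO, HI, N_BINS = -3.0, 3.0, 24
width = (HI - LO) / N_BINS
centers = [LO + (i + 0.5) * width for i in range(N_BINS)]
counts, beta_bias = [], []
for x0 in WINDOWS:
    mean = K_SPRING * x0 / (1.0 + K_SPRING)
    sd = math.sqrt(1.0 / (1.0 + K_SPRING))
    row = [0] * N_BINS
    for _ in range(20_000):
        x = random.gauss(mean, sd)
        if LO <= x < HI:
            row[min(int((x - LO) / width), N_BINS - 1)] += 1
    counts.append(row)
    beta_bias.append([0.5 * K_SPRING * (ctr - x0) ** 2 for ctr in centers])

p0, f = wham(counts, beta_bias)
# PMF difference between bins centered at 1.125 and 0.125 (exact: 0.625)
delta_a = -math.log(p0[16] / p0[12])
```

For the central window of this test system, f_s should come out near √3, the exact value of Z_0/Z_s for U = x²/2 and a bias of x². Production codes typically work with logarithms of f_s and c_si to avoid underflow for strong biases; the direct form is kept here to mirror the equations.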

Compare with the single simulation case derived earlier:

$$p_i^\circ = \frac{n_i}{N\,e^{\beta\Delta A_w}\,e^{-\beta w(x_i)}}$$

WHAM gives both probabilities (PMFs) and state free energies.

Ferrenberg & Swendsen (1989); Kumar, Kollman et al. (1992); Bartels & Karplus (1997); Gallicchio, Levy et al. (2005)

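What about those optimal combining weights we talked about?

$$p_i^\circ = \sum_s u_{si}\,p_{si}^\circ$$

WHAM solution:

$$p_i^\circ = \frac{n_i}{\sum_{s'} N_{s'}\,f_{s'}\exp[-\beta w_{s'}(x_i)]}$$

Single-simulation estimate from simulation s:

$$p_{si}^\circ = \frac{n_{si}}{N_s\,f_s\exp[-\beta w_s(x_i)]} \quad\Rightarrow\quad n_{si} = p_{si}^\circ\,N_s\,f_s\exp[-\beta w_s(x_i)]$$

Substituting n_i = Σ_s n_si:

$$p_i^\circ = \sum_s p_{si}^\circ\,\frac{N_s\,f_s\exp[-\beta w_s(x_i)]}{\sum_{s'} N_{s'}\,f_{s'}\exp[-\beta w_{s'}(x_i)]}$$

Therefore the WHAM optimal combining weights are:

$$u_{si} = \frac{N_s\,f_s\exp[-\beta w_s(x_i)]}{\sum_{s'} N_{s'}\,f_{s'}\exp[-\beta w_{s'}(x_i)]} = \frac{N_s\exp\{-\beta[w_s(x_i)-\Delta A_s]\}}{\sum_{s'} N_{s'}\exp\{-\beta[w_{s'}(x_i)-\Delta A_{s'}]\}}$$

A simulation makes a large contribution at bin i if:
1. It provides many samples (large N_s).
2. Its bias is small compared to its free energy relative to the unbiased state.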
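To make the two criteria concrete, the sketch below evaluates the combining weights u_si for a toy case of two simulations and two bins. The function name `combining_weights` and the numbers are hypothetical; the f_s would normally come from a converged WHAM solution.

```python
import math

def combining_weights(n_samples, f, beta_bias):
    """u_si = N_s f_s exp(-beta w_s(x_i)) / sum_s' N_s' f_s' exp(-beta w_s'(x_i)).
    n_samples[s] = N_s, f[s] = f_s, beta_bias[s][i] = beta * w_s(x_i)."""
    n_sims, n_bins = len(n_samples), len(beta_bias[0])
    terms = [[n_samples[s] * f[s] * math.exp(-beta_bias[s][i])
              for i in range(n_bins)] for s in range(n_sims)]
    totals = [sum(terms[s][i] for s in range(n_sims)) for i in range(n_bins)]
    return [[terms[s][i] / totals[i] for i in range(n_bins)]
            for s in range(n_sims)]

# Toy example: equal sample sizes and free energy factors; simulation 0
# is unbiased at bin 0 and penalized by 2 kT at bin 1, and vice versa.
u = combining_weights([1000, 1000], [1.0, 1.0], [[0.0, 2.0], [2.0, 0.0]])
```

In this example each simulation carries a weight of 1/(1 + e^{-2}), about 0.88, in the bin where its bias vanishes, and about 0.12 in the other bin.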

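WHAM: getting unbiased averages

Computing the average of an observable that depends only on x is straightforward:

$$\langle O(x)\rangle_0 = \int dx\,O(x)\,p^\circ(x) \approx \sum_{\text{bin }i} O(x_i)\,p^\circ(x_i)\,\Delta x = \sum_{\text{bin }i} O_i\,p_i^\circ$$

For a property that depends on some other coordinate y = y(q): solve for p°(x,y) with WHAM (no bias on y) and then integrate out x:

$$\langle O(y)\rangle_0 = \int dx\,dy\,O(y)\,p^\circ(x,y) = \int dy\,O(y)\,p^\circ(y)$$

where:

$$p^\circ(y) = \int dx\,p^\circ(x,y) \quad \text{(marginal probability)}$$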

Some WHAM Applications for PMFs

Chekmarev, Ishida, & Levy (2004) J. Phys. Chem. B 108: 19487-19495

PMF of Alanine Dipeptide

$$w_s(\phi,\psi) = \frac{k_\phi}{2}(\phi-\phi_s)^2 + \frac{k_\psi}{2}(\psi-\psi_s)^2$$

2D Potential. Bias = temperature

Gallicchio, Andrec, Felts & Levy (2005) J. Phys. Chem. B 109: 6722-6731

$$\beta_0\,w_s(q) = (\beta_s-\beta_0)\,U(q)$$

β-hairpin peptide. Bias = temperature

Gallicchio, Andrec, Felts & Levy (2005) J. Phys. Chem. B 109: 6722-6731

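The Concept of “WHAM weights”

Consider the simple average of O(x):

$$\langle O(x)\rangle_0 = \sum_{\text{bin }i} O_i\,p_i^\circ = \sum_{\text{bin }i} O_i\,\frac{n_i}{\sum_s N_s\,f_s\exp[-\beta w_s(x_i)]}$$

Let's sum over individual samples rather than bins and set:

$$w_s(x_i) \to w_s(x_k)$$

x_i: center of bin i
x_k: position of sample k

$$\langle O(x)\rangle_0 = \sum_{\text{sample }k} O(x_k)\,\frac{1}{\sum_s N_s\,f_s\exp[-\beta w_s(x_k)]} = \sum_{\text{sample }k} O(x_k)\,W_k$$

Where:

$$W_k = \frac{1}{\sum_s N_s\,f_s\exp[-\beta w_s(x_k)]}$$

is the WHAM weight of sample k. It measures the likelihood of encountering x_k in an unbiased simulation.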

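WHAM weights example: probability distribution

Apply the previous averaging formula,

$$\langle O(x)\rangle_0 = \sum_{\text{sample }k} O(x_k)\,W_k$$

to p°_i:

$$p_i^\circ = \langle\theta[x(q)-x_i]\rangle_0 = \sum_{\text{sample }k}\theta(x_k-x_i)\,W_k = \sum_{k\,\in\,\text{bin }i} W_k$$

That is, the unbiased probability at x_i is the sum of the WHAM weights of the samples belonging to that bin.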
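A short sketch of per-sample WHAM weights and their two uses (averages and bin probabilities) is given below. The helper names are hypothetical, the f_s are assumed to be already known, and the sums are divided by the total weight as a safeguard in case the f_s are not fully converged.

```python
import math

def wham_weights(positions, n_samples, f, bias_fns, beta=1.0):
    """W_k = 1 / sum_s N_s f_s exp(-beta w_s(x_k)) for every pooled sample."""
    return [1.0 / sum(n * fs * math.exp(-beta * w(x))
                      for n, fs, w in zip(n_samples, f, bias_fns))
            for x in positions]

def unbiased_average(observable, positions, weights):
    """<O>_0 = sum_k O(x_k) W_k, divided by the total weight."""
    total = sum(weights)
    return sum(observable(x) * w for x, w in zip(positions, weights)) / total

def bin_probability(positions, weights, lo, hi):
    """p0 of the bin [lo, hi): sum of the weights of the samples in it."""
    total = sum(weights)
    return sum(w for x, w in zip(positions, weights) if lo <= x < hi) / total

# Sanity check: a single unbiased simulation (w = 0, f = 1) gives W_k = 1/N,
# so the weighted average reduces to the ordinary sample mean.
positions = [1.0, 2.0, 3.0, 4.0]
weights = wham_weights(positions, [4], [1.0], [lambda x: 0.0])
mean_x = unbiased_average(lambda x: x, positions, weights)
p_low = bin_probability(positions, weights, 0.5, 2.5)
```

With several biased simulations, `positions` would hold the pooled samples from all of them and `bias_fns` one biasing function per simulation.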

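Latest development: No binning WHAM = MBAR

WHAM equations:

$$p_i^\circ = \frac{n_i}{\sum_s N_s\,f_s\,c_{si}}, \qquad f_s^{-1} = \sum_i c_{si}\,p_i^\circ$$

Substitute the first equation into the second to get:

$$f_s^{-1} = \sum_i \frac{n_i\,c_{si}}{\sum_{s'} N_{s'}\,f_{s'}\,c_{s'i}}$$

Sum over samples instead of bins:

$$f_s^{-1} = \sum_{\text{sample }k}\frac{c_{sk}}{\sum_{s'} N_{s'}\,f_{s'}\,c_{s'k}}$$

This is the MBAR equation. It is solved iteratively to convergence to get the f's.

Distributions are obtained by binning the corresponding WHAM weights.

Shirts & Chodera, J. Chem. Phys. (2008); Tan, Gallicchio, Lapelosa, Levy, JCTC (2012).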
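The binless iteration can be sketched in a few lines. This is a plain fixed-point version written for illustration, not the algorithm of any particular MBAR package: the function name `mbar`, the normalization of the weights to sum to 1 at each step (which fixes the overall scale of the f_s), and the harmonic test system with β = 1 are assumptions.

```python
import math
import random

def mbar(beta_bias, n_samples, n_iter=500, tol=1e-10):
    """Iterate 1/f_s = sum_k c_sk / sum_s' N_s' f_s' c_s'k over pooled samples.
    beta_bias[s][k] = beta * w_s(x_k) for every pooled sample k.
    Returns (f, weights), with the WHAM weights normalized to sum to 1."""
    n_sims = len(n_samples)
    c = [[math.exp(-bw) for bw in row] for row in beta_bias]
    n_pool = len(c[0])
    f = [1.0] * n_sims
    weights = [1.0 / n_pool] * n_pool
    for _ in range(n_iter):
        denom = [sum(n_samples[s] * f[s] * c[s][k] for s in range(n_sims))
                 for k in range(n_pool)]
        weights = [1.0 / d for d in denom]
        norm = sum(weights)
        weights = [w / norm for w in weights]
        f_new = [1.0 / sum(ck * wk for ck, wk in zip(c[s], weights))
                 for s in range(n_sims)]
        change = max(abs(a - b) / b for a, b in zip(f, f_new))
        f = f_new
        if change < tol:
            break
    return f, weights

# Hypothetical test system: U(x) = x^2/2, umbrellas w_s(x) = (x - x0_s)^2.
# Each biased ensemble is Gaussian (mean 2*x0/3, variance 1/3).
random.seed(3)
WINDOWS = [-1.0, 0.0, 1.0]
pooled = []
for x0 in WINDOWS:
    pooled += [random.gauss(2.0 * x0 / 3.0, math.sqrt(1.0 / 3.0))
               for _ in range(1000)]
beta_bias = [[(x - x0) ** 2 for x in pooled] for x0 in WINDOWS]

f, weights = mbar(beta_bias, [1000, 1000, 1000])
```

For this test system the exact values are f_s = √3 e^{x0²/3}, so the central window should give roughly 1.73 and the outer windows roughly 2.42, up to sampling noise. The returned `weights` can be binned to obtain distributions, as described above.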
