Importance sampling for MC simulation ("importance-weighted random walk")
$$
A=\int_a^b f(x)\,dx
=\int_a^b \rho(x)\,\frac{f(x)}{\rho(x)}\,dx
\;\approx\;\frac{1}{M}\sum_{i=1}^{M}\frac{f(x_i)}{\rho(x_i)},
\qquad x_i\sim\rho(x)
$$
Sampling points from a uniform distribution may not be the best strategy for MC. When most of the weight of the integral comes from a small range of x where f(x) is large, sampling more often in that region increases the accuracy of the MC estimate.
Example: Importance of a small region: measuring the depth of the Nile
[Figure: systematic quadrature or uniform sampling vs. importance sampling (importance-weighted random walk)]
Frenkel and Smit, Understanding Molecular Simulations
Levinthal's paradox
Cyrus Levinthal formulated the "Levinthal paradox" (late 1960s):
- Consider a protein molecule composed of (only) 100 residues,
- each of which can assume (only) 3 different conformations.
- The number of possible structures of this protein is 3^100 ≈ 5×10^47.
- Assume that it takes (only) 100 fs to convert from one structure to another.
- It would then require 5×10^34 s ≈ 1.6×10^27 years to "systematically" explore all possibilities.
- This long time disagrees with the actual folding time (μs–ms).
Example: Importance of a small region: energy funnel in protein folding
To decrease the error of MC simulation
σ(f) = standard deviation of the observable O = f(x) itself.
σ_f measures how much f(x) deviates from its average over the integration region; it is roughly independent of the number of trials M (or N) and can be estimated from one simulation.
$$
\epsilon \;=\; \bigl|\langle O\rangle - O_{\mathrm{true}}\bigr| \;\sim\; \frac{\sigma_f}{\sqrt{M}}
$$
(M = number of trials, σ = standard deviation); cost ~ M.
[Figure: two plots of O = f(x) over [a, b] with average <f> and sample points x_1, x_2, …, x_i, …, x_M, together with the resulting distributions p(O): the ideal case (σ_f = 0, p(O) sharply peaked at <f>) vs. the non-ideal, real case (σ_f > 0, broad p(O))]
$$
A=\int_a^b f(x)\,dx
=\int_a^b \rho(x)\,\frac{f(x)}{\rho(x)}\,dx
\;\approx\;\frac{1}{M}\sum_{i=1}^{M}\frac{f(x_i)}{\rho(x_i)},
\qquad x_i\sim\rho(x)
$$
Importance sampling
[Figure: a fluctuating, varying function f sampled with a broad (near-uniform) distribution, vs. the flattened function f/w sampled with a sharp probability distribution: N-step uniform sampling compared with normalized importance sampling weighted by w(x), gaining roughly a factor of 3 in accuracy]
Importance sampling for MC simulation: Example
Lab 3: Importance sampling for MC simulation
* Calculate the normalization constant N for each probability distribution function ρ(x).
What's new in Lab 3: Importance sampling
* Include "cpu.h", compile "cpu.c", and call "cpu()" to measure the CPU time of the run.

tstart = cpu();
~
tend = cpu();
printf("CPU time: %5.5f s, CPU time/measure: %g s\n",
       tend - tstart, (tend - tstart) / M);
* Include "ran3.h" & "fran3.h"; compile fran3.c; call "fexp" or "flin" defined in fran3.c.

/* uniform sampling */
x = ran3(&seed);
f_x = exp(-x*x);

/* importance sampling with fexp */
r = ran3(&seed);
rho = fexp(r, &x);
f_x = exp(-x*x) / rho;

/* importance sampling with flin */
r = ran3(&seed);
rho = flin(r, &x);
f_x = exp(-x*x) / rho;
* Display histograms for the distribution of x values generated by fran3 (for step 4 only).

/* uniform sampling */
x = ran3(&seed);
f_x = exp(-x*x);
i_hist = (int) (x * inv_dx);
if (i_hist < n_hist) hist[i_hist] += 1.0;

/* importance sampling with fexp */
r = ran3(&seed);
rho = fexp(r, &x);
f_x = exp(-x*x) / rho;
i_hist = (int) (x * inv_dx);
if (i_hist < n_hist) hist[i_hist] += 1.0;

/* importance sampling with flin */
r = ran3(&seed);
rho = flin(r, &x);
f_x = exp(-x*x) / rho;
i_hist = (int) (x * inv_dx);
if (i_hist < n_hist) hist[i_hist] += 1.0;
/* Number of bins in histogram */
n_hist = 50;

/* Allocate memory for histogram */
hist = (double *) allocate_1d_array(n_hist, sizeof(double));

/* Initialize histogram */
for (i_hist = 0; i_hist < n_hist; ++i_hist) hist[i_hist] = 0.0;

/* Size of the histogram bins */
dx = 1.0 / n_hist;   /* 1.0 is the size of the interval */
inv_dx = 1.0 / dx;

/* Accumulate histogram */
i_hist = (int) (x * inv_dx);
if (i_hist < n_hist) hist[i_hist] += 1.0;

/* Write histogram */
fp = fopen("hist_2.dat", "w");
for (i_hist = 0; i_hist < n_hist; ++i_hist) {
    x = (i_hist + 0.5) * dx;
    fprintf(fp, "%g %g\n", x, hist[i_hist] / M);
}
fclose(fp);
* Display histograms for the distribution of x values generated by fran3 (full version).
Plot with gnuplot, Excel, Origin, … and fit to a function. What is the resulting function?
Further reading: Sampling a non-uniform & discrete probability distribution {p_i} (tower sampling)
(Ref) Gould, Tobochnik, Christian, Ch. 11.5
Further reading: Sampling a non-uniform & continuous probability distribution ρ(x)
Example
Lab 3: Importance sampling for MC simulation
[Figure: integrand f(x) compared with three normalized distribution functions: a constant distribution function (uniform), a good distribution function (similar in shape to f(x)), and a bad distribution function (dissimilar to f(x)); panels labeled w(x), ρ(x), p(x)]
Results: Importance sampling for MC simulation
One experiment with M measures; n = 100 experiments, each with M measures
$$
A=\int_a^b f(x)\,dx
=\int_a^b \rho(x)\,\frac{f(x)}{\rho(x)}\,dx
\;\approx\;\frac{1}{M}\sum_{i=1}^{M}\frac{f(x_i)}{\rho(x_i)},
\qquad x_i\sim\rho(x)
$$
Analogy: Throw a die with the results {1, 2, 2, 2, 2, 4, 5, 6}.
(Discrete) probability distribution {p_i, i = 1–6}
= {1, 4, 0, 1, 1, 1} / 8, where 8 = 1+4+0+1+1+1
= {1/8, 1/2, 0, 1/8, 1/8, 1/8}
Mean value <A> = 3 = 24/8 = (1+2+2+2+2+4+5+6) / 8 = (1 + 2×4 + 4 + 5 + 6) / 8 = 1×1/8 + 2×4/8 + 3×0/8 + 4×1/8 + 5×1/8 + 6×1/8
Importance sampling = Importance-weighted average
$$
\langle f\rangle
=\frac{\sum_{i=1}^{M} f(x_i)\,p(x_i)}{\sum_{i=1}^{M} p(x_i)}
\qquad\text{in general}
$$
$$
=\sum_{i=1}^{M} f(x_i)\,p(x_i)
\qquad\text{for normalized } p(x),\ \textstyle\sum_{i=1}^{M} p(x_i)=1
$$
$$
=\frac{1}{M}\sum_{i=1}^{M} f(x_i)
\qquad\text{for constant } p(x)
$$
Beyond 1D integrals: a system of N particles in a container of volume V in contact with a thermostat at temperature T (constant NVT): the "canonical ensemble" (external constraint).
• Particles interact with each other through a potential energy U(r^N) (~ pair potential).
• U(r^N) is the potential energy of a microstate {r^N} = {x1, y1, z1, …, xN, yN, zN}.
• ρ(r^N) is the probability of finding a microstate {r^N} under the constant-NVT constraint (β = 1/kT):
$$
\rho(\mathbf{r}^N)=\frac{e^{-\beta U(\mathbf{r}^N)}}{Z},
\qquad\text{or}\qquad
\rho_i=\frac{e^{-\beta E_i}}{Z}
\quad\text{for discrete microstates}
$$
• Partition function Z (required for normalization) = the weighted sum over all microstates compatible with the constant-NVT condition:
$$
Z=\int d\mathbf{r}^N\,e^{-\beta U(\mathbf{r}^N)},
\qquad\text{or}\qquad
Z=\sum_i e^{-\beta E_i}
\quad\text{for discrete microstates}
$$
• Average of an observable O, <O>, over all microstates compatible with constant NVT (ensemble average):
$$
\langle O\rangle=\int d\mathbf{r}^N\,O(\mathbf{r}^N)\,\rho(\mathbf{r}^N),
\qquad\text{or}\qquad
\langle O\rangle=\sum_i O_i\,\rho_i
\quad\text{for discrete microstates}
$$