Monte Carlo Integration
1
Ch 5: Monte Carlo Integration and Variance Reduction
Book: Statistical Computing with R, Maria L. Rizzo, Chapman & Hall/CRC, 2008
2
Integral estimation
• $g(x)$ is a function. We want to compute $\int g(x)\,dx$, assuming the integral is finite.
• We use facts from statistical moments to estimate integrals. Recall that if $X$ is a random variable with distribution $f_X(x)$ (written as $X \sim f_X(x)$) and $Y = g(X)$ is another random variable, then
$$E(Y) = E[g(X)] = \int_{-\infty}^{\infty} g(x)\, f_X(x)\,dx.$$
• This suggests the following estimator. Let $X_1, X_2, \ldots, X_n$ be an i.i.d. sample from $f_X(x)$. Then an unbiased estimator of $E[g(X)]$ is
$$\frac{1}{n}\sum_{i=1}^{n} g(X_i).$$
3
Simple Monte Carlo estimator for an integral over [0,1]
• Goal is to estimate $\theta = \int_0^1 g(x)\,dx$.
• Generate i.i.d. $U(0,1)$ random variables $X_1, X_2, \ldots, X_m$.
• $\hat{\theta} = \frac{1}{m}\sum_{i=1}^{m} g(X_i) \to E(g(X)) = \theta$ with probability 1 by the Strong Law of Large Numbers.
Exercise: Write R code to compute the Monte Carlo estimate of the integral of exp(-x) on the interval [0,1] and compare it to the exact answer.
U(0,1) is used because it fits the domain of integration [0,1].
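One possible solution to the exercise is sketched below. The sample size m = 10000 and the seed are arbitrary choices, not values from the slides; the exact answer is 1 - exp(-1).

```r
# Simple Monte Carlo estimate of the integral of exp(-x) over [0, 1].
set.seed(1)                  # arbitrary seed, only for reproducibility
m <- 10000                   # number of draws (assumed value)
x <- runif(m)                # X_1, ..., X_m i.i.d. U(0,1)
theta.hat <- mean(exp(-x))   # (1/m) * sum of g(X_i)
exact <- 1 - exp(-1)         # closed-form value of the integral
print(c(estimate = theta.hat, exact = exact))
```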
4
One step harder: domain [a,b]
• Estimate $\theta = \int_a^b g(t)\,dt$.
• One idea is to use a change of variables so that the simple Monte Carlo estimator over [0,1] can be used.
• Specifically, find a function $y(t)$ such that $y(a) = 0$ and $y(b) = 1$, and perform the integration:
$$\theta = \int_{y(a)}^{y(b)} g(t(y))\,\frac{dt}{dy}\,dy = \int_0^1 g(t(y))\,\frac{dt}{dy}\,dy.$$
• The function that works is $y = \frac{t-a}{b-a}$. Then $t = a + (b-a)\,y$ and $\frac{dt}{dy} = b - a$.
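As an illustration of the change of variables, the sketch below estimates the integral of exp(-t) over [2, 4]. The integrand, the limits, the sample size, and the seed are assumptions chosen only for the example.

```r
# Change of variables: map [a, b] onto [0, 1] with t = a + (b - a) * y.
set.seed(2)                   # arbitrary seed
a <- 2; b <- 4                # example limits (assumed)
g <- function(t) exp(-t)      # example integrand (assumed)
m <- 10000
y <- runif(m)                 # Y_i ~ U(0,1)
tt <- a + (b - a) * y         # t(y)
theta.hat <- mean(g(tt) * (b - a))   # average of g(t(y)) * dt/dy
exact <- exp(-a) - exp(-b)
print(c(estimate = theta.hat, exact = exact))
```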
5
One step harder: domain [a,b]
• Alternatively, find a probability density with limits $(a,b)$, for example the $U(a,b)$ density, and use that.
• The $U(a,b)$ density has form $f_U(u) = \frac{1}{b-a}\, I(a \le u \le b)$, where $I(\cdot)$ is the indicator function.
• Note that the integral we want is related to the expectation with regard to the $U(a,b)$ density as follows:
$$\int_a^b g(t)\,dt = (b-a)\int_a^b g(t)\,\frac{1}{b-a}\,dt = (b-a)\int_a^b g(u)\, f_U(u)\,du = (b-a)\,E_U[g(U)].$$
SAMPLING ALGORITHM:
• Generate $X_1, X_2, \ldots, X_m \overset{iid}{\sim} U(a,b)$.
• $\hat{\theta} = (b-a)\,\frac{1}{m}\sum_{i=1}^{m} g(X_i)$.
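The sampling algorithm can be sketched in R as follows. The integrand exp(-t), the limits [2, 4], the sample size, and the seed are assumptions made for illustration.

```r
# Monte Carlo integration over [a, b] using U(a, b) draws.
set.seed(3)                        # arbitrary seed
a <- 2; b <- 4                     # example limits (assumed)
g <- function(t) exp(-t)           # example integrand (assumed)
m <- 10000
x <- runif(m, min = a, max = b)    # X_1, ..., X_m i.i.d. U(a, b)
theta.hat <- (b - a) * mean(g(x))  # (b - a) * (1/m) * sum of g(X_i)
exact <- exp(-a) - exp(-b)
print(c(estimate = theta.hat, exact = exact))
```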
6
Example 5.3 from book: Non-finite limits
• Use the above approach to estimate the standard normal cdf
$$\Phi(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}}\, e^{-t^2/2}\,dt$$
for an arbitrary $x$.
• For $x > 0$, $\Phi(x) = 0.5 + \frac{1}{\sqrt{2\pi}}\int_0^x e^{-t^2/2}\,dt$, back to finite limits.
• For $x < 0$, $\Phi(x) = 1 - \Phi(-x)$, so use the method above.
• The problem reduces to estimating $\theta = \int_0^x e^{-t^2/2}\,dt$ for $x > 0$.
7
Example 5.3: Non-finite limits
• Estimate $\theta = \int_0^x e^{-t^2/2}\,dt$ for an arbitrary $x > 0$, which gives $\Phi(x) = 0.5 + \theta/\sqrt{2\pi}$.
• This could be done by generating $U(0,x)$ random variables as just shown, but this would require a new generation for every choice of $x$.
• Could we possibly solve the problem for every $x$ by generating just one sample of $U(0,1)$ random variables?
8
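Example 5.3: Non-finite limits
• Estimate $\theta = \int_0^x e^{-t^2/2}\,dt$ for an arbitrary $x$.
• Use a change of variables with $y = t/x$. Then $t = 0 \to y = 0$, $t = x \to y = 1$, and $t = xy \to dt = x\,dy$.
• The integral to be solved becomes
$$\theta = \int_0^1 x\, e^{-(xy)^2/2}\,dy = E_Y\!\left[x\, e^{-(xY)^2/2}\right], \quad \text{where } Y \sim U(0,1).$$
SAMPLING ALGORITHM
• Generate $U_1, \ldots, U_m \overset{iid}{\sim} U(0,1)$.
• Set $\hat{\theta} = \frac{1}{m}\sum_{i=1}^{m} x\, e^{-(U_i x)^2/2}$.
• If $x > 0$, $\hat{\Phi}(x) = 0.5 + \hat{\theta}/\sqrt{2\pi}$; if $x < 0$, $\hat{\Phi}(x) = 1 - \hat{\Phi}(-x)$.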
9
R Code for Example 5.3
Estimates the integral for 10 positive values of x ranging from 0.1 to 2.5.
Note that u, and hence g, is a vector; we are looping through the vector x.
R has a function pnorm to calculate this automatically.
The estimates are close to pnorm except for the very high values of x.
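The code from the slide did not survive in this transcript. The sketch below is a reconstruction that matches the description above (10 values of x from 0.1 to 2.5, one vector u reused for every x, comparison with pnorm); the sample size m = 10000 and the seed are assumptions.

```r
# Estimate Phi(x) for several x > 0 from one sample of U(0,1) draws.
set.seed(4)                           # arbitrary seed
x <- seq(0.1, 2.5, length = 10)       # 10 positive x's from .1 to 2.5
m <- 10000                            # assumed sample size
u <- runif(m)                         # generated once, reused for every x
cdf <- numeric(length(x))
for (i in seq_along(x)) {
  g <- x[i] * exp(-(u * x[i])^2 / 2)      # vector of x * exp(-(U x)^2 / 2)
  cdf[i] <- 0.5 + mean(g) / sqrt(2 * pi)  # Phi.hat(x) = 0.5 + theta.hat / sqrt(2 pi)
}
Phi <- pnorm(x)                       # R's built-in normal cdf
print(round(rbind(x, cdf, Phi), 3))
```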
10
Example 5.4: Semi-finite limits
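• Calculate $\Phi(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}}\, e^{-t^2/2}\,dt$ where you have a standard normal generator at your disposal.
• Let $Z \sim N(0,1)$. Then
$$E_Z[I(Z \le x)] = \int_{-\infty}^{\infty} I(z \le x)\, f_Z(z)\,dz = \int_{-\infty}^{\infty} I(z \le x)\,\frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}\,dz = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}\,dz = \Phi(x).$$
SAMPLING ALGORITHM
• Generate $Z_1, \ldots, Z_m \overset{iid}{\sim} N(0,1)$.
• Set $\hat{\Phi}(x) = \frac{1}{m}\sum_{i=1}^{m} I(Z_i \le x)$.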
By the strong law of large numbers this estimate approximates the true normal probability P(Z ≤ x) with probability 1.
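A minimal R sketch of this sampling algorithm follows. The grid of x values, the sample size, and the seed are assumptions for illustration.

```r
# Estimate Phi(x) as the proportion of standard normal draws that are <= x.
set.seed(5)                         # arbitrary seed
m <- 10000                          # assumed sample size
z <- rnorm(m)                       # Z_1, ..., Z_m i.i.d. N(0,1)
x <- seq(0.1, 2.5, length = 10)     # example x values (assumed)
Phi.hat <- sapply(x, function(xi) mean(z <= xi))  # (1/m) * sum of I(Z_i <= x)
print(round(rbind(x, Phi.hat, Phi = pnorm(x)), 3))
```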
12
General Result
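• $f(x)$ is a probability density supported on a set $A$.
• To estimate $\theta = \int_A g(x)\, f(x)\,dx$, generate $X_1, \ldots, X_m \overset{iid}{\sim} f(x)$ and set $\hat{\theta} = \frac{1}{m}\sum_{i=1}^{m} g(X_i)$.
• $\hat{\theta} \to E(\hat{\theta}) = \theta$ as $m \to \infty$ with probability 1 by the Strong Law of Large Numbers (SLLN).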
13
Standard errors
• To calculate the standard error of $\hat{\theta} = \frac{1}{m}\sum_{i=1}^{m} g(X_i)$, we realize that $\hat{\theta}$ is a sample mean of the independent $g(X_1), g(X_2), \ldots, g(X_m)$ and use basic statistical principles.
$$\mathrm{Var}(\hat{\theta}) = \mathrm{Var}\!\left(\frac{1}{m}\sum_{i=1}^{m} g(X_i)\right) = \frac{1}{m^2}\sum_{i=1}^{m} \mathrm{Var}(g(X_i)) = \frac{\sigma^2}{m},$$
where $\sigma^2 = \mathrm{Var}(g(X))$ is the variance of the random variable $g(X)$.
• Recall from statistics that $\mathrm{Var}(\bar{X}) = \sigma^2/n$.
This uses the fact that the variance of a sum of independent random variables is the sum of their variances.
14
Standard errors
• $\hat{\theta}$ is a sample mean of the independent $g(X_1), g(X_2), \ldots, g(X_m)$; use basic statistical principles.
• $\mathrm{Var}(\hat{\theta}) = \sigma^2/m$, where $\sigma^2 = \mathrm{Var}(g(X))$.
• How do we estimate $\sigma^2$? By the sample variance of $g(X_1), g(X_2), \ldots, g(X_m)$.
• Recall from statistics that the unbiased estimate of the variance is
$$s^2 = \frac{1}{m-1}\sum_{i=1}^{m}\left(g(X_i) - \hat{\theta}\right)^2,$$
while the maximum likelihood estimate is
$$\hat{\sigma}^2 = \frac{1}{m}\sum_{i=1}^{m}\left(g(X_i) - \hat{\theta}\right)^2.$$
Since m/(m-1) approaches 1 for m large, and m can be fixed large by the user, we will follow the book and use the second estimate.
16
Standard errors
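$$\mathrm{Var}(\hat{\theta}) \approx \frac{\hat{\sigma}^2}{m} = \frac{1}{m}\cdot\frac{1}{m}\sum_{i=1}^{m}\left(g(X_i) - \hat{\theta}\right)^2 = \frac{\sum_{i=1}^{m}\left(g(X_i) - \hat{\theta}\right)^2}{m^2},$$
and
$$\mathrm{s.e.}(\hat{\theta}) = \frac{\sqrt{\sum_{i=1}^{m}\left(g(X_i) - \hat{\theta}\right)^2}}{m}.$$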
Be careful to have two factors of m in the denominator of the variance estimate: one from the variance estimate of g(X) and one from averaging m terms.
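The standard error formula can be sketched in R as below, using the integral of exp(-x) over [0,1] as a running example (the example, sample size, and seed are assumptions). The value based on R's sd(), which uses the unbiased m - 1 divisor, is shown for comparison and is nearly identical for large m.

```r
# Standard error of a simple Monte Carlo estimate.
set.seed(6)                                 # arbitrary seed
m <- 10000                                  # assumed sample size
g <- exp(-runif(m))                         # g(X_i) for X_i ~ U(0,1)
theta.hat <- mean(g)
se.hat <- sqrt(sum((g - theta.hat)^2)) / m  # sqrt of sum of squares over m^2
se.alt <- sd(g) / sqrt(m)                   # version based on the unbiased s^2
print(c(theta.hat = theta.hat, se.hat = se.hat, se.alt = se.alt))
```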
17
Confidence intervals (CI)
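• The Central Limit Theorem (CLT) implies that
$$\frac{\hat{\theta} - E(\hat{\theta})}{\sqrt{\mathrm{Var}(\hat{\theta})}} \to N(0,1)$$
in distribution as $m \to \infty$.
• Since $E(\hat{\theta}) = \theta$, this fact is used to develop a 95% confidence interval for $\theta$.
• For $Z \sim N(0,1)$, $P(-1.96 < Z < 1.96) = 0.95$, and substituting $Z = \frac{\hat{\theta} - \theta}{\mathrm{s.e.}(\hat{\theta})}$,
$$P\!\left(-1.96 < \frac{\hat{\theta} - \theta}{\mathrm{s.e.}(\hat{\theta})} < 1.96\right) = 0.95, \quad\text{and}\quad P\!\left(\hat{\theta} - 1.96\,\mathrm{s.e.}(\hat{\theta}) < \theta < \hat{\theta} + 1.96\,\mathrm{s.e.}(\hat{\theta})\right) = 0.95.$$
• A 95% CI for $\theta$ is $\hat{\theta} \pm 1.96\,\mathrm{s.e.}(\hat{\theta})$.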
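A short R sketch of the 95% confidence interval follows, again using the integral of exp(-x) over [0,1] as an assumed example with an arbitrary sample size and seed.

```r
# Approximate 95% confidence interval for a Monte Carlo estimate.
set.seed(7)                                  # arbitrary seed
m <- 10000                                   # assumed sample size
g <- exp(-runif(m))                          # g(X_i), X_i ~ U(0,1)
theta.hat <- mean(g)
se.hat <- sqrt(sum((g - theta.hat)^2)) / m   # standard error of theta.hat
ci <- theta.hat + c(-1, 1) * 1.96 * se.hat   # theta.hat -/+ 1.96 * s.e.
exact <- 1 - exp(-1)
print(round(c(lower = ci[1], estimate = theta.hat, upper = ci[2], exact = exact), 4))
```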
19
Example 5.5 continued
• $\theta = \Phi(x)$ with $x = 2$ and $Z \sim N(0,1)$.
• $g(Z) = I(Z < x)$ is a Bernoulli random variable, taking value 1 if $Z < x$ and 0 otherwise, so
$$E[g(Z)] = E[I(Z < x)] = 1\cdot P(Z < x) + 0\cdot P(Z \ge x) = P(Z < x) = \Phi(x).$$
• $\Phi(x)$ is the success probability, $P[g(Z) = 1] = \Phi(x)$. Therefore, according to the Bernoulli distribution, $\mathrm{Var}[g(Z)] = \Phi(x)(1 - \Phi(x))$.
• $\hat{\theta} = \frac{1}{m}\sum_{i=1}^{m} g(Z_i)$ is the average of independent Bernoulli trials, and also equals the proportion of successes out of $m$ trials, each with success probability $\Phi(x)$. The number of successes is Binomial distributed, Bin$(m, \Phi(x))$, so the variance of the proportion is $\Phi(x)(1 - \Phi(x))/m$.
If this does not ring a bell, maybe this does: the variance of the sample proportion from Bin(n,p) is p(1-p)/n.
20
Example 5.5 from book
$\Phi(2) \approx 0.977$, which would yield theoretical variance 0.977(1 - 0.977)/10,000 ≈ 2.2e-06 (2.223e-06 when the unrounded value of $\Phi(2)$ shown below is used). The MC variance estimate is very close.
> pnorm(2) [1] 0.9772499
MC variance estimate
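The code and output from this slide are missing from the transcript. The sketch below shows one way the MC variance estimate for x = 2 and m = 10,000 could be computed and compared with the theoretical value; the seed is arbitrary and the exact numbers will differ from those on the slide.

```r
# Example 5.5 sketch: variance estimate and 95% CI for the estimate of Phi(2).
set.seed(9)                          # arbitrary seed
x <- 2
m <- 10000
z <- rnorm(m)                        # Z_i ~ N(0,1)
g <- as.numeric(z < x)               # Bernoulli indicators I(Z_i < x)
cdf <- mean(g)                       # estimate of Phi(2)
v <- mean((g - cdf)^2) / m           # MC variance estimate: sigma.hat^2 / m
ci <- cdf + c(-1, 1) * 1.96 * sqrt(v)
v.theory <- pnorm(x) * (1 - pnorm(x)) / m   # Phi(x) (1 - Phi(x)) / m
print(c(cdf = cdf, v = v, v.theory = v.theory))
print(ci)
```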
21
Remarks on Example 5.5
1.) Some prefer the second estimate, $\mathrm{Var}(\hat{\theta}) \approx \hat{\Phi}(x)(1 - \hat{\Phi}(x))/m$, rather than the MC estimate for the case of estimating proportions. Either can be used.
2.) The algorithm just shown for estimating general functions of the form $I(Z < x)$ is sometimes referred to as the "hit or miss" algorithm because it generates a lot of random variables and records the hits $(Z < x)$.
3.) This algorithm could require many simulations, however, if $x$ is at the lower end of the support space of $Z$, for example, $x = -0.06$ in the example below.
22
Efficiency
• Efficiency in general means doing things faster.
• In simulation, it means getting a smaller variance of your estimate for the same number of simulations.
• If $\hat{\theta}_1$ and $\hat{\theta}_2$ are two estimators for $\theta$, then $\hat{\theta}_1$ is more efficient than $\hat{\theta}_2$ if $\mathrm{Var}(\hat{\theta}_1)/\mathrm{Var}(\hat{\theta}_2) < 1$.
• Efficiency is called a second-order property. As the cartoon suggests, you want to first worry whether your estimator is correct (unbiased) before you concern yourself with efficiency.
23
Notes on efficiency
• Variances are unknown, so their MC estimates are used for efficiency calculations.
• Variances of averages are of order $1/m$ (they decrease as the number of simulations $m$ increases), so one way to decrease the variance is by increasing the number of simulations.
• Sometimes the percent reduction using $\hat{\theta}_2$ instead of $\hat{\theta}_1$ is reported:
$$100 \times \frac{\mathrm{Var}(\hat{\theta}_1) - \mathrm{Var}(\hat{\theta}_2)}{\mathrm{Var}(\hat{\theta}_1)}.$$
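To illustrate an efficiency calculation, the sketch below compares two estimators of Φ(2) with the same number of simulations: the U(0,1) change-of-variables estimator from Example 5.3 and the indicator ("hit or miss") estimator from Example 5.4. The choice of x = 2, the sample size, and the seed are assumptions; the variances are MC estimates.

```r
# Percent reduction in variance from using estimator 2 instead of estimator 1.
set.seed(8)                          # arbitrary seed
m <- 10000                           # same number of simulations for both
x <- 2                               # example point (assumed)
u <- runif(m)
g1 <- 0.5 + x * exp(-(u * x)^2 / 2) / sqrt(2 * pi)  # terms averaged in Example 5.3
g2 <- as.numeric(rnorm(m) <= x)                     # terms averaged in Example 5.4
var1 <- mean((g1 - mean(g1))^2) / m  # estimated Var(theta.hat_1)
var2 <- mean((g2 - mean(g2))^2) / m  # estimated Var(theta.hat_2)
pct <- 100 * (var1 - var2) / var1    # percent reduction
print(c(est1 = mean(g1), est2 = mean(g2), var1 = var1, var2 = var2, pct = pct))
```

At this x the indicator estimator has the smaller estimated variance, but the ranking can differ at other values of x.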
24
Power calculations
• Statistical power calculations refer to determining how many samples to collect or simulations to perform to get a desired level of accuracy.
• We saw earlier that $\mathrm{Var}(\hat{\theta}) = \sigma^2/m$, where $\sigma^2$ is the true variance of the object we are taking the average of, $g(X)$.
• Suppose we are planning to run a simulation study that is costly, and want to determine the number of simulations needed to achieve a standard error below $\varepsilon$. We have an "a priori" estimate of $\sigma^2$ from prior experiments.
• We solve $\sigma^2/m < \varepsilon^2$ for $m$ to obtain that we need $m > \sigma^2/\varepsilon^2$.
[Image: Jim Carrey, Bruce Almighty]
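The calculation can be sketched in R as follows. Here the "a priori" estimate of σ² comes from a small pilot run rather than a prior experiment, and the integrand exp(-x) on [0,1], the pilot size of 1000, the target ε = 0.001, and the seed are all assumptions for illustration.

```r
# Number of simulations needed for a standard error below eps.
set.seed(13)                          # arbitrary seed
g <- function(x) exp(-x)              # example integrand (assumed)
pilot <- g(runif(1000))               # small pilot run (assumed size)
sigma2.hat <- mean((pilot - mean(pilot))^2)  # a priori estimate of sigma^2
eps <- 0.001                          # target standard error (assumed)
m.needed <- floor(sigma2.hat / eps^2) + 1    # smallest m with m > sigma^2 / eps^2
full <- g(runif(m.needed))            # the full study with m.needed simulations
se.full <- sqrt(sum((full - mean(full))^2)) / m.needed
print(c(m.needed = m.needed, se.full = se.full))
```

Because σ² is itself estimated, the achieved standard error is only approximately at the target.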
25
Tricks for reducing MC variance
There are some tricks for reducing the variance of MC integration, which ultimately reduce the number of random variables that must be generated. Two of them are antithetic variables and control variates, covered in Sections 5.4 and 5.5 of the book. These are beyond the scope of the course.
26
Importance sampling
MOTIVATION:
To calculate $\int_a^b g(t)\,dt$ using MC integration we have used a $U(a,b)$ density $f_U$ as a generating density, noting that
$$\int_a^b g(t)\,dt = (b-a)\int_a^b g(t)\,\frac{1}{b-a}\,dt = (b-a)\int_a^b g(u)\, f_U(u)\,du = (b-a)\,E_U[g(U)].$$
The sampling algorithm was:
• Generate $X_1, X_2, \ldots, X_m \overset{iid}{\sim} U(a,b)$.
• $\hat{\theta} = (b-a)\,\frac{1}{m}\sum_{i=1}^{m} g(X_i)$.
27
Importance sampling
• Generate $X_1, X_2, \ldots, X_m \overset{iid}{\sim} U(a,b)$.
• $\hat{\theta} = \frac{b-a}{m}\sum_{i=1}^{m} g(X_i)$.
This will not work well if $g$ is not matched well by the $U(a,b)$ density.
[Figure: plot of g(x)]
28
Importance sampling
$$\int_a^b g(t)\,dt = (b-a)\int_a^b g(t)\,\frac{1}{b-a}\,dt = (b-a)\int_a^b g(u)\, f_U(u)\,du = (b-a)\,E_U[g(U)]$$
The idea is to replace the generating density f here by something that is easy to sample from and more closely represents the function to be integrated.
29
Importance sampling
GOAL: Calculate $\int g(x)\,dx$.
LOGIC:
• Find a density $f(x)$ such that $f(x) > 0$ on the set $\{x : g(x) \ne 0\}$ that you can generate from; $f(x)$ is called the importance function.
• Let $Y = g(X)/f(X)$ be a transformed random variable of $X$, where $X \sim f(x)$.
• Then
$$E[Y] = E\!\left[\frac{g(X)}{f(X)}\right] = \int \frac{g(x)}{f(x)}\, f(x)\,dx = \int g(x)\,dx$$
gives the required integral.
ALGORITHM:
• Generate $X_1, \ldots, X_m \overset{iid}{\sim} f(x)$.
• Set $\hat{E}[Y] = \frac{1}{m}\sum_{i=1}^{m} \frac{g(X_i)}{f(X_i)}$.
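The algorithm can be sketched as a small R function. The helper name importance.estimate, its arguments rand and dens, and the test integrand are hypothetical, chosen only for illustration: the example integrates x² exp(-x) over (0, ∞), whose exact value is gamma(3) = 2, using an exponential density with rate 1/2 as the importance function.

```r
# Importance sampling estimate of the integral of g, with its standard error.
# rand(n) draws n values from the importance function; dens(x) is its density.
importance.estimate <- function(g, rand, dens, m) {
  x <- rand(m)                   # X_1, ..., X_m i.i.d. from f
  y <- g(x) / dens(x)            # Y_i = g(X_i) / f(X_i)
  c(estimate = mean(y), se = sqrt(sum((y - mean(y))^2)) / m)
}

set.seed(10)                                  # arbitrary seed
g <- function(x) x^2 * exp(-x) * (x > 0)      # example integrand (assumed)
out <- importance.estimate(
  g,
  rand = function(n) rexp(n, rate = 0.5),     # importance function: Exp(rate 1/2)
  dens = function(x) dexp(x, rate = 0.5),
  m = 10000
)
print(out)                                    # exact value is gamma(3) = 2
```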
30
Picking the right f
• Recall from earlier that
$$\mathrm{Var}\!\left(\frac{1}{m}\sum_{i=1}^{m} Y_i\right) = \frac{\mathrm{Var}(Y)}{m} = \frac{1}{m}\,\mathrm{Var}\!\left(\frac{g(X)}{f(X)}\right).$$
• $\therefore$ We want to choose $f(X)$ such that $g(X)/f(X)$ has little variability.
• The best way to do this is to choose $f$ to mimic the shape of $g$ as closely as possible so that $g(X)/f(X) \approx c$, a constant, since the variance of a constant is 0.
31
Example from book
Note that some have a bigger support than g.
• f0: Uniform
• f1: Exp(1)
• f2: Cauchy = t1
• f3: Rescaled Exp(1)
• f4: Rescaled Cauchy
32
Example continued
• Plot g(x) and each of the f's.
• See which f matches the shape of g most closely.
• It looks as if f3 is the best.
[Figure: g plotted together with f0, f1, f2, f3, f4]
33
Example continued
• Plot g(x)/f(x) for each of the f's.
• See which is most constant.
• f3 looks the best.
• Rescaling the Cauchy (f2 → f4) really helped!
[Figure: plots of g/f2, g/f3, g/f4]
34
Example continued
[R code with sections labelled Uniform, Exp(1), Cauchy]
Note that the generated values falling outside [0,1] will have g(x) = 0, so it does not matter what value you assign to them.
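The code and formulas from these slides are not in the transcript. The sketch below assumes this is the book's importance sampling example with g(x) = exp(-x)/(1 + x²) on (0,1), and assumes the five importance functions are the uniform on (0,1), Exp(1), standard Cauchy, Exp(1) rescaled to (0,1), and Cauchy rescaled to (0,1). Treat every formula here as an assumption to check against the book; the sample size and seed are arbitrary.

```r
# Compare five importance functions for the integral of g over (0, 1).
set.seed(11)                           # arbitrary seed
m <- 10000
g <- function(x) {                     # assumed integrand; zero outside (0, 1)
  out <- numeric(length(x))
  inside <- x > 0 & x < 1
  out[inside] <- exp(-x[inside]) / (1 + x[inside]^2)
  out
}
draws <- list(                         # samples from each importance function
  f0 = runif(m),                                 # Uniform(0,1)
  f1 = rexp(m, 1),                               # Exp(1)
  f2 = rcauchy(m),                               # Cauchy
  f3 = -log(1 - runif(m) * (1 - exp(-1))),       # rescaled Exp(1), inverse transform
  f4 = tan(pi * runif(m) / 4)                    # rescaled Cauchy, inverse transform
)
dens <- list(                          # matching densities
  f0 = function(x) rep(1, length(x)),
  f1 = function(x) exp(-x),
  f2 = function(x) dcauchy(x),
  f3 = function(x) exp(-x) / (1 - exp(-1)),
  f4 = function(x) 4 / (pi * (1 + x^2))
)
theta.hat <- se <- numeric(5)
names(theta.hat) <- names(se) <- names(draws)
for (k in names(draws)) {
  y <- g(draws[[k]]) / dens[[k]](draws[[k]])     # g(X_i) / f(X_i)
  theta.hat[k] <- mean(y)
  se[k] <- sqrt(sum((y - mean(y))^2)) / m
}
print(round(rbind(theta.hat, se), 4))
zero.frac.f2 <- mean(g(draws$f2) == 0)           # share of Cauchy draws outside (0,1)
print(zero.frac.f2)
```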
36
Example continued
• f3 has the smallest standard error, followed by f4.
• The Cauchy (f2) is the worst. This is because its support is so much larger than [0,1] that most of the generated g/f's = 0. In fact 75% were 0.
Summary statistics of g/f2.
37
Importance sampling to calculate expectations
GOAL: For $X \sim f(x)$, want to calculate $E(g(X)) = \int g(x)\, f(x)\,dx$.
• Although $f(x)$ is already a probability density, it is not easy to sample from. This regularly happens in Bayesian inference.
• Can still apply importance sampling. Here one needs to find another density to sample from, $\phi(x)$, sometimes referred to as the envelope function, that now closely resembles $g(x)\, f(x)$.
ALGORITHM:
• Generate $X_1, \ldots, X_m \overset{iid}{\sim} \phi(x)$.
• Set $\hat{E}[g(X)] = \frac{1}{m}\sum_{i=1}^{m} \frac{g(X_i)\, f(X_i)}{\phi(X_i)}$.
• All estimates approach the true value of the integral as m approaches infinity by the SLLN.
• Despite its simplicity importance sampling is rarely used for expectation calculations due to the difficulty in finding an appropriate envelope function.
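A toy R sketch of this algorithm follows. It estimates E[X²] = 1 for X ~ N(0,1) while pretending the normal cannot be sampled directly, drawing instead from a t distribution with 3 degrees of freedom as the envelope. The target, the envelope, the sample size, and the seed are assumptions; the envelope was picked for its heavy tails and is not claimed to be the best match to g(x) f(x).

```r
# Importance sampling for an expectation E[g(X)], X ~ f, sampling from phi.
set.seed(12)                           # arbitrary seed
m <- 100000                            # assumed sample size
g <- function(x) x^2                   # function whose expectation is wanted
f <- function(x) dnorm(x)              # target density (treated as hard to sample)
phi <- function(x) dt(x, df = 3)       # envelope density (assumed choice)
x <- rt(m, df = 3)                     # X_1, ..., X_m i.i.d. from phi
y <- g(x) * f(x) / phi(x)              # g(X_i) f(X_i) / phi(X_i)
est <- mean(y)
se <- sqrt(sum((y - est)^2)) / m
print(c(estimate = est, se = se))      # true value is 1
```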