Comparison of Latin Hypercube and Quasi Monte Carlo Sampling Techniques

Sergei Kucherenko (1), Daniel Albrecht (2), Andrea Saltelli (2)
(1) Imperial College London, UK. Email: [email protected]
(2) The European Commission, Joint Research Centre, Ispra (VA), Italy
Outline

Monte Carlo integration methods
Latin Hypercube sampling design
Quasi Monte Carlo methods. Sobol' sequences and their properties
Comparison of sample distributions generated by different techniques
Do QMC methods lose their efficiency in higher dimensions?
Global Sensitivity Analysis and effective dimensions
Comparison results
Monte Carlo integration methods

$$I[f] = \int_{H^n} f(x)\,dx,$$
seen as an expectation: $I[f] = E[f(x)]$.

Monte Carlo: $I_N[f] = \frac{1}{N}\sum_{i=1}^{N} f(z_i)$, where $\{z_i\}$ is a sequence of random points in $H^n$.

Error: $\varepsilon_N = I[f] - I_N[f]$,
$$\left(E[\varepsilon_N^2]\right)^{1/2} = \frac{\sigma(f)}{N^{1/2}}, \qquad \sigma^2(f) = E\left[(f - E[f])^2\right].$$

Convergence does not depend on dimensionality, but it is slow. MC convergence can be improved by decreasing $\sigma(f)$ using variance reduction techniques: antithetic variables; control variates; stratified sampling → LHS sampling.
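As a minimal sketch of the estimator $I_N[f] = \frac{1}{N}\sum_i f(z_i)$ and its $O(1/N^{1/2})$ error, the snippet below uses an illustrative test integrand $f(x) = \sum_i x_i$ (an assumption, not one of the slides' test functions):

```python
import random

def mc_integrate(f, dim, n_samples, rng):
    """Plain Monte Carlo estimate of I[f] = integral of f over [0,1)^dim."""
    total = 0.0
    for _ in range(n_samples):
        z = [rng.random() for _ in range(dim)]  # random point z_i in H^n
        total += f(z)
    return total / n_samples

# Illustrative integrand f(x) = sum(x_i); exact integral over [0,1]^5 is 2.5.
f = lambda x: sum(x)
rng = random.Random(42)
estimate = mc_integrate(f, dim=5, n_samples=100_000, rng=rng)
print(abs(estimate - 2.5))  # error of order sigma(f)/sqrt(N), here roughly 1e-3
```

With $\sigma(f) = \sqrt{5/12} \approx 0.65$ and $N = 10^5$, the RMSE predicted by the formula above is about $2\times10^{-3}$.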
Latin Hypercube sampling

Latin Hypercube sampling is a type of stratified sampling. To sample $N$ points in $d$ dimensions: divide each dimension into $N$ equal intervals, which gives $N^d$ subcubes, and take points in the subcubes so that, when projected onto lower dimensions, the points do not overlap.
Latin Hypercube sampling

$\{\pi_k\},\ k = 1,\dots,n$: independent random permutations of $\{1,\dots,N\}$, each uniformly distributed over all $N!$ possible permutations.

LHS coordinates:
$$x_i^k = \frac{\pi_k(i) - 1 + U_i^k}{N}, \qquad i = 1,\dots,N,\ \ k = 1,\dots,n, \qquad U_i^k \sim U(0,1).$$

LHS is built by superimposing well-stratified one-dimensional samples. It cannot be expected to provide good uniformity properties in an $n$-dimensional unit hypercube.
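The coordinate formula above translates directly into code; the sketch below is an illustrative implementation (function name and the small $N$ are arbitrary choices):

```python
import random

def latin_hypercube(n_points, dim, rng):
    """LHS: x_i^k = (pi_k(i) - 1 + U_i^k) / N, with an independent random
    permutation pi_k of {1, ..., N} for each coordinate k."""
    samples = [[0.0] * dim for _ in range(n_points)]
    for k in range(dim):
        perm = list(range(1, n_points + 1))
        rng.shuffle(perm)  # pi_k, uniform over all N! permutations
        for i in range(n_points):
            samples[i][k] = (perm[i] - 1 + rng.random()) / n_points
    return samples

rng = random.Random(0)
pts = latin_hypercube(10, 2, rng)
# Each 1-D projection has exactly one point per interval [j/N, (j+1)/N).
for k in range(2):
    print(sorted(int(p[k] * 10) for p in pts))  # [0, 1, 2, ..., 9]
```

The final loop checks the defining property: every one-dimensional projection is perfectly stratified, one point per interval.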
Deficiencies of LHS sampling

1) Space is badly explored (a).
2) Possible correlation between variables (b).
3) Points cannot be sampled sequentially.
=> Not well suited for integration.
Discrepancy. Quasi Monte Carlo

Discrepancy is a measure of deviation from uniformity.

Definitions: for $Q(y) \subset H^n$ with $Q(y) = [0, y_1) \times [0, y_2) \times \cdots \times [0, y_n)$ and $m(Q)$ the volume of $Q$,
$$D_N^* = \sup_{Q(y) \in H^n} \left| \frac{N_{Q(y)}}{N} - m(Q) \right|.$$

Random sequences: $D_N^* \sim (\ln\ln N)^{1/2} / N^{1/2} \sim 1/N^{1/2}$.

Low discrepancy sequences (LDS): $D_N^* \le c(d)\,\frac{(\ln N)^n}{N}$.

Convergence: $\varepsilon_N^{QMC} = \left| I[f] - I_N[f] \right| \le V(f)\, D_N^*$.

Asymptotically $\varepsilon_{QMC} = O\!\left(\frac{(\ln N)^n}{N}\right) \to O(1/N)$, a much higher convergence rate than $\varepsilon_{MC} \sim O(1/N^{1/2})$.
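The comparison slides later use an $L_2$ discrepancy. One $L_2$ star discrepancy can be computed exactly by Warnock's closed-form expression; the sketch below assumes that formula (the slides do not state which $L_2$ variant was used):

```python
def l2_star_discrepancy(points):
    """Warnock's closed form for the L2 star discrepancy of a set in [0,1)^d:
    D^2 = 3^-d - (2/N) sum_i prod_k (1 - x_ik^2)/2
               + (1/N^2) sum_{i,j} prod_k (1 - max(x_ik, x_jk))."""
    n = len(points)
    d = len(points[0])
    term1 = 3.0 ** (-d)
    term2 = 0.0
    for x in points:
        prod = 1.0
        for xk in x:
            prod *= (1.0 - xk * xk) / 2.0
        term2 += prod
    term2 *= 2.0 / n
    term3 = 0.0
    for x in points:
        for y in points:
            prod = 1.0
            for xk, yk in zip(x, y):
                prod *= 1.0 - max(xk, yk)
            term3 += prod
    term3 /= n * n
    return (term1 - term2 + term3) ** 0.5

# Sanity check: a single point at 0.5 in 1-D has L2 star discrepancy sqrt(1/12).
print(l2_star_discrepancy([[0.5]]))
```

For a single 1-D point at $0.5$ the integral $\int_0^1 D(y)^2\,dy = 1/12$ can be done by hand, which checks the implementation.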
QMC. Sobol' sequences

Convergence: $\varepsilon_N = O\!\left(\frac{(\ln N)^n}{N}\right)$ for all LDS.
For Sobol' LDS: $\varepsilon_N = O\!\left(\frac{(\ln N)^{n-1}}{N}\right)$ if $N = 2^k$, $k$ integer.

Sobol' LDS:
1. Best uniformity of distribution as $N$ goes to infinity.
2. Good distribution for fairly small initial sets.
3. A very fast computational algorithm.

"Preponderance of the experimental evidence amassed to date points to Sobol' sequences as the most effective quasi-Monte Carlo method for application in financial engineering."
Paul Glasserman, Monte Carlo Methods in Financial Engineering, Springer, 2003
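As a hedged illustration of how such points are generated: in the usual construction, the first one-dimensional coordinate of the Sobol' sequence reduces to the van der Corput radical inverse in base 2. The sketch below implements only that special case, not a full multi-dimensional Sobol' generator:

```python
def van_der_corput(i):
    """Radical inverse in base 2 of the integer index i >= 1: reverse the
    binary digits of i about the binary point."""
    result = 0.0
    f = 0.5
    while i > 0:
        result += f * (i & 1)  # lowest remaining bit -> next binary digit
        i >>= 1
        f *= 0.5
    return result

points = [van_der_corput(i) for i in range(1, 9)]
print(points)  # [0.5, 0.25, 0.75, 0.125, 0.625, 0.375, 0.875, 0.0625]
```

Note how every binary segment of length $2^k$ fills the $2^k$ dyadic subintervals exactly once, the one-dimensional counterpart of Property A below.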
Sobol' LDS. Property A and Property A'

A low-discrepancy sequence is said to satisfy Property A if for any binary segment (not an arbitrary subset) of the $n$-dimensional sequence of length $2^n$ there is exactly one point in each of the $2^n$ hyper-octants that result from subdividing the unit hypercube along each of its length extensions into half.

A low-discrepancy sequence is said to satisfy Property A' if for any binary segment (not an arbitrary subset) of the $n$-dimensional sequence of length $4^n$ there is exactly one point in each of the $4^n$ hyper-octants that result from subdividing the unit hypercube along each of its length extensions into four equal parts.

(Illustrations: Property A; Property A'.)
Distributions of 4 points in two dimensions (Property A):
MC: No. LHS: No. Sobol': Yes.
Distributions of 16 points in two dimensions (Property A'):
MC: No. LHS: No. Sobol': Yes.
Comparison of Discrepancy I. Low Dimensions

Standard MC and LHS generators were used. Sobol' sequence generator: SobolSeq (www.broda.co.uk); the generated Sobol' sequences satisfy Properties A and A'.

(Plots: $\log(D_N^{L2})$ vs $\log_2(N)$ for MC, LHS and QMC-Sobol, $N = 128$ to $32768$.)

Result: QMC in low dimensions shows much smaller discrepancy than MC and LHS.
Comparison of Discrepancy II. High Dimensions

(Plot: $\log(D_N^{L2})$ vs $\log_2(N)$ for MC, LHS and QMC-Sobol, $N = 128$ to $32768$.)

All sampling methods in high dimensions have comparable discrepancy.
Do QMC methods lose their efficiency in higher dimensions?

$$\varepsilon_{QMC} = O\!\left(\frac{(\ln N)^n}{N}\right), \qquad \text{asymptotically } \varepsilon_{QMC} \sim O(1/N),$$
but $\varepsilon_{QMC}$ increases with $N$ until $N^* \approx \exp(n)$.
For $n = 50$, $N^* \approx 5 \times 10^{21}$: not achievable for practical applications.
Is QMC better than MC and LHS in higher dimensions ($n \ge 20$)?
ANOVA decomposition and Sensitivity Indices

Consider a model $Y = f(x)$, where $x = (x_1, x_2, \dots, x_n)$ is a vector of input variables, $0 \le x_i \le 1$, and $f(x)$ is integrable.

ANOVA decomposition:
$$f(x) = f_0 + \sum_i f_i(x_i) + \sum_i \sum_{j>i} f_{ij}(x_i, x_j) + \cdots + f_{1,2,\dots,n}(x_1, x_2, \dots, x_n),$$
$$\int_0^1 f_{i_1 \dots i_s}(x_{i_1}, \dots, x_{i_s})\,dx_{i_j} = 0, \qquad \forall\, 1 \le j \le s.$$

Variance decomposition:
$$\sigma^2 = \sum_i \sigma_i^2 + \sum_{i<j} \sigma_{ij}^2 + \cdots + \sigma_{1,2,\dots,n}^2.$$

Sobol' SI:
$$1 = \sum_i S_i + \sum_{i<j} S_{ij} + \sum_{i<j<l} S_{ijl} + \cdots + S_{1,2,\dots,n}.$$
Sobol' Sensitivity Indices (SI)

Definition: $S_{i_1 \dots i_s} = \sigma_{i_1 \dots i_s}^2 / \sigma^2$.

Partial variances: $\sigma_{i_1 \dots i_s}^2 = \int_0^1 f_{i_1 \dots i_s}^2(x_{i_1}, \dots, x_{i_s})\,dx_{i_1} \cdots dx_{i_s}$.

Total variance: $\sigma^2 = \int \left( f(x) - f_0 \right)^2 dx$.

Sensitivity indices for subsets of variables: $x = (y, z)$,
$$\sigma_y^2 = \sum_{s=1}^{m} \sum_{(i_1 < \cdots < i_s) \in K} \sigma_{i_1 \dots i_s}^2,$$
where $K$ is the set of indices belonging to $y$. Total variance for a subset: $\left(\sigma_y^{tot}\right)^2 = \sigma^2 - \sigma_z^2$.

Corresponding global sensitivity indices: $S_y = \sigma_y^2 / \sigma^2$, $\ S_y^{tot} = \left(\sigma_y^{tot}\right)^2 / \sigma^2$.
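The first-order indices $S_i$ can be estimated by Monte Carlo. The sketch below uses a standard "pick-freeze" estimator (an assumption: the slides do not specify an estimator) on an illustrative additive test function whose indices are known exactly:

```python
import random

def sobol_first_order(f, dim, n, rng):
    """Pick-freeze MC estimate of first-order Sobol' indices:
    V_i = E[ f(A) * ( f(AB_i) - f(B) ) ],  S_i = V_i / Var(f)."""
    A = [[rng.random() for _ in range(dim)] for _ in range(n)]
    B = [[rng.random() for _ in range(dim)] for _ in range(n)]
    yA = [f(x) for x in A]
    yB = [f(x) for x in B]
    f0 = sum(yA) / n
    var = sum((y - f0) ** 2 for y in yA) / n
    S = []
    for i in range(dim):
        # AB_i: matrix B with its i-th column replaced by the i-th column of A
        vi = sum(
            ya * (f(b[:i] + [a[i]] + b[i + 1:]) - yb)
            for a, b, ya, yb in zip(A, B, yA, yB)
        ) / n
        S.append(vi / var)
    return S

# Illustrative test function f(x) = x1 + 2*x2 on [0,1]^2:
# Var = 5/12, V1 = 1/12, V2 = 4/12, so S1 = 0.2 and S2 = 0.8 exactly.
rng = random.Random(1)
S = sobol_first_order(lambda x: x[0] + 2.0 * x[1], dim=2, n=100_000, rng=rng)
print(S)  # close to [0.2, 0.8]
```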
Effective dimensions

Let $u$ be the cardinality of a set of variables $x_u$.

The effective dimension of $f(x)$ in the superposition sense is the smallest integer $d_S$ such that
$$\sum_{0 < u \le d_S} S_u \ge 1 - \varepsilon, \qquad \varepsilon \ll 1.$$
It means that $f(x)$ is almost a sum of $d_S$-dimensional functions.

The function $f(x)$ has effective dimension $d_T$ in the truncation sense if
$$\sum_{u \subseteq \{1,2,\dots,d_T\}} S_u \ge 1 - \varepsilon, \qquad \varepsilon \ll 1.$$

Important property: $d_S \le d_T$.

Example: $f(x) = \sum_{i=1}^{n} x_i \ \Rightarrow\ d_S = 1,\ d_T = n$.
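The example above can be checked with a few analytic lines; the numbers below follow from the definition of the indices for $f(x) = \sum_i x_i$, not from the slides' experiments:

```python
# For f(x) = x_1 + ... + x_n on [0,1]^n every first-order partial variance
# is Var(x_i) = 1/12, all interaction terms vanish, and sigma^2 = n/12,
# so S_i = 1/n for each i.
n = 8
S = [1.0 / n] * n              # first-order Sobol' indices; higher-order S_u = 0
print(sum(S))                  # 1.0 -> d_S = 1 (superposition sense)

# Truncation sense: the subset {1,...,d} captures only d/n of the variance,
# so the threshold 1 - eps is reached only at d = n, hence d_T = n.
coverage = [sum(S[:d]) for d in range(1, n + 1)]
print(coverage)                # [0.125, 0.25, ..., 1.0]
```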
Classification of functions

Type A. Variables are not equally important: $S_y \gg S_z \leftrightarrow d_T \ll n$ ($n_y \ll n$).
Types B, C. Variables are equally important: $S_i \approx S_j \leftrightarrow d_T \approx n$.

Type B. Dominant low-order indices: $\sum_{i=1}^{n} S_i \approx 1 \leftrightarrow d_S \ll n$.
Type C. Dominant higher-order indices: $\sum_{i=1}^{n} S_i \ll 1 \leftrightarrow d_S \approx n$.
When is LHS more effective than MC?

ANOVA: $f(x) = f_0 + \sum_i f_i(x_i) + r(x)$, where $r(x)$ collects the higher-order interaction terms.

LHS (Stein, 1987):
$$E(\varepsilon_{LHS}^2) = \frac{1}{N} \int_{H^n} [r(x)]^2\,dx + o\!\left(\frac{1}{N}\right).$$

MC:
$$E(\varepsilon_{MC}^2) = \frac{1}{N} \sum_i \int_{H^n} [f_i(x_i)]^2\,dx + \frac{1}{N} \int_{H^n} [r(x)]^2\,dx.$$

If $\int_{H^n} [r(x)]^2\,dx$ is small ($\Leftrightarrow$ small $d_S$, Type B functions), then $E(\varepsilon_{LHS}^2) < E(\varepsilon_{MC}^2)$.
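Stein's result can be observed numerically. The sketch below compares MC and LHS RMSE on an assumed purely additive integrand (so $r(x) = 0$ and the LHS error term of order $1/N$ vanishes); integrand, sizes and seeds are illustrative:

```python
import math
import random

def mc_sample(n, dim, rng):
    return [[rng.random() for _ in range(dim)] for _ in range(n)]

def lhs_sample(n, dim, rng):
    # One random permutation per coordinate: x_i^k = (pi_k(i) - 1 + U) / N.
    cols = []
    for _ in range(dim):
        perm = list(range(n))
        rng.shuffle(perm)
        cols.append([(p + rng.random()) / n for p in perm])
    return [list(row) for row in zip(*cols)]

def rmse(sampler, f, exact, n, dim, runs, rng):
    errs = [
        (sum(f(x) for x in sampler(n, dim, rng)) / n - exact) ** 2
        for _ in range(runs)
    ]
    return math.sqrt(sum(errs) / runs)

# Purely additive integrand: exact integral over [0,1]^3 is 1.5.
f = lambda x: sum(x)
rng = random.Random(7)
e_mc = rmse(mc_sample, f, 1.5, n=100, dim=3, runs=100, rng=rng)
e_lhs = rmse(lhs_sample, f, 1.5, n=100, dim=3, runs=100, rng=rng)
print(e_mc, e_lhs)  # LHS error is far smaller for an additive integrand
```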
Classification of functions

| Function type | Description | Relationship between $S_i$ and $S_i^{tot}$ | $d_T$ | $d_S$ | QMC more efficient than MC | LHS more efficient than MC |
|---|---|---|---|---|---|---|
| A | A few dominant variables | $S_y^{tot}/n_y \gg S_z^{tot}/n_z$ | $\ll n$ | $\ll n$ | Yes | No |
| B | No unimportant subsets; only low-order interaction terms are present | $S_i \approx S_j\ \forall i,j$; $S_i/S_i^{tot} \approx 1\ \forall i$ | $\approx n$ | $\ll n$ | Yes | Yes |
| C | No unimportant subsets; high-order interaction terms are present | $S_i \approx S_j\ \forall i,j$; $S_i/S_i^{tot} \ll 1\ \forall i$ | $\approx n$ | $\approx n$ | No | No |
How to monitor convergence of MC, LHS and QMC calculations?

The root mean square error is defined as
$$\varepsilon = \left( \frac{1}{K} \sum_{k=1}^{K} \left( I - I_N^{(k)} \right)^2 \right)^{1/2},$$
where $K$ is the number of independent runs.

MC and LHS: all runs should be statistically independent (use a different seed point).
QMC: for each run a different part of the Sobol' LDS was used (start from a different index number).

The root mean square error is approximated by the formula $\varepsilon \approx c N^{-\alpha}$, $0 < \alpha \le 1$.
MC: $\alpha = 0.5$. QMC: $\alpha \le 1$. LHS: ?
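The RMSE-over-$K$-runs recipe can be sketched as follows for plain MC; the least-squares fit used to extract $\alpha$ and the trivial integrand $E[x] = 0.5$ are illustrative choices:

```python
import math
import random

def mc_error(n, runs, rng):
    """RMSE over `runs` independent MC estimates of E[x] for x ~ U(0,1)."""
    exact = 0.5
    total = 0.0
    for _ in range(runs):
        est = sum(rng.random() for _ in range(n)) / n
        total += (est - exact) ** 2
    return math.sqrt(total / runs)

rng = random.Random(3)
ns = [2 ** k for k in range(7, 13)]
errs = [mc_error(n, runs=300, rng=rng) for n in ns]

# Fit log2(eps) = log2(c) - alpha * log2(N) by least squares.
xs = [math.log2(n) for n in ns]
ys = [math.log2(e) for e in errs]
mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
alpha = -sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
print(alpha)  # close to the theoretical MC rate 0.5
```

This is exactly the procedure behind the fitted slopes quoted on the following slides.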
Integration error vs. N. Type A

(a) $f(x) = \sum_{i=1}^{n} (-1)^i \prod_{j=1}^{i} x_j$, $n = 360$; (b) $f(x) = \prod_{i=1}^{n} \frac{|4x_i - 2| + a_i}{1 + a_i}$, $n = 100$.

$$\varepsilon = \left( \frac{1}{K} \sum_{k=1}^{K} \left( I - I_N^{(k)} \right)^2 \right)^{1/2} \sim N^{-\alpha}, \quad 0 < \alpha < 1; \qquad \text{Type A: } S_y \gg S_z \leftrightarrow d_T \ll n,\ n_y \ll n.$$

(Plots of $\log_2(\text{error})$ vs $\log_2(N)$; fitted slopes $\alpha$: (a) MC 0.50, QMC-SOB 0.94, LHS 0.52; (b) MC 0.49, QMC-SOB 0.65, LHS 0.50.)
Integration error. Type A

$S_y \gg S_z \leftrightarrow d_T \ll n$; $\varepsilon \sim N^{-\alpha}$, $0 < \alpha < 1$.

| Index | Function | Dim $n$ | Slope MC | Slope QMC | Slope LHS |
|---|---|---|---|---|---|
| 1A | $\sum_{i=1}^{n} (-1)^i \prod_{j=1}^{i} x_j$ | 360 | 0.50 | 0.94 | 0.52 |
| 2A | $\prod_{i=1}^{n} \frac{\lvert 4x_i - 2\rvert + a_i}{1 + a_i}$, $a_1 = a_2 = 0$, $a_3 = \dots = a_{100} = 6.52$ | 100 | 0.49 | 0.65 | 0.50 |
Integration error vs. N. Type B

Dominant low-order indices: $\sum_{i=1}^{n} S_i \approx 1 \leftrightarrow d_S \ll n$.

(a) $f(x) = \prod_{i=1}^{n} \frac{n - x_i}{n - 0.5}$, $n = 360$. Slopes: MC 0.52, QMC-SOB 0.96, LHS 0.69.
(b) $f(x) = (1 + 1/n)^n \prod_{i=1}^{n} x_i^{1/n}$, $n = 360$. Slopes: MC 0.50, QMC-SOB 0.87, LHS 0.62.

(Plots of $\log_2(\text{error})$ vs $\log_2(N)$.)
Integration error. Type B functions

Dominant low-order indices: $\sum_{i=1}^{n} S_i \approx 1 \leftrightarrow d_S \ll n$.

| Index | Function | Dim $n$ | Slope MC | Slope QMC | Slope LHS |
|---|---|---|---|---|---|
| 1B | $\prod_{i=1}^{n} \frac{n - x_i}{n - 0.5}$ | 30 | 0.52 | 0.96 | 0.69 |
| 2B | $(1 + 1/n)^n \prod_{i=1}^{n} x_i^{1/n}$ | 30 | 0.50 | 0.87 | 0.62 |
| 3B | $\prod_{i=1}^{n} \frac{\lvert 4x_i - 2\rvert + a_i}{1 + a_i}$, $a_i = 6.52$ | 30 | 0.51 | 0.85 | 0.55 |
The integration error vs. N. Type C

Dominant higher-order indices: $\sum_{i=1}^{n} S_i \ll 1 \leftrightarrow d_S \approx n$.

(a) $f(x) = \prod_{i=1}^{n} \frac{|4x_i - 2| + a_i}{1 + a_i}$ with $a_i = 0$, i.e. $f(x) = \prod_{i=1}^{n} |4x_i - 2|$, $n = 10$. Slopes: MC 0.47, QMC-SOB 0.64, LHS 0.50.
(b) $f(x) = \prod_{i=1}^{n} \frac{x_i}{1/2}$, $n = 10$. Slopes: MC 0.49, QMC-SOB 0.68, LHS 0.51.

(Plots of $\log_2(\text{error})$ vs $\log_2(N)$.)
Integration error for type C functions

Dominant higher-order indices: $\sum_{i=1}^{n} S_i \ll 1 \leftrightarrow d_S \approx n$.

| Index | Function | Dim $n$ | Slope MC | Slope QMC | Slope LHS |
|---|---|---|---|---|---|
| 1C | $\prod_{i=1}^{n} \lvert 4x_i - 2\rvert$ | 10 | 0.47 | 0.64 | 0.50 |
| 2C | $\prod_{i=1}^{n} \frac{x_i}{1/2}$ | 10 | 0.49 | 0.68 | 0.51 |
The integration error vs. N. Function 1A

(Plot: the estimated value of the integral of Function 1A, $f(x) = \sum_{i=1}^{n} (-1)^i \prod_{j=1}^{i} x_j$, $n = 360$, vs $\log_2(N)$ for MC, QMC and LHS against the theoretical value.)

QMC: convergence is monotonic.
MC and LHS: convergence curves are oscillating.
QMC is 30 times faster than MC and LHS.
LHS: it is not possible to incrementally add a new point while keeping the old LHS design.
Evaluation of quantiles I. Low quantile

$f(x) = \sum_{i=1}^{n} x_i^2$, distribution dimension $n = 5$; the $x_i \sim N(0,1)$ are independent standard normal variates.

(Plot: log of the mean error of the low quantile vs $\log_2(N)$ for MC, QMC and LHS, with exponential fits.)

Low quantile (percentile for the cumulative distribution function) = 0.05. A superior convergence of the QMC method.
Evaluation of quantiles II. High quantile

$f(x) = \sum_{i=1}^{n} x_i^2$, distribution dimension $n = 5$; the $x_i \sim N(0,1)$ are independent standard normal variates.

(Plot: log of the mean error of the high quantile vs $\log_2(N)$ for MC, QMC and LHS, with exponential fits.)

High quantile (percentile for the cumulative distribution function) = 0.95. QMC converges faster than MC and LHS.
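A minimal MC-only sketch of this quantile experiment (the QMC and LHS variants are not reproduced here): $f(x) = \sum_{i=1}^{5} x_i^2$ with standard normal inputs follows a chi-square distribution with 5 degrees of freedom, whose tabulated 0.05 and 0.95 quantiles are about 1.145 and 11.07.

```python
import random

def mc_quantile(p, n_samples, dim, rng):
    """Plain MC estimate of the p-quantile of f(x) = sum(x_i^2), x_i ~ N(0,1):
    sample, sort, and read off the order statistic."""
    values = sorted(
        sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(dim))
        for _ in range(n_samples)
    )
    return values[int(p * n_samples)]

rng = random.Random(11)
q05 = mc_quantile(0.05, 200_000, dim=5, rng=rng)
q95 = mc_quantile(0.95, 200_000, dim=5, rng=rng)
print(q05, q95)  # roughly 1.15 and 11.07 (chi-square, 5 d.o.f.)
```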
Summary

Sobol' sequences possess additional uniformity properties which MC and LHS techniques do not have (Properties A and A').
Comparison of $L_2$ discrepancies shows that the QMC method has the lowest discrepancy in low dimensions (up to 20).
The QMC method outperforms MC and LHS for type A and B functions (problems with low effective dimensions).
The LHS method outperforms MC only for type B functions.
QMC remains the most efficient method among the three techniques for non-uniform distributions.