Sparse Grid Methods for
Uncertainty Quantification
Michael Griebel
University of Bonn and Fraunhofer SCAI
Joint work with Alexander Rüttgers
1. Sparse grids
- Construction principles and properties
- Optimal sparse grids
- Adaptive combination method
2. Application
- Multi-scale viscoelastic flows
Motivation
• Numerical methods in uncertainty quantification:
- Galerkin approach
- Collocation technique
- Discrete projection
• Needed on stochastic/parameter domain:
- Approximation of integrals
- Interpolation, especially for collocation
• Simple domains with product structure: $[-a, a]^d$, $\mathbb{R}^d$
• Issue: high- or even infinite-dimensional problems
Curse of dimension
• Let $f: \Omega^{(d)} \to \mathbb{R}$, $f \in V^{(r)}$, with isotropic smoothness $r$ and $M = \#\mathrm{dof}$:
  $\|f - f_M\|_{H^s} \le C(d)\, M^{-r/d}\, |f|_{H^{s+r}} = O(M^{-r/d})$
• Bellman '61: curse of dimension
• Find situations where the curse can be broken?
• Trivial: restrict to $r = O(d)$, which gives $\|f - f_M\| = O(M^{-cd/d}) = O(M^{-c})$,
  but practically not very relevant
• In any case: some smoothness changes with $d$,
  or importance of coordinates decays successively
  (e.g. after suitable nonlinear transformation)
Sparse grid approach
• Basic principles:
– 1-dim multilevel series expansion with proper decay
– d-dim product construction
– Truncation of resulting multivariate expansion
• Effect:
– reduction of cost complexity
– nearly same accuracy as „full“ product
– necessary: certain smoothness requirements
– adaptivity for detection of lower-dimensional manifolds
Simple example: Hierarchical basis
parabola $f(x) = (1 - x)(1 + x)$ in [-1,1]
conventional coefficients (nodal basis of $V_3$):
no decay from level to level
hierarchical coefficients (subspaces $W_1$, $W_2$, $W_3$ for levels $l = 1, 2, 3$):
decay by ¼ from level to level
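The decay of the hierarchical coefficients can be checked numerically. The following Python sketch assumes the parabola $f(x) = (1 - x)(1 + x)$ and a piecewise linear hat-function basis on [-1,1] whose level-$l$ points have mesh width $2^{1-l}$; the function names are illustrative and not taken from the talk.

```python
def f(x):
    # Assumed test function: the parabola vanishing at x = -1 and x = 1.
    return (1.0 - x) * (1.0 + x)

def hierarchical_surpluses(func, level):
    """Hierarchical coefficients of the hat functions on one level.

    Level l has 2**(l-1) new points with spacing h = 2**(1-l); the surplus is
    the function value minus the linear interpolant of the two neighbours.
    """
    h = 2.0 ** (1 - level)
    points = [-1.0 + (2 * i - 1) * h for i in range(1, 2 ** (level - 1) + 1)]
    return [func(x) - 0.5 * (func(x - h) + func(x + h)) for x in points]

# Largest coefficient per level; consecutive levels differ by a factor 1/4.
max_surplus = {l: max(abs(s) for s in hierarchical_surpluses(f, l))
               for l in range(1, 6)}
ratios = [max_surplus[l + 1] / max_surplus[l] for l in range(1, 5)]
```

For this parabola every surplus on level $l$ equals $h_l^2 = 4^{1-l}$, which is the decay by ¼ from level to level.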
Tensor product hierarchical basis
Generalization to higher dimension by tensor product
decay in x- and y-direction by 1/4
decay in diagonal direction by 1/16
Idea: Omit points with small associated hierarchical coefficient values
Table of subspaces $W_{l_1, l_2}$ for $l_1 = 1, 2, 3$ (columns) and $l_2 = 1, 2, 3$ (rows)
Regular sparse grids
Properties of regular sparse grids ($N = 2^n$)
                      Sparse grids                                                                              Full grids
Cost:                 $O(N (\log N)^{d-1})$                                                                     $O(N^d)$
Accuracy ($L_2$-norm): $O(N^{-2} (\log N)^{d-1})$                                                                $O(N^{-2})$
Smoothness:           $|\partial^{2d} f / (\partial x_1^2 \cdots \partial x_d^2)| \le c$                        $|\partial^2 f / \partial x_i^2| \le c$, $i = 1, \ldots, d$
Space, seminorm:      $H^2_{mix}$, $|f|_{H^2_{mix}}$                                                            $H^2$, $|f|_{H^2}$
Mitigates the curse of dimension of conventional full grids
Note: Higher regularity in mixed derivative, ~d
For wavelets, general stable multiscale systems:
$O(N^{-2} (\log N)^{(d-1)/2})$ instead of $O(N^{-2} (\log N)^{d-1})$
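The cost statement can be illustrated by counting points. This is a minimal sketch that assumes interior hierarchical subspaces with $|W_{\mathbf{l}}| = 2^{|\mathbf{l}|_1 - d}$ and the regular sparse index set $|\mathbf{l}|_1 \le n + d - 1$; the helper names are made up for the example.

```python
from itertools import product

def sparse_grid_size(n, d):
    """Points of the regular sparse grid: sum of 2**(|l|_1 - d) over |l|_1 <= n + d - 1."""
    return sum(2 ** (sum(l) - d)
               for l in product(range(1, n + 1), repeat=d)
               if sum(l) <= n + d - 1)

def full_grid_size(n, d):
    """Interior points of the full grid with 2**n - 1 points per direction."""
    return (2 ** n - 1) ** d

# Compare growth for a fixed level and increasing dimension.
table = {d: (sparse_grid_size(5, d), full_grid_size(5, d)) for d in range(1, 5)}
```

In one dimension both counts coincide; for larger $d$ the full grid count grows like $N^d$ while the sparse count only picks up logarithmic factors.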
Re-invented several times:
1957 Korobov, Babenko: hyperbolic cross points
1963 Smolyak
1970 Gordon: blending method
1980 Delvos, Posdorf: Boolean interpolation
1990 Zenger, G.: sparse grids
2000 Stromberg, deVore: hyperbolic wavelets
2010 ????
Application areas include:
• solution of PDEs
• integral equations
• eigenvalue problems
• quadrature
• interpolation
• data compression
History of regular sparse grids
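Basic principles of sparse grids
• 1-dim multilevel sequence of operators and spaces
  $P_l: V^{(1)} \to V_l$, $l \in \mathbb{N}$
• Sequence of differences, telescopic approach
  $\Delta_l := (P_l - P_{l-1}): V^{(1)} \to W_l := V_l \ominus V_{l-1}$, $f_l = \Delta_l(f) \in W_l$
• d-dim. product construction
  $\Delta_{\mathbf{l}} := \bigotimes_{j=1}^d \Delta_{l_j}: V^{(d)} \to W_{\mathbf{l}}$, $\mathbf{l} = (l_1, l_2, \ldots, l_d) \in \mathbb{N}^d$
• Appropriate truncation of resulting multivariate
  expansion
  $P_{\mathcal{I}} := \sum_{\mathbf{l} \in \mathcal{I}} \Delta_{\mathbf{l}}$, $\mathcal{I} \subset \mathbb{N}^d$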
Examples of multiscale expansions, 1d
• Integration: $P_l = Q_l: V^{(1)} \to \mathbb{R}$
  – Sequence of nested or non-nested point sets and weights,
    size: $n_l = l$ or $n_l = 2^l - 1$
    => various sparse grid quadrature rules
• Interpolation $P_l = I_l: V^{(1)} \to V_l$, approximation $P_l = A_l: V^{(1)} \to V_l$
  – Local piecewise polynomials, multiscale expansion:
    hierarchical basis, interpolets, wavelets, multilevel basis,
    size: $n_l = 2^l - 1$, $|W_l| = 2^{l-1}$
    => sparse grid finite element spaces
  – Global polynomials: Fourier series, Chebyshev, Legendre,
    Hermite, Bernoulli polynomials
    size: $n_l = l$ with $|W_l| = 1$, or $n_l = 2^l - 1$ with $|W_l| = 2^{l-1}$
    => total degree / hyperbolic cross approximation
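Regular sparse grid approach
• Index sets
  $\mathcal{I}_n^{full} := \{\mathbf{l} \in \mathbb{N}^d : |\mathbf{l}|_\infty = \max_{j=1,\ldots,d} l_j \le n\}$
  $\mathcal{I}_n^{sparse} := \{\mathbf{l} \in \mathbb{N}^d : |\mathbf{l}|_1 = \sum_{j=1}^d l_j \le n + d - 1\}$
• The hierarchical representation is then
  $P_n^{sparse} := \sum_{|\mathbf{l}|_1 \le n+d-1} \Delta_{\mathbf{l}}$, $f_n^{sparse} := P_n^{sparse}(f) = \sum_{|\mathbf{l}|_1 \le n+d-1} f_{\mathbf{l}}$
  with the hierarchical contributions $f_{\mathbf{l}} = \Delta_{\mathbf{l}}(f)$
• Other representations:
  – generating system
  – Lagrange system over SG points
  – semi-hierarchical
  – combination method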
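The combination technique
• A simple alternative representation is [G., Schneider, Zenger 91],
  $P_n^{combi} := \sum_{q=0}^{d-1} (-1)^q \binom{d-1}{q} \sum_{|\mathbf{l}|_1 = n+(d-1)-q} P_{\mathbf{l}}$, where $P_{\mathbf{l}} := \bigotimes_{j=1}^d P_{l_j}$
  – Involves just the (anisotropic) full grid discretizations $P_{\mathbf{l}}$ on
    different levels and linearly combines them
• 2D example
  $P_n^{combi} = \sum_{|\mathbf{l}|_1 = n+1} P_{\mathbf{l}} - \sum_{|\mathbf{l}|_1 = n} P_{\mathbf{l}}$
  (figure: $n = 4$; 5 indices, level n)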
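As an illustration of the 2D formula, the sketch below combines anisotropic full-grid trapezoidal rules on $[0,1]^2$. The choice of quadrature as the operator $P_{\mathbf{l}}$, the test integrand and all function names are assumptions made for this example only.

```python
import math

def trapezoid_2d(func, l1, l2):
    """Tensor-product trapezoidal rule on [0,1]^2 with 2**l1 x 2**l2 cells."""
    n1, n2 = 2 ** l1, 2 ** l2
    total = 0.0
    for i in range(n1 + 1):
        wi = 0.5 if i in (0, n1) else 1.0
        for j in range(n2 + 1):
            wj = 0.5 if j in (0, n2) else 1.0
            total += wi * wj * func(i / n1, j / n2)
    return total / (n1 * n2)

def combination_2d(func, n):
    """Sum over |l|_1 = n + 1 minus sum over |l|_1 = n, with l_1, l_2 >= 1."""
    plus = sum(trapezoid_2d(func, l1, n + 1 - l1) for l1 in range(1, n + 1))
    minus = sum(trapezoid_2d(func, l1, n - l1) for l1 in range(1, n))
    return plus - minus

integrand = lambda x, y: math.exp(x + y)   # smooth test integrand (assumption)
exact = (math.e - 1.0) ** 2
error_combi = abs(combination_2d(integrand, 6) - exact)
error_full = abs(trapezoid_2d(integrand, 6, 6) - exact)
```

Each call to `trapezoid_2d` is an independent full-grid problem, so an existing full-grid code can be reused unchanged and the subproblems can run in parallel.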
The combination technique
• Redundant representation but allows the simple
  reuse of existing code
• Completely parallel computation of the subproblems $P_{\mathbf{l}}$
• Corresponds to a certain multivariate extrapolation
  method [Rüde 91]
• Necessary: Existence of a pointwise error expansion.
  – Euler-Maruyama of stochastic ODE: additive expansion
    (leading error term) of mean square error
• Multilevel-Monte Carlo is just a 2-d combination method
  – Variance and bias for the two dimensions and a proper
    refinement rule which reflects the MC and the Euler-Maruyama
    rates [Gerstner12, Harbrecht,Peters,Siebenmorgen13]
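A priori construction of sparse grids
• In general: Given
  – a class of functions and an error norm
  – an associated bound $b(\mathbf{l})$ for the benefit of the hierarchical contribution $\Delta_{\mathbf{l}}$
  – a bound $c(\mathbf{l})$ for the cost of $\Delta_{\mathbf{l}}$
• We can a-priori derive a (quasi-) optimal sparse grid
  by solving a binary knapsack problem [Bungartz+G.03]: for fixed cost $C$,
  $\max \sum_{\mathbf{l} \in \mathbb{N}^d} x_{\mathbf{l}}\, b(\mathbf{l})$ such that $\sum_{\mathbf{l} \in \mathbb{N}^d} x_{\mathbf{l}}\, c(\mathbf{l}) \le C$, $x_{\mathbf{l}} \in \{0, 1\}$,
  and setting $\mathcal{I}_C := \{\mathbf{l} \in \mathbb{N}^d : x_{\mathbf{l}} = 1\}$
• Boils down to just sorting the quotients $b(\mathbf{l})/c(\mathbf{l})$ of the
  benefit versus cost according to their size and taking
  the largest indices into account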
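$L_2$-norm-based sparse grids in $H^2_{mix}$
• Representation
  $f(\mathbf{x}) = \sum_{\mathbf{l}} f_{\mathbf{l}}(\mathbf{x})$, $\mathbf{x} = (x_1, \ldots, x_d)$, $\mathbf{l} = (l_1, \ldots, l_d)$, $f_{\mathbf{l}}(\mathbf{x}) \in W_{\mathbf{l}}$
• Cost per subspace
  $c(\mathbf{l}) = \dim(W_{\mathbf{l}}) = 2^{|\mathbf{l} - \mathbf{1}|_1} = O(2^{|\mathbf{l}|_1})$
• Benefit for accuracy
  $\|f_{\mathbf{l}}\|_{L_2} \le b(\mathbf{l}) := 3^{-d}\, 2^{-2|\mathbf{l}|_1}\, |f|_{H^2_{mix}}$
• Choice of best subspaces ? Knapsack problem !
  => local benefit²/cost ratio
  $b^2(\mathbf{l})/c(\mathbf{l}) \sim 2^{-4|\mathbf{l}|_1} / 2^{|\mathbf{l}|_1} = 2^{-5|\mathbf{l}|_1}$
  isoline $|\mathbf{l}|_1 = n + d - 1$ in the $(l_1, l_2)$ plane
  => regular sparse grid space $V_n^{(d),opt} := \bigoplus_{|\mathbf{l}|_1 \le n+d-1} W_{\mathbf{l}}$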
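The sorting argument can be tried out directly. The sketch below is an illustration under stated assumptions: constants are dropped, so it uses $b^2(\mathbf{l}) = 2^{-4|\mathbf{l}|_1}$ and $c(\mathbf{l}) = 2^{|\mathbf{l}|_1 - d}$, and a simple greedy pass over the sorted ratios stands in for the knapsack solution. The function name and the candidate box `max_level` are hypothetical.

```python
from itertools import product

def greedy_index_set(d, max_level, budget):
    """Pick subspaces by decreasing benefit^2/cost ratio until the cost budget is used."""
    cost = lambda l: 2 ** (sum(l) - d)               # dim(W_l), constants dropped
    benefit_sq = lambda l: 2.0 ** (-4 * sum(l))      # squared L2 benefit, constants dropped
    candidates = sorted(product(range(1, max_level + 1), repeat=d),
                        key=lambda l: benefit_sq(l) / cost(l), reverse=True)
    chosen, spent = set(), 0
    for l in candidates:
        if spent + cost(l) > budget:
            break
        chosen.add(l)
        spent += cost(l)
    return chosen

# Budget 17 is the size of the regular 2D sparse grid with n = 3.
chosen = greedy_index_set(d=2, max_level=6, budget=17)
regular = {l for l in product(range(1, 7), repeat=2) if sum(l) <= 3 + 2 - 1}
```

Because the ratio depends on $|\mathbf{l}|_1$ only, whole isolines are selected one after another, and the greedy choice coincides with the regular sparse grid index set whenever the budget matches a complete isoline.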
Anisotropic sparse grids
• Non-equal directions
  – Weighted Sobolev spaces $H^{r}_{mix,\gamma}$ [Sloan+Wozniakowski93]
  – Anisotropic smoothness spaces [Gerstner+G. 98, G.+Zung15]
    $H^{s_1, s_2, \ldots, s_d}_{mix} = H^{s_1}(I_1) \otimes H^{s_2}(I_2) \otimes \cdots \otimes H^{s_d}(I_d)$
  – Different dimensions for different directions [G.+Harbrecht 11]
    $H^{s_1}(\Omega_1) \otimes H^{s_2}(\Omega_2) \otimes \cdots \otimes H^{s_d}(\Omega_d)$
• Via knapsack problem:
  – A priori construction of optimal
    anisotropic sparse index sets
  – log-terms disappear
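Generalized sparse grids
• General index sets $\mathcal{I} \subset \mathbb{N}^d$
• Downward closed set, no holes:
  $\mathbf{l} \in \mathcal{I}$ and $\mathbf{l} - \mathbf{e}_j \ge \mathbf{1}$ imply $\mathbf{l} - \mathbf{e}_j \in \mathcal{I}$, $j = 1, \ldots, d$
• Associated sparse grid operator
  $P_{\mathcal{I}} := \sum_{\mathbf{l} \in \mathcal{I}} \Delta_{\mathbf{l}}$
• Associated space and associated function
  $V_{\mathcal{I}} := \bigoplus_{\mathbf{l} \in \mathcal{I}} W_{\mathbf{l}}$, $f_{\mathcal{I}} := P_{\mathcal{I}}(f) = \sum_{\mathbf{l} \in \mathcal{I}} f_{\mathbf{l}}$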
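The combination technique
• Can also be generalized to a given downward closed
  index set $\mathcal{I}$:
  $P_{\mathcal{I}} = \sum_{\mathbf{l} \in \mathcal{I}} c_{\mathbf{l}}\, P_{\mathbf{l}}$
• Combination coefficient
  $c_{\mathbf{l}} := \sum_{\mathbf{z} = \mathbf{0}}^{\mathbf{1}} (-1)^{|\mathbf{z}|_1}\, \chi_{\mathcal{I}}(\mathbf{l} + \mathbf{z})$
  with characteristic function $\chi_{\mathcal{I}}$ on the index set
• Again: just (anisotropic) full grid discretizations $P_{\mathbf{l}}$
  on different levels get linearly combined
• Note: many coefficients on the lower levels are zero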
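The coefficient formula translates into a few lines of Python. In this sketch the index set is a plain set of integer tuples with levels starting at 1; the function name is an assumption for illustration.

```python
from itertools import product

def combination_coefficients(index_set):
    """c_l = sum over z in {0,1}^d of (-1)^{|z|_1} * chi_I(l + z); zero entries are dropped."""
    index_set = set(index_set)
    d = len(next(iter(index_set)))
    coeffs = {}
    for l in index_set:
        c = sum((-1) ** sum(z)
                for z in product((0, 1), repeat=d)
                if tuple(a + b for a, b in zip(l, z)) in index_set)
        if c != 0:
            coeffs[l] = c
    return coeffs

# Regular 2D sparse index set |l|_1 <= 4: recovers the classical +1 / -1 pattern.
regular = {(l1, l2) for l1 in range(1, 4) for l2 in range(1, 4) if l1 + l2 <= 4}
coeffs = combination_coefficients(regular)
```

For the regular set only the two top diagonals carry nonzero coefficients, and the index $(1,1)$ drops out, which is the statement that many lower-level coefficients are zero.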
Tensor product sparse grids
• Examples: – space time, $d_1 = 3$, $d_2 = 1$, parabolic problems
            – space parameters, $d_1 = 3$, $d_2 = 10$-$20$,
              but smooth in parameter variables
            – space stochastics, $d_1 = 3$, $d_2 = \infty$,
              but analytic in stochastic variables
• Main result: Curse of dimension only w.r.t. the
  larger dimension and/or the lower smoothness [G.+Harbrecht11], [G.+Zung15]
• Time, parametrization and stochastic coordinates
  disappear in the overall complexity rate
  => just space discretization matters
Sparse space-time grids
• Approximation error and necessary regularity [G.+Oeltz07]: for $u \in H^2(\Omega) \otimes H^2((0,T))$,
  $\inf_{u_n \in V_n} \|u - u_n\|_{H^1_0(\Omega) \otimes L^2((0,T))} \le c\, 2^{-n}\, \|u\|_{H^2(\Omega) \otimes H^2((0,T))}$
  – Classical regularity theory (Ladyzenskaja, Wloka)
  – Sparse space-time grids possess the same approximation rate as
    full space-time grids but just the cost complexity of the space problem
  – In each time slice there is a conventional full grid
(figures: space dimension 1, space-time sparse grid, Euler case;
space dimension 2, space-time sparse grid, Crank-Nicolson case, n=4,5)
Stochastic and parametric PDEs
• Solutions of stochastic/parametric PDEs
),,(),,(),(),,( yxyxyxyx trtfAtut
  live on product $T \times X \times Y$, $(t, \mathbf{x}, \mathbf{y}) \in T \times X \times Y$,
  • of temporal domain $T$
  • of spatial domain $X$ with $d_1 = 1, 2, 3$
  • and stochastic/parametric domain $Y$ with $d_2$ large or
    even infinity.
• Often: Very high smoothness in $\mathbf{y}$-part
  – Here: especially weighted analyticity for the different
    coordinates, decay in covariance [Cohen,Devore,Schwab10,11]
  – Then, even infinite-dimensional problems become treatable
• Sparse grid not only within stochastics but also
  between spatial, temporal and stochastic domain
Sparse grids and analytic functions
• Analytic regularity in polydisc with radii $\mathbf{r} := (r_1, \ldots, r_d)$
• Sequence of smoothness indices $\mathbf{a} = (a_1, \ldots, a_d) = \log(\mathbf{r})$
• With global polynomials: $|f_{\mathbf{k}}| \le c\, e^{-(a_1 k_1 + \ldots + a_d k_d)}$
• Accuracy with respect to the involved #dof $M$ [Beck,Nobile,Tamellini,Tempone12,14], [Tran,Webster,Zhang15], [G.+Oettershagen15]:
  $O\big(e^{-\kappa(d)\, gm(\mathbf{a})\, M^{1/d}}\, M^{(d-1)/d}\big)$
  with geometric mean $gm(\mathbf{a}) = (\prod_{j=1}^d a_j)^{1/d}$ and $\kappa(d) = (d!)^{1/d} > d/e$
• For the infinite-dimensional case:
  – Logarithmic growth of $a_j$ => algebraic rate [Todor,Schwab07], [Cohen,Devore,Schwab10,11]
    (Stechkin's Lemma)
  – Linear growth of $a_j$ => subexponential rate [G.+Oettershagen15],
    [Tran,Webster,Zhang15]
    (Stechkin's Lemma cannot show this rate but gives only
    an algebraic bound)
Dimension-adapted sparse grids
• So far: function class known,
  – a-priori choice of best subspaces $W_{\mathbf{l}}$ by optimization
  – size of benefit/cost ratio indicated if subspace is relevant
  => sparse grid patterns for $T \times X \times Y$
• Now: for given single function $f$
  – adaptively build up a set of active indices
  – benefit $b(\mathbf{l}) := \|\Delta_{\mathbf{l}}(f)\|_2$, i.e. local error-indicator of $f$
  – cost $c(\mathbf{l}) := |W_{\mathbf{l}}|$ for subspace $W_{\mathbf{l}}$,
  – benefit/cost indicator $\varepsilon(\mathbf{l}) := b(\mathbf{l})/c(\mathbf{l})$
  – refinement strategy to build new index set,
  – global stopping criterion => sparse grid pattern
• Directions with product of different smoothness
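The adaptive construction can be sketched with a priority queue. This is not the authors' implementation: the indicator is a synthetic stand-in for the benefit/cost ratio, the stopping rule (sum of indicators over the active set below a tolerance) is an assumption, and all names are illustrative.

```python
import heapq

def backward_neighbours(l):
    """All indices l - e_j that still have levels >= 1."""
    return [l[:k] + (l[k] - 1,) + l[k + 1:] for k in range(len(l)) if l[k] > 1]

def dimension_adaptive(indicator, d, tol, max_indices=10_000):
    """Grow a downward closed index set by always refining the largest indicator."""
    start = (1,) * d
    old = set()                                   # indices already refined
    active = {start: indicator(start)}            # candidates on the boundary
    heap = [(-active[start], start)]
    while heap and sum(active.values()) > tol and len(old) + len(active) < max_indices:
        _, l = heapq.heappop(heap)                # index with largest indicator
        del active[l]
        old.add(l)
        for j in range(d):
            cand = l[:j] + (l[j] + 1,) + l[j + 1:]
            # Admissible only if every backward neighbour was already refined.
            if all(b in old for b in backward_neighbours(cand)):
                active[cand] = indicator(cand)
                heapq.heappush(heap, (-active[cand], cand))
    return old | set(active)

# Synthetic anisotropic indicator: direction 1 decays faster than direction 2.
index_set = dimension_adaptive(lambda l: 2.0 ** (-2 * l[0] - l[1]), d=2, tol=1e-3)
```

With this indicator the resulting set is refined further in the second direction, an anisotropic pattern found without prescribing it in advance.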
The adaptive combination algorithm
(figure: refinement rule and downward closedness in the $(l_1, l_2)$ index plane)
simple extension to dimension-adaptive version exists => UQ14
Example
• Evolution of the algorithm: (figure: index sets and corresponding grids)
• As any adaptive heuristics: may terminate too early
• If mixed regularity is not present, refinement to the usual full grid
Application: Non-Newtonian fluids
• Classical Newtonian fluids: Obey Newton's law of
  viscosity, stress tensor is proportional to load/force
• But various complex fluids show strange behavior
  which is not correctly described
(figures: Weissenberg effect, tubeless siphon effect, Barus effect)
Application: Non-Newtonian fluids
• Non-Newtonian fluids contain microstructures which
are the reason for their unusual properties
• Examples: paint, toothpaste, shampoo, blood, oils
• Polymeric fluids are a subset of non-Newtonian fluids
• Long-chained molecules in a Newtonian solvent
• Viscoelasticity due to interaction of elastic molecules and
drag forces in basic flow
• A macroscopic model like the Navier-Stokes equations
+ macroscopic extensions is no longer sufficient
• Needs to be augmented by model on the micro scale
=> Two scale modelling
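Mathematical modelling
• The conservation equations for polymeric fluids are
  the same as for the Newtonian case, but the
  presence of polymer molecules contributes a
  polymeric extra-stress tensor $\boldsymbol{\tau}_p$ and an additional
  polymeric viscosity $\eta_p$ such that the viscosity ratio is $\beta = \eta_s / (\eta_s + \eta_p)$
• The Navier-Stokes equations are now
  $\partial_t \mathbf{u} + (\mathbf{u} \cdot \nabla)\mathbf{u} = -\nabla p + \frac{1}{Re}\,\beta\,\Delta \mathbf{u} + \frac{1}{Re}\,\nabla \cdot \boldsymbol{\tau}_p$   (conservation of momentum)
  $\nabla \cdot \mathbf{u} = 0$
  + b.c., with Reynolds number $Re$
  and viscosity ratio $\beta = \eta_s / (\eta_s + \eta_p)$
  ($\eta_s$: solvent viscosity, $\eta_p$: polymeric viscosity)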
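Microscopic modelling
• On the microscopic scale, a polymer chain is
  modelled by a spring chain of K+1 beads
• Position $\mathbf{x} \in \mathbb{R}^3$ in physical space/flow domain
• Orientations $\mathbf{q}_1, \ldots, \mathbf{q}_K \in \mathbb{R}^{3K}$ in configuration space
• Probability $\psi(\mathbf{x}, \mathbf{q}_1, \ldots, \mathbf{q}_K, t)\, d\mathbf{x}\, d\mathbf{q}_1 \cdots d\mathbf{q}_K$ to find chains at time $t$ with position in
  $[\mathbf{x}, \mathbf{x} + d\mathbf{x}]$ and orientations in $[\mathbf{q}_1, \mathbf{q}_1 + d\mathbf{q}_1], \ldots, [\mathbf{q}_K, \mathbf{q}_K + d\mathbf{q}_K]$
  $\psi: \Omega \times \mathbb{R}^{3K} \times [0, T] \to \mathbb{R}$, $(\mathbf{x}, \mathbf{q}_1, \ldots, \mathbf{q}_K, t) \mapsto \psi(\mathbf{x}, \mathbf{q}_1, \ldots, \mathbf{q}_K, t)$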
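Fokker-Planck equation
• The function $\psi$ is a pdf, i.e. $\int \psi\, d\mathbf{q}_1 \cdots d\mathbf{q}_K = 1$, $\psi \ge 0$
• The application of Newton's 2nd law to the forces
  acting on the chain leads to the Fokker-Planck equation
  $\partial_t \psi + (\mathbf{u} \cdot \nabla_{\mathbf{x}})\psi = -\sum_{i=1}^K \nabla_{\mathbf{q}_i} \cdot \Big( \big( (\nabla_{\mathbf{x}} \mathbf{u})^T \mathbf{q}_i - \frac{1}{4De} \sum_{j=1}^K A_{ij}\, \mathbf{F}(\mathbf{q}_j) \big) \psi \Big) + \frac{1}{4De} \sum_{i=1}^K \sum_{j=1}^K A_{ij}\, \nabla_{\mathbf{q}_i} \cdot \nabla_{\mathbf{q}_j} \psi$
  with Rouse matrix $A = \mathrm{tridiag}[-1\ 2\ -1] \in \mathbb{R}^{K \times K}$ and Deborah number $De$
• Describes evolution of $\psi$ under the chain's spring forces $\mathbf{F} = (\mathbf{F}(\mathbf{q}_1), \ldots, \mathbf{F}(\mathbf{q}_K))$
• Various models for spring force:
  Hooke: $\mathbf{F}(\mathbf{q}) = \mathbf{q}$,
  FENE: $\mathbf{F}(\mathbf{q}) = \mathbf{q} / (1 - \|\mathbf{q}\|^2 / b)$, $\|\mathbf{q}\|^2 \le b$,
  FENE-P: $\mathbf{F}(\mathbf{q}) = \mathbf{q} / (1 - \langle \|\mathbf{q}\|^2 \rangle / b)$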
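The three spring force laws and the Rouse matrix are simple enough to write down directly. The sketch below uses the standard textbook forms of the Hooke, FENE and FENE-P forces with maximal extensibility parameter `b`; function names and the tuple representation of vectors are assumptions for illustration.

```python
def hooke(q):
    """Linear spring: F(q) = q."""
    return tuple(q)

def fene(q, b):
    """FENE spring: F(q) = q / (1 - |q|^2 / b), only defined for |q|^2 < b."""
    r2 = sum(c * c for c in q)
    if r2 >= b:
        raise ValueError("spring extension exceeds the FENE limit b")
    return tuple(c / (1.0 - r2 / b) for c in q)

def fene_p(q, mean_r2, b):
    """FENE-P closure: the squared length is replaced by its ensemble mean."""
    return tuple(c / (1.0 - mean_r2 / b) for c in q)

def rouse_matrix(K):
    """K x K tridiagonal matrix with 2 on the diagonal and -1 next to it."""
    return [[2 if i == j else -1 if abs(i - j) == 1 else 0 for j in range(K)]
            for i in range(K)]
```

The FENE force diverges as the extension approaches the limit $b$, which is one reason a more careful (semi-implicit) time discretization is used for FENE than for the linear Hooke model.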
Coupling to the macro scale
• $\psi$ represents polymeric configurations of the micro-system
• Expectation in configuration space: $\langle \cdot \rangle := \int \cdot\ \psi\, d\mathbf{q}_1 \cdots d\mathbf{q}_K$
• Coupling of internal configurations of the micro system to the
  macroscopic stress tensor via Kramers' expression
  $\boldsymbol{\tau}_p = C \sum_{i=1}^K \big( \langle \mathbf{q}_i \otimes \mathbf{F}(\mathbf{q}_i) \rangle - \mathrm{Id} \big)$
  Constant C depends on model, Deborah number, viscosity ratio
• Issues with the Fokker-Planck equation
  – becomes more singular for higher values of $De$ [Suli, Knezevic08]
    => extremely fine numerical resolution needed [Lozinski, Owen 03]
  – $3 + 3K = 3(K+1)$-dimensional + time-dependent => curse of dim.
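Stochastic microscopic modelling
• There is a formal equivalence between the Fokker-Planck
  equation and the stochastic partial differential eq.
  $d\mathbf{Q}(\mathbf{x}, t) + (\mathbf{u} \cdot \nabla_{\mathbf{x}})\, \mathbf{Q}(\mathbf{x}, t)\, dt = \Big( (\nabla_{\mathbf{x}} \mathbf{u})^T \mathbf{Q}(\mathbf{x}, t) - \frac{1}{4De}\, A\, \mathbf{F}(\mathbf{Q}(\mathbf{x}, t)) \Big) dt + \sqrt{\frac{1}{2De}}\, d\mathbf{U}(t)$
  with Deborah number $De$
  – Describes evolution of random fields $\mathbf{Q} = (\mathbf{Q}_1, \ldots, \mathbf{Q}_K)^T$ that
    represent the configuration vector $\mathbf{q} = (\mathbf{q}_1, \ldots, \mathbf{q}_K)^T$
  – Brownian forces on the beads are modelled by the 3-dim.
    Wiener processes $\mathbf{W}_i(t)$, $i = 1, \ldots, K+1$
  – The vector $\mathbf{U}(t)$ consists of the component-wise differences
    $(\mathbf{U}(t))_i = \mathbf{W}_{i+1}(t) - \mathbf{W}_i(t)$, $i = 1, \ldots, K$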
Stochastic microscopic simulation
• Brownian configuration fields (BCF) [Hulsen97]
  Random field $\mathbf{Q}(\mathbf{x}, t)$ for configuration
• Discretization of x-space: the $M_G$ grid cells make from
  the parabolic SPDE a system of SODEs (MoL)
• Discretization of SODE-system: Put $M_B$ configuration
  fields in each of the $M_G$ space grid cells and evolve
  their configuration discretely over time, i.e. all $M_G \cdot M_B$
  configuration fields have fixed spatial positions (Eulerian view).
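Stochastic microscopic simulation
• In each grid cell with center $\mathbf{x}_k$, $k = 1, \ldots, M_G$, we solve/
  integrate the stochastic DE for a number $M_B$ of
  stochastic realizations $\mathbf{Q}^{(j)}(\mathbf{x}_k, t)$, $j = 1, \ldots, M_B$
• They are distributed according to the known
  equilibrium density for $t = 0$
• But we do not know $\psi$ for $t > 0$. Thus, we approximate
  the first moments $\langle \mathbf{Q}_i(\mathbf{x}_k, t) \otimes \mathbf{F}(\mathbf{Q}_i(\mathbf{x}_k, t)) \rangle$ in Kramers'
  relation as
  $\boldsymbol{\tau}_p(\mathbf{x}_k, t) = C \sum_{i=1}^K \big( \langle \mathbf{Q}_i(\mathbf{x}_k, t) \otimes \mathbf{F}(\mathbf{Q}_i(\mathbf{x}_k, t)) \rangle - \mathrm{Id} \big) \approx C \sum_{i=1}^K \Big( \frac{1}{M_B} \sum_{j=1}^{M_B} \mathbf{Q}_i^{(j)}(\mathbf{x}_k, t) \otimes \mathbf{F}(\mathbf{Q}_i^{(j)}(\mathbf{x}_k, t)) - \mathrm{Id} \Big)$
  i.e. we replace the integral by Monte Carlo quadrature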
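The Monte Carlo form of Kramers' relation can be written as a short routine. The sketch below treats one grid cell, assumes a Hookean dumbbell ($K = 1$) whose equilibrium distribution is a standard normal in each component, and sets $C = 1$; the data layout and names are assumptions for this example.

```python
import random

def kramers_stress(samples, force, C=1.0):
    """Monte Carlo estimate of C * sum_i ( mean(Q_i (x) F(Q_i)) - Id ) for one cell.

    samples: list of realizations, each a list of K spring vectors in R^3.
    """
    M, K = len(samples), len(samples[0])
    tau = [[0.0] * 3 for _ in range(3)]
    for q in samples:
        for i in range(K):
            F = force(q[i])
            for a in range(3):
                for b in range(3):
                    tau[a][b] += q[i][a] * F[b] / M     # sample mean of outer product
    for a in range(3):
        tau[a][a] -= K                                  # subtract Id once per spring
    return [[C * t for t in row] for row in tau]

rng = random.Random(0)
equilibrium = [[[rng.gauss(0.0, 1.0) for _ in range(3)]] for _ in range(20000)]
tau_p = kramers_stress(equilibrium, force=lambda q: q)   # Hooke: F(q) = q
```

At equilibrium the polymeric stress vanishes, so the entries of `tau_p` only show the Monte Carlo noise of order $M_B^{-1/2}$.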
Numerics
• Navier-Stokes equations:
  – Uniform grid cells, staggered grid, cell centers $p$, $\boldsymbol{\tau}_p$, cell faces $\mathbf{u}$
  – WENO for convective terms, 2nd order scheme for other terms
  – Euler or Crank-Nicolson in time, CFL-condition
  – Chorin-like projection method
• Microscale stochastic equations:
  – $M_B$ stochastic samples for each grid cell => $M_G \cdot M_B$ samples
  – QUICK for convective terms
  – Explicit Euler-Maruyama, semi-implicit Euler for FENE
  – Same time step size as for NS equations
  – Variance reduction scheme with equilibrium control variates
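A minimal sketch of the explicit Euler-Maruyama step for the microscale equation, under strong simplifications that are assumptions of this example: a single Hookean dumbbell ($K = 1$, so $A = [2]$), a homogeneous shear flow $\mathbf{u} = (\dot\gamma y, 0, 0)$ without the convective term, and the noise of the difference of two Wiener processes written as one Gaussian increment with variance $dt/De$ per component.

```python
import math
import random

def euler_maruyama_dumbbell(De, dt, steps, n_samples, shear_rate=0.0, seed=0):
    """Evolve dumbbell realizations Q in R^3 with dQ = (kappa Q - Q/(2 De)) dt + noise."""
    rng = random.Random(seed)
    # Start from the equilibrium distribution (standard normal components).
    samples = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(n_samples)]
    noise = math.sqrt(dt / De)          # sqrt(1/(2 De)) * sqrt(2 dt)
    for _ in range(steps):
        for q in samples:
            drift = (shear_rate * q[1] - q[0] / (2.0 * De),
                     -q[1] / (2.0 * De),
                     -q[2] / (2.0 * De))
            for a in range(3):
                q[a] += drift[a] * dt + noise * rng.gauss(0.0, 1.0)
    return samples

final = euler_maruyama_dumbbell(De=0.5, dt=0.01, steps=200, n_samples=2000)
variance_x = sum(q[0] ** 2 for q in final) / len(final)
```

Without shear the process stays close to its equilibrium, so `variance_x` remains near 1 up to sampling noise and the time discretization error.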
Issues
• Code works as expected
• But: Huge memory requirements and huge computing times due to large
  number $M_B$ of realizations in each cell
• Example for 3D multi-scale problem
  – Flow domain with
    • $M_G$ = 100x100x100 grid cells
    • $M_B$ = 10,000 stochastic realizations in each grid cell
  – Total memory requirements:
    • 8 MB for the pressure field $p$ (Newtonian)
    • 24 MB for the velocity field $\mathbf{u}$ (Newtonian)
    • 48 MB for the six independent components of $\boldsymbol{\tau}_p$ (non-Newtonian)
    • 75 GB*N for all the $M_G \cdot M_B$ stochastic variables (non-Newtonian)
  – Some months of computing time
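The memory figures follow from simple arithmetic. The sketch below assumes 8-byte double precision values and reads N as the number of scalar stochastic variables stored per realization; both are assumptions, since the slide does not state them.

```python
cells = 100 ** 3            # M_G grid cells
realizations = 10_000       # M_B realizations per cell
bytes_per_value = 8         # assumption: double precision

pressure_mb = cells * bytes_per_value / 1e6            # one scalar per cell
velocity_mb = 3 * cells * bytes_per_value / 1e6        # three components
stress_mb = 6 * cells * bytes_per_value / 1e6          # symmetric 3x3 tensor
# One scalar stochastic variable for every realization in every cell:
stochastic_gib_per_scalar = cells * realizations * bytes_per_value / 2 ** 30
```

The last value is about 74.5 GiB, which matches the quoted 75 GB per stored scalar and shows that the stochastic variables dominate the macroscopic fields by four orders of magnitude.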
• Consider our multiscale flow problem in more detail.
• We have the problem parameters:
mesh width, time step size, stochastic realizations, springs
• How can we improve on computational complexity ?
– Instead of MC use QMC
– Multilevel-MC, MLQMC for stochastic ODEs (time + stoch.)
This is just a certain 2d combination technique/
sparse grid approach [Gerstner 12] [Harbrecht,Peters,Siebenmorgen13]
– Combination technique in all 3 discretization parameters
i.e. for space x time x stochastics,
and for model parameter K, i.e. …. x number of springs
– If the optimal combination formula is not a priori known:
run the (dimension)-adaptive algorithm
Sparse grid approach
Coordinates for the combination method
we use only an
isotropic grid in
our NS solver
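Indicators for the combination method
• Approximation of the vector $\mathbf{u}$ and the tensor $\boldsymbol{\tau}_p$
• Compute benefits $b(\mathbf{l})$ and costs $c(\mathbf{l})$ componentwise
• One index set for all components
• Weighted and scaled benefit/cost indicator with weight $\omega \in [0, 1]$:
  $\varepsilon(\mathbf{l}) := \max\Big\{ \omega\, \frac{\|b_{\mathbf{u}}(\mathbf{l})\|_{2,2}}{c(\mathbf{l})\, \|b_{\mathbf{u}}(\mathbf{1})\|_{2,2}},\ (1 - \omega)\, \frac{\|b_{\boldsymbol{\tau}_p}(\mathbf{l})\|_{2,F}}{c(\mathbf{l})\, \|b_{\boldsymbol{\tau}_p}(\mathbf{1})\|_{2,F}} \Big\}$
  Scaling with initial level $b(\mathbf{1})$ not necessary if $\omega = 0$ or $\omega = 1$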
Example 1: Couette flow
• Non-Newtonian fluid in a 2D channel.
  – Fluid is at rest at initial time t = 0,
  – Shearing of fluid over time with rate $du/dy$
  – Linear spring force model (dumbbell, K=1), $De = 0.5$
  – Probability density function $\psi: (x, \mathbf{q}, t) \in \mathbb{R}^4 \mapsto \psi(x, \mathbf{q}, t) \in \mathbb{R}$,
    1d in space, 2d in configuration space and time-dependent
• Discretization:
  – Initial level $(1/\Delta x, 1/\Delta t, \text{samples}) = (4, 16, 256)$
  – Refinement from level to level by factor *2
  – Error indicator with weight $\omega = 1$, we are after error in $\mathbf{u}$
(figure: velocity over space and time)
Example 1 Couette flow
• Behaviour of adaptive combination technique
• We asymptotically observe an anisotropic sparse grid structure
• Comparison:
  – Full grid error: $E(u_{6,6,6}) = 0.04$, $E(u_{7,7,7}) = 0.01$
  – Cost (dof)
    full grid: $C(u_{6,6,6}) = 5.4 \cdot 10^8$, $C(u_{7,7,7}) = 4.3 \cdot 10^9$
    sparse grid: $C(u^{C}) = 4.6 \cdot 10^7$
Example 1 Couette flow
• Relative $L_2$ error of $u_1$
(figure: panels for different time levels)
Example 2: Steady extensional flow
• Non-Newtonian fluid in a 3D domain.
  – Steady uniaxial extensional flow, $\mathbf{u} = (x, -\tfrac{y}{2}, -\tfrac{z}{2})$, $De = 1.0$
  – Stress tensor $\boldsymbol{\tau}_p$ is aimed for
  – FENE force model, K-spring chain
  – We vary the number K of springs up to 5
  – Probability density function $\psi: (\mathbf{q}, t) \in \mathbb{R}^{3K} \times \mathbb{R} \mapsto \psi(\mathbf{q}, t) \in \mathbb{R}$,
    3K-dimensional in configuration, time-dependent, number of
    springs, no space
• Discretization
  – Initial level $(\text{samples}, 1/\Delta t, \text{springs}) = (1024, 2, 1)$
  – Refinement for time and samples from level to level by
    factor *2, refinement for springs by +1
  – Error indicator with weight $\omega = 0$, we are after error in $\boldsymbol{\tau}_p$
Example 2: Steady extensional flow
• Behaviour of adaptive combination technique
• We observe:
  – a sparse grid structure for all indices
  – plus a nearly full grid between time and springs for the
    smallest sample size
  – Different refinement: *2 versus +1
• Relative $L_2$ error for $\tau_{xx}$ of adaptive combination technique
Example 2: Steady extensional flow
• Convergence of model for rising number K of springs
• All results are computed on fine level with 2 million samples.
• Fixed stochastic time step width $\Delta t = 1/2048$
Concluding remarks
• Basic principles of sparse grids
• Optimization by knapsack problem
• Dimension-adaptive combination method
  – Solution of subproblems $P_{\mathbf{l}}$ on levels $\mathbf{l}$
  – Sparse grid approximation by linear combination
  – Refinement with hierarchical contributions and local cost
• Application to non-Newtonian flow
  – Two-scale problem, stochastic microscale
• Adaptive combination method works on discretization
  directions (space x time x samples) and also for
  model parameters (… x springs)
  => Allows to couple discretization and modelling errors
The C library HCFFT G.+Hamaekers
• Hierarchical sparse grid interpolation based on:
- Fast Fourier transform (FFT), fast Sine and Cosine transform
- Fast Chebyshev transform, Fast Legendre transform
- Various other polynomial transforms
• Different hierarchical bases for different dimensions
• Dyadic and arbitrary, non-dyadic refined grids
• Several types of general sparse grids
• Dimension-adaptive sparse grids
• For high precision: possible use of long double
• Freely available at
www.hcfft.org
The flow solver
• Code NAST3DGPF which is freely available at
http://www.nast3dgpf.de/