-
Workshop on “Challenges in HD Analysis and Computation”, San Servolo, 4/5/2016
Adaptive low-rank approximation in hierarchical tensor format using least-squares method
Anthony Nouy
Ecole Centrale Nantes, GeM
Joint work with Mathilde Chevreuil, Loic Giraldi, Prashant Rai
Supported by the ANR (CHORUS project)
-
High-dimensional problems in uncertainty quantification (UQ)
Parameter-dependent models:
$\mathcal{M}(u(X); X) = 0$
where $X$ are random variables.
Solving forward and inverse UQ problems requires the evaluation of the model for many instances of $X$.
In practice, we rely on approximations of the solution map
$x \mapsto u(x)$
which are used as surrogate models.
Complexity issues:
• High-dimensional function $u(x_1, \ldots, x_d)$
• For complex numerical models, only a few evaluations of the function are available.
-
Structured approximation for high-dimensional problems
Specific structures of high-dimensional functions have to be exploited (application dependent).
Low effective dimensionality:
$u(x_1, \ldots, x_d) \approx u_1(x_1)$
Low-order interactions:
$u(x_1, \ldots, x_d) \approx u_1(x_1) + \ldots + u_d(x_d)$
Sparsity (relative to a basis or frame):
$u(x) = \sum_{\alpha \in \mathbb{N}^d} u_\alpha \psi_\alpha(x) \approx \sum_{\alpha \in \Lambda} u_\alpha \psi_\alpha(x)$
...
Low-rank structures
-
Outline
1 Rank-structured approximation
2 Statistical learning methods for tensor approximation
3 Adaptive approximation in tree-based low-rank formats
-
Rank-structured approximation
A multivariate function $u(x_1, \ldots, x_d)$ defined on $\mathcal{X} = \mathcal{X}_1 \times \ldots \times \mathcal{X}_d$ is identified with an element of the tensor space
$V_1 \otimes \ldots \otimes V_d = \mathrm{span}\{ v^1(x_1) \cdots v^d(x_d) \,;\, v^\nu \in V_\nu \}$
where $V_\nu$ is a space of functions defined on $\mathcal{X}_\nu$.
Approximation in a subset of tensors with bounded rank:
$\mathcal{M}_{\le r} = \{ v \in V_1^{n_1} \otimes \ldots \otimes V_d^{n_d} \,;\, \mathrm{rank}(v) \le r \}$
For order-two tensors, a single notion of rank:
$\mathrm{rank}(v) \le r \iff v = \sum_{i=1}^{r} v_i^1(x_1) \, v_i^2(x_2)$
For higher-order tensors, different notions of rank, such as the canonical rank:
$\mathrm{rank}(v) \le r \iff v = \sum_{i=1}^{r} v_i^1(x_1) \cdots v_i^d(x_d)$
Storage complexity: $O(rdn)$
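To make the storage count concrete, here is a minimal Python sketch (function and variable names are ours, not from the talk) evaluating a rank-$r$ canonical representation whose univariate factors are stored as coefficient arrays on bases of dimension $n = q+1$; the $O(rdn)$ storage is visible in the factor shapes:

```python
import numpy as np

def canonical_eval(factors, x, bases):
    """Evaluate v(x) = sum_{i=1}^r v_i^1(x_1) ... v_i^d(x_d) (canonical format).

    factors[nu] : (r, q+1) coefficients of the r univariate functions v_i^nu
    bases[nu]   : callable returning basis values (phi_0(x_nu), ..., phi_q(x_nu))
    """
    r = factors[0].shape[0]
    prod = np.ones(r)
    for F, basis, x_nu in zip(factors, bases, x):
        prod *= F @ basis(x_nu)  # entries v_i^nu(x_nu), i = 1..r
    return prod.sum()
```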
-
Subspace-based low-rank formats
$\alpha$-rank: for $\alpha \subset \{1, \ldots, d\}$, $V = V_\alpha \otimes V_{\alpha^c}$, with $V_\alpha = \bigotimes_{\mu \in \alpha} V_\mu$, and
$\mathrm{rank}_\alpha(v) \le r_\alpha \iff v = \sum_{i=1}^{r_\alpha} v_i^\alpha(x_\alpha) \, v_i^{\alpha^c}(x_{\alpha^c}), \quad v_i^\alpha \in V_\alpha,\ v_i^{\alpha^c} \in V_{\alpha^c}$
$\iff v \in U_\alpha \otimes V_{\alpha^c}$ with $\dim(U_\alpha) = r_\alpha$
(Figure: the partition of $\{1,2,3,4\}$ into $\alpha = \{1,2\}$ and $\alpha^c = \{3,4\}$.)
Storage complexity: $O(r_\alpha (n^{\#\alpha} + n^{\#\alpha^c}))$
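As an illustration, a small NumPy sketch (our naming; modes are 0-based here, whereas the slides number dimensions from 1) computing the $\alpha$-rank of a full tensor as the matrix rank of its $\alpha$-matricization:

```python
import numpy as np

def alpha_rank(A, alpha):
    """Numerical alpha-rank of a full tensor A: rank of its alpha-matricization,
    grouping the modes in alpha as rows and those in alpha^c as columns."""
    alpha = sorted(alpha)
    alpha_c = [nu for nu in range(A.ndim) if nu not in alpha]
    M = np.transpose(A, alpha + alpha_c)
    M = M.reshape(int(np.prod([A.shape[nu] for nu in alpha])), -1)
    return np.linalg.matrix_rank(M)

# example: a rank-one tensor has alpha-rank 1 for every alpha
A = np.einsum('i,j,k,l->ijkl', *(np.random.rand(3) for _ in range(4)))
print(alpha_rank(A, [0, 1]))  # -> 1
```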
-
Subspace-based low-rank formats
Tree-based Tucker format [Hackbusch-Kuhn '09, Oseledets-Tyrtyshnikov '09]:
$\mathrm{rank}_T(v) = (\mathrm{rank}_\alpha(v) : \alpha \in T)$ with $T$ a dimension tree
(Figure: binary dimension tree with root $\{1,2,3,4\}$, interior nodes $\{1,2\}$ and $\{3,4\}$, and leaves $\{1\}, \{2\}, \{3\}, \{4\}$.)
Storage complexity: $O(dnR + dR^{s+1})$ with $R \ge r_\alpha$ and $s$ the arity of the tree
Example (additive model): $u(x_1, \ldots, x_d) = u_1(x_1) + \ldots + u_d(x_d)$ has $\mathrm{rank}_T(u) = (2, 2, 2, \ldots, 2)$ for any tree $T$.
-
Subspace-based low-rank formats
Tensor-train format: a degenerate case where $T$ is a subset of a linearly structured dimension tree
(Figure: linear dimension tree with root $\{1,2,3,4\}$, interior nodes $\{1,2,3\}$ and $\{1,2\}$, and the leaves $\{1\}, \{2\}, \{3\}, \{4\}$.)
$T = \{\{1\}, \{1,2\}, \{1,2,3\}, \ldots, \{1, \ldots, d-1\}\}$
$\mathrm{rank}_T(v) \le r \iff v(x) = \sum_{i_1=1}^{r_{\{1\}}} \sum_{i_2=1}^{r_{\{1,2\}}} \cdots \sum_{i_{d-1}=1}^{r_{\{1,\ldots,d-1\}}} v^1_{1,i_1}(x_1) \, v^2_{i_1,i_2}(x_2) \cdots v^d_{i_{d-1},1}(x_d)$
Storage complexity: $O(dnR^2)$ with $R \ge r_\alpha$
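A minimal evaluation sketch for this format (hypothetical naming), assuming each core stores the coefficients of the functions $v^k_{i,j}$ on a univariate basis; the $O(dnR^2)$ storage appears directly in the core shapes:

```python
import numpy as np

def tt_eval(cores, x, bases):
    """Evaluate a tensor-train representation at a point x:
    v(x) = sum over i_1..i_{d-1} of v^1_{1,i_1}(x_1) ... v^d_{i_{d-1},1}(x_d).

    cores[k] : (left rank, q+1, right rank) coefficient array, rank 1 at both ends,
               so that v^k_{i,j}(x_k) = sum_n cores[k][i, n, j] * phi_n(x_k)
    bases[k] : callable returning basis values (phi_0(x_k), ..., phi_q(x_k))
    """
    v = np.ones((1, 1))
    for core, basis, xk in zip(cores, bases, x):
        Mk = np.einsum('inj,n->ij', core, basis(xk))  # (left rank, right rank) matrix
        v = v @ Mk                                    # running matrix product
    return float(v[0, 0])
```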
-
Subspace-based low-rank formats
Storage and computational complexity scale as $O(d)$.
Good topological and geometrical properties of the manifolds $\mathcal{M}_{\le r}$ of tensors with bounded rank
[Holtz-Rohwedder-Schneider '11, Uschmajew-Vandereycken '13, Falco-Hackbusch-N. '15].
Multilinear parametrization:
$\mathcal{M}_{\le r} = \{ v = F(p_1, \ldots, p_L) \,;\, p_k \in P_k,\ 1 \le k \le L \}$
where $F$ is a multilinear map.
-
Statistical learning methods for tensor approximation
Approximation of a function $u(X) = u(X_1, \ldots, X_d)$ from evaluations $\{y^k = u(x^k)\}_{k=1}^K$ on a training set $\{x^k\}_{k=1}^K$ (i.i.d. samples of $X$).
Approximation in subsets of rank-structured functions
$\mathcal{M}_{\le r} = \{ v : \mathrm{rank}(v) \le r \}$
by minimization of an empirical risk
$\widehat{\mathcal{R}}_K(v) = \frac{1}{K} \sum_{k=1}^K \ell(u(x^k), v(x^k))$
where $\ell$ is a certain loss function.
Here, for least-squares regression, we consider
$\widehat{\mathcal{R}}_K(v) = \frac{1}{K} \sum_{k=1}^K (u(x^k) - v(x^k))^2 = \widehat{\mathbb{E}}_K\big((u(X) - v(X))^2\big),$
but other loss functions could be used for objectives other than $L^2$-approximation (e.g. classification).
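For concreteness, a one-line NumPy version of the empirical least-squares risk (names are ours):

```python
import numpy as np

def empirical_risk(v, X, y):
    """Empirical least-squares risk R_K(v) = (1/K) sum_k (u(x^k) - v(x^k))^2,
    where y[k] = u(x^k) are the available evaluations."""
    return np.mean((y - np.array([v(xk) for xk in X]))**2)
```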
-
Alternating minimization algorithm
Multilinear parametrization of tensor manifolds:
$\mathcal{M}_{\le r} = \{ v = F(p_1, \ldots, p_L) : p_l \in \mathbb{R}^{m_l},\ 1 \le l \le L \}$
so that
$\min_{v \in \mathcal{M}_{\le r}} \widehat{\mathcal{R}}_K(v) = \min_{p_1, \ldots, p_L} \widehat{\mathcal{R}}_K(F(p_1, \ldots, p_L))$
Alternating minimization algorithm: successive minimization problems
$\min_{p_l \in \mathbb{R}^{m_l}} \widehat{\mathcal{R}}_K\big(\underbrace{F(p_1, \ldots, p_l, \ldots, p_L)}_{\Psi_l(\cdot)^T p_l}\big)$
which are standard linear approximation problems:
$\min_{p_l \in \mathbb{R}^{m_l}} \frac{1}{K} \sum_{k=1}^K \ell(u(x^k), \Psi_l(x^k)^T p_l)$
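A deliberately naive, self-contained Python sketch of one such alternating least-squares loop for the TT parametrization used later in the talk (all names are ours; a practical implementation would cache the left and right partial products instead of recomputing them):

```python
import numpy as np

def als_tt_ls(X, y, cores, bases, n_sweeps=5):
    """Alternating least-squares for a TT parametrization (naive sketch).

    X : (K, d) training points x^k;  y : (K,) evaluations u(x^k)
    cores[l] : (left rank, q+1, right rank) coefficient arrays, rank 1 at both ends
    bases[l] : callable returning basis values (phi_0(t), ..., phi_q(t))
    Since v(x) is linear in each core, every inner step is an ordinary
    least-squares problem  min_{p_l} (1/K) sum_k (y_k - Psi_l(x^k)^T p_l)^2.
    """
    K, d = X.shape
    # basis evaluations Phi[l][k, :] = (phi_0(x^k_l), ..., phi_q(x^k_l))
    Phi = [np.array([bases[l](X[k, l]) for k in range(K)]) for l in range(d)]
    for _ in range(n_sweeps):
        for l in range(d):
            A = np.empty((K, cores[l].size))  # rows are the features Psi_l(x^k)
            for k in range(K):
                left = np.ones((1, 1))
                for m in range(l):
                    left = left @ np.einsum('inj,n->ij', cores[m], Phi[m][k])
                right = np.ones((1, 1))
                for m in range(d - 1, l, -1):
                    right = np.einsum('inj,n->ij', cores[m], Phi[m][k]) @ right
                A[k] = np.einsum('i,n,j->inj',
                                 left[0], Phi[l][k], right[:, 0]).ravel()
            p, *_ = np.linalg.lstsq(A, y, rcond=None)
            cores[l] = p.reshape(cores[l].shape)
    return cores
```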
-
Regularization
$\min_{p_1, \ldots, p_L} \widehat{\mathcal{R}}_K(F(p_1, \ldots, p_L)) + \sum_{l=1}^L \lambda_l \Omega_l(p_l)$
with regularization functionals $\Omega_l$ promoting
• sparsity (e.g. $\Omega_l(p_l) = \|p_l\|_1$),
• smoothness,
• ...
The alternating minimization algorithm requires the solution of successive standard regularized linear approximation problems
$\min_{p_l} \frac{1}{K} \sum_{k=1}^K \ell(u(x^k), \Psi_l(x^k)^T p_l) + \lambda_l \Omega_l(p_l) \quad (\star)$
• For square loss and $\Omega_l(p_l) = \|p_l\|_1$, $(\star)$ is a LASSO problem.
Cross-validation methods for the selection of the regularization parameters $\lambda_l$.
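One possible realization of the LASSO step $(\star)$ with scikit-learn (the helper name is ours; `LassoCV` selects the regularization parameter by cross-validation, as on the slide):

```python
from sklearn.linear_model import LassoCV

def lasso_core_update(A, y):
    """Regularized update for one parameter block: square loss + l1 penalty.
    A has rows Psi_l(x^k), assembled as in the ALS sketch above."""
    model = LassoCV(cv=5, fit_intercept=False).fit(A, y)
    return model.coef_, model.alpha_  # coefficients p_l and selected lambda_l
```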
-
Illustrations
Approximation in tensor-train (TT) format:
$v(x_1, \ldots, x_d) = \sum_{i_1=1}^{r_1} \cdots \sum_{i_{d-1}=1}^{r_{d-1}} v^1_{1,i_1}(x_1) \cdots v^d_{i_{d-1},1}(x_d)$
Polynomial approximations $v^k_{i_{k-1}, i_k} \in \mathbb{P}_q$.
$v = F(p_1, \ldots, p_d)$ with parameter $p_k \in \mathbb{R}^{(q+1) r_{k-1} r_k}$ gathering the coefficients of the functions of $x_k$ on a polynomial basis (orthonormal in $L^2_{P_{X_k}}(\mathcal{X}_k)$).
Sparsity-inducing regularization and cross-validation (leave-one-out) for the automatic selection of polynomial basis functions; use of standard least-squares in the selected basis.
-
Illustration: Borehole function
The Borehole function models water flow through a borehole:
$u(X) = \frac{2\pi T_u (H_u - H_l)}{\ln(r/r_w)\left(1 + \frac{2 L T_u}{\ln(r/r_w) \, r_w^2 K_w} + \frac{T_u}{T_l}\right)}, \quad X = (r_w, \log(r), T_u, H_u, T_l, H_l, L, K_w)$

r_w : radius of borehole (m), N(μ = 0.10, σ = 0.0161812)
r : radius of influence (m), LN(μ = 7.71, σ = 1.0056)
T_u : transmissivity of upper aquifer (m²/yr), U(63070, 115600)
H_u : potentiometric head of upper aquifer (m), U(990, 1110)
T_l : transmissivity of lower aquifer (m²/yr), U(63.1, 116)
H_l : potentiometric head of lower aquifer (m), U(700, 820)
L : length of borehole (m), U(1120, 1680)
K_w : hydraulic conductivity of borehole (m/yr), U(9855, 12045)

Polynomial approximation with degree q = 8.
Test set of size 1000.
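A direct Python transcription of the formula (we assume $\log(r)$ denotes the natural logarithm, consistent with the lognormal parametrization of $r$):

```python
import numpy as np

def borehole(rw, log_r, Tu, Hu, Tl, Hl, L, Kw):
    """Water flow through a borehole; direct transcription of the formula above."""
    ln_r_rw = log_r - np.log(rw)  # ln(r / r_w), the second input being log(r)
    return 2 * np.pi * Tu * (Hu - Hl) / (
        ln_r_rw * (1 + 2 * L * Tu / (ln_r_rw * rw**2 * Kw) + Tu / Tl)
    )
```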
-
Illustration: Borehole function
Test error for different ranks and for different sizes K of the training set.

rank               K = 100     K = 1000    K = 10000
(1 1 1 1 1 1 1)    1.7·10⁻²    1.4·10⁻²    1.4·10⁻²
(2 2 2 2 2 2 2)    6.7·10⁻⁴    9.1·10⁻⁴    3.3·10⁻⁴
(3 3 3 3 3 3 3)    3.2·10⁻³    1.2·10⁻⁴    1.0·10⁻⁵
(4 4 4 4 4 4 4)    2.1·10⁻¹    7.6·10⁻⁵    1.9·10⁻⁷
(5 5 5 5 5 5 5)    7.3·10⁰     3.8·10⁻⁴    2.8·10⁻⁷
(6 6 6 6 6 6 6)    7.9·10⁻¹    4.1·10⁻³    2.1·10⁻⁷

Finding the optimal rank is a combinatorial problem...
-
Heuristic strategy for rank adaptation (tree-based Tucker format)
Given $T \subset 2^{\{1,\ldots,d\}}$, construction of a sequence of approximations $u_m$ in tree-based Tucker format with increasing rank:
$u_m \in \{ v : \mathrm{rank}_T(v) \le (r_\alpha^m)_{\alpha \in T} \}$
At iteration $m$,
$r_\alpha^{m+1} = r_\alpha^m + 1$ if $\alpha \in T_m$, and $r_\alpha^{m+1} = r_\alpha^m$ if $\alpha \notin T_m$,
where $T_m$ is selected in order to obtain (hopefully) the fastest decrease of the error.
A possible strategy consists in computing the singular values
$\sigma_1^\alpha \ge \ldots \ge \sigma_{r_\alpha^m}^\alpha$
of the $\alpha$-matricizations $\mathcal{M}_\alpha(u_m)$ of $u_m$ for all $\alpha \in T$,
$\mathcal{M}_\alpha(u_m) = \sum_{i=1}^{r_\alpha^m} \sigma_i^\alpha \, v_i^\alpha \otimes v_i^{\alpha^c} \in V_\alpha \otimes V_{\alpha^c}$
• $\|u_m\|^2 = \sum_{i=1}^{r_\alpha^m} (\sigma_i^\alpha)^2$ for all $\alpha \in T$.
• $\sigma_{r_\alpha^m}^\alpha$ provides an estimate of an upper bound of $\|u - u_m\|_{\vee(V_\alpha \otimes V_{\alpha^c})}$.
• Letting $0 \le \theta \le 1$, we choose
$T_m = \{ \alpha \in T : \sigma_{r_\alpha^m}^\alpha \ge \theta \max_{\beta \in T} \sigma_{r_\beta^m}^\beta \}$
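The selection rule for $T_m$ is a one-liner; a small Python sketch with hypothetical values (names are ours):

```python
def select_ranks_to_increase(sigma_min, theta=0.8):
    """T_m = {alpha in T : sigma^alpha_{r_alpha} >= theta * max_beta sigma^beta_{r_beta}}.

    sigma_min maps each node alpha of T to the smallest singular value of the
    alpha-matricization of the current iterate u_m."""
    threshold = theta * max(sigma_min.values())
    return {alpha for alpha, s in sigma_min.items() if s >= threshold}

# hypothetical singular values for three nodes of a tree
sigma = {frozenset({1}): 0.9, frozenset({2}): 0.05, frozenset({1, 2}): 0.7}
print(select_ranks_to_increase(sigma))  # -> {frozenset({1})}
```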
-
Illustration: Borehole function
Training set of size K = 1000.

iteration   rank               test error
0           (1 1 1 1 1 1 1)    1.4·10⁻²
1           (2 2 2 2 2 2 2)    4.4·10⁻⁴
2           (2 2 2 3 3 2 2)    8.1·10⁻⁶
3           (3 3 3 4 3 2 2)    6.2·10⁻⁶
4           (3 3 3 4 4 3 2)    2.1·10⁻⁵
5           (3 3 3 4 4 3 3)    9.6·10⁻⁶
6           (3 4 4 4 5 4 4)    1.5·10⁻⁵

The selected rank yields a test error one order of magnitude better than the optimal “isotropic” rank (r, r, ..., r).
-
Illustration: Borehole function
Different sizes K of the training set, selection of optimal ranks.

TT format
K        rank               test error
100      (3 4 4 3 3 2 1)    7.1·10⁻⁴
1000     (3 3 3 4 4 3 2)    6.2·10⁻⁶
10000    (5 6 6 7 7 5 4)    2.4·10⁻⁸

Canonical format
K        rank   test error
100      2      1.0·10⁻³
1000     5      3.8·10⁻⁴
10000    7      6.0·10⁻⁶
-
Influence of the tree
Test error for different trees T (training set of size K = 50).
(Figure: linear dimension tree with root $\{\nu_1, \ldots, \nu_d\}$, nodes $\{\nu_2, \ldots, \nu_d\}$, ..., and leaves $\{\nu_1\}, \ldots, \{\nu_{d-1}\}, \{\nu_d\}$.)

tree   {ν1, ..., νd}        optimal rank       test error
T1     (1 2 3 4 5 6 7 8)    (2 2 2 2 2 1 1)    6.2·10⁻⁴
T2     (1 3 8 5 6 2 4 7)    (2 2 2 2 2 2 1)    1.3·10⁻³
T3     (7 6 8 1 4 5 2 3)    (1 1 1 1 1 1 1)    1.5·10⁻²
T4     (8 2 4 7 5 1 3 6)    (1 1 2 3 3 2 2)    1.3·10⁻²

Finding the optimal tree is a combinatorial problem...
-
Strategy for tree adaptation
Starting from an initial tree, we iteratively perform the following two steps:
1. Run the algorithm with rank adaptation to compute an approximation v associated with the current tree:
$v(x_1, \ldots, x_d) = \sum_{i_1=1}^{r_1} \cdots \sum_{i_{d-1}=1}^{r_{d-1}} v^1_{1,i_1}(x_{\nu_1}) \cdots v^d_{i_{d-1},1}(x_{\nu_d})$
2. Run a tree optimization algorithm yielding an equivalent representation of v (at the current precision),
$v(x_1, \ldots, x_d) \approx \sum_{i_1=1}^{\tilde r_1} \cdots \sum_{i_{d-1}=1}^{\tilde r_{d-1}} v^1_{1,i_1}(x_{\tilde\nu_1}) \cdots v^d_{i_{d-1},1}(x_{\tilde\nu_d})$
with reduced ranks $\{\tilde r_1, \ldots, \tilde r_{d-1}\}$, where $\{\tilde\nu_1, \ldots, \tilde\nu_d\}$ is a permutation of $\{\nu_1, \ldots, \nu_d\}$.
-
Strategy for tree adaptation
Illustration with a training set of size K = 50. We run the algorithm for different initial trees. Indicated in blue are the permuted dimensions in the final tree.

tree                {ν1, ..., νd}        optimal rank       test error
initial             (1 2 3 4 5 6 7 8)    (2 2 2 2 2 1 1)    6.2·10⁻⁴
final               (1 2 3 5 4 6 7 8)    (2 2 2 2 2 1 1)    4.5·10⁻⁴
initial             (1 3 8 5 6 2 4 7)    (2 2 2 2 2 2 1)    1.3·10⁻³
final               (1 3 8 5 2 6 4 7)    (2 2 2 2 2 2 1)    5.1·10⁻⁴
initial and final   (7 6 8 1 4 5 2 3)    (1 1 1 1 1 1 1)    1.5·10⁻²
initial             (8 2 4 7 5 1 3 6)    (1 1 2 3 3 2 2)    1.3·10⁻²
intermediate        (8 2 7 5 1 4 3 6)    (1 1 2 2 2 2 2)    1.2·10⁻³
final               (8 2 7 5 1 3 4 6)    (1 1 2 2 2 2 2)    1.3·10⁻³
-
Concluding remarks and ongoing works
• For rank adaptation, possible use of constructive (greedy) algorithms for tree-based Tucker formats.
• Need for robust strategies for tree adaptation.
• “Statistical dimension” of low-rank subsets?
• Adaptive sampling strategies.
• Goal-oriented construction of low-rank approximations.
-
References and announcement
A. Nouy. Low-rank methods for high-dimensional approximation and model order reduction. arXiv:1511.01554.
M. Chevreuil, R. Lebrun, A. Nouy, and P. Rai. A least-squares method for sparse low rank approximation of multivariate functions. SIAM/ASA Journal on Uncertainty Quantification, 3(1):897–921, 2015.
Open post-doc positions at Ecole Centrale Nantes (France): model order reduction for uncertainty quantification, high-dimensional approximation, low-rank tensor approximation, statistical learning.