
ICIAM07, Zuerich, July 17, 2007

Everything should be made as simple as possible, but not simpler.

A. Einstein

Tensor-Product Approximation in Computational Physics: Theory and Numerics

Boris N. Khoromskij

Max-Planck-Institute for Mathematics in the Sciences, Leipzig

References/Acknowledgements B. Khoromskij, Zuerich 17.07.07 2

Related references

1. W. Hackbusch, B.N. Khoromskij and E.E. Tyrtyshnikov: Approximate Iterations for Structured Matrices. Preprint 112, MPI MIS, Leipzig 2005; Numer. Math. submitted.

2. W. Hackbusch and B.N. Khoromskij: Low-Rank Kronecker-Product Approximation to Multi-Dimensional Nonlocal Operators. Parts I/II. Computing, 76 (2006), 177-202/203-225.

3. W. Hackbusch and B.N. Khoromskij: Tensor-Product Approximation to Operators and Functions in High Dimension. Journal of Complexity (2007), doi: 10.1016/j.jco.2007.03.007.

4. B.N. Khoromskij and V. Khoromskaia: Low-Rank Tucker-Type Tensor Approximation to Classical Potentials. Central European J. of Math., 5(3) 2007, 1-28.

5. B.N. Khoromskij (BNK): Multi-linear approximation of higher dimensional convolution in linear cost. MPI MIS, Leipzig 2007 (in progress).

6. S.R. Chinnamsetty, M. Espig, W. Hackbusch, B.N. Khoromskij, H.-J. Flad: Kronecker tensor product approximation in quantum chemistry. Preprint MPI MIS 41-2007 Leipzig; J. of Chemistry (to appear).

7. S.R. Chinnamsetty, H.-J. Flad, V. Khoromskaia and B.N. Khoromskij: Tensor decomposition in fully discrete electronic structure calculations. Preprint MPI MIS 65-2007 Leipzig, (submitted).

8. Further papers: http://personal-homepages.mis.mpg.de/bokh/, http://www.mis.mpg.de

Acknowledgements

W. Hackbusch (Leipzig),

R. Schneider, H.-J. Flad (Kiel),

E. Tyrtyshnikov (Moscow), M. Fedorov (Cambridge)

Toward multi-dimensional modelling

1929, Dirac:

“The fundamental laws necessary for the mathematical treatment of a large part of physics and the whole of chemistry are thus completely known, and the difficulty lies only in the fact that application of these laws leads to equations that are too complex to be solved.”

1998, W. Kohn, J. A. Pople:

Nobel Prize in Chemistry for development of DFT, based on computations via (tensor-product) GTO basis sets.

Nowadays:

Systematic use of tensor-product approximations in multi-dimensional modelling.

Breaking down the complexity

Focus in this talk:

Toward solving equations in $\mathbb{R}^d$ ($d \ge 3$) at sublinear cost, $O(d\, n \log^q n) \ll O(n^d)$.

• Separable approximation to multi-dimensional functions and operators, getting rid of the “curse of dimensionality”.

• Tensor approximation of the Green’s kernel $1/|x - y|$ in $\mathbb{R}^3$ (appears in nearly any field of physics and chemistry).

• Fast convolution in $\mathbb{R}^d$ via multi-linear collocation at “linear” cost $O(d R_1 R_2\, n \log n)$.

• Numerics: $1/|x|$, $\sin(\kappa|x|)/|x|$, $\cos(\kappa|x|)/|x|$, $x \in \mathbb{R}^3$; the Hartree potential $\rho(x) * 1/|x|$, $x \in \mathbb{R}^3$, for simple molecules; energy calculations.

Why low tensor rank approximation?

Advantages:

Tremendous reduction of computational cost, removing $d$ from the exponent: $n^d \to d\, r\, n + r^d$, $r \ll n$.

Special case $d = 2$:

Best $r$-term separable approximation $f(x_1, x_2) \approx \sum_{k=1}^{r} u_k(x_1)\, v_k(x_2)$.

The elegant solution is via the celebrated Schmidt expansion, 1905 (cf. the SVD for the best rank-$r$ approximation to a matrix).
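As a small illustration of the $d = 2$ case, the following sketch samples a bivariate function on a grid and truncates its SVD. The test function $1/(1 + x_1 + x_2)$, the grid size and the rank are arbitrary choices made here, not taken from the talk. By the Eckart-Young theorem the Frobenius error of the truncation equals the norm of the discarded singular values, which the last two lines compute.

```python
import numpy as np

def best_rank_r(matrix, r):
    """Return the rank-r SVD truncation of `matrix` and all singular values."""
    U, s, Vt = np.linalg.svd(matrix, full_matrices=False)
    truncated = (U[:, :r] * s[:r]) @ Vt[:r, :]
    return truncated, s

# Illustrative sample of f(x1, x2) = 1 / (1 + x1 + x2) on a 200 x 200 grid.
x = np.linspace(0.0, 1.0, 200)
A = 1.0 / (1.0 + x[:, None] + x[None, :])

r = 3
A_r, s = best_rank_r(A, r)

# Frobenius error of the truncation and the norm of the discarded singular values.
err = np.linalg.norm(A - A_r)
tail = np.sqrt(np.sum(s[r:] ** 2))
```

The columns of `U[:, :r] * s[:r]` and the rows of `Vt[:r, :]` play the role of the sampled $u_k$ and $v_k$.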

Goal: Extension to $d \ge 3$

(A) Approximation of a multi-variate function $f : \mathbb{R}^d \to \mathbb{R}$, over an “$r$-term” sum of separable functions.

(B) A $d$-th order tensor in $\mathbb{R}^n$ is a function $f : \mathbb{R}^{n_1 \times ... \times n_d} \to \mathbb{R}$.

(C) Elaborate both the efficient analytic methods and fast numerical multi-linear algebra (MLA).

Electronic structure calculations

The Hartree-Fock equation for the $N$-electron system is a system of nonlinear eigenvalue problems in $L^2(\mathbb{R}^3)$,

$F \phi_i(x) = \lambda_i\, \phi_i(x)$, for $i = 1, ..., N/2$.

The nonlinear Fock operator is

$F := -\frac{1}{2}\nabla^2 - \sum_{A=1}^{M} \frac{Z_A}{|x - R_A|} + 2V_H - V_{ex}$,

with the Hartree potential $V_H$ and the exchange potential $V_{ex}$,

$V_H \phi(x) := \int_{\mathbb{R}^3} \frac{\rho(y, y)}{|x - y|}\, dy\ \phi(x)$, $V_{ex} \phi(x) := \int_{\mathbb{R}^3} \frac{\rho(x, y)}{|x - y|}\, \phi(y)\, dy$,

and with the density matrix (electron density if $x = y$),

$\rho(x, y) = 2 \sum_{i=1}^{N/2} \phi_i^*(x)\, \phi_i(y)$, $x, y \in \mathbb{R}^3$.

Applicability in 3D FEM/BEM

Figure 1: Examples of step-type geometries in FEM/BEM applications,

which are well suited for direct application of tensor-product methods.

Separable approximation of functions in $\mathbb{R}^d$

Key ingredient:

Approximation of a multi-variate function $f = f(x_1, ..., x_d) \in H$, $f : \mathbb{R}^d \to \mathbb{R}$, over an “$r$-term” sum of separable functions in $M_{\otimes r} \subset V$, $\dim(M_{\otimes r}) = r^d$, or $M_{\oplus r} \subset V$, $\dim(M_{\oplus r}) = r$, with $V = V_1 \otimes ... \otimes V_d$.

Usual choice: $H = L^2(\mathbb{R}^d)$, $V_\ell = L^2(\mathbb{R})$ ($\ell = 1, ..., d$).

Neither $M_{\otimes r}$ nor $M_{\oplus r}$ is a linear space ⇒ a challenging nonlinear approximation problem of estimating, for $f \in H$,

$\sigma(f, S) := \inf_{s \in S} \|f - s\|$, (1)

with $S = M_{\otimes r}$ or $S = M_{\oplus r}$, or $S \subset M_{\otimes r}$, the subset of symmetric/antisymmetric functions.

Low-rank decomposition of high-order tensors

A $d$-th order tensor on $I^d = I_1 \times ... \times I_d$,

$A := [a_{i_1 ... i_d}] \in \mathbb{R}^{I^d}$, $I_\ell = \{1, ..., n\}$, $\ell = 1, ..., d$, $n \in \mathbb{N}$.

The $\ell_2$ inner product of tensors induces the Frobenius norm

$\langle A, B \rangle := \sum_{(i_1 ... i_d) \in I^d} a_{i_1 ... i_d}\, b_{i_1 ... i_d}$, $\|A\|_F := \sqrt{\langle A, A \rangle}$.

$A \in \mathbb{R}^{I^d}$ has $|I^d| = n^d$ entries.

How to remove $d$ from the exponent?

Key ingredient: decompose by a sum of rank-1 tensors

$V = V^{(1)} \otimes \cdots \otimes V^{(d)}$, $v_{i_1 ... i_d} = v^{(1)}_{i_1} \cdots v^{(d)}_{i_d}$,

with one-dimensional components $V^{(\ell)} = \{v^{(\ell)}_{i_\ell}\} \in \mathbb{R}^n$.

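Rank-$(r_1, ..., r_d)$ Tucker model

Tucker model ($M_{\otimes r}$), with orthonormalised sets $V^{(\ell)}_{k_\ell} \in \mathbb{R}^n$:

$A_{(r)} = \sum_{k_1=1}^{r_1} ... \sum_{k_d=1}^{r_d} b_{k_1 ... k_d}\, V^{(1)}_{k_1} \otimes ... \otimes V^{(d)}_{k_d} \in \mathbb{R}^{I_1 \times ... \times I_d}$. (2)

Core tensor $B = \{b_k\} \in \mathbb{R}^{r_1 \times ... \times r_d}$ is unique up to rotation.

Methods: ALS, De Lathauwer et al. ’00; ALS + special input and initial guess, BNK, VKK ’06

Storage: $r^d + r\, d\, n \ll n^d$, $r = \max_\ell r_\ell \ll n$.

Visualization of the Tucker model for $d = 3$:

[Figure: a third-order tensor $A$ (index sets $I_1$, $I_2$, $I_3$) equals the core $B$ (sizes $r_1$, $r_2$, $r_3$) multiplied along each mode by the factor matrices $V^{(1)}$, $V^{(2)}$, $V^{(3)}$.]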
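The Tucker format (2) and its storage count can be tried out with a few lines of NumPy. The sketch below uses a truncated higher-order SVD, which is a simpler stand-in and not the ALS iteration cited on the slide. The test tensor (samples of $1/|x|$ on $[1, 2]^3$), the grid size and the ranks are assumptions made for illustration.

```python
import numpy as np

def unfold(T, mode):
    """Mode-`mode` matricization of a 3rd-order tensor."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def truncated_hosvd(T, ranks):
    """Orthonormal factors from the unfoldings, core by projection onto them."""
    factors = []
    for mode, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(T, mode), full_matrices=False)
        factors.append(U[:, :r])
    core = np.einsum('ijk,ia,jb,kc->abc', T, *factors)
    return core, factors

def tucker_to_full(core, factors):
    """Evaluate the Tucker sum (2) for d = 3."""
    return np.einsum('abc,ia,jb,kc->ijk', core, *factors)

# Illustrative tensor: 1/|x| sampled on a 32^3 grid in [1, 2]^3 (away from the singularity).
n = 32
x = np.linspace(1.0, 2.0, n)
T = 1.0 / np.sqrt(x[:, None, None] ** 2 + x[None, :, None] ** 2 + x[None, None, :] ** 2)

core, factors = truncated_hosvd(T, (4, 4, 4))
rel_err = np.linalg.norm(T - tucker_to_full(core, factors)) / np.linalg.norm(T)

# Storage r^d + d*r*n versus n^d entries of the full tensor.
storage = core.size + sum(U.size for U in factors)
```

The truncated HOSVD is only quasi-optimal; an ALS sweep started from these factors could improve the result further.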

CANDECOMP/PARAFAC (CP) model

CP model ($M_{\oplus r}$). Approximation by a sum of rank-1 tensors

$A_{(r)} = \sum_{k=1}^{r} b_k\, V^{(1)}_k \otimes \cdots \otimes V^{(d)}_k \approx A$, $b_k \in \mathbb{R}$,

with normalised $V^{(\ell)}_k \in \mathbb{R}^n$. Uniqueness is due to J. Kruskal ’77.

The minimal number $r$ is called the tensor rank of $A_{(r)}$.

Methods: ALS; Newton, Paatero ’97; ALS + ELS, Comon, Rajih ’05

Storage: $r\, d\, n + r$.

Visualization of the CP model for $d = 3$:

[Figure: $A$ is drawn as the sum $b_1 V^{(1)}_1 \otimes V^{(2)}_1 \otimes V^{(3)}_1 + b_2 V^{(1)}_2 \otimes V^{(2)}_2 \otimes V^{(3)}_2 + ... + b_r V^{(1)}_r \otimes V^{(2)}_r \otimes V^{(3)}_r$.]
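To make the CP storage count concrete, here is a minimal sketch that assembles a full tensor from CP weights and normalised factor matrices for arbitrary $d$. The function name `cp_to_full` and the random test data are hypothetical; the sketch only evaluates the CP sum and does not compute a CP approximation.

```python
import numpy as np

def cp_to_full(weights, factors):
    """Evaluate sum_k b_k V_k^(1) x ... x V_k^(d) from factor matrices of shape (n, r)."""
    T = factors[0] * weights                  # shape (n, r)
    for V in factors[1:]:
        T = T[..., None, :] * V               # append one mode, keep the rank axis last
    return T.sum(axis=-1)                     # sum over the r rank-1 terms

rng = np.random.default_rng(0)
d, n, r = 4, 5, 3
weights = rng.standard_normal(r)
factors = []
for _ in range(d):
    V = rng.standard_normal((n, r))
    factors.append(V / np.linalg.norm(V, axis=0))   # normalised columns

A = cp_to_full(weights, factors)

# One entry checked directly against the definition.
entry = sum(weights[k] * factors[0][1, k] * factors[1][2, k]
            * factors[2][3, k] * factors[3][4, k] for k in range(r))

# Storage r*d*n + r versus n^d entries of the full tensor.
storage = sum(V.size for V in factors) + weights.size
```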

Are the Tucker/CP models robust in $\kappa$?

Problem: tensor approximation to the Helmholtz kernels

$f_{1,\kappa}(|x|) := \frac{\sin(\kappa|x|)}{|x|}$; $f_{2,\kappa}(|x|) := \frac{1}{|x|} - \frac{\cos(\kappa|x|)}{|x|} = 2|x|\, f^2_{1,\kappa/2}(|x|)$,

$f_{1,\kappa}(|x - y|)$, $f_{2,\kappa}(|x - y|)$, $|x - y|^{-1}$, $x, y \in \mathbb{R}^d$, $\kappa \in \mathbb{R}$.

Smooth kernels: (BNK ’06-’07) For the Nyström/collocation discretizations, both the Tucker and CP approximations to $f_{1,\kappa}$ and $f_{2,\kappa}$ are proven to have the rank bounds

$r_T \le r_C \le C d\, (|\log \varepsilon| + \kappa)$.

Complexity of the Tucker model:

$\kappa \le n$ ⇒ scales linearly in $N_{vol} = n^d$.

$\kappa \le n^{1/d}$ ⇒ scales linearly in the univariate problem size $n$.
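The identity relating $f_{2,\kappa}$ to $f_{1,\kappa/2}$ follows from the half-angle formula; written out as a short derivation:

```latex
\begin{align*}
1 - \cos(\kappa |x|) &= 2 \sin^2\!\left(\tfrac{\kappa}{2} |x|\right), \\
f_{2,\kappa}(|x|) = \frac{1 - \cos(\kappa |x|)}{|x|}
  &= 2 |x| \left( \frac{\sin\!\left(\tfrac{\kappa}{2} |x|\right)}{|x|} \right)^{2}
   = 2 |x| \, f_{1,\kappa/2}^{2}(|x|).
\end{align*}
```

In particular the two $1/|x|$ singularities in $f_{2,\kappa}$ cancel, so the kernel stays bounded near the origin.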

$f_{1,\kappa}(|x|)$, $f_{2,\kappa}(|x|)$, $d = 3$, $|x| \le \pi$, $\kappa = 1, 15$

[Figure: plots of $f_{1,\kappa}$ and $f_{2,\kappa}$ over an 80-point grid for $\kappa = 1$ and $\kappa = 15$.]

Separation of shift-invariant kernels

$1/|x|$ appears in nearly any field of physics and chemistry.

Approximation result (Hackbusch, BNK ’05-’07; BNK ’06-’07): $1/|x|$ is proven to have a low-rank separable approximation by a collocation/Galerkin scheme with

$r_T \le r_C = O(\log^2 \varepsilon)$ or $r_T \le r_C = O(|\log \varepsilon|\, \log n)$.

Recipe: optimised sinc-quadratures applied to the generalised Laplace transform of $G(\rho)$, $\rho(y) = y_1^2 + ... + y_d^2$,

$G(\rho) = \int_{\mathbb{R}_+} G(\tau)\, e^{-\rho F(\tau)}\, d\tau \approx \sum_{k=0}^{r} w_k\, G(\tau_k)\, e^{-\rho F(\tau_k)}$ ⇒ Nyström,

$(G(\rho), \phi_i) = \int_{\mathbb{R}_+} G(\tau) \left( e^{-\rho(\cdot) F(\tau)}, \phi_i(\cdot) \right) d\tau$ ⇒ collocation.

Other approaches: Beylkin, Mohlenkamp ’02; Beylkin, Monzon ’05

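CP approximation of the Newton potential

1. Approximate the Gaussian transform by the $r$-term sinc-quadrature

$\frac{1}{\sqrt{\rho}} = \frac{2}{\sqrt{\pi}} \int_{\mathbb{R}_+} e^{-\rho \tau^2}\, d\tau \equiv \frac{4}{\sqrt{\pi}} \int_{\mathbb{R}} \cosh(t)\, g(\sinh(t))\, dt$, $g(z) = e^{z} e^{-\rho e^{2z}}$,

$\left( \frac{1}{\sqrt{\rho(\cdot)}}, \phi_i(\cdot) \right) \approx \sum_{k=0}^{r} w_k f(t_k)$, $f(z) = \frac{4}{\sqrt{\pi}}\, e^{z} \left( e^{-\rho(\cdot) e^{2z}}, \phi_i(\cdot) \right)$.

2. $t_k = \sinh(k h_r)$, $w_k = 2 \alpha_k h_r \cosh(k h_r)$, with $\alpha_k = 2$ if $k > 0$ and $\alpha_k = 1$ for $k = 0$; $h_r = C_0 \log(r)/r$, optimizing via $C_0$.

$\left( e^{-\rho(\cdot) \tau^2}, \phi_i(\cdot) \right) = \prod_{\ell=1}^{d} \Psi_{i_\ell}(\tau)$, where $\phi_i$ are piecewise constant tensor-product basis functions,

$\Psi_i(\tau) = \frac{\pi^{(d-1)/(2d)}}{2\tau} \left[ \operatorname{erf}(\tau i h) - \operatorname{erf}(\tau (i - 1) h) \right]$, $h = A/n$,

$\operatorname{erf}(t) = \frac{2}{\sqrt{\pi}} \int_0^t e^{-\tau^2}\, d\tau$.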
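The mechanism behind the CP approximation of $1/|x|$ can be sketched numerically. The code below is a simplified variant under stated assumptions: it uses the plain substitution $\tau = e^{t}$ with an equispaced (sinc/trapezoidal) rule instead of the sinh-based nodes and optimised step size above, and it evaluates the kernel pointwise on a grid that excludes the origin instead of using collocation with piecewise constant basis functions. Step size, node range and grid are illustrative choices.

```python
import numpy as np

def inv_sqrt_quadrature(h=0.2, k_min=-150, k_max=25):
    """Nodes tau_k and weights w_k with 1/sqrt(rho) ~ sum_k w_k exp(-rho tau_k^2).

    Derived from 1/sqrt(rho) = (2/sqrt(pi)) * int_R exp(t) exp(-rho exp(2t)) dt
    by the equispaced rule t_k = k*h (assumed parameters, not the slide's)."""
    t = h * np.arange(k_min, k_max + 1)
    tau = np.exp(t)
    weights = (2.0 / np.sqrt(np.pi)) * h * tau
    return tau, weights

tau, w = inv_sqrt_quadrature()

# Each Gaussian exp(-(x1^2 + x2^2 + x3^2) tau^2) factorises, so one (n, R)
# factor matrix per direction gives a CP representation with R terms.
x = np.linspace(0.5, 5.0, 20)
U = np.exp(-np.outer(x ** 2, tau ** 2))

approx = np.einsum('k,ik,jk,lk->ijl', w, U, U, U)
X1, X2, X3 = np.meshgrid(x, x, x, indexing='ij')
exact = 1.0 / np.sqrt(X1 ** 2 + X2 ** 2 + X3 ** 2)
rel_err = np.max(np.abs(approx - exact) / exact)
```

This unoptimised rule needs far more terms than the ranks reported in the talk; the point is only that every quadrature node contributes one separable (rank-1) term.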

Convolution Transform via Collocation

Goal: Fast convolution transform in $\mathbb{R}^d$,

$w(x) := (f * g)(x) := \int_{\mathbb{R}^d} f(y)\, g(x - y)\, dy$, $f, g \in L^1(\mathbb{R}^d)$.

Application to the Hartree and exchange potentials

$V_H(x) = \int_{\mathbb{R}^3} \frac{\rho(y, y)}{|x - y|}\, dy$, $V_{ex} \phi(x) := \int_{\mathbb{R}^3} \frac{\rho(x, y)}{|x - y|}\, \phi(y)\, dy$, $x \in \mathbb{R}^3$.

Method: Tensor-product collocation + Richardson (BNK ’07).

Physical prerequisites:

(a) Compute $f * g$ in some fixed box $\Omega = [-A, A]^d$.

(b) $f$ has support in $\Omega' \subset\subset \Omega$.

(c) $f$ allows an $R$-term separable representation with moderate $R$.

Other approaches: Goedecker, Beylkin et al. ’06; Hackbusch ’07

Convolution Transform via Collocation

Tensor-product grid and basis:

Let $\omega_{\mathbf{d}} := \omega_1 \times ... \times \omega_d \subset \Omega$ be the equidistant tensor grid of collocation points $x_m$ with step-size $h = 2A/n$, s.t.

$\omega_\ell := \{-A + (m_\ell - 1) h : m_\ell = 1, ..., n + 1\}$ ($\ell = 1, ..., d$), $m \in M := \{1, ..., n + 1\}^d$.

For given piecewise constant basis functions $\{\phi_i\}_{i \in I}$ associated with $\omega_{\mathbf{d}}$, let

$f(y) \approx \sum_{i \in I} f_i\, \phi_i(y)$, $f_i = f(P_i)$, $I := \{1, ..., n\}^d$.

The discrete collocation scheme (cost $O(n^{2d})$):

$f * g \approx \{w_m\}_{m \in M}$, $w_m := \sum_{i \in I} f_i \int_{\mathbb{R}^d} \phi_i(y)\, g(x_m - y)\, dy$, $x_m \in \omega_{\mathbf{d}}$.

d-dimensional discrete convolution

Define the $d$-th order tensors $F = \{f_i\}$, $G = \{g_i\} \in \mathbb{R}^I$ with the collocation coefficients

$g_i = \int_{\mathbb{R}^d} \phi_i(y)\, g(-y)\, dy$, $i \in I$.

Compute the discrete convolution in $\mathbb{R}^d$ (cost $O(n^d \log^q n)$ via FFT)

$F * G := \{z_j\}$, $z_j := \sum_i f_i\, g_{j-i+1}$, $j \in J := \{1, ..., 2n - 1\}^d$,

where the sum is over all $i \in I$ which lead to legal subscripts for $f_i$ and $g_{j-i+1}$.

Final step: $\{w_m\}$, $m \in M$, is obtained by copying the appropriate part of $\{z_j\}$ (centred at $j = n$),

$w_m = z_j|_{j = n/2 + m}$, $m \in M$.
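The discrete convolution $z_j = \sum_i f_i g_{j-i+1}$ and its FFT evaluation can be compared directly on a small example. The sketch below works with 0-based indices (so the sum reads $z_j = \sum_i f_i g_{j-i}$), uses random data in place of collocation coefficients, and takes $d = 2$ for the direct loop; these are illustrative assumptions.

```python
import numpy as np

def direct_convolution(F, G):
    """Full discrete convolution of two 2D arrays by explicit summation."""
    n1, n2 = F.shape
    m1, m2 = G.shape
    Z = np.zeros((n1 + m1 - 1, n2 + m2 - 1))
    for i in range(n1):
        for j in range(n2):
            # f_i contributes f_i * g_{j-i} to every admissible z_j.
            Z[i:i + m1, j:j + m2] += F[i, j] * G
    return Z

def fft_convolution(F, G):
    """Same result via zero-padded real FFTs (works for any dimension d)."""
    shape = tuple(a + b - 1 for a, b in zip(F.shape, G.shape))
    spectrum = np.fft.rfftn(F, s=shape) * np.fft.rfftn(G, s=shape)
    return np.fft.irfftn(spectrum, s=shape)

rng = np.random.default_rng(0)
F = rng.standard_normal((8, 8))
G = rng.standard_normal((8, 8))

Z_direct = direct_convolution(F, G)
Z_fft = fft_convolution(F, G)
max_diff = np.max(np.abs(Z_direct - Z_fft))
```

Zero-padding to length $2n - 1$ per direction is what turns the circular convolution computed by the FFT into the linear one defined above.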

Error analysis

Lem. 1 (super-convergence, BNK ’07). Let $f \in C^2(\Omega)$ and let $g \in L^1(\Omega)$. Assume that there exist $\mu \ge 1$ and $\beta > 0$, s.t.

$|\mathcal{F}(g)| \le C/|\kappa|^{\mu}$ as $|\kappa| \to \infty$, $\kappa \in \mathbb{R}^d$,

$|\nabla_y g(x - y)| \le C/|x - y|^{\beta}$ for $x, y \in \Omega$, $x \ne y$.

Then there is a constant $C > 0$ independent of $h$, s.t.

$|w(x_m) - w_m| \le C h^2$, $m \in M$.

Illustration.

The fundamental solution of the Laplacian in $\mathbb{R}^d$, $d \ge 3$, is given by $G(x) = c(d)/|x|^{d-2}$, while $\mathcal{F}(G) = C/|\kappa|^2$. Hence Lem. 1 applies with $\beta = d - 1$, $\mu = 2$.

Richardson extrapolation

Lem. 2 (Richardson extrapolation (RE), BNK ’07).

Under the assumptions of Lem. 1, let $f \in C^3(\Omega)$. Then there exists a function $c_1 \in C(\Omega)$ which is independent of $h$, s.t. for $m \in M$ we have

$w(x_m) = w_m + c_1(x_m) h^2 + \eta_{m,h}$ with $|\eta_{m,h}| \le C h^3$.

Higher order, $O(h^3)$, approximation without extra cost! Define the extrapolated values

$\widehat{w}_m = (4 w^{h/2}_m - w^{h}_m)/3$, $m \in M_h$,

then

$|w(x_m) - \widehat{w}_m| \le C h^3$, $m \in M_h$.

RE applies to functionals of $w(x_m)$, say to the Coulomb matrix.
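The extrapolation formula $(4 w^{h/2} - w^{h})/3$ works for any quantity with a leading $c_1 h^2$ error term. As a stand-in for the collocation scheme, the sketch below applies it to the midpoint rule for $\int_0^1 e^x\, dx$, whose error also starts with an $h^2$ term; the integrand and the grid sizes are arbitrary choices. For this particular rule the remainder is $O(h^4)$, which is better than the $O(h^3)$ guaranteed by Lem. 2.

```python
import math

def midpoint_rule(func, a, b, n):
    """Composite midpoint rule with n cells; error expansion starts with c1 * h^2."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

exact = math.e - 1.0
coarse = midpoint_rule(math.exp, 0.0, 1.0, 16)   # step h
fine = midpoint_rule(math.exp, 0.0, 1.0, 32)     # step h/2

# Richardson extrapolation cancels the h^2 term of the error.
extrapolated = (4.0 * fine - coarse) / 3.0

err_coarse = abs(coarse - exact)
err_fine = abs(fine - exact)
err_rich = abs(extrapolated - exact)
```

Halving $h$ reduces the plain error by a factor close to 4, while the extrapolated value is more accurate by several orders of magnitude at no extra evaluation cost.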

Tensor-product Convolution Venera Khoromskaia, Leipzig 11.04.07 21

Tensorization of convolution: Let $G \in \mathcal{C}_{R_G}$, $F \in \mathcal{C}_{R_F}$, then

$F * G = \sum_{k=1}^{R_G} \sum_{m=1}^{R_F} b_k c_m \left( U^{(1)}_k * V^{(1)}_m \right) \otimes ... \otimes \left( U^{(d)}_k * V^{(d)}_m \right)$.

The 1D convolution on an equidistant grid, $U^{(\ell)}_k * V^{(\ell)}_m \in \mathbb{R}^{2n-1}$, can be computed by FFT in $O(n \log n)$ operations,

$\left( U^{(\ell)}_k * V^{(\ell)}_m \right)_j = \sum_{p=1}^{n} u_p\, v_{j-p+1}$, $j = 1, ..., 2n - 1$.

This leads to the linear (in $n$) complexity

$N_{C*C} = O(d R_G R_F\, n \log n) \ll n^d \log n$.

Example. $n = 200$, $d = 3$: naive collocation $n^6 \approx 6.4 \cdot 10^{13}$; FFT $n^3 \log n \sim 10^7$; CP-CP tensor format $3 R_G R_F \cdot 200 \cdot 7$.
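The tensorised convolution can be checked against a full $d$-dimensional FFT convolution on a small example. In this sketch the function names `cp_convolve` and `cp_to_full`, the random factors and the sizes are hypothetical; $d = 3$ is fixed only in the helper that assembles full tensors for the comparison.

```python
import numpy as np

def cp_to_full(weights, factors):
    """Assemble a 3rd-order tensor from CP weights and (n, R) factor matrices."""
    return np.einsum('k,ik,jk,lk->ijl', weights, *factors)

def cp_convolve(b, U, c, V):
    """CP factors of F * G from CP factors of G (b, U) and F (c, V).

    Only 1D FFT convolutions of factor columns are computed, one for each
    pair (k, m) and each direction; the result has rank R_G * R_F."""
    weights = np.outer(b, c).ravel()
    factors = []
    for U_l, V_l in zip(U, V):
        length = U_l.shape[0] + V_l.shape[0] - 1
        U_hat = np.fft.rfft(U_l, length, axis=0)
        V_hat = np.fft.rfft(V_l, length, axis=0)
        pair_hat = U_hat[:, :, None] * V_hat[:, None, :]    # all (k, m) pairs
        pairs = np.fft.irfft(pair_hat, length, axis=0)
        factors.append(pairs.reshape(length, -1))
    return weights, factors

rng = np.random.default_rng(1)
n, d, R_G, R_F = 6, 3, 2, 3
b, c = rng.standard_normal(R_G), rng.standard_normal(R_F)
U = [rng.standard_normal((n, R_G)) for _ in range(d)]
V = [rng.standard_normal((n, R_F)) for _ in range(d)]

weights, factors = cp_convolve(b, U, c, V)

# Reference: full 3D FFT convolution of the assembled tensors.
shape = (2 * n - 1,) * d
reference = np.fft.irfftn(
    np.fft.rfftn(cp_to_full(b, U), s=shape) * np.fft.rfftn(cp_to_full(c, V), s=shape),
    s=shape)
max_diff = np.max(np.abs(cp_to_full(weights, factors) - reference))
```

The product rank $R_G R_F$ grows with every convolution, so in practice a rank-reduction step would follow; that step is not shown here.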

I: Orthog. rank-$r$ approx. to $1/|x|$, $d = 3$, $|x| \le 10$.

Numerics I-III.

[Figure panels: (i) error versus Tucker rank (2 to 12) on a logarithmic scale from $10^{0}$ down to $10^{-10}$, curves EFN, EFE, EC, titled “Newton, AR=10, n = 64”; (ii) canonical components versus grid points, L=3, r=6, titled “Newton, AR=10, n = 64”.]

Figure 2: Convergence history in $r$, Newton’s potential on $n \times n \times n$ grid.

II: Rank-$r$ approx. to $f_{1,\kappa}(|x|)$, $f_{2,\kappa}(|x|)$, $d = 3$, $|x| \le \pi$

[Figure panels: Tucker rank versus $\kappa$ for $f_1(|x|)$ on $[0, \pi]^3$ and for $f_2(|x|)$ on $[0, \pi]^3$, each with curves for $\varepsilon = 10^{-3}$ and $\varepsilon = 10^{-4}$.]

Figure 3: Convergence history in $\kappa \in [1, 15]$ for $f_{1,\kappa}$, $f_{2,\kappa}$, indicating $r(\kappa) \sim |\log \varepsilon| + \kappa$.

III: Hartree potential for CH4

1. Approximate the collocation coefficients tensor $G$ for $1/|x|$ in the CP format with $r \in [10, 20]$.

2. Given the pseudo-density for the CH4 molecule in a GTO basis,

$\rho(x, x) := \sum_{i=1}^{4} \left( \sum_{k=1}^{R_0} P_{k,i}(x)\, e^{\lambda_{k,i} (x - x_{k,i})^2} \right)^2$,

$R_0 = 50$, $A = 10.6$ (au), polynomial degree$(P_{k,i}) \le 2$, compute (fast TP-collocation + Richardson extrapolation)

$V_H(x) = \int_{\mathbb{R}^3} \frac{\rho(y, y)}{|x - y|}\, dy$, $x \in \Omega = [-A, A]^3$,

on the uniform $n \times n \times n$ grids with $n = 112, 224, 448$.

3. Calculate the total energy using the Coulomb matrix for $V_H$,

$J_{a,b} = \int g_a(x)\, g_b(x)\, V_H(x)\, dx$, $a, b = 1, ..., R_0$.

III: Fast tensor convolution vs. FFT

How fast? (time in sec.; ∗ - out of memory)

n^3          32^3    64^3    128^3   256^3   512^3
FFT-based    0.12    0.82    4.07    49.44   ∼500.0 (∗)
ConvCC       0.51    0.73    1.26    3.05    10.01

How accurate?

n            112      224              448              (400, 800)
ε_n          0.1133   0.0286           7.194 · 10^-3    -
ε_{n,Rich}   -        1.0752 · 10^-4   3.527 · 10^-5    2.948 · 10^-5

ε_n: error in the energy

ε_{n,Rich}: the RE error in the energy
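The energy errors $\varepsilon_n$ in the table can be checked against the $O(h^2)$ behaviour of Lem. 1: each doubling of $n$ halves $h$, so the error should drop by a factor close to 4. A minimal sketch using only the three tabulated values:

```python
import math

# Energy errors from the table above, keyed by the univariate grid size n.
errors = {112: 0.1133, 224: 0.0286, 448: 7.194e-3}

# Observed convergence order between consecutive grids: log2(eps_n / eps_2n).
sizes = sorted(errors)
orders = [math.log2(errors[a] / errors[b]) for a, b in zip(sizes, sizes[1:])]
```

Both observed orders come out just below 2, consistent with second-order convergence in $h$ before extrapolation.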

III: Effect of the Richardson extrapolation

[Figure panels: absolute Richardson error for $V_H$ for the grid pairs 112−224 and 224−448 (scale $10^{-5}$, marked data point X: 5.551e−16, Y: 5.531e−06); absolute error for $V_H$ in [−10.6, 10.6] au for n = 112, 224, 448 (scale $10^{-3}$). Horizontal axes in atomic units, from −6 to 6.]

Figure 4: Extrapolated error in the Hartree potential for CH4.

III: Error in the Coulomb matrix

[Figure: surface plot titled “Coulomb matrix for CH4”.]

Figure 5: Coulomb matrix $J_{ab}$ for CH4 (left) and absolute extrapolated error on the uniform $n \times n \times n$ grids, $n = 224, 448$.

Summary

Benefits:

→ Model reduction. Basic MLA can be performed using one-dimensional operations (linear scaling in $d$),

→ Gainful combination of analytic and algebraic methods.

Tensor approximation to Helmholtz/Newton kernels:

– At least linear complexity in $N_{vol} = n^3$.

– Linear cost in $N_{BEM} = n^2$, in the frequency domain $\kappa \le n^{2/3}$.

Fast convolution in $\mathbb{R}^d$:

Cost $O(d R_G R_F\, n \log n) \ll n^d \log n$.

Efficient fully discrete low tensor-rank approximation to the Hartree potential and to the “Hartree” energy.

Goals in comp. phys.: Solving equations in tensor format.
