Total least squares - Hong Kong Baptist University · C.C. Paige and Z. Strakosˇ , Unifying least...


Total least squares

Gerard MEURANT

October, 2008


1 Introduction to total least squares

2 Approximation of the TLS secular equation

3 Numerical experiments


Introduction to total least squares

In least squares (LS) only the right-hand side is perturbed, whereas total least squares (TLS) considers perturbations of both the vector of observations $c$ and the $m \times n$ data matrix $A$:

\[ \min_{E,\, r} \left\| \begin{pmatrix} E & r \end{pmatrix} \right\|_F \quad \text{subject to the constraint} \quad (A + E)x = c + r. \]

This amounts to finding the smallest perturbations $E$ and $r$ such that $c + r$ is in the range of $A + E$.

See Golub and Van Loan; Van Huffel and Vandewalle; Paige and Strakoš.


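Theorem (Golub and Van Loan)

Let $C = \begin{pmatrix} A & c \end{pmatrix}$ and let $U^T C V = \Sigma$ be its SVD.

Assume that the singular values of $C$ are such that

\[ \sigma_1 \ge \cdots \ge \sigma_k > \sigma_{k+1} = \cdots = \sigma_{n+1}. \]

Then the solution of the TLS problem is given by

\[ \min \left\| \begin{pmatrix} E & r \end{pmatrix} \right\|_F = \sigma_{n+1} \quad \text{and} \quad x_{TLS} = -\frac{y}{\alpha}, \]

where the vector $\begin{pmatrix} y & \alpha \end{pmatrix}^T$ of norm 1 with $\alpha \ne 0$ is in the subspace $S_k$ spanned by the right singular vectors $\{v_{k+1}, \ldots, v_{n+1}\}$ of $V$. If there is no such vector with $\alpha \ne 0$, the TLS problem has no solution.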
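For a single-column matrix ($n = 1$) the theorem can be checked by hand, because $C^T C$ is then a $2 \times 2$ matrix whose smallest eigenpair has a closed form. The following is a minimal sketch under that assumption; the function name `tls_one_column` is illustrative and does not come from the slides.

```python
import math

def tls_one_column(a, c):
    """TLS solution of a*x ~ c when A has a single column (n = 1).

    Forms the 2x2 matrix C^T C for C = (a c), takes an eigenvector
    (y, alpha) of its smallest eigenvalue sigma_{n+1}^2 and returns
    (x_tls, sigma_{n+1}) with x_tls = -y / alpha.
    Sketch only: the degenerate case alpha = 0 simply raises.
    """
    saa = sum(ai * ai for ai in a)              # a^T a
    sac = sum(ai * ci for ai, ci in zip(a, c))  # a^T c
    scc = sum(ci * ci for ci in c)              # c^T c
    # Smallest eigenvalue of [[saa, sac], [sac, scc]] in closed form.
    lam = 0.5 * (saa + scc) - math.hypot(0.5 * (saa - scc), sac)
    # First row of (C^T C - lam I)(y, alpha)^T = 0 gives this eigenvector.
    y, alpha = sac, lam - saa
    if alpha == 0.0:
        raise ValueError("no TLS solution: alpha = 0")
    return -y / alpha, math.sqrt(max(lam, 0.0))
```

Calling `tls_one_column([1.0, 2.0, 3.0], [1.1, 1.9, 3.2])` returns the slope of the line through the origin that minimizes the orthogonal distances to the three points, together with the size of the smallest perturbation.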


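The right singular vectors $v_i$ are the eigenvectors of $C^T C$, and $S_k$ is the invariant subspace associated with the smallest eigenvalue $\sigma_{n+1}^2$.

The TLS solution $x_{TLS}$ solves the eigenvalue problem

\[ C^T C \begin{pmatrix} x \\ -1 \end{pmatrix} = \sigma_{n+1}^2 \begin{pmatrix} x \\ -1 \end{pmatrix}. \]

Theorem

If $\sigma_n^A > \sigma_{n+1}$, where $\sigma_i^A$ denotes the singular values of $A$, then $x_{TLS}$ exists and is the unique solution of the TLS problem,

\[ x_{TLS} = (A^T A - \sigma_{n+1}^2 I)^{-1} A^T c. \]

Moreover, $\sigma_{n+1}$ satisfies the following secular equation

\[ \sigma_{n+1}^2 \left[ 1 + \sum_{i=1}^{n} \frac{d_i^2}{(\sigma_i^A)^2 - \sigma_{n+1}^2} \right] = \rho_{LS}^2, \]

where the vector $d = U^T c$, with $U$ here the matrix of left singular vectors of $A$, and $\rho_{LS}^2 = \| c - A x_{LS} \|^2$.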


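The secular equation can also be written as

\[ \sigma_{n+1}^2 = c^T c - c^T A (A^T A - \sigma_{n+1}^2 I)^{-1} A^T c. \]

This is obtained by writing

\[ \begin{pmatrix} A^T A & A^T c \\ c^T A & c^T c \end{pmatrix} \begin{pmatrix} x \\ -1 \end{pmatrix} = \sigma_{n+1}^2 \begin{pmatrix} x \\ -1 \end{pmatrix} \]

and eliminating $x$.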
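The elimination can be spelled out block row by block row. The second part of the sketch below expands the result in the SVD of $A$ to recover the sum form of the secular equation; it assumes $A = U \Sigma_A V^T$ has full column rank, so that $\rho_{LS}^2 = \sum_{i>n} d_i^2$ with $d = U^T c$.

```latex
% First and second block rows of the eigenvalue problem:
\begin{align*}
A^T A x - A^T c &= \sigma_{n+1}^2 x
  &&\Rightarrow\quad x = (A^T A - \sigma_{n+1}^2 I)^{-1} A^T c, \\
c^T A x - c^T c &= -\sigma_{n+1}^2
  &&\Rightarrow\quad \sigma_{n+1}^2 = c^T c - c^T A (A^T A - \sigma_{n+1}^2 I)^{-1} A^T c .
\end{align*}
% Expansion in the SVD of A, with d = U^T c:
\begin{align*}
c^T c &= \sum_{i=1}^{m} d_i^2, \qquad
c^T A (A^T A - \sigma_{n+1}^2 I)^{-1} A^T c
  = \sum_{i=1}^{n} \frac{(\sigma_i^A)^2 d_i^2}{(\sigma_i^A)^2 - \sigma_{n+1}^2}, \\
\sigma_{n+1}^2 &= \sum_{i=n+1}^{m} d_i^2
  + \sum_{i=1}^{n} d_i^2 \left( 1 - \frac{(\sigma_i^A)^2}{(\sigma_i^A)^2 - \sigma_{n+1}^2} \right)
  = \rho_{LS}^2 - \sigma_{n+1}^2 \sum_{i=1}^{n} \frac{d_i^2}{(\sigma_i^A)^2 - \sigma_{n+1}^2} .
\end{align*}
```

Moving the sum to the left-hand side gives $\sigma_{n+1}^2 \bigl[ 1 + \sum_{i=1}^{n} d_i^2 / ((\sigma_i^A)^2 - \sigma_{n+1}^2) \bigr] = \rho_{LS}^2$.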

For data least squares (DLS), when only the matrix is perturbed, the secular equation is

\[ c^T c - c^T A (A^T A - \sigma^2 I)^{-1} A^T c = 0. \]

This can also be written as

\[ c^T (A A^T - \sigma^2 I)^{-1} c = 0. \]


[Figure: example of the TLS secular function as a function of $\sigma^2$; horizontal axis from −5 to 25, vertical axis from −10 to 10.]


Approximation of the TLS secular equation

We approximate the quadratic form in the TLS secular equation by using one of the Golub-Kahan bidiagonalization algorithms with $c$ as a starting vector.

It reduces $A$ to lower bidiagonal form and generates a matrix

\[ C_k = \begin{pmatrix} \gamma_1 & & \\ \delta_1 & \ddots & \\ & \ddots & \gamma_k \\ & & \delta_k \end{pmatrix}, \]

a $(k+1) \times k$ matrix such that $C_k^T C_k = J_k$, the tridiagonal matrix generated by the Lanczos algorithm for the matrix $A^T A$.
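The recurrence behind this reduction can be sketched in a few lines. The code below is a plain-Python illustration of Golub-Kahan lower bidiagonalization started from $c$, with $A$ stored as a list of rows; the helper names are hypothetical, there is no reorthogonalization and no breakdown handling, and it is not claimed to be the implementation used for the experiments.

```python
import math

def matvec(A, x):
    """A x for a dense matrix stored as a list of rows."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in A]

def rmatvec(A, y):
    """A^T y for a dense matrix stored as a list of rows."""
    return [sum(A[i][j] * y[i] for i in range(len(A))) for j in range(len(A[0]))]

def norm(x):
    return math.sqrt(sum(xi * xi for xi in x))

def golub_kahan_lower(A, c, k):
    """k steps of Golub-Kahan bidiagonalization with starting vector c.

    Returns (gamma, delta, norm_c): gamma[i] is the diagonal and delta[i]
    the subdiagonal of the (k+1) x k lower bidiagonal matrix C_k.
    """
    norm_c = norm(c)
    u = [ci / norm_c for ci in c]
    w = rmatvec(A, u)
    g = norm(w)
    v = [wi / g for wi in w]
    gamma, delta = [], []
    for step in range(k):
        gamma.append(g)
        # delta_i u_{i+1} = A v_i - gamma_i u_i
        w = [ai - g * ui for ai, ui in zip(matvec(A, v), u)]
        d = norm(w)
        delta.append(d)
        if step == k - 1:
            break
        u = [wi / d for wi in w]
        # gamma_{i+1} v_{i+1} = A^T u_{i+1} - delta_i v_i
        w = [ai - d * vi for ai, vi in zip(rmatvec(A, u), v)]
        g = norm(w)
        v = [wi / g for wi in w]
    return gamma, delta, norm_c
```

Only products with $A$ and $A^T$ are needed, which is what makes the approach usable when $A$ itself cannot be stored.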


At iteration $k$ we approximate the TLS secular equation by

\[ c^T c - \|c\|^2 (e^1)^T C_k (C_k^T C_k - \sigma^2 I)^{-1} C_k^T e^1 = \sigma^2. \]

This corresponds to the Gauss quadrature rule.

We use the SVD of $C_k = U_k S_k V_k^T$. Let $\sigma_i^{(k)}$ be the singular values of $C_k$ and $\xi^{(k)} = U_k^T e^1$. Then

\[ \frac{(\xi_{k+1}^{(k)})^2}{\sigma^2} - \sum_{i=1}^{k} \frac{(\xi_i^{(k)})^2}{(\sigma_i^{(k)})^2 - \sigma^2} = \frac{1}{\|c\|^2}. \]

We need to compute the smallest zero. Secular equation solvers use rational interpolation.

When an approximate solution $\sigma_{tls}^2$ has been computed, we solve

\[ x_{tls} = (A^T A - \sigma_{tls}^2 I)^{-1} A^T c. \]
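The left-hand side of the secular equation in terms of $\xi^{(k)}$ is strictly decreasing in $\sigma^2$ on the interval between 0 and the smallest $(\sigma_i^{(k)})^2$, so its smallest zero can be bracketed there. The sketch below uses plain bisection as a simple stand-in for the rational interpolation solvers mentioned above; it assumes that `xi` and `sigma` have already been obtained from an SVD of $C_k$ and that the relevant components of `xi` are nonzero. The function name is hypothetical.

```python
def smallest_secular_zero(xi, sigma, norm_c, tol=1e-14, max_iter=200):
    """Smallest zero s = sigma^2 of

        xi[k]^2 / s - sum_i xi[i]^2 / (sigma[i]^2 - s) - 1 / norm_c^2

    on the interval (0, min_i sigma[i]^2), found by bisection.
    xi has k + 1 entries and sigma has k entries.
    """
    k = len(sigma)

    def g(s):
        total = xi[k] ** 2 / s
        for i in range(k):
            total -= xi[i] ** 2 / (sigma[i] ** 2 - s)
        return total - 1.0 / norm_c ** 2

    lo, hi = 0.0, min(si ** 2 for si in sigma)
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        # g decreases from +inf to -inf on the interval, so a positive
        # value means the zero lies to the right of mid.
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
        if hi - lo <= tol * hi:
            break
    return 0.5 * (lo + hi)
```

Bisection converges only linearly, which is one reason dedicated secular equation solvers are preferred in practice.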


The Gauss–Radau rule

We implement the Gauss–Radau rule by using the other Golub-Kahan bidiagonalization algorithm with $A^T c$ as a starting vector.

It reduces $A$ to upper bidiagonal form. If

\[ B_k = \begin{pmatrix} \gamma_1 & \delta_1 & & \\ & \ddots & \ddots & \\ & & \gamma_{k-1} & \delta_{k-1} \\ & & & \gamma_k \end{pmatrix}, \]

then the matrix $B_k$ is the Cholesky factor of the Lanczos matrix $J_k$.


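To obtain the Gauss–Radau rule we must modify $B_k$ to have a prescribed eigenvalue $z$.

Let $\omega$ be the solution of

\[ (B_k^T B_k - zI)\,\omega = (\gamma_{k-1}\delta_{k-1})^2 e^k. \]

Let

\[ \tilde{\omega}_k = (z + \omega_k) - \frac{(\gamma_{k-1}\delta_{k-1})^2}{\gamma_{k-1}^2} = (z + \omega_k) - \delta_{k-1}^2. \]

The modified matrix giving the Gauss–Radau rule is

\[ \tilde{B}_k = \begin{pmatrix} \gamma_1 & \delta_1 & & \\ & \ddots & \ddots & \\ & & \gamma_{k-1} & \delta_{k-1} \\ & & & \tilde{\gamma}_k \end{pmatrix}, \]

where $\tilde{\gamma}_k = \sqrt{\tilde{\omega}_k}$.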
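The same modification is often stated for the tridiagonal matrix directly: a Jacobi matrix with diagonal `diag` and off-diagonal `off` is bordered by one row and column with coupling `beta_next`, and the new last diagonal entry is chosen so that $z$ becomes an eigenvalue. The sketch below shows that tridiagonal form under stated assumptions (illustrative names, an unpivoted Thomas solve that is safe when $z$ lies outside the spectrum of the unbordered matrix). In bidiagonal terms, subtracting the squared superdiagonal entry from the returned value and taking the square root gives the modified last diagonal entry, as in the formula above.

```python
def solve_tridiagonal(diag, off, rhs):
    """Thomas algorithm for a symmetric tridiagonal system (no pivoting)."""
    n = len(diag)
    d = list(diag)
    b = list(rhs)
    for i in range(1, n):
        m = off[i - 1] / d[i - 1]
        d[i] -= m * off[i - 1]
        b[i] -= m * b[i - 1]
    x = [0.0] * n
    x[-1] = b[-1] / d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = (b[i] - off[i] * x[i + 1]) / d[i]
    return x

def radau_last_diagonal(diag, off, beta_next, z):
    """Last diagonal entry that makes z an eigenvalue of the bordered matrix.

    Solves (J - z I) omega = beta_next^2 e_k and returns z + omega_k,
    where J is the k x k Jacobi matrix given by diag and off.
    """
    shifted = [a - z for a in diag]
    rhs = [0.0] * (len(diag) - 1) + [beta_next ** 2]
    omega = solve_tridiagonal(shifted, off, rhs)
    return z + omega[-1]
```

That $z$ is an eigenvalue of the bordered matrix follows from the Schur complement of its last diagonal entry, which vanishes exactly when that entry equals $z + \omega_k$.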


Using the modified matrix $\tilde{B}_k$ we solve the secular equation

\[ \|c\|^2 - \|A^T c\|^2 (e^1)^T (\tilde{B}_k^T \tilde{B}_k - \sigma^2 I)^{-1} e^1 = \sigma^2 \]

with the SVD of $\tilde{B}_k$.


Numerical experiments

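The test matrices are constructed as

\[ A_s = U_s \Sigma_s V_s^T, \qquad U_s = I - 2\,\frac{u_s u_s^T}{\|u_s\|^2}, \qquad V_s = I - 2\,\frac{v_s v_s^T}{\|v_s\|^2}, \]

where $u_s$ and $v_s$ are random vectors.

$\Sigma_s$ is an $m \times n$ diagonal matrix with elements $[1, \ldots, \sqrt{n}]$.

Let $x_s$ be a vector whose $i$th component is $1/i$ and $c_s = A_s x_s$. The data matrix is

\[ A = A_s + \xi\, \mathrm{randn}(m, n). \]

The right-hand side is

\[ c = c_s + \xi\, \mathrm{randn}(m, 1). \]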
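Because $U_s$ and $V_s$ are Householder reflectors, products with $A_s$ and $A_s^T$ can be computed from $u_s$, $v_s$ and the diagonal of $\Sigma_s$ alone, without forming any matrix. The following is a minimal sketch of that idea; the function names are hypothetical, and the diagonal entries are passed in as a list `sigma` because the slides do not say how the values between 1 and $\sqrt{n}$ are spaced.

```python
def reflect(w, x):
    """Apply the Householder reflector I - 2 w w^T / ||w||^2 to x."""
    scale = 2.0 * sum(wi * xi for wi, xi in zip(w, x)) / sum(wi * wi for wi in w)
    return [xi - scale * wi for wi, xi in zip(w, x)]

def apply_As(u, v, sigma, x):
    """Return A_s x = U_s Sigma_s V_s^T x without forming A_s (m >= n)."""
    t = reflect(v, x)                         # V_s^T x (reflectors are symmetric)
    t = [s * ti for s, ti in zip(sigma, t)]   # scale by the diagonal of Sigma_s
    t += [0.0] * (len(u) - len(t))            # pad with zeros to length m
    return reflect(u, t)                      # U_s times the padded vector

def apply_AsT(u, v, sigma, y):
    """Return A_s^T y = V_s Sigma_s^T U_s^T y without forming A_s."""
    t = reflect(u, y)[:len(sigma)]            # first n entries of U_s^T y
    t = [s * ti for s, ti in zip(sigma, t)]
    return reflect(v, t)
```

With `x_s = [1.0 / (i + 1) for i in range(n)]`, the unperturbed right-hand side $c_s$ is `apply_As(u, v, sigma, x_s)`; each product costs $O(m + n)$ operations instead of $O(mn)$.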


A small example

Example TLS1, $m = 100$, $n = 50$, BNS1, $\varepsilon = 10^{-6}$

| $\xi$ | L it. | s it. | sol. | exact sol. |
|---|---|---|---|---|
| $0.3 \cdot 10^{-2}$ | 30 | 57 | 0.01703479103104873 | 0.01703478979190218 |
| $0.3 \cdot 10^{-1}$ | 26 | 49 | 0.169448388286749 | 0.1694483528865543 |
| 0.3 | 28 | 73 | 1.464892131470029 | 1.464891451263777 |
| 30 | 33 | 64 | 88.21012648624229 | 88.21012652906667 |


A larger example

- We are not able to store $A$, which is a dense matrix, in Matlab
- We use the vectors $u_s$ and $v_s$ to do matrix multiplies with $A_s$ or $A_s^T$
- We perturb the singular values in the same way as the right-hand side

Example TLS3, $m = 10000$, $n = 5000$, noise = 0.3, BNS1

| $\varepsilon$ | L it. | s it. | min it. | max it. | av. it. | solution |
|---|---|---|---|---|---|---|
| $10^{-6}$ | 250 | 273 | 1 | 2 | 1.09 | 1.418582932414374 |
| $10^{-10}$ | 328 | 660 | 1 | 3 | 2.01 | 1.418576233569240 |

It works fine but it is too expensive.


The Gauss–Radau rule

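Example TLS1, $m = 100$, $n = 50$, Gauss–Radau, noise = 0.3, $\varepsilon = 10^{-6}$

| Met. | L it. | $\sqrt{z}$ | s it. | min it. | max it. | solution |
|---|---|---|---|---|---|---|
| Newt | 28 | $\sigma_{\min}$ | 130 | 2 | 14 | 1.464891376927382 |
| Newt | 28 | $\sigma_{\max}$ | 79 | 2 | 4 | 1.464892626809155 |
| Rat | 28 | $\sigma_{\min}$ | 98 | 2 | 5 | 1.464891376927382 |
| Rat | 28 | $\sigma_{\max}$ | 74 | 2 | 3 | 1.464892626809155 |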


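Example TLS3, $m = 10000$, $n = 5000$, Gauss–Radau, noise = 0.3, $\varepsilon = 10^{-6}$

| Met. | L it. | $\sqrt{z}$ | s it. | min it. | max it. | solution |
|---|---|---|---|---|---|---|
| Newt | 250 | $\sigma_{\min}$ | 2572 | 3 | 31 | 1.418576232676234 |
| Newt | 250 | $\sigma_{\max}$ | 1926 | 3 | 26 | 1.418583305908228 |
| Rat | 250 | $\sigma_{\min}$ | 837 | 2 | 5 | 1.418576232676233 |
| Rat | 250 | $\sigma_{\max}$ | 653 | 2 | 4 | 1.418583305908227 |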


Optimization of the algorithm

To reduce the cost:

- We monitor the convergence of the smallest singular value of $A$
- For this we solve a secular equation at every Lanczos iteration
- We use a third order rational approximation and tridiagonal solves
- The Gauss and Gauss–Radau estimates are only computed at the end


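Example TLS3, $m = 10000$, $n = 5000$, noise = 0.3, $\varepsilon = 10^{-6}$

| Met. | L it. | trid | $\sqrt{z}$ | s it. | solution |
|---|---|---|---|---|---|
| - | 250 | 551 | | | |
| Gauss | | | - | 2 | 1.418582932414440 |
| G–R | | | $\sigma_{\min}(B_k)$ | 2 | 1.418582932414443 |
| G–R | | | $\sigma_{\max}(B_k)$ | 3 | 1.418583305908306 |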


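Example TLS4, $m = 100000$, $n = 50000$, noise = 0.3, $\varepsilon = 10^{-6}$

| Met. | L it. | trid | $\sqrt{z}$ | s it. | solution |
|---|---|---|---|---|---|
| - | 755 | 1775 | | | |
| Gauss | | | - | 1 | 0.8721122166701496 |
| G–R | | | $\sigma_{\min}(B_k)$ | 2 | 0.8721122166735605 |
| G–R | | | $\sigma_{\max}(B_k)$ | 3 | 0.8721124331415380 |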


For example TLS3 with $m = 10000$, $n = 5000$ and $\varepsilon = 10^{-6}$:

- The computing time when solving for Gauss and Gauss–Radau at each iteration was 117 seconds
- With the last algorithm it is 12 seconds


J.R. Bunch, C.P. Nielsen and D.C. Sorensen, Rank-one modification of the symmetric eigenproblem, Numer. Math., v 31, (1978), pp 31–48

G.H. Golub and C. Van Loan, An analysis of the total least squares problem, SIAM J. Numer. Anal., v 17 n 6, (1980), pp 883–893

Ren-Cang Li, Solving secular equations stably and efficiently, Report UCB CSD-94-851, University of California, Berkeley, (1994)

A. Melman, A unifying convergence analysis of second-order methods for secular equation, Math. Comp., v 66 n 217, (1997), pp 333–344

A. Melman, A numerical comparison of methods for solving secular equations, J. Comp. Appl. Math., v 86, (1997), pp 237–249

C.C. Paige and Z. Strakoš, Bounds for the least squares residual using scaled total least squares, in Proc. 3rd int. workshop on TLS and error-in-variables modelling, S. Van Huffel and P. Lemmerling eds., Kluwer, (2001), pp 25–34

C.C. Paige and Z. Strakoš, Unifying least squares, total least squares and data least squares, in Proc. 3rd int. workshop on TLS and error-in-variables modelling, S. Van Huffel and P. Lemmerling eds., Kluwer, (2001), pp 35–44

C.C. Paige and Z. Strakoš, Bounds for the least squares distance using scaled total least squares problems, Numer. Math., v 91, (2002), pp 93–115

C.C. Paige and Z. Strakoš, Scaled total least squares fundamentals, Numer. Math., v 91, (2002), pp 117–146

C.C. Paige and Z. Strakoš, Core problems in linear algebraic systems, SIAM J. Matrix Anal. Appl., v 27 n 3, (2006), pp 861–874

S. Van Huffel and J. Vandewalle, The total least squares problem: computational aspects and analysis, SIAM, (1991)