Total least squares
Gerard MEURANT
October, 2008
1 Introduction to total least squares
2 Approximation of the TLS secular equation
3 Numerical experiments
Introduction to total least squares
In least squares (LS) we have only a perturbation of the right-hand side, whereas total least squares (TLS) considers perturbations of the vector of observations c and of the m × n data matrix A
$$\min_{E,\,r}\ \left\|\begin{pmatrix} E & r \end{pmatrix}\right\|_F \quad \text{subject to the constraint } (A + E)x = c + r$$
This is finding the smallest perturbations E and r such that c + r is in the range of A + E
see Golub and Van Loan; Van Huffel and Vandewalle; Paige and Strakos
Theorem (Golub and Van Loan)
Let $C = \begin{pmatrix} A & c \end{pmatrix}$ and $U^T C V = \Sigma$ be its SVD. Assume that the singular values of $C$ are such that
$$\sigma_1 \ge \cdots \ge \sigma_k > \sigma_{k+1} = \cdots = \sigma_{n+1}$$
Then the solution of the TLS problem is given by
$$\min \left\|\begin{pmatrix} E & r \end{pmatrix}\right\|_F = \sigma_{n+1}, \qquad x_{TLS} = -\frac{y}{\alpha}$$
where the vector $\begin{pmatrix} y^T & \alpha \end{pmatrix}^T$ of norm 1 with $\alpha \ne 0$ is in the subspace $S_k$ spanned by the right singular vectors $\{v_{k+1}, \ldots, v_{n+1}\}$ of $V$. If there is no such vector with $\alpha \ne 0$, the TLS problem has no solution
The right singular vectors $v_i$ are the eigenvectors of $C^T C$ and $S_k$ is the invariant subspace associated with the smallest eigenvalue $\sigma_{n+1}^2$
The TLS solution $x_{TLS}$ solves the eigenvalue problem
$$C^T C \begin{pmatrix} x \\ -1 \end{pmatrix} = \sigma_{n+1}^2 \begin{pmatrix} x \\ -1 \end{pmatrix}$$
Theorem
If $\sigma_n^A > \sigma_{n+1}$, where $\sigma_n^A$ denotes the smallest singular value of $A$, then $x_{TLS}$ exists and is the unique solution of the TLS problem,
$$x_{TLS} = (A^T A - \sigma_{n+1}^2 I)^{-1} A^T c$$
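As a sanity check, the closed-form expression can be compared with the SVD characterization on a small random problem. The sketch below is ours (problem sizes and noise level are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(7)
m, n = 20, 5
A = rng.standard_normal((m, n))
c = A @ np.ones(n) + 0.05 * rng.standard_normal(m)  # noisy consistent system

# SVD characterization: x_TLS from the last right singular vector of C = (A c)
C = np.hstack([A, c[:, None]])
U, s, Vt = np.linalg.svd(C, full_matrices=False)
x_svd = -Vt[-1, :n] / Vt[-1, n]

# Closed form: x_TLS = (A^T A - sigma_{n+1}^2 I)^{-1} A^T c
x_cf = np.linalg.solve(A.T @ A - s[-1]**2 * np.eye(n), A.T @ c)
```

Both routes give the same vector as long as the gap condition $\sigma_n^A > \sigma_{n+1}$ holds, which is the generic situation for noisy data.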
Moreover, $\sigma_{n+1}$ satisfies the following secular equation
$$\sigma_{n+1}^2 \left[ 1 + \sum_{i=1}^{n} \frac{d_i^2}{(\sigma_i^A)^2 - \sigma_{n+1}^2} \right] = \rho_{LS}^2$$
where $d = U^T c$ (with $U$ from the SVD of $A$) and $\rho_{LS}^2 = \|c - A x_{LS}\|^2$
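This identity can be checked numerically on a small random problem (a sketch with arbitrary sizes; we use that $A x_{LS}$ is the orthogonal projection of $c$ onto the range of $A$, so $\rho_{LS}^2 = \|c\|^2 - \|U^T c\|^2$):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 30, 6
A = rng.standard_normal((m, n))
c = A @ np.ones(n) + 0.1 * rng.standard_normal(m)

# sigma_{n+1}: smallest singular value of C = (A c)
sigma2 = np.linalg.svd(np.hstack([A, c[:, None]]), compute_uv=False)[-1]**2

# d = U^T c from the SVD of A, and the LS residual norm rho_LS
Ua, sa, _ = np.linalg.svd(A, full_matrices=False)
d = Ua.T @ c
rho2_ls = c @ c - d @ d  # ||c - A x_LS||^2

# left-hand side of the secular equation, evaluated at sigma_{n+1}^2
lhs = sigma2 * (1.0 + np.sum(d**2 / (sa**2 - sigma2)))
```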
The secular equation can also be written as
$$\sigma_{n+1}^2 = c^T c - c^T A (A^T A - \sigma_{n+1}^2 I)^{-1} A^T c$$
This is obtained by writing
$$\begin{pmatrix} A^T A & A^T c \\ c^T A & c^T c \end{pmatrix} \begin{pmatrix} x \\ -1 \end{pmatrix} = \sigma_{n+1}^2 \begin{pmatrix} x \\ -1 \end{pmatrix}$$
and eliminating $x$
For data least squares (DLS), where only the matrix is perturbed, the secular equation is
$$c^T c - c^T A (A^T A - \sigma^2 I)^{-1} A^T c = 0$$
This can also be written as
$$c^T (A A^T - \sigma^2 I)^{-1} c = 0$$
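The equivalence of the two forms follows from the Woodbury-type identity $(AA^T - \sigma^2 I)^{-1} = -\sigma^{-2}\left[I - A(A^TA - \sigma^2 I)^{-1}A^T\right]$, so the two expressions differ only by the factor $-\sigma^{-2}$ and have the same zeros. A quick numerical check (arbitrary small sizes and an arbitrary shift):

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 15, 4
A = rng.standard_normal((m, n))
c = rng.standard_normal(m)
s2 = 0.123  # an arbitrary shift, not a squared singular value of A

# first form: c^T c - c^T A (A^T A - s2 I)^{-1} A^T c
f1 = c @ c - c @ (A @ np.linalg.solve(A.T @ A - s2 * np.eye(n), A.T @ c))
# second form: c^T (A A^T - s2 I)^{-1} c
f2 = c @ np.linalg.solve(A @ A.T - s2 * np.eye(m), c)
```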
[Figure: example of the TLS secular function as a function of $\sigma^2$]
Approximation of the TLS secular equation
We approximate the quadratic form in the TLS secular equation by using one of the Golub–Kahan bidiagonalization algorithms with $c$ as a starting vector. It reduces $A$ to lower bidiagonal form and generates a matrix
$$C_k = \begin{pmatrix} \gamma_1 & & \\ \delta_1 & \gamma_2 & \\ & \ddots & \ddots \\ & & \gamma_k \\ & & \delta_k \end{pmatrix}$$
a $(k+1) \times k$ matrix such that $C_k^T C_k = J_k$, the tridiagonal matrix generated by the Lanczos algorithm for the matrix $A^T A$
At iteration $k$ we approximate the TLS secular equation by
$$c^T c - \|c\|^2 (e^1)^T C_k (C_k^T C_k - \sigma^2 I)^{-1} C_k^T e^1 = \sigma^2$$
This corresponds to the Gauss quadrature rule
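The lower bidiagonalization can be sketched with the LSQR-style recurrence (no reorthogonalization; function and variable names are ours):

```python
import numpy as np

def golub_kahan_lower(A, c, k):
    """Lower Golub-Kahan bidiagonalization with u_1 = c/||c||.
    Returns U (m x (k+1)), V (n x k) and the (k+1) x k lower
    bidiagonal C_k such that A V = U C_k."""
    m, n = A.shape
    U = np.zeros((m, k + 1)); V = np.zeros((n, k))
    Ck = np.zeros((k + 1, k))
    U[:, 0] = c / np.linalg.norm(c)
    for j in range(k):
        w = A.T @ U[:, j]
        if j > 0:
            w -= Ck[j, j - 1] * V[:, j - 1]   # subtract delta_j * v_{j-1}
        Ck[j, j] = np.linalg.norm(w)          # gamma_{j+1}
        V[:, j] = w / Ck[j, j]
        w = A @ V[:, j] - Ck[j, j] * U[:, j]
        Ck[j + 1, j] = np.linalg.norm(w)      # delta_{j+1}
        U[:, j + 1] = w / Ck[j + 1, j]
    return U, V, Ck

rng = np.random.default_rng(0)
A = rng.standard_normal((40, 10))
c = rng.standard_normal(40)
U, V, Ck = golub_kahan_lower(A, c, 5)
```

In exact arithmetic $C_k^T C_k$ is precisely the tridiagonal $J_k$ the Lanczos algorithm would build for $A^T A$ started from $A^T c$-independent vector $c$; in floating point a practical code would reorthogonalize.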
We use the SVD of $C_k = U_k S_k V_k^T$. Let $\sigma_i^{(k)}$ be the singular values of $C_k$ and $\xi^{(k)} = U_k^T e^1$. The approximate secular equation becomes
$$\frac{(\xi_{k+1}^{(k)})^2}{\sigma^2} - \sum_{i=1}^{k} \frac{(\xi_i^{(k)})^2}{(\sigma_i^{(k)})^2 - \sigma^2} = \frac{1}{\|c\|^2}$$
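This form follows from the previous one: since $\|\xi^{(k)}\| = 1$, one has $1 - \sum_i (\xi_i^{(k)})^2 (\sigma_i^{(k)})^2 / ((\sigma_i^{(k)})^2 - \sigma^2) = (\xi_{k+1}^{(k)})^2 - \sigma^2 \sum_i (\xi_i^{(k)})^2 / ((\sigma_i^{(k)})^2 - \sigma^2)$. A numerical check of this identity on an arbitrary lower bidiagonal matrix standing in for $C_k$ (a sketch, our notation):

```python
import numpy as np

rng = np.random.default_rng(3)
k = 5
# an arbitrary (k+1) x k lower bidiagonal matrix standing in for C_k
Ck = np.zeros((k + 1, k))
Ck[np.arange(k), np.arange(k)] = rng.uniform(1.0, 2.0, k)           # gamma_i
Ck[np.arange(1, k + 1), np.arange(k)] = rng.uniform(0.5, 1.0, k)    # delta_i

Uk, sk, _ = np.linalg.svd(Ck)            # full SVD, Uk is (k+1) x (k+1)
e1 = np.zeros(k + 1); e1[0] = 1.0
xi = Uk.T @ e1                           # xi^{(k)}
s2 = 0.01                                # arbitrary shift below the singular values

# quadratic-form expression vs. the partial-fraction expression
lhs = 1.0 - e1 @ Ck @ np.linalg.solve(Ck.T @ Ck - s2 * np.eye(k), Ck.T @ e1)
rhs = xi[-1]**2 - s2 * np.sum(xi[:k]**2 / (sk**2 - s2))
```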
We need to compute the smallest zero. Secular equation solvers use rational interpolation. When an approximate solution $\sigma_{tls}^2$ has been computed, we solve
$$x_{tls} = (A^T A - \sigma_{tls}^2 I)^{-1} A^T c$$
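Rational-interpolation solvers (such as the BNS scheme cited in the references) converge much faster; the sketch below uses plain bisection on $(0, (\sigma_n^A)^2)$ only to illustrate the principle, since $f(0) = -\rho_{LS}^2 < 0$ and $f \to +\infty$ at the pole $(\sigma_n^A)^2$:

```python
import numpy as np

rng = np.random.default_rng(3)
m, n = 25, 4
A = rng.standard_normal((m, n))
c = A @ np.ones(n) + 0.05 * rng.standard_normal(m)

Ua, sa, _ = np.linalg.svd(A, full_matrices=False)
d = Ua.T @ c
rho2_ls = c @ c - d @ d                 # ||c - A x_LS||^2

def f(lam):
    # secular function: lam * [1 + sum d_i^2 / ((sigma_i^A)^2 - lam)] - rho_LS^2
    return lam * (1.0 + np.sum(d**2 / (sa**2 - lam))) - rho2_ls

# the smallest zero lies in (0, sigma_n(A)^2); bisection on that interval
lo, hi = 0.0, sa[-1]**2 * (1.0 - 1e-12)
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if f(mid) < 0.0:
        lo = mid
    else:
        hi = mid
lam = 0.5 * (lo + hi)                   # approximates sigma_{n+1}^2
x_tls = np.linalg.solve(A.T @ A - lam * np.eye(n), A.T @ c)
```

By interlacing, $\sigma_{n+1}^2$ is the only zero in that interval, so the bisection cannot pick up a wrong root.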
The Gauss–Radau rule
We implement the Gauss–Radau rule by using the other Golub–Kahan bidiagonalization algorithm with $A^T c$ as a starting vector. It reduces $A$ to upper bidiagonal form. If
$$B_k = \begin{pmatrix} \gamma_1 & \delta_1 & & \\ & \ddots & \ddots & \\ & & \gamma_{k-1} & \delta_{k-1} \\ & & & \gamma_k \end{pmatrix}$$
the matrix $B_k$ is the Cholesky factor of the Lanczos matrix $J_k$
To obtain the Gauss–Radau rule we must modify $B_k$ to have a prescribed eigenvalue $z$. Let $\omega$ be the solution of
$$(B_{k-1}^T B_{k-1} - zI)\,\omega = (\gamma_{k-1}\delta_{k-1})^2\, e^{k-1}$$
where $B_{k-1}^T B_{k-1}$ is the leading $(k-1) \times (k-1)$ block of $B_k^T B_k$. Let
$$\hat\omega_k = (z + \omega_{k-1}) - \frac{(\gamma_{k-1}\delta_{k-1})^2}{\gamma_{k-1}^2} = (z + \omega_{k-1}) - \delta_{k-1}^2$$
The modified matrix giving the Gauss–Radau rule is
$$\hat B_k = \begin{pmatrix} \gamma_1 & \delta_1 & & \\ & \ddots & \ddots & \\ & & \gamma_{k-1} & \delta_{k-1} \\ & & & \hat\gamma_k \end{pmatrix}$$
where $\hat\gamma_k = \sqrt{\hat\omega_k}$.
Using $\hat B_k$ we solve the secular equation
$$\|c\|^2 - \|A^T c\|^2 (e^1)^T (\hat B_k^T \hat B_k - \sigma^2 I)^{-1} e^1 = \sigma^2$$
with the SVD of $\hat B_k$
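The modification can be checked on a random bidiagonal matrix (a sketch; sizes and the choice of $z$ below a lower bound for the smallest eigenvalue are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(5)
k = 6
gam = rng.uniform(1.0, 2.0, k)        # gamma_1 .. gamma_k (diagonal of B_k)
dlt = rng.uniform(0.3, 0.8, k - 1)    # delta_1 .. delta_{k-1} (superdiagonal)
B = np.diag(gam) + np.diag(dlt, 1)
J = B.T @ B                           # the Lanczos tridiagonal J_k

z = 0.05 * np.linalg.eigvalsh(J)[0]   # prescribed eigenvalue, below lambda_min(J_k)

# solve with the leading (k-1) x (k-1) block of J_k
rhs = np.zeros(k - 1)
rhs[-1] = (gam[-2] * dlt[-1])**2      # (gamma_{k-1} delta_{k-1})^2
omega = np.linalg.solve(J[:k-1, :k-1] - z * np.eye(k - 1), rhs)

gam_hat = np.sqrt((z + omega[-1]) - dlt[-1]**2)   # modified gamma_k
B_hat = B.copy()
B_hat[-1, -1] = gam_hat

eigs = np.linalg.eigvalsh(B_hat.T @ B_hat)        # z is now among these
```

Since $z$ lies below the smallest eigenvalue of the leading block, the modified tridiagonal stays positive definite and $\hat\omega_k > 0$, so the square root is real.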
Numerical experiments
The test matrix is
$$A_s = U_s \Sigma_s V_s^T, \qquad U_s = I - 2\,\frac{u_s u_s^T}{\|u_s\|^2}, \qquad V_s = I - 2\,\frac{v_s v_s^T}{\|v_s\|^2}$$
where $u_s$ and $v_s$ are random vectors and $\Sigma_s$ is an $m \times n$ diagonal matrix with elements $[1, \cdots, \sqrt{n}]$.
Let $x_s$ be a vector whose $i$th component is $1/i$ and $c_s = A_s x_s$. Then
$$A = A_s + \xi\,\mathrm{randn}(m, n)$$
The right-hand side is
$$c = c_s + \xi\,\mathrm{randn}(m, 1)$$
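This construction can be reproduced directly (a sketch with smaller dimensions; we read "elements $[1, \cdots, \sqrt{n}]$" as $n$ equally spaced values, which is our assumption):

```python
import numpy as np

rng = np.random.default_rng(6)
m, n, xi = 200, 50, 0.3               # smaller than TLS1; xi is the noise level
us = rng.standard_normal(m)
vs = rng.standard_normal(n)
Us = np.eye(m) - 2 * np.outer(us, us) / (us @ us)   # Householder reflector
Vs = np.eye(n) - 2 * np.outer(vs, vs) / (vs @ vs)
Sigma = np.zeros((m, n))
Sigma[np.arange(n), np.arange(n)] = np.linspace(1.0, np.sqrt(n), n)
As = Us @ Sigma @ Vs.T                # exact singular values are known
xs = 1.0 / np.arange(1, n + 1)
cs = As @ xs
A = As + xi * rng.standard_normal((m, n))
c = cs + xi * rng.standard_normal(m)
```

Because $U_s$ and $V_s$ are orthogonal reflectors, the singular values of $A_s$ are exactly the prescribed diagonal, which makes the unperturbed problem fully controlled.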
A small example
Example TLS1, m = 100, n = 50, BNS1, ε = 10⁻⁶

    ξ          L it.   s it.   solution              exact solution
    0.3·10⁻²   30      57      0.01703479103104873   0.01703478979190218
    0.3·10⁻¹   26      49      0.169448388286749     0.1694483528865543
    0.3        28      73      1.464892131470029     1.464891451263777
    30         33      64      88.21012648624229     88.21012652906667
A larger example
- We are not able to store A, which is a dense matrix, in Matlab
- We use the vectors us and vs to do matrix multiplies with As or AsT
- We perturb the singular values in the same way as the right-hand side
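The matrix-free products with $A_s$ and $A_s^T$ can be formed from $u_s$, $v_s$ and the diagonal alone, in O(m + n) work per product. A sketch (function names are ours):

```python
import numpy as np

def apply_As(x, us, vs, sig):
    """y = As @ x with As = (I - 2 us us^T/||us||^2) Sigma_s (I - 2 vs vs^T/||vs||^2),
    never forming As explicitly."""
    m, n = us.size, vs.size
    w = x - 2 * (vs @ x) / (vs @ vs) * vs      # apply V_s^T (= V_s, symmetric)
    y = np.zeros(m)
    y[:n] = sig * w                            # apply the m x n diagonal Sigma_s
    return y - 2 * (us @ y) / (us @ us) * us   # apply U_s

def apply_AsT(y, us, vs, sig):
    """x = As.T @ y, using the same structure in reverse order."""
    w = y - 2 * (us @ y) / (us @ us) * us
    x = sig * w[:vs.size]
    return x - 2 * (vs @ x) / (vs @ vs) * vs

# small dense reference to validate against
rng = np.random.default_rng(8)
m, n = 30, 8
us, vs = rng.standard_normal(m), rng.standard_normal(n)
sig = np.linspace(1.0, np.sqrt(n), n)
As = (np.eye(m) - 2 * np.outer(us, us) / (us @ us)) @ \
     np.vstack([np.diag(sig), np.zeros((m - n, n))]) @ \
     (np.eye(n) - 2 * np.outer(vs, vs) / (vs @ vs))
```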
Example TLS3, m = 10000, n = 5000, noise=0.3, BNS1
    ε       L it.   s it.   min it.   max it.   av. it.   solution
    10⁻⁶    250     273     1         2         1.09      1.418582932414374
    10⁻¹⁰   328     660     1         3         2.01      1.418576233569240

It works fine, but it is too expensive
The Gauss–Radau rule
Example TLS1, m = 100, n = 50, Gauss–Radau, noise = 0.3, ε = 10⁻⁶

    Met.   L it.   √z     s it.   min it.   max it.   solution
    Newt   28      σmin   130     2         14        1.464891376927382
    Newt   28      σmax   79      2         4         1.464892626809155
    Rat    28      σmin   98      2         5         1.464891376927382
    Rat    28      σmax   74      2         3         1.464892626809155
Example TLS3, m = 10000, n = 5000, Gauss–Radau, noise = 0.3, ε = 10⁻⁶

    Met.   L it.   √z     s it.   min it.   max it.   solution
    Newt   250     σmin   2572    3         31        1.418576232676234
    Newt   250     σmax   1926    3         26        1.418583305908228
    Rat    250     σmin   837     2         5         1.418576232676233
    Rat    250     σmax   653     2         4         1.418583305908227
Optimization of the algorithm
To reduce the cost
- We monitor the convergence of the smallest singular value of A
- For this we solve a secular equation at every Lanczos iteration
- We use a third order rational approximation and tridiagonal solves
- The Gauss and Gauss–Radau estimates are only computed at the end
Example TLS3, m = 10000, n = 5000, noise = 0.3, ε = 10⁻⁶ (L it. = 250, tridiagonal solves = 551)

    Met.    √z          s it.   solution
    Gauss   –           2       1.418582932414440
    G–R     σmin(Bk)    2       1.418582932414443
    G–R     σmax(Bk)    3       1.418583305908306
Example TLS4, m = 100000, n = 50000, noise = 0.3, ε = 10⁻⁶ (L it. = 755, tridiagonal solves = 1775)

    Met.    √z          s it.   solution
    Gauss   –           1       0.8721122166701496
    G–R     σmin(Bk)    2       0.8721122166735605
    G–R     σmax(Bk)    3       0.8721124331415380
For example TLS3 with m = 10000, n = 5000 and ε = 10−6
- The computing time when solving for Gauss and Gauss–Radau at each iteration was 117 seconds
- With the last algorithm it is 12 seconds
J.R. Bunch, C.P. Nielsen and D.C. Sorensen, Rank-one modification of the symmetric eigenproblem, Numer. Math., v 31 (1978), pp. 31–48

G.H. Golub and C. Van Loan, An analysis of the total least squares problem, SIAM J. Numer. Anal., v 17 n 6 (1980), pp. 883–893

Ren-Cang Li, Solving secular equations stably and efficiently, Report UCB CSD-94-851, University of California, Berkeley (1994)

A. Melman, A unifying convergence analysis of second-order methods for secular equations, Math. Comp., v 66 n 217 (1997), pp. 333–344

A. Melman, A numerical comparison of methods for solving secular equations, J. Comp. Appl. Math., v 86 (1997), pp. 237–249

C.C. Paige and Z. Strakos, Bounds for the least squares residual using scaled total least squares, in Proc. 3rd int. workshop on TLS and error-in-variables modelling, S. Van Huffel and P. Lemmerling eds., Kluwer (2001), pp. 25–34

C.C. Paige and Z. Strakos, Unifying least squares, total least squares and data least squares, in Proc. 3rd int. workshop on TLS and error-in-variables modelling, S. Van Huffel and P. Lemmerling eds., Kluwer (2001), pp. 35–44

C.C. Paige and Z. Strakos, Bounds for the least squares distance using scaled total least squares problems, Numer. Math., v 91 (2002), pp. 93–115

C.C. Paige and Z. Strakos, Scaled total least squares fundamentals, Numer. Math., v 91 (2002), pp. 117–146

C.C. Paige and Z. Strakos, Core problems in linear algebraic systems, SIAM J. Matrix Anal. Appl., v 27 n 3 (2006), pp. 861–874

S. Van Huffel and J. Vandewalle, The total least squares problem: computational aspects and analysis, SIAM (1991)