Hybrid LSQR and RSVD Solutions of Ill-Posed Least Squares Problems
Rosemary Renaut¹, Anthony Helmstetter¹, Saeed Vatankhah²
1: School of Mathematical and Statistical Sciences, Arizona State University, [email protected], [email protected]
2: Institute of Geophysics, University of Tehran, [email protected]
Syracuse University
Outline
Aims
Motivation Example: Large Scale Gravity Inversion
Background: SVD for the Small Scale; Standard Approaches to Estimate the Regularization Parameter
Methods for the Large Scale: Approximating the SVD; Krylov: Golub-Kahan Bidiagonalization (LSQR); Randomized SVD
Simulations: 1D Contrasting RSVD and LSQR; Two-Dimensional Examples
Extension: Iteratively Reweighted Regularization [LK83]; Undersampled 3D Magnetic: Approximate L1 Regularization
Conclusions: RSVD - LSQR
Research Aims: Efficient Solvers for 3D Inversion
1. Effective regularization parameter estimation using a small-scale surrogate for the large-scale problem
2. Derive surrogates from Krylov projection (LSQR)
3. Derive surrogates from the Randomized Singular Value Decomposition (RSVD)
4. Effective implementation of L1 and Lp regularization using surrogate initialization
Explain, hopefully, what these statements mean.
Motivation Example: Large Scale 3D Magnetic Anomaly Inversion
Observation point r = (x, y, z); vertical magnetic anomaly m(r):
m(r) ∝ ∫_Ω κ(r′) (r′ − r)/|r′ − r|³ dΩ′
Susceptibility κ(r′) at r′ = (x′, y′, z′).
Linear relation: m = Gκ.
Aim: given surface observations m_ij, find the susceptibility κ_ijk.
Underdetermined: 5000 measurements, 75000 unknowns.
Practical approaches for large-scale ill-posed problems are needed.
Ill-Posed Problem: Example Solutions, Image Restoration
[Figure: three 256×256 image panels: true, contaminated, and naive restoration]
Background: Spectral Decomposition of the Solution via the SVD
Consider the general discrete problem
Ax = b, A ∈ R^{m×n}, b ∈ R^m, x ∈ R^n.
Singular value decomposition (SVD) of A with rank r ≤ min(m, n):
A = UΣV^T = Σ_{i=1}^{r} σ_i u_i v_i^T, Σ = diag(σ_1, …, σ_r).
Singular values σ_i = √λ_i, with λ_i the eigenvalues of A^T A.
Singular vectors u_i, v_i: R(A) = span(U(:, 1:r)), N(A) = span(V(:, r+1:n)).
Expansion for the solution:
x = Σ_{i=1}^{r} (u_i^T b / σ_i) v_i
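To make the expansion concrete, here is a minimal numerical sketch (not the talk's code; the Gaussian-kernel test operator, grid size, and noise level are illustrative assumptions) showing how the naive solution x = Σ (u_i^T b/σ_i) v_i degrades once noise meets small σ_i:

```python
import numpy as np

# Illustrative ill-posed operator: a discretized Gaussian smoothing kernel.
n = 64
t = np.linspace(0, 1, n)
A = np.exp(-((t[:, None] - t[None, :]) ** 2) / 0.01) / n

x_true = np.sin(np.pi * t)                       # smooth true model
rng = np.random.default_rng(0)
b = A @ x_true + 1e-4 * rng.standard_normal(n)   # noisy data

U, s, Vt = np.linalg.svd(A)                      # A = U diag(s) V^T
x_naive = Vt.T @ ((U.T @ b) / s)                 # sum_i (u_i^T b / sigma_i) v_i
print("cond(A) =", s[0] / s[-1])
print("naive relative error:",
      np.linalg.norm(x_naive - x_true) / np.linalg.norm(x_true))
```

Even with 0.01% noise the naive relative error is enormous, because the trailing coefficients u_i^T b/σ_i are dominated by amplified noise.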
The Picard Plot: Examines the Solution
x = Σ_{i=1}^{r} (u_i^T b / σ_i) v_i
[Figure: Picard plot on a log scale showing σ_i, |u_i^T b|, and |u_i^T b|/σ_i against index i]
A solution with these coefficients would not be feasible.
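A Picard plot is cheap to produce once the SVD is available; a sketch under the same illustrative test operator as above (an assumption, not the talk's setup):

```python
import numpy as np
import matplotlib.pyplot as plt

n = 64
t = np.linspace(0, 1, n)
A = np.exp(-((t[:, None] - t[None, :]) ** 2) / 0.01) / n
b = A @ np.sin(np.pi * t) + 1e-4 * np.random.default_rng(0).standard_normal(n)

U, s, _ = np.linalg.svd(A)
beta = np.abs(U.T @ b)                  # |u_i^T b|
idx = np.arange(1, n + 1)
plt.semilogy(idx, s, 'o-', label=r'$\sigma_i$')
plt.semilogy(idx, beta, 's-', label=r'$|u_i^T b|$')
# Where beta levels off at the noise floor while s keeps decaying,
# the ratio below blows up: the Picard condition fails.
plt.semilogy(idx, beta / s, '^-', label=r'$|u_i^T b|/\sigma_i$')
plt.xlabel('index i'); plt.legend(); plt.title('Picard plot')
plt.show()
```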
Regularization: Solutions Are Not Stable
Truncation: k < r; the surrogate problem has size k, A_k ≈ A:
x = Σ_{i=1}^{k} (u_i^T b / σ_i) v_i
Filtering: with filter factors γ_i:
x = Σ_{i=1}^{r} γ_i(α) (u_i^T b / σ_i) v_i
Filtered truncated: k and γ_i; the surrogate problem has size k, A_k ≈ A:
x = Σ_{i=1}^{k} γ_i(α) (u_i^T b / σ_i) v_i
Here k and α are regularization parameters, and γ_i is the filter function.
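All three options fit in one small routine; a sketch (the helper name and interface are illustrative assumptions): truncation is γ_i ≡ 1 with k < r, filtering keeps all r terms, and the filtered truncated form combines both.

```python
import numpy as np

def filtered_solution(U, s, Vt, b, k, gamma=None):
    """x = sum_{i=1}^{k} gamma_i (u_i^T b / sigma_i) v_i.
    gamma=None gives plain truncation (gamma_i = 1)."""
    beta = U[:, :k].T @ b                  # coefficients u_i^T b
    coeff = beta / s[:k]
    if gamma is not None:
        coeff = gamma[:k] * coeff
    return Vt[:k].T @ coeff

# Usage: x_tsvd = filtered_solution(U, s, Vt, b, k=10)
#        x_filt = filtered_solution(U, s, Vt, b, k=len(s),
#                                   gamma=s**2 / (s**2 + alpha**2))
```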
Tikhonov Regularization: Regularization Parameter α
Filter functions:
γ_i(α) = σ_i² / (σ_i² + α²), i = 1, …, r.
Solves the standard form:
x(α) = argmin_x ‖b − Ax‖² + α²‖x‖² ≈ argmin_x ‖b − A_k x‖² + α²‖x‖²
Generalized Tikhonov has an operator L:
x(α) = argmin_x ‖b − A_k x‖² + α²‖Lx‖²
Solve with the standard form if L is invertible. Automatic estimation of α is desirable.
How to efficiently solve and find α_opt for large-scale problems?
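A minimal sketch of both forms (assuming L is invertible; the stacked least-squares formulation is one standard way to solve them, not necessarily the authors' solver):

```python
import numpy as np

def tikhonov_standard(A, b, alpha):
    # argmin ||b - A x||^2 + alpha^2 ||x||^2 via the stacked system
    m, n = A.shape
    K = np.vstack([A, alpha * np.eye(n)])
    rhs = np.concatenate([b, np.zeros(n)])
    return np.linalg.lstsq(K, rhs, rcond=None)[0]

def tikhonov_general(A, b, L, alpha):
    # argmin ||b - A x||^2 + alpha^2 ||L x||^2, L invertible:
    # substitute y = L x and solve the standard form with A L^{-1}
    y = tikhonov_standard(A @ np.linalg.inv(L), b, alpha)
    return np.linalg.solve(L, y)
```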
Regularization Parameter Estimation: Minimize F(α) for Some F
Introduce φ_i(α) = α²/(α² + σ_i²) = 1 − γ_i(α), i = 1:r, with γ_i = 0 for i > k.
Unbiased Predictive Risk (UPRE): minimize the functional, with noise level η²:
U_k(α) = Σ_{i=1}^{k} φ_i²(α)(u_i^T b)² − 2η² Σ_{i=1}^{k} φ_i(α)
GCV: minimize the rational function, with m* = min(m, n):
G(α) = (Σ_{i=1}^{m*} φ_i²(α)(u_i^T b)²) / (Σ_{i=1}^{m*} φ_i(α))²
WGCV: minimize
WG(α) = (Σ_{i=1}^{m*} φ_i²(α)(u_i^T b)²) / (1 + k − ωk + ω Σ_{i=1}^{k} φ_i(α))²
How does α_opt = argmin F(α) depend on k?
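Once the coefficients β_i = u_i^T b are available, these functionals are cheap to evaluate; a grid-search sketch (the log-spaced grid and function names are illustrative assumptions, and the functional forms transcribe the formulas above):

```python
import numpy as np

def upre(alpha, s, beta, eta2, k):
    phi = alpha**2 / (alpha**2 + s[:k]**2)        # phi_i = 1 - gamma_i
    return np.sum((phi * beta[:k])**2) - 2 * eta2 * np.sum(phi)

def gcv(alpha, s, beta):
    phi = alpha**2 / (alpha**2 + s**2)
    return np.sum((phi * beta)**2) / np.sum(phi)**2

def argmin_on_grid(F, alphas):
    return alphas[int(np.argmin([F(a) for a in alphas]))]

# Usage with beta = U.T @ b from the (approximate) SVD:
# alphas = np.logspace(-8, 1, 200)
# alpha_upre = argmin_on_grid(lambda a: upre(a, s, beta, eta2, k), alphas)
# alpha_gcv  = argmin_on_grid(lambda a: gcv(a, s, beta), alphas)
```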
Examples: F(α) with Increasing Truncation k; Noise Level η² = 0.0001
UPRE function U_k(α) with increasing k:
[Figure: U_k(α) for the heat and wing test problems, k = 5, 15, …, 125, with the optimal α marked]
Regularization parameter independent of k; unique minimum; U_k(α) increases.
Examples: F(α) with Increasing Truncation k; Noise Level η² = 0.0001
GCV function G_k(α) with increasing k:
[Figure: G_k(α) for the heat and wing test problems, k = 5, 15, …, 125, with the optimal α marked]
Regularization parameter dependent on k; nonunique minimum; G_k(α) generally decreases.
Examples: F(α) with Increasing Truncation k; Noise Level η² = 0.0001
Weighted GCV function WG_k(α) with increasing k:
[Figure: WG_k(α) for the heat and wing test problems, k = 5, 15, …, 125, with the optimal α marked]
Regularization parameter dependent on k; unique minimum; WG_k(α) decreases.
Theorem on UPRE for the FTSVD Regularization
Coefficients of the data: b_i = u_i^T b.
Assumption: there exists ℓ such that E(b_i²) = η² for all i > ℓ, i.e. the coefficients b_i are noise contaminated for i > ℓ.
Define
α_opt = argmin U(α) and α_k = argmin U_k(α).
Theorem (Convergence of α_k for UPRE). Suppose that k = ℓ + p, p > 0. Then the sequence α_k is on average increasing, with lim_{k→r} α_k = α_opt. Furthermore, U_k(α_k) is increasing, with lim_{k→r} U_k(α_k) = U(α_opt).
Theorem on WGCV for the FTSVD Regularization
Weight parameter: ω = (k + 1)/m*.
Define
α_opt = argmin WG(α) and α_k = argmin WG_k(α).
Theorem (Convergence of α_k for WGCV). Suppose that ω = (k + 1)/m*. Then WG_k(α_k) is decreasing, with lim_{k→r} WG_k(α_k) = WG(α_opt).
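A sketch of the weighted GCV functional with this weight (the denominator transcribes the displayed formula on the earlier slide; treat this as an assumption rather than verified code):

```python
import numpy as np

def wgcv(alpha, s, beta, k, m_star):
    omega = (k + 1) / m_star                       # weight from the theorem
    phi = alpha**2 / (alpha**2 + s[:k]**2)
    num = np.sum((phi * beta[:k])**2)
    den = (1 + k - omega * k + omega * np.sum(phi))**2
    return num / den
```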
Truncated Singular Value Decomposition Approximates A
Observations:
1. Find α_k for the TSVD with k terms.
2. Determine the optimal k as α_k converges to α_opt.
A method approximating the SVD with regularization: UPRE, WGCV.
Regularizes the full problem.
But finding the TSVD for large problems is not feasible.
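The two observations translate directly into a stopping rule: enlarge the surrogate until α_k stabilizes. A sketch (the step size, grid, and tolerance are illustrative assumptions):

```python
import numpy as np

def upre(alpha, s, beta, eta2):
    phi = alpha**2 / (alpha**2 + s**2)
    return np.sum((phi * beta)**2) - 2 * eta2 * np.sum(phi)

def choose_k_by_alpha(s, beta, eta2, k_step=5, tol=1e-2):
    """Increase k until alpha_k = argmin U_k stops changing (relative tol)."""
    alphas = np.logspace(-10, 1, 300)
    alpha_prev = None
    for k in range(k_step, len(s) + 1, k_step):
        vals = [upre(a, s[:k], beta[:k], eta2) for a in alphas]
        alpha_k = alphas[int(np.argmin(vals))]
        if alpha_prev is not None and abs(alpha_k - alpha_prev) <= tol * alpha_k:
            return k, alpha_k
        alpha_prev = alpha_k
    return len(s), alpha_prev
```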
Large Scale: The LSQR Iteration; Given k, Defines the Range Space
LSQR: let β_1 := ‖b‖_2, and e_1^{(k+1)} the first column of I_{k+1}.
Generate lower bidiagonal B_k ∈ R^{(k+1)×k} and column-orthonormal H_{k+1} ∈ R^{m×(k+1)}, G_k ∈ R^{n×k}, with
A G_k = H_{k+1} B_k, β_1 H_{k+1} e_1^{(k+1)} = b.
Projected problem on the projected space (standard Tikhonov):
w_k(ζ_k) = argmin_{w ∈ R^k} ‖B_k w − β_1 e_1^{(k+1)}‖_2² + ζ_k² ‖w‖_2².
Projected solution depends on ζ_k^opt: let B_k = Ũ Σ̃ Ṽ^T. Then
x_k(ζ_k^opt) = G_k w_k(ζ_k^opt) = β_1 G_k Σ_{i=1}^{k} γ_i(ζ_k^opt) (ũ_i^T e_1^{(k+1)} / σ̃_i) ṽ_i = Σ_{i=1}^{k} γ_i(ζ_k^opt) (ũ_i^T (H_{k+1}^T b) / σ̃_i) (G_k ṽ)_i.
Approximate SVD: Ã_k = (H_{k+1} Ũ) Σ̃ (G_k Ṽ)^T.
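A sketch of the Golub-Kahan bidiagonalization that builds H_{k+1}, B_k, G_k (no reorthogonalization, so in floating point the columns slowly lose orthogonality; the talk's implementation details are not shown here):

```python
import numpy as np

def golub_kahan(A, b, k):
    """Return H (m x (k+1)), B ((k+1) x k, lower bidiagonal), G (n x k)
    with A G = H B and beta1 * H[:, 0] = b."""
    m, n = A.shape
    H = np.zeros((m, k + 1)); G = np.zeros((n, k)); B = np.zeros((k + 1, k))
    beta1 = np.linalg.norm(b)
    H[:, 0] = b / beta1
    for j in range(k):
        g = A.T @ H[:, j]
        if j > 0:
            g -= B[j, j - 1] * G[:, j - 1]
        B[j, j] = np.linalg.norm(g); G[:, j] = g / B[j, j]
        h = A @ G[:, j] - B[j, j] * H[:, j]
        B[j + 1, j] = np.linalg.norm(h); H[:, j + 1] = h / B[j + 1, j]
    return H, B, G, beta1

# Projected Tikhonov step: with the economy SVD B = U~ S~ V~^T,
# w = sum_i gamma_i(zeta) (u~_i^T (beta1 e1) / s~_i) v~_i and x = G @ w.
```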
The Randomized Singular Value Decomposition: Prototype [HMT11]
Given A ∈ R^{m×n}, target rank k, oversampling parameter p with k + p = kp ≪ m, and power factor q, compute A ≈ Ã_k = Ũ_k Σ̃_k Ṽ_k^T with Ũ_k ∈ R^{m×k}, Σ̃_k ∈ R^{k×k}, Ṽ_k ∈ R^{n×k}.
1: Generate a Gaussian random matrix Ω ∈ R^{n×kp}.
2: Compute Y = AΩ ∈ R^{m×kp}; Y = orth(Y).
3: If q > 0, repeat q times: Y = A(A^T Y); Y = orth(Y). (Power iteration)
4: Form B = Y^T A ∈ R^{kp×n}. (Q = Y)
5: Economy SVD: B = U_B Σ_B V_B^T, U_B ∈ R^{kp×kp}, V_B ∈ R^{n×kp}.
6: Ũ_k = Q U_B(:, 1:k), Ṽ_k = V_B(:, 1:k), Σ̃_k = Σ_B(1:k, 1:k).
Projected RSVD problem:
x_k(μ_k) = argmin_x ‖Ã_k x − b‖_2² + μ_k²‖x‖_2² = Σ_{i=1}^{k} γ_i(μ_k) ((ũ_k)_i^T b / (σ̃_k)_i) (ṽ_k)_i.
Approximate SVD: Ã_k = Ũ_k Σ̃_k Ṽ_k^T.
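The prototype maps almost line-for-line onto numpy; a sketch (QR plays the role of orth, and the defaults are illustrative assumptions):

```python
import numpy as np

def rsvd(A, k, p=10, q=0, seed=None):
    """Randomized SVD: returns Uk (m x k), sk (k,), Vkt (k x n)."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    Omega = rng.standard_normal((n, k + p))            # step 1
    Q, _ = np.linalg.qr(A @ Omega)                     # step 2: orth(A Omega)
    for _ in range(q):                                 # step 3: power iteration
        Q, _ = np.linalg.qr(A @ (A.T @ Q))
    B = Q.T @ A                                        # step 4: (k+p) x n
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)  # step 5: economy SVD
    return (Q @ Ub)[:, :k], s[:k], Vt[:k]              # step 6: truncate to k

# Usage: Uk, sk, Vkt = rsvd(A, k=50, p=50, q=2); A_k = Uk * sk @ Vkt
```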
Summary Comparisons: Rank k
           TSVD                          LSQR                                           RSVD
Model      A_k                           Ã_k                                            Ã_k
SVD        U_k Σ_k V_k^T                 (H_{k+1} Ũ) Σ̃ (G_k Ṽ)^T                        Ũ_k Σ̃_k Ṽ_k^T
Terms      k                             k                                              k
Coeff      γ_i(α_k) (u_i^T b/σ_i) v_i    γ_i(ζ_k) (ũ_i^T(H_{k+1}^T b)/σ̃_i) (G_k ṽ)_i    γ_i(μ_k) ((ũ_k)_i^T b/(σ̃_k)_i) (ṽ_k)_i
‖A − A_k‖  σ_{k+1}                       Theorem for Ã_k                                Theorem for Ã_k
Questions:
1. Determine the optimal k in each case?
2. Determine the optimal α_k, ζ_k, μ_k. How are they related?
3. Do the surrogate subproblems appropriately regularize the full problem?
4. Does this depend on the approximation of the singular space of A?
Theorems on Ã_k Approximation of the Spectral Space: LSQR
Define ν_k = ‖A − Ã_k‖ = ‖δA_k‖. Then:
Theorem ([DDLT91]; σ_i ≠ σ_j; looks at angles between singular vectors). Let ũ_i and u_i (ṽ_i and v_i) be left (right) unit singular vectors of Ã_k and A. For ‖δA_k‖ ≤ ν_k, if 2ν_k < min_{i≠j} |σ_i − σ_j|, then
max(sin Θ(u_i, ũ_i), sin Θ(v_i, ṽ_i)) ≤ ν_k / (min_{i≠j} |σ_i − σ_j| − ν_k) ≤ 1.
Theorem ([Jia16]; for fast decay of singular values, near best rank k). For ℓ with b_ℓ > σ_ℓ and decay rate σ_i = ζρ^{−i}, ρ > 2, Ã_k = H_{k+1} B_k G_k^T is a near best rank-k approximation to A for k = 1, 2, …, ℓ.
Theorems on Ã_k Approximation of the Spectral Space: RSVD
Theorem (Prototype: near best rank-k approximation). For target rank k ≥ 2, oversampling p ≥ 2, k + p ≤ min{m, n}:
E(‖A − QQ^T A‖) ≤ [1 + (4√(k + p)/(p − 1)) · √(min{m, n})] σ_{k+1}
Theorem (Power iteration to force singular values to 0):
E(‖A − Ũ_k Σ̃_k Ṽ_k^T‖) ≤ [1 + 4√(2 min{m, n}/(k − 1))]^{1/(2q+1)} σ_{k+1}
Regularization Estimation for UPRE / WGCV
Theorem ([RVA17]; with assumptions on the approximation of the spectral space for LSQR).
1. α_opt for F^full(α) can be estimated via LSQR.
2. The minimizer of F^proj(ζ_k) is a minimizer of F^full(ζ_k).
3. ζ_k^opt depends on k; α_opt depends on m* := min(m, n).
4. If k* approximates the numerical rank of A, and the right singular space is well approximated, then ζ_{k*}^opt ≈ α_opt for K_{k*}(A^T A, A^T b).
Theorem (with assumptions on the approximation of the spectral space for RSVD; follows from UPRE / WGCV convergence for the FTSVD).
1. α_opt for F^full(α) can be estimated via the RSVD.
2. The minimizer of F^rsvd(μ_k) is a minimizer of F^full(μ_k).
3. μ_k^opt depends on k; α_opt depends on m* := min(m, n).
4. For a good rank-k approximation, μ_k ≈ α_k.
Contrasting the RSVD and LSQR
RSVD: RSVD with standard oversampling (p = k).
RSVDQ: RSVD with power iteration and q = 2 (p = k).
LSQR: Standard LSQR.
LSQRO: Oversample in the LSQR using p = k to find B_{k+p} and its SVD; use the relevant k components of the SVD as for the RSVD.
Aims:
1. Compare running times.
2. Compare spectral approximation.
3. Compare regularization estimation.
Contrasting the RSVD and LSQR Spectrum: Impact of Theory
[Figure: RSVD singular values, true spectrum vs k = 4, 12, 20, 28, 36, 44, 54, 60, log scale]
[Figure: LSQR singular values, same k values]
[Figure: LSQRO singular values, same k values]
[Figure: RSVDQ singular values, same k values]
Contrasting the RSVD and LSQR: Singular Space Approximation
[Figure: rank-k approximation errors for RSVD, RSVDQ, LSQR, LSQRO]
LSQR gives a poor approximation of the singular space; LSQR with oversampling recovers accuracy comparable to the RSVD.
[Figure: running time for RSVD, RSVDQ, LSQR, LSQRO]
LSQRO is expensive.
Contrasting the RSVD and LSQR: Convergence of Regularization
[Figure: relative errors for phillips without regularization, RSVD, RSVDQ, LSQR, LSQRO]
[Figure: relative errors for regularized phillips]
[Figure: regularization parameter for phillips vs k]
α_k converges with k when the singular space is approximated well: RSVD, LSQRO, RSVDQ.
Example Solutions for phillips (Trivial)
[Figure: solutions for increasing k: true, RSVD, LSQR, LSQRO, RSVDQ]
The Regularized Solutions for phillips
[Figure: regularized solutions: true, RSVD, LSQR, LSQRO, RSVDQ]
Parameter k increasing: [4, 12, 20, 28, 36, 44, 52, 60].
Restoration of Grain, Noise Level η² = 0.0001: RestoreTools
[Figure: true image and blurred noisy image, 64×64]
Summary Results: Image Restoration
[Figure: relative errors of the regularized solutions vs k, RSVD, RSVDQ, LSQR, LSQRO]
Relative errors decrease with the TSVD approximation.
[Figure: regularization parameter vs k]
The regularization parameter converges as k increases.
Restored Regularized Solutions, Noise Level η² = 0.0001, k = 1200, UPRE
[Figure: RSVD restoration, 64×64]
[Figure: LSQR restoration, 64×64]
[Figure: LSQRO restoration, 64×64]
[Figure: RSVDQ restoration, 64×64]
Optimal Solutions with RSVD and LSQR for Image Restoration; Noise 5%, Oversampling 25%
[Figure: RSVD UPRE, k = 2000]
[Figure: LSQR UPRE, k = 20]
[Figure: RSVD GCV, k = 2000]
[Figure: LSQR GCV, k = 2000]
Timing
Table: Timings (seconds) to restore the illustrated Grain and Satellite images, oversampling p = 25.
Image     Grain               Satellite
Method    UPRE      GCV       UPRE      GCV
RSVD      54.602    55.646    20.065    19.729
LSQR      1.9909    1761.9    3.0514    1180
LSQR may require a smaller subspace size k.
Notice the solutions are still not good: blurred, lacking resolution.
Iteratively Reweighted Regularization for Edge Preservation [LK83]
‖Ax − b‖² + α²‖L^{(ℓ)}(x^{(ℓ)} − x^{(ℓ−1)})‖²
Minimum Support Stabilizer: regularization operator L^{(ℓ)},
(L^{(ℓ)})_{ii} = ((x_i^{(ℓ−1)} − x_i^{(ℓ−2)})² + β²)^{−1/2}, β > 0.
The parameter β ensures L^{(ℓ)} is invertible.
Invertibility: use (L^{(ℓ)})^{−1} as a right preconditioner for A,
(L^{(ℓ)})^{−1}_{ii} = ((x_i^{(ℓ−1)} − x_i^{(ℓ−2)})² + β²)^{1/4}, β > 0.
Initialization: L^{(0)} = I, x^{(0)} = x_0 (might be 0).
Reduced system: when β = 0 and x_i^{(ℓ−1)} = x_i^{(ℓ−2)}, remove column i; Ã is AL^{−1} with columns removed.
Update equation: solve Ãy ≈ R = b − Ax^{(ℓ−1)}. With correct indexing set y_i = ỹ_i if updated, else y_i = 0; then
x^{(ℓ)} = x^{(ℓ−1)} + y.
The cost of L^{(ℓ)} is minimal.
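A sketch of the outer iteration (a small dense stand-in, not the authors' surrogate-accelerated implementation; fixed α, no reduced-system column removal, and the exponent 1/4 follows the slide):

```python
import numpy as np

def irls_min_support(A, b, alpha, beta, n_outer=10):
    """Iteratively reweighted regularization with minimum-support weights.
    Each outer step solves a right-preconditioned Tikhonov update."""
    m, n = A.shape
    x = np.zeros(n)
    w = np.ones(n)                                    # (L^{(0)})^{-1} = I
    for _ in range(n_outer):
        r = b - A @ x                                 # residual R
        K = np.vstack([A * w[None, :],                # A L^{-1}, stacked with
                       alpha * np.eye(n)])            # the penalty block
        rhs = np.concatenate([r, np.zeros(n)])
        y = np.linalg.lstsq(K, rhs, rcond=None)[0]
        x_prev, x = x, x + w * y                      # map back through L^{-1}
        w = ((x - x_prev) ** 2 + beta ** 2) ** 0.25   # next inverse weights
    return x
```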
Magnetic Data: m = 5000, n = 75000, β² = 10⁻⁹, p = 10%
[Figure: 3D susceptibility volumes: true, LSQR (k = 5) UPRE, and RSVD (k = 1000) UPRE; axes Easting (m), Northing (m), Depth (m)]
[Figure: depth slices (a)-(c) of the true model, susceptibility 0-0.06 SI]
LSQR slices: time 8.9566 seconds, k = 5
[Figure: depth slices (a)-(c) of the LSQR reconstruction]
RSVD slices: time 1681.8 seconds, k = 1000
[Figure: depth slices (a)-(c) of the RSVD reconstruction]
Magnetic Data: m = 5000, n = 75000, β² = 10⁻⁹, p = 10%
LSQRO GCV, k = 800
[Figure: depth slices (a)-(c) of the LSQRO reconstruction, susceptibility 0-0.06 SI]
RSVD slices, GCV at k = 1000
[Figure: depth slices (a)-(c) of the RSVD reconstruction]
LSQR slices, GCV at k = 5
[Figure: depth slices (a)-(c) of the LSQR reconstruction]
Conclusions: RSVD - LSQR
UPRE / WGCV converges for the TSVD.
UPRE / WGCV therefore converges for the RSVD.
UPRE / WGCV converges for LSQR with oversampling.
ζ_k^opt, α_opt, μ_k^opt are related across levels for RSVD, RSVDQ and LSQRO.
Regularization: find the optimal parameter for the reduced subspace surrogate model and apply it for a larger number of terms.
LSQR: run with oversampling to avoid issues of semi-convergence, but this is expensive.
RSVD or LSQR: the results suggest an advantage of the RSVD (speed!) and a disadvantage (not reflecting the full spectrum).
Overview Conclusions
Low rank: finding a low-rank approximation of the model matrix is important (TRUNCATION).
Benefits: low rank saves computational cost and memory.
Regularization: given the low-rank approximation, estimate the regularization parameter efficiently (FILTER).
Cost: while LSQR costs more per iteration, it converges faster in the context of L1 and is cheaper.
RSVD / LSQR: trade-offs depend on the speed with which the singular values decrease.
RSVD: note the algorithm is modified for m < n.
Some key references
[DDLT91] P. Deift, J. Demmel, L.-C. Li, and C. Tomei. The bidiagonal singular value decomposition and Hamiltonian mechanics. SIAM Journal on Numerical Analysis, 28(5):1463-1516, 1991.
[HMT11] N. Halko, P. G. Martinsson, and J. A. Tropp. Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions. SIAM Review, 53(2):217-288, 2011.
[Jia16] Z. Jia. The regularization theory of the Krylov iterative solvers LSQR, CGLS, LSMR and CGME for linear discrete ill-posed problems. ArXiv e-prints, August 2016.
[LK83] B. J. Last and K. Kubik. Compact gravity inversion. Geophysics, 48(6):713-721, 1983.
[RVA17] R. A. Renaut, S. Vatankhah, and V. E. Ardestani. Hybrid and iteratively reweighted regularization by unbiased predictive risk and weighted GCV for projected systems. SIAM J. Sci. Comput., 39:B221-B243, 2017.
THE END
Thank you. Questions?