Hybrid LSQR and RSVD Solutions of Ill-Posed Least Squares Problems
Rosemary Renaut¹, Anthony Helmstetter¹, Saeed Vatankhah²
1: School of Mathematical and Statistical Sciences, Arizona State University, [email protected], [email protected]
2: Institute of Geophysics, University of Tehran, [email protected]
Syracuse University
Outline
Aims
Motivation Example: Large Scale Gravity Inversion
Background: SVD for the Small Scale; Standard Approaches to Estimate the Regularization Parameter
Methods for the Large Scale: Approximating the SVD; Krylov: Golub-Kahan Bidiagonalization (LSQR); Randomized SVD
Simulations: 1D Contrasting RSVD and LSQR; Two-Dimensional Examples
Extension: Iteratively Reweighted Regularization [LK83]; Undersampled 3D Magnetic: Approximate L1 Regularization
Conclusions: RSVD - LSQR
Research Aims: Efficient Solvers for 3D Inversion
1. Effective regularization parameter estimation using a small-scale surrogate for the large-scale problem
2. Derive surrogates from Krylov projection (LSQR)
3. Derive surrogates from the Randomized Singular Value Decomposition (RSVD)
4. Effective implementation of L1 and Lp regularization using surrogate initialization
Explain, hopefully, what these statements mean.
Motivation Example: Large Scale 3D Magnetic Anomaly Inversion
Observation point r = (x, y, z); vertical magnetic anomaly m(r):
m(r) ∝ ∫_Ω κ(r′) (r′ − r)/|r′ − r|³ dΩ′
Susceptibility κ(r′) at r′ = (x′, y′, z′).
Linear relation: m = Gκ.
Aim: given surface observations m_ij, find the susceptibility κ_ijk.
Underdetermined: 5000 measurements, 75000 unknowns.
Practical approaches for large-scale ill-posed problems are needed.
Ill-Posed Problem: Example Solutions, Image Restoration
[Figure: three 256×256 image panels: true, contaminated, and naive restoration]
Background: Spectral Decomposition of the Solution via the SVD
Consider the general discrete problem
Ax = b, A ∈ R^{m×n}, b ∈ R^m, x ∈ R^n.
Singular value decomposition (SVD) of A with rank r ≤ min(m, n):
A = UΣV^T = Σ_{i=1}^{r} σ_i u_i v_i^T, Σ = diag(σ_1, …, σ_r).
Singular values σ_i = √λ_i, with λ_i the eigenvalues of A^T A.
Singular vectors u_i, v_i: R(A) = span(U(:, 1:r)), N(A) = span(V(:, r+1:n)).
Expansion for the solution:
x = Σ_{i=1}^{r} (u_i^T b / σ_i) v_i
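To make the expansion concrete, here is a minimal numerical sketch (not the talk's code; the Gaussian-kernel test operator, grid size, and noise level are illustrative assumptions) showing how the naive solution x = Σ (u_i^T b/σ_i) v_i degrades once noise meets small σ_i:

```python
import numpy as np

# Illustrative ill-posed operator: a discretized Gaussian smoothing kernel.
n = 64
t = np.linspace(0, 1, n)
A = np.exp(-((t[:, None] - t[None, :]) ** 2) / 0.01) / n

x_true = np.sin(np.pi * t)                       # smooth true model
rng = np.random.default_rng(0)
b = A @ x_true + 1e-4 * rng.standard_normal(n)   # noisy data

U, s, Vt = np.linalg.svd(A)                      # A = U diag(s) V^T
x_naive = Vt.T @ ((U.T @ b) / s)                 # sum_i (u_i^T b / sigma_i) v_i
print("cond(A) =", s[0] / s[-1])
print("naive relative error:",
      np.linalg.norm(x_naive - x_true) / np.linalg.norm(x_true))
```

Even with 0.01% noise the naive relative error is enormous, because the trailing coefficients u_i^T b/σ_i are dominated by amplified noise.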
The Picard Plot: Examines the Solution
x = Σ_{i=1}^{r} (u_i^T b / σ_i) v_i
[Figure: Picard plot on a log scale showing σ_i, |u_i^T b|, and |u_i^T b|/σ_i against index i]
A solution with these coefficients would not be feasible.
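A Picard plot is cheap to produce once the SVD is available; a sketch under the same illustrative test operator as above (an assumption, not the talk's setup):

```python
import numpy as np
import matplotlib.pyplot as plt

n = 64
t = np.linspace(0, 1, n)
A = np.exp(-((t[:, None] - t[None, :]) ** 2) / 0.01) / n
b = A @ np.sin(np.pi * t) + 1e-4 * np.random.default_rng(0).standard_normal(n)

U, s, _ = np.linalg.svd(A)
beta = np.abs(U.T @ b)                  # |u_i^T b|
idx = np.arange(1, n + 1)
plt.semilogy(idx, s, 'o-', label=r'$\sigma_i$')
plt.semilogy(idx, beta, 's-', label=r'$|u_i^T b|$')
# Where beta levels off at the noise floor while s keeps decaying,
# the ratio below blows up: the Picard condition fails.
plt.semilogy(idx, beta / s, '^-', label=r'$|u_i^T b|/\sigma_i$')
plt.xlabel('index i'); plt.legend(); plt.title('Picard plot')
plt.show()
```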
Regularization: Solutions Are Not Stable
Truncation: k < r; the surrogate problem has size k, A_k ≈ A:
x = Σ_{i=1}^{k} (u_i^T b / σ_i) v_i
Filtering: with filter factors γ_i:
x = Σ_{i=1}^{r} γ_i(α) (u_i^T b / σ_i) v_i
Filtered truncated: k and γ_i; the surrogate problem has size k, A_k ≈ A:
x = Σ_{i=1}^{k} γ_i(α) (u_i^T b / σ_i) v_i
Here k and α are regularization parameters, and γ_i is the filter function.
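All three options fit in one small routine; a sketch (the helper name and interface are illustrative assumptions): truncation is γ_i ≡ 1 with k < r, filtering keeps all r terms, and the filtered truncated form combines both.

```python
import numpy as np

def filtered_solution(U, s, Vt, b, k, gamma=None):
    """x = sum_{i=1}^{k} gamma_i (u_i^T b / sigma_i) v_i.
    gamma=None gives plain truncation (gamma_i = 1)."""
    beta = U[:, :k].T @ b                  # coefficients u_i^T b
    coeff = beta / s[:k]
    if gamma is not None:
        coeff = gamma[:k] * coeff
    return Vt[:k].T @ coeff

# Usage: x_tsvd = filtered_solution(U, s, Vt, b, k=10)
#        x_filt = filtered_solution(U, s, Vt, b, k=len(s),
#                                   gamma=s**2 / (s**2 + alpha**2))
```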
Tikhonov Regularization: Regularization Parameter α
Filter functions:
γ_i(α) = σ_i² / (σ_i² + α²), i = 1, …, r.
Solves the standard form:
x(α) = argmin_x ‖b − Ax‖² + α²‖x‖² ≈ argmin_x ‖b − A_k x‖² + α²‖x‖²
Generalized Tikhonov has an operator L:
x(α) = argmin_x ‖b − A_k x‖² + α²‖Lx‖²
Solve with the standard form if L is invertible. Automatic estimation of α is desirable.
How to efficiently solve and find α_opt for large-scale problems?
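A minimal sketch of both forms (assuming L is invertible; the stacked least-squares formulation is one standard way to solve them, not necessarily the authors' solver):

```python
import numpy as np

def tikhonov_standard(A, b, alpha):
    # argmin ||b - A x||^2 + alpha^2 ||x||^2 via the stacked system
    m, n = A.shape
    K = np.vstack([A, alpha * np.eye(n)])
    rhs = np.concatenate([b, np.zeros(n)])
    return np.linalg.lstsq(K, rhs, rcond=None)[0]

def tikhonov_general(A, b, L, alpha):
    # argmin ||b - A x||^2 + alpha^2 ||L x||^2, L invertible:
    # substitute y = L x and solve the standard form with A L^{-1}
    y = tikhonov_standard(A @ np.linalg.inv(L), b, alpha)
    return np.linalg.solve(L, y)
```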
Regularization Parameter Estimation: Minimize F(α) for Some F
Introduce φ_i(α) = α²/(α² + σ_i²) = 1 − γ_i(α), i = 1:r, with γ_i = 0 for i > k.
Unbiased Predictive Risk (UPRE): minimize the functional, with noise level η²:
U_k(α) = Σ_{i=1}^{k} φ_i²(α)(u_i^T b)² − 2η² Σ_{i=1}^{k} φ_i(α)
GCV: minimize the rational function, with m* = min(m, n):
G(α) = (Σ_{i=1}^{m*} φ_i²(α)(u_i^T b)²) / (Σ_{i=1}^{m*} φ_i(α))²
WGCV: minimize
WG(α) = (Σ_{i=1}^{m*} φ_i²(α)(u_i^T b)²) / (1 + k − ωk + ω Σ_{i=1}^{k} φ_i(α))²
How does α_opt = argmin F(α) depend on k?
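Once the coefficients β_i = u_i^T b are available, these functionals are cheap to evaluate; a grid-search sketch (the log-spaced grid and function names are illustrative assumptions, and the functional forms transcribe the formulas above):

```python
import numpy as np

def upre(alpha, s, beta, eta2, k):
    phi = alpha**2 / (alpha**2 + s[:k]**2)        # phi_i = 1 - gamma_i
    return np.sum((phi * beta[:k])**2) - 2 * eta2 * np.sum(phi)

def gcv(alpha, s, beta):
    phi = alpha**2 / (alpha**2 + s**2)
    return np.sum((phi * beta)**2) / np.sum(phi)**2

def argmin_on_grid(F, alphas):
    return alphas[int(np.argmin([F(a) for a in alphas]))]

# Usage with beta = U.T @ b from the (approximate) SVD:
# alphas = np.logspace(-8, 1, 200)
# alpha_upre = argmin_on_grid(lambda a: upre(a, s, beta, eta2, k), alphas)
# alpha_gcv  = argmin_on_grid(lambda a: gcv(a, s, beta), alphas)
```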
Examples: F(α) with Increasing Truncation k; Noise Level η² = 0.0001
UPRE function U_k(α) with increasing k:
[Figure: U_k(α) for the heat and wing test problems, k = 5, 15, …, 125, with the optimal α marked]
Regularization parameter independent of k; unique minimum; U_k(α) increases.
Examples: F(α) with Increasing Truncation k; Noise Level η² = 0.0001
GCV function G_k(α) with increasing k:
[Figure: G_k(α) for the heat and wing test problems, k = 5, 15, …, 125, with the optimal α marked]
Regularization parameter dependent on k; nonunique minimum; G_k(α) generally decreases.
Examples: F(α) with Increasing Truncation k; Noise Level η² = 0.0001
Weighted GCV function WG_k(α) with increasing k:
[Figure: WG_k(α) for the heat and wing test problems, k = 5, 15, …, 125, with the optimal α marked]
Regularization parameter dependent on k; unique minimum; WG_k(α) decreases.
Theorem on UPRE for the FTSVD Regularization
Coefficients of the data: b_i = u_i^T b.
Assumption: there exists ℓ such that E(b_i²) = η² for all i > ℓ, i.e. the coefficients b_i are noise contaminated for i > ℓ.
Define
α_opt = argmin U(α) and α_k = argmin U_k(α).
Theorem (Convergence of α_k for UPRE). Suppose that k = ℓ + p, p > 0. Then the sequence α_k is on average increasing, with lim_{k→r} α_k = α_opt. Furthermore, U_k(α_k) is increasing, with lim_{k→r} U_k(α_k) = U(α_opt).
Theorem on WGCV for the FTSVD Regularization
Weight parameter: ω = (k + 1)/m*.
Define
α_opt = argmin WG(α) and α_k = argmin WG_k(α).
Theorem (Convergence of α_k for WGCV). Suppose that ω = (k + 1)/m*. Then WG_k(α_k) is decreasing, with lim_{k→r} WG_k(α_k) = WG(α_opt).
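A sketch of the weighted GCV functional with this weight (the denominator transcribes the displayed formula on the earlier slide; treat this as an assumption rather than verified code):

```python
import numpy as np

def wgcv(alpha, s, beta, k, m_star):
    omega = (k + 1) / m_star                       # weight from the theorem
    phi = alpha**2 / (alpha**2 + s[:k]**2)
    num = np.sum((phi * beta[:k])**2)
    den = (1 + k - omega * k + omega * np.sum(phi))**2
    return num / den
```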
Truncated Singular Value Decomposition Approximates A
Observations:
1. Find α_k for the TSVD with k terms.
2. Determine the optimal k as α_k converges to α_opt.
A method approximating the SVD with regularization: UPRE, WGCV.
Regularizes the full problem.
But finding the TSVD for large problems is not feasible.
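The two observations translate directly into a stopping rule: enlarge the surrogate until α_k stabilizes. A sketch (the step size, grid, and tolerance are illustrative assumptions):

```python
import numpy as np

def upre(alpha, s, beta, eta2):
    phi = alpha**2 / (alpha**2 + s**2)
    return np.sum((phi * beta)**2) - 2 * eta2 * np.sum(phi)

def choose_k_by_alpha(s, beta, eta2, k_step=5, tol=1e-2):
    """Increase k until alpha_k = argmin U_k stops changing (relative tol)."""
    alphas = np.logspace(-10, 1, 300)
    alpha_prev = None
    for k in range(k_step, len(s) + 1, k_step):
        vals = [upre(a, s[:k], beta[:k], eta2) for a in alphas]
        alpha_k = alphas[int(np.argmin(vals))]
        if alpha_prev is not None and abs(alpha_k - alpha_prev) <= tol * alpha_k:
            return k, alpha_k
        alpha_prev = alpha_k
    return len(s), alpha_prev
```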
Large Scale: The LSQR Iteration; Given k, Defines the Range Space
LSQR: let β_1 := ‖b‖_2, and e_1^{(k+1)} the first column of I_{k+1}.
Generate lower bidiagonal B_k ∈ R^{(k+1)×k} and column-orthonormal H_{k+1} ∈ R^{m×(k+1)}, G_k ∈ R^{n×k}, with
A G_k = H_{k+1} B_k, β_1 H_{k+1} e_1^{(k+1)} = b.
Projected problem on the projected space (standard Tikhonov):
w_k(ζ_k) = argmin_{w ∈ R^k} ‖B_k w − β_1 e_1^{(k+1)}‖_2² + ζ_k² ‖w‖_2².
Projected solution depends on ζ_k^opt: let B_k = Ũ Σ̃ Ṽ^T. Then
x_k(ζ_k^opt) = G_k w_k(ζ_k^opt) = β_1 G_k Σ_{i=1}^{k} γ_i(ζ_k^opt) (ũ_i^T e_1^{(k+1)} / σ̃_i) ṽ_i = Σ_{i=1}^{k} γ_i(ζ_k^opt) (ũ_i^T (H_{k+1}^T b) / σ̃_i) (G_k ṽ)_i.
Approximate SVD: Ã_k = (H_{k+1} Ũ) Σ̃ (G_k Ṽ)^T.
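A sketch of the Golub-Kahan bidiagonalization that builds H_{k+1}, B_k, G_k (no reorthogonalization, so in floating point the columns slowly lose orthogonality; the talk's implementation details are not shown here):

```python
import numpy as np

def golub_kahan(A, b, k):
    """Return H (m x (k+1)), B ((k+1) x k, lower bidiagonal), G (n x k)
    with A G = H B and beta1 * H[:, 0] = b."""
    m, n = A.shape
    H = np.zeros((m, k + 1)); G = np.zeros((n, k)); B = np.zeros((k + 1, k))
    beta1 = np.linalg.norm(b)
    H[:, 0] = b / beta1
    for j in range(k):
        g = A.T @ H[:, j]
        if j > 0:
            g -= B[j, j - 1] * G[:, j - 1]
        B[j, j] = np.linalg.norm(g); G[:, j] = g / B[j, j]
        h = A @ G[:, j] - B[j, j] * H[:, j]
        B[j + 1, j] = np.linalg.norm(h); H[:, j + 1] = h / B[j + 1, j]
    return H, B, G, beta1

# Projected Tikhonov step: with the economy SVD B = U~ S~ V~^T,
# w = sum_i gamma_i(zeta) (u~_i^T (beta1 e1) / s~_i) v~_i and x = G @ w.
```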
The Randomized Singular Value Decomposition: Prototype [HMT11]
Given A ∈ R^{m×n}, target rank k, oversampling parameter p with k + p = kp ≪ m, and power factor q, compute A ≈ Ã_k = Ũ_k Σ̃_k Ṽ_k^T with Ũ_k ∈ R^{m×k}, Σ̃_k ∈ R^{k×k}, Ṽ_k ∈ R^{n×k}.
1: Generate a Gaussian random matrix Ω ∈ R^{n×kp}.
2: Compute Y = AΩ ∈ R^{m×kp}; Y = orth(Y).
3: If q > 0, repeat q times: Y = A(A^T Y); Y = orth(Y). (Power iteration)
4: Form B = Y^T A ∈ R^{kp×n}. (Q = Y)
5: Economy SVD: B = U_B Σ_B V_B^T, U_B ∈ R^{kp×kp}, V_B ∈ R^{n×kp}.
6: Ũ_k = Q U_B(:, 1:k), Ṽ_k = V_B(:, 1:k), Σ̃_k = Σ_B(1:k, 1:k).
Projected RSVD problem:
x_k(μ_k) = argmin_x ‖Ã_k x − b‖_2² + μ_k²‖x‖_2² = Σ_{i=1}^{k} γ_i(μ_k) ((ũ_k)_i^T b / (σ̃_k)_i) (ṽ_k)_i.
Approximate SVD: Ã_k = Ũ_k Σ̃_k Ṽ_k^T.
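The prototype maps almost line-for-line onto numpy; a sketch (QR plays the role of orth, and the defaults are illustrative assumptions):

```python
import numpy as np

def rsvd(A, k, p=10, q=0, seed=None):
    """Randomized SVD: returns Uk (m x k), sk (k,), Vkt (k x n)."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    Omega = rng.standard_normal((n, k + p))            # step 1
    Q, _ = np.linalg.qr(A @ Omega)                     # step 2: orth(A Omega)
    for _ in range(q):                                 # step 3: power iteration
        Q, _ = np.linalg.qr(A @ (A.T @ Q))
    B = Q.T @ A                                        # step 4: (k+p) x n
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)  # step 5: economy SVD
    return (Q @ Ub)[:, :k], s[:k], Vt[:k]              # step 6: truncate to k

# Usage: Uk, sk, Vkt = rsvd(A, k=50, p=50, q=2); A_k = Uk * sk @ Vkt
```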
Summary Comparisons: Rank k
           TSVD                          LSQR                                           RSVD
Model      A_k                           Ã_k                                            Ã_k
SVD        U_k Σ_k V_k^T                 (H_{k+1} Ũ) Σ̃ (G_k Ṽ)^T                        Ũ_k Σ̃_k Ṽ_k^T
Terms      k                             k                                              k
Coeff      γ_i(α_k) (u_i^T b/σ_i) v_i    γ_i(ζ_k) (ũ_i^T(H_{k+1}^T b)/σ̃_i) (G_k ṽ)_i    γ_i(μ_k) ((ũ_k)_i^T b/(σ̃_k)_i) (ṽ_k)_i
‖A − A_k‖  σ_{k+1}                       Theorem for Ã_k                                Theorem for Ã_k
Questions:
1. Determine the optimal k in each case?
2. Determine the optimal α_k, ζ_k, μ_k. How are they related?
3. Do the surrogate subproblems appropriately regularize the full problem?
4. Does this depend on the approximation of the singular space of A?
Theorems on Ã_k Approximation of the Spectral Space: LSQR
Define ν_k = ‖A − Ã_k‖ = ‖δA_k‖. Then:
Theorem ([DDLT91]; σ_i ≠ σ_j; looks at angles between singular vectors). Let ũ_i and u_i (ṽ_i and v_i) be left (right) unit singular vectors of Ã_k and A. For ‖δA_k‖ ≤ ν_k, if 2ν_k < min_{i≠j} |σ_i − σ_j|, then
max(sin Θ(u_i, ũ_i), sin Θ(v_i, ṽ_i)) ≤ ν_k / (min_{i≠j} |σ_i − σ_j| − ν_k) ≤ 1.
Theorem ([Jia16]; for fast decay of singular values, near best rank k). For ℓ with b_ℓ > σ_ℓ and decay rate σ_i = ζρ^{−i}, ρ > 2, Ã_k = H_{k+1} B_k G_k^T is a near best rank-k approximation to A for k = 1, 2, …, ℓ.
Theorems on Ã_k Approximation of the Spectral Space: RSVD
Theorem (Prototype: near best rank-k approximation). For target rank k ≥ 2, oversampling p ≥ 2, k + p ≤ min{m, n}:
E(‖A − QQ^T A‖) ≤ [1 + (4√(k + p)/(p − 1)) · √(min{m, n})] σ_{k+1}
Theorem (Power iteration to force singular values to 0):
E(‖A − Ũ_k Σ̃_k Ṽ_k^T‖) ≤ [1 + 4√(2 min{m, n}/(k − 1))]^{1/(2q+1)} σ_{k+1}
Regularization Estimation for UPRE / WGCV
Theorem ([RVA17]; with assumptions on the approximation of the spectral space for LSQR).
1. α_opt for F^full(α) can be estimated via LSQR.
2. The minimizer of F^proj(ζ_k) is a minimizer of F^full(ζ_k).
3. ζ_k^opt depends on k; α_opt depends on m* := min(m, n).
4. If k* approximates the numerical rank of A, and the right singular space is well approximated, then ζ_{k*}^opt ≈ α_opt for K_{k*}(A^T A, A^T b).
Theorem (with assumptions on the approximation of the spectral space for RSVD; follows from UPRE / WGCV convergence for the FTSVD).
1. α_opt for F^full(α) can be estimated via the RSVD.
2. The minimizer of F^rsvd(μ_k) is a minimizer of F^full(μ_k).
3. μ_k^opt depends on k; α_opt depends on m* := min(m, n).
4. For a good rank-k approximation, μ_k ≈ α_k.
Contrasting the RSVD and LSQR
RSVD: RSVD with standard oversampling (p = k).
RSVDQ: RSVD with power iteration and q = 2 (p = k).
LSQR: Standard LSQR.
LSQRO: Oversample in the LSQR using p = k to find B_{k+p} and its SVD; use the relevant k components of the SVD as for the RSVD.
Aims:
1. Compare running times.
2. Compare spectral approximation.
3. Compare regularization estimation.
Contrasting the RSVD and LSQR Spectrum: Impact of Theory
[Figure: RSVD singular values, true spectrum vs k = 4, 12, 20, 28, 36, 44, 54, 60, log scale]
[Figure: LSQR singular values, same k values]
[Figure: LSQRO singular values, same k values]
[Figure: RSVDQ singular values, same k values]
Contrasting the RSVD and LSQR: Singular Space Approximation
[Figure: rank-k approximation errors for RSVD, RSVDQ, LSQR, LSQRO]
LSQR gives a poor approximation of the singular space; LSQR with oversampling recovers accuracy comparable to the RSVD.
[Figure: running time for RSVD, RSVDQ, LSQR, LSQRO]
LSQRO is expensive.
Contrasting the RSVD and LSQR: Convergence of Regularization
[Figure: relative errors for phillips without regularization, RSVD, RSVDQ, LSQR, LSQRO]
[Figure: relative errors for regularized phillips]
[Figure: regularization parameter for phillips vs k]
α_k converges with k when the singular space is approximated well: RSVD, LSQRO, RSVDQ.
Example Solutions for phillips (Trivial)
[Figure: solutions for increasing k: true, RSVD, LSQR, LSQRO, RSVDQ]
The Regularized Solutions for phillips
[Figure: regularized solutions: true, RSVD, LSQR, LSQRO, RSVDQ]
Parameter k increasing: [4, 12, 20, 28, 36, 44, 52, 60].
Restoration of Grain, Noise Level η² = 0.0001: RestoreTools
[Figure: true image and blurred noisy image, 64×64]
Summary Results: Image Restoration
[Figure: relative errors of the regularized solutions vs k, RSVD, RSVDQ, LSQR, LSQRO]
Relative errors decrease with the TSVD approximation.
[Figure: regularization parameter vs k]
The regularization parameter converges as k increases.
Restored Regularized Solutions, Noise Level η² = 0.0001, k = 1200, UPRE
[Figure: RSVD restoration, 64×64]
[Figure: LSQR restoration, 64×64]
[Figure: LSQRO restoration, 64×64]
[Figure: RSVDQ restoration, 64×64]
Optimal Solutions with RSVD and LSQR for Image Restoration; Noise 5%, Oversampling 25%
[Figure: RSVD UPRE, k = 2000]
[Figure: LSQR UPRE, k = 20]
[Figure: RSVD GCV, k = 2000]
[Figure: LSQR GCV, k = 2000]
Timing
Table: Timings (seconds) to restore the illustrated Grain and Satellite images, oversampling p = 25.
Image     Grain               Satellite
Method    UPRE      GCV       UPRE      GCV
RSVD      54.602    55.646    20.065    19.729
LSQR      1.9909    1761.9    3.0514    1180
LSQR may require a smaller subspace size k.
Notice the solutions are still not good: blurred, lacking resolution.
Iteratively Reweighted Regularization for Edge Preservation [LK83]
‖Ax − b‖² + α²‖L^{(ℓ)}(x^{(ℓ)} − x^{(ℓ−1)})‖²
Minimum Support Stabilizer: regularization operator L^{(ℓ)},
(L^{(ℓ)})_{ii} = ((x_i^{(ℓ−1)} − x_i^{(ℓ−2)})² + β²)^{−1/2}, β > 0.
The parameter β ensures L^{(ℓ)} is invertible.
Invertibility: use (L^{(ℓ)})^{−1} as a right preconditioner for A,
(L^{(ℓ)})^{−1}_{ii} = ((x_i^{(ℓ−1)} − x_i^{(ℓ−2)})² + β²)^{1/4}, β > 0.
Initialization: L^{(0)} = I, x^{(0)} = x_0 (might be 0).
Reduced system: when β = 0 and x_i^{(ℓ−1)} = x_i^{(ℓ−2)}, remove column i; Ã is AL^{−1} with columns removed.
Update equation: solve Ãy ≈ R = b − Ax^{(ℓ−1)}. With correct indexing set y_i = ỹ_i if updated, else y_i = 0; then
x^{(ℓ)} = x^{(ℓ−1)} + y.
The cost of L^{(ℓ)} is minimal.
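A sketch of the outer iteration (a small dense stand-in, not the authors' surrogate-accelerated implementation; fixed α, no reduced-system column removal, and the exponent 1/4 follows the slide):

```python
import numpy as np

def irls_min_support(A, b, alpha, beta, n_outer=10):
    """Iteratively reweighted regularization with minimum-support weights.
    Each outer step solves a right-preconditioned Tikhonov update."""
    m, n = A.shape
    x = np.zeros(n)
    w = np.ones(n)                                    # (L^{(0)})^{-1} = I
    for _ in range(n_outer):
        r = b - A @ x                                 # residual R
        K = np.vstack([A * w[None, :],                # A L^{-1}, stacked with
                       alpha * np.eye(n)])            # the penalty block
        rhs = np.concatenate([r, np.zeros(n)])
        y = np.linalg.lstsq(K, rhs, rcond=None)[0]
        x_prev, x = x, x + w * y                      # map back through L^{-1}
        w = ((x - x_prev) ** 2 + beta ** 2) ** 0.25   # next inverse weights
    return x
```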
Magnetic Data: m = 5000, n = 75000, β² = 10⁻⁹, p = 10%
[Figure: 3D susceptibility volumes: true, LSQR (k = 5) UPRE, and RSVD (k = 1000) UPRE; axes Easting (m), Northing (m), Depth (m)]
[Figure: depth slices (a)-(c) of the true model, susceptibility 0-0.06 SI]
LSQR slices: time 8.9566 seconds, k = 5
[Figure: depth slices (a)-(c) of the LSQR reconstruction]
RSVD slices: time 1681.8 seconds, k = 1000
[Figure: depth slices (a)-(c) of the RSVD reconstruction]
Magnetic Data: m = 5000, n = 75000, β² = 10⁻⁹, p = 10%
LSQRO GCV, k = 800
[Figure: depth slices (a)-(c) of the LSQRO reconstruction, susceptibility 0-0.06 SI]
RSVD slices, GCV at k = 1000
[Figure: depth slices (a)-(c) of the RSVD reconstruction]
LSQR slices, GCV at k = 5
[Figure: depth slices (a)-(c) of the LSQR reconstruction]
Conclusions: RSVD - LSQR
UPRE / WGCV converges for the TSVD.
UPRE / WGCV therefore converges for the RSVD.
UPRE / WGCV converges for LSQR with oversampling.
ζ_k^opt, α_opt, μ_k^opt are related across levels for RSVD, RSVDQ and LSQRO.
Regularization: find the optimal parameter for the reduced subspace surrogate model and apply it for a larger number of terms.
LSQR: run with oversampling to avoid issues of semi-convergence, but this is expensive.
RSVD or LSQR: the results suggest an advantage of the RSVD (speed!) and a disadvantage (not reflecting the full spectrum).
Overview Conclusions
Low rank: finding a low-rank approximation of the model matrix is important (TRUNCATION).
Benefits: low rank saves computational cost and memory.
Regularization: given the low-rank approximation, estimate the regularization parameter efficiently (FILTER).
Cost: while LSQR costs more per iteration, it converges faster in the context of L1 and is cheaper.
RSVD / LSQR: trade-offs depend on the speed with which the singular values decrease.
RSVD: note the algorithm is modified for m < n.
Some key references
[DDLT91] P. Deift, J. Demmel, L.-C. Li, and C. Tomei. The bidiagonal singular value decomposition and Hamiltonian mechanics. SIAM Journal on Numerical Analysis, 28(5):1463-1516, 1991.
[HMT11] N. Halko, P. G. Martinsson, and J. A. Tropp. Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions. SIAM Review, 53(2):217-288, 2011.
[Jia16] Z. Jia. The regularization theory of the Krylov iterative solvers LSQR, CGLS, LSMR and CGME for linear discrete ill-posed problems. ArXiv e-prints, August 2016.
[LK83] B. J. Last and K. Kubik. Compact gravity inversion. Geophysics, 48(6):713-721, 1983.
[RVA17] R. A. Renaut, S. Vatankhah, and V. E. Ardestani. Hybrid and iteratively reweighted regularization by unbiased predictive risk and weighted GCV for projected systems. SIAM J. Sci. Comput., 39:B221-B243, 2017.
THE END
Thank you. Questions?