Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees


description

Course given at "Journées de géométrie algorithmique", JGA 2013.

Transcript of Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Page 1: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Gabriel Peyré

www.numerical-tours.com

Low Complexity Regularization of Inverse Problems

Course #2 Recovery Guarantees

Page 2: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Overview of the Course

• Course #1: Inverse Problems

• Course #2: Recovery Guarantees

• Course #3: Proximal Splitting Methods

Page 3: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Overview

• Low-complexity Regularization with Gauges

• Performance Guarantees

• Grid-free Regularization

Page 4: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Inverse Problem Regularization

Estimator: x(y) depends only on the observations y and the parameter λ.

Observations: y = Φ x₀ + w ∈ ℝ^P.

Page 5: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Inverse Problem Regularization

Estimator: x(y) depends only on the observations y and the parameter λ.

Example: variational methods

x(y) ∈ argmin_{x ∈ ℝ^N} (1/2)‖y − Φx‖² + λ J(x)
(data fidelity + regularity)

Observations: y = Φ x₀ + w ∈ ℝ^P.

Page 6: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Inverse Problem Regularization

Estimator: x(y) depends only on the observations y and the parameter λ.

Example: variational methods

x(y) ∈ argmin_{x ∈ ℝ^N} (1/2)‖y − Φx‖² + λ J(x)
(data fidelity + regularity)

Observations: y = Φ x₀ + w ∈ ℝ^P.

Choice of λ: tradeoff between the noise level ‖w‖ and the regularity J(x₀) of x₀.

Page 7: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Inverse Problem Regularization

Estimator: x(y) depends only on the observations y and the parameter λ.

Example: variational methods

x(y) ∈ argmin_{x ∈ ℝ^N} (1/2)‖y − Φx‖² + λ J(x)
(data fidelity + regularity)

Observations: y = Φ x₀ + w ∈ ℝ^P.

Choice of λ: tradeoff between the noise level ‖w‖ and the regularity J(x₀) of x₀.

No noise: λ → 0⁺, minimize x⋆ ∈ argmin_{Φx = y} J(x).

Page 8: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Inverse Problem Regularization

Estimator: x(y) depends only on the observations y and the parameter λ.

Example: variational methods

x(y) ∈ argmin_{x ∈ ℝ^N} (1/2)‖y − Φx‖² + λ J(x)
(data fidelity + regularity)

Observations: y = Φ x₀ + w ∈ ℝ^P.

Choice of λ: tradeoff between the noise level ‖w‖ and the regularity J(x₀) of x₀.

No noise: λ → 0⁺, minimize x⋆ ∈ argmin_{Φx = y} J(x).

This course: performance analysis; fast computational schemes.
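To make the variational estimator concrete, here is a minimal numerical sketch (not part of the slides) that solves the ℓ¹ case, min_x (1/2)‖y − Φx‖² + λ‖x‖₁, by iterative soft-thresholding (forward-backward splitting); the Gaussian Φ, the sparse x₀ and the noise level are synthetic placeholders.

```python
import numpy as np

def soft_threshold(u, t):
    """Proximal operator of t*||.||_1 (entrywise soft-thresholding)."""
    return np.sign(u) * np.maximum(np.abs(u) - t, 0)

def ista(Phi, y, lam, n_iter=500):
    """Solve min_x 0.5*||y - Phi x||^2 + lam*||x||_1 by forward-backward splitting."""
    L = np.linalg.norm(Phi, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        grad = Phi.T @ (Phi @ x - y)          # gradient of the data-fidelity term
        x = soft_threshold(x - grad / L, lam / L)
    return x

# Synthetic inverse problem y = Phi x0 + w (placeholders, not the slides' data).
rng = np.random.default_rng(0)
P, N, s = 50, 200, 5
Phi = rng.standard_normal((P, N)) / np.sqrt(P)
x0 = np.zeros(N)
x0[rng.choice(N, s, replace=False)] = rng.standard_normal(s)
w = 0.02 * rng.standard_normal(P)
y = Phi @ x0 + w

x_rec = ista(Phi, y, lam=0.02)
print("recovery error:", np.linalg.norm(x_rec - x0))
```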

Page 9: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Union of Linear Models for Data Processing

Union of models: T ∈ 𝒯 linear spaces.

Synthesis sparsity (figure: coefficients x, image x, model space T).

Page 10: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Union of Linear Models for Data Processing

Union of models: T ∈ 𝒯 linear spaces.

Synthesis sparsity (figure: coefficients x, image x, model space T).

Structured sparsity.

Page 11: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Union of Linear Models for Data Processing

Union of models: T ∈ 𝒯 linear spaces.

Synthesis sparsity (figure: coefficients x, image x, model space T).

Structured sparsity.

Analysis sparsity (figure: image x, gradient D^*x).

Page 12: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Union of Linear Models for Data Processing

Union of models: T ∈ 𝒯 linear spaces.

Synthesis sparsity (figure: coefficients x, image x, model space T).

Structured sparsity.

Analysis sparsity (figure: image x, gradient D^*x).

Low-rank: multi-spectral imaging, x_{i,·} = Σ_{j=1}^r A_{i,j} S_{j,·} (figure: source spectra S_{1,·}, S_{2,·}, S_{3,·}).

[Embedded figure on hyperspectral imagery: measurements made at many narrow contiguous wavelength bands yield a complete spectrum for each pixel, much more informative than the few wide separated bands of multispectral imagers.]

Page 13: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Gauges for Union of Linear Models

Gauge: J : ℝ^N → ℝ⁺ convex, with ∀ α ∈ ℝ⁺, J(αx) = αJ(x).

Page 14: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Gauges for Union of Linear Models

Gauge: J : ℝ^N → ℝ⁺ convex, with ∀ α ∈ ℝ⁺, J(αx) = αJ(x).

Piecewise regular ball ⟺ union of linear models (T)_{T ∈ 𝒯}.

J(x) = ‖x‖₁: T = sparse vectors (figure: x, T).

Page 15: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Gauges for Union of Linear Models

Gauge: J : ℝ^N → ℝ⁺ convex, with ∀ α ∈ ℝ⁺, J(αx) = αJ(x).

Piecewise regular ball ⟺ union of linear models (T)_{T ∈ 𝒯}.

J(x) = ‖x‖₁: T = sparse vectors (figure: x, T and x′, T′).

Page 16: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Gauges for Union of Linear Models

Gauge: J : ℝ^N → ℝ⁺ convex, with ∀ α ∈ ℝ⁺, J(αx) = αJ(x).

Piecewise regular ball ⟺ union of linear models (T)_{T ∈ 𝒯}.

J(x) = ‖x‖₁: T = sparse vectors (figure: x, T and x′, T′).

J(x) = |x₁| + ‖x_{2,3}‖: T = block sparse vectors (figure: x, T and x′, T′).

Page 17: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Gauges for Union of Linear Models

Gauge: J : ℝ^N → ℝ⁺ convex, with ∀ α ∈ ℝ⁺, J(αx) = αJ(x).

Piecewise regular ball ⟺ union of linear models (T)_{T ∈ 𝒯}.

J(x) = ‖x‖₁: T = sparse vectors (figure: x, T and x′, T′).

J(x) = |x₁| + ‖x_{2,3}‖: T = block sparse vectors (figure: x, T and x′, T′).

J(x) = ‖x‖_*: T = low-rank matrices (figure: x).

Page 18: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Gauges for Union of Linear Models

Gauge: J : ℝ^N → ℝ⁺ convex, with ∀ α ∈ ℝ⁺, J(αx) = αJ(x).

Piecewise regular ball ⟺ union of linear models (T)_{T ∈ 𝒯}.

J(x) = ‖x‖₁: T = sparse vectors (figure: x, T and x′, T′).

J(x) = |x₁| + ‖x_{2,3}‖: T = block sparse vectors (figure: x, T and x′, T′).

J(x) = ‖x‖_*: T = low-rank matrices (figure: x).

J(x) = ‖x‖_∞: T = anti-sparse vectors (figure: x, x′).
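For reference, the four gauges of this slide can be evaluated in a few lines of numpy; this is an illustrative sketch (not from the slides), and the block partition used for structured sparsity is an arbitrary example.

```python
import numpy as np

def l1(x):                  # sparsity
    return np.sum(np.abs(x))

def group_l1(x, blocks):    # structured (block) sparsity: sum of block norms
    return sum(np.linalg.norm(x[b]) for b in blocks)

def nuclear(X):             # low-rank matrices: sum of singular values
    return np.sum(np.linalg.svd(X, compute_uv=False))

def linf(x):                # anti-sparsity
    return np.max(np.abs(x))

x = np.array([1.0, -2.0, 0.0, 0.5])
blocks = [np.array([0]), np.array([1, 2]), np.array([3])]   # example partition
X = np.outer([1.0, 2.0], [3.0, -1.0])                       # rank-one matrix
print(l1(x), group_l1(x, blocks), nuclear(X), linf(x))
```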

Page 19: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Subdifferentials and Models

∂J(x) = {η : ∀ y, J(y) ≥ J(x) + ⟨η, y − x⟩}   (figure: graph of |x|)

Page 20: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Subdifferentials and Models

∂J(x) = {η : ∀ y, J(y) ≥ J(x) + ⟨η, y − x⟩}

Example: J(x) = ‖x‖₁, with I = supp(x) = {i : xᵢ ≠ 0}:

∂‖x‖₁ = {η : η_I = sign(x_I), ∀ j ∉ I, |η_j| ≤ 1}

(figure: graph of |x|; the set x + ∂J(x))

Page 21: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Subdifferentials and Models

∂J(x) = {η : ∀ y, J(y) ≥ J(x) + ⟨η, y − x⟩}

Definition: T_x = VectHull(∂J(x))^⊥

Example: J(x) = ‖x‖₁, with I = supp(x) = {i : xᵢ ≠ 0}:

∂‖x‖₁ = {η : η_I = sign(x_I), ∀ j ∉ I, |η_j| ≤ 1},   T_x = {η : supp(η) ⊂ I}

(figure: graph of |x|; x + ∂J(x) and the model space T_x)

Page 22: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Subdifferentials and Models

∂J(x) = {η : ∀ y, J(y) ≥ J(x) + ⟨η, y − x⟩}

Definition: T_x = VectHull(∂J(x))^⊥,   η ∈ ∂J(x) ⟹ Proj_{T_x}(η) = e_x

Example: J(x) = ‖x‖₁, with I = supp(x) = {i : xᵢ ≠ 0}:

∂‖x‖₁ = {η : η_I = sign(x_I), ∀ j ∉ I, |η_j| ≤ 1},   T_x = {η : supp(η) ⊂ I},   e_x = sign(x)

(figure: graph of |x|; x + ∂J(x), the model space T_x and e_x)

Page 23: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Examples

ℓ¹ sparsity: J(x) = ‖x‖₁,   e_x = sign(x),   T_x = {z : supp(z) ⊂ supp(x)}   (figure: x, ∂J(x))

Page 24: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Examples

ℓ¹ sparsity: J(x) = ‖x‖₁,   e_x = sign(x),   T_x = {z : supp(z) ⊂ supp(x)}

Structured sparsity: J(x) = Σ_b ‖x_b‖,   e_x = (N(x_b))_{b ∈ B} with N(a) = a/‖a‖,   T_x = {z : supp(z) ⊂ supp(x)}

(figures: x, ∂J(x))

Page 25: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Examples

ℓ¹ sparsity: J(x) = ‖x‖₁,   e_x = sign(x),   T_x = {z : supp(z) ⊂ supp(x)}

Structured sparsity: J(x) = Σ_b ‖x_b‖,   e_x = (N(x_b))_{b ∈ B} with N(a) = a/‖a‖,   T_x = {z : supp(z) ⊂ supp(x)}

Nuclear norm: J(x) = ‖x‖_*, with SVD x = UΛV^*:   e_x = UV^*,   T_x = {UA + BV^* : (A, B) ∈ (ℝ^{n×n})²}

(figures: x, ∂J(x))

Page 26: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Examples

ℓ¹ sparsity: J(x) = ‖x‖₁,   e_x = sign(x),   T_x = {z : supp(z) ⊂ supp(x)}

Structured sparsity: J(x) = Σ_b ‖x_b‖,   e_x = (N(x_b))_{b ∈ B} with N(a) = a/‖a‖,   T_x = {z : supp(z) ⊂ supp(x)}

Nuclear norm: J(x) = ‖x‖_*, with SVD x = UΛV^*:   e_x = UV^*,   T_x = {UA + BV^* : (A, B) ∈ (ℝ^{n×n})²}

Anti-sparsity: J(x) = ‖x‖_∞, with I = {i : |xᵢ| = ‖x‖_∞}:   e_x = |I|⁻¹ sign(x),   T_x = {y : y_I ∝ sign(x_I)}

(figures: x, ∂J(x))
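A quick numerical sanity check of two of these formulas (a sketch with arbitrary inputs, not from the slides): for ℓ¹ we form e_x = sign(x) and the projector onto T_x, and for the nuclear norm we use the standard projector onto T_x = {UA + BV^*}.

```python
import numpy as np

# l1: e_x = sign(x), T_x = {z : supp(z) subset of supp(x)}
def l1_model(x, tol=1e-10):
    I = np.abs(x) > tol
    e = np.sign(x) * I
    proj_T = lambda z: z * I                  # orthogonal projector onto T_x
    return e, proj_T

# Nuclear norm: for x = U diag(s) V^T, e_x = U V^T and
# P_T(Z) = U U^T Z + Z V V^T - U U^T Z V V^T (projector onto {U A + B V^T}).
def nuclear_model(X, tol=1e-10):
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    r = np.sum(s > tol)
    U, Vt = U[:, :r], Vt[:r, :]
    e = U @ Vt
    def proj_T(Z):
        PU, PV = U @ U.T, Vt.T @ Vt
        return PU @ Z + Z @ PV - PU @ Z @ PV
    return e, proj_T

x = np.array([0.0, 3.0, -1.0, 0.0])
e, P = l1_model(x)
print(e, P(np.ones(4)))                       # e_x and a projection onto T_x
```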

Page 27: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Overview

• Low-complexity Regularization with Gauges

• Performance Guarantees

• Grid-free Regularization

Page 28: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Dual Certificates

Noiseless recovery: min_{Φx = Φx₀} J(x)   (P₀)

(figure: the affine space Φx = Φx₀, x₀ and a solution x⋆)

Page 29: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Dual Certificates

Noiseless recovery: min_{Φx = Φx₀} J(x)   (P₀)

Dual certificates: D(x₀) = Im(Φ^*) ∩ ∂J(x₀)

Proposition: x₀ is a solution of (P₀) ⟺ ∃ η ∈ D(x₀)

(figure: Φx = Φx₀, η, ∂J(x₀), x⋆)

Page 30: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Dual Certificates

Noiseless recovery: min_{Φx = Φx₀} J(x)   (P₀)

Dual certificates: D(x₀) = Im(Φ^*) ∩ ∂J(x₀)

Proposition: x₀ is a solution of (P₀) ⟺ ∃ η ∈ D(x₀)

Proof: (P₀) ⟺ min_{δ ∈ ker(Φ)} J(x₀ + δ).
∀ (η, δ) ∈ ∂J(x₀) × ker(Φ), J(x₀ + δ) ≥ J(x₀) + ⟨δ, η⟩.
η ∈ Im(Φ^*) ⟹ ⟨δ, η⟩ = 0 ⟹ x₀ solution.
x₀ solution ⟹ ∀ δ, ⟨δ, η⟩ ≤ 0 ⟹ η ∈ ker(Φ)^⊥.

(figure: Φx = Φx₀, η, ∂J(x₀), x⋆)
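For J = ‖·‖₁ this proposition can be tested numerically: a dual certificate is a vector η = Φ^⊤q with η_I = sign(x_{0,I}) and ‖η_{I^c}‖_∞ ≤ 1. The sketch below (synthetic Gaussian Φ and sparse x₀, not from the slides) builds the least-norm candidate and checks that bound; ≤ 1 certifies that x₀ solves (P₀), and a strict < 1 gives a tight certificate as used on the next slides.

```python
import numpy as np

rng = np.random.default_rng(1)
P, N, s = 40, 100, 5
Phi = rng.standard_normal((P, N))
x0 = np.zeros(N)
I = rng.choice(N, s, replace=False)
x0[I] = rng.standard_normal(s)

# Candidate certificate eta = Phi^T q with eta_I = sign(x0_I); lstsq on the
# underdetermined system Phi_I^T q = sign(x0_I) returns the minimal-norm q.
q, *_ = np.linalg.lstsq(Phi[:, I].T, np.sign(x0[I]), rcond=None)
eta = Phi.T @ q

off_support = np.max(np.abs(np.delete(eta, I)))
print("||eta_{I^c}||_inf =", off_support)
print("certifies x0 solves (P0):", off_support <= 1)   # strict < 1: tight certificate
```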

Page 31: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Dual Certificates and L2 Stability

Tight dual certificates: D̄(x₀) = Im(Φ^*) ∩ ri(∂J(x₀)),
where ri(E) = relative interior of E = interior for the topology of aff(E).

(figure: Φx = Φx₀, η, ∂J(x₀), x⋆)

Page 32: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Dual Certificates and L2 Stability

Tight dual certificates: D̄(x₀) = Im(Φ^*) ∩ ri(∂J(x₀)),
where ri(E) = relative interior of E = interior for the topology of aff(E).

Theorem [Fadili et al. 2013]: If ∃ η ∈ D̄(x₀), then for λ ∼ ‖w‖ one has ‖x⋆ − x₀‖ = O(‖w‖).

(figure: Φx = Φx₀, η, ∂J(x₀), x⋆)

Page 33: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Dual Certificates and L2 Stability

Tight dual certificates: D̄(x₀) = Im(Φ^*) ∩ ri(∂J(x₀)),
where ri(E) = relative interior of E = interior for the topology of aff(E).

Theorem [Fadili et al. 2013]: If ∃ η ∈ D̄(x₀), then for λ ∼ ‖w‖ one has ‖x⋆ − x₀‖ = O(‖w‖).

[Grassmair, Haltmeier, Scherzer 2010]: J = ‖·‖₁.   [Grassmair 2012]: J(x⋆ − x₀) = O(‖w‖).

(figure: Φx = Φx₀, η, ∂J(x₀), x⋆)

Page 34: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Dual Certificates and L2 Stability

Tight dual certificates: D̄(x₀) = Im(Φ^*) ∩ ri(∂J(x₀)),
where ri(E) = relative interior of E = interior for the topology of aff(E).

Theorem [Fadili et al. 2013]: If ∃ η ∈ D̄(x₀), then for λ ∼ ‖w‖ one has ‖x⋆ − x₀‖ = O(‖w‖).

[Grassmair, Haltmeier, Scherzer 2010]: J = ‖·‖₁.   [Grassmair 2012]: J(x⋆ − x₀) = O(‖w‖).

→ The constants depend on N...

(figure: Φx = Φx₀, η, ∂J(x₀), x⋆)

Page 35: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Compressed Sensing Setting

Random matrix: Φ ∈ ℝ^{P×N}, Φ_{i,j} ∼ 𝒩(0, 1) i.i.d.

Page 36: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Compressed Sensing Setting

Random matrix: Φ ∈ ℝ^{P×N}, Φ_{i,j} ∼ 𝒩(0, 1) i.i.d.

Theorem [Rudelson, Vershynin 2006] [Chandrasekaran et al. 2011]:
Sparse vectors, J = ‖·‖₁. Let s = ‖x₀‖₀. If P ≥ 2s log(N/s),
then ∃ η ∈ D̄(x₀) with high probability on Φ.

Page 37: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Compressed Sensing Setting

Random matrix: Φ ∈ ℝ^{P×N}, Φ_{i,j} ∼ 𝒩(0, 1) i.i.d.

Theorem [Rudelson, Vershynin 2006] [Chandrasekaran et al. 2011]:
Sparse vectors, J = ‖·‖₁. Let s = ‖x₀‖₀. If P ≥ 2s log(N/s),
then ∃ η ∈ D̄(x₀) with high probability on Φ.

Theorem [Chandrasekaran et al. 2011]:
Low-rank matrices, J = ‖·‖_*. Let r = rank(x₀), x₀ ∈ ℝ^{N₁×N₂}. If P ≥ 3r(N₁ + N₂ − r),
then ∃ η ∈ D̄(x₀) with high probability on Φ.

Page 38: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Compressed Sensing Setting

Random matrix: Φ ∈ ℝ^{P×N}, Φ_{i,j} ∼ 𝒩(0, 1) i.i.d.

Theorem [Rudelson, Vershynin 2006] [Chandrasekaran et al. 2011]:
Sparse vectors, J = ‖·‖₁. Let s = ‖x₀‖₀. If P ≥ 2s log(N/s),
then ∃ η ∈ D̄(x₀) with high probability on Φ.

Theorem [Chandrasekaran et al. 2011]:
Low-rank matrices, J = ‖·‖_*. Let r = rank(x₀), x₀ ∈ ℝ^{N₁×N₂}. If P ≥ 3r(N₁ + N₂ − r),
then ∃ η ∈ D̄(x₀) with high probability on Φ.

→ Similar results for ‖·‖_{1,2}, ‖·‖_∞.

Page 39: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Phase Transitions

(figure: empirical probability of exact recovery as a function of P/N ∈ [0, 1], versus s/N for J = ‖·‖₁ and versus r/√N for J = ‖·‖_*)

[Embedded figure from Amelunxen et al., "The geometry of phase transitions in convex optimization": empirical phase transitions for ℓ¹ recovery of sparse vectors in ℝ¹⁰⁰ and nuclear-norm (S₁) recovery of low-rank matrices in ℝ³⁰ˣ³⁰ from random linear measurements; the theoretical curve, given by the statistical dimension of the descent cone, matches the 50% empirical success isocline.]


From [Amelunxen et al. 2013]
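An empirical phase transition such as the one in this figure can be reproduced on a very small scale (a rough sketch, far smaller than the experiments of Amelunxen et al.): for each pair (P, s) we count how often ℓ¹ minimization recovers a random s-sparse x₀ from P Gaussian measurements, solving the noiseless problem as a linear program.

```python
import numpy as np
from scipy.optimize import linprog

def l1_recovers(P, N, s, rng, tol=1e-5):
    """One Basis Pursuit trial: does min ||x||_1 s.t. Phi x = Phi x0 return x0?"""
    Phi = rng.standard_normal((P, N))
    x0 = np.zeros(N)
    x0[rng.choice(N, s, replace=False)] = rng.standard_normal(s)
    A_eq = np.hstack([Phi, -Phi])                # split x = u - v with u, v >= 0
    res = linprog(np.ones(2 * N), A_eq=A_eq, b_eq=Phi @ x0, bounds=(0, None))
    x = res.x[:N] - res.x[N:]
    return np.linalg.norm(x - x0) < tol

rng = np.random.default_rng(0)
N, trials = 50, 20
for P in (10, 20, 30, 40):
    for s in (2, 5, 10, 15):
        rate = np.mean([l1_recovers(P, N, s, rng) for _ in range(trials)])
        print(f"P={P:2d}  s={s:2d}  success={rate:.2f}")
```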

Page 40: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Minimal-norm Certificate

η ∈ D(x₀) ⟹ { η = Φ^* q and Proj_T(η) = e },   where T = T_{x₀}, e = e_{x₀}.

Page 41: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Minimal-norm Certificate

η ∈ D(x₀) ⟹ { η = Φ^* q and Proj_T(η) = e },   where T = T_{x₀}, e = e_{x₀}.

Minimal-norm pre-certificate: η₀ = argmin_{η = Φ^* q, η_T = e} ‖q‖.

Page 42: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Minimal-norm Certificate

η ∈ D(x₀) ⟹ { η = Φ^* q and Proj_T(η) = e },   where T = T_{x₀}, e = e_{x₀}.

Minimal-norm pre-certificate: η₀ = argmin_{η = Φ^* q, η_T = e} ‖q‖.

Proposition: one has η₀ = (Φ_T^+ Φ)^* e, where Φ_T = Φ ∘ Proj_T.
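The proposition gives η₀ in closed form from a pseudo-inverse. A small sketch (specialized to J = ‖·‖₁ for concreteness, with synthetic data, not from the slides): we form Φ_T = Φ ∘ Proj_T, compute η₀ = (Φ_T^+ Φ)^* e, and check both the constraint Proj_T(η₀) = e and the off-support sup-norm, whose strict bound < 1 is the certification condition used below.

```python
import numpy as np

rng = np.random.default_rng(2)
P, N, s = 30, 60, 4
Phi = rng.standard_normal((P, N))
x0 = np.zeros(N)
I = rng.choice(N, s, replace=False)
x0[I] = rng.standard_normal(s)

# Model of x0 for J = ||.||_1: T = {z : supp(z) in I}, e = sign(x0), Proj_T = mask.
mask = np.zeros(N)
mask[I] = 1.0
e = np.sign(x0)
Phi_T = Phi * mask                       # Phi_T = Phi o Proj_T (zero the columns off T)

# Minimal-norm pre-certificate eta_0 = (Phi_T^+ Phi)^* e = Phi^T (Phi_T^+)^T e.
eta0 = Phi.T @ np.linalg.pinv(Phi_T).T @ e

print("Proj_T(eta0) = e :", np.allclose(eta0 * mask, e))         # constraint eta_T = e
print("||eta0_{I^c}||_inf =", np.max(np.abs(eta0[mask == 0])))   # < 1  =>  eta0 certifies x0
```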

Page 43: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Minimal-norm Certificate

η ∈ D(x₀) ⟹ { η = Φ^* q and Proj_T(η) = e },   where T = T_{x₀}, e = e_{x₀}.

Minimal-norm pre-certificate: η₀ = argmin_{η = Φ^* q, η_T = e} ‖q‖.

Proposition: one has η₀ = (Φ_T^+ Φ)^* e, where Φ_T = Φ ∘ Proj_T.

Theorem [Vaiter et al. 2013]: If η₀ ∈ D̄(x₀) and λ ∼ ‖w‖, the unique solution x⋆ of P_λ(y)
for y = Φx₀ + w satisfies T_{x⋆} = T_{x₀} and ‖x⋆ − x₀‖ = O(‖w‖).

Page 44: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Minimal-norm Certificate

η ∈ D(x₀) ⟹ { η = Φ^* q and Proj_T(η) = e },   where T = T_{x₀}, e = e_{x₀}.

Minimal-norm pre-certificate: η₀ = argmin_{η = Φ^* q, η_T = e} ‖q‖.

Proposition: one has η₀ = (Φ_T^+ Φ)^* e, where Φ_T = Φ ∘ Proj_T.

Theorem [Vaiter et al. 2013]: If η₀ ∈ D̄(x₀) and λ ∼ ‖w‖, the unique solution x⋆ of P_λ(y)
for y = Φx₀ + w satisfies T_{x⋆} = T_{x₀} and ‖x⋆ − x₀‖ = O(‖w‖).

[Fuchs 2004]: J = ‖·‖₁.   [Bach 2008]: J = ‖·‖_{1,2} and J = ‖·‖_*.   [Vaiter et al. 2011]: J = ‖D^* ·‖₁.

Page 45: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Compressed Sensing Setting

Random matrix: Φ ∈ ℝ^{P×N}, Φ_{i,j} ∼ 𝒩(0, 1) i.i.d.

Theorem [Wainwright 2009] [Dossal et al. 2011]:
Sparse vectors, J = ‖·‖₁. Let s = ‖x₀‖₀. If P ≥ 2s log(N),
then η₀ ∈ D̄(x₀) with high probability on Φ.

Page 46: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Compressed Sensing Setting

Random matrix: Φ ∈ ℝ^{P×N}, Φ_{i,j} ∼ 𝒩(0, 1) i.i.d.

Theorem [Wainwright 2009] [Dossal et al. 2011]:
Sparse vectors, J = ‖·‖₁. Let s = ‖x₀‖₀. If P ≥ 2s log(N),
then η₀ ∈ D̄(x₀) with high probability on Φ.

Phase transitions: L² stability for P ∼ 2s log(N/s) vs. model stability for P ∼ 2s log(N).

Page 47: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Compressed Sensing Setting

Random matrix: Φ ∈ ℝ^{P×N}, Φ_{i,j} ∼ 𝒩(0, 1) i.i.d.

Theorem [Wainwright 2009] [Dossal et al. 2011]:
Sparse vectors, J = ‖·‖₁. Let s = ‖x₀‖₀. If P ≥ 2s log(N),
then η₀ ∈ D̄(x₀) with high probability on Φ.

Phase transitions: L² stability for P ∼ 2s log(N/s) vs. model stability for P ∼ 2s log(N).

→ Similar results for ‖·‖_{1,2}, ‖·‖_*, ‖·‖_∞.

Page 48: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Compressed Sensing Setting

Random matrix: Φ ∈ ℝ^{P×N}, Φ_{i,j} ∼ 𝒩(0, 1) i.i.d.

Theorem [Wainwright 2009] [Dossal et al. 2011]:
Sparse vectors, J = ‖·‖₁. Let s = ‖x₀‖₀. If P ≥ 2s log(N),
then η₀ ∈ D̄(x₀) with high probability on Φ.

Phase transitions: L² stability for P ∼ 2s log(N/s) vs. model stability for P ∼ 2s log(N).

→ Similar results for ‖·‖_{1,2}, ‖·‖_*, ‖·‖_∞.

→ Not using RIP techniques (non-uniform result on x₀).
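To get a feel for the gap between the two regimes, a one-line computation with illustrative numbers (N = 10⁶, s = 100; not values from the slides):

```python
import numpy as np

N, s = 10**6, 100
print("L2 stability    ~", 2 * s * np.log(N / s))   # about 1.8e3 measurements
print("model stability ~", 2 * s * np.log(N))       # about 2.8e3 measurements
```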

Page 49: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

1-D Sparse Spikes Deconvolution

Φx = Σ_i xᵢ φ(· − Δ i),   J(x) = ‖x‖₁

Increasing Δ: reduces correlation, reduces resolution.

(figure: x₀ and Φx₀, grid spacing Δ)

Page 50: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

1-D Sparse Spikes Deconvolution

Φx = Σ_i xᵢ φ(· − Δ i),   J(x) = ‖x‖₁

Increasing Δ: reduces correlation, reduces resolution.

η₀ ∈ D̄(x₀) ⟺ ‖η_{0,I^c}‖_∞ < 1 ⟺ support recovery,   where I = {j : x₀(j) ≠ 0}.

(figure: x₀ and Φx₀; ‖η_{0,I^c}‖_∞ as a function of the spacing Δ, crossing the level 1)
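The dependence on the spacing Δ can be explored numerically. The sketch below (a stand-in Gaussian kernel on a discrete grid and two opposite-sign spikes at distance Δ; all parameters are placeholders, not the slides' setup) computes the quantity ‖η_{0,I^c}‖_∞ plotted on this slide, so one can watch how it behaves as the spikes get closer.

```python
import numpy as np

def deconv_certificate(delta, N=256, sigma=0.03):
    """||eta_{0,I^c}||_inf for two opposite spikes at distance delta (grid of size N)."""
    t = np.arange(N) / N
    # Columns of Phi: a Gaussian blur centred at each grid point (placeholder kernel).
    Phi = np.exp(-(t[:, None] - t[None, :]) ** 2 / (2 * sigma ** 2))
    i0 = N // 2
    I = np.array([i0, i0 + int(round(delta * N))])
    sgn = np.array([1.0, -1.0])
    q, *_ = np.linalg.lstsq(Phi[:, I].T, sgn, rcond=None)   # minimal-norm pre-certificate
    eta0 = Phi.T @ q
    return np.max(np.abs(np.delete(eta0, I)))

for delta in (0.20, 0.10, 0.05, 0.02):
    print(f"delta={delta:.2f}  ||eta0_Ic||_inf = {deconv_certificate(delta):.3f}")
```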

Page 51: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Overview

• Low-complexity Regularization with Gauges

• Performance Guarantees

• Grid-free Regularization

Page 52: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Support Instability and Measures

When N → +∞, the support is not stable: ‖η_{0,I^c}‖_∞ → c > 1.

(figure: ‖η_{0,I^c}‖_∞ versus 1/N; stable below the level 1, unstable above)

Page 53: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Support Instability and Measures

When N → +∞, the support is not stable: ‖η_{0,I^c}‖_∞ → c > 1.

Intuition: the spikes want to move laterally.

→ Use Radon measures m ∈ M(𝕋), 𝕋 = ℝ/ℤ.

(figure: ‖η_{0,I^c}‖_∞ versus 1/N; stable below the level 1, unstable above)

Page 54: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Support Instability and Measures

When N → +∞, the support is not stable: ‖η_{0,I^c}‖_∞ → c > 1.

Intuition: the spikes want to move laterally.

→ Use Radon measures m ∈ M(𝕋), 𝕋 = ℝ/ℤ.

Extension of ℓ¹: the total variation ‖m‖_TV = sup_{‖g‖_∞ ≤ 1} ∫_𝕋 g(x) dm(x).

Discrete measure: m_{x,a} = Σ_i aᵢ δ_{xᵢ}; one has ‖m_{x,a}‖_TV = ‖a‖₁.

(figure: ‖η_{0,I^c}‖_∞ versus 1/N; stable below the level 1, unstable above)

Page 55: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Sparse Measure Regularization

Measurements: y = Φ(m₀) + w, where m₀ ∈ M(𝕋), Φ : M(𝕋) → L²(𝕋), w ∈ L²(𝕋).

Page 56: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Sparse Measure Regularization

Measurements: y = Φ(m₀) + w, where m₀ ∈ M(𝕋), Φ : M(𝕋) → L²(𝕋), w ∈ L²(𝕋).

Acquisition operator: Φ(m)(x) = ∫_𝕋 φ(x, x′) dm(x′), where φ ∈ C²(𝕋 × 𝕋).

Page 57: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Sparse Measure Regularization

Measurements: y = Φ(m₀) + w, where m₀ ∈ M(𝕋), Φ : M(𝕋) → L²(𝕋), w ∈ L²(𝕋).

Acquisition operator: Φ(m)(x) = ∫_𝕋 φ(x, x′) dm(x′), where φ ∈ C²(𝕋 × 𝕋).

Total-variation over measures regularization: min_{m ∈ M(𝕋)} (1/2)‖Φ(m) − y‖² + λ‖m‖_TV.

Page 58: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Sparse Measure Regularization

Measurements: y = Φ(m₀) + w, where m₀ ∈ M(𝕋), Φ : M(𝕋) → L²(𝕋), w ∈ L²(𝕋).

Acquisition operator: Φ(m)(x) = ∫_𝕋 φ(x, x′) dm(x′), where φ ∈ C²(𝕋 × 𝕋).

Total-variation over measures regularization: min_{m ∈ M(𝕋)} (1/2)‖Φ(m) − y‖² + λ‖m‖_TV.

→ Infinite-dimensional convex program.
→ If dim(Im(Φ)) < +∞, the dual is finite-dimensional.
→ If Φ is a filtering, the dual can be re-cast as an SDP program.
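One common way to approximate this infinite-dimensional program is to restrict the measure to a fine grid z, which turns it back into a Lasso; this is exactly the "on a grid" problem of the next slides. The sketch below (placeholder periodic kernel, sizes and λ arbitrary, not from the slides) builds Φ_z from the sampled kernel φ(·, zᵢ) and runs ISTA on the discretized problem.

```python
import numpy as np

# Discretize min_m 0.5*||Phi(m) - y||^2 + lam*||m||_TV on a grid z of T = R/Z:
# a measure m = sum_i a_i delta_{z_i} gives Phi(m) = Phi_z a and ||m||_TV = ||a||_1.
N, P = 512, 64                               # grid size and number of samples (placeholders)
z = np.arange(N) / N
x_samples = np.arange(P) / P
phi = lambda x, xp: np.exp(-np.sin(np.pi * (x - xp)) ** 2 / (2 * 0.05 ** 2))  # periodic bump
Phi_z = phi(x_samples[:, None], z[None, :])
Phi_z /= np.linalg.norm(Phi_z, 2)            # normalize so a unit gradient step is valid

# Ground-truth discrete measure and observations y = Phi(m0) + w.
rng = np.random.default_rng(3)
a0 = np.zeros(N)
a0[[100, 300, 350]] = [1.0, -0.7, 0.8]
y = Phi_z @ a0 + 0.01 * rng.standard_normal(P)

# ISTA on the Lasso  min_a 0.5*||Phi_z a - y||^2 + lam*||a||_1.
lam, a = 0.01, np.zeros(N)
for _ in range(2000):
    a = a - Phi_z.T @ (Phi_z @ a - y)                  # gradient step (step 1, ||Phi_z|| = 1)
    a = np.sign(a) * np.maximum(np.abs(a) - lam, 0)    # soft-thresholding
print("estimated support:", np.nonzero(np.abs(a) > 1e-3)[0])
```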

Page 59: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Fuchs vs. Vanishing Pre-Certificates

Measures: min_{m ∈ M} (1/2)‖Φm − y‖² + λ‖m‖_TV

(figure: a certificate oscillating between −1 and +1)

Page 60: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Fuchs vs. Vanishing Pre-Certificates

Measures: min_{m ∈ M} (1/2)‖Φm − y‖² + λ‖m‖_TV

On a grid z: min_{a ∈ ℝ^N} (1/2)‖Φ_z a − y‖² + λ‖a‖₁

(figure: a certificate oscillating between −1 and +1; grid points zᵢ)

Page 61: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Fuchs vs. Vanishing Pre-Certificates

Measures: min_{m ∈ M} (1/2)‖Φm − y‖² + λ‖m‖_TV

On a grid z: min_{a ∈ ℝ^N} (1/2)‖Φ_z a − y‖² + λ‖a‖₁

For m₀ = m_{z,a₀} with supp(m₀) = x₀ and supp(a₀) = I:

η_F = Φ^* Φ_I^{+,*} sign(a_{0,I})

(figure: η_F between −1 and +1; grid points zᵢ)

Page 62: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Fuchs vs. Vanishing Pre-Certificates

Measures: min_{m ∈ M} (1/2)‖Φm − y‖² + λ‖m‖_TV

On a grid z: min_{a ∈ ℝ^N} (1/2)‖Φ_z a − y‖² + λ‖a‖₁

For m₀ = m_{z,a₀} with supp(m₀) = x₀ and supp(a₀) = I:

η_F = Φ^* Φ_I^{+,*} sign(a_{0,I})

η_V = Φ^* Γ_{x₀}^{+,*} (sign(a₀), 0),   where Γ_x(a, b) = Σ_i aᵢ φ(·, xᵢ) + bᵢ φ′(·, xᵢ)

(figure: η_F and η_V between −1 and +1; grid points zᵢ)

Page 63: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Fuchs vs. Vanishing Pre-Certificates

Measures: min_{m ∈ M} (1/2)‖Φm − y‖² + λ‖m‖_TV

On a grid z: min_{a ∈ ℝ^N} (1/2)‖Φ_z a − y‖² + λ‖a‖₁

For m₀ = m_{z,a₀} with supp(m₀) = x₀ and supp(a₀) = I:

η_F = Φ^* Φ_I^{+,*} sign(a_{0,I})

η_V = Φ^* Γ_{x₀}^{+,*} (sign(a₀), 0),   where Γ_x(a, b) = Σ_i aᵢ φ(·, xᵢ) + bᵢ φ′(·, xᵢ)

Theorem [Fuchs 2004]: If ∀ j ∉ I, |η_F(z_j)| < 1, then supp(a_λ) = supp(a₀)
(holds for ‖w‖ small enough and λ ∼ ‖w‖).

(figure: η_F and η_V between −1 and +1; grid points zᵢ)

Page 64: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Fuchs vs. Vanishing Pre-Certificates

Measures: min_{m ∈ M} (1/2)‖Φm − y‖² + λ‖m‖_TV

On a grid z: min_{a ∈ ℝ^N} (1/2)‖Φ_z a − y‖² + λ‖a‖₁

For m₀ = m_{z,a₀} with supp(m₀) = x₀ and supp(a₀) = I:

η_F = Φ^* Φ_I^{+,*} sign(a_{0,I})

η_V = Φ^* Γ_{x₀}^{+,*} (sign(a₀), 0),   where Γ_x(a, b) = Σ_i aᵢ φ(·, xᵢ) + bᵢ φ′(·, xᵢ)

Theorem [Fuchs 2004]: If ∀ j ∉ I, |η_F(z_j)| < 1, then supp(a_λ) = supp(a₀)
(holds for ‖w‖ small enough and λ ∼ ‖w‖).

Theorem [Duval-Peyré 2013]: If ∀ t ∉ x₀, |η_V(t)| < 1, then m_λ = m_{x_λ, a_λ} with ‖x_λ − x₀‖_∞ = O(‖w‖)
(holds for ‖w‖ small enough and λ ∼ ‖w‖).

(figure: η_F and η_V between −1 and +1; grid points zᵢ)
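The two pre-certificates differ only in the interpolation system: η_F interpolates the sign at the spikes, while η_V additionally imposes a vanishing derivative there. A sketch (not from the slides; a periodic Gaussian-like kernel stands in for the ideal low-pass filter of the next slide, and the spike positions and signs are arbitrary):

```python
import numpy as np

N, sigma = 512, 0.04
t = np.arange(N) / N
phi = lambda x, xp: np.exp(-np.sin(np.pi * (x - xp)) ** 2 / (2 * sigma ** 2))
dphi = lambda x, xp: (phi(x, xp + 1e-5) - phi(x, xp - 1e-5)) / 2e-5   # derivative in the spike location

Phi = phi(t[:, None], t[None, :])          # kernel columns phi(., z_i) sampled on the grid
x_spikes = np.array([0.30, 0.33, 0.70])    # spike locations (placeholders)
signs = np.array([1.0, 1.0, -1.0])

# Fuchs pre-certificate: eta_F = Phi^* Phi_I^{+,*} sign(a_0), sign interpolation only.
Phi_I = phi(t[:, None], x_spikes[None, :])
qF, *_ = np.linalg.lstsq(Phi_I.T, signs, rcond=None)
eta_F = Phi.T @ qF

# Vanishing-derivative pre-certificate: eta_V = Phi^* Gamma_x^{+,*} (sign(a_0), 0),
# where Gamma_x stacks phi(., x_i) and phi'(., x_i).
Gamma = np.hstack([Phi_I, dphi(t[:, None], x_spikes[None, :])])
qV, *_ = np.linalg.lstsq(Gamma.T, np.concatenate([signs, np.zeros(3)]), rcond=None)
eta_V = Phi.T @ qV

# Sup-norm on grid points away from the spikes (values above 1 mean the condition fails).
far = np.min(np.abs((t[:, None] - x_spikes[None, :] + 0.5) % 1 - 0.5), axis=1) > 2 / N
print("sup |eta_F| away from the spikes:", np.max(np.abs(eta_F[far])))
print("sup |eta_V| away from the spikes:", np.max(np.abs(eta_V[far])))
```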

Page 65: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Numerical Illustration

Ideal low-pass filter: φ(x, x′) = sin((2f_c + 1)π(x − x′)) / sin(π(x − x′)),   f_c = 6.

(figure: η_F and η_V between −1 and +1, with a zoom near the level +1)

Page 66: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Numerical Illustration

Ideal low-pass filter: φ(x, x′) = sin((2f_c + 1)π(x − x′)) / sin(π(x − x′)),   f_c = 6.

Solution path λ ↦ a_λ (figure).

(figure: η_F and η_V between −1 and +1, with a zoom near the level +1)

Page 67: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Numerical Illustration

Ideal low-pass filter: φ(x, x′) = sin((2f_c + 1)π(x − x′)) / sin(π(x − x′)),   f_c = 6.

Solution path λ ↦ a_λ (figure).

Theorem [Duval-Peyré 2013] (discrete → continuous): If η_V is valid, then a_λ is supported
on pairs of neighbors around supp(m₀) (holds for λ ∼ ‖w‖ small enough).

(figure: η_F and η_V between −1 and +1, with a zoom near the level +1)

Page 68: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Conclusion

Gauges: encode linear models as singular points.

Page 69: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Conclusion

Gauges: encode linear models as singular points.

Performance measures: L² error vs. model recovery → different CS guarantees.

Page 70: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Conclusion

Gauges: encode linear models as singular points.

Performance measures: L² error vs. model recovery → different CS guarantees.

Specific certificates: η₀, η_F, η_V, ...

Page 71: Low Complexity Regularization of Inverse Problems - Course #2 Recovery Guarantees

Conclusion

Gauges: encode linear models as singular points.

Performance measures: L² error vs. model recovery → different CS guarantees.

Specific certificates: η₀, η_F, η_V, ...

Open problems:
– CS performance with arbitrary gauges.
– Approximate model recovery T_{x⋆} ≈ T_{x₀} (e.g. grid-free recovery).