A Review of Proximal Methods, with a New One
A Review of Proximal Splitting Methods, with a New One
Gabriel Peyré, Jalal Fadili, Hugo Raguet
www.numerical-tours.com
Overview
• Inverse Problems Regularization
• Proximal Splitting
• Generalized Forward-Backward
Inverse Problems
Forward model: $y = K f_0 + w \in \mathbb{R}^P$
(unknown) input $f_0 \in \mathbb{R}^Q$, observation operator $K : \mathbb{R}^Q \to \mathbb{R}^P$, observations $y$, noise $w$.
Denoising: $K = \mathrm{Id}_Q$, $P = Q$.
Inpainting: for a set $\Omega$ of missing pixels, $(Kf)(x) = 0$ if $x \in \Omega$ and $f(x)$ if $x \notin \Omega$; $P = Q - |\Omega|$.
Super-resolution: $Kf = (f \star k)\downarrow_s$, $P = Q/s$.
Inverse Problem Regularization
Noisy measurements: $y = K f_0 + w$.
Prior model: $J : \mathbb{R}^Q \to \mathbb{R}$ assigns a score to images.
$f^\star \in \operatorname{argmin}_{f \in \mathbb{R}^Q} \tfrac{1}{2}\|y - K f\|^2 + \lambda J(f)$
(first term: data fidelity; second term: regularity)
Choice of $\lambda$: tradeoff between the noise level $\|w\|$ and the regularity $J(f_0)$.
No noise: $\lambda \to 0^+$, minimize $f^\star \in \operatorname{argmin}_{f \in \mathbb{R}^Q,\ Kf = y} J(f)$.
L1 Regularization
Coefficients $x_0 \in \mathbb{R}^N$, image $f_0 = \Psi x_0 \in \mathbb{R}^Q$, observations $y = K f_0 + w \in \mathbb{R}^P$.
$\Phi = K \Psi \in \mathbb{R}^{P \times N}$
Sparse recovery: $f^\star = \Psi x^\star$ where $x^\star$ solves
$\min_{x \in \mathbb{R}^N} \tfrac{1}{2}\|y - \Phi x\|^2 + \lambda \|x\|_1$
(first term: fidelity; second term: regularization)

Inpainting Problem
Measurements: $y = K f_0 + w$, with $(Kf)(x) = 0$ if $x \in \Omega$ and $f(x)$ if $x \notin \Omega$.
Overview
• Inverse Problems Regularization
• Proximal Splitting
• Generalized Forward-Backward
Proximal Operators
Proximal operator of $G$: $\operatorname{Prox}_{\lambda G}(x) = \operatorname{argmin}_z \tfrac{1}{2}\|x - z\|^2 + \lambda G(z)$
$G(x) = \|x\|_1 = \sum_i |x_i|$: $\operatorname{Prox}_{\lambda G}(x)_i = \max\big(0, 1 - \tfrac{\lambda}{|x_i|}\big)\, x_i$ (soft thresholding)
$G(x) = \|x\|_0 = |\{i \,:\, x_i \neq 0\}|$: $\operatorname{Prox}_{\lambda G}(x)_i = x_i$ if $|x_i| \geq \sqrt{2\lambda}$, $0$ otherwise (hard thresholding)
$G(x) = \sum_i \log(1 + |x_i|^2)$: $\to$ root of a 3rd order polynomial.
[Figure: graphs of $G(x)$ and $\operatorname{Prox}_G(x)$ for $|x|$, $\|x\|_0$, and $\log(1 + x^2)$.]
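The ℓ¹ and ℓ⁰ proximal operators above are closed-form and cheap to evaluate entrywise. A minimal NumPy sketch (the function names are mine, not from the slides):

```python
import numpy as np

def prox_l1(x, lam):
    """Soft thresholding: Prox of lam * ||x||_1, applied entrywise.
    Equivalent to max(0, 1 - lam/|x_i|) * x_i."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def prox_l0(x, lam):
    """Hard thresholding: Prox of lam * ||x||_0
    (keep entries with |x_i| >= sqrt(2*lam), zero the rest)."""
    return np.where(np.abs(x) >= np.sqrt(2.0 * lam), x, 0.0)
```

For example, `prox_l1(np.array([3.0, -0.5]), 1.0)` shrinks the large entry to 2.0 and kills the small one.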
Proximal Splitting Methods
Solve $\min_{x \in \mathcal{H}} E(x)$. Problem: $\operatorname{Prox}_{\lambda E}$ is not available.
Splitting: $E(x) = F(x) + \sum_i G_i(x)$, with $F$ smooth and each $G_i$ simple.
Iterative algorithms using only $\nabla F(x)$ and $\operatorname{Prox}_{\lambda G_i}(x)$:
Forward-Backward: solves $F + G$
Douglas-Rachford: solves $\sum_i G_i$
Primal-Dual: solves $\sum_i G_i \circ A_i$
Generalized FB: solves $F + \sum_i G_i$
Forward-Backward
$(\star)\quad \min_{x \in \mathbb{R}^N} F(x) + G(x)$, with $F$ smooth and $G$ simple.
Forward-backward iteration: $x^{(\ell+1)} = \operatorname{Prox}_{\tau G}\big(x^{(\ell)} - \tau \nabla F(x^{(\ell)})\big)$
$G = \iota_C$: projected gradient descent.
Theorem: Let $\nabla F$ be $L$-Lipschitz. If $\tau < 2/L$, then $x^{(\ell)} \to x^\star$, a solution of $(\star)$.
$\to$ Multi-step accelerations (Nesterov, Beck-Teboulle).
Example: L1 Regularization
$\min_x \tfrac{1}{2}\|\Phi x - y\|^2 + \lambda \|x\|_1 \;=\; \min_x F(x) + G(x)$
$F(x) = \tfrac{1}{2}\|\Phi x - y\|^2$, $\nabla F(x) = \Phi^*(\Phi x - y)$, $L = \|\Phi^* \Phi\|$
$G(x) = \lambda \|x\|_1$, $\operatorname{Prox}_{\tau G}(x)_i = \max\big(0, 1 - \tfrac{\tau\lambda}{|x_i|}\big)\, x_i$
$\to$ Forward-backward $=$ iterative soft thresholding.
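Putting the gradient step and the soft-thresholding prox together gives iterative soft thresholding (ISTA). A minimal sketch under the assumptions above (the function name and the step-size choice $\tau = 1/L$ are mine):

```python
import numpy as np

def ista(Phi, y, lam, n_iter=200):
    """Forward-backward (iterative soft thresholding) for
    min_x 0.5*||Phi x - y||^2 + lam*||x||_1."""
    L = np.linalg.norm(Phi.T @ Phi, 2)  # Lipschitz constant of grad F
    tau = 1.0 / L                       # step size; tau < 2/L guarantees convergence
    x = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        grad = Phi.T @ (Phi @ x - y)    # forward (explicit gradient) step
        u = x - tau * grad
        x = np.sign(u) * np.maximum(np.abs(u) - tau * lam, 0.0)  # backward (prox) step
    return x
```

With `Phi` the identity, this reduces to a single soft thresholding of `y`, which is a handy sanity check.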
Douglas-Rachford Scheme
$(\star)\quad \min_x G_1(x) + G_2(x)$, with $G_1$ and $G_2$ simple.
Reflected prox: $\operatorname{RProx}_{\gamma G}(x) = 2\operatorname{Prox}_{\gamma G}(x) - x$
Douglas-Rachford iterations:
$z^{(\ell+1)} = \big(1 - \tfrac{\alpha}{2}\big)\, z^{(\ell)} + \tfrac{\alpha}{2}\, \operatorname{RProx}_{\gamma G_2} \circ \operatorname{RProx}_{\gamma G_1}(z^{(\ell)})$
$x^{(\ell+1)} = \operatorname{Prox}_{\gamma G_2}(z^{(\ell+1)})$
Theorem: If $0 < \alpha < 2$ and $\gamma > 0$, then $x^{(\ell)} \to x^\star$, a solution of $(\star)$.
Example: Constrained L1
$\min_{\Phi x = y} \|x\|_1 \;=\; \min_x G_1(x) + G_2(x)$, with $G_1 = \iota_C$, $C = \{x \,:\, \Phi x = y\}$, and $G_2(x) = \|x\|_1$.
$\operatorname{Prox}_{\gamma G_1}(x) = \operatorname{Proj}_C(x) = x + \Phi^*(\Phi\Phi^*)^{-1}(y - \Phi x)$
$\operatorname{Prox}_{\gamma G_2}(x)_i = \max\big(0, 1 - \tfrac{\gamma}{|x_i|}\big)\, x_i$
$\to$ efficient if $\Phi\Phi^*$ is easy to invert.
Example: compressed sensing. $\Phi \in \mathbb{R}^{100 \times 400}$ Gaussian matrix, $y = \Phi x_0$, $\|x_0\|_0 = 17$.
[Figure: convergence plot of $\log_{10}(\|x^{(\ell)}\|_1 - \|x^\star\|_1)$ vs. iterations, for $\gamma = 0.01$, $\gamma = 1$, $\gamma = 10$.]
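A minimal Douglas-Rachford sketch for this constrained ℓ¹ problem, assuming $\Phi\Phi^*$ is invertible. I return $\operatorname{Proj}_C(z)$ so the final iterate is exactly feasible (a common variant; the slides output $\operatorname{Prox}_{\gamma G_2}$ instead). The function name is mine:

```python
import numpy as np

def dr_l1_equality(Phi, y, gamma=1.0, alpha=1.0, n_iter=500):
    """Douglas-Rachford for min ||x||_1 subject to Phi x = y.
    G1 = indicator of C = {x : Phi x = y} (prox = projection onto C),
    G2 = ||.||_1 (prox = soft thresholding)."""
    pinv = Phi.T @ np.linalg.inv(Phi @ Phi.T)   # efficient when Phi Phi^T is easy to invert
    proj_C = lambda u: u + pinv @ (y - Phi @ u)                       # Prox_{gamma G1}
    soft = lambda u, g: np.sign(u) * np.maximum(np.abs(u) - g, 0.0)   # Prox_{gamma G2}
    z = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        x = proj_C(z)
        r = 2 * x - z                            # RProx_{gamma G1}(z)
        z = (1 - alpha / 2) * z + (alpha / 2) * (2 * soft(r, gamma) - r)
    return proj_C(z)                             # feasible output
```

On a toy problem where $\Phi$ selects the first two coordinates, the minimizer just zeroes the unconstrained coordinate.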
Overview
• Inverse Problems Regularization
• Proximal Splitting
• Generalized Forward-Backward
GFB Splitting
$(\star)\quad \min_{x \in \mathbb{R}^N} F(x) + \sum_{i=1}^n G_i(x)$, with $F$ smooth and each $G_i$ simple.
$\forall\, i = 1, \ldots, n:\quad z_i^{(\ell+1)} = z_i^{(\ell)} + \operatorname{Prox}_{n\gamma G_i}\big(2x^{(\ell)} - z_i^{(\ell)} - \gamma \nabla F(x^{(\ell)})\big) - x^{(\ell)}$
$x^{(\ell+1)} = \tfrac{1}{n} \sum_{i=1}^n z_i^{(\ell+1)}$
$n = 1$ $\to$ Forward-backward. $F = 0$ $\to$ Douglas-Rachford.
Theorem: Let $\nabla F$ be $L$-Lipschitz. If $\gamma < 2/L$, then $x^{(\ell)} \to x^\star$, a solution of $(\star)$.
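The GFB iteration maps directly to code: one auxiliary variable $z_i$ per simple term, updated in parallel. A generic sketch (the interface, taking $\nabla F$ and a list of prox operators, is my choice, not from the slides):

```python
import numpy as np

def gfb(grad_F, proxs, x0, gamma, n_iter=200):
    """Generalized Forward-Backward for min F(x) + sum_i G_i(x).
    grad_F: gradient of the smooth part F.
    proxs: list of callables, proxs[i](u, t) = Prox_{t G_i}(u).
    gamma: step size, gamma < 2/L with L the Lipschitz constant of grad F."""
    n = len(proxs)
    x = x0.copy()
    z = [x0.copy() for _ in range(n)]          # one auxiliary variable per G_i
    for _ in range(n_iter):
        g = grad_F(x)
        for i in range(n):                      # these updates are parallelizable
            z[i] = z[i] + proxs[i](2 * x - z[i] - gamma * g, n * gamma) - x
        x = sum(z) / n                          # average the auxiliary variables
    return x
```

With `n = 1` this is exactly forward-backward, which gives a simple check against the ISTA example.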
Towards More Complex Penalization
$\|x\|_1 = \sum_i |x_i| \;\to\; \sum_{b \in B} \sqrt{\textstyle\sum_{i \in b} x_i^2} \;\to\; \sum_{b \in B_1} \sqrt{\textstyle\sum_{i \in b} x_i^2} + \sum_{b \in B_2} \sqrt{\textstyle\sum_{i \in b} x_i^2}$
Decomposition $G = \sum_k G_k$.

Block Regularization
$\ell^1 - \ell^2$ block sparsity: $G(x) = \sum_{b \in B} \|x[b]\|$, where $\|x[b]\|^2 = \sum_{m \in b} x_m^2$.
Coefficients $x$, image $f = \Psi x$, blocks $B_1$ (overlapping case: $B_1 \cup B_2$).
Non-overlapping decomposition: $B = B_1 \cup \ldots \cup B_n$,
$G(x) = \sum_{i=1}^n G_i(x)$, with $G_i(x) = \sum_{b \in B_i} \|x[b]\|$.
Each $G_i$ is simple: $\forall\, m \in b \in B_i$, $\operatorname{Prox}_{\gamma G_i}(x)_m = \max\big(0, 1 - \tfrac{\gamma}{\|x[b]\|}\big)\, x_m$.
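The prox of each $G_i$ is block soft thresholding, computable in a few lines (assuming non-overlapping blocks given as lists of indices; the function name is mine):

```python
import numpy as np

def prox_block_l1l2(x, blocks, gamma):
    """Prox of gamma * sum_b ||x[b]|| over non-overlapping blocks:
    each block is scaled by max(0, 1 - gamma/||x[b]||)."""
    out = x.copy()
    for b in blocks:
        nrm = np.linalg.norm(x[b])
        out[b] = max(0.0, 1.0 - gamma / nrm) * x[b] if nrm > 0 else 0.0
    return out
```

A block whose norm is below `gamma` is set entirely to zero, which is how the penalty promotes block sparsity.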
Numerical Illustration
$y = \Phi x_0 + w$, with $\Psi$ = translation-invariant wavelets and $\Phi$ = convolution (or inpainting + convolution).
$\min_x \tfrac{1}{2}\|y - \Phi\Psi x\|^2 + \lambda \sum_i G_i(x)$
[Figure: $x_0$ and the recovered $x^\star$; convergence plot of $\log_{10}(E(x^{(\ell)}) - E(x^\star))$ vs. iterations.]

Numerical Experiments
Deconvolution: $\min_x \tfrac{1}{2}\|Y - K \star \Psi x\|^2 + \lambda \sum_{k=1}^{4} \|x\|_{B_k,1,2}$
[Figure: $\log_{10}(E - E_{\min})$ vs. iteration # for EFB, PR, CP; $t_{\mathrm{EFB}}$ = 161s, $t_{\mathrm{PR}}$ = 173s, $t_{\mathrm{CP}}$ = 190s; $N$ = 256, noise 0.025, $\lambda$ = 1.30e-03, 50 iterations, SNR = 22.49 dB.]
Deconvolution + Inpainting: $\min_x \tfrac{1}{2}\|Y - P_\Omega K \star \Psi x\|^2 + \lambda \sum_{k=1}^{16} \|x\|_{B_k,1,2}$
[Figure: $\log_{10}(E - E_{\min})$ vs. iteration # for EFB, PR, CP; $t_{\mathrm{EFB}}$ = 283s, $t_{\mathrm{PR}}$ = 298s, $t_{\mathrm{CP}}$ = 368s; noise 0.025, degradation 0.4, $\lambda$ = 1.00e-03, 50 iterations, SNR = 21.80 dB.]
Conclusion
Inverse problems in imaging:
• Large scale, $N \sim 10^6$.
• Non-smooth (sparsity, TV, ...).
• (Sometimes) convex.
• Highly structured (separability, $\ell^p$ norms, ...).
Proximal splitting:
• Parallelizable.
• Unravels the structure of problems.
Open problems:
• Less structured problems without smoothness.
• Non-convex optimization.