Compressive Sensing
Gabriel Peyré
www.numerical-tours.com
Overview
•Shannon’s World
•Compressive Sensing Acquisition
•Compressive Sensing Recovery
•Theoretical Guarantees
•Fourier Domain Measurements
Pointwise Sampling and Smoothness

Data acquisition by sensors: $\tilde f \in L^2([0,1]^d) \mapsto f \in \mathbb{R}^N$.
Idealization of the acquisition device (discretization): $f[n] \approx \tilde f(n/N)$.
Pointwise sampling: $f[i] = \tilde f(i/N)$.
Shannon interpolation: if $\mathrm{Supp}(\hat{\tilde f}) \subset [-N\pi, N\pi]$, then
$$\tilde f(t) = \sum_i f[i]\, h(Nt - i), \qquad h(t) = \frac{\sin(\pi t)}{\pi t}.$$
$\Longrightarrow$ Natural images are not smooth.
$\Longrightarrow$ But they can be compressed efficiently (e.g. JPEG-2000: $0,1,0,\dots$).
$\Longrightarrow$ Sample and compress simultaneously?
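A minimal numerical check of the interpolation formula. The sampling step and sinc kernel follow the slide; NumPy's `np.sinc` is the normalized sinc $\sin(\pi x)/(\pi x)$, and the band-limited test signal and $N = 64$ are illustrative choices:

```python
import numpy as np

def shannon_interp(f, t):
    """Shannon interpolation of N samples f[i] = f~(i/N) at times t in [0,1).

    Implements f~(t) = sum_i f[i] h(N t - i) with h(t) = sin(pi t)/(pi t),
    valid when the underlying signal is band-limited to [-N pi, N pi].
    """
    N = len(f)
    i = np.arange(N)
    # np.sinc(x) = sin(pi x) / (pi x), so h(N t - i) = np.sinc(N t - i)
    return np.array([np.sum(f * np.sinc(N * tt - i)) for tt in t])

# A low-frequency sine is reproduced exactly at the sample points,
# since sinc(N * (j/N) - i) = delta_{ij} there.
N = 64
samples = np.sin(2 * np.pi * 3 * np.arange(N) / N)
t = np.arange(N) / N
recon = shannon_interp(samples, t)
```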
Sampling and Periodization
[Figure: panels (a)-(d), sampling and periodization of the spectrum.]

Sampling and Periodization: Aliasing
[Figure: panels (a)-(d), spectral overlap (aliasing) under sub-Nyquist sampling.]
Overview
•Shannon’s World
•Compressive Sensing Acquisition
•Compressive Sensing Recovery
•Theoretical Guarantees
•Fourier Domain Measurements
Single Pixel Camera (Rice)

A micro-mirror array computes $P$ measurements of the scene $\tilde f$:
$$y[i] = \langle f, \varphi_i \rangle, \qquad P \text{ measures} \ll N \text{ micro-mirrors}.$$
[Figure: reconstructions of $\tilde f$ at $P/N = 1$, $P/N = 0.16$, $P/N = 0.02$.]
CS Hardware Model

CS is about designing hardware: input signals $\tilde f \in L^2(\mathbb{R}^2)$.
Physical hardware resolution limit: target resolution $f \in \mathbb{R}^N$.
$$\tilde f \in L^2 \;\xrightarrow{\text{micro-mirror array resolution}}\; f \in \mathbb{R}^N \;\xrightarrow{\text{CS hardware}}\; y \in \mathbb{R}^P$$
The CS hardware acts on the discretized $f$ as a linear operator $K$.
Overview
•Shannon’s World
•Compressive Sensing Acquisition
•Compressive Sensing Recovery
•Theoretical Guarantees
•Fourier Domain Measurements
Inversion and Sparsity

Need to solve $y = Kf$.
$\Longrightarrow$ More unknowns than equations: $\dim(\ker K) = N - P$ is huge.
Prior information: $f$ is sparse in a basis $\{\psi_m\}_m$, i.e.
$$J_\tau(f) = \mathrm{Card}\,\{m : |\langle f, \psi_m \rangle| > \tau\} \text{ is small.}$$
[Figure: image $f$, operator $K$, coefficients $\langle f, \psi_m \rangle$.]
Convex Relaxation: $\ell^1$ Prior

$$J_0(f) = \#\{m : \langle f, \psi_m \rangle \neq 0\}.$$
Image with 2 pixels: $J_0(f) = 0$ is the null image, $J_0(f) = 1$ a sparse image, $J_0(f) = 2$ a non-sparse image.
$\ell^q$ priors (convex for $q \geq 1$):
$$J_q(f) = \sum_m |\langle f, \psi_m \rangle|^q, \qquad \text{shown for } q = 0, \tfrac12, 1, \tfrac32, 2.$$
Sparse $\ell^1$ prior:
$$J_1(f) = \sum_m |\langle f, \psi_m \rangle|.$$
Sparse CS Recovery

$f_0 \in \mathbb{R}^N$ sparse in an ortho-basis $\Psi$: $f_0 = \Psi x_0$ with $x_0 \in \mathbb{R}^N$.
(Discretized) sampling acquisition:
$$y = K f_0 + w = K \Psi x_0 + w = \Phi x_0 + w.$$
$K$ drawn from the Gaussian matrix ensemble: $K_{i,j} \sim \mathcal{N}(0, P^{-1/2})$ i.i.d.
$\Longrightarrow \Phi = K\Psi$ is also drawn from the Gaussian matrix ensemble.
Sparse recovery, constrained or penalized:
$$\min_{\|\Phi x - y\| \leq \|w\|} \|x\|_1 \qquad \text{or} \qquad \min_x \tfrac12 \|\Phi x - y\|^2 + \lambda \|x\|_1, \quad \lambda \sim \|w\|.$$

CS Simulation Example

$\Psi$ = translation-invariant wavelet frame.
[Figure: original $f_0$ and its CS reconstruction.]
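The penalized recovery $\min_x \tfrac12\|\Phi x - y\|^2 + \lambda\|x\|_1$ can be sketched with plain ISTA (iterative soft-thresholding). This is a generic sketch, not the numerical-tours code; the sizes, seed, iteration count, and $\lambda$ are illustrative, and $\Psi = \mathrm{Id}$ so that $\Phi$ is directly the Gaussian matrix:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: N unknowns, P < N measurements, k-sparse x0
N, P, k = 400, 150, 10
Phi = rng.normal(0.0, 1.0 / np.sqrt(P), (P, N))  # K_{i,j} ~ N(0, P^{-1/2}) i.i.d.
x0 = np.zeros(N)
x0[rng.choice(N, k, replace=False)] = rng.normal(size=k)
y = Phi @ x0                                      # noiseless measurements (w = 0)

# ISTA for min_x 0.5 * ||Phi x - y||^2 + lam * ||x||_1
lam = 0.01
L = np.linalg.norm(Phi, 2) ** 2                   # Lipschitz constant of the gradient
x = np.zeros(N)
for _ in range(3000):
    u = x - Phi.T @ (Phi @ x - y) / L             # gradient step on the quadratic term
    x = np.sign(u) * np.maximum(np.abs(u) - lam / L, 0.0)  # soft-thresholding
```

With $P = 150$ Gaussian measurements of a $10$-sparse vector, the iterate lands close to $x_0$ up to the small $\lambda$-induced bias.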
Overview
•Shannon’s World
•Compressive Sensing Acquisition
•Compressive Sensing Recovery
•Theoretical Guarantees
•Fourier Domain Measurements
CS with RIP

Restricted isometry constants: for all $x$ with $\|x\|_0 \leq k$,
$$(1 - \delta_k)\|x\|^2 \leq \|\Phi x\|^2 \leq (1 + \delta_k)\|x\|^2.$$
$\ell^1$ recovery:
$$x^\star \in \operatorname*{argmin}_{\|\Phi x - y\| \leq \varepsilon} \|x\|_1 \qquad \text{where } y = \Phi x_0 + w, \; \|w\| \leq \varepsilon.$$
Theorem [Candès 2009]: If $\delta_{2k} \leq \sqrt{2} - 1$, then
$$\|x_0 - x^\star\| \leq \frac{C_0}{\sqrt{k}} \|x_0 - x_k\|_1 + C_1 \varepsilon,$$
where $x_k$ is the best $k$-term approximation of $x_0$.
Singular Values Distributions

Eigenvalues of $\Phi_I^* \Phi_I$ with $|I| = k$ are essentially in $[a, b]$ where
$$a = (1 - \sqrt{\beta})^2, \qquad b = (1 + \sqrt{\beta})^2, \qquad \beta = k/P.$$
When $k = \beta P \to +\infty$, the eigenvalue distribution tends to [Marcenko-Pastur]
$$f_\beta(\lambda) = \frac{1}{2\pi \beta \lambda} \sqrt{(\lambda - a)_+ (b - \lambda)_+}.$$
Large deviation inequality [Ledoux].
[Figure: empirical eigenvalue histograms against $f_\beta(\lambda)$ for $P = 200$ and $k = 10, 30, 50$.]
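A quick Monte-Carlo check of this concentration. Only $P$ and $k$ match the slide's middle panel; the trial count and the $0.15$ edge margin (allowing for finite-$P$ fluctuations at the spectrum edges) are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(1)

P, k, trials = 200, 30, 200
beta = k / P
a, b = (1 - np.sqrt(beta)) ** 2, (1 + np.sqrt(beta)) ** 2

def mp_density(lam):
    """Marcenko-Pastur limit density f_beta(lambda); vanishes outside [a, b]."""
    return np.sqrt(np.maximum(lam - a, 0) * np.maximum(b - lam, 0)) / (2 * np.pi * beta * lam)

# Eigenvalues of Phi_I^* Phi_I for Gaussian Phi_I with N(0, P^{-1/2}) entries,
# so that columns have unit norm in expectation.
eigs = []
for _ in range(trials):
    Phi_I = rng.normal(0, 1 / np.sqrt(P), (P, k))
    eigs.append(np.linalg.eigvalsh(Phi_I.T @ Phi_I))
eigs = np.concatenate(eigs)

# Essentially all eigenvalues fall in [a, b], up to edge fluctuations
frac_inside = np.mean((eigs > a - 0.15) & (eigs < b + 0.15))
```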
RIP for Gaussian Matrices

Mutual coherence: $\mu(\Phi) = \max_{i \neq j} |\langle \varphi_i, \varphi_j \rangle|$.
Link with coherence: $\delta_k \leq (k-1)\,\mu(\Phi)$, and $\delta_2 = \mu(\Phi)$.
For Gaussian matrices: $\mu(\Phi) \sim \sqrt{\log(PN)/P}$.
Stronger result. Theorem: If
$$k \leq \frac{C\,P}{\log(N/P)},$$
then $\delta_{2k} \leq \sqrt{2} - 1$ with high probability.
Numerics with RIP

Stability constants of a matrix $A$: the smallest $\delta_1(A), \delta_2(A)$ such that
$$(1 - \delta_1(A))\|\alpha\|^2 \leq \|A\alpha\|^2 \leq (1 + \delta_2(A))\|\alpha\|^2,$$
given by the smallest/largest eigenvalues of $A^* A$.
Upper/lower restricted isometry constants:
$$\delta_k^i = \max_{|I| = k} \delta_i(\Phi_I), \qquad \delta_k = \max(\delta_k^1, \delta_k^2).$$
Monte-Carlo estimation over random supports: $\hat\delta_k \leq \delta_k$.
[Figure: estimates $\hat\delta_{2k}$ as a function of $k$ for $N = 4000$, $P = 1000$, compared to the threshold $\sqrt{2} - 1$.]
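The Monte-Carlo estimate $\hat\delta_k$ can be sketched as follows. The sizes are smaller than the slide's $N = 4000$, $P = 1000$ to keep the run fast, and the number of sampled supports is arbitrary:

```python
import numpy as np

rng = np.random.default_rng(2)

def ric_estimate(Phi, k, trials, rng):
    """Monte-Carlo lower bound on the restricted isometry constant delta_k.

    The true constant maximizes the spectral deviation over ALL supports
    |I| = k; sampling random supports only gives delta_hat <= delta_k.
    """
    N = Phi.shape[1]
    delta = 0.0
    for _ in range(trials):
        I = rng.choice(N, k, replace=False)
        ev = np.linalg.eigvalsh(Phi[:, I].T @ Phi[:, I])
        delta = max(delta, 1.0 - ev[0], ev[-1] - 1.0)  # lower / upper deviation
    return delta

N, P = 1000, 250
Phi = rng.normal(0, 1 / np.sqrt(P), (P, N))
d10 = ric_estimate(Phi, 10, trials=200, rng=rng)
```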
Polytopes-based Guarantees

Noiseless recovery:
$$x^\star \in \operatorname*{argmin}_{\Phi x = y} \|x\|_1. \qquad (\mathcal{P}_0(y))$$
$\ell^1$ ball: $B_\alpha = \{x : \|x\|_1 \leq \alpha\}$ with $\alpha = \|x_0\|_1$; example $\Phi = (\varphi_i)_i \in \mathbb{R}^{2 \times 3}$.
$$x_0 \text{ is a solution of } \mathcal{P}_0(\Phi x_0) \iff \Phi x_0 \in \partial\, \Phi(B_\alpha).$$
[Figure: projected polytope $\Phi(B_\alpha)$ with vertices $\pm\varphi_1, \pm\varphi_2, \pm\varphi_3$, and the map $y \mapsto x^\star$.]

$\ell^1$ Recovery in 2-D

For a sign pattern $s$, the quadrant $K_s = \{(\lambda_i s_i)_i \in \mathbb{R}^3 : \lambda_i \geq 0\}$ maps to the 2-D cone $C_s = \Phi K_s$.
[Figure: cone $C_{(0,1,1)}$ and quadrant $K_{(0,1,1)}$ for $\Phi = (\varphi_i)_i \in \mathbb{R}^{2 \times 3}$.]
Polytope Noiseless Recovery

Counting faces of random polytopes [Donoho]:
All $x_0$ such that $\|x_0\|_0 \leq C_{\mathrm{all}}(P/N)\,P$ are identifiable.
Most $x_0$ such that $\|x_0\|_0 \leq C_{\mathrm{most}}(P/N)\,P$ are identifiable.
$$C_{\mathrm{all}}(1/4) \approx 0.065, \qquad C_{\mathrm{most}}(1/4) \approx 0.25.$$
$\Longrightarrow$ Sharp constants (compare the RIP, "all", and "most" regimes).
$\Longrightarrow$ No noise robustness.
$\Longrightarrow$ Computation of "pathological" signals [Dossal, Peyré, Fadili, 2010].
[Figure: phase-transition curves for the RIP, "all", and "most" regimes.]
Overview
•Shannon’s World
•Compressive Sensing Acquisition
•Compressive Sensing Recovery
•Theoretical Guarantees
•Fourier Domain Measurements
Tomography and Fourier Measures

Partial Fourier measurements:
$$Kf = (\hat f[\omega])_{\omega \in \Omega}, \qquad \hat f = \mathrm{FFT2}(f).$$
Fourier slice theorem: the 1-D Fourier transform of a projection is a slice of the 2-D Fourier transform,
$$\hat p_\theta(\rho) = \hat f(\rho \cos\theta, \rho \sin\theta).$$
$\Longrightarrow$ Equivalent to measuring the projections $\{p_{\theta_k}(t)\}_{t \in \mathbb{R},\, 0 \leq k < K}$.
Regularized Inversion

Noisy measurements: $\forall \omega \in \Omega$, $y[\omega] = \hat f_0[\omega] + w[\omega]$.
Noise: $w[\omega] \sim \mathcal{N}(0, \sigma)$, white noise.
$\ell^1$ regularization:
$$f^\star = \operatorname*{argmin}_f \; \frac12 \sum_{\omega \in \Omega} |y[\omega] - \hat f[\omega]|^2 + \lambda \sum_m |\langle f, \psi_m \rangle|.$$
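The partial-Fourier measurement operator $K$ and its adjoint (needed by any iterative solver of the problem above) can be sketched with masked FFTs. The grid size, subsampling ratio, and noise level are placeholders, and the sparsity term is not implemented here:

```python
import numpy as np

rng = np.random.default_rng(3)

n = 64                                    # image is n x n, so N = n^2
N = n * n
P = N // 8                                # keep one eighth of the frequencies
Omega = rng.choice(N, P, replace=False)   # random frequency subset

def K(f):
    """Partial Fourier measurements Kf = (fhat[w])_{w in Omega}."""
    fhat = (np.fft.fft2(f) / n).ravel()   # dividing by n makes the 2-D FFT unitary
    return fhat[Omega]

def K_adj(y):
    """Adjoint of K: zero-fill the unobserved frequencies, then invert."""
    fhat = np.zeros(N, dtype=complex)
    fhat[Omega] = y
    return np.fft.ifft2(fhat.reshape(n, n)) * n

f0 = rng.normal(size=(n, n))
sigma = 0.01
w = sigma * (rng.normal(size=P) + 1j * rng.normal(size=P))
y = K(f0) + w                             # y[w] = fhat0[w] + w[w]
```

Since the normalized FFT is unitary, the adjoint of "subsample after FFT" is exactly "zero-fill before inverse FFT", which the test below verifies via the identity $\langle Kf, y\rangle = \langle f, K^* y\rangle$.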
MRI Imaging (from [Lustig et al.])

Fourier sub-sampling pattern: randomization.

MRI Reconstruction (from [Lustig et al.])

[Figure: high-resolution, low-resolution, linear, and sparsity-based reconstructions.]

Radar Interferometry: CARMA (USA)

Fourier sampling (Earth's rotation); linear reconstruction.
Structured Measurements

Gaussian matrices: intractable for large $N$.
Random partial orthogonal matrix: $\{\varphi_\omega\}_\omega$ an orthogonal basis,
$$Kf = (\langle \varphi_\omega, f \rangle)_{\omega \in \Omega} \qquad \text{where } |\Omega| = P \text{ is drawn uniformly at random.}$$
Fast measurements (e.g. the Fourier basis).
Mutual incoherence:
$$\mu = \sqrt{N} \max_{\omega, m} |\langle \varphi_\omega, \psi_m \rangle| \;\in\; [1, \sqrt{N}].$$
Theorem [Rudelson, Vershynin, 2006]: with high probability on $\Omega$, if
$$M \leq \frac{C\,P}{\mu^2 \log(N)^4},$$
then $\delta_{2M} \leq \sqrt{2} - 1$.
$\Longrightarrow$ Not universal: requires incoherence.
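The two endpoints of $\mu \in [1, \sqrt{N}]$ can be checked directly: the Fourier/Dirac pair is maximally incoherent ($\mu = 1$), while measuring in the sparsity basis itself is the worst case ($\mu = \sqrt{N}$). The helper below is a sketch for orthobases, with $N = 64$ as an arbitrary size:

```python
import numpy as np

N = 64
F = np.fft.fft(np.eye(N)) / np.sqrt(N)   # rows = unitary 1-D DFT vectors phi_w
Dirac = np.eye(N)                         # sparsity basis psi_m = Dirac spikes

def incoherence(Phi_rows, Psi_cols):
    """mu = sqrt(N) * max_{w,m} |<phi_w, psi_m>| for two orthobases."""
    n = Phi_rows.shape[0]
    return np.sqrt(n) * np.max(np.abs(Phi_rows.conj() @ Psi_cols))

# All DFT entries have modulus 1/sqrt(N), hence mu = 1 (maximal incoherence).
mu_fourier_dirac = incoherence(F, Dirac)
# Sparsity basis = the measurement basis itself: Gram matrix is the identity,
# so the largest inner product is 1 and mu = sqrt(N) (worst case).
mu_same_basis = incoherence(F, F.T)
```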
Conclusion

Sparsity: approximate signals with few atoms, $\Phi = K\Psi$ with a dictionary $\Psi$.
Compressed sensing ideas:
$\Longrightarrow$ Randomized sensors + sparse recovery.
$\Longrightarrow$ Number of measurements $\sim$ signal complexity.
$\Longrightarrow$ CS is about designing new hardware.
The devil is in the constants:
$\Longrightarrow$ Worst-case analysis is problematic.
$\Longrightarrow$ Designing good signal models.
Some Hot Topics

Dictionary learning: learn the dictionary $D$ from exemplars, so that an image decomposes as $f = Dx$ with sparse coefficients $x$.
[Figure: dictionaries with 256 atoms learned on natural color image patches, and color-restoration results, from Mairal et al., "Sparse Representation for Color Image Restoration".]
Analysis vs. synthesis priors:
$$J_s(f) = \min_{f = Dx} \|x\|_1 \quad \text{(synthesis)}, \qquad J_a(f) = \|D^* f\|_1 \quad \text{(analysis)}.$$
MAIRAL et al.: SPARSE REPRESENTATION FOR COLOR IMAGE RESTORATION 57
Fig. 2. Dictionaries with 256 atoms learned on a generic database of natural images, with two different sizes of patches. Note the large number of color-less atoms.Since the atoms can have negative values, the vectors are presented scaled and shifted to the [0,255] range per channel: (a) 5 5 3 patches; (b) 8 8 3 patches.
Fig. 3. Examples of color artifacts while reconstructing a damaged version of the image (a) without the improvement here proposed ( in the new metric).Color artifacts are reduced with our proposed technique ( in our proposed new metric). Both images have been denoised with the same global dictionary.In (b), one observes a bias effect in the color from the castle and in some part of the water. What is more, the color of the sky is piecewise constant when(false contours), which is another artifact our approach corrected. (a) Original. (b) Original algorithm, dB. (c) Proposed algorithm,
dB.
Fig. 4. (a) Training Image; (b) resulting dictionary; (b) is the dictionary learned in the image in (a). The dictionary is more colored than the global one.
MAIRAL et al.: SPARSE REPRESENTATION FOR COLOR IMAGE RESTORATION 57
We
definethen
thepatch
sparsityof
thedecom
po-sition
asthis
number
ofsteps.T
hestopping
criteriain
(2)be-
comes
thenum
berof
atoms
usedinstead
ofthe
reconstructionerror.U
singa
small
duringthe
OM
Pperm
itsto
learna
dic-tionary
specializedin
providinga
coarseapproxim
ation.O
urassum
ptionis
that(pattern)
artifactsare
lesspresent
incoarse
approximations,preventing
thedictionary
fromlearning
them.
We
proposethen
thealgorithm
describedin
Fig.6.We
typicallyused
toprevent
thelearning
ofartifacts
andfound
outthattw
oouteriterations
inthe
scheme
inFig.6
aresufficientto
givesatisfactory
results,while
within
theK
-SVD
,10–20itera-
tionsare
required.To
conclude,inorderto
addressthedem
osaicingproblem
,we
usethe
modified
K-SV
Dalgorithm
thatdealsw
ithnonuniform
noise,asdescribed
inprevious
section,andadd
toitan
adaptivedictionary
thathasbeen
learnedw
ithlow
patchsparsity
inorder
toavoid
over-fittingthe
mosaic
pattern.The
same
techniquecan
beapplied
togeneric
colorinpainting
asdem
onstratedin
thenextsection.
V.
EX
PER
IME
NTA
LR
ESU
LTS
We
arenow
readyto
presentthe
colorim
agedenoising,in-
painting,anddem
osaicingresultsthatare
obtainedw
iththe
pro-posed
framew
ork.
A.
Denoising
Color
Images
The
state-of-the-artperform
anceof
thealgorithm
ongrayscale
images
hasalready
beenstudied
in[2].
We
nowevaluate
ourextension
forcolor
images.
We
trainedsom
edictionaries
with
differentsizesof
atoms
55
3,66
3,7
73
and8
83,
on200
000patches
takenfrom
adatabase
of15
000im
agesw
iththe
patch-sparsityparam
eter(six
atoms
inthe
representations).We
usedthe
databaseL
abelMe
[55]to
buildour
image
database.T
henw
etrained
eachdictionary
with
600iterations.
This
providedus
aset
ofgeneric
dictionariesthat
we
usedas
initialdictionaries
inour
denoisingalgorithm
.C
omparing
theresults
obtainedw
iththe
globalapproach
andthe
adaptiveone
permits
usto
seethe
improvem
entsin
thelearning
process.W
echose
toevaluate
[Embedded page from MAIRAL et al., "Sparse Representation for Color Image Restoration" (p. 61):]
Fig. 7. Data set used for evaluating denoising experiments.
Table I. PSNR results of the denoising algorithm with 256 atoms of size 7×7×3 and 6×6×3. Each case compares McAuley et al. [28] (their "3×3 model"), the grayscale K-SVD [2] applied to each channel separately with 8×8 atoms, a globally trained dictionary, and the adaptive approach with 20 iterations; bold marks the best result per group, and the proposed technique consistently produces the best results.
Table II. PSNR comparison on the image "castle" between [28] and 256-atom dictionaries of 6×6×3 and 7×7×3 patches (adaptive approach, 20 iterations); bold marks the best result, again showing the consistent improvement of the proposed technique.
Text: the patch sparsity of a decomposition is defined as the number of OMP steps; the stopping criterion in (2) becomes the number of atoms used rather than the reconstruction error. Using a small sparsity during OMP yields a dictionary specialized in coarse approximations, where (pattern) artifacts are less present, which prevents the dictionary from learning them; two outer iterations of the scheme in Fig. 6 suffice, with 10 to 20 iterations inside the K-SVD. For demosaicing, the modified K-SVD that handles nonuniform noise is combined with an adaptive dictionary learned with low patch sparsity to avoid over-fitting the mosaic pattern; the same technique applies to generic color inpainting.
V. Experimental Results. A. Denoising color images: dictionaries with atoms of size 5×5×3, 6×6×3, 7×7×3, and 8×8×3 were trained for 600 iterations on 200,000 patches from a 15,000-image database built from LabelMe [55], with six atoms per representation; these generic dictionaries initialize the denoising algorithm, and comparing the global and adaptive approaches shows the improvement brought by learning.
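The "patch sparsity" above is simply the number of greedy steps in orthogonal matching pursuit. As a rough sketch of the idea (a generic NumPy OMP with a fixed atom budget, not the paper's implementation):

```python
import numpy as np

def omp(D, y, n_atoms):
    """Orthogonal matching pursuit: greedily pick `n_atoms` columns of D
    (unit-norm atoms), refitting the coefficients by least squares."""
    residual = y.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(n_atoms):
        k = int(np.argmax(np.abs(D.T @ residual)))  # most correlated atom
        if k not in support:
            support.append(k)
        # re-fit on the whole support, then update the residual
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    x = np.zeros(D.shape[1])
    x[support] = coef
    return x

rng = np.random.default_rng(0)
D = rng.standard_normal((20, 50))
D /= np.linalg.norm(D, axis=0)         # normalize the atoms
x_true = np.zeros(50)
x_true[[3, 17]] = [1.5, -2.0]          # a 2-sparse signal
y = D @ x_true
x_hat = omp(D, y, n_atoms=2)           # "patch sparsity" = 2 steps
print("relative error:", np.linalg.norm(D @ x_hat - y) / np.linalg.norm(y))
```

Stopping after a fixed number of atoms (rather than at a target reconstruction error) is exactly the change that keeps the learned dictionary coarse.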
Image f = Φx
Coefficients x,   c = D* f
(Φ: synthesis dictionary, D*: analysis operator)
[Embedded figure and excerpt from Chandrasekaran et al., "The Convex Geometry of Linear Inverse Problems":]
Figure 1: Unit balls of some atomic norms. In each panel the set of atoms is drawn in red and the unit ball of the associated atomic norm in blue. (a) The atoms are the unit-Euclidean-norm one-sparse vectors; the atomic norm is the ℓ1 norm. (b) The atoms are the 2×2 symmetric unit-Euclidean-norm rank-one matrices; the atomic norm is the nuclear norm. (c) The atoms are the vectors {−1, +1}²; the atomic norm is the ℓ∞ norm.
Excerpt: the convex hull of the unit-Euclidean-norm one-sparse vectors is the ℓ1 unit ball (the cross-polytope), and the convex hull of the unit-norm rank-one matrices is the nuclear-norm ball. Under suitable conditions, conv(A) is the unit ball of a norm, the atomic norm induced by the atomic set A; minimizing it subject to measurement constraints gives a convex-programming heuristic for recovering simple models from linear measurements (e.g., the Birkhoff polytope of doubly stochastic matrices for sums of permutation matrices [73], the spectral-norm ball for orthogonal matrices), and it is in some sense the best convex heuristic for a given atomic set. The number of generic measurements required for exact and robust recovery is bounded via Gaussian widths of tangent cones [37], exploiting symmetry structure and convex duality.
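Panel (c) can be checked numerically: the gauge of the convex hull of the four sign vectors is computable by a small linear program and matches the ℓ∞ norm. This is my own illustration (using scipy.optimize.linprog), not code from the paper:

```python
import numpy as np
from scipy.optimize import linprog

# Atoms of panel (c): the four sign vectors of {-1, +1}^2, as columns.
A = np.array([[1, 1], [1, -1], [-1, 1], [-1, -1]], dtype=float).T  # shape (2, 4)

def atomic_norm(v):
    """Gauge of conv(atoms): min sum(lam) s.t. A @ lam = v, lam >= 0."""
    res = linprog(c=np.ones(A.shape[1]), A_eq=A, b_eq=v, bounds=(0, None))
    assert res.success
    return res.fun

v = np.array([0.3, -0.9])
print(atomic_norm(v), np.linalg.norm(v, np.inf))  # both equal 0.9
```

The same LP formulation works for any finite, symmetric atom set; for continuous atom sets (rank-one matrices, orthogonal matrices) the atomic norm is evaluated by other means (SVD for the nuclear and spectral norms).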
Dictionary learning: learn Φ from exemplar images.
Analysis vs. synthesis:
    Ja(f) = ||D* f||1
    Js(f) = min_{f = Φx} ||x||1
Some Hot Topics
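For an orthogonal dictionary the analysis and synthesis priors coincide, since x = Φ* f is the unique representation of f; a quick numerical check (my own sketch, with a random orthogonal Φ):

```python
import numpy as np

rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((8, 8)))  # random orthogonal "dictionary" Phi

f = rng.standard_normal(8)
Ja = np.sum(np.abs(Q.T @ f))   # analysis prior   Ja(f) = ||Phi* f||_1
x = Q.T @ f                    # unique coefficients with f = Phi x
Js = np.sum(np.abs(x))         # synthesis prior  Js(f) = min ||x||_1
assert np.isclose(Ja, Js)
print("Ja(f) = Js(f) =", round(Ja, 4))
```

For a redundant Φ the two priors differ, and evaluating Js requires solving a basis-pursuit problem rather than a single matrix product.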
Other sparse priors:
Image f = Φx
Coefficients x,   c = D* f
(Φ: synthesis dictionary, D*: analysis operator)
|x1| + |x2|        max(|x1|, |x2|)
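These two priors are dual norms of each other: ||x||_1 = max over ||u||_∞ ≤ 1 of ⟨u, x⟩, attained at u = sign(x). A two-line check (the example values are mine):

```python
import numpy as np

x = np.array([0.5, -2.0])
l1 = np.abs(x).sum()     # |x1| + |x2|      -> 2.5
linf = np.abs(x).max()   # max(|x1|, |x2|)  -> 2.0

# Duality: the linear functional <u, x> over the l-infinity unit ball
# is maximized at the sign pattern of x, and the maximum is ||x||_1.
u = np.sign(x)
assert np.isclose(u @ x, l1)
print(l1, linf)  # 2.5 2.0
```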
Dictionary learning: learn Φ from exemplar images.
Analysis vs. synthesis:
    Ja(f) = ||D* f||1
    Js(f) = min_{f = Φx} ||x||1
Mixed ℓ1/ℓ2 norm (group sparsity): |x1| + (x2² + x3²)^{1/2}

Some Hot Topics
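The mixed ℓ1/ℓ2 prior groups x2 and x3 into one block; its proximal map is block soft-thresholding, which shrinks each group's length rather than each entry separately. A minimal sketch (group indices and threshold are illustrative):

```python
import numpy as np

def prox_group_l1(x, groups, t):
    """Prox of t * sum_g ||x_g||_2: block soft-thresholding. Each group's
    vector is scaled by (1 - t / ||x_g||), or zeroed if ||x_g|| <= t."""
    out = np.zeros_like(x)
    for g in groups:
        norm = np.linalg.norm(x[g])
        if norm > t:
            out[g] = (1.0 - t / norm) * x[g]
    return out

groups = [[0], [1, 2]]               # matching |x1| + sqrt(x2^2 + x3^2)
x = np.array([0.4, 3.0, 4.0])        # the (x2, x3) block has norm 5
y = prox_group_l1(x, groups, t=1.0)
# x1 is zeroed (0.4 <= 1); the block is scaled by 1 - 1/5 = 0.8
print(y)
```

Entries inside a surviving group stay jointly nonzero, which is exactly the structured sparsity this prior promotes.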
MAIRAL et al.: SPARSE REPRESENTATION FOR COLOR IMAGE RESTORATION 57
Fig. 2. Dictionaries with 256 atoms learned on a generic database of natural images, with two different sizes of patches. Note the large number of color-less atoms.Since the atoms can have negative values, the vectors are presented scaled and shifted to the [0,255] range per channel: (a) 5 5 3 patches; (b) 8 8 3 patches.
Fig. 3. Examples of color artifacts while reconstructing a damaged version of the image (a) without the improvement here proposed ( in the new metric).Color artifacts are reduced with our proposed technique ( in our proposed new metric). Both images have been denoised with the same global dictionary.In (b), one observes a bias effect in the color from the castle and in some part of the water. What is more, the color of the sky is piecewise constant when(false contours), which is another artifact our approach corrected. (a) Original. (b) Original algorithm, dB. (c) Proposed algorithm,
dB.
Fig. 4. (a) Training Image; (b) resulting dictionary; (b) is the dictionary learned in the image in (a). The dictionary is more colored than the global one.
[Screenshot: MAIRAL et al., "Sparse Representation for Color Image Restoration," p. 61]

Fig. 7. Data set used for evaluating denoising experiments.

TABLE I. PSNR results of our denoising algorithm with 256 atoms of size 7x7x3 and 6x6x3. Each case is divided in four parts: the top-left results are those given by McAuley et al. [28] with their "3x3 model." The top-right results are those obtained by applying the grayscale K-SVD algorithm [2] on each channel separately with 8x8 atoms. The bottom-left are our results obtained with a globally trained dictionary. The bottom-right are the improvements obtained with the adaptive approach with 20 iterations. Bold indicates the best results for each group. As can be seen, our proposed technique consistently produces the best results.

TABLE II. Comparison of the PSNR results on the image "Castle" between [28] and what we obtained with 256 atoms of 6x6x3 and 7x7x3 patches. For the adaptive approach, 20 iterations have been performed. Bold indicates the best result, indicating once again the consistent improvement obtained with our proposed technique.
patch), in order to prevent any learning of these artifacts (over-fitting). We then define the patch sparsity of the decomposition as this number of steps. The stopping criterion in (2) becomes the number of atoms used instead of the reconstruction error. Using a small patch sparsity during the OMP makes it possible to learn a dictionary specialized in providing a coarse approximation. Our assumption is that (pattern) artifacts are less present in coarse approximations, preventing the dictionary from learning them. We then propose the algorithm described in Fig. 6. We typically used a small patch sparsity to prevent the learning of artifacts, and found that two outer iterations in the scheme in Fig. 6 are sufficient to give satisfactory results, while within the K-SVD, 10-20 iterations are required.
To conclude, in order to address the demosaicing problem, we use the modified K-SVD algorithm that deals with nonuniform noise, as described in the previous section, and add to it an adaptive dictionary that has been learned with low patch sparsity in order to avoid over-fitting the mosaic pattern. The same technique can be applied to generic color inpainting, as demonstrated in the next section.
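The "patch sparsity" stopping rule described above amounts to running orthogonal matching pursuit (OMP) for a fixed number of atoms rather than until a reconstruction-error target. A minimal OMP sketch under that rule; the dictionary and test signal are illustrative assumptions, not the paper's data:

```python
import numpy as np

def omp(D, y, n_atoms):
    """Greedy sparse coding: select n_atoms columns of D to approximate y,
    stopping after n_atoms steps (the "patch sparsity") regardless of error."""
    residual = y.copy()
    support = []
    x = np.zeros(D.shape[1])
    for _ in range(n_atoms):
        # Pick the atom most correlated with the current residual.
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j not in support:
            support.append(j)
        # Least-squares refit on the current support, then update residual.
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    x[support] = coef
    return x

rng = np.random.default_rng(1)
D = rng.standard_normal((50, 80))
D /= np.linalg.norm(D, axis=0)            # unit-norm atoms
y = 2.0 * D[:, 5] - 1.5 * D[:, 40]        # exactly 2-sparse in D
x = omp(D, y, n_atoms=2)
```

With an exactly 2-sparse input and an incoherent random dictionary, two greedy steps typically recover the true support; in the paper's setting the same loop would simply be capped at the chosen patch sparsity.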
V. EXPERIMENTAL RESULTS

We are now ready to present the color image denoising, inpainting, and demosaicing results obtained with the proposed framework.

A. Denoising Color Images

The state-of-the-art performance of the algorithm on grayscale images has already been studied in [2]. We now evaluate our extension for color images. We trained dictionaries with different sizes of atoms (5x5x3, 6x6x3, 7x7x3, and 8x8x3) on 200,000 patches taken from a database of 15,000 images, with the patch-sparsity parameter set to six atoms in the representations. We used the LabelMe database [55] to build our image database. Then we trained each dictionary with 600 iterations. This provided us with a set of generic dictionaries that we used as initial dictionaries in our denoising algorithm. Comparing the results obtained with the global approach and the adaptive one permits us to see the improvements in the learning process. We chose to evaluate
Other sparse priors:

Synthesis: image f = Φx from coefficients x. Analysis: correlations c = D* f.

Example unit balls: |x1| + |x2| (the ℓ1 norm) and max(|x1|, |x2|) (the ℓ∞ norm).
Figure 1: Unit balls of some atomic norms. In each figure, the set of atoms is graphed in red and the unit ball of the associated atomic norm is graphed in blue. In (a), the atoms are the unit-Euclidean-norm one-sparse vectors, and the atomic norm is the ℓ1 norm. In (b), the atoms are the 2x2 symmetric unit-Euclidean-norm rank-one matrices, and the atomic norm is the nuclear norm. In (c), the atoms are the vectors {-1, +1}^2, and the atomic norm is the ℓ∞ norm.
Is there a natural procedure to go from the set of one-sparse vectors A to the ℓ1 norm? We observe that the convex hull of (unit-Euclidean-norm) one-sparse vectors is the unit ball of the ℓ1 norm, or the cross-polytope. Similarly, the convex hull of the (unit-Euclidean-norm) rank-one matrices is the nuclear norm ball; see Figure 1 for illustrations. These constructions suggest a natural generalization to other settings. Under suitable conditions the convex hull conv(A) defines the unit ball of a norm, which is called the atomic norm induced by the atomic set A. We can then minimize the atomic norm subject to measurement constraints, which results in a convex programming heuristic for recovering simple models given linear measurements. As an example, suppose we wish to recover the sum of a few permutation matrices given linear measurements. The convex hull of the set of permutation matrices is the Birkhoff polytope of doubly stochastic matrices [73], and our proposal is to solve a convex program that minimizes the norm induced by this polytope. Similarly, if we wish to recover an orthogonal matrix from linear measurements, we would solve a spectral norm minimization problem, as the spectral norm ball is the convex hull of all orthogonal matrices. As discussed in Section 2.5, the atomic norm minimization problem is, in some sense, the best convex heuristic for recovering simple models with respect to a given atomic set.
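The claim that the one-sparse atoms induce the ℓ1 norm can be checked numerically: for a finite atomic set, the atomic norm of x is the smallest total weight in a conic combination of atoms reaching x, which is a linear program. A small sketch with atoms {±e_i}; the example vector and the use of scipy's LP solver are illustrative assumptions:

```python
import numpy as np
from scipy.optimize import linprog

def atomic_norm(x, atoms):
    """min sum(c) subject to atoms @ c = x, c >= 0 (atoms as columns).
    This is the gauge of conv(atoms), i.e. the atomic norm for finite A."""
    n_atoms = atoms.shape[1]
    res = linprog(c=np.ones(n_atoms), A_eq=atoms, b_eq=x,
                  bounds=[(0, None)] * n_atoms, method="highs")
    return res.fun

d = 4
atoms = np.hstack([np.eye(d), -np.eye(d)])   # the atoms {+e_i, -e_i}
x = np.array([1.5, -2.0, 0.0, 0.5])
val = atomic_norm(x, atoms)                  # matches ||x||_1 = 4.0
```

The same LP formulation works for any finite atomic set, e.g. the vertices {-1, +1}^2 of panel (c) in Figure 1, where the optimal value reproduces the ℓ∞ norm instead.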
We give general conditions for exact and robust recovery using the atomic norm heuristic. In Section 3 we provide concrete bounds on the number of generic linear measurements required for the atomic norm heuristic to succeed. This analysis is based on computing certain Gaussian widths of tangent cones with respect to the unit balls of the atomic norm [37]. Arguments based on Gaussian width have been fruitfully applied to obtain bounds on the number of Gaussian measurements for the special case of recovering sparse vectors via ℓ1 norm minimization [64, 67], but computing Gaussian widths of general cones is not easy. Therefore it is important to exploit the special structure in atomic norms, while still obtaining sufficiently general results that are broadly applicable. An important theme in this paper is the connection between Gaussian widths and various notions of symmetry. Specifically, by exploiting symmetry structure in certain atomic norms as well as convex duality properties, we give bounds on the number of measurements required for recovery using very general atomic norm heuristics. For example, we provide precise estimates of the number of generic measurements required for exact recovery of an orthogonal matrix via spectral norm minimization, and the number of generic measurements required for exact recovery of a permutation matrix by minimizing the norm induced by the Birkhoff polytope. While these results correspond
Nuclear