Iterative Methods in Statistical Estimation

Mohsen Bayati, David Donoho, Adel Javanmard, Iain Johnstone, Marc Lelarge, Arian Maleki, Andrea Montanari

Stanford University

September 6, 2012
Statistical estimation

y = f(θ; noise)

θ → unknown object
y → observations
f( · ; noise) → parametric model

Problem: estimate θ from the observations y.
Example: Statistical network analysis

θ → membership of nodes in `communities'
y → graph

[Newman, 2012]
Example: Exploration seismology

θ → density field in the earth
y → seismographic measurements

[Herrmann, 2012]
A broad convergence

- Statistics [Genomics, …]
- Data mining [Collaborative filtering, Predictive analytics, …]
- Signal processing [Compressive sampling, …]
- Inverse problems [Medical imaging, Seismographic imaging, …]

+ Data, + Computation, Exploit hidden structure
How should we think about these problems?

Information theory?

θ → NOISY CHANNEL → y = f(θ; noise)

Fundamental limits, no algorithm
How should we think about these problems?

Optimization?

maximize Likelihood(θ | y) − Complexity(θ)

Efficient (convex) algorithms, difficult statistical theory
How should we think about these problems?

Iterative methods?

y → θ̂₁ → θ̂₂ → θ̂₃ → …

- Each step → one matrix-vector multiplication
- A few steps (say ≤ 20)
- What can we achieve?
Outline

- An example (algorithm + heuristics)
- A couple of theorems
- Generalizations and open problems
A long example
What type of example?

- Image processing (because they make nice figures)
- Compressed sensing (simpler/cleaner)
Which image?
Examples appearing in the literature

[Images: Lena, Cameraman, Barbara, Fabio]
Better someone who is familiar to everybody

[Photos: Suhas, Claude, Rüdi]

Who's the most handsome?
Better someone who is familiar to everybody

θ = [image] ∈ ℂⁿ

Unknown object (n = 512² ≈ 2.5 × 10⁵)
Noiseless linear measurements

y = A θ = A · [image]

Want to reconstruct θ.
Measurement structure

A = S F R

F = Fourier transform
S = random subsampling matrix (each row selects one coordinate; sampling rate δ = 0.15)
R = diag(+1, −1, −1, +1, +1, −1, …) = random modulation

→ y ∈ ℂᵐ, m = 0.15 n
Measurement structure

A = F̃ R,   F̃ = S F = subsampled Fourier matrix
R = diag(+1, −1, −1, +1, +1, −1, …) = random modulation

→ y ∈ ℂᵐ, m = 0.15 n
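A sketch of this measurement operator in Python/NumPy (not from the talk; the helper names and the unnormalized-FFT convention are assumptions). One application of A or A† costs a single FFT, i.e. O(n log n):

    import numpy as np

    rng = np.random.default_rng(0)
    n = 512 * 512                                # signal dimension
    delta = 0.15                                 # sampling rate
    m = int(delta * n)                           # number of measurements

    R = rng.choice([-1.0, 1.0], size=n)          # random modulation, R = diag(±1)
    rows = rng.choice(n, size=m, replace=False)  # random row subsampling S

    def A_fwd(theta):
        # A θ = S F R θ : modulate, take an (unnormalized) DFT, keep m random frequencies
        return np.fft.fft(R * theta)[rows]

    def A_adj(v):
        # A† v = R F† S† v ; np.fft.ifft computes F†/n, hence the factor n
        full = np.zeros(n, dtype=complex)
        full[rows] = v
        return R * (n * np.fft.ifft(full))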
Constructing a first estimate

y = A θ

Matched filter (≈ pseudoinverse):

θ̂₁ = N⁻¹ A† y,   N = diag(Nᵢᵢ = ‖i-th column of A‖₂²)

For this A every column has squared norm m, so N = m · I and

θ̂₁ = (1/m) A† y
How good is this?

E θ̂₁ = (1/m) E{A† y}
     = (1/m) E{A† A} θ
     = (1/m) E{R F† S† S F R} θ
     = (1/m) E{R F† (δ I) F R} θ
     = (1/m) R (n δ I) R θ = (nδ/m) θ = θ

(using E{S†S} = δ I, F†F = n I, and R² = I)

Will redefine A ← A/√m.
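This unbiasedness claim is easy to probe numerically; a minimal Monte Carlo sketch (small n and the parameter choices are assumptions), averaging θ̂₁ = (1/m) A† y over the randomness of S and R:

    import numpy as np

    rng = np.random.default_rng(1)
    n, delta = 64, 0.25
    m = int(delta * n)
    theta = rng.standard_normal(n)

    est = np.zeros(n, dtype=complex)
    trials = 20000
    for _ in range(trials):
        R = rng.choice([-1.0, 1.0], size=n)
        rows = rng.choice(n, size=m, replace=False)
        y = np.fft.fft(R * theta)[rows]             # y = A θ
        full = np.zeros(n, dtype=complex)
        full[rows] = y
        est += R * (n * np.fft.ifft(full)) / m      # accumulate θ̂₁ = (1/m) A† y
    est /= trials
    print(np.abs(est - theta).max())                # close to 0, consistent with E θ̂₁ = θ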
Check it out

θ̂₁ = A† y = [image]    θ = [image]

Does not look that good!
Idea

[θ̂₁ image] = [θ image] + [`noise' image]
How big is the `noise'? (take w.l.o.g. R = I, and normalize F so that F†F = I, hence A†A = (1/δ) F† S†S F)

θ̂₁ − θ = (A†A − I) θ = (1/δ) F† (S†S − E{S†S}) F θ

Hence

E{‖θ̂₁ − θ‖₂²} = (1/δ²) E{θ† F† (S†S − E{S†S})² F θ} = (1/δ²) δ(1−δ) ‖Fθ‖₂² = ((1−δ)/δ) ‖θ‖₂²

(using E{(S†S − δI)²} = δ(1−δ) I and ‖Fθ‖₂ = ‖θ‖₂)
Matched filter blows up noise

MSE_out = ((1−δ)/δ) · MSE_in
Let's check

[Plot: MSE_out vs. MSE_in for the matched filter]
Noise distribution?

[Histogram of the error θ̂₁ − θ (x-axis: error, y-axis: frequency)]
Denoising

Statistical estimation with

y = f(θ; noise) = θ + σ z,   zᵢ ∼ N(0, 1)

Idea: treat θ̂₁ as effective observations in denoising.
Denoising by nonlocal means

y = θ + σ z,

θ̂ᵢ = Σⱼ W(i, j) yⱼ / Σⱼ W(i, j),

W(i, j) = 1 if ‖Patch(i; y) − Patch(j; y)‖₂² ≤ const · σ², and 0 otherwise

[Buades, Coll, Morel, 2005]

θ̂ ≡ η(y)
Patches

[Figure: an image with two patches, centered at pixels i and j]
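A minimal sketch of this 0/1-weight nonlocal means for a 1D signal (the talk uses 2D images; the patch half-width and the threshold constant are assumptions, hand-tuned for the demo):

    import numpy as np

    def nl_means(y, sigma, half=3, c=3.0):
        """0/1-weight nonlocal means: average samples whose patches match."""
        n = len(y)
        ypad = np.pad(y, half, mode='reflect')
        # patches[i] = Patch(i; y), a window of width 2*half+1 around sample i
        patches = np.stack([ypad[i:i + 2 * half + 1] for i in range(n)])
        # W(i,j) = 1 if ||Patch(i;y) - Patch(j;y)||_2^2 <= c*(2*half+1)*sigma^2
        d2 = ((patches[:, None, :] - patches[None, :, :]) ** 2).sum(-1)
        W = (d2 <= c * (2 * half + 1) * sigma ** 2).astype(float)
        return (W @ y) / W.sum(axis=1)              # weighted average per sample

    # demo: piecewise-constant signal in Gaussian noise
    sigma = 0.5
    x = np.sign(np.sin(np.linspace(0, 6 * np.pi, 400)))
    y = x + sigma * np.random.default_rng(2).standard_normal(400)
    xhat = nl_means(y, sigma)                       # visibly closer to x than y is

The all-pairs comparison is quadratic in n, which is fine for a demo; real implementations restrict j to a search window around i.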
Will it work?

θ̂₂ = η(θ̂₁) = η(A† y) ≈ θ ?
Let's try

θ̂₁ = A† y = [image]    θ̂₂ = η(A† y) = [image]

Better than garbage!
How much better?

[Plot: MSE_out vs. MSE_in for the denoiser, against reference curves c₁ · x and c₂ · √x (which one fits?)]
Let us repeat the denoising experiment: y = θ + σ z

[Image grid: noisy y (top row) and denoised η(y) (bottom row), for σ = 1, 0.5, 0.25, 0.12]
Quantitatively

[Plot: MSE_out vs. MSE_in with reference curves c₁ · x and c₂ · √x]
How much better?

[Same data on log-log axes, MSE_in from 0.001 to 10, with curves c₁ · x and c₂ · √x]
Approximate denoiser characterization

MSE_out = c · √(MSE_in)

(enough for our purposes)

Theorem (Maleki, Baraniuk, Narayan, 2012, informal)
The minimax risk of nonlocal means satisfies

inf over tuning params, sup over images: MSE = σ · Poly(log σ)
What we achieved so far

[Side-by-side plots of MSE_out vs. MSE_in: matched filter (linear blow-up) and denoiser (square-root reduction)]
What we achieved so far

[Plot: MSE_new vs. MSE_old after one matched-filter + denoising round]
What we achieved so far

[Same plot on log-log axes: MSE_new vs. MSE_old, from 0.001 to 100]

What about iterating?
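Why iterating might plausibly help, as a back-of-envelope sketch (not from the talk; it simply composes the two empirical curves above):

\[
\mathrm{MSE}_{t+1} \;=\; c\,\sqrt{\frac{1-\delta}{\delta}\,\mathrm{MSE}_t}
\qquad\Longrightarrow\qquad
\mathrm{MSE}_t \;\to\; \mathrm{MSE}_\ast \;=\; c^2\,\frac{1-\delta}{\delta},
\]

since the square-root map is a contraction. So alternating the two steps at least cannot blow up; how small the limiting error actually is, is what the experiments below probe.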
How do we iterate?

θ̂₁ = A† y
θ̂₂ = η(θ̂₁)
θ̂₃ = ???

Since A(θ − θ̂₂) = y − A θ̂₂, we get θ − θ̂₂ ≈ A†(y − A θ̂₂).
How do we iterate?

θ̂₁ = A† y
θ̂₂ = η(θ̂₁)
θ̂₃ = θ̂₂ + A†(y − A θ̂₂)
θ̂₄ = η(θ̂₃)
θ̂₅ = θ̂₄ + A†(y − A θ̂₄)
θ̂₆ = η(θ̂₅)
⋯
For t = 1, 2, 3, …, 20:

θ̂_{2t} = η(θ̂_{2t−1})
θ̂_{2t+1} = θ̂_{2t} + A†(y − A θ̂_{2t})
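A runnable sketch of this loop, composing the two updates into θ̂_{t+1} = η(θ̂_t + A†(y − A θ̂_t)). Gaussian A, a sparse synthetic θ, and soft thresholding standing in for the image denoiser η are all assumptions made to keep the example self-contained:

    import numpy as np

    rng = np.random.default_rng(3)
    n, m, k = 2000, 600, 50
    A = rng.standard_normal((m, n)) / np.sqrt(m)   # columns have unit norm on average
    theta = np.zeros(n)
    theta[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
    y = A @ theta                                  # noiseless measurements

    def eta(v, lam):
        # soft thresholding, a simple stand-in denoiser
        return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

    x = np.zeros(n)
    for t in range(20):
        r = y - A @ x                              # residual
        sigma = np.linalg.norm(r) / np.sqrt(m)     # crude effective-noise estimate
        x = eta(x + A.T @ r, 1.5 * sigma)          # denoise the effective observation
    print(np.linalg.norm(x - theta) / np.linalg.norm(theta))

The threshold schedule 1.5·σ̂ is a heuristic; the point here is only the shape of the loop.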
t = 1: θ̂₁ = [image]
t = 2: θ̂₂ = [image]
t = 3: θ̂₃ = [image], [log-log plot of MSE_new vs. MSE_old up to this iterate]
t = 4: θ̂₄ = [image], [plot]
t = 5: θ̂₅ = [image], [plot]
t = 6: θ̂₆ = [image], [plot]
t = 7: θ̂₇ = [image], [plot]
t = 8: θ̂₈ = [image], [plot]
t = 9: θ̂₉ = [image], [plot]

t = 0, 1, 2, 3, …, 20

[Final log-log plot: MSE_new vs. MSE_old across all iterations]
Well, in reality I cheated

Instead of this:

θ̂_{2t} = η(θ̂_{2t−1})
θ̂_{2t+1} = θ̂_{2t} + A†(y − A θ̂_{2t})

I used this (for bₜ ∈ ℂ):

θ̂_{2t} = η(θ̂_{2t−1})
θ̂_{2t+1} = θ̂_{2t} + A† rₜ
rₜ = y − A θ̂_{2t} + bₜ rₜ₋₁
Approximate Message Passing (AMP)

θ̂_{2t} = η(θ̂_{2t−1})
θ̂_{2t+1} = θ̂_{2t} + A† rₜ
rₜ = y − A θ̂_{2t} + bₜ rₜ₋₁

bₜ = (1/m) div η(θ̂_{2t−1})   (can be computed explicitly)

[Thouless, Anderson, Palmer, 1977; Kabashima, 2003; Donoho, Maleki, Montanari, 2009; Donoho, Johnstone, Montanari, 2009]
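A minimal AMP sketch with soft thresholding, for which bₜ = (1/m) div η is indeed explicit: the divergence of soft thresholding is the number of nonzeros. Gaussian A, the synthetic signal, and the threshold schedule are assumptions for the demo; the loop composes the three lines above, tracking the even iterates:

    import numpy as np

    rng = np.random.default_rng(4)
    n, m, k = 2000, 600, 100
    A = rng.standard_normal((m, n)) / np.sqrt(m)
    theta = np.zeros(n)
    theta[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
    y = A @ theta

    def eta(v, lam):
        return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

    x = np.zeros(n)
    r = y.copy()
    for t in range(20):
        sigma = np.linalg.norm(r) / np.sqrt(m)     # effective noise level
        x = eta(x + A.T @ r, 1.5 * sigma)          # denoising step
        b = np.count_nonzero(x) / m                # b_t = (1/m) div η, explicit here
        r = y - A @ x + b * r                      # residual with Onsager memory term
    print(np.linalg.norm(x - theta) / np.linalg.norm(theta))

Compared with the plain iteration above, the only change is the b·r memory term, and it is what makes the iterate's error behave like Gaussian noise and the MSE track state evolution.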
State Evolution

[Log-log plot: MSE_new vs. MSE_old]
A few theorems

Things to play with

- Matrix A ∈ ℝ^{m×n}
- Denoiser η : ℝⁿ → ℝⁿ
- Additive noise
State Evolution

[Log-log plot: MSE_new vs. MSE_old]

Theorem (Bayati, Montanari 2010)
Assume A has i.i.d. Gaussian entries, and η is separable:

η(v) = (η₁(v₁), η₂(v₂), …, ηₙ(vₙ)).

Then state evolution holds asymptotically as n → ∞.

[Proof uses a very nice technique by Erwin Bolthausen]
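For a separable denoiser, state evolution reduces to a scalar MSE recursion that can be evaluated by Monte Carlo; a sketch for soft thresholding with a sparse Gaussian prior on the coordinates (prior, threshold, and parameters are assumptions, chosen to match the AMP demo above):

    import numpy as np

    rng = np.random.default_rng(5)
    delta, alpha = 0.3, 1.5            # sampling rate m/n, threshold multiplier
    eps = 0.05                         # P(theta_i != 0)
    N = 200_000                        # Monte Carlo sample size

    theta0 = np.where(rng.random(N) < eps, rng.standard_normal(N), 0.0)

    def eta(v, lam):
        return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

    tau2 = np.mean(theta0 ** 2) / delta            # effective variance at t = 0
    for t in range(10):
        z = rng.standard_normal(N)
        # MSE of denoising theta0 observed in Gaussian noise of variance tau2
        mse = np.mean((eta(theta0 + np.sqrt(tau2) * z, alpha * np.sqrt(tau2)) - theta0) ** 2)
        tau2 = mse / delta                         # noiseless state evolution recursion
        print(t, mse, tau2)

Each printed mse predicts the per-coordinate error of the matrix iteration at step t, without ever touching A.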
State Evolution: More theorems

- Bayati, Montanari 2010: a more general class of iterations.
- Bayati, Lelarge, Montanari 2012: A with non-Gaussian i.i.d. entries; polynomial separable denoiser.

Still far from the example of the first part.
Connection with convex optimization

J : ℝⁿ → ℝ convex regularizer

Proximal operator:

η(y) = argmin over x of { (1/2) ‖y − x‖₂² + J(x) }

Examples:

J(x) = ‖x‖₁   (⇒ η separable)
J(x) = ‖x‖_TV   (⇒ total variation denoising)
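For J(x) = λ‖x‖₁ the proximal operator is coordinate-wise soft thresholding; a quick numeric sanity check against brute-force minimization of the scalar objective (a sketch):

    import numpy as np

    def prox_l1(y, lam):
        # argmin_x 0.5*(y - x)^2 + lam*|x|  =  soft thresholding
        return np.sign(y) * np.maximum(np.abs(y) - lam, 0.0)

    y, lam = 1.3, 0.5
    grid = np.linspace(-3.0, 3.0, 600001)
    brute = grid[np.argmin(0.5 * (y - grid) ** 2 + lam * np.abs(grid))]
    print(prox_l1(y, lam), brute)   # both ≈ 0.8

So for the ℓ₁ regularizer, the AMP denoising step is exactly the scalar soft-thresholding rule used in the sketches above.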
Connection with convex optimization

Lemma
If η( · ) is the proximal operator of J( · ), and θ̂ is a fixed point of AMP, then

θ̂ ∈ argmin over θ ∈ ℝⁿ of { (1/2) ‖y − Aθ‖₂² + λ J(θ) },

for λ = (1 − b∞)⁻¹, where b∞ is the fixed-point value of bₜ.

(But the theory applies to more general denoisers!)

Does AMP converge to a minimizer?
Does AMP converge to a minimizer?

Theorem (Bayati, Montanari 2011)
If J(x) = ‖x‖₁ and A is Gaussian with i.i.d. entries, then (for n large enough) AMP converges to within relative distance ε of a minimizer in t = O(log(1/ε)) iterations.

Corollary
Asymptotic distributional characterization of the minimizer.
Connection with convex optimization

- Asymptotic characterization of the minimizer through the (non-rigorous) replica method.
  [Tanaka 2002; Guo, Verdú 2005; Kabashima, Tanaka 2009; Rangan, Fletcher, Goyal 2009; Caire, Tulino, Shamai, Verdú 2012; Javanmard, Montanari 2012; …]
- Bayati, Lelarge, Montanari 2012: partial result for J(x) = ‖x‖₁ and A with non-Gaussian entries.
- Bean, Bickel, El Karoui, Lim, Yu 2012: alternative argument for robust regression (e.g. min over θ of ‖y − Aθ‖₁).
One proof idea: Universality

For simplicity, A ∈ ℝ^{n×n} symmetric:

θ̂^{t+1} = A f(θ̂^t) + bₜ f(θ̂^{t−1})
f(v) = (f(v₁), f(v₂), …, f(vₙ))

Lemma
If f is a polynomial, then the law of θ̂ᵗᵢ is asymptotically universal for A with i.i.d. entries satisfying E(Aᵢⱼ) = 0, E(A²ᵢⱼ) = 1/n.
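The lemma can be probed numerically: run the same polynomial iteration under two matrix ensembles with matching first two moments and compare empirical moments of the iterate (a sketch; the polynomial f, the size, and the number of iterations are illustrative choices):

    import numpy as np

    rng = np.random.default_rng(6)
    n, T = 3000, 3

    def f(v):
        return v - v ** 3 / 3          # a fixed polynomial nonlinearity

    def iterate(A):
        x = np.ones(n)                 # θ̂^0 = all-ones
        for _ in range(T):
            x = A @ f(x)               # θ̂^{t+1} = A f(θ̂^t)
        return x

    G = rng.standard_normal((n, n)) / np.sqrt(n)
    G = (G + G.T) / np.sqrt(2)                       # symmetric Gaussian, E A_ij^2 = 1/n
    B = np.triu(rng.choice([-1.0, 1.0], (n, n)), 1) / np.sqrt(n)
    B = B + B.T                                      # symmetric Rademacher, E A_ij^2 = 1/n

    for A in (G, B):
        x = iterate(A)
        print(np.mean(x), np.mean(x ** 2), np.mean(x ** 4))  # moments nearly coincide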
One proof idea: Universality

For simplicity, A ∈ ℝ^{n×n} symmetric:

θ̂^{t+1} = A f(θ̂^t),   θ̂⁰ = 1
f(v) = ((v₁)², (v₂)², …, (vₙ)²)

θ̂²ᵢ = Σ over j₁, j₂, j₃ of A_{i j₁} A_{j₁ j₂} A_{j₁ j₃}

[Diagram: tree rooted at i, with child j₁ and leaves j₂, j₃]
One proof idea: Universality

For simplicity, A ∈ ℝ^{n×n} symmetric:

θ̂^{t+1} = A f(θ̂^t),   θ̂⁰ = 1
f(v) = ((v₁)², (v₂)², …, (vₙ)²)

Prove universality of the 2nd moment:

E{(θ̂²ᵢ)²} = Σ over j₁, j₂, j₃, k₁, k₂, k₃ of E{A_{i j₁} A_{j₁ j₂} A_{j₁ j₃} A_{i k₁} A_{k₁ k₂} A_{k₁ k₃}}

Prove that the only terms that `survive' as n → ∞ have each edge A_{kl} appearing zero or two times.
Generalizations and open problems
More general matrices

- Partial Fourier matrices, random unitary matrices. [Caire, Shamai, Tulino, Verdú, 2012 (non-rigorous)]
- Independent Gaussian rows. [Javanmard, Montanari, 2012 (non-rigorous)]
- Spatially coupled matrices. [Krzakala, Mézard, Sausset, Sun, Zdeborová, 2011; Donoho, Javanmard, Montanari, 2011]
More general models

- Generalized linear models [Rangan 2011]
- Graphical model priors [Schniter et al. 2010-…]
- Low-rank matrices [Rangan, Fletcher 2012]
- …
Optimal estimation under limited computation?
Conclusion

Information theory: simple probabilistic models, sharp asymptotics, surprising insights.

Thanks!