New Algorithms for Learning Incoherent and Overcomplete Dictionaries. Sanjeev Arora (Princeton), Rong Ge (Microsoft Research), Tengyu Ma (Princeton), Ankur Moitra (MIT). ICERM Workshop, May 7

Transcript of "New Algorithms for Learning Incoherent and Overcomplete Dictionaries"

Page 1:

New Algorithms for Learning Incoherent and Overcomplete Dictionaries

Sanjeev Arora (Princeton), Rong Ge (Microsoft Research), Tengyu Ma (Princeton), Ankur Moitra (MIT)

ICERM Workshop, May 7

Page 2:

Dictionary Learning

• Simple "dictionary elements" build complicated objects

• Given the objects, can we learn the dictionary?

Page 3:

Why dictionary learning? [Olshausen, Field '96]

[Figure: natural image patches, via dictionary learning, yield Gabor-like filters]

Page 4:

Example: Image Completion [Mairal, Elad & Sapiro '08]

Page 5:

Outline

• Dictionary Learning problem

• Getting a crude estimate

• Refining the solution

Page 6:

Dictionary Learning Problem

• Given samples of the form Y = AX
• X is a sparse matrix
• Goal: learn A (the dictionary)
• Interesting case: m > n (overcomplete)

[Figure: Y = AX. The columns of Y are the n-dimensional samples, A is the n × m dictionary whose columns are the dictionary elements, and each column of X is a sparse combination.]
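The generative model above can be sketched in a few lines (a hypothetical synthetic instance, not the paper's code; the dimensions n, m, k, p are chosen arbitrarily for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, k, p = 64, 128, 5, 200   # signal dim, #dictionary elements, sparsity, #samples

# Overcomplete dictionary (m > n) with unit-norm columns.
A = rng.standard_normal((n, m))
A /= np.linalg.norm(A, axis=0)

# Each column of X is k-sparse with a random support.
X = np.zeros((m, p))
for j in range(p):
    support = rng.choice(m, size=k, replace=False)
    X[support, j] = rng.standard_normal(k)

Y = A @ X   # the learner observes only Y and must recover A (up to sign/permutation)
```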

Page 7:

Previous  Approach

[Figure: alternating between dictionary A and sparse code X]

LASSO, Basis Pursuit

Matching Pursuit

Least Squares, K-SVD

Alternating Minimization
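The alternating scheme these methods share can be sketched as follows (a generic stand-in, not any particular paper's implementation; simple hard thresholding plus least squares stands in for the pursuit step):

```python
import numpy as np

def alternating_minimization(Y, A0, k, iters=10):
    """Alternate between sparse coding (fix A, solve for X) and
    dictionary update (fix X, refit A), starting from a guess A0."""
    A = A0.copy()
    for _ in range(iters):
        # Sparse coding: keep the k largest correlations per sample,
        # then least-squares fit on that support.
        C = A.T @ Y
        X = np.zeros_like(C)
        for j in range(Y.shape[1]):
            top = np.argsort(-np.abs(C[:, j]))[:k]
            X[top, j] = np.linalg.lstsq(A[:, top], Y[:, j], rcond=None)[0]
        # Dictionary update: least-squares refit, renormalize columns.
        A = Y @ np.linalg.pinv(X)
        A /= np.maximum(np.linalg.norm(A, axis=0), 1e-12)
    return A, X
```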

Page 8:

Problem with Alternating Minimization

• Monotone objective function?
• Local minimum issues

[Figure: alternating between dictionary A and sparse code X]

Page 9:

Empirical Behavior

• Synthetic experiment, "qualitative" plot
• K-SVD converges with probability 1/3 when random samples are used as the initial dictionary

[Plot: log accuracy vs. iterations. Annotations: "Converge, prob 1/3", "Stuck, prob 2/3", "slow at the beginning", "This Talk".]

Page 10:

Provable  Algorithms

• Run in polynomial time, use polynomially many samples, learn the ground truth

• Separate modeling and optimization error
• Design new algorithms / tweak old algorithms

• Work only on "reasonable instances"

Page 11:

When is the solution "reasonable"?

• Consider Image Completion

• The representation should be unique and robust!

[Figure: dictionary and image, with a "?" between them]

Page 12:

Sparse  Recovery

• Given  A,  y,  find  x.  

• Incoherence [Donoho, Huo '99]

• Dictionary elements have pairwise inner products at most μ/√n
• Solution is unique and robust
• Long line of work [Logan; Donoho, Stark; Elad; …]
• Handles sparsity up to √n
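The incoherence parameter can be measured directly (a small helper, assuming unit-norm columns):

```python
import numpy as np

def coherence(A):
    """Largest |inner product| between two distinct columns of A.
    For a dictionary that is incoherent in the sense above, this
    quantity is at most mu / sqrt(n)."""
    G = np.abs(A.T @ A)
    np.fill_diagonal(G, 0.0)
    return G.max()
```

For example, an orthonormal basis has coherence 0, while a random unit-norm overcomplete dictionary typically has coherence on the order of 1/√n.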

Page 13:

Our  Results

• Thm: If the dictionary A is incoherent and X is randomly k-sparse from a "nice distribution", we can learn A with accuracy ε whenever the sparsity satisfies k ≤ min{√n/(μ log m), m^0.4}

• Handles sparsity up to √n
• Sample complexity O*(m/ε²). Independently, [Agarwal et al.] obtain a similar result under slightly different assumptions and weaker sparsity; later, [Barak et al.] obtained a stronger result using SOS

Page 14:

Our  Results

• Thm: Given an estimated dictionary that is ε-close to the true dictionary, one iteration of K-SVD outputs an ε/2-close dictionary

• Works whenever ε < 1/log m (previous analyses required ε < 1/poly)
• Sample complexity O(m log m)

• Combined: can learn an incoherent dictionary with O*(m) samples in polynomial time

Page 15:

Outline

• Dictionary Learning problem

• Getting a crude estimate

• Refining the solution

Page 16:

Ideas

•  Find  the  support  of  X,  without  knowing  A.  

• Given the support of X, find an approximate A

Page 17:

Finding  the  Support

•  Tool:  Test  whether  two  columns  of  X  intersect  

Disjoint supports ≈ small inner product; intersecting supports ≈ large inner product

Page 18:

Finding  the  Support:  Overlapping  Clustering

• Connect pairs of samples with large inner product
• Vertex = sample
• Cluster = row of X!
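The graph in this step can be built directly from pairwise inner products of the samples (a sketch with a hypothetical threshold tau; the actual threshold depends on the incoherence and sparsity):

```python
import numpy as np

def intersection_graph(Y, tau):
    """Adjacency matrix over samples: connect i and j when |<y_i, y_j>| > tau.
    By incoherence, samples with disjoint supports have small inner product,
    while samples sharing a dictionary element tend to have a large one."""
    G = np.abs(Y.T @ Y)
    np.fill_diagonal(G, 0.0)
    return G > tau
```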

Page 19:

Overlapping  Clustering

• Main  problem  

• Idea: count the number of common neighbors
• A pair of points sharing a unique cluster ⇒ recover that cluster

[Figure: pairs in the same cluster have many common neighbors; other pairs have few]
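The common-neighbor count above is just the adjacency matrix multiplied with itself (sketch):

```python
import numpy as np

def common_neighbors(adj):
    """C[i, j] = number of vertices adjacent to both i and j.
    Pairs inside one cluster share many neighbors (other samples
    using the same dictionary element); other pairs share few."""
    A = adj.astype(int)
    return A @ A
```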

Page 20:

Estimate Dictionary Elements

• Focus on one row of X / column of A
• Can use SVD to find the maximum-variance direction
• Or take the samples with the same sign and average
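The SVD variant is a one-liner (a sketch: given the samples assigned to one cluster, the top left singular vector approximates the shared dictionary element up to sign):

```python
import numpy as np

def estimate_element(Ycluster):
    """Top left singular vector of the cluster's samples: the direction
    of maximum variance, which aligns with the shared dictionary column."""
    U, _, _ = np.linalg.svd(Ycluster, full_matrices=False)
    return U[:, 0]
```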

Page 21:

Outline

• Dictionary Learning problem

• Getting a crude estimate

• Alternating Minimization works!

Page 22:

K-SVD [Aharon, Elad, Bruckstein '06]

• Given: a good guess (Â)
• Goal: find an even better dictionary
• Update one dictionary element:

• Take all samples that use the element
• Decode: y ≈ Â x̂
• Residual: r = y − Σ_{j≠i} Â_j x̂_j = ±A_i + Σ_{j≠i} (A_j x_j − Â_j x̂_j)
  (the second sum is the noise term)
• Use the top singular vector of the residuals
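One column update from the step above, sketched in code (a hypothetical helper; it assumes the decoded coefficients X̂ for the samples that use element i are already available):

```python
import numpy as np

def ksvd_column_update(Yi, Ahat, Xhat, i):
    """Yi: samples (n x t) whose decoded support contains element i.
    Xhat: decoded coefficients for those samples (m x t).
    Subtract every other element's contribution, then take the top
    singular vector of the residuals as the new estimate of A_i."""
    Xother = Xhat.copy()
    Xother[i, :] = 0.0             # drop element i's own contribution
    R = Yi - Ahat @ Xother         # residuals: ±A_i plus noise
    U, _, _ = np.linalg.svd(R, full_matrices=False)
    return U[:, 0]
```

With exact decoding and ±1 coefficients for element i, every residual column equals ±A_i, so the top singular vector recovers A_i exactly (up to sign).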

Page 23:

K-SVD illustrated

[Figure: blue = true dictionary, dashed = estimated dictionary. Take all samples sharing the same element; compute the residual.]

Hope: in the residuals, the noise is small and random, and the top singular vector is robust to random noise

Page 24:

K-SVD: Intuition

• When the error (Â_i − A_i) is random:
  • Still incoherent
  • Can "decode"
  • Noise looks random

• When the error is adversarial:
  • May not be incoherent
  • Noise can be correlated

• Bad case: the errors are highly correlated, all pointing in the same direction

Page 25:

Making the noise "random"

• Observation: we can detect the bad case!

• To handle the bad case, we need to:
  • Perturb the estimated dictionary
  • Keep the perturbation small
  • Ensure the result has low spectral norm

• min ‖B‖  s.t.  ‖B_i − Â_i‖ ≤ ε for all i

• The bad case is detectable: highly correlated errors produce a large singular value
• The program is convex, and OPT ≤ ‖A‖ (the true dictionary A is feasible)

Page 26:

Low  spectral  norm  is  enough

• Key Lemma: when B has small spectral norm and |⟨B_i, B_j⟩| ≤ 1/log m, random k columns of B are "almost orthogonal"
• ⇒ Decoding is accurate for a random sample

• Proof sketch: consider BᵀB
  • Diagonal entries are large
  • Off-diagonal entries are small in expectation
  • Concentration ⇒ a random submatrix is diagonally dominant
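The diagonal-dominance claim can be checked numerically (a toy verification under assumed dimensions, not the paper's proof):

```python
import numpy as np

def random_submatrix_diag_dominant(B, k, rng):
    """Pick k random columns of B and test whether their Gram matrix is
    diagonally dominant: each diagonal entry exceeds the sum of the
    absolute off-diagonal entries in its row."""
    idx = rng.choice(B.shape[1], size=k, replace=False)
    G = B[:, idx].T @ B[:, idx]
    off = np.abs(G - np.diag(np.diag(G))).sum(axis=1)
    return bool(np.all(np.diag(G) > off))
```

For a random unit-norm B with n = 400 and m = 800, off-diagonal Gram entries concentrate around 1/√n ≈ 0.05, so a random k = 5 submatrix is diagonally dominant with high probability.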

Page 27:

Conclusion  &  Open  Problems

• K-SVD works provably with good initialization
• Does the proof give any insight in practice?
• Whitening

• Is "the error behaves randomly" useful in other settings?
• Handle larger sparsity?
• Work with an RIP assumption?

• Lower bounds?

Thank  you!  

Page 28:

Thank you! Questions?

Page 29:

K-SVD [Aharon, Elad, Bruckstein '06]

• Given: a good guess
• Goal: find an even better dictionary

• Update one dictionary element:
  • Take all samples that use the element
  • Remove the other elements
  • Use the top singular vector of the residuals

• Hope: in the residuals, the error is small and random, and SVD is robust to random noise

Page 30:

Other Applications

Image  Denoising  [Mairal  et  al.  ’09]  

Digital Zooming [Couzinie-Devy '10]

Page 31:

Applications

Image Completion [Mairal, Elad & Sapiro '08]

Image  Denoising  [Mairal  et  al.  ’09]  

Digital Zooming [Couzinie-Devy '10]

Page 32:

Refining the solution

• Use the other columns to reduce the variance!
• Get ε accuracy with poly(m, n) log(1/ε) samples