Statistical Modeling and Learning in Vision --- cortex-like generative models Ying Nian Wu
Ying Nian Wu, UCLA Department of Statistics
JSM, August 2010
http://www.stat.ucla.edu/~ywu/ActiveBasis
Matlab/C code, Data
Outline
• Primary visual cortex (V1)
• Modeling and learning in V1
• Layered hierarchical models
Source: Scientific American, 1999
Visual cortex: layered hierarchical architecture
V1: primary visual cortex; simple cells, complex cells
bottom-up/top-down
$G(x) \propto \exp\{-\frac{1}{2}[\frac{x_1^2}{\sigma_1^2} + \frac{x_2^2}{\sigma_2^2}]\}\, e^{i x_1}$
Simple V1 cells Daugman, 1985
Gabor wavelets: localized sine and cosine waves
Translation, rotation, dilation of the above function
$\langle I, B_{x,s,\alpha} \rangle = \sum_{x'} I(x')\, B_{x,s,\alpha}(x')$
image pixels
V1 simple cells
$B_{x,s,\alpha}$
respond to edges
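A minimal Python sketch of a Gabor element and its response to an image, assuming the base function is a Gaussian envelope times a complex wave along the first coordinate. The function names, the grid half-width `half`, the default widths, and the zero-mean, unit-norm normalization are illustrative assumptions, not taken from the slides.

```python
import cmath
import math

def gabor(x1, x2, sigma1=1.0, sigma2=2.0):
    """Unnormalized base function: Gaussian envelope times exp(i * x1)."""
    envelope = math.exp(-0.5 * (x1 ** 2 / sigma1 ** 2 + x2 ** 2 / sigma2 ** 2))
    return envelope * cmath.exp(1j * x1)

def gabor_element(cx, cy, scale, angle, half=6):
    """Translated, rotated, dilated copy sampled on a (2*half+1)^2 pixel grid.

    Returns a dict mapping pixel (px, py) to a complex value.
    Zero mean and unit norm are assumed conventions.
    """
    raw = {}
    for dx in range(-half, half + 1):
        for dy in range(-half, half + 1):
            # Rotate the offset by -angle, then rescale into base coordinates.
            u = (dx * math.cos(angle) + dy * math.sin(angle)) / scale
            v = (-dx * math.sin(angle) + dy * math.cos(angle)) / scale
            raw[(cx + dx, cy + dy)] = gabor(u, v)
    mean = sum(raw.values()) / len(raw)
    centered = {p: b - mean for p, b in raw.items()}
    norm = math.sqrt(sum(abs(b) ** 2 for b in centered.values()))
    return {p: b / norm for p, b in centered.items()}

def response(image, element):
    """Inner product: sum over pixels of I(x') * B(x'); missing pixels count as 0."""
    return sum(image.get(p, 0.0) * b for p, b in element.items())

# A vertical step edge: dark on the left, bright on the right.
edge = {(px, py): (1.0 if px >= 0 else 0.0)
        for px in range(-10, 11) for py in range(-10, 11)}
aligned = abs(response(edge, gabor_element(0, 0, 1.0, 0.0)))
crossed = abs(response(edge, gabor_element(0, 0, 1.0, math.pi / 2)))
```

In this sketch the element whose wave runs across the edge (`aligned`) gives a larger response modulus than the element rotated by 90 degrees (`crossed`).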
Complex V1 cells Riesenhuber and Poggio, 1999
$\max_{(\Delta x, \Delta\alpha) \in A(\alpha)} |\langle I, B_{x+\Delta x,\, s,\, \alpha+\Delta\alpha} \rangle|^2$
Image pixels
V1 simple cells
V1 complex cells
Local max
Local sum
• Larger receptive field
• Less sensitive to deformation
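The local-max operation of complex cells can be sketched in one dimension as follows. This is a simplified illustration that pools over position only (the model on the slide also pools over orientation); `local_max` and `radius` are hypothetical names.

```python
def local_max(responses, radius):
    """For each position, the largest squared response modulus within +/- radius."""
    energy = [abs(r) ** 2 for r in responses]
    n = len(energy)
    return [max(energy[max(0, k - radius):min(n, k + radius + 1)])
            for k in range(n)]

# A single strong simple-cell response, and the same response shifted by one pixel.
original = local_max([0, 0, 3 + 4j, 0, 0, 0], radius=1)
shifted = local_max([0, 0, 0, 3 + 4j, 0, 0], radius=1)
```

The pooled value at position 2 is the same for `original` and `shifted`, which is the sense in which the complex cell is less sensitive to small deformations.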
Independent Component Analysis Bell and Sejnowski, 1996
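$I = c_1 B_1 + \dots + c_N B_N = \mathbf{B} C$
$c_i \sim p(c)$ independently, $i = 1, \dots, N$
$N = \dim(I)$
$C = \mathbf{B}^{-1} I = \mathbf{A} I$
$I_m = c_{m,1} B_1 + \dots + c_{m,N} B_N = \mathbf{B} C_m$
$C_m = \mathbf{B}^{-1} I_m = \mathbf{A} I_m$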
Laplacian/Cauchy
Hyvarinen, 2000
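A small worked example of the ICA analysis step: when the number of basis vectors equals the image dimension, the coefficients are recovered exactly by applying the inverse of the basis matrix. The 2 x 2 basis, the helper names, and the numbers are made up for illustration.

```python
def invert_2x2(matrix):
    """Inverse of a 2 x 2 matrix given as two rows."""
    (a, b), (c, d) = matrix
    det = a * d - b * c
    if det == 0:
        raise ValueError("basis matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

def mat_vec(matrix, vector):
    """Matrix-vector product."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]

# Columns of B are the basis vectors B_1 = (2, 1) and B_2 = (1, 1).
B = [[2.0, 1.0], [1.0, 1.0]]
coefficients = [3.0, -2.0]
image = mat_vec(B, coefficients)        # synthesis: I = B C
A = invert_2x2(B)                       # analysis matrix A = B^{-1}
recovered = mat_vec(A, image)           # C = A I
```

Because the analysis is a single linear map, there is no competition between coefficients here, in contrast to the overcomplete sparse coding case.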
Sparse coding Olshausen and Field, 1996
Laplacian/Cauchy/mixture Gaussians
$c_i \sim p(c)$ independently, $i = 1, \dots, N$
$I = c_1 B_1 + \dots + c_N B_N$
$I_m = c_{m,1} B_1 + \dots + c_{m,N} B_N$, $N > \dim(I)$
Inference: sparsification, non-linear
lasso/basis pursuit/matching pursuit
mode and uncertainty of p(C|I)
explaining-away, lateral inhibition
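Matching pursuit, one of the inference schemes named above, can be sketched as a greedy loop: pick the dictionary element with the largest response to the current residual, record its coefficient, subtract its contribution, and repeat. This is a generic textbook version assuming unit-norm real-valued elements; the names and the toy dictionary are illustrative.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matching_pursuit(signal, dictionary, n_select):
    """Greedy sparse coding; dictionary is a list of unit-norm vectors."""
    residual = list(signal)
    selected = []
    for _ in range(n_select):
        scores = [dot(residual, b) for b in dictionary]
        # Select the element that explains the most residual energy.
        j = max(range(len(dictionary)), key=lambda k: abs(scores[k]))
        c = scores[j]
        selected.append((j, c))
        # Explaining-away: remove the selected element's contribution.
        residual = [r - c * x for r, x in zip(residual, dictionary[j])]
    return selected, residual

# Overcomplete dictionary in two dimensions: three unit vectors.
dictionary = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]]
selected, residual = matching_pursuit([1.2, 1.6], dictionary, n_select=1)
```

In this toy case the signal is a multiple of the third element, so one selection leaves essentially no residual and the other two elements, which also responded to the raw signal, are explained away.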
$c_i \sim p(c)$ independently, $i = 1, \dots, N$
Sparse coding / variable selection
Learning: $I_m = c_{m,1} B_1 + \dots + c_{m,N} B_N$
$N > \dim(I)$
A dictionary of representational elements (regressors)
$I = c_1 B_1 + \dots + c_N B_N$
Olshausen and Field, 1996
$p(C, I) = \frac{1}{Z(\mathbf{B})} \exp\{\sum_{i,j} B_{i,j}\, c_i I_j\}$
$c_i$, $i = 1, \dots, N$
$I$
Restricted Boltzmann Machine Hinton, Osindero and Teh, 2006
P(I|C), P(C|I): factorized, no explaining-away
hidden, binary
visible
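The factorized conditional of the hidden units can be illustrated with a short sketch. For binary hidden units and a joint density proportional to exp of the sum over i, j of B[i][j] * c_i * I_j, each hidden unit is switched on independently with a logistic probability of its own filter response. The function name and the toy weights are assumptions for illustration, and bias terms are omitted.

```python
import math

def hidden_probabilities(B, image):
    """P(c_i = 1 | I) = sigmoid(sum_j B[i][j] * I[j]), one value per hidden unit."""
    probabilities = []
    for row in B:
        activation = sum(b * x for b, x in zip(row, image))
        probabilities.append(1.0 / (1.0 + math.exp(-activation)))
    return probabilities

image = [2.0, 2.0]
first = hidden_probabilities([[1.0, -1.0], [0.5, 0.5]], image)
second = hidden_probabilities([[1.0, -1.0], [3.0, 3.0]], image)
```

Changing the second unit's weights changes only the second probability; the first is unaffected, which is what "no explaining-away" means for inference in this model.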
Energy-based model Teh, Welling, Osindero and Hinton, 2003
$p(I) = \frac{1}{Z(\mathbf{B})} \exp\{\sum_i \phi_i(\langle I, B_i \rangle)\}$
Features, no explaining-away
Maximum entropy with marginals
Exponential family with sufficient stat
$p(I) = \frac{1}{Z(\lambda)} \exp\{\sum_{x,s,\alpha} \lambda_{s,\alpha}(\langle I, B_{x,s,\alpha} \rangle)\}$
Zhu, Wu, and Mumford, 1997; Wu, Liu, and Zhu, 2000
Markov random field/Gibbs distribution
Source: Scientific American, 1999
Visual cortex: layered hierarchical architecture
bottom-up/top-down
What is beyond V1?
Hierarchical model?
Hierarchical ICA/Energy-based model?
Larger features
Must introduce nonlinearities
Purely bottom-up
P(I,C) = P(C)P(I|C); P(C) → P(J,C)
[Figure: layer labels I, C, I, J]
Discriminative correction by back-propagation
Unfolding, untying, re-learning
Hierarchical RBM Hinton, Osindero and Teh, 2006
Hierarchical sparse coding
$I = c_1 B_1 + \dots + c_N B_N$
$B_{x,s,\alpha}$
Attributed sparse coding elements
transformation group
topological neighborhood system
$I = \sum_{i=1}^{n} c_i B_{x_i,\, s,\, \alpha_i} + U$
Layer above: further coding of the attributes of selected sparse coding elements
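$I_m = \sum_{i=1}^{n} c_{m,i} B_{x_{m,i},\, s,\, \alpha_{m,i}} + U_m$
$I_m = \sum_{i=1}^{n} c_{m,i} B_{x_i + \Delta x_{m,i},\, s,\, \alpha_i + \Delta\alpha_{m,i}} + U_m$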
Hierarchical sparse coding
Active basis
$x_{m,i} = x_i + \Delta x_{m,i}$
$\alpha_{m,i} = \alpha_i + \Delta\alpha_{m,i}$
Wu, Si, Fleming, Zhu, 2007
Residual generalization
Shared matching pursuit
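$I_m = \sum_{i=1}^{n} c_{m,i} B_{x_i + \Delta x_{m,i},\, s,\, \alpha_i + \Delta\alpha_{m,i}} + U_m$
$\sum_{m=1}^{M} \| I_m - \sum_{i=1}^{n} c_{m,i} B_{x_i + \Delta x_{m,i},\, s,\, \alpha_i + \Delta\alpha_{m,i}} \|^2$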
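0: Initialize $U_m \leftarrow I_m$ for $m = 1, \dots, M$, and $i \leftarrow 0$.
1: Let $i \leftarrow i + 1$. Let $(x_i, \alpha_i) = \arg\max_{x,\alpha} \sum_{m=1}^{M} \max_{(\Delta x, \Delta\alpha) \in A(\alpha)} |\langle U_m, B_{x+\Delta x,\, s,\, \alpha+\Delta\alpha} \rangle|^2$.
2: Let $(\Delta x_{m,i}, \Delta\alpha_{m,i}) = \arg\max_{(\Delta x, \Delta\alpha) \in A(\alpha_i)} |\langle U_m, B_{x_i+\Delta x,\, s,\, \alpha_i+\Delta\alpha} \rangle|^2$.
3: Let $c_{m,i} = \langle U_m, B_{x_i+\Delta x_{m,i},\, s,\, \alpha_i+\Delta\alpha_{m,i}} \rangle$. Update $U_m \leftarrow U_m - c_{m,i} B_{x_i+\Delta x_{m,i},\, s,\, \alpha_i+\Delta\alpha_{m,i}}$.
4: Stop if $i = n$, else go back to 1.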
1. Local maximization in step 1: complex cells, Riesenhuber and Poggio, 1999
2. Arg-max in step 2: inferring hidden variables
3. Explaining-away in step 3: lateral inhibition
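The shared matching pursuit loop can be sketched in a one-dimensional toy setting. This is a simplified illustration, not the released Matlab/C code: elements are indexed by position only (no orientation or scale), the dictionary is real-valued with unit-norm elements, and the allowed perturbations are the index offsets in `shifts`. All names are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def shared_matching_pursuit(images, dictionary, n_elements, shifts=(-1, 0, 1)):
    """Select n_elements shared positions; each image may shift each element locally."""
    residuals = [list(image) for image in images]      # step 0: U_m = I_m
    size = len(dictionary)

    def resp(u, x):
        # Response of residual u to the element at index x (0 outside the range).
        return dot(u, dictionary[x]) if 0 <= x < size else 0.0

    template = []
    for _ in range(n_elements):
        # Step 1: sum over images of the local max (complex-cell pooling).
        def pooled(x):
            return sum(max(resp(u, x + d) ** 2 for d in shifts) for u in residuals)
        x_i = max(range(size), key=pooled)
        per_image = []
        for m, u in enumerate(residuals):
            # Step 2: arg-max infers the hidden shift for image m.
            d = max(shifts, key=lambda s: resp(u, x_i + s) ** 2)
            # Step 3: coefficient, then explaining-away on the residual.
            c = resp(u, x_i + d)
            if 0 <= x_i + d < size:
                residuals[m] = [r - c * b for r, b in zip(u, dictionary[x_i + d])]
            per_image.append((d, c))
        template.append((x_i, per_image))
    return template, residuals

def spikes(strong, weak, length=10):
    signal = [0.0] * length
    signal[strong], signal[weak] = 2.0, 1.0
    return signal

identity = [[1.0 if j == k else 0.0 for j in range(10)] for k in range(10)]
images = [spikes(2, 6), spikes(3, 7), spikes(4, 8)]
template, residuals = shared_matching_pursuit(images, identity, n_elements=2)
```

In this toy data each image is a shifted copy of the same two-spike pattern, so the shared template recovers the common positions while the per-image shifts absorb the deformation.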
Wu, Si, Fleming, Zhu, 2007
Active basis
$I_m = \sum_{i=1}^{n} c_{m,i} B_{x_i + \Delta x_{m,i},\, s,\, \alpha_i + \Delta\alpha_{m,i}} + U_m$
Two different scales
Putting multiple scales together
$I_m = \sum_{i=1}^{n} c_{m,i} B_{x_i + \Delta x_{m,i},\, s_i,\, \alpha_i + \Delta\alpha_{m,i}} + U_m$
$B_{x,s,\alpha}(x')$
More elements added
Residual images
$I_m = \sum_{i=1}^{n} c_{m,i} B_{x_i + \Delta x_{m,i},\, s,\, \alpha_i + \Delta\alpha_{m,i}} + U_m$
Statistical modeling
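$\mathbf{B}_m = (B_{x_i + \Delta x_{m,i},\, s,\, \alpha_i + \Delta\alpha_{m,i}},\ i = 1, \dots, n)$ orthogonal
$c_{m,i} = \langle I_m, B_{x_i + \Delta x_{m,i},\, s,\, \alpha_i + \Delta\alpha_{m,i}} \rangle$
$p(I_m \mid \mathbf{B}_m) = q(I_m) \prod_{i=1}^{n} \frac{p_i(c_{m,i})}{q(c_{m,i})} = q(I_m) \prod_{i=1}^{n} \frac{p_i(r_{m,i})}{q(r_{m,i})}$
$r_{m,i} = |c_{m,i}|^2$
$p_i(r) = \frac{1}{Z(\lambda_i)} \exp\{\lambda_i h(r)\}\, q(r)$
$p(U_m \mid C_m) = q(U_m \mid C_m)$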
Conditional independence of coefficients
Exponential family model
Strong edges in background
Wu, Si, Gong, Zhu, 2010
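As a worked step, the log-likelihood ratio of an image against the background model follows directly from the two assumptions on this slide: the coefficients are conditionally independent, and each response $r_{m,i} = |c_{m,i}|^2$ follows an exponential family tilt of the background density $q(r)$ with parameter $\lambda_i$. The subscripted notation $p_i$, $\lambda_i$ is assumed here for the per-element densities.

```latex
\log \frac{p(I_m \mid \mathbf{B}_m)}{q(I_m)}
  = \sum_{i=1}^{n} \log \frac{p_i(r_{m,i})}{q(r_{m,i})}
  = \sum_{i=1}^{n} \bigl[ \lambda_i\, h(r_{m,i}) - \log Z(\lambda_i) \bigr],
\qquad
p_i(r) = \frac{1}{Z(\lambda_i)} \exp\{\lambda_i h(r)\}\, q(r).
```

The background density $q$ cancels in every factor, so the score needs only the transformed responses $h(r_{m,i})$ and the per-element constants $\lambda_i$ and $\log Z(\lambda_i)$.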
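$I = \sum_{i=1}^{n} c_i B_{x + x_i + \Delta x_i,\, s,\, \alpha_i + \Delta\alpha_i} + U$
$l(x) = \sum_{i=1}^{n} [\lambda_i \max_{(\Delta x, \Delta\alpha) \in A(\alpha_i)} h(|\langle I, B_{x + x_i + \Delta x,\, s,\, \alpha_i + \Delta\alpha} \rangle|^2) - \log Z(\lambda_i)]$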
Detection by sum-max maps Wu, Si, Gong, Zhu, 2010
Image pixels
V1 simple cells
V1 complex cells
Local max
Local sum
Complex V1 cells Riesenhuber and Poggio, 1999
$\max_{(\Delta x, \Delta\alpha) \in A(\alpha)} |\langle I, B_{x+\Delta x,\, s,\, \alpha+\Delta\alpha} \rangle|^2$
• Larger receptive field
• Less sensitive to deformation
SUM-MAX maps (bottom-up/top-down)
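$l(x) = \sum_{i=1}^{n} [\lambda_i \max_{(\Delta x, \Delta\alpha) \in A(\alpha_i)} h(|\langle I, B_{x + x_i + \Delta x,\, s,\, \alpha_i + \Delta\alpha} \rangle|^2) - \log Z(\lambda_i)]$
Local maximization: complex cells (Riesenhuber and Poggio, 1999)
Gabor wavelets: simple cells (Olshausen and Field, 1996)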
SUM2 operator: what “cell”?
Bottom-up detection
Top-down sketching
SUM1
MAX1
SUM2
arg MAX1
Sparse selective connection as a result of learning
Explaining-away in learning but not in inference
Bottom-up scoring and top-down sketching
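The SUM1, MAX1, SUM2, arg MAX1 pipeline can be sketched in one dimension. This is a simplified illustration over positions only: the saturating transform `min(energy, cap)` stands in for the function h and is an assumption, as are the function names, the template offsets, and the values used for the weights and log normalizing constants.

```python
def sum1(energy, cap=16.0):
    """SUM1: transformed simple-cell energy (a hard cap stands in for h)."""
    return [min(e, cap) for e in energy]

def max1(s1, radius=1):
    """MAX1: local max of SUM1, plus the arg-max index kept for top-down sketching."""
    n = len(s1)
    values, args = [], []
    for k in range(n):
        window = range(max(0, k - radius), min(n, k + radius + 1))
        best = max(window, key=lambda j: s1[j])
        values.append(s1[best])
        args.append(best)
    return values, args

def sum2(m1, offsets, lambdas, log_z):
    """SUM2: template score at every position x, summing weighted MAX1 values."""
    n = len(m1)
    scores = []
    for x in range(n):
        total = 0.0
        for dx, lam, lz in zip(offsets, lambdas, log_z):
            p = x + dx
            total += lam * (m1[p] if 0 <= p < n else 0.0) - lz
        scores.append(total)
    return scores

# Template: two elements at offsets -2 and +2. Image: edges 5 apart (deformed).
energy = [0.0] * 12
energy[3], energy[8] = 9.0, 9.0
m1, arg = max1(sum1(energy))
scores = sum2(m1, offsets=[-2, 2], lambdas=[1.0, 1.0], log_z=[0.5, 0.5])
best_x = max(range(len(scores)), key=lambda x: scores[x])   # bottom-up detection
sketch = [arg[best_x + dx] for dx in (-2, 2)]                # top-down sketching
```

Bottom-up, the score tolerates the one-pixel deformation because of the MAX1 pooling; top-down, the stored arg-max indices recover where each element actually sits in the image.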
Adjusting Active Basis Model by L2 Regularized Logistic Regression
By Ruixun Zhang
L2 regularized logistic regression
re-estimated lambda’s
Conditional on: (1) selected basis elements (2) inferred hidden variables
(1) and (2) come from generative learning
• Exponential family model, q(I) negatives → logistic regression
• Generative learning without negative examples
• Discriminative correction of the conditional independence assumption (with hugely reduced dimensionality)
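A minimal sketch of the discriminative adjustment, assuming the features of each example are the pooled responses of the already selected elements, so only one weight per element plus an intercept is re-estimated. This is plain gradient descent on the L2-penalized logistic loss, not the solver used in the original work; the function name, step size, penalty, and toy data are illustrative.

```python
import math

def fit_logistic_l2(features, labels, penalty=0.1, rate=0.1, steps=2000):
    """Gradient descent for logistic regression with an L2 penalty on the weights."""
    dim = len(features[0])
    n = len(features)
    weights, intercept = [0.0] * dim, 0.0
    for _ in range(steps):
        grad_w = [penalty * w for w in weights]     # gradient of (penalty/2)*|w|^2
        grad_b = 0.0
        for x, y in zip(features, labels):
            logit = intercept + sum(w * v for w, v in zip(weights, x))
            p = 1.0 / (1.0 + math.exp(-logit))
            for k in range(dim):
                grad_w[k] += (p - y) * x[k] / n
            grad_b += (p - y) / n
        weights = [w - rate * g for w, g in zip(weights, grad_w)]
        intercept -= rate * grad_b
    return weights, intercept

# Rows: pooled responses of two selected elements. Labels: 1 = object, 0 = background.
features = [[3.0, 2.5], [2.8, 3.1], [0.5, 0.2], [0.3, 0.6]]
labels = [1, 1, 0, 0]
weights, intercept = fit_logistic_l2(features, labels)
```

The dimensionality is that of the selected elements rather than of the full filter bank, which is the "hugely reduced dimensionality" mentioned above.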
Learning from non-aligned training images
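$I_m = \sum_{i=1}^{n} c_{m,i} B_{x^{(m)} + x_i + \Delta x_{m,i},\, s,\, \alpha_i + \Delta\alpha_{m,i}} + U_m$
$\mathbf{B} = (B_{x_i,\, s,\, \alpha_i},\ i = 1, \dots, n)$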
Learning from non-aligned training images
EM mixture
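$I_m = \sum_{i=1}^{n} c_{m,i} B_{x_i^{(k)} + \Delta x_{m,i},\, s,\, \alpha_i^{(k)} + \Delta\alpha_{m,i}} + U_m$
$\{\mathbf{B}^{(k)} = (B_{x_i^{(k)},\, s,\, \alpha_i^{(k)}},\ i = 1, \dots, n),\ k = 1, \dots, K\}$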
EM mixture
MNIST
Active bases as part-templates
Split bike template to detect and sketch tandem bike
Is there an edge here?
Is there an edge nearby?
Is there a wheel here?
Is there a wheel nearby?
Is there a tandem bike here?
Soft scoring instead of hard decision
Learning part templates or visual words
Shape script model
Shape motifs: elementary geometric shapes
$I = \sum_{i=1}^{n} c_i B_{x_i,\, s,\, \alpha_i} + U$
),...,1,,(motif shape),...,1,,( nixnix iik
kii
UCIkkkk sx
K
kk
,,,
1
B nK
Si and Wu, 2010
$I = \sum_{i=1}^{n} c_i B_{x_i,\, s,\, \alpha_i} + U$
),...,1,,(motif shape),...,1,,( nixnix iik
kii
Layers of attributed sparse coding elements