iTheory: Visual Cortex and Deep Networks · 2020-04-23 · i-theory: exploring a new hypothesis. A...
Transcript of iTheory: Visual Cortex and Deep Networks · 2020-04-23 · i-theory: exploring a new hypothesis. A...
![Page 1: iTheory: Visual Cortex and Deep Networks · 2020-04-23 · i-theory: exploring a new hypothesis. A main computational goal of the . feedforward. ventral stream hierarchy — and of](https://reader033.fdocuments.us/reader033/viewer/2022043018/5f3a9e6aac77a035453f36e3/html5/thumbnails/1.jpg)
i-theory:
The Center for Brains, Minds and Machines
tomaso poggio, CBMM, BCS, CSAIL, McGovern MIT
1
visual cortex and deep networks
![Page 2: iTheory: Visual Cortex and Deep Networks · 2020-04-23 · i-theory: exploring a new hypothesis. A main computational goal of the . feedforward. ventral stream hierarchy — and of](https://reader033.fdocuments.us/reader033/viewer/2022043018/5f3a9e6aac77a035453f36e3/html5/thumbnails/2.jpg)
SecondAnnualNSFSiteVisit,June2–3,2015
Theoretical/conceptual framework for vision
• The first 100ms of vision: feedforward and invariant: what, who, where
• Top-down needed for verification step and more complex questions: generative models, probabilistic inference, top-down visual routines.
Following this conceptual framework we are working on:
1.theory of invariance in feedforward networks (visual cortex) 2.a generative approach, probabilistic in nature 3.visual routines, and of how they may be learned.
2
![Page 3: iTheory: Visual Cortex and Deep Networks · 2020-04-23 · i-theory: exploring a new hypothesis. A main computational goal of the . feedforward. ventral stream hierarchy — and of](https://reader033.fdocuments.us/reader033/viewer/2022043018/5f3a9e6aac77a035453f36e3/html5/thumbnails/3.jpg)
HumanBrain– 1010-1011neurons(~1millionflies)– 1014-1015synapses Marr, Crick, circa 1979
Computa(onalVision
© Source Unknown. All rights reserved. This content is excluded from our CreativeCommons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/.
3
![Page 4: iTheory: Visual Cortex and Deep Networks · 2020-04-23 · i-theory: exploring a new hypothesis. A main computational goal of the . feedforward. ventral stream hierarchy — and of](https://reader033.fdocuments.us/reader033/viewer/2022043018/5f3a9e6aac77a035453f36e3/html5/thumbnails/4.jpg)
Objectrecogni(on
Figure removed due to copyright restrictions. Please see the video.
Courtesy of Elsevier, Inc., http://www.sciencedirect.com.© Elsevier. All rights reserved. This content is excluded from our Creative Commons Used with permission.license. For more information, see https://ocw.mit.edu/help/faq-fair-use/. Source: Wallisch, Pascal, and J. Anthony Movshon. "Structure
and function come unglued in the visual cortex." Neuron 60,no. 2 (2008): 195-197.
4
![Page 5: iTheory: Visual Cortex and Deep Networks · 2020-04-23 · i-theory: exploring a new hypothesis. A main computational goal of the . feedforward. ventral stream hierarchy — and of](https://reader033.fdocuments.us/reader033/viewer/2022043018/5f3a9e6aac77a035453f36e3/html5/thumbnails/5.jpg)
Vision:whatiswhere
• Human Brain –1010-1011 neurons (~1 million flies) –1014- 1015 synapses
• Ventral stream in rhesus monkey –~109 neurons in the ventral stream
(350 106 in each hemisphere) –~15 106 neurons in AIT (Anterior
InferoTemporal) cortex
• ~200M in V1, ~200M in V2, 50M in V4
Figure removed due to copyright restrictions. Please see the video.Source: Figure 2 from Felleman, Daniel J., and David C. Van Essen."Distributed hierarchical processing in the primate cerebral cortex."Cerebral cortex 1, no. 1 (1991): 1-47.
5
![Page 6: iTheory: Visual Cortex and Deep Networks · 2020-04-23 · i-theory: exploring a new hypothesis. A main computational goal of the . feedforward. ventral stream hierarchy — and of](https://reader033.fdocuments.us/reader033/viewer/2022043018/5f3a9e6aac77a035453f36e3/html5/thumbnails/6.jpg)
• It is in the family of “Hubel-Wiesel” models (Hubel & Wiesel, 1959: qual. Fukushima, 1980: quant; Oram &
Riesenhuber & Poggio 1999, 2000; Serre Kouh Cadieu [software available online] Knoblich Kreiman & Poggio 2005; Serre Oliva Poggio 2007
Perrett, 1993: qual; Wallis & Rolls, 1997; Riesenhuber & Poggio, 1999; Thorpe, 2002; Ullman et al., 2002; Mel, 1997; Wersing and Koerner, 2003; LeCun et al 1998: not-bio; Amit & Mascaro, 2003: not-bio; Hinton, LeCun, Bengio not-bio; Deco & Rolls 2006…)
• As a biological model of object recognition in the ventral stream – from V1 to PFC -- it is perhaps the most quantitatively faithful to known neuroscience data
Recogni(oninVisualCortex:‘’classicalmodel”,selec(veandinvariant
Source: Serre, Thomas, Minjoon Kouh, Charles Cadieu, Ulf Knoblich,Gabriel Kreiman, and Tomaso Poggio. A theory of object recognition:Computations and circuits in the feedforward path of the ventral streamin primate visual cortex. No. AI MEMO-2005-036. Massachusetts Instituteof Technology Center for Biological and Computational Learning, 2005.
6
![Page 7: iTheory: Visual Cortex and Deep Networks · 2020-04-23 · i-theory: exploring a new hypothesis. A main computational goal of the . feedforward. ventral stream hierarchy — and of](https://reader033.fdocuments.us/reader033/viewer/2022043018/5f3a9e6aac77a035453f36e3/html5/thumbnails/7.jpg)
Feedforward Models:“predict” rapid categorization (82% model vs. 80% humans)
Hierarchical feedforward models of the ventral stream
© Source Unknown. All rights reserved.This content is excluded from our Creative Commons license. For more information,see https://ocw.mit.edu/help/faq-fair-use/.
Figure removed due to copyright restrictions. Please see the video.
Source: Serre, Thomas, Minjoon Kouh, Charles Cadieu, Ulf Knoblich,Gabriel Kreiman, and Tomaso Poggio. A theory of object recognition:Computations and circuits in the feedforward path of the ventral streamin primate visual cortex. No. AI MEMO-2005-036. Massachusetts Instituteof Technology Center for Biological and Computational Learning, 2005.
7
![Page 8: iTheory: Visual Cortex and Deep Networks · 2020-04-23 · i-theory: exploring a new hypothesis. A main computational goal of the . feedforward. ventral stream hierarchy — and of](https://reader033.fdocuments.us/reader033/viewer/2022043018/5f3a9e6aac77a035453f36e3/html5/thumbnails/8.jpg)
Why do these networks including DLCNs
work so well?
Models are not enough… we need a theory!
8
![Page 9: iTheory: Visual Cortex and Deep Networks · 2020-04-23 · i-theory: exploring a new hypothesis. A main computational goal of the . feedforward. ventral stream hierarchy — and of](https://reader033.fdocuments.us/reader033/viewer/2022043018/5f3a9e6aac77a035453f36e3/html5/thumbnails/9.jpg)
Plan
• i-theory (main results)
• equivalence to DCLNs, theory notes on DCLNs
• Some predictions + perspectives in i-theory
• Details and ML remarks
9
![Page 10: iTheory: Visual Cortex and Deep Networks · 2020-04-23 · i-theory: exploring a new hypothesis. A main computational goal of the . feedforward. ventral stream hierarchy — and of](https://reader033.fdocuments.us/reader033/viewer/2022043018/5f3a9e6aac77a035453f36e3/html5/thumbnails/10.jpg)
i-theoryLearning of invariant&selective Representations in Sensory Cortex
© Source Unknown. All rights reserved.Source: Serre, Thomas, Minjoon Kouh, Charles Cadieu, This content is excluded from our CreativeUlf Knoblich, Gabriel Kreiman, and Tomaso Poggio. A Commons license. For more information,theory of object recognition: Computations and circuits see https://ocw.mit.edu/help/faq-fair-use/.in the feedforward path of the ventral stream in primatevisual cortex. No. AI MEMO-2005-036. MassachusettsInstitute of Technology Center for Biological andComputational Learning, 2005.
10
Courtesy of Elsevier, Inc., http://www.sciencedirect.com.Used with permission.Source: Wallisch, Pascal, and J. Anthony Movshon."Structure and function come unglued in the visualcortex." Neuron 60, no. 2 (2008): 195-197.
![Page 11: iTheory: Visual Cortex and Deep Networks · 2020-04-23 · i-theory: exploring a new hypothesis. A main computational goal of the . feedforward. ventral stream hierarchy — and of](https://reader033.fdocuments.us/reader033/viewer/2022043018/5f3a9e6aac77a035453f36e3/html5/thumbnails/11.jpg)
What i-theory can answer for you• why some hierarchical nets work well
• what is visual cortex computing?
• function and circuits of simple-complex cells
• why Gabor-like tuning in simple cells?
• why generic, Gabor-like tuning in early areas and specific selective tuning higher up?
• what is the computational reason for the eccentricity-dependent size of RFs in V1, V2, V4?
• what are the roles of back projections? poggio, anselmi, rosasco, tacchetti, leibo, liao
Courtesy of Tomaso Poggio, Jim Mutch, Fabio Anselmi,Andrea Tacchetti, Lorenzo Rosasco and Joel Leibo.Used with permission.Source: Poggio, Tomaso, Jim Mutch, Fabio Anselmi,Andrea Tacchetti, Lorenzo Rosasco, and Joel Z. Leibo."Does invariant recognition predict tuning of neuronsin sensory cortex?" (2013).
11
![Page 12: iTheory: Visual Cortex and Deep Networks · 2020-04-23 · i-theory: exploring a new hypothesis. A main computational goal of the . feedforward. ventral stream hierarchy — and of](https://reader033.fdocuments.us/reader033/viewer/2022043018/5f3a9e6aac77a035453f36e3/html5/thumbnails/12.jpg)
i-theory: exploring a new hypothesis
A main computational goal of the feedforward ventral stream hierarchy — and of vision — is to compute a representation for each incoming image which is invariant to transformations previously experienced in the visual environment.
12
![Page 13: iTheory: Visual Cortex and Deep Networks · 2020-04-23 · i-theory: exploring a new hypothesis. A main computational goal of the . feedforward. ventral stream hierarchy — and of](https://reader033.fdocuments.us/reader033/viewer/2022043018/5f3a9e6aac77a035453f36e3/html5/thumbnails/13.jpg)
EmpiricaldemonstraJon:invariantrepresentaJonleadstolowersamplecomplexityforasupervisedclassifier
Theorem (transla)on case)Consider a space of images ofdimensions pixelswhich may appear in anyposiJon within a window ofsize pixels.Theusualimage representaJon yields asamplecomplexity(ofalinearc l a s s i fi e r ) o forder ;the oraclerepresentaJon (invariant)yields (because of muchsmaller covering numbers) asamplecomplexityoforder
d × d
rd × rd
m = O(r2d 2 )
mmoracle = O(d
2 ) = image
r2
Courtesy of Elsevier, Inc., http://www.sciencedirect.com. Used with permission.Source: Anselmi, Fabio, Joel Z. Leibo, Lorenzo Rosasco, Jim Mutch, AndreaTacchetti, and Tomaso Poggio. "Unsupervised learning of invariantrepresentations. " Theoretical Computer Science 633 (2016): 112-121.
13
![Page 14: iTheory: Visual Cortex and Deep Networks · 2020-04-23 · i-theory: exploring a new hypothesis. A main computational goal of the . feedforward. ventral stream hierarchy — and of](https://reader033.fdocuments.us/reader033/viewer/2022043018/5f3a9e6aac77a035453f36e3/html5/thumbnails/14.jpg)
An algorithm that learns in an unsupervised way to compute invariant representations
ν
P(ν )
ν X|G|
µkn(I) = 1/|G| �(I
i=1
· gitk + n�)14
![Page 15: iTheory: Visual Cortex and Deep Networks · 2020-04-23 · i-theory: exploring a new hypothesis. A main computational goal of the . feedforward. ventral stream hierarchy — and of](https://reader033.fdocuments.us/reader033/viewer/2022043018/5f3a9e6aac77a035453f36e3/html5/thumbnails/15.jpg)
15
Invariant signature from a single image of a new object
15
![Page 16: iTheory: Visual Cortex and Deep Networks · 2020-04-23 · i-theory: exploring a new hypothesis. A main computational goal of the . feedforward. ventral stream hierarchy — and of](https://reader033.fdocuments.us/reader033/viewer/2022043018/5f3a9e6aac77a035453f36e3/html5/thumbnails/16.jpg)
We need only a finite number of projections, K, to distinguish among n images.
Similar in spirit to Johnson-Lindestrauss
16
![Page 17: iTheory: Visual Cortex and Deep Networks · 2020-04-23 · i-theory: exploring a new hypothesis. A main computational goal of the . feedforward. ventral stream hierarchy — and of](https://reader033.fdocuments.us/reader033/viewer/2022043018/5f3a9e6aac77a035453f36e3/html5/thumbnails/17.jpg)
I-Theory
So far: compact groups in
I-theory extend proves invariance+uniqueness theorems for
• partially observable groups
• non-group transformations
• hierarchies of magic HW modules (multilayer)
R2
Courtesy of NIPS. Used with permission.Source: Liao, Qianli, Joel Z. Leibo, and Tomaso Poggio."Learning invariant representations and applications toface verification." In Advances in Neural InformationProcessing Systems, pp. 3057-3065. 2013.
17
![Page 18: iTheory: Visual Cortex and Deep Networks · 2020-04-23 · i-theory: exploring a new hypothesis. A main computational goal of the . feedforward. ventral stream hierarchy — and of](https://reader033.fdocuments.us/reader033/viewer/2022043018/5f3a9e6aac77a035453f36e3/html5/thumbnails/18.jpg)
Invariance, sparsity, wavelets
Theorem: Sparsity is necessary and sufficient condition for translation and scale invariance. Sparsity for translation (respectively scale) invariance is equivalent to the support of the template being small in space (respectively frequency).
Theorem: Maximum simultaneous invariance to translation and scale is achieved by Gabor templates:
x
t(x) = e−2 2σ eiω0 x2
18
![Page 19: iTheory: Visual Cortex and Deep Networks · 2020-04-23 · i-theory: exploring a new hypothesis. A main computational goal of the . feedforward. ventral stream hierarchy — and of](https://reader033.fdocuments.us/reader033/viewer/2022043018/5f3a9e6aac77a035453f36e3/html5/thumbnails/19.jpg)
Non-group transformations: approximate invariance in class-specific
regime
is locally invariant if:
- is sparse in the dictionary of t k
- transforms in the same way (belong to the same class) as
- the transformation is sufficiently smooth
µnk (I )
I
I t k
19
![Page 20: iTheory: Visual Cortex and Deep Networks · 2020-04-23 · i-theory: exploring a new hypothesis. A main computational goal of the . feedforward. ventral stream hierarchy — and of](https://reader033.fdocuments.us/reader033/viewer/2022043018/5f3a9e6aac77a035453f36e3/html5/thumbnails/20.jpg)
Hierarchies of magic HW modules: key property is covariance
l=4
l=3
l=2
l=1ule HW mod
Courtesy of The Center for Brains, Minds and Machines, MIT.
20
![Page 21: iTheory: Visual Cortex and Deep Networks · 2020-04-23 · i-theory: exploring a new hypothesis. A main computational goal of the . feedforward. ventral stream hierarchy — and of](https://reader033.fdocuments.us/reader033/viewer/2022043018/5f3a9e6aac77a035453f36e3/html5/thumbnails/21.jpg)
Local and global invariance: whole-parts theorem
For any signal (image) there is a layer in the hierarchy such that the response is invariant w.r.t. the signal transformation.
21
Source: Serre, Thomas, Minjoon Kouh, Charles Cadieu, Ulf Knoblich,Gabriel Kreiman, and Tomaso Poggio. A theory of object recognition:Computations and circuits in the feedforward path of the ventral streamin primate visual cortex. No. AI MEMO-2005-036. Massachusetts Instituteof Technology Center for Biological and Computational Learning, 2005.
![Page 22: iTheory: Visual Cortex and Deep Networks · 2020-04-23 · i-theory: exploring a new hypothesis. A main computational goal of the . feedforward. ventral stream hierarchy — and of](https://reader033.fdocuments.us/reader033/viewer/2022043018/5f3a9e6aac77a035453f36e3/html5/thumbnails/22.jpg)
biophysics: predictionon simple-complex cell
22
![Page 23: iTheory: Visual Cortex and Deep Networks · 2020-04-23 · i-theory: exploring a new hypothesis. A main computational goal of the . feedforward. ventral stream hierarchy — and of](https://reader033.fdocuments.us/reader033/viewer/2022043018/5f3a9e6aac77a035453f36e3/html5/thumbnails/23.jpg)
...
Basic machine: a HW module (dot products and histograms/moments for image seen through RF)
• The cumulative histogram (empirical cdf) can be be computed as
• This maps directly into a set of simple cells with threshold
• …and a complex cell indexed by n and k summating the simple cells
1 |G |
µ kn (I ) = ∑σ ( I ,g
|G | itk + nΔ)
i=1
nΔ
The nonlinearity can be rather arbitrary for invariance provided it is stationary in time23
![Page 24: iTheory: Visual Cortex and Deep Networks · 2020-04-23 · i-theory: exploring a new hypothesis. A main computational goal of the . feedforward. ventral stream hierarchy — and of](https://reader033.fdocuments.us/reader033/viewer/2022043018/5f3a9e6aac77a035453f36e3/html5/thumbnails/24.jpg)
Robust and bio plausible
• nonlinearity can be almost anything
• pooling is average but softmax is OK
• low bit precision • Details and ML remarks
24
![Page 25: iTheory: Visual Cortex and Deep Networks · 2020-04-23 · i-theory: exploring a new hypothesis. A main computational goal of the . feedforward. ventral stream hierarchy — and of](https://reader033.fdocuments.us/reader033/viewer/2022043018/5f3a9e6aac77a035453f36e3/html5/thumbnails/25.jpg)
SecondAnnualNSFSiteVisit,June2–3,2015
Dendrites of a complex cells as simple cells…
Active properties in the dendrites of the complex cell
© Source Unknown. All rights reserved. This content is excluded from our CreativeCommons license. For more information, see https://ocw.mit.edu/help/faq-fair-use/.
25
![Page 26: iTheory: Visual Cortex and Deep Networks · 2020-04-23 · i-theory: exploring a new hypothesis. A main computational goal of the . feedforward. ventral stream hierarchy — and of](https://reader033.fdocuments.us/reader033/viewer/2022043018/5f3a9e6aac77a035453f36e3/html5/thumbnails/26.jpg)
MIT OpenCourseWarehttps://ocw.mit.edu
Resource: Brains, Minds and Machines Summer CourseTomaso Poggio and Gabriel Kreiman
The following may not correspond to a particular course on MIT OpenCourseWare, but has beenprovided by the author as an individual learning resource.
For information about citing these materials or our Terms of Use, visit: https://ocw.mit.edu/terms.