Learning in Hierarchical Architectures: from Neuroscience...
![Page 1: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/1.jpg)
Learning in Hierarchical Architectures: from Neuroscience to Derived Kernels
Learning is the gateway to understanding the brain and to making intelligent machines.
Problem of learning: a focus for
• math
• computer algorithms
• neuroscience
Tomaso Poggio, McGovern Institute BCS, CSAIL MIT
![Page 2: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/2.jpg)
Message of today
Neuroscience may begin to provide new ideas and approaches to machine learning, AI and computer vision.
![Page 3: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/3.jpg)
Message of today
Neuroscience may begin to provide new ideas and approaches to machine learning, AI and computer vision.
A case in point: (un)supervised learning
![Page 4: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/4.jpg)
Learning in Hierarchical Architectures: from Neuroscience to Derived Kernels
Learning is the gateway to understanding the brain and to making intelligent machines.
Problem of learning: a focus for
• math
• computer algorithms
• neuroscience
Tomaso Poggio, McGovern Institute BCS, CSAIL MIT
![Page 5: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/5.jpg)
1. Today’s supervised learning algorithms: sample complexity problem and shallow architectures
2. Visual Cortex: hierarchical architecture, from neuroscience to a class of models
3. Physiology, psychophysics, computer vision
4. Models suggest new architectures for learning
5. Extensions and limitations of models
![Page 6: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/6.jpg)
INPUT → f → OUTPUT
Supervised learning
![Page 7: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/7.jpg)
Supervised learning
![Page 8: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/8.jpg)
Definitions
![Page 9: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/9.jpg)
Classical learning theory and Kernel Machines (Regularization in RKHS)
Equation includes splines, Radial Basis Functions and SVMs (depending on choice of V).
implies
For a review, see Poggio and Smale, The Mathematics of Learning, Notices of the AMS, 2003; see also Schoelkopf and Smola, 2002; Bousquet, O., S. Boucheron and G. Lugosi.
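The equations on this slide did not survive extraction. As a sketch only, the standard form of Tikhonov regularization in an RKHS, as presented in reviews such as the one cited above, is shown below. The symbols are assumptions of this note: a training set $(x_i, y_i)_{i=1}^{\ell}$, a loss function $V$, a kernel $K$ with RKHS $\mathcal{H}_K$, and a regularization parameter $\lambda > 0$.

```latex
\min_{f \in \mathcal{H}_K} \; \frac{1}{\ell} \sum_{i=1}^{\ell} V\big(f(x_i), y_i\big) + \lambda \, \|f\|_K^2
\quad \Longrightarrow \quad
f(x) = \sum_{i=1}^{\ell} \alpha_i \, K(x, x_i)
```

Different choices of $V$ give different algorithms (for example, the square loss gives regularized least squares and the hinge loss gives SVMs, the latter usually with an added bias term). The solution is a weighted sum of kernel units centered on the training points, which is why it can be read as a one-hidden-layer network.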
![Page 10: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/10.jpg)
Classical learning theory and Kernel Machines (Regularization in RKHS)
implies
Kernel machines correspond to shallow networks
[Figure: shallow network diagram; labels X_1, …, X_l and output f]
![Page 11: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/11.jpg)
How do the learning machines described by classical learning theory -- such as kernel machines -- compare with brains?
One of the most obvious differences is the apparent ability of people and animals to learn from very few examples (“poverty of stimulus” problem).
A comparison with real brains offers another, related, challenge to learning theory. Classical “learning algorithms” correspond to one-layer architectures. The cortex suggests a hierarchical architecture.
Are hierarchical architectures with more layers the answer to the sample complexity issue?
Tomaso Poggio and Steve Smale, The Mathematics of Learning: Dealing with Data, Notices of the American Mathematical Society (AMS), Vol. 50, No. 5, 537-544, 2003.
Present learning algorithms: “high” sample complexity and shallow architectures
![Page 12: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/12.jpg)
How then do the learning machines described in the theory compare with brains?
One of the most obvious differences is the ability of people and animals to learn from very few examples.
Are hierarchical architectures with more layers justifiable in terms of learning theory?
Why hierarchies in the brain?
Tomaso Poggio and Steve Smale, The Mathematics of Learning: Dealing with Data, Notices of the American Mathematical Society (AMS), Vol. 50, No. 5, 537-544, 2003.
Towards a hierarchical learning theory
![Page 13: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/13.jpg)
1. Today’s supervised learning algorithms: sample complexity problem and shallow architectures
2. Visual Cortex: hierarchical architecture, from neuroscience to a class of models
3. Physiology, psychophysics, computer vision
4. Models suggest new architectures for learning
5. Extensions and limitations of models
![Page 14: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/14.jpg)
The Ventral Stream
Desimone & Ungerleider 1989
dorsal stream: “where”
ventral stream: “what”
Hypothesis: the hierarchical architecture of the ventral stream in monkey visual cortex has a key role in object recognition…of course subcortical pathways may also be important (thalamus, in particular pulvinar…).
![Page 15: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/15.jpg)
visual recognition is a difficult learning problem (e.g., “is there an animal in the image?”)
The Ventral Stream
![Page 16: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/16.jpg)
• Human Brain
– 10^10-10^11 neurons (~1 million flies)
– 10^14-10^15 synapses
• Ventral stream in rhesus monkey
– 10^9 neurons
– 5 × 10^6 neurons in AIT (Anterior InferoTemporal) cortex
The Ventral Stream
![Page 17: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/17.jpg)
The ventral stream in monkey visual cortex has a key role in solving this problem
Desimone & Ungerleider 1989
dorsal stream: “where”
ventral stream: “what”
…of course subcortical pathways may also be important (thalamus, in particular pulvinar…).
![Page 18: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/18.jpg)
The ventral stream hierarchy: V1, V2, V4, IT
A gradual increase in the receptive field size, in the complexity of the preferred stimulus, and in tolerance to position and scale changes
Kobatake & Tanaka, 1994
The Ventral Stream
![Page 19: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/19.jpg)
The ventral stream
Feedforward connections only?
![Page 20: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/20.jpg)
The ventral stream
Feedforward connections as well as backprojections:
How far can we push the simplest type of feedforward hierarchical models?
![Page 21: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/21.jpg)
(Thorpe and Fabre-Thorpe, 2001)
The ventral stream
![Page 22: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/22.jpg)
*Modified from (Gross, 1998)
Model of Visual Recognition (millions of units) based on neuroscience of cortex
[software available online] Riesenhuber & Poggio 1999, 2000; Serre Kouh Cadieu Knoblich Kreiman & Poggio 2005; Serre Oliva Poggio 2007
![Page 23: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/23.jpg)
[software available online] Riesenhuber & Poggio 1999, 2000; Serre Kouh Cadieu Knoblich Kreiman & Poggio 2005; Serre Oliva Poggio 2007
• It is in the family of “Hubel-Wiesel” models (Hubel & Wiesel, 1959; Fukushima, 1980; Oram & Perrett, 1993, Wallis & Rolls, 1997; Riesenhuber & Poggio, 1999; Thorpe, 2002; Ullman et al., 2002; Mel, 1997; Wersing and Koerner, 2003; LeCun et al 1998; Amit & Mascaro 2003; Deco & Rolls 2006…)
• As a biological model of object recognition in the ventral stream – from V1 to PFC -- it is perhaps the most quantitative and faithful to known neuroscience
• Feedforward only: an approximation of the first 100 msec of visual perception. Potential key limitation
• Hierarchy of disjunctions of conjunctions (~Geman)
Model of Visual Recognition (millions of units) based on neuroscience of cortex
![Page 24: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/24.jpg)
Two operations, one circuit?
• Simple units: tuning operation (Gaussian-like, AND-like)
• Complex units: max-like operation (OR-like)
[Figure panels: Stage 1, Stage 2, Stage 3]
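The two operations named on this slide can be illustrated with a minimal sketch. It is not the released model software: the 1-D "image", the template, and the function names `simple_unit`, `complex_unit` and `s_layer` are assumptions made up for illustration. A simple unit applies Gaussian-like tuning to a stored template, and a complex unit takes the max over simple units at different positions, which gives tolerance to shifts.

```python
import numpy as np

def simple_unit(x, w, sigma=1.0):
    """Gaussian-like tuning (AND-like): the response is 1 when x equals the
    template w and decays as x moves away from it."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    return float(np.exp(-np.sum((x - w) ** 2) / (2.0 * sigma ** 2)))

def complex_unit(responses):
    """Max-like pooling (OR-like): respond if any afferent simple unit responds."""
    return float(np.max(responses))

def s_layer(signal, template, sigma=1.0):
    """Apply the same simple unit at every position of a 1-D signal."""
    n = len(template)
    return np.array([simple_unit(signal[i:i + n], template, sigma)
                     for i in range(len(signal) - n + 1)])

# Hypothetical toy input: the pattern [1, 0, 1] embedded in zeros.
template = np.array([1.0, 0.0, 1.0])
signal = np.array([0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0])
shifted = np.roll(signal, 1)  # same pattern, one position to the right

c_original = complex_unit(s_layer(signal, template))
c_shifted = complex_unit(s_layer(shifted, template))
print(c_original, c_shifted)  # the pooled response is unchanged by the shift
```

The simple-unit responses change when the pattern moves, but the pooled complex-unit response does not, which is the sense in which alternating the two stages builds selectivity and invariance together.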
![Page 25: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/25.jpg)
Stage 1
Stage 2
Instead of Gaussian, normalized dot product
A plausible biophysical implementation of a Gaussian-like tuning (Kouh, Poggio, 2008): normalized dot product
$$\frac{w \cdot x}{|x|}$$
![Page 26: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/26.jpg)
Stage 1
Stage 2
Two operations, one circuit?
A plausible biophysical implementation for both Gaussian tuning (~AND) + max (~OR): normalization circuits with divisive inhibition (Kouh, Poggio, 2008; also RP, 1999; Heeger, Carandini, Simoncelli,…)
A canonical microcircuit?
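How one divisive-normalization circuit could yield both operations can be sketched as follows. The functional form, a weighted sum of powered inputs divided by a pooled normalization term, and the specific exponents are assumptions chosen for illustration in the spirit of the cited work, not a transcription of it; `normalization_circuit` is a hypothetical name.

```python
import numpy as np

def normalization_circuit(x, w, p, q, r, k=1e-9):
    """y = sum_j w_j * x_j**p / (k + (sum_j x_j**q)**r), for non-negative x.
    k is a small constant that avoids division by zero."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    return float(np.sum(w * x ** p) / (k + np.sum(x ** q) ** r))

# Tuning regime: p=1, q=2, r=0.5 gives the normalized dot product w.x / |x|.
w = np.array([3.0, 4.0]) / 5.0                    # unit-norm template
match = normalization_circuit([3.0, 4.0], w, p=1, q=2, r=0.5)
match_scaled = normalization_circuit([30.0, 40.0], w, p=1, q=2, r=0.5)
mismatch = normalization_circuit([4.0, 3.0], w, p=1, q=2, r=0.5)

# Max-like regime: uniform weights, p=q+1, r=1 gives a softmax that
# approaches max(x) as q grows.
x = np.array([0.2, 0.9, 0.5])
soft_max = normalization_circuit(x, np.ones_like(x), p=21, q=20, r=1)

print(match, match_scaled, mismatch, soft_max)
```

In the tuning regime the response is largest when the input direction matches the template and does not depend on input magnitude; in the max-like regime the output is dominated by the strongest input. Only the exponents and weights differ between the two regimes.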
![Page 27: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/27.jpg)
• Overcomplete dictionary of “templates” or image “patches” is learned during an unsupervised learning stage (from ~10,000 natural images) by tuning S units.
see also (Foldiak 1991; Perrett et al 1984; Wallis & Rolls, 1997; Lewicki and Olshausen, 1999; Einhauser et al 2002; Wiskott & Sejnowski 2002; Spratling 2005)
• Task-specific circuits (from IT to PFC)
- Supervised learning: ~ classifier
Learning: supervised and unsupervised
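The split between the two learning stages can be sketched on toy data. Everything below is an illustrative assumption rather than the published model: 1-D signals stand in for images, templates are "learned" by copying random patches from unlabeled signals, and a regularized least-squares classifier stands in for the task-specific circuit at the top. All function names are hypothetical.

```python
import numpy as np

def sample_templates(unlabeled, patch_len, n_templates, rng):
    """Unsupervised stage: store random patches of unlabeled signals as templates."""
    templates = []
    for _ in range(n_templates):
        s = unlabeled[rng.integers(len(unlabeled))]
        start = rng.integers(len(s) - patch_len + 1)
        templates.append(s[start:start + patch_len])
    return np.array(templates)

def encode(signal, templates, sigma=1.0):
    """S stage (Gaussian tuning to every template at every position),
    then C stage (max over positions). Returns one feature per template."""
    n = templates.shape[1]
    windows = np.array([signal[i:i + n] for i in range(len(signal) - n + 1)])
    d2 = ((windows[:, None, :] - templates[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2)).max(axis=0)

def fit_rls(features, labels, lam=1e-3):
    """Supervised stage: regularized least-squares linear classifier."""
    d = features.shape[1]
    return np.linalg.solve(features.T @ features + lam * np.eye(d),
                           features.T @ labels)

def make_signal(motif, position, length=16):
    """Toy 'image': a motif embedded in zeros at the given position."""
    s = np.zeros(length)
    s[position:position + len(motif)] = motif
    return s

rng = np.random.default_rng(0)
motif_a = np.array([1.0, 1.0, 0.0, 1.0])
motif_b = np.array([1.0, 0.0, 1.0, 1.0])

# Many unlabeled examples are used to build the template dictionary.
unlabeled = [make_signal(m, int(rng.integers(4, 9)))
             for m in (motif_a, motif_b) for _ in range(20)]
templates = sample_templates(unlabeled, patch_len=4, n_templates=30, rng=rng)

# Only two labeled examples are used to train the classifier at the top.
train_x = [make_signal(motif_a, 5), make_signal(motif_b, 5)]
train_y = np.array([1.0, -1.0])
features = np.array([encode(s, templates) for s in train_x])
weights = fit_rls(features, train_y)

# A class-A motif at a position never seen with a label.
test_signal = make_signal(motif_a, 7)
score = float(encode(test_signal, templates) @ weights)
print(score)  # a positive score means "class A"
```

Because the max over positions makes the features shift-tolerant, the shifted test signal has the same representation as the labeled class-A example, so the classifier needs no labeled example at the new position.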
![Page 28: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/28.jpg)
• Preprocessing stages lead to a representation that has lower sample complexity than the image itself
• We refer to the sample complexity of the preprocessing stage as the number of labeled examples required by the classifier at the top
Learning: supervised and unsupervised
Riesenhuber & Poggio 1999, 2000; Serre Kouh Cadieu Knoblich Kreiman & Poggio 2005; Serre Oliva Poggio 2007
![Page 29: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/29.jpg)
1. Today’s supervised learning algorithms: sample complexity problem and shallow architectures
2. Visual Cortex: hierarchical architecture, from neuroscience to a class of models
3. Physiology, psychophysics, computer vision
4. Models suggest new architectures for learning
5. Extensions and limitations of models
![Page 30: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/30.jpg)
Hierarchical feedforward models of the ventral stream
Millions of units
CBCL software available on the Web
![Page 31: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/31.jpg)
• V1:
– Simple and complex cells tuning (Schiller et al 1976; Hubel & Wiesel 1965; Devalois et al 1982)
– MAX-like operation in subset of complex cells (Lampl et al 2004)
• V4:
– Tuning for two-bar stimuli (Reynolds Chelazzi & Desimone 1999)
– MAX-like operation (Gawne et al 2002)
– Two-spot interaction (Freiwald et al 2005)
– Tuning for boundary conformation (Pasupathy & Connor 2001, Cadieu, Kouh, Connor et al., 2007)
– Tuning for Cartesian and non-Cartesian gratings (Gallant et al 1996)
• IT:
– Tuning and invariance properties (Logothetis et al 1995, paperclip objects)
– Differential role of IT and PFC in categorization (Freedman et al 2001, 2002, 2003)
– Read out results (Hung Kreiman Poggio & DiCarlo 2005)
– Pseudo-average effect in IT (Zoccolan Cox & DiCarlo 2005; Zoccolan Kouh Poggio & DiCarlo 2007)
• Human:
– Rapid categorization (Serre Oliva Poggio 2007)
– Face processing (fMRI + psychophysics) (Riesenhuber et al 2004; Jiang et al 2006)
Hierarchical Feedforward Models: predict / are consistent with neural data
Hierarchical feedforward models of the ventral stream
![Page 32: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/32.jpg)
Rapid Categorization
Hierarchical feedforward models of the ventral stream
![Page 33: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/33.jpg)
Mask should force visual cortex to operate in feedforward mode
Animal present or not?
[Figure: trial sequence: image (20 ms), image-mask interval (30 ms ISI), mask (1/f noise)]
Thorpe et al 1996; Van Rullen & Koch 2003; Bacon-Mace et al 2005
Rapid Categorization
![Page 34: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/34.jpg)
Feedforward Models: “predict” rapid categorization (82% model vs. 80% humans)
Hierarchical feedforward models of the ventral stream
![Page 35: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/35.jpg)
Hierarchical feedforward models of the ventral stream
• Image-by-image correlation:
– Heads: ρ=0.71
– Close-body: ρ=0.84
– Medium-body: ρ=0.71
– Far-body: ρ=0.60
![Page 36: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/36.jpg)
Feedforward Models: perform well compared to engineered computer vision systems (in 2006)
Hierarchical feedforward models of the ventral stream
![Page 37: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/37.jpg)
Serre, Kouh, Cadieu, Knoblich, Kreiman & Poggio 2005; Pinto, Cox and DiCarlo 2008; Pinto, DiCarlo and Cox 2009
“Mutations” of the architecture perform well!
![Page 38: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/38.jpg)
Cox, Pinto, Doukhan, Corda & DiCarlo (2008); Pinto, DiCarlo & Cox (in prep)
1. Generate thousands of model variants
2. Unsupervised training of each variant
3. Supervised testing of each variant
GPU-accelerated (~1000x speed-up)
“Mutations” of the architecture perform well!
![Page 39: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/39.jpg)
1. Today’s supervised learning algorithm: sample complexity problem and shallow architectures
2. Visual Cortex: from neuroscience to a class of models
3. Physiology, psychophysics, computer vision
4. Models suggest new architectures for learning
5. Extensions (video+attention) and limitations of feedforward hierarchical models
![Page 40: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/40.jpg)
Hierarchical feedforward models of visual cortex may be wrong
…but present a challenge for “classical” learning theory:
an unusual, hierarchical architecture with unsupervised and supervised learning
working well…
…so… we need theories -- not just models!
![Page 41: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/41.jpg)
![Page 42: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/42.jpg)
![Page 43: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/43.jpg)
![Page 44: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/44.jpg)
![Page 45: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/45.jpg)
![Page 46: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/46.jpg)
![Page 47: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/47.jpg)
![Page 48: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/48.jpg)
![Page 49: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/49.jpg)
![Page 50: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/50.jpg)
![Page 51: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/51.jpg)
![Page 52: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/52.jpg)
![Page 53: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/53.jpg)
![Page 54: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/54.jpg)
![Page 55: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/55.jpg)
Hierarchy can reduce sample complexity: empirical support
Bouvrie, Rosasco, Poggio, Smale, 2009
![Page 56: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/56.jpg)
Neural Response Theoretical Analysis
Summary
Compact mathematical description of a feedforward model of the visual cortex
“Derived” kernel recursively defined
Initial results on invariance/discrimination properties
Open problem: Discrimination/Approximation properties
Open problem: Number of layers and sample complexity (poverty-of-stimulus)
Open problem: Efficient learning of the templates
Conjecture: small effective dimensionality at each layer
T. Poggio Derived Kernels
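The slides carrying the definitions did not survive extraction, so the recursion mentioned in the summary is sketched here under stated assumptions. The notation is this note's, not necessarily the talk's: $f, g$ are image patches at layer $m$, $H_{m-1}$ is the set of transformations that restrict a layer-$m$ patch to a patch of the layer below, $T_{m-1}$ is the template set of that layer, and $\widehat{K}_{m-1}$ is the normalized kernel of that layer.

```latex
% Neural response: max over transformations (pooling), one value per template
N_m(f)(t) = \max_{h \in H_{m-1}} \widehat{K}_{m-1}(f \circ h, \, t), \qquad t \in T_{m-1}

% Derived kernel: inner product of neural responses, then normalization
K_m(f, g) = \big\langle N_m(f), \, N_m(g) \big\rangle_{L^2(T_{m-1})}, \qquad
\widehat{K}_m(f, g) = \frac{K_m(f, g)}{\sqrt{K_m(f, f) \, K_m(g, g)}}
```

Read this way, comparison with templates plays the role of the tuning (S) stage and the max over transformations plays the role of the pooling (C) stage, with the recursion started from a simple kernel, such as a normalized inner product, on the smallest patches.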
![Page 57: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/57.jpg)
1. Today’s supervised learning algorithm: sample complexity problem and shallow architectures
2. Visual Cortex: from neuroscience to a class of models
3. Physiology, psychophysics, computer vision
4. Models suggest new architectures for learning
5. Extensions (video+attention) and limitations of feedforward hierarchical models
![Page 58: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/58.jpg)
human agreement 72%
proposed system 71%
commercial system 56%
chance 12%
Automatic recognition of continuous rodent behavior (over hours): automatic phenotyping
Extension to motion: model of the dorsal stream
![Page 59: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/59.jpg)
From neuroscience to models: extension to attention
![Page 60: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/60.jpg)
An integrated Bayesian model with B. Desimone + E. Miller
Chikkerur, Tan, Serre, Poggio (submitted); see Koch & Ullman 1985; Tsotsos 1985; Itti & Koch 1999; Deco & Rolls 2004; Walther et al 2005; Hamker 2005; Zelinski 2005; Rao 2005
From neuroscience to models: extension to attention
![Page 61: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/61.jpg)
Chikkerur, Tan, Serre, Poggio (submitted); see Koch & Ullman 1985; Tsotsos 1985; Itti & Koch 1999; Deco & Rolls 2004; Walther et al 2005; Hamker 2005; Zelinski 2005; Rao 2005
An integrated Bayesian model with B. Desimone + E. Miller
From neuroscience to models: extension to attention
![Page 62: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/62.jpg)
Chikkerur, Tan, Serre, Poggio (submitted); see Koch & Ullman 1985; Tsotsos 1985; Itti & Koch 1999; Deco & Rolls 2004; Walther et al 2005; Hamker 2005; Zelinski 2005; Rao 2005
An integrated Bayesian model with B. Desimone + E. Miller
From neuroscience to models: extension to attention
![Page 63: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/63.jpg)
Limitations of present feedforward hierarchical models
![Page 64: Learning in Hierarchical Architectures: from Neuroscience ...web.cse.ohio-state.edu/mlss09/mlss09_talks/6.june... · Learning in Hierarchical Architectures: from Neuroscience to Derived](https://reader034.fdocuments.us/reader034/viewer/2022051909/5ffdf159cedbbd622039f942/html5/thumbnails/64.jpg)
Collaborators in recent work
T. Serre, L. Rosasco, S. Chikkerur, E. Meyers, J. Bouvrie, H. Jhuang, C. Tan
Also: M. Kouh, G. Kreiman, M. Riesenhuber, J. DiCarlo, E. Miller, B. Desimone, A. Oliva, C. Koch, D. Walther, C. Cadieu, U. Knoblich, T. Masquelier, S. Bileschi, L. Wolf, E. Connor, D. Ferster, I. Lampl, A. Pasupathy