Efficient BackProp Yann LeCuni, Leon Bottoui, Genevieve B. Orr2 ...
ECE 6504: Deep Learning for Perception Dhruv Batra Virginia Tech Topics: –Neural Networks...
-
Upload
hillary-cunningham -
Category
Documents
-
view
220 -
download
0
Transcript of ECE 6504: Deep Learning for Perception Dhruv Batra Virginia Tech Topics: –Neural Networks...
![Page 1: ECE 6504: Deep Learning for Perception Dhruv Batra Virginia Tech Topics: –Neural Networks –Backprop –Modular Design.](https://reader036.fdocuments.us/reader036/viewer/2022062422/56649ec15503460f94bcd6a3/html5/thumbnails/1.jpg)
ECE 6504: Deep Learningfor Perception
Dhruv Batra
Virginia Tech
Topics: – Neural Networks
– Backprop– Modular Design
![Page 2: ECE 6504: Deep Learning for Perception Dhruv Batra Virginia Tech Topics: –Neural Networks –Backprop –Modular Design.](https://reader036.fdocuments.us/reader036/viewer/2022062422/56649ec15503460f94bcd6a3/html5/thumbnails/2.jpg)
Administrativia• Scholar
– Anybody not have access?– Please post questions on Scholar Forum. – Please check scholar forums. You might not know you have
a doubt.
• Sign up for Presentations– https://docs.google.com/spreadsheets/d/1m76E4mC0wfRjc4
HRBWFdAlXKPIzlEwfw1-u7rBw9TJ8/edit#gid=2045905312
(C) Dhruv Batra 2
![Page 3: ECE 6504: Deep Learning for Perception Dhruv Batra Virginia Tech Topics: –Neural Networks –Backprop –Modular Design.](https://reader036.fdocuments.us/reader036/viewer/2022062422/56649ec15503460f94bcd6a3/html5/thumbnails/3.jpg)
Plan for Today• Notation + Setup• Neural Networks• Chain Rule + Backprop
(C) Dhruv Batra 3
![Page 4: ECE 6504: Deep Learning for Perception Dhruv Batra Virginia Tech Topics: –Neural Networks –Backprop –Modular Design.](https://reader036.fdocuments.us/reader036/viewer/2022062422/56649ec15503460f94bcd6a3/html5/thumbnails/4.jpg)
Supervised Learning• Input: x (images, text, emails…)
• Output: y (spam or non-spam…)
• (Unknown) Target Function– f: X Y (the “true” mapping / reality)
• Data – (x1,y1), (x2,y2), …, (xN,yN)
• Model / Hypothesis Class– g: X Y– y = g(x) = sign(wTx)
• Learning = Search in hypothesis space– Find best g in model class.
(C) Dhruv Batra 4
![Page 5: ECE 6504: Deep Learning for Perception Dhruv Batra Virginia Tech Topics: –Neural Networks –Backprop –Modular Design.](https://reader036.fdocuments.us/reader036/viewer/2022062422/56649ec15503460f94bcd6a3/html5/thumbnails/5.jpg)
Basic Steps of Supervised Learning• Set up a supervised learning problem
• Data collection – Start with training data for which we know the correct outcome provided by a teacher or
oracle.
• Representation – Choose how to represent the data.
• Modeling – Choose a hypothesis class: H = {g: X Y}
• Learning/Estimation – Find best hypothesis you can in the chosen class.
• Model Selection– Try different models. Picks the best one. (More on this later)
• If happy stop– Else refine one or more of the above
(C) Dhruv Batra 5
![Page 6: ECE 6504: Deep Learning for Perception Dhruv Batra Virginia Tech Topics: –Neural Networks –Backprop –Modular Design.](https://reader036.fdocuments.us/reader036/viewer/2022062422/56649ec15503460f94bcd6a3/html5/thumbnails/6.jpg)
Error Decomposition
(C) Dhruv Batra 6
Reality
Mod
eling
Erro
r
model class
Estimation
Error
OptimizationError
![Page 7: ECE 6504: Deep Learning for Perception Dhruv Batra Virginia Tech Topics: –Neural Networks –Backprop –Modular Design.](https://reader036.fdocuments.us/reader036/viewer/2022062422/56649ec15503460f94bcd6a3/html5/thumbnails/7.jpg)
Error Decomposition
(C) Dhruv Batra 7
Reality
Estimation
ErrorO
ptimization
Error
Modeling Erro
r
model class
![Page 8: ECE 6504: Deep Learning for Perception Dhruv Batra Virginia Tech Topics: –Neural Networks –Backprop –Modular Design.](https://reader036.fdocuments.us/reader036/viewer/2022062422/56649ec15503460f94bcd6a3/html5/thumbnails/8.jpg)
Error Decomposition
(C) Dhruv Batra 8
Reality
Modeling E
rror
model class
Estimatio
n
Error
Optimization
Error
Higher-OrderPotentials
![Page 9: ECE 6504: Deep Learning for Perception Dhruv Batra Virginia Tech Topics: –Neural Networks –Backprop –Modular Design.](https://reader036.fdocuments.us/reader036/viewer/2022062422/56649ec15503460f94bcd6a3/html5/thumbnails/9.jpg)
Biological Neuron
(C) Dhruv Batra 9
![Page 10: ECE 6504: Deep Learning for Perception Dhruv Batra Virginia Tech Topics: –Neural Networks –Backprop –Modular Design.](https://reader036.fdocuments.us/reader036/viewer/2022062422/56649ec15503460f94bcd6a3/html5/thumbnails/10.jpg)
10
Recall: The Neuron Metaphor• Neurons
• accept information from multiple inputs, • transmit information to other neurons.
• Artificial neuron• Multiply inputs by weights along edges• Apply some function to the set of inputs at each node
Image Credit: Andrej Karpathy, CS231n
![Page 11: ECE 6504: Deep Learning for Perception Dhruv Batra Virginia Tech Topics: –Neural Networks –Backprop –Modular Design.](https://reader036.fdocuments.us/reader036/viewer/2022062422/56649ec15503460f94bcd6a3/html5/thumbnails/11.jpg)
11
Types of Neurons
Linear Neuron
Logistic Neuron
Perceptron
Potentially more. Require a convex
loss function for gradient descent training.
Slide Credit: HKUST
![Page 12: ECE 6504: Deep Learning for Perception Dhruv Batra Virginia Tech Topics: –Neural Networks –Backprop –Modular Design.](https://reader036.fdocuments.us/reader036/viewer/2022062422/56649ec15503460f94bcd6a3/html5/thumbnails/12.jpg)
Activation Functions• sigmoid vs tanh
(C) Dhruv Batra 12
![Page 13: ECE 6504: Deep Learning for Perception Dhruv Batra Virginia Tech Topics: –Neural Networks –Backprop –Modular Design.](https://reader036.fdocuments.us/reader036/viewer/2022062422/56649ec15503460f94bcd6a3/html5/thumbnails/13.jpg)
A quick note
(C) Dhruv Batra 13Image Credit: LeCun et al. ‘98
![Page 14: ECE 6504: Deep Learning for Perception Dhruv Batra Virginia Tech Topics: –Neural Networks –Backprop –Modular Design.](https://reader036.fdocuments.us/reader036/viewer/2022062422/56649ec15503460f94bcd6a3/html5/thumbnails/14.jpg)
Rectified Linear Units (ReLU)
(C) Dhruv Batra 14
[Krizhevsky et al., NIPS12]
![Page 15: ECE 6504: Deep Learning for Perception Dhruv Batra Virginia Tech Topics: –Neural Networks –Backprop –Modular Design.](https://reader036.fdocuments.us/reader036/viewer/2022062422/56649ec15503460f94bcd6a3/html5/thumbnails/15.jpg)
Limitation• A single “neuron” is still a linear decision boundary
• What to do?
• Idea: Stack a bunch of them together!
(C) Dhruv Batra 15
![Page 16: ECE 6504: Deep Learning for Perception Dhruv Batra Virginia Tech Topics: –Neural Networks –Backprop –Modular Design.](https://reader036.fdocuments.us/reader036/viewer/2022062422/56649ec15503460f94bcd6a3/html5/thumbnails/16.jpg)
Multilayer Networks• Cascade Neurons together• The output from one layer is the input to the next• Each Layer has its own sets of weights
(C) Dhruv Batra 16Image Credit: Andrej Karpathy, CS231n
![Page 17: ECE 6504: Deep Learning for Perception Dhruv Batra Virginia Tech Topics: –Neural Networks –Backprop –Modular Design.](https://reader036.fdocuments.us/reader036/viewer/2022062422/56649ec15503460f94bcd6a3/html5/thumbnails/17.jpg)
Universal Function Approximators• Theorem
– 3-layer network with linear outputs can uniformly approximate any continuous function to arbitrary accuracy, given enough hidden units [Funahashi ’89]
(C) Dhruv Batra 17
![Page 18: ECE 6504: Deep Learning for Perception Dhruv Batra Virginia Tech Topics: –Neural Networks –Backprop –Modular Design.](https://reader036.fdocuments.us/reader036/viewer/2022062422/56649ec15503460f94bcd6a3/html5/thumbnails/18.jpg)
Neural Networks• Demo
– http://neuron.eng.wayne.edu/bpFunctionApprox/bpFunctionApprox.html
(C) Dhruv Batra 18
![Page 19: ECE 6504: Deep Learning for Perception Dhruv Batra Virginia Tech Topics: –Neural Networks –Backprop –Modular Design.](https://reader036.fdocuments.us/reader036/viewer/2022062422/56649ec15503460f94bcd6a3/html5/thumbnails/19.jpg)
Key Computation: Forward-Prop
(C) Dhruv Batra 19Slide Credit: Marc'Aurelio Ranzato, Yann LeCun
![Page 20: ECE 6504: Deep Learning for Perception Dhruv Batra Virginia Tech Topics: –Neural Networks –Backprop –Modular Design.](https://reader036.fdocuments.us/reader036/viewer/2022062422/56649ec15503460f94bcd6a3/html5/thumbnails/20.jpg)
Key Computation: Back-Prop
(C) Dhruv Batra 20Slide Credit: Marc'Aurelio Ranzato, Yann LeCun
![Page 21: ECE 6504: Deep Learning for Perception Dhruv Batra Virginia Tech Topics: –Neural Networks –Backprop –Modular Design.](https://reader036.fdocuments.us/reader036/viewer/2022062422/56649ec15503460f94bcd6a3/html5/thumbnails/21.jpg)
(C) Dhruv Batra 21
![Page 22: ECE 6504: Deep Learning for Perception Dhruv Batra Virginia Tech Topics: –Neural Networks –Backprop –Modular Design.](https://reader036.fdocuments.us/reader036/viewer/2022062422/56649ec15503460f94bcd6a3/html5/thumbnails/22.jpg)
(C) Dhruv Batra 22
![Page 23: ECE 6504: Deep Learning for Perception Dhruv Batra Virginia Tech Topics: –Neural Networks –Backprop –Modular Design.](https://reader036.fdocuments.us/reader036/viewer/2022062422/56649ec15503460f94bcd6a3/html5/thumbnails/23.jpg)
Visualizing Loss Functions• Sum of individual losses
(C) Dhruv Batra 23
![Page 24: ECE 6504: Deep Learning for Perception Dhruv Batra Virginia Tech Topics: –Neural Networks –Backprop –Modular Design.](https://reader036.fdocuments.us/reader036/viewer/2022062422/56649ec15503460f94bcd6a3/html5/thumbnails/24.jpg)
Detour
(C) Dhruv Batra 24
![Page 25: ECE 6504: Deep Learning for Perception Dhruv Batra Virginia Tech Topics: –Neural Networks –Backprop –Modular Design.](https://reader036.fdocuments.us/reader036/viewer/2022062422/56649ec15503460f94bcd6a3/html5/thumbnails/25.jpg)
Logistic Regression as a Cascade
(C) Dhruv Batra 25Slide Credit: Marc'Aurelio Ranzato, Yann LeCun
![Page 26: ECE 6504: Deep Learning for Perception Dhruv Batra Virginia Tech Topics: –Neural Networks –Backprop –Modular Design.](https://reader036.fdocuments.us/reader036/viewer/2022062422/56649ec15503460f94bcd6a3/html5/thumbnails/26.jpg)
Forward Propagation• On board
(C) Dhruv Batra 26