Transcript of: Deep Learning & Neural Networks
Deep Learning & Neural Networks
Machine Learning – CSE 546, Sham Kakade, University of Washington, November 29, 2016
©Sham Kakade
Announcements:
- HW4 posted
- Poster session Thurs, Dec 8
- Today:
  - Review: EM
  - Neural nets and deep learning
Poster Session
- Thursday Dec 8, 9–11:30am
  - Please arrive 20 mins early to set up
- Everyone is expected to attend
- Prepare a poster
  - We provide poster board and pins
  - Both one large poster (recommended) and several pinned pages are OK
- Capture:
  - Problem you are solving
  - Data you used
  - ML methodology
  - Results
- Prepare a 1-minute speech about your project
- Two instructors will visit your poster separately
- Project grading: scope, depth, data
Logistic regression
- P(Y|X) represented by a sigmoid of a linear function: P(Y = 1 | x, w) = 1 / (1 + exp(-(w0 + Σ_i wi xi)))
- Learning rule – MLE: maximize the conditional log-likelihood Σ_j ln P(yj | xj, w), e.g., by gradient ascent
[Plot: the sigmoid curve, x from -6 to 6, y from 0 to 1]
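The MLE learning rule above amounts to gradient ascent on the conditional log-likelihood. A minimal sketch (the function names, step size, and iteration count are my own choices, not from the slides):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.5, n_iters=2000):
    """MLE by gradient ascent: the gradient of the conditional
    log-likelihood with respect to w is X^T (y - P(Y=1|X))."""
    w = np.zeros(X.shape[1])
    for _ in range(n_iters):
        p = sigmoid(X @ w)                # P(Y=1 | x_j, w) for each example
        w += lr * X.T @ (y - p) / len(y)  # averaged gradient step
    return w
```

With a constant first column in X playing the role of w0, `sigmoid(X @ w) > 0.5` gives the predicted class.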
Perceptron as a graph

[Plot: sigmoid activation, x from -6 to 6, y from 0 to 1]
Linear perceptron classification region
[Plot: linear perceptron classification region]
Perceptron, linear classification, Boolean functions
- Can learn x1 AND x2
- Can learn x1 OR x2
- Can learn any conjunction or disjunction
Perceptron, linear classification, Boolean functions
- Can learn majority
- Can perceptrons do everything?
Going beyond linear classification
- Solving the XOR problem
Hidden layer
- Perceptron: out(x) = g(w0 + Σ_i wi xi)
- 1-hidden layer: out(x) = g(w0 + Σ_k wk · g(w0(k) + Σ_i wi(k) xi))
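To make the XOR point concrete, here is a hand-wired 1-hidden-layer network with threshold units (the particular weights are my own illustrative choice, not from the lecture): a single perceptron cannot represent XOR, but OR and AND hidden units composed by one more unit can.

```python
import numpy as np

def step(z):
    return (z > 0).astype(float)   # hard-threshold perceptron unit

def xor_net(x1, x2):
    """XOR via one hidden layer: h1 = OR(x1, x2), h2 = AND(x1, x2),
    output = h1 AND NOT h2."""
    x = np.array([x1, x2], dtype=float)
    # hidden layer: both units sum the inputs; biases pick OR vs AND
    W1 = np.array([[1.0, 1.0],     # OR unit: fires if x1 + x2 > 0.5
                   [1.0, 1.0]])    # AND unit: fires if x1 + x2 > 1.5
    b1 = np.array([-0.5, -1.5])
    h = step(W1 @ x + b1)
    # output unit: fires if OR is on and AND is off
    return step(1.0 * h[0] - 1.0 * h[1] - 0.5)
```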
Example data for NN with hidden layer
Learned weights for hidden layer
NN for images
Weights in NN for images
Forward propagation for 1-hidden layer – Prediction
- 1-hidden layer: out(x) = g(w0 + Σ_k wk · g(w0(k) + Σ_i wi(k) xi))
Gradient descent for 1-hidden layer – Back-propagation: Computing the gradient
(Dropped w0 to make the derivation simpler)
Gradient descent for 1-hidden layer – Back-propagation: Computing the gradient (cont.)
(Dropped w0 to make the derivation simpler)
Multilayer neural networks
Forward propagation – prediction
- Recursive algorithm
- Start from input layer
- Output of node Vk with parents U1, U2, …: Vk = g(Σ_i wik · Ui)
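The recursive rule above (each node applies its link function g to the weighted sum of its parents' outputs) can be sketched layer by layer; the sigmoid link and the list-of-(W, b) representation are my own simplifying choices:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, layers):
    """Forward propagation: start from the input layer and, for each
    layer in turn, apply the link function to the weighted sums of the
    previous layer's outputs. `layers` is a list of (W, b) pairs."""
    a = np.asarray(x, dtype=float)
    for W, b in layers:
        a = sigmoid(W @ a + b)   # node k: g(sum_i w_ik * u_i + b_k)
    return a
```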
Back-propagation – learning
- Just stochastic gradient descent!!!
- Recursive algorithm for computing the gradient
- For each example:
  - Perform forward propagation
  - Start from the output layer
  - Compute the gradient of node Vk with parents U1, U2, …
  - Update weight wik
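A minimal sketch of one such update for a 1-hidden-layer network, assuming sigmoid units and squared loss (the lecture's loss function and notation may differ):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sgd_step(x, y, W1, b1, W2, b2, lr=0.1):
    """One stochastic-gradient step: forward propagate, then push the
    error back from the output layer and update every weight w_ik."""
    # forward propagation
    h = sigmoid(W1 @ x + b1)
    yhat = sigmoid(W2 @ h + b2)
    # backward pass: start from the output layer
    delta2 = (yhat - y) * yhat * (1 - yhat)   # dLoss/d(output pre-activation)
    delta1 = (W2.T @ delta2) * h * (1 - h)    # propagate to the hidden layer
    # update each weight with its gradient
    W2 -= lr * np.outer(delta2, h); b2 -= lr * delta2
    W1 -= lr * np.outer(delta1, x); b1 -= lr * delta1
    return W1, b1, W2, b2
```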
Many possible response/link functions
- Sigmoid
- Linear
- Exponential
- Gaussian
- Hinge
- Max
- …
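The listed response/link functions are easy to write down directly (reading "hinge" as the ReLU-style max(0, z) and "max" as a max over a node's inputs is my interpretation):

```python
import numpy as np

# Common response/link functions g(z) applied to a node's weighted input
link_functions = {
    "sigmoid":     lambda z: 1.0 / (1.0 + np.exp(-z)),
    "linear":      lambda z: z,
    "exponential": lambda z: np.exp(z),
    "gaussian":    lambda z: np.exp(-z ** 2),
    "hinge":       lambda z: np.maximum(0.0, z),   # ReLU-style hinge
    "max":         lambda z: np.max(z),            # max over the node's inputs
}
```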
Convolutional Neural Networks & Application to Computer Vision
Machine Learning – CSE 546, Sham Kakade, University of Washington, November 29, 2016
Contains slides from…
- LeCun & Ranzato
- Russ Salakhutdinov
- Honglak Lee
Neural Networks in Computer Vision
- Neural nets have made an amazing comeback
  - Used to engineer high-level features of images
- Image features:
Some hand-created image features
Computer vision features: SIFT, Spin image, HoG, RIFT, Textons, GLOH
(Slide credit: Honglak Lee)
Scanning an image with a detector
- Detector = classifier applied to image patches:
- Typically scan the image with the detector:
Using neural nets to learn non-linear features
(Slide credit: Y. LeCun & M.A. Ranzato)
Deep Learning = Learning Hierarchical Representations
It's deep if it has more than one stage of non-linear feature transformation
[Pipeline: Low-Level Feature → Mid-Level Feature → High-Level Feature → Trainable Classifier]
Feature visualization of convolutional net trained on ImageNet from [Zeiler & Fergus 2013]
But, many tricks needed to work well…
Convolutional Deep Models for Image Recognition
- Learning multiple layers of representation. (LeCun, 1992)
Convolution Layer
Fully-connected neural net in high dimension
- Example: 200x200 image
- Fully-connected, 400,000 hidden units ⇒ 16 billion parameters
- Locally-connected, 400,000 hidden units with 10x10 fields ⇒ 40 million parameters
- Local connections capture local dependencies
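The parameter counts on this slide follow from simple arithmetic:

```python
inputs = 200 * 200                    # pixels in a 200x200 image
hidden = 400_000                      # hidden units

fully_connected = inputs * hidden     # one weight per (pixel, unit) pair
locally_connected = hidden * 10 * 10  # each unit sees only a 10x10 field

print(fully_connected)    # 16000000000 (16 billion)
print(locally_connected)  # 40000000   (40 million)
```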
Parameter sharing
- Fundamental technique used throughout ML
- Neural net without parameter sharing:
- Sharing parameters:
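Sharing the same weights across positions is exactly what a convolution layer does. A minimal 1-D sketch (the function name is mine):

```python
import numpy as np

def shared_weight_layer(x, w):
    """Apply the same weight vector w at every position of x (a 1-D
    convolution over the 'valid' region). Without sharing, each output
    position would need its own copy of w."""
    k = len(w)
    return np.array([x[i:i + k] @ w for i in range(len(x) - k + 1)])
```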
Pooling/Subsampling
- Convolutions act like detectors:
- But we don't expect true detections in every patch
- Pooling/subsampling nodes:
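A pooling node can be sketched as keeping only the strongest detector response in each window (1-D max-pooling here, a simplification of the 2-D pooling used in the nets on these slides):

```python
import numpy as np

def max_pool_1d(responses, pool=2):
    """Max-pooling: keep the strongest detector response in each window,
    so the exact position of a detection within a patch no longer matters."""
    n = len(responses) // pool * pool          # drop any ragged tail
    return responses[:n].reshape(-1, pool).max(axis=1)
```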
Example neural net architecture
Sample results
Simple ConvNet Applications with State-of-the-Art Performance
- Traffic Sign Recognition (GTSRB – German Traffic Sign Recognition Benchmark): 99.2% accuracy (#1: IDSIA; #2: NYU)
- House Number Recognition (Google Street View House Numbers): 94.3% accuracy
Example from Krizhevsky, Sutskever, Hinton 2012
Object Recognition [Krizhevsky, Sutskever, Hinton 2012]
Layer stack (output at top), with per-layer weights and multiply-accumulate ops:

  FULL CONNECT             4M      4Mflop
  FULL 4096/ReLU           16M     16M
  FULL 4096/ReLU           37M     37M
  MAX POOLING              –       –
  CONV 3x3/ReLU 256fm      442K    74M
  CONV 3x3/ReLU 384fm      1.3M    224M
  CONV 3x3/ReLU 384fm      884K    149M
  MAX POOLING 2x2sub       –       –
  LOCAL CONTRAST NORM      –       –
  CONV 11x11/ReLU 256fm    307K    223M
  MAX POOL 2x2sub          –       –
  LOCAL CONTRAST NORM      –       –
  CONV 11x11/ReLU 96fm     35K     105M

Won the 2012 ImageNet LSVRC. 60 million parameters, 832M MAC ops.
Results by Krizhevsky, Sutskever, Hinton 2012
Object Recognition: ILSVRC 2012 results
ImageNet Large Scale Visual Recognition Challenge: 1000 categories, 1.5 million labeled training samples
Object Recognition [Krizhevsky, Sutskever, Hinton 2012]
Object Recognition [Krizhevsky, Sutskever, Hinton 2012]
[Figure: test image alongside retrieved images]
Application to scene parsing
Semantic Labeling: labeling every pixel with the object it belongs to
[Farabet et al. ICML 2012, PAMI 2013]
- Would help identify obstacles, targets, landing sites, dangerous areas
- Would help line up depth maps with edge maps
Learning challenges for neural nets
- Choosing architecture
- Slow per iteration and convergence
- Gradient “diffusion” across layers
- Many local optima
Random dropouts
- Standard backprop:
- Random dropouts: randomly choose edges not to update:
- Functions as a type of “regularization”… helps avoid “diffusion” of gradient
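A sketch of the dropout idea. Here it is applied to activations with the common "inverted dropout" scaling; the slide describes dropping edge updates, so this is an illustrative variant, not the lecture's exact scheme:

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout_mask(shape, keep_prob=0.5):
    """Randomly keep each unit with probability keep_prob, zero it otherwise."""
    return (rng.random(shape) < keep_prob).astype(float)

def apply_dropout(activations, keep_prob=0.5):
    # Scale by 1/keep_prob so the expected activation is unchanged
    # (the "inverted dropout" convention).
    return activations * dropout_mask(activations.shape, keep_prob) / keep_prob
```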
Revival of neural networks
- Neural networks fell into disfavor from the mid-90s to the early 2000s
  - Many methods have now been rediscovered :)
- Exciting new results using modifications to optimization techniques and GPUs
- Challenges still remain:
  - Architecture selection feels like a black art
  - Optimization can be very sensitive to parameters
  - Requires a significant amount of expertise to get good results