Deep Learning and its Applications - Computer Vision
![Page 1: Deep Learning and its Applications - Computer Vision](https://reader036.fdocuments.us/reader036/viewer/2022081507/554a06c8b4c905557a8b559c/html5/thumbnails/1.jpg)
Deep Learning and Its Applications: Computer Vision
Adam Gibson
deeplearning4j.org // skymind.io // zipfian academy
![Page 2: Deep Learning and its Applications - Computer Vision](https://reader036.fdocuments.us/reader036/viewer/2022081507/554a06c8b4c905557a8b559c/html5/thumbnails/2.jpg)
• Object Recognition
• Image Categorization
• Scene Parsing
• Face Recognition
Computer Vision: A Primer
![Page 3: Deep Learning and its Applications - Computer Vision](https://reader036.fdocuments.us/reader036/viewer/2022081507/554a06c8b4c905557a8b559c/html5/thumbnails/3.jpg)
• OpenCV
• SIFT
• Filters/Edge Detection
• Feature Extraction
What’s currently done?
![Page 4: Deep Learning and its Applications - Computer Vision](https://reader036.fdocuments.us/reader036/viewer/2022081507/554a06c8b4c905557a8b559c/html5/thumbnails/4.jpg)
• Representation Learning
• More precise than hand-engineered features
• Non-linearities and higher-order trends
• Pretraining and Hessian-free optimization
This is manual!
![Page 5: Deep Learning and its Applications - Computer Vision](https://reader036.fdocuments.us/reader036/viewer/2022081507/554a06c8b4c905557a8b559c/html5/thumbnails/5.jpg)
• Representation Learning
• Position Invariance with convolutions
• Semantic Hashing
Deep Learning and Images
![Page 6: Deep Learning and its Applications - Computer Vision](https://reader036.fdocuments.us/reader036/viewer/2022081507/554a06c8b4c905557a8b559c/html5/thumbnails/6.jpg)
• Normal pixels: intensities 0-255, handled by normalization
• Sparse pixels: binarization (depending on pixel presence)
Different kinds of images
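A minimal sketch of the two preprocessing paths named on this slide, in plain Python. The function names, the `[0, 1]` target range, and the presence threshold of 0 are assumptions made for illustration; the slides do not specify them.

```python
def normalize(pixels, max_value=255):
    """Scale 0-255 intensities into the range [0.0, 1.0]."""
    return [[p / max_value for p in row] for row in pixels]


def binarize(pixels, threshold=0):
    """Map each pixel to 1 if it is 'present' (above threshold), else 0."""
    return [[1 if p > threshold else 0 for p in row] for row in pixels]


# Hypothetical 2x3 grayscale image.
image = [
    [0, 51, 255],
    [102, 0, 204],
]

normalized = normalize(image)  # dense images: keep intensity information
binary = binarize(image)       # sparse images: keep only presence/absence
```

Normalization preserves relative brightness, while binarization discards it and keeps only which pixels are set, which suits sparse inputs such as pen strokes.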
![Page 7: Deep Learning and its Applications - Computer Vision](https://reader036.fdocuments.us/reader036/viewer/2022081507/554a06c8b4c905557a8b559c/html5/thumbnails/7.jpg)
• Faces = a collection of images.
• With persistent patterns of pixels.
• Pixel patterns = features.
• Nets learn to identify features in data, to classify faces as faces and label them: John or Sarah.
• Nets train by reconstructing faces from features many times.
• Measuring their work against a benchmark.
Facial recognition
![Page 8: Deep Learning and its Applications - Computer Vision](https://reader036.fdocuments.us/reader036/viewer/2022081507/554a06c8b4c905557a8b559c/html5/thumbnails/8.jpg)
DL4J’s Facial Reconstructions
![Page 9: Deep Learning and its Applications - Computer Vision](https://reader036.fdocuments.us/reader036/viewer/2022081507/554a06c8b4c905557a8b559c/html5/thumbnails/9.jpg)
• Slices of a feature space (Max pooling)
• Learns different portions for easily scalable and robust feature engineering.
Position Invariance - Convolutions
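To make the convolution and max-pooling idea concrete, here is a small pure-Python sketch. The 2x2 kernel, the 5x5 images, and the helper names are hypothetical, and the "convolution" is the unflipped sliding-window sum commonly used in neural nets. It only illustrates that a one-pixel shift of a pattern can leave the strongest pooled response unchanged; it is not how DL4J implements the operation.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and sum products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool(feature_map, size):
    """Keep the maximum of each non-overlapping size x size window."""
    return [
        [
            max(feature_map[i + di][j + dj]
                for di in range(size) for dj in range(size))
            for j in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for i in range(0, len(feature_map) - size + 1, size)
    ]


def blank(n):
    return [[0] * n for _ in range(n)]


# The same 2x2 bright block, once in the corner and once shifted by one pixel.
image_a = blank(5)
image_b = blank(5)
for r in range(2):
    for c in range(2):
        image_a[r][c] = 1
        image_b[r + 1][c + 1] = 1

kernel = [[1, 1], [1, 1]]  # responds most strongly to a full 2x2 block

pooled_a = max_pool(convolve2d(image_a, kernel), 2)
pooled_b = max_pool(convolve2d(image_b, kernel), 2)
```

In both cases the peak activation lands in the same pooled cell with the same value, even though the raw feature maps differ. Weaker responses in neighboring cells do change, so the invariance is only approximate.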
![Page 10: Deep Learning and its Applications - Computer Vision](https://reader036.fdocuments.us/reader036/viewer/2022081507/554a06c8b4c905557a8b559c/html5/thumbnails/10.jpg)
Visual Example - Convolutions
![Page 11: Deep Learning and its Applications - Computer Vision](https://reader036.fdocuments.us/reader036/viewer/2022081507/554a06c8b4c905557a8b559c/html5/thumbnails/11.jpg)
Pen Strokes
![Page 12: Deep Learning and its Applications - Computer Vision](https://reader036.fdocuments.us/reader036/viewer/2022081507/554a06c8b4c905557a8b559c/html5/thumbnails/12.jpg)
• Facebook uses facial recognition to make itself stickier and know more about us.
• Government agencies use it to secure national borders.
• Video game makers use it to construct more realistic worlds.
• Stores use it to identify customers and track behavior.
What are faces for?
![Page 13: Deep Learning and its Applications - Computer Vision](https://reader036.fdocuments.us/reader036/viewer/2022081507/554a06c8b4c905557a8b559c/html5/thumbnails/13.jpg)
• 2 layers of neuron-like nodes.
• The 1st is the visible, or input, layer
• The 2nd is “hidden.” It identifies features in input
• Symmetrically connected.
• “Restricted” = no visible-visible or hidden-hidden ties
• All connections happen between layers.
Restricted Boltzmann Machines (RBMs)
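The structure described above can be sketched in a few lines of Python. This is an untrained toy, not DL4J's RBM: the class name `TinyRBM`, the layer sizes, the sigmoid units, and the use of probabilities instead of sampled binary states are all assumptions made to keep the example short.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyRBM:
    """Two layers joined by one weight matrix; no weights inside a layer."""

    def __init__(self, n_visible, n_hidden, seed=0):
        rng = random.Random(seed)
        # weights[i][j] links visible unit i to hidden unit j.
        self.weights = [[rng.gauss(0, 0.1) for _ in range(n_hidden)]
                        for _ in range(n_visible)]
        self.visible_bias = [0.0] * n_visible
        self.hidden_bias = [0.0] * n_hidden

    def hidden_probs(self, visible):
        """Visible -> hidden: each hidden unit looks at every visible unit."""
        return [
            sigmoid(self.hidden_bias[j]
                    + sum(v * self.weights[i][j] for i, v in enumerate(visible)))
            for j in range(len(self.hidden_bias))
        ]

    def visible_probs(self, hidden):
        """Hidden -> visible: the same weights are reused (symmetric ties)."""
        return [
            sigmoid(self.visible_bias[i]
                    + sum(h * self.weights[i][j] for j, h in enumerate(hidden)))
            for i in range(len(self.visible_bias))
        ]

    def reconstruct(self, visible):
        """One up-down pass: input -> features -> reconstructed input."""
        return self.visible_probs(self.hidden_probs(visible))


rbm = TinyRBM(n_visible=4, n_hidden=2)
features = rbm.hidden_probs([1, 0, 1, 0])
reconstruction = rbm.reconstruct([1, 0, 1, 0])
```

Training would adjust the weights so that `reconstruction` moves closer to the input; that step is omitted here.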
![Page 14: Deep Learning and its Applications - Computer Vision](https://reader036.fdocuments.us/reader036/viewer/2022081507/554a06c8b4c905557a8b559c/html5/thumbnails/14.jpg)
• A stack of RBMs.
• Each RBM’s hidden layer becomes the next RBM’s visible/input layer.
• DBNs learn more & more complex features
• Example:
  • 1) Pixels = input;
  • 2) H1 learns an edge or line;
  • 3) H2 learns a corner or set of lines;
  • 4) H3 learns two groups of lines forming an object -- a face!
• Final layer classifies feature groups: sunset, elephant, flower, John, Sarah.
Deep-Belief Net (DBN)
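The stacking rule above, where one layer's hidden activations serve as the next layer's input, can be sketched as a simple forward pass. The layer sizes, random untrained weights, and helper names below are hypothetical; pretraining and the final classification layer are left out.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_layer(n_in, n_out, rng):
    """Random weights and zero biases for one RBM-like layer (untrained)."""
    weights = [[rng.gauss(0, 0.1) for _ in range(n_out)] for _ in range(n_in)]
    bias = [0.0] * n_out
    return weights, bias


def layer_up(inputs, layer):
    """Compute hidden activations from a layer's visible inputs."""
    weights, bias = layer
    return [
        sigmoid(bias[j] + sum(x * weights[i][j] for i, x in enumerate(inputs)))
        for j in range(len(bias))
    ]


def dbn_features(pixels, layers):
    """Feed each layer's hidden output into the next layer as its input."""
    activations = [pixels]
    for layer in layers:
        activations.append(layer_up(activations[-1], layer))
    return activations


rng = random.Random(0)
sizes = [16, 8, 4, 2]  # pixels -> H1 -> H2 -> H3 (sizes chosen for illustration)
layers = [make_layer(n_in, n_out, rng) for n_in, n_out in zip(sizes, sizes[1:])]

pixels = [rng.random() for _ in range(sizes[0])]
activations = dbn_features(pixels, layers)
```

`activations[1]`, `activations[2]`, and `activations[3]` play the roles of H1, H2, and H3 in the example; with trained weights they would correspond to progressively more complex features.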
![Page 15: Deep Learning and its Applications - Computer Vision](https://reader036.fdocuments.us/reader036/viewer/2022081507/554a06c8b4c905557a8b559c/html5/thumbnails/15.jpg)
• 2 DBNs.
• 1st DBN *encodes* data into vector of 10-30 numbers = Pre-training.
• 2nd DBN decodes data into original state.
• Backprop only happens on 2nd DBN
• 2nd is the fine-tuning stage (reconstruction entropy).
• Reduces documents or images to compact vectors.
• Useful in search, QA and information retrieval.
Deep Autoencoder
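One way compact vectors can support search is to binarize them and rank stored items by Hamming distance, in the spirit of semantic hashing. The sketch below assumes the codes have already been produced by an encoder; the 4-number codes, item names, and 0.5 threshold are invented for illustration (the slides mention 10-30 numbers per code).

```python
def binarize_code(code, threshold=0.5):
    """Turn a real-valued code into a tuple of bits."""
    return tuple(1 if v > threshold else 0 for v in code)


def hamming(a, b):
    """Count the positions where two bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def search(query_code, index, top_k=2):
    """Return the names of the top_k stored items closest to the query."""
    query_bits = binarize_code(query_code)
    ranked = sorted(
        index,
        key=lambda name: hamming(query_bits, binarize_code(index[name])),
    )
    return ranked[:top_k]


# Hypothetical codes, standing in for encoder output.
index = {
    "sunset_1": [0.9, 0.1, 0.8, 0.2],
    "sunset_2": [0.8, 0.2, 0.9, 0.4],
    "elephant_1": [0.1, 0.9, 0.2, 0.7],
    "flower_1": [0.9, 0.8, 0.1, 0.1],
}

query = [0.95, 0.05, 0.7, 0.1]
results = search(query, index)
```

Because comparison happens on short bit vectors rather than raw pixels, lookups stay cheap even for large collections.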
![Page 16: Deep Learning and its Applications - Computer Vision](https://reader036.fdocuments.us/reader036/viewer/2022081507/554a06c8b4c905557a8b559c/html5/thumbnails/16.jpg)
Deep Autoencoder Architecture
![Page 17: Deep Learning and its Applications - Computer Vision](https://reader036.fdocuments.us/reader036/viewer/2022081507/554a06c8b4c905557a8b559c/html5/thumbnails/17.jpg)
Image Search Results
![Page 18: Deep Learning and its Applications - Computer Vision](https://reader036.fdocuments.us/reader036/viewer/2022081507/554a06c8b4c905557a8b559c/html5/thumbnails/18.jpg)
• Top-down & hierarchical rather than feed-forward (DBNs).
• Handles sequence-based classification, windows of several events, entire scenes (multiple objects).
• Features themselves are vectors.
• A tensor = a multi-dimensional matrix, or multiple matrices of the same size.
Recursive Neural Tensor Net
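As a rough illustration of how a tensor combines two feature vectors into one, the sketch below follows the commonly published RNTN composition, p = tanh(cᵀ V c + W c) with c the concatenation of the two children. The slides do not spell out the formula, so treating it as the intended one is an assumption, as are the dimension `d = 3`, the random untrained parameters, and the omission of a bias term.

```python
import math
import random


def compose(left, right, tensor, weights):
    """Merge two child vectors of length d into one parent vector of length d."""
    child = left + right  # concatenation, length 2d
    parent = []
    for slice_k, row_k in zip(tensor, weights):
        # Bilinear term: one 2d x 2d tensor slice per output dimension.
        bilinear = sum(
            child[i] * slice_k[i][j] * child[j]
            for i in range(len(child))
            for j in range(len(child))
        )
        # Standard linear term, as in a plain recursive net.
        linear = sum(w * c for w, c in zip(row_k, child))
        parent.append(math.tanh(bilinear + linear))
    return parent


d = 3
rng = random.Random(0)

# d slices, each a 2d x 2d matrix: "multiple matrices of the same size".
tensor = [[[rng.gauss(0, 0.1) for _ in range(2 * d)] for _ in range(2 * d)]
          for _ in range(d)]
weights = [[rng.gauss(0, 0.1) for _ in range(2 * d)] for _ in range(d)]

# Three hypothetical feature vectors, e.g. for three regions of a scene.
a = [rng.uniform(-1, 1) for _ in range(d)]
b = [rng.uniform(-1, 1) for _ in range(d)]
c = [rng.uniform(-1, 1) for _ in range(d)]

# Build a small tree: (a, b) merge first, then the result merges with c.
parent_ab = compose(a, b, tensor, weights)
root = compose(parent_ab, c, tensor, weights)
```

Because every parent has the same length as its children, the same `compose` step can be applied recursively up a tree of any depth.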
![Page 19: Deep Learning and its Applications - Computer Vision](https://reader036.fdocuments.us/reader036/viewer/2022081507/554a06c8b4c905557a8b559c/html5/thumbnails/19.jpg)
RNTNs & Scene Composition