Introduction to Deep Learning - CVG @ ETHZ · Introduction to Deep Learning February 17, 2020 For...

Institute of Visual Computing

Introduction to Deep Learning

February 17, 2020

For slides credits we thank Vagia Tsiminaki.

Martin Oswald


Outline:

What is Deep Learning?

Artificial Neural Networks

Convolutional Neural Networks

Training Deep Learning Architectures

Applications on Computer Vision



Figure Credit: Adam Gibson, Josh Patterson “Deep Learning”



Machine Learning:

Input Data

Train model

Use trained model for new prediction


Towards Deep Learning

Hand-crafted Features

E.g. Canny edges, Harris corners, SIFT

Feature Learning

Automatically extract patterns (features)

Deep Learning

Learning hierarchical representations of data

End-to-end learning


Image Classification

Chihuahua or Muffin?

f(image) = “Muffin”

f(image) = “Chihuahua”



Machine Learning:

Input Data: Training set of images and labels

Train model (Image classifier)

Use trained model for new prediction



Training: Training Images and Image labels → Image features → Image classifier

Testing: Image features → Learned image classifier → “Chihuahua”

Slide Credit: CS 131, Lecture 1, 2016



Feature Engineering

Training: Training Images and Image labels → Image features → Image classifier

Testing: Image features → Learned image classifier → “Chihuahua”



Feature Learning

Training: Training Images and Image labels → Image features → Image classifier → Learned Model (Learned features, Learned classifier)

Testing: Learned Model (Learned features, Learned classifier) → “Chihuahua”



Deep Learning

Training: Training Images and Image labels → Low-level features → Mid-level features → High-level features → Image Classifier

Learned Model: Low-level features → Mid-level features → High-level features → Image Classifier


Artificial Neural Networks

Figure Credit: Artificial Intelligence Techniques for Modelling of Temperature in the Metal Cutting Process

Input Layer

Hidden Layer

Output Layer


Artificial Neuron

x1, x2, …, xN: Inputs to the neuron

w0, w1, w2, …, wN: Weights on each input (w0 is conventionally the bias weight, applied to a constant input of 1)

f: Activation function
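The neuron described above can be sketched in a few lines of NumPy. This is a minimal illustration, assuming that w0 acts as a bias added to the weighted sum of the inputs; the function name `neuron_output`, the step activation, and all numeric values are made up for the example.

```python
import numpy as np

def neuron_output(x, w, w0, f):
    """Weighted sum of the inputs plus the bias weight w0, passed through f."""
    return f(np.dot(w, x) + w0)

def step(z):
    """A simple threshold activation, used here only for illustration."""
    return 1.0 if z > 0 else 0.0

x = np.array([1.0, 2.0, 3.0])      # inputs x1..xN (made-up values)
w = np.array([0.5, -1.0, 0.25])    # weights w1..wN (made-up values)
w0 = 1.0                           # bias weight (made-up value)

weighted_sum = neuron_output(x, w, w0, f=lambda z: z)  # identity activation
y = neuron_output(x, w, w0, f=step)

print(weighted_sum)  # 0.25
print(y)             # 1.0
```

Swapping `f` for another activation function changes only the last step; the weighted sum stays the same.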


Activation Function

Sigmoid Function

Tanh Activation

ReLU (Rectified Linear Unit)
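The three activation functions named above have standard definitions, sketched here with NumPy. The sample input `z` is an arbitrary choice for illustration.

```python
import numpy as np

def sigmoid(z):
    """Squashes any real value into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):
    """Squashes any real value into the range (-1, 1)."""
    return np.tanh(z)

def relu(z):
    """Keeps positive values and replaces negative values with 0."""
    return np.maximum(0.0, z)

z = np.array([-2.0, 0.0, 2.0])  # arbitrary sample inputs
print(sigmoid(z))
print(tanh(z))
print(relu(z))
```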



Convolution Layer

Rectified Linear Unit (ReLU)

Pooling Layer

Fully Connected Layer



Convolution Layer

Convolution layer to extract features from the input image

Image, convolved with a Filter (or Kernel), gives the Feature Map (also called Activation Map or Convolved Feature)

Image Source: http://deeplearning.stanford.edu/wiki/index.php/Feature_extraction_using_convolution
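As a rough sketch of how a filter produces a feature map, the code below slides a 3×3 kernel over a 5×5 binary image with stride 1 and no padding. The helper name `conv2d_valid` and the pixel values are illustrative assumptions. As is common in CNN implementations, the kernel is not flipped, so this is strictly a cross-correlation.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image (stride 1, no padding) and
    sum the element-wise products at each position."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Illustrative 5x5 binary image and 3x3 filter (made-up values).
image = np.array([[1, 1, 1, 0, 0],
                  [0, 1, 1, 1, 0],
                  [0, 0, 1, 1, 1],
                  [0, 0, 1, 1, 0],
                  [0, 1, 1, 0, 0]])
kernel = np.array([[1, 0, 1],
                   [0, 1, 0],
                   [1, 0, 1]])

feature_map = conv2d_valid(image, kernel)
print(feature_map)
# [[4. 3. 4.]
#  [2. 4. 3.]
#  [2. 3. 4.]]
```

In a trained network the kernel values are not chosen by hand; they are the weights that training adjusts.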



Image Source: https://ujjwalkarn.me/2016/08/11/intuitive-explanation-convnets/

Image Source: https://cs.nyu.edu/~fergus/tutorials/deep_learning_cvpr12/



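Image → Feature Map

Size of Feature Map is controlled by:

Depth: Number of filters

Stride: Number of pixels the filter shifts at each step

Zero-padding: Zeros added around the border of the input image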
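The effect of stride and zero-padding on the feature map size can be made concrete with the commonly used relation (W − F + 2P) / S + 1, where W is the input width, F the filter width, P the padding and S the stride. These symbol names and the helper `feature_map_size` are assumptions for this sketch and do not come from the slides. The depth of the output equals the number of filters.

```python
def feature_map_size(input_size, filter_size, stride=1, padding=0):
    """Spatial size of the feature map along one dimension."""
    span = input_size - filter_size + 2 * padding
    if span < 0 or span % stride != 0:
        # This sketch rejects settings where the filter does not tile the
        # input evenly; real frameworks typically round down instead.
        raise ValueError("filter, stride and padding do not fit the input evenly")
    return span // stride + 1

print(feature_map_size(5, 3))               # 3
print(feature_map_size(5, 3, stride=2))     # 2
print(feature_map_size(5, 3, padding=1))    # 5 (padding preserves the input size)

num_filters = 8  # made-up depth
side = feature_map_size(5, 3)
output_shape = (side, side, num_filters)
print(output_shape)                         # (3, 3, 8)
```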





Rectified Linear Unit (ReLU) to introduce non-linearity

Most real-world data is non-linear

Convolution is a linear operation

Image Source: http://mlss.tuebingen.mpg.de/2015/slides/fergus/Fergus_1.pdf

Output = max(0, Input)
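The reason a non-linearity is needed can be checked numerically: two linear operations applied in sequence collapse into a single linear operation, while inserting max(0, Input) between them breaks that equivalence. The sketch below uses small matrices as a stand-in for convolutions (an assumption made for brevity, since a convolution is also a linear map); all values are made up.

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

W1 = np.array([[1.0, -1.0],
               [-1.0, 1.0]])   # first linear operation (made-up values)
W2 = np.array([[1.0, 1.0]])    # second linear operation (made-up values)
x = np.array([1.0, 2.0])       # input (made-up values)

two_linear = W2 @ (W1 @ x)     # two linear steps in sequence
one_linear = (W2 @ W1) @ x     # the same result from one combined matrix
with_relu = W2 @ relu(W1 @ x)  # ReLU in between changes the result

print(two_linear, one_linear)  # [0.] [0.]
print(with_relu)               # [1.]
```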



Pooling Layer

Max

Average

Sum

Image Source: http://mlss.tuebingen.mpg.de/2015/slides/fergus/Fergus_1.pdf



Image Source: http://cs231n.github.io/convolutional-networks/

Pooling Layer to:

Reduce the dimension of the input

Reduce the number of parameters and computations (controls overfitting)

Make the network invariant to small transformations, distortions, and translations

Get an almost scale-invariant representation of the input
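A minimal sketch of the pooling variants (max, average, sum) is given below, assuming non-overlapping 2×2 windows with stride 2. The function name `pool2d`, its parameters, and the feature map values are illustrative choices.

```python
import numpy as np

def pool2d(feature_map, size=2, stride=2, mode="max"):
    """Reduce each size x size window of the feature map to one value."""
    reducers = {"max": np.max, "average": np.mean, "sum": np.sum}
    reduce_fn = reducers[mode]
    out_h = (feature_map.shape[0] - size) // stride + 1
    out_w = (feature_map.shape[1] - size) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            r, c = i * stride, j * stride
            out[i, j] = reduce_fn(feature_map[r:r + size, c:c + size])
    return out

# Made-up 4x4 feature map.
fm = np.array([[1, 1, 2, 4],
               [5, 6, 7, 8],
               [3, 2, 1, 0],
               [1, 2, 3, 4]], dtype=float)

print(pool2d(fm, mode="max"))      # [[6. 8.] [3. 4.]]
print(pool2d(fm, mode="average"))  # [[3.25 5.25] [2. 2.]]
print(pool2d(fm, mode="sum"))      # [[13. 21.] [8. 8.]]
```

The 4×4 input becomes a 2×2 output in every mode, which is the dimension reduction listed above.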




Fully Connected Layer

Each node is connected to every node in the adjacent layer

Input: High-level features of the input image from the convolutional and pooling layers

Goal:

Classification: Multi-Layer Perceptron with a softmax activation function in the output layer

Regression

Segmentation
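The softmax activation used in the output layer for classification turns raw scores into probabilities that sum to 1. Below is a small sketch; the scores are hypothetical values for the two classes, and subtracting the maximum is a common numerical-stability trick rather than something stated in the slides.

```python
import numpy as np

def softmax(scores):
    """Convert raw output scores into probabilities that sum to 1."""
    shifted = scores - np.max(scores)  # avoids overflow in exp
    exp_scores = np.exp(shifted)
    return exp_scores / np.sum(exp_scores)

scores = np.array([2.0, 0.5])  # hypothetical raw outputs for [Chihuahua, Muffin]
probs = softmax(scores)
print(probs)         # the first class gets the larger probability
print(probs.sum())
```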






Classes = {“Chihuahua”, “Muffin”}

Input = “Chihuahua”

Target Vector = [1, 0]

Chihuahua(0)

Muffin(1)
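The target vector above is a one-hot encoding of the label: a 1 at the index of the correct class and 0 elsewhere. A minimal sketch follows, with the helper name `target_vector` chosen for illustration.

```python
classes = ["Chihuahua", "Muffin"]  # index 0 and index 1

def target_vector(label, classes):
    """Return a list with 1 at the position of `label` and 0 elsewhere."""
    return [1 if c == label else 0 for c in classes]

print(target_vector("Chihuahua", classes))  # [1, 0]
print(target_vector("Muffin", classes))     # [0, 1]
```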



Initialize weights of filters

Forward propagation pass: Convolution, ReLU, Pooling, Fully connected layer

Calculate the total error: $E = \sum \frac{1}{2}(\text{target} - \text{output})^2$

Backward propagation pass

Iterate Forward – Backward propagation with all training data
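The training steps above can be sketched end to end on a toy problem. To stay short, this example replaces the full convolutional network with a single fully connected layer with sigmoid outputs, which is an assumption made purely for illustration. The data, the learning rate, and the number of iterations are made up as well. The error is the sum of squared differences from the slides, and the backward pass applies the chain rule to it.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy data: 4 samples with 3 made-up features each, one-hot targets for 2 classes.
X = np.array([[1.0, 0.0, 0.0],
              [0.9, 0.1, 0.0],
              [0.0, 1.0, 1.0],
              [0.1, 0.9, 1.0]])
T = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)

# Step 1: initialize the weights with small random values.
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(3, 2))
b = np.zeros(2)
learning_rate = 0.5  # assumed value

errors = []
for iteration in range(200):            # Step 5: iterate over the training data
    Y = sigmoid(X @ W + b)              # Step 2: forward propagation pass
    E = 0.5 * np.sum((T - Y) ** 2)      # Step 3: total error
    errors.append(E)
    # Step 4: backward propagation pass.
    # dE/dY = (Y - T) and dY/dZ = Y * (1 - Y) for the sigmoid.
    delta = (Y - T) * Y * (1.0 - Y)
    W -= learning_rate * (X.T @ delta)
    b -= learning_rate * delta.sum(axis=0)

print(errors[0], errors[-1])            # the error decreases during training
print(Y.argmax(axis=1))                 # predicted class index per sample
```

In a real CNN the same loop applies, but the backward pass also propagates the error through the pooling, ReLU, and convolution layers to update the filter weights.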


Applications on Computer Vision

Super-Resolution

Figure Credit: Ledig et al. “Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network”

Image Segmentation & Classification

Figure Credit: Dai, He and Sun “Instance-aware Semantic Segmentation via Multi-task Network Cascades”

Style Transfer

Figure Credit: Zhu et al. “Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks”

Semantic 3D Reconstruction

Figure Credit: Dai et al. “3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation”, ECCV 2018


Questions?
