Deep Learning for Image Analysis · 5/13/2020


© 2020 KNIME AG. All Rights Reserved.

Welcome to Deep Learning for Image Analysis

Benjamin Wilhelm, David Kolb

Going live at:

Berlin 5:00 PM (CEST), New York City 11:00 AM (EDT), Austin 10:00 AM (CDT), London 4:00 PM (GMT)


Before we start…

• Please use the Q&A section to post your questions.

• Upvote your favorite questions.

• The session is recorded and will be available on YouTube.



Outline

• Motivation

• Fundamentals

• Image Classification

– Cats & Dogs Classification in KNIME

• Semantic Segmentation

– Natural Image Segmentation in KNIME

• Image Captioning

– Image Captioning in KNIME


Motivation


Why Deep Learning?

[Figure: Winners of the ImageNet Challenge. Chart of error in % (axis 0 to 30) per year (2010 to 2017), with a "Deep Learning" annotation.]


Why Deep Learning?

Sergios Karagiannakos: https://sergioskar.github.io/Semantic_Segmentation/

Bearman and Dong: http://www.catherinedong.com/pdfs/231n-paper.pdf

Isola et al.: http://openaccess.thecvf.com/content_cvpr_2017/papers/Isola_Image-To-Image_Translation_With_CVPR_2017_paper.pdf

Purnasai Gudikandula: https://medium.com/@purnasaigudikandula/artistic-neural-style-transfer-with-pytorch-1543e08cc38f

Silver et al.: https://doi.org/10.1038/nature24270


History of Deep Learning

• 1943: Neural Nets (McCulloch & Pitts)

• 1958: Perceptron (Rosenblatt)

• 1960: Adaline (Widrow & Hoff)

• 1969: XOR Problem (Minsky & Papert)

• 1974: Backpropagation (Werbos)

• 1980: Neocognitron (CNN) (Fukushima)

• 1986: Multi-layered Perceptron (Backpropagation) (Rumelhart, Hinton & Williams)

• 1990: LeNet (LeCun)

• 2012: AlexNet (Krizhevsky)


Interest in Deep Learning according to Google Trends


What has changed?

Image Source: https://www.nvidia.com/content/dam/en-zz/es_em/Solutions/Data-Center/tesla-v100/[email protected]

Image Source: https://medium.com/syncedreview/sensetime-trains-imagenet-alexnet-in-record-1-5-minutes-e944ab049b2c


Deep Learning Software

https://developer.nvidia.com/caffe2

https://de.wikipedia.org/wiki/Datei:Pytorch_logo.png

https://danilobzdok.de/links/theano-deeplearning-package/

https://raw.githubusercontent.com/dmlc/dmlc.github.io/master/img/logo-m/mxnet2.png

https://geekflare.com/wp-content/uploads/2018/05/MicrosoftCNTKlogo.png

https://chainer.org/images/chainer_icon_red.png

https://upload.wikimedia.org/wikipedia/commons/2/2d/Tensorflow_logo.svg

https://miro.medium.com/max/368/1*u2t2N3lu8sH1CSsSrP_UyQ.png

https://upload.wikimedia.org/wikipedia/commons/c/c0/ONNX_logo_main.png

https://upload.wikimedia.org/wikipedia/commons/c/c9/Keras_Logo.jpg


Deep Learning Software in KNIME



KNIME Keras Integration


Fundamentals


Recap: Machine Learning

• Learning programs from data

• Supervised Learning

– Input: Data points with labels

– Output: Model that maps from data points to labels

– Examples: Classification, regression

• Unsupervised Learning

– Input: Data points without labels

– Output: Model that captures structure of data

– Examples: Clustering, dimensionality reduction


Examples of Supervised Learning

Images → Class labels

Credit history → Credit score

Customer data → Churn probability

Low resolution image → High resolution image

Cell image → Segmentation


The Multilayer Perceptron

[Figure: network diagram with Input, Hidden, and Output layers; a single unit is labeled "Neuron".]


A Single Neuron

Inputs: $x_1, x_2, x_3$

Weights: $w_1, w_2, w_3$

Output: $\sigma\left(b + \sum_i w_i x_i\right)$
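
A single neuron computes a weighted sum of its inputs plus a bias and passes the result through an activation function σ. The following is a minimal sketch in plain Python; the function names, the choice of the logistic sigmoid as σ, and the numeric values are illustrative assumptions, not taken from the slides.

```python
import math


def sigmoid(z):
    """Logistic activation: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(x, w, b, activation=sigmoid):
    """Compute activation(b + sum_i w_i * x_i) for one neuron."""
    z = b + sum(w_i * x_i for w_i, x_i in zip(w, x))
    return activation(z)


# Illustrative values only (assumed, not from the slides).
x = [1.0, 2.0, 3.0]
w = [0.5, -0.25, 0.0]
b = 0.0
print(neuron(x, w, b))  # weighted sum is 0.5 - 0.5 + 0.0 = 0, so the sigmoid gives 0.5
```

Passing a different `activation` (for example a ReLU written as `lambda z: max(0.0, z)`) changes the non-linearity without touching the weighted sum.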


Activation Functions


Forward Propagation

$x \rightarrow h(x) \rightarrow o(h(x))$


Modelling Probabilities

• Classification tasks require the network to output probabilities

• Properties of a probability distribution

– All values are non-negative

– All values sum up to 1

• Binary classification: Sigmoid

• Multi-class classification: Softmax
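
Both output functions can be sketched in a few lines of plain Python. The sigmoid yields one probability p for the positive class (the other class implicitly gets 1 − p), while softmax turns a list of raw scores into values that are non-negative and sum to 1. The scores used here are made-up illustrative numbers.

```python
import math


def sigmoid(z):
    """Binary case: probability of the positive class."""
    return 1.0 / (1.0 + math.exp(-z))


def softmax(scores):
    """Multi-class case: one probability per class."""
    m = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Illustrative raw network outputs (assumed values).
p_dog = sigmoid(1.8)
binary = [1.0 - p_dog, p_dog]     # [P(cat), P(dog)]
multi = softmax([2.0, 1.0, 0.1])  # three-class example

print(binary, sum(binary))
print(multi, sum(multi))
```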


Forward Propagation

$x \rightarrow h(x) \rightarrow o(h(x))$ Correct?


Loss Functions

• Evaluate how far model outputs are from the true label

• Task dependent

– Binary classification: Binary cross entropy

– Multi-class classification: Categorical cross entropy

– Regression: Mean squared/absolute error

• Must be differentiable
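
The losses listed above can be written down directly. This is a minimal sketch for single examples in plain Python; the function names and the small `eps` clamp (which guards against log(0)) are assumptions made for the illustration.

```python
import math


def binary_cross_entropy(y_true, p, eps=1e-12):
    """y_true is 0 or 1, p is the predicted probability of class 1."""
    p = min(max(p, eps), 1.0 - eps)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))


def categorical_cross_entropy(one_hot, probs, eps=1e-12):
    """one_hot is the true label vector, probs the predicted distribution."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(one_hot, probs))


def mean_squared_error(y_true, y_pred):
    return sum((t - y) ** 2 for t, y in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - y) for t, y in zip(y_true, y_pred)) / len(y_true)


# True class "Dog" encoded as [0, 1]; the model assigns 86% to "Dog".
print(categorical_cross_entropy([0, 1], [0.14, 0.86]))
print(mean_squared_error([1.0, 2.0], [1.0, 4.0]))
```

The cross-entropy is small when the model puts high probability on the true class and grows without bound as that probability approaches zero.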


Gradient Descent

Gradient


Backpropagation

• All parts of a deep learning model are differentiable

• Backpropagation uses the chain rule to calculate the gradient of the loss with respect to all weights

• Modern deep learning software performs this automagically


Forward Propagation

$x \rightarrow h(x) \rightarrow o(h(x)) \rightarrow \mathit{loss} = l(o(h(x)))$


Backpropagation

$\mathit{loss}'(x) = l'(o(h(x)))\, o'(h(x))\, h'(x)$

Information Flow
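
For the composition loss = l(o(h(x))) from the forward propagation slide, the chain rule multiplies the derivatives of the individual stages, evaluated from the loss backwards. The sketch below checks this against a finite-difference estimate. The concrete choices of h (a linear map), o (a sigmoid), and l (a squared error against a fixed target) are assumptions made only to have something differentiable to test.

```python
import math

TARGET = 1.0  # assumed label for the illustration


def h(x):
    return 3.0 * x + 1.0


def h_prime(x):
    return 3.0


def o(z):
    return 1.0 / (1.0 + math.exp(-z))


def o_prime(z):
    s = o(z)
    return s * (1.0 - s)


def l(y):
    return (y - TARGET) ** 2


def l_prime(y):
    return 2.0 * (y - TARGET)


def loss(x):
    """Forward pass: x -> h(x) -> o(h(x)) -> l(o(h(x)))."""
    return l(o(h(x)))


def loss_gradient(x):
    """Backward pass: product of the local derivatives (chain rule)."""
    return l_prime(o(h(x))) * o_prime(h(x)) * h_prime(x)


def numerical_gradient(f, x, eps=1e-6):
    """Central finite difference, used only as a sanity check."""
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)


print(loss_gradient(0.2), numerical_gradient(loss, 0.2))
```

Deep learning frameworks automate exactly this bookkeeping for every weight in the network.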


Stochastic Gradient Descent

• Calculating the gradient on the full dataset is time-consuming

• Stochastic Gradient Descent: Evaluate the gradient on a single data point

• Mini-batch Gradient Descent: Evaluate the gradient on a small set of data points
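
The three variants differ only in how many data points contribute to each gradient estimate. A minimal sketch, assuming a toy one-parameter model y = w · x trained with mean squared error on noise-free data generated with w = 2 (the data, step count, and learning rate are illustrative choices):

```python
import random


def gradient(w, batch):
    """Gradient of the mean squared error of y = w * x on the given batch."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def train(data, batch_size, steps=200, learning_rate=0.05, seed=0):
    rng = random.Random(seed)
    w = 0.0
    for _ in range(steps):
        batch = rng.sample(data, batch_size)  # random subset of the dataset
        w -= learning_rate * gradient(w, batch)
    return w


# Toy dataset: y = 2 * x without noise.
data = [(x, 2.0 * x) for x in [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5]]

w_sgd = train(data, batch_size=1)           # stochastic gradient descent
w_mini = train(data, batch_size=3)          # mini-batch gradient descent
w_full = train(data, batch_size=len(data))  # full-batch gradient descent
print(w_sgd, w_mini, w_full)
```

Smaller batches make each step cheaper but noisier; on this noise-free toy problem all three settings approach w = 2.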


Momentum

• Averages past gradients

• Equivalent of a ball rolling down a slope (acceleration)

• Can help to

– Reduce fluctuation

– Speed up progress in directions with small but consistent gradients

– Escape local minima
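
One common way to write momentum keeps a velocity that accumulates past gradients with exponential decay. The sketch below uses that formulation on a deliberately shallow one-dimensional slope; the function names, the objective 0.005 · w², and the hyperparameters are assumptions chosen to show the speed-up on small but consistent gradients.

```python
def minimize(gradient_fn, w0, learning_rate, momentum=0.0, steps=100):
    """Gradient descent with an optional momentum term."""
    w, velocity = w0, 0.0
    for _ in range(steps):
        # The velocity is a decaying accumulation of past gradients.
        velocity = momentum * velocity + gradient_fn(w)
        w -= learning_rate * velocity
    return w


def shallow_gradient(w):
    """Gradient of the shallow quadratic 0.005 * w**2 (minimum at w = 0)."""
    return 0.01 * w


w_plain = minimize(shallow_gradient, 10.0, learning_rate=1.0, momentum=0.0)
w_momentum = minimize(shallow_gradient, 10.0, learning_rate=1.0, momentum=0.9)
print(w_plain, w_momentum)
```

With the same learning rate and number of steps, the run with momentum ends much closer to the minimum than plain gradient descent.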


Adaptive Learning Rate

• The learning rate controls how large the steps taken by gradient descent are

• Not all parameters may require the same learning rate

• Solution: Adapt the learning rate based on the variance of the gradient
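
As one concrete instance of this idea, the sketch below follows the RMSProp-style rule: each parameter keeps a running average of its squared gradients and divides its step by the square root of that average. The function name and default hyperparameters are assumptions for the illustration, not settings from the slides.

```python
import math


def rmsprop_step(weights, grads, avg_sq, learning_rate=0.01, decay=0.9, eps=1e-8):
    """One update with a separate effective learning rate per parameter."""
    new_weights, new_avg_sq = [], []
    for w, g, s in zip(weights, grads, avg_sq):
        s = decay * s + (1.0 - decay) * g * g  # running average of squared gradients
        new_avg_sq.append(s)
        new_weights.append(w - learning_rate * g / (math.sqrt(s) + eps))
    return new_weights, new_avg_sq


# Two parameters whose gradients differ by four orders of magnitude.
weights, avg_sq = [1.0, 1.0], [0.0, 0.0]
new_weights, avg_sq = rmsprop_step(weights, [100.0, 0.01], avg_sq)
steps = [w - nw for w, nw in zip(weights, new_weights)]
print(steps)
```

Despite the very different gradient magnitudes, both parameters move by nearly the same amount, which is the point of adapting the step size per parameter.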


Different Gradient Descent Optimizers

Optimizer        Momentum   Adaptive Learning Rates
SGD              ✗          ✗
SGD + Momentum   ✔          ✗
Adagrad          ✗          ✔
Adadelta         ✗          ✔
RMSProp          ✗          ✔
Adam             ✔          ✔


The True Goal: Generalization

• Overfitting: The model fits the noise of the training set

• Low loss on training data but high loss on unseen data

• Remedy

– Decrease model capacity

– Use Data Augmentation

– Regularization

Image Source: https://upload.wikimedia.org/wikipedia/commons/1/19/Overfitting.svg


Old-school Regularization

• Add regularization term to loss that penalizes large parameters

• 𝐿2-Regularization (weight decay)

– Prefers solutions with small weights

• 𝐿1-Regularization

– Prefers solutions with sparse weights (most weights are 0)

• Elastic net

– Combination of 𝐿1- and 𝐿2-Regularization
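
The regularization terms above are simple functions of the weights that get added to the data loss. A minimal sketch, in which the function names and the strength parameters `l1` and `l2` are hypothetical names introduced for the illustration:

```python
def l1_penalty(weights):
    """Sum of absolute values; pushes weights towards exactly zero."""
    return sum(abs(w) for w in weights)


def l2_penalty(weights):
    """Sum of squares; pushes weights towards small values."""
    return sum(w * w for w in weights)


def regularized_loss(data_loss, weights, l1=0.0, l2=0.0):
    """Elastic net when both l1 and l2 are non-zero."""
    return data_loss + l1 * l1_penalty(weights) + l2 * l2_penalty(weights)


weights = [1.0, -2.0, 0.0]
print(regularized_loss(1.0, weights, l2=0.01))          # L2 only
print(regularized_loss(1.0, weights, l1=0.1))           # L1 only
print(regularized_loss(1.0, weights, l1=0.1, l2=0.01))  # elastic net
```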


Dropout

• During training: Randomly drop some neurons

• During inference: Scale neuron activations by the keep probability (1 − drop rate)

• Prevents the network from relying too much on individual features
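
A minimal sketch of the two modes in plain Python. It assumes the classic formulation in which the inference-time scaling factor is the keep probability, 1 − drop rate, so that the expected activation matches training; many libraries instead use "inverted" dropout, which rescales during training and leaves inference untouched. The function names are hypothetical.

```python
import random


def dropout_train(activations, drop_rate, rng):
    """Training: each activation is zeroed independently with probability drop_rate."""
    return [0.0 if rng.random() < drop_rate else a for a in activations]


def dropout_inference(activations, drop_rate):
    """Inference: keep every neuron, scaled by the keep probability."""
    keep = 1.0 - drop_rate
    return [a * keep for a in activations]


rng = random.Random(0)
activations = [2.0, 4.0, 6.0, 8.0]
print(dropout_train(activations, 0.5, rng))  # some entries are zeroed at random
print(dropout_inference(activations, 0.5))   # every entry is halved
```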


Image Classification


What is Image Classification?

Task: Decide which class an image belongs to

Example: Cat or Dog?


Image Classification with Deep Learning

Input Output


Image Input for Deep Learning

255 250 100 113 117

248 223 89 105 101

227 65 233 95 91

89 6 65 89 186

70 211 100 78 111

Image Source: https://cdn.pixabay.com/photo/2017/09/12/21/17/dog-2743705_960_720.jpg


Image Classification with Deep Learning

Input

Output

255 250 100 113 117

248 223 89 105 101

227 65 233 95 91

89 6 65 89 186

70 211 100 78 111


Image Classification Output

Class Probabilities

Cat Dog

0% 100%

Cat Dog

100% 0%

One-hot vector
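
A one-hot vector puts 100% on the true class and 0% on every other class, so it has the same shape as the predicted class probabilities. A minimal sketch; the helper names and the `CLASSES` list are assumptions for the illustration:

```python
CLASSES = ["Cat", "Dog"]


def one_hot(label, classes):
    """Encode a class label as a vector with a single 1.0 entry."""
    return [1.0 if c == label else 0.0 for c in classes]


def predicted_class(probabilities, classes):
    """Pick the class with the highest predicted probability."""
    best = max(range(len(classes)), key=lambda i: probabilities[i])
    return classes[best]


print(one_hot("Dog", CLASSES))                  # [0.0, 1.0]
print(one_hot("Cat", CLASSES))                  # [1.0, 0.0]
print(predicted_class([0.14, 0.86], CLASSES))   # Dog
```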


Image Classification with Deep Learning

Input

255 250 100 113 117

248 223 89 105 101

227 65 233 95 91

89 6 65 89 186

70 211 100 78 111

Cat Dog

14% 86%

Feature Extraction & Information Aggregation


Feature Extraction using Convolution

1 2 3

-4 7 4

2 -5 1

Kernel


Kernel Example

-1 0 1

-2 0 2

-1 0 1

* =

Sobel Y
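
Applying the 3×3 kernel from this slide means sliding it over the image and summing the element-wise products at each position. The sketch below does this without padding and, as is usual in CNN layers, without flipping the kernel. The function name and the tiny test images are assumptions for the illustration.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


# Kernel values as shown on the slide.
KERNEL = [[-1, 0, 1],
          [-2, 0, 2],
          [-1, 0, 1]]

# 4x4 image with a dark left half and a bright right half.
edge_image = [[0, 0, 10, 10] for _ in range(4)]
flat_image = [[5, 5, 5, 5] for _ in range(4)]

print(convolve2d(edge_image, KERNEL))  # [[40, 40], [40, 40]]
print(convolve2d(flat_image, KERNEL))  # [[0, 0], [0, 0]]
```

The kernel responds strongly where intensity changes from left to right and gives zero on a uniform region, because its entries sum to zero.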


Convolutional Layer

• Filter weights are trainable parameters

• Many filters to extract different kinds of features

Image Source: https://datascience.stackexchange.com/a/67324


Pooling: Aggregating Spatial Information

Input (4×4):

1 2 8 2
7 4 6 1
8 5 6 9
5 3 1 0

Max Pooling (2×2 windows):

7 8
8 9

Average Pooling (2×2 windows):

3.5 4.25
5.25 4
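
The numbers on this slide can be reproduced with non-overlapping 2×2 windows. A minimal sketch in plain Python; the helper names are assumptions for the illustration:

```python
def pool2d(matrix, size, reduce_fn):
    """Apply reduce_fn to each non-overlapping size x size window."""
    return [
        [
            reduce_fn([matrix[r + i][c + j]
                       for i in range(size) for j in range(size)])
            for c in range(0, len(matrix[0]) - size + 1, size)
        ]
        for r in range(0, len(matrix) - size + 1, size)
    ]


def mean(values):
    return sum(values) / len(values)


# Input matrix from the slide.
x = [[1, 2, 8, 2],
     [7, 4, 6, 1],
     [8, 5, 6, 9],
     [5, 3, 1, 0]]

print(pool2d(x, 2, max))   # [[7, 8], [8, 9]]
print(pool2d(x, 2, mean))  # [[3.5, 4.25], [5.25, 4.0]]
```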


CNN for Image Classification

Cat

Dog

Image Source: https://upload.wikimedia.org/wikipedia/commons/6/63/Typical_cnn.png


Data Augmentation

• Idea: Create more data using transformations that preserve the ground truth

• Examples

– Mirroring

– Rotation

– Translation

– Zooming

– Color transformations

– Blur

– Noise

Image Source: https://cdn.pixabay.com/photo/2017/09/12/21/17/dog-2743705_960_720.jpg
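
A few of the listed transformations, sketched on a grayscale image stored as a list of pixel rows with values from 0 to 255. The function names are hypothetical, and a real pipeline would typically use an image library; the point is only that the label (cat or dog) stays valid after each transformation.

```python
import random


def mirror(image):
    """Flip the image horizontally."""
    return [row[::-1] for row in image]


def rotate90(image):
    """Rotate the image by 90 degrees clockwise."""
    return [list(col) for col in zip(*image[::-1])]


def translate(image, shift, fill=0):
    """Shift the image to the right by `shift` pixels, padding with `fill`."""
    return [[fill] * shift + row[:len(row) - shift] for row in image]


def add_noise(image, scale, rng):
    """Add uniform integer noise in [-scale, scale], clamped to 0..255."""
    return [[min(255, max(0, p + rng.randint(-scale, scale))) for p in row]
            for row in image]


image = [[1, 2],
         [3, 4]]
print(mirror(image))        # [[2, 1], [4, 3]]
print(rotate90(image))      # [[3, 1], [4, 2]]
print(translate(image, 1))  # [[0, 1], [0, 3]]
print(add_noise(image, 10, random.Random(0)))
```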


The Secret Sauce: Pretrained Networks

• A trained network can be used as initialization for a network solving a different or related task

• Fine-tuning: The other task is similar

– Example: A network trained for classification on ImageNet is fine-tuned to discriminate between images of cats and dogs

• Transfer learning: The other task differs greatly

– Example: A network trained for classification on ImageNet is used to initialize the backbone of a semantic segmentation network

• Feature extraction: The network is only used to extract features


1. Example: Cats & Dogs Classification in KNIME


Cats & Dogs Data

https://www.kaggle.com/c/dogs-vs-cats/overview


Cats & Dogs Classification in KNIME

Three Workflows:

1. Image preprocessing and augmentation

2. Train a simple CNN from scratch

3. Fine-tune a pretrained model


1. Image Preprocessing and Augmentation

Input: 3200 examples

Output: 64000 augmented examples

(80/20) split


2. Train a Simple CNN


Create One-hot Vector


Format Network Output


Score


3. Fine-tune a Pretrained Model


How to Fine-tune a Model?

Basic Recipe (of many):

1. Choose existing architecture, pretrained on a similar task

2. Adapt network head to new task (e.g. number of neurons)

3. Re-train new head only (maybe also some other layers)
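
The recipe above can be illustrated without any deep learning library. In this toy sketch a fixed function stands in for the pretrained network body (step 1), a new logistic-regression head with one output is attached (step 2), and only the head's parameters are updated during training (step 3). Everything here, including `frozen_backbone`, the data, and the hyperparameters, is a hypothetical stand-in and not the KNIME or Keras workflow shown in the webinar.

```python
import math


def frozen_backbone(x):
    """Stand-in for a pretrained feature extractor; it is never updated."""
    return [x[0] + x[1], x[0] - x[1]]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def head(features, w, b):
    """New task-specific head: one output neuron for a binary task."""
    return sigmoid(b + sum(wi * fi for wi, fi in zip(w, features)))


def train_head(samples, labels, steps=500, learning_rate=0.5):
    """Gradient descent on binary cross entropy, updating the head only."""
    features = [frozen_backbone(x) for x in samples]  # backbone output is fixed
    w, b = [0.0, 0.0], 0.0
    n = len(features)
    for _ in range(steps):
        grad_w, grad_b = [0.0, 0.0], 0.0
        for f, y in zip(features, labels):
            error = head(f, w, b) - y  # derivative of the loss w.r.t. the pre-activation
            grad_w = [gw + error * fi for gw, fi in zip(grad_w, f)]
            grad_b += error
        w = [wi - learning_rate * gw / n for wi, gw in zip(w, grad_w)]
        b -= learning_rate * grad_b / n
    return w, b


def predict(x, w, b):
    return head(frozen_backbone(x), w, b)


# Toy task: label 1 if the first input is larger than the second.
samples = [[2.0, 1.0], [3.0, 0.0], [1.0, 2.0], [0.0, 3.0]]
labels = [1, 1, 0, 0]
w, b = train_head(samples, labels)
print(predict([5.0, 1.0], w, b), predict([1.0, 5.0], w, b))
```

Because the backbone is frozen, its features can be computed once before training, which is one reason this recipe is cheap compared with training from scratch.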


Prepare pretrained ResNet50 Model

ResNet50

Add new head

Only train the new network head

ResNet50: https://arxiv.org/abs/1512.03385


Score


Semantic Segmentation


Before: Classification

Image Source: https://upload.wikimedia.org/wikipedia/commons/6/63/Typical_cnn.png

We have: One classification per image

We need: One classification per pixel


Simple Approach: Sliding Window

Monkey

Tree

Fence

Problem: Inefficient


Another Approach: Only Convolutional Layers

Image Source: http://cs231n.stanford.edu/slides/2017/cs231n_2017_lecture11.pdf


Better: Encoder-Decoder

Image Source: http://cs231n.stanford.edu/slides/2017/cs231n_2017_lecture11.pdf


Upsampling? Transpose Convolution

Image Source: https://medium.com/apache-mxnet/transposed-convolutions-explained-with-ms-excel-52d13030c7e8


U-Net

Ronneberger et al.: https://arxiv.org/pdf/1505.04597.pdf


2. Example: Natural Image Segmentation in KNIME


Workflow Demo


Image Captioning

What is Image Captioning?

Task: Describe the contents of an image

Example Captions:

• A fancy desert on a plate with a twisted orange.

• A plate has a dessert and orange slices on it.

• some iice crean sitting next to some orange slices …

Image Captioning with Deep Learning

`A fancy desert on a plate with a twisted orange.`

Does this work?

Problem Formulation

Problem:

• The length of the caption to predict is unknown, but the output layer needs a fixed output dimension

Simple Approach (of many):

• Iterative approach predicting word by word

Image Captioning with Deep Learning

Inputs: Image, Partial Caption

Output: Next Word in Caption

One neuron for each possible word, each word is a `class`

Iterative Approach Example

Input1: Image | Input2: Partial Caption | Output: Target Word

*image1* | startseq | A

*image1* | startseq, A | fancy

*image1* | startseq, A, fancy | desert

*image1* | startseq, A, fancy, desert | on

*image1* | …, with, a, twisted, orange, . | endseq

`A fancy desert on a plate with a twisted orange.`

Special tokens marking start and end of sentence
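
The table above can be generated mechanically from a caption. A minimal sketch, assuming whitespace tokenization; the function name `make_training_pairs` is hypothetical and not taken from the KNIME workflow:

```python
def make_training_pairs(caption, start="startseq", end="endseq"):
    """Split one caption into (partial caption, target word) pairs."""
    # Assumption: tokens are separated by whitespace.
    tokens = [start] + caption.split() + [end]
    pairs = []
    for i in range(1, len(tokens)):
        # Everything before position i is the input; token i is the target.
        pairs.append((tokens[:i], tokens[i]))
    return pairs


pairs = make_training_pairs("A fancy desert on a plate with a twisted orange .")
for partial, target in pairs[:4]:
    print(", ".join(partial), "->", target)
```

Each caption of n tokens yields n + 1 training examples, the last one having endseq as its target.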

Network Inputs

Input1 (pixel values):

255 250 100 113 117
248 223  89 105 101
227  65 233  95  91
 89   6  65  89 186
 70 211 100  78 111

Input2: startseq, A, fancy, desert

Replace words with vocabulary indices: 12, 452, 1120, 38

Network Output

Output: fancy […, 0, 0, 1, 0, …]

Convert vocabulary index of target word to one-hot vector
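
A minimal sketch of both encodings, assuming a toy six-word vocabulary in which index 0 is reserved for padding. The helper names `build_vocab` and `one_hot` are hypothetical, and the resulting indices differ from the ones on the slides:

```python
def build_vocab(words):
    """Assign indices starting at 1; index 0 is reserved for padding (assumption)."""
    return {word: i for i, word in enumerate(sorted(set(words)), start=1)}


def one_hot(index, size):
    """Return a list of zeros with a single 1 at position `index`."""
    if not 0 <= index < size:
        raise ValueError(f"index {index} is out of range for size {size}")
    vec = [0] * size
    vec[index] = 1
    return vec


vocab = build_vocab(["startseq", "A", "fancy", "desert", "on", "endseq"])

# Network input 2: the partial caption as vocabulary indices.
partial = ["startseq", "A", "fancy", "desert"]
indices = [vocab[word] for word in partial]

# Network output: the target word as a one-hot vector (vocabulary size + padding slot).
target = one_hot(vocab["on"], len(vocab) + 1)
print(indices, target)
```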

How to do Prediction?

Using same iterative approach:

• Predict first word using image and start token (startseq)

• Predict next words using image and partial caption from the previous prediction iteration

• Repeat until endseq is predicted
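
The three steps can be sketched as a loop for a single image. This is an illustration under assumptions: `predict_next` stands in for the trained network, `toy_predict_next` is a scripted placeholder, and the `max_len` cap of 29 is borrowed from the padded input length used later in the deck:

```python
def generate_caption(predict_next, image, start="startseq", end="endseq", max_len=29):
    """Grow a caption word by word until the end token is predicted."""
    caption = [start]
    while len(caption) < max_len:
        word = predict_next(image, caption)
        if word == end:
            break
        caption.append(word)
    # Drop the start token before returning the text.
    return " ".join(caption[1:])


def toy_predict_next(image, partial_caption):
    """Hypothetical stand-in for the network: replays a fixed script."""
    script = ["a", "plate", "with", "dessert", "endseq"]
    return script[len(partial_caption) - 1]


print(generate_caption(toy_predict_next, image=None))
```

The length cap guards against a model that never predicts endseq.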

Reduce Complexity

Use Help ➜ Transfer Learning:

• Image Input: Use pretrained image features (InceptionV3)

• Text Input: Use pretrained embedding vectors (GLOVE)

Approach: Pre-calculate InceptionV3 image features and GLOVE embedding features

• Make captions simpler using text processing

InceptionV3: https://arxiv.org/abs/1512.00567, GLOVE: https://nlp.stanford.edu/projects/glove/

3. Example: Image Captioning in KNIME

COCO Data

Large image dataset for many different tasks, e.g. image captioning

Five captions per image:

• A hot dog bun filled with macaroni salad.

• A hot dog bun has macaroni and cheese in it.

• A hotdog bun filled with noodles on a plate with fries.

• Mac and cheese sub with some fries on the side.

• A nice meal sitting on top of a plate.

We are using a randomly sampled subset containing ≈ 8000 images.

Dataset: http://cocodataset.org/#home

Image Captioning in KNIME

Five Workflows:

1. Caption preprocessing

2. Pre-calculate image features

3. Pre-calculate GLOVE embedding vectors

4. Model Training

5. Prediction

1. Caption Preprocessing

Clean Captions

1830 unique words vs. ≈ 10000 before cleaning
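
The slides do not list the individual cleaning steps. A minimal sketch, assuming typical ones (lowercasing, stripping punctuation, dropping non-alphabetic tokens and rare words); the function names and the `min_count` threshold are hypothetical:

```python
import string
from collections import Counter


def clean_caption(caption):
    """Lowercase, strip punctuation, keep purely alphabetic tokens (assumed steps)."""
    table = str.maketrans("", "", string.punctuation)
    words = caption.lower().translate(table).split()
    return [word for word in words if word.isalpha()]


def build_vocabulary(captions, min_count=2):
    """Keep only words that occur at least `min_count` times across all captions."""
    counts = Counter(word for caption in captions for word in clean_caption(caption))
    return {word for word, n in counts.items() if n >= min_count}


captions = [
    "A hot dog bun filled with macaroni salad.",
    "A hot dog bun has macaroni and cheese in it.",
    "A hotdog bun filled with noodles on a plate with fries.",
]
vocabulary = build_vocabulary(captions)
print(sorted(vocabulary))
```

Dropping rare words is one plausible way a vocabulary shrinks as strongly as reported above; the actual workflow may use different steps.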

2. Pre-calculate Image Features

Extract the features of the last layer before the classification layer (length 2048)

3. Pre-calculate GLOVE Embedding Vectors

GLOVE is a type of Word Embedding

What are Word Embeddings?

• Map a word (or vocabulary index) to a position in an n-dimensional space. The position (relative to other words) encodes the semantics of the word.

GLOVE Embedding Vectors Intuition

Nearest Neighbors to ‘frog’ (in terms of distance on the GLOVE vectors)

Image Source: https://nlp.stanford.edu/projects/glove/
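
A minimal sketch of a nearest-neighbor query on embedding vectors. Cosine similarity is used here as the closeness measure, which is a choice made for this illustration. The three-dimensional vectors are made-up toy values, not real GLOVE vectors:

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_neighbors(word, embeddings, k=2):
    """Return the k words whose vectors are most similar to the vector of `word`."""
    query = embeddings[word]
    scored = [
        (cosine_similarity(query, vec), other)
        for other, vec in embeddings.items()
        if other != word
    ]
    return [other for _, other in sorted(scored, reverse=True)[:k]]


# Toy values for illustration only.
toy_embeddings = {
    "frog": [0.9, 0.8, 0.1],
    "toad": [0.8, 0.9, 0.2],
    "lizard": [0.7, 0.6, 0.3],
    "plate": [0.1, 0.2, 0.9],
}
print(nearest_neighbors("frog", toy_embeddings))
```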

3. Pre-calculate GLOVE Embedding Vectors

Several versions with different vector lengths are available; we choose the 200-dimensional ones here

Look up the word vector for every vocabulary entry and save it in a Python dictionary
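
A minimal sketch of this lookup, relying on the plain-text GLOVE format in which each line holds a word followed by its space-separated vector components. The function names are hypothetical, and using a zero vector for words missing from GLOVE is an assumption, not something the slides state:

```python
import io


def load_glove(lines):
    """Parse lines of the form 'word v1 v2 ... vn' into a dict of float lists."""
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip empty or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def lookup_vocabulary(vocab, glove, dim):
    """Map every vocabulary word to its vector; unknown words get zeros (assumption)."""
    return {word: glove.get(word, [0.0] * dim) for word in vocab}


# A tiny in-memory stand-in for a GLOVE file, with 3 instead of 200 dimensions.
sample_file = io.StringIO("plate 0.1 0.2 0.3\norange 0.4 0.5 0.6\n")
glove = load_glove(sample_file)
embedding = lookup_vocabulary(["plate", "orange", "startseq"], glove, dim=3)
print(embedding["plate"], embedding["startseq"])
```

With a real download, the same `load_glove` function could be passed an open file handle and `dim=200`.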

4. Model Training

Word/Vocab Mapping

Create Training Data

Padded with zeros to create equal-length vectors (length 29)
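
A minimal sketch of the padding step. Whether the zeros go before or after the indices is not stated on the slide; appending them is an assumption here, and `pad_sequence` is a hypothetical helper:

```python
def pad_sequence(indices, length=29, pad_value=0):
    """Append zeros until the sequence has exactly `length` entries."""
    if len(indices) > length:
        raise ValueError(f"sequence of length {len(indices)} exceeds {length}")
    return indices + [pad_value] * (length - len(indices))


padded = pad_sequence([12, 452, 1120, 38])
print(len(padded), padded[:6])
```

Because index 0 is used for padding, no real word may be assigned vocabulary index 0.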

Caption Network

Input1: Image Vector, Shape: [2048]

Input2: Word Indices, Shape: [29]

Maps word indices to GLOVE vectors using our pre-calculated dictionary

Output Shape: [1831]

Softmax vector of length 1831 (vocabulary size 1830 + ‘0’ padding)
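
To illustrate how such an output vector is read, here is a minimal sketch of softmax followed by picking the most probable class. The raw scores are made up, and index 38 is only an example:

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(scores)  # subtracting the maximum keeps exp() numerically stable
    exps = [math.exp(score - highest) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


# One score per class: 1830 vocabulary words plus the padding index 0.
scores = [0.0] * 1831
scores[38] = 5.0  # made-up value: pretend the network favors word index 38

probs = softmax(scores)
predicted_index = max(range(len(probs)), key=probs.__getitem__)
print(predicted_index)
```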

Training

Creates one-hot vector from indices

Caution: Indices must not get out of range of the output shape!

5. Prediction

Prepare Test Data

startseq: 1176 (vocabulary index of the start token)

Input length: 29

Iterative Prediction

1. Start with startseq token

2. Predict next token

3. If predicted token == endseq, exclude example from next iteration

4. Else, go to 2.

5. Repeat until all examples have been excluded
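
The steps above can be sketched for a whole batch of examples. This is an illustration, not the KNIME loop itself: `predict_next` stands in for the trained model, the toy "images" are simply scripted word lists, and the `max_len` cap is an added safeguard:

```python
def predict_captions(predict_next, images, start="startseq", end="endseq", max_len=29):
    """Caption all images, excluding each example once it has produced the end token."""
    active = {i: [start] for i in range(len(images))}
    finished = {}
    while active:  # step 5: repeat until every example has been excluded
        still_active = {}
        for i, caption in active.items():
            word = predict_next(images[i], caption)  # step 2
            if word != end:
                caption = caption + [word]
            if word == end or len(caption) >= max_len:
                finished[i] = caption[1:]  # step 3: exclude from the next iteration
            else:
                still_active[i] = caption  # step 4: keep for another round
        active = still_active
    return [" ".join(finished[i]) for i in range(len(images))]


def toy_predict_next(image, partial_caption):
    """Hypothetical stand-in for the network: the 'image' is its own scripted caption."""
    position = len(partial_caption) - 1
    return image[position] if position < len(image) else "endseq"


images = [["a", "dog"], ["a", "plate", "of", "food"]]
print(predict_captions(toy_predict_next, images))
```

Short captions leave the loop early, so later iterations only process the examples that still need more words.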

Loop inputs: Trained Model, Test Data

endseq: 348

Predict next token

If predicted token == endseq, route example to output (Loop output)

Else: data for next iteration, stop loop if empty

Caption Results

Questions?

The End – thank you for joining this webinar.