ODSC West


Transcript of ODSC West

Proprietary and confidential. Do not distribute.

Diving deep into Deep Learning: Convolutional and Recurrent Neural Networks

Urs Köster, PhD, Yinyin Liu, PhD

MAKING MACHINES SMARTER.™

now part of


Nervana Systems Proprietary


• What is Deep Learning and What Can It Do Today? (2:00 pm – 2:25 pm)
• Convolutional Neural Networks (2:25 pm – 2:50 pm)
• BREAK: 2:50 – 3:00
• Using Convolutional Neural Networks to Identify Whales (3:00 pm – 3:25 pm)
• SegNet for Object Segmentation (3:25 pm – 3:50 pm)
• BREAK: 3:50 – 4:00
• Recurrent Neural Networks (4:00 pm – 4:25 pm)
• Using Recurrent Neural Networks for Whale Call Classification (4:25 pm – 4:50 pm)
• BREAK: 4:50 – 5:00
• Neural Machine Translation (5:00 pm – 5:25 pm)
• Finding the Right Deep Learning Framework For You (5:25 pm – 5:50 pm)

Download neon!
https://github.com/NervanaSystems/neon
git clone git@github.com:NervanaSystems/neon.git

Nervana’s deep learning tutorials:
https://www.nervanasys.com/deep-learning-tutorials/

We are hiring!
https://www.nervanasys.com/careers/


Key terms: Back-propagation, End-to-end, ResNet, ImageNet, Word2Vec, Regularization, Convolution, Unrolling, RNN, Generalization, hyperparameters, Video recognition, dropout, Pooling, LSTM, AlexNet, Speech recognition


https://www.nervanasys.com/industry-focus-serving-the-automotive-industry-with-the-nervana-platform/


http://www.nervanasys.com/deep-reinforcement-learning-with-neon/

https://youtu.be/KkIf0Ok5GCE


End-to-end learning: raw image input → network with ~60 million parameters → output (positive/negative)


(Zeiler and Fergus, 2013)


Historical perspective:

• Input → designed features → output

• Input → designed features → SVM → output

• Input → learned features → SVM → output

• Input → levels of learned features → output


A method for extracting features at multiple levels of abstraction

• Features are discovered from data
• Performance improves with more data
• Network can express complex transformations
• High degree of representational power


No free lunch:

• lots of data

• flexible models

• powerful priors


[Chart: ImageNet top-5 error rate by year, 2010 to 2015 (y-axis 0% to 30%), with human performance marked. Source: ImageNet]

• No free lunch
• lots of data
• flexible and fast frameworks
• powerful computing resources


• Healthcare: Tumor detection
• Automotive: Speech interfaces
• Finance: Time-series search engine
• Agricultural Robotics
• Oil & Gas
• Proteomics: Sequence analysis

[Image labels: positive and negative examples; query and results]


Convolutional Neural Networks (2:25 pm – 2:50 pm)


input      filter     output
0 1 2      0 1        19 25
3 4 5      2 3        37 43
6 7 8

(0, 1, 3, 4) · (0, 1, 2, 3) = 19

• Each element in the output is the result of a dot product between two vectors
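The dot-product view of convolution can be checked with a few lines of plain Python. This is a minimal sketch, assuming stride 1 and no padding (the slide does not state either); the function name `conv2d_valid` is made up for illustration and is not part of neon.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and take dot products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # Dot product between the filter and the image patch under it.
            row.append(sum(image[r + i][c + j] * kernel[i][j]
                           for i in range(kh) for j in range(kw)))
        output.append(row)
    return output


image = [[0, 1, 2],
         [3, 4, 5],
         [6, 7, 8]]
kernel = [[0, 1],
          [2, 3]]
result = conv2d_valid(image, kernel)
print(result)  # [[19, 25], [37, 43]]
```

The top-left output is the patch (0, 1, 3, 4) dotted with the filter (0, 1, 2, 3), which gives 19, matching the slide.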


B0 B1 B2

B3 B4 B5

B6 B7 B8

G0 G1 G2

G3 G4 G5

G6 G7 G8

R0 R1 R2

R3 R4 R5

R6 R7 R8


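[Figure: the same convolution written as a matrix-vector product. The 3x3 input (0 to 8) is flattened into a 9-element vector and multiplied by a matrix built from the filter weights (0, 1, 2, 3), giving the outputs 19, 25, 37, 43.]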


Detected the pattern!


input      output
0 1 2      4 5
3 4 5      7 8
6 7 8

Max(0, 1, 3, 4) = 4

• Each element in the output is the maximum value within the pooling window
• Precise location becomes less relevant
• The layer becomes tolerant to local perturbations in the input, which builds in invariance
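A short sketch of max pooling on the same 3x3 input, assuming a 2x2 window moved with stride 1 (which is what produces a 2x2 output here). The helper name `max_pool2d` is illustrative only.

```python
def max_pool2d(image, window=2, stride=1):
    """Return the maximum of each `window` x `window` patch of `image`."""
    out_h = (len(image) - window) // stride + 1
    out_w = (len(image[0]) - window) // stride + 1
    return [[max(image[r * stride + i][c * stride + j]
                 for i in range(window) for j in range(window))
             for c in range(out_w)]
            for r in range(out_h)]


image = [[0, 1, 2],
         [3, 4, 5],
         [6, 7, 8]]
pooled = max_pool2d(image)
print(pooled)  # [[4, 5], [7, 8]]

# A small perturbation of a non-maximal input leaves the output unchanged.
perturbed = [[0.5, 1, 2],
             [3, 4, 5],
             [6, 7, 8]]
print(max_pool2d(perturbed) == pooled)  # True
```

The second call illustrates the tolerance to local perturbations: only the largest value in each window reaches the output.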


input      filter     output
0 1        0 1        0 0 1
2 3        2 3        0 4 6
                      4 12 9

• Opposite transformation of convolution
• The filters act as the bases from which the shape of an input is reconstructed
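The numbers on this slide are consistent with a stride-1 deconvolution (transposed convolution) of a 2x2 input (0, 1, 2, 3) with a 2x2 filter (0, 1, 2, 3). The sketch below assumes that reading; `deconv2d` is an illustrative name, not a neon function.

```python
def deconv2d(inputs, kernel):
    """Stride-1 deconvolution: each input value scales a copy of the kernel,
    and overlapping copies are summed in the output."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(inputs) + kh - 1
    out_w = len(inputs[0]) + kw - 1
    output = [[0] * out_w for _ in range(out_h)]
    for r, row in enumerate(inputs):
        for c, value in enumerate(row):
            for i in range(kh):
                for j in range(kw):
                    output[r + i][c + j] += value * kernel[i][j]
    return output


inputs = [[0, 1],
          [2, 3]]
kernel = [[0, 1],
          [2, 3]]
result = deconv2d(inputs, kernel)
print(result)  # [[0, 0, 1], [0, 4, 6], [4, 12, 9]]
```

Where convolution shrinks a 3x3 map to 2x2, this operation grows a 2x2 map back to 3x3, which is why it is used to project activations back towards pixel space.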


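[Figure: deconvolution step by step. Each input element scales a copy of the filter; the scaled copies are placed at the corresponding output positions and summed where they overlap.]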


• AlexNet (ILSVRC 2012 winner)

• ZF Net (2013 winner)

• GoogLeNet (2014 winner)

• VGG (2014 runner-up)

• ResNet (2015 winner)

conv1 → pool1 → conv2 → pool2 → conv3 → conv4 → conv5 → pool5 → fc6 → fc7


When you construct a deep network and train it with a lot of data:

• all the parameters are optimized for the problem, giving an optimized feature extractor
• the network discovers the intrinsic structure of the data on its own
• different layers of filters discover different levels of features


• Filters can be visualized by the weights

• The weights reflect what patterns a filter is looking for

• Low-level filters represent low-level features such as edges and color blobs

11x11x3 conv filters learned by the first layer: http://www.cs.toronto.edu/~fritz/absps/imagenet.pdf


• High-level filters represent abstract features
• Inputs that activate a filter (neuron) are passed through to the upper layers
• But the pattern can be hard to interpret

http://eblearn.sourceforge.net/lib/exe/mnist_fprop1.png


• A conv-deconv network to project filters to the pixel level
• For high-level filters:
  • Each tile shows a feature map activation projected to pixel space
  • Strong grouping within each feature map
  • Greater invariance at higher layers
  • Exaggeration of discriminative parts of the image (eyes, wheels, …)

(Zeiler and Fergus, 2013)


1. Each layer has a different scope of things it can look for. Lower layers develop general features, so they do not have a wide variety to look for. Higher layers have a larger variety of things to look for.
→ Number of features increases

2. Combine simple features into complex features. Choose convolution strides / padding to retain the feature map (FM) size. Use pooling to reduce the FM size.
→ (H, W) decrease


Layer                  Output shape
Input                  (224, 224, 3)
CONV (3x3x64)          (224, 224, 64)
CONV (3x3x64)          (224, 224, 64)
POOL (2x2)             (112, 112, 64)
CONV (3x3x128)         (112, 112, 128)
CONV (3x3x128)         (112, 112, 128)
POOL (2x2)             (56, 56, 128)
CONV (3x3x256)         (56, 56, 256)
CONV (3x3x256)         (56, 56, 256)
CONV (3x3x256)         (56, 56, 256)
POOL (2x2)             (28, 28, 256)
CONV (3x3x512)         (28, 28, 512)
CONV (3x3x512)         (28, 28, 512)
CONV (3x3x512)         (28, 28, 512)
POOL (2x2)             (14, 14, 512)
CONV (3x3x512)         (14, 14, 512)
CONV (3x3x512)         (14, 14, 512)
CONV (3x3x512)         (14, 14, 512)
POOL (2x2)             (7, 7, 512)
AFFINE (4096 units)    (4096, 1)
AFFINE (4096 units)    (4096, 1)
AFFINE (1000 units)    (1000, 1)

https://www.cs.toronto.edu/~frossard/post/vgg16/
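The output shapes of the convolution and pooling layers follow from one formula per spatial axis: (size + 2 * padding - kernel) // stride + 1. The sketch below traces them for the VGG-16 layout, assuming 3x3 convolutions with stride 1 and padding 1, and 2x2 pooling with stride 2. These settings are the usual VGG choices but are not stated on the slide, and the names `conv_out`, `VGG16_BLOCKS` and `trace_shapes` are illustrative.

```python
def conv_out(size, kernel, stride=1, pad=0):
    """Output size along one spatial axis of a convolution or pooling layer."""
    return (size + 2 * pad - kernel) // stride + 1


# (number of 3x3 conv layers, number of filters) for each block.
VGG16_BLOCKS = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]


def trace_shapes(height=224, width=224, channels=3):
    shapes = [("Input", (height, width, channels))]
    for num_convs, filters in VGG16_BLOCKS:
        for _ in range(num_convs):
            # Assumed: 3x3 kernel, stride 1, padding 1, so (H, W) is retained.
            height = conv_out(height, kernel=3, stride=1, pad=1)
            width = conv_out(width, kernel=3, stride=1, pad=1)
            channels = filters
            shapes.append(("CONV (3x3x%d)" % filters, (height, width, channels)))
        # Assumed: 2x2 pooling with stride 2, which halves (H, W).
        height = conv_out(height, kernel=2, stride=2)
        width = conv_out(width, kernel=2, stride=2)
        shapes.append(("POOL (2x2)", (height, width, channels)))
    return shapes


shapes = trace_shapes()
for name, shape in shapes:
    print("%-16s %s" % (name, shape))
```

The last pooling layer leaves a (7, 7, 512) volume, i.e. 7 * 7 * 512 = 25088 values feeding the first fully connected (affine) layer.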


Input → Conv1 → Conv2 → Conv3 → Deconv1 → Deconv2 → Deconv3

• Can be trained to reconstruct meaningful variations
• Has been used for image generation and object localization

http://arxiv.org/abs/1411.5928
http://arxiv.org/abs/1412.6583
http://arxiv.org/abs/1505.04366


Image classification

Image segmentation

Object localization

Video classification


• neon provides optimized convolution kernels for Maxwell-based GPUs

• It includes all the components needed to construct the example CNNs


Using Convolutional Neural Networks to Identify Whales (3:00 pm – 3:25 pm)


• https://www.kaggle.com/c/noaa-right-whale-recognition

• Right whales have been photographed and tracked for over 10 years

• ~4500 labeled images, ~450 whales

• ~7000 test images


• The whales all look very much alike
• The whale is a small object to identify against a large background
• Whales in the pictures have different orientations, which makes it challenging to account for this much variance


• How do we go from the raw photo to an up-close, orientation-aligned image of the whale?

• Estimate the heading (angle) of the whale using a CNN?

• The training set can be manually labeled to train a segmentation CNN

• Apply the segmentation CNN to process and auto-align the test images

• Apply the classification CNN to the pre-processed images


[Figure: localization output for an input image at epoch 0, epoch 2, epoch 4 and epoch 6, with the target and the prediction marked.]


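from neon.callbacks.callbacks import Callbacks
from neon.initializers import Gaussian
from neon.layers import Conv, Deconv, GeneralizedCost
from neon.models import Model
from neon.optimizers import Adadelta
from neon.transforms import Rectlin, SumSquared

# train, val and args come from the rest of the script (data loading, argument parsing).
init = Gaussian(scale=0.1)
opt = Adadelta(decay=0.9)
common = dict(init=init, batch_norm=True, activation=Rectlin())

# Conv stack: one strided conv, then 16 conv layers; channels halve from 128 down to 16.
layers = []
nchan = 128
layers.append(Conv((2, 2, nchan), strides=2, **common))
for idx in range(16):
    layers.append(Conv((3, 3, nchan), **common))
    if nchan > 16:
        nchan //= 2  # integer division keeps the channel count an int
# Deconv stack: back up to a single-channel output map.
for idx in range(15):
    layers.append(Deconv((3, 3, nchan), **common))
layers.append(Deconv((4, 4, nchan), strides=2, **common))
layers.append(Deconv((3, 3, 1), init=init))

cost = GeneralizedCost(costfunc=SumSquared())
mlp = Model(layers=layers)
callbacks = Callbacks(mlp, train, eval_set=val, **args.callback_args)
mlp.fit(train, optimizer=opt, num_epochs=args.epochs, cost=cost, callbacks=callbacks)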


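from neon.callbacks.callbacks import Callbacks
from neon.initializers import Gaussian
from neon.layers import Affine, Conv, GeneralizedCost, Pooling
from neon.models import Model
from neon.optimizers import Adadelta
from neon.transforms import CrossEntropyMulti, Rectlin, Softmax

# train, val, args and DropoutBinary come from the rest of the script.
init = Gaussian(scale=0.01)
opt = Adadelta(decay=0.9)
common = dict(init=init, batch_norm=True, activation=Rectlin())

layers = []
nchan = 64
layers.append(Conv((2, 2, nchan), strides=2, **common))
# Six conv + pool blocks; channels double after each block, capped at 1024.
for idx in range(6):
    if nchan > 1024:
        nchan = 1024
    layers.append(Conv((3, 3, nchan), strides=1, **common))
    layers.append(Pooling(2, strides=2))
    nchan *= 2
layers.append(DropoutBinary(keep=0.5))
# One output unit per whale identity.
layers.append(Affine(nout=447, init=init, activation=Softmax()))

cost = GeneralizedCost(costfunc=CrossEntropyMulti())
mlp = Model(layers=layers)
callbacks = Callbacks(mlp, train, eval_set=val, **args.callback_args)
mlp.fit(train, optimizer=opt, num_epochs=args.epochs, cost=cost, callbacks=callbacks)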


https://github.com/anlthms/whale-2015


SegNet for Object Segmentation (3:25 pm – 3:50 pm)


Source: http://mi.eng.cam.ac.uk/projects/segnet/


• Uses CamVid dataset: https://github.com/alexgkendall/SegNet-Tutorial/tree/master/CamVid

• Converts the 1-channel target class images, which hold the ground-truth value for each pixel, into 12-channel images using a one-hot representation of the class of each pixel

• Takes about 12 GB of GPU memory

• After 650 epochs of training, the network should reach ~9000 training cost and ~80% pixel classification accuracy.
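The one-hot conversion of the target images can be sketched with NumPy as follows. This is an illustration only, not the code of the neon example: the channel-first layout and the name `to_one_hot` are assumptions.

```python
import numpy as np

NUM_CLASSES = 12  # number of classes, from the slide


def to_one_hot(labels, num_classes=NUM_CLASSES):
    """Turn an (H, W) array of class ids into a (num_classes, H, W) one-hot array."""
    one_hot = np.zeros((num_classes,) + labels.shape, dtype=np.float32)
    rows, cols = np.indices(labels.shape)
    # For every pixel, set the channel given by its class id to 1.
    one_hot[labels, rows, cols] = 1.0
    return one_hot


# Tiny 2x2 "target image" with made-up class ids.
labels = np.array([[0, 3],
                   [11, 3]])
target = to_one_hot(labels)
print(target.shape)  # (12, 2, 2)
```

Each pixel has exactly one active channel, so taking the argmax over the channel axis recovers the original class image.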


Class legend: Sky, Building, Sidewalk, Tree, Car, Pedestrian, Pole, Road, Sign, Fence, Bicyclist, Unlabeled


Class legend: Sky, Building, Pole, Road, Sidewalk, Tree, Sign, Fence, Car, Pedestrian, Bicyclist, Unlabeled


Source at https://github.com/NervanaSystems/neon/tree/master/examples

Example 1: ./conv_autoencoder.py
Conv-deconv network to reconstruct input images

Example 2: ./cifar10_conv.py
ConvNet for the CIFAR-10 dataset


Recurrent Neural Networks (4:00 pm – 4:25 pm)


Backpropagation through time (BPTT)

1. Unroll the network across time steps.

2. Follow the back-propagated gradients.

3. Update weights with average gradients.
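The three steps can be sketched for the smallest possible case: a single recurrent unit with scalar weights. This is a toy illustration, not neon code. The tanh activation, the squared-error loss on the last hidden state, and all names and numbers are assumptions chosen to keep the example short.

```python
import math


def forward(w_rec, w_in, inputs):
    """Step 1: unroll h_t = tanh(w_rec * h_{t-1} + w_in * x_t) across time steps."""
    hidden = [0.0]  # h_0
    for x in inputs:
        hidden.append(math.tanh(w_rec * hidden[-1] + w_in * x))
    return hidden


def bptt(w_rec, w_in, inputs, target):
    """Step 2: follow the gradients of 0.5 * (h_T - target)^2 back through time."""
    hidden = forward(w_rec, w_in, inputs)
    loss = 0.5 * (hidden[-1] - target) ** 2
    grad_rec = grad_in = 0.0
    d_hidden = hidden[-1] - target  # dL/dh_T
    for t in range(len(inputs), 0, -1):
        d_pre = d_hidden * (1.0 - hidden[t] ** 2)  # back through tanh
        grad_rec += d_pre * hidden[t - 1]  # the same weight is reused at every step
        grad_in += d_pre * inputs[t - 1]
        d_hidden = d_pre * w_rec  # gradient passed on to h_{t-1}
    return loss, grad_rec, grad_in


inputs, target = [0.5, -0.3, 0.8], 0.2
w_rec, w_in, learning_rate = 0.7, 0.4, 0.1
loss, grad_rec, grad_in = bptt(w_rec, w_in, inputs, target)

# Step 3: update the shared weights with the gradient averaged over time steps.
steps = len(inputs)
new_w_rec = w_rec - learning_rate * grad_rec / steps
new_w_in = w_in - learning_rate * grad_in / steps
```

Because the recurrent and input weights are shared across time, each time step contributes one term to their gradient; the update uses the accumulated (here averaged) value.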

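[Figure: an RNN with input x_t and stacked hidden layers h_t, j_t, k_t, shown next to the same network unrolled over time steps 1, 2, …, n (inputs x1, x2, …, xn). Recurrent weights connect each layer to itself at the next time step, feed-forward weights connect the layers within a time step, and gradients flow backwards through the unrolled network.]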


Network activations determine the states of the input, forget and output gates:

• Open input, open output, closed forget: LSTM network acts like a standard RNN

• Closing input, opening forget: Memory cell recalls previous state, new input is ignored

• Closing output: Internal state is stored for the next time step without producing any output
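The three gate settings can be tried out on a single scalar memory cell. In a real LSTM the gate values are sigmoid outputs computed from the input and the previous hidden state; in this sketch they are passed in directly (0 = closed, 1 = open) so that the effect of each setting is visible. The function `lstm_step` and the numbers are illustrative only.

```python
import math


def lstm_step(candidate, c_prev, input_gate, forget_gate, output_gate):
    """One LSTM cell update with explicitly given gate values."""
    # The forget gate scales the old memory; the input gate scales the new candidate.
    c_new = forget_gate * c_prev + input_gate * candidate
    # The output gate decides how much of the memory is exposed as output.
    h_new = output_gate * math.tanh(c_new)
    return c_new, h_new


candidate, c_prev = 0.5, 2.0

# Open input, open output, closed forget: the old memory is dropped,
# so the state depends only on the new input, as in a standard RNN.
print(lstm_step(candidate, c_prev, input_gate=1, forget_gate=0, output_gate=1))

# Closed input, open forget: the cell recalls its previous state, the new input is ignored.
print(lstm_step(candidate, c_prev, input_gate=0, forget_gate=1, output_gate=1))

# Closed output: the internal state is kept for the next step, but nothing is emitted.
print(lstm_step(candidate, c_prev, input_gate=1, forget_gate=1, output_gate=0))
```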

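[Figure: LSTM layers (input and hidden) unrolled over time. Each cell has gates f, g, i, o, a memory cell c and an output h_t; cells are connected by feed-forward (FF) weights and recurrent weights.]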


• Neon supports a wide range of recurrent layers

• Connectivity between recurrent and feed-forward layers

• Deep and bi-directional RNNs

• Containers for Encoders, Decoders, Sequence to Sequence models.

Recurrent output layers

Standard Recurrent layers

Bidirectional Recurrent layers


• Simple RNN example: neon/examples/char_rnn.py

• Penn Treebank text dataset: learn to predict text, one letter at a time.

• Small enough to run on your laptop right now (should take about 4 minutes per epoch of training on a laptop CPU).

Parts of the script: Backend, Hyper-parameters, Network Layers, Dataset, Cost Function, Optimizer, Fitting the model

Other RNN examples you can try:

• LSTM example: text_generation_lstm.py

• Generate Shakespeare-style text

Source at https://github.com/anlthms/meetup2/blob/master

Example 1: ./rnn1.py -e 16 -w /home/ubuntu/nervana/music -r 0 -v
(Does not work well! This example demonstrates the challenges of training RNNs.)

Example 2: ./rnn2.py -e 16 -w /home/ubuntu/nervana/music -r 0 -v
(Uses Glorot init, gradient clipping, Adagrad and LSTM)

Example 3: ./rnn3.py -e 16 -w /home/ubuntu/nervana/music -r 0 -v
(Uses multiple bi-RNN layers)

Warning: Large dataset, please do not download over ODSC WiFi.


Image Captioning

Speech recognition

Machine Translation

Time Series Analysis


Using Recurrent Neural Networks for Whale Call Classification (4:25 pm – 4:50 pm)


• Whale Detection challenge from Kaggle: https://www.kaggle.com/c/whale-detection-challenge

• Identify calls by right whales based on their signature chirp sound

• 30,000 training clips, each 2 s long, sampled at 2 kHz


• Processing the data: work in the spectrogram domain (81 frequencies, 49 time steps)

• neon dataloader has built in audio processing tools.

• Essentially transforms the sound into an “image” we can apply ConvNet tools to.
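One way to arrive at an 81 x 49 spectrogram from a 2 s clip at 2 kHz (4000 samples) is a short-time Fourier transform with 160-sample frames and an 80-sample hop: 160 // 2 + 1 = 81 frequencies and 1 + (4000 - 160) // 80 = 49 time steps. The NumPy sketch below uses those values as assumptions; the slides do not state the settings of the neon dataloader, and the test signal is synthetic, not whale data.

```python
import numpy as np

SAMPLE_RATE = 2000   # Hz, from the slide
CLIP_SECONDS = 2     # from the slide
FRAME_LENGTH = 160   # samples per FFT frame (assumed)
HOP_LENGTH = 80      # samples between frame starts (assumed)


def spectrogram(signal, frame_length=FRAME_LENGTH, hop_length=HOP_LENGTH):
    """Magnitude spectrogram with shape (frequencies, time steps)."""
    window = np.hanning(frame_length)
    num_frames = 1 + (len(signal) - frame_length) // hop_length
    frames = np.stack([signal[i * hop_length:i * hop_length + frame_length] * window
                       for i in range(num_frames)])
    # rfft keeps the frame_length // 2 + 1 non-negative frequencies.
    magnitudes = np.abs(np.fft.rfft(frames, axis=1))
    return magnitudes.T


t = np.arange(SAMPLE_RATE * CLIP_SECONDS) / SAMPLE_RATE
chirp = np.sin(2 * np.pi * (100 + 50 * t) * t)  # synthetic upward sweep
spec = spectrogram(chirp)
print(spec.shape)  # (81, 49)
```

The resulting 2-D array can be treated like a single-channel image, which is what lets convolutional layers be applied to audio.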

Whale Call Spectrogram


Network Architecture

Network
• Spectrograms are 81 x 49 “pixel” images.
• Apply convolutional layers to obtain a 37 x 10 feature map of depth 512
• RNN layers are applied to 10 time steps of the 37*512-D stack
• Feed the last time step into a binary classifier with Softmax
• Conv layers have ReLU activations and use Batch Normalization

Training
• Optimization with AdaDelta
• Initialization with Gaussian noise
• This model is not very deep, so it is not challenging to train

Spectrogram → Conv0 → Conv1 (BN) → Pool → Conv2 (BN) → BiRNN → BiRNN → BiRNN → Linear (BN) → Class Label

Inspired by DeepSpeech 2 (Baidu): convolution + recurrent layers

Main Python script

Full source at https://github.com/NervanaSystems/neon/blob/master/examples/whale_calls.py


Running the example

Command line:
./whale_calls.py -e 16 -r 0 -s whales.pkl -v -w /home/ubuntu/nervana/wdc

Convolution Layer 'Convolution_0': 1 x (81x49) inputs, 128 x (79x24) outputs
Activation Layer 'Convolution_0_Rectlin': Rectlin
Convolution Layer 'Convolution_1': 128 x (79x24) inputs, 256 x (77x22) outputs
BatchNorm Layer 'Convolution_1_bnorm': 433664 inputs, 1 steps, 256 feature maps
Activation Layer 'Convolution_1_Rectlin': Rectlin
Pooling Layer 'Pooling_0': 256 x (77x22) inputs, 256 x (38x11) outputs
Convolution Layer 'Convolution_2': 256 x (38x11) inputs, 512 x (37x10) outputs
BatchNorm Layer 'Convolution_2_bnorm': 189440 inputs, 1 steps, 512 feature maps
Activation Layer 'Convolution_2_Rectlin': Rectlin
BiRNN Layer 'BiRNN_0': 18944 inputs, (256 outputs) * 2, 10 steps
BiRNN Layer 'BiRNN_1': (256 inputs) * 2, (256 outputs) * 2, 10 steps
BiRNN Layer 'BiRNN_2': (256 inputs) * 2, (256 outputs) * 2, 10 steps
RecurrentOutput choice RecurrentLast : (512, 10) inputs, 512 outputs
Linear Layer 'Linear_0': 512 inputs, 32 outputs
BatchNorm Layer 'Linear_0_bnorm': 32 inputs, 1 steps, 32 feature maps
Activation Layer 'Linear_0_Rectlin': Rectlin
Linear Layer 'Linear_1': 32 inputs, 2 outputs
Activation Layer 'Linear_1_Softmax': Softmax


Neural Machine Translation (5:00 pm – 5:25 pm)


Image: Kyunghyun Cho


• Data is tokenized and mapped to a dictionary

• One-hot encoding: Each word is a category

• No fixed mapping between words
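Tokenizing, building a dictionary and one-hot encoding can be sketched in a few lines. This is a toy illustration with whitespace tokenization and a vocabulary built from a single sentence; real NMT pipelines use larger, fixed-size vocabularies, and the helper names are made up.

```python
def build_vocab(sentences):
    """Map every distinct token to an integer index, in order of first appearance."""
    vocab = {}
    for sentence in sentences:
        for token in sentence.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab


def one_hot(sentence, vocab):
    """Encode each token as a vector with a single 1 at its dictionary index."""
    vectors = []
    for token in sentence.lower().split():
        vector = [0] * len(vocab)
        vector[vocab[token]] = 1
        vectors.append(vector)
    return vectors


sentence = "New rules on data transfers"
vocab = build_vocab([sentence])
encoded = one_hot(sentence, vocab)
for token, vector in zip(sentence.lower().split(), encoded):
    print(token, vector)
```

Every word is its own category: the vectors carry no notion of similarity between words, which is why an embedding layer usually follows the one-hot input.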

Input sentence:
De nouvelles règles sur les transferts de données pour une coopération policière plus efficace
00001000000000000000000010000010000000000000000000100000… …

Output sentence:
New rules on data transfers to ensure smoother police cooperation
0000000010010000000000010000000000001000… …


Sequence to Sequence Model

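[Figure: the encoder, an RNN with layers h, j, k unrolled over inputs x1, x2, …, xn, reads "the cat is" and produces an encoding. The decoder, a second RNN, starts from that encoding and emits "le chat est", with the previously emitted words ("le", "chat") fed back in as decoder inputs. Legend: recurrent weights, feed-forward weights, embedding weights.]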


neon layer configuration

Encoder and Decoder are layer containers in neon

Stack of GRU layers (gated recurrent units, a simpler alternative to LSTM) in the container

Seq2Seq container to train Encoder and Decoder

The (correct) previous word is fed as input to the decoder LookupTable


• Toy model with a vocabulary of 16,384 words

• Word embedding of 1024 dimensions

• 2 hidden layers with 512 GRU units

Network Layers:
Seq2Seq
  LookupTable Layer : 20 inputs, (512, 20) outputs size
  Recurrent Layer 'GRU1Enc': 512 inputs, 512 outputs, 20 steps
  Recurrent Layer 'GRU1Enc': 512 inputs, 512 outputs, 20 steps
  LookupTable Layer : 20 inputs, (512, 20) outputs size
  Recurrent Layer 'GRU1Dec': 512 inputs, 512 outputs, 20 steps
  Recurrent Layer 'GRU1Dec': 512 inputs, 512 outputs, 20 steps
  Linear Layer 'Affine': 512 inputs, 16384 outputs
  Bias Layer 'Affine_bias': size 16384
  Activation Layer 'Affine_Softmax': Softmax

Model trained with Cross-Entropy cost using RMSProp


BLEU Score (bilingual evaluation understudy)

• Compare n-grams with one or multiple references

• Modified form of precision, additional penalties.

Beam Search

• Greedy algorithm to obtain output sequences

• Not perfect, so NMT systems are often used for rescoring

Candidate: on the mat there is a cat
Reference 1: the cat is on the mat
Reference 2: there is a cat on the mat
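The "modified form of precision" in BLEU clips each candidate n-gram count at the largest count of that n-gram in any single reference. The sketch below computes only this clipped precision for the candidate and the two references from the slide; the full BLEU score additionally combines several n-gram orders and applies a brevity penalty, which are left out here. The function names are illustrative.

```python
from collections import Counter


def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, references, n):
    """Clipped n-gram precision of `candidate` against one or more references."""
    cand_counts = Counter(ngrams(candidate.split(), n))
    max_ref_counts = Counter()
    for reference in references:
        for gram, count in Counter(ngrams(reference.split(), n)).items():
            max_ref_counts[gram] = max(max_ref_counts[gram], count)
    # Each candidate n-gram is credited at most as often as it occurs in a reference.
    clipped = sum(min(count, max_ref_counts[gram])
                  for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


candidate = "on the mat there is a cat"
references = ["the cat is on the mat", "there is a cat on the mat"]
p1 = modified_precision(candidate, references, 1)
p2 = modified_precision(candidate, references, 2)
print(p1, p2)
```

All seven candidate words occur in the references, so the unigram precision is 7/7. Five of the six candidate bigrams occur in a reference ("mat there" does not), so the bigram precision is 5/6.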

[Figure: beam search tree over the example sentence. Branches are labelled with candidate next words ("there", "a", "is", "cat", "the") and their probabilities (between 0.01 and 0.5).]
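Beam search can be sketched as follows: at every step each partial sequence is extended by all possible next words, and only the `beam_width` most probable sequences are kept. The toy probability table below is made up for illustration (it is not taken from the slide or from a trained model), as are the function and token names.

```python
import math

# Hypothetical next-word probabilities, conditioned on the previous word only.
TOY_MODEL = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.5, "mat": 0.5},
    "a": {"cat": 0.9, "mat": 0.1},
    "cat": {"</s>": 1.0},
    "mat": {"</s>": 1.0},
}


def next_word_probs(words):
    return TOY_MODEL[words[-1]]


def beam_search(start="<s>", end="</s>", beam_width=2, max_len=5):
    beams = [([start], 0.0)]  # (words so far, log-probability)
    for _ in range(max_len):
        candidates = []
        for words, score in beams:
            if words[-1] == end:
                candidates.append((words, score))  # finished sequences carry over
                continue
            for word, prob in next_word_probs(words).items():
                candidates.append((words + [word], score + math.log(prob)))
        # Keep only the most probable partial sequences.
        beams = sorted(candidates, key=lambda item: item[1], reverse=True)[:beam_width]
        if all(words[-1] == end for words, _ in beams):
            break
    return beams


best = beam_search(beam_width=2)
greedy = beam_search(beam_width=1)
print(best[0][0], math.exp(best[0][1]))
print(greedy[0][0], math.exp(greedy[0][1]))
```

With a beam width of 1 the procedure reduces to greedy decoding, which commits to "the" (0.6) and ends with probability 0.3. A beam width of 2 also keeps "a" alive and finds the more probable sequence "a cat" (0.4 x 0.9 = 0.36). Wider beams explore more alternatives at higher cost, but the search remains approximate.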


Finding the Right Deep Learning Framework For You (5:25 pm – 5:50 pm)


Krizhevsky, 2012

Kendall et al, 2016

Amodei et al, 2015


Layers: Linear, Convolution, Pooling, Deconvolution, Dropout, Recurrent, Long Short-Term Memory, Gated Recurrent Unit, BatchNorm, LookupTable, Local Response Normalization, Bidirectional-RNN, Bidirectional-LSTM

Backend: NervanaGPU, NervanaCPU, NervanaMGPU

Datasets: MNIST, CIFAR-10, Imagenet 1K, PASCAL VOC, Mini-Places2, IMDB, Penn Treebank, Shakespeare Text, bAbI, Hutter-prize, UCF101, flickr8k, flickr30k, COCO

Initializers: Constant, Uniform, Gaussian, Glorot Uniform, Xavier, Kaiming, IdentityInit, Orthonormal

Optimizers: Gradient Descent with Momentum, RMSProp, AdaDelta, Adam, Adagrad, MultiOptimizer

Activations: Rectified Linear, Softmax, Tanh, Logistic, Identity, ExpLin

Costs: Binary Cross Entropy, Multiclass Cross Entropy, Sum of Squares Error

Metrics: Misclassification (Top1, TopK), LogLoss, Accuracy, PrecisionRecall, ObjectDetection


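[Table: comparison of frameworks; the ratings in the cells were images and are not recoverable.]
Frameworks: neon, Theano, Caffe, Torch, TensorFlow
Criteria: Academic Research, Bleeding-edge, Curated models, Iteration Time, Inference speed, Package ecosystem, Support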


Third-party (Facebook) benchmarking


• github.com/NervanaSystems/ModelZoo

• model files, parameters


Nervana’s deep learning tutorials:

https://www.nervanasys.com/deep-learning-tutorials/

Github page:

https://github.com/NervanaSystems/neon

For more information, contact:

[email protected]


THANK YOU!

QUESTIONS?