Transcript of "Joy of Designing Deep Neural Networks"
November 28, 2019 · #bigdata2019 @electricbrainio
https://www.electricbrain.io/

Imagine

Creating a complex function, but not having to actually program it:

def computeSomething(data):
    ...
    newData = something  # I don’t give a shit
    return newData


History

At the beginning

● Avid video game player
● Programming from a young age


Globulation 2


A different direction

● Took a business degree
● Freelanced on the side
● Did a startup (failed)
● Broke; back to programming for cash


Sensibill

● Receipt processing technology


Sensibill

{
  "items": [{
    "name": "T1 Cafvn Lt Frapp",
    "regularPriceTotal": "4.25"
  }],
  "receiptDate": "04/06/2016",
  "receiptNumber": "656335",
  "receiptTime": "07:35 am",
  "store": [{
    "addressLines": [
      "438 Richmond Street West",
      "Toronto, ON M5V 3S6"
    ],
    "name": "Starbucks Coffee Canada #",
    "storeID": "4495"
  }],
  "taxes": [{
    "amount": "0.55",
    "currencyCode": "",
    "percent": "13",
    "ruleID": "HST"
  }],
  "tenders": [{
    "amount": "4.80",
    "currencyCode": "",
    "currentBalance": "14.80",
    "maskedCardNumber": "**** 3616",
    "tenderType": "Sbux Card"
  }],
  "total": {
    "currencyCode": "",
    "grand": "4.80",
    "subtotal": "4.25"
  }
}


Stumbling Around

● Hand-baked heuristic algorithm
● Later turned out to be a variant of the k-nearest-neighbors algorithm


Valuable Experience

● Building and maintaining AI datasets
● Designing annotators
● Building a data-operations team
● Cleaning and transforming data
● Testing different algorithms (like decision trees)


A hint


Deep Neural Network Upgrade

● Upgraded to a recurrent neural network
● Massive improvement in accuracy


It begins

● Became obsessed with neural networks and how they are designed
● Passion for deep learning ignites!


Diving deep

● Learning everything I can about deep neural networks


My dirty secret

● As a programmer, I’m not a big fan of mathematical equations


Not a problem

● Diving deep into the equations is very rarely needed
● The graphs, charts, and anecdotes say everything


Example

● Choosing an activation function for a neural network

● What’s the difference between sigmoid and tanh? They’re both just non-linear equations


Sigmoid

● Outputs between 0 and 1
● For inputs, the action is in the center, close to 0


Tanh

● Outputs between -1 and 1
● For inputs, the action is in the center, close to 0


When does it matter?

● Do you want negative numbers or not?

● If you don’t care, it doesn’t really matter. Use whichever gets the highest accuracy
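The difference is easy to see by simply evaluating both functions. A minimal sketch in plain Python (no framework needed) comparing the two:

```python
import math

def sigmoid(x):
    # Squashes any input into the range (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

# tanh squashes into (-1, 1); Python's math module has it built in
for x in [-5.0, -1.0, 0.0, 1.0, 5.0]:
    print(f"x={x:+.1f}  sigmoid={sigmoid(x):.3f}  tanh={math.tanh(x):.3f}")
```

Both curves are steepest near zero and saturate for large inputs; the practical difference really is just the output range.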


Example: What do the layers do?

● Neural networks are made of layers
● There are many types of layers
● How do we understand them?


What is an LSTM?

i[t] = σ(W[x->i]x[t] + W[h->i]h[t−1] + W[c->i]c[t−1] + b[1->i])   (1)
f[t] = σ(W[x->f]x[t] + W[h->f]h[t−1] + W[c->f]c[t−1] + b[1->f])   (2)
z[t] = tanh(W[x->c]x[t] + W[h->c]h[t−1] + b[1->c])                (3)
c[t] = f[t]c[t−1] + i[t]z[t]                                      (4)
o[t] = σ(W[x->o]x[t] + W[h->o]h[t−1] + W[c->o]c[t] + b[1->o])     (5)
h[t] = o[t]tanh(c[t])                                             (6)

(source: https://github.com/Element-Research/rnn)
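The same six equations read much more naturally as code. A hidden-size-1 sketch in plain Python — the weight values here are made up purely to show the data flow, not taken from any real model:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, w):
    # w holds one scalar per weight/bias in equations (1)-(6)
    i = sigmoid(w["xi"] * x + w["hi"] * h_prev + w["ci"] * c_prev + w["bi"])  # (1) input gate
    f = sigmoid(w["xf"] * x + w["hf"] * h_prev + w["cf"] * c_prev + w["bf"])  # (2) forget gate
    z = math.tanh(w["xc"] * x + w["hc"] * h_prev + w["bc"])                   # (3) candidate
    c = f * c_prev + i * z                                                    # (4) new cell state
    o = sigmoid(w["xo"] * x + w["ho"] * h_prev + w["co"] * c + w["bo"])       # (5) output gate
    h = o * math.tanh(c)                                                      # (6) new hidden state
    return h, c

# Arbitrary weights (all 0.5), just to run the cell over a short sequence
w = {k: 0.5 for k in ["xi", "hi", "ci", "bi", "xf", "hf", "cf", "bf",
                      "xc", "hc", "bc", "xo", "ho", "co", "bo"]}
h, c = 0.0, 0.0
for x in [1.0, -1.0, 0.5]:
    h, c = lstm_step(x, h, c, w)
print(h, c)
```

The intuition: the forget gate f decides how much old memory c[t−1] to keep, the input gate i decides how much new information z to write, and the output gate o decides how much of the memory to expose.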


What is an LSTM?

Image Source: http://colah.github.io/posts/2015-08-Understanding-LSTMs/


LSTM Variant: GRU

Image Source: http://colah.github.io/posts/2015-08-Understanding-LSTMs/


What is a convolutional network?

output[i][j][k] = bias[k] + Σ_l Σ_{s=1..kW} Σ_{t=1..kH} w[s][t][l][k] * input[dW*(i−1)+s][dH*(j−1)+t][l]

(source: https://github.com/torch/nn/blob/master/doc/convolution.md)
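Stripped of the indices, the equation is just a sliding dot product. A one-dimensional, single-channel sketch:

```python
def conv1d(signal, kernel):
    # Slide the kernel along the signal; each output element is the
    # dot product of the kernel with one window of the input.
    kW = len(kernel)
    return [
        sum(kernel[s] * signal[i + s] for s in range(kW))
        for i in range(len(signal) - kW + 1)
    ]

# A difference kernel "fires" wherever neighbouring values change
print(conv1d([0, 0, 1, 1, 0], [-1, 1]))  # → [0, 1, 0, -1]
```

A convolutional layer does this in 2-D, over many channels and with many learned kernels at once, but the core operation is the same.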


What is a convolutional network?

Image Source: http://deeplearning.net/tutorial/lenet.html


Understanding the layers

● Layers can be understood as math equations, but why bother?

● The high-level intuitions are much more useful

● You don’t need to know how the CPU works to write Python code. Same with deep learning.


Examples

● Linear/Dense = Generally combines / processes data

● Convolution = Matches a fixed set of patterns against the data

● Recurrent = Processes data of arbitrary size, like sequences/arrays


Examples

● Attention Layer = Suppresses irrelevant information

● Dropout = Prevents overfitting, spreads the knowledge across the vector

● Batch Norm = Speeds up learning

● Pooling = Combines nearby data together


Programming Analogies

● Projection - cast<float64[]>(data)
● Dense Layer - function (data) {...}
● Recurrent Layer - for () {...} loop
● Convolutional Layer - RegExp
● Attention Layer - Map & Reduce
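The "Recurrent Layer = for loop" analogy is almost literal: a recurrent layer is a loop that threads a hidden state through the elements of a sequence. A toy sketch, with a hypothetical hand-written step function standing in for the learned one:

```python
def recurrent_layer(sequence, step_fn, initial_state=0.0):
    # The essence of a recurrent layer: a for loop carrying state
    # from one element of the sequence to the next.
    state = initial_state
    outputs = []
    for x in sequence:
        state = step_fn(x, state)  # in a real network, step_fn is learned
        outputs.append(state)
    return outputs

# Hypothetical step function: a running average of input and state
step = lambda x, state: 0.5 * x + 0.5 * state
print(recurrent_layer([4.0, 2.0, 6.0], step))  # → [2.0, 2.0, 4.0]
```

Because the loop runs once per element, the same layer handles sequences of any length — which is exactly why recurrent layers suit arbitrary-size data.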


The architecture

● At the architecture level, the math is hideous! It can only be understood as a graph

● Many intuitions form based on the graphs: what’s possible, what’s useful


Understanding the architecture


Edges and Nodes

● What is an edge?
  ○ Technically, it’s a vector, e.g. <1, 5.3, 3.1>
  ○ Intuitively, it’s information

● What’s a node?
  ○ Technically, a bundle of math equations
  ○ Intuitively, an information-processing unit


The architecture: recurrent translation

Image Source: http://cs224d.stanford.edu/lectures/CS224d-Lecture8.pdf


The architecture: recurrent translation

Image Source: http://cs224d.stanford.edu/lectures/CS224d-Lecture8.pdf

All information about that sentence!


The architecture: recurrent translation

Image Source: https://devblogs.nvidia.com/parallelforall/introduction-neural-machine-translation-gpus-part-2/


The intuition

● Any information can be represented, in the abstract, within a vector

● Vectors can be at various stages of processing, part way between the input and finished output


The architecture: inception network

Image Source: https://arxiv.org/pdf/1409.4842.pdf


The intuition

● Sometimes it’s useful to process the same information multiple ways


The architecture: residual networks


The intuition

● Too many information processing units in a row don’t work, due to vanishing gradients

● Adding paths around processing modules allows networks to go deeper

● Works similarly to gossip
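In code, a residual connection is a one-line change: add the block's input back onto its output, giving the signal (and the gradient) a path around the processing module. A sketch with a stand-in transformation in place of a learned layer:

```python
def processing_block(x):
    # Stand-in for a learned layer whose contribution may be tiny
    return 0.1 * x

def residual_block(x):
    # The skip connection: output = input + f(input).
    # Even if processing_block contributes almost nothing,
    # x itself passes through untouched.
    return x + processing_block(x)

x = 1.0
for _ in range(10):  # ten blocks stacked in a row
    x = residual_block(x)
print(x)  # the signal survives the deep stack instead of vanishing
```

Without the `x +` term, ten such blocks in a row would shrink the signal toward zero; with it, each block only has to learn a small correction.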


Going Rogue

● After enough reading, playing, and experimenting, you start to feel comfortable enough to create original designs


Big problem

● Infinite search space of possible architectures, each with a massive-to-infinite number of hyperparameters

● Intuition, experimentation, and trial and error dominate


Not like programming

● More or less, code either works or it doesn’t

● A deep neural network always works at least a little bit. The question is, what makes it better or worse?


The Profound and Annoying Fact

● Julian Konomi: “The deep neural network always learns something”

● The only question is, did it learn something useful? Did it learn better than another design?


Example: Regulation Matching

● Client wants to detect if part of a project plan might violate a law or internal control

● Automated Compliance Review


Options ...


Options ...

Description:
Use a vanilla recurrent network stack.
Process one paragraph at a time.
Treat each regulation / control as an independent output.
Output the probability that the whole paragraph violates each regulation.

Hyperparameters:
● # of recurrent layers
● # of dense layers
● Type of recurrent layer
● Size of recurrent layer
● Type of activation function
● Size of dense layers
● Method for input word-vectors
● Use attention?
● Use residual connections?
● more….


Options ...


Options ...

Description:
Use a neural network with an external memory.
Process the entire project plan rather than one paragraph at a time.
Treat the regulations as a mutually exclusive, n-class output.
Output, on a word-by-word basis, whether or not that word is describing a violation.

Hyperparameters:
● # of recurrent layers in the storage location stack
● # of recurrent layers in the storage value stack
● # of recurrent layers in the retrieval location stack
● Size of storage location recurrent layers
● Size of storage value recurrent layers
● Size of retrieval location recurrent layers
● # of dense layers at the end
● Type of activation function
● Size of dense layers
● Size of a single vector in NN memory
● Number of locations in NN memory
● Method for input word-vectors
● Use attention?
● Use residual connections?
● more….


Options….


Options ...

Description:
Use two vanilla recurrent stacks, one for the project plan and one for the regulation.
Process the project plan in sentences.
Process the text of the regulation through the neural network as well, computing a ‘regulation vector’.
Use the cosine distance between the ‘regulation vector’ and the ‘project vector’ to determine relevance.

Hyperparameters:
● # of recurrent layers in the project plan stack
● # of recurrent layers in the regulation stack
● Size of project plan recurrent layers
● Size of regulation stack recurrent layers
● # of dense layers after the project plan stack
● # of dense layers after the regulation stack
● Type of activation function
● Size of dense layers
● Size of the matching vector
● Cutoff point to determine if the project plan fails the regulation
● Method for input word-vectors
● Use attention?
● Use residual connections?
● more….
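The matching step in this option reduces to a cosine similarity between the two vectors, compared against the cutoff hyperparameter. A minimal sketch — the vectors and cutoff value below are made up for illustration:

```python
import math

def cosine_similarity(a, b):
    # Cosine of the angle between two vectors: 1.0 = same direction
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical outputs of the two recurrent stacks
regulation_vector = [0.9, 0.1, 0.4]
project_vector = [0.8, 0.2, 0.5]

CUTOFF = 0.8  # the tunable cutoff hyperparameter
score = cosine_similarity(regulation_vector, project_vector)
print(score > CUTOFF)  # does this part of the plan match the regulation?
```

The appeal of this design is that new regulations can be matched without retraining: they just get run through the regulation stack to produce a fresh vector.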


Challenge and Joy

● Far too many techniques to test for any one client

● If only clients had unlimited money…. (**cough cough** Google)

● One must gain an intuition for what’s likely to work; you can’t rely entirely on copying results from NIPS papers


Winging It

● Even the smartest PhDs don’t really understand how or why deep neural networks work so well

● Deep learning is half science, half art

● Just dive in and get your hands dirty. Don’t bother trying to “understand”


Conclusion

● Anyone with a technical background can learn to apply deep learning, and even to create novel architectures

● Mathematics not required
● Learning these intuitions is fun!


Have a great evening!