Intel Nervana Artificial Intelligence Meetup 1/31/17
Transcript of Intel Nervana Artificial Intelligence Meetup 1/31/17
![Page 1: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/1.jpg)
Proprietary and confidential. Do not distribute.
Introduction to deep learning with neon
MAKING MACHINES SMARTER.™
![Page 2: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/2.jpg)
Nervana Systems Proprietary
• Intel Nervana overview
• Machine learning basics
• What is deep learning?
• Basic deep learning concepts
• Example: recognition of handwritten digits
• Model ingredients in-depth
• Deep learning with neon
![Page 3: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/3.jpg)
Intel Nervana's deep learning solution stack
Images
Video
Text
Speech
Tabular
Time series
Solutions
![Page 4: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/4.jpg)
Deep Dream
Autoencoders
Deep Speech 2
Skip-thought
SegNet
Fast-RCNN Object Localization
Deep Reinforcement Learning
imdb Sentiment Analysis
Video Activity Detection
Deep Residual Net
bAbI Q&A
AllCNN
AlexNet
GoogLeNet
VGG
https://github.com/NervanaSystems/ModelZoo
![Page 5: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/5.jpg)
Intel Nervana in action
Healthcare: Tumor detection
Automotive: Speech interfaces
Finance: Time-series search engine
Positive:
Negative:
Agricultural Robotics Oil & Gas
Positive:
Negative:
Proteomics: Sequence analysis
Query:
Results:
![Page 6: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/6.jpg)
• Optimized AVX-2 and AVX-512 instructions
• Intel® Xeon® processors and Intel® Xeon Phi™ processors
• Optimized for common deep learning operations
  • GEMM (useful in RNNs and fully connected layers)
  • Convolutions
  • Pooling
  • ReLU
  • Batch normalization
• Coming soon: LSTM, GRU, Winograd-based convolutions
![Page 7: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/7.jpg)
![Page 8: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/8.jpg)
• Intel Nervana overview
• Machine learning basics
• What is deep learning?
• Basic deep learning concepts
• Example: recognition of handwritten digits
• Model ingredients in-depth
• Deep learning with neon
![Page 9: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/9.jpg)
• SUPERVISED LEARNING
  • DATA -> LABELS
• UNSUPERVISED LEARNING
  • NO LABELS; CLUSTERING
  • REDUCING DIMENSIONALITY
• REINFORCEMENT LEARNING
  • REWARD ACTIONS (E.G., ROBOTICS)
![Page 10: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/10.jpg)
![Page 11: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/11.jpg)
$N \times N$
$(f_1, f_2, \ldots, f_K)$
$K \ll N$
SVM, Random Forest, Naïve Bayes, Decision Trees, Logistic Regression, Ensemble methods
Arjun
![Page 12: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/12.jpg)
Animals
Faces
Chairs
Fruits
Vehicles
![Page 13: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/13.jpg)
![Page 14: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/14.jpg)
[Figure: points marked x, grouped under the labels Animals, Faces, Chairs, Fruits and Vehicles, annotated with training error and testing error]
![Page 15: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/15.jpg)
Bias-Variance Trade-off
[Figure: Error versus Training Time, with curves for Training Error and Testing/Validation Error; regions labeled Underfitting and Overfitting]
![Page 16: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/16.jpg)
• Intel Nervana overview
• Machine learning basics
• What is deep learning?
• Basic deep learning concepts
• Example: recognition of handwritten digits
• Model ingredients in-depth
• Deep learning with neon
![Page 17: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/17.jpg)
~60 million parameters
Arjun
But old practices still apply: data cleaning, data exploration, checking for underfitting/overfitting, choosing the right cost function, tuning hyperparameters, etc.
$N \times N$
![Page 18: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/18.jpg)
Bigger Data
• Image: 1000 KB / picture
• Audio: 5000 KB / song
• Video: 5,000,000 KB / movie
Better Hardware
• Transistor density doubles every 18 months
• Cost / GB in 1995: $1000.00
• Cost / GB in 2015: $0.03
Smarter Algorithms
• Advances in algorithm innovation, including neural networks, leading to better accuracy in training models
![Page 19: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/19.jpg)
![Page 20: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/20.jpg)
• Intel Nervana overview
• Machine learning basics
• What is deep learning?
• Basic deep learning concepts
• Model ingredients in-depth
• Deep learning with neon
![Page 21: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/21.jpg)
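[Figure: a single unit with inputs $x_1$, $x_2$, $x_3$, linear weights $w_1$, $w_2$, $w_3$, a summation $\sum$ producing $a$, and an activation function $g$ producing the output $y$]
$y = g\left(\sum_j w_j x_j + b\right)$
Output of unit: $y$
Activation function: $g$, for example $\max(a, 0)$ or $\tanh(a)$
Linear weights: $w_1$, $w_2$, $w_3$
Bias unit: $b$
Input from unit j: $x_j$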
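As a minimal sketch of the unit on this slide, the snippet below computes a weighted sum of the inputs plus a bias and passes it through an activation function. The input, weight and bias values, and the helper names `unit_output` and `relu`, are assumptions chosen only for illustration.

```python
import math

def unit_output(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus a bias, passed through an activation."""
    a = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(a)

def relu(a):
    """Rectified linear activation: max(a, 0)."""
    return max(a, 0.0)

x = [1.0, -2.0, 0.5]   # hypothetical inputs x1, x2, x3
w = [0.4, 0.3, -0.2]   # hypothetical weights w1, w2, w3
b = 0.1                # hypothetical bias

# Here a = 0.4 - 0.6 - 0.1 + 0.1 = -0.2, so ReLU clips it to zero
y_relu = unit_output(x, w, b, relu)
y_tanh = unit_output(x, w, b, math.tanh)
```

Swapping the activation changes only the last step; the weighted sum `a` is the same in both calls.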
![Page 22: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/22.jpg)
Input
Hidden
Output
Affine layer: Linear + Bias + Activation
![Page 23: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/23.jpg)
MNIST dataset: 70,000 images (28x28 pixels)
Goal: classify each image as a digit 0-9
Input: N = 28 x 28 pixels = 784 input units
Hidden: N = 100 hidden units (user-defined parameter)
Output: N = 10 output units (one for each digit)
Each output unit i encodes the probability that the input image shows the digit i
![Page 24: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/24.jpg)
Layer i: N=784
Layer j: N=100
Layer k: N=10
Total parameters:
$W_{i \to j}$: 784 x 100
$b_j$: 100
$W_{j \to k}$: 100 x 10
$b_k$: 10
= 78,400 + 100 + 1,000 + 10 = 79,510
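The parameter count for the 784-100-10 network can be checked with a few lines of Python. This is a small sketch; the helper name `count_parameters` is hypothetical and simply adds one weight matrix and one bias vector per pair of adjacent layers.

```python
def count_parameters(layer_sizes):
    """Count weights and biases of a fully connected network."""
    total = 0
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        total += n_in * n_out   # weight matrix between the two layers
        total += n_out          # one bias per unit in the receiving layer
    return total

n_params = count_parameters([784, 100, 10])
print(n_params)  # 78400 + 100 + 1000 + 10 = 79510
```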
![Page 25: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/25.jpg)
Input
Hidden
Output
1. Randomly seed weights
2. Forward-pass
3. Cost
4. Backward-pass
5. Update weights
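The five steps above can be sketched in plain NumPy for a network with one hidden layer. This is an illustrative sketch, not neon code: the toy dataset, layer sizes, weight scale, learning rate and step count are all assumptions, and a softmax output with cross-entropy cost is assumed so that the output gradient takes the simple form `Y - T`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for MNIST (assumption): 200 samples, 20 features, 3 classes
n_samples, n_in, n_hidden, n_out = 200, 20, 16, 3
X = rng.normal(size=(n_samples, n_in))
labels = np.argmax(X @ rng.normal(size=(n_in, n_out)), axis=1)
T = np.eye(n_out)[labels]                       # one-hot ground truth

# 1. Randomly seed weights (small Gaussian values; the scale is an assumption)
W1 = rng.normal(scale=0.1, size=(n_in, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.1, size=(n_hidden, n_out))
b2 = np.zeros(n_out)

def forward(X):
    """Hidden ReLU layer followed by a softmax output layer."""
    H = np.maximum(X @ W1 + b1, 0.0)
    Z = H @ W2 + b2
    Z = Z - Z.max(axis=1, keepdims=True)        # for numerical stability
    Y = np.exp(Z)
    Y /= Y.sum(axis=1, keepdims=True)
    return H, Y

def cross_entropy(Y, T):
    return -np.mean(np.sum(T * np.log(Y + 1e-12), axis=1))

alpha = 0.1                                     # learning rate (assumption)
costs = []
for step in range(300):
    H, Y = forward(X)                           # 2. forward pass
    costs.append(cross_entropy(Y, T))           # 3. cost
    dZ = (Y - T) / n_samples                    # 4. backward pass
    dW2 = H.T @ dZ
    db2 = dZ.sum(axis=0)
    dA = (dZ @ W2.T) * (H > 0)                  # ReLU gradient is 0 or 1
    dW1 = X.T @ dA
    db1 = dA.sum(axis=0)
    W1 -= alpha * dW1                           # 5. update weights
    b1 -= alpha * db1
    W2 -= alpha * dW2
    b2 -= alpha * db2

accuracy = np.mean(np.argmax(forward(X)[1], axis=1) == labels)
```

The list `costs` records the cost at every step, so plotting it gives a training-error curve.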
![Page 26: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/26.jpg)
Input
Hidden
Output
$W_{i \to j}, b_j \sim \mathrm{Gaussian}(0, 1)$
$W_{j \to k}, b_k \sim \mathrm{Gaussian}(0, 1)$
![Page 27: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/27.jpg)
Input (28x28)
Hidden
Output (10x1): [0.0, 0.1, 0.0, 0.3, 0.1, 0.1, 0.0, 0.0, 0.4, 0.0]
![Page 28: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/28.jpg)
Input (28x28)
Hidden
Output (10x1): [0.0, 0.1, 0.0, 0.3, 0.1, 0.1, 0.0, 0.0, 0.4, 0.0]
Ground Truth: [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
Cost function: $c(\mathrm{output}, \mathrm{truth})$
![Page 29: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/29.jpg)
Input
Hidden
Output (10x1): [0.0, 0.1, 0.0, 0.3, 0.1, 0.1, 0.0, 0.0, 0.4, 0.0]
Ground Truth: [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
Cost function: $c(\mathrm{output}, \mathrm{truth})$
$\Delta W_{i \to j}$, $\Delta W_{j \to k}$
![Page 30: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/30.jpg)
Input
Hidden
Output
$C(y, \mathrm{truth})$
$W^*$
compute $\frac{\partial C}{\partial W^*}$
![Page 31: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/31.jpg)
Input
Hidden
Output
$C(y, \mathrm{truth}) = C\left(g\left(\sum (W_{j \to k} x_k + b_k)\right)\right)$
$W^*$
![Page 32: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/32.jpg)
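Input
Hidden
Output
$C(y, \mathrm{truth}) = C\left(g\left(\sum (W_{j \to k} x_k + b_k)\right)\right) = C\left(g\left(a(W_{j \to k}, x_k)\right)\right)$
where $a(W_{j \to k}, x_k) = \sum (W_{j \to k} x_k + b_k)$
$W^*_{j \to k}$
$\frac{\partial C}{\partial W^*} = \frac{\partial C}{\partial g} \cdot \frac{\partial g}{\partial a} \cdot \frac{\partial a}{\partial W^*}$
[Figure: plots of $g = \max(a, 0)$ and its derivative $g'(a)$ against $a$]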
![Page 33: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/33.jpg)
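Input
Hidden
Output
$C(y, \mathrm{truth}) = C\left(g_k\left(a_k(W_{j \to k}, x_k = y_j)\right)\right)$
$C(y, \mathrm{truth}) = C\left(g_k\left(a_k\left(W_{j \to k}, g_j(a_j(W_{i \to j}, x_j))\right)\right)\right)$
$\frac{\partial C}{\partial W^*} = \frac{\partial C}{\partial g_k} \cdot \frac{\partial g_k}{\partial a_k} \cdot \frac{\partial a_k}{\partial g_j} \cdot \frac{\partial g_j}{\partial a_j} \cdot \frac{\partial a_j}{\partial W^*}$
$y_j$
$W^*_{i \to j}$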
![Page 34: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/34.jpg)
$J(\mathbf{w}^{(0)}) = \sum_{i=1}^{N} \mathrm{cost}(\mathbf{w}^{(0)}, \mathbf{x}_i)$
[Figure: $J$ plotted against $\mathbf{w}$, with $\mathbf{w}^{(0)}$ marked]
![Page 35: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/35.jpg)
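$J(\mathbf{w}^{(0)}) = \sum_{i=1}^{N} \mathrm{cost}(\mathbf{w}^{(0)}, \mathbf{x}_i)$
$\frac{dJ(\mathbf{w}^{(0)})}{d\mathbf{w}}$
[Figure: $J$ plotted against $\mathbf{w}$, with $\mathbf{w}^{(0)}$ marked]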
![Page 36: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/36.jpg)
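$J(\mathbf{w}^{(0)}) = \sum_{i=1}^{N} \mathrm{cost}(\mathbf{w}^{(0)}, \mathbf{x}_i)$
$\mathbf{w}^{(1)} = \mathbf{w}^{(0)} - \frac{dJ(\mathbf{w}^{(0)})}{d\mathbf{w}}$
[Figure: $J$ plotted against $\mathbf{w}$, with $\mathbf{w}^{(0)}$ marked]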
![Page 37: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/37.jpg)
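$J(\mathbf{w}^{(0)}) = \sum_{i=1}^{N} \mathrm{cost}(\mathbf{w}^{(0)}, \mathbf{x}_i)$
$\mathbf{w}^{(1)} = \mathbf{w}^{(0)} - \alpha \frac{dJ(\mathbf{w}^{(0)})}{d\mathbf{w}}$
$\alpha$: learning rate
[Figure: $J$ plotted against $\mathbf{w}$, with $\mathbf{w}^{(0)}$ marked]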
![Page 38: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/38.jpg)
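$J(\mathbf{w}^{(0)}) = \sum_{i=1}^{N} \mathrm{cost}(\mathbf{w}^{(0)}, \mathbf{x}_i)$
$\mathbf{w}^{(1)} = \mathbf{w}^{(0)} - \alpha \frac{dJ(\mathbf{w}^{(0)})}{d\mathbf{w}}$
[Figure: $J$ plotted against $\mathbf{w}$, with $\mathbf{w}^{(0)}$ and $\mathbf{w}^{(1)}$ marked; learning rate too small]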
![Page 39: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/39.jpg)
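$J(\mathbf{w}^{(0)}) = \sum_{i=1}^{N} \mathrm{cost}(\mathbf{w}^{(0)}, \mathbf{x}_i)$
$\mathbf{w}^{(1)} = \mathbf{w}^{(0)} - \alpha \frac{dJ(\mathbf{w}^{(0)})}{d\mathbf{w}}$
[Figure: $J$ plotted against $\mathbf{w}$, with $\mathbf{w}^{(0)}$ and $\mathbf{w}^{(1)}$ marked; learning rate too large]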
![Page 40: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/40.jpg)
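$J(\mathbf{w}^{(0)}) = \sum_{i=1}^{N} \mathrm{cost}(\mathbf{w}^{(0)}, \mathbf{x}_i)$
$\mathbf{w}^{(1)} = \mathbf{w}^{(0)} - \alpha \frac{dJ(\mathbf{w}^{(0)})}{d\mathbf{w}}$
[Figure: $J$ plotted against $\mathbf{w}$, with $\mathbf{w}^{(0)}$ and $\mathbf{w}^{(1)}$ marked; learning rate good enough]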
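The effect of the learning rate can be sketched on a toy cost. The example below assumes $J(w) = w^2$ (gradient $2w$) purely for illustration; the cost on the slides is a sum over training samples, and the three learning rates are arbitrary picks meant to show each regime.

```python
def gradient_descent(alpha, w0=1.0, steps=20):
    """Minimize the toy cost J(w) = w**2 starting from w0."""
    w = w0
    for _ in range(steps):
        grad = 2.0 * w            # dJ/dw for J(w) = w**2
        w = w - alpha * grad      # same update rule as on the slide
    return w

w_small = gradient_descent(alpha=0.001)  # too small: barely moves from w0
w_large = gradient_descent(alpha=1.1)    # too large: overshoots and diverges
w_good = gradient_descent(alpha=0.3)     # good enough: approaches the minimum at 0
```

For this toy cost each step multiplies `w` by `1 - 2 * alpha`, which is why any `alpha` above 1 makes the iterates grow instead of shrink.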
![Page 41: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/41.jpg)
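$J(\mathbf{w}^{(1)}) = \sum_{i=1}^{N} \mathrm{cost}(\mathbf{w}^{(1)}, \mathbf{x}_i)$
$\mathbf{w}^{(2)} = \mathbf{w}^{(1)} - \alpha \frac{dJ(\mathbf{w}^{(1)})}{d\mathbf{w}}$
[Figure: $J$ plotted against $\mathbf{w}$, with $\mathbf{w}^{(1)}$ and $\mathbf{w}^{(2)}$ marked]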
![Page 42: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/42.jpg)
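$J(\mathbf{w}^{(2)}) = \sum_{i=1}^{N} \mathrm{cost}(\mathbf{w}^{(2)}, \mathbf{x}_i)$
$\mathbf{w}^{(3)} = \mathbf{w}^{(2)} - \alpha \frac{dJ(\mathbf{w}^{(2)})}{d\mathbf{w}}$
[Figure: $J$ plotted against $\mathbf{w}$, with $\mathbf{w}^{(2)}$ and $\mathbf{w}^{(3)}$ marked]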
![Page 43: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/43.jpg)
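$J(\mathbf{w}^{(3)}) = \sum_{i=1}^{N} \mathrm{cost}(\mathbf{w}^{(3)}, \mathbf{x}_i)$
$\mathbf{w}^{(4)} = \mathbf{w}^{(3)} - \alpha \frac{dJ(\mathbf{w}^{(3)})}{d\mathbf{w}}$
[Figure: $J$ plotted against $\mathbf{w}$, with $\mathbf{w}^{(3)}$ and $\mathbf{w}^{(4)}$ marked]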
![Page 44: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/44.jpg)
fprop cost bprop 𝛿𝑊
fprop cost bprop 𝛿𝑊
fprop cost bprop 𝛿𝑊
fprop cost bprop 𝛿𝑊
fprop cost bprop 𝛿𝑊
fprop cost bprop 𝛿𝑊
![Page 45: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/45.jpg)
fprop, cost, bprop, $\delta W$ (repeated for each of the six samples shown)
Update weights via:
$\Delta W = \alpha \cdot \frac{1}{N} \sum \delta W$
$\alpha$: learning rate
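The averaged update can be sketched as follows. The function name `minibatch_update` and the gradient values are hypothetical, and the sketch assumes the usual descent convention in which $\Delta W$ is subtracted from the weights.

```python
import numpy as np

def minibatch_update(W, per_sample_grads, alpha):
    """Average the per-sample gradients and take one descent step.

    per_sample_grads has shape (N, ...) with one delta-W per sample.
    """
    delta_W = alpha * np.mean(per_sample_grads, axis=0)   # alpha * (1/N) * sum
    return delta_W, W - delta_W

W = np.array([1.0, 1.0])
grads = np.array([[1.0, 2.0],      # delta-W from sample 1 (hypothetical)
                  [3.0, 4.0],      # sample 2
                  [5.0, 6.0]])     # sample 3
delta_W, W_new = minibatch_update(W, grads, alpha=0.1)
# The mean gradient is [3, 4], so delta_W is [0.3, 0.4] and W_new is [0.7, 0.6]
```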
![Page 46: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/46.jpg)
fprop, cost, bprop, $\delta W$ (repeated for each sample, grouped into minibatches)
minibatch #1 weight update
minibatch #2 weight update
![Page 47: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/47.jpg)
Epoch 0
Epoch 1
Sample numbers:
• Learning rate ~0.001
• Batch sizes of 32-128
• 50-90 epochs
![Page 48: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/48.jpg)
SGD
Gradient Descent
![Page 49: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/49.jpg)
Krizhevsky, 2012: 60 million parameters
Taigman, 2014: 120 million parameters
![Page 50: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/50.jpg)
• Intel Nervana overview
• Machine learning basics
• What is deep learning?
• Basic deep learning concepts
• Model ingredients in-depth
• Deep learning with neon
![Page 51: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/51.jpg)
Dataset
Model/Layers
Activation
Optimizer
Cost: $C(y, t)$
![Page 52: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/52.jpg)
Filter + Non-Linearity
Pooling
Filter + Non-Linearity
Fully connected layers
…
“how can I help you?”
cat
Low level features
Mid level features
Object parts, phonemes
Objects, words
*Hinton et al., LeCun, Zeiler, Fergus
Filter + Non-Linearity
Pooling
![Page 53: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/53.jpg)
Tanh
Rectified Linear Unit
Logistic
[Figure: plots of the three activation functions, with axis ticks at -1, 0 and 1]
Softmax: $g(a_j) = \frac{e^{a_j}}{\sum_k e^{a_k}}$
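The four activation functions named on this slide can be written in a few lines of NumPy. This is an illustrative sketch rather than neon's implementation; the function names are hypothetical, and the softmax subtracts the maximum before exponentiating, a common numerical-stability step that does not change the result.

```python
import numpy as np

def tanh(a):
    return np.tanh(a)                       # output in (-1, 1)

def rectlin(a):
    return np.maximum(a, 0.0)               # rectified linear: max(a, 0)

def logistic(a):
    return 1.0 / (1.0 + np.exp(-a))         # output in (0, 1)

def softmax(a):
    e = np.exp(a - np.max(a))               # shift for numerical stability
    return e / np.sum(e)                    # entries sum to 1

a = np.array([-2.0, 0.0, 3.0])              # hypothetical pre-activations
probs = softmax(a)
```

Tanh, rectified linear and logistic act on each element independently, while softmax normalizes across the whole vector, which is why it suits an output layer that encodes class probabilities.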
![Page 54: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/54.jpg)
Gaussian: Gaussian(mean, sd)
GlorotUniform: Uniform(-k, k), with $k = \sqrt{\frac{6}{d_{in} + d_{out}}}$
Xavier: Uniform(-k, k), with $k = \sqrt{\frac{3}{d_{in}}}$
Kaiming: Gaussian(0, sigma), with $\sigma = \sqrt{\frac{2}{d_{in}}}$
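The initialization schemes on this slide can be sketched in NumPy as follows. The function names are hypothetical and this is not neon's own code; the sketch only applies the bounds and standard deviation given by the formulas, with $d_{in}$ and $d_{out}$ taken as the input and output sizes of the layer.

```python
import numpy as np

rng = np.random.default_rng(0)

def glorot_uniform(d_in, d_out):
    """Uniform(-k, k) with k = sqrt(6 / (d_in + d_out))."""
    k = np.sqrt(6.0 / (d_in + d_out))
    return rng.uniform(-k, k, size=(d_in, d_out))

def xavier(d_in, d_out):
    """Uniform(-k, k) with k = sqrt(3 / d_in)."""
    k = np.sqrt(3.0 / d_in)
    return rng.uniform(-k, k, size=(d_in, d_out))

def kaiming(d_in, d_out):
    """Gaussian(0, sigma) with sigma = sqrt(2 / d_in)."""
    sigma = np.sqrt(2.0 / d_in)
    return rng.normal(0.0, sigma, size=(d_in, d_out))

# Weights for a 784 -> 100 layer, as in the MNIST example
W_glorot = glorot_uniform(784, 100)
W_xavier = xavier(784, 100)
W_kaiming = kaiming(784, 100)
```

All three scale the spread of the weights down as the layer gets wider, which keeps the size of the weighted sums roughly independent of the number of inputs.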
![Page 55: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/55.jpg)
• Cross Entropy Loss
• Misclassification Rate
• Mean Squared Error
• L1 loss
![Page 56: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/56.jpg)
Output (10x1): [0.0, 0.1, 0.0, 0.3, 0.1, 0.1, 0.0, 0.0, 0.4, 0.0]
Ground Truth: [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
$-\sum_k t_k \times \log(y_k) = -\log(0.3)$
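The cross-entropy value for this example can be reproduced directly. In the sketch below the helper name `cross_entropy` is hypothetical, and terms with a zero target are skipped so that `log(0)` is never evaluated.

```python
import math

output = [0.0, 0.1, 0.0, 0.3, 0.1, 0.1, 0.0, 0.0, 0.4, 0.0]
truth = [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]

def cross_entropy(y, t):
    """-sum_k t_k * log(y_k), skipping terms where t_k is zero."""
    return -sum(t_k * math.log(y_k) for y_k, t_k in zip(y, t) if t_k != 0)

cost = cross_entropy(output, truth)   # equals -log(0.3), about 1.204
```

Only the output for the true class (digit 3, predicted with probability 0.3) contributes, even though the network's largest output (0.4) is on a different digit.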
![Page 57: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/57.jpg)
Outputs        Targets   Correct?
0.3 0.3 0.4    0 0 1     Y
0.3 0.4 0.3    0 1 0     Y
0.1 0.2 0.7    1 0 0     N
-(log(0.4) + log(0.4) + log(0.1))/3 = 1.38

Outputs        Targets   Correct?
0.1 0.2 0.7    0 0 1     Y
0.1 0.7 0.2    0 1 0     Y
0.3 0.4 0.3    1 0 0     N
-(log(0.7) + log(0.7) + log(0.3))/3 = 0.64
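The two sets of outputs can be compared numerically. In this sketch the names `model_a` and `model_b` are hypothetical labels for the first and second set; both get two of three samples right, so the misclassification rate cannot tell them apart, while the cross entropy can.

```python
import numpy as np

targets = np.array([[0, 0, 1], [0, 1, 0], [1, 0, 0]])
model_a = np.array([[0.3, 0.3, 0.4], [0.3, 0.4, 0.3], [0.1, 0.2, 0.7]])
model_b = np.array([[0.1, 0.2, 0.7], [0.1, 0.7, 0.2], [0.3, 0.4, 0.3]])

def misclassification_rate(outputs, targets):
    """Fraction of samples whose largest output is not the target class."""
    return np.mean(np.argmax(outputs, axis=1) != np.argmax(targets, axis=1))

def mean_cross_entropy(outputs, targets):
    """Average of -sum_k t_k * log(y_k) over the samples."""
    return -np.mean(np.sum(targets * np.log(outputs), axis=1))

rate_a = misclassification_rate(model_a, targets)   # 1/3
rate_b = misclassification_rate(model_b, targets)   # 1/3
ce_a = mean_cross_entropy(model_a, targets)         # about 1.38
ce_b = mean_cross_entropy(model_b, targets)         # about 0.64
```

The second set is more confident on the samples it gets right and less wrong on the one it misses, which the lower cross entropy reflects.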
![Page 58: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/58.jpg)
• SGD with Momentum
• RMS propagation
• Adagrad
• Adadelta
• Adam
![Page 59: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/59.jpg)
[Figure: updates $\Delta W_1$, $\Delta W_2$, $\Delta W_3$, $\Delta W_4$ along the training-time axis]
$\alpha'_{t=T} = \frac{\alpha}{\sqrt{\sum_{t=0}^{T} \Delta W_t^2}}$
![Page 60: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/60.jpg)
[Figure: updates $\Delta W_1$, $\Delta W_2$, $\Delta W_3$, $\Delta W_4$ along the training-time axis]
$\alpha'_{t=4} = \frac{\alpha}{\sqrt{\Delta W_2^2 + \Delta W_3^2 + \Delta W_4^2}}$
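The two learning-rate scalings on these slides differ only in how much update history they sum over. The sketch below uses hypothetical scalar values for $\Delta W_1$ to $\Delta W_4$ and a hypothetical helper `scaled_learning_rate`; it contrasts summing the full history with summing only the last three updates.

```python
import math

def scaled_learning_rate(alpha, updates):
    """alpha divided by the square root of the summed squared updates."""
    return alpha / math.sqrt(sum(dw ** 2 for dw in updates))

alpha = 0.1
delta_w = [3.0, 1.0, 2.0, 2.0]    # hypothetical ΔW_1 .. ΔW_4

# Full history: every past update keeps shrinking the rate
alpha_full = scaled_learning_rate(alpha, delta_w)          # 0.1 / sqrt(18)

# Window of the last three updates: the large early update is forgotten
alpha_window = scaled_learning_rate(alpha, delta_w[-3:])   # 0.1 / sqrt(9)
```

With the full history the denominator can only grow, so the effective rate decays monotonically; a window lets the rate recover once large early updates drop out.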
![Page 61: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/61.jpg)
• Intel Nervana overview
• Machine learning basics
• What is deep learning?
• Basic deep learning concepts
• Model ingredients in-depth
• Deep learning with neon
![Page 62: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/62.jpg)
![Page 63: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/63.jpg)
![Page 64: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/64.jpg)
• Popular, well established, familiar to developers
• Fast to prototype
• Rich ecosystem of existing packages
• Data science: pandas, pycuda, ipython, matplotlib, h5py, …
• Good “glue” language: scriptable, with functional and OO support; plays well with other languages
![Page 65: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/65.jpg)
Backend: NervanaGPU, NervanaCPU
Datasets: MNIST, CIFAR-10, Imagenet 1K, PASCAL VOC, Mini-Places2, IMDB, Penn Treebank, Shakespeare Text, bAbI, Hutter-prize, UCF101, flickr8k, flickr30k, COCO
Initializers: Constant, Uniform, Gaussian, Glorot Uniform, Xavier, Kaiming, IdentityInit, Orthonormal
Optimizers: Gradient Descent with Momentum, RMSProp, AdaDelta, Adam, Adagrad, MultiOptimizer
Activations: Rectified Linear, Softmax, Tanh, Logistic, Identity, ExpLin
Layers: Linear, Convolution, Pooling, Deconvolution, Dropout, Recurrent, Long Short-Term Memory, Gated Recurrent Unit, BatchNorm, LookupTable, Local Response Normalization, Bidirectional-RNN, Bidirectional-LSTM
Costs: Binary Cross Entropy, Multiclass Cross Entropy, Sum of Squares Error
Metrics: Misclassification (Top1, TopK), LogLoss, Accuracy, PrecisionRecall, ObjectDetection
![Page 66: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/66.jpg)
1. Generate backend
2. Load data
3. Specify model architecture
4. Define training parameters
5. Train model
6. Evaluate
![Page 67: Intel Nervana Artificial Intelligence Meetup 1/31/17](https://reader031.fdocuments.us/reader031/viewer/2022021918/589b2c1a1a28ab2d4c8b5f91/html5/thumbnails/67.jpg)