Lecture 6: Hardware and Software
Deep Learning Hardware, Dynamic & Static Computational Graph, PyTorch & TensorFlow
Fei-Fei Li, Ranjay Krishna, Danfei Xu, Lecture 6 - April 15, 2021
Administrative
Assignment 1 is due tomorrow, April 16th, 11:59 pm.
Assignment 2 will be out tomorrow, due April 30th, 11:50 pm.
Project proposal due Monday, April 19.
Friday’s section topic: course project
Two more layers to go: POOL/FC
Convolution Layers (continued from last time)
Input: 32 x 56 x 56
3x3 CONV, stride=1, padding=1, n_filters=64
Output: 64 x 56 x 56
Pooling layer
- makes the representations smaller and more manageable
- operates over each activation map independently
MAX POOLING
Single depth slice (x horizontal, y vertical):
1 1 2 4
5 6 7 8
3 2 1 0
1 2 3 4
Max pool with 2x2 filters and stride 2 gives:
6 8
3 4
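The 2x2, stride-2 max pooling above can be reproduced in a few lines of NumPy. This is a minimal sketch: the helper name `max_pool2d` and the loop-based implementation are illustrative assumptions, not the course's reference code.

```python
import numpy as np

def max_pool2d(x, f=2, stride=2):
    """Max-pool one 2D depth slice with an f x f window (hypothetical helper)."""
    h_out = (x.shape[0] - f) // stride + 1
    w_out = (x.shape[1] - f) // stride + 1
    out = np.empty((h_out, w_out), dtype=x.dtype)
    for i in range(h_out):
        for j in range(w_out):
            # Take the maximum over the window whose top-left corner is (i*stride, j*stride)
            window = x[i * stride:i * stride + f, j * stride:j * stride + f]
            out[i, j] = window.max()
    return out

# The single depth slice from the slide
depth_slice = np.array([[1, 1, 2, 4],
                        [5, 6, 7, 8],
                        [3, 2, 1, 0],
                        [1, 2, 3, 4]])
pooled = max_pool2d(depth_slice)
print(pooled.tolist())  # [[6, 8], [3, 4]]
```

Because pooling operates over each activation map independently, a full layer would apply the same function to every depth slice.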
Pooling layer: summary
Let’s assume the input is W1 x H1 x C.
The pooling layer needs 2 hyperparameters:
- The spatial extent F
- The stride S
This will produce an output of W2 x H2 x C where:
- W2 = (W1 - F)/S + 1
- H2 = (H1 - F)/S + 1
Number of parameters: 0
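The output-size formula can be checked numerically. The sketch below uses a hypothetical helper `output_size`; its optional padding argument `p` is the standard generalization (W1 - F + 2P)/S + 1, which this summary slide does not state and is included here only as an assumption so the earlier 3x3 conv example can be checked too.

```python
def output_size(w1, f, s, p=0):
    """Spatial output size for extent f, stride s, and (assumed) zero padding p."""
    size, remainder = divmod(w1 - f + 2 * p, s)
    if remainder != 0:
        raise ValueError("the window does not tile the input evenly")
    return size + 1

print(output_size(4, f=2, s=2))        # 2  (the 4x4 max pooling example)
print(output_size(56, f=2, s=2))       # 28 (2x2 pooling on a 56x56 map)
print(output_size(56, f=3, s=1, p=1))  # 56 (3x3 conv, stride 1, padding 1)
```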
Fully Connected Layer (FC layer)
- Contains neurons that connect to the entire input volume, as in ordinary Neural Networks
Where we are now...
Computational graphs
[Figure: computational graph. x and W feed a multiply node (*) that produces the scores s; the hinge loss on s is added (+) to R to give the loss L.]
Where we are now...
Convolutional Neural Networks
Illustration of LeCun et al. 1998 from CS231n 2017 Lecture 1
Where we are now… (more on optimization in lecture 8)
Learning network parameters through optimization
Landscape image is CC0 1.0 public domain. Walking man image is CC0 1.0 public domain.
Today
- Deep learning hardware
  - CPU, GPU
- Deep learning software
  - PyTorch and TensorFlow
  - Static and Dynamic computation graphs
Deep Learning Hardware
Inside a computer
Spot the CPU! (central processing unit)
This image is licensed under CC-BY 2.0
Spot the GPUs! (graphics processing unit)
This image is licensed under CC-BY 2.0
CPU vs GPU

| | Cores | Clock Speed | Memory | Price | Speed (throughput) |
|---|---|---|---|---|---|
| CPU (Intel Core i9-7900k) | 10 | 4.3 GHz | System RAM | $385 | ~640 GFLOPS FP32 |
| GPU (NVIDIA RTX 3090) | 10496 | 1.6 GHz | 24 GB GDDR6X | $1499 | ~35.6 TFLOPS FP32 |

CPU: Fewer cores, but each core is much faster and much more capable; great at sequential tasks
GPU: More cores, but each core is much slower and “dumber”; great for parallel tasks
Example: Matrix Multiplication
(A x B) times (B x C) = (A x C)
cuBLAS::GEMM (GEneral Matrix-to-matrix Multiply)
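To see why matrix multiplication suits parallel hardware, note that every entry of the (A x C) output is an independent dot product of one row and one column. The NumPy sketch below spells that out with explicit loops and compares against the built-in product; the sizes and variable names are assumptions chosen for illustration, and it runs on the CPU rather than calling cuBLAS.

```python
import numpy as np

A, B, C = 3, 4, 5  # assumed sizes for illustration
rng = np.random.default_rng(0)
left = rng.standard_normal((A, B))
right = rng.standard_normal((B, C))

out = np.empty((A, C))
for i in range(A):
    for j in range(C):
        # Each output entry reads only row i of `left` and column j of `right`,
        # so all A * C entries could be computed at the same time.
        out[i, j] = np.dot(left[i, :], right[:, j])

print(out.shape)                      # (3, 5)
print(np.allclose(out, left @ right)) # True
```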
CPU vs GPU in practice
Data from https://github.com/jcjohnson/cnn-benchmarks
(CPU performance not well-optimized, a little unfair)
[Figure: benchmark chart; GPU speedups over CPU of 66x, 67x, 71x, 64x, 76x]
CPU vs GPU in practice
Data from https://github.com/jcjohnson/cnn-benchmarks
cuDNN much faster than “unoptimized” CUDA
[Figure: benchmark chart; cuDNN speedups of 2.8x, 3.0x, 3.1x, 3.4x, 2.8x]
NVIDIA vs AMD
CPU vs GPU

| | Cores | Clock Speed | Memory | Price | Speed |
|---|---|---|---|---|---|
| CPU (Intel Core i9-7900k) | 10 | 4.3 GHz | System RAM | $385 | ~640 GFLOPS FP32 |
| GPU (NVIDIA RTX 3090) | 10496 | 1.6 GHz | 24 GB GDDR6X | $1499 | ~35.6 TFLOPS FP32 |
| GPU (Data Center) NVIDIA A100 | 6912 CUDA, 432 Tensor | 1.5 GHz | 40/80 GB HBM2 | $3/hr (GCP) | ~9.7 TFLOPS FP64, ~20 TFLOPS FP32, ~312 TFLOPS FP16 |
| TPU Google Cloud TPUv3 | 2 Matrix Units (MXUs) per core, 4 cores | ? | 128 GB HBM | $8/hr (GCP) | ~420 TFLOPS (non-standard FP) |

CPU: Fewer cores, but each core is much faster and much more capable; great at sequential tasks
GPU: More cores, but each core is much slower and “dumber”; great for parallel tasks
TPU: Specialized hardware for deep learning
Programming GPUs
● CUDA (NVIDIA only)
  ○ Write C-like code that runs directly on the GPU
  ○ Optimized APIs: cuBLAS, cuFFT, cuDNN, etc.
● OpenCL
  ○ Similar to CUDA, but runs on anything
  ○ Usually slower on NVIDIA hardware
● HIP https://github.com/ROCm-Developer-Tools/HIP
  ○ New project that automatically converts CUDA code to something that can run on AMD GPUs
● Stanford CS 149: http://cs149.stanford.edu/fall20/
CPU / GPU Communication
[Figure: inside a computer; the model is on the GPU, the data is on disk]
If you aren’t careful, training can bottleneck on reading data and transferring to GPU!
Solutions:
- Read all data into RAM
- Use SSD instead of HDD
- Use multiple CPU threads to prefetch data
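The prefetching idea can be sketched with the Python standard library alone. In this minimal example a single background thread fills a bounded queue while the consumer (standing in for the training loop) drains it; `load_batch`, `prefetching_loader`, and the sleep-based "disk read" are hypothetical stand-ins. In PyTorch, `torch.utils.data.DataLoader` with `num_workers > 0` plays a similar role using worker processes.

```python
import queue
import threading
import time

def load_batch(i):
    """Stand-in for a slow disk read plus preprocessing (hypothetical)."""
    time.sleep(0.01)
    return [i] * 4

def prefetching_loader(num_batches, buffer_size=2):
    """Yield batches that a background thread loads ahead of the consumer."""
    buf = queue.Queue(maxsize=buffer_size)
    done = object()  # sentinel marking the end of the data

    def worker():
        for i in range(num_batches):
            buf.put(load_batch(i))  # blocks while the buffer is full
        buf.put(done)

    threading.Thread(target=worker, daemon=True).start()
    while True:
        batch = buf.get()  # blocks until the next batch is ready
        if batch is done:
            return
        yield batch

batches = list(prefetching_loader(5))
print(len(batches))  # 5
```

While the consumer works on one batch, the worker is already loading the next, so the reads overlap with compute instead of stalling it.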
Deep Learning Software
A zoo of frameworks!
- Caffe (UC Berkeley)
- Torch (NYU / Facebook)
- Theano (U Montreal)
- TensorFlow (Google)
- Caffe2 (Facebook): features mostly absorbed by PyTorch
- PyTorch (Facebook)
- CNTK (Microsoft)
- PaddlePaddle (Baidu)
- MXNet (Amazon): developed by U Washington, CMU, MIT, Hong Kong U, etc., but the main framework of choice at AWS
- Chainer (Preferred Networks): the company has officially migrated its research infrastructure to PyTorch
- JAX (Google)
- And others...
We’ll focus on PyTorch and TensorFlow.
Recall: Computational Graphs
[Figure: linear classifier graph. x and W feed a multiply node (*) that produces the scores s; the hinge loss on s is added (+) to R to give the loss L.]
[Figure: convolutional network drawn as a graph from input image and weights to loss. Figure copyright Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton, 2012. Reproduced with permission.]
[Figure: a much larger graph from input image to loss. Figure reproduced with permission from a Twitter post by Andrej Karpathy.]
The point of deep learning frameworks
(1) Quick to develop and test new ideas
(2) Automatically compute gradients
(3) Run it all efficiently on GPU (wrap cuDNN, cuBLAS, OpenCL, etc.)
Computational Graphs
[Figure: inputs x, y, z; a = x * y; b = a + z; c = Σ b]
Numpy
Good: Clean API, easy to write numeric code
Bad:
- Have to compute our own gradients
- Can’t run on GPU
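A minimal NumPy sketch of this graph (a = x * y, b = a + z, c = Σ b), with the backward pass written out by hand, shows both the clean API and the cost of computing our own gradients. The sizes `N` and `D` and the variable names are assumptions for illustration.

```python
import numpy as np

np.random.seed(0)
N, D = 3, 4  # assumed sizes for illustration

x = np.random.randn(N, D)
y = np.random.randn(N, D)
z = np.random.randn(N, D)

# Forward pass: one line per graph node
a = x * y
b = a + z
c = np.sum(b)

# Backward pass: apply the chain rule by hand, from the output back to the inputs
grad_c = 1.0
grad_b = grad_c * np.ones((N, D))  # the sum passes the gradient to every element
grad_a = grad_b.copy()             # addition passes the gradient through unchanged
grad_z = grad_b.copy()
grad_x = grad_a * y                # d(x * y)/dx = y
grad_y = grad_a * x                # d(x * y)/dy = x
```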
Computational Graphs
Numpy vs PyTorch (same graph: a = x * y; b = a + z; c = Σ b)
- The PyTorch forward pass looks exactly like numpy!
- PyTorch handles gradients for us!
- Trivial to run on GPU: just construct arrays on a different device!
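A PyTorch sketch of the same graph (a = x * y, b = a + z, c = Σ b) illustrates all three points. Sizes and names are assumptions; the device line falls back to the CPU when no GPU is available, so the code runs either way.

```python
import torch

# Use the GPU if one is present; otherwise stay on the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
N, D = 3, 4  # assumed sizes for illustration

# requires_grad=True asks PyTorch to track gradients with respect to x
x = torch.randn(N, D, device=device, requires_grad=True)
y = torch.randn(N, D, device=device)
z = torch.randn(N, D, device=device)

# Forward pass: same three lines as the NumPy version
a = x * y
b = a + z
c = torch.sum(b)

# Backward pass: PyTorch computes dc/dx and stores it in x.grad
c.backward()
print(torch.allclose(x.grad, y))  # True, since dc/dx = y
```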
PyTorch (More details)
PyTorch: Fundamental Concepts
torch.Tensor: Like a numpy array, but can run on GPU
torch.nn.Module: A neural network layer; may store state or learnable weights
torch.autograd: Package for building computational graphs out of Tensors, and automatically computing gradients
PyTorch: Versions
For this class we are using PyTorch version 1.7
Major API change in release 1.0
Be careful if you are looking at older PyTorch code (<1.0)!
PyTorch: Tensors
Running example: Train a two-layer ReLU network on random data with L2 loss
- Create random tensors for data and weights
- Forward pass: compute predictions and loss
- Backward pass: manually compute gradients
- Gradient descent step on weights
- To run on GPU, just use a different device!
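The running example might look like the sketch below, which uses plain Tensors and a hand-written backward pass. The layer sizes, learning rate, iteration count, and variable names are all assumptions chosen so the example runs quickly; they are not taken from the slides.

```python
import torch

torch.manual_seed(0)
device = torch.device("cpu")  # use torch.device("cuda") to run on a GPU

# Create random tensors for data and weights (sizes are assumptions)
N, D_in, H, D_out = 32, 20, 16, 4
x = torch.randn(N, D_in, device=device)
y = torch.randn(N, D_out, device=device)
w1 = torch.randn(D_in, H, device=device)
w2 = torch.randn(H, D_out, device=device)

learning_rate = 1e-5
losses = []
for t in range(200):
    # Forward pass: compute predictions and L2 loss
    h = x.mm(w1)
    h_relu = h.clamp(min=0)
    y_pred = h_relu.mm(w2)
    loss = (y_pred - y).pow(2).sum()
    losses.append(loss.item())

    # Backward pass: manually compute gradients with the chain rule
    grad_y_pred = 2.0 * (y_pred - y)
    grad_w2 = h_relu.t().mm(grad_y_pred)
    grad_h_relu = grad_y_pred.mm(w2.t())
    grad_h = grad_h_relu.clone()
    grad_h[h < 0] = 0  # ReLU blocks the gradient where its input was negative
    grad_w1 = x.t().mm(grad_h)

    # Gradient descent step on weights
    w1 -= learning_rate * grad_w1
    w2 -= learning_rate * grad_w2
```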
PyTorch: Autograd
Creating Tensors with requires_grad=True enables autograd
Operations on Tensors with requires_grad=True cause PyTorch to build a computational graph
Forward pass looks exactly the same as before, but we don’t need to track intermediate values - PyTorch keeps track of them for us in the graph
Compute gradient of loss with respect to w1 and w2
Make a gradient step on the weights, then zero the gradients. torch.no_grad means “don’t build a computational graph for this part”
PyTorch methods that end in an underscore modify the Tensor in-place; methods without the underscore return a new Tensor
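Put together, the autograd version of the two-layer example could look like this sketch. As before, the sizes, learning rate, and names are assumptions for illustration; only the autograd calls (`requires_grad`, `backward`, `torch.no_grad`, `zero_`) are the point.

```python
import torch

torch.manual_seed(0)
N, D_in, H, D_out = 32, 20, 16, 4  # assumed sizes
x = torch.randn(N, D_in)
y = torch.randn(N, D_out)

# requires_grad=True makes PyTorch build a graph for operations on the weights
w1 = torch.randn(D_in, H, requires_grad=True)
w2 = torch.randn(H, D_out, requires_grad=True)

learning_rate = 1e-5
losses = []
for t in range(200):
    # Forward pass: no need to keep intermediate values ourselves
    y_pred = x.mm(w1).clamp(min=0).mm(w2)
    loss = (y_pred - y).pow(2).sum()
    losses.append(loss.item())

    # Compute gradient of loss with respect to w1 and w2
    loss.backward()

    # Update without building a graph, then zero the gradients in-place
    with torch.no_grad():
        w1 -= learning_rate * w1.grad
        w2 -= learning_rate * w2.grad
        w1.grad.zero_()  # trailing underscore: modifies the tensor in-place
        w2.grad.zero_()
```

Zeroing matters because `backward` accumulates into `.grad` rather than overwriting it.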
PyTorch: New Autograd Functions
Define your own autograd functions by writing forward and backward functions for Tensors
Use ctx object to “cache” values for the backward pass, just like cache objects from A2
Define a helper function to make it easy to use the new function
Can use our new autograd function in the forward pass
In practice you almost never need to define new autograd functions! Only do it when you need custom backward. In this case we can just use a normal Python function
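As an illustration, here is a sketch of a custom autograd function. ReLU is an assumed choice of operation, and the names `MyReLU`, `my_relu`, and `my_relu_plain` are hypothetical; the `torch.autograd.Function` interface with static `forward`/`backward` methods and `ctx.save_for_backward` is PyTorch's own.

```python
import torch

class MyReLU(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        # Cache the input for use in the backward pass
        ctx.save_for_backward(x)
        return x.clamp(min=0)

    @staticmethod
    def backward(ctx, grad_y):
        x, = ctx.saved_tensors
        grad_input = grad_y.clone()
        grad_input[x < 0] = 0  # no gradient flows where the input was negative
        return grad_input

def my_relu(x):
    """Helper so callers do not need to write MyReLU.apply themselves."""
    return MyReLU.apply(x)

def my_relu_plain(x):
    """Normal Python function: autograd derives the backward pass for us."""
    return x.clamp(min=0)

x = torch.tensor([-1.0, 2.0, 3.0], requires_grad=True)
my_relu(x).sum().backward()
print(x.grad.tolist())  # [0.0, 1.0, 1.0]
```

Both versions produce the same gradients here, which is why the plain function is usually enough.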
![Page 61: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/61.jpg)
Fei-Fei Li, Ranjay Krishna, Danfei Xu Lecture 6 - April 15, 202161
PyTorch: nn
Higher-level wrapper for working with neural nets
Use this! It will make your life easier
![Page 62: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/62.jpg)
Fei-Fei Li, Ranjay Krishna, Danfei Xu Lecture 6 - April 15, 202162
PyTorch: nn
Define our model as a sequence of layers; each layer is an object that holds learnable weights
![Page 63: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/63.jpg)
Fei-Fei Li, Ranjay Krishna, Danfei Xu Lecture 6 - April 15, 202163
PyTorch: nn
Forward pass: feed data to model, and compute loss
![Page 64: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/64.jpg)
PyTorch: nn
torch.nn.functional has useful helpers like loss functions
Forward pass: feed data to model, and compute loss
![Page 65: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/65.jpg)
PyTorch: nn
Backward pass: compute gradient with respect to all model weights (they have requires_grad=True)
![Page 66: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/66.jpg)
PyTorch: nn
Make gradient step on each model parameter (with gradients disabled)
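The nn steps on the preceding slides can be put together roughly as follows. This is a sketch under assumed sizes, learning rate, and iteration count (the slide code itself is not in this transcript), using a two-layer network and an MSE loss from `torch.nn.functional`:

```python
import torch

torch.manual_seed(0)
N, D_in, H, D_out = 16, 20, 10, 5  # small, arbitrary sizes (assumption)
x = torch.randn(N, D_in)
y = torch.randn(N, D_out)

# Model as a sequence of layers; each Linear layer holds learnable weights.
model = torch.nn.Sequential(
    torch.nn.Linear(D_in, H),
    torch.nn.ReLU(),
    torch.nn.Linear(H, D_out),
)

learning_rate = 1e-2
losses = []
for t in range(50):
    # Forward pass: feed data to the model and compute the loss.
    y_pred = model(x)
    loss = torch.nn.functional.mse_loss(y_pred, y)
    losses.append(loss.item())

    # Backward pass: gradients for every parameter with requires_grad=True.
    loss.backward()

    # Gradient step on each parameter, with gradient tracking disabled.
    with torch.no_grad():
        for param in model.parameters():
            param -= learning_rate * param.grad
    model.zero_grad()
```

The `torch.no_grad()` block keeps the parameter update itself out of the computation graph.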
![Page 67: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/67.jpg)
PyTorch: optim
Use an optimizer for different update rules
![Page 68: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/68.jpg)
PyTorch: optim
After computing gradients, use optimizer to update params and zero gradients
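A sketch of the optimizer pattern, assuming Adam as the update rule and arbitrary small tensor sizes (both are illustrative choices, not taken from the slide):

```python
import torch

torch.manual_seed(0)
x = torch.randn(16, 20)
y = torch.randn(16, 5)
model = torch.nn.Sequential(
    torch.nn.Linear(20, 10),
    torch.nn.ReLU(),
    torch.nn.Linear(10, 5),
)

# Adam is one of several update rules in torch.optim; the choice is illustrative.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

losses = []
for t in range(100):
    loss = torch.nn.functional.mse_loss(model(x), y)
    losses.append(loss.item())
    loss.backward()
    optimizer.step()       # update parameters from their .grad fields
    optimizer.zero_grad()  # clear gradients before the next iteration
```

Swapping in another rule (for example `torch.optim.SGD`) changes only the line that constructs the optimizer.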
![Page 69: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/69.jpg)
PyTorch: nn
Define new Modules
A PyTorch Module is a neural net layer; it inputs and outputs Tensors
Modules can contain weights or other modules
You can define your own Modules using autograd!
![Page 70: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/70.jpg)
PyTorch: nn
Define new Modules
Define our whole model as a single Module
![Page 71: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/71.jpg)
PyTorch: nn
Define new Modules
Initializer sets up two children (Modules can contain modules)
![Page 72: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/72.jpg)
PyTorch: nn
Define new Modules
Define forward pass using child modules
No need to define backward - autograd will handle it
![Page 73: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/73.jpg)
PyTorch: nn
Define new Modules
Construct and train an instance of our model
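The Module walkthrough above (initializer with two children, forward pass, then construct and train) might look like this. The class name `TwoLayerNet`, the sizes, and the SGD settings are assumptions for illustration:

```python
import torch

class TwoLayerNet(torch.nn.Module):
    def __init__(self, D_in, H, D_out):
        super().__init__()
        # Two child modules; assigning them as attributes registers their weights.
        self.linear1 = torch.nn.Linear(D_in, H)
        self.linear2 = torch.nn.Linear(H, D_out)

    def forward(self, x):
        # Only the forward pass is written; autograd supplies the backward pass.
        h_relu = self.linear1(x).clamp(min=0)
        return self.linear2(h_relu)

torch.manual_seed(0)
x, y = torch.randn(16, 20), torch.randn(16, 5)

# Construct and train an instance of the model.
model = TwoLayerNet(20, 10, 5)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
losses = []
for t in range(50):
    loss = torch.nn.functional.mse_loss(model(x), y)
    losses.append(loss.item())
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```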
![Page 74: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/74.jpg)
PyTorch: nn
Define new Modules
Very common to mix and match custom Module subclasses and Sequential containers
![Page 75: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/75.jpg)
PyTorch: nn
Define new Modules
Define network component as a Module subclass
![Page 76: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/76.jpg)
PyTorch: nn
Define new Modules
Stack multiple instances of the component in a Sequential container
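A sketch of mixing a custom Module subclass with a Sequential container. The component here, a hypothetical `ParallelBlock` with two linear branches multiplied elementwise, is an assumed example, as are the layer sizes:

```python
import torch

class ParallelBlock(torch.nn.Module):
    # Hypothetical network component: two linear branches combined elementwise.
    def __init__(self, D_in, D_out):
        super().__init__()
        self.linear1 = torch.nn.Linear(D_in, D_out)
        self.linear2 = torch.nn.Linear(D_in, D_out)

    def forward(self, x):
        h1 = self.linear1(x)
        h2 = self.linear2(x)
        return (h1 * h2).clamp(min=0)

# Stack several instances of the component, plus a final Linear layer.
model = torch.nn.Sequential(
    ParallelBlock(20, 10),
    ParallelBlock(10, 10),
    torch.nn.Linear(10, 5),
)
y_pred = model(torch.randn(16, 20))
```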
![Page 77: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/77.jpg)
PyTorch: Pretrained Models
Super easy to use pretrained models with torchvision https://github.com/pytorch/vision
![Page 78: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/78.jpg)
PyTorch: torch.utils.tensorboard
A Python wrapper around TensorBoard, TensorFlow’s web-based visualization tool.
This image is licensed under CC-BY 4.0; no changes were made to the image
![Page 79: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/79.jpg)
PyTorch: Computational Graphs
[Figure: computational graph running from the input image to the loss]
Figure reproduced with permission from a Twitter post by Andrej Karpathy.
![Page 80: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/80.jpg)
PyTorch: Dynamic Computation Graphs
![Page 81: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/81.jpg)
PyTorch: Dynamic Computation Graphs
[Graph so far: x, w1, w2, y]
Create Tensor objects
![Page 82: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/82.jpg)
PyTorch: Dynamic Computation Graphs
[Graph so far: x, w1 -> mm -> clamp -> mm (with w2) -> y_pred; y]
Build graph data structure AND perform computation
![Page 83: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/83.jpg)
PyTorch: Dynamic Computation Graphs
[Graph so far: x, w1 -> mm -> clamp -> mm (with w2) -> y_pred; y_pred, y -> - -> pow -> sum -> loss]
Build graph data structure AND perform computation
![Page 84: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/84.jpg)
PyTorch: Dynamic Computation Graphs
[Graph: x, w1 -> mm -> clamp -> mm (with w2) -> y_pred; y_pred, y -> - -> pow -> sum -> loss]
Search for path between loss and w1, w2 (for backprop) AND perform computation
![Page 85: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/85.jpg)
PyTorch: Dynamic Computation Graphs
[Graph: x, w1, w2, y]
Throw away the graph and the backprop path, and rebuild them from scratch on every iteration
![Page 86: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/86.jpg)
PyTorch: Dynamic Computation Graphs
[Graph so far: x, w1 -> mm -> clamp -> mm (with w2) -> y_pred; y]
Build graph data structure AND perform computation
![Page 87: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/87.jpg)
PyTorch: Dynamic Computation Graphs
[Graph so far: x, w1 -> mm -> clamp -> mm (with w2) -> y_pred; y_pred, y -> - -> pow -> sum -> loss]
Build graph data structure AND perform computation
![Page 88: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/88.jpg)
PyTorch: Dynamic Computation Graphs
[Graph: x, w1 -> mm -> clamp -> mm (with w2) -> y_pred; y_pred, y -> - -> pow -> sum -> loss]
Search for path between loss and w1, w2 (for backprop) AND perform computation
![Page 89: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/89.jpg)
PyTorch: Dynamic Computation Graphs
Building the graph and computing the graph happen at the same time.
Seems inefficient, especially if we are building the same graph over and over again...
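The build, backprop, and discard cycle from the preceding slides can be sketched with raw tensors. The node names (x, w1, w2, y, mm, clamp, pow, sum) follow the slide graph; the sizes, learning rate, and iteration count are assumptions:

```python
import torch

torch.manual_seed(0)
N, D_in, H, D_out = 8, 10, 6, 3  # small, arbitrary sizes (assumption)

# Create Tensor objects; only the weights need gradients.
x = torch.randn(N, D_in)
y = torch.randn(N, D_out)
w1 = torch.randn(D_in, H, requires_grad=True)
w2 = torch.randn(H, D_out, requires_grad=True)

learning_rate = 1e-4
losses = []
for t in range(20):
    # Each line both records graph nodes and computes values.
    y_pred = x.mm(w1).clamp(min=0).mm(w2)
    loss = (y_pred - y).pow(2).sum()
    losses.append(loss.item())

    # Walk the graph from loss back to w1 and w2.
    loss.backward()

    with torch.no_grad():
        w1 -= learning_rate * w1.grad
        w2 -= learning_rate * w2.grad
        w1.grad.zero_()
        w2.grad.zero_()
    # The graph built in this iteration is dropped here; the next pass rebuilds it.
```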
![Page 90: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/90.jpg)
Static Computation Graphs
Alternative: Static graphs
Step 1: Build computational graph describing our computation (including finding paths for backprop)
Step 2: Reuse the same graph on every iteration
![Page 91: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/91.jpg)
TensorFlow
![Page 92: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/92.jpg)
TensorFlow Versions
Pre-2.0 (1.14 latest): Default static graph, optionally dynamic graph (eager mode).
2.0+: Default dynamic graph, optionally static graph. We use 2.4 in this class.
![Page 93: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/93.jpg)
TensorFlow: Neural Net (Pre-2.0)
(Assume imports at the top of each snippet)
![Page 94: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/94.jpg)
TensorFlow: Neural Net (Pre-2.0)
First define computational graph
Then run the graph many times
![Page 95: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/95.jpg)
TensorFlow: 2.0+ vs. pre-2.0
TensorFlow 2.0+: “Eager” mode by default
assert(tf.executing_eagerly())
TensorFlow 1.13
![Page 96: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/96.jpg)
TensorFlow: 2.0+ vs. pre-2.0
TensorFlow 1.13
TensorFlow 2.0+: “Eager” mode by default
assert(tf.executing_eagerly())
![Page 97: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/97.jpg)
TensorFlow: 2.0+ vs. pre-2.0
TensorFlow 1.13
TensorFlow 2.0+: “Eager” mode by default
assert(tf.executing_eagerly())
![Page 98: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/98.jpg)
TensorFlow: Neural Net
Convert input numpy arrays to TF tensors.
Create weights as tf.Variable
![Page 99: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/99.jpg)
TensorFlow: Neural Net
Use tf.GradientTape() context to build dynamic computation graph.
![Page 100: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/100.jpg)
TensorFlow: Neural Net
All forward-pass operations in the context (including function calls) get traced for computing gradients later.
![Page 101: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/101.jpg)
TensorFlow: Neural Net
Forward pass
![Page 102: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/102.jpg)
TensorFlow: Neural Net
tape.gradient() uses the traced computation graph to compute gradient for the weights
![Page 103: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/103.jpg)
TensorFlow: Neural Net
Backward pass
![Page 104: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/104.jpg)
TensorFlow: Neural Net
Train the network: Run the training step over and over, use gradient to update weights
![Page 105: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/105.jpg)
TensorFlow: Neural Net
Train the network: Run the training step over and over, use gradient to update weights
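The TensorFlow 2 steps above (tensors and variables, `tf.GradientTape`, `tape.gradient`, weight update) can be sketched together as follows. The sizes, initialization, learning rate, and step count are assumptions, since the slide code is not in this transcript:

```python
import numpy as np
import tensorflow as tf

tf.random.set_seed(0)
N, D, H = 16, 20, 8  # small, arbitrary sizes (assumption)
rng = np.random.default_rng(0)

# Convert input numpy arrays to TF tensors; create weights as tf.Variable.
x = tf.convert_to_tensor(rng.standard_normal((N, D)).astype(np.float32))
y = tf.convert_to_tensor(rng.standard_normal((N, D)).astype(np.float32))
w1 = tf.Variable(tf.random.uniform((D, H)))
w2 = tf.Variable(tf.random.uniform((H, D)))

learning_rate = 1e-4
losses = []
for t in range(50):
    # Operations inside the tape context are traced for the gradient computation.
    with tf.GradientTape() as tape:
        h = tf.maximum(tf.matmul(x, w1), 0)
        y_pred = tf.matmul(h, w2)
        diff = y_pred - y
        loss = tf.reduce_mean(tf.reduce_sum(diff ** 2, axis=1))
    losses.append(float(loss))

    # Backward pass: gradients of the loss with respect to the weights.
    grad_w1, grad_w2 = tape.gradient(loss, [w1, w2])

    # Use the gradients to update the weights in place.
    w1.assign(w1 - learning_rate * grad_w1)
    w2.assign(w2 - learning_rate * grad_w2)
```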
![Page 106: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/106.jpg)
TensorFlow: Optimizer
Can use an optimizer to compute gradients and update weights
![Page 107: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/107.jpg)
TensorFlow: Loss
Use predefined loss functions
![Page 108: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/108.jpg)
Keras: High-Level Wrapper
Keras is a layer on top of TensorFlow, makes common things easy to do
(Used to be third-party, now merged into TensorFlow)
![Page 109: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/109.jpg)
Keras: High-Level Wrapper
Define model as a sequence of layers
Get output by calling the model
Apply gradient to all trainable variables (weights) in the model
![Page 110: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/110.jpg)
Keras: High-Level Wrapper
Keras can handle the training loop for you!
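A sketch of both Keras styles from these slides: one manual step that applies gradients to the model's trainable variables, followed by `model.fit` running the loop. Layer sizes, learning rates, and epoch count are assumptions:

```python
import numpy as np
import tensorflow as tf

tf.random.set_seed(0)
N, D, H = 16, 20, 8  # small, arbitrary sizes (assumption)
rng = np.random.default_rng(0)
x = rng.standard_normal((N, D)).astype(np.float32)
y = rng.standard_normal((N, D)).astype(np.float32)

# Define the model as a sequence of layers.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(H, activation="relu"),
    tf.keras.layers.Dense(D),
])
loss_fn = tf.keras.losses.MeanSquaredError()
optimizer = tf.keras.optimizers.SGD(learning_rate=1e-1)

# One manual step: call the model, then apply gradients to all trainable variables.
with tf.GradientTape() as tape:
    y_pred = model(tf.constant(x))
    loss = loss_fn(y, y_pred)
grads = tape.gradient(loss, model.trainable_variables)
optimizer.apply_gradients(zip(grads, model.trainable_variables))

# Alternatively, let Keras handle the training loop.
model.compile(loss="mse", optimizer=tf.keras.optimizers.SGD(learning_rate=1e-1))
history = model.fit(x, y, epochs=20, batch_size=N, verbose=0)
losses = history.history["loss"]
```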
![Page 111: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/111.jpg)
TensorFlow: High-Level Wrappers
Keras (https://keras.io/)
tf.keras (https://www.tensorflow.org/api_docs/python/tf/keras)
tf.estimator (https://www.tensorflow.org/api_docs/python/tf/estimator)
Sonnet (https://github.com/deepmind/sonnet)
TFLearn (http://tflearn.org/)
TensorLayer (http://tensorlayer.readthedocs.io/en/latest/)
![Page 112: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/112.jpg)
@tf.function: compile static graph
The tf.function decorator (implicitly) compiles Python functions into a static graph for better performance
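A small sketch of the idea, assuming an illustrative function `dense_relu` (not from the slide). The same Python function is run eagerly and through `tf.function`, and the two results should agree:

```python
import tensorflow as tf

def dense_relu(x, w, b):
    # Ordinary eager code: runs op by op.
    return tf.nn.relu(tf.matmul(x, w) + b)

# tf.function traces the Python function into a graph on the first call
# and reuses that graph for later calls with compatible inputs.
compiled_dense_relu = tf.function(dense_relu)

x = tf.random.normal((4, 3))
w = tf.random.normal((3, 2))
b = tf.zeros((2,))

eager_out = dense_relu(x, w, b)
graph_out = compiled_dense_relu(x, w, b)
max_abs_diff = float(tf.reduce_max(tf.abs(eager_out - graph_out)))
```

The decorator form, `@tf.function` placed above the `def`, is equivalent to the explicit call used here.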
![Page 113: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/113.jpg)
@tf.function: compile static graph
Here we compare the forward-pass time of the same model under dynamic graph mode and static graph mode
Ran on Google Colab, April 2020
![Page 114: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/114.jpg)
@tf.function: compile static graph
Static graph is in theory faster than dynamic graph, but the performance gain depends on the type of model / layer / computation graph.
Ran on Google Colab, April 2020
![Page 115: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/115.jpg)
@tf.function: compile static graph
Static graph is in theory faster than dynamic graph, but the performance gain depends on the type of model / layer / computation graph.
Ran on Google Colab, April 2020
![Page 116: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/116.jpg)
Static vs Dynamic: Optimization
With static graphs, framework can optimize the graph for you before it runs!
The graph you wrote: Conv -> ReLU -> Conv -> ReLU -> Conv -> ReLU
Equivalent graph with fused operations: Conv+ReLU -> Conv+ReLU -> Conv+ReLU
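To make the fusion idea concrete, here is a toy rewrite pass over a list of op names. It is a conceptual sketch only: real frameworks rewrite graph data structures and emit fused kernels, and the function name `fuse_conv_relu` is invented for this example.

```python
def fuse_conv_relu(ops):
    """Toy graph-rewrite pass: merge each Conv followed by ReLU into one op."""
    fused = []
    i = 0
    while i < len(ops):
        if ops[i] == "Conv" and i + 1 < len(ops) and ops[i + 1] == "ReLU":
            fused.append("Conv+ReLU")
            i += 2  # consume both ops
        else:
            fused.append(ops[i])
            i += 1
    return fused

graph = ["Conv", "ReLU", "Conv", "ReLU", "Conv", "ReLU"]
optimized = fuse_conv_relu(graph)
print(optimized)  # ['Conv+ReLU', 'Conv+ReLU', 'Conv+ReLU']
```

A pass like this needs the whole graph up front, which is why it fits static graphs more naturally than graphs rebuilt on every iteration.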
![Page 117: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/117.jpg)
Static PyTorch: ONNX Support
You can export a PyTorch model to ONNX (Open Neural Network Exchange)
Run the graph on a dummy input, and save the graph to a file
Will only work if your model doesn’t actually make use of dynamic graph - must build same graph on every forward pass
![Page 118: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/118.jpg)
Static PyTorch: ONNX Support

    graph(%0 : Float(64, 1000)
          %1 : Float(100, 1000)
          %2 : Float(100)
          %3 : Float(10, 100)
          %4 : Float(10)) {
      %5 : Float(64, 100) = onnx::Gemm[alpha=1, beta=1, broadcast=1, transB=1](%0, %1, %2), scope: Sequential/Linear[0]
      %6 : Float(64, 100) = onnx::Relu(%5), scope: Sequential/ReLU[1]
      %7 : Float(64, 10) = onnx::Gemm[alpha=1, beta=1, broadcast=1, transB=1](%6, %3, %4), scope: Sequential/Linear[2]
      return (%7);
    }

After exporting to ONNX, can run the PyTorch model in Caffe2
![Page 119: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/119.jpg)
Static PyTorch: ONNX Support
ONNX is an open-source standard for neural network models
Goal: Make it easy to train a network in one framework, then run it in another framework
Supported by PyTorch, Caffe2, Microsoft CNTK, Apache MXNet (third-party support for TensorFlow)
https://github.com/onnx/onnx
![Page 120: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/120.jpg)
Static PyTorch: TorchScript

    graph(%self.1 : __torch__.torch.nn.modules.module.___torch_mangle_4.Module,
          %input : Float(3, 4),
          %h : Float(3, 4)):
      %19 : __torch__.torch.nn.modules.module.___torch_mangle_3.Module = prim::GetAttr[name="linear"](%self.1)
      %21 : Tensor = prim::CallMethod[name="forward"](%19, %input)
      %12 : int = prim::Constant[value=1]() # <ipython-input-40-26946221023e>:7:0
      %13 : Float(3, 4) = aten::add(%21, %h, %12) # <ipython-input-40-26946221023e>:7:0
      %14 : Float(3, 4) = aten::tanh(%13) # <ipython-input-40-26946221023e>:7:0
      %15 : (Float(3, 4), Float(3, 4)) = prim::TupleConstruct(%14, %14)
      return (%15)

Build static graph with torch.jit.trace
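A sketch of tracing that would produce a graph of the shape printed above (a linear layer, an add, a tanh, and a tuple of two outputs). The module name `MyCell` and the input sizes are assumptions inferred from that printout, not taken from the slide's code:

```python
import torch

class MyCell(torch.nn.Module):
    # Hypothetical cell: tanh(linear(x) + h), returned twice as a tuple.
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(4, 4)

    def forward(self, x, h):
        new_h = torch.tanh(self.linear(x) + h)
        return new_h, new_h

cell = MyCell()
x, h = torch.rand(3, 4), torch.rand(3, 4)

# Run the module once on example inputs and record the ops as a static graph.
traced_cell = torch.jit.trace(cell, (x, h))
print(traced_cell.graph)

traced_out, _ = traced_cell(x, h)
eager_out, _ = cell(x, h)
```

Tracing records only the operations executed for the example inputs, so data-dependent Python control flow is not captured.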
![Page 121: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/121.jpg)
PyTorch vs TensorFlow, Static vs Dynamic
PyTorch
Dynamic Graphs
Static: ONNX, TorchScript
TensorFlow
Dynamic: Eager
Static: @tf.function
![Page 122: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/122.jpg)
Static vs Dynamic: Serialization
Static: Once graph is built, can serialize it and run it without the code that built the graph!
Dynamic: Graph building and execution are intertwined, so always need to keep code around
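One way to see the static side of this trade-off, sketched with TorchScript: a traced graph can be saved and loaded back as a self-contained object. The model and the in-memory buffer are illustrative choices; a file path would work the same way.

```python
import io
import torch

model = torch.nn.Sequential(torch.nn.Linear(4, 3), torch.nn.ReLU())
example = torch.rand(2, 4)
traced = torch.jit.trace(model, example)

# Serialize the traced graph (here to an in-memory buffer).
buffer = io.BytesIO()
torch.jit.save(traced, buffer)

# Load it back; the restored object carries the graph and weights.
buffer.seek(0)
restored = torch.jit.load(buffer)
restored_out = restored(example)
```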
![Page 123: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/123.jpg)
Dynamic Graph Applications
- Recurrent networks
Karpathy and Fei-Fei, “Deep Visual-Semantic Alignments for Generating Image Descriptions”, CVPR 2015
Figure copyright IEEE, 2015. Reproduced for educational purposes.
![Page 124: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/124.jpg)
Dynamic Graph Applications
- Recurrent networks
- Recursive networks
[Figure: example sentence “The cat ate a big rat”]
![Page 125: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/125.jpg)
Dynamic Graph Applications
- Recurrent networks
- Recursive networks
- Modular networks
Andreas et al, “Neural Module Networks”, CVPR 2016
Andreas et al, “Learning to Compose Neural Networks for Question Answering”, NAACL 2016
Johnson et al, “Inferring and Executing Programs for Visual Reasoning”, ICCV 2017
Figure copyright Justin Johnson, 2017. Reproduced with permission.
![Page 126: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/126.jpg)
Dynamic Graph Applications
- Recurrent networks
- Recursive networks
- Modular networks
- (Your creative idea here)
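As a rough illustration of why recurrent networks fit dynamic graphs, the sketch below unrolls an RNN cell with an ordinary Python loop, so the graph built on each call has as many steps as the input sequence. The sizes, the `encode` helper, and the random inputs are assumptions made up for this example, not taken from the lecture.

```python
import torch
import torch.nn as nn

# Hypothetical sizes chosen only for illustration.
cell = nn.RNNCell(input_size=5, hidden_size=7)

def encode(seq):
    """Run the cell over a sequence of shape (T, 5); T may differ per call."""
    h = torch.zeros(1, 7)
    for x_t in seq:                      # plain Python loop: one step per token
        h = cell(x_t.unsqueeze(0), h)    # the graph grows as the loop runs
    return h

h_short = encode(torch.randn(3, 5))      # graph with 3 time steps
h_long = encode(torch.randn(9, 5))       # graph with 9 time steps

# Backpropagation follows whatever graph was built for this particular input.
h_long.sum().backward()
```

The same pattern extends to recursive and modular networks, where the Python control flow follows a parse tree or a predicted program instead of a sequence.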
![Page 127: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/127.jpg)
Model Parallel vs. Data Parallel
[Figure: model-parallel and data-parallel processing of a minibatch]
Model parallelism: split computation graph into parts & distribute to GPUs/nodes
Data parallelism: split minibatch into chunks & distribute to GPUs/nodes
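To make the data-parallel idea concrete, here is a minimal pure-Python sketch that simulates the workers sequentially. It assumes a toy 1-D linear model with mean squared error; the data, `grad_mse`, `split`, and `data_parallel_grad` are all hypothetical names invented for this illustration. It shows that averaging per-chunk gradients, weighted by chunk size, reproduces the full-minibatch gradient.

```python
def grad_mse(w, xs, ys):
    """Gradient w.r.t. w of the mean squared error for the model y = w * x."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)

def split(seq, num_chunks):
    """Split seq into num_chunks contiguous chunks of near-equal size."""
    k, r = divmod(len(seq), num_chunks)
    chunks, start = [], 0
    for i in range(num_chunks):
        end = start + k + (1 if i < r else 0)
        chunks.append(seq[start:end])
        start = end
    return chunks

def data_parallel_grad(w, xs, ys, num_workers):
    """Each simulated worker handles one chunk; results are size-weighted and summed."""
    total = len(xs)
    grad = 0.0
    for cx, cy in zip(split(xs, num_workers), split(ys, num_workers)):
        if cx:  # skip empty chunks when there are more workers than samples
            grad += grad_mse(w, cx, cy) * len(cx) / total
    return grad

# Toy minibatch (made-up numbers).
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.0, 4.1, 5.9, 8.2, 9.8, 12.1]
w = 0.5

full = grad_mse(w, xs, ys)
parallel = data_parallel_grad(w, xs, ys, num_workers=4)
```

In a real setup each chunk would live on a different GPU or node and the combination step would be a communication operation; model parallelism would instead place different parts of the computation graph on different devices.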
![Page 128: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/128.jpg)
PyTorch: Data Parallel
nn.DataParallel
Pro: Easy to use (just wrap the model and run the training script as normal)
Con: Single process & single node. Can be bottlenecked by the CPU with a large number of GPUs (8+).
nn.DistributedDataParallel
Pro: Multi-node & multi-process training
Con: Need to hand-designate the device and manually launch the training script for each process/node.
Horovod (https://github.com/horovod/horovod): Supports both PyTorch and TensorFlow
https://pytorch.org/docs/stable/nn.html#dataparallel-layers-multi-gpu-distributed
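The "just wrap the model" usage of `nn.DataParallel` might look like the sketch below. The model architecture, batch size, and random data are assumptions for illustration only. On a machine without GPUs the wrapper simply calls the underlying module, so the same script still runs.

```python
import torch
import torch.nn as nn

# Hypothetical toy model and data, chosen only to keep the example small.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2)).to(device)

# Wrap once; each forward pass splits the batch across the visible GPUs.
model = nn.DataParallel(model)

optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
x = torch.randn(32, 8, device=device)
y = torch.randint(0, 2, (32,), device=device)

# The training step is written exactly as for a single-device model.
loss = nn.functional.cross_entropy(model(x), y)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

`nn.DistributedDataParallel` needs additional setup (one process per device and a launched training script), which is the trade-off listed above.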
![Page 129: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/129.jpg)
TensorFlow: Data Parallel
tf.distribute.Strategy
https://www.tensorflow.org/tutorials/distribute/keras
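A minimal sketch of data-parallel training with `tf.distribute.MirroredStrategy`, one of the available strategies, is shown below. The toy model, random data, and batch size are assumptions for illustration; with no GPU present the strategy is expected to run a single replica on the CPU.

```python
import numpy as np
import tensorflow as tf

# MirroredStrategy replicates the model on the visible GPUs of one machine.
strategy = tf.distribute.MirroredStrategy()

# Variables (model weights, optimizer state) are created inside the scope.
with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(8,)),
        tf.keras.layers.Dense(16, activation="relu"),
        tf.keras.layers.Dense(2),
    ])
    model.compile(
        optimizer="sgd",
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    )

# Made-up data; each global batch of 16 is split across the replicas.
x = np.random.randn(64, 8).astype("float32")
y = np.random.randint(0, 2, size=(64,))
dataset = tf.data.Dataset.from_tensor_slices((x, y)).batch(16)

history = model.fit(dataset, epochs=1, verbose=0)
```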
![Page 130: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/130.jpg)
PyTorch vs. TensorFlow: Academia
https://thegradient.pub/state-of-ml-frameworks-2019-pytorch-dominates-research-tensorflow-dominates-industry/
![Page 131: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/131.jpg)
PyTorch vs. TensorFlow: Academia
https://thegradient.pub/state-of-ml-frameworks-2019-pytorch-dominates-research-tensorflow-dominates-industry/
![Page 132: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/132.jpg)
PyTorch vs. TensorFlow: Industry (202)
● No official survey / study on the comparison.
● A quick search on a job posting website turns up 2389 search results for TensorFlow and 1366 for PyTorch.
● The trend is unclear. Industry is also known to be slower to adopt new frameworks.
● TensorFlow mostly dominates mobile deployment / embedded systems.
![Page 133: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/133.jpg)
My Advice:
PyTorch is my personal favorite. Clean API and native dynamic graphs make it very easy to develop and debug. Can build a model using the default API, then compile a static graph using JIT. Lots of research repositories are built on PyTorch.
TensorFlow’s syntax became a lot more intuitive after 2.0. Not perfect but has huge community and wide usage. Can use same framework for research and production. Probably use a higher-level wrapper (Keras, Sonnet, etc.).
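The PyTorch workflow mentioned above (build with the default API, then compile with JIT) could be sketched as follows. The `Gate` module is a hypothetical example invented here; `torch.jit.script` compiles it, including its data-dependent branch, into a TorchScript module.

```python
import torch
import torch.nn as nn

class Gate(nn.Module):
    """Hypothetical module with data-dependent control flow."""

    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 4)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.fc(x)
        if h.sum() > 0:          # the branch depends on the input values
            return torch.relu(h)
        return -h

eager = Gate()                       # ordinary dynamic-graph module
scripted = torch.jit.script(eager)   # compiled version, control flow preserved

x = torch.randn(3, 4)
same = torch.allclose(eager(x), scripted(x))
```

Scripting, as opposed to tracing, is used here because tracing would record only the branch taken for the example input.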
![Page 134: Hardware and Software Lecture 6: Deep Learning Hardware, …vision.stanford.edu/teaching/cs231n/slides/2021/lecture... · 2021. 4. 15. · Hardware and Software Deep Learning Hardware,](https://reader036.fdocuments.us/reader036/viewer/2022062612/61396b0aa4cdb41a985bafcc/html5/thumbnails/134.jpg)
Next Time: Training Neural Networks