A Practical Guide to Deep Learning at the Department of Mathematics

Vegard Antun (UiO)

March 19, 2019


Layout of the talk

Part I: Computer resources, the Linux operating system, large-scale computations.

Part II: Neural networks, mathematical framework, practical example.


Computer resources

CPU

Cache

Memory

Hard drive


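Memory Hierarchies

[Figure from INF1060, Pål Halvorsen, University of Oslo. Levels shown, from fastest to slowest: cache(s), main memory, secondary storage (disks), tertiary storage (tapes). Access times shown: 0.3 ns; on-die memory, 1 ns; 50 ns; 5 ms. Times shown alongside for comparison: < 1 s; 2 s; 1.5 minutes; 3.5 months.]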


Computer resources

GPU

Memory

CPU

Cache

Memory

Hard drive


Time measurements

Total time for 10 epochs on CIFAR10. Batch size 10.

- CPU: 8 min, 35 sec
- GPU: 53 sec (≈10 times faster)

[Figure: bar chart of loading time in seconds (axis from 0 to 20) for three sources: network, local disk, RAM. Caption: Loading 50 MR scans (each 40 MB) on nam-shub.]


Operating systems (OS)

Hardware

Operating system


The Linux Filesystem Hierarchy

The uppermost directory in the Linux file system is /

    [ ~ ]$ ls
    Desktop    Downloads  Pictures  www_docs
    Documents  pc         WINDOWS
    [ ~ ]$ pwd
    /mn/sarpanitu/ansatte-u4/vegarant
    [ ~ ]$ cd /
    [ / ]$ ls
    admin  etc   lib    misc   opt   sbin  tf   usit
    bin    hf    lib64  mn     proc  site  tmp  usr
    boot   home  local  mnt    rh    srv   ub   uv
    dev    ifi   med    net    root  sv    uio  var
    div    jus   media  odont  run   sys   use


Some important directories

- /bin: most basic executable files (ls, cp, cd)
- /lib: libraries used by the executables
- /boot: files related to the boot loader
- /dev: all devices, e.g. /dev/random, /dev/null, /dev/pts/0
- /etc: configuration files, e.g. /etc/hostname, /etc/passwd
- /home/username: your home folder ~/ (not on the UiO system)
- /root: home directory of the root user
- /tmp: temporary files, not preserved across reboots
- /usr: read-only user data, multiuser applications
- /var: variable files, i.e. files that change during execution


Environment variables

A variable with a name and a value, used by one or more applications. To view them all, type env.

Some important environment variables

- PATH: all directories searched for executables
- PYTHONPATH: additional directories searched for Python modules
- HOME: your home directory, i.e. the location of ~/
- EDITOR: default editor
- TF_CPP_MIN_LOG_LEVEL: level of verbosity for TensorFlow


Environment variables - Example

    [ ~ ]$ echo $PYTHONPATH
    /path/to/module1:/path/to/module2
    [ ~ ]$ export PYTHONPATH=$PYTHONPATH:/path/to/new_module
    [ ~ ]$ echo $PYTHONPATH
    /path/to/module1:/path/to/module2:/path/to/new_module
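The same variables can be inspected from inside Python. The following is a minimal sketch using only the standard library; the directory /path/to/new_module is the placeholder from the shell example, not a real location.

```python
import os
import sys

# PATH is a list of directories joined by os.pathsep (":" on Linux and macOS).
path_dirs = os.environ.get("PATH", "").split(os.pathsep)
print("First PATH entries:", path_dirs[:3])

# PYTHONPATH may be unset, in which case we end up with an empty list.
pythonpath = os.environ.get("PYTHONPATH", "")
extra_dirs = [d for d in pythonpath.split(os.pathsep) if d]
print("PYTHONPATH entries:", extra_dirs)

# Appending to sys.path only affects the running interpreter, whereas
# `export PYTHONPATH=...` affects every Python started from that shell.
new_dir = "/path/to/new_module"  # placeholder path, assumed for illustration
if new_dir not in sys.path:
    sys.path.append(new_dir)
```

Changing sys.path this way is convenient for quick experiments; for a persistent setting the shell export is the more natural place.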


The ~/.bashrc

The scripting language you type in the terminal is called "Bash" (Bourne Again SHell).

We often want the environment to persist between logins. Set defaults in the file

- ~/.bashrc: run each time you open a terminal on your computer

    [ ~ ]$ cat ~/.bashrc
    export PYTHONPATH=$PYTHONPATH:/path/to/new_module
    export TF_CPP_MIN_LOG_LEVEL=1
    alias la='ls -a --color=auto'
    alias ll='ls -lh --color=auto'
    # Describes the command line prompt
    PS1='[ \h \w ]$ '


The ~/.bashrc and ~/.bash_profile files

- ~/.bashrc: run each time you open a terminal on your computer (interactive non-login shell)
- ~/.bash_profile: run each time you log in remotely (login shell)

Having two different settings in ~/.bashrc and ~/.bash_profile is often inconvenient. To use only the ~/.bashrc file, place the following lines in your ~/.bash_profile

    [ ~ ]$ cat .bash_profile
    if [ -f ~/.bashrc ]; then
        . ~/.bashrc
    fi

Note: Files starting with '.' are not shown when you type ls. To see these files, type ls -a


Login to remote machines via SSH

Log in to the university's network from a personal Linux or Mac computer

    [ ~ ]$ ssh -X [email protected]

The -X option enables X11 forwarding, i.e. you can open GUI-based applications.

Once you are logged in, you can continue to the desired computer by typing

    [ ~ ]$ ssh -X computername
    [ ~ ]$ # Example, logging into the hadad computer
    [ ~ ]$ ssh -X hadad


Login to remote machines via SSH

Next we will see how to make this procedure require less typing!


SSH config file

Create the file ~/.ssh/config and add the following lines

    host uio
        hostname login.math.uio.no
        user your_username
        ForwardX11 no

You can then log on to the university's network with

    ssh -X uio

We assume you have this set up in the rest of this presentation


SSH keys

To be secure, UiO passwords often require a lot of typing. SSH keys provide an easy way to maintain high security while having shorter passwords.


Generate and set up SSH-key

    [ ~ ]$ ssh-keygen -t rsa -b 4096 -C "[email protected]"

This command will create two files

- ~/.ssh/id_rsa: private key. Do not share it.
- ~/.ssh/id_rsa.pub: public key. Can be shared with anyone.

Copy the public key to the remote host (UiO)

    ssh-copy-id -i ~/.ssh/id_rsa.pub <username>@login.math.uio.no


SSH and jump connections

Your comp. → login.math.uio.no → math comp.

- A jump connection sends the SSH traffic directly through a computer, like a regular router
- You avoid some typing and you do not allocate a terminal on the jump computer
- A single jump is used here (ssh also accepts several comma-separated hops)


SSH and jump connections

To use a jump connection, add the following to your ~/.ssh/config

    # Setup for the math computers, this example belet-ili
    Host belet-ili1
        Hostname belet-ili.uio.no
        ProxyJump [email protected]
        User vegarant

or you can add the jump connection directly

    ssh -J <username>@login.math.uio.no <username>@<hostname>.uio.no


Terminal window managers

- Common choices are "tmux" or "screen".


Monitor CPU usage

- Use the htop command to view CPU usage and priority


Reducing the priority of your process

- Linux processes can have "niceness" values in {−20, ..., 19}, where a smaller value gives higher priority.
- Negative nice values can only be given by the root user/administrator.
- The default niceness of any process you start is 0, i.e. you will typically reduce the priority.

    [ ~ ]$ nice -n 19 python3 my_python_script.py &
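A script can also lower its own priority after it has started. This is a minimal sketch using os.nice from the standard library (available on Unix systems only); it is an alternative to the nice command shown above, not something the slides prescribe.

```python
import os

# os.nice(increment) adds `increment` to the niceness of the current
# process and returns the new value; os.nice(0) only reads it.
before = os.nice(0)

# Increasing the niceness (lowering the priority) is always allowed for
# your own process. On Linux the result is capped at 19.
after = os.nice(19)

print(f"niceness before: {before}, after: {after}")
```

Note that an unprivileged process cannot decrease its niceness again afterwards.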


Monitor GPU usage

- All of our GPUs are from Nvidia. To view their current usage, use nvidia-smi
- To call this command every 5 seconds, use the watch command

    [ ~ ]$ watch -n 5 nvidia-smi
    [ ~ ]$ # or use
    [ ~ ]$ nvidia-smi -l 5
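If you want the GPU status inside a Python script, for example to log it next to training output, one option is to call nvidia-smi through subprocess. This is a hedged sketch: the helper name gpu_status is made up for illustration, and the function simply returns None on machines without nvidia-smi.

```python
import shutil
import subprocess

def gpu_status():
    """Return one text line per GPU, or None if nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi",
         "--query-gpu=name,utilization.gpu,memory.used,memory.total",
         "--format=csv,noheader"],
        capture_output=True, text=True, check=False,
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip().splitlines()

status = gpu_status()
print(status if status is not None else "nvidia-smi not available")
```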


GPU resources at Dep. of Mathematics

    Name          GPU               CPU cores   Mem.     Scratch
    nam-shub-01   4 × RTX 2080 Ti   28          128 GB   30 GB
    zadkiel       1 × RTX 2080      4           16 GB    -
    belet-ili     1 × GTX 1080      4           16 GB    -
    cleopatra     1 × GTX 1080      4           16 GB    -
    euphrosyne    1 × GTX 1080      4           16 GB    -
    hadad         1 × GTX 1080      4           16 GB    -


AI HUB

- An experimental service for machine learning provided by USIT, to gain experience with hardware and software for deep learning.
- Reserved for students on weekdays (Mon-Fri) from 09:00 to 17:00.
- You need to log in via Abel (add SSH keys as before).

    Name   GPU               CPU cores   Mem.     Non-persistent scratch
    ml1    4 × RTX 2080 Ti   28          128 GB   17 TB
    ml2    4 × RTX 2080 Ti   28          128 GB   17 TB
    ml3    4 × RTX 2080 Ti   28          128 GB   17 TB

- AI mailing list: [email protected]


Deep learning frameworks

- Many older frameworks, such as MatConvNet, Caffe, Theano ...
- For most scientists TensorFlow (and maybe PyTorch) would be the preferred option.


TensorFlow

- Developed by Google, and has a large community.
- Relatively well documented.
- Has APIs in Python, JavaScript, C++, Java, Go, Swift.
- Models can be deployed into applications, such as websites and phones.


How to run TensorFlow?

- No unified way to do this on all systems.
- The machines ml1, ml2 and ml3 have TensorFlow v1.12 and PyTorch v1.0. Just type python3 to get started.
- On the math computers we use the module system (and maybe Singularity).

    module avail                      # See which modules are available
    module load tensorflow/<version>  # Load tensorflow
    module rm tensorflow/<version>    # Unload tensorflow
    module list                       # View loaded modules

- ML software is located under python-ml/<version> and tensorflow/<version>. Do not load both.


Singularity

- Singularity (similar to Docker) is a container with a minimal operating system.
- It shares the kernel with the host operating system, so the CPU overhead is almost none.
- You can install whatever software you like within the container, with the necessary libraries.
- Makes reproducible research much easier!
- Check out Tormod Landet's excellent guide to Singularity: http://folk.uio.no/tormodla/singularity/
- On the math computers, precompiled Singularity images are located at /mn/sarpanitu/singularity/images/Machine_learning


Neat commands

- ag or ack: search for a pattern in each source file in the tree from the current directory and downwards.
- fzf: fuzzy finder. Search for filenames in the tree from the current directory and downwards.
- which <command>: e.g. which python gives the location of the program python.
- nohup nice -n 19 python -u my_script.py > output.txt &: start a process that is not shut down when you exit the login shell.
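The lookup done by which can be reproduced in Python with shutil.which, which is handy when a script needs to check that a tool exists before calling it. A small sketch; the command name no-such-command-hopefully is deliberately made up.

```python
import shutil
import sys

# shutil.which mirrors the shell's `which`: it searches the directories
# listed in PATH and returns the full path, or None if nothing is found.
python_path = shutil.which("python3")
print("python3 found at:", python_path)

# The interpreter running this script may differ from the one on PATH,
# for example inside a virtual environment or after loading a module.
print("running interpreter:", sys.executable)

# A name that is presumably not installed gives None instead of an error.
missing = shutil.which("no-such-command-hopefully")
print("missing command:", missing)
```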


File permissions

On UNIX systems, access can be given to a user, a group or all. The three types of permissions are read, write and execute.

    [ ~/some/directory ]$ ls -l
    drwxrwxr-x. 1 vegarant vegarant 4096 Oct 26 10:53 my_dir
    -rwxrwxr-x. 1 vegarant vegarant 8448 Oct 26 10:53 my_file
    -rw-r--r--. 1 vegarant vegarant  108 Oct 26 10:52 my_file.c

Reading the first line field by field:

    d              directory
    rwx            user
    rwx            group
    r-x            all
    vegarant       username
    vegarant       group name
    4096           size
    Oct 26 10:53   last modified
    my_dir         name

    [ ~/some/directory ]$ # Make directory private
    [ ~/some/directory ]$ chmod 700 my_dir
    [ ~/some/directory ]$ ls -l
    drwx------. 1 vegarant vegarant 4096 Oct 26 10:53 my_dir
    -rwxrwxr-x. 1 vegarant vegarant 8448 Oct 26 10:53 my_file
    -rw-r--r--. 1 vegarant vegarant  108 Oct 26 10:52 my_file.c
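The same permission change can be made from Python, which is useful when a script creates output directories that should stay private. This sketch works in a temporary directory so that nothing in your home folder is touched; the octal modes 0o775 and 0o700 correspond to rwxrwxr-x and rwx------.

```python
import os
import stat
import tempfile

with tempfile.TemporaryDirectory() as parent:
    my_dir = os.path.join(parent, "my_dir")
    os.mkdir(my_dir)

    # Set rwxrwxr-x explicitly; the mode passed to mkdir would be
    # filtered by the umask of the process.
    os.chmod(my_dir, 0o775)
    before = stat.filemode(os.stat(my_dir).st_mode)

    # Equivalent of `chmod 700 my_dir`: access for the owner only.
    os.chmod(my_dir, 0o700)
    after = stat.filemode(os.stat(my_dir).st_mode)

print(before)  # drwxrwxr-x
print(after)   # drwx------
```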


Part II

Neural networks, mathematical framework, practical example.


Neural Network

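Definition 1. Let $\mathcal{NN}_{\mathbf{N},L,d}$ with $\mathbf{N} = (c = N_{L+1}, N_L, \ldots, N_2, N_1 = d)$ denote the set of all $L$-layer neural networks. That is, all mappings $f : \mathbb{R}^d \to \mathbb{R}^c$ of the form

\[ f(x) = W_L(\ldots \rho(W_2(\rho(W_1(x)))) \ldots), \quad x \in \mathbb{R}^d, \]

where $W_j z = A_j z + b_j$, $A_j \in \mathbb{R}^{N_{j+1} \times N_j}$, $b_j \in \mathbb{R}^{N_{j+1}}$,

and $\rho : \mathbb{R} \to \mathbb{R}$ is a non-linear function that acts elementwise on a vector.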
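To make the definition concrete, here is a minimal pure-Python sketch of evaluating such a network. The layer widths, the choice of ReLU for ρ and the helper random_layer are assumptions made for illustration; each A_j is stored as a list of rows, so it maps a vector of length N_j to one of length N_{j+1}.

```python
import random

def affine(A, b, z):
    """Return A z + b for a matrix A (list of rows) and vectors b, z."""
    return [sum(a_ij * z_j for a_ij, z_j in zip(row, z)) + b_i
            for row, b_i in zip(A, b)]

def relu(v):
    """Elementwise rho(x) = max(0, x)."""
    return [max(0.0, x) for x in v]

def forward(layers, x, rho=relu):
    """Evaluate f(x) = W_L(... rho(W_2(rho(W_1 x))) ...).

    `layers` is a list of (A_j, b_j) pairs; rho is applied after every
    affine map except the last one.
    """
    z = x
    for j, (A, b) in enumerate(layers):
        z = affine(A, b, z)
        if j < len(layers) - 1:
            z = rho(z)
    return z

def random_layer(n_in, n_out, rng):
    """Hypothetical helper: Gaussian weights and zero bias."""
    A = [[rng.gauss(0.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    return A, [0.0] * n_out

rng = random.Random(0)
widths = [3, 4, 2]  # (N_1, N_2, N_3) = (d, hidden width, c), illustrative
layers = [random_layer(widths[j], widths[j + 1], rng)
          for j in range(len(widths) - 1)]
y = forward(layers, [1.0, -2.0, 0.5])
print(y)
```

In practice a framework such as TensorFlow performs the same computation with matrix routines on the GPU; the loop structure is the same.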


Choices of ρ

$\rho : \mathbb{R} \to \mathbb{R}$ acts elementwise on a vector.

- Sigmoid: $\rho(x) = 1/(1 + e^{-x})$
- ReLU: $\rho(x) = \max(0, x)$
- tanh: $\rho(x) = \tanh(x)$
- Leaky ReLU: $\rho(x) = \begin{cases} x & x \ge 0 \\ \alpha x & x < 0 \end{cases}$
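The four functions are short enough to write out directly. The sketch below uses only the math module; the default slope alpha=0.01 for the leaky ReLU is an assumed value, and the sigmoid is not hardened against overflow for very large negative inputs.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def relu(x):
    return max(0.0, x)

def leaky_relu(x, alpha=0.01):
    # alpha is the slope for negative inputs (0.01 is an assumed default)
    return x if x >= 0 else alpha * x

def apply_elementwise(rho, vector):
    """Apply a scalar function rho to every entry of a vector."""
    return [rho(x) for x in vector]

v = [-2.0, 0.0, 3.0]
print(apply_elementwise(sigmoid, v))
print(apply_elementwise(relu, v))
print(apply_elementwise(math.tanh, v))
print(apply_elementwise(leaky_relu, v))
```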


Choices of ρ

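Max pooling:

\[ \rho \begin{pmatrix} x_1 \\ \vdots \\ x_N \end{pmatrix} = \begin{pmatrix} \max\{x_1, x_2\} \\ \vdots \\ \max\{x_{N-1}, x_N\} \end{pmatrix} \]

Average pooling (linear map):

\[ \rho \begin{pmatrix} x_1 \\ \vdots \\ x_N \end{pmatrix} = \begin{pmatrix} \frac{x_1 + x_2}{2} \\ \vdots \\ \frac{x_{N-1} + x_N}{2} \end{pmatrix} \]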
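A small sketch of both pooling maps, under the assumption that the pairs are non-overlapping, i.e. (x_1, x_2), (x_3, x_4), and so on, so that N must be even. The function names are made up for illustration.

```python
def max_pool_pairs(x):
    """Map (x_1, ..., x_N) to (max{x_1, x_2}, ..., max{x_{N-1}, x_N})."""
    if len(x) % 2 != 0:
        raise ValueError("expected an even number of entries")
    return [max(x[i], x[i + 1]) for i in range(0, len(x), 2)]

def avg_pool_pairs(x):
    """Map (x_1, ..., x_N) to ((x_1 + x_2)/2, ..., (x_{N-1} + x_N)/2)."""
    if len(x) % 2 != 0:
        raise ValueError("expected an even number of entries")
    return [(x[i] + x[i + 1]) / 2 for i in range(0, len(x), 2)]

x = [1.0, 4.0, -2.0, 0.0]
print(max_pool_pairs(x))  # [4.0, 0.0]
print(avg_pool_pairs(x))  # [2.5, -1.0]
```

Average pooling is linear, so it could equally be written as a matrix A_j; max pooling is not.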


Neural Network (Alternative definition)

Directed acyclic graph

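Nodes of the graph, from input to output:

- Input: $x$
- $z_1 = A_1 x + b_1$
- $z_2 = \rho_1(z_1)$
- $z_3 = A_2 z_2 + b_2$
- $z_4 = A_3 x + b_3$
- $z_5 = \rho_2(z_4)$
- $z_6 = z_3 + z_5$
- $z_7 = \rho_3(z_6)$
- Output: $z_7$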


What is machine learning?


Machine learning model

- Training set: $S = (z_1, \ldots, z_m) \subset \mathcal{Z}$, where each $z_i$ is drawn i.i.d. from an unknown probability distribution $\mathcal{D}$ over $\mathcal{Z} \subset \mathbb{R}^d$.
- Function class: $\mathcal{F}$, a class of functions/hypotheses.
- Cost function: $C : \mathcal{F} \times \mathcal{Z} \to \mathbb{R}$.
- Risk: $R_{\mathcal{D}}(f) := \mathbb{E}_{z \sim \mathcal{D}}\, C(f, z)$, where $z \sim \mathcal{D}$ is independent of $S$.
- Goal: Find a "good hypothesis" $\hat{f} \in \mathcal{F}$ based on $S$ such that $R_{\mathcal{D}}(\hat{f})$ is small.

Shalev-Shwartz & Ben-David, Understanding Machine Learning: From Theory to Algorithms, Cambridge University Press, 2014.


Examples

Binary classification
- Training set: $\{(x_i, y_i)\}_{i=1}^m \subset \mathbb{R}^d \times \{0, 1\}$.
- Function class: $\mathcal{F}$ can be a set of linear classifiers, neural networks, decision trees.
- Cost function: $C(f, (x_i, y_i)) = \mathbb{1}_{\{y_i \neq f(x_i)\}}$.

Linear regression
- Training set: $\{(x_i, y_i)\}_{i=1}^m \subset \mathbb{R}^d \times \mathbb{R}$.
- Function class: $\mathcal{F} = \{\langle \cdot, \theta \rangle : \theta \in \mathbb{R}^{d+1}\}$.
- Cost function: $C(f, (x_i, y_i)) = (y_i - \langle [x_i, 1], \theta \rangle)^2$.

Clustering
- Training set: $S = \{z_i\}_{i=1}^m \subset \mathbb{R}^d$.
- Function class: $\mathcal{F} = \{T = \{T_1, \ldots, T_k\} : \text{partition of } S \text{ with centers } (c_1, \ldots, c_k)\}$.
- Cost function: $C(T, z_i) = \|z_i - c_j\|$ for $z_i \in T_j$.


Machine learning model

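- Risk: $R_{\mathcal{D}}(f) := \mathbb{E}_{z \sim \mathcal{D}}\, C(f, z)$, where $z \sim \mathcal{D}$ is independent of $S$.
- Goal: Find a "good hypothesis" $\hat{f} \in \mathcal{F}$ based on $S$ such that $R_{\mathcal{D}}(\hat{f})$ is small. Notice: we cannot evaluate $R_{\mathcal{D}}(f)$ since $\mathcal{D}$ is unknown.

Empirical Risk Minimization

Approximate $R_{\mathcal{D}}(f)$ by

\[ R_S(f) = \frac{1}{|S|} \sum_{z \in S} C(f, z). \]

We seek to find $f^\sharp \in \operatorname{argmin}_{f \in \mathcal{F}} R_S(f)$.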
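As a toy illustration of empirical risk minimization, the sketch below takes a finite function class of threshold classifiers on the real line and the 0-1 cost, and picks the member with the smallest empirical risk. The data points and thresholds are made up for the example.

```python
def empirical_risk(f, sample, cost):
    """R_S(f) = (1/|S|) * sum of C(f, z) over z in S."""
    return sum(cost(f, z) for z in sample) / len(sample)

def zero_one_cost(f, z):
    """0-1 cost: 1 if the prediction is wrong, 0 otherwise."""
    x, y = z
    return 0.0 if f(x) == y else 1.0

def make_threshold(t):
    """Classifier that predicts 1 when x >= t and 0 otherwise."""
    return lambda x: 1 if x >= t else 0

# Made-up training set of (x, y) pairs with one "noisy" label at x = 0.2.
sample = [(0.1, 0), (0.4, 0), (0.5, 1), (0.8, 1), (0.9, 1), (0.2, 1)]

# Finite function class F, indexed by the threshold.
thresholds = [0.0, 0.25, 0.45, 0.7, 1.0]
function_class = {t: make_threshold(t) for t in thresholds}

risks = {t: empirical_risk(f, sample, zero_one_cost)
         for t, f in function_class.items()}
best_t = min(risks, key=risks.get)
print(risks)
print("empirical risk minimizer: threshold", best_t)
```

For neural networks the function class is not finite, so the exhaustive search is replaced by gradient-based optimization over the weights.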


Bias-Complexity tradeoff

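Let

\[ \varepsilon_{\mathrm{approx}} = \min_{f \in \mathcal{F}} R_{\mathcal{D}}(f) \quad \text{and} \quad f^\sharp \in \operatorname{argmin}_{f \in \mathcal{F}} R_S(f). \]

Then

\[ R_{\mathcal{D}}(f^\sharp) = \underbrace{\varepsilon_{\mathrm{approx}}}_{\text{approximation error}} + \underbrace{R_{\mathcal{D}}(f^\sharp) - \varepsilon_{\mathrm{approx}}}_{\text{estimation error}} \]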


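Empirical Risk Minimization for Neural Networks

- Training set: $\{(x_i, y_i)\}_{i=1}^n \subset \mathbb{R}^d \times \mathbb{R}^c$.
- Function class: $\mathcal{F} = \mathcal{NN}_{\mathbf{N},L,d}$, parametrized by the weights $\theta = (\mathrm{vec}(A_1), b_1, \ldots, \mathrm{vec}(A_L), b_L)$, i.e. $f(\cdot, \theta) : \mathbb{R}^d \to \mathbb{R}^{N_{L+1}}$.
- Cost function: $C(f, (x_i, y_i)) = d(f(x_i, \theta), y_i)$. The function $d : \mathbb{R}^c \times \mathbb{R}^c \to \mathbb{R}_+$ is problem dependent.

1. $\theta \in \mathbb{R}^p$ is often referred to as the weights.

2. Define the loss function

\[ L(\theta) = \sum_{i=1}^{n} d(f(x_i, \theta), y_i) \]

3. Try to find $\theta \in \operatorname{argmin}_{\theta \in \mathbb{R}^p} L(\theta)$ using (stochastic) gradient descent.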


Convex Optimization – Boyd & Vandenberghe

"Nonlinear optimization (or nonlinear programming) is the term used to describe an optimization problem when the objective or constraint functions are not linear, but not known to be convex. Sadly, there are no effective methods for solving the general nonlinear programming problem (1.1). Even simple looking problems with as few as ten variables can be extremely challenging, while problems with a few hundreds of variables can be intractable. Methods for the general nonlinear programming problem therefore take several different approaches, each of which involves some compromise."

\[ \begin{array}{ll} \text{minimize} & f_0(x), \quad x \in \mathbb{R}^n \\ \text{subject to} & f_i(x) \le b_i, \quad i = 1, \ldots, m \end{array} \tag{1.1} \]

Boyd & Vandenberghe, Convex Optimization, Cambridge University Press, 2004.


Convex Optimization – Boyd & Vandenberghe

From the section on local optimization approaches to nonlinear optimization:

"Roughly speaking, local optimization methods are more art than technology. Local optimization is a well developed art, and often very effective, but it is nevertheless an art."

Boyd & Vandenberghe, Convex Optimization, Cambridge University Press, 2004.


Gradient Descent for Neural Networks

- Recall we wanted to minimize

\[ L(\theta) = \sum_{i=1}^{n} d(f(x_i, \theta), y_i) \]

Gradient descent gives the iterations

\[ \theta_{k+1} = \theta_k - \alpha_k \nabla L(\theta_k) \]

for some step length $\alpha_k > 0$.

- What happens to the computational cost if $n$ is very large, say $n \approx 1\,200\,000$?
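The iteration can be tried out on a tiny problem. The sketch below assumes a one-dimensional linear model f(x, θ) = a·x + b with squared error as d, a constant step length, and made-up noise-free data; none of these choices come from the slides. Note that every gradient evaluation loops over all n data points, which is exactly the cost the question above points at.

```python
def loss(theta, data):
    """L(theta) = sum_i (a*x_i + b - y_i)^2 for theta = (a, b)."""
    a, b = theta
    return sum((a * x + b - y) ** 2 for x, y in data)

def grad(theta, data):
    """Gradient of L with respect to (a, b); touches all n data points."""
    a, b = theta
    ga = sum(2 * (a * x + b - y) * x for x, y in data)
    gb = sum(2 * (a * x + b - y) for x, y in data)
    return ga, gb

def gradient_descent(theta, data, step, iterations):
    """theta_{k+1} = theta_k - step * grad L(theta_k), constant step."""
    for _ in range(iterations):
        ga, gb = grad(theta, data)
        theta = (theta[0] - step * ga, theta[1] - step * gb)
    return theta

# Made-up data on the line y = 4x + 2, with x in [-1, 1].
data = [(x / 10, 4 * (x / 10) + 2) for x in range(-10, 11)]
theta = gradient_descent((0.0, 0.0), data, step=0.01, iterations=500)
print(theta, loss(theta, data))
```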


Stochastic Gradient Descent for Neural Networks

- Create a partition $\{T_1, \ldots, T_k\}$ of the numbers $\{1, \ldots, n\}$, where each $|T_j| \le s$.

- Let

\[ G_j(\theta) = \sum_{i \in T_j} \nabla_\theta\, d(f(x_i, \theta), y_i) \]

- Perform the updates

    t = 0
    for e = 1, ..., M do
        for j = 1, ..., k do
            θ_{t+1} = θ_t − α_t G_j(θ_t)
            t = t + 1
    return θ_{kM}
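The loop above can be sketched in plain Python as follows. As assumptions for the example, the model is again f(x, θ) = a·x + b with squared error, the step length is constant, the partition is obtained by shuffling once with a fixed seed, and the data are made up and noise-free.

```python
import random

def minibatch_gradient(theta, batch):
    """G_j(theta): summed gradient of (a*x + b - y)^2 over one batch."""
    a, b = theta
    ga = sum(2 * (a * x + b - y) * x for x, y in batch)
    gb = sum(2 * (a * x + b - y) for x, y in batch)
    return ga, gb

def partition(n, s, rng):
    """Shuffle {0, ..., n-1} and cut it into batches T_j with |T_j| <= s."""
    indices = list(range(n))
    rng.shuffle(indices)
    return [indices[i:i + s] for i in range(0, n, s)]

def sgd(theta, data, step, epochs, batch_size, seed=0):
    rng = random.Random(seed)
    batches = partition(len(data), batch_size, rng)  # fixed partition
    t = 0
    for _ in range(epochs):          # e = 1, ..., M
        for T_j in batches:          # j = 1, ..., k
            ga, gb = minibatch_gradient(theta, [data[i] for i in T_j])
            theta = (theta[0] - step * ga, theta[1] - step * gb)
            t += 1
    return theta, t                  # t equals k * M at this point

# Made-up data on the line y = 4x + 2, with x in [-1, 1].
data = [(x / 10, 4 * (x / 10) + 2) for x in range(-10, 11)]
theta, updates = sgd((0.0, 0.0), data, step=0.02, epochs=400, batch_size=5)
print(theta, updates)
```

Each update now touches at most s data points instead of all n. Many implementations reshuffle the partition every epoch; the version here keeps it fixed to stay close to the pseudocode.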


Alternative update rules

GD with momentum, $0 < \gamma < 1$:

\[ v_{t+1} = \gamma v_t + \eta\, G_j(\theta_t) \]

\[ \theta_{t+1} = \theta_t - v_{t+1} \]

Individual scaling of the different parameters (Adagrad, RMSprop, Adam):

\[ \theta_{t+1} = \theta_t - D_t\, G_j(\theta_t) \]

$D_t$ is a diagonal matrix depending on some or all of the previously computed gradients.
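A minimal sketch of the momentum update, applied to a simple quadratic objective chosen only for illustration. For brevity the full gradient plays the role of G_j, and the values of η and γ are assumed, not recommended settings.

```python
def momentum_descent(grad, theta, eta, gamma, steps):
    """Iterate v <- gamma*v + eta*grad(theta); theta <- theta - v."""
    v = [0.0] * len(theta)
    for _ in range(steps):
        g = grad(theta)
        v = [gamma * v_i + eta * g_i for v_i, g_i in zip(v, g)]
        theta = [th_i - v_i for th_i, v_i in zip(theta, v)]
    return theta

def grad(theta):
    """Gradient of the illustrative objective
    L(theta) = (theta_0 - 1)^2 + 10 * (theta_1 + 2)^2."""
    return [2 * (theta[0] - 1.0), 20 * (theta[1] + 2.0)]

theta = momentum_descent(grad, [0.0, 0.0], eta=0.02, gamma=0.9, steps=500)
print(theta)  # approaches the minimizer (1, -2)
```

Setting gamma to 0 recovers plain gradient descent with step length eta.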


TensorFlow

    import tensorflow as tf
    import numpy as np

Most important tensors

- tf.Variable (must be initialized; gradients can be taken with respect to it)
- tf.placeholder (input to the network)
- tf.constant (constant values)
- tf.Tensor (output of an operation)

Important attributes

- shape (default is None, i.e. not specified)
- dtype (tf.float32, tf.int32, ...)
- name (a name will be assigned if not specified)


[Graph: x and A feed z1 = Ax; z1 and b feed z2 = z1 + b.]

- A: tf.Variable
- x: tf.placeholder
- z1: tf.Tensor
- b: tf.Variable, tf.placeholder or tf.constant
- z2: tf.Tensor


TensorFlow

    # Nodes in a graph
    a = tf.Variable(initial_value=np.random.randn(1, 3),
                    name='weights', dtype=tf.float32)
    b = tf.Variable(initial_value=[0], name='bias',
                    dtype=tf.float32)
    print(a)
    print(b)

    $ python3 program_name.py
    <tf.Variable 'weights:0' shape=(1, 3) dtype=float32_ref>
    <tf.Variable 'bias:0' shape=(1,) dtype=float32_ref>


Linear regression

    # Code generating all the data
    N = 50
    a_true = np.array([[4., -5, 3]], dtype=np.float32)
    b_true = np.array([2], dtype=np.float32)
    x_data = np.concatenate((np.random.randn(1, N),
                             np.random.uniform(size=[1, N]),
                             np.random.chisquare(df=3.0, size=(1, N))))
    noise = 0.01*np.random.randn(1, N)
    labels = np.dot(a_true, x_data) + b_true  # + noise

\[ a = \begin{pmatrix} 4 \\ -5 \\ 3 \end{pmatrix}, \quad b = 2, \quad x_i \in \mathbb{R}^3, \quad i = 1, \ldots, N \]

\[ x_i^T a + b = y_i, \quad i = 1, \ldots, N \]


TensorFlow

    # Nodes in a graph
    a = tf.Variable(initial_value=np.random.randn(1, 3),
                    name='weights', dtype=tf.float32)
    b = tf.Variable(initial_value=[0], name='bias',
                    dtype=tf.float32)
    X = tf.placeholder(dtype=tf.float32, name='data',
                       shape=[3, N])
    prediction = tf.linalg.matmul(a, X) + b  # TF graph
    print(X)
    print(prediction)

    $ python3 program_name.py
    Tensor("data:0", shape=(3, 50), dtype=float32)
    Tensor("add:0", shape=(1, 50), dtype=float32)


TensorFlow – Sessions

- Graphs only define the function you would like to compute.
- To execute a graph (function), open a tf.Session().

    init = tf.global_variables_initializer()

    with tf.Session() as sess:
        sess.run(init)  # All variables must be initialized
        # All relevant placeholders go into the feed_dict
        pred = sess.run(prediction, feed_dict={X: x_data})
        a_start = sess.run(a)
        print(a_start)
        print(pred)  # pred is a numpy array with values a*data + b

    $ python3 program_name.py
    [[-0.9025026 0.6354202 -0.09739944]]
    [[-0.86136425 0.6985589 0.51153713 1.2961135
    ...
    0.91275173 -1.0157912 -0.41740212 0.45071918
    0.3727951 -0.81552047]]


TensorFlow – Gradient Descent

    Y = tf.placeholder(dtype=tf.float32, name='label',
                       shape=[1, N])

    # Compute sum_{i} (y[i] - prediction[i])^2
    loss = tf.reduce_sum(tf.pow(prediction - Y, 2))

    nbr_epochs = 100
    step_length = 0.01  # often called learning rate
    optimizer = tf.train.GradientDescentOptimizer(
        step_length).minimize(loss)

    with tf.Session() as sess:
        sess.run(init)  # All variables must be initialized
        for epoch in range(nbr_epochs):
            # Do one gradient descent step
            sess.run(optimizer, feed_dict={X: x_data,
                                           Y: labels})
        a_pred, b_pred = sess.run([a, b])


NeurIPS (earlier NIPS)

Submitted papers

- 2016: 2406 submissions
- 2017: 3240 submissions
- 2018: ∼4900 submissions

Source: Twitter
