Introduction to Deep Learning

(c) Oleg Mygryn 2016

Neuron Model

1899 - Discovered. Santiago Ramón y Cajal

Neuron (diagram): three inputs (Input1, Input2, Input3), each with its own weight (weight 1, weight 2, weight 3), feed an Activation Function that produces the Output

1958 - Perceptron. Frank Rosenblatt

1984 - Backpropagation optimization

2010 - Recurrent and Deep FF nets

2012+ ….

ImageNet Challenge

150,000 images and 1000 object classes

top 5 suggestions, error rate, %

Deep Learning in the last 10 years

- AlexNet (2012) set off an explosion of industry interest. 5 convolutional layers

- ZF Net (2013) - 5 convolutional layers, improved on AlexNet

- GoogLeNet (2015) - 22 layers

- VGG Net (2014) - Oxford, 19 layers

In 2016:

- NVIDIA DGX-1 system. ~170 TFlops!!!

- Intel® Xeon Phi™ 7210. ~3 TFlops

Top AI scientists

Geoffrey E. Hinton: University of Toronto, AlexNet curator, researcher

Yann LeCun: Facebook, AI research group, working on AI since 1998

Andrew Ng: Chief Scientist of Baidu, Co-Founder of Coursera, Professor at Stanford University

Modeling Neuron

Human nervous system: ~86 billion neurons, ~100 trillion synapses

Activation functions: - sigmoid; - tanh; - ReLU;
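A single artificial neuron of the kind modeled here can be sketched as a weighted sum of its inputs passed through an activation function. This is only a minimal illustration: the bias term, the function names, and all numeric values below are assumptions chosen for the example, not taken from the slides.

```python
import math


def sigmoid(x):
    """Logistic sigmoid, one of the activation functions listed above."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron_output(inputs, weights, bias, activation=sigmoid):
    """Weighted sum of the inputs plus a bias, passed through an activation."""
    if len(inputs) != len(weights):
        raise ValueError("each input needs exactly one weight")
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


# Hypothetical values: three inputs and three weights, as in the neuron diagram.
inputs = [1.0, 2.0, 3.0]
weights = [0.5, -0.25, 0.0]
bias = 0.0

output = neuron_output(inputs, weights, bias)
print(output)
```

Swapping `activation` for another function (for example `math.tanh`) changes the neuron's response without touching the weighted-sum step.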

Layer-wise organization

Most networks are fully-connected

Layers are counted without the Input layer (the network on the picture is 3-layered)

Output layer - no activation function

4 + 4 + 1 = 9 neurons

[3 x 4] + [4 x 4] + [4 x 1] = 12 + 16 + 4 = 32 weights

4 + 4 + 1 = 9 biases

∑ = 41 learnable params
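The arithmetic above can be reproduced with a short helper. This is a sketch that assumes a fully-connected network with one bias per neuron; the function name `count_parameters` is made up for this example.

```python
def count_parameters(layer_sizes):
    """Count neurons, weights, biases, and total learnable parameters.

    layer_sizes lists the input size first, then the neuron count of each
    layer; the input layer itself holds no neurons, weights, or biases.
    """
    neurons = sum(layer_sizes[1:])
    weights = sum(n_in * n_out
                  for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))
    biases = neurons  # assumption: one bias per neuron
    return neurons, weights, biases, weights + biases


# The pictured network: 3 inputs, two hidden layers of 4 neurons, 1 output.
neurons, weights, biases, total = count_parameters([3, 4, 4, 1])
print(neurons, weights, biases, total)  # 9 32 9 41
```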

Modern NN ~100 million parameters with ~10-20 layers

Example: the Visual Geometry Group Network (Oxford) has 19 layers and 138 million parameters to learn

Activation Functions

- Rectifier (ReLU)

- TanH

- Sigmoid

- Binary
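The four activation functions named on this slide can be written in a few lines of plain Python. Treat this as a sketch: the threshold of the binary step (0, with the value 1 at exactly 0) is an assumed convention, and the sample inputs are arbitrary.

```python
import math


def binary_step(x):
    """Binary activation: 1 at or above the threshold 0 (assumed), else 0."""
    return 1.0 if x >= 0 else 0.0


def sigmoid(x):
    """Logistic sigmoid, squashing x into (0, 1); written to avoid overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def tanh(x):
    """Hyperbolic tangent, squashing x into (-1, 1)."""
    return math.tanh(x)


def relu(x):
    """Rectifier: passes positive values through, clips negatives to 0."""
    return max(0.0, x)


for x in (-2.0, 0.0, 2.0):
    print(f"x={x:5.1f}  binary={binary_step(x):.1f}  "
          f"sigmoid={sigmoid(x):.4f}  tanh={tanh(x):.4f}  relu={relu(x):.1f}")
```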

ReLU Rocks!

A network with ReLUs (solid line) reaches a 25% training error rate on CIFAR-10 six times faster than an equivalent network with tanh neurons (dashed line)

by Alex Krizhevsky

TanH example

Convolutional Neural Networks (CNN)

CNN Example 1

CNN Example 2

https://github.com/rasbt/python-machine-learning-book

Training network. Backpropagation
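As a rough illustration of training with backpropagation, the sketch below runs gradient descent on a tiny network. Everything in it is an assumption made for the example: the network shape (2 inputs, 2 tanh hidden neurons, 1 linear output neuron with no activation), the squared-error loss, the learning rate, the single training sample, and all names and starting values.

```python
import math


def forward(x, params):
    """Forward pass: tanh hidden layer, then a linear output neuron."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(params["w1"], params["b1"])]
    y_hat = sum(w * h for w, h in zip(params["w2"], hidden)) + params["b2"]
    return hidden, y_hat


def loss(x, y, params):
    """Squared-error loss for one sample."""
    return 0.5 * (forward(x, params)[1] - y) ** 2


def backward(x, y, params):
    """Gradients of the loss for every parameter, via the chain rule."""
    hidden, y_hat = forward(x, params)
    d_y = y_hat - y  # dLoss/dy_hat
    # Push the error back through the output weights and the tanh
    # derivative, which is 1 - h^2 for a hidden activation h.
    d_hidden = [d_y * w * (1.0 - h * h)
                for w, h in zip(params["w2"], hidden)]
    return {
        "w1": [[d * xi for xi in x] for d in d_hidden],
        "b1": d_hidden,
        "w2": [d_y * h for h in hidden],
        "b2": d_y,
    }


def sgd_step(params, grads, lr):
    """Return new parameters after one gradient-descent update."""
    return {
        "w1": [[w - lr * g for w, g in zip(w_row, g_row)]
               for w_row, g_row in zip(params["w1"], grads["w1"])],
        "b1": [b - lr * g for b, g in zip(params["b1"], grads["b1"])],
        "w2": [w - lr * g for w, g in zip(params["w2"], grads["w2"])],
        "b2": params["b2"] - lr * grads["b2"],
    }


# Hypothetical starting weights and a single training sample.
params = {"w1": [[0.1, -0.2], [0.4, 0.3]], "b1": [0.0, 0.0],
          "w2": [0.5, -0.5], "b2": 0.0}
x, y = [1.0, 2.0], 1.0

losses = [loss(x, y, params)]
for _ in range(50):
    params = sgd_step(params, backward(x, y, params), lr=0.1)
    losses.append(loss(x, y, params))

print(f"loss before: {losses[0]:.4f}, after: {losses[-1]:.6f}")
```

Real networks apply the same idea layer by layer over batches of data; libraries such as TensorFlow compute these gradients automatically.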

AlexNet. Image recognition samples

https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf

AlexNet (2012): Illustration of the Architecture

Train time on two GTX 580 3GB GPUs: 5-6 days. “All of our experiments suggest that our results can be improved simply by waiting for faster GPUs and bigger datasets to become available” (c) Alex Krizhevsky

GoogLeNet (2015): Inception

9 Inception modules in the whole architecture, with over 100 layers in total

http://www.deepdrumpf2016.com/

@DeepDrumpf

Deconvolution

@ Google

tensorflow.org launched in Nov 2015.

- most popular ML library

- GitHub: 35,000 stars, 15,000 forks

- 350 contributors

http://tensorflow.org

TensorFlow

- Python API

- C++ API (poorly documented)

- Java API ??? (TBA in 201X)

API

Features

- CPU or multiple CPUs, GPU or multiple GPUs;

- Async computation with lazy loading of the execution graph;

- Many algorithms are already implemented;

- TensorBoard: graph execution visualisation + debugging;

import tensorflow as tf

# TensorFlow 1.x graph-mode API: build a constant op, then run it in a session.
hello = tf.constant('Hello, TensorWorld!')
sess = tf.Session()
print(sess.run(hello))

Thank YOU

1. www.cs231n.github.io

2. www.tensorflow.org

3. www.playground.tensorflow.org

4. www.yann.lecun.com

5. www.kdnuggets.com

6. www.devblogs.nvidia.com
