Introduction to deep learning
Zeynep Su Kurultay

Transcript of Introduction to deep learning

Page 1: Introduction to deep learning

INTRODUCTION TO DEEP LEARNING
Zeynep Su Kurultay

Page 2: Introduction to deep learning

Outline

■ Modeling humans in machines
■ Introduction to neural nets
■ What makes an algorithm intelligent?
■ Learning
– Supervised learning
■ Deep learning
– Neural nets in detail
■ Framework discussion & sample code
■ Future

Page 3: Introduction to deep learning

Modeling humans in machines

Page 4: Introduction to deep learning

Modeling humans in machines
But why?

Page 5: Introduction to deep learning

Neural networks

■ The mammalian brain is organized in a deep architecture (Serre, Kreiman, Kouh, Cadieu, Knoblich, & Poggio, 2007); e.g. the visual system has 5 to 10 levels

■ Very popular in the early 1990s, but fell out of favor when the networks of the time were found to perform poorly

■ Why is it gaining power again now? Deep architectures may be able to represent some functions that are otherwise not efficiently representable. Breakthrough in 2006/2007 with the Hinton and Bengio papers

Page 6: Introduction to deep learning

Examples around us

Page 7: Introduction to deep learning

Examples around us

Date: November 2014

Page 8: Introduction to deep learning

Examples around us

Page 9: Introduction to deep learning

Examples around us

Page 10: Introduction to deep learning

Examples around us

Page 11: Introduction to deep learning

Examples around us

Image: NasenSpray/Imgur

Page 12: Introduction to deep learning

Examples around us

Image: http://www.telegraph.co.uk/technology/google/11730050/deep-dream-best-images.html?frame=3370388

Page 13: Introduction to deep learning

Examples around us

Image: drkaugumon/Imgur

Page 14: Introduction to deep learning

What makes an algorithm intelligent?

Image courtesy of Toptal.com

Page 15: Introduction to deep learning

What makes an algorithm intelligent?

Page 16: Introduction to deep learning

Learning

■ Supervised machine learning: The program is “trained” on a pre-defined set of “training examples”, which then facilitate its ability to reach an accurate conclusion when given new data.

■ Semi-supervised machine learning: The program infers the unknown labels through “label propagation”, using similarities between examples to infer missing labels from existing ones.

■ Unsupervised machine learning: The program is given a bunch of data and must find patterns and relationships therein. – e.g. clustering via nearest neighbor algorithm
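As an illustrative sketch (not from the slides), the assignment step behind the nearest-neighbor style clustering mentioned above can be written in a few lines; the points and centroids here are made up:

```python
# Hypothetical sketch: grouping 1-D points by their nearest centroid,
# the core assignment step of nearest-neighbor style clustering.

def assign_to_nearest(points, centroids):
    """Return, for each point, the index of its closest centroid."""
    return [min(range(len(centroids)),
                key=lambda i: abs(p - centroids[i]))
            for p in points]

labels = assign_to_nearest([1.0, 1.2, 8.9, 9.1], centroids=[1.0, 9.0])
print(labels)  # the points split into two clusters: [0, 0, 1, 1]
```

No labels are given; structure emerges purely from distances between examples.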

Page 17: Introduction to deep learning

Supervised Learning

■ Binary classification: Does this person have that disease?
■ Regression: What is the market value of this house?
■ Multiclass classification: Digit recognition, face recognition

Page 18: Introduction to deep learning

Supervised Learning

■ Goal: Given a number of features, try to make sense of them!

■ Example: Employee satisfaction rates – what do they depend on? Given these features in a dataset, try to predict the rate.

Page 19: Introduction to deep learning

Supervised Learning

Page 20: Introduction to deep learning

Supervised Learning

Page 21: Introduction to deep learning

Supervised Learning

Page 22: Introduction to deep learning

Supervised Learning

Page 23: Introduction to deep learning

Supervised Learning

■ But how do we adjust ourselves? How do we know at each step we are getting better?

■ Measurement of wrongness: Loss functions
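A minimal sketch of such a “measurement of wrongness” (not from the slides, but a standard choice) is the mean squared error between predictions and true values:

```python
# Mean squared error: average of the squared differences between
# what the model predicted and what was actually observed.
def mse(predictions, targets):
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)

print(mse([2.5, 0.0], [3.0, -0.5]))  # 0.25
```

The smaller the loss, the better the current parameters fit the training data; training is the process of driving this number down.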

Page 24: Introduction to deep learning

Loss functions

Page 25: Introduction to deep learning

Gradient descent
How do we know how to “roll down the hill”?

The gradient (the derivatives of the loss function with respect to all of the individual feature weights, i.e. the parameters) tells us “which way is down”.
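The “rolling down the hill” idea can be sketched on a toy one-parameter loss (an illustrative example, not the slides’ code); here the loss is L(w) = (w − 3)², so its derivative 2(w − 3) is the gradient:

```python
# Minimal gradient descent sketch on the loss L(w) = (w - 3)**2.
# The gradient dL/dw = 2*(w - 3) tells us "which way is down".

def gradient_descent(w, lr=0.1, steps=100):
    for _ in range(steps):
        grad = 2 * (w - 3)   # slope of the loss at the current weight
        w -= lr * grad       # step against the gradient, i.e. downhill
    return w

w = gradient_descent(w=0.0)
print(round(w, 4))  # converges to the minimum at w = 3
```

In a real network the same update is applied simultaneously to millions of weights, with the gradient of the loss computed by backpropagation.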

Page 26: Introduction to deep learning

What exactly is deep learning?

■ “a network would need more than one hidden layer to be a deep network; networks with one or two hidden layers are traditional neural networks…”

■ “in my experience, a network can be considered deep when there is at least one hidden layer. Although the term deep learning can be fuzzy, …”

■ “in my own thinking, deep is not related to the number of layers, but it talks about how hard the feature to be discovered is…”

– a discussion from StackExchange

Page 27: Introduction to deep learning

Deep learning

■ What is the difference? Remember the quote from Yann LeCun from before? It goes on:

■ “A pattern recognition system is like a black box with a camera at one end, a green light and a red light on top, and a whole bunch of knobs on the front…. Now, imagine a box with 500 million knobs, 1,000 light bulbs, and 10 million images to train it with. That’s what a typical Deep Learning system is.”

Page 28: Introduction to deep learning
Page 29: Introduction to deep learning

Aim: Learning features

■ Deep learning excels in tasks where the basic unit (a single pixel, a single frequency, or a single word) has very little meaning in and of itself, but a combination of such units has a useful meaning. It can learn these useful combinations of values without any human intervention.

Page 30: Introduction to deep learning

Aim: Learning features(convolutional neural networks)

Page 31: Introduction to deep learning

Neural networks

■ An input layer, an output layer, and one or more hidden layers of units/neurons/perceptrons

■ Each connection between two neurons has a weight w. Best weights can again be found with gradient descent.

Image courtesy of http://ljs.academicdirect.org/A15/053_070.htm

Page 32: Introduction to deep learning

Neural networks

■ Example: input vector [7, 1, 2] goes into the input units

■ Forward propagation
■ Activation function

Image courtesy of http://ljs.academicdirect.org/A15/053_070.htm
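Forward propagation through one layer can be sketched as follows; the input vector [7, 1, 2] is from the slide, but the weights, biases, and the choice of sigmoid activation here are illustrative assumptions:

```python
import math

# Sketch of forward propagation: each neuron computes a weighted sum of
# its inputs plus a bias, then passes it through an activation function.
def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs, weights, biases):
    """One layer: sigmoid(w . x + b) for each neuron."""
    return [sigmoid(sum(w_i * x_i for w_i, x_i in zip(w, inputs)) + b)
            for w, b in zip(weights, biases)]

# Hypothetical 2-neuron hidden layer fed the slide's input vector.
hidden = forward([7, 1, 2],
                 weights=[[0.1, -0.2, 0.3], [0.05, 0.4, -0.1]],
                 biases=[0.0, 0.5])
print(hidden)  # two activations, each squashed into (0, 1)
```

Stacking several such layers, each feeding the next, is exactly what makes the network “deep”.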

Page 33: Introduction to deep learning

Neural networks

■ Why deep?
■ The number of parameterized transformations a signal encounters as it propagates from the input layer to the output layer, where a parameterized transformation is a processing unit that has trainable parameters, such as weights.

Image courtesy of http://ljs.academicdirect.org/A15/053_070.htm

Page 34: Introduction to deep learning

Aim: Learning features

■ The goal of deep learning methods is to learn higher-level features from lower-level features.

Page 35: Introduction to deep learning

Other important concepts

■ Overfitting – there is such a thing as learning too much (or too specifically)!

■ Regularization – a technique that prevents overfitting
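One common regularization technique (a sketch, not specified on the slides) is L2 weight decay: a penalty on large weights is added to the loss, discouraging the model from fitting the training data too specifically:

```python
# L2 regularization sketch: total loss = data loss + lambda * sum(w^2).
# Large weights now cost something, so gradient descent prefers
# simpler models that generalize better.
def regularized_loss(data_loss, weights, lam=0.01):
    return data_loss + lam * sum(w ** 2 for w in weights)

print(regularized_loss(1.0, [2.0, -1.0], lam=0.1))  # 1.0 + 0.1 * (4 + 1)
```

The hyperparameter `lam` trades off fitting the data against keeping the weights small.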

Page 36: Introduction to deep learning

Overfitting

■ Overfitting – there is such a thing as learning too much (or too specifically)!
■ Regularization – a technique that prevents overfitting

Page 37: Introduction to deep learning

Overfitting
U.S. Census Population over Time

Page 38: Introduction to deep learning

Different frameworks

■ Pylearn2, Lasagne, Caffe, Torch, Theano, Blocks, Plate, Crino, Theanet, DL4J, Keras, …

Page 39: Introduction to deep learning

Different frameworks

■ Theano:
– A mathematical expression compiler, designed with machine learning in mind.
– Lets you define an objective and automatically produces the code that computes the gradient of the objective.
– Good for experimenting with different loss functions
– Slightly lower layer of abstraction vs. more possibilities

Page 40: Introduction to deep learning

Different frameworks

■ Caffe:
– Developed by UC Berkeley
– Widely used machine-vision library that ported Matlab’s implementation of fast convolutional nets to C and C++
– Not intended for other deep-learning applications such as text, sound or time series data. CORRECTION: There are new implementations of RNNs and LSTMs in Caffe, so it is not only for images any more!
– Very fast: over 60M images per day with a single NVIDIA K40 GPU

Page 41: Introduction to deep learning

Different frameworks

■ Torch:
– Written in Lua (a scripting language developed in Brazil in the early 1990s)
– A highly customized version of it is used by large tech companies such as Google and Facebook

Page 42: Introduction to deep learning

Different frameworks

■ Keras:
– Minimalist, highly modular neural network library in the spirit of Torch
– Written in Python
– Uses Theano under the hood for optimized tensor manipulation on GPU and CPU
– It was developed with a focus on enabling fast experimentation
– 60K images took 30 hours on an Amazon g2.2xlarge

Page 43: Introduction to deep learning

Comparing Keras and Theano

MNIST digits dataset – serves as a benchmark to compare results with as new articles come out.

Multilayer Perceptron – basic feedforward neural network

Page 44: Introduction to deep learning

Demo
Code snippets – inside the gradient descent

Output = Wx + b
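The affine transform Output = Wx + b from the slide can be written out by hand (a plain-Python sketch; the actual demo used Theano tensor expressions, and the numbers below are made up):

```python
# The dense-layer transform Output = W x + b: a matrix-vector product
# plus a bias vector, computed one output unit at a time.
def linear(W, x, b):
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
            for row, b_i in zip(W, b)]

out = linear(W=[[1, 0], [0, 2]], x=[3, 4], b=[1, 1])
print(out)  # [4, 9]
```

During gradient descent it is the entries of `W` and `b` that get nudged downhill at every step.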

Page 45: Introduction to deep learning

Demo
Code snippets – inside the hidden layer

Page 46: Introduction to deep learning

Demo
Code snippets – inside the hidden layer

Page 47: Introduction to deep learning

Demo
Code snippets – inside the hidden layer

Page 48: Introduction to deep learning

Demo
Code snippets – inside the network

Page 49: Introduction to deep learning

Demo

■ https://algorithmia.com/demo/handwriting

Page 50: Introduction to deep learning

Future of deep learning

■ Deep learning has a lot of hype right now, and it is apparent that it is very useful for specific tasks.

■ What frontiers and challenges do you think are the most exciting for researchers in the field of neural networks in the next ten years?

■ I cannot see ten years into the future. For me, the wall of fog starts at about 5 years. ... I think that the most exciting areas over the next five years will be really understanding videos and text. I will be disappointed if in five years time we do not have something that can watch a YouTube video and tell a story about what happened. I have had a lot of disappointments.

– From Geoffrey Hinton’s AMA on Reddit

Page 51: Introduction to deep learning

Now & The future

Facebook Deep Learning, March 26, 2015
Image courtesy of Venturebeat.com

Page 52: Introduction to deep learning

Join us!

■ Open positions: https://angel.co/algorithmia/jobs/
– Algorithm Developer [this is me!]
– Backend Developer
– Product Manager
– Technical Evangelist

Page 53: Introduction to deep learning

Further resources

■ Introductory:
■ Andrew Ng’s Machine Learning course on Coursera
■ Geoffrey Hinton’s Neural Networks course on Coursera

■ Advanced:
■ Stanford’s Convolutional Neural Networks for Visual Recognition http://cs231n.github.io/
■ Who is afraid of non-convex loss functions? By Yann LeCun http://videolectures.net/eml07_lecun_wia/
■ What is wrong with Deep Learning? By Yann LeCun http://techtalks.tv/talks/whats-wrong-with-deep-learning/61639/

■ For those who like papers, recent advances:
■ Playing Atari with Deep Reinforcement Learning - http://www.cs.toronto.edu/~vmnih/docs/dqn.pdf
■ Unsupervised Face Detection - http://cs.stanford.edu/~quocle/faces_full.pdf

Page 54: Introduction to deep learning

■ Content:
■ Toptal.com, Deeplearning.net
■ http://www.computerworld.com/article/2918161/emerging-technology/the-ai-ecosystem.html
■ Introduction to Machine Learning CMU-10701 - Deep Learning slides

■ Images:
■ http://www.spyemporium.com/images/products/st-sc1720.jpg
■ http://stats.stackexchange.com/questions/128616/whats-a-real-world-example-of-overfitting
■ http://www.homedepot.com/catalog/productImages/1000/c4/c4c34d2e-56ce-4c11-94c0-67aa19b769fa_1000.jpg
■ http://www.bulborama.com/images/products/1933.jpg
■ https://xkcd.com/1122/, https://xkcd.com/1425/
■ www.deeplearning.net