Deep Learning Basics Lecture 1: Feedforward

Transcript
Page 1:

Deep Learning Basics Lecture 1: Feedforward

Princeton University COS 495

Instructor: Yingyu Liang

Page 2:

Motivation I: representation learning

Page 3:

Machine learning 1-2-3

β€’ Collect data and extract features

β€’ Build model: choose hypothesis class 𝓗 and loss function 𝑙

β€’ Optimization: minimize the empirical loss
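As a small, hypothetical illustration of these three steps (not part of the original slides), the sketch below builds features with a hand-picked feature map, chooses a linear hypothesis class with squared loss, and minimizes the empirical loss by gradient descent. All data and names are made up for the example.

```python
import numpy as np

# 1. Collect data and extract features (toy data, hand-picked feature map)
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=(100, 2))            # raw inputs
phi = lambda x: np.hstack([x, x**2])             # fixed feature map phi(x)
y = 3 * x[:, 0] - 2 * x[:, 1] ** 2 + 0.1 * rng.normal(size=100)

# 2. Build model: hypothesis class {w^T phi(x)} and squared loss
X = phi(x)
w = np.zeros(X.shape[1])
loss = lambda w: np.mean((X @ w - y) ** 2)

# 3. Optimization: minimize the empirical loss by gradient descent
lr = 0.1
for _ in range(500):
    grad = 2 * X.T @ (X @ w - y) / len(y)
    w -= lr * grad

print("final empirical loss:", loss(w))
```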

Page 4:

Features

[Figure: color histogram of an image, with red, green, and blue channels]

Extract features πœ™(π‘₯) from the input π‘₯, then build the hypothesis 𝑦 = 𝑀ᡀπœ™(π‘₯)

Page 5:

Features: part of the model

𝑦 = π‘€π‘‡πœ™ π‘₯build hypothesis

Linear model

Nonlinear model

Page 6:

Example: Polynomial kernel SVM

π‘₯1

π‘₯2

𝑦 = sign(π‘€π‘‡πœ™(π‘₯) + 𝑏)

Fixed πœ™ π‘₯

Page 7:

Motivation: representation learning

β€’ Why don’t we also learn πœ™(π‘₯)?

[Diagram: π‘₯ β†’ learn πœ™(π‘₯) β†’ learn 𝑀 β†’ 𝑦 = 𝑀ᡀπœ™(π‘₯)]

Page 8:

Feedforward networks

β€’ View each dimension of πœ™(π‘₯) as something to be learned

[Diagram: π‘₯ β†’ πœ™(π‘₯) β†’ 𝑦 = 𝑀ᡀπœ™(π‘₯)]

Page 9:

Feedforward networks

β€’ Linear functions πœ™α΅’(π‘₯) = πœƒα΅’α΅€π‘₯ don’t work: the composition π‘€α΅€πœ™(π‘₯) would still be linear in π‘₯, so we need some nonlinearity

[Diagram: π‘₯ β†’ πœ™(π‘₯) β†’ 𝑦 = 𝑀ᡀπœ™(π‘₯)]

Page 10:

Feedforward networks

β€’ Typically, set πœ™α΅’(π‘₯) = π‘Ÿ(πœƒα΅’α΅€π‘₯) where π‘Ÿ(β‹…) is some nonlinear function

[Diagram: π‘₯ β†’ πœ™(π‘₯) β†’ 𝑦 = 𝑀ᡀπœ™(π‘₯)]
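A minimal numpy sketch of this idea (my own illustration, not from the slides): each learned feature πœ™α΅’(π‘₯) = π‘Ÿ(πœƒα΅’α΅€π‘₯) is one row of a weight matrix followed by a nonlinearity, and the output is a linear function of those features. Sizes and the choice of tanh are assumptions for the example.

```python
import numpy as np

def forward(x, Theta, w, r=np.tanh):
    """One-hidden-layer feedforward network.

    phi_i(x) = r(theta_i^T x)  -> stacked as phi = r(Theta @ x)
    y        = w^T phi(x)
    """
    phi = r(Theta @ x)      # learned features, one per hidden unit
    return w @ phi          # linear readout of the learned features

# Example with made-up sizes: 3-dimensional input, 5 hidden units.
rng = np.random.default_rng(0)
x = rng.normal(size=3)
Theta = rng.normal(size=(5, 3))   # rows are the theta_i to be learned
w = rng.normal(size=5)            # output weights to be learned
print(forward(x, Theta, w))
```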

Page 11:

Feedforward deep networks

β€’ What if we go deeper?

[Diagram: π‘₯ β†’ β„Ž1 β†’ β„Ž2 β†’ … β†’ β„ŽπΏ β†’ 𝑦]

Page 12:

Figure from Deep Learning, by Goodfellow, Bengio, Courville. Dark boxes are things to be learned.

Page 13:

Motivation II: neurons

Page 14:

Motivation: neurons

Figure from Wikipedia

Page 15:

Motivation: abstract neuron model

β€’ The neuron is activated when the correlation between the input and a pattern πœƒ exceeds some threshold 𝑏

β€’ 𝑦 = threshold(πœƒα΅€π‘₯ βˆ’ 𝑏) or 𝑦 = π‘Ÿ(πœƒα΅€π‘₯ βˆ’ 𝑏)

β€’ π‘Ÿ(β‹…) is called the activation function

[Diagram: inputs π‘₯1, π‘₯2, …, π‘₯𝑑 feed into a single neuron with output 𝑦]

Page 16:

Motivation: artificial neural networks

Page 17:

Motivation: artificial neural networks

β€’ Put into layers: feedforward deep networks

[Diagram: π‘₯ β†’ β„Ž1 β†’ β„Ž2 β†’ … β†’ β„ŽπΏ β†’ 𝑦]

Page 18:

Components in Feedforward networks

Page 19:

Components

β€’ Representations:
  β€’ Input
  β€’ Hidden variables

β€’ Layers/weights:
  β€’ Hidden layers
  β€’ Output layer

Page 20:

Components

[Diagram: input π‘₯ β†’ first layer β†’ hidden variables β„Ž1, β„Ž2, …, β„ŽπΏ β†’ output layer β†’ 𝑦]

Page 21:

Input

β€’ Represented as a vector

β€’ Sometimes requires some preprocessing, e.g. (a sketch follows below):
  β€’ Subtract the mean
  β€’ Normalize to [-1, 1]
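A brief sketch of this kind of preprocessing (assumed details, not from the slides): subtract the per-feature mean, then rescale each feature into [-1, 1].

```python
import numpy as np

def preprocess(X):
    """Center each feature, then scale it into [-1, 1]."""
    X = X - X.mean(axis=0)                 # subtract mean
    max_abs = np.abs(X).max(axis=0)
    max_abs[max_abs == 0] = 1.0            # avoid dividing by zero for constant features
    return X / max_abs                     # normalize to [-1, 1]

X = np.array([[0.0, 10.0], [5.0, 20.0], [10.0, 60.0]])
print(preprocess(X))
```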

Page 22:

Output layers

β€’ Regression: 𝑦 = 𝑀ᡀℎ + 𝑏

β€’ Linear units: no nonlinearity

[Diagram: last hidden layer β„Ž β†’ output layer β†’ 𝑦]

Page 23:

Output layers

β€’ Multi-dimensional regression: 𝑦 = π‘Šα΅€β„Ž + 𝑏

β€’ Linear units: no nonlinearity

[Diagram: last hidden layer β„Ž β†’ output layer β†’ 𝑦]

Page 24:

Output layers

β€’ Binary classification: 𝑦 = 𝜎(𝑀ᡀℎ + 𝑏)

β€’ Corresponds to using logistic regression on β„Ž

[Diagram: last hidden layer β„Ž β†’ output layer β†’ 𝑦]

Page 25:

Output layers

β€’ Multi-class classification:
  β€’ 𝑦 = softmax(𝑧) where 𝑧 = π‘Šα΅€β„Ž + 𝑏
  β€’ Corresponds to using multi-class logistic regression on β„Ž

[Diagram: last hidden layer β„Ž β†’ scores 𝑧 β†’ output layer β†’ 𝑦]
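A hedged numpy sketch of this output layer (dimensions made up for illustration): compute 𝑧 = π‘Šα΅€β„Ž + 𝑏, then apply a numerically stable softmax to obtain class probabilities.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax: subtract the max before exponentiating."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

# Output layer for 3-way classification on a 4-dimensional hidden vector.
rng = np.random.default_rng(0)
h = rng.normal(size=4)          # last hidden layer
W = rng.normal(size=(4, 3))     # output weights
b = np.zeros(3)

z = W.T @ h + b                 # pre-activation scores
y = softmax(z)                  # class probabilities, sums to 1
print(y, y.sum())
```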

Page 26:

Hidden layers

β€’ Each neuron takes a weighted linear combination of the previous layer’s outputs

β€’ So each neuron can be thought of as outputting one value to the next layer

[Diagram: layer β„Žπ‘– β†’ layer β„Žπ‘–+1]

Page 27:

Hidden layers

β€’ 𝑦 = π‘Ÿ(𝑀𝑇π‘₯ + 𝑏)

β€’ Typical activation function π‘Ÿβ€’ Threshold t 𝑧 = 𝕀[𝑧 β‰₯ 0]

β€’ Sigmoid 𝜎 𝑧 = 1/ 1 + exp(βˆ’π‘§)

β€’ Tanh tanh 𝑧 = 2𝜎 2𝑧 βˆ’ 1

𝑦π‘₯π‘Ÿ(β‹…)

Page 28:

Hidden layers

β€’ Problem: saturation. Where the sigmoid/tanh curve flattens out, the gradient is too small for effective learning.

[Figure: sigmoid curve with the flat, saturated regions marked "too small gradient"; borrowed from Pattern Recognition and Machine Learning, Bishop]
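A small numerical illustration of saturation (not from the slides): the sigmoid gradient Οƒβ€²(𝑧) = 𝜎(𝑧)(1 βˆ’ 𝜎(𝑧)) is at most 0.25 and shrinks rapidly for large |𝑧|.

```python
import numpy as np

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
sigmoid_grad = lambda z: sigmoid(z) * (1.0 - sigmoid(z))   # derivative of sigmoid

for z in [0.0, 2.0, 5.0, 10.0]:
    print(f"z = {z:5.1f}   sigmoid'(z) = {sigmoid_grad(z):.6f}")
# The gradient peaks at 0.25 at z = 0 and is ~4.5e-5 at z = 10:
# deep in the saturated region there is almost no learning signal.
```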

Page 29:

Hidden layers

β€’ Activation function ReLU (rectified linear unit):
  β€’ ReLU(𝑧) = max{𝑧, 0}

Figure from Deep Learning, by Goodfellow, Bengio, Courville.

Page 30:

Hidden layers

β€’ Activation function ReLU (rectified linear unit):
  β€’ ReLU(𝑧) = max{𝑧, 0}
  β€’ Gradient is 0 for 𝑧 < 0 and 1 for 𝑧 > 0

Page 31:

Hidden layers

β€’ Generalizations of ReLU: gReLU(𝑧) = max{𝑧, 0} + 𝛼 min{𝑧, 0}

β€’ Leaky-ReLU(𝑧) = max{𝑧, 0} + 0.01 min{𝑧, 0}

β€’ Parametric-ReLU(𝑧): 𝛼 learnable

[Plot: gReLU(𝑧) as a function of 𝑧]
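A sketch of this ReLU family in numpy (illustrative only; the parametric version simply treats 𝛼 as a parameter learned along with the weights):

```python
import numpy as np

def g_relu(z, alpha):
    """Generalized ReLU: max{z, 0} + alpha * min{z, 0}."""
    return np.maximum(z, 0.0) + alpha * np.minimum(z, 0.0)

relu       = lambda z: g_relu(z, 0.0)    # plain ReLU
leaky_relu = lambda z: g_relu(z, 0.01)   # Leaky-ReLU: fixed small slope for z < 0
# Parametric-ReLU: same formula, but alpha is learned during training.

z = np.array([-2.0, -0.5, 0.0, 1.5])
print(relu(z))        # [0.    0.    0.    1.5 ]
print(leaky_relu(z))  # [-0.02  -0.005  0.     1.5 ]
```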