Neural Networks Review
Machine Learning 10-601 Fall 2012 Recitation
October 2 & 3, 2012 Petia Georgieva
Outline
• Biological neuron networks
• Artificial neuron model (perceptron)
• Activation functions
• Gradient Descent
• BackPropagation
Biological neuron networks
bird cerebellum
human hippocampus human cortex
http://en.wikipedia.org/wiki/Neuron
Rat neo-cortex
BlueBrainProject: http://mediatheque.epfl.ch/sv/Blue_brain
Artificial neuron model (McCulloch and Pitts, 1943)

$g: \mathbb{R}^n \to \mathbb{R}$

$\mathbf{x} = [x_1, x_2, \ldots, x_n]$ - neuron input vector

$s = \sum_{i=0}^{n} w_i x_i$, with $x_0 = 1$ (bias)

$y = f(s)$ - neuron scalar output

$f(\cdot)$ - activation function
Biological analogy
Activation functions

• Linear: $f(s) = \beta s$

• Step (Heaviside): $f(s) = \begin{cases} \beta_1, & s \ge 0 \\ \beta_2, & s < 0 \end{cases}$, normally $\beta_1 = 1$, $\beta_2 = 0$ (or $-1$)

• Ramp: $f(s) = \begin{cases} \beta, & s > \beta \\ s, & -\beta \le s \le \beta \\ -\beta, & s < -\beta \end{cases}$

• Sigmoid: $f(s) = \dfrac{1}{1 + \exp(-\lambda s)}$

• Tangent hyperbolic: $f(s) = \dfrac{\exp(\lambda s) - \exp(-\lambda s)}{\exp(\lambda s) + \exp(-\lambda s)}$
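The activation functions above can be sketched in plain Python (a minimal illustration; the function names and default parameter values are my own choices):

```python
import math

def linear(s, beta=1.0):
    # f(s) = beta * s
    return beta * s

def step(s, beta1=1.0, beta2=0.0):
    # f(s) = beta1 if s >= 0 else beta2 (beta2 is often 0 or -1)
    return beta1 if s >= 0 else beta2

def ramp(s, beta=1.0):
    # clip s to the interval [-beta, beta]
    return max(-beta, min(beta, s))

def sigmoid(s, lam=1.0):
    # f(s) = 1 / (1 + exp(-lam * s)), output in (0, 1)
    return 1.0 / (1.0 + math.exp(-lam * s))

def tanh_act(s, lam=1.0):
    # (exp(lam*s) - exp(-lam*s)) / (exp(lam*s) + exp(-lam*s)), output in (-1, 1)
    return math.tanh(lam * s)
```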
Discriminative capacity of one neuron
Linearly separable objects - there exists a hyperplane that separates the objects ($y > 0$ / $y < 0$). Rosenblatt's perceptron (late 1950s) is a simple learning machine with a step activation function. Example: logical AND and logical OR are linearly separable classification problems.
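A perceptron of this kind can be trained on logical AND with the classic error-correction rule (a minimal sketch; the learning rate and epoch count are illustrative choices, not from the slides):

```python
# Rosenblatt-style perceptron with a step activation learning logical AND.
def train_perceptron(data, eta=1.0, epochs=10):
    w = [0.0, 0.0, 0.0]  # [w0 (bias weight, input x0 = 1), w1, w2]
    for _ in range(epochs):
        for (x1, x2), d in data:
            y = 1 if w[0] + w[1] * x1 + w[2] * x2 >= 0 else 0
            err = d - y
            w[0] += eta * err        # bias input x0 = 1
            w[1] += eta * err * x1
            w[2] += eta * err * x2
    return w

def predict(w, x1, x2):
    return 1 if w[0] + w[1] * x1 + w[2] * x2 >= 0 else 0

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w = train_perceptron(AND)
# after training: predict(w, 1, 1) == 1 and all other inputs give 0
```

Because AND is linearly separable, the perceptron convergence theorem guarantees this loop reaches zero errors in finitely many updates.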
Limitations of the perceptron (Minsky and Papert, Perceptrons, 1969)
The exclusive-or (XOR) problem cannot be computed by a perceptron (or any two-layer network, counting the input layer). XOR problem: the output must be turned on when either of the inputs is turned on, but not when both are turned on. XOR is not a linearly separable problem.
Why a Hidden Layer? Idea 1: Recode the XOR problem in three dimensions so that it becomes linearly separable. By adding an appropriate extra feature, it is possible to linearly separate the two classes.
Idea 2: Another way to make a problem linearly separable is to add an extra (hidden) layer between the inputs and the outputs. Given a sufficient number of hidden units, it is possible to recode any unsolvable problem into a linearly separable one.
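The recoding idea can be made concrete with hand-picked weights (illustrative values, not learned): one hidden step unit computes OR, another computes AND, and the output unit fires when OR is on but AND is off, i.e. XOR:

```python
def step(s):
    # Heaviside step with beta1 = 1, beta2 = 0
    return 1 if s >= 0 else 0

def xor_net(x1, x2):
    # hidden unit 1: OR  (fires when x1 + x2 >= 1)
    h1 = step(x1 + x2 - 1)
    # hidden unit 2: AND (fires when x1 + x2 >= 2)
    h2 = step(x1 + x2 - 2)
    # output unit: "OR and not AND" -> XOR
    return step(h1 - 2 * h2 - 1)

# xor_net on (0,0), (0,1), (1,0), (1,1) gives 0, 1, 1, 0
```

In the hidden layer's (h1, h2) space the four input patterns become linearly separable, which is exactly the recoding the slide describes.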
Gradient Descent (GD)
Quadratic error function: $e_j = (d_j - y_j)^2$

$d_j$ - target output, $y_j$ - observed (NN) output
fig.: A. P. Engelbrecht, Computational Intelligence, An Introduction, John Wiley & Sons, 2002
Gradient Descent (GD)

Given data point $j$: $\mathbf{x}_j \equiv [x_{j1}, x_{j2}, \ldots, x_{jn}] \to d_j$

Each weight $w_i$ at iteration $t$ is updated as

$w_i(t) = w_i(t-1) + \Delta w_i(t)$

$\Delta w_i(t) = -\eta \left[ \dfrac{\partial e_j}{\partial w_i} \right]$

$\dfrac{\partial e_j}{\partial w_i} = -2\,(d_j - y_j)\, \dfrac{\partial f(s_j)}{\partial s_j}\, x_{ji}$

$x_{ji}$ - input $i$ of data point $j$, $\eta$ - learning rate

$f$ - neuron activation function. Any constraints on $f$?
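The constraint is that $f$ must be differentiable. The gradient formula can be sanity-checked numerically against finite differences; a sketch in Python using a sigmoid $f$ with $\lambda = 1$ and illustrative inputs, weights, and target:

```python
import math

# Numerical check of de_j/dw_i = -2 (d_j - y_j) f'(s_j) x_ji for sigmoid f.
def f(s):
    return 1.0 / (1.0 + math.exp(-s))

x = [1.0, 0.5, -1.5]   # x[0] = 1 is the bias input
w = [0.2, -0.4, 0.3]
d = 1.0

def error(w):
    s = sum(wi * xi for wi, xi in zip(w, x))
    return (d - f(s)) ** 2

s = sum(wi * xi for wi, xi in zip(w, x))
y = f(s)
# analytic gradient, using f'(s) = f(s) (1 - f(s)) for the sigmoid
analytic = [-2.0 * (d - y) * y * (1.0 - y) * xi for xi in x]

# central finite differences should match the analytic gradient closely
eps = 1e-6
for i in range(len(w)):
    wp, wm = list(w), list(w)
    wp[i] += eps
    wm[i] -= eps
    numeric = (error(wp) - error(wm)) / (2.0 * eps)
    assert abs(numeric - analytic[i]) < 1e-6
```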
GD - Sigmoid Activation Function

If $f(s) = \dfrac{1}{1 + \exp(-\lambda s)}$ $\;\Rightarrow\;$ $\dfrac{\partial f}{\partial s} = \lambda\, f(s)\,(1 - f(s))$

$\Delta w_i(t) = 2\eta\,(d_j - y_j)\,\lambda\, f(s_j)\,(1 - f(s_j))\, x_{ji}$

Least Mean Squares (Delta Rule)

If $f(s) = s$ (linear activation function) $\;\Rightarrow\;$ $\dfrac{\partial f}{\partial s} = 1$

$\Delta w_i(t) = 2\eta\,(d_j - y_j)\, x_{ji}$

Adaline (Widrow and Hoff, 1959) was the first hardware realization of a NN trained with the Delta Rule.
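The Delta Rule can be run end to end on a tiny regression problem (a minimal sketch; the target function $d = 3x + 1$, learning rate, and epoch count are made up for illustration):

```python
# Delta-rule (LMS) training of one linear neuron:
# delta_w_i = 2 * eta * (d_j - y_j) * x_ji, with x0 = 1 as the bias input.
def lms_train(data, eta=0.05, epochs=100):
    w = [0.0, 0.0]  # [w0 (bias weight), w1]
    for _ in range(epochs):
        for x, d in data:
            y = w[0] + w[1] * x          # linear activation: f(s) = s
            err = d - y
            w[0] += 2 * eta * err        # bias input x0 = 1
            w[1] += 2 * eta * err * x
    return w

# noiseless samples of d = 3x + 1; LMS should recover w close to [1, 3]
data = [(x, 3 * x + 1) for x in (-2, -1, 0, 1, 2)]
w = lms_train(data)
```

On consistent, noiseless data like this, the on-line updates shrink the error geometrically, so the learned weights approach the true line.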
Multilayer perceptron (MLP)
FeedForward Neural Network (FFNN)
Perceptron (one layer)
MLP as universal approximator
• A FFNN with one hidden layer and continuous, differentiable activation functions can approximate any continuous function.
1. G. Cybenko. Approximation by superposition of a sigmoidal function. Mathematics of Control, Signal and Systems, 2:303-314, 1989.
2. K. Hornik. Some new results on neural network approximation. Neural Networks, 6:1060-1072, 1993.
• A FFNN with two hidden layers and continuous, differentiable activation functions can approximate any function.
3. A.Lapedes and R. Farber. How neural nets work. In Anderson, editor, Neural Information Processing Systems, pages 442-456. New York, American Institute of Physics, 1987.
4. G. Cybenko. Continuous valued neural networks with two hidden layers are sufficient. Technical report, Dep. of Computer Science, Tufts University, Medford, MA, 1988.
BackPropagation (BP)
1. Initialize the NN weights. For each object of the training set compute:

2. Forward computations

$s_j = w_{j0} + \sum_i w_{ji}\, x_i$ - first layer

$s_j = w_{j0} + \sum_{i \in \text{previous layer}} w_{ji}\, y_i$ - hidden & output layers

$y_j = f(s_j)$ - output of neuron $j$
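The forward computations can be sketched for a small network (layer sizes, weight values, and the sigmoid choice are illustrative):

```python
import math

# Forward pass following s_j = w_j0 + sum_i w_ji * (input i), y_j = f(s_j).
def f(s):
    return 1.0 / (1.0 + math.exp(-s))  # sigmoid activation

def forward(x, layers):
    # layers: list of weight matrices; row j of a matrix is [w_j0, w_j1, ...]
    y = x
    for W in layers:
        y = [f(row[0] + sum(w_ji * y_i for w_ji, y_i in zip(row[1:], y)))
             for row in W]
    return y

layers = [
    [[0.1, 0.4, -0.3], [-0.2, 0.6, 0.8]],  # hidden layer: 2 inputs -> 2 units
    [[0.05, 0.7, -0.5]],                   # output layer: 2 -> 1
]
out = forward([1.0, 0.5], layers)  # a single sigmoid output in (0, 1)
```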
BackPropagation (BP)
3. Backward computations ($c$ - cost function)

$e_j = f'(s_j)\, \dfrac{\partial c}{\partial y_j}$ - output layer

$e_j = f'(s_j) \sum_{p \in \text{next layer}} w_{pj}\, e_p$ - hidden & input layers

4. Weights update (batch training)

$w_{ij} \leftarrow w_{ij} + \Delta w_{ij}$

$\Delta w_{ij} = -\dfrac{\eta}{N} \sum_{\mathbf{x} \in X} x_i\, e_j$ - first layer

$\Delta w_{ij} = -\dfrac{\eta}{N} \sum_{\mathbf{x} \in X} y_i\, e_j$ - hidden & output layers

5. Stop if the stopping conditions are met; if not, go back to step 2.
BackPropagation (BP)
4. Weights update (on-line training)

$\Delta w_{ij} = -\eta\, x_i\, e_j$ - first layer

$\Delta w_{ij} = -\eta\, y_i\, e_j$ - hidden & output layers

For a sigmoid $f$: $f'(s_j) = f(s_j)\,(1 - f(s_j))$, with $\lambda = 1$

Mean Squared Error (MSE) cost function

$c = \dfrac{1}{N} \sum_{j=1}^{N} (d_j - y_j)^2$

$\dfrac{\partial c}{\partial y_j} = -\dfrac{2}{N}\,(d_j - y_j)$
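Steps 2-4 can be stitched together for a single on-line update of a 2-2-1 sigmoid network (a sketch with $\lambda = 1$ and per-sample cost $(d - y)^2$; the weight values and training pair are illustrative). One gradient step should reduce the squared error on that sample:

```python
import math

# One on-line backpropagation update combining the forward, backward,
# and weight-update steps of the slides.
def f(s):
    return 1.0 / (1.0 + math.exp(-s))

def forward(W_hid, W_out, x):
    h = [f(w[0] + w[1] * x[0] + w[2] * x[1]) for w in W_hid]
    y = f(W_out[0] + W_out[1] * h[0] + W_out[2] * h[1])
    return h, y

W_hid = [[0.1, 0.4, -0.3], [-0.2, 0.6, 0.8]]  # hidden neurons: [w0, w1, w2]
W_out = [0.05, 0.7, -0.5]                     # output neuron
x, d = [1.0, 0.5], 1.0
eta = 0.1

h, y = forward(W_hid, W_out, x)
err_before = (d - y) ** 2

# backward: e_j = f'(s_j) * dc/dy_j at the output, with dc/dy = -2 (d - y)
e_out = y * (1.0 - y) * (-2.0) * (d - y)
# hidden layer: e_j = f'(s_j) * sum over next layer of w_pj * e_p
e_hid = [h[j] * (1.0 - h[j]) * W_out[j + 1] * e_out for j in range(2)]

# on-line update: delta_w_ij = -eta * (input feeding w_ij) * e_j
W_out = [W_out[0] - eta * 1.0 * e_out,
         W_out[1] - eta * h[0] * e_out,
         W_out[2] - eta * h[1] * e_out]
for j in range(2):
    W_hid[j] = [W_hid[j][0] - eta * 1.0 * e_hid[j],
                W_hid[j][1] - eta * x[0] * e_hid[j],
                W_hid[j][2] - eta * x[1] * e_hid[j]]

h, y = forward(W_hid, W_out, x)
err_after = (d - y) ** 2  # smaller than err_before for this small step
```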
Classification with MLP
An MLP creates a decision boundary between classes. Rule of thumb: each line segment of the boundary corresponds to one hidden-layer neuron.
(fig.: Engelbrecht, p.50)
Regression with MLP (function approximation)
Rule of thumb: the number of hidden-layer neurons equals the number of inflection points of the function to approximate.
(fig.: Engelbrecht, p.51)
How many hidden-layer neurons? Which NN architecture?
ANN Applications - System Identification
ANN Applications - Model Predictive Control (NN MPC)
ANN Applications - BRAIN MACHINE INTERFACE (BMI)