NEURAL NETWORKS, CELLULAR NEURAL NETWORKS AND ADAPTIVE FUZZY FILTERS
INTRODUCTION TO NEURAL NETWORKS — unibo.it, 2016-04-28
INTRODUCTION TO NEURAL NETWORKS
Complex computations: Mach’s Bands
Observe the transitions among the bands
Complex computations: Mach’s Bands
From: R. Pierantoni, La trottola di Prometeo, Laterza (1996)
Complex computations: Mach’s Bands
Observe the transitions among the bands
Complex computations: Mach’s Bands

[Figure: intensity profiles of the stimulus and of the percept]
A simple model of the retina neuron
[Plot: potential (mV) vs. incident intensity (photons/s) — a straight line]

Linear light-to-potential transducer: light in, potential out.
Neuron transduction
[Figure: a step-like light stimulus (photons/s) across 19 neurons, and the corresponding transduced potentials (mV)]
Adding lateral inhibition
[Figure: the same step-like stimulus (photons/s) across 19 neurons]

Each neuron inhibits its neighbors by 10% of their non-inhibited potential. E.g., inside the bright region: 160 − 0.1×160 − 0.1×160 = 128.
Adding lateral inhibition
[Figure: input intensities (photons/s) and inhibited potentials (mV) across 19 neurons]

Each neuron inhibits its neighbors by 10% of their non-inhibited potential:

160 − 0.1×160 − 0.1×40 = 140  (bright side of the edge)
40 − 0.1×160 − 0.1×40 = 20   (dark side of the edge)
40 − 0.1×40 − 0.1×40 = 32    (inside the dark region)
160 − 0.1×160 − 0.1×160 = 128 (inside the bright region)
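The inhibition arithmetic above can be sketched in a few lines of Python. The 10% factor and the 160/40 intensities are from the slides; treating each border neuron as its own missing neighbour is an assumption of this sketch.

```python
# Lateral inhibition as in the slides: each neuron loses 10% of each
# neighbour's non-inhibited potential. Border handling (each border
# neuron treated as its own missing neighbour) is an assumption.
def lateral_inhibition(potentials, k=0.1):
    n = len(potentials)
    out = []
    for i, p in enumerate(potentials):
        left = potentials[i - 1] if i > 0 else p
        right = potentials[i + 1] if i < n - 1 else p
        out.append(p - k * left - k * right)
    return out

# A step edge: bright region at 160, dark region at 40 (photons/s)
stimulus = [160] * 5 + [40] * 5
print(lateral_inhibition(stimulus))
# Flat regions settle at 128 and 32; at the edge the percept
# overshoots to 140 and undershoots to 20 — the Mach bands.
```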
Many identical computing units, each one performing very simple operations, can perform very complex computations when they are widely and specifically connected.

The “knowledge” is stored in the topology and in the strengths of the synapses.
Complex computations: Mach’s Bands
Model Neuron: McCulloch and Pitts
A neuron is a computational unit that:

1) performs the weighted sum of the input signals, computing the activation signal a

a = Σ_{i=1..d} wᵢ·xᵢ − θ

2) transforms the activation signal through a transfer function g, computing the output z

z = g(a)

w: synaptic weights; θ: activation threshold
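As a minimal sketch of the McCulloch and Pitts unit (the step transfer function and the example weights below are illustrative, not from the slides):

```python
# McCulloch-Pitts unit: weighted sum of the inputs minus a threshold,
# passed through a transfer function g (here a simple step function).
def mp_neuron(x, w, theta, g=lambda a: 1 if a > 0 else 0):
    a = sum(wi * xi for wi, xi in zip(w, x)) - theta  # activation signal
    return g(a)

# Illustrative values: two inputs with equal weights
print(mp_neuron([1, 1], [0.5, 0.5], theta=0.75))  # a = 0.25 > 0 -> 1
print(mp_neuron([1, 0], [0.5, 0.5], theta=0.75))  # a = -0.25    -> 0
```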
Transfer functions
[Plot: sigmoid transfer function rising from 0 to 1 over a ∈ (−10, 10)]

g(a) = 1 / (1 + e^(−a))

Usually, NON-linear functions are adopted.
Non-linearity

[Plot: sigmoid transfer function, 0 to 1 over a ∈ (−10, 10)]

The same variation in the input can give very different variations of the transferred signal, depending on the working point.
Artificial Neural Networks

wᵢⱼ: synaptic weights

Neuron i:
a = Σ_{i=1..d} wᵢ·xᵢ − θ
z = g(a)

The threshold can be implicitly considered by adding an extra neuron, always activated and connected to the current neuron with weight equal to −θ.
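This substitution can be checked numerically (the weights and inputs below are made up for illustration):

```python
# Threshold as an extra always-on input: a = sum(w_i * x_i) - theta equals
# the plain weighted sum once we append input 1 with weight -theta.
x, w, theta = [0.2, 0.9], [0.5, -0.3], 0.1   # illustrative values

a_explicit = sum(wi * xi for wi, xi in zip(w, x)) - theta
a_absorbed = sum(wi * xi for wi, xi in zip(w + [-theta], x + [1.0]))
print(a_explicit == a_absorbed)  # True: the two forms are identical
```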
Topology of artificial neural networks

The topology of the connections among the neurons defines the network class. We will take into consideration only the feed-forward architectures, where the neurons are organized into hierarchical layers and the signal flows in just one direction.

Perceptrons — 2 layers: input and output

zⱼ = g( Σᵢ wᵢⱼ·xᵢ )
Neural networks and logical operators

[Diagram: input neurons 1 and 2 connected to output neuron 3]

OR: w₁₃ = 0.5, w₂₃ = 0.5, θ₃ = 0.25

x₁ x₂ | a₃     z₃
1  0  |  0.25  1
0  1  |  0.25  1
1  1  |  0.75  1
0  0  | −0.25  0
Neural networks and logical operators

AND: w₁₃ = 0.5, w₂₃ = 0.5, θ₃ = 0.75

x₁ x₂ | a₃     z₃
1  0  | −0.25  0
0  1  | −0.25  0
1  1  |  0.25  1
0  0  | −0.75  0
Neural networks and logical operators

NOT(1): w₁₃ = −0.5, w₂₃ = 0.1, θ₃ = −0.25

x₁ x₂ | a₃     z₃
1  0  | −0.25  0
0  1  |  0.35  1
1  1  | −0.15  0
0  0  |  0.25  1
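The three gates above can be verified with a short script; a step transfer function (z = 1 if a > 0, else 0) is assumed:

```python
# Step-activation perceptron evaluating the logic gates from the slides.
def neuron(x1, x2, w13, w23, theta):
    a = w13 * x1 + w23 * x2 - theta
    return 1 if a > 0 else 0   # step transfer function

OR   = lambda x1, x2: neuron(x1, x2, 0.5, 0.5, 0.25)
AND  = lambda x1, x2: neuron(x1, x2, 0.5, 0.5, 0.75)
NOT1 = lambda x1, x2: neuron(x1, x2, -0.5, 0.1, -0.25)  # NOT of input 1

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, OR(x1, x2), AND(x1, x2), NOT1(x1, x2))
```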
Supervised artificial neural networks

Feed-forward artificial neural networks can be trained starting from examples with known solutions.

Error function
Given a set of examples xⁱ with known desired outputs dⁱ, and given a network with parameters w, the square error is computed from the output of the network z (j runs over the output neurons):

E = ½ Σ_{i,j} ( zⱼ(xⁱ, w) − dⱼⁱ )²

The training procedure consists in finding the parameters w that minimize the error: iterative minimization algorithms are adopted. However, they do NOT guarantee reaching the global minimum.
Training a perceptron

We consider a differentiable transfer function:

g(a) = 1 / (1 + e^(−a))

g′(a) = e^(−a) / (1 + e^(−a))² = g(a)·(1 − g(a))

[Diagram: perceptron with inputs x₁, x₂ and outputs z₁, z₂]

aⱼ = Σ_{l=1..d} w_{lj}·x_l        zⱼ = g(aⱼ)

E = ½ Σ_{i,j} ( zⱼ(xⁱ, w) − dⱼⁱ )²

Given some initial parameters w, by the chain rule:

∂E/∂w_{lj} = Σᵢ [ ∂E/∂zⱼ(xⁱ, w) ]·[ ∂zⱼ(xⁱ, w)/∂aⱼ(xⁱ, w) ]·[ ∂aⱼ(xⁱ, w)/∂w_{lj} ]

with

∂E/∂zⱼ(xⁱ, w) = zⱼ(xⁱ, w) − dⱼⁱ
∂zⱼ(xⁱ, w)/∂aⱼ(xⁱ, w) = g′(a)
∂aⱼ(xⁱ, w)/∂w_{lj} = x_lⁱ

deviation: δⱼⁱ = ( zⱼ(xⁱ, w) − dⱼⁱ )·g′(aⱼⁱ)
Training a perceptron

Then:

∂E/∂w_{lj} = Σᵢ ( zⱼ(xⁱ, w) − dⱼⁱ )·g′(aⱼⁱ)·x_lⁱ = Σᵢ δⱼⁱ·x_lⁱ

Using the gradient we can update the weights with the “steepest descent” procedure:

w_{lj} ← w_{lj} − η·∂E/∂w_{lj}

η is the learning rate:
Too low: slow training
Too high: the minima can be lost

Convergence: ∂E/∂w_{lj} = 0
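The gradient formula can be sanity-checked against a finite-difference estimate of ∂E/∂w; the weights, threshold and examples below are illustrative values, not from the slides:

```python
import math

# Finite-difference check of the perceptron gradient formula
# dE/dw_l = sum_i (z_i - d_i) * g'(a_i) * x_{l,i}, with sigmoid g.
def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def error(w, theta, examples):
    return 0.5 * sum((sigmoid(x1*w[0] + x2*w[1] - theta) - d) ** 2
                     for x1, x2, d in examples)

def analytic_grad(w, theta, examples):
    grad = [0.0, 0.0]
    for x1, x2, d in examples:
        z = sigmoid(x1*w[0] + x2*w[1] - theta)
        delta = (z - d) * z * (1 - z)   # deviation: (z - d) * g'(a)
        grad[0] += delta * x1
        grad[1] += delta * x2
    return grad

examples = [(1, 0, 1), (0, 1, 1), (0, 0, 0)]   # illustrative values
w, theta, eps = [0.3, -0.2], 0.1, 1e-6

checks = []
for l in range(2):
    wp, wm = list(w), list(w)
    wp[l] += eps
    wm[l] -= eps
    numeric = (error(wp, theta, examples) - error(wm, theta, examples)) / (2*eps)
    checks.append(abs(numeric - analytic_grad(w, theta, examples)[l]) < 1e-8)

print(checks)  # [True, True]: analytic and numeric gradients agree
```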
Steepest descent finds the minimum of a function by always pointing in the direction that leads downhill.
Steepest descent finds the LOCAL minimum of a function by always pointing in the direction that leads downhill.
Gradient

f: Rⁿ → R, with f(x) of class C², is the objective function.

The gradient of f is a vector containing all the first-order partial derivatives.
The Gradient is locally perpendicular to level curves

Given a function f(x, y) and a level curve f(x, y) = c, the gradient of f is:

∇f = ( ∂f/∂x , ∂f/∂y )

Consider 2 points of the curve, (x, y) and (x+εₓ, y+ε_y), for small ε:

f(x+εₓ, y+ε_y) ≈ f(x, y) + (∂f/∂x)·εₓ + (∂f/∂y)·ε_y = f(x, y) + ∇f(x, y)ᵀ·ε
The local perpendicular to a curve: Gradient

Since both points (x, y) and (x+εₓ, y+ε_y) satisfy the curve equation:

c = c + ∇f(x, y)ᵀ·ε   ⟹   ∇f(x, y)ᵀ·ε = 0

The gradient is perpendicular to ε. For small ε, ε is parallel to the curve and, by consequence, the gradient is perpendicular to the curve.

The gradient points towards the direction of maximum increase of f.
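A small numeric illustration of this property, for the hypothetical objective f(x, y) = x² + y², whose level curves are circles:

```python
# For f(x, y) = x^2 + y^2 the level curves are circles, and the gradient
# (2x, 2y) is perpendicular to them (parallel to the radius) and points
# outward, toward increasing f.
def grad_f(x, y, eps=1e-6):
    f = lambda x, y: x**2 + y**2
    return ((f(x + eps, y) - f(x - eps, y)) / (2 * eps),
            (f(x, y + eps) - f(x, y - eps)) / (2 * eps))

x, y = 0.6, 0.8                       # a point on the unit circle
gx, gy = grad_f(x, y)
tangent = (-y, x)                     # tangent to the circle at (x, y)
dot = gx * tangent[0] + gy * tangent[1]
print(gx, gy, dot)                    # dot ~ 0: gradient is perpendicular
```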
Steepest descent finds the LOCAL minimum of a function by always pointing in the direction that leads downhill.
Example: OR

w₁₃ = 0, w₂₃ = 0, θ₃ = 0, η = 2

g(a) = 1 / (1 + e^(−a)),   g′(a) = g(a)·(1 − g(a)) = z·(1 − z)

∂E/∂w_{lj} = Σᵢ ( zⱼ(xⁱ) − dⱼⁱ )·g′(aⱼⁱ)·x_lⁱ = Σᵢ δⱼⁱ·x_lⁱ

Training examples:
x₁ x₂ d | a  z    E      ∂E/∂w₁₃  ∂E/∂w₂₃  ∂E/∂θ₃
1  0  1 | 0  0.5  0.125  −0.125   0        0.125
0  1  1 | 0  0.5  0.125  0        −0.125   0.125
0  0  0 | 0  0.5  0.125  0        0        −0.125
0  0  0 | 0  0.5  0.125  0        0        −0.125
Totals:           0.5    −0.125   −0.125   0
Example: OR, Step 1

w₁₃ = 0.25, w₂₃ = 0.25, θ₃ = 0, η = 2

Training examples:
x₁ x₂ d | a     z     E      ∂E/∂w₁₃  ∂E/∂w₂₃  ∂E/∂θ₃
1  0  1 | 0.25  0.56  0.096  −0.108   0        0.108
0  1  1 | 0.25  0.56  0.096  0        −0.108   0.108
0  0  0 | 0     0.5   0.125  0        0        −0.125
0  0  0 | 0     0.5   0.125  0        0        −0.125
Totals:               0.442  −0.108   −0.108   −0.035
Example: OR, Step 2

w₁₃ = 0.466, w₂₃ = 0.466, θ₃ = 0.069, η = 2

Training examples:
x₁ x₂ d | a      z      E      ∂E/∂w₁₃  ∂E/∂w₂₃  ∂E/∂θ₃
1  0  1 | 0.397  0.598  0.081  −0.097   0        0.097
0  1  1 | 0.397  0.598  0.081  0        −0.097   0.097
0  0  0 | −0.069 0.483  0.117  0        0        −0.121
0  0  0 | −0.069 0.483  0.117  0        0        −0.121
Totals:                 0.395  −0.097   −0.097   −0.048
Example: OR, Step 3

w₁₃ = 0.659, w₂₃ = 0.659, θ₃ = 0.164, η = 2

Training examples:
x₁ x₂ d | a      z      E      ∂E/∂w₁₃  ∂E/∂w₂₃  ∂E/∂θ₃
1  0  1 | 0.494  0.621  0.072  −0.089   0        0.089
0  1  1 | 0.494  0.621  0.072  0        −0.089   0.089
0  0  0 | −0.164 0.459  0.105  0        0        −0.114
0  0  0 | −0.164 0.459  0.105  0        0        −0.114
Totals:                 0.354  −0.089   −0.089   −0.05
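The training steps above can be reproduced exactly, using the same sigmoid, η = 2, and zero initial parameters as in the slides:

```python
import math

# Reproduces the OR training run from the slides: sigmoid perceptron,
# steepest descent with learning rate eta = 2, starting from w = theta = 0.
sigmoid = lambda a: 1.0 / (1.0 + math.exp(-a))

examples = [(1, 0, 1), (0, 1, 1), (0, 0, 0), (0, 0, 0)]
w13 = w23 = theta = 0.0
eta = 2.0

for step in range(3):
    gw13 = gw23 = gtheta = 0.0
    for x1, x2, d in examples:
        z = sigmoid(w13*x1 + w23*x2 - theta)
        delta = (z - d) * z * (1 - z)   # deviation: (z - d) * g'(a)
        gw13 += delta * x1
        gw23 += delta * x2
        gtheta += -delta                # theta enters a with a minus sign
    w13 -= eta * gw13
    w23 -= eta * gw23
    theta -= eta * gtheta

print(round(w13, 3), round(w23, 3), round(theta, 3))  # 0.659 0.659 0.164

# Generalization: the input (1, 1), never seen during training
z11 = sigmoid(w13 + w23 - theta)
print(round(z11, 2))  # 0.76
```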
Generalization

w₁₃ = 0.659, w₂₃ = 0.659, θ₃ = 0.164, η = 2

And what happens for the input (1, 1)?

x₁ x₂ d | a      z
1  1  1 | 1.153  0.760

The network generalized the rules learned from the known examples.
Linear separability

Given a step-like transfer function, the output neuron of a perceptron is activated if the activation is positive:

a ≥ 0   ⟺   Σ_{i=1..d} wᵢ·xᵢ − θ ≥ 0

The input space is then divided into two regions.

If the requested mapping cannot be separated by a hyperplane, the perceptron is insufficient.
Linear separability
AND OR NOT(1)
XOR
The XOR problem cannot be solved with a perceptron.
Multi-layer feed-forward neural networks

Neurons are organized into hierarchical layers.

Each layer receives its inputs from the previous one and transmits its output to the next one.

z¹ⱼ = g( Σᵢ w¹ᵢⱼ·xᵢ )
z²ⱼ = g( Σᵢ w²ᵢⱼ·z¹ᵢ )
XOR

[Diagram: inputs 1, 2 → hidden neurons (1,1) and (1,2) → output neuron (2,1)]

w¹₁₁ = 0.7, w¹₂₁ = 0.7, θ¹₁ = 0.5
w¹₁₂ = 0.3, w¹₂₂ = 0.3, θ¹₂ = 0.5
w²₁₁ = 0.7, w²₂₁ = −0.7, θ²₁ = 0.5

x₁ = 0, x₂ = 0:
a¹₁ = −0.5 → z¹₁ = 0
a¹₂ = −0.5 → z¹₂ = 0
a²₁ = −0.5 → z²₁ = 0
XOR (same weights)

x₁ = 1, x₂ = 0:
a¹₁ = 0.2 → z¹₁ = 1
a¹₂ = −0.2 → z¹₂ = 0
a²₁ = 0.2 → z²₁ = 1
XOR (same weights)

x₁ = 0, x₂ = 1:
a¹₁ = 0.2 → z¹₁ = 1
a¹₂ = −0.2 → z¹₂ = 0
a²₁ = 0.2 → z²₁ = 1
XOR (same weights)

x₁ = 1, x₂ = 1:
a¹₁ = 0.9 → z¹₁ = 1
a¹₂ = 0.1 → z¹₂ = 1
a²₁ = −0.5 → z²₁ = 0
The hidden layer maps the input into a new representation that is linearly separable.

Input | Desired output | Activation of the hidden neurons
0 0   | 0              | 0 0
1 0   | 1              | 1 0
0 1   | 1              | 1 0
1 1   | 0              | 1 1
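The forward pass of this XOR network can be written out directly from the weights in the slides; a step transfer function (1 if a > 0, else 0) is assumed:

```python
# Forward pass of the 2-2-1 XOR network from the slides,
# using a step transfer function.
step = lambda a: 1 if a > 0 else 0

def xor_net(x1, x2):
    # hidden layer
    z1_1 = step(0.7*x1 + 0.7*x2 - 0.5)      # w1_11, w1_21, theta1_1
    z1_2 = step(0.3*x1 + 0.3*x2 - 0.5)      # w1_12, w1_22, theta1_2
    # output layer
    return step(0.7*z1_1 - 0.7*z1_2 - 0.5)  # w2_11, w2_21, theta2_1

for x1, x2 in [(0, 0), (1, 0), (0, 1), (1, 1)]:
    print(x1, x2, '->', xor_net(x1, x2))
```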
Training of multilayer networks: Back-propagation

For layer 2, the perceptron formula holds, upon the substitution x → z¹:

∂E/∂w²_{lj} = Σᵢ ( z²ⱼ(xⁱ, w) − dⱼⁱ )·g′(a²ⱼⁱ)·z¹_lⁱ = Σᵢ δ²ⱼⁱ·z¹_lⁱ
Training of multilayer networks: Back-propagation

For layer 1:

∂E/∂w¹_{lj} = Σᵢ ( ∂E/∂a¹ⱼⁱ )·( ∂a¹ⱼⁱ/∂w¹_{lj} ) = Σᵢ δ¹ⱼⁱ·x_lⁱ

Defining δ¹ⱼⁱ:

δ¹ⱼⁱ = ∂E/∂a¹ⱼⁱ = Σₖ ( ∂E/∂a²ₖⁱ )·( ∂a²ₖⁱ/∂a¹ⱼⁱ ) = Σₖ δ²ₖⁱ·( ∂a²ₖⁱ/∂a¹ⱼⁱ )

Since a²ₖⁱ = Σₘ g(a¹ₘⁱ)·w²ₘₖ:

∂a²ₖⁱ/∂a¹ⱼⁱ = g′(a¹ⱼⁱ)·w²ⱼₖ

and therefore:

δ¹ⱼⁱ = g′(a¹ⱼⁱ)·Σₖ δ²ₖⁱ·w²ⱼₖ

The deviations are propagated backwards, from the output layer to the hidden layer.
Training of multilayer networks: Back-propagation

1) Compute z_l for each example (feed-forward step);
2) Compute the deviations on the output layer, δ²_l;
3) Compute the deviations on the hidden layer, δ¹ⱼ;
4) Compute the gradient of the error with respect to the weights;
5) Update the weights with the steepest-descent method.
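These steps can be sketched for a 2-2-1 network. The weights and the training example below are illustrative, and the analytic back-propagated gradient is checked against a finite difference:

```python
import math

# Back-propagation sketch for a 2-2-1 network (sigmoid units, thresholds
# absorbed as weights from a constant +1 input), checked against a
# numeric finite-difference gradient. All values are illustrative.
def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def forward(W1, W2, x):
    # W1: two hidden units, each with weights for x1, x2 and the bias input
    a1 = [w[0]*x[0] + w[1]*x[1] + w[2] for w in W1]
    z1 = [sigmoid(a) for a in a1]
    a2 = W2[0]*z1[0] + W2[1]*z1[1] + W2[2]
    return a1, z1, a2, sigmoid(a2)

def gradients(W1, W2, x, d):
    a1, z1, a2, z2 = forward(W1, W2, x)
    delta2 = (z2 - d) * z2 * (1 - z2)              # output deviation
    delta1 = [z1[j] * (1 - z1[j]) * delta2 * W2[j] # back-propagated deviation
              for j in range(2)]
    gW2 = [delta2 * z1[0], delta2 * z1[1], delta2]
    gW1 = [[delta1[j]*x[0], delta1[j]*x[1], delta1[j]] for j in range(2)]
    return gW1, gW2

W1 = [[0.7, 0.7, -0.5], [0.3, 0.3, -0.5]]
W2 = [0.7, -0.7, -0.5]
x, d = (1.0, 0.0), 1.0

gW1, gW2 = gradients(W1, W2, x, d)

# Numeric check on one hidden weight, w1_11
eps = 1e-6
Wp = [row[:] for row in W1]; Wp[0][0] += eps
Wm = [row[:] for row in W1]; Wm[0][0] -= eps
E = lambda W: 0.5 * (forward(W, W2, x)[3] - d) ** 2
numeric = (E(Wp) - E(Wm)) / (2 * eps)
print(abs(numeric - gW1[0][0]) < 1e-8)  # True: the deviations are correct
```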
What does a neural network learn?

Consider the ideal case, consisting of a continuous set of examples x, each one represented with frequency P(x). The desired solutions d are associated to the input with probability P(d | x).

E = ½ Σⱼ ∫dx P(x) ∫ddⱼ P(dⱼ | x)·( zⱼ(x, w) − dⱼ )²

Training, after convergence (functional derivative):

δE/δzⱼ(x, w) = 0

0 = ∫ddⱼ P(dⱼ | x)·( zⱼ(x, w) − dⱼ )

zⱼ(x, w) = ∫ddⱼ P(dⱼ | x)·dⱼ

The activation state of the j-th output neuron is equal to the average of the solutions associated to the input x in the training set.
Neural Networks for classification and regression

Networks can be used for classification or for regression.

In regression: the desired outputs are real numbers.
In classification: the desired outputs are 0 or 1.

Error function:

E = ½ Σ_{i,j} ( zⱼ(xⁱ, w) − yⱼⁱ )²
Neural Networks and overfitting

Increasing the hidden neurons increases the number of parameters, and therefore increases the risk of overfitting the learning data.
1) Be sure that the number of parameters is far lower than the number of points to learn.

(What is the number of parameters of a network with n inputs, k outputs and r hidden neurons?)

2) Use regularizers (if possible). E.g.:

E = ½ Σ_{i,j} ( zⱼ(xⁱ, w) − yⱼⁱ )² + λ Σ_{k,ij} ( w⁽ᵏ⁾ᵢⱼ )²

Many other formulations are possible.
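A possible answer to the parenthetical question above, assuming one hidden layer, full connectivity, and one threshold per hidden and output neuron:

```python
# Parameters of a fully connected n -> r -> k network with thresholds:
# weights n*r (input->hidden) + r*k (hidden->output),
# thresholds r (hidden) + k (output).
def n_parameters(n, k, r):
    return (n + 1) * r + (r + 1) * k

print(n_parameters(2, 1, 2))  # the 2-2-1 XOR network: 9 parameters
```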
3) Always use an independent test set to decide when to stop the training [EARLY STOPPING]

(and then validate the method on a third independent set)

[Plot: training and test error vs. training iterations — stop the training at the iteration where the test error starts to increase]
Can we add more layers?

Back-propagation is not suitable for training the networks as the number of layers increases: DEEP LEARNING PROCEDURES ARE NEEDED.
Stuttgart Neural Network Simulator
http://www.ra.cs.uni-tuebingen.de/SNNS/

OpenNN
http://www.opennn.net/

THEANO
http://deeplearning.net/software/theano/

More on:
https://grey.colorado.edu/emergent/index.php/Comparison_of_Neural_Network_Simulators