Artificial Neural Networks
Chapter 4
• Perceptron
• Gradient Descent
• Multilayer Networks
• Backpropagation Algorithm
Amazing numbers
Properties of Neural Networks
Real Neurons
• Cell structures
– Cell body
– Dendrites
– Axon
– Synaptic terminals
Neural Communication
• Electrical potential across the cell membrane exhibits spikes called action potentials.
• A spike originates in the cell body, travels down the axon, and causes synaptic terminals to release neurotransmitters.
• The chemical diffuses across the synapse to the dendrites of other neurons.
• Neurotransmitters can be excitatory or inhibitory.
• If the net neurotransmitter input to a neuron from other neurons is excitatory and exceeds some threshold, the neuron fires an action potential.
Real Neural Learning
• Synapses change size and strength with experience.
• Hebbian learning (1949): When two connected neurons are firing at the same time, the strength of the synapse between them increases.
“Neurons that fire together, wire together.”
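Hebb's idea is often formalized as the update Δwᵢ = η·xᵢ·y (weight change proportional to joint pre- and post-synaptic activity). A minimal sketch, where the learning rate η and the toy values are illustrative assumptions, not from the slides:

```python
# Sketch of Hebbian learning: delta_w_i = eta * x_i * y.
# eta and the example activities are assumed illustrative values.
def hebbian_update(w, x, y, eta=0.1):
    """Strengthen each weight in proportion to pre-synaptic activity x_i
    and post-synaptic activity y: neurons that fire together, wire together."""
    return [wi + eta * xi * y for wi, xi in zip(w, x)]

w = [0.0, 0.0]
# Both inputs and the output are active together, so both weights grow.
w = hebbian_update(w, x=[1, 1], y=1)
print(w)  # [0.1, 0.1]
```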
A Neuron model
Artificial Neuron Model
• Model the network as a graph with cells as nodes and synaptic connections as weighted edges wᵢ from node i.
[Figure: a neuron with output O, inputs x₁…x₄ weighted by w₁…w₄, plus a bias weight w₀]
• Model the net input to a cell as:
  net = Σᵢ wᵢ xᵢ
• Cell output is:
  o = sign(net) = 1 if net > 0, −1 if net ≤ 0
• w₀ is called the bias or threshold.
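The model above can be sketched directly in Python; the particular weights and inputs below are illustrative assumptions (w₀ plays the role of the bias, paired with a constant input x₀ = 1):

```python
# Sketch of the artificial neuron model: o = sign(net), net = sum_i w_i * x_i.
def neuron_output(weights, inputs):
    """Threshold unit: returns 1 if the weighted sum is positive, else -1.
    weights[0] is the bias w0, matched with the constant input inputs[0] = 1."""
    net = sum(w * x for w, x in zip(weights, inputs))
    return 1 if net > 0 else -1

# Example: w0 = -0.5 acts as a threshold of 0.5 on the remaining inputs.
print(neuron_output([-0.5, 1.0, 1.0], [1, 0, 1]))  # 1   (net = 0.5 > 0)
print(neuron_output([-0.5, 1.0, 1.0], [1, 0, 0]))  # -1  (net = -0.5 <= 0)
```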
Artificial Neural Network
• Learning approach based on modeling adaptation in biological neural systems:
– Perceptron: Initial algorithm for learning simple neural networks (single layer)
(Rosenblatt, Frank (1957), The Perceptron--a perceiving and recognizing automaton. Report 85-460-1, Cornell Aeronautical Laboratory)
– Backpropagation: A more complex algorithm for learning feed-forward multi-layer neural networks
(Rumelhart, David E.; Hinton, Geoffrey E.; Williams, Ronald J. (8 October 1986). "Learning representations by back-propagating errors". Nature 323 (6088): 533–536)
A Perceptron
Perceptron Training
• Assume supervised training examples giving the desired output for a unit given a set of known input activations.
• Learn synaptic weights so that unit produces the correct output for each example.
• Perceptron uses iterative update algorithm to learn a correct set of weights.
Perceptron Learning Rule
• Update weights by:
  wᵢ = wᵢ + η (t − o) xᵢ
  where η is the “learning rate” and t is the teacher-specified output (target value).
• Equivalent to the rules:
  – If the output is correct, do nothing.
  – If the output is too high, lower the weights on active inputs.
  – If the output is too low, increase the weights on active inputs.
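A single application of the rule can be sketched as follows; the learning rate and the toy values are illustrative assumptions:

```python
# Sketch of one perceptron weight update: w_i <- w_i + eta * (t - o) * x_i.
def perceptron_update(w, x, t, o, eta=0.1):
    """Shift each weight toward the target t, in proportion to its input x_i."""
    return [wi + eta * (t - o) * xi for wi, xi in zip(w, x)]

# Output was -1 but target is +1, so weights on active inputs increase.
print(perceptron_update([0.0, 0.0], [1, 1], t=1, o=-1))  # [0.2, 0.2]
# Output matches the target, so nothing changes.
print(perceptron_update([0.5], [1], t=1, o=1))  # [0.5]
```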
Perceptron Learning Algorithm
• Iteratively update weights until convergence.
• Each execution of the outer loop is typically called an epoch.
Initialize weights to random values
Until outputs of all training examples are correct:
  For each training pair, E, do:
    Compute current output o for E given its inputs
    Compare current output to target value, t, for E
    Update synaptic weights and threshold using the learning rule
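The full training loop can be sketched in Python; the OR training set, zero initial weights, learning rate, and epoch cap below are illustrative assumptions:

```python
# Sketch of the perceptron training algorithm, with a bias input x0 = 1.
def train_perceptron(examples, n_inputs, eta=0.1, max_epochs=100):
    w = [0.0] * (n_inputs + 1)          # w[0] is the bias/threshold weight
    for epoch in range(max_epochs):     # each outer pass is one epoch
        all_correct = True
        for x, t in examples:
            x = [1] + list(x)           # prepend the constant bias input
            o = 1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else -1
            if o != t:
                all_correct = False
                w = [wi + eta * (t - o) * xi for wi, xi in zip(w, x)]
        if all_correct:                 # every example classified correctly
            return w
    return w

# Logical OR with targets in {-1, +1}: linearly separable, so this converges.
or_examples = [((0, 0), -1), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
w = train_perceptron(or_examples, n_inputs=2)
```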
Perceptron training rule
Perceptron as Hill Climbing
• The goal is to minimize error on the training set.
• The perceptron effectively does hill-climbing (gradient descent) in this space, changing the weights a small amount at each step to decrease training-set error.
• For a single model neuron, the space is well behaved, with a single minimum.
[Figure: training error as a function of the weights]
Geometric Interpretation: Perceptron as a Linear Separator
• Since the perceptron uses a linear threshold function, it is searching for a linear separator that discriminates the classes:
  w₀ + w₁x₁ + w₂x₂ = 0
  (a line in the (x₁, x₂) plane, or a hyperplane in n-dimensional space)
[Figure: positive and negative examples in the (x₁, x₂) plane separated by this line]
Perceptron Performance
• Linear threshold functions are restrictive
• In practice, converges fairly quickly for linearly separable data.
• Can effectively use even incompletely converged results when only a few outliers are misclassified.
• Experimentally, Perceptron does quite well on many benchmark data sets.
Concept Perceptron Cannot Learn: XOR
• Cannot learn exclusive-or, or the parity function in general.
[Figure: XOR in the (x₁, x₂) plane: positives at (0, 1) and (1, 0), negatives at (0, 0) and (1, 1); no single line separates them]
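This limit can be checked empirically with a sketch (training setup and targets in {−1, +1} are assumed): since no weight vector classifies all four XOR examples, the update loop never settles on a correct set of weights.

```python
# Sketch: a single perceptron trained on XOR never gets all examples right,
# because XOR is not linearly separable. Setup values are illustrative.
def misclassified(w, examples):
    """Count examples the threshold unit (with bias input x0 = 1) gets wrong."""
    bad = 0
    for x, t in examples:
        xb = [1] + list(x)
        o = 1 if sum(wi * xi for wi, xi in zip(w, xb)) > 0 else -1
        bad += (o != t)
    return bad

xor = [((0, 0), -1), ((0, 1), 1), ((1, 0), 1), ((1, 1), -1)]
w = [0.0, 0.0, 0.0]
for _ in range(1000):                   # many epochs, yet no convergence
    for x, t in xor:
        xb = [1] + list(x)
        o = 1 if sum(wi * xi for wi, xi in zip(w, xb)) > 0 else -1
        w = [wi + 0.1 * (t - o) * xi for wi, xi in zip(w, xb)]
print(misclassified(w, xor) > 0)  # True: some example is always wrong
```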
Perceptron Limits
• If the data are not linearly separable, convergence is not assured.
• Minsky and Papert (1969) wrote a book analyzing the perceptron and demonstrating many functions it could not learn.
(M.L. Minsky and S. Papert. Perceptrons: An Introduction to Computational Geometry. Cambridge, Mass.: MIT Press, 1969)
• These results discouraged further research on neural nets.
Hidden Units: A Solution for Non-Linearly Separable Data
• Such networks show the potential of hidden neurons.
[Figure: a two-layer network over inputs x1 and x2, with weights and a threshold (bias) unit]
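A minimal sketch of this idea, with hand-chosen (not learned) weights: one hidden layer of threshold units computes XOR, which no single perceptron can.

```python
# Sketch: a two-layer threshold network computing XOR.
# The weights and biases below are hand-chosen illustrative values.
def step(net):
    """Step threshold unit: 1 if net > 0, else 0."""
    return 1 if net > 0 else 0

def xor_net(x1, x2):
    h1 = step(-0.5 + x1 + x2)        # hidden unit 1: OR(x1, x2)
    h2 = step(1.5 - x1 - x2)         # hidden unit 2: NAND(x1, x2)
    return step(-1.5 + h1 + h2)      # output: AND(h1, h2) = XOR(x1, x2)

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, xor_net(a, b))       # 0, 1, 1, 0 down the last column
```

Each hidden unit draws one line in the input plane; the output unit combines the two half-planes, which is exactly what a single linear separator cannot do.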