AI - Competitive Learning NN
8/6/2019 AI - Competitive Learning NN
http://slidepdf.com/reader/full/ai-competitive-learning-nn 1/10
8. Competitive Learning Neural Networks
http://rajakishor.co.cc Page 2
Consider pattern recognition tasks that the following network can perform.
The network consists of an input layer of linear units. The output of each of these units is given to all the units in the second (output) layer through adaptive (adjustable) feedforward weights. The output functions of the units in the second layer are either linear or nonlinear, depending on the task for which the network is to be designed. The output of each unit in the second layer is fed back to itself in a self-excitatory manner, and to the other units in the layer in an excitatory or inhibitory manner, depending on the task.
Generally, the weights on the connections in the feedback layer are nonadaptive or fixed. Such a combination of feedforward and feedback connection layers results in some kind of competition among the activations of the units in the output layer. Hence such networks are called competitive learning neural networks.
Different choices of the output functions and interconnections in the feedback layer of the network can be used to perform different pattern recognition tasks. For example, if the output functions are linear and the feedback connections are made in an on-centre off-surround fashion, the network performs the task of storing an input pattern temporarily. In an on-centre off-surround connection there is an excitatory connection to the same unit and inhibitory connections to the other units in the layer.
On the other hand, if the output functions of the units in the feedback layer are made nonlinear, with fixed-weight on-centre off-surround connections, the network can be used for pattern clustering.
The objective in pattern clustering is to group the given input patterns in an unsupervised manner; the group for a pattern is indicated by the output unit that has a nonzero output at equilibrium. The network is called a pattern clustering network, and the feedback layer is called a competitive layer. The unit that gives the nonzero output at equilibrium is said to be the winner. Learning in a pattern clustering network involves
adjustment of weights in the feedforward path so as to orient the weights (leading to the
winning unit) towards the input pattern.
If the output functions of the units in the feedback layer are nonlinear, and the units are connected in such a way that connections to the neighbouring units are excitatory and those to the farther units inhibitory, then the network can perform the task of feature mapping. The resulting network is called a self-organization network. In self-organization, at equilibrium the output signals from nearby units in the feedback layer indicate the proximity of the corresponding input patterns in the feature space. A self-organization network can be used to obtain a mapping of features in the input patterns onto a one-dimensional or a two-dimensional feature space.
Pattern Storage (STM)
    Architecture: Two layers (input and competitive), linear processing units
    Learning: No learning in FF stage; fixed weights in FB layer
    Recall: Not relevant
    Limitation: STM, no application, theoretical interest
    To overcome: Nonlinear output function in FB stage, learning in FF stage

Pattern Clustering (Grouping)
    Architecture: Two layers (input and competitive), nonlinear processing units in the competitive layer
    Learning: Only in FF stage; competitive learning
    Recall: Direct in FF stage; activation dynamics until a stable state is reached in the FB layer
    Limitation: Fixed (rigid) grouping of patterns
    To overcome: Train neighbourhood units in the competitive layer

Feature Mapping
    Architecture: Self-organization network; two layers, nonlinear processing units, excitatory neighbourhood units
    Learning: Weights leading to the neighbourhood units in the competitive layer
    Recall: Apply input, determine winner
    Limitation: Only visual features, not quantitative
    To overcome: More complex architecture
Analysis of Pattern Clustering Networks
A competitive learning network with nonlinear output functions for units in the feedback layer can produce, at equilibrium, a large activation on a single unit and small activations on the other units. This behaviour leads to a winner-take-all situation: when the input pattern is removed, only one unit in the feedback layer will have nonzero activation. That unit may be designated as the winner for the input pattern.
If the feedforward weights are suitably adjusted, each of the units in the feedback layer can be made to win for a group of similar input patterns. The corresponding learning is called competitive learning.
The units in the feedback layer have nonlinear output functions, e.g., f(x) = x^n, n > 1. Other nonlinear output functions, such as a hard-limiting threshold function or a semilinear sigmoid function, can also be used. These units are connected among themselves with fixed weights in an on-centre off-surround manner. Such networks are called competitive learning networks. Since they are used for clustering or grouping of input patterns, they can also be called pattern clustering networks.
In the pattern clustering task, the pattern classes are formed on unlabelled input data, and hence the corresponding learning is unsupervised. In competitive learning, the weights in the feedforward path are adjusted only after the winner unit in the feedback layer is identified for a given input pattern.
There are three different methods of implementing competitive learning, as illustrated below.
In the figure, it is assumed that the input is a binary {0, 1} vector. The activation of the ith unit in the feedback layer for an input vector x = (x1, x2, …, xM)^T is given by

    yi = Σj wij xj,  j = 1, …, M

where wij is the (i, j)th element of the weight matrix W, connecting the jth input to the ith unit. Let i = k be the unit in the feedback layer such that

    yk = max_i (yi)

Then

    wk^T x ≥ wi^T x, for all i

Assume that the weight vectors to all the units are normalized, i.e., || wi || = 1, for all i. Geometrically, the above result means that the input vector x is closest to the weight vector wk among all wi. That is,

    || x – wk || ≤ || x – wi ||, for all i.
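The winner selection above can be sketched in a few lines of NumPy (a hypothetical example with M = 4 inputs and 3 feedback-layer units; the sizes and random values are illustrative). It also checks the geometric claim: with normalized weight vectors, the unit with the largest activation is the one whose weight vector is closest to x.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: M = 4 inputs, 3 units in the feedback layer.
M, num_units = 4, 3

# Random weight vectors, normalized so that ||w_i|| = 1 for all i.
W = rng.random((num_units, M))
W /= np.linalg.norm(W, axis=1, keepdims=True)

x = rng.random(M)

# Activation of each feedback unit: y_i = sum_j w_ij x_j.
y = W @ x
k = int(np.argmax(y))            # winner: the unit with the largest activation

# With normalized weights, the winner is also the unit whose weight
# vector is closest to x in Euclidean distance.
distances = np.linalg.norm(x - W, axis=1)
assert k == int(np.argmin(distances))
```

The equivalence holds because || x – wi ||^2 = || x ||^2 – 2 wi^T x + 1 when || wi || = 1, so minimizing the distance is the same as maximizing the inner product.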
Start with some initial random values for the weights. The input vectors are applied one by one in a random order. For each input, the winner unit in the feedback layer is identified, and the weights leading to that unit are adjusted in such a way that the weight vector wk moves towards the input vector x by a small amount, determined by a learning rate parameter η.

A straightforward implementation of the weight adjustment is to make

    Δwkj = η (xj – wkj)

so that

    wk(m+1) = wk(m) + Δwk(m)
            = wk(m) + η (x – wk(m))

This looks more like Hebb's learning with a decay term, if the output of the winning unit is assumed to be 1. It works best when the input vectors are normalized. This is called standard competitive learning.
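One step of standard competitive learning can be sketched as follows (the sizes, learning rate η = 0.1, and random values are illustrative assumptions):

```python
import numpy as np

def standard_competitive_step(W, x, eta=0.1):
    """One step of standard competitive learning: find the winner k
    and move its weight vector towards the input x by eta*(x - w_k)."""
    k = int(np.argmax(W @ x))
    W[k] += eta * (x - W[k])
    return k

rng = np.random.default_rng(1)
W = rng.random((3, 4))          # 3 feedback units, M = 4 inputs
x = rng.random(4)

k = int(np.argmax(W @ x))       # winner before the update
before = np.linalg.norm(x - W[k])
standard_competitive_step(W, x, eta=0.1)
after = np.linalg.norm(x - W[k])
assert after < before           # the winner moved towards x
```

Each update shrinks the winner's distance to x by the factor (1 – η), so repeated presentations pull wk into the centroid of the inputs it wins for.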
If the input vectors are not normalized, then they are normalized in the weight adjustment formula as follows:

    Δwkj = η (xj / Σi xi – wkj)

only for those j for which xj = 1. This can be called minimal learning.
In the case of binary input vectors, for the winner unit, only the weights which have a nonzero input are adjusted. That is,

    Δwkj = η (xj / Σi xi – wkj), for xj = 1
    Δwkj = 0,                    for xj = 0
In minimal learning there is no automatic normalization of weights after each adjustment. That is,

    Σj wkj ≠ 1,  j = 1, …, M
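A small numerical sketch makes the lack of normalization concrete (the initial weights, η = 0.2, and the input vector are illustrative assumptions):

```python
import numpy as np

def minimal_learning_step(w_k, x, eta=0.2):
    """Minimal learning for a binary input x: only the weights on the
    active input lines (x_j = 1) of the winner unit are adjusted."""
    active = x == 1
    w_k[active] += eta * (x[active] / x.sum() - w_k[active])
    return w_k

w_k = np.array([0.25, 0.25, 0.25, 0.25])   # starts normalized: sum = 1
x = np.array([1, 0, 1, 0])

minimal_learning_step(w_k, x, eta=0.2)
# The active weights moved towards x_j / sum_i x_i = 0.5, the inactive
# ones were untouched, so the weights no longer sum to 1 (sum ≈ 1.1).
print(w_k.sum())
```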
In order to overcome this problem, Malsburg suggested the following learning law, in which all the weights leading to the winner unit are adjusted:

    Δwkj = η (xj / Σi xi – wkj), for all j

In this law, if a unit wins the competition, then each of its input lines gives up some portion of its weight, and that weight is distributed equally among the active connections, for which xj = 1.
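Because every weight of the winner is adjusted, Malsburg's law conserves the total weight: if Σj wkj = 1 before an update, it remains 1 afterwards. A minimal sketch (illustrative initial weights and η):

```python
import numpy as np

def malsburg_step(w_k, x, eta=0.2):
    """Malsburg's learning law: every weight leading to the winner is
    adjusted, redistributing weight towards the active (x_j = 1) lines."""
    w_k += eta * (x / x.sum() - w_k)
    return w_k

w_k = np.array([0.25, 0.25, 0.25, 0.25])   # normalized: sum = 1
x = np.array([1.0, 0.0, 1.0, 0.0])

malsburg_step(w_k, x, eta=0.2)
# Each line gave up a fraction eta of its weight; that weight was
# redistributed equally among the active connections, so the sum stays 1.
assert abs(w_k.sum() - 1.0) < 1e-12
```

The conservation follows directly from the update: the change in the sum is η(Σj xj / Σi xi – Σj wkj) = η(1 – 1) = 0.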
A unit i with an initial weight vector wi far from any input vector may never win the competition. Since a unit will never learn unless it wins the competition, another method, called the leaky learning law, is proposed. In this case, the weights leading to the units which do not win are also adjusted for each update, as follows:
    Δwij = ηw (xj / Σm xm – wij), for all j

if unit i wins the competition, i.e., i = k, and

    Δwij = ηl (xj / Σm xm – wij), for all j

if unit i loses the competition, i.e., i ≠ k, where ηw and ηl are the learning rate parameters for the winning and losing units, respectively (ηw > ηl). In this case the weights of the losing units are also slightly moved for each presentation of an input.
Basic competitive learning and its variations are used for adaptive vector quantization, in which the input vectors are grouped based on the Euclidean distance between vectors.
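The leaky learning law can be sketched as follows (the learning rates ηw = 0.2 and ηl = 0.01 and the random initial weights are illustrative assumptions, keeping ηw much larger than ηl):

```python
import numpy as np

def leaky_learning_step(W, x, eta_w=0.2, eta_l=0.01):
    """Leaky learning: the winner moves strongly towards the normalized
    input; losing units move by a much smaller amount (eta_w > eta_l)."""
    target = x / x.sum()
    k = int(np.argmax(W @ x))
    for i in range(W.shape[0]):
        eta = eta_w if i == k else eta_l
        W[i] += eta * (target - W[i])
    return k

rng = np.random.default_rng(2)
W = rng.random((3, 4))
x = np.array([1.0, 0.0, 1.0, 1.0])

W_before = W.copy()
k = leaky_learning_step(W, x)
# Every unit moved, so even a unit whose initial weights are far from
# all inputs slowly drifts towards the input region and can eventually win.
assert all(not np.allclose(W[i], W_before[i]) for i in range(3))
```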
Associative Memory
Pattern storage is an obvious pattern recognition task that one would like to realize using an artificial neural network. This is a memory function, where the network is expected to store the pattern information (not data) for later recall. The patterns to be stored may be of the spatial type or the spatio-temporal (pattern sequence) type.
Typically, an artificial neural network behaves like an associative memory, in which a pattern is associated with another pattern, or with itself. This is in contrast with a random access memory, which maps an address to data. An artificial neural network can also function as a content addressable memory, where data is mapped onto an address.
The pattern information is stored in the weight matrix of a feedback neural network. The stable states of the network represent the stored patterns, which can be recalled by providing an external stimulus in the form of a partial input.
If the weight matrix stores the given patterns, then the network becomes an autoassociative memory. If the weight matrix stores the association between a pair of patterns, the network becomes a bidirectional associative memory. This is called
heteroassociation between the two patterns. If the weight matrix stores multiple associations among several (> 2) patterns, then the network becomes a multidirectional associative memory. If the weights store the associations between adjacent pairs of patterns in a sequence of patterns, then the network is called a temporal associative memory.
Some desirable characteristics of associative memories are:
1. The network should have a large capacity, i.e., the ability to store a large number of patterns or pattern associations.
2. The network should be fault tolerant, in the sense that damage to a few units or connections should not affect the recall performance significantly.
3. The network should be able to recall the stored pattern or the desired associated pattern even if the input pattern is distorted or noisy.
4. The network's performance as an associative memory should degrade only gracefully due to damage to some units or connections, or due to noise or distortion in the input.
5. The network should be flexible enough to accommodate new patterns or associations (within the limits of its capacity) and to be able to eliminate unnecessary patterns or associations.
Bidirectional Associative Memory (BAM)
The objective is to store a set of pattern pairs in such a way that any stored pattern pair can be recalled by giving either of the patterns as input. The network is a two-layer heteroassociative neural network, as shown below.
The network encodes binary or bipolar pattern pairs (al, bl) using Hebbian learning. It can learn on-line, and it operates in discrete time steps.
The BAM weight matrix from the first layer to the second layer is given by

    W = Σl al bl^T,  l = 1, …, L

where al Є {-1, +1}^M and bl Є {-1, +1}^N for bipolar patterns, and L is the number of training patterns.

For binary patterns pl Є {0, 1}^M and ql Є {0, 1}^N, the bipolar values ali = 2pli – 1 and bli = 2qli – 1, corresponding to the binary elements pli and qli, respectively, are used in the computation of the weight matrix.

The weight matrix from the second layer to the first layer is given by

    W^T = Σl bl al^T,  l = 1, …, L
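The Hebbian encoding of the weight matrix can be sketched as a sum of outer products (the two pattern pairs below, with M = 4 and N = 3, are illustrative):

```python
import numpy as np

# Two hypothetical bipolar pattern pairs (a_l in {-1,+1}^M, b_l in {-1,+1}^N).
A = np.array([[ 1, -1,  1, -1],
              [ 1,  1, -1, -1]])   # L = 2 patterns, M = 4
B = np.array([[ 1, -1,  1],
              [-1,  1,  1]])       # N = 3

# Hebbian encoding: W = sum_l a_l b_l^T, an M x N matrix.
W = sum(np.outer(a, b) for a, b in zip(A, B))

# Binary patterns p, q would first be mapped to bipolar values 2p - 1, 2q - 1.
print(W.shape)   # (4, 3)
```

The matrix from the second layer to the first is simply the transpose `W.T`, since (al bl^T)^T = bl al^T.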
The activation equations for the bipolar case are as follows:

    bj(m+1) = +1,     if yj > 0
            = bj(m),  if yj = 0
            = -1,     if yj < 0

where

    yj = Σi wij ai(m),  i = 1, …, M

and

    ai(m+1) = +1,     if xi > 0
            = ai(m),  if xi = 0
            = -1,     if xi < 0

where

    xi = Σj wij bj(m),  j = 1, …, N
In the above equations, a(m) = [a1(m), a2(m), …, aM(m)]^T is the output of the first layer at the mth iteration, and b(m) = [b1(m), b2(m), …, bN(m)]^T is the output of the second layer at the mth iteration.
The updates in the BAM are synchronous, in the sense that the units in each layer are updated simultaneously.
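The synchronous recall procedure can be sketched as follows. This is a minimal example, not a full BAM implementation: it stores a single bipolar pair, then recovers the pair from a noisy version of the first pattern. The helper names and the fixed iteration count are illustrative choices.

```python
import numpy as np

def bam_sign(v, prev):
    """BAM activation: +1 if positive, -1 if negative, keep previous if zero."""
    out = prev.copy()
    out[v > 0] = 1
    out[v < 0] = -1
    return out

def bam_recall(W, a, steps=10):
    """Synchronous BAM recall: alternate between the two layers until
    the pair (a, b) settles (a fixed small number of steps is used here)."""
    b = np.zeros(W.shape[1])
    for _ in range(steps):
        b = bam_sign(W.T @ a, b)   # second layer from the first: y = W^T a
        a = bam_sign(W @ b, a)     # first layer from the second: x = W b
    return a, b

# Store one bipolar pair and recall it from a noisy version of a1.
a1 = np.array([ 1, -1,  1, -1,  1])
b1 = np.array([ 1,  1, -1])
W = np.outer(a1, b1)               # Hebbian encoding of the single pair

noisy = a1.astype(float)
noisy[0] = -1                      # flip one element of the input pattern
a_rec, b_rec = bam_recall(W, noisy)
# The stored pair (a1, b1) is recovered despite the flipped element.
```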
The BAM is limited to binary or bipolar valued pattern pairs. The upper limit on the number L of pattern pairs that can be stored is min(M, N).
The performance of the BAM depends on the nature of the pattern pairs and their number. As the number of pattern pairs increases, the probability of error in recall also increases.