Artificial Neural Networks. The Brain How do brains work? How do human brains differ from that of...

Artificial Neural Networks

The Brain

How do brains work?

How do human brains differ from those of other animals?

Can we base models of artificial intelligence on the structure and inner workings of the brain?

The Brain

The human brain consists of approximately 10 billion neurons and 60 trillion connections (synapses)

The brain is a highly complex, nonlinear, parallel information-processing system

By firing neurons simultaneously, the brain performs faster than the fastest computers in existence today

[Figure: two connected biological neurons, with labels for the soma, dendrites, axon, and synapses]

An individual neuron has a very simple structure

The cell body is called a soma

Small connective fibers are called dendrites

Single long fibers are called axons

An army of such elements constitutes tremendous processing power

An artificial neural network consists of a number of very simple processors called neurons

Neurons are connected by weighted links

The links pass signals from one neuron to another based on predefined thresholds


An individual neuron (McCulloch & Pitts, 1943):

Computes the weighted sum of the input signals

Compares the result with a threshold value, Θ

If the net input is less than the threshold, the neuron output is -1 (or 0)

Otherwise, the neuron becomes activated and its output is +1


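[Figure: neuron Y. Input signals x1, x2, ..., xn arrive over links with weights w1, w2, ..., wn; the neuron compares their weighted sum with its threshold and sends the output signal Y along each outgoing link.]

X = x1w1 + x2w2 + ... + xnwn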
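The neuron described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `neuron_output` and the sample inputs, weights, and threshold are made up for the example, and the output convention (0 below the threshold, +1 otherwise) is one of the two options mentioned in the text.

```python
def neuron_output(inputs, weights, threshold):
    """McCulloch-Pitts style neuron: weighted sum compared with a threshold."""
    # Net input X = x1*w1 + x2*w2 + ... + xn*wn
    net_input = sum(x * w for x, w in zip(inputs, weights))
    # Output is 0 below the threshold, +1 otherwise
    return 1 if net_input >= threshold else 0


# Hypothetical values, for illustration only
weights = [0.5, -0.2, 0.4]
print(neuron_output([1, 0, 1], weights, threshold=0.6))  # net input is about 0.9
print(neuron_output([0, 1, 0], weights, threshold=0.6))  # net input is -0.2
```

The first call activates the neuron because the net input exceeds the threshold; the second does not.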

Activation Functions

Individual neurons adhere to an activation function, which determines whether they propagate their signal (i.e. activate) or not:

[Figure: sign function]

The step and sign activation functions are also often called hard limit functions (the sigmoid function, by contrast, is smooth)

We use such functions in decision-making neural networks

They support classification and other pattern recognition tasks
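A minimal Python sketch of the three activation functions named above may make the difference concrete. The convention that an input of exactly zero counts as "activated" is an assumption made here; the slides do not state it.

```python
import math


def step(x):
    """Hard limiter with outputs 0 and 1."""
    return 1 if x >= 0 else 0


def sign(x):
    """Hard limiter with outputs -1 and +1."""
    return 1 if x >= 0 else -1


def sigmoid(x):
    """Smooth function mapping any real input into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


for x in (-2.0, 0.0, 2.0):
    print(x, step(x), sign(x), round(sigmoid(x), 4))
```

The step and sign functions jump abruptly at zero, while the sigmoid changes gradually, which is what later makes it usable with gradient-based training.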

Perceptrons

Can an individual neuron learn?

In 1958, Frank Rosenblatt introduced a training algorithm that provided the first procedure for training a single-node neural network

Rosenblatt’s perceptron model consists of a single neuron with adjustable synaptic weights, followed by a hard limiter

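Perceptrons

[Figure: single-layer two-input perceptron. Inputs x1 and x2, weighted by w1 and w2, feed a linear combiner; a hard limiter with a threshold then produces the output Y.]

X = x1w1 + x2w2

Y = step(X - Θ)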

Perceptrons

A perceptron:

Classifies inputs x1, x2, ..., xn into one of two distinct classes A1 and A2

Forms a linearly separable function defined by:

$$\sum_{i=1}^{n} x_i w_i - \Theta = 0$$

[Figure: linear separability in perceptrons. (a) Two-input perceptron: the line x1w1 + x2w2 - Θ = 0 separates class A1 from class A2. (b) Three-input perceptron: the plane x1w1 + x2w2 + x3w3 - Θ = 0.]

Perceptrons

A perceptron with three inputs x1, x2, and x3 classifies its inputs into two distinct sets A1 and A2

Perceptrons

How does a perceptron learn?

A perceptron has initial (often random) weights, typically in the range [-0.5, 0.5]

Apply an established training dataset

Calculate the error as expected output minus actual output:

error e = Yexpected - Yactual

Adjust the weights to reduce the error

Perceptrons

How do we adjust a perceptron's weights to produce Yexpected?

If e is positive, we need to increase Yactual (and vice versa)

Use this formula:

$$w_i = w_i + \Delta w_i, \quad \text{where} \quad \Delta w_i = \alpha \cdot x_i \cdot e$$

and

α is the learning rate (between 0 and 1)

e is the calculated error
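As a sketch of a single weight adjustment, assuming the standard perceptron learning rule in which each weight changes by α · x_i · e: the function name `update_weights` and the sample numbers are hypothetical.

```python
def update_weights(weights, inputs, expected, actual, alpha):
    """Apply one perceptron update: w_i <- w_i + alpha * x_i * e."""
    error = expected - actual  # e = Yexpected - Yactual
    return [w + alpha * x * error for w, x in zip(weights, inputs)]


# Hypothetical case: the perceptron output 1 but 0 was expected, so e = -1
new_weights = update_weights([0.3, -0.1], inputs=[1, 0], expected=0, actual=1, alpha=0.1)
print(new_weights)
```

Only the weight whose input was 1 changes; a weight attached to an input of 0 contributed nothing to the output and is left alone.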

Perceptron Example – AND

Train a perceptron to recognize logical AND

Use threshold Θ = 0.2 and learning rate α = 0.1

Repeat until convergence, i.e. until the final weights do not change and there is no error
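The AND training run can be sketched as follows. The threshold and learning rate come from the slide; the starting weights (0.3 and -0.1) are an assumption picked from the usual [-0.5, 0.5] range, and the update rule is assumed to be the standard w_i + α · x_i · e. Exact fractions are used so that comparisons against the threshold are not disturbed by floating-point rounding.

```python
from fractions import Fraction


def step(x):
    return 1 if x >= 0 else 0


def train_perceptron(samples, weights, theta, alpha, max_epochs=100):
    """Train until one full epoch produces no error; return (weights, epochs)."""
    weights = list(weights)
    for epoch in range(1, max_epochs + 1):
        any_error = False
        for inputs, expected in samples:
            net = sum(x * w for x, w in zip(inputs, weights))
            actual = step(net - theta)
            error = expected - actual
            if error != 0:
                any_error = True
                weights = [w + alpha * x * error for w, x in zip(weights, inputs)]
        if not any_error:
            return weights, epoch
    raise RuntimeError("no convergence within max_epochs")


AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
initial = [Fraction(3, 10), Fraction(-1, 10)]  # assumed starting weights
weights, epochs = train_perceptron(AND, initial, Fraction(2, 10), Fraction(1, 10))
print([float(w) for w in weights], epochs)
```

With these assumed starting weights, the loop settles on w1 = w2 = 0.1, and the fifth epoch is the first one that passes without any error.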

Perceptron Example – AND

Two-dimensional plot of the logical AND operation:

[Figure: two-dimensional plot of logical AND on axes x1 and x2]

A single perceptron can be trained to recognize any linearly separable function

Can we train a perceptron to recognize logical OR?

How about logical exclusive-OR (i.e. XOR)?

Perceptron – OR and XOR

Two-dimensional plots of logical OR and XOR:

[Figure: two-dimensional plots on axes x1 and x2. (b) OR (x1 ∪ x2). (c) Exclusive-OR (x1 ⊕ x2).]

Perceptron Coding Exercise

Write code to:

Calculate the error at each step

Modify the weights, if necessary (i.e. if the error is non-zero)

Loop until all error values are zero for a full epoch

Modify your code to learn to recognize the logical OR operation

Try to recognize the XOR operation...
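One possible shape for the exercise, offered as a sketch rather than a model answer: the function `train`, its starting weights, and the epoch cap are all assumptions, and the threshold stays fixed at 0.2 while only the weights are adjusted.

```python
from fractions import Fraction


def train(samples, theta=Fraction(2, 10), alpha=Fraction(1, 10), max_epochs=1000):
    """Return (weights, converged) after perceptron training with a fixed threshold."""
    weights = [Fraction(3, 10), Fraction(-1, 10)]  # assumed starting weights
    for _ in range(max_epochs):
        errors = 0
        for inputs, expected in samples:
            net = sum(x * w for x, w in zip(inputs, weights))
            actual = 1 if net - theta >= 0 else 0
            error = expected - actual
            if error:
                errors += 1
                weights = [w + alpha * x * error for w, x in zip(weights, inputs)]
        if errors == 0:  # a full epoch without error: training is done
            return weights, True
    return weights, False  # gave up after max_epochs


OR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
or_weights, or_converged = train(OR)
xor_weights, xor_converged = train(XOR)
print("OR converged:", or_converged)
print("XOR converged:", xor_converged)
```

OR is linearly separable, so the loop finishes. XOR is not, so no single perceptron can classify all four cases and the loop runs into the epoch cap however large it is set.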

Multilayer Neural Networks

Multilayer neural networks consist of:

An input layer of source neurons

One or more hidden layers of computational neurons

An output layer of more computational neurons

Input signals are propagated in a layer-by-layer feedforward manner


[Figure: multilayer network with an input layer, a middle layer, and an output layer; input signals enter on one side and output signals leave on the other]


[Figure: multilayer network with an input layer, a first hidden layer, a second hidden layer, and an output layer; input signals enter on one side and output signals leave on the other]


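[Figure: three-layer network. Input signals x1, x2, ..., xi, ..., xn enter the input layer (neurons 1 to n); weights wij connect it to the hidden layer (neurons 1 to m); weights wjk connect the hidden layer to the output layer (neurons 1 to l), which produces y1, y2, ..., yk, ..., yl. Input signals flow forward and error signals flow backward.]

$$X_{\mathrm{INPUT}} = x_1$$

$$X_{H} = x_1 w_{11} + x_2 w_{21} + \dots + x_i w_{i1} + \dots + x_n w_{n1}$$

$$X_{\mathrm{OUTPUT}} = y_{H1} w_{11} + y_{H2} w_{21} + \dots + y_{Hj} w_{j1} + \dots + y_{Hm} w_{m1}$$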

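Three-layer network:

[Figure: three-layer network. Inputs x1 and x2 enter through input neurons 1 and 2; hidden neurons 3 and 4 receive weights w13, w23, w14, and w24; output neuron 5 receives weights w35 and w45 and produces y5.]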

Commercial-quality neural networks often incorporate 4 or more layers

Each layer consists of about 10-1000 individual neurons

Experimental and research-based neural networks often use 5 or 6 (or more) layers

Overall, millions of individual neurons may be used


Back-Propagation NNs

A back-propagation neural network is a multilayer neural network that propagates error backwards through the network as it learns

Weights are modified based on the calculated error

Training is complete when the error is below a specified threshold, e.g. less than 0.001


[Figure: three-layer back-propagation neural network with input, hidden, and output layers connected by weights wij and wjk; input signals propagate forward and error signals propagate backward]

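[Figure: three-layer network. Inputs x1 and x2 enter through input neurons 1 and 2; hidden neurons 3 and 4 receive weights w13, w23, w14, and w24; output neuron 5 receives weights w35 and w45 and produces y5. Each of neurons 3, 4, and 5 also has a fixed input of -1 for its threshold.]

Use the sigmoid activation function, and apply Θ by connecting a fixed input of -1 to weight Θ

Initially: w13 = 0.5, w14 = 0.9, w23 = 0.4, w24 = 1.0, w35 = -1.2, w45 = 1.1, Θ3 = 0.8, Θ4 = -0.1 and Θ5 = 0.3.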

Step 2: Activation

Activate the back-propagation neural network by applying inputs $x_1(p), x_2(p), \dots, x_n(p)$ and desired outputs $y_{d,1}(p), y_{d,2}(p), \dots, y_{d,n}(p)$.

(a) Calculate the actual outputs of the neurons in the hidden layer:

$$y_j(p) = \mathrm{sigmoid}\left[\sum_{i=1}^{n} x_i(p)\, w_{ij}(p) - \Theta_j\right]$$

where n is the number of inputs of neuron j in the hidden layer, and sigmoid is the sigmoid activation function.

Step 2: Activation (continued)

(b) Calculate the actual outputs of the neurons in the output layer:

$$y_k(p) = \mathrm{sigmoid}\left[\sum_{j=1}^{m} x_{jk}(p)\, w_{jk}(p) - \Theta_k\right]$$

where m is the number of inputs of neuron k in the output layer.
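The activation step for one layer can be sketched generically. The function `layer_outputs`, the nested-list weight layout, and the small 2-2-1 network below are assumptions made for illustration; each neuron subtracts its threshold from the weighted sum of its inputs and passes the result through the sigmoid.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def layer_outputs(inputs, weights, thresholds):
    """Outputs of one layer.

    weights[i][j] is the weight from input i to neuron j;
    thresholds[j] is the threshold of neuron j.
    """
    outputs = []
    for j, theta in enumerate(thresholds):
        net = sum(x * weights[i][j] for i, x in enumerate(inputs)) - theta
        outputs.append(sigmoid(net))
    return outputs


# Hypothetical network: 2 inputs, 2 hidden neurons, 1 output neuron
hidden = layer_outputs([1.0, 0.0], [[0.2, -0.3], [0.4, 0.1]], [0.1, -0.2])
output = layer_outputs(hidden, [[0.5], [-0.6]], [0.05])
print(hidden, output)
```

Calling the same function twice, feeding the hidden outputs into the output layer, is the layer-by-layer feedforward propagation described earlier.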

We consider a training set where inputs $x_1$ and $x_2$ are equal to 1 and desired output $y_{d,5}$ is 0. The actual outputs of neurons 3 and 4 in the hidden layer are calculated as

$$y_3 = \mathrm{sigmoid}(x_1 w_{13} + x_2 w_{23} - \Theta_3) = \frac{1}{1 + e^{-(1 \cdot 0.5 + 1 \cdot 0.4 - 1 \cdot 0.8)}} = 0.5250$$

$$y_4 = \mathrm{sigmoid}(x_1 w_{14} + x_2 w_{24} - \Theta_4) = \frac{1}{1 + e^{-(1 \cdot 0.9 + 1 \cdot 1.0 + 1 \cdot 0.1)}} = 0.8808$$

Now the actual output of neuron 5 in the output layer is determined as:

$$y_5 = \mathrm{sigmoid}(y_3 w_{35} + y_4 w_{45} - \Theta_5) = \frac{1}{1 + e^{-(-0.5250 \cdot 1.2 + 0.8808 \cdot 1.1 - 1 \cdot 0.3)}} = 0.5097$$

Thus, the following error is obtained:

$$e = y_{d,5} - y_5 = 0 - 0.5097 = -0.5097$$
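The forward pass of this worked example can be checked numerically. The sketch below uses the initial weights and thresholds listed earlier; the variable names (`theta3`, `y_desired`, and so on) are chosen here for readability.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


x1, x2, y_desired = 1, 1, 0
w13, w14, w23, w24, w35, w45 = 0.5, 0.9, 0.4, 1.0, -1.2, 1.1
theta3, theta4, theta5 = 0.8, -0.1, 0.3

# Hidden layer: neurons 3 and 4
y3 = sigmoid(x1 * w13 + x2 * w23 - theta3)
y4 = sigmoid(x1 * w14 + x2 * w24 - theta4)
# Output layer: neuron 5
y5 = sigmoid(y3 * w35 + y4 * w45 - theta5)
error = y_desired - y5

print(f"y3={y3:.4f} y4={y4:.4f} y5={y5:.4f} e={error:.4f}")
# y3=0.5250 y4=0.8808 y5=0.5097 e=-0.5097
```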

04/20/23 Intelligent Systems and Soft Computing 36

Step 3: Weight training

Update the weights in the back-propagation network, propagating backward the errors associated with output neurons.

(a) Calculate the error gradient for the neurons in the output layer:

$$\delta_k(p) = y_k(p)\,[1 - y_k(p)]\, e_k(p)$$

where

$$e_k(p) = y_{d,k}(p) - y_k(p)$$

Calculate the weight corrections:

$$\Delta w_{jk}(p) = \alpha\, y_j(p)\, \delta_k(p)$$

Update the weights at the output neurons:

$$w_{jk}(p+1) = w_{jk}(p) + \Delta w_{jk}(p)$$


Step 3: Weight training (continued)

(b) Calculate the error gradient for the neurons in the hidden layer:

$$\delta_j(p) = y_j(p)\,[1 - y_j(p)] \sum_{k=1}^{l} \delta_k(p)\, w_{jk}(p)$$

Calculate the weight corrections:

$$\Delta w_{ij}(p) = \alpha\, x_i(p)\, \delta_j(p)$$

Update the weights at the hidden neurons:

$$w_{ij}(p+1) = w_{ij}(p) + \Delta w_{ij}(p)$$


The next step is weight training. To update the weights and threshold levels in our network, we propagate the error, e, from the output layer backward to the input layer.

First, we calculate the error gradient for neuron 5 in the output layer:

$$\delta_5 = y_5 (1 - y_5)\, e = 0.5097 \cdot (1 - 0.5097) \cdot (-0.5097) = -0.1274$$

Then we determine the weight corrections assuming that the learning rate parameter, α, is equal to 0.1:

$$\Delta w_{35} = \alpha\, y_3\, \delta_5 = 0.1 \cdot 0.5250 \cdot (-0.1274) = -0.0067$$

$$\Delta w_{45} = \alpha\, y_4\, \delta_5 = 0.1 \cdot 0.8808 \cdot (-0.1274) = -0.0112$$

$$\Delta \Theta_5 = \alpha\, (-1)\, \delta_5 = 0.1 \cdot (-1) \cdot (-0.1274) = 0.0127$$
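A short numeric check of the output-layer step, as a sketch: it starts from the rounded forward-pass values quoted in the example (0.5250, 0.8808, 0.5097) rather than recomputing them, and the variable names are chosen here.

```python
alpha = 0.1
y3, y4, y5 = 0.5250, 0.8808, 0.5097  # rounded outputs from the forward pass
error = 0 - y5                        # e = desired output - actual output

# Error gradient of output neuron 5 (sigmoid derivative times error)
delta5 = y5 * (1 - y5) * error

# Each correction is alpha * (input on that link) * delta5
dw35 = alpha * y3 * delta5
dw45 = alpha * y4 * delta5
dtheta5 = alpha * (-1) * delta5       # the threshold is a weight on fixed input -1

print(f"{delta5:.4f} {dw35:.4f} {dw45:.4f} {dtheta5:.4f}")
```

Because the threshold is treated as a weight on a fixed input of -1, its correction has the opposite sign to the gradient.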


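Next we calculate the error gradients for neurons 3 and 4 in the hidden layer:

$$\delta_3 = y_3 (1 - y_3)\, \delta_5\, w_{35} = 0.5250 \cdot (1 - 0.5250) \cdot (-0.1274) \cdot (-1.2) = 0.0381$$

$$\delta_4 = y_4 (1 - y_4)\, \delta_5\, w_{45} = 0.8808 \cdot (1 - 0.8808) \cdot (-0.1274) \cdot 1.1 = -0.0147$$

We then determine the weight corrections:

$$\Delta w_{13} = \alpha\, x_1\, \delta_3 = 0.1 \cdot 1 \cdot 0.0381 = 0.0038$$

$$\Delta w_{23} = \alpha\, x_2\, \delta_3 = 0.1 \cdot 1 \cdot 0.0381 = 0.0038$$

$$\Delta \Theta_3 = \alpha\, (-1)\, \delta_3 = 0.1 \cdot (-1) \cdot 0.0381 = -0.0038$$

$$\Delta w_{14} = \alpha\, x_1\, \delta_4 = 0.1 \cdot 1 \cdot (-0.0147) = -0.0015$$

$$\Delta w_{24} = \alpha\, x_2\, \delta_4 = 0.1 \cdot 1 \cdot (-0.0147) = -0.0015$$

$$\Delta \Theta_4 = \alpha\, (-1)\, \delta_4 = 0.1 \cdot (-1) \cdot (-0.0147) = 0.0015$$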
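The hidden-layer gradients can be sketched the same way. The helper `hidden_delta` is a hypothetical name; it takes a list of (gradient, weight) pairs for the neurons fed by the hidden neuron, which in this example is just output neuron 5. The inputs are the rounded values quoted in the example.

```python
def hidden_delta(y_j, downstream):
    """delta_j = y_j * (1 - y_j) * sum over k of (delta_k * w_jk)."""
    return y_j * (1 - y_j) * sum(delta_k * w_jk for delta_k, w_jk in downstream)


alpha, x1, x2 = 0.1, 1, 1
delta5 = -0.1274                       # rounded output-layer gradient
delta3 = hidden_delta(0.5250, [(delta5, -1.2)])  # neuron 3 feeds neuron 5 via w35
delta4 = hidden_delta(0.8808, [(delta5, 1.1)])   # neuron 4 feeds neuron 5 via w45

dw13, dw23, dtheta3 = alpha * x1 * delta3, alpha * x2 * delta3, alpha * (-1) * delta3
dw14, dw24, dtheta4 = alpha * x1 * delta4, alpha * x2 * delta4, alpha * (-1) * delta4

print(f"{delta3:.4f} {delta4:.4f}")
print(f"{dw13:.4f} {dw23:.4f} {dtheta3:.4f} {dw14:.4f} {dw24:.4f} {dtheta4:.4f}")
```

With more than one output neuron, the list passed to `hidden_delta` would simply contain one pair per output neuron.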


At last, we update all weights and thresholds:

$$w_{13} = w_{13} + \Delta w_{13} = 0.5 + 0.0038 = 0.5038$$

$$w_{14} = w_{14} + \Delta w_{14} = 0.9 - 0.0015 = 0.8985$$

$$w_{23} = w_{23} + \Delta w_{23} = 0.4 + 0.0038 = 0.4038$$

$$w_{24} = w_{24} + \Delta w_{24} = 1.0 - 0.0015 = 0.9985$$

$$w_{35} = w_{35} + \Delta w_{35} = -1.2 - 0.0067 = -1.2067$$

$$w_{45} = w_{45} + \Delta w_{45} = 1.1 - 0.0112 = 1.0888$$

$$\Theta_3 = \Theta_3 + \Delta \Theta_3 = 0.8 - 0.0038 = 0.7962$$

$$\Theta_4 = \Theta_4 + \Delta \Theta_4 = -0.1 + 0.0015 = -0.0985$$

$$\Theta_5 = \Theta_5 + \Delta \Theta_5 = 0.3 + 0.0127 = 0.3127$$

The training process is repeated until the sum of squared errors is less than 0.001.
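Putting the pieces together, one full iteration (forward pass, gradients, and update) might look like the sketch below. The function `train_step` and the dictionary layout are assumptions made here, with keys `t3`, `t4`, and `t5` standing for the three thresholds. It covers a single training example only; repeating it over a whole training set until the sum of squared errors drops below the target is left out.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def train_step(p, x1, x2, y_desired, alpha=0.1):
    """One back-propagation iteration for the 2-2-1 network; returns new parameters."""
    # Forward pass
    y3 = sigmoid(x1 * p["w13"] + x2 * p["w23"] - p["t3"])
    y4 = sigmoid(x1 * p["w14"] + x2 * p["w24"] - p["t4"])
    y5 = sigmoid(y3 * p["w35"] + y4 * p["w45"] - p["t5"])
    # Error gradients: output neuron first, then hidden neurons (old weights)
    d5 = y5 * (1 - y5) * (y_desired - y5)
    d3 = y3 * (1 - y3) * d5 * p["w35"]
    d4 = y4 * (1 - y4) * d5 * p["w45"]
    # Each parameter moves by alpha * (its input) * (gradient of its neuron)
    return {
        "w13": p["w13"] + alpha * x1 * d3, "w23": p["w23"] + alpha * x2 * d3,
        "w14": p["w14"] + alpha * x1 * d4, "w24": p["w24"] + alpha * x2 * d4,
        "w35": p["w35"] + alpha * y3 * d5, "w45": p["w45"] + alpha * y4 * d5,
        "t3": p["t3"] + alpha * (-1) * d3, "t4": p["t4"] + alpha * (-1) * d4,
        "t5": p["t5"] + alpha * (-1) * d5,
    }


params = {"w13": 0.5, "w14": 0.9, "w23": 0.4, "w24": 1.0, "w35": -1.2,
          "w45": 1.1, "t3": 0.8, "t4": -0.1, "t5": 0.3}
updated = train_step(params, x1=1, x2=1, y_desired=0)
for name, value in updated.items():
    print(name, round(value, 4))
```

Note that the hidden-layer gradients are computed from the weights as they were before the update, which matches the order of the worked example.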
