Transcript of NN – cont.

Page 1: NN –  cont.

NN – cont.

Alexandra I. Cristea, USI intensive course “Adaptive Systems”, April-May 2003

Page 2: NN –  cont.

• We have seen how the neuron computes; let’s see:
– What can it compute?
– How can it learn?

Page 3: NN –  cont.

What does the neuron compute?

Page 4: NN –  cont.

Perceptron, discrete neuron

• First, the simple case (see the sketch below):
– no hidden layers
– only one neuron
– get rid of the threshold: the bias b becomes w0, a weight on a constant input
– Y is a Boolean function: weighted sum > 0 fires (1), otherwise it doesn’t fire (0)
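A minimal sketch of this discrete neuron in Python (not from the slides; the example weights are ones that appear later, on page 10):

def perceptron(x, w):
    # w = (w0, w1, ..., wn), x = (x1, ..., xn); a constant input x0 = 1 carries the bias w0.
    s = w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))
    return 1 if s > 0 else 0          # "> 0 fires, otherwise doesn't fire"

# With w0 = -1, w1 = 7, w2 = 9 (a weight set from page 10) this computes Y = X1 or X2:
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, perceptron(x, (-1, 7, 9)))    # -> 0, 1, 1, 1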

Page 5: NN –  cont.

Threshold function f

(Diagram: the step threshold function f with threshold t = 1; folding the threshold into the bias gives w0 = -t = -1.)

Page 6: NN –  cont.

Y = X1 or X2

W1 = 1, W2 = 1, threshold t = 1

X1 X2 | Y
 0  0 | 0
 0  1 | 1
 1  0 | 1
 1  1 | 1

(Diagram: a single neuron with inputs X1, X2, weights 1 and 1, and threshold t = 1.)

Page 7: NN –  cont.

Y = X1 and X2

W1 = 0.5, W2 = 0.5, threshold t = 1

X1 X2 | Y
 0  0 | 0
 0  1 | 0
 1  0 | 0
 1  1 | 1

(Diagram: a single neuron with inputs X1, X2, weights 0.5 and 0.5, and threshold t = 1.)

Page 8: NN –  cont.

Y = or(x1, …, xn)

w1 = w2 = … = wn = 1, t = 1

(Diagram: a single neuron with n inputs.)

Page 9: NN –  cont.

Y = and(x1, …, xn)

w1 = w2 = … = wn = 1/n, t = 1

(Diagram: a single neuron with n inputs.)
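These weight choices can be checked exhaustively for a small n. A sketch (Python, not from the slides), reading the firing condition as "the weighted sum reaches the threshold", i.e. sum(wi*xi) >= t, which is what these weights require:

from itertools import product

def fires(x, w, t):
    # Discrete neuron: fires when the weighted sum reaches the threshold t.
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= t else 0

n = 4                                  # a power of two keeps 1/n exact in floating point
for x in product([0, 1], repeat=n):
    assert fires(x, [1] * n, 1) == (1 if any(x) else 0)        # or(x1, ..., xn): wi = 1,   t = 1
    assert fires(x, [1 / n] * n, 1) == (1 if all(x) else 0)    # and(x1, ..., xn): wi = 1/n, t = 1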

Page 10: NN –  cont.

What are we actually doing?

We are evaluating the linear function w0 + w1*X1 + w2*X2 and thresholding it at 0.

(Plots: the Boolean outputs at the four corners of the (X1, X2) unit square for three weight settings:
W0 = -1, W1 = 7, W2 = 9;
W0 = -1, W1 = 0.7, W2 = 0.9;
W0 = 1, W1 = 7, W2 = 9.)

Page 11: NN –  cont.

Linearly Separable Set

(Plot in the (x1, x2) plane: a labelled point set separated by the line w0 + w1*x1 + w2*x2 = 0, here with w0 = -1, w1 = -0.67, w2 = 1.)

Page 12: NN –  cont.

Linearly Separable Set

(Plot: another linearly separable set, with separating line w0 + w1*x1 + w2*x2 = 0 for w0 = -1, w1 = 0.25, w2 = -0.1.)

Page 13: NN –  cont.

Linearly Separable Set

(Plot: another linearly separable set, with separating line w0 + w1*x1 + w2*x2 = 0 for w0 = -1, w1 = 0.25, w2 = 0.04.)

Page 14: NN –  cont.

Linearly Separable Set

(Plot: another linearly separable set, with separating line w0 + w1*x1 + w2*x2 = 0 for w0 = -1, w1 = 0.167, w2 = 0.1.)
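What these slides show can be phrased as a small test (a Python sketch, not from the slides; the labelled points below are hypothetical stand-ins for the plotted ones): a weight triple (w0, w1, w2) separates a labelled set if the sign of w0 + w1*x1 + w2*x2 matches the label of every point.

def separates(w0, w1, w2, points):
    # points: list of ((x1, x2), label) with label 1 or 0.
    return all((w0 + w1 * x1 + w2 * x2 > 0) == (label == 1)
               for (x1, x2), label in points)

# Hypothetical labelled set; the weights are one of the triples from the slides.
data = [((2, 3), 1), ((4, 5), 1), ((1, 0), 0), ((3, 1), 0)]
print(separates(-1, -0.67, 1, data))     # True for this made-up set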

Page 15: NN –  cont.

Non-linearly separable Set

Page 16: NN –  cont.

Non-Linearly Separable Set

(Plot: a labelled set in the (x1, x2) plane that no single line w0 + w1*x1 + w2*x2 = 0 can separate; the weight values w0, w1, w2 are left blank on the slide.)

Page 17: NN –  cont.

w0 + w1*x1 + w2*x2 — Non-Linearly Separable Set

(Plot: another attempted separating line for the same set; the weight values are again left blank.)

Page 18: NN –  cont.

w0 + w1*x1 + w2*x2 — Non-Linearly Separable Set

(Plot: another attempted separating line for the same set; the weight values are again left blank.)

Page 19: NN –  cont.

w0 + w1*x1 + w2*x2 — Non-Linearly Separable Set

(Plot: another attempted separating line for the same set; the weight values are again left blank.)

Page 20: NN –  cont.

Perceptron Classification Theorem

A finite set X can be classified correctly by a one-layer perceptron if and only if it is linearly separable.

Page 21: NN –  cont.

Typical non-linearly separable set: Y = XOR(x1, x2)

(Plot of the four points in the (x1, x2) plane: Y = 1 at (0,1) and (1,0), Y = 0 at (0,0) and (1,1).)

No line w0 + w1*x1 + w2*x2 = 0 separates the two classes: if both Y = 1 points were on the positive side, their midpoint (0.5, 0.5) would be too, but that is also the midpoint of the two Y = 0 points, which would have to lie on the non-positive side.
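The theorem can be illustrated by brute force (a Python sketch, not from the slides): scanning a grid of candidate weights finds a separating line for OR but none for XOR. (A finite grid cannot prove impossibility by itself; the theorem does, since XOR is not linearly separable.)

import itertools

def classify(w0, w1, w2, x1, x2):
    return 1 if w0 + w1 * x1 + w2 * x2 > 0 else 0

def grid_separable(targets, grid):
    # targets: {(x1, x2): desired output}; try every candidate weight triple on the grid.
    return any(all(classify(w0, w1, w2, x1, x2) == t for (x1, x2), t in targets.items())
               for w0, w1, w2 in itertools.product(grid, repeat=3))

grid = [i / 4 for i in range(-8, 9)]                     # candidate weights -2.0 ... 2.0
OR_T  = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 1}
XOR_T = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
print(grid_separable(OR_T, grid))    # True  (e.g. w0 = -0.25, w1 = w2 = 1)
print(grid_separable(XOR_T, grid))   # False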

Page 22: NN –  cont.

How does the neuron learn?

Page 23: NN –  cont.

Learning: weight computation

For Y = X1 and X2 (threshold t = 1), the weights must satisfy (checked in the sketch below):

• W1*(X1=1) + W2*(X2=1) >= (t=1)
• W1*(X1=0) + W2*(X2=1) < (t=1)
• W1*(X1=1) + W2*(X2=0) < (t=1)
• W1*(X1=0) + W2*(X2=0) < (t=1)

(Plot: the corresponding line W1*X1 + W2*X2 = t in the (X1, X2) plane.)
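A quick check of a candidate weight pair against these four conditions (Python sketch, not from the slides):

def satisfies_and_conditions(W1, W2, t=1):
    # Fire only for (X1, X2) = (1, 1): the four conditions above.
    return (W1 * 1 + W2 * 1 >= t and
            W1 * 0 + W2 * 1 < t and
            W1 * 1 + W2 * 0 < t and
            W1 * 0 + W2 * 0 < t)

print(satisfies_and_conditions(0.5, 0.5))   # True  (the weights used on page 7)
print(satisfies_and_conditions(1.0, 1.0))   # False (would also fire for (0,1) and (1,0))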

Page 24: NN –  cont.

Perceptron Learning Rule (incremental version)

FOR i := 0 TO n DO wi := random initial value ENDFOR;

REPEAT
  select a pair (x, t) in X;   (* each pair must have a positive probability of being selected *)
  IF wT * x' > 0 THEN y := 1 ELSE y := 0 ENDIF;
  IF y ≠ t THEN
    FOR i := 0 TO n DO wi := wi + (t-y)*xi' ENDFOR
  ENDIF;
UNTIL X is correctly classified

ROSENBLATT (1962)
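The rule translates directly into Python (a sketch, not Rosenblatt’s original; x' is the input extended with a constant 1 so that w0 acts as the bias, and an epoch cap is added so the loop always terminates even if X is not separable):

import random

def train_perceptron(X, n, max_epochs=1000):
    # X: list of (x, t) pairs, x a tuple of n inputs, t the 0/1 target.
    # Returns weights w[0..n]; w[0] is the bias weight on the constant input 1.
    w = [random.uniform(-1, 1) for _ in range(n + 1)]         # random initial values
    def out(xp):
        return 1 if sum(wi * xi for wi, xi in zip(w, xp)) > 0 else 0
    for _ in range(max_epochs):                               # REPEAT ...
        x, t = random.choice(X)                               # each pair has positive probability
        xp = (1,) + tuple(x)                                  # x'
        y = out(xp)
        if y != t:
            w = [wi + (t - y) * xi for wi, xi in zip(w, xp)]
        if all(out((1,) + tuple(x)) == t for x, t in X):      # ... UNTIL X is correctly classified
            break
    return w

# Example: learn Y = X1 or X2.
w = train_perceptron([((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)], n=2)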

Page 25: NN –  cont.

Idea of the Perceptron Learning Rule

wi := wi + (t-y)*xi'

• If t = 1 and y = 0 (i.e. wT x' ≤ 0): wnew = w + x', so w changes in the direction of the input x'.
• If t = 0 and y = 1 (i.e. wT x' > 0): wnew = w - x', so w changes in the direction of -x', away from the input.

(Diagrams: the weight vector w being turned towards x' in the first case and away from x' in the second.)

In the first case the response to the same input can only grow: wnewT x' = wT x' + x'T x' ≥ wT x'; in the second it can only shrink.

Page 26: NN –  cont.
Page 27: NN –  cont.

For multi-layer perceptrons with continuous neurons, a simple and successful learning algorithm exists: backpropagation (BKP).

Page 28: NN –  cont.

BKP: Error

(Diagram: a network with input, hidden layer, and output. Each output unit has an actual output yi and a desired output di.)

e1 = d1 - y1
e2 = d2 - y2
e3 = d3 - y3
e4 = d4 - y4

Hidden layer error: ?

Page 29: NN –  cont.

Synapse

neuron1 --(weight w)--> neuron2

Forward propagation: y2 = w * y1, where y1 and y2 are the internal activation values of neuron1 and neuron2.

The weight serves as an amplifier!

Page 30: NN –  cont.

Inverse Synapse

neuron1 --(weight w)--> neuron2

Backward propagation: neuron2 carries the error value e2; what is the error e1 = ? of neuron1?

The weight serves as an amplifier!

Page 31: NN –  cont.

Inverse Synapse

neuron1 --(weight w)--> neuron2

Backward propagation: e1 = w * e2, where e1 and e2 are the error values of neuron1 and neuron2.

The weight serves as an amplifier!

Page 32: NN –  cont.

BKP: Error

(Same diagram as on page 28: output errors ei = di - yi for i = 1…4. The layers are labelled I1 = system input, O2 = I2 = hidden layer, O1 = system output.)

Hidden layer error: ?

Page 33: NN –  cont.

Backpropagation to the hidden layer

(Diagram: input I1, hidden layer O2/I2, output O1; a hidden unit j is connected to the output errors e1, e2, e3 through the weights w1, w2, w3.)

Backpropagation: ee[j] = Σi e[i] * w[j,i]
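Written out as a small sketch (Python, illustrative numbers only): each hidden unit j collects the output errors e[i], weighted by the forward weight w[j][i] that connects it to output unit i.

def hidden_errors(e, w):
    # e: output errors e[i]; w[j][i]: weight from hidden unit j to output unit i.
    return [sum(e_i * w_ji for e_i, w_ji in zip(e, w_j)) for w_j in w]

e = [0.2, -0.1, 0.05]                 # hypothetical output errors e1, e2, e3
w = [[0.5, -0.3, 0.8],                # weights from hidden unit 1 to outputs 1..3
     [0.1, 0.4, -0.2]]                # weights from hidden unit 2 to outputs 1..3
print(hidden_errors(e, w))            # ≈ [0.17, -0.03]  (= ee[1], ee[2])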

Page 34: NN –  cont.

Update rule for the 2 weight types

• ① weights from I2 (hidden layer) to O1 (system output)
• ② weights from I1 (system input) to O2 (hidden layer)

① Δw[j,i] = α (d[i] - y[i]) f'(S[i]) f(S[j]) = α e[i] f(S[j])   (simplification: f' = 1, e.g. for a repeater)
   S[i] = Σj w[j,i](t) h[j]

② Δw[k,j] = α (Σi e[i] w[j,i]) f'(S[j]) f(S[k]) = α ee[j] f(S[k])
   S[j] = Σk w[k,j](t) x[k]

Here α is the learning rate, S is the weighted input sum of a unit, and f(S) its output, so f(S[j]) = h[j] and, for input "repeaters", f(S[k]) = x[k].

Page 35: NN –  cont.

Backpropagation algorithm

FOR s := 1 TO r DO W[s] := initial matrix (often random) ENDFOR;

REPEAT
  select a pair (x, t) in X;  y[0] := x;
  # forward phase: compute the actual output y[s] of each layer of the network with input x
  FOR s := 1 TO r DO y[s] := F(W[s] * y[s-1]) ENDFOR;   # y[r] is the output vector of the network
  # backpropagation phase: propagate the errors back through the network
  # and adapt the weights of all layers
  d[r] := F'[r] * (t - y[r]);
  FOR s := r DOWNTO 2 DO
    d[s-1] := F'[s-1] * W[s]T * d[s];
    W[s] := W[s] + d[s] * y[s-1]T
  ENDFOR;
  W[1] := W[1] + d[1] * y[0]T
UNTIL stop criterion

(F is the activation function applied componentwise, F'[s] its derivative at layer s, and W[s]T, y[s]T denote transposes.)
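A NumPy sketch of this matrix form (illustrative, not the course’s code): the sigmoid is assumed as activation, so F'(S) = y*(1-y), a learning rate eta is added (the pseudocode above folds it away), and biases are omitted for brevity.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_bp(X, layers, epochs=10000, eta=0.5, seed=0):
    # X: list of (x, t) pairs as 1-D arrays; layers: sizes [n0, n1, ..., nr].
    # W[s] maps layer s-1 to layer s.
    rng = np.random.default_rng(seed)
    r = len(layers) - 1
    W = [None] + [rng.normal(0.0, 0.5, (layers[s], layers[s - 1])) for s in range(1, r + 1)]
    for _ in range(epochs):                                  # REPEAT ... UNTIL stop criterion
        x, t = X[rng.integers(len(X))]                       # select a pair (x, t) in X
        y = [np.asarray(x, dtype=float)]                     # y[0] := x
        for s in range(1, r + 1):                            # forward phase
            y.append(sigmoid(W[s] @ y[s - 1]))
        d = [None] * (r + 1)
        d[r] = y[r] * (1 - y[r]) * (t - y[r])                # d[r] := F'[r] * (t - y[r])
        for s in range(r, 1, -1):                            # backpropagation phase
            d[s - 1] = y[s - 1] * (1 - y[s - 1]) * (W[s].T @ d[s])
            W[s] = W[s] + eta * np.outer(d[s], y[s - 1])
        W[1] = W[1] + eta * np.outer(d[1], y[0])
    return W

# Hypothetical usage: XOR with one hidden layer; a constant 1 is appended to the input as a bias substitute.
data = [(np.array([x1, x2, 1.0]), np.array([float(x1 != x2)])) for x1 in (0, 1) for x2 in (0, 1)]
W = train_bp(data, [3, 4, 1])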

Page 36: NN –  cont.

Conclusion

• We have seen Boolean function representation with the single-layer perceptron (SLP)

• We have seen a learning algorithm for SLP

• We have seen a learning algorithm for MLP (BP)

• So, neurons can represent knowledge AND learn!