
Neural Networks Primer

Dr Bernie Domanski
The City University of New York / CSI
2800 Victory Blvd 1N-215
Staten Island, New York 10314

drbernie@optonline.net
http://domanski.cs.csi.cuny.edu

© B. Domanski, 2000-2001. All Rights Reserved. Slide 2

What is a Neural Network?

Artificial Neural Networks (ANNs) provide a general, practical method for learning real-valued, discrete-valued, and vector-valued functions from examples.

Algorithms “tune” input parameters to best fit a training set of input-output pairs.


What is a Neural Network?

ANN learning is robust to errors in the training-set data.

ANNs have been applied to problems like:
• Interpreting visual scenes
• Speech recognition
• Learning robot control strategies
• Recognizing handwriting
• Face recognition


Biological Motivation

ANNs are built out of a densely interconnected set of simple units (neurons):
• Each neuron takes a number of real-valued inputs and produces a single real-valued output
• Inputs to a neuron may be the outputs of other neurons
• A neuron’s output may be used as input to many other neurons


Biological Analogy

Human brain: ~10^11 neurons, each connected to ~10^4 other neurons. Neuron activity is inhibited or excited through interconnections with other neurons.

Neuron switching time ≈ 10^-3 seconds (human). Time to recognize mom ≈ 10^-1 seconds. This implies only several hundred serial neuron firings at most (10^-1 / 10^-3 = 100 steps).


Complexity of the Biological System

Speculation: since human neuron switching speeds are slow, highly parallel processes must be operating on representations that are distributed over many neurons.

The motivation for ANNs is to capture this highly parallel computation based on a distributed representation.


A Simple Neural Net Example

[Diagram: input nodes connected by weighted links to output neurons]

How Does the Network Work?

• Assign a weight to each input link
• Multiply each weight by the input value (0 or 1)
• Sum all the weighted input combinations
• If Sum > Threshold for the neuron, then Output = +1; else Output = -1

So for the X=1, Y=1 case:
IF w1*X + w2*Y > 99 THEN OUTPUT = Z = +1 (50*1 + 50*1 = 100 > 99, so Z = +1)
IF w3*X + w4*Y + w5*Z > 59 THEN OUTPUT = +1 ELSE OUTPUT = -1 (30*1 + 30*1 + (-30)*1 = 30, which is not > 59, so OUTPUT = -1)

OR network: a single output neuron with weights w1 = 100 and w2 = 100 and a threshold of 99.

X Y | output
0 0 | -1
0 1 | +1
1 0 | +1
1 1 | +1

Exclusive-OR network: a hidden neuron Z with weights w1 = 50 and w2 = 50 and a threshold of 99, feeding an output neuron with weights w3 = 30 (from X), w4 = 30 (from Y), and w5 = -30 (from Z) and a threshold of 59.

X Y | output
0 0 | -1
0 1 | +1
1 0 | +1
1 1 | -1
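The threshold rule and the Exclusive-OR network above can be checked in a few lines. This is a minimal sketch (the function names are mine); note that the hidden neuron Z feeds the output neuron as +1 or -1, while the X and Y inputs are 0 or 1:

```python
def neuron(inputs, weights, threshold):
    # Weighted sum of the inputs; fire (+1) if it clears the threshold, else -1
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > threshold else -1

def xor_net(x, y):
    # Hidden neuron Z: weights 50, 50; threshold 99
    z = neuron([x, y], [50, 50], 99)
    # Output neuron: weights 30, 30, -30; threshold 59; Z arrives as +1/-1
    return neuron([x, y, z], [30, 30, -30], 59)

for x in (0, 1):
    for y in (0, 1):
        print(x, y, xor_net(x, y))
```

The loop prints -1, +1, +1, -1 for the four input pairs, matching the Exclusive-OR truth table.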


Appropriate Problems for Neural Networks

• Instances are represented by vectors of many defined features (e.g., measurements)
• Output may be a discrete value or a vector of discrete values
• Training examples may contain errors
• Non-trivial training sets imply non-trivial training time
• Applying the learned network to a subsequent instance is very fast
• We don’t have to understand the learned function – only the learned rules


How Are ANNs Trained?

• Initially choose small random weights (wi)
• Set the threshold = 1
• Choose a small learning rate (r)

Apply each member of the training set to the neural net model, using the training rule to adjust the weights.


The Training Rule Explained

Modify the weights (wi) according to the training rule:

wi = wi + Δwi, where Δwi = r * (t – a) * xi

Here –
• r is the learning rate (e.g., 0.2)
• t = target output
• a = actual output
• xi = i-th input value
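The training rule can be sketched directly (a minimal illustration; the function name is mine):

```python
def perceptron_update(weights, inputs, target, actual, r=0.2):
    # Apply wi = wi + r * (t - a) * xi to every weight
    return [w + r * (target - actual) * x for w, x in zip(weights, inputs)]

# One step from the OR walkthrough below: weights (.3, .7),
# example (0, 1), target +1, actual -1
print(perceptron_update([0.3, 0.7], [0, 1], 1, -1))
```

Because x1 = 0, w1 is untouched, while w2 grows by r * (t – a) * x2 = .2 * 2 * 1 = .4, giving (.3, 1.1).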


Training for ‘OR’

Training set:

X1 X2 | target
 0  0 | -1
 0  1 | +1
 1  0 | +1
 1  1 | +1

Initial random weights: w1 = .3, w2 = .7
Learning rate: r = .2


Applying the Training Set for OR - 1

Current weights: w1 = .3, w2 = .7

X1 X2 | a
 0  0 | -1
 0  1 | -1  X (misclassified; target +1)

Δw1 = r * (t – a) * x1 = .2 * (1 – (-1)) * 0 = 0
Δw2 = r * (t – a) * x2 = .2 * (1 – (-1)) * 1 = .4

w1 = w1 + Δw1 = .3 + 0 = .3
w2 = w2 + Δw2 = .7 + .4 = 1.1


Applying the Training Set for OR - 2

Current weights: w1 = .3, w2 = 1.1

X1 X2 | a
 0  0 | -1
 0  1 | +1
 1  0 | -1  X (misclassified; target +1)

Δw1 = r * (t – a) * x1 = .2 * (1 – (-1)) * 1 = .4
Δw2 = r * (t – a) * x2 = .2 * (1 – (-1)) * 0 = 0

w1 = w1 + Δw1 = .3 + .4 = .7
w2 = w2 + Δw2 = 1.1 + 0 = 1.1


Applying the Training Set for OR - 3

Current weights: w1 = .7, w2 = 1.1

X1 X2 | a
 0  0 | -1
 0  1 | +1
 1  0 | -1  X (misclassified; target +1)

Δw1 = r * (t – a) * x1 = .2 * (1 – (-1)) * 1 = .4
Δw2 = r * (t – a) * x2 = .2 * (1 – (-1)) * 0 = 0

w1 = w1 + Δw1 = .7 + .4 = 1.1
w2 = w2 + Δw2 = 1.1 + 0 = 1.1


Applying the Training Set for OR - 4

Current weights: w1 = 1.1, w2 = 1.1

X1 X2 | a
 0  0 | -1
 0  1 | +1
 1  0 | +1
 1  1 | +1

All four examples are now classified correctly, so training is done.

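The passes above follow the standard perceptron procedure: sweep the training set and apply the training rule to each misclassified example. A minimal sketch (function names are mine) that reproduces the end point of the walkthrough:

```python
def fire(w, inputs, threshold=1):
    # Threshold neuron: +1 if the weighted sum clears the threshold, else -1
    return 1 if sum(wi * x for wi, x in zip(w, inputs)) > threshold else -1

def train(examples, w, r=0.2, epochs=10):
    # Sweep the training set; wi += r * (t - a) * xi only changes the
    # weights when the example is misclassified (t - a = 0 otherwise)
    for _ in range(epochs):
        for inputs, t in examples:
            a = fire(w, inputs)
            w = [wi + r * (t - a) * x for wi, x in zip(w, inputs)]
    return w

# OR training set, initial weights, and learning rate from the slides
OR = [((0, 0), -1), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
print([round(wi, 2) for wi in train(OR, [0.3, 0.7])])  # [1.1, 1.1]
```

Run on the AND training set of the next slides with the same starting weights, the same routine ends at w1 = w2 = .7.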

Training for ‘AND’

Training set:

X1 X2 | target
 0  0 | -1
 0  1 | -1
 1  0 | -1
 1  1 | +1

Initial random weights: w1 = .3, w2 = .7
Learning rate: r = .2


Applying the Training Set for AND - 1

Current weights: w1 = .3, w2 = .7

X1 X2 | a
 0  0 | -1
 0  1 | -1
 1  0 | -1
 1  1 | -1  X (misclassified; target +1)

Δw1 = r * (t – a) * x1 = .2 * (1 – (-1)) * 1 = .4
Δw2 = r * (t – a) * x2 = .2 * (1 – (-1)) * 1 = .4

w1 = w1 + Δw1 = .3 + .4 = .7
w2 = w2 + Δw2 = .7 + .4 = 1.1


Applying the Training Set for AND - 2

Current weights: w1 = .7, w2 = 1.1

X1 X2 | a
 0  0 | -1
 0  1 | +1  X (misclassified; target -1)

Δw1 = r * (t – a) * x1 = .2 * (-1 – (+1)) * 0 = 0
Δw2 = r * (t – a) * x2 = .2 * (-1 – (+1)) * 1 = -.4

w1 = w1 + Δw1 = .7 + 0 = .7
w2 = w2 + Δw2 = 1.1 – .4 = .7


Applying the Training Set for AND - 3

Current weights: w1 = .7, w2 = .7

X1 X2 | a
 0  0 | -1
 0  1 | -1
 1  0 | -1
 1  1 | +1

All four examples are now classified correctly, so training is done.

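The learned weights can be sanity-checked against the full truth table (a quick sketch; the threshold of 1 is the one set in “How Are ANNs Trained?”):

```python
def and_net(x, y, w1=0.7, w2=0.7, threshold=1):
    # The learned AND neuron: fire (+1) only when the weighted sum clears the threshold
    return 1 if w1 * x + w2 * y > threshold else -1

for x in (0, 1):
    for y in (0, 1):
        print(x, y, and_net(x, y))  # -1 everywhere except (1, 1)
```

Only (1, 1) gives .7 + .7 = 1.4 > 1; every other input pair stays at or below .7.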

Applying the Technology

Date       #Trans  CPUBusy  RespTime  DiskBusy  NetBusy
01-Oct-93      28        3         9        71        3
02-Oct-93     140       80         6        90        4
03-Oct-93     156       87         4        12        5
04-Oct-93     187       95         7        69        5
05-Oct-93     226       40         0        16        5
06-Oct-93     288       16         5        40        6
07-Oct-93     309       10         2        64        6
08-Oct-93     449       84         4        18        8
09-Oct-93     453       89         3        32        8
10-Oct-93     481       77         2        44        8
11-Oct-93     535       23         8        61        8
12-Oct-93     609       37         3        86        9
13-Oct-93     658       58         9        51        9
14-Oct-93     739       33         8        25        9
15-Oct-93     776       25         1        34       10


Select The Data Set

Choose data for the Neugent


Select The Output That You Want to Predict

Choose Inputs

Identify the Outputs


Train And Validate the Neugent

Choose Action to be performed:

• Create the model (Quick Train)

• Train & Validate (to understand the predictive capability)

• Investigate the data (Export to Excel or Data Analysis)


Validate the Neugent With the Data Set

Selecting Training Data –

• Select a random sample percentage

• Use the entire data set


Neugent Model is Trained, Tested, and Validated

Training Results –

•Model Fit: 99.598%(trained model quality)

•Predictive Capability: 99.598%(tested model quality)


View The Results in Excel

Consult trained Neugent for prediction

Save results using Excel


Data Analysis

Stats & Filtering: mean, min, max, std dev, filtering constraints

Ranking: input significance

Correlation Matrix: corr. between all fields


Correlation Matrix

The closer to 1, the stronger the indication that the information represented by the two fields is the same

NetBusy vs #Trans = .9966
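As a rough cross-check, the correlation can be recomputed from the measurement table shown earlier (a sketch; on just the 15 October rows shown there it comes out somewhat below the .9966 the tool reports, which presumably uses the full Oct–Feb data set):

```python
from math import sqrt

# #Trans and NetBusy columns from the 15 rows on the measurement slide
trans   = [28, 140, 156, 187, 226, 288, 309, 449, 453, 481, 535, 609, 658, 739, 776]
netbusy = [3, 4, 5, 5, 5, 6, 6, 8, 8, 8, 8, 9, 9, 9, 10]

def pearson(xs, ys):
    # Pearson correlation coefficient of two equal-length samples
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

print(round(pearson(trans, netbusy), 4))  # strongly positive, close to 1
```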


Actual Vs Predicted

[Chart: “Net Busy: Actual Vs Predicted” – NetBusy_actual and NetBusy_predicted usage plotted by date, from 10/1/93 through 2/18/94]


Actual Vs Predicted

Test results:

label      NetBusy_actual  NetBusy_predicted
01-Oct-93  3               3.93916
02-Oct-93  4               3.55748
03-Oct-93  5               6.07377
04-Oct-93  5               5.22928
05-Oct-93  5               4.69116
06-Oct-93  6               4.7997
07-Oct-93  6               7.08912
08-Oct-93  8               7.35073
09-Oct-93  8               5.7424
10-Oct-93  8               7.92246
11-Oct-93  8               7.94412
12-Oct-93  9               9.02078
13-Oct-93  9               9.51989
14-Oct-93  9               8.7129


Summary

Neural networks:
• Modeled after neurons in the brain
• Artificial neurons are simple
• Neurons can be trained
• Networks of neurons can be taught how to respond to input
• Models can be built quickly
• Accurate predictions can be made


Questions?

Questions, comments, … ??

Finding me –
Dr Bernie Domanski
Email: domanski@postbox.csi.cuny.edu
Website: http://domanski.cs.csi.cuny.edu
Phone: (718) 982-2850
Fax: 2356

Thanks for coming and listening !