CSC 9010

24
1 CSC 9010 Spring 2011. Paula Matuszek CSC 9010 ANN Lab Paula Matuszek Spring, 2011

description

CSC 9010. ANN Lab Paula Matuszek Spring, 2011. CSC 9010 Spring 2011. Paula Matuszek. Knowledge-Based Systems and Artificial Neural Nets. Knowledge Representation means: Capturing human knowledge In a form computer can reason about Is this relevant to ANNs? - PowerPoint PPT Presentation

Transcript of CSC 9010

Page 1: CSC 9010

1

CSC 9010 Spring 2011. Paula Matuszek

CSC 9010

ANN Lab

Paula Matuszek

Spring, 2011

Page 2: CSC 9010

2

CSC 9010 Spring 2011. Paula Matuszek

Knowledge-Based Systems and Artificial Neural Nets• Knowledge Representation means:

– Capturing human knowledge

– In a form computer can reason about

• Is this relevant to ANNs?

• Consider the question we’ve been modeling all semester:– what class should I take?

• What would be involved in representing this as a neural net?

Page 3: CSC 9010

3

CSC 9010 Spring 2011. Paula Matuszek

Knowledge for ANNs• What human knowledge do we still need to

capture?– Input variables– Output variables– Training and test cases

• Let’s look at some.– http://www.aispace.org/neural/

Page 4: CSC 9010

4

CSC 9010 Spring 2011. Paula Matuszek

Example from AISpace• Mail

– Examples– Set properties– Initialize the parameters– Solve– How do we use it? Calculate output

Page 5: CSC 9010

5

CSC 9010 Spring 2011. Paula Matuszek

Our “Which class to take” problem• Inputs?

• Outputs?

• Sample data

Page 6: CSC 9010

6

CSC 9010 Spring 2011. Paula Matuszek

Some Examples• Example 1:

– 3 inputs, 1 output, all binary

• Example 2:– same inputs, output inverted

Page 7: CSC 9010

7

CSC 9010 Spring 2011. Paula Matuszek

Getting the right inputs• Example 3

– Same inputs as 1 and 2– Same output as 1– Outcomes reversed for half the cases

Page 8: CSC 9010

8

CSC 9010 Spring 2011. Paula Matuszek

Getting the right inputs• Example 3

– Same inputs as 1 and 2– Same output as 1– Outcomes reversed for half the cases

• Network is not converging

• The output here cannot be predicted from these inputs.

• Whatever is determining whether to take the class, we haven’t captured it

Page 9: CSC 9010

9

CSC 9010 Spring 2011. Paula Matuszek

Representing non-numeric values• Example 4

– Required is represented as “yes” or “no”

Page 10: CSC 9010

10

CSC 9010 Spring 2011. Paula Matuszek

Representing non-numeric values• Example 4

– Required is represented as “yes” or “no”

• Actual model still uses 1 and 0; transformation is done by applet.

Page 11: CSC 9010

11

CSC 9010 Spring 2011. Paula Matuszek

More non-numeric values• Example 5

• Workload is low, med, high: text values but they can be ordered.

Page 12: CSC 9010

12

CSC 9010 Spring 2011. Paula Matuszek

More non-numeric values• Example 5

• Workload is low, med, high: text values but they can be ordered.

• Applet asks us to assign values. 1, 0.5, 0 is typical.

Page 13: CSC 9010

13

CSC 9010 Spring 2011. Paula Matuszek

Unordered values• Example 6

– Input variables here include professor– Non-numeric, can’t be ordered.

Page 14: CSC 9010

14

CSC 9010 Spring 2011. Paula Matuszek

Unordered values• Example 6

– Input variables here include professor– Non-numeric, can’t be ordered.

• Still need numeric values

• Solution is to treat n possible values as n separate binary values

• Again, applet does this for us

Page 15: CSC 9010

15

CSC 9010 Spring 2011. Paula Matuszek

Variables with more values• Example 7

– GPA and number of classes taken are integer values

• Takes considerably longer to solve

• Looks for a while like it’s not converging

Page 16: CSC 9010

16

CSC 9010 Spring 2011. Paula Matuszek

And Reals• Example 8

– GPA is a real.

• Takes about 20,000 steps to converge

Page 17: CSC 9010

17

CSC 9010 Spring 2011. Paula Matuszek

And multiple outputs• Small Car database from AIspace

• For any given input case, you will get a value for each possible outcome.

• Typical for, for instance, character recognition.

Page 18: CSC 9010

18

CSC 9010 Spring 2011. Paula Matuszek

Training and Test Cases• The basic training approach will fit the training

data as closely as possible.

• But we really want something that will generalize to other cases

• This is why we have test cases.– The training cases are used to compute the weights

– The test cases tell us how well they generalize

• Both training and test cases should represent the overall population as well as possible.

Page 19: CSC 9010

19

CSC 9010 Spring 2011. Paula Matuszek

Representative Training Cases• Example 9

– Training cases and test cases are similar

Page 20: CSC 9010

20

CSC 9010 Spring 2011. Paula Matuszek

Representative Training Cases• Example 9

– Training cases and test cases are similar– (actually identical...)

• Training error and test error are comparable

Page 21: CSC 9010

21

CSC 9010 Spring 2011. Paula Matuszek

Non-Representative Training Cases• Example 10

– Training cases and test cases represent different circumstances

– We’ve missed including any cases involving Lee in the training

• Training error goes down, but test error goes up.

• In reality these are probably bad training AND test cases; neither seems representative.

Page 22: CSC 9010

22

CSC 9010 Spring 2011. Paula Matuszek

So:• Getting a good ANN still involves

understanding your domain and capturing knowledge about it– choosing the right inputs and outputs

– choosing representative training and test sets

– Beware “convenient” training sets

• You can represent any kind of variable: numeric or not, ordered or not.

• Not every set of variables and training cases will produce something that can be trained.

Page 23: CSC 9010

23

CSC 9010 Spring 2011. Paula Matuszek

Once it’s trained...• When your ANN is trained, you can feed it a specific

set of inputs and get one or more outputs.

• These outputs are typically interpreted as some decision:– take the class

– this is probably a “5”

• The network itself is black box.

• If the situation changes the ANN should be retrained– new variables

– new values for some variables

– new patterns of cases

Page 24: CSC 9010

24

CSC 9010 Spring 2011. Paula Matuszek

One last note• These have all been simple cases, as

examples

• Most of my examples could in fact be predicted much more easily and cleanly with a decision tree, or even a couple of IF statements

• A more typical use for any connectionist system has many more inputs and many more training cases