Transcript of msyd/wyknai/perceptron3.pdf · Sydow, Tools of AI · Neuron, Perceptron, How Perceptron Learns (31 pages)

Page 1:

Tools of AI

Marcin Sydow

Sidebar outline: Repetition · Decision Table · Learning · Evaluation and Overfitting · Neural Networks · Perceptron · Limitations · Learning · Summary

1. Evaluation and Overfitting

2. Introduction to Neural Networks: Perceptron

Marcin Sydow

12.03.09

Page 2:

Topics covered by this lecture:

(Repetition: Decision Table, Supervised Learning)

Evaluation, Overfitting

Neuron, Perceptron, How Perceptron Learns

Page 3:

Repetition: Decision Table for a Mysterious Outdoor Game

outlook temperature humidity windy PLAY?

sunny hot high false no

sunny hot high true no

overcast hot high false yes

rainy mild high false yes

rainy cool normal false yes

rainy cool normal true no

overcast cool normal true yes

sunny mild high false no

sunny cool normal false yes

rainy mild normal false yes

sunny mild normal true yes

overcast mild high true yes

overcast hot normal false yes

rainy mild high true no

Page 4:

Decision Table: Cases and Attributes

Knowledge can be built on previously observed cases.

Each case is described by some attributes of a specified type (nominal or numeric).

For a given case, each of its attributes has some value (usually).

Decision Table:

cases = rows

attributes = columns

(a basic concept in data mining)
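The rows-and-columns idea can be sketched directly in code; a minimal Python illustration (the names `attributes`, `cases` and `value` are mine, not from the lecture):

```python
# A decision table: cases are rows, attributes are columns.
# The last attribute ("play") is the decision attribute.
attributes = ["outlook", "temperature", "humidity", "windy", "play"]

cases = [
    ("sunny",    "hot",  "high", False, "no"),
    ("overcast", "hot",  "high", False, "yes"),
    ("rainy",    "mild", "high", False, "yes"),
]

def value(case, attribute):
    """Look up one attribute value of one case (row)."""
    return case[attributes.index(attribute)]

print(value(cases[1], "play"))  # decision attribute of the second row: yes
```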

Page 5:

Machine Learning

Task: to learn the relationships between the values of attributes

This knowledge is to be discovered automatically by machine

There are two main paradigms:

1 Supervised Learning

2 Unsupervised Learning

Page 6:

Supervised Learning

1. Given a new case (row): its attribute values are known, except the decision attribute

2. Task: "predict" the correct value of the decision attribute

3. Learn it on an available training set, for which all attribute values are known, including the decision attribute

It is called:

classification, when the decision attribute is nominal

regression, when the decision attribute is numeric

Some practical problems: e.g. the training set can contain:

some noisy, erroneous or missing attribute values

inconsistent rows (different decisions for the same attribute values)

Page 7:

Outdoor Game - a new case

outlook temperature humidity windy PLAY?

sunny hot high false no

sunny hot high true no

overcast hot high false yes

rainy mild high false yes

rainy cool normal false yes

rainy cool normal true no

overcast cool normal true yes

sunny mild high false no

sunny cool normal false yes

rainy mild normal false yes

sunny mild normal true yes

overcast mild high true yes

overcast hot normal false yes

rainy mild high true no

overcast cool high true ???

Page 8:

General Scheme of Supervised Learning

1. Data acquisition

2. Data cleaning and pre-processing

3. Division into a training set and an evaluation set

4. Learning on the training set and evaluating on the evaluation set (iterative)

5. Using the system

Page 9:

Evaluation and the Problem of Overfitting

After each phase of learning, the system has to be evaluated.

How do we measure how much the system has learnt?

We have the training set. One could count the fraction of training cases on which the system gives the incorrect answer (the training error rate).

However, the training set is finite, so the system can often learn it exactly. Learning strictly the training data is known as the overfitting problem.

This is not the goal, since an overfitted classifier is incapable of generalising its knowledge to unknown cases. (This is similar to a student learning mathematics by heart, from examples, without understanding the general rules.)

Page 10:

Overcoming the Problem of Overfitting

If there is enough training data available, it is possible to keep some part of it as an evaluation set.

Important: no training is done on the evaluation set; it is used exclusively for evaluating how well the system can "generalise" the gained knowledge (i.e. to unseen cases).

In practice, training data is often too expensive or too scarce to use part of it as a fixed evaluation set. In such cases, other techniques are applied:

cross-validation (most popular)

leave-one-out

bootstrap

(These techniques will be discussed in another lecture.)
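A minimal sketch of the held-out evaluation set idea in Python (the function name and the 70/30 split are illustrative choices, not from the lecture):

```python
import random

def holdout_split(cases, eval_fraction=0.3, seed=0):
    """Split the data into a training set and an evaluation set.

    Training is done only on the first part; the held-out part is used
    exclusively to estimate how well the learned model generalises.
    """
    rng = random.Random(seed)          # fixed seed for reproducibility
    shuffled = cases[:]
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * (1 - eval_fraction))
    return shuffled[:cut], shuffled[cut:]

train, evaluation = holdout_split(list(range(10)))
print(len(train), len(evaluation))     # 7 3
```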

Page 11:

Neural Networks: Introduction

Page 12:

Neuron

The human neural system has been a natural source of inspiration for artificial intelligence researchers. Hence the interest in the neuron, the fundamental unit of the neural system.

Page 13:

Behaviour of a Neuron

Transmits the neural signal from the "inputs" (dendrites: the endings of a neural cell contacting other cells) to the "output" (the neurite, often a very long ending, which transmits the signal to further neurons)

Non-linear signal processing: the output state is not a simple sum of the input signals

Dynamically modifies its connections with other neurons via synapses (connecting elements), which makes it possible to strengthen or weaken the signal received from other neurons according to the current task

Page 14:

Perceptron: an Artificial Neuron

The perceptron is a simple mathematical model of a neuron.

Historically, the goal of the work on neural networks was to attain the ability to generalise (approximate) and learn, specific to the human brain (according to one of the definitions of artificial intelligence). Currently, artificial neural networks focus on less "ambitious" but more realistic tasks.

A single perceptron can serve as a classifier or regressor.

The perceptron is a building block in more complex artificial neural network structures that can solve practical problems:

supervised or unsupervised learning

controlling complex mechanical devices (e.g. robotics)

Page 15:

Perceptron - a simple model of a natural neuron

A perceptron consists of:

n inputs x_1, ..., x_n, corresponding to dendrites

n weights w_1, ..., w_n, corresponding to synapses; each weight w_i is attached to the i-th input x_i

a threshold Θ

one output y

(All the variables are real numbers.)

The output y is computed as follows:

y = 1 if Σ_{i=1}^{n} w_i · x_i = W^T X ≥ Θ (perceptron "activated"), 0 otherwise (not "activated")

W, X ∈ R^n denote the vectors of weights and inputs, respectively.
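The output rule above can be written as a few lines of Python (a sketch; the weights and threshold in the example are arbitrary illustrative values):

```python
def perceptron_output(weights, inputs, theta):
    """Discrete perceptron: y = 1 if sum_i w_i * x_i >= theta, else 0."""
    net = sum(w * x for w, x in zip(weights, inputs))
    return 1 if net >= theta else 0

# Activated only when the weighted sum of the inputs reaches the threshold:
print(perceptron_output([0.5, 0.5], [1, 1], 0.8))  # 1 (net = 1.0 >= 0.8)
print(perceptron_output([0.5, 0.5], [1, 0], 0.8))  # 0 (net = 0.5 < 0.8)
```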

Page 16:

The perceptron is activated (y = 1) only when the dot product W^T X (sometimes called the "net") reaches the specified threshold Θ.

We call the perceptron:

discrete if y ∈ {0, 1} (or {−1, 1})

continuous if y ∈ [0, 1] (or [−1, 1])

Page 17:

Perceptron: Geometrical Interpretation

The computation of the perceptron's output has a simple geometrical interpretation.

Consider the n-dimensional input space (each point in it is a potential input vector X ∈ R^n). The weight vector W ∈ R^n is the normal vector of the decision hyperplane.

The perceptron is activated (outputs 1) only if the input vector X is on the same side of the decision hyperplane as the weight vector W.

Moreover, for input vectors of a fixed length, the net value (W^T X) is maximal when X points close to W, zero when they are orthogonal, and minimal (negative) when they point in opposite directions.

Page 18:

Geometrical Interpretation of Perceptron, cont.

The required behaviour of the perceptron can be obtained by adjusting its weights and threshold appropriately.

The weight vector W determines the "direction" of the decision hyperplane. The threshold Θ determines how far the decision hyperplane is moved from the origin (the 0 point).

Page 19:

Example: Perceptrons can simulate logical circuits

A single perceptron with appropriately set weights and threshold

can easily simulate basic logical gates:
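A minimal sketch, with one possible choice of weights and thresholds (many other choices work equally well):

```python
def perceptron(weights, theta):
    """Build a discrete perceptron as a function of its inputs."""
    return lambda *x: 1 if sum(w * xi for w, xi in zip(weights, x)) >= theta else 0

AND = perceptron([1, 1], theta=1.5)   # fires only when both inputs are 1
OR  = perceptron([1, 1], theta=0.5)   # fires when at least one input is 1
NOT = perceptron([-1], theta=-0.5)    # fires when the single input is 0

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "AND:", AND(x1, x2), "OR:", OR(x1, x2))
```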

Page 20:

Limitations of a single perceptron

A single perceptron can "distinguish" (by the value of its output) only sets of inputs which are linearly separable in the input space (i.e. there exists an (n−1)-dimensional hyperplane separating the positive and negative cases).

One of the simplest examples of a linearly non-separable set is the logical function XOR (exclusive or).

Page 21:

Limitations of a single perceptron

One can see in the pictures below that the functions AND and OR correspond to linearly separable sets, so each of them can be modeled by a single perceptron (as shown in the previous section), while XOR cannot be modeled by any single perceptron.
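This can also be checked numerically (an illustration, not a proof): searching a coarse grid of candidate weights and thresholds finds a separating perceptron for AND but none for XOR:

```python
def separates(truth_table, w1, w2, theta):
    """True if the perceptron (w1, w2, theta) reproduces the truth table."""
    return all((w1 * x1 + w2 * x2 >= theta) == bool(y)
               for (x1, x2), y in truth_table.items())

XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
AND = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}

grid = [i / 4 for i in range(-8, 9)]   # -2.0, -1.75, ..., 2.0
candidates = [(w1, w2, t) for w1 in grid for w2 in grid for t in grid]

print(any(separates(XOR, *c) for c in candidates))  # False
print(any(separates(AND, *c) for c in candidates))  # True
```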

Page 22:

Network of Perceptrons

The output of one perceptron can be connected to the input of other perceptron(s) (as in the neural system). This makes it possible to extend the computational possibilities of a single perceptron.

For example, XOR can be simulated by joining 2 perceptrons and appropriately setting their weights and thresholds.

Remark: by "perceptron" or "multilayer perceptron" one can also mean a network of such connected units (neurons).
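A minimal sketch of such a two-perceptron XOR in Python (the particular weights are one illustrative choice among many):

```python
def step(net, theta):
    """Discrete perceptron activation: 1 if net >= theta, else 0."""
    return 1 if net >= theta else 0

def xor(x1, x2):
    """XOR from two threshold units.

    The first perceptron computes AND(x1, x2); the second combines the
    raw inputs with a strong negative weight on the AND signal.
    """
    h = step(x1 + x2, 1.5)              # AND(x1, x2)
    return step(x1 + x2 - 2 * h, 0.5)   # OR(x1, x2) suppressed by AND(x1, x2)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", xor(a, b))    # 0 0->0, 0 1->1, 1 0->1, 1 1->0
```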

Page 23:

Example: Iris classification

A single perceptron can distinguish Iris setosa from the 2 other sub-species.

However, it cannot exactly recognise either of the other 2 sub-species.

Page 24:

Perceptron: Overcoming the limitations

The discovery of the above limitations (1969) blocked further development of neural networks for years.

An obvious method of coping with this problem is to join perceptrons together (e.g. two perceptrons are enough to model XOR) to form artificial neural networks. However, it is mathematically far more difficult to adjust such complex networks to our needs than in the case of a single perceptron.

Fortunately, the development of efficient techniques for learning perceptron networks (in the 1980s), i.e. automatically tuning their weights on the basis of positive and negative examples, caused the "renaissance" of artificial neural networks.

Page 25:

Single Perceptron as a Classifier

A single perceptron can be used as a tool in supervised machine learning, as a binary classifier (output equal to 0 or 1).

To achieve this, we present the perceptron with a training set of pairs:

input vector

correct answer (0 or 1)

The perceptron can "learn" the correct answers by appropriately setting its weight vector.

Page 26:

Learning Perceptron

We apply the "training examples" one by one. If the current output is the same as the desired one, we pass to the next example. If it is incorrect, we apply the following "perceptron learning rule" to the vector of its weights:

W′ = W + (d − y)αX

d - the desired (correct) output

y - the actual output

0 < α < 1 - the learning rate, a parameter tuned experimentally
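The rule can be sketched as a small training loop in Python (an illustration; the fixed threshold, the toy training set and the stopping criterion are my assumptions, and a later slide shows how the threshold itself can be learned):

```python
def train_perceptron(examples, alpha=0.2, theta=0.0, epochs=50):
    """Perceptron learning rule: W' = W + (d - y) * alpha * X.

    `examples` is a list of (input_vector, desired_output) pairs,
    with desired output 0 or 1. The threshold is kept fixed here.
    """
    n = len(examples[0][0])
    w = [0.0] * n
    for _ in range(epochs):            # several passes over the training set
        errors = 0
        for x, d in examples:
            net = sum(wi * xi for wi, xi in zip(w, x))
            y = 1 if net >= theta else 0
            if y != d:                 # wrong answer: move W towards/away from X
                errors += 1
                w = [wi + (d - y) * alpha * xi for wi, xi in zip(w, x)]
        if errors == 0:                # whole training set classified correctly
            break
    return w

# Learn a trivially separable 1-dimensional problem:
w = train_perceptron([([1.0], 1), ([-1.0], 0)])
print(w)  # [0.2]
```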

Page 27:

Interpretation of Perceptron Learning Rule

To "force" the perceptron to give the desired outputs, its weight vector should be maximally "close" to the positive (y = 1) cases. Hence the formula:

W′ = W + (d − y)αX

move W towards a "positive" X if the perceptron outputs 0 instead of 1 ("too weak activation")

move W away from a "negative" X if it outputs 1 instead of 0 ("too strong activation")

Usually, the whole training set has to be passed several times to obtain the desired weights of the perceptron.

Page 28:

The Role of Threshold

If the activation threshold Θ is set to 0, the perceptron can "distinguish" only classes which are separable by a decision hyperplane containing the origin of the input space (the 0 vector).

To "move" the decision hyperplane away from the origin, the threshold has to be set to some non-zero value.

Page 29:

Incorporating Threshold into Learning

W^T X ≥ Θ

W^T X − Θ ≥ 0

Hence, X can be extended by appending −1 (a fake (n+1)-th input) and W can be extended by appending Θ. Denote by W′ and X′ the extended (n+1)-dimensional vectors. Now we have:

W′^T X′ ≥ 0 - the same form as "without" the threshold

Now the learning rule can be applied to the extended vectors W′ and X′.
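The extension can be checked numerically; a minimal Python sketch (the weights and threshold are arbitrary example values):

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def extend_input(x):
    """Append the fake (n+1)-th input, equal to -1."""
    return list(x) + [-1.0]

def extend_weights(w, theta):
    """Append the threshold as the (n+1)-th weight."""
    return list(w) + [theta]

# The two formulations agree: W^T X >= theta  iff  W'^T X' >= 0,
# since W'^T X' = W^T X - theta.
w, theta = [0.5, 0.5], 0.8
for x in ([1, 1], [1, 0], [0, 0]):
    original = dot(w, x) >= theta
    extended = dot(extend_weights(w, theta), extend_input(x)) >= 0
    print(x, original, extended)
```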

Page 30:

Questions/Problems:

the problem of overfitting in supervised learning

training set and evaluation set

desired properties of an artificial neuron

mathematical formulation of the perceptron

how the perceptron computes its output

geometric interpretation of the perceptron's computation

mathematical limitations of the perceptron

the learning rule for the perceptron

geometric interpretation of the perceptron's learning rule

Page 31:

Thank you for your attention