Lecture 3 Perceptrons
-
Upload
umeshkathuria -
Category
Documents
-
view
219 -
download
0
Transcript of Lecture 3 Perceptrons
7/21/2019 Lecture 3 Perceptrons
http://slidepdf.com/reader/full/lecture-3-perceptrons 1/46
The Linear Classifier,
also known as the “Perceptron”
COMP24111 lecture 3
7/21/2019 Lecture 3 Perceptrons
http://slidepdf.com/reader/full/lecture-3-perceptrons 2/46
LAST WEEK: our first“machine learning” algorithm
Testing point x
For each training datapoint x’
measure distance(x,x’)
End
Sort distances
Select K nearest
Assign most common class
The K-Nearest Neighbour Classifier
Make your own notes on itsadvantages / disadvantages.
I’ll ask for volunteers nexttime we meet…..
7/21/2019 Lecture 3 Perceptrons
http://slidepdf.com/reader/full/lecture-3-perceptrons 3/46
Model(memorize the
training data)
Testing Data(no labels)
Training data
Predicted Labels
Learning algorithm
(do nothing)
Supervised Learning Pipeline for Nearest Neighbour
7/21/2019 Lecture 3 Perceptrons
http://slidepdf.com/reader/full/lecture-3-perceptrons 4/46
Themost important concept in Machine Learning
7/21/2019 Lecture 3 Perceptrons
http://slidepdf.com/reader/full/lecture-3-perceptrons 5/46
Looks good so far…
Themost important concept in Machine Learning
7/21/2019 Lecture 3 Perceptrons
http://slidepdf.com/reader/full/lecture-3-perceptrons 6/46
Looks good so far…
Oh no! Mistakes!
What happened?
Themost important concept in Machine Learning
7/21/2019 Lecture 3 Perceptrons
http://slidepdf.com/reader/full/lecture-3-perceptrons 7/46
Looks good so far…
Oh no! Mistakes!
What happened?
We didn’t have all the data.
We can never assume that we do.
This is called “OVER-FITTIN”to the small dataset.
Themost important concept in Machine Learning
7/21/2019 Lecture 3 Perceptrons
http://slidepdf.com/reader/full/lecture-3-perceptrons 8/46
The Linear Classifier
COMP24111 lecture 3
7/21/2019 Lecture 3 Perceptrons
http://slidepdf.com/reader/full/lecture-3-perceptrons 9/46
Model(equation)
Testing Data(no labels)
Training data
Predicted Labels
Learning algorithm
(search for good
parameters)
Supervised Learning Pipeline for Linear Classifiers
7/21/2019 Lecture 3 Perceptrons
http://slidepdf.com/reader/full/lecture-3-perceptrons 10/46
A more simple,compact model?
height
weightt
dancer""else player""then)(if t weight >
7/21/2019 Lecture 3 Perceptrons
http://slidepdf.com/reader/full/lecture-3-perceptrons 11/46
What’s an algorithm to find a good threshold?
height
weight
dancer""else player""then)(if t weight >
t=40
while ( numMistakes != 0 ){
t = t + 1
numMistakes = testRule(t)
}
7/21/2019 Lecture 3 Perceptrons
http://slidepdf.com/reader/full/lecture-3-perceptrons 12/46
We have our second Machine Learning procedure.
dancer""else player""then)(if t weight >
t=40while ( numMistakes != 0 )
{
t = t + 1
numMistakes = testRule(t)
}
The threshold classifier (also known as a“Decision Stump”)
7/21/2019 Lecture 3 Perceptrons
http://slidepdf.com/reader/full/lecture-3-perceptrons 13/46
Three“ingredients” of a Machine Learning procedure
“Model”The final product, the thing you have to package upand send to a customer. A piece of code with some parameters that need to be set.
“Error function”The performance criterion: the function you use to judge how well the parameters of the model are set.
“Learningalgorithm”
The algorithm that optimises the model parameters,using the error function to judge how well it is doing.
7/21/2019 Lecture 3 Perceptrons
http://slidepdf.com/reader/full/lecture-3-perceptrons 14/46
Three“ingredients” of a Threshold Classifier
Model
Learningalgorithm
t=40
while ( numMistakes != 0 )
{ t = t + 1
numMistakes = testRule(t)
}
Error function
dancer""else player""then)(if t x >
General case – we’ re not just talking about the weight
of a rugby player – a threshold can be put on any feature‘ x’ .
7/21/2019 Lecture 3 Perceptrons
http://slidepdf.com/reader/full/lecture-3-perceptrons 15/46
7/21/2019 Lecture 3 Perceptrons
http://slidepdf.com/reader/full/lecture-3-perceptrons 16/46
Model
(memorize the
training data)
Testing Data(no labels)
Training data
Predicted Labels
Learning algorithm
(do nothing)
Supervised Learning Pipeline for Nearest Neighbour
7/21/2019 Lecture 3 Perceptrons
http://slidepdf.com/reader/full/lecture-3-perceptrons 17/46
What’s the“model” for the Nearest Neighbour classifier?
height
weight
For the k-nn, the model is thetraining data itself ! - very good accuracy - very computationally intensive!
Testing point x
For each training datapoint x’
measure distance(x,x’)
End
Sort distances
Select K nearest
Assign most common class
7/21/2019 Lecture 3 Perceptrons
http://slidepdf.com/reader/full/lecture-3-perceptrons 18/46
height
weight
New data: what’s an algorithm to find a good threshold?
Our model does notmatch the problem!
t
1 mistake…
dancer""else player""then)(if t weight >
7/21/2019 Lecture 3 Perceptrons
http://slidepdf.com/reader/full/lecture-3-perceptrons 19/46
height
weight
New data: what’s an algorithm to find a good threshold?
But our current modelcannot represent this…
dancer""else player""then)(if t weight >
7/21/2019 Lecture 3 Perceptrons
http://slidepdf.com/reader/full/lecture-3-perceptrons 20/46
We need a more sophisticated model…
dancer""else player""then)(if t x >
7/21/2019 Lecture 3 Perceptrons
http://slidepdf.com/reader/full/lecture-3-perceptrons 21/46
7/21/2019 Lecture 3 Perceptrons
http://slidepdf.com/reader/full/lecture-3-perceptrons 22/46
nput signals sent
from other neurons
f enough sufficient
signals accumulate! theneuron fires a signal"
#onnection strengths determine
ho$ the signals are accumulated
7/21/2019 Lecture 3 Perceptrons
http://slidepdf.com/reader/full/lecture-3-perceptrons 23/46
1 x
2 x
3 x
add
)( t aif >1=output output
signal
• input signals!x’ and coefficients!w’ are multiplied
• weights correspond to connection strengths
• signals are added up – if they are enough, FIRE!
else0=output
1w
2w
3w
i
M
i
iw xa ∑=
=1
incoming
signal
connection
strengthacti%ation
le%el
output
signal
7/21/2019 Lecture 3 Perceptrons
http://slidepdf.com/reader/full/lecture-3-perceptrons 24/46
Sum notation
(just like a loop from 1 to
M)
double[] x =
double[] =
Multiple corresponding
elements and add them up
a
if (actiation ! threshold) "#$% &
(actiation)
i
M
i
iw xa ∑=
=1
Calculation…
7/21/2019 Lecture 3 Perceptrons
http://slidepdf.com/reader/full/lecture-3-perceptrons 25/46
t >if 0else '1then == output output
∑
=
i
M
i
iw x1
The Perceptron Decision Rule
7/21/2019 Lecture 3 Perceptrons
http://slidepdf.com/reader/full/lecture-3-perceptrons 26/46
out"ut # $out"ut # %
t >if 0else '1then == output output
∑=
i
M
i
iw x1
Ru&'( "la(er # $)allet dancer # %
7/21/2019 Lecture 3 Perceptrons
http://slidepdf.com/reader/full/lecture-3-perceptrons 27/46
s this a good decision boundar&'
t >if 0else '1then == output output
∑=
i
M
i
iw x1
7/21/2019 Lecture 3 Perceptrons
http://slidepdf.com/reader/full/lecture-3-perceptrons 28/46
w$ # $.%
w* # %.*
t # %.%+
t >if 0else '1then == output output
∑=
i
M
i
iw x1
7/21/2019 Lecture 3 Perceptrons
http://slidepdf.com/reader/full/lecture-3-perceptrons 29/46
w$ # *.$
w* # %.*
t # %.%+
t >if 0else '1then == output output
∑=
i
M
i
iw x1
7/21/2019 Lecture 3 Perceptrons
http://slidepdf.com/reader/full/lecture-3-perceptrons 30/46
w$ # $.,
w* # %.%*
t # %.%+
t >if 0else '1then == output output
∑=
i
M
i
iw x1
7/21/2019 Lecture 3 Perceptrons
http://slidepdf.com/reader/full/lecture-3-perceptrons 31/46
han&in& the wei&htsthreshold ma/es the decision 'oundar( move.
0ointless im"ossi'le to do it '( hand 1 onl( o/ 2or sim"le *-3 case.
We need an al&orithm4.
w$ # -%.5
w* # %.%6
t # %.%+
7/21/2019 Lecture 3 Perceptrons
http://slidepdf.com/reader/full/lecture-3-perceptrons 32/46
*0 '*0 '2*0 +=w
0*2 '*0 '0*1 += x
,1* -hat is the actiation' a' of the neuron.
,2* /oes the neuron fire.
,3* -hat if e set threshold at 0* and eight 3 to ero.
0*1=t
w$
w*
w6
7$
7*
76
∑=
= M
i
iiw xa1
Take a 20 minute break
and think about this.
7/21/2019 Lecture 3 Perceptrons
http://slidepdf.com/reader/full/lecture-3-perceptrons 33/46
20 minute break
7/21/2019 Lecture 3 Perceptrons
http://slidepdf.com/reader/full/lecture-3-perceptrons 34/46
*0 '*0 '2*0 +=w
0*2 '*0 '0*1 += x
0*1=t
w$
w*
w6
7$
7*
76
∑=
= M
i
iiw xa1
w$
w*
w6
7$
7*
76
∑=
= M
i
iiw xa1
3*1)*00*2()*0*0()2*00*1(1
=×+×+×==∑=
M
i
iiw xa
" *hat is the acti%ation! a! of the neuron'
+" Does the neuron fire'
if (activation > threshold) output else output"
…# $o %es& it fires#
7/21/2019 Lecture 3 Perceptrons
http://slidepdf.com/reader/full/lecture-3-perceptrons 35/46
*0 '*0 '2*0 +=w
0*2 '*0 '0*1 += x
0*1=t
w$
w*
w6
7$
7*
76
∑=
= M
i
iiw xa1
w$
w*
w6
7$
7*
76
∑=
= M
i
iiw xa1
*0)0*00*2()*0*0()2*00*1(1
=×+×+×==∑=
M
i
iiw xa
," *hat if $e set threshold at -". and $eight /, to zero'
if (activation > threshold) output else output"
…# $o no& it does not fire##
7/21/2019 Lecture 3 Perceptrons
http://slidepdf.com/reader/full/lecture-3-perceptrons 36/46
We need a more sophisticated model…
height
weight
dancer""else player""then)(if t weight >
dancer""else player""then))((if t x f > )(
)(
2
1
kg weight x
cmheight x
==
i
d
i
i xw∑=
=1
)4()4()( 2211 xw xw x f +=
The Perceptron
7/21/2019 Lecture 3 Perceptrons
http://slidepdf.com/reader/full/lecture-3-perceptrons 37/46
The Perceptron
i
d
i
i xw∑=
=1
)4()4()( 2211 xw xw x f +=
height
weight
height
weight
dancer""else player""then)(if t x f >
, and change the position of the DECISION BOUNDARYt 1w 2w
Decisionboundary
7/21/2019 Lecture 3 Perceptrons
http://slidepdf.com/reader/full/lecture-3-perceptrons 38/46
The Perceptron
error)tionclassifica(a*k*a* mistakesof 5um6erError function
height
weight
Model
Learning algo. alues***andtheoptimisetoneed****... t w
0
1
=
=
"dancer"
"player" 0y7else1y7thenif
1
==>∑=
t xw i
d
i
i
7/21/2019 Lecture 3 Perceptrons
http://slidepdf.com/reader/full/lecture-3-perceptrons 39/46
Perceptron Learning Rule
ne eight 8 old eight 9 0*1 ( true:a6el ; output ) input×
i24 8 target = !9 output = ! : 4. then u"date # ;
i24 8 target = !9 output = " : 4. then u"date # ;
i24 8 target = "9 output = ! : 4. then u"date # ;
i24 8 target = "9 output = " : 4. then u"date # ;
×
What 'eight updates do these cases produce?
update
7/21/2019 Lecture 3 Perceptrons
http://slidepdf.com/reader/full/lecture-3-perceptrons 40/46
initialise weights to random numbers in range -1 to +1
or n = 1 to "M#$%&R'%$
or ea*h training eam,le (.)
*al*ulate a*ti/ation
or ea*h weight
u,date weight b. learning rule
end
end
end
Perceptron convergence theorem:
If the data is linearly separable, then application of the
Perceptron learning rule will find a separating decision boundary,
within a finite number of iterations
Learning algorithm for the Perceptron
7/21/2019 Lecture 3 Perceptrons
http://slidepdf.com/reader/full/lecture-3-perceptrons 41/46
Model(if… then…)
Testing Data(no labels)
Training data
Predicted Labels
Learning algorithm(search for good
parameters)
Supervised Learning Pipeline for Perceptron
N d l l bl
7/21/2019 Lecture 3 Perceptrons
http://slidepdf.com/reader/full/lecture-3-perceptrons 42/46
New data…. non-linearly separable
height
weight
Our model does notmatch the problem!
(AGAIN!)
Many mistakes!
dancer""else player""thenif 1
t xw i
d
i
i >∑=
M lil P
7/21/2019 Lecture 3 Perceptrons
http://slidepdf.com/reader/full/lecture-3-perceptrons 43/46
Multilayer Perceptron
x1
x2
x3
x4
x5
7/21/2019 Lecture 3 Perceptrons
http://slidepdf.com/reader/full/lecture-3-perceptrons 44/46
MLPd ii b d li bl ld!
7/21/2019 Lecture 3 Perceptrons
http://slidepdf.com/reader/full/lecture-3-perceptrons 45/46
MLP decision boundary – nonlinear problems, solved!
height
weight
N lNt k
7/21/2019 Lecture 3 Perceptrons
http://slidepdf.com/reader/full/lecture-3-perceptrons 46/46
Neural Networks - summary
Perceptrons are a (simple) emulation of a neuron.
Layering perceptrons gives you… a multilayer perceptron.An MLP is one type of neural network – there are others.
An MLP with sigmoid activation functions can solve highly nonlinear problems.
Downside – we cannot use the simple perceptron learning algorithm.
Instead we have the backpropagation algorithm.
This is outside the scope of this introductory course.