Perceptrons and Learning: Learning in Neural Networks
Perceptrons and Learning
Learning in Neural Networks
Automated Learning Techniques
• ID3: a technique for automatically building a good decision tree from a given classification of examples and counter-examples.
Automated Learning Techniques
• Algorithm W (Winston): an algorithm that develops a “concept” based on examples and counter-examples.
Automated Learning Techniques
• Perceptron: an algorithm that develops a classification based on examples and counter-examples.
• Techniques for problems that are not linearly separable (neural networks, support vector machines).
Natural versus Artificial Neuron
• Natural neuron versus McCulloch-Pitts neuron
One Neuron (McCulloch-Pitts)
• This is very complicated. But abstracting the details, we have a unit with inputs x1, …, xn and weights w1, …, wn: the unit integrates the weighted sum Σ wi xi and fires when it exceeds a threshold.
• Note: the nonlinearity of the threshold is CRUCIAL!!
Perceptron
• Pattern identification (note: the neuron is trained)
• With weights wi, the unit signals Σ wi xi > threshold, meaning "the letter A is in the receptive field."
Three Main Issues
• Representability
• Learnability
• Generalizability
One Neuron (Perceptron)
• What can be represented by one neuron?
• Is there an automatic way to learn a function by examples?
Feed Forward Network
• Weights on each connection; each unit compares Σ wi xi to its threshold over its receptive field.
Representability
• What functions can be represented by a network of McCulloch-Pitts neurons?
• Theorem: every logic function of an arbitrary number of variables can be represented by a three-level network of neurons.
Proof
• Show the simple functions AND, OR, NOT, IMPLIES are representable.
• Recall that every logic function is representable in DNF (disjunctive normal form).
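The first step of the proof can be made concrete. Below is a minimal sketch (the helper name `mp_neuron` is illustrative, not from the slides) of McCulloch-Pitts threshold units computing AND, OR, NOT, and IMPLIES:

```python
def mp_neuron(weights, threshold):
    """Return a McCulloch-Pitts unit: outputs 1 iff the weighted sum of its
    inputs exceeds the threshold, otherwise 0."""
    return lambda *xs: 1 if sum(w * x for w, x in zip(weights, xs)) > threshold else 0

AND = mp_neuron([1, 1], 1.5)        # fires only when both inputs are 1
OR = mp_neuron([1, 1], 0.5)         # fires when at least one input is 1
NOT = mp_neuron([-1], -0.5)         # fires when the input is 0
IMPLIES = mp_neuron([-1, 1], -0.5)  # x -> y: false only on (1, 0)

for x in (0, 1):
    for y in (0, 1):
        print(x, y, AND(x, y), OR(x, y), IMPLIES(x, y))
```

The particular weights and thresholds are one choice among many; any setting that puts the right inputs above and below the threshold works.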
AND
OR
Perceptron
• What is representable? Linearly Separable Sets.
• Example: AND, OR function
• Not representable: XOR
• High dimensions: How to tell?
• Question: Convex? Connected?
XOR
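The non-representability of XOR has a short standard argument (not from the slides). Suppose a single unit with weights w1, w2 and threshold θ computed XOR:

```latex
\begin{aligned}
f(0,0)=0 &\Rightarrow 0 \le \theta \\
f(1,0)=1 &\Rightarrow w_1 > \theta \\
f(0,1)=1 &\Rightarrow w_2 > \theta \\
f(1,1)=0 &\Rightarrow w_1 + w_2 \le \theta
\end{aligned}
```

Adding the middle two lines gives w1 + w2 > 2θ ≥ θ (using θ ≥ 0 from the first line), contradicting the last line. So no weights and threshold exist.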
Convexity: Representable by a simple extension of the perceptron
• Clue: a body is convex if, whenever two points are inside it, every point between them is also inside.
• So just take a perceptron with an input for each triple of points.
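One way to see the "triple of points" idea concretely, using a digital (lattice) stand-in for convexity: each test involves two set points and the point between them, so it is an order-3 predicate. A minimal illustrative sketch (the function name is not from the slides):

```python
from itertools import combinations

def midpoint_convex(points):
    """Order-3 test: for every pair of points in the set whose midpoint is a
    lattice point, that midpoint must also be in the set."""
    pts = set(points)
    for (x1, y1), (x2, y2) in combinations(pts, 2):
        sx, sy = x1 + x2, y1 + y2
        if sx % 2 == 0 and sy % 2 == 0 and (sx // 2, sy // 2) not in pts:
            return False  # two points inside, a point between them outside
    return True

square = {(x, y) for x in range(3) for y in range(3)}  # a solid 3x3 block
with_hole = square - {(1, 1)}                          # the center removed

print(midpoint_convex(square), midpoint_convex(with_hole))
```

Each individual check looks at only three points at a time, which is why an order-3 perceptron (one input per triple) can aggregate them.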
Connectedness: Not Representable
Representability
• Perceptron: only linearly separable sets
– AND versus XOR
– Convex versus connected
• Many linked neurons: universal
– Proof: show AND, OR, NOT representable
– Then apply the DNF representation theorem
Learnability
• Perceptron Convergence Theorem:
– If the classification is representable, then the perceptron algorithm converges
– Proof (on later slides)
• Multi-neuron networks: good heuristic learning techniques
Generalizability
• Typically train a perceptron on a sample set of examples and counter-examples
• Then use it on the general class
• Training can be slow, but execution is fast
• Main question: how does training on the training set carry over to the general class? (Not simple)
Programming: Just find the weights!
• AUTOMATIC PROGRAMMING (or learning)
• One neuron: perceptron or Adaline
• Multi-level: gradient descent on continuous neurons (sigmoid instead of step function)
Perceptron Convergence Theorem
• If a perceptron solution exists, then the perceptron learning algorithm will find one in finite time.
• That is, IF there is a set of weights and a threshold that correctly classifies a class of examples and counter-examples, then one such set of weights will be found by the algorithm.
Perceptron Training Rule
• LOOP: take a positive or negative example and apply it to the neuron.
– If the answer is correct, go to LOOP.
– If incorrect, go to FIX.
• FIX: adjust the neuron weights by the input example:
– Positive example: Wnew = Wold + X (and lower the threshold, so the neuron fires more readily)
– Negative example: Wnew = Wold − X (and raise the threshold)
• Go to LOOP.
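The loop above can be sketched in code. This is a minimal illustrative implementation (function name and the AND data set are my choices, not the slides'); the threshold is folded into the weights via the (X, 1) encoding used later in the proof:

```python
import numpy as np

def train_perceptron(pos, neg, max_passes=1000):
    """Perceptron training rule: on a misclassified example, add X for a
    positive example and subtract X for a negative one. The threshold is
    carried as an extra weight on a constant input of 1."""
    aug = [(np.append(x, 1.0), +1) for x in pos] + \
          [(np.append(x, 1.0), -1) for x in neg]
    w = np.zeros(len(aug[0][0]))
    for _ in range(max_passes):
        changed = False
        for x, label in aug:
            if label * (w @ x) <= 0:   # wrong answer: go to FIX
                w = w + label * x      # +X for positive, -X for negative
                changed = True
        if not changed:                # a full pass with no FIX: converged
            break
    return w

# Learn AND, which is linearly separable.
pos = [np.array([1.0, 1.0])]
neg = [np.array([0.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 0.0])]
w = train_perceptron(pos, neg)
print(w)
```

On separable data such as AND the loop stops after finitely many fixes, which is exactly what the convergence theorem below guarantees.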
Perceptron Conv Theorem (again)
• Preliminary: we can simplify the proof without loss of generality:
– Use only positive examples (replace each negative example X by −X)
– Assume the threshold is 0 (go up one dimension by encoding X as (X, 1))
Perceptron Training Rule (simplified)
• LOOP: take a positive example and apply it to the network.
– If the answer is correct, go to LOOP.
– If incorrect, go to FIX.
• FIX: adjust the network weights by the input example: Wnew = Wold + X.
• Go to LOOP.
Proof of Convergence Theorem
• Note:
1. By hypothesis, there is an ε > 0 such that V*·X > ε for all X in F.
2. We can eliminate the threshold (add an extra dimension to the input): W·(x,y,z) > threshold if and only if (W, −threshold)·(x,y,z,1) > 0.
3. We can assume all examples are positive (replace negative examples by their negated vectors): W·(x,y,z) < 0 if and only if W·(−x,−y,−z) > 0.
Perceptron Convergence Theorem (ready for proof)
• Let F be a set of unit-length vectors. If there is a unit vector V* and a value ε > 0 such that V*·X > ε for all X in F, then the perceptron program goes to FIX only a finite number of times (regardless of the order of choice of vectors X).
• Note: if F is a finite set, then such an ε automatically exists.
Proof (cont).
• Consider the quotient V*·W/(|V*||W|).
(Note: this is the cosine of the angle between V* and W.)
Recall V* is a unit vector, so the quotient equals V*·W/|W|.
The quotient is ≤ 1.
Proof (cont)
• Consider the numerator.
Each time FIX is visited, W changes via the ADD step:
V*·W(n+1) = V*·(W(n) + X)
= V*·W(n) + V*·X
> V*·W(n) + ε
Hence after n iterations: V*·W(n) > nε (*)
Proof (cont)
• Now consider the denominator:
|W(n+1)|² = W(n+1)·W(n+1)
= (W(n) + X)·(W(n) + X)
= |W(n)|² + 2W(n)·X + 1 (recall |X| = 1)
< |W(n)|² + 1 (we are in FIX, so W(n)·X < 0)
So after n iterations: |W(n+1)|² < n (**)
Proof (cont)
• Putting (*) and (**) together:
Quotient = V*·W/|W| > nε/√n = √n·ε
Since the quotient is ≤ 1, this means n < 1/ε².
This means we enter FIX a bounded number of times. Q.E.D.
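The bound n < 1/ε² can be checked empirically. A small simulation (illustrative, not from the slides) in the proof's simplified setting: unit-length positive examples with margin ε around a unit vector V*, threshold 0:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 5
v_star = rng.normal(size=d)
v_star /= np.linalg.norm(v_star)   # the unit "teacher" vector V*

# Build F: 50 unit-length positive examples with margin at least 0.2.
xs = []
while len(xs) < 50:
    x = rng.normal(size=d)
    x /= np.linalg.norm(x)
    if v_star @ x >= 0.2:
        xs.append(x)
eps = min(v_star @ x for x in xs)  # the actual margin epsilon of F

w = np.zeros(d)
fixes = 0
changed = True
while changed:                     # cycle until a full pass makes no FIX
    changed = False
    for x in xs:
        if w @ x <= 0:             # wrong answer: go to FIX
            w = w + x              # Wnew = Wold + X
            fixes += 1
            changed = True

print(fixes, 1 / eps**2)           # FIX count never exceeds 1/eps^2
```

Whatever the presentation order, the number of fixes stays below 1/ε², matching the proof.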
Geometric Proof
• See hand slides.
Additional Facts
• Note: If X’s presented in systematic way, then solution W always found.
• Note: Not necessarily same as V*• Note: If F not finite, may not obtain
solution in finite time• Can modify algorithm in minor ways and
stays valid (e.g. not unit but bounded examples); changes in W(n).
Percentage of Boolean Functions Representable by a Perceptron

| Inputs | Perceptron-representable | All Boolean functions |
|-------:|-------------------------:|----------------------:|
| 1 | 4 | 4 |
| 2 | 14 | 16 |
| 3 | 104 | 256 |
| 4 | 1,882 | 65,536 |
| 5 | 94,572 | 10**9 |
| 6 | 15,028,134 | 10**19 |
| 7 | 8,378,070,864 | 10**38 |
| 8 | 17,561,539,552,946 | 10**77 |
What won't work?
• Example: connectedness with a bounded-diameter perceptron.
• Compare with convexity, which is representable (using sensors of order three).
What won't work?
• Try XOR.
What about non-linearly separable problems?
• Find "near-separable" solutions
• Transform the data to a space where the classes are separable (the SVM approach)
• Use multi-level neurons
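The transformation idea can be illustrated on XOR. A minimal sketch (the feature map and weights are my choices for illustration): adding the product feature x1·x2 makes XOR linearly separable, so a single threshold unit works in the transformed space:

```python
# XOR on {0,1}^2 is not linearly separable, but after the feature map
# (x1, x2) -> (x1, x2, x1*x2) it is.
def phi(x1, x2):
    return (x1, x2, x1 * x2)

def unit(f):
    # A single threshold unit in the transformed 3-dimensional space.
    x1, x2, x12 = f
    return 1 if (x1 + x2 - 2 * x12) > 0.5 else 0

outputs = [unit(phi(a, b)) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]]
print(outputs)  # -> [0, 1, 1, 0], i.e. XOR
```

Kernel methods (SVMs) do this systematically, choosing the transformed space implicitly through a kernel function.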
Multi-Level Neurons
• It is difficult to find a global learning algorithm like the perceptron rule.
• But it turns out that methods related to gradient descent on multi-parameter weights often give good results. This is what you see commercially now.
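A minimal sketch of gradient descent on continuous (sigmoid) neurons, in the spirit of the slides: a two-level network trained on XOR, which no single perceptron can represent. The architecture (4 hidden units), learning rate, and iteration count are illustrative assumptions, not from the slides:

```python
import numpy as np

rng = np.random.default_rng(1)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)   # XOR targets

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Two levels of sigmoid neurons: 2 inputs -> 4 hidden -> 1 output.
W1 = rng.normal(size=(2, 4)); b1 = np.zeros(4)
W2 = rng.normal(size=(4, 1)); b2 = np.zeros(1)
lr = 0.5

def forward(X):
    h = sigmoid(X @ W1 + b1)
    return h, sigmoid(h @ W2 + b2)

_, out = forward(X)
loss_before = float(np.mean((out - y) ** 2))

for _ in range(5000):
    h, out = forward(X)
    # Backpropagate the squared-error gradient through both sigmoid levels.
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= lr * (h.T @ d_out); b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * (X.T @ d_h);  b1 -= lr * d_h.sum(axis=0)

_, out = forward(X)
loss_after = float(np.mean((out - y) ** 2))
print(loss_before, loss_after)
```

Unlike the perceptron rule, there is no convergence theorem here: gradient descent can get stuck, but in practice it drives the error down, which is what the slide means by "good results".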
Applications
• Detectors (e.g. medical monitors)
• Noise filters (e.g. hearing aids)
• Future predictors (e.g. stock markets; also adaptive PDE solvers)
• Learn to steer a car!
• Many, many others …