Instructor: Dr. Benjamin Thompson Lecture 15: 3 March...
Instructor:
Dr. Benjamin Thompson
Lecture 15: 3 March 2009
Quod Erat Demonstrandum
- More Heuristics for Better Learning
  - Momentum
  - Maximizing Information Content
  - The Activation Function
  - Input Normalization
  - Weight Initialization
Et In Saecula Saeculorum…
- Decision Boundaries in Classification Problems
- Matlab Demonstration: Two Moons revisited
- Matlab Demonstration: Five-class problem
- Super-Awesome Really Fun and Amazing Term Project Assignment! Yay!
- Biologically-Inspired Search Algorithms Preview
You’ve gotta make a choice eventually…
Crisp n’ Fuzzy
- Recall: no matter how well-trained the neural network is, it’s always going to produce a continuously-valued output
  - XOR problem: it came very close to producing the exact binary outputs we desired, but not exactly
- For function approximations, this typically isn’t an issue: you’re just trying to approximate some input/output relationship that’s already continuous.
- For classification problems, there is a discrete number of classes, and we must ultimately decide to which class the input belongs
Classification example
- Suppose I have fully trained a neural network to distinguish between a handwritten letter “a” and a handwritten letter “b”
- My input is the features of the handwritten sample; my desired output is either +1 for “a” or -1 for “b”
- In reality, my output for new patterns is going to be in the ballpark of +1 for “a”, and in the ballpark of -1 for “b”.
- A simple threshold may be applied at this point:
  - Above zero, call it an “a”
  - Below zero, call it a “b”
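The thresholding rule can be sketched in a few lines of Python. This is only an illustration: the function name `classify_a_or_b` and the sample outputs are made up, and sending an output of exactly zero to “a” is an assumption, since the slide does not say what happens on the threshold itself.

```python
def classify_a_or_b(network_output: float, threshold: float = 0.0) -> str:
    """Map a continuous network output to a crisp class label."""
    # Assumption: an output exactly on the threshold is called an "a".
    return "a" if network_output >= threshold else "b"


# Outputs that are only "in the ballpark" of +1 or -1 still get crisp labels.
for y in (0.93, -1.08, 0.12):
    print(y, "->", classify_a_or_b(y))
```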
The Decision Boundary
- Given a particular threshold, I may then sample the entire feature space and determine which sets of points yield the first class and which sets of points yield the second class
- In the Rosenblatt Perceptron case, this was already done for me by the plane defined by $w^T x + b = 0$
- Once this is done, the line(s) between the classes (of which there may be more than two) form my decision boundary
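One way to picture the sampling procedure is the sketch below. It does not use a trained network: `toy_network` is a hypothetical stand-in whose output changes sign on the unit circle, and the grid range and resolution are arbitrary choices. The idea is simply to label every grid point with the threshold rule and then mark the cells where the label changes.

```python
import math


def toy_network(x1: float, x2: float) -> float:
    """Hypothetical stand-in for a trained network: continuous output in (-1, 1)."""
    return math.tanh(x1 * x1 + x2 * x2 - 1.0)


def label(x1: float, x2: float, threshold: float = 0.0) -> int:
    """Apply the decision threshold: class 1 at or above it, class 2 below."""
    return 1 if toy_network(x1, x2) >= threshold else 2


def sample_labels(lo: float = -2.0, hi: float = 2.0, steps: int = 21):
    """Label every point on a regular grid covering the feature space."""
    axis = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    return axis, [[label(x1, x2) for x1 in axis] for x2 in axis]


def boundary_cells(grid):
    """Return (row, col) cells whose right or lower neighbor has a different label."""
    cells = []
    for r in range(len(grid)):
        for c in range(len(grid[r])):
            right = c + 1 < len(grid[r]) and grid[r][c] != grid[r][c + 1]
            below = r + 1 < len(grid) and grid[r][c] != grid[r + 1][c]
            if right or below:
                cells.append((r, c))
    return cells


axis, grid = sample_labels()
for row in grid:
    print("".join("#" if v == 1 else "." for v in row))
print(len(boundary_cells(grid)), "grid cells lie on the decision boundary")
```

A finer grid gives a sharper picture of the boundary at the cost of more network evaluations.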
What To Do With More Classes
- Remember, the output in a classification problem just answers the question “to which class does this input correspond?”
- We must assign a numerical value to each possible output in order to train our neural networks!
- When presented with a multiple-class problem, we have several options for encoding the output:
  - Class-to-scalar mapping
  - Class-to-vector mapping
Class-to-Scalar Mapping
- Inputs from class 1 all map to some number α1
- Inputs from class 2 all map to some number α2
- …
- Inputs from class n all map to some number αn
- Selection of these scalar values presents a design problem:
  - If they vary wildly for adjacent regions, the neural network will have to learn those sharp discontinuities, which neural networks tend to have a hard time learning
  - The overall scale doesn’t matter (you could scale the output weight matrix arbitrarily to compensate for it), but the relative difference between each value does matter
Class-to-Scalar Mapping
- Using 1, 2, 3, 5, 6 is much smoother than using 1, -1, 2, -2, 3
We call this an adjacency problem!
Class-to-Scalar Mapping
- Overall output:
  - A decision is made by quantizing the output of the neural network, which forms a set of selection rules.
  - In our previous example:
    - IF the output is less than 1.5, it is class 1
    - IF the output is greater than 1.5 AND less than 2.5, it is class 2
    - IF the output is greater than 2.5 AND less than 3.5, it is class 3
    - etc.
- This makes it clear that, for classification problems, we only have to get the answer “in the ballpark” rather than exactly right
  - This can make training easier
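The selection rules amount to rounding the output to the nearest class target. A minimal sketch, assuming the class targets are the integers 1 through `num_classes` (the function name is hypothetical). Two details are assumptions, because the rules above leave them open: an output exactly on a boundary such as 1.5 goes to the higher class, and outputs beyond the first or last target are clamped to the nearest valid class.

```python
import math


def quantize_to_class(output: float, num_classes: int) -> int:
    """Pick the class whose target value (1, 2, ..., num_classes) is nearest."""
    # Boundaries sit halfway between targets: 1.5, 2.5, 3.5, ...
    # Assumption: a tie on a boundary goes to the higher class.
    nearest = math.floor(output + 0.5)
    # Assumption: outputs outside the target range are clamped to the end classes.
    return min(max(nearest, 1), num_classes)


for y in (1.2, 2.4, 3.1, -0.3, 7.9):
    print(y, "-> class", quantize_to_class(y, num_classes=5))
```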
Class-to-Vector Mapping
- Construct your neural network to have as many output neurons as there are input classes (not inputs, but the classes from which the inputs were drawn!)
- The desired output pattern for an input drawn from class n is just a vector of all zeros except for the nth element, which is a 1!
- This also enables one to observe how well the classifier is working for a given class:
  - If many outputs are close to 1, that indicates some level of confusion
  - If only a single output is close to 1, that indicates a level of confidence
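A short sketch of the target encoding and of the “how many output neurons are near 1” check. The function names and the tolerance of 0.3 are illustrative choices, not values from the lecture.

```python
def one_hot(class_index: int, num_classes: int) -> list:
    """Desired output for class n: all zeros except a 1 in the nth position (1-based)."""
    if not 1 <= class_index <= num_classes:
        raise ValueError("class_index must be between 1 and num_classes")
    return [1.0 if k == class_index else 0.0 for k in range(1, num_classes + 1)]


def count_near_one(outputs, tolerance: float = 0.3) -> int:
    """Count output neurons within `tolerance` of 1 (tolerance is an assumed value)."""
    return sum(1 for y in outputs if abs(y - 1.0) <= tolerance)


print(one_hot(3, 5))
print(count_near_one([0.9, 0.05, 0.1]))   # a single neuron near 1 suggests confidence
print(count_near_one([0.8, 0.85, 0.1]))   # several neurons near 1 suggest confusion
```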
Class-to-Vector Mapping
- Class selection: whichever neuron has the largest output corresponds to the class of the input
- Additional strength of this approach: if there are two or more competing/conflicting neurons (that is, two or more have approximately the same maximum value), we may choose to make “no decision”
  - That is: the neural network is clearly confused as to which class the input belongs, so we develop an “I don’t know” case to handle this
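The selection rule with an “I don’t know” case might look like the sketch below. How close two outputs must be to count as “approximately the same” is not specified in the slides, so the `margin` parameter and its default of 0.2 are assumptions.

```python
def select_class(outputs, margin: float = 0.2):
    """Return the 1-based index of the largest output, or None for "I don't know"."""
    ranked = sorted(range(len(outputs)), key=lambda k: outputs[k], reverse=True)
    best = ranked[0]
    # Assumption: the two largest outputs "compete" when they differ by less than margin.
    if len(ranked) > 1 and outputs[best] - outputs[ranked[1]] < margin:
        return None
    return best + 1


print(select_class([0.1, 0.9, 0.05]))   # clear winner
print(select_class([0.55, 0.6, 0.1]))   # two competing neurons, so no decision
```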
Comparison
Class-to-Scalar
- One output neuron implies fewer free parameters, which can shorten neural network training time
- Single output makes interpretation simpler
- Scalar value for each class must be chosen with care
Class-to-Vector
- Multiple output neurons imply more free parameters, which can cause longer training times
- No adjacency problem
- No need for careful selection of output values
- Additional interpretive tools available for confusion or confidence estimates
One-to-Many Hybrid
- As with many things, there are tradeoffs here as well:
  - Rather than a single output for many classes, or as many outputs as there are classes, we may “meet in the middle” by cleverly coding the classes
  - That is, a particular class maps to a particular set of output values
  - The more fine-grained the set of output values (and the fewer neurons it takes to represent them), the closer we are to scalar mapping
  - The coarser the set of output values (and the more neurons it takes to represent them), the closer we are to vector mapping
Example: binary coding
- For each class number, the desired output is simply the binary equivalent for that class
  - e.g., suppose we have 4 classes. The possible outputs become:
    - [0,0] for class 1
    - [0,1] for class 2
    - [1,0] for class 3
    - [1,1] for class 4
- Such a coding scheme suffers from some adjacency issues, but not as many as pure scalar coding
  - e.g., if class 1 and class 4 are close to each other in the input space, it may be problematic since their outputs are √2 units apart in the output space
- This scheme only requires ⌈log2(n)⌉ outputs (log2(n) rounded up), where n is the number of classes.
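A sketch of the binary coding scheme, with hypothetical helper names. Class k is encoded as the binary form of k - 1, which reproduces the four-class table above. Decoding thresholds each output at 0.5, which is an assumption about how one would read continuous outputs back into bits.

```python
import math


def num_outputs(num_classes: int) -> int:
    """Number of output neurons needed: log2(n) rounded up, and at least 1."""
    return max(1, (num_classes - 1).bit_length())


def encode(class_index: int, num_classes: int) -> list:
    """Binary target for a 1-based class index, most significant bit first."""
    bits = num_outputs(num_classes)
    value = class_index - 1  # class 1 maps to all zeros
    return [(value >> shift) & 1 for shift in range(bits - 1, -1, -1)]


def decode(outputs, threshold: float = 0.5) -> int:
    """Threshold each output into a bit (assumed rule) and rebuild the class index."""
    value = 0
    for y in outputs:
        value = (value << 1) | (1 if y >= threshold else 0)
    return value + 1


for k in range(1, 5):
    print("class", k, "->", encode(k, 4))
print("decoded class:", decode([0.9, 0.2]))
# Euclidean distance between the codes for class 1 and class 4
print(math.dist(encode(1, 4), encode(4, 4)))
```

For five classes, `num_outputs(5)` gives 3, so some of the eight possible codes go unused.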
Final Note for Classification
- Recall that we said that the overall goal for classification is to get the answer “in the ballpark” rather than exactly right
- Thus, while we still need to use the traditional error metric for backprop to work, a better error metric to use simply for performance evaluation might be:
  - E(n) = % of incorrectly classified patterns on epoch n
  - Of course, “correctly classified” requires the calculation of the threshold value(s) used to make the ultimate determination
- This metric gives us a firmer stopping criterion:
  - When we have successfully classified all the input patterns (or reached some acceptably low % of misclassifications), we may stop training!
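The evaluation metric and stopping test could be computed as in the sketch below. The function names, the sample outputs, and the two-class decision rule are all illustrative; backprop itself would still be driven by the usual error measure.

```python
def percent_misclassified(outputs, desired, decide) -> float:
    """E(n): percentage of patterns whose thresholded output differs from the desired class."""
    wrong = sum(1 for y, d in zip(outputs, desired) if decide(y) != d)
    return 100.0 * wrong / len(desired)


def should_stop(error_percent: float, tolerance_percent: float = 0.0) -> bool:
    """Stop once the misclassification rate is at or below the acceptable level."""
    return error_percent <= tolerance_percent


def decide(y: float) -> int:
    """Two-class decision rule with the threshold at zero (illustrative)."""
    return 1 if y >= 0.0 else -1


outputs = [0.8, -0.6, 0.1, -0.2]   # made-up network outputs for one epoch
desired = [1, -1, -1, -1]          # made-up desired classes
error = percent_misclassified(outputs, desired, decide)
print(error, should_stop(error), should_stop(error, tolerance_percent=30.0))
```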
That’s no two moons revisited! That’s two space stations revisited!
Just a Reminder
More Reminders
- This two-class problem is not linearly separable
  - So the Rosenblatt Perceptron would not suffice
- Neural Network classification:
  - Top moon: Class one, should give a “+1” output
  - Bottom moon: Class two, should give a “-1” output
  - Decision threshold: output ≥ 0 for class 1
- Things to remember:
  - Goal of classification is not perfect input-output mapping
  - Rather, goal is to get “close enough” so that the two classes are separable by a simple threshold with minimal overlap
Stay classy, neural networks!
The Classes
[Figure: plot of the five-class problem, with labels Class 1, Class 2, Class 3, Class 4, and Class 5]
The Approach
- We will train a neural network with a single output
  - The outputs simply map to the class number
- Overall goal: to demonstrate the decision boundary for multiple classes
- Thing to note: lots of “white space” where there is no such thing as an “incorrect” classification
  - In other words, generalization only applies over the support of each class
Quick, somebody ask, “Hey Dr. Thompson, what’s ‘support’ mean in this context?”
As in, Terminal Project. As in, it’ll be the death of you…
The Purpose
- The main goal of the term project is to independently exercise one or more of the techniques developed in this class on interesting, real-world data sets
- So really, there are several goals:
  - 1) Learning how to gather data
  - 2) Exercising neural networks or related techniques on real-world data
  - 3) Writing a coherent and lucid report on experimental results
  - 4) Reporting said results in front of a group in a concise and informative manner
The Details
- The project will comprise 35% of your grade for this course
- The project must be done in assigned groups unless explicit permission is granted by the instructor
  - Valid reason: “My project will be based on my own existing research that I am already performing, and for funding reasons I can’t share that data with a partner.”
  - Invalid reason: “My homework partners smell funny.”
More Details
- The project actually consists of three (3) components:
  - 1) Project proposal (5% of your overall course grade)
    - Due 3/24/09
  - 2) Oral presentation (10% of your overall course grade)
    - Given either 4/28/09 or 4/30/09, in-class
  - 3) Written Report (20% of your overall course grade)
    - Due the last day of class, 4/30/09
If this makes you nervous, that whole “picturing the audience in their underwear” trick is a load of bull.
The Proposal
- The proposal for your project topic must be submitted by the beginning of class on 3/24/09
- The proposal must pithily address the following issues:
  - Which day you would prefer to give your presentation (4/28 or 4/30)
    - You are not guaranteed this date, but I will try to accommodate everyone as best I can
  - The problem you plan on solving.
    - Why is the problem important to solve?
  - The technique(s) you plan on applying.
  - The data you plan to use.
    - How will you obtain/have you obtained the data?
  - The programming language you will use.
- The proposal should probably only be 1-2 pages in length.
The Oral Presentation
- Minimum 5 minutes, maximum 8 minutes
  - Falling outside these bounds will lower your grade
  - Allow 2 minutes for questions
- Every member of the group must speak for at least 1 minute
  - Failure to speak at least 1 minute will lower that individual’s grade
  - Yes, I will be timing you.
- Must be accompanied by some form of illustration
  - Overhead projector slides
  - PowerPoint slides (preferred option)
  - Large-print poster
The Oral Presentation
- The oral presentation must contain the following key components:
  - Problem Statement
  - Description of the data used
  - The Learning Machine(s) that was/were used
  - Training and testing results
  - Conclusions and Future Work
- The Oral Presentation will be graded on the following criteria:
  - Clarity of speaking (practice, practice, practice!)
  - Detail (say something useful!)
  - Interest (make your slides informative and attractive!)
Oral Presentation Caveats
- Each group is solely responsible for ensuring that their presentation will work on the classroom equipment
  - Technology failures due to poor planning will count against your grade
  - You are perfectly able, and highly encouraged, to come to class the day before the presentation to test out the slides
  - You may email me your PowerPoint slides at least 24 hours before your presentation and I will bring them pre-loaded on my laptop
Courtesy of Dr. David W. Krout
Diligent Q. Student
EE/ESC 456
Problem Statement
- Goal
  - Develop a Neural Network that can classify what sport a person plays based on height and weight
Data Set
- Inputs
  - Height (in), Weight (lbs)
- Outputs
  - Sport Classification
    - Jockeys: 1, Basketball: 2, Soccer: 3, Sumo: 4
- 200 athletes, approximately 50 per sport
Plot of Data Set
[Figure: “Data points for Athletes (Height vs. Weight)”. Axes: Height (inches), 55 to 85; Weight (lbs), 100 to 500. Legend: Jockeys, Basketball Players, Soccer Players, Sumo Wrestlers.]
Neural Network
- Multilayer Feed Forward
  - 2/4/3/1 (2 inputs, hidden layers of 4 and 3 neurons, 1 output)
  - Linear transfer function on output layer and sigmoid for all others
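To make the architecture concrete, here is a forward-pass sketch for a 2/4/3/1 feed-forward network with sigmoid hidden layers and a linear output. It assumes the logistic function as the sigmoid, uses random untrained weights (the initialization range is arbitrary), and takes inputs that have already been scaled to roughly [0, 1]; none of these details come from the example presentation.

```python
import math
import random


def sigmoid(v: float) -> float:
    """Logistic sigmoid (assumed form of the hidden-layer transfer function)."""
    return 1.0 / (1.0 + math.exp(-v))


def make_layer(n_in: int, n_out: int, rng: random.Random):
    """One row per neuron; the last entry of each row is the bias. Range is arbitrary."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_out)]


def layer_forward(weights, inputs, activation):
    """Weighted sum plus bias for each neuron, passed through the activation."""
    return [activation(sum(w * x for w, x in zip(row[:-1], inputs)) + row[-1])
            for row in weights]


def forward(network, inputs):
    """Sigmoid on every hidden layer, linear transfer function on the output layer."""
    *hidden_layers, output_layer = network
    signal = inputs
    for layer in hidden_layers:
        signal = layer_forward(layer, signal, sigmoid)
    return layer_forward(output_layer, signal, lambda v: v)


rng = random.Random(0)
sizes = [2, 4, 3, 1]
network = [make_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
# Inputs are assumed to be height and weight rescaled to roughly [0, 1].
print(forward(network, [0.7, 0.2]))
```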
Training
- 80% of data set used for training
- 20% used for testing
Training Results
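Training Results

| Trial | Epochs | Time | Ave. Error | % Correct |
| --- | --- | --- | --- | --- |
| 1 | 7955 | 25s | .462 | 65 |
| 2 | 16681 | 50s | .450 | 65 |
| 3 | 20540 | 55s | .454 | 65 |
| 4 | 9541 | 30s | .643 | 69 |
| Average | 13679 | 40s | .502 | 66 |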
Conclusions/Future Work
- Classification NN performed well
- Still might be room for improvement
  - Larger network may be beneficial
  - Another metric would probably improve results greatly
    - 40 meter dash time
    - Ratio of weight and height
    - Annual income
    - Bench press
Written Report
- Your written report must address the following issues:
  - Problem Statement
  - Description of the data used
  - The Learning Machine(s) that was/were used
  - Training and testing results
  - Conclusions and Future Work
- Report will be graded on the following criteria:
  - Clarity of presentation (Be Specific!)
  - Pithiness (don’t use 10 words to say what can be said in 5)
  - Grammar and spelling (yes, this counts!)
  - Depth of knowledge (show me you understand what you’ve done)
  - Results (this is, of course, the big one!)
Example Report
- Anything in the open literature is a good example of what your report should strive to achieve.
If this doesn’t give you a reason to show up on Thursday, nothing will!
Particle Swarm Optimization
- Mimics the motion of a flock of birds converging on a food source to find the global minimum of a search space
- Very very easy to code (I can do the whole algorithm in 6 lines of code)
- Very very easy to modify for better performance
- Simple example of swarm intelligence
- Highly parallelizable and scalable
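As a preview, a basic particle swarm optimizer can be sketched as follows. This is a common textbook form and not necessarily the variant presented in class: the inertia weight, the two acceleration coefficients, the swarm size, and the sphere test function are all assumed values chosen for illustration, and the sketch is spelled out at more length than the six lines mentioned above.

```python
import random


def pso(objective, dim, bounds, n_particles=20, iterations=100,
        inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Minimize `objective` with a basic particle swarm (parameter values are assumptions)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]              # each particle's best position so far
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]  # best position found by the whole swarm
    for _ in range(iterations):
        for i in range(n_particles):
            for d in range(dim):
                # Pull toward the particle's own best and toward the swarm's best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * rng.random() * (best_pos[i][d] - pos[i][d])
                             + c_global * rng.random() * (g_pos[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val


def sphere(p):
    """Simple test function with its global minimum of 0 at the origin."""
    return sum(x * x for x in p)


best, value = pso(sphere, dim=2, bounds=(-5.0, 5.0))
print(best, value)
```

Because each particle only needs its own state plus the swarm's best position, the inner loop over particles is the part that lends itself to parallel evaluation.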
Genetic Algorithms
- Optimization routine based on evolutionary strategy
- Solutions are encoded as bit-string chromosomes
- New generations (new solutions) are generated via crossover (two parents combine genetic material to form a “child” solution) and mutation (genes are randomly changed with some small probability)
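A minimal sketch of a bit-string genetic algorithm with crossover and mutation. The slide does not say how parents are chosen, so tournament selection and keeping the best individual each generation (elitism) are assumptions, as are the population size, mutation rate, and the toy “count the ones” fitness function.

```python
import random


def genetic_algorithm(fitness, n_bits=20, pop_size=30, generations=60,
                      mutation_rate=0.05, seed=0):
    """Maximize `fitness` over bit strings (selection rule and parameters are assumptions)."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]

    def pick_parent():
        # Tournament selection (assumed): the fittest of three random individuals.
        return max(rng.sample(population, 3), key=fitness)

    for _ in range(generations):
        elite = max(population, key=fitness)
        children = [elite[:]]  # elitism (assumed): carry the best solution forward
        while len(children) < pop_size:
            mom, dad = pick_parent(), pick_parent()
            cut = rng.randint(1, n_bits - 1)
            child = mom[:cut] + dad[cut:]  # one-point crossover of two parents
            # Mutation: flip each gene with a small probability.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        population = children
    return max(population, key=fitness)


def one_max(chromosome):
    """Toy fitness: the number of 1 bits in the chromosome."""
    return sum(chromosome)


best = genetic_algorithm(one_max)
print(best, one_max(best))
```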
Simulated Annealing
- Based on the metallurgical principle of annealing, wherein metal is heated and then slowly cooled so that the atoms in the metal settle toward their lowest energy state, relieving internal stresses
- Given a particular solution, new solutions are searched for near that solution
- Better solutions (lower error) are always accepted; worse solutions are accepted with some probability based on an annealing schedule and the overall error difference
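The acceptance rule can be sketched for a one-dimensional problem as below. The exponential acceptance probability exp(-Δ/T) and the geometric cooling schedule are common choices assumed here for illustration; the starting temperature, step size, and test function are likewise arbitrary.

```python
import math
import random


def simulated_annealing(objective, start, step=1.0, t_start=10.0, cooling=0.95,
                        iterations=500, seed=0):
    """Minimize a 1-D objective; the schedule and acceptance rule are assumed forms."""
    rng = random.Random(seed)
    current, current_val = start, objective(start)
    best, best_val = current, current_val
    temperature = t_start
    for _ in range(iterations):
        # Search for a new solution near the current one.
        candidate = current + rng.uniform(-step, step)
        candidate_val = objective(candidate)
        delta = candidate_val - current_val
        # Better solutions are always accepted; worse ones with probability exp(-delta / T).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_val = candidate, candidate_val
            if current_val < best_val:
                best, best_val = current, current_val
        temperature *= cooling  # geometric annealing schedule (assumption)
    return best, best_val


best_x, best_val = simulated_annealing(lambda x: x * x, start=5.0)
print(best_x, best_val)
```

Early on, the high temperature lets the search accept uphill moves and escape poor regions; as the temperature falls, the search behaves more and more like plain greedy descent.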