Analysis of Classification Algorithms in Handwriting Pattern Recognition
Logan Helms
Jon Daniele


Page 2

Problem

Constructing and implementing computerized systems capable of classifying visual input with speed and accuracy comparable to that of the human brain has remained an open problem in computer science for over 40 years.


Formal Statement

Given an unknown function (the ground truth) that maps input instances to output labels, along with training data assumed to represent accurate examples of the mapping, produce a function that approximates as closely as possible the correct mapping.
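Stated symbolically (the notation below, with f for the ground truth, h for the learned function, and a 0-1 error count, is our shorthand rather than anything from the slides):

```latex
% Unknown ground truth f : X -> Y; training data D = {(x_i, y_i)} assumed to satisfy y_i = f(x_i).
% Goal: a hypothesis h : X -> Y that disagrees with f as rarely as possible.
\[
  h^{*} \;=\; \arg\min_{h \in \mathcal{H}} \Pr_{x}\bigl[\, h(x) \neq f(x) \,\bigr],
  \qquad
  \text{estimated on the training data by } \frac{1}{n}\sum_{i=1}^{n}\mathbf{1}\bigl[\, h(x_i) \neq y_i \,\bigr].
\]
```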

Page 14

Basic Pattern Recognition Model

Page 15

Pattern Recognition

Preprocessor

A system that processes its input data to produce output data that will serve as the input to another system.

Classifier

A system that attempts to assign each input value to one of a given* set of classes.

*A predetermined set of classifications may not always exist.

Page 16

MNIST Dataset

A subset constructed from NIST's Special Database 3 (SD-3) and Special Database 1 (SD-1), which contain binary images of handwritten digits.

• SD-3 was collected from Census Bureau employees

• SD-1 was collected from high school students

• Samples are 28x28 pixels, size-normalized and centered

• Samples contain gray levels as a result of the anti-aliasing technique used by the normalization algorithm.

• The dataset has been used without further preprocessing.
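For concreteness, here is a minimal Python sketch (ours, not from the presentation) of how the raw MNIST files can be read; MNIST is distributed in the IDX binary format, and the file names in the usage comment are placeholders for wherever the dataset is stored.

```python
import struct
import numpy as np

def load_idx_images(path):
    """Read an IDX image file: 16-byte big-endian header (magic, count, rows, cols), then uint8 pixels."""
    with open(path, "rb") as f:
        _magic, n, rows, cols = struct.unpack(">IIII", f.read(16))
        pixels = np.frombuffer(f.read(), dtype=np.uint8)
    return pixels.reshape(n, rows, cols)      # (n, 28, 28) for MNIST

def load_idx_labels(path):
    """Read an IDX label file: 8-byte big-endian header (magic, count), then uint8 labels."""
    with open(path, "rb") as f:
        _magic, n = struct.unpack(">II", f.read(8))
        return np.frombuffer(f.read(), dtype=np.uint8)

# Example usage (paths are assumptions):
# X_train = load_idx_images("train-images-idx3-ubyte")
# y_train = load_idx_labels("train-labels-idx1-ubyte")
```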

Page 17

MNIST: Training Set

30,000 from Census Bureau employees

30,000 from high school students


Page 18

MNIST: Testing Set

5,000 from Census Bureau employees

5,000 from high school students


Page 19

Algorithms We Will Analyze

• Template Matching

• Naïve Bayes Classifier

• Feed Forward Neural Network with Backpropagation

Page 20

Template Matching

Low-hanging fruit

Page 21

Template Matching

• A digital image processing technique for finding the parts of an image that match a template image

• Multiple approaches: template-based vs. feature-based

• Computationally expensive
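As a rough illustration of the template-based flavor applied to digits (our sketch, not the presentation's implementation), one can average the training images of each class into a template and pick the nearest template by squared Euclidean distance; a correlation score could be swapped in without changing the structure.

```python
import numpy as np

def build_templates(images, labels, num_classes=10):
    """Average all training images of each class into one template per class."""
    return np.stack([images[labels == c].mean(axis=0) for c in range(num_classes)])

def classify_by_template(image, templates):
    """Return the class whose template is closest to the image (squared Euclidean distance)."""
    distances = ((templates - image) ** 2).sum(axis=(1, 2))
    return int(np.argmin(distances))

# templates = build_templates(X_train.astype(float), y_train)
# prediction = classify_by_template(X_test[0].astype(float), templates)
```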

Page 22

Approaches: Template-based

• Used when there are no 'strong features' available in a template

• Makes use of the entire template image rather than just portions

• Becomes difficult with high-resolution images; potentially requires a massive search area to find the best match

• Facial recognition example: uses the entire face as a template; may become difficult if multiple features on the face are obscured or unavailable

Page 23

Approaches: Feature-based

• Identify specific, 'strong' features in a given template and match those features rather than the entire template

• Less computationally complex, as it doesn't require the full resolution of the entire template

• May fail when templates are not well differentiated by their strong features

• Facial recognition example: match the relative positions of strong facial features (e.g. nose, mouth, ears), and match the strong features themselves as well; works at a far lower resolution, which may obscure the features necessary for a template match

Page 24

Bayesian classifier

A naïve approach (that works)

Page 25

Preconception

• Conditional probability: what is the probability that this will happen, given that something else has already happened?

• Event A: happened. Event B: hasn't happened yet. Probability of B given A: p(B | A) = p(A and B) / p(A)

Page 26

Bayes' theorem

• The idea: allows us to go from p(Evidence | Outcome) to p(Outcome | Evidence)

• So what? It allows us to work backwards from a known outcome (instead of just guessing)

• The theorem: Probability(Outcome, given Evidence) is p(Evidence, given that we know the Outcome) times p(Outcome), scaled by p(Evidence)

• The formula: p(Outcome | Evidence) = p(Evidence | Outcome) × p(Outcome) / p(Evidence)

Page 27

But that's not naïve!!

• The theorem as written only works with one piece of evidence. We've got A LOT, and it makes the math… rather complicated

• So we cheat: a naïve approach allows us to pretend each piece of evidence is independent of every other

• p(Outcome | Evidence1, …, EvidenceN) ∝ p(Outcome) × p(Evidence1 | Outcome) × … × p(EvidenceN | Outcome)

Page 28

Some points to note

• If p(Evidence | Outcome) is 1, we're just multiplying by 1

• If p(Evidence | Outcome) comes out to 0, then everything becomes 0; we may be able to rule out that particular outcome if there's contradictory evidence

• Since everything is divided by p(Evidence), we can get away with not calculating it at all in some instances

• The reason we multiply everything by the prior probability of the outcome(s) is so that more common outcomes get a higher probability and less common outcomes get a lower one. This scales the predicted probabilities (gives us our base rates)

Page 29

Let's do this: Leeeeeeroy Jennnnkins!

• Application: run the formula for each possible outcome

• Outcome = class; each class also needs a class label so we can keep track of things

• We look at the evidence, determine the probability that the instance belongs in each class, and then assign the class label of the class with the highest probability.

Page 30

Leroy's example

• We have 3 pieces of data each for 1,000 players in WoW: how loud they are, whether people like them, and whether they screw up raids

• Training set (we want to predict the 'Leroy?' column):

Leroy?          Loud   Not loud   Disliked   Liked   Screw-up   Not a screw-up   Total
Leroy fanboy     400        100        350     150        450               50     500
Kinda Leroy        0        300        150     150        300                0     300
Not Leroy        100        100        150      50         50              150     200
Totals           500        500        650     350        800              200    1000

Page 31

Easy math: The base rates

• Class occurrences:
  p(Leroy fanboy) = 0.5 (500/1000)
  p(Kinda Leroy) = 0.3
  p(Not Leroy) = 0.2

• Probability of 'likelihood':
  p(Loud | Leroy fanboy), p(Loud | Kinda Leroy), …
  p(Screw-up | Not Leroy) = 0.25 (50/200)
  p(Not a screw-up | Not Leroy) = 0.75

• Given features from the unknown player:
  p(Loud) = 0.5
  p(Disliked) = 0.65
  p(Screw-up) = 0.8

Page 32

Easy math: the base rates

Feature per class (feature count / class total):

                Loud   Not loud   Not liked   Liked   Screw-up   Not a screw-up
Leroy fanboy     0.8        0.2        0.7      0.3       0.9          0.1
Kinda Leroy      0.0        1.0        0.5      0.5       1.0          0.0
Not Leroy        0.5        0.5        0.75     0.25      0.25         0.75

Probability of class (class total / total players):

Leroy fanboy    0.5
Kinda Leroy     0.3
Not Leroy       0.2

Features from evidence (feature total / total players):

Loud        0.5
Disliked    0.65
Screw-up    0.8

Page 33

Bad math

• OK, here's a new player in WoW, and we want to know into which category of player they should be placed: are they a Leroy fanboy, Kinda Leroy, or Not Leroy? We observe the following characteristics for the unknown player: Loud, Disliked, Screw-up

• We run the numbers for each of the 3 outcomes, then classify the unknown player into the class with the highest probability, according to the base rates established by the training set.

Page 34

Bad math, cont'd

p(Leroy fanboy | Loud, Disliked, Screw-up)
∝ p(Leroy fanboy) × p(Loud | Leroy fanboy) × p(Disliked | Leroy fanboy) × p(Screw-up | Leroy fanboy)
= 0.5 × 0.8 × 0.7 × 0.9 = 0.252

Page 35

Bad math, cont'd

p(Kinda Leroy | Loud, Disliked, Screw-up)
∝ p(Kinda Leroy) × p(Loud | Kinda Leroy) × p(Disliked | Kinda Leroy) × p(Screw-up | Kinda Leroy)
= 0.3 × 0.0 × 0.5 × 1.0 = 0

Page 36

Bad math, cont'd

p(Not Leroy | Loud, Disliked, Screw-up)
∝ p(Not Leroy) × p(Loud | Not Leroy) × p(Disliked | Not Leroy) × p(Screw-up | Not Leroy)
= 0.2 × 0.5 × 0.75 × 0.25 = 0.01875

Page 37

Bad math, fin

• We now have the following probabilities:
  Leroy fanboy: 0.252
  Kinda Leroy: 0
  Not Leroy: 0.01875

•And now we know that the new player falls into the Leroy fanboy category of players!
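The whole worked example fits in a few lines of Python; this sketch (ours) recomputes the scores directly from the training-set counts, leaving out the shared evidence term since it does not change which class wins.

```python
# Training-set counts: class -> (class total, counts of each observed feature within the class)
counts = {
    "Leroy fanboy": (500, {"Loud": 400, "Disliked": 350, "Screw-up": 450}),
    "Kinda Leroy":  (300, {"Loud": 0,   "Disliked": 150, "Screw-up": 300}),
    "Not Leroy":    (200, {"Loud": 100, "Disliked": 150, "Screw-up": 50}),
}
TOTAL_PLAYERS = 1000

def naive_bayes_scores(observed):
    """p(class) times the product of p(feature | class); the p(evidence) denominator is omitted."""
    scores = {}
    for cls, (class_total, feature_counts) in counts.items():
        score = class_total / TOTAL_PLAYERS                       # prior (base rate)
        for feature in observed:
            score *= feature_counts[feature] / class_total        # naive likelihood term
        scores[cls] = score
    return scores

scores = naive_bayes_scores(["Loud", "Disliked", "Screw-up"])
print(scores)                        # Leroy fanboy ~= 0.252, Kinda Leroy = 0.0, Not Leroy ~= 0.01875
print(max(scores, key=scores.get))   # Leroy fanboy
```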

Page 38

Feed Forward Neural Network with Backpropagation

The birth of SKYNET

Page 39

Neural Network

• A computational model inspired by central nervous systems, in particular the brain.

• Generally presented as a system of interconnected neurons.

• The neuron is the basic unit.

Page 40

Neuron

Page 41

Neuron

Weighted sum of inputs: $\sum_{i=0}^{m} x_i w_i$

Page 42

Transfer Function

• Backpropagation requires the transfer function to be differentiable.

• We chose the sigmoid function as our transfer function because it is easily differentiable and easy to work with.
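For reference, a minimal Python sketch (ours) of the sigmoid and its derivative; the convenient form of the derivative is what keeps the backpropagation deltas simple.

```python
import numpy as np

def sigmoid(x):
    """Logistic sigmoid: maps any real input into the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_derivative(y):
    """Derivative of the sigmoid, written in terms of its own output y = sigmoid(x)."""
    return y * (1.0 - y)
```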

Page 43

Feed Forward Neural Network

Page 44

Feed Forward Neural Network

• Solutions are known

• Weights are learned

• Evolves in the weight space

• Used for: prediction, classification, and function approximation

Page 45

Backpropagation

• Common method of training neural networks

• Backpropagation requires the transfer function to be differentiable.

Two-step process:

1. Propagation

2. Weight update

Page 46

1. Propagation

• Forward propagation of a training pattern's input through the neural network in order to generate the propagation's output activations.

• Backward propagation of the propagation's output activations through the neural network, using the training pattern's target, in order to generate the deltas of all output and hidden neurons.

Page 47

2. Weight Update

Each weight-synapse will follow these steps:

• Multiply its output delta and input activation to get the gradient of the weight.

• Subtract a ratio (percentage) of the gradient from the weight.
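Written out (with η standing in for the slide's "ratio" and δ for the output delta; the symbols are our labels, not the presentation's):

```latex
% a_i: input activation into the synapse, \delta_j: delta of the neuron the synapse feeds into.
\[
  \frac{\partial E}{\partial w_{ij}} = \delta_j \, a_i,
  \qquad
  w_{ij} \;\leftarrow\; w_{ij} - \eta \, \delta_j \, a_i
\]
```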

Page 48

Network Error

• Total-Sum-Squared-Error (TSSE)

• Root-Mean-Squared-Error (RMSE)

$\mathrm{TSSE} = \tfrac{1}{2} \sum_{\text{patterns}} \sum_{\text{outputs}} (\text{desired} - \text{actual})^{2}$

Page 49

A Pseudo-Code Algorithm

Randomly choose the initial weights
While error is too large
    For each training pattern (in random order)
        Apply the inputs to the network
        Propagation (as described earlier)
        Weight Update (as described earlier)
        Apply weight adjustments
    Periodically evaluate the network performance
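A compact Python rendering of that pseudo-code (ours, with arbitrary layer sizes and learning rate): one hidden layer, sigmoid units throughout, and the propagation and weight-update steps from the previous slides inside the training loop. For MNIST the input dimension would be 784 (28×28) and the output dimension 10.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train(X, T, hidden=32, eta=0.1, epochs=100):
    """X: (n, d) inputs, T: (n, k) one-hot targets; returns the learned weight matrices."""
    d, k = X.shape[1], T.shape[1]
    W1 = rng.normal(0.0, 0.1, (d, hidden))    # randomly chosen initial weights (input -> hidden)
    W2 = rng.normal(0.0, 0.1, (hidden, k))    # hidden -> output
    for _ in range(epochs):                   # "while error is too large", simplified to a fixed count
        for i in rng.permutation(len(X)):     # each training pattern, in random order
            x, t = X[i:i+1], T[i:i+1]
            h = sigmoid(x @ W1)               # forward propagation
            y = sigmoid(h @ W2)
            delta_out = (y - t) * y * (1 - y)             # output deltas
            delta_hid = (delta_out @ W2.T) * h * (1 - h)  # hidden deltas (backpropagated)
            W2 -= eta * h.T @ delta_out       # weight update: gradient = delta times activation
            W1 -= eta * x.T @ delta_hid
    return W1, W2

def predict(X, W1, W2):
    return np.argmax(sigmoid(sigmoid(X @ W1) @ W2), axis=1)
```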

Page 50

Constraints

As a control, each classification algorithm will be presented with the MNIST test set, in the exact same order.

Page 51

Hypothesis

The neural network will match samples to targets with higher accuracy than template matching and the Naïve Bayes classifier.

Page 52

Hypothesis Testing

Accuracy is defined as the number of correctly classified samples divided by the total number of samples presented.

In addition, we will also test the following:

Implementation: based on ease of implementation

Average Runtime: based on the average of (end time minus start time) over runs of an algorithm

Overall Feasibility: all factors taken into account for the desired use case.
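One way the comparison could be harnessed in code (a sketch with placeholder names; none of this is prescribed by the slides): feed each classifier the same test set in the same order, and record accuracy and average wall-clock runtime.

```python
import time

def evaluate(classify, test_images, test_labels, runs=3):
    """classify: a function mapping one image to a predicted label (any of the three algorithms)."""
    correct = sum(classify(img) == label for img, label in zip(test_images, test_labels))
    accuracy = correct / len(test_labels)
    times = []
    for _ in range(runs):
        start = time.perf_counter()
        for img in test_images:                       # same test set, presented in the same order
            classify(img)
        times.append(time.perf_counter() - start)     # end time minus start time
    return accuracy, sum(times) / len(times)
```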