Linear Discriminant Analysis (LDA)
Introduction
Fisher’s Linear Discriminant Analysis
Introduced in Fisher’s 1936 paper, “The Use of Multiple Measurements in Taxonomic Problems” (link)
Statistical technique for classification
LDA = two classes
MDA (multiple discriminant analysis) = more than two classes
Used in statistics, pattern recognition, and machine learning
Purpose
Discriminant analysis classifies objects into two or more groups according to a linear combination of their features.
Feature selection: which set of features best determines group membership of the object? (dimension reduction)
Classification: what classification rule or model best separates those groups?
Method (1)
[Scatter plot of the training data (curvature vs. diameter) with classes Passed and Not Passed, contrasting a projection with good separation against one with bad separation]
Method (2)
Maximize the between-class scatter: the difference of the mean values (m1 - m2)
Minimize the within-class scatter: the covariance within each class
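For reference, this trade-off is usually formalized as the Fisher criterion. The slide only states the idea, so the standard textbook form is spelled out below (S_W is the within-class scatter, the sum of the two class scatter matrices):

```latex
% Fisher criterion: choose the projection w that maximizes
% between-class scatter relative to within-class scatter.
J(w) = \frac{\bigl(w^{\top}(m_1 - m_2)\bigr)^{2}}{w^{\top} S_W\, w},
\qquad
w^{*} \propto S_W^{-1}(m_1 - m_2)
```

Since the pooled covariance C used in the example below is proportional to S_W, the maximizer w* points in the same direction as the weight vector W = S * (m1 - m2) computed later.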
Formula
Assumption: both classes share the same covariance matrix, Σ_{y=0} = Σ_{y=1} = Σ (equal covariances)
Idea: apply Bayes' theorem, where x is the object and i, j are the classes (groups)
Derivation: the class-conditional probability density functions are assumed to be normally distributed, each class with its own mean value and covariance
If the covariances are not assumed equal, the result is QDA (quadratic discriminant analysis); with equal covariances the rule reduces to FLD (Fisher's linear discriminant)
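Written out as a sketch of the derivation the slide alludes to (using the W and W0 notation of the example below): with equal-covariance normal densities, Bayes' theorem makes the log-ratio of the two posteriors linear in x:

```latex
\log\frac{P(1 \mid x)}{P(2 \mid x)}
  = x^{\top}\underbrace{\Sigma^{-1}(m_1 - m_2)}_{W}
  + \underbrace{\ln\frac{P(1)}{P(2)}
      - \tfrac{1}{2}(m_1 + m_2)^{\top}\Sigma^{-1}(m_1 - m_2)}_{W_0}
```

The quadratic terms in x cancel only because the covariances are equal; otherwise they remain and give QDA. The object is assigned to class 1 when the score W·x + W0 is positive.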
Example
Training set from a factory producing high-quality chip rings:

Curvature  Diameter  Quality Control Result
2.95       6.63      Passed
2.53       7.79      Passed
3.57       5.65      Passed
3.16       5.47      Passed
2.58       4.46      Not Passed
2.16       6.22      Not Passed
3.27       3.52      Not Passed
Normalization of data
Average (global mean over all objects): X1 = 2.888, X2 = 5.676

Training data:
X1    X2    class
2.95  6.63  1
2.53  7.79  1
3.57  5.65  1
3.16  5.47  1
2.58  4.46  0
2.16  6.22  0
3.27  3.52  0

Mean-corrected data (global mean subtracted from each object):
X1o     X2o     class
 0.060   0.951  1
-0.357   2.109  1
 0.679  -0.025  1
 0.269  -0.209  1
-0.305  -1.218  0
-0.732   0.547  0
 0.386  -2.155  0
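A minimal NumPy sketch of this step (the array names are mine, not from the slide):

```python
import numpy as np

# Chip-ring training data: columns are curvature (X1) and diameter (X2)
X = np.array([[2.95, 6.63], [2.53, 7.79], [3.57, 5.65], [3.16, 5.47],
              [2.58, 4.46], [2.16, 6.22], [3.27, 3.52]])
y = np.array([1, 1, 1, 1, 0, 0, 0])  # 1 = Passed, 0 = Not Passed

mean_global = X.mean(axis=0)  # approx. (2.889, 5.677), the "average" row
X0 = X - mean_global          # mean-corrected data, the X1o/X2o table
print(np.round(X0, 3))
```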
Covariance
The covariance matrix of class i is computed from that class's mean-corrected rows.

Covariance of class 1 - C1:
 0.166  -0.192
-0.192   1.349

Covariance of class 2 - C2:
 0.259  -0.286
-0.286   2.142

Pooled covariance matrix C (class covariances weighted by group size):
 0.206  -0.233
-0.233   1.689

Inverse covariance matrix S = C^-1 (approximately):
 5.75   0.79
 0.79   0.70
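The same computation in NumPy (a sketch; note that the slide divides by the group size n_i rather than n_i - 1 and mean-corrects with the global mean, so np.cov is not a drop-in replacement here):

```python
import numpy as np

X = np.array([[2.95, 6.63], [2.53, 7.79], [3.57, 5.65], [3.16, 5.47],
              [2.58, 4.46], [2.16, 6.22], [3.27, 3.52]])
y = np.array([1, 1, 1, 1, 0, 0, 0])
X0 = X - X.mean(axis=0)        # global-mean-corrected data

def class_cov(Xc):
    # Scatter of the mean-corrected rows, divided by the group size
    return Xc.T @ Xc / len(Xc)

C1 = class_cov(X0[y == 1])     # close to the slide's C1 (rounding differs)
C2 = class_cov(X0[y == 0])     # close to the slide's C2

# Pooled covariance: class covariances weighted by group size
C = ((y == 1).sum() * C1 + (y == 0).sum() * C2) / len(X)
S = np.linalg.inv(C)           # inverse covariance matrix S
print(np.round(C, 3), np.round(S, 3), sep="\n")
```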
Mean values

Class    N   P(i)   m(X1)  m(X2)
Class 1  4   0.571  3.05   6.38   (m1)
Class 2  3   0.429  2.67   4.73   (m2)
Sum      7          5.72   11.12  (m1 + m2)

N - number of objects in the class
P(i) - prior probability of class i (N_i / N)
m1 - mean vector of class 1 (m(X1), m(X2))
m2 - mean vector of class 2 (m(X1), m(X2))
S - inverse of the pooled covariance matrix (previous slide)

m1 - m2 = (0.38, 1.65)

W = S * (m1 - m2) = (3.487916, 1.456612)
W0 = ln[P(1)/P(2)] - 1/2 * (m1 + m2) · W = -17.7856
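Putting the pieces together in NumPy (a sketch; small differences from the slide's numbers come from the slide's hand-rounded intermediate tables):

```python
import numpy as np

X = np.array([[2.95, 6.63], [2.53, 7.79], [3.57, 5.65], [3.16, 5.47],
              [2.58, 4.46], [2.16, 6.22], [3.27, 3.52]])
y = np.array([1, 1, 1, 1, 0, 0, 0])

X0 = X - X.mean(axis=0)
cov = lambda Xc: Xc.T @ Xc / len(Xc)
C = (4 * cov(X0[y == 1]) + 3 * cov(X0[y == 0])) / 7  # pooled covariance
S = np.linalg.inv(C)                                 # inverse covariance

m1 = X[y == 1].mean(axis=0)   # approx. (3.05, 6.38)
m2 = X[y == 0].mean(axis=0)   # approx. (2.67, 4.73)
p1, p2 = 4 / 7, 3 / 7         # prior probabilities P(1), P(2)

W = S @ (m1 - m2)                            # approx. (3.49, 1.45)
W0 = np.log(p1 / p2) - 0.5 * (m1 + m2) @ W   # approx. -17.76
print(W, W0)
```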
Result

score = X · W + W0

X1    X2    score   class
2.95  6.63   2.149  1
2.53  7.79   2.380  1
3.57  5.65   2.887  1
3.16  5.47   1.189  1
2.58  4.46  -2.285  0
2.16  6.22  -1.203  0
3.27  3.52  -1.240  0

[Plot: discriminant scores of the training objects on an axis from -3.000 to 4.000; Passed objects lie above 0, Not Passed objects below 0]
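Reproducing the score column is a single matrix product once W and W0 are known; here the slide's values are plugged in directly (a sketch; the third decimal differs from the table because W0 is rounded):

```python
import numpy as np

X = np.array([[2.95, 6.63], [2.53, 7.79], [3.57, 5.65], [3.16, 5.47],
              [2.58, 4.46], [2.16, 6.22], [3.27, 3.52]])
W = np.array([3.487916, 1.456612])  # from the previous slide
W0 = -17.7856

scores = X @ W + W0                 # score = X · W + W0 for every object
print(np.round(scores, 3))
# approx. [ 2.161  2.386  2.896  1.204 -2.290 -1.192 -1.253]
# positive score -> Passed (class 1), negative -> Not Passed (class 0)
```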
Prediction
New chip: curvature = 2.81, diameter = 5.46
score = X · W + W0, with W = S * (m1 - m2)
score = -0.036
Rule: if score > 0 then class 1 (Passed), else class 2 (Not Passed)
score = -0.036 => class 2
Prediction: the chip will not pass
Prediction correct!

[Plot: the new chip's score of -0.036 falls just below 0, on the Not Passed side of the score axis]
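The same check as a short sketch, reusing the slide's W and W0:

```python
import numpy as np

W = np.array([3.487916, 1.456612])
W0 = -17.7856

x_new = np.array([2.81, 5.46])   # new chip: curvature, diameter
score = x_new @ W + W0           # approx. -0.03, i.e. just below zero
print("Passed" if score > 0 else "Not Passed")  # -> Not Passed
```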
Pros & Cons
Cons
Old algorithm
Newer algorithms often give much better predictions
Pros
Simple
Fast and portable
Still beats some algorithms (e.g. logistic regression) when its assumptions are met
Good to use when beginning a project
Conclusion
FisherFace: one of the best algorithms for face recognition
Often used for dimensionality reduction
Basis for newer algorithms
Good for the beginning of data mining projects
Though old, still worth trying
Thank you for your attention!
Questions?