Pattern Recognition & Detection: Texture Classifier
ECE 4226
Presented By: Denis Petrusenko
December 10, 2007
Overview

• Introduction
  – Problem description
  – Feature extraction
  – Textures used
  – Models considered
  – Overview of work being presented
• Implementation Details
• Experimentation
• Summary
• Conclusion
• Demo
• References
Overview: Problem Description

• Texture Recognition
  – Texture: "The characteristic appearance of a surface having a tactile quality"
• Problem
  – Given: texture samples for different classes
  – Objective: recognize which class a never-before-seen texture sample belongs to
Overview: Feature Extraction

• Histogram $H(k)$ with $k = 0, \ldots, L-1$ [1]
  – L gray intensity levels vs. number of occurrences
  – Represented by an L-dimensional feature vector
  – Several approaches for summarizing $H(k)$
• Intensity Moments
  – First, calculate the mean: $m = \sum_{k=0}^{L-1} k \, H(k)$
  – Then, the central moments: $\mu_n = \sum_{k=0}^{L-1} (k - m)^n \, H(k)$
    • $\mu_2$: Variance, $\mu_3$: Skewness, $\mu_4$: Kurtosis
• Entropy: $E = -\sum_{k=0}^{L-1} H(k) \log H(k)$
• Uniformity: $U = \sum_{k=0}^{L-1} H(k)^2$
Feature Extraction: Gray Level Co-Occurrence Matrix

• GLCM $C_{k,l}$, an $L \times L$ matrix [1]
  – Built from the counts $R(k,l)$ of pixel pairs under a relationship R: pixels shifted, usually by 1
  – Normalized to probabilities: $C_{k,l} = R(k,l) / \sum_{k,l} R(k,l)$
• Comes out to $L^2$ features
• Again, several ways to summarize
  – Maximum: $C_{max} = \max_{k,l} C_{k,l}$
  – Square sum: $C_{sq} = \sum_{k,l} C_{k,l}^2$
  – Intensity moments: $Q_n = \sum_{k,l} (k - l)^n \, C_{k,l}, \quad n = 0, \pm 1, \pm 2, \ldots$
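A sketch of how the matrix and its summaries might be computed for the shift-by-one horizontal relationship; the class and method names are illustrative, not the project's:

    using System;

    static class GlcmFeatures
    {
        // Build C: count pixel pairs (p, its right neighbor), normalized to sum to 1.
        public static double[,] Glcm(int[,] img, int L)
        {
            var C = new double[L, L];
            int rows = img.GetLength(0), cols = img.GetLength(1);
            double pairs = (double)rows * (cols - 1);
            for (int r = 0; r < rows; r++)
                for (int c = 0; c < cols - 1; c++)
                    C[img[r, c], img[r, c + 1]] += 1.0 / pairs;
            return C;
        }

        // C_max: largest entry of C
        public static double CMax(double[,] C)
        {
            double m = 0;
            foreach (double v in C) if (v > m) m = v;
            return m;
        }

        // C_sq: sum of squared entries
        public static double CSq(double[,] C)
        {
            double s = 0;
            foreach (double v in C) s += v * v;
            return s;
        }

        // Q_n = sum_{k,l} (k - l)^n C(k,l); the diagonal is skipped for negative n
        // to avoid dividing by zero (Q+2 and Q-2 are the features used later).
        public static double Q(double[,] C, int n)
        {
            int L = C.GetLength(0);
            double q = 0;
            for (int k = 0; k < L; k++)
                for (int l = 0; l < L; l++)
                    if (n >= 0 || k != l)
                        q += Math.Pow(k - l, n) * C[k, l];
            return q;
        }
    }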
Feature Extraction: Actual Features Used

• Histogram features
  – Entropy
  – Uniformity
  – Variance
  – Skewness
  – Kurtosis
• GLCM features
  – C_max
  – C_sq
  – Q+2
  – Q−2
• For all calculations, L = 16
Introduction: Textures Used

• Picked 8 textures from the Brodatz set [10]
• Cut out a 200×200 chunk of each for speed
• Samples were taken with a sliding window
  – Window size 60×60 pixels, kept close to the example size for consistency
  – Arbitrary overlap of 15 pixels
• 3 to 4 random samples per class held out for testing
  – Taken before resizing to allow for fresh data
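A sketch of the sliding-window sampling just described, stepping a 60×60 window with a 15-pixel overlap (so a 45-pixel stride); names and signature are illustrative:

    using System.Collections.Generic;

    static class Sampler
    {
        public static IEnumerable<int[,]> Windows(int[,] img, int win = 60, int overlap = 15)
        {
            int stride = win - overlap;   // 45-pixel step for the defaults above
            for (int r = 0; r + win <= img.GetLength(0); r += stride)
                for (int c = 0; c + win <= img.GetLength(1); c += stride)
                {
                    var w = new int[win, win];   // copy out one example-sized chunk
                    for (int i = 0; i < win; i++)
                        for (int j = 0; j < win; j++)
                            w[i, j] = img[r + i, c + j];
                    yield return w;
                }
        }
    }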
Introduction: Textures Used

[Figure: the eight texture samples]
Introduction: Models Considered

• Parzen Windows Classifier (PWC) [2]
  – Basically, find the least error between the training data and an arbitrary example
  – Uses the class label from the closest training data
  – Training is just computing priors: all the "hard work" happens during classification
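A minimal sketch of a Parzen-window classifier along these lines, with a Gaussian kernel: training just stores the labeled vectors (which also fixes the priors), and all the work happens in Classify. Illustrative code, not the project's:

    using System;
    using System.Collections.Generic;
    using System.Linq;

    class ParzenClassifier
    {
        readonly Dictionary<int, List<double[]>> samples = new Dictionary<int, List<double[]>>();
        readonly double spread;

        public ParzenClassifier(double spread = 0.5) { this.spread = spread; }

        // "Training" is just storing the labeled feature vectors.
        public void Train(int label, double[] x)
        {
            if (!samples.ContainsKey(label)) samples[label] = new List<double[]>();
            samples[label].Add(x);
        }

        // Score each class by prior * average kernel response, pick the best.
        public int Classify(double[] x)
        {
            int total = samples.Values.Sum(s => s.Count);
            int best = -1;
            double bestScore = double.NegativeInfinity;
            foreach (var kv in samples)
            {
                double density = kv.Value.Average(t => Gaussian(x, t));
                double score = ((double)kv.Value.Count / total) * density;
                if (score > bestScore) { bestScore = score; best = kv.Key; }
            }
            return best;
        }

        double Gaussian(double[] x, double[] t)
        {
            double d2 = 0;   // squared Euclidean distance to a training vector
            for (int i = 0; i < x.Length; i++) d2 += (x[i] - t[i]) * (x[i] - t[i]);
            return Math.Exp(-d2 / (2 * spread * spread));
        }
    }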
Introduction: Models Considered

• Multi-Layer Perceptron (MLP) [3]
Introduction: Overview of Work

• Created a working texture recognizer
  – Training images broken down into chunks
    • An example-sized window scans across with overlap
    • Dimensions controlled from the GUI
  – Feature extraction pulls 9 unique features
    • Calculates means and standard deviations for normalization
    • Arbitrary samples are normalized prior to classification (see the sketch after this list)
  – Training data passed to the classifiers
    • PWC just uses the training data directly
    • MLP runs training until some error threshold is met
  – Arbitrary samples classified correctly most of the time
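As referenced above, a sketch of that normalization step, assuming plain per-feature z-scoring (the deck only says means and standard deviations are used; the names are illustrative):

    using System;

    static class Normalizer
    {
        // Per-feature mean and standard deviation over the training vectors.
        public static void Fit(double[][] train, out double[] mean, out double[] std)
        {
            int d = train[0].Length, n = train.Length;
            mean = new double[d]; std = new double[d];
            foreach (var x in train)
                for (int i = 0; i < d; i++) mean[i] += x[i] / n;
            foreach (var x in train)
                for (int i = 0; i < d; i++) std[i] += (x[i] - mean[i]) * (x[i] - mean[i]) / n;
            for (int i = 0; i < d; i++) std[i] = Math.Sqrt(std[i]);
        }

        // Map any vector (training or arbitrary sample) to (x - mean) / std.
        public static double[] Apply(double[] x, double[] mean, double[] std)
        {
            var z = new double[x.Length];
            for (int i = 0; i < x.Length; i++)
                z[i] = std[i] > 0 ? (x[i] - mean[i]) / std[i] : 0;
            return z;
        }
    }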
Implementation Details

• Each classifier (PWC, MLP) is a class
  – An abstract base class takes care of the generic stuff
    • Keeping track of training data
    • Static feature extraction code
      – Histogram intensity moments
      – GLCM calculations
• Examples are in vector form, with a class label
• Vectors can do most basic vector operations
• Used C# and Windows Forms for the UI
Implementation Details

• PWC uses a Gaussian as the kernel function
  – Spread was tried at both 0.5 and 100,000
  – Makes no difference after normalization
• MLP
  – Maximum error rate: 8%
  – Hidden nodes: class count × 2
  – Learning rate: 0.20
  – Everything is adjustable from the GUI
  – Uses backpropagation [4] for updating weights
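A compact sketch of a one-hidden-layer perceptron with sigmoid units trained by backpropagation [4], wired up with the settings above (learning rate 0.20, hidden nodes = 2 × class count, random initial weights). Illustrative code, not the project's:

    using System;

    class Mlp
    {
        readonly int nIn, nHid, nOut;
        readonly double[,] w1, w2;   // input->hidden, hidden->output; last row = bias
        readonly double rate;
        readonly Random rng = new Random();

        public Mlp(int inputs, int classes, double rate = 0.20)
        {
            nIn = inputs; nOut = classes; nHid = 2 * classes; this.rate = rate;
            w1 = new double[nIn + 1, nHid];
            w2 = new double[nHid + 1, nOut];
            for (int i = 0; i <= nIn; i++)    // random initial weights are why
                for (int j = 0; j < nHid; j++) w1[i, j] = rng.NextDouble() - 0.5;
            for (int i = 0; i <= nHid; i++)   // every run gives different results
                for (int j = 0; j < nOut; j++) w2[i, j] = rng.NextDouble() - 0.5;
        }

        static double Sigmoid(double v) => 1.0 / (1.0 + Math.Exp(-v));

        double[] Forward(double[] x, double[] hid)
        {
            for (int j = 0; j < nHid; j++)
            {
                double s = w1[nIn, j];   // bias term
                for (int i = 0; i < nIn; i++) s += x[i] * w1[i, j];
                hid[j] = Sigmoid(s);
            }
            var o = new double[nOut];
            for (int j = 0; j < nOut; j++)
            {
                double s = w2[nHid, j];
                for (int i = 0; i < nHid; i++) s += hid[i] * w2[i, j];
                o[j] = Sigmoid(s);
            }
            return o;
        }

        // One backpropagation step on (x, one-hot target t); returns the squared error.
        public double Train(double[] x, double[] t)
        {
            var hid = new double[nHid];
            var o = Forward(x, hid);
            var dOut = new double[nOut];
            double err = 0;
            for (int j = 0; j < nOut; j++)
            {
                double e = t[j] - o[j]; err += e * e;
                dOut[j] = e * o[j] * (1 - o[j]);           // output deltas
            }
            for (int i = 0; i < nHid; i++)                 // hidden deltas use the old w2
            {
                double dh = 0;
                for (int j = 0; j < nOut; j++) dh += dOut[j] * w2[i, j];
                dh *= hid[i] * (1 - hid[i]);
                for (int k = 0; k < nIn; k++) w1[k, i] += rate * dh * x[k];
                w1[nIn, i] += rate * dh;
            }
            for (int j = 0; j < nOut; j++)
            {
                for (int i = 0; i < nHid; i++) w2[i, j] += rate * dOut[j] * hid[i];
                w2[nHid, j] += rate * dOut[j];
            }
            return err;
        }
    }

An outer loop would call Train on every sample each epoch and stop once the average error drops below the configured threshold (0.08 here), or an epoch cap is hit.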
Implementation Details

• Training data is cached when training starts
  – Don't have to load features every time
• Classifier states are not cached
  – PWC only has priors for state: negligible time to compute them every time
  – MLP produces different results every run, due to random weight assignment
• Training progress is shown in real time
Experimentation: Linear Feature Separation

• Histogram: Entropy

[Chart: entropy per sample, one series per class (1–8); values span roughly 1.75 to 3.75]
Experimentation: Linear Feature Separation

• Histogram: Uniformity

[Chart: uniformity per sample, one series per class (1–8); values span roughly 0.07 to 0.42]
Experimentation: Linear Feature Separation

• Histogram: Variance

[Chart: variance per sample, one series per class (1–8); values span roughly 0 to 30]
Experimentation: Linear Feature Separation

• Histogram: Skewness

[Chart: skewness for samples 1–25, one series per class (1–8); values span roughly −140 to 110]
Experimentation: Linear Feature Separation

• Histogram: Kurtosis

[Chart: kurtosis per sample, one series per class (1–8); values span roughly 0 to 1400]
Experimentation: Linear Feature Separation

• GLCM: C_max

[Chart: C_max per sample, one series per class (1–8); values span roughly 0.02 to 0.52]
Experimentation: Linear Feature Separation

• GLCM: C_sq

[Chart: C_sq per sample, one series per class (1–8); values span roughly 0 to 0.3]
Experimentation: Linear Feature Separation

• GLCM: Q+2

[Chart: Q+2 per sample, one series per class (1–8); values span roughly 0 to 16]
Experimentation: Linear Feature Separation

• GLCM: Q−2

[Chart: Q−2 per sample, one series per class (1–8); values span roughly 0.17 to 0.52]
Experimentation: Confusion Matrix

(rows: true class; columns: predicted class)

3 0 0 0 0 0 0 0
0 3 0 0 0 0 0 0
0 0 3 0 0 0 0 0
0 0 0 3 0 0 0 0
0 0 0 0 3 0 0 0
0 0 0 0 0 3 0 0
0 0 0 0 1 0 3 0
0 0 0 0 0 0 0 3
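For reference, tallying such a matrix takes only a few lines; rows index the true class and columns the prediction (illustrative helper, not from the project):

    static int[,] ConfusionMatrix(int[] actual, int[] predicted, int classCount)
    {
        var m = new int[classCount, classCount];
        for (int i = 0; i < actual.Length; i++)
            m[actual[i], predicted[i]]++;   // row = true class, column = prediction
        return m;
    }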
Experimentation: MLP Min Error

• 0.080 – Default, 1 error
• 0.196 – Same as default
• 0.260 – 22 errors
• 1.851 – 22 errors
  – Total failure

Confusion matrices for the two failing settings, where nearly everything collapses into a single predicted class:

0 0 0 0 0 3 0 0
0 0 0 0 0 3 0 0
0 0 0 0 0 3 0 0
0 0 0 0 0 3 0 0
0 0 0 0 0 3 0 0
0 0 0 0 0 3 0 0
0 0 0 0 0 4 0 0
0 0 0 0 0 3 0 0

0 0 0 0 0 0 0 3
0 0 0 0 0 2 0 1
0 0 0 0 0 0 0 3
0 0 0 0 0 0 0 3
0 0 0 0 0 0 0 3
0 0 0 0 0 0 0 3
0 0 0 0 0 0 0 4
0 0 0 0 0 0 0 3
Experimentation: MLP Hidden Nodes

• Epoch count: 500; none had time to converge
• Confusion matrices by hidden node count (a dot is zero):

1 Node:
. . . . . . . 3
. . . . . . . 3
. . . . . . . 3
. . . . . . . 3
. . . . . . . 3
. . . . . . . 3
. . . . . . . 4
. . . . . . . 3

2 Nodes:
. . . . . . 3 .
. 3 . . . . . .
. . . 1 . 2 . .
. . . 1 1 . 1 .
. . . . 3 . . .
. . . . . . . 3
. 1 . . 1 . 2 .
. . . . . . . 3

3 Nodes:
3 . . . . . . .
. 3 . . . . . .
. . 2 . . . 1 .
. . . . 3 . . .
. . . . 3 . . .
. . . . . 2 . 1
. 2 . . . . 2 .
. . . . . . . 3

4 Nodes:
3 . . . . . . .
. 3 . . . . . .
. . 3 . . . . .
. . . 2 1 . . .
. . . . 3 . . .
. . . . . 2 . 1
. 1 . . . . 3 .
. . . . . . . 3

5 Nodes:
3 . . . . . . .
. 3 . . . . . .
. . 3 . . . . .
. . . 3 . . . .
. . . . 3 . . .
. . . . . 2 . 1
. . . . 1 . 3 .
. . . . . . . 3

6 Nodes:
3 . . . . . . .
. 3 . . . . . .
. . 3 . . . . .
. . . 2 1 . . .
. . . . 3 . . .
. . . . . 3 . .
. . . . 1 . 3 .
. . . . . . . 3

7 Nodes:
3 . . . . . . .
. 3 . . . . . .
. . 3 . . . . .
. . . 3 . . . .
. . . . 3 . . .
. . . . . 2 . 1
. . . . 1 . 3 .
. . . . . . . 3

8 Nodes:
3 . . . . . . .
. 3 . . . . . .
. . 3 . . . . .
. . . 3 . . . .
. . . . 3 . . .
. . . . . 3 . .
. . . . 1 . 3 .
. . . . . . . 3
Experimentation: MLP Hidden Nodes

• With full convergence and 4 hidden nodes:

3 0 0 0 0 0 0 0
0 3 0 0 0 0 0 0
0 0 3 0 0 0 0 0
0 0 0 2 1 0 0 0
0 0 0 0 3 0 0 0
0 0 0 0 0 3 0 0
0 0 0 0 1 0 3 0
0 0 0 0 0 0 0 3

• Perfect performance with 5+ hidden nodes
Experimentation: MLP Hidden Nodes

• Epochs to converge vs. hidden nodes

[Chart: epochs to converge (log scale, 100 to 10,000) for 4–16, 32, 64, 128, and 256 hidden nodes]
Experimentation: MLP Hidden Nodes

• Color Assignment and Original Mosaic
Experimentation: MLP Hidden Nodes

• 8 hidden neurons

[Figure: classified mosaic]
Experimentation: MLP Hidden Nodes

• 16 hidden neurons

[Figure: classified mosaic]
Summary

• GLCM features work better than histogram features
• The two types combined work quite well
• PWC and MLP can achieve the same quality
  – Not necessarily true for less separated textures
• PWC has only one modifiable parameter
  – It makes no difference on normalized features!
• MLP has 3 parameters
  – The minimum error rate is crucial
  – Lambda (the learning rate) mostly affects convergence speed
  – The hidden layer seems to need at least as many nodes as there are classes
    • Trouble converging with too few
Conclusion

• Created a texture recognizer
• Computed 9 features
  – 5 from the histogram
  – 4 from the GLCM
• Employed two different classifiers
  – PWC: no parameters worth tuning (the spread has no effect after normalization)
  – MLP: several parameters to tweak
• The UI allows working with multiple files
Demo
38
1. Robert M Haralick, K Shanmugam, Its'hak Dinstein (1973). "Textural Features for Image Classification".
2. Emanuel Parzen (1962). On estimation of a probability density function and mode.3. Warren S. McCulloch and Walter Pitts (1943). A logical calculus of ideas immanent in
nervous activity.4. Robert Hecht-Nielsen (1989). Theory of the backpropagation neural network.5. Lee K. Jones (1990). Constructive approximations for neural networks by sigmoidal
functions.6. Keinosuke Fukunaga (1990). Introduction to Statistical Pattern Recognition.7. R. Rojas: Neural Networks, Springer-Verlag, Berlin, 19968. Claude Mbusa Takenga, Koteswara Rao Anne, K. Kyamakya, Jean Chamberlain Chedjou:
Comparison of Gradient descent method, Kalman Filtering and decoupled Kalman in training Neural Networks used for fingerprint-based positioning
9. http://www.fp.ucalgary.ca/mhallbey/the_glcm.htm10. http://www.ux.uis.no/~tranden/brodatz.html11. http://www.tek271.com/articles/neuralNet/IntoToNeuralNets.html
References