Genetic Feature Subset Selection for Gender Classification: A Comparison Study
Zehang Sun, George Bebis, Xiaojing Yuan, and Sushil Louis
Computer Vision Laboratory
Department of Computer Science
University of Nevada, Reno
http://www.cs.unr.edu/CVL
Gender Classification
• Problem statement
  – Determine the gender of a subject from facial images.
• Potential applications
  – Face Recognition
  – Human-Computer Interaction (HCI)
• Challenges
  – Race, age, facial expression, hair style, etc.
Gender Classification by Humans
• Humans are able to make fast and accurate gender classifications.
  – It takes 600 ms on average to classify faces according to their gender (Bruce et al., 1987).
  – 96% accuracy has been reported using photos of non-familiar faces without hair information (Bruce et al., 1993).
• Empirical evidence indicates that gender decisions are made much faster than identity decisions.
  – Computation of gender and identity might be two independent processes.
  – There is evidence that gender classification is carried out by a separate population of cells in the inferior temporal cortex (Damasio et al., 1990).
Designing a Gender Classifier
• The majority of gender classification schemes are based on supervised learning.
• Definition
  – Feature extraction determines an appropriate subspace of dimensionality m in the original feature space of dimensionality d (m << d).

Pre-Processing → Feature Extraction → Classifier
Previous Approaches
• Geometry-based
  – Use distances, angles, and areas among facial features.
  – Point-to-point distances + discriminant analysis (Burton '93, Fellous '97)
  – Feature-to-feature distances + HyperBF NNs (Brunelli '92)
  – Wavelet features + elastic graph matching (Wiskott '95)
• Appearance-based
  – Raw images + NNs (Cottrell '90, Golomb '91, Yen '94)
  – PCA + NNs (Abdi '95)
  – PCA + nearest neighbor (Valentin '97)
  – Raw images + SVMs (Moghaddam '02)
What Information is Useful for Gender Classification?
• Geometry-based approaches
  – Representing faces as a set of features assumes a priori knowledge about what the features are and/or how they are related.
  – There is no simple set of features that can predict the gender of faces accurately.
  – There is no simple algorithm for extracting the features automatically from images.
• Appearance-based approaches
  – Certain features are nearly characteristic of one sex or the other (e.g., facial hair for men; makeup or certain hairstyles for women).
  – This kind of information is easier to represent using appearance-based feature extraction methods.
  – Appearance-based features, however, are more likely to suffer from redundant and irrelevant information.
Feature Extraction Using PCA
• Feature extraction is performed by projecting the data in a lower-dimensional space using PCA.
• PCA maps the data in a lower-dimensional space using a linear transformation.
• The columns of the projection matrix are the “best” eigenvectors (i.e., eigenfaces) of the covariance matrix of the data.
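The projection described above can be sketched as follows. This is a minimal illustration, assuming NumPy and a matrix of flattened face images; all names are ours, not the authors' code:

```python
import numpy as np

def pca_project(images, m):
    """Project flattened face images onto the top-m eigenvectors
    (eigenfaces) of the data covariance matrix."""
    mean_face = images.mean(axis=0)
    X = images - mean_face              # center the data
    # SVD of the centered data: the rows of Vt are the covariance-matrix
    # eigenvectors, already sorted by decreasing eigenvalue.
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    W = Vt[:m].T                        # d x m projection matrix
    return X @ W, mean_face, W
```

A face can then be approximately reconstructed as `mean_face + features @ W.T`, which is how reconstructed-image slides of this kind are typically produced.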
Which Eigenvectors Encode Mostly Gender-Related Information?
EV#1 EV#2 EV#3 EV#4 EV#5 EV#6
EV#8 EV#10 EV#12 EV#14 EV#19 EV#20
Sometimes it is possible to determine what features are encoded by specific eigenvectors.
Which Eigenvectors Encode Mostly Gender-Related Information? (cont’d)
• All eigenvectors contain information related to the gender of faces; however, only the information conveyed by eigenvectors with large eigenvalues can be generalized to new faces (Abdi et al., 1995).
• Removing specific eigenvectors could in fact improve performance (Yambor et al., 2000).
Critique of Previous Approaches
• No explicit feature selection is performed.
  – The same features used for face identification are also used for gender classification.
• Some features might be redundant or irrelevant.
  – Rely heavily on the classifier.
  – Classification accuracy can suffer.
  – Time-consuming training and classification.
Project Goal
• Improve the performance of gender classification using feature subset selection.
Pre-Processing → Feature Extraction → Feature Selection (GA) → Feature Subset → Classifier
Feature Selection
• Definition
  – Given a set of d features, select a subset of size m that leads to the smallest classification error.
• Filter Methods
  – Preprocessing steps performed independently of the classification algorithm or its error criteria.
• Wrapper Methods
  – Search through the space of feature subsets using the criterion of the classification algorithm to select the optimal feature subset.
  – Provide more accurate solutions than filter methods, but are in general more computationally expensive.
What constitutes a good set of features for classification?
What are the Benefits?
• Eliminate redundant and irrelevant features.
• Fewer training examples are required.
• Faster and more accurate classification.
Project Objectives
• Perform feature extraction by projecting the images in a lower-dimensional space using Principal Components Analysis (PCA).
• Perform feature selection in PCA space using Genetic Algorithms.
• Test four traditional classifiers (Bayesian, LDA, NNs, and SVMs).
• Compare with traditional feature subset selection approaches (e.g., Sequential Backward Floating Search (SBFS)).
Genetic Algorithms (GAs) Review
• What is a GA?
  – An optimization technique for searching very large spaces.
  – Inspired by the biological mechanisms of natural selection and reproduction.
• What are the main characteristics of a GA?
  – Global optimization technique.
  – Uses objective function information, not derivatives.
  – Searches probabilistically using a population of structures (i.e., candidate solutions using some encoding).
  – Structures are modified at each iteration using selection, crossover, and mutation.
Structure of GA
Current Generation (e.g., 10010110…, 01100010…, 10100100…, 10010010…, 01111101…)
  → Evaluation and Selection
  → Crossover
  → Mutation
Next Generation (e.g., 10010110…, 01100010…, 10100100…, 01111001…, 10011101…)
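The generational loop in the diagram can be sketched as a toy implementation over bit strings. This is illustrative only; the operator details are common textbook choices, not the authors' exact code, and the rates reuse the GA parameters quoted later in the talk:

```python
import random

def evolve(population, fitness, generations=50, cx_rate=0.66, mut_rate=0.04):
    """One GA run: evaluate and select, then crossover, then mutation,
    repeated for a number of generations (assumes an even population size)."""
    for _ in range(generations):
        # evaluation + fitness-proportionate selection
        weights = [fitness(c) for c in population]
        mating_pool = random.choices(population, weights=weights,
                                     k=len(population))
        nxt = []
        for a, b in zip(mating_pool[::2], mating_pool[1::2]):
            if random.random() < cx_rate:       # single-point crossover
                p = random.randrange(1, len(a))
                a, b = a[:p] + b[p:], b[:p] + a[p:]
            nxt += [a, b]
        # bit-flip mutation: '0' becomes '1' and vice versa
        population = [''.join(bit if random.random() >= mut_rate
                              else '10'[int(bit)] for bit in c)
                      for c in nxt]
    return population
```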
Encoding and Fitness Evaluation
• Encoding scheme
  – Transforms solutions in parameter space into finite-length strings (chromosomes) over some finite set of symbols.
• Fitness function
  – Evaluates the goodness of a solution.
  – Example: (11, 6, 9) → (1011_0110_1001) → (101101101001)
  – Fitness = f(decode(chromosome))
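The encoding example above can be sketched in a few lines; the field width and the stand-in objective `f` are our illustrative choices:

```python
def decode(chromosome, bits=4):
    """Split a bit string into fixed-width fields and decode each field
    to an integer, as in the (11, 6, 9) <-> 101101101001 example."""
    return tuple(int(chromosome[i:i + bits], 2)
                 for i in range(0, len(chromosome), bits))

def fitness(chromosome, f=sum):
    """Fitness = f(decode(chromosome)); `sum` is a stand-in objective."""
    return f(decode(chromosome))
```

Here `decode('101101101001')` recovers `(11, 6, 9)`.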
Selection Operator
• Probabilistically filters out solutions that perform poorly, choosing high-performance solutions to exploit.
  – Chromosomes with high fitness are copied over to the next generation.

Example (from the figure): chromosomes 1001, 1101, 1000, 0001 with fitness 0.1, 0.9, 0.01, 0.01 yield the selected population 1001, 1101, 1101, 1101.
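Fitness-proportionate (roulette-wheel) selection of this kind can be sketched as follows; the function name is ours:

```python
import random

def roulette_select(population, fitnesses, k=None):
    """Draw k chromosomes with probability proportional to fitness, so
    high-fitness strings tend to be copied into the next generation."""
    return random.choices(population, weights=fitnesses,
                          k=k if k is not None else len(population))
```

With the figure's values, the string with fitness 0.9 dominates the selected population, as the figure shows.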
Crossover and Mutation Operators
• Generate new solutions for exploration.
• Crossover
  – Allows information exchange between points.
  – Example: 10011110 × 10110010 → 10010010, 10111110 (single-point crossover after the third bit).
• Mutation
  – Its role is to restore lost genetic material.
  – Example: 10011110 → 10011010 (one mutated bit).
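The two operators can be sketched directly; these are generic textbook forms, not the authors' code:

```python
def crossover(a, b, point):
    """Single-point crossover: exchange the tails of two parent strings
    after `point`, allowing information exchange between them."""
    return a[:point] + b[point:], b[:point] + a[point:]

def mutate(chromosome, pos):
    """Flip the bit at `pos`, restoring genetic material that may have
    been lost from the population."""
    flipped = '10'[int(chromosome[pos])]
    return chromosome[:pos] + flipped + chromosome[pos + 1:]
```

`crossover('10011110', '10110010', 3)` reproduces the slide's offspring `('10010010', '10111110')`, and `mutate('10011110', 5)` gives the slide's `'10011010'`.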
Genetic Feature Subset Selection
• Binary encoding
  – Each chromosome is a bit string over the first 250 eigenvectors (EV#1 … EV#250); each bit indicates whether that eigenvector is included in the feature subset.
• Fitness evaluation
  – fitness = 10^4 × accuracy + 0.4 × zeros
  – accuracy: classification accuracy on the validation set
  – zeros: number of zeros in the chromosome (i.e., number of features excluded)
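The fitness formula can be written down directly. The 10^4 and 0.4 weights are from the slide; `validation_accuracy` would come from training the chosen classifier on the candidate subset:

```python
def ga_fitness(chromosome, validation_accuracy):
    """fitness = 10^4 * accuracy + 0.4 * zeros: heavily reward accuracy
    on the validation set, with a small bonus for every excluded
    eigenvector (a '0' bit in the 250-bit chromosome)."""
    zeros = chromosome.count('0')
    return 1e4 * validation_accuracy + 0.4 * zeros
```

Because accuracy is weighted by 10^4, the zeros term only breaks ties between subsets of near-equal accuracy, favoring the smaller subset.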
Genetic Feature Subset Selection (cont’d)
• Cross-generational selection strategy
  – Assuming a population of size N, the offspring double the size of the population, and we select the best N individuals from the combined parent-offspring population.
• GA parameters
  – Population size: 350
  – Number of generations: 400
  – Crossover rate: 0.66
  – Mutation rate: 0.04
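The cross-generational (μ+λ-style) strategy can be sketched as:

```python
def cross_generational_select(parents, offspring, fitness):
    """Combine parents and offspring (2N individuals for a population of
    size N) and keep the N fittest as the next generation."""
    pool = parents + offspring
    return sorted(pool, key=fitness, reverse=True)[:len(parents)]
```

This is elitist: a good parent can never be displaced by a worse child, so the best fitness in the population never decreases.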
Dataset
• 400 frontal images from 400 different people
  – 200 male, 200 female
  – Different races
  – Different lighting conditions
  – Different facial expressions
• Images were registered and normalized
  – No hair information
  – Account for different lighting conditions
Experiments
• Gender classifiers:
  – Linear Discriminant Analysis (LDA)
  – Bayes classifier
  – Neural Network (NN) classifier
  – Support Vector Machine (SVM) classifier
• Three-fold cross validation
  – Training set: 75% of the data
  – Validation set: 12.5% of the data
  – Test set: 12.5% of the data
Classification Error Rates
ERM: error rate using manually selected feature subsets
ERG: error rate using GA-selected feature subsets

        NN      Bayes   LDA     SVMs
ERM     17.7%   22.4%   14.2%   8.9%
ERG     11.3%   13.3%   9.0%    4.7%
(SVMs + SBFS: 6.7%)
Ratio of Features - Information Kept
RN: percentage of the number of features in the feature subset
RI: percentage of information contained in the feature subset

        NN      Bayes   LDA     SVMs+GA  SVMs+SBFS
RN      17.6%   13.3%   36.4%   8.4%     42.8%
RI      38%     31%     61.2%   32.4%    69%
Reconstructed Images
Original images
Using top 30 EVs
Using EVs selected by B-PCA+GA
Using EVs selected by LDA-PCA+GA
Reconstructed Images (cont’d)
Reconstructed faces using GA-selected EVs have lost information about identity but do disclose strong gender information.

Original images
Using top 30 EVs
Using EVs selected by SVM-PCA+GA
Using EVs selected by NN-PCA+GA

Certain gender-irrelevant features do not appear in the reconstructed images using GA-selected EVs.
Comparison with SBFS
• Sequential Backward Floating Search (SBFS) is a combination of two heuristic search schemes:
  (1) Sequential Forward Selection (SFS): starts with an empty feature set and at each step adds the best single feature to the feature subset.
  (2) Sequential Backward Selection (SBS): starts with the entire feature set and at each step drops the feature whose absence least decreases the performance.
Comparison with SBFS (cont’d)
• SBFS is an advanced version of the plus-l take-away-r method, which first enlarges the feature subset by l features using forward selection and then removes r features using backward selection.
• The number of forward and backward steps in SBFS is dynamically controlled and updated based on the classifier's performance.
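The forward half of this scheme (plain SFS) can be sketched as a greedy loop; `evaluate` is a stand-in for the classifier's validation accuracy on a candidate subset:

```python
def sequential_forward_selection(features, evaluate, m):
    """Greedy SFS: start from the empty set and repeatedly add the
    single feature that most improves the subset's score."""
    selected, remaining = [], list(features)
    while len(selected) < m and remaining:
        best = max(remaining, key=lambda f: evaluate(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected
```

SBFS interleaves backward steps into this loop, dropping features whose removal least hurts the score, which lets it escape some of the nesting mistakes a purely forward search makes.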
Comparison with SBFS (cont’d)
(a) SVMs+SBFS (b) SVMs+GA
        NN       Bayes    LDA      SVMs
ERM     17.70%   22.38%   14.20%   8.90%
ERG     11.33%   13.33%   9.00%    4.70%
ERSBFS  NA       NA       NA       6.70%

ERM: error rate using the manually selected feature subsets
ERG: error rate using GA-selected feature subsets
ERSBFS: error rate using SBFS
Comparison with SBFS (cont’d)
Original images
Using top 30 EVs
Using EVs selected by SVM-PCA+GA
Using EVs selected by SVM-PCA+SBFS
Conclusions
• We have considered the problem of gender classification from frontal facial images using genetic feature subset selection.
• GAs provide a simple, general, and powerful framework for feature subset selection.
• Very useful, especially when the number of training examples is small.
• We have tested four well-known classifiers using PCA for feature extraction.
• Genetic feature subset selection has led to lower error rates in all cases.
Future Work
• Generalize the feature encoding scheme.
  – Use weights instead of 0/1 encoding.
• Consider more powerful fitness functions.
• Use larger data sets.
  – FERET data set.
• Apply feature selection using different features.
  – e.g., wavelet or Gabor features.
• Experiment with different problems.
  – e.g., vehicle detection.