
IEEE#41458-ACCS'017&PEIT'017, Alexandria, Egypt

HUMAN EAR RECOGNITION USING VOTING OF STATISTICAL AND GEOMETRICAL TECHNIQUES

Ibrahim E. Ziedan1, Hesham Farouk2, and Shimaa Mohamed3
Professor1, Associate Professor2, Research Scholar3

1,3 Dept. of Computers and Systems, Zagazig University, Sharkia, Egypt
2 Dept. of Computers and Systems, Electronics Research Institute, Cairo, Egypt

Abstract - Ear recognition has gained great importance in the field of pattern recognition due to the ear's simple structure and rich characteristics compared with other widely used biometrics such as the face, iris, and voice. In this paper, we present a new algorithm for ear recognition based on voting among the results of three techniques: a second-level Haar wavelet transform, Histogram of Oriented Gradient (HOG) descriptors, and a geometry-based technique. First, an enhancement procedure is applied to the images. An ear segmentation step then selects the better of two candidate ear regions, one obtained with an active contour and the other by connecting the endpoints of the ear edges. For every image, a feature vector is then extracted with each of the three methods, and test images are matched against registered images using the Euclidean distance; for each method, any registered image whose distance falls below that method's threshold becomes a candidate match. Voting among the candidates selects the final match. The experiments achieved an overall accuracy of 99.6% on images affected by illumination changes and pose variations.

Keywords — Biometrics, image processing, ear recognition, Haar wavelet, Histogram of Oriented Gradients, geometrical features, voting technique.

I. INTRODUCTION

Biometrics such as the face, iris, fingerprint, and gait have attracted attention in the research community for decades and are the most widely used in security, medical, and robotics applications. The ear later proved its value in forensic science and offers advantages over these biometrics, giving it high value in identification systems [1].

Ear recognition is therefore considered an important field and has gained considerable attention in the research community. This attention is due to the robustness of the ear shape and its richness in features, together with its simpler structure compared with other known biometrics such as the face and iris. Researchers have shown that the shape of the human ear keeps the same features from birth through aging, and measurements indicate that the ear is unique to each person, even for fraternal and identical twins, triplets, and quadruplets, as no two ears are alike [2, 3].

Recent research on 2D ear recognition has focused on security systems, where it is challenging to use an ear recognition system on images of people captured at an access point: the enrolled database images should be taken under controlled conditions, but they may still contain pose variation or partial occlusion. The hardest and most important task is therefore to build an automated ear identification algorithm that copes with these constraints, extracts the correct ear shape, and describes its characteristics accurately, so that the best possible matching accuracy can be achieved with an appropriate classifier. The anatomy of the ear is shown in figure 1.

Figure 1 The external anatomy of the ear

In our literature review, we came across several techniques for ear recognition. Burge and Burger [4] were the first to consider the human ear as a biometric in machine vision. They constructed a system invariant under affine transformations using a Voronoi neighborhood graph. They pointed out the problem of occlusion by hair and proposed using thermogram images to detect the ear through surface heat. However, the study was incomplete because no evaluation of the system was provided. Neural network classifiers were used in two experiments introduced by Moreno et al. [5]. The first used edge detection to extract a feature vector of seven feature points and obtained a 43% recognition rate; in the second, the intersection points of horizontal, vertical, and diagonal cuts formed a "morphology feature vector" used in the recognition stage, and an 83% recognition rate was reported.

978-1-5386-6407-0/17/$31.00 ©2017 IEEE


Hurley et al. [6] considered the force field feature extraction approach with template matching, which extracts potential energy fields from an image as features. The use of the force field transform makes the results insensitive to noise in the database images; on 252 images of 63 subjects, the experiment reached a 99% rank-one recognition rate. Another study on ear biometrics, based on a PCA approach, was proposed by Yan and Bowyer [7]. Their results did not exceed 67.5% rank-one recognition, while their edge-matching experiment on 3D ear images achieved a 98.7% rank-one recognition rate. A geometric feature extraction based system was presented by Choras [8]. The features were based on the intersection points between circles of different radii, centered on the computed centroid, and the contours extracted from the ear image. The experiments were carried out on a database of very high-quality images with no illumination changes, which led to error-free recognition. Another ear verification system was introduced by Abate et al. [9]; they extracted the main features using the Generic Fourier Descriptor (GFD), a descriptor robust to illumination changes and ear rotation. The ear image was transformed to the frequency domain with the centroid of the ear as the center of the polar space, to keep the centroid of the resulting space in the same position. The technique was tested on a collected database of 282 ear images and reported a rank-one accuracy of 96% for images with no pose angle, but the accuracy drops significantly under pose variations.

In this paper, a human ear recognition approach based on three feature extraction techniques, the wavelet transform, a geometry-based technique, and Histogram of Oriented Gradient descriptors, is proposed. The first section presents the pre-processing steps, and the ear segmentation process is explained next. The following section details feature extraction. In the proposed approach, the ear image is decomposed using the 2D Haar wavelet transform and the coefficients are extracted to obtain the high-frequency information. The statistical features calculated from all coefficients are the mean, standard deviation, variance, mean energy, maximum energy, and minimum energy. Geometrical features describing the ear are also extracted, such as the centroid, the ear maximum line, the normal line, the triangle lines, and the angles. Histogram-based descriptors for the ear are extracted using the HOG method as well. Euclidean distance matching is then applied to all methods, and in the last section voting among the matching results is used to choose the correct match.

The block diagram of the proposed ear recognition approach is shown in figure 2.

Figure 2 Block diagram of proposed ear recognition system

II. PREPROCESSING STAGE

Most collected or previously existing databases are affected by factors that blur the appearance of the ear's characteristics, so the first step is usually a preprocessing stage that removes noise, normalizes the color, and compensates for illumination.

A. Resizing and grayscale conversion

Ear images are normalized to a constant size of 120x120 pixels by resizing. The true-color (RGB) images are then converted to grayscale using the rgb2gray function of MATLAB 2014.

Figure 3 resized and grayscale image
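For illustration, the equivalent step in Python with OpenCV (the paper itself uses MATLAB's resizing and rgb2gray, so the calls below are an equivalent sketch, not the authors' code):

```python
import cv2

def to_gray_120(path):
    """Resize an ear image to 120x120 pixels and convert it to grayscale."""
    img = cv2.imread(path)                        # BGR image read from disk
    img = cv2.resize(img, (120, 120))             # fixed working size
    return cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # single-channel grayscale
```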


B. Normalization using adaptive histogram equalization

Some normalization is still needed because the grayscale images retain the luminance of the original RGB image. Histogram equalization adjusts the intensity values of the entire image to match a target histogram, giving the low-contrast parts of the image higher contrast by spreading out the most frequent intensity values [10].

This technique works well when the pixel values are distributed similarly over the whole image. However, when the image contains regions that are strongly highlighted or too dark compared with the rest of the image, ordinary histogram equalization does not give the desired results, so Adaptive Histogram Equalization (AHE) is used. It computes several histograms for different parts of the image and uses them to redistribute the image lighting. The algorithm used here is window based: it divides the image into windows (3x3 windows in this work) around each pixel and transforms each pixel with a function derived from its neighborhood region, with the standard deviation of the window acting as the transformation function. The algorithm is applied twice; in the second pass, the pixel values are multiplied by a selected constant factor. Samples of the images resulting from the steps discussed so far are shown in figure 4(a).
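A rough sketch of this window-based enhancement in Python; the exact pixel mapping and the value of the constant factor are not given in the paper, so both are assumptions here:

```python
import numpy as np
from scipy.ndimage import generic_filter

def window_std_enhance(gray, win=3, gain=1.5):
    """Illustrative window-based enhancement: each pixel is rescaled by the
    standard deviation of its 3x3 neighbourhood, and a second pass applies a
    constant gain. The mapping and the gain value are assumptions."""
    g = gray.astype(np.float64)
    local_sd = generic_filter(g, np.std, size=win)   # per-window standard deviation
    out = g * local_sd / (local_sd.mean() + 1e-6)    # first pass: std-based rescaling
    out = gain * out                                 # second pass: constant factor
    return np.clip(255 * out / out.max(), 0, 255).astype(np.uint8)
```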

C. Conversion of images to binary images

The enhanced grayscale image is converted to a binary image using a threshold value selected to separate the foreground from the background, as in figure 4(b).

Figure 4 Ear image after normalization and binary conversion
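A small sketch of this conversion; the paper uses a manually selected threshold, and Otsu's method below is only a stand-in when no value is supplied:

```python
import cv2

def binarize(gray, thresh=None):
    """Foreground/background separation of the enhanced grayscale image."""
    if thresh is None:
        # fallback: automatic threshold (Otsu), an assumption on our part
        _, bw = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    else:
        # manually selected threshold, as described in the paper
        _, bw = cv2.threshold(gray, thresh, 255, cv2.THRESH_BINARY)
    return bw
```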

III. EAR SEGMENTATION STAGE

To extract the ear shape from the other content of the image, two algorithms are used to create a mask in the ear detection stage. The first depends on active contour detection and the other on closing the extracted edges. Both algorithms are applied to each ear image, so every image yields two detected ears. In our experiments, the two detection algorithms gave the same result for most images. In some images, however, the detected ear areas differ by an increase or a decrease, which affects the efficiency of the system. A selection algorithm based on the difference between the detected areas is therefore applied to the resulting images to choose the image that contains the complete ear contour. Figure 5 shows example results of both segmentation methods.

Figure 5 ear segmentation using

(a) Active contour method. (b) Edges connection method.
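The selection rule is only described qualitatively, so the following Python sketch is a hypothetical implementation: it keeps either mask when their areas agree closely, and otherwise prefers the mask whose area is closer to an assumed typical ear coverage:

```python
import numpy as np

def select_ear_mask(mask_active, mask_edges, tol=0.15, target_ratio=0.35):
    """Hypothetical selection between the two segmentation masks.
    The tolerance and the assumed 'typical' ear coverage ratio are
    illustrative assumptions, not values from the paper."""
    a1 = int(np.count_nonzero(mask_active))
    a2 = int(np.count_nonzero(mask_edges))
    if abs(a1 - a2) / max(a1, a2, 1) < tol:        # both methods agree closely
        return mask_active
    target = target_ratio * mask_active.size       # assumed typical ear area
    return mask_active if abs(a1 - target) <= abs(a2 - target) else mask_edges
```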

For accurate feature extraction, all images in the database are rotated to almost the same angle, so that the main axis of the ear lies in a diagonal direction, as in figure 6.

Figure 6 ear image rotation
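A sketch of the alignment step using the fitted-ellipse orientation from scikit-image; the 45-degree target and the sign convention may need adjusting, since the paper only states that the ears are rotated to roughly the same diagonal orientation:

```python
import numpy as np
from skimage.measure import label, regionprops
from skimage.transform import rotate

def rotate_to_diagonal(binary_ear, target_deg=45.0):
    """Rotate the segmented ear so its major axis is roughly diagonal.
    Assumes a single ear region in the mask; target angle and sign
    convention are illustrative assumptions."""
    props = regionprops(label(binary_ear))[0]
    current = np.degrees(props.orientation)          # major-axis angle of the region
    return rotate(binary_ear.astype(float), target_deg - current, resize=True)
```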

IV. FEATURE EXTRACTION STAGE

The feature extraction stage is the most important process in the recognition system. If the extracted features are accurate and truly capture characteristics specific to the person, recognition becomes a simple task and the whole process succeeds in identification or verification.

In this paper, features are extracted with three methods that are efficient in biometric recognition; the matching results are then obtained, and the person is recognized by voting among the candidates produced individually by each technique. The feature extraction methods used in this work fall into two categories, statistical and geometrical: the Haar wavelet transform, a geometry-based technique, and the Histogram of Oriented Gradients technique.


The flowchart in figure 7 illustrates the main steps of the recognition system.

Figure 7 the main steps of ear recognition system

A. Wavelet transform

The wavelet transform provides both spatial and frequency information about the ear image. The Haar transform is one of the basic wavelet transforms; it goes back to the first wavelet proposed by Alfred Haar in 1909 and is still a commonly used decomposition [11].

A two-level 2D Haar wavelet was used to decompose the image into approximation, horizontal, vertical, and diagonal coefficients. The approximation coefficients of the first level are decomposed again into four sub-bands, giving seven wavelet coefficient sets in total, as shown in figure 8.

Figure 8 The 2 level decomposition of wavelet transform

In the feature extraction stage, the features of each image are computed from the seven coefficient sets. The statistical features forming the feature vector are the mean, standard deviation, variance, mean energy, maximum energy, and minimum energy of each of the seven Haar wavelet coefficient sets.
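A minimal sketch of this decomposition and the statistics, using PyWavelets in Python rather than the authors' MATLAB implementation; treating "energy" as the squared coefficient values is our assumption:

```python
import numpy as np
import pywt

def haar_features(gray):
    """Two-level Haar decomposition giving seven sub-bands (approximation at
    level 2 plus detail bands at levels 2 and 1), followed by six statistics
    per band: mean, std, variance, mean/max/min energy."""
    cA2, (cH2, cV2, cD2), (cH1, cV1, cD1) = pywt.wavedec2(gray, 'haar', level=2)
    feats = []
    for band in (cA2, cH2, cV2, cD2, cH1, cV1, cD1):
        e = np.asarray(band, dtype=np.float64) ** 2   # 'energy' assumed to be squared coefficients
        feats += [band.mean(), band.std(), band.var(),
                  e.mean(), e.max(), e.min()]
    return np.array(feats)                            # 7 bands x 6 statistics = 42 values
```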

B. Geometrical Features

Geometry-based techniques describe the shape of the ear, so a diversity of measurements improves how well the ear shape is captured [12]. The extracted features are summarized below (a small computational sketch follows the list):

Figure 9 the geometrical features formed on ear shape

1) Ear length: the two farthest pixels on the outer ear contour are found; the line between them is called the reference line, and its length expresses the length of the ear.

2) Normal line: a line perpendicular to the reference line, passing through its midpoint. The ear contour pixels intersected by the normal line define the width of the ear, and this length is taken as a feature.

3) Triangle lines: the intersection point of the normal line with the outer curvature of the ear (whether a left or right ear is used) forms a triangle together with the two farthest points, and the lengths of the triangle sides are measured.

4) Triangle angles: the acute angles formed between the sides of the triangle and the reference line are added to the feature vector.

5) Center of mass: the unique location in the image that represents the average position of the main edge mass in all directions, also called the centroid. The ear centroid lies in the concha region of the ear.


6) Farthest points: the locations of the two farthest points on the maximum line along the ear shape.
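A small sketch of how the reference line and centroid could be computed from the outer contour points; the remaining features follow with basic vector geometry. This is illustrative Python, not the authors' code:

```python
import numpy as np

def reference_line_and_centroid(contour_xy):
    """contour_xy is an (N, 2) array of (x, y) points on the outer ear contour.
    Returns the ear length (reference line), its two endpoints, and the
    centroid of the contour points."""
    # brute-force farthest pair of contour points
    d = np.linalg.norm(contour_xy[:, None, :] - contour_xy[None, :, :], axis=2)
    i, j = np.unravel_index(np.argmax(d), d.shape)
    p1, p2 = contour_xy[i], contour_xy[j]
    ear_length = d[i, j]
    centroid = contour_xy.mean(axis=0)
    return ear_length, p1, p2, centroid
```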

C. Histogram of Oriented Gradient (HOG) Features

The Histogram of Oriented Gradients is a descriptor that counts occurrences of gradient orientations in localized portions of an image [13]. Features are extracted over the whole image through a sliding-window algorithm.

The main idea is to obtain a gradient vector for each pixel in the image. The image is divided into cells, each grouping a collection of gradient vectors, and the gradient magnitudes within each cell are accumulated into an orientation histogram. The number of pixels per cell determines the length of the resulting feature vector. The number of descriptors obtained from an ear image is shown in figure 10.

Figure 10 HOG descriptors extracted for different cell sizes: (a) original image; (b) cell size 4x4, feature length 30276; (c) cell size 8x8, feature length 7056
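A sketch of the HOG extraction with scikit-image; the paper does not list its HOG parameters, but 9 orientation bins with 2x2 cells per block reproduce the feature lengths quoted in figure 10 for a 120x120 image (30276 for 4x4 cells, 7056 for 8x8 cells), so those settings are assumed here:

```python
from skimage.feature import hog

def hog_features(gray120, cell=(8, 8)):
    """HOG descriptor for a 120x120 grayscale ear image.
    Parameters (9 bins, 2x2 cells per block) are inferred, not stated in the paper."""
    return hog(gray120, orientations=9, pixels_per_cell=cell,
               cells_per_block=(2, 2), block_norm='L2-Hys')
```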

V. MATCHING MODULE

In the matching stage, a feature matrix of the features extracted by each method is formed and stored in the database. The feature database for each method is divided into testing and training sets. The match score is the Euclidean distance

d(s, r) = \sqrt{\sum_{i=1}^{n} (s_i - r_i)^2},

where s = [s_1, s_2, ..., s_n] is the testing feature vector and r = [r_1, r_2, ..., r_n] is the training feature vector.
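A minimal nearest-neighbour matcher over the stored feature matrix, shown in Python for illustration:

```python
import numpy as np

def best_match(test_vec, train_mat):
    """train_mat is an (M, n) matrix of registered feature vectors.
    Returns the index of the closest training vector and its Euclidean distance."""
    dists = np.linalg.norm(train_mat - test_vec, axis=1)
    idx = int(np.argmin(dists))
    return idx, float(dists[idx])
```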

VI. VOTING METHOD

Voting, as in elections, is a classic decision-making method. Political scientists have studied several voting schemes, one of which is known as approval voting [14]. In approval voting, each voter may select any number of candidates, and the single winner is the most approved candidate. The purpose of using this voting algorithm here is to increase the performance of the overall system.

Voting is applied over the three techniques to select the candidate training image common to all of them. When no candidate is common to all three techniques, the vote is decided by agreement between any two of the methods.
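A sketch of the approval-style vote over the per-method candidate lists; the behaviour when fewer than two methods agree is not specified in the paper and is an assumption here:

```python
from collections import Counter

def vote(wavelet_cands, geometric_cands, hog_cands):
    """Each argument is the list of candidate identities accepted by one method.
    The identity approved by the most methods wins; if no identity is approved
    by at least two methods, no decision is returned (an assumption)."""
    ballots = Counter()
    for cands in (wavelet_cands, geometric_cands, hog_cands):
        for ident in set(cands):          # each method approves a candidate at most once
            ballots[ident] += 1
    ident, votes = ballots.most_common(1)[0]
    return ident if votes >= 2 else None
```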

VII. RESULTS AND DISCUSSION

The experiments were carried out on the database of the West Pomeranian University of Technology (WPUT-DB) [15]. It is a large database of 501 subjects with between 4 and 8 images per subject. We selected a subset of 800 images for 100 subjects, each with 4 images of the left ear and 4 images of the right ear. For each subject, 6 images were used for training (3 left and 3 right) and 2 images for testing (one left and one right). Only images with no occlusion, or with simple occlusions that affect the background but not the ear shape, were used, as in figure 11(a).

In addition, a small database of 20 images of 10 subjects was collected for testing, with 10 left and 10 right ear images, as in figure 11(b). All database images suffer from different lighting conditions and pose variations.

(a) Samples of the selected database from WPUT-DB

(b) Samples of the collected database

Figure 11 Samples of used ear images database

The total accuracy of each method and of the overall system was obtained from four counts: Right Acceptance (RA), False Acceptance (FA), Right Rejection (RR), and False Rejection (FR). The recognition rate is the ratio of correct decisions (RA + RR) to the total number of decisions (RA + FA + RR + FR), expressed as a percentage.
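The same computation as a small helper, assuming the standard correct-decisions-over-all-decisions definition:

```python
def recognition_rate(RA, FA, RR, FR):
    """Overall accuracy in percent from the four decision counts."""
    return 100.0 * (RA + RR) / (RA + FA + RR + FR)
```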


The experiments with the individual methods showed the following. 1) Haar wavelet matching:

The Euclidean distance is calculated between every test image and all training images, and the training image with the minimum distance is taken as the best match.

2) For Geometric method matching:

After calculating the Euclidean distances between testing and training images, distances below a selected threshold are considered a match. The threshold is chosen to achieve the best accuracy, as in figure 12; the best accuracy is obtained at a threshold of 19.6. A threshold of 20.1 also achieves the maximum accuracy, but with a larger number of images accepted as correct matches.

Figure 12 Geometric method threshold values vs. accuracies

3) Histogram of oriented gradients matching: the distances between each testing image and the training images are stored, and a match is accepted according to a selected threshold value, as in figure 13. The threshold that achieves the best accuracy is 8.75.

Figure 13 HOG method threshold values vs. accuracies

The accuracies obtained by each method individually and the accuracy of voting among the techniques are shown in table 1.

Table 1 A comparison between the accuracies of the used techniques

The technique                               Accuracy
Haar wavelet transform                      97.73%
Geometrical features based technique        95.6%
Histogram of oriented gradient method       98.64%
Proposed voting algorithm                   99.6%

VIII. CONCLUSION

In this paper, the application of the proposed approach was discussed. Feature extraction and recognition were tested on a database of 820 images. The study was carried out by choosing different feature extraction techniques and different threshold values for selecting matches in the three proposed techniques, aiming to reach the optimum accuracy for each. A voting algorithm applied across these techniques improved on the accuracy obtained by any technique individually. The proposed approach largely overcomes the effects of illumination variation, ear and head rotation in the image, and a moderate degree of blur.

IX. REFERENCES

[1] D. J. Hurley, B. Arbab-Zavar, and M. S. Nixon, "The Ear as a Biometric," in Proc. 15th IEEE European Signal Processing Conf., pp. 25-29, 2007.

[2] A. Iannarelli, Ear Identification. Paramount Publishing Company, Fremont, California, 1989.

[3] M. Rahman, R. Islam, N. I. Bhuiyan, B. Ahmed, and A. Islam, "Person Identification Using Ear Biometrics," International Journal of The Computer, the Internet and Management, vol. 15, no. 2, pp. 1-8, 2007.

[4] M. Burge and W. Burger, "Ear biometrics," in Biometrics: Personal ID in Networked Society, A. Jain, R. Bolle, and S. Pankanti, Eds. Kluwer, pp. 273-286, 1998.

[5] B. Moreno, A. Sanchez, and J. F. Velez, "On the Use of Outer Ear Images for Personal Identification," in Proc. IEEE International Carnahan Conf. on Security Technology, pp. 469-476, 1999.

[6] D. J. Hurley, M. S. Nixon, and J. N. Carter, “Force Field Feature Extraction for Ear Biometrics,” Computer Vision and Image Understanding, vol. 98, pp. 491-512, June 2005.

[7] P. Yan and K. W. Bowyer, “Empirical Evaluation of Advanced Ear Biometrics,” Proc. IEEE Conf. Computer Vision and Pattern Recognition Workshop Empirical Evaluation Methods in Computer Vision, pp. 41-48, 2005.

[8] M. Choras, "Ear Biometrics Based on Geometric Feature Extraction," Electronic Letters on Computer Vision and Image Analysis, vol. 5, no. 3, pp. 84-95, 2005.

[9] F. Abate, M. Nappi, D. Riccio, and S. Ricciardi, "Ear Recognition by Means of a Rotation Invariant Descriptor," in Proc. 18th International Conference on Pattern Recognition (ICPR), vol. 4, pp. 437-440, 2006.



[10] R. Garg, B. Mittal, and S. Garg, "Histogram Equalization Techniques for Image Enhancement," International Journal of Electronics & Communication Technology (IJECT), vol. 2, pp. 107-111, March 2011.

[11] D. V. Jadhav and R. S. Holambe, "Feature extraction using Radon and wavelet transforms with application to face recognition," Neurocomputing, vol. 72, pp. 1951-1959, 2009.

[12] M. Choras, "Ear biometrics based on geometrical feature extraction," Electronic Letters on Computer Vision and Image Analysis, 2005.

[13] N. Dalal and B. Triggs, "Histograms of Oriented Gradients for Human Detection," in IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2005.

[14] S. J. Brams and P. C. Fishburn, Approval Voting, Springer-Verlag, ISBN 978-0-387-49895-9, 2007.

[15] The West Pomeranian University of Technology ear database (WPUT-DB), available at: http://ksm.wi.zut.edu.pl/wputedb/
