
Figure 1. Block diagram of edge-based feature vector generation process (Image → Edge detection → Feature extraction → Feature vector)

Improving Edge-Based Feature Extraction Using Feature Fusion

Shahan Nercessian, Karen Panetta
Department of Electrical and Computer Engineering, Tufts University, Medford, MA, USA
[email protected], [email protected]

Sos Agaian
Department of Electrical and Computer Engineering, University of Texas at San Antonio, San Antonio, TX, USA
[email protected]

Abstract—Feature extraction is arguably the most important stage of an automatic object detection system. It is in this stage where the results of previous processing steps are interpreted to somehow characterize an object. Developing methods for feature extraction and feature vector generation using information from edge maps is a natural progression, as edge detection determines structure in images. A new edge-based feature extraction scheme is introduced based on the feature fusion of two existing methods. A generalized set of kernels for edge detection is also presented. The experimental results show that the detection of different objects of interest is improved using the new method.

Keywords—feature extraction, edge-based feature vectors, object detection

I. INTRODUCTION

Automatic object detection systems have grown increasingly popular in the field of computer vision. One of the main purposes for automatic object detection systems is to detect anomalies in images. Such applications include the detection of brain tumors and lesions in CT scan images [1], the detection of defects in materials [2, 3], and the detection of potential threat weapons in x-ray luggage scan images [4, 5]. Object detection is also a pivotal step towards object recognition. This is to say that once a certain object type is detected in an image, it is possible to recognize the specific object instance through some means.

Generally, features are extracted from an object image and compiled into a feature vector, a set of numbers which somehow characterizes the object image. Naturally, edge-based feature extraction and feature vector generation methods [6, 7, 8] have been considered because edges show low level structure in images. Feature extraction methods based on object structure will most likely use edge detection at some point. Two existing edge-based feature vector generation methods are the Cell Edge Distribution (CED) and the Projected Principal-Edge Distribution (PPED) feature vector generation methods, which were introduced for both medical [6] and facial detection [9] purposes. The two methods extract different yet important information. A simple concatenation of the feature vectors generated by the two methods has been suggested, which can be generalized further as a weighted concatenation. We show by experiment that this sort of concatenation does not improve detection results when using a Support Vector Machine (SVM) classifier, and therefore cannot be taken as a robust means of feature fusion.

In this paper, we present a new feature extraction method based on the fusion of the CED and PPED feature extraction methods. A generalized set of kernels for edge detection is also presented. The new feature extraction method was developed by generalizing the process employed by the two methods, determining the nature of the features extracted by each method, and then determining a new feature extraction method which utilizes the qualities of each of the existing methods. The robustness of the new feature vector generation method is assessed through detection tests. Namely, a feature vector of an object image is generated, and the feature vector is classified as either corresponding to a specific object of interest or not corresponding to the object of interest using an SVM classifier. Due to interest in the use of x-ray luggage scan images for aviation security, the two objects of interest on which detection tests were performed for this paper are handguns and bottles. The experimental results show that the new feature extraction method outperforms CED, PPED, and feature fusion involving the weighted concatenation of CED and PPED feature vectors.

The remainder of this paper is organized as follows. Section II provides background information regarding edge detection, edge-based feature extraction, feature vector generation, and SVM classification. Section III describes the new feature vector generation method, including the new generalized set of kernels for edge detection and the new edge-based feature extraction method. Section IV compares object detection results using the new edge-based feature extraction method with those using currently known methods.

II. BACKGROUND INFORMATION

Edge-based feature vector generation involves detecting the edges in an input image and then extracting features from the resultant edge map(s). Figure 1 shows a block diagram outlining the edge-based feature vector generation process.


A. Edge Detection

In edge detection methods such as Sobel, two masks, Gx and Gy, are used to approximate the gradient of an input image I to determine edge points. While the standard Sobel operator uses 3x3 masks, it is of interest to consider using larger kernels to approximate the gradient, such as 5x5 masks. This is because using larger kernels generally corresponds to finding coarser scale edges [10], and in object detection, one is more concerned with finding general structure rather than detail. The 5x5 Sobel masks are given as

$$G_x = \begin{bmatrix} -1 & -2 & 0 & 2 & 1 \\ -4 & -8 & 0 & 8 & 4 \\ -6 & -12 & 0 & 12 & 6 \\ -4 & -8 & 0 & 8 & 4 \\ -1 & -2 & 0 & 2 & 1 \end{bmatrix}, \qquad
G_y = \begin{bmatrix} 1 & 4 & 6 & 4 & 1 \\ 2 & 8 & 12 & 8 & 2 \\ 0 & 0 & 0 & 0 & 0 \\ -2 & -8 & -12 & -8 & -2 \\ -1 & -4 & -6 & -4 & -1 \end{bmatrix},$$

and the magnitude and orientation of the gradient are given as

$$|G| = \sqrt{(I * G_x)^2 + (I * G_y)^2} \qquad (1)$$

$$\theta = \tan^{-1}\!\left(\frac{I * G_y}{I * G_x}\right) \qquad (2)$$
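As a concrete illustration, the following minimal Python sketch applies the 5x5 Sobel masks above and evaluates (1) and (2); it assumes NumPy and SciPy are available, and the function and variable names are ours, not from the paper.

```python
import numpy as np
from scipy.signal import convolve2d

# 5x5 Sobel masks, i.e., the generalized kernels of Section III with a=4, b=6, c=2.
GX = np.array([[-1,  -2, 0,  2, 1],
               [-4,  -8, 0,  8, 4],
               [-6, -12, 0, 12, 6],
               [-4,  -8, 0,  8, 4],
               [-1,  -2, 0,  2, 1]], dtype=float)
GY = np.rot90(GX)  # Gy is Gx rotated by 90 degrees

def gradient(image):
    """Return gradient magnitude |G| (Eq. 1) and orientation theta (Eq. 2)."""
    gx = convolve2d(image, GX, mode="same", boundary="symm")
    gy = convolve2d(image, GY, mode="same", boundary="symm")
    magnitude = np.sqrt(gx ** 2 + gy ** 2)
    theta = np.arctan2(gy, gx)  # numerically robust form of tan^-1((I*Gy)/(I*Gx))
    return magnitude, theta
```

Edge points can then be obtained by thresholding the magnitude; the threshold choice is application dependent and is not fixed by the text above.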

The use of only one of the masks results in edges oriented in a certain direction. Diagonal edges can be detected by rotating the masks by 45 degrees. These masks, Gx45 and Gy45, are given as

$$G_{x45} = \begin{bmatrix} 0 & 2 & 1 & 4 & 6 \\ -2 & 0 & 8 & 12 & 4 \\ -1 & -8 & 0 & 8 & 1 \\ -4 & -12 & -8 & 0 & 2 \\ -6 & -4 & -1 & -2 & 0 \end{bmatrix}, \qquad
G_{y45} = \begin{bmatrix} 6 & 4 & 1 & 2 & 0 \\ 4 & 12 & 8 & 0 & -2 \\ 1 & 8 & 0 & -8 & -1 \\ 2 & 0 & -8 & -12 & -4 \\ 0 & -2 & -1 & -4 & -6 \end{bmatrix}.$$

B. Edge-Based Feature Extraction and Vector Generation

Two known methods for edge-based feature extraction that have been mentioned are CED and PPED feature extraction. The result of either feature extraction/vector generation method is a 64-element feature vector. Both methods work on a 64x64 input window and begin by first determining edges oriented at 0, 90, 45, and -45 degrees. In CED feature extraction, each 64x64 edge map is divided into 16 cells of size 16x16, and the number of edge pixels in each of these cells is counted. In PPED feature extraction, edge pixels are counted along the orientation of a given edge map. For example, edge pixels in a horizontal edge map are counted by tallying the number of edge pixels in every 4 rows of the edge map, and edge pixels in a vertical edge map are counted by tallying the number of edge pixels in every 4 columns of the edge map. In either method, the 16 edge pixel tallies from each of the 4 edge maps are concatenated to form a 64-element feature vector that is then normalized. Figure 2 illustrates the generation of CED and PPED feature vectors.
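To make the two counting schemes concrete, here is a minimal Python sketch. The names, the normalization choice, and the diagonal banding for PPED are our assumptions; in particular, the grouping of diagonals is one plausible reading of "counting along the orientation" and is not spelled out above.

```python
import numpy as np

def ced_vector(edge_maps):
    """CED: divide each 64x64 edge map into 16 cells of 16x16 and count edge pixels."""
    counts = [em[i:i + 16, j:j + 16].sum()
              for em in edge_maps            # 4 directional binary edge maps
              for i in range(0, 64, 16)
              for j in range(0, 64, 16)]
    v = np.asarray(counts, dtype=float)
    return v / (np.linalg.norm(v) + 1e-12)   # normalization method is assumed

def pped_vector(h, v, p45, m45):
    """PPED: 16 tallies per map, counted along each map's own orientation."""
    feats = [h[r:r + 4, :].sum() for r in range(0, 64, 4)]     # 4-row bands
    feats += [v[:, c:c + 4].sum() for c in range(0, 64, 4)]    # 4-column bands
    for em, flip in ((p45, True), (m45, False)):               # diagonal bands
        d = np.fliplr(em) if flip else em
        sums = np.array([np.trace(d, offset=k) for k in range(-63, 64)])
        feats += [chunk.sum() for chunk in np.array_split(sums, 16)]
    f = np.asarray(feats, dtype=float)
    return f / (np.linalg.norm(f) + 1e-12)
```

Each function returns a 64-element vector (4 maps x 16 tallies for CED; 16 tallies per orientation for PPED), matching the description above.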

The Mojette transform is a discrete version of the Radon transform which gives projections of an image along different orientations [11]. A projection oriented at an angle $\theta_i$ for an image f is given as

$$proj_{p_i, q_i}(b) = \sum_{k}\sum_{l} f(k,l)\,\Delta(b - l p_i + k q_i) \qquad (3)$$

where $\theta_i$ is related to $p_i$ and $q_i$ by $\theta_i = \tan^{-1}(q_i / p_i)$, and $\Delta(b)$ is the Kronecker delta function, which is 1 for b = 0 and 0 otherwise. A projection sums the values of the pixels which cross the line $b = l p_i - k q_i$ for every combination of k and l. The Mojette transform is the set of projections for I predetermined projection angles. PPED feature extraction is essentially the Mojette transform with $(p_i, q_i) = (1,0)$ for a horizontal edge map, $(p_i, q_i) = (0,1)$ for a vertical edge map, $(p_i, q_i) = (1,1)$ for a +45 degree edge map, and $(p_i, q_i) = (1,-1)$ for a -45 degree edge map, except that PPED feature extraction counts the number of edge pixels in every four rows along the orientation direction of an edge map rather than every row.
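A direct transcription of (3) in Python might look as follows; this is a sketch of the definition only, and the variable names are ours.

```python
import numpy as np

def mojette_projection(f, p, q):
    """One Mojette projection: bin b collects f(k, l) for every pixel on the
    line b = l*p - k*q, per Eq. (3)."""
    bins = {}
    for k in range(f.shape[0]):
        for l in range(f.shape[1]):
            b = l * p - k * q
            bins[b] = bins.get(b, 0) + f[k, l]
    return bins

# (p, q) = (1, 0) reproduces row-wise counting on a horizontal edge map;
# summing each group of 4 consecutive bins then gives the PPED tallies.
edge_map = (np.random.rand(64, 64) > 0.9).astype(int)
proj = mojette_projection(edge_map, 1, 0)
```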

As long as the input window is square, feature extraction does not need to be limited to the 64x64 window. In fact, neglecting the edge pixel width to image area ratio of images at varying sizes, an object at various sizes should generate approximately equal feature vectors after normalization.

Figure 2. Generation of (a) CED and (b) PPED feature vectors


Note that in the most general case, an MxM edge map would be divided into 16 cells of size M/4 x M/4 using CED feature extraction, or into 16 divisions of width M/16 along the orientation direction of the edge map using PPED feature extraction.

An analysis of the two feature extraction methods shows that CED feature extraction is generally more robust than PPED feature extraction. However, PPED feature vectors contain very useful information, especially when using smaller windows. With that said, the minimum window size on which CED can be performed is 4x4 and the minimum window size on which PPED can be performed is 16x16, simply due to the nature of the divisions. The concatenation of CED and PPED feature vectors, CED⊕PPED, has been a suggested feature vector which attempts to fuse the information found by using the CED and PPED feature extraction methods. This can be generalized by the weighted concatenation of CED and PPED feature vectors, given as CED⊕αPPED, where α is a weighting factor [12, 13]. Though this may improve detection results when using a distance classifier, it will be shown that CED⊕αPPED does not improve detection results when using an SVM classifier.
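The weighted concatenation itself is a one-liner; a minimal sketch (the function name and default are illustrative):

```python
import numpy as np

def ced_pped(ced, pped, alpha=1.0):
    """CED (+) alpha*PPED: concatenate the CED vector with the PPED vector
    scaled by the weighting factor alpha (alpha = 1 is plain concatenation)."""
    return np.concatenate([np.asarray(ced), alpha * np.asarray(pped)])
```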

C. SVM

SVM has been developed to offer methods for classification that perform better than simple distance measures such as the Euclidean or Manhattan distances. SVM maps input vectors to a higher dimension feature space. For the two class problem, an input vector is labeled as either corresponding to the object of interest or not corresponding to the object of interest. The SVM separates the two classes by the maximal separating hyperplane in the feature space, which should ideally best generalize the classifier. After the initial setup of the classification database, new input vectors are labeled as either corresponding to the object of interest or not corresponding to the object of interest based on the class separation. Apart from the initial setup of the classification database, an SVM classifier will also perform faster than a distance classifier. This is because the classification time does not depend on the number of template feature vectors in the database. Classification is determined simply by mapping the new input vector into the feature space and determining what class it is in, rather than determining the distances between the input feature vector and the individual template feature vectors. The multi-class problem can be implemented as a series of two class problems [14]. Therefore, rotational invariance can be achieved by the mentioned feature extraction methods by creating separate classes of object images at different orientations.
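A minimal two-class SVM setup in Python, assuming scikit-learn (the paper does not name its SVM implementation or kernel, and placeholder data stands in for the template database):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
object_templates = rng.random((40, 64))       # placeholder 64-element vectors
non_object_templates = rng.random((60, 64))

X = np.vstack([object_templates, non_object_templates])
y = np.array([1] * 40 + [0] * 60)             # 1 = object of interest

clf = SVC(kernel="rbf")                       # kernel choice is an assumption
clf.fit(X, y)

# Classification time depends on the learned support vectors, not on the
# number of templates in the database.
query = rng.random((1, 64))
label = clf.predict(query)
```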

III. A NEW METHOD FOR FEATURE VECTOR GENERATION

The 5x5 masks used for edge detection are generalized as

$$G_x = \begin{bmatrix} -1 & -c & 0 & c & 1 \\ -a & -ca & 0 & ca & a \\ -b & -cb & 0 & cb & b \\ -a & -ca & 0 & ca & a \\ -1 & -c & 0 & c & 1 \end{bmatrix}, \qquad
G_y = \begin{bmatrix} 1 & a & b & a & 1 \\ c & ca & cb & ca & c \\ 0 & 0 & 0 & 0 & 0 \\ -c & -ca & -cb & -ca & -c \\ -1 & -a & -b & -a & -1 \end{bmatrix},$$

$$G_{x45} = \begin{bmatrix} 0 & c & 1 & a & b \\ -c & 0 & ca & cb & a \\ -1 & -ca & 0 & ca & 1 \\ -a & -cb & -ca & 0 & c \\ -b & -a & -1 & -c & 0 \end{bmatrix}, \qquad
G_{y45} = \begin{bmatrix} b & a & 1 & c & 0 \\ a & cb & ca & 0 & -c \\ 1 & ca & 0 & -ca & -1 \\ c & 0 & -ca & -cb & -a \\ 0 & -c & -1 & -a & -b \end{bmatrix}.$$
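The four kernels can be generated programmatically from (a, b, c). The following sketch is our construction, exploiting the outer-product and 90-degree rotation structure visible above; it reproduces the 5x5 Sobel masks at a=4, b=6, c=2.

```python
import numpy as np

def generalized_kernels(a, b, c):
    """Return (Gx, Gy, Gx45, Gy45) for parameters a, b, c."""
    w = np.array([1, a, b, a, 1], dtype=float)
    gx = np.outer(w, [-1, -c, 0, c, 1])        # each row scales [-1, -c, 0, c, 1]
    gy = np.rot90(gx)                          # Gy is Gx rotated 90 degrees
    gx45 = np.array([[ 0,    c,    1,    a,  b],
                     [-c,    0,  c*a,  c*b,  a],
                     [-1, -c*a,    0,  c*a,  1],
                     [-a, -c*b, -c*a,    0,  c],
                     [-b,   -a,   -1,   -c,  0]], dtype=float)
    gy45 = np.rot90(gx45)                      # the 45-degree pair is related the same way
    return gx, gy, gx45, gy45

gx, gy, gx45, gy45 = generalized_kernels(4, 6, 2)   # 5x5 Sobel masks
```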

The parameters a, b, and c for each edge map can be selected using a reconstruction-based edge map evaluation method [15, 16]. The original input image is multiplied by the edge map, and the result is reconstructed by linear interpolation. Bovik's SSIM [17] is given as

$$SSIM(x,y) = \frac{(2\mu_x \mu_y)(2\sigma_{xy})}{(\mu_x^2 + \mu_y^2)(\sigma_x^2 + \sigma_y^2)} \qquad (4)$$

where x and y are the two images to be compared, $(\mu_x, \mu_y)$ are the means of (x, y), $(\sigma_x, \sigma_y)$ are the standard deviations of (x, y), and $\sigma_{xy}$ is the covariance of x and y. It can be used to measure the structural similarity between the original and reconstructed image, and this in turn can be used as an objective edge evaluation measure. The parameter set whose edge map yields the maximum SSIM can be taken to be the best parameter set by the objective evaluation method. Note that selecting a = 4, b = 6, and c = 2 yields the 5x5 Sobel kernels.
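A sketch of the selection loop follows. The reconstruction step is abstracted behind hypothetical helpers detect_edges and reconstruct, standing in for the method of [15, 16], and Eq. (4) is implemented without stabilizing constants, as written.

```python
import numpy as np
from itertools import product

def ssim(x, y):
    """Eq. (4): constant-free SSIM between images x and y."""
    mx, my = x.mean(), y.mean()
    sxy = ((x - mx) * (y - my)).mean()
    return (2 * mx * my) * (2 * sxy) / ((mx**2 + my**2) * (x.var() + y.var()))

def select_abc(image, detect_edges, reconstruct, grid=(1, 2, 4, 6, 8)):
    """Pick the (a, b, c) whose edge map maximizes SSIM after reconstruction."""
    best, best_score = None, -np.inf
    for a, b, c in product(grid, repeat=3):
        edge_map = detect_edges(image, a, b, c)             # hypothetical helper
        score = ssim(image, reconstruct(image * edge_map))  # hypothetical helper
        if score > best_score:
            best, best_score = (a, b, c), score
    return best
```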

It is noted that the only difference between the CED and PPED methods is the shape and arrangement of the divisions in which edge pixels are counted. In developing a new shape and arrangement of divisions, it is desirable to fuse the CED and PPED methods in such a way that the useful features of each method are utilized. Figure 3 shows the shape and arrangement of the divisions in which edge pixels are counted for the presented feature extraction. As is the case with CED and PPED, the number of divisions per edge map is set to 16, and the horizontal, vertical, +45, and -45 degree edge maps are used. With that said, the shape and arrangement of the divisions were not haphazardly chosen, and the reason why the new arrangement is more robust than CED, PPED, and CED⊕αPPED is explainable.

Figure 3. Shape and arrangement of divisions using presented method


The main assessment of robustness will be defined by the detection/classification results using an SVM classifier.

The rationale for the presented shape and arrangement of divisions is easily observed. First note that square cells similar to those used in CED feature extraction are used on the majority of the window. For an MxM edge map, 12 cells of size M/4 x M/4 are used around its border. The M/2 x M/2 center of an edge map is divided into 4 divisions along its edge direction. This is similar to the Mojette transform and PPED feature extraction, except that the number of divisions in the center of the edge map was set to make the total number of divisions per edge map equal to 16. In this manner, the feature extraction method incorporates the qualities of CED and PPED simultaneously. CED feature extraction is performed on the majority of the edge map since it is generally a more robust method, and a modified PPED feature extraction is performed on a smaller window located in the center of the edge map where it will work more effectively.
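For one MxM edge map, the presented division scheme can be sketched as follows in Python. The orientation-aligned banding of the center is shown for the horizontal and vertical maps; the band widths and the diagonal handling are our reading of the M/2 x M/2 center split into 4 divisions.

```python
import numpy as np

def fused_counts(edge_map, orientation):
    """16 tallies for one MxM edge map: 12 border cells of M/4 x M/4, plus
    4 orientation-aligned bands over the central M/2 x M/2 region."""
    M = edge_map.shape[0]
    q = M // 4
    counts = [edge_map[i*q:(i+1)*q, j*q:(j+1)*q].sum()
              for i in range(4) for j in range(4)
              if i in (0, 3) or j in (0, 3)]           # 12 border cells
    center = edge_map[q:3*q, q:3*q]                    # M/2 x M/2 center
    w = center.shape[0] // 4                           # 4 bands of width M/8
    if orientation == "horizontal":
        counts += [center[r:r+w, :].sum() for r in range(0, 4*w, w)]
    else:  # "vertical"; diagonal maps would band along their diagonals analogously
        counts += [center[:, c:c+w].sum() for c in range(0, 4*w, w)]
    return np.asarray(counts, dtype=float)             # 16 counts per map
```

Concatenating the 16 counts from each of the 4 edge maps and normalizing yields the 64-element fused feature vector, mirroring the CED/PPED construction.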

Figure 4 shows examples of the feature vectors that are generated for two different handgun images. It is seen that between the handgun images, the feature vectors generated using the new feature extraction method are agreeable even though there are some minor differences in their structure. Therefore, a database of template handgun feature vectors can generalize the structural features of a handgun.

IV. EXPERIMENTAL RESULTS

Figure 5 shows a block diagram of the system used to assess feature extraction performance. An image database consisting of images of the object of interest and images of objects that are not the object of interest has been compiled. Feature vectors for each image in the database are generated using the new method for feature extraction. These feature vectors are used as templates for the SVM classifier. Each template feature vector is appropriately labeled as either being a member of the object of interest class or not being a member of the object of interest class. This information is used to set up the SVM classifier. Figure 6 shows sample images of object of interest images and non-object of interest images used to set up the database for handgun and bottle detection. The choice of non-object of interest images is crucial for effective SVM classification. Non-object of interest images were chosen in a way to best generalize the non-object of interest class. Non-object of interest images range from having sparse to dense edge maps and encompass a broad variety of structure.
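Putting the pieces together, the evaluation protocol can be sketched as below (scikit-learn again assumed; extract is any of the feature vector generators above, and the image lists are placeholders).

```python
import numpy as np
from sklearn.svm import SVC

def detection_rate(train_pos, train_neg, test_pos, extract):
    """Set up the SVM from labeled template vectors, then report the
    fraction of held-out object-of-interest images that are detected."""
    X = np.array([extract(im) for im in train_pos + train_neg])
    y = np.array([1] * len(train_pos) + [0] * len(train_neg))
    clf = SVC(kernel="rbf").fit(X, y)      # kernel choice is an assumption
    preds = clf.predict(np.array([extract(im) for im in test_pos]))
    return preds.mean()
```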

Figure 4. Feature vectors of two handgun images using the presented feature extraction method

Figure 5. Block diagram of the detection system

Figure 6. Examples of object of interest images for (a) handgun and (b) bottle detection and (c) examples of non-object of interest images used to set up the SVM classifier


Images that were not used to set up the SVM classifier are used as inputs to then test the detection system. A feature vector is generated for each input image using the new method for feature extraction. Based on the class separation, the SVM classifier classifies each feature vector as either corresponding to the object of interest or not corresponding to the object of interest. Similar tests are used to assess the performance of CED, PPED, and CED⊕αPPED feature vectors, setting up the SVM classifier accordingly. The weighting factor α in CED⊕αPPED is varied to range from heavy CED-weighting to heavy PPED-weighting. Figure 7 summarizes the detection results for handgun and bottle detection using the different feature extraction methods.

The detection results show that feature vectors generated using the new feature extraction method outperform CED, PPED, and CED⊕αPPED feature vectors. In this particular handgun detection test, the new feature extraction method yielded a 100% detection rate. The results of CED⊕αPPED ranged from the PPED detection rate of 92.1% to the CED detection rate of 97.4%. However, in bottle detection, CED⊕αPPED detection rates ranged from 53.3% to 80.0% as α was varied, while PPED yielded a 73.3% detection rate and CED yielded an 80.0% detection rate. Therefore, there are values of α that will actually yield worse detection rates than either CED or PPED feature extraction, and the detection rate using CED⊕αPPED feature vectors will never exceed the detection rate of the better of the two methods individually when using an SVM classifier. Note that the new feature extraction method outperformed the other methods in bottle detection as well, yielding a detection rate of 86.7% in this particular experiment.

V. CONCLUSIONS

A new method for edge-based feature extraction has been presented based on the analysis of two existing edge-based feature extraction methods. A generalized set of kernels for edge detection was also presented, and methods to achieve scale invariance and rotational invariance in detection were suggested. The presented feature vector generation method incorporates the desirable qualities of the two existing methods. This is reflected by the detection results for different objects of interest. The detection results using handgun and bottle images show that feature vectors generated using the new feature extraction method outperform CED, PPED, and CED⊕αPPED feature vectors. According to these experimental results, it can also be concluded that using CED⊕αPPED feature vectors is not a good means of feature fusion when using an SVM classifier. This is particularly important due to the benefits provided by an SVM classifier compared to a standard distance classifier. These benefits include improved classification and faster classification time. Future research in feature extraction will include considering new shapes and arrangements of divisions for edge pixel counting, as well as considering application specific requirements and the effects they impose on feature extraction and classification schemes to improve automatic detection results.

REFERENCES

[1] L. Yin, S. Deshpande, and J. K. Chang, "Automatic lesion/tumor detection using intelligent mesh-based active contour," IEEE Inter. Conf. on Tools with Artificial Intell., pp. 390-397, 2003.

[2] R. Marik, M. Petrou, and J. Kittler, “Bump and depression detection on ceramic tiles,” IEE Colloq. on Ind. Ins., London, Vol. 5, pp. 1-3, 1997.

[3] A. Kumar and G. Pang, “Defect detection in textured materials using Gabor filters,” IEEE Ind. App. Conf., Vol. 2, pp. 1041-1047, 2000.

[4] M. Singh and S. Singh, "Image segmentation optimisation for x-ray images of airline luggage," IEEE Comp. Intell. for Home. Secur. and Pers. Safety, Orlando, pp. 10-17, 2004.

[5] S. Nercessian, K. Panetta, and S. Agaian, "Automatic detection of potential threat objects in x-ray luggage scan images," IEEE Conf. on Tech. for Home. Secur., 2008, in press.

[6] M. Yagi, M. Adachi, and T. Shibata, "A hardware-friendly soft-computing algorithm for image recognition," Proc. of Euro. Sig. Proc. Conf., Tampere, pp. 729-732, Sept. 4-8, 2000.

[7] X. S. Zhou and T. S. Huang, “Edge-based structural features for content-based image retrieval,” Patt. Recog. Let., 2001.

Figure 7. (a) Handgun and (b) bottle detection rates (%) for the presented method, CED, PPED, and CED⊕αPPED feature vectors


[8] M. Bulacu, L. Schomaker, and L. Vuurpijl, “Writer identification using edge-based directional features,” IEEE Inter. Conf. Doc. Anal. and Recog., Edinburgh, Vol. 2, pp. 937-941, 2003.

[9] Y. Suzuki and T. Shibata, “Multiple-resolution edge-based feature representations for robust face segmentation and verification,” Proc. of Euro. Sig. Proc. Conf., Antalya, Sep. 4-8, 2005.

[10] B. Sumengen and B. S. Manjunath, “Multi-scale edge detection and image segmentation,” Proc. Euro. Sig. Proc. Conf., Sep. 2005.

[11] A. Kingston, S. Colosimo, P. Campisi, F. Autrusseau, “Lossless image compression and selective encryption using a discrete radon transform,” IEEE Inter. Conf. on Image Proc., vol. 4, pp. 465-468, 2007.

[12] H. Cai and S. S. Agaian, "Spatial-frequency feature vector fusion based steganalysis," IEEE Conf. Sys., Man, and Cyb., Taipei, 2006.

[13] H. K. Ekenel and B. Sankur, "Multiresolution face recognition," Image and Vision Computing, pp. 469-477, 2005.

[14] L. Qi and Y. Liu, "Two-stage SVMs for solving multi-class problems," IEEE Conf. on Comm. Sys., vol. 1, pp. 78-83, 2006.

[15] S. Nercessian, K. Panetta, and S. Agaian, "A parametric method for edge detection based on recursive mean separate image decomposition," IEEE Inter. Conf. on Mach. Learn. and Cyb., Kunming, 2008, in press.

[16] S. Carlsson, "Sketch based coding of grey level images," Signal Processing, Vol. 15, No. 1, pp. 57-83, 1988.

[17] Z. Wang, E. P. Simoncelli and A. C. Bovik, “Multi-scale structural similarity for image quality assessment,” IEEE Asilomar Conf. Signals, Systems and Computers, 2003.
