
A SHAPE CONTOUR DESCRIPTOR BASED ON SALIENCE POINTS

Glauco Vitor Pedrosa, Celia A. Zorzo Barcelos

Federal University of Uberlândia (UFU), Faculty of Computer Science; Faculty of Mathematics

CEP: 38400-100 - Uberlândia, MG, Brazil. Email: [email protected]; [email protected]

Marcos Aurélio Batista

Federal University of Goiás (UFG), Department of Computer Science

Catalão, GO, Brazil. Email: [email protected]

ABSTRACT

In this paper, we propose a shape-based image retrieval technique that uses salience points to describe shapes. The technique consists of a salience point detector that is robust to noise, a salience representation based on angular relative position and curvature value that is invariant to rotation, translation and scaling, and an elastic matching algorithm to analyze the similarity. The proposed technique is robust to noise and performs well when dealing with shapes that belong to different classes but are visually similar. Experiments were carried out to illustrate the performance of the proposed technique, and the results show that it compares favorably with other shape-based methods in the literature.

Index Terms— image retrieval, shape description, salience points, contour representation, matching algorithm

1. INTRODUCTION

Many areas of human knowledge make use of images and, as a consequence, there is a growing interest in finding images in large databases. In order to find an image, it has to be described by its low-level features, such as color, shape and texture. In pattern recognition and related areas, shape is one of the most widely used image features exploited in content-based image retrieval systems.

Many shape representation and description techniques have been developed. Some descriptors use the information extracted from the shape contour [3, 6, 7], while others use the information from the whole shape [5, 8]. Shape representation using the Fourier descriptor [7] is a simple and robust contour-based shape descriptor, and Moments [5] is an effective and compact region-based shape descriptor. More existing shape representation techniques have been reviewed in [4].

In general, a shape descriptor is a set of values that attempts to quantify the shape features in ways that agree with human intuition. Efficient shape descriptors must present some essential properties [4], such as: invariance to translation, rotation and scaling, robustness to noise, and compact representation.

Fig. 1: The salience points of some shapes.

In this work, we are interested in describing shapes using salience points. Saliences are defined as the points of high curvature along the shape contour. These points are very useful for shape description, because they have the ability to represent a shape in a compact manner, invariant to rotation and translation. Figure 1 shows the salience points of some shapes.

Recently, in [1], we proposed a method for detecting shape salience points. The method makes use of an anisotropic filter to smooth noisy points along the shape contour. It performs well for detecting salience points on images with and without noise, and it also has the advantage of computational simplicity.

Building on our previous work [1], here we extend the technique to shape description and recognition. In this paper, we propose to represent the salience points using angular relative position and curvature value, and an elastic matching algorithm to measure the similarity between two shapes characterized by their salience points.

Experiments were made comparing the proposed shape descriptor with several other shape descriptors using the standard MPEG-7 Part B database [2].

2. REPRESENTING THE SALIENCE POINTS

Here, we propose to represent each salience point $s_i$ by a pair $(s_{1,i}, s_{2,i})$, where $s_{1,i}$ represents the angular relative position and $s_{2,i}$ the curvature value.

The curvature value is calculated for each shape contour point. Let the discrete shape contour $S = \{p_1, \ldots, p_l\}$ be represented by its point set in a clockwise direction, where

$p_i = (x_i, y_i)$, $i = 1, \ldots, l$, $l$ is the number of contour points, and $x_i$ and $y_i$ represent the coordinates of point $p_i$ on the image plane. Given a contour point $p_i = (x_i, y_i)$ and its $k$-left and $k$-right neighbours, $p_{i+k} = (x_{i+k}, y_{i+k})$ and $p_{i-k} = (x_{i-k}, y_{i-k})$, the curvature value for the point $p_i$ is calculated as $\Psi(p_i, k) = \dot{x}\ddot{y} - \ddot{x}\dot{y}$, where the derivatives are estimated by finite differences over the $k$-neighbourhood of $p_i$.

The value of the parameter $k$ influences the salience curvature value. In this work, we use the mean of the curvature values for $k = \{l/5, l/10, l/15, l/20, l/25\}$, where $l$ is the number of shape contour points.

Taking the shape centroid $c$ and any other point $c'$ of the shape contour, the angular relative position $s_{1,i}$ is calculated as the angle $\angle\, c'\,c\,s_i$.
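For illustration only, the following Python sketch computes both components of the salience representation. Two points are assumptions rather than details fixed by the text above: the curvature is taken as a finite-difference estimate of $\dot{x}\ddot{y} - \ddot{x}\dot{y}$ over the $k$-neighbourhood, and the reference point $c'$ is taken to be the first contour point; the function names and the `salience_idx` argument are likewise illustrative.

```python
import numpy as np

def curvature(contour, i, k):
    """Finite-difference estimate of x'y'' - x''y' at contour point i,
    using the k-left and k-right neighbours (assumed reading of Psi(p_i, k))."""
    l = len(contour)
    p_prev, p, p_next = contour[(i - k) % l], contour[i], contour[(i + k) % l]
    dx = (p_next[0] - p_prev[0]) / (2.0 * k)
    dy = (p_next[1] - p_prev[1]) / (2.0 * k)
    ddx = (p_next[0] - 2.0 * p[0] + p_prev[0]) / float(k * k)
    ddy = (p_next[1] - 2.0 * p[1] + p_prev[1]) / float(k * k)
    return dx * ddy - ddx * dy

def salience_representation(contour, salience_idx):
    """Pair (s1, s2) = (angular relative position, mean curvature) per salience point."""
    contour = np.asarray(contour, dtype=float)
    l = len(contour)
    c = contour.mean(axis=0)                           # shape centroid
    ref = contour[0] - c                               # reference direction (c' assumed to be p_1)
    ks = [max(1, l // d) for d in (5, 10, 15, 20, 25)]
    desc = []
    for i in salience_idx:
        v = contour[i] - c
        cross = ref[0] * v[1] - ref[1] * v[0]
        angle = np.arctan2(cross, np.dot(ref, v)) % (2.0 * np.pi)
        curv = float(np.mean([curvature(contour, i, k) for k in ks]))
        desc.append((angle, curv))
    return desc
```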

3. THE MATCHING ALGORITHM

The goal of the matching algorithm is to measure the similarity between two shapes A and B represented by their salience points: $A = (a_1, \ldots, a_n)$ and $B = (b_1, \ldots, b_m)$, where $n$ and $m$ represent the number of salience points of shapes A and B, respectively.

Considering $m \le n$, the idea of the similarity is to find the best correspondence between A and B. Formally, this correspondence is defined by a function $\varphi : \{1, \ldots, n\} \to \{1, \ldots, m\}$ such that $\varphi(i) \le \varphi(i+1)$ for $i = 1, \ldots, n-1$, and $a_i$ is mapped onto $b_{\varphi(i)}$ for $i \in \{1, \ldots, n\}$. The distance is calculated by:

$$D_{A,B}(\varphi) = \sum_{i=1}^{n} \|a_i - b_{\varphi(i)}\| \qquad (1)$$

where $\|a_i - b_{\varphi(i)}\| = |a_{1,i} - b_{1,\varphi(i)}| + |a_{2,i} - b_{2,\varphi(i)}|$.

However, to make the distance invariant to rotation, it should be calculated over all possible shifts between the vectors A and B. That is, we want to minimize:

$$D_{A,B}(\varphi) = \sum_{i=1}^{n} \|(a_i + t_r) - b_{\varphi(i)}\| \qquad (2)$$

where $t_r = (t_{r_k}, 0)$ and $t_{r_k}$ is a shift value. The shift value is defined by the difference between the first point of vector B and each point of vector A, that is, $t_{r_k} = b_{1,1} - a_{1,k}$, $k = 1, \ldots, n$. Formally, the optimal distance is obtained by $D_{A,B}(\varphi^{*})$, where $\varphi^{*} = \arg\min_{t_{r_k}} D_{A,B}(\varphi)$.

To solve the problem of minimizing Eq. 2, we use an algorithm based on dynamic programming. First, we define a differential matrix $r$ of size $n \times m$ using all elements of vectors A and B:

$$r = (r_{i,j}) = \|a_i - b_j\| \qquad (3)$$

where $i = 1, \ldots, n$ and $j = 1, \ldots, m$.

The problem of finding the minimum of Eq. 2 can be seen as a least-value path problem on the differential matrix. To obtain the solution, we treat the matrix $r$ as a directed acyclic graph, where an element $r_{i,j}$ is connected to $r_{k,l}$ if and only if $i \le k \le i+1$ and $j \le l \le j+1$. The least-value path can be obtained by the following recursive function:

$$M(i, j) = \begin{cases} r_{i,j}, & \text{if } i = 1 \text{ and } j = 1;\\ r_{i,j} + M(i, j-1), & \text{if } i = 1 \text{ and } j > 1;\\ r_{i,j} + M(i-1, j), & \text{if } i > 1 \text{ and } j = 1;\\ r_{i,j} + \min\bigl(M(i-1, j),\, M(i-1, j-1),\, M(i, j-1)\bigr), & \text{if } 1 < i \le n \text{ and } 1 < j \le m;\\ \infty, & \text{otherwise.} \end{cases} \qquad (4)$$

The distance $D_{A,B}(\varphi)$ is obtained in $M(n, m)$. This distance should be recalculated over all $n$ possible $t_{r_k}$ shifts, using:

$$r = (r_{i,j}) = \|(a_i + t_r) - b_j\| \qquad (5)$$

where $t_r = (t_{r_k}, 0)$, $i = 1, \ldots, n$, $j = 1, \ldots, m$, and $t_{r_k}$ is the shift value.

The final distance is the minimal distance over all the $n$ possible shifts.
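The following Python sketch puts Eqs. 1-5 together: for each of the $n$ shifts it builds the differential matrix and runs the least-value path recursion, returning the minimum over all shifts. It assumes each shape is given as an array of (angular position, curvature) pairs as produced by the representation sketch above, with `len(B) <= len(A)`; the function name is illustrative.

```python
import numpy as np

def elastic_distance(A, B):
    """Elastic matching distance between two salience-point descriptions (Eqs. 1-5)."""
    A, B = np.asarray(A, float), np.asarray(B, float)
    n, m = len(A), len(B)
    best = np.inf
    for k in range(n):                                   # all n possible shifts t_rk
        shift = np.array([B[0, 0] - A[k, 0], 0.0])       # t_rk = b_{1,1} - a_{1,k}
        # differential matrix r_{i,j} = ||(a_i + t_r) - b_j|| (L1 norm, Eq. 5)
        r = np.abs((A + shift)[:, None, :] - B[None, :, :]).sum(axis=2)
        # least-value path M(i, j) on the directed acyclic graph (Eq. 4)
        M = np.full((n, m), np.inf)
        M[0, 0] = r[0, 0]
        for j in range(1, m):
            M[0, j] = r[0, j] + M[0, j - 1]
        for i in range(1, n):
            M[i, 0] = r[i, 0] + M[i - 1, 0]
            for j in range(1, m):
                M[i, j] = r[i, j] + min(M[i - 1, j], M[i - 1, j - 1], M[i, j - 1])
        best = min(best, M[n - 1, m - 1])                # D_{A,B} for this shift
    return best
```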

As in [9], the similarity distance is normalized by the shape complexity (SC) of each contour. The motivation behind this normalization is the observation that the sensitivity of human perception to contour variations decreases as the shape complexity increases [9]. Here, the shape complexity SC is taken as the absolute difference between the maximum and minimum curvature values over all salience points of the shape:

$$SC = |\max_i(s_{2,i}) - \min_i(s_{2,i})| \qquad (6)$$

Then, the similarity distance between two shapes, A and B, is given by:

$$D^{dist}_{A,B} = \frac{D_{A,B}(\varphi)}{K + SC_A + SC_B} \qquad (7)$$

where $SC_A$ and $SC_B$ are the complexities of shapes A and B, and $K$ is a constant that prevents the denominator from becoming too small when the complexities are very small. In our experiments, K is set to 1.
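A small sketch of the complexity normalization of Eqs. 6-7, reusing the (angle, curvature) pairs from the representation sketch; `K` defaults to 1 as in the experiments, and the function names are illustrative.

```python
def shape_complexity(desc):
    """SC (Eq. 6): absolute range of the curvature values s_{2,i} over all salience points."""
    curv = [s2 for _, s2 in desc]
    return abs(max(curv) - min(curv))

def normalized_distance(d_elastic, desc_A, desc_B, K=1.0):
    """Similarity distance normalized by the shape complexities (Eq. 7)."""
    return d_elastic / (K + shape_complexity(desc_A) + shape_complexity(desc_B))
```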

3.1. Increasing the similarity distance

To discard very dissimilar shapes and to further increase the discrimination power of the descriptor, we use a set of simple geometric features in the shape similarity, as suggested by [9].

In this work, we used three geometric features: aspect ratio (AR), eccentricity (E), and solidity (S). These features capture considerable information about the global properties of a shape.

Therefore, the final similarity distance between shapes A and B is given as:

Fig. 2: Precision recall for the evaluated descriptors.

$$D^{f}_{A,B} = |AR_A - AR_B| + |E_A - E_B| + |S_A - S_B| + D^{dist}_{A,B} \qquad (8)$$

where $AR_A$, $E_A$, and $S_A$ are the aspect ratio, eccentricity, and solidity of shape A (and likewise for shape B).
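A sketch of the final distance of Eq. 8. How the aspect ratio, eccentricity, and solidity are computed is not specified in the text, so the code simply assumes they are supplied as a dictionary per shape; the dictionary keys and function name are hypothetical.

```python
def final_distance(d_norm, feats_A, feats_B):
    """Final similarity distance (Eq. 8): global-feature differences plus the
    normalized elastic distance of Eq. 7. feats_* hold 'AR', 'E' and 'S'."""
    return (abs(feats_A["AR"] - feats_B["AR"])
            + abs(feats_A["E"] - feats_B["E"])
            + abs(feats_A["S"] - feats_B["S"])
            + d_norm)
```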

4. EXPERIMENTAL RESULTS

To evaluate the effectiveness of the proposed descriptor, referred to here as the Salience Descriptor (SD), some experimental tests are reported in this section. The SD was compared with five other shape descriptors from the literature: Beam Angle Statistics [3] (BAS), Multiscale Fractal Dimension [6] (MS Fractal), Moment Invariants [5] (MI), Fourier Descriptor [7] (Fourier) and Tensor Scale Descriptor [8] (TSDIZ). Experiments were conducted using the MPEG-7 CE-shape-1 Part B database [2], which is a widely tested shape database; accordingly, we used the results previously reported in [8] to obtain comparative results. This database consists of 1400 images, categorized into 70 classes (20 images in each class). The comparative retrieval performance was evaluated using Precision versus Recall (PR), where Precision is the ratio of the number of relevant retrieved images to the total number of retrieved images, and Recall is the ratio of the number of relevant retrieved images to the number of relevant images in the database. Each of the 1400 images was used as a query.
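For reference, a small sketch of the precision and recall values used for the PR curves, computed for a single query; the argument names are illustrative.

```python
def precision_recall(retrieved_ids, relevant_ids):
    """Precision = relevant retrieved / retrieved; Recall = relevant retrieved / relevant."""
    hits = sum(1 for r in retrieved_ids if r in relevant_ids)
    precision = hits / len(retrieved_ids) if retrieved_ids else 0.0
    recall = hits / len(relevant_ids) if relevant_ids else 0.0
    return precision, recall
```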

Figure 2 shows the average PR curve for the evaluated descriptors. The SD presented the second best PR curve among the tested descriptors. However, three aspects should be highlighted in favor of the SD: the retrieval of visually similar images that do not belong to the same class as the query image; the dimension of the feature vector; and the capacity to represent non-trivial shapes. In the following, we discuss these three points.

Fig. 3: Examples of shapes that are visually dissimilar/similar.

Fig. 5: Precision recall for the queries of figure 4a.

Shape recognition using the MPEG-7 database is not trivial, since some samples are visually dissimilar from other members of their own class. Furthermore, there are shapes that are highly similar to examples of other classes, as can be seen by comparing the Deer and the Horses of Figure 3. The SD was able to retrieve images that are visually similar to the query image even though they belong to other classes, as shown by the three examples of Figure 4a, where the images inside the box do not belong to the query class. Depending on the application this is a good result; however, since PR counts only the images belonging to the same class as the query image, the performance of the SD is penalized. The PR curves of the three examples of Figure 4a have low precision, as can be seen in Figure 5, and this fact contributed negatively to the SD's average Precision vs. Recall curve.

The SD has a variable feature vector dimension that depends on the shape's features. Table 1 shows the feature vector dimension used by the evaluated descriptors. The MI is the region-based descriptor with the smallest feature vector dimension. The BAS and Fourier are contour-based descriptors whose feature vector dimension is determined by the number of shape contour points. The SD has a feature vector dimension varying between 6 and 180, with a mean of 56 (considering the MPEG-7 database). This is the smallest feature vector among all the contour-based descriptors analyzed, which is very useful in applications with large databases, since less space is required to store the encoded database.

Another positive aspect in favor of the SD is its robustness when dealing with shapes that have complex contours. Figure 4b shows the retrieval results for two different queries, Query A and Query B, using the SD and BAS descriptors; the images inside the box do not belong to the same class as the query image. Analyzing the result of Query A, we can see that the SD describes shapes with a complex contour better than BAS. A shape with a complex contour can be visually similar to another shape with a smooth contour, as can be seen, for example, by comparing Query B with the image at position 5 in the SD retrieval results.

Fig. 4: Retrieval results: (a) using the SD descriptor; (b) using the SD and BAS descriptors.

Descriptor    Vector dimension    Shape-based method
MS Fractal    100                 Contour
MI            14                  Region
Fourier       126                 Contour
TSDIZ         60                  Region
BAS           180                 Contour
SD            3 + 2·Ns*           Contour

Table 1: The feature vector dimension for the evaluated descriptors. *Ns = number of saliences.


5. CONCLUSION

This paper has presented a compact method for shape-based image retrieval using salience points as the shape description. The proposed technique is invariant to translation, rotation, mirroring and scaling, is robust to noise, and allows easy feature extraction.

The presented method uses: (i) a salience point detector robust to noise; (ii) a salience representation by angular relative position and curvature value; and (iii) an elastic similarity measure that allows comparing shapes with different numbers of salience points. Global features (aspect ratio, eccentricity, and solidity) of the shapes are incorporated into the similarity distance to further increase the discrimination ability.

Comparative experiments were carried out against five other shape descriptors from the literature, and the results validate the efficacy of the proposed technique. It presented good retrieval accuracy and was able to capture information that is visually similar to the query image.

Acknowledgment
The authors would like to thank CNPq (Proc. 306726/2009-2 and 551852/2010-0) and CAPES for their financial support.

6. REFERENCES

[1] G. V. Pedrosa and C. A. Z. Barcelos, Anisotropic Diffusion for Effective Shape Corner Point Detection, Pattern Recognition Letters, 31(12): 1658-1664, 2010.

[2] M. Bober, MPEG-7 Visual Shape Descriptors, IEEE Transactions on Circuits and Systems for Video Technology, 11(6): 716-719, 2001.

[3] N. Arica and F. Vural, BAS: A Perceptual Shape Descriptor based on the Beam Angle Statistics, Pattern Recognition Letters, 24(9-10): 1627-1639, 2003.

[4] D. Zhang and G. Lu, Review of shape representation and description techniques, Pattern Recognition, 37(1): 1-19, 2004.

[5] S. X. Liao and M. Pawlak, On image analysis by moments, IEEE Trans. Pattern Anal. Mach. Intell., 18(3): 254-266, 1996.

[6] R. da S. Torres, A. X. Falcão, and L. da F. Costa, A graph-based approach for multiscale shape analysis, Pattern Recognition, 37(6): 1163-1174, 2004.

[7] R. C. Gonzalez and R. E. Woods, Digital Image Processing, 2nd ed., Addison-Wesley Longman Publishing Co., Inc., 2001.

[8] F. A. Andaló, P. A. V. Miranda, R. da S. Torres, and A. X. Falcão, Shape Feature Extraction and Description based on Tensor Scale, Pattern Recognition, 43(1): 26-36, 2010.

[9] N. Alajlan, I. E. Rube, M. S. Kamel, and G. Freeman, Shape retrieval using triangle-area representation and dynamic space warping, Pattern Recognition, 40(7): 1911-1920, 2007.
