
Image Indexing using a Coloured Pattern Appearance Model ♣

G. Qiu
School of Computer Science & IT

The University of Nottingham, Jubilee Campus, Nottingham NG8 1BB, United Kingdom

[email protected]

Abstract

We introduce a new method for colour image indexing and content-based image retrieval. An image is divided into small sub-images, the visual appearance of which is characterised by a coloured pattern appearance model. The statistics of the local visual appearance of the image are then computed as measures of the global visual appearance of the image. The visual appearance of the small sub-images is modelled by their spatial pattern, colour direction and local energy strength. To encode the local visual appearance, an approach based on vector quantisation (VQ) is introduced. The distributions of the VQ code indices are then used to index and retrieve the images. The new method can not only be used to achieve effective image indexing and retrieval; it can also be used for image compression. Based on this method, indexing and retrieval can be easily and conveniently performed in the compressed domain without performing any decoding operations.

Keywords: colour vision, colour appearance model, image database, image indexing, content-based retrieval, vector quantization, image coding

1. Introduction

Image indexing and content-based image retrieval (CBIR) has been an extensively researched area in the computer vision community for the past decade. Many methods and techniques have been developed and published. A general approach to CBIR is to extract low-level visual features, such as colour [1], texture [2] and shape [3], and store them as meta-data in addition to the imagery data itself. Content-based retrieval is achieved by comparing the visual features extracted from the query example image with those stored in the database, based on some form of similarity measure.

Early approaches used colour alone for indexing [1]. This approach has been very successful and is extensively used in many of today's research and commercial systems. Even though the concept of colour histogram matching is simple, the computation can be very expensive, and researchers have been trying to address this issue [4]. In addition to the computational problem, it is well known that the biggest drawback of colour histogram based methods is that the histogram is a global measure: it does not contain any spatial information. Again, researchers have recognised the issue and developed schemes to tackle the problem, e.g. [5]. Methods using Gabor filter banks to create texture features for CBIR have been reported as being quite successful in certain areas of application [2]. In [2] only grey-scale textures were studied, and in [6] multi-scale Gabor filter banks were used for colour texture recognition. In [6] the authors also applied the double-opponent colour vision theory [7] and used Gabor filter outputs to create Gabor opponent features to enhance the performance. In general, a combination of various visual features (colour, texture, shape, etc.) is used to achieve better performance. For a review of the current state of the art in this area, there have been several review papers, e.g. [8], and many annual conferences dedicated to the topic, e.g. [9], [10].

In this paper, we have developed a new approach to CBIR for colour image databases. We were motivated by a study in the field of psychology and human colour vision, specifically by the pattern colour separable (PCS) model of human colour vision [11].

♣ Part of this work was performed when the author was with the School of Computing, University of Leeds, United Kingdom. To view the images that appear in this paper in colour, a PDF version is available online at http://www.cs.nott.ac.uk/~qiu/Online/Publications.html


According to [11], the visual appearance of a colour pattern to a human observer is determined by three factors: 1) the spatial frequency, 2) the colour angle and 3) the strength of the stimulus. In other words, within the human visual system (HVS) there are three visual pathways: one is sensitive to the spatial frequency, one is sensitive to the colour and one is sensitive to the strength of the visual stimulus. Following this PCS theory, we set out to process the three components of an image, pattern, colour and strength, separately. A coloured pattern appearance model (CPAM) is constructed to encode the signals of the visual pathways. We found that the codes used to encode the three channels can be used successfully for image indexing.

The organisation of the paper is as follows. Section 2 briefly reviews the pattern colour separable theory. Section 3 introduces a CPAM model for small-size patterns which uses vector quantization to encode the pattern and colour signals. Section 4 explains how the codes for the pattern and colour visual pathways of the CPAM can be used for content-based image indexing and retrieval. Section 5 presents experimental results and Section 6 gives some concluding remarks.

2. Colour Vision Theories and Colour Spaces

There is evidence to suggest that different visual pathways process colour and pattern in the human visual system. In [11], experiments were carried out using square wave patterns with a range of different spatial frequencies, colours and stimulus strengths to measure how colour appearance depends on spatial pattern. Results suggest that the value of one neural image is the product of three terms. One term defines the pathway's sensitivity to the square wave's colour direction. A second term defines the pathway's sensitivity to the spatial pattern and the third term defines the pathway's sensitivity to the square wave's stimulus strength. This is the pattern-colour-separable (PCS) model of human colour vision.

There is also physiological evidence to suggest the existence of opponent colour signals in the visual pathway [12]. The opponent colour theory suggests that there are three visual pathways in the human colour vision system. One pathway is sensitive mainly to light-dark variations; this pathway has the best spatial resolution. The other two pathways are sensitive to red-green and blue-yellow variations. The blue-yellow pathway has the worst spatial resolution. In opponent-colour representations, the spatial sharpness of a colour image depends mainly on the sharpness of the light-dark component of the image and very little on the structure of the opponent-colour image components.

However, the pattern-colour-separable model and the opponent colour theory are consistent with one another. In [11], the spatial and spectral tuning characteristics of the pattern, colour and strength pathways were estimated; the result was that one broadband and two opponent colour pathways were inferred.

The property of the HVS that different visual pathways have different spatial acuity is well known and exploited in colour image processing in the form of colour models (spaces). The earliest exploitation of this was perhaps the use of the YIQ signal in terrestrial TV broadcasting [13], where the Y component captures the light-dark variation of the TV signal and is transmitted at full bandwidth, whilst the I and Q channels capture the chromatic components of the signal and are transmitted using half the bandwidth. Similar colour models, such as YCbCr [15], Lab [14] and many more [16], were also developed in different contexts and applications.

Putting colour models such as YIQ and YCbCr in the context of the pattern colour separable framework, they can be roughly interpreted as follows: the spatial patterns are mostly contained in the Y channel and the colours in the I and Q, or Cb and Cr, channels, whilst the strength is the overall energy of all three channels, although Y will contain the vast majority of it. Because colours and patterns are separable, coding Y independently from I and Q, or Cb and Cr, plus the strength, should completely capture the visual appearance of an image. Given that these independent codes contain the visual appearance information, they could be used to index images and retrieve (recognise) images of similar visual appearance, and are thus ideal features for image database indexing. In the rest of the paper, we shall show that this is indeed the case.

3. Coloured Pattern Appearance Model

We would like to translate the pattern-colour-separable model [11] into a computational system. Colour signals captured using a camera or other input devices normally appear in the form of RGB signals.


It is first necessary to convert RGB to an opponent colour space. We decided to use the YCbCr colour model in this paper; other similar opponent colour spaces could also be used.

The relation between YCbCr and the better-known RGB space is as follows:

$$
\begin{bmatrix} Y \\ C_b \\ C_r \end{bmatrix}
=
\begin{bmatrix}
0.299 & 0.587 & 0.114 \\
-0.169 & -0.331 & 0.500 \\
0.500 & -0.419 & -0.081
\end{bmatrix}
\begin{bmatrix} R \\ G \\ B \end{bmatrix}
$$

Y contains the luminance information; Cb and Cr contain (mostly) chromatic information as well as some luminance information.
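As a minimal illustration (not from the paper), the conversion can be applied to a whole image with a single matrix product; the function name and the assumption that the RGB values are given as floating-point arrays are ours:

```python
import numpy as np

# Conversion matrix quoted above (CCIR 601 style YCbCr).
RGB_TO_YCBCR = np.array([
    [ 0.299,  0.587,  0.114],   # Y
    [-0.169, -0.331,  0.500],   # Cb
    [ 0.500, -0.419, -0.081],   # Cr
])

def rgb_to_ycbcr(rgb):
    """Convert an (H, W, 3) RGB image (floats) into Y, Cb and Cr planes."""
    ycbcr = rgb @ RGB_TO_YCBCR.T
    return ycbcr[..., 0], ycbcr[..., 1], ycbcr[..., 2]
```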

Because pattern and colour are separable and the Y component has the highest bandwidth, the spatial patterns will be mostly contained in Y, whilst Cb and Cr together can be roughly interpreted as colour. The stimulus strength of a small area of the image can be approximated by the mean value of the area in the Y channel only. The three visual pathways, pattern, colour and strength, for a small block of the image are now modelled in the coloured pattern appearance model (CPAM) as shown in Fig. 1.

[Fig. 1 block diagram: the Y, Cb and Cr planes of a block enter the model; the local mean of Y gives S; Y divided by S gives P; Cb and Cr are sub-sampled by 2 (2↓) and divided by S to give C.]

Fig.1 Coloured Pattern Appearance Model (CPAM). The visual appearance of a small image block is modelled by three components: the stimulus strength (S), the spatial pattern (P) and the colour (C). For a small image area, the stimulus strength S is approximated by the local mean of the Y component. The pixels in Y normalised by S form the spatial pattern. Because Cb and Cr have lower bandwidth, they are sub-sampled by a factor of 2 in both dimensions. The sub-sampled pixels of Cb and Cr are normalised by S to form the colour (C) component of the appearance model. Normalising the pattern and colour channels by the strength has two purposes. Firstly, from a coding point of view, removing the DC component makes the code more efficient [17]. Secondly, from an image indexing point of view, it removes to a certain extent the effects of lighting conditions, making the visual appearance model somewhat "colour constant" [20], which should improve the indexing and retrieval performance, especially in the case of retrieving similar surfaces imaged under different conditions.

In order to use the model for the purpose of image indexing, the S, P and C signals of the model have to be coded properly. Because we have in mind that the codes should capture the visual appearance of the image and simultaneously serve as convenient features for indexing in an image database, we design our encoder based on vector quantization [17].

Vector quantization (VQ) is a mature method of lossy signal compression/coding in which statistical techniques are used to optimise distortion/rate tradeoffs. A vector quantizer is described by an encoder Q, which maps the k-dimensional input vector X to an index i ∈ I specifying which one of a small collection of reproduction vectors (codewords) in a codebook C = {Ci; i ∈ I} is used for reconstruction, and there is also a decoder, Q⁻¹, which maps the indices into the reproduction vectors, i.e., X' = Q⁻¹(Q(X)).
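As a sketch of this encoder/decoder pair (illustrative only; the codebook is assumed to be a NumPy array of codewords and the distance is taken to be Euclidean):

```python
import numpy as np

def vq_encode(x, codebook):
    """Encoder Q: map a k-dimensional vector x to the index of the
    nearest codeword in the codebook (squared Euclidean distance)."""
    return int(np.argmin(np.sum((codebook - x) ** 2, axis=1)))

def vq_decode(i, codebook):
    """Decoder Q^-1: map an index back to its reproduction vector,
    so that X' = vq_decode(vq_encode(X, C), C)."""
    return codebook[i]
```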



There are many methods developed for designing VQ codebooks. K-means type algorithms, such as the LBG algorithm [17], and neural network based algorithms, such as the Kohonen feature map [18], are popular tools. In this work, we used a specific neural network training algorithm, the frequency sensitive competitive learning (FSCL) algorithm [19], to design our codebooks. We find that FSCL is insensitive to the initial choice of codewords, and that the codewords designed by FSCL are more efficiently utilised than those designed by methods such as the LBG algorithm.

The FSCL method can be briefly described as follows:

1. Initialise the codewords Ci(0), i = 1, 2, …, I, to random values and set the counter associated with each codeword to one: ni(0) = 1.

2. Present the training sample X(t), where t is the sequence index; calculate the distance between X(t) and each codeword, Di(t) = D(X(t), Ci(t)), and modify the distance according to D*i(t) = ni(t)Di(t).

3. Find j such that D*j ≤ D*i for all i, then update the winning codeword and its counter:
   Cj(t+1) = Cj(t) + a[X(t) − Cj(t)]
   nj(t+1) = nj(t) + 1
   where 0 < a < 1 is the training rate.

4. Repeat by going to step 2.
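The following is a minimal sketch of this training loop; the learning rate, random initialisation and fixed number of passes over the data are our own choices, as the paper does not specify them:

```python
import numpy as np

def train_fscl(samples, num_codewords, rate=0.05, epochs=10, seed=0):
    """Frequency sensitive competitive learning, following steps 1-4 above.
    samples: (N, k) array of training vectors. Returns a (num_codewords, k) codebook."""
    rng = np.random.default_rng(seed)
    codebook = rng.standard_normal((num_codewords, samples.shape[1]))  # step 1: random codewords
    counts = np.ones(num_codewords)                                    # step 1: n_i(0) = 1
    for _ in range(epochs):
        for x in samples:                                              # step 2: present X(t)
            d = np.sum((codebook - x) ** 2, axis=1)                    # step 2: distances D_i
            j = int(np.argmin(counts * d))                             # step 3: smallest D*_i = n_i * D_i
            codebook[j] += rate * (x - codebook[j])                    # step 3: update C_j
            counts[j] += 1                                             # step 3: update n_j
    return codebook                                                    # step 4: repeat over the data
```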

3.1 Coding the Pattern Channel (P)

From the original RGB space, the image is first converted to YCbCr space. For the Y channel, the image is divided into small blocks. In this paper, the block size was set to 4 x 4 (other sizes could also be used). The strength of the block (the mean) is calculated first, and then each pixel in the block is divided by the block mean. A VQ codebook is designed for the mean-normalised block. Traditionally, it is standard practice to subtract the mean from the block pixels and then design a VQ for the residual block. Because we have image indexing in mind, we instead divide the pixels by the mean; this will to a certain extent remove the influence of the lighting conditions, a problem that can dramatically affect the performance of an indexing and retrieval system [20].

Let Y = {y(i, j), i, j = 0, 1, 2, 3} be the 4 x 4 Y image block. The stimulus strength of the block is calculated as

$$S = \frac{1}{16}\sum_{i=0}^{3}\sum_{j=0}^{3} y(i,j) \qquad (1)$$

Then the pattern vector P = {p(i, j), i, j = 0, 1, 2, 3} of the block is formed as

$$p(i,j) = \frac{y(i,j)}{S} \qquad (2)$$

A 16-dimensional vector quantizer, Qp, with a codebook CP = {CPi; i ∈ I} of size I, is then designed for P using many training samples. In this work, 235 images of 512 x 512 pixels, which provide more than 9 million samples, have been used to design CP.
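For a single block, equations (1) and (2) reduce to a few lines; a sketch is shown below (the small epsilon guarding against division by zero in completely dark blocks is our addition, not discussed in the paper):

```python
import numpy as np

def pattern_vector(y_block, eps=1e-6):
    """Return the strength S (eq. 1) and the 16-dimensional mean-normalised
    pattern vector P (eq. 2) of a 4x4 Y block."""
    s = float(y_block.mean())               # eq. (1): S = (1/16) * sum of y(i, j)
    p = (y_block / (s + eps)).reshape(16)   # eq. (2): p(i, j) = y(i, j) / S
    return s, p

# The pattern vector is then quantised with the 16-dimensional quantizer Qp,
# e.g. index = vq_encode(p, pattern_codebook).
```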

3.2 Coding the Colour Channel (C)

The colour component of the model is formed by sub-sampling the two chromatic channels, Cb and Cr, to form a single vector, which is also normalised by S. In the case where a 4 x 4 block size is used, the colour component is an 8-dimensional vector. Let Cb = {cb(i, j), i, j = 0, 1, 2, 3} and Cr = {cr(i, j), i, j = 0, 1, 2, 3} be the two corresponding chromatic blocks of the Cb and Cr channels. Then the sub-sampled Cb signal, SCb = {scb(m, n), m, n = 0, 1}, is obtained as follows

$$scb(m,n) = \frac{1}{4S}\sum_{i=0}^{1}\sum_{j=0}^{1} cb(2m+i,\,2n+j)$$

Similarly, the sub-sampled Cr signal, SCr = {scr(m, n), m, n = 0, 1}, is obtained as follows

$$scr(m,n) = \frac{1}{4S}\sum_{i=0}^{1}\sum_{j=0}^{1} cr(2m+i,\,2n+j)$$

The colour vector C = {c(k), k = 0, 1, …, 7} is formed by concatenating SCb and SCr: C = {scb(m, n), scr(m, n), m, n = 0, 1}.

An 8-dimensional vector quantizer, Qc, with a codebook CC = {CCi; i ∈ J} of size J, is then designed. Again, over 9 million samples have been used to design CC.
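A corresponding sketch for the colour channel, under the same assumptions (here s is the block strength from equation (1)):

```python
import numpy as np

def colour_vector(cb_block, cr_block, s, eps=1e-6):
    """Sub-sample the 4x4 Cb and Cr blocks by 2 in each dimension, normalise
    by the strength S, and concatenate into the 8-dimensional colour vector C."""
    # scb(m, n) = (1 / 4S) * sum over the 2x2 neighbourhood of cb(2m+i, 2n+j)
    scb = cb_block.reshape(2, 2, 2, 2).mean(axis=(1, 3)) / (s + eps)
    scr = cr_block.reshape(2, 2, 2, 2).mean(axis=(1, 3)) / (s + eps)
    return np.concatenate([scb.ravel(), scr.ravel()])

# The colour vector is then quantised with the 8-dimensional quantizer Qc,
# e.g. index = vq_encode(c, colour_codebook).
```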

4. Indexing and Retrieval of Images Based on the Codes of the P and C Channels

The local visual appearance of the image is characterised by the coloured pattern appearance model. The brightness is captured in the S channel, the spatial pattern variation information is contained in the P channel, whilst the chromatic information is captured by the C channel. To characterise the overall visual appearance of an image, statistics of the local appearance can be calculated. Since the brightness of the image is largely affected by the lighting conditions, S could be an unreliable parameter for image indexing [20]. In this work, we use the statistics of the VQ code index distributions of the P and C channels to characterise the global visual appearance of the image. Histograms of the P codes and the C codes are calculated separately. Another possibility is to calculate the joint distribution of the P and C codes; however, this would increase the computational burden dramatically and introduce heavy overheads. As can be seen later, the separate P and C codes form part of the compressed information that can be used to reconstruct the original image.

To index an image, the image is divided into 4 x 4 non-overlapping* blocks, each block is coded using the pattern quantizer Qp and the colour quantizer Qc, and the histograms of the codes are calculated. Let Hp = {hpi, i = 0, 1, …, I−1} and Hc = {hcj, j = 0, 1, …, J−1}, where hpi is the frequency with which the pattern codeword CPi, and hcj the frequency with which the colour codeword CCj, is used to code the image.

Let Hp(m), Hc(m) and Hp(n), Hc(n) be the pattern and colour code histograms of image m and image n. The similarity of the images can be calculated as

$$D(m,n) = \lambda_1\left\|H_p(m) - H_p(n)\right\| + \lambda_2\left\|H_c(m) - H_c(n)\right\| \qquad (3)$$

where λ1 and λ2 are weighting factors, determined experimentally, which set the relative importance given to the pattern and colour components of the appearance model.
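Putting the pieces together, indexing an image and comparing two images might look as follows. This sketch reuses the illustrative helpers above and takes the norm in (3) to be the L1 distance between histograms, which is one common choice; the paper does not state which norm it uses.

```python
import numpy as np

def index_image(y, cb, cr, pattern_codebook, colour_codebook):
    """Return the normalised pattern and colour code histograms Hp, Hc of an
    image whose Y, Cb, Cr planes have dimensions divisible by 4."""
    hp = np.zeros(len(pattern_codebook))
    hc = np.zeros(len(colour_codebook))
    for r in range(0, y.shape[0], 4):
        for c in range(0, y.shape[1], 4):
            s, p = pattern_vector(y[r:r + 4, c:c + 4])
            cv = colour_vector(cb[r:r + 4, c:c + 4], cr[r:r + 4, c:c + 4], s)
            hp[vq_encode(p, pattern_codebook)] += 1
            hc[vq_encode(cv, colour_codebook)] += 1
    return hp / hp.sum(), hc / hc.sum()

def distance(hp_m, hc_m, hp_n, hc_n, lam1=1.0, lam2=1.0):
    """Dissimilarity D(m, n) of equation (3), here with the L1 norm."""
    return lam1 * np.abs(hp_m - hp_n).sum() + lam2 * np.abs(hc_m - hc_n).sum()
```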

5. Experimental Results

We have conducted experiments to test the model in image indexing and retrieval. We used colour texture retrieval as a testing vehicle. Similar to [2], we used a collection of images whose visual appearance was judged by human observers as roughly containing a single texture surface, although this description is questionable in the sense that the texture surfaces within a single image were not always the same. These large images were then divided into smaller sub-images. Each of these sub-images was then used as a query image; the target was to retrieve the sub-images belonging to the same large image as the query sub-image.

To design the codebooks, we used 235 colour texture images of 512 x 512 pixels from the MIT Media Lab VisTex collection. The images were divided into overlapping 4 x 4 blocks for training the codebooks; in total more than 9 million blocks of colour image samples were used. The codebook sizes for the pattern and colour channels were both set to 256. We found this to be a reasonable size, in that increasing it did not improve the performance significantly, either in terms of indexing/retrieval or of compression. Training was done on a Pentium III PC, and the total training time was approximately 9 hours.

* It is also possible to use overlapping blocks. However, because we also have image coding in mind, the histograms are not stored; only the code indices are recorded. For purely indexing purposes, where the histogram is stored as meta-data, overlapping blocks can be used and sometimes give slightly better retrieval results, but it was found that the improvement was limited. In addition, using overlapping blocks introduces coding redundancy and is therefore not suitable for coding purposes. We study the application of CPAM to image coding in [21].


Fig.2 shows thumbnails of a subset of the training images. Fig.3 shows the codebook image, which contains 65536 (256 pattern x 256 colour) 4 x 4 pixel blocks. Each block was constructed using one of the 256 pattern codewords and one of the 256 colour codewords. For display purposes, the strengths of all the blocks were set to a constant value of 128. An amplified section is shown on the right of the figure.

To test the indexing performance, we first picked 40 single-texture images from the 235 training images. These images were judged by human observers as containing roughly a single texture. Each image was divided into 128 x 128 sub-images, forming a database of 640 texture images. Each of these 640 texture images was then used as a query image. The pattern and colour code histograms of these texture images were calculated using the codebooks of Fig.3. The distances D(m, n) were calculated according to (3), where m is a query image and n is a texture sub-image from the database, and were then sorted in increasing order. We measure the retrieval performance as the average retrieval rate, which is defined as the average percentage of sub-images belonging to the same image as the query pattern that appear in the top 15 matches [2]. As a comparison, a 4096-bin colour histogram matching method [1] was also implemented. Results are shown in Table 1. As can be seen, averaged over all 640 queries, the retrieval rate when combining the pattern and colour appearance codes was 87.46%; using pattern appearance alone it was 82.74% and using colour appearance alone 84.7%, whilst colour histogram matching achieved 73.69%. It is worth noting that colour histogram matching in this case was computationally 8 ~ 16 times more intensive than the new method, even though it is possible to reduce the complexity of histogram matching [4].
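For reference, the average retrieval rate used here can be computed as sketched below; the helper is ours, dist_matrix[q, n] is assumed to hold D(q, n) for every pair of sub-images and labels[i] identifies the large image that sub-image i came from.

```python
import numpy as np

def average_retrieval_rate(dist_matrix, labels, top_k=15):
    """Average fraction of same-image sub-images found in the top_k matches,
    taken over all queries; the query itself is excluded from its own ranking."""
    labels = np.asarray(labels)
    rates = []
    for q in range(len(labels)):
        order = np.argsort(dist_matrix[q])
        order = order[order != q][:top_k]              # best top_k other sub-images
        hits = np.sum(labels[order] == labels[q])
        available = np.sum(labels == labels[q]) - 1    # same-image sub-images in the database
        rates.append(hits / min(available, top_k))
    return float(np.mean(rates))
```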

In a second testing database, we used 120 colour texture images of 300 x 200 pixels, which differ significantly from those in the training set. Fig. 4 shows thumbnails of a subset of these images. Each image was divided into six 100 x 100 pixel sub-images, thus forming a 720-image database. Again, each of these sub-images was used as a query image to retrieve the 5 sub-images belonging to the same image as the query image, and the average retrieval rate was used. Results are shown in Table 2. Again, a 4096-bin colour histogram matching method was implemented. In this case, the average retrieval rate of colour histogram matching was only 57.28%, whilst the new method achieved 90.44% when both colour and pattern appearance codes were used, 73.02% using pattern appearance alone and 86.45% using colour appearance alone. One of the reasons is that the images in this test set contain many similar colour textures. What was also remarkable was that even though all the codes were trained on a set of very different images, the performance was still excellent. We believe one of the reasons is that we used the strength to normalise the pattern and colour channels. The training set also contained enough samples for the codes to capture the most commonly occurring spatial patterns and chromaticities.

Fig. 5 shows two typical images from this set of testing data, together with the average retrieval rates for each of their sub-images. Fig. 6 shows the two query textures and their top 48 retrievals using λ1 = λ2 = 1.

6. Concluding Remarks

In this paper, we have introduced a computational model which mimics the visual pathways in the human visual system to capture the visual appearance of coloured patterns in a small area of the image. Global statistics of the local visual appearance are then used to characterise the appearance of the image. We have successfully applied the model to content-based image indexing and retrieval. The model is also naturally suited for image coding, which has the advantage of achieving image compression and indexing simultaneously [21].

It is worth pointing out that we do not claim the model in any way reflects what is actually going on inside our brain. However, the work was originally motivated by studies in human colour vision, especially by the pattern-colour-separable model [11], and the model does appear to be psycho-visually plausible. More importantly, from a signal processing point of view, we believe the design is technically sound. Experimental results further demonstrate its usefulness.

7. References

1. M. Swain and D. Ballard, "Color Indexing", International Journal of Computer Vision, Vol. 7, pp. 11-32, 1991
2. B. S. Manjunath and W. Y. Ma, "Texture features for browsing and retrieval of image data", IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 18, pp. 837-842, 1996
3. W. Y. Ma and B. S. Manjunath, "NeTra: A toolbox for navigating large image databases", Multimedia Systems, Vol. 7, pp. 184-198, 1999
4. J. Berens, G. D. Finlayson and G. Qiu, "A statistical image of colour space", IPA'99, 7th IEE International Conference on Image Processing and Its Applications, July 1999, Manchester, UK
5. J. Huang et al., "Image indexing using color correlogram", IEEE CVPR, pp. 762-768, 1997
6. A. Jain and G. Healey, "A multiscale representation including opponent color features for texture recognition", IEEE Trans. Image Processing, Vol. 7, No. 1, pp. 124-128, 1998
7. A. Gorea and T. V. Papathomas, "Double opponent as a generalized concept in texture segregation illustrated with stimuli defined by color, luminance, and orientation", J. Opt. Soc. Am. A, Vol. 10, pp. 1450-1462, 1993
8. Y. Rui, T. S. Huang and S. F. Chang, "Image Retrieval: Current Techniques, Promising Directions, and Open Issues", Journal of Visual Communication and Image Representation, Vol. 10, pp. 39-62, 1999
9. SPIE Storage and Retrieval for Image and Video Databases series
10. IEEE Workshop on Content-based Access of Image and Video Libraries series
11. A. Poirson and B. Wandell, "Appearance of colored patterns: pattern-color separability", J. Opt. Soc. Am. A, Vol. 10, pp. 2458-2470, 1993
12. P. K. Kaiser and R. M. Boynton, Human Color Vision, Optical Society of America, Washington DC, 1996
13. W. Pratt, Digital Image Processing, Wiley, New York, 1978
14. CIE, Colorimetry, CIE Pub. No. 15.2, Central Bureau CIE, Vienna, 1986
15. CCIR, "Encoding parameters of digital television for studios", CCIR Recommendation 601-2, Int. Radio Consult. Committee, Geneva, 1990
16. G. Sharma and H. J. Trussell, "Digital color imaging - Tutorial", IEEE Trans. Image Processing, Vol. 6, pp. 901-932, 1997
17. A. Gersho and R. M. Gray, Vector Quantization and Signal Compression, Kluwer Academic Publishers, Boston, 1992
18. T. Kohonen, Self-Organization and Associative Memory, Springer-Verlag, Berlin, 1989
19. S. C. Ahalt et al., "Competitive learning algorithms for vector quantization", Neural Networks, Vol. 3, pp. 277-290, 1990
20. B. V. Funt and G. D. Finlayson, "Color constant color indexing", IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 17, pp. 522-529, 1995
21. G. Qiu, "Image coding using a coloured pattern appearance model", submitted to Visual Communication and Image Processing 2001, January 2001, San Jose, CA, USA


Fig. 2, Thumbnails of a subset of training texture images

Fig. 3, Codebook image. Left: there are 256 x 256 4 x 4 patterns in this image. Right: a blow-up of a subset of 12 4 x 4 codewords


Table 1, Average retrieval rates for 40 texture images from the VisTex database

Texture Name       λ1=1,λ2=1  λ1=1,λ2=0  λ1=0,λ2=1  Colour/Hist  |  Texture Name           λ1=1,λ2=1  λ1=1,λ2=0  λ1=0,λ2=1  Colour/Hist
Bark.0002.raw      60.38%     57.50%     62.06%     50.38%       |  Food.0009.raw          90.25%     85.75%     89.56%     79.13%
Bark.0008.raw      73.44%     67.94%     84.19%     51.13%       |  Grass.0001.raw         83.69%     79.19%     83.25%     97.50%
Bark.0012.raw      84.19%     69.19%     67.94%     70.38%       |  Leaves.0002.raw        82.44%     80.94%     66.38%     59.19%
Brick.0001.raw     98.31%     76.75%     85.81%     92.38%       |  Leaves.0003.raw        100.00%    100.00%    86.06%     92.00%
Brick.0004.raw     70.06%     59.69%     95.00%     73.25%       |  Leaves.0010.raw        93.25%     92.88%     69.00%     67.13%
Fabric.0001.raw    100.00%    100.00%    98.31%     83.38%       |  Leaves.0011.raw        100.00%    100.00%    99.13%     87.06%
Fabric.0003.raw    98.75%     98.75%     94.50%     77.31%       |  Metal.0000.raw         92.88%     90.75%     93.31%     93.75%
Fabric.0006.raw    84.94%     83.69%     66.63%     52.50%       |  Misc.0000.raw          88.69%     88.25%     79.19%     59.19%
Fabric.0007.raw    67.88%     63.69%     96.19%     66.63%       |  Paintings.31.0000.raw  81.69%     76.69%     70.75%     52.13%
Fabric.0010.raw    95.81%     92.88%     97.88%     68.75%       |  Paintings.41.0000.raw  61.63%     54.25%     85.88%     81.25%
Fabric.0012.raw    82.88%     76.38%     68.31%     81.69%       |  Sand.0000.raw          100.00%    100.00%    99.56%     91.19%
Fabric.0014.raw    100.00%    95.25%     99.56%     99.19%       |  Stone.0002.raw         70.31%     65.06%     82.00%     57.56%
Fabric.0016.raw    100.00%    95.69%     99.13%     87.00%       |  Tile.0008.raw          91.94%     75.75%     100.00%    59.56%
Flowers.0007.raw   94.13%     87.13%     91.19%     79.94%       |  Tile.0010.raw          47.88%     45.75%     94.25%     39.31%
Food.0000.raw      100.00%    99.56%     81.81%     100.00%      |  Water.0002.raw         86.25%     86.56%     74.44%     68.25%
Food.0001.raw      100.00%    100.00%    99.56%     96.19%       |  Water.0003.raw         80.75%     74.13%     70.00%     59.25%
Food.0002.raw      99.56%     99.13%     73.00%     100.00%      |  Water.0004.raw         81.13%     80.06%     71.25%     48.00%
Food.0003.raw      100.00%    100.00%    94.56%     80.88%       |  Water.0005.raw         100.00%    100.00%    76.56%     74.13%
Food.0005.raw      92.06%     89.94%     88.75%     85.81%       |  WheresWaldo.0001.raw   89.19%     94.94%     77.69%     77.38%
Food.0007.raw      85.06%     85.44%     76.63%     78.69%       |  Wood.0002.raw          88.90%     40.00%     98.67%     29.33%

Fig. 4, A subset of 120 colourful textures used in the test. These images were not included in the training set.


[Fig. 5 charts: two bar charts, "Matches Image" and "Plant Image", plotting the percentage of correct retrievals (0-100%) for sub-images 1-6, comparing the new method with colour histogram matching.]

Fig. 5 Two typical images from the 2nd test set and their sub-image retrieval rates. In many cases, colour histogram matching returned none of the five sub-images belonging to the same image as the query in the top 5 ranks.


Table 2, Average retrieval rates for 120 texture images from the second testing database

Texture Name  λ1=1,λ2=1  λ1=1,λ2=0  λ1=0,λ2=1  Colour/Hist  |  Texture Name  λ1=1,λ2=1  λ1=1,λ2=0  λ1=0,λ2=1  Colour/Hist
0211078l  66.67%   60.00%   73.33%   46.67%   |  0550088l  100.00%  100.00%  96.67%   23.33%
0447007l  100.00%  100.00%  43.33%   23.33%   |  0550092l  83.33%   83.33%   46.67%   33.33%
0447010l  96.67%   90.00%   10.00%   30.00%   |  0550096l  93.33%   93.33%   80.00%   76.67%
0447012l  76.67%   70.00%   30.00%   23.33%   |  0605037l  100.00%  100.00%  66.67%   23.33%
0447024l  83.33%   96.67%   43.33%   10.00%   |  0612006l  100.00%  100.00%  100.00%  96.67%
0447027l  80.00%   80.00%   40.00%   63.33%   |  0612023l  100.00%  100.00%  90.00%   86.67%
0447029l  100.00%  100.00%  100.00%  60.00%   |  0612046l  100.00%  100.00%  70.00%   36.67%
0447031l  100.00%  100.00%  83.33%   100.00%  |  0612065l  100.00%  96.67%   96.67%   83.33%
0447035l  100.00%  100.00%  83.33%   93.33%   |  0612070l  100.00%  100.00%  83.33%   56.67%
0447036l  80.00%   80.00%   53.33%   80.00%   |  0612079l  93.33%   100.00%  33.33%   46.67%
0447037l  66.67%   53.33%   20.00%   20.00%   |  0647003l  86.67%   83.33%   100.00%  46.67%
0447038l  100.00%  100.00%  66.67%   70.00%   |  0647077l  93.33%   90.00%   73.33%   16.67%
0447040l  100.00%  100.00%  83.33%   76.67%   |  0754073l  100.00%  100.00%  33.33%   36.67%
0447042l  96.67%   96.67%   83.33%   63.33%   |  1116012l  100.00%  93.33%   100.00%  100.00%
0447043l  100.00%  100.00%  63.33%   100.00%  |  1116024l  86.67%   80.00%   40.00%   50.00%
0447044l  100.00%  100.00%  90.00%   60.00%   |  1116033l  66.67%   63.33%   30.00%   36.67%
0447046l  100.00%  100.00%  60.00%   3.33%    |  1117140l  93.33%   93.33%   43.33%   33.33%
0447060l  90.00%   90.00%   66.67%   66.67%   |  1117141l  93.33%   96.67%   26.67%   23.33%
0447064l  86.67%   86.67%   83.33%   40.00%   |  1118033l  100.00%  100.00%  93.33%   83.33%
0447066l  83.33%   76.67%   80.00%   33.33%   |  1118034l  93.33%   83.33%   50.00%   36.67%
0447067l  100.00%  100.00%  100.00%  93.33%   |  1118037l  100.00%  100.00%  70.00%   66.67%
0447070l  53.33%   40.00%   100.00%  20.00%   |  1118072l  90.00%   80.00%   70.00%   96.67%
0447084l  100.00%  96.67%   60.00%   53.33%   |  1350028l  56.67%   50.00%   26.67%   23.33%
0447085l  100.00%  100.00%  53.33%   40.00%   |  1359015l  100.00%  100.00%  86.67%   70.00%
0447093l  93.33%   70.00%   100.00%  16.67%   |  1359070l  93.33%   93.33%   96.67%   56.67%
0447099l  96.67%   93.33%   93.33%   33.33%   |  1365013l  100.00%  93.33%   86.67%   56.67%
0471010l  100.00%  93.33%   90.00%   20.00%   |  1365099l  100.00%  100.00%  100.00%  80.00%
0471012l  93.33%   90.00%   36.67%   66.67%   |  1366001l  66.67%   56.67%   76.67%   30.00%
0471019l  100.00%  96.67%   60.00%   23.33%   |  1366007l  96.67%   100.00%  70.00%   63.33%
0471026l  96.67%   86.67%   23.33%   30.00%   |  1366008l  100.00%  100.00%  90.00%   90.00%
0471028l  100.00%  96.67%   53.33%   76.67%   |  1366017l  100.00%  100.00%  83.33%   13.33%
0471049l  86.67%   86.67%   46.67%   70.00%   |  1366038l  100.00%  100.00%  40.00%   33.33%
0471050l  76.67%   73.33%   63.33%   93.33%   |  1442093l  86.67%   80.00%   90.00%   53.33%
0548003l  100.00%  100.00%  100.00%  90.00%   |  1442097l  100.00%  100.00%  60.00%   56.67%
0548005l  100.00%  100.00%  90.00%   53.33%   |  1442099l  96.67%   90.00%   96.67%   60.00%
0548006l  100.00%  93.33%   100.00%  93.33%   |  1443002l  86.67%   70.00%   63.33%   26.67%
0548022l  96.67%   80.00%   100.00%  80.00%   |  1443003l  96.67%   93.33%   80.00%   43.33%
0550001l  100.00%  100.00%  100.00%  56.67%   |  1443009l  86.67%   80.00%   90.00%   63.33%
0550009l  86.67%   76.67%   86.67%   36.67%   |  1443042l  66.67%   56.67%   56.67%   33.33%
0550016l  43.33%   43.33%   83.33%   40.00%   |  1443063l  90.00%   86.67%   60.00%   33.33%
0550019l  96.67%   93.33%   70.00%   53.33%   |  1443075l  100.00%  96.67%   93.33%   86.67%
0550037l  100.00%  90.00%   86.67%   93.33%   |  1443078l  100.00%  100.00%  36.67%   80.00%
0550038l  100.00%  100.00%  100.00%  83.33%   |  1443083l  86.67%   83.33%   66.67%   73.33%
0550040l  80.00%   76.67%   76.67%   76.67%   |  1918007l  73.33%   66.67%   100.00%  0.00%
0550041l  96.67%   83.33%   83.33%   60.00%   |  1919009l  76.67%   73.33%   100.00%  43.33%
0550046l  56.67%   43.33%   53.33%   3.33%    |  1919052l  100.00%  96.67%   100.00%  93.33%
0550049l  80.00%   73.33%   76.67%   53.33%   |  1922001l  83.33%   80.00%   100.00%  73.33%
0550055l  100.00%  100.00%  96.67%   96.67%   |  1922005l  100.00%  100.00%  90.00%   26.67%
0550058l  63.33%   46.67%   73.33%   76.67%   |  1922009l  50.00%   50.00%   90.00%   46.67%
0550059l  90.00%   73.33%   100.00%  86.67%   |  1922013l  100.00%  100.00%  86.67%   96.67%
0550060l  100.00%  100.00%  96.67%   53.33%   |  1922014l  86.67%   76.67%   66.67%   56.67%
0550061l  93.33%   90.00%   76.67%   73.33%   |  1922033l  100.00%  100.00%  40.00%   36.67%
0550063l  100.00%  100.00%  100.00%  70.00%   |  2065021l  96.67%   96.67%   93.33%   86.67%
0550064l  80.00%   73.33%   6.67%    13.33%   |  2065022l  73.33%   70.00%   100.00%  83.33%
0550067l  100.00%  93.33%   100.00%  46.67%   |  2065059l  100.00%  100.00%  76.67%   80.00%
0550068l  100.00%  100.00%  96.67%   100.00%  |  2065062l  60.00%   26.67%   83.33%   36.67%
0550070l  83.33%   73.33%   43.33%   63.33%   |  2065063l  100.00%  96.67%   76.67%   46.67%
0550071l  76.67%   63.33%   46.67%   73.33%   |  2065065l  90.00%   86.67%   73.33%   100.00%
0550086l  100.00%  96.67%   73.33%   70.00%   |  2065067l  50.00%   30.00%   43.33%   23.33%
0550087l  100.00%  93.33%   90.00%   40.00%   |  2065111l  100.00%  100.00%  52.00%   80.00%


Fig. 6, (a) and (b): two examples of top 48 matches. The first image in the top left corner is the query image, and subsequent images, from left to right and top to bottom, are the retrieved images ordered according to their similarity to the query image.