CHAPTER - 4
Image Retrieval Based on Region of Interest
4.1. Introduction
Content Based Image Retrieval (CBIR) is the browsing, searching and navigation of images from
large image databases based on their visual content. CBIR has been an active area of research for
more than a decade. Many CBIR systems have been developed, such as QBIC [Flickner et al.
(1995)], SIMPLIcity [Wang et al. (2001)], and Blobworld [Carson et al. (2002)]. A detailed survey
of CBIR techniques can be found in [Lew et al. (2006), Vassilieva (2009), Pradnya and Vikhar
(2010)]. Traditional CBIR systems use low level features like color, texture, shape and spatial
location of objects to index and retrieve images from databases. Low level features can be global
or local (region based). Global feature based CBIR systems [Ciocca et al. (2011), Lin et al.
(2009), Wang et al. (2011), Zhiwen et al. (2012)] fail to compare the regions or objects in which
a user may be interested. Therefore, Region Based Image Retrieval (RBIR) has been shown in
the literature to be more effective at reflecting user requirements.
A typical query image consists of both relevant and irrelevant regions. The irrelevant regions
limit the effectiveness of existing content based image retrieval systems. Irrelevant regions can
be removed by defining an ROI in the query image. Depending upon the method of formulating
the region based query, RBIR can be categorized into two classes: (1) System Designated ROI
(SDR) and (2) User Designated ROI (UDR). These systems have various ways of querying and
representing images in the database.
The UDR approach seems more promising as it enables the user to express his intent in the
query formulation. In UDR approaches, however, it is difficult to formulate the query accurately
when there are variations in the sizes of ROI [Tian et al. (2000), Zhang et al. (2007)]. This
chapter presents a method to select ROI overlapping blocks based upon the features and
overlapping areas of the ROI. Furthermore, if multiple ROIs are selected by the user, it is
beneficial to consider the relative locations of the ROIs in the image. For instance, if the user is
interested in three regions as shown in Figure 4.1, then the relative locations of the ROIs in the
image must be taken into account.
Figure 4.1: Comparison of locations in multiple ROI (a) Query Image (b) Target image # 1
(c) Target image #2
Target image 4.1(b) is slightly different from query image (a) but the location of ROIs—the bird
at the top, the car in the middle and the person at the bottom—is the same. In target image 4.1(c),
on the other hand, the locations of the bird and the person are switched. In this case, the user’s
intent for retrieval is not realized, so target image 4.1(c) should be excluded from the retrieval
results or be given a lower priority. Therefore, in multiple ROI based retrieval, it is crucial to
consider the relative locations of ROIs so that the user’s intent can be fully reflected.
Only the works of Moghaddam et al. (2001) and Lee and Nang (2011) have reported solutions
to the problem of finding relative locations of ROIs. Moreover, these methods fail to
give a detailed level of relative location similarity. Considering the relative locations of several
ROIs in retrieval techniques requires complex algorithms and increases computation time. To
overcome this problem, this chapter presents a more effective method based on region codes,
which inherently supports the notion of relative locations of multiple ROIs and results in less
computation time.
The major contributions of the work presented in this chapter are: first, the use of region codes
to reduce overall computation time without affecting retrieval accuracy; second, an efficient
technique for ROI overlapping block selection; third, a method to find similarity while
considering the relative locations of multiple ROIs; and lastly, an effective combination of
features for ROI image retrieval. The experimental results show that the proposed method can
meet the requirements of a user more accurately while consuming less time in comparison to
existing methods.
4.2. Related Work
Color and texture features have been extensively used in region based image retrieval.
Shrivastava and Tyagi (2012), Liu et al. (2011) have proposed a microstructure descriptor
(MSD). Microstructures are defined by edge orientation similarity with underlying colors, which
can effectively represent image features. Underlying colors are colors with similar edge
orientation. MSD integrates color, texture, shape and spatial layout information for effective
image retrieval. However, it lacks global properties of the image and is unable to utilize relation
between locations of different objects in the layout. Xingyuan and Zongyu (2013) have proposed
a more effective Structure Element Descriptor (SED) that combines color and texture
information. The histogram computed from SED in the HSV color space (quantized to 72 bins)
has been employed for image discrimination. The proposed global descriptor can represent the
spatial correlation of color and texture in the image. However, the descriptor is global and cannot
represent the region based properties of images. Also, the feature vector is long, and it does not
encode the spatial relation between different objects in the image.
Saykol et al. (2005) have proposed a histogram based scheme that combines color and shape
features. To extract shape, distance and angle histograms are used, and the color is encoded
using a quantized 166-bin histogram in HSV color space. Texture information is not used in the
retrieval. The scheme can identify shapes of individual objects but fails to capture spatial
relations between various objects in the image.
The color information of the image is also used for object detection using Color Co-
occurrence Histogram (CCH) [Chang and Krumm (1999)] and Color Edge Co-occurrence
Histogram (CECH) [Luo and Crandall (2006)]. CECH can represent the gross spatial
information about the layout but it is incapable of discerning subtle shape differences. This is
useful for handling object distortions but fails in classifying shapes having minor variations.
Apart from low level features, considering the spatial locations of different regions and their
relations in the image has also been shown in the literature to play an important role in
increasing the performance of a region based image retrieval system.
The approach of Hsiao et al. (2010) partitions images into five regions with fixed absolute
locations. To avoid noise during the local match, the system allows users to select the ROI from
the five segmented regions, and only the selected region is compared with the regions of other
images in the database.
In the technique presented in [Tian et al. (2000)], images are divided into blocks of a fixed
size (e.g. 2×2, 3×3, and 5×5). However, the size of the user designated ROI may be different
from predefined block size. This may result in an inaccurate representation of ROI features. To
address this problem, the authors have represented the features of blocks by their proportion of
overlap with the ROI in the calculation of the similarity measure. The main drawback of this
method is that it only compares blocks having the same spatial location as the ROI. Therefore,
blocks lying in locations different from that of the ROI are not retrieved.
The method proposed by Prasad et al. (2004) uses automatic specification of regions within
the image with the help of dominant color. The images are divided into blocks of size 3x3 and
each block is given a location index. The block having the largest overlap area with ROI is
designated and its feature vector is matched with database image's blocks having the same
location index. As shown in Figure 4.6, block 4 is designated and its features are matched merely
with block 4 of database images. This method is limited because ROIs are not directly specified
by the user and regions are compared only at fixed locations. Multiple ROIs are also not
supported.
The technique given by Moghaddam et al. (2001) allows the user to select multiple ROIs and
to retrieve blocks at locations different from those of the ROIs. However, this method has high
time complexity as it requires comparison of all blocks within the query region. It also fails to
provide a detailed level of relative location similarity, as it simply tells whether blocks in the
target image are in the same location as the multiple ROIs in the query image.
Chan et al. (2008) suggested a ROI image retrieval method based on Color Variances Among
the Adjacent Objects (CVAAO) feature. CVAAO feature can describe principal pixel colors and
distinguish the objects with inconsistent contours. Furthermore, it is insensitive to scale, shift,
rotation, and distortion variations. Concerning the image querying aspect, the CVAAO based
ROI image retrieval method computes the location and size of the target region image RT in a
database image using the shape, area, and position of the largest object in the query region image
RQ, where RT is most similar to RQ. However, this method does not consider the relative
locations of ROIs in the retrieval process and hence is not suitable for multiple ROI based
retrieval.
Phase 1: The relative location of ROIs is calculated in the query image.
(a) Set the leftmost ROI of the query image as the basis ROI.
(b) Divide the image into four quadrants centered on the central coordinates of the basis ROI.
(c) Calculate the relative location of the other ROIs to determine which quadrant they lie in.
Do not calculate the location of any ROI already designated as a basis ROI.
(d) Choose the ROI closest to the basis ROI as the new basis ROI.
(e) Repeat (b) through (d) (number of ROIs − 1) times.
Phase 2: The results are compared with the target image.
(a) Calculate the location of blocks in the target image that are most similar
to the ROIs of the query image.
(b) Calculate the relative location of the blocks derived in (a) by order of basis ROI in
Phase 1.
(c) Compare the relative location of the blocks identified in (b) with that of the ROIs obtained in
Phase 1 to determine whether their relative location is the same.
(d) Increase the distance if the location differs.
Figure 4.2: Lee and Nang (2011) algorithm for finding relative locations in the case of
multiple ROI.
To incorporate relative locations of multiple ROIs, Lee and Nang (2011) have proposed a
similarity measure using comparative layouts of ROIs. This method divides an image into blocks
of fixed size, and the MPEG-7 [Bastan et al. (2010)] dominant color feature is used to represent
each block. To select blocks overlapping with the UDR, it has been suggested to prefer
overlapping blocks having a higher overlap area rather than some predefined threshold. To find
relative location, images are divided into coordinate planes with four quadrants centered on the
basis ROI to determine in which quadrants individual ROIs are located, as described in the
algorithm given in Figure 4.2. At this point, the similarity is weighted when the relative locations
of the ROIs in the query image and the target image are the same. This method fails to provide a
detailed level of relative location similarity, and considering relative location also increases
computation time and complexity. The method proposed in this chapter is based on region codes;
it deals with the problems identified in existing studies and provides an effective solution to
them.
4.3 Region Codes based Selective Region Matching
This section describes the proposed approach. The details of finding the region codes for
different regions in the image and the process of querying and retrieving based on these codes is
described in the following subsections.
4.3.1 Region Codes Assignment
The region codes were first used in the Cohen-Sutherland line clipping algorithm [Hearn and
Baker (2010)]. In the present work, we have further enhanced the scheme of region codes to
make it applicable in region based image retrieval. To find the region codes, all images are
divided into blocks of a fixed size (e.g. 3×3 and 5×5). Each block of the image is assigned a 4-bit
code depending on its spatial location relative to the central region as illustrated in Figure 4.3.
Figure 4.3: Example of an image and its corresponding region codes
Starting from the first lower order bit, the four bits in the region code specify the left, right,
bottom and top regions of the image respectively. For example, the region that lies to the top left
of the central region will have the region code 1001. As the middle region of the image generally
contains the most important details, it has been assigned the code 1111 as an exception, since its
direction cannot be decided and it must be included in all comparisons. The region code of each
region is determined by comparing its coordinates with the lower-left and upper-right corners of
the central region. The scheme of region codes can be easily extended to layouts of higher
dimensions (i.e. 5×5 and 7×7) by adding more bits to the region code for each direction. For
instance, region codes for a layout of size 5×5 will be of 8 bits (as shown in Figure 4.4), with 2
bits assigned to each direction. The designated bits for each direction can be named accordingly;
for example, the 2 bits assigned to the left direction distinguish the left and extreme-left positions
relative to the central region.
The proposed scheme works only for layouts of odd dimensions, as in that case the central
region can be coded uniquely. Region codes play an important role in finding the spatial
locations of different regions in the image with respect to the central region. These codes are
further used to filter out irrelevant regions before comparison with the query region.
10 00 00 10   10 00 00 01   10 00 00 00   10 00 01 00   10 00 10 00
01 00 00 10   01 00 00 01   01 00 00 00   01 00 01 00   01 00 10 00
00 00 00 10   00 00 00 01   11 11 11 11   00 00 01 00   00 00 10 00
00 01 00 10   00 01 00 01   00 01 00 00   00 01 01 00   00 01 10 00
00 10 00 10   00 10 00 01   00 10 00 00   00 10 01 00   00 10 10 00
Figure 4.4: Region code assignment for a layout of size 5×5
4.3.2 ROI Overlapping Blocks Selection
To support ROI image retrieval, a user must be permitted to query arbitrarily shaped regions.
Our approach supports varying sizes of ROI and multiple ROIs in the query image. Let Sb be the
uniform block size and Sr the size of the query region. If Sr <= Sb, the block containing the ROI,
together with its region code, is taken as the query region, provided the dominant color of the
ROI is the same as that of the block containing it. If this condition is violated, global matching is
preferred over region matching. However, if Sr > Sb, the system will find the dominant color of
all blocks overlapping with the ROI. The dominant color of the block having the highest overlap
with the ROI is taken as the reference color (D) and is compared with the color of all other ROI
overlapping blocks. Only those ROI overlapping blocks (ROBs) which have the same dominant
color as D are retained, together with their region codes, for final query formulation. The whole
process of selecting appropriate blocks is described in the algorithm below (Figure 4.5).
This scheme of ROI overlapping block selection is illustrated using Figure 4.6. To establish
the efficacy of our approach, we have compared it with the techniques given in [Lee and Nang
(2011), Prasad et al. (2004), Tian et al. (2000)]. As shown in Figure 4.6, an image is divided into
3×3 blocks, and the ROI is specified with ROI overlapping blocks 1, 2, 4, 5 and 7. If the feature
values of all ROBs are reflected, then all feature values of blocks 1, 2, 4, 5 and 7 are reflected in
the similarity computation. This method has the drawback that regions not overlapping with the
ROI are overly reflected. The scheme given by Prasad et al. (2004) reflects the ROI by the
proportion of its overlap on the respective blocks.
1) Compare the size of the ROI (Sr) and the predefined block size (Sb):
(i) If Sr << Sb, then perform global matching.
(ii) If Sr <= Sb, then select the whole block containing the ROI. Go to step 5.
(iii) If Sr > Sb, select all blocks overlapping with the user designated ROI and create a list.
2) Find the dominant color of the blocks selected in step 1 as detailed in section 3.6.1.
3) Determine the block having the largest overlapping area with the ROI from the list of blocks
created in step 1 and designate its dominant color D for comparison with the other ROBs.
4) Finally, select only those ROBs which have the same dominant color as D for comparison.
5) Compute the region codes of the selected blocks as described in section 4.3.1.
Figure 4.5: An algorithm to select ROBs
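The selection steps 3 and 4 of the algorithm above can be sketched as follows. This is a minimal illustration; the block ids, colors and overlap areas in the usage example are hypothetical, loosely following the scenario of Figure 4.6.

```python
def select_robs(roi_blocks, dominant_color, overlap_area):
    """Sketch of steps 3 and 4 of the ROB selection algorithm.

    roi_blocks:       block ids overlapping the user designated ROI
    dominant_color:   function mapping a block id to its dominant color
    overlap_area:     function mapping a block id to its ROI overlap area
    Returns the blocks whose dominant color matches that of the block
    with the largest ROI overlap (the reference color D).
    """
    reference = max(roi_blocks, key=overlap_area)        # step 3
    D = dominant_color(reference)
    return [b for b in roi_blocks if dominant_color(b) == D]  # step 4

# Hypothetical example in the spirit of Figure 4.6: block 4 has the
# largest overlap and is red, so only the red ROBs (1 and 4) survive.
colors = {1: "red", 2: "green", 4: "red", 5: "black", 7: "green"}
areas = {1: 0.3, 2: 0.2, 4: 0.9, 5: 0.4, 7: 0.1}
selected = select_robs([1, 2, 4, 5, 7], colors.get, areas.get)
```

Note how the predominantly black block 5 can never be selected, regardless of how large its overlap with the ROI is, because its dominant color differs from D.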
This method fully reflects the features of block 4, and the features of all other ROBs (1, 2, 5
and 7) are reflected according to their proportion of overlap. With this method, feature values
that are totally different from the ROI may be reflected when applying the feature values of
blocks overlapping with the ROI by proportion. Take block 5 as an example. Only a small part of
the block overlaps the ROI; the block as a whole is predominantly black, so its color is
determined as black. As a result, the feature values of the block are reflected by the proportion of
overlap for its "black" areas, which are completely different from the ROI.
Figure 4.6: ROI selected by the user
Lee and Nang (2011) suggested a method that reflects feature values depending on the
proportion of blocks overlapping with ROIs as in [Tian et al. (2000)], but selects only those
blocks whose regions overlapping with ROIs are greater than a threshold (20%). This approach
has the drawback that the appropriate threshold value may vary for different combinations of
query ROI and overlapping blocks; therefore, it is very difficult to set one value applicable to all
combinations. Using this technique in the scenario shown in Figure 4.6 may result in the
selection of block 5, which is predominantly black but whose area of overlap with the ROI is
greater than 20%. As a result, the feature values of the block are reflected by the proportion of
overlap for its "black" area, which is totally different from the red color of the ROI.
The technique presented here selects blocks having the same dominant color as the block with
the highest overlap with the ROI. In Figure 4.6, block 4 has the highest overlap and its dominant
color is red. The proposed scheme compares the color of block 4 with all other ROBs, i.e. blocks
1, 2, 5 and 7. Of these blocks, only block 1 has the same dominant color as block 4. Hence, only
blocks 1 and 4 are selected, and their features are reflected by their proportion of overlap
together with their region codes. There is no possibility of selecting block 5, since its dominant
color is not red like that of block 4. It is clear that the proposed approach selects only relevant
ROBs, which accurately reflect the features of the ROI.
4.3.3 Selective Region Matching based on Region Codes
The ROI can be specified manually or automatically in the query image. The ROI is then
converted into the corresponding ROBs in the layout. During similarity calculation, an ROB can
be compared with blocks in the target images in two ways: fixed-location matching and
all-blocks matching.
Fixed-location approaches [Prasad et al. (2004), Tian et al. (2000)] have some disadvantages
due to spatial dependency. For example, if a user is searching for an image with a tiger in the left
corner, then images with the tiger located in the right corner are difficult to retrieve. The
all-blocks matching algorithm can be employed to solve this problem. The user defined ROI
moves over the whole image, block by block, and thus all blocks of the target image are
compared with the query region. For every block, a similarity distance is recorded. The
minimum similarity distance is indexed as the output similarity distance for the image [Lee and
Nang (2011), Moghaddam et al. (2001)].
The computational complexity of the all-blocks matching approach [Lee and Nang (2011),
Moghaddam et al. (2001)] grows as O(n²) with increasing layout dimension n. The work in this
chapter reduces this complexity by reducing the number of comparisons without loss of
accuracy. Normally, a 3×3 layout offers a good tradeoff between image details and computation
complexity [Zhang et al. (2007)]. The proposed method compares a few, but not all, blocks
which are related to the initial position of the ROI in the layout. It works on the assumption that
the probability of finding the query region is higher in the parts of the database image where the
ROI is located and in its related adjacent locations. For example, if
the query object is located in the top region of an image, the probability of finding similar
regions in the top, top-left and top-right regions is high. To improve accuracy, the central region
should always be selected; therefore, it is assigned the code 1111. The proposed scheme only
compares those regions of database images having region codes similar to the query region.
Figure 4.7: Images showing region codes of different regions to be compared with query
region 1000
The similarity between region codes is determined by looking for region codes having a 1 in
the same bit positions as the query region code. This is determined by performing a logical AND
operation between the region codes: if the outcome of the AND operation is not 0000, the two
region codes are similar, as shown in the example below. As shown in Figure 4.7, if the code of
the region containing the ROI is 1000, then the system will compare its feature vector with
image regions having codes 1010, 1000, 1001 and 1111, i.e. the region codes having a 1 in the
same bit position as the region code of the query region.
1000 AND 0001 = 0000 (not similar)
1000 AND 1001 = 1000 (similar)
Further, if the region code of the query ROI is 1010 (i.e. the top-right region), then it will be
compared with region codes 1000, 1001, 1010, 0010, 0110 and 1111 (i.e. the top, right and
central regions). This approach also considers the relative positions of multiple ROIs. Comparing
features in this way reduces the total number of comparisons and improves the accuracy of the
system. In addition, logical operations are computationally more efficient than the corresponding
conditional operations.
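The AND-based filtering described above can be sketched as below. The block numbering and codes follow the 3×3 layout of Figure 4.3 in row-major order, which is an assumption consistent with the examples in the text.

```python
def similar_regions(query_code, block_codes):
    """Blocks whose region code shares a set bit with the query code.

    A single bitwise AND decides similarity: a non-zero result means
    the two codes have a 1 in the same position, so the block lies in
    a direction related to the query region.  The central block (code
    all ones) always passes.
    """
    return [b for b, code in block_codes.items() if code & query_code]

# 3x3 region codes (bit order: top, bottom, right, left), blocks 1..9
# numbered row by row as in Figure 4.6.
codes = {1: 0b1001, 2: 0b1000, 3: 0b1010, 4: 0b0001, 5: 0b1111,
         6: 0b0010, 7: 0b0101, 8: 0b0100, 9: 0b0110}
```

With a query code of 1000 (top region), only the top row and the central block survive the filter; the remaining five blocks are never compared, which is where the saving in comparisons comes from.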
4.3.4 Similarity Measure
In the present work, similarity is measured by comparing blocks overlapping with the ROI and
blocks in the target image having a similar region code. Similarity between region codes is
determined by looking for a 1 at the same bit positions. The similarity is measured by obtaining
the list of blocks Br in the query image corresponding to the ROIs and scanning the target image
n times by the unit of blocks to find the nearest block list for Br [Lee and Nang (2011)], where n
is the number of regions in the target image having region code similarity with Br. This can be
written as equation (4.1) [Lee and Nang (2011)].
D(Br, Ij) = min(LDi(Br, Ibij)), i = 1, ..., n,    (4.1)
where D(Br, Ij) measures the degree of similarity between Br and the target image, and Ij
represents the jth image of the image database. LDi(Br, Ibij) measures the distance between Br
and each block list (Ibij) in the target image (Ij); Ibij means the ith block list of the jth image that
corresponds to Br. In LDi(Br, Ibij), the similarity of blocks can be measured using different
similarity calculation methods depending on the property in use; in this work, it is defined as a
Euclidean distance measure. For computing D(Br, Ij), the smallest value is applied among the
distances calculated by scanning
blocks from the target image and comparing them n times. It should be noted that the value of n
is always less than the total number of blocks in the layout, which ensures fewer comparisons.
4.3.5 Multiple ROI based Retrieval
When more than one ROI is specified in the query image, some more parameters, like the
spatial locations of multiple ROIs and their relative locations, are also considered to reflect the
user's intent accurately. The proposed approach for a single ROI already implicitly reflects
relative locations, as comparisons are made only in the locations which are related to the location
of the query block. These related locations have been filtered further according to their relation
with the query block.
Figure 4.8: Multiple ROIs selected by the user
As shown in Figure 4.8, ROI-1 has region code 0001, hence it will be compared with blocks
1, 4, 5 and 7 of the database images, whereas ROI-2 has region code 1010 and therefore will be
compared with blocks 1, 2, 3, 5, 6 and 9 using the proposed region code based selective
matching approach. It is obvious that the relative locations of both ROIs are maintained as long
as the location of ROI-1 remains lower than or at the same level as the location of ROI-2. To
ensure this, blocks showing similarity at higher bit positions with respect to the region code of
the query block are given higher priority by introducing a parameter α in the similarity measure
calculations. The relative locations of both ROIs are the same if their region codes individually
match with the region
codes of blocks at the higher bit level. To implement this, the system will check the position of 1
in the result of AND operation of two region codes as shown below:
1010 AND 1001 = 1000 (high similarity)
1010 AND 0110 = 0010 (low similarity)
For instance, while matching with the ROI region code 1010, region codes 1000, 1001 and 1010
will be given high priority (i.e. setting α = 0) in comparison to region codes 0010, 0110 and
1111, since the former codes exhibit similarity at the fourth bit position whereas the latter exhibit
similarity at the second bit position.
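The α assignment just described can be sketched as follows. This is a minimal sketch under our own reading: only the highest bit of the AND result is treated as "high", and the text is ambiguous about how the central code 1111 is prioritized, so the sketch does not settle that case.

```python
def alpha(query_code, block_code, high_bit=3):
    """Penalty term for the similarity measure: 0 when the AND of the
    two region codes has a 1 at the high bit position (closely related
    relative location), 1 otherwise.  high_bit is the 0-based index of
    the bit treated as "high"; for 4-bit codes the fourth bit is index 3.
    """
    return 0 if (query_code & block_code) >> high_bit & 1 else 1
```

For the worked example above, alpha(0b1010, 0b1001) is 0 (AND = 1000, high similarity) while alpha(0b1010, 0b0110) is 1 (AND = 0010, low similarity).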
The scheme presented by Lee and Nang (2011) compares all locations with the ROI. In that
case, the feature of ROI-1 may match with any block in the target images; hence the computation
of relative locations requires a complex algorithm and high computation time. This technique
also fails to show a detailed level of relative location similarity, and it ignores some important
relative locations. For example, for the multiple ROI combination shown in Figure 4.8, if the
most similar block to ROI-1 in the target image is block 6, then the approach will find similar
relative locations for ROI-2 only at block 3. However, blocks 1 and 2, which are also in the top
region, will be ignored or given a higher distance.
The scheme presented in this chapter produces better results when applied to the same query
image ROI combination as in Figure 4.8. For instance, the system will compare the features of
ROI-1 with blocks 1, 4, 5 and 7 of the target image only, thereby ignoring block 6, whose
location is not related to the initial location of ROI-1. Moreover, the relative location for ROI-2
can be block 1, 2 or 3, whichever is closest to the initial configuration. Blocks 5, 6 and 9, which
are less closely related, are also reflected using a parameter α in the similarity measure
calculation. This can be summarized as:
RMD(R, Ij) = Σ(k = 1 to n) w (D(Brk, Ij) + α),    (4.2)
where RMD(R, Ij) calculates the degree of similarity between the query image's ROI
combination (R) and the jth image of the database (Ij). Here the degree of similarity is calculated
as the weighted sum of the distances of the individual ROIs. The parameter α is 0 if the relative
locations are the same, and 1 otherwise; the overall distance is therefore smaller when the
relative locations of multiple ROIs in the query and target images are identical. The weight w
can be assigned a value between 0 and 1, and n refers to the number of ROIs; Brk is the list of
blocks in the query image that corresponds to the kth ROI. D(Brk, Ij) is calculated using
equation (4.1) for each ROI.
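Equations (4.1) and (4.2) can be put into code as below. This is a sketch under stated assumptions: the placement of w and α follows our reconstruction of the garbled source equation, and the feature vectors in the usage example are hypothetical.

```python
import math


def euclidean(f1, f2):
    """Block feature distance; equation (4.1) uses a Euclidean measure."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f1, f2)))


def D(roi_feature, candidate_features):
    """Equation (4.1): smallest distance between an ROI's feature and
    the candidate block lists of a target image."""
    return min(euclidean(roi_feature, f) for f in candidate_features)


def RMD(roi_features, candidates_per_roi, alphas, w=0.5):
    """Equation (4.2): weighted sum over the n ROIs, with the penalty
    alpha_k = 0 when relative locations agree and 1 otherwise."""
    return sum(w * (D(f, cands) + a)
               for f, cands, a in zip(roi_features, candidates_per_roi, alphas))
```

For example, with a single ROI of feature (0, 0), candidate blocks (3, 4) and (6, 8), α = 1 and w = 1.0, the distance is 5.0 + 1 = 6.0; a matching relative location (α = 0) would score the lower, i.e. better, value 5.0.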
1) Determine the number of ROIs selected by the user.
2) Compute the region codes of all ROBs as given in the algorithm of Figure 4.5.
3) Perform a logical AND operation on the region codes computed in step 2 with the region
codes of all other blocks of the layout.
4) Create a list of the blocks in the layout which produce a non-zero result from the AND
operation of step 3.
5) Compare each ROB with the list of blocks created in step 4 using equation (4.1).
6) Increase the distance if the relative location differs slightly, i.e. the result of the AND
operation of the respective blocks does not contain a 1 at the higher bit position, by setting
appropriate values of the parameter α in equation (4.2).
7) Output the images in increasing order of their similarity score.
Figure 4.9: An algorithm to find relative locations of multiple ROIs
Since our approach inherently supports the notion of relative location, it consumes less
computation time and provides a detailed description of all relative locations with respect to the
user's multiple ROI combinations. The process of representing the relative locations of multiple
ROIs in the retrieval process is illustrated in the algorithm given in Figure 4.9.
4.4 Feature Extraction
Region based retrieval systems generally use small feature vectors because, if the number of
regions in the layout is high [Chen et al. (1999), Vu et al. (2003)], it takes a lot of time to
compare the features of all regions with the query region. The proposed approach requires only
selective comparisons; hence the length of the feature vector can be increased to represent the
region accurately without affecting the computation time. In the proposed technique, different
color and texture features can be used. The performance of the system may vary depending upon
the effectiveness of the descriptor used.
In this work, a color feature based on the dominant color has been used because it can
represent an image block in an effective, compact and intuitive way. To describe the texture
information, the local binary pattern is employed. The local binary pattern and its variants are
among the most well known texture descriptors and have been proved to be invariant to
monotonic gray level changes. Also, rotation invariant uniform local binary patterns have a low
feature dimension and have achieved high classification accuracies on representative texture
databases. The process of extracting features is explained in detail in the following subsections.
4.4.1 Dominant Color Extraction
Color is the most commonly used feature in image retrieval. Its 3-dimensional space makes its
discriminating power superior to the 1-dimensional gray level space. Commonly used color features
are color histogram, correlograms and Dominant Color Descriptor (DCD). Color histogram is
most commonly used representation but it does not include any spatial information. Color
correlograms describe the probability of finding a similar color pair at a fixed distance and
provide spatial information. DCD is a MPEG-7 color descriptor. DCD is described by the color
and its percentage. DCD can describe the principle colors of an image in a compact, precise and
intuitive manner. But for the DCD in MPEG-7, representative color extraction depends upon the
distribution of colors and greater part of the colors is from the higher distribution range with
smaller color distances. This may not match with the human perception as human eye cannot
exactly distinguish between closely related colors. To address this problem we have adopted an
effective scheme for dominant color extraction to address this problem. Before extracting the
color feature of an image, all pixels in a database image are categorized into K clusters using K-
means algorithm [Su and Chou (2001)]. The mean value of all the pixel colors in each cluster is
considered to be a color value in a color lookup table (Table 4.1). The color lookup table,
containing K different colors, is used as reference color palette for all images (including all
database images and query images). All images are quantized to these K colors in RGB color
space. For each pixel, the nearest of the K pre-defined colors is selected and stored as the new
pixel color. According to the literature and experimental results, the selection of color space is not a
critical issue for DCD extraction. Therefore, for simplicity and without loss of generality, the RGB
color space is employed in this work. The color distance C_D is calculated using the Euclidean
distance given in equation (4.3):

C_D = min_i sqrt( (R_P - R_iT)^2 + (G_P - G_iT)^2 + (B_P - B_iT)^2 ),   i = 1, ..., K        (4.3)

where R_P, G_P, B_P are the red, green and blue components of the intensity value of the pixel, and
R_iT, G_iT, B_iT are the corresponding components of the i-th color entry in the table.
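The nearest-palette assignment of equation (4.3) can be sketched as follows. The three-entry palette here is a hypothetical subset for illustration only; the chapter's Table 4.1 uses K = 25 reference colors obtained by K-means clustering.

```python
import numpy as np

# Hypothetical 3-entry palette for illustration; the chapter uses K = 25
# colors (Table 4.1) obtained by K-means clustering of database pixels.
palette = np.array([[0, 0, 0],          # black
                    [219, 73, 0],       # red
                    [255, 255, 255]])   # white

def quantize_to_palette(image, palette):
    """Map every pixel to its nearest palette color (equation 4.3).

    image: H x W x 3 uint8 RGB array; palette: K x 3 array.
    """
    pixels = image.reshape(-1, 3).astype(np.float64)
    # Squared Euclidean distance from each pixel to each palette entry.
    dists = ((pixels[:, None, :] - palette[None, :, :]) ** 2).sum(axis=2)
    nearest = dists.argmin(axis=1)      # index of the closest palette color
    return palette[nearest].reshape(image.shape).astype(np.uint8)
```

Since only the minimizing index matters, the square root in equation (4.3) can be omitted in the implementation.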
Table 4.1 Color lookup table

S. No.  Color         R    G    B
1       Black         0    0    0
2       Sea green     0    182  0
3       Light green   0    255  170
4       Olive green   36   73   0
5       Aqua          36   146  170
6       Bright green  36   255  0
7       Blue          73   36   170
8       Green         73   146  0
9       Turquoise     73   219  170
10      Brown         109  36   0
11      Blue gray     109  109  170
12      Lime          109  219  0
13      Lavender      146  0    170
14      Plum          146  109  0
15      Teal          146  182  170
16      Dark red      182  0    0
17      Magenta       182  73   170
18      Yellow green  182  182  0
19      Fluoro green  182  255  170
20      Red           219  73   0
21      Rose          219  146  170
22      Yellow        219  255  0
23      Pink          255  36   170
24      Orange        255  146  0
25      White         255  255  255
Images are then divided into blocks. For each block, the percentage of pixels of each color is
determined. The color having the highest percentage is the dominant color of the block. The three-dimensional
dominant color together with its percentage is stored as the color feature for each
block. The DCD feature can be represented as:

F_D = (V_i, P_i),   i = 1, ..., N        (4.4)

where N is the total number of dominant colors, V_i is the 3-D color vector and P_i is the
percentage of the dominant color.
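The per-block dominant color extraction of equation (4.4) can be sketched as follows, assuming the image has already been quantized to the reference palette; the function names are illustrative, not from the original work.

```python
import numpy as np
from collections import Counter

def block_dominant_color(quantized_block):
    """Return (dominant color V_i, percentage P_i) for one quantized block."""
    pixels = [tuple(p) for p in quantized_block.reshape(-1, 3)]
    color, count = Counter(pixels).most_common(1)[0]
    return np.array(color), count / len(pixels)

def dcd_feature(quantized_image, block_size):
    """Split the image into non-overlapping blocks and collect (V_i, P_i) pairs."""
    h, w = quantized_image.shape[:2]
    feats = []
    for r in range(0, h - block_size + 1, block_size):
        for c in range(0, w - block_size + 1, block_size):
            block = quantized_image[r:r + block_size, c:c + block_size]
            feats.append(block_dominant_color(block))
    return feats
```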
4.4.2 Local Binary Pattern based Texture Features
There is no precise definition of texture. However, one can define texture as the visual
pattern having the properties of homogeneity that do not result from the presence of only a single
color or intensity. Texture plays an important role in describing innate surface properties of an
object and its relationship with the surrounding regions. Many texture feature
extraction techniques have been proposed. A majority of these are based on
statistical analysis of the pixel distribution, while others are based on the local binary pattern. The
representative statistical methods are Gray Level Co-occurrence Matrix (GLCM) [Haralick et al.
(1973)], Markov Random Field (MRF) model, Simultaneous Auto-Regressive (SAR) model,
Wold decomposition model, Edge Histogram Descriptor (EHD) and wavelet moments.
LBP based methods [Backs et al. (2013), Zhu and Ang (2012)] have gained more
popularity for their simplicity and higher performance results on representative texture databases.
LBP is invariant to monotonic gray level changes. Therefore, we have adopted local binary
pattern to represent the texture. Local binary pattern descriptors were proposed by Ojala et al.
(2002) for texture classification and retrieval. Given a pixel, the LBP code is calculated by
comparing it with its neighbours as:
LBP_P,R = sum_{p=0}^{P-1} s(g_p - g_c) 2^p,   s(x) = 1 if x >= 0, 0 otherwise        (4.5)

where g_c is the gray value of the centre pixel, g_p is the gray value of the p-th neighbouring pixel,
P is the total number of neighbours and R is the radius of the neighbourhood. After the LBP
code is generated for each pixel in the image, the histogram of LBP codes is used to represent the texture image.
To reduce the feature dimension simple LBPs are further extended to uniform LBPs. The
uniform patterns are extracted from LBP codes such that they have limited discontinuities (≤ 2)
in the circular binary representation. In general, the uniform LBP has P(P - 1) + 3
distinct output values.
To further reduce the feature dimension and achieve rotation invariance, uniform patterns
are reduced to rotation-invariant uniform LBP codes, defined as:

LBP_P,R^riu2 = sum_{p=0}^{P-1} s(g_p - g_c),   if U(LBP_P,R) <= 2
             = P + 1,                          otherwise        (4.6)

where U is the number of spatial (0/1) transitions in the pattern. The mapping from
LBP_P,R to LBP_P,R^riu2, which has P + 2 distinct output values, can be implemented with a lookup
table. In this work, the histogram of LBP_P,R^riu2 with P = 8 and R = 1 is used as the texture feature vector to
represent each block of the layout.
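A minimal sketch of the riu2 histogram of equations (4.5) and (4.6) for P = 8, R = 1, using the eight immediate neighbours (interior pixels only, no circular interpolation):

```python
import numpy as np

def lbp_riu2_histogram(gray, P=8):
    """Rotation-invariant uniform LBP histogram (P + 2 bins) for R = 1."""
    g = gray.astype(np.int32)
    c = g[1:-1, 1:-1]                        # centre pixels (interior only)
    # Neighbour offsets in circular order around the centre pixel.
    offs = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
            (1, 1), (1, 0), (1, -1), (0, -1)]
    bits = np.stack([(g[1 + dr:g.shape[0] - 1 + dr,
                        1 + dc:g.shape[1] - 1 + dc] >= c).astype(np.int32)
                     for dr, dc in offs])    # shape (P, H-2, W-2)
    # U: number of 0/1 transitions in the circular binary pattern.
    transitions = (bits != np.roll(bits, 1, axis=0)).sum(axis=0)
    ones = bits.sum(axis=0)
    # Uniform patterns map to their bit count; all others go to bin P + 1.
    codes = np.where(transitions <= 2, ones, P + 1)
    hist = np.bincount(codes.ravel(), minlength=P + 2)
    return hist / hist.sum()
```

In practice the same result is obtained faster with a 256-entry lookup table mapping each 8-bit LBP code to its riu2 bin, as noted above.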
4.5 Experimental Results
To evaluate the performance of the descriptors, experiments are performed on two image
databases: MPEG-7 CCD database (dataset-1) and Corel-10000 database (dataset-2). Averaged
Normalized Modified Retrieval Rank (ANMRR) is employed to evaluate the performance of the
image retrieval system for both databases. ANMRR not only determines whether a correct answer
is found in the retrieval results but also accounts for the rank of that answer within them.
A lower ANMRR value represents better performance. The Common Color
Dataset (MPEG-7 CCD) contains approximately 5000 images and a set of 50 Common Color
Queries (CCQ). Each query is specified by a set of ground-truth images. The CCD contains
images originating from consecutive frames of television shows, newscasts and sports shows. The
Corel database (Dataset-2) contains 10,000 natural images of 100 categories including butterfly,
beach, car, fish, flower, door, sunset etc. Each category contains 100 images. Some query images
from both datasets are shown in Figure 4.10.
Figure 4.10: Query examples in (a) MPEG-7 CCD database (Dataset-1) (b) Corel-10000
database (Dataset-2)
In our experiments on dataset-1, queries and ground truths proposed in the MIRROR image
retrieval system [Wong et al. (2005)] are used. For each of the 50 CCQ images, we have
computed the precision-recall percentage. This data is further used in computing the mean
precision-recall pair. In the Corel-10000 dataset, 50 categories are randomly selected and 10 images
from each category are used for querying. Then the mean precision-recall pair is computed.
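The ANMRR measure used above can be sketched as follows. The choice K(q) = 2·NG(q) is a simplification of the MPEG-7 rule K(q) = min(4·NG(q), 2·GTM); ranks are 1-based and a miss is passed as None.

```python
def nmrr(ranks_of_ground_truth, ng, k=None):
    """Normalized Modified Retrieval Rank for one query (MPEG-7 style).

    ranks_of_ground_truth: retrieval ranks of the NG ground-truth images
    (None for a ground-truth image that was not retrieved at all).
    """
    if k is None:
        k = 2 * ng  # simplified K(q); MPEG-7 uses min(4*NG(q), 2*GTM)
    # Misses and ranks beyond K are penalized with 1.25 * K.
    counted = [r if (r is not None and r <= k) else 1.25 * k
               for r in ranks_of_ground_truth]
    avr = sum(counted) / ng                  # average rank
    mrr = avr - 0.5 - ng / 2                 # modified retrieval rank
    return mrr / (1.25 * k - 0.5 - ng / 2)   # normalize to [0, 1]

def anmrr(per_query_ranks):
    """Average NMRR over all queries: lower is better."""
    return sum(nmrr(r, len(r)) for r in per_query_ranks) / len(per_query_ranks)
```

A perfect retrieval gives NMRR 0 and a complete miss gives 1, so averaging over queries keeps ANMRR in [0, 1].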
To determine the effect of block size on retrieval performance, we performed experiments by
varying the block sizes and explored the retrieval accuracy of the proposed method. Figure 4.11
shows the performance difference using different block sizes on datasets 1 and 2. According to
the experiments, the retrieval performance for block sizes 5×5 and 7×7 is almost the same, and
both are higher than for the 3×3 block size. Moreover, the larger block size (7×7) may increase
retrieval time; therefore, block sizes of 3×3 and 5×5 are considered in this work.
Figure 4.11: Comparison of average precision using different block sizes on (a) dataset-1
(b) dataset-2
4.5.1 Comparison of ROI Overlapping Block Selection Methods
There are various measures to select blocks overlapping with ROIs. First, the proportions of
all ROI overlapping blocks may be reflected as suggested by Tian et al. (2000). Second, features
of the block having the highest area of overlap with ROI are only reflected [Prasad et al. (2004)].
Third, blocks whose overlapping areas do not exceed the threshold may be ignored [Lee and
Nang (2011)]. Fourth, all the feature values of blocks that overlap with ROIs may be reflected.
Lastly, only the ROBs having the same dominant color as the ROB with the largest area of
overlap with the ROIs are reflected, as suggested in this chapter. Table 4.2 shows the results of
the retrieval performance comparison of the different methods in terms of ANMRR.
In Table 4.2, the proposed method shows better retrieval performance than the other
approaches, because the blocks it selects fully reflect the properties of the core areas while
ignoring blocks weakly correlated with the ROIs.

Table 4.2 Comparison of retrieval performance by ROI-overlapping block selection
method on dataset-1 and dataset-2

Database   All     Tian et al. (2000)   Prasad et al. (2004)   Lee & Nang (2011)   Proposed
Dataset-1  0.586   0.517                0.492                  0.452               0.358
Dataset-2  0.685   0.632                0.573                  0.502               0.425
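The last of the selection rules compared above, keeping only the ROBs that share the dominant color of the block with the largest ROI overlap, can be sketched as follows; the dictionary keys are illustrative, not from the original work.

```python
def select_robs(blocks):
    """Select ROI-overlapping blocks by dominant-color agreement (a sketch).

    blocks: list of dicts with hypothetical keys 'overlap' (area of
    overlap with the ROI) and 'color' (dominant color index of the block).
    """
    # Anchor: the block having the largest area of overlap with the ROI.
    anchor = max(blocks, key=lambda b: b['overlap'])
    # Keep only blocks whose dominant color matches the anchor's.
    return [b for b in blocks if b['color'] == anchor['color']]
```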
4.5.2 Comparison of Multiple ROI based Image Retrieval Methods
The multiple-ROI image retrieval experiments compare four approaches: (a) comparing blocks at the same
location alone to measure the degree of similarity, as in [Prasad et al. (2004)]; (b) merely
examining whether the locations of the ROIs are the same, as in [Moghaddam et al. (2001)]; (c)
reflecting the relative location of ROIs, as suggested in [Lee and Nang (2011)]; and (d) reflecting
relative locations as suggested in the present chapter. All four types of experiments are
performed to compare their retrieval performance.
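The idea of reflecting relative locations can be illustrated with a coarse, hypothetical sketch that compares the direction between each pair of ROI centroids; the chapter's actual scheme uses region codes, which capture relative location at a more detailed level.

```python
def relative_layout(rois):
    """Pairwise relative directions of ROI centroids (hypothetical sketch).

    rois: list of (x, y) centroids. For each ordered pair, record the signs
    of the x and y displacement: a coarse stand-in for region-code matching.
    """
    sign = lambda v: (v > 0) - (v < 0)
    return {(i, j): (sign(xj - xi), sign(yj - yi))
            for i, (xi, yi) in enumerate(rois)
            for j, (xj, yj) in enumerate(rois) if i != j}

def same_relative_layout(query_rois, target_rois):
    """True when every ROI pair keeps the same relative direction."""
    return relative_layout(query_rois) == relative_layout(target_rois)
```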
Table 4.3 ANMRR obtained from different methods on dataset-1 and dataset-2

Database   Prasad et al. (2004)   Moghaddam et al. (2001)   Chan et al. (2008)   Lee & Nang (2011)   Proposed
Dataset-1  0.563                  0.528                     0.483                0.415               0.347
Dataset-2  0.672                  0.614                     0.562                0.529               0.436

In Table 4.3, retrieval based on the proposed method shows the best performance, which is
66.2% and 57.2% better than fixed-location retrieval [Prasad et al. (2004)] on dataset-1 and
dataset-2 respectively. The performance achieved is 52.1% and 43.4% better than [Moghaddam
et al. (2001)] on dataset-1 and dataset-2 respectively, and 19.6% and 23.5%
better than the method proposed in [Lee and Nang (2011)] on the two datasets.
The CVAAO-based method [Chan et al. (2008)] does not consider the relative
locations of multiple ROIs in retrieval and hence has lower performance than our approach. This
shows that the proposed method performs better than the other methods.
Figure 4.12 shows the performance comparison of the different algorithms based on average
precision-recall graphs. It is evident from Figure 4.12 (a) that the proposed algorithm has better
average precision over different recall points for dataset-1. The method is compared with other
methods based on fixed-location matching [Chan et al. (2008), Moghaddam et al. (2001), Prasad
et al. (2004)] and relative-location matching [Lee and Nang (2011)]. The techniques presented in
[Moghaddam et al. (2001), Prasad et al. (2004)] compare regions from a fixed location depending
upon the locations of the ROIs in the query image. The approach of Lee and Nang (2011)
considers relative locations but requires a complex algorithm and fails to provide a detailed level
of relative location similarity. The approach given in [Chan et al. (2008)] works well for a single
ROI but is not suitable for retrieval based on multiple ROIs.
The method presented in this chapter considers multiple ROIs with a more detailed level of
relative location similarity and covers all relevant parts of an image for comparison using a
region-code-based matching scheme.
Figure 4.12: Interpolated P-R graphs comparing different methods on (a) dataset-1 (b)
dataset-2
In addition, the use of well-known color and texture features has boosted the performance of the
proposed scheme. Figure 4.12 (b) shows that the performance of the proposed scheme on
dataset-2 is better than that of the other methods. The average precision values of the proposed
method for the top 10 retrieved images are 15%, 8%, 18% and 27% higher than the methods
described in [Chan et al. (2008), Lee and Nang (2011), Moghaddam et al. (2001)] and [Prasad et
al. (2004)] respectively. Therefore, the proposed method outperforms the other methods.
Figure 4.13 (a) shows the retrieval performance comparison of the proposed method with
some state-of-the-art region based retrieval methods using color, texture and their combinations
as feature descriptor on dataset-1.
Figure 4.13: Performance comparison of average precision-recall graphs on (a) dataset-1
(b) dataset-2
The experimental results clearly show that the proposed method is superior in performance to the
other methods [Saykol et al. (2005), Luo et al. (2006), Liu et al. (2011)]. The average precision
values of the proposed method for the top 10 retrieved images are 23%, 16% and 7% higher than
the techniques given in [Saykol et al. (2005), Luo et al. (2006)] and [Liu et al. (2011)] respectively.
The better results are mainly due to the use of region codes and an effective feature set. Liu et al. (2011)
presented an MSD-based technique. The MSD is defined by the underlying colors in micro-structures
with similar edge orientation, and the feature vector is extracted from the edge orientation.
The MSD can represent color, texture and shape information effectively, but does not represent
the spatial location of different objects in the layout; the scheme also fails to capture the global
information of the image. The CECH-based descriptor [Luo et al. (2006)] can represent the color and
shape information of objects, but it lacks texture features, and relative locations are not
considered in the retrieval process.
The histogram-based representation of color and shape of Saykol et al. (2005) also lacks texture
information and is not suitable for retrieval based on multiple ROIs. The proposed scheme uses the
dominant color and hence can extract the representative colors of an image in an effective way. The
use of the rotation-invariant uniform LBP histogram makes it more robust to monotonic gray-level
changes and invariant to geometric transformations. The LBP can also detect various structures
in a local region, such as points, edges and circles. In addition, the use of region codes helps in finding
images having high spatial-location similarity. All this makes the proposed method outperform
the other methods. Figure 4.13 (b) shows that the performance of the proposed scheme on
dataset-2 is also better than that of the other methods. Therefore, we can conclude that the
proposed method outperforms the other three methods on dataset-1 and dataset-2.
Figure 4.14: Performance comparison in terms of retrieval time
Figure 4.15: Retrieval results for example query (a) Prasad et al. (2004) (b) Moghaddam et
al. (2001) (c) Chan et al. (2008) (d) Lee & Nang (2011) (e) Proposed method
Figure 4.14 gives a comparison of the average retrieval time of the various methods. The
proposed method has the best retrieval time among all, as it considers relative location inherently
through region codes. The actual results of multiple ROI-based image retrieval are shown in Figure
4.15, which indicates that the proposed method retrieves the largest number of images similar to
the ROIs.
The above experiments show that in multiple ROI-based image retrieval, searching locations
other than those designated as ROIs and considering the relative locations of the ROIs improve the
efficiency of retrieval and better reflect the user's intent.
4.6 Conclusion
In this chapter, a novel scheme for image retrieval based on region codes of ROIs is proposed.
The use of region codes enables the user to define an ROI of arbitrary size and further helps in
narrowing down the search range, resulting in increased accuracy and reduced computation time.
The spatial locations of different regions are also specified using these codes. The present
technique strikes a balance between fixed-location matching and all-blocks matching by
comparing a few blocks that can reflect the user requirement in a satisfactory way. Region
codes, while being computationally efficient, are also effective in finding the detailed level
of relative-location similarity between multiple ROIs in the query and target images. To further
improve the efficiency, an effective feature set consisting of the dominant color and the local binary
pattern is used to represent the image. Experimental results have shown that the proposed
method produces better results while consuming less computation time. The work can be
extended by considering regions partially overlapping the ROI and by using a more effective
feature set to represent the various regions in the images.