CHAPTER - 4
Image Retrieval Based on Region of Interest
4.1. Introduction
Content Based Image Retrieval (CBIR) is the browsing, searching and navigation of images from
large image databases based on their visual content. CBIR has been an active area of research for
more than a decade. Many CBIR systems have been developed, such as QBIC [Flickner et al.
(1995)], SIMPLIcity [Wang et al. (2001)], and Blobworld [Carson et al. (2002)]. A detailed survey
of CBIR techniques can be found in [Lew et al. (2006), Vassilieva (2009), Pradnya and Vikhar
(2010)]. Traditional CBIR systems use low level features like color, texture, shape and spatial
location of objects to index and retrieve images from databases. Low level features can be global
or local (region based). Global feature based CBIR systems [Ciocca et al. (2011), Lin et al.
(2009), Wang et al. (2011), Zhiwen et al. (2012)] fail to compare the regions or objects in which
a user may be interested. Therefore, Region Based Image Retrieval (RBIR) has been shown in
the literature to be more effective at reflecting user requirements.
A typical query image consists of both relevant and irrelevant regions. The irrelevant regions
limit the effectiveness of existing content based image retrieval systems. Irrelevant regions can
be removed by defining an ROI in the query image. Depending upon the method of formulating
the region based query, RBIR can be categorized into two classes: (1) System Designated ROI
(SDR) and (2) User Designated ROI (UDR). These systems have various ways of querying and
representing images in the database.
The UDR approach seems more promising as it enables the user to express his intent in the
query formulation. In UDR approaches, however, it is difficult to formulate the query accurately
when there are variations in the sizes of ROI [Tian et al. (2000), Zhang et al. (2007)]. This
chapter presents a method to select ROI overlapping blocks based upon the features and
overlapping areas of the ROI. Furthermore, if multiple ROIs are selected by the user, it is
beneficial to consider the relative locations of the ROIs in the image. For instance, if the user is
interested in three regions as shown in Figure 4.1, then the relative locations of the ROIs in the
image must be taken into account.
Figure 4.1: Comparison of locations in multiple ROI (a) Query Image (b) Target image # 1
(c) Target image #2
Target image 4.1(b) is slightly different from query image (a) but the location of ROIs—the bird
at the top, the car in the middle and the person at the bottom—is the same. In target image 4.1(c),
on the other hand, the locations of the bird and the person are switched. In this case, the user’s
intent for retrieval is not realized, so target image 4.1(c) should be excluded from the retrieval
results or be given a lower priority. Therefore, in multiple ROI based retrieval, it is crucial to
consider the relative locations of ROIs so that the user’s intent can be fully reflected.
Only the works of Moghaddam et al. (2001) and Lee and Nang (2011) have reported solutions
to the problem of finding relative locations of ROIs. Moreover, these methods fail to
give a detailed level of relative location similarity. Considering the relative locations of several
ROIs in retrieval techniques requires complex algorithms and increases computation time. To
overcome this problem, this chapter presents a more effective method based on region codes,
which inherently supports the notion of relative locations of multiple ROIs and results in less
computation time.
The major contributions of the work presented in this chapter are: first, the use of region codes
to reduce overall computation time without affecting retrieval accuracy; second, an efficient
technique for ROI overlapping block selection; third, a method to find similarity while
considering the relative locations of multiple ROIs; and lastly, an effective combination of
features for ROI image retrieval. The experimental results show that the proposed method can
meet the requirements of a user more accurately while consuming less time in comparison to
existing methods.
4.2. Related Work
Color and texture features have been extensively used in region based image retrieval.
Shrivastava and Tyagi (2012), Liu et al. (2011) have proposed a microstructure descriptor
(MSD). Microstructures are defined by edge orientation similarity with underlying colors, which
can effectively represent image features. Underlying colors are colors with similar edge
orientation. MSD integrates color, texture, shape and spatial layout information for effective
image retrieval. However, it lacks global properties of the image and is unable to utilize relation
between locations of different objects in the layout. Xingyuan and Zongyu (2013) have proposed
a more effective Structure Element Descriptor (SED) that combines color and texture
information. The histogram computed from SED in the HSV color space (quantized to 72 bins)
has been employed for image discrimination. The proposed global descriptor can represent the
spatial correlation of color and texture in the image. However, the descriptor is global and cannot
represent the region based properties of images. Also, the feature vector is long, and it does not
encode the spatial relation between different objects in the image.
Saykol et al. (2005) have proposed a histogram based scheme that combines color and shape
features. To extract shape, distance and angle histograms are used, and the color is encoded
using a quantized 166-bin histogram in HSV color space. Texture information is not used in the
retrieval. The scheme can identify shapes of individual objects but fails to capture spatial
relations between various objects in the image.
The color information of the image is also used for object detection using Color Co-
occurrence Histogram (CCH) [Chang and Krumm (1999)] and Color Edge Co-occurrence
Histogram (CECH) [Luo and Crandall (2006)]. CECH can represent the gross spatial
information about the layout but it is incapable of discerning subtle shape differences. This is
useful for handling object distortions but fails in classifying shapes having minor variations.
Apart from low level features, considering the spatial locations of different regions and their
relations in the image has also been shown in the literature to play an important role in
increasing the performance of a region based image retrieval system.
The approach of Hsiao et al. (2010) partitions images into five regions with fixed absolute
locations. To avoid noise during the local match, the system allows users to select the ROI from
the five segmented regions, and only the selected region is compared with the regions of other
images in the database.
In the technique presented in [Tian et al. (2000)], images are divided into blocks of a fixed
size (e.g. 2×2, 3×3, and 5×5). However, the size of the user designated ROI may be different
from predefined block size. This may result in an inaccurate representation of ROI features. To
address this problem, the authors have represented the features of blocks by their proportion of
overlap with the ROI in the calculation of the similarity measure. The main drawback of this
method is that it only compares blocks having the same spatial location as the ROI. Therefore,
blocks lying in locations different from that of the ROI are not retrieved.
The method proposed by Prasad et al. (2004) uses automatic specification of regions within
the image with the help of dominant color. The images are divided into blocks of size 3x3 and
each block is given a location index. The block having the largest overlap area with ROI is
designated and its feature vector is matched with database image's blocks having the same
location index. As shown in Figure 4.6, block 4 is designated and its features are matched merely
with block 4 of database images. This method is limited because ROIs are not directly specified
by the user and regions are compared only at fixed locations. Multiple ROIs are also not
supported.
The technique given by Moghaddam et al. (2001) allows the user to select multiple ROIs and
to retrieve blocks at locations different from those of the ROIs. However, this method has high
time complexity as it requires comparison of all blocks within the query region. It also fails to
provide a detailed level of relative location similarity, as it simply tells whether blocks in the
target image are in the same location as the multiple ROIs in the query image.
Chan et al. (2008) suggested a ROI image retrieval method based on Color Variances Among
the Adjacent Objects (CVAAO) feature. CVAAO feature can describe principal pixel colors and
distinguish the objects with inconsistent contours. Furthermore, it is insensitive to scale, shift,
rotation, and distortion variations. Concerning the image querying aspect, the CVAAO based
ROI image retrieval method computes the location and size of the target region image RT in a
database image using the shape, area, and position of the largest object in the query region image
RQ, where RT is most similar to RQ. However, this method does not consider the relative
locations of ROIs in the retrieval process and hence is not suitable for multiple ROI based
retrieval.
Phase 1: The relative location of ROIs is calculated in the query image.
(a) Set the leftmost ROI of the query image as the basis ROI.
(b) Divide the image into four quadrants centered on the central coordinates of the basis ROI.
(c) Calculate the relative location of the other ROIs to determine which quadrant they lie in.
Do not calculate the location of any ROI already designated as a basis ROI.
(d) Choose the ROI closest to the basis ROI as the new basis ROI.
(e) Repeat (b) through (d) (number of ROIs − 1) times.
Phase 2: The results are compared with the target image.
(a) Calculate the location of blocks in the target image that are most similar
to the ROIs of the query image.
(b) Calculate the relative location of the blocks derived in (a) by order of basis ROI in
Phase 1.
(c) Compare the relative location of the blocks identified in (b) with that of the ROIs obtained in
Phase 1 to determine whether their relative location is the same.
(d) Increase the distance if the location differs.
Figure 4.2: Lee and Nang (2011) algorithm for finding relative locations in the case of
multiple ROI.
To incorporate relative locations of multiple ROIs, Lee and Nang (2011) have proposed a
similarity measure using comparative layouts of ROIs. This method divides an image into blocks
of fixed size, and the MPEG-7 [Bastan et al. (2010)] dominant color feature is used to represent
each block. To select blocks overlapping with the UDR, it has been suggested to prefer
overlapping blocks having a higher overlap area rather than some predefined threshold. To find
relative location, images are divided into coordinate planes with four quadrants centered on the
basis ROI to determine in which quadrants individual ROIs are located, as described in the
algorithm given in Figure 4.2. At this point, the similarity is weighted when the relative locations
of the ROIs in the query image and the target image are the same. This method fails to provide a
detailed level of relative location similarity, and considering relative location also increases
computation time and complexity. The method proposed in this chapter is based on region codes;
it deals with the problems identified in existing studies and provides an effective solution to
them.
4.3 Region Codes based Selective Region Matching
This section describes the proposed approach. The details of finding the region codes for
different regions in the image and the process of querying and retrieving based on these codes is
described in the following subsections.
4.3.1 Region Codes Assignment
The region codes were first used in the Cohen-Sutherland line clipping algorithm [Hearn and
Baker (2010)]. In the present work, we have further enhanced the scheme of region codes to
make it applicable in region based image retrieval. To find the region codes, all images are
divided into blocks of a fixed size (e.g. 3×3 and 5×5). Each block of the image is assigned a 4-bit
code depending on its spatial location relative to the central region as illustrated in Figure 4.3.
Figure 4.3: Example of an image and its corresponding region codes
Starting from the first lower order bit, the four bits in the region code specify the left, right,
bottom and top regions of the image respectively. For example, the region that lies to the top left
of the central region will have the region code 1001. As the middle region of the image generally
contains the most important details, it has been assigned the code 1111 as an exception, since its
direction cannot be decided and it must be included in all comparisons. The region code of each
region is determined by comparing its coordinates with the lower-left and upper-right corners of
the central region. The scheme of region codes can be easily extended to layouts of higher
dimensions (i.e. 5×5 and 7×7) by adding more bits to the region code for each direction. For
instance, region codes for a layout of size 5×5 will be of 8 bits (as shown in Figure 4.4), with 2
bits assigned to each direction. The designated bits for each direction can be named accordingly;
for example, the 2 bits assigned to the left direction distinguish the left and extreme-left positions
relative to the central region.
The proposed scheme works only for layouts of odd dimensions, as in that case the central
region can be coded uniquely. Region codes play an important role in finding the spatial
locations of different regions in the image with respect to the central region. These codes are
further used to filter out irrelevant regions before comparison with the query region.
10 00 00 10   10 00 00 01   10 00 00 00   10 00 01 00   10 00 10 00
01 00 00 10   01 00 00 01   01 00 00 00   01 00 01 00   01 00 10 00
00 00 00 10   00 00 00 01   11 11 11 11   00 00 01 00   00 00 10 00
00 01 00 10   00 01 00 01   00 01 00 00   00 01 01 00   00 01 10 00
00 10 00 10   00 10 00 01   00 10 00 00   00 10 01 00   00 10 10 00
Figure 4.4: Region code assignment for a layout of size 5×5
4.3.2 ROI Overlapping Blocks Selection
To support ROI image retrieval, a user must be permitted to query arbitrarily shaped regions.
Our approach supports varying sizes of ROI and multiple ROIs in the query image. Let Sb be the
uniform block size and Sr the size of the query region. If Sr <= Sb, the block containing the ROI,
together with its region code, is taken as the query region, provided the dominant color of the
ROI is the same as that of the block containing it. If this condition is violated, global matching is
preferred over region matching. However, if Sr > Sb, the system will find the dominant color of
all blocks overlapping with the ROI. The dominant color of the block having the highest overlap
with the ROI is taken as the reference color (D) and is compared with the color of all other ROI
overlapping blocks. Only those ROI overlapping blocks (ROBs) which have the same dominant
color as D are retained, together with their region codes, for final query formulation. The whole
process of selecting appropriate blocks is described in the algorithm below (Figure 4.5).
This scheme of ROI overlapping block selection is illustrated using Figure 4.6. To establish
the efficacy of our approach, we have compared it with the techniques given in [Lee and Nang
(2011), Prasad et al. (2004), Tian et al. (2000)]. As shown in Figure 4.6, an image is divided into
3×3 blocks, and the ROI is specified with ROI overlapping blocks 1, 2, 4, 5 and 7. If the feature
values of all ROBs are reflected, then all feature values of blocks 1, 2, 4, 5 and 7 are reflected in
the similarity computation. This method has the drawback that regions not overlapping with the
ROI are overly reflected. The scheme given by Prasad et al. (2004) reflects the ROI by the
proportion of its overlap on the respective blocks.
1) Compare the size of the ROI (Sr) and the predefined block size (Sb):
(i) If Sr << Sb, then perform global matching.
(ii) If Sr <= Sb, then select the whole block containing the ROI. Go to step 5.
(iii) If Sr > Sb, select all blocks overlapping with the user designated ROI and create a list.
2) Find the dominant color of the blocks selected in step 1 as detailed in section 3.6.1.
3) Determine the block having the largest overlapping area with the ROI from the list of blocks
created in step 1 and designate its dominant color D for comparison with the other ROBs.
4) Finally, select only those ROBs which have the same dominant color as D for comparison.
5) Compute the region codes of the selected blocks as described in section 4.3.1.
Figure 4.5: An algorithm to select ROBs
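The selection steps 3 and 4 of the algorithm above can be sketched as follows. This is a minimal illustration; the block ids, colors and overlap areas in the usage example are hypothetical, loosely following the scenario of Figure 4.6.

```python
def select_robs(roi_blocks, dominant_color, overlap_area):
    """Sketch of steps 3 and 4 of the ROB selection algorithm.

    roi_blocks:       block ids overlapping the user designated ROI
    dominant_color:   function mapping a block id to its dominant color
    overlap_area:     function mapping a block id to its ROI overlap area
    Returns the blocks whose dominant color matches that of the block
    with the largest ROI overlap (the reference color D).
    """
    reference = max(roi_blocks, key=overlap_area)        # step 3
    D = dominant_color(reference)
    return [b for b in roi_blocks if dominant_color(b) == D]  # step 4

# Hypothetical example in the spirit of Figure 4.6: block 4 has the
# largest overlap and is red, so only the red ROBs (1 and 4) survive.
colors = {1: "red", 2: "green", 4: "red", 5: "black", 7: "green"}
areas = {1: 0.3, 2: 0.2, 4: 0.9, 5: 0.4, 7: 0.1}
selected = select_robs([1, 2, 4, 5, 7], colors.get, areas.get)
```

Note how the predominantly black block 5 can never be selected, regardless of how large its overlap with the ROI is, because its dominant color differs from D.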
This method fully reflects the features of block 4, and the features of all other ROBs (1, 2, 5
and 7) are reflected according to their proportion of overlap. With this method, feature values
that are totally different from the ROI may be reflected when applying the feature values of
blocks overlapping with the ROI by proportion. Take block 5 as an example. Only a small part of
the block overlaps the ROI; the block as a whole is predominantly black, so its color is
determined as black. As a result, the feature values of the block are reflected by the proportion of
overlap for its "black" areas, which are completely different from the ROI.
Figure 4.6: ROI selected by the user
Lee and Nang (2011) suggested a method that reflects feature values depending on the
proportion of blocks overlapping with ROIs as in [Tian et al. (2000)], but selects only those
blocks whose regions overlapping with ROIs are greater than a threshold (20%). This approach
has the drawback that the appropriate threshold value may vary for different combinations of
query ROI and overlapping blocks; therefore, it is very difficult to set one value applicable to all
combinations. Using this technique in the scenario shown in Figure 4.6 may result in the
selection of block 5, which is predominantly black but whose area of overlap with the ROI is
greater than 20%. As a result, the feature values of the block are reflected by the proportion of
overlap for its "black" area, which is totally different from the red color of the ROI.
The technique presented here selects blocks having the same dominant color as the block with
the highest overlap with the ROI. In Figure 4.6, block 4 has the highest overlap and its dominant
color is red. The proposed scheme compares the color of block 4 with all other ROBs, i.e. blocks
1, 2, 5 and 7. Of these blocks, only block 1 has the same dominant color as block 4. Hence, only
blocks 1 and 4 are selected, and their features are reflected by their proportion of overlap
together with their region codes. There is no possibility of selecting block 5, since its dominant
color is not red like that of block 4. It is clear that the proposed approach selects only relevant
ROBs, which accurately reflect the features of the ROI.
4.3.3 Selective Region Matching based on Region Codes
The ROI can be specified manually or automatically in the query image. The ROI is then
converted into the corresponding ROBs in the layout. During similarity calculation, an ROB can
be compared with blocks in the target images in two ways: fixed-location matching and
all-blocks matching.
Fixed-location approaches [Prasad et al. (2004), Tian et al. (2000)] have some disadvantages
due to spatial dependency. For example, if a user is searching for an image with a tiger in the left
corner, then images with the tiger located in the right corner are difficult to retrieve. The
all-blocks matching algorithm can be employed to solve this problem. The user defined ROI
moves over the whole image, block by block, and thus all blocks of the target image are
compared with the query region. For every block, a similarity distance is recorded. The
minimum similarity distance is indexed as the output similarity distance for the image [Lee and
Nang (2011), Moghaddam et al. (2001)].
The computational complexity of the all-blocks matching approach [Lee and Nang (2011),
Moghaddam et al. (2001)] grows as O(n²) with increasing layout dimension n. The work in this
chapter reduces this complexity by reducing the number of comparisons without loss of
accuracy. Normally, a 3×3 layout offers a good tradeoff between image details and computation
complexity [Zhang et al. (2007)]. The proposed method compares a few, but not all, blocks
which are related to the initial position of the ROI in the layout. It works on the assumption that
the probability of finding the query region is higher in the parts of the database image where the
ROI is located and in its related adjacent locations. For example, if
the query object is located in the top region of an image, the probability of finding similar
regions in the top, top-left and top-right regions is high. To improve accuracy, the central region
should always be selected; therefore, it is assigned the code 1111. The proposed scheme only
compares those regions of database images having region codes similar to the query region.
Figure 4.7: Images showing region codes of different regions to be compared with query
region 1000
The similarity between region codes is determined by looking for region codes having a 1 in
the same bit positions as the query region code. This is determined by performing a logical AND
operation between the region codes: if the outcome of the AND operation is not 0000, the two
region codes are similar, as shown in the example below. As shown in Figure 4.7, if the code of
the region containing the ROI is 1000, then the system will compare its feature vector with
image regions having codes 1010, 1000, 1001 and 1111, i.e. the region codes having a 1 in the
same bit position as the region code of the query region.
1000 AND 0001 = 0000 (not similar)
1000 AND 1001 = 1000 (similar)
Further, if the region code of the query ROI is 1010 (i.e. the top-right region), then it will be
compared with region codes 1000, 1001, 1010, 0010, 0110 and 1111 (i.e. the top, right and
central regions). This approach also considers the relative positions of multiple ROIs. Comparing
features in this way reduces the total number of comparisons and improves the accuracy of the
system. In addition, logical operations are computationally more efficient than the corresponding
conditional operations.
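The AND-based filtering described above can be sketched as below. The block numbering and codes follow the 3×3 layout of Figure 4.3 in row-major order, which is an assumption consistent with the examples in the text.

```python
def similar_regions(query_code, block_codes):
    """Blocks whose region code shares a set bit with the query code.

    A single bitwise AND decides similarity: a non-zero result means
    the two codes have a 1 in the same position, so the block lies in
    a direction related to the query region.  The central block (code
    all ones) always passes.
    """
    return [b for b, code in block_codes.items() if code & query_code]

# 3x3 region codes (bit order: top, bottom, right, left), blocks 1..9
# numbered row by row as in Figure 4.6.
codes = {1: 0b1001, 2: 0b1000, 3: 0b1010, 4: 0b0001, 5: 0b1111,
         6: 0b0010, 7: 0b0101, 8: 0b0100, 9: 0b0110}
```

With a query code of 1000 (top region), only the top row and the central block survive the filter; the remaining five blocks are never compared, which is where the saving in comparisons comes from.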
4.3.4 Similarity Measure
In the present work, similarity is measured by comparing blocks overlapping with the ROI and
blocks in the target image having a similar region code. Similarity between region codes is
determined by looking for a 1 at the same bit positions. The similarity is measured by obtaining
the list of blocks Br in the query image corresponding to the ROIs and scanning the target image
n times by the unit of blocks to find the nearest block list for Br [Lee and Nang (2011)], where n
is the number of regions in the target image having region code similarity with Br. This can be
written as equation (4.1) [Lee and Nang (2011)].
D(Br, Ij) = min(LDi(Br, Ibij)), i = 1, ..., n,    (4.1)
where D(Br, Ij) measures the degree of similarity between Br and the target image, and Ij
represents the jth image of the image database. LDi(Br, Ibij) measures the distance between Br
and each block list (Ibij) in the target image (Ij); Ibij means the ith block list of the jth image that
corresponds to Br. In LDi(Br, Ibij), the similarity of blocks can be measured using different
similarity calculation methods depending on the property in use; in this work, it is defined as a
Euclidean distance measure. For computing D(Br, Ij), the smallest value is applied among the
distances calculated by scanning
blocks from the target image and comparing them n times. It should be noted that the value of n
is always less than the total number of blocks in the layout, which ensures fewer comparisons.
4.3.5 Multiple ROI based Retrieval
When more than one ROI is specified in the query image, some more parameters, like the
spatial locations of multiple ROIs and their relative locations, are also considered to reflect the
user's intent accurately. The proposed approach for a single ROI already implicitly reflects
relative locations, as comparisons are made only in the locations which are related to the location
of the query block. These related locations have been filtered further according to their relation
with the query block.
Figure 4.8: Multiple ROIs selected by the user
As shown in Figure 4.8, ROI-1 has region code 0001, hence it will be compared with blocks
1, 4, 5 and 7 of the database images, whereas ROI-2 has region code 1010 and therefore will be
compared with blocks 1, 2, 3, 5, 6 and 9 using the proposed region code based selective
matching approach. It is obvious that the relative locations of both ROIs are maintained as long
as the location of ROI-1 remains lower than or at the same level as the location of ROI-2. To
ensure this, blocks showing similarity at higher bit positions with respect to the region code of
the query block are given higher priority by introducing a parameter α in the similarity measure
calculations. The relative locations of both ROIs are the same if their region codes individually
match with the region
codes of blocks at the higher bit level. To implement this, the system will check the position of 1
in the result of AND operation of two region codes as shown below:
1010 AND 1001 = 1000 (high similarity)
1010 AND 0110 = 0010 (low similarity)
For instance, while matching with the ROI region code 1010, region codes 1000, 1001 and 1010
will be given high priority (i.e. setting α = 0) in comparison to region codes 0010, 0110 and
1111, since the former codes exhibit similarity at the fourth bit position whereas the latter exhibit
similarity at the second bit position.
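The α assignment just described can be sketched as follows. This is a minimal sketch under our own reading: only the highest bit of the AND result is treated as "high", and the text is ambiguous about how the central code 1111 is prioritized, so the sketch does not settle that case.

```python
def alpha(query_code, block_code, high_bit=3):
    """Penalty term for the similarity measure: 0 when the AND of the
    two region codes has a 1 at the high bit position (closely related
    relative location), 1 otherwise.  high_bit is the 0-based index of
    the bit treated as "high"; for 4-bit codes the fourth bit is index 3.
    """
    return 0 if (query_code & block_code) >> high_bit & 1 else 1
```

For the worked example above, alpha(0b1010, 0b1001) is 0 (AND = 1000, high similarity) while alpha(0b1010, 0b0110) is 1 (AND = 0010, low similarity).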
The scheme presented by Lee and Nang (2011) compares all locations with the ROI. In that
case, the feature of ROI-1 may match with any block in the target images; hence the computation
of relative locations requires a complex algorithm and high computation time. This technique
also fails to show a detailed level of relative location similarity, and it ignores some important
relative locations. For example, for the multiple ROI combination shown in Figure 4.8, if the
most similar block to ROI-1 in the target image is block 6, then the approach will find similar
relative locations for ROI-2 only at block 3. However, blocks 1 and 2, which are also in the top
region, will be ignored or given a higher distance.
The scheme presented in this chapter produces better results when applied to the same query
image ROI combination as in Figure 4.8. For instance, the system will compare the features of
ROI-1 with blocks 1, 4, 5 and 7 of the target image only, thereby ignoring block 6, whose
location is not related to the initial location of ROI-1. Moreover, the relative location for ROI-2
can be block 1, 2 or 3, whichever is closest to the initial configuration. Blocks 5, 6 and 9, which
are less closely related, are also reflected using a parameter α in the similarity measure
calculation. This can be summarized as:
RMD(R, Ij) = Σ(k = 1 to n) w (D(Brk, Ij) + α),    (4.2)
where RMD(R, Ij) calculates the degree of similarity between the query image's ROI
combination (R) and the jth image of the database (Ij). Here the degree of similarity is calculated
as the weighted sum of the distances of the individual ROIs. The parameter α is 0 if the relative
locations are the same, and 1 otherwise; the overall distance is therefore smaller when the
relative locations of multiple ROIs in the query and target images are identical. The weight w
can be assigned a value between 0 and 1, and n refers to the number of ROIs; Brk is the list of
blocks in the query image that corresponds to the kth ROI. D(Brk, Ij) is calculated using
equation (4.1) for each ROI.
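Equations (4.1) and (4.2) can be put into code as below. This is a sketch under stated assumptions: the placement of w and α follows our reconstruction of the garbled source equation, and the feature vectors in the usage example are hypothetical.

```python
import math


def euclidean(f1, f2):
    """Block feature distance; equation (4.1) uses a Euclidean measure."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f1, f2)))


def D(roi_feature, candidate_features):
    """Equation (4.1): smallest distance between an ROI's feature and
    the candidate block lists of a target image."""
    return min(euclidean(roi_feature, f) for f in candidate_features)


def RMD(roi_features, candidates_per_roi, alphas, w=0.5):
    """Equation (4.2): weighted sum over the n ROIs, with the penalty
    alpha_k = 0 when relative locations agree and 1 otherwise."""
    return sum(w * (D(f, cands) + a)
               for f, cands, a in zip(roi_features, candidates_per_roi, alphas))
```

For example, with a single ROI of feature (0, 0), candidate blocks (3, 4) and (6, 8), α = 1 and w = 1.0, the distance is 5.0 + 1 = 6.0; a matching relative location (α = 0) would score the lower, i.e. better, value 5.0.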
1) Determine the number of ROIs selected by the user.
2) Compute the region codes of all ROBs as given in the algorithm of Figure 4.5.
3) Perform a logical AND operation on the region codes computed in step 2 with the region
codes of all other blocks of the layout.
4) Create a list of the blocks in the layout which produce a non-zero result from the AND
operation of step 3.
5) Compare each ROB with the list of blocks created in step 4 using equation (4.1).
6) Increase the distance if the relative location differs slightly, i.e. the result of the AND
operation of the respective blocks does not contain a 1 at the higher bit position, by setting
appropriate values of the parameter α in equation (4.2).
7) Output the images in increasing order of their similarity score.
Figure 4.9: An algorithm to find relative locations of multiple ROIs
Since our approach inherently supports the notion of relative location, it consumes less
computation time and provides a detailed description of all relative locations with respect to the
user's multiple ROI combinations. The process of representing the relative locations of multiple
ROIs in the retrieval process is illustrated in the algorithm given in Figure 4.9.
4.4 Feature Extraction
Region based retrieval systems generally use small feature vectors because, if the number of
regions in the layout is high [Chen et al. (1999), Vu et al. (2003)], it takes a lot of time to
compare the features of all regions with the query region. The proposed approach requires only
selective comparisons; hence the length of the feature vector can be increased to represent the
region accurately without affecting the computation time. In the proposed technique, different
color and texture features can be used. The performance of the system may vary depending upon
the effectiveness of the descriptor used.
In this work, a color feature based on the dominant color has been used because it can
represent an image block in an effective, compact and intuitive way. To describe the texture
information, the local binary pattern is employed. The local binary pattern and its variants are
among the most well known texture descriptors and have been proved to be invariant to
monotonic gray level changes. Also, rotation invariant uniform local binary patterns have a low
feature dimension and have achieved high classification accuracies on representative texture
databases. The process of extracting features is explained in detail in the following subsections.
4.4.1 Dominant Color Extraction
Color is the most commonly used feature in image retrieval. Its 3-dimensional space makes its
discriminating power superior to the 1-dimensional gray level space. Commonly used color features
are color histogram, correlograms and Dominant Color Descriptor (DCD). Color histogram is
most commonly used representation but it does not include any spatial information. Color
correlograms describe the probability of finding a similar color pair at a fixed distance and
provide spatial information. DCD is a MPEG-7 color descriptor. DCD is described by the color
and its percentage. DCD can describe the principle colors of an image in a compact, precise and
intuitive manner. But for the DCD in MPEG-7, representative color extraction depends upon the
distribution of colors and greater part of the colors is from the higher distribution range with
smaller color distances. This may not match with the human perception as human eye cannot
exactly distinguish between closely related colors. To address this problem we have adopted an
effective scheme for dominant color extraction to address this problem. Before extracting the
color feature of an image, all pixels in a database image are categorized into K clusters using K-
means algorithm [Su and Chou (2001)]. The mean value of all the pixel colors in each cluster is
considered to be a color value in a color lookup table (Table 4.1). The color lookup table,
containing K different colors, is used as reference color palette for all images (including all
database images and query images). All images are quantized to these K colors in RGB color
space. For each pixel, the nearest of the K pre-defined colors is selected and stored as the new
pixel color. According to the literature and experimental results, the selection of color space is not a
critical issue for DCD extraction. Therefore, for simplicity and without loss of generality, the RGB
color space is employed in this work. The color distance C_D is calculated using the Euclidean
distance given in equation (4.3):

C_D = min_i sqrt( (R_P - R_iT)^2 + (G_P - G_iT)^2 + (B_P - B_iT)^2 ),   i = 1, ..., K        (4.3)

where R_P, G_P, B_P are the red, green and blue components of the intensity value of the pixel, and
R_iT, G_iT, B_iT are the corresponding components of the i-th color entry in the table.
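The nearest-palette assignment of equation (4.3) can be sketched as follows. The three-entry palette here is a hypothetical subset for illustration only; the chapter's Table 4.1 uses K = 25 reference colors obtained by K-means clustering.

```python
import numpy as np

# Hypothetical 3-entry palette for illustration; the chapter uses K = 25
# colors (Table 4.1) obtained by K-means clustering of database pixels.
palette = np.array([[0, 0, 0],          # black
                    [219, 73, 0],       # red
                    [255, 255, 255]])   # white

def quantize_to_palette(image, palette):
    """Map every pixel to its nearest palette color (equation 4.3).

    image: H x W x 3 uint8 RGB array; palette: K x 3 array.
    """
    pixels = image.reshape(-1, 3).astype(np.float64)
    # Squared Euclidean distance from each pixel to each palette entry.
    dists = ((pixels[:, None, :] - palette[None, :, :]) ** 2).sum(axis=2)
    nearest = dists.argmin(axis=1)      # index of the closest palette color
    return palette[nearest].reshape(image.shape).astype(np.uint8)
```

Since only the minimizing index matters, the square root in equation (4.3) can be omitted in the implementation.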
Table 4.1 Color lookup table

S. No.  Color         R    G    B
1       Black         0    0    0
2       Sea green     0    182  0
3       Light green   0    255  170
4       Olive green   36   73   0
5       Aqua          36   146  170
6       Bright green  36   255  0
7       Blue          73   36   170
8       Green         73   146  0
9       Turquoise     73   219  170
10      Brown         109  36   0
11      Blue gray     109  109  170
12      Lime          109  219  0
13      Lavender      146  0    170
14      Plum          146  109  0
15      Teal          146  182  170
16      Dark red      182  0    0
17      Magenta       182  73   170
18      Yellow green  182  182  0
19      Fluoro green  182  255  170
20      Red           219  73   0
21      Rose          219  146  170
22      Yellow        219  255  0
23      Pink          255  36   170
24      Orange        255  146  0
25      White         255  255  255
Images are then divided into blocks. For each block, the percentage of pixels of each color is
determined. The color having the highest percentage is the dominant color of the block. The three-dimensional
dominant color together with its percentage is stored as the color feature for each
block. The DCD feature can be represented as:

F_D = (V_i, P_i),   i = 1, ..., N        (4.4)

where N is the total number of dominant colors, V_i is the 3-D color vector and P_i is the
percentage of the dominant color.
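The per-block dominant color extraction of equation (4.4) can be sketched as follows, assuming the image has already been quantized to the reference palette; the function names are illustrative, not from the original work.

```python
import numpy as np
from collections import Counter

def block_dominant_color(quantized_block):
    """Return (dominant color V_i, percentage P_i) for one quantized block."""
    pixels = [tuple(p) for p in quantized_block.reshape(-1, 3)]
    color, count = Counter(pixels).most_common(1)[0]
    return np.array(color), count / len(pixels)

def dcd_feature(quantized_image, block_size):
    """Split the image into non-overlapping blocks and collect (V_i, P_i) pairs."""
    h, w = quantized_image.shape[:2]
    feats = []
    for r in range(0, h - block_size + 1, block_size):
        for c in range(0, w - block_size + 1, block_size):
            block = quantized_image[r:r + block_size, c:c + block_size]
            feats.append(block_dominant_color(block))
    return feats
```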
4.4.2 Local Binary Pattern based Texture Features
There is no precise definition of texture. However, one can define texture as the visual
pattern having the properties of homogeneity that do not result from the presence of only a single
color or intensity. Texture plays an important role in describing innate surface properties of an
object and its relationship with the surrounding regions. Many texture feature
extraction techniques have been proposed. A majority of these are based on
statistical analysis of the pixel distribution, while others are based on the local binary pattern. The
representative statistical methods are Gray Level Co-occurrence Matrix (GLCM) [Haralick et al.
(1973)], Markov Random Field (MRF) model, Simultaneous Auto-Regressive (SAR) model,
Wold decomposition model, Edge Histogram Descriptor (EHD) and wavelet moments.
LBP based methods [Backs et al. (2013), Zhu and Ang (2012)] have gained more
popularity for their simplicity and higher performance results on representative texture databases.
LBP is invariant to monotonic gray level changes. Therefore, we have adopted local binary
pattern to represent the texture. Local binary pattern descriptors were proposed by Ojala et al.
(2002) for texture classification and retrieval. Given a pixel, the LBP code is calculated by
comparing it with its neighbours as:
LBP_P,R = sum_{p=0}^{P-1} s(g_p - g_c) 2^p,   s(x) = 1 if x >= 0, 0 otherwise        (4.5)

where g_c is the gray value of the centre pixel, g_p is the gray value of the p-th neighbouring pixel,
P is the total number of neighbours and R is the radius of the neighbourhood. After the LBP
code is generated for each pixel in the image, the histogram of LBP codes is used to represent the texture image.
To reduce the feature dimension simple LBPs are further extended to uniform LBPs. The
uniform patterns are extracted from LBP codes such that they have limited discontinuities (≤ 2)
in the circular binary representation. In general, the uniform LBP has P(P - 1) + 3
distinct output values.
To further reduce the feature dimension and achieve rotation invariance, uniform patterns
are reduced to rotation-invariant uniform LBP codes, defined as:

LBP_P,R^riu2 = sum_{p=0}^{P-1} s(g_p - g_c),   if U(LBP_P,R) <= 2
             = P + 1,                          otherwise        (4.6)

where U is the number of spatial (0/1) transitions in the pattern. The mapping from
LBP_P,R to LBP_P,R^riu2, which has P + 2 distinct output values, can be implemented with a lookup
table. In this work, the histogram of LBP_P,R^riu2 with P = 8 and R = 1 is used as the texture feature vector to
represent each block of the layout.
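A minimal sketch of the riu2 histogram of equations (4.5) and (4.6) for P = 8, R = 1, using the eight immediate neighbours (interior pixels only, no circular interpolation):

```python
import numpy as np

def lbp_riu2_histogram(gray, P=8):
    """Rotation-invariant uniform LBP histogram (P + 2 bins) for R = 1."""
    g = gray.astype(np.int32)
    c = g[1:-1, 1:-1]                        # centre pixels (interior only)
    # Neighbour offsets in circular order around the centre pixel.
    offs = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
            (1, 1), (1, 0), (1, -1), (0, -1)]
    bits = np.stack([(g[1 + dr:g.shape[0] - 1 + dr,
                        1 + dc:g.shape[1] - 1 + dc] >= c).astype(np.int32)
                     for dr, dc in offs])    # shape (P, H-2, W-2)
    # U: number of 0/1 transitions in the circular binary pattern.
    transitions = (bits != np.roll(bits, 1, axis=0)).sum(axis=0)
    ones = bits.sum(axis=0)
    # Uniform patterns map to their bit count; all others go to bin P + 1.
    codes = np.where(transitions <= 2, ones, P + 1)
    hist = np.bincount(codes.ravel(), minlength=P + 2)
    return hist / hist.sum()
```

In practice the same result is obtained faster with a 256-entry lookup table mapping each 8-bit LBP code to its riu2 bin, as noted above.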
4.5 Experimental Results
To evaluate the performance of the descriptors, experiments are performed on two image
databases: MPEG-7 CCD database (dataset-1) and Corel-10000 database (dataset-2). Averaged
Normalized Modified Retrieval Rank (ANMRR) is employed to evaluate the performance of the
image retrieval system for both databases. ANMRR not only determines whether a correct answer
is found in the retrieval results but also accounts for the rank of that answer within them.
A lower ANMRR value represents better performance. The Common Color
Dataset (MPEG-7 CCD) contains approximately 5000 images and a set of 50 Common Color
Queries (CCQ). Each query is specified by a set of ground-truth images. The CCD contains
images originating from consecutive frames of television shows, newscasts and sports shows. The
Corel database (Dataset-2) contains 10,000 natural images of 100 categories including butterfly,
beach, car, fish, flower, door, sunset etc. Each category contains 100 images. Some query images
from both datasets are shown in Figure 4.10.
Figure 4.10: Query examples in (a) MPEG-7 CCD database (Dataset-1) (b) Corel-10000
database (Dataset-2)
In our experiments on dataset-1, queries and ground truths proposed in the MIRROR image
retrieval system [Wong et al. (2005)] are used. For each of the 50 CCQ images, we have
computed the precision-recall percentage. This data is further used in computing the mean
precision-recall pair. In the Corel-10000 dataset, 50 categories are randomly selected and 10 images
from each category are used for querying. Then the mean precision-recall pair is computed.
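The ANMRR measure used above can be sketched as follows. The choice K(q) = 2·NG(q) is a simplification of the MPEG-7 rule K(q) = min(4·NG(q), 2·GTM); ranks are 1-based and a miss is passed as None.

```python
def nmrr(ranks_of_ground_truth, ng, k=None):
    """Normalized Modified Retrieval Rank for one query (MPEG-7 style).

    ranks_of_ground_truth: retrieval ranks of the NG ground-truth images
    (None for a ground-truth image that was not retrieved at all).
    """
    if k is None:
        k = 2 * ng  # simplified K(q); MPEG-7 uses min(4*NG(q), 2*GTM)
    # Misses and ranks beyond K are penalized with 1.25 * K.
    counted = [r if (r is not None and r <= k) else 1.25 * k
               for r in ranks_of_ground_truth]
    avr = sum(counted) / ng                  # average rank
    mrr = avr - 0.5 - ng / 2                 # modified retrieval rank
    return mrr / (1.25 * k - 0.5 - ng / 2)   # normalize to [0, 1]

def anmrr(per_query_ranks):
    """Average NMRR over all queries: lower is better."""
    return sum(nmrr(r, len(r)) for r in per_query_ranks) / len(per_query_ranks)
```

A perfect retrieval gives NMRR 0 and a complete miss gives 1, so averaging over queries keeps ANMRR in [0, 1].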
To determine the effect of block size on retrieval performance, we performed experiments by
varying the block sizes and explored the retrieval accuracy of the proposed method. Figure 4.11
shows the performance difference using different block sizes on datasets 1 and 2. According to
the experiments, the retrieval performance for block sizes 5×5 and 7×7 is almost the same, and
both are higher than for the 3×3 block size. Moreover, the larger block size (7×7) may increase
retrieval time; therefore, block sizes of 3×3 and 5×5 are considered in this work.
Figure 4.11: Comparison of average precision using different block sizes on (a) dataset-1
(b) dataset-2
4.5.1 Comparison of ROI Overlapping Block Selection Methods
There are various measures to select blocks overlapping with ROIs. First, the proportions of
all ROI overlapping blocks may be reflected as suggested by Tian et al. (2000). Second, features
of the block having the highest area of overlap with ROI are only reflected [Prasad et al. (2004)].
Third, blocks whose overlapping areas do not exceed the threshold may be ignored [Lee and
Nang (2011)]. Fourth, all the feature values of blocks that overlap with ROIs may be reflected.
Lastly, only the ROBs having the same dominant color as the ROB with the largest area of
overlap with the ROIs are reflected, as suggested in this chapter. Table 4.2 shows the results of
the retrieval performance comparison of the different methods in terms of ANMRR.
In Table 4.2, the proposed method shows better retrieval performance than the other
approaches, because the blocks it selects fully reflect the properties of the core areas while
ignoring blocks weakly correlated with the ROIs.

Table 4.2 Comparison of retrieval performance by ROI-overlapping block selection
method on dataset-1 and dataset-2

Database   All     Tian et al. (2000)   Prasad et al. (2004)   Lee & Nang (2011)   Proposed
Dataset-1  0.586   0.517                0.492                  0.452               0.358
Dataset-2  0.685   0.632                0.573                  0.502               0.425
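The last of the selection rules compared above, keeping only the ROBs that share the dominant color of the block with the largest ROI overlap, can be sketched as follows; the dictionary keys are illustrative, not from the original work.

```python
def select_robs(blocks):
    """Select ROI-overlapping blocks by dominant-color agreement (a sketch).

    blocks: list of dicts with hypothetical keys 'overlap' (area of
    overlap with the ROI) and 'color' (dominant color index of the block).
    """
    # Anchor: the block having the largest area of overlap with the ROI.
    anchor = max(blocks, key=lambda b: b['overlap'])
    # Keep only blocks whose dominant color matches the anchor's.
    return [b for b in blocks if b['color'] == anchor['color']]
```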
4.5.2 Comparison of Multiple ROI based Image Retrieval Methods
The multiple-ROI image retrieval experiments compare four approaches: (a) comparing blocks at the same
location alone to measure the degree of similarity, as in [Prasad et al. (2004)]; (b) merely
examining whether the locations of the ROIs are the same, as in [Moghaddam et al. (2001)]; (c)
reflecting the relative location of ROIs, as suggested in [Lee and Nang (2011)]; and (d) reflecting
relative locations as suggested in the present chapter. All four types of experiments are
performed to compare their retrieval performance.
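The idea of reflecting relative locations can be illustrated with a coarse, hypothetical sketch that compares the direction between each pair of ROI centroids; the chapter's actual scheme uses region codes, which capture relative location at a more detailed level.

```python
def relative_layout(rois):
    """Pairwise relative directions of ROI centroids (hypothetical sketch).

    rois: list of (x, y) centroids. For each ordered pair, record the signs
    of the x and y displacement: a coarse stand-in for region-code matching.
    """
    sign = lambda v: (v > 0) - (v < 0)
    return {(i, j): (sign(xj - xi), sign(yj - yi))
            for i, (xi, yi) in enumerate(rois)
            for j, (xj, yj) in enumerate(rois) if i != j}

def same_relative_layout(query_rois, target_rois):
    """True when every ROI pair keeps the same relative direction."""
    return relative_layout(query_rois) == relative_layout(target_rois)
```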
Table 4.3 ANMRR obtained from different methods on dataset-1 and dataset-2

Database   Prasad et al. (2004)   Moghaddam et al. (2001)   Chan et al. (2008)   Lee & Nang (2011)   Proposed
Dataset-1  0.563                  0.528                     0.483                0.415               0.347
Dataset-2  0.672                  0.614                     0.562                0.529               0.436

In Table 4.3, retrieval based on the proposed method shows the best performance, which is
66.2% and 57.2% better than fixed-location retrieval [Prasad et al. (2004)] on dataset-1 and
dataset-2 respectively. The performance achieved is 52.1% and 43.4% better than [Moghaddam
et al. (2001)] on dataset-1 and dataset-2 respectively, and 19.6% and 23.5%
better than the method proposed in [Lee and Nang (2011)] on the two datasets.
The CVAAO-based method [Chan et al. (2008)] does not consider the relative
locations of multiple ROIs in retrieval and hence has lower performance than our approach. This
shows that the proposed method performs better than the other methods.
Figure 4.12 shows the performance comparison of the different algorithms based on average
precision-recall graphs. It is evident from Figure 4.12 (a) that the proposed algorithm has better
average precision over different recall points for dataset-1. The method is compared with other
methods based on fixed-location matching [Chan et al. (2008), Moghaddam et al. (2001), Prasad
et al. (2004)] and relative-location matching [Lee and Nang (2011)]. The techniques presented in
[Moghaddam et al. (2001), Prasad et al. (2004)] compare regions from a fixed location depending
upon the locations of the ROIs in the query image. The approach of Lee and Nang (2011)
considers relative locations but requires a complex algorithm and fails to provide a detailed level
of relative location similarity. The approach given in [Chan et al. (2008)] works well for a single
ROI but is not suitable for retrieval based on multiple ROIs.
The method presented in this chapter considers multiple ROIs with a more detailed level of
relative location similarity and covers all relevant parts of an image for comparison using a
region-code-based matching scheme.
Figure 4.12: Interpolated P-R graphs comparing different methods on (a) dataset-1 (b)
dataset-2
In addition, the use of well-known color and texture features has boosted the performance of the
proposed scheme. Figure 4.12 (b) shows that the performance of the proposed scheme on
dataset-2 is better than that of the other methods. The average precision values of the proposed
method for the top 10 retrieved images are 15%, 8%, 18% and 27% higher than the methods
described in [Chan et al. (2008), Lee and Nang (2011), Moghaddam et al. (2001)] and [Prasad et
al. (2004)] respectively. Therefore, the proposed method outperforms the other methods.
Figure 4.13 (a) shows the retrieval performance comparison of the proposed method with
some state-of-the-art region based retrieval methods using color, texture and their combinations
as feature descriptor on dataset-1.
Figure 4.13: Performance comparison of average precision-recall graphs on (a) dataset-1
(b) dataset-2
The experimental results clearly show that the proposed method is superior in performance to the
other methods [Saykol et al. (2005), Luo et al. (2006), Liu et al. (2011)]. The average precision
values of the proposed method for the top 10 retrieved images are 23%, 16% and 7% higher than
the techniques given in [Saykol et al. (2005), Luo et al. (2006)] and [Liu et al. (2011)] respectively.
The better results are mainly due to the use of region codes and an effective feature set. Liu et al. (2011)
presented an MSD-based technique. The MSD is defined by the underlying colors in micro-structures
with similar edge orientation, and the feature vector is extracted from the edge orientation.
The MSD can represent color, texture and shape information effectively, but does not represent
the spatial location of different objects in the layout; the scheme also fails to capture the global
information of the image. The CECH-based descriptor [Luo et al. (2006)] can represent the color and
shape information of objects, but it lacks texture features, and relative locations are not
considered in the retrieval process.
The histogram-based representation of color and shape of Saykol et al. (2005) also lacks texture
information and is not suitable for retrieval based on multiple ROIs. The proposed scheme uses the
dominant color and hence can extract the representative colors of an image in an effective way. The
use of the rotation-invariant uniform LBP histogram makes it more robust to monotonic gray-level
changes and invariant to geometric transformations. The LBP can also detect various structures
in a local region, such as points, edges and circles. In addition, the use of region codes helps in finding
images having high spatial-location similarity. All this makes the proposed method outperform
the other methods. Figure 4.13 (b) shows that the performance of the proposed scheme on
dataset-2 is also better than that of the other methods. Therefore, we can conclude that the
proposed method outperforms the other three methods on dataset-1 and dataset-2.
Figure 4.14: Performance comparison in terms of retrieval time
Figure 4.15: Retrieval results for example query (a) Prasad et al. (2004) (b) Moghaddam et
al. (2001) (c) Chan et al. (2008) (d) Lee & Nang (2011) (e) Proposed method
Figure 4.14 gives a comparison of the average retrieval time of the various methods. The
proposed method has the best retrieval time among all, as it considers relative location inherently
through region codes. The actual results of multiple ROI-based image retrieval are shown in Figure
4.15, which indicates that the proposed method retrieves the largest number of images similar to
the ROIs.
The above experiments show that in multiple ROI-based image retrieval, searching locations
other than those designated as ROIs and considering the relative locations of the ROIs improve the
efficiency of retrieval and better reflect the user's intent.
4.6 Conclusion
In this chapter, a novel scheme for image retrieval based on region codes of ROIs is proposed.
The use of region codes enables the user to define an ROI of arbitrary size and further helps in
narrowing down the search range, resulting in increased accuracy and reduced computation time.
The spatial locations of different regions are also specified using these codes. The present
technique strikes a balance between fixed-location matching and all-blocks matching by
comparing a few blocks that can reflect the user requirement in a satisfactory way. Region
codes, while being computationally efficient, are also effective in finding the detailed level
of relative-location similarity between multiple ROIs in the query and target images. To further
improve the efficiency, an effective feature set consisting of the dominant color and the local binary
pattern is used to represent the image. Experimental results have shown that the proposed
method produces better results while consuming less computation time. The work can be
extended by considering regions partially overlapping the ROI and by using a more effective
feature set to represent the various regions in the images.