Pattern Recognition Lettersyamaghani.com/Files/12201590342.pdfCBIR systems Dissimilarity spaces...

Pattern Recognition Letters 34 (2013) 1659–1668

Contents lists available at SciVerse ScienceDirect

Pattern Recognition Letters

journal homepage: www.elsevier .com/locate /patrec

Further results on dissimilarity spaces for hyperspectral images RF-CBIR

0167-8655/$ - see front matter � 2013 Elsevier B.V. All rights reserved.http://dx.doi.org/10.1016/j.patrec.2013.05.025

⇑ Corresponding author at: Gipsa-lab, Grenoble-INP, France. Tel.: +33 (0)476 826396; fax: +33 (0)476 57 4790.

E-mail addresses: [email protected] (M.A. Veganzones),[email protected] (M. Datcu), [email protected] (M. Graña).

1 Tel.: +49 8153 28 1388.2 Tel.: +34 943 01 8044.

Miguel A. Veganzones a,b,⇑, Mihai Datcu c,1, Manuel Graña a,2

a Grupo de Inteligencia Computacional, Basque Country University (UPV/EHU), Spainb Gipsa-lab, Grenoble-INP, Francec German Aerospace Center (DLR), Germany

a r t i c l e i n f o a b s t r a c t

Article history:Available online 14 June 2013

Communicated by Nima Hatami

Keywords:Hyperspectral imagingCBIR systemsDissimilarity spacesRelevance feedback

Content-Based Image Retrieval (CBIR) systems are powerful search tools in image databases that havebeen little applied to hyperspectral images. Relevance feedback (RF) is an iterative process that usesmachine learning techniques and user’s feedback to improve the CBIR systems performance. We pursuedto expand previous research in hyperspectral CBIR systems built on dissimilarity functions defined eitheron spectral and spatial features extracted by spectral unmixing techniques, or on dictionaries extractedby dictionary-based compressors. These dissimilarity functions were not suitable for direct application incommon machine learning techniques. We propose to use a RF general approach based on dissimilarityspaces which is more appropriate for the application of machine learning algorithms to the hyperspectralRF-CBIR. We validate the proposed RF method for hyperspectral CBIR systems over a real hyperspectraldataset.

� 2013 Elsevier B.V. All rights reserved.

1. Introduction

The increasing interest in hyperspectral remote sensing (Plazaet al., 2009) will yield to an exponential growth of hyperspectraldata acquisition in a short time. Most spatial agencies have sched-uled the launch of hyperspectral sensors on satellite payloads suchas in EnMAP (2011) or PRISMA (2011) missions. That will involvethe storage of a huge quantity of hyperspectral data. The problemof searching through these huge databases using Content-BasedImage Retrieval (CBIR) techniques has not been properly addressedfor the case of hyperspectral images until recently. Recent workson hyperspectral CBIR systems (Grana and Veganzones, 2012;Veganzones and Grana, 2012) make use of spectral and spectral-spatial dissimilarity functions to compare hyperspectral images.The spectral and spatial features are extracted by means of spectralunmixing algorithms (Keshava and Mustard, 2002). In Veganzoneset al. (2012), authors define dissimilarity functions built uponKolmogorov complexity (Li et al., 1997) and its approximation bycompression and dictionary distances (Watanabe et al., 2002; Liet al., 2004). Compression-based distances require a high computa-tional cost that make it unaffordable for the definition of CBIRsystems. Dictionary distances operate over dictionaries extractedfrom the hyperspectral images by the off-line application of a

lossless dictionary-based compressor such as the Lempel–Ziv–Welch (LZW) compression algorithm (Welch, 1984). In this workwe pursued to extend these hyperspectral CBIR systems by usingthe feedback of the user.

Relevance feedback (RF) is an iterative process that makes use ofthe feedback provided by the user to reduce the gap between thelow-level feature representation of the images and the high-levelsemantics of the user’s queries (Smeulders et al., 2000). Often, theuser’s feedback comes on the form of a labeling of the previouslyretrieved images as relevant or irrelevant for the query. The set oflabeled images is then used by the CBIR system to adapt the searchto the query semantics. If each image is represented by a point in afeature space, the RF with both, positive and negative trainingexamples, becomes a two-class classification problem or an onlinelearning problem in a batch mode (Zhou and Huang, 2003).

Dictionaries and spectral-spatial features extracted from hyper-spectral images cannot be directly represented as points in afeature space. Thus, they do not fit easily in feature-based machinelearning techniques employed for the definition of RF processes. Itis possible to treat dissimilarity functions as kernel functions inorder to use them in kernel-based methods, for instance in SupportVector Machine (SVM) (Shawe-Taylor and Cristianini, 2004). How-ever, these dissimilarity functions do not comply often with validkernel conditions (Pekalska and Haasdonk, 2009). Authors inPekalska and Duin (2005) and Duin and Pekalska (2012) proposethe definition of dissimilarity spaces as an alternative to featurespaces for machine learning. In dissimilarity spaces some data in-stances are used as reference points named prototypes. The datasamples are compared to these prototype instances by some

http://crossmark.dyndns.org/dialog/?doi=10.1016/j.patrec.2013.05.025&domain=pdf

http://dx.doi.org/10.1016/j.patrec.2013.05.025

mailto:[email protected]



http://dx.doi.org/10.1016/j.patrec.2013.05.025

http://www.sciencedirect.com/science/journal/01678655

http://www.elsevier.com/locate/patrec

1660 M.A. Veganzones et al. / Pattern Recognition Letters 34 (2013) 1659–1668

dissimilarity function. Then, for each data sample, the dissimilari-ties to the prototypes define the data coordinates in a so-called dis-similarity space. Thus, each prototype defines a dimension in thisdissimilarity space. The dissimilarity space is analogous to a fea-ture space so, once the data samples are represented as points inthe dissimilarity space, all the available potential of machine learn-ing techniques can then be used.

In this paper we propose the use of dissimilarity spaces to de-fine a RF methodology for hyperspectral CBIR making use of the al-ready available spectral, spectral-spatial and dictionarydissimilarity functions. The use of dissimilarity spaces to defineRF processes is scarce on the literature. In Nguyen et al. (2006),authors propose the use of dissimilarities to prototypes selectedby an offline clustering process as the entry to a RF process definedas a one-class classification problem. Authors in Giacinto and Roli(2003) perform an online prototypes selection instead, where theimages retrieved to the user for evaluation are at the same timethe prototypes and the training set. The RF process is defined asa new dissimilarity function based on the combination of the data-base images dissimilarities to the set of prototypes and the proto-types labeling. In Bruno et al. (2006), authors propose differentstrategies to characterize an image by a feature vector based onthe combination of dissimilarities to a set of prototypes. We pro-pose an hyperspectral RF process defined as a two-class classifica-tion problem based on dissimilarity spaces. The input to theclassifier is a dissimilarity representation defined over the unmix-ing and dictionary-based hyperspectral dissimilarity functionsrespect to offline and online selected prototypes.

The paper is divided as follows. In Section 2 we outline the dis-similarity functions used in the definition of hyperspectral CBIRsystems and in Section 3 we outline the dissimilarity spacesapproach. In Section 4 we introduce the proposed hyperspectralRF process. In Section 5 we define the experimental methodologyand in Section 6 we comment on the results. Finally, we contributewith some conclusions in Section 7.

2. Hyperspectral dissimilarity functions

Here, we outline the dissimilarity functions used on the litera-ture to compare hyperspectral images. Firstly, we describe the spec-tral and spectral-spatial dissimilarity functions defined over theresults of a spectral unmixing process. Secondly, we describethe dictionary distance defined over dictionaries extracted fromthe hyperspectral images by means of lossless dictionary-basedcompressors.

2.1. Unmixing-based dissimilarity functions

Spectral unmixing pursues the decomposition of an hyperspec-tral image into the spectral signatures of its main constituents andtheir corresponding spatial fractional abundances. Most of theunmixing methods are based on the Linear Mixing Model (LMM)(Keshava and Mustard, 2002; Bioucas-Dias et al., 2012). The LMMstates that an hyperspectral sample is formed by a linear combina-tion of the spectral signatures of pure materials present in the sam-ple (endmembers), plus some additive noise. Often, the spectralsignatures of the materials are unknown, and the set of endmem-bers must be built by either manually selecting spectral signaturesfrom a spectral library, or by automatically inducing them from theimage itself. The latter involves the use of some endmember induc-tion algorithm (EIA). The hyperspectral literature features plenty ofsuch algorithms. Some reviews on the topic can be found in Plazaet al. (2004), Veganzones and Grana (2008) and Bioucas-Dias et al.(2012). Once the set of endmembers has been induced, their

corresponding per-pixel abundances can be estimated by a leastsquares method (Charles, 1974).

The dissimilarity functions based on the spectral unmixingmake use of the spectral and spectral-spatial characterization ofthe hyperspectral images (Grana and Veganzones, 2012; Veganz-ones and Grana, 2012). Given an hyperspectral image, Ha, whosepixels are vectors in a q-dimensional space, its spectral character-ization is defined by the set of endmembers, Ea ¼ ea

1; ea2 . . . ea

ma

� �,

where ma denotes the number of induced endmembers from theath image. The spectral-spatial characterization is defined as thetuple Ea;Uað Þ, where Ua ¼ /a

1;/a2; . . . ;/a

ma

� �is the set of fractional

abundance maps resulting from the unmixing process. To imple-ment this approach, an EIA is first used to induce the endmembersfrom the image and then, their respective fractional abundancesare estimated by a least squares unmixing algorithm.

In order to compute the unmixing-based dissimilarities, theSpectral Distance Matrix (SDM), Da;b, between two given hyper-spectral images, Ha and Hb, has first to be computed. The SDM isthe matrix Da;b ¼ dij

� �; i ¼ 1; . . . ;ma; j ¼ 1; . . . ;mb, whose elements

dij are the pairwise distances between the endmembers eai ; e

bj 2 Rq

of each image. The spectral distance function d : Rq � Rq ! Rþ isoften the angular pseudo-distance:

d ei; ej� �

¼ cos�1 eiej

eik k ej

�� : ð1Þ

The spectral dissimilarity (Grana and Veganzones, 2012) is then gi-ven by:

sE Ha;Hb

� �¼ mrk k þ mck k; ð2Þ

where mrk k and mck k are the Euclidean norms of the vectors of rowand column minimal values of the SMD, respectively. The Spectral-Spatial dissimilarity (Veganzones and Grana, 2012) is given by:

sE;U Ha;Hb

� �¼Xma

i¼1

Xmb

j¼1

rijdij; ð3Þ

where dij is the aforementioned spectral distance and rij is thesignificance associated to dij. The significance matrix Ra;b ¼ rij

� �;

i ¼ 1; . . . ;ma; j ¼ 1; . . . ;mb is calculated on base to the normalizedaverage abundances �Ua and �Ub by the most similar highest priority(MSHP) principle (Li et al., 2000).

2.2. Dictionary-based dissimilarity functions

Given a signal x, a dictionary-based compression algorithmlooks for patterns in the input sequence from signal x. These pat-terns, called words, are subsequences of the incoming sequence.The compression algorithm result is a set of unique words calleddictionary. The dictionary extracted from a signal x is hereafterdenoted as DðxÞ, with DðkÞ ¼ ; only if k is the empty signal. Thenormalized dictionary distance (NDD) (Macedonas et al., 2008) isgiven by:

sNDD x; yð Þ ¼ D x [ yð Þ �min D xð Þ;D yð Þf gmax D xð Þ;D yð Þf g ; ð4Þ

where D x [ yð Þ and D x \ yð Þ respectively denote the union and inter-section of the dictionaries extracted from signals x and y. The NDDis a normalized admissible distance satisfying the metric inequali-ties. Thus, it results in a non-negative number in the interval½0;1�, being zero when the compared signals are equal and increas-ing up to one as the signals are more dissimilar.

3. Dissimilarity spaces

The dissimilarity space is a vector space in which the dimen-sions are defined by dissimilarity vectors measuring pairwise

M.A. Veganzones et al. / Pattern Recognition Letters 34 (2013) 1659–1668 1661

dissimilarities between individual objects and reference objects(prototypes) (Duin and Pekalska, 2012). Given a set of prototypesP ¼ fp1; . . . ; prg, where r denotes the number of prototype objectson P, and a set of objects X ¼ fx1; . . . ; xNg, where N denotes thenumber of individual objects on X, the dissimilarity representationDðX;PÞ is a data-dependent mapping Dð�;PÞ : X! Rr from a set ofobjects X to the dissimilarity space specified by the prototypes setP. Each dimension in the dissimilarity space corresponds to a dis-similarity to a prototype object, DðX; piÞ. The dissimilarity repre-sentation DðX;PÞ is thus defined as a N � r dissimilarity matrix,where each object x 2 X is described by a vector of dissimilaritiessx ¼ Dðx;PÞ ¼ sðx; p1Þ; . . . ; s x; prð Þ½ �. The pairwise dissimilarity func-tion s x; pið Þ is not required to be metric and can be defined adhoc for the given prototype. The dissimilarity space is a vectorspace equipped with an inner product and an Euclidean metric.Thus, the vector of dissimilarities to the set of prototypes, sx, canbe interpreted as a feature, allowing the use of machine learningtechniques commonly defined over feature spaces.

4. Relevance feedback by dissimilarity spaces

The use of dissimilarity spaces allows one to use the previouslymentioned hyperspectral dissimilarity functions to define a RF pro-cess based on conventional machine learning techniques. The pro-posed hyperspectral RF process follows the general approach inGiacinto and Roli (2003), Bruno et al. (2006), Nguyen et al.(2006) and it is depicted in Fig. 1. First, the user defines a zero-query by feeding the system with some positive sample. Next, aninitial ranking is obtained comparing the database images to thequery sample by some hyperspectral dissimilarity function andsome images are retrieved for user’s evaluation. Then, the user la-bels the images retrieved by the system, a set of prototype imagesis selected and the RF process starts. We follow by describing thezero-query and the relevance feedback processes in detail, andthen we discuss on the prototypes selection and the selection ofthe images retrieved by the system for evaluation.

4.1. Zero query

First, a query QlðHaÞ is defined following the query-by-imageapproach. Ha denotes the hyperspectral image selected as thequery and l 2 Zþ, named the scope of the query, denotes the num-ber of images that should be retrieved by the system. Every imageHb in the dataset is compared to the query image by some hyper-spectral dissimilarity function, sðHa;HbÞ. The dissimilarities to the

Fig. 1. CBIR system diagram with the proposed relev

query image are represented as a vector sa ¼ sa;1; . . . ; sa;N½ �, whereN is the number of images in the dataset and sa;b is the dissimilaritybetween the query image Ha and the dataset image Hb, withb ¼ 1; . . . ;N. Then, we sort the components of sa in increasing or-der, and the resulting shuffled image indexes constitute the zeroranking X0

a ¼ xq 2 1; . . . ;Nf g� �

; q ¼ 1; . . . ;N, so that sa;xq 6 sa;xqþ1 .Then, some selection criterion is followed to select l images fromthe zero ranking and retrieve them for user’s evaluation. The userlabels these images as relevant or non-relevant for the query.The set of relevant images, denoted as R, and the set of non-rele-vant images, denoted as NR, form the training set, T ¼ R [ NRf g,with which the relevance feedback process starts.

4.2. Relevance feedback

We propose a RF process defined as a two-class problem wherethe classes are the set of relevant (positive class) and the set ofirrelevant (negative class) images respect to the query. The inputto the two-class classifier is a feature vector composed of the dis-similarity values computed from a given image respect to each ofthe images of the prototypes set. The output of the classifier shouldbe an scalar representing any measure of an image identificationwith the positive class respect to the negative class, for instancea class probability. The classifier outputs are ordered to define aranking of the database images respect to the user’s query. Finally,the ranking is used to select some database images that will be re-trieved for the user’s evaluation and so, proceed with a new RF iter-ation. Thus, the RF process is divided in two steps, a training phaseand a testing phase.

4.2.1. Training phaseLet P ¼ Hpi

� �ri¼1 be the set of prototypes where pi is an index

pointing to a database image and r is the number of prototype in-

stances. Let T ¼ Hqj

n ot

j¼1be the set of training samples where qj is

an index pointing to a database image, t denotes the number oftraining samples and each image Hqj

has been labeled as belonging

to the positive class, Cþ, or to the negative class, C�. Then, the sys-tem calculates the t � r dissimilarity matrix

D T;Pð Þ ¼ s Hqj;Hpi

h i; j ¼ 1; . . . ; t; i ¼ 1; . . . ; r; using some given

hyperspectral dissimilarity function s �; �ð Þ. The rows of DðT;PÞ arethe geometrical coordinates of the training samples in the dissim-ilarity space defined by the set of prototypes, and would be used asfeature vectors to train the two-class classifier.

ance feedback by dissimilarity spaces approach.

Fig. 2. Examples of the five categories patches: (a) Forests, (b) Fields, (c) Urban areas, (d) Mixed and (e) Others.

3 http://www.OpAiRS.aero


4.2.2. Testing phaseFor each image Hb in the dataset we calculate the dissimilarity

vector sb ¼ D Hb;P� �

¼ s Hb;Hpi

� �� ri¼1, given the hyperspectral dis-

similarity function sð�; �Þ. The dissimilarity vector, sb, represents apoint in the dissimilarity space and is used as the input to thetrained classifier. The classifier will return an scalar, cb, measuringthe probability or the degree of inclusion of the image Hb respect tothe query class Cþ. An image Hk having a classification value higherthan an image Hl, that is ck P cl, should be ranked in a better posi-tion. The values obtained by the classifier for all the images in thedataset are then represented as a vector ca ¼ ½c1; . . . ; cN�, where N isthe number of images in the dataset. The vector of classificationvalues ca is sorted in decreasing order and the resulting shuffledimage indexes constitute the ranking Xt

a ¼ xtq 2 1; . . . ;Nf g

h i;

q ¼ 1; . . . ;N, so that cxtq

P cxtqþ1

. The superscript t in Xta denotes

the iteration in turn on the RF process, being t a positive integer,t > 0. The ranking serves to select some images that are retrievedto the user for evaluation, and then included in the training set.The RF process ends when the user is satisfied, a maximum num-ber of iterations, tmax, is achieved, or no new images are beingincorporated to the training set.

4.3. Prototypes selection

The general RF process depicted in Fig. 1 requires of a set of pro-totypes. We distinguish between two criteria to build the proto-types set, an offline selection and an online selection. In theformer, the prototypes are a priori representative subset of theimages in the database. A common procedure is to perform a clus-tering and keep the centers of the clusters as the prototypes. Thiscriterion could lead to a dramatical reduction in the computationalcosts of the CBIR system, but on the other hand it defines a fixed setof prototypes for all the possible queries, limiting the adaptabilityof the CBIR system. The later builds the set of prototypes during theRF process. In each iteration some images are retrieved to the userfor evaluation and then included on the training set. These sameimages or a subset of them are also used as prototypes. This allowsto adapt the set of prototypes to the query. However, it increasesthe computational burden.

4.4. Image retrieval

A key aspect of RF-CBIR systems is the criterion to select from agiven ranking those images that will be retrieved to the user forevaluation. Let l denote the scope of the query, that is, the numberof images that should be retrieved to the user. If the criterion is toreturn the l best ranked images on the database, is likely that thetraining set is biased towards the positive class. So, a better crite-rion seems to retrieve the l=2 best images and the l=2 worstimages, hereafter denoted as the Best–Worst (BW) criterion. How-ever, the best and worst images are not necessarily the most infor-mative ones. The active learning paradigm (Chang and Lin, 2011)states that the most ambiguous images, those that are close to

the class boundaries, are the most informative. Thus, the activelearning (AL) criterion will return the l=2 most ambiguous imageslabeled as belonging to the positive class, and the l=2 most ambig-uous images labeled as belonging to the negative class.

5. Experimental methodology

5.1. Dataset

The hyperspectral HyMAP data was made available fromHyVista Corp. and German Aerospace Center’s (DLR) opticalAirborne Remote Sensing and Calibration Facility service.3 Thescene corresponds to a flight line over the facilities of the DLR centerin Oberpfaffenhofen (Germany) and its surroundings, mostly fields,forests and small towns. The data cube has 2878 lines, 512 samplesand 125 spectral bands. We have removed non-informative bandsdue to atmospheric absorption and 113 spectral bands remained.

We cut the scene in patches of 64� 64 pixels size for a total of360 patches forming the hyperspectral database used in the exper-iments. We grouped the patches by visual inspection in five roughcategories. The three main categories are ‘Forests’, ‘Fields’ and ‘Ur-ban areas’, representing patches that mostly belong to one of thiscategories. A ‘Mixed’ category was defined for those patches thatpresented more than one of the three main categories, being notany of them dominant. Finally, we defined a fifth category, ’Others’,for those patches that did not represent any of the above or thatwere not easily categorized by visual inspection. The number ofpatches per category are: (1) Forests: 39, (2) Fields: 160, (3) Urbanareas: 24, (4) Mixed: 102, and (5) Others: 35. Fig. 2 shows exam-ples of the five categories patches.

5.2. Methodology

We test the use of the proposed hyperspectral RF-CBIR using theunmixing and dictionary-based hyperspectral dissimilarity func-tions. For the unmixing-based dissimilarities, the spectral (2) andthe spectral-spatial (3) dissimilarity functions, we conduct for eachimage in the database an unmixing process in order to obtain theset of induced endmembers and their corresponding fractionalabundances. In order to do that we use the Vertex ComponentAnalysis (VCA) (Nascimento and Bioucas Dias, 2005) endmemberinduction algorithm and a partially constrained least squaresunmixing (PCLSU) (Charles, 1974) algorithm. As VCA is an stochas-tic algorithm we perform 20 independent runs for each image andwe keep the one with the lowest averaged root squared meanreconstruction error:

� H; H

¼ 1M

XM

i¼1

ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi1q

Xq

j¼1

HðjÞi � HðjÞi

2

vuut ð5Þ

http://www.OpAiRS.aero

Table 1ANR values of the hyperspectral RF-CBIR for the Forests categorical search.

Avg. Band NDD By-Band NDD Spectral Spectral-Spatial

Zero Query 0.0809 0.0613 0.1360 0.0552Online Prot. 7NN BW 0.0343 0.0426 0.1394 0.0630

AL 0.0280 0.0258 0.0869 0.0337BW + AL 0.0287 0.0281 0.0770 0.0330

SVM BW 0.0383 0.1392 0.2600 0.0852AL 0.0596 0.1155 0.3947 0.2371BW + AL 0.0462 0.0358 0.2143 0.2430

Offline Prot. 7NN BW 0.0662 0.0723 0.1922 0.0543AL 0.0329 0.0631 0.1735 0.0494BW + AL 0.0448 0.0633 0.1848 0.0473

SVM BW 0.0758 0.0478 0.2502 0.1063AL 0.0542 0.0409 0.3116 0.1678BW + AL 0.0642 0.0538 0.3180 0.1055

Table 2ANR values of the hyperspectral RF-CBIR for the Fields categorical search.



AL 0.1388 0.0495 0.1573 0.1514BW + AL 0.1433 0.0587 0.1862 0.1883

SVM BW 0.1898 0.2462 0.1511 0.1983AL 0.1808 0.0914 0.1526 0.0924BW + AL 0.1567 0.0812 0.1477 0.1184


SVM BW 0.2033 0.0724 0.2136 0.1936AL 0.1831 0.0660 0.2112 0.1700BW + AL 0.2008 0.0497 0.2171 0.1442

Table 3ANR values of the hyperspectral RF-CBIR for the Urban areas categorical search.



AL 0.1900 0.0096 0.0626 0.0392BW + AL 0.2702 0.0282 0.1230 0.0654

SVM BW 0.2675 0.0437 0.1120 0.2126AL 0.5870 0.0416 0.2501 0.1603BW + AL 0.3825 0.0415 0.1459 0.1712


SVM BW 0.1562 0.0103 0.0833 0.1240AL 0.2276 0.0273 0.2164 0.2651BW + AL 0.1763 0.0246 0.0561 0.2032


where HðjÞi denotes the jth band value of the ith pixel in the hyper-spectral image H and H ¼ UE is the hyperspectral image recon-structed by the set of induced endmembers E and theircorresponding fractional abundances U. For the normalized dictio-nary distance (4), we first convert each hyperspectral image to atext string in two ways, using the average of the spectral bandsand band-by-band. For the former, we calculate the mean of eachhyperspectral pixel along the spectral bands. For the later we trans-form each spectral band independently. In both cases we traversethe image in a zig-zag way. The averaged band transformationincurs in a big lost of spectral information compared to the bandby band transformation, but by contrast it yields to a more compactdictionary and so, to speed up the NDD computation.

Thus, we compare the use of the four hyperspectral dissimilar-ities, the Spectral, the Spectral-Spatial, the Averaged Band NDD and

the Band-by-Band NDD, in the RF process respect to their use in thezero-query. In order to do that, we run independent retrievalexperiments over the HyMAP dataset. Each of the 360 patcheswas a priori labeled as belonging to one of the five categoriesdefined above. The query is a categorical search, where the imagesbelonging to the same category than the query image form thepositive class and the remaining ones form the negative class.We perform an independent search for each of the 360 patches.Thus, user’s evaluation was not required and the experiment wasfully automatized. The maximum number of iterations on theretrieval feedback process was set to tmax ¼ 5.

For the RF process we compare the use of a k-NN classifier and atwo-class SVM classifier with a radial basis kernel. The k-NNclassifier does not require of a training phase and returns thefraction of the k most similar images in the training set respect


to the query image belonging to the positive class, that is,

c ¼Pk

i¼1I Hi ;Hað Þk , where I denotes an indicator function returning

1 if the two images belong to the same class, and 0 otherwise.The SVM classifier outputs the probability that the tested imagebelongs to the positive class. The parameters of the SVM classi-fier where selected using a 5-fold cross validation. For the k-NNthe knnclassify MATLAB function was used. For the SVM, weused the C-SVM classifier of the LIBSVM (Chang and Lin,2011) library.

We also compare the use of online and offline prototypes selec-tion processes. For the offline prototypes selection process we per-formed a hierarchical segmentation using each of the fourhyperspectral dissimilarity functions and we keep 10 clusters.Then, for each cluster f we selected the image Ho

f minimizing the

WB

Avg.Band NDD By−Band NDD Spectral Spectral−Spatial0

5

10

15

20

25

30

35

40

45

Avg.Band ND0

5

10

15

20

25

30

35

40

45

Online Prototypes -


5

10

15

20

25

30

35

40

45

Avg.Band ND0

5

10

15

20

25

30

35

40

45

Online Prototypes -


5

10

15

20

25

30

35

40

45

Avg.Band ND0

5

10

15

20

25

30

35

40

45

Offline Prototypes -


5

10

15

20

25

30

35

40

45

Avg.Band ND0

5

10

15

20

25

30

35

40

45

Offline Prototypes -Fig. 3. Average number of relevant (R) and non-relevant (NR) images in the training setBW and the AL image retrieval selection criteria for the Forests categorical search.

averaged distance to the rest of images grouped into the samecluster:

Hof ¼ arg min

i

1fj jXHj2f

s Hi;Hj� �

ð6Þ

where jfj denotes the cardinality of the cluster f.Finally, we compare the results obtained using three different

criteria to select the images to be retrieved to the user for evalua-tion: the BW criterion, the AL criterion and a combination of both,BW + AL. For the BW criterion the system retrieves the 5 best andworst ranked images in the database. For the AL criterion thesystem retrieves the 5 most ambiguous positive and negative in-stances, that is, the ones closed to the class boundary on each side.For both, BW and AL criteria, the scope is then l ¼ 10. For the

LA

D By−Band NDD Spectral Spectral−Spatial

7NN


SVM


7NN


SVMfor each RF iteration and comparing hyperspectral dissimilarity functions, using the


BW + AL criterion the system returns the 3 best and worst rankedimages, and the 3 most ambiguous positive and negative instances,for a total scope of l ¼ 12.

5.3. Performance measures

Evaluation metrics from information retrieval field have beenadopted to evaluate CBIR systems quality. The two most usedevaluation measures are precision and recall (Smeulders et al.,2000; Daschiel and Datcu, 2005). Precision, p, is the fraction ofthe returned images that are relevant to the query. Recall, q, isthe fraction of retrieved relevant images respect to the total num-ber of relevant images in the database according to a prioriknowledge. If we denote L the set of returned images and R theset of all the images relevant to the query, then p ¼ L\Rj j

Lj j and

WB


5

10

15

20

25

30

35

40

45

Avg.Band ND0

5

10

15

20

25

30

35

40

45

Online Prototypes -


5

10

15

20

25

30

35

40

45

Avg.Band ND0

5

10

15

20

25

30

35

40

45

Online Prototypes -


5

10

15

20

25

30

35

40

45

Avg.Band ND0

5

10

15

20

25

30

35

40

45



5

10

15

20

25

30

35

40

45

Avg.Band ND0

5

10

15

20

25

30

35

40

45

Offline Prototypes -Fig. 4. Average number of relevant (R) and non-relevant (NR) images in the training setBW and the AL image retrieval selection criteria for the Fields categorical search.

r ¼ L\Rj jRj j . Precision and recall follow inverse trends when consid-

ered as functions of the scope of the query. Precision falls whilerecall increases as the scope increases. Thus, precision and recallmeasures are usually given as precision-recall curves for a fixedscope. To evaluate the overall performance of a CBIR system,the Average Precision and Average Recall are calculated over allthe query images in the database. For a query of scope l, theseare defined as:

�pl ¼1N

XN

a¼1

plðHaÞ ð7Þ

and

�rl ¼1N

XN

a¼1

rlðHaÞ: ð8Þ

LA


7NN


SVM


7NN




The normalized rank (Muller et al., 2001) was used to summarizethe system performance into an scalar value. The normalized rankfor a given image query, denoted as Rank Hað Þ, is defined as:

Rank Hað Þ ¼ 1NNa

XNa

i¼1

Xia �

Na Na � 1ð Þ2

!; ð9Þ

where N is the number of images in the dataset, Na is the number ofrelevant images for the query Ha, and Xi

a is the rank at which the i-th image is retrieved. This measure is 0 for perfect performance, andapproaches 1 as performance worsens, being 0:5 equivalent to arandom retrieval. We calculated the Rank Hað Þ for each of the imagesin the dataset and then we calculated the average normalized rank(ANR):

ANR ¼ 1N

XN

a¼1

Rank Hað Þ: ð10Þ

WB


5

10

15

20

25

30

35

40

45

Avg.Band ND0

5

10

15

20

25

30

35

40

45

Online Prototypes -


5

10

15

20

25

30

35

40

45

Avg.Band ND0

5

10

15

20

25

30

35

40

45

Online Prototypes -


5

10

15

20

25

30

35

40

45

Avg.Band ND0

5

10

15

20

25

30

35

40

45



5

10

15

20

25

30

35

40

45

Avg.Band ND0

5

10

15

20

25

30

35

40

45

Offline Prototypes -Fig. 5. Average number of relevant (R) and non-relevant (NR) images in the training setBW and the AL image retrieval selection criteria for the Urban areas categorical search.

6. Results

Tables 1–3 show the ANR (10) values of the comparing hyper-spectral dissimilarities, using the proposed RF-CBIR respect to thezero-query, for the Forest, Fields and Urban areas categorical que-ries respectively. We run the experiments using different values ofk for the k-NN classifier, but we only show the results using k ¼ 7as in general it outperforms the other k values. The ANR resultscorrespond to the ranking obtained in the fifth RF iteration. Ingeneral, the hyperspectral RF process yields to better ANR resultsthan the zero query for the four compared hyperspectral dissimi-larity functions. The online prototype selection leads to betterresults than the offline selection, and so it does the 7-NN classifiercompared to the SVM classifier. The use of AL for the image retrie-val selection outperforms the BW criterion, and often the combina-tion of both, BW + AL. As it was expected, the results using the

LA


7NN


SVM


7NN




Band-by-Band NDD and the Spectral-Spatial dissimilarity functionsoutperform the Averaged Bands NDD and the spectral dissimilarityfunctions.

There are however some discrepancies depending on the cate-gorical query. This effect is specially relevant for the Urban areascategory and it is related to the asymmetry in the number ofimages present in the database for each class. The low number ofimages belonging to the Urban areas category makes the trainingset very unbalanced yielding to poor classification results, and so,to a low performance in the CBIR ranking. Figs. 3–5 show the aver-age number of relevant (R) and non-relevant (NR) images in thetraining set for each RF iteration using the BW and the AL imageretrieval selection criteria for the Forests, Fields and Urban areascategorical queries respectively. It is clear that the Urban areas cat-egory presents the most asymmetrical distribution of the trainingset into relevant and non-relevant images, what it can explainthe poor results on the RF process for this category. In general,

0 0.1 0.2 0.3 0.4 0.5 0.60

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

Recall

Pre

cisi

on

Fig. 6. Precision–recall curv

0 0.1 0.2 0.3 0.4 0.5 0.60

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

Recall

Pre

cisi

on

Fig. 7. Precision–recall curve

the asymmetry in the R/NR ratio is not so important as soon asthere are some critical number of each on the training set. It is alsopossible to observe that the AL selection criterion yields to bettertraining sets compared to the BW selection criterion, expressedas bigger and more equally distributed training sets. This issueseems to be a major drawback for the SVM classifier while the im-pact on the 7-NN classifier is less severe as soon as there are en-ough positive samples present on the training set. This issueshould be further addressed in future research in order to developan operative hyperspectral RF-CBIR system.

Finally, Figs. 6 and 7 show the P–R curves (7) and (8) for thezero-query and the best RF results respectively, using the fourcomparing dissimilarity functions. The improve on the P–R curvesby the RF process is clear except for the Urban areas categoricalsearch, due to the pernicious effect of the lack of positive samplesand the consequent asymmetrical distribution of R/NR samples onthe training sets.

0.7 0.8 0.9 1

Avg.Band NDD − ForestsAvg.Band NDD − FieldsAvg.Band NDD − Urban areasByBand NDD − ForestsByBand NDD − FieldsByBand NDD − Urban areasSpectral − ForestsSpectral − FieldsSpectral − Urban areasSpectralSpatial − ForestsSpectralSpatial − FieldsSpectralSpatial − Urban areas

es for the zero query.

0.7 0.8 0.9 1

Avg.Band NDD − ForestsAvg.Band NDD − FieldsAvg.Band NDD − Urban areasByBand NDD − ForestsByBand NDD − FieldsByBand NDD − Urban areasSpectral − ForestsSpectral − FieldsSpectral − Urban areasSpectralSpatial − ForestsSpectralSpatial − FieldsSpectralSpatial − Urban areas

s for the best RF results.


7. Conclusions

We have extended the hyperspectral CBIR systems present onthe literature by a RF process based on dissimilarity spaces. To de-fine a relevance feedback process for hyperspectral CBIR systems isnot easy as most of the available hyperspectral CBIR systems relyon feature respresentations and dissimilarity functions that donot fulfil the conditions to be used in common machine learningRF processes. The proposed approach expands the available dissim-ilarity-based hyperspectral CBIR systems on the literature in a sim-ple way by using dissimilarity space instead of the usual featurespace. The proposed approach proved to improve the performanceof the hyperspectral CBIR systems in the preliminary experimentspresented on this paper. Also, the selection of a proper training setfor the RF process was pointed as a major issue affecting the per-formance of the proposed hyperspectral RF-CBIR system. Furtherresearch will focus on this aspect and on the validation of the pro-posed system in a real scenario with a big database of hyperspec-tral images and real users.

Acknowledgments

The authors very much acknowledge the support of Dr. MartinBachmann from DLR.

References

Bioucas-Dias, J.M., Plaza, A., Dobigeon, N., Parente, M., Du, Qian, Gader, P.,Chanussot, J., 2012. Hyperspectral unmixing overview: geometrical statisticaland sparse regression-based approaches. IEEE Journal of Selected Topics inApplied Earth Observations and Remote Sensing 5 (2), 354–379.

Bruno, E., Moenne-Loccoz, N., Marchand-Maillet, S., 2006. Learning user queries inmultimodal dissimilarity spaces. In: Detyniecki, M., Jose, J.M., Nurnberger, A.,Rijsbergen, C.J. (Eds.), Adaptive Multimedia Retrieval: User, Context, andFeedback, Lecture Notes in Computer Science, vol. 3877. Springer, Berlin,Heidelberg, pp. 168–179.

Chang, C.-C., Lin, C.-J., 2011. LIBSVM: A library for support vector machines. ACMTransactions on Intelligent Systems and Technology 2, 27:1–27:27.

Daschiel, H., Datcu, M., 2005. Information mining in remote sensing image archives:system evaluation. IEEE Transactions on Geoscience and Remote Sensing 43 (1),188–199.

Duin, R.P.W., Pekalska, E., 2012. The dissimilarity space: bridging structural andstatistical pattern recognition. Pattern Recognition Letters 33 (7), 826–832.

German Spatial Agency (DLR). Environmental mapping and analysis program(enmap), 2011.

Giacinto, G., Roli, F., 2003. Dissimilarity representation of images for relevancefeedback in content-based image retrieval. In: Perner, P., Rosenfeld, A. (Eds.),Machine Learning and Data Mining in Pattern Recognition, Lecture Notes inComputer Science, vol. 2734, pp. 202–214.

Grana, Manuel, Veganzones, Miguel A., 2012. An endmember-based distance forcontent based hyperspectral image retrieval. Pattern Recognition 45 (9), 3472–3489.

Keshava, N., Mustard, J.F., 2002. Spectral unmixing. IEEE Signal Processing Magazine19 (1), 44–57.

Keshava, N., Mustard, J.F., 2002. Spectral unmixing. IEEE Signal Processing Magazine19 (1), 44–57.

Lawson, Charles L., 1974. Solving Least Squares Problems. Prentice Hall.Li, Ming, Vitanyi, Paul, 1997. An Introduction to Kolmogorov Complexity and Its

Applications, second ed. Springer.Li, J., Wang, J.Z., Wiederhold, G., 2000. IRM: integrated region matching for image

retrieval. In: Proceedings of the Eighth ACM International Conference onMultimedia, MULTIMEDIA ’00. ACM, Marina del Rey, CA, United States, pp. 147–156.

Li, Ming, Chen, Xin, Li, Xin, Ma, Bin, Vitanyi, P.M.B., 2004. The similarity metric. IEEETransactions on Information Theory 50 (12), 3250–3264.

Macedonas, A., Besiris, D., Economou, G., Fotopoulos, S., 2008. Dictionary basedcolor image retrieval. Journal of Visual Communication and ImageRepresentation 19 (7), 464–470.

Muller, H., Muller, W., Squire, D.McG., Marchand-Maillet, S., Pun, T., 2001.Performance evaluation in content-based image retrieval: overview andproposals. Pattern Recognition Letters 22 (5), 593–601.

Nascimento, J.M.P., Bioucas Dias, J.M., 2005. Vertex component analysis: a fastalgorithm to unmix hyperspectral data. IEEE Transactions on Geoscience andRemote Sensing 43 (4), 898–910.

Nguyen, G.P., Worring, M., Smeulders, A.W.M., 2006. Similarity learning viadissimilarity space in CBIR. In: ACM International Multimedia Conference andExhibition, pp. 107–116.

Pekalska, E., Duin, Robert P.W., 2005. The Dissimilarity Representation for PatternRecognition: Foundations and Applications. World Scientific Pub Co Inc.

Pekalska, E., Haasdonk, B., 2009. Kernel discriminant analysis for positive definiteand indefinite kernels. IEEE Transactions on Pattern Analysis and MachineIntelligence 31 (6), 1017–1032.

Plaza, A., Martinez, P., Perez, R., Plaza, J., 2004. A quantitative and comparativeanalysis of endmember extraction algorithms from hyperspectral data. IEEETransactions on Geoscience and Remote Sensing 42 (3), 650–663.

Plaza, Antonio, Benediktsson, Jon Atli, Boardman, Joseph W., Brazile, Jason,Bruzzone, Lorenzo, Camps-Valls, Gustavo, Chanussot, Jocelyn, Fauvel,Mathieu, Gamba, Paolo, Gualtieri, Anthony, Marconcini, Mattia, Tilton,James C., Trianni, Giovanna, 2009. Recent advances in techniques forhyperspectral image processing. Remote Sensing of Environment 113(Suppl. 1(0)), S110–S122.

Italian Spatial Agency (ASI). Precursore iperspettrale of the application mission(prisma), 2011.

Shawe-Taylor, J., Cristianini, N., 2004. Kernel Methods for Pattern Analysis.Cambridge University Press.

Smeulders, A.W.M., Worring, M., Santini, S., Gupta, A., Jain, R., 2000. Content-basedimage retrieval at the end of the early years. IEEE Transactions on PatternAnalysis and Machine Intelligence 22 (12), 1349–1380.

Veganzones, M.A., Grana, M., 2008. Endmember extraction methods: a shortreview. In: Knowledge-Based Intelligent Information and Engineering Systems,12th International Conference, KES 2008, Zagreb, Croatia, September 3–5,2008, Proceedings, Part III. Lecture Notes in Computer Science, vol. 5179.Springer, pp. 400–407.

Veganzones, M.A., Grana, M., 2012. A spectral/spatial CBIR system for hyperspectralimages. IEEE Journal of Selected Topics in Applied Earth Observations andRemote Sensing 5 (2), 488–500.

Veganzones, M.A., Datcu, M., Grana, M., 2012. Dictionary based hyperspectral imageretrieval. In: Proceedings of the First International Conference on PatternRecognition Applications and Methods. SciTePress, pp. 426–432.

Watanabe, T., Sugawara, K., Sugihara, H., 2002. A new pattern representationscheme using data compression. IEEE Transactions on Pattern Analysis andMachine Intelligence 24 (5), 579–590.

Welch, T.A., 1984. A technique for high-performance data compression. Computer17 (6), 8–19.

Zhou, Xiang Sean, Huang, Thomas S., 2003. Relevance feedback in image retrieval: acomprehensive review. Multimedia Systems 8 (6), 536–544.

http://refhub.elsevier.com/S0167-8655(13)00223-7/h0005














































































Pattern Recognition Lettersyamaghani.com/Files/12201590342.pdfCBIR systems Dissimilarity spaces...

Documents

Transcript of Pattern Recognition Lettersyamaghani.com/Files/12201590342.pdfCBIR systems Dissimilarity spaces...