Relevance Feedback in Image Retrieval Systems: A Survey
Tao Huang, Lin Luo, Chengcui Zhang
School of Computer Science
Florida International University
Abstract
The relevance feedback approach to image retrieval has been an active research
field in the past few years. This powerful technique has proved successful in many
application areas. Various ad hoc parameter estimation techniques have been proposed for
relevance feedback. In addition, methods that perform relevance feedback on a multi-level image
model have been formulated. Relevance feedback is based on the vector model, the most popular
model in information retrieval, and most previous relevance feedback research
can be classified into two approaches: query point movement and re-weighting. More
recently, a trend has appeared towards taking advantage of the semantic content of images in
addition to low-level features. This paper surveys recent studies on relevance
feedback techniques in image databases. Throughout the discussion, we introduce the ideas behind
the various techniques and compare their performance.
1. Introduction
The development of computer technology increases the importance of multimedia
information that must be browsed, retrieved and delivered. We cannot access or make use of this
information unless it is organized to allow efficient browsing, searching, and retrieval. The
spread of networked systems, among which the WWW plays a distinguished role, highlights
the importance of good retrieval systems, given the vast amount of information available.
Information Retrieval (IR) is usually based on document surrogates that hold essential
information in an easily processed form. Queries are performed on the surrogates instead of the
original documents. Traditional approaches rely on text surrogates for multimedia information as
well, e.g., legends or keywords are attached to images, videos, sounds, speech fragments, etc.
Retrieval is performed on these surrogates rather than on the original multimedia information.
This approach inherits the efficient technology developed for text retrieval but suffers from
two serious drawbacks, as Rui et al. point out [Rui98, Rui98a]: one is the manual effort
required to attach text descriptors to multimedia data; the other results from the
rich content of images and the different interpretations of the same image by different
users, which lead to inconsistent keyword assignment.
Techniques for content-based retrieval of multimedia data aim to overcome these
drawbacks by using numerical features computed by direct analysis of the information content.
Content-Based Image Retrieval (CBIR) was proposed in the early 1990s. CBIR systems
use visual features such as color, shape, texture, object placement, and line orientation to
represent the image content. This approach is attractive because features can be computed
automatically, and the information used during retrieval is always consistent and does not
depend on human interpretation.
In CBIR systems, data objects are represented by surrogates that are feature vectors. Given
an input image, retrieval is accomplished by extracting the query image’s visual features,
computing similarity coefficients or distances in the feature space between the feature
vector of the query image and those of the stored images, and then retrieving the matching or
most similar images from the database. The result is usually returned to the user ranked by
decreasing similarity. This process resembles text retrieval techniques, which use term vectors
to mark the presence, absence or relative importance of selected words representing the document
content.
Compared with traditional image retrieval approaches such as keyword annotation,
CBIR is more efficient and practical, and research in this field has developed rapidly.
While these research efforts establish the basis of CBIR, most of them largely ignore
two distinct characteristics of CBIR systems: (1) the gap between high-level concepts and
low-level features; (2) the subjectivity of human perception of visual content. To overcome these
shortcomings, the concept of relevance feedback (RF) in CBIR was proposed in
1996 [Rui98]. Relevance feedback is an interactive process in which the user judges the quality
of the retrieval by marking, among the images retrieved by the system,
those the user perceives as truly relevant. This information is then used to refine the original
query, which is resubmitted for a sharper selection.
In the past few years, the RF approach to image retrieval has been an active research field.
This powerful technique has proved successful in many application areas. Various ad hoc
parameter estimation techniques have been proposed for RF. In addition, methods that perform
RF on a multi-level image model have been formulated. RF is based on the vector model
[Buckley95, Salton83, Shaw95], the most popular model in information retrieval. Most
previous RF research [Aksoy00, Benitez98, Chang99, Chua98, Hsu95, Lee98, Low98,
Meilhac98, Paek99, Rui97a, Seidl97, Wood98] is based on low-level image features such as
color, texture and shape, and can be classified into two approaches: query point movement and
re-weighting [Lu00]. More recently, a trend has appeared towards taking advantage of the
semantic content of images in addition to low-level features.
In this paper, we focus primarily on recent studies of RF techniques in
content-based image retrieval. Throughout the discussion, we introduce the ideas behind
the various techniques and compare their performance. The rest of the paper is organized
as follows. Section 2 reviews the development of the RF technique and briefly introduces some
recent studies on RF techniques based on low-level features. Section 3 discusses the RF
techniques used in well-known image retrieval systems. Section 4 introduces the new
trends, including the use of the semantic content of images. Section 5 gives
concluding remarks.
2. Relevance Feedback Technique
RF is a widely accepted method of improving interactive retrieval effectiveness both in text
retrieval and image retrieval systems [Rui99]. The main idea of RF is: an initial search is made
by the system with a user-supplied query, returning a small number of results (documents or
images) to the users. The users indicate which of the returned results are useful (relevant). The
system then automatically refines the original query based upon those user relevance judgments.
The new feedback query is then compared to the collection of information (documents or
images), returning an improved set of documents or images to the user. This process can
continue to iterate until the user’s information need is satisfied.
In this section we briefly introduce the concept of RF in text-based
retrieval, present the ideas behind various recently proposed RF techniques in image retrieval
systems, and compare their performance.
2.1 Relevance Feedback Technique in Text Retrieval
RF was initially introduced in text retrieval as early as the late 1960s to increase the
number of relevant documents retrieved by a query [Rocchio66]. In RF, a user query is used to
start a search through the document collection. The documents obtained from this search are
examined, and certain terms selected from the documents deemed relevant are used to
expand the original query. Terms from documents deemed not relevant may also be used to
modify the original query. Rocchio’s original formula, shown in equation (F1), is based on the
vector-space model.
\[ Q_1 = Q_0 + \beta \sum_{k=1}^{n_1} \frac{R_k}{n_1} - \gamma \sum_{k=1}^{n_2} \frac{S_k}{n_2} \qquad \text{(F1)} \]
where
Q1 = new query vector
Q0 = initial query vector
Rk = vector for relevant document k
Sk = vector for non-relevant document k
n1 = number of relevant documents
n2 = number of non-relevant documents
β and γ = weight multipliers to control relative contributions of relevant and
non-relevant documents
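As an illustration, equation (F1) can be sketched in a few lines of Python. This is a minimal sketch, not a reference implementation: the function names, the plain-list vector representation, and the default values of β and γ are assumptions made here for the example and do not come from the cited work.

```python
from typing import List, Sequence

Vector = List[float]


def mean_vector(vectors: Sequence[Sequence[float]], dim: int) -> Vector:
    """Component-wise mean; the zero vector if the set is empty."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[j] for v in vectors) / len(vectors) for j in range(dim)]


def rocchio(q0: Sequence[float],
            relevant: Sequence[Sequence[float]],
            non_relevant: Sequence[Sequence[float]],
            beta: float = 0.75,    # illustrative default, not from the source
            gamma: float = 0.25    # illustrative default, not from the source
            ) -> Vector:
    """Q1 = Q0 + beta * mean(relevant) - gamma * mean(non_relevant), as in (F1)."""
    dim = len(q0)
    r_mean = mean_vector(relevant, dim)
    s_mean = mean_vector(non_relevant, dim)
    return [q0[j] + beta * r_mean[j] - gamma * s_mean[j] for j in range(dim)]


# Toy example: two relevant documents and one non-relevant document.
q1 = rocchio([1.0, 0.0],
             relevant=[[1.0, 1.0], [3.0, 1.0]],
             non_relevant=[[0.0, 4.0]],
             beta=1.0, gamma=0.5)
print(q1)  # [3.0, -1.0]
```

The new query moves towards the mean of the relevant vectors and away from the mean of the non-relevant ones; β and γ control how strongly each set pulls.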
Much of the early research on RF focused on the impact of the relevant and non-relevant
document weight multipliers β and γ [Lundquist97]. RF was found to be extremely effective in
text-based IR systems [Salton90, Salton89], and the technique is still adopted in some
specialized text retrieval applications, e.g., the concept-based relevance feedback for text
retrieval on the WWW proposed by Chang et al. [Chang99].
Techniques in content-based image retrieval (CBIR) systems lag far behind
their text counterparts [Rui99] because it is difficult for humans to express their visual
queries precisely. Recently, RF-based CBIR techniques have emerged as a promising research
direction. In the following, we look into the RF techniques applied in content-based image
retrieval systems.
2.2 Relevance Feedback Techniques in Image Retrieval
In this section, we introduce the main ideas of RF in image retrieval, including the image
model on which RF is based and some specific RF methods.
2.2.1 Basic ideas and methods
In image retrieval, Picard and Minka were the first to study RF, incorporating it
into a learning system named “FourEyes” [Meilhac98]. FourEyes offers a practical way to achieve
interactive performance using RF and is discussed at length in Section 3.
Since then, the application of RF has been studied intensively. RF techniques do not require
the user to provide an accurate initial query; instead, they estimate the user’s ideal query from
the positive and negative examples (training samples) fed back by the user. The fundamental goal
of these techniques is to estimate the ideal query parameters (both the query vectors and the
associated weights) accurately and robustly. However, most current RF-based systems
estimate the ideal query parameters only from low-level image features such as color, texture,
and shape. Most previous RF research can be classified into two approaches: query point
movement and re-weighting [Ishikawa98]. We describe some well-known low-level
feature-based RF techniques and the related theory below.
The query point movement method tries to improve the
estimate of the “ideal query point” by moving it towards good example points and away from
bad example points. A frequently used technique for iteratively improving this estimate is
Rocchio’s formula, given as equation (F1), applied to the sets of relevant documents Rk and
non-relevant documents Sk identified by the user [Lu00]. This technique is implemented in the
MARS system [Rui97].
The central idea behind the re-weighting method is simple and intuitive [Lu00]. Since
each image is represented by an N-dimensional feature vector, we can view it as a point in an
N-dimensional space. The weight associated with a feature defines the importance of that feature.
Intuitively, if all the relevant images have similar values for a feature, the feature can be
considered a good indicator of the user’s expectation. Conversely, if the variance of the good
examples is high along a principal axis j, we can deduce that the values on this axis are not
very relevant to the input query, so we assign it a low weight wj. In practice, the
inverse of the standard deviation calculated over this feature in the feature matrix is used to
update the weight wj: the smaller the variance, the higher the associated weight.
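The re-weighting rule can be sketched as follows. This is only an illustration under stated assumptions: the function name is hypothetical, the small constant `eps` (added to avoid division by zero when all relevant images agree exactly on a feature) and the final normalization so that the weights sum to one are choices made here, not part of the cited method.

```python
import statistics
from typing import List, Sequence


def inverse_std_weights(relevant: Sequence[Sequence[float]],
                        eps: float = 1e-9) -> List[float]:
    """Weight w_j = 1 / sigma_j over the relevant examples, normalized to sum to 1.

    `eps` and the normalization step are assumptions of this sketch.
    """
    dim = len(relevant[0])
    raw = []
    for j in range(dim):
        # Population standard deviation of feature j over the relevant images.
        sigma_j = statistics.pstdev([v[j] for v in relevant])
        raw.append(1.0 / (sigma_j + eps))
    total = sum(raw)
    return [w / total for w in raw]


# Feature 0 is nearly constant across the relevant images; feature 1 varies widely.
relevant_images = [[0.50, 0.1], [0.52, 0.9], [0.48, 0.5]]
weights = inverse_std_weights(relevant_images)
print(weights)  # feature 0 receives by far the larger weight
```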
Building on these two approaches, Rui et al. formulated methods that perform relevance feedback
on a multi-level image model [Rui97a, Rui98]. We now describe how an image object is modeled
under this scheme.
An image object O is represented by a triple, as in equation (M1):
O = O(D, F, R) (M1)
where
• D is the raw image data, e.g. a JPEG image.
• F = {fi} is a set of low-level visual features associated with the image object, such as
color, texture and shape.
• R = {rij} is a set of representations for a given feature fi, e.g. both color histogram and
color moments are representations for the color feature. Note that each representation rij
itself may be a vector consisting of multiple components, i.e.
rij = [rij1, rij2, …, rijK] (M2)
where K is the length of the vector.
This object model supports multiple representations with dynamically updated
weights to accommodate the rich content of image objects. Weights exist at various levels:
Wi, Wij, and Wijk are associated with features fi, representations rij, and components rijk,
respectively.
The goal of RF based on this model is to find the weights that best model the user’s
information need. RF has been developed into an effective technique for this
model.
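To make the three weight levels concrete, the following sketch holds an image object and the weights Wi, Wij and Wijk in simple Python structures and combines them into one distance. The class names, the feature and representation names, and the choice of a weighted Euclidean distance at the component level with weighted sums above it are assumptions made for illustration; the cited papers may combine the levels differently.

```python
import math
from dataclasses import dataclass
from typing import Dict, List

# feature name -> representation name -> component vector r_ij
Representations = Dict[str, Dict[str, List[float]]]


@dataclass
class ImageObject:
    data: bytes                 # D: raw image data
    features: Representations   # F and R: features and their representations


@dataclass
class Weights:
    feature: Dict[str, float]                      # W_i
    representation: Dict[str, Dict[str, float]]    # W_ij
    component: Dict[str, Dict[str, List[float]]]   # W_ijk


def distance(query: ImageObject, image: ImageObject, w: Weights) -> float:
    """Hierarchical distance (an assumed combination rule for this sketch)."""
    total = 0.0
    for f_name, reps in query.features.items():
        f_dist = 0.0
        for r_name, q_vec in reps.items():
            i_vec = image.features[f_name][r_name]
            comp_w = w.component[f_name][r_name]
            # Component level: weighted Euclidean distance using W_ijk.
            r_dist = math.sqrt(sum(wk * (a - b) ** 2
                                   for wk, a, b in zip(comp_w, q_vec, i_vec)))
            f_dist += w.representation[f_name][r_name] * r_dist  # W_ij
        total += w.feature[f_name] * f_dist                       # W_i
    return total


query = ImageObject(b"", {"color": {"histogram": [1.0, 0.0], "moments": [0.5]},
                          "texture": {"wavelet": [0.2, 0.2]}})
image = ImageObject(b"", {"color": {"histogram": [0.0, 0.0], "moments": [0.5]},
                          "texture": {"wavelet": [0.2, 0.2]}})
weights = Weights(
    feature={"color": 0.5, "texture": 0.5},
    representation={"color": {"histogram": 0.5, "moments": 0.5},
                    "texture": {"wavelet": 1.0}},
    component={"color": {"histogram": [1.0, 1.0], "moments": [1.0]},
               "texture": {"wavelet": [1.0, 1.0]}},
)
print(distance(query, image, weights))  # 0.25
```

Relevance feedback on this model then amounts to adjusting the entries of `Weights` between iterations while the image objects stay fixed.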
MARS97 [Rui97] introduced both a query vector moving technique and a re-weighting
technique to estimate the ideal query parameters. MindReader [Ishikawa98] formulated
parameter estimation as a minimization problem. These two are among the
best-known RF techniques. Compared with MARS97 and other earlier RF
systems, the global optimization performed in MindReader marks a trend towards a more
robust theoretical basis for RF systems, which is discussed further in Section
4.1.1.
2.2.2 Some other proposed RF methods
Chua et al. [Chua98a] proposed an RF approach to content-based image retrieval using
multiple attributes, and applied it to the text and color attributes of images.
To ensure that meaningful features are extracted, a pseudo-object model based on the color
coherence vector [Pass96] is adopted to model color content. The approach employs
techniques developed in information retrieval and machine learning to extract
pertinent attributes in integrated content-based image retrieval. Retrieval using the individual
attributes of text or color achieves similar levels of retrieval effectiveness, but each
retrieves only a subset of the relevant images. To improve overall retrieval
effectiveness, most image retrieval systems perform multiple-attribute retrieval by combining
evidence [Faloutsos94, Low98, Ortega97]. Such techniques have improved
retrieval performance, but they are not fully satisfactory, since different attributes tend to
have different degrees of importance for different classes of queries. In particular, without
detailed knowledge of the collection make-up and retrieval environment, most users find it
difficult to formulate effective queries. Chua et al. therefore adopted the color coherence
vector (CCV) as the representation of the color attribute [Pass96]. By treating the CCV’s
color-coherent component as a low-level representation of objects within the image content, they
developed a pseudo-object-based method for image retrieval with RF. The main idea is to use the
pseudo-object representation for feedback, allowing the system to retrieve more new relevant
images with similar objects. They also use the user’s relevance judgements to estimate the
importance of the different attributes in a multi-attribute image retrieval model. Their test
results indicate that the pseudo-object model and the proposed RF approach are more
effective than standard RF approaches.
Aksoy et al. [Aksoy00] presented a weighted distance approach in which the weights are
the ratios between the standard deviations of the feature values over the whole database and
those among the images selected as relevant by the user. The feedback is used for both
independent and incremental updating of the weights, and these weights iteratively refine the
influence of the different features on the database search. The experiments in
[Aksoy00] showed that average retrieval performance improved greatly after the first
iteration.
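A ratio-based weighting of this kind might look like the sketch below. The direction of the ratio (database deviation divided by relevant-set deviation, so that a feature that is tightly clustered among relevant images but spread out in the database gets a large weight) is an assumption of this sketch, as are the function name and the `eps` guard; the exact definition should be taken from [Aksoy00].

```python
import statistics
from typing import List, Sequence


def ratio_weights(database: Sequence[Sequence[float]],
                  relevant: Sequence[Sequence[float]],
                  eps: float = 1e-9) -> List[float]:
    """Assumed form: w_j = sigma_j(database) / sigma_j(relevant)."""
    dim = len(database[0])
    weights = []
    for j in range(dim):
        sigma_db = statistics.pstdev([v[j] for v in database])
        sigma_rel = statistics.pstdev([v[j] for v in relevant])
        weights.append(sigma_db / (sigma_rel + eps))  # eps avoids division by zero
    return weights


database = [[0.0, 0.0], [1.0, 1.0], [0.5, 0.5], [0.2, 0.8]]
relevant = [[0.50, 0.0], [0.52, 1.0]]  # feature 0 is consistent, feature 1 is not
print(ratio_weights(database, relevant))  # weight of feature 0 exceeds that of feature 1
```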
In addition, MacArthur et al. [MacArthur00] proposed a relevance feedback retrieval system
that, at each retrieval iteration, builds a decision tree to uncover a common thread among all
images marked as relevant. The tree is then used as a model for inferring which of the unseen
images the user would most likely desire. They demonstrated in [MacArthur00] that retrieval
precision increases after one or two iterations and that their retriever is fast enough for
online use.
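The idea of learning a tree from the user's judgements can be illustrated with the smallest possible case, a depth-one tree (a decision stump). This is a deliberate simplification and not the algorithm of [MacArthur00]: the function names, the error-count split criterion, and the restriction to a single split are assumptions of this sketch.

```python
from typing import List, Sequence, Tuple

Stump = Tuple[int, float, bool]  # (feature index, threshold, "above means relevant")


def fit_stump(samples: Sequence[Sequence[float]], labels: Sequence[bool]) -> Stump:
    """Pick the single feature/threshold split with the fewest training errors."""
    best = None
    for j in range(len(samples[0])):
        values = sorted({s[j] for s in samples})
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2.0
            for above_is_relevant in (True, False):
                errors = sum(((s[j] > threshold) == above_is_relevant) != y
                             for s, y in zip(samples, labels))
                if best is None or errors < best[0]:
                    best = (errors, j, threshold, above_is_relevant)
    if best is None:
        raise ValueError("no feature separates the samples")
    return best[1], best[2], best[3]


def predict(stump: Stump, sample: Sequence[float]) -> bool:
    """True if the stump classifies the unseen image as relevant."""
    j, threshold, above_is_relevant = stump
    return (sample[j] > threshold) == above_is_relevant


# Images marked by the user: the first two relevant, the last two not.
marked = [[0.9, 0.2], [0.8, 0.7], [0.2, 0.3], [0.1, 0.8]]
judgements = [True, True, False, False]
stump = fit_stump(marked, judgements)
print(stump)                        # (0, 0.5, True)
print(predict(stump, [0.7, 0.5]))   # True
print(predict(stump, [0.3, 0.5]))   # False
```

A full decision tree would apply the same split search recursively to each side of the chosen threshold.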
All the approaches described above perform RF at the low-level feature vector level and
fail to take into account the semantics of the images themselves. The inherent problem
with these approaches is that low-level features are often not as powerful in representing
the complete semantic content of images as keywords are in representing text documents. In other
words, applying the RF approaches of text information retrieval to low-level
feature-based image retrieval will not be as successful as in text document retrieval. In view
of this, there have been efforts to incorporate semantics into RF for image retrieval. Section 4
introduces this new trend.
3. Application System Examples
Quite a few well-known CBIR systems have adopted relevance feedback to achieve
better efficiency and user satisfaction. We briefly discuss two typical systems:
• Photobook
• PicToSeek
3.1. Photobook (FourEyes)
Photobook, developed at MIT’s Media Laboratory, is a set of tools for performing queries
on image databases based on image content. Photobook has proved to be one of the most
influential of the early CBIR systems. FourEyes, embedded in the most recent version of
Photobook, is an interactive, power-assisted tool for segmenting and annotating images that
includes the human in the annotation and retrieval loop. FourEyes differs from tools such as
QBIC, Virage and CORE, which all support search on various features but offer little assistance
in actually choosing one for a given task; FourEyes, in contrast, allows users to express their
intent directly. This is possible because FourEyes computes information-preserving
features, that is, features from which all essential aspects of the original image can in theory
be reconstructed. Features relevant to a particular type of search can then be computed at
search time.
Figure 3.1: Interface for Photobook (source: http://vismod.www.media.mit.edu/~tpminka/photobook/foureyes/)
FourEyes tries to overcome the difficulty of dimensional explosion in feature
space by using a “society of models” [Minka96]. In an initial off-line phase, a number of
different filtering techniques are applied to the data to cluster it hierarchically in as many
ways as possible. In the on-line query phase, the user clicks on some regions and gives
them a label, and FourEyes extrapolates the label to other regions in the image and in the
database using a shared-neighbor clustering algorithm. To produce a labeling, FourEyes
selects and combines models from the “society of models”. A grouping is a set of image regions
that are associated in some way; the groupings that best represent classes of scenes are
identified and employed in the on-line query phase. The elements of a grouping need not
come from the same image, so FourEyes has both within-image groupings and across-image
groupings. Incorporating both within- and across-image groupings brings the advantages
of class-likelihood continuity and self-improvement.
Once a set of groupings has been formed, FourEyes selects and combines these groupings to
form compound groupings for the user. User feedback is fed back to alter the clustering
according to the success or failure of the query, thus adapting the groupings to the user’s needs.
FourEyes uses a learner with a simple concept language but an adaptive weighting mechanism, so
that it is easy to steer in the desired direction. Each grouping has its own weight. Starting
from an empty union, the learner adds to the compound grouping the grouping that maximizes the
product of this number and the prior weight of the grouping.
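The greedy construction of a compound grouping can be sketched as below. The text does not say which quantity is multiplied by the prior weight; this sketch assumes it is the number of not-yet-covered positive examples that a grouping contains, and it further assumes that groupings containing any negative example are excluded. Both assumptions, and all names used, are illustrative only.

```python
from typing import Dict, List, Set


def build_compound_grouping(groupings: Dict[str, Set[int]],
                            prior_weights: Dict[str, float],
                            positives: Set[int],
                            negatives: Set[int]) -> List[str]:
    """Greedily pick groupings until every positive example is covered (if possible)."""
    uncovered = set(positives)
    chosen: List[str] = []
    while uncovered:
        best_name, best_score = None, 0.0
        for name, members in groupings.items():
            # Assumption: a grouping that contains a negative example is unusable.
            if name in chosen or members & negatives:
                continue
            # Assumed score: newly covered positives times the prior weight.
            score = len(members & uncovered) * prior_weights[name]
            if score > best_score:
                best_name, best_score = name, score
        if best_name is None:
            break  # no remaining grouping covers any uncovered positive
        chosen.append(best_name)
        uncovered -= groupings[best_name]
    return chosen


groupings = {"g1": {1, 2, 3}, "g2": {3, 4}, "g3": {4, 5, 9}, "g4": {5}}
prior = {"g1": 1.0, "g2": 1.0, "g3": 2.0, "g4": 1.0}
print(build_compound_grouping(groupings, prior,
                              positives={1, 2, 3, 4, 5}, negatives={9}))
# ['g1', 'g2', 'g4']
```

Under this reading, adapting to the user amounts to raising or lowering the prior weights of groupings according to how useful they proved in earlier interactions.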
Figure 3.1 shows the interface of the Photobook system. Experimental results show that the
“society of models” approach is effective for interactive image annotation and retrieval.
3.2. PicToSeek
The PicToSeek system [Gevers98] is used to explore visual information on the World Wide
Web. PicToSeek adopts relevance feedback to offer users interactive and iterative
content-based image retrieval. Through its Java interface, PicToSeek allows the user to
choose among different matching levels (fuzzy, exact and so on), different feature types (RGB
transition, H transition, etc.), and different image types (photograph, graphic, picture). The
user can pick an example image either by loading it from a URL or by browsing the database on the
PicToSeek server (see Figure 3.2 for the PicToSeek interface). The server searches the
database and returns the results to the user. From the user’s feedback in the form of negative
and positive images, the learning method automatically learns which image features are more
important. The effect of this process is to move the query point towards the relevant images and
away from the non-relevant ones.
PicToSeek uses Fisher’s linear discriminant method to classify images into two groups:
photographic and synthetic [Gevers98]. The classifier computes three features: color
variation, color saturation and color transition strength. Photographic and synthetic
images tend to differ markedly in these features. Furthermore, images are also
automatically cataloged according to characteristics such as JFIF-GIF, gray-color, size and
creation date.
Figure 3.2: The interface of the PicToSeek WWW search system (see http://zomax.wins.uva.nl:5345/ret_user/)
The detailed retrieval model of PicToSeek is described in [Gevers98]. PicToSeek uses
the vector feature model. Each image is represented by an image vector of the form
\[ I = (f_0, w_{I0};\ f_1, w_{I1};\ \ldots;\ f_n, w_{In}) \]
A query is represented in the same form by its corresponding vector
\[ Q = (Q_0, w_{Q0};\ Q_1, w_{Q1};\ \ldots;\ Q_n, w_{Qn}) \]
PicToSeek assigns weights to the different features using the formula
\[ w_i = \left(0.5 + 0.5\,\frac{ff_i}{\max\{ff_i\}}\right) \log\left(\frac{N}{n}\right) \]
which takes into consideration both high feature frequencies and low overall collection
frequencies. PicToSeek returns to the user the images stored in the database that are most
“similar” to the query image. Similarity between the query and an image is measured by a
similarity function. PicToSeek chooses the cross-correlation similarity measure, which provides
the best retrieval accuracy in the absence of object clutter; it is defined as
\[ S(Q, I) = \frac{\sum_{k=1}^{n} \min(w_{Qk}, w_{Ik})}{\sum_{k=1}^{n} w_{Qk}} \]
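The similarity function S(Q, I) is straightforward to compute once the query and image weights are available. The sketch below takes the two weight lists directly; the function name and the convention of returning 0 for an all-zero query are assumptions of this example.

```python
from typing import Sequence


def similarity(query_weights: Sequence[float], image_weights: Sequence[float]) -> float:
    """S(Q, I) = sum_k min(w_Qk, w_Ik) / sum_k w_Qk."""
    denominator = sum(query_weights)
    if denominator == 0:
        return 0.0  # convention chosen for this sketch
    overlap = sum(min(wq, wi) for wq, wi in zip(query_weights, image_weights))
    return overlap / denominator


q = [0.5, 0.25, 0.25]
i = [0.25, 0.25, 0.5]
print(similarity(q, i))  # 0.75
print(similarity(q, q))  # 1.0
```

The measure is normalized by the query weights only, so S(Q, I) and S(I, Q) can differ when the two weight vectors have different sums.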
Relevance feedback is a method of feature selection and weighting. Users feed back
information on negative and positive images. The system learns from this feedback which image
features are more important and retrieves images according to the new feature weighting.
PicToSeek allows the specification of relevant and non-relevant images and sub-images. The
relevance feedback process is formulated as
\[ Q' = \alpha Q + \beta \sum_{D_i \in rel} \frac{D_i}{|D_i|} - \gamma \sum_{D_i \in nonrel} \frac{D_i}{|D_i|} \]
The aim of relevance feedback is to produce an improved query specification. The user need not
give a precise initial query formulation; relevance feedback moves the query in the
user-desired direction.
4. New Trend
Recently, some new trends have emerged in relevance feedback techniques for
content-based image retrieval. Our survey shows that there are roughly
two main trends in this field:
• Deriving more computationally robust methods that perform global optimization
of the weight adjustment as well as of the correlations between different attributes.
• Incorporating semantic information together with low-level features into the relevance
feedback process for image retrieval.
Besides these two main trends, this section also covers other techniques such as query
expansion and the storing of feedback information.
4.1 Global Optimization
4.1.1 MindReader Retrieval System
Recently, more computationally robust methods that perform global optimization have
been proposed. The MindReader retrieval system designed by Ishikawa et al. [Ishikawa98]
formulates parameter estimation as a minimization problem. The key point is that, unlike
traditional retrieval systems, whose distance functions can be represented by ellipses aligned
with the coordinate axes, MindReader proposes a distance function that is not necessarily
aligned with the coordinate axes. It therefore allows for correlations between attributes in
addition to different weights on each component.
Ishikawa et al. explicitly point out an innate incompleteness in MARS
1997 [Rui97]. Consider the two main techniques (query-point movement and re-weighting) used in
MARS 1997. 1) In query-point movement, Rui and Huang generate pseudo-document vectors
from image feature vectors (by a method called “inverse document frequency”) and then directly
apply Rocchio’s formula. Although this technique is based on similarity-based query
processing, the similarity values can easily be transformed into straight Euclidean distances.
2) In the standard deviation method used in [Rui97], the basic idea is intuitive: if the
variance of the good examples is high along, say, the j-th axis, then the j-th axis should have
a low weight wj. The inverse of the standard deviation of the j-th feature values in the feature
matrix is therefore used as the weight wj for feature j, that is, wj = 1/σj. The resulting
weights wj are used to compute the new similarity values for images. Ishikawa et al. do not
dispute the intuition behind this method, but they point out that [Rui97] gives no justification
for the specific choice wj = 1/σj. In fact, any decreasing function of σj would be a candidate
for the weight, such as 1/log(σj).
Having identified this incompleteness in MARS 1997, Ishikawa et al. proposed a more robust
method, which they claim includes both of the above query refinement techniques as
special cases. This method does not use heuristics such as β and γ in Rocchio’s
formula but directly seeks an optimal solution to a minimization problem over a “hidden
distance” function. This function allows not only for different weights for each attribute, but
also for correlations between attributes. Figure 1 [Ishikawa98] illustrates
the different 2-D distance functions. The left panel shows the isosurfaces of the straight
Euclidean distance function, which are circles; the central panel shows the isosurfaces of a
weighted Euclidean distance, as in MARS 1997, which are ellipses whose major axes must be
aligned with the coordinate axes; the right panel shows the distance function proposed by
Ishikawa et al., which also has ellipses for its isosurfaces, but whose major axes are not
necessarily aligned with the coordinate axes. In other words, Ishikawa et al. use a general
ellipse as the model of their distance function.
Figure 1: Different distance functions around the query point q (left to right: Euclidean, weighted Euclidean, generalized ellipsoid distance)
As stated in [Ishikawa98], the proposed distance function is:
\[ D(\vec{x}, \vec{q}) = (\vec{x} - \vec{q})^{T} M (\vec{x} - \vec{q}) \]
where M is the matrix defining a generalized ellipsoid distance.
The generalized ellipsoid distance function includes the straight and weighted
Euclidean distance functions as special cases. Moreover, [Ishikawa98] proves that
the weighting scheme of MARS 1997 is optimal if M is restricted to a diagonal matrix, but that
MARS 1997 is unable to “guess” a generalized ellipsoid distance. Furthermore, the experiments
reported in that paper show that searching is fast, owing to recent developments in generalized
ellipsoid queries in spatial access methods [Seidl97].
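The three cases can be checked numerically with a small sketch of the quadratic form D(x, q) = (x − q)^T M (x − q). The function name and the example matrices are chosen here for illustration; the code evaluates the form exactly as written above (without a square root), so the identity matrix yields the squared Euclidean distance.

```python
from typing import Sequence


def ellipsoid_distance(x: Sequence[float], q: Sequence[float],
                       m: Sequence[Sequence[float]]) -> float:
    """Quadratic form (x - q)^T M (x - q)."""
    d = [xi - qi for xi, qi in zip(x, q)]
    n = len(d)
    return sum(d[i] * m[i][j] * d[j] for i in range(n) for j in range(n))


x, q = [2.0, 1.0], [0.0, 0.0]

identity = [[1.0, 0.0], [0.0, 1.0]]      # straight (squared) Euclidean distance
diagonal = [[4.0, 0.0], [0.0, 1.0]]      # weighted Euclidean: axis-aligned ellipses
general = [[1.0, -0.5], [-0.5, 1.0]]     # off-diagonal terms model correlation

print(ellipsoid_distance(x, q, identity))  # 5.0
print(ellipsoid_distance(x, q, diagonal))  # 17.0
print(ellipsoid_distance(x, q, general))   # 3.0
```

Only the third matrix has non-zero off-diagonal entries, which is what tilts the ellipse away from the coordinate axes.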
4.1.2 MARS 1999
Building on MindReader’s global minimization approach, Rui and Huang [Rui99] give a further
improvement. Their CBIR system not only formulates the
optimization problem but also takes the multi-level image model into account.
Using Lagrange multipliers, Rui and Huang derive explicit
optimal solutions for both the query vectors and the weights associated with the multi-level
image model. That is, they combine the two best-known relevance feedback techniques
(MindReader and MARS) to overcome the shortcomings of each. For example,
although MindReader formulates a more rigorous estimation process than MARS and thereby
overcomes the shortcomings of heuristic-based parameter estimation, it fails to analyze
the conditions necessary for the technique to work. Moreover, neither technique takes into
account that images contain multiple levels of content. In our view, one of the main
contributions of [Rui99] is to embed the multi-level image model into the optimal solution and
to develop a formulation that guarantees explicit optimal solutions while keeping the problem as
general as possible. The formulas derived in [Rui99] are claimed to be
general enough to include both MARS and MindReader as special cases.
4.2 Semantic Information in Relevance Feedback
All the techniques discussed so far perform feedback only at the low-level
feature vector level and fail to take into account the semantic information
of the images themselves. As mentioned in [Lu00], the inherent problem with these approaches
is that low-level features are often not as powerful in representing the complete semantic
content of images as keywords are in representing text documents. In view of this, there have
recently been efforts to incorporate semantics into relevance feedback for content-based image
retrieval. The framework proposed in [Lee98, Paek99] embeds semantic information into a low-level
feature-based image retrieval process by using a correlation matrix. In this framework, the
user’s feedback helps the system learn the semantic relevance between image clusters, and this
information can be used to improve retrieval performance. Another framework is Lu and
Zhang’s Ifind information retrieval system, which integrates both semantics and low-level
features into the relevance feedback process in a new way. The basic idea is to construct a
semantic network and integrate it with low-level feature vector based relevance feedback
using a modified form of Rocchio’s formula. The semantic network is represented by a set of
keywords with links to the images in the database. A weight is assigned to each individual
link [Lu00]:
Figure 2: Semantic network (keywords such as Keyword 1 and Keyword 2 are linked to images by weighted links w11, w12, …, w1n)
The weight on each link represents the degree of relevance of the keyword to the semantic
content of the associated image. However, there still seem to be some problems with this kind
of semantic network. First, it does not appear to be as rigorous as the 1997 and 1999 versions
of MARS or MindReader. For example, the paper does not mention any normalization method for the
weights assigned to the keyword links according to their relevance. Second, the proposed
criteria for weight adjustment seem to have no theoretical basis. Another problem is
scalability: as the vocabulary expands, the proposed framework may have considerable trouble
dealing with the increased complexity. Overall, the framework proposed in [Lu 00] is not as
solid as MARS and MindReader.
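As a rough illustration of the data structure in Figure 2, the following Python sketch stores keyword-to-image link weights in a dictionary and adjusts them from user feedback. The class name `SemanticNetwork`, the increment/decrement update rule, and the per-keyword normalization are assumptions made for illustration only. They are not the weight-adjustment criteria of [Lu 00]; the normalization step is included only because its absence is one of the concerns raised above.

```python
from collections import defaultdict


class SemanticNetwork:
    """Keywords linked to images, one weight per link (hypothetical sketch)."""

    def __init__(self):
        # weights[keyword][image_id] -> link weight
        self.weights = defaultdict(dict)

    def link(self, keyword, image_id, weight=1.0):
        self.weights[keyword][image_id] = weight

    def feedback(self, query_keywords, relevant, non_relevant, step=1.0):
        """Assumed rule: strengthen links to relevant images, weaken the others."""
        for kw in query_keywords:
            links = self.weights[kw]
            for img in relevant:
                links[img] = links.get(img, 0.0) + step
            for img in non_relevant:
                if img in links:
                    links[img] = max(0.0, links[img] - step)

    def normalized(self, keyword):
        """Scale a keyword's link weights so they sum to 1 (an assumption)."""
        links = self.weights.get(keyword, {})
        total = sum(links.values())
        if total == 0:
            return {img: 0.0 for img in links}
        return {img: w / total for img, w in links.items()}

    def semantic_score(self, query_keywords, image_id):
        """Sum of normalized link weights between the query keywords and an image."""
        return sum(self.normalized(kw).get(image_id, 0.0) for kw in query_keywords)


net = SemanticNetwork()
net.link("sunset", "img1")
net.link("sunset", "img2")
net.feedback(["sunset"], relevant=["img1", "img3"], non_relevant=["img2"])
scores = {img: net.semantic_score(["sunset"], img) for img in ("img1", "img2", "img3")}
```

In this toy run the image marked relevant twice over (initial link plus feedback) receives the largest share of the keyword's weight, while the image marked non-relevant drops to zero. A real system would combine such a semantic score with the low-level feature similarity.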
4.3 Other Trends in Feedback Information Retrieval
Other new trends and techniques include query expansion [Porkaew 99a, Porkaew 99b] and
storing feedback information [Bartolini 00]. In query expansion, at each iteration of feedback
the relevant objects are added to the query and the non-relevant ones are removed; this has
been reported to perform better than the query point movement approach. Another new idea is to
store the outcome of a feedback process when the process terminates. Instead of starting each
new feedback process (for the same query) with default parameters, Bartolini and Ciaccia
proposed using wavelets to store the parameters and to predict parameter settings for similar
queries by interpolation. That is, the feedback process for a new query can start from a
parameter setting that is usually much better (much closer to the optimal) than the default
parameters, so that higher effectiveness and shorter response time can be achieved.
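The difference between query expansion and query point movement can be sketched as follows. This is a minimal illustration, not the MARS implementation: the function names, the use of Euclidean distance, and the choice of scoring an object by its distance to the nearest query point are all assumptions.

```python
import math


def expand_query(query_points, relevant, non_relevant):
    """Query expansion: add relevant objects to the query, drop non-relevant ones."""
    kept = [p for p in query_points if p not in non_relevant]
    for p in relevant:
        if p not in kept:
            kept.append(p)
    return kept


def multipoint_distance(query_points, obj):
    """Distance from an object to a multi-point query (assumed: nearest query point)."""
    return min(math.dist(q, obj) for q in query_points)


def rank(query_points, database):
    """Return database objects ordered from most to least similar."""
    return sorted(database, key=lambda obj: multipoint_distance(query_points, obj))


database = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0), (9.0, 9.0)]
query = [(0.0, 0.0)]
# The user marks (5.0, 5.0) as relevant; nothing is marked non-relevant.
query = expand_query(query, relevant=[(5.0, 5.0)], non_relevant=[])
ranking = rank(query, database)
```

With two query points, objects close to either example rank near the top. A single moved query point would instead lie somewhere between the two examples and could rank objects from neither group highly.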
5 Conclusion
In this paper, we have presented a brief survey of the relevance feedback (RF) techniques
used in image retrieval systems, especially in CBIR systems.
CBIR has emerged as one of the most active research areas in recent years. Retrieval is
accomplished by computing the similarities of individual feature representations with fixed
weights. Although CBIR has been widely implemented and adopted, its usefulness is limited by
the difficulty of representing high-level concepts using low-level features and by the
subjectivity of human perception.
Relevance feedback is an effective technique for improving retrieval effectiveness. The user
need not decompose the information of interest into different low-level feature representations
or specify all the associated weights precisely. With RF techniques, users can submit a coarse
query at the beginning, and the system continuously and automatically refines the query results
based on increasingly accurate feedback from the users.
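The refinement loop described above can be illustrated with the classical Rocchio update, in which the query vector is moved toward the mean of the relevant examples and away from the mean of the non-relevant ones. The sketch below is plain Python; the function names and the values of `alpha`, `beta` and `gamma` are illustrative assumptions, not settings taken from any of the surveyed systems.

```python
def mean_vector(vectors, dim):
    """Component-wise mean; returns the zero vector for an empty list."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def rocchio_update(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.25):
    """One feedback iteration: alpha*Q + beta*mean(relevant) - gamma*mean(non_relevant)."""
    dim = len(query)
    rel = mean_vector(relevant, dim)
    non = mean_vector(non_relevant, dim)
    return [alpha * query[i] + beta * rel[i] - gamma * non[i] for i in range(dim)]


query = [1.0, 0.0]
relevant = [[0.0, 2.0], [0.0, 4.0]]
non_relevant = [[4.0, 0.0]]
refined = rocchio_update(query, relevant, non_relevant)
```

Here the coarse initial query `[1.0, 0.0]` becomes `[0.0, 2.25]` after one round of feedback: it has moved toward the relevant examples along the second dimension and away from the non-relevant example along the first.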
As discussed above, RF is employed to improve the retrieval effectiveness of the overall
system, whether it integrates semantics, low-level features, or both. MARS and MindReader are
the best-known RF systems to date. Experiments show that RF is an effective technique for
improving the effectiveness of queries against a database. However, some problems remain in
the RF research area:
1) First, most current RF based systems estimate the ideal query parameters using only
low-level image features such as color, texture, and shape. These systems work well if
the feature vectors can capture the essence of the query. On the other hand, if the user
is searching for a specific object that cannot be sufficiently represented by
combinations of the available feature vectors, these RF systems will not return many
relevant results even with a large amount of user feedback [Lu 00]. Moreover, the
surveyed work rarely addresses the time complexity of extracting low-level features from
images or of adjusting the weights.
2) Second, even those systems that embed semantic information (such as a keyword index)
still have trouble with weight normalization, threshold selection, and narrowing the
return-sample space.
3) Third, no current system can support object-level queries, even when given many
low-level features together with keyword information. This is also a major research
issue in the content-based retrieval community.
Employing RF in image retrieval systems has proved to be a very promising way to improve the
retrieval effectiveness of the overall system. We believe that future work in this field will
contribute greatly to information retrieval research.
References
[Aksoy00] Selim Aksoy and Robert M. Haralick, “A Weighted Distance Approach to Relevance
Feedback,” Proceedings of the International Conference on Pattern Recognition (ICPR’00).
[Allan96] J. Allan. Incremental relevance feedback for information filtering. In Proc. ACM
SIGIR Conf., Zurich, Switzerland, August 1996.
[Ana98] Ana Lelescu, Ouri Wolfson and Bo Xu, “Approximate Retrieval from Multimedia
Databases Using Relevance Feedback,” Proceedings of the String Processing and Information
Retrieval Symposium & International Workshop on Groupware, 1998.
[Bach] J. R. Bach, C. Fuller et al., “The Virage image search engine: an open framework for
image management,” in Proc. SPIE Storage and Retrieval for Image and Video Databases.
[Bartolini00] I. Bartolini, P. Ciaccia and F. Waas, “Using the Wavelet Transform to Learn from
User Feedback,” In Proceedings of the 1st DELOS Workshop on "Information Seeking,
Searching and Querying in Digital Libraries" (DELOS’00 - Network of Excellence on Digital
Libraries), Zurich, Switzerland, December 2000.
[Benitez98] Ana B. Benitez, Mandis Beigi, and Shih-Fu Chang, “Using Relevance Feedback in
Content-Based Image Metasearch,” IEEE Internet Computing, Vol. 2, No. 4, July/August 1998.
[Buckley95] Buckley, C., and Salton, G. “Optimization of Relevance Feedback Weights,” in
Proc of SIGIR’95.
[Chang99] Chia-Hui Chang, Student Member, IEEE, Ching-Chi Hsu, “Enabling Concept-Based
Relevance Feedback for Information Retrieval on the WWW,” IEEE Transactions on Knowledge
and Data Engineering, Vol. 11, No. 4, July/August 1999.
[Chua98a] Tat-Seng Chua, Chun-Xin Chu and Mohan Kankanhalli, “Relevance Feedback
Techniques for Image Retrieval Using Multiple Attributes,” Proceedings of the IEEE
International Conference on Multimedia Computing and Systems, Volume I, 1998.
[Chua98] T.S. Chua, W.C. Low and C.X. Chu, “Relevance Feedback Techniques for Color-
based Image Retrieval,” Proceedings of the 1998 MultiMedia Modeling.
[Cox96] I.J. Cox, M.L. Miller, S.M. Omohundro & P.N. Yianilos. PicHunter: Bayesian relevance
feedback for image retrieval. Int’l Conference on Pattern Recognition, 361-369, 1996.
[Doulamis98] Anastasios D. Doulamis, Yannis S. Avrithis, Nikolaos D. Doulamis and Stefanos
D. Kollias, “Interactive Content-Based Retrieval in Video Databases Using Fuzzy Classification
and Relevance Feedback,” Proceedings of the IEEE International Conference on Multimedia
Computing and Systems, Volume II, 1998.
[Faloutsos94] C. Faloutsos, R. Barber, M. Flickner, J. Hafner, W. Niblack, D. Petkovic & W.
Equitz. Efficient and effective querying by image content. Journal of Intelligent Information
Systems, 231-262, 1994.
[Gevers98] Theo Gevers and Arnold W.M. Smeulders, “The PicToSeek WWW Image Search
System,” Proceedings of the IEEE International Conference on Multimedia Computing and
Systems, Volume I, 1998.
[Hichem99] Frigui Hichem, “Interactive Image Retrieval Using Fuzzy Sets,” IEEE Trans. on
Image Processing, Abstract received March 24, 1999.
[Hsieh98] Jun-Wei Hsieh, Cheng-Chin Chiang and Yea-Shuan Huang, “Using Relevance
Feedback to Learn Visual Concepts from Image Instances,” Proceedings of the 10th
International Conference on Image Analysis and Processing, 1998.
[Hsu95] W. Hsu, T.S. Chua & H.K. Pung. Integrated color-spatial approach to content-based
image retrieval. ACM Multimedia ’95, 305-313, 1995.
[Ide71] Ide, E., “New Experiments in Relevance Feedback,” Gerard Salton, Editor, The SMART
Retrieval System, Prentice-Hall, Inc., Englewood Cliffs, New Jersey, 1971.
[Ishikawa98] Y. Ishikawa, R. Subramanya, and C. Faloutsos, “MindReader: Querying databases
through multiple examples,” In Proc. of the 24th VLDB Conference, (New York), 1998.
[Junichi00] Tatemura, Junichi, “Graphical relevance feedback: Visual exploration in the
document space,” IEEE SYMP VISUAL LANG PROC, IEEE, LOS ALAMITOS, CA, (USA), pp.
39-46, 2000.
[Lee98] Lee, C., Ma, W. Y., and Zhang, H. J. “Information Embedding Based on User’s
Relevance Feedback for Image Retrieval,” Technical Report HP Labs, 1998.
[Lewis95] DD Lewis. Active by accident: Relevance feedback in information retrieval. In AAAI
Fall Symposium on Active Learning, 1995.
[Lipson97] P. Lipson, E. Grimson, and P. Sinha, “Context and configuration based Scene
Classification,” Proceedings of IEEE Int. Conf. on Computer Vision and Pattern Recognition,
1997.
[Low98] W.C. Low & T.S. Chua. Color-based relevance feedback for image retrieval. In
International workshop on MM DBMS, Dayton, USA, 116-123. IEEE Computer Society, 1998.
[Lu 00] Y. Lu, C.-H. Hu, X.-Q. Zhu, H.-J. Zhang and Q. Yang, “A Unified Framework for
Semantics and Feature Based Relevance Feedback in Image Retrieval Systems,” ACM
Multimedia, 2000.
[Lundquist97] Lundquist, C., D. Grossman, and O. Frieder, “Improving Relevance Feedback in
the Vector-Space Model,” Proceedings of the Sixth ACM International Conference on
Information and Knowledge Management, 1997.
[Lundquist99] Lundquist, Carol; Frieder, Ophir; Holmes, David O; Grossman, David, “Parallel
relational database management system approach to relevance feedback in information
retrieval,” Journal of the American Society for Information Science [J. Am. Soc. Inf. Sci.], vol.
50, no. 5, pp. 413-426, 1999.
[MacArthur00] S. MacArthur, C. Brodley and C. Shyu, “Relevance Feedback Decision Trees in
Content-Based Image Retrieval,” Proceedings of the IEEE Workshop on Content-based Access
of Image and Video Libraries (CBAIVL’00).
[Maron98] O. Maron, “Learning from Ambiguity,” Doctoral Thesis, Dept. of Electrical
Engineering and Computer Science, M.I.T., June 1998.
[Meilhac98] Christophe Meilhac and Chahab Nastar, “Relevance Feedback and Category Search
in Image Databases,” Proceedings of the IEEE International Conference on Multimedia
Computing and Systems, Volume I, 1998.
[Minka96] T.P. Minka & R.W. Picard. Interactive learning using a society of models. IEEE
Computer Society Conference on Computer Vision and Pattern Recognition, 447-452, 1996.
[Müller00] Henning Müller, Wolfgang Müller, Stéphane Marchand-Maillet and Thierry Pun,
“Strategies for Positive and Negative Relevance Feedback in Image Retrieval,” Proceedings of
the International Conference on Pattern Recognition (ICPR’00).
[Ortega97] M. Ortega, Y. Rui, K. Chakrabarti, S. Mehrotra & T.S. Huang. Supporting similarity
queries in MARS. ACM Multimedia ’97, 403-413, 1997.
[Paek 99] S. Paek, C.L. Sable, V. Hatzivassiloglou, A. Jaimes, B.H. Schiffman, S.F. Chang
and K.R. McKeown, “Integration of Visual and Text-Based Approaches for the Content Labeling
and Classification of Photographs,” SIGIR’99.
[Pass96] G. Pass, R. Zabih & J. Miller. Comparing images using color coherence vectors. ACM
Multimedia ’96, 65-73, 1996.
[Patrice00] Blancho Patrice and Hubert Konik, “Texture Similarity Queries and Relevance
Feedback for Image Retrieval,” Proceedings of the International Conference on Pattern
Recognition (ICPR’00).
[Porkaew 99a] K. Porkaew, K. Chakrabarti & S. Mehrotra, “Query Refinement for Multimedia
Similarity Retrieval in MARS,” Proceedings of the ACM International Multimedia Conference,
Orlando, Florida, pp 235-238, 1999.
[Porkaew 99b] K. Porkaew, S. Mehrotra, M. Ortega and K. Chakrabarti, “Similarity Search
Using Multiple Examples in MARS,” in Intl Conference on Visual Information Retrieval, 1999.
[Rocchio66] Rocchio, Jr., J. J., “Document Retrieval Systems - Optimization and Evaluation,”
Ph.D. Thesis, Harvard University, March 1966.
[Rocchio71] Rocchio, Jr., J. J., “Relevance Feedback in Information Retrieval,” Gerard Salton,
Editor, The SMART Retrieval System, Prentice-Hall, Inc., Englewood Cliffs, New Jersey, 1971.
[Rui97] Rui, Y.; Huang, T.S.; Mehrotra, S., “Content-based image retrieval with relevance
feedback in MARS,” Proceedings of the 1997 International Conference on Image Processing
(ICIP ’97) (3-Volume Set).
[Rui97a] Yong Rui, Thomas S. Huang, Sharad Mehrotra, and Michael Ortega, “A Relevance
Feedback Architecture for Content-based Multimedia Information Retrieval Systems,”
Proceedings of the 1997 Workshop on Content-Based Access of Image and Video Libraries
(CBAIVL ’97).
[Rui98] Yong Rui, Thomas S. Huang, Michael Ortega, and Sharad Mehrotra, “Relevance
Feedback: A Power Tool in Interactive Content-Based Image Retrieval,” IEEE Trans. on Circuits
and Systems for Video Technology, Special Issue on Segmentation, Description, and Retrieval of
Video Content, Vol. 8, No. 5, pp. 644-655, Sept. 1998.
[Rui98a] Yong Rui, Thomas S. Huang, and Sharad Mehrotra. Relevance feedback techniques in
interactive content-based image retrieval. In Proc. of IS&T SPIE Storage and Retrieval of
Images/Video Databases VI, EI'98.
[Rui99] Rui, Y., Huang, T. S. “A Novel Relevance Feedback Technique in Image Retrieval,”
ACM Multimedia, 1999.
[Salton83] Salton, G., and McGill, M. J. “Introduction to Modern Information Retrieval,”
McGraw-Hill Book Company, 1983.
[Salton89] G. Salton. Automatic text processing. Addison-Wesley Publishing Company, 1989.
[Seidl 97] T. Seidl and H. P. Kriegel, “Efficient User-adaptable Similarity Search in Large
Multimedia Databases,” in Proceedings of VLDB, pp 506-515, Athens, Greece, August 1997.
[Shaw95] Shaw, W. M. “Term-Relevance Computation and Perfect Retrieval Performance” IPM
31, 1995, 491-498
[Salton90] G. Salton and C. Buckley. Improving retrieval performance by relevance feedback.
Journal of the American Society for Information Science, 41(4):288-297, 1990.
[Squire99] D. M. Squire, W. Müller, H. Müller, and J. Raki. “Content-based query of image
databases, inspirations from text retrieval: inverted files, frequency-based weights and relevance
feedback.” 11th Scandinavian Conference on Image Analysis (SCIA’99), pages 143–149,
Kangerlussuaq, Greenland, June 7–11 1999.
[Vasconcelos00] Nuno Vasconcelos and Andrew Lippman, “Bayesian Relevance Feedback for
Content-Based Image Retrieval,” Proceedings of the IEEE Workshop on Content-based Access
of Image and Video Libraries (CBAIVL’00).
[Wood98] M. E. Wood, N. W. Campbell, and B. T. Thomas. Iterative refinement by relevance
feedback in content-based digital image retrieval. In Proceedings of The Fifth ACM
International Multimedia Conference (ACM Multimedia 98), pages 13--20, Bristol, UK,
September 1998.
[Wu00] Ying Wu, Qi Tian and Thomas S. Huang, “Integrating Unlabeled Images for Image
Retrieval Based on Relevance Feedback,” Proceedings of the International Conference on
Pattern Recognition (ICPR’00).
[Zhou00] Xiang Sean Zhou and Thomas S. Huang, “Image Retrieval: Feature Primitives, Feature
Representation, and Relevance Feedback,” Proceedings of the IEEE Workshop on Content-based
Access of Image and Video Libraries (CBAIVL’00).