Content Based Image and Text Retrieval

8/8/2019 Content Based Image and Text Retrieval

Content-based image retrieval

Content-based image retrieval (CBIR), also known as query by image content

(QBIC) and content-based visual information retrieval (CBVIR), is the application

of computer vision techniques to the image retrieval problem, that is, the problem

of searching for digital images in large databases.

"Content-based" means that the search will analyze the actual contents of the

image rather than the metadata such as keywords, tags, and/or descriptions

associated with the image. The term 'content' in this context might refer to

colors, shapes, textures, or any other information that can be derived from

the image itself. CBIR is desirable because most web-based image search

engines rely purely on metadata, which produces many irrelevant results.

Also, having humans manually enter keywords for images in a large database can

be inefficient and expensive, and may not capture every keyword that describes the

image. Thus a system that can filter images based on their content would provide

better indexing and return more accurate results.

Previous method of image retrieval and its disadvantages (retrieval based on

metadata)

The previous method searches on metadata (i.e. data associated with the

images, such as tags). Humans manually enter keywords for images in a large

database, and searches run against those keywords. This approach works, but it

has some disadvantages:

Not the best way of indexing

Inefficient

Expensive

May not capture every keyword that describes the image

Requires humans to personally describe every image in the database. This

is impractical for very large databases, or for images that are generated

automatically, e.g. from surveillance cameras.

CBIR techniques

1) Query techniques

Query by example

Query by example is a query technique that involves providing the CBIR system


with an example image that it will then base its search upon. The underlying

search algorithms may vary depending on the application, but result images

should all share common elements with the provided example.
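As a minimal sketch of query by example, assume each image has already been reduced to a numeric feature vector. The image names, feature values and the query_by_example helper below are hypothetical and chosen only for illustration; they are not taken from any particular CBIR system.

```python
import math

# Hypothetical database: image name -> feature vector
# (for instance, coarse colour statistics computed beforehand).
DATABASE = {
    "sunset.jpg": [0.9, 0.4, 0.1],
    "forest.jpg": [0.1, 0.8, 0.2],
    "beach.jpg": [0.8, 0.6, 0.3],
}


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def query_by_example(example_features, database, k=2):
    """Return the names of the k database images closest to the example."""
    ranked = sorted(
        database,
        key=lambda name: euclidean(example_features, database[name]),
    )
    return ranked[:k]


# The example image's features (also hypothetical) drive the whole search.
print(query_by_example([0.88, 0.45, 0.1], DATABASE))
```

The search algorithm here is a plain nearest-neighbour ranking; a real application could substitute any other feature extraction or matching strategy.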

Other query methods

Other query methods include browsing for example images, navigating

customized/hierarchical categories, querying by image region (rather than the

entire image), querying by multiple example images, and querying by visual sketch.

2) Content comparison using image distance measures

The most common method for comparing two images in content-based image

retrieval (typically an example image and an image from the database) is using

an image distance measure. An image distance measure compares the similarity

of two images in various dimensions such as color, texture, shape, and others.

For example, a distance of 0 signifies an exact match with the query, with respect

to the dimensions that were considered. A value greater than 0 indicates a lower

degree of similarity: the larger the distance, the less alike the two images are.

Search results can then be sorted by their distance to the query image.
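One possible distance measure along the color dimension can be sketched as follows. This is an illustrative assumption, not the only option: images are represented as non-empty lists of (r, g, b) pixels, each image is summarized by a normalized color histogram, and the distance is the sum of absolute differences between histogram bins. The function names and the toy images are hypothetical.

```python
def color_histogram(pixels, bins=4):
    """Normalized joint RGB histogram of a non-empty list of (r, g, b) pixels (0-255)."""
    step = 256 / bins
    hist = [0.0] * (bins ** 3)
    for r, g, b in pixels:
        # Map each channel to a bin index, clamped to the last bin.
        ri, gi, bi = (min(int(v / step), bins - 1) for v in (r, g, b))
        hist[(ri * bins + gi) * bins + bi] += 1.0
    return [count / len(pixels) for count in hist]


def l1_distance(h1, h2):
    """Sum of absolute bin differences: 0 means identical histograms."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


# Toy images (hypothetical data): four pixels each.
red_image = [(250, 10, 10)] * 4
mostly_red = [(250, 10, 10)] * 3 + [(10, 10, 250)]
blue_image = [(10, 10, 250)] * 4

query = color_histogram(red_image)
database = {"red": red_image, "mostly_red": mostly_red, "blue": blue_image}

# Sort the database by distance to the query: smallest distance first.
results = sorted(
    (l1_distance(query, color_histogram(px)), name)
    for name, px in database.items()
)
for distance, name in results:
    print(f"{name}: {distance:.2f}")
```

The identical image comes first with distance 0, and the image sharing no colors with the query comes last. Note that a distance of 0 only means the histograms match; two different images with the same color distribution would also score 0, which is what "with respect to the dimensions that were considered" implies.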

CBIR Applications

Art collections

Photograph archives

Medical diagnosis

Crime prevention

The military

Geographical information and remote sensing systems

Content Based Text Retrieval

Content-based retrieval of text is retrieval that uses the text of the document

rather than any added metadata. Free text searching is a good example of

content-based text retrieval. The words making up the content of the document

are indexed and used as the basis for retrieval, sometimes in conjunction with

quite sophisticated intelligent software used to satisfy the query. Search

engines like Google and AltaVista offer content-based text retrieval on the Web.
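Indexing the words of each document is commonly done with an inverted index. The following is a minimal sketch under simple assumptions (lower-cased alphanumeric tokens, a query matching only documents that contain every query word); the sample documents and function names are hypothetical, and production search engines are far more elaborate.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lower-case the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each word to the set of document ids whose text contains it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical document collection: only the text itself is indexed.
documents = {
    "doc1": "Content-based image retrieval uses computer vision.",
    "doc2": "Free text searching indexes the words of each document.",
    "doc3": "Image search engines often rely on metadata.",
}
index = build_index(documents)
print(sorted(search(index, "image retrieval")))
```

No keywords or tags were assigned by hand here: the index is derived entirely from the document text.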

Content-based navigation for text

On the Web, navigation is mainly based on


fixed links that are embedded in the documents themselves. However, it is

possible for hypermedia navigation to be content-based. By this we mean that

the links offered are determined at link following time and selected on the basis

of the content of the chosen source anchor. Link authoring for content-based

navigation involves making an association between some chosen source anchor

and the address of a destination document. The link information may be stored in

a separate location from the document, typically a linkbase holding source

anchors and destination addresses. With this content-based approach to

navigation, multiple links may be made available for a given source anchor,

previously authored links may be added to new documents on the fly with

minimal effort and different viewers may see different link sets depending on the

linkbases which are active at the time.
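The linkbase idea can be sketched with plain dictionaries. In this illustration each linkbase maps source anchor text to a list of destination addresses, and links are resolved only when the reader selects some text. The linkbase contents, the addresses and the resolve_links helper are all hypothetical.

```python
def resolve_links(selected_text, active_linkbases):
    """Return every destination whose source anchor matches the selected text.

    Links are looked up at link-following time rather than being
    embedded in the document itself.
    """
    anchor = selected_text.strip().lower()
    destinations = []
    for linkbase in active_linkbases:
        destinations.extend(linkbase.get(anchor, []))
    return destinations


# Hypothetical linkbases stored separately from any document.
glossary_linkbase = {
    "image retrieval": ["glossary.html#image-retrieval"],
    "metadata": ["glossary.html#metadata"],
}
research_linkbase = {
    "image retrieval": ["papers/cbir-survey.html"],
}

# Two viewers with different active linkbases see different link sets
# for the same selected text.
print(resolve_links("Image retrieval", [glossary_linkbase]))
print(resolve_links("Image retrieval", [glossary_linkbase, research_linkbase]))
```

Because the lookup depends only on the selected text, the same authored links apply to any new document containing that text, without editing the document.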

In both content-based retrieval and content-based navigation for text, the process

depends on matching content. In the case of retrieval, the textual content of the

query is matched with text forming the content of the document, typically indexed

in some way to accelerate the retrieval process. In content-based navigation, the

query (which is typically a portion of text selected from the content of the

document) is matched with the text making up the source anchors of links in the

linkbase.

For text, these processes of content-based retrieval and navigation are

sufficiently well established and widely used for us to conclude with some

conviction that content-based retrieval and navigation are worthwhile and

effective approaches for text information handling. Of course, metadata-based

searches with text are also widely used, and the two approaches can complement

each other well. The content matching on which text content-based processes

depend is in many cases a straightforward exact match between words,

although statistical matches between word sets, term switching or query

expansion via thesauri, word stemming and other textual tricks can greatly

enhance the processes to provide more powerful retrieval and navigation

facilities.
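To illustrate how stemming and thesaurus-based query expansion go beyond exact word matching, here is a small sketch. The thesaurus entries are made up, and the suffix-stripping stem function is a deliberately crude stand-in for a real stemming algorithm; both are assumptions for illustration only.

```python
# Toy thesaurus (hypothetical entries), keyed by stemmed term.
THESAURUS = {
    "picture": {"image", "photo"},
    "search": {"retrieval"},
}


def stem(word):
    """Very crude suffix stripping; real systems use proper stemmers."""
    for suffix in ("ing", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def expand_query(query):
    """Return the stemmed query terms plus their stemmed thesaurus synonyms."""
    terms = set()
    for word in query.lower().split():
        base = stem(word)
        terms.add(base)
        terms.update(stem(synonym) for synonym in THESAURUS.get(base, ()))
    return terms


def matches(document, query):
    """True if any expanded query term equals a stemmed document word."""
    doc_terms = {stem(word) for word in document.lower().split()}
    return bool(expand_query(query) & doc_terms)


print(sorted(expand_query("searching pictures")))
print(matches("a collection of images", "pictures"))
```

An exact word match would miss the second case, since "pictures" never appears in the document; the expanded query finds it through the synonym "image".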

Now let us turn our attention to content-based retrieval and navigation with

non-text media. We will use images as our example, although many of the comments

will apply equally to other non-text media. Can we say with the same conviction

as we did for text that content-based image retrieval and navigation are

worthwhile and effective approaches for image information handling? Well, in


short, the answer is no, certainly not with the same conviction. But there are

circumstances where content-based retrieval and content-based navigation may

be worthwhile, particularly in conjunction with metadata-based techniques. And in

the longer term, as research into media processing offers up more powerful

approaches, the value of content-based techniques should increase.
