Technical Portion of PhD Research

MC Leong Open University Malaysia 2010


Page 1: Technical Portion of PhD Research

MC Leong
Open University Malaysia

2010

Page 2: Technical Portion of PhD Research

Purpose

• This document provides the technical scope of the researcher’s work. Its intention is to bring the reader quickly to the key aspects of the work.

• Without explaining what wavelets are, or the extensive algorithm and literature background, this document briefly covers:
– An overview of the work.
– The research results.
– A critical analysis of why certain areas of research failed to produce good results, and the lessons learned.
– The researcher’s original contributions to the research.
– Conclusions.

Page 3: Technical Portion of PhD Research

Overview

The research work is on invariant or robust image querying of fine art paintings, using the wavelet transform to obtain unique wavelet coefficients that are used for indexing purposes. In this context, query images are matched to a single, static painting (i.e., the target image).

The wavelet coefficient matching method used in this work was found to give a limited but adequate level of invariance to image brightness, contrast, blur, noise (graininess), rotation, translation, and scale. These properties are useful when querying images of painting artworks, which is central to the research work.

Art museums and electronic databases need to query painting images and match them with information about the painter, the year produced, and other textual information.

Keeping images of every painting in a database together with their associated textual information demands a very large storage space. In the case of online querying, the bandwidth required is also very large, making the approach impractical for limited computing resources.

Page 4: Technical Portion of PhD Research

Overview

This research examines ways to make image querying effective using the least computing resources while achieving fast querying and target image retrieval times, ranging from 2 to 15 seconds for an image database of 1774 very high resolution painting images.

“Aged” or old painting artworks tend to show color fading, known as color shift, which appears as brownish hues, while cracks in the canvas show up as lines, or noise in imaging science terms. Further, the query image may have poor resolution (blur), poor color registration (varying hues), poor or overly bright lighting (affecting brightness and contrast), or be translated (shifted) or slightly rotated when it is scanned or camera-captured into electronic form.

With all these image distortions in mind, the research aims to deliver accurate hits with quick recognition and retrieval times.

In the further development of this work, described in a later section of this document, the wavelet coefficients are manipulated and the matching method is adapted to enable partial image querying, whereby a small square section of a query image is sufficient to pinpoint a target image.

Page 5: Technical Portion of PhD Research

Operation

The original painting (target) images are kept in an image database, and are first fed into the Discrete Wavelet Transform (DWT).

The result of the transform is a set of wavelet coefficients at a particular scale of decomposition, kept in a separate database known as the wavelet coefficient database.

The wavelet coefficients serve as a unique signature, representative of one particular image only.

The original image and its text information need to be associated with its wavelet coefficients, which are arranged in a 16-by-16 matrix.

Next, a query image is obtained and passed through the wavelet transform at the same scale of decomposition, and the resulting coefficients are matched against entries in the wavelet coefficient database.

A query image may suffer distortions such as color shift, poor resolution, scale, dithering effects, noise, disorientation, displacement, and misregistration.

If a hit exists, the target image and its textual information, which may be kept in a database separate from the signature database, are accessed on demand, thus reducing bandwidth.

Page 6: Technical Portion of PhD Research

Wavelet Decomposition

At every wavelet decomposition scale k, the side length of the wavelet coefficient matrix is reduced by a factor of 2^k relative to the original image.

An image of 256-by-256 pixels is decomposed into matrices of 128-by-128 (k=1, 1st scale), 64-by-64 (k=2, 2nd scale), 32-by-32 (k=3, 3rd scale), 16-by-16 (k=4, 4th scale), and so on.

Higher k values give a smaller set of wavelet coefficients at the expense of the wavelet resolution available for image querying.
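The halving at each scale can be checked with a few lines of Python. This is only an arithmetic sketch: the function name `coefficient_side` is hypothetical, and it assumes the side length halves exactly at every scale, as in the figures quoted above. With longer filters such as Daubechies-8, exact halving generally depends on how the transform treats image borders.

```python
def coefficient_side(image_side: int, k: int) -> int:
    """Side length of the coefficient matrix after k dyadic decompositions.

    Assumes exact halving at every scale (an assumption of this sketch).
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    side = image_side
    for _ in range(k):
        if side % 2:
            raise ValueError("side length must be even at every scale")
        side //= 2
    return side


# Side lengths for a 256-by-256 image at scales k = 1..6.
sides = [coefficient_side(256, k) for k in range(1, 7)]
```

For a 256-by-256 image, k=4 gives the 16-by-16 index matrix used for target images, while k=5 and k=6 give the 8-by-8 and 4-by-4 matrices used later for partial image querying.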

Page 7: Technical Portion of PhD Research

Structure Of Content-Based Image Querying & Retrieval

Page 8: Technical Portion of PhD Research

Experiment Setup For Painting-Based Image Database

The experiments are set up as follows:

Color artwork images, converted on the fly to grey scale images of 256-by-256 pixels, are used as specimens.

A sample of 1700+ target color images is used.

The Daubechies-8 QMF mother wavelet is used because of its favorable intrinsic smoothing property on images.

Each image is passed through a wavelet decomposition at scale k=4 to obtain a small low-pass wavelet coefficient matrix of 16-by-16, which translates into a storage size of merely 3 KB.
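The shape of this pipeline can be sketched in plain Python. Note the substitution: the sketch does not implement the Daubechies-8 transform. It uses repeated 2-by-2 block averaging (a Haar-style low-pass step) purely as a stand-in to show how a 256-by-256 grey image reduces to a 16-by-16 signature at k=4. The grey-scale weights are also an assumption, since the source does not say how the conversion is done, and the names `to_grey`, `halve`, and `lowpass_signature` are hypothetical.

```python
def to_grey(rgb_pixel):
    """Convert one (r, g, b) pixel to a grey level.

    The BT.601 luma weights are an assumption of this sketch.
    """
    r, g, b = rgb_pixel
    return 0.299 * r + 0.587 * g + 0.114 * b


def halve(matrix):
    """One low-pass step: average each 2x2 block (Haar-style stand-in)."""
    n = len(matrix) // 2
    return [
        [
            (matrix[2 * r][2 * c] + matrix[2 * r][2 * c + 1]
             + matrix[2 * r + 1][2 * c] + matrix[2 * r + 1][2 * c + 1]) / 4.0
            for c in range(n)
        ]
        for r in range(n)
    ]


def lowpass_signature(grey_image, k=4):
    """Apply k low-pass steps to a square grey image (list of rows)."""
    out = grey_image
    for _ in range(k):
        out = halve(out)
    return out


# A synthetic 256-by-256 grey image standing in for a painting.
image = [[float((r + c) % 256) for c in range(256)] for r in range(256)]
signature = lowpass_signature(image, k=4)  # 16 rows of 16 coefficients
```

A real implementation would replace `halve` with the low-pass branch of a Daubechies wavelet transform from a wavelet library.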

Experiment 1

An attempt is made to understand the extent to which variations and distortions in query images influence the match percentages against the target images.

The query images are deliberately blurred, scaled, translated, rotated, and have graininess or noise added.

Changes are also made to brightness and contrast at varying degrees.

The next slide shows the kinds of image distortion, the distortion levels, and the Wavelet Hit Percentage (WHP) numbers.

The WHP number is the percentage of query image wavelet coefficients matching the original image’s wavelet coefficients.
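A WHP-style figure could be computed along the following lines. The source does not state the exact criterion for two coefficients "matching", so this sketch assumes a coefficient counts as a hit when its squared difference from the target coefficient is within a tolerance. Both the `tolerance` parameter and the function name are hypothetical.

```python
def wavelet_hit_percentage(query, target, tolerance):
    """Percentage of query coefficients that match the target coefficients.

    A coefficient is a hit when its squared difference from the
    corresponding target coefficient is <= tolerance (an assumption).
    """
    q = [v for row in query for v in row]
    t = [v for row in target for v in row]
    if not q or len(q) != len(t):
        raise ValueError("matrices must be non-empty and the same size")
    hits = sum(1 for a, b in zip(q, t) if (a - b) ** 2 <= tolerance)
    return 100.0 * hits / len(q)


target = [[1.0, 2.0], [3.0, 4.0]]
query = [[1.0, 2.0], [3.0, 10.0]]  # one coefficient badly distorted
whp = wavelet_hit_percentage(query, target, tolerance=0.25)
```

In this toy case three of the four coefficients fall within the tolerance, so the WHP is 75%.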

Page 9: Technical Portion of PhD Research

Query Image Invariance Tests

Page 10: Technical Portion of PhD Research

Query Image Invariance Tests

Page 11: Technical Portion of PhD Research

Query Image Invariance Tests

Page 12: Technical Portion of PhD Research

Query Image Distortion Tests

Experiment 2 (Database Size: 1774 Images)

The next experiment observes the effect that query images sourced from the Internet and other image sources have on the retrieval accuracy.

Image moments and intensity histograms could not be used to supplement the image database using the wavelet method, primarily because of very probable query image distortions in color (fading and color shift due to artwork aging), translation, and scale, and because images are sometimes doctored.

American Gothic painting: various query images are tested, and those with a matching percentage of at least 90% retrieve the correct original image.

Page 13: Technical Portion of PhD Research
Page 14: Technical Portion of PhD Research

Query Image Distortion Tests

Observations from Experiment 1 and Experiment 2 show an inherent invariance of the wavelet method to seven critical image distortions: brightness, contrast, blur, noise, rotation, translation, and scale.

The level of invariance from this work is adequate for the paintings domain.

The original images are retrieved by the matching algorithm in about 2 to 15 seconds on a conventional Pentium 4 personal computer with 512 MB of RAM running the Windows XP operating system.

Page 15: Technical Portion of PhD Research

Additional Research Work

• The research has considered the following aspects and techniques in an effort to enhance the quality of the existing work and the results produced thus far:
1. Other distance measures instead of Mean-Square Error (MSE), e.g. Euclidean distance.
2. Image moments and/or image standard deviations.
3. Thresholding color or grey images into binary images, so that moments and/or distance measures may be used.
4. Neural Network, Genetic Algorithm, and Self-Organizing Map techniques to learn patterns, generalize, classify, or cluster image information.

• However, after much experimenting, none of the above four efforts improved the quality of the research work. The reasons are explained next.

Page 16: Technical Portion of PhD Research

Reasons For Failed Efforts

1. Other distance measures instead of Mean-Square Error (MSE), e.g. Euclidean distance.

The distinction between query and target images lies in their inherent image attributes (blur, brightness, contrast, noise), not their spatial locations. Therefore, distance measures based on spatial locations are not applicable here.

2. Image moments and/or image standard deviations.

We can observe from the previous results that query images can be very different from the target image because of poor resolution (blur), noise or graininess, differing brightness and contrast levels, hues, color fading, and perhaps “doctoring” (as seen in the Mona Lisa and American Gothic query images). Therefore, it is nearly impossible to use image moments or their derivatives to provide indexes or clusters for the image database.

3. Thresholding color or grey images to become binary images, so that moments and/or distance measures may be used.

The reasons are the same as in the two explanations above. A consistent binary image cannot be obtained when persistent and unpredictable distortions are present in “aged” paintings.

4. Neural Network, Genetic Algorithm, and Self-Organizing Map to learn patterns, generalize, classify or cluster image information.

The number of query images resembling a target image can be infinite, unpredictable, and inconsistent. Therefore, there is no pattern or form of relationship between the query and target images that these techniques can use to generalize or classify.

Page 17: Technical Portion of PhD Research

Research Enhancements

• The contributions by the researcher result in:
– Reduced minimum querying times, from 10 seconds to under 2 seconds, by developing and implementing a heuristic hunting algorithm for querying a database containing 1,774 painting images.
– Reduced wavelet coefficient sizes, from 9 KB to 3 KB, by concentrating only on the image’s low-pass coefficients. This reduces the storage space for indexes by a factor of 3.
– Partial image (sliding block-based) querying, whereby an incomplete or partial image of a painting, scanned or captured using a camera, can be used to retrieve its full target image and associated textual information from the image database.

Page 18: Technical Portion of PhD Research

Flowchart Structure Of Content-Based Image Querying & Retrieval

[Flowchart. Box labels:]
Load target image database file count
Read query image
Resize query image to 256px by 256px
Is partial query image used? (yes / no)
Apply sliding 9-block and finer 16-block searches
Perform DWT on query image @k=4
Extract query image low pass wavelet coefficients
Load target low pass wavelet indexes
Find coefficient mean-square error between query and target images
Tally the number of hits ≤ mse(n), and find hit percentage ≥90%
Is there more than one hit, or zero hit? (yes / no)
Apply hunting algorithm to determine the next mse (n+1)
Retrieve and display target image
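The threshold loop in the flowchart ("more than one hit, or zero hit" leading to a new mse(n+1)) can be illustrated with a small Python sketch. This is not the researcher's hunting algorithm, whose details are not given in this document. It is one plausible reading, assuming each target image has already been reduced to a single MSE value against the query, and that the threshold is loosened on zero hits and tightened on multiple hits until exactly one target remains. The function name, the starting threshold, and the doubling and bisection steps are all assumptions, and the sketch leaves out the 90% hit-percentage check.

```python
def hunt_single_hit(errors, start=1.0, max_steps=50):
    """Adjust an MSE threshold until exactly one target falls under it.

    errors: one MSE value per target image (query vs. that target).
    Returns (target_index, threshold), or None if no single hit is found.
    """
    if start <= 0:
        raise ValueError("start threshold must be positive")
    low, high = 0.0, None          # bracket around the wanted threshold
    threshold = start
    for _ in range(max_steps):
        hits = [i for i, e in enumerate(errors) if e <= threshold]
        if len(hits) == 1:
            return hits[0], threshold
        if not hits:
            # Zero hits: threshold too strict, so loosen it.
            low = threshold
            threshold = threshold * 2 if high is None else (low + high) / 2
        else:
            # Several hits: threshold too loose, so tighten it.
            high = threshold
            threshold = (low + high) / 2
    return None                    # e.g. two targets with identical error


result = hunt_single_hit([5.0, 4.0, 3.0], start=1.0)
```

This sketch still compares the threshold against every target on each step, so it shows only the threshold adjustment, not the speed-up over exhaustive search reported by the research.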

Page 19: Technical Portion of PhD Research

Partial Image Querying: Sliding 9-Block

The partial query image (sliding 9-block) is wavelet transformed @k=5, whilst each target image in the database still retains its @k=4 wavelet coefficients as image database indexes.

Once the partial query image has its lowpass wavelet coefficients (8x8 matrix), the lowpass coefficients have to be normalised.
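The source does not say which normalisation is used. One likely motivation is that coefficients taken at different decomposition scales generally differ in magnitude, so they need a common range before they can be compared. As a hedged illustration only, the sketch below rescales a coefficient matrix to the range 0 to 1 (min-max normalisation). The choice of method and the name `normalise` are assumptions, not the researcher's stated procedure.

```python
def normalise(matrix):
    """Rescale a coefficient matrix to the range [0, 1] (min-max).

    A constant matrix maps to all zeros to avoid dividing by zero.
    """
    flat = [v for row in matrix for v in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:
        return [[0.0 for _ in row] for row in matrix]
    return [[(v - lo) / span for v in row] for row in matrix]


# The same pattern at two different magnitudes normalises identically.
small = normalise([[2.0, 4.0], [6.0, 10.0]])
large = normalise([[4.0, 8.0], [12.0, 20.0]])
```

With this choice, a uniform rescaling or offset of the coefficients no longer affects the comparison.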

Page 20: Technical Portion of PhD Research

Partial Image Querying: Sliding 9-Block

A partial query image’s lowpass wavelet coefficients will be matched against 9 possible blocks (sliding windows) derived from every target image’s lowpass wavelet coefficients. If a single best hit is found, the said target image will be retrieved and displayed.
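The block search can be sketched as follows. The step between windows is not stated in the source. A step of 4 coefficients is assumed here because it yields exactly 9 windows for an 8x8 query over a 16x16 target index (and 16 windows for a 4x4 query, matching the finer search). The function names are hypothetical, and the sketch assumes the coefficients have already been normalised.

```python
def sliding_blocks(matrix, block, stride):
    """Yield ((top, left), window) for each block x block window."""
    n = len(matrix)
    for top in range(0, n - block + 1, stride):
        for left in range(0, n - block + 1, stride):
            window = [row[left:left + block] for row in matrix[top:top + block]]
            yield (top, left), window


def mse(a, b):
    """Mean-square error between two equally sized matrices."""
    fa = [v for row in a for v in row]
    fb = [v for row in b for v in row]
    return sum((x - y) ** 2 for x, y in zip(fa, fb)) / len(fa)


def best_block(query, target, stride=4):
    """Return (lowest_mse, (top, left)) over all windows of the target."""
    block = len(query)
    return min((mse(query, win), pos)
               for pos, win in sliding_blocks(target, block, stride))


# Toy 16x16 target index; the query is an exact 8x8 cut-out of it.
target = [[float(r * 16 + c) for c in range(16)] for r in range(16)]
query = [row[8:16] for row in target[4:12]]
error, position = best_block(query, target)
```

Repeating `best_block` over every target index and keeping the lowest error gives the single best hit described above.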

[Figure: Partial Query Image’s Lowpass Wavelet Coefficients]

Page 21: Technical Portion of PhD Research

Partial Image Querying: Sliding 16-Block

The partial query image (sliding 16-block) is wavelet transformed @k=6, whilst each target image in the database still retains its @k=4 wavelet coefficients as image database indexes.

Once the partial query image has its lowpass wavelet coefficients (4x4 matrix), the lowpass coefficients have to be normalised.

Page 22: Technical Portion of PhD Research

Partial Image Querying: Sliding 16-Block

[Figure: Partial Query Image’s Lowpass Wavelet Coefficients]

A partial query image’s lowpass wavelet coefficients will be matched against 16 possible blocks (a finer set of sliding windows) derived from every target image’s lowpass wavelet coefficients. If a single best hit is found, the said target image will be retrieved and displayed.

Page 23: Technical Portion of PhD Research

Partial Image Querying Results

Page 24: Technical Portion of PhD Research

Partial Image Querying Results

The uniqueness of using the wavelet method for partial image query is demonstrated in the results.

The wavelet method made an accurate query hit, retrieving the correct target image in spite of artifacts, predominantly color shifts or color fading.

Even re-colorized query images have little effect on the retrieval of the correct target image, as substantiated by an accurate retrieval when a partial image was used instead of a full query image.

Page 25: Technical Portion of PhD Research

Conclusion

Moderate to high wavelet match percentages are recorded for varying contrast, brightness, blur, scale, graininess, translation, and rotation of query images.

Extending the research to include partial image querying required the researcher to design and develop original sliding block-based search algorithms, thus allowing someone to query the image database by image-capturing a part of the painting instead of the complete painting.

The research makes use of a specialized hunting algorithm developed by the researcher, which speeds up the matching process by more than 75% compared with an exhaustive search. This has helped to reduce the querying times tremendously, to mere seconds for a database of over a thousand images.

A faster CPU is essential to speed up the wavelet transform calculations, and querying times can be reduced further through memory and hard disk caching.

The research has shown that indexing using wavelet coefficients produces remarkably accurate matches despite abnormalities in the query image, which are inherent in any image that is scanned or retrieved from the Internet.