Sree Project Document


    Chapter 1

    Introduction

Computers are becoming more powerful day by day, and as a result the use of digital images is increasing rapidly. With this growth comes the serious problem of storing and transferring the huge volume of data that represents these images, because uncompressed multimedia (graphics, audio and video) data requires considerable storage capacity and transmission bandwidth. Despite rapid progress in mass storage density, processor speed and the performance of digital communication systems, the demand for data storage capacity and data transmission bandwidth continues to exceed the capabilities of available technologies. In addition, the recent growth of data-intensive, multimedia-based web applications has put much pressure on researchers to find ways of using images in web applications more effectively.

    Internet teleconferencing, High Definition Television (HDTV), satellite communications

    and digital storage of movies are not feasible without a high degree of compression. As it is, such

    applications are far from realizing their full potential largely due to the limitations of common

    image compression techniques.

An image typically contains redundant data, i.e., much of its information is repeated from a certain point of view. By using data compression techniques, it is possible to remove some of the redundant information contained in images. Image compression minimizes the size in

    bytes of a graphics file without degrading the quality of the image to an unacceptable level. The

    reduction in file size allows more images to be stored in a certain amount of disk or memory

    space. It also reduces the time necessary for images to be sent over the Internet or downloaded

    from web pages.

Wavelets are functions that allow the analysis of signals or images according to scale or resolution. The processing of signals by wavelet algorithms in fact works much the same way the human eye does, or the way a digital camera processes visual scales of resolution and intermediate details. The same principle also applies to cell phone signals and even digitized colour images.

  • 7/28/2019 Sree Project Document

    2/48

    2

    Wavelets are of real use in these areas, for example in approximating data with sharp

discontinuities such as choppy signals, or pictures with many edges. While wavelet theory is perhaps a chapter of function theory, we show that the resulting algorithms are key to the processing of

    numbers, or more precisely of digitized information, signals, time series, still-images, movies,

    colour images, etc.


The Haar Transform is memory efficient, exactly reversible without edge effects, fast and simple; as such, the Haar Transform technique is widely used these days in wavelet analysis. The Fast Haar Transform (FHT) is a fast implementation of the Haar Transform that reduces the tedious work of calculation: it involves only additions, subtractions and divisions by 2. Its applications include atmospheric turbulence analysis, image analysis, and signal and image compression.

The Modified Fast Haar Wavelet Transform (MFHWT) was originally proposed as a one-dimensional approach in which the FHT is used to find the N/2 detail coefficients at each level for a signal of length N. This project uses the same concept of finding averages and differences, but extends the approach to 2D images, with the detail coefficients taken as 0 for N/2 elements at each level. The Haar Transform and Fast Haar Transform are explained, and the Modified Fast Haar Wavelet Transform is presented with the proposed algorithm for 2D images.


    Chapter 2

    Introduction to Digital Image processing

    2.1 Introduction:

A digital image is a collection of pixels laid out in a specific order, with a width (x) and height (y) measured in pixels. Each pixel has a numerical value which corresponds to a colour or gray-scale value. A pixel has no absolute size, and pixels may (though not always) have a spatial value; spatial data is data associated with the pixels that provides information about the size of the objects in the image.

    Fig: 2.1 Representation of digital image in x and y pixel format.

    An image may be defined as a two-dimensional function, f(x, y), where x and y are

    spatial coordinates, and the amplitude of f at any pair of coordinates (x, y) is called the intensity

    or gray level of the image at that point. When x, y, and the amplitude values of f are all finite,

    discrete quantities, we call the image a digital image. The field of digital image processing refers

    to processing digital images by means of a digital computer. Note that a digital image is

    composed of a finite number of elements, each of which has a particular location and value.

    These elements are referred to as picture elements, image elements and pixels. Pixel is the term

    most widely used to denote the elements of a digital image.

    Vision is the most advanced of our senses, so it is not surprising that images play the

    single most important role in human perception. However, unlike humans, who are limited to the

    visual band of the electromagnetic (EM) spectrum, imaging machines cover almost the entire

EM spectrum, ranging from gamma rays to radio waves. They can also operate on images generated

    by sources that humans are not accustomed to associating with images. These include ultrasound,

    electron microscopy, and computer-generated images. Thus, digital image processing


encompasses a wide and varied field of applications. There is no general agreement about where image processing stops and other related areas, such as image analysis and computer vision, start. Sometimes a distinction is made by

    defining image processing as a discipline in which both the input and output of a process are

    images. We believe this to be a limiting and somewhat artificial boundary. For example, under

    this definition, even the trivial task of computing the average intensity of an image would not be

    considered an image processing operation. On the other hand, there are fields such as computer

    vision whose ultimate goal is to use computers to emulate human vision, including learning and

    being able to make inferences and take actions based on visual inputs. This area itself is a branch

    of artificial intelligence (AI), whose objective is to emulate human intelligence. The field of AI

    is in its earliest stages of infancy in terms of development, with progress having been much

    slower than originally anticipated. The area of image analysis (also called image understanding)

    is in between image processing and computer vision. There are no clear-cut boundaries in the

    continuum from image processing at one end to computer vision at the other. However, one

    useful paradigm is to consider three types of computerized processes in this continuum: low-,

    mid-,and high-level processes. Low-level processes involve primitive operations such as image

    pre-processing to reduce noise, contrast enhancement, and image sharpening. A low-level

    process is characterized by the fact that both its inputs and outputs are images. Mid-level

    processes on images involve tasks such as segmentation (partitioning an image into regions or

    objects), description of those objects to reduce them to a form suitable for computer processing,

    and classification (recognition) of individual objects. A mid-level process is characterized by the

    fact that its inputs generally are images, but its outputs are at- tributes extracted from those

    images (e.g., edges, contours, and the identity of individual objects). Finally, higher-level

    processing involves making sense of an ensemble of recognized objects, as in image analysis,

    and, at the far end of the continuum, performing the cognitive functions normally associated with

    human vision.

Based on the preceding comments, we see that a logical place of overlap between image processing and image analysis is the area of recognition of individual regions or objects in an

    image. Thus, what we call in this book digital image processing encompasses processes whose

    inputs and outputs are images and, in addition, encompasses processes that extract attributes

    from images, up to and including the recognition of individual objects. As a simple illustration to


    clarify these concepts, consider the area of automated analysis of text. The processes of acquiring

    an image of the area containing the text, pre-processing that image, extracting (segmenting) the

    individual characters, describing the characters in a form suitable for computer processing, and

    recognizing those individual characters, are in the scope of what we call digital image processing

    in this book. Making sense of the content of the page may be viewed as being in the domain of

    image analysis and even computer vision, depending on the level of complexity implied by the

    statement making sense. Digital image processing, as we have defined it, is used successfully

    in a broad range of areas of exceptional social and economic value.

    2.2 Digital Image Characteristics:

Pixel - An abbreviation of the term 'picture element.' A pixel is the smallest picture element of a digital image. A monochrome pixel can have two values, black or white (0 or 1). Color and gray scale require more bits; true color, displaying approximately 16.7 million colors, requires 24 bits for each pixel. A pixel may hold more data than the eye can perceive at one time.

Dot - The smallest unit that a printer can print.

Voxel - An abbreviation of the term 'volume element.' The smallest distinguishable box-shaped part of a three-dimensional space. A particular voxel is identified by the x, y and z coordinates of one of its eight corners, or perhaps its centre. The term is used in three-dimensional modeling. Voxels need not have uniform dimensions in all three coordinate planes.

    To the human observer, the internal structures and functions of the human body are not

    generally visible. However, by various technologies, images can be created through which the

    medical professional can look into the body to diagnose abnormal conditions and guide

    therapeutic procedures. The medical image is a window to the body. No image window reveals

    everything. Different medical imaging methods reveal different characteristics of the human

body. The medical imaging process involves five major components: the patient, the imaging system, the system operator, the image itself, and the observer. The objective is to make an object or condition within the patient's body visible to the observer. The

    visibility of specific anatomical features depends on the characteristics of the imaging system

    and the manner in which it is operated. Most medical imaging systems have a considerable

    number of variables that must be selected by the operator. They can be changeable system

    components, such as intensifying screens in radiography, transducers in sonography, or coils in


    magnetic resonance imaging (MRI). However, most variables are adjustable physical quantities

associated with the imaging process, such as kilovoltage in radiography, gain in sonography,

    and echo time (TE) in MRI. The values selected will determine the quality of the image and the

    visibility of specific body features.

    2.2.1 Image Quality:

    The quality of a medical image is determined by the imaging method, the characteristics

    of the equipment, and the imaging variables selected by the operator. Image quality is not a

    single factor but is a composite of at least five factors: contrast, blur, noise, artefacts, and

    distortion, as shown above. The human body contains many structures and objects that are

    simultaneously imaged by most imaging methods. We often consider a single object in relation

    to its immediate background. In fact, with most imaging procedures the visibility of an object is

    determined by this relationship rather than by the overall characteristics of the total image.

    The task of every imaging system is to translate a specific tissue characteristic into image shades

    of gray or colour. If contrast is adequate, the object will be visible. The degree of contrast in the

    image depends on characteristics of both the object and the imaging system.

    2.2.2 Image Contrast:

    Contrast means difference. In an image, contrast can be in the form of different shades of

    gray, light intensities, or colors. Contrast is the most fundamental characteristic of an image. An

    object within the body will be visible in an image only if it has sufficient physical contrast

    relative to surrounding tissue. However, image contrast much beyond that required for good

    object visibility generally serves no useful purpose and in many cases is undesirable. The

    physical contrast of an object must represent a difference in one or more tissue characteristics.

    For example, in radiography, objects can be imaged relative to their surrounding tissue if there is

    an adequate difference in either density or atomic number and if the object is sufficiently thick.

    When a value is assigned to contrast, it refers to the difference between two specific points or

    areas in an image. In most cases we are interested in the contrast between a specific structure or

    object in the image and the area around it or its background.


    2.2.3 Contrast Sensitivity:

    The degree of physical object contrast required for an object to be visible in an image

    depends on the imaging method and the characteristics of the imaging system. The primary

    characteristic of an imaging system that establishes the relationship between image contrast and

    object contrast is its contrast sensitivity. Consider the situation shown below. The circular

    objects are the same size but are filled with different concentrations of iodine contrast medium.

    That is, they have different levels of object contrast. When the imaging system has a relatively

    low contrast sensitivity, only objects with a high concentration of iodine (ie, high object contrast)

    will be visible in the image. If the imaging system has a high contrast sensitivity, the lower-

    contrast objects will also be visible.

It should be emphasized that contrast sensitivity is a characteristic of the imaging method and the

    variables of the particular imaging system. It is the characteristic that relates to the system's

    ability to translate physical object contrast into image contrast. The contrast transfer

    characteristic of an imaging system can be considered from two perspectives. From the

    perspective of adequate image contrast for object visibility, an increase in system contrast

    sensitivity causes lower-contrast objects to become visible. However, if we consider an object

    with a fixed degree of physical contrast (i.e., a fixed concentration of contrast medium), then

    increasing contrast sensitivity will increase image contrast.

    It is difficult to compare the contrast sensitivity of various imaging methods because

    many are based on different tissue characteristics. However, certain methods do have higher

    contrast sensitivity than others. For example, computed tomography (CT) generally has a higher

    contrast sensitivity than conventional radiography. This is demonstrated by the ability of CT to

    image soft tissue objects (masses) that cannot be imaged with radiography. Consider the image

    below. Here is a series of objects with different degrees of physical contrast. They could be

    vessels filled with different concentrations of contrast medium. The highest concentration (and

    contrast) is at the bottom. Now imagine a curtain coming down from the top and covering some

    of the objects so that they are no longer visible. Contrast sensitivity is the characteristic of the


    imaging system that raises and lowers the curtain. Increasing sensitivity raises the curtain and

    allows us to see more objects in the body. A system with low contrast sensitivity allows us to

    visualize only objects with relatively high inherent physical contrast.

    2.2.4 Blur and Visibility of Detail:

    Structures and objects in the body vary not only in physical contrast but also in size.

    Objects range from large organs and bones to small structural features such as trabecula patterns

    and small calcifications. It is the small anatomical features that add detail to a medical image.

    Each imaging method has a limit as to the smallest object that can be imaged and thus on

    visibility of detail. Visibility of detail is limited because all imaging methods introduce blurring

    into the process. The primary effect of image blur is to reduce the contrast and visibility of small

    objects or detail. Consider the image below, which represents the various objects in the body in

    terms of both physical contrast and size. As we said, the boundary between visible and invisible

    objects is determined by the contrast sensitivity of the imaging system. We now extend the idea

    of our curtain to include the effect of blur. It has little effect on the visibility of large objects but

    it reduces the contrast and visibility of small objects. When blur is present, and it always is, our

    curtain of invisibility covers small objects and image detail.

    2.2.5 Noise:

    Another characteristic of all medical images is image noise. Image noise, sometimes

    referred to as image mottle, gives an image a textured or grainy appearance. The source and

    amount of image noise depend on the imaging method and are discussed in more detail in a later

    chapter. We now briefly consider the effect of image noise on visibility. In the image below we

    find our familiar array of body objects arranged according to physical contrast and size. We now

    add a third factor, noise, which will affect the boundary between visible and invisible objects.

    The general effect of increasing image noise is to lower the curtain and reduce object visibility.

    In most medical imaging situations the effect of noise is most significant on the low-contrast

    objects that are already close to the visibility threshold.


    2.2.6 Object Contrast:

    The ability to see or detect an object is heavily influenced by the contrast between the

    object and its background. For most viewing tasks there is not a specific threshold contrast at

    which the object suddenly becomes visible. Instead, the accuracy of seeing or detecting a specific

    object increases with contrast. The contrast sensitivity of the human viewer changes with

    viewing conditions. When viewer contrast sensitivity is low, an object must have a relatively

    high contrast to be visible. The degree of contrast required depends on conditions that alter the

    contrast sensitivity of the observer: background brightness, object size, viewing distance, glare,

    and background structure.

    2.2.7 Background Brightness:

    The human eye can function over a large range of light levels or brightness, but vision is

    not equally sensitive at all brightness levels. The ability to detect objects generally increases with

    increasing background brightness or image illumination. To be detected in areas of low

    brightness, an object must be large and have a relatively high level of contrast with respect to its

background. This can be demonstrated with the image above. View this image with

    different levels of illumination. You will notice that under low illumination you cannot see all of

    the small and low-contrast objects. A higher level of object contrast is required for visibility.

    2.3 File Formats:

A file format defines the components of the digital image (x and y dimensions, the values of the pixels, colour/gray scale, compression, the manner in which the pixels are laid out, etc.). Standard file formats enable the exchange of digital image information. Many file formats exist, for example:

    JPEG - Joint Photographic Experts Group.

    TIFF - Tagged Image File Format.

PNG - Portable Network Graphics.

    2.4 Digital Image Representation:


An image is defined as a two-dimensional function, i.e. a matrix, f(x, y), where x and y are

    spatial coordinates, and the amplitude of f at any pair of coordinates (x, y) is called the intensity

    or gray level of the image at the point. Color images are formed by combining the individual

two-dimensional images. For example, in the RGB color system, a color image consists of three individual component images, namely red, green and blue. Thus many of the techniques

    developed for monochrome images can be extended to color images by processing the three

    component images individually. When x, y and the amplitude values of f are all finite, discrete

    quantities, the image is called a digital image. The field of digital image processing refers to

    processing digital images by means of a digital computer. A digital image is composed of a finite

    number of elements, each of which has a particular location and value. These elements are

    referred to as picture elements, image elements, pels and pixels. Since pixel is the most widely

    used term, the elements will be denoted as pixels from now on.

    An image may be continuous with respect to the x- and y-coordinates, and also in

amplitude. Digitizing the coordinates as well as the amplitude converts such an image to digital form. Here, the digitization of the coordinate values is called

    sampling; digitizing the amplitude values is called quantization. A digital image is composed of

    a finite number of elements, each of which has a particular location and value. The field of

    digital image processing refers to processing digital images by means of a digital computer.

2.4.1 Coordinate Convention:

Assume that an image f(x, y) is sampled so that the resulting image has M rows and N columns. Then the image is of size M × N. The values of the coordinates (x, y) are discrete

    quantities. Integer values are used for these discrete coordinates. In many image processing

    books, the image origin is set to be at (x, y) = (0, 0). The next coordinate values along the first

    row of the image are (x, y) = (0, 1). Note that the notation (0, 1) is used to signify the second

    sample along the first row. These are not necessarily the actual values of physical coordinates

when the image was sampled. Note that x ranges from 0 to M-1, and y from 0 to N-1, where x

    and y are integers. However, in the Wavelet Toolbox the notation (r, c) is used where r indicates

    rows and c indicates the columns. It could be noted that the order of coordinates is the same as

the order discussed previously. Now, the major difference is that the origin of the coordinate


    system is at (r, c) = (1, 1); hence r ranges from 1 to M, and c from 1 to N for r and c integers. The

    coordinates are referred to as pixel coordinates.

    2.4.2 Images as Matrices:

    The coordinate system discussed in preceding section leads to the following representation for

    the digitized image function:

f(x, y) = [ f(0, 0)      f(0, 1)      ...  f(0, N-1)
            f(1, 0)      f(1, 1)      ...  f(1, N-1)
            ...          ...               ...
            f(M-1, 0)    f(M-1, 1)    ...  f(M-1, N-1) ]        (2.1)

    The right side of the equation is a representation of digital image. Each element of this

    array (matrix) is called the pixel.

    Now, in MATLAB, the digital image is represented as the following matrix:

f = [ f(1, 1)    f(1, 2)    ...  f(1, N)
      f(2, 1)    f(2, 2)    ...  f(2, N)
      ...        ...             ...
      f(M, 1)    f(M, 2)    ...  f(M, N) ]                      (2.2)

where M is the number of rows and N is the number of columns. Matrices in MATLAB

    are stored in variables with names such as A, a, RGB, real array and so on.
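
As a concrete illustration of the matrix representation and the (r, c) coordinate convention described above, the following MATLAB snippet uses a small synthetic matrix in place of a real image (the pixel values are arbitrary):

    % A 4-by-4 gray-scale "image" represented directly as a matrix.
    f = [ 10  20  30  40
          50  60  70  80
          90 100 110 120
         130 140 150 160 ];

    [M, N] = size(f);      % M rows, N columns
    p = f(1, 1);           % pixel at the origin (r, c) = (1, 1)
    q = f(M, N);           % pixel at the bottom-right corner (r, c) = (M, N)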

    2.4.3 Color Image Representation:

An RGB color image is an M × N × 3 array or matrix of color pixels, where each color

    pixel consists of a triplet corresponding to the red, green, and blue components of an RGB image

    at a specific spatial location. An RGB image may be viewed as a stack of three gray-scale

    images, that when fed into the red, green, and blue inputs of a color monitor, produce a color

    image on the screen. So from the stack of three images forming that RGB color image, each

    image is referred to as the red, green, and blue component images by convention. Now, the data

    class of the component images determine their range of values. If an RGB color image is of

    class double, meaning that all the pixel values are of type double, the range of values is [0, 1].

    Likewise, the range of values is [0, 255] or [0, 65535] for RGB images of class uint8 or uint16,


    respectively. The number of bits used to represent the pixel values of the component images

    determines the bit depth of an RGB color image.

    The RGB color space is shown graphically as an RGB color cube. The vertices of the

cube are the primary (red, green, and blue) and secondary (cyan, magenta, and yellow) colors of

    light.

    2.4.4 Indexed Images:

    An indexed image has two components: a data matrix of integers, X, and a colormap

matrix, map. Matrix map is an m × 3 array of class double containing floating-point values in the

    range [0, 1]. The length, m, of the map is equal to the number of colors it defines. Each row of

    map specifies the red, green, and blue components of a single color. An indexed image uses

direct mapping of pixel intensity values to colormap values. The color of each pixel is determined by using the corresponding value of the integer matrix X as a pointer into map. If X is of class double, then all of its components with value 1 point to the first row of map, all components with value 2 point to the second row, and so on. If X is of class uint8 or uint16, then all components with value 0 point to the first row in map, all components with value 1 point to the second row, and so on.
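
For illustration, a tiny indexed image can be built and displayed in MATLAB as follows (the 2 x 2 index matrix and three-color map are made up for this sketch and are not taken from the project):

    % A small indexed image: X holds pointers into the rows of map.
    X   = [1 2             % class double: value 1 -> first row of map,
           3 1];           %               value 2 -> second row, and so on
    map = [1 0 0           % row 1: red
           0 1 0           % row 2: green
           0 0 1];         % row 3: blue

    image(X); colormap(map);   % display using the direct index-to-color mapping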

    2.4.5 The Basics of Color Image Processing:

Color image processing techniques deal with how color images are handled for a

    variety of image-processing tasks. For the purposes of the following discussion we subdivide

    color image processing into three principal areas: (1) color transformations (also called color

    mappings); (2) spatial processing of individual color planes; and (3) color vector processing. The

    first category deals with processing the pixels of each color plane based strictly on their values

    and not on their spatial coordinates. This category is analogous to the intensity transformations.

    The second category deals with spatial (neighbor-hood) filtering for individual color planes and

is analogous to spatial filtering. The third category deals with techniques based on processing all

    components of a color image simultaneously. Since full-color images have at least three

components, color pixels are indeed vectors. For example, in the RGB color system, a color point can be interpreted as a vector extending from the origin to that point in the

    RGB coordinate system.

    Let c represent an arbitrary vector in RGB color space:


c = [ c_R ]   [ R ]
    [ c_G ] = [ G ]                                             (2.3)
    [ c_B ]   [ B ]

This equation indicates that the components of c are simply the RGB components of a color image at a point. Since the color components are a function of the coordinates (x, y), they can be written using the notation

c(x, y) = [ c_R(x, y) ]   [ R(x, y) ]
          [ c_G(x, y) ] = [ G(x, y) ]                           (2.4)
          [ c_B(x, y) ]   [ B(x, y) ]

For an image of size M × N, there are MN such vectors, c(x, y), for x = 0, 1, ..., M-1 and y = 0, 1, ..., N-1. In order for independent color component and vector-based processing to be

    equivalent, two conditions have to be satisfied: (i) the process has to be applicable to both

    vectors and scalars. (ii) the operation on each component of a vector must be independent of the

other components. Neighborhood averaging is an example that satisfies both conditions: the averaging can be done by summing the gray levels of all the pixels in the neighborhood of each component image and dividing by the number of pixels, or by summing all the color vectors in the neighborhood and dividing by the number of vectors. Each component of the resulting average vector is the average of the pixels in the image corresponding to that component, which is the same result that would be obtained if the averaging were done on the neighborhood of each component image individually and the color vector were then formed.
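
This equivalence can be checked quickly in MATLAB; the sketch below uses a random 3 x 3 RGB neighborhood rather than a real image:

    % Per-component versus vector neighborhood averaging.
    rgb = rand(3, 3, 3);                               % a 3x3 RGB neighborhood

    % (a) average each component image over the neighborhood individually
    avg_per_channel = squeeze(mean(mean(rgb, 1), 2));  % 3x1 vector [R; G; B]

    % (b) average the nine color vectors directly
    vecs        = reshape(rgb, [], 3);                 % 9x3, one color vector per row
    avg_vectors = mean(vecs, 1).';                     % 3x1 vector [R; G; B]

    max(abs(avg_per_channel - avg_vectors))            % ~0, up to round-off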

    2.4.6 Reading Images:

    In MATLAB, images are read into the MATLAB environment using function called

imread. The syntax is as follows: imread('filename'). Here, filename is a string containing the complete name of the image file, including any applicable extension. For example, the command line >> f = imread('x.jpg'); reads the JPEG image into the image array (image matrix) f. Since there

    are three color components in the image, namely red, green and blue components, the image is

    broken down into the three distinct color matrices fR, fG and fB.
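
A minimal sketch of this step is given below; the file name 'x.jpg' is simply a placeholder for whatever image is being read:

    % Read a color image and split it into its three component matrices.
    f  = imread('x.jpg');    % M-by-N-by-3 array for an RGB image

    fR = f(:, :, 1);         % red component image
    fG = f(:, :, 2);         % green component image
    fB = f(:, :, 3);         % blue component image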

    2.5 Standard method of image compression:

    In 1992, JPEG established the first international standard for still image compression

    where the encoders and decoders are DCT-based. The JPEG standard specifies three modes


    namely sequential, progressive, and hierarchical for lossy encoding, and one mode of lossless

    encoding. The performance of the coders for JPEG usually degrades at low bit-rates mainly

because of the underlying block-based Discrete Cosine Transform (DCT). The baseline JPEG coder [5] is the sequential encoding in its simplest form. Fig. 2.2 and Fig. 2.3 show the key processing steps in such an encoder and decoder, respectively, for grayscale images. Color image

    compression can be approximately regarded as compression of multiple grayscale images, which

    are either compressed entirely one at a time, or are compressed by alternately interleaving 8x8

    sample blocks from each in turn.

    The DCT-based encoder can be thought of as essentially compression of a stream of 8x8

    blocks of image samples. Each 8x8 block makes its way through each processing step, and yields

    output in compressed form into the data stream. Because adjacent image pixels are highly

correlated, the Forward DCT (FDCT) processing step lays the basis for gaining data compression

    by concentrating most of the signal in the lower spatial frequencies. For a typical 8x8 sample

    block from a typical source image, most of the spatial frequencies have zero or near-zero

amplitude and need not be encoded.

Fig: 2.2 Encoder Block Diagram (Original Image -> FDCT -> Quantizer -> Entropy Encoder -> Compressed Image Data; uses a Quantization Table (QT) and a Huffman Table).

Fig: 2.3 Decoder Block Diagram (Compressed Image Data -> Entropy Decoder -> Dequantizer -> Inverse DCT -> Reconstructed Image; uses the Quantization Table (QT) and Huffman Table).


    After output from the Forward DCT (FDCT), each of the 64 DCT coefficients is

    uniformly quantized in conjunction with a carefully designed 64-element Quantization Table

    (QT). At the decoder, the quantized values are multiplied by the corresponding QT elements to

    pick up the original unquantized values. After quantization, all the quantized coefficients are

    ordered into zig-zag sequence. This ordering helps to facilitate entropy encoding by placing low

    frequency non-zero coefficients before high-frequency coefficients. The DC coefficient, which

    contains a significant fraction of the total image energy, is differentially encoded.
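
To make the quantization and dequantization steps concrete, here is a minimal MATLAB sketch; the 8 x 8 coefficient block and the quantization table contain illustrative values only and are not the tables specified by the JPEG standard:

    % Uniform quantization of an 8x8 block of DCT coefficients with a
    % quantization table, and the corresponding dequantization at the decoder.
    C  = 50 * randn(8);                                         % stand-in for an 8x8 block of DCT coefficients
    QT = 16 + 8 * ((0:7)' * ones(1, 8) + ones(8, 1) * (0:7));   % coarser steps at higher frequencies

    Cq    = round(C ./ QT);   % encoder: quantize (many entries become zero)
    C_rec = Cq .* QT;         % decoder: dequantize (an approximation of C)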

    Entropy Coding (EC) achieves additional compression losslessly through encoding the

    quantized DCT coefficients more compactly based on their statistical characteristics. The

    JPEG proposal specifies both Huffman coding and arithmetic coding. More recently, the wavelet

    transform has emerged as a cutting edge technology, within the field of image analysis. Wavelets

    are a mathematical tool for hierarchically decomposing functions. Though rooted in

    approximation theory, signal processing, and physics, wavelets have also recently been applied

    to many problems in Computer Graphics including image editing and compression, automatic

level-of-detail control for editing and rendering curves and surfaces, surface reconstruction from

    contours and fast methods for solving simulation problems in 3D modelling, global illumination,

    and animation .

    Wavelet-based coding provides substantial improvements in picture quality at higher

    compression ratios. Over the past few years, a variety of powerful and sophisticated wavelet-

    based schemes for image compression have been developed and implemented. Because of the

    many advantages of wavelet based image compressionas listed below, the top contenders in the

    JPEG-2000 standard are all wavelet-based compression algorithms.

    2.6 Conclusion:

Digital image characteristics, digital image representation, the basics of colour image processing and the standard method of image compression have been discussed in this chapter. Image compression using different techniques is discussed in the next chapter.


    Chapter 3

Image compression using different techniques

    3.1 Introduction:

    Here, some background topics of image compression which include the principles of

    image compression, the classification of compression methods and the framework of a general

    image coder and wavelets for image compression, different types of transforms and quantization

    are going to be discussed.

3.2 Principles of Image Compression:

A common characteristic of most images is that neighboring pixels are correlated and therefore hold redundant information. The foremost task then is to find a less correlated representation of the image. Two elementary components of compression are redundancy reduction and irrelevancy reduction. Redundancy reduction aims at removing duplication from the signal source (image). Irrelevancy reduction omits parts of the signal that will not be noticed by the signal

    receiver, namely the Human Visual System (HVS). In general, three types of redundancy can be

    identified: (a) Spatial Redundancy or correlation between neighboring pixel values, (b) Spectral

    Redundancy or correlation between different color planes or spectral bands and (c) Temporal

    Redundancy or correlation between adjacent frames in a sequence of images especially in video

applications. Image compression research aims at reducing the number of bits needed to represent an image by removing the spatial and spectral redundancies as much as possible.

    3.3. Framework of General Image Compression Method:

A typical lossy image compression system is shown in Fig. 2.4. It consists of three closely

    connected components namely (a) Source Encoder, (b) Quantizer and (c) Entropy Encoder.

    Compression is achieved by applying a linear transform in order to decorrelate the image data,

    quantizing the resulting transform coefficients and entropy coding the quantized values.

Fig: 2.4 A typical lossy encoder (Input Image -> Source Encoder -> Quantizer -> Entropy Encoder -> Compressed Image).


Source Encoder:

A variety of linear transforms has been developed, including the Discrete

    Fourier Transform (DFT), Discrete Cosine Transform (DCT), Discrete Wavelet

    Transform (DWT) and many more, each with its own advantages and disadvantages.

Quantizer:

A quantizer is used to reduce the number of bits needed to store the transformed

    coefficients by reducing the precision of those values. As it is a many-to-one mapping, it

    is a lossy process and is the main source of compression in an encoder. Quantization can

    be performed on each individual coefficient, which is called Scalar Quantization (SQ).

    Quantization can also be applied on a group of coefficients together known as Vector

    Quantization (VQ). Both uniform and non-uniform quantizers can be used depending on

    the problems.

Entropy Encoder:

An entropy encoder further compresses the quantized values losslessly to provide a better overall compression. It uses a model to accurately determine the

    probabilities for each quantized value and produces an appropriate code based on these

    probabilities so that the resultant output code stream is smaller than the input stream. The

    most commonly used entropy encoders are the Huffman encoder and the arithmetic

    encoder, although for applications requiring fast execution, simple Run Length Encoding

    (RLE) is very effective.

    3.4. Image Compression:

    In the last decade, there has been a lot of technological transformation in the way we

    communicate. This transformation includes the ever present, ever growing internet, the explosive

    development in mobile communication and ever increasing importance of video communication.

Data compression is one of the enabling technologies for each aspect of this multimedia revolution. Cellular phones would not be able to provide communication with increasing clarity without data compression. Data compression is the art and science of representing information in compact form.

    Despite rapid progress in mass-storage density, processor speeds, and digital

    communication system performance, demand for data storage capacity and data-transmission


    bandwidth continues to outstrip the capabilities of available technologies. In a distributed

    environment large image files remain a major bottleneck within systems.

    Image Compression is an important component of the solutions available for creating

    image file sizes of manageable and transmittable dimensions. Platform portability and

    performance are important in the selection of the compression/decompression technique to be

    employed.

    Four Stage model of Data Compression:

    Almost all data compression systems can be viewed as comprising four successive

    stages of data processing arranged as a processing pipeline (though some stages will often be

    combined with a neighboring stage, performed "off-line," or otherwise made rudimentary).

    The four stages are

    (A) Preliminary pre-processing steps.

    (B) Organization by context.

    (C) Probability estimation.

    (D) Length-reducing code.

    The ubiquitous compression pipeline (A-B-C-D) is what is of interest.

With (A) we mean various pre-processing steps that may be appropriate before the final compression engine. Lossy compression often follows the same pattern as lossless compression, but with one or

    more quantization steps somewhere in (A). Sometimes clever designers may defer the loss until

    suggested by statistics detected in (C); an example of this would be modern zero tree image

    coding.

    (B) Organization by context often means data reordering, for which a simple but good

    example is JPEG's "Zigzag" ordering. The purpose of this step is to improve the estimates found

    by the next step.

    (C) A probability estimate (or its heuristic equivalent) is formed for each token to be

    encoded. Often the estimation formula will depend on context found by (B) with separate 'bins'

    of state variables maintained for each conditioned class.


    (D) Finally, based on its estimated probability, each compressed file token is represented as

    bits in the compressed file. Ideally, a 12.5%-probable token should be encoded with three bits,

    but details become complicated.
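
As a small illustration of stage (D), and of the 12.5% example above, the ideal code length of a token is the negative base-2 logarithm of its probability (a sketch of the ideal case only, ignoring the practical complications mentioned):

    % Ideal code lengths (in bits) for tokens with given probabilities.
    p   = [0.5 0.25 0.125 0.125];   % token probabilities (sum to 1)
    len = -log2(p);                 % ideal lengths: [1 2 3 3] bits

    % Expected number of bits per token (the entropy of this distribution):
    H = sum(p .* len);              % = 1.75 bits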

    Principle behind Image Compression:

Images have considerably higher storage requirements than text, and audio and video data place even more demanding requirements on data storage. An image stored in an uncompressed file

    format, such as the popular BMP format, can be huge. An image with a pixel resolution of 640

    by 480 pixels and 24-bit colour resolution will take up 640 * 480 * 24/8 = 921,600 bytes in an

    uncompressed format.

    The huge amount of storage space is not only the consideration but also the data

    transmission rates for communication of continuous media are also significantly large. An image,

    1024 pixel x 1024 pixel x 24 bit, without compression, would require 3 MB of storage and 7

    minutes for transmission, utilizing a high speed, 64 Kbits /s, ISDN line.

    Image data compression becomes still more important because of the fact that the transfer

    of uncompressed graphical data requires far more bandwidth and data transfer rate. For example,

    throughput in a multimedia system can be as high as 140 Mbits/s, which must be transferred

between systems. This kind of data transfer rate is not realizable with today's technology, or in the near future, with reasonably priced hardware.

3.5. Fundamentals of Image Compression Techniques:

A digital image, or "bitmap", consists of a grid of dots, or "pixels", with each pixel

    defined by a numeric value that gives its colour. The term data compression refers to the process

    of reducing the amount of data required to represent a given quantity of information. Now, a

    particular piece of information may contain some portion which is not important and can be

comfortably removed. All such data is referred to as redundant data. Data redundancy is a central

    issue in digital image compression. Image compression research aims at reducing the number of

    bits needed to represent an image by removing the spatial and spectral redundancies as much as

    possible.

    A common characteristic of most images is that the neighboring pixels are correlated and

    therefore contain redundant information. The foremost task then is to find less correlated

    representation of the image. In general, three types of redundancy can be identified:


    1. Coding Redundancy

    2. Inter Pixel Redundancy

3. Psychovisual Redundancy

    Coding Redundancy:

If the gray levels of an image are coded in a way that uses more code symbols than absolutely necessary to represent each gray level, the resulting image is said to contain coding redundancy. It is almost always present when an image's gray levels are represented with a straight or natural binary code. Let us assume that a random variable r_k lying in the interval [0, 1] represents the gray levels of an image and that each r_k occurs with probability P_r(r_k):

    P_r(r_k) = n_k / n,   k = 0, 1, 2, ..., L-1                    (3.1)

where L is the number of gray levels, n_k is the number of times the k-th gray level appears in the image, and n is the total number of pixels in the image. If the number of bits used to represent each value of r_k is l(r_k), the average number of bits required to represent each pixel is

    L_avg = Σ_{k=0}^{L-1} l(r_k) P_r(r_k)                          (3.2)

That is, the average length of the code words assigned to the various gray levels is found by summing the products of the number of bits used to represent each gray level and the probability that the gray level occurs. Thus the total number of bits required to code an M × N image is M N L_avg.
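
A small numerical sketch of equations (3.1) and (3.2), with made-up gray-level counts and code lengths purely for illustration:

    % Average code length L_avg for a 4-level image (illustrative numbers).
    n_k = [4000 3000 2000 1000];    % occurrences of each gray level r_k
    p   = n_k / sum(n_k);           % probabilities P_r(r_k), eq. (3.1)

    l_fixed = [2 2 2 2];            % 2-bit natural binary code for 4 levels
    l_var   = [1 2 3 3];            % a variable-length code

    Lavg_fixed = sum(l_fixed .* p); % = 2.0 bits/pixel
    Lavg_var   = sum(l_var   .* p); % = 1.9 bits/pixel, eq. (3.2)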

    Inter Pixel Redundancy:

    The Information of any given pixel can be reasonably predicted from the value of its

    neighboring pixel. The information carried by an individual pixel is relatively small.

    In order to reduce the inter pixel redundancies in an image, the 2-D pixel array normally used

    for viewing and interpretation must be transformed into a more efficient but usually non visual

    format. For example, the differences between adjacent pixels can be used to represent an image.

These types of transformations are referred to as mappings. They are called reversible if the

    original image elements can be reconstructed from the transformed data set.


Psychovisual Redundancy:

    Certain information simply has less relative importance than other information in normal visual

processing. This information is said to be psychovisually redundant; it can be eliminated without

    significantly impairing the quality of image perception.

    In general, an observer searches for distinguishing features such as edges or textual regions

    and mentally combines them in recognizable groupings. The brain then correlates these

    groupings with prior knowledge in order to complete the image interpretation process.

    The elimination of psycho visually redundant data results in loss of quantitative information;

it is commonly referred to as quantization. As this is an irreversible process, i.e., visual information is lost, it results in lossy data compression. An image reconstructed following lossy

    compression contains degradation relative to the original. Often this is because the compression

    scheme completely discards redundant information.

    Image Compression Techniques:

    There are basically two methods of Image Compression:

    3.5.1. Lossless Coding Techniques

    3.5.2. Lossy Coding Techniques

    3.5.1. Lossless Coding Techniques:

    In Lossless Compression schemes, the reconstructed image, after compression, is

numerically identical to the original image. However, lossless compression can achieve only a modest amount of compression. Lossless coding guarantees that the decompressed image is

    absolutely identical to the image before compression. Lossless techniques can also be used for

    the compression of other data types where loss of information is not acceptable. Lossless

    compression algorithms can be used to squeeze down images and then restore them again for

    viewing completely unchanged.

    Lossless Coding Techniques are as follows:

1. Run Length Encoding.

    2. Huffman Encoding.

    3. Entropy Encoding.

    4. Area Encoding.


    3.5.2. Lossy Coding Techniques:

    Lossy techniques cause image quality degradation in each Compression / De-

    compression step. Careful consideration of the Human Visual perception ensures that the

    degradation is often unrecognizable, though this depends on the selected compression ratio. An

image reconstructed following lossy compression contains some degradation relative to the original; in return, lossy schemes are capable of achieving much higher compression. Under normal viewing conditions, no visible loss is perceived (visually lossless).

    Lossy Image Coding Techniques normally have three Components:

    Image Modeling:

    It is aimed at the exploitation of statistical characteristics of the image (i.e. high

    correlation, redundancy). It defines such things as the transformation to be applied to the Image.

    Parameter Quantization:

    The aim of Quantization is to reduce the amount of data used to represent the information

    within the new domain.

    Encoding:

Here a code is generated by associating appropriate code words with the raw output produced by

    the Quantizer. Encoding is usually error free. It optimizes the representation of the information

    and may introduce some error detection codes.

    3.6. Measurement of Image Quality:

    The design of an imaging system should begin with an analysis of the physical characteristics of

    the originals and the means through which the images may be generated. For example, one might

    examine a representative sample of the originals and determine the level of detail that must be

    preserved, the depth of field that must be captured, whether they can be placed on a glass platen

    or require a custom book-edge scanner, whether they can tolerate exposure to high light

intensity, and whether specular reflections must be captured or minimized. A detailed

    examination of some of the originals, perhaps with a magnifier or microscope, may be necessary

    to determine the level of detail within the original that might be meaningful for a researcher or

    scholar. For example, in drawings or paintings it may be important to preserve stippling or other

characteristic techniques.


    3.7. Wavelets for image compression:

    Wavelet transform exploits both the spatial and frequency correlation of data by dilations

    (or contractions) and translations of mother wavelet on the input data. It supports the multi-

    resolution analysis of data i.e. it can be applied to different scales according to the details

    required, which allows progressive transmission and zooming of the image without the need of

extra storage. Another encouraging feature of the wavelet transform is its symmetric nature: both the forward and the inverse transform have the same complexity, enabling fast compression and decompression routines. Its characteristics well suited for image compression include the

ability to take into account Human Visual System (HVS) characteristics, very good energy

    compaction capabilities, robustness under transmission, high compression ratio etc.

    Wavelet transform divides the information of an image into approximation and detail

    sub-signals. The approximation sub-signal shows the general trend of pixel values and other

    three detail sub-signals show the vertical, horizontal and diagonal details or changes in the

images. If these details are very small (below a threshold), they can be set to zero without

    significantly changing the image. The greater the number of zeros the greater the compression

    ratio. If the energy retained (amount of information retained by an image after compression and

    decompression) is 100% then the compression is lossless as the image can be reconstructed

    exactly. This occurs when the threshold value is set to zero, meaning that the details have not

been changed.
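
A minimal MATLAB sketch of this idea, using one level of Haar-style averaging and differencing to split a block into an approximation and three detail sub-signals and then thresholding the details (the block and threshold are illustrative only, not the project's algorithm):

    % One level of 2-D Haar-style decomposition followed by hard thresholding.
    f = double(magic(8));                        % illustrative 8x8 image block

    a = (f(1:2:end, :) + f(2:2:end, :)) / 2;     % averages along rows
    d = (f(1:2:end, :) - f(2:2:end, :)) / 2;     % differences along rows

    LL = (a(:, 1:2:end) + a(:, 2:2:end)) / 2;    % approximation sub-signal
    D1 = (a(:, 1:2:end) - a(:, 2:2:end)) / 2;    % detail sub-signals capturing
    D2 = (d(:, 1:2:end) + d(:, 2:2:end)) / 2;    % changes in the horizontal,
    D3 = (d(:, 1:2:end) - d(:, 2:2:end)) / 2;    % vertical and diagonal directions

    thr = 0.5;                                   % illustrative threshold
    D1(abs(D1) < thr) = 0;                       % small details set to zero:
    D2(abs(D2) < thr) = 0;                       % the more zeros, the higher
    D3(abs(D3) < thr) = 0;                       % the compression ratio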

3.8 Image Compression Methodology:

Overview:

The storage requirements for the video of a typical angiogram procedure are of the order of

    several hundred Mbytes.

* Transmission of this data over a low-bandwidth network results in very high latency.

    * Lossless compression methods can achieve compression ratios of ~2:1.

* We consider lossy techniques operating at much higher compression ratios (~10:1).
* Key issues:

    - High quality reconstruction required.

    - Angiogram data contains considerable high-frequency spatial texture.


    * Proposed method applies a texture-modeling scheme to the high-frequency texture of some

    regions of the image.

    * This allows more bandwidth allocation to important areas of the image.

    3.9 Different types of transforms:

    1. FT (Fourier Transform).

    2. DCT (Discrete Cosine Transform).

3. DWT (Discrete Wavelet Transform).

    3.9.1 Discrete Fourier Transform:

The DTFT representation of a finite-duration sequence x(n) is

    X(e^jω) = Σ_{n=0}^{N-1} x(n) e^{-jωn}                          (3.3)

    x(n) = (1/2π) ∫_{2π} X(e^jω) e^{jωn} dω                        (3.4)

where x(n) is a finite-duration sequence and X(e^jω) is periodic with period 2π. It is convenient to sample X(e^jω) at N uniformly spaced frequencies between 0 and 2π.

Let

    ω_k = 2πk/N,   k = 0, 1, ..., N-1                              (3.5)

Therefore

    X(k) = X(e^jω) |_{ω = 2πk/N}                                   (3.6)

Since X(e^jω) is sampled over one period and there are N samples, X(k) can be expressed as

    X(k) = Σ_{n=0}^{N-1} x(n) e^{-j2πkn/N},   k = 0, 1, ..., N-1   (3.7)

3.9.2 The Discrete Cosine Transform (DCT):

    The discrete cosine transform (DCT) helps separate the image into parts (or spectral

    sub-bands) of differing importance (with respect to the image's visual quality). The DCT is

    similar to the discrete Fourier transform: it transforms a signal or image from the spatial domain

    to the frequency domain.
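
As an illustration, the 8 x 8 DCT basis matrix can be built directly and applied to a block as follows (a minimal sketch that avoids any toolbox dependency; the input block is arbitrary):

    % Build the 8x8 orthonormal DCT-II matrix T and transform a block B.
    N = 8;
    [n, k] = meshgrid(0:N-1, 0:N-1);                  % n: sample index, k: frequency index
    T = sqrt(2/N) * cos(pi * (2*n + 1) .* k / (2*N));
    T(1, :) = T(1, :) / sqrt(2);                      % rescale the k = 0 (DC) row

    B     = magic(8);      % example 8x8 spatial-domain block
    C     = T * B * T';    % 2-D DCT of the block (frequency domain)
    B_rec = T' * C * T;    % inverse transform recovers B up to round-off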


    3.9.3 Discrete Wavelet Transform (DWT):

    The discrete wavelet transform (DWT) refers to wavelet transforms for which the

    wavelets are discretely sampled. A transform which localizes a function both in space and

    scaling and has some desirable properties compared to the Fourier transform. The transform is

    based on a wavelet matrix, which can be computed more quickly than the analogous Fourier

    matrix. Most notably, the discrete wavelet transform is used for signal coding, where the

    properties of the transform are exploited to represent a discrete signal in a more redundant form,

    often as a preconditioning for data compression. The discrete wavelet transform has a huge

    number of applications in Science, Engineering, Mathematics and Computer Science.

    Wavelet compression is a form of data compression well suited for image compression

    (sometimes also video compression and audio compression). The goal is to store image data in as

    little space as possible in a file. A certain loss of quality is accepted (lossy compression).

    Using a wavelet transform, the wavelet compression methods are better at representing

    transients, such as percussion sounds in audio, or high-frequency components in two-

    dimensional images, for example an image of stars on a night sky. This means that the transient

elements of a data signal can be represented by a smaller amount of information than would be the case if

    some other transform, such as the more widespread discrete cosine transform, had been used.

First a wavelet transform is applied. This produces as many coefficients as there are pixels in the image (i.e., there is no compression yet, since it is only a transform). These coefficients can then

    be compressed more easily because the information is statistically concentrated in just a few

    coefficients.

    3.10 Quantization:

Quantization is a key step in image compression. Quantization techniques generally compress by mapping a range of values to a single quantum value. By reducing the number of distinct symbols in a given stream, the stream becomes more compressible. One example is reducing the number of colors required to represent an image; other widely used examples are DCT data quantization in JPEG and DWT data quantization in JPEG 2000.
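
A minimal sketch of uniform scalar quantization (the step size and sample values are illustrative):

    % Uniform scalar quantization: map a range of values to a single quantum value.
    x    = 255 * rand(1, 10);   % sample values in [0, 255]
    step = 16;                  % quantization step size

    q     = round(x / step);    % quantization indices (fewer distinct symbols)
    x_rec = q * step;           % reconstructed values (error at most step/2)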


    3.11 Entropy Encoding:

    An entropy encoding is a coding scheme that assigns codes to symbols so as to match

    code lengths with the probabilities of the symbols. Typically, entropy encoders are used to

compress data by replacing symbols represented by equal-length codes with codes whose lengths are proportional to the negative logarithm of the symbol's probability. Therefore, the most common

    symbols use the shortest codes.

    According to Shannon's source coding theorem, the optimal code length for a symbol is

-log_b P, where b is the number of symbols used to make output codes and P is the probability of

    the input symbol. Three of the most common entropy encoding techniques are Huffman coding,

    range encoding, and arithmetic coding. If the approximate entropy characteristics of a data

    stream are known in advance (especially for signal compression), a simpler static code such as

    unary coding, Elias gamma coding, Fibonacci coding, Golomb coding, or Rice coding may be

    useful.

    There are three main techniques for achieving entropy coding:

    Huffman Coding - one of the simplest variable length coding schemes.

Run-length Coding (RLC) - very useful for binary data containing long runs of ones or

    zeros.

    Arithmetic Coding - a relatively new variable length coding scheme that can combine

    the best features of Huffman and run-length coding, and also adapt to data with non-stationary

statistics. This discussion concentrates on the Huffman and RLC methods for simplicity; a minimal Huffman sketch follows.
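As a rough illustration of Huffman coding, the sketch below (a simplified sketch, not a production encoder) builds only the code lengths, not the codewords, using a heap, so that the most frequent symbols end up with the shortest codes.

    import heapq
    import itertools
    from collections import Counter

    def huffman_code_lengths(symbols):
        """Return a {symbol: code length} map; frequent symbols get short codes."""
        freq = Counter(symbols)
        ids = itertools.count()                  # tie-breaker so dicts are never compared
        heap = [(weight, next(ids), {s: 0}) for s, weight in freq.items()]
        heapq.heapify(heap)
        while len(heap) > 1:
            w1, _, a = heapq.heappop(heap)       # merge the two lightest subtrees
            w2, _, b = heapq.heappop(heap)
            merged = {s: depth + 1 for s, depth in {**a, **b}.items()}
            heapq.heappush(heap, (w1 + w2, next(ids), merged))
        return heap[0][2]

    print(huffman_code_lengths("aaaabbbccd"))    # {'a': 1, 'b': 2, 'c': 3, 'd': 3}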

    3.12 Conclusion:

In this chapter, several topics of image compression have been discussed, including the principles of image compression, the classification of compression methods, the framework of a general image coder, wavelets for image compression, different types of transforms, and quantization. An introduction to wavelet transforms is given in the next chapter.


    Chapter 4

Introduction to Wavelet Transform

4.1. Introduction:

    The fundamental idea behind wavelets is to analyze according to scale. Indeed, some

    researchers in the wavelet field feel that, by using wavelets, one is adopting a whole new

    mindset or perspective in processing data.

    Wavelets are functions that satisfy certain mathematical requirements and are used in

    representing data or other functions. This idea is not new. Approximation using superposition

    of functions has existed since the early 1800's, when Joseph Fourier discovered that he could

    superpose sines and cosines to represent other functions. However, in wavelet analysis, the

    scale that we use to look at data plays a special role. Wavelet algorithms process data at

    different scales or resolutions. If we look at a signal with a large "window," we would notice

    gross features. Similarly, if we look at a signal with a small "window," we would notice small

features. The result in wavelet analysis is to see both the forest and the trees, so to speak.

    This makes wavelets interesting and useful. For many decades, scientists have wanted

    more appropriate functions than the sines and cosines which comprise the bases of Fourier

    analysis, to approximate choppy signals. By their definition, these functions are non-local

    (and stretch out to infinity). They therefore do a very poor job in approximating sharp spikes.

    But with wavelet analysis, we can use approximating functions that are contained neatly in

    finite domains. Wavelets are well-suited for approximating data with sharp discontinuities.

    The wavelet analysis procedure is to adopt a wavelet prototype function, called an

    analyzing wavelet or mother wavelet. Temporal analysis is performed with a contracted,

    high-frequency version of the prototype wavelet, while frequency analysis is performed with

    a dilated, low-frequency version of the same wavelet. Because the original signal or function

    can be represented in terms of a wavelet expansion (using coefficients in a linear combination

    of the wavelet functions), data operations can be performed using just the corresponding

    wavelet coefficients. And if you further choose the best wavelets adapted to your data, or

    truncate the coefficients below a threshold, your data is sparsely represented. This sparse

    coding makes wavelets an excellent tool in the field of data compression.


    Other applied fields that are making use of wavelets include astronomy, acoustics,

    nuclear engineering, sub-band coding, signal and image processing, neurophysiology, music,

    magnetic resonance imaging, speech discrimination, optics, fractals, turbulence, earthquake-

    prediction, radar, human vision, and pure mathematics applications such as solving partial

    differential equations.

    4.2. Basis Functions:

    It is simpler to explain a basis function if we move out of the realm of analog

(functions) and into the realm of digital (vectors). Every two-dimensional vector (x,y) is a

    combination of the vector (1,0) and (0,1). These two vectors are the basis vectors for (x,y).

    Why? Notice that x multiplied by (1,0) is the vector (x,0), and y multiplied by (0,1) is the

vector (0,y). The sum is (x,y).

    The best basis vectors have the valuable extra property that the vectors are

perpendicular, or orthogonal to each other. For the basis (1,0) and (0,1), this criterion is

    satisfied. Now let's go back to the analog world, and see how to relate these concepts to basis

    functions. Instead of the vector (x,y), we have a function f(x). Imagine that f(x) is a musical

    tone, say the note A in a particular octave. We can construct A by adding sines and cosines

    using combinations of amplitudes and frequencies. The sines and cosines are the basis

    functions in this example, and the elements of Fourier synthesis. For the sines and cosines

    chosen, we can set the additional requirement that they be orthogonal. How? By choosing the

appropriate combination of sine and cosine function terms whose inner products add up to

    zero. The particular set of functions that are orthogonal and that construct f(x) are our

    orthogonal basis functions for this problem.

    Scale-Varying Basis Functions:

    A basis function varies in scale by chopping up the same function or data space using

    different scale sizes. For example, imagine we have a signal over the domain from 0 to 1. We

    can divide the signal with two step functions that range from 0 to 1/2 and 1/2 to 1. Then we

    can divide the original signal again using four step functions from 0 to 1/4, 1/4 to 1/2, 1/2 to

3/4, and 3/4 to 1. And so on. Each set of representations encodes the original signal with a

    particular resolution or scale.
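A small sketch of this idea, assuming NumPy: the same sampled signal is approximated with 2 and then 4 step functions, and the finer scale gives the smaller maximum error.

    import numpy as np

    def step_approximation(signal, pieces):
        """Approximate a sampled signal by piecewise-constant averages over
        `pieces` equal sub-intervals (a basis of step functions at one scale)."""
        chunks = np.array_split(signal, pieces)
        return np.concatenate([np.full(len(c), c.mean()) for c in chunks])

    t = np.linspace(0.0, 1.0, 16, endpoint=False)
    signal = np.sin(2 * np.pi * t)
    coarse = step_approximation(signal, 2)       # two step functions: scale 1/2
    fine = step_approximation(signal, 4)         # four step functions: scale 1/4
    print(np.abs(signal - coarse).max() > np.abs(signal - fine).max())   # True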


    4.3. Fourier analysis:

    Fourier Transform:

    The Fourier transform's utility lies in its ability to analyze a signal in the time domain

    for its frequency content. The transform works by first translating a function in the time

    domain into a function in the frequency domain. The signal can then be analyzed for its

    frequency content because the Fourier coefficients of the transformed function represent the

    contribution of each sine and cosine function at each frequency. An inverse Fourier transform

    does just what you'd expect, transform data from the frequency domain into the time domain.

    Discrete Fourier Transform:

    The discrete Fourier transform (DFT) estimates the Fourier transform of a function

    from a finite number of its sampled points. The sampled points are supposed to be typical of

    what the signal looks like at all other times.

    The DFT has symmetry properties almost exactly the same as the continuous Fourier

    transform. In addition, the formula for the inverse discrete Fourier transform is easily

    calculated using the one for the discrete Fourier transform because the two formulas are

    almost identical.

    Windowed Fourier Transform:

If f(t) is a non-periodic signal, the summation of the periodic functions, sine and

    cosine, does not accurately represent the signal. You could artificially extend the signal to

    make it periodic but it would require additional continuity at the endpoints. The windowed

    Fourier transform (WFT) is one solution to the problem of better representing the non

periodic signal. The WFT can be used to give information about signals simultaneously in the time domain and in the frequency domain.

    With the WFT, the input signal f(t) is chopped up into sections, and each section is

analyzed for its frequency content separately. If the signal has sharp transitions, the input data is windowed so that the sections converge to zero at the endpoints. This windowing is

    accomplished via a weight function that places less emphasis near the interval's endpoints

    than in the middle. The effect of the window is to localize the signal in time.


    Fast Fourier Transform:

To approximate a function by samples, and to approximate the Fourier integral by the discrete Fourier transform, requires applying a matrix whose order is the number of sample points n. Since multiplying an n x n matrix by a vector costs on the order of n^2 arithmetic operations, the problem quickly gets worse as the number of sample points increases. However, if the samples are uniformly spaced, then the Fourier matrix can be factored into a product of just a few sparse matrices, and the resulting factors can be applied to a vector in a total of order n log n arithmetic operations. This is the so-called fast Fourier transform or FFT.

    4.4. Similarities between Fourier and Wavelet Transform:

    The fast Fourier transform (FFT) and the discrete wavelet transform (DWT) are both

linear operations that generate a data structure that contains log2 n segments of various lengths, usually filling and transforming it into a different data vector of length 2^n.

    The mathematical properties of the matrices involved in the transforms are similar as

    well. The inverse transform matrix for both the FFT and the DWT is the transpose of the

    original. As a result, both transforms can be viewed as a rotation in function space to a

    different domain. For the FFT, this new domain contains basis functions that are sines and

    cosines. For the wavelet transform, this new domain contains more complicated basis

    functions called wavelets, mother wavelets, or analyzing wavelets.

    Both transforms have another similarity. The basis functions are localized in

    frequency, making mathematical tools such as power spectra (how much power is contained

    in a frequency interval) and scale grams (to be defined later) useful at picking out frequencies

    and calculating power distributions.

    4.5. Dissimilarities between Fourier and Wavelet Transform:

    The most interesting dissimilarity between these two kinds of transforms is that

    individual wavelet functions are localized in space. Fourier sine and cosine functions are not.

    This localization feature, along with wavelets' localization of frequency, makes many

    functions and operators using wavelets "sparse" when transformed into the wavelet domain.

    This sparseness, in turn, results in a number of useful applications such as data compression,

    detecting features in images, and removing noise from time series.


    4.6. Wavelets:

    Compactly supported wavelets are functions defined over a finite interval and having

an average value of zero. The basic idea of the wavelet transform is to represent any arbitrary function f(x) as a superposition of a set of such wavelets or basis functions. These basis functions are obtained from a single prototype wavelet called the mother wavelet \psi(x), by

    dilations or scaling and translations. Wavelet bases are very good at efficiently representing

    functions that are smooth except for a small set of discontinuities.

For each n, k \in Z, define \psi_{n,k}(x) by

\psi_{n,k}(x) = 2^{n/2} \, \psi(2^n x - k).   (4.1)

The aim is to construct a function \psi(x) on R such that \{\psi_{n,k}(x)\}_{n,k \in Z} is an orthonormal basis on R. As mentioned before, \psi(x) is a wavelet and the collection \{\psi_{n,k}(x)\}_{n,k \in Z} is a wavelet orthonormal basis on R; this framework for constructing wavelets involves the concept of a multiresolution analysis or MRA.

A multiresolution analysis is a device for computing the basis coefficients in L^2(R): f = \sum_{n,k} \langle f, \psi_{n,k} \rangle \, \psi_{n,k}. It is defined as follows:

V_n = \{ f(x) \mid f(x) = g(2^n x), \ g(x) \in V_0 \},   (4.2)

where, for f(x) \in V_0,

f(x) = \sum_{n} \langle f, \varphi(\cdot - n) \rangle \, \varphi(x - n).   (4.3)

Then a multiresolution analysis on R is a sequence of subspaces \{V_n\}_{n \in Z} of functions on R satisfying the following properties:

(a) For all n \in Z, V_n \subseteq V_{n+1}.

(b) If f(x) is continuous with compact support on R, then f(x) lies in the closure of span\{V_n\}_{n \in Z}. That is, given \epsilon > 0, there is an n \in Z and a function g(x) \in V_n such that \| f - g \| < \epsilon.

The scaling function \varphi(x) satisfies the two-scale relation

\varphi(x) = \sum_{k} h(k) \, \sqrt{2} \, \varphi(2x - k),   (4.4)

and the wavelet is defined by

\psi(x) = \sum_{k} g(k) \, \sqrt{2} \, \varphi(2x - k).   (4.5)

Then \{\psi_{n,k}(x)\} is a wavelet orthonormal basis on R. The orthogonal projection of an arbitrary function f onto V_n is given by

P_n f = \sum_{k} \langle f, \varphi_{n,k} \rangle \, \varphi_{n,k}.   (4.6)

As k varies, the basis functions \varphi_{n,k} are shifted in steps of 2^{-n}, so P_n f cannot represent any detail on a scale smaller than that. We say that the functions in V_n have the resolution or scale 2^{-n}. Here, P_n f is called an approximation to f at resolution 2^{-n}. For a given function f, an MRA provides a sequence of approximations P_n f of increasing accuracy. The difference between the approximations at resolution 2^{-(n+1)} and 2^{-n} is called the fine detail at resolution 2^{-n}, which is as follows:

Q_n f(x) = P_{n+1} f(x) - P_n f(x),   (4.7)

or

Q_n f = \sum_{k} \langle f, \psi_{n,k} \rangle \, \psi_{n,k}.   (4.8)

Q_n is also an orthogonal projection and its range W_n is orthogonal to V_n, where the following holds:

V_n = \{ f \mid P_n f = f \},   (4.9)

W_n = \{ f \mid Q_n f = f \},   (4.10)

V_{n+1} = V_n \oplus W_n.   (4.11)

There are choices of the numbers h(k) and g(k) such that \{\psi_{n,k}(x)\} is a wavelet orthonormal basis on R. We must show orthonormality and completeness. As for completeness, we have

\bigcap_{n} V_n = \{0\}   (4.12)

and

\overline{\bigcup_{n} V_n} = L^2(R).   (4.13)

Then we have span\{\psi_{n,k} \mid k \in Z\} = W_n, with V_{n+1} = V_n \oplus W_n. Hence \{\psi_{n,k}(x)\} is complete if and only if \bigoplus_{n} W_n = L^2(R) holds, and this is true.

Now, as for the orthonormality within a single scale,

\langle \psi_{n,k}, \psi_{n,l} \rangle = \langle \psi(x - k), \psi(x - l) \rangle = \delta(k - l).   (4.14)

To prove orthonormality between scales, let n, n' \in Z with n < n', and let k, k' \in Z be arbitrary. Since \psi(x) \in V_1, we have \psi_{n,k}(x) \in V_{n+1} \subseteq V_{n'}. Since \langle \psi_{n',k'}, \varphi_{n',l} \rangle = 0 for all k', l \in Z, it follows that \langle \psi_{n',k'}, f \rangle = 0 for all f \in V_{n'}. Indeed, given f(x) \in V_{n'} we know that f(x) = \sum_{l} \langle f, \varphi_{n',l} \rangle \, \varphi_{n',l}(x); hence for f(x) \in V_{n'},

\langle \psi_{n',k'}, f \rangle = \sum_{l} \langle f, \varphi_{n',l} \rangle \, \langle \psi_{n',k'}, \varphi_{n',l} \rangle = 0.   (4.15)

Since n < n', \psi_{n,k} \in V_{n'} also. Hence \langle \psi_{n',k'}, \psi_{n,k} \rangle = 0.

Therefore \{\psi_{n,k}(x)\} is a wavelet orthonormal basis on R.

i) Symmetry:

Symmetric filters are preferred because they are most valuable for minimizing the edge

    effects in the wavelet representation of discrete wavelet transform (DWT) of a function; large

    coefficients resulting from false edges due to periodization can be avoided. Since orthogonal

filters, with the exception of the Haar filter, cannot be symmetric, biorthogonal filters are almost always selected for image compression applications.

ii) Vanishing Moments:

Vanishing moments are defined as follows. From the definition of multiresolution analysis (MRA), any wavelet \psi(x) that comes from an MRA must satisfy

\int \psi(x) \, dx = 0.   (4.16)

This integral is referred to as the zeroth moment of \psi(x), so that if the above equation holds, we say that \psi(x) has its zeroth moment vanishing. The integral \int x^p \psi(x) \, dx is referred to as the p-th moment of \psi(x), and if \int x^p \psi(x) \, dx = 0, we say that \psi(x) has its p-th moment vanishing.

As a matter of fact, it is possible to have a different number of vanishing moments on the analysis filters than on the reconstruction filters. Vanishing moments on the analysis filters are desired because they result in small coefficients in the transform, whereas vanishing moments on the reconstruction filters are desired because they result in fewer blocking artifacts in the compressed image. Thus, having a sufficient number of vanishing moments, which may differ between the two filters, is advantageous.

iii) Size of the filters:

Long analysis filters result in greater computation time for the wavelet or wavelet packet transform. Long reconstruction filters can create unpleasant artifacts in the compressed image for the following reason: the reconstructed image is made up of the superposition of only a few scaled and shifted reconstruction filters, so features of the reconstruction filters, such as oscillations or lack of smoothness, can be clearly noticed in the


    reconstructed image. Smoothness can be guaranteed by requiring a large number of vanishing

    moments in the reconstruction filter.
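The three criteria above (symmetry, vanishing moments, filter length) can be inspected directly for common filter families; the sketch below assumes the PyWavelets package (pywt) is available and simply prints the properties it reports.

    import pywt

    # Compare common filter families on the three design criteria discussed above:
    # orthogonality/symmetry, vanishing moments, and filter length.
    for name in ["haar", "db4", "bior4.4"]:
        w = pywt.Wavelet(name)
        print(name,
              "orthogonal:", w.orthogonal,
              "symmetry:", w.symmetry,
              "vanishing moments (psi):", w.vanishing_moments_psi,
              "filter length:", w.dec_len)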

4.7 List of Wavelet-Related Transforms:

4.7.1. Continuous Wavelet Transform:

    A continuous wavelet transform is used to divide a continuous-time function into

wavelets. Unlike the Fourier transform, the continuous wavelet transform possesses the ability to construct a time-frequency representation of a signal that offers very good time and frequency

    localization.

    4.7.2. Multi resolution analysis:

A multiresolution analysis (MRA) or multiscale approximation (MSA) is the design method behind most of the practically relevant discrete wavelet transforms (DWT) and the justification for the algorithm of the fast wavelet transform (FWT).

    4.7.3. Discrete Wavelet Transform:

    In numerical analysis and functional analysis, a discrete wavelet transform (DWT) is

    any wavelet transform for which the wavelets are discretely sampled. As with other wavelet

transforms, a key advantage it has over Fourier transforms is temporal resolution: it captures both frequency and location information.

    4.7.4. Fast Wavelet Transform:

    The Fast Wavelet Transform is a mathematical algorithm designed to turn a waveform

    or signal in the time domain into a sequence of coefficients based on an orthogonal basis of

    small finite waves, or wavelets. The transform can be easily extended to multidimensional

    signals, such as images, where the time domain is replaced with the space domain.

    4.8. Applications of Wavelet Transforms:

    Wavelets have broad applications in fields such as signal processing and medical

imaging. Due to time and space constraints, only two are discussed here. The two applications most applicable to this work are wavelet image compression and the progressive

    transmission of image files over the internet.


    4.8.1. Wavelet Compression:

    The point of doing the Haar wavelet transform is that areas of the original matrix that

    contain little variation will end up as small or zero elements in the Haar transform matrix. A

    matrix is considered sparse if it has a high proportion of zero entries. Matrices that are sparse

    take much less memory to store. Because we cannot expect the transformed matrices always

to be sparse, we must consider wavelet compression. To perform wavelet compression we first decide on a non-negative threshold value known as \epsilon. We next let any value in the Haar wavelet transformed matrix whose magnitude is less than \epsilon be reset to zero. Our hope is that this will leave us with a relatively sparse matrix. If \epsilon is equal to zero we will not modify any of the elements and therefore we will not lose any information. This is known as lossless compression. Lossy compression occurs when \epsilon is greater than zero. Because some of the

    elements are reset to zero, some of our original data is lost. In the case of lossless

    compression we are able to reverse our operations and get our original image back. With

    lossy compression we are only able to build an approximation of our original image.
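A minimal sketch of this thresholding scheme, assuming NumPy; the one-level 2-D Haar transform and its inverse are written out by hand (haar2d/ihaar2d are illustrative helpers, not a standard API), and eps plays the role of the non-negative threshold described above.

    import numpy as np

    def haar2d(a):
        """One level of the 2-D Haar wavelet transform (rows, then columns)."""
        def step(x):  # pairwise average / difference along the last axis
            s = (x[..., 0::2] + x[..., 1::2]) / np.sqrt(2)
            d = (x[..., 0::2] - x[..., 1::2]) / np.sqrt(2)
            return np.concatenate([s, d], axis=-1)
        return step(step(a).swapaxes(0, 1)).swapaxes(0, 1)

    def ihaar2d(c):
        """Inverse of haar2d (undo columns, then rows)."""
        def istep(x):
            n = x.shape[-1] // 2
            s, d = x[..., :n], x[..., n:]
            out = np.empty_like(x)
            out[..., 0::2] = (s + d) / np.sqrt(2)
            out[..., 1::2] = (s - d) / np.sqrt(2)
            return out
        return istep(istep(c.swapaxes(0, 1)).swapaxes(0, 1))

    img = np.add.outer(np.arange(8.0), np.arange(8.0))        # a smooth 8x8 test block
    eps = 1.5                                                 # the non-negative threshold
    coeffs = haar2d(img)
    compressed = np.where(np.abs(coeffs) < eps, 0.0, coeffs)  # lossy, since eps > 0
    print(np.count_nonzero(compressed), "of", coeffs.size, "coefficients kept")
    print("max reconstruction error:", np.abs(ihaar2d(compressed) - img).max())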

    4.8.2. Progressive Transmission:

Many people frequently download images from the internet. Wavelet transforms speed up this process considerably. When a person clicks on an image to download it, the source computer recalls the wavelet-transformed matrix from memory. It first sends the overall approximation coefficient and the larger detail coefficients, and then the progressively smaller detail coefficients. As your computer receives this information it begins to reconstruct the image in progressively greater detail until the original image is fully reconstructed. This process can be interrupted at any time if the user decides that s/he does not want the image. Otherwise a user would only be able to see an image after the entire image file had been downloaded. Because a compressed image file is significantly smaller, it takes far less time to download.
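A rough sketch of the idea, assuming NumPy: reconstructing from only the largest-magnitude coefficients mimics the preview a user sees partway through a progressive download (progressive_preview is an illustrative helper; the exact coefficient ordering used by a real codec differs).

    import numpy as np

    def progressive_preview(coeffs, fraction, inverse_transform):
        """Reconstruct from only the largest-magnitude `fraction` of coefficients,
        mimicking a partially completed progressive download."""
        flat = coeffs.ravel()
        keep = max(1, int(fraction * flat.size))
        order = np.argsort(np.abs(flat))[::-1]     # largest magnitudes first
        partial = np.zeros_like(flat)
        partial[order[:keep]] = flat[order[:keep]]
        return inverse_transform(partial.reshape(coeffs.shape))

    # Usage (with the illustrative haar2d / ihaar2d helpers from section 4.8.1):
    # preview = progressive_preview(haar2d(img), 0.10, ihaar2d)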

    4.9. Conclusion:

    The basis function, Fourier analysis, similarities, dissimilarities between Fourier and

    wavelet transform, introduction to wavelets, types of wavelet transforms and applications of

wavelet transform are discussed in this chapter. Image compression using the modified fast Haar wavelet transform will be discussed in the following chapters.


    Chapter 5

1. Image Compression Using the SPIHT Algorithm

A. Description of the Algorithm

After wavelet decomposition of the image data, the distribution of the coefficients takes the form of a tree. Based on this feature, a data structure is defined: the spatial orientation tree. The spatial orientation tree structure for a 4-level wavelet decomposition is shown in Figure 5.1. We can see that each coefficient has four children except the red-marked coefficients in the LL subband and the coefficients in the highest subbands (HL1, LH1, HH1).

The following sets of coordinates of coefficients are used to represent the set partitioning method in the SPIHT algorithm. The location of a coefficient is denoted by (i, j), where i and j indicate the row and column indices, respectively.

H: roots of all the spatial orientation trees.

O(i, j): set of offspring of the coefficient (i, j), O(i, j) = {(2i, 2j), (2i, 2j + 1), (2i + 1, 2j), (2i + 1, 2j + 1)}, except when (i, j) is in LL. When (i, j) is in the LL subband, O(i, j) is defined as O(i, j) = {(i, j + w), (i + h, j), (i + h, j + w)}, where w and h are the width and height of the LL subband, respectively.

D(i, j): set of all descendants of the coefficient (i, j).

L(i, j): D(i, j) - O(i, j).

Figure 5.1: Parent-child relationship in SPIHT

A significance function S_n(T), which decides the significance of a set of coordinates T with respect to the threshold 2^n, is defined by:

S_n(T) = 1 if max_{(i,j) \in T} |c_{i,j}| >= 2^n, and S_n(T) = 0 otherwise,

where c_{i,j} is the wavelet coefficient. In this algorithm, three ordered lists are used to

    store the significance information during set partitioning. List of insignificant sets (LIS), list

    of insignificant pixels (LIP), and list of significant pixels (LSP) are those three lists. Note that

the term pixel actually indicates a wavelet coefficient if the set partitioning algorithm is

    applied to a wavelet transformed image.

Algorithm: SPIHT

1) Initialization:

1. Output n = floor(log2(max_{(i,j)} |c_{i,j}|)).
2. Set LSP = empty.
3. Set LIP = {(i, j) in H}.
4. Set LIS = {(i, j) in H with D(i, j) non-empty}, and set each entry in LIS as type A.

2) Sorting Pass:

1. For each (i, j) in LIP do:
(a) Output S_n(i, j).
(b) If S_n(i, j) = 1 then move (i, j) to LSP and output sign(c_{i,j}).
2. For each (i, j) in LIS do:
(a) If (i, j) is of type A then
i. Output S_n(D(i, j)).
ii. If S_n(D(i, j)) = 1 then
A. For each (k, l) in O(i, j): output S_n(k, l). If S_n(k, l) = 1 then append (k, l) to LSP, output sign(c_{k,l}), and set c_{k,l} = c_{k,l} - 2^n sign(c_{k,l}); else append (k, l) to LIP.
B. Move (i, j) to the end of LIS as type B.
(b) If (i, j) is of type B then
i. Output S_n(L(i, j)).
ii. If S_n(L(i, j)) = 1 then
A. Append each (k, l) in O(i, j) to the end of LIS as type A.
B. Remove (i, j) from LIS.

3) Refinement Pass:

1. For each (i, j) in LSP, except those included in the last sorting pass, output the n-th most significant bit of |c_{i,j}|.

4) Quantization Pass:

1. Decrement n by 1.
2. Go to step 2).
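A minimal sketch of two building blocks used above, assuming NumPy: the significance test against the threshold 2^n and the offspring set O(i, j) for coefficients outside the LL subband (function names are illustrative).

    import numpy as np

    def significance(coeffs, coords, n):
        """S_n(T): 1 if any coefficient indexed by T has magnitude >= 2**n, else 0."""
        return int(any(abs(coeffs[i, j]) >= 2 ** n for (i, j) in coords))

    def offspring(i, j, rows, cols):
        """O(i, j) for a coefficient outside the LL subband; coefficients in the
        finest subbands have no children (their would-be children fall outside)."""
        kids = [(2 * i, 2 * j), (2 * i, 2 * j + 1), (2 * i + 1, 2 * j), (2 * i + 1, 2 * j + 1)]
        return [(r, c) for (r, c) in kids if r < rows and c < cols]

    coeffs = np.array([[34.0, -20.0], [9.0, 5.0]])
    n = int(np.floor(np.log2(np.max(np.abs(coeffs)))))   # initial bit plane, here 5
    print(n, significance(coeffs, [(0, 0), (0, 1)], n))  # prints: 5 1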

B. Analysis of the SPIHT Algorithm

Here is a concrete example to analyze the output binary stream of SPIHT encoding. The following uses the 3-level wavelet decomposition coefficients of SPIHT encoding.

Here n = floor(log2(max |c(i,j)|)) = 5, so the initial threshold value is 2^5 = 32. For this threshold, the output binary stream is 11100011100010000001010110000, 29 bits in all. From the SPIHT encoding results, we can see that the output bit stream contains a large number of consecutive "0" bits, and as the quantization gradually deepens, the situation becomes even more severe, so there is a great deal of redundancy if the stream is output directly.
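The redundancy in those long runs of zeros can be exposed with a simple run-length coding of the example bit stream above; this is only an illustration of the observation, not part of the SPIHT coder itself.

    from itertools import groupby

    bitstream = "11100011100010000001010110000"   # the 29-bit example above

    # Run-length coding: each run of identical bits becomes (bit, run length).
    runs = [(bit, sum(1 for _ in group)) for bit, group in groupby(bitstream)]
    print(runs)
    # [('1', 3), ('0', 3), ('1', 3), ('0', 3), ('1', 1), ('0', 6),
    #  ('1', 1), ('0', 1), ('1', 1), ('0', 1), ('1', 2), ('0', 4)]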


2. Image Compression Using the WDR Algorithm

WDR ALGORITHM

    One of the defects of SPIHT is that it only implicitly locates the position of significant

    coefficients. This makes it difficult to perform operations, such as region selection on

compressed data, which depend on the exact position of significant transform values. Region selection, also known as region of interest (ROI), means selecting a portion of a compressed image that requires increased resolution. Such compressed data operations are

    possible with the Wavelet Difference Reduction (WDR) algorithm of Tian and Wells.

    The term difference reduction refers to the way in which WDR encodes the locations

    of significant wavelet transform values. In WDR, the output from the significance pass

    consists of the signs of significant values along with sequences of bits which concisely

    describe the precise locations of significant values.

    The WDR algorithm is a very simple procedure. A wavelet transform is first applied to the

    image, and then the bit-plane based WDR encoding algorithm for the wavelet coefficients is

    carried out. WDR mainly consists of five steps as follows:

1. Initialization: During this step an assignment of a scan order should first be made. For an image with P pixels, a scan order is a one-to-one and onto mapping x_{i,j} = X_k, for k = 1, 2, ..., P, between the wavelet coefficients x_{i,j} and a linear ordering X_k. The scan order is a zigzag through subbands from higher to lower levels. Within subbands, row-based scanning is used in the horizontal subbands, column-based scanning is used in the vertical subbands, and zigzag scanning is used for the diagonal and low-pass subbands. Once the scanning order is made, an initial threshold T_0 is chosen so that all the transform values satisfy |X_m| < T_0 and at least one transform value satisfies |X_m| >= T_0 / 2.

2. Update threshold: Let T_k = T_{k-1} / 2.

  • 7/28/2019 Sree Project Document

    40/48

    40

3. Significance pass: In this part, transform values are deemed significant if they are greater than or equal

    to the threshold value. Then their index values are encoded using the difference reduction

    method of Tian and Wells. The difference reduction method essentially consists of a binary

    encoding of the number of steps to go from the index of the last significant value to the index

    of the current significant value. The output from the significance pass includes the signs of

    significant values along with sequences of bits, generated by difference reduction, which

describe the precise locations of significant values (a minimal sketch of this gap coding appears after these steps).

4. Refinement pass: The refinement pass generates the refined bits via the standard bit-plane quantization procedure, like the refinement process in the SPIHT method. Each refined value is a better approximation of the exact transform value.

5. Repeat steps (2) through (4) until the bit budget is reached.
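A minimal sketch of the gap-coding idea, assuming the convention that each gap is sent in binary with its leading 1 dropped (the exact bitstream format of Tian and Wells' coder differs in detail; the index list below is hypothetical):

    def difference_reduction(significant_indices):
        """Encode gaps between successive significant scan positions: each gap is
        written in binary with its leading 1 dropped (in WDR the coefficient's
        sign bit acts as the separator between consecutive gap codes)."""
        out, last = [], 0
        for idx in significant_indices:
            gap = idx - last
            out.append(bin(gap)[3:])   # bin(gap) looks like '0b1...'; drop '0b' and the leading 1
            last = idx
        return out

    # Hypothetical scan positions of significant values found in one pass:
    print(difference_reduction([2, 3, 7, 12]))   # ['0', '', '00', '01']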

3. Image Compression Using the EZW Algorithm

THEORY OF EZW ALGORITHM

Wavelet transforming image data generates a lot of unimportant data. After quantizing and coding according to certain rules, some of this unimportant data is discarded, and the remaining data can represent the original data approximately. This is the principle of image compression algorithms based on the wavelet transform.

The zero-tree coding method is one of the most popular image compression approaches using wavelet transforms, and Embedded Zero-tree Wavelet (EZW) coding is the representative zero-tree coding method. EZW was invented by Shapiro in 1993 [3]. It is an embedded wavelet image coding algorithm with a high compression rate. It is a progressive coding method and can perform well at image compression from lossy to lossless.


    The main features of EZW include compact multiresolution representation of images

    by discrete wavelet transformation, zero-tree coding of the significant wavelet coefficients

    providing compact binary maps, successive approximation quantization of the wavelet

coefficients, adaptive multilevel arithmetic coding, and the capability of meeting an exact target compression rate.
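A skeleton of the successive approximation idea, assuming NumPy: the threshold starts at the largest power of two not exceeding the largest coefficient magnitude and is halved at each pass (the coefficient values are illustrative only; the full zero-tree classification is omitted).

    import numpy as np

    def successive_approximation_passes(coeffs, num_passes):
        """Skeleton of successive approximation quantization: the threshold is
        halved at each pass and coefficients are split into significant
        (|c| >= T) and insignificant ones. (Zero-tree classification omitted.)"""
        T = 2.0 ** int(np.floor(np.log2(np.max(np.abs(coeffs)))))
        for _ in range(num_passes):
            yield T, int(np.count_nonzero(np.abs(coeffs) >= T))
            T /= 2.0

    coeffs = np.array([63.0, -34.0, 49.0, 10.0, 7.0, 13.0, -12.0, 7.0])
    for T, count in successive_approximation_passes(coeffs, 3):
        print("threshold", T, "->", count, "significant coefficients")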

The basic process flow of the EZW algorithm can be described as follows: apply the wavelet transform to the image and quantize the coefficients. Given a series of threshold values sorted from high to low (each threshold equals 1/2 of the former threshold), for every threshold sort all the coefficients, retain the important coefficients,

    and discard unimpo