Sree Project Document


    Chapter 1

    Introduction

Computers are becoming more powerful day by day, and as a result the use of digital images is increasing rapidly. With this growth comes the serious problem of storing and transferring the huge volume of data that represents these images, because uncompressed multimedia (graphics, audio and video) data requires considerable storage capacity and transmission bandwidth. Despite rapid progress in mass storage density, processor speed and the performance of digital communication systems, the demand for data storage capacity and data transmission bandwidth continues to exceed the capabilities of available technologies. In addition, the recent growth of data-intensive, multimedia-based web applications has put much pressure on researchers to find ways of using images in web applications more effectively.

    Internet teleconferencing, High Definition Television (HDTV), satellite communications

    and digital storage of movies are not feasible without a high degree of compression. As it is, such

    applications are far from realizing their full potential largely due to the limitations of common

    image compression techniques.

An image typically contains redundant data, i.e., much of its information is repeated from a certain point of view. By using data compression techniques, it is possible to remove some of the redundant information contained in images. Image compression minimizes the size in

    bytes of a graphics file without degrading the quality of the image to an unacceptable level. The

    reduction in file size allows more images to be stored in a certain amount of disk or memory

    space. It also reduces the time necessary for images to be sent over the Internet or downloaded

    from web pages.

Wavelets are functions that allow the analysis of signals or images according to scale or resolution. The processing of signals by wavelet algorithms in fact works much the same way the human eye does, or the way a digital camera processes visual scales of resolution and intermediate details. The same principle also applies to cell phone signals and even digitized colour images.

  • 7/28/2019 Sree Project Document

    2/48

    2

    Wavelets are of real use in these areas, for example in approximating data with sharp

discontinuities such as choppy signals, or pictures with many edges. While wavelet theory is perhaps a chapter of function theory, we show that the resulting algorithms are key to the processing of

    numbers, or more precisely of digitized information, signals, time series, still-images, movies,

    colour images, etc.


The Haar Transform is memory efficient, exactly reversible without edge effects, fast and simple; as such, the Haar Transform technique is widely used these days in wavelet analysis. The Fast Haar Transform (FHT) is a fast implementation of the Haar Transform that reduces the tedious work of calculation: it involves only additions, subtractions and divisions by 2. Its applications include atmospheric turbulence analysis, image analysis, and signal and image compression.

The Modified Fast Haar Wavelet Transform (MFHWT) was originally proposed as a one-dimensional approach in which the FHT is used to find the N/2 detail coefficients at each level for a signal of length N. This project uses the same concept of finding averages and differences, but extends the approach to 2D images, with the detail coefficients taken as 0 for N/2 elements at each level. The Haar Transform and Fast Haar Transform are explained, and the Modified Fast Haar Wavelet Transform is presented with the proposed algorithm for 2D images.


    Chapter 2

    Introduction to Digital Image processing

    2.1 Introduction:

A digital image is a collection of pixels laid out in a specific order, with a width (x) and height (y) measured in pixels. Each pixel has a numerical value which corresponds to a colour or gray-scale value. A pixel has no absolute size, and pixels may (though not always) have a spatial value; spatial data is data associated with the pixels that provides information about the size of the objects in the image.

    Fig: 2.1 Representation of digital image in x and y pixel format.

    An image may be defined as a two-dimensional function, f(x, y), where x and y are

    spatial coordinates, and the amplitude of f at any pair of coordinates (x, y) is called the intensity

    or gray level of the image at that point. When x, y, and the amplitude values of f are all finite,

    discrete quantities, we call the image a digital image. The field of digital image processing refers

    to processing digital images by means of a digital computer. Note that a digital image is

    composed of a finite number of elements, each of which has a particular location and value.

    These elements are referred to as picture elements, image elements and pixels. Pixel is the term

    most widely used to denote the elements of a digital image.

    Vision is the most advanced of our senses, so it is not surprising that images play the

    single most important role in human perception. However, unlike humans, who are limited to the

    visual band of the electromagnetic (EM) spectrum, imaging machines cover almost the entire

EM spectrum, ranging from gamma rays to radio waves. They can also operate on images generated

    by sources that humans are not accustomed to associating with images. These include ultrasound,

    electron microscopy, and computer-generated images. Thus, digital image processing


encompasses a wide and varied field of applications. There is no general agreement about where image processing stops and other related areas, such as image analysis and computer vision, start. Sometimes a distinction is made by

    defining image processing as a discipline in which both the input and output of a process are

    images. We believe this to be a limiting and somewhat artificial boundary. For example, under

    this definition, even the trivial task of computing the average intensity of an image would not be

    considered an image processing operation. On the other hand, there are fields such as computer

    vision whose ultimate goal is to use computers to emulate human vision, including learning and

    being able to make inferences and take actions based on visual inputs. This area itself is a branch

    of artificial intelligence (AI), whose objective is to emulate human intelligence. The field of AI

    is in its earliest stages of infancy in terms of development, with progress having been much

    slower than originally anticipated. The area of image analysis (also called image understanding)

    is in between image processing and computer vision. There are no clear-cut boundaries in the

    continuum from image processing at one end to computer vision at the other. However, one

    useful paradigm is to consider three types of computerized processes in this continuum: low-,

    mid-,and high-level processes. Low-level processes involve primitive operations such as image

    pre-processing to reduce noise, contrast enhancement, and image sharpening. A low-level

    process is characterized by the fact that both its inputs and outputs are images. Mid-level

    processes on images involve tasks such as segmentation (partitioning an image into regions or

    objects), description of those objects to reduce them to a form suitable for computer processing,

    and classification (recognition) of individual objects. A mid-level process is characterized by the

    fact that its inputs generally are images, but its outputs are at- tributes extracted from those

    images (e.g., edges, contours, and the identity of individual objects). Finally, higher-level

    processing involves making sense of an ensemble of recognized objects, as in image analysis,

    and, at the far end of the continuum, performing the cognitive functions normally associated with

    human vision.

Based on the preceding comments, we see that a logical place of overlap between image processing and image analysis is the area of recognition of individual regions or objects in an

    image. Thus, what we call in this book digital image processing encompasses processes whose

    inputs and outputs are images and, in addition, encompasses processes that extract attributes

    from images, up to and including the recognition of individual objects. As a simple illustration to


    clarify these concepts, consider the area of automated analysis of text. The processes of acquiring

    an image of the area containing the text, pre-processing that image, extracting (segmenting) the

    individual characters, describing the characters in a form suitable for computer processing, and

    recognizing those individual characters, are in the scope of what we call digital image processing

    in this book. Making sense of the content of the page may be viewed as being in the domain of

    image analysis and even computer vision, depending on the level of complexity implied by the

    statement making sense. Digital image processing, as we have defined it, is used successfully

    in a broad range of areas of exceptional social and economic value.

    2.2 Digital Image Characteristics:

Pixel - An abbreviation of the term 'picture element.' A pixel is the smallest picture element of a digital image. A monochrome pixel can have two values, black or white (0 or 1). Color and gray scale require more bits; true color, displaying approximately 16.7 million colors, requires 24 bits for each pixel. A pixel may hold more data than the eye can perceive at one time.

Dot - The smallest unit that a printer can print.

Voxel - An abbreviation of the term 'volume element.' The smallest distinguishable box-shaped part of a three-dimensional space. A particular voxel is identified by the x, y and z coordinates of one of its eight corners, or perhaps its centre. The term is used in three-dimensional modeling. Voxels need not have uniform dimensions in all three coordinate planes.

    To the human observer, the internal structures and functions of the human body are not

    generally visible. However, by various technologies, images can be created through which the

    medical professional can look into the body to diagnose abnormal conditions and guide

    therapeutic procedures. The medical image is a window to the body. No image window reveals

    everything. Different medical imaging methods reveal different characteristics of the human

body. The medical imaging process involves five major components: the patient, the imaging system, the system operator, the image itself, and the observer. The objective is to make an object or condition within the patient's body visible to the observer. The

    visibility of specific anatomical features depends on the characteristics of the imaging system

    and the manner in which it is operated. Most medical imaging systems have a considerable

    number of variables that must be selected by the operator. They can be changeable system

    components, such as intensifying screens in radiography, transducers in sonography, or coils in


    magnetic resonance imaging (MRI). However, most variables are adjustable physical quantities

associated with the imaging process, such as kilovoltage in radiography, gain in sonography,

    and echo time (TE) in MRI. The values selected will determine the quality of the image and the

    visibility of specific body features.

    2.2.1 Image Quality:

    The quality of a medical image is determined by the imaging method, the characteristics

    of the equipment, and the imaging variables selected by the operator. Image quality is not a

    single factor but is a composite of at least five factors: contrast, blur, noise, artefacts, and

    distortion, as shown above. The human body contains many structures and objects that are

    simultaneously imaged by most imaging methods. We often consider a single object in relation

    to its immediate background. In fact, with most imaging procedures the visibility of an object is

    determined by this relationship rather than by the overall characteristics of the total image.

    The task of every imaging system is to translate a specific tissue characteristic into image shades

    of gray or colour. If contrast is adequate, the object will be visible. The degree of contrast in the

    image depends on characteristics of both the object and the imaging system.

    2.2.2 Image Contrast:

    Contrast means difference. In an image, contrast can be in the form of different shades of

    gray, light intensities, or colors. Contrast is the most fundamental characteristic of an image. An

    object within the body will be visible in an image only if it has sufficient physical contrast

    relative to surrounding tissue. However, image contrast much beyond that required for good

    object visibility generally serves no useful purpose and in many cases is undesirable. The

    physical contrast of an object must represent a difference in one or more tissue characteristics.

    For example, in radiography, objects can be imaged relative to their surrounding tissue if there is

    an adequate difference in either density or atomic number and if the object is sufficiently thick.

    When a value is assigned to contrast, it refers to the difference between two specific points or

    areas in an image. In most cases we are interested in the contrast between a specific structure or

    object in the image and the area around it or its background.


    2.2.3 Contrast Sensitivity:

    The degree of physical object contrast required for an object to be visible in an image

    depends on the imaging method and the characteristics of the imaging system. The primary

    characteristic of an imaging system that establishes the relationship between image contrast and

    object contrast is its contrast sensitivity. Consider the situation shown below. The circular

    objects are the same size but are filled with different concentrations of iodine contrast medium.

    That is, they have different levels of object contrast. When the imaging system has a relatively

    low contrast sensitivity, only objects with a high concentration of iodine (ie, high object contrast)

    will be visible in the image. If the imaging system has a high contrast sensitivity, the lower-

    contrast objects will also be visible.

It should be emphasized that contrast sensitivity is a characteristic of the imaging method and the

    variables of the particular imaging system. It is the characteristic that relates to the system's

    ability to translate physical object contrast into image contrast. The contrast transfer

    characteristic of an imaging system can be considered from two perspectives. From the

    perspective of adequate image contrast for object visibility, an increase in system contrast

    sensitivity causes lower-contrast objects to become visible. However, if we consider an object

    with a fixed degree of physical contrast (i.e., a fixed concentration of contrast medium), then

    increasing contrast sensitivity will increase image contrast.

    It is difficult to compare the contrast sensitivity of various imaging methods because

    many are based on different tissue characteristics. However, certain methods do have higher

    contrast sensitivity than others. For example, computed tomography (CT) generally has a higher

    contrast sensitivity than conventional radiography. This is demonstrated by the ability of CT to

    image soft tissue objects (masses) that cannot be imaged with radiography. Consider the image

    below. Here is a series of objects with different degrees of physical contrast. They could be

    vessels filled with different concentrations of contrast medium. The highest concentration (and

    contrast) is at the bottom. Now imagine a curtain coming down from the top and covering some

    of the objects so that they are no longer visible. Contrast sensitivity is the characteristic of the


    imaging system that raises and lowers the curtain. Increasing sensitivity raises the curtain and

    allows us to see more objects in the body. A system with low contrast sensitivity allows us to

    visualize only objects with relatively high inherent physical contrast.

    2.2.4 Blur and Visibility of Detail:

    Structures and objects in the body vary not only in physical contrast but also in size.

    Objects range from large organs and bones to small structural features such as trabecula patterns

    and small calcifications. It is the small anatomical features that add detail to a medical image.

    Each imaging method has a limit as to the smallest object that can be imaged and thus on

    visibility of detail. Visibility of detail is limited because all imaging methods introduce blurring

    into the process. The primary effect of image blur is to reduce the contrast and visibility of small

    objects or detail. Consider the image below, which represents the various objects in the body in

    terms of both physical contrast and size. As we said, the boundary between visible and invisible

    objects is determined by the contrast sensitivity of the imaging system. We now extend the idea

    of our curtain to include the effect of blur. It has little effect on the visibility of large objects but

    it reduces the contrast and visibility of small objects. When blur is present, and it always is, our

    curtain of invisibility covers small objects and image detail.

    2.2.5 Noise:

    Another characteristic of all medical images is image noise. Image noise, sometimes

    referred to as image mottle, gives an image a textured or grainy appearance. The source and

    amount of image noise depend on the imaging method and are discussed in more detail in a later

    chapter. We now briefly consider the effect of image noise on visibility. In the image below we

    find our familiar array of body objects arranged according to physical contrast and size. We now

    add a third factor, noise, which will affect the boundary between visible and invisible objects.

    The general effect of increasing image noise is to lower the curtain and reduce object visibility.

    In most medical imaging situations the effect of noise is most significant on the low-contrast

    objects that are already close to the visibility threshold.


    2.2.6 Object Contrast:

    The ability to see or detect an object is heavily influenced by the contrast between the

    object and its background. For most viewing tasks there is not a specific threshold contrast at

    which the object suddenly becomes visible. Instead, the accuracy of seeing or detecting a specific

    object increases with contrast. The contrast sensitivity of the human viewer changes with

    viewing conditions. When viewer contrast sensitivity is low, an object must have a relatively

    high contrast to be visible. The degree of contrast required depends on conditions that alter the

    contrast sensitivity of the observer: background brightness, object size, viewing distance, glare,

    and background structure.

    2.2.7 Background Brightness:

    The human eye can function over a large range of light levels or brightness, but vision is

    not equally sensitive at all brightness levels. The ability to detect objects generally increases with

    increasing background brightness or image illumination. To be detected in areas of low

    brightness, an object must be large and have a relatively high level of contrast with respect to its

background. This can be demonstrated with the image above. View this image with

    different levels of illumination. You will notice that under low illumination you cannot see all of

    the small and low-contrast objects. A higher level of object contrast is required for visibility.

    2.3 File Formats:

A file format defines the components of the digital image (x and y dimensions, the values of the pixels, colour/gray scale, compression, the manner in which the pixels are laid out, etc.). Standard file formats enable the exchange of digital image information. Many file formats exist, for example:

    JPEG - Joint Photographic Experts Group.

    TIFF - Tagged Image File Format.

PNG - Portable Network Graphics.

    2.4 Digital Image Representation:


An image is defined as a two-dimensional function, i.e. a matrix, f(x, y), where x and y are

    spatial coordinates, and the amplitude of f at any pair of coordinates (x, y) is called the intensity

    or gray level of the image at the point. Color images are formed by combining the individual

two-dimensional images. For example, in the RGB color system, a color image consists of three individual component images, namely red, green and blue. Thus many of the techniques

    developed for monochrome images can be extended to color images by processing the three

    component images individually. When x, y and the amplitude values of f are all finite, discrete

    quantities, the image is called a digital image. The field of digital image processing refers to

    processing digital images by means of a digital computer. A digital image is composed of a finite

    number of elements, each of which has a particular location and value. These elements are

    referred to as picture elements, image elements, pels and pixels. Since pixel is the most widely

    used term, the elements will be denoted as pixels from now on.

    An image may be continuous with respect to the x- and y-coordinates, and also in

amplitude. Digitizing the coordinates as well as the amplitude converts such an image to digital form. Here, the digitization of the coordinate values is called

    sampling; digitizing the amplitude values is called quantization. A digital image is composed of

    a finite number of elements, each of which has a particular location and value. The field of

    digital image processing refers to processing digital images by means of a digital computer.

2.4.1 Coordinate Convention:

Assume that an image f(x, y) is sampled so that the resulting image has M rows and N columns. Then the image is of size M × N. The values of the coordinates (x, y) are discrete

    quantities. Integer values are used for these discrete coordinates. In many image processing

    books, the image origin is set to be at (x, y) = (0, 0). The next coordinate values along the first

    row of the image are (x, y) = (0, 1). Note that the notation (0, 1) is used to signify the second

    sample along the first row. These are not necessarily the actual values of physical coordinates

when the image was sampled. Note that x ranges from 0 to M-1, and y from 0 to N-1, where x

    and y are integers. However, in the Wavelet Toolbox the notation (r, c) is used where r indicates

    rows and c indicates the columns. It could be noted that the order of coordinates is the same as

the order discussed previously. Now, the major difference is that the origin of the coordinate


    system is at (r, c) = (1, 1); hence r ranges from 1 to M, and c from 1 to N for r and c integers. The

    coordinates are referred to as pixel coordinates.

    2.4.2 Images as Matrices:

    The coordinate system discussed in preceding section leads to the following representation for

    the digitized image function:

f(x, y) = [ f(0, 0)      f(0, 1)      ...  f(0, N-1)
            f(1, 0)      f(1, 1)      ...  f(1, N-1)
            ...          ...               ...
            f(M-1, 0)    f(M-1, 1)    ...  f(M-1, N-1) ]        (2.1)

    The right side of the equation is a representation of digital image. Each element of this

    array (matrix) is called the pixel.

    Now, in MATLAB, the digital image is represented as the following matrix:

f = [ f(1, 1)    f(1, 2)    ...  f(1, N)
      f(2, 1)    f(2, 2)    ...  f(2, N)
      ...        ...             ...
      f(M, 1)    f(M, 2)    ...  f(M, N) ]                      (2.2)

where M is the number of rows and N is the number of columns. Matrices in MATLAB

    are stored in variables with names such as A, a, RGB, real array and so on.
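
As a concrete illustration of the matrix representation and the (r, c) coordinate convention described above, the following MATLAB snippet uses a small synthetic matrix in place of a real image (the pixel values are arbitrary):

    % A 4-by-4 gray-scale "image" represented directly as a matrix.
    f = [ 10  20  30  40
          50  60  70  80
          90 100 110 120
         130 140 150 160 ];

    [M, N] = size(f);      % M rows, N columns
    p = f(1, 1);           % pixel at the origin (r, c) = (1, 1)
    q = f(M, N);           % pixel at the bottom-right corner (r, c) = (M, N)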

    2.4.3 Color Image Representation:

An RGB color image is an M × N × 3 array or matrix of color pixels, where each color

    pixel consists of a triplet corresponding to the red, green, and blue components of an RGB image

    at a specific spatial location. An RGB image may be viewed as a stack of three gray-scale

    images, that when fed into the red, green, and blue inputs of a color monitor, produce a color

    image on the screen. So from the stack of three images forming that RGB color image, each

    image is referred to as the red, green, and blue component images by convention. Now, the data

    class of the component images determine their range of values. If an RGB color image is of

    class double, meaning that all the pixel values are of type double, the range of values is [0, 1].

    Likewise, the range of values is [0, 255] or [0, 65535] for RGB images of class uint8 or uint16,


    respectively. The number of bits used to represent the pixel values of the component images

    determines the bit depth of an RGB color image.

    The RGB color space is shown graphically as an RGB color cube. The vertices of the

cube are the primary (red, green, and blue) and secondary (cyan, magenta, and yellow) colors of

    light.

    2.4.4 Indexed Images:

    An indexed image has two components: a data matrix of integers, X, and a colormap

matrix, map. Matrix map is an m × 3 array of class double containing floating-point values in the

    range [0, 1]. The length, m, of the map is equal to the number of colors it defines. Each row of

    map specifies the red, green, and blue components of a single color. An indexed image uses

direct mapping of pixel intensity values to colormap values. The color of each pixel is determined by using the corresponding value of the integer matrix X as a pointer into map. If X is of class double, then all of its components with value 1 point to the first row of map, all components with value 2 point to the second row, and so on. If X is of class uint8 or uint16, then all components with value 0 point to the first row in map, all components with value 1 point to the second row, and so on.
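
For illustration, a tiny indexed image can be built and displayed in MATLAB as follows (the 2 x 2 index matrix and three-color map are made up for this sketch and are not taken from the project):

    % A small indexed image: X holds pointers into the rows of map.
    X   = [1 2             % class double: value 1 -> first row of map,
           3 1];           %               value 2 -> second row, and so on
    map = [1 0 0           % row 1: red
           0 1 0           % row 2: green
           0 0 1];         % row 3: blue

    image(X); colormap(map);   % display using the direct index-to-color mapping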

    2.4.5 The Basics of Color Image Processing:

Color image processing techniques deal with how color images are handled for a

    variety of image-processing tasks. For the purposes of the following discussion we subdivide

    color image processing into three principal areas: (1) color transformations (also called color

    mappings); (2) spatial processing of individual color planes; and (3) color vector processing. The

    first category deals with processing the pixels of each color plane based strictly on their values

    and not on their spatial coordinates. This category is analogous to the intensity transformations.

    The second category deals with spatial (neighbor-hood) filtering for individual color planes and

is analogous to spatial filtering. The third category deals with techniques based on processing all

    components of a color image simultaneously. Since full-color images have at least three

components, color pixels are indeed vectors. For example, in the RGB color system, a color point can be interpreted as a vector extending from the origin to that point in the

    RGB coordinate system.

    Let c represent an arbitrary vector in RGB color space:


c = [ c_R ]   [ R ]
    [ c_G ] = [ G ]                                             (2.3)
    [ c_B ]   [ B ]

This equation indicates that the components of c are simply the RGB components of a color image at a point. Since the color components are a function of the coordinates (x, y), they can be written using the notation

c(x, y) = [ c_R(x, y) ]   [ R(x, y) ]
          [ c_G(x, y) ] = [ G(x, y) ]                           (2.4)
          [ c_B(x, y) ]   [ B(x, y) ]

For an image of size M × N, there are MN such vectors, c(x, y), for x = 0, 1, ..., M-1 and y = 0, 1, ..., N-1. In order for independent color component and vector-based processing to be

    equivalent, two conditions have to be satisfied: (i) the process has to be applicable to both

    vectors and scalars. (ii) the operation on each component of a vector must be independent of the

other components. Neighborhood averaging is an example that satisfies both conditions: the averaging can be done by summing the gray levels of all the pixels in the neighborhood of each component image and dividing by the number of pixels, or by summing all the color vectors in the neighborhood and dividing by the number of vectors. Each component of the resulting average vector is the average of the pixels in the image corresponding to that component, which is the same result that would be obtained if the averaging were done on the neighborhood of each component image individually and the color vector were then formed.
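
This equivalence can be checked quickly in MATLAB; the sketch below uses a random 3 x 3 RGB neighborhood rather than a real image:

    % Per-component versus vector neighborhood averaging.
    rgb = rand(3, 3, 3);                               % a 3x3 RGB neighborhood

    % (a) average each component image over the neighborhood individually
    avg_per_channel = squeeze(mean(mean(rgb, 1), 2));  % 3x1 vector [R; G; B]

    % (b) average the nine color vectors directly
    vecs        = reshape(rgb, [], 3);                 % 9x3, one color vector per row
    avg_vectors = mean(vecs, 1).';                     % 3x1 vector [R; G; B]

    max(abs(avg_per_channel - avg_vectors))            % ~0, up to round-off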

    2.4.6 Reading Images:

    In MATLAB, images are read into the MATLAB environment using function called

imread. The syntax is as follows: imread('filename'). Here, filename is a string containing the complete name of the image file, including any applicable extension. For example, the command line >> f = imread('x.jpg'); reads the JPEG image into the image array (image matrix) f. Since there

    are three color components in the image, namely red, green and blue components, the image is

    broken down into the three distinct color matrices fR, fG and fB.
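
A minimal sketch of this step is given below; the file name 'x.jpg' is simply a placeholder for whatever image is being read:

    % Read a color image and split it into its three component matrices.
    f  = imread('x.jpg');    % M-by-N-by-3 array for an RGB image

    fR = f(:, :, 1);         % red component image
    fG = f(:, :, 2);         % green component image
    fB = f(:, :, 3);         % blue component image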

    2.5 Standard method of image compression:

    In 1992, JPEG established the first international standard for still image compression

    where the encoders and decoders are DCT-based. The JPEG standard specifies three modes


    namely sequential, progressive, and hierarchical for lossy encoding, and one mode of lossless

    encoding. The performance of the coders for JPEG usually degrades at low bit-rates mainly

because of the underlying block-based Discrete Cosine Transform (DCT). The baseline JPEG coder [5] is the sequential encoding in its simplest form. Fig. 2.2 and Fig. 2.3 show the key processing steps in such an encoder and decoder, respectively, for grayscale images. Color image

    compression can be approximately regarded as compression of multiple grayscale images, which

    are either compressed entirely one at a time, or are compressed by alternately interleaving 8x8

    sample blocks from each in turn.

    The DCT-based encoder can be thought of as essentially compression of a stream of 8x8

    blocks of image samples. Each 8x8 block makes its way through each processing step, and yields

    output in compressed form into the data stream. Because adjacent image pixels are highly

correlated, the Forward DCT (FDCT) processing step lays the basis for gaining data compression

    by concentrating most of the signal in the lower spatial frequencies. For a typical 8x8 sample

    block from a typical source image, most of the spatial frequencies have zero or near-zero

amplitude and need not be encoded.

Fig: 2.2 Encoder Block Diagram (Original Image -> FDCT -> Quantizer -> Entropy Encoder -> Compressed Image Data; uses a Quantization Table (QT) and a Huffman Table).

Fig: 2.3 Decoder Block Diagram (Compressed Image Data -> Entropy Decoder -> Dequantizer -> Inverse DCT -> Reconstructed Image; uses the Quantization Table (QT) and Huffman Table).


    After output from the Forward DCT (FDCT), each of the 64 DCT coefficients is

    uniformly quantized in conjunction with a carefully designed 64-element Quantization Table

    (QT). At the decoder, the quantized values are multiplied by the corresponding QT elements to

    pick up the original unquantized values. After quantization, all the quantized coefficients are

    ordered into zig-zag sequence. This ordering helps to facilitate entropy encoding by placing low

    frequency non-zero coefficients before high-frequency coefficients. The DC coefficient, which

    contains a significant fraction of the total image energy, is differentially encoded.
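
To make the quantization and dequantization steps concrete, here is a minimal MATLAB sketch; the 8 x 8 coefficient block and the quantization table contain illustrative values only and are not the tables specified by the JPEG standard:

    % Uniform quantization of an 8x8 block of DCT coefficients with a
    % quantization table, and the corresponding dequantization at the decoder.
    C  = 50 * randn(8);                                         % stand-in for an 8x8 block of DCT coefficients
    QT = 16 + 8 * ((0:7)' * ones(1, 8) + ones(8, 1) * (0:7));   % coarser steps at higher frequencies

    Cq    = round(C ./ QT);   % encoder: quantize (many entries become zero)
    C_rec = Cq .* QT;         % decoder: dequantize (an approximation of C)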

    Entropy Coding (EC) achieves additional compression losslessly through encoding the

    quantized DCT coefficients more compactly based on their statistical characteristics. The

    JPEG proposal specifies both Huffman coding and arithmetic coding. More recently, the wavelet

    transform has emerged as a cutting edge technology, within the field of image analysis. Wavelets

    are a mathematical tool for hierarchically decomposing functions. Though rooted in

    approximation theory, signal processing, and physics, wavelets have also recently been applied

    to many problems in Computer Graphics including image editing and compression, automatic

level-of-detail control for editing and rendering curves and surfaces, surface reconstruction from

    contours and fast methods for solving simulation problems in 3D modelling, global illumination,

    and animation .

    Wavelet-based coding provides substantial improvements in picture quality at higher

    compression ratios. Over the past few years, a variety of powerful and sophisticated wavelet-

    based schemes for image compression have been developed and implemented. Because of the

    many advantages of wavelet based image compressionas listed below, the top contenders in the

    JPEG-2000 standard are all wavelet-based compression algorithms.

    2.6 Conclusion:

Digital image characteristics, digital image representation, the basics of colour image processing and the standard method of image compression have been discussed in this chapter. Image compression using different techniques is discussed in the next chapter.


    Chapter 3

Image compression using different techniques

    3.1 Introduction:

    Here, some background topics of image compression which include the principles of

    image compression, the classification of compression methods and the framework of a general

    image coder and wavelets for image compression, different types of transforms and quantization

    are going to be discussed.

3.2 Principles of Image Compression:

A common characteristic of most images is that neighboring pixels are correlated and therefore hold redundant information. The foremost task then is to find a less correlated representation of the image. Two elementary components of compression are redundancy reduction and irrelevancy reduction. Redundancy reduction aims at removing duplication from the signal source (image). Irrelevancy reduction omits parts of the signal that will not be noticed by the signal

    receiver, namely the Human Visual System (HVS). In general, three types of redundancy can be

    identified: (a) Spatial Redundancy or correlation between neighboring pixel values, (b) Spectral

    Redundancy or correlation between different color planes or spectral bands and (c) Temporal

    Redundancy or correlation between adjacent frames in a sequence of images especially in video

applications. Image compression research aims at reducing the number of bits needed to represent an image by removing the spatial and spectral redundancies as much as possible.

    3.3. Framework of General Image Compression Method:

A typical lossy image compression system is shown in Fig. 2.4. It consists of three closely

    connected components namely (a) Source Encoder, (b) Quantizer and (c) Entropy Encoder.

    Compression is achieved by applying a linear transform in order to decorrelate the image data,

    quantizing the resulting transform coefficients and entropy coding the quantized values.

Fig: 2.4 A typical lossy encoder (Input Image -> Source Encoder -> Quantizer -> Entropy Encoder -> Compressed Image).


Source Encoder:

A variety of linear transforms has been developed, including the Discrete

    Fourier Transform (DFT), Discrete Cosine Transform (DCT), Discrete Wavelet

    Transform (DWT) and many more, each with its own advantages and disadvantages.

Quantizer:

A quantizer is used to reduce the number of bits needed to store the transformed

    coefficients by reducing the precision of those values. As it is a many-to-one mapping, it

    is a lossy process and is the main source of compression in an encoder. Quantization can

    be performed on each individual coefficient, which is called Scalar Quantization (SQ).

    Quantization can also be applied on a group of coefficients together known as Vector

    Quantization (VQ). Both uniform and non-uniform quantizers can be used depending on

    the problems.

Entropy Encoder:

An entropy encoder further compresses the quantized values losslessly to provide a better overall compression. It uses a model to accurately determine the

    probabilities for each quantized value and produces an appropriate code based on these

    probabilities so that the resultant output code stream is smaller than the input stream. The

    most commonly used entropy encoders are the Huffman encoder and the arithmetic

    encoder, although for applications requiring fast execution, simple Run Length Encoding

    (RLE) is very effective.

    3.4. Image Compression:

    In the last decade, there has been a lot of technological transformation in the way we

    communicate. This transformation includes the ever present, ever growing internet, the explosive

    development in mobile communication and ever increasing importance of video communication.

Data compression is one of the enabling technologies for each aspect of this multimedia revolution. Cellular phones would not be able to provide communication with increasing clarity without data compression. Data compression is the art and science of representing information in compact form.

    Despite rapid progress in mass-storage density, processor speeds, and digital

    communication system performance, demand for data storage capacity and data-transmission


    bandwidth continues to outstrip the capabilities of available technologies. In a distributed

    environment large image files remain a major bottleneck within systems.

    Image Compression is an important component of the solutions available for creating

    image file sizes of manageable and transmittable dimensions. Platform portability and

    performance are important in the selection of the compression/decompression technique to be

    employed.

    Four Stage model of Data Compression:

    Almost all data compression systems can be viewed as comprising four successive

    stages of data processing arranged as a processing pipeline (though some stages will often be

    combined with a neighboring stage, performed "off-line," or otherwise made rudimentary).

    The four stages are

    (A) Preliminary pre-processing steps.

    (B) Organization by context.

    (C) Probability estimation.

    (D) Length-reducing code.

    The ubiquitous compression pipeline (A-B-C-D) is what is of interest.

With (A) we mean various pre-processing steps that may be appropriate before the final compression engine. Lossy compression often follows the same pattern as lossless compression, but with one or

    more quantization steps somewhere in (A). Sometimes clever designers may defer the loss until

    suggested by statistics detected in (C); an example of this would be modern zero tree image

    coding.

    (B) Organization by context often means data reordering, for which a simple but good

    example is JPEG's "Zigzag" ordering. The purpose of this step is to improve the estimates found

    by the next step.

    (C) A probability estimate (or its heuristic equivalent) is formed for each token to be

    encoded. Often the estimation formula will depend on context found by (B) with separate 'bins'

    of state variables maintained for each conditioned class.


    (D) Finally, based on its estimated probability, each compressed file token is represented as

    bits in the compressed file. Ideally, a 12.5%-probable token should be encoded with three bits,

    but details become complicated.
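
As a small illustration of stage (D), and of the 12.5% example above, the ideal code length of a token is the negative base-2 logarithm of its probability (a sketch of the ideal case only, ignoring the practical complications mentioned):

    % Ideal code lengths (in bits) for tokens with given probabilities.
    p   = [0.5 0.25 0.125 0.125];   % token probabilities (sum to 1)
    len = -log2(p);                 % ideal lengths: [1 2 3 3] bits

    % Expected number of bits per token (the entropy of this distribution):
    H = sum(p .* len);              % = 1.75 bits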

    Principle behind Image Compression:

Images have considerably higher storage requirements than text, and audio and video data place even more demanding requirements on data storage. An image stored in an uncompressed file

    format, such as the popular BMP format, can be huge. An image with a pixel resolution of 640

    by 480 pixels and 24-bit colour resolution will take up 640 * 480 * 24/8 = 921,600 bytes in an

    uncompressed format.

    The huge amount of storage space is not only the consideration but also the data

    transmission rates for communication of continuous media are also significantly large. An image,

    1024 pixel x 1024 pixel x 24 bit, without compression, would require 3 MB of storage and 7

    minutes for transmission, utilizing a high speed, 64 Kbits /s, ISDN line.

    Image data compression becomes still more important because of the fact that the transfer

    of uncompressed graphical data requires far more bandwidth and data transfer rate. For example,

    throughput in a multimedia system can be as high as 140 Mbits/s, which must be transferred

between systems. This kind of data transfer rate is not realizable with today's technology, or in the near future, with reasonably priced hardware.

3.5. Fundamentals of Image Compression Techniques:

A digital image, or "bitmap", consists of a grid of dots, or "pixels", with each pixel

    defined by a numeric value that gives its colour. The term data compression refers to the process

    of reducing the amount of data required to represent a given quantity of information. Now, a

    particular piece of information may contain some portion which is not important and can be

comfortably removed. All such data is referred to as redundant data. Data redundancy is a central

    issue in digital image compression. Image compression research aims at reducing the number of

    bits needed to represent an image by removing the spatial and spectral redundancies as much as

    possible.

    A common characteristic of most images is that the neighboring pixels are correlated and

    therefore contain redundant information. The foremost task then is to find less correlated

    representation of the image. In general, three types of redundancy can be identified:


    1. Coding Redundancy

    2. Inter Pixel Redundancy

3. Psychovisual Redundancy

    Coding Redundancy:

If the gray levels of an image are coded in a way that uses more code symbols than absolutely necessary to represent each gray level, the resulting image is said to contain coding redundancy. It is almost always present when an image's gray levels are represented with a straight or natural binary code. Let us assume that a random variable r_k lying in the interval [0, 1] represents the gray levels of an image and that each r_k occurs with probability P_r(r_k):

    P_r(r_k) = n_k / n,   k = 0, 1, 2, ..., L-1                    (3.1)

where L is the number of gray levels, n_k is the number of times the k-th gray level appears in the image, and n is the total number of pixels in the image. If the number of bits used to represent each value of r_k is l(r_k), the average number of bits required to represent each pixel is

    L_avg = Σ_{k=0}^{L-1} l(r_k) P_r(r_k)                          (3.2)

That is, the average length of the code words assigned to the various gray levels is found by summing the products of the number of bits used to represent each gray level and the probability that the gray level occurs. Thus the total number of bits required to code an M × N image is M N L_avg.
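
A small numerical sketch of equations (3.1) and (3.2), with made-up gray-level counts and code lengths purely for illustration:

    % Average code length L_avg for a 4-level image (illustrative numbers).
    n_k = [4000 3000 2000 1000];    % occurrences of each gray level r_k
    p   = n_k / sum(n_k);           % probabilities P_r(r_k), eq. (3.1)

    l_fixed = [2 2 2 2];            % 2-bit natural binary code for 4 levels
    l_var   = [1 2 3 3];            % a variable-length code

    Lavg_fixed = sum(l_fixed .* p); % = 2.0 bits/pixel
    Lavg_var   = sum(l_var   .* p); % = 1.9 bits/pixel, eq. (3.2)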

    Inter Pixel Redundancy:

    The Information of any given pixel can be reasonably predicted from the value of its

    neighboring pixel. The information carried by an individual pixel is relatively small.

    In order to reduce the inter pixel redundancies in an image, the 2-D pixel array normally used

    for viewing and interpretation must be transformed into a more efficient but usually non visual

    format. For example, the differences between adjacent pixels can be used to represent an image.

These types of transformations are referred to as mappings. They are called reversible if the

    original image elements can be reconstructed from the transformed data set.


Psychovisual Redundancy:

    Certain information simply has less relative importance than other information in normal visual

processing. This information is said to be psychovisually redundant; it can be eliminated without

    significantly impairing the quality of image perception.

    In general, an observer searches for distinguishing features such as edges or textual regions

    and mentally combines them in recognizable groupings. The brain then correlates these

    groupings with prior knowledge in order to complete the image interpretation process.

    The elimination of psycho visually redundant data results in loss of quantitative information;

it is commonly referred to as quantization. As this is an irreversible process, i.e., visual information is lost, it results in lossy data compression. An image reconstructed following lossy

    compression contains degradation relative to the original. Often this is because the compression

    scheme completely discards redundant information.

    Image Compression Techniques:

    There are basically two methods of Image Compression:

    3.5.1. Lossless Coding Techniques

    3.5.2. Lossy Coding Techniques

    3.5.1. Lossless Coding Techniques:

    In Lossless Compression schemes, the reconstructed image, after compression, is

numerically identical to the original image. However, lossless compression can achieve only a modest amount of compression. Lossless coding guarantees that the decompressed image is

    absolutely identical to the image before compression. Lossless techniques can also be used for

    the compression of other data types where loss of information is not acceptable. Lossless

    compression algorithms can be used to squeeze down images and then restore them again for

    viewing completely unchanged.

    Lossless Coding Techniques are as follows:

1. Run Length Encoding.

    2. Huffman Encoding.

    3. Entropy Encoding.

    4. Area Encoding.


    3.5.2. Lossy Coding Techniques:

    Lossy techniques cause image quality degradation in each Compression / De-

    compression step. Careful consideration of the Human Visual perception ensures that the

    degradation is often unrecognizable, though this depends on the selected compression ratio. An

image reconstructed following lossy compression contains some degradation relative to the original; in return, lossy schemes are capable of achieving much higher compression. Under normal viewing conditions, no visible loss is perceived (visually lossless).

    Lossy Image Coding Techniques normally have three Components:

    Image Modeling:

    It is aimed at the exploitation of statistical characteristics of the image (i.e. high

    correlation, redundancy). It defines such things as the transformation to be applied to the Image.

    Parameter Quantization:

    The aim of Quantization is to reduce the amount of data used to represent the information

    within the new domain.

    Encoding:

Here a code is generated by associating appropriate code words with the raw output produced by

    the Quantizer. Encoding is usually error free. It optimizes the representation of the information

    and may introduce some error detection codes.

    3.6. Measurement of Image Quality:

    The design of an imaging system should begin with an analysis of the physical characteristics of

    the originals and the means through which the images may be generated. For example, one might

    examine a representative sample of the originals and determine the level of detail that must be

    preserved, the depth of field that must be captured, whether they can be placed on a glass platen

    or require a custom book-edge scanner, whether they can tolerate exposure to high light

intensity, and whether specular reflections must be captured or minimized. A detailed

    examination of some of the originals, perhaps with a magnifier or microscope, may be necessary

    to determine the level of detail within the original that might be meaningful for a researcher or

    scholar. For example, in drawings or paintings it may be important to preserve stippling or other

characteristic techniques.


    3.7. Wavelets for image compression:

    Wavelet transform exploits both the spatial and frequency correlation of data by dilations

    (or contractions) and translations of mother wavelet on the input data. It supports the multi-

    resolution analysis of data i.e. it can be applied to different scales according to the details

    required, which allows progressive transmission and zooming of the image without the need of

extra storage. Another encouraging feature of the wavelet transform is its symmetric nature: both the forward and the inverse transform have the same complexity, enabling fast compression and decompression routines. Its characteristics well suited for image compression include the

ability to take into account Human Visual System (HVS) characteristics, very good energy

    compaction capabilities, robustness under transmission, high compression ratio etc.

    Wavelet transform divides the information of an image into approximation and detail

    sub-signals. The approximation sub-signal shows the general trend of pixel values and other

    three detail sub-signals show the vertical, horizontal and diagonal details or changes in the

images. If these details are very small (below a threshold), they can be set to zero without

    significantly changing the image. The greater the number of zeros the greater the compression

    ratio. If the energy retained (amount of information retained by an image after compression and

    decompression) is 100% then the compression is lossless as the image can be reconstructed

    exactly. This occurs when the threshold value is set to zero, meaning that the details have not

been changed.
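
A minimal MATLAB sketch of this idea, using one level of Haar-style averaging and differencing to split a block into an approximation and three detail sub-signals and then thresholding the details (the block and threshold are illustrative only, not the project's algorithm):

    % One level of 2-D Haar-style decomposition followed by hard thresholding.
    f = double(magic(8));                        % illustrative 8x8 image block

    a = (f(1:2:end, :) + f(2:2:end, :)) / 2;     % averages along rows
    d = (f(1:2:end, :) - f(2:2:end, :)) / 2;     % differences along rows

    LL = (a(:, 1:2:end) + a(:, 2:2:end)) / 2;    % approximation sub-signal
    D1 = (a(:, 1:2:end) - a(:, 2:2:end)) / 2;    % detail sub-signals capturing
    D2 = (d(:, 1:2:end) + d(:, 2:2:end)) / 2;    % changes in the horizontal,
    D3 = (d(:, 1:2:end) - d(:, 2:2:end)) / 2;    % vertical and diagonal directions

    thr = 0.5;                                   % illustrative threshold
    D1(abs(D1) < thr) = 0;                       % small details set to zero:
    D2(abs(D2) < thr) = 0;                       % the more zeros, the higher
    D3(abs(D3) < thr) = 0;                       % the compression ratio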

3.8 Image Compression Methodology:

Overview:

The storage requirements for the video of a typical angiogram procedure are of the order of

    several hundred Mbytes.

* Transmission of this data over a low-bandwidth network results in very high latency.

    * Lossless compression methods can achieve compression ratios of ~2:1.

* We consider lossy techniques operating at much higher compression ratios (~10:1).
* Key issues:

    - High quality reconstruction required.

    - Angiogram data contains considerable high-frequency spatial texture.


    * Proposed method applies a texture-modeling scheme to the high-frequency texture of some

    regions of the image.

    * This allows more bandwidth allocation to important areas of the image.

    3.9 Different types of transforms:

    1. FT (Fourier Transform).

    2. DCT (Discrete Cosine Transform).

3. DWT (Discrete Wavelet Transform).

    3.9.1 Discrete Fourier Transform:

The DTFT representation of a finite-duration sequence x(n) is

    X(e^jω) = Σ_{n=0}^{N-1} x(n) e^{-jωn}                          (3.3)

    x(n) = (1/2π) ∫_{2π} X(e^jω) e^{jωn} dω                        (3.4)

where x(n) is a finite-duration sequence and X(e^jω) is periodic with period 2π. It is convenient to sample X(e^jω) at N uniformly spaced frequencies between 0 and 2π.

Let

    ω_k = 2πk/N,   k = 0, 1, ..., N-1                              (3.5)

Therefore

    X(k) = X(e^jω) |_{ω = 2πk/N}                                   (3.6)

Since X(e^jω) is sampled over one period and there are N samples, X(k) can be expressed as

    X(k) = Σ_{n=0}^{N-1} x(n) e^{-j2πkn/N},   k = 0, 1, ..., N-1   (3.7)

3.9.2 The Discrete Cosine Transform (DCT):

    The discrete cosine transform (DCT) helps separate the image into parts (or spectral

    sub-bands) of differing importance (with respect to the image's visual quality). The DCT is

    similar to the discrete Fourier transform: it transforms a signal or image from the spatial domain

    to the frequency domain.
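
As an illustration, the 8 x 8 DCT basis matrix can be built directly and applied to a block as follows (a minimal sketch that avoids any toolbox dependency; the input block is arbitrary):

    % Build the 8x8 orthonormal DCT-II matrix T and transform a block B.
    N = 8;
    [n, k] = meshgrid(0:N-1, 0:N-1);                  % n: sample index, k: frequency index
    T = sqrt(2/N) * cos(pi * (2*n + 1) .* k / (2*N));
    T(1, :) = T(1, :) / sqrt(2);                      % rescale the k = 0 (DC) row

    B     = magic(8);      % example 8x8 spatial-domain block
    C     = T * B * T';    % 2-D DCT of the block (frequency domain)
    B_rec = T' * C * T;    % inverse transform recovers B up to round-off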


    3.9.3 Discrete Wavelet Transform (DWT):

    The discrete wavelet transform (DWT) refers to wavelet transforms for which the

    wavelets are discretely sampled. A transform which localizes a function both in space and

    scaling and has some desirable properties compared to the Fourier transform. The transform is

    based on a wavelet matrix, which can be computed more quickly than the analogous Fourier

    matrix. Most notably, the discrete wavelet transform is used for signal coding, where the

    properties of the transform are exploited to represent a discrete signal in a more redundant form,

    often as a preconditioning for data compression. The discrete wavelet transform has a huge

    number of applications in Science, Engineering, Mathematics and Computer Science.

    Wavelet compression is a form of data compression well suited for image compression

    (sometimes also video compression and audio compression). The goal is to store image data in as

    little space as possible in a file. A certain loss of quality is accepted (lossy compression).

    Using a wavelet transform, the wavelet compression methods are better at representing

    transients, such as percussion sounds in audio, or high-frequency components in two-

    dimensional images, for example an image of stars on a night sky. This means that the transient

elements of a data signal can be represented by a smaller amount of information than would be the case if

    some other transform, such as the more widespread discrete cosine transform, had been used.

First a wavelet transform is applied. This produces as many coefficients as there are pixels in the image (i.e., there is no compression yet, since it is only a transform). These coefficients can then

    be compressed more easily because the information is statistically concentrated in just a few

    coefficients.

    3.10 Quantization:

Quantization is a key step in image compression. Quantization techniques generally compress by mapping a range of values to a single quantum value. By reducing the number of distinct symbols in a given stream, the stream becomes more compressible. One example is reducing the number of colors required to represent an image; other widely used examples are DCT data quantization in JPEG and DWT data quantization in JPEG 2000.
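
A minimal sketch of uniform scalar quantization (the step size and sample values are illustrative):

    % Uniform scalar quantization: map a range of values to a single quantum value.
    x    = 255 * rand(1, 10);   % sample values in [0, 255]
    step = 16;                  % quantization step size

    q     = round(x / step);    % quantization indices (fewer distinct symbols)
    x_rec = q * step;           % reconstructed values (error at most step/2)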


    3.11 Entropy Encoding:

    An entropy encoding is a coding scheme that assigns codes to symbols so as to match

    code lengths with the probabilities of the symbols. Typically, entropy encoders are used to

compress data by replacing symbols represented by equal-length codes with codes whose lengths are proportional to the negative logarithm of the symbol's probability. Therefore, the most common

    symbols use the shortest codes.

    According to Shannon's source coding theorem, the optimal code length for a symbol is

-log_b P, where b is the number of symbols used to make output codes and P is the probability of

    the input symbol. Three of the most common entropy encoding techniques are Huffman coding,

    range encoding, and arithmetic coding. If the approximate entropy characteristics of a data

    stream are known in advance (especially for signal compression), a simpler static code such as

    unary coding, Elias gamma coding, Fibonacci coding, Golomb coding, or Rice coding may be

    useful.

    There are three main techniques for achieving entropy coding:

    Huffman Coding - one of the simplest variable length coding schemes.

Run-length Coding (RLC) - very useful for binary data containing long runs of ones or

    zeros.

    Arithmetic Coding - a relatively new variable length coding scheme that can combine

    the best features of Huffman and run-length coding, and also adapt to data with non-stationary

statistics. This discussion concentrates on the Huffman and RLC methods for simplicity; a minimal Huffman sketch follows.
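As a rough illustration of Huffman coding, the sketch below (a simplified sketch, not a production encoder) builds only the code lengths, not the codewords, using a heap, so that the most frequent symbols end up with the shortest codes.

    import heapq
    import itertools
    from collections import Counter

    def huffman_code_lengths(symbols):
        """Return a {symbol: code length} map; frequent symbols get short codes."""
        freq = Counter(symbols)
        ids = itertools.count()                  # tie-breaker so dicts are never compared
        heap = [(weight, next(ids), {s: 0}) for s, weight in freq.items()]
        heapq.heapify(heap)
        while len(heap) > 1:
            w1, _, a = heapq.heappop(heap)       # merge the two lightest subtrees
            w2, _, b = heapq.heappop(heap)
            merged = {s: depth + 1 for s, depth in {**a, **b}.items()}
            heapq.heappush(heap, (w1 + w2, next(ids), merged))
        return heap[0][2]

    print(huffman_code_lengths("aaaabbbccd"))    # {'a': 1, 'b': 2, 'c': 3, 'd': 3}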

    3.12 Conclusion:

In this chapter, several topics of image compression have been discussed, including the principles of image compression, the classification of compression methods, the framework of a general image coder, wavelets for image compression, different types of transforms, and quantization. An introduction to wavelet transforms is given in the next chapter.


    Chapter 4

Introduction to Wavelet Transform

4.1. Introduction:

    The fundamental idea behind wavelets is to analyze according to scale. Indeed, some

    researchers in the wavelet field feel that, by using wavelets, one is adopting a whole new

    mindset or perspective in processing data.

    Wavelets are functions that satisfy certain mathematical requirements and are used in

    representing data or other functions. This idea is not new. Approximation using superposition

    of functions has existed since the early 1800's, when Joseph Fourier discovered that he could

    superpose sines and cosines to represent other functions. However, in wavelet analysis, the

    scale that we use to look at data plays a special role. Wavelet algorithms process data at

    different scales or resolutions. If we look at a signal with a large "window," we would notice

    gross features. Similarly, if we look at a signal with a small "window," we would notice small

features. The result in wavelet analysis is to see both the forest and the trees, so to speak.

    This makes wavelets interesting and useful. For many decades, scientists have wanted

    more appropriate functions than the sines and cosines which comprise the bases of Fourier

    analysis, to approximate choppy signals. By their definition, these functions are non-local

    (and stretch out to infinity). They therefore do a very poor job in approximating sharp spikes.

    But with wavelet analysis, we can use approximating functions that are contained neatly in

    finite domains. Wavelets are well-suited for approximating data with sharp discontinuities.

    The wavelet analysis procedure is to adopt a wavelet prototype function, called an

    analyzing wavelet or mother wavelet. Temporal analysis is performed with a contracted,

    high-frequency version of the prototype wavelet, while frequency analysis is performed with

    a dilated, low-frequency version of the same wavelet. Because the original signal or function

    can be represented in terms of a wavelet expansion (using coefficients in a linear combination

    of the wavelet functions), data operations can be performed using just the corresponding

    wavelet coefficients. And if you further choose the best wavelets adapted to your data, or

    truncate the coefficients below a threshold, your data is sparsely represented. This sparse

    coding makes wavelets an excellent tool in the field of data compression.


    Other applied fields that are making use of wavelets include astronomy, acoustics,

    nuclear engineering, sub-band coding, signal and image processing, neurophysiology, music,

    magnetic resonance imaging, speech discrimination, optics, fractals, turbulence, earthquake-

    prediction, radar, human vision, and pure mathematics applications such as solving partial

    differential equations.

    4.2. Basis Functions:

    It is simpler to explain a basis function if we move out of the realm of analog

(functions) and into the realm of digital (vectors). Every two-dimensional vector (x,y) is a

    combination of the vector (1,0) and (0,1). These two vectors are the basis vectors for (x,y).

    Why? Notice that x multiplied by (1,0) is the vector (x,0), and y multiplied by (0,1) is the

vector (0,y). The sum is (x,y).

    The best basis vectors have the valuable extra property that the vectors are

perpendicular, or orthogonal to each other. For the basis (1,0) and (0,1), this criterion is

    satisfied. Now let's go back to the analog world, and see how to relate these concepts to basis

    functions. Instead of the vector (x,y), we have a function f(x). Imagine that f(x) is a musical

    tone, say the note A in a particular octave. We can construct A by adding sines and cosines

    using combinations of amplitudes and frequencies. The sines and cosines are the basis

    functions in this example, and the elements of Fourier synthesis. For the sines and cosines

    chosen, we can set the additional requirement that they be orthogonal. How? By choosing the

appropriate combination of sine and cosine function terms whose inner products add up to

    zero. The particular set of functions that are orthogonal and that construct f(x) are our

    orthogonal basis functions for this problem.

    Scale-Varying Basis Functions:

    A basis function varies in scale by chopping up the same function or data space using

    different scale sizes. For example, imagine we have a signal over the domain from 0 to 1. We

    can divide the signal with two step functions that range from 0 to 1/2 and 1/2 to 1. Then we

    can divide the original signal again using four step functions from 0 to 1/4, 1/4 to 1/2, 1/2 to

3/4, and 3/4 to 1. And so on. Each set of representations encodes the original signal with a

    particular resolution or scale.
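A small sketch of this idea, assuming NumPy: the same sampled signal is approximated with 2 and then 4 step functions, and the finer scale gives the smaller maximum error.

    import numpy as np

    def step_approximation(signal, pieces):
        """Approximate a sampled signal by piecewise-constant averages over
        `pieces` equal sub-intervals (a basis of step functions at one scale)."""
        chunks = np.array_split(signal, pieces)
        return np.concatenate([np.full(len(c), c.mean()) for c in chunks])

    t = np.linspace(0.0, 1.0, 16, endpoint=False)
    signal = np.sin(2 * np.pi * t)
    coarse = step_approximation(signal, 2)       # two step functions: scale 1/2
    fine = step_approximation(signal, 4)         # four step functions: scale 1/4
    print(np.abs(signal - coarse).max() > np.abs(signal - fine).max())   # True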


    4.3. Fourier analysis:

    Fourier Transform:

    The Fourier transform's utility lies in its ability to analyze a signal in the time domain

    for its frequency content. The transform works by first translating a function in the time

    domain into a function in the frequency domain. The signal can then be analyzed for its

    frequency content because the Fourier coefficients of the transformed function represent the

    contribution of each sine and cosine function at each frequency. An inverse Fourier transform

    does just what you'd expect, transform data from the frequency domain into the time domain.

    Discrete Fourier Transform:

    The discrete Fourier transform (DFT) estimates the Fourier transform of a function

    from a finite number of its sampled points. The sampled points are supposed to be typical of

    what the signal looks like at all other times.

    The DFT has symmetry properties almost exactly the same as the continuous Fourier

    transform. In addition, the formula for the inverse discrete Fourier transform is easily

    calculated using the one for the discrete Fourier transform because the two formulas are

    almost identical.

    Windowed Fourier Transform:

If f(t) is a non-periodic signal, the summation of the periodic functions, sine and

    cosine, does not accurately represent the signal. You could artificially extend the signal to

    make it periodic but it would require additional continuity at the endpoints. The windowed

    Fourier transform (WFT) is one solution to the problem of better representing the non

periodic signal. The WFT can be used to give information about signals simultaneously in the time domain and in the frequency domain.

    With the WFT, the input signal f(t) is chopped up into sections, and each section is

analyzed for its frequency content separately. If the signal has sharp transitions, the input data is windowed so that the sections converge to zero at the endpoints. This windowing is

    accomplished via a weight function that places less emphasis near the interval's endpoints

    than in the middle. The effect of the window is to localize the signal in time.


    Fast Fourier Transform:

To approximate a function by samples, and to approximate the Fourier integral by the discrete Fourier transform, requires applying a matrix whose order is the number of sample points n. Since multiplying an n x n matrix by a vector costs on the order of n^2 arithmetic operations, the problem quickly gets worse as the number of sample points increases. However, if the samples are uniformly spaced, then the Fourier matrix can be factored into a product of just a few sparse matrices, and the resulting factors can be applied to a vector in a total of order n log n arithmetic operations. This is the so-called fast Fourier transform or FFT.

    4.4. Similarities between Fourier and Wavelet Transform:

    The fast Fourier transform (FFT) and the discrete wavelet transform (DWT) are both

linear operations that generate a data structure that contains log2 n segments of various lengths, usually filling and transforming it into a different data vector of length 2^n.

    The mathematical properties of the matrices involved in the transforms are similar as

    well. The inverse transform matrix for both the FFT and the DWT is the transpose of the

    original. As a result, both transforms can be viewed as a rotation in function space to a

    different domain. For the FFT, this new domain contains basis functions that are sines and

    cosines. For the wavelet transform, this new domain contains more complicated basis

    functions called wavelets, mother wavelets, or analyzing wavelets.

    Both transforms have another similarity. The basis functions are localized in

    frequency, making mathematical tools such as power spectra (how much power is contained

    in a frequency interval) and scale grams (to be defined later) useful at picking out frequencies

    and calculating power distributions.

    4.5. Dissimilarities between Fourier and Wavelet Transform:

    The most interesting dissimilarity between these two kinds of transforms is that

    individual wavelet functions are localized in space. Fourier sine and cosine functions are not.

    This localization feature, along with wavelets' localization of frequency, makes many

    functions and operators using wavelets "sparse" when transformed into the wavelet domain.

    This sparseness, in turn, results in a number of useful applications such as data compression,

    detecting features in images, and removing noise from time series.


    4.6. Wavelets:

    Compactly supported wavelets are functions defined over a finite interval and having

an average value of zero. The basic idea of the wavelet transform is to represent any arbitrary function f(x) as a superposition of a set of such wavelets or basis functions. These basis functions are obtained from a single prototype wavelet called the mother wavelet \psi(x), by

    dilations or scaling and translations. Wavelet bases are very good at efficiently representing

    functions that are smooth except for a small set of discontinuities.

For each n, k \in Z, define \psi_{n,k}(x) by

\psi_{n,k}(x) = 2^{n/2} \, \psi(2^n x - k).   (4.1)

The aim is to construct a function \psi(x) on R such that \{\psi_{n,k}(x)\}_{n,k \in Z} is an orthonormal basis on R. As mentioned before, \psi(x) is a wavelet and the collection \{\psi_{n,k}(x)\}_{n,k \in Z} is a wavelet orthonormal basis on R; this framework for constructing wavelets involves the concept of a multiresolution analysis or MRA.

A multiresolution analysis is a device for computing the basis coefficients in L^2(R): f = \sum_{n,k} \langle f, \psi_{n,k} \rangle \, \psi_{n,k}. It is defined as follows:

V_n = \{ f(x) \mid f(x) = g(2^n x), \ g(x) \in V_0 \},   (4.2)

where, for f(x) \in V_0,

f(x) = \sum_{n} \langle f, \varphi(\cdot - n) \rangle \, \varphi(x - n).   (4.3)

Then a multiresolution analysis on R is a sequence of subspaces \{V_n\}_{n \in Z} of functions on R satisfying the following properties:

(a) For all n \in Z, V_n \subseteq V_{n+1}.

(b) If f(x) is continuous with compact support on R, then f(x) lies in the closure of span\{V_n\}_{n \in Z}. That is, given \epsilon > 0, there is an n \in Z and a function g(x) \in V_n such that \| f - g \| < \epsilon.

The scaling function \varphi(x) satisfies the two-scale relation

\varphi(x) = \sum_{k} h(k) \, \sqrt{2} \, \varphi(2x - k),   (4.4)

and the wavelet is defined by

\psi(x) = \sum_{k} g(k) \, \sqrt{2} \, \varphi(2x - k).   (4.5)

Then \{\psi_{n,k}(x)\} is a wavelet orthonormal basis on R. The orthogonal projection of an arbitrary function f onto V_n is given by

P_n f = \sum_{k} \langle f, \varphi_{n,k} \rangle \, \varphi_{n,k}.   (4.6)

As k varies, the basis functions \varphi_{n,k} are shifted in steps of 2^{-n}, so P_n f cannot represent any detail on a scale smaller than that. We say that the functions in V_n have the resolution or scale 2^{-n}. Here, P_n f is called an approximation to f at resolution 2^{-n}. For a given function f, an MRA provides a sequence of approximations P_n f of increasing accuracy. The difference between the approximations at resolution 2^{-(n+1)} and 2^{-n} is called the fine detail at resolution 2^{-n}, which is as follows:

Q_n f(x) = P_{n+1} f(x) - P_n f(x),   (4.7)

or

Q_n f = \sum_{k} \langle f, \psi_{n,k} \rangle \, \psi_{n,k}.   (4.8)

Q_n is also an orthogonal projection and its range W_n is orthogonal to V_n, where the following holds:

V_n = \{ f \mid P_n f = f \},   (4.9)

W_n = \{ f \mid Q_n f = f \},   (4.10)

V_{n+1} = V_n \oplus W_n.   (4.11)

There are choices of the numbers h(k) and g(k) such that \{\psi_{n,k}(x)\} is a wavelet orthonormal basis on R. We must show orthonormality and completeness. As for completeness, we have

\bigcap_{n} V_n = \{0\}   (4.12)

and

\overline{\bigcup_{n} V_n} = L^2(R).   (4.13)

Then we have span\{\psi_{n,k} \mid k \in Z\} = W_n, with V_{n+1} = V_n \oplus W_n. Hence \{\psi_{n,k}(x)\} is complete if and only if \bigoplus_{n} W_n = L^2(R) holds, and this is true.

Now, as for the orthonormality within a single scale,

\langle \psi_{n,k}, \psi_{n,l} \rangle = \langle \psi(x - k), \psi(x - l) \rangle = \delta(k - l).   (4.14)

To prove orthonormality between scales, let n, n' \in Z with n < n', and let k, k' \in Z be arbitrary. Since \psi(x) \in V_1, we have \psi_{n,k}(x) \in V_{n+1} \subseteq V_{n'}. Since \langle \psi_{n',k'}, \varphi_{n',l} \rangle = 0 for all k', l \in Z, it follows that \langle \psi_{n',k'}, f \rangle = 0 for all f \in V_{n'}. Indeed, given f(x) \in V_{n'} we know that f(x) = \sum_{l} \langle f, \varphi_{n',l} \rangle \, \varphi_{n',l}(x); hence for f(x) \in V_{n'},

\langle \psi_{n',k'}, f \rangle = \sum_{l} \langle f, \varphi_{n',l} \rangle \, \langle \psi_{n',k'}, \varphi_{n',l} \rangle = 0.   (4.15)

Since n < n', \psi_{n,k} \in V_{n'} also. Hence \langle \psi_{n',k'}, \psi_{n,k} \rangle = 0.

Therefore \{\psi_{n,k}(x)\} is a wavelet orthonormal basis on R.

i) Symmetry:

Symmetric filters are preferred because they are most valuable for minimizing the edge

    effects in the wavelet representation of discrete wavelet transform (DWT) of a function; large

    coefficients resulting from false edges due to periodization can be avoided. Since orthogonal

filters, with the exception of the Haar filter, cannot be symmetric, biorthogonal filters are almost always selected for image compression applications.

ii) Vanishing Moments:

Vanishing moments are defined as follows. From the definition of multiresolution analysis (MRA), any wavelet \psi(x) that comes from an MRA must satisfy

\int \psi(x) \, dx = 0.   (4.16)

This integral is referred to as the zeroth moment of \psi(x), so that if the above equation holds, we say that \psi(x) has its zeroth moment vanishing. The integral \int x^p \psi(x) \, dx is referred to as the p-th moment of \psi(x), and if \int x^p \psi(x) \, dx = 0, we say that \psi(x) has its p-th moment vanishing.

As a matter of fact, it is possible to have a different number of vanishing moments on the analysis filters than on the reconstruction filters. Vanishing moments on the analysis filters are desired because they result in small coefficients in the transform, whereas vanishing moments on the reconstruction filters are desired because they result in fewer blocking artifacts in the compressed image. Thus, having a sufficient number of vanishing moments, which may differ between the two filters, is advantageous.

iii) Size of the filters:

Long analysis filters result in greater computation time for the wavelet or wavelet packet transform. Long reconstruction filters can create unpleasant artifacts in the compressed image for the following reason: the reconstructed image is made up of the superposition of only a few scaled and shifted reconstruction filters, so features of the reconstruction filters, such as oscillations or lack of smoothness, can be clearly noticed in the


    reconstructed image. Smoothness can be guaranteed by requiring a large number of vanishing

    moments in the reconstruction filter.
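The three criteria above (symmetry, vanishing moments, filter length) can be inspected directly for common filter families; the sketch below assumes the PyWavelets package (pywt) is available and simply prints the properties it reports.

    import pywt

    # Compare common filter families on the three design criteria discussed above:
    # orthogonality/symmetry, vanishing moments, and filter length.
    for name in ["haar", "db4", "bior4.4"]:
        w = pywt.Wavelet(name)
        print(name,
              "orthogonal:", w.orthogonal,
              "symmetry:", w.symmetry,
              "vanishing moments (psi):", w.vanishing_moments_psi,
              "filter length:", w.dec_len)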

4.7 List of Wavelet-Related Transforms:

4.7.1. Continuous Wavelet Transform:

    A continuous wavelet transform is used to divide a continuous-time function into

wavelets. Unlike the Fourier transform, the continuous wavelet transform possesses the ability to construct a time-frequency representation of a signal that offers very good time and frequency

    localization.

    4.7.2. Multi resolution analysis:

A multiresolution analysis (MRA) or multiscale approximation (MSA) is the design method behind most of the practically relevant discrete wavelet transforms (DWT) and the justification for the algorithm of the fast wavelet transform (FWT).

    4.7.3. Discrete Wavelet Transform:

    In numerical analysis and functional analysis, a discrete wavelet transform (DWT) is

    any wavelet transform for which the wavelets are discretely sampled. As with other wavelet

transforms, a key advantage it has over Fourier transforms is temporal resolution: it captures both frequency and location information.

    4.7.4. Fast Wavelet Transform:

    The Fast Wavelet Transform is a mathematical algorithm designed to turn a waveform

    or signal in the time domain into a sequence of coefficients based on an orthogonal basis of

    small finite waves, or wavelets. The transform can be easily extended to multidimensional

    signals, such as images, where the time domain is replaced with the space domain.

    4.8. Applications of Wavelet Transforms:

    Wavelets have broad applications in fields such as signal processing and medical

imaging. Due to time and space constraints, only two are discussed here. The two applications most applicable to this work are wavelet image compression and the progressive

    transmission of image files over the internet.


    4.8.1. Wavelet Compression:

    The point of doing the Haar wavelet transform is that areas of the original matrix that

    contain little variation will end up as small or zero elements in the Haar transform matrix. A

    matrix is considered sparse if it has a high proportion of zero entries. Matrices that are sparse

    take much less memory to store. Because we cannot expect the transformed matrices always

to be sparse, we must consider wavelet compression. To perform wavelet compression we first decide on a non-negative threshold value known as \epsilon. We next let any value in the Haar wavelet transformed matrix whose magnitude is less than \epsilon be reset to zero. Our hope is that this will leave us with a relatively sparse matrix. If \epsilon is equal to zero we will not modify any of the elements and therefore we will not lose any information. This is known as lossless compression. Lossy compression occurs when \epsilon is greater than zero. Because some of the

    elements are reset to zero, some of our original data is lost. In the case of lossless

    compression we are able to reverse our operations and get our original image back. With

    lossy compression we are only able to build an approximation of our original image.
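A minimal sketch of this thresholding scheme, assuming NumPy; the one-level 2-D Haar transform and its inverse are written out by hand (haar2d/ihaar2d are illustrative helpers, not a standard API), and eps plays the role of the non-negative threshold described above.

    import numpy as np

    def haar2d(a):
        """One level of the 2-D Haar wavelet transform (rows, then columns)."""
        def step(x):  # pairwise average / difference along the last axis
            s = (x[..., 0::2] + x[..., 1::2]) / np.sqrt(2)
            d = (x[..., 0::2] - x[..., 1::2]) / np.sqrt(2)
            return np.concatenate([s, d], axis=-1)
        return step(step(a).swapaxes(0, 1)).swapaxes(0, 1)

    def ihaar2d(c):
        """Inverse of haar2d (undo columns, then rows)."""
        def istep(x):
            n = x.shape[-1] // 2
            s, d = x[..., :n], x[..., n:]
            out = np.empty_like(x)
            out[..., 0::2] = (s + d) / np.sqrt(2)
            out[..., 1::2] = (s - d) / np.sqrt(2)
            return out
        return istep(istep(c.swapaxes(0, 1)).swapaxes(0, 1))

    img = np.add.outer(np.arange(8.0), np.arange(8.0))        # a smooth 8x8 test block
    eps = 1.5                                                 # the non-negative threshold
    coeffs = haar2d(img)
    compressed = np.where(np.abs(coeffs) < eps, 0.0, coeffs)  # lossy, since eps > 0
    print(np.count_nonzero(compressed), "of", coeffs.size, "coefficients kept")
    print("max reconstruction error:", np.abs(ihaar2d(compressed) - img).max())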

    4.8.2. Progressive Transmission:

Many people frequently download images from the internet. Wavelet transforms speed up this process considerably. When a person clicks on an image to download it, the source computer recalls the wavelet-transformed matrix from memory. It first sends the overall approximation coefficient and the larger detail coefficients, and then the progressively smaller detail coefficients. As your computer receives this information it begins to reconstruct the image in progressively greater detail until the original image is fully reconstructed. This process can be interrupted at any time if the user decides that s/he does not want the image. Otherwise a user would only be able to see an image after the entire image file had been downloaded. Because a compressed image file is significantly smaller, it takes far less time to download.
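A rough sketch of the idea, assuming NumPy: reconstructing from only the largest-magnitude coefficients mimics the preview a user sees partway through a progressive download (progressive_preview is an illustrative helper; the exact coefficient ordering used by a real codec differs).

    import numpy as np

    def progressive_preview(coeffs, fraction, inverse_transform):
        """Reconstruct from only the largest-magnitude `fraction` of coefficients,
        mimicking a partially completed progressive download."""
        flat = coeffs.ravel()
        keep = max(1, int(fraction * flat.size))
        order = np.argsort(np.abs(flat))[::-1]     # largest magnitudes first
        partial = np.zeros_like(flat)
        partial[order[:keep]] = flat[order[:keep]]
        return inverse_transform(partial.reshape(coeffs.shape))

    # Usage (with the illustrative haar2d / ihaar2d helpers from section 4.8.1):
    # preview = progressive_preview(haar2d(img), 0.10, ihaar2d)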

    4.9. Conclusion:

    The basis function, Fourier analysis, similarities, dissimilarities between Fourier and

    wavelet transform, introduction to wavelets, types of wavelet transforms and applications of

wavelet transform are discussed in this chapter. Image compression using the modified fast Haar wavelet transform will be discussed in the following chapters.


    Chapter 5

1. Image Compression Using the SPIHT Algorithm

A. Description of the Algorithm

After wavelet decomposition of the image data, the distribution of the coefficients takes the form of a tree. Based on this feature, a data structure is defined: the spatial orientation tree. The spatial orientation tree structure for a 4-level wavelet decomposition is shown in Figure 5.1. We can see that each coefficient has four children except the red-marked coefficients in the LL subband and the coefficients in the highest subbands (HL1, LH1, HH1).

The following sets of coordinates of coefficients are used to represent the set partitioning method in the SPIHT algorithm. The location of a coefficient is denoted by (i, j), where i and j indicate the row and column indices, respectively.

H: roots of all the spatial orientation trees.

O(i, j): set of offspring of the coefficient (i, j), O(i, j) = {(2i, 2j), (2i, 2j + 1), (2i + 1, 2j), (2i + 1, 2j + 1)}, except when (i, j) is in LL. When (i, j) is in the LL subband, O(i, j) is defined as O(i, j) = {(i, j + w), (i + h, j), (i + h, j + w)}, where w and h are the width and height of the LL subband, respectively.

D(i, j): set of all descendants of the coefficient (i, j).

L(i, j): D(i, j) - O(i, j).

Figure 5.1: Parent-child relationship in SPIHT

A significance function S_n(T), which decides the significance of a set of coordinates T with respect to the threshold 2^n, is defined by:

S_n(T) = 1 if max_{(i,j) \in T} |c_{i,j}| >= 2^n, and S_n(T) = 0 otherwise,

where c_{i,j} is the wavelet coefficient. In this algorithm, three ordered lists are used to

    store the significance information during set partitioning. List of insignificant sets (LIS), list

    of insignificant pixels (LIP), and list of significant pixels (LSP) are those three lists. Note that

the term pixel actually indicates a wavelet coefficient if the set partitioning algorithm is

    applied to a wavelet transformed image.

Algorithm: SPIHT

1) Initialization:

1. Output n = floor(log2(max_{(i,j)} |c_{i,j}|)).
2. Set LSP = empty.
3. Set LIP = {(i, j) in H}.
4. Set LIS = {(i, j) in H with D(i, j) non-empty}, and set each entry in LIS as type A.

2) Sorting Pass:

1. For each (i, j) in LIP do:
(a) Output S_n(i, j).
(b) If S_n(i, j) = 1 then move (i, j) to LSP and output sign(c_{i,j}).
2. For each (i, j) in LIS do:
(a) If (i, j) is of type A then
i. Output S_n(D(i, j)).
ii. If S_n(D(i, j)) = 1 then
A. For each (k, l) in O(i, j): output S_n(k, l). If S_n(k, l) = 1 then append (k, l) to LSP, output sign(c_{k,l}), and set c_{k,l} = c_{k,l} - 2^n sign(c_{k,l}); else append (k, l) to LIP.
B. Move (i, j) to the end of LIS as type B.
(b) If (i, j) is of type B then
i. Output S_n(L(i, j)).
ii. If S_n(L(i, j)) = 1 then
A. Append each (k, l) in O(i, j) to the end of LIS as type A.
B. Remove (i, j) from LIS.

3) Refinement Pass:

1. For each (i, j) in LSP, except those included in the last sorting pass, output the n-th most significant bit of |c_{i,j}|.

4) Quantization Pass:

1. Decrement n by 1.
2. Go to step 2).
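A minimal sketch of two building blocks used above, assuming NumPy: the significance test against the threshold 2^n and the offspring set O(i, j) for coefficients outside the LL subband (function names are illustrative).

    import numpy as np

    def significance(coeffs, coords, n):
        """S_n(T): 1 if any coefficient indexed by T has magnitude >= 2**n, else 0."""
        return int(any(abs(coeffs[i, j]) >= 2 ** n for (i, j) in coords))

    def offspring(i, j, rows, cols):
        """O(i, j) for a coefficient outside the LL subband; coefficients in the
        finest subbands have no children (their would-be children fall outside)."""
        kids = [(2 * i, 2 * j), (2 * i, 2 * j + 1), (2 * i + 1, 2 * j), (2 * i + 1, 2 * j + 1)]
        return [(r, c) for (r, c) in kids if r < rows and c < cols]

    coeffs = np.array([[34.0, -20.0], [9.0, 5.0]])
    n = int(np.floor(np.log2(np.max(np.abs(coeffs)))))   # initial bit plane, here 5
    print(n, significance(coeffs, [(0, 0), (0, 1)], n))  # prints: 5 1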

B. Analysis of the SPIHT Algorithm

Here is a concrete example to analyze the output binary stream of SPIHT encoding. The following uses the 3-level wavelet decomposition coefficients of SPIHT encoding.

Here n = floor(log2(max |c(i,j)|)) = 5, so the initial threshold value is 2^5 = 32. For this threshold, the output binary stream is 11100011100010000001010110000, 29 bits in all. From the SPIHT encoding results, we can see that the output bit stream contains a large number of consecutive "0" bits, and as the quantization gradually deepens, the situation becomes even more severe, so there is a great deal of redundancy if the stream is output directly.
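The redundancy in those long runs of zeros can be exposed with a simple run-length coding of the example bit stream above; this is only an illustration of the observation, not part of the SPIHT coder itself.

    from itertools import groupby

    bitstream = "11100011100010000001010110000"   # the 29-bit example above

    # Run-length coding: each run of identical bits becomes (bit, run length).
    runs = [(bit, sum(1 for _ in group)) for bit, group in groupby(bitstream)]
    print(runs)
    # [('1', 3), ('0', 3), ('1', 3), ('0', 3), ('1', 1), ('0', 6),
    #  ('1', 1), ('0', 1), ('1', 1), ('0', 1), ('1', 2), ('0', 4)]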


2. Image Compression Using the WDR Algorithm

WDR ALGORITHM

    One of the defects of SPIHT is that it only implicitly locates the position of significant

    coefficients. This makes it difficult to perform operations, such as region selection on

compressed data, which depend on the exact position of significant transform values. Region selection, also known as region of interest (ROI), means selecting a portion of a compressed image that requires increased resolution. Such compressed data operations are

    possible with the Wavelet Difference Reduction (WDR) algorithm of Tian and Wells.

    The term difference reduction refers to the way in which WDR encodes the locations

    of significant wavelet transform values. In WDR, the output from the significance pass

    consists of the signs of significant values along with sequences of bits which concisely

    describe the precise locations of significant values.

    The WDR algorithm is a very simple procedure. A wavelet transform is first applied to the

    image, and then the bit-plane based WDR encoding algorithm for the wavelet coefficients is

    carried out. WDR mainly consists of five steps as follows:

1. Initialization: During this step an assignment of a scan order should first be made. For an image with P pixels, a scan order is a one-to-one and onto mapping x_{i,j} = X_k, for k = 1, 2, ..., P, between the wavelet coefficients x_{i,j} and a linear ordering X_k. The scan order is a zigzag through subbands from higher to lower levels. Within subbands, row-based scanning is used in the horizontal subbands, column-based scanning is used in the vertical subbands, and zigzag scanning is used for the diagonal and low-pass subbands. Once the scanning order is made, an initial threshold T_0 is chosen so that all the transform values satisfy |X_m| < T_0 and at least one transform value satisfies |X_m| >= T_0 / 2.

2. Update threshold: Let T_k = T_{k-1} / 2.

  • 7/28/2019 Sree Project Document

    40/48

    40

3. Significance pass: In this part, transform values are deemed significant if they are greater than or equal

    to the threshold value. Then their index values are encoded using the difference reduction

    method of Tian and Wells. The difference reduction method essentially consists of a binary

    encoding of the number of steps to go from the index of the last significant value to the index

    of the current significant value. The output from the significance pass includes the signs of

    significant values along with sequences of bits, generated by difference reduction, which

describe the precise locations of significant values (a minimal sketch of this gap coding appears after these steps).

4. Refinement pass: The refinement pass generates the refined bits via the standard bit-plane quantization procedure, like the refinement process in the SPIHT method. Each refined value is a better approximation of the exact transform value.

5. Repeat steps (2) through (4) until the bit budget is reached.
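A minimal sketch of the gap-coding idea, assuming the convention that each gap is sent in binary with its leading 1 dropped (the exact bitstream format of Tian and Wells' coder differs in detail; the index list below is hypothetical):

    def difference_reduction(significant_indices):
        """Encode gaps between successive significant scan positions: each gap is
        written in binary with its leading 1 dropped (in WDR the coefficient's
        sign bit acts as the separator between consecutive gap codes)."""
        out, last = [], 0
        for idx in significant_indices:
            gap = idx - last
            out.append(bin(gap)[3:])   # bin(gap) looks like '0b1...'; drop '0b' and the leading 1
            last = idx
        return out

    # Hypothetical scan positions of significant values found in one pass:
    print(difference_reduction([2, 3, 7, 12]))   # ['0', '', '00', '01']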

3. Image Compression Using the EZW Algorithm

THEORY OF EZW ALGORITHM

Wavelet transforming image data generates a lot of unimportant data. After quantizing and coding according to certain rules, some of this unimportant data is discarded, and the remaining data can represent the original data approximately. This is the principle of image compression algorithms based on the wavelet transform.

The zero-tree coding method is one of the most popular image compression approaches using wavelet transforms, and Embedded Zero-tree Wavelet (EZW) coding is the representative zero-tree coding method. EZW was invented by Shapiro in 1993 [3]. It is an embedded wavelet image coding algorithm with a high compression rate. It is a progressive coding method and can perform well at image compression from lossy to lossless.


    The main features of EZW include compact multiresolution representation of images

    by discrete wavelet transformation, zero-tree coding of the significant wavelet coefficients

    providing compact binary maps, successive approximation quantization of the wavelet

coefficients, adaptive multilevel arithmetic coding, and the capability of meeting an exact target compression rate.
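A skeleton of the successive approximation idea, assuming NumPy: the threshold starts at the largest power of two not exceeding the largest coefficient magnitude and is halved at each pass (the coefficient values are illustrative only; the full zero-tree classification is omitted).

    import numpy as np

    def successive_approximation_passes(coeffs, num_passes):
        """Skeleton of successive approximation quantization: the threshold is
        halved at each pass and coefficients are split into significant
        (|c| >= T) and insignificant ones. (Zero-tree classification omitted.)"""
        T = 2.0 ** int(np.floor(np.log2(np.max(np.abs(coeffs)))))
        for _ in range(num_passes):
            yield T, int(np.count_nonzero(np.abs(coeffs) >= T))
            T /= 2.0

    coeffs = np.array([63.0, -34.0, 49.0, 10.0, 7.0, 13.0, -12.0, 7.0])
    for T, count in successive_approximation_passes(coeffs, 3):
        print("threshold", T, "->", count, "significant coefficients")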

The basic process flow of the EZW algorithm can be described as follows: apply the wavelet transform to the image and quantize the coefficients. Given a series of threshold values sorted from high to low (each threshold equals 1/2 of the former threshold), for every threshold sort all the coefficients, retain the important coefficients,

    and discard unimpo