
CHAPTER-3

DESIGN AND IMPLEMENTATION OF LIFTING BASED 3D DISCRETE

WAVELET TRANSFORM FOR IMAGE COMPRESSION

3.1 Introduction

To deal with medical image processing, it is a pre-requisite to understand the

concepts of basic transformation techniques applied to signal processing and also various

compression techniques that are applied to images of different types.

3.1.1 Fourier Transform (FT)

The Fourier Transform [36] has played an important role in signal processing for

many years. The 2D Fourier Transform is a powerful tool and is used to enhance, restore,

encode and describe the images. Spatial and Frequency domain approaches are two

different types in image processing. Most of the spatial domain approaches involve more

computations, whereas the frequency domain approaches like Fourier Transform are more

flexible and involve less computation. Fourier Transform is linear and it possesses the

property of homogeneity and additivity as shown in Fig. 3.1 and Fig. 3.2 respectively.

However it is necessary to analyze the various image transforms.
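The homogeneity and additivity properties mentioned above can be checked numerically. The following is a minimal sketch, assuming a naive discrete transform stands in for the continuous Fourier Transform; the helper names `dft` and `close` and the sample sequences are illustrative and do not come from the source.

```python
import cmath


def dft(x):
    """Naive O(N^2) Discrete Fourier Transform of a sequence."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def close(a, b, tol=1e-9):
    """True when two complex sequences agree element by element."""
    return all(abs(p - q) < tol for p, q in zip(a, b))


# Illustrative test sequences and scale factor (assumed values).
x = [1.0, 2.0, 0.0, -1.0]
y = [0.5, -1.5, 2.0, 3.0]
c = 3.0

X, Y = dft(x), dft(y)

# Homogeneity: transforming c*x gives c times the transform of x.
homogeneity = close(dft([c * v for v in x]), [c * v for v in X])

# Additivity: transforming x + y gives the sum of the two transforms.
additivity = close(
    dft([p + q for p, q in zip(x, y)]),
    [p + q for p, q in zip(X, Y)],
)

print(homogeneity, additivity)
```

Both checks hold for any choice of sequences, since each output bin is a fixed linear combination of the input samples.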

The Fast Fourier Transform (FFT) [36], an efficient algorithm for computing the Discrete Fourier Transform (DFT), is used in a number of image processing applications to reduce computational cost. The FFT is easy to implement using the successive doubling technique and hence it finds an important place in image processing applications.

The Fourier transform of x(t), denoted by X(f) is defined in Eq. 3.1.

$$X(f) = \int_{-\infty}^{+\infty} x(t)\, e^{-j 2\pi f t}\, dt \qquad \ldots (3.1)$$

where x(t) is a continuous function of the real variable t, $j = \sqrt{-1}$ and the variable f is the frequency. The original signal x(t) can be recovered by applying the Inverse Fourier Transform (IFT). The Fourier Transform identifies all spectral components present in the signal; however, it does not provide any information regarding the temporal (time) localization of the components.

Fig. 3.1: Homogeneity Property

Fig. 3.2: Additivity Property


3.1.2 Fourier Transform Limitations

Signals are classified into two types, viz., Stationary and Non-Stationary. Non-Stationary signals are those whose spectral components vary with time. The FT only reveals which spectral components exist in the signal; it does not provide any information on when those components occur.

The reason is that the basis function $e^{-j\omega t}$ extends to infinity in time, and hence the signal can only be analyzed globally. In order to obtain time-localization of the spectral components, the signal needs to be analyzed locally. This can be achieved by the Short Time Fourier Transform.

3.1.3 Short Time Fourier Transform (STFT)

The Short Time Fourier Transform, or alternatively Short Term Fourier Transform [37], is a Fourier-related transform used to determine the sinusoidal frequency and phase content of local sections of a signal as it changes over time. In the continuous-time case, the function to be transformed is multiplied by a window function which is non-zero for only a short period of time. The Fourier Transform of the resulting signal is taken as the window slides along the time axis, resulting in a two-dimensional representation of the signal. Mathematically, this is written in Eq. 3.2.

$$\mathrm{STFT}_x^{w}(\tau, \omega) = \int_t x(t)\, w(t - \tau)\, e^{-j\omega t}\, dt \qquad \ldots (3.2)$$

where w(t) is the window function, commonly a Hann Window or Gaussian bell centered around zero, and x(t) is the signal to be transformed. The result, X(τ, ω), is essentially the Fourier Transform of x(t)w(t − τ), a complex function representing the phase and magnitude of the signal over time and frequency. Often, phase unwrapping is employed along either or both the time axis 'τ' and the frequency axis 'ω' to suppress any jump discontinuity in the phase of the STFT. The time index 'τ' is normally considered to be slow time.

The main advantage of the STFT is that it gives a time-frequency description of the signal and, by using windowing functions, overcomes the limitation of the Fourier Transform described above.
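The windowing idea behind the STFT can be sketched for a sampled signal as follows. This is an illustrative discrete version, assuming a Hann window, non-overlapping frames and a naive per-frame DFT; the function names, the frame length and the two test frequencies are assumptions made for the example and do not come from the source.

```python
import cmath
import math


def hann(n):
    """Symmetric Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def stft_magnitude(x, win_len, hop):
    """Magnitude spectrum of each windowed frame (bins 0 .. win_len/2)."""
    w = hann(win_len)
    frames = []
    for start in range(0, len(x) - win_len + 1, hop):
        seg = [x[start + i] * w[i] for i in range(win_len)]
        frames.append([
            abs(sum(seg[t] * cmath.exp(-2j * math.pi * k * t / win_len)
                    for t in range(win_len)))
            for k in range(win_len // 2 + 1)
        ])
    return frames


# Non-stationary test signal: the frequency changes halfway through.
N, L = 256, 32
x = [
    math.sin(2 * math.pi * 4 * t / L) if t < N // 2
    else math.sin(2 * math.pi * 12 * t / L)
    for t in range(N)
]

frames = stft_magnitude(x, win_len=L, hop=L)

# Strongest frequency bin in each frame: the change over time is visible,
# which a single global Fourier Transform would not reveal.
peaks = [max(range(len(f)), key=f.__getitem__) for f in frames]
print(peaks)
```

The list of peak bins changes from the first half of the frames to the second, which is the time localization that the plain Fourier Transform lacks. The fixed window length also illustrates why the resolution is the same at all frequencies.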


3.1.4 Evolution of Wavelets and Wavelet Transform (WT)

Evolution of Wavelets

Wavelet Transform (WT)

The Wavelet Transform has gained widespread acceptance in signal processing and image compression. Because of their inherent multi-resolution nature, wavelet coding schemes are especially suitable for applications where scalability and tolerable degradation are important. Wavelet compression is a form of data compression well suited for image compression and sometimes for video and audio compression. The transformation of a signal is just another form of representing the signal. It does not change the information content present in the signal. The Wavelet Transform provides a time-frequency representation of the signal. It was developed to overcome the shortcoming of the Short Time Fourier Transform (STFT), which can also be used to analyze non-stationary signals. While the STFT gives a constant resolution at all frequencies, the Wavelet Transform uses a multi-resolution technique by which different frequencies are analyzed with different resolutions.

A wave is an oscillating function of time or space and is periodic. In contrast,

wavelets are localized waves. They have their energy concentrated in time or space and

are suited to analysis of transient signals. While Fourier Transform and STFT use waves to

analyze signals, the Wavelet Transform uses wavelets of finite energy.

Using a Wavelet Transform, wavelet compression methods are adequate for representing transients, such as percussion sounds in audio or high frequency components in two-dimensional images, for example an image of stars in a night sky. This means that the transient elements of a data signal can be represented by a smaller amount of information than would be the case if some other transform, such as the more widespread Discrete Cosine Transform, had been used.

First, a Wavelet Transform is applied to the image, which produces as many coefficients as there are pixels in the image (i.e. there is no compression yet, since it is only a transform). These coefficients can then be compressed more easily, because the information is statistically concentrated in just a few coefficients. This principle is called Transform Coding. After that, the coefficients are quantized and the quantized values are Entropy encoded and/or Run Length encoded. A few 1D and 2D applications of wavelet compression use a technique called Wavelet Footprints. The Wavelet Transform can provide the frequencies present in a signal together with the times associated with those frequencies, making it very convenient for application in numerous fields.


Fig. 3.3: Demonstration of (a) Wave and (b) Wavelet

A wavelet, shown in Fig. 3.3, is a waveform of effectively limited duration that has zero average value. Wavelet analysis is the decomposition of a function onto shifted and scaled versions of the basic wavelet. A wavelet is thus a wave-shaped function of limited length with zero mean value, as expressed in Eq. 3.3. Equivalently, its Fourier transform $\hat{\psi}$ vanishes at zero frequency, which is a consequence of the admissibility condition for the existence of the Inverse Wavelet Transform.

$$\hat{\psi}(0) = \int_{-\infty}^{\infty} \psi(x)\, dx = 0 \qquad \ldots (3.3)$$

Unlike a sine wave, wavelets are generally irregular and asymmetrical, as shown in Fig. 3.4.

Fig. 3.4: Sine function and a wavelet

It is intuitively clear that functions with sharp changes can be analyzed better using short irregular waves than with a smooth infinite sine. The wavelet basis $\{\psi_{j,k}(x)\}_{j,k}$ is generated by translation and dilation, $\psi(2^{-j}x - k)$, of the basic ("mother") wavelet ψ(x). If the basic wavelet ψ(x), where $\psi(x) \equiv \psi_{0,0}(x)$, starts at x = 0 and ends at x = N − 1, the shifted wavelet $\psi_{0,k}$ starts at x = k and ends at x = k + N − 1. The scaled wavelet $\psi_{j,0}$ starts at x = 0 and ends at x = $2^{j}(N - 1)$. Its graph is scaled (compressed or expanded, depending on the sign of 'j') by a factor of $2^{-j}$ (Eq. 3.4), while the graph of the wavelet $\psi_{0,k}$ is translated to the right by k, if k > 0 (Eq. 3.5).

$$\text{Scaling:} \quad \psi_{j,0}(x) = 2^{-j/2}\, \psi(2^{-j}x) \qquad \ldots (3.4)$$

$$\text{Translation:} \quad \psi_{0,k}(x) = \psi(x - k) \qquad \ldots (3.5)$$

The basis wavelet generated by scaling the basic wavelet 'j' times and shifting it by 'k' is given by Eq. 3.6.

$$\psi_{j,k}(x) = 2^{-j/2}\, \psi(2^{-j}x - k) \qquad \ldots (3.6)$$

The multiplier $2^{-j/2}$ is a normalizing factor, so that the $L^2$ norm of the wavelet is equal to one. The space of details on the j-th resolution level, $W_j$, contains functions that are linear combinations of the wavelets $\psi_{j,k}(x)$.
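The role of the normalizing factor in Eq. 3.6 can be checked numerically. The sketch below assumes the Haar function as the mother wavelet, purely because it is simple to write down (so N − 1 = 1 in the notation above); the function names and the integration grid are illustrative choices, not part of the source.

```python
def haar(x):
    """Haar mother wavelet, used here only as a convenient example."""
    if 0.0 <= x < 0.5:
        return 1.0
    if 0.5 <= x < 1.0:
        return -1.0
    return 0.0


def psi_jk(x, j, k, mother=haar):
    """Scaled and shifted wavelet of Eq. 3.6: 2^(-j/2) * psi(2^(-j) x - k)."""
    return 2.0 ** (-j / 2.0) * mother(2.0 ** (-j) * x - k)


def l2_norm_sq(j, k, lo=-16.0, hi=16.0, steps=16384):
    """Midpoint-rule estimate of the squared L2 norm of psi_jk on [lo, hi]."""
    dx = (hi - lo) / steps
    return sum(psi_jk(lo + (i + 0.5) * dx, j, k) ** 2 for i in range(steps)) * dx


# The norm stays (numerically) equal to one for every scale and shift tried.
for j, k in [(0, 0), (2, 1), (-1, 3)]:
    print(j, k, round(l2_norm_sq(j, k), 6))
```

For j = 2 and k = 1 the wavelet is four times wider and half as tall as the mother wavelet, so the squared norm is unchanged, which is exactly what the factor $2^{-j/2}$ is for.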

Wavelet analysis is carried out in a manner similar to STFT analysis. The signal to be analyzed is multiplied by a wavelet function, just as it is multiplied by a window function in the STFT, and the transform is then computed for each segment generated. However, unlike the STFT, in the Wavelet Transform the width of the wavelet function changes with each spectral component.

At high frequencies the Wavelet Transform [38] gives good time resolution and poor frequency resolution, while at low frequencies it gives good frequency resolution and poor time resolution. The Wavelet Transform is further classified into the Continuous Wavelet Transform and the Discrete Wavelet Transform.

3.1.5 Continuous Wavelet Transform (CWT)

A Continuous Wavelet Transform [37] is used to divide a continuous-time function into wavelets. Unlike the FT, the CWT possesses the ability to construct a time-frequency representation of a signal that offers very good time and frequency localization. CWTs are particularly helpful in tackling problems involving signal identification and detection of hidden transients (hard to detect, short-lived elements of a signal). The CWT of a signal x(t) with respect to a given function ψ(t), called the Analyzing Wavelet, is given by Eq. 3.7.

$$\mathrm{CWT}_x^{\psi}(\tau, s) = \Psi_x^{\psi}(\tau, s) = \frac{1}{\sqrt{|s|}} \int_t x(t)\, \psi^{*}\!\left(\frac{t - \tau}{s}\right) dt \qquad \ldots (3.7)$$

τ = Translation parameter

s = 1/f = Scaling parameter

ψ(t) = Mother wavelet

The kernel functions used in the Wavelet Transform are obtained from one prototype function, known as the Mother Wavelet, by scaling and/or translating it. The modified equation is shown in Eq. 3.8.

$$W(a, b) = \frac{1}{\sqrt{a}} \int_t x(t)\, \psi_{a,b}(t)\, dt, \qquad \psi_{a,b}(t) = \psi\!\left(\frac{t - b}{a}\right) \qquad \ldots (3.8)$$

a = Scale parameter

b = Translation parameter

To qualify as a wavelet, a function must satisfy the two conditions given in Eq. 3.8.1 and Eq. 3.8.2.

$$\int_{-\infty}^{\infty} \psi(t)\, dt = 0 \qquad \ldots (3.8.1)$$

$$\int_{-\infty}^{\infty} |\psi(t)|^{2}\, dt < \infty \qquad \ldots (3.8.2)$$

3.1.6 Discrete Wavelet Transform (DWT)

In numerical and functional analysis, a DWT is any Wavelet Transform for which the wavelets are discretely sampled. Its advantage over the FT and other WTs is temporal resolution, because it captures both frequency and location information (location in time).

1D Discrete Wavelet Transform

The Discrete Wavelet Transform (DWT) [39] transforms a discrete-time signal to a discrete wavelet representation. Initially, the wavelet parameters are discretized to reduce the continuous basis set of wavelets to a discrete and orthogonal/orthonormal set of basis wavelets, given by Eq. 3.9.

$$\psi_{m,n}(t) = 2^{m/2}\, \psi(2^{m}t - n); \quad m, n \in \mathbb{Z} \text{ such that } -\infty < m, n < \infty \qquad \ldots (3.9)$$

The 1D DWT is given by the inner product of the signal x(t) being transformed with each of the discrete basis functions, as written in Eq. 3.10.

$$W_{m,n} = \langle x(t), \psi_{m,n}(t) \rangle; \quad m, n \in \mathbb{Z} \qquad \ldots (3.10)$$

The 1D Inverse DWT is given by Eq. 3.11.

$$x(t) = \sum_{m} \sum_{n} W_{m,n}\, \psi_{m,n}(t); \quad m, n \in \mathbb{Z} \qquad \ldots (3.11)$$

2D Discrete Wavelet Transform

The 1D DWT can be extended to a 2D transform [40] using separable wavelet filters. With separable filters, the 2D transform can be computed by applying a 1D transform to all the rows of the input, as shown in Fig. 3.5(a), and then repeating it on all of the columns, as shown in Fig. 3.5(b). When a one-level 2D DWT is applied to an image, four sets of transform coefficients are created. As depicted in Fig. 3.5(c), the four sets are LL, HL, LH and HH, where the first letter corresponds to applying either a Low pass or High pass filter to the rows and the second letter refers to the filter applied to the columns.
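The row-then-column procedure can be sketched as follows. This is a minimal illustration that assumes the orthonormal Haar filter pair as the 1D transform and a small image with even dimensions; the function names and the test image are hypothetical and chosen only to keep the example short.

```python
import math


def haar_1d(v):
    """One level of the orthonormal Haar DWT: returns (low pass, high pass)."""
    s = 1.0 / math.sqrt(2.0)
    low = [(v[i] + v[i + 1]) * s for i in range(0, len(v), 2)]
    high = [(v[i] - v[i + 1]) * s for i in range(0, len(v), 2)]
    return low, high


def transpose(m):
    return [list(r) for r in zip(*m)]


def filter_columns(block):
    """Apply the 1D transform to every column of a block."""
    pairs = [haar_1d(col) for col in transpose(block)]
    return transpose([p[0] for p in pairs]), transpose([p[1] for p in pairs])


def dwt2_one_level(img):
    """One-level separable 2D DWT: rows first, then columns."""
    row_pairs = [haar_1d(row) for row in img]
    row_low = [p[0] for p in row_pairs]
    row_high = [p[1] for p in row_pairs]
    # First letter: filter applied to the rows; second letter: columns.
    ll, lh = filter_columns(row_low)
    hl, hh = filter_columns(row_high)
    return {"LL": ll, "HL": hl, "LH": lh, "HH": hh}


# 4x4 test image whose intensity increases from left to right.
img = [[float(c) for c in range(4)] for _ in range(4)]
bands = dwt2_one_level(img)
for name in ("LL", "HL", "LH", "HH"):
    print(name, [[round(v, 3) for v in row] for row in bands[name]])
```

Each sub-band is half the size of the input in both directions. Because the test image varies only along the rows, all of the detail energy lands in the HL sub-band, while LH and HH are zero.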

Fig. 3.5: Illustration of 1D DWT applied to input image

Fig. 3.6: DWT for Lena image (a) Original image (b) Output image after 1D DWT

applied on column input (c) Output image after 1D DWT applied on row input


The 2D DWT[41] converts images from spatial domain to frequency domain. At

each level of the wavelet decomposition, each column of an image is first transformed

using a 1D vertical analysis filter bank. The same filter bank is then applied horizontally

to each row of the filtered and sub-sampled data. One level of wavelet decomposition

produces four filtered and sub sampled images, referred to as sub-bands.

The upper and lower areas of Fig. 3.6(b) represent the low pass and high pass

coefficients respectively after applying vertical 1D DWT and sub-sampling to an input

image shown in Fig. 3.6(a). The result of the horizontal 1D DWT and sub-sampling to

form a 2D DWT output image is shown in Fig. 3.6(c). Multiple levels of Wavelet

Transforms can be used to concentrate data energy in the lowest sampled bands.

Specifically, the LL sub-band in Fig. 3.5(c) can be transformed again to form LL2, HL2,

LH2 and HH2 sub-bands, producing a two level Wavelet Transform.

An 'R-1' level wavelet decomposition is associated with 'R' resolution levels numbered from '0' to 'R-1', with '0' and 'R-1' corresponding to the coarsest and finest resolutions respectively. The straightforward convolution implementation of the 1D DWT requires a large amount of memory and high computational complexity. An alternative implementation of the 1D DWT, known as the Lifting scheme, provides a significant reduction in both memory and computational complexity.

Lifting also allows in-place computation of the wavelet coefficients. Nevertheless, the Lifting approach computes the same coefficients as the direct filter bank convolution. To employ wavelets for image decomposition, the notion of time, which has so far served as the free variable, is replaced with spatial position. In addition, the wavelet framework has to deal with two-dimensional signals. Although two-dimensional wavelets can be constructed, a more popular approach is to transform images using one-dimensional separable wavelets.

Using separable wavelets, one can apply the Wavelet Transform first in one direction and then transform the result again in the other direction. In Fig. 3.7, the DWT is first applied in the x-direction of the image, relocating the scaling coefficients to the left side and the wavelet coefficients to the right side, as before. Afterwards, the DWT is applied in the y-direction on the resulting image, relocating the scaling coefficients to the top.

Fig. 3.7: Two dimensional transform with separable wavelets

Fig. 3.8: 2 Level decomposition of Lena image

Different filter banks can be used for each direction, if desired. After both transformations, the upper left quadrant contains the original image at half the resolution, while the other quadrants contain the refinement coefficients necessary to bring the smaller image back to full scale. Each of the quadrants has its own basis functions; thus the basis for separable 2D transforms consists of one scaling function $\Phi_x\Phi_y$ and three wavelet functions $\Psi_x\Phi_y$, $\Phi_x\Psi_y$ and $\Psi_x\Psi_y$. After executing the DWT in both directions, the algorithm can be applied recursively to the lower resolution image. Fig. 3.8 shows the wavelet coefficients of the Lena image after 2 levels of decomposition with the Daubechies 4-tap wavelet; the original image, in the upper left, has been scaled down to 25% of its resolution. The wavelet coefficients, especially those from level '1', are so small that they are almost imperceptible (the grey levels have been contrast enhanced for improved viewing).

This illustrates the efficiency of Wavelet Transforms for energy compaction.

Interestingly (and quite unlike the Fourier Transform) the wavelet coefficient quadrants

visually resemble the high resolution details of the image. The lower left quadrant has

mostly details for the x-direction, while the upper right has details for the y-direction.

The lower right quadrant has details from both directions (diagonal), but they are almost too fine to be seen. The actual compression is accomplished by discarding coefficients. For instance, some of the quadrants in the decomposition could be discarded. A better strategy, however, is to selectively discard coefficients based on their magnitude. Since larger coefficients probably have more impact on the reconstructed image, those are kept and the smaller values are discarded, a procedure known as Thresholding.

With Hard Thresholding, a tolerance limit 'T' is selected and all coefficients with absolute value smaller than 'T' are discarded. A variation on this scheme, called Quantile Thresholding, selects a percentage 'P' and discards the smallest 'P' percent of the values. With Soft Thresholding, the magnitude of all coefficients is reduced by the amount 'T'. The coefficients whose magnitude is smaller than this value are reduced to zero, while all the rest are brought closer to zero. Instead of subtraction, integer division by 'Q' can be used. Again, all values smaller than 'Q' are reduced to zero, while the rest are made smaller.
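The four coefficient-reduction rules described above can be sketched as follows. The function names and the sample coefficient list are illustrative assumptions; integer division is taken here to truncate toward zero, so that small coefficients of either sign become zero.

```python
def hard_threshold(coeffs, t):
    """Discard (zero) every coefficient whose magnitude is below t."""
    return [c if abs(c) >= t else 0.0 for c in coeffs]


def quantile_threshold(coeffs, p):
    """Discard the smallest p percent of the coefficients by magnitude."""
    n_drop = int(len(coeffs) * p / 100.0)
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]))
    drop = set(order[:n_drop])
    return [0.0 if i in drop else c for i, c in enumerate(coeffs)]


def soft_threshold(coeffs, t):
    """Shrink every magnitude by t; magnitudes below t become zero."""
    out = []
    for c in coeffs:
        mag = abs(c) - t
        out.append(0.0 if mag <= 0 else (mag if c > 0 else -mag))
    return out


def quantize(coeffs, q):
    """Integer division by q, truncating toward zero."""
    return [int(c / q) for c in coeffs]


# Assumed sample coefficients, for illustration only.
coeffs = [9.0, -0.4, 2.5, -3.0, 0.1, 1.2, -7.5, 0.6]

print(hard_threshold(coeffs, 1.0))
print(quantile_threshold(coeffs, 50))
print(soft_threshold(coeffs, 1.0))
print(quantize(coeffs, 2.0))
```

Hard and quantile thresholding leave the surviving coefficients untouched, whereas soft thresholding and quantization also change the values that are kept. Quantization additionally maps the coefficients onto a small set of integers.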

This strategy also limits the number of different values for the coefficients, which in effect can make coding more efficient, since the number of bits required to code the values can be reduced. The process of limiting the set of possible values used is known as Quantization. A more advanced approach is to use different values of 'T' or 'Q' for different sub-bands. Since the human visual system is less sensitive to high frequencies, it is desirable to use a greater threshold value or a coarser quantization for the fine detail sub-bands. An example of wavelet compression is shown in Fig. 3.9.


Fig. 3.9: Lena image compressed with Daubechies 4-tap wavelet

(a) Original (b) 80% compressed image (c) 96% compressed image

In Fig. 3.9(b), the smallest 80% of the wavelet coefficients have been discarded before reconstructing the image (hard threshold). At this compression level, there is no perceivable reduction in image quality. The only visual effect seems to be a reduction of noise and a slight smoothing of texture. Fig. 3.9(c) is reconstructed from only 4% of the original coefficients. The image is now composed of 2621 wavelets of different sizes and positions, as compared to 65536 pixels in the standard representation. Compression artefacts have now become apparent, but even at this high level of compression the image is quite recognizable. In comparison, a JPEG representation at this compression level would on average synthesize each patch from only 2.5 basis patterns.

Haar Wavelet Transform

The first DWT was invented by the Hungarian mathematician Alfred Haar [42] in 1909. For an input represented by a list of $2^n$ numbers, where 'n' represents the number of bits of a pixel, the Haar Wavelet Transform may be considered to simply pair up input values, storing the difference and passing the sum. This process is repeated recursively, pairing up the sums to provide the next scale, finally resulting in $2^n - 1$ differences and one final sum. Haar used these functions to give an example of a countable orthonormal system for the space of square integrable functions on the real line. The study of wavelets, and even the term Wavelet, did not come until much later. The Haar Wavelet is also the simplest possible wavelet. The technical disadvantage of the Haar Wavelet is that it is not continuous and therefore not differentiable. This property can, however, be an advantage for the analysis of signals with sudden transitions, such as monitoring of tool failure in machines. The mother wavelet function ψ(t) of the Haar Wavelet (Eq. 3.12) can be described as

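$$\psi(t) = \begin{cases} 1, & 0 \le t < \tfrac{1}{2} \\ -1, & \tfrac{1}{2} \le t < 1 \\ 0, & \text{otherwise} \end{cases} \qquad \ldots (3.12)$$

Its scaling function φ(t) can be described as in Eq. 3.13.

$$\varphi(t) = \begin{cases} 1, & 0 \le t < 1 \\ 0, & \text{otherwise} \end{cases} \qquad \ldots (3.13)$$

In functional analysis, the Haar system denotes the set of Haar Wavelets given in Eq. 3.14.

$$\{\, t \mapsto \psi_{n,k}(t) = \psi(2^{n}t - k);\; n \in \mathbb{N},\; 0 \le k < 2^{n} \,\} \qquad \ldots (3.14)$$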

Haar Wavelet properties

The Haar Wavelet Transform has several notable properties:

1. Any continuous real function can be approximated by linear combinations of φ(t), φ(2t), φ(4t), ..., φ($2^k$t) and their shifted functions. This extends to those function spaces where any function therein can be approximated by continuous functions.

2. Any continuous real function can be approximated by linear combinations of the constant function, ψ(t), ψ(2t), ψ(4t), ..., ψ($2^k$t) and their shifted functions.

3. Orthogonality

$$\int_{-\infty}^{\infty} 2^{(m + m_1)/2}\, \psi(2^{m}t - n)\, \psi(2^{m_1}t - n_1)\, dt = \delta_{m,m_1}\, \delta_{n,n_1} \qquad \ldots (3.15)$$

Here $\delta_{i,j}$ represents the Kronecker delta. The dual function of ψ(t) is ψ(t) itself.

4. Wavelet/Scaling functions with different scale 'm' have a functional relationship, given in Eq. 3.16.


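$$\varphi(t) = \varphi(2t) + \varphi(2t - 1)$$

$$\psi(t) = \varphi(2t) - \varphi(2t - 1) \qquad \ldots (3.16)$$

5. Coefficients of scale 'm' can be calculated from coefficients of scale 'm+1':

If

$$x_w(n, m) = 2^{m/2} \int_{-\infty}^{\infty} x(t)\, \varphi(2^{m}t - n)\, dt \qquad \ldots (3.17)$$

and

$$X_w(n, m) = 2^{m/2} \int_{-\infty}^{\infty} x(t)\, \psi(2^{m}t - n)\, dt \qquad \ldots (3.17.1)$$

then

$$x_w(n, m) = \sqrt{\tfrac{1}{2}}\, \big( x_w(2n, m+1) + x_w(2n+1, m+1) \big) \qquad \ldots (3.18)$$

$$X_w(n, m) = \sqrt{\tfrac{1}{2}}\, \big( x_w(2n, m+1) - x_w(2n+1, m+1) \big) \qquad \ldots (3.19)$$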

Haar matrix

The 2×2 matrix associated with the Haar Wavelet [43] is given by Eq. 3.20.

$$H_2 = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \qquad \ldots (3.20)$$

Using the DWT, any sequence $(a_0, a_1, \ldots, a_{2n}, a_{2n+1})$ of even length can be transformed into a sequence of two-component vectors $((a_0, a_1), \ldots, (a_{2n}, a_{2n+1}))$. Right multiplying each of these vectors by the matrix $H_2$ gives the result $((s_0, d_0), \ldots, (s_n, d_n))$ of one stage of the Fast Haar Wavelet Transform. The sequence 's' is often referred to as the averages part, whereas 'd' is known as the details part. Usually one separates the sequences 's' and 'd' and continues with transforming the sequence 's'. Generally, the 2N×2N Haar matrix can be derived by Eq. 3.21a.

$$H_{2N} = \begin{bmatrix} H_N \otimes [1, 1] \\ I_N \otimes [1, -1] \end{bmatrix} \qquad \ldots (3.21a)$$

If N = 2, then

$$H_4 = \begin{bmatrix} H_2 \otimes [1, 1] \\ I_2 \otimes [1, -1] \end{bmatrix}$$

where $I_2$ is the identity matrix of order 2 × 2,

$$I_2 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

and ⊗ is the Kronecker product. For example, if A is an m × n matrix and B is a p × q matrix, then the Kronecker product A ⊗ B is computed as

$$A \otimes B = \begin{bmatrix} a_{11}B & \cdots & a_{1n}B \\ \vdots & \ddots & \vdots \\ a_{m1}B & \cdots & a_{mn}B \end{bmatrix}$$

If the length of the sequence is a multiple of four, blocks of four elements are constructed and transformed in a similar manner with the 4×4 Haar matrix, which combines two stages of the Fast Haar Wavelet Transform, as represented by Eq. 3.21.

$$H_4 = \begin{bmatrix} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \otimes [1, 1] \\[4pt] \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \otimes [1, -1] \end{bmatrix} = \begin{bmatrix} 1[1\;\;1] & 1[1\;\;1] \\ 1[1\;\;1] & -1[1\;\;1] \\ 1[1\;\,-1] & 0[1\;\,-1] \\ 0[1\;\,-1] & 1[1\;\,-1] \end{bmatrix}$$

$$H_4 = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & 0 & 0 \\ 0 & 0 & 1 & -1 \end{bmatrix} \qquad \ldots (3.21)$$

Note that the above matrix is an un-normalized Haar matrix. The Haar matrix required by the Haar transform should be normalized. Unlike the Fourier transform, the Haar matrix has only real elements (i.e., 1, -1 or 0) and is non-symmetric.
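The recursive construction of Eq. 3.21a can be sketched in a few lines. The helper names `kron`, `identity` and `haar_matrix` are illustrative, and the recursion is assumed to start from the 2×2 matrix of Eq. 3.20.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [
        [x * y for x in row_a for y in row_b]
        for row_a in a
        for row_b in b
    ]


def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def haar_matrix(size):
    """Un-normalized Haar matrix of the given order (a power of two >= 2)."""
    if size == 2:
        return [[1, 1], [1, -1]]  # H2 of Eq. 3.20
    half = size // 2
    # Eq. 3.21a: stack H_N (x) [1, 1] on top of I_N (x) [1, -1].
    return kron(haar_matrix(half), [[1, 1]]) + kron(identity(half), [[1, -1]])


for row in haar_matrix(4):
    print(row)
```

For order 4 this reproduces the matrix of Eq. 3.21 row by row. The same function yields the 8×8 matrix and larger, and the rows remain mutually orthogonal at every size.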

Haar Transform

The Haar Transform [44] is the simplest of the Wavelet Transforms. This transform cross multiplies a function against the Haar Wavelet with various shifts and stretches, just as the Fourier Transform cross multiplies a function against a sine wave with two phases and many stretches. The Haar Transform is derived from the Haar matrix. An example of a 4×4 Haar Transformation matrix is shown in Eq. 3.22.

$$H_4 = \frac{1}{\sqrt{4}} \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & -1 & -1 \\ \sqrt{2} & -\sqrt{2} & 0 & 0 \\ 0 & 0 & \sqrt{2} & -\sqrt{2} \end{bmatrix} \qquad \ldots (3.22)$$

The Haar Transform [45] can be thought of as a sampling process in which the rows of the transformation matrix act as samples of finer and finer resolution.
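The pair-up-and-recurse description of the Haar Wavelet Transform and the normalized matrix of Eq. 3.22 give the same result, which the following sketch checks on a four-sample input. The function name `haar_transform` and the sample values are assumptions made for the illustration.

```python
import math


def haar_transform(values):
    """Full orthonormal Haar decomposition; the length must be a power of two."""
    s = 1.0 / math.sqrt(2.0)
    data = list(values)
    details = []
    while len(data) > 1:
        sums = [(data[i] + data[i + 1]) * s for i in range(0, len(data), 2)]
        diffs = [(data[i] - data[i + 1]) * s for i in range(0, len(data), 2)]
        details = diffs + details  # coarser differences are placed in front
        data = sums                # the sums are passed on to the next scale
    return data + details          # one final sum, then all the differences


# Normalized 4x4 Haar matrix of Eq. 3.22.
r2 = math.sqrt(2.0)
H4 = [
    [0.5 * v for v in row]
    for row in ([1, 1, 1, 1], [1, 1, -1, -1], [r2, -r2, 0, 0], [0, 0, r2, -r2])
]

x = [4.0, 6.0, 10.0, 12.0]
fast = haar_transform(x)
by_matrix = [sum(h * v for h, v in zip(row, x)) for row in H4]

print([round(v, 4) for v in fast])
print([round(v, 4) for v in by_matrix])
```

For a list of four values the recursion produces three differences and one final sum, matching the count of $2^n - 1$ differences stated for an input of length $2^n$.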

Daubechies Wavelet Transform

The most commonly used set of DWTs was formulated by the Belgian mathematician Ingrid Daubechies in 1988. This formulation is based on the use of recurrence relations to generate progressively finer discrete samplings of an implicit mother wavelet function. In her seminal paper, Daubechies derived a family of wavelets, the first of which is the Haar Wavelet. Interest in this field has exploded since then and many variations of Daubechies' original wavelets have been developed. In general, the Daubechies Wavelets [46] are chosen to have the highest number 'A' of vanishing moments (this does not imply the best smoothness) for a given support width N = 2A, and among the $2^{A-1}$ possible solutions the one chosen is that whose scaling filter has extremal phase. The Wavelet Transform is also easy to put into practice using the Fast Wavelet Transform. Daubechies Wavelets are widely used in solving a broad range of problems, e.g. self-similarity properties of a signal or fractal problems, signal discontinuities, etc.

The Daubechies Wavelets are not defined in terms of the resulting scaling and wavelet functions. In fact, they cannot be written down in closed form. The graph shown in Fig. 3.10 is generated using the cascade algorithm, a numeric technique consisting of simply inverse transforming [1 0 0 0 0 ...] an appropriate number of times.


Fig. 3.10: Scaling, Wavelet function and corresponding amplitude of the frequency spectra

Note that the spectra shown here are not the frequency responses of the high and low pass filters, but rather the amplitudes of the Continuous Fourier Transforms (CFT) of the scaling (blue) and wavelet (red) functions. The Daubechies orthogonal wavelets D2-D20 (even index numbers only) are commonly used. The index number refers to the number 'N' of coefficients. Each wavelet has a number of zero moments, or vanishing moments, equal to half the number of coefficients.

For example, D2 (the Haar Wavelet) has one vanishing moment, D4 has two, and so on. The number of vanishing moments limits the wavelet's ability to represent polynomial behavior or information in a signal. For example, D2, with one moment, easily encodes polynomials of one coefficient, i.e. constant signal components. D4 encodes polynomials with two coefficients, i.e. constant and linear signal components, and D6 encodes polynomials with three coefficients, i.e. constant, linear and quadratic signal components.
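The vanishing-moment counts quoted above can be verified directly on the filter coefficients. The sketch below uses the well-known closed-form D4 scaling filter and derives the wavelet (high pass) filter by the usual alternating flip; the helper names are illustrative, and the discrete moment sum over the filter taps is used here as the test for a vanishing moment.

```python
import math

SQRT2 = math.sqrt(2.0)
SQRT3 = math.sqrt(3.0)

# Scaling (low pass) filters with orthonormal normalisation.
haar_h = [1.0 / SQRT2, 1.0 / SQRT2]                      # D2
d4_h = [c / (4.0 * SQRT2)
        for c in (1 + SQRT3, 3 + SQRT3, 3 - SQRT3, 1 - SQRT3)]  # D4


def wavelet_filter(h):
    """High pass filter from the low pass one: g[k] = (-1)^k h[N-1-k]."""
    n = len(h)
    return [(-1) ** k * h[n - 1 - k] for k in range(n)]


def moment(filt, p):
    """Discrete p-th moment of a filter: sum over k of k^p * filt[k]."""
    return sum((k ** p) * c for k, c in enumerate(filt))


for name, h in (("D2", haar_h), ("D4", d4_h)):
    g = wavelet_filter(h)
    print(name, [round(moment(g, p), 6) for p in range(3)])
```

For D2 only the zeroth moment vanishes, so constants produce no detail coefficients. For D4 the zeroth and first moments vanish, so constant and linear components are both suppressed in the high pass output, while the second moment is non-zero.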

This ability to encode signals is nonetheless subject to the phenomenon of scale leakage and the lack of shift invariance, which arise from the discrete shifting operation during application of the transform. Sub-sequences which represent linear or quadratic signal components are treated differently by the transform depending on whether the points align with even or odd numbered locations in the sequence. The lack of the important property of shift invariance has led to the development of several different versions of a Shift Invariant (discrete) Wavelet Transform.

Table 3.1 shows the comparison between Haar and Daubechies Wavelets.

Table 3.1: Comparison of Haar and Daubechies Wavelets

| Property | Haar | Daubechies |
|---|---|---|
| Explicit function | Yes | No |
| Orthogonal | Yes | Yes |
| Symmetric | Yes | No |
| Continuous | No | Yes |
| Compact support | Yes | Yes |
| Maximum regularity for order L | No | No |
| Shortest scaling function for order L | Yes | No |

The different types of transformations and their representations are shown in Table 3.2.

Table 3.2: Comparison of Fourier Transformation, Time-Frequency analysis and Wavelet Transformation

| Transformation | Representation | Output |
|---|---|---|
| Fourier Transform | $X(f) = \int_{-\infty}^{+\infty} x(t)\, e^{-j2\pi f t}\, dt$ | frequency 'f' |
| Time-Frequency analysis | $X(t, f)$ | time 't'; frequency 'f' |
| Wavelet Transform | $X(a, b) = \frac{1}{\sqrt{a}} \int_{-\infty}^{\infty} \Psi\!\left(\frac{t - b}{a}\right) x(t)\, dt$ | scaling 'a'; time 'b' |

3.2 Proposed 3D Lifting based Discrete Wavelet Transform

The Wavelet Transform [47] provides a multi-resolution representation using a set of analyzing functions that are dilations and translations of a few functions (wavelets). An efficient VLSI architecture for the implementation of the 3D Lifting based DWT is proposed. The whole architecture is optimized with an efficient pipelined and parallel design to increase speed and achieve higher hardware utilization. A Time Division Multiplexing (TDM) design is utilized to realize the prediction step and the update step using the same architecture, and hence the size of the circuit can be reduced.


Fig. 3.11: (a) 3 level decomposition of an image (b) Pictorial representation of 3D DWT

Taking data of size N1 × N2 × N3 and applying the 1D analysis filter bank to the first dimension, two sub-band data sets, each of size (N1/2) × N2 × N3, are obtained. After applying the 1D analysis filter bank to the second dimension, four sub-band data sets, each of size (N1/2) × (N2/2) × N3, are obtained. Applying the 1D analysis filter bank to the third dimension gives eight sub-band data sets, each of size (N1/2) × (N2/2) × (N3/2). This is illustrated in Fig. 3.11(a) and Fig. 3.11(b). The proposed work utilizes an efficient line-based VLSI architecture for the 3D DWT using the Lifting scheme, which is mainly composed of one row DWT module and one column DWT module, working in parallel and in pipelined fashion with 100% hardware utilization.
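The sub-band counts and sizes described above can be reproduced with a small software model. This is only an illustrative sketch of the separable 3D decomposition, assuming the orthonormal Haar filter pair and a volume stored as nested lists; it does not model the proposed VLSI architecture, and all names are hypothetical.

```python
import math


def combine(a, b, sign):
    """Element-wise (a + sign*b)/sqrt(2) on nested lists or scalars."""
    if isinstance(a, list):
        return [combine(x, y, sign) for x, y in zip(a, b)]
    return (a + sign * b) / math.sqrt(2.0)


def analyze(vol, axis):
    """Haar analysis filter bank along one axis: returns (low, high)."""
    if axis == 0:
        low = [combine(vol[i], vol[i + 1], +1) for i in range(0, len(vol), 2)]
        high = [combine(vol[i], vol[i + 1], -1) for i in range(0, len(vol), 2)]
        return low, high
    parts = [analyze(sub, axis - 1) for sub in vol]
    return [p[0] for p in parts], [p[1] for p in parts]


def dwt3_one_level(vol):
    """Apply the 1D filter bank to each of the three dimensions in turn."""
    bands = {"": vol}
    for axis in range(3):
        nxt = {}
        for name, data in bands.items():
            low, high = analyze(data, axis)
            nxt[name + "L"] = low
            nxt[name + "H"] = high
        bands = nxt  # 2, then 4, then 8 sub-bands
    return bands


def shape(v):
    dims = []
    while isinstance(v, list):
        dims.append(len(v))
        v = v[0]
    return tuple(dims)


# Constant test volume of size N1 x N2 x N3 = 4 x 6 x 8 (assumed values).
vol = [[[1.0] * 8 for _ in range(6)] for _ in range(4)]
bands = dwt3_one_level(vol)
for name in sorted(bands):
    print(name, shape(bands[name]))
```

Each of the eight sub-bands has size (N1/2) × (N2/2) × (N3/2). For the constant test volume all of the energy ends up in the LLL sub-band, which is the energy compaction that the compression stage relies on.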

Many common classes of images, such as medical images (e.g. MRI), scanned

documents and Satellite images do not have the same statistical properties as photographic

images. The standard wavelets used in image coders often do not match such images,

resulting in decreased compression or image quality. Moreover MRIs are often stored in


large databases of similar images, making it worthwhile to find a specially adapted

wavelet for them.

In medical applications such as telemedicine and automatic diagnosis, sophisticated lossless compression on one side and decompression and detection on the other side are essential. To address this need, a Lifting based Discrete Wavelet Transform technique is proposed in which the row DWT and the column DWT are designed and implemented with 3 level decomposition. The 3 level decomposition of a 3D signal is shown in Fig. 3.12, and a 3D view of the decomposition of an image is shown in Fig. 3.13.

Fig. 3.12: 3 Level decomposition of 3D image

Fig. 3.13: 3D view of decomposition of an image


3.2.1 Proposed Lifting based DWT

The Wavelet Transform is an important and useful tool for image compression. Many techniques have been developed for feature extraction from MRI; the Lifting based 3D DWT is well suited to this task because it is a non-statistical method that gives local frequency information and the detail coefficients of the image at various levels.

The Lifting scheme [48] was developed as a flexible tool for constructing second generation wavelets [49]. It is composed of three basic operation stages: Splitting, Predicting and Updating. Fig. 3.14 shows the Lifting scheme of the wavelet filter for a one-dimensional signal.

Fig. 3.14: Lifting scheme of the wavelet filter

Split step: The signal is split into even and odd points because the maximum correlation

between adjacent pixels can be utilized for the next predict step.

Predict step: The even samples are multiplied by predict factor and then the results are

added to odd samples to generate the detailed coefficients (High Pass Coefficients).

Update step: The detailed coefficients computed by the predict step are multiplied by the

update factor and then results are added to even samples to get coarse coefficients

(Low Pass Coefficients).
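The three steps can be sketched in a few lines of Python. This is a minimal illustration only, not the proposed hardware: the function names `lifting_forward` and `lifting_inverse` are hypothetical, the predict factor -1/2 and update factor 1/4 are assumed to be those of the (5, 3) filter, the input length is assumed to be even, and a missing neighbour at either end is replaced by the nearest available one.

```python
def lifting_forward(x, p=-0.5, u=0.25):
    """One lifting level on an even-length 1D signal (illustrative sketch)."""
    even, odd = x[0::2], x[1::2]                      # split step
    n = len(odd)
    # predict step: detail = odd + p * (left even neighbour + right even neighbour)
    detail = [odd[i] + p * (even[i] + even[min(i + 1, n - 1)]) for i in range(n)]
    # update step: coarse = even + u * (left detail + right detail)
    coarse = [even[i] + u * (detail[max(i - 1, 0)] + detail[i]) for i in range(n)]
    return coarse, detail


def lifting_inverse(coarse, detail, p=-0.5, u=0.25):
    """Undo the update step, then the predict step, then merge the samples."""
    n = len(detail)
    even = [coarse[i] - u * (detail[max(i - 1, 0)] + detail[i]) for i in range(n)]
    odd = [detail[i] - p * (even[i] + even[min(i + 1, n - 1)]) for i in range(n)]
    x = [0] * (2 * n)
    x[0::2], x[1::2] = even, odd
    return x


samples = [10, 12, 14, 13, 9, 8, 8, 11]
low, high = lifting_forward(samples)
restored = lifting_inverse(low, high)
```

Because each step only adds a function of the other half of the samples, the inverse simply subtracts the same quantities in reverse order, which is why the Lifting scheme reconstructs the input exactly.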


3.2.2 Proposed 3D LDWT Architecture

The proposed VLSI architecture shown in Fig. 3.15 performs the 3D LDWT with a line based method. It consists of five key modules: the data choose module, the row DWT module, the column DWT module, the DWT control unit and the external RAM. An external RAM of size $\frac{N^2}{4}$ is used to store the LL band output coefficients for multi level decomposition, where 'N' represents the width and the height of the input image. The DWT control unit controls the time sequence of the whole system.

Fig. 3.15: Block diagram of 3D Lifting based DWT architecture.

First, one line of image data or LL sub-band data is routed through the Data Selector. The data then enter the row processor, which performs the 1D row DWT, and the output data are stored in the line buffers. The number of line buffers is determined by the number of taps of the low pass filter. When $\frac{M+1}{2}$ rows of data ('M' is the number of taps of the Low Pass Filter) have finished the row DWT, the column DWT module immediately starts the column transform and stores the intermediate results in the column buffer. The final transformed data are stored in the external RAM. The operation of the DWT control unit is described by the Finite State Machine (FSM) chart in Fig. 3.16.


Fig.3.16: FSM of DWT control unit

Improved Embedded Mirror Symmetric Extension at the Boundaries

Processing a finite length signal with a wavelet filter leads to edge effects. The JPEG 2000 standard [50] employs Symmetric Extension at the boundaries to eliminate them. The traditional extension arithmetic needs additional memory units and operations, which consume considerable power and area [10]. Exploiting the characteristics of the Lifting based DWT, this module uses the Embedded Mirror Symmetric Extension Arithmetic [51], shown in Fig. 3.17. The extension is embedded into the data operation process by changing the operation performed at the beginning and at the end of the Lifting operation.

Fig. 3.17: Mirror symmetric extension


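Eq. 3.23 gives the modified operation process of the (5, 3) Wavelet Transform, where $i_0$ is the index of the first sample and $i_l - 1$ is the index of the last sample.

$$
y(2n+1) =
\begin{cases}
x(2n+1) - \dfrac{x(2n) + x(2n+2)}{2}, & i_0 + 1 \le 2n+1 \le i_l - 2 \quad (\text{normal}) \\[2ex]
x(2n+1) - x(2n+2), & 2n+1 = i_0 \quad (\text{odd\_begin}) \\[1ex]
x(2n+1) - x(2n), & 2n+1 = i_l - 1 \quad (\text{odd\_end})
\end{cases}
$$

$$
y(2n) =
\begin{cases}
x(2n) + \dfrac{y(2n-1) + y(2n+1) + 2}{4}, & i_0 + 1 \le 2n \le i_l - 2 \quad (\text{normal}) \\[2ex]
x(2n) + \dfrac{y(2n+1)}{2} + \dfrac{1}{2}, & 2n = i_0 \quad (\text{even\_begin}) \\[2ex]
x(2n) + \dfrac{y(2n-1)}{2} + \dfrac{1}{2}, & 2n = i_l - 1 \quad (\text{even\_end})
\end{cases}
\qquad \ldots (3.23)
$$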

The embedded scheme is built into the row DWT module and the column DWT module and is implemented by an FSM and multiplexers. The FSM has four states: Forward extension, Normal even, Normal odd and Last extension. Data extension takes place only in the Forward extension and Last extension states.
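The idea of embedding the extension into the boundary cases can be checked with a short Python sketch. It is an illustration under stated assumptions, not the proposed circuit: the signal starts at index i0 = 0 and has at least two samples (so the odd_begin case never occurs), the divisions are rounded down as in the reversible JPEG 2000 (5, 3) transform (the rounding rule is an assumption, since Eq. 3.23 does not state it), and the update step adds the rounded term to the even sample. The names `dwt53_embedded`, `dwt53_padded` and `mirror` are hypothetical.

```python
def dwt53_embedded(x):
    """(5, 3) lifting with the mirror extension folded into the boundary cases."""
    n = len(x)
    y = list(x)
    for k in range(1, n, 2):                 # predict: odd positions (high-pass)
        if k == n - 1:                       # odd_end: right neighbour mirrors onto x[k-1]
            y[k] = x[k] - x[k - 1]
        else:                                # normal
            y[k] = x[k] - (x[k - 1] + x[k + 1]) // 2
    for k in range(0, n, 2):                 # update: even positions (low-pass)
        if k == 0:                           # even_begin: left detail mirrors onto y[1]
            y[k] = x[k] + (y[k + 1] + 1) // 2
        elif k == n - 1:                     # even_end: right detail mirrors onto y[k-1]
            y[k] = x[k] + (y[k - 1] + 1) // 2
        else:                                # normal
            y[k] = x[k] + (y[k - 1] + y[k + 1] + 2) // 4
    return y


def mirror(i, n):
    """Map an out-of-range index back into 0..n-1 by whole-sample mirroring."""
    if i < 0:
        return -i
    if i >= n:
        return 2 * (n - 1) - i
    return i


def dwt53_padded(x):
    """Reference version: the normal formulas applied with explicit mirroring."""
    n = len(x)
    y = list(x)
    for k in range(1, n, 2):
        y[k] = x[k] - (x[mirror(k - 1, n)] + x[mirror(k + 1, n)]) // 2
    for k in range(0, n, 2):
        y[k] = x[k] + (y[mirror(k - 1, n)] + y[mirror(k + 1, n)] + 2) // 4
    return y


row = [10, 12, 14, 13, 9, 8, 8, 11]
coeffs = dwt53_embedded(row)   # odd positions: high-pass, even positions: low-pass
```

Both functions return the same coefficients for even and odd signal lengths, which shows that the boundary cases need no extra samples to be stored or fetched.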

Proposed row DWT module

The proposed architecture, illustrated in Fig. 3.18, is optimized for processing speed. Multiplication is replaced by shift and add operations, so the row processor consists of six registers, five multiplexers, one adder and one shifting adder. All the hardware resources of the row processor can be time-multiplexed. One single line is processed at a time. When a Lifting step is performed, two consecutive even-numbered samples are added, multiplied by the corresponding Lifting coefficient and then added to the odd-numbered sample between them, i.e. one pixel is encoded in one clock. This reduces the number of storage cells compared to a row DWT module in which the input lines are partitioned into even and odd samples [9] (which needs two parallel row DWT units).


Fig. 3.18: Block diagram of Row DWT

The Embedded Mirror Symmetric Boundary data extension algorithm is implemented by two multiplexers controlled by the signals Sel 0 and Sel 1, which significantly reduces the amount of internal storage and the number of accesses to the external memory. The control signals (Sel 1, Sel 0) of the multiplexers and the corresponding model of the Lifting scheme are shown in Table 3.3.

Table 3.3: Control Signals and the corresponding model of the Lifting Scheme

Sel 1   Sel 0   Lifting Scheme step   Corresponding coefficient   Extension
0       0       Prediction            -1/2                        Forward extension
0       1       Update                -1/4                        Normal even
1       0       Prediction            -1/2                        Normal odd
1       1       Update                -1/4                        Last extension

Time multiplexing in the row processor is implemented by conducting the predict step in even clocks and the update step in odd clocks. The control signals Sel 1 and Sel 0 of the multiplexers are generated by a counter. The row processor is pipelined and the samples are encoded continuously. Hardware utilization reaches approximately 100% and the control logic is simple. Table 3.4 shows the data flow of the proposed module of the (5, 3) DWT [52] for a row with 8 samples, where Hi (Li) represents the i-th high-pass (low-pass) output.

Table 3.4: Data Flow for (5, 3) DWT

Clock   Input   En0   En1   Output
1       X0      0     0     -
2       X1      0     0     -
3       X2      0     0     H1
4       X3      0     1     L1
5       X4      1     0     H2
6       X5      0     1     L2
7       X6      1     0     H3
8       X7      0     1     L3
9       -       1     1     H4
10      -       0     1     L4

Proposed Column DWT module

In order to reduce the system latency, the column DWT is executed in row-wise order. The input data are stored in the even line buffer and the odd line buffer, so they are naturally separated into even samples and odd samples along the column. The embedded mirror symmetric boundary data extension algorithm is also implemented in the Column DWT module (Fig. 3.19). Four multiplexers control all the steps ('1' selects the prediction step and '0' selects the update step). The multiplexers ensure that the hardware resources are reused and that the samples enter the associated computation according to the timing plan. The column DWT module begins to process samples after the first two lines have been processed by the row DWT. First, the multiplexers are set to '1' and the column processor conducts the prediction step.


Fig. 3.19: Block diagram of Column DWT

At the same time, the result of the prediction step is stored in the column buffer. The multiplexers are then set to '0', the column processor conducts the update step and the results are output directly. The column processor is pipelined to increase the speed of the Wavelet Transform. The data flow of the column DWT is similar to that of the row DWT.

3.3 Floating Point Multiplication Algorithm

A floating point multiplier is proposed, instead of a conventional multiplier, to multiply the filtered input values by constant coefficient values.

Fig. 3.20: Floating point multiplication with a constant value 'a'


The main advantage of this floating point multiplier is increased speed of operation and accuracy. A normalized floating point number 'Z' has the form given by Eq. 3.23a.

$$Z = (-1)^{S} \times 2^{(E - \mathrm{Bias})} \times (1.M) \qquad \ldots (3.23a)$$

The following algorithm is used to multiply two floating point numbers:

Significand multiplication; i.e. (1.M1 x 1.M2)
Placing the decimal point in the result
Exponent addition; i.e. (E1 + E2 - Bias)
Getting the sign; i.e. S1 xor S2
Normalizing the result; i.e. obtaining '1' at the MSB
Rounding the result
Checking for underflow/overflow

IEEE 754 single precision floating point numbers are considered for the multiplication, but the number of mantissa bits is reduced for simplification. Only five bits are considered, while the hidden '1' bit is retained for normalized numbers. Fig. 3.21 shows each block of the floating point multiplier.

Fig. 3.21: Floating point multiplier block diagram

The floating point multiplier presented here does not implement rounding support. This leaves more precision available to the MAC unit, which can be accessed by the multiplier or by a floating point adder unit. Fig. 3.20 shows the block diagram of the multiplier structure: exponent addition, significand multiplication and result sign calculation. These operations are independent of one another and are carried out in parallel. The significand multiplication is performed on two 24 bit numbers and results in a 48 bit product called the Intermediate Product (IP). The IP is represented as (47 down to 0) and the decimal point is located between bits 46 and 45 in the IP.
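The multiplication steps can be followed in software with a small Python sketch. It is an illustration only, under these assumptions: full IEEE 754 single precision fields (23 stored mantissa bits plus the hidden '1') are used rather than the reduced mantissa, only normalized operands are accepted, and the result is truncated instead of rounded. The names `unpack_single` and `fp_multiply` are hypothetical.

```python
import struct

BIAS = 127  # exponent bias of IEEE 754 single precision


def unpack_single(value):
    """Split a number into the sign, biased exponent and mantissa of a float32."""
    bits = struct.unpack(">I", struct.pack(">f", value))[0]
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF


def fp_multiply(a, b):
    s1, e1, m1 = unpack_single(a)
    s2, e2, m2 = unpack_single(b)
    if e1 in (0, 255) or e2 in (0, 255):
        raise ValueError("only normalized operands are handled in this sketch")
    sign = s1 ^ s2                              # result sign: S1 xor S2
    exponent = e1 + e2 - BIAS                   # exponent addition
    ip = ((1 << 23) | m1) * ((1 << 23) | m2)    # 24 bit x 24 bit -> 48 bit IP
    if ip >> 47:                                # product in [2, 4): normalize
        ip >>= 1
        exponent += 1
    mantissa = (ip >> 23) & 0x7FFFFF            # bits 45..23, truncated (no rounding)
    if exponent >= 255:
        raise OverflowError("exponent overflow")
    if exponent <= 0:
        raise ArithmeticError("exponent underflow")
    bits = (sign << 31) | (exponent << 23) | mantissa
    return struct.unpack(">f", struct.pack(">I", bits))[0]


product = fp_multiply(-1.5, 1.5)
```

After normalization the leading '1' sits at bit 46 of the IP, so the stored mantissa is taken from the bits just below the decimal point, i.e. bits 45 down to 23.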

3.4 Look Up Table (LUT) Implementation for Memory based Computations

The proposed LUT implementation for memory based computations is used to store the filtered values as well as the constant multiplier product values. LUTs are used instead of registers to store these values in order to reduce the memory size and optimize area and delay.

The Antisymmetric Product Coding (APC) and Odd Multiple Storage (OMS) techniques for Look Up Table design for memory based multipliers are proposed for use in Digital Signal Processing applications.

Fig. 3.22: The (5, 3) Discrete Wavelet Transform

Each of these techniques reduces the LUT size by a factor of two. Here, a different form of APC and a modified OMS scheme are presented so that the two can be combined for efficient memory based multiplication. The proposed combined approach reduces the LUT size to one fourth of that of the conventional LUT.
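The way the two ideas combine can be illustrated with a simplified Python sketch. It is not the circuit of the proposed design: it assumes an unsigned L-bit input X, uses the antisymmetry of the product about the midpoint 2^(L-1) to halve the table, and stores only odd multiples of the coefficient, deriving even multiples by left shifts. The names `build_lut` and `lut_multiply` are hypothetical.

```python
def build_lut(coefficient, width=6):
    """Store only the odd multiples A*1, A*3, ... below the midpoint 2^(width-1)."""
    half = 1 << (width - 1)
    return {k: coefficient * k for k in range(1, half, 2)}


def lut_multiply(coefficient, x, lut, width=6):
    """Compute coefficient * x from the reduced table, a shift and one add/subtract."""
    half = 1 << (width - 1)
    offset = x - half                       # antisymmetry: A*x = A*half +/- A*|x - half|
    magnitude = abs(offset)
    if magnitude == 0:
        word = 0
    else:
        shift = (magnitude & -magnitude).bit_length() - 1   # number of trailing zeros
        word = lut[magnitude >> shift] << shift             # odd multiple, shifted left
    base = coefficient << (width - 1)       # A * 2^(width-1), itself only a shift
    return base + word if offset >= 0 else base - word


A = 13
table = build_lut(A)                        # 16 words instead of 64 for a 6-bit input
products = [lut_multiply(A, x, table) for x in range(64)]
```

For a 6-bit input the conventional table would hold 64 products, whereas this sketch stores 16, which matches the one fourth reduction stated above.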


Fig. 3.23: Proposed APC and OMS Combined LUT design for the multiplication of 'W'-bit fixed coefficient 'A' with 6-bit input 'X'

3.5 Principles of Compression

Image compression [53] addresses the problem of reducing the amount of data required to represent a digital image. The underlying basis of the reduction process is the removal of redundant data. From a mathematical viewpoint, this amounts to transforming a 2D pixel array into a statistically uncorrelated data set. The transformation is applied prior to storage or transmission of the image. The compressed image is decompressed later to reconstruct the original image or an approximation of it.

3.5.1 Need for Image Compression

With advances in Internet, Teleconferencing, Multimedia and High Definition Television technologies, the amount of information handled by computers has grown exponentially over the past decades, and hence storage and transmission of the digital image component of multimedia systems pose a major problem. The amount of data required to represent images at an acceptable level of quality is extremely large. High quality image data requires large amounts of storage space and transmission bandwidth, which current technology cannot provide economically. One possible solution to this problem is to compress the information so that the storage space and transmission time can be reduced. For example, if a 1400 x 1800 color image with 8 bits per color component needs to be stored, the space required is

1400 x 1800 x 8 x 3 = 60,480,000 bits
= 7,560,000 bytes
= 7.56 Mbytes

The maximum space available on one Compact Disk (CD) is 700 MB, so a CD can store only about 92 such images. The amount of data transmitted through the Internet doubles every year and a large portion of that data consists of images. Reducing the bandwidth occupied by an image results in significant cost reduction and makes the service more affordable to users. Image compression offers a way to represent an image more compactly, so that images can be stored in less space and transmitted faster.
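The storage figure in the example can be reproduced with a few lines of Python. This is a small sketch only; the function name `raw_image_bits` is hypothetical, and a CD capacity of 700 x 10^6 bytes is assumed.

```python
def raw_image_bits(width, height, bits_per_component=8, components=3):
    """Size of an uncompressed image in bits."""
    return width * height * bits_per_component * components


bits = raw_image_bits(1400, 1800)
size_bytes = bits // 8
images_per_cd = (700 * 10**6) // size_bytes   # whole images that fit on one CD
```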

3.5.2 Types of Redundancies

The different types of redundancy are as follows.

Psycho-visual redundancy: The human visual system has limited accuracy, so it is often possible to remove some details or reduce pixel precision (e.g. by quantization) without affecting the perceived quality of the image.

Statistical redundancy: When the distribution of the symbols is not uniform, it is generally possible to find an appropriate coding that reduces the overall data length (e.g. Entropy Coding).

Spatial redundancy: An image generally contains uniform regions of pixels or regular patterns that can be represented efficiently with very few symbols (e.g. by prediction or by resorting to a specific transform domain).

Temporal redundancy: Temporal redundancy is the statistical correlation between pixels from successive frames in a video sequence. It is also called Inter Frame redundancy. Motion compensated predictive coding is employed to reduce temporal redundancy. Removing a large amount of temporal redundancy leads to efficient video compression.

The following Fig. 3.24 shows different types of redundancies.

Fig. 3.24: Classification of Redundancy

3.5.3 Basic Image compression system

Image compression is an application of data compression that encodes the original image with fewer bits. The objective of image compression is to reduce the redundancy of the image and to store or transmit the data in an efficient form.

Fig. 3.25: Basic image compression system

[Fig. 3.25 block labels: Original image, Construct n × n sub images, Forward transform, Quantizer, Symbol encoder, Hard Disk/Channel, Symbol decoder, I-Quantizer, Inverse Transform, Merge images, Decompressed image]

Fig. 3.25 shows the block diagram of a basic image compression system together with the storage system. The two basic components of an image compression system are the Encoder and the Decoder. The Encoder is the component that compresses the source image, and its output is the compressed (coded) data. The objective of the Quantizer is to reduce the precision of the data and thereby achieve a higher Compression Ratio (CR). The compressed data may be either stored or transmitted, but are at some point fed to a Decoder. The Decoder is the component that reconstructs an image from the compressed data. The main goal of such a system is to reduce the amount of storage as much as possible while keeping the decoded image displayed on the monitor as similar to the original image as possible.

3.5.4 Classification of Image Compression Schemes

Image compression schemes can be broadly classified into two types: Lossless compression schemes and Lossy compression schemes. A lossless compression scheme is preferred for Medical images, whereas a lossy compression scheme is preferred for Multimedia applications. In a lossless compression scheme, the reconstructed image is identical to the original image without any loss of information, but the Compression Ratio (CR) is usually lower. On the other hand, a high CR can be obtained in a lossy compression scheme at the expense of the quality of the reconstructed image. There is always a tradeoff between the quality of the reconstructed image and the CR.

Lossless Compression (or) Reversible Compression

In lossless compression, the image after compression and decompression is identical to the original image; every bit of information is preserved during the compression and decompression process. Although lossless compression methods have the appeal that there is no deterioration in image quality, they achieve only a modest CR. Lossless compression is used in applications where no loss of image data can be tolerated.

Lossy Compression (or) Irreversible Compression

In lossy compression, the reconstructed image contains degradations with respect to the original image. Here perfect reconstruction is sacrificed, by eliminating some of the redundancies in the image, to achieve a high CR. Lossy compression can therefore achieve a higher CR than lossless compression. The term visually lossless is often used to characterize lossy compression schemes that result in no visible degradation under a set of designated viewing conditions.

3.6 Coding Techniques

Many coding techniques are used in image compression. Some of them are explained below.

3.6.1 Lossless Coding Techniques

The following lossless coding techniques are briefly explained.

Entropy Coding

In information theory, Entropy encoding is a lossless data compression scheme that is independent of the specific characteristics of the medium. One of the main types of Entropy Coding creates and assigns a unique prefix free code to each unique symbol that occurs in the input. Such Entropy encoders then compress data by replacing each fixed length input symbol with the corresponding variable length prefix free output codeword. The length of each codeword is approximately proportional to the negative logarithm of the probability of the symbol, so the most common symbols use the shortest codes.
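The relation between symbol probability and codeword length can be made concrete with a short Python sketch. The helper names are hypothetical, and the probabilities are simply estimated from the symbol counts of the input.

```python
import math
from collections import Counter


def symbol_probabilities(data):
    """Estimate each symbol's probability from its relative frequency."""
    counts = Counter(data)
    total = len(data)
    return {symbol: count / total for symbol, count in counts.items()}


def entropy_bits(data):
    """Shannon entropy in bits per symbol: -sum(p * log2(p))."""
    return -sum(p * math.log2(p) for p in symbol_probabilities(data).values())


def ideal_code_lengths(data):
    """Ideal codeword length of each symbol: -log2(p) bits."""
    return {s: -math.log2(p) for s, p in symbol_probabilities(data).items()}


lengths = ideal_code_lengths("aaab")   # 'a' is frequent, 'b' is rare
```

A practical entropy coder rounds or otherwise approximates these ideal lengths, so its average code length is close to, but not below, the entropy.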

Entropy as a measure of similarity

Besides its use for compressing digital data, an Entropy encoder can also be used to measure the amount of similarity between streams of data. This is done by generating an Entropy coder/compressor for each class of data. Unknown data is then classified by feeding the uncompressed data to each compressor and observing which compressor yields the highest CR.

Run Length Encoding(RLE)

Run Length Encoding is a very simple form of data compression in which runs of data (that is, sequences in which the same data value occurs in many consecutive data elements) are stored as a single data value and count, rather than as the original run. It is most useful on data that contains many such runs: for example, simple graphic images such as icons, line drawings and animations. It is not useful for files that do not have many runs, as it could greatly increase the file size. RLE also refers to a little-used image format in Windows 3.x, with the extension ".rle", which is a Run Length Encoded Bitmap used to compress the Windows 3.x start up screen.
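A minimal Python sketch of the idea is given below. The function names `rle_encode` and `rle_decode` are hypothetical, and runs are represented as (value, count) pairs rather than in any particular file format.

```python
def rle_encode(data):
    """Collapse each run of equal values into a (value, count) pair."""
    runs = []
    for value in data:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return [(value, count) for value, count in runs]


def rle_decode(runs):
    """Expand (value, count) pairs back into the original sequence."""
    out = []
    for value, count in runs:
        out.extend([value] * count)
    return out


encoded = rle_encode("WWWWBBBW")
no_runs = rle_encode("ABCD")    # one pair per symbol: the output grows
```

The second call shows the drawback noted above: data without runs needs one pair per input symbol, so the encoded form is larger than the input.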

Bit Plane Coding

A Bit Plane [54] of a digital signal (such as an image or sound) is the set of bits having the same position in the respective binary numbers. For example, for a 16 bit data representation there are 16 bit planes: the 1st bit plane contains the set of Most Significant Bits and the 16th contains the set of Least Significant Bits.
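Extracting a bit plane amounts to a shift and a mask, as the following Python sketch shows. The function names are hypothetical, and the numbering follows the text: plane 1 holds the Most Significant Bits and plane 16 the Least Significant Bits of 16 bit samples.

```python
def bit_plane(samples, plane, nbits=16):
    """Return the bits at the given plane (1 = MSB, nbits = LSB) of each sample."""
    shift = nbits - plane
    return [(sample >> shift) & 1 for sample in samples]


def from_bit_planes(planes, nbits=16):
    """Rebuild the samples from planes ordered from plane 1 (MSB) to plane nbits (LSB)."""
    samples = [0] * len(planes[0])
    for index, plane in enumerate(planes):
        shift = nbits - 1 - index
        samples = [s | (bit << shift) for s, bit in zip(samples, plane)]
    return samples


data = [0x8000, 0x0001, 0xFFFF, 0x1234]
planes = [bit_plane(data, p) for p in range(1, 17)]
```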

Huffman Coding

From Shannon's source coding theory, it is known that a source can be coded with an average code length close to the Entropy of the source. In 1952, D.A. Huffman invented a coding technique that produces the shortest possible average code length for a given source symbol set and the associated probabilities of occurrence of the symbols. Codes generated using this technique are popularly known as Huffman Codes [55]. The Huffman Coding technique is based on the following two observations regarding optimum prefix codes.

More frequently occurring symbols can be allocated shorter codewords than less frequently occurring symbols.

The two least frequently occurring symbols have codewords of the same length, and these differ only in the Least Significant Bit.

Huffman coding is more suitable than Arithmetic coding when simplicity is the major concern.
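The two observations lead directly to the usual construction, in which the two least frequent entries are merged repeatedly. The Python sketch below is one possible way to write it; the name `huffman_codes` is hypothetical, symbol frequencies are taken from the input itself, and ties are broken by insertion order, so other equally optimal code tables exist.

```python
import heapq
from collections import Counter


def huffman_codes(data):
    """Build a prefix-free code table {symbol: bit string} for the symbols in data."""
    freq = Counter(data)
    # Heap entries: (weight, tie breaker, partial code table for that subtree)
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    if len(heap) == 1:                       # a single symbol still needs one bit
        return {s: "0" for s in heap[0][2]}
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)      # least frequent subtree
        w2, _, t2 = heapq.heappop(heap)      # second least frequent subtree
        merged = {s: "0" + c for s, c in t1.items()}
        merged.update({s: "1" + c for s, c in t2.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


codes = huffman_codes("aaaabbc")
```

In this example the most frequent symbol receives a one bit codeword, while the two least frequent symbols receive two bit codewords that differ only in their last bit.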


Lossless Predictive Coding

Predictive Coding techniques constitute another example of the exploitation of inter pixel redundancy, in which the basic idea is to encode only the new information in each pixel. This new information is usually defined as the difference between the actual and the predicted value of that pixel. The key component is the predictor, whose function is to generate an estimated (predicted) value for each pixel of the input image based on previous pixel values.

The predictor's output is rounded to the nearest integer and compared with the actual pixel value; the difference between the two is called the Prediction Error, which is then encoded by a Variable Length Coding (VLC) encoder. Since prediction errors are likely to be smaller than the original pixel values, the VLC encoder will generate shorter codewords. Several local, global and adaptive prediction algorithms exist in the literature. In most cases the predicted pixel value is a linear combination of previous pixels.
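The simplest linear predictor uses the previous pixel as the prediction. The Python sketch below assumes this predictor, works on one image row, and starts from a prediction of 0 for the first pixel; the function names are hypothetical and the variable length coding of the errors is left out.

```python
def predictive_encode(row):
    """Replace each pixel by its prediction error (pixel minus previous pixel)."""
    errors, previous = [], 0
    for pixel in row:
        errors.append(pixel - previous)
        previous = pixel
    return errors


def predictive_decode(errors):
    """Rebuild the row by adding each error to the running prediction."""
    row, previous = [], 0
    for error in errors:
        previous += error
        row.append(previous)
    return row


pixels = [100, 102, 101, 105]
errors = predictive_encode(pixels)
```

After the first sample, the errors are small numbers clustered around zero, which is what allows the variable length encoder to assign them short codewords.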

Lempel Ziv Welch (LZW) Coding

One of the most common algorithms used in computer graphics is the Lempel Ziv Welch (LZW) [56] compression scheme. This lossless method of data compression is found in several image file formats, such as Graphics Interchange Format (GIF) and Tagged Image File Format (TIFF), and it is also part of the V.42bis modem compression standard and PostScript Level 2. In 1977, Abraham Lempel and Jacob Ziv created the first of the LZ family of substitution compressors. The LZ78 compression algorithms are more commonly used to compress binary data, such as bitmaps. In 1984, while working for Unisys, Terry Welch modified the LZ78 compressor for implementation in high performance disk controllers. The result was the LZW algorithm that is commonly found today. LZW is a general compression algorithm capable of working on almost any type of data. It is generally fast in both compressing and decompressing data and does not require the use of floating point operations. Because LZW writes compressed data as bytes and not as words, LZW encoded output can be identical on both Big Endian and Little Endian systems, although one may still encounter bit order and fill order problems.

LZW is referred to as a substitutional or dictionary based encoding algorithm. The algorithm builds a data dictionary (also called a translation table or string table) of data occurring in an uncompressed data stream. Patterns of data (substrings) are identified in the data stream and are matched to entries in the dictionary. If the substring is not present in the dictionary, a code phrase is created based on the data content of the substring and is stored in the dictionary. The phrase is then written to the compressed output stream. When a re-occurrence of a substring is identified in the data, the phrase of the substring already stored in the dictionary is written to the output. Because the phrase value has a physical size that is smaller than the substring it represents, data compression is achieved.
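The dictionary mechanism can be sketched in Python as follows. This is a textbook style illustration, not the variant used by any particular file format: the function names are hypothetical, the dictionary is assumed to start with the 256 single byte characters, codes are kept as plain integers instead of being packed into a bit stream, and the dictionary is allowed to grow without limit.

```python
def lzw_encode(text):
    """Encode a string of characters with code points below 256 into dictionary codes."""
    table = {chr(i): i for i in range(256)}
    phrase, codes = "", []
    for ch in text:
        if phrase + ch in table:
            phrase += ch                          # keep extending the known substring
        else:
            codes.append(table[phrase])           # emit the code of the known substring
            table[phrase + ch] = len(table)       # store the new substring
            phrase = ch
    if phrase:
        codes.append(table[phrase])
    return codes


def lzw_decode(codes):
    """Rebuild the string; the decoder reconstructs the same dictionary on the fly."""
    if not codes:
        return ""
    table = {i: chr(i) for i in range(256)}
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == len(table):                  # code defined by the step in progress
            entry = previous + previous[0]
        else:
            raise ValueError("invalid LZW code")
        out.append(entry)
        table[len(table)] = previous + entry[0]
        previous = entry
    return "".join(out)


message = "TOBEORNOTTOBEORTOBEORNOT"
codes = lzw_encode(message)
```

The dictionary is never transmitted: the decoder adds the same entries in the same order as the encoder, which is why only the codes need to be stored.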

Arithmetic Coding

Arithmetic Coding [57] is a variable length source encoding technique. In traditional Entropy encoding techniques such as Huffman coding, each input symbol in a message is substituted by a specific code made up of an integer number of bits. Arithmetic Coding deviates from this paradigm: a sequence of input symbols is represented by an interval of real numbers between '0.0' and '1.0'. It offers superior compression efficiency and more flexibility than the popular Huffman coding, but it requires more computational power and memory.
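How a message narrows the interval can be shown with exact fractions in Python. The sketch below covers only the interval computation, not the generation of the output bits; the name `coding_interval` is hypothetical, and each symbol is assumed to own a slice of [0, 1) proportional to its probability, with slices laid out in sorted symbol order.

```python
from fractions import Fraction


def coding_interval(message, probabilities):
    """Return the [low, high) interval that represents the whole message."""
    ranges, cumulative = {}, Fraction(0)
    for symbol in sorted(probabilities):
        ranges[symbol] = (cumulative, cumulative + probabilities[symbol])
        cumulative += probabilities[symbol]
    low, width = Fraction(0), Fraction(1)
    for symbol in message:
        start, end = ranges[symbol]
        # Shrink the current interval to the slice owned by this symbol
        low, width = low + width * start, width * (end - start)
    return low, low + width


model = {"a": Fraction(1, 2), "b": Fraction(1, 2)}
interval = coding_interval("ab", model)
```

Any number inside the final interval identifies the message, and the interval width equals the product of the symbol probabilities, so likely messages need fewer bits.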

Embedded Zero tree Wavelet (EZW) Coding

For a 1D Wavelet Transform, the vector of wavelet coefficients can be divided into sub-bands after the wavelet decomposition, as shown in Fig. 3.26. Similarly, a block of two dimensional wavelet coefficients can be divided into sub-bands as shown in Fig. 3.27. The EZW encoder was specially designed by Shapiro for use with Wavelet Transforms. In fact, EZW coding is more like a quantization method. It was originally designed to operate on images (2D signals), but it can also be used on signals of other dimensions. The EZW [58] encoder is based on progressive encoding, which compresses an image into a bit stream with increasing accuracy.

Fig. 3.26: Sub-bands after 1D Wavelet Decomposition

Fig. 3.27: Sub-bands in a Wavelet Transform Block after 2D Wavelet Decomposition

Set Partitioning In Hierarchical Tree (SPIHT) Coding

The SPIHT [59] coder is a highly refined version of the EZW algorithm. It is a powerful image compression algorithm that produces an embedded bit stream from which the best reconstructed images in the Mean Squared Error (MSE) sense can be extracted at various bit rates. For a wide variety of images it gives high Peak Signal to Noise Ratio (PSNR) values at a given CR, and it has therefore become a benchmark algorithm for image compression. The Parent child relationship in the SPIHT coding algorithm is shown in Fig. 3.28.


Fig. 3.28: Parent child relationship in SPIHT

Wavelet decomposition of the image data results in a tree of coefficients. Each coefficient has four children, except the red marked coefficients in the LL sub-band and the coefficients in the highest frequency sub-bands (LH1, HL1 and HH1). The following sets of coordinates of coefficients are used to represent the set partitioning method in the SPIHT algorithm [60]. The location of a coefficient is denoted by (i, j), where 'i' and 'j' indicate the row and column indices respectively.

3.6.2 Lossy Coding Techniques

Transform Coding

The techniques discussed so far work directly on the pixel values and are usually called Spatial domain techniques [61] (the term refers to the image plane itself, and methods in this category are based on direct manipulation of the pixels of an image). Transform Coding techniques use a reversible, linear mathematical transform to map the pixel values onto a set of coefficients, which are then quantized and encoded.

For most natural images, many of the resulting coefficients have small magnitudes and can be quantized coarsely (or discarded altogether) without causing significant distortion in the decoded image. Different mathematical transforms, such as the DFT, the Walsh Hadamard Transform (WHT) and the Karhunen Loeve Transform (KLT), have been considered for the task. For compression purposes, the better a transform packs the information into few coefficients, the better it is; for this reason, the Discrete Cosine Transform (DCT) has become the most widely used transform coding technique.

Embedded Block Coding with Optimal Truncation (EBCOT) Coding

In JPEG 2000, Entropy coding is performed by the EBCOT algorithm, introduced in 1998 by David Taubman. Every sub-band is partitioned into small blocks (for example 64 x 64 or 32 x 32) called Code Blocks. Every Code Block is encoded independently of the others, producing an elementary embedded bit stream. The algorithm finds optimal truncation points for these bit streams in order to minimize the distortion and to support scalability.

Fractal Coding

Fractal compression is a lossy compression method for digital images based on

fractals. The method is best suited for textures and natural images, because of the fact that

parts of an image often resemble other parts of the same image. Fractal algorithms convert

these parts into mathematical data called Fractal Codes, which are used to recreate the

encoded image.

Chroma Coding

Chroma sub-sampling is the practice of encoding images with lower resolution for chroma information than for luma information, taking advantage of the human visual system's lower acuity for color differences than for luminance. It is used in many video encoding schemes, both analog and digital, and also in JPEG encoding.


Zonal Coding

Zonal Coding is based on the premise that the transformed coefficients having very high variances carry most of the signal and should therefore be retained, whereas those with low variance can be truncated. In Zonal Coding, it is therefore necessary to compute the variance at every position of the transformed array, based either on an ensemble of representative blocks of transformed arrays or on a global image model, such as the Gauss Markov model. 'M' transform coefficients are retained based on high values of variance. The retained coefficients have a value of '1' in the binary zonal mask, whereas all truncated coefficients have a value of '0'. A typical Zonal Coding mask for an 8 x 8 block is shown in Fig. 3.29.

Fig. 3.29: A Typical Zonal Coding mask for an 8 X 8 block

Each block has the same zonal mask. These masks can be customized for individual images, in which case the mask information needs to be encoded with the image. Two different bit allocation policies exist for encoding the retained coefficients. In the first, the same number of bits is assigned to each retained coefficient; each coefficient is normalized by its standard deviation and then uniformly quantized. In the second, the number of bits allocated to each retained coefficient is based on the variance of that coefficient, with more bits allocated to the coefficients having high variance. In this approach, optimal Lloyd Max quantizers are designed for every retained coefficient.

Threshold Coding

Zonal Coding is often applied with a fixed mask, which may not be optimal for all blocks and for all images. For better coding performance, the positions and the number of retained coefficients should be changed adaptively from block to block, based on each block's transformed coefficient array.

Such adaptive bit allocation is done using the Threshold Coding approach, which is more often used in practice and is based on the premise that the transform coefficients of largest magnitude make the most significant contribution to the reconstructed block quality. Only those transform coefficients whose magnitudes exceed a threshold are significant; all the remaining ones can be discarded for image reconstruction.
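A minimal sketch of the Threshold Coding selection rule is given below. The function name threshold_code and the single global threshold value are assumptions made for illustration; practical coders may vary the threshold per block or per coefficient position.

```python
def threshold_code(block, threshold):
    """Keep only coefficients whose magnitude exceeds the threshold.

    Returns the thresholded block and the binary mask of retained
    positions. Hypothetical helper, for illustration only.
    """
    mask = [[1 if abs(v) > threshold else 0 for v in row] for row in block]
    coded = [[v if keep else 0 for v, keep in zip(row, mask_row)]
             for row, mask_row in zip(block, mask)]
    return coded, mask


# Two toy 2 x 2 transformed blocks coded with the same threshold.
block_a = [[50, -12], [3, 1]]
block_b = [[40, 2], [-15, 0]]
for block in (block_a, block_b):
    coded, mask = threshold_code(block, threshold=10)
    print(coded, mask)
```

Unlike the fixed zonal mask, the retained positions differ between the two blocks, so the mask has to be conveyed to the decoder for each block.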

3.7 Experimental results and Performance analysis of the proposed method

Experimental results

The proposed system, a real-time platform, consists of three main parts: a CMOS image sensor, an FPGA and a PC. The functions of the image sensor are fully programmable via the I2C serial control bus.

Fig. 3.30: Brain tumor MR Image

The real-time image is first captured by the image sensor and then output to the FPGA over the I2C bus. The transform circuit in the FPGA processes the captured image by performing a 3-level LDWT. The transformed image data of each level are stored in the SRAM of the FPGA and then displayed on the PC. The experimental result of the Lifting DWT system is shown in Fig. 3.30, which is the image obtained by applying the 3-level Lifting DWT to the original sample image.
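The idea of a multi-level lifting decomposition can be sketched in one dimension as shown below. This is only an illustration under stated assumptions: it uses the integer Haar (S-transform) lifting steps rather than the filter of the proposed architecture, the function names lift_forward, lift_inverse, ldwt and ildwt are hypothetical, and the input length is assumed to be divisible by 2 raised to the number of levels. A separable 3D transform would apply such a 1D step along the rows, the columns and the frame direction in turn.

```python
def lift_forward(signal):
    """One lifting level: split, predict, update (integer Haar)."""
    even = signal[0::2]
    odd = signal[1::2]
    detail = [o - e for o, e in zip(odd, even)]            # predict step
    approx = [e + (d >> 1) for e, d in zip(even, detail)]  # update step
    return approx, detail


def lift_inverse(approx, detail):
    """Undo one lifting level by reversing the update and predict steps."""
    even = [s - (d >> 1) for s, d in zip(approx, detail)]
    odd = [d + e for d, e in zip(detail, even)]
    out = []
    for e, o in zip(even, odd):
        out.extend((e, o))
    return out


def ldwt(signal, levels=3):
    """Multi-level decomposition: each level re-transforms the approximation."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, d = lift_forward(approx)
        details.append(d)
    return approx, details


def ildwt(approx, details):
    """Reconstruct the signal from the coarsest level back to the finest."""
    for d in reversed(details):
        approx = lift_inverse(approx, d)
    return approx


row = [12, 14, 20, 18, 7, 9, 30, 31]  # toy row of 8 pixel values
approx, details = ldwt(row, levels=3)
print(approx, details)
print(ildwt(approx, details) == row)
```

Because every lifting step is inverted exactly using integer arithmetic, the reconstruction is lossless, which is one reason lifting schemes are attractive for medical images.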

Performance analysis

Fig. 3.31 shows the simulation result of the 3D Lifting based DWT performed on the transformed brain MR image shown in Fig. 3.30.

Fig. 3.31: 3D LDWT Simulation Results

Table 3.5 shows the synthesis report of the proposed method, giving the hardware resource utilization of the 3D Lifting DWT [8-9][57][62].

Table 3.5: 3D Lifting based DWT Synthesis Report

Logic Utilization                                 Used    Available   Utilization
Number of Slice Flip Flops                        444     1,920       23%
Number of 4 input LUTs                            561     1,920       29%
Number of occupied Slices                         484     960         50%
Number of Slices containing only related logic    484     484         100%
Number of Slices containing unrelated logic       0       484         0%
Total Number of 4 input LUTs                      592     1,920       30%
    Number used as logic                          337
    Number used as route-thru                     31
    Number used for Dual Port RAMs                224
Number of bonded IOBs                             66      83          79%
Number of BUFGMUXs                                1       24          4%
Average Fanout of Non-Clock Nets                  3.01
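The utilization figures in Table 3.5 can be cross-checked with a short calculation. The sketch below assumes that each percentage is the ratio of used to available resources truncated to a whole number; this is inferred from the reported values and is not stated in the synthesis report itself. The helper name utilization_percent is hypothetical.

```python
def utilization_percent(used, available):
    """Used/available as a whole-number percentage, truncated (assumed)."""
    return (100 * used) // available


# (used, available) pairs copied from Table 3.5.
report = {
    "Slice Flip Flops": (444, 1920),
    "4 input LUTs": (561, 1920),
    "Occupied Slices": (484, 960),
    "Total 4 input LUTs": (592, 1920),
    "Bonded IOBs": (66, 83),
    "BUFGMUXs": (1, 24),
}
for name, (used, available) in report.items():
    print(f"{name}: {utilization_percent(used, available)}%")

# The total LUT count is the sum of its three sub-entries in the table.
lut_breakdown = {"logic": 337, "route-thru": 31, "Dual Port RAMs": 224}
print(sum(lut_breakdown.values()))
```

The computed values agree with the Utilization column, and the three LUT sub-entries add up to the reported total of 592.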

The proposed architecture was successfully synthesized on a Spartan 3 FPGA. The performance of this real-time platform, including the memory interface circuit, is shown in Table 3.5. The performance of the proposed 3D LDWT architecture is compared with that of the 2D LDWT architecture. The device utilization of both architectures is summarized in Table 3.6, from which it is concluded that the 3D LDWT is more efficient than the 2D LDWT with respect to the device utilization parameters presented.

3.8 Inference

The proposed architecture is simulated using Verilog HDL [63] and implemented on an FPGA [64][65]. The functions of the image sensor are fully programmable via the I2C serial control bus, and the processed image data are displayed on the PC. The architecture is pipelined, which reduces the computation time and increases the speed. However, the simulation and synthesis results show that the LUT utilization has increased to 29%. The PSNR obtained is around 40 dB and the maximum CR is achieved, which makes the system suitable for the required application of compressing medical images. The computation time is observed to be only 36 ms, which is much less than that of the conventional technique. Hence, this architecture is more suitable for the compression of medical images [66] when the Lifting based DWT [67-70] is chosen as the compression technique.
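For reference, the two quality measures quoted above can be computed as in the following sketch. It uses the commonly used definitions of PSNR and compression ratio and assumes 8-bit pixels (peak value 255) and images flattened into lists; the function names psnr and compression_ratio and the sample values are illustrative and are not taken from the experiments reported here.

```python
import math


def psnr(original, reconstructed, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)
    if mse == 0:
        return math.inf  # identical images
    return 10 * math.log10(peak ** 2 / mse)


def compression_ratio(original_bits, compressed_bits):
    """Ratio of the uncompressed size to the compressed size."""
    return original_bits / compressed_bits


# Illustrative values only.
original = [100, 100, 100, 100]
reconstructed = [101, 99, 100, 100]
print(round(psnr(original, reconstructed), 2))
print(compression_ratio(8000, 1000))
```

A higher PSNR indicates a reconstruction closer to the original, while a higher CR indicates a smaller compressed representation.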

Table 3.6: Device utility comparison of 2D LDWT and 3D LDWT

Device Utilization parameters                     Utilization %
                                                  2D LDWT    3D LDWT
Number of slice flip flops                        4%         23%
Number of 4 input LUTs                            5%         29%
Number of occupied slices                         9%         50%
Number of slices containing related logic         100%       100%
Number of slices containing unrelated logic       0%         0%
Total number of 4 input LUTs                      5%         30%
Number of bonded IOBs                             44%        79%
Number of BUFGMUXs                                4%         4%
