CHAPTER-3
DESIGN AND IMPLEMENTATION OF LIFTING BASED 3D DISCRETE
WAVELET TRANSFORM FOR IMAGE COMPRESSION
3.1 Introduction
To deal with medical image processing, it is a pre-requisite to understand the
concepts of basic transformation techniques applied to signal processing and also various
compression techniques that are applied to images of different types.
3.1.1 Fourier Transform (FT)
The Fourier Transform [36] has played an important role in signal processing for
many years. The 2D Fourier Transform is a powerful tool used to enhance, restore,
encode and describe images. Image processing methods fall into two broad classes:
spatial domain approaches and frequency domain approaches. Many spatial domain
approaches involve more computation, whereas frequency domain approaches such as the
Fourier Transform are often more flexible and involve less computation. The Fourier
Transform is linear: it possesses the properties of homogeneity and additivity, as shown
in Fig. 3.1 and Fig. 3.2 respectively. It is nevertheless necessary to analyze the
other image transforms as well.
The Fast Fourier Transform (FFT) [36], an efficient algorithm for computing the Discrete
Fourier Transform (DFT), is used in many image processing applications to reduce
computational cost. The FFT is easy to implement using the successive doubling
technique and hence holds an important place in image processing applications.
The Fourier Transform of x(t), denoted by X(f), is defined in Eq. 3.1.
$X(f) = \int_{-\infty}^{+\infty} x(t)\, e^{-j 2\pi f t}\, dt$ … (3.1)
where x(t) is a continuous function of the real variable 't', $j = \sqrt{-1}$ and the variable 'f' is
the frequency. The original signal x(t) can be recovered by applying the Inverse Fourier
Transform (IFT). The Fourier Transform identifies all spectral components present in the
signal; however, it does not provide any information regarding the temporal (time)
localization of the components.
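The homogeneity and additivity properties mentioned above can be checked numerically on the discrete counterpart of Eq. 3.1. The following is a minimal sketch, assuming a naive O(N^2) DFT written for illustration only (the helper names `dft` and `close` and the sample vectors are hypothetical, not part of the proposed design).

```python
import cmath

def dft(x):
    """Naive O(N^2) Discrete Fourier Transform of a sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * f * t / n) for t in range(n))
            for f in range(n)]

def close(a, b, tol=1e-9):
    """True when two complex sequences agree element by element."""
    return all(abs(p - q) < tol for p, q in zip(a, b))

# Illustrative inputs (arbitrary values chosen for the sketch).
x1 = [1.0, 2.0, 0.0, -1.0]
x2 = [0.5, -0.5, 3.0, 1.0]
k = 2.5

# Homogeneity: DFT(k * x1) equals k * DFT(x1).
homogeneity_holds = close(dft([k * v for v in x1]), [k * v for v in dft(x1)])

# Additivity: DFT(x1 + x2) equals DFT(x1) + DFT(x2).
additivity_holds = close(dft([a + b for a, b in zip(x1, x2)]),
                         [a + b for a, b in zip(dft(x1), dft(x2))])

print(homogeneity_holds, additivity_holds)
```

A practical implementation would use an FFT routine instead of the double loop; the naive form is kept here only because it mirrors the definition directly.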
Fig. 3.1: Homogeneity Property
Fig. 3.2: Additivity Property
3.1.2 Fourier Transform Limitations
Signals are classified into two types, viz. Stationary and
Non-Stationary. Non-Stationary signals are those whose spectral components vary with
time. The FT only reveals which spectral components exist in the signal; it does not
provide any information on when those components occur.
The reason is that the basis function $e^{-j\omega t}$ extends to infinity in time, and hence the signal can
only be analyzed globally. In order to obtain time-localization of the spectral components, the signal
needs to be analyzed locally. This can be achieved by the Short Time Fourier Transform.
3.1.3 Short Time Fourier Transform (STFT)
The Short Time Fourier Transform, alternatively called the Short Term Fourier
Transform [37], is a Fourier related transform used to determine the sinusoidal frequency
and phase content of local sections of a signal as it changes over time. In the
continuous time case, the function to be transformed is multiplied by a window
function which is non-zero for only a short period of time. The Fourier Transform of the
resulting signal is taken as the window slides along the time axis, resulting in a two
dimensional representation of the signal. Mathematically, this is written in Eq. 3.2.
$\mathrm{STFT}_x^{w}(\tau, \omega) = X(\tau, \omega) = \int_{t} x(t)\, w(t - \tau)\, e^{-j\omega t}\, dt$ … (3.2)
where w(t) is the window function, commonly a Hann window or a Gaussian bell centered
around zero, and x(t) is the signal to be transformed. X(τ, ω) is essentially the Fourier
Transform of x(t)w(t − τ), a complex function representing the phase and magnitude of the
signal over time and frequency. Often, phase unwrapping is employed along either or
both of the time axis 'τ' and the frequency axis 'ω' to suppress any jump discontinuity in the
phase result of the STFT. The time index 'τ' is normally considered to be slow time.
The main advantage of the STFT is that it gives a time frequency description of the
signal and overcomes the difficulties of the Fourier Transform by using windowing
functions.
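The time localization that the STFT adds can be illustrated with a discrete sketch of Eq. 3.2: a Hann window is slid over a signal whose frequency changes halfway through, and the dominant frequency bin is reported per window position. The frame length, hop size and test signal below are assumptions chosen for illustration.

```python
import cmath
import math

def hann(n):
    """Hann window of length n (symmetric form)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def stft(x, win_len, hop):
    """Return one spectrum (bins 0..win_len/2) per window position."""
    w = hann(win_len)
    frames = []
    for start in range(0, len(x) - win_len + 1, hop):
        seg = [x[start + i] * w[i] for i in range(win_len)]
        frames.append([sum(seg[t] * cmath.exp(-2j * math.pi * f * t / win_len)
                           for t in range(win_len))
                       for f in range(win_len // 2 + 1)])
    return frames

def peak_bin(frame):
    """Index of the frequency bin with the largest magnitude."""
    mags = [abs(c) for c in frame]
    return mags.index(max(mags))

# Non-stationary test signal: 4 cycles per 32 samples, then 8 cycles per 32 samples.
n = 128
x = [math.sin(2 * math.pi * 4 * t / 32) if t < n // 2
     else math.sin(2 * math.pi * 8 * t / 32)
     for t in range(n)]

frames = stft(x, win_len=32, hop=32)
peaks = [peak_bin(f) for f in frames]
print(peaks)  # dominant bin for each of the four window positions
```

A plain Fourier Transform of the whole signal would show both frequencies but not the order in which they occur; the per-frame peaks recover that order, at the fixed resolution set by the window length.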
3.1.4 Evolution of Wavelets and Wavelet Transform (WT)
Evolution of Wavelets
Wavelet Transform (WT)
The Wavelet Transform has gained widespread acceptance in signal processing
and image compression. Because of their inherent multi-resolution nature, wavelet coding
schemes are especially suitable for applications where scalability and tolerable
degradation are important. Wavelet compression is a form of data compression well suited
for image compression and sometimes for video and audio compression. The transformation
of a signal is just another form of representing the signal; it does not change the
information content present in the signal. The Wavelet Transform provides a time
frequency representation of the signal. It was developed to overcome the shortcoming of
the Short Time Fourier Transform (STFT), which can also be used to analyze
non-stationary signals. While the STFT gives a constant resolution at all frequencies, the Wavelet
Transform uses a multi-resolution technique by which different frequencies are analyzed
with different resolutions.
A wave is an oscillating function of time or space and is periodic. In contrast,
wavelets are localized waves. They have their energy concentrated in time or space and
are suited to analysis of transient signals. While Fourier Transform and STFT use waves to
analyze signals, the Wavelet Transform uses wavelets of finite energy.
Wavelet compression methods are well suited to
representing transients, such as percussion sounds in audio or high frequency components
in two dimensional images, for example an image of stars in a night sky. This means that
the transient elements of a data signal can be represented by a smaller amount of
information than would be the case if some other transform, such as the more
widespread Discrete Cosine Transform, had been used.
First, a Wavelet Transform is applied to the image, which produces as
many coefficients as there are pixels in the image (i.e. there is no compression yet since it
is only a transform). These coefficients can then be compressed more easily, because the
information is statistically concentrated in just a few coefficients. This principle is called
Transform Coding. After that, the coefficients are quantized and the quantized values
are Entropy encoded and/or Run Length encoded. A few 1D and 2D applications of
wavelet compression use a technique called Wavelet Footprints. The Wavelet Transform
can provide the frequencies of a signal and the times associated with those frequencies,
making it very convenient for application in numerous fields.
Fig. 3.3: Demonstration of (a) Wave and (b) Wavelet
A wavelet, shown in Fig. 3.3, is a waveform of effectively limited duration that has
zero average value. Wavelet analysis is the decomposition of a function onto shifted and
scaled versions of the basic wavelet. A wavelet is thus a wave shaped function of
limited length with zero mean value, as expressed in Eq. 3.3. The zero mean is equivalent to
the spectrum of the wavelet vanishing at zero frequency, and it is a consequence of the condition for the
existence of the Inverse Wavelet Transform.
$\int_{-\infty}^{+\infty} \psi(x)\, dx = \hat{\psi}(0) = 0$ … (3.3)
Unlike a sine wave, wavelets are generally irregular and asymmetrical, as shown in
Fig. 3.4.
Fig. 3.4: Sine function and a wavelet
It is intuitively clear that functions with sharp changes can be analyzed better
using short irregular waves than with a smooth infinite sine. The wavelet basis $\{\psi_{j,k}(x)\}_{j,k}$
is generated by translation and dilation, $\psi(2^{-j}x - k)$, of the basic ("mother") wavelet
ψ(x). If the basic wavelet ψ(x), where $\psi(x) \equiv \psi_{0,0}(x)$, starts at x = 0 and ends
at x = N − 1, the shifted wavelet $\psi_{0,k}$ starts at x = k and ends
at x = k + N − 1. The scaled wavelet $\psi_{j,0}$ starts at x = 0 and
ends at $x = 2^{j}(N - 1)$. Its argument is scaled by a factor of $2^{-j}$, so its graph is compressed or expanded
depending on the sign of 'j' (Eq. 3.4), while the graph of the wavelet
$\psi_{0,k}$ is translated to the right by k if k > 0 (Eq. 3.5).
Scaling: $\psi_{j,0}(x) = 2^{-j/2}\, \psi(2^{-j}x)$ … (3.4)
Translation: $\psi_{0,k}(x) = \psi(x - k)$ … (3.5)
The basis wavelet generated by scaling the basic wavelet 'j' times and shifting it by 'k'
is given by Eq. 3.6.
$\psi_{j,k}(x) = 2^{-j/2}\, \psi(2^{-j}x - k)$ ... (3.6)
The multiplier $2^{-j/2}$ is a normalizing factor, so that the $L^2$ norm of the wavelet is
equal to one. The space of details on the j-th resolution level, $W_j$, contains functions that are
linear combinations of the wavelets $\psi_{j,k}(x)$.
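The role of the normalizing factor in Eq. 3.6 can be verified with a one-line change of variables. The derivation below is a short worked check, assuming the mother wavelet itself has unit norm, $\|\psi\|_2 = 1$, and using the substitution $u = 2^{-j}x - k$ (so that $dx = 2^{j}\,du$).

```latex
\begin{aligned}
\|\psi_{j,k}\|_2^2
  &= \int_{-\infty}^{\infty} \left| 2^{-j/2}\, \psi(2^{-j}x - k) \right|^2 dx \\
  &= 2^{-j} \int_{-\infty}^{\infty} |\psi(u)|^2 \, 2^{j}\, du \\
  &= \int_{-\infty}^{\infty} |\psi(u)|^2 \, du \;=\; \|\psi\|_2^2 \;=\; 1 .
\end{aligned}
```

The factor $2^{j}$ introduced by the dilation is cancelled exactly by the square of $2^{-j/2}$, independently of the shift k.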
Wavelet analysis is carried out in a way similar to STFT analysis. The signal to be
analyzed is multiplied with a wavelet function, just as it is multiplied with a window
function in the STFT, and the transform is then computed for each segment generated.
However, unlike the STFT, in the Wavelet Transform the width of the wavelet function changes
with each spectral component.
The Wavelet Transform [38] gives good time resolution and
poor frequency resolution at high frequencies, and good frequency resolution and
poor time resolution at low frequencies. The Wavelet Transform is further classified into the Continuous Wavelet
Transform and the Discrete Wavelet Transform.
3.1.5 Continuous Wavelet Transform (CWT)
A Continuous Wavelet Transform [37] is used to divide a continuous-time function
into wavelets. Unlike the FT, the CWT can construct a time frequency
representation of a signal that offers very good time and frequency localization. CWTs are
particularly helpful in tackling problems involving signal identification and the detection of
hidden transients (hard to detect, short lived elements of a signal). The CWT of a signal
x(t) with respect to an Analyzing Wavelet ψ(t) is given by Eq. 3.7.
$\mathrm{CWT}_x^{\psi}(\tau, s) = \Psi_x^{\psi}(\tau, s) = \frac{1}{\sqrt{|s|}} \int x(t)\, \psi^{*}\!\left(\frac{t - \tau}{s}\right) dt$ … (3.7)
where
τ = Translation parameter
s = 1/f = Scaling parameter
ψ(t) = Mother wavelet
The kernel functions used in the Wavelet Transform are obtained from one prototype function,
known as the Mother Wavelet, by scaling and/or translating it. The modified equation is
shown in Eq. 3.8.
$W(a, b) = \frac{1}{\sqrt{a}} \int x(t)\, \psi_{a,b}(t)\, dt, \qquad \psi_{a,b}(t) = \psi\!\left(\frac{t - b}{a}\right)$ … (3.8)
where
a = Scale parameter
b = Translation parameter
In order to be a wavelet, a function must satisfy the two conditions given
in Eq. 3.8.1 and Eq. 3.8.2.
$\int_{-\infty}^{\infty} \psi(t)\, dt = 0$ ... (3.8.1)
$\int_{-\infty}^{\infty} |\psi(t)|^{2}\, dt < \infty$ ... (3.8.2)
3.1.6 Discrete Wavelet Transform (DWT)
In numerical and functional analysis, a DWT is any Wavelet Transform for which
the wavelets are discretely sampled. Its key advantage over the FT is temporal
resolution, because it captures both frequency and location information (location in time).
1D Discrete Wavelet Transform
The Discrete Wavelet Transform (DWT) [39] transforms a discrete time signal
to a discrete wavelet representation. Initially, the wavelet parameters are discretized to
reduce the continuous basis set of wavelets to a discrete and orthogonal/orthonormal set
of basis wavelets, given by Eq. 3.9.
$\psi_{m,n}(t) = 2^{m/2}\, \psi(2^{m}t - n); \quad m, n \in \mathbb{Z}$ such that $m > -\infty$, $n < \infty$ … (3.9)
The 1D DWT is given by the inner product of the signal x(t) being transformed with each
of the discrete basis functions, as written in Eq. 3.10.
$W_{m,n} = \langle x(t), \psi_{m,n}(t) \rangle; \quad m, n \in \mathbb{Z}$ ... (3.10)
The 1D Inverse DWT is given by Eq. 3.11.
$x(t) = \sum_{m} \sum_{n} W_{m,n}\, \psi_{m,n}(t); \quad m, n \in \mathbb{Z}$ … (3.11)
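The continuous transform of Eq. 3.8 and the zero-mean condition of Eq. 3.8.1 can be illustrated numerically. The sketch below assumes the Haar function as the mother wavelet and approximates the integral with a midpoint rule; the test signals, the integration range and the step count are illustrative choices, and the complex conjugate is omitted because everything is real valued.

```python
import math

def haar(t):
    """Haar mother wavelet: +1 on [0, 0.5), -1 on [0.5, 1), 0 elsewhere."""
    if 0.0 <= t < 0.5:
        return 1.0
    if 0.5 <= t < 1.0:
        return -1.0
    return 0.0

def cwt(x, a, b, t0, t1, steps=4000):
    """Midpoint-rule estimate of W(a, b) = a**-0.5 * integral of x(t) * psi((t - b) / a)."""
    dt = (t1 - t0) / steps
    total = 0.0
    for i in range(steps):
        t = t0 + (i + 0.5) * dt
        total += x(t) * haar((t - b) / a)
    return total * dt / math.sqrt(a)

def flat(t):
    """Constant signal."""
    return 3.0

def step(t):
    """Signal with a single jump at t = 2."""
    return 1.0 if t >= 2.0 else 0.0

w_flat = cwt(flat, a=1.0, b=1.0, t0=0.0, t1=4.0)   # zero-mean wavelet ignores constants
w_far = cwt(step, a=0.5, b=0.5, t0=0.0, t1=4.0)    # wavelet support [0.5, 1.0], far from the jump
w_edge = cwt(step, a=1.0, b=1.5, t0=0.0, t1=4.0)   # wavelet support [1.5, 2.5], straddles the jump
print(w_flat, w_far, w_edge)
```

Only the coefficient whose wavelet overlaps the jump is non-zero, which is the localization property that makes the transform useful for detecting transients.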
2D Discrete Wavelet Transform
The 1D DWT can be extended to a 2D transform [40] using separable wavelet filters.
With separable filters, the 2D transform shown in Fig. 3.5(b) can be computed by applying a
1D transform to all the rows of the input, as shown in Fig. 3.5(a), and then repeating it
on all of the columns. When a one level 2D DWT is applied to an image, four transform
coefficient sets are created. As depicted in Fig. 3.5(c), the four sets are LL, HL, LH and
HH, where the first letter corresponds to applying either a Low pass or a High pass filter to
the rows and the second letter refers to the filter applied to the columns.
Fig. 3.5: Illustration of 1D DWT applied to input image
Fig. 3.6: DWT for Lena image (a) Original image (b) Output image after 1D DWT
applied on column input (c) Output image after 1D DWT applied on row input
The 2D DWT [41] converts images from the spatial domain to the frequency domain. At
each level of the wavelet decomposition, each column of an image is first transformed
using a 1D vertical analysis filter bank. The same filter bank is then applied horizontally
to each row of the filtered and sub-sampled data. One level of wavelet decomposition
produces four filtered and sub-sampled images, referred to as sub-bands.
The upper and lower areas of Fig. 3.6(b) represent the low pass and high pass
coefficients respectively after applying the vertical 1D DWT and sub-sampling to the input
image shown in Fig. 3.6(a). The result of the horizontal 1D DWT and sub-sampling, which
forms the 2D DWT output image, is shown in Fig. 3.6(c). Multiple levels of Wavelet
Transforms can be used to concentrate the data energy in the lowest sampled bands.
Specifically, the LL sub-band in Fig. 3.5(c) can be transformed again to form LL2, HL2,
LH2 and HH2 sub-bands, producing a two level Wavelet Transform.
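The separable procedure and the sub-band naming can be sketched in a few lines. The example below is a minimal illustration, assuming an averaging (unnormalized) Haar filter pair and the convention that the first letter of a sub-band name refers to the filter applied to the rows; the function names and the 4 x 4 test image are hypothetical.

```python
def haar_1d(v):
    """One level of an averaging Haar transform: low pass half followed by high pass half."""
    half = len(v) // 2
    low = [(v[2 * i] + v[2 * i + 1]) / 2 for i in range(half)]
    high = [(v[2 * i] - v[2 * i + 1]) / 2 for i in range(half)]
    return low + high

def transpose(m):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(col) for col in zip(*m)]

def dwt2_one_level(img):
    """Transform all rows, then all columns, and cut the result into four sub-bands."""
    rows_done = [haar_1d(row) for row in img]
    both = transpose([haar_1d(col) for col in transpose(rows_done)])
    h, w = len(img) // 2, len(img[0]) // 2

    def crop(r0, c0):
        return [row[c0:c0 + w] for row in both[r0:r0 + h]]

    # First letter: filter applied to the rows; second letter: filter applied to the columns.
    return {"LL": crop(0, 0), "HL": crop(0, w), "LH": crop(h, 0), "HH": crop(h, w)}

# Image that is constant over each 2 x 2 block, so all level-1 detail bands vanish.
img = [[10, 10, 20, 20],
       [10, 10, 20, 20],
       [30, 30, 40, 40],
       [30, 30, 40, 40]]

level1 = dwt2_one_level(img)
level2 = dwt2_one_level(level1["LL"])   # second level: transform the LL band again
print(level1["LL"], level2)
```

Because the test image has no variation inside any 2 x 2 block, all of its energy ends up in the LL band after the first level, and the second level then separates the remaining coarse variation into LL2, HL2, LH2 and HH2.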
An 'R-1' level wavelet decomposition is associated with 'R' resolution levels
numbered from '0' to 'R-1', with '0' and 'R-1' corresponding to the coarsest and finest
resolutions respectively. The straightforward convolution implementation of the 1D DWT requires a large
amount of memory and has high computational complexity. An alternative implementation of
the 1D DWT, known as the Lifting scheme, provides a significant reduction in memory
and computational complexity.
Lifting also allows in-place computation of the wavelet coefficients. Nevertheless,
the Lifting approach computes the same coefficients as the direct filter bank convolution.
To employ wavelets for image decomposition, the notion of time, which
has so far served as the free variable, is replaced with spatial position. In addition, the wavelet
framework has to deal with two dimensional signals. Although two dimensional
wavelets can be constructed, a more popular approach is to transform images using one
dimensional separable wavelets.
Using separable wavelets, one can apply the Wavelet Transform first in one direction
and then transform the result again in the other direction. In Fig. 3.7, the DWT is first
applied in the x-direction of an image, relocating the scaling coefficients to the left side and
the wavelet coefficients to the right side. Afterwards, the DWT is applied in the
y-direction on the resulting image, relocating the scaling coefficients to the top.
Fig. 3.7: Two dimensional transform with separable wavelets
Fig. 3.8: 2 Level decomposition of Lena image
Different filter banks can be used for each direction, if desired. After both
transformations, the upper left quadrant contains the original image at half the
resolution, while the other quadrants contain the refinement coefficients necessary to
bring the smaller image back to full scale. Each quadrant has its own basis
functions; thus the basis for separable 2D transforms consists of one scaling function $\Phi_x\Phi_y$
and three wavelet functions $\Psi_x\Phi_y$, $\Phi_x\Psi_y$ and $\Psi_x\Psi_y$. After executing the DWT in both
directions, the algorithm can be applied recursively to the lower resolution image.
Fig. 3.8 shows the wavelet coefficients of the Lena image after 2 levels of decomposition with the
Daubechies 4-tap wavelet; the original image, in the upper left, has been
scaled down to 25% resolution. The wavelet coefficients, especially those from level '1',
are so small that they are almost imperceptible (the grey levels have been contrast
enhanced for improved viewing).
This illustrates the efficiency of Wavelet Transforms for energy compaction.
Interestingly (and quite unlike the Fourier Transform), the wavelet coefficient quadrants
visually resemble the high resolution details of the image. The lower left quadrant has
mostly details for the x-direction, while the upper right has details for the y-direction.
The lower right quadrant has details from both directions (diagonal), but they are
almost too fine to be seen. The actual compression is accomplished by discarding
coefficients. One could, for instance, discard some of the quadrants in the decomposition. A
better strategy, however, is to selectively discard coefficients based on their magnitude. Since
larger coefficients probably have more impact on the reconstructed image, those are kept and
the smaller values are discarded, a process known as Thresholding.
With Hard Thresholding, a tolerance limit 'T' is selected and all
coefficients with absolute value smaller than 'T' are discarded. A variation on this scheme is
Quantile Thresholding, in which a percentage 'P' is selected and the smallest 'P' percent
of the values are discarded. With Soft Thresholding, the magnitude of all coefficients
is reduced by the amount 'T': the coefficients that are smaller than this value are reduced
to zero, while all the rest are brought closer to zero. Instead of subtraction,
integer division by 'Q' can be used. Again, all values smaller than 'Q' are reduced to
zero, while the rest are made smaller.
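The four strategies described above can be compared side by side on a small list of coefficients. This is a minimal sketch; the function names, the sample coefficients and the values of T, P and Q are illustrative assumptions, and the integer division is taken to truncate toward zero so that it treats positive and negative coefficients symmetrically.

```python
def hard_threshold(coeffs, t):
    """Discard (zero) every coefficient whose absolute value is smaller than t."""
    return [c if abs(c) >= t else 0.0 for c in coeffs]

def soft_threshold(coeffs, t):
    """Reduce every magnitude by t; magnitudes below t become zero."""
    out = []
    for c in coeffs:
        mag = max(abs(c) - t, 0.0)
        out.append(mag if c >= 0 else -mag)
    return out

def quantile_threshold(coeffs, p):
    """Discard the smallest p percent of the coefficients, ranked by magnitude."""
    n_drop = int(len(coeffs) * p / 100)
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]))
    drop = set(order[:n_drop])
    return [0.0 if i in drop else c for i, c in enumerate(coeffs)]

def quantize(coeffs, q):
    """Integer division by q, truncating toward zero."""
    return [int(c / q) for c in coeffs]

coeffs = [9.0, -0.4, 3.0, -6.0, 0.2, 1.5, -2.5, 0.1]

hard = hard_threshold(coeffs, 2.0)
soft = soft_threshold(coeffs, 2.0)
quant50 = quantile_threshold(coeffs, 50)
divided = quantize(coeffs, 2)
print(hard, soft, quant50, divided, sep="\n")
```

Hard and quantile thresholding leave the surviving coefficients untouched, whereas soft thresholding and division shrink them as well, which trades some reconstruction accuracy for a smaller range of values to encode.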
This strategy also limits the number of different values for the coefficients,
which in effect can make coding more efficient, since the number of bits required to
code the values can be reduced. The process of limiting the set of possible values used is
known as Quantization. A more advanced approach is to use different values of 'T'
or 'Q' for different sub-bands. Since the human visual system is less sensitive to high
frequencies, it is desirable to use a greater threshold value or a coarser quantization for the
fine detail sub-bands. An example of wavelet compression is shown in Fig. 3.9.
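Sub-band dependent quantization can be sketched as follows. The step sizes and coefficient values are hypothetical and chosen only to show a fine step for the LL band and a coarse step for the HH band; they are not values used in the proposed work.

```python
def quantize_subbands(subbands, steps):
    """Quantize each sub-band with its own step size (truncating toward zero)."""
    return {name: [[int(c / steps[name]) for c in row] for row in band]
            for name, band in subbands.items()}

def dequantize_subbands(qbands, steps):
    """Approximate reconstruction: multiply each index by its sub-band step size."""
    return {name: [[c * steps[name] for c in row] for row in band]
            for name, band in qbands.items()}

# Illustrative coefficients: large values in LL, small values in HH.
subbands = {"LL": [[100.0, 98.0]], "HH": [[3.0, -9.0]]}
steps = {"LL": 2, "HH": 8}          # assumed step sizes: finer for LL, coarser for HH

q = quantize_subbands(subbands, steps)
restored = dequantize_subbands(q, steps)
print(q, restored)
```

The LL values survive exactly in this example while the HH values are represented by at most one quantization level, which reflects the idea that errors in fine detail are less visible.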
Fig. 3.9: Lena image compressed with Daubechies 4-tap wavelet
(a) Original (b) 80% compressed image (c) 96% compressed image
In the middle image, Fig. 3.9(b), the smallest 80% of the wavelet coefficients have been
discarded before reconstructing the image (hard threshold). At this compression level,
there is no perceivable reduction in image quality. The only visual effect seems to be a
reduction of noise and a slight smoothing of texture. Fig. 3.9(c) is reconstructed from
only 4% of the original coefficients. The image is now composed of 2621 wavelets of
different sizes and positions, as compared to 65536 pixels in the standard representation.
Compression artefacts have now become apparent, but even at this high level of
compression the image is quite recognizable. In comparison, a JPEG representation at this
compression level would on average synthesize each patch from only 2.5 basis patterns.
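The figures quoted above follow from simple arithmetic. The check below assumes that the 65536 pixels correspond to a 256 × 256 image and that a JPEG patch is an 8 × 8 block described by 64 basis patterns; both are assumptions made for this worked example rather than statements from the experiment.

```latex
\begin{aligned}
256 \times 256 &= 65536 \ \text{pixels (and wavelet coefficients)} \\
0.04 \times 65536 &= 2621.44 \;\approx\; 2621 \ \text{retained wavelets} \\
0.04 \times 64 &= 2.56 \;\approx\; 2.5 \ \text{basis patterns per } 8 \times 8 \ \text{patch}
\end{aligned}
```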
Haar Wavelet Transform
The first DWT was invented by the Hungarian mathematician Alfred Haar [42] in
1909. For an input represented by a list of $2^n$ numbers (where 'n' represents the number of bits
of a pixel), the Haar Wavelet Transform may be considered to simply pair up input values,
storing the difference and passing the sum. This process is repeated recursively, pairing up
the sums to provide the next scale, finally resulting in $2^n - 1$ differences and one final
sum. Haar used these functions to give an example of a countable orthonormal system for
the space of square integrable functions on the real line. The study of wavelets, and even
the term Wavelet, did not come until much later. The Haar Wavelet is also the simplest
possible wavelet. The technical disadvantage of the Haar Wavelet is that it is not
continuous and therefore not differentiable. This property can, however, be an advantage
for the analysis of signals with sudden transitions, such as monitoring of tool failure in
machines. The mother wavelet function ψ(t) of the Haar Wavelet is described by Eq. 3.12.
$\psi(t) = \begin{cases} 1, & 0 \le t < \frac{1}{2} \\ -1, & \frac{1}{2} \le t < 1 \\ 0, & \text{otherwise} \end{cases}$ … (3.12)
Its scaling function φ(t) is described by Eq. 3.13.
$\varphi(t) = \begin{cases} 1, & 0 \le t < 1 \\ 0, & \text{otherwise} \end{cases}$ ... (3.13)
In functional analysis, the Haar system denotes the set of Haar Wavelets given in Eq. 3.14.
$\{\, t \mapsto \psi_{n,k}(t) = \psi(2^{n}t - k);\ n \in \mathbb{N},\ 0 \le k < 2^{n} \,\}$ ... (3.14)
Haar Wavelet properties
The Haar Wavelet has several notable properties:
1. Any continuous real function can be approximated by linear combinations of φ(t),
φ(2t), φ(4t), …, $\varphi(2^{k}t)$, … and their shifted functions. This extends to those function
spaces where any function therein can be approximated by continuous functions.
2. Any continuous real function can be approximated by linear combinations of the
constant function, ψ(t), ψ(2t), ψ(4t), …, $\psi(2^{k}t)$, … and their shifted functions.
3. Orthogonality:
$\int_{-\infty}^{\infty} 2^{(m + m_1)/2}\, \psi(2^{m}t - n)\, \psi(2^{m_1}t - n_1)\, dt = \delta_{m,m_1}\, \delta_{n,n_1}$ … (3.15)
Here $\delta_{i,j}$ represents the Kronecker delta. The dual function of ψ(t) is ψ(t) itself.
4. Wavelet and scaling functions of different scale 'm' have a functional relationship:
$\varphi(t) = \varphi(2t) + \varphi(2t - 1)$
$\psi(t) = \varphi(2t) - \varphi(2t - 1)$ ... (3.16)
5. Coefficients of scale 'm' can be calculated from the coefficients of scale 'm+1'.
If $x_w(n, m) = 2^{m/2} \int_{-\infty}^{\infty} x(t)\, \varphi(2^{m}t - n)\, dt$ ... (3.17)
and $X_w(n, m) = 2^{m/2} \int_{-\infty}^{\infty} x(t)\, \psi(2^{m}t - n)\, dt$ … (3.17.1)
then $x_w(n, m) = \sqrt{\tfrac{1}{2}}\, \big( x_w(2n, m+1) + x_w(2n+1, m+1) \big)$ ... (3.18)
$X_w(n, m) = \sqrt{\tfrac{1}{2}}\, \big( x_w(2n, m+1) - x_w(2n+1, m+1) \big)$ … (3.19)
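Property 5 is what makes the recursive pairing procedure work: each coarser level is obtained from the scaling coefficients of the finer level by one sum and one difference per pair. The sketch below applies Eq. 3.18 and Eq. 3.19 repeatedly to a list of $2^3$ samples, treating the samples themselves as the finest scaling coefficients (an assumption commonly made in discrete implementations); the input values are illustrative.

```python
import math

def haar_step(s):
    """One application of Eq. 3.18 (coarse) and Eq. 3.19 (detail) to scaling coefficients s."""
    r = math.sqrt(0.5)
    coarse = [r * (s[2 * n] + s[2 * n + 1]) for n in range(len(s) // 2)]
    detail = [r * (s[2 * n] - s[2 * n + 1]) for n in range(len(s) // 2)]
    return coarse, detail

def haar_pyramid(x):
    """Full decomposition of a list of length 2**n: (final coarse value, detail levels)."""
    s, details = list(x), []
    while len(s) > 1:
        s, d = haar_step(s)
        details.append(d)
    return s[0], details

x = [4.0, 2.0, 5.0, 5.0, 1.0, 3.0, 0.0, 2.0]
coarse, details = haar_pyramid(x)

n_details = sum(len(d) for d in details)                    # number of stored differences
energy_in = sum(v * v for v in x)
energy_out = coarse ** 2 + sum(v * v for d in details for v in d)
print(n_details, coarse, energy_in, energy_out)
```

For 8 inputs the pyramid stores 7 differences and one final sum, matching the $2^n - 1$ count stated earlier, and the factor $\sqrt{1/2}$ keeps the total energy of the coefficients equal to that of the input.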
Haar matrix
The 2×2 matrix associated with the Haar Wavelet [43] is given by Eq. 3.20.
$H_2 = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$ … (3.20)
Using the DWT, any sequence $(a_0, a_1, \ldots, a_{2n}, a_{2n+1})$ of even length can be transformed into a
sequence of two component vectors $((a_0, a_1), \ldots, (a_{2n}, a_{2n+1}))$. Each of these vectors is then
right multiplied with the matrix $H_2$, giving $((s_0, d_0), \ldots, (s_n, d_n))$ as the result of one stage of the Fast Haar
Wavelet Transform. The sequence 's' is often referred to as the averages part, whereas 'd'
is known as the details part. Usually one separates the sequences 's' and 'd' and
continues with transforming the sequence 's'. Generally, the 2N×2N Haar matrix can be
derived from Eq. 3.21a.
$H_{2N} = \begin{bmatrix} H_N \otimes [1, 1] \\ I_N \otimes [1, -1] \end{bmatrix}$ ... (3.21a)
If N = 2, then $H_4 = \begin{bmatrix} H_2 \otimes [1, 1] \\ I_2 \otimes [1, -1] \end{bmatrix}$
where $I_2$ is the identity matrix of order 2 × 2,
$I_2 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$
and ⊗ is the Kronecker product.
For example, if A is an m × n matrix and B is a p × q matrix, then the Kronecker product
A ⊗ B is computed as
$A \otimes B = \begin{bmatrix} a_{11}B & \cdots & a_{1n}B \\ \vdots & \ddots & \vdots \\ a_{m1}B & \cdots & a_{mn}B \end{bmatrix}$
If the length of the sequence is a multiple of four, blocks of four elements can be
constructed and transformed in a similar manner with the 4×4 Haar matrix,
which combines two stages of the Fast Haar Wavelet Transform and is given by
Eq. 3.21.
$H_4 = \begin{bmatrix} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \otimes [1, 1] \\ \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \otimes [1, -1] \end{bmatrix}$
$H_4 = \begin{bmatrix} 1[1\ 1] & 1[1\ 1] \\ 1[1\ 1] & -1[1\ 1] \\ 1[1\ {-1}] & 0[1\ {-1}] \\ 0[1\ {-1}] & 1[1\ {-1}] \end{bmatrix}$
$H_4 = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & 0 & 0 \\ 0 & 0 & 1 & -1 \end{bmatrix}$ ... (3.21)
Note that the above matrix is an un-normalized Haar matrix. The Haar matrix required
by the Haar transform should be normalized. Unlike the Fourier transform matrix, the Haar
matrix has only real elements (i.e. 1, −1 or 0) and is non-symmetric.
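The recursion of Eq. 3.21a is straightforward to express in code. The following is a minimal sketch using plain nested lists; the helper names `kron`, `identity` and `haar_matrix` are hypothetical, and the result is the un-normalized matrix.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def identity(n):
    """Identity matrix of order n."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def haar_matrix(size):
    """Un-normalized Haar matrix of order `size` (a power of two), built with Eq. 3.21a."""
    h = [[1, 1], [1, -1]]
    n = 2
    while n < size:
        # Stack H_N (x) [1, 1] on top of I_N (x) [1, -1] to obtain H_2N.
        h = kron(h, [[1, 1]]) + kron(identity(n), [[1, -1]])
        n *= 2
    return h

def dot(u, v):
    """Inner product of two rows."""
    return sum(p * q for p, q in zip(u, v))

h4 = haar_matrix(4)
h8 = haar_matrix(8)
rows_orthogonal = all(dot(h8[i], h8[j]) == 0
                      for i in range(8) for j in range(8) if i != j)
for row in h4:
    print(row)
```

The rows of the generated matrices are mutually orthogonal but do not have unit length, which is why a separate normalization is needed before the matrix is used as a transform.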
Haar Transform
The Haar Transform [44] is the simplest of the Wavelet Transforms. This transform
cross multiplies a function against the Haar Wavelet with various shifts and stretches, just as
the Fourier Transform cross multiplies a function against a sine wave with two phases and
many stretches. The Haar Transform is derived from the Haar matrix. An example of a
4×4 Haar Transformation matrix is given in Eq. 3.22.
$H_4 = \frac{1}{\sqrt{4}} \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & -1 & -1 \\ \sqrt{2} & -\sqrt{2} & 0 & 0 \\ 0 & 0 & \sqrt{2} & -\sqrt{2} \end{bmatrix}$ ... (3.22)
The Haar Transform [45] can be thought of as a sampling process in which the rows of
the transformation matrix act as samples of finer and finer resolution.
Daubechies Wavelet Transform
The most commonly used set of DWTs was formulated by the Belgian
mathematician Ingrid Daubechies in 1988. This formulation is based on the use
of recurrence relations to generate progressively finer discrete samplings of an implicit
mother wavelet function. In her seminal paper, Daubechies derived a family of wavelets,
the first of which is the Haar Wavelet. Interest in this field has exploded since then and
many variations of Daubechies' original wavelets have been developed. In general, the
Daubechies Wavelets [46] are chosen to have the highest number 'A' of vanishing
moments (this does not imply the best smoothness) for a given support width N = 2A, and
among the $2^{A-1}$ possible solutions the one chosen is that whose scaling filter has extremal
phase. The Wavelet Transform is also easy to put into practice using the Fast
Wavelet Transform. Daubechies Wavelets are widely used in solving a broad range of
problems, e.g. self similarity properties of a signal or fractal problems, signal
discontinuities, etc.
The Daubechies Wavelets are not defined in terms of the resulting scaling and
wavelet functions; in fact, these functions cannot be written down in closed form. The graph
in Fig. 3.10 is generated using the cascade algorithm, a numeric technique consisting of
simply inverse transforming [1 0 0 0 0 ...] an appropriate number of times.
Fig. 3.10: Scaling, Wavelet function and corresponding amplitude of the frequency spectra
Note that the spectra shown here are not the frequency responses of the high and
low pass filters, but rather the amplitudes of the Continuous Fourier Transforms (CFT) of
the scaling (blue) and wavelet (red) functions. The Daubechies orthogonal wavelets D2-D20
(even index numbers only) are commonly used. The index number refers to the number 'N'
of coefficients. Each wavelet has a number of zero moments or vanishing moments equal
to half the number of coefficients.
For example, D2 (the Haar Wavelet) has one vanishing moment,
D4 has two, and so on. The number of vanishing moments determines the wavelet's ability to represent
polynomial behavior or information in a signal. For example, D2, with one moment, easily
encodes polynomials of one coefficient, i.e. constant signal components. D4 encodes
polynomials with two coefficients, i.e. constant and linear signal components, and D6
encodes polynomials with three coefficients, i.e. constant, linear and quadratic signal components.
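The statement that D4 has two vanishing moments can be checked directly on its filter taps. The sketch below uses the standard closed-form D4 low pass coefficients and derives the high pass filter with the usual alternating-flip rule; the helper names and the test signals are illustrative assumptions.

```python
import math

s3 = math.sqrt(3.0)
norm = 4 * math.sqrt(2.0)

# Standard Daubechies D4 scaling (low pass) filter taps.
h = [(1 + s3) / norm, (3 + s3) / norm, (3 - s3) / norm, (1 - s3) / norm]

# Wavelet (high pass) filter by the alternating-flip rule g[k] = (-1)**k * h[3 - k].
g = [(-1) ** k * h[3 - k] for k in range(4)]

def moment(filt, p):
    """Discrete moment of order p: sum over k of k**p * filt[k]."""
    return sum((k ** p) * c for k, c in enumerate(filt))

def detail(signal, filt):
    """High pass outputs for windows advanced two samples at a time (no boundary handling)."""
    return [sum(filt[k] * signal[i + k] for k in range(len(filt)))
            for i in range(0, len(signal) - len(filt) + 1, 2)]

ramp = [2.0 * t + 1.0 for t in range(12)]        # constant plus linear component
parabola = [float(t * t) for t in range(12)]     # quadratic component

print([moment(g, p) for p in range(3)])
print(detail(ramp, g))
print(detail(parabola, g))
```

The moments of order 0 and 1 vanish, so constant and linear components produce no detail coefficients, while the quadratic signal leaves a non-zero residue; a wavelet with three vanishing moments, such as D6, would be needed to suppress it.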
This ability to encode signals is nonetheless subject to the phenomenon of scale
leakage and to the lack of shift invariance, which arise from the discrete shifting operation
during application of the transform. Sub-sequences which represent linear or
quadratic signal components are treated differently by the transform depending on
whether the points align with even or odd numbered locations in the sequence. The lack of
the important property of shift invariance has led to the development of several different
versions of a Shift Invariant (discrete) Wavelet Transform.
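The dependence on even or odd alignment is easiest to see with the shortest wavelet. The sketch below uses first-level Haar details (rather than a longer Daubechies filter, purely to keep the example small) on the same step edge placed at two positions one sample apart; the signals are illustrative.

```python
def haar_details(x):
    """First-level Haar detail coefficients: half the difference of each pair."""
    return [(x[2 * i] - x[2 * i + 1]) / 2 for i in range(len(x) // 2)]

edge = [0, 0, 0, 0, 1, 1, 1, 1]       # step between samples 3 and 4: falls between two pairs
shifted = [0, 0, 0, 1, 1, 1, 1, 1]    # same step one sample earlier: falls inside a pair

d_edge = haar_details(edge)
d_shifted = haar_details(shifted)
print(d_edge, d_shifted)
```

The first placement produces no detail coefficients at all, while the second produces a non-zero one, although the two inputs differ only by a one-sample shift.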
The following Table 3.1 shows the comparison between Haar and Daubechies
Wavelets.
Table 3.1: Comparison of Haar and Daubechies Wavelets
Property                              | Haar | Daubechies
Explicit function                     | Yes  | No
Orthogonal                            | Yes  | Yes
Symmetric                             | Yes  | No
Continuous                            | No   | Yes
Compact support                       | Yes  | Yes
Maximum regularity for order L        | No   | No
Shortest scaling function for order L | Yes  | No
The different types of transformations and their representations are shown in Table 3.2 below.
Table 3.2: Comparison of Fourier Transformation, Time-Frequency analysis and
Wavelet Transformation
Transformation          | Representation                                                                      | Output
Fourier Transform       | $X(f) = \int_{-\infty}^{+\infty} x(t)\, e^{-j 2\pi f t}\, dt$                        | frequency 'f'
Time-Frequency analysis | X(t, f)                                                                             | time 't'; frequency 'f'
Wavelet Transform       | $X(a, b) = \frac{1}{\sqrt{a}} \int_{-\infty}^{\infty} \psi\!\left(\frac{t - b}{a}\right) x(t)\, dt$ | scaling 'a'; time 'b'
3.2 Proposed 3D Lifting based Discrete Wavelet Transform
The Wavelet Transform [47] provides a multi-resolution representation using a set
of analyzing functions that are dilations and translations of a few functions (wavelets). An
efficient VLSI architecture for the implementation of the 3D Lifting based DWT is proposed. The
whole architecture is optimized with an efficient pipelined and parallel design to increase speed and
achieve higher hardware utilization. A Time Division Multiplexing (TDM) design is utilized
to realize the prediction step and the update step using the same architecture, and hence the
size of the circuit can be reduced.
Fig. 3.11: (a) 3 level decomposition of an image (b) Pictorial representation of 3D DWT
Taking data of size $N_1 \times N_2 \times N_3$ and applying the 1D analysis filter bank to the
first dimension gives two sub-band data sets, each of size $\frac{N_1}{2} \times N_2 \times N_3$. Applying
the 1D analysis filter bank to the second dimension then gives four sub-band data sets, each
of size $\frac{N_1}{2} \times \frac{N_2}{2} \times N_3$. Applying the 1D analysis filter bank to the third
dimension gives eight sub-band data sets, each of size $\frac{N_1}{2} \times \frac{N_2}{2} \times \frac{N_3}{2}$. This is illustrated in
Fig. 3.11(a) and Fig. 3.11(b). The proposed work utilizes an efficient line-based VLSI
architecture for the 3D DWT using the Lifting scheme, which is mainly composed of one row
DWT module and one column DWT module working in parallel and pipelined fashion
with 100% hardware utilization.
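The way the sub-band count doubles and the size halves with each dimension can be sketched in software. The example below is a functional illustration only, assuming an averaging Haar filter pair and nested Python lists; it says nothing about the hardware architecture, and all names are hypothetical.

```python
def combine(p, q, sign):
    """Element-wise (p + sign * q) / 2 for scalars or equally shaped nested lists."""
    if isinstance(p, list):
        return [combine(a, b, sign) for a, b in zip(p, q)]
    return (p + sign * q) / 2

def split_axis(a, axis):
    """Apply the two-channel analysis step along `axis`; return (low, high) volumes."""
    if axis > 0:
        pairs = [split_axis(sub, axis - 1) for sub in a]
        return [p[0] for p in pairs], [p[1] for p in pairs]
    half = len(a) // 2
    low = [combine(a[2 * i], a[2 * i + 1], +1) for i in range(half)]
    high = [combine(a[2 * i], a[2 * i + 1], -1) for i in range(half)]
    return low, high

def dwt3_one_level(vol):
    """One level of separable 3D DWT: filter along dimension 1, then 2, then 3."""
    bands = {"": vol}
    counts = []
    for axis in range(3):
        nxt = {}
        for name, data in bands.items():
            nxt[name + "L"], nxt[name + "H"] = split_axis(data, axis)
        bands = nxt
        counts.append(len(bands))
    return bands, counts

def shape(a):
    """Dimensions of a nested list."""
    dims = []
    while isinstance(a, list):
        dims.append(len(a))
        a = a[0]
    return tuple(dims)

# Illustrative 4 x 6 x 8 volume.
vol = [[[float(i + j + k) for k in range(8)] for j in range(6)] for i in range(4)]
bands, counts = dwt3_one_level(vol)
print(counts, sorted(bands), shape(bands["LLL"]))
```

For the 4 x 6 x 8 input the three passes yield 2, 4 and 8 sub-bands, each finally of size 2 x 3 x 4, in line with the sizes given above.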
Many common classes of images, such as medical images (e.g. MRI), scanned
documents and satellite images, do not have the same statistical properties as photographic
images. The standard wavelets used in image coders often do not match such images,
resulting in decreased compression or image quality. Moreover, MRIs are often stored in
large databases of similar images, making it worthwhile to find a specially adapted
wavelet for them.
In medical applications like Tele-medicine and automatic diagnosis,
sophisticated lossless compression on one side and decompression and detection on the
other side are essential. To address this need, a Lifting based Discrete Wavelet
Transform technique is proposed, in which the row DWT and the column DWT are
designed and implemented. The 3 level decomposition of a 3D signal is shown in Fig. 3.12,
and a 3D view of the decomposition of an image is shown in Fig. 3.13.
Fig. 3.12: 3 Level decomposition of 3D image
Fig. 3.13: 3D view of decomposition of an image
3.2.1 Proposed Lifting based DWT
The Wavelet Transform is an important and useful tool for image compression.
Many techniques have been developed for feature extraction from MRI, but the Lifting based
3D DWT is a particularly versatile method for feature extraction because it is a non-statistical
method which gives local frequency information and detail coefficients of the image at
various levels.
The Lifting scheme [48] has been developed as a flexible tool suitable for
constructing second generation wavelets [49]. It is composed of three basic operation
stages: Splitting, Predicting and Updating. Fig. 3.14 shows the Lifting scheme of the
wavelet filter operating on a one dimensional signal.
Fig. 3.14: Lifting scheme of the wavelet filter
Split step: The signal is split into even and odd points, so that the high correlation
between adjacent pixels can be exploited in the following predict step.
Predict step: The even samples are multiplied by the predict factor and the results are
added to the odd samples to generate the detail coefficients (High Pass Coefficients).
Update step: The detail coefficients computed by the predict step are multiplied by the
update factor and the results are added to the even samples to get the coarse coefficients
(Low Pass Coefficients).
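The three steps can be sketched in software as follows. This is a minimal floating-point illustration, assuming the (5, 3) filter with a predict factor of -1/2 and an update factor of 1/4 applied to the sum of the two neighbouring samples, and mirror symmetric handling at the two ends; the function names and the test data are hypothetical and the sketch does not model the hardware.

```python
def lift_forward(x, predict=-0.5, update=0.25):
    """Split, predict and update on an even-length list; returns (coarse, detail)."""
    even, odd = x[0::2], x[1::2]                       # split step
    n = len(odd)
    # Predict step: odd sample plus predict factor times its two even neighbours.
    detail = [odd[i] + predict * (even[i] + even[min(i + 1, n - 1)])
              for i in range(n)]
    # Update step: even sample plus update factor times its two neighbouring details.
    coarse = [even[i] + update * (detail[max(i - 1, 0)] + detail[i])
              for i in range(n)]
    return coarse, detail

def lift_inverse(coarse, detail, predict=-0.5, update=0.25):
    """Undo the update step, then the predict step, then merge even and odd samples."""
    n = len(detail)
    even = [coarse[i] - update * (detail[max(i - 1, 0)] + detail[i])
            for i in range(n)]
    odd = [detail[i] - predict * (even[i] + even[min(i + 1, n - 1)])
           for i in range(n)]
    x = [0.0] * (2 * n)
    x[0::2], x[1::2] = even, odd
    return x

signal = [3.0, 7.0, 1.0, 8.0, 2.0, 9.0, 4.0, 6.0]
coarse, detail = lift_forward(signal)
restored = lift_inverse(coarse, detail)

ramp_coarse, ramp_detail = lift_forward([float(t) for t in range(8)])
print(coarse, detail, restored, ramp_detail, sep="\n")
```

Each step only adds a function of one half of the samples to the other half, so the inverse is obtained by subtracting the same quantities in reverse order; this is also why the computation can be done in place. For a linear ramp the interior detail coefficients are zero, since each odd sample is predicted exactly by the mean of its neighbours.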
[Fig. 3.14 block labels: input X_i; Split (S) into X_2i and X_2i+1; Predict (P), output High pass Coefficients; Update (U), output Low pass Coefficients]
3.2.2 Proposed 3D LDWT Architecture
The proposed VLSI architecture shown in Fig. 3.15 performs the 3D LDWT with a line
based method. It consists of five key modules: the data selector module, the row DWT
module, the column DWT module, the DWT control unit and the external RAM. An external
RAM of size $\frac{N^2}{4}$ is used to store the LL band output coefficients for carrying out the multi level
decomposition, where 'N' represents the width and the height of the input image. The
DWT control unit controls the time sequence of the whole system.
Fig. 3.15: Block diagram of 3D Lifting based DWT architecture.
Firstly, one line of image data or LL sub-band data is routed through the Data Selector.
The data then enter the row processor, which performs the 1D row DWT, and the output data
are stored in the line buffer. The number of buffers is determined by the number of taps
of the low pass filter. When $(M+1)/2$ rows of data ('M' is the number of taps of the Low
Pass Filter) have finished the row DWT, the column DWT module immediately starts the
column transform and stores the intermediate results in the column buffer. The final
transformed data are stored in the external RAM. The DWT control unit is described by the
Finite State Machine (FSM) chart in Fig. 3.16.
Fig. 3.16: FSM of DWT control unit
Improved Embedded Mirror Symmetric Extension at the Boundaries
Because the signal has finite length, processing it with a wavelet filter produces an
edge effect at the boundaries. The JPEG 2000 standard[50] employs Symmetric Extension at
the boundaries to eliminate it. The traditional extension arithmetic needs additional
memory units and operations, which consume considerable power and area[10]. Exploiting the
characteristics of the Lifting based DWT, this module adopts the Embedded Mirror Symmetric
Extension Arithmetic[51], as shown in Fig. 3.17. The extension is embedded into the data
operation process by changing the operations at the beginning and at the end of the
Lifting operation.
Fig. 3.17: Mirror symmetric extension
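Eq. 3.23 gives the modified operation process of the (5, 3) Wavelet Transform, where
$i_0$ is the index of the first sample and $i_l - 1$ is the index of the last sample.
\[
y(2n+1) =
\begin{cases}
x(2n+1) - \dfrac{x(2n) + x(2n+2)}{2}, & i_0 + 1 \le 2n+1 \le i_l - 2 \quad (\text{normal}) \\
x(2n+1) - x(2n+2), & 2n+1 = i_0 \quad (\text{odd\_begin}) \\
x(2n+1) - x(2n), & 2n+1 = i_l - 1 \quad (\text{odd\_end})
\end{cases}
\]
\[
y(2n) =
\begin{cases}
x(2n) + \dfrac{y(2n-1) + y(2n+1) + 2}{4}, & i_0 + 1 \le 2n \le i_l - 2 \quad (\text{normal}) \\
x(2n) + \dfrac{y(2n+1)}{2} + \dfrac{1}{2}, & 2n = i_0 \quad (\text{even\_begin}) \\
x(2n) + \dfrac{y(2n-1)}{2} + \dfrac{1}{2}, & 2n = i_l - 1 \quad (\text{even\_end})
\end{cases}
\tag{3.23}
\]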
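The effect of embedding the extension can be checked with a short Python sketch. It compares a transform that uses the boundary cases directly against one that first mirrors the signal explicitly. The function names are hypothetical, and several details are assumptions of this sketch rather than statements of the text: integer samples, floor division for the divisions by 2 and 4, $i_0 = 0$, an update step that adds the rounded quarter sum (as in the boundary cases of Eq. 3.23), and a signal of at least two samples.

```python
def dwt53_embedded(x):
    """Forward (5, 3) lifting with the boundary cases built in (i0 = 0)."""
    il = len(x)
    y = list(x)
    # Predict: odd positions become high pass coefficients.
    for k in range(1, il, 2):
        if k == il - 1:                          # odd_end
            y[k] = x[k] - x[k - 1]
        else:                                    # normal
            y[k] = x[k] - (x[k - 1] + x[k + 1]) // 2
    # Update: even positions become low pass coefficients.
    for k in range(0, il, 2):
        if k == 0:                               # even_begin
            y[k] = x[k] + (y[k + 1] + 1) // 2
        elif k == il - 1:                        # even_end
            y[k] = x[k] + (y[k - 1] + 1) // 2
        else:                                    # normal
            y[k] = x[k] + (y[k - 1] + y[k + 1] + 2) // 4
    return y


def dwt53_extended(x):
    """Same transform, reading out-of-range neighbours from a mirrored signal."""
    n = len(x)

    def at(seq, k):
        # Mirror symmetric extension: index -1 maps to 1, index n maps to n - 2.
        if k < 0:
            return seq[-k]
        if k >= n:
            return seq[2 * (n - 1) - k]
        return seq[k]

    y = list(x)
    for k in range(1, n, 2):
        y[k] = x[k] - (at(x, k - 1) + at(x, k + 1)) // 2
    for k in range(0, n, 2):
        y[k] = x[k] + (at(y, k - 1) + at(y, k + 1) + 2) // 4
    return y
```

Both functions return the same coefficients, but the embedded version never stores or reads extended samples, which is the saving in memory and operations that the embedded scheme aims at.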
The embedded scheme is built into both the row DWT module and the column DWT module.
It is implemented by an FSM and multiplexers with four states: Forward extension, Normal
even, Normal odd and Last extension. Data extension takes place only in the Forward
extension and Last extension states.
Proposed row DWT module
The proposed architecture, illustrated in Fig. 3.18, is optimized for processing speed.
Multiplication is replaced by shift and add operations. As a result, the row processor
consists of six registers, five multiplexers, one adder and one shifting adder. All the
hardware resources of the row processor can be time-multiplexed. A single line is
processed at a time. In each Lifting step, two consecutive even-numbered samples are
added, multiplied by the corresponding Lifting coefficient and then added to the
odd-numbered sample between them, so that one pixel is encoded per clock. This needs
fewer storage cells than a row DWT module in which the input lines are partitioned into
even and odd samples[9] (which needs two parallel row DWT units).
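The replacement of multiplication by shifting can be illustrated as follows. This is only a software analogy with hypothetical function names. It assumes integer samples, a predict step that subtracts half of the neighbour sum and an update step that adds a quarter of the neighbour sum with a rounding offset of 2.

```python
def predict_shift_add(left, middle, right):
    """middle - (left + right) / 2, with the division done as a right shift."""
    return middle - ((left + right) >> 1)


def update_shift_add(left, middle, right):
    """middle + (left + right + 2) / 4, with the division done as a right shift."""
    return middle + ((left + right + 2) >> 2)
```

In Python the `>>` operator on negative integers behaves as an arithmetic shift, so the results match floor division by 2 and by 4.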
Fig. 3.18: Block diagram of Row DWT
The Embedded Mirror Symmetric Boundary data extension algorithm is implemented with
two multiplexers controlled by the signals sel '0' and sel '1'. This significantly reduces
the amount of internal storage and the number of accesses to the external memory. The
control signals (sel '0', sel '1') of the multiplexers and the corresponding steps of the
Lifting scheme are shown in Table 3.3.
Table 3.3: Control Signals and the corresponding model of the Lifting Scheme
Sel 1   Sel 0   Lifting Scheme step   Corresponding coefficient   Extension
0       0       Prediction            -1/2                        Forward extension
0       1       Update                -1/4                        Normal even
1       0       Prediction            -1/2                        Normal odd
1       1       Update                -1/4                        Last extension
The time-multiplexed row processor is implemented by conducting the predict step in
even clocks and the update step in odd clocks. The control signals sel '1' and sel '0' of
the multiplexers are generated by a counter. The row processor is pipelined and the
samples are encoded continuously. Hardware utilization reaches approximately 100% and the
control logic is simple. Table 3.4 shows the data flow of the proposed module of the
(5, 3) DWT[52] for a row with 8 samples, where Hi (Li) represents the i-th high-pass
(low-pass) output.
Table 3.4: Data Flow for (5, 3) DWT
Clock   Input   En0   En1   Output
1       X0      0     0     -
2       X1      0     0     -
3       X2      0     0     H1
4       X3      0     1     L1
5       X4      1     0     H2
6       X5      0     1     L2
7       X6      1     0     H3
8       X7      0     1     L3
9       -       1     1     H4
10      -       0     1     L4
Proposed Column DWT module
In order to reduce the system latency, the column DWT is executed in row-wise order.
The input data are stored in the even line buffer and the odd line buffer, so they are
naturally separated into even samples and odd samples along the column. The embedded
mirror symmetric boundary data extension algorithm is also implemented in the Column
DWT module (Fig. 3.19). Four multiplexers control all the steps ('1' represents the
prediction step while '0' represents the update step). The multiplexers ensure that the
hardware resources are re-used and that samples join the associated computation according
to the timing plan. The column DWT module begins to process samples after the first two
lines have finished the row DWT. Firstly, the multiplexers are set to '1' and the column
processor conducts the prediction step.
Fig. 3.19: Block diagram of Column DWT
The result of the prediction step is stored in the column buffer at the same time. The
multiplexers are then set to '0', the column processor conducts the update step and the
results are output directly. The column processor is pipelined to increase the speed of
the Wavelet Transform. The data flow of the column DWT is similar to that of the row DWT.
3.3 Floating Point Multiplication Algorithm
A floating point multiplier is proposed, instead of a normal multiplier, to multiply
the filtered input values by constant coefficient values.
Fig. 3.20: Floating point multiplication with a constant value 'a'
[Diagram labels for Fig. 3.19 and Fig. 3.20: Odd line, Even line, Column buffer, MUX, DFF,
adder (+), shift adder (+<<), Sel, X, Y, E1, E2, Predict, a]
The main advantage of this floating point multiplier is that it increases the speed of
operation and the accuracy. Normalized floating point numbers have the form 'Z' given by
Eq. 3.23a, where 'S' is the sign bit, 'E' is the biased exponent and 'M' is the mantissa.
\[
Z = (-1)^{S} \times 2^{(E - \mathrm{Bias})} \times (1.M) \tag{3.23a}
\]
The following algorithm is used to multiply two floating point numbers:
Multiplying the significands; i.e. (1.M1 x 1.M2)
Placing the decimal point in the result
Adding the exponents; i.e. (E1 + E2 - Bias)
Obtaining the sign; i.e. S1 xor S2
Normalizing the result; i.e. obtaining '1' at the MSB
Rounding the result
Checking for underflow/overflow occurrence
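The steps listed above can be traced on bit patterns with the following Python sketch. It is an illustration under stated assumptions, not the proposed circuit: the helper names are hypothetical, operands are normalized single precision numbers with the full 23 bit mantissa, and the product is truncated rather than rounded.

```python
import struct

BIAS = 127  # exponent bias of IEEE 754 single precision


def f32_bits(value):
    """Bit pattern of a Python float stored as single precision."""
    return struct.unpack(">I", struct.pack(">f", value))[0]


def bits_f32(bits):
    """Single precision value of a 32 bit pattern."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]


def fp_multiply(a_bits, b_bits):
    """Multiply two normalized single precision numbers given as bit patterns."""
    s1, e1, m1 = a_bits >> 31, (a_bits >> 23) & 0xFF, a_bits & 0x7FFFFF
    s2, e2, m2 = b_bits >> 31, (b_bits >> 23) & 0xFF, b_bits & 0x7FFFFF
    sign = s1 ^ s2                                  # S1 xor S2
    exponent = e1 + e2 - BIAS                       # E1 + E2 - Bias
    product = ((1 << 23) | m1) * ((1 << 23) | m2)   # 1.M1 x 1.M2, up to 48 bits
    if product & (1 << 47):                         # result in [2, 4): normalize
        product >>= 1
        exponent += 1
    mantissa = (product >> 23) & 0x7FFFFF           # keep 23 bits, no rounding
    if exponent <= 0 or exponent >= 255:            # underflow/overflow check
        raise OverflowError("exponent out of range")
    return (sign << 31) | (exponent << 23) | mantissa
```

For example, `bits_f32(fp_multiply(f32_bits(1.5), f32_bits(2.5)))` gives 3.75. For products that are not exactly representable, the truncated result may differ in the last bit from a rounded IEEE 754 result.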
IEEE 754 single precision floating point numbers are considered for the
multiplication, but the number of mantissa bits is reduced for simplification. Here only
five bits are considered, while the hidden '1' bit is retained for normalized numbers.
Fig. 3.21 shows each block of the floating point multiplier.
Fig. 3.21: Floating point multiplier block diagram
The floating point multiplier presented here does not implement rounding. This gives
more precision when the result is passed on to another unit, for example a floating point
adder that forms a MAC unit together with the multiplier. Fig. 3.20 shows the block
diagram of the multiplier structure: exponent addition, significand multiplication and
result sign calculation. These processes are independent of each other and are performed
in parallel. The significand multiplication is done on two 24 bit numbers and results in a
48 bit product called the Intermediate Product (IP). The IP is represented as bits
(47 down to 0) and the decimal point is located between bits 46 and 45 of the IP.
3.4 Look Up Table (LUT) Implementation for Memory based Computations
The proposed LUT implementation for memory based computations is used to store the
filtered values as well as the constant multiplier product values. LUTs are used instead
of registers to store these values in order to reduce the memory size and to optimize area
and delay.
The Anti symmetric Product Coding (APC) and Odd Multiple Storage (OMS) techniques
for Look Up Table design in memory based multipliers are proposed for use in Digital
Signal Processing applications.
Fig. 3.22: The (5, 3) Discrete Wavelet Transform
Each of these techniques reduces the LUT size by a factor of two. Here, a different
form of APC and a modified OMS scheme are presented so that the two can be combined for
efficient memory based multiplication. The proposed combined approach reduces the LUT size
to one fourth of that of the conventional LUT.
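The text does not spell out the coding itself, so the following Python sketch only illustrates the general idea behind the two reductions and should not be read as the proposed LUT design. It assumes that APC exploits the antisymmetry of the products about the middle of the input range, so that only the magnitude of the offset from the midpoint is looked up, and that OMS stores only odd multiples of the coefficient and derives even multiples by left shifts. The function names and the word length parameter `bits` are hypothetical.

```python
def build_lut(a, bits=6):
    """Store only the odd multiples a*1, a*3, ... (2**(bits-2) entries)."""
    return {k: a * k for k in range(1, 1 << (bits - 1), 2)}


def lut_multiply(a, x, lut, bits=6):
    """Return a * x for 0 <= x < 2**bits using the reduced LUT."""
    half = 1 << (bits - 1)
    offset = x - half                 # APC idea: a*x = a*half + a*(x - half)
    magnitude = abs(offset)
    shift = 0
    while magnitude and magnitude % 2 == 0:
        magnitude //= 2               # OMS idea: strip factors of two ...
        shift += 1
    partial = (lut[magnitude] << shift) if magnitude else 0   # ... and shift back
    base = a << (bits - 1)            # a * half is a plain shift of the coefficient
    return base + partial if offset >= 0 else base - partial
```

With a 6 bit input, a conventional LUT would hold 64 products, whereas `build_lut` holds 16, which is the one fourth reduction stated above.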
Fig. 3.23: Proposed APC and OMS Combined LUT design for the multiplication of
'W'-bit fixed coefficient 'A' with 6-bit input 'X'
3.5 Principles of Compression
Image compression[53] addresses the problem of reducing the amount of data required
to represent a digital image. The underlying basis of the reduction process is the removal
of redundant data. From a mathematical viewpoint, this amounts to transforming a 2D pixel
array into a statistically uncorrelated data set. The transformation is applied prior to
storage or transmission of the image. The compressed image is later decompressed to
reconstruct the original image or an approximation of it.
3.5.1 Need for Image Compression
With the rapid development of Internet, Teleconferencing, Multimedia and High
Definition Television technologies, the amount of information handled by computers has
grown exponentially over the past decades. As a result, storage and transmission of the
digital image component of multimedia systems pose a major problem. The amount of data
required to represent images at an acceptable level of quality is extremely large. High
quality image data require large amounts of storage space and transmission bandwidth,
which current technology is unable to handle technically and economically. One possible
solution to this problem is to compress the information so that the storage space and
transmission time can be reduced. For example, if a 1400 x 1800 color image with 8 bits
per color component needs to be stored, the space required to store the image is
1400 x 1800 x 8 x 3 = 60,480,000 bits
= 7,560,000 bytes
= 7.56 MB
The maximum space available on one Compact Disk (CD) is 700 MB, so the CD can store
only 92 such images. The amount of data transmitted through the Internet doubles every
year and a large portion of that data consists of images. Reducing the bandwidth occupied
by an image results in significant cost reduction and makes the service more affordable to
its users. Image compression offers a way to represent an image more compactly, so that
images can be stored in less space and transmitted faster.
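The storage arithmetic of this example can be reproduced with a few lines of Python. The function name is hypothetical, and the sketch assumes 1 MB = 10^6 bytes and counts only whole images.

```python
def raw_image_bits(width, height, bits_per_channel=8, channels=3):
    """Size of an uncompressed image in bits."""
    return width * height * bits_per_channel * channels


bits = raw_image_bits(1400, 1800)
size_bytes = bits // 8
images_per_cd = (700 * 10**6) // size_bytes   # whole images on a 700 MB CD
print(bits, size_bytes, images_per_cd)        # prints: 60480000 7560000 92
```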
3.5.2 Types of Redundancies
The different types of redundancy are as follows.
Psycho-visual redundancy: The human visual system does not perceive every detail of
an image, so it is often possible to remove some details or reduce pixel precision
(e.g. by quantization) without affecting the perceived quality of the image.
Statistical redundancy: When the distribution of the symbols is not uniform, it is
generally possible to find an appropriate coding that reduces the overall data length
(e.g. Entropy Coding).
Spatial redundancy: An image generally contains uniform regions of pixels or regular
patterns that can be efficiently represented with very few symbols (e.g. by prediction
or by resorting to a specific transformed domain).
Temporal redundancy: Temporal redundancy is the statistical correlation between
pixels from successive frames in a video sequence. It is also called Inter Frame
redundancy. Motion compensated predictive coding is employed to reduce temporal
redundancy. Removing a large amount of temporal redundancy leads to efficient video
compression.
Fig. 3.24 shows the different types of redundancy.
Fig. 3.24: Classification of Redundancy
3.5.3 Basic Image compression system
Image compression is an application of data compression that encodes the original
image with fewer bits. The objective of image compression is to reduce the redundancy of
the image and to store or transmit the data in an efficient form.
Fig. 3.25: Basic image compression system
[Fig. 3.25 block labels: Original image, Construct n × n sub images, Forward transform,
Quantizer, Symbol encoder, Hard Disk/Channel, Symbol decoder, I-Quantizer, Inverse
Transform, Merge images, Decompressed image]
Fig. 3.25 shows the block diagram of a basic image compression system together with
the storage system. The two basic components of an image compression system are the
Encoder and the Decoder. The component that compresses the source image is called the
Encoder and its output is the compressed (coded) data. The objective of the Quantizer is
to reduce the precision and thereby achieve a higher Compression Ratio (CR). The
compressed data may be either stored or transmitted, but are at some point fed to a
Decoder. The Decoder is the component that reconstructs an image from the compressed
data. The main goal of such a system is to reduce the storage requirement as much as
possible while keeping the decoded image displayed on the monitor as similar to the
original image as possible.
3.5.4 Classification of Image Compression Schemes
Image compression schemes can be broadly classified into two types: Lossless
compression schemes and Lossy compression schemes. Lossless compression is preferred for
Medical images, whereas Lossy compression is preferred for Multimedia applications. In a
lossless compression scheme, the reconstructed image is identical to the original image
without any loss of information, but the Compression Ratio (CR) is usually low. On the
other hand, a high CR can be obtained in a lossy compression scheme at the expense of the
quality of the reconstructed image. There is always a tradeoff between the quality of the
reconstructed image and the CR.
Lossless Compression (or) Reversible Compression
In lossless compression, the image after compression and decompression is
identical to the original image; every bit of information is preserved throughout the
process, so the reconstructed image is an exact replica of the original one. Although
lossless compression methods have the appeal that there is no deterioration in image
quality, they achieve only a modest CR. Lossless compression is used in applications
where no loss of image data can be tolerated.
Lossy Compression (or) Irreversible Compression
In lossy compression, the reconstructed image contains degradations with respect
to the original image. Perfect reconstruction is sacrificed, by eliminating some of the
redundancies in the image, in order to achieve a high CR. A much higher CR can be achieved
with lossy compression than with lossless compression. The term visually lossless is often
used to characterize lossy compression schemes that result in no visible degradation under
a set of designated viewing conditions.
3.6 Coding Techniques
Many coding techniques are used for image compression. Some of them are explained
below.
3.6.1 Lossless Coding Techniques
The following lossless coding techniques are briefly explained.
Entropy Coding
In information theory, Entropy encoding is a lossless data compression scheme
that is independent of the specific characteristics of the medium. One of the main types
of Entropy Coding creates and assigns a unique prefix free code to each unique symbol
that occurs in the input. Such Entropy encoders then compress data by replacing each
fixed length input symbol with the corresponding variable length prefix free output
codeword. The length of each codeword is approximately proportional to the negative
logarithm of the probability of the symbol, so the most common symbols use the shortest
codes.
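The relation between probability and codeword length can be made concrete with a small Python sketch. The function names are hypothetical and the symbol probabilities are estimated from the data itself, which is an assumption of this example.

```python
import math
from collections import Counter


def ideal_code_lengths(data):
    """Return {symbol: -log2(p)} for the empirical symbol probabilities."""
    counts = Counter(data)
    total = len(data)
    return {s: -math.log2(c / total) for s, c in counts.items()}


def entropy_bits(data):
    """Average ideal code length in bits per symbol (Shannon entropy)."""
    counts = Counter(data)
    total = len(data)
    return -sum(c / total * math.log2(c / total) for c in counts.values())
```

For the string "aaaabbcd" the ideal lengths are 1, 2, 3 and 3 bits for a, b, c and d, and the entropy is 1.75 bits per symbol, compared with 2 bits per symbol for a fixed length code.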
Entropy as a measure of similarity
Besides being a way to compress digital data, an Entropy encoder can also be used
to measure the amount of similarity between streams of data. This is done by generating
an Entropy coder/compressor for each class of data. Unknown data are then classified by
feeding the uncompressed data to each compressor and observing which compressor yields
the highest CR; the coder that compresses best is probably the one trained on the data
most similar to the unknown data.
Run Length Encoding(RLE)
Run Length Encoding is a very simple form of data compression in which runs of
data (that is, sequences in which the same data value occurs in many consecutive data
elements) are stored as a single data value and count, rather than as the original run. It
is most useful on data that contain many such runs, for example simple graphic images
such as icons, line drawings and animations. It is not useful for files that do not have
many runs, as it could greatly increase the file size. RLE also refers to a little-used
image format in Windows 3.x, with the extension ".rle", which is a Run Length Encoded
Bitmap used to compress the Windows 3.x start up screen.
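A minimal Python sketch of the idea is given below. The function names and the representation of a run as a [value, count] pair are choices of this example.

```python
def rle_encode(data):
    """Collapse runs of equal values into [value, count] pairs."""
    runs = []
    for value in data:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return runs


def rle_decode(runs):
    """Expand [value, count] pairs back into the original sequence."""
    return [value for value, count in runs for _ in range(count)]
```

The sequence W W W W B B W is stored as three pairs, whereas a sequence without repeated neighbours such as A B C D needs four pairs, that is eight numbers for four samples, which illustrates how RLE can increase the size of unsuitable data.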
Bit Plane Coding
A Bit Plane[54] of a digital signal (such as an image or sound) is the set of bits
occupying the same position in the respective binary numbers. For example, for a 16 bit
data representation there are 16 bit planes: the 1st bit plane contains the set of the
Most Significant Bits and the 16th contains the Least Significant Bits.
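The decomposition into bit planes can be sketched as follows, with a hypothetical function name and a flat list of pixel values standing in for the image.

```python
def bit_planes(pixels, bits=8):
    """Return planes[0] (Most Significant Bits) ... planes[bits-1] (Least Significant Bits)."""
    return [[(p >> (bits - 1 - k)) & 1 for p in pixels] for k in range(bits)]
```

With 4 bit values 5 (0101) and 3 (0011), the planes from most to least significant are [0, 0], [1, 0], [0, 1] and [1, 1].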
Huffman Coding
From Shannon's source coding theory, it is known that a source can be coded with
an average code length close to the Entropy of the source. In 1952, D.A. Huffman
invented a coding technique that produces the shortest possible average code length,
given the source symbol set and the associated probabilities of occurrence of the
symbols. Codes generated using this coding technique are popularly known as Huffman
Codes[55]. The Huffman Coding technique is based on the following two observations
regarding optimum prefix codes.
The more frequently occurring symbols can be allocated shorter codewords
than the less frequently occurring symbols.
The two least frequently occurring symbols will have codewords of the same
length and they differ only in the Least Significant Bit.
Huffman coding is more suitable than Arithmetic coding when simplicity is the
major concern.
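The two observations lead directly to the usual construction, in which the two least frequent nodes are merged repeatedly. The Python sketch below is one possible formulation with a hypothetical function name; symbol frequencies are counted from the input and ties are broken by insertion order, so other equally optimal codes exist.

```python
import heapq
from collections import Counter


def huffman_code(data):
    """Return {symbol: bit string} built by merging the two rarest nodes."""
    counts = Counter(data)
    # Heap entries: (weight, tie breaker, {symbol: partial codeword}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(sorted(counts.items()))]
    heapq.heapify(heap)
    if len(heap) == 1:                        # single-symbol source: use one bit
        return {s: "0" for s in heap[0][2]}
    tie = len(heap)
    while len(heap) > 1:
        w0, _, c0 = heapq.heappop(heap)       # least frequent node
        w1, _, c1 = heapq.heappop(heap)       # second least frequent node
        merged = {s: "0" + c for s, c in c0.items()}
        merged.update({s: "1" + c for s, c in c1.items()})
        heapq.heappush(heap, (w0 + w1, tie, merged))
        tie += 1
    return heap[0][2]
```

For the string "aaaabbcd" the codeword lengths are 1, 2, 3 and 3 bits, and the codewords of the two rarest symbols c and d differ only in their last bit.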
Lossless Predictive Coding
Predictive Coding techniques constitute another example of the exploitation of inter
pixel redundancy, in which the basic idea is to encode only the new information in each
pixel. This new information is usually defined as the difference between the actual and
the predicted value of that pixel. The key component is the predictor, whose function is
to generate an estimated (predicted) value for each pixel of the input image based on
previous pixel values.
The predictor's output is rounded to the nearest integer and compared with the
actual pixel value; the difference between the two is called the Prediction Error, which
is then encoded by a Variable Length Coding (VLC) encoder. Since prediction errors are
likely to be smaller than the original pixel values, the VLC encoder will generate shorter
codewords. There are several local, global and adaptive prediction algorithms in the
literature. In most cases the predicted pixel value is a linear combination of previous
pixels.
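As a minimal illustration, the Python sketch below uses the simplest linear predictor, the previous pixel in the same row. This choice of predictor, the function names and the non-empty input are assumptions of the example.

```python
def prediction_errors(row):
    """Previous-pixel predictor: e[0] = x[0], e[n] = x[n] - x[n-1]."""
    return [row[0]] + [row[n] - row[n - 1] for n in range(1, len(row))]


def reconstruct(errors):
    """Rebuild the row by adding each error to the previously decoded pixel."""
    row = [errors[0]]
    for e in errors[1:]:
        row.append(row[-1] + e)
    return row
```

The row 100, 102, 103, 103, 101 becomes 100, 2, 1, 0, -2: apart from the first sample, the values to be coded are small, and the decoder recovers the row exactly.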
Lempel Ziv Welch (LZW) Coding
One of the most common algorithms used in computer graphics is the Lempel Ziv
Welch (LZW)[56] compression scheme. This lossless method of data compression is found in
several image file formats, such as Graphics Interchange Format (GIF) and Tagged Image
File Format (TIFF), and it is also part of the V.42 bis modem compression standard and
PostScript Level 2. In 1977, Abraham Lempel and Jacob Ziv created the first of the LZ
family of substitution compressors (LZ77), followed in 1978 by LZ78. The LZ78
compression algorithms are more commonly used to compress binary data, such as bitmaps.
In 1984, while working for Unisys, Terry Welch modified the LZ78 compressor for
implementation in high performance disk controllers. The result was the LZW algorithm
that is commonly found today. LZW is a general compression algorithm capable of working
on almost any type of data. It is generally fast in both compressing and decompressing
data and does not require the use of floating point operations. Because LZW writes
compressed data as bytes and not as words, LZW encoded output can be identical on both
Big Endian and Little Endian systems, although one may still encounter bit order and fill
order problems. LZW is referred to as a substitutional or dictionary based encoding
algorithm. The algorithm builds a data dictionary (also called a translation table or
string table) of data occurring in an uncompressed data stream. Patterns of data
(substrings) are identified in the data stream and are matched to entries in the
dictionary. If the substring is not present in the dictionary, a code phrase is created
based on the data content of the substring and it is stored in the dictionary. The phrase
is then written to the compressed output stream. When a re-occurrence of a substring is
identified in the data, the phrase of the substring already stored in the dictionary is
written to the output. Because the phrase value has a physical size that is smaller than
the substring it represents, data compression is achieved.
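The dictionary mechanism can be followed in the Python sketch below. It is a textbook style formulation with hypothetical function names. It works on non-empty strings whose characters have codes below 256, emits a list of integer codes and does not pack the codes into a bit stream as a real file format would.

```python
def lzw_encode(text):
    """Return a list of integer codes; the dictionary starts with all single bytes."""
    dictionary = {chr(i): i for i in range(256)}
    current, codes = "", []
    for ch in text:
        if current + ch in dictionary:
            current += ch                       # keep extending the match
        else:
            codes.append(dictionary[current])   # emit the longest known substring
            dictionary[current + ch] = len(dictionary)
            current = ch
    if current:
        codes.append(dictionary[current])
    return codes


def lzw_decode(codes):
    """Rebuild the text, growing the same dictionary as the encoder."""
    dictionary = {i: chr(i) for i in range(256)}
    previous = dictionary[codes[0]]
    out = [previous]
    for code in codes[1:]:
        # A code that is not yet in the table can only be previous + previous[0].
        entry = dictionary.get(code, previous + previous[0])
        out.append(entry)
        dictionary[len(dictionary)] = previous + entry[0]
        previous = entry
    return "".join(out)
```

The string "ABABABA" is encoded as the four codes 65, 66, 256 and 258, where 256 and 258 stand for the substrings "AB" and "ABA" added to the dictionary during encoding.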
Arithmetic Coding
Arithmetic Coding[57] is a variable length source encoding technique. In
traditional Entropy encoding techniques such as Huffman coding, each input symbol in a
message is substituted by a specific code consisting of an integer number of bits.
Arithmetic Coding deviates from this paradigm. In this coding, a sequence of input
symbols is represented by an interval of real numbers between '0.0' and '1.0'. It offers
superior compression efficiency and more flexibility than the popular Huffman coding,
but it requires more computational power and memory.
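How a message is mapped to an interval can be shown with exact fractions in Python. The sketch below only computes the final interval; the function name, the fixed symbol probabilities and the symbol order are assumptions of the example, and a practical coder would use finite precision arithmetic with renormalization.

```python
from fractions import Fraction


def arithmetic_interval(message, probabilities):
    """Narrow [low, high) once per symbol; any number inside identifies the message."""
    # Cumulative ranges in dictionary order, e.g. a -> [0, 1/2), b -> [1/2, 3/4).
    ranges, start = {}, Fraction(0)
    for symbol, p in probabilities.items():
        ranges[symbol] = (start, start + p)
        start += p
    low, high = Fraction(0), Fraction(1)
    for symbol in message:
        width = high - low
        s_low, s_high = ranges[symbol]
        low, high = low + width * s_low, low + width * s_high
    return low, high
```

With probabilities 1/2, 1/4 and 1/4 for a, b and c, the message "ab" maps to the interval [1/4, 3/8). Its width 1/8 equals the product of the symbol probabilities, so likely messages get wide intervals that can be identified with few bits.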
Embedded Zero tree Wavelet (EZW) Coding
For a 1D Wavelet Transform, the vector of wavelet coefficients can be divided
into sub-bands after the wavelet decomposition, as shown in Fig. 3.26. Similarly, a
block of two dimensional wavelet coefficients can be divided into sub-bands, as shown
in Fig. 3.27. The EZW encoder was specially designed by Shapiro for use with Wavelet
Transforms. In fact, EZW coding is more like a quantization method. It was originally
designed to operate on images (2D signals), but it can also be used on signals of other
dimensions. The EZW[58] encoder is based on progressive encoding, which compresses an
image into a bit stream with increasing accuracy.
Fig. 3.26: Sub-bands after 1D Wavelet Decomposition
Fig. 3.27: Sub-bands in a Wavelet Transform Block after 2D Wavelet Decomposition
Set Partitioning In Hierarchical Tree (SPIHT) Coding
The SPIHT[59] coder is a highly refined version of the EZW algorithm. It is a
powerful image compression algorithm that produces an embedded bit stream from
which the best reconstructed images in the MSE sense can be extracted at various bit
rates. For a wide variety of images, it yields high PSNR values at a given CR. Hence, it
has become a state-of-the-art algorithm for image compression. The Parent child
relationship in the SPIHT coding algorithm is shown in Fig. 3.28.
Fig. 3.28: Parent child relationship in SPIHT
Wavelet decomposition of the image data results in a tree of coefficients.
Each coefficient has four children, except the red marked coefficients in the
LL sub-band and the coefficients in the highest frequency sub-bands (LH1, HL1 and HH1).
The following sets of coordinates of coefficients are used to represent the set
partitioning method in the SPIHT algorithm[60]. The location of a coefficient is denoted
by (i, j), where 'i' and 'j' indicate the row and column indices respectively.
3.6.2 Lossy Coding Techniques
Transform Coding
The techniques discussed so far work directly on the pixel values and are usually
called Spatial domain techniques[61] (the term refers to the image plane itself, and
methods in this category are based on direct manipulation of the pixels of an image).
Transform Coding techniques use a reversible, linear mathematical transform to map the
pixel values onto a set of coefficients, which are then quantized and encoded.
Many of the resulting coefficients for most natural images have small magnitudes
and can be quantized coarsely (or discarded altogether) without causing significant
distortion in the decoded image. Different mathematical transforms, such as the DFT, the
Discrete Cosine Transform (DCT), the Walsh Hadamard Transform (WHT) and the Karhunen
Loeve Transform (KLT), have been considered for the task. For compression purposes, the
more information a transform packs into fewer coefficients, the better it is; for this
reason, the DCT has become the most widely used transform coding technique.
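The ability of a transform to concentrate information in few coefficients can be observed with a direct implementation of the one-dimensional DCT. The Python sketch below uses the orthonormal DCT-II, which preserves signal energy. The function name is hypothetical and the quadratic-time loop is chosen for clarity, not speed.

```python
import math


def dct_1d(x):
    """Orthonormal DCT-II of a list of samples."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt((1 if k == 0 else 2) / n)
        out.append(scale * sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                               for i in range(n)))
    return out
```

For the smooth ramp 0, 1, ..., 7, the first two of the eight coefficients carry more than 99% of the signal energy, so the remaining six can be quantized coarsely with little distortion.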
Embedded Block Coding with Optimal Truncation (EBCOT) Coding
In JPEG-2000, the Entropy coding of the information is performed by the EBCOT
algorithm, introduced in 1998 by David Taubman. Every sub-band is partitioned into small
blocks (for example 64x64 or 32x32), called Code Blocks. Every Code Block is encoded
independently of the others, producing an elementary embedded bit stream. The algorithm
then finds optimal truncation points for these bit streams in order to minimize the
distortion and to support scalability.
Fractal Coding
Fractal compression is a lossy compression method for digital images based on
fractals. The method is best suited for textures and natural images, since it relies on
the fact that parts of an image often resemble other parts of the same image. Fractal
algorithms convert these parts into mathematical data called Fractal Codes, which are
used to recreate the encoded image.
Chroma Coding
Chroma sub-sampling is the practice of encoding images with less resolution for
Chroma information than for luma information, taking advantage of the human visual
system's lower acuity for color differences than for luminance. It is used in many video
encoding schemes, both analog and digital, and also in JPEG encoding.
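One common form of chroma sub-sampling halves the chroma resolution in both directions (often written 4:2:0). The Python sketch below does this by averaging each 2 x 2 block of a chroma plane. The function name, the averaging filter and the even plane dimensions are assumptions of this example, since other filters are also used in practice.

```python
def subsample_420(chroma):
    """Average each 2 x 2 block of a chroma plane (even width and height assumed)."""
    return [[(chroma[r][c] + chroma[r][c + 1] +
              chroma[r + 1][c] + chroma[r + 1][c + 1]) // 4
             for c in range(0, len(chroma[0]), 2)]
            for r in range(0, len(chroma), 2)]
```

The luma plane is left at full resolution, so each of the two chroma planes shrinks to a quarter of its original number of samples.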
Zonal Coding
Zonal Coding is based on the premise that the transform coefficients having the
highest variances carry most of the signal information and hence should be retained,
whereas those with low variance can be truncated. In Zonal Coding, it is therefore
necessary to compute the variance at every position of the transformed array, based on an
ensemble of representative blocks of transformed arrays or by applying a global image
model, such as the Gauss Markov model. The 'M' transform coefficients with the highest
variances are retained. The retained coefficients have a value of '1' in the binary zonal
mask, whereas all truncated coefficients have a value of '0'. A typical Zonal Coding mask
for an 8 x 8 block is shown in Fig. 3.29.
Fig. 3.29: A Typical Zonal Coding mask for an 8 X 8 block
Each block has the same zonal mask. These masks can be customized for individual images,
in which case the mask information needs to be encoded with the image. Two different bit
allocation policies exist for encoding the retained coefficients. In the first, the same
number of bits is assigned to each retained coefficient; each coefficient is normalized by
its standard deviation and then uniformly quantized. In the second, the number of bits
allocated to each retained coefficient is based on its variance, with more bits allocated
to the coefficients having high variance. In this approach, an optimal Lloyd Max quantizer
is designed for every retained coefficient.
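The construction of a zonal mask from an ensemble of transformed blocks can be sketched as follows. The function name and the tie breaking rule are assumptions of this example, and the variance is taken as the population variance over the ensemble.

```python
def zonal_mask(blocks, keep):
    """blocks: list of equally sized 2D coefficient arrays; keep: the number M."""
    rows, cols = len(blocks[0]), len(blocks[0][0])
    n = len(blocks)
    variances = []
    for r in range(rows):
        for c in range(cols):
            values = [b[r][c] for b in blocks]
            mean = sum(values) / n
            variances.append((sum((v - mean) ** 2 for v in values) / n, r, c))
    # Highest variance first; ties keep the earlier (row, column) position.
    variances.sort(key=lambda t: (-t[0], t[1], t[2]))
    mask = [[0] * cols for _ in range(rows)]
    for _, r, c in variances[:keep]:
        mask[r][c] = 1
    return mask
```

The same mask is then applied to every block, so only the mask and the retained coefficients need to be coded.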
Threshold Coding
Zonal Coding is applied with a fixed mask, which may not be optimal for all
blocks and for all images. For better coding performance, the positions and the number of
retained coefficients should be changed adaptively on a block by block basis, based on
the transformed coefficient array of each block.
Such adaptive bit allocation is done using the Threshold Coding approach, which is
more often used in practice. It is based on the premise that the transform coefficients
of largest magnitude make the most significant contribution to the quality of the
reconstructed block. Only those transform coefficients whose magnitudes exceed a
threshold are considered significant; all the remaining ones can be discarded for image
reconstruction.
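In contrast to the fixed zonal mask, the mask in Threshold Coding is computed separately for every block. A minimal Python sketch, with a hypothetical function name and a single global threshold as an assumption, is:

```python
def threshold_block(block, threshold):
    """Zero every coefficient whose magnitude does not exceed the threshold."""
    kept = [[v if abs(v) > threshold else 0 for v in row] for row in block]
    mask = [[1 if abs(v) > threshold else 0 for v in row] for row in block]
    return kept, mask
```

Because the positions of the retained coefficients differ from block to block, the mask (or an equivalent description of the positions) has to be coded along with the coefficients.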
3.7 Experimental results and Performance analysis of the proposed method
Experimental results
The proposed system, which is a real time platform, is mainly composed of three parts:
a CMOS image sensor, an FPGA and a PC. The functions of the image sensor are fully
programmable via the I2C serial control bus.
Fig. 3.30: Brain tumor MR Image
The real time image is first captured by the image sensor and then output to the FPGA
by the I2C bus. The transform circuit in the FPGA processes the captured image by
performing a 3 level LDWT. The transformed image data of each level are stored in the
SRAM of the FPGA and then displayed on the PC. The experimental result of the Lifting DWT
system is shown in Fig. 3.30, which is the image obtained by applying the 3 level Lifting
DWT to the original sample image.
Performance analysis
Fig. 3.31 shows the simulation result of the 3D Lifting based DWT performed on
the transformed brain MR Image shown in Fig. 3.30.
Fig. 3.31: 3D LDWT Simulation Results
Table 3.5 shows the synthesis report on the hardware utilization of the proposed
method for the 3D Lifting DWT[8-9][57][62].
Table 3.5: 3D Lifting based DWT Synthesis Report
Logic Utilization                                  Used   Available   Utilization
Number of Slice Flip Flops                         444    1,920       23%
Number of 4 input LUTs                             561    1,920       29%
Number of occupied Slices                          484    960         50%
Number of Slices containing only related logic     484    484         100%
Number of Slices containing unrelated logic        0      484         0%
Total Number of 4 input LUTs                       592    1,920       30%
    Number used as logic                           337
    Number used as route-thru                      31
    Number used for Dual Port RAMs                 224
Number of bonded IOBs                              66     83          79%
Number of BUFGMUXs                                 1      24          4%
Average Fanout of Non-Clock Nets                   3.01
The proposed architecture is successfully synthesized on a Spartan 3 FPGA. The
performance, including the memory interface circuit for this real time platform, is shown
in Table 3.5. The performance of the proposed 3D LDWT architecture is compared with
that of the 2D LDWT architecture. The device utilization factors of both architectures are
summarized in Table 3.6, from which it is concluded that the 3D LDWT is more efficient
than the 2D LDWT in respect of the device utilization parameters.
3.8 Inference
The proposed architecture is simulated using Verilog HDL[63] and implemented
on an FPGA[64] [65]. The functions of the image sensor are fully programmable via the I2C
serial control bus. The processed image data are displayed on the PC. The architecture is
pipelined, which reduces the computation time and increases the speed. However, the
simulation and synthesis results show that the utilization of 4 input LUTs has increased
to 29%. The PSNR obtained is around 40 dB and the maximum CR is achieved, which makes the
system suitable for the required application of compressing medical images. The
computation time is observed to be only 36 ms, which is much less than that of the
conventional technique. Hence, this architecture is well suited to the compression of
medical images[66] when the Lifting based DWT[67-70] is chosen as the compression
technique.
Table 3.6: Device utility comparison of 2D LDWT and 3D LDWT
Device Utilization parameters                   Utilization %
                                                2D DWT   3D DWT
Number of slice flip flops                      4%       23%
Number of 4 input LUTs                          5%       29%
Number of occupied slices                       9%       50%
Number of slices containing related logic       100%     100%
Number of slices containing unrelated logic     0%       0%
Total number of 4 input LUTs                    5%       30%
Number of bonded IOBs                           44%      79%
Number of BUFG Mux's                            4%       4%