08-3 - Image Compression

download 08-3 - Image Compression

of 33

Transcript of 08-3 - Image Compression

  • 8/12/2019 08-3 - Image Compression

    1/33

    4/28/2

    methods and standards

    Spring 2008 ELEN 4304/5365 DIP 1

    by Gleb V. Tcheslavski: [email protected]

    http://ee.lamar.edu/gleb/dip/index.htm

    JPEGJPEG while being one of the most popular continuous tone imagecompression standards defines three basic coding schemes:1) A lossy baseline coding system based on DCT;

    , precision, or progressive reconstruction applications;

    3) A lossless independent coding system for reversible compression.

    In a baseline format, the image is subdivided into 8x8 pixel blocks,which are processed left to right, top to bottom. For each block, its 64

    pixels are level-shifted by subtracting 2 k-1, where 2 k is the maximum

    Spring 2008 ELEN 4304/5365 DIP 2

    number of intensity levels. Next, a 2D DCT of the block is computed,quantized, and reordered using the zigzag pattern to form a 1Dsequence of quantized coefficients. Next, the nonzero AC coefficientsare coded using a variable-length code. The DC coefficient isdifference coded relative to the DC coefficient of the previous block.

  • 8/12/2019 08-3 - Image Compression

    2/33

  • 8/12/2019 08-3 - Image Compression

    3/33

    4/28/2

    JPEGJPEG decompression begins from decoding DC and AC coefficients

    and recovering an array of quantized coefficients from a 1D zigzag.

    De-normalized DCT coefficients

    Spring 2008 ELEN 4304/5365 DIP 5

    IDCT array Upscaled IDCT reconstructedimage

    JPEG

    The error between theoriginal and reconstructedimages is due to the lossynature of the JPEGcompression. The rms erroris approximately 5.8 intensitylevels.

    Spring 2008 ELEN 4304/5365 DIP 6

  • 8/12/2019 08-3 - Image Compression

    4/33

    4/28/2

    JPEGJPEGapproximation withcom ression 25:1rms error 5.4

    JPEGapproximation withcompression 52:1,

    Spring 2008 ELEN 4304/5365 DIP 7

    rms error .

    Predictive codingPredictive coding is based on eliminating the redundancies of closelyspaced pixels in space and/or in time by extracting and coding onlythe new information in each pixel. The new information is defined asthe difference between the actual and predicted value of the pixel.

    Lossless predictive coding

    Lossless predictivecoding system

    Spring 2008 ELEN 4304/5365 DIP 8

    Predictor generates theexpected value of eachsample based on aspecified number of

    past samples.

  • 8/12/2019 08-3 - Image Compression

    5/33

    4/28/2

    Predictive codingPredictors output is rounded to the nearest integer and is used to

    compute prediction errore n n n=

    Prediction error is encoded by a variable-length code to generate thenext element of the encoded data stream. The decoder reconstructse(n) from the encoded data and performs the inverse operation

    ( ) ( ) ( )n e n f n= +

    Spring 2008 ELEN 4304/5365 DIP 9

    , ,Often, the prediction is a linear combination of m previous samples:

    1

    ( ) ( )m

    ii

    f n round f n i = =

    Predictive codingWhere m is the order of the linear predictor and i, i = 1,m are

    prediction coefficients, f (n) are the input pixels. The m samples usedfor prediction can be taken from the current scan line (1D linear

    pre c ve co ng , rom e curren an prev ous neLPC), or from the current image and the previous images in an imagesequence (3D LPC). The 1D LPC:

    1

    ( ) ( , )m

    ii

    f n round f x y i =

    =

    Spring 2008 ELEN 4304/5365 DIP 10

    which is a function of the previous pixels in the current line. Notethat the prediction cannot be formed for the first m pixels. These

    pixels are coded by other means (f.e. Huffman code).

  • 8/12/2019 08-3 - Image Compression

    6/33

    4/28/2

    Predictive codingFor the image shown, form a

    first-order (m = 1) LPC in form = , ,A predictor is called a previous

    pixel predictor, and the coding procedure is differential( previous pixel ) coding.

    Spring 2008 ELEN 4304/5365 DIP 11

    up-scaled by 128.

    The average prediction error0.26The entropy reduction is due to removal of spatial redundancy.

    Predictive codingThe compression achieved in predictive coding is related to theentropy reduction resulting from mapping an input image into a

    prediction error sequence. Therefore, the pdf of the prediction error isn genera g y pea e at an as re at ve y sma compare to

    the input image) variance. It is often modeled by a zero-meanuncorrelated Laplacian pdf:

    21

    ( )2

    e

    e

    e

    e

    p e e

    =

    Spring 2008 ELEN 4304/5365 DIP 12

    where e is the standard deviation of e.

  • 8/12/2019 08-3 - Image Compression

    7/33

    4/28/2

    Predictive codingTwo successive frames of Earth

    taken by NASA spacecraft.Usin the first-order m = 1Usin the first-order m = 1LPC:

    [ ] ( , , ) ( , , 1) f x y t round f x y t = with = 1, the pixel intensitiesin the second frame can be

    predicted from the intensities in

    LPC:

    Spring 2008 ELEN 4304/5365 DIP 13

    the first frame; the residualimage

    Considerable decrease in standard deviation and in the entropyindicates significant compression that can be achieved.

    Predictive codingMotion compensated prediction residuals

    Since successive frames in a video sequence are often quite similar,coding their differences can reduce temporal redundancy and providesignificant compression. On the other hand, when a frame sequencecontains rapidly moving objects, the similarity between neighboringframes is reduced. The attempt to use LPC on images with littletemporal redundancy may lead to data expansion . Videocompression systems avoid the problem of data expansion by:1. Tracking object movement and compensating for it during the

    Spring 2008 ELEN 4304/5365 DIP 14

    pre ct on an erenc ng process;2. Switching to an alternative coding method when there is

    insufficient inter-frame correlation (similarity between frames) tomake predictive coding advantageous.

  • 8/12/2019 08-3 - Image Compression

    8/33

    4/28/2

    Predictive codingBasics of motioncompensation:

    Each video frame is divided into non-overlapping rectangular regions typically of size 4x4 to 16x16 macroblocks . The movement ofeach macroblock with respect to the previous frame ( reference frame )

    Spring 2008 ELEN 4304/5365 DIP 15

    is encoded in a motion vector that describes the motion by definingthe vertical and horizontal displacement from the most likely

    position. This displacement is usually specified to the nearest pixel, pixel, or pixel precision. If sub-pixel precision is used, predictionmust be interpolated from a combination of pixels in the frame.

    Predictive codingAn encoded frame that is based on the previous frame (forward

    prediction) is called a predictive frame (P-frame); the frame that isalso based on the subsequent frame (backward prediction) is called a

    - -.stream to be reordered.Finally, some frames are encoded without referencing to any of theneighboring frames (like JPEG) and are encoded independently. Suchframes are calledintraframes or independent

    frames I-frames and are

    Spring 2008 ELEN 4304/5365 DIP 16

    ideal starting points for thegeneration of predictionresiduals. Also, I-frames can

    be easily accessed withoutdecoding the stream.

  • 8/12/2019 08-3 - Image Compression

    9/33

    4/28/2

    Predictive coding Motion estimation is the key concept of motion compensation. During

    it, the motion of objects is measured and encoded into motion vectors.The search for the best motion vector requires specification of . ,

    the basis of maximum correlation or minimum error betweenmacroblock pixels and the predicted (or interpolated) pixels for thechosen reference frame. One of the most frequently used error measures is the mean absolute distortion (MAD):

    1( , ) ( , ) ( , )

    m n

    AD x y f x i y j p x i dx y j dy= + + + + + +

    Spring 2008 ELEN 4304/5365 DIP 17

    1 1i jmn = =

    where x and y are the coordinates of the upper-left pixel of the mxnmacroblock being coded, dx and dy are displacements from thereference frame, and p is an array of predicted macroblock pixels.

    Predictive codingTypically, dx and dymust fall within alimited search regionaroun eacmacroblock.

    Values from 8 to 64 pixels are common, and the horizontal searcharea often is significantly larger than the vertical search area.Another, more computationally efficient measure is the sum of

    absolute distortions (SAD) that omits the 1/ mn factor.

    Spring 2008 ELEN 4304/5365 DIP 18

    For the specified selection criterion (say, MAD), motion estimationis performed by searching for the dx and dy minimizing MAD( x,y)over the allowed range of motion vector displacements block

    matching . An exhaustive search is efficient but expensive; there arefast algorithms that are inexpensive but dont guarantee optimum.

  • 8/12/2019 08-3 - Image Compression

    10/33

    4/28/2

    Predictive codingTwo imagesdiffering by 13frames.

    Spring 2008 ELEN 4304/5365 DIP 19

    Motion vectors:highly correlated variable-length code

    Difference image:stdiv of error =12.73; entropy = 4.2

    Motion-compens.difference image

    Predictive codingMotion compensated prediction residual was computed by dividingthe latest figure into 16x16 macroblocks and comparing eachmacroblock to all possible 16x16 macroblock in the earlier framewithin 16 pixels position. The MAD criterion was used. Theresulting standard deviation was 5.62 and the entropy was 3.04

    bits/pixel.

    We observe that there is no motion in the lower portion of the imagecorresponding to the space shuttle. Therefore, no motion vectors are

    shown. The macroblocks in this area are redicted from similarl

    Spring 2008 ELEN 4304/5365 DIP 20

    located macroblocks in the reference frame.

  • 8/12/2019 08-3 - Image Compression

    11/33

    4/28/2

    Predictive codingPrediction accuracy can be increased using sub-pixel motioncompensation.

    Prediction Residual with 1residual with nomotioncompensation:Stdev = 12.7;Entropy = 4.17

    pixel motioncompensation:Stdev = 4.4;Entropy = 3.34

    Residual with Residual with

    Spring 2008 ELEN 4304/5365 DIP 21

    pixel motioncompensation:Stdev = 4;Entropy = 3.35

    pixel motioncompensation:

    Stdev = 3.8;Entropy = 3.34

    Predictive codingMotion estimation is a computationally intensive process. Fortunately,only the encoder must estimate the macroblock motion. The decoder

    of the reference frames that were used in the encoder to form the

    prediction residuals.

    For this reason, most video compression standards do not includemotion estimation. Instead, compression standards focus on thedecoder: place constraints on macroblock dimensions, motion vector

    Spring 2008 ELEN 4304/5365 DIP 22

    prec s on, or zon a an ver ca sp acemen ranges, e c.

  • 8/12/2019 08-3 - Image Compression

    12/33

    4/28/2

    Predictive coding

    Spring 2008 ELEN 4304/5365 DIP 23

    Predictive codingMost of the video compression standards use an 8x8 DCT for I-frameencoding but specify a larger area (16x16 macroblocks) for motioncompensation. Additionally, even the P- and B-frame predictionres ua s are trans orm co e ue to e ect veness o coe c entquantization. The H.264 and MPEG-4 AVC support intraframe

    predictive coding (in I-frames) to reduce spatial redundancy

    Spring 2008 ELEN 4304/5365 DIP 24

  • 8/12/2019 08-3 - Image Compression

    13/33

    4/28/2

    Predictive coding

    A typical motion-compensatedvideo encoder

    Spring 2008 ELEN 4304/5365 DIP 25

    An encoder exploits redundancies within and between adjacent video

    frames, and the psychovisual properties of the human visual system.

    Predictive codingEncoders input is a sequence of macroblocks. For color video, eachmacroblock consists of a luminance block and 2 chrominance blocks.The human eye has less spatial acuity for color than for luminance,t ere ore, t e c rom nance oc s are o ten samp e at a t ehorizontal and vertical resolution the luminance block.

    luma

    chroma

    Spring 2008 ELEN 4304/5365 DIP 26

    Grayed elements of the encoder are JPEG encoder that may operateon conventional macroblock (I-frames) or their differences (P- andB-blocks). Inverse mapper performs IDCT.

  • 8/12/2019 08-3 - Image Compression

    14/33

    4/28/2

    Predictivecoding

    1 minute HD(1280x720) full-colorvideo containing 150frames fade-in from

    black (frames 21, 44),to black (frames 1595,1609, 1652), abruptchanges (frames 1303,

    Spring 2008 ELEN 4304/5365 DIP 27

    1304). H.264compression requires

    44.56 MB of storage ascompared to about 5GB uncompressed.

    MPEG-2

    Abbr. NamePictureCoding Types

    ChromaFormat

    Aspect RatiosScalablemodes

    MPEG-2 Profiles

    SP Simple profile I, P 4:2:0square pixels,4:3, or 16:9

    none

    MP Main profile I, P, B 4:2:0square pixels,4:3, or 16:9

    none

    SNR SNR Scalable

    profileI, P, B 4:2:0

    square pixels,4:3, or 16:9

    SNR (signal-to-noise ratio)scalable

    Spring 2008 ELEN 4304/5365 DIP 28

    SpatialSpatiallyScalable

    profileI, P, B 4:2:0

    square pixels,4:3, or 16:9

    SNR- orspatial-scalable

    HP High profile I, P, B 4:2:2 or 4:2:0 square pixels,4:3, or 16:9

    SNR- orspatial-scalable

  • 8/12/2019 08-3 - Image Compression

    15/33

    4/28/2

    MPEG-2

    Abbr. Name Frame rates (Hz) Max h. Max v- Max luminanceMax bitrate

    MPEG-2 Levels

    . . (Mbit/s)

    LL Low Level 23.976, 24, 25,29.97, 30 352 288 3,041,280 4

    ML MainLevel

    23.976, 24, 25,29.97, 30

    720 576

    10,368,000;Hp: 14,475,600 for4:2:0 and11,059,200 for 4:2:2

    15

    Spring 2008 ELEN 4304/5365 DIP 29

    H-14 High 144023.976, 24, 25,29.97, 30, 50,59.94, 60

    1440 115247,001,600Hp with 4:2:0:62,668,800

    60

    HL High Level23.976, 24, 25,29.97, 30, 50,59.94, 60

    1920 115262,668,800Hp: with 4:2:0:83,558,400

    80

    MPEG-2 Allowed Resolutions720 480, 704 480, 352 480, 352 240 pixel (NTSC)720 576, 704 576, 352 576, 352 288 pixel (PAL)

    Allowed Aspect ratios (Display AR)4:316:9(1.85:1 and 2.35:1, among others, are often listed as valid DVD aspect ratios,but are actually just a 16:9 image with the top and bottom of the frame maskedin black)

    Allowed Frame rates29.97 frame/s (NTSC)25 frame/s (PAL)

    Note: By using a pattern of REPEAT_FIRST_FIELD flags on the headers of encoded

    Spring 2008 ELEN 4304/5365 DIP 30

    pictures, pictures can be displayed for either two or three fields and almost any picture display rate (minimum of the frame rate) can be achieved. This is mostoften used to display 23.976 (approximately film rate) video on NTSC. Audio+videobitrate

    Video peak 9.8 Mbit/sTotal peak 10.08 Mbit/sMinimum 300 kbit/s

  • 8/12/2019 08-3 - Image Compression

    16/33

    4/28/2

    Lossy predictive coding

    Lossy predictive codingis achieved by including

    the error-free output bythe nearest integer.

    The decompressed sequence

    Spring 2008 ELEN 4304/5365 DIP 31

    ( ) ( ) ( ) f n e n f n= +is also a predictors input. mapped prediction error

    Lossy predictive coding Delta modulation

    Delta modulation is a simple form of lossy predictive coding, wherethe redictor and uantizer are defined as

    ( ) ( 1)

    ( ) 0( )

    f n f n

    for e ne n

    otherwise

    = + >=

    where is a prediction coefficient (normally < 1) and is a positive

    Spring 2008 ELEN 4304/5365 DIP 32

    constant. The quantizer output can be represented by a single bit.

  • 8/12/2019 08-3 - Image Compression

    17/33

    4/28/2

    Lossy predictive coding

    Calculations needed tocompress and reconstructan input sequence {14 1514 15 13 15 15 14 20 26 2728 27 27 29 37 47 62 75 77

    Spring 2008 ELEN 4304/5365 DIP 33

    = 1, = 6.5.

    The process begins with the error-free transfer of the first inputsample 14 to the decoder. The remaining outputs are computedaccording to the equations.

    Lossy predictive codingWhen is too small to represent the inputs change, a distortioncalled slope overload occurs. On the other hand, if is too large torepresent the inputs smallest change, granular noise appears. Inimages, these distortions show up as blurred edges and grainy(noisy) surfaces (distorted smooth areas).

    These distortions are common to all lossy predictive coding formsand arise from a combination of quantization and prediction methodsused. Predictors are normally designed with the assumption of no

    Spring 2008 ELEN 4304/5365 DIP 34

    quant zat on error, an quant zers are es gne to m n m ze t eyrown errors.

  • 8/12/2019 08-3 - Image Compression

    18/33

    4/28/2

    Optimal predictorsUsually, the predictor is chosen to minimize the encoders mean-

    square prediction error:2

    2 =

    with the constrain that

    ( ) ( ) ( ) ( ) ( ) ( ) f n e n f n e n f n f n= + + =

    and ( ) ( 1)m

    f n f n =

    Spring 2008 ELEN 4304/5365 DIP 35

    1i=

    Therefore, the optimization criterion is minimal mean-square

    prediction error, the quantization error is assumed to be negligible( (n) e(n)), and the prediction is a linear combination of m

    previous samples.

    Optimal predictorsThe resulting predictive coding approach is called a differential

    pulse code modulation (DPCM). Therefore, we need to select the m prediction coefficients that minimize

    { }2

    2

    1

    ( ) ( ) ( )m

    ii

    E e n E f n f n i =

    =

    The minimization can be done by differentiating the above equation

    with res ect to each coefficient, e uatin the derivatives to zero, and

    Spring 2008 ELEN 4304/5365 DIP 36

    solving them:1

    = R r

  • 8/12/2019 08-3 - Image Compression

    19/33

    4/28/2

    Optimal predictorswhere R -1 is the inverse of the mxm autocorrelation matrix

    { } { } { }( 1) ( 1) ( 1) ( 2) ( 1) ( )2 1 2 2

    E f n f n E f n f n E f n f n m

    E n n E n n

    { } { } { }( ) ( 1) ( ) ( 2) ( ) ( ) E f n m f n E f n m f n E f n m f n m

    R =

    And r and are the m-element vectors:

    { }( ) ( 1)2

    E f n f n

    E n n

    1

    Spring 2008 ELEN 4304/5365 DIP 37

    { }( ) ( ) E f n f n m

    r =

    m

    =

    The coefficients depend only on the autocorrelations of the samplesin the original sequence and can be determined by matrix operations.

    Optimal predictorsThe variance of the prediction error resulting from using theseoptimal coefficients would be

    2 2 2m

    T = = 1

    e ii=

    However, computations of autocorrelations is very difficult in practice. Usually, a set of global coefficients is computed assuming asimple input model and substituting the correspondingautocorrelations.. For instance, when a 2D Markov image source

    with the se arable autocorrelation function

    Spring 2008 ELEN 4304/5365 DIP 38

    { } 2( , ) ( , ) i jv h E f x y f x i y j =and a generalized 4 th order linear predictor

    1 2 3 4 ( , ) ( , 1) ( 1, 1) ( 1, ) ( 1, 1) f x y f x y f x y f x y f x y = + + + +

  • 8/12/2019 08-3 - Image Compression

    20/33

    4/28/2

    Optimal predictorsare assumed, the resulting optimal coefficients are

    1 2 3 4 0h v h v = = = = v

    the considered image. The sum of prediction coefficients usually is

    1

    1m

    ii

    =

    which ensures that the predictors output is within the allowed rangeand reduces the im act of transmission noise usuall seen as

    Spring 2008 ELEN 4304/5365 DIP 39

    horizontal streaks). Reducing the decoders susceptibility to theinput noise is important since a single error may propagate to all

    future outputs! That is, the decoders output may become unstable. Ifthe sum is strictly less than one, an input error will affect only asmall number of outputs.

    Optimal predictorsConsidering the prediction error resulting from DPMC coding themonochrome Lena image assuming a zero quantization error andusing each of the 4 predictors:

    ( , ) 0.97 ( , 1) ( , ) 0.5 ( , 1) 0.5 ( 1, ) ( , ) 0.75 ( , 1) 0.75 ( 1, ) 0.5 ( 1, 1)

    0.97 ( , 1) ( , )

    0.97 ( 1, )

    f x y f x y

    f x y f x y f x y

    f x y f x y f x y f x y

    f x y if h v f x y

    f x y otherwise

    = = + = +

    =

    Spring 2008 ELEN 4304/5365 DIP 40

    where( 1, ) ( 1, 1)

    ( , 1) ( 1, 1)

    h f x y f x y

    v f x y f x y

    = =

    are the horizontal and vertical gradients at point ( x,y).

  • 8/12/2019 08-3 - Image Compression

    21/33

    4/28/2

    Optimal predictorsObserve that the 4 th adaptive predictor is designed to improve edge

    rendition by computing a local measure of the directional properties.

    computed for the 4 predictors.

    The visually perceptive errordecreases as the predictorsorder increases.

    Spring 2008 ELEN 4304/5365 DIP 41

    The standard deviations are11.1, 9.8, 9.1, and 9.7 intensitylevels.

    Optimal quantizationThe staircase quantization functiont = q ( s) is an odd function of s (i.e.q(- s) = -q( s)) and can be describedcompletely by the L/2 values of siand t i shown in the first quadrant ofthe graph. These break points definefunction discontinuities and arecalled the quantizers decision and

    reconstruction levels.

    Spring 2008 ELEN 4304/5365 DIP 42

    y t e convent on, s s mappe to t i t es n si , s i+ 1 .

    The quantizer design problem is to select the best si and t i for a particular optimization criterion and input pdf p( s).

  • 8/12/2019 08-3 - Image Compression

    22/33

    4/28/2

    Optimal quantizationIf the optimization criterion (that can be either statistical or psycho-

    visual measure) is the minimization of the mean-square quantizationerror E {( si t i)2} and p( s) is an even function, the conditions form n ma error are

    1

    1

    ( ) ( ) 1, 2,... 2

    0 0

    1,2,... 2 1

    i

    i

    s

    i s

    i ii

    s t p s ds i L

    i

    t t s i L

    +

    =

    = += =

    Spring 2008 ELEN 4304/5365 DIP 43

    2

    i i

    i i

    i L

    s st t

    =

    = =

    Optimal quantizationTherefore, the reconstruction levels are the centroids of the areaunder p( s) over the specified decision intervals; the decision levelsare halfway between the reconstruction levels; q is an odd function.

    e quan zer sa s y ng e a ove con ons s op ma n e mean-square error sense and is called the L-level Lloyd-Max quantizer.

    For a unit variance Laplacian pdf, the 2-, 4-, and 8-level Lloyd-Maxdecision and reconstruction levels (computed numerically) are:

    Spring 2008 ELEN 4304/5365 DIP 44

  • 8/12/2019 08-3 - Image Compression

    23/33

    4/28/2

    Optimal quantizationThe 3 quantizers shown provide fixed output rates of 1, 2, and 3

    bits/pixel. If a non-unit variance pdf is considered, the reconstructionand decision levels are obtained by multiplying the shown values by

    e s an ar ev a on o e cons ere p .The last row in the table shows the step size satisfying

    1 1i i i it t s s = = The Lloyd-Max quantizer is not adaptive. However, adjusting thequantization levels based on the local behavior of an image can bever beneficial. In theor slowl chan in re ions can be finel

    Spring 2008 ELEN 4304/5365 DIP 45

    quantized, while the rapidly changing areas are quantized morecoarsely. This approach reduces both the granular noise and theslope overload, while requiring a minimal increase in code rate andincreased quantizer complexity.

    Wavelet compressionThe idea of wavelet-based compression is similar to the idea of DCTcompression: the transform coefficients can be stored moreefficiently than the pixels themselves since a transform decorrelates

    e n orma on s ore n p xe s. e rans orm as s unc ons nthis case wavelets) pack most of the visual information in fewcoefficients, the remaining coefficients can be quantized coarsely orzeroed with little image distortion.

    A typical

    Spring 2008 ELEN 4304/5365 DIP 46

    waveletcodingsystem

  • 8/12/2019 08-3 - Image Compression

    24/33

    4/28/2

    Wavelet compressionTo encode a 2 Jx2J image, an analyzing wavelet and a minimumdecomposition level J-P are selected and used to compute DWT of theimage. If the wavelet has the complementary scaling function , theas wave e rans orm can e use . e rans orm conver s a arge

    portion of the original image to horizontal, vertical, and diagonalcoefficients with zero mean and Laplacian-like distribution. Many ofthese coefficients carry little visual information and can be quantizedand coded to reduce inter-coefficient and coding redundancy.

    Since the wavelet transform is both computationally efficient and

    Spring 2008 ELEN 4304/5365 DIP 47

    inherently local (since basis functions have a limited duration), imagesubdivision into block is not needed, which eliminate the blocking

    artifact and is the major difference compared to the transform codingsystems.

    Wavelet compression

    The selection of a wavelet affects the computational complexity of

    Wavelet selection

    images of acceptable error.

    When a wavelet has a companion scaling function, transformationscan be implemented as a sequence of filtering operations. The abilityof the wavelet to pack information into a small number of transformcoefficients determines its compression and reconstruction

    Spring 2008 ELEN 4304/5365 DIP 48

    per ormance.

    The most frequently used are Daubechies wavelets and biorthogonalwavelets.

  • 8/12/2019 08-3 - Image Compression

    25/33

    4/28/2

    Wavelet compression3-scale

    Haarwavelet

    3-scale

    Daubechieswavelet

    rans orm

    3-scales melet

    rans orm

    3-scaleCohen-

    Spring 2008 ELEN 4304/5365 DIP 49

    transform Daubechies-Feauveau

    biorthogonalwavelettransform

    Wavelet compressionSymlets are an extension of Daubechies wavelets with increasedsymmetry. As seen in the table, computational intensity increases(from 4 to 28 multiplications and additions per coefficient) whenmov ng o more comp ex wave e s, as we as e n orma on

    packing performance.

    Spring 2008 ELEN 4304/5365 DIP 50

    The coefficients below 1.5 were set to zero. The potentialcompression ability of biorthogonal wavelet is almost 10% higherthan the one of Haar wavelet.

  • 8/12/2019 08-3 - Image Compression

    26/33

    4/28/2

    Wavelet compressionDecomposition level selection

    Since a P -scale fast wavelet transform involves P filter banks, thenumber of iterations in the computation of the forward and inversetransforms increases with the number of decomposition levels. Inmany applications (like searching image database or transmittingimages for progressive reconstruction), the resolution of the storedor transmitted images and the scale of the lowest usefulapproximation normally determine the number of transform levels.

    Spring 2008 ELEN 4304/5365 DIP 51

    or a biorthogonal

    wavelet anda globalthreshold 25

    Wavelet compression

    The most important factor affecting wavelet coding compression andreconstruction error is coefficient uantization. The most widel

    Quantizer design

    used compressors are uniform. However, the effectiveness of thequantization can be improved significantly by

    1) Introducing a larger quantization interval around zero (a dead zone );

    2) Adapting the size of the quantization interval from scale to scale.

    Spring 2008 ELEN 4304/5365 DIP 52

    In either case, the selected quantization intervals must be transmittedto the decoder with the encoded image bit stream. The intervalsthemselves may be determined heuristically or automaticallycomputed based on the image being decoded.

  • 8/12/2019 08-3 - Image Compression

    27/33

  • 8/12/2019 08-3 - Image Compression

    28/33

    4/28/2

    Wavelet compression : JPEG-2000For instance, the irreversible component transform of JPEG-200 is:

    0 0 1 2( , ) 0.299 ( , ) 0.587 ( , ) 0.114 ( , )Y x y I x y I x y I x y= + +1 0 1 2

    3 0 1 2

    , . , . , . ,

    ( , ) 0.5 ( , ) 0.41869 ( , ) 0.08131 ( , )Y x y I x y I x y I x y

    =

    where I 0 , I 1 , and I 2 are the level-shifted input components and Y 0 , Y 1 ,and Y 2 are the corresponding decorrelated components. If the inputcomponents are the red, green, and blue planes, the equation

    Spring 2008 ELEN 4304/5365 DIP 55

    r .transformation is to improve compression efficiency since thetransformed components Y 1 and Y 2 are difference images whosehistograms are highly peaked around zero.

    Wavelet compression : JPEG-2000After the image has been level shifted and optionally decorrelated,its components can be divided into tiles rectangular arrays of

    p xe s t at are processe n epen ent y prov ng a s mp emechanism to access and/or manipulate a limited region of a codedimage.

    For instance, an image with a 16:9 aspect ratio can be subdividedinto tiles such that one of its tiles is a subimage with a 4:3 aspect

    ratio. That tile could be reconstructed without accessing the other

    Spring 2008 ELEN 4304/5365 DIP 56

    tiles in the compressed image.

    If the image is not subdivided into tiles, it is called a single tile .

  • 8/12/2019 08-3 - Image Compression

    29/33

    4/28/2

    Wavelet compression : JPEG-2000 Next, the 1D DWT of the rows and columns of each tile component

    is computed. For error-free compression, the transform uses a biorthogonal 5-3 coefficient scaling and wavelet vector. A rounding proce ure s use or e rans orm coe c en s av ng non- n egervalues.In lossy applications, a 9-7 coefficient scaling-wavelet vector isused. In either case, the transform is computed using either the fastwavelet transform or a complementary lifting-based approach.

    The coefficients

    Spring 2008 ELEN 4304/5365 DIP 57

    used to constructthe 9-7 FWT

    analysis filter bank

    Wavelet compression : JPEG-2000The complementary lifting-based implementation involves 6sequential lifting/scaling operations:

    [ ] 0 1(2 1) (2 1) (2 ) (2 2) 3 2 1 3Y n X n X n X n i n i + = + + + + + < +[ ]

    [ ]0 1

    0 1

    (2 ) (2 ) (2 1) (2 1) 2 2 2

    (2 1) (2 1) (2 ) (2 2) 1 2 1 1

    (2 ) (2 ) (2 1) (2 1

    Y n X n Y n Y n i n i

    Y n Y n Y n Y n i n i

    Y n Y n Y n Y n

    = + + + < ++ = + + + + + < += + + +[ ] 0 1

    0 1

    0 1

    ) 2

    (2 1) (2 1) 2 1

    (2 ) (2 ) 2

    i n i

    Y n K Y n i n i

    Y n Y n K i n i