Chapter 8
Image Compression
Commonly Used Formats:
Portable Bitmap family (BMP, Lena 66,616 B)
Graphics Interchange Format (GIF, Lena 70,458 B)
Tagged Image File Format (TIFF/TIF, Lena 88,508 B)
JPEG format (JFIF/JFI/JPG, Lena 12,377 B)
Image Data Compression
The large amount of data per image has three implications: storage, processing, and communications.
For a digital image, 512 × 512 × 3 bytes/frame = 768 KB/frame.
At a frame rate of half the line frequency (e.g. 25 Hz), the total data rate = 768 KB × 25/s = 19.2 MB/s.
Image Data Compression
The goal of image compression is to reduce the amount of data required to represent a digital image.
Compression can be lossless or lossy. The strategy is to remove redundant data from the image. The compressed data should allow the original image to be reconstructed or approximated.
Image Data Compression
Entropy Coding
Pixel Coding
Entropy Coding
Consider a source with L possible independent symbols with probabilities $p_i$, $i = 0, \ldots, L-1$. The entropy is defined as
$$H = -\sum_{i=0}^{L-1} p_i \log_2 p_i$$
Shannon's noiseless coding theorem: it is possible to code, without distortion, a source of entropy $H$ using $H + \varepsilon$ bits/symbol, where $\varepsilon$ can be arbitrarily small.
Ideal compression ratio (without distortion):
$$C = \frac{B}{H}$$
where $B$ is the number of bits/symbol before compression.
Example: for 256 gray levels with equal probability, $p_i = \frac{1}{256}$, and
$$H = -\sum_{i=0}^{255} \frac{1}{256} \log_2 \frac{1}{256} = \log_2 256 = 8 \text{ bits}$$
i.e. no compression ($C = 8/8 = 1$) if the gray levels are all random.
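As a quick sketch, the entropy and ideal compression ratio can be computed directly from the probabilities. The snippet below (Python, chosen here since the chapter specifies no language) checks the equiprobable case, plus a skewed source using the vowel probabilities of the Huffman example further below:

```python
# A sketch computing the entropy H and the ideal compression ratio C = B/H.
import math

def entropy(probs):
    """H = -sum(p * log2 p) over symbols with nonzero probability."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# 256 equiprobable gray levels: H = 8 bits, so C = 8/8 = 1 (no compression).
print(entropy([1 / 256] * 256))   # 8.0

# A skewed source compresses: the vowel probabilities A, E, I, O, U used in
# the Huffman example below.
H = entropy([0.12, 0.42, 0.09, 0.30, 0.07])
B = 3   # bits/symbol for a fixed-length code over 5 symbols
print(H, B / H)   # H ~= 2.0 bits, ideal ratio ~= 1.5
```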
If the $p_i$ are different, then use a variable-length code – Huffman coding.

Huffman Coding
Arrange the symbol probabilities in decreasing order and consider them as leaf nodes of a tree.
While there is more than one node: merge the two nodes with the smallest probabilities to form a new node, whose probability is the sum of the two merged nodes. Arbitrarily assign 1 and 0 to each pair of branches merging into a node.
Repeat the above until there is only one node left, with a probability of 1.
The resultant code is obtained by reading sequentially from the root node to the leaf node where the symbol is located.
Note: There may be a choice between two symbols with the same probability. If this is the case, either symbol can be chosen. The final tree and codes will be different, but the overall efficiency of the code will be the same.
Notice that each string of 0's and 1's can be uniquely decoded.
Coding and decoding – by table lookup.
Encode the letters A (0.12), E (0.42), I (0.09), O (0.30), U (0.07)
Thus the codes for each letter are: A – 100, E – 0, I – 1011, O – 11, U – 1010.
Now, using this code, any string of vowels can be written uniquely:
AI = 1001011, EIEIO = 010110101111, UEA = 10100100;
and decoded: 10110 = IE, 100101011 = AUO, 0101111111010 = EIOOU.
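The tree-merging procedure above is easy to sketch in code. The following is a minimal illustration, not a production coder; tie-breaking between equal probabilities is arbitrary, so the exact 0/1 labels may differ from the table above while the code lengths stay the same:

```python
# A minimal Huffman-coding sketch reproducing the vowel example above.
import heapq
from itertools import count

def huffman_codes(probs):
    """Build a Huffman code table from {symbol: probability}."""
    tiebreak = count()  # avoids comparing dicts when probabilities are equal
    heap = [(p, next(tiebreak), {s: ""}) for s, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)  # the two smallest-probability nodes
        p2, _, c2 = heapq.heappop(heap)
        # Prefix 1 onto one branch and 0 onto the other, then merge the nodes.
        merged = {s: "1" + code for s, code in c1.items()}
        merged.update({s: "0" + code for s, code in c2.items()})
        heapq.heappush(heap, (p1 + p2, next(tiebreak), merged))
    return heap[0][2]

codes = huffman_codes({"A": 0.12, "E": 0.42, "I": 0.09, "O": 0.30, "U": 0.07})
print(codes)  # e.g. {'E': '0', 'O': '11', 'A': '100', 'U': '1010', 'I': '1011'}
```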
Run‑Coded Binary Images
Run‑coding is an efficient coding scheme for binary or labeled images: not only does it reduce memory space, but it can also speed up image operations.
Example:
Image row r: 0000000011111000000000000111000000011111111100000
Run-code: 8(0)5(1)12(0)3(1)7(0)9(1)5(0)
Run‑coding is often used for compression within standard file formats.
Run-Length Coding (RLC)
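A minimal sketch of the run coder, reproducing the run-code of the example row above:

```python
# Run-length coding of a binary row, matching the example above.
from itertools import groupby

def run_encode(row):
    """Return a list of (length, value) runs for a string of '0'/'1' pixels."""
    return [(len(list(g)), v) for v, g in groupby(row)]

def run_decode(runs):
    return "".join(v * n for n, v in runs)

row = "0000000011111000000000000111000000011111111100000"
runs = run_encode(row)
print("".join(f"{n}({v})" for n, v in runs))  # 8(0)5(1)12(0)3(1)7(0)9(1)5(0)
assert run_decode(runs) == row                # lossless round trip
```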
Arithmetic Coding
Arithmetic coding is a lossless coding technique. It typically achieves a better compression ratio than Huffman coding, as it produces a single codeword for the entire message rather than several separate codewords.
A message is represented by an interval of real numbers between 0 and 1.
Successive symbols of the message reduce the size of the interval in accordance with the symbol probabilities generated by the model.
Start with the interval [0, 1), divided into subintervals for all possible symbols that may appear within a message. Make the size of each subinterval proportional to the frequency at which the symbol appears in the message. E.g.:
Symbol Probability Interval
a 0.2 [0.0, 0.2)
b 0.3 [0.2, 0.5)
c 0.1 [0.5, 0.6)
d 0.4 [0.6, 1.0)
When encoding a symbol, "zoom" into the current interval and divide it into subintervals as in step one, scaled to the new range. Example: suppose we want to encode "addc". We "zoom" into the interval corresponding to "a" and divide that interval into smaller subintervals as before. This new interval becomes the basis of the next symbol's encoding step.
Symbol New "a" Interval
aa [0.0, 0.04)
ab [0.04, 0.1)
ac [0.1, 0.102)
ad [0.102, 0.2)
Repeat the process until all symbols are encoded or the maximum precision of the machine is reached. To encode the next character "d", we use the "a" interval created before, zoom into its subinterval for "d", and use that for the next step. This produces:
Symbol New “ad" Interval
ada [0.102, 0.1216)
adb [0.1216, 0.151)
adc [0.151, 0.1608)
add [0.1608, 0.2)
And lastly, the final result is:
SymbolNew “add" Interval
adda [0.1608, 0.16864)
addb [0.16864, 0.1804)
addc [0.1804, 0.18432)
addd [0.18432, 0.2)
Transmit some number within the final interval as the codeword. The number of symbols encoded is stated in the protocol of the image format, so any number within [0.184, 0.1872) is acceptable for "addc". Finally, the shortest binary fraction that lies within [0.184, 0.1872) is chosen as the codeword.
To decode the message, a similar algorithm is followed, except that the final number is given, and the symbols are decoded sequentially from that.
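The interval-narrowing loop can be sketched as follows. Plain floating point stands in for the arbitrary-precision arithmetic a real coder would use, and the model is the a/b/c/d table above:

```python
# An encoder sketch for the interval-narrowing step described above.
MODEL = [("a", 0.0, 0.2), ("b", 0.2, 0.5), ("c", 0.5, 0.6), ("d", 0.6, 1.0)]

def encode(message):
    low, high = 0.0, 1.0
    for sym in message:
        width = high - low
        for s, lo, hi in MODEL:
            if s == sym:
                # Zoom into the subinterval belonging to this symbol.
                low, high = low + width * lo, low + width * hi
                break
    return low, high

print(encode("addc"))  # ~ (0.184, 0.1872): any number in this interval works
```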
Bit-plane Encoding
E.g. a 256-gray-level image can be considered as eight 1-bit planes.
Each 1-bit plane is coded by RLC. Usually, the compression ratio is ~1.5–2.
Disadvantage: sensitive to noise in transmission.
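A short sketch of the bit-plane decomposition with NumPy; reassembling the eight planes recovers the image exactly:

```python
# Bit-plane decomposition: an 8-bit image becomes eight binary planes,
# each of which could then be run-length coded.
import numpy as np

img = np.random.randint(0, 256, size=(4, 4), dtype=np.uint8)  # toy "image"
planes = [(img >> b) & 1 for b in range(8)]  # plane 0 = LSB, plane 7 = MSB

# Reassembling the planes recovers the image exactly (lossless).
rebuilt = sum(p.astype(np.uint8) << b for b, p in enumerate(planes))
assert np.array_equal(rebuilt, img)
```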
Predictive Coding
Predictive Techniques
Remove mutual redundancy between successive pixels and encode only the new information.
A quantity $\bar u(n)$, an estimate of $u(n)$, is predicted from the previously decoded samples:
$$\bar u(n) = \phi\big(u^*(n-1),\, u^*(n-2),\, \ldots\big)$$
where $\phi$ denotes the prediction rule.
Given the prediction rule, we need only code the error (difference)
$$e(n) = u(n) - \bar u(n)$$
and $e^*(n)$ is the quantized value of $e(n)$. The reconstructed sample is
$$u^*(n) = \bar u(n) + e^*(n)$$
Differential Pulse Code Modulation (DPCM)
Example:
The sequence 100, 102, 120, 120, 118, 116 is to be predictively coded using the prediction rule $\bar u(n) = u^*(n-1)$ for DPCM, and $\bar u(n) = u(n-1)$ for a feedforward predictive coder. Assume a 2-bit quantizer as shown below.
Given $u^*(0) = u(0) = 100$, we can obtain the following table:
We can see that the reconstruction error builds up with the feedforward system, while the error stabilizes with DPCM.
Note that if the input sequence is integer, and the predicted output sequence is made to be integer, then the error is integer and can be coded for perfect reconstruction.
The advantage of the quantizer is that the error sequence is distributed over a much smaller range, and hence can be coded with fewer bits.
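The example can be simulated with a few lines of code. The 4-level quantizer below is an assumption standing in for the slide's 2-bit quantizer table, which is not reproduced here; the qualitative behaviour (drift vs. stabilization) holds regardless:

```python
# DPCM vs. feedforward prediction on 100, 102, 120, 120, 118, 116.
def quantize(e):
    """Toy 2-bit quantizer: map the error to one of 4 reconstruction levels."""
    levels = [-12, -2, 2, 12]
    return min(levels, key=lambda q: abs(q - e))

seq = [100, 102, 120, 120, 118, 116]

# DPCM: predict from the previously *reconstructed* sample u*(n-1).
rec = [seq[0]]
for n in range(1, len(seq)):
    rec.append(rec[-1] + quantize(seq[n] - rec[-1]))

# Feedforward: predict from the previous *input* u(n-1); the decoder only
# has reconstructed values, so quantization errors accumulate.
ff = [seq[0]]
for n in range(1, len(seq)):
    ff.append(ff[-1] + quantize(seq[n] - seq[n - 1]))

print(rec)  # [100, 102, 114, 116, 118, 116] - tracks the input
print(ff)   # [100, 102, 114, 112, 110, 108] - drifts away from it
```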
Delta Modulation (DM)
Simplest form: $\bar u(n) = u^*(n-1)$ and a one-bit quantizer.
Problems:
slope overload [increase sampling rate]
granularity noise [use tri-state DM]
instability to transmission error [use leak, attenuating the predictor output by a factor < 1]
A 2-D image can be coded line by line. Each scan line is coded independently by DPCM, using a 1-D model:
$$\bar u(n) = \sum_{k \ge 1} a_k\, u(n-k)$$
Perform quantization on the error sequence.
2-D DPCM
2-D prediction model:
$$\bar u(m,n) = \sum_{(k,l) \in W} a_{k,l}\, u(m-k, n-l)$$
where W is a prediction window. In practice, only the nearest neighbours are used:
$$\bar u(m,n) = a_1 u(m-1,n) + a_2 u(m,n-1) + a_3 u(m-1,n-1) + a_4 u(m-1,n+1)$$
The coefficients can be obtained by the least-squares method: with the prediction error
$$e_{m,n} = u(m,n) - a_1 u(m-1,n) - a_2 u(m,n-1) - a_3 u(m-1,n-1) - a_4 u(m-1,n+1)$$
setting the derivatives of $E\big[e_{m,n}^2\big]$ with respect to each $a_i$ to zero yields a set of linear (normal) equations in the autocorrelation values of the image, which are solved for $a_1, \ldots, a_4$.
Transform Coding
Fourier Transform (FFT) and Inverse Fourier Transform (IFFT)
Fourier Low-Pass Filtering
Transform Coding
Block quantization: a block of data is unitarily transformed so that a large portion of its energy is packed into relatively few transform coefficients.
It can be shown that the K-L transform is the optimal choice: it minimizes the mean square distortion of the reproduced data for a given number of bits.
There is no fast algorithm for the K-L transform, so the DCT is usually used.
The 2-D DCT of an image f(x,y) is C(u,v), u,v = 0, 1, 2, …, N-1:
$$C(u,v) = \alpha(u)\,\alpha(v) \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x,y)\, \cos\!\left[\frac{(2x+1)u\pi}{2N}\right] \cos\!\left[\frac{(2y+1)v\pi}{2N}\right]$$
where $\alpha(0) = \sqrt{1/N}$ and $\alpha(u) = \sqrt{2/N}$ for $u \neq 0$; in particular, $C(0,0) = \frac{1}{N}\sum_{x=0}^{N-1}\sum_{y=0}^{N-1} f(x,y)$.
By the DCT, the image f(x,y) is decomposed into a series expansion of basis functions, which are used as the features.
DCT-based Coding
Divide the image into small rectangular (square) blocks.
Perform a unitary transformation on each block.
The coefficients are uniformly quantized, but each coefficient with a different step size, specified by a Quantization Table. The coefficients with more energy are allocated more bits.
Entropy encoding: Huffman coding / arithmetic coding (needs a code table).
Furthermore, the DC coefficient (the first coefficient in the DCT, which represents the average gray level) is coded as a difference from the previous block.
The coefficients are arranged in a zig-zag sequence (in accordance with spatial frequencies), as in the sketch below.
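A sketch of one block of this pipeline (DCT, uniform quantization, zig-zag) follows. The quantization table here is illustrative, not the standard JPEG luminance table, and entropy coding of the zig-zag sequence is omitted:

```python
# JPEG-style coding of one 8x8 block: DCT, quantization, zig-zag ordering.
import numpy as np
from scipy.fft import dctn, idctn

# Zig-zag scan order: coefficients sorted by anti-diagonal (spatial
# frequency), alternating direction along each diagonal as in JPEG.
ZIGZAG = sorted(((u, v) for u in range(8) for v in range(8)),
                key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else p[1]))

# Hypothetical quantization table: larger steps at higher spatial frequencies.
Q = 16 + 8 * (np.arange(8)[:, None] + np.arange(8)[None, :])

def code_block(block):
    """2-D DCT, uniform quantization, zig-zag ordering of one 8x8 block."""
    coeffs = dctn(block - 128.0, norm="ortho")   # level shift, then unitary DCT
    q = np.round(coeffs / Q).astype(int)
    return [q[u, v] for u, v in ZIGZAG]

def decode_block(zz):
    q = np.zeros((8, 8))
    for (u, v), c in zip(ZIGZAG, zz):
        q[u, v] = c
    return idctn(q * Q, norm="ortho") + 128.0    # dequantize, inverse DCT

block = np.tile(np.linspace(0.0, 255.0, 8), (8, 1))  # smooth toy block
# Reconstruction error stays small relative to the 0-255 range.
print(np.abs(decode_block(code_block(block)) - block).max())
```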
Subband Coding
Split a multi-dimensional signal into subbands (wavelet decomposition). Each subband will have different characteristics. E.g. humans are less sensitive to high frequencies, hence high-frequency subbands can be coded with fewer bits.
The high-frequency subbands will have a smaller range compared with the original signal.
Application to Image Compression
Perform filtering to obtain 4 images.
Code the individual images using an entropy code or other methods, such as DPCM or VQ.
Reconstruct the image at the other end of the transmission.
More than one level of image decomposition can be performed, e.g. further decompose each of the 4 images to get 16 images, then code the individual images; or decompose the LP-LP (low-pass) version only.
This forms an image "pyramid", or multi-resolution representation; see the sketch below.
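A one-level decomposition can be sketched with the Haar wavelet (an assumption; the slides do not name a specific filter bank):

```python
# One level of 2-D Haar subband decomposition: LL, LH, HL, HH subbands.
# Repeating the split on LL yields the multi-resolution pyramid.
import numpy as np

def haar_split(img):
    """One level of 2-D Haar analysis on an even-sized grayscale image."""
    a = img.astype(float)
    lo = (a[:, 0::2] + a[:, 1::2]) / 2        # horizontal low-pass
    hi = (a[:, 0::2] - a[:, 1::2]) / 2        # horizontal high-pass
    ll = (lo[0::2, :] + lo[1::2, :]) / 2      # vertical low-pass of lo
    lh = (lo[0::2, :] - lo[1::2, :]) / 2
    hl = (hi[0::2, :] + hi[1::2, :]) / 2
    hh = (hi[0::2, :] - hi[1::2, :]) / 2
    return ll, lh, hl, hh

img = np.add.outer(np.arange(8.0), np.arange(8.0)) * 16   # smooth toy gradient
ll, lh, hl, hh = haar_split(img)
# The high-frequency subband has a far smaller range than the original,
# so it can be coded with fewer bits.
print(np.ptp(img), np.ptp(hh))
```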
Figure: the original Lena image and reconstructions at Threshold = 8 (12,378 bases included, compression ratio 5:1), Threshold = 4 (6,160 bases, 10:1), and Threshold = 2 (2,383 bases, 27:1).
Vector Quantization Coding
Vector quantization (VQ): mapping a sequence of continuous or discrete vectors into a digital sequence for coding.
According to Shannon, better performance can always be achieved by coding vectors instead of scalars.
VQ consists of 2 mappings: an encoder γ, which assigns to each input vector x = (x0, x1, …, xk-1) a channel symbol γ(x) ∈ M, the channel symbol set; and a decoder β, which assigns to each channel symbol ν ∈ M a reproduction vector x̂. The elements in M can be coded in the normal way (e.g. Huffman coding).
Measure of distortion used: the squared-error distortion
$$d(x, x') = \sum_{i=0}^{k-1} (x_i - x'_i)^2$$
which is the (squared) k-dimensional Euclidean distance. It is suitable for image coding, where quality is essentially measured by the squared error (or SNR, signal-to-noise ratio).
The collection of possible reproduction vectors C = {all y : y = β(ν) for some ν ∈ M} is called the reproduction codebook, and its members y are called codewords.
[Remark: in image compression, the size of the codebook also needs to be taken into account.]
For a memoryless VQ, the best encoder is a nearest-neighbour mapping: γ(x) = ν such that d[x, β(ν)] is minimum. The encoder γ can thus be thought of as a partition of the input space into cells, where all input vectors yielding a common reproduction are grouped together – clustering.
Partition according to the minimum-distortion rule: the Voronoi partition.
Properties: given an encoder γ, no decoder can do better than the one which assigns to each channel symbol ν the generalized centroid of all source vectors encoded into ν.
Codewords in 2-dimensional space. Input vectors are marked with an x, codewords are marked with red circles, and the Voronoi regions are separated with boundary lines.
Training Algorithm for the Codebook
1. Determine the number of codewords, N, i.e. the size of the codebook.
2. Select N codewords at random, and let that be the initial codebook. The initial codewords can be randomly chosen from the set of input vectors.
3. Using the Euclidean distance measure, cluster the vectors around each codeword: take each input vector, find its Euclidean distance to each codeword, and assign it to the cluster of the codeword that yields the minimum distance.
4. Compute the new set of codewords by taking the average of each cluster: add the components of each vector in the cluster and divide by the number of vectors in it.
5. Repeat steps 3 and 4 until either the codewords do not change or the change in the codewords is small.
A sketch of this loop appears below.
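A sketch of this training loop (essentially the LBG / k-means algorithm) with NumPy; vectors are 2-D for easy visualization:

```python
# Codebook training by iterated nearest-neighbour clustering and centroids.
import numpy as np

def train_codebook(vectors, n_codewords, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    # Steps 1-2: pick N initial codewords at random from the input vectors.
    codebook = vectors[rng.choice(len(vectors), n_codewords, replace=False)]
    for _ in range(iters):
        # Step 3: nearest-neighbour clustering by Euclidean distance.
        d = np.linalg.norm(vectors[:, None, :] - codebook[None, :, :], axis=2)
        nearest = d.argmin(axis=1)
        # Step 4: replace each codeword by the centroid of its cluster
        # (an empty cluster keeps its old codeword).
        new = np.array([vectors[nearest == i].mean(axis=0)
                        if np.any(nearest == i) else codebook[i]
                        for i in range(n_codewords)])
        if np.allclose(new, codebook):   # step 5: stop when codewords settle
            break
        codebook = new
    return codebook

vecs = np.random.default_rng(1).normal(size=(500, 2))
print(train_codebook(vecs, 4))
```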