Chapter 8
Image Compression
Commonly Used Formats:
Portable Bitmap family (BMP, Lena 66,616 B)
Graphics Interchange Format (GIF, Lena 70,458 B)
Tagged Image File Format (TIFF/TIF, Lena 88,508 B)
JPEG format (JFIF/JFI/JPG, Lena 12,377 B)
Image Data Compression
The large amount of data per image has three implications: storage, processing, and communications.
For a digital image, 512 × 512 × 3 bytes/frame = 768 KB/frame.
At a frame rate of half the line frequency (e.g. 25 Hz), the total data rate = 768 KB × 25/s = 19.2 MB/s.
Image Data Compression
The goal of image compression is to reduce the amount of data required to represent a digital image.
Compression can be lossless or lossy. The strategy is to remove redundant data from the image. The compressed data should allow the original image to be reconstructed or approximated.
Image Data Compression
Entropy Coding
Pixel Coding
Entropy Coding
Consider a source with L possible independent symbols with probabilities $p_i$, $i = 0, \ldots, L-1$. The entropy is defined as
$$H = -\sum_{i=0}^{L-1} p_i \log_2 p_i$$
Shannon's noiseless coding theorem: it is possible to code, without distortion, a source of entropy $H$ using $H + \varepsilon$ bits/symbol, where $\varepsilon$ can be arbitrarily small.
Ideal compression ratio (without distortion):
$$C = \frac{B}{H}$$
where $B$ is the number of bits/symbol before compression.
Example: for 256 gray levels with equal probability, $p_i = \frac{1}{256}$, and
$$H = -\sum_{i=0}^{255} \frac{1}{256} \log_2 \frac{1}{256} = \log_2 256 = 8 \text{ bits}$$
i.e. no compression ($C = 8/8 = 1$) if the gray levels are all random.
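As a quick sketch, the entropy and ideal compression ratio can be computed directly from the probabilities. The snippet below (Python, chosen here since the chapter specifies no language) checks the equiprobable case, plus a skewed source using the vowel probabilities of the Huffman example further below:

```python
# A sketch computing the entropy H and the ideal compression ratio C = B/H.
import math

def entropy(probs):
    """H = -sum(p * log2 p) over symbols with nonzero probability."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# 256 equiprobable gray levels: H = 8 bits, so C = 8/8 = 1 (no compression).
print(entropy([1 / 256] * 256))   # 8.0

# A skewed source compresses: the vowel probabilities A, E, I, O, U used in
# the Huffman example below.
H = entropy([0.12, 0.42, 0.09, 0.30, 0.07])
B = 3   # bits/symbol for a fixed-length code over 5 symbols
print(H, B / H)   # H ~= 2.0 bits, ideal ratio ~= 1.5
```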
If the $p_i$ are different, then use a variable-length code – Huffman coding.

Huffman Coding
Arrange the symbol probabilities in decreasing order and consider them as leaf nodes of a tree.
While there is more than one node: merge the two nodes with the smallest probabilities to form a new node, whose probability is the sum of the two merged nodes. Arbitrarily assign 1 and 0 to each pair of branches merging into a node.
Repeat the above until there is only one node left, with a probability of 1.
The resultant code is obtained by reading sequentially from the root node to the leaf node where the symbol is located.
Note: There may be a choice between two symbols with the same probability. If this is the case, either symbol can be chosen. The final tree and codes will be different, but the overall efficiency of the code will be the same.
Notice that each string of 0's and 1's can be uniquely decoded.
Coding and decoding – by table lookup.
Encode the letters A (0.12), E (0.42), I (0.09), O (0.30), U (0.07)
Thus the codes for each letter are: A – 100, E – 0, I – 1011, O – 11, U – 1010.
Now, using this code, any string of vowels can be written uniquely:
AI = 1001011, EIEIO = 010110101111, UEA = 10100100;
and decoded: 10110 = IE, 100101011 = AUO, 0101111111010 = EIOOU.
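The tree-merging procedure above is easy to sketch in code. The following is a minimal illustration, not a production coder; tie-breaking between equal probabilities is arbitrary, so the exact 0/1 labels may differ from the table above while the code lengths stay the same:

```python
# A minimal Huffman-coding sketch reproducing the vowel example above.
import heapq
from itertools import count

def huffman_codes(probs):
    """Build a Huffman code table from {symbol: probability}."""
    tiebreak = count()  # avoids comparing dicts when probabilities are equal
    heap = [(p, next(tiebreak), {s: ""}) for s, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)  # the two smallest-probability nodes
        p2, _, c2 = heapq.heappop(heap)
        # Prefix 1 onto one branch and 0 onto the other, then merge the nodes.
        merged = {s: "1" + code for s, code in c1.items()}
        merged.update({s: "0" + code for s, code in c2.items()})
        heapq.heappush(heap, (p1 + p2, next(tiebreak), merged))
    return heap[0][2]

codes = huffman_codes({"A": 0.12, "E": 0.42, "I": 0.09, "O": 0.30, "U": 0.07})
print(codes)  # e.g. {'E': '0', 'O': '11', 'A': '100', 'U': '1010', 'I': '1011'}
```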
Run‑Coded Binary Images
Run‑coding is an efficient coding scheme for binary or labeled images: not only does it reduce memory space, but it can also speed up image operations.
Example:
Image row r: 0000000011111000000000000111000000011111111100000
Run-code: 8(0)5(1)12(0)3(1)7(0)9(1)5(0)
Run‑coding is often used for compression within standard file formats.
Run-Length Coding (RLC)
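A minimal sketch of the run coder, reproducing the run-code of the example row above:

```python
# Run-length coding of a binary row, matching the example above.
from itertools import groupby

def run_encode(row):
    """Return a list of (length, value) runs for a string of '0'/'1' pixels."""
    return [(len(list(g)), v) for v, g in groupby(row)]

def run_decode(runs):
    return "".join(v * n for n, v in runs)

row = "0000000011111000000000000111000000011111111100000"
runs = run_encode(row)
print("".join(f"{n}({v})" for n, v in runs))  # 8(0)5(1)12(0)3(1)7(0)9(1)5(0)
assert run_decode(runs) == row                # lossless round trip
```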
Arithmetic Coding
Arithmetic coding is a lossless coding technique. It typically achieves a better compression ratio than Huffman coding, as it produces a single codeword for the entire message rather than several separate codewords.
A message is represented by an interval of real numbers between 0 and 1.
Successive symbols of the message reduce the size of the interval in accordance with the symbol probabilities generated by the model.
Start with the interval [0, 1), divided into subintervals for all possible symbols that may appear within a message. Make the size of each subinterval proportional to the frequency at which the symbol appears in the message. E.g.:
Symbol Probability Interval
a 0.2 [0.0, 0.2)
b 0.3 [0.2, 0.5)
c 0.1 [0.5, 0.6)
d 0.4 [0.6, 1.0)
When encoding a symbol, "zoom" into the current interval and divide it into subintervals as in step one, scaled to the new range. Example: suppose we want to encode "addc". We "zoom" into the interval corresponding to "a" and divide that interval into smaller subintervals as before. This new interval becomes the basis of the next symbol's encoding step.
Symbol New "a" Interval
aa [0.0, 0.04)
ab [0.04, 0.1)
ac [0.1, 0.102)
ad [0.102, 0.2)
Repeat the process until all symbols are encoded or the maximum precision of the machine is reached. To encode the next character "d", we use the "a" interval created before, zoom into its subinterval for "d", and use that for the next step. This produces:
Symbol New “ad" Interval
ada [0.102, 0.1216)
adb [0.1216, 0.151)
adc [0.151, 0.1608)
add [0.1608, 0.2)
And lastly, the final result is:
SymbolNew “add" Interval
adda [0.1608, 0.16864)
addb [0.16864, 0.1804)
addc [0.1804, 0.18432)
addd [0.18432, 0.2)
Transmit some number within the final interval as the codeword. The number of symbols encoded is stated in the protocol of the image format, so any number within [0.184, 0.1872) is acceptable for "addc". Finally, the shortest binary fraction that lies within [0.184, 0.1872) is chosen as the codeword.
To decode the message, a similar algorithm is followed, except that the final number is given, and the symbols are decoded sequentially from that.
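The interval-narrowing loop can be sketched as follows. Plain floating point stands in for the arbitrary-precision arithmetic a real coder would use, and the model is the a/b/c/d table above:

```python
# An encoder sketch for the interval-narrowing step described above.
MODEL = [("a", 0.0, 0.2), ("b", 0.2, 0.5), ("c", 0.5, 0.6), ("d", 0.6, 1.0)]

def encode(message):
    low, high = 0.0, 1.0
    for sym in message:
        width = high - low
        for s, lo, hi in MODEL:
            if s == sym:
                # Zoom into the subinterval belonging to this symbol.
                low, high = low + width * lo, low + width * hi
                break
    return low, high

print(encode("addc"))  # ~ (0.184, 0.1872): any number in this interval works
```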
Bit-plane Encoding
E.g. a 256-gray-level image can be considered as eight 1-bit planes.
Each 1-bit plane is coded by RLC. Usually, the compression ratio is ~1.5–2.
Disadvantage: sensitive to noise in transmission.
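A short sketch of the bit-plane decomposition with NumPy; reassembling the eight planes recovers the image exactly:

```python
# Bit-plane decomposition: an 8-bit image becomes eight binary planes,
# each of which could then be run-length coded.
import numpy as np

img = np.random.randint(0, 256, size=(4, 4), dtype=np.uint8)  # toy "image"
planes = [(img >> b) & 1 for b in range(8)]  # plane 0 = LSB, plane 7 = MSB

# Reassembling the planes recovers the image exactly (lossless).
rebuilt = sum(p.astype(np.uint8) << b for b, p in enumerate(planes))
assert np.array_equal(rebuilt, img)
```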
Predictive Coding
Predictive Techniques
Remove mutual redundancy between successive pixels and encode only the new information.
A quantity $\bar u(n)$, an estimate of $u(n)$, is predicted from the previously decoded samples:
$$\bar u(n) = \phi\big(u^*(n-1),\, u^*(n-2),\, \ldots\big)$$
where $\phi$ denotes the prediction rule.
Given the prediction rule, we need only code the error (difference)
$$e(n) = u(n) - \bar u(n)$$
and $e^*(n)$ is the quantized value of $e(n)$. The reconstructed sample is
$$u^*(n) = \bar u(n) + e^*(n)$$
Differential Pulse Code Modulation (DPCM)
Example:
The sequence 100, 102, 120, 120, 118, 116 is to be predictively coded using the prediction rule $\bar u(n) = u^*(n-1)$ for DPCM, and $\bar u(n) = u(n-1)$ for a feedforward predictive coder. Assume a 2-bit quantizer as shown below.
Given $u^*(0) = u(0) = 100$, we can obtain the following table:
We can see that the reconstruction error builds up with the feedforward system, while the error stabilizes with DPCM.
Note that if the input sequence is integer, and the predicted output sequence is made to be integer, then the error is integer and can be coded for perfect reconstruction.
The advantage of the quantizer is that the error sequence is distributed over a much smaller range, and hence can be coded with fewer bits.
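The example can be simulated with a few lines of code. The 4-level quantizer below is an assumption standing in for the slide's 2-bit quantizer table, which is not reproduced here; the qualitative behaviour (drift vs. stabilization) holds regardless:

```python
# DPCM vs. feedforward prediction on 100, 102, 120, 120, 118, 116.
def quantize(e):
    """Toy 2-bit quantizer: map the error to one of 4 reconstruction levels."""
    levels = [-12, -2, 2, 12]
    return min(levels, key=lambda q: abs(q - e))

seq = [100, 102, 120, 120, 118, 116]

# DPCM: predict from the previously *reconstructed* sample u*(n-1).
rec = [seq[0]]
for n in range(1, len(seq)):
    rec.append(rec[-1] + quantize(seq[n] - rec[-1]))

# Feedforward: predict from the previous *input* u(n-1); the decoder only
# has reconstructed values, so quantization errors accumulate.
ff = [seq[0]]
for n in range(1, len(seq)):
    ff.append(ff[-1] + quantize(seq[n] - seq[n - 1]))

print(rec)  # [100, 102, 114, 116, 118, 116] - tracks the input
print(ff)   # [100, 102, 114, 112, 110, 108] - drifts away from it
```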
Delta Modulation (DM)
Simplest form: $\bar u(n) = u^*(n-1)$ and a one-bit quantizer.
Problems:
slope overload [increase sampling rate]
granularity noise [use tri-state DM]
instability to transmission error [use leak, attenuating the predictor output by a factor < 1]
A 2-D image can be coded line by line. Each scan line is coded independently by DPCM, using a 1-D model:
$$\bar u(n) = \sum_{k \ge 1} a_k\, u(n-k)$$
Perform quantization on the error sequence.
2-D DPCM
2-D prediction model:
$$\bar u(m,n) = \sum_{(k,l) \in W} a_{k,l}\, u(m-k, n-l)$$
where W is a prediction window. In practice, only the nearest neighbours are used:
$$\bar u(m,n) = a_1 u(m-1,n) + a_2 u(m,n-1) + a_3 u(m-1,n-1) + a_4 u(m-1,n+1)$$
The coefficients can be obtained by the least-squares method: with the prediction error
$$e_{m,n} = u(m,n) - a_1 u(m-1,n) - a_2 u(m,n-1) - a_3 u(m-1,n-1) - a_4 u(m-1,n+1)$$
setting the derivatives of $E\big[e_{m,n}^2\big]$ with respect to each $a_i$ to zero yields a set of linear (normal) equations in the autocorrelation values of the image, which are solved for $a_1, \ldots, a_4$.
Transform Coding
Fourier Transform (FFT) and Inverse Fourier Transform (IFFT)
Fourier Low-Pass Filtering
Transform Coding
Block quantization: a block of data is unitarily transformed so that a large portion of its energy is packed into relatively few transform coefficients.
It can be shown that the K-L transform is the optimal choice: it minimizes the mean square distortion of the reproduced data for a given number of bits.
There is no fast algorithm for the K-L transform, so the DCT is usually used.
The 2-D DCT of an image f(x,y) is C(u,v), u,v = 0, 1, 2, …, N-1:
$$C(u,v) = \alpha(u)\,\alpha(v) \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x,y)\, \cos\!\left[\frac{(2x+1)u\pi}{2N}\right] \cos\!\left[\frac{(2y+1)v\pi}{2N}\right]$$
where $\alpha(0) = \sqrt{1/N}$ and $\alpha(u) = \sqrt{2/N}$ for $u \neq 0$; in particular, $C(0,0) = \frac{1}{N}\sum_{x=0}^{N-1}\sum_{y=0}^{N-1} f(x,y)$.
By the DCT, the image f(x,y) is decomposed into a series expansion of basis functions, which are used as the features.
DCT-based Coding
Divide the image into small rectangular (square) blocks.
Perform a unitary transformation on each block.
The coefficients are uniformly quantized, but each coefficient with a different step size, specified by a Quantization Table. The coefficients with more energy are allocated more bits.
Entropy encoding: Huffman coding / arithmetic coding (needs a code table).
Furthermore, the DC coefficient (the first coefficient in the DCT, which represents the average gray level) is coded as a difference from the previous block.
The coefficients are arranged in a zig-zag sequence (in accordance with spatial frequencies), as in the sketch below.
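A sketch of one block of this pipeline (DCT, uniform quantization, zig-zag) follows. The quantization table here is illustrative, not the standard JPEG luminance table, and entropy coding of the zig-zag sequence is omitted:

```python
# JPEG-style coding of one 8x8 block: DCT, quantization, zig-zag ordering.
import numpy as np
from scipy.fft import dctn, idctn

# Zig-zag scan order: coefficients sorted by anti-diagonal (spatial
# frequency), alternating direction along each diagonal as in JPEG.
ZIGZAG = sorted(((u, v) for u in range(8) for v in range(8)),
                key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else p[1]))

# Hypothetical quantization table: larger steps at higher spatial frequencies.
Q = 16 + 8 * (np.arange(8)[:, None] + np.arange(8)[None, :])

def code_block(block):
    """2-D DCT, uniform quantization, zig-zag ordering of one 8x8 block."""
    coeffs = dctn(block - 128.0, norm="ortho")   # level shift, then unitary DCT
    q = np.round(coeffs / Q).astype(int)
    return [q[u, v] for u, v in ZIGZAG]

def decode_block(zz):
    q = np.zeros((8, 8))
    for (u, v), c in zip(ZIGZAG, zz):
        q[u, v] = c
    return idctn(q * Q, norm="ortho") + 128.0    # dequantize, inverse DCT

block = np.tile(np.linspace(0.0, 255.0, 8), (8, 1))  # smooth toy block
# Reconstruction error stays small relative to the 0-255 range.
print(np.abs(decode_block(code_block(block)) - block).max())
```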
Subband Coding
Split a multi-dimensional signal into subbands (wavelet decomposition). Each subband will have different characteristics. E.g. humans are less sensitive to high frequencies, hence high-frequency subbands can be coded with fewer bits.
The high-frequency subbands will have a smaller range compared with the original signal.
Application to Image Compression
Perform filtering to obtain 4 images.
Code the individual images using an entropy code or other methods, such as DPCM or VQ.
Reconstruct the image at the other end of the transmission.
More than one level of image decomposition can be performed, e.g. further decompose each of the 4 images to get 16 images, then code the individual images; or decompose the LP-LP (low-pass) version only.
This forms an image "pyramid", or multi-resolution representation; see the sketch below.
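A one-level decomposition can be sketched with the Haar wavelet (an assumption; the slides do not name a specific filter bank):

```python
# One level of 2-D Haar subband decomposition: LL, LH, HL, HH subbands.
# Repeating the split on LL yields the multi-resolution pyramid.
import numpy as np

def haar_split(img):
    """One level of 2-D Haar analysis on an even-sized grayscale image."""
    a = img.astype(float)
    lo = (a[:, 0::2] + a[:, 1::2]) / 2        # horizontal low-pass
    hi = (a[:, 0::2] - a[:, 1::2]) / 2        # horizontal high-pass
    ll = (lo[0::2, :] + lo[1::2, :]) / 2      # vertical low-pass of lo
    lh = (lo[0::2, :] - lo[1::2, :]) / 2
    hl = (hi[0::2, :] + hi[1::2, :]) / 2
    hh = (hi[0::2, :] - hi[1::2, :]) / 2
    return ll, lh, hl, hh

img = np.add.outer(np.arange(8.0), np.arange(8.0)) * 16   # smooth toy gradient
ll, lh, hl, hh = haar_split(img)
# The high-frequency subband has a far smaller range than the original,
# so it can be coded with fewer bits.
print(np.ptp(img), np.ptp(hh))
```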
Figure: the original Lena image and reconstructions at Threshold = 8 (12,378 bases included, compression ratio 5:1), Threshold = 4 (6,160 bases, 10:1), and Threshold = 2 (2,383 bases, 27:1).
Vector Quantization Coding
Vector quantization (VQ): mapping a sequence of continuous or discrete vectors into a digital sequence for coding.
According to Shannon, better performance can always be achieved by coding vectors instead of scalars.
VQ consists of 2 mappings: an encoder γ, which assigns to each input vector x = (x0, x1, …, xk-1) a channel symbol γ(x) ∈ M, the channel symbol set; and a decoder β, which assigns to each channel symbol ν ∈ M a reproduction vector x̂. The elements in M can be coded in the normal way (e.g. Huffman coding).
Measure of distortion used: the squared-error distortion
$$d(x, x') = \sum_{i=0}^{k-1} (x_i - x'_i)^2$$
which is the (squared) k-dimensional Euclidean distance. It is suitable for image coding, where quality is essentially measured by the squared error (or SNR, signal-to-noise ratio).
The collection of possible reproduction vectors C = {all y : y = β(ν) for some ν ∈ M} is called the reproduction codebook, and its members y are called codewords.
[Remark: in image compression, the size of the codebook also needs to be taken into account.]
For a memoryless VQ, the best encoder is a nearest-neighbour mapping: γ(x) = ν such that d[x, β(ν)] is minimum. The encoder γ can thus be thought of as a partition of the input space into cells, where all input vectors yielding a common reproduction are grouped together – clustering.
Partition according to the minimum-distortion rule: the Voronoi partition.
Properties: given an encoder γ, no decoder can do better than the one which assigns to each channel symbol ν the generalized centroid of all source vectors encoded into ν.
Codewords in 2-dimensional space. Input vectors are marked with an x, codewords are marked with red circles, and the Voronoi regions are separated with boundary lines.
Training Algorithm for the Codebook
1. Determine the number of codewords, N, i.e. the size of the codebook.
2. Select N codewords at random, and let that be the initial codebook. The initial codewords can be randomly chosen from the set of input vectors.
3. Using the Euclidean distance measure, cluster the vectors around each codeword: take each input vector, find its Euclidean distance to each codeword, and assign it to the cluster of the codeword that yields the minimum distance.
4. Compute the new set of codewords by taking the average of each cluster: add the components of each vector in the cluster and divide by the number of vectors in it.
5. Repeat steps 3 and 4 until either the codewords do not change or the change in the codewords is small.
A sketch of this loop appears below.
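A sketch of this training loop (essentially the LBG / k-means algorithm) with NumPy; vectors are 2-D for easy visualization:

```python
# Codebook training by iterated nearest-neighbour clustering and centroids.
import numpy as np

def train_codebook(vectors, n_codewords, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    # Steps 1-2: pick N initial codewords at random from the input vectors.
    codebook = vectors[rng.choice(len(vectors), n_codewords, replace=False)]
    for _ in range(iters):
        # Step 3: nearest-neighbour clustering by Euclidean distance.
        d = np.linalg.norm(vectors[:, None, :] - codebook[None, :, :], axis=2)
        nearest = d.argmin(axis=1)
        # Step 4: replace each codeword by the centroid of its cluster
        # (an empty cluster keeps its old codeword).
        new = np.array([vectors[nearest == i].mean(axis=0)
                        if np.any(nearest == i) else codebook[i]
                        for i in range(n_codewords)])
        if np.allclose(new, codebook):   # step 5: stop when codewords settle
            break
        codebook = new
    return codebook

vecs = np.random.default_rng(1).normal(size=(500, 2))
print(train_codebook(vecs, 4))
```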