Digital Image Processing Lecture 22: Image Compression Prof. Charlene Tsai *Section 8.4 in Gonzalez.


Starting with Information Theory

Data compression: the process of reducing the amount of data required to represent a given quantity of information.

Data ≠ information

Data convey the information; various amounts of data can be used to represent the same amount of information.

E.g. storytelling (Gonzalez pg 411)

Data redundancy

Our focus will be coding redundancy.

Coding Redundancy

Again, we’re back to the gray-level histogram for data (code) reduction.

Let r_k be a gray level with occurrence probability p_r(r_k).

If l(r_k) is the number of bits used to represent r_k, the average number of bits for each pixel is

$$L_{avg} = \sum_{k=0}^{L-1} l(r_k)\, p_r(r_k)$$

where L is the number of gray levels.
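The average code length can be checked numerically. The sketch below is only an illustration: it assumes the eight gray-level probabilities and code lengths listed in the Huffman example table of this lecture, and a 3-bit fixed-length code for comparison. The function and variable names are made up for this example.

```python
# Gray-level probabilities p_r(r_k), taken from the Huffman example table.
PROBS = [0.19, 0.25, 0.21, 0.16, 0.08, 0.06, 0.03, 0.02]

# Code 1: fixed-length, 3 bits for each of the 8 gray levels.
FIXED_LENGTHS = [3] * 8

# Code 2: variable-length, lengths of the codewords in the Huffman table.
VARIABLE_LENGTHS = [2, 2, 2, 3, 4, 5, 6, 6]


def average_code_length(probs, lengths):
    """Return L_avg = sum over k of l(r_k) * p_r(r_k)."""
    if len(probs) != len(lengths):
        raise ValueError("probs and lengths must have the same size")
    return sum(p * l for p, l in zip(probs, lengths))


if __name__ == "__main__":
    print(f"code 1: {average_code_length(PROBS, FIXED_LENGTHS):.2f} bits/pixel")
    print(f"code 2: {average_code_length(PROBS, VARIABLE_LENGTHS):.2f} bits/pixel")
```

Short codewords on the most probable gray levels are what pull the average below 3 bits/pixel.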

Example on Variable-Length Coding

Average for code 1 is 3 bits/pixel, and for code 2 is 2.7 bits/pixel.

Compression ratio is 1.11 (3/2.7), and the level of reduction is

$$R_D = 1 - \frac{1}{1.11} = 0.099$$
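The arithmetic on this slide can be reproduced in a few lines. This is a minimal sketch with hypothetical function names. Note that the slide rounds the ratio to 1.11 before computing the reduction, which gives 0.099; with the unrounded ratio 3/2.7 the reduction is 0.1.

```python
def compression_ratio(bits_before, bits_after):
    """C_R: average bits/pixel before coding divided by bits/pixel after."""
    return bits_before / bits_after


def relative_redundancy(c_r):
    """R_D = 1 - 1/C_R: fraction of the original data that is redundant."""
    return 1.0 - 1.0 / c_r


if __name__ == "__main__":
    c_r = compression_ratio(3.0, 2.7)
    print(f"C_R = {c_r:.2f}")
    print(f"R_D (unrounded C_R) = {relative_redundancy(c_r):.3f}")
    print(f"R_D (C_R rounded to 1.11) = {relative_redundancy(1.11):.3f}")
```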

Information Theory

Information theory provides the mathematical framework for data compression.

Generation of information is modeled as a probabilistic process.

A random event E that occurs with probability p(E) contains

$$I(E) = \log \frac{1}{p(E)} = -\log p(E)$$

units of information (self-information).

Some Intuition

I(E) is inversely related to p(E).

If p(E) = 1, then I(E) = 0: no uncertainty is associated with the event, so no information is transferred by E.

Take the letters “a” and “q” as an example: p(“a”) is high, so I(“a”) is low; p(“q”) is low, so I(“q”) is high.

The base of the logarithm determines the unit used to measure the information.

Base 2 gives information in bits.
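A small sketch of self-information makes the intuition concrete. The letter probabilities below are illustrative assumptions only (roughly "common letter" versus "rare letter"), not values from the lecture, and the function name is made up for this example.

```python
import math


def self_information(p, base=2):
    """I(E) = log(1 / p(E)); base 2 gives the result in bits."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    return math.log(1.0 / p, base)


# Assumed, illustrative probabilities for a common and a rare letter.
P_A = 0.08
P_Q = 0.001

if __name__ == "__main__":
    print(f"certain event: {self_information(1.0):.2f} bits")
    print(f'I("a") = {self_information(P_A):.2f} bits')
    print(f'I("q") = {self_information(P_Q):.2f} bits')
```

The rarer the event, the more bits of information it carries; a certain event carries none.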

Entropy

Measure of the amount of information

Formal definition: entropy H of an image is the theoretical minimum number of bits/pixel required to encode the image without loss of information:

$$H = -\sum_{i=0}^{L-1} p_i \log_2 p_i$$

where i is a gray level of the image, and p_i is the probability of gray level i occurring in the image.

No matter what coding scheme is used to code the pixels one at a time, it will never use fewer than H bits per pixel on average.
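As a hedged illustration, the entropy formula can be evaluated for the eight gray-level probabilities used in the Huffman example table of this lecture. The function name is hypothetical; terms with zero probability are skipped because they contribute nothing to the sum.

```python
import math

# Gray-level probabilities from the Huffman example table.
PROBS = [0.19, 0.25, 0.21, 0.16, 0.08, 0.06, 0.03, 0.02]


def entropy(probs):
    """H = -sum of p_i * log2(p_i), in bits per pixel."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


if __name__ == "__main__":
    print(f"H = {entropy(PROBS):.3f} bits/pixel")
    # A uniform histogram over 8 levels needs the full 3 bits/pixel.
    print(f"H (uniform, 8 levels) = {entropy([1 / 8] * 8):.3f} bits/pixel")
```

For this histogram the result is about 2.65 bits/pixel, which is the lower bound the variable-length codes in this lecture are compared against.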

Variable-Length Coding

Lossless compression

Instead of a fixed-length code, we use a variable-length code: shorter codes for more probable gray values.

Two methods:

Huffman coding

Arithmetic coding

We’ll go through the first method.

Huffman Coding

The most popular technique for removing coding redundancy

Steps:

Determine the probability of each gray value in the image.

Form a binary tree by adding probabilities two at a time, always taking the 2 lowest available values.

Now assign 0 and 1 arbitrarily to each branch of the tree from the apex.

Read the codes from the top down.
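The steps above can be sketched with a priority queue. This is a minimal illustration, not the lecture's own implementation: the function name is made up, the input is assumed to contain at least two symbols, and the probabilities are those of the Huffman example table in this lecture. Because 0/1 assignment and tie-breaking are arbitrary, the codewords produced may differ from the table, but the average length is the same.

```python
import heapq
from itertools import count

# Gray value -> probability, from the Huffman example table.
PROBS = {0: 0.19, 1: 0.25, 2: 0.21, 3: 0.16,
         4: 0.08, 5: 0.06, 6: 0.03, 7: 0.02}


def huffman_code(probs):
    """Return {symbol: codeword} for a dict {symbol: probability}."""
    tiebreak = count()  # unique counter so heap items never compare dicts
    # Each heap item: (probability, tiebreaker, {symbol: partial codeword}).
    heap = [(p, next(tiebreak), {sym: ""}) for sym, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Always take the two lowest available probabilities.
        p0, _, group0 = heapq.heappop(heap)
        p1, _, group1 = heapq.heappop(heap)
        # Prepend one bit per merge: codewords are read from the apex down.
        merged = {s: "0" + c for s, c in group0.items()}
        merged.update({s: "1" + c for s, c in group1.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]


if __name__ == "__main__":
    code = huffman_code(PROBS)
    for sym in sorted(code):
        print(sym, PROBS[sym], code[sym])
    avg = sum(PROBS[s] * len(c) for s, c in code.items())
    print(f"average length = {avg:.2f} bits/pixel")
```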

Example

The average number of bits per pixel is 2.7, much better than the original 3.

Theoretical minimum (entropy) is about 2.65 bits/pixel (2.7 to one decimal place).

How to decode the string 11011101111100111110?

Huffman codes are uniquely decodable.

Gray value (probability)    Huffman code
0 (0.19)                    00
1 (0.25)                    10
2 (0.21)                    01
3 (0.16)                    110
4 (0.08)                    1110
5 (0.06)                    11110
6 (0.03)                    111110
7 (0.02)                    111111
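Decoding works because no codeword in the table is a prefix of another, so bits can be consumed left to right until they match a codeword. The sketch below, with a hypothetical function name, applies this to the string from the example.

```python
# Gray value -> codeword, copied from the example table.
HUFFMAN_TABLE = {0: "00", 1: "10", 2: "01", 3: "110",
                 4: "1110", 5: "11110", 6: "111110", 7: "111111"}


def huffman_decode(bits, table):
    """Decode a bit string by matching codewords from left to right."""
    inverse = {codeword: symbol for symbol, codeword in table.items()}
    decoded, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:  # a complete codeword has been read
            decoded.append(inverse[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a complete codeword")
    return decoded


if __name__ == "__main__":
    print(huffman_decode("11011101111100111110", HUFFMAN_TABLE))
    # -> [3, 4, 6, 2, 5]
```

The string splits as 110 | 1110 | 111110 | 01 | 11110, and no other split is possible.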

LZW (Lempel-Ziv-Welch) Coding

Lossless compression

Compression scheme for GIF, TIFF and PDF

For 8-bit grayscale images, the first 256 words of the dictionary are assigned to grayscales 0, 1, …, 255.

As the encoder scans the image, the grayscale sequences not in the dictionary are placed in the next available location.

The encoded output consists of the dictionary locations of the recognized sequences.

Example

Consider the 4x4, 8-bit image of a vertical edge:

39  39  126  126
39  39  126  126
39  39  126  126
39  39  126  126

A 512-word dictionary starts with the content:

Dictionary location    Entry
0                      0
1                      1
…                      …
255                    255
256                    ----
…                      …
511                    ----

To decode, read the 3rd column from top to bottom.
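The LZW procedure can be traced on the 4x4 vertical-edge image with a short sketch. This is an illustration under assumptions: the image is scanned row by row, the function names are made up, and the 512-word dictionary limit is not enforced (the example needs only locations 256 to 264).

```python
def lzw_encode(pixels, num_levels=256):
    """Return the list of dictionary locations output by LZW."""
    dictionary = {(i,): i for i in range(num_levels)}  # sequence -> location
    next_code = num_levels
    current, codes = (), []
    for pixel in pixels:
        candidate = current + (pixel,)
        if candidate in dictionary:
            current = candidate  # keep growing the recognized sequence
        else:
            codes.append(dictionary[current])
            dictionary[candidate] = next_code  # next available location
            next_code += 1
            current = (pixel,)
    if current:
        codes.append(dictionary[current])
    return codes


def lzw_decode(codes, num_levels=256):
    """Rebuild the pixel sequence, regenerating the dictionary on the fly."""
    dictionary = {i: (i,) for i in range(num_levels)}  # location -> sequence
    next_code = num_levels
    previous = dictionary[codes[0]]
    pixels = list(previous)
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        elif code == next_code:  # code refers to the entry being built
            entry = previous + (previous[0],)
        else:
            raise ValueError(f"invalid LZW code {code}")
        pixels.extend(entry)
        dictionary[next_code] = previous + (entry[0],)
        next_code += 1
        previous = entry
    return pixels


if __name__ == "__main__":
    image = [39, 39, 126, 126] * 4  # the 4x4 image, row by row
    encoded = lzw_encode(image)
    print(encoded)
    print(lzw_decode(encoded) == image)
```

The 16 pixels are coded as 10 dictionary locations, and the decoder needs no copy of the dictionary because it rebuilds the same entries in the same order.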

Run-Length Encoding (1D)

Lossless compression

Encodes strings of 0s and 1s by the number of repetitions in each string.

A standard in fax transmission

There are many versions of RLE.

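(cont’d)

Consider the binary image:

0 1 1 0 0 0
0 0 1 1 1 0
1 1 1 0 0 1
0 1 1 1 1 0
0 0 0 1 1 1
1 0 0 0 1 1

Method 1 (for each row, the lengths of the alternating runs of 0s and 1s, starting with the 0s): (123)(231)(0321)(141)(33)(0132)

Method 2 (for each row, the starting position and length of each run of 1s): (22)(33)(1361)(24)(43)(1152)

For a grayscale image, break up the image first into the bit planes.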
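Both methods can be reproduced on the 6x6 binary image of this slide. The sketch below is an interpretation inferred from the codes shown (Method 1 as alternating run lengths starting with the 0s, Method 2 as start position and length of each run of 1s, with columns counted from 1). The function names are hypothetical.

```python
IMAGE = [
    [0, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 1, 0],
    [1, 1, 1, 0, 0, 1],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 1, 1, 1],
    [1, 0, 0, 0, 1, 1],
]


def rle_alternating(row):
    """Method 1: lengths of alternating runs, starting with the run of 0s."""
    runs, current, length = [], 0, 0
    for bit in row:
        if bit == current:
            length += 1
        else:
            runs.append(length)  # is 0 when the row starts with a 1
            current, length = bit, 1
    runs.append(length)
    return runs


def rle_positions(row):
    """Method 2: start position (from 1) and length of each run of 1s."""
    runs, col = [], 0
    while col < len(row):
        if row[col] == 1:
            start = col
            while col < len(row) and row[col] == 1:
                col += 1
            runs.extend([start + 1, col - start])
        else:
            col += 1
    return runs


def encode_image(image, method):
    """Format the per-row codes in the slide's notation, e.g. (123)(231)."""
    return "".join("(" + "".join(map(str, method(row))) + ")" for row in image)


if __name__ == "__main__":
    print(encode_image(IMAGE, rle_alternating))
    print(encode_image(IMAGE, rle_positions))
```

The digit-concatenated notation is only unambiguous here because every run length and position is a single digit; a practical coder would store the numbers separately.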

Problem with grayscale RLE

Long runs of very similar gray values should compress very well with bit-plane RLE.

This is not the case for a 4-bit image consisting of randomly distributed 7s and 8s: in binary, 7 and 8 differ in every bit, so every bit plane is random.

One solution is to use Gray codes.

Example on pg 400

For a 4-bit image:

Binary encoding: 8 is 1000, and 7 is 0111

Gray code encoding: 8 is 1100, and 7 is 0100

Bit planes are:

0th, 1st, and 2nd binary bit planes:

0 0 1 0
0 1 0 1
1 1 0 1
1 0 1 1

3rd binary bit plane:

1 1 0 1
1 0 1 0
0 0 1 0
0 1 0 0

0th and 1st Gray code bit planes (replace 0 by 1 for the 2nd plane):

0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0

3rd Gray code bit plane:

1 1 0 1
1 0 1 0
0 0 1 0
0 1 0 0

Gray code planes 0 to 2 are constant, so their pixels are highly correlated; the planes that follow the random pattern of 7s and 8s are uncorrelated.
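The bit planes can be regenerated with the standard binary-to-Gray conversion g = b XOR (b >> 1). In this sketch the 4x4 image of 7s and 8s is inferred from the bit planes shown on the slide (an 8 wherever the 3rd bit plane is 1), and the function names are made up for the example.

```python
# 4-bit image inferred from the 3rd bit plane: 8 where the plane is 1, else 7.
IMAGE = [
    [8, 8, 7, 8],
    [8, 7, 8, 7],
    [7, 7, 8, 7],
    [7, 8, 7, 7],
]


def binary_to_gray(value):
    """Convert a binary-coded integer to its Gray code."""
    return value ^ (value >> 1)


def bit_plane(image, k):
    """Extract bit k (0 = least significant) of every pixel."""
    return [[(pixel >> k) & 1 for pixel in row] for row in image]


if __name__ == "__main__":
    print(f"8 -> {binary_to_gray(8):04b}, 7 -> {binary_to_gray(7):04b}")
    gray_image = [[binary_to_gray(p) for p in row] for row in IMAGE]
    for k in range(4):
        print(f"binary plane {k}: {bit_plane(IMAGE, k)}")
        print(f"Gray   plane {k}: {bit_plane(gray_image, k)}")
```

With the Gray code, three of the four planes become constant and run-length encode to almost nothing; only the 3rd plane still carries the random pattern.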

Summary

Information theory

Measure of entropy, which is the theoretical minimum number of bits per pixel

Lossless compression schemes:

Huffman coding

LZW

Run-length encoding
