Efficient use of spectrum Less sensitive to noise & distortions Integration of digital services Data...

Post on 13-Jan-2016


Efficient use of spectrum

Less sensitive to noise &

distortions

Integration of digital services

Data Encryption

Digital Video

Raw image: R, G and B (576 lines; 5.5 MHz = 720 pixels per line)

576 lines/frame
720 pixels/line
50 fields/second (25 frames/second)
8 bits per pixel per colour component
Total: 576 x 720 x 25 x 3 x 8 = 249 Mbit/s

R: 83 Mbit/s  G: 83 Mbit/s  B: 83 Mbit/s  Total: 249 Mbit/s

Figure 2a

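[Plot: channel spectrum versus frequency f, showing the vestigial sideband (VSB) extending to -1.75 MHz, the chroma subcarrier at 4.43 MHz and the sound carrier at 6 MHz]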

Chrominance can be represented with a considerable narrower bandwidth (resolution) than luminance

PAL system: Luminance Y (576 lines; 5.5 MHz = 720 pixels per line)

576 lines/frame
720 pixels/line
50 fields/second (25 frames/second)
8 bits per pixel
Total: 576 x 720 x 25 x 8 = 83 Mbit/s

Y: 83 Mbit/s

Figure 2b

PAL system: Chrominance U (288 lines; 2.75 MHz = 360 pixels per line)

288 lines/frame
360 pixels/line
50 fields/second (25 frames/second)
8 bits per pixel
Total: 288 x 360 x 25 x 8 = 21 Mbit/s

Y: 83 Mbit/s  U: 21 Mbit/s

Figure 2c

PAL system: Chrominance V (288 lines; 2.75 MHz = 360 pixels per line)

288 lines/frame
360 pixels/line
50 fields/second (25 frames/second)
8 bits per pixel
Total: 288 x 360 x 25 x 8 = 21 Mbit/s

Y: 83 Mbit/s  U: 21 Mbit/s  V: 21 Mbit/s  Total: 125 Mbit/s

Figure 2d

Medium Quality : 1.2 Mbit/s

Superior Quality : 6 Mbit/s

Actual size - 249Mbit/s

Result: Compression is necessary

U,V downsampled - 125Mbit/s
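The bit-rate figures above can be checked with a few lines of Python. This is only an illustrative sketch: the function name `bitrate` and its parameter names are made up here, and 25 frames/second is assumed as the frame rate that corresponds to 50 interlaced fields/second.

```python
def bitrate(lines, pixels_per_line, frames_per_second=25, bits_per_sample=8):
    """Raw bit rate (bit/s) of one image component."""
    return lines * pixels_per_line * frames_per_second * bits_per_sample

# Full-resolution component (R, G, B or Y) and downsampled chrominance (U or V)
y = bitrate(576, 720)
u = bitrate(288, 360)
v = bitrate(288, 360)
rgb = 3 * bitrate(576, 720)

for name, rate in [("RGB", rgb), ("Y", y), ("U", u), ("V", v), ("Y+U+V", y + u + v)]:
    print(f"{name}: {rate / 1e6:.1f} Mbit/s")
```

The exact Y+U+V total is about 124.4 Mbit/s; the 125 Mbit/s quoted above comes from adding the rounded values 83 + 21 + 21.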

Redundancy in image contents

Adjacent pixels are similar

Intensity variations can be predicted

Sequential frames are similar

Lossy compression: Removal of redundant information, resulting in distortion to which Human Perception is insensitive

Figure 3a

Lenna

Pixels within this region have similar but not totally identical intensity.

Figure 3b

[Plot: intensity versus position]

[Plot: autocorrelation function versus position offset (-5 to 5); vertical axis from 0.2 to 1.0]

Figure 4

Interpolation

1. Pixel intensities usually vary in a smooth manner except at edge (dominant/salient) points

2. Record pixels at dominant points only.

3. Reconstruct the pixels between dominant points with “Interpolation”.

4. A straightforward method: Joining dominant points with straight lines.

5. High compression ratio for a smoothly varying intensity profile.

6. Difficulty: How to identify dominant points?
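As a minimal sketch of steps 2 to 4, the following Python reconstructs an intensity profile from a handful of dominant points by joining them with straight lines. The dominant points are supplied by hand here (identifying them automatically is the difficulty noted in step 6); the function name `reconstruct` and the sample profile are illustrative assumptions.

```python
def reconstruct(points, length):
    """Linear interpolation between dominant points.

    points: sorted list of (position, intensity); must include the first
    and last positions of the profile.
    """
    out = [0.0] * length
    for (p0, v0), (p1, v1) in zip(points, points[1:]):
        for n in range(p0, p1 + 1):
            t = (n - p0) / (p1 - p0)        # fraction of the way from p0 to p1
            out[n] = v0 + t * (v1 - v0)
    return out

# Hypothetical intensity profile: a slow ramp, a sharp edge, then a flat region
profile = [10, 12, 14, 16, 18, 20, 50, 80, 80, 80]
dominant = [(0, 10), (5, 20), (7, 80), (9, 80)]   # 4 recorded points instead of 10 samples

recon = reconstruct(dominant, len(profile))
print(recon)
```

For this piecewise-linear profile the reconstruction is exact; a real profile would be reproduced only approximately between the dominant points.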

[Plot: intensity versus position]

Figure 5a

Transmit only selected pixels and predict the rest

Prediction of current sample based on previous ones

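[Block diagram: predictive encoder built from a Quantizer (Q) and a Predictor (P), with signals $x(n)$, $x_p(n)$, $e_p(n)$, $e_{pq}(n)$ and $\hat{x}(n)$]

Input signal: $x(n)$

Predicted signal: $x_p(n)$

Error signal: $e_p(n) = x(n) - x_p(n)$

Quantized error signal: $e_{pq}(n)$

Reconstructed signal: $\hat{x}(n) = x_p(n) + e_{pq}(n)$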

Quantizer: representation of a continuous dynamic range with a finite number of discrete levels (will be discussed later)

Error = Quantization error

Function of Predictive Coding: Data Compression

$x(n) = \{0, 41, 82, 119, 163, 198, 241\}$ (8 bits)

$x_p(n) = \{0, 40, 81, 120, 162, 199, 243\}$

Prediction error: $e_p(n) = x(n) - x_p(n) = \{0, 1, 1, -1, 1, -1, -2\}$ (3 bits)

The better the predictor, the higher the compression ratio

A simple example:

$x(n) = \{0, 7, 14, 20, 28, 32, 39, 45\}$ (6 bits)

Predictor: $x_p(n) = \hat{x}(n-1)$

2-bit quantizer (input x, output y):

Level   x          y
0       0 to 2     0
1       3 to 5     3
2       6 to 8     6
3       9 to 11    9

n   x(n)   x_p(n)   e_p(n)   e_pq(n)   x̂(n)
0   0      0        0        0         0
1   7      0        7        6         6
2   14     6        8        6         12
3   20     12       8        6         18
4   28     18       10       9         27

$e_{pq}(n) = \{0, 6, 6, 6, 9, \ldots\}$, corresponding to the quantizer levels $\hat{e}_{pq}(n) = \{0, 2, 2, 2, 3, \ldots\}$

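Predictive Decoder

[Block diagram: the encoder (Quantizer Q, Predictor P) followed by a decoder that adds the received error to the output of its own Predictor (P) to produce $\hat{x}(n)$]

Reconstructed signal: $\hat{x}(n) = x_p(n) + e_{pq}(n)$

Reconstruction error: $x(n) - \hat{x}(n) = [x_p(n) + e_p(n)] - [x_p(n) + e_{pq}(n)] = e_p(n) - e_{pq}(n)$

Quantization error: $e_p(n) - e_{pq}(n)$

Option: the quantized levels $\hat{e}_{pq}(n)$ are transmitted instead of the actual errors, and an inverse quantizer ($Q^{-1}$) converts them back to $e_{pq}(n)$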

Predictive Decoder

2-bit quantizer (input x, output y):

Level   x          y
0       0 to 2     0
1       3 to 5     3
2       6 to 8     6
3       9 to 11    9

$x(n) = \{0, 7, 14, 20, 28, 32, 39, 45\}$ (6 bits)

n   ê_pq(n)   e_pq(n)   x_p(n)   x̂(n)
0   0         0         0        0
1   2         6         0        6
2   2         6         6        12
3   2         6         12       18
4   3         9         18       27

Error = Quantization error
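The encoder and decoder of the worked example can be sketched in Python as follows. This is a sketch under stated assumptions, not the slides' own implementation: the function names are invented, the predictor is $x_p(n) = \hat{x}(n-1)$ with $\hat{x}(-1) = 0$, and prediction errors outside the table's range (below 0 or above 11) are simply clamped to the nearest level, which the slides do not specify.

```python
STEP = 3      # quantizer table: 0-2 -> 0, 3-5 -> 3, 6-8 -> 6, 9-11 -> 9
LEVELS = 4    # 2-bit quantizer

def quantize(error):
    """Return the level index (0..3) for a prediction error (clamping is an assumption)."""
    return max(0, min(LEVELS - 1, error // STEP))

def dequantize(level):
    """Inverse quantizer: level index -> quantized error e_pq."""
    return level * STEP

def encode(x):
    levels, x_hat_prev = [], 0
    for sample in x:
        x_p = x_hat_prev                       # predictor: x_p(n) = x_hat(n-1)
        level = quantize(sample - x_p)         # quantize e_p(n) = x(n) - x_p(n)
        levels.append(level)
        x_hat_prev = x_p + dequantize(level)   # same reconstruction the decoder will make
    return levels

def decode(levels):
    x_hat, prev = [], 0
    for level in levels:
        prev = prev + dequantize(level)        # x_hat(n) = x_p(n) + e_pq(n)
        x_hat.append(prev)
    return x_hat

x = [0, 7, 14, 20, 28, 32, 39, 45]
levels = encode(x)
x_hat = decode(levels)
print(levels)   # 2 bits per sample instead of 6
print(x_hat)
```

The first five levels and reconstructed values agree with the table above; the reconstruction error at every sample is the quantization error of that sample only, because the encoder predicts from the reconstructed signal rather than from the original.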

[Block diagram: predictive encoder and decoder with a Quantizer (Q), inverse quantizers ($Q^{-1}$) and Predictors (P)]

1-bit quantizer:

Level   x          y
0       positive   +S
1       negative   -S

where S = fixed step size

$x(n) = \{0, 7, 14, 20, 28, 32, 39, 45\}$ (6 bits)

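Prediction based on the linear combination of previously reconstructed samples

Current sample: $x_p(n) = \sum_{l=1}^{k} a_l\, \hat{x}(n-l)$

The coefficients sum to one, $\sum_{l=1}^{k} a_l = 1$, so that $\sum_{l=1}^{k} a_l\, x_{\max} = x_{\max}$

Optimal predictor design by minimizing the Mean Square Prediction Error (MSPE)

$$\mathrm{MSPE} = E\{e_p^2(n)\} = E\{[x(n) - x_p(n)]^2\} = E\Big\{\Big[x(n) - \sum_{l=1}^{k} a_l\, x(n-l)\Big]^2\Big\}$$

(in the design, the original samples $x(n-l)$ stand in for the reconstructed samples)

$$\min \mathrm{MSPE}: \quad \frac{\partial E\{e_p^2(n)\}}{\partial a_i} = 0, \quad i = 1, \ldots, k$$

$$\tilde{a} = \tilde{R}^{-1}\, \tilde{r}$$

$$\tilde{a} = [a_1, a_2, \ldots, a_k]^T$$

$$\tilde{r} = \big[E\{x(n)x(n-1)\},\ E\{x(n)x(n-2)\},\ \ldots,\ E\{x(n)x(n-k)\}\big]^T$$

$$\tilde{R} = E\begin{bmatrix} x(n-1)x(n-1) & x(n-1)x(n-2) & \cdots & x(n-1)x(n-k) \\ x(n-2)x(n-1) & x(n-2)x(n-2) & \cdots & x(n-2)x(n-k) \\ \vdots & \vdots & \ddots & \vdots \\ x(n-k)x(n-1) & x(n-k)x(n-2) & \cdots & x(n-k)x(n-k) \end{bmatrix}$$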
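To make the relation $\tilde{a} = \tilde{R}^{-1}\tilde{r}$ concrete, here is a small Python sketch for a second-order predictor ($k = 2$). It is an illustration only: the expectations are replaced by sample averages over a short test signal, the 2 x 2 system is solved by Cramer's rule, and the function name is an assumption.

```python
def optimal_predictor_order2(x):
    """Solve R a = r for x_p(n) = a1*x(n-1) + a2*x(n-2), using sample averages."""
    ns = range(2, len(x))
    m = len(x) - 2
    # Entries of the 2x2 matrix R and of the vector r
    r11 = sum(x[n - 1] * x[n - 1] for n in ns) / m
    r12 = sum(x[n - 1] * x[n - 2] for n in ns) / m
    r22 = sum(x[n - 2] * x[n - 2] for n in ns) / m
    r1 = sum(x[n] * x[n - 1] for n in ns) / m
    r2 = sum(x[n] * x[n - 2] for n in ns) / m
    det = r11 * r22 - r12 * r12
    a1 = (r1 * r22 - r2 * r12) / det
    a2 = (r11 * r2 - r12 * r1) / det
    return a1, a2

# A linear ramp obeys x(n) = 2*x(n-1) - x(n-2), so the optimal coefficients are (2, -1)
ramp = list(range(10))
a1, a2 = optimal_predictor_order2(ramp)
print(a1, a2)
```

For the ramp the coefficients also sum to one, and the prediction error is zero at every sample.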

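[Plot: intensity versus position, annotated with offset Y and amplitude A]

Figure 5b

e.g. $A\sin(n/T) + Y$

[Plot: intensity versus position n, annotated with amplitude A and offset Y]

Figure 5d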

1. Select a basis: a set of fixed functions {f0(n), f1(n), f2(n), f3(n), ..., fN(n)}

2. Assume that all types of signals can be approximated by a linear combination of these functions, i.e. A(n) = a0f0(n) + a1f1(n) + a2f2(n) + ... + aNfN(n)

3. Calculate the coefficients a0, a1, ..., aN

4. Represent the input signal with the coefficients instead of the actual data

5. Compression: Use fewer coefficients, e.g. a0, a1, ..., aK (K<N)

6. For example: the set of sine and cosine waves

Major Steps

1. Adopt the sine and cosine waves as a basis

2. Calculate the Fourier coefficients (Note: a sequence of N points will give N complex coefficients)

3. Encoding (compression): Represent the signal with the first K coefficients, where K < N

4. Decoding (decompression): Reconstruct the signal from the K coefficients with the inverse Fourier Transform.

5. Other Transforms (e.g. Walsh Transform) can be adopted

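Sinusoidal Waves

Set of basis functions

$$W_0 = [W(0,0),\ W(1,0),\ \ldots,\ W(N-1,0)]$$

$$W_1 = [W(0,1),\ W(1,1),\ \ldots,\ W(N-1,1)]$$

$$\vdots$$

$$W_{N-1} = [W(0,N-1),\ W(1,N-1),\ \ldots,\ W(N-1,N-1)]$$

Transform from the “s” domain to the “S” domain

$$s = [x(0),\ x(1),\ x(2),\ \ldots,\ x(N-1)]$$

$$S = [X(0),\ X(1),\ X(2),\ \ldots,\ X(N-1)] = [\langle s, W_0\rangle,\ \langle s, W_1\rangle,\ \ldots,\ \langle s, W_{N-1}\rangle]$$

$\langle A, B\rangle$ denotes Dot Product between A and B

$$X(0) = \sum_{n=0}^{N-1} x(n)\,W(n,0), \qquad X(1) = \sum_{n=0}^{N-1} x(n)\,W(n,1), \qquad X(2) = \sum_{n=0}^{N-1} x(n)\,W(n,2)$$

$$X(k) = \sum_{n=0}^{N-1} x(n)\,W(n,k)$$

[Diagram: X(k) is formed by multiplying x(0), x(1), x(2), ..., x(N-2), x(N-1) element by element with W(0,k), W(1,k), W(2,k), ..., W(N-2,k), W(N-1,k) and summing the products]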

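A. Orthogonal Property

$$\langle W_j, W_k\rangle = \frac{1}{N}\sum_{n=0}^{N-1} W(n,j)\,W^{*}(n,k) = \delta(j-k)$$

where $\delta(\cdot)$ is the Delta function (the conjugate has no effect when the basis is real)

B. Orthonormal Property

$$\langle W_j, W_k\rangle = \begin{cases} 1, & j = k \\ 0, & \text{otherwise} \end{cases}$$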

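Inverse Transform from the “S” domain to the “s” domain

$$S = [X(0),\ X(1),\ X(2),\ \ldots,\ X(N-1)]$$

$$s = [x(0),\ x(1),\ x(2),\ \ldots,\ x(N-1)] = [\langle S, W'_0\rangle,\ \langle S, W'_1\rangle,\ \ldots,\ \langle S, W'_{N-1}\rangle]$$

$\langle A, B\rangle$ denotes Dot Product between A and B; $W_k$ and $W'_k$ are complex conjugates

$$x(0) = \sum_{k=0}^{N-1} X(k)\,W^{*}(0,k), \qquad x(1) = \sum_{k=0}^{N-1} X(k)\,W^{*}(1,k), \qquad x(2) = \sum_{k=0}^{N-1} X(k)\,W^{*}(2,k)$$

$$x(n) = \sum_{k=0}^{N-1} X(k)\,W^{*}(n,k)$$

[Diagram: x(n) is formed by multiplying X(0), X(1), X(2), ..., X(N-2), X(N-1) element by element with W*(n,0), W*(n,1), W*(n,2), ..., W*(n,N-2), W*(n,N-1) and summing the products]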

Note: X(k) is complex

$$X(k) = \sum_{n=0}^{N-1} x(n)\,W(n,k)$$

$$W(n,k) = e^{-j2\pi nk/N} = \cos\frac{2\pi nk}{N} - j\sin\frac{2\pi nk}{N}$$

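$$X(k) = \sqrt{\frac{2}{N}}\; C(k) \sum_{n=0}^{N-1} x(n)\,W(n,k)$$

$$W(n,k) = \cos\left[\frac{(2n+1)k\pi}{2N}\right]$$

$$C(k) = \begin{cases} 2^{-0.5}, & k = 0 \\ 1, & \text{otherwise} \end{cases}$$

Note: X(k) is real

$$x(n) = \sqrt{\frac{2}{N}} \sum_{k=0}^{N-1} C(k)\,X(k)\,W'(n,k)$$

$$W'(n,k) = \cos\left[\frac{(2n+1)k\pi}{2N}\right]$$

Note: $W_k$ is real, therefore $W'_k = W_k$

$C(k) = 2^{-0.5}$ for $k = 0$, and $C(k) = 1$ otherwise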
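The forward and inverse DCT can be tried out with the short Python sketch below. It assumes the orthonormal scaling $\sqrt{2/N}\,C(k)$ on both transforms, uses a direct $O(N^2)$ summation rather than a fast algorithm, and the test signal and function names are made up for illustration.

```python
import math

def dct(x):
    """Forward DCT: X(k) = sqrt(2/N) C(k) sum_n x(n) cos[(2n+1)k*pi/(2N)]."""
    n_pts = len(x)
    out = []
    for k in range(n_pts):
        c = 2 ** -0.5 if k == 0 else 1.0
        s = sum(x[n] * math.cos((2 * n + 1) * k * math.pi / (2 * n_pts))
                for n in range(n_pts))
        out.append(math.sqrt(2 / n_pts) * c * s)
    return out

def idct(coeffs):
    """Inverse DCT: x(n) = sqrt(2/N) sum_k C(k) X(k) cos[(2n+1)k*pi/(2N)]."""
    n_pts = len(coeffs)
    out = []
    for n in range(n_pts):
        s = 0.0
        for k in range(n_pts):
            c = 2 ** -0.5 if k == 0 else 1.0
            s += c * coeffs[k] * math.cos((2 * n + 1) * k * math.pi / (2 * n_pts))
        out.append(math.sqrt(2 / n_pts) * s)
    return out

# Hypothetical smooth row of 8 pixel intensities
signal = [100, 102, 104, 107, 110, 112, 113, 115]
coeffs = dct(signal)

# Compression: keep the first K = 3 coefficients, drop the rest
kept = coeffs[:3] + [0.0] * (len(coeffs) - 3)
approx = idct(kept)

print([round(c, 2) for c in coeffs])
print([round(v, 1) for v in approx])
```

With this scaling the transform preserves energy, so the share of the signal energy held by the kept coefficients can be read directly from the sum of their squares.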

Transforms that are suitable for compression should exhibit the following properties:

a. There exists an inverse transform

b. Decorrelation

c. Good Energy Compactness

Optimal Transform : Karhunen-Loeve Transform (KLT)

x(0), x(1), x(2), x(3), x(4), x(5), x(6), x(7), ….., x(N-2), x(N-1)

A sample can be predicted from its neighbor(s)

X(0) X(1) X(2) X(3) X(4) X(5) X(6) X(7)

After DFT, a coefficient is less predictable from its neighbor(s)

[Plot: magnitude of frequency components]

x(0), x(1), x(2), x(3), x(4), x(5), x(6), x(7), ….., x(N-2), x(N-1)

All samples are important

Any missing sample causes large distortion

[Plots: the samples x(0) ... x(7) and the corresponding DFT samples X(0) ... X(7)]

The signal can be reconstructed from the first 3 DFT samples with good approximation

All information is concentrated in a small number of elements in the transformed domain

DCT has very good Energy Compactness and Decorrelation Properties

2-D DCT

$$X(j,k) = \sqrt{\frac{2}{M}}\sqrt{\frac{2}{N}}\; C(j)\,C(k) \sum_{m=0}^{M-1}\sum_{n=0}^{N-1} x(m,n)\,W(m,j)\,W(n,k)$$

$$W(m,j) = \cos\left[\frac{(2m+1)j\pi}{2M}\right], \qquad W(n,k) = \cos\left[\frac{(2n+1)k\pi}{2N}\right]$$

$C(k)$, $C(j)$ = $2^{-0.5}$ for k = 0 and j = 0, respectively; = 1 otherwise

[Diagram: the M x N block of samples x(0,0) ... x(M-1,N-1) is mapped to the M x N block of coefficients X(0,0) ... X(M-1,N-1)]

2-D IDCT

$$x(m,n) = \sqrt{\frac{2}{M}}\sqrt{\frac{2}{N}} \sum_{j=0}^{M-1}\sum_{k=0}^{N-1} C(j)\,C(k)\,X(j,k)\,W(m,j)\,W(n,k)$$

$$W(m,j) = \cos\left[\frac{(2m+1)j\pi}{2M}\right], \qquad W(n,k) = \cos\left[\frac{(2n+1)k\pi}{2N}\right]$$

$C(k)$, $C(j)$ = $2^{-0.5}$ for k = 0 and j = 0, respectively; = 1 otherwise

[Diagram: the M x N block of coefficients X(0,0) ... X(M-1,N-1) is mapped back to the M x N block of samples x(0,0) ... x(M-1,N-1)]

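Given a signal $f(n) = [f(0), f(1), \ldots, f(N-1)]$, define the mean and autocorrelation as

$$\mu(n) = E\{f(n)\} \quad\text{and}\quad R(n, n+k) = E\{f(n)\,f(n+k)\}$$

Assume f(n) is wide-sense stationary, i.e. its statistical properties are constant with changes in time

$$\mu(n) = \mu = \text{constant} \quad\text{and}\quad R(n, n+k) = R(k) \qquad (O1)$$

Define

$$\mathbf{f} = [f(0), f(1), \ldots, f(N-1)]^T \quad\text{and}\quad \boldsymbol{\mu} = \mu\,[1, 1, \ldots, 1]^T \qquad (O2)$$

$$R = \begin{bmatrix} R(0) & R(1) & \cdots & R(N-1) \\ R(1) & R(0) & \cdots & R(N-2) \\ \vdots & \vdots & \ddots & \vdots \\ R(N-1) & R(N-2) & \cdots & R(0) \end{bmatrix} = \sigma^2 \begin{bmatrix} 1 & \rho_1 & \cdots & \rho_{N-1} \\ \rho_1 & 1 & \cdots & \rho_{N-2} \\ \vdots & \vdots & \ddots & \vdots \\ \rho_{N-1} & \rho_{N-2} & \cdots & 1 \end{bmatrix} \qquad (O3)$$

where $R(k) = \sigma^2 \rho_k$ and $\rho_0 = 1$

Equation O1 can be rewritten as

$$R_f = E\{\mathbf{f}\,\mathbf{f}^T\} \qquad (O4)$$

The covariance of f is given by

$$C = \mathrm{cov}(\mathbf{f}) = E\{(\mathbf{f} - \boldsymbol{\mu})(\mathbf{f} - \boldsymbol{\mu})^T\} = R_f - \boldsymbol{\mu}\boldsymbol{\mu}^T \qquad (O5)$$

The signal is transformed to its spectral coefficients

$$\theta_k = \sum_{s=0}^{N-1} f(s)\,\phi_k^{*}(s), \qquad 0 \le k \le N-1$$

Comparing the two sequences:

$f(n) = [f(0), f(1), \ldots, f(N-1)]$: a. Adjacent terms are related; b. Every term is important

$\theta = [\theta_0, \theta_1, \ldots, \theta_{N-1}]$: a. Adjacent terms are unrelated; b. Only the first few terms are important

Similar to f, we can define the mean, autocorrelation and covariance matrix for $\theta$

$$R_\theta = E\{\boldsymbol{\theta}\,\boldsymbol{\theta}^T\}$$

Adjacent terms are uncorrelated if every term is only correlated to itself, i.e., all off-diagonal terms in the autocorrelation matrix are zero.

Define a measurement on correlation between samples:

$$\psi_f = \sum_{i=0}^{N-1}\ \sum_{j=0,\, j \ne i}^{N-1} |R_f(i,j)| \quad\text{and}\quad \psi_\theta = \sum_{i=0}^{N-1}\ \sum_{j=0,\, j \ne i}^{N-1} |R_\theta(i,j)| \qquad (O6)$$

We assume that the mean of the signal is zero. This can be achieved simply by subtracting the mean from f if it is non-zero.

The covariance and autocorrelation matrices are the same after the mean is removed.

Note:

$$\mathbf{f} = \sum_{r=0}^{N-1} \theta_r\,\boldsymbol{\phi}_r$$

If only the first L terms (r = 0, ..., L-1) are used to reconstruct the signal, we have

$$\hat{\mathbf{f}}_L = \sum_{r=0}^{L-1} \theta_r\,\boldsymbol{\phi}_r \qquad (O7)$$

and the error is

$$\mathbf{e}_L = \mathbf{f} - \hat{\mathbf{f}}_L = \sum_{r=L}^{N-1} \theta_r\,\boldsymbol{\phi}_r \qquad (O8)$$

The energy lost is given by

$$\mathbf{e}_L^T\,\mathbf{e}_L = \sum_{r=L}^{N-1} \theta_r^2 \qquad (O9)$$

but

$$\theta_r = \sum_{k=0}^{N-1} f(k)\,\phi_r^{*}(k) = \boldsymbol{\phi}_r^T\,\mathbf{f}$$

hence

$$\theta_r^2 = \boldsymbol{\phi}_r^T\,\mathbf{f}\,\mathbf{f}^T\,\boldsymbol{\phi}_r \qquad (O10)$$

Eqn. O10 is valid for describing the approximation error of a single sequence of signal data f. A more generic description for covering a collection of signal sequences is given by:

$$J' = E\{\mathbf{e}_L^T\,\mathbf{e}_L\} = E\Big\{\sum_{r=L}^{N-1} \theta_r^2\Big\} = \sum_{r=L}^{N-1} \boldsymbol{\phi}_r^T\,E\{\mathbf{f}\,\mathbf{f}^T\}\,\boldsymbol{\phi}_r = \sum_{r=L}^{N-1} \boldsymbol{\phi}_r^T\,R_f\,\boldsymbol{\phi}_r \qquad (O11)$$

An optimal transform minimizes the error term in eqn. O11. However, the solution space is enormous and a constraint is required. Note that the basis functions are orthonormal, hence the following objective function is adopted.

$$J = \sum_{r=L}^{N-1} \left[\boldsymbol{\phi}_r^T\,R_f\,\boldsymbol{\phi}_r - \lambda_r\,(\boldsymbol{\phi}_r^T\,\boldsymbol{\phi}_r - 1)\right] \qquad (O12)$$

The term $\lambda_r$ is known as the Lagrangian multiplier

The optimal solution can be found by setting the gradient of J to 0 for each value of r, i.e.,

$$\nabla_{\boldsymbol{\phi}_r} J = \frac{\partial J}{\partial \boldsymbol{\phi}_r} = 0 \qquad (O13)$$

Eqn O13 is based on the orthonormal property of the basis functions.

The solution for each basis function is given by

$$R_f\,\boldsymbol{\phi}_r = \lambda_r\,\boldsymbol{\phi}_r \qquad (O14)$$

$\boldsymbol{\phi}_r$ is an eigenvector of $R_f$ and $\lambda_r$ is an eigenvalue

Grouping the N basis functions gives an overall equation

$$R_f\,\Phi^T = \Phi^T \Lambda, \quad\text{where } \Phi^T = [\boldsymbol{\phi}_0, \boldsymbol{\phi}_1, \ldots, \boldsymbol{\phi}_{N-1}] \text{ and } \Lambda = \mathrm{diag}(\lambda_0, \lambda_1, \ldots, \lambda_{N-1}) \qquad (O15)$$

$$R_\theta = \Phi\,R_f\,\Phi^T = \Lambda \qquad (O16)$$

which is a diagonal matrix. The decorrelation criterion is satisfied

Given a signal $f(n) = [f(0), f(1), \ldots, f(N-1)]$

Determine the autocorrelation matrix $R_f$

The solution for each basis function is given by $R_f\,\boldsymbol{\phi}_r = \lambda_r\,\boldsymbol{\phi}_r$

The signal is transformed to its spectral coefficients $\theta_k = \sum_{s=0}^{N-1} f(s)\,\phi_k^{*}(s)$, $0 \le k \le N-1$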
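The decorrelation property can be illustrated on the smallest possible case, N = 2. The Python sketch below finds the eigenvectors of a 2 x 2 autocorrelation matrix in closed form and checks that the transformed autocorrelation matrix is diagonal. The matrix values (correlation 0.9 between neighbours), the symbol names and the function names are assumptions chosen for illustration; a practical KLT would use a numerical eigen-solver on a larger matrix.

```python
import math

def klt_2x2(r_f):
    """Eigenvalues and orthonormal eigenvectors of a symmetric matrix [[a, b], [b, d]], b != 0."""
    (a, b), (_, d) = r_f
    mean, half_diff = (a + d) / 2, (a - d) / 2
    root = math.hypot(half_diff, b)
    eigvals = [mean + root, mean - root]
    basis = []
    for lam in eigvals:
        v = (b, lam - a)                     # solves (R_f - lam*I) v = 0 when b != 0
        norm = math.hypot(v[0], v[1])
        basis.append((v[0] / norm, v[1] / norm))
    return eigvals, basis

def transform_autocorrelation(basis, r_f):
    """Return Phi R_f Phi^T, where the rows of Phi are the basis vectors."""
    def quad(u, v):
        return sum(u[i] * r_f[i][j] * v[j] for i in range(2) for j in range(2))
    return [[quad(u, v) for v in basis] for u in basis]

# Hypothetical autocorrelation matrix of two strongly correlated neighbouring samples
R_f = [[1.0, 0.9], [0.9, 1.0]]
eigvals, Phi = klt_2x2(R_f)
R_theta = transform_autocorrelation(Phi, R_f)

print(eigvals)    # most of the energy sits in the first coefficient
print(R_theta)    # off-diagonal terms vanish
```

The two eigenvalues, 1.9 and 0.1, are the energies of the two transformed coefficients, so dropping the second coefficient loses only about 5% of the total energy in this example.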

Redundancy in images

The probability distribution of pixel values is uneven

Assuming the pixel intensity (gray scale) ranges from 0 to 255 units

[Plot: pixel intensity scale from 0 to 255]

Figure 6a

[Plot: probability of occurrence (0.2 to 1.0) versus pixel intensity (0, 1, 2, 3, 4, ..., 100, ..., 252, 253, 254, 255)]

Figure 6b

Use fewer bits to represent pixel intensities that occur more often

A simple example: an image of 720 pixels x 576 pixels, 8 bits per pixel

Image size = 720 x 576 x 8 = 3.3 Mbits

[Plot: probability (Pr) versus pixel intensity (0 to 255)]

Intensity   Pr
0 - 254     0.00196 (each)
255         0.500

Intensity   Pr        # of bits   Bit String
0 - 254     0.00196   9           1XXXXXXXX
255         0.500     1           0

Total = (720 x 576) x (0.500 x 1 + 0.002 x 255 x 9) = 2.1 Mbits
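The saving in this example can be reproduced in Python. The sketch assumes exactly the probabilities and code lengths of the table above (the 255 intensities other than 255 share the remaining probability equally); the entropy is added only as a reference point, and all variable names are illustrative.

```python
import math

PIXELS = 720 * 576

# Probabilities from the example: intensity 255 has Pr = 0.5,
# the other 255 intensities share the remaining 0.5 equally (about 0.00196 each)
prob = {255: 0.5}
prob.update({v: 0.5 / 255 for v in range(255)})

# Code lengths from the example: '0' for 255, '1XXXXXXXX' for everything else
code_length = {v: (1 if v == 255 else 9) for v in range(256)}

fixed_bits = PIXELS * 8
average_length = sum(prob[v] * code_length[v] for v in range(256))
variable_bits = PIXELS * average_length
entropy = -sum(p * math.log2(p) for p in prob.values())

print(f"fixed length:    {fixed_bits / 1e6:.2f} Mbits")
print(f"variable length: {variable_bits / 1e6:.2f} Mbits ({average_length:.2f} bits/pixel)")
print(f"entropy:         {entropy:.3f} bits/pixel")
```

The average code length of 5 bits/pixel is very close to the entropy of this distribution, so little further gain is available from a better code for these probabilities.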

Sequential frames are similar

Figure 7

P1

P2

P3

Only about 5-10% of the content changes between frames

Still picture - JPEG

Joint Photographic Experts Group

• International Organization for Standardization (ISO) standard.

• Based on Discrete Cosine Transform (DCT).

Motion picture - MPEG

Moving Picture Experts Group

Image -> Digitization -> Image Vectors -> DCT -> Quantization -> Zig-Zag Coding -> Runlength Coding -> Entropy Coding -> JPEG Compressed Format

Digitization

Figure 8

Figure 9

Image Digitization


Figure 10a

Figure 10b

Image vectors

Figure 10c

Image Vector - a magnified view


Figure 10d

Image Vector - a magnified view

x(0,0) x(0,1) x(0,2) x(0,3) x(0,4) x(0,5) x(0,6) x(0,7)
x(1,0) x(1,1) x(1,2) x(1,3) x(1,4) x(1,5) x(1,6) x(1,7)
x(2,0) x(2,1) x(2,2) x(2,3) x(2,4) x(2,5) x(2,6) x(2,7)
x(3,0) x(3,1) x(3,2) x(3,3) x(3,4) x(3,5) x(3,6) x(3,7)
x(4,0) x(4,1) x(4,2) x(4,3) x(4,4) x(4,5) x(4,6) x(4,7)
x(5,0) x(5,1) x(5,2) x(5,3) x(5,4) x(5,5) x(5,6) x(5,7)
x(6,0) x(6,1) x(6,2) x(6,3) x(6,4) x(6,5) x(6,6) x(6,7)
x(7,0) x(7,1) x(7,2) x(7,3) x(7,4) x(7,5) x(7,6) x(7,7)

Figure 10e

DCT

[Figure: horizontal axis “Increasing horizontal frequency”, vertical axis “Increasing vertical frequency”]

Figure 11a

[Figure: horizontal axis “Increasing horizontal frequency”, vertical axis “Increasing vertical frequency”]

Figure 11b

Because of the energy compactness of DCT, most of the information is concentrated in the low frequency corner

200 185 170  25   1   3   1   3
210 190 195 120   7  15   5   8
198 180 160 171  10   7   3  10
165 150 125   5  12  11  10   9
 30  25   8  13   5   3   9   0
  2   9   1   0   3   6   2   1
  5   5   7   2   7   1   1   5
  4   9   2  11   9   2   3   0

Figure 11c

The DCT coefficients are normalised to 11-bit integer values

Before the transform, the pixel intensity range is converted from [0, 255] to [-128, 127]

The process, known as ‘zero shift’, is performed by subtracting 128 from each pixel intensity

Quantization

$f \rightarrow$ Quantizer $\rightarrow \hat{f} = Q(f)$

Uniform Symmetric Quantizers

[Plot: staircase input-output characteristic; the input axis is marked with the decision levels -d4, -d3, -d2, -d1, 0, d1, d2, d3, d4 and the output axis with the representation levels -r3, -r2, -r1, r1, r2, r3]

di : Decision levels

ri : Representation levels

Mean Square Quantization Error (MSQE)

$$E\{(f - \hat{f})^2\} = \int_{a_L}^{a_U} (f - \hat{f})^2\, p(f)\, df \qquad (Q1)$$

Mean Absolute Quantization Error (MAQE)

$$E\{|f - \hat{f}|\} = \int_{a_L}^{a_U} |f - \hat{f}|\, p(f)\, df \qquad (Q2)$$

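Max-Lloyd Quantizer

A method to determine the decision and representation levels

Suppose $Q(f) = r_j$ for $d_j \le f < d_{j+1}$

Then

$$\mathrm{MSQE} = \int_{a_L}^{a_U} (f - \hat{f})^2\, p(f)\, df = \sum_{l=0}^{J-1} \int_{d_l}^{d_{l+1}} (f - r_l)^2\, p(f)\, df \qquad (Q3)$$

Consider two arbitrary adjacent reconstruction levels $r_{k-1}$ and $r_k$, lying in the intervals $[d_{k-1}, d_k]$ and $[d_k, d_{k+1}]$

What will be the optimal value for $d_k$ so that error is minimized?

$$\frac{\partial}{\partial d_k}\left[\int_{d_{k-1}}^{d_k} (f - r_{k-1})^2\, p(f)\, df + \int_{d_k}^{d_{k+1}} (f - r_k)^2\, p(f)\, df\right] = 0 \;\Rightarrow\; d_k = \frac{r_{k-1} + r_k}{2} \qquad (Q4)$$

Similarly

$$\frac{\partial}{\partial r_k}\int_{d_k}^{d_{k+1}} (f - r_k)^2\, p(f)\, df = 0 \;\Rightarrow\; r_k = \frac{\int_{d_k}^{d_{k+1}} f\, p(f)\, df}{\int_{d_k}^{d_{k+1}} p(f)\, df} \qquad (Q5)$$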

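Max-Lloyd Quantizer for uniform pdf

Consider a uniform probability density function

$$p(f) = \frac{1}{a_U - a_L} = \frac{1}{A}, \qquad -\frac{A}{2} \le f \le \frac{A}{2}$$

From Q5,

$$r_k = \frac{\int_{d_k}^{d_{k+1}} f\, p(f)\, df}{\int_{d_k}^{d_{k+1}} p(f)\, df} = \frac{d_{k+1}^2 - d_k^2}{2\,(d_{k+1} - d_k)} = \frac{d_{k+1} + d_k}{2}$$

From Q4,

$$d_k = \frac{r_k + r_{k-1}}{2} = \frac{d_{k+1} + 2d_k + d_{k-1}}{4}$$

Hence,

$$2d_k = d_{k+1} + d_{k-1} \;\Rightarrow\; d_{k+1} - d_k = d_k - d_{k-1} \quad \text{(Constant Step Size)}$$

Step size (SS):

$$SS = d_{k+1} - d_k = \frac{a_U - a_L}{J}$$

$$\mathrm{MSQE} = \frac{1}{SS}\int_{d_k}^{d_{k+1}} (f - r_k)^2\, df = \frac{1}{SS}\int_{-SS/2}^{SS/2} g^2\, dg = \frac{SS^2}{12} \qquad (Q6)$$

$$\text{Variance} = \sigma_f^2 = \frac{1}{A}\int_{-A/2}^{A/2} f^2\, df = \frac{A^2}{12} \qquad (Q7)$$

For a b-bit quantizer,

$$J = 2^b, \qquad SS = \frac{A}{2^b} \qquad (Q8)$$

$$\mathrm{SNR} = \frac{\sigma_f^2}{\mathrm{MSQE}} = \frac{A^2/12}{(A/2^b)^2/12} = 2^{2b} \approx 6b\ \mathrm{dB} \qquad (Q9)$$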
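Equations Q6 to Q9 can be checked numerically with the Python sketch below. It assumes a uniform pdf on [-A/2, A/2], approximates the integrals by evaluating the quantizer on an evenly spaced grid of inputs, and places each representation level at the midpoint of its decision interval; the function name and its parameters are illustrative.

```python
import math

def uniform_quantizer_stats(bits, amplitude=1.0, samples_per_cell=200):
    """Return (step size, MSQE, SNR in dB) of a uniform quantizer under a uniform pdf."""
    levels = 2 ** bits                      # J = 2^b
    step = amplitude / levels               # SS = A / J
    total = levels * samples_per_cell
    sq_err = 0.0
    sq_sig = 0.0
    for i in range(total):
        f = -amplitude / 2 + (i + 0.5) * amplitude / total   # evenly spaced inputs
        cell = min(levels - 1, int((f + amplitude / 2) / step))
        r = -amplitude / 2 + (cell + 0.5) * step             # midpoint representation level
        sq_err += (f - r) ** 2
        sq_sig += f ** 2
    msqe = sq_err / total
    variance = sq_sig / total
    return step, msqe, 10 * math.log10(variance / msqe)

for b in (2, 4, 6):
    step, msqe, snr_db = uniform_quantizer_stats(b)
    print(f"b = {b}: MSQE = {msqe:.3e} (SS^2/12 = {step ** 2 / 12:.3e}), SNR = {snr_db:.2f} dB")
```

Each additional bit halves the step size and therefore adds about 6 dB to the SNR.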

200 185 170  25   1   3   1   3
210 190 195 120   7  15   5   8
198 180 160 171  10   7   3  10
165 150 125   5  12  11  10   9
 30  25   8  13   5   3   9   0
  2   9   1   0   3   6   2   1
  5   5   7   2   7   1   1   5
  4   9   2  11   9   2   3   0

Assign a different quantization step size to each coefficient

Figure 12

Consider a range of values from, let's say, 0 to 255

If a step size = 8 is used, the range is divided into 256/8 = 32 levels

5 bits are required to represent each level in this range

Value       Level   Bit string   Quantized value
0 - 7       0       00000        0
8 - 15      1       00001        8
16 - 23     2       00010        16
24 - 31     3       00011        24
32 - 39     4       00100        32
...
248 - 255   31      11111        248

If a step size = 16 is used, the range is divided into 256/16 = 16 levels

4 bits are required to represent each level in this range

Value       Level   Bit string   Quantized value
0 - 15      0       0000         0
16 - 31     1       0001         16
32 - 47     2       0010         32
48 - 63     3       0011         48
64 - 79     4       0100         64
...
240 - 255   15      1111         240

The larger the step size, the smaller the number of quantized levels, the smaller the number of bits, and the larger the distortion in value, and the other way round
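The two tables can be generated with a few lines of Python. This sketch assumes values in the range 0 to 255 and a step size that divides 256; the function names are illustrative.

```python
def quantize(value, step):
    """Return (level, quantized value) for a uniform quantizer with the given step size."""
    level = value // step
    return level, level * step

def bits_needed(step, value_range=256):
    """Number of bits required to label every level."""
    return (value_range // step - 1).bit_length()

for step in (8, 16):
    width = bits_needed(step)
    level, q = quantize(37, step)
    print(f"step {step}: {256 // step} levels, {width} bits; "
          f"37 -> level {level} ({level:0{width}b}) -> {q}")
```

With step size 8 the value 37 is reproduced as 32, while with step size 16 it is also reproduced as 32 but with one bit fewer; a value such as 47 would be reproduced as 40 and 32 respectively, showing the larger distortion of the coarser quantizer.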

Human Visual System (HVS) is more sensitive to low frequency intensity (spatial) variation in an image

[Figure: horizontal axis “Increasing horizontal frequency”, vertical axis “Increasing vertical frequency”]

Figure 13

[Figure: HVS sensitivity decreases along both the horizontal and the vertical frequency axes]

Figure 14

DCT coefficients

200 185 170  25   1   3   1   3
210 190 195 120   7  15   5   8
198 180 160 171  10   7   3  10
165 150 125   5  12  11  10   9
 30  25   8  13   5   3   9   0
  2   9   1   0   3   6   2   1
  5   5   7   2   7   1   1   5
  4   9   2  11   9   2   3   0

Q Step Size

  1   1   1   4   8  12  16  20
  1   1   4   8  12  20  22  24
  1   4   8  12  16  22  22  25
  4   8  12  16  20  24  25  30
  8  12  16  20  22  28  30  32
 12  14  18  24  25  30  35  40
 10  16  20  28  30  35  40  43
 12  20  25  30  32  40  45  48

Assign a different quantization step size to each coefficient

Figure 15

DCT coefficients

200 185 170  25   1   3   1   3
210 190 195 120   7  15   5   8
198 180 160 171  10   7   3  10
165 150 125   5  12  11  10   9
 30  25   8  13   5   3   9   0
  2   9   1   0   3   6   2   1
  5   5   7   2   7   1   1   5
  4   9   2  11   9   2   3   0

Quantized DCT coefficients

200 185 170  24   0   0   0   0
210 190 192 120   0   0   0   0
198 180 160 168   0   0   0   0
164 144 120   0   0   0   0   0
 32  24   0   0   0   0   0   0
  0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0

Assign a different quantization step size to each coefficient

Figure 16
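A per-coefficient quantizer of this kind can be sketched in Python as follows. The sketch assumes that each coefficient is truncated (toward zero) to a multiple of its own step size, an assumption that reproduces the two rows used here; the function names are illustrative, and only two rows of the 8 x 8 block are shown to keep the example short.

```python
def truncate_to_step(coefficient, step):
    """Truncate a coefficient toward zero to a multiple of its step size."""
    magnitude = (abs(coefficient) // step) * step
    return magnitude if coefficient >= 0 else -magnitude

def quantize_block(coefficients, steps):
    """Apply a separate step size to every coefficient of a block."""
    return [[truncate_to_step(c, q) for c, q in zip(c_row, q_row)]
            for c_row, q_row in zip(coefficients, steps)]

# First two rows of the DCT coefficients and of the step-size table
dct_rows = [[200, 185, 170, 25, 1, 3, 1, 3],
            [210, 190, 195, 120, 7, 15, 5, 8]]
step_rows = [[1, 1, 1, 4, 8, 12, 16, 20],
             [1, 1, 4, 8, 12, 20, 22, 24]]

quantized = quantize_block(dct_rows, step_rows)
for row in quantized:
    print(row)
```

Low-frequency coefficients, which have small step sizes, survive almost unchanged, while the high-frequency coefficients fall below their step size and become zero.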

After Quantization, many high frequency DCT coefficients are truncated to ‘0’

The non-zero coefficients carry most of the image content and are the ones to which the HVS is sensitive

The large number of ‘0’ value coefficients suggests runlength coding

For a continuous stream of numbers with identical values, it is only necessary to record
1. The value of the number
2. The number of repetitions

A sequence of 8 bytes of raw data s = [15, 15, 15, 15, 15, 15, 15, 15]

Runlength representation (Value, Runlength): [ 15 , 8 ]

Only 2 bytes are needed to represent ‘s’ (Compression Ratio = 4)

The longer the string of duplicated numbers, the larger the Compression Ratio (CR)

s = [15, 15, 15, 15]

Runlength representation (Value, Runlength): [ 15 , 4 ], Compression Ratio = 2

s = [15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15]

Runlength representation (Value, Runlength): [ 15 , 16 ], Compression Ratio = 8
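A minimal runlength coder along these lines is sketched below in Python. It assumes one byte for each value and one byte for each runlength, so every run costs 2 bytes; the function names are illustrative.

```python
def run_length_encode(data):
    """Return a list of (value, runlength) pairs."""
    runs = []
    for value in data:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([value, 1])   # start a new run
    return [tuple(run) for run in runs]

def compression_ratio(data):
    """Raw size in bytes divided by coded size (2 bytes per run)."""
    return len(data) / (2 * len(run_length_encode(data)))

for length in (4, 8, 16):
    s = [15] * length
    print(run_length_encode(s), "CR =", compression_ratio(s))
```

Data without repeated values would be expanded rather than compressed by this scheme (CR = 0.5), which is why runlength coding is applied only after quantization has produced long runs of zeros.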

Quantized DCT coefficients           Runlength of ‘0’   CR
200 185 170  24   0   0   0   0      4                  2
210 190 192 120   0   0   0   0      4                  2
198 180 160 168   0   0   0   0      4                  2
164 144 120   0   0   0   0   0      5                  2.5
 32  24   0   0   0   0   0   0      6                  3
  0   0   0   0   0   0   0   0      8                  4
  0   0   0   0   0   0   0   0      8                  4
  0   0   0   0   0   0   0   0      8                  4

Figure 17

The compression ratio of horizontal scanning is always less than or equal to 4

A better approach is to adopt zig-zag scanning

Zig-Zag Coding

Quantized DCT coefficients

200 185 170  24   0   0   0   0
210 190 192 120   0   0   0   0
198 180 160 168   0   0   0   0
164 144 120   0   0   0   0   0
 32  24   0   0   0   0   0   0
  0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0

Runlength of ‘0’ = 47

CR = 23.5

Figure 18

Entropy Coding

Remember this?

The probability distribution of pixel values is uneven

Use fewer bits to represent pixel intensities that occur more often

This can be generalised to ......

If the probability distribution of data values is uneven

Fewer bits can be used to represent values that occur more often, and vice versa

In JPEG, DC and other coefficients are encoded separately

DCT coefficients

200 185 170  25   1   3   1   3
210 190 195 120   7  15   5   8
198 180 160 171  10   7   3  10
165 150 125   5  12  11  10   9
 30  25   8  13   5   3   9   0
  2   9   1   0   3   6   2   1
  5   5   7   2   7   1   1   5
  4   9   2  11   9   2   3   0

Figure 19

The top-left coefficient (200) is the ‘DC’ term

All other coefficients are ‘AC’ terms

DC coefficients of adjacent image blocks are similar.

DC coefficient represents the average intensity in an image block

8 pixels

8 pixels

Differential Pulse Code Modulation (DPCM) is applied to encode the ‘Quantized’ DC terms.

Consider a row of image blocks

Quantized DC coefficients of the image blocks:   200   190   198   202   205   200   195   220   225

DPCM:                                                  -10    +8    +4    +3    -5    -5   +25    +5

As adjacent DC terms are similar, the DPCM values are small in general, i.e., small values occur more often

The DPCM values are divided into 16 classes according to their magnitude

Each class has a different probability of occurrence

Class   DPCM difference values
0       [0]
1       [-1] [+1]
2       [-3, -2] [+2, +3]
3       [-7, -6, ..., -4] [+4, +5, ..., +7]
4       [-15, -14, ..., -9, -8] [+8, +9, ..., +14, +15]
5       [-31, -30, ..., -17, -16] [+16, +17, ..., +30, +31]
6       [-63, -62, ..., -33, -32] [+32, +33, ..., +62, +63]
7       [-127, -126, ..., -64] [+64, ..., +126, +127]
8       [-255, -254, ..., -128] [+128, +129, ..., +255]
9       [-511, -510, ..., -256] [+256, +257, ..., +511]
10      [-1023, ..., -513, -512] [+512, +513, ..., +1023]
11      [-2047, ..., -1024] [+1024, ..., +2047]
12      [-4095, ..., -2048] [+2048, ..., +4095]
13      [-8191, ..., -4096] [+4096, ..., +8191]
14      [-16383, ..., -8192] [+8192, ..., +16383]
15      [-32767, ..., -16384] [+16384, ..., +32767]

Small values, which occur more often, are grouped into classes that contain fewer members

A class with fewer members requires fewer bits to identify its members

As a result, small values require fewer bits to represent

Any DPCM value is addressed by its class and a string of additional bits that identifies its position in the class

For example, class 6 has 64 members, so 6 additional bits are required

Class   DPCM difference values
6       [-63, -62, ..., -33, -32] [+32, +33, ..., +62, +63]

Value             -63      -62      ...   -32      +32      ...   +63
Additional bits   000000   000001   ...   011111   100000   ...   111111

Representation of DPCM data: Class (4 bits) followed by Additional bits (adaptive length)

For most DC coefficients, the DPCM values belong to lower classes that require fewer additional bits
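The DPCM step and the class/additional-bits representation can be sketched in Python. The mapping of a value to its additional bits is inferred from the class 6 example above (negative members first, in increasing order, followed by the positive members); the function names are illustrative.

```python
def dpcm(dc_values):
    """Differences between the DC coefficients of adjacent blocks."""
    return [b - a for a, b in zip(dc_values, dc_values[1:])]

def classify(value):
    """Return (class, additional bits) for a DPCM difference or AC coefficient."""
    size = abs(value).bit_length()          # class = number of bits in |value|
    if size == 0:
        return 0, ""                        # class 0 holds only the value 0
    # Position within the class: negatives occupy the lower half
    offset = value if value > 0 else value + (1 << size) - 1
    return size, format(offset, f"0{size}b")

dc = [200, 190, 198, 202, 205, 200, 195, 220, 225]
differences = dpcm(dc)
print(differences)
for d in differences:
    print(d, classify(d))
```

Most of the differences in this row fall into classes 2 to 4, so they need only 2 to 4 additional bits each instead of the 11 bits of a raw coefficient.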

Nonzero AC terms are represented in the same way as DPCM coefficients: Class (4 bits) followed by Additional bits (adaptive length)

Zero terms are encoded with zig-zag scanning followed by RLC

How are these two items combined?

Quantized DCT coefficients (V)

200   0 -14   0   0   0   0   0
  0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0
  1   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0
  0   0   0   0   0   0   0   0

Zig-zag scanning index (I)

 0   1   5   6  14  15  27  28
 2   4   7  13  16  26  29  42
 3   8  12  17  25  30  41  43
 9  11  18  24  31  40  44  53
10  19  23  32  39  45  52  54
20  22  33  38  46  51  55  60
21  34  37  47  50  56  59  61
35  36  48  49  57  58  62  63

I    1   2   3   4   5     6   7   8   9   10   11   12   ...   62   63
V    0   0   0   0   -14   0   0   0   1   0    0    0    ...   0    0

RL: 4 zeros before -14, 3 zeros before 1, 54 zeros after 1

Class   AC coefficient values

The 16 classes (0 to 15) and their value ranges are identical to those listed for the DPCM difference values.

I    1   2   3   4   5     6   7   8   9   10   11   12   ...   62   63
V    0   0   0   0   -14   0   0   0   1   0    0    0    ...   0    0

Class   AC coefficient values                            Additional bits
4       [-15, -14, ..., -9, -8] [+8, +9, ..., +14, +15]   0000 (-15), 0001 (-14), ..., 0111 (-8), 1000 (+8), ..., 1111 (+15)
1       [-1] [+1]                                         0 (-1), 1 (+1)

Value   RL   Class   Additional bits
-14     4    4       0001
1       3    1       1

The final run of 54 zeros follows the last non-zero value

RL and Class are grouped into the RUN-SIZE Table

              SSSS (Class)
RRRR (RL)     0     1     2     3     4     5     ...   F
0             00    01    02    03    04    05    ...   0F
1             N/A   11    12    13    14    15    ...   1F
2             N/A   21    22    23    24    25    ...   2F
3             N/A   31    32    33    34    35    ...   3F
4             N/A   41    42    43    44    45    ...   4F
5             N/A   51    52    53    54    55    ...   5F
...
F             F0    F1    F2    F3    F4    F5    ...   FF

00 - End of Block; F0 - runlength of 16 zeros

A Few Points to Note

• Each non-zero AC coefficient is represented by an 8-bit value ‘RRRRSSSS’

• RRRR is the runlength of ‘zeros’ between the current and the previous non-zero AC coefficients; SSSS is the class

• If the runlength exceeds 15, a term ‘F0’ will be inserted to represent a runlength of 16

• If all remaining coefficients are zero, a term ‘00’ (EOB) is inserted.

Value   RL   Class   RS (Hexadecimal)   RS (Decimal)   Additional bits
-14     4    4       44                 68             0001
1       3    1       31                 49             1
EOB     -    -       00                 0              -

EOB : End of Block

Encoded AC format: 68 0001 49 1 00

Number of bits: 8 + 4 + 8 + 1 + 8 = 29 bits

Number of bits for the 63 AC coefficients without coding = 63 x 11 = 693 bits
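The whole AC encoding path of this example (zig-zag scan, runlength of zeros, class and additional bits, RUN-SIZE byte) can be sketched in Python. This follows the simplified scheme of these slides, in which every RUN-SIZE term is counted as a plain 8-bit value; the function names are illustrative and the additional-bits rule is the one inferred from the class tables above.

```python
def zigzag_order(n=8):
    """(row, column) positions of an n x n block in zig-zag scanning order."""
    order = []
    for s in range(2 * n - 1):                       # s = row + column of a diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:
            rows = reversed(rows)                    # even diagonals run bottom-left to top-right
        order.extend((i, s - i) for i in rows)
    return order

def classify(value):
    """Return (class, additional bits) of a non-zero coefficient."""
    size = abs(value).bit_length()
    offset = value if value > 0 else value + (1 << size) - 1
    return size, format(offset, f"0{size}b")

def encode_ac(block):
    """Return a list of (RUN-SIZE byte, additional bits) for the 63 AC terms."""
    scan = [block[i][j] for i, j in zigzag_order(len(block))][1:]   # skip the DC term
    last_nonzero = max((k for k, v in enumerate(scan) if v != 0), default=-1)
    symbols, run = [], 0
    for v in scan[:last_nonzero + 1]:
        if v == 0:
            run += 1
            if run == 16:                            # 'F0' stands for a run of 16 zeros
                symbols.append((0xF0, ""))
                run = 0
            continue
        size, bits = classify(v)
        symbols.append(((run << 4) | size, bits))    # RRRRSSSS
        run = 0
    if last_nonzero < len(scan) - 1:
        symbols.append((0x00, ""))                   # EOB: all remaining terms are zero
    return symbols

block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][2], block[3][0] = 200, -14, 1

symbols = encode_ac(block)
total_bits = sum(8 + len(bits) for _, bits in symbols)
print([(f"{rs:02X}", bits) for rs, bits in symbols], total_bits, "bits")
```

The 29 bits obtained here replace the 693 bits of the uncoded AC coefficients, a compression ratio of roughly 24 for this block.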


The “Baboon” is one of the popular standard images that have been adopted for comparison purposes in image compression research. The difficult part is that its large amount of texture is hard to compress with good fidelity. The easy part is that the distortions are difficult to spot.

Hi! I am the famous Baboon, very nice to meet all of you.