CMPT365 Multimedia Systems 1 Lossy Compression Spring 2015 CMPT 365 Multimedia Systems.


Lossy Compression

Spring 2015

CMPT 365 Multimedia Systems


Lossless vs Lossy Compression

If the compression and decompression processes induce no information loss, then the compression scheme is lossless; otherwise, it is lossy.

Why is lossy compression possible?

[Figure: an original image next to three lossy reconstructions, labeled "Compression Ratio: 12.3", "Compression Ratio: 7.7", and "Compression Ratio: 33.9".]


Outline

• Quantization
    • Uniform
    • Non-uniform
• Transform coding
    • DCT


Quantization
• The process of representing a large (possibly infinite) set of values with a much smaller set.
    • Example: A/D conversion
• An efficient tool for lossy compression

Review:

[Figure: codec block diagram. Encoder: Transform → Quantization → Entropy coding → channel. Decoder: Entropy decoding → Inverse Quantization → Inverse Transform.]


Review: Basic Idea

[Figure: the quantizer maps an input value x to the index of the bin (Bin 0 to Bin 4) that contains it; the index is entropy coded and then entropy decoded; the dequantizer (inverse quantizer) maps the index to the reconstruction value x̂ of that bin.]

• Quantization is a function that maps an input interval to one integer.
• It can reduce the bits required to represent the source.
• The reconstructed result is generally not the original input.
• Terminology:
    • Decision boundaries b_i: bin boundaries
    • Reconstruction levels y_i: output value of each bin by the dequantizer


Uniform Quantizer
• All bins have the same size, except possibly the two outer intervals.
• b_i and y_i are spaced evenly; the spacing of both is ∆ (step size).
• For inner intervals:

$$ y_i = \frac{1}{2}\,(b_{i-1} + b_i) $$

Uniform Midrise Quantizer
• Even number of reconstruction levels
• 0 is not a reconstruction level

[Figure: staircase plot of reconstruction versus input. Input axis ticks: -3∆, -2∆, -∆, ∆, 2∆, 3∆. Reconstruction levels: ±0.5∆, ±1.5∆, ±2.5∆, ±3.5∆.]

Uniform Midtread Quantizer
• Odd number of reconstruction levels
• 0 is a reconstruction level

[Figure: staircase plot of reconstruction versus input. Input axis ticks: ±0.5∆, ±1.5∆, ±2.5∆. Reconstruction levels: 0, ±∆, ±2∆, ±3∆.]


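Midtread Quantizer

• Quantization mapping (the output is an index):

$$ q = A(x) = \mathrm{sign}(x) \left\lfloor \frac{|x|}{\Delta} + 0.5 \right\rfloor $$

• De-quantization mapping:

$$ \hat{x} = B(q) = q\,\Delta $$

• Example: x = -1.8∆, q = -2.

[Figure: midtread staircase plot of reconstruction versus input. Input axis ticks: ±0.5∆, ±1.5∆, ±2.5∆. Reconstruction levels: 0, ±∆, ±2∆, ±3∆.]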
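The midtread mappings A and B can be tried out with a minimal Python sketch. The function names `midtread_quantize` and `midtread_dequantize` are illustrative choices, not names from the slides; the step size is passed in as `delta`.

```python
import math


def midtread_quantize(x, delta):
    """Index q = A(x) = sign(x) * floor(|x| / delta + 0.5)."""
    sign = -1 if x < 0 else 1
    return sign * math.floor(abs(x) / delta + 0.5)


def midtread_dequantize(q, delta):
    """Reconstruction x_hat = B(q) = q * delta."""
    return q * delta


delta = 1.0
for x in (-1.8, -0.4, 0.4, 1.8):
    q = midtread_quantize(x, delta)
    x_hat = midtread_dequantize(q, delta)
    # Columns: input, index, reconstruction, quantization error e(x) = x - x_hat
    print(x, q, x_hat, x - x_hat)
```

With `delta = 1.0`, the input -1.8 maps to index -2, matching the example on the slide, and the reconstruction error stays within half a step.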


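Model of Quantization

• Quantization: $q = A(x)$
• Inverse quantization: $\hat{x} = B(q) = B(A(x)) = Q(x)$
• B(x) is not exactly the inverse function of A(x), because $\hat{x} \neq x$.
• Quantization error: $e(x) = x - \hat{x}$
• Combining quantizer and de-quantizer:

[Figure: the chain x → A → q → B → x̂ is equivalent to a single block x → Q → x̂, or to an additive model in which e(x) is subtracted from x to give x̂.]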


Rate-Distortion Tradeoff
• Things to be determined:
    • Number of bins
    • Bin boundaries
    • Reconstruction levels
• A tradeoff between rate and distortion:
    • To reduce the size of the encoded bits, we need to reduce the number of bins.
    • Fewer bins → more reconstruction errors

[Figure: curve in the rate-distortion plane with two operating points, A and B.]


Measure of Distortion
• Quantization error: $e(x) = x - \hat{x}$
• Mean Squared Error (MSE) for quantization:
    • Average quantization error of all input values
    • Need to know the probability distribution of the input
• Number of bins: M
• Decision boundaries: b_i, i = 0, …, M
• Reconstruction levels: y_i, i = 1, …, M
• Reconstruction: $\hat{x} = y_i$ iff $b_{i-1} < x \le b_i$
• MSE:

$$ MSE_q = \int_{-\infty}^{\infty} (x - \hat{x})^2 f(x)\,dx = \sum_{i=1}^{M} \int_{b_{i-1}}^{b_i} (x - y_i)^2 f(x)\,dx $$

• Same as the variance of e(x) if μ = E{e(x)} = 0 (zero mean).
• Definition of variance:

$$ \sigma_e^2 = \int_{-\infty}^{\infty} (e - \mu_e)^2 f(e)\,de $$


Rate-Distortion Optimization

Two scenarios:
• Given M, find b_i and y_i that minimize the MSE.
• Given a distortion constraint D, find M, b_i and y_i such that MSE ≤ D.


Outline

• Quantization
    • Uniform
    • Non-uniform
    • Vector quantization
• Transform coding
    • DCT


Uniform Quantization of a Uniformly Distributed Source

• Input X: uniformly distributed in [-Xmax, Xmax]: f(x) = 1 / (2Xmax)
• Number of bins: M (even for midrise quantizer)
• Step size is easy to get: ∆ = 2Xmax / M
• Decision boundaries: b_i = (i - M/2) ∆

Example with M = 8 bins covering [-Xmax, Xmax]:

  i      0      1      2      3      4      5      6      7      8
  b_i   -4∆    -3∆    -2∆    -∆      0      ∆      2∆     3∆     4∆
  y_i          -3.5∆  -2.5∆  -1.5∆  -0.5∆   0.5∆   1.5∆   2.5∆   3.5∆

• e(x) is uniformly distributed in [-∆/2, ∆/2].

[Figure: plot of e(x) versus x for x from -4∆ to 4∆; the error stays between -0.5∆ and 0.5∆.]


Uniform Quantization of a Uniformly Distributed Source

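• MSE:

$$ MSE_q = \int (x - \hat{x})^2 f(x)\,dx = \sum_{i=1}^{M} \int_{b_{i-1}}^{b_i} (x - y_i)^2 f(x)\,dx = M \cdot \frac{1}{2X_{max}} \int_{0}^{\Delta} \left(x - \frac{\Delta}{2}\right)^2 dx = \frac{M}{2X_{max}} \cdot \frac{1}{12}\Delta^3 = \frac{1}{12}\Delta^2 $$

• M increases, ∆ decreases, MSE decreases.
• Variance of a random variable uniformly distributed in [-∆/2, ∆/2]:

$$ \sigma_q^2 = \int_{-\Delta/2}^{\Delta/2} (x - 0)^2\,\frac{1}{\Delta}\,dx = \frac{1}{12}\Delta^2 $$

• Optimization: find M such that MSE ≤ D:

$$ \frac{1}{12}\Delta^2 \le D \;\Rightarrow\; \frac{1}{12}\left(\frac{2X_{max}}{M}\right)^2 \le D \;\Rightarrow\; M \ge X_{max}\sqrt{\frac{1}{3D}} $$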
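The ∆²/12 result and the bound on M can be checked numerically with a short Python sketch. It assumes a midrise quantizer whose reconstruction is the centre of the bin containing x, that is (floor(x/∆) + 0.5)∆; this formula and the helper names `midrise_reconstruct`, `uniform_source_mse`, and `min_bins` are illustrative and do not come from the slides. The integral is approximated by a midpoint-rule sum.

```python
import math


def midrise_reconstruct(x, delta):
    """Assumed midrise rule: reconstruct x at the centre of its bin."""
    return (math.floor(x / delta) + 0.5) * delta


def uniform_source_mse(x_max, m, samples_per_bin=1000):
    """Midpoint-rule approximation of the MSE for X uniform on [-x_max, x_max]."""
    delta = 2.0 * x_max / m
    n = m * samples_per_bin
    h = 2.0 * x_max / n
    total = 0.0
    for k in range(n):
        x = -x_max + (k + 0.5) * h
        total += (x - midrise_reconstruct(x, delta)) ** 2
    # f(x) * h = (1 / (2 x_max)) * (2 x_max / n) = 1 / n for the uniform pdf
    return total / n


def min_bins(x_max, d):
    """Smallest integer M with delta^2 / 12 <= d, i.e. M >= x_max * sqrt(1 / (3 d))."""
    return math.ceil(x_max * math.sqrt(1.0 / (3.0 * d)))


x_max, m = 4.0, 8
delta = 2.0 * x_max / m
print(uniform_source_mse(x_max, m), delta ** 2 / 12)  # the two values should nearly agree
print(min_bins(1.0, 0.01))
```

For a midrise quantizer M must also be even, so an odd result from `min_bins` would be rounded up to the next even number.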


Signal to Noise Ratio (SNR)

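• Variance is a measure of signal energy.
• Let M = 2^n: each bin index is represented by n bits.

$$ SNR\,(dB) = 10\log_{10}\frac{\text{Signal Energy}}{\text{Noise Energy}} = 10\log_{10}\frac{\frac{1}{12}(2X_{max})^2}{\frac{1}{12}\Delta^2} = 10\log_{10}\frac{(2X_{max})^2}{(2X_{max}/M)^2} = 10\log_{10} M^2 = 10\log_{10} 2^{2n} = (20\log_{10} 2)\,n \approx 6.02\,n \text{ dB} $$

• If n → n+1, ∆ is halved, the noise variance reduces to 1/4, and the SNR increases by 6 dB.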
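The 6.02 dB per bit rule can be reproduced directly from the two variances. The following Python sketch assumes the uniform source on [-Xmax, Xmax] and the uniform quantizer discussed here; the function name `uniform_snr_db` and the default `x_max = 1.0` are illustrative.

```python
import math


def uniform_snr_db(n_bits, x_max=1.0):
    """SNR in dB of an M = 2^n level uniform quantizer on a uniform source."""
    m = 2 ** n_bits
    delta = 2.0 * x_max / m
    signal_var = (2.0 * x_max) ** 2 / 12  # variance of X uniform on [-x_max, x_max]
    noise_var = delta ** 2 / 12           # variance of the quantization error
    return 10.0 * math.log10(signal_var / noise_var)


for n in (1, 2, 8, 16):
    print(n, round(uniform_snr_db(n), 2))
```

Each extra bit adds 20·log10(2) dB, independent of `x_max`, because the ratio of the two variances is M².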


Outline

• Quantization
    • Uniform
    • Non-uniform
• Transform coding
    • DCT


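Non-uniform Quantization
• A uniform quantizer is not optimal if the source is not uniformly distributed.
• For a given M, to reduce the MSE, we want narrow bins where f(x) is high and wide bins where f(x) is low.

[Figure: a pdf f(x) plotted against x.]

$$ \sigma_q^2 = \int_{-\infty}^{\infty} (x - \hat{x})^2 f(x)\,dx = \sum_{k=1}^{M} \int_{b_{k-1}}^{b_k} (x - y_k)^2 f(x)\,dx $$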


Lloyd-Max Quantizer

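$$ \sigma_q^2 = \int_{-\infty}^{\infty} (x - \hat{x})^2 f(x)\,dx = \sum_{k=1}^{M} \int_{b_{k-1}}^{b_k} (x - y_k)^2 f(x)\,dx $$

• Also known as pdf-optimized quantizer
• Given M, the optimal b_i and y_i that minimize the MSE satisfy the Lagrangian condition:

$$ \frac{\partial \sigma_q^2}{\partial y_i} = 0, \qquad \frac{\partial \sigma_q^2}{\partial b_i} = 0 $$

$$ \frac{\partial \sigma_q^2}{\partial y_i} = 0 \;\Rightarrow\; y_i = \frac{\int_{b_{i-1}}^{b_i} x\,f(x)\,dx}{\int_{b_{i-1}}^{b_i} f(x)\,dx} $$

• y_i is the centroid of interval [b_{i-1}, b_i].

[Figure: pdf f(x) with the centroid y_i marked between b_{i-1} and b_i.]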


Lloyd-Max Quantizer

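• If f(x) = c (uniformly distributed source):

$$ y_i = \frac{\int_{b_{i-1}}^{b_i} x\,f(x)\,dx}{\int_{b_{i-1}}^{b_i} f(x)\,dx} = \frac{c \int_{b_{i-1}}^{b_i} x\,dx}{c\,(b_i - b_{i-1})} = \frac{\frac{1}{2}\,(b_i^2 - b_{i-1}^2)}{b_i - b_{i-1}} = \frac{1}{2}\,(b_{i-1} + b_i) $$

• Only the two bins adjacent to b_i depend on it, so for any f(x):

$$ \frac{\partial \sigma_q^2}{\partial b_i} = (b_i - y_i)^2 f(b_i) - (b_i - y_{i+1})^2 f(b_i) = 0 \;\Rightarrow\; b_i = \frac{y_i + y_{i+1}}{2} $$

• b_i is the midpoint of y_i and y_{i+1}.

[Figure: pdf f(x) with b_{i-1}, b_i, b_{i+1} on the x axis and y_i, y_{i+1} marked between them.]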


Lloyd-Max Quantizer

• How to find optimal b_i and y_i simultaneously? A deadlock:
    • Reconstruction levels depend on decision levels
    • Decision levels depend on reconstruction levels
• Solution: iterative method!

Summary of conditions for optimal quantizer:

• Given b_i, can find the corresponding optimal y_i:

$$ y_i = \frac{\int_{b_{i-1}}^{b_i} x\,f(x)\,dx}{\int_{b_{i-1}}^{b_i} f(x)\,dx} $$

• Given y_i, can find the corresponding optimal b_i:

$$ b_i = \frac{y_i + y_{i+1}}{2} $$


Lloyd Algorithm (Sayood pp. 267)

1. Start from an initial set of reconstruction values y_i.

2. Find all decision levels:

$$ b_i = \frac{y_i + y_{i+1}}{2} $$

3. Compute the MSE:

$$ \sigma_q^2 = \sum_{k=1}^{M} \int_{b_{k-1}}^{b_k} (x - y_k)^2 f(x)\,dx $$

4. Stop if the MSE changes little from last time.

5. Otherwise, update y_i and go to step 2:

$$ y_i = \frac{\int_{b_{i-1}}^{b_i} x\,f(x)\,dx}{\int_{b_{i-1}}^{b_i} f(x)\,dx} $$
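The five steps can be sketched in Python as follows. This is one possible reading of the algorithm, not the textbook's code: it assumes the source lives on a bounded interval [lo, hi], approximates every integral by a midpoint-rule sum on a fixed grid, and uses 0-based lists, so `y[k]` plays the role of y_{k+1}. The function name `lloyd_max` and its parameters `tol`, `max_iter`, and `grid` are hypothetical.

```python
def lloyd_max(pdf, lo, hi, levels, tol=1e-10, max_iter=500, grid=10000):
    """Lloyd iteration for a pdf on [lo, hi]; returns (boundaries, levels, mse)."""
    h = (hi - lo) / grid
    xs = [lo + (k + 0.5) * h for k in range(grid)]
    ws = [pdf(x) * h for x in xs]          # probability mass of each grid cell
    y = sorted(levels)                     # step 1: initial reconstruction values
    m = len(y)
    prev_mse = float("inf")
    for _ in range(max_iter):
        # Step 2: decision levels are midpoints of adjacent reconstruction levels.
        b = [lo] + [(y[i] + y[i + 1]) / 2 for i in range(m - 1)] + [hi]
        # Step 3: accumulate the MSE and the sums needed for the centroids.
        mass = [0.0] * m
        moment = [0.0] * m
        mse = 0.0
        k = 0
        for x, w in zip(xs, ws):
            while k < m - 1 and x > b[k + 1]:
                k += 1                     # xs is sorted, so the bin index only grows
            mass[k] += w
            moment[k] += x * w
            mse += (x - y[k]) ** 2 * w
        # Step 4: stop when the MSE changes little.
        if abs(prev_mse - mse) < tol:
            break
        prev_mse = mse
        # Step 5: move each level to the centroid of its bin (keep it if the bin is empty).
        y = [moment[i] / mass[i] if mass[i] > 0 else y[i] for i in range(m)]
    return b, y, mse


# Uniform pdf on [-1, 1]: the iteration should approach the uniform quantizer.
b_u, y_u, mse_u = lloyd_max(lambda x: 0.5, -1.0, 1.0, [-0.9, -0.1, 0.2, 0.8])
print([round(v, 3) for v in y_u], round(mse_u, 5))

# Triangular pdf f(x) = 1 - |x|, a non-uniform source chosen only for illustration.
b_t, y_t, mse_t = lloyd_max(lambda x: 1.0 - abs(x), -1.0, 1.0, [-0.75, -0.25, 0.25, 0.75])
print([round(v, 3) for v in b_t], [round(v, 3) for v in y_t])
```

For the uniform pdf the levels settle near -0.75, -0.25, 0.25, 0.75 with an MSE close to ∆²/12 for ∆ = 0.5. The grid size trades accuracy for speed, and an unbounded pdf such as a Gaussian would need its tails truncated first.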


Outline

• Quantization
    • Uniform quantization
    • Non-uniform quantization
• Transform coding
    • Discrete Cosine Transform (DCT)


Why Transform Coding?
• Transform: from one domain/space to another space
    • Time -> Frequency
    • Spatial/Pixel -> Frequency
• Purpose of transform
    • Remove correlation between input samples
    • Transform most energy of an input block into a few coefficients
    • Small coefficients can be discarded by quantization without too much impact on reconstruction quality

[Figure: encoder block diagram: Transform → Quantization → Entropy coding.]


1-D Example Fourier Transform


1-D Example
• Application (besides compression)
    • Boost bass/audio equalizer
    • Noise cancellation


1-D Example
• http://www.mathdemos.org/mathdemos/trigsounddemo/trigsounddemo.html
    • Sine wave/sound/piano
• www.sagebrush.com/mousing.htm
    • An electronic instrument that allows direct control of pitch and amplitude


1-D Example
• Smooth signals have strong DC (direct current, or zero frequency) and low-frequency components, and weak high-frequency components.

[Figure: three stacked plots against sample index 1 to 8: "Original Input" (0 to 200), "DFT Magnitudes" (0 to 2000), and "DCT Coefficients" (-500 to 500). The transform plots are annotated "DC" and "High frequency".]


2-D Example
• Apply transform to each 8x8 block.
• Histograms of source and DCT coefficients.
• Most transform coefficients are around 0, which is desired for compression.

[Figure: "Original Image" and its "2-D DCT Coefficients" (min = -465.37, max = 1789.00), with a histogram of the source values (0 to 300) and a histogram of the DCT coefficients (-500 to 2000).]


Rationale behind Transform

If Y is the result of a linear transform T of the input vector X in such a way that the components of Y are much less correlated, then Y can be coded more efficiently than X.

If most information is accurately described by the first few components of a transformed vector, then the remaining components can be coarsely quantized, or even set to zero, with little signal distortion.


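Matrix Representation of Transform
• Linear transform is an N x N matrix:

$$ y_{N \times 1} = T_{N \times N}\; x_{N \times 1} $$

• Inverse transform:

$$ x = T^{-1} y $$

• Unitary transform (aka orthonormal):

$$ T^{-1} = T^T $$

• For a unitary transform, rows/cols have unit norm and are orthogonal to each other. With t_i denoting the i-th row of T:

$$ T T^T = I \;\Rightarrow\; t_i\, t_j^T = \delta_{ij} = \begin{cases} 1, & i = j \\ 0, & i \neq j \end{cases} $$

[Figure: block diagrams x → T → y, y → T^{-1} → x, and, for the unitary case, x → T → y → T^T → x.]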


Discrete Cosine Transform (DCT)

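• DCT: close to the optimal transform (known as the KL Transform) but much simpler and faster
• Definition:

$$ C_{i,j} = a \cos\left(\frac{(2j+1)\,i\,\pi}{2N}\right), \qquad i, j = 0, \ldots, N-1, $$

$$ a = \sqrt{1/N} \text{ for } i = 0, \qquad a = \sqrt{2/N} \text{ for } i = 1, \ldots, N-1. $$

• Matlab function: dct(eye(N));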
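For readers without Matlab, the definition can be turned into a few lines of standard-library Python. The function names `dct_matrix` and `times_transpose` are illustrative; the sketch builds C from the formula and checks the unitary property C Cᵀ = I numerically.

```python
import math


def dct_matrix(n):
    """N x N DCT matrix: C[i][j] = a * cos((2j + 1) * i * pi / (2N))."""
    c = []
    for i in range(n):
        a = math.sqrt(1.0 / n) if i == 0 else math.sqrt(2.0 / n)
        c.append([a * math.cos((2 * j + 1) * i * math.pi / (2 * n)) for j in range(n)])
    return c


def times_transpose(t):
    """Return T * T^T, which should be the identity for a unitary T."""
    n = len(t)
    return [[sum(t[i][k] * t[j][k] for k in range(n)) for j in range(n)] for i in range(n)]


c4 = dct_matrix(4)
for row in c4:
    print(" ".join(f"{v:8.4f}" for v in row))

gram = times_transpose(dct_matrix(8))
max_dev = max(abs(gram[i][j] - (1.0 if i == j else 0.0)) for i in range(8) for j in range(8))
print(max_dev)  # deviation from the identity, expected to be at rounding-error level
```

The rows of `c4` are the basis vectors of the 4-point DCT, with the constant (DC) row first.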


DCT

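• Definition:

$$ C_{i,j} = a \cos\left(\frac{(2j+1)\,i\,\pi}{2N}\right), \qquad i, j = 0, \ldots, N-1, $$

$$ a = \sqrt{1/N} \text{ for } i = 0, \qquad a = \sqrt{2/N} \text{ for } i = 1, \ldots, N-1. $$

• N = 2 (Haar Transform):

$$ C_2 = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} $$

$$ \begin{bmatrix} y_0 \\ y_1 \end{bmatrix} = C_2 \begin{bmatrix} x_0 \\ x_1 \end{bmatrix} = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} x_0 \\ x_1 \end{bmatrix} = \frac{1}{\sqrt{2}} \begin{bmatrix} x_0 + x_1 \\ x_0 - x_1 \end{bmatrix} $$

• y_0 captures the mean of x_0 and x_1 (low-pass):
    • x_0 = x_1 = 1 → y_0 = sqrt(2) (DC), y_1 = 0
• y_1 captures the difference of x_0 and x_1 (high-pass):
    • x_0 = 1, x_1 = -1 → y_0 = 0 (DC), y_1 = sqrt(2)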


DCT
• Magnitude frequency responses of the 2-point DCT:

$$ C_2 = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} $$

• Can be obtained by freqz( ) in Matlab.

[Figure: magnitude response (dB, -40 to 5) versus normalized frequency (0 to 0.5, x 2π) for the two rows of C_2, labeled "Low pass" (starting at DC) and "High pass". Plot title: "DC Att. 403.0103, Mirr Att. 324.2604, Stopband 50, Coding Gain = 5.055 dB".]


4-point DCT
• Four subbands

     0.5000    0.5000    0.5000    0.5000
     0.6533    0.2706   -0.2706   -0.6533
     0.5000   -0.5000   -0.5000    0.5000
     0.2706   -0.6533    0.6533   -0.2706

[Figure: magnitude response (dB, -40 to 5) versus normalized frequency (0 to 0.5, x 2π) for the four rows. Plot title: "DC Att. 406.0206, Mirr Att. 324.2604, Stopband 8.3456, Coding Gain = 7.5701 dB".]


8-point DCT
• Eight subbands

[Figure: magnitude response (dB, -40 to 5) versus normalized frequency (0 to 0.5, x 2π) for the eight rows. Plot title: "DC Att. 409.0309, Mirr Att. 320.1639, Stopband 9.9559, Coding Gain = 8.8259 dB".]


Example
• x = [100 110 120 130 140 150 160 170]^T
• 8-point DCT: [381.8377, -64.4232, 0.0, -6.7345, 0.0, -2.0090, 0.0, -0.5070]
• Most of the energy is in the first 2 coefficients.

[Figure: two plots against index 1 to 8: the input x and its 8-point DCT coefficients.]
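The example can be reproduced with a small Python sketch that builds the DCT matrix from the definition and applies it to x. The helper names `dct_matrix` and `transform` are illustrative; the energy fraction at the end is an added check, not a number quoted on the slide.

```python
import math


def dct_matrix(n):
    """N x N DCT matrix: C[i][j] = a * cos((2j + 1) * i * pi / (2N))."""
    c = []
    for i in range(n):
        a = math.sqrt(1.0 / n) if i == 0 else math.sqrt(2.0 / n)
        c.append([a * math.cos((2 * j + 1) * i * math.pi / (2 * n)) for j in range(n)])
    return c


def transform(t, x):
    """y = T x for a matrix given as a list of rows."""
    return [sum(tij * xj for tij, xj in zip(row, x)) for row in t]


x = [100, 110, 120, 130, 140, 150, 160, 170]
y = transform(dct_matrix(8), x)
energy = sum(v * v for v in y)          # equals sum(x_j^2) because the DCT is unitary
first_two = y[0] ** 2 + y[1] ** 2

print([round(v, 4) for v in y])
print(first_two / energy)               # fraction of the energy in the first 2 coefficients
```

Because the transform is unitary, the total energy of y equals that of x, so the fraction printed last is a fair measure of energy compaction.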


Block Transform

• Divide input data into blocks (2D)
• Encode each block separately (sometimes with information from neighboring blocks)
• Examples:
    • Most DCT-based image/video coding standards


2-D DCT Basis

For 2-point DCT For 4-point DCT


2-D Separable DCT

• X: N x N input block
• T: N x N transform
• A = TX: apply T to each column of X
• B = XT^T: apply T to each row of X
• 2-D separable transform:
    • Apply T to each row
    • Then apply T to each column

$$ Y = T X T^T $$

• Inverse transform:

$$ X = T^T Y T $$


2-D 8-point DCT Example

Original Data:

     89   78   76   75   70   82   81   82
    122   95   86   80   80   76   74   81
    184  153  126  106   85   76   71   75
    221  205  180  146   97   71   68   67
    225  222  217  194  144   95   78   82
    228  225  227  220  193  146  110  108
    223  224  225  224  220  197  156  120
    217  219  219  224  230  220  197  151

2-D DCT Coefficients (after rounding to integers):

    1155  259  -23    6   11    7    3    0
    -377  -50   85  -10   10    4    7   -3
      -4 -158  -24   42  -15    1    0    1
      -2    3  -34  -19    9   -5    4   -1
       1    9    6  -15  -10    6   -5   -1
       3   13    3    6   -9    2    0   -3
       8   -2    4   -1    3   -1    0   -2
       2    0   -3    2   -2    0    0   -1

Most energy is in the upper-left corner
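The separable form Y = T X Tᵀ and its inverse X = Tᵀ Y T can be applied to this 8x8 block with plain Python lists. The sketch below is an illustration rather than an optimized implementation; the helper names `matmul`, `transpose`, `dct2`, and `idct2` are chosen here and are not taken from the slides.

```python
import math


def dct_matrix(n):
    """N x N DCT matrix: C[i][j] = a * cos((2j + 1) * i * pi / (2N))."""
    c = []
    for i in range(n):
        a = math.sqrt(1.0 / n) if i == 0 else math.sqrt(2.0 / n)
        c.append([a * math.cos((2 * j + 1) * i * math.pi / (2 * n)) for j in range(n)])
    return c


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def dct2(x):
    """2-D separable DCT: Y = T X T^T."""
    t = dct_matrix(len(x))
    return matmul(matmul(t, x), transpose(t))


def idct2(y):
    """Inverse 2-D DCT: X = T^T Y T."""
    t = dct_matrix(len(y))
    return matmul(matmul(transpose(t), y), t)


block = [
    [89, 78, 76, 75, 70, 82, 81, 82],
    [122, 95, 86, 80, 80, 76, 74, 81],
    [184, 153, 126, 106, 85, 76, 71, 75],
    [221, 205, 180, 146, 97, 71, 68, 67],
    [225, 222, 217, 194, 144, 95, 78, 82],
    [228, 225, 227, 220, 193, 146, 110, 108],
    [223, 224, 225, 224, 220, 197, 156, 120],
    [217, 219, 219, 224, 230, 220, 197, 151],
]

coeffs = dct2(block)
restored = idct2(coeffs)
for row in coeffs:
    print(" ".join(f"{round(v):5d}" for v in row))
```

The DC coefficient `coeffs[0][0]` is the sum of all 64 samples divided by 8, and the inverse transform recovers the block up to floating-point rounding.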


Interpretation of Transform

• Forward transform: y = Tx (x is an N x 1 vector)
• Let t_i be the i-th row of T:

$$ y_i = t_i\, x = \langle t_i^T, x \rangle \quad \text{(inner product)} $$

• y_i measures the similarity between x and t_i.
• Higher similarity → larger transform coefficient
• Inverse transform:

$$ x = T^T y = \begin{bmatrix} t_0^T & t_1^T & \cdots & t_{N-1}^T \end{bmatrix} y = \sum_{i=0}^{N-1} t_i^T\, y_i $$

• x is the weighted combination of t_i.
• Rows of T are called basis vectors.


Interpretation of 2-D Transform

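$$ Y = T X T^T \;\Rightarrow\; X = T^T Y T $$

• 2-D basis matrices: outer products of basis vectors:

$$ t_i^T\, t_j, \qquad i, j = 0, \ldots, N-1 $$

• Proof: define S_{i,j} as the matrix whose (i, j)-th entry is Y(i, j), with all other entries 0. Then

$$ X = T^T Y T = T^T \left( \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} S_{i,j} \right) T = \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} Y(i,j)\; t_i^T\, t_j $$

• X is the weighted combination of basis matrices.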
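The weighted-sum view can be checked on a tiny case. The Python sketch below uses the 2-point DCT and a 2x2 block X = [[1, 2], [3, 4]], whose coefficients Y = T X Tᵀ = [[5, -1], [-2, 0]] were worked out by hand for this illustration; the helper names `outer` and `reconstruct` are hypothetical.

```python
import math


def dct_matrix(n):
    """N x N DCT matrix: C[i][j] = a * cos((2j + 1) * i * pi / (2N))."""
    c = []
    for i in range(n):
        a = math.sqrt(1.0 / n) if i == 0 else math.sqrt(2.0 / n)
        c.append([a * math.cos((2 * j + 1) * i * math.pi / (2 * n)) for j in range(n)])
    return c


def outer(u, v):
    """Outer product u^T v of two row vectors, i.e. one 2-D basis matrix."""
    return [[ui * vj for vj in v] for ui in u]


def reconstruct(y, t):
    """X = sum over i, j of Y(i, j) * t_i^T t_j, with t_i the i-th row of T."""
    n = len(t)
    x = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            basis = outer(t[i], t[j])
            for r in range(n):
                for c in range(n):
                    x[r][c] += y[i][j] * basis[r][c]
    return x


t2 = dct_matrix(2)
y = [[5.0, -1.0], [-2.0, 0.0]]   # T X T^T for X = [[1, 2], [3, 4]], computed by hand
x_rec = reconstruct(y, t2)
print([[round(v, 6) for v in row] for row in x_rec])
```

Each coefficient Y(i, j) scales one basis matrix, so setting small coefficients to zero removes the corresponding patterns from the reconstruction.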


2-D DCT Basis Matrices

For 2-point DCT For 4-point DCT


2-D DCT Basis Matrices: 8-point DCT


Further Exploration

• Textbook 8.1-8.5
• Other sources
    • Introduction to Data Compression by Khalid Sayood
    • Vector Quantization and Signal Compression by Allen Gersho and Robert M. Gray
    • Digital Image Processing by Rafael C. Gonzalez and Richard E. Woods
    • Probability and Random Processes with Applications to Signal Processing by Henry Stark and John W. Woods
    • A Wavelet Tour of Signal Processing by Stephane G. Mallat