Transcript of people.irisa.fr/.../CompressionTools_DIIC3...1011.pdf
History / Table of Content
COMPRESSION
O. Le Meur, [email protected]
Univ. of Rennes 1, http://www.irisa.fr/temics/staff/lemeur/
October 2010
VERSION:
2009-2010: Document creation, done by OLM;
2010-2011: Document updated, done by OLM: major revisions of the part concerning lossless vs lossy coding.
TOOLS FOR IMAGE AND VIDEO COMPRESSION
1 Introduction
2 Entropy Coding
3 Other coding methods
4 Lossless vs lossy coding
5 Distortion/quality assessment
6 Quantization
7 Predictive Coding
8 Transform coding
9 Motion estimation
Why is it required to compress information?
Example (Facts)
Standard definition 720 × 576, 16 bits/pixel, 50 Hz:
6.6 Mbits/image (720 × 576 × 16)
330 Mbits/second...
Entropy Coding
1 Introduction
2 Entropy Coding: Some definitions, Definition of entropy coding, Fano-Shannon coding, Huffman coding, Arithmetic coding
3 Other coding methods
4 Lossless vs lossy coding
5 Distortion/quality assessment
6 Quantization
7 Predictive Coding
8 Transform coding
9 Motion estimation
Some definitions
Definition (Alphabet)
An alphabet is a set of data a1, ..., aN that we might wish to encode.
Definition (Code, Codewords)
A code C is a mapping from an alphabet a1, ..., aN to a set of finite-length binary strings. C(aj) is called the codeword for symbol aj.
Definition (Length of a codeword)
The length l(aj) of a codeword C(aj) is the number of bits of this codeword.
Definition (Fixed length code)
A fixed length code is a code such that l(aj) = l(ai), ∀i, j.
Definition (Variable Length Code (VLC))
A variable length code is a code that is not a fixed length code.
Some definitions
Definition (Prefix code)
A code is called a prefix code (instantaneous code) if no codeword is a prefix of another codeword.
Definition (Optimal prefix code)
Assume an alphabet of N symbols with probabilities p(ai). An optimal prefix code C is a prefix code with minimal average length, that is, if C′ is another prefix code and l′(ai) are the lengths of the codewords of C′, then
∑_{i=1}^{N} l(ai) p(ai) ≤ ∑_{i=1}^{N} l′(ai) p(ai)
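Both properties are easy to check mechanically. Below is a minimal Python sketch (the function names are ours, not from the slides) that tests the prefix property and computes the average length of a code:

```python
from itertools import combinations

def is_prefix_code(codewords):
    """True if no codeword is a prefix of another (instantaneous code)."""
    return not any(a.startswith(b) or b.startswith(a)
                   for a, b in combinations(codewords, 2))

def average_length(code, probs):
    """Average codeword length: sum over symbols of l(ai) * p(ai)."""
    return sum(len(code[s]) * p for s, p in probs.items())
```

For instance, the Fano-Shannon codewords derived later in this document form a prefix code, while {0, 01} does not.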
Entropy coding
Definition (Entropy coding)
Entropy coding converts a vector X of integers from a source S into a binary stream Y. It exploits the redundancies in the statistical distribution of X to reduce the size of Y as much as possible (Variable Length Codes).
Ideally, the codewords are optimal such that H(S) ≤ l ≤ H(S) + 1, with the average length l = ∑_i l(ai) p(ai).
Remark
The lower bound for the number of bits of Y is the Shannon entropy H(S), given by H(S) = −∑_i p(ai) × log2(p(ai)).
This is lossless data compression...
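The entropy bound is a one-liner to compute. A minimal sketch (our helper, using the symbol probabilities of the Fano-Shannon example that follows):

```python
import math

def entropy(probabilities):
    """Shannon entropy H(S) = -sum_i p(ai) * log2(p(ai)), in bits per symbol."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# the eight-symbol source used in the examples below
probs = [0.25, 0.21, 0.15, 0.14, 0.0625, 0.0625, 0.0625, 0.0625]
```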
Fano-Shannon coding
Algorithm:
1. Sort symbols according to their probabilities;
2. Recursively divide them into two (nearly) equiprobable parts;
3. One part is assigned a 0, the other a 1.
Example (A = a0, ..., a7; the probability of each symbol is given below)
a7: p(a7) = 0.0625
a6: p(a6) = 0.0625
a5: p(a5) = 0.0625
a4: p(a4) = 0.0625
a3: p(a3) = 0.14
a2: p(a2) = 0.15
a1: p(a1) = 0.21
a0: p(a0) = 0.25
(The successive binary splits of the sorted symbol list form a code tree; only the resulting codewords are reproduced here.)
C(a7)=1111
C(a6)=1110
C(a5)=1101
C(a4)=1100
C(a3)=101
C(a2)=100
C(a1)=01
C(a0)=00
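The three steps above can be sketched in Python as follows. This is our sketch, not the slides' code; the split rule (choose the cut that makes the two parts as equiprobable as possible) is one common variant, and on this example it reproduces the codewords above:

```python
def shannon_fano(symbols):
    """Fano-Shannon code. symbols: list of (name, probability) pairs,
    sorted by decreasing probability. Returns {name: codeword}."""
    codes = {}

    def split(group, prefix):
        if len(group) == 1:
            codes[group[0][0]] = prefix or "0"
            return
        total = sum(p for _, p in group)
        # choose the cut that makes the two parts as equiprobable as possible
        best_k, best_diff, acc = 1, total, 0.0
        for k in range(1, len(group)):
            acc += group[k - 1][1]
            if abs(2 * acc - total) < best_diff:
                best_k, best_diff = k, abs(2 * acc - total)
        split(group[:best_k], prefix + "0")   # one part is set to 0
        split(group[best_k:], prefix + "1")   # the other to 1

    split(symbols, "")
    return codes
```

On the eight-symbol source above, the resulting average length is 2.79 bits/symbol.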
Huffman coding
David Huffman proposed in 1952 a method for building an optimal prefix code for a given source S. Its average word length l is in the range H(S) ≤ l ≤ H(S) + 1.
The proposed algorithm rests on three principles:
1 if p(X = xj) > p(X = xi), i ≠ j, then l(xj) ≤ l(xi);
2 the two symbols having the smallest probabilities have codewords of the same length;
3 these two codewords share the same first nmax − 1 bits and differ only in their last bit.
Algorithm
1 Sort symbols according to their probabilities;
2 A binary tree is generated from left to right by taking the two least probable symbols and merging them into an equivalent symbol whose probability equals the sum of the two;
3 The process is repeated until just one symbol remains;
4 The tree can then be read backwards, from right to left, assigning different bits to different branches.
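The merge loop can be written compactly with a priority queue. A minimal Python sketch (our code, not the author's; the tie-break counter keeps heap entries comparable when probabilities are equal):

```python
import heapq

def huffman(probs):
    """Build a Huffman code for {symbol: probability}; returns {symbol: codeword}."""
    # each heap entry: (probability, tie-break counter, partial code table)
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)   # least probable node
        p2, _, c2 = heapq.heappop(heap)   # second least probable node
        # the two merged subtrees receive different leading bits
        merged = {s: "0" + w for s, w in c1.items()}
        merged.update({s: "1" + w for s, w in c2.items()})
        heapq.heappush(heap, (p1 + p2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]
```

Ties may be broken differently from the slides' tree, so individual codewords can differ, but the average length is the optimal one in every case.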
Huffman coding
Example
a7: 0.0625
a6: 0.0625
a5: 0.0625
a4: 0.0625
a3: 0.14
a2: 0.15
a1: 0.21
a0: 0.25
(Successive merges of the two least probable nodes: 0.0625 + 0.0625 = 0.125, 0.0625 + 0.0625 = 0.125, 0.125 + 0.125 = 0.25, 0.14 + 0.15 = 0.29, 0.21 + 0.25 = 0.46, 0.25 + 0.29 = 0.54, 0.46 + 0.54 = 1.00; only the resulting codewords of the tree are reproduced here.)
C(a7)=1111
C(a6)=1110
C(a5)=1101
C(a4)=1100
C(a3)=011
C(a2)=010
C(a1)=10
C(a0)=00
H(S) = 2.781
l = 2.79
Huffman coding
Huffman coding individually codes each input symbol according to the symbol probabilities. An integer number of bits is associated with each symbol, and this number is never less than 1.
Although Huffman coding is optimal for symbol-by-symbol coding, its efficiency is sometimes not as good as one could expect. This is the case when the probability of one or more symbols is very high.
Example
We assume that A = a0, a1, a2, with p(a0) = 0.02, p(a1) = 0.18, p(a2) = 0.8.
C(a0)=11
C(a1)=10
C(a2)=0
H(S) = 0.8157
l = 1.2
Arithmetic coding
Definition
Arithmetic coding is a lossless encoding method that allows combining multiple symbols into a single codable unit. A message is then encoded as a real number in the interval [0, 1[.
Basic algorithm for arithmetic coding
1 Start with a current interval [L, H[ initialized to [0, 1[.
2 Subdivide it into subintervals, one for each possible symbol. The size of a symbol's subinterval is proportional to the probability of that symbol.
3 Select the subinterval corresponding to the actual symbol and make it the new current interval (it is then subdivided into smaller ones as previously described, so go back to step 2);
4 The process above is repeated until all symbols are encoded or until the maximum precision of the machine is reached.
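The interval-narrowing loop of steps 1-3 can be sketched as follows (our code; it uses floating point, so it is only suitable for short messages, which is exactly the precision limitation mentioned in step 4):

```python
def arithmetic_encode(message, probs):
    """Return the final interval [lo, hi[ after encoding the whole message."""
    # cumulative lower bound of each symbol's subinterval in [0, 1[
    cum, acc = {}, 0.0
    for s, p in probs.items():
        cum[s] = acc
        acc += p
    lo, hi = 0.0, 1.0
    for s in message:
        width = hi - lo
        # zoom into the subinterval of the current symbol
        lo, hi = lo + width * cum[s], lo + width * (cum[s] + probs[s])
    return lo, hi
```

On the example of the next slide, encoding acb yields the interval [0.576, 0.594[.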
Arithmetic coding
Example
We assume that A = a, b, c, with p(a) = 0.6, p(b) = 0.3, p(c) = 0.1. Suppose we want to encode the message acb.
Subdivision of [0, 1[: a → [0, 0.6[, b → [0.6, 0.9[, c → [0.9, 1[.
After a, the interval [0, 0.6[ is subdivided into [0, 0.36[ (a), [0.36, 0.54[ (b), [0.54, 0.6[ (c).
After c, the interval [0.54, 0.6[ is subdivided into [0.54, 0.576[ (a), [0.576, 0.594[ (b), [0.594, 0.6[ (c).
Final interval: [0.576, 0.594[. acb can be coded by the number 0.59375 (= (0.10011)2). This is the shortest binary fraction that lies within the interval.
(0.10011)2 = (1 × 1/2 + 0 × 1/2² + 0 × 1/2³ + 1 × 1/2⁴ + 1 × 1/2⁵)10 = 0.59375
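Finding the shortest binary fraction inside the final interval can be sketched as follows (our helper, not from the slides; it uses the simple in-interval criterion stated above):

```python
import math

def shortest_binary_fraction(lo, hi):
    """Smallest bit count n such that some multiple of 2**-n lies in [lo, hi[.
    Returns (k, n), the fraction being k / 2**n."""
    n = 1
    while True:
        k = math.ceil(lo * 2 ** n)   # smallest n-bit fraction >= lo
        if k / 2 ** n < hi:
            return k, n
        n += 1
```

For [0.576, 0.594[ this gives k = 19, n = 5, i.e. 19/32 = 0.59375 = (0.10011)2 as in the example.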
Arithmetic coding
Example
We assume that A = a, b, c, with p(a) = 0.6, p(b) = 0.3, p(c) = 0.1. Suppose we want to decode the codeword 0.10011 (0.59375).
Subdivision of [0, 1[: a → [0, 0.6[, b → [0.6, 0.9[, c → [0.9, 1[. 0.59375 falls in [0, 0.6[ → a.
[0, 0.6[ is subdivided into [0, 0.36[, [0.36, 0.54[, [0.54, 0.6[. 0.59375 falls in [0.54, 0.6[ → c.
[0.54, 0.6[ is subdivided into [0.54, 0.576[, [0.576, 0.594[, [0.594, 0.6[. 0.59375 falls in [0.576, 0.594[ → b.
Final message: acb
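The decoding loop mirrors encoding: at each step, find which subinterval contains the received number, output that symbol, and zoom in. A minimal sketch (our code; the number of symbols must be known or signalled, e.g. by an end-of-message symbol, so here we pass it explicitly):

```python
def arithmetic_decode(value, probs, n_symbols):
    """Decode n_symbols from the real number 'value' in [0, 1[."""
    # cumulative subinterval [c_lo, c_hi[ of each symbol in [0, 1[
    intervals, acc = {}, 0.0
    for s, p in probs.items():
        intervals[s] = (acc, acc + p)
        acc += p
    out, lo, hi = [], 0.0, 1.0
    for _ in range(n_symbols):
        width = hi - lo
        for s, (c_lo, c_hi) in intervals.items():
            if lo + width * c_lo <= value < lo + width * c_hi:
                out.append(s)                    # value falls in this symbol's slot
                lo, hi = lo + width * c_lo, lo + width * c_hi
                break
    return "".join(out)
```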
Arithmetic coding
The previous description is rather theoretical and difficult to implement. Two drawbacks can be mentioned:
the shrinking current interval requires the use of high-precision arithmetic (especially when the sequence to encode is long, since the intervals become very small);
encoding delay (no output until the entire message has been read).
Practical arithmetic coding [Witten et al., 87]
Let us define an interval [0, 1[. i and s denote the lower and the higher bound of the interval, respectively. f is a bit counter called underflow. Three rules are applied to the interval of the current symbol:
(R1) if the interval satisfies s ≤ 0.5, then i → 2i and s → 2s; send a bit 0, then f bits of 1, and set f = 0;
(R2) if the interval satisfies i ≥ 0.5, then i → 2(i − 0.5) and s → 2(s − 0.5); send a bit 1, then f bits of 0, and set f = 0;
(R3) if the interval satisfies 0.25 ≤ i < 0.5 ≤ s < 0.75, then i → 2(i − 0.25) and s → 2(s − 0.25), and f is incremented (f++).
Arithmetic coding
Example (Arithmetic coding with incremental transmission)
p(a) = 0.6; p(b) = 0.2; p(c) = 0.2. We want to code abc.

Next symbol   i       s       Rule   Code   f
a             0       0.6     --     --     0
b             0.36    0.48    R1     0      0
              0.72    0.96    R2     1      0
              0.44    0.92
c             0.792   0.92    R2     1      0
              0.584   0.84    R2     1      0
              0.168   0.68

Code for the message abc: 0111.
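Rules R1-R3 can be sketched in Python as below. This is our sketch, under the assumptions of the example; flushing the final interval and any pending underflow bits at end of message is omitted, so only the incrementally transmitted bits are returned (intermediate interval values may differ slightly from the table depending on rounding, but the emitted bits for abc are 0111):

```python
def encode_renorm(message, probs):
    """Arithmetic coding with incremental transmission (rules R1-R3).
    End-of-message flushing is omitted for brevity."""
    # cumulative subinterval of each symbol in [0, 1[
    cum, acc = {}, 0.0
    for s, p in probs.items():
        cum[s] = (acc, acc + p)
        acc += p
    i, s_, f, bits = 0.0, 1.0, 0, []      # [i, s_[ is the current interval
    for sym in message:
        w = s_ - i
        i, s_ = i + w * cum[sym][0], i + w * cum[sym][1]
        while True:
            if s_ <= 0.5:                          # R1: send 0, then f ones
                bits += ["0"] + ["1"] * f
                f, i, s_ = 0, 2 * i, 2 * s_
            elif i >= 0.5:                         # R2: send 1, then f zeros
                bits += ["1"] + ["0"] * f
                f, i, s_ = 0, 2 * (i - 0.5), 2 * (s_ - 0.5)
            elif 0.25 <= i and s_ < 0.75:          # R3: underflow, f++
                f, i, s_ = f + 1, 2 * (i - 0.25), 2 * (s_ - 0.25)
            else:
                break
    return "".join(bits)
```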
Other coding methods
1 Introduction
2 Entropy Coding
3 Other coding methods: Run-Length Coding, Lempel-Ziv-Welch (LZW) algorithm
4 Lossless vs lossy coding
5 Distortion/quality assessment
6 Quantization
7 Predictive Coding
8 Transform coding
9 Motion estimation
Run-Length Coding
Definition
A sequence of identical symbols is called a run. Each run is represented by a single codeword (the determination of these codewords is, most of the time, based on Huffman's procedure).
Example
We assume that A = a, b, c. We want to code the message aaaabcbc.
aaaabcbc → (a, 4)(b, 1)(c, 1)(b, 1)(c, 1).
Remark:
Only interesting when there exist large uniform areas (e.g. fax documents)...
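Extracting the runs is a one-liner with itertools.groupby (a minimal sketch; mapping the (symbol, count) pairs to actual codewords, e.g. Huffman ones as noted above, is a separate step):

```python
from itertools import groupby

def run_length_encode(message):
    """Represent each run of identical symbols by a (symbol, run length) pair."""
    return [(symbol, len(list(run))) for symbol, run in groupby(message)]
```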
Lempel-Ziv-Welch (LZW) algorithm
Denition
Lempel-Ziv-Welch (LZW) is a universal lossless data compression algorithm created byLempel, Ziv, and Welch. It was published by Welch in 1984 as an improvedimplementation of the LZ78 algorithm published by Lempel and Ziv in 1978. They areboth dictionary coders, unlike minimum redundancy coders [Welch,84].
Example
Dictionnary Message Coded symbol index New entrya,b,c abababaacb a 0 ab
a,b,c,ab bababaacb b 1 baa,b,c,ab,ba ababaacb ab 3 aba
a,b,c,ab,ba,aba abaacb aba 5 abaaa,b,c,ab,ba,aba,abaa acb a 0 ac
a,b,c,ab,ba,aba,abaa,ac cb c 2 cba,b,c,ab,ba,aba,abaa,ac,cb b b 1
Message abababaacb is coded by 0,1,3,5,0,2,1.
21
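The encoding procedure illustrated above can be sketched as follows (a minimal Python sketch; names are illustrative):

```python
def lzw_encode(message, alphabet):
    # Initialize the dictionary with the single-symbol alphabet.
    dictionary = {s: i for i, s in enumerate(alphabet)}
    codes = []
    current = ""
    for ch in message:
        extended = current + ch
        if extended in dictionary:
            current = extended                      # keep growing the match
        else:
            codes.append(dictionary[current])       # emit code of the longest match
            dictionary[extended] = len(dictionary)  # new entry: match + next char
            current = ch
    if current:
        codes.append(dictionary[current])           # flush the last match
    return codes

print(lzw_encode("abababaacb", "abc"))  # → [0, 1, 3, 5, 0, 2, 1]
```

Note that the new entry is registered before the matching restarts, so later occurrences of the same pattern are coded with a single index.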
Lempel-Ziv-Welch (LZW) algorithm: decoding

To decode an LZW-compressed message, we only need to know the initial dictionary. The additional entries are reconstructed during the decoding process; these entries are always simple concatenations of previous entries.

Example (0,1,3,5,0,2,1)

Dictionary                Index   Received symbol   Message      New entry
a,b,c                     0       a                 a            -
a,b,c                     1       b                 ab           ab
a,b,c,ab                  3       ab                abab         ba
a,b,c,ab,ba               5       ab+a              abababa      aba
a,b,c,ab,ba,aba           0       a                 abababaa     abaa
a,b,c,ab,ba,aba,abaa      2       c                 abababaac    ac
a,b,c,ab,ba,aba,abaa,ac   1       b                 abababaacb   cb

Special case (index 5): there is no dictionary entry for this index yet, because it is precisely the entry being built. In this case, the decoded word is the previously decoded word ab plus its first character a: we decode aba.
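The decoding procedure, including the special case of a not-yet-known index, can be sketched as:

```python
def lzw_decode(codes, alphabet):
    # Initialize the dictionary with the single-symbol alphabet.
    dictionary = {i: s for i, s in enumerate(alphabet)}
    previous = dictionary[codes[0]]
    message = previous
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        else:
            # Special case: the code refers to the entry currently being
            # built, which must be the previous word plus its first char.
            entry = previous + previous[0]
        message += entry
        # Reconstruct the entry the encoder created at this step:
        # previous word + first character of the current word.
        dictionary[len(dictionary)] = previous + entry[0]
        previous = entry
    return message

print(lzw_decode([0, 1, 3, 5, 0, 2, 1], "abc"))  # → abababaacb
```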
Lossless vs lossy coding

1 Introduction
2 Entropy Coding
3 Other coding methods
4 Lossless vs lossy coding
    Limitations of the lossless compression
    Element of Rate/Distortion theory
    Lagrangian formulation of the R-D problem
5 Distortion/quality assessment
6 Quantization
7 Predictive Coding
8 Transform coding
9 Motion estimation
Lossless vs lossy coding: limitations of the lossless compression

Reminder: the goal of lossless coding is to find code words C such that the average length l_C is close to the entropy of the source H(S).

Lossless (distortionless) coding achieves only small compression ratios:
2 or 3 for natural images;
3 or 4 for video sequences.

To increase this ratio, it is necessary to degrade the source quality:
reduction of the source image resolution (HD, SD, CIF, QCIF);
frame dropping...
Lossless vs lossy coding: Element of Rate/Distortion theory

Rate-distortion theory is the branch of information theory that addresses the problem of determining the minimal amount of information that must be communicated over a channel so that the source can be reconstructed at the receiver within a given distortion.

Brief introduction to the distortion measure (see next section):
d(x, y) ≥ 0, where x and y are the transmitted and received data;
d(x, y) = 0 if x = y.
Lossless vs lossy coding: Element of Rate/Distortion theory

Rate/Distortion theory calculates the minimum transmission bit-rate R for a required level of distortion D. Two dual formulations of the problem exist:

Given a maximum rate R0, minimize the distortion D.
Given a maximum level of distortion D0, minimize the rate R.

Both are constrained optimization problems.

[Figure: the R(D) curve, annotated with the entropy/lossless-coding point R(0) (source information), the redundancy, the maximum distortion D0 and the corresponding rate R(D0); two further plots mark R0 and D0 for the two constrained problems.]
Lossless vs lossy coding

The function relating rate and distortion is found as the solution of the following minimization problem:

R(D0) = min_{D ≤ D0} I(X;Y)

where
D0 is the maximum allowed average distortion, with D(X,Y) = E[d(x,y)] = Σ_u Σ_v p(u,v) d(u,v);
I(X;Y) is the mutual information.
Lossless vs lossy coding

R(D0) = min_{D ≤ D0} I(X;Y)
      = min_{D ≤ D0} (H(X) − H(X|Y))
      = H(X) − max_{D ≤ D0} H(X|Y)
      = H(X) − max_{D ≤ D0} H(X − Y|Y)

This relation suggests that the source coder has to produce a distortion X − Y that is statistically independent of the reconstructed signal Y. Of course, this is not always possible!
Lossless vs lossy coding

Shannon lower bound:

R(D0) ≥ H(X) − max_{D ≤ D0} H(X − Y)

Rate-distortion theory tells us that no compression system can operate below the R(D) bound. The closer a practical compression system gets to this lower bound, the better it performs.
Lossless vs lossy coding

R(D) is usually very difficult to compute and can generally be found only approximately. However, the constrained problem above can be solved in closed form in a few cases:

Memoryless Gaussian source;
Gaussian source with memory.
Lossless vs lossy coding: memoryless Gaussian source

Let O = {o(s), s ∈ S} be a random source of discrete observations on a grid S with a Gaussian PDF:

p[o(s) = i] = p_i = (1/(√(2π)σ)) exp(−i²/(2σ²))

The entropy is given by H = −Σ_i p_i log2 p_i. Since log2 p_i = log2(1/(√(2π)σ)) − (i²/(2σ²)) log2 e:

H = −log2(1/(√(2π)σ)) Σ_i p_i + ((log2 e)/(2σ²)) Σ_i p_i i²    (1)
  = log2(√(2π)σ) + (1/2) log2 e                                 (2)
  = (1/2) log2(2πσ²) + (1/2) log2 e                             (3)
  = (1/2) log2(2πeσ²)                                           (4)
Lossless vs lossy coding: memoryless Gaussian source

We suppose that the source is Gaussian with variance σ². With D = E[(X − Y)²]:

I(X;Y) = H(X) − H(X|Y)
       = H(X) − H(X − Y|Y)
       ≥ H(X) − H(X − Y)                              (conditioning reduces entropy)
       ≥ (1/2) log2(2πeσ²) − H(N(0, E[(X − Y)²]))     (the normal distribution maximizes the entropy for a given second moment)
       = (1/2) log2(2πeσ²) − (1/2) log2(2πeD)
       = (1/2) log2(σ²/D)
Lossless vs lossy coding: memoryless Gaussian source

R(D) = (1/2) log2(σ²/D),  for 0 ≤ D ≤ σ²

Equivalently, D(R) = σ² 2^(−2R).

Each additional bit of description reduces the expected distortion by a factor of 4.
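These closed-form expressions are easy to check numerically; a small sketch (function names are illustrative):

```python
import math

def rate(distortion, variance):
    # R(D) = 1/2 * log2(sigma^2 / D), valid for 0 < D <= sigma^2.
    return 0.5 * math.log2(variance / distortion)

def distortion(rate_bits, variance):
    # D(R) = sigma^2 * 2^(-2R): each extra bit divides distortion by 4.
    return variance * 2 ** (-2 * rate_bits)

sigma2 = 1.0
for r in range(4):
    print(r, distortion(r, sigma2))  # 1.0, 0.25, 0.0625, 0.015625
```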
Lossless vs lossy coding: Lagrangian formulation of the R-D problem

The following constrained problems:

min_θ D_θ subject to R_θ ≤ Rmax;
min_θ R_θ subject to D_θ ≤ Dmax;

are transformed into an unconstrained Lagrangian cost function:

J = D + λR, where λ is the Lagrange multiplier.

[Figure: the R(D) curve intersected by lines of constant J = D + λR, of slope −1/λ.]
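In a practical coder, this Lagrangian cost drives mode decision: each candidate coding mode yields an (R, D) pair, and the mode minimizing J = D + λR is selected. A minimal sketch (the candidate pairs are made up for illustration):

```python
def best_operating_point(points, lam):
    # points: list of (rate, distortion) candidates on or above the R-D curve.
    # Pick the one minimizing the Lagrangian cost J = D + lambda * R.
    return min(points, key=lambda rd: rd[1] + lam * rd[0])

candidates = [(0.5, 9.0), (1.0, 4.0), (2.0, 1.0), (4.0, 0.3)]
print(best_operating_point(candidates, lam=0.5))  # small lambda favors low distortion
print(best_operating_point(candidates, lam=8.0))  # large lambda favors low rate
```

Sweeping λ from 0 to ∞ traces out the operating points on the convex hull of the candidate set, which is how an encoder navigates the rate-distortion trade-off.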
Distortion/quality assessment

1 Introduction
2 Entropy Coding
3 Other coding methods
4 Lossless vs lossy coding
5 Distortion/quality assessment
    Taxonomy
    Signal fidelity
    Perceptual metric
    Examples
    Performances
    Extension to the temporal dimension
6 Quantization
7 Predictive Coding
8 Transform coding
9 Motion estimation
Taxonomy

Distortion/quality metrics can be divided into three categories:

1 Full-Reference (FR) metrics, for which both the original and the distorted images are required (benchmarking, compression);
2 Reduced-Reference (RR) metrics, for which a description of the original and the distorted image is required (network monitoring);
3 No-Reference (NR) metrics, for which the original image is not required (network monitoring).

Each category can be divided into two subcategories: metrics based on signal fidelity and metrics based on properties of the human visual system.
Peak Signal to Noise Ratio (PSNR)

The PSNR is the most popular quality metric. This simple metric just measures the mathematical difference between each pixel of the degraded image and the original image.

Definition (PSNR)
Let I and D be the original and impaired images, respectively, each of M pixels and coded with n bits:

PSNR = 10 log10( (2^n − 1)² / MSE ) dB,

with the Mean Squared Error MSE = (1/M) Σ_{(x,y)} (I(x,y) − D(x,y))².

A high value indicates that the amount of impairment is small; a small value indicates a strong degradation.
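The PSNR definition above can be sketched directly (a minimal implementation over flat pixel lists; the sample values are illustrative):

```python
import math

def mse(original, degraded):
    # Mean Squared Error over all pixels (flat lists of equal length).
    return sum((i - d) ** 2 for i, d in zip(original, degraded)) / len(original)

def psnr(original, degraded, bits=8):
    # PSNR = 10 * log10(peak^2 / MSE), with peak = 2^bits - 1.
    peak = 2 ** bits - 1
    return 10 * math.log10(peak ** 2 / mse(original, degraded))

orig = [63, 65, 68, 67]
deg = [56, 72, 72, 72]
print(round(psnr(orig, deg), 2))  # → 32.72
```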
Peak Signal to Noise Ratio (PSNR)

The PSNR is not always well correlated with human judgment (MOS, Mean Opinion Score). The reason is simple: this metric does not take the properties of the human visual system into account.

Example
[Figure: (a) and (b), from [Nadenau,00], show the impact of a Gabor patch on our perception; (c) original image; (d) original + uniform noise.]
Peak Signal to Noise Ratio (PSNR)

Example (these three degraded pictures have the same PSNR...)
[Figure: (a) original; (b) contrast stretched; (c) blur; (d) JPEG.]
Metric based on the error visibility

For this type of metric, the behavior of the visual cells is simulated:

[Diagram: the original image I and the degraded image D each pass through a perceptual color space transform, a PSD, a CSF weighting and a masking stage; the difference between the two outputs is pooled into a quality score.]

PSD: Perceptual Subband Decomposition (Wavelet, Gabor, Fourier);
CSF: Contrast Sensitivity Function.
Metric based on the error visibility

Example
VDP (Visible Differences Predictor) [Daly,93];
WQA (Wavelet-based Quality Assessment) [Ninassi et al.,08a];
VQM (Video Quality Model) [Pinson et al.,04]...
Metric based on the structural similarity

SSIM stands for Structural Similarity index. Image degradations are considered here as perceived structural information loss rather than perceived errors [Wang et al.,04a].

Definition
Let I and D be the original and the degraded images, respectively.

S(x, y) = l(x, y) × c(x, y) × s(x, y)    (5)

The luminance comparison measure: l(x, y) = 2 μx μy / (μx² + μy²)
The contrast comparison measure: c(x, y) = 2 σx σy / (σx² + σy²)
The structural comparison measure: s(x, y) = σxy / (σx σy)

In practice, stabilizing constants C1 and C2 are added:

SSIM(x, y) = (2 μx μy + C1)(2 σxy + C2) / ((μx² + μy² + C1)(σx² + σy² + C2))

SSIM → 1 indicates the best quality; SSIM → 0 indicates a poor quality.
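A minimal global-SSIM sketch (computed over whole pixel lists rather than the sliding window used in practice; the constants C1 and C2 follow the usual choice for 8-bit images but are assumptions here):

```python
def ssim(x, y, c1=(0.01 * 255) ** 2, c2=(0.03 * 255) ** 2):
    # Global SSIM between two equal-length pixel lists, following
    # SSIM = (2*mu_x*mu_y + C1)(2*sigma_xy + C2) /
    #        ((mu_x^2 + mu_y^2 + C1)(sigma_x^2 + sigma_y^2 + C2)).
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n       # variance of x
    vy = sum((b - my) ** 2 for b in y) / n       # variance of y
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

print(ssim([63, 65, 68, 67], [63, 65, 68, 67]))  # identical images → 1.0
print(ssim([63, 65, 68, 67], [56, 72, 72, 72]))  # degraded → below 1
```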
Example of distortion maps

Example
[Figure: (a) original; (b) degraded; distortion maps computed by (c) MSE, (d) WQA, (e) SSIM.]
Impact of the visual masking (from [Ninassi et al.,08b])

Example
[Figure: (a) original; (b) degraded; (c) WQA without masking; (d) WQA with masking.]
PSNR and SSIM computation example

Example (uniform quantization, mid-riser, ∆ = 16, N = 16, 8 bits)

Flat area:
O =
63 65 68 67
63 63 66 66
60 67 65 56
67 65 63 65
Oq =
56 72 72 72
56 56 72 72
56 72 72 56
72 72 56 72
MSE = 35.68; PSNR = 32.6 dB; SSIM = 0.81.

Textured area:
O =
86  97  28  241
87  27  207 149
151 63  156 201
78  148 77  31
Oq =
88  104 24  248
88  24  200 152
152 56  152 200
72  152 72  24
MSE = 23.68; PSNR = 34.38 dB; SSIM = 0.999.

Edges:
O =
45 45  45  45
45 45  167 167
45 167 167 167
45 167 167 167
Oq =
40 40  40  40
40 40  168 168
40 168 168 168
40 168 168 168
MSE = 13; PSNR = 36.99 dB; SSIM = 0.998.
Comparisons with subjective tests (from [Ninassi et al.,08b])

Description of the three subjective experiments: IVC, Toyama1 and Toyama2.

Experiment   Distortions                    Contents / Distorted images   Protocol   Viewing conditions    Display   Observers (#)
IVC          DCT coding, DWT coding, blur   10 / 120                      DSIS       ITU-R BT 500.10, 6H   CRT       French (20)
Toyama1      DCT coding, DWT coding         14 / 168                      ACR        ITU-R BT 500.10, 4H   CRT       Japanese (16)
Toyama2      DCT coding, DWT coding         14 / 168                      ACR        ITU-R BT 500.10, 4H   LCD       French (27)

(DSIS = Double Stimulus Impairment Scale; ACR = Absolute Category Rating.)

             IVC (DSIS)              Toyama2 (ACR)           Toyama1 (ACR)
Metrics      CC     SROCC   RMSE     CC     SROCC   RMSE     CC     SROCC   RMSE
MOSp(WQA)    0.923  0.921   0.48     0.937  0.941   0.38     0.919  0.923   0.514
MOSp(PSNR)   0.768  0.77    0.795    0.699  0.685   0.777    0.685  0.678   0.943
MOSp(SSIM)   0.832  0.844   0.691    0.823  0.826   0.618    0.814  0.82    0.754

MOS = Mean Opinion Score;
CC = linear correlation coefficient [−1, 1];
RMSE = root mean square error [0, +∞[;
SROCC = Spearman rank-order correlation coefficient (a non-parametric measure of correlation) [−1, 1].
Quality metric for video

How do we build our own opinion regarding the quality of a video? We are quick to criticize and slow to forgive...

Example (video quality metrics)

Temporal SSIM [Wang et al.,04b]:
1 reduce the importance of dark areas compared to lighter areas (deemed to be more attractive, which is debatable);
2 reduce the importance of spatial distortions when the dominant motion is high.

Temporal WQA [Ninassi et al.,09]:
1 the HVS integrates most of the visual information during a visual fixation (≈ 250 ms);
2 distortion is evaluated once the area is stabilized on the fovea (motion compensation);
3 the characteristics of the temporal distortions, such as the temporal frequency and the amplitude of the variations, impact the perception.
Quantization

1 Introduction
2 Entropy Coding
3 Other coding methods
4 Lossless vs lossy coding
5 Distortion/quality assessment
6 Quantization
    Scalar quantization
        Principle
        Uniform quantization
        Optimal quantization, Lloyd-Max algorithm
        Examples: uniform, semi-uniform and optimal quantizer
        Summary
    Vector quantization
        Principle
        Voronoi diagram
        Clustering and K-means
7 Predictive Coding
8 Transform coding
9 Motion estimation
Principle

Quantization is a process that represents a large set of values with a smaller one.

Definition (scalar quantization)
Q : X → C = {y_i, i = 1, 2, ..., N}
∀x ∈ X, Q(x) = y_i

N is the number of quantization levels;
X can be continuous (e.g. R) or discrete; C is always discrete (codebook, dictionary);
card(X) > card(C);
since in general x ≠ Q(x), some information is lost (lossy compression).
Uniform quantization

Definition
In uniform quantization, the quantization step size ∆ is fixed, whatever the signal amplitude.

[Figure: staircase input/output characteristic, with decision levels t_i and representative levels y_i.]

The decision levels are uniformly spaced: ∀i ∈ {1, 2, ..., N}, t_{i+1} − t_i = ∆.
The output values are the centers of the quantization intervals: ∀i ∈ {1, 2, ..., N}, y_i = (t_i + t_{i+1})/2.

Example of the nearest-neighbor quantizer (shown in the figure):
Q(x) = ∆ × ⌊x/∆ + 0.5⌋.
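The nearest-neighbor formula above translates directly into code (a minimal sketch):

```python
import math

def uniform_quantize(x, step):
    # Nearest-neighbor uniform quantizer: Q(x) = step * floor(x/step + 0.5).
    # Zero is a representative level, and each output is the center of a
    # bin of width `step`.
    return step * math.floor(x / step + 0.5)

print([uniform_quantize(v, 16) for v in (7, 63, 65)])  # → [0, 64, 64]
```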
Uniform quantization

A uniform quantizer is completely defined by the number of levels, the quantization step, and whether it is a mid-step or mid-riser quantizer.

A mid-step (mid-tread) quantizer: zero is one of the representative levels y_k.
A mid-riser quantizer: zero is one of the decision levels t_k.

[Figure: the two staircase characteristics side by side.]

Usually, a mid-riser quantizer is used if the number of representative levels is even, and a mid-step quantizer if the number of levels is odd.
Uniform quantization with dead zone

[Figure: a uniform quantizer compared with a uniform quantizer whose central interval around zero (the dead zone) is widened.]

Interest: small coefficients are removed by favouring the zero value, which increases the coding efficiency with a small visual impact.
Example of a uniform quantization

Example (original picture quantized to 8, 7, 6 and 4 bits/pixel)
[Figure: (a) original (8 bits per pixel); (b) 7 bpp (∆ = 2); (c) 6 bpp (∆ = 4); (d) 4 bpp (∆ = 16).]
Optimal quantization

Definition (optimal quantization)
An optimal quantization of a random variable X with probability distribution p(x) is obtained by a quantizer that minimizes a given metric:

L∞ norm: D = max |X − Q(X)|;
L1 norm: D = E[|X − Q(X)|];
L2 norm: D = E[(X − Q(X))²], called the Mean-Square Error (MSE). This is the most widely used.

Considering the MSE, we have:
if the random variable X is continuous: D = ∫_{xmin}^{xmax} p(x)(x − Q(x))² dx;
if the random variable X is discrete: D = Σ_{k=1}^{n} p(x_k)(x_k − Q(x_k))².
Example

Example (quantization error)

Hypotheses:
uniform quantization with a quantization step ∆;
X is a random variable with a uniform probability distribution p(x) over [xmin, xmax];
N is the number of representative levels.

[Figure: uniform pdf over [xmin, xmax], divided into N bins of width ∆; the quantization error lies in [−∆/2, ∆/2].]

Quantization step: ∆ = (xmax − xmin)/N
Probability distribution: p(x) = 1/(xmax − xmin) = 1/(N∆)
Example

Example (quantization error)

D = ∫_{xmin}^{xmax} p(x)(x − Q(x))² dx
  = N × ∫_{−∆/2}^{∆/2} p(x) x² dx        (taking 0 as the mid-point of a bin)
  = N × ∫_{−∆/2}^{∆/2} (1/(N∆)) x² dx
  = ∆²/12
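The ∆²/12 result can be checked by Monte Carlo simulation (a quick sketch; the source range and sample count are illustrative):

```python
import random

def quantization_noise(step, samples=200000, seed=0):
    # Quantize uniform samples with a nearest-neighbor quantizer and
    # measure the empirical mean squared error, expected ~ step^2 / 12.
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        x = rng.uniform(0.0, 8.0)
        q = step * round(x / step)   # nearest representative level
        total += (x - q) ** 2
    return total / samples

step = 0.5
print(quantization_noise(step), step ** 2 / 12)  # both ≈ 0.0208
```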
Optimal quantizer

A uniform quantizer is not optimal if the source is not uniformly distributed.

Optimal quantizer: find the decision levels t_i and the representative levels y_i that minimize the distortion D.

To reduce the MSE, the idea is to decrease the bin size where the probability of occurrence is high and to increase it where the probability is low. For N representative levels and a probability density p(x), the distortion is given by:

D = ∫_{xmin}^{xmax} p(x)(x − Q(x))² dx = Σ_{k=1}^{N−1} ∫_{t_k}^{t_{k+1}} p(x)(x − y_k)² dx
Optimal quantizer

The optimal t_i and y_i satisfy ∂D/∂t_i = 0 and ∂D/∂y_i = 0.

Lloyd-Max quantizer [Lloyd,82, Max,60]
∂D/∂t_i = 0 ⇒ t_i = (y_i + y_{i+1})/2: t_i is the midpoint of y_i and y_{i+1}.
∂D/∂y_i = 0 ⇒ y_i = ∫_{t_{i−1}}^{t_i} x p(x) dx / ∫_{t_{i−1}}^{t_i} p(x) dx: y_i is the centroid of the interval [t_{i−1}, t_i].

⇒ given the {t_i}, we can find the corresponding optimal {y_i};
⇒ given the {y_i}, we can find the corresponding optimal {t_i}.

How can we find the optimal t_i and y_i simultaneously?
Lloyd-Max algorithm

The Lloyd-Max algorithm finds the representative levels y_i and the decision levels t_i that meet the previous conditions, with no prior knowledge.

Principle of the iterative process:
1 Start at k = 0 with a set of initial representative levels {y_1^(0), ..., y_N^(0)}.
2 Determine the new decision levels: t_i^(k+1) = (y_i^(k) + y_{i+1}^(k))/2.
3 Compute the distortion D^(k) and the relative error δ^(k).
4 If the stopping criterion δ^(k) < ε is met, stop; otherwise update the representative levels y_i^(k+1) = ∫_{t_{i−1}^(k+1)}^{t_i^(k+1)} x p(x) dx / ∫_{t_{i−1}^(k+1)}^{t_i^(k+1)} p(x) dx and go back to step 2.
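When only a discrete set of samples is available, the centroid integral reduces to the mean of the samples falling in each bin; a minimal Lloyd-Max sketch under that assumption (the initial representative levels are illustrative):

```python
def lloyd_max(samples, reps, iterations=10):
    # reps: initial representative levels y_i, sorted in increasing order.
    for _ in range(iterations):
        # Decision levels: midpoints between consecutive representatives.
        thresholds = [(a + b) / 2 for a, b in zip(reps, reps[1:])]
        # Assign each sample to its bin, then move each representative to
        # the centroid (mean) of its bin; keep it as-is if the bin is empty.
        bins = [[] for _ in reps]
        for x in samples:
            i = sum(x > t for t in thresholds)
            bins[i].append(x)
        reps = [sum(b) / len(b) if b else r for b, r in zip(bins, reps)]
    return reps

X = [0, 0.01, 2.8, 3.4, 1.99, 3.6, 5, 3.2, 4.5, 7.1, 7.9]
print(lloyd_max(X, [1, 3, 5, 7]))  # converges toward {0.005, 2.998, 4.75, 7.5}
```

Run on samples, this iteration is exactly 1-D k-means, which is why the slides later relate vector quantization to clustering.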
Uniform, semi-uniform and optimal quantizer

Suppose we have the following 1D discrete signal:
X = {0, 0.01, 2.8, 3.4, 1.99, 3.6, 5, 3.2, 4.5, 7.1, 7.9}

Example (uniform quantizer: N = 4, mid-riser, ∆ = 2)
[Figure: staircase characteristic over [0, 8] with outputs 1, 3, 5, 7.]
t_i ∈ T = {t0 = 0, t1 = 2, t2 = 4, t3 = 6, t4 = 8}
r_i ∈ R = {r0 = 1, r1 = 3, r2 = 5, r3 = 7}

Check this against the definition of a uniform quantizer...

Quantized vector X̂ = {1, 1, 3, 3, 1, 3, 5, 3, 5, 7, 7} (MSE = 0.42)
Uniform, semi-uniform and optimal quantizer

Suppose we have the following 1D discrete signal:
X = {0, 0.01, 2.8, 3.4, 1.99, 3.6, 5, 3.2, 4.5, 7.1, 7.9}

Example (semi-uniform quantizer: N = 4, mid-riser, ∆ = 2)
[Figure: staircase characteristic with uniform decision levels and non-uniform outputs.]
t_i ∈ T = {t0 = 0, t1 = 2, t2 = 4, t3 = 6, t4 = 8}
r_i ∈ R = {r0 = 2/3, r1 = 3.25, r2 = 4.75, r3 = 7.5}

Quantized vector X̂ = {2/3, 2/3, 3.25, 3.25, 2/3, 3.25, 4.75, 3.25, 4.75, 7.5, 7.5} (MSE = 0.31)
IntroductionEntropy Coding
Other coding methodsLossless vs lossy coding
Distortion/quality assessmentQuantization
Predictive CodingTransform codingMotion estimation
Summary
Scalar quantizationVector quantization
Uniform, semi-uniform and optimal quantizer
Suppose we have the following 1D discrete signal:
X = 0, 0.01, 2.8, 3.4, 1.99, 3.6, 5, 3.2, 4.5, 7.1, 7.9
Example (Lloyd-Max Algorithm)
1 Initial values of the decision levels: T = {t0 = 0, t1 = 2, t2 = 4, t3 = 6, t4 = 8}
2 First iteration: ri = ∫_{ti}^{ti+1} x p(x) dx / ∫_{ti}^{ti+1} p(x) dx.
In this case, the pdf is not known, so we assign each observation to the representative level leading to the smallest distortion. Then we compute the centroid of each group:
group 1: 0, 0.01 | group 2: 1.99, 2.8, 3.2, 3.4, 3.6 | group 3: 4.5, 5 | group 4: 7.1, 7.9
R = {0.005, 2.998, 4.75, 7.5}
3 Second iteration: ti = (r_{i-1} + r_i)/2 gives the new decision levels T = {0, 1.5, 3.87, 6.125, 8}; the representative levels are unchanged: R = {0.005, 2.998, 4.75, 7.5}.
4 Third iteration: the same decision and representative levels are obtained; the algorithm has converged.
Given that T = {0, 1.5, 3.87, 6.125, 8} and R = {0.005, 2.998, 4.75, 7.5}, the quantized vector is X = 0.005, 0.005, 2.998, 2.998, 2.998, 2.998, 4.75, 2.998, 4.75, 7.5, 7.5 (MSE = 0.18).
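Since the pdf is only known through the samples, the Lloyd-Max iteration can be sketched with empirical centroids. This toy implementation assumes no decision cell becomes empty on this data; the intermediate groupings may differ slightly from the slide's, but the fixed point is the same, and the MSE comes out ≈ 0.188, matching the slide's 0.18 up to rounding:

```python
import numpy as np

# Lloyd-Max iteration on the slide's data: centroid of each decision cell,
# then midpoints of consecutive centroids as the new decision levels.
X = np.array([0, 0.01, 2.8, 3.4, 1.99, 3.6, 5, 3.2, 4.5, 7.1, 7.9])
T = np.array([0.0, 2.0, 4.0, 6.0, 8.0])   # initial decision levels

for _ in range(10):
    # representative levels = centroids of the samples in each cell [t_i, t_{i+1})
    R = np.array([X[(X >= T[i]) & (X < T[i + 1])].mean() for i in range(4)])
    # new decision levels = midpoints between consecutive representative levels
    T = np.concatenate(([T[0]], (R[:-1] + R[1:]) / 2, [T[-1]]))

Xq = R[np.clip(np.searchsorted(T, X, side='right') - 1, 0, 3)]
mse = np.mean((X - Xq) ** 2)
print(R)      # converges to [0.005, 2.998, 4.75, 7.5]
print(mse)
```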
[Figure: input/output characteristics of the uniform, semi-uniform, and optimal quantizers, with the samples xi ∈ X marked on the input axis.]
Uniform vs optimal quantizer
Example (N=5)
(a) Original (b) Uniform (c) Optimal
(d) Histo. (e) Decision levels (f) Decision levels
Summary
Scalar quantization
Scalar quantization basically involves two operations:
1 partitioning the range of possible input values into a finite collection of subranges or subsets;
2 for each subset, choosing a representative value to be output when an input value falls in that subrange.
Vector quantization is also based on these two operations, which take place not in a one-dimensional scalar space but in an N-dimensional vector space.
Principle
Vector quantization, also called block quantization, encodes values stemming from a multidimensional vector space into a finite set of values from a discrete subspace of lower dimension.

Definition (Vector quantization)
A vector quantizer maps n-dimensional vectors of the vector space R^n into a finite set of vectors C = {yi, i = 1, 2, ..., N}. Each vector yi is called a code vector or a codeword, and the set of all the codewords is called a codebook C.
Q : R^n → C = {yi, i = 1, 2, ..., N}
x = (x1, x2, ..., xn)^t ↦ yi = (y_{i,1}, y_{i,2}, ..., y_{i,n})^t
n is the size of the input vector x;
N is the number of representative levels;
the output y is a vector of size n.
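A minimal sketch of the mapping Q, with an assumed 2-D codebook of N = 4 codewords (the names and values are illustrative): each input vector is mapped to its nearest codeword in Euclidean distance.

```python
import numpy as np

# Assumed codebook C of N = 4 codewords in R^2 (illustrative values)
C = np.array([[0.0, 0.0], [0.0, 4.0], [4.0, 0.0], [4.0, 4.0]])

def vq(x, codebook):
    d = np.linalg.norm(codebook - x, axis=1)  # distance to every codeword
    return codebook[np.argmin(d)]             # nearest codeword

print(vq(np.array([0.7, 3.1]), C))   # nearest codeword is (0, 4)
```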
Voronoi diagram
The set of Voronoi regions {Ri} partitions the entire space R^n:
⋃_{i=1}^{N} Ri = R^n and Ri ∩ Rj = ∅, ∀i ≠ j.
Example (Example of a 2-dimensional VQ)
[Figure: a 2-dimensional VQ with N = 6 codewords yi; each codeword is a representative level, and any input x falling in the Voronoi region R5 is mapped to its codeword: ∀x ∈ R5, Q(x) = y5.]
Voronoi diagram
Example (Two 1D-scalar quantizations)
[Figure: N = 9 representative levels placed on a rectangular grid in the (x, y) plane.]
One possible approach is to quantize each dimension independently with a scalarquantizer. This results in a rectangular grid of quantization regions.
Pro and cons of vector quantization
Pro
The statistical dependency between the different dimensions of the space is taken into account: a very good candidate when the samples are statistically highly dependent;
For the same number of representative levels, the distortion is lower than that of a scalar quantization.
Cons
The complexity of finding a good partition of the space...
Optimal vector quantization
The questions are:
1 What are the representative levels that best represent a given set of input vectors?
2 How many representative levels should be chosen?
Optimal vector quantization
Let Ri be a Voronoi region. An optimal vector quantization implies that:
Ri = {x ∈ R^n : d(x, yi) ≤ d(x, yj), ∀j ≠ i}
where d(·) is a given distance metric (e.g. the Euclidean distance) and yi is the centroid of the region Ri.
Optimal vector quantization
The questions are:
1 What are the representative levels that best represent a given set of input vectors?
2 How many representative levels should be chosen?
General framework: the Linde-Buzo-Gray (LBG) algorithm [Linde et al., 80]
1 Determine the number of representative levels, N;
2 Select N representative levels at random;
3 Using the Euclidean distance measure, cluster the space around each representative level: for a given input vector, find the representative level that yields the minimum distance;
4 Compute the new set of representative levels;
5 Repeat steps 3 and 4 until the representative levels are almost constant.
This is an extension of the Lloyd-Max algorithm (very similar to k-means).
Clustering and k-means
A cluster is a collection of objects which are similar to one another and dissimilar to the objects belonging to other clusters.
Example (K-means on a picture)
(a) Original (b) N=2 (c) N=5 (d) N=10
Clustering and k-means
K-means algorithm
Repeat the following three steps until convergence (stability of the centroids):
1 determine the centroid coordinates (for the first iteration, random values can be chosen);
2 determine the distance of each data point to the centroids;
3 group the data points based on minimum distance.
Example
Let A = {1, 2, 3, 6, 7, 8, 13, 15, 17}. Three clusters are required.
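The three steps above can be run on this example. Taking the first three points of A as (arbitrary) initial centroids, the iteration converges to the centroids 2, 7 and 15, i.e. the clusters {1, 2, 3}, {6, 7, 8} and {13, 15, 17}:

```python
import numpy as np

# k-means on the slide's example A with K = 3 clusters; the first three
# points are (arbitrarily) taken as initial centroids.
A = np.array([1, 2, 3, 6, 7, 8, 13, 15, 17], dtype=float)
centroids = A[:3].copy()

for _ in range(20):
    # distance of each point to each centroid -> nearest-centroid labels
    labels = np.argmin(np.abs(A[:, None] - centroids[None, :]), axis=1)
    # move each centroid to the mean of its group
    new = np.array([A[labels == k].mean() for k in range(3)])
    if np.allclose(new, centroids):
        break
    centroids = new

print(centroids)   # converges to the centroids 2, 7, 15
```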
Predictive coding
1 Introduction
2 Entropy Coding
3 Other coding methods
4 Lossless vs lossy coding
5 Distortion/quality assessment
6 Quantization
7 Predictive Coding: Principle; Spatial linear prediction; Temporal prediction; Predictive coding and quantization
8 Transform coding
9 Motion estimation
Predictive coding
Definition (Basic predictive coders)
Predictive coding exploits the correlation between adjacent pels (spatially as well as temporally). A prediction of the pel to be encoded is made from previously coded information, already transmitted.
A predictive coder is composed of a predictor, a quantization step and an entropy encoder.
Lossless predictive coding
[Block diagram: the encoder predicts f̂ from previously coded samples held in memory and entropy-codes the prediction error ε = f − f̂ into binary codes; the decoder computes the same prediction and reconstructs f = f̂ + ε.]
Lossless predictive coding
f(n): input sample;
f̂(n): predicted sample;
ε(n) = f(n) − f̂(n) is the prediction error.
A linear prediction at order P is given by: f̂(n) = Σ_{i=1}^{P} ai f(n − i).
Ideally, the coefficients ai minimize the Lα-norm of the prediction error.
Least mean square to compute the coefficients ai
Let:
the input samples F = [f(0) · · · f(N − 1)]^t;
the prediction error E = [ε(0) · · · ε(N − 1)]^t;
the parameter vector Ω = [a1 · · · aP]^t.
The optimal parameters ai are given by the following equation (optimal regarding the MSE):
Ω_OPT = arg min_Ω (1/N) Σ_{n=0}^{N−1} ε(n)²
Since ε(n) = f(n) − Σ_{i=1}^{P} ai f(n − i), stacking the N equations gives
[ε(0); ε(1); ...; ε(N − 1)] = [f(0); f(1); ...; f(N − 1)] − [f(−1) · · · f(−P); f(0) · · · f(−P + 1); ...; f(N − 2) · · · f(N − P − 1)] [a1; ...; aP]
i.e. E = F − ΓΩ.
We have Σ_{n=0}^{N−1} ε(n)² = E^t E and
E^t E = (F − ΓΩ)^t (F − ΓΩ) = F^t F − 2(Γ^t F)^t Ω + Ω^t Γ^t Γ Ω.
The optimal parameters are given by ∂(E^t E)/∂Ω = 0:
Ω_opt = (Γ^t Γ)^{-1} Γ^t F
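As a sketch, the closed-form solution Ω_opt = (Γ^t Γ)^{-1} Γ^t F can be checked on a synthetic signal whose true coefficients are known (the AR(2) signal model used here is an assumption chosen for the test, not part of the slides):

```python
import numpy as np

# Estimate P-th order linear-predictor coefficients by the normal equations
# on a synthetic signal f(n) = 1.5 f(n-1) - 0.7 f(n-2) + small noise.
rng = np.random.default_rng(0)
P, N = 2, 500
f = np.zeros(N + P)            # P leading zeros stand in for f(-1), f(-2)
for n in range(P, N + P):
    f[n] = 1.5 * f[n - 1] - 0.7 * f[n - 2] + 0.1 * rng.standard_normal()

F = f[P:]                                                        # f(0)..f(N-1)
Gamma = np.column_stack([f[P - i:-i] for i in range(1, P + 1)])  # f(n-1), f(n-2)
Omega = np.linalg.solve(Gamma.T @ Gamma, Gamma.T @ F)            # normal equations
print(Omega)   # close to the true coefficients [1.5, -0.7]
```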
Predictive coding
Spatial linear prediction (Intra-picture)
The current value is predicted from its immediate neighbourhood. This is called DPCM (Differential Pulse Code Modulation).
[Template: the causal neighbours of pel (i, j), weighted by a0, a1, a2, a3.]
f̂_{i,j} = a0 f_{i−1,j} + a1 f_{i−1,j−1} + a2 f_{i,j−1} + a3 f_{i+1,j−1}
Example
a0 = 1 and a1,2,3 = 0: f̂_{i,j} = f_{i−1,j};
a0 = a2 = 1/2 and a1 = a3 = 0: f̂_{i,j} = (f_{i−1,j} + f_{i,j−1})/2;
a0 = a2 = 1, a1 = −1 and a3 = 0: f̂_{i,j} = f_{i−1,j} + f_{i,j−1} − f_{i−1,j−1}.
Predictive coding
Example (Classical predictors)
[Template: the causal neighbours of pel (i, j), weighted by a0, a1, a2, a3.]
Linear predictors:
f̂_{i,j} = (f_{i−1,j} + f_{i,j−1})/2, prediction based on the two nearest pixels;
f̂_{i,j} = (2 f_{i−1,j} + f_{i,j−1})/3, prediction based on the two nearest pixels but with a preferred orientation;
f̂_{i,j} = (2/3)(f_{i−1,j} + f_{i,j−1}) − (1/3) f_{i−1,j−1}.
Non-linear predictor used in JPEG-LS:
f̂_{i,j} = min(f_{i−1,j}, f_{i,j−1}) if f_{i−1,j−1} ≥ max(f_{i−1,j}, f_{i,j−1}); max(f_{i−1,j}, f_{i,j−1}) if f_{i−1,j−1} ≤ min(f_{i−1,j}, f_{i,j−1}); f_{i−1,j} + f_{i,j−1} − f_{i−1,j−1} otherwise.
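The JPEG-LS predictor above acts as a simple edge detector. A direct transcription, with a = f_{i−1,j}, b = f_{i,j−1} and c = f_{i−1,j−1}:

```python
def med_predict(a, b, c):
    """JPEG-LS median (MED) predictor: a and b are the two causal
    neighbours, c the diagonal one."""
    if c >= max(a, b):
        return min(a, b)   # c large: likely an edge, pick the smaller neighbour
    if c <= min(a, b):
        return max(a, b)   # c small: likely an edge, pick the larger neighbour
    return a + b - c       # otherwise: planar prediction

# A vertical edge: dark left neighbour (10), bright upper neighbour (200)
print(med_predict(10, 200, 10))   # -> 200
```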
Predictive coding
Example (Spatial linear prediction)
(a) Orig.; (b) PDF of the luma; (c)-(e) prediction errors; (f) PDF of the error.
(a) H = 6.6 bits/symbol, L = 6.7;
(c) H = 2.33 bits/symbol, L = 4.6; simple prediction a0 = 1, otherwise 0;
(d) H = 2.14 bits/symbol, L = 4.2; minimum-variance prediction a0 = 7/8, a1 = −5/8, a2 = 3/4 and a3 = 0;
(e) H = 2.16 bits/symbol, L = 4.3; minimum-entropy prediction a0 = 7/8, a1 = −1/2, a2 = 5/8 and a3 = 0.
Predictive coding
Temporal prediction (Inter-pictures)
Two solutions to deal with the temporal redundancy:
Î(i, j, t + 1) = I(i, j, t) (error = Frame Difference (FD));
Î(i, j, t + 1) = I(i + di, j + dj, t) (error = Displaced Frame Difference (DFD)).
Example
[Figure: the FD between I(t) and I(t + 1), compared to the DFD obtained after motion compensation with the vector ~V = [di, dj]^t.]
Predictive coding
Lossy predictive coding (open-loop or feedforward)
[Block diagram: the encoder predicts f̂ from the previous input samples, quantizes the prediction error ε into εQ and entropy-codes it; the decoder forms its prediction from the decoded samples and adds εQ.]
The predictor is based on the input (before the quantization). Therefore, any error introduced by Q cannot be recovered.
Predictive coding and quantization
Example (Open-loop, simple prediction, quantization 2k, 2k + 1 → 2k)
Encoding side:
f (values):      1   1   2   3   0   2   1   3
f̂ (prediction): 0   1   1   2   3   0   2   1
ε = f − f̂:      1   0   1   1  −3   2  −1   2
εQ:              0   0   0   0  −4   2  −2   2

Decoding side (d = f̂ + εQ, the prediction now based on the decoded values):
εQ:              0   0   0   0  −4   2  −2   2
f̂ (prediction): 0   0   0   0   0  −4  −2  −4
d (decoded):     0   0   0   0  −4  −2  −4  −2
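The two tables can be reproduced with a short simulation (the quantizer maps {2k, 2k+1} to 2k); the decoder output drifts away from f because its predictions are built from quantized values while the encoder's were not:

```python
# Open-loop DPCM: the encoder predicts from the *unquantized* input,
# so the decoder's predictions diverge from the encoder's.
def quantize(e):
    return 2 * (e // 2)          # maps {2k, 2k+1} -> 2k

f = [1, 1, 2, 3, 0, 2, 1, 3]
pred_enc, eq = 0, []
for x in f:                      # encoder: predict = previous input sample
    eq.append(quantize(x - pred_enc))
    pred_enc = x

decoded, pred_dec = [], 0
for e in eq:                     # decoder: can only predict from decoded samples
    d = pred_dec + e
    decoded.append(d)
    pred_dec = d

print(eq)        # [0, 0, 0, 0, -4, 2, -2, 2]
print(decoded)   # [0, 0, 0, 0, -4, -2, -4, -2] -- drifts away from f
```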
Predictive coding and quantization
Lossy predictive Coding (closed-loop or feedback)
The predictor is based on the reconstructed input (after Q). The effect of Q is fed back to the input for adjustment.
[Block diagram: the encoder computes ε = f − f̂, quantizes it into εQ = Q(ε), and forms its prediction f̂ from the locally reconstructed samples f̃ = f̂ + εQ; the decoder performs exactly the same reconstruction fD = f̂ + εQ.]
Reconstruction error: fD − f = εQ − ε
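Repeating the previous open-loop example with the closed loop shows the difference: since fD − f = εQ − ε, the reconstruction error now stays within one quantization step instead of drifting:

```python
# Closed-loop DPCM: the same signal and quantizer as the open-loop example,
# but the prediction is based on the *reconstructed* samples.
def quantize(e):
    return 2 * (e // 2)          # maps {2k, 2k+1} -> 2k

f = [1, 1, 2, 3, 0, 2, 1, 3]
pred, decoded = 0, []
for x in f:
    eq = quantize(x - pred)      # quantized prediction error
    rec = pred + eq              # encoder reconstructs exactly as the decoder
    decoded.append(rec)
    pred = rec

print(decoded)                               # [0, 0, 2, 2, 0, 2, 0, 2]
print([d - x for d, x in zip(decoded, f)])   # errors stay bounded (no drift)
```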
Predictive coding
Summary
Prediction: estimation of a random variable from past or present observable random variables;
Intra-frame prediction to exploit spatial similarities;
Inter-frame prediction to exploit the similarity of temporally successive pictures (motion-compensated or not);
Prediction shapes the error signal (Laplacian, generalized Gaussian);
Simple and efficient;
Prediction is based on quantized samples;
Adaptive intra/inter-frame DPCM.
Transform coding
1 Introduction
2 Entropy Coding
3 Other coding methods
4 Lossless vs lossy coding
5 Distortion/quality assessment
6 Quantization
7 Predictive Coding
8 Transform coding: A brief reminder; Karhunen-Loeve transform; Discrete Fourier transform (DFT); Discrete Cosine transform (DCT); Discrete Wavelet Transform (DWT)
9 Motion estimation
A brief reminder
Definition (Discrete inner product)
Let f and g be two vectors of size N. The inner product is given by:
⟨f, g⟩ = Σ_{n=0}^{N−1} f(n) g(n)

Definition (A basis)
A basis consists of a set of functions/vectors from which any other vector can be generated via linear combinations.

Definition (Discrete orthogonal basis)
A set of functions {ϕm}_{0≤m<N} defines an orthogonal basis B if:
⟨ϕm, ϕn⟩ = 0 if n ≠ m, and c if n = m;
if c = 1, the basis is said to be orthonormal.
The goal
Transform coding is fundamental to achieve a significant bit-rate reduction because it makes it possible to reduce the redundancy of the original signal (energy compaction). Three main features are required:
energy compaction;
decorrelation of the transform coefficients;
conservation of the energy.

Definition (Orthogonal decomposition or forward transform)
Let I be the input picture of size N. This picture can be perfectly represented by a linear combination of a set of orthogonal functions {ϕm}_{0≤m<N}:
I = Σ_{m=0}^{N−1} αm ϕm
where the coefficients αm are given by ⟨I, ϕm⟩.
How to find the best orthogonal basis B? KLT, Fourier, DCT, wavelets...
Karhunen-Loeve transform
Principle
Let X be a square matrix of size N containing the input data (the Xi are random variables):
X = [X1; ...; XN], with row Xi = (x_{i1}, ..., x_{iN}).
The covariance matrix Σ is defined element-wise by:
Σ_{ij} = cov(Xi, Xj) = E[(Xi − µi)(Xj − µj)], with µi = E[Xi].
The KLT basis vectors are the eigenvectors of Σ (diagonalization of Σ): Σ = U D U^t.
D is the eigenvalue matrix: D = diag(d1, ..., dN), with d1 ≥ · · · ≥ dN.
The eigenvectors (column vectors of U) are arranged according to decreasing eigenvalues.
KLT = Principal Component Analysis (PCA)
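A sketch of the KLT on assumed correlated 2-D data (the data model is illustrative): diagonalizing the empirical covariance matrix and projecting onto its eigenvectors, sorted by decreasing eigenvalue, yields decorrelated coefficients:

```python
import numpy as np

# KLT / PCA on two correlated rows of data
rng = np.random.default_rng(1)
x1 = rng.standard_normal(1000)
X = np.vstack([x1, 0.9 * x1 + 0.1 * rng.standard_normal(1000)])

Sigma = np.cov(X)                      # empirical covariance matrix
d, U = np.linalg.eigh(Sigma)           # Sigma = U D U^t (ascending order)
order = np.argsort(d)[::-1]            # reorder by decreasing eigenvalue
U, d = U[:, order], d[order]

Y = U.T @ (X - X.mean(axis=1, keepdims=True))   # KLT coefficients
print(np.cov(Y))   # ~diagonal: the transform coefficients are decorrelated
```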
Karhunen-Loeve transform
Example (Hard thresholding)
(a) Original; (b) 1 EV (MSE=515); (c) 10 EV (MSE=49); (d) 15 EV (MSE=26); (e) 30 EV (MSE=7); (f) 60 EV (MSE=0.86); (g) 120 EV (MSE=10^−4).
EV = eigenvectors.
Karhunen-Loeve transform
Pro
The KL transform coefficients are uncorrelated;
Basis vectors are ordered according to decreasing eigenvalues (compaction of energy).
Cons
The representation (the transform matrix) is signal dependent;
The computation cost is significant.
Discrete Fourier transform (DFT)
Let f be a sampled finite signal (N samples).

Definition (Discrete Fourier basis)
The discrete Fourier basis B = {ϕm}_{0≤m<N} is defined by:
ϕm(n) = (1/√N) exp(2iπmn/N)

Discrete Fourier transform: F(k) = (1/√N) Σ_{n=0}^{N−1} f(n) exp(−2iπkn/N)
Inverse Fourier transform: f(n) = (1/√N) Σ_{k=0}^{N−1} F(k) exp(2iπkn/N)

Extension to 2D
The 2D DFT can be separated into a sequence of two 1D DFTs. The basis functions are then:
ϕ_{m1,m2}(n1, n2) = ϕ_{m1}(n1) ϕ_{m2}(n2) = (1/N) exp(2iπ(m1 n1 + m2 n2)/N)
DFT
Example
(a) Original; (b) Amplitude spectrum; (c) 3 contour lines.
Discrete Cosine transform (DCT)
Definition (Discrete Cosine basis)
The discrete cosine basis B = {ϕm}_{0≤m<N} is defined by:
ϕm(n) = λ(m) √(2/N) cos(π(2n+1)m / 2N)
with λ(m) = 1/√2 if m = 0, and 1 otherwise.

Extension to 2D
The 2D DCT can be separated into a sequence of two 1D DCTs. The basis functions are then:
ϕ_{m1,m2}(n1, n2) = ϕ_{m1}(n1) ϕ_{m2}(n2) = λ(m1) λ(m2) (2/N) cos(π(2n1+1)m1 / 2N) cos(π(2n2+1)m2 / 2N)
with λ(m) = 1/√2 if m = 0, and 1 otherwise.
Discrete Cosine transform (DCT)
DCT and inverse DCT
The DCT of a block I of size N×N is defined by:
DCT(I)(n, m) = (2/N) λ(n) λ(m) Σ_{i=0}^{N−1} Σ_{j=0}^{N−1} cos(π(2i+1)n / 2N) cos(π(2j+1)m / 2N) I(i, j) ⇔ ⟨ϕ_{n,m}, I⟩

The inverse DCT is defined by:
I(n1, n2) = (2/N) Σ_{m1=0}^{N−1} Σ_{m2=0}^{N−1} λ(m1) λ(m2) cos(π(2n1+1)m1 / 2N) cos(π(2n2+1)m2 / 2N) DCT(I)(m1, m2).

[Figure: the two-dimensional DCT basis. The 8×8 source data is transformed into a linear combination of these 64 frequency squares.]
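As a sketch, building the N × N orthonormal DCT-II matrix from the basis above shows that the inverse DCT is simply the transpose, and that an 8×8 block is perfectly reconstructed:

```python
import numpy as np

# Orthonormal DCT-II matrix: C[m, n] = lambda(m) sqrt(2/N) cos(pi (2n+1) m / 2N)
N = 8
n = np.arange(N)
lam = np.where(n == 0, 1 / np.sqrt(2), 1.0)
C = np.sqrt(2 / N) * lam[:, None] * \
    np.cos(np.pi * (2 * n[None, :] + 1) * n[:, None] / (2 * N))

assert np.allclose(C @ C.T, np.eye(N))   # the basis is orthonormal

I = np.random.default_rng(3).standard_normal((N, N))
coeffs = C @ I @ C.T                     # separable 2-D DCT of an 8x8 block
assert np.allclose(C.T @ coeffs @ C, I)  # inverse DCT reconstructs the block
```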
Discrete Cosine transform (DCT)
Example (Hard thresholding)
[Figure: the block is transformed by the DCT; masks that keep only the lowest (fu, fv) frequencies or discard the smallest coefficients are applied, and the IDCT of each masked spectrum shows little visual degradation even when the highest frequencies are removed.]
Discrete Cosine transform (DCT)
Pro
Good decorrelation of the coefficients;
Good compaction of energy (easy to remove coefficients without visual annoyance);
A lot of almost-null coefficients for the highest frequencies, with the possibility to reorder the coefficients using the zig-zag scan (run-length coding).
Cons
??
Discrete Wavelet Transform (DWT)
Definition (Wavelet [Mallat, 98])
A wavelet ψ is a function of zero average which is dilated with a scale parameter s and translated by u:
ψ_{s,u}(t) = (1/√s) ψ((t − u)/s)
The family {ψ_{j,n}}_{(j,n)∈Z²} is an orthonormal basis of L²(R) (s = 2^j).

Definition (Link with the multiresolution approach)
Because the orthogonal wavelets dilated by s = 2^j carry signal variations at the resolution 2^−j, the wavelet transform can be seen as an iterative process. For each scale j, f ∈ L²(R) is decomposed into:
1 a coarse approximation with a scaling function φ: a_j(n) = ⟨f, φ_{j,n}⟩, with the orthonormal basis {φ_{j,n}}_{n∈Z};
2 a fine approximation (the details) with a wavelet function ψ: d_j(n) = ⟨f, ψ_{j,n}⟩, with the orthonormal basis {ψ_{j,n}}_{n∈Z}.
Wavelets in a basis set are composed of scaled and translated copies (daughterwavelets) of a primary basis function (mother wavelet).
Example (Haar wavelets [Haar, 1910])
Wavelet (mother) function:
ψ(t) = 1 if 0 ≤ t < 1/2; −1 if 1/2 ≤ t < 1; 0 otherwise
Scaling function: φ(t) = 1_[0,1]
The Haar basis is the family {ψ_{n,k}(t) = ψ(2^n t − k)}_{(n,k)∈Z²}.
[Figure: the first dilated and translated Haar wavelets ψ_{0,0}(t), ψ_{0,1}(t), ψ_{1,0}(t), ψ_{1,1}(t).]
A signal f ∈ L²(R) can be decomposed on this orthogonal basis {ψ_{n,k}}_{(n,k)∈Z²}:
f = Σ_{j=−∞}^{+∞} Σ_{n=−∞}^{+∞} ⟨f, ψ_{j,n}⟩ ψ_{j,n}
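One level of the (orthonormalized) Haar transform is just pairwise sums and differences; a short sketch showing perfect reconstruction and energy conservation:

```python
import numpy as np

# One Haar level: pairwise averages -> coarse approximation a,
#                 pairwise differences -> details d (both scaled by 1/sqrt(2)).
def haar_step(f):
    f = np.asarray(f, dtype=float)
    a = (f[0::2] + f[1::2]) / np.sqrt(2)   # scaling (low-pass) coefficients
    d = (f[0::2] - f[1::2]) / np.sqrt(2)   # wavelet (high-pass) coefficients
    return a, d

def inverse_haar_step(a, d):
    f = np.empty(2 * len(a))
    f[0::2] = (a + d) / np.sqrt(2)
    f[1::2] = (a - d) / np.sqrt(2)
    return f

f = np.array([4.0, 2.0, 5.0, 5.0])
a, d = haar_step(f)
assert np.allclose(inverse_haar_step(a, d), f)                 # reversible
assert np.allclose((a**2).sum() + (d**2).sum(), (f**2).sum())  # energy conserved
```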
2D wavelet orthogonal basis
Denition (2D DWT)
A wavelet basis in 2D (L²(R²)) is constructed with separable products of a 1D wavelet ψ and a scaling function φ. Three wavelets are defined:
ϕH(n1, n2) = φ(n1) ψ(n2)
ϕV(n1, n2) = ψ(n1) φ(n2)
ϕD(n1, n2) = ψ(n1) ψ(n2)
These wavelets extract image details at different scales and orientations.
The 2D wavelet transform provides, for each scale 2^−j and each point n = (n1, n2), the coefficients:
a_j(n) = ⟨f, φ²_{j,n}⟩ and d^k_j(n) = ⟨f, ψ^k_{j,n}⟩, with k ∈ {H, V, D}.
Remark: in practice, these coefficients are obtained by using 1D filters applied successively on the rows and then on the columns:
φ is equivalent to a low-pass filter;
ψ is equivalent to a high-pass filter.
2D wavelet orthogonal basis
[Diagram: one stage of the 2D filter bank. The rows of a_j (size Sx × Sy) are filtered by h (low-pass) and g (high-pass) and downsampled by 2 (size Sx × Sy/2); each result is then filtered on the columns by h and g and downsampled by 2, yielding the four subbands a_{j−1}, dH_j, dV_j, dD_j of size Sx/2 × Sy/2.]

Example (3-layer decomposition, CDF 9/7)
[Figure: the first and second iterations of the decomposition; at each level the horizontal (H), vertical (V) and diagonal (D) detail subbands, e.g. dH1, surround the coarse approximation in the (fu, fv) plane.]
2D DWT
Pro
Good decorrelation of the coefficients;
Good compaction of energy (easy to remove coefficients without visual annoyance);
A lot of almost-null coefficients for the highest frequencies, with the possibility to reorder the coefficients (run-length coding);
Used by JPEG2000.
Cons
??
Example (Coefficients of the filters)
Daubechies 5/3: 5 taps for the low-pass, −1, 2, 6, 2, −1, and 3 for the high-pass, 1, 2, 1;
Daubechies 9/7: low-pass 0.0378, −0.0238, −0.1106, 0.3774, 0.8527, 0.3774, ...; high-pass −0.0645, −0.0407, 0.4181, 0.7885, 0.4181, −0.0407, −0.0645.
Lifting scheme [Sweldens, 95]
Goal and principle
The purpose of the lifting scheme is to decompose a function into a sum of a coarseapproximation associated to a correction to the coarse representation.The decomposition is carried out by ltering alternatively the function at odd andeven locations. Interests are:
reversible operation;
simple process.
[Figure: lifting ladder. The input x is split into even samples xe and odd samples xo; the predict stages (P1, P2, ...) and update stages (U1, ...) alternate to produce s1, the coarse representation s, and d1, the fine representation d.]
d1 = xo − P1(xe)

s1 = xe + U1(d1)

Pi and Ui are the i-th predict and update stages, respectively. The number of predict and update stages depends on the filter's size.
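As a concrete instance, the Haar wavelet needs a single predict/update pair. A minimal sketch in Python (an assumed illustration, not the slide's notation): the predict stage P1 is the identity on the even sample, and the update stage U1 adds half the detail.

```python
def haar_lifting_forward(x):
    """One level of the Haar wavelet via lifting.

    Predict: d = x_odd - x_even   (P1(xe) = xe)
    Update:  s = x_even + d / 2   (U1(d) = d / 2, restores the average)
    """
    xe, xo = x[0::2], x[1::2]
    d = [o - e for e, o in zip(xe, xo)]        # detail (fine) coefficients
    s = [e + di / 2 for e, di in zip(xe, d)]   # approximation (coarse)
    return s, d

def haar_lifting_inverse(s, d):
    """Invert by running the same stages backwards with opposite signs."""
    xe = [si - di / 2 for si, di in zip(s, d)]
    xo = [e + di for e, di in zip(xe, d)]
    x = []
    for e, o in zip(xe, xo):
        x += [e, o]
    return x
```

Running the forward then inverse transform recovers the input exactly, which illustrates the "reversible operation" property.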
Motion estimation
1 Introduction
2 Entropy Coding
3 Other coding methods
4 Lossless vs lossy coding
5 Distortion/quality assessment
6 Quantization
7 Predictive Coding
8 Transform coding
9 Motion estimation: Introduction; Block Matching; Hierarchical block matching; Quality of a motion estimator
Introduction
Motivation
To deal with the temporal redundancy of a video sequence.
Motion estimation is a fundamental tool for a number of applications:
Data compression
Filtering such as noise reduction
Frame interpolation (upconversion 24 frames/sec to 60 frames/sec)
De-interlacing (conversion from interlaced to progressive video)
Fundamental assumption
An image of the sequence is defined by I(x, y, t), where (x, y) represents the spatial coordinates and t the time.
Fundamental assumption
The image intensity is conserved along motion trajectories:
I (x , y , t) = I (x + δx , y + δy , t + δt)
Classication:
Feature / Region Matching: the motion field is estimated by correlating features (edges, intensity...) from one frame to another (Block Matching, Phase correlation...);

Gradient-based methods: the motion field is estimated by using spatio-temporal gradients of the image intensity distribution (Pel-recursive method, the Horn-Schunck algorithm...).
Motion models

Motion models

2D translation (2 parameters): this model, dealing only with translation, is used in video coding (it works quite well because the motion between consecutive frames is rather small).

[u; v] = [x; y] + [dx; dy]

Affine model (6 parameters): translation, rotation, scaling and deformation are taken into account to evaluate the displacement.

[u; v] = [dxx, dxy; dyx, dyy] [x; y] + [dx; dy]

From these models, we can estimate:

the global apparent motion (dominant motion): a motion model is defined for the whole image (refers to the apparent motion of the background);

the local apparent motion: a motion model is defined for each pixel (or block) of the image.
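The two models can be written as one function. A small sketch (the function name is illustrative): with D the 2×2 deformation matrix, the identity matrix recovers the pure 2-parameter translation model.

```python
def affine_motion(x, y, D, t):
    """Apply the affine motion model: (u, v) = D (x, y)^T + (dx, dy).

    D is the 2x2 matrix ((dxx, dxy), (dyx, dyy)) capturing rotation,
    scaling and deformation; t = (dx, dy) is the translation.
    """
    (dxx, dxy), (dyx, dyy) = D
    u = dxx * x + dxy * y + t[0]
    v = dyx * x + dyy * y + t[1]
    return u, v
```

With D = ((1, 0), (0, 1)) the call reduces to (u, v) = (x + dx, y + dy), i.e. the translation model.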
Block Matching
Principle
1 Matching is performed by minimizing an error criterion;

2 A search procedure must be defined (exhaustive, non-exhaustive search...);

3 Regularization (smoothness constraint);

4 Spatial resolution of the motion vector (pel, 1/2-pel, 1/4-pel...).
Error criterion
Matching for a block Bi of size N×N

[Figure: block Bi of image t matched in image t−1; displacement ~V = (dx, dy)^T.]

The estimated displacement vector has to minimize a criterion f:
(dx∗, dy∗) = argmin_{(dx,dy)∈W} f(I(x, y, t), I(x + dx, y + dy, t − 1)), W ⊆ I.

MSE or MAD

MSE (Mean Square Error), L2-norm:

MSE_{dx,dy}(x, y) = (1/N²) Σ_{(k,l)∈Bi} (I(x + k, y + l, t) − I(x + k + dx, y + l + dy, t − 1))²

MAD (Mean Absolute Difference), L1-norm:

MAD_{dx,dy}(x, y) = (1/N²) Σ_{(k,l)∈Bi} |I(x + k, y + l, t) − I(x + k + dx, y + l + dy, t − 1)|

Notice that we made the assumption that the pixels of the block undergo a common displacement (coherence constraint).
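The MAD criterion can be sketched directly with NumPy (an assumed illustration; arrays are indexed [row, col], i.e. [y, x], and the displaced block is assumed to stay inside the image):

```python
import numpy as np

def mad(cur, ref, x, y, dx, dy, N):
    """Mean Absolute Difference between the N x N block of image t
    at (x, y) and the block of image t-1 displaced by (dx, dy)."""
    b_cur = cur[y:y + N, x:x + N].astype(np.int64)
    b_ref = ref[y + dy:y + dy + N, x + dx:x + dx + N].astype(np.int64)
    return np.abs(b_cur - b_ref).mean()
```

The MSE variant only replaces the absolute value by a square; MAD is usually preferred in hardware because it avoids the multiplication.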
Search procedure
Different procedures can be used, more or less complex, more or less efficient:

Full search;

Three step search [Koga et al.,81];

Two dimensional logarithmic search [Jain et al.,81];

Orthogonal search [Puri et al.,87];

One-at-a-Time search [Srinivasan,85];

Predictive motion vector field adaptive search technique (PMVFAST) [Tourapis,01].
Search procedure
Full search
A full search is an exhaustive search within the picture or within a predetermined window. The size of the predetermined window determines the maximum displacement range.

(dx∗, dy∗) = argmin_{(dx,dy)∈W} f(I(x, y, t), I(x + dx, y + dy, t − 1)), W ⊆ I.

Let W be a window of size (2N + 1) × (2M + 1); the number of computations needed to find the global minimum is (2N + 1) × (2M + 1).

[Figure: (2N+1) × (2M+1) search window; legend: original pixel (block); pixel (block) to match; best solution (global minimum). Every candidate is tested: OPTIMAL MATCH.]
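A minimal full search over a (2W + 1) × (2W + 1) window, using MAD as the criterion f (a sketch under the assumption that all displaced blocks stay inside the image):

```python
import numpy as np

def full_search(cur, ref, x, y, N, W):
    """Exhaustive block matching: test every displacement in the
    (2W+1) x (2W+1) window and return the one minimizing the MAD.
    By construction this finds the global minimum of the criterion."""
    def mad(dx, dy):
        b_cur = cur[y:y + N, x:x + N].astype(np.int64)
        b_ref = ref[y + dy:y + dy + N, x + dx:x + dx + N].astype(np.int64)
        return np.abs(b_cur - b_ref).mean()

    candidates = [(dx, dy) for dx in range(-W, W + 1)
                           for dy in range(-W, W + 1)]
    return min(candidates, key=lambda v: mad(*v))
```

The cost grows with the window area, which is exactly why the sub-optimal procedures of the next slides exist.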
Search procedure
Three step search [Koga et al.,81]
This algorithm is based on a coarse-to-fine process. Three steps are used:

1 Initial search on 8 pixels at a given distance from the centre;

2 The distance is halved and the centre is moved to the pixel that minimizes the matching criterion;

3 Repeat steps 1 and 2 until the step size is smaller than a given threshold.

[Figure: steps 1, 2 and 3; the procedure may end in a SUBOPTIMAL MATCH (in this case).]
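The three steps can be sketched as follows, with cost(dx, dy) the matching criterion (MAD, say); the starting step of 4 is the classic choice, giving exactly three iterations (4, 2, 1):

```python
def three_step_search(cost, step=4):
    """Coarse-to-fine search: evaluate the centre and its 8 neighbours
    at the current step distance, move the centre to the best candidate,
    halve the step, and repeat until the step reaches 0."""
    cx, cy = 0, 0
    while step >= 1:
        candidates = [(cx + i * step, cy + j * step)
                      for i in (-1, 0, 1) for j in (-1, 0, 1)]
        cx, cy = min(candidates, key=lambda v: cost(*v))
        step //= 2
    return cx, cy
```

Only 3 × 9 = 27 evaluations instead of 81 for a full search over the same ±7 range, at the price of possibly missing the global minimum.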
Search procedure
Variations of the three step search
2D logarithmic search

1 Four pixels (on the cardinal axes) are considered at a given distance (first and second iterations);

2 If the position of best match is at the centre, halve the distance. Otherwise, repeat step 1 from the best candidate;

3 When the distance is 1, all nine blocks around the centre are tested (last iteration).

[Figure: legend: original pixel (block); pixel (block) to match; best solution (global minimum).]
Search procedure
Variations of the three step search
Orthogonal search

1 Two pixels are chosen at a given distance in the horizontal direction (first and 3rd iterations) and the point minimizing the matching criterion is chosen;

2 Take two points in the vertical direction and find the minimum (second and 4th iterations);

3 Halve the distance and go back to 1 if the distance is greater than one. Otherwise, stop (last iteration).

[Figure: legend: original pixel (block); pixel (block) to match; best solution (global minimum).]
Search procedure
Variations of the three step search
One-at-a-time search

1 Two pixels are chosen on either side of the original pixel; the point minimizing the matching criterion is chosen, which defines a direction (right or left);

2 Continue in this direction while the distortion is smaller than that of the previous candidate;

3 Repeat steps 1 and 2 along the vertical axis.

[Figure: legend: original pixel (block); pixel (block) to match; best solution (global minimum).]
Search procedure
PMVFAST, Predictive motion vector field adaptive search technique [Tourapis,01]

[Figure: the predictor ~Vp is built from the vectors ~Va, ~Vb, ~Vc, ~Vd of the spatial neighbours in frame t and the co-located vector ~Ve in frame t − 1.]

~Vp = [MED(vax, vbx, vcx, vdx, vex); MED(vay, vby, vcy, vdy, vey)]

The search around ~Vp then alternates between a small diamond and a large diamond pattern.
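The component-wise median predictor above can be sketched as (function name illustrative):

```python
from statistics import median

def median_predictor(candidates):
    """~Vp: component-wise median of the candidate motion vectors
    (~Va..~Vd from the spatial neighbours in frame t and the
    co-located ~Ve from frame t-1)."""
    return (median(v[0] for v in candidates),
            median(v[1] for v in candidates))
```

The median makes the predictor robust to one outlier vector among the candidates, which is why it is used to initialise the search.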
Regularization
Goal and principle
Regularization is used to improve the robustness and the consistency of the motion field (depending or not on a targeted application).

(dx∗, dy∗) = argmin_{(dx,dy)∈W} f(I(x, y, t), I(x + dx, y + dy, t − 1)) + λ × g(.), W ⊆ I.

The first term is related to the residual from the optical flow equation;

The second is the regularization term:
→ Hidden variables of interest (extracted from the data);
→ Prior knowledge about the application / target.

Example used in a compression scheme:

(dx∗, dy∗) = argmin_{(dx,dy)∈W} f(I(x, y, t), I(x + dx, y + dy, t − 1)) + λ × C(~V_{x,y}, ~V_{x−1,y})

with ~V_{x,y} the motion vector at (x, y).
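A toy illustration of the effect of the λ term, taking C as the L1 distance between the candidate vector and its left neighbour (one possible choice; the slide leaves C generic):

```python
def regularized_cost(f, v, v_left, lam):
    """Data term f(v) plus lambda times a smoothness penalty C,
    here the L1 distance to the neighbouring vector v_left."""
    return f(v) + lam * (abs(v[0] - v_left[0]) + abs(v[1] - v_left[1]))

# Two candidates give the same data cost; the regularization term
# favours the one closest to the neighbouring vector.
f = lambda v: 0.0 if v in [(5, 0), (1, 0)] else 10.0
best = min([(5, 0), (1, 0), (2, 2)],
           key=lambda v: regularized_cost(f, v, (0, 0), 0.5))
```

This is exactly the ambiguity-resolution role the slide describes: when the data term alone cannot decide, the smoothness prior does.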
Spatial resolution of the motion vector: 1/2-pel

Luminance sample interpolation

The displacement estimate can be refined to subpixel accuracy (1/2-pel, 1/4-pel).

The prediction picture is refined to subpixel accuracy.

[Figure: integer pixels A, B, C, D (PEL) and the half-pel positions e (between A and B), f (between A and C) and g (centre).]

e = (A + B + 1)/2

f = (A + C + 1)/2

g = (A + B + C + D + 2)/4

For H.264/AVC, half-sample positions are obtained by applying a 6-tap filter (1, −5, 20, 20, −5, 1).
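The three half-pel formulas in integer arithmetic (the +1 and +2 terms implement rounding before the integer division):

```python
def half_pel(A, B, C, D):
    """Bilinear half-sample values from the four surrounding
    integer pixels A (top-left), B, C and D."""
    e = (A + B + 1) // 2           # between A and B (horizontal)
    f = (A + C + 1) // 2           # between A and C (vertical)
    g = (A + B + C + D + 2) // 4   # centre (diagonal) position
    return e, f, g
```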
Spatial resolution of the motion vector: 1/4-pel

Luminance sample interpolation

Once the half-pel samples are available, the samples at quarter-pel positions are produced by bilinear interpolation.

[Figure: quarter-pel positions ga, gb, ..., gh around the integer (PEL) and half-pel (1/2 PEL) samples.]

For H.264/AVC, quarter-sample positions are obtained by applying a bilinear filter.
Hierarchical block matching
Goal and principle
The basic idea of hierarchical block matching is to perform motion estimation at each level of a pyramid. The estimation starts at the lowest resolution.

[Figure: pyramid from low resolution / low frequencies (rough estimate of the motion information) to high resolution / high frequencies (accurate estimate of the motion information).]
Hierarchical block matching
Components of a HME
1 Pyramid construction

2 Motion estimation (different distances can be used)

3 Coarse-to-fine refinement:

Coarse (lowest resolution): dominant motion estimation, propagated to the next higher level of the pyramid

Fine (highest resolution): local motion; the previously estimated motion is locally refined
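A minimal pyramid construction by 2×2 averaging (one simple choice of low-pass filtering before subsampling; real codecs may use a better filter):

```python
import numpy as np

def build_pyramid(img, levels):
    """List of images from full resolution down to the coarsest level,
    each level obtained by averaging 2x2 blocks of the previous one
    (image sides are assumed divisible by 2 at each level)."""
    pyr = [img.astype(np.float64)]
    for _ in range(levels - 1):
        p = pyr[-1]
        pyr.append((p[0::2, 0::2] + p[0::2, 1::2] +
                    p[1::2, 0::2] + p[1::2, 1::2]) / 4.0)
    return pyr
```

Motion estimation then starts on pyr[-1] (the coarsest level) and the vectors are scaled up and refined at each finer level.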
Hierarchical block matching
Pros and cons
Pros:

Significantly reduced computational load (complexity);

Quite good estimation.

Cons:

Difficult to estimate the motion of small objects;

The initialization step is very important;

Storage resources (to keep the pictures at different resolutions).
Quality of a motion estimator
Different parameters... more or less important for a given application:

Energy of the Displaced Frame Difference (DFD) image;

Entropy of the DFD frame;

Spatial uniformity of the motion vectors (at least over the same moving areas).

For up-conversion, the quality of the reconstructed frame is fundamental.

Ambiguities resulting from intensity conservation:

[Figure: point p in I(t) with two candidate vectors ~V1 and ~V2 towards I(t + 1).]

For the point p, there are two candidates, namely ~V1 and ~V2, that provide the same DFD. To deal with that, we can use an a priori smoothness constraint to resolve the ambiguity in favor of ~V2. But it is not so easy...
Quality of a motion estimator
Example (Backward estimation, 5 levels)
Summary
1 Introduction
2 Entropy Coding
3 Other coding methods
4 Lossless vs lossy coding
5 Distortion/quality assessment
6 Quantization
7 Predictive Coding
8 Transform coding
9 Motion estimation
List of compression tools
Lossless encoding tools
Entropy coding: Huffman, arithmetic, Fano-Shannon;

Lempel-Ziv-Welch and run-length coding.

Lossy encoding tools for reducing the amount of information

Quantization: scalar quantizer and vector quantizer.

Lossless tools to increase the efficiency of the aforementioned tools

Prediction (intra, inter-frame): encode the prediction error (fewer bits);

Transform the image into a new domain (better compaction of the energy).
Suggestion for further reading...
[Daly,93] S. Daly. The visible differences predictor: An algorithm for the assessment of image fidelity. Digital Images and Human Vision, pp. 179-206, 1993, MIT Press.

[Haar, 1910] A. Haar. Zur Theorie der orthogonalen Funktionensysteme. Mathematische Annalen, 69, pp. 331-371, 1910.

[Jain et al.,81] J. R. Jain and A. K. Jain. Displacement Measurement and Its Application in Interframe Image Coding. IEEE Trans. Commun., Vol. COM-29, 1981.

[Koga et al.,81] T. Koga, K. Iinuma, A. Hirano, Y. Iijima, and T. Ishiguro. Motion Compensated Interframe Coding for Video Conferencing. In Proc. Nat. Telecommun. Conf., 1981.

[Linde et al.,80] Y. Linde, A. Buzo, R. Gray. An Algorithm for Vector Quantizer Design. IEEE Transactions on Communications, Vol. 28, pp. 84-94, 1980.

[Lloyd,82] S. P. Lloyd. Least squares quantization in PCM. Institute of Mathematical Statistics Meeting, Atlantic City, NJ, September 1957; IEEE Transactions on Information Theory, pp. 129-136, March 1982.

[Mallat,98] S. Mallat. A wavelet tour of signal processing. Academic Press, 1998.

[Max,60] J. Max. Quantizing for minimum distortion. IRE Trans. Information Theory, IT-6, pp. 7-12, 1960.
[Nadenau,00] M. Nadenau. Integration of human color vision models into high quality image compression. PhD thesis, EPFL, 2000.

[Ninassi et al.,08a] A. Ninassi, O. Le Meur, P. Le Callet, and D. Barba. On the performance of human visual system based image quality assessment metric using wavelet domain. Proc. SPIE Human Vision and Electronic Imaging XIII, Vol. 6806, pp. 680610-12, 2008.

[Ninassi et al.,08b] A. Ninassi, O. Le Meur, P. Le Callet and D. Barba. Which Semi-Local Visual Masking Model For Wavelet Based Image Quality Metric? ICIP 2008, San Diego, California, USA, 2008.

[Ninassi et al.,09] A. Ninassi, O. Le Meur, P. Le Callet and D. Barba. Considering the temporal variations of spatial visual distortions in video quality assessment. IEEE Signal Processing, special issue on visual media quality assessment, 2009.

[Pinson et al.,04] M. Pinson and S. Wolf. A new standardized method for objectively measuring video quality. IEEE Trans. Broadcasting, Vol. 50, N. 3, pp. 312-322, 2004.

[Puri et al.,87] A. Puri, H. M. Hang, D. L. Schilling. An efficient block-matching algorithm for motion compensated coding. International Conference on Acoustics, Speech and Signal Processing, 1987.

[Srinivasan,85] R. Srinivasan and K. R. Rao. Predictive coding based on efficient motion estimation. IEEE Transactions on Communications, Vol. COM-33, No. 8, pp. 888-896, 1985.
[Sweldens, 95] W. Sweldens. The lifting scheme: a new philosophy in biorthogonal wavelet constructions. Wavelet Applications in Signal and Image Processing III, SPIE 2569, pp. 68-79, 1995.

[Tourapis,01] A. Tourapis, C. Oscar and L. Ming. Predictive motion vector field adaptive search technique (PMVFAST): enhancing block-based motion estimation. Proc. SPIE Vol. 4310, pp. 883-892, Visual Communications and Image Processing, 2001.

[Wang et al.,04a] Z. Wang, A. C. Bovik, H. R. Sheikh and E. P. Simoncelli. Image quality assessment: from error visibility to structural similarity. IEEE Trans. on Image Processing, Vol. 13, pp. 600-612, 2004.

[Wang et al.,04b] Z. Wang, L. Lu and A. Bovik. Video quality assessment based on structural distortion measurement. Signal Processing: Image Communication, Vol. 19, N. 1, 2004.

[Welch,84] T. A. Welch. A technique for high performance data compression. IEEE Computer, Vol. 17, N. 6, pp. 8-19, 1984.

[Witten et al.,87] I. H. Witten, R. M. Neal and J. G. Cleary. Arithmetic coding for data compression. Communications of the ACM, 30(6), pp. 520-540, 1987.