Introduction to JPEG and MPEG Ingemar J. Cox University College London.
Introduction to JPEG and MPEG
Ingemar J. Cox
University College London
Nov 27th 2006, Ingemar J. Cox, UCL Adastral Park Postgraduate Campus
Outline
Elementary information theory
Lossless compression
Quantization
Fundamentals of images
Discrete Cosine Transform (DCT)
JPEG
MPEG-1, MPEG-2
Bibliography
D. MacKay, “Information Theory, Inference, and Learning Algorithms”, Cambridge University Press, 2003. http://www.inference.phy.cam.ac.uk/itprnn/book.html
W. B. Pennebaker and J. L. Mitchell, “JPEG Still Image Data Compression Standard”, Chapman & Hall, 1993 (ISBN 0-442-01272-1).
G. K. Wallace, “The JPEG Still-Picture Compression Standard”, IEEE Trans. on Consumer Electronics, 38, 1, 18-34, 1992.
http://en.wikipedia.org/wiki/JPEG
http://en.wikipedia.org/wiki/MPEG-2
T. Sikora, “MPEG Digital Video-Coding Standards”, IEEE Signal Processing Magazine, 82-100, September 1997
Elementary Information Theory
How much information does a symbol convey?
Intuitively, the more unpredictable or surprising it is, the more information is conveyed.
Conversely, if we strongly expected something, and it occurs, we have not learnt very much
If p is the probability that a symbol will occur
Then the amount of information, I, conveyed is:
$$I = \log_2 \frac{1}{p}$$
The information, I, is measured in bits
It is the optimum code length for the symbol
The entropy, H, is the average information per symbol
$$H = \sum_s p(s) \log_2 \left( \frac{1}{p(s)} \right)$$
Provides a lower bound on the average number of bits per symbol that lossless compression can achieve
A simple example. Suppose we need to transmit four possible weather conditions:
1. Sunny
2. Cloudy
3. Rainy
4. Snowy
If all conditions are equally likely, p(s) = 0.25 and H = 2, i.e. we need a minimum of 2 bits per symbol
Suppose instead that it is:
1. Sunny 0.5 of the time
2. Cloudy 0.25 of the time
3. Rainy 0.125 of the time, and
4. Snowy 0.125 of the time
Then the entropy is
$$H = 0.5 \log_2 \frac{1}{0.5} + 0.25 \log_2 \frac{1}{0.25} + 2 \times 0.125 \log_2 \frac{1}{0.125}$$
$$H = 0.5 \times 1 + 0.25 \times 2 + 2 \times 0.125 \times 3$$
$$H = 0.5 + 0.5 + 0.75 = 1.75$$
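The two weather examples can be checked numerically. The following is a minimal sketch; the function name `entropy` and the variable names are chosen here for illustration and are not from the slides.

```python
import math

def entropy(probabilities):
    """Average information per symbol, in bits: sum of p * log2(1/p)."""
    return sum(p * math.log2(1.0 / p) for p in probabilities if p > 0)

uniform = [0.25, 0.25, 0.25, 0.25]   # all four conditions equally likely
skewed = [0.5, 0.25, 0.125, 0.125]   # sunny, cloudy, rainy, snowy

print(entropy(uniform))  # 2.0
print(entropy(skewed))   # 1.75
```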
Variable length codewords
Huffman code – integer code lengths
Arithmetic codes – non-integer code lengths
Huffman code
Weather   Probability   Information (bits)   Integer code
Sunny     0.5           1                    0
Cloudy    0.25          2                    10
Rainy     0.125         3                    110
Snowy     0.125         3                    111
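The code lengths in the table can be reproduced with the usual Huffman construction, which repeatedly merges the two least probable subtrees. This is only a sketch: the helper name `huffman_code_lengths` is hypothetical, and it computes code lengths rather than the exact bit patterns, since those depend on how 0 and 1 are assigned at each merge.

```python
import heapq

def huffman_code_lengths(probabilities):
    """Return {symbol: code length in bits} for a Huffman code."""
    # Heap entries: (probability, unique tie-breaker, symbols in this subtree).
    heap = [(p, i, [s]) for i, (s, p) in enumerate(probabilities.items())]
    heapq.heapify(heap)
    lengths = {s: 0 for s in probabilities}
    counter = len(heap)
    while len(heap) > 1:
        p1, _, group1 = heapq.heappop(heap)
        p2, _, group2 = heapq.heappop(heap)
        # Merging two subtrees adds one bit to every symbol beneath them.
        for s in group1 + group2:
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, counter, group1 + group2))
        counter += 1
    return lengths

weather = {"Sunny": 0.5, "Cloudy": 0.25, "Rainy": 0.125, "Snowy": 0.125}
lengths = huffman_code_lengths(weather)
average_length = sum(weather[s] * lengths[s] for s in weather)
print(lengths)
print(average_length)  # 1.75, equal to the entropy for these probabilities
```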
The previous illustration is an example of a lossless code, i.e. we are able to recover the information exactly
Note that we have assumed that each symbol is independent of the other symbols, i.e. the current symbol provides no information regarding the next symbol
Quantization
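Quantization is the process of approximating a continuous range of values (or a large set of values) by a (much) smaller set of values
$$Q(x, \Delta) = \mathrm{Round}\left( \frac{x}{\Delta} \right), \qquad \mathrm{Round}(y) = \lfloor y + 0.5 \rfloor$$
Where Round(y) rounds y to the nearest integer
Δ is the quantization step size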
Example: Δ = 2
Input values:           -5 -4 -3 -2 -1  0  1  2  3  4  5
Quantized values:       -2 -1  0  1  2
Reconstructed values:   -4 -2  0  2  4
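A minimal sketch of a uniform quantizer with step size 2 follows. It assumes rounding to the nearest integer is implemented as floor(y + 0.5); the function names `quantize` and `dequantize` and the sample inputs are illustrative only.

```python
import math

def quantize(x, step):
    """Index of the quantization bin: x / step rounded to the nearest integer."""
    return math.floor(x / step + 0.5)

def dequantize(index, step):
    """Reconstructed value for a bin index."""
    return index * step

step = 2
for x in [-3.2, -0.9, 0.4, 1.7, 4.1]:
    q = quantize(x, step)
    # Each line shows: input, quantized index, reconstructed value.
    print(x, q, dequantize(q, step))
```

The reconstructed value differs from the input by at most half a step, which is the loss introduced by quantization.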
Quantization plays an important role in lossy compression
  • This is where the loss happens
Fundamentals of Images
An image consists of pixels (picture elements)
Each pixel represents luminance (and colour)
  • Typically, 8 bits per pixel
Colour
  Colour spaces (representations)
    • RGB (red-green-blue)
    • CMY (cyan-magenta-yellow)
    • YUV
        • Y = 0.3R + 0.6G + 0.1B (luminance)
        • U = B - Y
        • V = R - Y
Greyscale
Binary
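As a rough sketch of the YUV representation, the conversion below uses the rounded luminance weights from the slide and the conventional assignment of U to the blue difference (B - Y) and V to the red difference (R - Y). Broadcast standards also scale U and V; those scale factors are omitted here, and the function names are hypothetical.

```python
def rgb_to_yuv(r, g, b):
    """Luminance plus two colour-difference signals (unscaled)."""
    y = 0.3 * r + 0.6 * g + 0.1 * b
    u = b - y   # blue difference
    v = r - y   # red difference
    return y, u, v

def yuv_to_rgb(y, u, v):
    """Invert rgb_to_yuv: recover R and B directly, then solve for G."""
    b = u + y
    r = v + y
    g = (y - 0.3 * r - 0.1 * b) / 0.6
    return r, g, b

print(rgb_to_yuv(128, 128, 128))  # a grey pixel: U and V are (close to) zero
print(yuv_to_rgb(*rgb_to_yuv(200, 50, 25)))
```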
A TV frame is about 640×480 pixels
If each pixel is represented by 8 bits for each colour, then the total image size is 640×480×3 = 921,600 bytes or 7.4 Mbits
At 30 frames per second, this would be about 220 Mbits/second
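The arithmetic behind these figures, written out as a short sketch (variable names are illustrative):

```python
width, height = 640, 480
bytes_per_pixel = 3          # 8 bits for each of three colour components
frames_per_second = 30

frame_bytes = width * height * bytes_per_pixel
frame_bits = frame_bytes * 8
bits_per_second = frame_bits * frames_per_second

print(frame_bytes)            # 921600 bytes per frame
print(frame_bits / 1e6)       # 7.3728 Mbits per frame
print(bits_per_second / 1e6)  # 221.184 Mbits per second
```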
Do we need all these bits?
Here is an image represented with 8 bits per pixel
Here is the same image at 7 bits per pixel
And at 6 bits per pixel
And at 5 bits per pixel
And at 4 bits per pixel
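One simple way to produce such reduced-precision images is to discard the least significant bits of each 8-bit pixel. The slides do not say how their images were generated, so the sketch below is an assumption, and `reduce_bit_depth` is a hypothetical helper.

```python
def reduce_bit_depth(pixel, bits):
    """Keep only the `bits` most significant bits of an 8-bit pixel value."""
    shift = 8 - bits
    return (pixel >> shift) << shift

row = [0, 37, 100, 200, 255]   # sample 8-bit pixel values
for bits in (8, 7, 6, 5, 4):
    print(bits, [reduce_bit_depth(p, bits) for p in row])
```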
Do we need all these bits? No!
The previous example illustrated the eye’s sensitivity to luminance
We can build a perceptual model
  • Only code what is important to the human visual system (HVS)
  • Usually a function of spatial frequency
Just as audio has temporal frequencies, images have spatial frequencies
Transforms
  • Fourier transform
  • Discrete cosine transform
  • Wavelet transform
  • Hadamard transform
Discrete cosine transform
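Forward DCT
$$S(u) = \frac{C(u)}{2} \sum_{n=0}^{N-1} s(n) \cos\left( \frac{\pi u}{8} (n + 0.5) \right)$$
Inverse DCT
$$s(n) = \sum_{u=0}^{N-1} \frac{C(u)}{2} S(u) \cos\left( \frac{\pi u}{8} (n + 0.5) \right)$$
(N = 8; C(0) = 1/√2 and C(u) = 1 otherwise)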
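A direct, unoptimized sketch of the 8-point forward and inverse DCT follows, assuming C(0) = 1/√2 and C(u) = 1 otherwise. The function names and the sample signal are illustrative, not taken from the slides.

```python
import math

N = 8

def c(u):
    """Normalization factor: 1/sqrt(2) for the DC term, 1 otherwise."""
    return 1 / math.sqrt(2) if u == 0 else 1.0

def dct(s):
    """Forward 8-point DCT of the sequence s."""
    return [c(u) / 2 * sum(s[n] * math.cos(math.pi * u * (n + 0.5) / N)
                           for n in range(N))
            for u in range(N)]

def idct(S):
    """Inverse 8-point DCT of the coefficients S."""
    return [sum(c(u) / 2 * S[u] * math.cos(math.pi * u * (n + 0.5) / N)
                for u in range(N))
            for n in range(N)]

signal = [1, 2, 3, 4, 4, 3, 2, 1]   # hypothetical test signal
coefficients = dct(signal)
recovered = idct(coefficients)
print([round(x, 4) for x in coefficients])
print([round(x, 4) for x in recovered])
```

With this normalization the transform is orthonormal, so applying `idct` to the output of `dct` returns the original samples up to floating-point error.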
Basis functions
[Figures: DC term, then the first to seventh terms, one per slide]
DCT Example
[Figure: Signal]
DCT coefficients are: 4.2426 0 -3.1543 0 0 0 -0.2242 0
Example: DCT decomposition
[Figures: DC term, 2nd AC term, 6th AC term]
Example: summation of DCT terms
[Figures: first two non-zero coefficients; all 3 non-zero coefficients]
What if we quantize the DCT coefficients? Δ = 1
Quantized DCT coefficients are: 4 0 -3 0 0 0 0 0
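The effect of this quantization can be sketched by inverting both the original and the quantized coefficients and comparing the results. The code assumes the 8-point inverse DCT with C(0) = 1/√2 and C(u) = 1 otherwise, and rounding implemented as floor(y + 0.5); all names are illustrative.

```python
import math

def idct(S):
    """Inverse 8-point DCT (C(0) = 1/sqrt(2), C(u) = 1 otherwise)."""
    def c(u):
        return 1 / math.sqrt(2) if u == 0 else 1.0
    return [sum(c(u) / 2 * S[u] * math.cos(math.pi * u * (n + 0.5) / 8)
                for u in range(8))
            for n in range(8)]

coefficients = [4.2426, 0, -3.1543, 0, 0, 0, -0.2242, 0]   # from the slide
step = 1
quantized = [math.floor(x / step + 0.5) for x in coefficients]
print(quantized)  # [4, 0, -3, 0, 0, 0, 0, 0]

exact = idct(coefficients)
approximate = idct([q * step for q in quantized])
max_error = max(abs(a - b) for a, b in zip(exact, approximate))
print(max_error)  # largest per-sample difference caused by quantization
```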
[Figures: approximate reconstruction; exact reconstruction]
2-D DCT Transform
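Let i(x,y) represent an image with N rows and M columns
Its DCT I(u,v) is given by
$$I(u,v) = \frac{1}{4} C(u) C(v) \sum_{x=1}^{M} \sum_{y=1}^{N} i(x,y) \cos\frac{(2x-1) u \pi}{16} \cos\frac{(2y-1) v \pi}{16}$$
where
$$C(0) = \frac{1}{\sqrt{2}}, \qquad C(u) = 1 \text{ otherwise}$$
(the factors 1/4 and 16 correspond to an 8×8 block, N = M = 8)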
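A direct sketch of the 8×8 two-dimensional DCT is given below. It uses 0-based indices, so the cosine argument is written with (2x + 1) rather than the 1-based form; the function names and the flat test block are assumptions for illustration. Real codecs use fast, separable implementations rather than this quadruple loop.

```python
import math

def c(k):
    """Normalization factor: 1/sqrt(2) for index 0, 1 otherwise."""
    return 1 / math.sqrt(2) if k == 0 else 1.0

def dct2(block):
    """2-D DCT of an 8x8 block given as a list of 8 rows of 8 values."""
    out = [[0.0] * 8 for _ in range(8)]
    for u in range(8):
        for v in range(8):
            total = 0.0
            for x in range(8):
                for y in range(8):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            out[u][v] = 0.25 * c(u) * c(v) * total
    return out

flat_block = [[128] * 8 for _ in range(8)]   # uniform grey block
result = dct2(flat_block)
print(round(result[0][0], 4))  # 1024.0: all the energy is in the DC term
```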
Discrete cosine transform
  • Coefficients are approximately uncorrelated
      • Except DC term
      • C.f. original 8×8 pixel block
  • Concentrates more power in the low frequency coefficients
  • Computationally efficient
Block-based DCT
  • Compute DCT on 8×8 blocks of pixels
Basis functions for the 8×8 DCT (courtesy Wikipedia)
Fundamentals of JPEG
Encoder: DCT → Quantizer → Entropy coder → Compressed image data
Decoder: Compressed image data → Entropy decoder → Dequantizer → IDCT
JPEG works on 8×8 blocks
1. Extract 8×8 block of pixels
2. Convert to DCT domain
3. Quantize each coefficient
   • Different step size for each coefficient
   • Based on sensitivity of human visual system
4. Order coefficients in zig-zag order
5. Entropy code the quantized values
A common quantization table is
16 11 10 16 24 40 51 61
12 12 14 19 26 58 60 55
14 13 16 24 40 57 69 56
14 17 22 29 51 87 80 62
18 22 37 56 68 109 103 77
24 35 55 64 81 104 113 92
49 64 78 87 103 121 120 101
72 92 95 98 112 100 103 99
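The table above can be applied to a block of DCT coefficients as follows. This is a sketch only: the coefficient values are made up for illustration, and rounding is assumed to be floor(y + 0.5).

```python
import math

QUANT_TABLE = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]

def quantize_block(coefficients, table=QUANT_TABLE):
    """Divide each coefficient by its own step size and round to nearest."""
    return [[math.floor(coefficients[r][c] / table[r][c] + 0.5)
             for c in range(8)] for r in range(8)]

def dequantize_block(indices, table=QUANT_TABLE):
    """Multiply each quantized value back by its step size."""
    return [[indices[r][c] * table[r][c] for c in range(8)] for r in range(8)]

block = [[0.0] * 8 for _ in range(8)]   # hypothetical DCT coefficients
block[0][0] = 160.0    # DC term
block[0][1] = -23.0    # low-frequency term, fine step size
block[7][7] = 30.0     # high-frequency term, coarse step size

quantized = quantize_block(block)
print(quantized[0][0], quantized[0][1], quantized[7][7])  # 10 -2 0
```

The high-frequency coefficient is quantized to zero, which is what makes the later run-length coding effective.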
Zig-zag ordering
0 1 5 6 14 15 27 28
2 4 7 13 16 26 29 42
3 8 12 17 25 30 41 43
9 11 18 24 31 40 44 53
10 19 23 32 39 45 52 54
20 22 33 38 46 51 55 60
21 34 37 47 50 56 59 61
35 36 48 49 57 58 62 63
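The ordering in the table can be generated by walking the anti-diagonals of the block and alternating direction on each one. The sketch below is one way to do this; the function names are illustrative.

```python
def zigzag_positions(size=8):
    """(row, column) positions of a size x size block in zig-zag order."""
    positions = [(r, c) for r in range(size) for c in range(size)]
    # Positions on the same anti-diagonal share r + c. Odd diagonals are
    # walked with the row increasing, even diagonals with the column increasing.
    return sorted(positions,
                  key=lambda rc: (rc[0] + rc[1],
                                  rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))

def zigzag_index_table(size=8):
    """Table whose entry (r, c) is the position of that cell in the scan."""
    table = [[0] * size for _ in range(size)]
    for index, (r, c) in enumerate(zigzag_positions(size)):
        table[r][c] = index
    return table

def zigzag_scan(block):
    """Flatten a square block into a list in zig-zag order."""
    return [block[r][c] for r, c in zigzag_positions(len(block))]

for row in zigzag_index_table():
    print(row)
```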
Entropy coding
  • Run length encoding followed by
      • Huffman, or
      • Arithmetic coding
  • DC term treated separately
      • Differential Pulse Code Modulation (DPCM)
  • 2-step process
      1. Convert zig-zag sequence to a symbol sequence
      2. Convert symbols to a data stream
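The first of the two steps can be sketched as below. This is a simplified illustration, not the exact JPEG symbol format: real JPEG symbols also carry a size category and limit the run length, both of which are omitted here. The function names and sample values are hypothetical.

```python
def dpcm_encode(dc_values):
    """Code each block's DC term as the difference from the previous block's."""
    previous = 0
    differences = []
    for dc in dc_values:
        differences.append(dc - previous)
        previous = dc
    return differences

def run_length_encode(ac_values):
    """Turn zig-zag ordered AC terms into (zero run, value) symbols."""
    symbols = []
    run = 0
    for value in ac_values:
        if value == 0:
            run += 1
        else:
            symbols.append((run, value))
            run = 0
    if run > 0:
        symbols.append("EOB")   # trailing zeros collapse into end-of-block
    return symbols

print(dpcm_encode([52, 55, 50]))                      # [52, 3, -5]
print(run_length_encode([-3, 0, 0, 2] + [0] * 59))    # [(0, -3), (2, 2), 'EOB']
```

The resulting symbols would then be passed to a Huffman or arithmetic coder to produce the data stream.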
Modes
  • Sequential
  • Progressive
      • Spectral selection: send lower frequency coefficients first
      • Successive approximation: send lower precision first, and subsequently refine
  • Lossless
  • Hierarchical
      • Send low resolution image first
Fundamentals of MPEG-1/2
A sequence of 2D images
Temporal correlation as well as spatial correlation
TV broadcast
  • Frame-based
  • Field-based
MPEG
Moving Picture Experts Group
Standard for video compression
Similarities with JPEG
Design is a compromise between
  • Bit rate
  • Encoder/decoder complexity
  • Random access capability
Images
  • Spatial redundancy
  • Perceptual redundancy
Video
  • Spatial redundancy
      • Intraframe coding
  • Temporal redundancy
      • Interframe coding
  • Perceptual redundancy
Consider a sequence of n frames of video.
It consists of:
  • I-frames
  • P-frames
  • B-frames
A sequence of one I-frame followed by P- and B-frames is known as a GOP (Group of Pictures)
  • E.g. IBBPBBPBBPBBP
I-frames
  • Intraframe coded
  • No motion compensation
P-frames
  • Interframe coded
  • Motion compensation
      • Based on past frames only
B-frames
  • Interframe coded
  • Motion compensation
      • Based on past and future frames
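Because a B-frame depends on a future reference frame, that reference has to be coded and sent before the B-frames that use it. The sketch below shows one common way to derive such a coding order from the display order; it is an illustration under that assumption rather than a statement of what any particular encoder does, and `coding_order` is a hypothetical name.

```python
def coding_order(display_order):
    """Reorder frames so each I- or P-frame precedes the B-frames before it."""
    coded = []
    pending_b = []
    for index, frame_type in enumerate(display_order):
        if frame_type == "B":
            pending_b.append((index, frame_type))   # wait for the next reference
        else:
            coded.append((index, frame_type))       # send the reference first
            coded.extend(pending_b)
            pending_b = []
    coded.extend(pending_b)   # any trailing B-frames without a later reference
    return coded

order = coding_order("IBBPBBPBBPBBP")
print("".join(frame_type for _, frame_type in order))  # IPBBPBBPBBPBB
print([index for index, _ in order][:4])                # [0, 3, 1, 2]
```

The decoder must hold the reference frames and restore display order, which is where the extra buffering and delay come from.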
Motion-compensated prediction
  • Divide current frame, i, into disjoint 16×16 macroblocks
  • Search a window in previous frame, i-1, for closest match
  • Calculate the prediction error
  • For each of the four 8×8 blocks in the macroblock, perform DCT-based coding
  • Transmit motion vector + entropy coded prediction error (lossy coding)
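The search step can be sketched as an exhaustive block match. The standards do not mandate a particular search strategy or matching criterion, so the full search and the sum of absolute differences (SAD) used here are assumptions, as are the function names and the small synthetic frames.

```python
def sad(current, reference, top, left, dy, dx, size):
    """Sum of absolute differences between a block and a displaced block."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(current[top + y][left + x]
                         - reference[top + dy + y][left + dx + x])
    return total

def find_motion_vector(current, reference, top, left, size=16, search=7):
    """Try every displacement within +/- search pixels; keep the lowest SAD."""
    height, width = len(reference), len(reference[0])
    best_vector, best_cost = None, None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            if not (0 <= top + dy and top + dy + size <= height):
                continue   # candidate block falls outside the frame
            if not (0 <= left + dx and left + dx + size <= width):
                continue
            cost = sad(current, reference, top, left, dy, dx, size)
            if best_cost is None or cost < best_cost:
                best_vector, best_cost = (dy, dx), cost
    return best_vector, best_cost

# Toy frames: the current frame is the reference shifted by 1 row and 2 columns.
reference = [[16 * y + x for x in range(12)] for y in range(12)]
current = [[reference[min(y + 1, 11)][min(x + 2, 11)] for x in range(12)]
           for y in range(12)]

vector, cost = find_motion_vector(current, reference, top=4, left=4, size=4, search=2)
print(vector, cost)  # (1, 2) 0
```

A cost of zero means the prediction error for this block is zero; in general the remaining error is what gets DCT-coded and transmitted with the motion vector.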
Like JPEG, the DC term is treated separately
  • DPCM
B-frame compression is high
  • Need buffer and delay