Understanding JPEG MIT-CETI Xi’an ‘99 Lecture 10 Ben Walter, Lan Chen, Wei Hu.
![Page 1: Understanding JPEG MIT-CETI Xi’an ‘99 Lecture 10 Ben Walter, Lan Chen, Wei Hu.](https://reader036.fdocuments.us/reader036/viewer/2022062408/56649f065503460f94c1c034/html5/thumbnails/1.jpg)
Understanding JPEG
MIT-CETI Xi’an ‘99
Lecture 10
Ben Walter, Lan Chen, Wei Hu
![Page 2: Understanding JPEG MIT-CETI Xi’an ‘99 Lecture 10 Ben Walter, Lan Chen, Wei Hu.](https://reader036.fdocuments.us/reader036/viewer/2022062408/56649f065503460f94c1c034/html5/thumbnails/2.jpg)
What is JPEG?
• JPEG is a method for compressing image data so it takes less space to store or transmit across a network.
• JPEG is very efficient. A file that was 1 MB in size could be compressed to as little as 25 KB (a ratio of about 40:1)!
• JPEG achieves such good compression ratios because it is lossy, but at moderate settings the loss is hard to see.
![Page 3: Understanding JPEG MIT-CETI Xi’an ‘99 Lecture 10 Ben Walter, Lan Chen, Wei Hu.](https://reader036.fdocuments.us/reader036/viewer/2022062408/56649f065503460f94c1c034/html5/thumbnails/3.jpg)
Overview
• Images contain different frequencies; low frequencies correspond to slowly varying colors, high frequencies correspond to fine detail.
• The low frequencies are much more important than the high frequencies; we can throw away some high frequencies to compress our data!
![Page 4: Understanding JPEG MIT-CETI Xi’an ‘99 Lecture 10 Ben Walter, Lan Chen, Wei Hu.](https://reader036.fdocuments.us/reader036/viewer/2022062408/56649f065503460f94c1c034/html5/thumbnails/4.jpg)
Overview
• Note that we aren’t talking about the frequencies of light, but of the light and dark areas in the image!
• We need a way to go from the color of pixels, which is essentially a number, to frequencies…
• This way is called the Discrete Cosine Transform (DCT).
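To make the step from pixel values to frequencies concrete, here is a minimal sketch of a one-dimensional DCT in plain Python. It assumes the orthonormal DCT-II scaling, which is one common convention and not necessarily the one used on the slides; the function name `dct_1d` and the sample lines `flat` and `stripes` are made up for illustration.

```python
import math

def dct_1d(samples):
    """Orthonormal DCT-II: turn pixel values into frequency amounts."""
    n = len(samples)
    coeffs = []
    for k in range(n):  # k = 0 is the constant term, larger k = finer detail
        total = sum(
            value * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i, value in enumerate(samples)
        )
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        coeffs.append(scale * total)
    return coeffs

flat = [100] * 8        # one solid color: no variation at all
stripes = [0, 255] * 4  # alternates every pixel: the finest detail possible

flat_coeffs = dct_1d(flat)
stripe_coeffs = dct_1d(stripes)

# Which non-constant frequency is strongest in the striped line?
strongest = max(range(1, 8), key=lambda k: abs(stripe_coeffs[k]))

print([round(c, 1) for c in flat_coeffs])
print("strongest non-constant frequency in stripes:", strongest)
```

The solid line puts everything into the lowest frequency, while the striped line is dominated by the highest one, which matches the idea that high frequencies carry fine detail.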
![Page 5: Understanding JPEG MIT-CETI Xi’an ‘99 Lecture 10 Ben Walter, Lan Chen, Wei Hu.](https://reader036.fdocuments.us/reader036/viewer/2022062408/56649f065503460f94c1c034/html5/thumbnails/5.jpg)
A JPEG Encoder
DCT → Quantizer → Entropy Encoder
![Page 6: Understanding JPEG MIT-CETI Xi’an ‘99 Lecture 10 Ben Walter, Lan Chen, Wei Hu.](https://reader036.fdocuments.us/reader036/viewer/2022062408/56649f065503460f94c1c034/html5/thumbnails/6.jpg)
The Discrete Cosine Transform
[Figure: a 16-pixel line plotted as color of pixel against x position of pixel, shown as equal to (=) a plot of intensity against frequency.]
![Page 7: Understanding JPEG MIT-CETI Xi’an ‘99 Lecture 10 Ben Walter, Lan Chen, Wei Hu.](https://reader036.fdocuments.us/reader036/viewer/2022062408/56649f065503460f94c1c034/html5/thumbnails/7.jpg)
The Discrete Cosine Transform
[Figure: sixteen small wave plots, next to a plot of intensity against frequency.]
![Page 8: Understanding JPEG MIT-CETI Xi’an ‘99 Lecture 10 Ben Walter, Lan Chen, Wei Hu.](https://reader036.fdocuments.us/reader036/viewer/2022062408/56649f065503460f94c1c034/html5/thumbnails/8.jpg)
The Discrete Cosine Transform
[Figure: the 16-pixel line (color of pixel against x position of pixel) written as a sum of wave plots, labeled "= x1 + x2 + … + x15 + x16", next to a plot of intensity against frequency.]
![Page 9: Understanding JPEG MIT-CETI Xi’an ‘99 Lecture 10 Ben Walter, Lan Chen, Wei Hu.](https://reader036.fdocuments.us/reader036/viewer/2022062408/56649f065503460f94c1c034/html5/thumbnails/9.jpg)
The Discrete Cosine Transform
![Page 10: Understanding JPEG MIT-CETI Xi’an ‘99 Lecture 10 Ben Walter, Lan Chen, Wei Hu.](https://reader036.fdocuments.us/reader036/viewer/2022062408/56649f065503460f94c1c034/html5/thumbnails/10.jpg)
The 2D DCT
• So far we’ve been talking about one-dimensional images, just one line of the picture… but an image has two dimensions.
• We can talk about frequencies in two dimensions, although it’s much harder to visualize.
![Page 11: Understanding JPEG MIT-CETI Xi’an ‘99 Lecture 10 Ben Walter, Lan Chen, Wei Hu.](https://reader036.fdocuments.us/reader036/viewer/2022062408/56649f065503460f94c1c034/html5/thumbnails/11.jpg)
Basis
• Remember we saw that every 16-pixel line can be written as the sum of 16 different waves?
• Those 16 waves formed a basis for the set of 16-pixel lines.
[Figure: plots of the 16 waves.]
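The claim that 16 waves form a basis for 16-pixel lines can be checked numerically. The sketch below assumes the waves are the orthonormal DCT-II cosines (an assumption; the slides only show plots), and the names `basis_wave`, `analyze`, and `synthesize` as well as the sample `line` are hypothetical.

```python
import math

N = 16  # pixels per line, as on the slide

def basis_wave(k, n=N):
    """The k-th cosine wave, sampled at n pixel positions."""
    scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
    return [scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i in range(n)]

WAVES = [basis_wave(k) for k in range(N)]

def analyze(line):
    """How much of each wave is in the line (one weight per wave)."""
    return [sum(p * w for p, w in zip(line, wave)) for wave in WAVES]

def synthesize(weights):
    """Rebuild the line as a weighted sum of the 16 waves."""
    return [sum(weights[k] * WAVES[k][i] for k in range(N))
            for i in range(N)]

# A made-up 16-pixel line of color values.
line = [3, 5, 8, 12, 15, 16, 14, 11, 9, 8, 8, 9, 11, 12, 12, 11]

weights = analyze(line)
rebuilt = synthesize(weights)
max_error = max(abs(a - b) for a, b in zip(line, rebuilt))
print("largest reconstruction error:", max_error)
```

Because these waves are mutually perpendicular and of unit length, finding each weight is just a dot product, and summing the weighted waves gives the original line back up to floating-point rounding.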
![Page 12: Understanding JPEG MIT-CETI Xi’an ‘99 Lecture 10 Ben Walter, Lan Chen, Wei Hu.](https://reader036.fdocuments.us/reader036/viewer/2022062408/56649f065503460f94c1c034/html5/thumbnails/12.jpg)
Basis
• When we compress an image as a JPEG, we work in blocks of 8x8 pixels. That’s 64 numbers, so there are 64 different basis images.
• This means we can describe any 8x8 image as a combination (a sum) of those 64 images.
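One way to picture the 64 basis images is as products of a vertical wave and a horizontal wave. The following sketch builds them under that assumption, again using orthonormal DCT-II cosines; `wave`, `basis_image`, `BASIS`, and `inner` are illustrative names, not anything defined in the lecture.

```python
import math

N = 8  # block size used by JPEG

def wave(k):
    """The k-th cosine wave sampled at 8 positions."""
    scale = math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
    return [scale * math.cos(math.pi * (2 * i + 1) * k / (2 * N))
            for i in range(N)]

def basis_image(u, v):
    """8x8 image with vertical frequency u and horizontal frequency v."""
    column, row = wave(u), wave(v)
    return [[column[y] * row[x] for x in range(N)] for y in range(N)]

# One basis image per (u, v) pair: 8 * 8 = 64 of them.
BASIS = {(u, v): basis_image(u, v) for u in range(N) for v in range(N)}

def inner(a, b):
    """Pixel-by-pixel dot product of two 8x8 images."""
    return sum(a[y][x] * b[y][x] for y in range(N) for x in range(N))

print("number of basis images:", len(BASIS))
print("top-left pixel of the flat basis image:", BASIS[(0, 0)][0][0])
```

The `(0, 0)` image is completely flat, and the images get busier as `u` and `v` grow. Distinct basis images have a dot product of zero, which is what lets any 8x8 block be split into them unambiguously.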
![Page 13: Understanding JPEG MIT-CETI Xi’an ‘99 Lecture 10 Ben Walter, Lan Chen, Wei Hu.](https://reader036.fdocuments.us/reader036/viewer/2022062408/56649f065503460f94c1c034/html5/thumbnails/13.jpg)
Basis
![Page 14: Understanding JPEG MIT-CETI Xi’an ‘99 Lecture 10 Ben Walter, Lan Chen, Wei Hu.](https://reader036.fdocuments.us/reader036/viewer/2022062408/56649f065503460f94c1c034/html5/thumbnails/14.jpg)
The 2D DCT
[Figure: two 3D plots over an 8x8 grid, one with values from 0 to 400 and one with values from -500 to 1500.]
![Page 15: Understanding JPEG MIT-CETI Xi’an ‘99 Lecture 10 Ben Walter, Lan Chen, Wei Hu.](https://reader036.fdocuments.us/reader036/viewer/2022062408/56649f065503460f94c1c034/html5/thumbnails/15.jpg)
The 2D DCT
![Page 16: Understanding JPEG MIT-CETI Xi’an ‘99 Lecture 10 Ben Walter, Lan Chen, Wei Hu.](https://reader036.fdocuments.us/reader036/viewer/2022062408/56649f065503460f94c1c034/html5/thumbnails/16.jpg)
Summary
• The Discrete Cosine Transform (DCT) allows us to determine what frequencies make up an image.
• Going into this stage, we have 8x8 = 64 numbers that are the values of each pixel.
• Out of this stage we have 8x8 numbers that represent how much of each frequency (or how much of each basis) is in the image.
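The "64 numbers in, 64 numbers out" description can be sketched as a 2D DCT computed by running a 1D DCT over every row and then over every column. This is a minimal illustration with orthonormal scaling; it leaves out details of real JPEG encoders such as shifting pixel values before the transform, and the names `dct_2d`, `flat_block`, and `gradient_block` are made up.

```python
import math

N = 8

def dct_1d(values):
    """Orthonormal DCT-II of a list of numbers."""
    n = len(values)
    out = []
    for k in range(n):
        total = sum(v * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                    for i, v in enumerate(values))
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * total)
    return out

def dct_2d(block):
    """2D DCT of an 8x8 block: transform the rows, then the columns."""
    rows = [dct_1d(row) for row in block]
    cols = [dct_1d([rows[y][x] for y in range(N)]) for x in range(N)]
    # cols[x][u] holds vertical frequency u for horizontal frequency x,
    # so transpose to get coeffs[u][v].
    return [[cols[v][u] for v in range(N)] for u in range(N)]

flat_block = [[100] * N for _ in range(N)]            # solid color
gradient_block = [[10 * x for x in range(N)] for _ in range(N)]  # varies left to right only

flat_coeffs = dct_2d(flat_block)
gradient_coeffs = dct_2d(gradient_block)

print("flat block, lowest frequency:", round(flat_coeffs[0][0], 3))
```

A solid block ends up with a single nonzero number in the top-left corner, and a block that only varies horizontally has nonzero values only in its first row of frequencies.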
![Page 17: Understanding JPEG MIT-CETI Xi’an ‘99 Lecture 10 Ben Walter, Lan Chen, Wei Hu.](https://reader036.fdocuments.us/reader036/viewer/2022062408/56649f065503460f94c1c034/html5/thumbnails/17.jpg)
A JPEG Encoder
DCT → Quantizer → Entropy Encoder
![Page 18: Understanding JPEG MIT-CETI Xi’an ‘99 Lecture 10 Ben Walter, Lan Chen, Wei Hu.](https://reader036.fdocuments.us/reader036/viewer/2022062408/56649f065503460f94c1c034/html5/thumbnails/18.jpg)
Quantization
• So we still have 64 numbers to work with - we haven’t reduced the size at all!
• The reason we wanted the numbers as frequencies was because some frequencies are more important than others.
• The low frequencies are the most important, the high frequencies are not very important (think back to building up the image).
![Page 19: Understanding JPEG MIT-CETI Xi’an ‘99 Lecture 10 Ben Walter, Lan Chen, Wei Hu.](https://reader036.fdocuments.us/reader036/viewer/2022062408/56649f065503460f94c1c034/html5/thumbnails/19.jpg)
Quantization
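• Before quantization, each frequency is a number with a wide range. As a simplified example, say it can be between 0 and 255 (real DCT coefficients can also be negative and larger in magnitude).
• To quantize, we divide each frequency by a number and round the result, so that the range is reduced. For example, dividing by 8 shrinks the range to about 0 to 31. For high frequencies we divide by a larger number.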
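Dividing by a larger number at higher frequencies can be sketched as follows. The table produced by `make_table` is a hypothetical one whose step sizes simply grow with frequency; it is not the table from the JPEG standard, and the input `coeffs` are made-up values that fade toward high frequencies.

```python
def make_table(base=8, slope=4, n=8):
    """Hypothetical step sizes: larger divisors for higher frequencies."""
    return [[base + slope * (u + v) for v in range(n)] for u in range(n)]

def quantize(coeffs, table):
    """Divide each frequency by its step size and round to an integer."""
    return [[round(c / q) for c, q in zip(c_row, q_row)]
            for c_row, q_row in zip(coeffs, table)]

def dequantize(levels, table):
    """What a decoder can recover: multiply back by the step size."""
    return [[level * q for level, q in zip(l_row, q_row)]
            for l_row, q_row in zip(levels, table)]

# Made-up frequency values that get smaller toward high frequencies.
coeffs = [[400 / (1 + u + v) ** 2 for v in range(8)] for u in range(8)]

table = make_table()
levels = quantize(coeffs, table)
restored = dequantize(levels, table)
zeros = sum(level == 0 for row in levels for level in row)

print("first row after quantization:", levels[0])
print("values that became zero:", zeros, "of 64")
```

Most of the high-frequency entries round to zero, and the surviving values are small integers. The restored values only approximate the originals, which is where the loss comes from.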
![Page 20: Understanding JPEG MIT-CETI Xi’an ‘99 Lecture 10 Ben Walter, Lan Chen, Wei Hu.](https://reader036.fdocuments.us/reader036/viewer/2022062408/56649f065503460f94c1c034/html5/thumbnails/20.jpg)
Quantization
• Before we had, say: 134, 113, 145, 117, 32, 11, 17, 5, …, 4.
• After quantization, we might have: 116, 55, 55, 30, 1, 0, 0, …, 0.
![Page 21: Understanding JPEG MIT-CETI Xi’an ‘99 Lecture 10 Ben Walter, Lan Chen, Wei Hu.](https://reader036.fdocuments.us/reader036/viewer/2022062408/56649f065503460f94c1c034/html5/thumbnails/21.jpg)
Quantization
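[Figure: grids of example values. The transcript preserves only these fragments, in this order: 124, 56, 113, 17, 34, 27, 49, 25, 110, 2119, 5, 7, 15, 710, 97, 1 3.]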
![Page 22: Understanding JPEG MIT-CETI Xi’an ‘99 Lecture 10 Ben Walter, Lan Chen, Wei Hu.](https://reader036.fdocuments.us/reader036/viewer/2022062408/56649f065503460f94c1c034/html5/thumbnails/22.jpg)
Quantization
Original: 300 kB. Medium Quality JPEG: 75 kB.
![Page 23: Understanding JPEG MIT-CETI Xi’an ‘99 Lecture 10 Ben Walter, Lan Chen, Wei Hu.](https://reader036.fdocuments.us/reader036/viewer/2022062408/56649f065503460f94c1c034/html5/thumbnails/23.jpg)
Quantization
Original: 300 kB. Low Quality JPEG: 35 kB.
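The file sizes on these two slides translate into compression ratios with a single division. A small sketch, where `compression_ratio` is just an illustrative helper:

```python
def compression_ratio(original_kb, compressed_kb):
    """How many times smaller the compressed file is."""
    return original_kb / compressed_kb

medium = compression_ratio(300, 75)  # medium quality example
low = compression_ratio(300, 35)     # low quality example

print(f"medium quality: {medium:.1f}:1")
print(f"low quality:    {low:.1f}:1")
```

So the medium quality file is 4 times smaller than the original, and the low quality file is between 8 and 9 times smaller.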
![Page 24: Understanding JPEG MIT-CETI Xi’an ‘99 Lecture 10 Ben Walter, Lan Chen, Wei Hu.](https://reader036.fdocuments.us/reader036/viewer/2022062408/56649f065503460f94c1c034/html5/thumbnails/24.jpg)
Summary
• The degree of quantization dictates the amount of information “thrown away”.
• If you throw away more information, you will get better compression, but the picture will start to look bad.
• When you adjust the quality of a JPEG save from Photoshop, you are changing the quantization!
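The trade-off between compression and picture quality can be seen by quantizing the same numbers with different step sizes. In this sketch the single `step` value is a hypothetical stand-in for a quality setting (how real programs map a quality slider to step sizes varies and is not covered here), and `coeffs` holds made-up frequency values.

```python
def summarize(coeffs, step):
    """Quantize with one step size; report zeros and worst error."""
    levels = [round(c / step) for c in coeffs]
    restored = [level * step for level in levels]
    zeros = sum(level == 0 for level in levels)
    worst_error = max(abs(c - r) for c, r in zip(coeffs, restored))
    return zeros, worst_error

# Made-up frequency values, largest first.
coeffs = [620, -131, 75, -39, 21, -12, 7, -3]

results = {step: summarize(coeffs, step) for step in (4, 16, 64)}
for step, (zeros, worst_error) in results.items():
    print(f"step {step:2d}: {zeros} zeros, worst error {worst_error}")
```

A larger step produces more zeros, which compress well, but the restored values drift further from the originals.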
![Page 25: Understanding JPEG MIT-CETI Xi’an ‘99 Lecture 10 Ben Walter, Lan Chen, Wei Hu.](https://reader036.fdocuments.us/reader036/viewer/2022062408/56649f065503460f94c1c034/html5/thumbnails/25.jpg)
A JPEG Encoder
DCT → Quantizer → Entropy Encoder
![Page 26: Understanding JPEG MIT-CETI Xi’an ‘99 Lecture 10 Ben Walter, Lan Chen, Wei Hu.](https://reader036.fdocuments.us/reader036/viewer/2022062408/56649f065503460f94c1c034/html5/thumbnails/26.jpg)
Entropy Encoding
• Entropy encoding is another stage of compression that relies on statistical properties of the data, e.g. some numbers occurring much more often than others, or lots of the same number in a row.
• So we take the 64 numbers, do Run Length Encoding, then follow that with Huffman Coding! (Remember yesterday?)
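As a reminder of how run-length encoding exploits repeated values, here is a generic sketch. It stores (value, count) pairs; the run-length scheme inside actual JPEG files differs in detail, and the 64 `quantized` numbers below are made up.

```python
def run_length_encode(values):
    """Collapse consecutive repeats into (value, count) pairs."""
    runs = []
    for value in values:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([value, 1])   # start a new run
    return [(value, count) for value, count in runs]

def run_length_decode(runs):
    """Expand (value, count) pairs back into the full list."""
    values = []
    for value, count in runs:
        values.extend([value] * count)
    return values

# Made-up quantized block: a few small numbers, then a long run of zeros.
quantized = [31, 31, 15, 4, 1, 1] + [0] * 58

runs = run_length_encode(quantized)
print(runs)
print("round trip ok:", run_length_decode(runs) == quantized)
```

The 64 numbers shrink to five pairs because the long tail of zeros becomes a single entry.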
![Page 27: Understanding JPEG MIT-CETI Xi’an ‘99 Lecture 10 Ben Walter, Lan Chen, Wei Hu.](https://reader036.fdocuments.us/reader036/viewer/2022062408/56649f065503460f94c1c034/html5/thumbnails/27.jpg)
Entropy Encoding
• These compression schemes now work very well, because quantization turns numbers like 132, 117, 78 into numbers more like 31, 31, 15.
• After quantization, the range of numbers is smaller, and there are often long runs of the same number (typically zeros), so the data can be highly compressed!
• This is the stage where the data actually gets smaller; quantization is what makes it so compressible.
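To see why a small range with many repeats helps Huffman coding, the sketch below computes Huffman code lengths with Python's `heapq` and counts the total bits. It is a textbook construction applied directly to the values rather than the exact tables a JPEG file uses, and the `quantized` data is made up.

```python
import heapq
from collections import Counter

def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} for a Huffman code."""
    counts = Counter(symbols)
    if len(counts) == 1:              # a lone symbol still needs one bit
        return {next(iter(counts)): 1}
    # Heap entries: (weight, unique tiebreaker, {symbol: depth so far}).
    heap = [(weight, i, {symbol: 0})
            for i, (symbol, weight) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)   # two least frequent groups
        w2, _, d2 = heapq.heappop(heap)
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]

def encoded_bits(symbols):
    """Total bits needed to write the symbols with their Huffman codes."""
    lengths = huffman_code_lengths(symbols)
    return sum(lengths[s] for s in symbols)

# Made-up quantized block: mostly zeros, a few small values.
quantized = [31, 31, 15, 4, 1, 1] + [0] * 58

lengths = huffman_code_lengths(quantized)
total_bits = encoded_bits(quantized)
print("code lengths:", lengths)
print("Huffman bits:", total_bits, "versus", 8 * len(quantized), "at 8 bits each")
```

The very common zero gets a one-bit code, so the whole block needs far fewer bits than a fixed 8 bits per number would.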
![Page 28: Understanding JPEG MIT-CETI Xi’an ‘99 Lecture 10 Ben Walter, Lan Chen, Wei Hu.](https://reader036.fdocuments.us/reader036/viewer/2022062408/56649f065503460f94c1c034/html5/thumbnails/28.jpg)
Summary
DCT → Quantizer → Entropy Encoder
![Page 29: Understanding JPEG MIT-CETI Xi’an ‘99 Lecture 10 Ben Walter, Lan Chen, Wei Hu.](https://reader036.fdocuments.us/reader036/viewer/2022062408/56649f065503460f94c1c034/html5/thumbnails/29.jpg)
Summary
• We break up the image into 8x8 blocks.
• We calculate the frequencies in each block; this allows us to identify the important and less important data.
• We throw away some less important data.
• We compress the resulting data.
• The result: roughly 40:1 compression!
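The first step in this summary, breaking the image into 8x8 blocks, can be sketched like this. The image is represented as a plain list of rows, and the sketch assumes its width and height are multiples of 8 (real encoders have to handle other sizes too); `split_into_blocks` and the test `image` are made up for illustration.

```python
def split_into_blocks(image, size=8):
    """Cut an image (a list of rows) into size x size blocks, row by row."""
    height, width = len(image), len(image[0])
    if height % size or width % size:
        raise ValueError("this sketch needs dimensions divisible by the block size")
    blocks = []
    for top in range(0, height, size):
        for left in range(0, width, size):
            block = [row[left:left + size] for row in image[top:top + size]]
            blocks.append(block)
    return blocks

# Made-up grayscale image, 24 pixels wide and 16 pixels tall.
image = [[(x + y) % 256 for x in range(24)] for y in range(16)]

blocks = split_into_blocks(image)
print("number of blocks:", len(blocks))
print("size of each block:", len(blocks[0]), "x", len(blocks[0][0]))
```

Each block would then go through the DCT, quantization, and entropy encoding stages independently.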