8. Compression
-
Video and Audio Compression
Video and audio files are very large. Unless we develop and maintain very high bandwidth networks (gigabytes per second or more), we have to compress the data. Relying on higher bandwidths is not a good option: the M25 Syndrome applies, i.e. traffic needs ever increase and will adapt to swamp the current limit, whatever it is.
Compression therefore becomes part of the representation or coding scheme of the popular audio, image and video formats.
-
What is Compression?
Compression basically exploits redundancy in the data:
Temporal - in 1D data, 1D signals, audio, etc.
Spatial - correlation between neighbouring pixels or data items.
Spectral - correlation between colour or luminance components. This uses the frequency domain to exploit relationships between the frequency of change in data.
Psycho-visual - exploits perceptual properties of the human visual system.
-
Compression can be categorised in two broad ways:
Lossless Compression
where data is compressed and can be reconstituted (uncompressed) without loss of detail or information. These are also referred to as bit-preserving or reversible compression systems.
Lossy Compression
where the aim is to obtain the best possible fidelity for a given bit-rate, or to minimise the bit-rate to achieve a given fidelity measure. Video and audio compression techniques are most suited to this form of compression.
-
If an image is compressed it clearly needs to be uncompressed (decoded) before it can be viewed/listened to. Some processing of data may be possible in encoded form, however.
Lossless compression frequently involves some form of entropy encoding and is based on information theoretic techniques (see next fig.).
Lossy compression uses source encoding techniques that may involve transform encoding, differential encoding or vector quantisation (see next fig.).
-
[Figure: classification of entropy encoding and source encoding methods]
-
Lossless Compression Algorithms
(Repetitive Sequence Suppression)
Simple Repetition Suppression
If a series of n successive tokens appears in a sequence, we can replace these with a single token and a count of the number of occurrences. We usually need a special flag to denote when the repeated token appears.
For example:
89400000000000000000000000000000000
can be replaced with
894f32
where f is the flag for zero.
Compression savings depend on the content of the data.
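As an illustration only, a minimal Python sketch of this zero suppression (the flag character 'f' and decimal run counts follow the example above; the function name is hypothetical):

def suppress_zeros(s, flag="f"):
    # Replace each run of '0's with the flag followed by the run length.
    out, i = [], 0
    while i < len(s):
        if s[i] == "0":
            j = i
            while j < len(s) and s[j] == "0":
                j += 1
            out.append(flag + str(j - i))  # e.g. 32 zeros -> "f32"
            i = j
        else:
            out.append(s[i])
            i += 1
    return "".join(out)

print(suppress_zeros("894" + "0" * 32))  # -> 894f32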
-
Applications of this simple compression technique include:
Suppression of zeros in a file (Zero Length Suppression)
Silence in audio data, pauses in conversation, etc.
Bitmaps
Blanks in text or program source files
Backgrounds in images
Other regular image or data tokens
-
Run-length Encoding
This encoding method is frequently applied to images (or pixels in a scan line). It is a small compression component used in JPEG compression.
In this instance, sequences of image elements x1, x2, ..., xn are mapped to pairs (c1, l1), (c2, l2), ..., (ck, lk), where ci represents an image intensity or colour and li the length of the i-th run of pixels (not dissimilar to zero length suppression above).
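A minimal sketch of run-length encoding a scan line in Python (names are illustrative, not from the original notes):

def run_length_encode(pixels):
    # Map a sequence of values to (value, run length) pairs.
    runs = []
    for p in pixels:
        if runs and runs[-1][0] == p:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([p, 1])       # start a new run
    return [tuple(r) for r in runs]

print(run_length_encode([255, 255, 255, 0, 0, 7]))
# -> [(255, 3), (0, 2), (7, 1)]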
-
Lossless Compression Algorithms
(Pattern Substitution)
This is a simple form of statistical encoding. Here we substitute a frequently repeating pattern with a code. The code is shorter than the pattern, giving us compression.
A simple pattern substitution scheme could employ predefined codes (for example, replace all occurrences of `The' with the code '&').
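A sketch of such a predefined-code substitution (the table below is just the hypothetical `The' -> '&' example from above):

codes = {"The": "&"}   # hypothetical predefined code table

def substitute(text, table):
    # Replace every occurrence of each pattern with its shorter code.
    for pattern, code in table.items():
        text = text.replace(pattern, code)
    return text

print(substitute("The cat sat on The mat", codes))
# -> "& cat sat on & mat"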
-
More typically, codes are assigned to tokens according to the frequency of occurrence of patterns:
Count occurrences of tokens
Sort in descending order
Assign codes to the highest-count tokens
A predefined symbol table may be used, i.e. assign code i to token i. However, it is more usual to dynamically assign codes to tokens. The entropy encoding schemes below basically attempt to decide the optimum assignment of codes to achieve the best compression.
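The count/sort/assign steps above might be sketched as follows (the token data is an assumption for illustration):

from collections import Counter

tokens = "the quick the lazy the dog".split()   # example data (assumed)
by_freq = Counter(tokens).most_common()         # count, then sort descending
table = {tok: i for i, (tok, _) in enumerate(by_freq)}
print(table)  # code i is assigned to the i-th most frequent token
# -> {'the': 0, 'quick': 1, 'lazy': 2, 'dog': 3}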
-
Lossless Compression Algorithms
(Entropy Encoding)
Lossless compression frequently involves some form of entropy encoding and is based on information theoretic techniques; Shannon is the father of information theory.
-
The Shannon-Fano Algorithm
This is a basic information theoretic algorithm. A simple example will be used to illustrate the algorithm:
Symbol  A   B   C   D   E
Count   15  7   6   6   5
-
Encoding for the Shannon-Fano Algorithm:
A top-down approach
1. Sort symbols according to their frequencies/probabilities, e.g., ABCDE.
2. Recursively divide into two parts, each with approximately the same total count.
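A minimal recursive Python sketch of this top-down split, applied to the symbol counts above (function and variable names are illustrative):

def shannon_fano(symbols):
    # symbols: list of (symbol, count), sorted by count in descending order.
    if len(symbols) == 1:
        return {symbols[0][0]: ""}
    total = sum(c for _, c in symbols)
    split, best_diff, running = 1, float("inf"), 0
    for i in range(1, len(symbols)):
        running += symbols[i - 1][1]
        diff = abs(2 * running - total)   # |left total - right total|
        if diff < best_diff:
            split, best_diff = i, diff
    # prefix 0 for the first part, 1 for the second, and recurse
    result = {s: "0" + c for s, c in shannon_fano(symbols[:split]).items()}
    result.update({s: "1" + c for s, c in shannon_fano(symbols[split:]).items()})
    return result

print(shannon_fano([("A", 15), ("B", 7), ("C", 6), ("D", 6), ("E", 5)]))
# -> {'A': '00', 'B': '01', 'C': '10', 'D': '110', 'E': '111'}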
-
Huffman Coding
Huffman coding is based on the frequency of occurrence of a data item (e.g. a pixel in images). The principle is to use a lower number of bits to encode the data that occurs more frequently. Codes are stored in a code book, which may be constructed for each image or for a set of images. In all cases the code book plus the encoded data must be transmitted to enable decoding.
-
The Huffman algorithm is now briefly summarised:
A bottom-up approach
1. Initialization: put all nodes in an OPEN list and keep it sorted at all times (e.g., ABCDE).
2. Repeat until the OPEN list has only one node left:
(a) From OPEN pick the two nodes having the lowest frequencies/probabilities and create a parent node for them.
(b) Assign the sum of the children's frequencies/probabilities to the parent node and insert it into OPEN.
(c) Assign codes 0 and 1 to the two branches of the tree, and delete the children from OPEN.
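A compact Python sketch of this bottom-up merge, using a heap for the sorted OPEN list (an illustration under the same symbol counts as before; names are assumptions):

import heapq, itertools

def huffman_codes(freqs):
    # freqs: {symbol: count}. Each heap entry carries the codes of its leaves.
    tie = itertools.count()   # tie-breaker so heap entries always compare
    heap = [(f, next(tie), {s: ""}) for s, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)   # two lowest-frequency nodes
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in c1.items()}
        merged.update({s: "1" + code for s, code in c2.items()})
        heapq.heappush(heap, (f1 + f2, next(tie), merged))  # parent node
    return heap[0][2]

print(huffman_codes({"A": 15, "B": 7, "C": 6, "D": 6, "E": 5}))
# A receives a 1-bit code; B, C, D and E receive 3 bits each.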
-
The following points are worth noting about the above algorithms:
Decoding for the above two algorithms is trivial as long as the coding table (the statistics) is sent before the data. (There is a bit overhead for sending this, negligible if the data file is big.)
Unique Prefix Property: no code is a prefix of any other code (all symbols are at the leaf nodes) -> great for the decoder, unambiguous.
If prior statistics are available and accurate, then Huffman coding is very good.
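The unique prefix property makes a decoder as simple as the sketch below (reusing the Shannon-Fano codes derived earlier):

def decode(bits, codes):
    # codes: {symbol: bit string} with the unique prefix property.
    inverse = {code: sym for sym, code in codes.items()}
    out, buf = [], ""
    for bit in bits:
        buf += bit
        if buf in inverse:             # no code is a prefix of another,
            out.append(inverse[buf])   # so the first match is the symbol
            buf = ""
    return "".join(out)

codes = {"A": "00", "B": "01", "C": "10", "D": "110", "E": "111"}
print(decode("0001110", codes))   # -> "ABD"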
-
Huffman Coding of Images
In order to encode images:
Divide the image into 8x8 blocks
Each block is a symbol to be coded
Compute Huffman codes for the set of blocks
Encode blocks accordingly
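A sketch of this block-as-symbol scheme (it assumes the hypothetical huffman_codes function from the earlier sketch, and a real coder would also transmit the code book):

def encode_image_blocks(image, block=8):
    # image: 2D list of pixels, dimensions divisible by the block size.
    blocks = []
    for y in range(0, len(image), block):
        for x in range(0, len(image[0]), block):
            b = tuple(tuple(image[y + dy][x + dx] for dx in range(block))
                      for dy in range(block))
            blocks.append(b)               # each 8x8 block is one symbol
    freqs = {}
    for b in blocks:
        freqs[b] = freqs.get(b, 0) + 1     # frequency of each distinct block
    codes = huffman_codes(freqs)           # Huffman codes for the block set
    return "".join(codes[b] for b in blocks), codes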
-
Adaptive Huffman Coding
The basic Huffman algorithm has been extended, for the following reasons:
(a) The previous algorithms require statistical knowledge which is often not available (e.g., live audio, video).
(b) Even when it is available, it could be a heavy overhead, especially when many tables have to be sent when a non-order-0 model is used, i.e. one taking into account the impact of the previous symbol on the probability of the current symbol (e.g., "qu" often occurs together, ...).
The solution is to use adaptive algorithms, e.g. Adaptive Huffman coding (the same idea is applicable to other adaptive compression algorithms).
-
Arithmetic Coding
Huffman coding and the like use an integer number (k) of bits for each symbol, hence k is never less than 1. Sometimes, e.g., when sending a 1-bit image, compression becomes impossible.
Arithmetic coding instead maps all possible length 2, 3, ... messages to intervals in the range [0..1] (in general, we need -log2 p bits to represent an interval of size p).
To encode a message, just send enough bits of a binary fraction that uniquely specifies the interval.
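A toy Python sketch of the interval idea, using exact fractions and an assumed two-symbol model (the probabilities are chosen purely for illustration):

from fractions import Fraction

probs = {"a": Fraction(3, 4), "b": Fraction(1, 4)}   # assumed model

def encode_interval(message):
    # Narrow [low, low + width) once per symbol; any number inside the
    # final interval uniquely identifies the message.
    low, width = Fraction(0), Fraction(1)
    for sym in message:
        cum = Fraction(0)
        for s, p in probs.items():
            if s == sym:
                low += cum * width
                width *= p
                break
            cum += p
    return low, low + width

lo, hi = encode_interval("aab")
print(lo, hi)   # interval of size (3/4)*(3/4)*(1/4) = 9/64,
                # so about -log2(9/64) ~ 2.8 bits suffice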
-
Problem: how do we determine the probabilities?
A simple idea is to use an adaptive model: start with a guess of the symbol frequencies, and update the frequency with each new symbol.
Another idea is to take account of intersymbol probabilities, e.g., Prediction by Partial Matching.
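A minimal sketch of such an adaptive model (the starting counts and two-symbol alphabet are assumptions for illustration):

counts = {"a": 1, "b": 1}      # initial guess: every symbol seen once

def probability(sym):
    return counts[sym] / sum(counts.values())

for sym in "aab":
    p = probability(sym)       # probability used to code this symbol
    counts[sym] += 1           # then update the model with the new symbol
    print(sym, p)              # a 1/2, a 2/3, b 1/4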
-
Lempel-Ziv-Welch (LZW) Algorithm
The LZW algorithm is a very common compression technique.
Suppose we want to encode the Oxford Concise English Dictionary, which contains about 159,000 entries. Why not just transmit each word as an 18-bit number (2^18 = 262,144 > 159,000)?
-
Problems:
Too many bits,
everyone needs a dictionary,
only works for English text.
Solution: Find a way to build the
dictionary adaptively.
-
The original methods are due to Ziv and Lempel in 1977 and 1978. Terry Welch improved the scheme in 1984 (the result is called LZW compression).
It is used in UNIX compress (a 1D token stream, similar to the algorithm below).
It is used in GIF compression (2D window tokens; the image is treated as with Huffman coding above).
-
The LZW Compression Algorithm can be summarised as follows:

w = NIL;
while ( read a character k )
{
    if wk exists in the dictionary
        w = wk;
    else
    {
        add wk to the dictionary;
        output the code for w;
        w = k;
    }
}
output the code for w;   /* flush the final match */
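A runnable Python version of this compression loop (a sketch: the dictionary is seeded with all 256 single bytes, and the output codes are left as a list of integers):

def lzw_compress(data):
    dictionary = {bytes([i]): i for i in range(256)}   # all single bytes
    w, out = b"", []
    for c in data:
        wk = w + bytes([c])
        if wk in dictionary:
            w = wk                                 # extend the match
        else:
            out.append(dictionary[w])              # output the code for w
            dictionary[wk] = len(dictionary)       # add wk to the dictionary
            w = bytes([c])
    if w:
        out.append(dictionary[w])                  # flush the final match
    return out

print(lzw_compress(b"TOBEORNOTTOBEORTOBEORNOT"))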
-
The LZW Decompression Algorithm is as follows:

read a character k;
output k;
w = k;
while ( read a character k )
{
    /* k could be a character or a code. */
    entry = dictionary entry for k;
    output entry;
    add w + entry[0] to the dictionary;
    w = entry;
}
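And a matching Python decoder (a sketch, reusing lzw_compress from above; note the extra branch for a code the decoder has not seen yet, a case the pseudocode above glosses over):

def lzw_decompress(codes):
    dictionary = {i: bytes([i]) for i in range(256)}
    w = dictionary[codes[0]]
    result = [w]
    for k in codes[1:]:
        if k in dictionary:
            entry = dictionary[k]
        else:
            entry = w + w[:1]          # code just created by the encoder
        result.append(entry)
        dictionary[len(dictionary)] = w + entry[:1]   # add w + entry[0]
        w = entry
    return b"".join(result)

codes = lzw_compress(b"TOBEORNOTTOBEORTOBEORNOT")
print(lzw_decompress(codes))   # -> b'TOBEORNOTTOBEORTOBEORNOT'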