Post on 17-Jan-2016
Computer Systems (159.253) ~ 1 ~Data Communications: © P.Lyons 2004
Text usually unsuitable for RLE: only contains repeated space chars
Data Compression
RLE (RUN LENGTH ENCODING)
Aims to save money and time by reducing amount of data transmitted
Instead of sending <char><char><char><char><char><char><char>
Send ESC 7 <char>
When data includes ESC, send ESC ESC
RLE is used for encoding faxes
Binary files better: often contain repeated chars, especially NUL
There's an alternative that would send <char><char> 5
  5 = no. of repetitions (after the first two)
  more elegant; <char> acts as its own ESC
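The ESC-based scheme above can be sketched in a few lines. This is a minimal sketch, not a standard format: the choice of the ASCII ESC character, the single-digit repeat count (so runs are capped at 9), the minimum run length of 4 and the function names are all assumptions made for illustration.

```python
ESC = "\x1b"      # assumed escape character
MAX_RUN = 9       # assumption: the count is sent as a single digit
MIN_RUN = 4       # assumption: shorter runs are cheaper to send as-is

def rle_encode(text):
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        j = i
        # Measure the run of ch starting at i, capped at MAX_RUN.
        while j < len(text) and text[j] == ch and j - i < MAX_RUN:
            j += 1
        run = j - i
        if ch == ESC:
            out.append(ESC * 2 * run)          # literal ESC is sent as ESC ESC
        elif run >= MIN_RUN:
            out.append(f"{ESC}{run}{ch}")      # ESC <count> <char>
        else:
            out.append(ch * run)               # short run: send unchanged
        i = j
    return "".join(out)

def rle_decode(data):
    out = []
    i = 0
    while i < len(data):
        if data[i] != ESC:
            out.append(data[i])
            i += 1
        elif data[i + 1] == ESC:               # ESC ESC means a literal ESC
            out.append(ESC)
            i += 2
        else:                                  # ESC <count> <char>
            out.append(data[i + 2] * int(data[i + 1]))
            i += 3
    return "".join(out)
```

With these assumptions, seven repeated characters become three, while ordinary text with no long runs passes through unchanged.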
Data Compression
HUFFMAN CODING (D.A. Huffman)
ASCII encodes all characters with 7 bits
Characters occur with unequal frequencies: f(e) = 100 x f(q)
Use fewer bits to encode most-common characters
Makes boundaries between characters hard to find
HUFFMAN CODING
Consider an alphabet with only 4 letters

                   2-bit code   Huffman code
(most common)   A      00           1
                B      01           01
                C      10           001
(least common)  D      11           000

Message: AAAABBBCCD
2-bit code:    00 00 00 00 01 01 01 10 10 11     (20 bits)
Huffman code:  1 1 1 1 01 01 01 001 001 000      (19 bits)

20 bits → 19 bits
better compression with larger alphabets!
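The 4-letter example can be checked directly. The sketch below uses the two code tables from the slide; the function names are illustrative. The decoder also shows how character boundaries are recovered even though they are not marked: no Huffman codeword is a prefix of another, so bits are accumulated until they match a codeword.

```python
TWO_BIT = {"A": "00", "B": "01", "C": "10", "D": "11"}
HUFFMAN = {"A": "1", "B": "01", "C": "001", "D": "000"}

def encode(message, code):
    # Concatenate the codeword for each letter.
    return "".join(code[ch] for ch in message)

def decode(bits, code):
    # Works for any prefix-free code: grow a buffer until it is a codeword.
    inverse = {word: ch for ch, word in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    if current:
        raise ValueError("bit string ends in the middle of a codeword")
    return "".join(out)

message = "AAAABBBCCD"
print(len(encode(message, TWO_BIT)), "bits with the 2-bit code")
print(len(encode(message, HUFFMAN)), "bits with the Huffman code")
```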
HUFFMAN CODING
Most efficient to create a code specifically for the data being sent
  Allows for different letter frequencies in different languages
Both ends must agree on the code set
Fax machines use a modified Huffman scheme, with codes for sequences containing
  1, 2, 3, ..., 63, 64 black or white dots
  128, 192, ... (i.e. multiples of 64) dots
So a 67-dot sequence would be sent as codes for 64 then 3
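The splitting rule in the 67-dot example can be written as a small function. This is a sketch of the simplified scheme exactly as described on the slide (direct codes for runs of 1 to 64, plus codes for multiples of 64); it does not reproduce the actual fax code tables, and the function name is hypothetical.

```python
def split_run(dots):
    """Return the run lengths whose codes would be sent for one run of dots."""
    if dots < 1:
        raise ValueError("a run contains at least one dot")
    if dots <= 64:
        return [dots]                      # short runs have their own code
    multiple = (dots // 64) * 64           # largest multiple of 64 that fits
    rest = dots % 64
    return [multiple] + ([rest] if rest else [])

print(split_run(67))
```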
LEMPEL-ZIV COMPRESSION (Abraham Lempel, Jacob Ziv)
ZIP and the UNIX compress utility use (modified) L-Z compression
Codes are fixed-length (usually 12 or 16 bits)
  Single characters: 7-bit ASCII + nine 0-bits (for a 16-bit code)
  Inefficient when sending single characters
  But after a while, very few single characters get sent
Extra codes for most common character sequences
  Sender creates extra codes based on the character sequences that occur in the message
  Receiver constructs the same extra codes while decompressing the message
LEMPEL-ZIV COMPRESSION
A code table relates character sequences to codes
  e.g. a → 1, b → 2, ..., z → 26, ..., th → 27, ..., the → 79, ..., then → 158
  Initially, code table just contains the alphabet
  Sender and receiver have the same table
New sequences are added when they are encountered
  sender and receiver both add the same codes
  easy for the sender!
The remaining string
  R denotes the unsent part of the message
  Initially, R is the complete message
L denotes the longest string of characters…
  starting from the first character of R
  occurring in the code table
L' denotes L + the next character in R
LEMPEL-ZIV COMPRESSION
Message: …they then and there theorised that this was thus
  (R is the unsent part; L and L' are taken from the start of R)
Sender
  identifies L,
  sends the code for L to the receiver
Receiver
  receives code
  looks it up in the code table
  adds L to the message string
Sender identifies L', & makes a new entry in the code table for L'
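The sender's side of this loop (find L, send its code, add L' to the table) can be sketched as follows. This is a minimal sketch, assuming the starting table used in this lecture's worked example (a to z as codes 1 to 26 and _ as 27) and a message that contains only those characters; the function names are illustrative.

```python
import string

def initial_table():
    # a..z -> 1..26 and "_" -> 27
    return {ch: i for i, ch in enumerate(string.ascii_lowercase + "_", start=1)}

def lz_compress(message):
    table = initial_table()
    next_code = len(table) + 1
    codes = []
    r = message                              # R: the unsent part
    while r:
        # L: the longest prefix of R that is in the table. Every prefix of
        # a table entry is also in the table, so L can be grown one
        # character at a time.
        l = r[0]
        while len(l) < len(r) and r[:len(l) + 1] in table:
            l = r[:len(l) + 1]
        codes.append(table[l])               # send the code for L
        if len(l) < len(r):
            table[r[:len(l) + 1]] = next_code    # new entry for L'
            next_code += 1
        r = r[len(l):]
    return codes, table

codes, table = lz_compress("they_then_and_there_theorised_that_this_was_thus")
print(codes[:7])
```

The first few codes are single letters; once th has been added to the table, the second "th" in the message goes out as one code.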
LEMPEL-ZIV COMPRESSION
Sender and Receiver start with the same code table:
  a 1   b 2   c 3   d 4   e 5   f 6   g 7   h 8   i 9   j 10  k 11  l 12  m 13  n 14
  o 15  p 16  q 17  r 18  s 19  t 20  u 21  v 22  w 23  x 24  y 25  z 26  _ 27

Message: they_then_and_there_theorised_that_this_was_thus

R (unsent part)           L     L'     Code sent   Sender adds   Receiver adds
they_then_and_…           t     th     20          th 28
hey_then_and_…            h     he     8           he 29         th 28
ey_then_and_…             e     ey     5           ey 30         he 29
y_then_and_…              y     y_     25          y_ 31         ey 30
_then_and_…               _     _t     27          _t 32         y_ 31
then_and_…                th    the    28          the 33        _t 32
en_and_there_…            e     en     5           en 34         the 33

Several further steps ensue… (reconstructed from the table entries shown on the slide)

n_and_there_…             n     n_     14          n_ 35         en 34
_and_there_…              _     _a     27          _a 36         n_ 35
and_there_…               a     an     1           an 37         _a 36
nd_there_…                n     nd     14          nd 38         an 37
d_there_…                 d     d_     4           d_ 39         nd 38
_there_theorised_…        _t    _th    32          _th 40        d_ 39
here_theorised_…          he    her    29          her 41        _th 40
re_theorised_…            r     re     18          re 42         her 41
e_theorised_…             e     e_     5           e_ 43         re 42
_theorised_that_…         _th   _the   40          _the 44       e_ 43

The Receiver's new entry is always one step behind the Sender's: it can only add an entry once it knows the first character of the next string.
After code 40 is sent, R = eorised_that_this_was_thus
LEMPEL-ZIV COMPRESSION
A special case (well, nearly): <stringa><stringa><not char1 of stringa>
Message continues …spin-spin-effect
Both code tables already contain sp 73, spi 79, spin 84, spin- 89

R (unsent part)        L        L'        Code sent   Sender adds   Receiver adds
spin-spin-effect       spin-    spin-s    89          spin-s 92
spin-effect            spin-    spin-e    89          spin-e 93     spin-s 92
effect                 e        ef        5           ef 94         spin-e 93

Every code that arrives is already in the Receiver's table, so nothing unusual happens.

A special case (yes, really!): <stringa><stringa><char1 of stringa>
Message continues …spin-spin-splitting
Both code tables start as before

R (unsent part)        L        L'        Code sent   Sender adds   Receiver adds
spin-spin-splitting    spin-    spin-s    89          spin-s 92
spin-splitting         spin-s   spin-sp   92          spin-sp 93    spin-s 92

The Sender uses code 92 immediately after creating it, before the Receiver has added it.
The Receiver can still work out the string: it must be the previous string (spin-) plus that string's own first character (s).
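The receiver's side, including the special case, can be sketched as follows. This is a minimal sketch, assuming the same starting table as the lecture's worked example (a to z as 1 to 26, _ as 27); the function name is illustrative. When a code arrives that is not yet in the table, the only possibility is the entry the sender has just created, which is the previous string plus its own first character.

```python
import string

def lz_decompress(codes):
    table = {i: ch for i, ch in enumerate(string.ascii_lowercase + "_", start=1)}
    next_code = len(table) + 1
    if not codes:
        return ""
    prev = table[codes[0]]
    out = [prev]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == next_code:
            # Special case: the sender used an entry it had only just added.
            entry = prev + prev[0]
        else:
            raise ValueError(f"unexpected code {code}")
        out.append(entry)
        # The receiver's new entry lags one step behind the sender's:
        # previous string + first character of the current string.
        table[next_code] = prev + entry[0]
        next_code += 1
        prev = entry
    return "".join(out)

print(lz_decompress([20, 8, 5, 25, 27, 28, 5]))
print(lz_decompress([1, 28]))   # code 28 arrives before the receiver has it
```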
We are standing up
Error Detection and Correction
ERROR DETECTION
Errors caused by noise
  Impulse (clicks)
  Crosstalk (between lines)
  Thermal (can't eliminate)
If we can detect errors, we can eliminate them
  Undetected errors can't be eliminated altogether
  Aim for a detection rate high enough for application
Sensitivity of applications to errors
  high: Bank transactions
  low: Video transfer
Tanenbaum 3rd edition: 183-190
ERROR DETECTION METHODS
Double sending
  used by data prep operators
  not normally used in data comms
Parity
  Add 1 or 0 after character
  to make total no. of 1s even (even parity) or odd (odd parity)
  with even parity: 1111111 → 11111111
                    0111111 → 01111110
  On arrival, no. of 1 bits in character should still be even
  Single-bit corruptions make no. of 1 bits odd
  Two-bit corruptions are undetected.
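The even-parity rule can be sketched on bit strings, using the two 7-bit examples from the slide. The function names are illustrative.

```python
def add_even_parity(bits):
    # Append 1 if the count of 1s is odd, otherwise 0, so the total is even.
    return bits + str(bits.count("1") % 2)

def parity_ok(bits_with_parity):
    # On arrival the number of 1 bits should still be even.
    return bits_with_parity.count("1") % 2 == 0

def flip(bits, index):
    # Helper that simulates a corrupted bit at the given index.
    return bits[:index] + ("0" if bits[index] == "1" else "1") + bits[index + 1:]

sent = add_even_parity("0111111")
print(sent, parity_ok(sent))
print(parity_ok(flip(sent, 0)))              # one corrupted bit
print(parity_ok(flip(flip(sent, 0), 1)))     # two corrupted bits
```

A single flipped bit makes the check fail; a second flipped bit restores even parity, so the corruption passes unnoticed.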
BLOCK CHECKSUMS
Sender sends byte sequence & XOR sum of byte sequence
Receiver calculates XOR sum of complete byte sequence
  detects error if calculated XOR sum is non-zero
  does not detect two characters that are reversed.
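A minimal sketch of the XOR block checksum follows; the function names are illustrative. Because XOR does not depend on order, swapping two bytes leaves the sum unchanged, which is why reversed characters go undetected.

```python
from functools import reduce
from operator import xor

def xor_sum(data):
    # XOR of all byte values in the sequence.
    return reduce(xor, data, 0)

def make_block(data):
    # Sender: byte sequence followed by its XOR sum.
    return data + bytes([xor_sum(data)])

def block_ok(block):
    # Receiver: XOR over the complete block (data + checksum) should be 0.
    return xor_sum(block) == 0

block = make_block(b"data")
corrupted = bytes([block[0] ^ 0x04]) + block[1:]        # one bit changed
swapped = bytes([block[1], block[0]]) + block[2:]       # two characters reversed
print(block_ok(block), block_ok(corrupted), block_ok(swapped))
```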
BLOCK PARITY
Uses both longitudinal and horizontal parity.
  conventional “horizontal” parity
  “longitudinal” parity
Note: positional information => block parity can be used for error correction
0 0 0 1 0 1 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0
011101110
Block Parity
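How positional information leads to correction can be sketched as follows: a single corrupted bit breaks exactly one horizontal parity and one longitudinal parity, and the two together give its row and column. This is a minimal sketch using even parity and a small made-up block of bits; the function names and the data are illustrative, not taken from the slide.

```python
def block_parity(rows):
    # Horizontal parity: one bit per row. Longitudinal parity: one per column.
    row_parity = [sum(row) % 2 for row in rows]
    col_parity = [sum(col) % 2 for col in zip(*rows)]
    return row_parity, col_parity

def locate_and_fix(rows, row_parity, col_parity):
    """Correct a single-bit error in place; return its (row, column) or None."""
    new_rows, new_cols = block_parity(rows)
    bad_rows = [i for i, (a, b) in enumerate(zip(new_rows, row_parity)) if a != b]
    bad_cols = [j for j, (a, b) in enumerate(zip(new_cols, col_parity)) if a != b]
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        rows[bad_rows[0]][bad_cols[0]] ^= 1    # flip the offending bit back
        return bad_rows[0], bad_cols[0]
    return None

data = [[1, 0, 1], [0, 1, 1], [0, 0, 0]]
row_parity, col_parity = block_parity(data)
received = [row[:] for row in data]
received[1][2] ^= 1                            # simulate one corrupted bit
print(locate_and_fix(received, row_parity, col_parity), received == data)
```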
CRC (CYCLIC REDUNDANCY CHECK)
divisor represents a polynomial:
  11001 represents a polynomial of degree D = 4
  1x^4 + 1x^3 + 0x^2 + 0x^1 + 1x^0
Sender
  divides (DATA followed by 0s) by divisor → quotient, remainder
  sends DATA' = DATA followed by remainder
Receiver
  divides DATA' by divisor → quotient, remainder
  remainder should be 0
CRC (CYCLIC REDUNDANCY CHECK)
If data = 11100110, divisor = 11001 (D = 4 (x^4 is the highest term))
  add D 0s to the data
  divide, using rules of modulo-2 division:
    XOR instead of subtraction
    In A/B, B “goes into” A if B’s high-order bit is in the same position as A’s high-order bit
CRC (CYCLIC REDUNDANCY CHECK)
            10110110         quotient
11001 ) 111001100000
        11001
         01011
         00000
          10111
          11001
           11100
           11001
            01010
            00000
             10100
             11001
              11010
              11001
               00110
               00000
                0110         remainder

The CRC-16 polynomial (divisor) is x^16 + x^15 + x^2 + x^0: 11000000000000101
  (the CCITT polynomial is x^16 + x^12 + x^5 + x^0)
CRCs detect
  all single bit errors
  most double bit errors
  all error bursts <16 bits
  most error bursts >16 bits
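The modulo-2 long division can be sketched on bit strings and checked against the worked example (data 11100110, divisor 11001). This is a minimal sketch for illustration; the function names are assumptions, and real implementations work on bytes with shift registers or lookup tables rather than strings.

```python
def mod2_remainder(bits, divisor):
    work = [int(b) for b in bits]
    div = [int(b) for b in divisor]
    # Slide the divisor along; it "goes into" the current window only when
    # the window's high-order bit is 1. Subtraction is XOR.
    for i in range(len(work) - len(div) + 1):
        if work[i]:
            for j, d in enumerate(div):
                work[i + j] ^= d
    degree = len(div) - 1
    return "".join(str(b) for b in work[-degree:])

def crc_append(data, divisor):
    # Sender: add D 0s, divide, then send the data followed by the remainder.
    degree = len(divisor) - 1
    return data + mod2_remainder(data + "0" * degree, divisor)

sent = crc_append("11100110", "11001")
print(sent)
print(mod2_remainder(sent, "11001"))     # receiver's check on DATA'
```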
ERROR CORRECTION
ARQ (Automatic Retransmission on reQuest)
  Most common in data comms
  if received data contains errors, request retransmission
FEC (Forward Error Control)
  Used where retransmission is undesirable
    Computer memory, or disk
    Simplex transmission from a data logger
    Transmissions from distant spacecraft
  Include extra information with message so it can be reconstructed
HAMMING CODES (Richard Hamming)
Facilitate error detection and correction
Use >1 bit to encode a bit
  0 in data becomes codeword 000
  1 in data becomes codeword 111
  “Hamming Distance” = 3
Received 3-bit patterns
  000              valid codeword
  100, 010, 001    closer to 000 than to 111
  110, 101, 011    closer to 111 than to 000
  111              valid codeword
To detect d bit errors, a code’s HD must be d+1
To correct d bit errors, a code’s HD must be 2d+1
With HD = 3, it is possible
  EITHER to detect 2-bit errors
  OR to correct 1-bit errors
HAMMING CODES
  a 0000011111
  b 0000000000
  c 1111100000
  d 1111111111
Inter-character Hamming Distances
       a    b    c    d
  a    -    5   10    5
  b    5    -    5   10
  c   10    5    -    5
  d    5   10    5    -
HD for some character-pairs is 10
But minimum inter-character HD is 5, so HD for the whole code is 5.
If 1 or 2 bits change, the result is nearer to the original valid codeword
  e.g. a 0000011111 received as 0000101111
    distance to 0000011111 (a) = 2
    distance to 0000000000 (b) = 5
    distance to 1111100000 (c) = 8
    distance to 1111111111 (d) = 5
So, for 2-bit error correction, HD = 5
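The distance table and the nearest-codeword rule can be reproduced with a few lines. The codewords are the four from the slide; the function names are illustrative.

```python
from itertools import combinations

CODE = {
    "a": "0000011111",
    "b": "0000000000",
    "c": "1111100000",
    "d": "1111111111",
}

def hamming_distance(x, y):
    # Number of bit positions in which the two words differ.
    return sum(p != q for p, q in zip(x, y))

def code_distance(code):
    # HD of the whole code: the minimum over all pairs of codewords.
    return min(hamming_distance(x, y) for x, y in combinations(code.values(), 2))

def nearest(received, code):
    # Error correction: choose the character whose codeword is closest.
    return min(code, key=lambda ch: hamming_distance(received, code[ch]))

print(code_distance(CODE))
print({ch: hamming_distance("0000101111", word) for ch, word in CODE.items()})
print(nearest("0000101111", CODE))
```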
HAMMING CODES
Bit position     15 14 13 12 11 10  9  8  7  6  5  4  3  2  1
Codeword bit      0  0  1  1  1  0  1  0  1  0  0  1  0  0  0

Bit position written in binary
  8s              1  1  1  1  1  1  1  1  0  0  0  0  0  0  0
  4s              1  1  1  1  0  0  0  0  1  1  1  1  0  0  0
  2s              1  1  0  0  1  1  0  0  1  1  0  0  1  1  0
  1s              1  0  1  0  1  0  1  0  1  0  1  0  1  0  1
HAMMING CODES
Bit position     15 14 13 12 11 10  9  8  7  6  5  4  3  2  1
Codeword bit      0  0  1  1  1  0  1  0  1  0  0  1  0  0  0

Codeword bits in each parity group
  8s group (positions 15, 14, 13, 12, 11, 10, 9, 8):   0 0 1 1 1 0 1 0
  4s group (positions 15, 14, 13, 12, 7, 6, 5, 4):     0 0 1 1 1 0 0 1
  2s group (positions 15, 14, 11, 10, 7, 6, 3, 2):     0 0 1 0 1 0 0 0
  1s group (positions 15, 13, 11, 9, 7, 5, 3, 1):      0 1 1 1 1 0 0 0
Each group contains an even number of 1s
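The parity groups can be checked in one pass by XOR-ing together the position numbers of all the 1 bits. This is a sketch under an assumption the slides do not spell out: that this is the usual Hamming arrangement, with even parity and check bits at positions 8, 4, 2 and 1. Under that assumption the result is 0 for a valid codeword, and after a single-bit error it equals the position of the corrupted bit. The function name is illustrative.

```python
def syndrome(codeword):
    """codeword is a bit string whose leftmost bit is the highest position."""
    n = len(codeword)
    result = 0
    for index, bit in enumerate(codeword):
        if bit == "1":
            result ^= n - index      # position number: leftmost is n, rightmost 1
    return result

valid = "001110101001000"            # positions 15..1, as on the slide
print(syndrome(valid))

corrupted = "001100101001000"        # same word with the bit at position 11 flipped
print(syndrome(corrupted))
```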