Prof. Dr.-Ing. Thomas Kürner · Institut für Nachrichtentechnik · Technische Universität Braunschweig Coding Theory WS 2008/2009
Coding Theory
Prof. Dr.-Ing. Thomas Kürner
Parts of this script are based on the manuscript of the lecture of the same title held by Prof. Dr.-Ing. Andreas Czylwik at Braunschweig Technical University in the winter semester 2001/2002.
The version on hand was drawn up by Dipl.-Ing. Andreas Hecker in the years 2004 and 2005.
Organisation
• Lecture and practical exercise in the winter semester (2 + 1 SWS):
  – on Tuesdays, 08.50 am – 09.35 am, lecture hall SN 23.2
  – on Wednesdays, 11.30 am – 01.00 pm, lecture hall SN 23.2
• Assistant: Dipl.-Ing. M. Sc. Thomas Jansen
• Examination: written or oral
Objectives and Contents of the Lecture

Objectives:
This lecture is intended to develop an understanding of the information-theoretical limiting factors in data transmission and to build knowledge of methods of source and error control coding, both in theory and in practical applications.

Contents:
Introduction
1. Fundamentals of information theory
2. Fundamentals of error control coding
3. Single-error correcting block codes
4. Burst error correcting block codes
5. Convolutional codes
6. Special coding techniques
Literature

Literature recommended for the lecture:
– H. Rohling: Einführung in die Informations- und Codierungstheorie, Teubner
– R. Togneri, C.J.S. de Silva: Fundamentals of Information Theory and Coding Design, Chapman & Hall/CRC

Further literature:
– B. Friedrichs: Kanalcodierung, Springer
– H. Schneider-Obermann: Kanalcodierung, Vieweg
– M. Bossert: Kanalcodierung, Teubner
– J. Bierbrauer: Introduction to Coding Theory, Chapman & Hall/CRC
– O. Mildenberger: Informationstheorie und Codierung, Vieweg
– S. Lin, D. J. Costello Jr.: Error Control Coding, Pearson Prentice Hall

Literature dealing with digital transmission:
– J.G. Proakis, M. Salehi: Grundlagen der Kommunikationstechnik, Pearson Studium
Introduction

Code / Coding:
A rule for the unambiguous assignment of characters of one character set to those of another character set is called a code. The conversion of one character set into the other is referred to as coding (to code).

Depending on the problem, coding is classified as follows:
– Cryptology
– Source coding
– Error control coding
Source coding and error control coding are the subject of this lecture.

Example: the assignment of town names to number plates:
Berlin → B
Braunschweig → BS
Introduction

Coding for encryption (cryptology):
Messages are to be protected against unauthorised monitoring. Decryption is only possible if the code is known.

Source coding:
Messages are compressed to a minimum number of characters without loss of information (reduction of redundancy). With further compression, a loss of information is accepted (e.g. in video and audio transmission).

Error control coding:
Messages are to be protected against transmission errors by the controlled addition of redundancy. Applications can be found, among other things, in data transfer over waveguides and radio channels (especially mobile radio channels) and in the storage of data.
Introduction

Block diagram of a digital transmission system:

[Figure: Source → Source Coder → Error Control Coder → Modulator → Transmission Channel (subject to interference) → Demodulator → Error Control Decoder → Source Decoder → Sink. The source side performs an A/D conversion, the sink side a D/A conversion; source and source coder form the digital source, source decoder and sink form the digital sink, and modulator, transmission channel and demodulator together form the discrete channel.]
Introduction

Simplified model:

[Figure: Data Source → Channel Coder → Channel → Channel Decoder (Error Correction) → Data Sink]

Error control coding for error correction (FEC – forward error correction):
The channel quality is characterised by the residual-error probability after decoding. The data rate, however, is independent of the channel quality.
Introduction

Overview of some FEC codes:

– Block codes
  – linear
    – cyclic: cyclic Hamming, RS, BCH, Abramson
    – non-cyclic: Hamming, Reed-Muller
  – non-linear
– Convolutional codes
  – recursive
  – non-recursive
– Concatenated codes
  – serial
  – parallel
  – Product codes

Some of these codes (marked grey in the original diagram) will not be dealt with in this lecture.
Introduction

Error control coding for error detection (CRC – cyclic redundancy check):
Applied in combination with ARQ methods (automatic repeat request).

For this method a return channel is required, which is why it is not applicable for broadcast or time-critical applications. Redundancy is added adaptively, that means only in case of errors. Thus, the residual-error probability is independent of the channel quality, but the data rate depends on it.

ARQ is applied, for example, in GPRS.

[Figure: Channel Coder → Channel → Channel Decoder (Error Detection); ARQ Control units at transmitter and receiver exchange error information over a return channel.]
Introduction

Using error control coding, the error probability can be made arbitrarily small, provided the data rate is smaller than the channel capacity.

Claude E. Shannon, Warren Weaver: The Mathematical Theory of Communication, Urbana, University of Illinois Press, 1949

Unfortunately, Shannon, the founder of information theory, does not give a practical construction rule for his theorem.
Introduction

Channel interference effects and examples of correction results:

[Figure: image before transmission, and images after transmission:
– via a BSC channel with perr = 0.005, without error control coding
– via a BSC channel with perr = 0.5, without error control coding (not correctable at all)
– via a BSC channel with perr = 0.005, with (7,4,3)-Hamming error control coding (not all errors were corrected)
– via a BSC channel with perr = 0.005, with (31,16,15)-BCH error control coding (almost error-free)]
Introduction

Fundamental idea of error control coding:
For the protection of the message to be transmitted, redundancy is added for error detection (FE) and possibly for error correction (FK). The assignment takes place in the coder; block coding (BC) is given as an example:

With block coding, a k-digit input vector u = (u1, ..., uk) is transformed into an n-digit output vector x = (x1, ..., xn). The code rate RC is then defined as:

RC = k/n    (E1)

The minimum Hamming distance dmin, that means the minimum number of differing digits over all possible pairs of code words, provides a measure for the performance of a code.
Introduction

Code cubes with n = 3 (corners are valid or invalid code words):

– k = 3: RC = 1, dmin = 1; no error detection (FE) and no error correction (FK) possible
– k = 2: RC = 2/3, dmin = 2; one error detectable, no error correctable
– k = 1: RC = 1/3, dmin = 3; two errors detectable, one error correctable
1 Fundamentals of Information Theory

In information theory, message transmission is described mathematically.

Message from the point of view of information theory:
[Diagram classifying a message into incorrect information, irrelevant and relevant parts, and redundant and non-redundant parts.]

Pivotal questions deal with:
– the quantitative calculation of the information content of messages,
– the determination of the transmission capacity of transmission channels, and
– the analysis and optimisation of source and error control coding.
1 Fundamentals of Information Theory

1.2 Mathematical Description of Messages
Messages are classified according to their quality: the less predictable a message is, the higher its relevance.

Example (increasing relevance, decreasing probability):
• Tomorrow the sun rises.
• Tomorrow the weather will be bad.
• For tomorrow we expect a severe storm and, as a consequence, a power cut.

The message from a digital source corresponds to a sequence of characters.
1.2 Mathematical Description of Messages

Maximum entropy H0 of a source:
It is assumed that a source with a character set of N characters is given. H0 is then defined as:

H0 = ld(N) bit/character    (1.13)    (bit = binary digit)

H0 describes the number of binary decisions required for the selection of the message. The occurrence probability of the source symbols is not considered here.

Examples:
– Text in German with 71 alphanumerical characters as source (26·2 letters, 3·2 umlauts, ß, 12 special characters incl. space character „" ()-.,;:!?):
  H0 = ld(71) = ln(71)/ln(2) = 6.15 bit/character
– One page of text in German with 40 lines and 70 characters per line as source, that means all in all N = 71^(40·70) different pages are possible:
  H0 = ld(71^(40·70)) = 40 · 70 · ld(71) = 17.22 kbit/page
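The two example values can be checked with a few lines of Python (not part of the original script; `ld` corresponds to `log2`):

```python
import math

# Maximum entropy H0 = ld(N) for a source with N = 71 characters.
N = 71
H0 = math.log2(N)                 # bit/character
print(round(H0, 2))               # ~6.15 bit/character

# One page of 40 lines x 70 characters: 71**2800 different pages are possible.
chars_per_page = 40 * 70
H0_page = chars_per_page * math.log2(N)   # bit/page
print(round(H0_page / 1000, 2))           # ~17.22 kbit/page
```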
1.2 Mathematical Description of Messages

Information content I:
It is assumed that a character set (alphabet) X = {x1, ..., xN} with the occurrence probabilities of the single characters p(x1), ..., p(xN) is given. The information content I is to map the content of a message as a function of the probability p [I = f(p)]. For this, the following characteristics are to be fulfilled:
– I(xi) ≥ 0 for 0 ≤ p(xi) ≤ 1
– I(xi) → 0 for p(xi) → 1
– I(xi) > I(xj) for p(xi) < p(xj)
– I(xi,xj) = I(xi) + I(xj) for p(xi,xj) = p(xi) · p(xj), that is, for two consecutive, statistically independent characters xi and xj

These conditions are fulfilled by the general solution I(xi) = −k · logb(p(xi)).

Definition of the information content (with k = 1 and b = 2):

I(xi) = ld(1/p(xi)) bit/character    (1.14)
1.2 Mathematical Description of Messages

Entropy H(X):
H(X) describes the mean information content of a source:

H(X) = Σi p(xi) · I(xi) = Σi p(xi) · ld(1/p(xi)) bit/character    (1.15)

(the sums run over i = 1, ..., N)

Redundancy of a source RQ:
Information content of a source that is under-utilised due to the particular occurrence probabilities of the characters:

RQ = H0 − H(X)    (1.16)
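Equations (1.15) and (1.16) can be illustrated with a short Python sketch; the four-character source below is a hypothetical example, not taken from the lecture:

```python
import math

def entropy(probs):
    """Mean information content H(X) = sum of p(x) * ld(1/p(x)), eq. (1.15)."""
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

# Hypothetical four-character source.
p = [0.5, 0.25, 0.125, 0.125]
H = entropy(p)
H0 = math.log2(len(p))    # maximum entropy, eq. (1.13)
RQ = H0 - H               # redundancy of the source, eq. (1.16)
print(H, RQ)              # 1.75 bit/character, 0.25 bit/character
```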
1.2 Mathematical Description of Messages

Entropy of a binary source:
A binary source has a character set X of exactly two characters: X = {x1, x2} with the occurrence probabilities p(x1) = p and p(x2) = 1−p. Then the entropy H(X) is:

H(X) = −p · ld(p) − (1 − p) · ld(1 − p)    (1.17)

This function in dependence of p is also called the Shannon function:

S(p) = H(X)    (1.18)

[Figure: S(p)/bit plotted over p ∈ [0, 1]; S(0) = S(1) = 0, maximum S(0.5) = 1 bit.]

Repetition of mathematics:

ld(1/x) = −ld(x),    lim(x→0) x · log(x) = 0
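The Shannon function can be evaluated with a small helper (a sketch; the function name is ours, and the boundary cases use the limit x·ld(x) → 0 noted above):

```python
import math

def shannon_function(p):
    """Binary entropy S(p) = -p ld(p) - (1-p) ld(1-p), eqs. (1.17)/(1.18).
    The limit x*ld(x) -> 0 for x -> 0 gives S(0) = S(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

print(shannon_function(0.5))            # 1.0 (maximum)
print(round(shannon_function(0.2), 4))  # 0.7219
print(shannon_function(0.0))            # 0.0
```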
1.2 Mathematical Description of Messages

The entropy becomes maximum for equiprobable characters, that means: H(X) ≤ H0 = ld(N).

Evidence:

H(X) − ld(N) = Σi p(xi) · ld(1/p(xi)) − Σi p(xi) · ld(N) = Σi p(xi) · ld(1/(N · p(xi)))    (1.19)

Utilisation of the inequation:

ln(x) ≤ x − 1  ⇒  ld(x) ≤ (1/ln 2) · (x − 1)    (1.20)

[Figure: ln(x) and x − 1 plotted over x; the straight line x − 1 lies above ln(x), touching it at x = 1.]

Using (1.20) in (1.19):

H(X) − ld(N) ≤ (1/ln 2) · Σi p(xi) · (1/(N · p(xi)) − 1)
            = (1/ln 2) · (Σi 1/N − Σi p(xi)) = (1/ln 2) · (1 − 1) = 0

q.e.d.
1 Fundamentals of Information Theory

1.3 Methods and Algorithms of Source Coding
In communications technology, source coding is deployed in order to reduce the data volume to be transmitted. Examples can be found in many frequently used applications, such as data compression by „zipping", MP3 or MPEG compression. It can also be found in GSM speech coding, where a digitisation has to be carried out first.

In the following, schemes for the binary coding of discrete sources are considered with regard to transmission over a binary channel.
1.3.1 Fields of Application of Source Coding

The character set X = {x1, ..., xN} of an arbitrary source is given. With source coding, a binary code word of length L(xi) is assigned to each character xi of the character set X, e.g.:

x1 → 1001
x2 → 011
...
xN → 010111...  (length L(xN))

The objective of the coding is the minimisation of the mean code word length L̄:

L̄ = Σi p(xi) · L(xi)    (1.21)

Examples for binary coding of characters:
– ASCII code: fixed code word length L(xi) = 8 (block code)
– Morse code: letters are represented by dots and dashes; code words are separated by gaps; letters that occur often are assigned short code words.
1.3.1 Fields of Application of Source Coding

Prefix characteristic of a code:
This characteristic means that a code word cannot at the same time form the beginning of another code word. A given bit sequence is then unambiguous without use of further separation rules („comma-free code").

Examples of codes without and with prefix characteristic:

Without prefix characteristic: x1 → 0, x2 → 01, x3 → 10, x4 → 100.
Unambiguous decoding of a bit sequence is not possible. For example, for the sequence 010010 the following decoding results are possible: x1x3x2x1, x2x1x1x3, x1x3x1x3, x2x1x2x1, x1x4x3.

With prefix characteristic: x1 → 0, x2 → 10, x3 → 110, x4 → 111.
Unambiguous decoding of a bit sequence is guaranteed. Decoding of the sequence 010010110111100, for example, definitely yields x1x2x1x2x3x4x2x1. For successful decoding, however, synchronisation to the beginning of the sequence is required.
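The guaranteed decoding of a prefix code can be sketched in Python (the helper name `decode_prefix` is ours):

```python
def decode_prefix(bits, code):
    """Decode a bit string with a prefix code: since no code word is the
    prefix of another, the first code word matched is always correct."""
    inverse = {word: sym for sym, word in code.items()}
    result, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            result.append(inverse[current])
            current = ""
    if current:
        raise ValueError("bit sequence does not end on a code word boundary")
    return result

# Prefix code from the lecture example.
code = {"x1": "0", "x2": "10", "x3": "110", "x4": "111"}
print(decode_prefix("010010110111100", code))
# ['x1', 'x2', 'x1', 'x2', 'x3', 'x4', 'x2', 'x1']
```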
1.3.1 Fields of Application of Source Coding

Codes with a prefix characteristic can be decoded using a decision tree.

[Figure: decision tree for the code x1 → 0, x2 → 10, x3 → 110, x4 → 111; the three levels of the tree correspond to the first, second and third code bit.]

Redundancy of a code RC:

RC = L̄ − H(X)    (1.22)

(Do not mistake this for the redundancy RQ of a source!)
1.3.1 Fields of Application of Source Coding

Kraft's inequation:
A binary code with prefix characteristic and the code word lengths L(x1), L(x2), ..., L(xN) only exists if:

Σi 2^(−L(xi)) ≤ 1    (1.23)

Possible proof:
The depth of the tree structure is equal to the maximum code word length Lmax = max(L(x1), L(x2), ..., L(xN)). A code word xi on layer L(xi) eliminates 2^(Lmax − L(xi)) of the possible code words on layer Lmax. The sum of all eliminated code words is less than or equal to the maximum number of code words on layer Lmax:

Σi 2^(Lmax − L(xi)) ≤ 2^Lmax

Dividing by 2^Lmax yields (1.23). The equals sign in Kraft's inequation is valid if all endpoints of the code tree are taken by code words.
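Kraft's inequation is easy to check numerically (a sketch; the first set of lengths belongs to the prefix code x1 → 0, x2 → 10, x3 → 110, x4 → 111 from above):

```python
def kraft_sum(lengths):
    """Sum of 2**(-L) over all code word lengths, eq. (1.23)."""
    return sum(2.0 ** -L for L in lengths)

# All tree endpoints used: the sum equals 1, a prefix code exists.
print(kraft_sum([1, 2, 3, 3]))   # 1.0
# Lengths 1, 1, 2 violate the inequation: no prefix code can exist.
print(kraft_sum([1, 1, 2]))      # 1.25 > 1
```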
1.3.1 Fields of Application of Source Coding

Lower limit of the mean code word length L̄:
Every prefix code has a mean code word length L̄ which is at least as large as the entropy of the source H(X):

L̄ ≥ H(X)    (1.24)

Possible proof:
The evidence is similar to that of inequation (1.19); additionally, Kraft's inequation (1.23) is utilised.

The equals sign is fulfilled if the following applies:

p(xi) = (1/2)^Ki    (1.25)

and the code word lengths are selected as L(xi) = Ki = I(xi)    (1.26)

Example:

xi   p(xi)  Code word
x1   1/2    1
x2   1/4    00
x3   1/8    010
x4   1/16   0110
x5   1/16   0111

For this example: L̄ = H(X) = 15/8.

With general p(xi), I(xi) has to be transferred to an integer value by rounding up, so that the following applies:

I(xi) ≤ L(xi) < I(xi) + 1    (1.27)

These L(xi) then continue to fulfil inequation (1.23).
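The example with dyadic probabilities can be verified numerically (a sketch):

```python
import math

# Example from the lecture: dyadic probabilities, code word lengths L(xi) = I(xi).
p = [1/2, 1/4, 1/8, 1/16, 1/16]
lengths = [1, 2, 3, 4, 4]        # code words 1, 00, 010, 0110, 0111

H = sum(pi * math.log2(1 / pi) for pi in p)
L_mean = sum(pi * Li for pi, Li in zip(p, lengths))
print(H, L_mean)   # both 1.875 = 15/8 bit/character
```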
1.3.1 Fields of Application of Source Coding

Shannon's coding theorem:
For every source, a binary coding can be found with:

H(X) ≤ L̄ < H(X) + 1    (1.28)

Possible proof:
Multiply inequation (1.27) by p(xi) and sum over all i.

Using the left side of inequation (1.27), it can be shown that with this condition a code with prefix characteristic exists:

I(xi) = ld(1/p(xi)) ≤ L(xi)  ⇒  2^(−L(xi)) ≤ p(xi)

Summation over all i:

Σi 2^(−L(xi)) ≤ Σi p(xi) = 1

From this, inequation (1.23) results, which guarantees the existence of prefix codes. q.e.d.
1.3.2 Shannon Coding

Algorithm:
1. Arrange the occurrence probabilities in decreasing order.
2. Determine the code word lengths L(xi) according to inequation (1.27).
3. Calculate the accumulated occurrence probabilities Pi = Σ(j=1..i−1) p(xj).
4. The code words are the L(xi) positions after the decimal point of the binary representation of Pi.

Example:

i   p(xi)  I(xi)  L(xi)  Pi    Code
1   0.22   2.18   3      0.00  000
2   0.19   2.40   3      0.22  001
3   0.15   2.74   3      0.41  011
4   0.12   3.06   4      0.56  1000
5   0.08   3.64   4      0.68  1010
6   0.07   3.84   4      0.76  1100
7   0.07   3.84   4      0.83  1101
8   0.06   4.06   5      0.90  11100
9   0.04   4.64   5      0.96  11110

For example, for P8 = 0.90:
0.90 = 1·2^−1 + 1·2^−2 + 1·2^−3 + 0·2^−4 + 0·2^−5 + 1·2^−6 + 1·2^−7 + 0·2^−8 + 0·2^−9 + 1·2^−10 + ...
so the first L(x8) = 5 binary digits give the code word 11100.
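The four algorithm steps translate directly into a small Python sketch (the function name `shannon_code` is ours; with the probabilities of the example it reproduces the code table above):

```python
import math

def shannon_code(probs):
    """Shannon coding: sort probabilities in decreasing order, set
    L(xi) = ceil(ld(1/p(xi))), and take the first L(xi) binary fraction
    digits of the accumulated probability Pi as the code word."""
    probs = sorted(probs, reverse=True)
    codes, P = [], 0.0
    for p in probs:
        L = math.ceil(math.log2(1 / p))
        bits, frac = "", P
        for _ in range(L):
            frac *= 2
            bit, frac = divmod(frac, 1)
            bits += str(int(bit))
        codes.append(bits)
        P += p
    return codes

p = [0.22, 0.19, 0.15, 0.12, 0.08, 0.07, 0.07, 0.06, 0.04]
print(shannon_code(p))
# ['000', '001', '011', '1000', '1010', '1100', '1101', '11100', '11110']
```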
1.3.2 Shannon Coding

Conversion of a decimal number into a binary number:
For example, for the decimal number 0.9 the binary representation is to be developed. This corresponds to the search for the coefficients ai of the following equation:

0.90 = a1·2^−1 + a2·2^−2 + a3·2^−3 + a4·2^−4 + ...   | · 2
1.80 = 1 + 0.80 = a1·2^0 + a2·2^−1 + a3·2^−2 + a4·2^−3 + ...

Since 2^0 = 1, it follows that a1 = 1, provided that a2·2^−1 + a3·2^−2 + a4·2^−3 + ... < 1. This, however, corresponds directly to Kraft's inequation (1.23), which has been proven. Thus, by multiplying repeatedly by 2, the coefficients can be determined. Simplified, the method can be described as follows:

0.90 · 2 = 0.8 + 1  → a1 = 1
0.80 · 2 = 0.6 + 1  → a2 = 1
0.60 · 2 = 0.2 + 1  → a3 = 1
0.20 · 2 = 0.4 + 0  → a4 = 0
...
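The repeated-multiplication method translates directly into code (a sketch; the function name is ours):

```python
def binary_fraction(x, n_bits):
    """Repeated doubling: the integer part of 2*x is the next binary digit."""
    bits = ""
    for _ in range(n_bits):
        x *= 2
        if x >= 1:
            bits += "1"
            x -= 1
        else:
            bits += "0"
    return bits

print(binary_fraction(0.9, 10))   # '1110011001'
```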
1.3.2 Shannon Coding

Tree diagram of the Shannon code considered in the example:

[Figure: code tree of the Shannon code from the example above.]

The tree diagram shows the disadvantage of Shannon coding: not all endpoints of the tree are taken, which means that the code word lengths can be reduced. Therefore, this code is not optimal.
1.3.3 Huffman Coding

Huffman coding is based on a recursive approach. The fundamental idea is that the two symbols with the smallest probabilities must have the same code word length; otherwise a reduction of the code word length would be possible. Thus, the symbols with the smallest probabilities form the starting point.

Algorithm:
1. Arrange the symbols according to their probability.
2. Assign 0 and 1 to the two symbols with the smallest probability.
3. Combine the two symbols with the smallest probabilities xN−1 and xN into a new symbol with the probability p(xN−1) + p(xN).
4. Repeat steps 1–3 until only one symbol remains.
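A minimal sketch of the algorithm using Python's heapq (the function name is ours; with the probabilities of the example from 1.3.2 it reproduces the code word lengths 2, 2, 3, 3, 4, 4, 4, 5, 5 of the following example):

```python
import heapq

def huffman_code(probs):
    """Huffman coding: repeatedly merge the two least probable symbols,
    prepending 0/1 to the code words of the two merged subtrees."""
    # Heap entries: (probability, unique tie-breaker, symbol indices in subtree).
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    codes = [""] * len(probs)
    counter = len(probs)
    while len(heap) > 1:
        p0, _, s0 = heapq.heappop(heap)
        p1, _, s1 = heapq.heappop(heap)
        for i in s0:
            codes[i] = "0" + codes[i]
        for i in s1:
            codes[i] = "1" + codes[i]
        heapq.heappush(heap, (p0 + p1, counter, s0 + s1))
        counter += 1
    return codes

p = [0.22, 0.19, 0.15, 0.12, 0.08, 0.07, 0.07, 0.06, 0.04]
codes = huffman_code(p)
print(sorted(len(c) for c in codes))                          # [2, 2, 3, 3, 4, 4, 4, 5, 5]
print(round(sum(pi * len(c) for pi, c in zip(p, codes)), 2))  # 3.01 bit/character
```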
1.3.3 Huffman Coding

Coding of the example from 1.3.2:

[Figure: step-by-step Huffman merging of the probabilities 0.22, 0.19, 0.15, 0.12, 0.08, 0.07, 0.07, 0.06, 0.04, with 0/1 assigned at each merge; the intermediate node probabilities are 0.1, 0.14, 0.18, 0.26, 0.33, 0.41, 0.59 and 1.0.]

The resulting code:

i   p(xi)  Code
1   0.22   10
2   0.19   11
3   0.15   001
4   0.12   011
5   0.08   0001
6   0.07   0100
7   0.07   0101
8   0.06   00000
9   0.04   00001

The mean code word length is L̄ = 3.01 bit/character.
1.3.3 Huffman Coding

Tree structure of the example Huffman code:

[Figure: code tree of the Huffman code from the example above.]
1 Fundamentals of Information Theory

1.4 Discrete Source Models
1.4.1 Discrete Memoryless Source
The coding possibilities for those sources are considered whose characters are read out statistically independently of each other.

First, the joint entropy of character strings is to be calculated:

H(X,Y) = −Σi Σk p(xi,yk) · ld(p(xi,yk))    (1.29)

For statistically independent characters, p(xi,yk) = p(xi) · p(yk), so that:

H(X,Y) = −Σi Σk p(xi) · p(yk) · [ld(p(xi)) + ld(p(yk))]
       = −Σi p(xi) · ld(p(xi)) − Σk p(yk) · ld(p(yk))    (using Σi p(xi) = Σk p(yk) = 1)
       = H(X) + H(Y)

For M independent characters of the same source, the following applies:

H(X1, X2, ..., XM) = M · H(X)    (1.30)
1.4.1 Discrete Memoryless Source

As the general form of Shannon's coding theorem shows, the coding of character strings represents a more efficient method (L̄M: mean code word length of a block of M characters):

H(X1, ..., XM) ≤ L̄M < H(X1, ..., XM) + 1
M · H(X) ≤ L̄M < M · H(X) + 1
H(X) ≤ L̄M/M < H(X) + 1/M    (1.31)

The disadvantage of this method is the strongly increasing coding complexity, since the number of possible character combinations increases exponentially with M.

As an example, a binary source with X = {x1, x2} and the occurrence probabilities p(x1) = 0.2 and p(x2) = 0.8 is considered. The entropy of the source is H(X) = 0.7219 bit/character. As coding algorithm, Huffman coding is used.
1.4.1 Discrete Memoryless Source

Coding of single characters:

Character  p(xi)  Code  L(xi)  p(xi)·L(xi)
x1         0.2    0     1      0.2
x2         0.8    1     1      0.8
                               Σ = 1

Mean code word length: L̄ = 1 bit/character

Coding of character pairs:

Character pair  p(xi,xj)  Code  L(xi,xj)  p(xi,xj)·L(xi,xj)
x1x1            0.04      101   3         0.12
x1x2            0.16      11    2         0.32
x2x1            0.16      100   3         0.48
x2x2            0.64      0     1         0.64
                                          Σ = 1.56

Mean code word length: L̄ = 1.56/2 = 0.78 bit/character
1.4.1 Discrete Memoryless Source

Coding of character triples:

Character triple  p(xi,xj,xk)  Code   L(xi,xj,xk)  p·L
x1x1x1            0.008        11111  5            0.040
x1x1x2            0.032        11100  5            0.160
x1x2x1            0.032        11101  5            0.160
x1x2x2            0.128        100    3            0.384
x2x1x1            0.032        11110  5            0.160
x2x1x2            0.128        101    3            0.384
x2x2x1            0.128        110    3            0.384
x2x2x2            0.512        0      1            0.512
                                                   Σ = 2.184

Mean code word length: L̄ = 2.184/3 = 0.728 bit/character

Summary:

M  H(X)    L̄      H(X) + 1/M
1  0.7219  1.0    1.7219
2  0.7219  0.78   1.2219
3  0.7219  0.728  1.0552
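The summary can be checked against the bounds of Shannon's coding theorem (1.31) (a sketch; the mean lengths per character are taken from the tables above):

```python
import math

# Binary source from the example: p(x1) = 0.2, p(x2) = 0.8.
H = 0.2 * math.log2(1 / 0.2) + 0.8 * math.log2(1 / 0.8)

# Mean code word lengths per source character from the Huffman tables above.
L_per_char = {1: 1.0, 2: 0.78, 3: 0.728}

# Shannon's coding theorem for blocks of M characters: H(X) <= L < H(X) + 1/M.
for M, L in L_per_char.items():
    assert H <= L < H + 1 / M
print(round(H, 4))   # 0.7219 bit/character
```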
1.4 Discrete Source Models

1.4.2 Discrete Source With Memory
Coding possibilities for those sources whose characters are read out interdependently are considered here. This applies to real sources if correlations occur between the single characters. In this case, the joint entropy is calculated to be:

H(X,Y) = H(X) + H(Y|X)    (1.32a)

with the conditional entropy

H(Y|X) = −Σi Σk p(xi,yk) · ld(p(yk|xi))    (1.32b)

Derivation with p(xi,yk) = p(xi) · p(yk|xi):

H(X,Y) = −Σi Σk p(xi,yk) · ld(p(xi,yk))
       = −Σi Σk p(xi,yk) · [ld(p(xi)) + ld(p(yk|xi))]
       = −Σi p(xi) · ld(p(xi)) − Σi Σk p(xi,yk) · ld(p(yk|xi))
       = H(X) + H(Y|X)
1.4.2 Discrete Source with Memory

Between the source entropy and the conditional entropy, the following relation exists:

H(Y) ≥ H(Y|X)    (1.33)

Proof:
With p(xi,yk) = p(xi) · p(yk|xi) and

H(Y) = −Σk p(yk) · ld(p(yk)) = −Σi Σk p(xi,yk) · ld(p(yk))

it follows that:

H(Y|X) − H(Y) = Σi Σk p(xi,yk) · ld(p(yk)/p(yk|xi))

Estimation with ln(x) ≤ x − 1 ⇒ ld(x) ≤ (1/ln 2) · (x − 1):

H(Y|X) − H(Y) ≤ (1/ln 2) · Σi Σk p(xi,yk) · (p(yk)/p(yk|xi) − 1)
             = (1/ln 2) · [Σi Σk p(xi) · p(yk) − Σi Σk p(xi,yk)] = (1/ln 2) · (1 − 1) = 0

q.e.d.
1.4.2 Discrete Source with Memory

According to this result, for the same single-character probabilities the entropy of a source with memory is smaller than the entropy of a memoryless source. In other words: for sources with memory, a considerably more efficient source coding is possible by coding character strings and utilising the correlations among the single characters than if the source were treated as memoryless.

Sources with memory can be modelled as Markoff processes. The respective source models are referred to as Markoff sources.

Example of correlations in German texts: the probability that the letter „Q" is followed by „u" is near 1.
1.4.2 Discrete Source with Memory

Markoff source:
A discrete source with memory can generally be described as a Markoff source. Markoff sources are based on Markoff processes. As a basis, a sequence of successively occurring random variables z0, z1, z2, ..., zn is considered. If these are statistically independent, the following applies in general:

f(zn | zn−1, zn−2, ..., z0) = f(zn)    (1.34a)

If the values zi are statistically dependent with a memory of m characters, the equation is extended to:

f(zn | zn−1, zn−2, ..., z0) = f(zn | zn−1, ..., zn−m)    (1.34b)

Frequently, first order Markoff processes are considered (m = 1):

f(zn | zn−1, zn−2, ..., z0) = f(zn | zn−1)    (1.34c)
1.4.2 Discrete Source with Memory

zi can take a finite number of discrete values: zi ∈ {x1, ..., xN}. The Markoff process emits these values. A set of interdependent values represents a Markoff chain. This chain can be described completely by the transition probabilities:

p(zn = xj | zn−1 = xi1, ..., zn−m = xim)    (1.35a)

If the transition probabilities do not depend on the absolute time, this is referred to as a homogeneous Markoff chain:

p(zn = xj | zn−1 = xi1, ..., zn−m = xim) = p(zk = xj | zk−1 = xi1, ..., zk−m = xim)    (1.35b)

If the steady state does not depend on the initial probabilities, this is referred to as a stationary Markoff chain:

lim(n→∞) p(zn+k = xj | zk = xi) = lim(n→∞) p(zn = xj) = p(xj) = wj    (1.35c)

For first order homogeneous and stationary Markoff chains (m = 1), the following applies:

p(zn = xj | zn−1 = xi) = pij    (1.35d)
1.4.2 Discrete Source with Memory

In case of the homogeneous stationary Markoff chain, the single transition probabilities can be combined into a so-called transition matrix:

      ( p11  p12  ...  p1N )
P  =  ( p21  p22  ...  p2N )    (1.36a)
      ( ...  ...  ...  ... )
      ( pN1  pN2  ...  pNN )

The single rows of the transition matrix must add up to 1:

Σj pij = 1    (1.36b)

The stationary probabilities of the single discrete values can be combined into a probability vector:

w = (w1 w2 ... wN) = (p(x1) p(x2) ... p(xN))    (1.37)

The probability vector can be determined by means of the steady-state condition:

w = w P    (1.38)
1.4.2 Discrete Source with Memory

A stationary Markoff source emitting first order Markoff chains is now considered. For this configuration, described by w and P, the entropy is to be calculated. It is equal to the entropy in the steady state:

H∞(Z) = lim(n→∞) H(zn | zn−1, zn−2, ..., z0) = H(zn | zn−1)

H(zn | zn−1) = Σi wi · H(zn | zn−1 = xi)
             = −Σi wi · Σj p(zn = xj | zn−1 = xi) · ld(p(zn = xj | zn−1 = xi))
             = −Σi Σj p(zn−1 = xi, zn = xj) · ld(p(zn = xj | zn−1 = xi))    (1.39)
1.4.2 Discrete Source with Memory

Since the entropy H∞(Z) conditions on characters from the past which are already known, it is a conditional entropy. Analogous to inequation (1.33), the following applies:

H∞(Z) = H(zn | zn−1) ≤ H(zn) ≤ H0(zn)    (1.40)

The source coding of a Markoff source can now be carried out taking the memory into account. For this, for example, Huffman coding can be applied, taking the current state of the source into account.

The problem caused by this coding is disastrous error propagation. This, however, is a basic problem of all source codes of variable length: an error occurring on the channel that has not been corrected can lead to the loss of the complete information sequence.
1.4.2 Discrete Source with Memory

In the following, the example of a Markoff source with three states is considered. Its states correspond to the transmitted characters: zn ∈ {x1, x2, x3}.

[Figure: state diagram with states x1, x2, x3 and the transition probabilities below.]

The transition probabilities p(zn = xj | zn−1 = xi) = pij for this example are:

            zn = x1  zn = x2  zn = x3
zn−1 = x1   0.2      0.4      0.4
zn−1 = x2   0.3      0.5      0.2
zn−1 = x3   0.6      0.1      0.3

that is:

      ( 0.2  0.4  0.4 )
P  =  ( 0.3  0.5  0.2 )
      ( 0.6  0.1  0.3 )

In the following, w and H∞(Z) are to be calculated.
1.4.2 Discrete Source with Memory
Calculation of the stationary probabilities w_i:

As an approach, equation (1.38) as well as the following equation are applied:

∑_{i=1}^{N} w_i = 1    (1.41)

From equation (1.38), three single equations can be obtained here to calculate the three unknowns. Each of the three single equations, however, is linearly dependent on the two other ones, so that the second approach has to be taken into account to obtain a third independent equation (strike out one of the three equations!):

w1 = 0.2 w1 + 0.3 w2 + 0.6 w3
w2 = 0.4 w1 + 0.5 w2 + 0.1 w3
w3 = 0.4 w1 + 0.2 w2 + 0.3 w3
 1 = w1 + w2 + w3

Solution to the system of equations:

w1 = 33/93 ≈ 0.3548,  w2 = 32/93 ≈ 0.3441,  w3 = 28/93 ≈ 0.3011
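The solution above can also be checked numerically. The following minimal Python sketch (Python is used here only for illustration and is not part of the lecture) determines w by power iteration, i.e. by repeatedly applying w ← wP:

```python
# Transition matrix P of the example Markoff source
# (rows: previous state z_{n-1}, columns: next state z_n).
P = [[0.2, 0.4, 0.4],
     [0.3, 0.5, 0.2],
     [0.6, 0.1, 0.3]]

def stationary(P, iterations=1000):
    """Power iteration: repeatedly apply w <- w.P, starting from uniform."""
    n = len(P)
    w = [1.0 / n] * n
    for _ in range(iterations):
        w = [sum(w[i] * P[i][j] for i in range(n)) for j in range(n)]
    return w

w = stationary(P)
print([round(wi, 4) for wi in w])   # [0.3548, 0.3441, 0.3011]
```

The result agrees with the closed-form solution w = (33/93, 32/93, 28/93) obtained from the system of equations.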
page 50
1.4.2 Discrete Source with Memory
Calculation of the entropy H_∞(Z):

The approach for this calculation is represented by equation (1.39). Use of the notation for the stationary probabilities w_i and transition probabilities p_ij leads to the following notation:

H_∞(Z) = ∑_{i=1}^{N} w_i · H(z_n | z_{n−1} = x_i) = − ∑_{i=1}^{N} ∑_{j=1}^{N} w_i · p_ij · ld(p_ij)    (1.42)

At first, the single calculations of the state entropies give an overview:

H(z_n | z_{n−1} = x1) = 0.2 ld(1/0.2) + 0.4 ld(1/0.4) + 0.4 ld(1/0.4) ≈ 1.5219 bit/character
H(z_n | z_{n−1} = x2) = 0.3 ld(1/0.3) + 0.5 ld(1/0.5) + 0.2 ld(1/0.2) ≈ 1.4855 bit/character
H(z_n | z_{n−1} = x3) = 0.6 ld(1/0.6) + 0.1 ld(1/0.1) + 0.3 ld(1/0.3) ≈ 1.2955 bit/character

The entropy is then:

H_∞(Z) = w1 H(z_n|z_{n−1}=x1) + w2 H(z_n|z_{n−1}=x2) + w3 H(z_n|z_{n−1}=x3) ≈ 1.441 bit/character
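The single steps of equation (1.42) can be reproduced with a short Python sketch (illustrative only, not part of the lecture); the stationary probabilities w are taken from the previous slide:

```python
from math import log2   # log2 corresponds to ld in the lecture notation

P = [[0.2, 0.4, 0.4],
     [0.3, 0.5, 0.2],
     [0.6, 0.1, 0.3]]
w = [33/93, 32/93, 28/93]

# state entropies H(z_n | z_{n-1} = x_i)
H_state = [sum(p * log2(1 / p) for p in row) for row in P]
print([round(h, 4) for h in H_state])   # [1.5219, 1.4855, 1.2955]

# entropy of the source with memory, equation (1.42)
H_inf = sum(wi * hi for wi, hi in zip(w, H_state))
print(round(H_inf, 3))                  # 1.441
```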
page 51
1.4.2 Discrete Source with Memory
A comparison of the different values shows:
H_∞(Z) ≈ 1.441 bit/character
H(Z) = lim_{n→∞} H(z_n) = ∑_{i=1}^{N} w_i · ld(1/w_i) ≈ 1.5814 bit/character
H_0 = ld 3 ≈ 1.5850 bit/character
See also inequation (1.40).
page 52
1.4.2 Discrete Source with Memory
For the considered example, the state-dependent Huffman codes are listed below. For every state z_{n−1}, an own coding exists, which here also differs from the respective other ones:

Transition probabilities p_ij:

  z_{n−1} \ z_n |  x1   x2   x3
  --------------+---------------
       x1       |  0.2  0.4  0.4
       x2       |  0.3  0.5  0.2
       x3       |  0.6  0.1  0.3

Code words and per-state mean lengths ⟨L⟩|z_{n−1}:

  z_{n−1} \ z_n |  x1   x2   x3  | ⟨L⟩|z_{n−1}
  --------------+----------------+------------
       x1       |  11   10   0   |    1.6
       x2       |  10   0    11  |    1.5
       x3       |  0    11   10  |    1.4

The mean code word length is then calculated to be:

⟨L⟩ = ∑_{i=1}^{N} ∑_{j=1}^{N} w_i · p_ij · L(z_n = x_j | z_{n−1} = x_i) ≈ 1.5054 bit/character    (1.43)

For comparison:
H_∞(Z) ≈ 1.441 bit/character,  H(Z) ≈ 1.5814 bit/character
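The mean code word length (1.43) can be verified with a small Python sketch (illustrative only); the code word lengths are read off the table above:

```python
# Lengths of the state-dependent Huffman code words L(z_n = x_j | z_{n-1} = x_i)
code_len = [[2, 2, 1],    # state x1: code words 11, 10, 0
            [2, 1, 2],    # state x2: code words 10, 0, 11
            [1, 2, 2]]    # state x3: code words 0, 11, 10
P = [[0.2, 0.4, 0.4],
     [0.3, 0.5, 0.2],
     [0.6, 0.1, 0.3]]
w = [33/93, 32/93, 28/93]

# per-state mean lengths <L>|z_{n-1}
L_state = [sum(p * l for p, l in zip(P[i], code_len[i])) for i in range(3)]
print([round(l, 2) for l in L_state])   # [1.6, 1.5, 1.4]

# overall mean code word length, equation (1.43)
L_mean = sum(wi * Li for wi, Li in zip(w, L_state))
print(round(L_mean, 4))                 # 1.5054
```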
page 53
1.5 Communication over a Discrete Memoryless Channel
1.5.1 Definition and Characteristics of the Discrete Memoryless Channel
1 Fundamentals of Information Theory
The discrete memoryless channel (DMC) is a model for the transmission of discrete values via a channel. Memoryless means that the transmission probabilities of the symbols are independent of each other.

The input signal X and the output signal Y can each take different discrete values, whereas the possible values at the input and at the output as well as their numbers do not have to be equal:

X ∈ {x1, x2, …, x_{N_X}},   Y ∈ {y1, y2, …, y_{N_Y}}

  X ──► [ Discrete Memoryless Channel ] ──► Y
page 54
1.5.1 Definition and Characteristics of the Discrete Memoryless Channel
With the transmission, each x_i is now transferred into y_j with a certain probability p_ij:

P = ( p(Y = y_j | X = x_i) ) = ( p_ij ) = ( p_11      p_12      …  p_1N_Y    )
                                          ( p_21      p_22      …  p_2N_Y    )    (1.44)
                                          (  ⋮         ⋮        ⋱   ⋮        )
                                          ( p_{N_X}1  p_{N_X}2  …  p_{N_X}N_Y )

The transition probabilities p_ij are again combined into the transition matrix P. Thereby, the sum of the p_ij over each row has to equal 1 (equation (1.36b)). (Similar to equation (1.36a), where, however, transitions in sources are concerned!)
page 55
1.5.1 Definition and Characteristics of the Discrete Memoryless Channel
A special case of the DMC is the binary channel. X as well as Y consist of only two symbols each. The transition matrix reduces to:

P = ( p_11  p_12 )
    ( p_21  p_22 )    (1.45)

The probability p(err) that a transmission error occurs is calculated to be:

p(err) = p(x1) · p(y2|x1) + p(x2) · p(y1|x2) = p(x1) · p_12 + p(x2) · p_21    (1.46)

For the binary symmetric channel (BSC) the relations simplify to:

P = ( 1 − p_err     p_err   )
    (   p_err     1 − p_err )    (1.47)

p(err) = p(x1) · p_err + p(x2) · p_err = [p(x1) + p(x2)] · p_err = p_err
page 56
1.5.1 Definition and Characteristics of the Discrete Memoryless Channel
For the interference-free channel, it is obvious that the information at the output has to be the same as the information at the input of the channel. At the same time, this means that the transmitted information equals the entropy of the source. With an interfered channel, information gets lost during the transmission due to false assignment of some symbols: the information at the output has to be less than the information at the input. This also leads to a decrease of the transmitted information, which is then less than the entropy of the source. For this reason, the transinformation T(X,Y) is defined as a measure for the information actually transmitted. In the following, this transinformation is related to the other information streams.
page 57
1.5.1 Definition and Characteristics of the Discrete Memoryless Channel
Information streams:
(Diagram: Source → Source Coder → Channel → Source Decoder → Sink; on the channel, the information streams are labelled transinformation, equivocation, and irrelevance.)
page 58
1.5.1 Definition and Characteristics of the Discrete Memoryless Channel
Definitions of the entropies in terms of the discrete memoryless channel
Input entropy: mean information content of the input symbols

H(X) = ∑_{i=1}^{N_X} p(x_i) · ld(1/p(x_i))    (1.48)

Output entropy: mean information content of the output symbols

H(Y) = ∑_{j=1}^{N_Y} p(y_j) · ld(1/p(y_j))    (1.49)

Joint entropy: mean uncertainty of the complete transmission system

H(X,Y) = ∑_{i=1}^{N_X} ∑_{j=1}^{N_Y} p(x_i, y_j) · ld(1/p(x_i, y_j))    (1.50)
page 59
1.5.1 Definition and Characteristics of the Discrete Memoryless Channel
Entropy of irrelevance: mean information content at the output in case of a known input symbol (conditional entropy)

H(Y|X) = ∑_{i=1}^{N_X} ∑_{j=1}^{N_Y} p(x_i, y_j) · ld(1/p(y_j|x_i))    (1.51)

Entropy of equivocation: mean information content at the input in case of a known output symbol (conditional entropy); entropy of the information getting lost on the channel

H(X|Y) = ∑_{i=1}^{N_X} ∑_{j=1}^{N_Y} p(x_i, y_j) · ld(1/p(x_i|y_j))    (1.52)
page 60
1.5.1 Definition and Characteristics of the Discrete Memoryless Channel
From the diagram of information streams as well as from equation (1.32a), the following relations can be derived:

H(X,Y) = H(Y,X) = H(X) + H(Y|X) = H(Y) + H(X|Y)    (1.53)
H(X|Y) ≤ H(X)    (1.54a)
H(Y|X) ≤ H(Y)    (1.54b)

The mean transinformation can be determined from these relations:

T(X,Y) = H(X) − H(X|Y)
       = H(Y) − H(Y|X)
       = H(X) + H(Y) − H(X,Y)    (1.55)
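Equations (1.48)–(1.55) can be illustrated with a small Python sketch. The joint distribution p(x_i, y_j) used here is hypothetical and serves only as an example; the consistency check at the end uses relation (1.53):

```python
from math import log2   # ld in the lecture notation

# hypothetical joint distribution p(x_i, y_j) of a 2x2 DMC
p_xy = [[0.30, 0.10],   # p(x1, y1), p(x1, y2)
        [0.05, 0.55]]   # p(x2, y1), p(x2, y2)

p_x = [sum(row) for row in p_xy]                              # marginal of X
p_y = [sum(p_xy[i][j] for i in range(2)) for j in range(2)]   # marginal of Y

def H(ps):
    """Entropy of a probability list, equations (1.48)-(1.50)."""
    return sum(p * log2(1 / p) for p in ps if p > 0)

H_X, H_Y = H(p_x), H(p_y)
H_XY = H([p for row in p_xy for p in row])   # joint entropy (1.50)

T = H_X + H_Y - H_XY                         # transinformation (1.55)

# consistency with (1.53): H(X,Y) = H(X) + H(Y|X), using p(y|x) = p(x,y)/p(x)
H_Y_given_X = sum(p_xy[i][j] * log2(p_x[i] / p_xy[i][j])
                  for i in range(2) for j in range(2))
assert abs(H_XY - (H_X + H_Y_given_X)) < 1e-9
print(round(T, 4))
```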
page 61
1.5.1 Definition and Characteristics of the Discrete Memoryless Channel
In terms of transinformation, two special cases are to be considered, for which the entropy can be derived easily.

For the ideal, undisturbed channel applies:

p_ij = { 1 for i = j
       { 0 for i ≠ j

For the entropies then applies: H(X|Y) = 0, H(Y|X) = 0, H(X,Y) = H(X) = H(Y) and thus T(X,Y) = H(X) = H(Y).

For the useless, completely disturbed channel, the input and output symbols are statistically independent of each other, so that the following applies: p(x_i, y_j) = p(x_i) · p(y_j) = p(y_j|x_i) · p(x_i) ⇒ p(y_j|x_i) = p(y_j) ⇒ p_ij = p_kj. That means that the transition probabilities to a given output symbol are equal for all input symbols. Regarding the entropies, that means: H(X|Y) = H(X), H(Y|X) = H(Y), H(X,Y) = H(X) + H(Y) and thus T(X,Y) = 0.
page 62
1.5.1 Definition and Characteristics of the Discrete Memoryless Channel
Concluding, a concrete example regarding transinformation: 1000 binary, statistically independent and equiprobable symbols (p(0) = p(1) = 0.5) are transmitted via a binary symmetric channel with p_err = 0.01.

Consideration in terms of quality: The mean number of correctly transmitted symbols is 990. However, the exact position of the errors is unknown, so that the following has to apply: T(X,Y) < 0.99 bit/character.

Consideration in terms of quantity:

T(X,Y) = H(Y) − H(Y|X)
       = ∑_{j=1}^{N_Y} p(y_j) ld(1/p(y_j)) − ∑_{i=1}^{N_X} p(x_i) ∑_{j=1}^{N_Y} p(y_j|x_i) ld(1/p(y_j|x_i))
       = 1 − [p_err · ld(1/p_err) + (1 − p_err) · ld(1/(1 − p_err))]
       = 1 − S(p_err) ≈ 0.9192 bit/character
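The quantitative value of the example can be checked with a two-line Python sketch (illustrative only); S denotes the binary entropy function used above:

```python
from math import log2

def S(p):
    """Binary entropy function S(p) = p ld(1/p) + (1-p) ld(1/(1-p))."""
    return p * log2(1 / p) + (1 - p) * log2(1 / (1 - p))

# transinformation of a BSC with equiprobable inputs and p_err = 0.01
T = 1 - S(0.01)
print(round(T, 4))   # 0.9192 bit/character
```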
page 63
1.5 Communication over a Discrete Memoryless Channel
1.5.2 Channel Capacity
Considering a given channel, it is interesting to know how much information can be sent via it at most. The transinformation, however, depends on the probability distribution of the source symbols and thus on the source, so that it is not suitable as a measure for the description of channels. Nevertheless, decoupling from the source is possible by considering the maximum value of the transinformation. Thus, the channel capacity is defined as:

C = (1/ΔT) · max_{p(x_i)} T(X,Y)    (1.56)

The channel capacity C is the maximum of the flow of information and has the dimension bit/s. ΔT describes the time period of the characters transmitted via the channel. With this definition, C only depends on the channel characteristics, that means on the transition probabilities, and not on the source.
page 64
1.5.2 Channel Capacity
Overview of a communication system:

(Block diagram: Source → Source Coder → Channel Coder → BSC Channel → Channel Decoder → Source Decoder → Sink. The ideal source coder works without redundancy; reliable transmission requires R_inf ≤ C.)

Definitions of flows of information:
– Flow of information (entropy / time): H′(X) = H(X) / ΔT
– Flow of transinformation (transinformation / time): T′(X,Y) = T(X,Y) / ΔT
– Flow of decision (decision content / time): H0′(X) = H0(X) / ΔT
page 65
1.5.2 Channel Capacity
Taking the binary symmetric channel (BSC) as an example, the channel capacity is to be derived. For this, the transinformation is determined first, and subsequently its maximum is found.

Transinformation of the BSC:

T(X,Y) = H(Y) − H(Y|X)
       = (p1 + p_err − 2 p1 p_err) · ld[1/(p1 + p_err − 2 p1 p_err)]
       + (1 − p1 − p_err + 2 p1 p_err) · ld[1/(1 − p1 − p_err + 2 p1 p_err)]
       − [p_err · ld(1/p_err) + (1 − p_err) · ld(1/(1 − p_err))]    (1.57)
page 66
1.5.2 Channel Capacity
Graph of the transinformation for different transmission error probabilities:

(Plot: T(X,Y) in bit/character versus p(x1) from 0.0 to 1.0, with curves for p_err = 0, 0.1, 0.2, 0.3, 0.5.)

The position where the maximum of the transinformation of the BSC occurs is independent of the transmission error probability. It is p1 = p(x1) = 0.5 (equiprobable source symbols).
page 67
1.5.2 Channel Capacity
Graph of the channel capacity. It can be derived from the graph of the transinformation on the previous page by plotting the respective maxima of the transinformation over p_err:

(Plot: C·ΔT in bit versus p_err from 0.0 to 1.0; the curve equals 1 at p_err = 0 and p_err = 1 and reaches 0 at p_err = 0.5.)

C · ΔT = max_{p(x_i)} T(X,Y) = 1 − [p_err · ld(1/p_err) + (1 − p_err) · ld(1/(1 − p_err))] = 1 − S(p_err)    (1.58)
page 68
1.5.2 Channel Capacity
As a further example, the binary erasure channel (BEC) is considered. It models the scenario in which a received signal cannot be assigned to one of the transmitted symbols. The data and results are as follows:

P = ( 1 − p_err   p_err       0     )
    (     0       p_err   1 − p_err )    (1.59)

C · ΔT = 1 − p_err    (1.60)

(Plot: C·ΔT in bit versus p_err from 0.0 to 1.0; the capacity decreases linearly from 1 to 0.)
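Result (1.60) can be verified numerically by maximising the transinformation over the input distribution, as required by definition (1.56). The following Python sketch (illustrative only; the grid search over p(x1) is an assumption made for simplicity) does this for p_err = 0.2:

```python
from math import log2

def transinformation(p1, perr):
    """T(X,Y) = H(Y) - H(Y|X) for a BEC with input probabilities (p1, 1-p1)."""
    px = [p1, 1 - p1]
    # BEC transition matrix (1.59): outputs y1, erasure, y2
    P = [[1 - perr, perr, 0.0],
         [0.0, perr, 1 - perr]]
    py = [sum(px[i] * P[i][j] for i in range(2)) for j in range(3)]
    H_Y = sum(p * log2(1 / p) for p in py if p > 0)
    H_Y_given_X = sum(px[i] * P[i][j] * log2(1 / P[i][j])
                      for i in range(2) for j in range(3) if P[i][j] > 0)
    return H_Y - H_Y_given_X

perr = 0.2
# coarse maximisation over the input distribution p(x1)
C = max(transinformation(p1 / 1000, perr) for p1 in range(1, 1000))
print(round(C, 4))   # 0.8 = 1 - perr
```

As for the BSC, the maximum is attained at p(x1) = 0.5, and the numerical result matches the closed form C·ΔT = 1 − p_err.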
page 69
1.5.2 Channel Capacity
Theorem of the channel capacity (Shannon 1948): For every ε > 0 and every flow of information of a source R_inf less than the channel capacity C (R_inf < C), a binary block code of length n (n above a threshold) exists such that the residual-error probability after decoding at the receiver is less than ε. Inversion: Whatever the effort, for R_inf > C the residual-error probability cannot fall below a certain limit.

The argumentation is carried out using random block codes (random coding argument): the proof considers the average over all codes. Seen against this, all explicitly known codes are bad codes. The theorem of the channel capacity, however, only states that codes with such characteristics exist; Shannon gives no construction rule.
page 70
1.5.2 Channel Capacity
In terms of the channel capacity, the use of infinite code word lengths (that means very large n) would be optimal. This, however, would bring about an infinite delay and complexity. Thus, Gallager improves the theorem of the channel capacity with his error exponent E_G(R_C) for a DMC with N_X input symbols. The objective is an estimate that is valid for smaller code word lengths.

E_G(R_C) = max_{0 ≤ s ≤ 1} max_{p(x_i)} { −s·R_C − ld ∑_{j=1}^{N_Y} [ ∑_{i=1}^{N_X} p(x_i) · p(y_j|x_i)^{1/(1+s)} ]^{1+s} }    (1.61)

An (n,k) block code (k: information word length) always exists with

R_C = (k/n) · ld N_X < C·ΔT,    (1.62)

so that the following applies for the word error probability:

P_w < 2^{−n·E_G(R_C)}    (1.63)
page 71
1.5.2 Channel Capacity
(Plot: E_G(R_C) versus R_C; the curve starts at E_G(0) = R_0, decreases, and reaches zero at R_C = C.)
The error exponent according to Gallager has the following characteristics:EG(RC) > 0 for RC < CEG(RC) = 0 for RC ≥ C
The R0 value is defined as:
R0 = EG(RC = 0) (1.64)
page 72
1.5.2 Channel Capacity
The maximum for E_G(R_C = 0) is obtained for s = 1. The R_0 value, which is also referred to as "computational cut-off rate", is then:

R_0 = E_G(R_C = 0) = max_{p(x_i)} { −ld ∑_{j=1}^{N_Y} [ ∑_{i=1}^{N_X} p(x_i) · √(p(y_j|x_i)) ]² }    (1.65)

If R_C is not set to zero, the equation reads as follows:

E_G(R_C) ≥ max_{p(x_i)} { −ld ∑_{j=1}^{N_Y} [ ∑_{i=1}^{N_X} p(x_i) · √(p(y_j|x_i)) ]² } − R_C = R_0 − R_C    (1.66)
page 73
1.5.2 Channel Capacity
R_0 theorem: An (n,k) block code with

R_C = (k/n) · ld N_X < C·ΔT

always exists so that for the word error probability (with maximum likelihood decoding) the following applies:

P_w < 2^{−n·(R_0 − R_C)}    (1.67)

As in the case of the theorem of the channel capacity, this theorem only states the existence and does not provide any construction rule for good codes.

Co-domain for the code rate:
– 0 ≤ R_C ≤ R_0: estimate of P_w using (1.67)
– R_0 ≤ R_C ≤ C·ΔT: estimate of P_w is difficult to calculate
– R_C > C·ΔT: P_w cannot become arbitrarily small
page 74
1.5.2 Channel Capacity
Comparison of channel capacity and R_0 value for a BSC:

(Plot: C·ΔT and R_0 in bit/character versus p_err from 0.0 to 0.5; the R_0 curve lies below the capacity curve.)

The R_0 graph is below the graph of the channel capacity and indicates a limit predicting worse characteristics of the code. However, this limit is more realistic in the sense that it can be reached with finite code word lengths.
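For the BSC, the value of (1.65) can be evaluated directly. The following Python sketch (illustrative only) assumes that, by the symmetry of the channel, the maximum over p(x_i) is attained at the equiprobable input distribution, and compares the result with the closed form R_0 = 1 − ld(1 + 2√(p_err(1 − p_err))) that follows from it:

```python
from math import log2, sqrt

def R0_bsc(perr):
    """R0 value (1.65) of a BSC, evaluated at p(x1) = p(x2) = 0.5."""
    P = [[1 - perr, perr],
         [perr, 1 - perr]]
    px = [0.5, 0.5]
    inner = [sum(px[i] * sqrt(P[i][j]) for i in range(2)) for j in range(2)]
    return -log2(sum(v ** 2 for v in inner))

perr = 0.1
# closed form for the BSC, derived by expanding the squares above
closed = 1 - log2(1 + 2 * sqrt(perr * (1 - perr)))
assert abs(R0_bsc(perr) - closed) < 1e-12
print(round(R0_bsc(perr), 4))   # 0.3219
```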
page 75
Coding Theory
2 Fundamentals of Channel Coding
In chapter 1, the influence of interference on the channel, which causes a reduction of the transmitted information, was described. In practice, interference results in random errors occurring during the transmission of symbols.
In this chapter, the fundamental parameters of channel coding are to bedescribed:– code rate,– minimum distance and– error probabilities.
Dependent on a quality criterion like the bit error rate (BER), a certain number of (residual) errors can be accepted; channel codes have to be applied to limit them. With voice transmission, e.g., a BER of 10⁻³ is accepted.
Furthermore, decoding rules are introduced and bounds theoreticallyachievable are identified.
page 76
2 Fundamentals of Channel Coding
2.1 Introduction

Basic principle of channel coding:
The transmission line with channel coding/decoding is divided into:
– classification of the source symbol stream into information words u of length k,
– assignment of u to a transmit word x ∈ C of length n,
– transmission of x; with this, superposition of an error word f,
– estimation from the receive word y = x + f, that means assignment of y to the most likely code word x̂ ∈ C,
– assignment of x̂ to the estimated information word û.

(Block diagram: Channel Coder → Channel → Channel Decoder; the channel decoder consists of the code word estimator and the information word assignment.)
page 77
2.1 Introduction
Regarding this, the task of coding is to add redundancy for error protection (redundant codes).

For the coding

u = (u0, u1, …, u_{k−1}) → c = (c0, c1, …, c_{n−1})

n ≥ k applies. The m = n − k added symbols are referred to as check symbols. Each of the single symbols u_i and c_i, respectively, can in general take q values (q: number of symbols). Then there are N = q^k different messages and code words, respectively, and q^n − q^k invalid words.

The code space, that means the set of all code words, is referred to as C.

In this chapter, only binary block codes are considered, that means q = 2, and k and n, respectively, are constant for a code. In the following, the repetition codes and the parity check codes are considered as examples.
page 78
2.1 Introduction
Repetition code:
The information consists of one information bit (k = 1) that is repeated (n − 1)-fold. With this, all in all 2^k = 2 code words exist, which are
c1 = (0 0 0 … 0) and c2 = (1 1 1 … 1).
The decoding is carried out by majority decision if n is odd.

Example: n = 5 ⇒ R_C = 1/5
u1 = (0) → c1 = (0 0 0 0 0) and u2 = (1) → c2 = (1 1 1 1 1)

Interfered receive sequences:
f1 = (0 1 0 0 1) and x1 = c2 = (1 1 1 1 1): y1 = x1 + f1 = (1 0 1 1 0) → û1 = (1) (successful decoding)
f2 = (1 1 0 1 0) and x2 = c1 = (0 0 0 0 0): y2 = x2 + f2 = (1 1 0 1 0) → û2 = (1) (decoding error)

The addition is made bit by bit without carry.
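The majority decision for the two interfered receive sequences of the example can be sketched in Python (illustrative only, not part of the lecture):

```python
def decode_repetition(y):
    """Majority decision for a repetition code: estimate 1 if ones predominate (n odd)."""
    return 1 if sum(y) > len(y) // 2 else 0

y1 = [1, 0, 1, 1, 0]   # x1 = (1 1 1 1 1) + f1 = (0 1 0 0 1)
y2 = [1, 1, 0, 1, 0]   # x2 = (0 0 0 0 0) + f2 = (1 1 0 1 0)
print(decode_repetition(y1))   # 1 -> successful decoding (u2 = (1) was sent)
print(decode_repetition(y2))   # 1 -> decoding error (u1 = (0) was sent)
```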
page 79
2.1 Introduction
Parity Check Code:
The code consists of k information bits and one control bit (m = 1). With this, n = k + 1. There are 2^k code words of the form
c = (u p) with p = u0 + u1 + u2 + … + u_{k−1}.
From this, an even number of ones always results in the code words.

Example for k = 3:

  Code word | Information | Control bit
     c0     |    000      |     0
     c1     |    001      |     1
     c2     |    010      |     1
     c3     |    011      |     0
     c4     |    100      |     1
     c5     |    101      |     0
     c6     |    110      |     0
     c7     |    111      |     1

The parity check then reads as follows:
s0 = y0 + y1 + y2 + … + y_{n−1} = 0 → no error
s0 = y0 + y1 + y2 + … + y_{n−1} = 1 → error

y1 = (0 1 1 0) → no error detected
y2 = (1 1 1 0) → error detected
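Encoding and parity checking for the k = 3 example can be sketched as follows (illustrative only; the additions are carried out modulo 2, i.e. without carry):

```python
def encode(u):
    """Append the control bit p = u0 + u1 + ... + u_{k-1} (mod 2, even parity)."""
    return u + [sum(u) % 2]

def syndrome(y):
    """s0 = y0 + y1 + ... + y_{n-1} (mod 2): 0 -> no error, 1 -> error."""
    return sum(y) % 2

print(encode([0, 1, 1]))        # [0, 1, 1, 0], i.e. code word c3
print(syndrome([0, 1, 1, 0]))   # 0 -> no error detected
print(syndrome([1, 1, 1, 0]))   # 1 -> error detected
```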
page 80
2 Fundamentals of Channel Coding
2.2 Code Rate
A source generates information with a mean rate of R_i bit per second (bps), that means every T_i = 1/R_i seconds one bit is emitted. The channel coder assigns a code word of length n to every message of length k, so that the channel bit rate reads as follows:

R_K = R_i · (n/k)    (2.1)

The code rate is then:

R_C = k/n = R_i/R_K = T_K/T_i    (2.2)

Since n ≥ k applies, R_C ≤ 1. The code rate measures the relative information content of every code word and is one of the key values for the performance of a channel code. Though a high value of R_C means that more information is transmitted, protection from transmission errors is more difficult due to the smaller number of redundant symbols.
page 81
2.2 Code Rate
In some systems, R_i and R_K are fixed parameters. For a given message length k, n is then the greatest integer that does not exceed k · R_K / R_i. If this term itself is no integer, dummy bits have to be included in order to maintain the synchronisation.

Example: A system is given by R_i = 3 bps and R_K = 4 bps. With this, R_C = 3/4.
If a channel coder with k = 3 is to be made available, n = 4 has to be chosen; loss of synchronisation does not occur.
If a channel coder with k = 5 is to be made available, the product results in k · R_K / R_i = 6 + 2/3. Thus, n = 6. However, 2 dummy bits have to be added after every third code word in order to close the gap of 2/3:

  3 × (k = 5) information bits = 15 bits ⇒ 3 × (n = 6) code bits + 2 dummy bits = 20 bits, R_C = 3/4
page 82
2 Fundamentals of Channel Coding
2.3 Decoding Rules
The channel decoder has to use a decoding rule for the receive word y in order to decide which transmit word x was transmitted. That means that a decision rule D(.) is searched for, for which the following applies: x = D(y).
Let p(y | x) be the probability that y was received after transmitting x. For a discrete memoryless channel it can be expressed by the transmission probabilities:

p(y | x) = ∏_{i=0}^{n−1} p(y_i | x_i)    (2.3)

Let p(x) be the a-priori probability for the message x to be sent. By means of Bayes' rule (equation 1.9), the probability that x was transmitted when y was received is given by:

p(x | y) = p(y | x) · p(x) / p(y)    (2.4)
page 83
2.3 Decoding Rules

From the transmission point of view, the 3 possible results are as follows:
− y = x, that means error-free transmission: x̂ = x, û = u,
− y ≠ x, y ∈ C, that means falsification of the receive word into another code word (not correctable): x̂ ≠ x, û ≠ u,
− y ≠ x, y ∉ C, that means erroneous transmission, in which the error is observable and possibly correctable: x̂ = x or x̂ ≠ x, and accordingly û = u or û ≠ u.

From the receiver's point of view, there are 3 possible decoding results:
− Correct decoding, where either no errors have occurred on the channel or all errors have been corrected properly: x̂ = x, û = u.
− Erroneous decoding, that means an error has occurred in such a way that the estimation has assigned an erroneous code word to the receive word: x̂ ≠ x, û ≠ u.
− Decoding failure, that means an error was detected; however, the decoder does not find a solution for this error.
page 84
2 Fundamentals of Channel Coding

The objective is now to minimise the word error probability. For this, x̂ has to be selected in such a way that p(x̂ | y) becomes maximal. Using equation (2.4), there is also the possibility to maximise p(y | x̂) · p(x̂).

Word error probability for a DMC:

p_W = p(û ≠ u) = p(x̂ ≠ x) = ∑_{x ∈ C} p(x̂ ≠ x | x sent) · p(x sent)    (2.5)

The conditional probability for an estimation error is:

p(x̂ ≠ x | x sent) = ∑_{y: x̂ ≠ x} p(y received | x sent) = ∑_{y: x̂ ≠ x} p(y | x)    (2.6)

It is assumed that the code words are transmitted with the same probability. Then for p_W the following applies:

p_W = 2^{−k} · ∑_{x ∈ C} ∑_{y: x̂ ≠ x} p(y | x)
    = 2^{−k} · [ ∑_{x ∈ C} ∑_{y} p(y | x) − ∑_{x ∈ C} ∑_{y: x̂ = x} p(y | x) ]
    = 1 − 2^{−k} · ∑_{y} p(y | x̂)    (2.7)
page 85
2.3 Decoding Rules

Maximum a-posteriori rule:
The probability that the decision from the receive word y to the code word x is correct is just p(x | y). The probability that an error occurs is then 1 − p(x | y). Thus, the minimisation of the estimation error by the selection of x̂ is the maximisation of the a-posteriori probability p(x | y).

The corresponding maximum a-posteriori rule (MAP rule) reads as follows:

x̂ = D_MAP(y), with x̂ ∈ C,    (2.8a)

so that

p(x̂ | y) ≥ p(x | y) for all x ∈ C    (2.8b)

This decoding rule results in a minimum of decoding errors. Using equation (2.4) and since p(y) is independent of x, equation (2.8b) is simplified to:

p(y | x̂) · p(x̂) ≥ p(y | x) · p(x) for all x ∈ C    (2.8c)
page 86
2.3 Decoding Rules
Maximum likelihood decoding rule:
An alternative decoding rule is based on the maximisation of p(y | x), the probability that with the transmission of x the word y was received. This rule does not guarantee a minimum of errors, but it is easier to implement, since the channel input probabilities are not required.

The corresponding maximum likelihood decoding rule (MLD rule) reads as follows:

x̂ = D_MLD(y), with x̂ ∈ C,    (2.9a)

so that

p(y | x̂) ≥ p(y | x) for all x ∈ C    (2.9b)

For equiprobable input symbols, the MLD and the MAP rule lead to the same results (compare the following example).
page 87
2.3 Decoding Rules
Example: A BSC with p_err = 0.4 and a channel coder with (k,n) = (2,3) are assumed, that means a channel code with N = 2² = 4 code words of length n = 3. The code words and their occurrence probabilities are assumed to be:

  Code word    p(x)
  x1 = (000)   0.4
  x2 = (011)   0.2
  x3 = (101)   0.1
  x4 = (110)   0.3

Assumption: y = (111). The MLD rule provides:

p(y|x1) = p(111|000) = p(1|0)·p(1|0)·p(1|0) = 0.064
p(y|x2) = p(111|011) = p(1|0)·p(1|1)·p(1|1) = 0.144
p(y|x3) = p(111|101) = p(1|1)·p(1|0)·p(1|1) = 0.144
p(y|x4) = p(111|110) = p(1|1)·p(1|1)·p(1|0) = 0.144

This rule provides three equal maximum probabilities, so that a unique decision is not possible.

The MAP rule, however, provides (by means of equation (2.8c)):

p(y|x1)·p(x1) = 0.064 · 0.4 = 0.0256
p(y|x2)·p(x2) = 0.144 · 0.2 = 0.0288
p(y|x3)·p(x3) = 0.144 · 0.1 = 0.0144
p(y|x4)·p(x4) = 0.144 · 0.3 = 0.0432

Minimising the error is unique here by selection of the code word x4, that means x̂ = D_MAP(y = (111)) = x4.
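The numbers of this example can be reproduced with a short Python sketch (illustrative only, not part of the lecture):

```python
perr = 0.4
codewords = {(0, 0, 0): 0.4, (0, 1, 1): 0.2, (1, 0, 1): 0.1, (1, 1, 0): 0.3}
y = (1, 1, 1)

def p_y_given_x(y, x):
    """Equation (2.3) for the BSC: product of per-position transition probabilities."""
    p = 1.0
    for yi, xi in zip(y, x):
        p *= (1 - perr) if yi == xi else perr
    return p

mld = {x: p_y_given_x(y, x) for x in codewords}
map_ = {x: p_y_given_x(y, x) * codewords[x] for x in codewords}
print({x: round(v, 4) for x, v in mld.items()})   # MLD ties between three code words

best = max(map_, key=map_.get)                    # MAP decision is unique
print(best, round(map_[best], 4))                 # (1, 1, 0) 0.0432
```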
page 88
2.3 Decoding Rules
Decoding sphere:The decoding sphere describes an n-dimensional sphere around a code wordwith the radius t. All n-digit receive words located within a definite decoding sphere are decoded as the respective code word.
It applies that the total number of all vectors within the decoding spheres is less than or equal to the total number of all possible words.
The decoding sphere forms the basis for the Hamming bound (see below).
(Figure: the space of possible receive words; code words are surrounded by decoding spheres of radius t, and possible receive words that are no code words lie inside or outside the spheres.)
page 89
2.3 Decoding Rules
Decoding methods:
A block code that can definitely correct t errors is considered.

Maximum likelihood decoding (MLD):
The receive word is compared to all code words (2^k necessary vector comparisons). Though this method is optimal, it is very complex. Possibly, more than t errors are correctable.

Bounded distance decoding (BDD):
Every code word is surrounded by a decoding sphere of radius t0; the spheres can overlap. Decoding takes place only for those receive vectors y located in exactly one sphere. All receive vectors y located in no sphere or in several spheres are not decoded.

Bounded minimum distance decoding (BMD):
Every code word is surrounded by a decoding sphere of radius t. The decoding spheres are disjoint. Decoding takes place only for the receive vectors y located in a sphere.
page 90
2.3 Decoding Rules
(Figure: decoding regions compared for MLD, BDD, and BMD.)

Further decoding methods are, e.g., the syndrome decoding and the majority decision. The syndrome decoding is easy to realise; however, it is suboptimal since the complete information is not used. The majority decision is already known from the repetition codes and is suboptimal as well.
page 91
2 Fundamentals of Channel Coding
2.4 Hamming Distance
Assumption: Two binary code words x and y of length n are given. The Hamming distance between x and y, d_H(x,y), is defined as the number of bit positions in which x and y differ. For arbitrary binary words x, y and z, the Hamming distance fulfils the following conditions:
− dH(x,y) ≥ 0, with equality, if x = y, (2.10a)− dH(x,y) = dH(y,x) and (2.10b)− dH(x,y) + dH(y,z) ≥ dH(x,z) (triangle inequality). (2.10c)
Example: Let n = 8 and
x = (1 1 0 1 0 0 0 1),
y = (0 0 0 1 0 0 1 0),
z = (0 1 0 1 0 0 1 1).

The Hamming distances for all combinations are given in the table:

  d_H | x  y  z
  ----+---------
   x  | 0  4  2
   y  | 4  0  2
   z  | 2  2  0

Check of the triangle inequality (equation (2.10c)) results in:
d_H(x,y) + d_H(y,z) ≥ d_H(x,z) ⇒ 4 + 2 ≥ 2
page 92
2.4 Hamming Distance
(Hamming) weight:
A binary vector x of length n is given. Then the Hamming weight w_H(x) is defined as the number of ones in x:

w_H(x) = ∑_{i=0}^{n−1} x_i    (2.11)

Example: w_H(0 1 1 1 0 1 0 1) = 5.

Between the Hamming distance d_H and the Hamming weight w_H the following relation exists:

d_H(x,y) = w_H(x + y)    (2.12)

Example:
x     = (0 1 1 1 0 1 0 1)
y     = (1 0 1 0 0 1 0 1)  ⇒ d_H(x,y) = 3
x + y = (1 1 0 1 0 0 0 0)  ⇒ w_H(x + y) = 3
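Definitions (2.11) and (2.12) can be checked for the example vectors with a small Python sketch (illustrative only):

```python
def w_H(x):
    """Hamming weight (2.11): number of ones in x."""
    return sum(x)

def d_H(x, y):
    """Hamming distance: number of positions in which x and y differ."""
    return sum(xi != yi for xi, yi in zip(x, y))

x = [0, 1, 1, 1, 0, 1, 0, 1]
y = [1, 0, 1, 0, 0, 1, 0, 1]
s = [(xi + yi) % 2 for xi, yi in zip(x, y)]   # bitwise addition without carry

print(w_H(x))      # 5
print(d_H(x, y))   # 3
print(w_H(s))      # 3 = dH(x, y), relation (2.12)
```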
page 93
2.4 Hamming Distance
2.4.1 Decoding Rule for the BSC
A BSC with the bit error probability perr is assumed. With regard to the MLD rule, the following results for the conditional probability:

p(y|x) = ∏_{i=0}^{n−1} p(yi|xi)  with  p(yi|xi) = 1 − perr for yi = xi,  perr for yi ≠ xi

       = perr^dH(x,y) · (1 − perr)^(n − dH(x,y)) = (1 − perr)^n · ( perr / (1 − perr) )^dH(x,y)     (2.13)

Equation (2.13) is maximal if dH(x,y) is minimal. According to the MLD decoding rule, the estimated vector x̂ is to be chosen such that it is nearest to the receive word y:

dH(x̂,y) ≤ dH(x,y)  for all x ∈ C     (2.14)

Moreover, equation (2.13) states that, if perr < 0.5, the probability of one error is greater than that of two or three errors etc., that means

p(i errors) > p(i + 1 errors),  0 ≤ i ≤ n − 1.     (2.15)
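The MLD rule (2.14) amounts to a nearest-neighbour search over the code. A small Python sketch (the repetition-code codebook is an assumed illustration; ties are broken arbitrarily here, whereas the lecture treats ambiguous cases as "detected only"):

```python
def hamming_distance(x, y):
    return sum(a != b for a, b in zip(x, y))

def mld_decode(y, codebook):
    """Minimum-distance decoding for the BSC (equation (2.14)):
    return the code word nearest to the receive word y."""
    return min(codebook, key=lambda c: hamming_distance(c, y))

# Illustration with the length-3 repetition code
codebook = ["000", "111"]
assert mld_decode("010", codebook) == "000"
assert mld_decode("110", codebook) == "111"
```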
page 94
2.4.1 Decoding Rule for the BSC

Example: The following channel code is assumed:

Message | Code word
  00    |   000
  01    |   001
  10    |   011
  11    |   111

Since n = 3, N = 8 possible receive words exist, 4 of which are not code words. If the receive word is one of the code words, the decoding rule assigns this code word to the estimate (x̂ = y). If y is not a code word, however, the decoding rule reads as follows:

  y  | dH(y,ci)   | Nearest code word | Action
 010 | 1, 2, 1, 2 | 000; 011          | 1 bit error detected
 100 | 1, 2, 3, 2 | 000               | 1 bit error corrected
 101 | 2, 1, 2, 1 | 001; 111          | 1 bit error detected
 110 | 2, 3, 2, 1 | 111               | 1 bit error corrected

The separate consideration of possible pairs of words is the disadvantage of this approach. A parameter describing the general detection and correction characteristics of a code is more interesting.
page 95
2.4 Hamming Distance
2.4.2 Error Detection and Error Correction
Minimum distance of a linear code*:
The minimum distance dmin of a linear block code C is given by the smallest Hamming distance over all pairs of different code words:

dmin = min{ dH(c1,c2) | c1,c2 ∈ C ∧ c1 ≠ c2 }     (2.16)

The minimum distance of a linear code corresponds to the minimum weight of the code without considering the zero word:

dmin = min{ wH(c) | c ∈ C; c ≠ 0 } = wmin     (2.17)

Proof:
dmin = min{ dH(c1,c2) | c1,c2 ∈ C; c1 ≠ c2 }
     = min{ dH(c1 + c2, 0) | c1,c2 ∈ C; c1 ≠ c2 }
     = min{ dH(c,0) | c ∈ C; c ≠ 0 }
     = min{ wH(c) | c ∈ C; c ≠ 0 }

(*) linear codes: see chapter 3
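Equation (2.17) can be verified by brute force on a small code. A sketch, assuming the (3,2) parity-check code as the linear example:

```python
from itertools import product

def gf2_matmul_vec(u, G):
    """Code word x = u G over GF(2); G is a list of rows."""
    n = len(G[0])
    return tuple(sum(u[i] * G[i][j] for i in range(len(u))) % 2
                 for j in range(n))

def hamming_distance(x, y):
    return sum(a != b for a, b in zip(x, y))

# (3,2) parity-check code as a small linear example (assumed illustration)
G = [(1, 0, 1),
     (0, 1, 1)]
code = {gf2_matmul_vec(u, G) for u in product((0, 1), repeat=2)}

# dmin over all pairs (eq. 2.16) equals the minimum nonzero weight (eq. 2.17)
d_pairs = min(hamming_distance(c1, c2) for c1 in code for c2 in code if c1 != c2)
w_min = min(sum(c) for c in code if any(c))
assert d_pairs == w_min == 2
```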
page 96
2.4.2 Error Detection and Error Correction
Error detection characteristic:
A block code C detects up to t errors if its minimum distance is greater than t:

dmin > t     (2.18a)

This is an important characteristic for ARQ methods, where codes for error detection are required.

Error correction characteristic:
A block code C corrects up to t errors if its minimum distance is greater than 2t:

dmin > 2t     (2.18b)

This is an important characteristic for FEC methods, where codes for error correction are required.
page 97
2.4.2 Error Detection and Error Correction
[Figure: decoding spheres for dmin = 3 and dmin = 4]

Thus, the number of detectable errors is given by

te = dmin − 1.     (2.19)

The number of correctable errors is given by

t = (dmin − 2) / 2, if dmin is even, and     (2.20a)
t = (dmin − 1) / 2, if dmin is odd.     (2.20b)

With dmin = 1, neither error detection nor error correction can be guaranteed; with dmin = 2, at least one single error is always detectable; with dmin = 3, at least one single error is always correctable and at least two errors are detectable, etc.
page 98
2.4.2 Error Detection and Error Correction
Examples:
1) With the repetition code, (n − 1)/2 errors (rounded down) are correctable and n − 1 errors are detectable.
2) With the parity check, no error is correctable and, in general, an odd number of errors is detectable.
3) In the example of the table below, an error is definitely detectable since dmin = 2.

Code word | Information | Control bit | Weight
   c0     |     000     |      0      |   0
   c1     |     001     |      1      |   2
   c2     |     010     |      1      |   2
   c3     |     011     |      0      |   2
   c4     |     100     |      1      |   2
   c5     |     101     |      0      |   2
   c6     |     110     |      0      |   2
   c7     |     111     |      1      |   4
page 99
2 Fundamentals of Channel Coding
2.5 Bounds and Perfect Codes
As observed in section 2.4, the minimum distance dmin specifies the ability of a code C to detect and correct errors. Based on different problems, interesting questions result:

− A code C of the length n is to be generated which shall have the minimum distance dmin(C). Does a maximum number (upper bound) of possible code words (messages) N then exist?
− A code C of the length n for messages of the length k is to be developed. What is the maximum error protection (that means maximum dmin) obtainable with these parameters?
− A code C for messages of the length k with a given error protection (t bits) is to be developed. What is the shortest length n of a code word that can be used for this?

In the following, various bounds are considered.
page 100
2.5 Bounds and Perfect Codes
Singleton bound:
The minimum distance and the minimum weight of a linear code C are bounded by

dmin = wmin ≤ n − k + 1 = m + 1     (2.21)

Proof: In general, q^k different code words exist, which have a minimum distance of dmin to each other. If the last dmin − 1 symbols are removed from every code word, at least q^k different words still have to remain; otherwise the code would not have a minimum distance of dmin. All in all, q^n different words of the length n exist (all possible combinations). If the last dmin − 1 symbols are removed here, at most q^(n − (dmin − 1)) words are different from each other. Thus, the following has to apply: n − (dmin − 1) = n − dmin + 1 ≥ k, i.e. dmin ≤ n − k + 1.

A code for which equality in equation (2.21) applies is called Maximum Distance Separable (MDS).
page 101
2.5 Bounds and Perfect Codes
From equation (2.21), using equation (2.19), the following can be derived for error detection:

te + 1 = dmin ≤ 1 + m  ⇒  m ≥ te     (2.22a)

That means that at least one control bit per detectable error is required.

For error correction, the following can be derived from equations (2.20a/b):

(dmin − 1) / 2 ≥ t ≥ (dmin − 2) / 2
2t + 1 ≤ dmin ≤ 1 + m  ⇒  m ≥ 2t     (2.22b)

That means that at least two control bits per correctable error are required.
page 102
2.5 Bounds and Perfect Codes
Hamming bound:
For a binary (n,k) block code with N = 2^k code words of the length n and the correction capability t, the following applies:

2^k · ∑_{i=0}^{t} (n over i) ≤ 2^n     (2.23)

with

(n over t) = [ n · (n − 1) · … · (n − (t − 1)) ] / [ t · (t − 1) · … · 1 ]

Proof: The total number of all words within the 2^k existing decoding spheres is at most equal to the number of all possible words (2^n), i.e.:

2^k · [ (n over 0) + (n over 1) + (n over 2) + … + (n over t) ] ≤ 2^n

where (n over 0) counts the code word itself (centre of a decoding sphere), (n over 1) the words around a code word with dH = 1, (n over 2) those with dH = 2, and (n over t) those with dH = t.
page 103
2.5 Bounds and Perfect Codes
Perfect codes:A code C with the variables n, k and dmin is referred to as perfect if equalitywithin the Hamming bound (equation (2.23)) applies.Graphically, this means that all qn possible receive words are located withinthe correction spheres of the qk code words. Thus, with perfect codes all corrupted code words can be assigned unambiguously to a correct codeword.The repetition code of odd length, e.g., is a perfect code.
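The equality condition in the Hamming bound (2.23) that characterises perfect codes can be checked numerically. A sketch (the (23,12) Golay parameters are added for illustration and are not part of the script):

```python
from math import comb

def hamming_bound_holds(n, k, t):
    """Check 2^k * sum_{i=0}^{t} C(n,i) <= 2^n (equation (2.23))."""
    return 2**k * sum(comb(n, i) for i in range(t + 1)) <= 2**n

def is_perfect(n, k, t):
    """Equality in the Hamming bound characterises perfect codes."""
    return 2**k * sum(comb(n, i) for i in range(t + 1)) == 2**n

assert hamming_bound_holds(7, 4, 1)
assert is_perfect(7, 4, 1)     # the (7,4) Hamming code is perfect
assert is_perfect(7, 1, 3)     # repetition code of odd length n = 7
```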
Further bounds:
In addition, there are the Plotkin bound and the Elias bound, which are also upper bounds; they are justified by their different tightness in different parameter ranges. The Gilbert-Varshamov bound is a lower bound; compliance with it guarantees the existence of a code.
(for further information, see Schneider-Obermeier, Friedrichs)
page 104
2.5 Bounds and Perfect Codes
Asymptotic bounds for the minimum distance:
Eventually, the dependency between the code rate RC and the normalised distance dmin/n is to be considered. For large code word lengths (n → ∞), the following terms arise for the mentioned bounds (S(·) denotes the binary entropy function):

– Singleton bound:          RC ≤ 1 − dmin/n     (2.24a)
– Hamming bound:            RC ≤ 1 − S( (1/2) · dmin/n )     (2.24b)
– Plotkin bound:            RC ≤ 1 − 2 · dmin/n     (2.24c)
– Elias bound:              RC ≤ 1 − S( (1/2) · (1 − √(1 − 2 · dmin/n)) )     (2.24d)
– Gilbert-Varshamov bound:  RC ≥ 1 − S( dmin/n )     (2.24e)
page 105
2.5 Bounds and Perfect Codes
The upper bounds form different limits on the error protection (dmin/n) at which a certain information content (RC) can be transmitted. The Gilbert-Varshamov bound forms the lower limit. Codes whose values lie below this lower limit have to be considered bad codes, since theory states that for every bad code a better one with a higher code rate exists. Codes lying above the Gilbert-Varshamov bound can accordingly be considered good codes.

[Figure: RC = k/n versus dmin/n; the Singleton, Plotkin, Elias and Hamming curves are upper bounds, the Gilbert-Varshamov curve is the lower bound]
page 106
2 Fundamentals of Channel Coding
2.6 Error Probabilities
2.6.1 Error Occurrence Probability
It is assumed that a BSC with the error probability perr is given. From this, a mean number of errors n̄err per code word results, which is calculated as:

n̄err = n · perr     (2.25a)

The probability that no error occurs is:

p(f = 0) = (1 − perr) · (1 − perr) · … · (1 − perr) = (1 − perr)^n     (n factors)     (2.25b)

The probability that a single error occurs at a certain position (f = f1, wH(f1) = 1) is:

p(f = f1) = perr · (1 − perr)^(n−1)     (2.25c)

The probability that a single error occurs at an arbitrary position (f = f1, wH(f1) = 1) is:

p(f = f1) = n · perr · (1 − perr)^(n−1)

The probability of ne errors at arbitrary positions (f = f_ne, wH(f_ne) = ne) is:

p(f = f_ne) = (n over ne) · perr^ne · (1 − perr)^(n−ne)     (2.25d)
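Equations (2.25a-d) describe a binomial distribution. A Python sketch checking them numerically (n = 7 and perr = 0.1 are assumed example values):

```python
from math import comb

def p_ne_errors(n, n_e, p_err):
    """Probability of exactly n_e errors at arbitrary positions (eq. 2.25d)."""
    return comb(n, n_e) * p_err**n_e * (1 - p_err)**(n - n_e)

n, p_err = 7, 0.1
# (2.25b): no error; (2.25d) with n_e = 0 gives the same value
assert abs(p_ne_errors(n, 0, p_err) - (1 - p_err)**n) < 1e-12
# the probabilities over all n_e sum to 1 (binomial distribution)
assert abs(sum(p_ne_errors(n, i, p_err) for i in range(n + 1)) - 1.0) < 1e-12
# the mean number of errors equals n * p_err (eq. 2.25a)
mean = sum(i * p_ne_errors(n, i, p_err) for i in range(n + 1))
assert abs(mean - n * p_err) < 1e-12
```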
page 107
[Figure: error occurrence probability p(wH(f) = ne) over ne for perr = 0.01, 0.1 and 0.3]
2.6.1 Error Occurrence Probability
For n = 7, the error occurrence probability according to the number of bit errors is represented in the figure.

For the binomial coefficients (eq. 2.26), the following generally applies:

(n over 0) = (n over n) = 1;   (n over k) = n! / (k! · (n − k)!) = (n over n−k)     (2.26)

Using these, the binomial formula can be represented as:

∑_{k=0}^{n} (n over k) = 2^n;   ∑_{k=0}^{n} (n over k) · a^k · b^(n−k) = (a + b)^n     (2.27)
page 108
2.6 Error Probabilities
2.6.2 Probability of Undetected Errors
Let xi and xj be code words; xi was sent and y = xj was received. The decoding rule will assume that no error has occurred and regard xj as the code word that was transmitted. This special case of a decoding error cannot be avoided; the error remains undetected. For codes that shall detect errors, the probability pue (undetected error) that an error remains undetected is an important parameter, calculated by:

pue = ∑_{i=0}^{N−1} p(xi) · ∑_{j=0, j≠i}^{N−1} p(xj | xi)

    = ∑_{i=0}^{N−1} p(xi) · ∑_{j=0, j≠i}^{N−1} perr^dH(xi,xj) · (1 − perr)^(n − dH(xi,xj))

    = (1/N) · ∑_{i=0}^{N−1} ∑_{j=0, j≠i}^{N−1} perr^dH(xi,xj) · (1 − perr)^(n − dH(xi,xj))     (2.28a)

(in the last step, equal occurrence probability p(xi) = 1/N is used).
page 109
2.6.2 Probability of Undetected Errors
It is assumed that all code words have equal occurrence probability. The double sum in (eq. 2.28a) and the need for all possible Hamming distances are the main difficulty in terms of the calculation. For the linear binary codes discussed in the next chapter, the distribution of dH(xi,xj) can be determined from the number Ai of code words with the weight i:

pue = ∑_{i=dmin}^{n} Ai · perr^i · (1 − perr)^(n−i)     (2.28b)

Using the equation

A( perr / (1 − perr) ) = 1 + ∑_{i=dmin}^{n} Ai · ( perr / (1 − perr) )^i     (2.29)

the following notation results:

pue = (1 − perr)^n · [ A( perr / (1 − perr) ) − 1 ]     (2.30a)
page 110
2.6.2 Probability of Undetected Errors
For small error probabilities perr, pue can be calculated approximately by:

pue ≈ ∑_{i=dmin}^{n} Ai · perr^i ≈ A_dmin · perr^dmin     (2.30b)

Example:
Consider the (7,4) Hamming code, which is a linear binary code. This code has one code word each with the weight 0 and the weight 7, and 7 code words each with the weights 3 and 4. The minimum distance is dmin = 3. From this, a probability pue results:

pue = 7 (1 − perr)^4 perr^3 + 7 (1 − perr)^3 perr^4 + perr^7
    = 7 perr^3 − 21 perr^4 + 21 perr^5 − 7 perr^6 + perr^7 ≈ 7 perr^3
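The example above can be reproduced numerically. A sketch using the weight distribution of the (7,4) Hamming code given in the text:

```python
# Weight distribution of the (7,4) Hamming code: A0 = 1, A3 = A4 = 7, A7 = 1
A = {0: 1, 3: 7, 4: 7, 7: 1}
n, d_min = 7, 3

def p_ue(p_err):
    """Probability of an undetected error (equation (2.28b))."""
    return sum(A[i] * p_err**i * (1 - p_err)**(n - i)
               for i in A if i >= d_min)

p = 1e-3
exact = p_ue(p)
approx = 7 * p**3                  # pue ≈ A_dmin * perr^dmin (eq. 2.30b)
assert abs(exact - approx) / approx < 0.01   # within 1 % for small perr
```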
page 111
2.6 Error Probabilities
2.6.3 Residual-Error Probability
Let C = {c0, c1, c2, ... , c_(2^k − 1)} be a linear block code with the correction capability t. Usage of bounded minimum distance decoding (BMD) is assumed. Let the error vector be referred to as f. Then for the word error probability with BMD, pW,BMD, the following applies:

pW,BMD = 1 − p(correct decoding) = 1 − p(wH(f) ≤ t) = p(wH(f) ≥ t + 1)
       = 1 − ∑_{i=0}^{t} p(wH(f) = i) = ∑_{i=t+1}^{n} p(wH(f) = i)     (2.31a)

With equation (2.25d), the probability of error vectors with the weight wH(f) = i results in:

p(wH(f) = i) = (n over i) · perr^i · (1 − perr)^(n−i)     (2.31b)

From this, the following results:

pW,BMD = 1 − ∑_{i=0}^{t} (n over i) · perr^i · (1 − perr)^(n−i) = ∑_{i=t+1}^{n} (n over i) · perr^i · (1 − perr)^(n−i)     (2.31c)
page 112
2.6.3 Residual-Error Probability
For small error probabilities, the following estimation applies:

pW,BMD < (n over t+1) · perr^(t+1)     (2.31d)

Between the word error probabilities with MLD and BMD the following relation exists:

pW,MLD ≤ pW,BMD

(the equals sign applies if the code is a perfect one).

Example: For the (7,4) Hamming code (see above, t = 1) results:

pW = 1 − (7 over 0) · perr^0 · (1 − perr)^7 − (7 over 1) · perr^1 · (1 − perr)^6
   = 1 − (1 − perr)^7 − 7 perr (1 − perr)^6
   = 1 − (1 − 7 perr + 21 perr^2 − …) − 7 perr (1 − 6 perr + …)
   ≈ 21 perr^2 = (7 over 2) · perr^2
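The estimate pW ≈ 21 perr² can be checked against the exact sum (2.31c). A Python sketch:

```python
from math import comb

def p_word_bmd(n, t, p_err):
    """Word error probability with BMD (equation (2.31c))."""
    return sum(comb(n, i) * p_err**i * (1 - p_err)**(n - i)
               for i in range(t + 1, n + 1))

p = 1e-3
exact = p_word_bmd(7, 1, p)        # (7,4) Hamming code, t = 1
approx = comb(7, 2) * p**2         # pW ≈ 21 perr² for small perr
assert abs(exact - approx) / approx < 0.05
```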
page 113
2.6.3 Residual-Error Probability
Eventually, the bit error probability pbit after decoding is to be estimated:

pbit = (1/k) · ∑_{i=0}^{n} (no. of bit errors per decoded word | wH(f) = i) · p(wH(f) = i)     (2.32)

The number of bit errors is limited to the number of information bits k and to i + t, because (triangle inequality):

dH(u,û) ≤ dH(x,x̂) ≤ dH(x,y) + dH(y,x̂)  with  dH(x,y) = i  and  dH(y,x̂) ≤ t     (2.33)

⇒  pbit ≤ (1/k) · ∑_{i=t+1}^{n} min(k, i + t) · p(wH(f) = i)     (2.34)

And finally:

pbit ≤ ∑_{i=t+1}^{n} min(1, (i + t)/k) · (n over i) · perr^i · (1 − perr)^(n−i)     (2.35a)

For small error probabilities the estimate applies:

pbit ≤ min(1, dmin/k) · (n over t+1) · perr^(t+1)     (2.35b)
page 114
Coding Theory
3 Single-Error Correcting Block Codes
In this chapter, fundamental knowledge of linear block coding is imparted. Besides the mathematical dependencies and their properties, two simple code examples of great importance are considered: the general cyclic codes and the Hamming codes. Furthermore, practical approaches for the realisation of such codes by means of shift register circuits are considered.

Based on a block code with the number of levels q, the number of information bits k, the number of code word bits n and the minimum distance dmin, the general term reads:

(n,k,dmin)q block code

In this chapter, only binary codes are considered (q = 2); therefore, the index is dropped. In most terms, instead of the minimum distance the number of control bits m is used, or both are dropped:

(n,k,m) block code and (n,k) block code, respectively
page 115
3 Single-Error Correcting Block Codes
An essential indicator is the classification into systematic / non-systematic codes.

With systematic codes, information bits and control bits can be separated from each other:

c = (u, p)     (3.1)

u:  u0 u1 u2 u3 ... uk−1
     ↓  ↓  ↓  ↓       ↓
c:  c0 c1 c2 c3 ... ck−1 | ck ck+1 ... cn−1
    (information bits)     (control bits)

m = n − k     (3.2)

By contrast, with non-systematic codes information bits and control bits cannot be separated. However, block codes can always be converted (e.g. by transposing bits) to equivalent systematic codes.
page 116
3 Single-Error Correcting Block Codes
3.1 Mathematical Principles
Calculating in the binary number system (modulo 2):

Addition:          Multiplication:
⊕ | 0 1            ⊗ | 0 1
0 | 0 1            0 | 0 0
1 | 1 0            1 | 0 1

In an equation, a modulo calculation is often made explicit by stating it, e.g.: 1 + 1 = 0 mod 2. Mathematically, this number system with its calculation rules represents the simplest example of a field K (chapter 4). Practically, addition and multiplication can be realised by an XOR and an AND gate, respectively.

Binary words form so-called n-tuples of elements of K^n (the space of all n-digit bit sequences). The addition in K^n is effected component by component,

example from K^4: 0011 + 0101 = 0110,

whereas the multiplication by elements of K is simply defined as:

0·b = 0 and 1·b = b for all b ∈ K^n.
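The correspondence between modulo-2 arithmetic and the XOR/AND gates mentioned above can be written down directly. A minimal sketch:

```python
# Modulo-2 addition is XOR, modulo-2 multiplication is AND
def add2(a, b):
    return a ^ b

def mul2(a, b):
    return a & b

assert add2(1, 1) == 0     # 1 + 1 = 0 mod 2
assert mul2(1, 1) == 1

# Component-wise addition in K^4: 0011 + 0101 = 0110
x, y = (0, 0, 1, 1), (0, 1, 0, 1)
assert tuple(add2(a, b) for a, b in zip(x, y)) == (0, 1, 1, 0)
```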
page 117
3.1 Mathematical Principles
Representation of block codes as a matrix:
Let u be an information vector of the length k and let x be the respective code word vector of a block code of the length n. Then u and x are related by

x = u G,     (3.3)

where G is the generator matrix (k × n matrix) of the block code. If a code can be described by this matrix multiplication for any u, it is linear. The rows of the generator matrix have to be linearly independent.

For systematic block codes, G has the form

G = [Ik P]     (3.4)

with the k × k unit matrix Ik and the k × (n − k) control bit matrix P.
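The coding rule x = u G can be sketched in a few lines (the (7,4) control-bit matrix P below is an assumed illustration, not a code taken from the script):

```python
def encode(u, G):
    """Code word x = u G over GF(2); G is a list of rows."""
    n = len(G[0])
    return tuple(sum(u[i] * G[i][j] for i in range(len(u))) % 2
                 for j in range(n))

# Systematic G = [I_k P] for a (7,4) code; the P columns are assumed values
G = [(1, 0, 0, 0, 1, 1, 0),
     (0, 1, 0, 0, 0, 1, 1),
     (0, 0, 1, 0, 1, 1, 1),
     (0, 0, 0, 1, 1, 0, 1)]
x = encode((1, 0, 1, 1), G)
assert x[:4] == (1, 0, 1, 1)   # with G = [I_k P], the information bits appear unchanged
```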
page 118
3.1 Mathematical Principles
Polynomial representation of block codes:
An n-digit code word c can be represented as a polynomial c(x) as follows:

c = (c0 c1 ... cn−1)  ⇔  c(x) = ∑_{i=0}^{n−1} ci x^i     (3.5)

     c      |  c(x)
0000 ... 0  |  0
1000 ... 0  |  1
0100 ... 0  |  x
0010 ... 0  |  x^2
000 ... 01  |  x^(n−1)
1111 ... 1  |  1 + x + x^2 + ... + x^(n−1)
1010 ... 0  |  1 + x^2

Examples:
c1 = 0011 ⇔ c1(x) = x^2 + x^3;
c2 = 0101 ⇔ c2(x) = x + x^3
(see also the table for further polynomial examples of code words of the length n).

The addition of vectors corresponds to the addition of polynomials.
Example: c1 + c2 = 0110 ⇔ c1(x) + c2(x) = x^2 + x^3 + x + x^3 = x + x^2.
page 119
3.1 Mathematical Principles
Weighting function:
Let C be a block code and let Ai be the number of code words of C that have the weight i (0 ≤ i ≤ n). Then the weighting functions A(z) and W(x,y), respectively, are defined as:

A(z) = ∑_{i=0}^{n} Ai z^i     (3.6a)
W(x,y) = ∑_{i=0}^{n} Ai x^(n−i) y^i     (3.6b)

Between these two notations the following relations exist:

A(z) = W(1,z)     (3.7a)
W(x,y) = x^n · A(y/x)     (3.7b)

The weighting function can be calculated in closed form only for a small number of codes. Some codes have a symmetrical weight distribution: Ai = An−i. The weight distribution makes the accurate calculation of the residual-error probability possible (see chapter 2). For the weight distribution of a linear code with the minimum distance dmin the following applies: A0 = 1; An ≤ 1 and Ai = 0 for 0 < i < dmin.
page 120
3.1 Mathematical Principles
Example: (4,3) parity check

Code word | Information | Control bit | Weight
   c0     |     000     |      0      |   0
   c1     |     001     |      1      |   2
   c2     |     010     |      1      |   2
   c3     |     011     |      0      |   2
   c4     |     100     |      1      |   2
   c5     |     101     |      0      |   2
   c6     |     110     |      0      |   2
   c7     |     111     |      1      |   4

The weight distribution from the table results in the weighting function:

A(z) = 1 + 6z^2 + z^4
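The weight distribution of a small code can be obtained by enumerating all information words. A sketch for the (4,3) parity check (the systematic generator matrix is an assumed form):

```python
from itertools import product
from collections import Counter

def encode(u, G):
    """Code word x = u G over GF(2)."""
    n = len(G[0])
    return tuple(sum(u[i] * G[i][j] for i in range(len(u))) % 2
                 for j in range(n))

# (4,3) parity check: the control bit makes the overall parity even
G = [(1, 0, 0, 1),
     (0, 1, 0, 1),
     (0, 0, 1, 1)]
weights = Counter(sum(encode(u, G)) for u in product((0, 1), repeat=3))
A = {i: weights.get(i, 0) for i in range(5)}
assert A == {0: 1, 1: 0, 2: 6, 3: 0, 4: 1}   # A(z) = 1 + 6 z^2 + z^4
```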
page 121
3 Single-Error Correcting Block Codes
3.2 Terms and Features of Linear Block Codes
For a linear block code C, which is described by equation (3.3), the following applies:

– Every code word is a linear combination of rows of the generator matrix G;
– The code is composed of all linear combinations of rows of G;
– The sum of two code words is again a code word; and
– The zero vector (0 0 0 ... 0) is included in the code.

Fundamental row operations on G do not change the code, that means the code space C remains constant:

– Permutation of two rows,
– Multiplication of a row by a scalar not equal to 0 and
– Addition of one row to another.

However, the assignment of the information words u to the code words x changes.
page 122
3.2 Terms and Features of Linear Block Codes
Parity-check matrix:
For a linear block code C, a parity-check matrix H is defined by:

c H^T = 0 for all c ∈ C  and  x H^T ≠ 0 for all x ∉ C     (3.8a/b)

H is an (n − k) × n matrix. It applies:

0 = c H^T = (u G) H^T = u (G H^T)  →  G H^T = 0     (3.9)

That means that the generator matrix G and the parity-check matrix H are orthogonal. As for G, fundamental row operations are also allowed for H. If C is a systematic code with G = (Ik P), then for H applies:

H = (P^T In−k)     (3.10)

For this case, orthogonality can be proven as follows (H^T consists of the block P stacked over In−k):

G H^T = (Ik P) · ( P over In−k ) = Ik P + P In−k = P + P = 0
page 123
3.2 Terms and Features of Linear Block Codes
Syndrome:
Let H be the parity-check matrix of a linear (n,k,m) block code and x ∈ C any code word to be sent. During the transmission, an error word f is superposed so that the receive word y reads:

y = x + f.     (3.11)

For the product of y and H^T then applies:

y H^T = (x + f) H^T = x H^T + f H^T = 0 + f H^T = f H^T = s     (3.12)

The product y H^T is referred to as the syndrome s; the represented calculation is the calculation of the syndrome in matrix notation. The features of s are:
– s is a zero vector only if y is a code word,
– all those error vectors f are recognised that are not code words, and
– s is independent of the code word.
page 124
3.2 Terms and Features of Linear Block Codes
Dual code:
Based on a code C with the generator matrix G and the parity-check matrix H, a dual code Cd, described by its dual generator matrix Gd and its dual parity-check matrix Hd, is defined as:

Gd = H  and  Hd = G     (3.13a/b)

The code words of the two codes are orthogonal. Using c = u G ∈ C and cd = v H ∈ Cd, the following applies:

cd c^T = (v H)(u G)^T = v H G^T u^T = v (G H^T)^T u^T = v 0 u^T = 0

Examples of codes that are dual to each other are the repetition code (index W) and the parity check where the first digit is the control bit (index P):

GW = ( 1 | 1 1 ... 1 ) = HP

HW = ( 1 | 1 0 ... 0 )
     ( 1 | 0 1 ... 0 )   = GP
     ( ⋮ |     ⋱     )
     ( 1 | 0 0 ... 1 )
page 125
3.2 Terms and Features of Linear Block Codes
Weighting function of the dual code (MacWilliams identity):
If A(z) is the weighting function of an (n,k) block code, then the weighting function Ad(z) of the corresponding dual code is:

Ad(z) = 2^(−k) · (1 + z)^n · A( (1 − z)/(1 + z) )     (3.14a)
A(z) = 2^(−(n−k)) · (1 + z)^n · Ad( (1 − z)/(1 + z) )     (3.14b)

Wd(x,y) = 2^(−k) · W(x + y, x − y)     (3.15a)
W(x,y) = 2^(−(n−k)) · Wd(x + y, x − y)     (3.15b)

Example: For the (3,2) parity check, the code C is C = {000, 011, 101, 110}. The weighting function reads: A(z) = 1 + 3z^2.
The corresponding dual code is Cd = {000, 111} (repetition code). Its weighting function is calculated to be:

Ad(z) = 2^(−2) · (1 + z)^3 · [ 1 + 3 · ( (1 − z)/(1 + z) )^2 ] = 1 + z^3
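The MacWilliams identity (3.14a) can be verified numerically for the example above. A sketch evaluating both sides at sample points:

```python
def A(z):          # weighting function of the (3,2) parity check
    return 1 + 3 * z**2

def A_dual(z):     # MacWilliams identity (3.14a) with n = 3, k = 2
    return 2**(-2) * (1 + z)**3 * A((1 - z) / (1 + z))

# the dual code is the (3,1) repetition code with A_d(z) = 1 + z^3
for z in (0.0, 0.25, 0.5, 2.0):
    assert abs(A_dual(z) - (1 + z**3)) < 1e-12
```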
page 126
3.2 Terms and Features of Linear Block Codes
Modifications of linear codes:An (n,k,dmin)q code can be modified to an (n',k',dmin')q code as follows:
– Extending: appending additional control bitsn' > n, k' = k, m' > m, RC' < RC, dmin' ≥ dmin
– Puncturing: reduction of control bitsn' < n, k' = k, m' < m, RC' > RC, dmin' ≤ dmin
– Lengthening: appending additional information bitsn' > n, k' > k, m' = m, RC' > RC, dmin' ≤ dmin
– Shortening: reduction of information bitsn' < n, k' < k, m' = m, RC' < RC, dmin' ≥ dmin
By adding a parity check bit, the extension of a code with an odd minimum distance can increase the minimum distance by 1, so that the code can detect one further error (example: see Hamming codes).
page 127
3 Single-Error Correcting Block Codes
3.3 Cyclic Codes
Definition of cyclic codes:
A linear (n,k) block code is called cyclic if every cyclic shift of a code word results in a code word:

(c0, c1, ... , cn−2, cn−1) ∈ C  ⇒  (cn−1, c0, c1, ... , cn−2) ∈ C     (3.16)

The set of cyclic codes is a subset of the linear block codes. Accordingly, between an information word

u = (u0, u1, u2, ... , uk−1)

and a code word

c = (c0, c1, c2, ... , cn−1)

the following relation exists:

c = u G.     (3.17)

Thus, a cyclic code is also described by a generator matrix G. Thereby, every cyclic code has at least one generator matrix with band structure (see equation (3.18) and the following example).
page 128
3.3 Cyclic Codes
Example of a cyclic code:

G = ( 1 1 0 1 0 0 0
      0 1 1 0 1 0 0
      0 0 1 1 0 1 0
      0 0 0 1 1 0 1 )

  u   |    x           u   |    x
 0000 | 0000000       1011 | 1111111
 1000 | 1101000       1010 | 1110010
 0100 | 0110100       0101 | 0111001
 0010 | 0011010       1100 | 1011100
 0001 | 0001101       0110 | 0101110
 1110 | 1000110       0011 | 0010111
 0111 | 0100011       1111 | 1001011
 1101 | 1010001       1001 | 1100101

Due to the cyclic property, the code can easily be derived. In this case it is sufficient to know the code words {0000000, 1101000, 1110010, 1111111}. At first, however, these have to be found (using linear combinations of rows of G plus the zero word). The remaining words result from cyclic shifts.
Stop criterion: 2^k words found.
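The derivation described above — cyclic shifts of a few representative code words — can be mechanised. A sketch:

```python
def cyclic_shift(c):
    """One cyclic shift: (c0, ..., cn-1) -> (cn-1, c0, ..., cn-2)."""
    return c[-1:] + c[:-1]

# Closure under cyclic shifts, starting from the representatives on the slide
seeds = {"0000000", "1101000", "1110010", "1111111"}
code = set()
for s in seeds:
    c = tuple(int(b) for b in s)
    for _ in range(7):
        code.add(c)
        c = cyclic_shift(c)

assert len(code) == 16                               # 2^k = 16 code words found
assert all(cyclic_shift(c) in code for c in code)    # cyclic property holds
```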
page 129
3.3 Cyclic Codes
Generator matrix with band structure:

G = ( g0 g1 g2 ... gn−k  0   0  ...  0
       0  g0 g1 g2 ... gn−k  0  ...  0
       ⋮               ⋱              ⋮
       0  ...  0  g0 g1 g2 ... gn−k )     (3.18)

Caution: Generator matrices with band structure do not necessarily describe cyclic codes!
Example: C = {00000, 11010, 01101, 10111} with the generator matrix

G = ( 1 1 0 1 0
      0 1 1 0 1 )

is not a cyclic code.
page 130
3.3 Cyclic Codes
Generator polynomial:
The generator matrix G of a cyclic code is thus characterised completely by one row of G. It contains at most n − k + 1 entries different from 0: g0, g1, g2, ... , gn−k. These coefficients provide a special polynomial:

g(x) = g0 + g1 x + g2 x^2 + ... + gn−k x^(n−k)     (3.19a)

g(x) is called the generator polynomial. With binary codes, the coefficients of the generator polynomial can only take the values 0 or 1. For an (n,k) block code, the following must additionally apply:

g0 = gn−k = 1  ⇒  g(x) = 1 + g1 x + g2 x^2 + ... + x^(n−k)     (3.19b)

Using polynomials, a simplified description of the generator matrix, message and code vectors results for cyclic codes.
page 131
3.3 Cyclic Codes
The polynomial and the matrix notation are listed below side by side:

Information words:
u = (u0, u1, u2, ... , uk−1)     u(x) = u0 + u1 x + u2 x^2 + ... + uk−1 x^(k−1)     (3.20a/b)

Code words:
c = (c0, c1, c2, ... , cn−1)     c(x) = c0 + c1 x + c2 x^2 + ... + cn−1 x^(n−1)     (3.20c/d)

Coding rules:
c = u G     c(x) = u(x) · g(x)     (3.20e/f)

With the equations (3.18/19b), both rules provide the same results:
c0 = u0 g0
c1 = u0 g1 + u1 g0
c2 = u0 g2 + u1 g1 + u2 g0
⋮

Consequently, the general presentation of a code word polynomial is:
c(x) = (u0 + u1 x + u2 x^2 + ... + uk−1 x^(k−1)) · g(x)
⇒ g(x) is the code word polynomial with the smallest degree greater than zero.
page 132
3.3 Cyclic Codes
Period of a generator polynomial:
Let g(x) be the generator polynomial of a cyclic code C, and let r be the smallest exponent for which

x^r + 1 = 0 mod g(x)     (3.21)

applies. g(x) is then a divisor of x^r + 1, and r is the period of g(x). A cyclic feature only exists if for the code word length n the following applies:

n = r     (3.22)

For the proof, the following theorem is applied, which is quite plausible using equation (3.20f):

Theorem: Every code word polynomial c(x) is divisible by g(x) without remainder.

Example: g(x) = 1 + x + x^3, u(x) = 1 + x^2 + x^3
⇒ c(x) = u(x) · g(x) = 1 + x + x^2 + x^3 + x^4 + x^5 + x^6 ⇔ c = (1111111)
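The multiplication c(x) = u(x)·g(x) over GF(2) from the example can be sketched with coefficient lists (index = power of x):

```python
def poly_mul_gf2(a, b):
    """Multiply two GF(2) polynomials given as coefficient lists
    (index = power of x)."""
    res = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            res[i + j] ^= ai & bj      # addition mod 2 is XOR
    return res

g = [1, 1, 0, 1]        # g(x) = 1 + x + x^3
u = [1, 0, 1, 1]        # u(x) = 1 + x^2 + x^3
c = poly_mul_gf2(u, g)
assert c == [1, 1, 1, 1, 1, 1, 1]   # c(x) = 1 + x + ... + x^6 ⇔ c = (1111111)
```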
page 133
3.3 Cyclic Codes
Proof of the condition (3.22): It shall apply that
c = (c0, c1, c2, ... , cn−1) ∈ C ⇒ d = (cn−1, c0, c1, ... , cn−2) ∈ C.

c(x) = c0 + c1 x + c2 x^2 + ... + cn−1 x^(n−1) is divisible by g(x) without remainder:
c(x) = 0 mod g(x) ⇒ x · c(x) = 0 mod g(x)

d(x) = cn−1 + c0 x + c1 x^2 + ... + cn−2 x^(n−1)
     = cn−1 + x (c0 + c1 x + c2 x^2 + ... + cn−1 x^(n−1)) + cn−1 x^n
     = x · c(x) + cn−1 (1 + x^n)

d is a code word only if d(x) is divisible by g(x) without remainder. Therefore, the following has to apply:

1 + x^n = 0 mod g(x).     (3.23)

Only if the generator polynomial fulfils condition (3.23) does a cyclic code result.
page 134
3.3 Cyclic Codes
Determination of a generator polynomial:
g(x) is a divisor of 1 + x^n. In general, this means that 1 + x^n can be factorised; every factor can then be used as a generator polynomial.

Example: x^7 + 1 = (1 + x) ⋅ (1 + x + x^3) ⋅ (1 + x^2 + x^3)

Calculation of the period of a given generator polynomial: This is done by reversing the polynomial division algorithm. The polynomial is shifted in such a way that the lowest remaining powers always cancel each other out:

Example: (x^3 + x + 1) ⋅ p(x) = x^n + 1

x^3 + 0 + x + 1     | ⋅ 1
x^4 + 0 + x^2 + x   | ⋅ x
x^5 + 0 + x^3 + x^2 | ⋅ x^2    ⇒ p(x) = x^4 + x^2 + x + 1 = h(x)
x^7 + 0 + x^5 + x^4 | ⋅ x^4
―――――――――――――――――――
x^7 + 0 + 0 + 0 + 0 + 0 + 0 + 1 ⇒ n = 7
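The shift-and-cancel procedure above amounts to testing x^r + 1 for divisibility by g(x) for increasing r. A small sketch (not part of the lecture; GF(2) polynomials are encoded as integer bitmasks, bit i holding the coefficient of x^i):

```python
def gf2_mod(a: int, b: int) -> int:
    """Remainder of the GF(2) polynomial division a(x) mod b(x).

    Polynomials are integer bitmasks: bit i = coefficient of x^i."""
    db = b.bit_length() - 1
    while a and a.bit_length() - 1 >= db:
        a ^= b << (a.bit_length() - 1 - db)   # cancel the leading term
    return a

def period(g: int) -> int:
    """Smallest r with x^r + 1 = 0 mod g(x), i.e. the period of g(x)."""
    r = 1
    while gf2_mod((1 << r) | 1, g) != 0:
        r += 1
    return r

print(period(0b1011))   # g(x) = 1 + x + x^3  ->  period r = 7
```

For g(x) = 1 + x + x^3 this reproduces r = n = 7 from the example.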
Check matrix and check polynomial:
For the generator polynomial g(x) of a cyclic (n,k) block code the following applies:

x^n + 1 = g(x) ⋅ h(x). (3.24)

h(x) is the check polynomial and has the degree k:

h(x) = h0 + h1 x + h2 x^2 + ... + hk x^k (3.25a)

with h0 = hk = 1. (3.25b)

The respective (n − k) × n parity-check matrix H is composed of the reversed coefficient sequence of h(x), shifted by one position from row to row:

    ⎡ hk hk−1 ⋯ h1 h0  0   0  ⋯  0 ⎤
    ⎢ 0  hk  hk−1 ⋯ h1 h0   0  ⋯  0 ⎥
H = ⎢ ⋮      ⋱          ⋱         ⋮ ⎥
    ⎣ 0  ⋯   0   hk hk−1  ⋯  h1 h0 ⎦
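Equation (3.24) yields the check polynomial by exact division, h(x) = (x^n + 1)/g(x). A sketch of the division with quotient (an illustrative aside, not part of the lecture; integer bitmasks, bit i = coefficient of x^i):

```python
def gf2_divmod(a: int, b: int):
    """Quotient and remainder of the GF(2) polynomial division a(x) / b(x).

    Polynomials are integer bitmasks: bit i = coefficient of x^i."""
    q, db = 0, b.bit_length() - 1
    while a and a.bit_length() - 1 >= db:
        shift = a.bit_length() - 1 - db
        q |= 1 << shift            # record the quotient term x^shift
        a ^= b << shift            # cancel the leading term of a(x)
    return q, a

# h(x) = (x^7 + 1) / g(x) for g(x) = 1 + x + x^3
h, rem = gf2_divmod((1 << 7) | 1, 0b1011)
print(bin(h), rem)   # 0b10111 0  ->  h(x) = 1 + x + x^2 + x^4
```

The result agrees with the period calculation on the previous slide, where p(x) = h(x) = x^4 + x^2 + x + 1.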
Systematic coding:
In general, cyclic codes are not systematic. In order to achieve a systematic code, the following construction rule is used:

1. Multiplication of the message polynomial u(x) by x^(n−k):
   u(x) ⋅ x^(n−k) = u0 x^(n−k) + u1 x^(n−k+1) + u2 x^(n−k+2) + ... + u(k−1) x^(n−1) (3.26a)
   (corresponds to shifting the message digits to the highest coefficients).

2. Division of the polynomial u(x) ⋅ x^(n−k) by g(x):
   u(x) ⋅ x^(n−k) = q(x) ⋅ g(x) + r(x) (3.26b)
   with q(x) as the integer division result and r(x) as the division remainder (in general r(x) ≠ 0 and r(x) = r0 + r1 x + r2 x^2 + ... + r(n−k−1) x^(n−k−1)).

3. Divisibility without remainder is enforced by addition of r(x):
   u(x) ⋅ x^(n−k) + r(x) = q(x) ⋅ g(x) = c(x).

Resulting code word: c = (r0, r1, r2, ... , r(n−k−1), u0, u1, u2, ... , u(k−1)) (3.27)
Example:
Using the cyclic (7,4) block code with the generator polynomial
g(x) = x^3 + x + 1,
a systematic code word is to be developed for the information word
u = (1 0 0 1), i.e. u(x) = x^3 + 1.

1. For n = 7 and k = 4, n − k = 3. From this results: u(x) ⋅ x^(n−k) = x^6 + x^3.
2. Division of the polynomial u(x) ⋅ x^(n−k) by g(x):

   (x^6 + 0 + 0 + x^3) : (x^3 + 0 + x + 1) = x^3 + x + (x^2 + x)/(x^3 + x + 1)
 + (x^6 + 0 + x^4 + x^3)
   ――――――――――――――――――――
   ( 0 + 0 + x^4 + 0 + 0 + 0)
 + (       x^4 + 0 + x^2 + x)
   ――――――――――――――――――――
   ( 0 + 0 + x^2 + x) = r(x)

3. c(x) = u(x) ⋅ x^(n−k) + r(x) = x^6 + x^3 + x^2 + x ⇒ c = (0 1 1 1 0 0 1)
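The three construction steps can be mirrored one-to-one (an illustrative sketch, not part of the lecture; GF(2) polynomials as integer bitmasks, bit i = coefficient of x^i):

```python
def gf2_mod(a: int, b: int) -> int:
    """a(x) mod b(x) over GF(2); polynomials as integer bitmasks."""
    db = b.bit_length() - 1
    while a and a.bit_length() - 1 >= db:
        a ^= b << (a.bit_length() - 1 - db)
    return a

def encode_systematic(u: int, g: int, n: int, k: int) -> int:
    shifted = u << (n - k)        # step 1: u(x) * x^(n-k)
    r = gf2_mod(shifted, g)       # step 2: division remainder r(x)
    return shifted ^ r            # step 3: c(x) = u(x)*x^(n-k) + r(x)

# u(x) = 1 + x^3, g(x) = 1 + x + x^3, (7,4) code from the example
c = encode_systematic(0b1001, 0b1011, 7, 4)
bits = [(c >> i) & 1 for i in range(7)]     # coefficients (c0, ..., c6)
print(bits)   # [0, 1, 1, 1, 0, 0, 1]  ->  c = (0 1 1 1 0 0 1)
```

The printed vector matches the code word derived by hand above.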
Decoding of cyclic codes:
The receive vector y and the receive polynomial y(x), respectively, are to be checked:

y = c + f ⇔ y(x) = c(x) + f(x). (3.28)

Every code word c(x) is divisible by the generator polynomial g(x) without remainder; the division then yields the original information word u, which can thus be recovered. For an arbitrary y, this method therefore offers a way to check whether or not y is a code word. In case of an error, the division leaves a remainder:

y(x) / g(x) = q(x) + s(x) / g(x). (3.29a)

s(x) represents the syndrome in polynomial notation:

y(x) = q(x) ⋅ g(x) + s(x). (3.29b)

If s(x) ≠ 0, the error correction takes place by means of a syndrome table. The polynomial division is realised in practice by shift register circuits. It is a great advantage that no matrix multiplications are required for coding and decoding.
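The check of eq. (3.29) amounts to a single polynomial remainder; a sketch (not part of the lecture; integer bitmasks, bit i = coefficient of x^i; a shift register would compute the same remainder serially):

```python
def syndrome(y: int, g: int) -> int:
    """s(x) = y(x) mod g(x); the word is a code word iff s(x) = 0."""
    db = g.bit_length() - 1
    while y and y.bit_length() - 1 >= db:
        y ^= g << (y.bit_length() - 1 - db)
    return y

g = 0b1011          # g(x) = 1 + x + x^3
c = 0b1001110       # c(x) = x^6 + x^3 + x^2 + x, code word of the (7,4) code
print(syndrome(c, g))              # 0: error-free word
print(syndrome(c ^ (1 << 4), g))   # nonzero: single error in position 4
```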
3.4 Hamming Codes
Definition of the Hamming codes:
Let m ≥ 3. A Hamming code is a code with code words of length n = 2^m − 1, consisting of k = 2^m − m − 1 information bits and m check bits. The minimum distance is 3.

Thus, the Hamming code is a (2^m − 1, 2^m − 1 − m) block code. For the code, inter alia, the following values for n, k and m result:

m | 3 | 4  | 5  | 6  | 7   | 8
n | 7 | 15 | 31 | 63 | 127 | 255
k | 4 | 11 | 26 | 57 | 120 | 247

Due to its minimum distance, the Hamming code can correct exactly one error; indeed, it is constructed precisely for this case.
Derivation of the construction rule for Hamming codes:
Aim: construction of a code that can correct exactly one error.

The syndrome s of length m depends only on the error vector f of length n (eq. (3.12)). A single error at position i (fi = 1; 0 ≤ i ≤ n − 1) results in a syndrome equal to the respective row of H^T. In order to distinguish all single-error positions i unambiguously, all rows of H^T have to differ from each other; none of these rows may be the zero vector.

All in all, there are 2^m − 1 different rows without the zero vector, thus n = 2^m − 1. From n = k + m follows k = 2^m − m − 1. In effect, the rows of H^T (the columns of H, respectively) are formed by all possible bit sequences except the zero vector. In doing so, the m = n − k check bits address the 2^m − 1 positions.
Since it is irrelevant which syndrome describes which error position, the bit sequences may be permuted. In doing so, the parity-check matrix can be arranged in such a way that the resulting Hamming code is systematic:

H = (P^T | I(n−k)) (3.30)

Here, P^T consists of all possible column vectors with more than one "1", and I(n−k) is the (n−k) × (n−k) identity matrix. The generator matrix can then be derived easily using equation (3.4).
Example: For the (7,4) Hamming code the following systematic notation results:

G = (I4 | P) =
⎡ 1 0 0 0 | 1 1 1 ⎤
⎢ 0 1 0 0 | 1 1 0 ⎥
⎢ 0 0 1 0 | 1 0 1 ⎥
⎣ 0 0 0 1 | 0 1 1 ⎦

H = (P^T | I3) =
⎡ 1 1 1 0 | 1 0 0 ⎤
⎢ 1 1 0 1 | 0 1 0 ⎥
⎣ 1 0 1 1 | 0 0 1 ⎦

With this, the following systematics with the respective equations for the calculation of the control bits results:

(u0 u1 u2 u3) → (c0 c1 c2 c3 c4 c5 c6)

control bits:
c4 = c0 + c1 + c2
c5 = c0 + c1 + c3
c6 = c0 + c2 + c3
Example (continuation):
Table of code words of the (7,4) Hamming code:

Code word | Information | Control bits
c0  | 0000 | 000
c1  | 0001 | 011
c2  | 0010 | 101
c3  | 0011 | 110
c4  | 0100 | 110
c5  | 0101 | 101
c6  | 0110 | 011
c7  | 0111 | 000
c8  | 1000 | 111
c9  | 1001 | 100
c10 | 1010 | 010
c11 | 1011 | 001
c12 | 1100 | 001
c13 | 1101 | 010
c14 | 1110 | 100
c15 | 1111 | 111
Error correction with Hamming codes:
Example (continuation): error correction using the (7,4) Hamming code.

The respective syndrome reads in general: s = (s0 s1 s2). From the parity-check matrix, the check equations result:

s0 = y0 + y1 + y2 + y4
s1 = y0 + y1 + y3 + y5
s2 = y0 + y2 + y3 + y6

These are analysed according to the table of syndromes:

Error position    | s0 s1 s2
No error          | 0 0 0
Error in 0th pos. | 1 1 1
Error in 1st pos. | 1 1 0
Error in 2nd pos. | 1 0 1
Error in 3rd pos. | 0 1 1
Error in 4th pos. | 1 0 0
Error in 5th pos. | 0 1 0
Error in 6th pos. | 0 0 1

The decoding steps are then:
1. Evaluation of the check equations → syndrome;
2. Determination of the error position from the table of syndromes;
3. Realisation of the error correction: add „1“ at the position of error.
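The three decoding steps can be sketched directly (illustrative Python, not the lecture's shift-register realisation; H_COLS holds the columns of H from the example above):

```python
# Columns of H for the (7,4) Hamming code in the systematic form above
H_COLS = [(1, 1, 1), (1, 1, 0), (1, 0, 1), (0, 1, 1),
          (1, 0, 0), (0, 1, 0), (0, 0, 1)]

def decode(y):
    """Correct at most one error in the 7-bit receive vector y = (y0..y6)."""
    # step 1: evaluate the check equations -> syndrome (s0, s1, s2)
    s = tuple(sum(y[i] * H_COLS[i][j] for i in range(7)) % 2 for j in range(3))
    y = y[:]
    if s != (0, 0, 0):
        pos = H_COLS.index(s)   # step 2: error position from the table
        y[pos] ^= 1             # step 3: add "1" at the error position
    return y

c = [1, 0, 0, 1, 1, 0, 0]       # code word c9: information 1001, control 100
y = c[:]; y[3] ^= 1             # single error in the 3rd position
print(decode(y) == c)           # True
```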
Minimum distance of general Hamming codes:
All columns of the check matrix are different from each other. With this, any two columns are linearly independent. However, since all possible bit combinations occur as columns, three linearly dependent columns can always be found, e.g.:

100000..., 010000... and 110000...

With this, the minimum distance is always

dmin = 3 (3.31a)

and the number of correctable errors

t = 1. (3.31b)

Weighting function of general Hamming codes:

A(z) = 1/(n+1) ⋅ [ (1 + z)^n + n ⋅ (1 + z)^((n−1)/2) ⋅ (1 − z)^((n+1)/2) ] (3.32)
Word error probability and probability of undetected errors of Hamming codes:
On pages 109 and 111, both probabilities have already been calculated for a (7,4) Hamming code. For n equal to 3, 7, 15, 31, 63 and 127 they are depicted in the adjoining diagram.

[Figure: lg pW and lg pue versus lg perr (abscissa −6 ... 0, ordinate −12 ... 0), curves for n = 3, 7, 15, 31, 63 and 127, with the reference lines pW = perr, pW = perr^2 and pW = perr^3.]
Extended Hamming code:
Through evaluation of the syndrome, one error is correctable. With a minimum distance of 3, however, two errors are also detectable, since they likewise result in s ≠ 0; it cannot be distinguished, though, whether one or two errors have occurred.

The extended Hamming code differentiates between the 1-error and the 2-error situation by using an additional control bit. Therefore, it is a (2^m, 2^m − 1 − m) block code. The respective generator matrix of the extended Hamming code reads as follows:

G(H,ext) = ( G(H) | p ) (3.33)

where the appended column p supplies each row of G(H) with an overall parity bit, so that every row of G(H,ext) has even weight.
Possible error events are then:
– no error: s = 0;
– one error: s ≠ 0, sm+1 = 1;
– two errors: s ≠ 0, sm+1 = 0.

Decoding is then carried out according to the following rules:
– s = 0: receive vector = code word;
– s ≠ 0, sm+1 = 1: odd number of errors ⇒ one error assumed and corrected by syndrome evaluation;
– s ≠ 0, sm+1 = 0: even number of errors ⇒ errors cannot be corrected.
3.5 Realisation of Coding and Decoding Using Shift Register Circuits
Basic circuits:

[Figures: a chain of memory cells S0, S1, ..., Sm−1; an adder (inputs a, b, output a + b); a multiplier and a scaling element (output a ⋅ b); a delay chain computing x ⋅ a(x) from a(x).]

Shift registers consist of memory cells, each delaying its input by one clock pulse. The content of memory cell j at time i is referred to as Sj(i). With this, the following applies:

Sj+1(i+1) = Sj(i). (3.34)
Cyclic shifting of a polynomial:
Shift register circuits are an excellent way to realise cyclic coding; all mathematical operations required for it can be carried out with this technique. One of these operations is the cyclic shifting of a polynomial:

[Figure: feedback shift register with cells x^0, x^1, x^2, ..., x^(n−3), x^(n−2), x^(n−1).]

With this circuit,

x^i ⋅ a(x) mod (x^n + 1) (3.35)

is calculated; it forms the simplest example of a feedback (regenerative) shift register.
Multiplication of two polynomials:
An equation of the following form is to be calculated:

c(x) = a(x) ⋅ [b0 + b1 x + b2 x^2 + ... + bn x^n]. (3.36)

[Figure: shift register multiplier with input a(x), tap coefficients b0, b1, ..., bn−1, bn and output c(x); an alternative structure with the taps attached in the reverse arrangement performs the same multiplication.]
Example of a multiplication:
The following is to be calculated:

c(x) = a(x) ⋅ b(x) = (x^4 + x^2 + x + 1) ⋅ (x^3 + x^2 + 1) = x^7 + x^6 + x^5 + x^4 + x + 1

with tap coefficients b0 = 1, b1 = 0, b2 = 1, b3 = 1, input sequence 10111 and output sequence 11110011. The register contents are s0 s1 s2 s3:

i  | s0 s1 s2 s3 | ci
−1 | 0 0 0 0 | 0
0  | 1 0 0 0 | 1
1  | 1 1 0 0 | 1
2  | 1 1 1 0 | 0
3  | 0 1 1 1 | 0
4  | 1 0 1 1 | 1
5  | 0 1 0 1 | 1
6  | 0 0 1 0 | 1
7  | 0 0 0 1 | 1
8  | 0 0 0 0 | 0
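The clocking of the multiplier can be reproduced with a short simulation (an illustrative sketch, not the circuit itself; coefficient lists are ordered lowest power first, and the register is flushed with m zeros after the last input):

```python
def sr_multiply(a, b):
    """Simulate the shift-register multiplier of eq. (3.36) over GF(2).

    a, b are coefficient lists (lowest power first); one coefficient of
    a(x) is clocked in per pulse and one coefficient of
    c(x) = a(x)*b(x) appears at the output per pulse."""
    m = len(b) - 1
    s = [0] * m                       # memory cells (delayed input bits)
    out = []
    for u in a + [0] * m:             # flush the register with m zeros
        taps = [u] + s
        out.append(sum(t * bi for t, bi in zip(taps, b)) % 2)
        s = [u] + s[:-1]              # clock: shift the register
    return out

a = [1, 1, 1, 0, 1]    # a(x) = 1 + x + x^2 + x^4
b = [1, 0, 1, 1]       # b(x) = 1 + x^2 + x^3
print(sr_multiply(a, b))   # [1, 1, 0, 0, 1, 1, 1, 1] = coefficients c0..c7
```

The output sequence matches the ci column of the table above.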
Division of two polynomials:

[Figure: feedback shift register divider with feedback taps b0, b1, ..., bm−1; input a(x^−1), output q(x^−1).]

Based on the circuit, the following equation is calculated:

q(x^−1) = a(x^−1) x^m + q(x^−1) [b0 x^m + b1 x^(m−1) + ... + bm−1 x]
⇒ a(x^−1) = q(x^−1) [b0 + b1 x^−1 + ... + bm−1 x^−(m−1) + x^−m] = q(x^−1) ⋅ b(x^−1)

Substitution x^−1 → x ⇒ a(x) = q(x) ⋅ b(x) ⇒ q(x) = a(x)/b(x). (3.37a)

The division is discontinued after the last digit of a(x^−1) has been written into the shift register. Then q(x^−1) is the output word and s(x^−1) is the content of the memory cells:

a(x)/b(x) = q(x) + s(x)/b(x) (division of two polynomials with remainder). (3.37b)
Example of a division of two polynomials with remainder:
It is calculated:

(x^8 + x^3) : (x^4 + x^3 + x + 1) = x^4 + x^3 + x^2 + (x^3 + x^2)/(x^4 + x^3 + x + 1)

[Figure: divider shift register with cells x^0 ... x^4; input sequence 000100001 (a(x^−1)), output sequence 001110000 (q(x^−1)).]

i | ai | s0 | s1 | s2 | s3 | qi
9 | 0 | 0 | 0 | 0 | 0 | 0
8 | 1 | 1+0=1 | 0+0=0 | 0 | 0+0=0 | 0
7 | 0 | 0+0=0 | 1+0=1 | 0 | 0+0=0 | 0
6 | 0 | 0+0=0 | 0+0=0 | 1 | 0+0=0 | 0
5 | 0 | 0+0=0 | 0+0=0 | 0 | 1+0=1 | 0
4 | 0 | 0+1=1 | 0+1=1 | 0 | 0+1=1 | 1
3 | 1 | 1+1=0 | 1+1=0 | 1 | 0+1=1 | 1
2 | 0 | 0+1=1 | 0+1=1 | 0 | 1+1=0 | 1
1 | 0 | 0+0=0 | 1+0=1 | 1 | 0+0=0 | 0
0 | 0 | 0+0=0 | 0+0=0 | 1 | 1+0=1 | 0
Circuit for coding of a systematic cyclic code:

[Figure: encoder shift register with feedback determined by g(x) and switches S1, S2; the information bits pass directly to the code word output while simultaneously being shifted into the register, after which the control bits are read out of the register.]
Mode of operation:
1. Reading in of the information bits u0, u1, u2, ... , uk−1, starting with uk−1. Reading in from the „right“ corresponds to the multiplication by x^(n−k). After reading in the k information bits, the shift register contains the division remainder r(x).
2. Opening of the switches S1 and S2 (breaking the feedback loop).
3. Output of the control bits.
Example:
The following has to be coded: u(x) = x^3 + x^2 + 1 with g(x) = x^3 + x + 1.

t | s1 | b0 | b1 | b2 | s2 | s3
0 | 0 | 0 | 0 | 0 | 0 | 0
1 | 1 | 1 | 1 | 0 | 1 | 1
2 | 1 | 1 | 0 | 1 | 1 | 1
3 | 0 | 1 | 0 | 0 | 1 | 0
4 | 1 | 1 | 0 | 0 | 1 | 1
5 | 0 | 1 | 0 | 0 | 0 |
6 | 0 | 0 | 1 | 0 | 0 |
7 | 0 | 0 | 0 | 0 | 1 |

(code word = information bits followed by control bits)
Circuit for the decoding:With this circuit, the receive word is divided by the generator polynomialand the syndrome is calculated.
This circuit is also applied for decoding. However, the receive word is hereread in from the right.
Meggitt decoder: decoding with error correction at the same time.

[Figure: Meggitt decoder consisting of a k-bit receive register, the syndrome logic, and a k-bit error register.]
4 Burst-Error Correcting Block Codes
With the Reed-Solomon codes (RS codes), this chapter introduces representatives of the best-known and frequently used codes capable of correcting burst errors. They are closely related to the Bose-Chaudhuri-Hocquenghem codes (BCH codes), whose strength lies rather in correcting statistically distributed single errors. The BCH codes are introduced at the end of this chapter as a special case of the RS codes.

In order to understand these two code families, knowledge of the algebra of finite fields (Galois fields) and of their properties is required. This knowledge is built up first in the following section.

A burst error of length l ≥ 1 is understood as l consecutive digits of an error vector f, the first and the last of which are not equal to zero. The values of the digits located between the first and the last digit of the error burst are arbitrary.
4.1 Fundamental Algebraic Terms for Codes

In this section, a brief introduction to the algebra required for coding theory is given. Basic knowledge of set theory is assumed. The concept of a binary function is summarised briefly.

Binary function:
On a non-empty set M, a binary function ° is understood as a mapping of two elements a, b ∈ M in such a way that:

a ° b = c. (4.1a)

Here, it is not absolutely necessary that a ° b = b ° a. (4.1b)
If condition (4.1b) is fulfilled, however, the function is commutative.
In terms of the binary function, the set is called closed, if:

a ° b ∈ M for all a, b ∈ M. (4.1c)

A binary function is called associative, if:

a ° (b ° c) = (a ° b) ° c, a, b, c ∈ M. (4.1d)
Modulo-m addition:
For a set of whole numbers M = {0, 1, 2, ... , m−1} with m > 0, let the modulo-m addition „⊕“ be defined in such a way that for two elements a, b ∈ M the following applies:

a ⊕ b = r, (4.2a)

where r is the remainder of the division of a + b by m:

a + b = q ⋅ m + r ⇔ q = ⌊(a + b)/m⌋. (4.2b)

According to eq. (4.2b), the remainder r is again an element of the set M; the set M is closed in terms of the modulo-m addition.

Example: M = {0, 1, 2, 3, 4, 5, 6} and modulo-7 addition „⊕“:

⊕ | 0 1 2 3 4 5 6
0 | 0 1 2 3 4 5 6
1 | 1 2 3 4 5 6 0
2 | 2 3 4 5 6 0 1
3 | 3 4 5 6 0 1 2
4 | 4 5 6 0 1 2 3
5 | 5 6 0 1 2 3 4
6 | 6 0 1 2 3 4 5
Modulo-p multiplication:
For a set of positive whole numbers M = {1, 2, 3, ... , p−1}, with p being a prime number, let the modulo-p multiplication „⊗“ be defined in such a way that for two elements a, b ∈ M the following applies:

a ⊗ b = r, (4.3a)

with r being the remainder of the division of a ⋅ b by p:

a ⋅ b = q ⋅ p + r ⇔ q = ⌊(a ⋅ b)/p⌋. (4.3b)

a ⋅ b is not divisible by p without remainder, because p is a prime number. The remainder is again an element of the set M; thus, M is closed.

Example with p = 7: M = {1, 2, 3, 4, 5, 6} and modulo-7 multiplication „⊗“:

⊗ | 1 2 3 4 5 6
1 | 1 2 3 4 5 6
2 | 2 4 6 1 3 5
3 | 3 6 2 5 1 4
4 | 4 1 5 2 6 3
5 | 5 3 1 6 4 2
6 | 6 5 4 3 2 1
4.1.1 Groups

A set G and a function ° are referred to as a group (G, °), if the following conditions for any a, b, c ∈ G are fulfilled:
– the function ° is associative,
– the set G is closed,
– the set G contains a neutral element e with the property
  a ° e = e ° a = a with e ∈ G, and (4.4a)
– for every element a an inverse element a′ ∈ G exists, so that:
  a ° a′ = a′ ° a = e. (4.4b)

A group is referred to as commutative or Abelian, if equation (4.1b) applies for any a, b ∈ G.

Example: The set of integer numbers Z = {... −2, −1, 0, 1, 2, ...} forms a commutative group in terms of addition. The neutral element is 0; for the element a, the inverse element is −a. In terms of multiplication, Z does not form a group, since the inverse elements are not contained in Z.
Finite group:
If a group has a finite number of elements, it is referred to as a finite group.

The neutral element of a group is unambiguous.
Proof: Assuming there are two neutral elements e1 and e2 with property (4.4a), it follows that e1 = e1 ° e2 = e2 ° e1 = e2.

Examples:
The binary set B = {0, 1} with the modulo-2 addition „⊕“:

⊕ | 0 1
0 | 0 1
1 | 1 0

The neutral element is 0, and the respective inverse elements are:
a = 0 ⇒ a′ = 0, a = 1 ⇒ a′ = 1.

The sets with modulo-m addition and modulo-p multiplication considered at the beginning of section 4.1 are also finite groups.
Cyclic groups:
If all elements of a multiplicative group G = {1, g1, g2, ... , gm−1} can be represented as powers of at least one element gi, this group is called cyclic; gi is then called a primitive element of the group of order m. For the „one“ element, the representation 1 = gi^0 applies. G then consists of m elements of the form gi^j, with 0 ≤ j ≤ m − 1:

G = {gi^0, gi^1, gi^2, ... , gi^(m−1)}. (4.5)

For any gk ∈ G, the order of gk describes the number of distinct elements that can be composed as powers gk^l.

Example: multiplicative modulo-5 group G = {1, 2, 3, 4}:

z | z^1 z^2 z^3 z^4 | order
1 | 1 1 1 1 | 1
2 | 2 4 3 1 | 4
3 | 3 4 2 1 | 4
4 | 4 1 4 1 | 2

The elements z = 2 and z = 3 are primitive elements.
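The order table can be recomputed mechanically (an illustrative sketch for the modulo-5 group, not part of the lecture):

```python
p = 5
G = [1, 2, 3, 4]          # multiplicative modulo-5 group

def order(g: int) -> int:
    """Smallest k with g^k = 1 (mod p)."""
    k, a = 1, g
    while a != 1:
        a = a * g % p
        k += 1
    return k

orders = {g: order(g) for g in G}
primitive = [g for g in G if orders[g] == len(G)]   # generators of the group
print(orders)      # {1: 1, 2: 4, 3: 4, 4: 2}
print(primitive)   # [2, 3]
```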
4.1.2 Rings

A set R, on which both the addition ⊕ and the multiplication ⊗ are defined, is called a ring if the following conditions are fulfilled:
– R is a commutative group in terms of ⊕,
– R is closed in terms of ⊗,
– R is associative in terms of ⊗, and
– the distributive law applies: a ⊗ (b ⊕ c) = (a ⊗ b) ⊕ (a ⊗ c) with a, b, c ∈ R. (4.6a)

A commutative ring exists if ⊗ is also commutative. A ring with neutral element exists, if:

a ⊗ 1 = 1 ⊗ a = a with a ∈ R. (4.6b)

Example of a commutative ring with neutral element: the whole numbers Z with common addition and multiplication.

Another example: the whole numbers Zm = {0, 1, 2, ... , m−1} with modulo-m addition and modulo-m multiplication form a commutative ring with neutral element; unambiguous multiplicative inverse elements, however, exist only if m is a prime number.
4.1.3 Fields

A set K, on which the binary functions addition ⊕ and multiplication ⊗ are defined, is a field in terms of ⊕ and ⊗, if the following conditions are fulfilled:
– K is a commutative group in terms of ⊕ with 0 as neutral element,
– K without 0 is a commutative group in terms of ⊗ with 1 as neutral element, and
– the distributive law applies: a ⊗ (b ⊕ c) = (a ⊗ b) ⊕ (a ⊗ c) with a, b, c ∈ K. (4.7)

Galois field:
A field with a finite number of elements forms a Galois field GF. The finite set Zm = {0, 1, 2, ... , m−1} of whole numbers is a Galois field GF(m) = Zm, if m is a prime number, that means: m = p.

Example: GF(2)

⊕ | 0 1      ⊗ | 0 1
0 | 0 1      0 | 0 0
1 | 1 0      1 | 0 1
Several features of Galois fields are listed here without proof:

– Every Galois field has at least one primitive element.
– a ⊗ 0 = 0 ⊗ a = 0, (4.8a)
– from the closure of the multiplication it follows that no zero divisors exist:
  a ≠ 0 and b ≠ 0 → a ⊗ b ≠ 0, (4.8b)
– a ⊗ b = 0 and a ≠ 0 → b = 0, (4.8c)
– −(a ⊗ b) = (−a) ⊗ b = a ⊗ (−b), (4.8d)
– a ⊗ b = a ⊗ c and a ≠ 0 → b = c, (4.8e)
– 1 ⊕ 1 ⊕ 1 ⊕ ... ⊕ 1 = 0 (m summands). (4.8f)
Example: GF(5)

⊕ | 0 1 2 3 4      ⊗ | 0 1 2 3 4
0 | 0 1 2 3 4      0 | 0 0 0 0 0
1 | 1 2 3 4 0      1 | 0 1 2 3 4
2 | 2 3 4 0 1      2 | 0 2 4 1 3
3 | 3 4 0 1 2      3 | 0 3 1 4 2
4 | 4 0 1 2 3      4 | 0 4 3 2 1

a   | 0 1 2 3 4
−a  | 0 4 3 2 1
a−1 | – 1 3 2 4

Direct addition using the table: 1 + 2 = 3, 2 + 4 = 1, 4 + 4 = 3.
Inverse elements of the addition: a + (−a) = 0.
Subtraction with inverse elements of the addition:
3 − 2 = 3 + (−2) = 3 + 3 = 1, 1 − 4 = 1 + (−4) = 1 + 1 = 2.

Direct multiplication using the table: 1 ⋅ 2 = 2, 2 ⋅ 4 = 3, 4 ⋅ 4 = 1.
Inverse elements of the multiplication: a ⋅ a−1 = 1.
Division with inverse elements of the multiplication:
3 ÷ 2 = 3 ⋅ 2−1 = 3 ⋅ 3 = 4, 1 ÷ 3 = 1 ⋅ 3−1 = 1 ⋅ 2 = 2.
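The tables and inverse elements above can be generated mechanically for any prime p (an illustrative sketch, not part of the lecture):

```python
p = 5
add = [[(a + b) % p for b in range(p)] for a in range(p)]
mul = [[(a * b) % p for b in range(p)] for a in range(p)]
neg = [(-a) % p for a in range(p)]                         # a + (-a) = 0
inv = {a: next(b for b in range(1, p) if a * b % p == 1)   # a * a^-1 = 1
       for a in range(1, p)}

assert add[2][4] == 1 and mul[2][4] == 3    # entries from the tables above
assert (3 + neg[2]) % p == 1                # 3 - 2 = 1
assert mul[3][inv[2]] == 4                  # 3 / 2 = 3 * 3 = 4
print(inv)   # {1: 1, 2: 3, 3: 2, 4: 4}
```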
Example: solution of a linear system of equations in GF(5).

The following linear system of equations is given:
I:   2 x0 + x1 = 2
II:  3 x1 + x2 = 3
III: x0 + x1 + 2 x2 = 3

Solution steps:
I:  2 x0 = 2 − x1
II: x2 = 3 − 3 x1
2 ⋅ III: (2 − x1) + 2 x1 + 4 (3 − 3 x1) = 2 ⋅ 3
⇒ 2 + 4 ⋅ 3 − 2 ⋅ 3 = x1 (1 − 2 + 4 ⋅ 3) ⇒ x1 = 3
⇒ x2 = 3 − 3 x1 = 4
⇒ x0 = 2−1 (2 − x1) = 2
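Because GF(5) is finite, the solution can also be confirmed by exhaustive search over all 5^3 candidate vectors (a brute-force sketch, not the elimination used above):

```python
from itertools import product

p = 5
A = [(2, 1, 0),    # I:   2*x0 +   x1        = 2
     (0, 3, 1),    # II:        3*x1 +   x2  = 3
     (1, 1, 2)]    # III:   x0 +  x1 + 2*x2  = 3
b = (2, 3, 3)

solutions = [x for x in product(range(p), repeat=3)
             if all(sum(a * xi for a, xi in zip(row, x)) % p == rhs
                    for row, rhs in zip(A, b))]
print(solutions)   # [(2, 3, 4)]  ->  x0 = 2, x1 = 3, x2 = 4
```

The search returns exactly one solution, confirming that the system is uniquely solvable in GF(5).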
4.1.4 Expansion Fields

Problem:
In 4.1.3, finite fields were built which are, however, limited to prime numbers with regard to the number of their elements. For example, G = {0, 1, 2, 3} does not provide a Galois field, due to the required modulo-4 multiplication:

⊗ | 0 1 2 3
0 | 0 0 0 0
1 | 0 1 2 3
2 | 0 2 0 2
3 | 0 3 2 1

A much more general and, at the same time, more flexible construction of Galois fields can be achieved by the expansion to p^m elements. These are represented as polynomials of degree < m with coefficients pi ∈ GF(p):

p(x) = p0 + p1 x + p2 x^2 + ... + p(m−1) x^(m−1) = Σ(i=0..m−1) pi x^i, with p(m−1) ≠ 0. (4.9)

These altogether p^m possible polynomials can be divided into:
– splittable or reducible polynomials and
– non-splittable or irreducible polynomials.

In digital communication technology, particularly the expansion fields GF(2^m) are of great importance.
Irreducible polynomials:
A polynomial p(x) with coefficients from GF(p) is called irreducible over GF(p), if it cannot be represented as a product of two polynomials pa(x), pb(x) ≠ 0, 1 of lower degree with coefficients from GF(p). Thus, irreducible polynomials do not have a root in GF(p) either. The characteristics of irreducible polynomials and prime numbers are comparable.

Example: p(x) = x^2 + x + 1. Testing the polynomials of degree 1, pa(x) = x and pb(x) = x + 1:

(x^2 + x + 1) ÷ x = x + 1 + 1/x;  (x^2 + x + 1) ÷ (x + 1) = x + 1/(x + 1).

Both divisions leave a remainder, so p(x) is irreducible over GF(2).

Reducible polynomials:
A polynomial p(x) with coefficients from GF(p) is called reducible over GF(p), if it can be decomposed into a product of two polynomials pa(x), pb(x) ≠ 0, 1 of lower degree with coefficients from GF(p).

Example: p(x) = (x + 1) (x^2 + x + 1) = (x^3 + x^2 + x) + (x^2 + x + 1) = x^3 + 1.
Roots:
An irreducible polynomial p(x) of degree > 1 with coefficients from GF(p) does not have roots in GF(p). An element α ∈ GF(p^m) of the expansion field is called a root of p(x), if the following condition is fulfilled:

p(α) = 0. (4.10)

There is an analogy between the expansion of Galois fields and the expansion of the real numbers to the complex numbers:

Definition of the complex numbers | of an expanded Galois field:
R → C                             | GF(2) → GF(2^2)
no solution of x^2 + 1 = 0        | no solution of x^2 + x + 1 = 0
for x ∈ R                         | for x ∈ GF(2)

Field expansion:
α^2 + 1 = 0 ⇒ α^2 = −1           | α^2 + α + 1 = 0 ⇒ α^2 = 1 + α mod 2

Definition of the complex numbers | of the elements of GF(2^2):
c = c0 + c1 α with c0, c1 ∈ R     | c = c0 + c1 α with c0, c1 ∈ GF(2).
Primitive polynomials:
An irreducible polynomial p(x) of degree m with coefficients pi ∈ GF(p) is referred to as primitive, if it has a root α (p(α) = 0) with the following characteristic:

α^i mod p(α), for i = 0, 1, ..., p^m − 2 (4.11)

provides all p^m − 1 elements of the Galois field GF(p^m) that differ from 0. The element α ∈ GF(p^m) is called a primitive element, for which applies:

α^0 = α^n = 1 for n = p^m − 1 (4.12)
and
α^k = α^(k mod (p^m − 1)). (4.13)

Primitive polynomials are irreducible; in general, the reverse does not apply. For every prime field GF(p) and every m, at least one primitive polynomial p(x) exists. If more than one primitive polynomial exists, the construction yields equivalent expansion fields.
Primitive polynomials for the construction of Galois fields GF(2^m):

m | primitive polynomial        m  | primitive polynomial
1 | x + 1                       9  | x^9 + x^4 + 1
2 | x^2 + x + 1                 10 | x^10 + x^3 + 1
3 | x^3 + x + 1                 11 | x^11 + x^2 + 1
4 | x^4 + x + 1                 12 | x^12 + x^7 + x^4 + x^3 + 1
5 | x^5 + x^2 + 1               13 | x^13 + x^4 + x^3 + x + 1
6 | x^6 + x + 1                 14 | x^14 + x^8 + x^6 + x + 1
7 | x^7 + x + 1                 15 | x^15 + x + 1
8 | x^8 + x^6 + x^5 + x^4 + 1   16 | x^16 + x^12 + x^3 + x + 1
There are three possibilities to represent the elements of a Galois field:
– component representation (polynomial notation),
– exponential representation (according to equation (4.11)), and
– vectorial representation (representation of the coefficients).

Example for GF(2^2):
The elements of this Galois field in component representation are all 2^2 = 4 polynomials of degree < 2 that can be built with coefficients from GF(2): GF(2^2) = {0, 1, α, 1+α}.

In exponential representation, the zero element is represented symbolically as α^−∞: GF(2^2) = {α^−∞, α^0, α^1, α^2}. For the exponential representation, equation (4.13) yields α^k = α^(k mod 3), so the powers repeat:

0   = α^−∞
1   = α^0 = α^3 = α^6 = ...
α   = α^1 = α^4 = ...
1+α = α^2 = α^5 = ...

The representation of the coefficients results in the vectorial representation: GF(2^2) = {00, 10, 01, 11}.
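The exponential representation can be generated from the primitive polynomial by repeatedly multiplying by α and reducing modulo p(α) (an illustrative sketch, not part of the lecture; elements as integer bitmasks, bit i = coefficient of α^i):

```python
def gf2m_elements(p_mask: int, m: int):
    """Successive powers alpha^0, alpha^1, ... modulo the primitive
    polynomial p(x); p_mask is its bitmask (bit i = coefficient of x^i)."""
    elems, a = [], 1                  # a = alpha^0 = 1
    for _ in range(2 ** m - 1):
        elems.append(a)
        a <<= 1                       # multiply by alpha (i.e. by x)
        if a >> m:                    # degree reached m: reduce mod p(x)
            a ^= p_mask
    return elems

# GF(2^2) with p(x) = x^2 + x + 1: bitmasks 1 = "1", 2 = "alpha", 3 = "1+alpha"
print(gf2m_elements(0b111, 2))    # [1, 2, 3]
# GF(2^3) with p(x) = x^3 + x + 1 from the table of primitive polynomials
print(gf2m_elements(0b1011, 3))   # [1, 2, 4, 3, 6, 7, 5]
```

For GF(2^3) the sketch produces all 7 nonzero elements, as eq. (4.11) demands of a primitive polynomial.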
Addition and multiplication tables for GF(2^2):

Component representation:

⊕   | 0    1    α    1+α      ⊗   | 0  1    α    1+α
0   | 0    1    α    1+α      0   | 0  0    0    0
1   | 1    0    1+α  α        1   | 0  1    α    1+α
α   | α    1+α  0    1        α   | 0  α    1+α  1
1+α | 1+α  α    1    0        1+α | 0  1+α  1    α

Vectorial representation:

⊕  | 00 10 01 11      ⊗  | 00 10 01 11
00 | 00 10 01 11      00 | 00 00 00 00
10 | 10 00 11 01      10 | 00 10 01 11
01 | 01 11 00 10      01 | 00 01 11 10
11 | 11 01 10 00      11 | 00 11 10 01

Exponential representation:

⊕   | 0   1   α   α^2      ⊗   | 0  1    α    α^2
0   | 0   1   α   α^2      0   | 0  0    0    0
1   | 1   0   α^2 α        1   | 0  1    α    α^2
α   | α   α^2 0   1        α   | 0  α    α^2  1
α^2 | α^2 α   1   0        α^2 | 0  α^2  1    α
Polynomial residue class:
Conversely, the single elements of a Galois field can also be described as polynomial remainders of a modulo-p(x) calculation, with p(x) being a primitive polynomial.

Example in GF(2^2): Here p(x) = x^2 + x + 1 is a primitive polynomial. The condition

    x^2 + x + 1 = 0

corresponds to a calculation modulo x^2 + x + 1. For any product, e.g. (1 + α) ⋅ (1 + α) = 1 + α^2, division by p(x = α) gives:

    (α^2 + 1) : (α^2 + α + 1) = 1
  − (α^2 + α + 1)
    —————————————
      α = r(α)    (compare the component representation of GF(2^2))

GF(2^2) is given by all possible remainders of arbitrary polynomials f(α) modulo α^2 + α + 1.
page 181
4.1.4 Expansion Fields
Conjugate roots:
Let p(x), with deg p(x) = m and coefficients p_i ∈ GF(p), be irreducible over GF(p), and let α be a root of p(x). Then

    α, α^p, α^(p^2), ..., α^(p^(m−1)),  where  α^(p^m) = α,                (4.14)

are also roots of p(x). These roots are referred to as conjugate.

If two elements a, b ∈ GF(p^m) are conjugate to each other, one also writes a ∼ b.   (4.15)

The characteristics of this relation are:
– for all a ∈ GF(p^m) applies: a ∼ a,                                      (4.16a)
– for all a, b ∈ GF(p^m) applies: a ∼ b ⇒ b ∼ a and                        (4.16b)
– for all a, b, c ∈ GF(p^m) applies: a ∼ b and b ∼ c ⇒ a ∼ c.              (4.16c)

By the conjugate roots, a linear factorisation of p(x) is given:

    p(x) = ∏_{i=0}^{m−1} (x − α^(p^i)).                                    (4.17)
page 182
4.1.4 Expansion Fields
Equivalence classes:
The combination of elements that are conjugate to each other,

    [z_i] = {z_i, z_j, z_k, z_l, ...}  with  z_i ∼ z_j ∼ z_k ∼ z_l ...,    (4.18)

is referred to as an equivalence class; the equivalence classes form disjoint decompositions of GF(p^m).

Cyclotomic fields:
Cyclotomic cosets K_j contain all exponents of the elements that are conjugate to each other:

    K_j = {j ⋅ p^i mod n | i = 0, 1, ..., m−1}  with  n = p^m − 1.         (4.19)

The term cyclotomy comes from the complex numbers, where the n-th root of unity and the one-element of a Galois field behave similarly:

    z^n = e^(j2π) = 1  ⇔  z^(p^m − 1) = z^n = z^0 = 1.                     (4.20)

Therefore, expansion fields are also referred to as cyclotomic fields.
page 183
4.1.4 Expansion Fields
Minimum polynomial:
For every a ∈ GF(p^m), an unambiguously determined normalised polynomial m_[a](x) of minimum degree with coefficients from GF(p) exists, the root of which is a. This polynomial is called the minimum polynomial. m_[a](x) is irreducible.

The minimum polynomial is calculated using equation (4.17):

    m_[a](x) = ∏_{b, b ∼ a} (x − b) = ∏_{i ∈ K_[a]} (x − z^i).             (4.21)

Two conjugate elements have the same minimum polynomial:

    a ∼ b ⇒ m_[a](x) = m_[b](x).                                           (4.22)

The degree of the minimum polynomial equals the power (cardinality) of the equivalence class.
page 184
4.1.4 Expansion Fields
The product of all minimum polynomials of GF(p^m) results in:

    ∏_a m_[a](x) = x^n − 1.                                                (4.23)

Example:
GF(2^4) with primitive polynomial p(x) = x^4 + x + 1 and primitive element z with z^4 + z + 1 = 0. The elements of GF(2^4) are:
GF(2^4) = {0, z^0, z^1, z^2, ..., z^(n−1)} with n = p^m − 1 = 2^4 − 1 = 15.
In the exponent, the calculation is carried out modulo n: z^n = z^0.

 Cyclotomic coset       Minimum polynomial
 K_0 = {0}              m_0(x) = (x − z^0) = x + 1
 K_1 = {1,2,4,8}        m_1(x) = (x − z^1)(x − z^2)(x − z^4)(x − z^8) = x^4 + x + 1
 K_3 = {3,6,9,12}       m_3(x) = (x − z^3)(x − z^6)(x − z^9)(x − z^12) = x^4 + x^3 + x^2 + x + 1
 K_5 = {5,10}           m_5(x) = (x − z^5)(x − z^10) = x^2 + x + 1
 K_7 = {7,11,13,14}     m_7(x) = (x − z^7)(x − z^11)(x − z^13)(x − z^14) = x^4 + x^3 + 1

Here, one of the minimum polynomials corresponds to the primitive polynomial. Check:

    m_0(x) ⋅ m_1(x) ⋅ m_3(x) ⋅ m_5(x) ⋅ m_7(x) = x^15 − 1.
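The cyclotomic cosets and minimum polynomials of this example can be reproduced with a short script. The sketch below (Python; function names are illustrative) builds GF(2^4) as a log/antilog table over x^4 + x + 1 and multiplies out the linear factors of eq. (4.21); for conjugate roots the resulting coefficients all land in GF(2).

```python
# Cyclotomic cosets K_j = {j * p^i mod n} (eq. 4.19) and minimum
# polynomials (eq. 4.21) for GF(2^4), n = 15, p(x) = x^4 + x + 1.

def build_gf16():
    # antilog[i] = z^i as a 4-bit integer (bit b stands for z^b),
    # reduced via z^4 = z + 1
    antilog = [1]
    for _ in range(14):
        v = antilog[-1] << 1
        if v & 0b10000:
            v = (v ^ 0b10000) ^ 0b0011   # replace z^4 by z + 1
        antilog.append(v)
    return antilog

antilog = build_gf16()
log = {v: i for i, v in enumerate(antilog)}

def gf16_mul(a, b):
    return 0 if a == 0 or b == 0 else antilog[(log[a] + log[b]) % 15]

def coset(j, n=15, p=2, m=4):
    return sorted({(j * p**i) % n for i in range(m)})

def min_poly(K):
    # multiply out prod_{i in K} (x - z^i); addition in GF(2^m) is XOR,
    # and -z^i = z^i in characteristic 2
    poly = [1]                               # coefficients, lowest degree first
    for i in K:
        root, new = antilog[i], [0] * (len(poly) + 1)
        for d, c in enumerate(poly):
            new[d] ^= gf16_mul(c, root)      # constant part of (x + z^i)
            new[d + 1] ^= c                  # x part
        poly = new
    return poly

print(coset(7))              # [7, 11, 13, 14]
print(min_poly(coset(1)))    # [1, 1, 0, 0, 1]  ->  x^4 + x + 1
```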
page 185
4.1.5 Discrete Fourier Transformation

The Discrete Fourier Transformation (DFT) describes in general a transformation. For coding theory, it provides a very well arranged display format. Based on a Galois field GF(p^m) with primitive element z, let polynomials of degree ≤ n − 1 = p^m − 2 with coefficients from GF(p^m) be given in the following form:

    a = (a_0, a_1, ..., a_(n−1))  ⇔  a(x) = Σ_{i=0}^{n−1} a_i x^i,         (4.24a)
    A = (A_0, A_1, ..., A_(n−1))  ⇔  A(x) = Σ_{j=0}^{n−1} A_j x^j.         (4.24b)

The transformation then reads:

    A_j = −a(z^(−j)) = −Σ_{i=0}^{n−1} a_i z^(−i⋅j),
    a_i = A(z^i)     =  Σ_{j=0}^{n−1} A_j z^(i⋅j).                         (4.25a/b)
page 186
4.1.5 Discrete Fourier Transform
For the DFT, the following notations exist:a(x) A(x)
a Atime domain frequency domain
The DFT has the following characteristics:– the transformation is one-to-one,– IDFT(DFT(a(x))) = a(x),– DFT(DFT(a(x))) = −a(x),– DFT(DFT(DFT(DFT(a(x))))) = a(x),– z−j is the root of a(x) ⇔ Aj = 0,
),...,,0,,...,,(0)( 11110 −+−− =⇔= njj
j AAAAAza A
),...,,0,,...,,(0)( 11110 −+−=⇔= niii aaaaazA a
– zi is the root of A(x) ⇔ ai = 0.
(4.26a)(4.26b)(4.26c)
(4.26d)
(4.26e)
page 187
4.1.5 Discrete Fourier Transform
The polynomial transformation according to eq. (4.25a/b) can also be described for the coefficient vectors by means of matrices:

a → A (4.27a):

    ( A_0     )       ( 1   1           1           ...  1                )   ( a_0     )
    ( A_1     )       ( 1   z^(−1)      z^(−2)      ...  z^(−(n−1))       )   ( a_1     )
    ( A_2     ) = −   ( 1   z^(−2)      z^(−4)      ...  z^(−2(n−1))      ) ⋅ ( a_2     )
    ( ...     )       ( ...             ...              ...              )   ( ...     )
    ( A_(n−1) )       ( 1   z^(−(n−1))  z^(−2(n−1)) ...  z^(−(n−1)(n−1))  )   ( a_(n−1) )

A → a (4.27b):

    ( a_0     )     ( 1   1          1          ...  1                )   ( A_0     )
    ( a_1     )     ( 1   z          z^2        ...  z^(n−1)          )   ( A_1     )
    ( a_2     )  =  ( 1   z^2        z^4        ...  z^(2(n−1))       ) ⋅ ( A_2     )
    ( ...     )     ( ...            ...             ...              )   ( ...     )
    ( a_(n−1) )     ( 1   z^(n−1)    z^(2(n−1)) ...  z^((n−1)(n−1))   )   ( A_(n−1) )
page 188
4.1.5 Discrete Fourier Transform
Example:
The polynomial A(x) = 4 + 5x with coefficients from GF(7) is to be transformed. z = 5 is chosen as the primitive element, so n = 6 and the powers z^0, ..., z^5 are 1, 5, 4, 6, 2, 3. Application of eq. (4.27b) results in:

    ( a_0 )   ( 1  1  1  1  1  1 )   ( 4 )
    ( a_1 )   ( 1  5  4  6  2  3 )   ( 5 )
    ( a_2 ) = ( 1  4  2  1  4  2 ) ⋅ ( 0 )
    ( a_3 )   ( 1  6  1  6  1  6 )   ( 0 )
    ( a_4 )   ( 1  2  4  1  2  4 )   ( 0 )
    ( a_5 )   ( 1  3  2  6  4  5 )   ( 0 )

Since only A_0 = 4 and A_1 = 5 are different from zero:

    a = 4 ⋅ (1,1,1,1,1,1)^T + 5 ⋅ (1,5,4,6,2,3)^T = (2,1,3,6,0,5)^T.
page 189
4.1.5 Discrete Fourier Transform
The inverse transform of the polynomial a(x) = 2 + x + 3x^2 + 6x^3 + 5x^5 with coefficients from GF(7), calculated on the previous page, is carried out using eq. (4.27a). The primitive element z = 5 is retained; z^(−1) = 3, and the powers z^0, z^(−1), ..., z^(−5) are 1, 3, 2, 6, 4, 5.

    ( A_0 )       ( 1  1  1  1  1  1 )   ( a_0 )
    ( A_1 )       ( 1  3  2  6  4  5 )   ( a_1 )
    ( A_2 ) = −   ( 1  2  4  1  2  4 ) ⋅ ( a_2 )
    ( A_3 )       ( 1  6  1  6  1  6 )   ( a_3 )
    ( A_4 )       ( 1  4  2  1  4  2 )   ( a_4 )
    ( A_5 )       ( 1  5  4  6  2  3 )   ( a_5 )
page 190
4.1.5 Discrete Fourier Transform
Inverse transform (continuation):

    ( A_0 )       ( 1  1  1  1  1  1 )   ( 2 )       ( 3 )   ( 4 )
    ( A_1 )       ( 1  3  2  6  4  5 )   ( 1 )       ( 2 )   ( 5 )
    ( A_2 ) = −   ( 1  2  4  1  2  4 ) ⋅ ( 3 )  = −  ( 0 ) = ( 0 )
    ( A_3 )       ( 1  6  1  6  1  6 )   ( 6 )       ( 0 )   ( 0 )
    ( A_4 )       ( 1  4  2  1  4  2 )   ( 0 )       ( 0 )   ( 0 )
    ( A_5 )       ( 1  5  4  6  2  3 )   ( 5 )       ( 0 )   ( 0 )

    A = (4,5,0,0,0,0)  ●—○  a = (2,1,3,6,0,5).
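The transform pair of eqs. (4.25a/b) can be checked numerically. A minimal sketch in Python for GF(7), z = 5, n = 6, reproducing the example values of these pages:

```python
# DFT over GF(7) as in eqs. (4.25a/b): A_j = -a(z^-j), a_i = A(z^i),
# with primitive element z = 5 and n = 6.

p, n, z = 7, 6, 5
zinv = pow(z, n - 1, p)          # z^-1 = z^(n-1) = 3 in GF(7)

def idft(A):                     # eq. (4.25b): a_i = sum_j A_j z^(i*j)
    return [sum(A[j] * pow(z, i * j, p) for j in range(n)) % p
            for i in range(n)]

def dft(a):                      # eq. (4.25a): A_j = -sum_i a_i z^(-i*j)
    return [(-sum(a[i] * pow(zinv, i * j, p) for i in range(n))) % p
            for j in range(n)]

a = idft([4, 5, 0, 0, 0, 0])
print(a)           # [2, 1, 3, 6, 0, 5]
print(dft(a))      # [4, 5, 0, 0, 0, 0]
```

The round trip confirms characteristic (4.26b): IDFT followed by DFT returns the original vector.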
page 191
4.1.5 Discrete Fourier Transform
A shift register circuit of the DFT according to equation (4.25a) is shown in the following figure:

[Figure: shift register circuit with n cells; cell i is initialised with a_i and multiplied by z^(−i) in every step, so that the memory content after j steps is a_i ⋅ z^(−j⋅i); the cell outputs are summed and multiplied by −1 to give A_j.]

Here the j-th step of the transformation for the calculation of A(x) is mapped. At the beginning (j = 0), the register is initialised with the time domain values a_0, a_1, ..., a_(n−1), so that the following applies:

    A_0 = −(a_0 + a_1 + ... + a_(n−1))   (calculated in GF(p^m)).          (4.28a)
page 192
4.1.5 Discrete Fourier Transform
A shift register circuit of the IDFT according to equation (4.25b) is shown in the following figure:

[Figure: shift register circuit with n cells; cell j is initialised with A_j and multiplied by z^j in every step, so that the memory content after i steps is A_j ⋅ z^(i⋅j); the cell outputs are summed to give a_i.]

Here the i-th step of the transformation for the calculation of a(x) is mapped. At the beginning (i = 0), the register is initialised with the frequency domain values A_0, A_1, ..., A_(n−1), so that the following applies:

    a_0 = A_0 + A_1 + ... + A_(n−1)   (calculated in GF(p^m)).             (4.28b)
page 193
4.1.5 Discrete Fourier Transform
Finally, two theorems of the DFT are to be considered.

Let polynomials a(x), b(x), c(x) of degree ≤ n − 1 = p^m − 2 be given with coefficients from GF(p^m), whose primitive element is referred to as z. For every x ∈ GF(p^m), x^n = x^0 = 1 holds, i.e. the calculation is carried out modulo (x^n − 1). For the product of two polynomials a(x) ⋅ b(x) = c(x) then applies:

    (a_0 + a_1x + ... + a_(n−1)x^(n−1)) ⋅ (b_0 + b_1x + ... + b_(n−1)x^(n−1))
        = c_0 + c_1x + ... + c_(n−1)x^(n−1).                               (4.29)

The product of two polynomials corresponds to the cyclic convolution of the coefficients:

    c_0 = a_0b_0 + a_1b_(n−1) + a_2b_(n−2) + ... + a_(n−1)b_1
    c_1 = a_0b_1 + a_1b_0 + a_2b_(n−1) + ... + a_(n−1)b_2
    ...
    c_i = Σ_{j=0}^{n−1} a_j ⋅ b_((i−j) mod n),

    a ∗ b  ⇔  a(x) ⋅ b(x) mod (x^n − 1).                                   (4.30)
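The correspondence (4.29)/(4.30) between polynomial multiplication modulo x^n − 1 and cyclic convolution is easy to verify numerically; a small sketch over GF(7) with n = 6 (the vectors a and b are arbitrary examples, not taken from the lecture):

```python
# Cyclic convolution (4.29)/(4.30): the coefficients of a(x)*b(x) mod (x^n - 1)
# equal c_i = sum_j a_j b_{(i-j) mod n}, here in GF(7) with n = 6.

p, n = 7, 6

def cyclic_conv(a, b):
    return [sum(a[j] * b[(i - j) % n] for j in range(n)) % p
            for i in range(n)]

def poly_mul_mod(a, b):
    c = [0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[(i + j) % n] = (c[(i + j) % n] + ai * bj) % p   # x^n = 1
    return c

a = [2, 1, 3, 6, 0, 5]
b = [1, 4, 0, 0, 2, 0]
assert cyclic_conv(a, b) == poly_mul_mod(a, b)
```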
page 194
4.1.5 Discrete Fourier Transformation
Convolution Theorem of the DFT:

    a(x) ⋅ b(x) mod (x^n − 1)  ○—●  −A_j ⋅ B_j,
    a_i ⋅ b_i                  ○—●  A(x) ⋅ B(x) mod (x^n − 1),             (4.31a/b)

with a(x) ○—● A(x), b(x) ○—● B(x).

Proof of (4.31a): Using equation (4.3b), the modulo calculation can be written as c(x) = a(x) ⋅ b(x) − γ(x) ⋅ (x^n − 1), and thus

    C_j = −c(z^(−j)) = −a(z^(−j)) ⋅ b(z^(−j)) + γ(z^(−j)) ⋅ (z^(−jn) − 1).

Since z^(−jn) = 1, the second term vanishes; with a(z^(−j)) = −A_j and b(z^(−j)) = −B_j it follows that C_j = −A_j ⋅ B_j.

Shift Theorem of the DFT:

    x ⋅ a(x) mod (x^n − 1)  ○—●  z^(−j) ⋅ A_j,
    z^i ⋅ a_i               ○—●  x ⋅ A(x) mod (x^n − 1),                   (4.32a/b)

with a(x) ○—● A(x). Proof: by inserting b(x) = x and B(x) = x, respectively, in the Convolution Theorem.
page 195
4.2 Reed-Solomon Codes

The Reed-Solomon codes (RS codes) were developed around 1960. They can be constructed analytically; their minimum distance can be specified by construction, and their weight distribution is known. The RS codes fulfil the Singleton bound with equality. They are very efficient and vitally important in practice; among other things, they are used in spacecraft communications, on Compact Discs (CD) and in mobile radio communications.

RS codes are symbol-oriented codes, i.e. they are especially suited to the correction of burst errors. In contrast, they can be inefficient for the correction of single errors, since whole symbols always have to be corrected; a corresponding redundancy is required for this.
page 196
4.2.1 Construction of Reed-Solomon Codes

Fundamental theorem of algebra:
A polynomial A(x) = A_0 + A_1x + A_2x^2 + ... + A_(k−1)x^(k−1) of degree k − 1 with coefficients A_i ∈ GF(p^m) and A_(k−1) ≠ 0 has at most k − 1 different roots.

The proof of this theorem rests on the fact that a polynomial can be represented by a number of linear factors restricted by its degree:

    A(x) = A_(k−1) ⋅ ∏_{i=1}^{k−1} (x − x_i).                              (4.33)

Hamming weight of general vectors:
The Hamming weight w_H(c) of a vector c is defined as the number of elements of c that are not zero. Eq. (2.11), which calculates the weight of a binary vector, is the special case in which the elements of the vector come from GF(2).
page 197
4.2.1 Construction of Reed-Solomon Codes
Theorem for the minimum weight of vectors:
Let A(x) = A_0 + A_1x + A_2x^2 + ... + A_(k−1)x^(k−1) be a polynomial with coefficients A_i ∈ GF(p^m), where its degree is bounded by

    deg A(x) = k − 1 ≤ n − d;

then for the weight of the vector a = (a_0, a_1, a_2, ..., a_(n−1)) with the relation a(x) ○—● A(x) applies:

    w_H(a) ≥ d.                                                            (4.34)

Proof:
Equation (4.26e) says that roots z^i of A(x) correspond to elements a_i = 0. The polynomial A(x) has at most k − 1 different roots, so that the vector a has at least n − (k − 1) = n − k + 1 ≥ d positions different from zero.

According to this theorem, vectors can be constructed whose weight depends only on the polynomial degree of the transformed vector. These vectors have a defined minimum weight and thus a defined minimum distance.
page 198
4.2.1 Construction of Reed-Solomon Codes
Definition of Reed-Solomon codes:
Let z be a primitive element of GF(p^m). Then a = (a_0, a_1, a_2, ..., a_(n−1)) is a code word of the RS code C of length n = p^m − 1, dimension k = p^m − d and minimum distance d = n − k + 1, if the following applies:

    C = {a | a_i = A(z^i), deg A(x) ≤ k − 1 = n − d}.                      (4.35)

The code vector (A_0, A_1, A_2, ..., A_(k−1)) corresponds to the information to be coded. By filling with n − k = d − 1 zeros (also called test symbols or test frequencies), the code word in the frequency domain A results; by IDFT, the code word in the time domain a results:

    A = (A_0, A_1, A_2, ..., A_(k−1), 0, 0, ..., 0)  ●—○  a = (a_0, a_1, a_2, ..., a_(n−1)).   (4.36)

The RS codes satisfy the Singleton bound with equality, since the following applies: n − k = d_min − 1. The best case is an odd d_min, so that t = (d_min − 1)/2 and thus:

    n − k = 2t.                                                            (4.37)

I.e., the number of detectable and correctable errors can be adjusted.
page 199
4.2.1 Construction of Reed-Solomon Codes
Example:
Selection of the Galois field GF(p^m) with p = 2 and m = 4 ⇒ GF(2^4). The code word length is then n = p^m − 1 = 16 − 1 = 15. Let the number of correctable symbols be t = 3 ⇒ n − k = d_min − 1 = 2t = 6. The code word in the frequency domain then reads in general:

    A = (A_0 A_1 A_2 A_3 A_4 A_5 A_6 A_7 A_8 | 0 0 0 0 0 0)
        |—— k = n − (d_min − 1) = 9 ———————|  |— d_min − 1 —|

with A_j ∈ {0, 1, z, z+1, z^2, z^2+1, z^2+z, z^2+z+1, z^3, z^3+1, z^3+z, z^3+z+1, z^3+z^2, z^3+z^2+1, z^3+z^2+z, z^3+z^2+z+1}.

The number of information digits / bits is k = 9 / k ⋅ m = 9 ⋅ 4 = 36; the number of code word digits / bits is n = 15 / n ⋅ m = 15 ⋅ 4 = 60. This code can therefore be referred to as a hexadecimal (15,9)-RS code or as a binary (60,36)-RS code.
page 200
4.2.1 Construction of Reed-Solomon Codes
Example (continuation):
With the correction capability of t = 3 symbols, at most t_b,max = 3 ⋅ 4 = 12 bits and at least t_b,min = 3 ⋅ 1 = 3 bits can be corrected.

Possible error vectors in (partial) binary notation with 0 = (0000) are:
f = (0 0 0 1111 1111 1111 0 0 0 0 0 0 0 0 0) — correctable (3 symbol errors);
f = (0 0 0 0001 1111 1111 1110 0 0 0 0 0 0 0 0) — not correctable (4 symbol errors);
f = (0100 0 0 0 0001 0 0 0100 0 0 0 0 0 0010 0) — not correctable (4 symbol errors, although only 4 bits are wrong).

From the extreme cases it can be derived that in any case 3 single bit errors or 9 consecutive bit errors are correctable. Thus, burst errors are correctable up to a length of:

    t_b,burst = m(t − 1) + 1 = m(d − 3)/2 + 1.                             (4.38)

For the code rate of the RS code applies:

    R_C = k/n = (n − (d − 1))/n = 1 − (d − 1)/n.

Example: For the (15,9)-RS code, R_C = k/n = 9/15 = 60 %.
page 201
4.2.1 Construction of Reed-Solomon Codes
Generator polynomial:
Due to the restriction of the degree of the code word A(x), for which

    A_j = 0 for j = k, ..., n − 1

applies, and the DFT characteristic (4.26d), every code word polynomial a(x) has the roots:

    a(z^(−j)) = 0 for j = k, ..., n − 1.

The product of all linear factors resulting from these roots forms the generator polynomial of the respective RS code:

    g(x) = ∏_{j=k}^{n−1} (x − z^(−j)) = ∏_{j=1}^{n−k} (x − z^j).           (4.39)

Every code word polynomial can then be described as:

    a(x) = u(x) ⋅ g(x).

Since an RS code can be described by a generator polynomial, it is a cyclic code.
page 202
4.2.1 Construction of Reed-Solomon Codes
Check polynomial:
The check polynomial h(x) is the complementary polynomial to the generator polynomial and contains as roots all those elements different from zero which are not roots of g(x):

    h(x) = ∏_{j=0}^{k−1} (x − z^(−j)) = ∏_{j=n−k+1}^{n} (x − z^j).         (4.40)

For the polynomials g(x), u(x) and h(x) the following degree relations apply:
deg(g(x)) = n − k; deg(u(x)) = k − 1; deg(g(x) ⋅ u(x)) = n − 1; deg(h(x)) = k.

Proof:

    a(x) ⋅ h(x) = u(x) ⋅ g(x) ⋅ h(x) = u(x) ⋅ ∏_{i=0}^{n−1} (x − z^i)
                = u(x) ⋅ (x^n − 1) = 0 mod (x^n − 1).
page 203
4.2.1 Construction of Reed-Solomon Codes
Example:
For the (6,2)-RS code over GF(7) using the primitive element z = 5, the generator polynomial g(x) is:

    g(x) = ∏_{j=1}^{4} (x − z^j) = (x − 5)(x − 4) ⋅ (x − 6)(x − 2)
         = (x^2 + 5x + 6) ⋅ (x^2 + 6x + 5)
         = x^4 + 4x^3 + 6x^2 + 5x + 2.

The check polynomial h(x) is then calculated to be:

    h(x) = (x^n − 1)/g(x) = (x^6 − 1)/(x^4 + 4x^3 + 6x^2 + 5x + 2)
         = x^2 + 3x + 3 = (x − 1)(x − 3).
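The products above can be multiplied out mechanically; a sketch (Python) that builds g(x) and h(x) of this (6,2)-RS code from their linear factors over GF(7), with coefficients stored lowest degree first:

```python
# Generator and check polynomial of the (6,2)-RS code over GF(7), z = 5:
# g(x) = prod_{j=1}^{n-k} (x - z^j), h(x) has the remaining nonzero roots.

p, n, k, z = 7, 6, 2, 5

def poly_mul(a, b):
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] = (c[i + j] + ai * bj) % p
    return c

g = [1]                                          # coefficients, lowest degree first
for j in range(1, n - k + 1):
    g = poly_mul(g, [(-pow(z, j, p)) % p, 1])    # factor (x - z^j)

h = [1]
for j in range(n - k + 1, n + 1):
    h = poly_mul(h, [(-pow(z, j, p)) % p, 1])    # remaining roots z^5 and z^6 = 1

print(g)   # [2, 5, 6, 4, 1]  ->  g(x) = 2 + 5x + 6x^2 + 4x^3 + x^4
print(h)   # [3, 3, 1]        ->  h(x) = 3 + 3x + x^2
```

As a check, poly_mul(g, h) gives the coefficients of x^6 − 1, i.e. g(x) ⋅ h(x) = x^6 − 1.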
page 204
4.2.1 Construction of Reed-Solomon Codes
Generator matrix:
The RS code is a cyclic code, whose generator matrix G can be built according to equation 3.18 from the coefficients of the generator polynomial g(x).

Example with g(x) = 2 + 5x + 6x^2 + 4x^3 + x^4 (compare page 203):

    G = ( 2 5 6 4 1 0 )
        ( 0 2 5 6 4 1 )

Systematisation:
By row transformations, every generator matrix can be brought into systematic form. For the above generator matrix, the steps are: 1) substitution of the first row by the sum of the two rows and 2) multiplication of all elements by 4:

    G = ( 2 5 6 4 1 0 )  ⇒  ( 2 0 4 3 5 1 )  ⇒  G_syst = ( 1 0 2 5 6 4 )
        ( 0 2 5 6 4 1 )      ( 0 2 5 6 4 1 )              ( 0 1 6 3 2 4 )
page 205
4.2.1 Construction of Reed-Solomon Codes
Control matrix:
The control matrix H can be determined in three ways:

1. Due to the cyclic characteristic of the RS codes, using eq. 3.25b. Example with h(x) = 3 + 3x + x^2 (see page 203):

    H_1 = ( 1 3 3 0 0 0 )
          ( 0 1 3 3 0 0 )
          ( 0 0 1 3 3 0 )
          ( 0 0 0 1 3 3 )

2. Using the systematic generator matrix: G = [I_k P] ⇒ H_2 = [−P^T I_(n−k)]. Example with the systematic generator matrix G_syst (compare page 204):

    H_2 = ( −2 −6 1 0 0 0 )   ( 5 1 1 0 0 0 )
          ( −5 −3 0 1 0 0 ) = ( 2 4 0 1 0 0 )
          ( −6 −2 0 0 1 0 )   ( 1 5 0 0 1 0 )
          ( −4 −4 0 0 0 1 )   ( 3 3 0 0 0 1 )
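A quick consistency check (Python) that the two check matrices really annihilate a code word of this (6,2)-RS code; a = (2,6,5,0,3,4) is the code word obtained later on page 215, and the matrix rows are those read off above:

```python
# Verify H1 (built from h(x) = 3 + 3x + x^2) and H2 = [-P^T I_4] against
# the code word a = (2,6,5,0,3,4) of the (6,2)-RS code over GF(7).

p = 7
a = [2, 6, 5, 0, 3, 4]

H1 = [[1, 3, 3, 0, 0, 0],
      [0, 1, 3, 3, 0, 0],
      [0, 0, 1, 3, 3, 0],
      [0, 0, 0, 1, 3, 3]]

H2 = [[5, 1, 1, 0, 0, 0],
      [2, 4, 0, 1, 0, 0],
      [1, 5, 0, 0, 1, 0],
      [3, 3, 0, 0, 0, 1]]

def syndrome(H, v):
    # each entry is one check equation row . v in GF(7)
    return [sum(hi * vi for hi, vi in zip(row, v)) % p for row in H]

print(syndrome(H1, a))   # [0, 0, 0, 0]
print(syndrome(H2, a))   # [0, 0, 0, 0]
```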
page 206
4.2.1 Construction of Reed-Solomon Codes
3. Direct evaluation of the check equations:

    A_j = −a(z^(−j)) = −Σ_{i=0}^{n−1} a_i ⋅ z^(−i⋅j) = 0  for j = k, ..., n − 1.   (4.41)

Inserting j = k, ..., n − 1, the sums for all j can be written as a matrix equation H_3 ⋅ a^T = 0:

    ( 1  z^(−k)      z^(−2k)      ...  z^(−(n−1)k)     )   ( a_0     )   ( 0 )
    ( 1  z^(−(k+1))  z^(−2(k+1))  ...  z^(−(n−1)(k+1)) )   ( a_1     )   ( 0 )
    ( ...            ...               ...             ) ⋅ ( ...     ) = ( ...)
    ( 1  z^(−(n−1))  z^(−2(n−1))  ...  z^(−(n−1)(n−1)) )   ( a_(n−1) )   ( 0 )

Example with z = 5 in GF(7) (k = 2, n = 6, z^(−1) = 3):

    H_3 = ( z^0  z^(−2)  z^(−4)   z^(−6)   z^(−8)   z^(−10) )   ( 1 2 4 1 2 4 )
          ( z^0  z^(−3)  z^(−6)   z^(−9)   z^(−12)  z^(−15) ) = ( 1 6 1 6 1 6 )
          ( z^0  z^(−4)  z^(−8)   z^(−12)  z^(−16)  z^(−20) )   ( 1 4 2 1 4 2 )
          ( z^0  z^(−5)  z^(−10)  z^(−15)  z^(−20)  z^(−25) )   ( 1 5 4 6 2 3 )
page 207
4.2.1 Construction of Reed-Solomon Codes
Syndrome calculation:
The received signal r can be described as

    r = a + f   ○—●   R = A + F

and accordingly r(x) = a(x) + f(x) ○—● R(x) = A(x) + F(x).

From this, the following can be derived in the frequency domain:

    A(x) = A_0x^0 + A_1x^1 + ... + A_(k−1)x^(k−1),
    F(x) = F_0x^0 + F_1x^1 + ... + F_(k−1)x^(k−1) + F_kx^k + ... + F_(n−1)x^(n−1),
    R(x) = R_0x^0 + R_1x^1 + ... + R_(k−1)x^(k−1) + F_kx^k + ... + F_(n−1)x^(n−1)

with R_j = A_j + F_j. In the frequency domain, the syndrome is the part of the receive vector located at the test frequencies; its coefficients F_k, ..., F_(n−1) are the syndrome coefficients:

    S(x) = S_0x^0 + ... + S_(n−k−1)x^(n−k−1) = F_kx^0 + ... + F_(n−1)x^(n−k−1).   (4.42)
page 208
4.2.1 Construction of Reed-Solomon Codes
In vectorial notation, the syndrome is

    S = (S_0, S_1, ..., S_(n−k−1)).                                        (4.43a)

The syndrome can also be calculated by multiplication by the check matrix:

    S = r ⋅ H_3^T ⇔ S^T = H_3 ⋅ r^T.                                       (4.43b)

Written out, the following representation results:

    ( S_0       )       ( 1  z^(−k)      ...  z^(−(n−1)k)     )   ( r_0     )
    ( S_1       )       ( 1  z^(−(k+1))  ...  z^(−(n−1)(k+1)) )   ( r_1     )
    ( ...       ) = −   ( ...                 ...             ) ⋅ ( ...     )
    ( S_(n−k−1) )       ( 1  z^(−(n−1))  ...  z^(−(n−1)(n−1)) )   ( r_(n−1) )
page 209
4.2.1 Construction of Reed-Solomon Codes
Generic RS code:
The RS code is a cyclic code. According to eq. 4.32a, cyclic shifts in the time domain have no effect on the test frequencies:

    a(x) = x^k ⋅ b(x) mod (x^n − 1)  ○—●  A_j = z^(−j⋅k) ⋅ B_j,  so B_j = 0 ⇒ A_j = 0.

On the other hand, using equations 4.26e and 4.32b, it is certain that the cyclic shift of a polynomial in the frequency domain does not affect the weight and thus the minimum distance of the appropriate code word in the time domain.

Definition: A generic RS code of length n, dimension k and minimum distance d is defined by:

    C = {a | a_i = A(z^i), A(x) = x^k ⋅ B(x) mod (x^n − 1), deg B(x) ≤ k − 1 = n − d},   (4.44)

with the shift exponent k being any number.
page 210
4.2.1 Construction of Reed-Solomon Codes
The RS codes belong to the MDS codes (compare page 100). The proof is based on combinatorial considerations in terms of essential MDS characteristics of the RS codes.

Weighting function of MDS codes:
The weighting function A(z) for an (n,k)-MDS code with minimum distance d = n − k + 1 over GF(q = p^m) can be calculated as follows:

    A(z) = Σ_{i=0}^{n} A_i z^i                                             (4.45)

with

    A_i = 1                                                            for i = 0,
    A_i = 0                                                            for 1 ≤ i < d,
    A_i = (n over i)(q − 1) Σ_{j=0}^{i−d} (−1)^j (i−1 over j) q^(i−d−j)    for i ≥ d.
page 211
4.2.1 Construction of Reed-Solomon Codes
Example: weighting distribution of the (7,3)-RS code over GF(q = 2^3) with minimum distance d = (n − k) + 1 = 5.

    A_0 = 1,  A_1 = A_2 = A_3 = A_4 = 0,

    A_5 = (7 over 5) ⋅ 7 ⋅ 1 = (7!/(5! ⋅ 2!)) ⋅ 7 = 21 ⋅ 7 = 147,

    A_6 = (7 over 6) ⋅ 7 ⋅ ((5 over 0) ⋅ 8 − (5 over 1)) = 7 ⋅ 7 ⋅ (8 − 5) = 147,

    A_7 = (7 over 7) ⋅ 7 ⋅ ((6 over 0) ⋅ 64 − (6 over 1) ⋅ 8 + (6 over 2)) = 7 ⋅ (64 − 48 + 15) = 217.

With this, the weighting function is:

    A(z) = 1 + 147z^5 + 147z^6 + 217z^7.

Check: Σ_{i=0}^{n} A_i = 1 + 147 + 147 + 217 = 512 = 8^3 = q^3.
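Equation (4.45) can also be evaluated directly; a small sketch (Python) reproducing the distribution above together with the check Σ A_i = q^k:

```python
# Weight distribution of an (n,k)-MDS code over GF(q), eq. (4.45):
# A_i = C(n,i)(q-1) sum_{j=0}^{i-d} (-1)^j C(i-1,j) q^(i-d-j) for i >= d.
from math import comb

def mds_weights(n, k, q):
    d = n - k + 1
    A = [0] * (n + 1)
    A[0] = 1                       # only the all-zero word has weight 0
    for i in range(d, n + 1):
        A[i] = comb(n, i) * (q - 1) * sum(
            (-1) ** j * comb(i - 1, j) * q ** (i - d - j)
            for j in range(i - d + 1))
    return A

A = mds_weights(7, 3, 8)
print(A)                    # [1, 0, 0, 0, 0, 147, 147, 217]
assert sum(A) == 8 ** 3     # total number of code words is q^k
```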
page 212
4.2.2 Coding of Reed-Solomon Codes
The information u = (u_0, u_1, u_2, ..., u_(k−1)) and the information polynomial u(x) = u_0x^0 + u_1x^1 + ... + u_(k−1)x^(k−1), respectively, is to be coded. In the following, three possible coding methods are identified:

1. Coding in the frequency domain, i.e. inverse transform of u by means of the IDFT (eq. 4.25b):

    A = (u_0, u_1, u_2, ..., u_(k−1), 0, 0, ..., 0)  ●—○  a = (a_0, a_1, a_2, ..., a_(n−1)).   (4.46a)

2. Non-systematic coding in the time domain, i.e. multiplication of u(x) by the generator polynomial:

    a(x) = u(x) ⋅ g(x).                                                    (4.46b)

3. Systematic coding in the time domain, i.e. multiplication of u(x) by x^(n−k): u(x) ⋅ x^(n−k) = u_0x^(n−k) + u_1x^(n−k+1) + ... + u_(k−1)x^(n−1), division of u(x) ⋅ x^(n−k) by g(x): u(x) ⋅ x^(n−k) = q(x) ⋅ g(x) + r(x) with r(x) = r_0 + r_1x + ... + r_(n−k−1)x^(n−k−1), and solving for the code word: −r(x) + u(x) ⋅ x^(n−k) = q(x) ⋅ g(x) = a(x):

    a = (−r_0, −r_1, −r_2, ..., −r_(n−k−1), u_0, u_1, u_2, ..., u_(k−1)).  (4.46c)
page 213
4.2.2 Coding of Reed-Solomon Codes
Block diagram for coding in the frequency domain (method 1):

[Figure: the message word u_0, u_1, ..., u_(k−1) is taken over as A_0, A_1, ..., A_(k−1), the test frequencies A_k, ..., A_(n−1) are set to 0, and the IDFT

    a_i = Σ_{j=0}^{n−1} A_j ⋅ z^(i⋅j)

produces the code word c_0, c_1, ..., c_(n−1).]
page 214
4.2.2 Coding of Reed-Solomon Codes
Example of coding in the frequency domain (method 1):
A (6,2)-RS code over GF(7) with the primitive element z = 5 is considered. The information u = (3,3) is to be coded.

n = 6, k = 2 and A(x) = 3 + 3x. From a_i = A(x = z^i) (eq. 4.25b) follows:
a_0 = A(z^0 = 1) = A_0 + A_1z^0 = 3 + 3 ⋅ 1 = 6,
a_1 = A(z^1 = 5) = A_0 + A_1z^1 = 3 + 3 ⋅ 5 = 4,
a_2 = A(z^2 = 4) = A_0 + A_1z^2 = 3 + 3 ⋅ 4 = 1,
a_3 = A(z^3 = 6) = A_0 + A_1z^3 = 3 + 3 ⋅ 6 = 0,
a_4 = A(z^4 = 2) = A_0 + A_1z^4 = 3 + 3 ⋅ 2 = 2,
a_5 = A(z^5 = 3) = A_0 + A_1z^5 = 3 + 3 ⋅ 3 = 5.

Result: A = (3,3,0,0,0,0) ●—○ a = (6,4,1,0,2,5).

With B(x) = 1 + 1x, a second code word b results analogously:
B = (1,1,0,0,0,0) ●—○ b = (2,6,5,0,3,4).

The Hamming distance is: d_H(a,b) = d_H((6,4,1,0,2,5), (2,6,5,0,3,4)) = 5.
For comparison: d_min = n − k + 1 = 6 − 2 + 1 = 5.
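Method 1 can be sketched in a few lines (Python): pad the information with n − k zeros and evaluate a_i = A(z^i), reproducing both code words of this example.

```python
# Method 1: coding in the frequency domain for the (6,2)-RS code over
# GF(7), z = 5 -- pad the information with n-k zeros and apply the IDFT.

p, n, k, z = 7, 6, 2, 5

def encode_freq(u):
    A = list(u) + [0] * (n - k)                 # test frequencies set to 0
    return [sum(A[j] * pow(z, i * j, p) for j in range(n)) % p
            for i in range(n)]                  # a_i = A(z^i), eq. (4.25b)

a = encode_freq([3, 3])
b = encode_freq([1, 1])
print(a)   # [6, 4, 1, 0, 2, 5]
print(b)   # [2, 6, 5, 0, 3, 4]
print(sum(ai != bi for ai, bi in zip(a, b)))   # Hamming distance 5 = d_min
```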
page 215
4.2.2 Coding of Reed-Solomon Codes
Example of systematic coding in the time domain (method 3):
A (6,2)-RS code over GF(7) with the primitive element z = 5 is considered. The information u = (3,4) is to be coded systematically.

u = (3,4) ⇒ u(x) = 3 + 4x ⇒ u(x) ⋅ x^(n−k) = 3x^4 + 4x^5.

Division of u(x) ⋅ x^(n−k) by g(x) = 2 + 5x + 6x^2 + 4x^3 + x^4 (compare page 203):

    (4x^5 + 3x^4) ÷ (x^4 + 4x^3 + 6x^2 + 5x + 2) = 4x + 1 + r(x)/g(x)
  − (4x^5 + 2x^4 + 3x^3 + 6x^2 + 1x)
    ————————————————————————————————
      x^4 + 4x^3 + x^2 + 6x
  − (x^4 + 4x^3 + 6x^2 + 5x + 2)
    ————————————————————————————————
      2x^2 + 1x + 5 = r(x)

    −r(x) + u(x) ⋅ x^(n−k) = 2 + 6x + 5x^2 + 3x^4 + 4x^5 ⇒ a = (2, 6, 5, 0, 3, 4),
                                                               |— −r —|  |— u —|

An alternative is given by coding with a systematic generator matrix:

    b = u ⋅ G_syst = (3,4) ⋅ ( 1 0 2 5 6 4 ) = (3,4,2,6,5,0).
                             ( 0 1 6 3 2 4 )
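Method 3 as a small sketch (Python): long division of u(x) ⋅ x^(n−k) by the monic g(x), with coefficients stored lowest degree first (the helper name encode_systematic is illustrative, not from the lecture).

```python
# Method 3: systematic coding in the time domain -- divide u(x)*x^(n-k)
# by g(x) and prepend the negated remainder ((6,2)-RS code over GF(7)).

p, n, k = 7, 6, 2
g = [2, 5, 6, 4, 1]                  # g(x) = 2 + 5x + 6x^2 + 4x^3 + x^4

def encode_systematic(u):
    # dividend u(x) * x^(n-k), coefficients lowest degree first
    rem = [0] * (n - k) + list(u)
    for i in range(len(u) - 1, -1, -1):        # long division (g is monic)
        coef = rem[i + n - k]                  # current leading coefficient
        for d, gd in enumerate(g):
            rem[i + d] = (rem[i + d] - coef * gd) % p
    r = rem[:n - k]                            # remainder r(x)
    return [(-ri) % p for ri in r] + list(u)   # code word (-r | u)

print(encode_systematic([3, 4]))   # [2, 6, 5, 0, 3, 4]
```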
page 216
4.2.2 Coding of Reed-Solomon Codes
Example of systematic coding in the time domain (method 3):
A (7,3)-RS code over GF(2^3) is considered. The primitive polynomial is p(x) = 1 + x + x^3. The code word length is n = p^m − 1 = 8 − 1 = 7. The number of correctable symbols is t = 2, because n − k = 4 with k = 3. With p(x), the following relation applies to the primitive element: 1 + z + z^3 = 0. The single elements of this expansion field in exponential and component representation are:

    0   = 0
    z^0 = 1
    z^1 = z
    z^2 = z^2
    z^3 = 1 + z
    z^4 = z + z^2
    z^5 = 1 + z + z^2
    z^6 = 1 + z^2

The generator polynomial is calculated according to eq. (4.39):

    g(x) = (x − z)(x − z^2)(x − z^3)(x − z^4)
         = (x^2 + xz^4 + z^3)(x^2 + xz^6 + 1)
         = x^4 + x^3z^3 + x^2 + xz + z^3.

The check polynomial is calculated according to eq. (4.40):

    h(x) = (x − z^5)(x − z^6)(x − z^7)
         = x^3 + x^2z^3 + xz^2 + z^4.
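The exponential-to-component table can be generated by repeated multiplication by z with the reduction z^3 → 1 + z; a minimal sketch (Python, elements encoded as 3-bit integers where bit i stands for z^i):

```python
# Exponential <-> component representation of GF(2^3) with primitive
# polynomial p(x) = 1 + x + x^3, i.e. z^3 = 1 + z.

def gf8_table():
    table, v = {}, 1                # start with z^0 = 1
    for i in range(7):
        table[i] = v
        v <<= 1                     # multiply by z
        if v & 0b1000:
            v = (v ^ 0b1000) ^ 0b011   # reduce z^3 -> 1 + z
    return table

print(gf8_table())
# {0: 1, 1: 2, 2: 4, 3: 3, 4: 6, 5: 7, 6: 5}
# e.g. z^3 = 0b011 = 1 + z, z^6 = 0b101 = 1 + z^2, as in the table above
```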
page 217
4.2.2 Coding of Reed-Solomon Codes
Example (continuation):
The respective generator matrix reads:

    G = ( z^3  z    1    z^3  1    0    0 )
        ( 0    z^3  z    1    z^3  1    0 )
        ( 0    0    z^3  z    1    z^3  1 )

The information u = (1, z, z^2) is to be coded systematically.
u = (1, z, z^2) ⇒ u(x) = 1 + xz + x^2z^2 ⇒ u(x) ⋅ x^(n−k) = x^4 + x^5z + x^6z^2.

Division of u(x) ⋅ x^(n−k) by g(x) = x^4 + x^3z^3 + x^2 + xz + z^3 (subtraction equals addition in GF(2^3)):

    (x^6z^2 + x^5z + x^4) ÷ (x^4 + x^3z^3 + x^2 + xz + z^3) = x^2z^2 + xz^6 + 1 + r(x)/g(x)
  + (x^6z^2 + x^5z^5 + x^4z^2 + x^3z^3 + x^2z^5)
    ————————————————————————————————————————
      x^5z^6 + x^4z^6 + x^3z^3 + x^2z^5
  + (x^5z^6 + x^4z^2 + x^3z^6 + x^2 + xz^2)
    ————————————————————————————————————————
      x^4 + x^3z^4 + x^2z^4 + xz^2
  + (x^4 + x^3z^3 + x^2 + xz + z^3)
    ————————————————————————————————————————
      x^3z^6 + x^2z^5 + xz^4 + z^3 = r(x)

−r(x) + u(x) ⋅ x^(n−k) = z^3 + xz^4 + x^2z^5 + x^3z^6 + x^4 + x^5z + x^6z^2 ⇒ a = (z^3, z^4, z^5, z^6, 1, z, z^2).
page 218
4.2.3 Decoding of Reed-Solomon Codes
The decoding of RS codes is carried out by means of special algorithms; the use of a syndrome table is not possible due to the enormous memory that would be required. A receive vector r is assumed which has resulted from superimposing an error vector f on the originally sent code word a: r = a + f. Let the RS code be able to correct t errors.

The approach of algebraic decoding can then be divided into the following steps:
1. Calculation of the syndrome S(x) from r.
2. Calculation of the fault locations from S(x) by means of the so-called code equation.
3. Calculation of the error values of f, as RS codes are not binary codes.
4. Error correction using the relation a = r − f.
page 219
4.2.3 Decoding of Reed-Solomon Codes
Calculation of the syndrome:
By definition, a code word A has 2t consecutive coefficients in the frequency domain that are equal to zero. These positions of the vector R are referred to as the syndrome vector or simply the syndrome S:

    r = a + f  ○—●  R = A + F = (A_0+F_0, A_1+F_1, ..., A_(k−1)+F_(k−1), F_(n−2t), ..., F_(n−1)).   (4.47)

Calculation of the fault locations:
For this, the so-called code equation is derived. The code equation establishes a relationship between the syndrome S(x) and the fault location polynomial c(x) ○—● C(x), the zeros of which mark the fault locations. c(x) is defined such that c_i = 0 for f_i ≠ 0, so that the following applies:

    c_i ⋅ f_i = 0 for i = 0, 1, ..., n − 1.                                (4.48)

Thus, every fault location produces a zero / a linear factor in C(x):

    C(x) = ∏_{i, f_i ≠ 0} (x − z^i).                                       (4.49)
page 220
4.2.3 Decoding of Reed-Solomon Codes
According to equations 4.31b and 4.26e, for the DFT of the product c_i ⋅ f_i applies:

    c_i ⋅ f_i = 0  ○—●  C(x) ⋅ F(x) = 0 mod (x^n − 1).                     (4.50)

The polynomial C(x) ⋅ F(x) vanishes identically. The degree of C(x) is equal to the number of fault locations e. This applies under the assumption:

    e ≤ t = (n − k)/2.                                                     (4.51)

With eq. (4.49), for the fault location polynomial C(x) applies:

    C(x) = C_0 + C_1x + ... + C_ex^e.

Thus, C(x) has exactly e + 1 coefficients; however, only the e zeros of the polynomial are of interest. Therefore, one coefficient C_i is arbitrary and can be normalised to 1. In eq. (4.49), C_e = 1 is chosen. If instead C_0 = 1 is chosen, C(x) is defined by:

    C(x) = ∏_{i, f_i ≠ 0} (1 − x ⋅ z^(−i)) = 1 + C_1x + ... + C_ex^e.      (4.52)
page 221
4.2.3 Decoding of Reed-Solomon Codes
Before deriving the code equation, the relations between the polynomials considered so far are summarised first. These are the sent code word a(x), the error polynomial f(x), the receive word r(x), the fault location polynomial c(x), and the appropriate representations in the frequency domain (x marks a coefficient that may be nonzero):

    a(x)  xxxxxxxxxxxxxxxxxxxx   ○—●   A(x)  xxxxxxxxxxxxxx000000
    f(x)  00000x000x000000x000   ○—●   F(x)  xxxxxxxxxxxxxxxxxxxx
    r(x)  xxxxxxxxxxxxxxxxxxxx   ○—●   R(x)  xxxxxxxxxxxxxxxxxxxx   (S(x) at the test frequencies)
    c(x)  xxxxx0xxx0xxxxxx0xxx   ○—●   C(x)  xxxx0000000000000000

    c_i ⋅ f_i = 0   ○—●   C(x) ⋅ F(x) = 0 mod (x^n − 1)
page 222
In order to determine the coefficients Ci by setting up and subsequently solving the code equation, it is assumed at first that e = t. Eq. (4.50) in the form C(x) ⋅ F(x) = 0 mod (x^n − 1) is written as a system of equations, where the coefficients are arranged according to index sums modulo n in ascending order, that is, according to the powers x^0 to x^(n−1) (x^n = x^0):

Σ_{i=0}^{t} Ci ⋅ F(j−i) mod n = 0  for j = 0, 1, ..., n − 1,    (4.53)

written out:

C0F0    + C1Fn−1   + C2Fn−2   + ... + CtFn−t    = 0    (power x^0)
C0F1    + C1F0     + C2Fn−1   + ... + CtFn−t+1  = 0    (power x^1)
  ⋮
C0Fn−t  + C1Fn−t−1 + ...            + CtFn−2t   = 0
  ⋮
C0Fn−1  + C1Fn−2   + ...            + CtFn−t−1  = 0

The last t equations (j = n − t, ..., n − 1) form the framed part; they contain only the known coefficients Fn−2t, ..., Fn−1.
page 223
The framed part of eq. (4.53) consists of t equations with t unknowns. The coefficients of F occurring there are known and correspond to the syndrome coefficients of S(x):

S0 = Fn−2t, S1 = Fn−2t+1, ..., S2t−1 = Fn−1.

Inserting this in the framed part of eq. (4.53), the following applies:

C0St     + C1St−1   + ... + CtS0    = 0
C0St+1   + C1St     + ... + CtS1    = 0
  ⋮
C0S2t−1  + C1S2t−2  + ... + CtSt−1  = 0    (4.54)

The linear equation system is uniquely solvable. The relation obtained between the wanted coefficients Ci and the known coefficients Si corresponds to a cyclic convolution and, using the normalisation C0 = 1, forms the code equation (also called Newton identity):

Sj + Σ_{i=1}^{t} Ci ⋅ Sj−i = 0  for j = t, t+1, ..., 2t − 1    (4.55)
page 224
In general, the number of errors is unknown, a small number of errors being more likely than a large number. For e < t, the wanted coefficients are C1, C2, ..., Ce. The number of unknowns is e, whereas the number of equations is t. Hence the equation system is over-determined, and different trial solutions for C(x) are required. For this, the non-normalised code equation with a variable number of errors is considered:

Σ_{i=0}^{e} Ci ⋅ Sj−i = 0  for j = e, ..., 2t − 1 and e = 1, 2, ..., t    (4.56)

In matrix notation, the following representation applies (2t − e rows, for e = 1, 2, ..., t):

( Se      Se−1    ...  S0       )   ( C0 )   ( 0 )
( Se+1    Se      ...  S1       ) ⋅ ( C1 ) = ( ⋮ )    (4.57)
(  ⋮        ⋮            ⋮       )   (  ⋮ )   ( 0 )
( S2t−1   S2t−2   ...  S2t−1−e  )   ( Ce )
page 225
The search for the coefficients of the fault location polynomial starts with the most likely error event, e = 1. If no solution is found for this number of errors, e is increased step by step until it reaches the limit t. If no fault location polynomial has been found even then, a decoding failure is declared.
Using two well-known algorithms, the solution of the code equation can be calculated with low computational effort. These are:
– the Berlekamp-Massey algorithm and
– Euclid's division algorithm.
Once a solution for the fault location polynomial has been found, the fault locations are determined by simply testing all candidate zeros (Chien search): C(x = z^i) for 0 ≤ i ≤ n − 1. If for any j from this range

C(x = z^j) = 0    (4.58)

applies, then j is a fault location.
page 226
Calculation of the error values:
Eq. (4.50) corresponds to a cyclic convolution of the coefficients, which can be written as

Σ_{i=0}^{e} Ci ⋅ F(j−i) mod n = 0  for j = 0, 1, ..., n − 1 (index mod n)    (4.59a)

or

C0 ⋅ Fj + Σ_{i=1}^{e} Ci ⋅ F(j−i) mod n = 0  for j = 0, 1, ..., n − 1 (index mod n)    (4.59b)

using C(x) = C0 + C1x + ... + Ce x^e
and F(x) = F0 + F1x + ... + Fk−1 x^(k−1) + Fk x^k + Fk+1 x^(k+1) + ... + Fn−1 x^(n−1),
where F0, ..., Fk−1 form the unknown part and Fk, ..., Fn−1 the known part (S).

Solved for Fj, the error values can be calculated recursively:

Fj = −C0^(−1) ⋅ Σ_{i=1}^{e} Ci ⋅ F(j−i) mod n  for j = 0, 1, ..., n − 1 (index mod n)    (4.60)
page 227
Example:
The (6,2) RS code over GF(7) with the primitive element z = 5 is considered. The parameters are: n = 6, k = 2, d = dH,min = 5 and t = 2. The receive vector r = (1,2,3,1,1,1) is to be decoded.

1. Calculation of the syndrome (eq. 4.47):
r = (r0,r1,r2,r3,r4,r5) = (1,2,3,1,1,1)
R = (R0,R1,R2,R3,R4,R5) = (R0,R1,F2,F3,F4,F5) = (5,0,4,6,6,1)
⇒ S = (F2,F3,F4,F5) = (S0,S1,S2,S3) = (4,6,6,1).

2. Calculation of the number of errors using eq. (4.57):
Assumption of the most likely number of errors: e = 1 ⇒ C(x) = C0 + C1x.
Normalisation (arbitrary): C1 = 1 ⇒ with (S1 S0; S2 S1; S3 S2) = (6 4; 6 6; 1 6):

( 6  4 )            ( 0 )
( 6  6 ) ⋅ ( C0 ) = ( 0 )
( 1  6 )   (  1 )   ( 0 )
page 228
The single equations have the following solutions:
6 C0 + 4 = 0 ⇒ C0 = 4,
6 C0 + 6 = 0 ⇒ C0 = 6 and
1 C0 + 6 = 0 ⇒ C0 = 1.
This is a contradiction, thus e has to be increased:
e = 2 ⇒ C(x) = C0 + C1x + C2x^2.
Normalisation: C2 = 1 ⇒ with (S2 S1 S0; S3 S2 S1) = (6 6 4; 1 6 6):

                ( C0 )
( 6  6  4 )   ⋅ ( C1 ) = ( 0 )
( 1  6  6 )     (  1 )   ( 0 )

Solving the equation system leads to:
I:    6 C0 + 6 C1 + 4 = 0
II:   1 C0 + 6 C1 + 6 = 0
I−II: 5 C0 − 2 = 0 ⇒ C0 = 2 ⋅ 5^(−1) = 6
II:   6 C1 + 5 = 0 ⇒ C1 = −5 ⋅ 6^(−1) = 5
Thus, the fault location polynomial reads: C(x) = 6 + 5x + x^2.
page 229
The search for the roots (Chien search) results in:
C(x = z^0 = 1) = 6 + 5 + 1 = 5,
C(x = z^1 = 5) = 6 + 4 + 4 = 0 ⇒ root at z^1,
C(x = z^2 = 4) = 6 + 6 + 2 = 0 ⇒ root at z^2,
C(x = z^3 = 6) = 6 + 2 + 1 = 2,
C(x = z^4 = 2) = 6 + 3 + 4 = 6 and
C(x = z^5 = 3) = 6 + 1 + 2 = 2.

Errors were detected at the 1st and the 2nd position of the receive vector (indexing beginning with 0!).
Check: C(x) = (x − z^1) ⋅ (x − z^2) = (x − 5)(x − 4) = 6 + 5x + x^2.

3. Calculation of the error values (eq. 4.60):
With R = (R0,R1,F2,F3,F4,F5) = (5,0,4,6,6,1) and C(x) = 6 + 5x + x^2, the following results:
F0 = −C0^(−1) ⋅ (C1 ⋅ F−1 + C2 ⋅ F−2)
   = −C0^(−1) ⋅ (C1 ⋅ F5 + C2 ⋅ F4)
   = −6^(−1) ⋅ (5 ⋅ 1 + 1 ⋅ 6) = 1 ⋅ (5 + 6) = 4
page 230
and
F1 = −C0^(−1) ⋅ (C1 ⋅ F0 + C2 ⋅ F−1)
   = −C0^(−1) ⋅ (C1 ⋅ F0 + C2 ⋅ F5)
   = −6^(−1) ⋅ (5 ⋅ 4 + 1 ⋅ 1) = 1 ⋅ (6 + 1) = 0.

With this, the complete error vector in the frequency domain reads:
F = (F0,F1,F2,F3,F4,F5) = (4,0,4,6,6,1).
The inverse transform into the time domain results in:
f = (f0,f1,f2,f3,f4,f5) = (0,1,2,0,0,0).
Finally, the decoding result reads:
â = r − f = (1,2,3,1,1,1) − (0,1,2,0,0,0) = (1,1,1,1,1,1).

This step provides a good check, since in f only those positions may be non-zero that were determined by the search for the fault locations.
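The whole example can be re-checked with a short script. This is a minimal sketch, not the algorithm recommended above: all function names are illustrative, the small system (4.57) is solved by brute force instead of Berlekamp-Massey or Euclid, and the transform pair Ri = n^(−1) ⋅ Σ_j rj z^(−ij), rj = R(x = z^j) is an assumption chosen because it reproduces the spectrum R = (5,0,4,6,6,1) given on page 227.

```python
from itertools import product

# Sketch of the (6,2) RS decoding example over GF(7), z = 5 (pp. 227-231).
n, k, p, z = 6, 2, 7, 5
t = (n - k) // 2
zinv, ninv = pow(z, p - 2, p), pow(n, p - 2, p)   # inverses in GF(7)

def dft(v):   # V_i = n^-1 * sum_j v_j z^(-ij); reproduces R on p. 227
    return [ninv * sum(vj * pow(zinv, i * j, p) for j, vj in enumerate(v)) % p
            for i in range(n)]

def idft(V):  # v_j = V(x = z^j)
    return [sum(Vi * pow(z, i * j, p) for i, Vi in enumerate(V)) % p
            for j in range(n)]

r = [1, 2, 3, 1, 1, 1]
R = dft(r)
S = R[n - 2 * t:]                 # syndrome = last 2t spectral positions

# Fault location polynomial: try e = 1, ..., t against eq. (4.57),
# normalising C_e = 1, by brute force over the small field.
C = None
for e in range(1, t + 1):
    for tail in product(range(p), repeat=e):
        cand = list(tail) + [1]   # (C_0, ..., C_e) with C_e = 1
        if all(sum(cand[i] * S[j - i] for i in range(e + 1)) % p == 0
               for j in range(e, 2 * t)):
            C = cand
            break
    if C:
        break

# Chien search (4.58): fault locations are the i with C(z^i) = 0.
def Cval(x):
    return sum(c * pow(x, i, p) for i, c in enumerate(C)) % p
err_pos = [i for i in range(n) if Cval(pow(z, i, p)) == 0]

# Recursive error values, eq. (4.60).
F = [None] * k + list(S)
c0inv = pow(C[0], p - 2, p)
for j in range(k):
    F[j] = (-c0inv * sum(C[i] * F[(j - i) % n]
                         for i in range(1, len(C)))) % p

f = idft(F)
a_hat = [(rj - fj) % p for rj, fj in zip(r, f)]
# a_hat == [1, 1, 1, 1, 1, 1], as on page 231
```

The brute-force search over GF(7)^e is only feasible because the example is tiny; for realistic parameters the Berlekamp-Massey or Euclid algorithm mentioned above is used instead.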
page 231
4.3 Bose-Chaudhuri-Hocquenghem Codes
Like the RS codes, Bose-Chaudhuri-Hocquenghem codes (BCH codes) belong to the class of cyclic codes and represent an important extension of the Hamming codes. They are especially suitable for the correction of multiple, statistically independent (single) errors. A distinction is made between binary and non-binary BCH codes: the binary BCH codes can be considered a special case of the RS codes, whereas the RS codes in turn can be described as a special case of the non-binary BCH codes. Only the binary BCH codes, however, will be considered here.
A binary BCH code of code word length n = 2^m − 1 is characterised by the property that every code word a in the time domain consists of components ai ∈ GF(2), whereas the code word in the frequency domain consists of components Ai ∈ GF(2^m).
In analogy to the RS code, a binary BCH code can be defined in the frequency domain by vanishing adjacent check positions or, in the time domain, by a generator polynomial.
page 232
For a binary BCH code with ai ∈ GF(2), the DFT transform equations 4.25a/b simplify to

Ai = a(x = z^−i)  and  aj = A(x = z^j),    (4.61a/b)

although Ai ∈ GF(2^m). If Ai = 0 applies, the polynomial a(x) has the root z^−i in the time domain; that is, a(x) contains the linear factor (x − z^−i):

Ai = 0 = a(x = z^−i).    (4.62)

4.3.1 Derivation of BCH Codes in the Frequency Domain

A polynomial f(x) over GF(2) has the following important property:

[f(x)]^2 = f(x^2)    (4.63)

Proof via the binomial formula:
(a + b)^2 = a^2 + 2ab + b^2 = a^2 + b^2, since 2ab = 0 mod 2.
a and b can themselves again be sums, whose components are sums as well, etc. The result is an arbitrarily long sum with the property to be proven.
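The property (4.63) can be checked numerically over GF(2). The following sketch represents polynomials as Python integer bitmasks (bit i holding the coefficient of x^i, an ad-hoc encoding not used in the script itself):

```python
# Check [f(x)]^2 = f(x^2) over GF(2) for all polynomials of degree < 8.
# Polynomials are bitmasks: bit i <-> coefficient of x^i.

def clmul(a, b):               # carry-less product = multiplication in GF(2)[x]
    acc = 0
    while b:
        if b & 1:
            acc ^= a
        a, b = a << 1, b >> 1
    return acc

def spread(f):                 # f(x) -> f(x^2): move bit i to bit 2i
    acc, i = 0, 0
    while f:
        if f & 1:
            acc |= 1 << (2 * i)
        f, i = f >> 1, i + 1
    return acc

for f in range(256):
    assert clmul(f, f) == spread(f)   # squaring only spreads the coefficients
```

The loop passes because all the mixed terms of the square carry the factor 2 and vanish modulo 2, exactly as in the proof above.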
page 233
With eq. (4.63) applies:
A2i = a(x = z^−2i) = a(x = (z^−i)^2) = [a(x = z^−i)]^2 = (Ai)^2
A4i = a(x = z^−4i) = a(x = (z^−2i)^2) = [a(x = z^−2i)]^2 = (A2i)^2 = (Ai)^4
A8i = a(x = z^−8i) = a(x = (z^−4i)^2) = [a(x = z^−4i)]^2 = (A4i)^2 = (Ai)^8 etc.
Thus, the general relation ensuring that the condition ai ∈ GF(2) is fulfilled reads:

A(2^j ⋅ i) mod n = (Ai)^(2^j)    (4.64)

With this, the definition reads as follows:
The binary (primitive) BCH code C with the code word length n = 2^m − 1 and the designed minimum distance d is defined by

C = {a | ai = A(x = z^i), grad A(x) ≤ n − d, A(2^j ⋅ i) = (Ai)^(2^j), Ai ∈ GF(2^m)}    (4.65)

with the primitive element z ∈ GF(2^m).
page 234
Example:
A BCH code over GF(2^3) with n = 2^3 − 1 = 7 and t = 1, that is d = 3, is considered. For the components of the code word vector
A = (A0,A1,A2,A3,A4,A5,A6) = (A0,A1,A2,A3,A4,0,0) applies:
for A0: A(2^j ⋅ 0) = A0 = (A0)^(2^j) ⇒ A0 ∈ GF(2)
for A1: A(2^j ⋅ 1) = A(2^j) = (A1)^(2^j) ⇒ A2 = (A1)^2, A4 = (A1)^4, A(8 mod 7) = A1 = (A1)^8 ⇒ A1 ∈ GF(2^3)
for A3: A(2^j ⋅ 3) = (A3)^(2^j) ⇒ A6 = (A3)^2, A(12 mod 7) = A5 = (A3)^4, A(24 mod 7) = A3 = (A3)^8 ⇒ A3 ∈ GF(2^3)
Because A5 = A6 = 0, however, also A3 = 0. That means, although 5 digits are available in the frequency domain, only two symbols, i.e. 4 bits (1 + 3) of information per code word, can be transferred.

Coding example:
A0 = 1, A1 = z ⇒ A = (1, z, z^2, 0, z^4, 0, 0), A(x) = 1 + z x + z^2 x^2 + z^4 x^4
a0 = A(z^0) = 1 + z + z^2 + z^4 = 1 + z + z^2 + z^2 + z = 1
a1 = A(z^1) = 1 + z⋅z + z^2⋅z^2 + z^4⋅z^4 = 1 + z^2 + z^4 + z^8 = 1 + z^2 + z^2 + z + z = 1 ...
⇒ a = (1,1,0,1,0,0,0) (comp. the respective RS coding on p. 214f.)
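The coding example can be re-checked by machine. The sketch below assumes GF(2^3) is built with the primitive polynomial z^3 = z + 1 (consistent with the reduction z^4 = z^2 + z used in the calculation above); field elements are bitmasks (1 ↔ 1, 2 ↔ z, 4 ↔ z^2):

```python
# Re-check of the BCH coding example over GF(2^3), assuming z^3 = z + 1.
MOD, n = 0b1011, 7            # z^3 + z + 1 as a bitmask; code length n = 7

def gf_mul(a, b):             # multiplication in GF(8)
    acc = 0
    while b:
        if b & 1:
            acc ^= a
        b >>= 1
        a <<= 1
        if a & 0b1000:        # reduce by z^3 + z + 1
            a ^= MOD
    return acc

def gf_pow(a, e):
    acc = 1
    for _ in range(e):
        acc = gf_mul(acc, a)
    return acc

z = 0b010
A = [1, z, gf_mul(z, z), 0, gf_pow(z, 4), 0, 0]   # A = (1, z, z^2, 0, z^4, 0, 0)

# a_j = A(x = z^j), eq. (4.61b); XOR is addition in GF(2^m)
a = []
for j in range(n):
    s = 0
    for i, Ai in enumerate(A):
        s ^= gf_mul(Ai, gf_pow(z, (i * j) % n))
    a.append(s)
# a == [1, 1, 0, 1, 0, 0, 0]: a binary word, as required for a BCH code
```

Note that although the spectral components live in GF(8), every a_j comes out as 0 or 1; this is precisely what the conjugacy constraints (4.64) guarantee.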
page 235
The aforementioned example can be realised using the adjoining circuit.
page 236
4.3.2 Derivation of the BCH Codes in the Time Domain
If a component Ai of a vector A as well as all its conjugate components are equal to zero, the product of the corresponding linear factors forms, in the time domain, a minimum polynomial with coefficients from GF(2) according to eqs. (4.17) and (4.22). If A has consecutive components Ai, Ai−1, ... that are equal to zero, the product of the minimum polynomials yields the generator polynomial of the binary BCH code. Then the definition reads:

Kj are the cyclotomic cosets with respect to n = 2^m − 1, z is a primitive element of GF(2^m), and M is the union of cyclotomic cosets: M = {Kj1 ∪ Kj2 ∪ ...}. A primitive BCH code of length n = 2^m − 1 is defined by the generator polynomial:

g(x) = ∏_{i∈M} (x − z^i) = mj1(x) ⋅ mj2(x) ⋅ ...    (4.66)

The BCH code has the dimension k = n − grad(g(x)).
If d − 1 consecutive numbers exist in M, the designed minimum distance is d. The minimum distance actually achieved can be larger.
page 237
Example:
On the basis of GF(2^4), that is n = 2^4 − 1 = 15, constructions for different designed minimum distances are considered (compare p. 184).

I:  g(x) = ∏_{i∈{K1,K2}} (x − z^i) = m1(x) = x^4 + x + 1

With M = {K1,K2} = K1 = {1,2,4,8} the designed minimum distance is d = 3, since d − 1 = 2 consecutive numbers are contained in M.
The dimension is k = n − grad(g(x)) = 15 − 4 = 11.

II: g(x) = ∏_{i∈{K1,K2,K3,K4}} (x − z^i) = m1(x) ⋅ m3(x)
         = (x^4 + x + 1) ⋅ (x^4 + x^3 + x^2 + x + 1)
         = x^8 + x^7 + x^6 + x^4 + 1

With M = {K1∪K3} = {1,2,3,4,6,8,9,12} the designed minimum distance is d = 5, since d − 1 = 4 consecutive numbers are contained in M.
The dimension is k = n − grad(g(x)) = 15 − 8 = 7.
page 238
III: g(x) = ∏_{i∈{K1,...,K6}} (x − z^i) = m1(x) ⋅ m3(x) ⋅ m5(x)
          = (x^4 + x + 1) ⋅ (x^4 + x^3 + x^2 + x + 1) ⋅ (x^2 + x + 1)
          = x^10 + x^8 + x^5 + x^4 + x^2 + x + 1

When combining the cyclotomic cosets, it has to be taken into account that this is a union of sets, in which the same element cannot occur repeatedly. Therefore, e.g. for the first example (I) the following applies:
M = {K1∪K2} = K1, since K1 = K2.

With M = {K1∪K3∪K5} = {1,2,3,4,5,6,8,9,10,12} the designed minimum distance is d = 7, since d − 1 = 6 consecutive numbers are contained in M.
The dimension is k = n − grad(g(x)) = 15 − 10 = 5.
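The cosets and the polynomial products in examples I-III can be cross-checked with a few lines of GF(2) arithmetic. As before, polynomials are encoded as bitmasks (bit i ↔ x^i), a representation chosen for this sketch only:

```python
# Cross-check of examples I-III: cyclotomic cosets mod 15 and GF(2)
# products of the minimal polynomials.

def coset(j, n=15):                 # K_j = {j, 2j, 4j, ...} mod n
    c, x = set(), j % n
    while x not in c:
        c.add(x)
        x = (2 * x) % n
    return c

def clmul(a, b):                    # multiplication in GF(2)[x]
    acc = 0
    while b:
        if b & 1:
            acc ^= a
        a, b = a << 1, b >> 1
    return acc

m1 = 0b10011                        # x^4 + x + 1
m3 = 0b11111                        # x^4 + x^3 + x^2 + x + 1
m5 = 0b111                          # x^2 + x + 1

K1, K3, K5 = coset(1), coset(3), coset(5)
# K1 == {1,2,4,8}, K3 == {3,6,9,12}, K5 == {5,10}

g2 = clmul(m1, m3)                  # example II: x^8 + x^7 + x^6 + x^4 + 1
g3 = clmul(g2, m5)                  # example III: x^10 + x^8 + x^5 + x^4 + x^2 + x + 1
```

The degrees of g2 (8) and g3 (10) confirm the dimensions k = 7 and k = 5 derived above.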
page 239
4.3.3 Aspects of Coding / Decoding of BCH Codes
The BCH code is a linear cyclic code that, in analogy to section 3.3, can be encoded in two ways using the generator polynomial:
– non-systematically by multiplication by the generator polynomial and
– systematically by division by the generator polynomial.
Unlike with the RS codes, the check frequencies of the BCH codes cannot be shifted arbitrarily, since the conjugacy conditions fix their possible positions.
The decoding is similar to that of the RS codes. An advantage here is that with binary codes no calculation of the error values is required.
page 240
5 Convolutional Codes
Besides the block codes, the convolutional codes are the second important group of error-correcting codes. Instead of generating code words block by block, the code words are generated by convolution of an information sequence with a set of generator coefficients.
Although they have a simpler mathematical structure than the block codes, no analytical construction methods exist for them; good convolutional codes are found by computer search. However, convolutional codes can exploit reliability information about the code symbols much better than block codes. In return, they are highly sensitive to burst errors. Normally, the codes are binary.
The first publications regarding convolutional codes appeared in 1955 by Elias. However, they could only be used in practice after suitable decoding methods had been developed. Among those are sequential decoding by Fano (1963) and the maximum-likelihood decoding algorithm by Viterbi (1967), which can be realised in practice.
page 241
Soft and hard decision:
For decoding, convolutional codes can use hard-decision as well as soft-decision symbols. Hard decision means that, after demodulation, the continuous received signal is converted by a threshold decision into a sequence of possible transmit symbols. With soft decision, either the analog signal itself is passed on or quantised intermediate values are used (see figure). This makes a more reliable decoding possible.

(Figure: receive level over time with the HD decision threshold; a sample on the wrong side of the threshold produces a bit error.)
page 242
5.1 Definition of Convolutional Codes
In analogy to the block codes, information and code sequences are divided into blocks of length k and n, respectively. Here, however, they are indexed by the block number r:

ur = (ur,1, ur,2, ..., ur,k),  ar = (ar,1, ar,2, ..., ar,n).    (5.1a/b)

Usually, the symbols are binary: ai,j, ui,j ∈ GF(2) = {0,1}.
The assignment of ar to ur is, in contrast to the block codes, not memoryless. The current code block is determined by the current information block and by m preceding ones:

ar = ar(ur, ur−1, ..., ur−m)    (5.2)

The coder is linear (comp. section 5.4), that means the code bits are linear combinations of the information bits. The formal description uses the generator coefficients gκ,μ,ν ∈ GF(2) with 1 ≤ κ ≤ k, 0 ≤ μ ≤ m and 1 ≤ ν ≤ n, so that the code bit subsequences result from convolutions of the information bit sequences with the generator coefficients:

ar,ν = Σ_{μ=0}^{m} Σ_{κ=1}^{k} gκ,μ,ν ⋅ ur−μ,κ    (5.3)
page 243
The code rate of the convolutional code is:

RC = k/n    (5.4)

The quantity m is referred to as the memory and L = m + 1 as the constraint length. In addition to the code rate, it influences the performance as well as the decoding effort. Formally, block codes are a special case of the convolutional codes with m = 0.
Whereas for block codes k and n are usually large, typical values for convolutional codes are: k = 1, 2; n = 2, 3, 4; m ≤ 8.
For the realisation of the memory, k ⋅ (m + 1) storage locations are required (comp. figure on the next page). In the literature, representations also exist that feed the information block ur directly to the adders without intermediate storage. In this case, the number of storage locations only amounts to k ⋅ m.
page 244
Block diagram of a general (n,k,m) convolutional coder:
A connection corresponds to a coefficient gκ,μ,ν = 1.
page 245
Example: (2,1,2) convolutional coder
The parameters are:n = 2, k = 1, m = 2,RC = 1/2.
The code bits are calculated from:ar,1 = ur + ur−1 + ur−2;ar,2 = ur + ur−2.
Example of a coding sequence:u = (1,1,0,1,0,0,...);a = (11,01,01,00,10,11,...).
This example will often be referred to in this section.
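The two code-bit equations above translate directly into a tiny shift-register simulation. This is an illustrative sketch (the function name is not from the script):

```python
# Sketch of the (2,1,2) coder: a_r1 = u_r + u_(r-1) + u_(r-2), a_r2 = u_r + u_(r-2).
def encode_212(u):
    s1 = s2 = 0                               # memory u_(r-1), u_(r-2), initially zero
    out = []
    for ur in u:
        out.append((ur ^ s1 ^ s2, ur ^ s2))   # XOR = addition in GF(2)
        s1, s2 = ur, s1                       # shift the register
    return out

# Reproduces the coding sequence above:
# encode_212([1, 1, 0, 1, 0, 0]) -> [(1,1), (0,1), (0,1), (0,0), (1,0), (1,1)]
```

Each output tuple is one code block (ar,1, ar,2); the two state variables are exactly the k ⋅ m = 2 storage locations of the register.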
page 246
Example: (3,2,2) convolutional coder
The parameters are:n = 3, k = 2, m = 2,RC = 2/3.
The code bits are calculated from:ar,1 = ur,2 + ur−1,1 + ur−2,2;ar,2 = ur,1 + ur−1,1 + ur−1,2;ar,3 = ur,2.
Example of a code sequence:u = (11,10,00,11,01,...);a = (111,110,010,111,001,...).
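A generic encoder working directly from eq. (5.3) can check this example as well. The sketch below (all names illustrative) represents the generator coefficients as the list of their 1-entries (κ, μ, ν):

```python
# Generic encoder for eq. (5.3): a_r,nu = sum_mu sum_kappa g_kappa,mu,nu * u_(r-mu),kappa.
def conv_encode(u_blocks, g, m, n):
    u = [(0,) * len(u_blocks[0])] * m + list(u_blocks)   # zero initial memory
    out = []
    for r in range(m, len(u)):
        block = []
        for nu in range(1, n + 1):
            bit = 0
            for (kappa, mu, nu_g) in g:                  # g lists the 1-entries
                if nu_g == nu:
                    bit ^= u[r - mu][kappa - 1]
            block.append(bit)
        out.append(tuple(block))
    return out

# 1-entries of g_kappa,mu,nu for the (3,2,2) coder of this example:
# a_r1 = u_r,2 + u_(r-1),1 + u_(r-2),2
# a_r2 = u_r,1 + u_(r-1),1 + u_(r-1),2
# a_r3 = u_r,2
g322 = [(2, 0, 1), (1, 1, 1), (2, 2, 1),
        (1, 0, 2), (1, 1, 2), (2, 1, 2),
        (2, 0, 3)]

# conv_encode([(1,1),(1,0),(0,0),(1,1),(0,1)], g322, m=2, n=3)
# -> [(1,1,1), (1,1,0), (0,1,0), (1,1,1), (0,0,1)]
```

With a different coefficient list, the same function reproduces the (2,1,2) example on the previous page.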
page 247
Polynomial description:
The restriction k = 1 is applied here, which results in considerable simplifications of the description. To the generator coefficients gμ,ν the generator polynomials

gν(x) = Σ_{μ=0}^{m} gμ,ν ⋅ x^μ    (5.5)

are assigned. The information bit sequence is characterised by a power series and the code block sequence by a vector of power series:

u(x) = Σ_{r=0}^{∞} ur x^r    (5.6)

and

a(x) = (a1(x), a2(x), ..., an(x))  with  aν(x) = Σ_{r=0}^{∞} ar,ν x^r    (5.7a/b)
page 248
The convolutional coding corresponds to a polynomial multiplication according to:

aν(x) = u(x) ⋅ gν(x)  for ν = 1, ..., n    (5.8a)

(a1(x), a2(x), ..., an(x)) = u(x) ⋅ (g1(x), g2(x), ..., gn(x))    (5.8b)

⇔ a(x) = u(x) ⋅ G(x)    (5.8c)

with the generator matrix:

G(x) = (g1(x), g2(x), ..., gn(x))    (5.9)

For the memory applies:

m = max_{1≤ν≤n} grad(gν(x))    (5.10)

Then the definition of the convolutional codes via the polynomial multiplication reads:

C = { a(x) = u(x) ⋅ G(x) | u(x) = Σ_{r=0}^{∞} ur x^r, ur ∈ {0,1} }    (5.11)
page 249
Example:
For the (2,1,2) convolutional coder on page 245 applies:
G(x) = (g1(x), g2(x)) = (1 + x + x^2, 1 + x^2).
The coding example can be described as:
u = (1,1,0,1,0,0,...) ⇔ u(x) = 1 + x + x^3 + ...
and
a(x) = (a1(x), a2(x)) = u(x) G(x)
     = ((1 + x + x^3)(1 + x + x^2), (1 + x + x^3)(1 + x^2))
     = (1 + x^4 + x^5, 1 + x + x^2 + x^5)
     = (a0,1 x^0 + a1,1 x^1 + a2,1 x^2 + a3,1 x^3 + a4,1 x^4 + a5,1 x^5,
        a0,2 x^0 + a1,2 x^1 + a2,2 x^2 + a3,2 x^3 + a4,2 x^4 + a5,2 x^5).
From this, the code sequence in vector notation results by sorting the coefficients:
a = (a0,1 a0,2, a1,1 a1,2, a2,1 a2,2, a3,1 a3,2, a4,1 a4,2, a5,1 a5,2).
Result: a = (11,01,01,00,10,11).
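The polynomial multiplications above can be checked with GF(2) carry-less arithmetic; polynomials are again bitmasks (bit r ↔ x^r), an encoding used only in this sketch:

```python
# The polynomial view of the (2,1,2) example: a_nu(x) = u(x) * g_nu(x) over GF(2).
def clmul(a, b):                   # multiplication in GF(2)[x]
    acc = 0
    while b:
        if b & 1:
            acc ^= a
        a, b = a << 1, b >> 1
    return acc

u = 0b1011                         # u(x) = 1 + x + x^3
g1, g2 = 0b111, 0b101              # 1 + x + x^2 and 1 + x^2

a1, a2 = clmul(u, g1), clmul(u, g2)
# a1 == 0b110001 (1 + x^4 + x^5), a2 == 0b100111 (1 + x + x^2 + x^5)

# Sorting the coefficients block by block:
a = [((a1 >> r) & 1, (a2 >> r) & 1) for r in range(6)]
# a == [(1,1), (0,1), (0,1), (0,0), (1,0), (1,1)] -- matches the result above
```

This confirms that the polynomial multiplication and the shift-register convolution describe the same code.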
page 250
5.2 Special Code Classes
Systematic convolutional codes:
The information bits are taken over directly as code bits.
Block diagram of a general systematic convolutional coder:
Non-systematic convolutional codes normally have more favourable properties.
page 251
Terminated convolutional codes:
After L information blocks, m known blocks (tail bits – normally zeros) are inserted. Thus, the code word is closed (termination) and the memory is set to the zero state. This corresponds to a block structure (block code). The code rate is then:

RC,terminated = (L ⋅ k) / ((L + m) ⋅ n)    (5.12)

(Figure: L information blocks followed by m known additional blocks; the coded information covers all L + m blocks.)

Note: Do not confuse the L combined blocks for the termination with the constraint length (see page 243)!
page 252
Punctured convolutional codes:
P code blocks are combined (P is called the puncturing length). From these, l code bits are deleted according to a puncturing scheme (puncturing matrix P). The code rate is then given by:

RC,punctured = (P ⋅ k) / (P ⋅ n − l)    (5.13)

It must hold: RC,punctured < 1 ⇒ l < P ⋅ (n − k).    (5.14)

Normally, punctured codes are applied with information blocks of length k = 1. At the same code rate, they are only slightly less efficient than non-punctured convolutional codes with k > 1. The decoding effort hardly changes compared to the non-punctured convolutional codes, and switching between rates is possible without great additional effort.
Of great practical importance are especially the RCPC codes (rate compatible punctured convolutional codes). Their code families are derived from a mother code (a non-punctured convolutional code) in such a way that the higher-rate codes are contained in the lower-rate codes.
page 253
Example:
Let l = 1 with the puncturing matrix (rows: code bits ν = 1, 2; columns: P = 2 successive blocks; 1 = keep, 0 = delete):

P = ( 1  1 )
    ( 1  0 )

This is NOT a matter of matrix multiplication!
The code example reads as follows:
u = (1,1,0,1,0,0,...)
a = (11,01,01,00,10,11,...) → apunct = (11,0x,01,0x,10,1x,...).
'x' identifies the deleted digits:
apunct = (11,0,01,0,10,1,...).
Code rate: RC,punctured = (2 ⋅ 1) / (2 ⋅ 2 − 1) = 2/3
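The deletion step can be sketched in a few lines. The matrix convention below (P[ν][c] = 1 means code bit ν of block c within a P-block group is kept) is an assumption consistent with the example, not a notation fixed by the script:

```python
# Sketch of puncturing with a keep/delete matrix P[nu][c].
def puncture(blocks, P):
    period = len(P[0])                       # = puncturing length P
    out = []
    for c, block in enumerate(blocks):
        kept = tuple(bit for nu, bit in enumerate(block)
                     if P[nu][c % period] == 1)
        out.append(kept)
    return out

P = [(1, 1),       # code bit 1: always kept
     (1, 0)]       # code bit 2: deleted in every second block
a = [(1, 1), (0, 1), (0, 1), (0, 0), (1, 0), (1, 1)]
# puncture(a, P) -> [(1,1), (0,), (0,1), (0,), (1,0), (1,)]
```

Out of every 4 code bits, 1 is deleted, reproducing the rate 2/3 computed above.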
page 254
5.3 Representation of Convolutional Codes
Besides the representation via the shift register circuit or via the generator polynomials and the polynomial multiplication, the following three representations are common:
– code tree
– state diagram
– network diagram (trellis diagram)

Code tree:
The code is unfolded as a tree, with the nodes corresponding to the possible memory states. From every node, 2^k branches emerge, labelled with the respective information and code bits. The disadvantage of this representation is that the number of nodes grows exponentially with every coding step. The number of possible paths is:

NPaths = (2^k)^NBlock = 2^(k ⋅ NBlock)    (5.15)

where NBlock is the number of processed blocks and thus the depth of the tree.
page 255
Example: Code tree for the (2,1,2) convolutional coder (see page 245). A path corresponds to an information sequence and its code sequence, respectively. Marked here: u = (1 1 0 1 ...), a = (11 01 01 00 ...).
state (output bits)
page 256
State diagram:
In contrast to the code tree, this is a repetition-free description of the convolutional code; the time information, however, is missing in the state diagram. The output block depends on the input block and the memory content, whereas the new memory content depends on the input block and the old memory content. From the diagram, the minimum number of coding steps required to get from a state A to a state B can be read off. The number of states is:

NStates = (2^k)^m = 2^(k ⋅ m)    (5.16)

At every state, 2^k transitions begin, so that all in all

NStates ⋅ 2^k = 2^(k ⋅ m) ⋅ 2^k = 2^(k ⋅ (m+1))    (5.17)

transitions exist. For large parameters, this representation can become unwieldy as well.
page 257
Example: (2,1,2) convolutional code (see page 245).
The parameters are:
n = 2, k = 1, m = 2, RC = 1/2.
The used notations are:
memory content = state = ur−1 ur−2
input bits (output bits) = ur (ar,1 ar,2)
page 258
Example: (3,2,2) convolutional coder (see page 246).
The parameters are:n = 3, k = 2, m = 2,RC = 2/3.
This example shows the problems with unfavourable parameter values. Although all arrows are included in the diagram, the labelling of the input and output bits had to be omitted.
page 259
Network diagram − trellis diagram:
It corresponds to a state diagram extended by a time axis and represents the basis for decoding. A receive sequence can be compared block by block with all possible paths, and thus the most likely code sequence for a receive word can be determined.
Example:Trellis segment with all possibletransitions for the (2,1,2)convolutional coder (see page 245).
page 260
Example:
Trellis diagram for the (2,1,2) convolutional coder (see page 245) with a coding example including the termination. The terminated code word reads: a = (11,01,01,00,10,11,00,00).
(Figure: trellis over time with initial phase, final phase and the termination marked.)
page 261
5.4 Features of Convolutional Codes
Linearity:
The convolution is a linear operation; thus, the convolutional code is linear.

Minimum distance and free distance:
The minimum distance as used for the evaluation of block codes is unsuitable for convolutional codes, because code words of convolutional codes do not have a defined length. Instead, a so-called free distance df is defined. It represents the minimum Hamming distance between arbitrary code sequences belonging to different information sequences:

df = min { dH(a1, a2) | u1 ≠ u2 }    (5.18a)

Since the convolutional code is linear, the distance to the zero sequence is sufficient for the calculation. Thus, df corresponds to the minimum weight of all possible non-zero code sequences:

df = min { wH(u(x) ⋅ G(x)) | u(x) = Σ_i ui x^i, u(x) ≠ 0 }    (5.18b)
page 262
Fundamental path, distance function and distance profile:
A path leaving the zero state and returning to it without touching it in between is called a fundamental path. Among these, at least one path exists that forms a code sequence of minimum weight. The paths can be determined via the trellis or the state diagram; for the latter, cutting the diagram open at the zero state is recommended.

Example (see next page): the shortest fundamental path from 00 to 00 is:
00 →(11) 10 →(10) 01 →(11) 00 ⇒ wH = 5.
Two paths can be found for wH = 6:
00 →(11) 10 →(01) 11 →(01) 01 →(11) 00 ⇒ wH = 6;
00 →(11) 10 →(10) 01 →(00) 10 →(10) 01 →(11) 00 ⇒ wH = 6.
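The minimum-weight fundamental path can also be found mechanically by a shortest-path search over the state diagram, with the output weight of each transition as the edge cost. This is a sketch with illustrative names, using the (2,1,2) coder (state = (ur−1, ur−2) as a 2-bit number):

```python
import heapq

# Free distance of the (2,1,2) code via Dijkstra over the state diagram.
def step(state, u):                       # one coder step -> (output weight, next state)
    s1, s2 = (state >> 1) & 1, state & 1
    w = (u ^ s1 ^ s2) + (u ^ s2)          # Hamming weight of (a_r1, a_r2)
    return w, (u << 1) | s1

def free_distance():
    w0, s0 = step(0, 1)                   # leave the zero state with input 1
    heap, best = [(w0, s0)], {}
    while heap:
        w, s = heapq.heappop(heap)
        if s == 0:
            return w                      # first return to 00 = minimum weight
        if best.get(s, 99) <= w:
            continue
        best[s] = w
        for u in (0, 1):
            dw, ns = step(s, u)
            heapq.heappush(heap, (w + dw, ns))
    return None

# free_distance() -> 5, in agreement with the fundamental path above
```

The same search generalises to any coder once `step` is replaced; it anticipates the result df = 5 derived analytically from the distance function on the following pages.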
page 263
State diagram cut open at the zero state for the (2,1,2) convolutional coder (see page 245):
For the further approach, two formal parameters are introduced. The states are described by the formal parameter Zi. For the description of the distance, the formal parameter D is introduced, whose exponent states the number of bits different from zero.
page 264
By means of the introduced parameters, the following equation system can be set up:

Zb = D²·Za + D⁰·Zc;  Zc = D¹·Zb + D¹·Zd;
Zd = D¹·Zb + D¹·Zd;  Ze = D²·Zc.

The solution of the equation system provides the distance function:

T(D) = Ze / Za = D⁵ / (1 − 2D).  (5.19)

From the power series expansion 1/(1 − x) = 1 + x + x² + x³ + x⁴ + … for x < 1, the following results:

T(D) = D⁵ · 1/(1 − 2D) = D⁵ + 2D⁶ + 4D⁷ + 8D⁸ + 16D⁹ + …

The result shows the path with the minimum weight 5, so that the free distance df = 5. Furthermore, the two paths with the weight 6 can be read off (see page 262). There are 4 more paths with the weight 7, 8 paths with the weight 8, 16 paths with the weight 9 etc.
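As a cross-check of the series expansion, the fundamental paths can also be enumerated directly on the separated state diagram. The sketch below (a helper of my own, not from the script) walks all paths from the split-off start state a to the sink state e, accumulating the branch output weights, and counts the paths per total weight:

```python
from collections import Counter

# Separated state diagram of page 263: zero state split into source a and
# sink e; edge weights are the Hamming weights of the branch outputs.
EDGES = {
    'a': [('b', 2)],             # 00 -> 10, output 11
    'b': [('c', 1), ('d', 1)],   # 10 -> 01 (out 10) and 10 -> 11 (out 01)
    'c': [('e', 2), ('b', 0)],   # 01 -> 00 (out 11) and 01 -> 10 (out 00)
    'd': [('c', 1), ('d', 1)],   # 11 -> 01 (out 01) and 11 -> 11 (out 10)
}

def path_weights(max_w):
    """Count fundamental paths by total output weight, up to max_w."""
    counts = Counter()
    stack = [('a', 0)]
    while stack:
        state, w = stack.pop()
        for nxt, dw in EDGES[state]:
            if w + dw > max_w:
                continue                 # weight budget exceeded, prune
            if nxt == 'e':
                counts[w + dw] += 1      # completed fundamental path
            else:
                stack.append((nxt, w + dw))
    return dict(counts)
```

The counts reproduce the coefficients of T(D): one path of weight 5, two of weight 6, four of weight 7, and so on.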
page 265
5.4 Features of Convolutional Codes
The minimum distance dmin depends on the number of coding steps i. Considering dmin as a function of i, the distance profile of a convolutional code is obtained. Of interest here is how many coding steps are required until a code word is guaranteed to reach the free distance.

Example: (2,1,2) convolutional code (see page 245). After at most i = 6 steps, the free distance is reached (compare the fundamental path with the lowest weight, where only three steps form the free distance; page 262).
page 266
5.4 Features of Convolutional Codes
Optimal convolutional code:
A code with the maximum free distance at a given code rate and memory is referred to as an optimal code.
Table of optimal convolutional codes (upper block: two generator polynomials, rate 1/2; lower block: three generator polynomials, rate 1/3):

m | df | g1(x)                  | g2(x)                  | g3(x)
2 |  5 | 1 + x + x²             | 1 + x²                 | —
3 |  6 | 1 + x + x³             | 1 + x + x² + x³        | —
4 |  7 | 1 + x³ + x⁴            | 1 + x + x² + x⁴        | —
5 |  8 | 1 + x² + x⁴ + x⁵       | 1 + x + x² + x³ + x⁵   | —
6 | 10 | 1 + x² + x³ + x⁵ + x⁶  | 1 + x + x² + x³ + x⁶   | —
2 |  8 | 1 + x + x²             | 1 + x²                 | 1 + x + x²
3 | 10 | 1 + x + x³             | 1 + x + x² + x³        | 1 + x² + x³
4 | 12 | 1 + x² + x⁴            | 1 + x + x³ + x⁴        | 1 + x + x² + x³ + x⁴
5 | 13 | 1 + x² + x⁴ + x⁵       | 1 + x + x² + x³ + x⁵   | 1 + x³ + x⁴ + x⁵
6 | 15 | 1 + x² + x³ + x⁵ + x⁶  | 1 + x + x⁴ + x⁶        | 1 + x + x² + x³ + x⁴ + x⁶
page 267
5.4 Features of Convolutional Codes
Catastrophic coder:
If a finite number of transmission errors results in an infinite number of decoding errors, this is called catastrophic error propagation. It occurs if two output sequences of a coder differ only in a few bits although the input sequences are completely different; a small number of transmission errors at these positions can then falsify one code sequence into the other.

For the case k = 1, a catastrophic coder is avoided if the generator polynomials have no common divisor (compare the example):

gcd(g1(x), g2(x), … , gn(x)) = 1.  (5.20)

Catastrophic error propagation is possible whenever at least two memory states exist which are not left as long as the respective information does not change and which thereby generate identical code sequences. At the same time, this is a criterion for the detection of a catastrophic coder and can easily be read off from a state diagram (see example next page).
page 268
5.4 Features of Convolutional Codes
Example:
The two states (00) and (11) are not left in case of constant information and generate the same code sequences. Thus, this is a catastrophic coder.

The same statement can be derived from the generator polynomials, since g1(x) is a divisor of g2(x):

g1(x) = 1 + x;  g2(x) = 1 + x² = (1 + x) ⋅ (1 + x).
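The divisor criterion of eq. (5.20) is easy to check mechanically. In the sketch below (helper names are my own, not from the script), polynomials over GF(2) are represented as integer bit masks with bit i standing for the coefficient of x^i; a carry-less Euclidean algorithm then yields the gcd for the catastrophic pair above and for the non-catastrophic optimal pair g1 = 1 + x + x², g2 = 1 + x²:

```python
def gf2_mod(a, b):
    """Remainder of carry-less polynomial division over GF(2)."""
    while a.bit_length() >= b.bit_length():
        a ^= b << (a.bit_length() - b.bit_length())
    return a

def gf2_gcd(a, b):
    """Euclidean algorithm on GF(2) polynomials."""
    while b:
        a, b = b, gf2_mod(a, b)
    return a

def is_catastrophic(*gens):
    """Criterion of eq. (5.20) for a k = 1 coder: gcd of all generators != 1."""
    g = gens[0]
    for h in gens[1:]:
        g = gf2_gcd(g, h)
    return g != 1

catastrophic_pair = (0b011, 0b101)      # 1 + x  and  1 + x^2
optimal_pair = (0b111, 0b101)           # 1 + x + x^2  and  1 + x^2
```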
page 269
5.5 Decoding Using the Viterbi Algorithm
Viterbi decoding is a method that requires little effort and with which, technically, memory depths of up to m ≈ 10 can be realised. It provides maximum-likelihood decoding and is therefore optimal.
For the derivation of the method, a terminated code sequence a of the length N and a receive sequence r are assumed to be known. The objective of the decoder is to estimate the code sequence that agrees with the transmit sequence a as well as possible, that means the one that maximises the likelihood function p(r|a):

p(r|a) = ∏_{i=0}^{N−1} p(ri|ai).  (5.21)

Alternatively, the maximum of the log-likelihood function log p(r|a) is searched for, so that for the memoryless channel the transition probability p(r|a) can be replaced by a sum of the transition probabilities of the single symbols p(ri|ai):

log p(r|a) = ∑_{i=0}^{N−1} log p(ri|ai).  (5.22)
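For a memoryless channel, the equivalence of eqs. (5.21) and (5.22) is simply the logarithm of a product turning into a sum. A minimal numerical check, assuming a symmetric binary channel with an illustrative error probability and short example sequences (all values my own):

```python
import math

p_err = 0.1                      # assumed channel error probability
a = [1, 0, 1, 1, 0]              # hypothetical code symbols
r = [1, 1, 1, 0, 0]              # hypothetical receive symbols

def p_sym(ri, ai):
    """Per-symbol transition probability of the symmetric binary channel."""
    return 1 - p_err if ri == ai else p_err

# eq. (5.21): product of the per-symbol transition probabilities
likelihood = math.prod(p_sym(ri, ai) for ri, ai in zip(r, a))
# eq. (5.22): sum of the per-symbol log-likelihoods
log_likelihood = sum(math.log(p_sym(ri, ai)) for ri, ai in zip(r, a))
```

Both expressions attain their maximum for the same code sequence, which is why the decoder may work with the numerically more convenient sum.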
page 270
5.5 Decoding Using the Viterbi Algorithm
Thus, the maximum-likelihood detection corresponds to the maximisation of metrics with the general form

M(r|a) = α · log p(r|a) + ∑_{i=0}^{N−1} βi = ∑_{i=0}^{N−1} [α · log p(ri|ai) + βi] = ∑_{i=0}^{N−1} μ(ri|ai)  (5.23)

using the metric increment μ(ri|ai).

Example: Using ai ∈ {x1, x2} = {−1, +1} at the input and ri ∈ {y1, y2} = {−1, +1} at the output, the following applies for the symmetric binary channel:

p(ri|ai) = { 1 − perr for ri = ai;  perr for ri ≠ ai }
         = { 1 − perr for ri · ai = +1;  perr for ri · ai = −1 }.  (5.24)
page 271
5.5 Decoding Using the Viterbi Algorithm
Choosing for the free parameters α and βi the values

α = 2 / log((1 − perr)/perr)  and  βi = −1 − α · log perr,  (5.25a/b)

the metric increment just reads:

μ(ri|ai) = { +1 for ri · ai = +1;  −1 for ri · ai = −1 } = ri · ai.  (5.26a)

The maximisation of the Viterbi metric

M(r|a) = ∑_{i=0}^{N−1} μ(ri|ai) = ∑_{i=0}^{N−1} ri · ai  (5.26b)

is equivalent to a minimisation of the Hamming distance or to the maximisation of the number of agreements. This, however, is only one possible approach; other definitions of metrics are possible.
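The stated equivalence follows from ∑ ri·ai = N − 2·dH for ±1-valued symbols: every agreement contributes +1 and every disagreement −1 to the correlation. A quick numerical check with random ±1 sequences (lengths and seed chosen arbitrarily):

```python
import random

random.seed(42)
N = 20
a = [random.choice([-1, 1]) for _ in range(N)]   # code symbols in {-1, +1}
r = [random.choice([-1, 1]) for _ in range(N)]   # receive symbols in {-1, +1}

# Viterbi metric of eq. (5.26b): correlation of receive and code sequence
correlation = sum(ri * ai for ri, ai in zip(r, a))
# Hamming distance: number of disagreements
hamming = sum(ri != ai for ri, ai in zip(r, a))
```

Since correlation = N − 2·dH holds identically, maximising the correlation and minimising the Hamming distance select the same path.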
page 272
5.5 Decoding Using the Viterbi Algorithm
Viterbi algorithm:
Decoding is carried out by means of the trellis diagram. The objective is to compare all possible code sequences with the receive sequence; with this algorithm, this takes place by calculating the Viterbi metric.

Approach: In the initial state, the metric is set to 0. Then, in every step, each state reached via different paths is assigned a metric z, calculated by adding a metric increment to the metric of the previous state.
Of all 2^k paths arriving at a state, only the path with the largest metric is retained. All other 2^k − 1 paths are discarded, since they can never achieve a larger metric. If several paths with the same metric exist, one of them has to be chosen at random. Eventually, the decoding result is the path with the largest metric.
page 273
5.5 Decoding Using the Viterbi Algorithm
During decoding, a certain number of paths has to be saved. Since this number can become very large (before only one path remains; see the following example), the length of the paths to be saved is restricted to 5 ⋅ k ⋅ m. This restriction is based on experience. Thereby, the storage requirement is approximately the product of the number of states and the path length (2^(k⋅m) ⋅ 5 ⋅ k ⋅ m bit).

In principle, the rule for the metric increment is arbitrary. With hard-decision decoding, eq. (5.26b) is normally used, or simply the number of agreements between the code sequence of a part of a path and the respective receive sequence is selected. With soft-decision decoding, e.g., the distance between a receive value and the code symbols can be selected, this method being more reliable than hard-decision decoding (comp. page 241). In this case, it is advantageous that the probability of equal metrics at a state is lower.
page 274
5.5 Decoding Using the Viterbi Algorithm
Example: (2,1,2) convolutional code (see page 245)
In the following, the individual decoding steps for the altogether 9 known receive pairs are represented in detail. The paths with the largest metric are highlighted in purple.

Let the information vector be
u = (1,1,0,1,0,0,0,1,0,...).
From this, the code sequence vector
a = (11,01,01,00,10,11,00,11,10,...)
results. During transmission, the error vector
f = (00,10,00,00,01,00,00,00,00,...)
occurs, so that the receive vector reads:
r = (11,11,01,00,11,11,00,11,10,...).
The receive vector is to be decoded using the Viterbi algorithm. Thereby, hard-decision decoding is to be carried out and the number of agreements is to be used as metric.
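The decoding steps shown on the following pages can be reproduced with a compact Viterbi sketch — my own implementation of the algorithm described on page 272, not code from the script — using the number of agreements as metric:

```python
# Viterbi hard-decision decoding for the (2,1,2) coder of page 245
# (g1(x) = 1 + x + x^2, g2(x) = 1 + x^2); metric = number of agreements.

def encode(u):
    s1 = s2 = 0                                 # shift register contents
    out = []
    for b in u:
        out.append((b ^ s1 ^ s2, b ^ s2))       # (g1 output, g2 output)
        s1, s2 = b, s1
    return out

def branch(state, b):
    """Next state and output pair for input bit b."""
    s1, s2 = state
    return (b, s1), (b ^ s1 ^ s2, b ^ s2)

def viterbi(r):
    # survivors: state -> (metric, input bits of the surviving path)
    survivors = {(0, 0): (0, [])}
    for ri in r:
        nxt = {}
        for state, (metric, bits) in survivors.items():
            for b in (0, 1):
                ns, out = branch(state, b)
                m = metric + (out[0] == ri[0]) + (out[1] == ri[1])
                if ns not in nxt or m > nxt[ns][0]:
                    nxt[ns] = (m, bits + [b])   # keep path with larger metric
        survivors = nxt
    return max(survivors.values())[1]           # path with the largest metric

u = [1, 1, 0, 1, 0, 0, 0, 1, 0]
a = encode(u)                                   # (11,01,01,00,10,11,00,11,10)
f = [(0, 0), (1, 0), (0, 0), (0, 0), (0, 1), (0, 0), (0, 0), (0, 0), (0, 0)]
r = [(a1 ^ f1, a2 ^ f2) for (a1, a2), (f1, f2) in zip(a, f)]
decoded = viterbi(r)
```

Tracing back from the state with the largest final metric (16 agreements out of 18, i.e. the two channel errors are corrected) recovers the original information vector u.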
page 275
5.5 Decoding Using the Viterbi Algorithm
1st step:
At the beginning, the metric is equal to 0; in each step it increases by the number of agreements of the receive sequence with the possible code sequences.
[Trellis diagram: states plotted vertically against the receive sequence; branches labelled with the input bits]
page 276
5.5 Decoding Using the Viterbi Algorithm
2nd step:
After m = 2 steps, all states are reached. From the next step on, every state will be reached via more than one path (see next page).
page 277
5.5 Decoding Using the Viterbi Algorithm
3rd step:
For each arriving path there is a metric, with the upper metric belonging to the upper path and the lower metric belonging to the lower path. In each case, the path with the lower metric can be deleted.
page 278
5.5 Decoding Using the Viterbi Algorithm
4th step:
With the deleted paths removed, the representation becomes clearer. In this step, the two lower states each have two paths with the same metric, so that in each case one path is deleted at random (comp. next page).
page 279
5.5 Decoding Using the Viterbi Algorithm
5th step:
The previous path deletions leave only one single transition from the 0th to the 1st state, which is therefore the most likely one. Thus, this first sequence is already decoded.
page 280
5.5 Decoding Using the Viterbi Algorithm
6th step:
At this point, the decoder could already output the code sequence (11) and the corresponding information sequence (1), respectively, and the representation could move on by one step. In this example, it is still retained.
page 281
5.5 Decoding Using the Viterbi Algorithm
7th step:
From the maximum metric it can be seen how many errors must at least have occurred so far (in this case, a maximum of 14 agreements is possible, but at most 12 are present; thus: at least 2 errors; comp. error vector).
page 282
5.5 Decoding Using the Viterbi Algorithm
8th step:
In the case of a terminated code, the number of states would decrease towards the end (termination phase), until after the last step only the initial state remains (comp. page 260).
page 283
5.5 Decoding Using the Viterbi Algorithm
9th step:
The end of the known sequence is reached. Up to the 4th state, the result can be displayed; up to the last state this would only be possible if the receive sequence were finished. In practice, however, it continues!
page 284
6 Advanced Coding Techniques
In the course of this lecture, fundamentals and basic codes in the field of channel coding have been dealt with in detail. These are also widespread in practice, although often in complex implementations. Among those are, in particular, concatenated codes with interleavers, whose principles are shortly introduced here.

These "basic codes" are, of course, not the only codes developed in the past 40 years. Since the beginning of the 1990s, first and foremost two classes of codes have been investigated which are able to approach the Shannon limit very closely (less than 1 dB) with regard to their correction capability. On the one hand, these are the turbo codes, derived virtually from experiments; on the other hand, the low-density parity-check codes (LDPC codes), which emerged in the 1960s. The latter, however, had been disregarded until the beginning of the 1990s, since they were very complex compared to the technology available at that time. The principle of these two code classes is outlined below.
page 285
6 Advanced Coding Techniques
Concatenation of codes:
With the concatenation of codes, two or more codes are connected in series (see figure next page). The motivation for this construction is the exploitation of different, complementary code characteristics, e.g. for error correction and detection or for single-error and burst-error correction.

One classical example is the correction of single errors by the inner code, whereas burst errors are corrected by the outer code. A method of this kind is applied e.g. in European mobile radio systems such as GSM. By concatenation, very efficient codes can be built. In many applications, the inner code is implemented as a convolutional code for single-error correction and the outer code as a Reed-Solomon code for burst-error correction.
page 286
6 Advanced Coding Techniques
Principle of concatenated codes:
The principle of the interleaver is explained on the following pages.
outer channel coder → interleaver → inner channel coder → channel → inner channel decoder → deinterleaver → outer channel decoder
page 287
6 Advanced Coding Techniques
Interleaving:
On a transmission channel, burst errors can occur that exceed the correction capability of the applied channel codes. It is the task of the interleaver to re-sort the code word sequence to be sent, so that at the receiver, after passing the deinterleaver, one burst error results in several quasi-single errors, which are spread over several code words and can be corrected.

The example (see next page) shows this method as a matrix in which the code words are arranged line by line but transmitted column by column. Thus, the burst error occurring in the 4th column is distributed over several code words as single errors.

The disadvantage of the interleaver is that, due to the intermediate storage of the code words, an additional time delay emerges, which increases with the size of the interleaver. This causes the most problems in time-critical systems such as voice transmission.
page 288
6 Advanced Coding Techniques
Block interleaver:
[Figure: code words 1 to k are written line by line into a matrix; the blocks are sent column by column]
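The matrix operation of the block interleaver can be sketched as follows (function names are my own). With four code words of length eight, a burst of length 4 on the channel ends up as one quasi-single error per code word:

```python
def interleave(codewords):
    """Write code words line by line, send column by column."""
    n = len(codewords[0])
    return [cw[j] for j in range(n) for cw in codewords]

def deinterleave(stream, k, n):
    """Reassemble k code words of length n at the receiver."""
    return [[stream[j * k + i] for j in range(n)] for i in range(k)]

k, n = 4, 8
codewords = [[0] * n for _ in range(k)]   # all-zero words, so errors are visible
sent = interleave(codewords)
sent[8:12] = [1, 1, 1, 1]                 # burst error of length 4 on the channel
received = deinterleave(sent, k, n)
```

After deinterleaving, every code word contains at most one error, which a single-error-correcting code can handle.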
page 289
6 Advanced Coding Techniques
Coding of the signalling information with GSM (BCCH channel):
The outer coder consists of a (224,184) block code, to which 4 tail bits for the terminated (2,1,4) convolutional coding (inner coder) are attached. The tail bits are interpreted as information by the convolutional code; this yields a code rate of 1/2. The resulting 456 bits are cut into bursts and transmitted after interleaving.
[Figure: signalling information (184 bits) → block code appends 40 bits plus 4 tail bits (228 bits in all) → rate-1/2 convolutional code for error correction → 456 bits → interleaving and burst generation → normal bursts in TDMA frames of 4.6 ms]
page 290
6 Advanced Coding Techniques
Turbo codes:
The first paper was published in 1993 by the two French electrical engineers Claude Berrou and Alain Glavieux. The original idea leading to the turbo codes was the feedback of signals commonly used in electrical engineering (e.g. with amplifiers). Turbo codes are characterised by their error-correction performance, which is close to the Shannon limit (approx. 0.5 – 0.7 dB).

The turbo codes were developed experimentally and not by mathematical-theoretical approaches. Therefore, their performance was checked critically for years before they were considered for application in modern systems. Today, they are part of UMTS data transmission, for example. For voice transmission, however, turbo codes are not applicable due to the delay time; in this field, convolutional codes are used. In space communication, turbo codes are now applied as well.
page 291
6 Advanced Coding Techniques
Principle of coding with turbo codes:
The information bits are transmitted three times over the channel: once without error protection (i) and twice with the error protection of a convolutional code (c1 and c2), with one of the coding blocks receiving an information sequence permuted by an interleaver as input. The multiplexer then assembles the three signals into one signal.
[Figure: information symbols → convolutional code 1 → c1; information symbols → interleaver → convolutional code 2 → c2; i, c1 and c2 → multiplexer → output signal]
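The interleave-encode-multiplex structure can be sketched as follows. Note the simplification: the sketch reuses the simple feedforward parity g(x) = 1 + x + x² of the (2,1,2) coder from page 245 as a stand-in encoder; actual turbo codes use recursive systematic convolutional encoders, which this sketch does not model:

```python
import random

def parity(u):
    """Feedforward parity stream with g(x) = 1 + x + x^2 (stand-in encoder)."""
    s1 = s2 = 0
    out = []
    for b in u:
        out.append(b ^ s1 ^ s2)
        s1, s2 = b, s1
    return out

random.seed(0)
u = [1, 0, 1, 1, 0, 0, 1, 0]                 # information bits (example values)
perm = list(range(len(u)))
random.shuffle(perm)                         # interleaver permutation
c1 = parity(u)                               # encoder 1: original order
c2 = parity([u[p] for p in perm])            # encoder 2: interleaved order
# multiplexer: systematic bit, parity 1, parity 2 per information bit (rate 1/3)
tx = [bit for triple in zip(u, c1, c2) for bit in triple]
```

The decoder can recover the systematic bits directly from every third transmitted symbol, while c1 and c2 supply the redundancy exploited in the iterative decoding described on the next page.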
page 292
6 Advanced Coding Techniques
Principle of decoding with turbo codes:
The receive signal, consisting of the three parts sent, is split and passed to the decoding blocks, which receive their signals to be decoded as soft-decision information as well as a copy of the uncoded signal. From this, suggestions for the correctly decoded bits are derived, which are passed on via the signals e1 and e2 to the respective other block. The decoding is not completed until the suggestions of both blocks agree (iterative method). From experience, the number of iteration steps is between 4 and 10.
[Figure: the receive signal (s, c1, c2) is split; decoder 1 processes s and c1, decoder 2 processes s and c2; the suggestions e1 and e2 are exchanged between the two decoders via interleaver and deinterleaver]
page 293
6 Advanced Coding Techniques
Low-density parity-check codes (LDPC codes):
In 1960, Robert G. Gallager published the idea of LDPC codes within the scope of his doctoral thesis. Due to their complexity (among other things, iterative decoding was only possible using mainframe computers at that time), they were neglected with regard to practical applications for a long time; in terms of technology, the commercial application of highly integrated coder circuits for efficient codes has only been possible for a few years. In 1995, LDPC codes were rediscovered by MacKay and Neal. With a code rate of 0.5 and a block length of 10,000,000, such a code approaches the Shannon limit up to 0.04 dB. This example, admittedly rather academic, at least shows the capability of the LDPC codes. Therefore, finding suitable design methods for efficient codes (especially of short block lengths) is currently a major research topic. Nevertheless, these codes have already found practical use: in the ETSI standard for DVB-S2, for example, the concatenation of an LDPC code with a BCH code is used for channel coding.
page 294
6 Advanced Coding Techniques
Data rate and Shannon limit for digital satellite television: variants ofDVB-S2
[Figure: spectral efficiency Ru [bit/s] per unit Rs (approx. 0.5 to 4.5) versus the signal-to-noise ratio (−3 to 16 dB, a span of approx. 18 dB) for the DVB-S2 variants with QPSK, 8PSK, 16APSK and 32APSK modulation and for DVB-S, compared with the modulation-constrained Shannon limit]
page 295
6 Advanced Coding Techniques
Definition of LDPC codes over an arbitrary alphabet A:
An LDPC code is a linear block code of block length n whose check matrix H has the following characteristics:
1. The check matrix H is sparsely populated, that means a large part of the elements hml is equal to 0.
2. Each column l of the matrix H has a small number Nl > 1 of non-zero symbols from A.
3. Each row m of the matrix H has a small number Nm > 1 of non-zero symbols from A.
LDPC codes over A = GF(2^b) are of particular practical interest.
Since LDPC codes are assigned to the class of linear block codes, they also have all their characteristics. Among those are the representation by means of a generator matrix, coding by means of a matrix-vector multiplication, the possibility to convert non-systematic into systematic LDPC codes, the determination of the minimum distance by means of the minimum Hamming weight etc. (see chapter 3).
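The inherited linear-block-code machinery — generator matrix, coding as matrix-vector multiplication, parity checks via H — can be illustrated with a toy systematic code. The small matrix below is a Hamming(7,4)-like example of my own choosing and is, of course, not actually low-density:

```python
import itertools

# Toy systematic code: H = [P | I] over GF(2), equivalent generator G = [I | P^T]
P = [[1, 1, 0, 1],
     [1, 0, 1, 1],
     [0, 1, 1, 1]]                              # parity part, (n-k) x k
nk, k = len(P), len(P[0])                       # n - k = 3, k = 4
H = [P[m] + [int(j == m) for j in range(nk)] for m in range(nk)]
G = [[int(j == i) for j in range(k)] + [P[m][i] for m in range(nk)]
     for i in range(k)]

def encode_block(u):
    """Coding as matrix-vector multiplication c = u*G over GF(2)."""
    return [sum(u[i] & G[i][j] for i in range(k)) % 2 for j in range(k + nk)]

def syndrome(c):
    """H*c^T over GF(2); the all-zero vector for every valid code word."""
    return [sum(H[m][j] & c[j] for j in range(k + nk)) % 2 for m in range(nk)]

all_checks_pass = all(syndrome(encode_block(list(u))) == [0] * nk
                      for u in itertools.product([0, 1], repeat=k))
```

For a real LDPC code, H would be large and sparse while the equivalent G is dense, which is exactly the practical problem discussed on page 297.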
page 296
6 Advanced Coding Techniques
The figure shows the sparsely populated (900×1200) check matrix of a binary LDPC code with 3,600 ones. The (300×1200) generator matrix of the equivalent systematic LDPC code, however, has 135,000 ones!
page 297
6 Advanced Coding Techniques
Although a code word can be calculated by means of the relation c = u⋅G, this approach is not preferred in practice. In contrast to the check matrix, the generator matrix of an LDPC code is in general not sparsely populated, so coding according to this method is complex, particularly since LDPC codes currently only show their strengths for large block lengths. Therefore, in most cases special LDPC codes (e.g. cyclic codes with linear coding complexity) are applied in implementations.
What is already not easy for coding is even less easy for decoding. A maximum-likelihood decoding cannot be realised, since the complexity grows exponentially with the block length n. A set of sub-optimal methods exists, with the iterative methods providing the best performance; their complexity, however, limits their implementation.
In summary, the LDPC codes are the largest competitor of the turbo codes. Due to their complexity, however, they also show problems in terms of time delay.
page 298
... there is nothing left to say but: End of the Lecture ...
Good luck with the examination!