Prof. Dr.-Ing. Thomas Kürner · Institut für Nachrichtentechnik · Technische Universität Braunschweig Coding Theory WS 2008/2009
Coding Theory
Prof. Dr.-Ing. Thomas Kürner
Parts of this script are based on the manuscript of the lecture of the same title held by Prof. Dr.-Ing. Andreas Czylwik at Braunschweig Technical University in the winter semester 2001/2002.
The version on hand was drawn up by Dipl.-Ing. Andreas Hecker in the years 2004 and 2005.
Organisation
• Lecture and practical exercise in the winter semester (2 + 1 SWS):
  – on Tuesdays, 08.50 am – 09.35 am, lecture hall SN 23.2
  – on Wednesdays, 11.30 am – 01.00 pm, lecture hall SN 23.2
• Assistant: Dipl.-Ing. M. Sc. Thomas Jansen
• Examination: written or oral
Objectives and Contents of the Lecture

Objectives:
This lecture is intended to develop an understanding of the information-theoretical limiting factors in data transmission and to build knowledge of methods of source and error control coding, both in theory and in practical applications.

Contents:
Introduction
1. Fundamentals of information theory
2. Fundamentals of error control coding
3. Single-error correcting block codes
4. Burst error correcting block codes
5. Convolutional codes
6. Special coding techniques
Literature

Literature recommended for the lecture:
– H. Rohling: Einführung in die Informations- und Codierungstheorie, Teubner
– R. Togneri, C.J.S. de Silva: Fundamentals of Information Theory and Coding Design, Chapman & Hall/CRC

Further literature:
– B. Friedrichs: Kanalcodierung, Springer
– H. Schneider-Obermann: Kanalcodierung, Vieweg
– M. Bossert: Kanalcodierung, Teubner
– J. Bierbrauer: Introduction to Coding Theory, Chapman & Hall/CRC
– O. Mildenberger: Informationstheorie und Codierung, Vieweg
– S. Lin, D. J. Costello Jr.: Error Control Coding, Pearson Prentice Hall

Literature dealing with digital transmission:
– J.G. Proakis, M. Salehi: Grundlagen der Kommunikationstechnik, Pearson Studium
Introduction

Code / Coding:
A rule for the unambiguous assignment of characters of one character set to those of another character set is called a code. The conversion of one character set into the other is referred to as coding (to code).

Depending on the problem, coding is classified as follows:
– Cryptology
– Source coding
– Error control coding
Source coding and error control coding are the subject of this lecture.

Example: the assignment of town names to number plates:
Berlin → B
Braunschweig → BS
Introduction

Coding for encryption (cryptology):
Messages are to be protected against unauthorised monitoring. Decryption is only possible if the code is known.

Source coding:
Messages are compressed to a minimum number of characters without loss of information (reduction of redundancy). With further compression, a loss of information is accepted (e.g. in video and audio transmission).

Error control coding:
Messages are to be protected against transmission errors by the controlled addition of redundancy. Applications can be found, among other things, in data transfer over waveguides and radio channels (especially mobile radio channels) and in the storage of data.
Introduction

Block diagram of a digital transmission system:

[Figure: Source → Source Coder → Error Control Coder → Modulator → Transmission Channel (subject to interference) → Demodulator → Error Control Decoder → Source Decoder → Sink. The source side performs an A/D conversion, the sink side a D/A conversion; source and source coder form the digital source, source decoder and sink form the digital sink, and modulator, transmission channel and demodulator together form the discrete channel.]
Introduction

Simplified model:

[Figure: Data Source → Channel Coder → Channel → Channel Decoder (Error Correction) → Data Sink]

Error control coding for error correction (FEC – forward error correction):
The channel quality is characterised by the residual-error probability after decoding. The data rate, however, is independent of the channel quality.
Introduction

Overview of some FEC codes:

– Block codes
  – linear
    – cyclic: cyclic Hamming, RS, BCH, Abramson
    – non-cyclic: Hamming, Reed-Muller
  – non-linear
– Convolutional codes
  – recursive
  – non-recursive
– Concatenated codes
  – serial
  – parallel
  – Product codes

Some of these codes (marked grey in the original diagram) will not be dealt with in this lecture.
Introduction

Error control coding for error detection (CRC – cyclic redundancy check):
Applied in combination with ARQ methods (automatic repeat request).

For this method a return channel is required, which is why it is not applicable for broadcast or time-critical applications. Redundancy is added adaptively, that means only in case of errors. Thus, the residual-error probability is independent of the channel quality, but the data rate depends on it.

ARQ is applied, for example, in GPRS.

[Figure: Channel Coder → Channel → Channel Decoder (Error Detection); ARQ Control units at transmitter and receiver exchange error information over a return channel.]
Introduction

Using error control coding, the error probability can be made arbitrarily small, provided the data rate is smaller than the channel capacity.

Claude E. Shannon, Warren Weaver: The Mathematical Theory of Communication, Urbana, University of Illinois Press, 1949

Unfortunately, Shannon, the founder of information theory, does not give a practical construction rule for his theorem.
Introduction

Channel interference effects and examples of correction results:

[Figure: image before transmission, and images after transmission:
– via a BSC channel with perr = 0.005, without error control coding
– via a BSC channel with perr = 0.5, without error control coding (not correctable at all)
– via a BSC channel with perr = 0.005, with (7,4,3)-Hamming error control coding (not all errors were corrected)
– via a BSC channel with perr = 0.005, with (31,16,15)-BCH error control coding (almost error-free)]
Introduction

Fundamental idea of error control coding:
For the protection of the message to be transmitted, redundancy is added for error detection (FE) and possibly for error correction (FK). The assignment takes place in the coder; block coding (BC) is given as an example:

With block coding, a k-digit input vector u = (u1, ..., uk) is transformed into an n-digit output vector x = (x1, ..., xn). The code rate RC is then defined as:

RC = k/n    (E1)

The minimum Hamming distance dmin, that means the minimum number of differing digits over all possible pairs of code words, provides a measure for the performance of a code.
Introduction

Code cubes with n = 3 (corners are valid or invalid code words):

– k = 3: RC = 1, dmin = 1; no error detection (FE) and no error correction (FK) possible
– k = 2: RC = 2/3, dmin = 2; one error detectable, no error correctable
– k = 1: RC = 1/3, dmin = 3; two errors detectable, one error correctable
1 Fundamentals of Information Theory

In information theory, message transmission is described mathematically.

Message from the point of view of information theory:
[Diagram classifying a message into incorrect information, irrelevant and relevant parts, and redundant and non-redundant parts.]

Pivotal questions deal with:
– the quantitative calculation of the information content of messages,
– the determination of the transmission capacity of transmission channels, and
– the analysis and optimisation of source and error control coding.
1 Fundamentals of Information Theory

1.2 Mathematical Description of Messages
Messages are classified according to their quality: the less predictable a message is, the higher its relevance.

Example (increasing relevance, decreasing probability):
• Tomorrow the sun rises.
• Tomorrow the weather will be bad.
• For tomorrow we expect a severe storm and, as a consequence, a power cut.

The message from a digital source corresponds to a sequence of characters.
1.2 Mathematical Description of Messages

Maximum entropy H0 of a source:
It is assumed that a source with a character set of N characters is given. H0 is then defined as:

H0 = ld(N) bit/character    (1.13)    (bit = binary digit)

H0 describes the number of binary decisions required for the selection of the message. The occurrence probability of the source symbols is not considered here.

Examples:
– Text in German with 71 alphanumerical characters as source (26·2 letters, 3·2 umlauts, ß, 12 special characters incl. space character „" ()-.,;:!?):
  H0 = ld(71) = ln(71)/ln(2) = 6.15 bit/character
– One page of text in German with 40 lines and 70 characters per line as source, that means all in all N = 71^(40·70) different pages are possible:
  H0 = ld(71^(40·70)) = 40 · 70 · ld(71) = 17.22 kbit/page
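The two example values can be checked with a few lines of Python (not part of the original script; `ld` corresponds to `log2`):

```python
import math

# Maximum entropy H0 = ld(N) for a source with N = 71 characters.
N = 71
H0 = math.log2(N)                 # bit/character
print(round(H0, 2))               # ~6.15 bit/character

# One page of 40 lines x 70 characters: 71**2800 different pages are possible.
chars_per_page = 40 * 70
H0_page = chars_per_page * math.log2(N)   # bit/page
print(round(H0_page / 1000, 2))           # ~17.22 kbit/page
```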
1.2 Mathematical Description of Messages

Information content I:
It is assumed that a character set (alphabet) X = {x1, ..., xN} with the occurrence probabilities of the single characters p(x1), ..., p(xN) is given. The information content I is to map the content of a message as a function of the probability p [I = f(p)]. For this, the following characteristics are to be fulfilled:
– I(xi) ≥ 0 for 0 ≤ p(xi) ≤ 1
– I(xi) → 0 for p(xi) → 1
– I(xi) > I(xj) for p(xi) < p(xj)
– I(xi,xj) = I(xi) + I(xj) for p(xi,xj) = p(xi) · p(xj), that is, for two consecutive, statistically independent characters xi and xj

These conditions are fulfilled by the general solution I(xi) = −k · logb(p(xi)).

Definition of the information content (with k = 1 and b = 2):

I(xi) = ld(1/p(xi)) bit/character    (1.14)
1.2 Mathematical Description of Messages

Entropy H(X):
H(X) describes the mean information content of a source:

H(X) = Σi p(xi) · I(xi) = Σi p(xi) · ld(1/p(xi)) bit/character    (1.15)

(the sums run over i = 1, ..., N)

Redundancy of a source RQ:
Information content of a source that is under-utilised due to the particular occurrence probabilities of the characters:

RQ = H0 − H(X)    (1.16)
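Equations (1.15) and (1.16) can be illustrated with a short Python sketch; the four-character source below is a hypothetical example, not taken from the lecture:

```python
import math

def entropy(probs):
    """Mean information content H(X) = sum of p(x) * ld(1/p(x)), eq. (1.15)."""
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

# Hypothetical four-character source.
p = [0.5, 0.25, 0.125, 0.125]
H = entropy(p)
H0 = math.log2(len(p))    # maximum entropy, eq. (1.13)
RQ = H0 - H               # redundancy of the source, eq. (1.16)
print(H, RQ)              # 1.75 bit/character, 0.25 bit/character
```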
1.2 Mathematical Description of Messages

Entropy of a binary source:
A binary source has a character set X of exactly two characters: X = {x1, x2} with the occurrence probabilities p(x1) = p and p(x2) = 1−p. Then the entropy H(X) is:

H(X) = −p · ld(p) − (1 − p) · ld(1 − p)    (1.17)

This function in dependence of p is also called the Shannon function:

S(p) = H(X)    (1.18)

[Figure: S(p)/bit plotted over p ∈ [0, 1]; S(0) = S(1) = 0, maximum S(0.5) = 1 bit.]

Repetition of mathematics:

ld(1/x) = −ld(x),    lim(x→0) x · log(x) = 0
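The Shannon function can be evaluated with a small helper (a sketch; the function name is ours, and the boundary cases use the limit x·ld(x) → 0 noted above):

```python
import math

def shannon_function(p):
    """Binary entropy S(p) = -p ld(p) - (1-p) ld(1-p), eqs. (1.17)/(1.18).
    The limit x*ld(x) -> 0 for x -> 0 gives S(0) = S(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

print(shannon_function(0.5))            # 1.0 (maximum)
print(round(shannon_function(0.2), 4))  # 0.7219
print(shannon_function(0.0))            # 0.0
```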
1.2 Mathematical Description of Messages

The entropy becomes maximum for equiprobable characters, that means: H(X) ≤ H0 = ld(N).

Evidence:

H(X) − ld(N) = Σi p(xi) · ld(1/p(xi)) − Σi p(xi) · ld(N) = Σi p(xi) · ld(1/(N · p(xi)))    (1.19)

Utilisation of the inequation:

ln(x) ≤ x − 1  ⇒  ld(x) ≤ (1/ln 2) · (x − 1)    (1.20)

[Figure: ln(x) and x − 1 plotted over x; the straight line x − 1 lies above ln(x), touching it at x = 1.]

Using (1.20) in (1.19):

H(X) − ld(N) ≤ (1/ln 2) · Σi p(xi) · (1/(N · p(xi)) − 1)
            = (1/ln 2) · (Σi 1/N − Σi p(xi)) = (1/ln 2) · (1 − 1) = 0

q.e.d.
1 Fundamentals of Information Theory

1.3 Methods and Algorithms of Source Coding
In communications technology, source coding is deployed in order to reduce the data volume to be transmitted. Examples can be found in many frequently used applications, such as data compression by „zipping", MP3 or MPEG compression. It can also be found in GSM speech coding, where a digitisation has to be carried out first.

In the following, schemes for the binary coding of discrete sources are considered with regard to transmission over a binary channel.
1.3.1 Fields of Application of Source Coding

The character set X = {x1, ..., xN} of an arbitrary source is given. With source coding, a binary code word of length L(xi) is assigned to each character xi of the character set X, e.g.:

x1 → 1001
x2 → 011
...
xN → 010111...  (length L(xN))

The objective of the coding is the minimisation of the mean code word length L̄:

L̄ = Σi p(xi) · L(xi)    (1.21)

Examples for binary coding of characters:
– ASCII code: fixed code word length L(xi) = 8 (block code)
– Morse code: letters are represented by dots and dashes; code words are separated by gaps; letters that occur often are assigned short code words.
1.3.1 Fields of Application of Source Coding

Prefix characteristic of a code:
This characteristic means that a code word cannot at the same time form the beginning of another code word. A given bit sequence is then unambiguous without use of further separation rules („comma-free code").

Examples of codes without and with prefix characteristic:

Without prefix characteristic: x1 → 0, x2 → 01, x3 → 10, x4 → 100.
Unambiguous decoding of a bit sequence is not possible. For example, for the sequence 010010 the following decoding results are possible: x1x3x2x1, x2x1x1x3, x1x3x1x3, x2x1x2x1, x1x4x3.

With prefix characteristic: x1 → 0, x2 → 10, x3 → 110, x4 → 111.
Unambiguous decoding of a bit sequence is guaranteed. Decoding of the sequence 010010110111100, for example, definitely yields x1x2x1x2x3x4x2x1. For successful decoding, however, synchronisation to the beginning of the sequence is required.
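The guaranteed decoding of a prefix code can be sketched in Python (the helper name `decode_prefix` is ours):

```python
def decode_prefix(bits, code):
    """Decode a bit string with a prefix code: since no code word is the
    prefix of another, the first code word matched is always correct."""
    inverse = {word: sym for sym, word in code.items()}
    result, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            result.append(inverse[current])
            current = ""
    if current:
        raise ValueError("bit sequence does not end on a code word boundary")
    return result

# Prefix code from the lecture example.
code = {"x1": "0", "x2": "10", "x3": "110", "x4": "111"}
print(decode_prefix("010010110111100", code))
# ['x1', 'x2', 'x1', 'x2', 'x3', 'x4', 'x2', 'x1']
```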
1.3.1 Fields of Application of Source Coding

Codes with a prefix characteristic can be decoded using a decision tree.

[Figure: decision tree for the code x1 → 0, x2 → 10, x3 → 110, x4 → 111; the three levels of the tree correspond to the first, second and third code bit.]

Redundancy of a code RC:

RC = L̄ − H(X)    (1.22)

(Do not mistake this for the redundancy RQ of a source!)
1.3.1 Fields of Application of Source Coding

Kraft's inequation:
A binary code with prefix characteristic and the code word lengths L(x1), L(x2), ..., L(xN) only exists if:

Σi 2^(−L(xi)) ≤ 1    (1.23)

Possible proof:
The depth of the tree structure is equal to the maximum code word length Lmax = max(L(x1), L(x2), ..., L(xN)). A code word xi on layer L(xi) eliminates 2^(Lmax − L(xi)) of the possible code words on layer Lmax. The sum of all eliminated code words is less than or equal to the maximum number of code words on layer Lmax:

Σi 2^(Lmax − L(xi)) ≤ 2^Lmax

Dividing by 2^Lmax yields (1.23). The equals sign in Kraft's inequation is valid if all endpoints of the code tree are taken by code words.
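Kraft's inequation is easy to check numerically (a sketch; the first set of lengths belongs to the prefix code x1 → 0, x2 → 10, x3 → 110, x4 → 111 from above):

```python
def kraft_sum(lengths):
    """Sum of 2**(-L) over all code word lengths, eq. (1.23)."""
    return sum(2.0 ** -L for L in lengths)

# All tree endpoints used: the sum equals 1, a prefix code exists.
print(kraft_sum([1, 2, 3, 3]))   # 1.0
# Lengths 1, 1, 2 violate the inequation: no prefix code can exist.
print(kraft_sum([1, 1, 2]))      # 1.25 > 1
```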
1.3.1 Fields of Application of Source Coding

Lower limit of the mean code word length L̄:
Every prefix code has a mean code word length L̄ which is at least as large as the entropy of the source H(X):

L̄ ≥ H(X)    (1.24)

Possible proof:
The evidence is similar to that of inequation (1.19); additionally, Kraft's inequation (1.23) is utilised.

The equals sign is fulfilled if the following applies:

p(xi) = (1/2)^Ki    (1.25)

and the code word lengths are selected as L(xi) = Ki = I(xi)    (1.26)

Example:

xi   p(xi)  Code word
x1   1/2    1
x2   1/4    00
x3   1/8    010
x4   1/16   0110
x5   1/16   0111

For this example: L̄ = H(X) = 15/8.

With general p(xi), I(xi) has to be transferred to an integer value by rounding up, so that the following applies:

I(xi) ≤ L(xi) < I(xi) + 1    (1.27)

These L(xi) then continue to fulfil inequation (1.23).
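The example with dyadic probabilities can be verified numerically (a sketch):

```python
import math

# Example from the lecture: dyadic probabilities, code word lengths L(xi) = I(xi).
p = [1/2, 1/4, 1/8, 1/16, 1/16]
lengths = [1, 2, 3, 4, 4]        # code words 1, 00, 010, 0110, 0111

H = sum(pi * math.log2(1 / pi) for pi in p)
L_mean = sum(pi * Li for pi, Li in zip(p, lengths))
print(H, L_mean)   # both 1.875 = 15/8 bit/character
```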
1.3.1 Fields of Application of Source Coding

Shannon's coding theorem:
For every source, a binary coding can be found with:

H(X) ≤ L̄ < H(X) + 1    (1.28)

Possible proof:
Multiply inequation (1.27) by p(xi) and sum over all i.

Using the left side of inequation (1.27), it can be shown that with this condition a code with prefix characteristic exists:

I(xi) = ld(1/p(xi)) ≤ L(xi)  ⇒  2^(−L(xi)) ≤ p(xi)

Summation over all i:

Σi 2^(−L(xi)) ≤ Σi p(xi) = 1

From this, inequation (1.23) results, which guarantees the existence of prefix codes. q.e.d.
1.3.2 Shannon Coding

Algorithm:
1. Arrange the occurrence probabilities in decreasing order.
2. Determine the code word lengths L(xi) according to inequation (1.27).
3. Calculate the accumulated occurrence probabilities Pi = Σ(j=1..i−1) p(xj).
4. The code words are the L(xi) positions after the decimal point of the binary representation of Pi.

Example:

i   p(xi)  I(xi)  L(xi)  Pi    Code
1   0.22   2.18   3      0.00  000
2   0.19   2.40   3      0.22  001
3   0.15   2.74   3      0.41  011
4   0.12   3.06   4      0.56  1000
5   0.08   3.64   4      0.68  1010
6   0.07   3.84   4      0.76  1100
7   0.07   3.84   4      0.83  1101
8   0.06   4.06   5      0.90  11100
9   0.04   4.64   5      0.96  11110

For example, for P8 = 0.90:
0.90 = 1·2^−1 + 1·2^−2 + 1·2^−3 + 0·2^−4 + 0·2^−5 + 1·2^−6 + 1·2^−7 + 0·2^−8 + 0·2^−9 + 1·2^−10 + ...
so the first L(x8) = 5 binary digits give the code word 11100.
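The four algorithm steps translate directly into a small Python sketch (the function name `shannon_code` is ours; with the probabilities of the example it reproduces the code table above):

```python
import math

def shannon_code(probs):
    """Shannon coding: sort probabilities in decreasing order, set
    L(xi) = ceil(ld(1/p(xi))), and take the first L(xi) binary fraction
    digits of the accumulated probability Pi as the code word."""
    probs = sorted(probs, reverse=True)
    codes, P = [], 0.0
    for p in probs:
        L = math.ceil(math.log2(1 / p))
        bits, frac = "", P
        for _ in range(L):
            frac *= 2
            bit, frac = divmod(frac, 1)
            bits += str(int(bit))
        codes.append(bits)
        P += p
    return codes

p = [0.22, 0.19, 0.15, 0.12, 0.08, 0.07, 0.07, 0.06, 0.04]
print(shannon_code(p))
# ['000', '001', '011', '1000', '1010', '1100', '1101', '11100', '11110']
```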
1.3.2 Shannon Coding

Conversion of a decimal number into a binary number:
For example, for the decimal number 0.9 the binary representation is to be developed. This corresponds to the search for the coefficients ai of the following equation:

0.90 = a1·2^−1 + a2·2^−2 + a3·2^−3 + a4·2^−4 + ...   | · 2
1.80 = 1 + 0.80 = a1·2^0 + a2·2^−1 + a3·2^−2 + a4·2^−3 + ...

Since 2^0 = 1, it follows that a1 = 1, provided that a2·2^−1 + a3·2^−2 + a4·2^−3 + ... < 1. This, however, corresponds directly to Kraft's inequation (1.23), which has been proven. Thus, by multiplying repeatedly by 2, the coefficients can be determined. Simplified, the method can be described as follows:

0.90 · 2 = 0.8 + 1  → a1 = 1
0.80 · 2 = 0.6 + 1  → a2 = 1
0.60 · 2 = 0.2 + 1  → a3 = 1
0.20 · 2 = 0.4 + 0  → a4 = 0
...
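The repeated-multiplication method translates directly into code (a sketch; the function name is ours):

```python
def binary_fraction(x, n_bits):
    """Repeated doubling: the integer part of 2*x is the next binary digit."""
    bits = ""
    for _ in range(n_bits):
        x *= 2
        if x >= 1:
            bits += "1"
            x -= 1
        else:
            bits += "0"
    return bits

print(binary_fraction(0.9, 10))   # '1110011001'
```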
1.3.2 Shannon Coding

Tree diagram of the Shannon code considered in the example:

[Figure: code tree of the Shannon code from the example above.]

The tree diagram shows the disadvantage of Shannon coding: not all endpoints of the tree are taken, which means that the code word lengths can be reduced. Therefore, this code is not optimal.
1.3.3 Huffman Coding

Huffman coding is based on a recursive approach. The fundamental idea is that the two symbols with the smallest probabilities must have the same code word length; otherwise a reduction of the code word length would be possible. Thus, the symbols with the smallest probabilities form the starting point.

Algorithm:
1. Arrange the symbols according to their probability.
2. Assign 0 and 1 to the two symbols with the smallest probability.
3. Combine the two symbols with the smallest probabilities xN−1 and xN into a new symbol with the probability p(xN−1) + p(xN).
4. Repeat steps 1–3 until only one symbol remains.
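A minimal sketch of the algorithm using Python's heapq (the function name is ours; with the probabilities of the example from 1.3.2 it reproduces the code word lengths 2, 2, 3, 3, 4, 4, 4, 5, 5 of the following example):

```python
import heapq

def huffman_code(probs):
    """Huffman coding: repeatedly merge the two least probable symbols,
    prepending 0/1 to the code words of the two merged subtrees."""
    # Heap entries: (probability, unique tie-breaker, symbol indices in subtree).
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    codes = [""] * len(probs)
    counter = len(probs)
    while len(heap) > 1:
        p0, _, s0 = heapq.heappop(heap)
        p1, _, s1 = heapq.heappop(heap)
        for i in s0:
            codes[i] = "0" + codes[i]
        for i in s1:
            codes[i] = "1" + codes[i]
        heapq.heappush(heap, (p0 + p1, counter, s0 + s1))
        counter += 1
    return codes

p = [0.22, 0.19, 0.15, 0.12, 0.08, 0.07, 0.07, 0.06, 0.04]
codes = huffman_code(p)
print(sorted(len(c) for c in codes))                          # [2, 2, 3, 3, 4, 4, 4, 5, 5]
print(round(sum(pi * len(c) for pi, c in zip(p, codes)), 2))  # 3.01 bit/character
```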
1.3.3 Huffman Coding

Coding of the example from 1.3.2:

[Figure: step-by-step Huffman merging of the probabilities 0.22, 0.19, 0.15, 0.12, 0.08, 0.07, 0.07, 0.06, 0.04, with 0/1 assigned at each merge; the intermediate node probabilities are 0.1, 0.14, 0.18, 0.26, 0.33, 0.41, 0.59 and 1.0.]

The resulting code:

i   p(xi)  Code
1   0.22   10
2   0.19   11
3   0.15   001
4   0.12   011
5   0.08   0001
6   0.07   0100
7   0.07   0101
8   0.06   00000
9   0.04   00001

The mean code word length is L̄ = 3.01 bit/character.
1.3.3 Huffman Coding

Tree structure of the example Huffman code:

[Figure: code tree of the Huffman code from the example above.]
1 Fundamentals of Information Theory

1.4 Discrete Source Models
1.4.1 Discrete Memoryless Source
The coding possibilities for those sources are considered whose characters are read out statistically independently of each other.

First, the joint entropy of character strings is to be calculated:

H(X,Y) = −Σi Σk p(xi,yk) · ld(p(xi,yk))    (1.29)

For statistically independent characters, p(xi,yk) = p(xi) · p(yk), so that:

H(X,Y) = −Σi Σk p(xi) · p(yk) · [ld(p(xi)) + ld(p(yk))]
       = −Σi p(xi) · ld(p(xi)) − Σk p(yk) · ld(p(yk))    (using Σi p(xi) = Σk p(yk) = 1)
       = H(X) + H(Y)

For M independent characters of the same source, the following applies:

H(X1, X2, ..., XM) = M · H(X)    (1.30)
1.4.1 Discrete Memoryless Source

As the general form of Shannon's coding theorem shows, the coding of character strings represents a more efficient method (L̄M: mean code word length of a block of M characters):

H(X1, ..., XM) ≤ L̄M < H(X1, ..., XM) + 1
M · H(X) ≤ L̄M < M · H(X) + 1
H(X) ≤ L̄M/M < H(X) + 1/M    (1.31)

The disadvantage of this method is the strongly increasing coding complexity, since the number of possible character combinations increases exponentially with M.

As an example, a binary source with X = {x1, x2} and the occurrence probabilities p(x1) = 0.2 and p(x2) = 0.8 is considered. The entropy of the source is H(X) = 0.7219 bit/character. As coding algorithm, Huffman coding is used.
1.4.1 Discrete Memoryless Source

Coding of single characters:

Character  p(xi)  Code  L(xi)  p(xi)·L(xi)
x1         0.2    0     1      0.2
x2         0.8    1     1      0.8
                               Σ = 1

Mean code word length: L̄ = 1 bit/character

Coding of character pairs:

Character pair  p(xi,xj)  Code  L(xi,xj)  p(xi,xj)·L(xi,xj)
x1x1            0.04      101   3         0.12
x1x2            0.16      11    2         0.32
x2x1            0.16      100   3         0.48
x2x2            0.64      0     1         0.64
                                          Σ = 1.56

Mean code word length: L̄ = 1.56/2 = 0.78 bit/character
1.4.1 Discrete Memoryless Source

Coding of character triples:

Character triple  p(xi,xj,xk)  Code   L(xi,xj,xk)  p·L
x1x1x1            0.008        11111  5            0.040
x1x1x2            0.032        11100  5            0.160
x1x2x1            0.032        11101  5            0.160
x1x2x2            0.128        100    3            0.384
x2x1x1            0.032        11110  5            0.160
x2x1x2            0.128        101    3            0.384
x2x2x1            0.128        110    3            0.384
x2x2x2            0.512        0      1            0.512
                                                   Σ = 2.184

Mean code word length: L̄ = 2.184/3 = 0.728 bit/character

Summary:

M  H(X)    L̄      H(X) + 1/M
1  0.7219  1.0    1.7219
2  0.7219  0.78   1.2219
3  0.7219  0.728  1.0552
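The summary can be checked against the bounds of Shannon's coding theorem (1.31) (a sketch; the mean lengths per character are taken from the tables above):

```python
import math

# Binary source from the example: p(x1) = 0.2, p(x2) = 0.8.
H = 0.2 * math.log2(1 / 0.2) + 0.8 * math.log2(1 / 0.8)

# Mean code word lengths per source character from the Huffman tables above.
L_per_char = {1: 1.0, 2: 0.78, 3: 0.728}

# Shannon's coding theorem for blocks of M characters: H(X) <= L < H(X) + 1/M.
for M, L in L_per_char.items():
    assert H <= L < H + 1 / M
print(round(H, 4))   # 0.7219 bit/character
```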
1.4 Discrete Source Models

1.4.2 Discrete Source With Memory
Coding possibilities for those sources whose characters are read out interdependently are considered here. This applies to real sources if correlations occur between the single characters. In this case, the joint entropy is calculated to be:

H(X,Y) = H(X) + H(Y|X)    (1.32a)

with the conditional entropy

H(Y|X) = −Σi Σk p(xi,yk) · ld(p(yk|xi))    (1.32b)

Derivation with p(xi,yk) = p(xi) · p(yk|xi):

H(X,Y) = −Σi Σk p(xi,yk) · ld(p(xi,yk))
       = −Σi Σk p(xi,yk) · [ld(p(xi)) + ld(p(yk|xi))]
       = −Σi p(xi) · ld(p(xi)) − Σi Σk p(xi,yk) · ld(p(yk|xi))
       = H(X) + H(Y|X)
1.4.2 Discrete Source with Memory

Between the source entropy and the conditional entropy, the following relation exists:

H(Y) ≥ H(Y|X)    (1.33)

Proof:
With p(xi,yk) = p(xi) · p(yk|xi) and

H(Y) = −Σk p(yk) · ld(p(yk)) = −Σi Σk p(xi,yk) · ld(p(yk))

it follows that:

H(Y|X) − H(Y) = Σi Σk p(xi,yk) · ld(p(yk)/p(yk|xi))

Estimation with ln(x) ≤ x − 1 ⇒ ld(x) ≤ (1/ln 2) · (x − 1):

H(Y|X) − H(Y) ≤ (1/ln 2) · Σi Σk p(xi,yk) · (p(yk)/p(yk|xi) − 1)
             = (1/ln 2) · [Σi Σk p(xi) · p(yk) − Σi Σk p(xi,yk)] = (1/ln 2) · (1 − 1) = 0

q.e.d.
1.4.2 Discrete Source with Memory

According to this result, for the same single-character probabilities the entropy of a source with memory is smaller than the entropy of a memoryless source. In other words: for sources with memory, a considerably more efficient source coding is possible by coding character strings and utilising the correlations among the single characters than if the source were treated as memoryless.

Sources with memory can be modelled as Markoff processes. The respective source models are referred to as Markoff sources.

Example of correlations in German texts: the probability that the letter „Q" is followed by „u" is near 1.
1.4.2 Discrete Source with Memory

Markoff source:
A discrete source with memory can generally be described as a Markoff source. Markoff sources are based on Markoff processes. As a basis, a sequence of successively occurring random variables z0, z1, z2, ..., zn is considered. If these are statistically independent, the following applies in general:

f(zn | zn−1, zn−2, ..., z0) = f(zn)    (1.34a)

If the values zi are statistically dependent with a memory of m characters, the equation is extended to:

f(zn | zn−1, zn−2, ..., z0) = f(zn | zn−1, ..., zn−m)    (1.34b)

Frequently, first order Markoff processes are considered (m = 1):

f(zn | zn−1, zn−2, ..., z0) = f(zn | zn−1)    (1.34c)
1.4.2 Discrete Source with Memory

zi can take a finite number of discrete values: zi ∈ {x1, ..., xN}. The Markoff process emits these values. A set of interdependent values represents a Markoff chain. This chain can be described completely by the transition probabilities:

p(zn = xj | zn−1 = xi1, ..., zn−m = xim)    (1.35a)

If the transition probabilities do not depend on the absolute time, this is referred to as a homogeneous Markoff chain:

p(zn = xj | zn−1 = xi1, ..., zn−m = xim) = p(zk = xj | zk−1 = xi1, ..., zk−m = xim)    (1.35b)

If the steady state does not depend on the initial probabilities, this is referred to as a stationary Markoff chain:

lim(n→∞) p(zn+k = xj | zk = xi) = lim(n→∞) p(zn = xj) = p(xj) = wj    (1.35c)

For first order homogeneous and stationary Markoff chains (m = 1), the following applies:

p(zn = xj | zn−1 = xi) = pij    (1.35d)
1.4.2 Discrete Source with Memory

In case of the homogeneous stationary Markoff chain, the single transition probabilities can be combined into a so-called transition matrix:

      ( p11  p12  ...  p1N )
P  =  ( p21  p22  ...  p2N )    (1.36a)
      ( ...  ...  ...  ... )
      ( pN1  pN2  ...  pNN )

The single rows of the transition matrix must add up to 1:

Σj pij = 1    (1.36b)

The stationary probabilities of the single discrete values can be combined into a probability vector:

w = (w1 w2 ... wN) = (p(x1) p(x2) ... p(xN))    (1.37)

The probability vector can be determined by means of the steady-state condition:

w = w P    (1.38)
1.4.2 Discrete Source with Memory

A stationary Markoff source emitting first order Markoff chains is now considered. For this configuration, described by w and P, the entropy is to be calculated. It is equal to the entropy in the steady state:

H∞(Z) = lim(n→∞) H(zn | zn−1, zn−2, ..., z0) = H(zn | zn−1)

H(zn | zn−1) = Σi wi · H(zn | zn−1 = xi)
             = −Σi wi · Σj p(zn = xj | zn−1 = xi) · ld(p(zn = xj | zn−1 = xi))
             = −Σi Σj p(zn−1 = xi, zn = xj) · ld(p(zn = xj | zn−1 = xi))    (1.39)
1.4.2 Discrete Source with Memory

Since the entropy H∞(Z) conditions on characters from the past which are already known, it is a conditional entropy. Analogous to inequation (1.33), the following applies:

H∞(Z) = H(zn | zn−1) ≤ H(zn) ≤ H0(zn)    (1.40)

The source coding of a Markoff source can now be carried out taking the memory into account. For this, for example, Huffman coding can be applied, taking the current state of the source into account.

The problem caused by this coding is disastrous error propagation. This, however, is a basic problem of all source codes of variable length: an error occurring on the channel that has not been corrected can lead to the loss of the complete information sequence.
1.4.2 Discrete Source with Memory

In the following, the example of a Markoff source with three states is considered. Its states correspond to the transmitted characters: zn ∈ {x1, x2, x3}.

[Figure: state diagram with states x1, x2, x3 and the transition probabilities below.]

The transition probabilities p(zn = xj | zn−1 = xi) = pij for this example are:

            zn = x1  zn = x2  zn = x3
zn−1 = x1   0.2      0.4      0.4
zn−1 = x2   0.3      0.5      0.2
zn−1 = x3   0.6      0.1      0.3

that is:

      ( 0.2  0.4  0.4 )
P  =  ( 0.3  0.5  0.2 )
      ( 0.6  0.1  0.3 )

In the following, w and H∞(Z) are to be calculated.
1.4.2 Discrete Source with Memory
Calculation of the stationary probabilities w_i:

As an approach, equation (1.38) as well as the following equation are applied:

∑_{i=1}^{N} w_i = 1    (1.41)

From equation (1.38), three single equations can be obtained here to calculate the three unknowns. Each of the three single equations, however, is linearly dependent on the two other ones, so that the second approach has to be taken into account to obtain a third independent equation (strike out one of the three equations!):

w1 = 0.2 w1 + 0.3 w2 + 0.6 w3
w2 = 0.4 w1 + 0.5 w2 + 0.1 w3
w3 = 0.4 w1 + 0.2 w2 + 0.3 w3
 1 = w1 + w2 + w3

Solution to the system of equations:

w1 = 33/93 ≈ 0.3548,  w2 = 32/93 ≈ 0.3441,  w3 = 28/93 ≈ 0.3011
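The solution above can also be checked numerically. The following minimal Python sketch (Python is used here only for illustration and is not part of the lecture) determines w by power iteration, i.e. by repeatedly applying w ← wP:

```python
# Transition matrix P of the example Markoff source
# (rows: previous state z_{n-1}, columns: next state z_n).
P = [[0.2, 0.4, 0.4],
     [0.3, 0.5, 0.2],
     [0.6, 0.1, 0.3]]

def stationary(P, iterations=1000):
    """Power iteration: repeatedly apply w <- w.P, starting from uniform."""
    n = len(P)
    w = [1.0 / n] * n
    for _ in range(iterations):
        w = [sum(w[i] * P[i][j] for i in range(n)) for j in range(n)]
    return w

w = stationary(P)
print([round(wi, 4) for wi in w])   # [0.3548, 0.3441, 0.3011]
```

The result agrees with the closed-form solution w = (33/93, 32/93, 28/93) obtained from the system of equations.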
page 50
1.4.2 Discrete Source with Memory
Calculation of the entropy H_∞(Z):

The approach for this calculation is represented by equation (1.39). Use of the notation for the stationary probabilities w_i and transition probabilities p_ij leads to the following notation:

H_∞(Z) = ∑_{i=1}^{N} w_i · H(z_n | z_{n−1} = x_i) = − ∑_{i=1}^{N} ∑_{j=1}^{N} w_i · p_ij · ld(p_ij)    (1.42)

At first, the single calculations of the state entropies give an overview:

H(z_n | z_{n−1} = x1) = 0.2 ld(1/0.2) + 0.4 ld(1/0.4) + 0.4 ld(1/0.4) ≈ 1.5219 bit/character
H(z_n | z_{n−1} = x2) = 0.3 ld(1/0.3) + 0.5 ld(1/0.5) + 0.2 ld(1/0.2) ≈ 1.4855 bit/character
H(z_n | z_{n−1} = x3) = 0.6 ld(1/0.6) + 0.1 ld(1/0.1) + 0.3 ld(1/0.3) ≈ 1.2955 bit/character

The entropy is then:

H_∞(Z) = w1 H(z_n|z_{n−1}=x1) + w2 H(z_n|z_{n−1}=x2) + w3 H(z_n|z_{n−1}=x3) ≈ 1.441 bit/character
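The single steps of equation (1.42) can be reproduced with a short Python sketch (illustrative only, not part of the lecture); the stationary probabilities w are taken from the previous slide:

```python
from math import log2   # log2 corresponds to ld in the lecture notation

P = [[0.2, 0.4, 0.4],
     [0.3, 0.5, 0.2],
     [0.6, 0.1, 0.3]]
w = [33/93, 32/93, 28/93]

# state entropies H(z_n | z_{n-1} = x_i)
H_state = [sum(p * log2(1 / p) for p in row) for row in P]
print([round(h, 4) for h in H_state])   # [1.5219, 1.4855, 1.2955]

# entropy of the source with memory, equation (1.42)
H_inf = sum(wi * hi for wi, hi in zip(w, H_state))
print(round(H_inf, 3))                  # 1.441
```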
page 51
1.4.2 Discrete Source with Memory
A comparison of the different values shows:
H_∞(Z) ≈ 1.441 bit/character
H(Z) = lim_{n→∞} H(z_n) = ∑_{i=1}^{N} w_i · ld(1/w_i) ≈ 1.5814 bit/character
H_0 = ld 3 ≈ 1.5850 bit/character
See also inequation (1.40).
page 52
1.4.2 Discrete Source with Memory
For the considered example, the state-dependent Huffman codes are listed below. For every state z_{n−1}, an own coding exists, which here also differs from the respective other ones:

Transition probabilities p_ij:

  z_{n−1} \ z_n |  x1   x2   x3
  --------------+---------------
       x1       |  0.2  0.4  0.4
       x2       |  0.3  0.5  0.2
       x3       |  0.6  0.1  0.3

Code words and per-state mean lengths ⟨L⟩|z_{n−1}:

  z_{n−1} \ z_n |  x1   x2   x3  | ⟨L⟩|z_{n−1}
  --------------+----------------+------------
       x1       |  11   10   0   |    1.6
       x2       |  10   0    11  |    1.5
       x3       |  0    11   10  |    1.4

The mean code word length is then calculated to be:

⟨L⟩ = ∑_{i=1}^{N} ∑_{j=1}^{N} w_i · p_ij · L(z_n = x_j | z_{n−1} = x_i) ≈ 1.5054 bit/character    (1.43)

For comparison:
H_∞(Z) ≈ 1.441 bit/character,  H(Z) ≈ 1.5814 bit/character
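The mean code word length (1.43) can be verified with a small Python sketch (illustrative only); the code word lengths are read off the table above:

```python
# Lengths of the state-dependent Huffman code words L(z_n = x_j | z_{n-1} = x_i)
code_len = [[2, 2, 1],    # state x1: code words 11, 10, 0
            [2, 1, 2],    # state x2: code words 10, 0, 11
            [1, 2, 2]]    # state x3: code words 0, 11, 10
P = [[0.2, 0.4, 0.4],
     [0.3, 0.5, 0.2],
     [0.6, 0.1, 0.3]]
w = [33/93, 32/93, 28/93]

# per-state mean lengths <L>|z_{n-1}
L_state = [sum(p * l for p, l in zip(P[i], code_len[i])) for i in range(3)]
print([round(l, 2) for l in L_state])   # [1.6, 1.5, 1.4]

# overall mean code word length, equation (1.43)
L_mean = sum(wi * Li for wi, Li in zip(w, L_state))
print(round(L_mean, 4))                 # 1.5054
```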
page 53
1.5 Communication over a Discrete Memoryless Channel
1.5.1 Definition and Characteristics of the Discrete Memoryless Channel
1 Fundamentals of Information Theory
The discrete memoryless channel (DMC) is a model for the transmission of discrete values via a channel. Memoryless means that the transmission probabilities of the symbols are independent of each other.

The input signal X and the output signal Y can each take different discrete values, whereas the possible values at the input and at the output as well as their numbers do not have to be equal:

X ∈ {x1, x2, …, x_{N_X}},   Y ∈ {y1, y2, …, y_{N_Y}}

  X ──► [ Discrete Memoryless Channel ] ──► Y
page 54
1.5.1 Definition and Characteristics of the Discrete Memoryless Channel
With the transmission, each x_i is now transferred into y_j with a certain probability p_ij:

P = ( p(Y = y_j | X = x_i) ) = ( p_ij ) = ( p_11      p_12      …  p_1N_Y    )
                                          ( p_21      p_22      …  p_2N_Y    )    (1.44)
                                          (  ⋮         ⋮        ⋱   ⋮        )
                                          ( p_{N_X}1  p_{N_X}2  …  p_{N_X}N_Y )

The transition probabilities p_ij are again combined into the transition matrix P. Thereby, the sum of the p_ij over each row has to equal 1 (equation (1.36b)). (Similar to equation (1.36a), where, however, transitions in sources are concerned!)
page 55
1.5.1 Definition and Characteristics of the Discrete Memoryless Channel
A special case of the DMC is the binary channel. X as well as Y consist of only two symbols each. The transition matrix reduces to:

P = ( p_11  p_12 )
    ( p_21  p_22 )    (1.45)

The probability p(err) that a transmission error occurs is calculated to be:

p(err) = p(x1) · p(y2|x1) + p(x2) · p(y1|x2) = p(x1) · p_12 + p(x2) · p_21    (1.46)

For the binary symmetric channel (BSC) the relations simplify to:

P = ( 1 − p_err     p_err   )
    (   p_err     1 − p_err )    (1.47)

p(err) = p(x1) · p_err + p(x2) · p_err = [p(x1) + p(x2)] · p_err = p_err
page 56
1.5.1 Definition and Characteristics of the Discrete Memoryless Channel
For the interference-free channel, it is obvious that the information at the output has to be the same as the information at the input of the channel. At the same time, this means that the transmitted information equals the entropy of the source. With an interfered channel, information gets lost during the transmission due to false assignment of some symbols: the information at the output has to be less than the information at the input. This also leads to a decrease of the transmitted information, which is then less than the entropy of the source. For this reason, the transinformation T(X,Y) is defined as a measure for the information actually transmitted. In the following, this transinformation is related to the other information streams.
page 57
1.5.1 Definition and Characteristics of the Discrete Memoryless Channel
Information streams:
(Diagram: Source → Source Coder → Channel → Source Decoder → Sink; on the channel, the information streams are labelled transinformation, equivocation, and irrelevance.)
page 58
1.5.1 Definition and Characteristics of the Discrete Memoryless Channel
Definitions of the entropies in terms of the discrete memoryless channel
Input entropy: mean information content of the input symbols

H(X) = ∑_{i=1}^{N_X} p(x_i) · ld(1/p(x_i))    (1.48)

Output entropy: mean information content of the output symbols

H(Y) = ∑_{j=1}^{N_Y} p(y_j) · ld(1/p(y_j))    (1.49)

Joint entropy: mean uncertainty of the complete transmission system

H(X,Y) = ∑_{i=1}^{N_X} ∑_{j=1}^{N_Y} p(x_i, y_j) · ld(1/p(x_i, y_j))    (1.50)
page 59
1.5.1 Definition and Characteristics of the Discrete Memoryless Channel
Entropy of irrelevance: mean information content at the output in case of a known input symbol (conditional entropy)

H(Y|X) = ∑_{i=1}^{N_X} ∑_{j=1}^{N_Y} p(x_i, y_j) · ld(1/p(y_j|x_i))    (1.51)

Entropy of equivocation: mean information content at the input in case of a known output symbol (conditional entropy); entropy of the information getting lost on the channel

H(X|Y) = ∑_{i=1}^{N_X} ∑_{j=1}^{N_Y} p(x_i, y_j) · ld(1/p(x_i|y_j))    (1.52)
page 60
1.5.1 Definition and Characteristics of the Discrete Memoryless Channel
From the diagram of information streams as well as from equation (1.32a), the following relations can be derived:

H(X,Y) = H(Y,X) = H(X) + H(Y|X) = H(Y) + H(X|Y)    (1.53)
H(X|Y) ≤ H(X)    (1.54a)
H(Y|X) ≤ H(Y)    (1.54b)

The mean transinformation can be determined from these relations:

T(X,Y) = H(X) − H(X|Y)
       = H(Y) − H(Y|X)
       = H(X) + H(Y) − H(X,Y)    (1.55)
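Equations (1.48)–(1.55) can be illustrated with a small Python sketch. The joint distribution p(x_i, y_j) used here is hypothetical and serves only as an example; the consistency check at the end uses relation (1.53):

```python
from math import log2   # ld in the lecture notation

# hypothetical joint distribution p(x_i, y_j) of a 2x2 DMC
p_xy = [[0.30, 0.10],   # p(x1, y1), p(x1, y2)
        [0.05, 0.55]]   # p(x2, y1), p(x2, y2)

p_x = [sum(row) for row in p_xy]                              # marginal of X
p_y = [sum(p_xy[i][j] for i in range(2)) for j in range(2)]   # marginal of Y

def H(ps):
    """Entropy of a probability list, equations (1.48)-(1.50)."""
    return sum(p * log2(1 / p) for p in ps if p > 0)

H_X, H_Y = H(p_x), H(p_y)
H_XY = H([p for row in p_xy for p in row])   # joint entropy (1.50)

T = H_X + H_Y - H_XY                         # transinformation (1.55)

# consistency with (1.53): H(X,Y) = H(X) + H(Y|X), using p(y|x) = p(x,y)/p(x)
H_Y_given_X = sum(p_xy[i][j] * log2(p_x[i] / p_xy[i][j])
                  for i in range(2) for j in range(2))
assert abs(H_XY - (H_X + H_Y_given_X)) < 1e-9
print(round(T, 4))
```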
page 61
1.5.1 Definition and Characteristics of the Discrete Memoryless Channel
In terms of transinformation, two special cases are to be considered, for which the entropy can be derived easily.

For the ideal, undisturbed channel applies:

p_ij = { 1 for i = j
       { 0 for i ≠ j

For the entropies then applies: H(X|Y) = 0, H(Y|X) = 0, H(X,Y) = H(X) = H(Y) and thus T(X,Y) = H(X) = H(Y).

For the useless, completely disturbed channel, the input and output symbols are statistically independent of each other, so that the following applies: p(x_i, y_j) = p(x_i) · p(y_j) = p(y_j|x_i) · p(x_i) ⇒ p(y_j|x_i) = p(y_j) ⇒ p_ij = p_kj. That means that the transition probabilities to a given output symbol are equal for all input symbols. Regarding the entropies, that means: H(X|Y) = H(X), H(Y|X) = H(Y), H(X,Y) = H(X) + H(Y) and thus T(X,Y) = 0.
page 62
1.5.1 Definition and Characteristics of the Discrete Memoryless Channel
Concluding, a concrete example regarding transinformation: 1000 binary, statistically independent and equiprobable symbols (p(0) = p(1) = 0.5) are transmitted via a binary symmetric channel with p_err = 0.01.

Consideration in terms of quality: The mean number of correctly transmitted symbols is 990. However, the exact position of the errors is unknown, so that the following has to apply: T(X,Y) < 0.99 bit/character.

Consideration in terms of quantity:

T(X,Y) = H(Y) − H(Y|X)
       = ∑_{j=1}^{N_Y} p(y_j) ld(1/p(y_j)) − ∑_{i=1}^{N_X} p(x_i) ∑_{j=1}^{N_Y} p(y_j|x_i) ld(1/p(y_j|x_i))
       = 1 − [p_err · ld(1/p_err) + (1 − p_err) · ld(1/(1 − p_err))]
       = 1 − S(p_err) ≈ 0.9192 bit/character
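The quantitative value of the example can be checked with a two-line Python sketch (illustrative only); S denotes the binary entropy function used above:

```python
from math import log2

def S(p):
    """Binary entropy function S(p) = p ld(1/p) + (1-p) ld(1/(1-p))."""
    return p * log2(1 / p) + (1 - p) * log2(1 / (1 - p))

# transinformation of a BSC with equiprobable inputs and p_err = 0.01
T = 1 - S(0.01)
print(round(T, 4))   # 0.9192 bit/character
```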
page 63
1.5 Communication over a Discrete Memoryless Channel
1.5.2 Channel Capacity
Considering a given channel, it is interesting to know how much information can be sent via it at most. The transinformation, however, depends on the probability distribution of the source symbols and thus on the source, so that it is not suitable as a measure for the description of channels. Nevertheless, decoupling from the source is possible by considering the maximum value of the transinformation. Thus, the channel capacity is defined as:

C = (1/ΔT) · max_{p(x_i)} T(X,Y)    (1.56)

The channel capacity C is the maximum of the flow of information and has the dimension bit/s. ΔT describes the time period of the characters transmitted via the channel. With this definition, C only depends on the channel characteristics, that means on the transition probabilities, and not on the source.
page 64
1.5.2 Channel Capacity
Overview of a communication system:

(Block diagram: Source → Source Coder → Channel Coder → BSC Channel → Channel Decoder → Source Decoder → Sink. The ideal source coder works without redundancy; reliable transmission requires R_inf ≤ C.)

Definitions of flows of information:
– Flow of information (entropy / time): H′(X) = H(X) / ΔT
– Flow of transinformation (transinformation / time): T′(X,Y) = T(X,Y) / ΔT
– Flow of decision (decision content / time): H0′(X) = H0(X) / ΔT
page 65
1.5.2 Channel Capacity
Taking the binary symmetric channel (BSC) as an example, the channel capacity is to be derived. For this, the transinformation is determined first, and subsequently its maximum is found.

Transinformation of the BSC:

T(X,Y) = H(Y) − H(Y|X)
       = (p1 + p_err − 2 p1 p_err) · ld[1/(p1 + p_err − 2 p1 p_err)]
       + (1 − p1 − p_err + 2 p1 p_err) · ld[1/(1 − p1 − p_err + 2 p1 p_err)]
       − [p_err · ld(1/p_err) + (1 − p_err) · ld(1/(1 − p_err))]    (1.57)
page 66
1.5.2 Channel Capacity
Graph of the transinformation for different transmission error probabilities:

(Plot: T(X,Y) in bit/character versus p(x1) from 0.0 to 1.0, with curves for p_err = 0, 0.1, 0.2, 0.3, 0.5.)

The position where the maximum of the transinformation of the BSC occurs is independent of the transmission error probability. It is p1 = p(x1) = 0.5 (equiprobable source symbols).
page 67
1.5.2 Channel Capacity
Graph of the channel capacity. It can be derived from the graph of the transinformation on the previous page by plotting the respective maxima of the transinformation over p_err:

(Plot: C·ΔT in bit versus p_err from 0.0 to 1.0; the curve equals 1 at p_err = 0 and p_err = 1 and reaches 0 at p_err = 0.5.)

C · ΔT = max_{p(x_i)} T(X,Y) = 1 − [p_err · ld(1/p_err) + (1 − p_err) · ld(1/(1 − p_err))] = 1 − S(p_err)    (1.58)
page 68
1.5.2 Channel Capacity
As a further example, the binary erasure channel (BEC) is considered. It models the scenario in which a received signal cannot be assigned to one of the transmitted symbols. The data and results are as follows:

P = ( 1 − p_err   p_err       0     )
    (     0       p_err   1 − p_err )    (1.59)

C · ΔT = 1 − p_err    (1.60)

(Plot: C·ΔT in bit versus p_err from 0.0 to 1.0; the capacity decreases linearly from 1 to 0.)
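Result (1.60) can be verified numerically by maximising the transinformation over the input distribution, as required by definition (1.56). The following Python sketch (illustrative only; the grid search over p(x1) is an assumption made for simplicity) does this for p_err = 0.2:

```python
from math import log2

def transinformation(p1, perr):
    """T(X,Y) = H(Y) - H(Y|X) for a BEC with input probabilities (p1, 1-p1)."""
    px = [p1, 1 - p1]
    # BEC transition matrix (1.59): outputs y1, erasure, y2
    P = [[1 - perr, perr, 0.0],
         [0.0, perr, 1 - perr]]
    py = [sum(px[i] * P[i][j] for i in range(2)) for j in range(3)]
    H_Y = sum(p * log2(1 / p) for p in py if p > 0)
    H_Y_given_X = sum(px[i] * P[i][j] * log2(1 / P[i][j])
                      for i in range(2) for j in range(3) if P[i][j] > 0)
    return H_Y - H_Y_given_X

perr = 0.2
# coarse maximisation over the input distribution p(x1)
C = max(transinformation(p1 / 1000, perr) for p1 in range(1, 1000))
print(round(C, 4))   # 0.8 = 1 - perr
```

As for the BSC, the maximum is attained at p(x1) = 0.5, and the numerical result matches the closed form C·ΔT = 1 − p_err.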
page 69
1.5.2 Channel Capacity
Theorem of the channel capacity (Shannon 1948): For every ε > 0 and every flow of information of a source R_inf less than the channel capacity C (R_inf < C), a binary block code of length n (n above a threshold) exists such that the residual-error probability after decoding at the receiver is less than ε. Inversion: Whatever the effort, for R_inf > C the residual-error probability cannot fall below a certain limit.

The argumentation is carried out using random block codes (random coding argument): the proof considers the average over all codes. Seen against this, all explicitly known codes are bad codes. The theorem of the channel capacity, however, only states that codes with such characteristics exist; Shannon gives no construction rule.
page 70
1.5.2 Channel Capacity
In terms of the channel capacity, the use of infinite code word lengths (that means very large n) would be optimal. This, however, would bring about an infinite delay and complexity. Thus, Gallager improves the theorem of the channel capacity with his error exponent E_G(R_C) for a DMC with N_X input symbols. The objective is an estimate that is valid for smaller code word lengths.

E_G(R_C) = max_{0 ≤ s ≤ 1} max_{p(x_i)} { −s·R_C − ld ∑_{j=1}^{N_Y} [ ∑_{i=1}^{N_X} p(x_i) · p(y_j|x_i)^{1/(1+s)} ]^{1+s} }    (1.61)

An (n,k) block code (k: information word length) always exists with

R_C = (k/n) · ld N_X < C·ΔT,    (1.62)

so that the following applies for the word error probability:

P_w < 2^{−n·E_G(R_C)}    (1.63)
page 71
1.5.2 Channel Capacity
(Plot: E_G(R_C) versus R_C; the curve starts at E_G(0) = R_0, decreases, and reaches zero at R_C = C.)
The error exponent according to Gallager has the following characteristics:EG(RC) > 0 for RC < CEG(RC) = 0 for RC ≥ C
The R0 value is defined as:
R0 = EG(RC = 0) (1.64)
page 72
1.5.2 Channel Capacity
The maximum for E_G(R_C = 0) is obtained for s = 1. The R_0 value, which is also referred to as "computational cut-off rate", is then:

R_0 = E_G(R_C = 0) = max_{p(x_i)} { −ld ∑_{j=1}^{N_Y} [ ∑_{i=1}^{N_X} p(x_i) · √(p(y_j|x_i)) ]² }    (1.65)

If R_C is not set to zero, the equation reads as follows:

E_G(R_C) ≥ max_{p(x_i)} { −ld ∑_{j=1}^{N_Y} [ ∑_{i=1}^{N_X} p(x_i) · √(p(y_j|x_i)) ]² } − R_C = R_0 − R_C    (1.66)
page 73
1.5.2 Channel Capacity
R_0 theorem: An (n,k) block code with

R_C = (k/n) · ld N_X < C·ΔT

always exists so that for the word error probability (with maximum likelihood decoding) the following applies:

P_w < 2^{−n·(R_0 − R_C)}    (1.67)

As in the case of the theorem of the channel capacity, this theorem only states the existence and does not provide any construction rule for good codes.

Co-domain for the code rate:
– 0 ≤ R_C ≤ R_0: estimate of P_w using (1.67)
– R_0 ≤ R_C ≤ C·ΔT: estimate of P_w is difficult to calculate
– R_C > C·ΔT: P_w cannot become arbitrarily small
page 74
1.5.2 Channel Capacity
Comparison of channel capacity and R_0 value for a BSC:

(Plot: C·ΔT and R_0 in bit/character versus p_err from 0.0 to 0.5; the R_0 curve lies below the capacity curve.)

The R_0 graph is below the graph of the channel capacity and indicates a limit predicting worse characteristics of the code. However, this limit is more realistic in the sense that it can be reached with finite code word lengths.
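For the BSC, the value of (1.65) can be evaluated directly. The following Python sketch (illustrative only) assumes that, by the symmetry of the channel, the maximum over p(x_i) is attained at the equiprobable input distribution, and compares the result with the closed form R_0 = 1 − ld(1 + 2√(p_err(1 − p_err))) that follows from it:

```python
from math import log2, sqrt

def R0_bsc(perr):
    """R0 value (1.65) of a BSC, evaluated at p(x1) = p(x2) = 0.5."""
    P = [[1 - perr, perr],
         [perr, 1 - perr]]
    px = [0.5, 0.5]
    inner = [sum(px[i] * sqrt(P[i][j]) for i in range(2)) for j in range(2)]
    return -log2(sum(v ** 2 for v in inner))

perr = 0.1
# closed form for the BSC, derived by expanding the squares above
closed = 1 - log2(1 + 2 * sqrt(perr * (1 - perr)))
assert abs(R0_bsc(perr) - closed) < 1e-12
print(round(R0_bsc(perr), 4))   # 0.3219
```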
page 75
Coding Theory
2 Fundamentals of Channel Coding
In chapter 1, the influence of interference on the channel, which causes a reduction of the transmitted information, was described. In practice, interference results in random errors occurring during the transmission of symbols.
In this chapter, the fundamental parameters of channel coding are to bedescribed:– code rate,– minimum distance and– error probabilities.
Dependent on a quality criterion like the bit error rate (BER), a certain number of (residual) errors can be accepted; channel codes have to be applied to limit them. With voice transmission, e.g., a BER of 10⁻³ is accepted.
Furthermore, decoding rules are introduced and bounds theoreticallyachievable are identified.
page 76
2 Fundamentals of Channel Coding
2.1 Introduction

Basic principle of channel coding:
The transmission line with channel coding/decoding is divided into:
– classification of the source symbol stream into information words u of length k,
– assignment of u to a transmit word x ∈ C of length n,
– transmission of x; with this, superposition of an error word f,
– estimation from the receive word y = x + f, that means assignment of y to the most likely code word x̂ ∈ C,
– assignment of x̂ to the estimated information word û.

(Block diagram: Channel Coder → Channel → Channel Decoder; the channel decoder consists of the code word estimator and the information word assignment.)
page 77
2.1 Introduction
Regarding this, the task of coding is to add redundancy for error protection (redundant codes).

For the coding

u = (u0, u1, …, u_{k−1}) → c = (c0, c1, …, c_{n−1})

n ≥ k applies. The m = n − k added symbols are referred to as check symbols. Each of the single symbols u_i and c_i, respectively, can in general take q values (q: number of symbols). Then there are N = q^k different messages and code words, respectively, and q^n − q^k invalid words.

The code space, that means the set of all code words, is referred to as C.

In this chapter, only binary block codes are considered, that means q = 2, and k and n, respectively, are constant for a code. In the following, the repetition codes and the parity check codes are considered as examples.
page 78
2.1 Introduction
Repetition code:
The information consists of one information bit (k = 1) that is repeated (n − 1)-fold. With this, all in all 2^k = 2 code words exist, which are
c1 = (0 0 0 … 0) and c2 = (1 1 1 … 1).
The decoding is carried out by majority decision if n is odd.

Example: n = 5 ⇒ R_C = 1/5
u1 = (0) → c1 = (0 0 0 0 0) and u2 = (1) → c2 = (1 1 1 1 1)

Interfered receive sequences:
f1 = (0 1 0 0 1) and x1 = c2 = (1 1 1 1 1): y1 = x1 + f1 = (1 0 1 1 0) → û1 = (1) (successful decoding)
f2 = (1 1 0 1 0) and x2 = c1 = (0 0 0 0 0): y2 = x2 + f2 = (1 1 0 1 0) → û2 = (1) (decoding error)

The addition is made bit by bit without carry.
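The majority decision for the two interfered receive sequences of the example can be sketched in Python (illustrative only, not part of the lecture):

```python
def decode_repetition(y):
    """Majority decision for a repetition code: estimate 1 if ones predominate (n odd)."""
    return 1 if sum(y) > len(y) // 2 else 0

y1 = [1, 0, 1, 1, 0]   # x1 = (1 1 1 1 1) + f1 = (0 1 0 0 1)
y2 = [1, 1, 0, 1, 0]   # x2 = (0 0 0 0 0) + f2 = (1 1 0 1 0)
print(decode_repetition(y1))   # 1 -> successful decoding (u2 = (1) was sent)
print(decode_repetition(y2))   # 1 -> decoding error (u1 = (0) was sent)
```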
page 79
2.1 Introduction
Parity Check Code:
The code consists of k information bits and one control bit (m = 1). With this, n = k + 1. There are 2^k code words of the form
c = (u p) with p = u0 + u1 + u2 + … + u_{k−1}.
From this, an even number of ones always results in the code words.

Example for k = 3:

  Code word | Information | Control bit
     c0     |    000      |     0
     c1     |    001      |     1
     c2     |    010      |     1
     c3     |    011      |     0
     c4     |    100      |     1
     c5     |    101      |     0
     c6     |    110      |     0
     c7     |    111      |     1

The parity check then reads as follows:
s0 = y0 + y1 + y2 + … + y_{n−1} = 0 → no error
s0 = y0 + y1 + y2 + … + y_{n−1} = 1 → error

y1 = (0 1 1 0) → no error detected
y2 = (1 1 1 0) → error detected
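Encoding and parity checking for the k = 3 example can be sketched as follows (illustrative only; the additions are carried out modulo 2, i.e. without carry):

```python
def encode(u):
    """Append the control bit p = u0 + u1 + ... + u_{k-1} (mod 2, even parity)."""
    return u + [sum(u) % 2]

def syndrome(y):
    """s0 = y0 + y1 + ... + y_{n-1} (mod 2): 0 -> no error, 1 -> error."""
    return sum(y) % 2

print(encode([0, 1, 1]))        # [0, 1, 1, 0], i.e. code word c3
print(syndrome([0, 1, 1, 0]))   # 0 -> no error detected
print(syndrome([1, 1, 1, 0]))   # 1 -> error detected
```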
page 80
2 Fundamentals of Channel Coding
2.2 Code Rate
A source generates information with a mean rate of R_i bit per second (bps), that means every T_i = 1/R_i seconds one bit is emitted. The channel coder assigns a code word of length n to every message of length k, so that the channel bit rate reads as follows:

R_K = R_i · (n/k)    (2.1)

The code rate is then:

R_C = k/n = R_i/R_K = T_K/T_i    (2.2)

Since n ≥ k applies, R_C ≤ 1. The code rate measures the relative information content of every code word and is one of the key values for the performance of a channel code. Though a high value of R_C means that more information is transmitted, protection from transmission errors is more difficult due to the smaller number of redundant symbols.
page 81
2.2 Code Rate
In some systems, R_i and R_K are fixed parameters. For a given message length k, n is then the greatest integer that does not exceed k · R_K / R_i. If this term itself is no integer, dummy bits have to be included in order to maintain the synchronisation.

Example: A system is given by R_i = 3 bps and R_K = 4 bps. With this, R_C = 3/4.
If a channel coder with k = 3 is to be made available, n = 4 has to be chosen; loss of synchronisation does not occur.
If a channel coder with k = 5 is to be made available, the product results in k · R_K / R_i = 6 + 2/3. Thus, n = 6. However, 2 dummy bits have to be added after every third code word in order to close the gap of 2/3:

  3 × (k = 5) information bits = 15 bits ⇒ 3 × (n = 6) code bits + 2 dummy bits = 20 bits, R_C = 3/4
page 82
2 Fundamentals of Channel Coding
2.3 Decoding Rules
The channel decoder has to use a decoding rule for the receive word y in order to decide which transmit word x was transmitted. That means that a decision rule D(.) is searched for, for which the following applies: x = D(y).
Let p(y | x) be the probability that y was received after transmitting x. For a discrete memoryless channel it can be expressed by the transmission probabilities:

p(y | x) = ∏_{i=0}^{n−1} p(y_i | x_i)    (2.3)

Let p(x) be the a-priori probability for the message x to be sent. By means of Bayes' rule (equation 1.9), the probability that x was transmitted when y was received is given by:

p(x | y) = p(y | x) · p(x) / p(y)    (2.4)
page 83
2.3 Decoding Rules

From the transmission point of view, the 3 possible results are as follows:
− y = x, that means error-free transmission: x̂ = x, û = u,
− y ≠ x, y ∈ C, that means falsification of the receive word into another code word (not correctable): x̂ ≠ x, û ≠ u,
− y ≠ x, y ∉ C, that means erroneous transmission, in which the error is observable and possibly correctable: x̂ = x or x̂ ≠ x, and accordingly û = u or û ≠ u.

From the receiver's point of view, there are 3 possible decoding results:
− Correct decoding, where either no errors have occurred on the channel or all errors have been corrected properly: x̂ = x, û = u.
− Erroneous decoding, that means an error has occurred in such a way that the estimation has assigned an erroneous code word to the receive word: x̂ ≠ x, û ≠ u.
− Decoding failure, that means an error was detected; however, the decoder does not find a solution for this error.
page 84
2 Fundamentals of Channel Coding

The objective is now to minimise the word error probability. For this, x̂ has to be selected in such a way that p(x̂ | y) becomes maximal. Using equation (2.4), there is also the possibility to maximise p(y | x̂) · p(x̂).

Word error probability for a DMC:

p_W = p(û ≠ u) = p(x̂ ≠ x) = ∑_{x ∈ C} p(x̂ ≠ x | x sent) · p(x sent)    (2.5)

The conditional probability for an estimation error is:

p(x̂ ≠ x | x sent) = ∑_{y: x̂ ≠ x} p(y received | x sent) = ∑_{y: x̂ ≠ x} p(y | x)    (2.6)

It is assumed that the code words are transmitted with the same probability. Then for p_W the following applies:

p_W = 2^{−k} · ∑_{x ∈ C} ∑_{y: x̂ ≠ x} p(y | x)
    = 2^{−k} · [ ∑_{x ∈ C} ∑_{y} p(y | x) − ∑_{x ∈ C} ∑_{y: x̂ = x} p(y | x) ]
    = 1 − 2^{−k} · ∑_{y} p(y | x̂)    (2.7)
page 85
2.3 Decoding Rules

Maximum a-posteriori rule:
The probability that the decision from the receive word y to the code word x is correct is just p(x | y). The probability that an error occurs is then 1 − p(x | y). Thus, the minimisation of the estimation error by the selection of x̂ is the maximisation of the a-posteriori probability p(x | y).

The corresponding maximum a-posteriori rule (MAP rule) reads as follows:

x̂ = D_MAP(y), with x̂ ∈ C,    (2.8a)

so that

p(x̂ | y) ≥ p(x | y) for all x ∈ C    (2.8b)

This decoding rule results in a minimum of decoding errors. Using equation (2.4) and since p(y) is independent of x, equation (2.8b) is simplified to:

p(y | x̂) · p(x̂) ≥ p(y | x) · p(x) for all x ∈ C    (2.8c)
page 86
2.3 Decoding Rules
Maximum likelihood decoding rule:
An alternative decoding rule is based on the maximisation of p(y | x), the probability that with the transmission of x the word y was received. This rule does not guarantee a minimum of errors, but it is easier to implement, since the channel input probabilities are not required.

The corresponding maximum likelihood decoding rule (MLD rule) reads as follows:

x̂ = D_MLD(y), with x̂ ∈ C,    (2.9a)

so that

p(y | x̂) ≥ p(y | x) for all x ∈ C    (2.9b)

For equiprobable input symbols, the MLD and the MAP rule lead to the same results (compare the following example).
page 87
2.3 Decoding Rules
Example: A BSC with p_err = 0.4 and a channel coder with (k,n) = (2,3) are assumed, that means a channel code with N = 2² = 4 code words of length n = 3. The code words and their occurrence probabilities are assumed to be:

  Code word    p(x)
  x1 = (000)   0.4
  x2 = (011)   0.2
  x3 = (101)   0.1
  x4 = (110)   0.3

Assumption: y = (111). The MLD rule provides:

p(y|x1) = p(111|000) = p(1|0)·p(1|0)·p(1|0) = 0.064
p(y|x2) = p(111|011) = p(1|0)·p(1|1)·p(1|1) = 0.144
p(y|x3) = p(111|101) = p(1|1)·p(1|0)·p(1|1) = 0.144
p(y|x4) = p(111|110) = p(1|1)·p(1|1)·p(1|0) = 0.144

This rule provides three equal maximum probabilities, so that a unique decision is not possible.

The MAP rule, however, provides (by means of equation (2.8c)):

p(y|x1)·p(x1) = 0.064 · 0.4 = 0.0256
p(y|x2)·p(x2) = 0.144 · 0.2 = 0.0288
p(y|x3)·p(x3) = 0.144 · 0.1 = 0.0144
p(y|x4)·p(x4) = 0.144 · 0.3 = 0.0432

Minimising the error is unique here by selection of the code word x4, that means x̂ = D_MAP(y = (111)) = x4.
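The numbers of this example can be reproduced with a short Python sketch (illustrative only, not part of the lecture):

```python
perr = 0.4
codewords = {(0, 0, 0): 0.4, (0, 1, 1): 0.2, (1, 0, 1): 0.1, (1, 1, 0): 0.3}
y = (1, 1, 1)

def p_y_given_x(y, x):
    """Equation (2.3) for the BSC: product of per-position transition probabilities."""
    p = 1.0
    for yi, xi in zip(y, x):
        p *= (1 - perr) if yi == xi else perr
    return p

mld = {x: p_y_given_x(y, x) for x in codewords}
map_ = {x: p_y_given_x(y, x) * codewords[x] for x in codewords}
print({x: round(v, 4) for x, v in mld.items()})   # MLD ties between three code words

best = max(map_, key=map_.get)                    # MAP decision is unique
print(best, round(map_[best], 4))                 # (1, 1, 0) 0.0432
```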
page 88
2.3 Decoding Rules
Decoding sphere:The decoding sphere describes an n-dimensional sphere around a code wordwith the radius t. All n-digit receive words located within a definite decoding sphere are decoded as the respective code word.
It applies that the total number of all vectors within the decoding spheres is less than or equal to the total number of all possible words.
The decoding sphere forms the basis for the Hamming bound (see below).
(Figure: the space of possible receive words; code words are surrounded by decoding spheres of radius t, and possible receive words that are no code words lie inside or outside the spheres.)
page 89
2.3 Decoding Rules
Decoding methods:
A block code that can definitely correct t errors is considered.

Maximum likelihood decoding (MLD):
The receive word is compared to all code words (2^k necessary vector comparisons). Though this method is optimal, it is very complex. Possibly, more than t errors are correctable.

Bounded distance decoding (BDD):
Every code word is surrounded by a decoding sphere of radius t0; the spheres can overlap. Decoding takes place only for those receive vectors y located in exactly one sphere. All receive vectors y located in no sphere or in several spheres are not decoded.

Bounded minimum distance decoding (BMD):
Every code word is surrounded by a decoding sphere of radius t. The decoding spheres are disjoint. Decoding takes place only for the receive vectors y located in a sphere.
page 90
2.3 Decoding Rules
(Figure: decoding regions compared for MLD, BDD, and BMD.)

Further decoding methods are, e.g., the syndrome decoding and the majority decision. The syndrome decoding is easy to realise; however, it is suboptimal since the complete information is not used. The majority decision is already known from the repetition codes and is suboptimal as well.
page 91
2 Fundamentals of Channel Coding
2.4 Hamming Distance
Assumption: Two binary code words x and y of length n are given. The Hamming distance between x and y, d_H(x,y), is defined as the number of bit positions in which x and y differ. For arbitrary binary words x, y and z, the Hamming distance fulfils the following conditions:
− dH(x,y) ≥ 0, with equality, if x = y, (2.10a)− dH(x,y) = dH(y,x) and (2.10b)− dH(x,y) + dH(y,z) ≥ dH(x,z) (triangle inequality). (2.10c)
Example: Let n = 8 and
x = (1 1 0 1 0 0 0 1),
y = (0 0 0 1 0 0 1 0),
z = (0 1 0 1 0 0 1 1).

The Hamming distances for all combinations are given in the table:

  d_H | x  y  z
  ----+---------
   x  | 0  4  2
   y  | 4  0  2
   z  | 2  2  0

Check of the triangle inequality (equation (2.10c)) results in:
d_H(x,y) + d_H(y,z) ≥ d_H(x,z) ⇒ 4 + 2 ≥ 2
page 92
2.4 Hamming Distance
(Hamming) weight:
A binary vector x of length n is given. Then the Hamming weight w_H(x) is defined as the number of ones in x:

w_H(x) = ∑_{i=0}^{n−1} x_i    (2.11)

Example: w_H(0 1 1 1 0 1 0 1) = 5.

Between the Hamming distance d_H and the Hamming weight w_H the following relation exists:

d_H(x,y) = w_H(x + y)    (2.12)

Example:
x     = (0 1 1 1 0 1 0 1)
y     = (1 0 1 0 0 1 0 1)  ⇒ d_H(x,y) = 3
x + y = (1 1 0 1 0 0 0 0)  ⇒ w_H(x + y) = 3
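Definitions (2.11) and (2.12) can be checked for the example vectors with a small Python sketch (illustrative only):

```python
def w_H(x):
    """Hamming weight (2.11): number of ones in x."""
    return sum(x)

def d_H(x, y):
    """Hamming distance: number of positions in which x and y differ."""
    return sum(xi != yi for xi, yi in zip(x, y))

x = [0, 1, 1, 1, 0, 1, 0, 1]
y = [1, 0, 1, 0, 0, 1, 0, 1]
s = [(xi + yi) % 2 for xi, yi in zip(x, y)]   # bitwise addition without carry

print(w_H(x))      # 5
print(d_H(x, y))   # 3
print(w_H(s))      # 3 = dH(x, y), relation (2.12)
```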
page 93
2.4 Hamming Distance
2.4.1 Decoding Rule for the BSC
A BSC with the bit error probability perr is assumed. With regard to the MLD rule, the following results for the conditional probability:

p(y|x) = ∏_{i=0}^{n−1} p(yi|xi)  with  p(yi|xi) = 1 − perr for yi = xi,  perr for yi ≠ xi

       = perr^dH(x,y) · (1 − perr)^(n − dH(x,y)) = (1 − perr)^n · ( perr / (1 − perr) )^dH(x,y)     (2.13)

Equation (2.13) is maximal if dH(x,y) is minimal. According to the MLD decoding rule, the estimated vector x̂ is to be chosen such that it is nearest to the receive word y:

dH(x̂,y) ≤ dH(x,y)  for all x ∈ C     (2.14)

Moreover, equation (2.13) states that, if perr < 0.5, the probability of one error is greater than that of two or three errors etc., that means

p(i errors) > p(i + 1 errors),  0 ≤ i ≤ n − 1.     (2.15)
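The MLD rule (2.14) amounts to a nearest-neighbour search over the code. A small Python sketch (the repetition-code codebook is an assumed illustration; ties are broken arbitrarily here, whereas the lecture treats ambiguous cases as "detected only"):

```python
def hamming_distance(x, y):
    return sum(a != b for a, b in zip(x, y))

def mld_decode(y, codebook):
    """Minimum-distance decoding for the BSC (equation (2.14)):
    return the code word nearest to the receive word y."""
    return min(codebook, key=lambda c: hamming_distance(c, y))

# Illustration with the length-3 repetition code
codebook = ["000", "111"]
assert mld_decode("010", codebook) == "000"
assert mld_decode("110", codebook) == "111"
```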
page 94
2.4.1 Decoding Rule for the BSC

Example: The following channel code is assumed:

Message | Code word
  00    |   000
  01    |   001
  10    |   011
  11    |   111

Since n = 3, N = 8 possible receive words exist, 4 of which are not code words. If the receive word is one of the code words, the decoding rule assigns this code word to the estimate (x̂ = y). If y is not a code word, however, the decoding rule reads as follows:

  y  | dH(y,ci)   | Nearest code word | Action
 010 | 1, 2, 1, 2 | 000; 011          | 1 bit error detected
 100 | 1, 2, 3, 2 | 000               | 1 bit error corrected
 101 | 2, 1, 2, 1 | 001; 111          | 1 bit error detected
 110 | 2, 3, 2, 1 | 111               | 1 bit error corrected

The separate consideration of possible pairs of words is the disadvantage of this approach. A parameter describing the general detection and correction characteristics of a code is more interesting.
page 95
2.4 Hamming Distance
2.4.2 Error Detection and Error Correction
Minimum distance of a linear code*:
The minimum distance dmin of a linear block code C is given by the smallest Hamming distance over all pairs of different code words:

dmin = min{ dH(c1,c2) | c1,c2 ∈ C ∧ c1 ≠ c2 }     (2.16)

The minimum distance of a linear code corresponds to the minimum weight of the code without considering the zero word:

dmin = min{ wH(c) | c ∈ C; c ≠ 0 } = wmin     (2.17)

Proof:
dmin = min{ dH(c1,c2) | c1,c2 ∈ C; c1 ≠ c2 }
     = min{ dH(c1 + c2, 0) | c1,c2 ∈ C; c1 ≠ c2 }
     = min{ dH(c,0) | c ∈ C; c ≠ 0 }
     = min{ wH(c) | c ∈ C; c ≠ 0 }

(*) linear codes: see chapter 3
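Equation (2.17) can be verified by brute force on a small code. A sketch, assuming the (3,2) parity-check code as the linear example:

```python
from itertools import product

def gf2_matmul_vec(u, G):
    """Code word x = u G over GF(2); G is a list of rows."""
    n = len(G[0])
    return tuple(sum(u[i] * G[i][j] for i in range(len(u))) % 2
                 for j in range(n))

def hamming_distance(x, y):
    return sum(a != b for a, b in zip(x, y))

# (3,2) parity-check code as a small linear example (assumed illustration)
G = [(1, 0, 1),
     (0, 1, 1)]
code = {gf2_matmul_vec(u, G) for u in product((0, 1), repeat=2)}

# dmin over all pairs (eq. 2.16) equals the minimum nonzero weight (eq. 2.17)
d_pairs = min(hamming_distance(c1, c2) for c1 in code for c2 in code if c1 != c2)
w_min = min(sum(c) for c in code if any(c))
assert d_pairs == w_min == 2
```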
page 96
2.4.2 Error Detection and Error Correction
Error detection characteristic:
A block code C detects up to t errors if its minimum distance is greater than t:

dmin > t     (2.18a)

This is an important characteristic for ARQ methods, where codes for error detection are required.

Error correction characteristic:
A block code C corrects up to t errors if its minimum distance is greater than 2t:

dmin > 2t     (2.18b)

This is an important characteristic for FEC methods, where codes for error correction are required.
page 97
2.4.2 Error Detection and Error Correction
[Figure: decoding spheres for dmin = 3 and dmin = 4]

Thus, the number of detectable errors is given by

te = dmin − 1.     (2.19)

The number of correctable errors is given by

t = (dmin − 2) / 2, if dmin is even, and     (2.20a)
t = (dmin − 1) / 2, if dmin is odd.     (2.20b)

With dmin = 1, neither error detection nor error correction can be guaranteed; with dmin = 2, at least one single error is always detectable; with dmin = 3, at least one single error is always correctable and at least two errors are detectable, etc.
page 98
2.4.2 Error Detection and Error Correction
Examples:
1) With the repetition code, (n − 1)/2 errors (rounded down) are correctable and n − 1 errors are detectable.
2) With the parity check, no error is correctable and, in general, an odd number of errors is detectable.
3) In the example of the table below, an error is definitely detectable since dmin = 2.

Code word | Information | Control bit | Weight
   c0     |     000     |      0      |   0
   c1     |     001     |      1      |   2
   c2     |     010     |      1      |   2
   c3     |     011     |      0      |   2
   c4     |     100     |      1      |   2
   c5     |     101     |      0      |   2
   c6     |     110     |      0      |   2
   c7     |     111     |      1      |   4
page 99
2 Fundamentals of Channel Coding
2.5 Bounds and Perfect Codes
As observed in section 2.4, the minimum distance dmin specifies the ability of a code C to detect and correct errors. Based on different problems, interesting questions result:

− A code C of the length n is to be generated which shall have the minimum distance dmin(C). Does a maximum number (upper bound) of possible code words (messages) N then exist?
− A code C of the length n for messages of the length k is to be developed. What is the maximum error protection (that means maximum dmin) obtainable with these parameters?
− A code C for messages of the length k with a given error protection (t bits) is to be developed. What is the shortest length n of a code word that can be used for this?

In the following, various bounds are considered.
page 100
2.5 Bounds and Perfect Codes
Singleton bound:
The minimum distance and the minimum weight of a linear code C are bounded by

dmin = wmin ≤ n − k + 1 = m + 1     (2.21)

Proof: In general, q^k different code words exist, which have a minimum distance of dmin to each other. If the last dmin − 1 symbols are removed from every code word, at least q^k different words still have to remain; otherwise the code would not have a minimum distance of dmin. All in all, q^n different words of the length n exist (all possible combinations). If the last dmin − 1 symbols are removed here, at most q^(n − (dmin − 1)) words are different from each other. Thus, the following has to apply: n − (dmin − 1) = n − dmin + 1 ≥ k, i.e. dmin ≤ n − k + 1.

A code for which equality in equation (2.21) applies is called Maximum Distance Separable (MDS).
page 101
2.5 Bounds and Perfect Codes
From equation (2.21), using equation (2.19), the following can be derived for error detection:

te + 1 = dmin ≤ 1 + m  ⇒  m ≥ te     (2.22a)

That means that at least one control bit per detectable error is required.

For error correction, the following can be derived from equations (2.20a/b):

(dmin − 1) / 2 ≥ t ≥ (dmin − 2) / 2
2t + 1 ≤ dmin ≤ 1 + m  ⇒  m ≥ 2t     (2.22b)

That means that at least two control bits per correctable error are required.
page 102
2.5 Bounds and Perfect Codes
Hamming bound:
For a binary (n,k) block code with N = 2^k code words of the length n and the correction capability t, the following applies:

2^k · ∑_{i=0}^{t} (n over i) ≤ 2^n     (2.23)

with

(n over t) = [ n · (n − 1) · … · (n − (t − 1)) ] / [ t · (t − 1) · … · 1 ]

Proof: The total number of all words within the 2^k existing decoding spheres is at most equal to the number of all possible words (2^n), i.e.:

2^k · [ (n over 0) + (n over 1) + (n over 2) + … + (n over t) ] ≤ 2^n

where (n over 0) counts the code word itself (centre of a decoding sphere), (n over 1) the words around a code word with dH = 1, (n over 2) those with dH = 2, and (n over t) those with dH = t.
page 103
2.5 Bounds and Perfect Codes
Perfect codes:A code C with the variables n, k and dmin is referred to as perfect if equalitywithin the Hamming bound (equation (2.23)) applies.Graphically, this means that all qn possible receive words are located withinthe correction spheres of the qk code words. Thus, with perfect codes all corrupted code words can be assigned unambiguously to a correct codeword.The repetition code of odd length, e.g., is a perfect code.
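The equality condition in the Hamming bound (2.23) that characterises perfect codes can be checked numerically. A sketch (the (23,12) Golay parameters are added for illustration and are not part of the script):

```python
from math import comb

def hamming_bound_holds(n, k, t):
    """Check 2^k * sum_{i=0}^{t} C(n,i) <= 2^n (equation (2.23))."""
    return 2**k * sum(comb(n, i) for i in range(t + 1)) <= 2**n

def is_perfect(n, k, t):
    """Equality in the Hamming bound characterises perfect codes."""
    return 2**k * sum(comb(n, i) for i in range(t + 1)) == 2**n

assert hamming_bound_holds(7, 4, 1)
assert is_perfect(7, 4, 1)     # the (7,4) Hamming code is perfect
assert is_perfect(7, 1, 3)     # repetition code of odd length n = 7
```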
Further bounds:
In addition, there are the Plotkin bound and the Elias bound, which are also upper bounds; they are justified by their different tightness in different parameter ranges. The Gilbert-Varshamov bound is a lower bound; compliance with it guarantees the existence of a code.
(for further information, see Schneider-Obermeier, Friedrichs)
page 104
2.5 Bounds and Perfect Codes
Asymptotic bounds for the minimum distance:
Eventually, the dependency between the code rate RC and the normalised distance dmin/n is to be considered. For large code word lengths (n → ∞), the following terms arise for the mentioned bounds (S(·) denotes the binary entropy function):

– Singleton bound:          RC ≤ 1 − dmin/n     (2.24a)
– Hamming bound:            RC ≤ 1 − S( (1/2) · dmin/n )     (2.24b)
– Plotkin bound:            RC ≤ 1 − 2 · dmin/n     (2.24c)
– Elias bound:              RC ≤ 1 − S( (1/2) · (1 − √(1 − 2 · dmin/n)) )     (2.24d)
– Gilbert-Varshamov bound:  RC ≥ 1 − S( dmin/n )     (2.24e)
page 105
2.5 Bounds and Perfect Codes
The upper bounds form different limits on the error protection (dmin/n) at which a certain information content (RC) can be transmitted. The Gilbert-Varshamov bound forms the lower limit. Codes whose values lie below this lower limit have to be considered bad codes, since theory states that for every bad code a better one with a higher code rate exists. Codes lying above the Gilbert-Varshamov bound can accordingly be considered good codes.

[Figure: RC = k/n versus dmin/n; the Singleton, Plotkin, Elias and Hamming curves are upper bounds, the Gilbert-Varshamov curve is the lower bound]
page 106
2 Fundamentals of Channel Coding
2.6 Error Probabilities
2.6.1 Error Occurrence Probability
It is assumed that a BSC with the error probability perr is given. From this, a mean number of errors n̄err per code word results, which is calculated as:

n̄err = n · perr     (2.25a)

The probability that no error occurs is:

p(f = 0) = (1 − perr) · (1 − perr) · … · (1 − perr) = (1 − perr)^n     (n factors)     (2.25b)

The probability that a single error occurs at a certain position (f = f1, wH(f1) = 1) is:

p(f = f1) = perr · (1 − perr)^(n−1)     (2.25c)

The probability that a single error occurs at an arbitrary position (f = f1, wH(f1) = 1) is:

p(f = f1) = n · perr · (1 − perr)^(n−1)

The probability of ne errors at arbitrary positions (f = f_ne, wH(f_ne) = ne) is:

p(f = f_ne) = (n over ne) · perr^ne · (1 − perr)^(n−ne)     (2.25d)
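Equations (2.25a-d) describe a binomial distribution. A Python sketch checking them numerically (n = 7 and perr = 0.1 are assumed example values):

```python
from math import comb

def p_ne_errors(n, n_e, p_err):
    """Probability of exactly n_e errors at arbitrary positions (eq. 2.25d)."""
    return comb(n, n_e) * p_err**n_e * (1 - p_err)**(n - n_e)

n, p_err = 7, 0.1
# (2.25b): no error; (2.25d) with n_e = 0 gives the same value
assert abs(p_ne_errors(n, 0, p_err) - (1 - p_err)**n) < 1e-12
# the probabilities over all n_e sum to 1 (binomial distribution)
assert abs(sum(p_ne_errors(n, i, p_err) for i in range(n + 1)) - 1.0) < 1e-12
# the mean number of errors equals n * p_err (eq. 2.25a)
mean = sum(i * p_ne_errors(n, i, p_err) for i in range(n + 1))
assert abs(mean - n * p_err) < 1e-12
```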
page 107
[Figure: error occurrence probability p(wH(f) = ne) over ne for perr = 0.01, 0.1 and 0.3]
2.6.1 Error Occurrence Probability
For n = 7, the error occurrence probability according to the number of bit errors is represented in the figure.

For the binomial coefficients (eq. 2.26), the following generally applies:

(n over 0) = (n over n) = 1;   (n over k) = n! / (k! · (n − k)!) = (n over n−k)     (2.26)

Using these, the binomial formula can be represented as:

∑_{k=0}^{n} (n over k) = 2^n;   ∑_{k=0}^{n} (n over k) · a^k · b^(n−k) = (a + b)^n     (2.27)
page 108
2.6 Error Probabilities
2.6.2 Probability of Undetected Errors
Let xi and xj be code words; xi was sent and y = xj was received. The decoding rule will assume that no error has occurred and regard xj as the code word that was transmitted. This special case of a decoding error cannot be avoided; the error remains undetected. For codes that shall detect errors, the probability pue (undetected error) that an error remains undetected is an important parameter, calculated by:

pue = ∑_{i=0}^{N−1} p(xi) · ∑_{j=0, j≠i}^{N−1} p(xj | xi)

    = ∑_{i=0}^{N−1} p(xi) · ∑_{j=0, j≠i}^{N−1} perr^dH(xi,xj) · (1 − perr)^(n − dH(xi,xj))

    = (1/N) · ∑_{i=0}^{N−1} ∑_{j=0, j≠i}^{N−1} perr^dH(xi,xj) · (1 − perr)^(n − dH(xi,xj))     (2.28a)

(in the last step, equal occurrence probability p(xi) = 1/N is used).
page 109
2.6.2 Probability of Undetected Errors
It is assumed that all code words have equal occurrence probability. The double sum in (eq. 2.28a) and the need for all possible Hamming distances are the main difficulty in terms of the calculation. For the linear binary codes discussed in the next chapter, the distribution of dH(xi,xj) can be determined from the number Ai of code words with the weight i:

pue = ∑_{i=dmin}^{n} Ai · perr^i · (1 − perr)^(n−i)     (2.28b)

Using the equation

A( perr / (1 − perr) ) = 1 + ∑_{i=dmin}^{n} Ai · ( perr / (1 − perr) )^i     (2.29)

the following notation results:

pue = (1 − perr)^n · [ A( perr / (1 − perr) ) − 1 ]     (2.30a)
page 110
2.6.2 Probability of Undetected Errors
For small error probabilities perr, pue can be calculated approximately by:

pue ≈ ∑_{i=dmin}^{n} Ai · perr^i ≈ A_dmin · perr^dmin     (2.30b)

Example:
Consider the (7,4) Hamming code, which is a linear binary code. This code has one code word each with the weight 0 and the weight 7, and 7 code words each with the weights 3 and 4. The minimum distance is dmin = 3. From this, a probability pue results:

pue = 7 (1 − perr)^4 perr^3 + 7 (1 − perr)^3 perr^4 + perr^7
    = 7 perr^3 − 21 perr^4 + 21 perr^5 − 7 perr^6 + perr^7 ≈ 7 perr^3
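The example above can be reproduced numerically. A sketch using the weight distribution of the (7,4) Hamming code given in the text:

```python
# Weight distribution of the (7,4) Hamming code: A0 = 1, A3 = A4 = 7, A7 = 1
A = {0: 1, 3: 7, 4: 7, 7: 1}
n, d_min = 7, 3

def p_ue(p_err):
    """Probability of an undetected error (equation (2.28b))."""
    return sum(A[i] * p_err**i * (1 - p_err)**(n - i)
               for i in A if i >= d_min)

p = 1e-3
exact = p_ue(p)
approx = 7 * p**3                  # pue ≈ A_dmin * perr^dmin (eq. 2.30b)
assert abs(exact - approx) / approx < 0.01   # within 1 % for small perr
```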
page 111
2.6 Error Probabilities
2.6.3 Residual-Error Probability
Let C = {c0, c1, c2, ... , c_(2^k − 1)} be a linear block code with the correction capability t. Usage of bounded minimum distance decoding (BMD) is assumed. Let the error vector be referred to as f. Then for the word error probability with BMD, pW,BMD, the following applies:

pW,BMD = 1 − p(correct decoding) = 1 − p(wH(f) ≤ t) = p(wH(f) ≥ t + 1)
       = 1 − ∑_{i=0}^{t} p(wH(f) = i) = ∑_{i=t+1}^{n} p(wH(f) = i)     (2.31a)

With equation (2.25d), the probability of error vectors with the weight wH(f) = i results in:

p(wH(f) = i) = (n over i) · perr^i · (1 − perr)^(n−i)     (2.31b)

From this, the following results:

pW,BMD = 1 − ∑_{i=0}^{t} (n over i) · perr^i · (1 − perr)^(n−i) = ∑_{i=t+1}^{n} (n over i) · perr^i · (1 − perr)^(n−i)     (2.31c)
page 112
2.6.3 Residual-Error Probability
For small error probabilities, the following estimation applies:

pW,BMD < (n over t+1) · perr^(t+1)     (2.31d)

Between the word error probabilities with MLD and BMD the following relation exists:

pW,MLD ≤ pW,BMD

(the equals sign applies if the code is a perfect one).

Example: For the (7,4) Hamming code (see above, t = 1) results:

pW = 1 − (7 over 0) · perr^0 · (1 − perr)^7 − (7 over 1) · perr^1 · (1 − perr)^6
   = 1 − (1 − perr)^7 − 7 perr (1 − perr)^6
   = 1 − (1 − 7 perr + 21 perr^2 − …) − 7 perr (1 − 6 perr + …)
   ≈ 21 perr^2 = (7 over 2) · perr^2
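The estimate pW ≈ 21 perr² can be checked against the exact sum (2.31c). A Python sketch:

```python
from math import comb

def p_word_bmd(n, t, p_err):
    """Word error probability with BMD (equation (2.31c))."""
    return sum(comb(n, i) * p_err**i * (1 - p_err)**(n - i)
               for i in range(t + 1, n + 1))

p = 1e-3
exact = p_word_bmd(7, 1, p)        # (7,4) Hamming code, t = 1
approx = comb(7, 2) * p**2         # pW ≈ 21 perr² for small perr
assert abs(exact - approx) / approx < 0.05
```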
page 113
2.6.3 Residual-Error Probability
Eventually, the bit error probability pbit after decoding is to be estimated:

pbit = (1/k) · ∑_{i=0}^{n} (no. of bit errors per decoded word | wH(f) = i) · p(wH(f) = i)     (2.32)

The number of bit errors is limited to the number of information bits k and to i + t, because (triangle inequality):

dH(u,û) ≤ dH(x,x̂) ≤ dH(x,y) + dH(y,x̂)  with  dH(x,y) = i  and  dH(y,x̂) ≤ t     (2.33)

⇒  pbit ≤ (1/k) · ∑_{i=t+1}^{n} min(k, i + t) · p(wH(f) = i)     (2.34)

And finally:

pbit ≤ ∑_{i=t+1}^{n} min(1, (i + t)/k) · (n over i) · perr^i · (1 − perr)^(n−i)     (2.35a)

For small error probabilities the estimate applies:

pbit ≤ min(1, dmin/k) · (n over t+1) · perr^(t+1)     (2.35b)
page 114
Coding Theory
3 Single-Error Correcting Block Codes
In this chapter, fundamental knowledge of linear block coding is imparted. Besides the mathematical dependencies and their properties, two simple code examples of great importance are considered: the general cyclic codes and the Hamming codes. Furthermore, practical approaches for the realisation of such codes by means of shift register circuits are considered.

Based on a block code with the number of levels q, the number of information bits k, the number of code word bits n and the minimum distance dmin, the general term reads:

(n,k,dmin)q block code

In this chapter, only binary codes are considered (q = 2); therefore, the index is dropped. In most terms, instead of the minimum distance the number of control bits m is used, or both are dropped:

(n,k,m) block code and (n,k) block code, respectively
page 115
3 Single-Error Correcting Block Codes
An essential indicator is the classification into systematic / non-systematic codes.

With systematic codes, information bits and control bits can be separated from each other:

c = (u, p)     (3.1)

u:  u0 u1 u2 u3 ... uk−1
     ↓  ↓  ↓  ↓       ↓
c:  c0 c1 c2 c3 ... ck−1 | ck ck+1 ... cn−1
    (information bits)     (control bits)

m = n − k     (3.2)

By contrast, with non-systematic codes information bits and control bits cannot be separated. However, block codes can always be converted (e.g. by transposing bits) to equivalent systematic codes.
page 116
3 Single-Error Correcting Block Codes
3.1 Mathematical Principles
Calculating in the binary number system (modulo 2):

Addition:          Multiplication:
⊕ | 0 1            ⊗ | 0 1
0 | 0 1            0 | 0 0
1 | 1 0            1 | 0 1

In an equation, a modulo calculation is often made explicit by stating it, e.g.: 1 + 1 = 0 mod 2. Mathematically, this number system with its calculation rules represents the simplest example of a field K (chapter 4). Practically, addition and multiplication can be realised by an XOR and an AND gate, respectively.

Binary words form so-called n-tuples of elements of K^n (the space of all n-digit bit sequences). The addition in K^n is effected component by component,

example from K^4: 0011 + 0101 = 0110,

whereas the multiplication by elements of K is simply defined as:

0·b = 0 and 1·b = b for all b ∈ K^n.
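The correspondence between modulo-2 arithmetic and the XOR/AND gates mentioned above can be written down directly. A minimal sketch:

```python
# Modulo-2 addition is XOR, modulo-2 multiplication is AND
def add2(a, b):
    return a ^ b

def mul2(a, b):
    return a & b

assert add2(1, 1) == 0     # 1 + 1 = 0 mod 2
assert mul2(1, 1) == 1

# Component-wise addition in K^4: 0011 + 0101 = 0110
x, y = (0, 0, 1, 1), (0, 1, 0, 1)
assert tuple(add2(a, b) for a, b in zip(x, y)) == (0, 1, 1, 0)
```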
page 117
3.1 Mathematical Principles
Representation of block codes as a matrix:
Let u be an information vector of the length k and let x be the respective code word vector of a block code of the length n. Then u and x are related by

x = u G,     (3.3)

where G is the generator matrix (k × n matrix) of the block code. If a code can be described by this matrix multiplication for any u, it is linear. The rows of the generator matrix have to be linearly independent.

For systematic block codes, G has the form

G = [Ik P]     (3.4)

with the k × k unit matrix Ik and the k × (n − k) control bit matrix P.
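The coding rule x = u G can be sketched in a few lines (the (7,4) control-bit matrix P below is an assumed illustration, not a code taken from the script):

```python
def encode(u, G):
    """Code word x = u G over GF(2); G is a list of rows."""
    n = len(G[0])
    return tuple(sum(u[i] * G[i][j] for i in range(len(u))) % 2
                 for j in range(n))

# Systematic G = [I_k P] for a (7,4) code; the P columns are assumed values
G = [(1, 0, 0, 0, 1, 1, 0),
     (0, 1, 0, 0, 0, 1, 1),
     (0, 0, 1, 0, 1, 1, 1),
     (0, 0, 0, 1, 1, 0, 1)]
x = encode((1, 0, 1, 1), G)
assert x[:4] == (1, 0, 1, 1)   # with G = [I_k P], the information bits appear unchanged
```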
page 118
3.1 Mathematical Principles
Polynomial representation of block codes:
An n-digit code word c can be represented as a polynomial c(x) as follows:

c = (c0 c1 ... cn−1)  ⇔  c(x) = ∑_{i=0}^{n−1} ci x^i     (3.5)

     c      |  c(x)
0000 ... 0  |  0
1000 ... 0  |  1
0100 ... 0  |  x
0010 ... 0  |  x^2
000 ... 01  |  x^(n−1)
1111 ... 1  |  1 + x + x^2 + ... + x^(n−1)
1010 ... 0  |  1 + x^2

Examples:
c1 = 0011 ⇔ c1(x) = x^2 + x^3;
c2 = 0101 ⇔ c2(x) = x + x^3
(see also the table for further polynomial examples of code words of the length n).

The addition of vectors corresponds to the addition of polynomials.
Example: c1 + c2 = 0110 ⇔ c1(x) + c2(x) = x^2 + x^3 + x + x^3 = x + x^2.
page 119
3.1 Mathematical Principles
Weighting function:
Let C be a block code and let Ai be the number of code words of C that have the weight i (0 ≤ i ≤ n). Then the weighting functions A(z) and W(x,y), respectively, are defined as:

A(z) = ∑_{i=0}^{n} Ai z^i     (3.6a)
W(x,y) = ∑_{i=0}^{n} Ai x^(n−i) y^i     (3.6b)

Between these two notations the following relations exist:

A(z) = W(1,z)     (3.7a)
W(x,y) = x^n · A(y/x)     (3.7b)

The weighting function can be calculated in closed form only for a small number of codes. Some codes have a symmetrical weight distribution: Ai = An−i. The weight distribution makes the accurate calculation of the residual-error probability possible (see chapter 2). For the weight distribution of a linear code with the minimum distance dmin the following applies: A0 = 1; An ≤ 1 and Ai = 0 for 0 < i < dmin.
page 120
3.1 Mathematical Principles
Example: (4,3) parity check

Code word | Information | Control bit | Weight
   c0     |     000     |      0      |   0
   c1     |     001     |      1      |   2
   c2     |     010     |      1      |   2
   c3     |     011     |      0      |   2
   c4     |     100     |      1      |   2
   c5     |     101     |      0      |   2
   c6     |     110     |      0      |   2
   c7     |     111     |      1      |   4

The weight distribution from the table results in the weighting function:

A(z) = 1 + 6z^2 + z^4
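The weight distribution of a small code can be obtained by enumerating all information words. A sketch for the (4,3) parity check (the systematic generator matrix is an assumed form):

```python
from itertools import product
from collections import Counter

def encode(u, G):
    """Code word x = u G over GF(2)."""
    n = len(G[0])
    return tuple(sum(u[i] * G[i][j] for i in range(len(u))) % 2
                 for j in range(n))

# (4,3) parity check: the control bit makes the overall parity even
G = [(1, 0, 0, 1),
     (0, 1, 0, 1),
     (0, 0, 1, 1)]
weights = Counter(sum(encode(u, G)) for u in product((0, 1), repeat=3))
A = {i: weights.get(i, 0) for i in range(5)}
assert A == {0: 1, 1: 0, 2: 6, 3: 0, 4: 1}   # A(z) = 1 + 6 z^2 + z^4
```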
page 121
3 Single-Error Correcting Block Codes
3.2 Terms and Features of Linear Block Codes
For a linear block code C, which is described by equation (3.3), the following applies:

– Every code word is a linear combination of rows of the generator matrix G;
– The code is composed of all linear combinations of rows of G;
– The sum of two code words is again a code word; and
– The zero vector (0 0 0 ... 0) is included in the code.

Fundamental row operations on G do not change the code, that means the code space C remains constant:

– Permutation of two rows,
– Multiplication of a row by a scalar not equal to 0 and
– Addition of one row to another.

However, the assignment of the information words u to the code words x changes.
page 122
3.2 Terms and Features of Linear Block Codes
Parity-check matrix:
For a linear block code C, a parity-check matrix H is defined by:

c H^T = 0 for all c ∈ C  and  x H^T ≠ 0 for all x ∉ C     (3.8a/b)

H is an (n − k) × n matrix. It applies:

0 = c H^T = (u G) H^T = u (G H^T)  →  G H^T = 0     (3.9)

That means that the generator matrix G and the parity-check matrix H are orthogonal. As for G, fundamental row operations are also allowed for H. If C is a systematic code with G = (Ik P), then for H applies:

H = (P^T In−k)     (3.10)

For this case, orthogonality can be proven as follows (H^T consists of the block P stacked over In−k):

G H^T = (Ik P) · ( P over In−k ) = Ik P + P In−k = P + P = 0
page 123
3.2 Terms and Features of Linear Block Codes
Syndrome:
Let H be the parity-check matrix of a linear (n,k,m) block code and x ∈ C any code word to be sent. During the transmission, an error word f is superposed so that the receive word y reads:

y = x + f.     (3.11)

For the product of y and H^T then applies:

y H^T = (x + f) H^T = x H^T + f H^T = 0 + f H^T = f H^T = s     (3.12)

The product y H^T is referred to as the syndrome s; the represented calculation is the calculation of the syndrome in matrix notation. The features of s are:
– s is a zero vector only if y is a code word,
– all those error vectors f are recognised that are not code words, and
– s is independent of the code word.
page 124
3.2 Terms and Features of Linear Block Codes
Dual code:
Based on a code C with the generator matrix G and the parity-check matrix H, a dual code Cd, described by its dual generator matrix Gd and its dual parity-check matrix Hd, is defined as:

Gd = H  and  Hd = G     (3.13a/b)

The code words of the two codes are orthogonal. Using c = u G ∈ C and cd = v H ∈ Cd, the following applies:

cd c^T = (v H)(u G)^T = v H G^T u^T = v (G H^T)^T u^T = v 0 u^T = 0

Examples of codes that are dual to each other are the repetition code (index W) and the parity check where the first digit is the control bit (index P):

GW = ( 1 | 1 1 ... 1 ) = HP

HW = ( 1 | 1 0 ... 0 )
     ( 1 | 0 1 ... 0 )   = GP
     ( ⋮ |     ⋱     )
     ( 1 | 0 0 ... 1 )
page 125
3.2 Terms and Features of Linear Block Codes
Weighting function of the dual code (MacWilliams identity):
If A(z) is the weighting function of an (n,k) block code, then the weighting function Ad(z) of the corresponding dual code is:

Ad(z) = 2^(−k) · (1 + z)^n · A( (1 − z)/(1 + z) )     (3.14a)
A(z) = 2^(−(n−k)) · (1 + z)^n · Ad( (1 − z)/(1 + z) )     (3.14b)

Wd(x,y) = 2^(−k) · W(x + y, x − y)     (3.15a)
W(x,y) = 2^(−(n−k)) · Wd(x + y, x − y)     (3.15b)

Example: For the (3,2) parity check, the code C is C = {000, 011, 101, 110}. The weighting function reads: A(z) = 1 + 3z^2.
The corresponding dual code is Cd = {000, 111} (repetition code). Its weighting function is calculated to be:

Ad(z) = 2^(−2) · (1 + z)^3 · [ 1 + 3 · ( (1 − z)/(1 + z) )^2 ] = 1 + z^3
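The MacWilliams identity (3.14a) can be verified numerically for the example above. A sketch evaluating both sides at sample points:

```python
def A(z):          # weighting function of the (3,2) parity check
    return 1 + 3 * z**2

def A_dual(z):     # MacWilliams identity (3.14a) with n = 3, k = 2
    return 2**(-2) * (1 + z)**3 * A((1 - z) / (1 + z))

# the dual code is the (3,1) repetition code with A_d(z) = 1 + z^3
for z in (0.0, 0.25, 0.5, 2.0):
    assert abs(A_dual(z) - (1 + z**3)) < 1e-12
```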
page 126
3.2 Terms and Features of Linear Block Codes
Modifications of linear codes:An (n,k,dmin)q code can be modified to an (n',k',dmin')q code as follows:
– Extending: appending additional control bitsn' > n, k' = k, m' > m, RC' < RC, dmin' ≥ dmin
– Puncturing: reduction of control bitsn' < n, k' = k, m' < m, RC' > RC, dmin' ≤ dmin
– Lengthening: appending additional information bitsn' > n, k' > k, m' = m, RC' > RC, dmin' ≤ dmin
– Shortening: reduction of information bitsn' < n, k' < k, m' = m, RC' < RC, dmin' ≥ dmin
By adding a parity check bit, the extension of a code with an odd minimum distance can increase the minimum distance by 1, so that the code can detect one further error (example: see Hamming codes).
page 127
3 Single-Error Correcting Block Codes
3.3 Cyclic Codes
Definition of cyclic codes:
A linear (n,k) block code is called cyclic if every cyclic shift of a code word results in a code word:

(c0, c1, ... , cn−2, cn−1) ∈ C  ⇒  (cn−1, c0, c1, ... , cn−2) ∈ C     (3.16)

The set of cyclic codes is a subset of the linear block codes. Accordingly, between an information word

u = (u0, u1, u2, ... , uk−1)

and a code word

c = (c0, c1, c2, ... , cn−1)

the following relation exists:

c = u G.     (3.17)

Thus, a cyclic code is also described by a generator matrix G. Thereby, every cyclic code has at least one generator matrix with band structure (see equation (3.18) and the following example).
page 128
3.3 Cyclic Codes
Example of a cyclic code:

G = ( 1 1 0 1 0 0 0
      0 1 1 0 1 0 0
      0 0 1 1 0 1 0
      0 0 0 1 1 0 1 )

  u   |    x           u   |    x
 0000 | 0000000       1011 | 1111111
 1000 | 1101000       1010 | 1110010
 0100 | 0110100       0101 | 0111001
 0010 | 0011010       1100 | 1011100
 0001 | 0001101       0110 | 0101110
 1110 | 1000110       0011 | 0010111
 0111 | 0100011       1111 | 1001011
 1101 | 1010001       1001 | 1100101

Due to the cyclic property, the code can easily be derived. In this case it is sufficient to know the code words {0000000, 1101000, 1110010, 1111111}. At first, however, these have to be found (using linear combinations of rows of G plus the zero word). The remaining words result from cyclic shifts.
Stop criterion: 2^k words found.
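The derivation described above — cyclic shifts of a few representative code words — can be mechanised. A sketch:

```python
def cyclic_shift(c):
    """One cyclic shift: (c0, ..., cn-1) -> (cn-1, c0, ..., cn-2)."""
    return c[-1:] + c[:-1]

# Closure under cyclic shifts, starting from the representatives on the slide
seeds = {"0000000", "1101000", "1110010", "1111111"}
code = set()
for s in seeds:
    c = tuple(int(b) for b in s)
    for _ in range(7):
        code.add(c)
        c = cyclic_shift(c)

assert len(code) == 16                               # 2^k = 16 code words found
assert all(cyclic_shift(c) in code for c in code)    # cyclic property holds
```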
page 129
3.3 Cyclic Codes
Generator matrix with band structure:

G = ( g0 g1 g2 ... gn−k  0   0  ...  0
       0  g0 g1 g2 ... gn−k  0  ...  0
       ⋮               ⋱              ⋮
       0  ...  0  g0 g1 g2 ... gn−k )     (3.18)

Caution: Generator matrices with band structure do not necessarily describe cyclic codes!
Example: C = {00000, 11010, 01101, 10111} with the generator matrix

G = ( 1 1 0 1 0
      0 1 1 0 1 )

is not a cyclic code.
page 130
3.3 Cyclic Codes
Generator polynomial:
The generator matrix G of a cyclic code is thus characterised completely by one row of G. It contains at most n − k + 1 entries different from 0: g0, g1, g2, ... , gn−k. These coefficients provide a special polynomial:

g(x) = g0 + g1 x + g2 x^2 + ... + gn−k x^(n−k)     (3.19a)

g(x) is called the generator polynomial. With binary codes, the coefficients of the generator polynomial can only take the values 0 or 1. For an (n,k) block code, the following must additionally apply:

g0 = gn−k = 1  ⇒  g(x) = 1 + g1 x + g2 x^2 + ... + x^(n−k)     (3.19b)

Using polynomials, a simplified description of the generator matrix, message and code vectors results for cyclic codes.
page 131
3.3 Cyclic Codes
The polynomial and the matrix notation are listed below side by side:

Information words:
u = (u0, u1, u2, ... , uk−1)     u(x) = u0 + u1 x + u2 x^2 + ... + uk−1 x^(k−1)     (3.20a/b)

Code words:
c = (c0, c1, c2, ... , cn−1)     c(x) = c0 + c1 x + c2 x^2 + ... + cn−1 x^(n−1)     (3.20c/d)

Coding rules:
c = u G     c(x) = u(x) · g(x)     (3.20e/f)

With the equations (3.18/19b), both rules provide the same results:
c0 = u0 g0
c1 = u0 g1 + u1 g0
c2 = u0 g2 + u1 g1 + u2 g0
⋮

Consequently, the general presentation of a code word polynomial is:
c(x) = (u0 + u1 x + u2 x^2 + ... + uk−1 x^(k−1)) · g(x)
⇒ g(x) is the code word polynomial with the smallest degree greater than zero.
page 132
3.3 Cyclic Codes
Period of a generator polynomial:
Let g(x) be the generator polynomial of a cyclic code C, and let r be the smallest exponent for which

x^r + 1 = 0 mod g(x)     (3.21)

applies. g(x) is then a divisor of x^r + 1, and r is the period of g(x). A cyclic feature only exists if for the code word length n the following applies:

n = r     (3.22)

For the proof, the following theorem is applied, which is quite plausible using equation (3.20f):

Theorem: Every code word polynomial c(x) is divisible by g(x) without remainder.

Example: g(x) = 1 + x + x^3, u(x) = 1 + x^2 + x^3
⇒ c(x) = u(x) · g(x) = 1 + x + x^2 + x^3 + x^4 + x^5 + x^6 ⇔ c = (1111111)
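The multiplication c(x) = u(x)·g(x) over GF(2) from the example can be sketched with coefficient lists (index = power of x):

```python
def poly_mul_gf2(a, b):
    """Multiply two GF(2) polynomials given as coefficient lists
    (index = power of x)."""
    res = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            res[i + j] ^= ai & bj      # addition mod 2 is XOR
    return res

g = [1, 1, 0, 1]        # g(x) = 1 + x + x^3
u = [1, 0, 1, 1]        # u(x) = 1 + x^2 + x^3
c = poly_mul_gf2(u, g)
assert c == [1, 1, 1, 1, 1, 1, 1]   # c(x) = 1 + x + ... + x^6 ⇔ c = (1111111)
```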
page 133
3.3 Cyclic Codes
Proof of the condition (3.22): It shall apply that
c = (c0, c1, c2, ... , cn−1) ∈ C ⇒ d = (cn−1, c0, c1, ... , cn−2) ∈ C.

c(x) = c0 + c1 x + c2 x^2 + ... + cn−1 x^(n−1) is divisible by g(x) without remainder:
c(x) = 0 mod g(x) ⇒ x · c(x) = 0 mod g(x)

d(x) = cn−1 + c0 x + c1 x^2 + ... + cn−2 x^(n−1)
     = cn−1 + x (c0 + c1 x + c2 x^2 + ... + cn−1 x^(n−1)) + cn−1 x^n
     = x · c(x) + cn−1 (1 + x^n)

d is a code word only if d(x) is divisible by g(x) without remainder. Therefore, the following has to apply:

1 + x^n = 0 mod g(x).     (3.23)

Only if the generator polynomial fulfils condition (3.23) does a cyclic code result.
page 134
3.3 Cyclic Codes
Determination of a generator polynomial:
g(x) is a divisor of 1 + x^n. In general, this means that 1 + x^n can be factorised; every factor can then be used as a generator polynomial.

Example: x^7 + 1 = (1 + x) ⋅ (1 + x + x^3) ⋅ (1 + x^2 + x^3)

Calculation of the period of a given generator polynomial: This is done by reversing the polynomial division algorithm. The polynomial is shifted in such a way that the lowest remaining powers always cancel each other out:

Example: (x^3 + x + 1) ⋅ p(x) = x^n + 1

x^3 + 0 + x + 1     | ⋅ 1
x^4 + 0 + x^2 + x   | ⋅ x
x^5 + 0 + x^3 + x^2 | ⋅ x^2    ⇒ p(x) = x^4 + x^2 + x + 1 = h(x)
x^7 + 0 + x^5 + x^4 | ⋅ x^4
―――――――――――――――――――
x^7 + 0 + 0 + 0 + 0 + 0 + 0 + 1 ⇒ n = 7
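The shift-and-cancel procedure above amounts to testing x^r + 1 for divisibility by g(x) for increasing r. A small sketch (not part of the lecture; GF(2) polynomials are encoded as integer bitmasks, bit i holding the coefficient of x^i):

```python
def gf2_mod(a: int, b: int) -> int:
    """Remainder of the GF(2) polynomial division a(x) mod b(x).

    Polynomials are integer bitmasks: bit i = coefficient of x^i."""
    db = b.bit_length() - 1
    while a and a.bit_length() - 1 >= db:
        a ^= b << (a.bit_length() - 1 - db)   # cancel the leading term
    return a

def period(g: int) -> int:
    """Smallest r with x^r + 1 = 0 mod g(x), i.e. the period of g(x)."""
    r = 1
    while gf2_mod((1 << r) | 1, g) != 0:
        r += 1
    return r

print(period(0b1011))   # g(x) = 1 + x + x^3  ->  period r = 7
```

For g(x) = 1 + x + x^3 this reproduces r = n = 7 from the example.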
Check matrix and check polynomial:
For the generator polynomial g(x) of a cyclic (n,k) block code the following applies:

x^n + 1 = g(x) ⋅ h(x). (3.24)

h(x) is the check polynomial and has the degree k:

h(x) = h0 + h1 x + h2 x^2 + ... + hk x^k (3.25a)

with h0 = hk = 1. (3.25b)

The respective (n − k) × n parity-check matrix H is composed of the reversed coefficient sequence of h(x), shifted by one position from row to row:

    ⎡ hk hk−1 ⋯ h1 h0  0   0  ⋯  0 ⎤
    ⎢ 0  hk  hk−1 ⋯ h1 h0   0  ⋯  0 ⎥
H = ⎢ ⋮      ⋱          ⋱         ⋮ ⎥
    ⎣ 0  ⋯   0   hk hk−1  ⋯  h1 h0 ⎦
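Equation (3.24) yields the check polynomial by exact division, h(x) = (x^n + 1)/g(x). A sketch of the division with quotient (an illustrative aside, not part of the lecture; integer bitmasks, bit i = coefficient of x^i):

```python
def gf2_divmod(a: int, b: int):
    """Quotient and remainder of the GF(2) polynomial division a(x) / b(x).

    Polynomials are integer bitmasks: bit i = coefficient of x^i."""
    q, db = 0, b.bit_length() - 1
    while a and a.bit_length() - 1 >= db:
        shift = a.bit_length() - 1 - db
        q |= 1 << shift            # record the quotient term x^shift
        a ^= b << shift            # cancel the leading term of a(x)
    return q, a

# h(x) = (x^7 + 1) / g(x) for g(x) = 1 + x + x^3
h, rem = gf2_divmod((1 << 7) | 1, 0b1011)
print(bin(h), rem)   # 0b10111 0  ->  h(x) = 1 + x + x^2 + x^4
```

The result agrees with the period calculation on the previous slide, where p(x) = h(x) = x^4 + x^2 + x + 1.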
Systematic coding:
In general, cyclic codes are not systematic. In order to achieve a systematic code, the following construction rule is used:

1. Multiplication of the message polynomial u(x) by x^(n−k):
   u(x) ⋅ x^(n−k) = u0 x^(n−k) + u1 x^(n−k+1) + u2 x^(n−k+2) + ... + u(k−1) x^(n−1) (3.26a)
   (corresponds to shifting the message digits to the highest coefficients).

2. Division of the polynomial u(x) ⋅ x^(n−k) by g(x):
   u(x) ⋅ x^(n−k) = q(x) ⋅ g(x) + r(x) (3.26b)
   with q(x) as the integer division result and r(x) as the division remainder (in general r(x) ≠ 0 and r(x) = r0 + r1 x + r2 x^2 + ... + r(n−k−1) x^(n−k−1)).

3. Divisibility without remainder is enforced by addition of r(x):
   u(x) ⋅ x^(n−k) + r(x) = q(x) ⋅ g(x) = c(x).

Resulting code word: c = (r0, r1, r2, ... , r(n−k−1), u0, u1, u2, ... , u(k−1)) (3.27)
Example:
Using the cyclic (7,4) block code with the generator polynomial
g(x) = x^3 + x + 1,
a systematic code word is to be developed for the information word
u = (1 0 0 1), i.e. u(x) = x^3 + 1.

1. For n = 7 and k = 4, n − k = 3. From this results: u(x) ⋅ x^(n−k) = x^6 + x^3.
2. Division of the polynomial u(x) ⋅ x^(n−k) by g(x):

   (x^6 + 0 + 0 + x^3) : (x^3 + 0 + x + 1) = x^3 + x + (x^2 + x)/(x^3 + x + 1)
 + (x^6 + 0 + x^4 + x^3)
   ――――――――――――――――――――
   ( 0 + 0 + x^4 + 0 + 0 + 0)
 + (       x^4 + 0 + x^2 + x)
   ――――――――――――――――――――
   ( 0 + 0 + x^2 + x) = r(x)

3. c(x) = u(x) ⋅ x^(n−k) + r(x) = x^6 + x^3 + x^2 + x ⇒ c = (0 1 1 1 0 0 1)
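The three construction steps can be mirrored one-to-one (an illustrative sketch, not part of the lecture; GF(2) polynomials as integer bitmasks, bit i = coefficient of x^i):

```python
def gf2_mod(a: int, b: int) -> int:
    """a(x) mod b(x) over GF(2); polynomials as integer bitmasks."""
    db = b.bit_length() - 1
    while a and a.bit_length() - 1 >= db:
        a ^= b << (a.bit_length() - 1 - db)
    return a

def encode_systematic(u: int, g: int, n: int, k: int) -> int:
    shifted = u << (n - k)        # step 1: u(x) * x^(n-k)
    r = gf2_mod(shifted, g)       # step 2: division remainder r(x)
    return shifted ^ r            # step 3: c(x) = u(x)*x^(n-k) + r(x)

# u(x) = 1 + x^3, g(x) = 1 + x + x^3, (7,4) code from the example
c = encode_systematic(0b1001, 0b1011, 7, 4)
bits = [(c >> i) & 1 for i in range(7)]     # coefficients (c0, ..., c6)
print(bits)   # [0, 1, 1, 1, 0, 0, 1]  ->  c = (0 1 1 1 0 0 1)
```

The printed vector matches the code word derived by hand above.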
Decoding of cyclic codes:
The receive vector y and the receive polynomial y(x), respectively, are to be checked:

y = c + f ⇔ y(x) = c(x) + f(x). (3.28)

Every code word c(x) is divisible by the generator polynomial g(x) without remainder; the division then yields the original information word u, which can thus be recovered. For an arbitrary y, this method therefore offers a way to check whether or not y is a code word. In case of an error, the division leaves a remainder:

y(x) / g(x) = q(x) + s(x) / g(x). (3.29a)

s(x) represents the syndrome in polynomial notation:

y(x) = q(x) ⋅ g(x) + s(x). (3.29b)

If s(x) ≠ 0, the error correction takes place by means of a syndrome table. The polynomial division is realised in practice by shift register circuits. It is a great advantage that no matrix multiplications are required for coding and decoding.
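The check of eq. (3.29) amounts to a single polynomial remainder; a sketch (not part of the lecture; integer bitmasks, bit i = coefficient of x^i; a shift register would compute the same remainder serially):

```python
def syndrome(y: int, g: int) -> int:
    """s(x) = y(x) mod g(x); the word is a code word iff s(x) = 0."""
    db = g.bit_length() - 1
    while y and y.bit_length() - 1 >= db:
        y ^= g << (y.bit_length() - 1 - db)
    return y

g = 0b1011          # g(x) = 1 + x + x^3
c = 0b1001110       # c(x) = x^6 + x^3 + x^2 + x, code word of the (7,4) code
print(syndrome(c, g))              # 0: error-free word
print(syndrome(c ^ (1 << 4), g))   # nonzero: single error in position 4
```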
3.4 Hamming Codes
Definition of the Hamming codes:
Let m ≥ 3. A Hamming code is a code with code words of length n = 2^m − 1, consisting of k = 2^m − m − 1 information bits and m check bits. The minimum distance is 3.

Thus, the Hamming code is a (2^m − 1, 2^m − 1 − m) block code. For the code, inter alia, the following values for n, k and m result:

m | 3 | 4  | 5  | 6  | 7   | 8
n | 7 | 15 | 31 | 63 | 127 | 255
k | 4 | 11 | 26 | 57 | 120 | 247

Due to its minimum distance, the Hamming code can correct exactly one error; indeed, it is constructed precisely for this case.
Derivation of the construction rule for Hamming codes:
Aim: construction of a code that can correct exactly one error.

The syndrome s of length m depends only on the error vector f of length n (eq. (3.12)). A single error at position i (fi = 1; 0 ≤ i ≤ n − 1) results in a syndrome equal to the respective row of H^T. In order to distinguish all single-error positions i unambiguously, all rows of H^T have to differ from each other; none of these rows may be the zero vector.

All in all, there are 2^m − 1 different rows without the zero vector, thus n = 2^m − 1. From n = k + m follows k = 2^m − m − 1. In effect, the rows of H^T (the columns of H, respectively) are formed by all possible bit sequences except the zero vector. In doing so, the m = n − k check bits address the 2^m − 1 positions.
Since it is irrelevant which syndrome describes which error position, the bit sequences may be permuted. In doing so, the parity-check matrix can be arranged in such a way that the resulting Hamming code is systematic:

H = (P^T | I(n−k)) (3.30)

Here, P^T consists of all possible column vectors with more than one "1", and I(n−k) is the (n−k) × (n−k) identity matrix. The generator matrix can then be derived easily using equation (3.4).
Example: For the (7,4) Hamming code the following systematic notation results:

G = (I4 | P) =
⎡ 1 0 0 0 | 1 1 1 ⎤
⎢ 0 1 0 0 | 1 1 0 ⎥
⎢ 0 0 1 0 | 1 0 1 ⎥
⎣ 0 0 0 1 | 0 1 1 ⎦

H = (P^T | I3) =
⎡ 1 1 1 0 | 1 0 0 ⎤
⎢ 1 1 0 1 | 0 1 0 ⎥
⎣ 1 0 1 1 | 0 0 1 ⎦

With this, the following systematics with the respective equations for the calculation of the control bits results:

(u0 u1 u2 u3) → (c0 c1 c2 c3 c4 c5 c6)

control bits:
c4 = c0 + c1 + c2
c5 = c0 + c1 + c3
c6 = c0 + c2 + c3
Example (continuation):
Table of code words of the (7,4) Hamming code:

Code word | Information | Control bits
c0  | 0000 | 000
c1  | 0001 | 011
c2  | 0010 | 101
c3  | 0011 | 110
c4  | 0100 | 110
c5  | 0101 | 101
c6  | 0110 | 011
c7  | 0111 | 000
c8  | 1000 | 111
c9  | 1001 | 100
c10 | 1010 | 010
c11 | 1011 | 001
c12 | 1100 | 001
c13 | 1101 | 010
c14 | 1110 | 100
c15 | 1111 | 111
Error correction with Hamming codes:
Example (continuation): error correction using the (7,4) Hamming code.

The respective syndrome reads in general: s = (s0 s1 s2). From the parity-check matrix, the check equations result:

s0 = y0 + y1 + y2 + y4
s1 = y0 + y1 + y3 + y5
s2 = y0 + y2 + y3 + y6

These are analysed according to the table of syndromes:

Error position    | s0 s1 s2
No error          | 0 0 0
Error in 0th pos. | 1 1 1
Error in 1st pos. | 1 1 0
Error in 2nd pos. | 1 0 1
Error in 3rd pos. | 0 1 1
Error in 4th pos. | 1 0 0
Error in 5th pos. | 0 1 0
Error in 6th pos. | 0 0 1

The decoding steps are then:
1. Evaluation of the check equations → syndrome;
2. Determination of the error position from the table of syndromes;
3. Realisation of the error correction: add „1“ at the position of error.
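The three decoding steps can be sketched directly (illustrative Python, not the lecture's shift-register realisation; H_COLS holds the columns of H from the example above):

```python
# Columns of H for the (7,4) Hamming code in the systematic form above
H_COLS = [(1, 1, 1), (1, 1, 0), (1, 0, 1), (0, 1, 1),
          (1, 0, 0), (0, 1, 0), (0, 0, 1)]

def decode(y):
    """Correct at most one error in the 7-bit receive vector y = (y0..y6)."""
    # step 1: evaluate the check equations -> syndrome (s0, s1, s2)
    s = tuple(sum(y[i] * H_COLS[i][j] for i in range(7)) % 2 for j in range(3))
    y = y[:]
    if s != (0, 0, 0):
        pos = H_COLS.index(s)   # step 2: error position from the table
        y[pos] ^= 1             # step 3: add "1" at the error position
    return y

c = [1, 0, 0, 1, 1, 0, 0]       # code word c9: information 1001, control 100
y = c[:]; y[3] ^= 1             # single error in the 3rd position
print(decode(y) == c)           # True
```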
Minimum distance of general Hamming codes:
All columns of the check matrix are different from each other. With this, any two columns are linearly independent. However, since all possible bit combinations occur as columns, three linearly dependent columns can always be found, e.g.:

100000..., 010000... and 110000...

With this, the minimum distance is always

dmin = 3 (3.31a)

and the number of correctable errors

t = 1. (3.31b)

Weighting function of general Hamming codes:

A(z) = 1/(n+1) ⋅ [ (1 + z)^n + n ⋅ (1 + z)^((n−1)/2) ⋅ (1 − z)^((n+1)/2) ] (3.32)
Word error probability and probability of undetected errors of Hamming codes:
On pages 109 and 111, both probabilities have already been calculated for a (7,4) Hamming code. For n equal to 3, 7, 15, 31, 63 and 127 they are depicted in the adjoining diagram.

[Figure: lg pW and lg pue versus lg perr (abscissa −6 ... 0, ordinate −12 ... 0), curves for n = 3, 7, 15, 31, 63 and 127, with the reference lines pW = perr, pW = perr^2 and pW = perr^3.]
Extended Hamming code:
Through evaluation of the syndrome, one error is correctable. With a minimum distance of 3, however, two errors are also detectable, since they likewise result in s ≠ 0; it cannot be distinguished, though, whether one or two errors have occurred.

The extended Hamming code differentiates between the 1-error and the 2-error situation by using an additional control bit. Therefore, it is a (2^m, 2^m − 1 − m) block code. The respective generator matrix of the extended Hamming code reads as follows:

G(H,ext) = ( G(H) | p ) (3.33)

where the appended column p supplies each row of G(H) with an overall parity bit, so that every row of G(H,ext) has even weight.
Possible error events are then:
– no error: s = 0;
– one error: s ≠ 0, sm+1 = 1;
– two errors: s ≠ 0, sm+1 = 0.

Decoding is then carried out according to the following rules:
– s = 0: receive vector = code word;
– s ≠ 0, sm+1 = 1: odd number of errors ⇒ one error assumed and corrected by syndrome evaluation;
– s ≠ 0, sm+1 = 0: even number of errors ⇒ errors cannot be corrected.
3.5 Realisation of Coding and Decoding Using Shift Register Circuits
Basic circuits:

[Figures: a chain of memory cells S0, S1, ..., Sm−1; an adder (inputs a, b, output a + b); a multiplier and a scaling element (output a ⋅ b); a delay chain computing x ⋅ a(x) from a(x).]

Shift registers consist of memory cells, each delaying its input by one clock pulse. The content of memory cell j at time i is referred to as Sj(i). With this, the following applies:

Sj+1(i+1) = Sj(i). (3.34)
Cyclic shifting of a polynomial:
Shift register circuits are an excellent way to realise cyclic coding; all mathematical operations required for it can be carried out with this technique. One of these operations is the cyclic shifting of a polynomial:

[Figure: feedback shift register with cells x^0, x^1, x^2, ..., x^(n−3), x^(n−2), x^(n−1).]

With this circuit,

x^i ⋅ a(x) mod (x^n + 1) (3.35)

is calculated; it forms the simplest example of a feedback (regenerative) shift register.
Multiplication of two polynomials:
An equation of the following form is to be calculated:

c(x) = a(x) ⋅ [b0 + b1 x + b2 x^2 + ... + bn x^n]. (3.36)

[Figure: shift register multiplier with input a(x), tap coefficients b0, b1, ..., bn−1, bn and output c(x); an alternative structure with the taps attached in the reverse arrangement performs the same multiplication.]
Example of a multiplication:
The following is to be calculated:

c(x) = a(x) ⋅ b(x) = (x^4 + x^2 + x + 1) ⋅ (x^3 + x^2 + 1) = x^7 + x^6 + x^5 + x^4 + x + 1

with tap coefficients b0 = 1, b1 = 0, b2 = 1, b3 = 1, input sequence 10111 and output sequence 11110011. The register contents are s0 s1 s2 s3:

i  | s0 s1 s2 s3 | ci
−1 | 0 0 0 0 | 0
0  | 1 0 0 0 | 1
1  | 1 1 0 0 | 1
2  | 1 1 1 0 | 0
3  | 0 1 1 1 | 0
4  | 1 0 1 1 | 1
5  | 0 1 0 1 | 1
6  | 0 0 1 0 | 1
7  | 0 0 0 1 | 1
8  | 0 0 0 0 | 0
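The clocking of the multiplier can be reproduced with a short simulation (an illustrative sketch, not the circuit itself; coefficient lists are ordered lowest power first, and the register is flushed with m zeros after the last input):

```python
def sr_multiply(a, b):
    """Simulate the shift-register multiplier of eq. (3.36) over GF(2).

    a, b are coefficient lists (lowest power first); one coefficient of
    a(x) is clocked in per pulse and one coefficient of
    c(x) = a(x)*b(x) appears at the output per pulse."""
    m = len(b) - 1
    s = [0] * m                       # memory cells (delayed input bits)
    out = []
    for u in a + [0] * m:             # flush the register with m zeros
        taps = [u] + s
        out.append(sum(t * bi for t, bi in zip(taps, b)) % 2)
        s = [u] + s[:-1]              # clock: shift the register
    return out

a = [1, 1, 1, 0, 1]    # a(x) = 1 + x + x^2 + x^4
b = [1, 0, 1, 1]       # b(x) = 1 + x^2 + x^3
print(sr_multiply(a, b))   # [1, 1, 0, 0, 1, 1, 1, 1] = coefficients c0..c7
```

The output sequence matches the ci column of the table above.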
Division of two polynomials:

[Figure: feedback shift register divider with feedback taps b0, b1, ..., bm−1; input a(x^−1), output q(x^−1).]

Based on the circuit, the following equation is calculated:

q(x^−1) = a(x^−1) x^m + q(x^−1) [b0 x^m + b1 x^(m−1) + ... + bm−1 x]
⇒ a(x^−1) = q(x^−1) [b0 + b1 x^−1 + ... + bm−1 x^−(m−1) + x^−m] = q(x^−1) ⋅ b(x^−1)

Substitution x^−1 → x ⇒ a(x) = q(x) ⋅ b(x) ⇒ q(x) = a(x)/b(x). (3.37a)

The division is discontinued after the last digit of a(x^−1) has been written into the shift register. Then q(x^−1) is the output word and s(x^−1) is the content of the memory cells:

a(x)/b(x) = q(x) + s(x)/b(x) (division of two polynomials with remainder). (3.37b)
Example of a division of two polynomials with remainder:
It is calculated:

(x^8 + x^3) : (x^4 + x^3 + x + 1) = x^4 + x^3 + x^2 + (x^3 + x^2)/(x^4 + x^3 + x + 1)

[Figure: divider shift register with cells x^0 ... x^4; input sequence 000100001 (a(x^−1)), output sequence 001110000 (q(x^−1)).]

i | ai | s0 | s1 | s2 | s3 | qi
9 | 0 | 0 | 0 | 0 | 0 | 0
8 | 1 | 1+0=1 | 0+0=0 | 0 | 0+0=0 | 0
7 | 0 | 0+0=0 | 1+0=1 | 0 | 0+0=0 | 0
6 | 0 | 0+0=0 | 0+0=0 | 1 | 0+0=0 | 0
5 | 0 | 0+0=0 | 0+0=0 | 0 | 1+0=1 | 0
4 | 0 | 0+1=1 | 0+1=1 | 0 | 0+1=1 | 1
3 | 1 | 1+1=0 | 1+1=0 | 1 | 0+1=1 | 1
2 | 0 | 0+1=1 | 0+1=1 | 0 | 1+1=0 | 1
1 | 0 | 0+0=0 | 1+0=1 | 1 | 0+0=0 | 0
0 | 0 | 0+0=0 | 0+0=0 | 1 | 1+0=1 | 0
Circuit for coding of a systematic cyclic code:

[Figure: encoder shift register with feedback determined by g(x) and switches S1, S2; the information bits pass directly to the code word output while simultaneously being shifted into the register, after which the control bits are read out of the register.]
Mode of operation:
1. Reading in of the information bits u0, u1, u2, ... , uk−1, starting with uk−1. Reading in from the „right“ corresponds to the multiplication by x^(n−k). After reading in the k information bits, the shift register contains the division remainder r(x).
2. Opening of the switches S1 and S2 (breaking the feedback loop).
3. Output of the control bits.
Example:
The following has to be coded: u(x) = x^3 + x^2 + 1 with g(x) = x^3 + x + 1.

t | s1 | b0 | b1 | b2 | s2 | s3
0 | 0 | 0 | 0 | 0 | 0 | 0
1 | 1 | 1 | 1 | 0 | 1 | 1
2 | 1 | 1 | 0 | 1 | 1 | 1
3 | 0 | 1 | 0 | 0 | 1 | 0
4 | 1 | 1 | 0 | 0 | 1 | 1
5 | 0 | 1 | 0 | 0 | 0 |
6 | 0 | 0 | 1 | 0 | 0 |
7 | 0 | 0 | 0 | 0 | 1 |

(code word = information bits followed by control bits)
Circuit for the decoding:With this circuit, the receive word is divided by the generator polynomialand the syndrome is calculated.
This circuit is also applied for decoding. However, the receive word is hereread in from the right.
Meggitt decoder: decoding with error correction at the same time.

[Figure: Meggitt decoder consisting of a k-bit receive register, the syndrome logic, and a k-bit error register.]
4 Burst-Error Correcting Block Codes
With the Reed-Solomon codes (RS codes), this chapter introduces representatives of the best-known and frequently used codes capable of correcting burst errors. They are closely related to the Bose-Chaudhuri-Hocquenghem codes (BCH codes), whose strength lies rather in correcting statistically distributed single errors. The BCH codes are introduced at the end of this chapter as a special case of the RS codes.

In order to understand these two code families, knowledge of the algebra of finite fields (Galois fields) and of their properties is required. This knowledge is built up first in the following section.

A burst error of length l ≥ 1 is understood as l consecutive digits of an error vector f, the first and the last of which are not equal to zero. The values of the digits located between the first and the last digit of the error burst are arbitrary.
4.1 Fundamental Algebraic Terms for Codes

In this section, a brief introduction to the algebra required for coding theory is given. Basic knowledge of set theory is assumed. The concept of a binary function is summarised briefly.

Binary function:
On a non-empty set M, a binary function ° is understood as a mapping of two elements a, b ∈ M in such a way that:

a ° b = c. (4.1a)

Here, it is not absolutely necessary that a ° b = b ° a. (4.1b)
If condition (4.1b) is fulfilled, however, the function is commutative.
In terms of the binary function, the set is called closed, if:

a ° b ∈ M for all a, b ∈ M. (4.1c)

A binary function is called associative, if:

a ° (b ° c) = (a ° b) ° c, a, b, c ∈ M. (4.1d)
Modulo-m addition:
For a set of whole numbers M = {0, 1, 2, ... , m−1} with m > 0, let the modulo-m addition „⊕“ be defined in such a way that for two elements a, b ∈ M the following applies:

a ⊕ b = r, (4.2a)

where r is the remainder of the division of a + b by m:

a + b = q ⋅ m + r ⇔ q = ⌊(a + b)/m⌋. (4.2b)

According to eq. (4.2b), the remainder r is again an element of the set M; the set M is closed in terms of the modulo-m addition.

Example: M = {0, 1, 2, 3, 4, 5, 6} and modulo-7 addition „⊕“:

⊕ | 0 1 2 3 4 5 6
0 | 0 1 2 3 4 5 6
1 | 1 2 3 4 5 6 0
2 | 2 3 4 5 6 0 1
3 | 3 4 5 6 0 1 2
4 | 4 5 6 0 1 2 3
5 | 5 6 0 1 2 3 4
6 | 6 0 1 2 3 4 5
Modulo-p multiplication:
For a set of positive whole numbers M = {1, 2, 3, ... , p−1}, with p being a prime number, let the modulo-p multiplication „⊗“ be defined in such a way that for two elements a, b ∈ M the following applies:

a ⊗ b = r, (4.3a)

with r being the remainder of the division of a ⋅ b by p:

a ⋅ b = q ⋅ p + r ⇔ q = ⌊(a ⋅ b)/p⌋. (4.3b)

a ⋅ b is not divisible by p without remainder, because p is a prime number. The remainder is again an element of the set M; thus, M is closed.

Example with p = 7: M = {1, 2, 3, 4, 5, 6} and modulo-7 multiplication „⊗“:

⊗ | 1 2 3 4 5 6
1 | 1 2 3 4 5 6
2 | 2 4 6 1 3 5
3 | 3 6 2 5 1 4
4 | 4 1 5 2 6 3
5 | 5 3 1 6 4 2
6 | 6 5 4 3 2 1
4.1.1 Groups

A set G and a function ° are referred to as a group (G, °), if the following conditions for any a, b, c ∈ G are fulfilled:
– the function ° is associative,
– the set G is closed,
– the set G contains a neutral element e with the property
  a ° e = e ° a = a with e ∈ G, and (4.4a)
– for every element a an inverse element a′ ∈ G exists, so that:
  a ° a′ = a′ ° a = e. (4.4b)

A group is referred to as commutative or Abelian, if equation (4.1b) applies for any a, b ∈ G.

Example: The set of integer numbers Z = {... −2, −1, 0, 1, 2, ...} forms a commutative group in terms of addition. The neutral element is 0; for the element a, the inverse element is −a. In terms of multiplication, Z does not form a group, since the inverse elements are not contained in Z.
Finite group:
If a group has a finite number of elements, it is referred to as a finite group.

The neutral element of a group is unambiguous.
Proof: Assuming there are two neutral elements e1 and e2 with property (4.4a), it follows that e1 = e1 ° e2 = e2 ° e1 = e2.

Examples:
The binary set B = {0, 1} with the modulo-2 addition „⊕“:

⊕ | 0 1
0 | 0 1
1 | 1 0

The neutral element is 0, and the respective inverse elements are:
a = 0 ⇒ a′ = 0, a = 1 ⇒ a′ = 1.

The sets with modulo-m addition and modulo-p multiplication considered at the beginning of section 4.1 are also finite groups.
Cyclic groups:
If all elements of a multiplicative group G = {1, g1, g2, ... , gm−1} can be represented as powers of at least one element gi, this group is called cyclic; gi is then called a primitive element of the group of order m. For the „one“ element, the representation 1 = gi^0 applies. G then consists of m elements of the form gi^j, with 0 ≤ j ≤ m − 1:

G = {gi^0, gi^1, gi^2, ... , gi^(m−1)}. (4.5)

For any gk ∈ G, the order of gk describes the number of distinct elements that can be composed as powers gk^l.

Example: multiplicative modulo-5 group G = {1, 2, 3, 4}:

z | z^1 z^2 z^3 z^4 | order
1 | 1 1 1 1 | 1
2 | 2 4 3 1 | 4
3 | 3 4 2 1 | 4
4 | 4 1 4 1 | 2

The elements z = 2 and z = 3 are primitive elements.
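The order table can be recomputed mechanically (an illustrative sketch for the modulo-5 group, not part of the lecture):

```python
p = 5
G = [1, 2, 3, 4]          # multiplicative modulo-5 group

def order(g: int) -> int:
    """Smallest k with g^k = 1 (mod p)."""
    k, a = 1, g
    while a != 1:
        a = a * g % p
        k += 1
    return k

orders = {g: order(g) for g in G}
primitive = [g for g in G if orders[g] == len(G)]   # generators of the group
print(orders)      # {1: 1, 2: 4, 3: 4, 4: 2}
print(primitive)   # [2, 3]
```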
4.1.2 Rings

A set R, on which both the addition ⊕ and the multiplication ⊗ are defined, is called a ring if the following conditions are fulfilled:
– R is a commutative group in terms of ⊕,
– R is closed in terms of ⊗,
– R is associative in terms of ⊗, and
– the distributive law applies: a ⊗ (b ⊕ c) = (a ⊗ b) ⊕ (a ⊗ c) with a, b, c ∈ R. (4.6a)

A commutative ring exists if ⊗ is also commutative. A ring with neutral element exists, if:

a ⊗ 1 = 1 ⊗ a = a with a ∈ R. (4.6b)

Example of a commutative ring with neutral element: the whole numbers Z with common addition and multiplication.

Another example: the whole numbers Zm = {0, 1, 2, ... , m−1} with modulo-m addition and modulo-m multiplication form a commutative ring with neutral element; unambiguous multiplicative inverse elements, however, exist only if m is a prime number.
4.1.3 Fields

A set K, on which the binary functions addition ⊕ and multiplication ⊗ are defined, is a field in terms of ⊕ and ⊗, if the following conditions are fulfilled:
– K is a commutative group in terms of ⊕ with 0 as neutral element,
– K without 0 is a commutative group in terms of ⊗ with 1 as neutral element, and
– the distributive law applies: a ⊗ (b ⊕ c) = (a ⊗ b) ⊕ (a ⊗ c) with a, b, c ∈ K. (4.7)

Galois field:
A field with a finite number of elements forms a Galois field GF. The finite set Zm = {0, 1, 2, ... , m−1} of whole numbers is a Galois field GF(m) = Zm, if m is a prime number, that means: m = p.

Example: GF(2)

⊕ | 0 1      ⊗ | 0 1
0 | 0 1      0 | 0 0
1 | 1 0      1 | 0 1
Several features of Galois fields are listed here without proof:

– Every Galois field has at least one primitive element.
– a ⊗ 0 = 0 ⊗ a = 0, (4.8a)
– from the closure of the multiplication it follows that no zero divisors exist:
  a ≠ 0 and b ≠ 0 → a ⊗ b ≠ 0, (4.8b)
– a ⊗ b = 0 and a ≠ 0 → b = 0, (4.8c)
– −(a ⊗ b) = (−a) ⊗ b = a ⊗ (−b), (4.8d)
– a ⊗ b = a ⊗ c and a ≠ 0 → b = c, (4.8e)
– 1 ⊕ 1 ⊕ 1 ⊕ ... ⊕ 1 = 0 (m summands). (4.8f)
Example: GF(5)

⊕ | 0 1 2 3 4      ⊗ | 0 1 2 3 4
0 | 0 1 2 3 4      0 | 0 0 0 0 0
1 | 1 2 3 4 0      1 | 0 1 2 3 4
2 | 2 3 4 0 1      2 | 0 2 4 1 3
3 | 3 4 0 1 2      3 | 0 3 1 4 2
4 | 4 0 1 2 3      4 | 0 4 3 2 1

a   | 0 1 2 3 4
−a  | 0 4 3 2 1
a−1 | – 1 3 2 4

Direct addition using the table: 1 + 2 = 3, 2 + 4 = 1, 4 + 4 = 3.
Inverse elements of the addition: a + (−a) = 0.
Subtraction with inverse elements of the addition:
3 − 2 = 3 + (−2) = 3 + 3 = 1, 1 − 4 = 1 + (−4) = 1 + 1 = 2.

Direct multiplication using the table: 1 ⋅ 2 = 2, 2 ⋅ 4 = 3, 4 ⋅ 4 = 1.
Inverse elements of the multiplication: a ⋅ a−1 = 1.
Division with inverse elements of the multiplication:
3 ÷ 2 = 3 ⋅ 2−1 = 3 ⋅ 3 = 4, 1 ÷ 3 = 1 ⋅ 3−1 = 1 ⋅ 2 = 2.
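The tables and inverse elements above can be generated mechanically for any prime p (an illustrative sketch, not part of the lecture):

```python
p = 5
add = [[(a + b) % p for b in range(p)] for a in range(p)]
mul = [[(a * b) % p for b in range(p)] for a in range(p)]
neg = [(-a) % p for a in range(p)]                         # a + (-a) = 0
inv = {a: next(b for b in range(1, p) if a * b % p == 1)   # a * a^-1 = 1
       for a in range(1, p)}

assert add[2][4] == 1 and mul[2][4] == 3    # entries from the tables above
assert (3 + neg[2]) % p == 1                # 3 - 2 = 1
assert mul[3][inv[2]] == 4                  # 3 / 2 = 3 * 3 = 4
print(inv)   # {1: 1, 2: 3, 3: 2, 4: 4}
```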
Example: solution of a linear system of equations in GF(5).

The following linear system of equations is given:
I:   2 x0 + x1 = 2
II:  3 x1 + x2 = 3
III: x0 + x1 + 2 x2 = 3

Solution steps:
I:  2 x0 = 2 − x1
II: x2 = 3 − 3 x1
2 ⋅ III: (2 − x1) + 2 x1 + 4 (3 − 3 x1) = 2 ⋅ 3
⇒ 2 + 4 ⋅ 3 − 2 ⋅ 3 = x1 (1 − 2 + 4 ⋅ 3) ⇒ x1 = 3
⇒ x2 = 3 − 3 x1 = 4
⇒ x0 = 2−1 (2 − x1) = 2
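Because GF(5) is finite, the solution can also be confirmed by exhaustive search over all 5^3 candidate vectors (a brute-force sketch, not the elimination used above):

```python
from itertools import product

p = 5
A = [(2, 1, 0),    # I:   2*x0 +   x1        = 2
     (0, 3, 1),    # II:        3*x1 +   x2  = 3
     (1, 1, 2)]    # III:   x0 +  x1 + 2*x2  = 3
b = (2, 3, 3)

solutions = [x for x in product(range(p), repeat=3)
             if all(sum(a * xi for a, xi in zip(row, x)) % p == rhs
                    for row, rhs in zip(A, b))]
print(solutions)   # [(2, 3, 4)]  ->  x0 = 2, x1 = 3, x2 = 4
```

The search returns exactly one solution, confirming that the system is uniquely solvable in GF(5).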
4.1.4 Expansion Fields

Problem:
In 4.1.3, finite fields were built which are, however, limited to prime numbers with regard to the number of their elements. For example, G = {0, 1, 2, 3} does not provide a Galois field, due to the required modulo-4 multiplication:

⊗ | 0 1 2 3
0 | 0 0 0 0
1 | 0 1 2 3
2 | 0 2 0 2
3 | 0 3 2 1

A much more general and, at the same time, more flexible construction of Galois fields can be achieved by the expansion to p^m elements. These are represented as polynomials of degree < m with coefficients pi ∈ GF(p):

p(x) = p0 + p1 x + p2 x^2 + ... + p(m−1) x^(m−1) = Σ(i=0..m−1) pi x^i, with p(m−1) ≠ 0. (4.9)

These altogether p^m possible polynomials can be divided into:
– splittable or reducible polynomials and
– non-splittable or irreducible polynomials.

In digital communication technology, particularly the expansion fields GF(2^m) are of great importance.
Irreducible polynomials:
A polynomial p(x) with coefficients from GF(p) is called irreducible over GF(p), if it cannot be represented as a product of two polynomials pa(x), pb(x) ≠ 0, 1 of lower degree with coefficients from GF(p). Thus, irreducible polynomials do not have a root in GF(p) either. The characteristics of irreducible polynomials and prime numbers are comparable.

Example: p(x) = x^2 + x + 1. Testing the polynomials of degree 1, pa(x) = x and pb(x) = x + 1:

(x^2 + x + 1) ÷ x = x + 1 + 1/x;  (x^2 + x + 1) ÷ (x + 1) = x + 1/(x + 1).

Both divisions leave a remainder, so p(x) is irreducible over GF(2).

Reducible polynomials:
A polynomial p(x) with coefficients from GF(p) is called reducible over GF(p), if it can be decomposed into a product of two polynomials pa(x), pb(x) ≠ 0, 1 of lower degree with coefficients from GF(p).

Example: p(x) = (x + 1) (x^2 + x + 1) = (x^3 + x^2 + x) + (x^2 + x + 1) = x^3 + 1.
Roots:
An irreducible polynomial p(x) of degree > 1 with coefficients from GF(p) does not have roots in GF(p). An element α ∈ GF(p^m) of the expansion field is called a root of p(x), if the following condition is fulfilled:

p(α) = 0. (4.10)

There is an analogy between the expansion of Galois fields and the expansion of the real numbers to the complex numbers:

Definition of the complex numbers | of an expanded Galois field:
R → C                             | GF(2) → GF(2^2)
no solution of x^2 + 1 = 0        | no solution of x^2 + x + 1 = 0
for x ∈ R                         | for x ∈ GF(2)

Field expansion:
α^2 + 1 = 0 ⇒ α^2 = −1           | α^2 + α + 1 = 0 ⇒ α^2 = 1 + α mod 2

Definition of the complex numbers | of the elements of GF(2^2):
c = c0 + c1 α with c0, c1 ∈ R     | c = c0 + c1 α with c0, c1 ∈ GF(2).
Primitive polynomials:
An irreducible polynomial p(x) of degree m with coefficients pi ∈ GF(p) is referred to as primitive, if it has a root α (p(α) = 0) with the following characteristic:

α^i mod p(α), for i = 0, 1, ..., p^m − 2 (4.11)

provides all p^m − 1 elements of the Galois field GF(p^m) that differ from 0. The element α ∈ GF(p^m) is called a primitive element, for which applies:

α^0 = α^n = 1 for n = p^m − 1 (4.12)
and
α^k = α^(k mod (p^m − 1)). (4.13)

Primitive polynomials are irreducible; in general, the reverse does not apply. For every prime field GF(p) and every m, at least one primitive polynomial p(x) exists. If more than one primitive polynomial exists, the construction yields equivalent expansion fields.
Primitive polynomials for the construction of Galois fields GF(2^m):

m | primitive polynomial        m  | primitive polynomial
1 | x + 1                       9  | x^9 + x^4 + 1
2 | x^2 + x + 1                 10 | x^10 + x^3 + 1
3 | x^3 + x + 1                 11 | x^11 + x^2 + 1
4 | x^4 + x + 1                 12 | x^12 + x^7 + x^4 + x^3 + 1
5 | x^5 + x^2 + 1               13 | x^13 + x^4 + x^3 + x + 1
6 | x^6 + x + 1                 14 | x^14 + x^8 + x^6 + x + 1
7 | x^7 + x + 1                 15 | x^15 + x + 1
8 | x^8 + x^6 + x^5 + x^4 + 1   16 | x^16 + x^12 + x^3 + x + 1
There are three possibilities to represent the elements of a Galois field:
– component representation (polynomial notation),
– exponential representation (according to equation (4.11)), and
– vectorial representation (representation of the coefficients).

Example for GF(2^2):
The elements of this Galois field in component representation are all 2^2 = 4 polynomials of degree < 2 that can be built with coefficients from GF(2): GF(2^2) = {0, 1, α, 1+α}.

In exponential representation, the zero element is represented symbolically as α^−∞: GF(2^2) = {α^−∞, α^0, α^1, α^2}. For the exponential representation, equation (4.13) yields α^k = α^(k mod 3), so the powers repeat:

0   = α^−∞
1   = α^0 = α^3 = α^6 = ...
α   = α^1 = α^4 = ...
1+α = α^2 = α^5 = ...

The representation of the coefficients results in the vectorial representation: GF(2^2) = {00, 10, 01, 11}.
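The exponential representation can be generated from the primitive polynomial by repeatedly multiplying by α and reducing modulo p(α) (an illustrative sketch, not part of the lecture; elements as integer bitmasks, bit i = coefficient of α^i):

```python
def gf2m_elements(p_mask: int, m: int):
    """Successive powers alpha^0, alpha^1, ... modulo the primitive
    polynomial p(x); p_mask is its bitmask (bit i = coefficient of x^i)."""
    elems, a = [], 1                  # a = alpha^0 = 1
    for _ in range(2 ** m - 1):
        elems.append(a)
        a <<= 1                       # multiply by alpha (i.e. by x)
        if a >> m:                    # degree reached m: reduce mod p(x)
            a ^= p_mask
    return elems

# GF(2^2) with p(x) = x^2 + x + 1: bitmasks 1 = "1", 2 = "alpha", 3 = "1+alpha"
print(gf2m_elements(0b111, 2))    # [1, 2, 3]
# GF(2^3) with p(x) = x^3 + x + 1 from the table of primitive polynomials
print(gf2m_elements(0b1011, 3))   # [1, 2, 4, 3, 6, 7, 5]
```

For GF(2^3) the sketch produces all 7 nonzero elements, as eq. (4.11) demands of a primitive polynomial.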
Addition and multiplication tables for GF(2^2):

Component representation:

⊕   | 0    1    α    1+α      ⊗   | 0  1    α    1+α
0   | 0    1    α    1+α      0   | 0  0    0    0
1   | 1    0    1+α  α        1   | 0  1    α    1+α
α   | α    1+α  0    1        α   | 0  α    1+α  1
1+α | 1+α  α    1    0        1+α | 0  1+α  1    α

Vectorial representation:

⊕  | 00 10 01 11      ⊗  | 00 10 01 11
00 | 00 10 01 11      00 | 00 00 00 00
10 | 10 00 11 01      10 | 00 10 01 11
01 | 01 11 00 10      01 | 00 01 11 10
11 | 11 01 10 00      11 | 00 11 10 01

Exponential representation:

⊕   | 0   1   α   α^2      ⊗   | 0  1    α    α^2
0   | 0   1   α   α^2      0   | 0  0    0    0
1   | 1   0   α^2 α        1   | 0  1    α    α^2
α   | α   α^2 0   1        α   | 0  α    α^2  1
α^2 | α^2 α   1   0        α^2 | 0  α^2  1    α
Polynomial residue class:
Conversely, the single elements of a Galois field can also be described as polynomial remainders of a modulo-p(x) calculation, with p(x) being a primitive polynomial.

Example in GF(2^2): Here p(x) = x^2 + x + 1 is a primitive polynomial. The condition

    x^2 + x + 1 = 0

corresponds to a calculation modulo x^2 + x + 1. For any product, e.g. (1 + α) ⋅ (1 + α) = 1 + α^2, division by p(x = α) gives:

    (α^2 + 1) : (α^2 + α + 1) = 1
  − (α^2 + α + 1)
    —————————————
      α = r(α)    (compare the component representation of GF(2^2))

GF(2^2) is given by all possible remainders of arbitrary polynomials f(α) modulo α^2 + α + 1.
page 181
4.1.4 Expansion Fields
Conjugate roots:
Let p(x), with deg p(x) = m and coefficients p_i ∈ GF(p), be irreducible over GF(p), and let α be a root of p(x). Then

    α, α^p, α^(p^2), ..., α^(p^(m−1)),  where  α^(p^m) = α,                (4.14)

are also roots of p(x). These roots are referred to as conjugate.

If two elements a, b ∈ GF(p^m) are conjugate to each other, one also writes a ∼ b.   (4.15)

The characteristics of this relation are:
– for all a ∈ GF(p^m) applies: a ∼ a,                                      (4.16a)
– for all a, b ∈ GF(p^m) applies: a ∼ b ⇒ b ∼ a and                        (4.16b)
– for all a, b, c ∈ GF(p^m) applies: a ∼ b and b ∼ c ⇒ a ∼ c.              (4.16c)

By the conjugate roots, a linear factorisation of p(x) is given:

    p(x) = ∏_{i=0}^{m−1} (x − α^(p^i)).                                    (4.17)
page 182
4.1.4 Expansion Fields
Equivalence classes:
The combination of elements that are conjugate to each other,

    [z_i] = {z_i, z_j, z_k, z_l, ...}  with  z_i ∼ z_j ∼ z_k ∼ z_l ...,    (4.18)

is referred to as an equivalence class; the equivalence classes form disjoint decompositions of GF(p^m).

Cyclotomic fields:
Cyclotomic cosets K_j contain all exponents of the elements that are conjugate to each other:

    K_j = {j ⋅ p^i mod n | i = 0, 1, ..., m−1}  with  n = p^m − 1.         (4.19)

The term cyclotomy comes from the complex numbers, where the n-th root of unity and the one-element of a Galois field behave similarly:

    z^n = e^(j2π) = 1  ⇔  z^(p^m − 1) = z^n = z^0 = 1.                     (4.20)

Therefore, expansion fields are also referred to as cyclotomic fields.
page 183
4.1.4 Expansion Fields
Minimum polynomial:
For every a ∈ GF(p^m), an unambiguously determined normalised polynomial m_[a](x) of minimum degree with coefficients from GF(p) exists, the root of which is a. This polynomial is called the minimum polynomial. m_[a](x) is irreducible.

The minimum polynomial is calculated using equation (4.17):

    m_[a](x) = ∏_{b, b ∼ a} (x − b) = ∏_{i ∈ K_[a]} (x − z^i).             (4.21)

Two conjugate elements have the same minimum polynomial:

    a ∼ b ⇒ m_[a](x) = m_[b](x).                                           (4.22)

The degree of the minimum polynomial equals the power (cardinality) of the equivalence class.
page 184
4.1.4 Expansion Fields
The product of all minimum polynomials of GF(p^m) results in:

    ∏_a m_[a](x) = x^n − 1.                                                (4.23)

Example:
GF(2^4) with primitive polynomial p(x) = x^4 + x + 1 and primitive element z with z^4 + z + 1 = 0. The elements of GF(2^4) are:
GF(2^4) = {0, z^0, z^1, z^2, ..., z^(n−1)} with n = p^m − 1 = 2^4 − 1 = 15.
In the exponent, the calculation is carried out modulo n: z^n = z^0.

 Cyclotomic coset       Minimum polynomial
 K_0 = {0}              m_0(x) = (x − z^0) = x + 1
 K_1 = {1,2,4,8}        m_1(x) = (x − z^1)(x − z^2)(x − z^4)(x − z^8) = x^4 + x + 1
 K_3 = {3,6,9,12}       m_3(x) = (x − z^3)(x − z^6)(x − z^9)(x − z^12) = x^4 + x^3 + x^2 + x + 1
 K_5 = {5,10}           m_5(x) = (x − z^5)(x − z^10) = x^2 + x + 1
 K_7 = {7,11,13,14}     m_7(x) = (x − z^7)(x − z^11)(x − z^13)(x − z^14) = x^4 + x^3 + 1

Here, one of the minimum polynomials corresponds to the primitive polynomial. Check:

    m_0(x) ⋅ m_1(x) ⋅ m_3(x) ⋅ m_5(x) ⋅ m_7(x) = x^15 − 1.
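The cyclotomic cosets and minimum polynomials of this example can be reproduced with a short script. The sketch below (Python; function names are illustrative) builds GF(2^4) as a log/antilog table over x^4 + x + 1 and multiplies out the linear factors of eq. (4.21); for conjugate roots the resulting coefficients all land in GF(2).

```python
# Cyclotomic cosets K_j = {j * p^i mod n} (eq. 4.19) and minimum
# polynomials (eq. 4.21) for GF(2^4), n = 15, p(x) = x^4 + x + 1.

def build_gf16():
    # antilog[i] = z^i as a 4-bit integer (bit b stands for z^b),
    # reduced via z^4 = z + 1
    antilog = [1]
    for _ in range(14):
        v = antilog[-1] << 1
        if v & 0b10000:
            v = (v ^ 0b10000) ^ 0b0011   # replace z^4 by z + 1
        antilog.append(v)
    return antilog

antilog = build_gf16()
log = {v: i for i, v in enumerate(antilog)}

def gf16_mul(a, b):
    return 0 if a == 0 or b == 0 else antilog[(log[a] + log[b]) % 15]

def coset(j, n=15, p=2, m=4):
    return sorted({(j * p**i) % n for i in range(m)})

def min_poly(K):
    # multiply out prod_{i in K} (x - z^i); addition in GF(2^m) is XOR,
    # and -z^i = z^i in characteristic 2
    poly = [1]                               # coefficients, lowest degree first
    for i in K:
        root, new = antilog[i], [0] * (len(poly) + 1)
        for d, c in enumerate(poly):
            new[d] ^= gf16_mul(c, root)      # constant part of (x + z^i)
            new[d + 1] ^= c                  # x part
        poly = new
    return poly

print(coset(7))              # [7, 11, 13, 14]
print(min_poly(coset(1)))    # [1, 1, 0, 0, 1]  ->  x^4 + x + 1
```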
page 185
4.1.5 Discrete Fourier Transformation

The Discrete Fourier Transformation (DFT) describes in general a transformation. For coding theory, it provides a very well arranged display format. Based on a Galois field GF(p^m) with primitive element z, let polynomials of degree ≤ n − 1 = p^m − 2 with coefficients from GF(p^m) be given in the following form:

    a = (a_0, a_1, ..., a_(n−1))  ⇔  a(x) = Σ_{i=0}^{n−1} a_i x^i,         (4.24a)
    A = (A_0, A_1, ..., A_(n−1))  ⇔  A(x) = Σ_{j=0}^{n−1} A_j x^j.         (4.24b)

The transformation then reads:

    A_j = −a(z^(−j)) = −Σ_{i=0}^{n−1} a_i z^(−i⋅j),
    a_i = A(z^i)     =  Σ_{j=0}^{n−1} A_j z^(i⋅j).                         (4.25a/b)
page 186
4.1.5 Discrete Fourier Transform
For the DFT, the following notations exist:a(x) A(x)
a Atime domain frequency domain
The DFT has the following characteristics:– the transformation is one-to-one,– IDFT(DFT(a(x))) = a(x),– DFT(DFT(a(x))) = −a(x),– DFT(DFT(DFT(DFT(a(x))))) = a(x),– z−j is the root of a(x) ⇔ Aj = 0,
),...,,0,,...,,(0)( 11110 −+−− =⇔= njj
j AAAAAza A
),...,,0,,...,,(0)( 11110 −+−=⇔= niii aaaaazA a
– zi is the root of A(x) ⇔ ai = 0.
(4.26a)(4.26b)(4.26c)
(4.26d)
(4.26e)
page 187
4.1.5 Discrete Fourier Transform
The polynomial transformation according to eq. (4.25a/b) can also be described for the coefficient vectors by means of matrices:

a → A (4.27a):

    ( A_0     )       ( 1   1           1           ...  1                )   ( a_0     )
    ( A_1     )       ( 1   z^(−1)      z^(−2)      ...  z^(−(n−1))       )   ( a_1     )
    ( A_2     ) = −   ( 1   z^(−2)      z^(−4)      ...  z^(−2(n−1))      ) ⋅ ( a_2     )
    ( ...     )       ( ...             ...              ...              )   ( ...     )
    ( A_(n−1) )       ( 1   z^(−(n−1))  z^(−2(n−1)) ...  z^(−(n−1)(n−1))  )   ( a_(n−1) )

A → a (4.27b):

    ( a_0     )     ( 1   1          1          ...  1                )   ( A_0     )
    ( a_1     )     ( 1   z          z^2        ...  z^(n−1)          )   ( A_1     )
    ( a_2     )  =  ( 1   z^2        z^4        ...  z^(2(n−1))       ) ⋅ ( A_2     )
    ( ...     )     ( ...            ...             ...              )   ( ...     )
    ( a_(n−1) )     ( 1   z^(n−1)    z^(2(n−1)) ...  z^((n−1)(n−1))   )   ( A_(n−1) )
page 188
4.1.5 Discrete Fourier Transform
Example:
The polynomial A(x) = 4 + 5x with coefficients from GF(7) is to be transformed. z = 5 is chosen as the primitive element, so n = 6 and the powers z^0, ..., z^5 are 1, 5, 4, 6, 2, 3. Application of eq. (4.27b) results in:

    ( a_0 )   ( 1  1  1  1  1  1 )   ( 4 )
    ( a_1 )   ( 1  5  4  6  2  3 )   ( 5 )
    ( a_2 ) = ( 1  4  2  1  4  2 ) ⋅ ( 0 )
    ( a_3 )   ( 1  6  1  6  1  6 )   ( 0 )
    ( a_4 )   ( 1  2  4  1  2  4 )   ( 0 )
    ( a_5 )   ( 1  3  2  6  4  5 )   ( 0 )

Since only A_0 = 4 and A_1 = 5 are different from zero:

    a = 4 ⋅ (1,1,1,1,1,1)^T + 5 ⋅ (1,5,4,6,2,3)^T = (2,1,3,6,0,5)^T.
page 189
4.1.5 Discrete Fourier Transform
The inverse transform of the polynomial a(x) = 2 + x + 3x^2 + 6x^3 + 5x^5 with coefficients from GF(7), calculated on the previous page, is carried out using eq. (4.27a). The primitive element z = 5 is retained; z^(−1) = 3, and the powers z^0, z^(−1), ..., z^(−5) are 1, 3, 2, 6, 4, 5.

    ( A_0 )       ( 1  1  1  1  1  1 )   ( a_0 )
    ( A_1 )       ( 1  3  2  6  4  5 )   ( a_1 )
    ( A_2 ) = −   ( 1  2  4  1  2  4 ) ⋅ ( a_2 )
    ( A_3 )       ( 1  6  1  6  1  6 )   ( a_3 )
    ( A_4 )       ( 1  4  2  1  4  2 )   ( a_4 )
    ( A_5 )       ( 1  5  4  6  2  3 )   ( a_5 )
page 190
4.1.5 Discrete Fourier Transform
Inverse transform (continuation):

    ( A_0 )       ( 1  1  1  1  1  1 )   ( 2 )       ( 3 )   ( 4 )
    ( A_1 )       ( 1  3  2  6  4  5 )   ( 1 )       ( 2 )   ( 5 )
    ( A_2 ) = −   ( 1  2  4  1  2  4 ) ⋅ ( 3 )  = −  ( 0 ) = ( 0 )
    ( A_3 )       ( 1  6  1  6  1  6 )   ( 6 )       ( 0 )   ( 0 )
    ( A_4 )       ( 1  4  2  1  4  2 )   ( 0 )       ( 0 )   ( 0 )
    ( A_5 )       ( 1  5  4  6  2  3 )   ( 5 )       ( 0 )   ( 0 )

    A = (4,5,0,0,0,0)  ●—○  a = (2,1,3,6,0,5).
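The transform pair of eqs. (4.25a/b) can be checked numerically. A minimal sketch in Python for GF(7), z = 5, n = 6, reproducing the example values of these pages:

```python
# DFT over GF(7) as in eqs. (4.25a/b): A_j = -a(z^-j), a_i = A(z^i),
# with primitive element z = 5 and n = 6.

p, n, z = 7, 6, 5
zinv = pow(z, n - 1, p)          # z^-1 = z^(n-1) = 3 in GF(7)

def idft(A):                     # eq. (4.25b): a_i = sum_j A_j z^(i*j)
    return [sum(A[j] * pow(z, i * j, p) for j in range(n)) % p
            for i in range(n)]

def dft(a):                      # eq. (4.25a): A_j = -sum_i a_i z^(-i*j)
    return [(-sum(a[i] * pow(zinv, i * j, p) for i in range(n))) % p
            for j in range(n)]

a = idft([4, 5, 0, 0, 0, 0])
print(a)           # [2, 1, 3, 6, 0, 5]
print(dft(a))      # [4, 5, 0, 0, 0, 0]
```

The round trip confirms characteristic (4.26b): IDFT followed by DFT returns the original vector.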
page 191
4.1.5 Discrete Fourier Transform
A shift register circuit of the DFT according to equation (4.25a) is shown in the following figure:

[Figure: shift register circuit with n cells; cell i is initialised with a_i and multiplied by z^(−i) in every step, so that the memory content after j steps is a_i ⋅ z^(−j⋅i); the cell outputs are summed and multiplied by −1 to give A_j.]

Here the j-th step of the transformation for the calculation of A(x) is mapped. At the beginning (j = 0), the register is initialised with the time domain values a_0, a_1, ..., a_(n−1), so that the following applies:

    A_0 = −(a_0 + a_1 + ... + a_(n−1))   (calculated in GF(p^m)).          (4.28a)
page 192
4.1.5 Discrete Fourier Transform
A shift register circuit of the IDFT according to equation (4.25b) is shown in the following figure:

[Figure: shift register circuit with n cells; cell j is initialised with A_j and multiplied by z^j in every step, so that the memory content after i steps is A_j ⋅ z^(i⋅j); the cell outputs are summed to give a_i.]

Here the i-th step of the transformation for the calculation of a(x) is mapped. At the beginning (i = 0), the register is initialised with the frequency domain values A_0, A_1, ..., A_(n−1), so that the following applies:

    a_0 = A_0 + A_1 + ... + A_(n−1)   (calculated in GF(p^m)).             (4.28b)
page 193
4.1.5 Discrete Fourier Transform
Finally, two theorems of the DFT are to be considered.

Let polynomials a(x), b(x), c(x) of degree ≤ n − 1 = p^m − 2 be given with coefficients from GF(p^m), whose primitive element is referred to as z. For every x ∈ GF(p^m), x^n = x^0 = 1 holds, i.e. the calculation is carried out modulo (x^n − 1). For the product of two polynomials a(x) ⋅ b(x) = c(x) then applies:

    (a_0 + a_1x + ... + a_(n−1)x^(n−1)) ⋅ (b_0 + b_1x + ... + b_(n−1)x^(n−1))
        = c_0 + c_1x + ... + c_(n−1)x^(n−1).                               (4.29)

The product of two polynomials corresponds to the cyclic convolution of the coefficients:

    c_0 = a_0b_0 + a_1b_(n−1) + a_2b_(n−2) + ... + a_(n−1)b_1
    c_1 = a_0b_1 + a_1b_0 + a_2b_(n−1) + ... + a_(n−1)b_2
    ...
    c_i = Σ_{j=0}^{n−1} a_j ⋅ b_((i−j) mod n),

    a ∗ b  ⇔  a(x) ⋅ b(x) mod (x^n − 1).                                   (4.30)
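The correspondence (4.29)/(4.30) between polynomial multiplication modulo x^n − 1 and cyclic convolution is easy to verify numerically; a small sketch over GF(7) with n = 6 (the vectors a and b are arbitrary examples, not taken from the lecture):

```python
# Cyclic convolution (4.29)/(4.30): the coefficients of a(x)*b(x) mod (x^n - 1)
# equal c_i = sum_j a_j b_{(i-j) mod n}, here in GF(7) with n = 6.

p, n = 7, 6

def cyclic_conv(a, b):
    return [sum(a[j] * b[(i - j) % n] for j in range(n)) % p
            for i in range(n)]

def poly_mul_mod(a, b):
    c = [0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[(i + j) % n] = (c[(i + j) % n] + ai * bj) % p   # x^n = 1
    return c

a = [2, 1, 3, 6, 0, 5]
b = [1, 4, 0, 0, 2, 0]
assert cyclic_conv(a, b) == poly_mul_mod(a, b)
```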
page 194
4.1.5 Discrete Fourier Transformation
Convolution Theorem of the DFT:

    a(x) ⋅ b(x) mod (x^n − 1)  ○—●  −A_j ⋅ B_j,
    a_i ⋅ b_i                  ○—●  A(x) ⋅ B(x) mod (x^n − 1),             (4.31a/b)

with a(x) ○—● A(x), b(x) ○—● B(x).

Proof of (4.31a): Using equation (4.3b), the modulo calculation can be written as c(x) = a(x) ⋅ b(x) − γ(x) ⋅ (x^n − 1), and thus

    C_j = −c(z^(−j)) = −a(z^(−j)) ⋅ b(z^(−j)) + γ(z^(−j)) ⋅ (z^(−jn) − 1).

Since z^(−jn) = 1, the second term vanishes; with a(z^(−j)) = −A_j and b(z^(−j)) = −B_j it follows that C_j = −A_j ⋅ B_j.

Shift Theorem of the DFT:

    x ⋅ a(x) mod (x^n − 1)  ○—●  z^(−j) ⋅ A_j,
    z^i ⋅ a_i               ○—●  x ⋅ A(x) mod (x^n − 1),                   (4.32a/b)

with a(x) ○—● A(x). Proof: by inserting b(x) = x and B(x) = x, respectively, in the Convolution Theorem.
page 195
4.2 Reed-Solomon Codes

The Reed-Solomon codes (RS codes) were developed around 1960. They can be constructed analytically; their minimum distance can be specified by construction, and their weight distribution is known. The RS codes fulfil the Singleton bound with equality. They are very efficient and vitally important in practice; among other things, they are used in spacecraft communications, on Compact Discs (CD) and in mobile radio communications.

RS codes are symbol-oriented codes, i.e. they are especially suited to the correction of burst errors. In contrast, they can be inefficient for the correction of single errors, since whole symbols always have to be corrected; a corresponding redundancy is required for this.
page 196
4.2.1 Construction of Reed-Solomon Codes

Fundamental theorem of algebra:
A polynomial A(x) = A_0 + A_1x + A_2x^2 + ... + A_(k−1)x^(k−1) of degree k − 1 with coefficients A_i ∈ GF(p^m) and A_(k−1) ≠ 0 has at most k − 1 different roots.

The proof of this theorem rests on the fact that a polynomial can be represented by a number of linear factors restricted by its degree:

    A(x) = A_(k−1) ⋅ ∏_{i=1}^{k−1} (x − x_i).                              (4.33)

Hamming weight of general vectors:
The Hamming weight w_H(c) of a vector c is defined as the number of elements of c that are not zero. Eq. (2.11), which calculates the weight of a binary vector, is the special case in which the elements of the vector come from GF(2).
page 197
4.2.1 Construction of Reed-Solomon Codes
Theorem for the minimum weight of vectors:
Let A(x) = A_0 + A_1x + A_2x^2 + ... + A_(k−1)x^(k−1) be a polynomial with coefficients A_i ∈ GF(p^m), where its degree is bounded by

    deg A(x) = k − 1 ≤ n − d;

then for the weight of the vector a = (a_0, a_1, a_2, ..., a_(n−1)) with the relation a(x) ○—● A(x) applies:

    w_H(a) ≥ d.                                                            (4.34)

Proof:
Equation (4.26e) says that roots z^i of A(x) correspond to elements a_i = 0. The polynomial A(x) has at most k − 1 different roots, so that the vector a has at least n − (k − 1) = n − k + 1 ≥ d positions different from zero.

According to this theorem, vectors can be constructed whose weight depends only on the polynomial degree of the transformed vector. These vectors have a defined minimum weight and thus a defined minimum distance.
page 198
4.2.1 Construction of Reed-Solomon Codes
Definition of Reed-Solomon codes:
Let z be a primitive element of GF(p^m). Then a = (a_0, a_1, a_2, ..., a_(n−1)) is a code word of the RS code C of length n = p^m − 1, dimension k = p^m − d and minimum distance d = n − k + 1, if the following applies:

    C = {a | a_i = A(z^i), deg A(x) ≤ k − 1 = n − d}.                      (4.35)

The code vector (A_0, A_1, A_2, ..., A_(k−1)) corresponds to the information to be coded. By filling with n − k = d − 1 zeros (also called test symbols or test frequencies), the code word in the frequency domain A results; by IDFT, the code word in the time domain a results:

    A = (A_0, A_1, A_2, ..., A_(k−1), 0, 0, ..., 0)  ●—○  a = (a_0, a_1, a_2, ..., a_(n−1)).   (4.36)

The RS codes satisfy the Singleton bound with equality, since the following applies: n − k = d_min − 1. The best case is an odd d_min, so that t = (d_min − 1)/2 and thus:

    n − k = 2t.                                                            (4.37)

I.e., the number of detectable and correctable errors can be adjusted.
page 199
4.2.1 Construction of Reed-Solomon Codes
Example:
Selection of the Galois field GF(p^m) with p = 2 and m = 4 ⇒ GF(2^4). The code word length is then n = p^m − 1 = 16 − 1 = 15. Let the number of correctable symbols be t = 3 ⇒ n − k = d_min − 1 = 2t = 6. The code word in the frequency domain then reads in general:

    A = (A_0 A_1 A_2 A_3 A_4 A_5 A_6 A_7 A_8 | 0 0 0 0 0 0)
        |—— k = n − (d_min − 1) = 9 ———————|  |— d_min − 1 —|

with A_j ∈ {0, 1, z, z+1, z^2, z^2+1, z^2+z, z^2+z+1, z^3, z^3+1, z^3+z, z^3+z+1, z^3+z^2, z^3+z^2+1, z^3+z^2+z, z^3+z^2+z+1}.

The number of information digits / bits is k = 9 / k ⋅ m = 9 ⋅ 4 = 36; the number of code word digits / bits is n = 15 / n ⋅ m = 15 ⋅ 4 = 60. This code can therefore be referred to as a hexadecimal (15,9)-RS code or as a binary (60,36)-RS code.
page 200
4.2.1 Construction of Reed-Solomon Codes
Example (continuation):
With the correction capability of t = 3 symbols, at most t_b,max = 3 ⋅ 4 = 12 bits and at least t_b,min = 3 ⋅ 1 = 3 bits can be corrected.

Possible error vectors in (partial) binary notation with 0 = (0000) are:
f = (0 0 0 1111 1111 1111 0 0 0 0 0 0 0 0 0) — correctable (3 symbol errors);
f = (0 0 0 0001 1111 1111 1110 0 0 0 0 0 0 0 0) — not correctable (4 symbol errors);
f = (0100 0 0 0 0001 0 0 0100 0 0 0 0 0 0010 0) — not correctable (4 symbol errors, although only 4 bits are wrong).

From the extreme cases it can be derived that in any case 3 single bit errors or 9 consecutive bit errors are correctable. Thus, burst errors are correctable up to a length of:

    t_b,burst = m(t − 1) + 1 = m(d − 3)/2 + 1.                             (4.38)

For the code rate of the RS code applies:

    R_C = k/n = (n − (d − 1))/n = 1 − (d − 1)/n.

Example: For the (15,9)-RS code, R_C = k/n = 9/15 = 60 %.
page 201
4.2.1 Construction of Reed-Solomon Codes
Generator polynomial:
Due to the restriction of the degree of the code word A(x), for which

    A_j = 0 for j = k, ..., n − 1

applies, and the DFT characteristic (4.26d), every code word polynomial a(x) has the roots:

    a(z^(−j)) = 0 for j = k, ..., n − 1.

The product of all linear factors resulting from these roots forms the generator polynomial of the respective RS code:

    g(x) = ∏_{j=k}^{n−1} (x − z^(−j)) = ∏_{j=1}^{n−k} (x − z^j).           (4.39)

Every code word polynomial can then be described as:

    a(x) = u(x) ⋅ g(x).

Since an RS code can be described by a generator polynomial, it is a cyclic code.
page 202
4.2.1 Construction of Reed-Solomon Codes
Check polynomial:
The check polynomial h(x) is the complementary polynomial to the generator polynomial and contains as roots all those elements different from zero which are not roots of g(x):

    h(x) = ∏_{j=0}^{k−1} (x − z^(−j)) = ∏_{j=n−k+1}^{n} (x − z^j).         (4.40)

For the polynomials g(x), u(x) and h(x) the following degree relations apply:
deg(g(x)) = n − k; deg(u(x)) = k − 1; deg(g(x) ⋅ u(x)) = n − 1; deg(h(x)) = k.

Proof:

    a(x) ⋅ h(x) = u(x) ⋅ g(x) ⋅ h(x) = u(x) ⋅ ∏_{i=0}^{n−1} (x − z^i)
                = u(x) ⋅ (x^n − 1) = 0 mod (x^n − 1).
page 203
4.2.1 Construction of Reed-Solomon Codes
Example:
For the (6,2)-RS code over GF(7) using the primitive element z = 5, the generator polynomial g(x) is:

    g(x) = ∏_{j=1}^{4} (x − z^j) = (x − 5)(x − 4) ⋅ (x − 6)(x − 2)
         = (x^2 + 5x + 6) ⋅ (x^2 + 6x + 5)
         = x^4 + 4x^3 + 6x^2 + 5x + 2.

The check polynomial h(x) is then calculated to be:

    h(x) = (x^n − 1)/g(x) = (x^6 − 1)/(x^4 + 4x^3 + 6x^2 + 5x + 2)
         = x^2 + 3x + 3 = (x − 1)(x − 3).
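The products above can be multiplied out mechanically; a sketch (Python) that builds g(x) and h(x) of this (6,2)-RS code from their linear factors over GF(7), with coefficients stored lowest degree first:

```python
# Generator and check polynomial of the (6,2)-RS code over GF(7), z = 5:
# g(x) = prod_{j=1}^{n-k} (x - z^j), h(x) has the remaining nonzero roots.

p, n, k, z = 7, 6, 2, 5

def poly_mul(a, b):
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] = (c[i + j] + ai * bj) % p
    return c

g = [1]                                          # coefficients, lowest degree first
for j in range(1, n - k + 1):
    g = poly_mul(g, [(-pow(z, j, p)) % p, 1])    # factor (x - z^j)

h = [1]
for j in range(n - k + 1, n + 1):
    h = poly_mul(h, [(-pow(z, j, p)) % p, 1])    # remaining roots z^5 and z^6 = 1

print(g)   # [2, 5, 6, 4, 1]  ->  g(x) = 2 + 5x + 6x^2 + 4x^3 + x^4
print(h)   # [3, 3, 1]        ->  h(x) = 3 + 3x + x^2
```

As a check, poly_mul(g, h) gives the coefficients of x^6 − 1, i.e. g(x) ⋅ h(x) = x^6 − 1.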
page 204
4.2.1 Construction of Reed-Solomon Codes
Generator matrix:
The RS code is a cyclic code, whose generator matrix G can be built according to equation 3.18 from the coefficients of the generator polynomial g(x).

Example with g(x) = 2 + 5x + 6x^2 + 4x^3 + x^4 (compare page 203):

    G = ( 2 5 6 4 1 0 )
        ( 0 2 5 6 4 1 )

Systematisation:
By row transformations, every generator matrix can be brought into systematic form. For the above generator matrix, the steps are: 1) substitution of the first row by the sum of the two rows and 2) multiplication of all elements by 4:

    G = ( 2 5 6 4 1 0 )  ⇒  ( 2 0 4 3 5 1 )  ⇒  G_syst = ( 1 0 2 5 6 4 )
        ( 0 2 5 6 4 1 )      ( 0 2 5 6 4 1 )              ( 0 1 6 3 2 4 )
page 205
4.2.1 Construction of Reed-Solomon Codes
Control matrix:
The control matrix H can be determined in three ways:

1. Due to the cyclic characteristic of the RS codes, using eq. 3.25b. Example with h(x) = 3 + 3x + x^2 (see page 203):

    H_1 = ( 1 3 3 0 0 0 )
          ( 0 1 3 3 0 0 )
          ( 0 0 1 3 3 0 )
          ( 0 0 0 1 3 3 )

2. Using the systematic generator matrix: G = [I_k P] ⇒ H_2 = [−P^T I_(n−k)]. Example with the systematic generator matrix G_syst (compare page 204):

    H_2 = ( −2 −6 1 0 0 0 )   ( 5 1 1 0 0 0 )
          ( −5 −3 0 1 0 0 ) = ( 2 4 0 1 0 0 )
          ( −6 −2 0 0 1 0 )   ( 1 5 0 0 1 0 )
          ( −4 −4 0 0 0 1 )   ( 3 3 0 0 0 1 )
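A quick consistency check (Python) that the two check matrices really annihilate a code word of this (6,2)-RS code; a = (2,6,5,0,3,4) is the code word obtained later on page 215, and the matrix rows are those read off above:

```python
# Verify H1 (built from h(x) = 3 + 3x + x^2) and H2 = [-P^T I_4] against
# the code word a = (2,6,5,0,3,4) of the (6,2)-RS code over GF(7).

p = 7
a = [2, 6, 5, 0, 3, 4]

H1 = [[1, 3, 3, 0, 0, 0],
      [0, 1, 3, 3, 0, 0],
      [0, 0, 1, 3, 3, 0],
      [0, 0, 0, 1, 3, 3]]

H2 = [[5, 1, 1, 0, 0, 0],
      [2, 4, 0, 1, 0, 0],
      [1, 5, 0, 0, 1, 0],
      [3, 3, 0, 0, 0, 1]]

def syndrome(H, v):
    # each entry is one check equation row . v in GF(7)
    return [sum(hi * vi for hi, vi in zip(row, v)) % p for row in H]

print(syndrome(H1, a))   # [0, 0, 0, 0]
print(syndrome(H2, a))   # [0, 0, 0, 0]
```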
page 206
4.2.1 Construction of Reed-Solomon Codes
3. Direct evaluation of the check equations:

    A_j = −a(z^(−j)) = −Σ_{i=0}^{n−1} a_i ⋅ z^(−i⋅j) = 0  for j = k, ..., n − 1.   (4.41)

Inserting j = k, ..., n − 1, the sums for all j can be written as a matrix equation H_3 ⋅ a^T = 0:

    ( 1  z^(−k)      z^(−2k)      ...  z^(−(n−1)k)     )   ( a_0     )   ( 0 )
    ( 1  z^(−(k+1))  z^(−2(k+1))  ...  z^(−(n−1)(k+1)) )   ( a_1     )   ( 0 )
    ( ...            ...               ...             ) ⋅ ( ...     ) = ( ...)
    ( 1  z^(−(n−1))  z^(−2(n−1))  ...  z^(−(n−1)(n−1)) )   ( a_(n−1) )   ( 0 )

Example with z = 5 in GF(7) (k = 2, n = 6, z^(−1) = 3):

    H_3 = ( z^0  z^(−2)  z^(−4)   z^(−6)   z^(−8)   z^(−10) )   ( 1 2 4 1 2 4 )
          ( z^0  z^(−3)  z^(−6)   z^(−9)   z^(−12)  z^(−15) ) = ( 1 6 1 6 1 6 )
          ( z^0  z^(−4)  z^(−8)   z^(−12)  z^(−16)  z^(−20) )   ( 1 4 2 1 4 2 )
          ( z^0  z^(−5)  z^(−10)  z^(−15)  z^(−20)  z^(−25) )   ( 1 5 4 6 2 3 )
page 207
4.2.1 Construction of Reed-Solomon Codes
Syndrome calculation:
The received signal r can be described as

    r = a + f   ○—●   R = A + F

and accordingly r(x) = a(x) + f(x) ○—● R(x) = A(x) + F(x).

From this, the following can be derived in the frequency domain:

    A(x) = A_0x^0 + A_1x^1 + ... + A_(k−1)x^(k−1),
    F(x) = F_0x^0 + F_1x^1 + ... + F_(k−1)x^(k−1) + F_kx^k + ... + F_(n−1)x^(n−1),
    R(x) = R_0x^0 + R_1x^1 + ... + R_(k−1)x^(k−1) + F_kx^k + ... + F_(n−1)x^(n−1)

with R_j = A_j + F_j. In the frequency domain, the syndrome is the part of the receive vector located at the test frequencies; its coefficients F_k, ..., F_(n−1) are the syndrome coefficients:

    S(x) = S_0x^0 + ... + S_(n−k−1)x^(n−k−1) = F_kx^0 + ... + F_(n−1)x^(n−k−1).   (4.42)
page 208
4.2.1 Construction of Reed-Solomon Codes
In vectorial notation, the syndrome is

    S = (S_0, S_1, ..., S_(n−k−1)).                                        (4.43a)

The syndrome can also be calculated by multiplication by the check matrix:

    S = r ⋅ H_3^T ⇔ S^T = H_3 ⋅ r^T.                                       (4.43b)

Written out, the following representation results:

    ( S_0       )       ( 1  z^(−k)      ...  z^(−(n−1)k)     )   ( r_0     )
    ( S_1       )       ( 1  z^(−(k+1))  ...  z^(−(n−1)(k+1)) )   ( r_1     )
    ( ...       ) = −   ( ...                 ...             ) ⋅ ( ...     )
    ( S_(n−k−1) )       ( 1  z^(−(n−1))  ...  z^(−(n−1)(n−1)) )   ( r_(n−1) )
page 209
4.2.1 Construction of Reed-Solomon Codes
Generic RS code:
The RS code is a cyclic code. According to eq. 4.32a, cyclic shifts in the time domain have no effect on the test frequencies:

    a(x) = x^k ⋅ b(x) mod (x^n − 1)  ○—●  A_j = z^(−j⋅k) ⋅ B_j,  so B_j = 0 ⇒ A_j = 0.

On the other hand, using equations 4.26e and 4.32b, it is certain that the cyclic shift of a polynomial in the frequency domain does not affect the weight and thus the minimum distance of the appropriate code word in the time domain.

Definition: A generic RS code of length n, dimension k and minimum distance d is defined by:

    C = {a | a_i = A(z^i), A(x) = x^k ⋅ B(x) mod (x^n − 1), deg B(x) ≤ k − 1 = n − d},   (4.44)

with the shift exponent k being any number.
page 210
4.2.1 Construction of Reed-Solomon Codes
The RS codes belong to the MDS codes (compare page 100). The proof is based on combinatorial considerations in terms of essential MDS characteristics of the RS codes.

Weighting function of MDS codes:
The weighting function A(z) for an (n,k)-MDS code with minimum distance d = n − k + 1 over GF(q = p^m) can be calculated as follows:

    A(z) = Σ_{i=0}^{n} A_i z^i                                             (4.45)

with

    A_i = 1                                                            for i = 0,
    A_i = 0                                                            for 1 ≤ i < d,
    A_i = (n over i)(q − 1) Σ_{j=0}^{i−d} (−1)^j (i−1 over j) q^(i−d−j)    for i ≥ d.
page 211
4.2.1 Construction of Reed-Solomon Codes
Example: weighting distribution of the (7,3)-RS code over GF(q = 2^3) with minimum distance d = (n − k) + 1 = 5.

    A_0 = 1,  A_1 = A_2 = A_3 = A_4 = 0,

    A_5 = (7 over 5) ⋅ 7 ⋅ 1 = (7!/(5! ⋅ 2!)) ⋅ 7 = 21 ⋅ 7 = 147,

    A_6 = (7 over 6) ⋅ 7 ⋅ ((5 over 0) ⋅ 8 − (5 over 1)) = 7 ⋅ 7 ⋅ (8 − 5) = 147,

    A_7 = (7 over 7) ⋅ 7 ⋅ ((6 over 0) ⋅ 64 − (6 over 1) ⋅ 8 + (6 over 2)) = 7 ⋅ (64 − 48 + 15) = 217.

With this, the weighting function is:

    A(z) = 1 + 147z^5 + 147z^6 + 217z^7.

Check: Σ_{i=0}^{n} A_i = 1 + 147 + 147 + 217 = 512 = 8^3 = q^3.
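Equation (4.45) can also be evaluated directly; a small sketch (Python) reproducing the distribution above together with the check Σ A_i = q^k:

```python
# Weight distribution of an (n,k)-MDS code over GF(q), eq. (4.45):
# A_i = C(n,i)(q-1) sum_{j=0}^{i-d} (-1)^j C(i-1,j) q^(i-d-j) for i >= d.
from math import comb

def mds_weights(n, k, q):
    d = n - k + 1
    A = [0] * (n + 1)
    A[0] = 1                       # only the all-zero word has weight 0
    for i in range(d, n + 1):
        A[i] = comb(n, i) * (q - 1) * sum(
            (-1) ** j * comb(i - 1, j) * q ** (i - d - j)
            for j in range(i - d + 1))
    return A

A = mds_weights(7, 3, 8)
print(A)                    # [1, 0, 0, 0, 0, 147, 147, 217]
assert sum(A) == 8 ** 3     # total number of code words is q^k
```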
page 212
4.2.2 Coding of Reed-Solomon Codes
The information u = (u_0, u_1, u_2, ..., u_(k−1)) and the information polynomial u(x) = u_0x^0 + u_1x^1 + ... + u_(k−1)x^(k−1), respectively, is to be coded. In the following, three possible coding methods are identified:

1. Coding in the frequency domain, i.e. inverse transform of u by means of the IDFT (eq. 4.25b):

    A = (u_0, u_1, u_2, ..., u_(k−1), 0, 0, ..., 0)  ●—○  a = (a_0, a_1, a_2, ..., a_(n−1)).   (4.46a)

2. Non-systematic coding in the time domain, i.e. multiplication of u(x) by the generator polynomial:

    a(x) = u(x) ⋅ g(x).                                                    (4.46b)

3. Systematic coding in the time domain, i.e. multiplication of u(x) by x^(n−k): u(x) ⋅ x^(n−k) = u_0x^(n−k) + u_1x^(n−k+1) + ... + u_(k−1)x^(n−1), division of u(x) ⋅ x^(n−k) by g(x): u(x) ⋅ x^(n−k) = q(x) ⋅ g(x) + r(x) with r(x) = r_0 + r_1x + ... + r_(n−k−1)x^(n−k−1), and solving for the code word: −r(x) + u(x) ⋅ x^(n−k) = q(x) ⋅ g(x) = a(x):

    a = (−r_0, −r_1, −r_2, ..., −r_(n−k−1), u_0, u_1, u_2, ..., u_(k−1)).  (4.46c)
page 213
4.2.2 Coding of Reed-Solomon Codes
Block diagram for coding in the frequency domain (method 1):

[Figure: the message word u_0, u_1, ..., u_(k−1) is taken over as A_0, A_1, ..., A_(k−1), the test frequencies A_k, ..., A_(n−1) are set to 0, and the IDFT

    a_i = Σ_{j=0}^{n−1} A_j ⋅ z^(i⋅j)

produces the code word c_0, c_1, ..., c_(n−1).]
page 214
4.2.2 Coding of Reed-Solomon Codes
Example of coding in the frequency domain (method 1):
A (6,2)-RS code over GF(7) with the primitive element z = 5 is considered. The information u = (3,3) is to be coded.

n = 6, k = 2 and A(x) = 3 + 3x. From a_i = A(x = z^i) (eq. 4.25b) follows:
a_0 = A(z^0 = 1) = A_0 + A_1z^0 = 3 + 3 ⋅ 1 = 6,
a_1 = A(z^1 = 5) = A_0 + A_1z^1 = 3 + 3 ⋅ 5 = 4,
a_2 = A(z^2 = 4) = A_0 + A_1z^2 = 3 + 3 ⋅ 4 = 1,
a_3 = A(z^3 = 6) = A_0 + A_1z^3 = 3 + 3 ⋅ 6 = 0,
a_4 = A(z^4 = 2) = A_0 + A_1z^4 = 3 + 3 ⋅ 2 = 2,
a_5 = A(z^5 = 3) = A_0 + A_1z^5 = 3 + 3 ⋅ 3 = 5.

Result: A = (3,3,0,0,0,0) ●—○ a = (6,4,1,0,2,5).

With B(x) = 1 + 1x, a second code word b results analogously:
B = (1,1,0,0,0,0) ●—○ b = (2,6,5,0,3,4).

The Hamming distance is: d_H(a,b) = d_H((6,4,1,0,2,5), (2,6,5,0,3,4)) = 5.
For comparison: d_min = n − k + 1 = 6 − 2 + 1 = 5.
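Method 1 can be sketched in a few lines (Python): pad the information with n − k zeros and evaluate a_i = A(z^i), reproducing both code words of this example.

```python
# Method 1: coding in the frequency domain for the (6,2)-RS code over
# GF(7), z = 5 -- pad the information with n-k zeros and apply the IDFT.

p, n, k, z = 7, 6, 2, 5

def encode_freq(u):
    A = list(u) + [0] * (n - k)                 # test frequencies set to 0
    return [sum(A[j] * pow(z, i * j, p) for j in range(n)) % p
            for i in range(n)]                  # a_i = A(z^i), eq. (4.25b)

a = encode_freq([3, 3])
b = encode_freq([1, 1])
print(a)   # [6, 4, 1, 0, 2, 5]
print(b)   # [2, 6, 5, 0, 3, 4]
print(sum(ai != bi for ai, bi in zip(a, b)))   # Hamming distance 5 = d_min
```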
page 215
4.2.2 Coding of Reed-Solomon Codes
Example of systematic coding in the time domain (method 3):
A (6,2)-RS code over GF(7) with the primitive element z = 5 is considered. The information u = (3,4) is to be coded systematically.

u = (3,4) ⇒ u(x) = 3 + 4x ⇒ u(x) ⋅ x^(n−k) = 3x^4 + 4x^5.

Division of u(x) ⋅ x^(n−k) by g(x) = 2 + 5x + 6x^2 + 4x^3 + x^4 (compare page 203):

    (4x^5 + 3x^4) ÷ (x^4 + 4x^3 + 6x^2 + 5x + 2) = 4x + 1 + r(x)/g(x)
  − (4x^5 + 2x^4 + 3x^3 + 6x^2 + 1x)
    ————————————————————————————————
      x^4 + 4x^3 + x^2 + 6x
  − (x^4 + 4x^3 + 6x^2 + 5x + 2)
    ————————————————————————————————
      2x^2 + 1x + 5 = r(x)

    −r(x) + u(x) ⋅ x^(n−k) = 2 + 6x + 5x^2 + 3x^4 + 4x^5 ⇒ a = (2, 6, 5, 0, 3, 4),
                                                               |— −r —|  |— u —|

An alternative is given by coding with a systematic generator matrix:

    b = u ⋅ G_syst = (3,4) ⋅ ( 1 0 2 5 6 4 ) = (3,4,2,6,5,0).
                             ( 0 1 6 3 2 4 )
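Method 3 as a small sketch (Python): long division of u(x) ⋅ x^(n−k) by the monic g(x), with coefficients stored lowest degree first (the helper name encode_systematic is illustrative, not from the lecture).

```python
# Method 3: systematic coding in the time domain -- divide u(x)*x^(n-k)
# by g(x) and prepend the negated remainder ((6,2)-RS code over GF(7)).

p, n, k = 7, 6, 2
g = [2, 5, 6, 4, 1]                  # g(x) = 2 + 5x + 6x^2 + 4x^3 + x^4

def encode_systematic(u):
    # dividend u(x) * x^(n-k), coefficients lowest degree first
    rem = [0] * (n - k) + list(u)
    for i in range(len(u) - 1, -1, -1):        # long division (g is monic)
        coef = rem[i + n - k]                  # current leading coefficient
        for d, gd in enumerate(g):
            rem[i + d] = (rem[i + d] - coef * gd) % p
    r = rem[:n - k]                            # remainder r(x)
    return [(-ri) % p for ri in r] + list(u)   # code word (-r | u)

print(encode_systematic([3, 4]))   # [2, 6, 5, 0, 3, 4]
```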
page 216
4.2.2 Coding of Reed-Solomon Codes
Example of systematic coding in the time domain (method 3):
A (7,3)-RS code over GF(2^3) is considered. The primitive polynomial is p(x) = 1 + x + x^3. The code word length is n = p^m − 1 = 8 − 1 = 7. The number of correctable symbols is t = 2, because n − k = 4 with k = 3. With p(x), the following relation applies to the primitive element: 1 + z + z^3 = 0. The single elements of this expansion field in exponential and component representation are:

    0   = 0
    z^0 = 1
    z^1 = z
    z^2 = z^2
    z^3 = 1 + z
    z^4 = z + z^2
    z^5 = 1 + z + z^2
    z^6 = 1 + z^2

The generator polynomial is calculated according to eq. (4.39):

    g(x) = (x − z)(x − z^2)(x − z^3)(x − z^4)
         = (x^2 + xz^4 + z^3)(x^2 + xz^6 + 1)
         = x^4 + x^3z^3 + x^2 + xz + z^3.

The check polynomial is calculated according to eq. (4.40):

    h(x) = (x − z^5)(x − z^6)(x − z^7)
         = x^3 + x^2z^3 + xz^2 + z^4.
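The exponential-to-component table can be generated by repeated multiplication by z with the reduction z^3 → 1 + z; a minimal sketch (Python, elements encoded as 3-bit integers where bit i stands for z^i):

```python
# Exponential <-> component representation of GF(2^3) with primitive
# polynomial p(x) = 1 + x + x^3, i.e. z^3 = 1 + z.

def gf8_table():
    table, v = {}, 1                # start with z^0 = 1
    for i in range(7):
        table[i] = v
        v <<= 1                     # multiply by z
        if v & 0b1000:
            v = (v ^ 0b1000) ^ 0b011   # reduce z^3 -> 1 + z
    return table

print(gf8_table())
# {0: 1, 1: 2, 2: 4, 3: 3, 4: 6, 5: 7, 6: 5}
# e.g. z^3 = 0b011 = 1 + z, z^6 = 0b101 = 1 + z^2, as in the table above
```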
page 217
4.2.2 Coding of Reed-Solomon Codes
Example (continuation):
The respective generator matrix reads:

    G = ( z^3  z    1    z^3  1    0    0 )
        ( 0    z^3  z    1    z^3  1    0 )
        ( 0    0    z^3  z    1    z^3  1 )

The information u = (1, z, z^2) is to be coded systematically.
u = (1, z, z^2) ⇒ u(x) = 1 + xz + x^2z^2 ⇒ u(x) ⋅ x^(n−k) = x^4 + x^5z + x^6z^2.

Division of u(x) ⋅ x^(n−k) by g(x) = x^4 + x^3z^3 + x^2 + xz + z^3 (subtraction equals addition in GF(2^3)):

    (x^6z^2 + x^5z + x^4) ÷ (x^4 + x^3z^3 + x^2 + xz + z^3) = x^2z^2 + xz^6 + 1 + r(x)/g(x)
  + (x^6z^2 + x^5z^5 + x^4z^2 + x^3z^3 + x^2z^5)
    ————————————————————————————————————————
      x^5z^6 + x^4z^6 + x^3z^3 + x^2z^5
  + (x^5z^6 + x^4z^2 + x^3z^6 + x^2 + xz^2)
    ————————————————————————————————————————
      x^4 + x^3z^4 + x^2z^4 + xz^2
  + (x^4 + x^3z^3 + x^2 + xz + z^3)
    ————————————————————————————————————————
      x^3z^6 + x^2z^5 + xz^4 + z^3 = r(x)

−r(x) + u(x) ⋅ x^(n−k) = z^3 + xz^4 + x^2z^5 + x^3z^6 + x^4 + x^5z + x^6z^2 ⇒ a = (z^3, z^4, z^5, z^6, 1, z, z^2).
page 218
4.2.3 Decoding of Reed-Solomon Codes
The decoding of RS codes is carried out by means of special algorithms; the use of a syndrome table is not possible due to the enormous memory that would be required. A receive vector r is assumed which has resulted from superimposing an error vector f on the originally sent code word a: r = a + f. Let the RS code be able to correct t errors.

The approach of algebraic decoding can then be divided into the following steps:
1. Calculation of the syndrome S(x) from r.
2. Calculation of the fault locations from S(x) by means of the so-called code equation.
3. Calculation of the error values of f, as RS codes are not binary codes.
4. Error correction using the relation a = r − f.
page 219
4.2.3 Decoding of Reed-Solomon Codes
Calculation of the syndrome:
By definition, a code word A has 2t consecutive coefficients in the frequency domain that are equal to zero. These positions of the vector R are referred to as the syndrome vector or simply the syndrome S:

    r = a + f  ○—●  R = A + F = (A_0+F_0, A_1+F_1, ..., A_(k−1)+F_(k−1), F_(n−2t), ..., F_(n−1)).   (4.47)

Calculation of the fault locations:
For this, the so-called code equation is derived. The code equation establishes a relationship between the syndrome S(x) and the fault location polynomial c(x) ○—● C(x), the zeros of which mark the fault locations. c(x) is defined such that c_i = 0 for f_i ≠ 0, so that the following applies:

    c_i ⋅ f_i = 0 for i = 0, 1, ..., n − 1.                                (4.48)

Thus, every fault location produces a zero / a linear factor in C(x):

    C(x) = ∏_{i, f_i ≠ 0} (x − z^i).                                       (4.49)
page 220
4.2.3 Decoding of Reed-Solomon Codes
According to equations 4.31b and 4.26e, for the DFT of the product c_i ⋅ f_i applies:

    c_i ⋅ f_i = 0  ○—●  C(x) ⋅ F(x) = 0 mod (x^n − 1).                     (4.50)

The polynomial C(x) ⋅ F(x) vanishes identically. The degree of C(x) is equal to the number of fault locations e. This applies under the assumption:

    e ≤ t = (n − k)/2.                                                     (4.51)

With eq. (4.49), for the fault location polynomial C(x) applies:

    C(x) = C_0 + C_1x + ... + C_ex^e.

Thus, C(x) has exactly e + 1 coefficients; however, only the e zeros of the polynomial are of interest. Therefore, one coefficient C_i is arbitrary and can be normalised to 1. In eq. (4.49), C_e = 1 is chosen. If instead C_0 = 1 is chosen, C(x) is defined by:

    C(x) = ∏_{i, f_i ≠ 0} (1 − x ⋅ z^(−i)) = 1 + C_1x + ... + C_ex^e.      (4.52)
page 221
4.2.3 Decoding of Reed-Solomon Codes
Before deriving the code equation, the relations between the polynomials considered so far are summarised first. These are the sent code word a(x), the error polynomial f(x), the receive word r(x), the fault location polynomial c(x), and the appropriate representations in the frequency domain (x marks a coefficient that may be nonzero):

    a(x)  xxxxxxxxxxxxxxxxxxxx   ○—●   A(x)  xxxxxxxxxxxxxx000000
    f(x)  00000x000x000000x000   ○—●   F(x)  xxxxxxxxxxxxxxxxxxxx
    r(x)  xxxxxxxxxxxxxxxxxxxx   ○—●   R(x)  xxxxxxxxxxxxxxxxxxxx   (S(x) at the test frequencies)
    c(x)  xxxxx0xxx0xxxxxx0xxx   ○—●   C(x)  xxxx0000000000000000

    c_i ⋅ f_i = 0   ○—●   C(x) ⋅ F(x) = 0 mod (x^n − 1)
page 222
In order to determine the coefficients Ci by setting up and subsequently solving the code equation, it is assumed at first that e = t. Eq. (4.50) in the form C(x) ⋅ F(x) = 0 mod (x^n − 1) is written as a system of equations, where the coefficients are arranged according to index sums modulo n in ascending order, that is, according to the powers x^0 to x^(n−1) (x^n = x^0):

Σ_{i=0}^{t} Ci ⋅ F(j−i) mod n = 0  for j = 0, 1, ..., n − 1,    (4.53)

written out:

C0F0    + C1Fn−1   + C2Fn−2   + ... + CtFn−t    = 0    (power x^0)
C0F1    + C1F0     + C2Fn−1   + ... + CtFn−t+1  = 0    (power x^1)
  ⋮
C0Fn−t  + C1Fn−t−1 + ...            + CtFn−2t   = 0
  ⋮
C0Fn−1  + C1Fn−2   + ...            + CtFn−t−1  = 0

The last t equations (j = n − t, ..., n − 1) form the framed part; they contain only the known coefficients Fn−2t, ..., Fn−1.
page 223
The framed part of eq. (4.53) consists of t equations with t unknowns. The coefficients of F occurring there are known and correspond to the syndrome coefficients of S(x):

S0 = Fn−2t, S1 = Fn−2t+1, ..., S2t−1 = Fn−1.

Inserting this in the framed part of eq. (4.53), the following applies:

C0St     + C1St−1   + ... + CtS0    = 0
C0St+1   + C1St     + ... + CtS1    = 0
  ⋮
C0S2t−1  + C1S2t−2  + ... + CtSt−1  = 0    (4.54)

The linear equation system is uniquely solvable. The relation obtained between the wanted coefficients Ci and the known coefficients Si corresponds to a cyclic convolution and, using the normalisation C0 = 1, forms the code equation (also called Newton identity):

Sj + Σ_{i=1}^{t} Ci ⋅ Sj−i = 0  for j = t, t+1, ..., 2t − 1    (4.55)
page 224
In general, the number of errors is unknown, a small number of errors being more likely than a large number. For e < t, the wanted coefficients are C1, C2, ..., Ce. The number of unknowns is e, whereas the number of equations is t. Hence the equation system is over-determined, and different trial solutions for C(x) are required. For this, the non-normalised code equation with a variable number of errors is considered:

Σ_{i=0}^{e} Ci ⋅ Sj−i = 0  for j = e, ..., 2t − 1 and e = 1, 2, ..., t    (4.56)

In matrix notation, the following representation applies (2t − e rows, for e = 1, 2, ..., t):

( Se      Se−1    ...  S0       )   ( C0 )   ( 0 )
( Se+1    Se      ...  S1       ) ⋅ ( C1 ) = ( ⋮ )    (4.57)
(  ⋮        ⋮            ⋮       )   (  ⋮ )   ( 0 )
( S2t−1   S2t−2   ...  S2t−1−e  )   ( Ce )
page 225
The search for the coefficients of the fault location polynomial starts with the most likely error event, e = 1. If no solution is found for this number of errors, e is increased step by step until it reaches the limit t. If no fault location polynomial has been found even then, a decoding failure is declared.
Using two well-known algorithms, the solution of the code equation can be calculated with low computational effort. These are:
– the Berlekamp-Massey algorithm and
– Euclid's division algorithm.
Once a solution for the fault location polynomial has been found, the fault locations are determined by simply testing all candidate zeros (Chien search): C(x = z^i) for 0 ≤ i ≤ n − 1. If for any j from this range

C(x = z^j) = 0    (4.58)

applies, then j is a fault location.
page 226
Calculation of the error values:
Eq. (4.50) corresponds to a cyclic convolution of the coefficients, which can be written as

Σ_{i=0}^{e} Ci ⋅ F(j−i) mod n = 0  for j = 0, 1, ..., n − 1 (index mod n)    (4.59a)

or

C0 ⋅ Fj + Σ_{i=1}^{e} Ci ⋅ F(j−i) mod n = 0  for j = 0, 1, ..., n − 1 (index mod n)    (4.59b)

using C(x) = C0 + C1x + ... + Ce x^e
and F(x) = F0 + F1x + ... + Fk−1 x^(k−1) + Fk x^k + Fk+1 x^(k+1) + ... + Fn−1 x^(n−1),
where F0, ..., Fk−1 form the unknown part and Fk, ..., Fn−1 the known part (S).

Solved for Fj, the error values can be calculated recursively:

Fj = −C0^(−1) ⋅ Σ_{i=1}^{e} Ci ⋅ F(j−i) mod n  for j = 0, 1, ..., n − 1 (index mod n)    (4.60)
page 227
Example:
The (6,2) RS code over GF(7) with the primitive element z = 5 is considered. The parameters are: n = 6, k = 2, d = dH,min = 5 and t = 2. The receive vector r = (1,2,3,1,1,1) is to be decoded.

1. Calculation of the syndrome (eq. 4.47):
r = (r0,r1,r2,r3,r4,r5) = (1,2,3,1,1,1)
R = (R0,R1,R2,R3,R4,R5) = (R0,R1,F2,F3,F4,F5) = (5,0,4,6,6,1)
⇒ S = (F2,F3,F4,F5) = (S0,S1,S2,S3) = (4,6,6,1).

2. Calculation of the number of errors using eq. (4.57):
Assumption of the most likely number of errors: e = 1 ⇒ C(x) = C0 + C1x.
Normalisation (arbitrary): C1 = 1 ⇒ with (S1 S0; S2 S1; S3 S2) = (6 4; 6 6; 1 6):

( 6  4 )            ( 0 )
( 6  6 ) ⋅ ( C0 ) = ( 0 )
( 1  6 )   (  1 )   ( 0 )
page 228
The single equations have the following solutions:
6 C0 + 4 = 0 ⇒ C0 = 4,
6 C0 + 6 = 0 ⇒ C0 = 6 and
1 C0 + 6 = 0 ⇒ C0 = 1.
This is a contradiction, thus e has to be increased:
e = 2 ⇒ C(x) = C0 + C1x + C2x^2.
Normalisation: C2 = 1 ⇒ with (S2 S1 S0; S3 S2 S1) = (6 6 4; 1 6 6):

                ( C0 )
( 6  6  4 )   ⋅ ( C1 ) = ( 0 )
( 1  6  6 )     (  1 )   ( 0 )

Solving the equation system leads to:
I:    6 C0 + 6 C1 + 4 = 0
II:   1 C0 + 6 C1 + 6 = 0
I−II: 5 C0 − 2 = 0 ⇒ C0 = 2 ⋅ 5^(−1) = 6
II:   6 C1 + 5 = 0 ⇒ C1 = −5 ⋅ 6^(−1) = 5
Thus, the fault location polynomial reads: C(x) = 6 + 5x + x^2.
page 229
The search for the roots (Chien search) results in:
C(x = z^0 = 1) = 6 + 5 + 1 = 5,
C(x = z^1 = 5) = 6 + 4 + 4 = 0 ⇒ root at z^1,
C(x = z^2 = 4) = 6 + 6 + 2 = 0 ⇒ root at z^2,
C(x = z^3 = 6) = 6 + 2 + 1 = 2,
C(x = z^4 = 2) = 6 + 3 + 4 = 6 and
C(x = z^5 = 3) = 6 + 1 + 2 = 2.

Errors were detected at the 1st and the 2nd position of the receive vector (indexing beginning with 0!).
Check: C(x) = (x − z^1) ⋅ (x − z^2) = (x − 5)(x − 4) = 6 + 5x + x^2.

3. Calculation of the error values (eq. 4.60):
With R = (R0,R1,F2,F3,F4,F5) = (5,0,4,6,6,1) and C(x) = 6 + 5x + x^2, the following results:
F0 = −C0^(−1) ⋅ (C1 ⋅ F−1 + C2 ⋅ F−2)
   = −C0^(−1) ⋅ (C1 ⋅ F5 + C2 ⋅ F4)
   = −6^(−1) ⋅ (5 ⋅ 1 + 1 ⋅ 6) = 1 ⋅ (5 + 6) = 4
page 230
and
F1 = −C0^(−1) ⋅ (C1 ⋅ F0 + C2 ⋅ F−1)
   = −C0^(−1) ⋅ (C1 ⋅ F0 + C2 ⋅ F5)
   = −6^(−1) ⋅ (5 ⋅ 4 + 1 ⋅ 1) = 1 ⋅ (6 + 1) = 0.

With this, the complete error vector in the frequency domain reads:
F = (F0,F1,F2,F3,F4,F5) = (4,0,4,6,6,1).
The inverse transform into the time domain results in:
f = (f0,f1,f2,f3,f4,f5) = (0,1,2,0,0,0).
Finally, the decoding result reads:
â = r − f = (1,2,3,1,1,1) − (0,1,2,0,0,0) = (1,1,1,1,1,1).

This step provides a good check, since in f only those positions may be non-zero that were determined by the search for the fault locations.
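The whole example can be re-checked with a short script. This is a minimal sketch, not the algorithm recommended above: all function names are illustrative, the small system (4.57) is solved by brute force instead of Berlekamp-Massey or Euclid, and the transform pair Ri = n^(−1) ⋅ Σ_j rj z^(−ij), rj = R(x = z^j) is an assumption chosen because it reproduces the spectrum R = (5,0,4,6,6,1) given on page 227.

```python
from itertools import product

# Sketch of the (6,2) RS decoding example over GF(7), z = 5 (pp. 227-231).
n, k, p, z = 6, 2, 7, 5
t = (n - k) // 2
zinv, ninv = pow(z, p - 2, p), pow(n, p - 2, p)   # inverses in GF(7)

def dft(v):   # V_i = n^-1 * sum_j v_j z^(-ij); reproduces R on p. 227
    return [ninv * sum(vj * pow(zinv, i * j, p) for j, vj in enumerate(v)) % p
            for i in range(n)]

def idft(V):  # v_j = V(x = z^j)
    return [sum(Vi * pow(z, i * j, p) for i, Vi in enumerate(V)) % p
            for j in range(n)]

r = [1, 2, 3, 1, 1, 1]
R = dft(r)
S = R[n - 2 * t:]                 # syndrome = last 2t spectral positions

# Fault location polynomial: try e = 1, ..., t against eq. (4.57),
# normalising C_e = 1, by brute force over the small field.
C = None
for e in range(1, t + 1):
    for tail in product(range(p), repeat=e):
        cand = list(tail) + [1]   # (C_0, ..., C_e) with C_e = 1
        if all(sum(cand[i] * S[j - i] for i in range(e + 1)) % p == 0
               for j in range(e, 2 * t)):
            C = cand
            break
    if C:
        break

# Chien search (4.58): fault locations are the i with C(z^i) = 0.
def Cval(x):
    return sum(c * pow(x, i, p) for i, c in enumerate(C)) % p
err_pos = [i for i in range(n) if Cval(pow(z, i, p)) == 0]

# Recursive error values, eq. (4.60).
F = [None] * k + list(S)
c0inv = pow(C[0], p - 2, p)
for j in range(k):
    F[j] = (-c0inv * sum(C[i] * F[(j - i) % n]
                         for i in range(1, len(C)))) % p

f = idft(F)
a_hat = [(rj - fj) % p for rj, fj in zip(r, f)]
# a_hat == [1, 1, 1, 1, 1, 1], as on page 231
```

The brute-force search over GF(7)^e is only feasible because the example is tiny; for realistic parameters the Berlekamp-Massey or Euclid algorithm mentioned above is used instead.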
page 231
4.3 Bose-Chaudhuri-Hocquenghem Codes
Like the RS codes, Bose-Chaudhuri-Hocquenghem codes (BCH codes) belong to the class of cyclic codes and represent an important extension of the Hamming codes. They are especially suitable for the correction of multiple, statistically independent (single) errors. A distinction is made between binary and non-binary BCH codes: the binary BCH codes can be considered a special case of the RS codes, whereas the RS codes in turn can be described as a special case of the non-binary BCH codes. Only the binary BCH codes, however, will be considered here.
A binary BCH code of code word length n = 2^m − 1 is characterised by the property that every code word a in the time domain consists of components ai ∈ GF(2), whereas the code word in the frequency domain consists of components Ai ∈ GF(2^m).
In analogy to the RS code, a binary BCH code can be defined in the frequency domain by vanishing adjacent check positions or, in the time domain, by a generator polynomial.
page 232
For a binary BCH code with ai ∈ GF(2), the DFT transform equations 4.25a/b simplify to

Ai = a(x = z^−i)  and  aj = A(x = z^j),    (4.61a/b)

although Ai ∈ GF(2^m). If Ai = 0 applies, the polynomial a(x) has the root z^−i in the time domain; that is, a(x) contains the linear factor (x − z^−i):

Ai = 0 = a(x = z^−i).    (4.62)

4.3.1 Derivation of BCH Codes in the Frequency Domain

A polynomial f(x) over GF(2) has the following important property:

[f(x)]^2 = f(x^2)    (4.63)

Proof via the binomial formula:
(a + b)^2 = a^2 + 2ab + b^2 = a^2 + b^2, since 2ab = 0 mod 2.
a and b can themselves again be sums, whose components are sums as well, etc. The result is an arbitrarily long sum with the property to be proven.
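The property (4.63) can be checked numerically over GF(2). The following sketch represents polynomials as Python integer bitmasks (bit i holding the coefficient of x^i, an ad-hoc encoding not used in the script itself):

```python
# Check [f(x)]^2 = f(x^2) over GF(2) for all polynomials of degree < 8.
# Polynomials are bitmasks: bit i <-> coefficient of x^i.

def clmul(a, b):               # carry-less product = multiplication in GF(2)[x]
    acc = 0
    while b:
        if b & 1:
            acc ^= a
        a, b = a << 1, b >> 1
    return acc

def spread(f):                 # f(x) -> f(x^2): move bit i to bit 2i
    acc, i = 0, 0
    while f:
        if f & 1:
            acc |= 1 << (2 * i)
        f, i = f >> 1, i + 1
    return acc

for f in range(256):
    assert clmul(f, f) == spread(f)   # squaring only spreads the coefficients
```

The loop passes because all the mixed terms of the square carry the factor 2 and vanish modulo 2, exactly as in the proof above.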
page 233
With eq. (4.63) applies:
A2i = a(x = z^−2i) = a(x = (z^−i)^2) = [a(x = z^−i)]^2 = (Ai)^2
A4i = a(x = z^−4i) = a(x = (z^−2i)^2) = [a(x = z^−2i)]^2 = (A2i)^2 = (Ai)^4
A8i = a(x = z^−8i) = a(x = (z^−4i)^2) = [a(x = z^−4i)]^2 = (A4i)^2 = (Ai)^8 etc.
Thus, the general relation ensuring that the condition ai ∈ GF(2) is fulfilled reads:

A(2^j ⋅ i) mod n = (Ai)^(2^j)    (4.64)

With this, the definition reads as follows:
The binary (primitive) BCH code C with the code word length n = 2^m − 1 and the designed minimum distance d is defined by

C = {a | ai = A(x = z^i), grad A(x) ≤ n − d, A(2^j ⋅ i) = (Ai)^(2^j), Ai ∈ GF(2^m)}    (4.65)

with the primitive element z ∈ GF(2^m).
page 234
Example:
A BCH code over GF(2^3) with n = 2^3 − 1 = 7 and t = 1, that is d = 3, is considered. For the components of the code word vector
A = (A0,A1,A2,A3,A4,A5,A6) = (A0,A1,A2,A3,A4,0,0) applies:
for A0: A(2^j ⋅ 0) = A0 = (A0)^(2^j) ⇒ A0 ∈ GF(2)
for A1: A(2^j ⋅ 1) = A(2^j) = (A1)^(2^j) ⇒ A2 = (A1)^2, A4 = (A1)^4, A(8 mod 7) = A1 = (A1)^8 ⇒ A1 ∈ GF(2^3)
for A3: A(2^j ⋅ 3) = (A3)^(2^j) ⇒ A6 = (A3)^2, A(12 mod 7) = A5 = (A3)^4, A(24 mod 7) = A3 = (A3)^8 ⇒ A3 ∈ GF(2^3)
Because A5 = A6 = 0, however, also A3 = 0. That means, although 5 digits are available in the frequency domain, only two symbols, i.e. 4 bits (1 + 3) of information per code word, can be transferred.

Coding example:
A0 = 1, A1 = z ⇒ A = (1, z, z^2, 0, z^4, 0, 0), A(x) = 1 + z x + z^2 x^2 + z^4 x^4
a0 = A(z^0) = 1 + z + z^2 + z^4 = 1 + z + z^2 + z^2 + z = 1
a1 = A(z^1) = 1 + z⋅z + z^2⋅z^2 + z^4⋅z^4 = 1 + z^2 + z^4 + z^8 = 1 + z^2 + z^2 + z + z = 1 ...
⇒ a = (1,1,0,1,0,0,0) (comp. the respective RS coding on p. 214f.)
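The coding example can be re-checked by machine. The sketch below assumes GF(2^3) is built with the primitive polynomial z^3 = z + 1 (consistent with the reduction z^4 = z^2 + z used in the calculation above); field elements are bitmasks (1 ↔ 1, 2 ↔ z, 4 ↔ z^2):

```python
# Re-check of the BCH coding example over GF(2^3), assuming z^3 = z + 1.
MOD, n = 0b1011, 7            # z^3 + z + 1 as a bitmask; code length n = 7

def gf_mul(a, b):             # multiplication in GF(8)
    acc = 0
    while b:
        if b & 1:
            acc ^= a
        b >>= 1
        a <<= 1
        if a & 0b1000:        # reduce by z^3 + z + 1
            a ^= MOD
    return acc

def gf_pow(a, e):
    acc = 1
    for _ in range(e):
        acc = gf_mul(acc, a)
    return acc

z = 0b010
A = [1, z, gf_mul(z, z), 0, gf_pow(z, 4), 0, 0]   # A = (1, z, z^2, 0, z^4, 0, 0)

# a_j = A(x = z^j), eq. (4.61b); XOR is addition in GF(2^m)
a = []
for j in range(n):
    s = 0
    for i, Ai in enumerate(A):
        s ^= gf_mul(Ai, gf_pow(z, (i * j) % n))
    a.append(s)
# a == [1, 1, 0, 1, 0, 0, 0]: a binary word, as required for a BCH code
```

Note that although the spectral components live in GF(8), every a_j comes out as 0 or 1; this is precisely what the conjugacy constraints (4.64) guarantee.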
page 235
The aforementioned example can be realised using the adjoining circuit.
page 236
4.3.2 Derivation of the BCH Codes in the Time Domain
If a component Ai of a vector A as well as all its conjugate components are equal to zero, the product of the corresponding linear factors forms, in the time domain, a minimum polynomial with coefficients from GF(2) according to eqs. (4.17) and (4.22). If A has consecutive components Ai, Ai−1, ... that are equal to zero, the product of the minimum polynomials yields the generator polynomial of the binary BCH code. Then the definition reads:

Kj are the cyclotomic cosets with respect to n = 2^m − 1, z is a primitive element of GF(2^m), and M is the union of cyclotomic cosets: M = {Kj1 ∪ Kj2 ∪ ...}. A primitive BCH code of length n = 2^m − 1 is defined by the generator polynomial:

g(x) = ∏_{i∈M} (x − z^i) = mj1(x) ⋅ mj2(x) ⋅ ...    (4.66)

The BCH code has the dimension k = n − grad(g(x)).
If d − 1 consecutive numbers exist in M, the designed minimum distance is d. The minimum distance actually achieved can be larger.
page 237
Example:
On the basis of GF(2^4), that is n = 2^4 − 1 = 15, constructions for different designed minimum distances are considered (compare p. 184).

I:  g(x) = ∏_{i∈{K1,K2}} (x − z^i) = m1(x) = x^4 + x + 1

With M = {K1,K2} = K1 = {1,2,4,8} the designed minimum distance is d = 3, since d − 1 = 2 consecutive numbers are contained in M.
The dimension is k = n − grad(g(x)) = 15 − 4 = 11.

II: g(x) = ∏_{i∈{K1,K2,K3,K4}} (x − z^i) = m1(x) ⋅ m3(x)
         = (x^4 + x + 1) ⋅ (x^4 + x^3 + x^2 + x + 1)
         = x^8 + x^7 + x^6 + x^4 + 1

With M = {K1∪K3} = {1,2,3,4,6,8,9,12} the designed minimum distance is d = 5, since d − 1 = 4 consecutive numbers are contained in M.
The dimension is k = n − grad(g(x)) = 15 − 8 = 7.
page 238
III: g(x) = ∏_{i∈{K1,...,K6}} (x − z^i) = m1(x) ⋅ m3(x) ⋅ m5(x)
          = (x^4 + x + 1) ⋅ (x^4 + x^3 + x^2 + x + 1) ⋅ (x^2 + x + 1)
          = x^10 + x^8 + x^5 + x^4 + x^2 + x + 1

When combining the cyclotomic cosets, it has to be taken into account that this is a union of sets, in which the same element cannot occur repeatedly. Therefore, e.g. for the first example (I) the following applies:
M = {K1∪K2} = K1, since K1 = K2.

With M = {K1∪K3∪K5} = {1,2,3,4,5,6,8,9,10,12} the designed minimum distance is d = 7, since d − 1 = 6 consecutive numbers are contained in M.
The dimension is k = n − grad(g(x)) = 15 − 10 = 5.
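The cosets and the polynomial products in examples I-III can be cross-checked with a few lines of GF(2) arithmetic. As before, polynomials are encoded as bitmasks (bit i ↔ x^i), a representation chosen for this sketch only:

```python
# Cross-check of examples I-III: cyclotomic cosets mod 15 and GF(2)
# products of the minimal polynomials.

def coset(j, n=15):                 # K_j = {j, 2j, 4j, ...} mod n
    c, x = set(), j % n
    while x not in c:
        c.add(x)
        x = (2 * x) % n
    return c

def clmul(a, b):                    # multiplication in GF(2)[x]
    acc = 0
    while b:
        if b & 1:
            acc ^= a
        a, b = a << 1, b >> 1
    return acc

m1 = 0b10011                        # x^4 + x + 1
m3 = 0b11111                        # x^4 + x^3 + x^2 + x + 1
m5 = 0b111                          # x^2 + x + 1

K1, K3, K5 = coset(1), coset(3), coset(5)
# K1 == {1,2,4,8}, K3 == {3,6,9,12}, K5 == {5,10}

g2 = clmul(m1, m3)                  # example II: x^8 + x^7 + x^6 + x^4 + 1
g3 = clmul(g2, m5)                  # example III: x^10 + x^8 + x^5 + x^4 + x^2 + x + 1
```

The degrees of g2 (8) and g3 (10) confirm the dimensions k = 7 and k = 5 derived above.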
page 239
4.3.3 Aspects of Coding / Decoding of BCH Codes
The BCH code is a linear cyclic code that, in analogy to section 3.3, can be encoded in two ways using the generator polynomial:
– non-systematically by multiplication by the generator polynomial and
– systematically by division by the generator polynomial.
Unlike with the RS codes, the check frequencies of the BCH codes cannot be shifted arbitrarily, since the conjugacy conditions fix their possible positions.
The decoding is similar to that of the RS codes. An advantage here is that with binary codes no calculation of the error values is required.
page 240
5 Convolutional Codes
Besides the block codes, the convolutional codes are the second important group of error-correcting codes. Instead of generating code words block by block, the code words are generated by convolution of an information sequence with a set of generator coefficients.
Although they have a simpler mathematical structure than the block codes, no analytical construction methods exist for them; good convolutional codes are found by computer search. However, convolutional codes can exploit reliability information about the code symbols much better than block codes. In return, they are highly sensitive to burst errors. Normally, the codes are binary.
The first publications regarding convolutional codes appeared in 1955 by Elias. However, they could only be used in practice after suitable decoding methods had been developed. Among those are sequential decoding by Fano (1963) and the maximum-likelihood decoding algorithm by Viterbi (1967), which can be realised in practice.
page 241
Soft and hard decision:
For decoding, convolutional codes can use hard-decision as well as soft-decision symbols. Hard decision means that, after demodulation, the continuous received signal is converted by a threshold decision into a sequence of possible transmit symbols. With soft decision, either the analog signal itself is passed on or quantised intermediate values are used (see figure). This makes a more reliable decoding possible.

(Figure: receive level over time with the HD decision threshold; a sample on the wrong side of the threshold produces a bit error.)
page 242
5.1 Definition of Convolutional Codes
In analogy to the block codes, information and code sequences are divided into blocks of length k and n, respectively. Here, however, they are indexed by the block number r:

ur = (ur,1, ur,2, ..., ur,k),  ar = (ar,1, ar,2, ..., ar,n).    (5.1a/b)

Usually, the symbols are binary: ai,j, ui,j ∈ GF(2) = {0,1}.
The assignment of ar to ur is, in contrast to the block codes, not memoryless. The current code block is determined by the current information block and by m preceding ones:

ar = ar(ur, ur−1, ..., ur−m)    (5.2)

The coder is linear (comp. section 5.4), that means the code bits are linear combinations of the information bits. The formal description uses the generator coefficients gκ,μ,ν ∈ GF(2) with 1 ≤ κ ≤ k, 0 ≤ μ ≤ m and 1 ≤ ν ≤ n, so that the code bit subsequences result from convolutions of the information bit sequences with the generator coefficients:

ar,ν = Σ_{μ=0}^{m} Σ_{κ=1}^{k} gκ,μ,ν ⋅ ur−μ,κ    (5.3)
page 243
The code rate of the convolutional code is:

RC = k/n    (5.4)

The quantity m is referred to as the memory and L = m + 1 as the constraint length. In addition to the code rate, it influences the performance as well as the decoding effort. Formally, block codes are a special case of the convolutional codes with m = 0.
Whereas for block codes k and n are usually large, typical values for convolutional codes are: k = 1, 2; n = 2, 3, 4; m ≤ 8.
For the realisation of the memory, k ⋅ (m + 1) storage locations are required (comp. figure on the next page). In the literature, representations also exist that feed the information block ur directly to the adders without intermediate storage. In this case, the number of storage locations only amounts to k ⋅ m.
page 244
Block diagram of a general (n,k,m) convolutional coder:
A connection corresponds to a coefficient gκ,μ,ν = 1.
page 245
Example: (2,1,2) convolutional coder
The parameters are:n = 2, k = 1, m = 2,RC = 1/2.
The code bits are calculated from:ar,1 = ur + ur−1 + ur−2;ar,2 = ur + ur−2.
Example of a coding sequence:u = (1,1,0,1,0,0,...);a = (11,01,01,00,10,11,...).
This example will often be referred to in this section.
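The two code-bit equations above translate directly into a tiny shift-register simulation. This is an illustrative sketch (the function name is not from the script):

```python
# Sketch of the (2,1,2) coder: a_r1 = u_r + u_(r-1) + u_(r-2), a_r2 = u_r + u_(r-2).
def encode_212(u):
    s1 = s2 = 0                               # memory u_(r-1), u_(r-2), initially zero
    out = []
    for ur in u:
        out.append((ur ^ s1 ^ s2, ur ^ s2))   # XOR = addition in GF(2)
        s1, s2 = ur, s1                       # shift the register
    return out

# Reproduces the coding sequence above:
# encode_212([1, 1, 0, 1, 0, 0]) -> [(1,1), (0,1), (0,1), (0,0), (1,0), (1,1)]
```

Each output tuple is one code block (ar,1, ar,2); the two state variables are exactly the k ⋅ m = 2 storage locations of the register.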
page 246
Example: (3,2,2) convolutional coder
The parameters are:n = 3, k = 2, m = 2,RC = 2/3.
The code bits are calculated from:ar,1 = ur,2 + ur−1,1 + ur−2,2;ar,2 = ur,1 + ur−1,1 + ur−1,2;ar,3 = ur,2.
Example of a code sequence:u = (11,10,00,11,01,...);a = (111,110,010,111,001,...).
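A generic encoder working directly from eq. (5.3) can check this example as well. The sketch below (all names illustrative) represents the generator coefficients as the list of their 1-entries (κ, μ, ν):

```python
# Generic encoder for eq. (5.3): a_r,nu = sum_mu sum_kappa g_kappa,mu,nu * u_(r-mu),kappa.
def conv_encode(u_blocks, g, m, n):
    u = [(0,) * len(u_blocks[0])] * m + list(u_blocks)   # zero initial memory
    out = []
    for r in range(m, len(u)):
        block = []
        for nu in range(1, n + 1):
            bit = 0
            for (kappa, mu, nu_g) in g:                  # g lists the 1-entries
                if nu_g == nu:
                    bit ^= u[r - mu][kappa - 1]
            block.append(bit)
        out.append(tuple(block))
    return out

# 1-entries of g_kappa,mu,nu for the (3,2,2) coder of this example:
# a_r1 = u_r,2 + u_(r-1),1 + u_(r-2),2
# a_r2 = u_r,1 + u_(r-1),1 + u_(r-1),2
# a_r3 = u_r,2
g322 = [(2, 0, 1), (1, 1, 1), (2, 2, 1),
        (1, 0, 2), (1, 1, 2), (2, 1, 2),
        (2, 0, 3)]

# conv_encode([(1,1),(1,0),(0,0),(1,1),(0,1)], g322, m=2, n=3)
# -> [(1,1,1), (1,1,0), (0,1,0), (1,1,1), (0,0,1)]
```

With a different coefficient list, the same function reproduces the (2,1,2) example on the previous page.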
page 247
Polynomial description:
The restriction k = 1 is applied here, which results in considerable simplifications of the description. To the generator coefficients gμ,ν the generator polynomials

gν(x) = Σ_{μ=0}^{m} gμ,ν ⋅ x^μ    (5.5)

are assigned. The information bit sequence is characterised by a power series and the code block sequence by a vector of power series:

u(x) = Σ_{r=0}^{∞} ur x^r    (5.6)

and

a(x) = (a1(x), a2(x), ..., an(x))  with  aν(x) = Σ_{r=0}^{∞} ar,ν x^r    (5.7a/b)
page 248
The convolutional coding corresponds to a polynomial multiplication according to:

aν(x) = u(x) ⋅ gν(x)  for ν = 1, ..., n    (5.8a)

(a1(x), a2(x), ..., an(x)) = u(x) ⋅ (g1(x), g2(x), ..., gn(x))    (5.8b)

⇔ a(x) = u(x) ⋅ G(x)    (5.8c)

with the generator matrix:

G(x) = (g1(x), g2(x), ..., gn(x))    (5.9)

For the memory applies:

m = max_{1≤ν≤n} grad(gν(x))    (5.10)

Then the definition of the convolutional codes via the polynomial multiplication reads:

C = { a(x) = u(x) ⋅ G(x) | u(x) = Σ_{r=0}^{∞} ur x^r, ur ∈ {0,1} }    (5.11)
page 249
Example:
For the (2,1,2) convolutional coder on page 245 applies:
G(x) = (g1(x), g2(x)) = (1 + x + x^2, 1 + x^2).
The coding example can be described as:
u = (1,1,0,1,0,0,...) ⇔ u(x) = 1 + x + x^3 + ...
and
a(x) = (a1(x), a2(x)) = u(x) G(x)
     = ((1 + x + x^3)(1 + x + x^2), (1 + x + x^3)(1 + x^2))
     = (1 + x^4 + x^5, 1 + x + x^2 + x^5)
     = (a0,1 x^0 + a1,1 x^1 + a2,1 x^2 + a3,1 x^3 + a4,1 x^4 + a5,1 x^5,
        a0,2 x^0 + a1,2 x^1 + a2,2 x^2 + a3,2 x^3 + a4,2 x^4 + a5,2 x^5).
From this, the code sequence in vector notation results by sorting the coefficients:
a = (a0,1 a0,2, a1,1 a1,2, a2,1 a2,2, a3,1 a3,2, a4,1 a4,2, a5,1 a5,2).
Result: a = (11,01,01,00,10,11).
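The polynomial multiplications above can be checked with GF(2) carry-less arithmetic; polynomials are again bitmasks (bit r ↔ x^r), an encoding used only in this sketch:

```python
# The polynomial view of the (2,1,2) example: a_nu(x) = u(x) * g_nu(x) over GF(2).
def clmul(a, b):                   # multiplication in GF(2)[x]
    acc = 0
    while b:
        if b & 1:
            acc ^= a
        a, b = a << 1, b >> 1
    return acc

u = 0b1011                         # u(x) = 1 + x + x^3
g1, g2 = 0b111, 0b101              # 1 + x + x^2 and 1 + x^2

a1, a2 = clmul(u, g1), clmul(u, g2)
# a1 == 0b110001 (1 + x^4 + x^5), a2 == 0b100111 (1 + x + x^2 + x^5)

# Sorting the coefficients block by block:
a = [((a1 >> r) & 1, (a2 >> r) & 1) for r in range(6)]
# a == [(1,1), (0,1), (0,1), (0,0), (1,0), (1,1)] -- matches the result above
```

This confirms that the polynomial multiplication and the shift-register convolution describe the same code.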
page 250
5.2 Special Code Classes
Systematic convolutional codes:
The information bits are taken over directly as code bits.
Block diagram of a general systematic convolutional coder:
Non-systematic convolutional codes normally have more favourable properties.
page 251
Terminated convolutional codes:
After L information blocks, m known blocks (tail bits – normally zeros) are inserted. Thus, the code word is closed (termination) and the memory is set to the zero state. This corresponds to a block structure (block code). The code rate is then:

RC,terminated = (L ⋅ k) / ((L + m) ⋅ n)    (5.12)

(Figure: L information blocks followed by m known additional blocks; the coded information covers all L + m blocks.)

Note: Do not confuse the L combined blocks for the termination with the constraint length (see page 243)!
page 252
Punctured convolutional codes:
P code blocks are combined (P is called the puncturing length). From these, l code bits are deleted according to a puncturing scheme (puncturing matrix P). The code rate is then given by:

RC,punctured = (P ⋅ k) / (P ⋅ n − l)    (5.13)

It must hold: RC,punctured < 1 ⇒ l < P ⋅ (n − k).    (5.14)

Normally, punctured codes are applied with information blocks of length k = 1. At the same code rate, they are only slightly less efficient than non-punctured convolutional codes with k > 1. The decoding effort hardly changes compared to the non-punctured convolutional codes, and switching between rates is possible without great additional effort.
Of great practical importance are especially the RCPC codes (rate compatible punctured convolutional codes). Their code families are derived from a mother code (a non-punctured convolutional code) in such a way that the higher-rate codes are contained in the lower-rate codes.
page 253
Example:
Let l = 1 with the puncturing matrix (rows: code bits ν = 1, 2; columns: P = 2 successive blocks; 1 = keep, 0 = delete):

P = ( 1  1 )
    ( 1  0 )

This is NOT a matter of matrix multiplication!
The code example reads as follows:
u = (1,1,0,1,0,0,...)
a = (11,01,01,00,10,11,...) → apunct = (11,0x,01,0x,10,1x,...).
'x' identifies the deleted digits:
apunct = (11,0,01,0,10,1,...).
Code rate: RC,punctured = (2 ⋅ 1) / (2 ⋅ 2 − 1) = 2/3
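The deletion step can be sketched in a few lines. The matrix convention below (P[ν][c] = 1 means code bit ν of block c within a P-block group is kept) is an assumption consistent with the example, not a notation fixed by the script:

```python
# Sketch of puncturing with a keep/delete matrix P[nu][c].
def puncture(blocks, P):
    period = len(P[0])                       # = puncturing length P
    out = []
    for c, block in enumerate(blocks):
        kept = tuple(bit for nu, bit in enumerate(block)
                     if P[nu][c % period] == 1)
        out.append(kept)
    return out

P = [(1, 1),       # code bit 1: always kept
     (1, 0)]       # code bit 2: deleted in every second block
a = [(1, 1), (0, 1), (0, 1), (0, 0), (1, 0), (1, 1)]
# puncture(a, P) -> [(1,1), (0,), (0,1), (0,), (1,0), (1,)]
```

Out of every 4 code bits, 1 is deleted, reproducing the rate 2/3 computed above.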
page 254
5.3 Representation of Convolutional Codes
Besides the representation via the shift register circuit or via the generator polynomials and the polynomial multiplication, the following three representations are common:
– code tree
– state diagram
– network diagram (trellis diagram)

Code tree:
The code is unfolded as a tree, with the nodes corresponding to the possible memory states. From every node, 2^k branches emerge, labelled with the respective information and code bits. The disadvantage of this representation is that the number of nodes grows exponentially with every coding step. The number of possible paths is:

NPaths = (2^k)^NBlock = 2^(k ⋅ NBlock)    (5.15)

where NBlock is the number of processed blocks and thus the depth of the tree.
page 255
Example: Code tree for the (2,1,2) convolutional coder (see page 245). A path corresponds to an information sequence and its code sequence, respectively. Marked here: u = (1 1 0 1 ...), a = (11 01 01 00 ...).
state (output bits)
page 256
State diagram:
In contrast to the code tree, this is a repetition-free description of the convolutional code; the time information, however, is missing in the state diagram. The output block depends on the input block and the memory content, whereas the new memory content depends on the input block and the old memory content. From the diagram, the minimum number of coding steps required to get from a state A to a state B can be read off. The number of states is:

NStates = (2^k)^m = 2^(k ⋅ m)    (5.16)

At every state, 2^k transitions begin, so that all in all

NStates ⋅ 2^k = 2^(k ⋅ m) ⋅ 2^k = 2^(k ⋅ (m+1))    (5.17)

transitions exist. For large parameters, this representation can become unwieldy as well.
page 257
Example: (2,1,2) convolutional code (see page 245).
The parameters are:
n = 2, k = 1, m = 2, RC = 1/2.
The used notations are:
memory content = state = ur−1 ur−2
input bits (output bits) = ur (ar,1 ar,2)
page 258
Example: (3,2,2) convolutional coder (see page 246).
The parameters are:n = 3, k = 2, m = 2,RC = 2/3.
This example shows the problems with unfavourable parameter values. Although all arrows are included in the diagram, the labelling of the input and output bits had to be omitted.
page 259
Network diagram − trellis diagram:
It corresponds to a state diagram extended by a time axis and represents the basis for decoding. A receive sequence can be compared block by block with all possible paths, and thus the most likely code sequence for a receive word can be determined.
Example:Trellis segment with all possibletransitions for the (2,1,2)convolutional coder (see page 245).
page 260
Example:
Trellis diagram for the (2,1,2) convolutional coder (see page 245) with a coding example including the termination. The terminated code word reads: a = (11,01,01,00,10,11,00,00).
(Figure: trellis over time with initial phase, final phase and the termination marked.)
page 261
5.4 Features of Convolutional Codes
Linearity:
The convolution is a linear operation; thus, the convolutional code is linear.

Minimum distance and free distance:
The minimum distance as used for the evaluation of block codes is unsuitable for convolutional codes, because code words of convolutional codes do not have a defined length. Instead, a so-called free distance df is defined. It represents the minimum Hamming distance between arbitrary code sequences belonging to different information sequences:

df = min { dH(a1, a2) | u1 ≠ u2 }    (5.18a)

Since the convolutional code is linear, the distance to the zero sequence is sufficient for the calculation. Thus, df corresponds to the minimum weight of all possible non-zero code sequences:

df = min { wH(u(x) ⋅ G(x)) | u(x) = Σ_i ui x^i, u(x) ≠ 0 }    (5.18b)
page 262
Fundamental path, distance function and distance profile:
A path leaving the zero state and returning to it without touching it in between is called a fundamental path. Among these, at least one path exists that forms a code sequence of minimum weight. The paths can be determined via the trellis or the state diagram; for the latter, cutting the diagram open at the zero state is recommended.

Example (see next page): the shortest fundamental path from 00 to 00 is:
00 →(11) 10 →(10) 01 →(11) 00 ⇒ wH = 5.
Two paths can be found for wH = 6:
00 →(11) 10 →(01) 11 →(01) 01 →(11) 00 ⇒ wH = 6;
00 →(11) 10 →(10) 01 →(00) 10 →(10) 01 →(11) 00 ⇒ wH = 6.
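The minimum-weight fundamental path can also be found mechanically by a shortest-path search over the state diagram, with the output weight of each transition as the edge cost. This is a sketch with illustrative names, using the (2,1,2) coder (state = (ur−1, ur−2) as a 2-bit number):

```python
import heapq

# Free distance of the (2,1,2) code via Dijkstra over the state diagram.
def step(state, u):                       # one coder step -> (output weight, next state)
    s1, s2 = (state >> 1) & 1, state & 1
    w = (u ^ s1 ^ s2) + (u ^ s2)          # Hamming weight of (a_r1, a_r2)
    return w, (u << 1) | s1

def free_distance():
    w0, s0 = step(0, 1)                   # leave the zero state with input 1
    heap, best = [(w0, s0)], {}
    while heap:
        w, s = heapq.heappop(heap)
        if s == 0:
            return w                      # first return to 00 = minimum weight
        if best.get(s, 99) <= w:
            continue
        best[s] = w
        for u in (0, 1):
            dw, ns = step(s, u)
            heapq.heappush(heap, (w + dw, ns))
    return None

# free_distance() -> 5, in agreement with the fundamental path above
```

The same search generalises to any coder once `step` is replaced; it anticipates the result df = 5 derived analytically from the distance function on the following pages.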
page 263
State diagram cut open at the zero state for the (2,1,2) convolutional coder (see page 245):
For the further approach, two formal parameters are introduced. The states are described by the formal parameter Zi. For the description of the distance, the formal parameter D is introduced, whose exponent states the number of bits different from zero.
page 264
By means of the introduced parameters, the following equation system can be set up:

Zb = D²·Za + D⁰·Zc;  Zc = D¹·Zb + D¹·Zd;
Zd = D¹·Zb + D¹·Zd;  Ze = D²·Zc.

The solution of the equation system provides the distance function:

T(D) = Ze / Za = D⁵ / (1 − 2D).  (5.19)

From the power series expansion 1/(1 − x) = 1 + x + x² + x³ + x⁴ + … for x < 1, the following results:

T(D) = D⁵ · 1/(1 − 2D) = D⁵ + 2D⁶ + 4D⁷ + 8D⁸ + 16D⁹ + …

The result shows the path with the minimum weight 5, so that the free distance df = 5. Furthermore, the two paths with the weight 6 can be read off (see page 262). There are 4 more paths with the weight 7, 8 paths with the weight 8, 16 paths with the weight 9 etc.
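As a cross-check of the series expansion, the fundamental paths can also be enumerated directly on the separated state diagram. The sketch below (a helper of my own, not from the script) walks all paths from the split-off start state a to the sink state e, accumulating the branch output weights, and counts the paths per total weight:

```python
from collections import Counter

# Separated state diagram of page 263: zero state split into source a and
# sink e; edge weights are the Hamming weights of the branch outputs.
EDGES = {
    'a': [('b', 2)],             # 00 -> 10, output 11
    'b': [('c', 1), ('d', 1)],   # 10 -> 01 (out 10) and 10 -> 11 (out 01)
    'c': [('e', 2), ('b', 0)],   # 01 -> 00 (out 11) and 01 -> 10 (out 00)
    'd': [('c', 1), ('d', 1)],   # 11 -> 01 (out 01) and 11 -> 11 (out 10)
}

def path_weights(max_w):
    """Count fundamental paths by total output weight, up to max_w."""
    counts = Counter()
    stack = [('a', 0)]
    while stack:
        state, w = stack.pop()
        for nxt, dw in EDGES[state]:
            if w + dw > max_w:
                continue                 # weight budget exceeded, prune
            if nxt == 'e':
                counts[w + dw] += 1      # completed fundamental path
            else:
                stack.append((nxt, w + dw))
    return dict(counts)
```

The counts reproduce the coefficients of T(D): one path of weight 5, two of weight 6, four of weight 7, and so on.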
page 265
5.4 Features of Convolutional Codes
The minimum distance dmin depends on the number of coding steps i. Considering dmin as a function of i, the distance profile of a convolutional code is obtained. Of interest here is how many coding steps are required until a code word is guaranteed to reach the free distance.

Example: (2,1,2) convolutional code (see page 245). After at most i = 6 steps, the free distance is reached (compare the fundamental path with the lowest weight, where only three steps form the free distance; page 262).
page 266
5.4 Features of Convolutional Codes
Optimal convolutional code:
A code with the maximum free distance at a given code rate and memory is referred to as an optimal code.
Table of optimal convolutional codes (upper block: two generator polynomials, rate 1/2; lower block: three generator polynomials, rate 1/3):

m | df | g1(x)                  | g2(x)                  | g3(x)
2 |  5 | 1 + x + x²             | 1 + x²                 | —
3 |  6 | 1 + x + x³             | 1 + x + x² + x³        | —
4 |  7 | 1 + x³ + x⁴            | 1 + x + x² + x⁴        | —
5 |  8 | 1 + x² + x⁴ + x⁵       | 1 + x + x² + x³ + x⁵   | —
6 | 10 | 1 + x² + x³ + x⁵ + x⁶  | 1 + x + x² + x³ + x⁶   | —
2 |  8 | 1 + x + x²             | 1 + x²                 | 1 + x + x²
3 | 10 | 1 + x + x³             | 1 + x + x² + x³        | 1 + x² + x³
4 | 12 | 1 + x² + x⁴            | 1 + x + x³ + x⁴        | 1 + x + x² + x³ + x⁴
5 | 13 | 1 + x² + x⁴ + x⁵       | 1 + x + x² + x³ + x⁵   | 1 + x³ + x⁴ + x⁵
6 | 15 | 1 + x² + x³ + x⁵ + x⁶  | 1 + x + x⁴ + x⁶        | 1 + x + x² + x³ + x⁴ + x⁶
page 267
5.4 Features of Convolutional Codes
Catastrophic coder:
If a finite number of transmission errors results in an infinite number of decoding errors, this is called catastrophic error propagation. It occurs if two output sequences of a coder differ only in a few bits although the input sequences are completely different; a small number of transmission errors at these positions can then falsify one code sequence into the other.

For the case k = 1, a catastrophic coder is avoided if the generator polynomials have no common divisor (compare the example):

gcd(g1(x), g2(x), … , gn(x)) = 1.  (5.20)

Catastrophic error propagation is possible whenever at least two memory states exist which are not left as long as the respective information does not change and which thereby generate identical code sequences. At the same time, this is a criterion for the detection of a catastrophic coder and can easily be read off from a state diagram (see example next page).
page 268
5.4 Features of Convolutional Codes
Example:
The two states (00) and (11) are not left in case of constant information and generate the same code sequences. Thus, this is a catastrophic coder.

The same statement can be derived from the generator polynomials, since g1(x) is a divisor of g2(x):

g1(x) = 1 + x;  g2(x) = 1 + x² = (1 + x) ⋅ (1 + x).
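The divisor criterion of eq. (5.20) is easy to check mechanically. In the sketch below (helper names are my own, not from the script), polynomials over GF(2) are represented as integer bit masks with bit i standing for the coefficient of x^i; a carry-less Euclidean algorithm then yields the gcd for the catastrophic pair above and for the non-catastrophic optimal pair g1 = 1 + x + x², g2 = 1 + x²:

```python
def gf2_mod(a, b):
    """Remainder of carry-less polynomial division over GF(2)."""
    while a.bit_length() >= b.bit_length():
        a ^= b << (a.bit_length() - b.bit_length())
    return a

def gf2_gcd(a, b):
    """Euclidean algorithm on GF(2) polynomials."""
    while b:
        a, b = b, gf2_mod(a, b)
    return a

def is_catastrophic(*gens):
    """Criterion of eq. (5.20) for a k = 1 coder: gcd of all generators != 1."""
    g = gens[0]
    for h in gens[1:]:
        g = gf2_gcd(g, h)
    return g != 1

catastrophic_pair = (0b011, 0b101)      # 1 + x  and  1 + x^2
optimal_pair = (0b111, 0b101)           # 1 + x + x^2  and  1 + x^2
```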
page 269
5.5 Decoding Using the Viterbi Algorithm
Viterbi decoding is a method that requires little effort and with which, technically, memory depths of up to m ≈ 10 can be realised. It provides maximum-likelihood decoding and is therefore optimal.
For the derivation of the method, a terminated code sequence a of the length N and a receive sequence r are assumed to be known. The objective of the decoder is to estimate the code sequence that agrees with the transmit sequence a as well as possible, that means the one that maximises the likelihood function p(r|a):

p(r|a) = ∏_{i=0}^{N−1} p(ri|ai).  (5.21)

Alternatively, the maximum of the log-likelihood function log p(r|a) is searched for, so that for the memoryless channel the transition probability p(r|a) can be replaced by a sum of the transition probabilities of the single symbols p(ri|ai):

log p(r|a) = ∑_{i=0}^{N−1} log p(ri|ai).  (5.22)
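For a memoryless channel, the equivalence of eqs. (5.21) and (5.22) is simply the logarithm of a product turning into a sum. A minimal numerical check, assuming a symmetric binary channel with an illustrative error probability and short example sequences (all values my own):

```python
import math

p_err = 0.1                      # assumed channel error probability
a = [1, 0, 1, 1, 0]              # hypothetical code symbols
r = [1, 1, 1, 0, 0]              # hypothetical receive symbols

def p_sym(ri, ai):
    """Per-symbol transition probability of the symmetric binary channel."""
    return 1 - p_err if ri == ai else p_err

# eq. (5.21): product of the per-symbol transition probabilities
likelihood = math.prod(p_sym(ri, ai) for ri, ai in zip(r, a))
# eq. (5.22): sum of the per-symbol log-likelihoods
log_likelihood = sum(math.log(p_sym(ri, ai)) for ri, ai in zip(r, a))
```

Both expressions attain their maximum for the same code sequence, which is why the decoder may work with the numerically more convenient sum.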
page 270
5.5 Decoding Using the Viterbi Algorithm
Thus, the maximum-likelihood detection corresponds to the maximisation of metrics with the general form

M(r|a) = α · log p(r|a) + ∑_{i=0}^{N−1} βi = ∑_{i=0}^{N−1} [α · log p(ri|ai) + βi] = ∑_{i=0}^{N−1} μ(ri|ai)  (5.23)

using the metric increment μ(ri|ai).

Example: Using ai ∈ {x1, x2} = {−1, +1} at the input and ri ∈ {y1, y2} = {−1, +1} at the output, the following applies for the symmetric binary channel:

p(ri|ai) = { 1 − perr for ri = ai;  perr for ri ≠ ai }
         = { 1 − perr for ri · ai = +1;  perr for ri · ai = −1 }.  (5.24)
page 271
5.5 Decoding Using the Viterbi Algorithm
Choosing for the free parameters α and βi the values

α = 2 / log((1 − perr)/perr)  and  βi = −1 − α · log perr,  (5.25a/b)

the metric increment just reads:

μ(ri|ai) = { +1 for ri · ai = +1;  −1 for ri · ai = −1 } = ri · ai.  (5.26a)

The maximisation of the Viterbi metric

M(r|a) = ∑_{i=0}^{N−1} μ(ri|ai) = ∑_{i=0}^{N−1} ri · ai  (5.26b)

is equivalent to a minimisation of the Hamming distance or to the maximisation of the number of agreements. This, however, is only one possible approach; other definitions of metrics are possible.
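The stated equivalence follows from ∑ ri·ai = N − 2·dH for ±1-valued symbols: every agreement contributes +1 and every disagreement −1 to the correlation. A quick numerical check with random ±1 sequences (lengths and seed chosen arbitrarily):

```python
import random

random.seed(42)
N = 20
a = [random.choice([-1, 1]) for _ in range(N)]   # code symbols in {-1, +1}
r = [random.choice([-1, 1]) for _ in range(N)]   # receive symbols in {-1, +1}

# Viterbi metric of eq. (5.26b): correlation of receive and code sequence
correlation = sum(ri * ai for ri, ai in zip(r, a))
# Hamming distance: number of disagreements
hamming = sum(ri != ai for ri, ai in zip(r, a))
```

Since correlation = N − 2·dH holds identically, maximising the correlation and minimising the Hamming distance select the same path.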
page 272
5.5 Decoding Using the Viterbi Algorithm
Viterbi algorithm:
Decoding is carried out by means of the trellis diagram. The objective is to compare all possible code sequences with the receive sequence; with this algorithm, this takes place by calculating the Viterbi metric.

Approach: In the initial state, the metric is set to 0. Then, in every step, each state reached via different paths is assigned a metric z, calculated by adding a metric increment to the metric of the previous state.
Of all 2^k paths arriving at a state, only the path with the largest metric is retained. All other 2^k − 1 paths are discarded, since they can never achieve a larger metric. If several paths with the same metric exist, one of them has to be chosen at random. Eventually, the decoding result is the path with the largest metric.
page 273
5.5 Decoding Using the Viterbi Algorithm
During decoding, a certain number of paths has to be saved. Since this number can become very large (before only one path remains; see the following example), the length of the paths to be saved is restricted to 5 ⋅ k ⋅ m. This restriction is based on experience. Thereby, the storage requirement is approximately the product of the number of states and the path length (2^(k⋅m) ⋅ 5 ⋅ k ⋅ m bit).

In principle, the rule for the metric increment is arbitrary. With hard-decision decoding, eq. (5.26b) is normally used, or simply the number of agreements between the code sequence of a part of a path and the respective receive sequence is selected. With soft-decision decoding, e.g., the distance between a receive value and the code symbols can be selected, this method being more reliable than hard-decision decoding (comp. page 241). In this case, it is advantageous that the probability of equal metrics at a state is lower.
page 274
5.5 Decoding Using the Viterbi Algorithm
Example: (2,1,2) convolutional code (see page 245)
In the following, the individual decoding steps for the altogether 9 known receive pairs are represented in detail. The paths with the largest metric are highlighted in purple.

Let the information vector be
u = (1,1,0,1,0,0,0,1,0,...).
From this, the code sequence vector
a = (11,01,01,00,10,11,00,11,10,...)
results. During transmission, the error vector
f = (00,10,00,00,01,00,00,00,00,...)
occurs, so that the receive vector reads:
r = (11,11,01,00,11,11,00,11,10,...).
The receive vector is to be decoded using the Viterbi algorithm. Thereby, hard-decision decoding is to be carried out and the number of agreements is to be used as metric.
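The decoding steps shown on the following pages can be reproduced with a compact Viterbi sketch — my own implementation of the algorithm described on page 272, not code from the script — using the number of agreements as metric:

```python
# Viterbi hard-decision decoding for the (2,1,2) coder of page 245
# (g1(x) = 1 + x + x^2, g2(x) = 1 + x^2); metric = number of agreements.

def encode(u):
    s1 = s2 = 0                                 # shift register contents
    out = []
    for b in u:
        out.append((b ^ s1 ^ s2, b ^ s2))       # (g1 output, g2 output)
        s1, s2 = b, s1
    return out

def branch(state, b):
    """Next state and output pair for input bit b."""
    s1, s2 = state
    return (b, s1), (b ^ s1 ^ s2, b ^ s2)

def viterbi(r):
    # survivors: state -> (metric, input bits of the surviving path)
    survivors = {(0, 0): (0, [])}
    for ri in r:
        nxt = {}
        for state, (metric, bits) in survivors.items():
            for b in (0, 1):
                ns, out = branch(state, b)
                m = metric + (out[0] == ri[0]) + (out[1] == ri[1])
                if ns not in nxt or m > nxt[ns][0]:
                    nxt[ns] = (m, bits + [b])   # keep path with larger metric
        survivors = nxt
    return max(survivors.values())[1]           # path with the largest metric

u = [1, 1, 0, 1, 0, 0, 0, 1, 0]
a = encode(u)                                   # (11,01,01,00,10,11,00,11,10)
f = [(0, 0), (1, 0), (0, 0), (0, 0), (0, 1), (0, 0), (0, 0), (0, 0), (0, 0)]
r = [(a1 ^ f1, a2 ^ f2) for (a1, a2), (f1, f2) in zip(a, f)]
decoded = viterbi(r)
```

Tracing back from the state with the largest final metric (16 agreements out of 18, i.e. the two channel errors are corrected) recovers the original information vector u.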
page 275
5.5 Decoding Using the Viterbi Algorithm
1st step:
At the beginning, the metric is equal to 0; in each step it increases by the number of agreements of the receive sequence with the possible code sequences.
[Trellis diagram: states plotted vertically against the receive sequence; branches labelled with the input bits]
page 276
5.5 Decoding Using the Viterbi Algorithm
2nd step:
After m = 2 steps, all states are reached. From the next step on, every state will be reached via more than one path (see next page).
page 277
5.5 Decoding Using the Viterbi Algorithm
3rd step:
For each arriving path there is a metric, with the upper metric belonging to the upper path and the lower metric belonging to the lower path. In each case, the path with the lower metric can be deleted.
page 278
5.5 Decoding Using the Viterbi Algorithm
4th step:
With the deleted paths removed, the representation becomes clearer. In this step, the two lower states each have two paths with the same metric, so that in each case one path is deleted at random (comp. next page).
page 279
5.5 Decoding Using the Viterbi Algorithm
5th step:
The previous path deletions leave only one single transition from the 0th to the 1st state, which is therefore the most likely one. Thus, this first sequence is already decoded.
page 280
5.5 Decoding Using the Viterbi Algorithm
6th step:
At this point, the decoder could already output the code sequence (11) and the corresponding information sequence (1), respectively, and the representation could move on by one step. In this example, it is still retained.
page 281
5.5 Decoding Using the Viterbi Algorithm
7th step:
From the maximum metric it can be seen how many errors must at least have occurred so far (in this case, a maximum of 14 agreements is possible, but at most 12 are present; thus: at least 2 errors; comp. error vector).
page 282
5.5 Decoding Using the Viterbi Algorithm
8th step:
In the case of a terminated code, the number of states would decrease towards the end (termination phase), until after the last step only the initial state remains (comp. page 260).
page 283
5.5 Decoding Using the Viterbi Algorithm
9th step:
The end of the known sequence is reached. Up to the 4th state, the result can be displayed; up to the last state this would only be possible if the receive sequence were finished. In practice, however, it continues!
page 284
6 Advanced Coding Techniques
In the course of this lecture, fundamentals and basic codes in the field of channel coding have been dealt with in detail. These are also widespread in practice, although often in complex implementations. Among those are, in particular, concatenated codes with interleavers, whose principles are shortly introduced here.

These "basic codes" are, of course, not the only codes developed in the past 40 years. Since the beginning of the 1990s, first and foremost two classes of codes have been investigated which are able to approach the Shannon limit very closely (less than 1 dB) with regard to their correction capability. On the one hand, these are the turbo codes, derived virtually from experiments; on the other hand, the low-density parity-check codes (LDPC codes), which emerged in the 1960s. The latter, however, had been disregarded until the beginning of the 1990s, since they were very complex compared to the technology available at that time. The principle of these two code classes is outlined below.
page 285
6 Advanced Coding Techniques
Concatenation of codes:
With the concatenation of codes, two or more codes are connected in series (see figure next page). The motivation for this construction is the exploitation of different, complementary code characteristics, e.g. for error correction and detection or for single-error and burst-error correction.

One classical example is the correction of single errors by the inner code, whereas burst errors are corrected by the outer code. A method of this kind is applied e.g. in European mobile radio systems such as GSM. By concatenation, very efficient codes can be built. In many applications, the inner code is implemented as a convolutional code for single-error correction and the outer code as a Reed-Solomon code for burst-error correction.
page 286
6 Advanced Coding Techniques
Principle of concatenated codes:
The principle of the interleaver is explained on the following pages.
outer channel coder → interleaver → inner channel coder → channel → inner channel decoder → deinterleaver → outer channel decoder
page 287
6 Advanced Coding Techniques
Interleaving:
On a transmission channel, burst errors can occur that exceed the correction capability of the applied channel codes. It is the task of the interleaver to re-sort the code word sequence to be sent, so that at the receiver, after passing the deinterleaver, one burst error results in several quasi-single errors, which are spread over several code words and can be corrected.

The example (see next page) shows this method as a matrix in which the code words are arranged line by line but transmitted column by column. Thus, the burst error occurring in the 4th column is distributed over several code words as single errors.

The disadvantage of the interleaver is that, due to the intermediate storage of the code words, an additional time delay emerges, which increases with the size of the interleaver. This causes the most problems in time-critical systems such as voice transmission.
page 288
6 Advanced Coding Techniques
Block interleaver:
[Figure: code words 1 to k are written line by line into a matrix; the blocks are sent column by column]
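The matrix operation of the block interleaver can be sketched as follows (function names are my own). With four code words of length eight, a burst of length 4 on the channel ends up as one quasi-single error per code word:

```python
def interleave(codewords):
    """Write code words line by line, send column by column."""
    n = len(codewords[0])
    return [cw[j] for j in range(n) for cw in codewords]

def deinterleave(stream, k, n):
    """Reassemble k code words of length n at the receiver."""
    return [[stream[j * k + i] for j in range(n)] for i in range(k)]

k, n = 4, 8
codewords = [[0] * n for _ in range(k)]   # all-zero words, so errors are visible
sent = interleave(codewords)
sent[8:12] = [1, 1, 1, 1]                 # burst error of length 4 on the channel
received = deinterleave(sent, k, n)
```

After deinterleaving, every code word contains at most one error, which a single-error-correcting code can handle.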
page 289
6 Advanced Coding Techniques
Coding of the signalling information with GSM (BCCH channel):
The outer coder consists of a (224,184) block code, to which 4 tail bits for the terminated (2,1,4) convolutional coding (inner coder) are attached. The tail bits are interpreted as information by the convolutional code; this yields a code rate of 1/2. The resulting 456 bits are cut into bursts and transmitted after interleaving.
[Figure: signalling information (184 bits) → block code appends 40 bits plus 4 tail bits (228 bits in all) → rate-1/2 convolutional code for error correction → 456 bits → interleaving and burst generation → normal bursts in TDMA frames of 4.6 ms]
page 290
6 Advanced Coding Techniques
Turbo codes:
The first paper was published in 1993 by the two French electrical engineers Claude Berrou and Alain Glavieux. The original idea leading to the turbo codes was the feedback of signals commonly used in electrical engineering (e.g. with amplifiers). Turbo codes are characterised by their error-correction performance, which is close to the Shannon limit (approx. 0.5 – 0.7 dB).

The turbo codes were developed experimentally and not by mathematical-theoretical approaches. Therefore, their performance was checked critically for years before they were considered for application in modern systems. Today, they are part of UMTS data transmission, for example. For voice transmission, however, turbo codes are not applicable due to the delay time; in this field, convolutional codes are used. In space communication, turbo codes are now applied as well.
page 291
6 Advanced Coding Techniques
Principle of coding with turbo codes:
The information bits are transmitted three times over the channel: once without error protection (i) and twice with the error protection of a convolutional code (c1 and c2), with one of the coding blocks receiving an information sequence permuted by an interleaver as input. The multiplexer then assembles the three signals into one signal.
[Figure: information symbols → convolutional code 1 → c1; information symbols → interleaver → convolutional code 2 → c2; i, c1 and c2 → multiplexer → output signal]
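The interleave-encode-multiplex structure can be sketched as follows. Note the simplification: the sketch reuses the simple feedforward parity g(x) = 1 + x + x² of the (2,1,2) coder from page 245 as a stand-in encoder; actual turbo codes use recursive systematic convolutional encoders, which this sketch does not model:

```python
import random

def parity(u):
    """Feedforward parity stream with g(x) = 1 + x + x^2 (stand-in encoder)."""
    s1 = s2 = 0
    out = []
    for b in u:
        out.append(b ^ s1 ^ s2)
        s1, s2 = b, s1
    return out

random.seed(0)
u = [1, 0, 1, 1, 0, 0, 1, 0]                 # information bits (example values)
perm = list(range(len(u)))
random.shuffle(perm)                         # interleaver permutation
c1 = parity(u)                               # encoder 1: original order
c2 = parity([u[p] for p in perm])            # encoder 2: interleaved order
# multiplexer: systematic bit, parity 1, parity 2 per information bit (rate 1/3)
tx = [bit for triple in zip(u, c1, c2) for bit in triple]
```

The decoder can recover the systematic bits directly from every third transmitted symbol, while c1 and c2 supply the redundancy exploited in the iterative decoding described on the next page.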
page 292
6 Advanced Coding Techniques
Principle of decoding with turbo codes:
The receive signal, consisting of the three parts sent, is split and passed to the decoding blocks, which receive their signals to be decoded as soft-decision information as well as a copy of the uncoded signal. From this, suggestions for the correctly decoded bits are derived, which are passed on via the signals e1 and e2 to the respective other block. The decoding is not completed until the suggestions of both blocks agree (iterative method). From experience, the number of iteration steps is between 4 and 10.
[Figure: the receive signal (s, c1, c2) is split; decoder 1 processes s and c1, decoder 2 processes s and c2; the suggestions e1 and e2 are exchanged between the two decoders via interleaver and deinterleaver]
page 293
6 Advanced Coding Techniques
Low-density parity-check codes (LDPC codes):
In 1960, Robert G. Gallager published the idea of LDPC codes within the scope of his doctoral thesis. Due to their complexity (among other things, iterative decoding was only possible using mainframe computers at that time), they were neglected with regard to practical applications for a long time; in terms of technology, the commercial application of highly integrated coder circuits for efficient codes has only been possible for a few years. In 1995, LDPC codes were rediscovered by MacKay and Neal. With a code rate of 0.5 and a block length of 10,000,000, such a code approaches the Shannon limit up to 0.04 dB. This example, admittedly rather academic, at least shows the capability of the LDPC codes. Therefore, finding suitable design methods for efficient codes (especially of short block lengths) is currently a major research topic. Nevertheless, these codes have already found practical use: in the ETSI standard for DVB-S2, for example, the concatenation of an LDPC code with a BCH code is used for channel coding.
page 294
6 Advanced Coding Techniques
Data rate and Shannon limit for digital satellite television: variants ofDVB-S2
[Figure: spectral efficiency Ru [bit/s] per unit Rs (approx. 0.5 to 4.5) versus the signal-to-noise ratio (−3 to 16 dB, a span of approx. 18 dB) for the DVB-S2 variants with QPSK, 8PSK, 16APSK and 32APSK modulation and for DVB-S, compared with the modulation-constrained Shannon limit]
page 295
6 Advanced Coding Techniques
Definition of LDPC codes over an arbitrary alphabet A:
An LDPC code is a linear block code of block length n whose check matrix H has the following characteristics:
1. The check matrix H is sparsely populated, that means a large part of the elements hml is equal to 0.
2. Each column l of the matrix H has a small number Nl > 1 of non-zero symbols from A.
3. Each row m of the matrix H has a small number Nm > 1 of non-zero symbols from A.
LDPC codes over A = GF(2^b) are of particular practical interest.
Since LDPC codes are assigned to the class of linear block codes, they also have all their characteristics. Among those are the representation by means of a generator matrix, coding by means of a matrix-vector multiplication, the possibility to convert non-systematic into systematic LDPC codes, the determination of the minimum distance by means of the minimum Hamming weight etc. (see chapter 3).
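The inherited linear-block-code machinery — generator matrix, coding as matrix-vector multiplication, parity checks via H — can be illustrated with a toy systematic code. The small matrix below is a Hamming(7,4)-like example of my own choosing and is, of course, not actually low-density:

```python
import itertools

# Toy systematic code: H = [P | I] over GF(2), equivalent generator G = [I | P^T]
P = [[1, 1, 0, 1],
     [1, 0, 1, 1],
     [0, 1, 1, 1]]                              # parity part, (n-k) x k
nk, k = len(P), len(P[0])                       # n - k = 3, k = 4
H = [P[m] + [int(j == m) for j in range(nk)] for m in range(nk)]
G = [[int(j == i) for j in range(k)] + [P[m][i] for m in range(nk)]
     for i in range(k)]

def encode_block(u):
    """Coding as matrix-vector multiplication c = u*G over GF(2)."""
    return [sum(u[i] & G[i][j] for i in range(k)) % 2 for j in range(k + nk)]

def syndrome(c):
    """H*c^T over GF(2); the all-zero vector for every valid code word."""
    return [sum(H[m][j] & c[j] for j in range(k + nk)) % 2 for m in range(nk)]

all_checks_pass = all(syndrome(encode_block(list(u))) == [0] * nk
                      for u in itertools.product([0, 1], repeat=k))
```

For a real LDPC code, H would be large and sparse while the equivalent G is dense, which is exactly the practical problem discussed on page 297.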
page 296
6 Advanced Coding Techniques
The figure shows the sparsely populated (900×1200) check matrix of a binary LDPC code with 3,600 ones. The (300×1200) generator matrix of the equivalent systematic LDPC code, however, has 135,000 ones!
page 297
6 Advanced Coding Techniques
Although a code word can be calculated by means of the relation c = u⋅G, this approach is not preferred in practice. In contrast to the check matrix, the generator matrix of an LDPC code is in general not sparsely populated, so coding according to this method is complex, particularly since LDPC codes currently only show their strengths for large block lengths. Therefore, in most cases special LDPC codes (e.g. cyclic codes with linear coding complexity) are applied in implementations.
What is already not easy for coding is even less easy for decoding. A maximum-likelihood decoding cannot be realised, since the complexity grows exponentially with the block length n. A set of sub-optimal methods exists, with the iterative methods providing the best performance; their complexity, however, limits their implementation.
In summary, the LDPC codes are the largest competitor of the turbo codes. Due to their complexity, however, they also show problems in terms of time delay.
page 298
... there is nothing left to say but: End of the Lecture ...
Good luck with the examination!