Information Theory Handout Bw
3/23/2009
INFORMATION THEORY
Module IV - Part I

Information theory and coding: Discrete messages - amount of information - entropy - information rate. Coding - Shannon's theorem, channel capacity - capacity of the Gaussian channel - bandwidth-S/N trade-off - use of orthogonal signals to attain Shannon's limit - efficiency of orthogonal signal transmission.
AMOUNT OF INFORMATION
Consider a communication system in which the allowable messages are m1, m2, ... with probabilities of occurrence p1, p2, ... Then p1 + p2 + ... = 1. Let the transmitter transmit a message mk with probability pk, and let the receiver correctly identify the message. Then the amount of information conveyed by the system is defined as:

Ik = logb (1/pk) = -logb pk, where b is the base of the logarithm.

The base may be 2, 10 or e. When the base is 2 the unit of Ik is the bit (binary unit); when it is 10 the unit is the Hartley or decit; when the natural logarithmic base is used the unit is the nat. Base 2 is commonly used to represent Ik.
AMOUNT OF INFORMATION
The above units are related as:

1 Hartley = log2 10 = 3.32 bits
1 nat = log2 e = 1.44 bits

The base of 2 is preferred because in binary PCM the possible messages 0 and 1 occur with equal likelihood, and the amount of information conveyed by each bit is log2 2 = 1 bit.
IMPORTANT PROPERTIES OF IK
Ik approaches 0 as pk approaches 1. pk = 1 means the receiver already knows the message and there is no need for transmission, so Ik = 0. E.g., the statement 'the sun rises in the east' conveys no information.
Ik must be a non-negative quantity, since each message contains some information; in the worst case Ik = 0.
The information content of a message having a higher probability of occurrence is less than the information content of a message having a lower probability.
As pk approaches 0, Ik approaches infinity. The information content of a highly improbable event approaches infinity.
NOTES
When the symbols 0 and 1 of a PCM data stream occur with equal likelihood, with probabilities 1/2 each, the amount of information conveyed by each bit is:

Ik(0) = Ik(1) = log2 2 = 1 bit

When the probabilities are different, the less probable symbol conveys more information. Let p(0) = 1/4 and p(1) = 3/4. Then:

Ik(0) = log2 4 = 2 bits
Ik(1) = log2 (4/3) = 0.42 bit

When there are M equally likely and independent messages such that M = 2^N with N an integer, the information in each message is Ik = log2 M = log2 2^N = N bits.
3/23/2009
2
NOTES
In this case, if we are using a binary PCM code for representing the M messages, the number of binary digits required to represent all the 2^N messages is also N. I.e., when there are M (= 2^N) equally likely messages, the amount of information conveyed by each message is equal to the number of binary digits needed to represent all the messages.
When two independent messages mk and mI are correctly identified, the amount of information conveyed is the sum of the information associated with each of the messages individually. When the messages are independent, the probability of the composite message is pk·pI, so:

Ik = log2 (1/pk),  II = log2 (1/pI)
Ik,I = log2 (1/(pk·pI)) = log2 (1/pk) + log2 (1/pI) = Ik + II
EXAMPLE 1
A source produces one of four possible symbols during each interval, having probabilities p(x1) = 1/2, p(x2) = 1/4, p(x3) = p(x4) = 1/8. Obtain the information content of each of these symbols.

ANS:
I(x1) = log2 2 = 1 bit
I(x2) = log2 4 = 2 bits
I(x3) = log2 8 = 3 bits
I(x4) = log2 8 = 3 bits
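The computation in Example 1 is a one-liner in code. A minimal sketch (the helper name `info_content` is our choice, not from the handout):

```python
import math

def info_content(p):
    """Information conveyed by a message of probability p, in bits."""
    return math.log2(1.0 / p)

# The four symbols of Example 1:
probs = {"x1": 1/2, "x2": 1/4, "x3": 1/8, "x4": 1/8}
for sym, p in probs.items():
    print(f"I({sym}) = {info_content(p):g} bits")
```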
AVERAGE INFORMATION, ENTROPY
Suppose we have M different and independent messages m1, m2, ... with probabilities of occurrence p1, p2, ... Suppose further that during a long period of transmission a sequence of L messages has been generated. If L is very large, we may expect that in the L-message sequence we transmitted p1·L messages of m1, p2·L messages of m2, etc. The total information in such a sequence will be:

Itotal = p1·L·log2 (1/p1) + p2·L·log2 (1/p2) + ...

Average information per message interval, represented by the symbol H, is given by:

H = Itotal / L = p1 log2 (1/p1) + p2 log2 (1/p2) + ...
AVERAGE INFORMATION, ENTROPY
Average information is also referred to as entropy. Its unit is information bits/symbol or bits/message.

H = Σ (k=1 to M) pk log2 (1/pk)
AVERAGE INFORMATION, ENTROPY
When pk = 1 there is only a single possible message, and the receipt of that message conveys no information: H = log2 1 = 0.
When pk → 0 the amount of information Ik → ∞, but the average information contributed in this case is:

lim (p→0) p log2 (1/p) = 0

So the average information associated with an extremely unlikely message, as well as with an extremely likely message, is zero. Consider a source that generates two messages with probabilities p and (1 - p). The average information per message is:

H = p log2 (1/p) + (1 - p) log2 (1/(1 - p))

when p = 0, H = 0; when p = 1, H = 0
AVERAGE INFORMATION, ENTROPY

[Plot: H as a function of p, rising from 0 at p = 0 to HMAX = 1 at p = 1/2 and falling back to 0 at p = 1.]
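The shape of this curve is easy to verify numerically. A small sketch (the function name `binary_entropy` is our choice):

```python
import math

def binary_entropy(p):
    """H(p) = p*log2(1/p) + (1-p)*log2(1/(1-p)), with H(0) = H(1) = 0."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return p * math.log2(1 / p) + (1 - p) * math.log2(1 / (1 - p))

# H is zero at both extremes and peaks at exactly 1 bit when p = 1/2
samples = [i / 100 for i in range(101)]
peak = max(samples, key=binary_entropy)
```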
AVERAGE INFORMATION, ENTROPY
The maximum value of H may be located by setting dH/dp = 0.

H = p log2 (1/p) + (1 - p) log2 (1/(1 - p)) = -p log2 p - (1 - p) log2 (1 - p)

dH/dp = -log2 p - log2 e + log2 (1 - p) + log2 e
      = log2 (1 - p) - log2 p
      = log2 ((1 - p)/p)
AVERAGE INFORMATION, ENTROPY

Setting dH/dp = 0 gives log2 ((1 - p)/p) = 0, i.e. (1 - p)/p = 1, so 1 - p = p and p = 1/2.

Similarly, when there are 3 messages the average information H becomes maximum when the probability of each of these messages is p = 1/3:

HMAX = (1/3) log2 3 + (1/3) log2 3 + (1/3) log2 3 = log2 3

Extending this, when there are M messages H becomes a maximum when all the messages are equally likely with p = 1/M. In this case each message has probability 1/M and:

Hmax = Σ (1/M) log2 M = log2 M
INFORMATION RATE R
Let a source emit symbols at the rate r symbols/second. Then the information rate of the source is:

R = rH information bits/second

R → information rate
H → entropy of the source
r → rate at which symbols are generated

R = r (symbols/second) × H (information bits/symbol) = rH (information bits/second)
EXAMPLE 1
A discrete source emits one of five symbols once every millisecond with probabilities 1/2, 1/4, 1/8, 1/16 and 1/16 respectively. Determine the source entropy and information rate.

H = Σ (i=1 to 5) Pi log2 (1/Pi)
  = (1/2) log2 2 + (1/4) log2 4 + (1/8) log2 8 + (1/16) log2 16 + (1/16) log2 16
  = 1/2 + 1/2 + 3/8 + 1/4 + 1/4 = 15/8 = 1.875 bits/symbol

Symbol rate r = 1/Tb = 1/10^-3 = 1000 symbols/sec
Information rate R = rH = 1000 × 1.875 = 1875 bits/sec
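Worked examples like this reduce to one sum. A sketch of the calculation (variable names are ours):

```python
import math

def entropy(probs):
    """Average information H = sum p * log2(1/p), in bits per symbol."""
    return sum(p * math.log2(1 / p) for p in probs)

probs = [1/2, 1/4, 1/8, 1/16, 1/16]   # one symbol every millisecond
H = entropy(probs)                    # 1.875 bits/symbol
r = 1000                              # symbols per second
R = r * H                             # information rate, bits per second
```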
EXAMPLE 2
The probabilities of five possible outcomes of an experiment are given as P(x1) = 1/2, P(x2) = 1/4, P(x3) = 1/8, P(x4) = P(x5) = 1/16. Determine the entropy and information rate if there are 16 outcomes per second.

H(X) = Σ (i=1 to 5) P(xi) log2 (1/P(xi))
     = (1/2) log2 2 + (1/4) log2 4 + (1/8) log2 8 + (1/16) log2 16 + (1/16) log2 16
     = 1/2 + 2/4 + 3/8 + 4/16 + 4/16 = 15/8 = 1.875 bits/outcome

Rate of outcomes r = 16 outcomes/sec
Rate of information R = rH(X) = 16 × 15/8 = 30 bits/sec
EXAMPLE 3
An analog signal band-limited to 10 kHz is quantized into 8 levels of a PCM system with probabilities 1/4, 1/5, 1/5, 1/10, 1/10, 1/20, 1/20 and 1/20 respectively. Find the entropy and rate of information.

fm = 10 kHz, fs = 2 × 10 kHz = 20 kHz
Rate at which messages are produced: r = fs = 20 × 10^3 messages/sec

H(X) = (1/4) log2 4 + 2 × (1/5) log2 5 + 2 × (1/10) log2 10 + 3 × (1/20) log2 20
     ≈ 0.5 + 0.929 + 0.664 + 0.648 = 2.74 bits/message

R = rH(X) = 20000 × 2.74 ≈ 54,800 bits/sec
EXAMPLE 4
Consider a telegraph source having two symbols, dot and dash. The dot duration is 0.2 s. The dash duration is 3 times the dot duration. The probability of the dot occurring is twice that of the dash, and the time between symbols is 0.2 s. Calculate the information rate of the telegraph source.

p(dot) = 2p(dash);  p(dot) + p(dash) = 3p(dash) = 1
So p(dash) = 1/3 and p(dot) = 2/3.

H(X) = p(dot) log2 (1/p(dot)) + p(dash) log2 (1/p(dash))
     = 0.667 × 0.585 + 0.333 × 1.585 = 0.92 b/symbol

Average time per symbol:
Ts = P(dot)·t(dot) + P(dash)·t(dash) + t(space)
   = (2/3) × 0.2 + (1/3) × 0.6 + 0.2 = 0.5333 seconds/symbol

Average symbol rate r = 1/Ts = 1.875 symbols/sec
Average information rate R = rH = 1.875 × 0.92 = 1.72 b/sec
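The telegraph calculation above can be sketched the same way; the variable names below are ours:

```python
import math

# Telegraph source of Example 4 (probabilities derived in the text above)
p_dot, p_dash = 2 / 3, 1 / 3
H = p_dot * math.log2(1 / p_dot) + p_dash * math.log2(1 / p_dash)  # bits/symbol

# Average symbol duration: dot 0.2 s, dash 0.6 s, plus 0.2 s spacing
T = p_dot * 0.2 + p_dash * 0.6 + 0.2   # seconds/symbol
r = 1 / T                              # symbols/second
R = r * H                              # information rate, bits/second
```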
SOURCE CODING
Let there be M equally likely messages such that M = 2^N. If the messages are equally likely, the entropy H becomes maximum and is given by:

Hmax = log2 M = log2 2^N = N

The number of binary digits needed to encode each message is also N. So entropy H = N if the messages are equally likely. The average information carried by an individual bit is H/N = 1 bit.
If however the messages are not equally likely, H is less than N and each bit carries less than 1 bit of information. This situation can be corrected by using a code in which not all messages are encoded into the same number of bits.

SOURCE CODING
The more likely a message is, the fewer the number of bits that should be used in its code word.
Let X be a DMS with finite entropy H(X) and an alphabet x1, x2, ..., xm with corresponding probabilities of occurrence p(xi), where i = 1, 2, ..., m. Let the binary code word assigned to symbol xi by the encoder have length ni, measured in bits. The length of a code word is the number of bits in the code word. The average code word length L per source symbol is given by:

L = Σ (i=1 to m) p(xi)·ni = p(x1)n1 + p(x2)n2 + ... + p(xm)nm
SOURCE CODING

[Diagram: source symbols x1, ..., xm with probabilities p(x1), ..., p(xm) entering the encoder/channel, producing outputs y1, ..., yn; code word lengths n1, ..., nm.]
SOURCE CODING
The parameter L represents the average number of bits per source symbol used in the source coding process.
Code efficiency η is defined as η = Lmin/L, where Lmin is the minimum possible value of L. When η approaches unity the code is said to be efficient.
Code redundancy γ is defined as γ = 1 - η.

[Diagram: DMS → source encoder → binary sequence.]
SOURCE CODING
The conversion of the output of a DMS into a sequence of binary symbols (binary codes) is called source coding. The device that performs this is called a source encoder.
If some symbols are known to be more probable than others, then we may assign short code words to frequent source symbols and long code words to rare source symbols. Such a code is called a variable-length code.
As an example, in Morse code the letter E is encoded into a single dot, whereas the letter Q is encoded as '_ _ . _'. This is because in the English language the letter E occurs more frequently than the letter Q.
SHANNON'S SOURCE CODING THEOREM

The source coding theorem states that for a DMS X with entropy H(X), the average code word length per symbol is bounded as:

L ≥ H(X)

and L can be made as close to H(X) as desired for some suitably chosen code. With Lmin = H(X), the code efficiency is:

η = H(X)/L

No code can achieve efficiency greater than 1, but for any source there are codes with efficiency as close to 1 as desired. The proof does not give a method to find the best codes; it just sets a limit on how good they can be.
SHANNON'S SOURCE CODING THEOREM

Proof of the statement 0 ≤ H(X) ≤ N (for M = 2^N messages):

Consider any two probability distributions p0, p1, ..., pM-1 and q0, q1, ..., qM-1 on the alphabet x0, x1, ..., xM-1 of a discrete memoryless channel. Then:

Σ (i=0 to M-1) pi log2 (qi/pi) = (1/ln 2) Σ (i=0 to M-1) pi ln (qi/pi)   ..............(1)

By a special property of the natural logarithm, ln x ≤ x - 1 for x ≥ 0. Applying this property to Eq. (1):
SHANNON'S SOURCE CODING THEOREM

Σ (i=0 to M-1) pi log2 (qi/pi) ≤ (1/ln 2) Σ (i=0 to M-1) pi (qi/pi - 1)
                               = (1/ln 2) [Σ qi - Σ pi]
                               = (1/ln 2)(1 - 1) = 0

Thus we obtain the fundamental inequality:

Σ (i=0 to M-1) pi log2 (qi/pi) ≤ 0   ..............(2)

If there are M equally probable messages x1, x2, ..., xM with probabilities q1 = q2 = ... = qM = 1/M, the entropy of this DMS is given by:

H = Σ (i=1 to M) qi log2 (1/qi) = log2 M   ..............(3)
SHANNON'S SOURCE CODING THEOREM

Also, qi = 1/M for i = 0, 1, ..., M - 1   ..............(4)

Substituting Eq. (4) in Eq. (2):

Σ (i=0 to M-1) pi log2 (1/(M·pi)) ≤ 0

Σ (i=0 to M-1) pi log2 (1/pi) - Σ (i=0 to M-1) pi log2 M ≤ 0

Σ (i=0 to M-1) pi log2 (1/pi) ≤ log2 M   (since Σ pi = 1)

H(X) ≤ log2 M

H(X) ≤ N if M = 2^N

Thus H(X) ≤ L, with equality when the symbols are equally likely.
SHANNON’S SOURCE CODING THEOREM
H(X) = 0 if and only if the probability pi = 1 for some i and the remaining probabilities in the set are all zero. This lower bound on entropy corresponds to no uncertainty.
H(X) = log2 M if and only if pi = 1/M for all i, i.e., all the symbols in the alphabet are equiprobable. This upper bound on entropy corresponds to maximum uncertainty.
Proof of the lower bound: Each probability pi is less than or equal to 1, so each term pi log2 (1/pi) is always non-negative. A term pi log2 (1/pi) is zero if and only if pi = 0 or 1, i.e., pi = 1 for some i and all the others are zero.
CLASSIFICATION OF CODES
Fixed-Length Code: A fixed-length code is one whose code word length is fixed. Codes 1 and 2 of Table 1 are fixed-length codes.
Variable-Length Code: A variable-length code is one whose code word length is not fixed. Shannon-Fano and Huffman codes are examples of variable-length codes. Codes 3, 4 and 5 in Table 1 are variable-length codes.
Distinct Code: A code is distinct if each code word is distinguishable from the other code words. Codes 2, 3, 4, 5 and 6 are distinct codes.
Prefix Code: A code in which no code word can be formed by adding code symbols to another code word is called a prefix code. No code word should be a prefix of another. E.g. codes 2, 4 and 6.

CLASSIFICATION OF CODES
Uniquely Decodable Code: A code is uniquely decodable if the original source sequence can be reconstructed perfectly from the encoded binary sequence. Code 3 of the table is not uniquely decodable, since the binary sequence 1001 may correspond to source sequence x2x3x2 or x2x1x1x2.
A sufficient condition to ensure that a code is uniquely decodable is that no code word is a prefix of another. Thus codes 2, 4 and 6 are uniquely decodable codes.
The prefix-free condition is not a necessary condition for unique decodability, e.g. code 5.
Instantaneous Codes: A code is called instantaneous if the end of any code word is recognizable without examining subsequent code symbols. Prefix-free codes are instantaneous codes, e.g. code 6.
CLASSIFICATION OF CODES

Table 1
xi   Code 1   Code 2   Code 3   Code 4   Code 5   Code 6
x1   00       00       0        0        0        1
x2   01       01       1        10       01       01
x3   00       10       00       110      011      001
x4   11       11       11       111      0111     0001

Fixed-Length Codes: 1, 2
Variable-Length Codes: 3, 4, 5, 6
Distinct Codes: 2, 3, 4, 5, 6
Prefix Codes: 2, 4, 6
Uniquely Decodable Codes: 2, 4, 5, 6
Instantaneous Codes: 2, 4, 6
PREFIX CODING (INSTANTANEOUS CODING)
Consider a discrete memoryless source with alphabet x0, x1, ..., xm-1 and statistics p0, p1, ..., pm-1. Let the code word assigned to source symbol xk be denoted by mk1, mk2, ..., mkn, where the individual elements are 0s and 1s and n is the code word length. The initial part of the code word is represented by mk1, ..., mki for some i ≤ n. Any sequence made up of the initial part of the code word is called a prefix of the code word.
A prefix code is defined as a code in which no code word is a prefix of any other code word. It has the important property that it is always uniquely decodable. But the converse is not always true: a code that does not satisfy the prefix condition may still be uniquely decodable.
EXAMPLE 1 (Contd...)

xi   Code A   Code B   Code C   Code D
x1   00       0        0        0
x2   01       10       11       100
x3   10       11       100      110
x4   11       110      110      111

B and C are not uniquely decodable.
A, C and D satisfy the Kraft inequality; A and D are prefix codes.

A prefix code always satisfies the Kraft inequality. But the converse is not always true.
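Both properties above are mechanical to check. A sketch (helper names `kraft_sum` and `is_prefix_free` are ours), using the binary Kraft sum K = Σ 2^(-ni):

```python
def kraft_sum(lengths):
    """Kraft sum K = sum of 2^(-n_i); K <= 1 is necessary for unique decodability."""
    return sum(2 ** -n for n in lengths)

def is_prefix_free(words):
    """True if no code word is a prefix of another code word."""
    return not any(a != b and b.startswith(a) for a in words for b in words)

code_A = ["00", "01", "10", "11"]
code_B = ["0", "10", "11", "110"]
code_C = ["0", "11", "100", "110"]
code_D = ["0", "100", "110", "111"]
```

Running the checks reproduces the slide's conclusions: A, C and D satisfy the Kraft inequality, while only A and D are prefix codes.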
EXAMPLE
An analog signal is band-limited to fm Hz and sampled at the Nyquist rate. The samples are quantized into 4 levels. Each level represents one symbol. The probabilities of occurrence of these 4 levels (symbols) are p(x1) = p(x4) = 1/8 and p(x2) = p(x3) = 3/8. Obtain the information rate of the source.

Answer:
H(X) = (1/8) log2 8 + (3/8) log2 (8/3) + (3/8) log2 (8/3) + (1/8) log2 8 = 1.8 bits/symbol

Nyquist rate means fs = 2fm, so the rate at which symbols are generated is r = 2fm symbols/second.

R = rH = 2fm (symbols/second) × 1.8 (bits/symbol) = 3.6fm bits/second
EXAMPLE
We are transmitting 3.6fm bits/second. There are four levels; these four levels may be coded using binary PCM as shown below.

Symbol   Probability   Binary digits
Q1       1/8           00
Q2       3/8           01
Q3       3/8           10
Q4       1/8           11

Two binary digits are needed to send each symbol. Since symbols are sent at the rate 2fm symbols/sec, the transmission rate of binary digits will be:

Binary digit rate = 2 (binary digits/symbol) × 2fm (symbols/second) = 4fm binary digits/second

Since one binary digit is capable of conveying 1 bit of information, the above coding scheme is capable of conveying 4fm information bits/sec. But we have seen earlier that we are transmitting only 3.6fm bits of information per second. This means that the information-carrying ability of binary PCM is not completely utilized by this transmission scheme.
EXAMPLE
In the above example, if all the symbols are equally likely, i.e. p(x1) = p(x2) = p(x3) = p(x4) = 1/4, then H = log2 4 = 2 bits/symbol and R = 2fm × 2 = 4fm bits/second.
With binary PCM coding, the maximum information rate is achieved if all messages are equally likely. Often this is difficult to achieve, so we go for alternative coding schemes to increase the average information per bit.
SHANNON-FANO CODING PROCEDURE
(i) List the source symbols in the order of decreasing probability.
(ii) Partition the set into two sets that are as close to equiprobable as possible.
(iii) Assign 0s to the upper set and 1s to the lower set.
(iv) Continue the process, each time partitioning the sets with as nearly equal probabilities as possible, until further partitioning is not possible.
(v) The rows of the table corresponding to each symbol give the Shannon-Fano code.
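The steps above can be sketched directly in code. This is our own minimal implementation, not the handout's; it splits each group at the point that makes the two halves closest to equiprobable, appending 0s to the upper group and 1s to the lower:

```python
def shannon_fano(symbols):
    """symbols: list of (name, probability) pairs. Returns {name: code string}."""
    codes = {name: "" for name, _ in symbols}

    def split(group):
        if len(group) <= 1:
            return
        total = sum(p for _, p in group)
        running, k, best_diff = 0.0, 1, float("inf")
        # find the partition point that makes the two halves closest to equiprobable
        for i in range(1, len(group)):
            running += group[i - 1][1]
            diff = abs(2 * running - total)
            if diff < best_diff:
                best_diff, k = diff, i
        for name, _ in group[:k]:
            codes[name] += "0"     # upper set gets 0
        for name, _ in group[k:]:
            codes[name] += "1"     # lower set gets 1
        split(group[:k])
        split(group[k:])

    split(sorted(symbols, key=lambda s: -s[1]))
    return codes

codes = shannon_fano([("m1", 1/2), ("m2", 1/8), ("m3", 1/8), ("m4", 1/16),
                      ("m5", 1/16), ("m6", 1/16), ("m7", 1/32), ("m8", 1/32)])
```

On the eight-message example that follows, this reproduces the codes in the table (m1 → 0, m2 → 100, ..., m8 → 11111).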
Find the Shannon-Fano codes corresponding to eight messages m1, m2, ..., m8 with probabilities 1/2, 1/8, 1/8, 1/16, 1/16, 1/16, 1/32 and 1/32.
SHANNON-FANO CODING

Message   Probability   Code     No. of bits/message
m1        1/2           0        1
m2        1/8           100      3
m3        1/8           101      3
m4        1/16          1100     4
m5        1/16          1101     4
m6        1/16          1110     4
m7        1/32          11110    5
m8        1/32          11111    5
SHANNON-FANO CODING

L = Σ (i=1 to 8) p(xi)·ni
  = (1/2)×1 + 2×(1/8)×3 + 3×(1/16)×4 + 2×(1/32)×5 = 2.31 bits/message

H = Σ (i=1 to 8) p(xi) log2 (1/p(xi))
  = (1/2) log2 2 + 2×(1/8) log2 8 + 3×(1/16) log2 16 + 2×(1/32) log2 32 = 2.31 bits/message

η = H/L = 100%
SHANNON-FANO CODING
There are 6 possible messages m1, m2, ..., m6 with probabilities 0.3, 0.25, 0.2, 0.12, 0.08, 0.05. Obtain the Shannon-Fano codes.

Message   Probability   Code    Length
m1        0.30          00      2
m2        0.25          01      2
m3        0.20          10      2
m4        0.12          110     3
m5        0.08          1110    4
m6        0.05          1111    4
SHANNON-FANO CODING

L = Σ (i=1 to 6) p(xi)·ni
  = 0.3×2 + 0.25×2 + 0.2×2 + 0.12×3 + 0.08×4 + 0.05×4 = 2.38 b/symbol

H = Σ (i=1 to 6) p(xi) log2 (1/p(xi))
  = 0.3 log2 (1/0.3) + 0.25 log2 (1/0.25) + 0.2 log2 (1/0.2) + 0.12 log2 (1/0.12) + 0.08 log2 (1/0.08) + 0.05 log2 (1/0.05)
  = 2.36 b/symbol

η = H/L = 2.36/2.38 = 0.99 = 99%
Redundancy γ = 1 - η = 0.01 = 1%
SHANNON-FANO CODING
A DMS has five equally likely symbols. Construct the Shannon-Fano code.

xi   P(xi)   Code   Length
x1   0.2     00     2
x2   0.2     01     2
x3   0.2     10     2
x4   0.2     110    3
x5   0.2     111    3
SHANNON-FANO CODING

L = Σ (i=1 to 5) p(xi)·ni = 0.2 × (2 + 2 + 2 + 3 + 3) = 2.4 b/symbol

H = Σ (i=1 to 5) p(xi) log2 (1/p(xi)) = 5 × 0.2 log2 (1/0.2) = log2 5 = 2.32 b/symbol

η = H/L = 2.32/2.4 = 0.967 = 96.7%
SHANNON-FANO CODING
A DMS has five symbols x1, x2, x3, x4, x5 with probabilities 0.4, 0.19, 0.16, 0.15, 0.1. Construct the Shannon-Fano code.

xi   P(xi)   Code   Length
x1   0.40    00     2
x2   0.19    01     2
x3   0.16    10     2
x4   0.15    110    3
x5   0.10    111    3
HUFFMAN CODING
(i) List the source symbols in the order of decreasing probability.
(ii) Combine (add) the probabilities of the two symbols having the lowest probabilities and reorder the resultant probabilities. This process is called bubbling.
(iii) During the bubbling process, if the new weight is equal to existing probabilities, the new branch is to be bubbled to the top of the group having the same probability.
(iv) Complete the tree structure and assign a '1' to the branch rising up and a '0' to the branch coming down.
(v) From the final point, trace the path to the required symbol and order the 0s and 1s encountered in the path to form the code.

* It produces the optimum code.
* It has the highest efficiency.
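The merge-and-reorder procedure can be sketched with a priority queue. This is our own sketch, not the handout's construction: its tie-breaking differs from the manual "bubbling" rule, so individual code words may differ from the trees below, but the average code length is still optimal:

```python
import heapq
from itertools import count

def huffman(probs):
    """probs: dict symbol -> probability. Returns dict symbol -> code string."""
    tick = count()  # tie-breaker so the heap never compares the dict payloads
    heap = [(p, next(tick), {s: ""}) for s, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)   # two least probable nodes
        p2, _, c2 = heapq.heappop(heap)
        for s in c1:
            c1[s] = "0" + c1[s]           # one branch gets 0
        for s in c2:
            c2[s] = "1" + c2[s]           # the other gets 1
        heapq.heappush(heap, (p1 + p2, next(tick), {**c1, **c2}))
    return heap[0][2]

probs = {"x1": 0.3, "x2": 0.25, "x3": 0.2, "x4": 0.12, "x5": 0.08, "x6": 0.05}
codes = huffman(probs)
avg_len = sum(p * len(codes[s]) for s, p in probs.items())   # 2.38 b/symbol
```

For the 0.3/0.25/0.2/0.12/0.08/0.05 source this gives code lengths 2, 2, 2, 3, 4, 4 and an average length of 2.38 b/symbol, matching the tree example below.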
EXAMPLES OF HUFFMAN CODING

[Huffman tree for probabilities 0.4, 0.2, 0.1, 0.1, 0.1, 0.1; successive merges give 0.4, 0.2, 0.2, 0.2 → 0.4, 0.4, 0.2 → 0.6, 0.4.]

xi   P(xi)   Code
x1   0.4     1 1
x2   0.2     0 0
x3   0.1     1 0 1
x4   0.1     1 0 0
x5   0.1     0 1 1
x6   0.1     0 1 0
EXAMPLES OF HUFFMAN CODING

[Huffman tree for probabilities 0.30, 0.25, 0.20, 0.12, 0.08, 0.05; successive merges give 0.30, 0.25, 0.25, 0.20 → 0.45, 0.30, 0.25 → 0.55, 0.45.]

xi   P(xi)   Code
x1   0.30    1 1
x2   0.25    0 1
x3   0.20    0 0
x4   0.12    1 0 0
x5   0.08    1 0 1 1
x6   0.05    1 0 1 0
EXAMPLES OF HUFFMAN CODING

[Huffman tree for probabilities 0.4, 0.19, 0.16, 0.15, 0.1; successive merges give 0.4, 0.25, 0.19, 0.16 → 0.4, 0.35, 0.25 → 0.6, 0.4.]

xi   P(xi)   Code
x1   0.40    0
x2   0.19    1 1 1
x3   0.16    1 1 0
x4   0.15    1 0 1
x5   0.10    1 0 0
EXAMPLES OF HUFFMAN CODING

[Huffman tree for five equally likely symbols, p = 0.2 each; successive merges give 0.4, 0.2, 0.2, 0.2 → 0.4, 0.4, 0.2 → 0.6, 0.4.]

xi   P(xi)   Code
x1   0.2     1 0
x2   0.2     0 1
x3   0.2     0 0
x4   0.2     1 1 1
x5   0.2     1 1 0
CHANNEL REPRESENTATION
A communication channel may be defined as the path or medium through which the symbols flow to the receiver.
A Discrete Memoryless Channel (DMC) is a statistical model with an input X and an output Y as shown below. During each signalling interval, the channel accepts an input symbol from X, and in response it generates an output symbol from Y.
The channel is discrete when the alphabets of X and Y are both finite.
It is memoryless when the current output depends only on the current input and not on any of the previous inputs.
CHANNEL REPRESENTATION

[Diagram: DMC with m inputs x1, x2, ..., xm and n outputs y1, y2, ..., yn; each input-output path is labelled with a transition probability p(yj|xi).]

A diagram of a DMC with m inputs and n outputs is shown above. The input X consists of input symbols x1, x2, ..., xm. The output Y consists of output symbols y1, y2, ..., yn.
CHANNEL REPRESENTATION
Each possible input-to-output path is indicated along with a conditional probability p(yj|xi), which is the conditional probability of obtaining output yj given that the input is xi, called the channel transition probability.
A channel is completely specified by the complete set of transition probabilities. So a DMC is often specified by a matrix of transition probabilities [P(Y|X)].
CHANNEL MATRIX

            | P(y1|x1)  P(y2|x1)  ...  P(yn|x1) |
[P(Y|X)] =  | P(y1|x2)  P(y2|x2)  ...  P(yn|x2) |
            | ................................. |
            | P(y1|xm)  P(y2|xm)  ...  P(yn|xm) |

The matrix [P(Y|X)] is called the channel matrix. Each row of the matrix specifies the probabilities of obtaining y1, y2, ..., yn given xi, so the sum of the elements in any row should be unity.
CHANNEL MATRIX

Σ (j=1 to n) p(yj|xi) = 1 for all i

If the input probabilities P(X) are represented by the row matrix [P(X)] = [p(x1) p(x2) ... p(xm)], and the output probabilities P(Y) by the row matrix [P(Y)] = [p(y1) p(y2) ... p(yn)], then the output probabilities may be expressed in terms of the input probabilities as:

[P(Y)] = [P(X)] [P(Y|X)]
CHANNEL MATRIX

If [P(X)] is represented as a diagonal matrix

           | p(x1)  0      ...  0     |
[P(X)]d =  | 0      p(x2)  ...  0     |
           | ........................ |
           | 0      0      ...  p(xm) |

then [P(X,Y)] = [P(X)]d [P(Y|X)].

The (i, j) element of the matrix [P(X,Y)] has the form p(xi, yj). The matrix [P(X,Y)] is known as the joint probability matrix, and the element p(xi, yj) is the joint probability of transmitting xi and receiving yj.
LOSSLESS CHANNEL
A channel described by a channel matrix with only one non-zero element in each column is called a lossless channel. In a lossless channel no source information is lost in transmission.

[Diagram: x1 → y1 (3/4), x1 → y2 (1/4); x2 → y3 (1/3), x2 → y4 (2/3); x3 → y5 (1).]

            | 3/4  1/4  0    0    0 |
[P(Y|X)] =  | 0    0    1/3  2/3  0 |
            | 0    0    0    0    1 |
DETERMINISTIC CHANNEL
A channel described by a channel matrix with only one non-zero element in each row is called a deterministic channel.

[Diagram: x1 → y1, x2 → y1, x3 → y2, x4 → y2, x5 → y3, each with probability 1.]

            | 1  0  0 |
            | 1  0  0 |
[P(Y|X)] =  | 0  1  0 |
            | 0  1  0 |
            | 0  0  1 |

Since each row has only one non-zero element, this element must be unity. When a given source symbol is sent over a deterministic channel, it is clear which output symbol will be received.
NOISELESS CHANNEL
A channel is called noiseless if it is both lossless and deterministic. The channel matrix has only one element in each row and each column, and this element is unity. The input and output alphabets are of the same size.

[Diagram: xi → yi with probability 1, for i = 1, 2, ..., m.]
BINARY SYMMETRIC CHANNEL
A binary symmetric channel is defined by the channel diagram shown below, and its channel matrix is given by

            | 1-p  p   |
[P(Y|X)] =  | p    1-p |

[Diagram: x1 = 0 → y1 = 0 with probability 1-p and → y2 = 1 with probability p; x2 = 1 → y2 = 1 with probability 1-p and → y1 = 0 with probability p.]
The channel has two inputs (0 and 1) and two outputs (0 and 1). The channel is symmetric because the probability of receiving a 1 if a 0 is sent is the same as the probability of receiving a 0 if a 1 is sent. This common transition probability is denoted by p.
BINARY SYMMETRIC CHANNEL EXAMPLE 1

[Diagram: x1 → y1 with probability 0.9, x1 → y2 with probability 0.1; x2 → y2 with probability 0.8, x2 → y1 with probability 0.2.]

(i) Find the channel matrix of the binary channel.
(ii) Find p(y1) and p(y2) when p(x1) = p(x2) = 0.5.
(iii) Find the joint probabilities p(x1, y2) and p(x2, y1) when p(x1) = p(x2) = 0.5.
SOLUTION

                  | p(y1|x1)  p(y2|x1) |   | 0.9  0.1 |
Channel matrix =  | p(y1|x2)  p(y2|x2) | = | 0.2  0.8 |

[P(Y)] = [P(X)] [P(Y|X)] = [0.5  0.5] | 0.9  0.1 |  =  [0.55  0.45]
                                      | 0.2  0.8 |

p(y1) = 0.55, p(y2) = 0.45

[P(X,Y)] = [P(X)]d [P(Y|X)] = | 0.5  0   | | 0.9  0.1 |  =  | 0.45  0.05 |
                              | 0    0.5 | | 0.2  0.8 |     | 0.10  0.40 |

| p(x1,y1)  p(x1,y2) |   | 0.45  0.05 |
| p(x2,y1)  p(x2,y2) | = | 0.10  0.40 |

p(x1, y2) = 0.05, p(x2, y1) = 0.1
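These matrix products are easy to reproduce; a minimal sketch using plain lists of rows (the helper `mat_mul` is ours), which also covers the cascade of Example 2 below:

```python
def mat_mul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

P_YX = [[0.9, 0.1],
        [0.2, 0.8]]           # channel matrix [P(Y|X)]
P_X = [[0.5, 0.5]]            # input probabilities as a row matrix

P_Y = mat_mul(P_X, P_YX)      # output probabilities [P(Y)] = [0.55, 0.45]
P_Xd = [[0.5, 0.0],
        [0.0, 0.5]]           # [P(X)] as a diagonal matrix
P_XY = mat_mul(P_Xd, P_YX)    # joint probability matrix [P(X,Y)]
P_ZX = mat_mul(P_YX, P_YX)    # two identical channels in cascade (Example 2)
```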
EXAMPLE 2
Two binary channels of the above example are connected in cascade. Find the overall channel matrix and draw the resultant equivalent channel diagram. Find p(z1), p(z2) when p(x1) = p(x2) = 0.5.

[Diagram: X → Y → Z, each stage a binary channel with transition probabilities 0.9/0.1 and 0.2/0.8.]

SOLUTION

[P(Z|X)] = [P(Y|X)] [P(Z|Y)] = | 0.9  0.1 | | 0.9  0.1 |  =  | 0.83  0.17 |
                               | 0.2  0.8 | | 0.2  0.8 |     | 0.34  0.66 |

[P(Z)] = [P(X)] [P(Z|X)] = [0.5  0.5] | 0.83  0.17 |  =  [0.585  0.415]
                                      | 0.34  0.66 |
EXAMPLE 3
A channel has the channel matrix

            | 1-p  p  0   |
[P(Y|X)] =  | 0    p  1-p |

(i) Draw the channel diagram.
(ii) If the source has equally likely outputs, compute the probabilities associated with the channel outputs for p = 0.2.

[Diagram: x1 = 0 → y1 = 0 (1-p) and y2 = e (p); x2 = 1 → y3 = 1 (1-p) and y2 = e (p), where e denotes erasure.]
SOLUTION
This channel is known as the binary erasure channel (BEC). It has two inputs x1 = 0 and x2 = 1 and three outputs y1 = 0, y2 = e, y3 = 1, where e denotes erasure. This means that the output is in doubt, and hence it should be erased.

[P(Y)] = [0.5  0.5] | 0.8  0.2  0   |  =  [0.4  0.2  0.4]
                    | 0    0.2  0.8 |
EXAMPLE 5

[Diagram: three inputs x1, x2, x3 and three outputs y1, y2, y3, with transition probabilities 1/3, 1/3, 1/3 from x1; 1/4, 1/2, 1/4 from x2; 1/4, 1/4, 1/2 from x3.]

(i) Find the channel matrix.
(ii) Find the output probabilities if p(x1) = 1/2 and p(x2) = p(x3) = 1/4.
(iii) Find the output entropy H(Y).
SOLUTION

            | 1/3  1/3  1/3 |   | 0.33  0.33  0.33 |
[P(Y|X)] =  | 1/4  1/2  1/4 | = | 0.25  0.50  0.25 |
            | 1/4  1/4  1/2 |   | 0.25  0.25  0.50 |

[P(Y)] = [P(X)] [P(Y|X)] = [0.5  0.25  0.25] | 1/3  1/3  1/3 |  =  [7/24  17/48  17/48]
                                             | 1/4  1/2  1/4 |
                                             | 1/4  1/4  1/2 |

H(Y) = Σ (i=1 to 3) p(yi) log2 (1/p(yi))
     = (7/24) log2 (24/7) + (17/48) log2 (48/17) + (17/48) log2 (48/17)
     = 1.58 b/symbol
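The fractions 7/24 and 17/48 can be checked exactly with rational arithmetic; a sketch (variable names are ours):

```python
import math
from fractions import Fraction as F

P_YX = [[F(1, 3), F(1, 3), F(1, 3)],
        [F(1, 4), F(1, 2), F(1, 4)],
        [F(1, 4), F(1, 4), F(1, 2)]]
P_X = [F(1, 2), F(1, 4), F(1, 4)]

# [P(Y)] = [P(X)][P(Y|X)], as an exact row vector
P_Y = [sum(P_X[i] * P_YX[i][j] for i in range(3)) for j in range(3)]

# Output entropy H(Y) in bits per symbol
H_Y = sum(float(p) * math.log2(1 / float(p)) for p in P_Y)
```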
MUTUAL INFORMATION AND CHANNEL CAPACITY OF DMC
Let a source emit symbols x1, x2, ..., xm and the receiver receive symbols y1, y2, ..., yn. The set of symbols yj may or may not be identical to the set of symbols xi, depending on the nature of the receiver. Several types of probabilities will be needed to deal with the two alphabets X and Y.

[Diagram: X = {x1, x2, ..., xm} → CHANNEL → Y = {y1, y2, ..., yn}.]
PROBABILITIES ASSOCIATED WITH A CHANNEL

i.   p(xi) is the probability that the source selects symbol xi for transmission.
ii.  p(yj) is the probability that the symbol yj is received.
iii. p(xi, yj) is the joint probability that xi is transmitted and yj is received.
iv.  p(xi|yj) is the conditional probability that xi was transmitted given that yj is received.
v.   p(yj|xi) is the conditional probability that yj is received given that xi was transmitted.
ENTROPIES ASSOCIATED WITH A CHANNEL

Correspondingly we have the following entropies:

i.   H(X) is the entropy of the transmitter.
ii.  H(Y) is the entropy of the receiver.
iii. H(X,Y) is the joint entropy of the transmitted and received symbols.
iv.  H(X|Y) is the entropy of the transmitter given knowledge of the received symbols.
v.   H(Y|X) is the entropy of the receiver given knowledge of the transmitted symbols.
ENTROPIES ASSOCIATED WITH A CHANNEL

H(X)   = Σ (i) p(xi) log2 (1/p(xi))
H(Y)   = Σ (j) p(yj) log2 (1/p(yj))
H(X|Y) = Σ (j) Σ (i) p(xi, yj) log2 (1/p(xi|yj))
H(Y|X) = Σ (j) Σ (i) p(xi, yj) log2 (1/p(yj|xi))
H(X,Y) = Σ (j) Σ (i) p(xi, yj) log2 (1/p(xi, yj))
RELATIONSHIP BETWEEN ENTROPIES

H(X,Y) = Σ (j) Σ (i) p(xi, yj) log2 (1/p(xi, yj))
       = Σ (j) Σ (i) p(xi, yj) log2 (1/[p(xi|yj) p(yj)])
       = Σ (j) Σ (i) p(xi, yj) [log2 (1/p(xi|yj)) + log2 (1/p(yj))]
       = H(X|Y) + Σ (j) Σ (i) p(xi, yj) log2 (1/p(yj))

Since Σ (i) p(xi, yj) = p(yj):

H(X,Y) = H(X|Y) + Σ (j) p(yj) log2 (1/p(yj)) = H(X|Y) + H(Y)

Similarly:

H(X,Y) = H(Y|X) + H(X)
MUTUAL INFORMATION
If the channel is noiseless, then the reception of some symbol yj uniquely determines the message transmitted. Because of noise, there is a certain amount of uncertainty regarding the transmitted symbol when yj is received. p(xi|yj) represents the conditional probability that the transmitted symbol was xi given that yj is received. The average uncertainty about X when yj is received is represented as:

H(X|Y = yj) = Σ (i) p(xi|yj) log2 (1/p(xi|yj))

The quantity H(X|Y = yj) is itself a random variable that takes on the values H(X|Y = y1), H(X|Y = y2), ..., H(X|Y = yn) with probabilities p(y1), p(y2), ..., p(yn).
MUTUAL INFORMATION
Now the average uncertainty about X when Y is received is

H(X|Y) = \sum_{j=0}^{n-1} p(y_j) \left[ \sum_{i=0}^{m-1} p(x_i|y_j) \log_2 \frac{1}{p(x_i|y_j)} \right]

= \sum_{j=0}^{n-1} \sum_{i=0}^{m-1} p(x_i|y_j)\, p(y_j) \log_2 \frac{1}{p(x_i|y_j)}
H(X|Y) represents the average loss of information about a transmitted symbol when a symbol is received. It is called the equivocation of X with respect to Y. Since p(x_i|y_j)\, p(y_j) = p(x_i, y_j),

H(X|Y) = \sum_{j=0}^{n-1} \sum_{i=0}^{m-1} p(x_i, y_j) \log_2 \frac{1}{p(x_i|y_j)}
If the channel were noiseless, the average amount of information received would be H(X) bits per received symbol; H(X) is the average amount of information transmitted per symbol. Because of channel noise we lose an average of H(X|Y) bits of information per symbol, so the receiver receives on the average H(X) - H(X|Y) bits per symbol. The quantity H(X) - H(X|Y) is denoted I(X;Y) and is called the mutual information:

I(X;Y) = \sum_{i=0}^{m-1} p(x_i) \log_2 \frac{1}{p(x_i)} - \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} p(x_i, y_j) \log_2 \frac{1}{p(x_i|y_j)}
But \sum_{j=0}^{n-1} p(x_i, y_j) = p(x_i), so

I(X;Y) = \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} p(x_i, y_j) \log_2 \frac{1}{p(x_i)} - \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} p(x_i, y_j) \log_2 \frac{1}{p(x_i|y_j)}

= \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} p(x_i, y_j) \log_2 \frac{p(x_i|y_j)}{p(x_i)}

= \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} p(x_i, y_j) \log_2 \frac{p(x_i, y_j)}{p(x_i)\, p(y_j)} ...........(1)

If we interchange the symbols x_i and y_j, the value of eq (1) is not altered, so I(X;Y) = I(Y;X), i.e.

H(X) - H(X|Y) = H(Y) - H(Y|X)
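The symmetry I(X;Y) = I(Y;X) is easy to confirm numerically. A short sketch (the joint distribution is an illustrative choice, not from the notes):

```python
import math

def mutual_information(p_xy):
    """I(X;Y) = sum_ij p(x_i,y_j) log2[ p(x_i,y_j) / (p(x_i) p(y_j)) ]."""
    n, m = len(p_xy), len(p_xy[0])
    p_x = [sum(row) for row in p_xy]
    p_y = [sum(p_xy[i][j] for i in range(n)) for j in range(m)]
    return sum(p_xy[i][j] * math.log2(p_xy[i][j] / (p_x[i] * p_y[j]))
               for i in range(n) for j in range(m) if p_xy[i][j] > 0)

p_xy = [[0.30, 0.10],
        [0.05, 0.55]]

I_xy = mutual_information(p_xy)
# Interchanging the roles of X and Y (transposing the joint matrix)
# leaves the value unchanged: I(X;Y) = I(Y;X)
p_yx = [[p_xy[i][j] for i in range(2)] for j in range(2)]
I_yx = mutual_information(p_yx)
print(abs(I_xy - I_yx) < 1e-12)   # True
```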
CHANNEL CAPACITY
A particular communication channel has fixed source and destination alphabets and a fixed channel matrix, so the only variable quantity in the expression for the mutual information I(X;Y) is the source probability p(x_i). Consequently, maximum information transfer requires specific source statistics, obtained through source coding. A suitable measure of the efficiency of information transfer through a discrete memoryless channel is obtained by comparing the actual information transfer to the upper bound of such trans-information for the given channel. The information transfer in a channel is characterised by the mutual information, and Shannon named the maximum mutual information the channel capacity.
Channel capacity C = \max I(X;Y)
Channel capacity C is the maximum possible information transmitted when one symbol is transmitted from the transmitter. Channel capacity depends on the transmission medium, the kind of signals, the kind of receiver, etc.; it is a property of the system as a whole.
CHANNEL CAPACITY OF A BSC
[Figure: transition diagram of the BSC — inputs x1 = 0 (probability p) and x2 = 1 (probability 1-p), outputs y1 = 0 and y2 = 1; correct transitions occur with probability 1-\alpha, crossovers with probability \alpha.]

The source alphabet consists of two symbols x1 and x2 with probabilities p(x1) = p and p(x2) = 1-p. The destination alphabet is {y1, y2}. The average error probability per symbol is

p_e = p(x_1)\, p(y_2|x_1) + p(x_2)\, p(y_1|x_2) = p\alpha + (1-p)\alpha = \alpha
The error probability of a BSC is \alpha. The channel matrix is

P(Y|X) = \begin{bmatrix} 1-\alpha & \alpha \\ \alpha & 1-\alpha \end{bmatrix}

The destination entropy H(Y) is

H(Y) = p(y_1) \log_2 \frac{1}{p(y_1)} + p(y_2) \log_2 \frac{1}{p(y_2)}

= p(y_1) \log_2 \frac{1}{p(y_1)} + \left(1 - p(y_1)\right) \log_2 \frac{1}{1 - p(y_1)}

= \Omega[p(y_1)]

where \Omega(x) = x \log_2 \frac{1}{x} + (1-x) \log_2 \frac{1}{1-x}.
[Figure: plot of \Omega(\alpha) versus \alpha — the curve rises from 0 at \alpha = 0 to its maximum of 1 at \alpha = 1/2 and falls back to 0 at \alpha = 1.]
The maximum occurs at x = 0.5, where \Omega_{max} = 1 bit/symbol. The probability of the output symbol y1 is

p(y_1) = \sum_i p(x_i, y_1) = p(y_1|x_1)\, p(x_1) + p(y_1|x_2)\, p(x_2) = p(1-\alpha) + (1-p)\alpha = \alpha + p - 2\alpha p

Hence

H(Y) = \Omega(\alpha + p - 2\alpha p)

The noise entropy is

H(Y|X) = \sum_{j=1}^{2} \sum_{i=1}^{2} p(x_i, y_j) \log_2 \frac{1}{p(y_j|x_i)} = \sum_{i=1}^{2} \sum_{j=1}^{2} p(x_i)\, p(y_j|x_i) \log_2 \frac{1}{p(y_j|x_i)}
H(Y|X) = \sum_{i=1}^{2} p(x_i) \sum_{j=1}^{2} p(y_j|x_i) \log_2 \frac{1}{p(y_j|x_i)}

= p(x_1)\, p(y_1|x_1) \log_2 \frac{1}{p(y_1|x_1)} + p(x_1)\, p(y_2|x_1) \log_2 \frac{1}{p(y_2|x_1)}
+ p(x_2)\, p(y_1|x_2) \log_2 \frac{1}{p(y_1|x_2)} + p(x_2)\, p(y_2|x_2) \log_2 \frac{1}{p(y_2|x_2)}

= p(1-\alpha)\log_2\frac{1}{1-\alpha} + p\alpha \log_2\frac{1}{\alpha} + (1-p)\alpha \log_2 \frac{1}{\alpha} + (1-p)(1-\alpha)\log_2 \frac{1}{1-\alpha}

= (1-\alpha)\log_2\frac{1}{1-\alpha} + \alpha\log_2\frac{1}{\alpha} = \Omega(\alpha)

Hence H(Y|X) = \Omega(\alpha).
The mutual information is

I(X;Y) = H(Y) - H(Y|X) = \Omega(\alpha + p - 2\alpha p) - \Omega(\alpha)

If the noise is small, the error probability \alpha << 1 and the mutual information becomes almost the source entropy:

I(X;Y) \approx \Omega(p) = H(X)

On the other hand, if the channel is very noisy, \alpha = 1/2 and

I(X;Y) = 0

For a fixed \alpha, \Omega(\alpha) is a constant, but the other term \Omega(\alpha + p - 2\alpha p) varies with the source probability. This term reaches its maximum value of 1 when \alpha + p - 2\alpha p = 1/2.
This condition is satisfied for any \alpha if p = 1/2. So the channel capacity of a BSC can be written as

C = 1 - \Omega(\alpha)
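The capacity formula C = 1 - \Omega(\alpha) is easy to evaluate; a minimal Python sketch:

```python
import math

def omega(x):
    """Binary entropy function Omega(x) = x log2(1/x) + (1-x) log2(1/(1-x))."""
    if x in (0.0, 1.0):
        return 0.0
    return x * math.log2(1 / x) + (1 - x) * math.log2(1 / (1 - x))

def bsc_capacity(alpha):
    """Capacity (bits/symbol) of a binary symmetric channel with error prob alpha."""
    return 1 - omega(alpha)

print(bsc_capacity(0.0))   # 1.0 : noiseless channel carries one full bit
print(bsc_capacity(0.5))   # 0.0 : a very noisy channel carries nothing
print(bsc_capacity(0.1))   # ~0.531 bits/symbol for a 10% error rate
```

Note the symmetry bsc_capacity(alpha) = bsc_capacity(1 - alpha): a channel that inverts every bit is as useful as a perfect one.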
SHANNON’S THEOREM ON CHANNEL CAPACITY
i. Given a source of M equally likely messages with M >> 1 which is generating information at a rate R, and given a channel with channel capacity C, then if R <= C there exists a coding technique such that the output of the source may be transmitted over the channel with a probability of error in the received message which may be made arbitrarily small.
ii. Given a source of M equally likely messages with M >> 1 which is generating information at a rate R, then if R > C the probability of error is close to unity for every possible set of M transmitter signals.
DIFFERENTIAL ENTROPY H(X)
Consider a continuous random variable X with probability density function f_X(x). By analogy with the entropy of a discrete random variable we can introduce the definition

h(X) = \int_{-\infty}^{\infty} f_X(x) \log_2 \frac{1}{f_X(x)}\, dx

h(X) is called the differential entropy of X, to distinguish it from the ordinary or absolute entropy H(X). The difference between h(X) and H(X) can be explained as follows.
We can view the continuous random variable X as the limiting form of a discrete random variable that assumes the values x_k = k\Delta x, where k = 0, \pm 1, \pm 2, ... and \Delta x approaches zero. The continuous random variable X assumes a value in the interval (x_k, x_k + \Delta x) with probability f_X(x_k)\Delta x. Hence, permitting \Delta x to approach zero, the ordinary entropy of the continuous random variable X may be written in the limit as

H(X) = \lim_{\Delta x \to 0} \sum_{k=-\infty}^{\infty} f_X(x_k)\,\Delta x\, \log_2 \frac{1}{f_X(x_k)\,\Delta x}

= \lim_{\Delta x \to 0} \left\{ \sum_{k=-\infty}^{\infty} f_X(x_k)\,\Delta x\, \log_2 \frac{1}{f_X(x_k)} - \log_2 \Delta x \sum_{k=-\infty}^{\infty} f_X(x_k)\,\Delta x \right\}
= \int_{-\infty}^{\infty} f_X(x) \log_2 \frac{1}{f_X(x)}\, dx - \lim_{\Delta x \to 0} \log_2 \Delta x \int_{-\infty}^{\infty} f_X(x)\, dx

= h(X) - \lim_{\Delta x \to 0} \log_2 \Delta x \qquad \left( \text{since } \int_{-\infty}^{\infty} f_X(x)\, dx = 1 \right)

In the limit as \Delta x \to 0, -\log_2 \Delta x approaches infinity, so H(X) \to \infty. This implies that the entropy of a continuous random variable is infinitely large: a continuous random variable may assume a value anywhere in the interval -\infty to +\infty, and the uncertainty associated with the variable is on the order of infinity. So we define h(X) as the differential entropy, with the term -\log_2 \Delta x serving as a reference.
EXAMPLE
A signal amplitude X is a random variable uniformly distributed in the range (-1, 1). The signal is passed through an amplifier of gain 2. The output Y is also a random variable, uniformly distributed in the range (-2, 2). Determine the differential entropies of X and Y.

f_X(x) = 1/2 for |x| < 1, and 0 otherwise
f_Y(y) = 1/4 for |y| < 2, and 0 otherwise
h(X) = \int_{-1}^{1} \frac{1}{2} \log_2 2\, dx = 1 \text{ bit}

h(Y) = \int_{-2}^{2} \frac{1}{4} \log_2 4\, dy = 2 \text{ bits}

The differential entropy of Y is twice that of X. But Y = 2X, and a knowledge of X uniquely determines Y, so the average uncertainty about X and Y should be identical: amplification can neither add nor subtract information. Yet h(Y) is twice as large as h(X). This is because h(X) and h(Y) are differential entropies, and they will be equal only if their reference entropies are equal.
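For a uniform density on (a, b) the integral above reduces to log2(b - a), which makes the example a one-liner to check:

```python
import math

def h_uniform(a, b):
    """Differential entropy (bits) of a uniform density on (a, b): log2(b - a)."""
    return math.log2(b - a)

h_X = h_uniform(-1, 1)   # X uniform on (-1, 1)
h_Y = h_uniform(-2, 2)   # Y = 2X, uniform on (-2, 2)

print(h_X, h_Y)          # 1.0  2.0
print(h_Y - h_X)         # 1.0 : the gain of 2 adds exactly log2(2) = 1 bit
```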
The reference entropy R_1 for X is -\log_2 \Delta x and the reference entropy R_2 for Y is -\log_2 \Delta y. In the limit as \Delta x, \Delta y \to 0,

R_1 = \lim_{\Delta x \to 0} (-\log_2 \Delta x), \qquad R_2 = \lim_{\Delta y \to 0} (-\log_2 \Delta y)

R_1 - R_2 = \lim_{\Delta x, \Delta y \to 0} \log_2 \frac{\Delta y}{\Delta x} = \log_2 \frac{dy}{dx} = \log_2 \frac{d(2x)}{dx} = \log_2 2 = 1 \text{ bit}

i.e. R_1 = R_2 + 1 bit. The reference entropy R_1 of X is higher than the reference entropy R_2 of Y. Hence, if X and Y have equal absolute entropies, their differential entropies must differ by 1 bit.
CHANNEL CAPACITY AND MUTUAL INFORMATION
Let a random variable X be transmitted over a channel. Each value of X in a given continuous range is now a message that may be transmitted, e.g. a pulse of height X. The message recovered by the receiver will be a continuous random variable Y. If the channel were noise-free, the received value Y would uniquely determine the transmitted value X. Consider the event that a value of X in the interval (x, x + \Delta x) has been transmitted (\Delta x \to 0). The amount of information transmitted is \log_2 [1 / (f_X(x)\Delta x)], since the probability of this event is f_X(x)\Delta x. Let the value of Y at the receiver be y, and let f_X(x|y) be the conditional pdf of X given Y.
MUTUAL INFORMATION
Then f_X(x|y)\Delta x is the probability that X lies in the interval (x, x + \Delta x) when Y = y, provided \Delta x \to 0. The uncertainty about the event that X lies in (x, x + \Delta x) is \log_2 [1 / (f_X(x|y)\Delta x)]. This uncertainty arises because of channel noise and therefore represents a loss of information. Because \log_2 [1 / (f_X(x)\Delta x)] is the information transmitted and \log_2 [1 / (f_X(x|y)\Delta x)] is the information lost over the channel, the net information received is the difference between the two:

\log_2 \frac{1}{f_X(x)\Delta x} - \log_2 \frac{1}{f_X(x|y)\Delta x} = \log_2 \frac{f_X(x|y)}{f_X(x)}
Comparing with the discrete case, we can write the mutual information between the random variables X and Y as
I(X;Y) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f_{XY}(x,y) \log_2 \frac{f_X(x|y)}{f_X(x)}\, dx\, dy

= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f_{XY}(x,y) \log_2 \frac{1}{f_X(x)}\, dx\, dy + \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f_{XY}(x,y) \log_2 f_X(x|y)\, dx\, dy

= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f_X(x)\, f_Y(y|x) \log_2 \frac{1}{f_X(x)}\, dx\, dy + \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f_{XY}(x,y) \log_2 f_X(x|y)\, dx\, dy

= \int_{-\infty}^{\infty} f_X(x) \log_2 \frac{1}{f_X(x)}\, dx \int_{-\infty}^{\infty} f_Y(y|x)\, dy + \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f_{XY}(x,y) \log_2 f_X(x|y)\, dx\, dy
Now

\int_{-\infty}^{\infty} f_X(x) \log_2 \frac{1}{f_X(x)}\, dx = h(X) \qquad \text{and} \qquad \int_{-\infty}^{\infty} f_Y(y|x)\, dy = 1

so

I(X;Y) = h(X) + \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f_{XY}(x,y) \log_2 f_X(x|y)\, dx\, dy

= h(X) - \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f_{XY}(x,y) \log_2 \frac{1}{f_X(x|y)}\, dx\, dy

The second term on the RHS represents the average over x and y of \log_2 [1 / f_X(x|y)], which is the uncertainty about x when y is received.
It is the loss of information over the channel. The average of \log_2 [1 / f_X(x|y)] is the average loss of information over the channel when some x is transmitted and y is received. By definition this quantity is denoted h(X|Y) and is called the equivocation of X with respect to Y:

h(X|Y) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f_{XY}(x,y) \log_2 \frac{1}{f_X(x|y)}\, dx\, dy

Hence

I(X;Y) = h(X) - h(X|Y)
CHANNEL CAPACITY

That is, when some value of X is transmitted and some value of Y is received, the average information transmitted over the channel is I(X;Y). Channel capacity C is defined as the maximum amount of information that can be transmitted on the average:

C = \max I(X;Y)
MAXIMUM ENTROPY FOR CONTINUOUS CHANNELS
For discrete random variables the entropy is maximum when all the outcomes are equally likely. For continuous random variables there exists a PDF f_X(x) that maximizes h(X). It is found that the PDF that maximizes h(X) is the Gaussian distribution

f_X(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x-\mu)^2 / 2\sigma^2}

Also, the random variables X and Y considered below must have the same mean \mu and the same variance \sigma^2.
Consider an arbitrary pair of random variables X and Y whose PDFs are denoted by f_X(x) and f_Y(x), where x is a dummy variable. Adapting the fundamental inequality

\sum_{k=1}^{m} p_k \log_2 \frac{q_k}{p_k} \le 0

we may write

\int_{-\infty}^{\infty} f_Y(x) \log_2 \frac{f_X(x)}{f_Y(x)}\, dx \le 0

\int_{-\infty}^{\infty} f_Y(x) \log_2 \frac{1}{f_Y(x)}\, dx + \int_{-\infty}^{\infty} f_Y(x) \log_2 f_X(x)\, dx \le 0
h(Y) \le -\int_{-\infty}^{\infty} f_Y(x) \log_2 f_X(x)\, dx ..........(1)

When the random variable X is Gaussian, its PDF is given by

f_X(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x-\mu)^2 / 2\sigma^2} .............(2)

Substituting (2) in (1),

h(Y) \le -\int_{-\infty}^{\infty} f_Y(x) \log_2 \left[ \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-(x-\mu)^2 / 2\sigma^2} \right] dx
Converting the logarithm to base e using the relation \log_2(x) = \log_2 e \cdot \ln(x),

h(Y) \le -\log_2 e \int_{-\infty}^{\infty} f_Y(x) \left[ -\frac{(x-\mu)^2}{2\sigma^2} - \ln\sqrt{2\pi\sigma^2} \right] dx

= \log_2 e \left\{ \int_{-\infty}^{\infty} \frac{(x-\mu)^2}{2\sigma^2}\, f_Y(x)\, dx + \ln\sqrt{2\pi\sigma^2} \int_{-\infty}^{\infty} f_Y(x)\, dx \right\}
It is given that the random variables X and Y have the properties (i) mean = \mu and (ii) variance = \sigma^2, so

\int_{-\infty}^{\infty} f_Y(x)\, dx = 1, \qquad \int_{-\infty}^{\infty} (x-\mu)^2 f_Y(x)\, dx = \sigma^2

Therefore

h(Y) \le \log_2 e \left[ \frac{1}{2} + \ln\sqrt{2\pi\sigma^2} \right]

= \frac{1}{2}\log_2 e + \log_2 \sqrt{2\pi\sigma^2}

= \frac{1}{2}\log_2 e + \frac{1}{2}\log_2 (2\pi\sigma^2)

h(Y) \le \frac{1}{2}\log_2 (2\pi e \sigma^2)
The maximum value of h(Y) is therefore

h(Y)_{max} = \frac{1}{2}\log_2 (2\pi e \sigma^2)

For a finite variance \sigma^2, the Gaussian random variable has the largest differential entropy attainable by any random variable, and that entropy is uniquely determined by the variance.
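The Gaussian bound can be illustrated numerically: a uniform density on (-a, a) has variance a^2/3 and differential entropy log2(2a), which always falls below (1/2) log2(2 pi e sigma^2) for the same variance. A minimal sketch:

```python
import math

def h_gaussian(sigma2):
    """Differential entropy (bits) of a Gaussian with variance sigma2."""
    return 0.5 * math.log2(2 * math.pi * math.e * sigma2)

# Uniform on (-a, a): variance a^2/3, differential entropy log2(2a).
# For the SAME variance the Gaussian bound must be larger.
a = 1.0
sigma2 = a * a / 3           # variance of the uniform density
h_unif = math.log2(2 * a)    # 1 bit

print(h_gaussian(sigma2))    # about 1.25 bits
print(h_gaussian(sigma2) > h_unif)   # True: Gaussian maximizes h for fixed variance
```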
CHANNEL CAPACITY OF A BAND LIMITED AWGN CHANNEL (SHANNON HARTLEY THEOREM)
The channel capacity C is the maximum rate of information transmission over a channel. The mutual information I(X;Y) is given by I(X;Y) = h(Y) - h(Y|X), and the channel capacity is the maximum value of the mutual information I(X;Y). Let a channel be band-limited to B Hz and disturbed by a white Gaussian noise of PSD \eta/2. Let the signal power be S. The disturbance is assumed to be additive, so the received signal is y(t) = x(t) + n(t). Because the channel is band-limited, both the signal x(t) and the noise n(t) are band-limited to B Hz; y(t) is also band-limited to B Hz.
All these signals are therefore completely specified by samples taken at the uniform rate of 2B samples/second. Now we have to find the maximum information that can be transmitted per sample. Let x, n and y represent samples of x(t), n(t) and y(t). The information transmitted per sample is I(X;Y) = h(Y) - h(Y|X). By definition,

h(Y|X) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f_{XY}(x,y) \log_2 \frac{1}{f_Y(y|x)}\, dx\, dy

= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f_X(x)\, f_Y(y|x) \log_2 \frac{1}{f_Y(y|x)}\, dx\, dy
h(Y|X) = \int_{-\infty}^{\infty} f_X(x) \left[ \int_{-\infty}^{\infty} f_Y(y|x) \log_2 \frac{1}{f_Y(y|x)}\, dy \right] dx

For a given x, y is equal to a constant x plus n. Hence the distribution of Y when X has a given value is identical to that of n, except for a translation by x. If f_N represents the PDF of the noise sample n,

f_Y(y|x) = f_N(y - x)

Putting y - x = z,

\int_{-\infty}^{\infty} f_Y(y|x) \log_2 \frac{1}{f_Y(y|x)}\, dy = \int_{-\infty}^{\infty} f_N(z) \log_2 \frac{1}{f_N(z)}\, dz
Hence h(Y|X) = h(z) = h(N), independent of x, and

I(X;Y) = h(Y) - h(N)

The mean square value of x(t) is S and the mean square value of the noise is N, so the mean square value of y is

\overline{y^2} = S + N

The maximum of the output entropy h(Y) is obtained when Y is Gaussian and is given by

h(Y)_{max} = \frac{1}{2}\log_2 2\pi e (S + N), \qquad \sigma^2 = S + N
For a white Gaussian noise with mean square value N = \eta B,

h(N) = \frac{1}{2}\log_2 2\pi e N

The channel capacity per sample is

C_S = \max I(X;Y) = \frac{1}{2}\log_2 2\pi e (S+N) - \frac{1}{2}\log_2 2\pi e N = \frac{1}{2}\log_2 \frac{2\pi e (S+N)}{2\pi e N}
The channel capacity per sample is thus

C_S = \frac{1}{2}\log_2 \frac{S+N}{N} = \frac{1}{2}\log_2 \left( 1 + \frac{S}{N} \right)

There are 2B samples per second, so the channel capacity per second is

C = 2B \cdot \frac{1}{2}\log_2 \left( 1 + \frac{S}{N} \right) = B \log_2 \left( 1 + \frac{S}{N} \right) \text{ bits/second}
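The Shannon-Hartley formula is straightforward to evaluate; the bandwidth and SNR below are illustrative numbers (a 3 kHz telephone-grade channel at 30 dB SNR), not values from the notes:

```python
import math

def channel_capacity(B, S, N):
    """Shannon-Hartley capacity in bits/second for bandwidth B (Hz),
    signal power S and noise power N (same units)."""
    return B * math.log2(1 + S / N)

B = 3000.0                      # 3 kHz bandwidth
snr = 10 ** (30 / 10)           # 30 dB SNR -> power ratio of 1000
C = channel_capacity(B, snr, 1.0)
print(C)                        # roughly 29.9 kbit/s
```

Doubling the bandwidth doubles C, while doubling the SNR adds only B bits/second: capacity is linear in B but only logarithmic in S/N.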
CAPACITY OF A CHANNEL OF INFINITE BANDWIDTH
The Shannon-Hartley theorem indicates that a noiseless Gaussian channel, with S/N = \infty, has infinite capacity, since

C = B \log_2 \left( 1 + \frac{S}{N} \right)

However, when the bandwidth B increases the channel capacity does not become infinite as expected, because with an increase in bandwidth the noise power also increases. Thus, for a fixed signal power and in the presence of white Gaussian noise, the channel capacity approaches an upper limit with increase in bandwidth.
Putting N = \eta B,

C = B \log_2 \left( 1 + \frac{S}{\eta B} \right)

= \frac{S}{\eta} \cdot \frac{\eta B}{S} \log_2 \left( 1 + \frac{S}{\eta B} \right)

= \frac{S}{\eta} \log_2 \left( 1 + \frac{S}{\eta B} \right)^{\eta B / S}

Putting x = S / (\eta B), this expression becomes

C = \frac{S}{\eta} \log_2 (1 + x)^{1/x}

As B \to \infty, x \to 0, so

C_\infty = \frac{S}{\eta} \lim_{x \to 0} \log_2 (1 + x)^{1/x}
C_\infty = \frac{S}{\eta} \log_2 e = 1.44\, \frac{S}{\eta}

Putting N = \eta B, this can also be written C_\infty = 1.44\, (S/N)\, B. This equation indicates that we may trade off bandwidth for signal-to-noise ratio and vice versa: for a given C, if S/N is reduced we have to increase the bandwidth, and if the bandwidth is to be reduced we have to increase S/N.
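The saturation of C at 1.44 S/\eta as the bandwidth grows can be seen numerically (unit signal power and noise density are illustrative choices):

```python
import math

def capacity(B, S, eta):
    """C = B log2(1 + S / (eta * B)), with noise power N = eta * B."""
    return B * math.log2(1 + S / (eta * B))

S, eta = 1.0, 1.0
limit = (S / eta) * math.log2(math.e)   # C_inf = 1.44 S/eta

# Capacity grows with B but saturates at the limit
for B in (1, 10, 100, 10000):
    print(B, capacity(B, S, eta))

print(abs(capacity(1e9, S, eta) - limit) < 1e-3)   # True: approaches 1.44 S/eta
```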
ORTHOGONAL SET OF FUNCTIONS
Consider a set of functions g_1(x), g_2(x), ..., g_n(x) defined over the interval x_1 \le x \le x_2 and related to one another as

\int_{x_1}^{x_2} g_i(x)\, g_j(x)\, dx = 0, \qquad i \ne j

If we multiply two of the functions and integrate over the interval x_1 to x_2, the result is zero except when the functions are the same. A set of functions which has this property is described as being orthogonal over the interval from x_1 to x_2. The functions can be compared to vectors v_i and v_j whose dot product is given by |v_i||v_j| \cos\theta.
The vectors v_i and v_j are perpendicular when \theta = 90 degrees, i.e. v_i \cdot v_j = 0; the vectors are then said to be orthogonal. Correspondingly, functions whose integrated product is zero are also orthogonal to one another. Suppose we have an arbitrary function f(x) and we are interested in f(x) only in the range from x_1 to x_2, i.e. over the interval in which the set of functions g_n(x) is orthogonal. We can expand f(x) as a linear sum of the functions g_n(x):

f(x) = c_1 g_1(x) + c_2 g_2(x) + \cdots + c_n g_n(x) + \cdots ...............(2)

where the c's are numerical constants. The orthogonality of the g's makes it easy to compute the coefficients c_n. To evaluate c_n we multiply both sides of eq (2) by g_n(x) and integrate over the interval of orthogonality.
\int_{x_1}^{x_2} f(x)\, g_n(x)\, dx = c_1 \int_{x_1}^{x_2} g_1(x)\, g_n(x)\, dx + c_2 \int_{x_1}^{x_2} g_2(x)\, g_n(x)\, dx + \cdots + c_n \int_{x_1}^{x_2} g_n(x)\, g_n(x)\, dx + \cdots

Because of orthogonality, all of the terms on the right-hand side become zero with a single exception:

\int_{x_1}^{x_2} f(x)\, g_n(x)\, dx = c_n \int_{x_1}^{x_2} g_n^2(x)\, dx

c_n = \frac{\int_{x_1}^{x_2} f(x)\, g_n(x)\, dx}{\int_{x_1}^{x_2} g_n^2(x)\, dx}
If the functions are selected such that

\int_{x_1}^{x_2} g_n^2(x)\, dx = 1

they are said to be normalised, and we have

c_n = \int_{x_1}^{x_2} f(x)\, g_n(x)\, dx ..........(3)

The use of normalised functions has the advantage that the c_n's can be calculated from eq (3) without having to evaluate the integral \int_{x_1}^{x_2} g_n^2(x)\, dx. A set of functions which are both orthogonal and normalised is called an orthonormal set.
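Equation (3) can be demonstrated with a concrete orthonormal set. The sketch below uses g_n(x) = sqrt(2) sin(n pi x) on (0, 1) (a standard orthonormal family, chosen here for illustration; the notes do not specify one) and recovers the expansion coefficients of a known combination by numerical integration:

```python
import math

def integrate(f, a, b, steps=20000):
    """Simple midpoint-rule numerical integration (illustrative accuracy)."""
    h = (b - a) / steps
    return sum(f(a + (k + 0.5) * h) for k in range(steps)) * h

# Orthonormal set on (0, 1): g_n(x) = sqrt(2) sin(n pi x)
def g(n):
    return lambda x: math.sqrt(2) * math.sin(n * math.pi * x)

# Build f(x) = 3 g_1(x) - 0.5 g_2(x), then recover the coefficients via eq (3)
f = lambda x: 3 * g(1)(x) - 0.5 * g(2)(x)

c1 = integrate(lambda x: f(x) * g(1)(x), 0, 1)
c2 = integrate(lambda x: f(x) * g(2)(x), 0, 1)
c3 = integrate(lambda x: f(x) * g(3)(x), 0, 1)

print(c1, c2, c3)   # approximately 3, -0.5, 0
```

The projection onto g_3 comes out as (numerically) zero because f contains no g_3 component, exactly the mechanism used by the correlator bank in the next section.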
MATCHED FILTER RECEPTION OF M-ARY FSK
Let a message source generate M messages, each with equal likelihood. Let each message be represented by one of the orthogonal set of signals s_1(t), s_2(t), ..., s_M(t); the message interval is T. The signals are transmitted over a communication channel where they are corrupted by additive white Gaussian noise (AWGN). At the receiver, a determination of which message has been transmitted is made through the use of M matched filters or correlators. Each correlator consists of a multiplier followed by an integrator; the local inputs to the multipliers are the signals s_i(t). Suppose that in the absence of noise the signal s_i(t) is transmitted and the output of each integrator is sampled at the end of a message interval.
[Figure: matched filter receiver for M-ary FSK — a source of M messages feeds the AWGN channel; the received wave drives a bank of M correlators, each multiplying by s_i(t) and integrating over (0, T) to produce the outputs e_1, e_2, ..., e_M.]
Then, because of the orthogonality condition, all the integrators will have zero output except the ith integrator, whose output will be

\int_0^T s_i^2(t)\, dt

It is adjusted to produce an output of E_s, the symbol energy. In the presence of an AWGN waveform n(t), the output of the lth correlator will be

e_l = \int_0^T [s_i(t) + n(t)]\, s_l(t)\, dt = \int_0^T s_i(t)\, s_l(t)\, dt + \int_0^T n(t)\, s_l(t)\, dt

The quantity

n_l \equiv \int_0^T n(t)\, s_l(t)\, dt

is a Gaussian random variable which has a mean value of zero and a mean square value given by \sigma^2 = \eta E_s / 2. The correlator corresponding to the transmitted message will have output

e_i = \int_0^T [s_i(t) + n(t)]\, s_i(t)\, dt

To determine which message has been transmitted we compare the matched filter outputs e_1, e_2, ..., e_M.
e_i = \int_0^T [s_i(t) + n(t)]\, s_i(t)\, dt = \int_0^T s_i^2(t)\, dt + \int_0^T n(t)\, s_i(t)\, dt = E_s + n_i

We decide that s_i(t) has been transmitted if the corresponding output e_i is larger than the output of every other filter. The probability that some arbitrarily selected output e_l is less than e_i is

p(e_l < e_i) = \frac{1}{\sqrt{2\pi}\,\sigma} \int_{-\infty}^{e_i} e^{-e_l^2 / 2\sigma^2}\, de_l ................(1)
The probability that e_1 and e_2 are both smaller than e_i is

p(e_1 < e_i \text{ and } e_2 < e_i) = p(e_1 < e_i)\, p(e_2 < e_i) = [p(e_l < e_i)]^2

The probability p_L that e_i is the largest of the outputs is

p_L = p(e_i > e_1, e_2, e_3, \ldots, e_M) = [p(e_l < e_i)]^{M-1} = \left[ \frac{1}{\sqrt{2\pi}\,\sigma} \int_{-\infty}^{e_i} e^{-e_l^2 / 2\sigma^2}\, de_l \right]^{M-1}
Substituting e_i = E_s + n_i,

p_L = \left[ \frac{1}{\sqrt{2\pi}\,\sigma} \int_{-\infty}^{E_s + n_i} e^{-e_l^2 / 2\sigma^2}\, de_l \right]^{M-1}

Let x = \frac{e_l}{\sqrt{2}\,\sigma}, so that de_l = \sqrt{2}\,\sigma\, dx; when e_l = -\infty, x = -\infty, and when e_l = E_s + n_i, x = \frac{E_s + n_i}{\sqrt{2}\,\sigma}. Then

p_L = \left[ \frac{1}{\sqrt{\pi}} \int_{-\infty}^{(E_s + n_i)/\sqrt{2}\sigma} e^{-x^2}\, dx \right]^{M-1}
Since \sigma^2 = \eta E_s / 2, we have \sqrt{2}\,\sigma = \sqrt{\eta E_s} and E_s / (\sqrt{2}\,\sigma) = \sqrt{E_s / \eta}, so

p_L = \left[ \frac{1}{\sqrt{\pi}} \int_{-\infty}^{\sqrt{E_s/\eta}\, +\, n_i/\sqrt{2}\sigma} e^{-x^2}\, dx \right]^{M-1} ..........(4)

that is,

p_L = p_L\!\left( \frac{E_s}{\eta},\, M,\, \frac{n_i}{\sqrt{2}\,\sigma} \right)

p_L depends on the two deterministic parameters E_s/\eta and M, and on the single random variable n_i / (\sqrt{2}\,\sigma). To find the probability that e_i is the largest output without reference to the noise output n_i of the ith correlator, we need to average p_L over all possible values of n_i.
This average is the probability that we shall be correct in deciding that the transmitted signal corresponds to the correlator which yields the largest output. Let this probability be p_C; the probability of an error is then p_E = 1 - p_C. Since n_i is a Gaussian random variable with zero mean and variance \sigma^2, the average value of p_L over all possible values of n_i is given by
p_C = \int_{-\infty}^{\infty} p_L\!\left( \frac{E_s}{\eta},\, M,\, \frac{n_i}{\sqrt{2}\,\sigma} \right) \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-n_i^2 / 2\sigma^2}\, dn_i

Putting y = \frac{n_i}{\sqrt{2}\,\sigma} and using eq (4),

p_C = \frac{1}{\sqrt{\pi}} \int_{-\infty}^{\infty} e^{-y^2} \left( \frac{1}{\sqrt{\pi}} \int_{-\infty}^{\sqrt{E_s/\eta}\, +\, y} e^{-x^2}\, dx \right)^{M-1} dy

and p_e = 1 - p_C.
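The double integral for p_C can be evaluated numerically. In the sketch below the inner integral is written via the error function, (1/sqrt(pi)) * integral of e^{-x^2} up to u equals (1 + erf(u))/2, and the outer integral is approximated by a midpoint rule over a finite span (the step count and span are illustrative accuracy choices):

```python
import math

def p_correct(Es_over_eta, M, steps=4000, span=8.0):
    """Numerically evaluate p_C = (1/sqrt(pi)) * integral of
    e^{-y^2} * Q(y)^(M-1) dy, where Q(y) = (1 + erf(sqrt(Es/eta) + y)) / 2."""
    a = math.sqrt(Es_over_eta)
    h = 2 * span / steps
    total = 0.0
    for k in range(steps):
        y = -span + (k + 0.5) * h          # midpoint of the k-th subinterval
        inner = (1 + math.erf(a + y)) / 2  # inner integral in closed form
        total += math.exp(-y * y) * inner ** (M - 1)
    return total / math.sqrt(math.pi) * h

# Sanity checks: with M = 2 and zero signal energy the receiver is guessing
# between two equally likely messages, so p_e = 1 - p_C = 0.5; raising the
# symbol energy drives the error probability down.
print(1 - p_correct(0.0, 2))     # ~0.5
print(1 - p_correct(4.0, 2))     # small: more energy, fewer errors
```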
EFFICIENCY OF ORTHOGONAL SIGNAL TRANSMISSION
The above equation has been evaluated by numerical integration on a computer.

[Figure: p_e plotted against S_i/(R\eta) for M = 2, 4, 16, 1024, 2048 and the limiting curve M \to \infty, with p_e on a logarithmic axis from 1 down to 10^{-5} and the abscissa running from about 0.7 to 20; the M \to \infty curve falls abruptly at S_i/(R\eta) = \ln 2 \approx 0.69. Here R = (\log_2 M)/T.]
p_e = f\!\left( M,\, \frac{E_s}{\eta} \right)

The abscissa of the plot is

\frac{E_s}{\eta \log_2 M} = \frac{S_i T}{\eta \log_2 M} = \frac{S_i}{R\eta}

where we have put E_s = S_i T (with S_i the signal power) and R = \frac{\log_2 M}{T}.
Efficiency of Orthogonal Signal Transmission :Observations About the Graph
For all M, p_e decreases as S_i/(R\eta) increases; as S_i/(R\eta) \to \infty, p_e \to 0.
For M \to \infty, p_e = 0 provided S_i/(R\eta) \ge \ln 2, and p_e = 1 otherwise.
For fixed M and R, p_e decreases as the noise density \eta decreases.
For fixed M and \eta, p_e decreases as the signal power goes up.
For fixed S_i, \eta and M, p_e decreases as we allow more time T for the transmission of a single message, i.e. as the rate R is decreased.
For fixed S_i, \eta and T, p_e decreases as M decreases.
For a fixed value of S_i/(R\eta), as M, the number of messages, increases, the error probability reduces.
As M \to \infty, the error probability p_e \to 0 provided S_i/(R\eta) \ge \ln 2. At the same time the bandwidth B = 2Mf_s \to \infty. The maximum allowable errorless transmission rate R_{max} is the channel capacity:

R_{max} = \frac{S_i}{\eta \ln 2} = 1.44\, \frac{S_i}{\eta}

The maximum rate obtained for this M-ary FSK is the same as that obtained by Shannon's theorem.
As M increases, the bandwidth increases and the number of matched filters increases, and so does the circuit complexity. Errorless transmission is really possible, as predicted by Shannon's theorem, provided R \le R_{max} = 1.44\, S_i/\eta as M \to \infty. If we are required to transmit information at the same fixed rate R in the presence of a fixed noise power spectral density, with fixed error probability and fixed M, we have to control the signal power S_i.