Discrete Memoryless Channel and its Capacity
Information Theory and Coding
by
Purnachand Simhadri, Asst. Professor
Electronics and Communication Engineering Department
K L University
Outline
1 Discrete Memoryless Channel
  Probability Model
  Binary Channel
2 Mutual Information
  Joint Entropy
  Conditional Entropy
  Definition
3 Capacity of DMC
  Transmission Rate
  Definition
Discrete Memoryless Channel: Properties
The input of a DMC is a symbol belonging to an alphabet of M symbols, transmitted with probability p_{ti} (i = 1, 2, 3, ..., M).
The output of a DMC is a symbol belonging to the same alphabet of M symbols, received with probability p_{rj} (j = 1, 2, 3, ..., M).
Due to errors caused by noise in the channel, the output may differ from the input during a symbol interval.
Discrete Memoryless Channel: Properties (contd.)
In an ideal channel, the output is equal to the input.
In a non-ideal channel, the output can differ from the input with a given transition probability p_{ij} = P(Y = y_j / X = x_i), (i, j = 1, 2, 3, ..., M).
In a DMC, the output of the channel depends only on the input at the same instant, not on inputs before or after.
Discrete Memoryless Channel: Probability Model
All the transition probabilities from x_i to y_j are gathered in a transition matrix (also called the channel matrix) to model the DMC.

p_{ti} = P(X = x_i),  p_{rj} = P(Y = y_j),  p_{ij} = P(Y = y_j / X = x_i)

and P(x_i, y_j) = P(y_j / x_i) P(X = x_i) = p_{ij} p_{ti}

⇒ p_{rj} = \sum_{i=1}^{M} p_{ti} p_{ij}    (1)
Discrete Memoryless Channel: Probability Model (contd.)
Equation (1) can be written in matrix form (the transpose is needed because p_{rj} sums over the input index i, while each row of the channel matrix is indexed by the input):

\begin{bmatrix} p_{r1} \\ p_{r2} \\ \vdots \\ p_{rM} \end{bmatrix}
= \underbrace{\begin{bmatrix}
p_{11} & p_{12} & \cdots & p_{1M} \\
p_{21} & p_{22} & \cdots & p_{2M} \\
\vdots & \vdots & \ddots & \vdots \\
p_{M1} & p_{M2} & \cdots & p_{MM}
\end{bmatrix}^{T}}_{\text{Channel matrix } P_{Y/X} \text{, transposed}}
\begin{bmatrix} p_{t1} \\ p_{t2} \\ \vdots \\ p_{tM} \end{bmatrix}    (2)

Equation (2) can be compactly written as

P_Y^r = P_{Y/X}^T P_X^t    (3)

Note that

\sum_{j=1}^{M} p_{ij} = 1   and   p_e = \sum_{i=1}^{M} p_{ti} \sum_{j=1, j \neq i}^{M} p_{ij}    (4)
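To make the model concrete, here is a minimal Python sketch (mine, not from the slides; the channel matrix and input distribution are made-up example values) that evaluates equations (1), (3), and (4):

    import numpy as np

    # Channel matrix: row i holds p_ij = P(Y = yj / X = xi), so each row sums to 1.
    P = np.array([[0.8, 0.1, 0.1],
                  [0.1, 0.8, 0.1],
                  [0.1, 0.1, 0.8]])            # made-up M = 3 example
    p_t = np.array([0.5, 0.3, 0.2])            # made-up input distribution p_ti

    # Equations (1)/(3): output distribution p_rj = sum_i p_ti p_ij, i.e. P^T p_t
    p_r = P.T @ p_t
    print(p_r, p_r.sum())                      # [0.45 0.31 0.24], sums to 1

    # Equation (4): symbol error probability (sum over j != i of p_ij is 1 - p_ii)
    p_e = sum(p_t[i] * (1.0 - P[i, i]) for i in range(len(p_t)))
    print(p_e)                                 # 0.2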
Discrete Memoryless Channel: Binary Channel
Channels designed to transmit and receive one of M symbols are called discrete M-ary channels (M > 2). If M = 2, the channel is called a binary channel. In the binary case we can statistically model the channel as below.

[Figure: binary channel diagram. Inputs 0 and 1 (probabilities p_{t0}, p_{t1}) map to outputs 0 and 1 (probabilities p_{r0}, p_{r1}) with transition probabilities p_{00}, p_{01}, p_{10}, p_{11}.]

P(Y = j / X = i) = p_{ij}
p_{00} + p_{01} = 1
p_{10} + p_{11} = 1
P(X = 0) = p_{t0},  P(X = 1) = p_{t1}
P(Y = 0) = p_{r0},  P(Y = 1) = p_{r1}
Discrete Memoryless Channel: Binary Channel (contd.)
For a binary channel,

p_{r0} = p_{t0} p_{00} + p_{t1} p_{10}
p_{r1} = p_{t0} p_{01} + p_{t1} p_{11}
and P_e = p_{t0} p_{01} + p_{t1} p_{10}

Binary Symmetric Channel
A binary channel is said to be a binary symmetric channel if p_{00} = p_{11} (⇒ p_{01} = p_{10}).
Let p_{00} = p_{11} = p ⇒ p_{01} = p_{10} = 1 - p.
Then, for a binary symmetric channel,

P_e = p_{t0} p_{01} + p_{t1} p_{10} = p_{t0}(1 - p) + p_{t1}(1 - p) = 1 - p
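A small numeric check (assumed values p = 0.9 and p_{t0} = 0.3, not from the slides) of the binary-channel relations above:

    p = 0.9                           # assumed BSC parameter: P00 = P11 = p
    p00, p01, p10, p11 = p, 1 - p, 1 - p, p
    pt0, pt1 = 0.3, 0.7               # assumed input distribution, pt0 + pt1 = 1

    pr0 = pt0 * p00 + pt1 * p10       # 0.3*0.9 + 0.7*0.1 = 0.34
    pr1 = pt0 * p01 + pt1 * p11       # 0.3*0.1 + 0.7*0.9 = 0.66
    Pe  = pt0 * p01 + pt1 * p10       # 0.3*0.1 + 0.7*0.1 = 0.1 = 1 - p
    print(pr0, pr1, Pe)               # Pe = 1 - p for any input distribution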
Mutual Information: Joint Entropy
In a DMC there are two statistical processes at work: the input to the channel and the noise, which in turn affects the channel output. So it is worthwhile to consider the joint and conditional distributions of input and output.
Thus there are a number of entropies, or information contents, to be considered when studying the characteristics of a discrete memoryless channel. First, the entropy of the input is

H(X) = -\sum_{i=1}^{M} p_{ti} \log_2(p_{ti})  bits/symbol

The entropy of the output is

H(Y) = -\sum_{j=1}^{M} p_{rj} \log_2(p_{rj})  bits/symbol
Mutual Information: Joint Entropy (contd.)
The joint distribution of input and output can be obtained from the transition probabilities and the input distribution as

P(x_i, y_j) = P(y_j / x_i) P(X = x_i) = p_{ij} p_{ti}

Joint Entropy
Joint entropy H(X, Y) is defined as

H(X, Y) = -\sum_{x_i \in X} \sum_{y_j \in Y} P(x_i, y_j) \log_2 P(x_i, y_j)
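The definition translates directly into code. A short sketch (assuming the joint distribution is stored as a matrix Pxy with rows indexed by x_i and columns by y_j), where the mask implements the usual convention 0 log2 0 = 0:

    import numpy as np

    def joint_entropy(Pxy):
        """H(X,Y) = -sum_ij P(xi,yj) log2 P(xi,yj); 0*log2(0) is taken as 0."""
        P = np.asarray(Pxy, dtype=float)
        nz = P > 0                          # skip zero-probability cells
        return -np.sum(P[nz] * np.log2(P[nz]))

    # Joint distribution built as P(xi,yj) = p_ti * p_ij (made-up example values)
    p_t = np.array([0.5, 0.5])              # input distribution
    P_ch = np.array([[0.9, 0.1],            # channel matrix, rows = input symbol
                     [0.1, 0.9]])
    Pxy = p_t[:, None] * P_ch               # joint matrix, rows x_i, columns y_j
    print(joint_entropy(Pxy))               # about 1.469 bits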
Mutual Information: Joint Entropy Properties
The joint entropy of a set of variables is greater than or equal to all of the individual entropies of the variables in the set:

H(X, Y) ≥ max(H(X), H(Y))

The joint entropy of a set of variables is less than or equal to the sum of the individual entropies of the variables in the set:

H(X, Y) ≤ H(X) + H(Y)

This inequality is an equality if and only if X and Y are statistically independent.
Mutual Information: Conditional Entropy
Let the conditional distribution of X, given that the channel output Y = y_j, be P(X / Y = y_j). Then the average uncertainty about X given that Y = y_j is

H(X / Y = y_j) = -\sum_{x_i \in X} P(X = x_i / Y = y_j) \log_2 P(X = x_i / Y = y_j)

The conditional entropy of X conditioned on Y is the expected value of the entropy of the distribution P(X / Y = y_j):

⇒ H(X/Y) = E[H(X / Y = y_j)]
= \sum_{y_j \in Y} P(Y = y_j) H(X / Y = y_j)
= \sum_{y_j \in Y} P(y_j) [ -\sum_{x_i \in X} P(x_i / y_j) \log_2 P(x_i / y_j) ]
= -\sum_{x_i \in X} \sum_{y_j \in Y} P(x_i / y_j) P(y_j) \log_2 P(x_i / y_j)
= -\sum_{x_i \in X} \sum_{y_j \in Y} P(x_i, y_j) \log_2 P(x_i / y_j)
Mutual Information: Conditional Entropy Definition
Conditional entropy H(X/Y) is defined as

H(X/Y) = -\sum_{x_i \in X} \sum_{y_j \in Y} P(x_i, y_j) \log_2 P(x_i / y_j)

Similarly, conditional entropy H(Y/X) is defined as

H(Y/X) = -\sum_{x_i \in X} \sum_{y_j \in Y} P(x_i, y_j) \log_2 P(y_j / x_i)

Conditional entropy is also called equivocation.
H(X/Y) gives the amount of uncertainty remaining about the channel input X after the channel output Y has been observed.
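A sketch of the defining double sum (same assumed Pxy layout as before; zero-probability terms are skipped, matching the 0 log2 0 = 0 convention):

    import numpy as np

    def conditional_entropy(Pxy):
        """H(X/Y) = -sum_ij P(xi,yj) log2 P(xi/yj), with P(xi/yj) = P(xi,yj)/P(yj)."""
        P = np.asarray(Pxy, dtype=float)
        Py = P.sum(axis=0)                       # marginal P(yj): sum over inputs
        H = 0.0
        for i in range(P.shape[0]):
            for j in range(P.shape[1]):
                if P[i, j] > 0:                  # 0*log2(0) taken as 0
                    H -= P[i, j] * np.log2(P[i, j] / Py[j])
        return H

    Pxy = np.array([[0.45, 0.05],
                    [0.05, 0.45]])               # example joint distribution
    print(conditional_entropy(Pxy))              # equivocation H(X/Y), about 0.469 bits
    print(conditional_entropy(Pxy.T))            # H(Y/X): same sum with roles swapped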
Mutual Information: Conditional Entropy (contd.)
There is less information in the conditional entropy H(X/Y) than in the entropy H(X):

⇒ H(X/Y) - H(X) ≤ 0

Proof:

H(X/Y) - H(X) = -\sum_{x_i \in X} \sum_{y_j \in Y} P(x_i, y_j) \log_2 P(x_i / y_j) + \sum_{x_i \in X} P(x_i) \log_2 P(x_i)
= -\sum_{x_i \in X} \sum_{y_j \in Y} P(x_i, y_j) \log_2 P(x_i / y_j) + \sum_{x_i \in X} ( \sum_{y_j \in Y} P(x_i, y_j) ) \log_2 P(x_i)
= \sum_{x_i \in X} \sum_{y_j \in Y} P(x_i, y_j) \log_2 [ P(x_i) / P(x_i / y_j) ]
Using the inequality \ln a ≤ a - 1 (with \log_2 this introduces only a positive factor \log_2 e, which does not affect the sign), it follows that:

H(X/Y) - H(X) ≤ \sum_{x_i \in X} \sum_{y_j \in Y} P(x_i, y_j) ( P(x_i) / P(x_i / y_j) - 1 )
= \sum_{x_i \in X} \sum_{y_j \in Y} [ P(x_i, y_j) / P(x_i / y_j) ] P(x_i) - \sum_{x_i \in X} \sum_{y_j \in Y} P(x_i, y_j)
= \sum_{x_i \in X} P(x_i) \sum_{y_j \in Y} P(y_j) - 1      [since P(x_i, y_j) / P(x_i / y_j) = P(y_j)]
= 1 - 1 = 0

⇒ H(X/Y) ≤ H(X), and H(Y/X) ≤ H(Y)
Mutual Information: Conditional Entropy, Relation with Joint Entropy
Conditional entropy H(X/Y) is given by

H(X/Y) = -\sum_{x_i \in X} \sum_{y_j \in Y} P(x_i, y_j) \log_2 P(x_i / y_j)
= -\sum_{x_i \in X} \sum_{y_j \in Y} P(x_i, y_j) \log_2 [ P(x_i, y_j) / P(y_j) ]
= -\sum_{x_i \in X} \sum_{y_j \in Y} P(x_i, y_j) \log_2 P(x_i, y_j) + \sum_{y_j \in Y} ( \sum_{x_i \in X} P(x_i, y_j) ) \log_2 P(y_j)
= -\sum_{x_i \in X} \sum_{y_j \in Y} P(x_i, y_j) \log_2 P(x_i, y_j) + \sum_{y_j \in Y} P(y_j) \log_2 P(y_j)
= H(X, Y) - H(Y)  ⇒  H(X, Y) = H(X/Y) + H(Y)

Similarly, H(X, Y) = H(Y/X) + H(X)
Mutual Information: Definition
Definition
Mutual information I(X, Y) of X and Y is defined as

I(X, Y) = H(X) - H(X/Y)

I(X, Y) gives the uncertainty of the input X resolved by observing the output Y. In other words, it is the portion of the information of X that depends on Y.

Properties
Symmetric: I(X, Y) = I(Y, X)
I(X, Y) = H(X) - H(X/Y) = H(Y) - H(Y/X) = H(X) + H(Y) - H(X, Y)
Nonnegative: I(X, Y) ≥ 0
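A sketch computing I(X, Y) through the identity I(X, Y) = H(X) + H(Y) - H(X, Y) (same assumed Pxy layout); the symmetry property then holds by construction:

    import numpy as np

    def entropy(p):
        """H = -sum p log2 p over the nonzero entries."""
        p = np.asarray(p, dtype=float)
        nz = p > 0
        return -np.sum(p[nz] * np.log2(p[nz]))

    def mutual_information(Pxy):
        """I(X,Y) = H(X) + H(Y) - H(X,Y); symmetric in X and Y by construction."""
        P = np.asarray(Pxy, dtype=float)
        return entropy(P.sum(axis=1)) + entropy(P.sum(axis=0)) - entropy(P.ravel())

    Pxy = np.array([[0.45, 0.05],
                    [0.05, 0.45]])              # example joint distribution
    print(mutual_information(Pxy))              # about 0.531 bits
    print(mutual_information(Pxy.T))            # same value: I(X,Y) = I(Y,X)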
Mutual Information: Definition (contd.)
Proof: Property 1

I(X, Y) = H(X) - H(X/Y)
= -\sum_{x_i \in X} P(x_i) \log_2 P(x_i) + \sum_{x_i \in X} \sum_{y_j \in Y} P(x_i, y_j) \log_2 P(x_i / y_j)
= -\sum_{x_i \in X} \sum_{y_j \in Y} P(x_i, y_j) \log_2 P(x_i) + \sum_{x_i \in X} \sum_{y_j \in Y} P(x_i, y_j) \log_2 P(x_i / y_j)
= \sum_{x_i \in X} \sum_{y_j \in Y} P(x_i, y_j) \log_2 [ P(x_i / y_j) / P(x_i) ]
= \sum_{x_i \in X} \sum_{y_j \in Y} P(x_i, y_j) \log_2 [ P(x_i, y_j) / ( P(x_i) P(y_j) ) ]
= I(Y, X)

The last (boxed) expression is the Kullback-Leibler divergence between the two probability distributions P(x_i, y_j) and P(x_i) P(y_j).
Mutual Information: Kullback-Leibler Divergence
In probability theory and information theory, the Kullback-Leibler divergence (also called information divergence, information gain, or relative entropy) is a non-symmetric measure of the difference between two probability distributions P and Q:

D_{KL}(P \| Q) = \sum_i P(i) \log_2 [ P(i) / Q(i) ]

The KL divergence measures the expected number of extra bits required to code samples from P when using a code based on Q, rather than a code based on P.
Thus, mutual information gives the number of bits gained by accounting for the dependence between X and Y rather than treating X and Y as independent.
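A sketch checking the boxed identity from the previous slide numerically: I(X, Y) equals the KL divergence between the joint distribution and the product of its marginals (same assumed example values):

    import numpy as np

    def kl_divergence(P, Q):
        """D_KL(P||Q) = sum_i P(i) log2(P(i)/Q(i)); terms with P(i) = 0 contribute 0."""
        P = np.asarray(P, dtype=float).ravel()
        Q = np.asarray(Q, dtype=float).ravel()
        nz = P > 0
        return np.sum(P[nz] * np.log2(P[nz] / Q[nz]))

    Pxy = np.array([[0.45, 0.05],
                    [0.05, 0.45]])              # example joint distribution
    Px = Pxy.sum(axis=1)                        # marginal of X
    Py = Pxy.sum(axis=0)                        # marginal of Y
    print(kl_divergence(Pxy, np.outer(Px, Py))) # about 0.531 bits = I(X,Y)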
Mutual Information: Definition (contd.)
Proof: Property 2
It is known that H(X) ≥ H(X/Y)

⇒ H(X) - H(X/Y) ≥ 0

If X and Y are statistically independent, then H(X/Y) = H(X) ⇒ I(X, Y) = 0. Therefore,

I(X, Y) = H(X) - H(X/Y) ≥ 0

with equality when X and Y are statistically independent.
Mutual Information: Four Cases
Case 1: X and Y are statistically independent.

[Figure: Venn diagram with disjoint circles H(X) and H(Y)]

H(X, Y) = H(X) + H(Y)
I(X, Y) = 0
Case 2: Y is completely dependent on X.
Case 3: X is completely dependent on Y.

[Figure: nested Venn diagrams; in Case 2 the circle H(Y) lies inside H(X), in Case 3 the circle H(X) lies inside H(Y)]

Case 2: I(X, Y) = H(Y),  H(X, Y) = H(X)
Case 3: I(X, Y) = H(X),  H(X, Y) = H(Y)
Case 4: X and Y are neither statistically independent nor is one completely dependent on the other.

[Figure: overlapping Venn circles H(X) and H(Y); the overlap is I(X, Y), the remainder of H(X) is H(X/Y), and the remainder of H(Y) is H(Y/X)]

H(X, Y) = H(X) + H(Y/X) = H(Y) + H(X/Y)
I(X, Y) = H(X) - H(X/Y) = H(Y) - H(Y/X)
Capacity of DMC: Transmission Rate
H(X) is the amount of uncertainty about X; in other words, the information gained about X once we are told X.
H(X/Y) is the remaining amount of uncertainty about X when Y is observed; in other words, the amount of information still required to resolve X after Y is known.
I(X, Y) is the amount of uncertainty about X resolved by observing the output Y.
So, the amount of information that can be transmitted over a channel is nothing but the amount of uncertainty resolved by observing the channel output.
Capacity of DMC: Transmission Rate (contd.)
Thus it is possible to transmit I(X, Y) bits of information per channel use, approximately, without any uncertainty about the input at the output of the channel:

⇒ I_t = I(X, Y) = H(X) - H(X/Y)  bits/channel use

If the symbol rate of a source is R_s, then the rate of information that can be transmitted over a channel, such that the input can be resolved approximately without errors, is

D_t = [H(X) - H(X/Y)] R_s  bits/sec
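A quick numeric illustration (assumed values, not from the slides: a binary symmetric channel with p = 0.9, equally likely inputs, and a symbol rate R_s = 1000 symbols/sec):

    import numpy as np

    p = 0.9                                 # assumed BSC parameter
    Rs = 1000.0                             # assumed symbol rate, symbols/sec

    H_X = 1.0                               # equally likely binary input: H(X) = 1 bit
    # For equally likely inputs on a BSC, the equivocation H(X/Y) equals the
    # binary entropy of p:
    H_X_given_Y = -(p * np.log2(p) + (1 - p) * np.log2(1 - p))

    It = H_X - H_X_given_Y                  # about 0.531 bits/channel use
    Dt = It * Rs                            # about 531 bits/sec
    print(It, Dt)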
Capacity of DMC: Transmission Rate (contd.)
For an ideal channel, X = Y: there is no uncertainty about X when we observe Y.

⇒ H(X/Y) = 0
⇒ I(X, Y) = H(X) - H(X/Y) = H(X)

So all the information is transmitted on each channel use:

I_t = I(X, Y) = H(X)

If the channel is so noisy that X and Y are independent, the uncertainty about X remains the same irrespective of any observation of Y.

⇒ H(X/Y) = H(X)
⇒ I(X, Y) = H(X) - H(X/Y) = 0

i.e., no information passes through the channel:

I_t = I(X, Y) = 0
Capacity of DMC: Definition
The capacity of a DMC is the maximum rate of information transmission over the channel. The maximum rate of transmission occurs when the source is matched to the channel.

Definition
The capacity of a DMC is defined as the maximum rate of information transmission over the channel, where the maximum is taken over all possible input distributions P(X):

C = \max_{P(X)} I(X, Y) R_s  bits/sec
  = \max_{P(X)} [H(X) - H(X/Y)] R_s  bits/sec
  = \max_{P(X)} [H(Y) - H(Y/X)] R_s  bits/sec
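The maximization over P(X) rarely has a closed form, so here is a brute-force sketch (mine, with a made-up and deliberately non-symmetric binary channel) that sweeps input distributions and keeps the largest I(X, Y); the result is the capacity in bits per channel use (multiply by R_s for bits/sec):

    import numpy as np

    def entropy(p):
        p = np.asarray(p, dtype=float)
        nz = p > 0                               # 0*log2(0) taken as 0
        return -np.sum(p[nz] * np.log2(p[nz]))

    def mutual_information(p_t, P_ch):
        """I(X,Y) for input distribution p_t and channel matrix P_ch (rows = inputs)."""
        Pxy = p_t[:, None] * P_ch                # joint P(xi,yj) = p_ti * p_ij
        return (entropy(Pxy.sum(axis=1)) + entropy(Pxy.sum(axis=0))
                - entropy(Pxy.ravel()))

    P_ch = np.array([[0.9, 0.1],                 # made-up binary channel
                     [0.2, 0.8]])

    grid = np.linspace(0.0, 1.0, 1001)           # candidate values of P(x0)
    rates = [mutual_information(np.array([a, 1 - a]), P_ch) for a in grid]
    best = int(np.argmax(rates))
    print("C = %.4f bits/channel use at P(x0) = %.3f" % (rates[best], grid[best]))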
Capacity of DMC: Noiseless Binary Channel
Consider a noiseless binary channel as shown below.

[Figure: binary channel diagram with P_00 = 1, P_11 = 1, P_01 = 0, P_10 = 0; input probabilities P(x_0), P(x_1); output probabilities P(y_0), P(y_1)]

P(x_0, y_0) = P(x_0) P_00 = P(x_0)
P(x_1, y_1) = P(x_1) P_11 = P(x_1)
P(x_0, y_1) = P(x_0) P_01 = 0
P(x_1, y_0) = P(x_1) P_10 = 0

P(y_0) = P(x_0) P_00 + P(x_1) P_10 = P(x_0)
P(y_1) = P(x_0) P_01 + P(x_1) P_11 = P(x_1)

P(x_0 / y_0) = P(x_0, y_0) / P(y_0) = P(x_0) / P(x_0) = 1
P(x_0 / y_1) = P(x_0, y_1) / P(y_1) = 0 / P(x_1) = 0
P(x_1 / y_0) = P(x_1, y_0) / P(y_0) = 0 / P(x_0) = 0
P(x_1 / y_1) = P(x_1, y_1) / P(y_1) = P(x_1) / P(x_1) = 1
Capacity of DMC: Noiseless Binary Channel (contd.)

H(X/Y) = -\sum_{i=0}^{1} \sum_{j=0}^{1} P(x_i, y_j) \log_2 P(x_i / y_j)
= -[ P(x_0, y_0) \log_2 P(x_0 / y_0) + P(x_0, y_1) \log_2 P(x_0 / y_1) + P(x_1, y_0) \log_2 P(x_1 / y_0) + P(x_1, y_1) \log_2 P(x_1 / y_1) ]
= 0    (every term contains either \log_2 1 = 0 or a zero probability, with 0 \log_2 0 = 0)

⇒ I(X, Y) = H(X) - H(X/Y) = H(X)

Therefore, the capacity of the noiseless binary channel is

C = \max_{P(X)} I(X, Y) = \max_{P(X)} H(X) = 1 bit/channel use

i.e., over a noiseless binary channel at most one bit of information can be sent per channel use, which is the maximum information content of a binary source.
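As a consistency check (a small self-contained sketch under the same assumptions as before), sweeping input distributions for the identity channel matrix reproduces C = 1 bit/channel use, attained at P(x_0) = 1/2:

    import numpy as np

    def entropy(p):
        p = np.asarray(p, dtype=float)
        nz = p > 0                              # 0*log2(0) taken as 0
        return -np.sum(p[nz] * np.log2(p[nz]))

    P_ch = np.eye(2)                            # noiseless: P00 = P11 = 1
    best = 0.0
    for a in np.linspace(0, 1, 1001):           # sweep input distribution P(x0)
        Pxy = np.array([a, 1 - a])[:, None] * P_ch
        best = max(best, entropy(Pxy.sum(axis=1)) + entropy(Pxy.sum(axis=0))
                         - entropy(Pxy.ravel()))
    print(best)                                 # 1.0 bit/channel use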
Capacity of DMC: Noisy Binary Symmetric Channel
Consider a noisy binary symmetric channel as shown below.

[Figure: binary channel diagram with P_00 = P_11 = p and P_01 = P_10 = 1 - p; input probabilities P(x_0), P(x_1); output probabilities P(y_0), P(y_1)]

P(x_0, y_0) = P(x_0) P_00 = P(x_0) p
P(x_1, y_1) = P(x_1) P_11 = P(x_1) p
P(x_0, y_1) = P(x_0) P_01 = P(x_0)(1 - p)
P(x_1, y_0) = P(x_1) P_10 = P(x_1)(1 - p)

P(y_0) = P(x_0) P_00 + P(x_1) P_10 = P(x_0) p + P(x_1)(1 - p)
P(y_1) = P(x_0) P_01 + P(x_1) P_11 = P(x_0)(1 - p) + P(x_1) p
Capacity of DMC: Noisy Binary Symmetric Channel (contd.)

H(Y/X) = -\sum_{i=0}^{1} \sum_{j=0}^{1} P(x_i, y_j) \log_2 P(y_j / x_i)
= -[ P(x_0, y_0) \log_2 P(y_0 / x_0) + P(x_0, y_1) \log_2 P(y_1 / x_0) + P(x_1, y_0) \log_2 P(y_0 / x_1) + P(x_1, y_1) \log_2 P(y_1 / x_1) ]
= -[ P(x_0) p \log_2 p + P(x_0)(1 - p) \log_2(1 - p) + P(x_1)(1 - p) \log_2(1 - p) + P(x_1) p \log_2 p ]
= -[ p \log_2 p + (1 - p) \log_2(1 - p) ] = H(p, 1 - p)

⇒ I(X, Y) = H(Y) - H(Y/X) = H(Y) - H(p, 1 - p)

Therefore, the capacity of the noisy binary symmetric channel is

C = \max_{P(X)} I(X, Y) = \max_{P(X)} [ H(Y) - H(p, 1 - p) ] = 1 - H(p, 1 - p)  bits/channel use
Capacity of DMC: Noisy Binary Symmetric Channel (contd.)
To achieve the capacity of 1 - H(p, 1 - p) over a noisy binary symmetric channel, the input distribution should make H(Y) = 1. H(Y) = 1 if P(y_0) = P(y_1) = 1/2:

⇒ P(x_0) p + P(x_1)(1 - p) = 1/2
and P(x_0)(1 - p) + P(x_1) p = 1/2

⇒ (1 - 2p)(P(x_1) - P(x_0)) = 0
⇒ P(x_1) = P(x_0) = 1/2

Thus over a binary symmetric channel, the maximum information rate is achieved when the source symbols are equally likely.
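A sketch (assumed p = 0.9) confirming both conclusions at once: sweeping P(x_0) shows I(X, Y) peaking at the uniform input, where it equals 1 - H(p, 1 - p), about 0.531 bits:

    import numpy as np

    def entropy(p):
        p = np.asarray(p, dtype=float)
        nz = p > 0
        return -np.sum(p[nz] * np.log2(p[nz]))

    p = 0.9                                     # assumed BSC parameter
    P_ch = np.array([[p, 1 - p],
                     [1 - p, p]])

    grid = np.linspace(0, 1, 1001)              # candidate values of P(x0)
    rates = []
    for a in grid:
        Pxy = np.array([a, 1 - a])[:, None] * P_ch    # joint P(xi,yj)
        rates.append(entropy(Pxy.sum(axis=1)) + entropy(Pxy.sum(axis=0))
                     - entropy(Pxy.ravel()))          # I(X,Y)
    best = int(np.argmax(rates))
    print(grid[best], rates[best])              # 0.5, about 0.531 = 1 - H(0.9, 0.1)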
Capacity of DMC: Noisy Binary Symmetric Channel (contd.)
[Figure: capacity of the binary symmetric channel, C = 1 - H(p, 1 - p), plotted against p; the capacity is 0 at p = 1/2 and reaches 1 bit at p = 0 and p = 1.]
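The original plot did not survive extraction; a small matplotlib sketch that regenerates the curve:

    import numpy as np
    import matplotlib.pyplot as plt

    p = np.linspace(0.001, 0.999, 500)         # avoid log2(0) at the endpoints
    H = -(p * np.log2(p) + (1 - p) * np.log2(1 - p))   # binary entropy H(p, 1-p)
    C = 1 - H                                  # BSC capacity, bits/channel use

    plt.plot(p, C)
    plt.xlabel("p")
    plt.ylabel("C = 1 - H(p, 1-p)  [bits/channel use]")
    plt.title("Capacity of the binary symmetric channel vs p")
    plt.grid(True)
    plt.show()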