
    Harvard SEAS ES250 Information Theory

    Homework 4

    Solutions

    1. Find the channel capacity of the following discrete memoryless channel:

[Figure: an adder channel in which the noise Z is added to the input X to give the output Y.]

where Pr{Z = 0} = Pr{Z = a} = 1/2. The alphabet for x is X = {0, 1}. Assume that Z is independent of X. Observe that the channel capacity depends on the value of a.

Solution:

Y = X + Z, where X ∈ {0, 1} and Z ∈ {0, a}.

We have to distinguish various cases depending on the value of a.

a = 0: In this case, Y = X, and max I(X; Y) = max H(X) = 1. Hence the capacity is 1 bit per transmission.

a ≠ 0, ±1: In this case, Y has four possible values: 0, 1, a, and 1 + a. Knowing Y, we know which X was sent, and hence H(X|Y) = 0. Hence max I(X; Y) = max H(X) = 1, achieved for a uniform distribution on the input X.

a = 1: In this case, Y has three possible output values: 0, 1, and 2. The channel is identical to a binary erasure channel with erasure probability 1/2. The capacity of this channel is 1/2 bit per transmission.

a = −1: This is similar to the case a = 1, and the capacity here is also 1/2 bit per transmission.
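These four cases can be checked numerically. The following Python sketch (illustrative, not part of the original solution) grid-searches the Bernoulli input parameter for each value of a and recovers the capacities 1, 1, 1/2, 1/2:

import numpy as np

def mutual_information(pi, a):
    # I(X;Y) in bits for Y = X + Z, with Pr(X = 1) = pi and Z uniform on {0, a}.
    p_y = {}
    p_y_given_x = {0: {}, 1: {}}
    for x, px in ((0, 1 - pi), (1, pi)):
        for z in (0.0, a):
            y = x + z
            p_y[y] = p_y.get(y, 0.0) + 0.5 * px
            p_y_given_x[x][y] = p_y_given_x[x].get(y, 0.0) + 0.5
    h_y = -sum(p * np.log2(p) for p in p_y.values() if p > 0)
    h_y_given_x = -sum(
        px * p * np.log2(p)
        for x, px in ((0, 1 - pi), (1, pi))
        for p in p_y_given_x[x].values()
        if p > 0
    )
    return h_y - h_y_given_x

for a in (0.0, 0.5, 1.0, -1.0):
    capacity = max(mutual_information(pi, a) for pi in np.linspace(0.01, 0.99, 99))
    print(f"a = {a:+.1f}: capacity ~ {capacity:.3f} bits")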

    2. Consider a 26-key typewriter.

    (a) If pushing a key results in printing the associated letter, what is the capacity C in bits?

(b) Now suppose that pushing a key results in printing that letter or the next one (with equal probability). Thus A → A or B, ..., Z → Z or A. What is the capacity?

(c) What is the highest-rate code with block length one that you can find that achieves zero probability of error for the channel in part (b)?

Solution:

(a) If the typewriter prints out whatever key is struck, then the output Y is the same as the input X, and

C = max I(X; Y) = max H(X) = log 26,

attained by a uniform distribution over the letters.


(b) In this case, the output is either equal to the input (with probability 1/2) or equal to the next letter (with probability 1/2). Hence H(Y|X) = log 2, independent of the distribution of X, and hence

C = max I(X; Y) = max H(Y) − log 2 = log 26 − log 2 = log 13,

attained for a uniform distribution over the output, which in turn is attained by a uniform distribution on the input.

(c) A simple zero-error code with block length one uses every alternate letter, say A, C, E, ..., W, Y. In this case, none of the codewords will be confused, since A will produce either A or B, C will produce C or D, etc. The rate of this code is

R = log(number of codewords) / (block length) = (log 13) / 1 = log 13.

    In this case, we can achieve capacity with a simple code with zero error.
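The zero-error property of this alternate-letter code is easy to verify mechanically; the following short Python sketch (illustrative, not part of the original solution) checks that no two codewords can produce the same output:

import math

LETTERS = [chr(ord('A') + i) for i in range(26)]

def outputs(letter):
    # Possible printed letters: the struck key or the next one (Z wraps to A).
    i = LETTERS.index(letter)
    return {LETTERS[i], LETTERS[(i + 1) % 26]}

codebook = LETTERS[::2]   # A, C, E, ..., Y: 13 codewords

# Zero-error check: no two codewords may share a possible output.
for i, c1 in enumerate(codebook):
    for c2 in codebook[i + 1:]:
        assert outputs(c1).isdisjoint(outputs(c2)), (c1, c2)

print(len(codebook), "codewords, rate = log2(13) =", round(math.log2(13), 3), "bits/use")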

3. Consider a binary symmetric channel with Yi = Xi ⊕ Zi, where ⊕ denotes mod-2 addition and Xi, Yi ∈ {0, 1}. Suppose that {Zi} has constant marginal probabilities Pr(Zi = 1) = p = 1 − Pr(Zi = 0), but that Z1, Z2, ..., Zn are not necessarily independent. Let C = 1 − H(p). Show that

max_{p(x1, x2, ..., xn)} I(X1, X2, ..., Xn; Y1, Y2, ..., Yn) ≥ nC.

    Comment on the implications.

Solution:

When X1, X2, ..., Xn are chosen i.i.d. Bern(1/2),

I(X1, ..., Xn; Y1, ..., Yn) = H(X1, ..., Xn) − H(X1, ..., Xn | Y1, ..., Yn)
= H(X1, ..., Xn) − H(Z1, ..., Zn | Y1, ..., Yn)
≥ H(X1, ..., Xn) − H(Z1, ..., Zn)
≥ H(X1, ..., Xn) − Σ_{i=1}^n H(Zi)
= n − nH(p).

Hence, the capacity over n uses of the channel with memory is

nC^(n) = max_{p(x1, ..., xn)} I(X1, ..., Xn; Y1, ..., Yn)
≥ I(X1, ..., Xn; Y1, ..., Yn) evaluated at X1, ..., Xn i.i.d. Bern(1/2)
≥ n(1 − H(p))
= nC.

Hence, channels with memory have higher capacity. The intuitive explanation for this result is that correlation between the noise samples decreases the effective noise; one could use information from past samples of the noise to combat the present noise.
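To make the conclusion concrete, consider the extreme case of perfectly correlated noise, Z1 = Z2, over two channel uses; a small sketch (illustrative, with an assumed value of p, not part of the original solution) compares the resulting mutual information against 2C:

import math

def H(p):
    # Binary entropy in bits.
    return 0.0 if p in (0.0, 1.0) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)

p = 0.1   # illustrative crossover probability

# Memoryless case: capacity per use is C = 1 - H(p), so 2C over two uses.
two_uses_iid = 2 * (1 - H(p))

# Extreme correlated case Z1 = Z2: with X1, X2 i.i.d. Bern(1/2),
# H(Y1, Y2) = 2 bits and H(Y1, Y2 | X1, X2) = H(Z1, Z2) = H(p), so
two_uses_correlated = 2 - H(p)

print(two_uses_iid, "<=", two_uses_correlated)   # 1.062 <= 1.531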

4. Consider the channel Y = X + Z (mod 13), where

Z = 1 with probability 1/3,
    2 with probability 1/3,
    3 with probability 1/3,

and X ∈ {0, 1, ..., 12}.
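Find the capacity of this channel, assuming Z is independent of X.

Solution:

Since Z is independent of X and Y = X + Z (mod 13), H(Y|X) = H(Z) = log 3. Hence

I(X; Y) = H(Y) − H(Y|X) = H(Y) − log 3 ≤ log 13 − log 3,

with equality iff Y is uniform over {0, 1, ..., 12}, which is achieved by a uniform input distribution. Therefore C = log 13 − log 3 = log(13/3) ≈ 2.12 bits per transmission.

This value is easy to confirm numerically; the following sketch (illustrative code, not part of the original solutions) evaluates I(X; Y) for the uniform input:

import numpy as np

# Transition matrix W[x, y] for Y = X + Z (mod 13), Z uniform on {1, 2, 3}.
W = np.zeros((13, 13))
for x in range(13):
    for z in (1, 2, 3):
        W[x, (x + z) % 13] = 1 / 3

px = np.full(13, 1 / 13)   # uniform input distribution
py = px @ W                # output distribution (also uniform)

I = sum(
    px[x] * W[x, y] * np.log2(W[x, y] / py[y])
    for x in range(13)
    for y in range(13)
    if W[x, y] > 0
)
print(I, np.log2(13 / 3))  # both ~ 2.115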

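5. Consider two discrete memoryless channels (X1, p(y1|x1), Y1) and (X2, p(y2|x2), Y2) with capacities C1 and C2, respectively.

(a) Find the capacity of the product channel, in which (X1, X2) is sent simultaneously over the two channels and (Y1, Y2) is received.

(b) Find the capacity of the union channel, in which each transmission is sent over exactly one of the two channels; assume the output alphabets Y1 and Y2 are disjoint.

Solution:

(a) We have

I(X1, X2; Y1, Y2) = H(Y1, Y2) − H(Y1, Y2 | X1, X2)
= H(Y1, Y2) − H(Y1 | X1, X2) − H(Y2 | Y1, X1, X2)
= H(Y1, Y2) − H(Y1 | X1) − H(Y2 | X2)        (1), (2)
≤ H(Y1) + H(Y2) − H(Y1 | X1) − H(Y2 | X2)    (3)
= I(X1; Y1) + I(X2; Y2),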

where (1) and (2) follow from Markovity, and (3) is met with equality if X1 and X2 are independent and hence Y1 and Y2 are independent. Therefore

C = max_{p(x1,x2)} I(X1, X2; Y1, Y2)
≤ max_{p(x1,x2)} I(X1; Y1) + max_{p(x1,x2)} I(X2; Y2)
= max_{p(x1)} I(X1; Y1) + max_{p(x2)} I(X2; Y2)
= C1 + C2,

with equality iff p(x1, x2) = p*(x1) p*(x2), where p*(x1) and p*(x2) are the distributions that maximize C1 and C2, respectively.

(b) Let

θ = 1 if the signal is sent over channel 1, and θ = 2 if the signal is sent over channel 2.

Consider the following communication scheme: the sender chooses between the two channels according to a Bern(α) coin flip, and the channel input is X = (θ, X_θ). Since the output alphabets Y1 and Y2 are disjoint, θ is a function of Y, i.e., X → Y → θ. Therefore,

I(X; Y) = I(X; Y, θ)
= I(X, θ; Y, θ)
= I(θ; Y, θ) + I(X; Y, θ | θ)
= I(θ; Y, θ) + I(X; Y | θ)
= H(θ) + α I(X; Y | θ = 1) + (1 − α) I(X; Y | θ = 2)
= H(α) + α I(X1; Y1) + (1 − α) I(X2; Y2).

Thus, it follows that

C = sup_α { H(α) + α C1 + (1 − α) C2 },

which is a strictly concave function of α. Hence the maximum exists, and by elementary calculus one can show that C = log2(2^C1 + 2^C2), attained with α = 2^C1 / (2^C1 + 2^C2).
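The closed form can be checked against a direct maximization of H(α) + αC1 + (1 − α)C2; the sketch below (illustrative, with assumed values of C1 and C2, not part of the original solution) does this by grid search:

import math

def H(a):
    # Binary entropy in bits.
    return -a * math.log2(a) - (1 - a) * math.log2(1 - a)

C1, C2 = 0.7, 0.3   # illustrative capacities

grid = (a / 10000 for a in range(1, 10000))
best = max(H(a) + a * C1 + (1 - a) * C2 for a in grid)
closed_form = math.log2(2 ** C1 + 2 ** C2)
print(best, closed_form)   # both ~ 1.514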

6. Consider a time-varying discrete memoryless binary symmetric channel. Let Y1, Y2, ..., Yn be conditionally independent given X1, X2, ..., Xn, with conditional distribution given by p(y^n | x^n) = ∏_{i=1}^n p_i(yi | xi), where the i-th use of the channel is a BSC with crossover probability p_i.

(a) Find max_{p(x)} I(X^n; Y^n).


(b) We now ask for the capacity of the time-invariant version of this problem. Replace each p_i, 1 ≤ i ≤ n, by the average value p̄ = (1/n) Σ_{j=1}^n p_j, and compare the capacity to that of part (a).

Solution:

(a) We have

I(X^n; Y^n) = H(Y^n) − Σ_{i=1}^n H(Yi | Xi)
≤ Σ_{i=1}^n H(Yi) − Σ_{i=1}^n H(Yi | Xi)
≤ Σ_{i=1}^n (1 − H(p_i)),

with equality if X1, ..., Xn are chosen i.i.d. Bern(1/2). Hence

max_{p(x)} I(X1, ..., Xn; Y1, ..., Yn) = Σ_{i=1}^n (1 − H(p_i)).

(b) Since H(p) is concave in p, by Jensen's inequality,

(1/n) Σ_{i=1}^n H(p_i) ≤ H( (1/n) Σ_{i=1}^n p_i ) = H(p̄),

i.e.,

Σ_{i=1}^n H(p_i) ≤ n H(p̄).

Hence,

C_time-varying = Σ_{i=1}^n (1 − H(p_i))
= n − Σ_{i=1}^n H(p_i)
≥ n − n H(p̄)
= Σ_{i=1}^n (1 − H(p̄))
= C_time-invariant.
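The inequality is easy to see numerically; a short sketch (illustrative crossover probabilities, not part of the original solution):

import math

def H(p):
    # Binary entropy in bits.
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

ps = [0.01, 0.1, 0.2, 0.4]   # illustrative crossover probabilities
p_bar = sum(ps) / len(ps)

c_time_varying = sum(1 - H(p) for p in ps)
c_time_invariant = len(ps) * (1 - H(p_bar))
print(c_time_varying, ">=", c_time_invariant)   # 1.757 >= 1.312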

7. Suppose a binary symmetric channel of capacity C1 is immediately followed by a binary erasure channel of capacity C2. Find the capacity C of the resulting channel.


Now consider an arbitrary discrete memoryless channel (X, p(y|x), Y) followed by a binary erasure channel, resulting in an output

Ỹ = Y with probability 1 − α, and Ỹ = e with probability α,

where e denotes erasure. Thus the output Y is erased with probability α. What is the capacity of this channel?

Solution:

(a) Let C1 = 1 − H(p) be the capacity of the BSC with parameter p, and C2 = 1 − α be the capacity of the BEC with parameter α. Let Ỹ denote the output of the cascaded channel, and Y the output of the BSC. Then the transition rule for the cascaded channel is simply

p(ỹ | x) = Σ_{y=0,1} p(ỹ | y) p(y | x)

for each (x, ỹ) pair. Let X ~ Bern(π), i.e. Pr(X = 1) = π, denote the input to the channel. Then,

H(Ỹ) = H( (1 − α)((1 − p)π + p(1 − π)), α, (1 − α)(pπ + (1 − p)(1 − π)) )

and also

H(Ỹ | X = 0) = H( (1 − α)(1 − p), α, (1 − α)p )
H(Ỹ | X = 1) = H( (1 − α)p, α, (1 − α)(1 − p) ) = H(Ỹ | X = 0).

Therefore,

C = max_{p(x)} I(X; Ỹ)
= max_{p(x)} { H(Ỹ) − H(Ỹ | X) }
= max_{p(x)} { H(Ỹ) } − H(Ỹ | X)
= max_{p(x)} H( (1 − α)((1 − p)π + p(1 − π)), α, (1 − α)(pπ + (1 − p)(1 − π)) )
  − H( (1 − α)(1 − p), α, (1 − α)p ).    (4)


Note that the maximum value of H(Ỹ) occurs at π = 1/2, by the concavity and symmetry of H(·). (We can check this also by differentiating (4) with respect to π.) Substituting π = 1/2 into the expression for the capacity yields

C = H( (1 − α)/2, α, (1 − α)/2 ) − H( (1 − p)(1 − α), α, p(1 − α) )
= (1 − α)(1 + p log p + (1 − p) log(1 − p))
= C1 C2.
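A numerical spot-check of C = C1 C2 for assumed parameter values (an illustrative sketch, not part of the original solution):

import numpy as np

def entropy(probs):
    # Shannon entropy in bits of a probability vector.
    probs = np.asarray(probs, dtype=float)
    probs = probs[probs > 0]
    return float(-(probs * np.log2(probs)).sum())

p, alpha = 0.11, 0.3   # illustrative BSC crossover and BEC erasure probabilities

# With the capacity-achieving input Pr(X = 1) = 1/2, we have Pr(Y = 1) = 1/2:
h_out = entropy([(1 - alpha) / 2, alpha, (1 - alpha) / 2])
h_out_given_x = entropy([(1 - alpha) * (1 - p), alpha, (1 - alpha) * p])

C = h_out - h_out_given_x
C1, C2 = 1 - entropy([p, 1 - p]), 1 - alpha
print(C, C1 * C2)   # both ~ 0.350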

(b) For the cascade of an arbitrary discrete memoryless channel (with capacity C) with the erasure channel (with erasure probability α), we will show that

I(X; Ỹ) = (1 − α) I(X; Y).    (5)

Then, by taking suprema of both sides over all input distributions p(x), we can conclude that the capacity of the cascaded channel is (1 − α)C.

Proof of (5): Let

E = 1 if Ỹ = e, and E = 0 if Ỹ = Y.

Then, since E is a function of Ỹ,

H(Ỹ) = H(Ỹ, E)
= H(E) + H(Ỹ | E)
= H(α) + α H(Ỹ | E = 1) + (1 − α) H(Ỹ | E = 0)
= H(α) + (1 − α) H(Y),

where the last equality comes directly from the construction of E. Similarly,

H(Ỹ | X) = H(Ỹ, E | X)
= H(E | X) + H(Ỹ | X, E)
= H(α) + α H(Ỹ | X, E = 1) + (1 − α) H(Ỹ | X, E = 0)
= H(α) + (1 − α) H(Y | X),

whence

I(X; Ỹ) = H(Ỹ) − H(Ỹ | X) = (1 − α) I(X; Y).
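Identity (5) holds for any DMC; the following sketch (a randomly generated channel with illustrative sizes, not part of the original solution) verifies it numerically:

import numpy as np

rng = np.random.default_rng(0)

def mi(px, W):
    # I(X;Y) in bits for input distribution px and channel matrix W[x, y].
    py = px @ W
    terms = np.where(W > 0, W * np.log2(np.where(W > 0, W / py, 1.0)), 0.0)
    return float(px @ terms.sum(axis=1))

# Random DMC with 3 inputs and 4 outputs, plus a random input distribution.
W = rng.random((3, 4))
W /= W.sum(axis=1, keepdims=True)
px = rng.random(3)
px /= px.sum()

alpha = 0.25
# Cascade with a BEC: each output symbol survives w.p. 1 - alpha,
# and a new erasure symbol absorbs the remaining probability alpha.
W_tilde = np.hstack([(1 - alpha) * W, np.full((3, 1), alpha)])

print(mi(px, W_tilde), (1 - alpha) * mi(px, W))   # equal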

8. We wish to encode a Bernoulli(β) process V1, V2, ... for transmission over a binary symmetric channel with error probability p. Find conditions on β and p so that the probability of error Pr(V̂^n ≠ V^n) can be made to go to zero as n → ∞.

Solution:

Suppose we want to send a binary i.i.d. Bern(β) source over a binary symmetric channel with error


probability p. By the source-channel separation theorem, in order to achieve a probability of error that vanishes asymptotically, i.e. Pr(V̂^n ≠ V^n) → 0, we need the entropy of the source to be less than the capacity of the channel. Hence,

H(β) + H(p) < 1,

    or, equivalently,

β^β (1 − β)^(1−β) p^p (1 − p)^(1−p) > 1/2.
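The equivalence of the two forms of the condition can be spot-checked numerically (illustrative values of β and p; not part of the original solution):

import math

def H(x):
    # Binary entropy in bits.
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

beta, p = 0.1, 0.05   # illustrative source bias and channel error probability

lhs = beta ** beta * (1 - beta) ** (1 - beta) * p ** p * (1 - p) ** (1 - p)
print(H(beta) + H(p) < 1, lhs > 0.5)   # the two conditions agree: True True

9. [The statement of this problem and part (a) of its solution are missing from the transcript; a surviving fragment reads 2^37.5 = 1.94 × 10^11.]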

(b) i. Obviously, Huffman codewords for X are all of length n. Hence, with n deterministic questions, we can identify an object out of 2^n candidates.

ii. Observe that the total number of subsets which include both object 1 and object 2, or neither of them, is 2^(m−1). Hence, the probability that object 2 yields the same answers for k questions as object 1 is (2^(m−1)/2^m)^k = 2^(−k).

iii. Let

1_j = 1 if object j yields the same answers for k questions as object 1, and 0 otherwise,

for j = 2, ..., m.

Then

E[N] = E[ Σ_{j=2}^m 1_j ]
= Σ_{j=2}^m E[1_j]
= Σ_{j=2}^m 2^(−k)
= (m − 1) 2^(−k)
= (2^n − 1) 2^(−k),

where N is the number of objects in {2, 3, ..., m} with the same answers as object 1.
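A Monte Carlo sketch (illustrative parameters n = 4 and k = 3; not part of the original solution) confirms the expectation:

import random

def expected_matches(n, k, trials=20000):
    # Estimate E[N]: number of objects in {2, ..., m} that answer all k
    # random subset-membership questions the same way as object 1.
    m = 2 ** n
    total = 0
    for _ in range(trials):
        agree = [True] * (m - 1)
        for _ in range(k):
            ans1 = random.getrandbits(1)   # object 1's membership in the subset
            for j in range(m - 1):
                if agree[j] and random.getrandbits(1) != ans1:
                    agree[j] = False
        total += sum(agree)
    return total / trials

n, k = 4, 3
print(expected_matches(n, k), (2 ** n - 1) * 2 ** (-k))   # both ~ 1.875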
