
5.5. SOLVED PROBLEMS 81

Example 5.5.10. Let X be uniformly distributed in [0, 2π] and Y = sin(X). Calculate the p.d.f. fY of Y.

Since Y = g(X), we know that

fY(y) = ∑ (1/|g′(xn)|) fX(xn),

where the sum is over all the xn such that g(xn) = y.

For each y ∈ (−1, 1), there are two values of xn in [0, 2π] such that g(xn) = sin(xn) = y. For those values, we find that

|g′(xn)| = |cos(xn)| = √(1 − sin²(xn)) = √(1 − y²), and fX(xn) = 1/(2π).

Hence,

fY(y) = 2 × (1/√(1 − y²)) × (1/(2π)) = 1/(π√(1 − y²)).
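A quick Monte Carlo check (an illustrative sketch, not part of the original text; the sample size, test point y0, and bin width are arbitrary choices) compares the empirical distribution of sin(X) with the density just derived:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 2.0 * np.pi, size=1_000_000)
y = np.sin(x)

# Empirical probability that Y falls in a small bin around y0,
# divided by the bin width, approximates f_Y(y0).
y0, h = 0.3, 0.02
empirical = np.mean((y > y0 - h / 2) & (y < y0 + h / 2)) / h
exact = 1.0 / (np.pi * np.sqrt(1.0 - y0**2))
print(empirical, exact)
```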

Example 5.5.11. Let {X, Y} be independent random variables with X exponentially distributed with mean 1 and Y uniformly distributed in [0, 1]. Calculate E(max{X, Y}).

Let Z = max{X, Y}. Then

P(Z ≤ z) = P(X ≤ z, Y ≤ z) = P(X ≤ z)P(Y ≤ z) = z(1 − e^{−z}) for z ∈ [0, 1], and 1 − e^{−z} for z ≥ 1.

Hence,

fZ(z) = 1 − e^{−z} + z e^{−z} for z ∈ [0, 1], and e^{−z} for z ≥ 1.

82 CHAPTER 5. RANDOM VARIABLES

Accordingly,

E(Z) = ∫_0^∞ z fZ(z) dz = ∫_0^1 z(1 − e^{−z} + z e^{−z}) dz + ∫_1^∞ z e^{−z} dz.

To do the calculation we note that

∫_0^1 z dz = [z²/2]_0^1 = 1/2,

∫_0^1 z e^{−z} dz = −∫_0^1 z d(e^{−z}) = −[z e^{−z}]_0^1 + ∫_0^1 e^{−z} dz = −e^{−1} − [e^{−z}]_0^1 = 1 − 2e^{−1},

∫_0^1 z² e^{−z} dz = −∫_0^1 z² d(e^{−z}) = −[z² e^{−z}]_0^1 + ∫_0^1 2z e^{−z} dz = −e^{−1} + 2(1 − 2e^{−1}) = 2 − 5e^{−1},

∫_1^∞ z e^{−z} dz = 1 − ∫_0^1 z e^{−z} dz = 2e^{−1}.

Collecting the pieces, we find that

E(Z) = 1/2 − (1 − 2e^{−1}) + (2 − 5e^{−1}) + 2e^{−1} = 3/2 − e^{−1} ≈ 1.13.
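The value 3/2 − e^{−1} ≈ 1.13 can be confirmed with a short simulation (an illustrative sketch; the sample size is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(1.0, size=2_000_000)   # exponential with mean 1
y = rng.uniform(0.0, 1.0, size=2_000_000)  # uniform on [0, 1]
z = np.maximum(x, y)

estimate = z.mean()
exact = 1.5 - np.exp(-1.0)  # 3/2 - 1/e ≈ 1.1321
print(estimate, exact)
```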

Example 5.5.12. Let Xn, n ≥ 1 be i.i.d. with E(Xn) = µ and var(Xn) = σ². Use Chebyshev's inequality to get a bound on

α := P(|(X1 + · · · + Xn)/n − µ| ≥ ε).

Chebyshev's inequality (4.8.1) states that

α ≤ (1/ε²) var((X1 + · · · + Xn)/n) = (1/ε²) × (n var(X1)/n²) = σ²/(nε²).

This calculation shows that the sample mean gets closer and closer to the mean: the variance of the error decreases like 1/n.
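The bound can be illustrated numerically (a sketch, not from the original text; the uniform distribution, n, and ε are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
n, eps = 100, 0.2
mu, sigma2 = 0.5, 1.0 / 12.0  # X_i uniform on [0,1]: mean 1/2, variance 1/12

means = rng.uniform(0.0, 1.0, size=(20_000, n)).mean(axis=1)  # sample means
empirical = np.mean(np.abs(means - mu) >= eps)
bound = sigma2 / (n * eps**2)  # Chebyshev: sigma^2/(n eps^2)
print(empirical, bound)  # the empirical probability never exceeds the bound
```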


Example 5.5.13. Let X =D P(λ). You pick X white balls. You color the balls independently, each red with probability p and blue with probability 1 − p. Let Y be the number of red balls and Z the number of blue balls. Show that Y and Z are independent and that Y =D P(λp) and Z =D P(λ(1 − p)).

We find

P(Y = m, Z = n) = P(X = m + n) × C(m + n, m) p^m (1 − p)^n
= (λ^{m+n} e^{−λ}/(m + n)!) × ((m + n)!/(m! n!)) p^m (1 − p)^n
= [((λp)^m/m!) e^{−λp}] × [((λ(1 − p))^n/n!) e^{−λ(1−p)}],

which proves the result.
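A simulation of this coloring experiment (an illustrative sketch; λ and p are arbitrary choices) shows the two counts behaving like independent Poisson random variables:

```python
import numpy as np

rng = np.random.default_rng(0)
lam, p, trials = 3.0, 0.4, 500_000

x = rng.poisson(lam, size=trials)  # number of white balls, X =D P(lambda)
y = rng.binomial(x, p)             # red balls: each ball red w.p. p
z = x - y                          # blue balls

# Y and Z should look Poisson with means lam*p and lam*(1-p)
# (mean = variance for a Poisson) and should be uncorrelated.
print(y.mean(), y.var(), z.mean(), z.var(), np.corrcoef(y, z)[0, 1])
```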


6.7 Solved Problems

Example 6.7.1. Let (X, Y) be a point picked uniformly in the quarter circle {(x, y) | x ≥ 0, y ≥ 0, x² + y² ≤ 1}. Find E[X | Y].

Given Y = y, X is uniformly distributed in [0, √(1 − y²)]. Hence

E[X | Y] = (1/2)√(1 − Y²).

Example 6.7.2. A customer entering a store is served by clerk i with probability pi, i = 1, 2, . . . , n. The time taken by clerk i to service a customer is an exponentially distributed random variable with parameter αi.

a. Find the pdf of T, the time taken to service a customer.

b. Find E[T].

c. Find var(T).

Designate by X the clerk who serves the customer.

a. fT(t) = ∑_{i=1}^n pi fT|X[t | i] = ∑_{i=1}^n pi αi e^{−αi t}.

b. E[T] = E(E[T | X]) = E(1/αX) = ∑_{i=1}^n pi/αi.

c. We first find E[T²] = E(E[T² | X]) = E(2/αX²) = ∑_{i=1}^n 2pi/αi². Hence,

var(T) = E(T²) − (E(T))² = ∑_{i=1}^n 2pi/αi² − (∑_{i=1}^n pi/αi)².
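The formulas for E[T] and var(T) can be checked by simulating a two-clerk case (a sketch, not from the original text; the pi and αi values are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
p = np.array([0.3, 0.7])      # probability of being served by clerk i
alpha = np.array([1.0, 2.0])  # service rate of clerk i

trials = 1_000_000
clerk = rng.choice(len(p), size=trials, p=p)
t = rng.exponential(1.0 / alpha[clerk])  # service time of the chosen clerk

mean_formula = np.sum(p / alpha)                          # sum p_i/alpha_i
var_formula = np.sum(2 * p / alpha**2) - mean_formula**2  # sum 2p_i/alpha_i^2 - mean^2
print(t.mean(), mean_formula, t.var(), var_formula)
```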

Example 6.7.3. The random variables Xi are i.i.d. and such that E[Xi] = µ and var(Xi) = σ². Let N be a random variable independent of all the Xi's taking on nonnegative integer values. Let S = X1 + X2 + · · · + XN.

a. Find E(S).

b. Find var(S).

a. E(S) = E(E[S | N]) = E(Nµ) = µE(N).

96 CHAPTER 6. CONDITIONAL EXPECTATION

b. First we calculate E(S²). We find

E(S²) = E(E[S² | N]) = E(E[(X1 + X2 + · · · + XN)² | N])
= E(E[X1² + · · · + XN² + ∑_{i≠j} Xi Xj | N])
= E(N E(X1²) + N(N − 1)E(X1 X2)) = E(N(µ² + σ²) + N(N − 1)µ²)
= E(N)σ² + E(N²)µ².

Then,

var(S) = E(S²) − (E(S))² = E(N)σ² + E(N²)µ² − µ²(E(N))² = E(N)σ² + var(N)µ².
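A quick check of var(S) = E(N)σ² + var(N)µ² with N Poisson (an illustrative sketch; the parameters are arbitrary, and the simulation uses the fact that a sum of k i.i.d. N(µ, σ²) terms is N(kµ, kσ²)):

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma2, lam = 2.0, 1.0, 5.0  # X_i ~ N(mu, sigma2), N ~ Poisson(lam)

trials = 400_000
n = rng.poisson(lam, size=trials)
# Given N = k, S is N(k*mu, k*sigma2), so S can be sampled directly.
s = rng.normal(mu * n, np.sqrt(sigma2 * n))

mean_formula = mu * lam                    # mu * E(N)
var_formula = lam * sigma2 + lam * mu**2   # E(N)sigma2 + var(N)mu^2; both = lam
print(s.mean(), mean_formula, s.var(), var_formula)
```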

Example 6.7.4. Let X, Y be independent and uniform in [0, 1]. Calculate E[X² | X + Y].

Given X + Y = z, the point (X, Y) is uniformly distributed on the line {(x, y) | x ≥ 0, y ≥ 0, x + y = z}. Draw a picture to see that if z > 1, then X is uniform on [z − 1, 1], and if z < 1, then X is uniform on [0, z]. Thus, if z > 1 one has

E[X² | X + Y = z] = ∫_{z−1}^1 x² (1/(2 − z)) dx = (1/(2 − z)) [x³/3]_{z−1}^1 = (1 − (z − 1)³)/(3(2 − z)).

Similarly, if z < 1, then

E[X² | X + Y = z] = ∫_0^z x² (1/z) dx = (1/z)[x³/3]_0^z = z²/3.

Example 6.7.5. Let (X, Y) be the coordinates of a point chosen uniformly in [0, 1]². Calculate E[X | XY].

This is an example where we use the straightforward approach, based on the definition. The problem is interesting because it illustrates that approach in a tractable but nontrivial example. Let Z = XY. Then

E[X | Z = z] = ∫_0^1 x f_{X|Z}[x | z] dx.


Now,

f_{X|Z}[x | z] = f_{X,Z}(x, z)/f_Z(z).

Also,

f_{X,Z}(x, z) dx dz = P(X ∈ (x, x + dx), Z ∈ (z, z + dz))
= P(X ∈ (x, x + dx)) P[Z ∈ (z, z + dz) | X = x] = dx P(xY ∈ (z, z + dz))
= dx P(Y ∈ (z/x, z/x + dz/x)) = dx (dz/x) 1{z ≤ x}.

Hence,

f_{X,Z}(x, z) = 1/x, if x ∈ [0, 1] and z ∈ [0, x]; 0, otherwise.

Consequently,

f_Z(z) = ∫_0^1 f_{X,Z}(x, z) dx = ∫_z^1 (1/x) dx = −ln(z), 0 ≤ z ≤ 1.

Finally,

f_{X|Z}[x | z] = −1/(x ln(z)), for x ∈ [0, 1] and z ∈ [0, x],

and

E[X | Z = z] = ∫_z^1 x (−1/(x ln(z))) dx = (z − 1)/ln(z),

so that

E[X | XY] = (XY − 1)/ln(XY).

Examples of values:

E[X | XY = 1] = 1, E[X | XY = 0.1] ≈ 0.39, E[X | XY ≈ 0] ≈ 0.
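A binned Monte Carlo estimate (a sketch, not from the original text; the bin around z0 = 0.1 and the sample size are arbitrary choices) reproduces the value 0.39:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(size=4_000_000)
y = rng.uniform(size=4_000_000)
z = x * y

# Average X over samples whose product lands in a narrow bin around z0.
z0, h = 0.1, 0.01
empirical = x[np.abs(z - z0) < h / 2].mean()
exact = (z0 - 1.0) / np.log(z0)  # (z - 1)/ln(z) at z = 0.1, about 0.39
print(empirical, exact)
```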

Example 6.7.6. Let X, Y be independent and exponentially distributed with mean 1. Find E[cos(X + Y) | X].


We have

E[cos(X + Y) | X = x] = ∫_0^∞ cos(x + y) e^{−y} dy = Re ∫_0^∞ e^{i(x+y)−y} dy
= Re (e^{ix}/(1 − i)) = (cos(x) − sin(x))/2.

Example 6.7.7. Let X1, X2, . . . , Xn be i.i.d. U[0, 1] and Y = max{X1, . . . , Xn}. Calculate E[X1 | Y].

Intuition suggests, and it is not too hard to justify, that if Y = y, then X1 = y with probability 1/n, and with probability (n − 1)/n the random variable X1 is uniformly distributed in [0, y]. Hence,

E[X1 | Y] = (1/n)Y + ((n − 1)/n)(Y/2) = ((n + 1)/(2n))Y.

Example 6.7.8. Let X, Y, Z be independent and uniform in [0, 1]. Calculate E[(X + 2Y + Z)² | X].

One has E[(X + 2Y + Z)² | X] = E[X² + 4Y² + Z² + 4XY + 4YZ + 2XZ | X]. Now,

E[X² + 4Y² + Z² + 4XY + 4YZ + 2XZ | X]
= X² + 4E(Y²) + E(Z²) + 4X E(Y) + 4E(Y)E(Z) + 2X E(Z)
= X² + 4/3 + 1/3 + 2X + 1 + X = X² + 3X + 8/3.

Example 6.7.9. Let X, Y, Z be three random variables defined on the same probability space. Prove formally that

E(|X − E[X | Y]|²) ≥ E(|X − E[X | Y, Z]|²).

Let X1 = E[X | Y] and X2 = E[X | Y, Z]. Note that

E((X − X2)(X2 − X1)) = E(E[(X − X2)(X2 − X1) | Y, Z])


and

E[(X − X2)(X2 − X1) | Y, Z] = (X2 − X1)E[X − X2 | Y, Z] = (X2 − X1)(X2 − X2) = 0.

Hence,

E((X − X1)²) = E((X − X2 + X2 − X1)²) = E((X − X2)²) + E((X2 − X1)²) ≥ E((X − X2)²).

Example 6.7.10. Pick the point (X, Y) uniformly in the triangle {(x, y) | 0 ≤ x ≤ 1 and 0 ≤ y ≤ x}.

a. Calculate E[X | Y].

b. Calculate E[Y | X].

c. Calculate E[(X − Y)² | X].

a. Given Y = y, X is U[y, 1], so that E[X | Y = y] = (1 + y)/2. Hence,

E[X | Y] = (1 + Y)/2.

b. Given X = x, Y is U[0, x], so that E[Y | X = x] = x/2. Hence,

E[Y | X] = X/2.

c. Since given X = x, Y is U[0, x], we find

E[(X − Y)² | X = x] = ∫_0^x (x − y)² (1/x) dy = (1/x) ∫_0^x y² dy = x²/3. Hence,

E[(X − Y)² | X] = X²/3.

Example 6.7.11. Assume that the two random variables X and Y are such that E[X | Y] = Y and E[Y | X] = X. Show that P(X = Y) = 1.

We show that E((X − Y)²) = 0. This will prove that X − Y = 0 with probability one. Note that

E((X − Y)²) = E(X²) − E(XY) + E(Y²) − E(XY).


Now,

E(XY) = E(E[XY | X]) = E(X E[Y | X]) = E(X²).

Similarly, one finds that E(XY) = E(Y²). Putting together the pieces, we get E((X − Y)²) = 0.

Example 6.7.12. Let X, Y be independent random variables uniformly distributed in [0, 1]. Calculate E[X | X < Y].

Drawing a unit square, we see that given X < Y, the pair (X, Y) is uniformly distributed in the triangle left of the diagonal from the upper left corner to the bottom right corner of that square. Accordingly, the p.d.f. f(x) of X is given by f(x) = 2(1 − x). Hence,

E[X | X < Y] = ∫_0^1 x × 2(1 − x) dx = 1/3.
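This conditional mean is easy to check by simulation (an illustrative sketch, not part of the original text):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(size=2_000_000)
y = rng.uniform(size=2_000_000)

estimate = x[x < y].mean()  # conditional mean of X given X < Y
print(estimate)
```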

108 CHAPTER 7. GAUSSIAN RANDOM VARIABLES

7.4 Summary

We defined the Gaussian random variables N(0, 1), N(µ, σ²), and N(µ, Σ) both in terms of their density and their characteristic function.

Jointly Gaussian random variables that are uncorrelated are independent.

If X, Y are jointly Gaussian, then E[X | Y] = E(X) + cov(X, Y) var(Y)^{−1} (Y − E(Y)). In the vector case,

E[X | Y] = E(X) + Σ_{X,Y} Σ_Y^{−1} (Y − E(Y)),

when Σ_Y is invertible. We also discussed the non-invertible case.

7.5 Solved Problems

Example 7.5.1. The noise voltage X in an electric circuit can be modelled as a Gaussian random variable with mean zero and variance equal to 10^{−8}.

a. What is the probability that it exceeds 10^{−4}? What is the probability that it exceeds 2 × 10^{−4}? What is the probability that its value is between −2 × 10^{−4} and 10^{−4}?

b. Given that the noise value is positive, what is the probability that it exceeds 10^{−4}?

c. What is the expected value of |X|?

Let Z = 10⁴X; then Z =D N(0, 1) and we can reformulate the questions in terms of Z.

a. Using (7.1) we find P(Z > 1) = 0.159 and P(Z > 2) = 0.023. Indeed, P(Z > d) = P(|Z| > d)/2, by symmetry of the density. Moreover,

P(−2 < Z < 1) = P(Z < 1) − P(Z ≤ −2) = 1 − P(Z > 1) − P(Z > 2) = 1 − 0.159 − 0.023 = 0.818.

b. We have

P[Z > 1 | Z > 0] = P(Z > 1)/P(Z > 0) = 2P(Z > 1) = 0.318.

7.5. SOLVED PROBLEMS 109

c. Since Z = 10⁴X, one has E(|Z|) = 10⁴E(|X|). Now,

E(|Z|) = ∫_{−∞}^∞ |z| fZ(z) dz = 2∫_0^∞ z fZ(z) dz = 2∫_0^∞ (1/√(2π)) z exp(−z²/2) dz
= −√(2/π) ∫_0^∞ d[exp(−z²/2)] = √(2/π).

Hence,

E(|X|) = 10^{−4}√(2/π).
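The numerical values quoted above can be reproduced with the standard normal CDF (a sketch, not from the original text, using only the error function from the standard library):

```python
from math import erf, sqrt, pi

def phi_tail(d):
    """P(N(0,1) > d) via the error function."""
    return 0.5 * (1.0 - erf(d / sqrt(2.0)))

print(phi_tail(1.0))                       # P(Z > 1), about 0.159
print(phi_tail(2.0))                       # P(Z > 2), about 0.023
print(1 - phi_tail(1.0) - phi_tail(2.0))   # P(-2 < Z < 1), about 0.818
print(sqrt(2.0 / pi))                      # E|Z|, about 0.798
```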

Example 7.5.2. Let U = {Un, n ≥ 1} be a sequence of independent standard Gaussian random variables. A low-pass filter takes the sequence U and produces the output sequence Xn = (Un + Un+1)/2. A high-pass filter produces the output sequence Yn = (Un − Un−1)/2.

a. Find the joint pdf of Xn and Xn−1 and find the joint pdf of Xn and Xn+m for m > 1.

b. Find the joint pdf of Yn and Yn−1 and find the joint pdf of Yn and Yn+m for m > 1.

c. Find the joint pdf of Xn and Ym.

We start with some preliminary observations. First, since the Ui are independent, they are jointly Gaussian. Second, Xn and Yn are linear combinations of the Ui and thus are also jointly Gaussian. Third, the jpdf of jointly Gaussian random variables Z is

fZ(z) = (1/√((2π)^n det(C))) exp[−(1/2)(z − m)ᵀ C⁻¹ (z − m)],

where n is the dimension of Z, m is the vector of expectations of Z, and C is the covariance matrix E[(Z − m)(Z − m)ᵀ]. Finally, we need some basic facts from algebra. If

C = [[a, b], [c, d]],

then det(C) = ad − bc and

C⁻¹ = (1/det(C)) [[d, −b], [−c, a]].

We are now ready to answer the questions.

a. Express the pair in the form X = AU:

[Xn; Xn−1] = [[0, 1/2, 1/2], [1/2, 1/2, 0]] [Un−1; Un; Un+1].


Then det(C) = 1/4 − 1/16 = 3/16 and

C⁻¹ = (16/3) [[1/2, −1/4], [−1/4, 1/2]],

so that

fXnYn(xn, yn) = (2/(π√3)) exp[−(4/3)(xn² − xn yn + yn²)].

ii. Consider m = n + 1:

[Xn; Yn+1] = [[1/2, 1/2], [−1/2, 1/2]] [Un; Un+1].

Then E[[Xn, Yn+1]ᵀ] = A E[U] = 0 and

C = A E[UUᵀ] Aᵀ = [[1/2, 1/2], [−1/2, 1/2]] [[1, 0], [0, 1]] [[1/2, −1/2], [1/2, 1/2]] = [[1/2, 0], [0, 1/2]].

Then det(C) = 1/4 and

C⁻¹ = [[2, 0], [0, 2]],

so that

fXnYn+1(xn, yn+1) = (1/π) exp[−(xn² + yn+1²)].

iii. For all other m:

[Xn; Ym] = [[1/2, 1/2, 0, 0], [0, 0, −1/2, 1/2]] [Un; Un+1; Um−1; Um].

Then E[[Xn, Ym]ᵀ] = A E[U] = 0 and

C = A E[UUᵀ] Aᵀ = [[1/2, 1/2, 0, 0], [0, 0, −1/2, 1/2]] I [[1/2, 0], [1/2, 0], [0, −1/2], [0, 1/2]] = [[1/2, 0], [0, 1/2]].

Then det(C) = 1/4 and

C⁻¹ = [[2, 0], [0, 2]],

so that

fXnYm(xn, ym) = (1/π) exp[−(xn² + ym²)].


Example 7.5.3. Let X, Y, Z, V be i.i.d. N(0, 1). Calculate E[X + 2Y | 3X + Z, 4Y + 2V].

We have

E[X + 2Y | 3X + Z, 4Y + 2V] = a Σ⁻¹ [3X + Z; 4Y + 2V],

where

a = [E((X + 2Y)(3X + Z)), E((X + 2Y)(4Y + 2V))] = [3, 8]

and

Σ = [[var(3X + Z), E((3X + Z)(4Y + 2V))], [E((3X + Z)(4Y + 2V)), var(4Y + 2V)]] = [[10, 0], [0, 20]].

Hence,

E[X + 2Y | 3X + Z, 4Y + 2V] = [3, 8] [[1/10, 0], [0, 1/20]] [3X + Z; 4Y + 2V] = (3/10)(3X + Z) + (4/10)(4Y + 2V).

Example 7.5.4. Assume that X, Yn, n ≥ 1 are mutually independent random variables with X = N(0, 1) and Yn = N(0, σ²). Let X̂n = E[X | X + Y1, . . . , X + Yn]. Find the smallest value of n such that

P(|X − X̂n| > 0.1) ≤ 5%.

We know that X̂n = an(nX + Y1 + · · · + Yn). The value of an is such that

E((X − X̂n)(X + Yj)) = 0, i.e., E((X − an(nX + Y1 + · · · + Yn))(X + Yj)) = 0,

which implies that

an = 1/(n + σ²).

Then

var(X − X̂n) = var((1 − n an)X − an(Y1 + · · · + Yn)) = (1 − n an)² + n an² σ² = σ²/(n + σ²).


Thus we know that X − X̂n = N(0, σ²/(n + σ²)). Accordingly,

P(|X − X̂n| > 0.1) = P(|N(0, σ²/(n + σ²))| > 0.1) = P(|N(0, 1)| > 0.1/αn),

where αn = √(σ²/(n + σ²)). For this probability to be at most 5% we need

0.1/αn = 2, i.e., αn = √(σ²/(n + σ²)) = 0.1/2,

so that

n = 399σ².

The result is intuitively pleasing: if the observations are more noisy (σ² large), we need more of them to estimate X.

Example 7.5.5. Assume that X, Y are i.i.d. N(0, 1). Calculate E[(X + Y)⁴ | X − Y].

Note that X + Y and X − Y are independent because they are jointly Gaussian and uncorrelated. Hence,

E[(X + Y)⁴ | X − Y] = E((X + Y)⁴) = E(X⁴ + 4X³Y + 6X²Y² + 4XY³ + Y⁴) = 3 + 6 + 3 = 12.
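A simulation (an illustrative sketch, not from the original text) confirms both the uncorrelatedness of X + Y and X − Y and the value 12:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(2_000_000)
y = rng.standard_normal(2_000_000)

print(np.mean((x + y) ** 4))            # fourth moment of X + Y, about 12
print(np.corrcoef(x + y, x - y)[0, 1])  # correlation, about 0
```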

Example 7.5.6. Let X, Y be independent N(0, 1) random variables. Show that W := X² + Y² =D Exd(1/2). That is, the sum of the squares of two i.i.d. zero-mean Gaussian random variables is exponentially distributed!

We calculate the characteristic function of W. We find

E(e^{iuW}) = ∫_{−∞}^∞ ∫_{−∞}^∞ e^{iu(x²+y²)} (1/(2π)) e^{−(x²+y²)/2} dx dy
= ∫_0^{2π} ∫_0^∞ e^{iur²} (1/(2π)) e^{−r²/2} r dr dθ
= ∫_0^∞ e^{iur²} e^{−r²/2} r dr
= ∫_0^∞ (1/(2iu − 1)) d[e^{iur² − r²/2}] = 1/(1 − 2iu).


On the other hand, if W =D Exd(λ), then

E(e^{iuW}) = ∫_0^∞ e^{iux} λ e^{−λx} dx = λ/(λ − iu) = 1/(1 − λ⁻¹iu).

Comparing these expressions shows that X² + Y² =D Exd(1/2) as claimed.
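The claim is easy to test by simulation, since Exd(1/2) has mean 2 and tail P(W > t) = e^{−t/2} (an illustrative sketch, not part of the original text):

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.standard_normal(1_000_000) ** 2 + rng.standard_normal(1_000_000) ** 2

print(w.mean())                        # Exd(1/2) has mean 2
print(np.mean(w > 3.0), np.exp(-1.5))  # tail P(W > 3) vs e^{-3/2}
```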

Example 7.5.7. Let Xn, n ≥ 0 be independent N(0, 1) random variables. Assume that Yn+1 = aYn + Xn for n ≥ 0, where Y0 is a Gaussian random variable with mean zero and variance σ², independent of the Xn's, and |a| < 1.

a. Calculate var(Yn) for n ≥ 0. Show that var(Yn) → γ² as n → ∞ for some value γ².

b. Find the values of σ² so that the variance of Yn does not depend on n ≥ 1.

a. We see that

var(Yn+1) = var(aYn + Xn) = a² var(Yn) + var(Xn) = a² var(Yn) + 1.

Thus, with αn := var(Yn), one has

αn+1 = a² αn + 1 and α0 = σ².

Solving these equations we find

var(Yn) = αn = a^{2n} σ² + (1 − a^{2n})/(1 − a²), for n ≥ 0.

Since |a| < 1, it follows that

var(Yn) → γ² := 1/(1 − a²) as n → ∞.

b. The obvious answer is σ² = γ².
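Iterating the variance recursion numerically (a sketch, not from the original text; a and σ² are arbitrary choices) shows the convergence to γ²:

```python
a, sigma2, steps = 0.8, 4.0, 60   # illustrative values with |a| < 1
gamma2 = 1.0 / (1.0 - a**2)       # predicted limit of var(Y_n)

alpha = sigma2                    # alpha_0 = sigma^2
for _ in range(steps):
    alpha = a**2 * alpha + 1.0    # alpha_{n+1} = a^2 alpha_n + 1

print(alpha, gamma2)  # the recursion settles at gamma^2 = 1/(1 - a^2)
```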


Example 7.5.8. Let the Xn's be as in Example 7.5.7.

a. Calculate E[X1 + X2 + X3 | X1 + X2, X2 + X3, X3 + X4].

b. Calculate E[X1 + X2 + X3 | X1 + X2 + X3 + X4 + X5].

a. We know that the solution is of the form Y = a(X1 + X2) + b(X2 + X3) + c(X3 + X4), where the coefficients a, b, c must be such that the estimation error is orthogonal to the conditioning variables. That is,

E(((X1 + X2 + X3) − Y)(X1 + X2)) = E(((X1 + X2 + X3) − Y)(X2 + X3)) = E(((X1 + X2 + X3) − Y)(X3 + X4)) = 0.

These equalities read

2 − a − (a + b) = 2 − (a + b) − (b + c) = 1 − (b + c) − c = 0,

and solving these equalities gives a = 3/4, b = 1/2, and c = 1/4.

b. Here we use symmetry. For k = 1, . . . , 5, let

Yk = E[Xk | X1 + X2 + X3 + X4 + X5].

Note that Y1 = Y2 = · · · = Y5, by symmetry. Moreover,

Y1 + Y2 + Y3 + Y4 + Y5 = E[X1 + · · · + X5 | X1 + · · · + X5] = X1 + X2 + X3 + X4 + X5.

It follows that Yk = (X1 + X2 + X3 + X4 + X5)/5 for k = 1, . . . , 5. Hence,

E[X1 + X2 + X3 | X1 + X2 + X3 + X4 + X5] = Y1 + Y2 + Y3 = (3/5)(X1 + X2 + X3 + X4 + X5).

Example 7.5.9. Let the Xn's be as in Example 7.5.7. Find the jpdf of (X1 + 2X2 + 3X3, 2X1 + 3X2 + X3, 3X1 + X2 + 2X3).


These random variables are jointly Gaussian, zero mean, and with covariance matrix Σ given by

Σ = [[14, 11, 11], [11, 14, 11], [11, 11, 14]].

Indeed, Σ is the matrix of covariances. For instance, its entry (2, 3) is given by

E((2X1 + 3X2 + X3)(3X1 + X2 + 2X3)) = 2 × 3 + 3 × 1 + 1 × 2 = 11.

We conclude that the jpdf is

fX(x) = (1/((2π)^{3/2}|Σ|^{1/2})) exp(−(1/2) xᵀ Σ⁻¹ x).

We let you calculate |Σ| and Σ⁻¹.

Example 7.5.10. Let X1, X2, X3 be independent N(0, 1) random variables. Calculate E[X1 + 3X2 | Y] where

Y = [[1, 2, 3], [3, 2, 1]] [X1; X2; X3].

By now, this should be familiar. The solution is W := a(X1 + 2X2 + 3X3) + b(3X1 + 2X2 + X3), where a and b are such that

0 = E((X1 + 3X2 − W)(X1 + 2X2 + 3X3)) = 7 − (a + 3b) − (4a + 4b) − (9a + 3b) = 7 − 14a − 10b

and

0 = E((X1 + 3X2 − W)(3X1 + 2X2 + X3)) = 9 − (3a + 9b) − (4a + 4b) − (3a + b) = 9 − 10a − 14b.

Solving these equations gives a = 1/12 and b = 7/12.

Example 7.5.11. Find the jpdf of (2X1 + X2, X1 + 3X2) where X1 and X2 are independent N(0, 1) random variables.


These random variables are jointly Gaussian, zero-mean, with covariance Σ given by

Σ = [[5, 5], [5, 10]].

Hence,

fX(x) = (1/(2π|Σ|^{1/2})) exp(−(1/2) xᵀ Σ⁻¹ x) = (1/(10π)) exp(−(1/2) xᵀ Σ⁻¹ x),

where

Σ⁻¹ = (1/25) [[10, −5], [−5, 5]].

Example 7.5.12. The random variable X is N(µ, 1). Find an approximate value of µ so that

P(−0.5 ≤ X ≤ −0.1) ≈ P(1 ≤ X ≤ 2).

We write X = µ + Y where Y is N(0, 1). We must find µ so that

g(µ) := P(−0.5 − µ ≤ Y ≤ −0.1 − µ) − P(1 − µ ≤ Y ≤ 2 − µ) ≈ 0.

We do a little search using a table of the N(0, 1) distribution or using a calculator. We find that µ ≈ 0.065.
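The search for µ can be automated with a bisection on g (a sketch, not from the original text; the bracketing interval is an arbitrary choice):

```python
from math import erf, sqrt

def Phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def g(mu):
    return (Phi(-0.1 - mu) - Phi(-0.5 - mu)) - (Phi(2.0 - mu) - Phi(1.0 - mu))

lo, hi = 0.0, 0.2  # g(0) > 0 and g(0.2) < 0, so a root lies in between
for _ in range(60):
    mid = (lo + hi) / 2.0
    lo, hi = (mid, hi) if g(mid) > 0 else (lo, mid)

print(lo)  # about 0.065
```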

Example 7.5.13. Let X be a N(0, 1) random variable. Calculate the mean and the variance of cos(X) and sin(X).

a. Mean values. We know that

E(e^{iuX}) = e^{−u²/2} and e^{iθ} = cos(θ) + i sin(θ).

Therefore,

E(cos(uX) + i sin(uX)) = e^{−u²/2},


so that

E(cos(uX)) = e^{−u²/2} and E(sin(uX)) = 0.

In particular, E(cos(X)) = e^{−1/2} and E(sin(X)) = 0.

b. Variances. We first calculate E(cos²(X)). We find

E(cos²(X)) = E((1/2)(1 + cos(2X))) = 1/2 + (1/2)E(cos(2X)).

Using the previous derivation, we find that

E(cos(2X)) = e^{−2²/2} = e^{−2},

so that E(cos²(X)) = (1/2) + (1/2)e^{−2}. We conclude that

var(cos(X)) = E(cos²(X)) − (E(cos(X)))² = 1/2 + (1/2)e^{−2} − (e^{−1/2})² = 1/2 + (1/2)e^{−2} − e^{−1}.

Similarly, we find

E(sin²(X)) = E(1 − cos²(X)) = 1/2 − (1/2)e^{−2} = var(sin(X)).

Example 7.5.14. Let X be a N(0, 1) random variable. Define

Y = X, if |X| ≤ 1; −X, if |X| > 1.

Find the pdf of Y.

By symmetry, Y is N(0, 1).

Example 7.5.15. Let X, Y, Z be independent N(0, 1) random variables.

a. Calculate E[3X + 5Y | 2X − Y, X + Z].

b. How does the expression change if X, Y, Z are i.i.d. N(1, 1)?


a. Let V1 = 2X − Y, V2 = X + Z, and V = [V1, V2]ᵀ. Then

E[3X + 5Y | V] = a Σ_V⁻¹ V,

where

a = E((3X + 5Y)Vᵀ) = [1, 3]

and

Σ_V = [[5, 2], [2, 2]].

Hence,

E[3X + 5Y | V] = [1, 3] [[5, 2], [2, 2]]⁻¹ V = [1, 3] (1/6) [[2, −2], [−2, 5]] V
= (1/6)[−4, 13] V = −(2/3)(2X − Y) + (13/6)(X + Z).

b. Now,

E[3X + 5Y | V] = E(3X + 5Y) + a Σ_V⁻¹ (V − E(V)) = 8 + (1/6)[−4, 13](V − [1, 2]ᵀ)
= 26/6 − (2/3)(2X − Y) + (13/6)(X + Z).

Example 7.5.16. Let (X, Y) be jointly Gaussian. Show that X − E[X | Y] is Gaussian and calculate its mean and variance.

We know that

E[X | Y] = E(X) + (cov(X, Y)/var(Y))(Y − E(Y)).

Consequently,

X − E[X | Y] = X − E(X) − (cov(X, Y)/var(Y))(Y − E(Y))

and is certainly Gaussian. This difference is zero-mean. Its variance is

var(X) + (cov(X, Y)/var(Y))² var(Y) − 2(cov(X, Y)/var(Y)) cov(X, Y) = var(X) − (cov(X, Y))²/var(Y).

2.7. SOLVED PROBLEMS 19

A probability space is a triple (Ω, F, P) where Ω is the set of outcomes, F is a σ-field of subsets of Ω, and P : F → [0, 1] is a σ-additive set function such that P(Ω) = 1.

The idea is to specify the likelihood of various outcomes (elements of Ω). If one can specify the probability of individual outcomes (e.g., when Ω is countable), then one can choose F = 2^Ω, so that all sets of outcomes are events. However, this is generally not possible as the example of the uniform distribution on [0, 1] shows. (See Appendix C.)

2.6.1 Stars and Bars Method

In many problems, we use a method for counting the number of ordered groupings of identical objects. This method is called the stars and bars method. Suppose we are given identical objects we call stars. Any ordered grouping of these stars can be obtained by separating them by bars. For example, || ∗∗∗ |∗ separates four stars into four groups of sizes 0, 0, 3, and 1.

Suppose we wish to separate N stars into M ordered groups. We need M − 1 bars to form M groups. The number of orderings is the number of ways of placing the N identical stars and M − 1 identical bars into N + M − 1 spaces, i.e., C(N + M − 1, M − 1).

Creating compound objects of stars and bars is useful when there are bounds on the sizes of the groups.

2.7 Solved Problems

Example 2.7.1. Describe the probability space (Ω, F, P) that corresponds to the random experiment "picking five cards without replacement from a perfectly shuffled 52-card deck."

1. One can choose Ω to be all the permutations of A := {1, 2, . . . , 52}. The interpretation of ω ∈ Ω is then the shuffled deck. Each permutation is equally likely, so that pω = 1/(52!) for ω ∈ Ω. When we pick the five cards, these cards are (ω1, ω2, . . . , ω5), the top 5 cards of the deck.

20 CHAPTER 2. PROBABILITY SPACE

2. One can also choose Ω to be all the subsets of A with five elements. In this case, each subset is equally likely and, since there are N := C(52, 5) such subsets, one defines pω = 1/N for ω ∈ Ω.

3. One can choose Ω = {ω = (ω1, ω2, ω3, ω4, ω5) | ωn ∈ A and ωm ≠ ωn, ∀m ≠ n, m, n ∈ {1, 2, . . . , 5}}. In this case, the outcome specifies the order in which we pick the cards. Since there are M := 52!/(47!) such ordered lists of five cards without replacement, we define pω = 1/M for ω ∈ Ω.

As this example shows, there are multiple ways of describing a random experiment. What matters is that Ω is large enough to specify completely the outcome of the experiment.

Example 2.7.2. Pick three balls without replacement from an urn with fifteen balls that are identical except that ten are red and five are blue. Specify the probability space.

One possibility is to specify the color of the three balls in the order they are picked. Then

Ω = {R, B}³, F = 2^Ω, P(RRR) = (10/15)(9/14)(8/13), . . . , P(BBB) = (5/15)(4/14)(3/13).

Example 2.7.3. You flip a fair coin until you get three consecutive 'heads'. Specify the probability space.

One possible choice is Ω = {H, T}*, the set of finite sequences of H and T. That is,

{H, T}* = ∪_{n=1}^∞ {H, T}^n.

This set Ω is countable, so we can choose F = 2^Ω. Here,

P(ω) = 2^{−n}, where n := length of ω.

This is another example of a probability space that is bigger than necessary, but easier to specify than the smallest probability space we need.


Example 2.7.4. Let Ω = {0, 1, 2, . . .}. Let F be the collection of subsets of Ω that are either finite or whose complement is finite. Is F a σ-field?

No, F is not closed under countable set operations. For instance, {2n} ∈ F for each n ≥ 0 because {2n} is finite. However,

A := ∪_{n=0}^∞ {2n}

is not in F because both A and A^c are infinite.

Example 2.7.5. In a class with 24 students, what is the probability that no two students have the same birthday?

Let N = 365 and n = 24. The probability is

α := (N/N) × ((N − 1)/N) × ((N − 2)/N) × · · · × ((N − n + 1)/N).

To estimate this quantity we proceed as follows. Note that

ln(α) = ∑_{k=1}^n ln((N − n + k)/N) ≈ ∫_1^n ln((N − n + x)/N) dx
= N ∫_a^1 ln(y) dy = N[y ln(y) − y]_a^1
= −(N − n + 1) ln((N − n + 1)/N) − (n − 1).

(In this derivation we defined a = (N − n + 1)/N.) With n = 24 and N = 365 we find that α ≈ 0.48.
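The exact product is easy to evaluate directly (a sketch, not from the original text; note the integral approximation above slightly overestimates the exact value, which is about 0.46):

```python
from math import prod

N, n = 365, 24
alpha = prod((N - k) / N for k in range(n))  # exact product form of alpha
print(alpha)
```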

Example 2.7.6. Let A, B, C be three events. Assume that P(A) = 0.6, P(B) = 0.6, P(C) = 0.7, P(A ∩ B) = 0.3, P(A ∩ C) = 0.4, P(B ∩ C) = 0.4, and P(A ∪ B ∪ C) = 1. Find P(A ∩ B ∩ C).

We know that (draw a picture)

P(A ∪ B ∪ C) = P(A) + P(B) + P(C) − P(A ∩ B) − P(A ∩ C) − P(B ∩ C) + P(A ∩ B ∩ C).


Substituting the known values, we find

1 = 0.6 + 0.6 + 0.7 − 0.3 − 0.4 − 0.4 + P(A ∩ B ∩ C),

so that

P(A ∩ B ∩ C) = 0.2.

Example 2.7.7. Let Ω = {1, 2, 3, 4} and let F = 2^Ω be the collection of all the subsets of Ω. Give an example of a collection A of subsets of Ω and probability measures P1 and P2 such that

(i) P1(A) = P2(A), ∀A ∈ A;

(ii) the σ-field generated by A is F (this means that F is the smallest σ-field of Ω that contains A);

(iii) P1 and P2 are not the same.

Let A = {{1, 2}, {2, 4}}. Assign probabilities P1(1) = 1/8, P1(2) = 1/8, P1(3) = 3/8, P1(4) = 3/8, and P2(1) = 1/12, P2(2) = 2/12, P2(3) = 5/12, P2(4) = 4/12.

Note that P1 and P2 are not the same, thus satisfying (iii).

P1({1, 2}) = P1(1) + P1(2) = 1/8 + 1/8 = 1/4 and P2({1, 2}) = P2(1) + P2(2) = 1/12 + 2/12 = 1/4, so P1({1, 2}) = P2({1, 2}). Similarly, P1({2, 4}) = P1(2) + P1(4) = 1/8 + 3/8 = 1/2 and P2({2, 4}) = P2(2) + P2(4) = 2/12 + 4/12 = 1/2, so P1({2, 4}) = P2({2, 4}). Thus P1(A) = P2(A) ∀A ∈ A, satisfying (i).

To check (ii), we only need to check that for every k ∈ Ω, the singleton {k} can be formed by set operations on sets in A ∪ {∅} ∪ {Ω}; then any other set in F can be formed by set operations on the {k}:

{1} = {1, 2} ∩ {2, 4}^c,


{2} = {1, 2} ∩ {2, 4},

{3} = {1, 2}^c ∩ {2, 4}^c,

{4} = {1, 2}^c ∩ {2, 4}.

Example 2.7.8. Choose a number randomly between 1 and 999999 inclusive, all choices being equally likely. What is the probability that the digits sum up to 23? For example, the number 7646 is between 1 and 999999 and its digits sum up to 23 (7 + 6 + 4 + 6 = 23).

Numbers between 1 and 999999 inclusive have 6 digits, each with a value in {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}. We are interested in counting the solutions of x1 + x2 + x3 + x4 + x5 + x6 = 23, where xi represents the ith digit.

First consider all nonnegative xi, where each digit can range from 0 to 23. The number of ways to distribute 23 amongst the xi's is C(28, 5).

But we need to restrict the digits to xi < 10, so we must subtract the number of ways to distribute 23 amongst the xi's when xk ≥ 10 for some k. Specifically, when xk ≥ 10 we can express it as xk = 10 + yk. For all other j ≠ k write yj = xj. The number of ways to arrange 23 amongst the xi when some fixed xk ≥ 10 is the same as the number of ways to arrange the yi so that ∑_{i=1}^6 yi = 23 − 10, which is C(18, 5). There are 6 possible choices of k, so there are a total of 6 C(18, 5) ways for some digit to be greater than or equal to 10, as we can see by using the stars and bars method (see 2.6.1).

However, the above counts some configurations multiple times. For instance, x1 = x2 = 10 is counted both when x1 ≥ 10 and when x2 ≥ 10. We need to account for these configurations: consider two digits greater than or equal to 10, say xj ≥ 10 and xk ≥ 10 with j ≠ k. Let xj = 10 + yj, xk = 10 + yk, and xi = yi for all i ≠ j, k. Then the number of ways to distribute 23 amongst the xi with these two digits at least 10 equals the number of ways to distribute the yi with ∑_{i=1}^6 yi = 23 − 10 − 10 = 3. There are C(8, 5) ways to distribute these yi, and there are C(6, 2) ways to choose the two digits that are greater than or equal to 10.


Since the xi's must sum to 23, at most two of them can be greater than or equal to 10, so the correction stops here.

Thus there are C(28, 5) − 6 C(18, 5) + C(6, 2) C(8, 5) numbers between 1 and 999999 whose digits sum up to 23. The probability that a randomly chosen number has digits that sum up to 23 is

[C(28, 5) − 6 C(18, 5) + C(6, 2) C(8, 5)] / 999999.
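The count can be verified both from the formula and by brute force (an illustrative sketch, not part of the original text):

```python
from math import comb

count = comb(28, 5) - 6 * comb(18, 5) + comb(6, 2) * comb(8, 5)
print(count, count / 999999)

# Brute-force cross-check over 1..999999, padding each number to six digits.
brute = sum(1 for m in range(1, 1_000_000) if sum(map(int, f"{m:06d}")) == 23)
print(brute)
```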

Example 2.7.9. Let A1, A2, . . . , An, n ≥ 2 be events. Prove that

P(∪_{i=1}^n Ai) = ∑_i P(Ai) − ∑_{i<j} P(Ai ∩ Aj) + ∑_{i<j<k} P(Ai ∩ Aj ∩ Ak) − · · · + (−1)^{n+1} P(A1 ∩ A2 ∩ · · · ∩ An).

We prove the result by induction on n.

First consider the base case n = 2: P(A1 ∪ A2) = P(A1) + P(A2) − P(A1 ∩ A2).

Assume the result holds for n; we prove it for n + 1:

P(∪_{i=1}^{n+1} Ai) = P(∪_{i=1}^n Ai) + P(An+1) − P((∪_{i=1}^n Ai) ∩ An+1)

= P(∪_{i=1}^n Ai) + P(An+1) − P(∪_{i=1}^n (Ai ∩ An+1))

= ∑_i P(Ai) − ∑_{i<j} P(Ai ∩ Aj) + ∑_{i<j<k} P(Ai ∩ Aj ∩ Ak) − · · · + (−1)^{n+1} P(A1 ∩ A2 ∩ · · · ∩ An)

+ P(An+1) − ( ∑_i P(Ai ∩ An+1) − ∑_{i<j} P(Ai ∩ Aj ∩ An+1) + ∑_{i<j<k} P(Ai ∩ Aj ∩ Ak ∩ An+1) − · · · + (−1)^{n+1} P(A1 ∩ A2 ∩ · · · ∩ An ∩ An+1) )

= ∑_i P(Ai) − ∑_{i<j} P(Ai ∩ Aj) + ∑_{i<j<k} P(Ai ∩ Aj ∩ Ak) − · · · + (−1)^{n+2} P(A1 ∩ A2 ∩ · · · ∩ An+1),

where the sums in the last expression run over indices in {1, . . . , n + 1}.

Example 2.7.10. Let An, n ≥ 1 be a collection of events in some probability space (Ω, F, P). Assume that ∑_{n=1}^∞ P(An) < ∞. Show that the probability that infinitely many of those events occur is zero. This result is known as the Borel-Cantelli Lemma.

To prove this result we must write the event "infinitely many of the events An occur"

1.4 Functions of a random variable

Recall that a random variable X on a probability space (Ω, F, P) is a function mapping Ω to the real line R, satisfying the condition {ω : X(ω) ≤ a} ∈ F for all a ∈ R. Suppose g is a function mapping R to R that is not too bizarre. Specifically, suppose for any constant c that {x : g(x) ≤ c} is a Borel subset of R. Let Y(ω) = g(X(ω)). Then Y maps Ω to R and Y is a random variable. See Figure 1.6. We write Y = g(X).


Figure 1.6: A function of a random variable as a composition of mappings.

Often we'd like to compute the distribution of Y from knowledge of g and the distribution of X. In case X is a continuous random variable with known distribution, the following three step procedure works well:

(1) Examine the ranges of possible values of X and Y. Sketch the function g.

(2) Find the CDF of Y, using FY(c) = P{Y ≤ c} = P{g(X) ≤ c}. The idea is to express the event {g(X) ≤ c} as {X ∈ A} for some set A depending on c.

(3) If FY has a piecewise continuous derivative, and if the pdf fY is desired, differentiate FY.

If instead X is a discrete random variable then step 1 should be followed. After that the pmf of Y can be found from the pmf of X using

pY(y) = P{g(X) = y} = ∑_{x: g(x) = y} pX(x).

Example 1.4 Suppose X is a N(µ = 2, σ² = 3) random variable (see Section 1.6 for the definition) and Y = X². Let us describe the density of Y. Note that Y = g(X) where g(x) = x². The support of the distribution of X is the whole real line, and the range of g over this support is R+. Next we find the CDF, FY. Since P{Y ≥ 0} = 1, FY(c) = 0 for c < 0. For c ≥ 0,

FY(c) = P{X² ≤ c} = P{−√c ≤ X ≤ √c}
= P{(−√c − 2)/√3 ≤ (X − 2)/√3 ≤ (√c − 2)/√3}
= Φ((√c − 2)/√3) − Φ((−√c − 2)/√3).

Differentiate with respect to c, using the chain rule and the fact Φ′(s) = (1/√(2π)) exp(−s²/2), to obtain

fY(c) = (1/√(24πc)) { exp(−[(√c − 2)/√6]²) + exp(−[(−√c − 2)/√6]²) } if c ≥ 0, and fY(c) = 0 if c < 0.   (1.7)


Example 1.5 Suppose a vehicle is traveling in a straight line at speed a, and that a random direction is selected, subtending an angle Θ from the direction of travel which is uniformly distributed over the interval [0, π]. See Figure 1.7. Then the effective speed of the vehicle in the random direction is B = a cos(Θ). Let us find the pdf of B.

Figure 1.7: Direction of travel and a random direction.

The range of a cos(Θ) as θ ranges over [0, π] is the interval [−a, a]. Therefore, FB(c) = 0 for c ≤ −a and FB(c) = 1 for c ≥ a. Let now −a < c < a. Then, because cos is monotone nonincreasing on the interval [0, π],

FB(c) = P{a cos(Θ) ≤ c} = P{cos(Θ) ≤ c/a} = P{Θ ≥ cos⁻¹(c/a)} = 1 − cos⁻¹(c/a)/π.

Therefore, because cos⁻¹(y) has derivative −(1 − y²)^{−1/2},

fB(c) = 1/(π√(a² − c²)) for |c| < a, and 0 for |c| > a.

A sketch of the density is given in Figure 1.8.

−a a

fB

0

Figure 1.8: The pdf of the effective speed in a uniformly distributed direction.


Figure 1.9: A horizontal line, a fixed point at unit distance, and a line through the point with random direction.

Example 1.6 Suppose Y = tan(Θ), as illustrated in Figure 1.9, where Θ is uniformly distributed over the interval (−π/2, π/2). Let us find the pdf of Y. The function tan(θ) increases from −∞ to ∞ as θ ranges over the interval (−π/2, π/2). For any real c,

FY(c) = P{Y ≤ c} = P{tan(Θ) ≤ c} = P{Θ ≤ tan⁻¹(c)} = (tan⁻¹(c) + π/2)/π

Differentiating the CDF with respect to c yields that Y has the Cauchy pdf:

fY(c) = 1/(π(1 + c²)),  −∞ < c < ∞
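The Cauchy conclusion can be checked by simulation (an added sketch, not from the notes): the CDF (tan⁻¹(c) + π/2)/π puts the quartiles of Y at c = −1, 0, +1.

```python
import numpy as np

# Y = tan(Theta), Theta ~ Uniform(-pi/2, pi/2), should be standard Cauchy,
# so the 25%, 50%, and 75% quantiles sit at -1, 0, and +1 respectively.
rng = np.random.default_rng(1)
theta = rng.uniform(-np.pi / 2, np.pi / 2, 500_000)
y = np.tan(theta)

q25, q50, q75 = np.quantile(y, [0.25, 0.5, 0.75])
print(q25, q50, q75)
```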

Example 1.7 Given an angle θ expressed in radians, let (θ mod 2π) denote the equivalent angle in the interval [0, 2π). Thus, (θ mod 2π) is equal to θ + 2πn, where the integer n is such that 0 ≤ θ + 2πn < 2π.

Let Θ be uniformly distributed over [0, 2π], let h be a constant, and let

Θ̃ = (Θ + h mod 2π)

Let us find the distribution of Θ̃. Clearly Θ̃ takes values in the interval [0, 2π], so fix c with 0 ≤ c < 2π and seek to find P{Θ̃ ≤ c}. Let A denote the interval [h, h + 2π]. Thus, Θ + h is uniformly distributed over A. Let B = ⋃ₙ [2πn, 2πn + c]. Thus Θ̃ ≤ c if and only if Θ + h ∈ B. Therefore,

P{Θ̃ ≤ c} = ∫_{A∩B} (1/2π) dθ

By sketching the set B, it is easy to see that A ∩ B is either a single interval of length c, or the union of two intervals with lengths adding to c. Therefore, P{Θ̃ ≤ c} = c/2π, so that Θ̃ is itself uniformly distributed over [0, 2π].

Example 1.8 Let X be an exponentially distributed random variable with parameter λ. Let Y = ⌊X⌋, which is the integer part of X, and let R = X − ⌊X⌋, which is the remainder. We shall describe the distributions of Y and R.
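The text stops before carrying out Example 1.8. As an aside (a standard result, stated here as a check rather than the notes’ own derivation): Y is geometric with P{Y = k} = (1 − e^(−λ))e^(−λk), R has the truncated-exponential CDF (1 − e^(−λr))/(1 − e^(−λ)) on [0, 1), and Y and R are independent. A simulation is consistent with this (the value λ = 1.3 is an arbitrary choice):

```python
import numpy as np

# X ~ Exp(lam); split X into its integer part Y and fractional part R.
lam = 1.3
rng = np.random.default_rng(2)
x = rng.exponential(1 / lam, 400_000)   # scale = 1/lam gives rate lam
y_part = np.floor(x)
r_part = x - y_part

# Compare empirical values with the claimed distributions.
p0_mc = np.mean(y_part == 0)                     # P{Y = 0}
p0 = 1 - np.exp(-lam)                            # (1 - e^-lam) e^0
Fr_half_mc = np.mean(r_part <= 0.5)              # P{R <= 1/2}
Fr_half = (1 - np.exp(-lam * 0.5)) / (1 - np.exp(-lam))
print(p0_mc, p0, Fr_half_mc, Fr_half)
```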


Proposition 1.10.1 Under the above assumptions, Y is a continuous type random vector and for y in the range of g:

fY(y) = fX(x) / |∂y/∂x (x)| = fX(x) |∂x/∂y (y)|

Example 1.10 Let U, V have the joint pdf:

fUV(u, v) = { u + v   0 ≤ u, v ≤ 1
            { 0       else

and let X = U² and Y = U(1 + V). Let’s find the pdf fXY. The vector (U, V) in the u–v plane is transformed into the vector (X, Y) in the x–y plane under a mapping g that maps u, v to x = u² and y = u(1 + v). The image in the x–y plane of the square [0, 1]² in the u–v plane is the set A given by

A = {(x, y) : 0 ≤ x ≤ 1 and √x ≤ y ≤ 2√x}

See Figure 1.12. The mapping from the square is one to one, for if (x, y) ∈ A then (u, v) can be

Figure 1.12: Transformation from the u–v plane to the x–y plane.

recovered by u = √x and v = y/√x − 1. The Jacobian determinant is

| ∂x/∂u  ∂x/∂v |   | 2u      0 |
| ∂y/∂u  ∂y/∂v | = | 1 + v   u | = 2u²

Therefore, using the transformation formula and expressing u and v in terms of x and y yields

fXY(x, y) = { (√x + (y/√x − 1)) / (2x)   if (x, y) ∈ A
            { 0                          else
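A numerical check (an added sketch, not part of the notes) that the derived fXY integrates to 1 over A:

```python
import numpy as np

# Check that f_XY(x, y) = (sqrt(x) + y/sqrt(x) - 1)/(2x) integrates to 1
# over A = {0 <= x <= 1, sqrt(x) <= y <= 2 sqrt(x)}.
def trapezoid(y, x):
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def inner(x):
    # integral over y in [sqrt(x), 2 sqrt(x)] for fixed x
    y = np.linspace(np.sqrt(x), 2 * np.sqrt(x), 801)
    f = (np.sqrt(x) + y / np.sqrt(x) - 1) / (2 * x)
    return trapezoid(f, y)

# Substitute x = t^2 (dx = 2t dt) so the outer integrand, which behaves
# like 1/sqrt(x) near the origin, becomes smooth.
t = np.linspace(1e-3, 1.0, 2001)
total = trapezoid(np.array([inner(ti ** 2) * 2 * ti for ti in t]), t)
print(total)
```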

Example 1.11 Let U and V be independent continuous type random variables. Let X = U + V and Y = V. Let us find the joint density of X, Y and the marginal density of X. The mapping

g : (u, v) → (x, y) = (u + v, v)

is invertible, with inverse given by u = x − y and v = y. The absolute value of the Jacobian determinant is given by

| ∂x/∂u  ∂x/∂v |   | 1  1 |
| ∂y/∂u  ∂y/∂v | = | 0  1 | = 1

Therefore

fXY(x, y) = fUV(u, v) = fU(x − y) fV(y)

The marginal density of X is given by

fX(x) = ∫_{−∞}^{∞} fXY(x, y) dy = ∫_{−∞}^{∞} fU(x − y) fV(y) dy

That is, fX = fU ∗ fV.
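The identity fX = fU ∗ fV can be illustrated numerically (an added sketch): convolving two Uniform[0, 1] densities yields the triangular density on [0, 2] with peak 1 at x = 1.

```python
import numpy as np

# Discrete approximation of the convolution integral f_X = f_U * f_V.
h = 1e-3
t = np.arange(0.0, 1.0 + h, h)
fU = np.ones_like(t)           # density of U ~ Uniform[0, 1]
fV = np.ones_like(t)           # density of V ~ Uniform[0, 1]
fX = np.convolve(fU, fV) * h   # Riemann-sum approximation of the integral
x = np.arange(len(fX)) * h     # support grid for X on [0, 2]

peak = fX[np.argmin(np.abs(x - 1.0))]
mass = float(np.sum(0.5 * (fX[1:] + fX[:-1]) * h))
print(peak, mass)
```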

Example 1.12 Let X₁ and X₂ be independent N(0, σ²) random variables, and let X = (X₁, X₂)ᵀ denote the two-dimensional random vector with coordinates X₁ and X₂. Any point x ∈ R² can be represented in polar coordinates by the vector (r, θ)ᵀ such that r = ‖x‖ = (x₁² + x₂²)^(1/2) and θ = tan⁻¹(x₂/x₁), with values r ≥ 0 and 0 ≤ θ < 2π. The inverse of this mapping is given by

x₁ = r cos(θ)
x₂ = r sin(θ)

We endeavor to find the pdf of the random vector (R, Θ)ᵀ, the polar coordinates of X. The pdf of X is given by

fX(x) = fX₁(x₁) fX₂(x₂) = (1/2πσ²) e^(−r²/2σ²)

The range of the mapping is the set r > 0 and 0 ≤ θ < 2π. On the range,

| ∂x/∂(r, θ) | = | ∂x₁/∂r  ∂x₁/∂θ |   | cos(θ)  −r sin(θ) |
                | ∂x₂/∂r  ∂x₂/∂θ | = | sin(θ)   r cos(θ) | = r

Therefore for (r, θ)ᵀ in the range of the mapping,

fR,Θ(r, θ) = fX(x) |∂x/∂(r, θ)| = (r/2πσ²) e^(−r²/2σ²)

Of course fR,Θ(r, θ) = 0 off the range of the mapping. The joint density factors into a function of r and a function of θ, so R and Θ are independent. Moreover, R has the Rayleigh density with parameter σ², and Θ is uniformly distributed on [0, 2π].
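A simulation consistent with the Rayleigh/uniform factorization (an added check; σ = 1.7 is an arbitrary choice), using the Rayleigh mean E{R} = σ√(π/2):

```python
import numpy as np

# R = sqrt(X1^2 + X2^2), Theta = angle(X1, X2) for X1, X2 ~ N(0, sigma^2) iid.
sigma = 1.7
rng = np.random.default_rng(3)
x1 = sigma * rng.standard_normal(300_000)
x2 = sigma * rng.standard_normal(300_000)
r = np.hypot(x1, x2)
theta = np.mod(np.arctan2(x2, x1), 2 * np.pi)

mean_r = r.mean()                    # should be sigma*sqrt(pi/2)
mean_theta = theta.mean()            # uniform on [0, 2pi) => mean pi
print(mean_r, sigma * np.sqrt(np.pi / 2), mean_theta)
```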


ELEG–636 Homework #1, Spring 2003

1. [Problem statement and solution illegible in the source.]

2. Express the density fy(y) of the RV y in terms of fx(x) if (a) y = |x|; (b) y = e⁻ˣ.

Answer: (a) For y ≥ 0,

Fy(y) = P{|x| ≤ y} = P{−y ≤ x ≤ y} = Fx(y) − Fx(−y)

so that

fy(y) = [fx(y) + fx(−y)] U(y)

(b) For y > 0,

Fy(y) = P{e⁻ˣ ≤ y} = P{x ≥ −ln y} = 1 − Fx(−ln y)

so that

fy(y) = (1/y) fx(−ln y),  y > 0

3. The RVs x and y are independent with exponential densities

fx(x) = αe^(−αx)U(x) and fy(y) = βe^(−βy)U(y)

Find the densities of the following RVs: (1) x + y; (2) x/y.

Answer: (1) Let z = x + y. Since x and y are independent,

fz(z) = ∫ fx(z − v)fy(v) dv = ∫₀ᶻ αe^(−α(z−v)) βe^(−βv) dv = (αβ/(α − β)) (e^(−βz) − e^(−αz)) U(z),  α ≠ β

(2) Let w = x/y. Then

Fw(w) = P{x ≤ wy} = ∫₀^∞ Fx(wv) fy(v) dv = ∫₀^∞ (1 − e^(−αwv)) βe^(−βv) dv = 1 − β/(αw + β),  w ≥ 0

so that

fw(w) = αβ/(αw + β)² U(w)

4. The RVs x and y are i.i.d. N(0, σ²). Show that, if z = x − y, then E{z²} = 2σ² and E{|z|} = 2σ/√π.

Answer: Since x and y are independent Gaussians, z = x − y is also Gaussian, with

E{z} = E{x} − E{y} = 0 and E{z²} = E{x²} + E{y²} = 2σ²

So

fz(z) = (1/(2σ√π)) e^(−z²/4σ²)

Thus

E{|z|} = ∫ |z| fz(z) dz = 2 ∫₀^∞ z (1/(2σ√π)) e^(−z²/4σ²) dz = 2σ/√π

5. Use the moment generating function to show that a linear transformation of a Gaussian random vector is also Gaussian.

Proof: Let x be an n × 1 real Gaussian random vector, with density

fx(x) = (1/((2π)^(n/2) |Cx|^(1/2))) exp(−(1/2)(x − mx)ᵀ Cx⁻¹ (x − mx))

Let s be an n × 1 real vector; then the moment generating function of x is

Φx(s) = E{e^(sᵀx)} = exp(mxᵀ s + (1/2) sᵀ Cx s)

Let y = Ax be a linear transform of x. Then my = Amx and Cy = A Cx Aᵀ. The moment generating function of y is

Φy(s) = E{e^(sᵀy)} = E{e^(sᵀAx)} = E{e^((Aᵀs)ᵀx)} = exp(mxᵀ Aᵀ s + (1/2) sᵀ A Cx Aᵀ s) = exp(myᵀ s + (1/2) sᵀ Cy s)

which has the same (Gaussian) form. So y is also Gaussian, with mean Amx and covariance ACxAᵀ.

6. Let xᵢ(n), i = 1, …, 4, be four IID random variables with exponential distribution with α = 1. Let sₖ(n) = Σᵢ₌₁ᵏ xᵢ(n).

(a) Determine and plot the pdf of s₂(n).
(b) Determine and plot the pdf of s₃(n).
(c) Determine and plot the pdf of s₄(n).
(d) Compare the pdf of s₄(n) with that of the Gaussian density.

Answer: The characteristic function of fx(x) = e⁻ˣU(x) is

Φ(ω) = 1/(1 − jω)

Since the xᵢ(n) are i.i.d.,

Φ_{sₖ}(ω) = (1/(1 − jω))ᵏ

whose inverse Fourier transform yields the pdf of sₖ(n):

f_{sₖ}(x) = (x^(k−1)/(k − 1)!) e⁻ˣ U(x)

This expression holds for any positive integer k, including k = 2, 3, 4.

7. The mean and covariance of a Gaussian random vector x are given, respectively, by a mean vector mx and covariance matrix Cx [numerical values illegible in the source]. Plot the 1σ, 2σ, and 3σ concentration ellipses representing the contours of the density function in the (x₁, x₂) plane.

Answer: The cσ concentration ellipse is the level set (x − mx)ᵀ Cx⁻¹ (x − mx) = c², c = 1, 2, 3. Diagonalizing Cx = QΛQᵀ rotates the coordinates onto the principal axes of the ellipse, whose semi-axis lengths are c√λ₁ and c√λ₂; each point computed in the rotated frame is then mapped back by x = Qx̃ + mx. The three ellipses share the same orientation and differ only in scale. [Numerical details illegible in the source.]
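The sum-of-exponentials density in problem 6 implies that sₖ has mean k and variance k; a quick check for k = 4 (an added Python sketch, not part of the homework):

```python
import numpy as np
from math import factorial

# Sum of k iid Exp(1) variables has pdf x^(k-1) e^(-x) / (k-1)!  (Erlang).
k = 4
rng = np.random.default_rng(4)
s = rng.exponential(1.0, size=(200_000, k)).sum(axis=1)
mean_s, var_s = s.mean(), s.var()

# The Erlang pdf should also integrate to 1.
x = np.linspace(0, 40, 100_000)
pdf = x ** (k - 1) * np.exp(-x) / factorial(k - 1)
mass = float(np.sum(0.5 * (pdf[1:] + pdf[:-1]) * np.diff(x)))
print(mean_s, var_s, mass)
```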

ELEG–636 Test #1, March 25, 1999 NAME:

1. (35 pts) Let y = min{|x₁|, x₂} where x₁ and x₂ are i.i.d. inputs with cdf and pdf Fx(·) and fx(·), respectively. For simplicity, assume fx(·) is symmetric about 0, i.e., fx(−x) = fx(x). Determine the cdf and pdf of y in terms of the distribution of the inputs. Plot the pdf of y for fx(·) uniform on [−1, 1].

Note that

F_|x|(x) = { Fx(x) − Fx(−x)   for x ≥ 0
           { 0                otherwise

Also

F_min{x₁,x₂}(x) = 1 − P{x₁ > x}P{x₂ > x} = 1 − (1 − Fx₁(x))(1 − Fx₂(x))

Thus,

Fy(y) = 1 − (1 − F_|x₁|(y))(1 − Fx₂(y))
      = { 1 − (1 − Fx(y) + Fx(−y))(1 − Fx(y))           for y ≥ 0
        { 1 − (1 − Fx(y))                                otherwise
      = { 2Fx(y) − Fx(−y) − Fx²(y) + Fx(y)Fx(−y)         for y ≥ 0
        { Fx(y)                                          otherwise

If fx(·) is symmetric about 0, then fx(−x) = fx(x) and Fx(−x) = 1 − Fx(x), giving

Fy(y) = { 2Fx(y) − (1 − Fx(y)) − Fx²(y) + Fx(y)(1 − Fx(y))   for y ≥ 0
        { Fx(y)                                              otherwise
      = { 4Fx(y) − 2Fx²(y) − 1                               for y ≥ 0
        { Fx(y)                                              otherwise

Taking the derivative,

fy(y) = { 4fx(y) − 4fx(y)Fx(y)   for y ≥ 0     = { 4fx(y)(1 − Fx(y))   for y ≥ 0
        { fx(y)                  otherwise       { fx(y)               otherwise
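For the uniform input on [−1, 1], the derived pdf gives fy(y) = 1/2 on [−1, 0) and fy(y) = 4fx(y)(1 − Fx(y)) = 1 − y on [0, 1], hence P{y < 0} = 1/2 and E{y} = −1/4 + 1/6 = −1/12. A simulation agrees (an added sketch):

```python
import numpy as np

# y = min(|x1|, x2) for x1, x2 iid Uniform[-1, 1].
rng = np.random.default_rng(5)
x1 = rng.uniform(-1, 1, 400_000)
x2 = rng.uniform(-1, 1, 400_000)
y = np.minimum(np.abs(x1), x2)

p_neg = np.mean(y < 0)    # predicted: 1/2
mean_y = y.mean()         # predicted: -1/12
print(p_neg, mean_y)
```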

2. (35 pts) Consider the observed samples

yᵢ = θ + xᵢ

for i = 1, 2, …, N. We wish to estimate the location parameter θ using a maximum likelihood estimator operating on the observations y₁, y₂, …, y_N. Consider two cases:

(10 pts) The xᵢ terms are i.i.d. with distribution xᵢ ∼ N(0, σ²), for i = 1, 2, …, N.

(10 pts) The xᵢ terms are independent with distribution xᵢ ∼ N(0, σᵢ²), for i = 1, 2, …, N.

(15 pts) Are the estimates unbiased? What is the variance of the estimates? Are they consistent?

f_{y|θ}(y|θ) = ∏ᵢ₌₁ᴺ (1/√(2πσ²)) e^(−(yᵢ−θ)²/2σ²) = (1/2πσ²)^(N/2) e^(−Σᵢ₌₁ᴺ (yᵢ−θ)²/2σ²)

Thus,

θ̂_ML = argmax_θ − Σᵢ₌₁ᴺ (yᵢ − θ)²/2σ²

and taking the derivative,

Σᵢ₌₁ᴺ (yᵢ − θ̂_ML)/σ² = 0  ⇒  θ̂_ML = (1/N) Σᵢ₌₁ᴺ yᵢ

For the case of changing variances,

Σᵢ₌₁ᴺ (yᵢ − θ̂_ML)/σᵢ² = 0  ⇒  θ̂_ML = (Σᵢ₌₁ᴺ yᵢ/σᵢ²) / (Σᵢ₌₁ᴺ 1/σᵢ²)

i.e.,

θ̂_ML = Σᵢ₌₁ᴺ wᵢyᵢ / Σᵢ₌₁ᴺ wᵢ

which is a normalized filter, where wᵢ = 1/σᵢ² for i = 1, 2, …, N.

For each estimate E{θ̂_ML} = θ, and they are thus unbiased.

var(θ̂_ML)[N] = E{(θ̂_ML − θ)²} = E{(Σᵢ wᵢyᵢ/Σᵢ wᵢ − θ)²} = E{(Σᵢ wᵢxᵢ/Σᵢ wᵢ)²}
             = E{Σᵢ Σⱼ wᵢxᵢxⱼwⱼ}/(Σᵢ wᵢ)² = Σᵢ wᵢ²σᵢ²/(Σᵢ wᵢ)² = Σᵢ wᵢ/(Σᵢ wᵢ)² = 1/Σᵢ₌₁ᴺ wᵢ

Since wᵢ > 0, we have var(θ̂_ML)[N + 1] < var(θ̂_ML)[N]. This, combined with the fact that the estimator is unbiased, means the estimate is consistent.
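The weighted-mean estimator and its variance 1/Σwᵢ can be checked by simulation (an added sketch; the σᵢ values are arbitrary choices):

```python
import numpy as np

# ML location estimate under heteroscedastic Gaussian noise is the weighted
# mean with w_i = 1/sigma_i^2; its variance is 1/sum(w_i).
theta = 3.0
sig = np.array([0.5, 1.0, 1.0, 2.0, 4.0])   # noise standard deviations
w = 1.0 / sig ** 2

rng = np.random.default_rng(6)
trials = 200_000
y = theta + sig * rng.standard_normal((trials, sig.size))

th_ml = (y * w).sum(axis=1) / w.sum()       # weighted (ML) estimate
th_avg = y.mean(axis=1)                     # plain sample mean, for contrast

var_ml, var_avg = th_ml.var(), th_avg.var()
print(th_ml.mean(), var_ml, 1 / w.sum(), var_avg)
```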

ELEG636 Test #1, March 23, 2000 NAME:

1. (30 pts) The random variables x and y are independent and uniformly distributed on the interval [0, 1]. Determine the conditional distribution f_{r|A}(r|A) where r = √(x² + y²) and A = {r ≤ 1}.

Answer:

Examine the joint density f_{x,y}(x, y) in the x–y plane. Since x and y are independent,

f_{x,y}(x, y) = fx(x)fy(y) = 1 for 0 ≤ x, y ≤ 1

This defines a uniform density over the region 0 ≤ x, y ≤ 1 in the first quadrant of the x–y plane. Note that r = √(x² + y²) defines an arc in the first quadrant. Also, if 0 ≤ r ≤ 1 the area under the uniform density up to radius r is simply given by

Fr(r) = Pr[√(x² + y²) ≤ r] = ∫∫_{√(x²+y²)≤r} f_{x,y}(x, y) dx dy = ∫∫_{√(x²+y²)≤r} 1 dx dy = πr²/4 for 0 ≤ r ≤ 1

Then for A = {r ≤ 1},

F_{r|A}(r|A) = F_{r,A}(r, A)/Pr[A] = Fr(r)/Fr(1) = (πr²/4)/(π/4) = r² for 0 ≤ r ≤ 1

Thus, f_{r|A}(r|A) = 2r for 0 ≤ r ≤ 1 and 0 elsewhere.
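The result f_{r|A}(r|A) = 2r implies E[r|A] = 2/3 and Pr(r ≤ 1/2 | A) = (1/2)² = 1/4, which a simulation reproduces (an added sketch):

```python
import numpy as np

# x, y ~ Uniform[0, 1]; condition the radius r on the event A = {r <= 1}.
rng = np.random.default_rng(7)
x = rng.uniform(0, 1, 600_000)
y = rng.uniform(0, 1, 600_000)
r = np.hypot(x, y)
rA = r[r <= 1]

mean_rA = rA.mean()             # predicted: 2/3
p_half = np.mean(rA <= 0.5)     # predicted: 1/4
print(mean_rA, p_half)
```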

ELEG636 Test #1, April 5, 2001 NAME:

1. (35 pts) Probability questions:

(10 pts) Let x be a random variable and set y = x². Derive a simplified expression for f(y|x ≥ 0).

(15 pts) Suppose now that y = a sin(x + θ), where θ and a > 0 are constants. Determine fy(y).

(10 pts) Suppose further that x is uniformly distributed over [−π, π]. Determine fy(y) for this special case.

Answer: Clearly, F(y|x ≥ 0) = 0 for y < 0. Then for y ≥ 0,

F(y|x ≥ 0) = Pr(Y ≤ y, X ≥ 0)/Pr(X ≥ 0) = ((Fx(√y) − Fx(0))/(1 − Fx(0))) U(y).

Thus

f(y|x ≥ 0) = (fx(√y)/(2√y (1 − Fx(0)))) U(y).

Now for y = g(x) = a sin(x + θ) we have, assuming |y| ≤ a, infinitely many solutions xₙ with sin(xₙ + θ) = y/a, n = 0, ±1, ±2, …. Also,

g′(xₙ) = a cos(xₙ + θ)

Note that g²(xₙ) + g′²(xₙ) = a² sin²(xₙ + θ) + a² cos²(xₙ + θ) = a². Or,

|g′(xₙ)| = √(a² − g²(xₙ)) = √(a² − y²).

Thus

fy(y) = Σₙ fx(xₙ)/|g′(xₙ)| = (1/√(a² − y²)) Σₙ fx(xₙ),  |y| ≤ a

If x ∼ U(−π, π), then x + θ ranges over a full period of the sine, so exactly two solutions xₙ fall in the interval, each with fx(xₙ) = 1/2π, and

fy(y) = 1/(π√(a² − y²)),  |y| ≤ a

ELEG–636 Test #1, April 14, 2003 NAME:

1. (30 pts) Probability questions:

(15 pts) Let x be a random variable with density fx(x) given below. Let y = g(x) be the shown function. Determine fy(y) and Fy(y).

(15 pts) Let x and y be independent, zero mean, unit variance Gaussian random variables. Define

w = x² + y² and z = x².

Determine f_{w,z}(w, z). Are w and z independent?

Answer: Note that

fx(x) = { (1/4)x + (1/2)δ(x − 0.5)   0 ≤ x < 2
        { 0                          otherwise

Thus

Fx(x) = { 0                            x < 0
        { (1/8)x² + (1/2)u(x − 0.5)   0 ≤ x < 2
        { 1                            2 ≤ x

Since x = √y for 0 ≤ y ≤ 1,

Fy(y) = { 0          y < 0
        { Fx(√y)     0 ≤ y < 1
        { 1          1 ≤ y
      = { 0                                y < 0
        { (1/8)y + (1/2)u(√y − 0.5)       0 ≤ y < 1
        { 1                                1 ≤ y
      = { 0                                y < 0
        { (1/8)y + (1/2)u(y − 0.25)       0 ≤ y < 1
        { 1                                1 ≤ y

Taking the derivative yields

fy(y) = { 1/8 + (1/2)δ(y − 0.25) + (3/8)δ(y − 1)   0 ≤ y ≤ 1
        { 0                                         otherwise

The Jacobian of the transformation is

J(x, y) = | d(x²+y²)/dx  d(x²+y²)/dy |   | 2x  2y |
          | d(x²)/dx     d(x²)/dy    | = | 2x   0 | = 4|xy|

The reverse transformation is easily seen to be x = ±√z and y = ±√(w − x²) = ±√(w − z), w ≥ z. Thus,

f_{w,z}(w, z) = f_{x,y}(x, y)/(4|xy|) |_{x=√z, y=√(w−z)}  + f_{x,y}(x, y)/(4|xy|) |_{x=√z, y=−√(w−z)}
              + f_{x,y}(x, y)/(4|xy|) |_{x=−√z, y=√(w−z)} + f_{x,y}(x, y)/(4|xy|) |_{x=−√z, y=−√(w−z)}   (1)

Since x and y are independent,

f_{x,y}(x, y) = (1/2π) e^(−(x²+y²)/2)

Thus

f_{w,z}(w, z) = (1/(2π√z √(w − z))) e^(−w/2) u(w)u(z)u(w − z)

where the last three terms indicate w, z ≥ 0 and w ≥ z.
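As an added numerical check: w is chi-square with 2 degrees of freedom (mean 2), z is chi-square with 1 (mean 1), and every sample satisfies w ≥ z, the support constraint behind the dependence argument.

```python
import numpy as np

# w = x^2 + y^2 and z = x^2 with x, y iid N(0, 1).
rng = np.random.default_rng(8)
x = rng.standard_normal(400_000)
y = rng.standard_normal(400_000)
w = x ** 2 + y ** 2
z = x ** 2

all_ordered = bool(np.all(w >= z))   # w >= z holds for every sample
print(w.mean(), z.mean(), all_ordered)
```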

ELEG–636 Midterm, April 7, 2009 NAME:

1. [30 pts] Probability:

(a) [15 pts] Prove the Bienaymé inequality, which is a generalization of the Tchebycheff inequality,

Pr{|X − a| ≥ ε} ≤ E{|X − a|ⁿ}/εⁿ

for arbitrary a and distribution of X.

(b) [15 pts] Consider the uniform distribution over [−1, 1].

i. [10 pts] Determine the moment generating function for this distribution.
ii. [5 pts] Use the moment generating function to generate a simple expression for the k′th moment, mₖ.

Answer:

(a)

E{|x − a|ⁿ} = ∫_{−∞}^{∞} |x − a|ⁿ fx(x) dx ≥ ∫_{|x−a|≥ε} |x − a|ⁿ fx(x) dx ≥ ∫_{|x−a|≥ε} εⁿ fx(x) dx
            = εⁿ Pr{|x − a| ≥ ε}  ⇒  Pr{|X − a| ≥ ε} ≤ E{|X − a|ⁿ}/εⁿ

(b)

Φ(s) = (1/2) ∫_{−1}^{1} e^(sx) dx = { (1/2s)(eˢ − e⁻ˢ)   s ≠ 0
                                    { 1                  s = 0

⇒ E{xᵏ} = dᵏΦ(s)/dsᵏ |_{s=0}

E{x} = dΦ(s)/ds |_{s=0} = [(1/2s)(eˢ + e⁻ˢ) − (1/2s²)(eˢ − e⁻ˢ)]|_{s=0} = 0

where the limit as s → 0 is evaluated by l’Hôpital’s rule. Repeat the differentiation and limit (l’Hôpital’s rule) process for the higher moments. The analytical solution is simpler:

E{xᵏ} = (1/2) ∫_{−1}^{1} xᵏ dx = (1 − (−1)ᵏ⁺¹)/(2(k + 1)) = { 0           k = 1, 3, 5, …
                                                            { 1/(k + 1)   k = 0, 2, 4, …

3. [35 pts] Let Z = X + N, where X and N are independent with distributions N ∼ N(0, σ²_N) and fX(x) = (1/2)δ(x − 2) + (1/2)δ(x + 2).

(a) [15 pts] Determine the MAP, MS, MAE, and ML estimates for X in terms of Z.

(b) [10 pts] Determine the bias of each estimate, i.e., determine whether or not each estimate is biased.

(c) [10 pts] Determine the variances of the estimates.

Answer:

(a) Since X and N are independent, fZ(z) = fX(z) ∗ fN(z) = (1/2)N(−2, σ²_N) + (1/2)N(2, σ²_N). Also

f_{Z|X}(z|x) = N(x, σ²_N)

x̂_ML = argmax_x f_{Z|X}(z|x) = z

f_{X|Z}(x|z) = f_{Z|X}(z|x)fX(x)/fZ(z) = N(x, σ²_N)(δ(x − 2) + δ(x + 2))/(2fZ(z))

x̂_MAP = argmax_x f_{X|Z}(x|z) = { 2    z > 0
                                 { −2   z < 0

x̂_MS = ∫_{−∞}^{∞} x f_{X|Z}(x|z) dx = (1/fZ(z)) ∫_{−∞}^{∞} x f_{Z|X}(z|x)fX(x) dx
     = (2N(2, σ²_N)|_{x=z} − 2N(−2, σ²_N)|_{x=z}) / (2fZ(z))
     = 2 (N(2, σ²_N)|_{x=z} − N(−2, σ²_N)|_{x=z}) / (N(2, σ²_N)|_{x=z} + N(−2, σ²_N)|_{x=z})

For the MAE estimate,

1/2 = ∫_{−∞}^{x̂_MAE} f_{X|Z}(x|z) dx = (1/fZ(z)) ∫_{−∞}^{x̂_MAE} f_{Z|X}(z|x)fX(x) dx

⇒ ∫_{−∞}^{x̂_MAE} f_{Z|X}(z|x)fX(x) dx = (1/4)(N(2, σ²_N)|_{x=z} + N(−2, σ²_N)|_{x=z})

⇒ ∫_{−∞}^{x̂_MAE} N(x, σ²_N)(δ(x − 2) + δ(x + 2)) dx = (1/2)(N(2, σ²_N)|_{x=z} + N(−2, σ²_N)|_{x=z})

Note the LHS is not continuous ⇒ x̂_MAE not well defined.

(b) Note fZ(z) is symmetric about 0 ⇒ E{x̂_ML} = E{z} = 0 ⇒ x̂_ML is unbiased (E{x} = 0). Similarly, E{x̂_MAP} = 2Pr{z > 0} − 2Pr{z < 0} = 0 ⇒ x̂_MAP is unbiased. Also, x̂_MS is an odd function (about 0) of z ⇒ E{x̂_MS} = 0 ⇒ x̂_MS is unbiased.

(c) σ²_ML = σ²_Z = σ²_X + σ²_N = 4 + σ²_N. Also, σ²_MAP = 4 (since x̂_MAP = ±2). Determining σ²_MS is not trivial, and will not be considered.
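Dividing the numerator and denominator of x̂_MS by N(−2, σ²_N)|_{x=z} gives the closed form x̂_MS = 2 tanh(2z/σ²_N), a simplification not written out above; a numerical check (σ²_N = 1.5 is an arbitrary choice):

```python
import numpy as np

# Compare the ratio-of-Gaussians form of x_MS with 2*tanh(2z/s2).
def gauss(z, m, s2):
    return np.exp(-(z - m) ** 2 / (2 * s2)) / np.sqrt(2 * np.pi * s2)

s2 = 1.5
z = np.linspace(-6, 6, 1001)
x_ms = 2 * (gauss(z, 2, s2) - gauss(z, -2, s2)) / (gauss(z, 2, s2) + gauss(z, -2, s2))
err = float(np.max(np.abs(x_ms - 2 * np.tanh(2 * z / s2))))
print(err)
```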

ELEG–636 Homework #1, Spring 2009

1. A token is placed at the origin on a piece of graph paper. A coin biased to heads is given, P(H) = 2/3. If the result of a toss is heads, the token is moved one unit to the right, and if it is a tail the token is moved one unit to the left. Repeating this 1200 times, what is the probability that the token is on a unit N, where 350 ≤ N ≤ 450? Simulate the system and plot the histogram using 10,000 realizations.

Solution: Let x = # of heads. Then 350 ≤ x − (1200 − x) ≤ 450 ⇒ 775 ≤ x ≤ 825 and

Pr(775 ≤ x ≤ 825) = Σᵢ₌₇₇₅⁸²⁵ C(1200, i) (2/3)ⁱ (1/3)^(1200−i)

which can be approximated using the DeMoivre–Laplace approximation

Σᵢ₌ᵢ₁^{i₂} C(n, i) pⁱ (1 − p)^(n−i) ≈ Φ((i₂ − np)/√(np(1 − p))) − Φ((i₁ − np)/√(np(1 − p)))

where Φ(x) = ∫_{−∞}^x (1/√(2π)) e^(−t²/2) dt.
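The DeMoivre–Laplace approximation can be compared against the exact binomial sum (an added sketch; probabilities are computed in log space to avoid overflow of the binomial coefficients):

```python
from math import lgamma, log, exp, erf, sqrt

# Exact Pr(775 <= x <= 825) for Binomial(n=1200, p=2/3) vs. normal approximation.
n, p = 1200, 2 / 3

def log_binom_pmf(i):
    return (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
            + i * log(p) + (n - i) * log(1 - p))

exact = sum(exp(log_binom_pmf(i)) for i in range(775, 826))

def Phi(t):
    return 0.5 * (1 + erf(t / sqrt(2)))

mu, sd = n * p, sqrt(n * p * (1 - p))
approx = Phi((825 - mu) / sd) - Phi((775 - mu) / sd)
print(exact, approx)
```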

2. Random variable X is characterized by cdf FX(x) = (1 − e⁻ˣ)U(x) and event C is defined by C = {0.5 < X ≤ 1}. Determine and plot FX(x|C) and fX(x|C).

Solution: Evaluating Pr(X ≤ x, 0.5 < X ≤ 1) for the three cases:

x < 0.5:      Pr(X ≤ x, 0.5 < X ≤ 1) = 0
0.5 ≤ x ≤ 1:  Pr(X ≤ x, 0.5 < X ≤ 1) = FX(x) − FX(0.5) = e⁻⁰·⁵ − e⁻ˣ
x > 1:        Pr(X ≤ x, 0.5 < X ≤ 1) = FX(1) − FX(0.5) = e⁻⁰·⁵ − e⁻¹ = 0.2386

Also, Pr(C) = FX(1) − FX(0.5) = e⁻⁰·⁵ − e⁻¹ = 0.2386. Thus

FX(x|C) = Pr(X ≤ x, 0.5 < X ≤ 1)/Pr(0.5 < X ≤ 1) = { 0                        x < 0.5
                                                    { (e⁻⁰·⁵ − e⁻ˣ)/0.2386    0.5 ≤ x ≤ 1
                                                    { 1                        x > 1

and fX(x|C) = dFX(x|C)/dx = (e⁻ˣ/0.2386)[u(x − 0.5) − u(x − 1)].

3. Prove that the characteristic function for the univariate Gaussian distribution, N(η, σ²), is

φ(ω) = exp(jωη − ω²σ²/2)

Next determine the moment generating function and determine the first four moments.

Solution:

φ(ω) = ∫_{−∞}^{∞} (1/(√(2π)σ)) exp(−(x − η)²/2σ²) e^(jωx) dx
     = ∫_{−∞}^{∞} (1/(√(2π)σ)) exp(−(x² − 2ηx + η² − 2jωσ²x)/2σ²) dx
     = exp((−η² + (η + jωσ²)²)/2σ²) ∫_{−∞}^{∞} (1/(√(2π)σ)) exp(−(x − (η + jωσ²))²/2σ²) dx
     = exp((−η² + (η + jωσ²)²)/2σ²)

which reduces to φ(ω) = exp(jωη − ω²σ²/2). The moment generating function is simply

Φ(s) = exp(sη + s²σ²/2)

and mₖ = dᵏΦ(s)/dsᵏ |_{s=0}, which yields

m₁ = η              m₂ = σ² + η²
m₃ = 3ησ² + η³      m₄ = 3σ⁴ + 6σ²η² + η⁴

4. Let Y = X². Determine fY(y) for:

(a) fX(x) = 0.5 exp(−|x|)
(b) fX(x) = exp(−|x|)U(x)

Solution: Y = X² ⇒ X = ±√y and dY/dX = 2X. Thus

fY(y) = fX(x)/|2x| |_{x=√y} + fX(x)/|2x| |_{x=−√y}

Substituting and simplifying:

fX(x) = 0.5 exp(−|x|)   ⇒ fY(y) = (1/(2√y)) e^(−√y) U(y)
fX(x) = exp(−|x|)U(x)   ⇒ fY(y) = (1/(2√y)) e^(−√y) U(y)

5. Given the joint pdf

f_{XY}(x, y) = { 8xy   0 < y < 1, 0 < x < y
              { 0     otherwise

Determine (a) fX(x), (b) fY(y), (c) fY(y|x), and (d) E[Y|x].

Solution:

(a) fX(x) = ∫_{−∞}^{∞} f_{XY}(x, y) dy = ∫ₓ¹ 8xy dy = { 4x − 4x³   0 < x < 1
                                                       { 0          otherwise

(b) fY(y) = ∫_{−∞}^{∞} f_{XY}(x, y) dx = ∫₀ʸ 8xy dx = { 4y³   0 < y < 1
                                                       { 0     otherwise

(c) fY(y|x) = f_{XY}(x, y)/fX(x) = { 2y/(1 − x²)   x < y < 1
                                   { 0             otherwise

(d) E[Y|x] = ∫_{−∞}^{∞} y fY(y|x) dy = ∫ₓ¹ 2y²/(1 − x²) dy = (2/3)(1 − x³)/(1 − x²) = (2/3)(1 + x + x²)/(1 + x)

6. Let W and Z be RVs defined by

W = X² + Y² and Z = X²

where X and Y are independent; X, Y ∼ N(0, 1).

(a) Determine the joint pdf f_{WZ}(w, z).
(b) Are W and Z independent?

Solution: Given the system of equations,

J = | 2x  2y |
    | 2x   0 | = 4|xy|

Note we must have w, z ≥ 0 and w ≥ z. Thus the inverse system (roots) are

x = ±√z,  y = ±√(w − z).

Thus

f_{WZ}(w, z) = f_{XY}(x, y)/(4|xy|) |_{x=±√z, y=±√(w−z)}   (∗)

Note also that, since X, Y ∼ N(0, 1),

f_{XY}(x, y) = (1/2π) e^(−(x²+y²)/2)   (∗∗)

Substituting (∗∗) into (∗) [which has four terms] and simplifying yields

f_{WZ}(w, z) = (e^(−w/2)/(2π√(z(w − z)))) U(w − z)U(z)   (∗∗∗)

Note W and Z are not independent. Counterexample proof: Suppose W and Z are independent. Then fW(w)fZ(z) > 0 for all w, z > 0. But this violates (∗∗∗), as f_{WZ}(w, z) > 0 only for w ≥ z.

ELEG–636 Homework #2, Spring 2009

1. Let

R = [  2  −2 ]
    [ −2   5 ]

Express R as R = QΩQᴴ, where Ω is diagonal.

Solution:

| 2 − λ   −2    |
| −2      5 − λ | = λ² − 7λ + 6 = 0  ⇒  λ₁ = 6, λ₂ = 1

Then solving Rqᵢ = λᵢqᵢ gives q₁ = (1/√5)[1, −2]ᵀ and q₂ = (1/√5)[2, 1]ᵀ. Thus R = QΩQᴴ where

Q = [q₁, q₂] and Ω = [ 6  0 ]
                     [ 0  1 ]

2. The two-dimensional covariance matrix can be expressed as:

C = [ σ₁²      ρσ₁σ₂ ]
    [ ρ∗σ₁σ₂   σ₂²   ]

(a) Find the simplest expression for the eigenvalues of C.
(b) Specialize the results to the case σ₁² = σ₂² = σ².
(c) What are the eigenvectors in the special case (b) when ρ is real?

Solution:

(a)

| σ₁² − λ   ρσ₁σ₂   |
| ρ∗σ₁σ₂    σ₂² − λ | = λ² − (σ₁² + σ₂²)λ + (1 − |ρ|²)σ₁²σ₂² = 0

⇒ λ = [(σ₁² + σ₂²) ± √(σ₁⁴ + σ₂⁴ − 2σ₁²σ₂² + 4|ρ|²σ₁²σ₂²)]/2

(b) For σ₁² = σ₂² = σ²,

λ = [2σ² ± √(4|ρ|²σ⁴)]/2 = (1 ± |ρ|)σ²

3. Let

x[n] = Ae^(jω₀n)

where the complex amplitude A is a RV with random magnitude and phase

A = |A|e^(jφ).

Show that a sufficient condition for the random process to be stationary is that the magnitude and phase are independent and that the phase is uniformly distributed over [−π, π].

Solution: First note E{x[n]} = E{A}e^(jω₀n) and

E{A} = E{|A|}E{e^(jφ)} = 0

by independence and uniform distribution of φ. Thus it has a fixed mean. Next note

E{x[n]x∗[n − k]} = E{|A|²}e^(jω₀k)

which is strictly a function of k ⇒ WSS.

4. Let Xᵢ be i.i.d. RVs uniformly distributed on [0, 1] and define

Y = Σᵢ₌₁²⁰ Xᵢ.

Utilize Tchebycheff’s inequality to determine a bound for Pr{8 < Y < 12}.

Solution: Note ηₓ = 1/2 and σₓ² = 1/12. Thus η_y = 10 and σ_y² = 20/12 = 5/3. Utilizing Tchebycheff’s inequality,

Pr{|Y − η_y| ≥ 2} ≤ (σ_y/2)² = 5/12

⇒ Pr{8 < Y < 12} ≥ 1 − 5/12 = 7/12
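The Tchebycheff bound is loose here; a simulation (an added sketch) puts the actual probability near 0.88, comfortably above 7/12:

```python
import numpy as np

# Y = sum of 20 iid Uniform[0, 1] variables; estimate Pr(8 < Y < 12).
rng = np.random.default_rng(9)
y = rng.uniform(0, 1, (200_000, 20)).sum(axis=1)
p = np.mean((y > 8) & (y < 12))
print(p, 7 / 12)
```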

5. Let X ∼ N(0, 2σ²) and Y ∼ N(1, σ²) be independent RVs. Also, define Z = XY. Find the Bayes estimate of X from observation Z:

(a) Using the squared error criteria.
(b) Using the absolute error criteria.

6. Let X and Y be independent RVs characterized by fX(x) = ae^(−ax)U(x) and fY(y) = ae^(−ay)U(y). Also, define Z = XY. Find the Bayes estimate of X from observation Z using the uniform cost function.

Solution:

F_{z|x}(z|x) = Pr(xy ≤ z|x) = Pr(y ≤ z/x) = F_y(z/x)  ⇒  f_{z|x}(z|x) = (1/x) f_y(z/x)

x̂ = argmax_x f_{z|x}(z|x)fx(x) = argmax_x (1/x) f_y(z/x)fx(x)
  = argmax_x (1/x) ae^(−az/x) ae^(−ax) U(x)U(z) = argmax_x a²x⁻¹e^(−a(zx⁻¹ + x)) U(x)U(z)

Setting the derivative to zero,

0 = −a²x⁻²e^(−a(zx⁻¹+x)) + (a²x⁻¹e^(−a(zx⁻¹+x)))(−a(1 − zx⁻²))
0 = −x⁻¹ − a(1 − zx⁻²)  ⇒  ax² + x − az = 0

⇒ x̂ = (−1 + √(1 + 4a²z))/(2a)

taking the positive root, since x > 0.

7. Random processes x[n] and y[n] are defined by

x[n] = v₁[n] + 3v₂[n − 1]
y[n] = v₂[n + 1] + 3v₂[n − 1]

where v₁[n] and v₂[n] are independent white noise processes, each with variance 0.5.

ELEG–636 Homework #1, Spring 2008

1. Let fx(t) be symmetric about 0. Prove that µ is the expected value of a sample distributed according to f_{x−µ}(t) = fx(t − µ).

Solution. Since fx(t) is symmetric about 0, fx(t) is even.

E[x] = ∫_{−∞}^{+∞} t f_{x−µ}(t) dt = ∫_{−∞}^{+∞} t fx(t − µ) dt

Let u = t − µ:

E[x] = ∫_{−∞}^{+∞} (u + µ) fx(u) du
     = ∫_{−∞}^{+∞} u fx(u) du  [odd integrand, integrates to 0]  + ∫_{−∞}^{+∞} µ fx(u) du
     = 0 + µ ∫_{−∞}^{+∞} fx(u) du
     = µ

2. The complementary cumulative distribution function is defined as Qx(x) = 1 − Fx(x), or more explicitly in the zero mean, unit variance Gaussian distribution case as

Qx(x) = ∫ₓ^∞ (1/√(2π)) exp(−t²/2) dt.

Show that

Qx(x) ≈ (1/(√(2π)x)) exp(−x²/2).

Hint: use integration by parts on Qx(x) = ∫ₓ^∞ (1/(√(2π)t)) t exp(−t²/2) dt. Also explain why the approximation improves as x increases.

Solution. Recall integration by parts: ∫ₐᵇ f(t)g′(t) dt = f(t)g(t)|ₐᵇ − ∫ₐᵇ f′(t)g(t) dt.

Let g′(t) = t exp(−t²/2) and f(t) = 1/(√(2π)t). Then

Qx(x) = ∫ₓ^∞ (1/(√(2π)t)) t exp(−t²/2) dt
      = −(1/(√(2π)t)) exp(−t²/2)|ₓ^∞ − ∫ₓ^∞ (1/(√(2π)t²)) exp(−t²/2) dt   [second term → 0 as x → ∞]
      ≈ (1/(√(2π)x)) exp(−x²/2)

Since ∫ₓ^∞ (1/(√(2π)t²)) exp(−t²/2) dt goes to zero, relative to the leading term, as x goes to infinity, the approximation improves as x increases.
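A quick check (an added sketch) using the exact value Q(x) = (1/2)erfc(x/√2): the ratio of the approximation to the exact value decreases toward 1 as x grows.

```python
from math import erfc, exp, pi, sqrt

# Exact Gaussian tail vs. the integration-by-parts approximation.
def Q(x):
    return 0.5 * erfc(x / sqrt(2))

def Q_approx(x):
    return exp(-x * x / 2) / (sqrt(2 * pi) * x)

ratios = [Q_approx(x) / Q(x) for x in (1.0, 2.0, 4.0, 6.0)]
print(ratios)
```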

3. The probability density function for a two dimensional random vector is defined by

fx(x) = { Ax₁²x₂   x₁, x₂ ≥ 0 and x₁ + x₂ ≤ 1
        { 0        otherwise

(a) Determine Fx(x) and the value of A.
(b) Determine the marginal density fx₂(x₂).
(c) Are fx₁(x₁) and fx₂(x₂) independent? Show why or why not.

Solution.

(a)

F_{x₁,x₂}(∞, ∞) = ∫₀¹ ∫₀^{1−x₁} Ax₁²x₂ dx₂ dx₁
               = ∫₀¹ Ax₁² (x₂²/2)|₀^{1−x₁} dx₁
               = ∫₀¹ Ax₁² ((1 − x₁)²/2) dx₁
               = (A/2) ∫₀¹ (x₁⁴ − 2x₁³ + x₁²) dx₁
               = A/60 = 1

Therefore, A = 60. Defining F_{x₁,x₂}(u, v) = Pr(x₁ ≤ u, x₂ ≤ v), we have

• x₁ < 0 or x₂ < 0: F(x₁, x₂) = 0.

• x₁, x₂ ≥ 0 and x₁ + x₂ ≤ 1:

F(x₁, x₂) = ∫₀^{x₁} ∫₀^{x₂} 60u²v dv du = 10x₁³x₂²

• 0 ≤ x₁, x₂ ≤ 1 and x₁ + x₂ ≥ 1:

F(x₁, x₂) = 1 − ∫₀^{1−x₂} ∫_{x₂}^{1−u} 60u²v dv du − ∫_{x₁}¹ ∫₀^{1−u} 60u²v dv du
          = 10x₂² − 20x₂³ + 15x₂⁴ − 4x₂⁵ + 10x₁³ − 15x₁⁴ + 6x₁⁵ − 1

• 0 ≤ x₁ ≤ 1 and x₂ ≥ 1:

F(x₁, x₂) = 1 − ∫_{x₁}¹ ∫₀^{1−u} 60u²v dv du = 10x₁³ − 15x₁⁴ + 6x₁⁵

• 0 ≤ x₂ ≤ 1 and x₁ ≥ 1:

F(x₁, x₂) = 1 − ∫₀^{1−x₂} ∫_{x₂}^{1−u} 60u²v dv du = 10x₂² − 20x₂³ + 15x₂⁴ − 4x₂⁵

• x₁, x₂ ≥ 1: F(x₁, x₂) = 1.

So

F(x₁, x₂) = { 0                                                              x₁ < 0 or x₂ < 0
            { 10x₁³x₂²                                                       x₁, x₂ ≥ 0, x₁ + x₂ ≤ 1
            { 10x₂² − 20x₂³ + 15x₂⁴ − 4x₂⁵ + 10x₁³ − 15x₁⁴ + 6x₁⁵ − 1        0 ≤ x₁, x₂ ≤ 1, x₁ + x₂ ≥ 1
            { 10x₁³ − 15x₁⁴ + 6x₁⁵                                           0 ≤ x₁ ≤ 1, x₂ ≥ 1
            { 10x₂² − 20x₂³ + 15x₂⁴ − 4x₂⁵                                   0 ≤ x₂ ≤ 1, x₁ ≥ 1
            { 1                                                              x₁, x₂ ≥ 1

(b)

fx₂(x₂) = ∫₀^{1−x₂} 60x₁²x₂ dx₁ = 20x₂(1 − x₂)³,  0 ≤ x₂ ≤ 1

(c) Since

fx₁(x₁) = ∫₀^{1−x₁} 60x₁²x₂ dx₂ = 30x₁²(1 − x₁)²,  0 ≤ x₁ ≤ 1

f_{x₁,x₂}(x₁, x₂) ≠ fx₁(x₁)fx₂(x₂). Therefore, fx₁(x₁) and fx₂(x₂) are NOT independent.

4. Consider the two independent marginal distributions

fx₁(x) = { 1   0 ≤ x ≤ 1
         { 0   otherwise

fx₂(x) = { 2x   0 ≤ x ≤ 1
         { 0    otherwise

Let A be the event {x₁ ≤ x₂}.

(a) Find and sketch fX(x).
(b) Determine Pr{A}.
(c) Determine f_{X|A}(x|A). Are the components independent, i.e., are f_{x₁|A}(x|A) and f_{x₂|A}(x|A) independent?

Solution.
(a) Since the two marginal distributions are independent,

fX(x) = fx₁(x₁)fx₂(x₂) = { 2x₂   0 ≤ x₁, x₂ ≤ 1
                         { 0     otherwise

(b)

Pr(A) = ∫₀¹ ∫₀^{x₂} 2x₂ dx₁ dx₂ = ∫₀¹ 2x₂² dx₂ = (2x₂³/3)|₀¹ = 2/3

(c)

f_{X|A}(x|A) = fX(x)/Pr(A) = { 3x₂   0 ≤ x₁ ≤ x₂ ≤ 1
                             { 0     otherwise

f_{x₁|A}(x₁|A) = ∫_{x₁}¹ 3x₂ dx₂ = (3x₂²/2)|_{x₁}¹ = (3/2)(1 − x₁²),  0 ≤ x₁ ≤ 1

f_{x₂|A}(x₂|A) = ∫₀^{x₂} 3x₂ dx₁ = 3x₂²,  0 ≤ x₂ ≤ 1

f_{X|A}(x|A) ≠ f_{x₁|A}(x₁|A) f_{x₂|A}(x₂|A). Therefore, f_{x₁|A}(x₁|A) and f_{x₂|A}(x₂|A) are NOT independent.

5. The entropy H for a random vector is defined as −E{ln fx(x)}. Show that for the complex Gaussian case

H = N(1 + ln π) + ln |Cx|.

Determine the corresponding expression when the vector is real.

Solution.

The complex Gaussian p.d.f. is

fx(x) = (1/(π^N |Cx|)) exp[−(x − mx)ᴴ Cx⁻¹ (x − mx)]

Then,

H = −E{ln fx(x)} = E[(x − mx)ᴴ Cx⁻¹ (x − mx)] + N ln π + ln |Cx|

Note

E[(x − mx)ᴴ Cx⁻¹ (x − mx)] = E[trace((x − mx)ᴴ Cx⁻¹ (x − mx))]
                           = trace(Cx⁻¹ E[(x − mx)(x − mx)ᴴ])
                           = trace(Cx⁻¹ Cx)
                           = trace(I) = N

Therefore

H = N + N ln π + ln |Cx| = N(1 + ln π) + ln |Cx|

Similarly, when the vector is real,

H = (1/2)N(1 + ln(2π)) + (1/2) ln |Cx|

6. Let

x = 3u − 4v
y = 2u + v

where u and v are unit mean, unit variance, uncorrelated Gaussian random variables.

(a) Determine the means and variances of x and y.
(b) Determine the joint density of x and y.
(c) Determine the conditional density of y given x.

Solution.
(a)

E(x) = E(3u − 4v) = 3E(u) − 4E(v) = 3 − 4 = −1
E(y) = E(2u + v) = 2E(u) + E(v) = 2 + 1 = 3
σx² = E(x²) − E²(x) = E[(3u − 4v)²] − 1 = 25
σy² = E(y²) − E²(y) = E[(2u + v)²] − 9 = 5

(b) Note

[x]   [ 3  −4 ] [u]
[y] = [ 2   1 ] [v],  call the matrix A

Thus

A⁻¹ = (1/11) [  1  4 ]
             [ −2  3 ]

and

f_{x,y}(x, y) = f_{u,v}(A⁻¹[x, y]ᵀ)/|det A|
             = (1/11) f_{u,v}((x + 4y)/11, (−2x + 3y)/11)
             = (1/22π) exp(−(1/2)[((x + 4y)/11 − 1)² + ((−2x + 3y)/11 − 1)²])

(c) Note x is Gaussian:

fx(x) = (1/(5√(2π))) exp(−(x + 1)²/(2·25))

Thus

f_{y|x}(y|x) = f_{x,y}(x, y)/fx(x)
            = (5√(2π)/(22π)) exp(−(1/2)[((x + 4y)/11 − 1)² + ((−2x + 3y)/11 − 1)²] + (x + 1)²/50)
            = (5/(11√(2π))) exp(−(1/2)[((x + 4y)/11 − 1)² + ((−2x + 3y)/11 − 1)² − (x + 1)²/25])


7. Consider the orthogonal transformation of the correlated zero mean random variables x₁ and x₂:

[y₁]   [  cos θ   sin θ ] [x₁]
[y₂] = [ −sin θ   cos θ ] [x₂]

Note E{x₁²} = σ₁², E{x₂²} = σ₂², and E{x₁x₂} = ρσ₁σ₂. Determine the angle θ such that y₁ and y₂ are uncorrelated.

Solution.

y₁ = x₁ cos θ + x₂ sin θ
y₂ = −x₁ sin θ + x₂ cos θ

E(y₁y₂) = E[(x₁ cos θ + x₂ sin θ)(−x₁ sin θ + x₂ cos θ)]
        = sin θ cos θ E[x₂²] + (cos² θ − sin² θ)E[x₁x₂] − sin θ cos θ E[x₁²]
        = sin θ cos θ (σ₂² − σ₁²) + (cos² θ − sin² θ)ρσ₁σ₂
        = sin 2θ · (σ₂² − σ₁²)/2 + cos 2θ · ρσ₁σ₂

If y₁ and y₂ are uncorrelated, E(y₁y₂) = 0. For −π/2 ≤ θ < π/2,

θ = (1/2) arctan(2ρσ₁σ₂/(σ₁² − σ₂²))
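The decorrelating angle can be verified on synthetic data (an added sketch; σ₁ = 2, σ₂ = 1, ρ = 0.6 are arbitrary choices):

```python
import numpy as np

# Rotate correlated Gaussian samples by theta and check the correlation of
# the rotated coordinates is (approximately) zero.
s1, s2, rho = 2.0, 1.0, 0.6
C = np.array([[s1 ** 2, rho * s1 * s2], [rho * s1 * s2, s2 ** 2]])

rng = np.random.default_rng(10)
x = rng.multivariate_normal([0, 0], C, size=300_000)

# arctan2 handles the sign of the denominator s1^2 - s2^2 correctly
theta = 0.5 * np.arctan2(2 * rho * s1 * s2, s1 ** 2 - s2 ** 2)
R = np.array([[np.cos(theta), np.sin(theta)], [-np.sin(theta), np.cos(theta)]])
y = x @ R.T

corr = np.corrcoef(y[:, 0], y[:, 1])[0, 1]
print(theta, corr)
```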

8. The covariance matrix and mean vector for a real Gaussian density are

Cx = [ 1    0.5 ]
     [ 0.5  1   ]

and

mx = [ 1 ]
     [ 0 ]

(a) Determine the eigenvalues and eigenvectors.
(b) Generate a mesh plot of the distribution using MATLAB.
(c) Change the off-diagonal values to −0.5 and repeat (a) and (b).

Solution.

(a) Solve |Cx − λI| = 0:

(1 − λ)² − 0.25 = (λ − 0.5)(λ − 1.5) = 0

Hence, the eigenvalues are 0.5 and 1.5. For λ = 0.5, the corresponding eigenvector is [1, −1]ᵀ. For λ = 1.5, the corresponding eigenvector is [1, 1]ᵀ.

(c) Eigenvalues are 0.5 and 1.5. For λ = 0.5, the corresponding eigenvector is [1, 1]ᵀ. For λ = 1.5, the corresponding eigenvector is [1, −1]ᵀ.

9. Let {xₖ(n)}ₖ₌₁ᴷ be i.i.d. zero mean, unit variance uniformly distributed random variables and set

y_K(n) = Σₖ₌₁ᴷ xₖ(n).

(a) Determine and plot the pdf of y_K(n) for K = 2, 3, 4.
(b) Compare the pdf’s to the Gaussian density.
(c) Perform the comparison experimentally using MATLAB. That is, generate K sequences of n = 1, 2, …, N uniformly distributed samples. Add the sequences and plot the resulting distribution (histogram). Fit the results to a Gaussian distribution for various K and N.

Solution.

(a) {xₖ(n)}ₖ₌₁ᴷ are i.i.d. zero mean, unit variance uniformly distributed random variables:

f_{xₖ}(xₖ) = { 1/2a   xₖ ∈ [−a, a]
             { 0      otherwise

Since E[xₖ²] = 1,

E[xₖ²] = (1/2a) ∫₋ₐᵃ x² dx = (x³/6a)|₋ₐᵃ = a²/3 = 1  ⇒  a = √3

That is,

f_{xₖ}(xₖ) = { 1/(2√3)   xₖ ∈ [−√3, √3]
             { 0         otherwise

For K = 2, y₂(n) = x₁(n) + x₂(n):

f_{y₂}(x) = f_{x₁}(x) ∗ f_{x₂}(x) = { x/12 + 1/(2√3)    −2√3 ≤ x < 0
                                    { −x/12 + 1/(2√3)   0 ≤ x ≤ 2√3
                                    { 0                 otherwise

For K = 3, y₃(n) = x₁(n) + x₂(n) + x₃(n) = y₂(n) + x₃(n):

f_{y₃}(x) = f_{y₂}(x) ∗ f_{x₃}(x) = { (x + 3√3)²/(48√3)   −3√3 ≤ x < −√3
                                    { (9 − x²)/(24√3)     −√3 ≤ x < √3
                                    { (x − 3√3)²/(48√3)   √3 ≤ x ≤ 3√3
                                    { 0                   otherwise
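The K = 2 triangular density can be checked by simulation (an added sketch): the peak value is 1/(2√3) ≈ 0.289 at 0, and all samples lie in [−2√3, 2√3].

```python
import numpy as np

# y2 = x1 + x2 with x_k ~ Uniform[-sqrt(3), sqrt(3)] (zero mean, unit variance).
a = np.sqrt(3)
rng = np.random.default_rng(11)
y2 = rng.uniform(-a, a, (400_000, 2)).sum(axis=1)

# Empirical density near 0 vs. the predicted peak 1/(2*sqrt(3)).
h = 0.1
peak_mc = np.mean(np.abs(y2) < h) / (2 * h)
in_support = bool(y2.min() >= -2 * a and y2.max() <= 2 * a)
print(peak_mc, 1 / (2 * a), in_support)
```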