Handout Three


Transcript of Handout Three

Page 1: Handout Three

Introduction to Probability and Statistics

Probability & Statistics for Engineers & Scientists, 9th Ed.

2009

Handout #3

Instructor: Lingzhou Xue

TA: Daniel Eck


Page 2: Handout Three

Goal

Mean, Expectation, Expected Value: The value you expect to get in a statistical experiment is the mean.

Variance: Measures how spread out a distribution is. In other words, it's a measure of variability.

Covariance: A measure of the linear relationship between two random variables.

Chebyshev's Inequality: Places an upper bound on the probability that a random variable differs from its mean by at least a set value. No other information about that variable's distribution is required.


Page 3: Handout Three

Chapter 4

Mathematical Expectation

4.1 Mean of a Random Variable


Page 4: Handout Three

Example

Experiment: Tossing a coin once.

1. What is the probability of getting one head?

2. Repeat the experiment 10 times. On average, how many heads would there be in each experiment?

Page 5: Handout Three

Let X denote the number of heads. The probability distribution of X is:

x                 0     1
f(x) = P(X = x)   1/2   1/2

Intuition: Suppose we play n times; then the total number of heads we expect to have is n × (1/2) = 0.5n. Then, on average, in each experiment we have 0.5n/n = 0.5 heads.

Mathematics:

E(X) = 0 × (1/2) + 1 × (1/2) = 0.5 = ∑_{x=0}^{1} x f(x).

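A quick simulation makes the "long-run average" reading concrete (a minimal Python sketch; the sample size n is an arbitrary choice, and the seed is only for reproducibility):

```python
import random

random.seed(42)  # for reproducibility

n = 100_000  # number of tosses (arbitrary choice)
heads = sum(random.randint(0, 1) for _ in range(n))  # 1 = head, 0 = tail

# Long-run average number of heads per toss; approaches E(X) = 0.5
print(heads / n)  # e.g. ~0.499
```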

Page 6: Handout Three

Example

Experiment: Tossing a coin twice.

1. What is the probability of getting one head?

2. Repeat the experiment 10 times. On average, how many heads would there be in each experiment?

Page 7: Handout Three

Let X denote the number of heads. The probability distribution of X is:

x                 0     1     2
f(x) = P(X = x)   1/4   1/2   1/4

Intuition: Expected value of the number of heads: 1.

Mathematics:

E(X) = 0 × (1/4) + 1 × (1/2) + 2 × (1/4) = 1 = ∑_{x=0}^{2} x f(x).

Page 8: Handout Three

Example 1

If two coins are tossed 16 times and X is the number of heads

occurring per toss, then the value of X can be 0, 1, and 2.

Suppose that the experiment yields

x       0   1   2
times   4   7   5

The average number of heads per toss of the two coins is then

(0 · 4 + 1 · 7 + 2 · 5)/16 = 17/16 ≈ 1.06.

An average value is not necessarily a possible outcome for the

experiment.


Page 9: Handout Three

Motivation 1

Consider a casino game in which the probability of losing $1 per

game is 0.8 and the probability of winning $2 per game is 0.2. The gain or

loss of a gambler who plays this game only a few times depends

on his luck more than anything else. For example, in one play

of the game, a lucky gambler might win $2, but he has 80%

chance of losing $1. However, if a gambler decides to play the

game a large number of times, his loss or gain depends more

on the number of plays than on his luck. A calculating player

argues that if he plays the game n times, for a large n, then in

approximately (0.8)n of the games he will lose $1 per game, and in approximately (0.2)n of the games he will win $2 per game. Therefore, his total gain is

(0.8)n · (−1) + (0.2)n · 2 = (−0.4)n.

This gives an average loss of $0.40 per game.

Page 10: Handout Three

If X is the random variable denoting the gain in one play, then

the number -0.4 is the average value of X. In this example, X

is a discrete random variable with the set of possible values {-1,

2}. The probability function of X, f(x) is given by

x                 −1    2
f(x) = P(X = x)   0.8   0.2

Hence

−1 · f(−1) + 2 · f(2) = −0.4,

a relation showing that the expected value of X can be calculated directly by summing the products of the possible values of X with their probabilities.

Page 11: Handout Three

Expected value is used to describe the long-term average outcome of a given scenario. To calculate an expected value, take every possible outcome, multiply each by the probability of that outcome happening, and then add those numbers together.
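In code, this recipe is just a sum of products over outcomes and probabilities (a minimal sketch; the numbers are the casino game from the previous page):

```python
# Expected value: sum of (outcome × probability) over all outcomes.
outcomes = [-1, 2]   # gain per game from the casino example
probs = [0.8, 0.2]   # P(X = x) for each outcome

ev = sum(x * p for x, p in zip(outcomes, probs))
print(ev)  # -0.4, the average loss per game
```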

Page 12: Handout Three

Definition: Expectation

Let X be a random variable with probability distribution f(x).

The mean, expected value, or expectation of X is:

• if X is discrete,

  µ = E(X) = ∑_x x f(x).

• if X is continuous,

  µ = E(X) = ∫_{−∞}^{∞} x f(x) dx.

Page 13: Handout Three

Example 2

A lot containing 7 components is sampled by a quality inspector; the lot contains 4 good components and 3 defective components. A sample of 3 is taken by the inspector. Find the expected value of the number of good components in this sample.

Solution:

Let X represent the number of good components in the sample. The probability distribution of X is

f(x) = P(X = x) = (4 choose x)(3 choose 3 − x) / (7 choose 3),   x = 0, 1, 2, 3.

µ = E(X) = ∑_x x f(x) = 0 · (1/35) + 1 · (12/35) + 2 · (18/35) + 3 · (4/35) = 12/7.
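The distribution and its mean can be checked directly with Python's math.comb (a minimal sketch; Fraction keeps the arithmetic exact):

```python
from fractions import Fraction
from math import comb

# Hypergeometric probabilities: choose x good out of 4 and 3 - x defective out of 3.
f = {x: Fraction(comb(4, x) * comb(3, 3 - x), comb(7, 3)) for x in range(4)}

mean = sum(x * p for x, p in f.items())
print(f)     # probabilities 1/35, 12/35, 18/35, 4/35
print(mean)  # 12/7
```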

Page 14: Handout Three

Thus, if a sample of size 3 is selected at random over and over

again from a lot of 4 good components and 3 defective compo-

nents, it would contain, on average, 1.7 good components.

Page 15: Handout Three

Example 3

In a gambling game a man is paid $5 if he gets all heads or all tails when three coins are tossed, and he will pay out $3 if either one or two heads show. What is his expected gain?

Solution:

The sample space for the possible outcomes when three coins are tossed simultaneously is

S = {HHH, HHT, HTH, THH, HTT, THT, TTH, TTT}.

The random variable of interest is Y, the amount the gambler can win. The possible values of Y are $5 if event E1 = {HHH, TTT} occurs and −$3 if event E2 = {HHT, HTH, THH, HTT, THT, TTH} occurs. That is, the probability function of Y is given by

f(y) = P(Y = y) = { 1/4,  y = 5;  3/4,  y = −3;  0, elsewhere.

µ = E(Y) = 5 · (1/4) + (−3) · (3/4) = −1.

Page 16: Handout Three

Example 4

Let X be the random variable that denotes the life in hours of a

certain electronic device. The probability density function is

f(x) = { 20,000/x³,  x > 100;  0, elsewhere.

Find E(X), the expected life of this type of device.

Solution:

µ = E(X) = ∫_{100}^{∞} x · (20,000/x³) dx = 20,000 ∫_{100}^{∞} (1/x²) dx = 20,000 · (−1/x |_{100}^{∞}) = 200.
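A numerical check (a minimal sketch using scipy.integrate.quad; scipy is assumed to be available):

```python
from scipy.integrate import quad

# f(x) = 20000 / x^3 for x > 100, so E(X) = integral of x * f(x) from 100 to infinity.
mean, err = quad(lambda x: x * 20000 / x**3, 100, float("inf"))
print(mean)  # 200.0 (in hours)
```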

Page 17: Handout Three

Question

Suppose g(X) is a function of a random variable X.

• Is g(X) also a random variable?

• If yes, how do we find the mean of g(X), E[g(X)]?


Page 19: Handout Three

Illustration 1

Now let us consider a new random variable g(X), which depends on X; that is, each value of g(X) is determined by knowing the values of X.

For instance, let Y = g(X) = X². If X is a discrete random variable with probability distribution f(x)

x                 −1    0     1     2
f(x) = P(X = x)   1/8   3/8   3/8   1/8

then

P[g(X) = 0] = P(X² = 0) = f(0) = 3/8,

P[g(X) = 1] = P(X² = 1) = f(−1) + f(1) = 4/8,

P[g(X) = 4] = P(X² = 4) = f(2) = 1/8,

Page 20: Handout Three

so that the probability distribution of Y = g(X) may be written

y = g(x)                             0     1     4
h(y) = P(Y = y) = P[g(X) = g(x)]     3/8   4/8   1/8

µ_{g(X)} = E[g(X)] = E(Y) = ∑_y y h(y)

= 0 · (3/8) + 1 · (4/8) + 4 · (1/8) = 8/8 = 1

= (−1)² · (1/8) + 0² · (3/8) + 1² · (3/8) + 2² · (1/8) = ∑_x g(x) · f(x).

Page 21: Handout Three

Illustration 1

Now let us consider a new random variable g(X), which depends on X; that is, each value of g(X) is determined by knowing the values of X.

For instance, let Y = g(X) = X². If X is a discrete random variable with probability distribution f(x), for x = −1, 0, 1, 2, then

P[g(X) = 0] = P(X² = 0) = P(X = 0) = f(0),

P[g(X) = 1] = P(X² = 1) = P(X = −1) + P(X = 1) = f(−1) + f(1),

P[g(X) = 4] = P(X² = 4) = P(X = 2) = f(2),

so that the probability distribution of Y = g(X) may be written

y = g(x)                             0      1               4
h(y) = P(Y = y) = P[g(X) = g(x)]     f(0)   f(−1) + f(1)    f(2)

Page 22: Handout Three

µ_{g(X)} = E[g(X)] = E(Y) = ∑_y y h(y)

= 0 · h(0) + 1 · h(1) + 4 · h(4)

= 0 · f(0) + 1 · [f(1) + f(−1)] + 4 · f(2)

= (−1)² · f(−1) + 0² · f(0) + 1² · f(1) + 2² · f(2)

= ∑_x g(x) · f(x).
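Both routes give the same answer, and both are easy to check numerically (a minimal sketch using the distribution from the illustration, with exact fractions):

```python
from fractions import Fraction as F
from collections import defaultdict

f = {-1: F(1, 8), 0: F(3, 8), 1: F(3, 8), 2: F(1, 8)}  # distribution of X
g = lambda x: x * x                                     # g(X) = X^2

# Route 1: build the distribution h(y) of Y = g(X), then sum y * h(y).
h = defaultdict(F)
for x, p in f.items():
    h[g(x)] += p
mean_via_h = sum(y * p for y, p in h.items())

# Route 2: sum g(x) * f(x) directly, without constructing h.
mean_direct = sum(g(x) * p for x, p in f.items())

print(dict(h))                   # h maps 0 -> 3/8, 1 -> 1/2, 4 -> 1/8
print(mean_via_h, mean_direct)   # 1 and 1
```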

Page 23: Handout Three

Illustration 2

Let Y = g(X) = 3X − 1. If X is a discrete random variable with

probability distribution f(x), for x = −1,0,1,2, then

P [g(X) = −4] = P (3X − 1 = −4) = P (X = −1) = f(−1)

P [g(X) = −1] = P (3X − 1 = −1) = P (X = 0) = f(0),

P [g(X) = 2] = P (3X − 1 = 2) = P (X = 1) = f(1),

P [g(X) = 5] = P (3X − 1 = 5) = P (X = 2) = f(2),

so that the probability distribution of g(X) may be written

y = g(x)                             −4      −1     2      5
h(y) = P(Y = y) = P[g(X) = g(x)]     f(−1)   f(0)   f(1)   f(2)

Page 24: Handout Three

µ_{g(X)} = E[g(X)] = E(Y) = ∑_y y h(y)

= (−4) · h(−4) + (−1) · h(−1) + 2 · h(2) + 5 · h(5)

= (−4) · f(−1) + (−1) · f(0) + 2 · f(1) + 5 · f(2)

= ∑_x (3x − 1) · f(x)

= ∑_x g(x) · f(x).

Page 25: Handout Three

Theorem

Let X be a random variable with probability function f(x). The

expected value of the random variable g(X) is

• Discrete: if X is discrete,

  µ_{g(X)} = E[g(X)] = ∑_x g(x) · f(x).

• Continuous: if X is continuous,

  µ_{g(X)} = E[g(X)] = ∫_{−∞}^{∞} g(x) · f(x) dx.

Page 26: Handout Three

Example 5

Suppose that the number of cars X that pass through a car wash

between 4:00 P.M. and 5:00 P.M. on any sunny Friday has the

following probability distribution:

x          4      5      6     7     8     9
P(X = x)   1/12   1/12   1/4   1/4   1/6   1/6

Let g(X) = 2X − 1 represent the amount of money, in dollars, paid to the attendant by the manager. Find the attendant's expected earnings for this particular time period.

Page 27: Handout Three

Solution:

By the theorem, the attendant can expect to receive

E[g(X)] = E(2X − 1) = ∑_{x=4}^{9} (2x − 1) f(x)

= 7 × (1/12) + 9 × (1/12) + 11 × (1/4) + 13 × (1/4) + 15 × (1/6) + 17 × (1/6)

= 12.67.

Page 28: Handout Three

Example 6

Let X be a random variable with density function

f(x) = { x²/3,  −1 < x < 2;  0, elsewhere.

Find the expected value of g(X) = 4X + 3.

Solution:

E(4X + 3) = ∫_{−1}^{2} (4x + 3) · (x²/3) dx = (1/3) ∫_{−1}^{2} (4x³ + 3x²) dx = (1/3) [(x⁴ + x³)]_{x=−1}^{x=2} = 8.

Page 29: Handout Three

Definition

Let X and Y be random variables with joint probability distribution f(x, y). The mean or expected value of g(X, Y) is:

• if both X and Y are discrete,

  µ_{g(X,Y)} = E[g(X, Y)] = ∑_x ∑_y g(x, y) f(x, y).

• if both X and Y are continuous,

  µ_{g(X,Y)} = E[g(X, Y)] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} g(x, y) f(x, y) dx dy.

Page 30: Handout Three

Example 7: Example 15 in Handout #2

Let X and Y be the random variables with joint probability distribution indicated in the following table.

f(x, y)          x = 0   x = 1   x = 2   Row Totals
y = 0            3/28    9/28    3/28    15/28
y = 1            6/28    6/28    0       12/28
y = 2            1/28    0       0       1/28
Column Totals    10/28   15/28   3/28    1

Find the expected value of g(X, Y) = XY.

Solution:

E[g(X, Y)] = E(XY) = ∑_{x=0}^{2} ∑_{y=0}^{2} xy f(x, y) = 1 · 1 · f(1, 1) = 6/28 = 3/14,

since the only nonzero product xy f(x, y) occurs at x = 1, y = 1.

Page 31: Handout Three

Example 8

Find E(Y/X) for the density function

f(x, y) = { x(1 + 3y²)/4,  0 < x < 2, 0 < y < 1;  0, elsewhere.

Solution:

E(Y/X) = ∫_{0}^{1} ∫_{0}^{2} (y/x) · x(1 + 3y²)/4 dx dy = ∫_{0}^{1} (y + 3y³)/2 dy = 5/8.
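A numerical check with scipy.integrate.dblquad (a minimal sketch; note that dblquad integrates func(y, x), with y the inner variable):

```python
from scipy.integrate import dblquad

# E(Y/X) = double integral of (y/x) * f(x, y) over 0 < x < 2, 0 < y < 1.
val, err = dblquad(
    lambda y, x: (y / x) * x * (1 + 3 * y**2) / 4,
    0, 2,                       # x limits (outer)
    lambda x: 0, lambda x: 1,   # y limits (inner)
)
print(val)  # 0.625 = 5/8
```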

Page 32: Handout Three

Notes

Let X and Y be random variables with joint probability distribution f(x, y).

• if g(X, Y) = X, then

  E(X) = ∑_x ∑_y x f(x, y) = ∑_x x m1(x) in the discrete case;

  E(X) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} x f(x, y) dy dx = ∫_{−∞}^{∞} x m1(x) dx in the continuous case,

  where m1(x) is the marginal distribution of X.

• if g(X, Y) = Y, then

  E(Y) = ∑_y ∑_x y f(x, y) = ∑_y y m2(y) in the discrete case;

  E(Y) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} y f(x, y) dx dy = ∫_{−∞}^{∞} y m2(y) dy in the continuous case,

  where m2(y) is the marginal distribution of Y.

Page 33: Handout Three

Therefore, in calculating E(X) over a two-dimensional space,

one may use either the joint probability function of X and Y or

the marginal distribution of X.

Page 34: Handout Three

Riddle

Why are the mean, median, and mode like a valuable piece of

real estate?

LOCATION! LOCATION! LOCATION!


Page 35: Handout Three

4.2 Variance and Covariance of Random

Variables


Page 36: Handout Three

If I put your feet in boiling water and your head in ice, then on average you'd be comfortable. However, you'd die eventually. It is not enough to just consider the average.


Page 38: Handout Three

The variance of a random variable is a measure of its statistical

dispersion, indicating how its possible values are spread around

the expected value. While the expected value shows the location of the distribution, the variance indicates the variability of the values.

[Figure: Distributions with equal mean µ = 2 and unequal dispersions.]

Page 39: Handout Three

Definition: Variance

Let X be a random variable with probability function f(x) and

mean µ. The variance of the random variable X is

• Discrete: if X is discrete,

  σ² = Var(X) = E[(X − µ)²] = ∑_x (x − µ)² · f(x).

• Continuous: if X is continuous,

  σ² = Var(X) = E[(X − µ)²] = ∫_{−∞}^{∞} (x − µ)² · f(x) dx.

The variance is the average squared distance of the possible values of X from the expected value µ.

Page 40: Handout Three

Definition: Standard Deviation

The positive square root of the variance, σ, is called the standard deviation of X:

σ = √Var(X).

Page 41: Handout Three

Example 9

Let the random variable X represent the number of automobiles

that are used for official business purposes on any given workday.

The probability distribution for company A is

x      1     2     3
f(x)   0.3   0.4   0.3

and for company B

x      0     1     2     3     4
f(x)   0.2   0.1   0.3   0.3   0.1

Page 42: Handout Three

Show that the variance of the probability distribution for company B is greater than that of company A.

µ_A = 1 · 0.3 + 2 · 0.4 + 3 · 0.3 = 2;

µ_B = 0 · 0.2 + 1 · 0.1 + 2 · 0.3 + 3 · 0.3 + 4 · 0.1 = 2;

σ²_A = ∑_{x=1}^{3} (x − 2)² f_A(x) = 0.6;

σ²_B = ∑_{x=0}^{4} (x − 2)² f_B(x) = 1.6.
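The same comparison in a few lines of Python (a minimal sketch; the helper name `variance` is ours):

```python
def variance(dist):
    """Variance of a discrete distribution given as {value: probability}."""
    mean = sum(x * p for x, p in dist.items())
    return sum((x - mean) ** 2 * p for x, p in dist.items())

company_a = {1: 0.3, 2: 0.4, 3: 0.3}
company_b = {0: 0.2, 1: 0.1, 2: 0.3, 3: 0.3, 4: 0.1}

print(variance(company_a))  # 0.6
print(variance(company_b))  # 1.6
```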

Page 43: Handout Three

Theorem

The variance of a random variable X is

σ² = Var(X) = E(X²) − µ².

Proof:

For the discrete case we can write

σ² = ∑_x (x − µ)² f(x) = ∑_x (x² − 2µx + µ²) f(x)

= ∑_x x² f(x) − 2µ ∑_x x f(x) + µ² ∑_x f(x)

= ∑_x x² f(x) − 2µ · µ + µ² · 1

= ∑_x x² f(x) − µ²

= E(X²) − µ².

Page 44: Handout Three

Example 10

Let the random variable X represent the number of defective

parts for a machine when 3 parts are sampled from a production

line and tested. The following is the probability distribution of

X.

x      0      1      2      3
f(x)   0.51   0.38   0.10   0.01

Calculate the variance of X.

Solution:

µ = 0 · 0.51 + 1 · 0.38 + 2 · 0.10 + 3 · 0.01 = 0.61;

E(X²) = 0 · 0.51 + 1 · 0.38 + 4 · 0.10 + 9 · 0.01 = 0.87;

σ² = E(X²) − µ² = 0.87 − 0.61² = 0.4979.

Page 45: Handout Three

Example 11

The weekly demand for Pepsi, in thousands of liters, from a local chain of efficiency stores is a continuous random variable X having the probability density

f(x) = { 2(x − 1),  1 < x < 2;  0, elsewhere.

Find the mean and variance of X.

Solution:

µ = E(X) = ∫_{1}^{2} x · 2(x − 1) dx = 5/3.

E(X²) = ∫_{1}^{2} x² · 2(x − 1) dx = 17/6.

σ² = Var(X) = E(X²) − µ² = 17/6 − (5/3)² = 1/18.
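Checking both integrals numerically (a minimal sketch with scipy.integrate.quad):

```python
from scipy.integrate import quad

f = lambda x: 2 * (x - 1)  # density on 1 < x < 2

mean, _ = quad(lambda x: x * f(x), 1, 2)     # E(X)
ex2, _ = quad(lambda x: x**2 * f(x), 1, 2)   # E(X^2)
var = ex2 - mean**2                          # shortcut formula

print(mean, var)  # ~1.6667 (= 5/3) and ~0.0556 (= 1/18)
```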


Page 46: Handout Three

Theorem

Let X be a random variable with probability function f(x). The

variance of the random variable g(X) is

• Discrete: if X is discrete,

  σ²_{g(X)} = E{[g(X) − µ_{g(X)}]²} = ∑_x [g(x) − µ_{g(X)}]² · f(x).

• Continuous: if X is continuous,

  σ²_{g(X)} = E{[g(X) − µ_{g(X)}]²} = ∫_{−∞}^{∞} [g(x) − µ_{g(X)}]² · f(x) dx.

Page 47: Handout Three

Example 12

Calculate the variance of g(X) = 2X + 3, where X is a random

variable with probability distribution

x      0     1     2     3
f(x)   1/4   1/4   1/4   1/4

Solution:

µ_{2X+3} = E(2X + 3) = ∑_{x=0}^{3} (2x + 3) f(x) = 6.

σ²_{2X+3} = E{[(2X + 3) − µ_{2X+3}]²} = E[(2X + 3 − 6)²]

= E(4X² − 12X + 9) = ∑_{x=0}^{3} (4x² − 12x + 9) f(x)

= (9 + 1 + 1 + 9) · (1/4) = 5.

Page 48: Handout Three

Example 6 Cont’d

Find the variance of the random variable g(X) = 4X + 3.

Solution:

σ²_{4X+3} = E{[(4X + 3) − µ_{4X+3}]²} = E[(4X + 3 − 8)²]

= E(16X² − 40X + 25) = ∫_{−1}^{2} (16x² − 40x + 25) · (x²/3) dx

= 51/5.

Page 49: Handout Three

Question

Does there exist a quantity to measure how much two variables

change together or provide a measure of the strength of the

correlation between two random variables?


Page 50: Handout Three

Example

Let X denote the father’s height, Y the daughter’s height, and

Z the mother’s height.

[Two scatter plots, unit of measurement: foot. Left: Father's Height vs. Daughter's Height, Cov(X, Y) = 0.3. Right: Father's Height vs. Mother's Height, Cov(X, Z) = 0.02.]

Page 51: Handout Three

[Figure: (a) Cov(X, Y) > 0; (b) Cov(X, Y) < 0.]

Page 52: Handout Three

Definition: Covariance

Let X and Y be random variables with probability distribution

f(x, y). The covariance of the random variables X and Y is

• Discrete: if both X and Y are discrete,

  σ_XY = Cov(X, Y) = E[(X − µ_X)(Y − µ_Y)] = ∑_x ∑_y (x − µ_X)(y − µ_Y) · f(x, y).

• Continuous: if both X and Y are continuous,

  σ_XY = Cov(X, Y) = E[(X − µ_X)(Y − µ_Y)] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} (x − µ_X)(y − µ_Y) · f(x, y) dx dy.

Page 53: Handout Three

The covariance between two random variables is a measurement of the nature of the association between the two. If large values of X often result in large values of Y, or small values of X result in small values of Y, positive X − µ_X will often result in positive Y − µ_Y and negative X − µ_X will often result in negative Y − µ_Y. Thus the product (X − µ_X)(Y − µ_Y) will tend to be positive. On the other hand, if large X values often result in small Y values, the product (X − µ_X)(Y − µ_Y) will tend to be negative. Thus the sign of the covariance indicates whether the relationship between two dependent random variables is positive or negative. When X and Y are statistically independent, it can be shown that the covariance is zero. The converse, however, is not generally true: two variables may have zero covariance and still not be statistically independent. Note that the covariance only describes the linear relationship between two random variables. Therefore, if the covariance between X and Y is zero, X and Y may still have a nonlinear relationship, which means that they are not necessarily independent.


Page 55: Handout Three

Example 13

Let X be a random variable and let the probability distribution of X be

x      −2    −1    0     1     2
f(x)   0.2   0.2   0.2   0.2   0.2

Let Y = g(X) = X². Then Cov(X, Y) = 0, but X and Y have a quadratic relationship.
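This is easy to verify directly (a minimal sketch; the shortcut σ_XY = E(XY) − µ_X µ_Y is used, with exact fractions):

```python
from fractions import Fraction as F

xs = [-2, -1, 0, 1, 2]
p = F(1, 5)  # each value has probability 0.2

ex = sum(x * p for x in xs)       # E(X)   = 0
ey = sum(x**2 * p for x in xs)    # E(Y)   = E(X^2) = 2
exy = sum(x**3 * p for x in xs)   # E(XY)  = E(X^3) = 0

print(exy - ex * ey)  # 0: zero covariance even though Y = X^2 exactly
```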


Page 56: Handout Three

Theorem

The covariance of two random variables X and Y with means µ_X and µ_Y, respectively, is given by

σ_XY = Cov(X, Y) = E(XY) − µ_X µ_Y.

Proof:

For the discrete case we can write

σ_XY = ∑_x ∑_y (x − µ_X)(y − µ_Y) f(x, y)

= ∑_x ∑_y (xy − µ_X y − µ_Y x + µ_X µ_Y) f(x, y)

= ∑_x ∑_y xy f(x, y) − µ_X ∑_x ∑_y y f(x, y) − µ_Y ∑_x ∑_y x f(x, y) + µ_X µ_Y ∑_x ∑_y f(x, y)

= E(XY) − µ_X µ_Y − µ_Y µ_X + µ_X µ_Y

= E(XY) − µ_X µ_Y.

Page 57: Handout Three

Example 14

The fraction X of male runners and the fraction Y of female

runners who compete in marathon races are described by the

joint probability density function

f(x, y) = { 8xy,  0 ≤ y ≤ x ≤ 1;  0, elsewhere.

Find the covariance of X and Y.

Solution:

We first compute the marginal density functions:

g(x) = { 4x³,  0 < x < 1;  0, elsewhere.

h(y) = { 4y(1 − y²),  0 < y < 1;  0, elsewhere.

Page 58: Handout Three

µ_X = E(X) = ∫_{0}^{1} x · 4x³ dx = ∫_{0}^{1} 4x⁴ dx = 4/5;

µ_Y = E(Y) = ∫_{0}^{1} y · 4y(1 − y²) dy = ∫_{0}^{1} 4y²(1 − y²) dy = 8/15.

E(XY) = ∫_{0}^{1} ∫_{y}^{1} xy · 8xy dx dy = ∫_{0}^{1} ∫_{y}^{1} 8x²y² dx dy = 4/9.

σ_XY = E(XY) − µ_X µ_Y = 4/9 − (4/5)(8/15) = 4/225.
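A numerical check of these expectations and the covariance (a minimal sketch; the helper `expect` is ours, and dblquad passes the inner variable y first, running from 0 to x):

```python
from scipy.integrate import dblquad

# Expectation of g(X, Y) under f(x, y) = 8xy on 0 <= y <= x <= 1.
def expect(g):
    val, _ = dblquad(lambda y, x: g(x, y) * 8 * x * y, 0, 1, lambda x: 0, lambda x: x)
    return val

mu_x = expect(lambda x, y: x)       # 0.8     = 4/5
mu_y = expect(lambda x, y: y)       # 0.5333  = 8/15
exy = expect(lambda x, y: x * y)    # 0.4444  = 4/9

print(exy - mu_x * mu_y)  # ~0.01778 = 4/225
```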

Page 59: Handout Three

Although the covariance between two random variables does provide information regarding the nature of the relationship, the magnitude of σ_XY does not indicate anything regarding the strength of the relationship, since σ_XY is not scale-free. Its magnitude will depend on the units in which both X and Y are measured. There is a scale-free version of the covariance, called the correlation coefficient, that is used widely in statistics.

Page 60: Handout Three

Example

Let X denote the father’s height, Y the daughter’s height, and

Z the mother’s height.

[Two scatter plots of Father's Height vs. Daughter's Height. Left: unit of measurement foot, Cov(X, Y) = 0.3. Right: unit of measurement inch, Cov(X, Y) = 30.]

Page 61: Handout Three

Definition: Correlation Coefficient

Let X and Y be random variables with covariance σ_XY and standard deviations σ_X and σ_Y, respectively. The correlation coefficient of X and Y is

ρ_XY = Cov(X, Y) / √(Var(X)Var(Y)) = σ_XY / (σ_X σ_Y).

Notice:

• −1 ≤ ρ_XY ≤ 1.

• When Y = a + bX:

  – If b > 0, ρ_XY = 1.

  – If b < 0, ρ_XY = −1.

Page 62: Handout Three

Example 14 Cont’d

Find the correlation coefficient of X and Y .

σ²_X = E(X²) − µ²_X = 2/3 − (4/5)² = 2/75;

σ²_Y = E(Y²) − µ²_Y = 1/3 − (8/15)² = 11/225;

ρ_XY = σ_XY / (σ_X σ_Y) = (4/225) / √((2/75) · (11/225)) = 4/√66 ≈ 0.49.
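Numerically (a minimal sketch reusing the dblquad-based expectation helper from the covariance check above):

```python
from math import sqrt
from scipy.integrate import dblquad

# Expectation of g(X, Y) under f(x, y) = 8xy on 0 <= y <= x <= 1.
def expect(g):
    return dblquad(lambda y, x: g(x, y) * 8 * x * y, 0, 1, lambda x: 0, lambda x: x)[0]

mu_x, mu_y = expect(lambda x, y: x), expect(lambda x, y: y)
var_x = expect(lambda x, y: x**2) - mu_x**2       # 2/75
var_y = expect(lambda x, y: y**2) - mu_y**2       # 11/225
cov = expect(lambda x, y: x * y) - mu_x * mu_y    # 4/225

print(cov / sqrt(var_x * var_y))  # ~0.4924 = 4/sqrt(66)
```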


Page 63: Handout Three

4.3 Means and Variances of Linear Combinations of

Random Variables


Page 64: Handout Three

Theorem

If a and b are constants, then

E(aX + b) = aE(X) + b

Proof:

E(aX + b) = ∫_{−∞}^{∞} (ax + b) f(x) dx

= a ∫_{−∞}^{∞} x f(x) dx + b ∫_{−∞}^{∞} f(x) dx

= aE(X) + b.

Page 65: Handout Three

Example 15

Let X be a random variable with density function

f(x) = { x²/3,  −1 < x < 2;  0, elsewhere.

Find the expected value of g(X) = 4X + 3.

Solution:

E(4X + 3) = ∫_{−1}^{2} (4x + 3) · (x²/3) dx = (1/3) ∫_{−1}^{2} (4x³ + 3x²) dx = 8.

Alternatively, E(X) = ∫_{−1}^{2} x · (x²/3) dx = 15/12, so

E(4X + 3) = 4E(X) + 3 = 4 · (15/12) + 3 = 8.

Page 66: Handout Three

Theorem

The expected value of the sum or difference of two or more

functions of a random variable X is the sum or difference of the

expected values of the functions. That is,

E[g(X)± h(X)] = E[g(X)]± E[h(X)]

Proof:

E[g(X) ± h(X)] = ∫_{−∞}^{∞} [g(x) ± h(x)] f(x) dx

= ∫_{−∞}^{∞} g(x) f(x) dx ± ∫_{−∞}^{∞} h(x) f(x) dx

= E[g(X)] ± E[h(X)].

Page 67: Handout Three

Theorem

The expected value of the sum or difference of two or more

functions of random variables X and Y is the sum or difference

of the expected values of the functions. That is,

E[g(X,Y )± h(X,Y )] = E[g(X,Y )]± E[h(X,Y )]

Proof:

E[g(X, Y) ± h(X, Y)] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} [g(x, y) ± h(x, y)] f(x, y) dx dy

= ∫_{−∞}^{∞} ∫_{−∞}^{∞} g(x, y) f(x, y) dx dy ± ∫_{−∞}^{∞} ∫_{−∞}^{∞} h(x, y) f(x, y) dx dy

= E[g(X, Y)] ± E[h(X, Y)].

Page 68: Handout Three

Theorem

Let X and Y be two independent random variables. Then

E(XY ) = E(X)E(Y ).

Proof:

Suppose that g(x) and h(y) are the marginal distributions of X and Y, respectively. Since X and Y are independent, we may write f(x, y) = g(x)h(y). By definition,

E(XY) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} xy f(x, y) dx dy

= ∫_{−∞}^{∞} ∫_{−∞}^{∞} xy g(x)h(y) dx dy

= ∫_{−∞}^{∞} x g(x) dx ∫_{−∞}^{∞} y h(y) dy

= E(X)E(Y).

Page 69: Handout Three

Corollary

Let X and Y be two independent random variables. Then σ_XY = 0.

Proof:

Cov(X, Y) = E(XY) − E(X)E(Y) = 0.

Page 70: Handout Three

Example 16

Given the joint density function

f(x, y) = { x(1 + 3y²)/4,  0 < x < 2, 0 < y < 1;  0, elsewhere.

Verify E(XY) = E(X)E(Y).

Solution:

• g(x) = x/2, 0 < x < 2.

• h(y) = (1 + 3y²)/2, 0 < y < 1.

• f(x, y) = g(x)h(y) for all x and y, so X and Y are independent.

Page 71: Handout Three

E(XY) = ∫_{0}^{1} ∫_{0}^{2} xy f(x, y) dx dy = 5/6.

E(X) = ∫_{0}^{1} ∫_{0}^{2} x f(x, y) dx dy = ∫_{0}^{2} x g(x) dx = 4/3.

E(Y) = ∫_{0}^{1} ∫_{0}^{2} y f(x, y) dx dy = ∫_{0}^{1} y h(y) dy = 5/8.

Hence,

E(X)E(Y) = (4/3) · (5/8) = 5/6 = E(XY).
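The verification is easy to reproduce numerically (a minimal sketch with dblquad; the helper `expect` is ours):

```python
from scipy.integrate import dblquad

# f(x, y) = x(1 + 3y^2)/4 on 0 < x < 2, 0 < y < 1; dblquad passes (y, x).
def expect(g):
    return dblquad(lambda y, x: g(x, y) * x * (1 + 3 * y**2) / 4,
                   0, 2, lambda x: 0, lambda x: 1)[0]

exy = expect(lambda x, y: x * y)   # 5/6
ex = expect(lambda x, y: x)        # 4/3
ey = expect(lambda x, y: y)        # 5/8

print(exy, ex * ey)  # both ~0.8333: E(XY) = E(X)E(Y)
```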

Page 72: Handout Three

Example 17

If the joint density function of X and Y is given by

f(x, y) = { 2(x + 2y)/7,  0 < x < 1, 1 < y < 2;  0, elsewhere,

find the expected value of X/Y.

Solution:

E(X/Y) = ∫_{1}^{2} ∫_{0}^{1} (x/y) · 2(x + 2y)/7 dx dy = (2/21)(3 + ln 2) ≈ 0.35 ≠ E(X)/E(Y).

Question:

• Does E(XY) = E(X)E(Y) in general?

• Does E(X/Y) = E(X)E(1/Y) in general?

Page 73: Handout Three

Theorem

If a and b are constants, then

σ²_{aX+b} = Var(aX + b) = a²Var(X).

Proof:

By definition,

Var(aX + b) = E{[aX + b − µ_{aX+b}]²}

= E{[aX + b − (aµ_X + b)]²}

= E[(aX − aµ_X)²]

= E[a² · (X − µ_X)²]

= a²E[(X − µ_X)²] = a²Var(X).

Page 74: Handout Three

Theorem

If X and Y are random variables with joint probability distribution

f(x, y) and a and b are constants, then

σ²_{aX+bY} = Var(aX + bY) = a²Var(X) + 2abCov(X, Y) + b²Var(Y).

Proof:

By definition,

Var(aX + bY) = E{[aX + bY − µ_{aX+bY}]²}

= E{[aX + bY − (aµ_X + bµ_Y)]²}

= E{[a(X − µ_X) + b(Y − µ_Y)]²}

= E[a²(X − µ_X)² + 2ab(X − µ_X)(Y − µ_Y) + b²(Y − µ_Y)²]

= a²E[(X − µ_X)²] + 2abE[(X − µ_X)(Y − µ_Y)] + b²E[(Y − µ_Y)²]

= a²Var(X) + 2abCov(X, Y) + b²Var(Y).

Page 75: Handout Three

Corollary

Let X and Y be two independent random variables. Then

Var(aX ± bY) = a²Var(X) + b²Var(Y).

Page 76: Handout Three

Example 18

Let X be a random variable with density function

f(x) = { x²/3,  −1 < x < 2;  0, elsewhere.

Find the variance of g(X) = 4X + 3.

Solution:

Var(4X + 3) = ∫_{−1}^{2} (4x + 3 − 8)² · (x²/3) dx = 51/5.

Alternatively, µ = E(X) = 15/12 = 5/4 and

Var(X) = ∫_{−1}^{2} (x − 5/4)² · (x²/3) dx = 51/80,

so Var(4X + 3) = 4²Var(X) = 16 · (51/80) = 51/5, in agreement with Example 6 (continued).

Page 77: Handout Three

4.4 Chebyshev’s Theorem


Page 78: Handout Three

Question

Let X have a probability density function f(x). Then

µ = E(X) = ∫ x f(x) dx,   σ² = Var(X) = ∫ (x − µ)² f(x) dx,

and

P(µ − kσ < X < µ + kσ) = ∫_{µ−kσ}^{µ+kσ} f(x) dx.

Suppose f(x) is unknown, but the mean µ and the variance σ² are known. What is the probability P(µ − kσ < X < µ + kσ)?

Although we're unable to obtain the exact probability, we can estimate it.


Page 80: Handout Three

Markov’s Inequality

Let X be a nonnegative random variable; then for any t > 0,

P(X ≥ t) ≤ E(X)/t.

Proof:

Page 81: Handout Three

By definition, since X is nonnegative,

E(X) = ∫_{0}^{∞} x f(x) dx

= ∫_{0}^{t} x f(x) dx + ∫_{t}^{∞} x f(x) dx

≥ ∫_{t}^{∞} x f(x) dx

≥ ∫_{t}^{∞} t f(x) dx

= t ∫_{t}^{∞} f(x) dx

= t P(X ≥ t).

Thus,

P(X ≥ t) ≤ E(X)/t.

Page 82: Handout Three

Theorem (Chebyshev’s Inequality)

The probability that any random variable X will assume a value within k standard deviations of the mean is at least 1 − 1/k². That is,

P(µ − kσ < X < µ + kσ) ≥ 1 − 1/k².

Proof:

By definition,

P(µ − kσ < X < µ + kσ) = P(|X − µ| < kσ)

= 1 − P(|X − µ| ≥ kσ)

= 1 − P[(X − µ)² ≥ (kσ)²]    (since (X − µ)² ≥ (kσ)² ⇔ |X − µ| ≥ kσ)

≥ 1 − E[(X − µ)²]/(kσ)²    (by Markov's inequality applied to (X − µ)²)

= 1 − σ²/(kσ)² = 1 − 1/k².

Page 83: Handout Three

Example 19

A random variable X has a mean µ = 8, a variance σ² = 9, and an unknown probability distribution. Find

1. P(−4 < X < 20);

2. P(|X − 8| ≥ 6).

Solution:

1. P(−4 < X < 20) = P(8 − 4 · 3 < X < 8 + 4 · 3) ≥ 1 − 1/4² = 15/16.

2. P(|X − 8| ≥ 6) = 1 − P(|X − 8| < 6) = 1 − P(−6 < X − 8 < 6) = 1 − P(8 − 2 · 3 < X < 8 + 2 · 3) ≤ 1/2² = 1/4.
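Chebyshev's bound holds whatever the distribution. As a quick illustration (a minimal sketch; the gamma distribution below is our own arbitrary choice of a distribution with µ = 8 and σ² = 9, not part of the example), the actual probability sits well below the 1/4 bound:

```python
import numpy as np

rng = np.random.default_rng(0)

# Gamma with shape k and scale theta has mean k*theta and variance k*theta^2;
# k = 64/9, theta = 9/8 gives mean 8 and variance 9 (an arbitrary example choice).
x = rng.gamma(shape=64 / 9, scale=9 / 8, size=1_000_000)

# Empirical P(|X - 8| >= 6) versus the Chebyshev bound 1/k^2 = 0.25 with k = 2.
print(np.mean(np.abs(x - 8) >= 6))  # well below 0.25
```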


Page 84: Handout Three

L’Hopital’s Rule

In simple cases, L’Hopital’s rule states that for functions f(x)

and g(x), if:

lim_{x→c} f(x) = lim_{x→c} g(x) = 0

or:

lim_{x→c} f(x) = lim_{x→c} g(x) = ±∞,

then:

lim_{x→c} f(x)/g(x) = lim_{x→c} f′(x)/g′(x),

where the prime (′) denotes the derivative.

Among other requirements, for this rule to hold, the limit lim_{x→c} f′(x)/g′(x) must exist.

Page 85: Handout Three

Integration by Parts

∫ u dv = uv − ∫ v du.