Introduction to Probability and Statistics
Probability & Statistics for Engineers & Scientists, 9th Ed.
2009
Handout #3
Instructor: Lingzhou Xue
TA: Daniel Eck
Goal

Mean, Expectation, Expected Value: The value you expect to get in a statistical experiment is the mean.

Variance: Measures how spread out a distribution is. In other words, it is a measure of variability.

Covariance: A measure of the linear relationship between two random variables.

Chebyshev's Inequality: Places an upper bound on the probability that a random variable deviates from its mean by at least a set amount. No other information about that variable's distribution is required.
Chapter 4
Mathematical Expectation

4.1 Mean of a Random Variable
Example

Experiment: Toss a coin once.

1. What is the probability of getting one head?
2. Repeat the experiment 10 times. On average, how many heads would there be per experiment?
Let X denote the number of heads. The probability distribution of X is:

x                 0    1
f(x) = P(X = x)  1/2  1/2

Intuition: Suppose we play n times; then the total number of heads we expect to have is n × (1/2) = 0.5n. Then, on average, in each experiment, we have 0.5n/n = 0.5.

Mathematics:

E(X) = 0×(1/2) + 1×(1/2) = 0.5 = ∑_{x=0}^{1} x·f(x).
Example

Experiment: Toss a coin twice.

1. What is the probability of getting one head?
2. Repeat the experiment 10 times. On average, how many heads would there be per experiment?
Let X denote the number of heads. The probability distribution of X is:

x                 0    1    2
f(x) = P(X = x)  1/4  1/2  1/4

Intuition: Expected value of the number of heads: 1.

Mathematics:

E(X) = 0×(1/4) + 1×(1/2) + 2×(1/4) = 1 = ∑_{x=0}^{2} x·f(x).
Example 1
If two coins are tossed 16 times and X is the number of heads
occurring per toss, then the value of X can be 0, 1, and 2.
Suppose that the experiment yields

x      0  1  2
times  4  7  5

The average number of heads per toss of the two coins is then

(0·4 + 1·7 + 2·5)/16 = 1.06.
An average value is not necessarily a possible outcome for the
experiment.
Motivation 1

Consider a casino game in which the probability of losing $1 per game is 0.8 and the probability of winning $2 per game is 0.2. The gain or loss of a gambler who plays this game only a few times depends on his luck more than anything else. For example, in one play of the game, a lucky gambler might win $2, but he has an 80% chance of losing $1. However, if a gambler decides to play the game a large number of times, his loss or gain depends more on the number of plays than on his luck. A calculating player argues that if he plays the game n times, for a large n, then in approximately (0.8)n games he will lose $1 per game, and in (0.2)n games he will win $2. Therefore, his total gain is

(0.8)n·(−1) + (0.2)n·2 = (−0.4)n.

This gives an average loss of $0.40 per game.
If X is the random variable denoting the gain in one play, then the number −0.4 is the average value of X. In this example, X is a discrete random variable with the set of possible values {−1, 2}. The probability function of X, f(x), is given by

x                -1    2
f(x) = P(X = x)  0.8  0.2

Hence

(−1)·f(−1) + 2·f(2) = −0.4,

a relation showing that the expected value of X can be calculated directly by summing the products of the possible values of X with their probabilities.
Expected value is used to describe the long-term average outcome of a given scenario. To calculate an expected value, you take every possible outcome, multiply each by the probability of that outcome happening, and then add those numbers together.
Definition: Expectation

Let X be a random variable with probability distribution f(x). The mean, expected value, or expectation of X is:

• if X is discrete,

  µ = E(X) = ∑_x x·f(x);

• if X is continuous,

  µ = E(X) = ∫_{−∞}^{∞} x·f(x) dx.
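As a quick numerical companion to the two formulas, here is a minimal Python sketch (assuming NumPy and SciPy are available). The discrete pmf is the two-toss coin table above; the continuous density is 2(x − 1) on (1, 2), which reappears as Example 11 later in this handout.

```python
import numpy as np
from scipy.integrate import quad

# Discrete case: X = number of heads in two fair coin tosses.
x = np.array([0, 1, 2])
fx = np.array([1/4, 1/2, 1/4])
mu_discrete = np.sum(x * fx)                  # computes the sum of x * f(x)
print(mu_discrete)                            # 1.0

# Continuous case: the density f(x) = 2(x - 1) on (1, 2) from Example 11 below.
mu_continuous, _ = quad(lambda t: t * 2 * (t - 1), 1, 2)
print(mu_continuous)                          # ~1.6667, i.e. 5/3
```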
Example 2

A lot containing 7 components is sampled by a quality inspector; the lot contains 4 good components and 3 defective components. A sample of 3 is taken by the inspector. Find the expected value of the number of good components in this sample.

Solution:

Let X represent the number of good components in the sample. The probability distribution of X is

f(x) = P(X = x) = (4 choose x)·(3 choose 3−x) / (7 choose 3),   x = 0, 1, 2, 3.

µ = E(X) = ∑_x x·f(x) = 0·(1/35) + 1·(12/35) + 2·(18/35) + 3·(4/35) = 12/7.
Thus, if a sample of size 3 is selected at random over and over again from a lot of 4 good components and 3 defective components, it would contain, on average, 1.7 good components.
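Example 2 is exactly hypergeometric sampling, so (as a sketch, assuming SciPy is available) the pmf and the mean can be cross-checked in a couple of lines:

```python
from scipy.stats import hypergeom

# Lot of M = 7 components, n = 4 of them good, sample of N = 3 drawn.
rv = hypergeom(M=7, n=4, N=3)
print(rv.pmf([0, 1, 2, 3]))   # [1/35, 12/35, 18/35, 4/35]
print(rv.mean())              # 1.714285... = 12/7
```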
Example 3

In a gambling game a man is paid $5 if he gets all heads or all tails when three coins are tossed, and he will pay out $3 if either one or two heads show. What is his expected gain?

Solution:

The sample space for the possible outcomes when three coins are tossed simultaneously is

S = {HHH, HHT, HTH, THH, HTT, THT, TTH, TTT}.

The random variable of interest is Y, the amount the gambler can win; the possible values of Y are $5 if event E1 = {HHH, TTT} occurs and −$3 if event E2 = {HHT, HTH, THH, HTT, THT, TTH} occurs. That is, the probability function of Y is given by

f(y) = P(Y = y) = 1/4, y = 5;  3/4, y = −3;  0, elsewhere.

µ = E(Y) = 5·(1/4) + (−3)·(3/4) = −1.
Example 4

Let X be the random variable that denotes the life in hours of a certain electronic device. The probability density function is

f(x) = 20,000/x³, x > 100;  0, elsewhere.

Find the expected life of this type of device.

Solution:

µ = E(X) = ∫_{100}^{∞} x·(20,000/x³) dx = 20,000 ∫_{100}^{∞} (1/x²) dx = 20,000·(−1/x)|_{100}^{∞} = 200.
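A one-line numerical check of Example 4 (a sketch assuming SciPy; quad accepts an infinite upper limit):

```python
import numpy as np
from scipy.integrate import quad

# E(X) = integral from 100 to infinity of x * 20000/x^3 dx.
mean_life, _ = quad(lambda x: x * 20000 / x**3, 100, np.inf)
print(mean_life)  # ~200.0 hours
```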
Question

Suppose g(X) is a function of a random variable X.

• Is g(X) also a random variable?
• If yes, how do we find the mean of g(X), E[g(X)]?
Illustration 1

Now let us consider a new random variable g(X), which depends on X; that is, each value of g(X) is determined by knowing the values of X.

For instance, let Y = g(X) = X². Suppose X is a discrete random variable with probability distribution f(x):

x                 -1    0    1    2
f(x) = P(X = x)  1/8  3/8  3/8  1/8

Then

P[g(X) = 0] = P(X² = 0) = f(0) = 3/8,
P[g(X) = 1] = P(X² = 1) = f(−1) + f(1) = 4/8,
P[g(X) = 4] = P(X² = 4) = f(2) = 1/8,

so that the probability distribution of Y = g(X) may be written

y = g(x)                           0    1    4
h(y) = P(Y = y) = P[g(X) = g(x)]  3/8  4/8  1/8

µ_{g(X)} = E[g(X)] = E(Y) = ∑_y y·h(y)
  = 0·(3/8) + 1·(4/8) + 4·(1/8)
  = 8/8 = 1
  = (−1)²·(1/8) + 0²·(3/8) + 1²·(3/8) + 2²·(1/8) = ∑_x g(x)·f(x).
Illustration 1

Now let us consider the same construction in general. Let Y = g(X) = X². If X is a discrete random variable with probability distribution f(x), for x = −1, 0, 1, 2, then

P[g(X) = 0] = P(X² = 0) = P(X = 0) = f(0),
P[g(X) = 1] = P(X² = 1) = P(X = −1) + P(X = 1) = f(−1) + f(1),
P[g(X) = 4] = P(X² = 4) = P(X = 2) = f(2),

so that the probability distribution of Y = g(X) may be written

y = g(x)                            0      1              4
h(y) = P(Y = y) = P[g(X) = g(x)]   f(0)   f(−1) + f(1)   f(2)

µ_{g(X)} = E[g(X)] = E(Y) = ∑_y y·h(y)
  = 0·h(0) + 1·h(1) + 4·h(4)
  = 0·f(0) + 1·[f(1) + f(−1)] + 4·f(2)
  = (−1)²·f(−1) + 0²·f(0) + 1²·f(1) + 2²·f(2)
  = ∑_x g(x)·f(x).
Illustration 2

Let Y = g(X) = 3X − 1. If X is a discrete random variable with probability distribution f(x), for x = −1, 0, 1, 2, then

P[g(X) = −4] = P(3X − 1 = −4) = P(X = −1) = f(−1),
P[g(X) = −1] = P(3X − 1 = −1) = P(X = 0) = f(0),
P[g(X) = 2] = P(3X − 1 = 2) = P(X = 1) = f(1),
P[g(X) = 5] = P(3X − 1 = 5) = P(X = 2) = f(2),

so that the probability distribution of g(X) may be written

y = g(x)                            -4      -1     2      5
h(y) = P(Y = y) = P[g(X) = g(x)]   f(−1)   f(0)  f(1)   f(2)

µ_{g(X)} = E[g(X)] = E(Y) = ∑_y y·h(y)
  = (−4)·h(−4) + (−1)·h(−1) + 2·h(2) + 5·h(5)
  = (−4)·f(−1) + (−1)·f(0) + 2·f(1) + 5·f(2)
  = ∑_x (3x − 1)·f(x)
  = ∑_x g(x)·f(x).
Theorem

Let X be a random variable with probability function f(x). The expected value of the random variable g(X) is

• Discrete: if X is discrete,

  µ_{g(X)} = E[g(X)] = ∑_x g(x)·f(x).

• Continuous: if X is continuous,

  µ_{g(X)} = E[g(X)] = ∫_{−∞}^{∞} g(x)·f(x) dx.
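The theorem says the two routes to E[g(X)] agree: build the distribution h(y) of Y = g(X) and sum y·h(y), or sum g(x)·f(x) directly. A small sketch using the pmf from Illustration 1:

```python
import numpy as np
from collections import defaultdict

x = np.array([-1, 0, 1, 2])
fx = np.array([1/8, 3/8, 3/8, 1/8])
g = lambda v: v**2

# Route 1: form h(y) = P(Y = y) for Y = g(X), then take the sum of y * h(y).
h = defaultdict(float)
for xi, fi in zip(x, fx):
    h[g(xi)] += fi
mean_via_h = sum(y * p for y, p in h.items())

# Route 2: take the sum of g(x) * f(x) without ever forming h.
mean_via_g = np.sum(g(x) * fx)

print(mean_via_h, mean_via_g)  # both 1.0
```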
Example 5

Suppose that the number of cars X that pass through a car wash between 4:00 P.M. and 5:00 P.M. on any sunny Friday has the following probability distribution:

x          4     5     6    7    8    9
P(X = x)  1/12  1/12  1/4  1/4  1/6  1/6

Let g(X) = 2X − 1 represent the amount of money, in dollars, paid to the attendant by the manager. Find the attendant's expected earnings for this particular time period.
Solution:

By the theorem, the attendant can expect to receive

E[g(X)] = E(2X − 1) = ∑_{x=4}^{9} (2x − 1)·f(x)
  = 7×(1/12) + 9×(1/12) + 11×(1/4) + 13×(1/4) + 15×(1/6) + 17×(1/6)
  = 12.67.
Example 6

Let X be a random variable with density function

f(x) = x²/3, −1 < x < 2;  0, elsewhere.

Find the expected value of g(X) = 4X + 3.

Solution:

E(4X + 3) = ∫_{−1}^{2} (4x + 3)·(x²/3) dx = (1/3) ∫_{−1}^{2} (4x³ + 3x²) dx
  = (1/3)·(x⁴ + x³)|_{x=−1}^{x=2} = 8.
Definition

Let X and Y be random variables with joint probability distribution f(x, y). The mean or expected value of g(X, Y) is:

• if both X and Y are discrete,

  µ_{g(X,Y)} = E[g(X, Y)] = ∑_x ∑_y g(x, y)·f(x, y);

• if both X and Y are continuous,

  µ_{g(X,Y)} = E[g(X, Y)] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} g(x, y)·f(x, y) dx dy.
Example 7: Example 15 in Handout #2

Let X and Y be the random variables with joint probability distribution indicated in the following table.

                         x                 Row
f(x, y)         0      1      2        Totals
       0      3/28   9/28   3/28       15/28
y      1      6/28   6/28     0        12/28
       2      1/28     0      0         1/28
Column
Totals       10/28  15/28   3/28           1

Find the expected value of g(X, Y) = XY.

Solution:

E[g(X, Y)] = E(XY) = ∑_{x=0}^{2} ∑_{y=0}^{2} x·y·f(x, y) = 3/14.
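The double sum in Example 7 is a natural fit for an array computation; a sketch with the joint table re-entered as a matrix:

```python
import numpy as np

# Joint pmf from Example 7: rows are y = 0, 1, 2; columns are x = 0, 1, 2.
f = np.array([[3, 9, 3],
              [6, 6, 0],
              [1, 0, 0]]) / 28
y = np.arange(3)
x = np.arange(3)

# E(XY) = sum over x and y of x*y*f(x, y); np.outer(y, x) holds the products y*x.
print(np.sum(np.outer(y, x) * f))  # 0.214285... = 3/14
```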
Example 8

Find E(Y/X) for the density function

f(x, y) = x(1 + 3y²)/4, 0 < x < 2, 0 < y < 1;  0, elsewhere.

Solution:

E(Y/X) = ∫_0^1 ∫_0^2 (y/x)·x(1 + 3y²)/4 dx dy = ∫_0^1 (y + 3y³)/2 dy = 5/8.
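Example 8's iterated integral can also be checked with scipy.integrate.dblquad; a minimal sketch (note that dblquad expects the integrand as func(y, x), with x as the outer variable):

```python
from scipy.integrate import dblquad

# E(Y/X): integrate (y/x) * x(1 + 3y^2)/4 over 0 < x < 2, 0 < y < 1.
val, _ = dblquad(lambda y, x: (y / x) * x * (1 + 3 * y**2) / 4,
                 0, 2,                       # x from 0 to 2 (outer)
                 lambda x: 0, lambda x: 1)   # y from 0 to 1 (inner)
print(val)  # 0.625 = 5/8
```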
Notes

Let X and Y be random variables with joint probability distribution f(x, y).

• if g(X, Y) = X,

  E(X) = ∑_x ∑_y x·f(x, y) = ∑_x x·m1(x)   (discrete case);
  E(X) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} x·f(x, y) dx dy = ∫_{−∞}^{∞} x·m1(x) dx   (continuous case),

  where m1(x) is the marginal distribution of X.

• if g(X, Y) = Y,

  E(Y) = ∑_x ∑_y y·f(x, y) = ∑_y y·m2(y)   (discrete case);
  E(Y) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} y·f(x, y) dx dy = ∫_{−∞}^{∞} y·m2(y) dy   (continuous case),

  where m2(y) is the marginal distribution of Y.
Therefore, in calculating E(X) over a two-dimensional space,
one may use either the joint probability function of X and Y or
the marginal distribution of X.
Riddle
Why are the mean, median, and mode like a valuable piece of
real estate?
LOCATION! LOCATION! LOCATION!
4.2 Variance and Covariance of Random Variables
If I put your feet in boiling water and your head in ice, then on average you'd be comfortable. However, you'd die eventually. It is not enough to just consider the average.
The variance of a random variable is a measure of its statistical dispersion, indicating how its possible values are spread around the expected value. While the expected value shows the location of the distribution, the variance indicates the variability of the values.

[Figure: distributions with equal means µ = 2 and unequal dispersions.]
Definition: Variance

Let X be a random variable with probability function f(x) and mean µ. The variance of the random variable X is

• Discrete: if X is discrete,

  σ² = Var(X) = E[(X − µ)²] = ∑_x (x − µ)²·f(x).

• Continuous: if X is continuous,

  σ² = Var(X) = E[(X − µ)²] = ∫_{−∞}^{∞} (x − µ)²·f(x) dx.

The variance is the average squared distance of the possible values of X from the expected value µ.
Definition: Standard Deviation

The positive square root of the variance, σ, is called the standard deviation of X:

σ = √Var(X).
Example 9

Let the random variable X represent the number of automobiles that are used for official business purposes on any given workday. The probability distribution for company A is

x     1    2    3
f(x)  0.3  0.4  0.3

and for company B

x     0    1    2    3    4
f(x)  0.2  0.1  0.3  0.3  0.1
Show that the variance of the probability distribution for company B is greater than that of company A.

Solution:

µ_A = 1·0.3 + 2·0.4 + 3·0.3 = 2;
µ_B = 0·0.2 + 1·0.1 + 2·0.3 + 3·0.3 + 4·0.1 = 2;

σ²_A = ∑_{x=1}^{3} (x − 2)²·f_A(x) = 0.6;
σ²_B = ∑_{x=0}^{4} (x − 2)²·f_B(x) = 1.6.
Theorem

The variance of a random variable X is

σ² = Var(X) = E(X²) − µ².

Proof:

For the discrete case we can write

σ² = ∑_x (x − µ)²·f(x) = ∑_x (x² − 2µx + µ²)·f(x)
   = ∑_x x²·f(x) − 2µ·∑_x x·f(x) + µ²·∑_x f(x)
   = ∑_x x²·f(x) − 2µ·µ + µ²·1
   = ∑_x x²·f(x) − µ²
   = E(X²) − µ².
Example 10

Let the random variable X represent the number of defective parts for a machine when 3 parts are sampled from a production line and tested. The following is the probability distribution of X.

x     0     1     2     3
f(x)  0.51  0.38  0.10  0.01

Calculate the variance σ².

Solution:

µ = 0·0.51 + 1·0.38 + 2·0.10 + 3·0.01 = 0.61;
E(X²) = 0·0.51 + 1·0.38 + 4·0.10 + 9·0.01 = 0.87;
σ² = E(X²) − µ² = 0.87 − 0.61² = 0.4979.
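A quick sketch checking Example 10 both by the definition of variance and by the shortcut σ² = E(X²) − µ²:

```python
import numpy as np

x = np.array([0, 1, 2, 3])
fx = np.array([0.51, 0.38, 0.10, 0.01])

mu = np.sum(x * fx)                          # 0.61
var_by_definition = np.sum((x - mu)**2 * fx) # E[(X - mu)^2]
var_by_shortcut = np.sum(x**2 * fx) - mu**2  # E(X^2) - mu^2
print(var_by_definition, var_by_shortcut)    # both 0.4979
```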
Example 11

The weekly demand for Pepsi, in thousands of liters, from a local chain of efficiency stores is a continuous random variable X having the probability density

f(x) = 2(x − 1), 1 < x < 2;  0, elsewhere.

Find the mean and variance of X.

Solution:

µ = E(X) = ∫_1^2 x·2(x − 1) dx = 5/3.

E(X²) = ∫_1^2 x²·2(x − 1) dx = 17/6.

σ² = Var(X) = E(X²) − µ² = 17/6 − (5/3)² = 1/18.
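The same shortcut works numerically in the continuous case; a minimal sketch of Example 11 with SciPy:

```python
from scipy.integrate import quad

f = lambda x: 2 * (x - 1)                   # density on (1, 2)
mu, _ = quad(lambda x: x * f(x), 1, 2)      # E(X)   = 5/3
ex2, _ = quad(lambda x: x**2 * f(x), 1, 2)  # E(X^2) = 17/6
print(mu, ex2 - mu**2)                      # ~1.6667 and ~0.05556 = 1/18
```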
Theorem

Let X be a random variable with probability function f(x). The variance of the random variable g(X) is

• Discrete: if X is discrete,

  σ²_{g(X)} = E{[g(X) − µ_{g(X)}]²} = ∑_x [g(x) − µ_{g(X)}]²·f(x).

• Continuous: if X is continuous,

  σ²_{g(X)} = E{[g(X) − µ_{g(X)}]²} = ∫_{−∞}^{∞} [g(x) − µ_{g(X)}]²·f(x) dx.
Example 12

Calculate the variance of g(X) = 2X + 3, where X is a random variable with probability distribution

x     0    1    2    3
f(x)  1/4  1/8  1/2  1/8

Solution:

µ_{2X+3} = E(2X + 3) = ∑_{x=0}^{3} (2x + 3)·f(x) = 6.

σ²_{2X+3} = E{[(2X + 3) − µ_{2X+3}]²} = E[(2X + 3 − 6)²]
  = E(4X² − 12X + 9) = ∑_{x=0}^{3} (4x² − 12x + 9)·f(x)
  = 4.
Example 6 Cont'd

Find the variance of the random variable g(X) = 4X + 3.

Solution:

σ²_{4X+3} = E{[(4X + 3) − µ_{4X+3}]²} = E[(4X + 3 − 8)²]
  = E(16X² − 40X + 25) = ∫_{−1}^{2} (16x² − 40x + 25)·(x²/3) dx
  = 51/5.
Question
Does there exist a quantity to measure how much two variables
change together or provide a measure of the strength of the
correlation between two random variables?
Example
Let X denote the father’s height, Y the daughter’s height, and
Z the mother’s height.
[Two scatter plots (unit of measurement: foot): Father's Height vs. Daughter's Height with Cov(X, Y) = 0.3, and Father's Height vs. Mother's Height with Cov(X, Z) = 0.02.]
[Figure: (a) Cov(X, Y) > 0; (b) Cov(X, Y) < 0.]
Definition: Covariance

Let X and Y be random variables with joint probability distribution f(x, y). The covariance of the random variables X and Y is

• Discrete: if both X and Y are discrete,

  σ_{XY} = Cov(X, Y) = E[(X − µ_X)(Y − µ_Y)]
         = ∑_x ∑_y (x − µ_X)(y − µ_Y)·f(x, y);

• Continuous: if both X and Y are continuous,

  σ_{XY} = Cov(X, Y) = E[(X − µ_X)(Y − µ_Y)]
         = ∫_{−∞}^{∞} ∫_{−∞}^{∞} (x − µ_X)(y − µ_Y)·f(x, y) dx dy.
The covariance between two random variables is a measurement of the nature of the association between the two. If large values of X often result in large values of Y, or small values of X result in small values of Y, positive X − µ_X will often result in positive Y − µ_Y, and negative X − µ_X will often result in negative Y − µ_Y. Thus the product (X − µ_X)(Y − µ_Y) will tend to be positive. On the other hand, if large X values often result in small Y values, the product (X − µ_X)(Y − µ_Y) will tend to be negative. Thus the sign of the covariance indicates whether the relationship between two dependent random variables is positive or negative.

When X and Y are statistically independent, it can be shown that the covariance is zero. The converse, however, is not generally true: two variables may have zero covariance and still not be statistically independent. Note that the covariance describes only the linear relationship between two random variables. Therefore, if the covariance between X and Y is zero, X and Y may have a nonlinear relationship, which means that they are not necessarily independent.
Example 13

Let X be a random variable with probability distribution

x     -2   -1    0    1    2
f(x)  0.2  0.2  0.2  0.2  0.2

Let Y = g(X) = X². Then Cov(X, Y) = 0, but X and Y have a quadratic relationship.
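Example 13 can be verified mechanically from the table; a short sketch:

```python
import numpy as np

x = np.array([-2, -1, 0, 1, 2])
fx = np.full(5, 0.2)
y = x**2                               # Y = g(X) = X^2

mu_x = np.sum(x * fx)                  # 0
mu_y = np.sum(y * fx)                  # 2
print(np.sum((x - mu_x) * (y - mu_y) * fx))  # 0.0, despite Y being exactly X^2
```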
Theorem

The covariance of two random variables X and Y with means µ_X and µ_Y, respectively, is given by

σ_{XY} = Cov(X, Y) = E(XY) − µ_X·µ_Y.

Proof:

For the discrete case we can write

σ_{XY} = ∑_x ∑_y (x − µ_X)(y − µ_Y)·f(x, y)
       = ∑_x ∑_y (xy − µ_X·y − µ_Y·x + µ_X·µ_Y)·f(x, y)
       = ∑_x ∑_y xy·f(x, y) − µ_X·∑_x ∑_y y·f(x, y)
         − µ_Y·∑_x ∑_y x·f(x, y) + µ_X·µ_Y·∑_x ∑_y f(x, y)
       = E(XY) − µ_X·µ_Y − µ_Y·µ_X + µ_X·µ_Y
       = E(XY) − µ_X·µ_Y.
Example 14

The fraction X of male runners and the fraction Y of female runners who compete in marathon races are described by the joint probability density function

f(x, y) = 8xy, 0 ≤ y ≤ x ≤ 1;  0, elsewhere.

Find the covariance of X and Y.

Solution:

We first compute the marginal density functions:

g(x) = 4x³, 0 < x < 1;  0, elsewhere.
h(y) = 4y(1 − y²), 0 < y < 1;  0, elsewhere.

µ_X = E(X) = ∫_0^1 4x⁴ dx = 4/5;
µ_Y = ∫_0^1 4y²(1 − y²) dy = 8/15.

E(XY) = ∫_0^1 ∫_y^1 8x²y² dx dy = 4/9.

σ_{XY} = E(XY) − µ_X·µ_Y = 4/9 − (4/5)·(8/15) = 4/225.
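Because the support 0 ≤ y ≤ x ≤ 1 is triangular, the inner limit of integration depends on x; dblquad handles that through its limit functions. A sketch verifying Example 14:

```python
from scipy.integrate import dblquad

f = lambda y, x: 8 * x * y   # joint density on the triangle 0 <= y <= x <= 1

def expect(g):
    # Integrates g(x, y) * f(x, y) for x in (0, 1) and y in (0, x).
    val, _ = dblquad(lambda y, x: g(x, y) * f(y, x),
                     0, 1, lambda x: 0, lambda x: x)
    return val

cov = expect(lambda x, y: x * y) - expect(lambda x, y: x) * expect(lambda x, y: y)
print(cov)  # ~0.017778 = 4/225
```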
Although the covariance between two random variables does provide information regarding the nature of the relationship, the magnitude of σ_{XY} does not indicate anything regarding the strength of the relationship, since σ_{XY} is not scale-free. Its magnitude will depend on the units of measurement of both X and Y. There is a scale-free version of the covariance, called the correlation coefficient, that is used widely in statistics.
Example
Let X denote the father’s height, Y the daughter’s height, and
Z the mother’s height.
[Two scatter plots of Father's Height vs. Daughter's Height: measured in feet, Cov(X, Y) = 0.3; the same data measured in inches, Cov(X, Y) = 30.]
Definition: Correlation Coefficient

Let X and Y be random variables with covariance σ_{XY} and standard deviations σ_X and σ_Y, respectively. The correlation coefficient of X and Y is

ρ_{XY} = Cov(X, Y)/√(Var(X)·Var(Y)) = σ_{XY}/(σ_X·σ_Y).

Notice:

• −1 ≤ ρ_{XY} ≤ 1.
• When Y = a + bX:
  – If b > 0, ρ_{XY} = 1.
  – If b < 0, ρ_{XY} = −1.
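The scale-free claim is easy to see in a simulation. The sketch below uses synthetic father/daughter heights (arbitrary parameters, not the handout's data): rescaling from feet to inches multiplies the covariance by 12², but the correlation is unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)
father = rng.normal(5.8, 0.25, size=10_000)                   # heights in feet
daughter = 0.5 * father + rng.normal(2.5, 0.20, size=10_000)  # linear trend + noise

for unit, scale in [("feet", 1.0), ("inches", 12.0)]:
    x, y = father * scale, daughter * scale
    cov = np.cov(x, y)[0, 1]           # off-diagonal of the sample covariance matrix
    rho = np.corrcoef(x, y)[0, 1]
    print(f"{unit}: cov={cov:.4f}, rho={rho:.4f}")
# The covariance grows by a factor of 144; rho is identical in both rows.
```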
Example 14 Cont'd

Find the correlation coefficient of X and Y.

E(X²) = ∫_0^1 x²·4x³ dx = 2/3, so σ²_X = 2/3 − (4/5)² = 2/75;
E(Y²) = ∫_0^1 y²·4y(1 − y²) dy = 1/3, so σ²_Y = 1/3 − (8/15)² = 11/225.

ρ_{XY} = σ_{XY}/(σ_X·σ_Y) = (4/225)/√((2/75)·(11/225)) = 4/√66 ≈ 0.49.
4.3 Means and Variances of Linear Combinations of Random Variables
Theorem

If a and b are constants, then

E(aX + b) = a·E(X) + b.

Proof:

E(aX + b) = ∫_{−∞}^{∞} (ax + b)·f(x) dx
          = a·∫_{−∞}^{∞} x·f(x) dx + b·∫_{−∞}^{∞} f(x) dx
          = a·E(X) + b.
Example 15

Let X be a random variable with density function

f(x) = x²/3, −1 < x < 2;  0, elsewhere.

Find the expected value of g(X) = 4X + 3.

Solution:

Directly: E(4X + 3) = ∫_{−1}^{2} (4x + 3)·(x²/3) dx = (1/3) ∫_{−1}^{2} (4x³ + 3x²) dx = 8.

By the theorem: E(X) = ∫_{−1}^{2} x·(x²/3) dx = 15/12, so

E(4X + 3) = 4·E(X) + 3 = 4·(15/12) + 3 = 8.
Theorem

The expected value of the sum or difference of two or more functions of a random variable X is the sum or difference of the expected values of the functions. That is,

E[g(X) ± h(X)] = E[g(X)] ± E[h(X)].

Proof:

E[g(X) ± h(X)] = ∫_{−∞}^{∞} [g(x) ± h(x)]·f(x) dx
               = ∫_{−∞}^{∞} g(x)·f(x) dx ± ∫_{−∞}^{∞} h(x)·f(x) dx
               = E[g(X)] ± E[h(X)].
Theorem

The expected value of the sum or difference of two or more functions of random variables X and Y is the sum or difference of the expected values of the functions. That is,

E[g(X, Y) ± h(X, Y)] = E[g(X, Y)] ± E[h(X, Y)].

Proof:

E[g(X, Y) ± h(X, Y)] = ∫_{−∞}^{∞} ∫_{−∞}^{∞} [g(x, y) ± h(x, y)]·f(x, y) dx dy
  = ∫_{−∞}^{∞} ∫_{−∞}^{∞} g(x, y)·f(x, y) dx dy ± ∫_{−∞}^{∞} ∫_{−∞}^{∞} h(x, y)·f(x, y) dx dy
  = E[g(X, Y)] ± E[h(X, Y)].
Theorem

Let X and Y be two independent random variables. Then

E(XY) = E(X)·E(Y).

Proof:

Suppose that g(x) and h(y) are the marginal distributions of X and Y, respectively. Since X and Y are independent, we may write f(x, y) = g(x)·h(y). By definition,

E(XY) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} xy·f(x, y) dx dy
      = ∫_{−∞}^{∞} ∫_{−∞}^{∞} xy·g(x)·h(y) dx dy
      = ∫_{−∞}^{∞} x·g(x) dx · ∫_{−∞}^{∞} y·h(y) dy
      = E(X)·E(Y).
Corollary

Let X and Y be two independent random variables. Then σ_{XY} = 0.

Proof:

Cov(X, Y) = E(XY) − E(X)·E(Y) = 0.
Example 16

Given the joint density function

f(x, y) = x(1 + 3y²)/4, 0 < x < 2, 0 < y < 1;  0, elsewhere,

verify that E(XY) = E(X)·E(Y).

Solution:

• g(x) = x/2, 0 < x < 2.
• h(y) = (1 + 3y²)/2, 0 < y < 1.
• f(x, y) = g(x)·h(y) for all x and y, so X and Y are independent.

E(XY) = ∫_0^1 ∫_0^2 xy·f(x, y) dx dy = 5/6.

E(X) = ∫_0^1 ∫_0^2 x·f(x, y) dx dy = ∫_0^2 x·g(x) dx = 4/3.

E(Y) = ∫_0^1 ∫_0^2 y·f(x, y) dx dy = ∫_0^1 y·h(y) dy = 5/8.

Hence,

E(X)·E(Y) = (4/3)·(5/8) = 5/6 = E(XY).
Example 17

The joint density function of X and Y is given by

f(x, y) = (2/7)(x + 2y), 0 < x < 1, 1 < y < 2;  0, elsewhere.

Find the expected value of X/Y.

Solution:

E(X/Y) = ∫_1^2 ∫_0^1 (x/y)·(2/7)(x + 2y) dx dy = 2/7 + (2/21)·ln 2 ≈ 0.352 ≠ E(X)/E(Y).

Question:

• Does E(XY) = E(X)·E(Y)?
• Does E(X/Y) = E(X)·E(1/Y)?
Theorem

If a and b are constants, then

σ²_{aX+b} = Var(aX + b) = a²·Var(X).

Proof:

By definition,

Var(aX + b) = E[aX + b − µ_{aX+b}]²
            = E[aX + b − (a·µ_X + b)]²
            = E[(aX − a·µ_X)²]
            = E[a²·(X − µ_X)²]
            = a²·E[(X − µ_X)²] = a²·Var(X).
Theorem

If X and Y are random variables with joint probability distribution f(x, y), and a and b are constants, then

σ²_{aX+bY} = Var(aX + bY) = a²·Var(X) + 2ab·Cov(X, Y) + b²·Var(Y).

Proof:

By definition,

Var(aX + bY) = E[aX + bY − µ_{aX+bY}]²
  = E[aX + bY − (a·µ_X + b·µ_Y)]²
  = E[a(X − µ_X) + b(Y − µ_Y)]²
  = E[a²(X − µ_X)² + 2ab(X − µ_X)(Y − µ_Y) + b²(Y − µ_Y)²]
  = a²·E[(X − µ_X)²] + 2ab·E[(X − µ_X)(Y − µ_Y)] + b²·E[(Y − µ_Y)²]
  = a²·Var(X) + 2ab·Cov(X, Y) + b²·Var(Y).
Corollary

Let X and Y be two independent random variables. Then

Var(aX ± bY) = a²·Var(X) + b²·Var(Y).
Example 18

Let X be a random variable with density function

f(x) = x²/3, −1 < x < 2;  0, elsewhere.

Find the variance of g(X) = 4X + 3.

Solution:

Directly (as in Example 6 Cont'd):

Var(4X + 3) = ∫_{−1}^{2} (4x + 3 − 8)²·(x²/3) dx = 51/5.

By the theorem, with µ_X = 15/12 = 5/4:

Var(X) = ∫_{−1}^{2} (x − 5/4)²·(x²/3) dx = 51/80,

Var(4X + 3) = 4²·Var(X) = 16·(51/80) = 51/5.
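The formula Var(aX + bY) = a²Var(X) + 2ab·Cov(X, Y) + b²Var(Y) above can also be checked by simulation; a sketch with a correlated Gaussian pair and hypothetical constants a and b:

```python
import numpy as np

rng = np.random.default_rng(2)
# Hypothetical pair with Var(X) = 4, Var(Y) = 1, Cov(X, Y) = 1.
cov = np.array([[4.0, 1.0],
                [1.0, 1.0]])
x, y = rng.multivariate_normal([0.0, 0.0], cov, size=200_000).T

a, b = 3.0, -2.0
empirical = np.var(a * x + b * y)
theoretical = a**2 * 4 + 2 * a * b * 1 + b**2 * 1  # a^2 Var(X) + 2ab Cov + b^2 Var(Y)
print(empirical, theoretical)  # both ~28
```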
4.4 Chebyshev’s Theorem
Question

Let X have a probability density function f(x). Then

µ = E(X) = ∫ x·f(x) dx,   σ² = Var(X) = ∫ (x − µ)²·f(x) dx.

P(µ − kσ < X < µ + kσ) = ∫_{µ−kσ}^{µ+kσ} f(x) dx.

Suppose f(x) is unknown, but the mean µ and the variance σ² are known. What is the probability P(µ − kσ < X < µ + kσ)?

Although we are unable to obtain the exact probability, we can estimate it.
Markov's Inequality

Let X be a nonnegative random variable; then for any t > 0,

P(X ≥ t) ≤ E(X)/t.

Proof:

By definition, and since X is nonnegative,

E(X) = ∫_{−∞}^{∞} x·f(x) dx
     = ∫_0^{∞} x·f(x) dx
     = ∫_0^t x·f(x) dx + ∫_t^{∞} x·f(x) dx
     ≥ ∫_t^{∞} x·f(x) dx
     ≥ ∫_t^{∞} t·f(x) dx
     = t·∫_t^{∞} f(x) dx
     = t·P(X ≥ t).

Thus,

P(X ≥ t) ≤ E(X)/t.
Theorem (Chebyshev's Inequality)

The probability that any random variable X will assume a value within k standard deviations of the mean is at least 1 − 1/k². That is,

P(µ − kσ < X < µ + kσ) ≥ 1 − 1/k².

Proof:

By definition,

P(µ − kσ < X < µ + kσ) = P[|X − µ| < kσ]
  = 1 − P[(X − µ)² ≥ (kσ)²]    (since (X − µ)² ≥ (kσ)² ⇔ |X − µ| ≥ kσ)
  ≥ 1 − E[(X − µ)²]/(kσ)²      (by Markov's inequality applied to (X − µ)²)
  = 1 − σ²/(kσ)² = 1 − 1/k².
Example 19

A random variable X has a mean µ = 8, a variance σ² = 9, and an unknown probability distribution. Find

1. P(−4 < X < 20);
2. P(|X − 8| ≥ 6).

Solution:

1. P(−4 < X < 20) = P(8 − 4·3 < X < 8 + 4·3) ≥ 1 − 1/4² = 15/16.
2. P(|X − 8| ≥ 6) = 1 − P(|X − 8| < 6) = 1 − P(−6 < X − 8 < 6) = 1 − P(8 − 2·3 < X < 8 + 2·3) ≤ 1/2² = 1/4.
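Chebyshev's bound can be sanity-checked by simulation. As a sketch, take one particular distribution with µ = 8 and σ² = 9 (a Gamma, chosen arbitrarily since Example 19 leaves the distribution unspecified) and compare the empirical probabilities with the bounds:

```python
import numpy as np

rng = np.random.default_rng(1)
# One arbitrary distribution with mu = 8 and variance 9: a Gamma with
# shape k = 64/9 and scale theta = 9/8, so mean k*theta = 8, var k*theta^2 = 9.
x = rng.gamma(shape=64/9, scale=9/8, size=100_000)

print(np.mean((-4 < x) & (x < 20)))  # ~0.999, consistent with the bound >= 15/16
print(np.mean(np.abs(x - 8) >= 6))   # ~0.04, consistent with the bound <= 1/4
```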
L'Hopital's Rule

In simple cases, L'Hopital's rule states that for functions f(x) and g(x), if

lim_{x→c} f(x) = lim_{x→c} g(x) = 0

or

lim_{x→c} f(x) = lim_{x→c} g(x) = ±∞,

then

lim_{x→c} f(x)/g(x) = lim_{x→c} f′(x)/g′(x),

where the prime (′) denotes the derivative.

Among other requirements, for this rule to hold, the limit lim_{x→c} f′(x)/g′(x) must exist.
Integration by Parts

∫ u dv = u·v − ∫ v du.