Lecture 11 - PKUmwfy.gsm.pku.edu.cn/miao_files/ProbStat/lecture11.pdf

Lecture 11

• Covariance and correlation

• The Sample Mean

Covariance and Correlation

• Covariance: a measure of the association between two random variables.

Let X and Y be random variables having a specified joint distribution, and let $E(X) = \mu_X$, $E(Y) = \mu_Y$. The covariance of X and Y is defined as

$$\operatorname{Cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)],$$

if the expectation exists.

• It can be shown (as an exercise) that if both X and Y have finite variance, then the expectation exists.

Correlation

• If $0 < \sigma_X^2 < \infty$ and $0 < \sigma_Y^2 < \infty$, the correlation of X and Y is defined as

$$\rho(X, Y) = \frac{\operatorname{Cov}(X, Y)}{\sigma_X \sigma_Y}.$$

• The range of possible values of the correlation is

$$-1 \le \rho(X, Y) \le 1.$$

Theorem (Schwarz inequality): For any random variables U and V,

$$[E(UV)]^2 \le E(U^2)\,E(V^2).$$

• Let $U = X - \mu_X$ and $V = Y - \mu_Y$; then

$$[\operatorname{Cov}(X, Y)]^2 = [E(UV)]^2 \le E(U^2)\,E(V^2) = \sigma_X^2 \sigma_Y^2 \;\Rightarrow\; [\rho(X, Y)]^2 \le 1 \;\Rightarrow\; -1 \le \rho(X, Y) \le 1.$$

• X and Y are positively correlated: $\rho(X, Y) > 0$.
• X and Y are negatively correlated: $\rho(X, Y) < 0$.
• X and Y are uncorrelated: $\rho(X, Y) = 0$.

Properties of Covariance and Correlation

Theorem. For any random variables X and Y such that $\sigma_X^2 < \infty$ and $\sigma_Y^2 < \infty$,

$$\operatorname{Cov}(X, Y) = E(XY) - E(X)E(Y).$$

Proof.

$$\begin{aligned}
\operatorname{Cov}(X, Y) &= E[(X - \mu_X)(Y - \mu_Y)] = E(XY - \mu_X Y - \mu_Y X + \mu_X \mu_Y) \\
&= E(XY) - \mu_X E(Y) - \mu_Y E(X) + \mu_X \mu_Y = E(XY) - E(X)E(Y).
\end{aligned}$$
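The two expressions for the covariance can be compared numerically. The sketch below is illustrative only: the joint p.f. `f` and the value sets `x_vals` and `y_vals` are made up for this example and do not come from the lecture.

```r
# Hypothetical joint p.f. of (X, Y): rows are x = 0, 1, 2; columns are y = 0, 1.
x_vals <- c(0, 1, 2)
y_vals <- c(0, 1)
f <- matrix(c(0.20, 0.10,
              0.20, 0.20,
              0.05, 0.25), nrow = 3, byrow = TRUE)

mu_x <- sum(x_vals * rowSums(f))   # E(X) from the marginal p.f. of X
mu_y <- sum(y_vals * colSums(f))   # E(Y) from the marginal p.f. of Y

# Definition: Cov(X, Y) = E[(X - mu_X)(Y - mu_Y)]
cov_def <- sum(outer(x_vals - mu_x, y_vals - mu_y) * f)

# Shortcut: Cov(X, Y) = E(XY) - E(X)E(Y)
cov_short <- sum(outer(x_vals, y_vals) * f) - mu_x * mu_y

# Correlation, using the standard deviations of the two marginals
sd_x <- sqrt(sum((x_vals - mu_x)^2 * rowSums(f)))
sd_y <- sqrt(sum((y_vals - mu_y)^2 * colSums(f)))
rho <- cov_def / (sd_x * sd_y)

print(c(cov_def, cov_short, rho))
```

Both covariance values should agree up to rounding error, and the correlation should fall between -1 and 1.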

• Theorem. If X and Y are independent random variables with $0 < \sigma_X^2 < \infty$ and $0 < \sigma_Y^2 < \infty$, then

$$\operatorname{Cov}(X, Y) = \rho(X, Y) = 0.$$

Proof. If X and Y are independent, then E(XY)=E(X)E(Y). Therefore,

Cov(X,Y)=E(XY)-E(X)E(Y)=0.

It follows that $\rho(X, Y) = 0$.

• Remark: Two uncorrelated random variables can be dependent.

Example:

Suppose that X can take only three values, -1, 0, and 1, and each of these three values has the same probability. Let Y be defined by $Y = X^2$.

Show that X and Y are dependent but uncorrelated.

Proof. (1) X and Y are dependent.

$$\begin{aligned}
f(1, 1) &= \Pr(X = 1 \text{ and } Y = 1) = \Pr(X = 1 \text{ and } X^2 = 1) \\
&= \Pr(X = 1 \text{ and } (X = 1 \text{ or } X = -1)) = \Pr(X = 1) = \tfrac{1}{3}, \\
f_1(1) &= \Pr(X = 1) = \tfrac{1}{3}, \\
f_2(1) &= \Pr(Y = 1) = \Pr(X^2 = 1) = \Pr(X = 1 \text{ or } X = -1) = \tfrac{2}{3}.
\end{aligned}$$

Thus $f(1, 1) \ne f_1(1) f_2(1)$; that is, X and Y are not independent.

(2) X and Y are uncorrelated.

$$E(XY) = E(X^3) = E(X) = 0 \text{ and } E(X)E(Y) = 0 \;\Rightarrow\; \operatorname{Cov}(X, Y) = 0.$$
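The example can also be checked by direct enumeration. This is a minimal sketch, not part of the original slides; the variable names are chosen here for illustration.

```r
x <- c(-1, 0, 1)      # possible values of X
p <- rep(1 / 3, 3)    # each value has probability 1/3
y <- x^2              # Y = X^2, evaluated at each value of X

# Cov(X, Y) = E(XY) - E(X)E(Y)
cov_xy <- sum(x * y * p) - sum(x * p) * sum(y * p)

# Compare f(1, 1) with f1(1) * f2(1) to test for independence
joint_11 <- sum(p[x == 1 & y == 1])
prod_11 <- sum(p[x == 1]) * sum(p[y == 1])

print(c(cov_xy, joint_11, prod_11))
```

The covariance comes out as zero, while the joint probability (1/3) differs from the product of the marginals (2/9).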

• Theorem. Suppose X is a random variable with $0 < \sigma_X^2 < \infty$, and suppose that Y=aX+b where $a \ne 0$. If a>0, then $\rho(X, Y) = 1$. If a<0, then $\rho(X, Y) = -1$.

Proof. If Y=aX+b, then $\mu_Y = a\mu_X + b$ and $Y - \mu_Y = a(X - \mu_X)$. Therefore,

$$\operatorname{Cov}(X, Y) = aE[(X - \mu_X)^2] = a\sigma_X^2 \quad\text{and}\quad \sigma_Y^2 = a^2\sigma_X^2 \;\Rightarrow\; \sigma_Y = |a|\,\sigma_X,$$

so that

$$\rho(X, Y) = \frac{\operatorname{Cov}(X, Y)}{\sigma_X \sigma_Y} = \operatorname{sign}(a).$$
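A quick numerical illustration of this theorem, assuming a small made-up set of values for X that are treated as equally likely. The sample correlation from `cor()` stands in for $\rho$ here, which is an assumption of the sketch rather than something stated in the lecture.

```r
x <- c(1, 2, 4, 7, 11)            # hypothetical values of X, equally likely

rho_pos <- cor(x, 3 * x + 5)      # a = 3 > 0
rho_neg <- cor(x, -0.5 * x + 2)   # a = -0.5 < 0

print(c(rho_pos, rho_neg))
```

Up to rounding error, the two results are 1 and -1, matching the sign of a.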

• Theorem. If X and Y are random variables such that $\operatorname{Var}(X) < \infty$ and $\operatorname{Var}(Y) < \infty$, then

Var(X+Y)=Var(X)+Var(Y)+2Cov(X,Y).

Proof. Since $E(X + Y) = \mu_X + \mu_Y$,

$$\begin{aligned}
\operatorname{Var}(X + Y) &= E[(X + Y - \mu_X - \mu_Y)^2] \\
&= E[(X - \mu_X)^2 + (Y - \mu_Y)^2 + 2(X - \mu_X)(Y - \mu_Y)] \\
&= \operatorname{Var}(X) + \operatorname{Var}(Y) + 2\operatorname{Cov}(X, Y).
\end{aligned}$$

• Remark. For any constants a and b, we can show that Cov(aX,bY)=abCov(X,Y). It follows that

$$\begin{aligned}
\operatorname{Var}(aX + bY + c) &= \operatorname{Var}(aX) + \operatorname{Var}(bY) + 2\operatorname{Cov}(aX, bY) \\
&= a^2\operatorname{Var}(X) + b^2\operatorname{Var}(Y) + 2ab\operatorname{Cov}(X, Y).
\end{aligned}$$

• Theorem. If $X_1, \ldots, X_n$ are random variables such that $\operatorname{Var}(X_i) < \infty$ for $i = 1, \ldots, n$, then

$$\operatorname{Var}\left(\sum_{i=1}^{n} X_i\right) = \sum_{i=1}^{n} \operatorname{Var}(X_i) + 2\sum\sum_{i<j} \operatorname{Cov}(X_i, X_j).$$

Proof. For any random variable Y, Cov(Y,Y)=Var(Y). Therefore,

$$\begin{aligned}
\operatorname{Var}\left(\sum_{i=1}^{n} X_i\right) &= \operatorname{Cov}\left(\sum_{i=1}^{n} X_i, \sum_{j=1}^{n} X_j\right) = \sum_{i=1}^{n}\sum_{j=1}^{n} \operatorname{Cov}(X_i, X_j) \\
&= \sum_{i=1}^{n} \operatorname{Cov}(X_i, X_i) + \sum\sum_{i \ne j} \operatorname{Cov}(X_i, X_j) \\
&= \sum_{i=1}^{n} \operatorname{Var}(X_i) + 2\sum\sum_{i<j} \operatorname{Cov}(X_i, X_j).
\end{aligned}$$
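The same identity holds for sample variances and covariances, which gives an easy way to see it at work. The data matrix `m` below is hypothetical, and the use of sample moments in place of population moments is an assumption of this sketch.

```r
# Hypothetical data: each column is one variable X_i, each row one observation.
m <- cbind(x1 = c(2, 4, 4, 7, 9),
           x2 = c(1, 3, 2, 5, 4),
           x3 = c(8, 6, 7, 3, 1))

s <- cov(m)   # sample covariance matrix; the diagonal holds the variances

lhs <- var(rowSums(m))                           # Var(X1 + X2 + X3)
rhs <- sum(diag(s)) + 2 * sum(s[upper.tri(s)])   # sum of Var + 2 * sum of Cov over i < j

print(c(lhs, rhs))
```

The two numbers agree up to rounding error, because the off-diagonal entries of the covariance matrix are exactly the terms with i ≠ j in the proof.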

• Remark. If $X_1, \ldots, X_n$ are uncorrelated random variables, then

$$\operatorname{Var}\left(\sum_{i=1}^{n} X_i\right) = \sum_{i=1}^{n} \operatorname{Var}(X_i).$$
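As a worked instance of this remark, consider the standard case of n independent Bernoulli trials with success probability p (an example chosen here for illustration). Independent variables are uncorrelated, so the variance of the total T is the sum of the individual variances:

```latex
\[
  T = \sum_{i=1}^{n} X_i, \qquad
  \operatorname{Var}(X_i) = E(X_i^2) - [E(X_i)]^2 = p - p^2 = p(1 - p),
\]
\[
  \operatorname{Var}(T) = \sum_{i=1}^{n} \operatorname{Var}(X_i) = n p (1 - p),
  \qquad\text{so for } p = \tfrac{1}{2}: \quad \operatorname{Var}(T) = \frac{n}{4}.
\]
```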

Markov Inequality

• Theorem (Markov inequality). Suppose that X is a random variable such that $\Pr(X \ge 0) = 1$. Then for any given number t>0,

$$\Pr(X \ge t) \le \frac{E(X)}{t}.$$

Proof. Assume for convenience that X has a discrete distribution with p.f. f. Then,

$$\begin{aligned}
E(X) &= \sum_{x} x f(x) = \sum_{x < t} x f(x) + \sum_{x \ge t} x f(x) \\
\Rightarrow\; E(X) &\ge \sum_{x \ge t} x f(x) \ge \sum_{x \ge t} t f(x) = t\Pr(X \ge t) \\
\Rightarrow\; \Pr(X \ge t) &\le \frac{E(X)}{t}.
\end{aligned}$$

• Remark. The Markov inequality is primarily of interest for large values of t. For example, for any nonnegative random variable X whose mean is 1, the maximum possible value of $\Pr(X \ge 100)$ is 0.01.
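The value 0.01 is not just an upper bound; it can be attained. One distribution that does so (constructed here for illustration) puts all of its mass on the two points 0 and 100:

```latex
\[
  \Pr(X = 100) = 0.01, \qquad \Pr(X = 0) = 0.99
  \quad\Longrightarrow\quad
  E(X) = 100 \cdot 0.01 + 0 \cdot 0.99 = 1,
\]
\[
  \Pr(X \ge 100) = 0.01 = \frac{E(X)}{100}.
\]
```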

• Theorem (Chebyshev inequality). Let X be a random variable for which Var(X) exists. Then for any given number t>0,

$$\Pr(|X - E(X)| \ge t) \le \frac{\operatorname{Var}(X)}{t^2}.$$

Proof. Let $Y = [X - E(X)]^2$. Then $\Pr(Y \ge 0) = 1$ and E(Y)=Var(X). By applying the Markov inequality to Y, we have

$$\Pr(|X - E(X)| \ge t) = \Pr(Y \ge t^2) \le \frac{E(Y)}{t^2} = \frac{\operatorname{Var}(X)}{t^2}.$$

• Remark. Suppose $\operatorname{Var}(X) = \sigma^2$. Then

$$\begin{aligned}
\Pr(|X - E(X)| \ge 2\sigma) &\le \frac{\sigma^2}{(2\sigma)^2} = \frac{1}{4}, \\
\Pr(|X - E(X)| \ge 3\sigma) &\le \frac{\sigma^2}{(3\sigma)^2} = \frac{1}{9}.
\end{aligned}$$
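The Chebyshev bounds hold for every distribution with finite variance, so they can be far from tight for any particular one. The sketch below compares them with the exact tail probabilities under the assumption that X is normally distributed; the normal case is chosen here only for illustration.

```r
k <- c(2, 3)                     # number of standard deviations

chebyshev_bound <- 1 / k^2       # 1/4 and 1/9, valid for any distribution
normal_tail <- 2 * pnorm(-k)     # exact Pr(|X - E(X)| >= k * sigma) if X is normal

print(rbind(k, chebyshev_bound, normal_tail))
```

For a normal distribution the exact probabilities are roughly 0.046 and 0.003, well below the bounds 0.25 and 0.111.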

Properties of the Sample Mean

• Suppose that $X_1, \ldots, X_n$ form a random sample of size n from some distribution for which the mean is $\mu$ and the variance is $\sigma^2$. Let

$$\bar{X}_n = \frac{1}{n}(X_1 + \cdots + X_n).$$

This random variable is called the sample mean.

• Question:

$$E(\bar{X}_n) = \;? \qquad \operatorname{Var}(\bar{X}_n) = \;? \qquad \Pr(|\bar{X}_n - \mu| \ge t) \le \;?$$

$$\begin{aligned}
E(\bar{X}_n) &= \frac{1}{n}\sum_{i=1}^{n} E(X_i) = \frac{1}{n} \cdot n\mu = \mu, \\
\operatorname{Var}(\bar{X}_n) &= \frac{1}{n^2}\operatorname{Var}\left(\sum_{i=1}^{n} X_i\right) = \frac{1}{n^2}\sum_{i=1}^{n} \operatorname{Var}(X_i) = \frac{1}{n^2} \cdot n\sigma^2 = \frac{\sigma^2}{n}, \\
\Pr(|\bar{X}_n - \mu| \ge t) &\le \frac{\sigma^2}{nt^2}.
\end{aligned}$$
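These three results can be checked exactly in a case where the distribution of the sample mean is known. The sketch below assumes a sample of n = 100 Bernoulli(1/2) variables, so that $\mu = 1/2$, $\sigma^2 = 1/4$, and the total has a binomial distribution; the sample size and the tolerance `tol` are illustrative choices, not values from the lecture.

```r
n <- 100
mu <- 0.5
sigma2 <- 0.25

t_vals <- 0:n
p <- dbinom(t_vals, n, 0.5)   # p.f. of the total number of successes
xbar <- t_vals / n            # corresponding values of the sample mean

mean_xbar <- sum(xbar * p)                  # should equal mu
var_xbar <- sum((xbar - mean_xbar)^2 * p)   # should equal sigma2 / n

tol <- 0.1   # the t in Pr(|Xbar_n - mu| >= t)
# The 1e-12 slack protects the >= comparison from floating-point rounding.
exact_tail <- sum(p[abs(xbar - mu) >= tol - 1e-12])
chebyshev_bound <- sigma2 / (n * tol^2)

print(c(mean_xbar, var_xbar, exact_tail, chebyshev_bound))
```

The exact tail probability is smaller than the Chebyshev bound, as the inequality requires.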

Example

• Suppose a random sample is to be taken from a distribution for which the mean $\mu$ is unknown, but the standard deviation is known to be 2. How large a sample must be taken in order to make the probability at least 0.99 that $|\bar{X}_n - \mu|$ will be less than 1 unit?

By the Chebyshev inequality with t=1,

$$\Pr(|\bar{X}_n - \mu| \ge 1) \le \frac{\sigma^2}{n} = \frac{4}{n}.$$

In order for $\Pr(|\bar{X}_n - \mu| < 1) \ge 0.99$, it suffices that

$$\frac{4}{n} \le 0.01 \;\Leftrightarrow\; n \ge 400.$$
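The same calculation can be wrapped in a small helper for other values of the standard deviation, tolerance, and target probability. The function name `chebyshev_n` and its arguments are hypothetical, introduced only for this sketch.

```r
# Smallest n for which the Chebyshev bound guarantees
# Pr(|Xbar_n - mu| < t) >= prob, given standard deviation sigma.
chebyshev_n <- function(sigma, t, prob) {
  # Chebyshev: Pr(|Xbar_n - mu| >= t) <= sigma^2 / (n * t^2); require this <= 1 - prob.
  n_min <- sigma^2 / (t^2 * (1 - prob))
  # Small tolerance guards against floating-point rounding error in 1 - prob.
  ceiling(n_min - 1e-9)
}

print(chebyshev_n(sigma = 2, t = 1, prob = 0.99))
```

With the values from the example, this reproduces the answer of 400.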

Example

• Suppose a fair coin is to be tossed n times. For $i = 1, \ldots, n$, let $X_i = 1$ if a head is obtained on the ith toss and let $X_i = 0$ if a tail is obtained on the ith toss. Then the sample mean $\bar{X}_n$ is the proportion of heads that are obtained on the n tosses. What is the number of times the coin must be tossed in order to make

$$\Pr(0.4 \le \bar{X}_n \le 0.6) \ge 0.70?$$

Solution: Let $T = \sum_{i=1}^{n} X_i$ denote the total number of heads obtained on the n tosses; then $\bar{X}_n = T/n$. T has a binomial distribution with parameters n and p=1/2. So E(T)=n/2, Var(T)=n/4.

1. Use Chebyshev inequality:

$$\begin{aligned}
\Pr(0.4 \le \bar{X}_n \le 0.6) &= \Pr(0.4n \le T \le 0.6n) \\
&= \Pr\left(\left|T - \frac{n}{2}\right| \le 0.1n\right) = 1 - \Pr\left(\left|T - \frac{n}{2}\right| > 0.1n\right) \\
&\ge 1 - \frac{n}{4(0.1n)^2} = 1 - \frac{25}{n}.
\end{aligned}$$

Since $1 - 25/n \ge 0.70$ exactly when $n \ge 25/0.30 \approx 83.3$, we have $1 - 25/n \ge 0.70$ when $n \ge 84$.

2. Use binomial distribution: for n=?,

$$\Pr(0.4 \le \bar{X}_n \le 0.6) = \Pr(0.4n \le T \le 0.6n) > 0.70.$$

So ? tosses is sufficient.

Download R


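R code

    # For each n, compute Pr(0.4n <= T <= 0.6n) where T ~ Binomial(n, 0.5).
    for (n in 1:20) {
      start <- ceiling(0.4 * n)   # smallest head count with T/n >= 0.4
      end <- floor(0.6 * n)       # largest head count with T/n <= 0.6
      # When start > end (e.g. n = 1 or n = 3) no head count qualifies;
      # start:end would count downward, so return probability 0 instead.
      prob <- if (start <= end) sum(dbinom(start:end, n, 0.5)) else 0
      print(c(n, start, end, prob))
    }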
