Correlation and Covariance
Transcript of Correlation and Covariance
![Page 1: Correlation and Covariance](https://reader036.fdocuments.us/reader036/viewer/2022082907/577cca071a28aba711a52fef/html5/thumbnails/1.jpg)
Statistics
Introduce the statistical concepts of Covariance Correlation
Investigate invariance propertiesDevelop computational formulas
![Page 2: Correlation and Covariance](https://reader036.fdocuments.us/reader036/viewer/2022082907/577cca071a28aba711a52fef/html5/thumbnails/2.jpg)
Covariance
So far, we have been analyzing summary statistics that describe aspects of a single list of numbersFrequently, however, we are interested in how variables behave together
![Page 3: Correlation and Covariance](https://reader036.fdocuments.us/reader036/viewer/2022082907/577cca071a28aba711a52fef/html5/thumbnails/3.jpg)
Smoking and Lung Capacity
Suppose, for example, we wanted to investigate the relationship between cigarette smoking and lung capacityWe might ask a group of people about their smoking habits, and measure their lung capacities
![Page 4: Correlation and Covariance](https://reader036.fdocuments.us/reader036/viewer/2022082907/577cca071a28aba711a52fef/html5/thumbnails/4.jpg)
Smoking and Lung Capacity
Cigarettes (X) Lung Capacity (Y)
0 45
5 42
10 33
15 31
20 29
![Page 5: Correlation and Covariance](https://reader036.fdocuments.us/reader036/viewer/2022082907/577cca071a28aba711a52fef/html5/thumbnails/5.jpg)
Smoking and Lung Capacity
With SPSS, we can easily enter these data and produce a scatterplot.
Smoking
3020100-10
Lung
Cap
acity
50
40
30
20
![Page 6: Correlation and Covariance](https://reader036.fdocuments.us/reader036/viewer/2022082907/577cca071a28aba711a52fef/html5/thumbnails/6.jpg)
Smoking and Lung Capacity
We can see easily from the graph that as smoking goes up, lung capacity tends to go down.The two variables covary in opposite directions.We now examine two statistics, covariance and correlation, for quantifying how variables covary.
![Page 7: Correlation and Covariance](https://reader036.fdocuments.us/reader036/viewer/2022082907/577cca071a28aba711a52fef/html5/thumbnails/7.jpg)
Covariance
When two variables covary in opposite directions, as smoking and lung capacity do, values tend to be on opposite sides of the group mean. That is, when smoking is above its group mean, lung capacity tends to be below its group mean.Consequently, by averaging the product of deviation scores, we can obtain a measure of how the variables vary together.
![Page 8: Correlation and Covariance](https://reader036.fdocuments.us/reader036/viewer/2022082907/577cca071a28aba711a52fef/html5/thumbnails/8.jpg)
The Sample Covariance
Instead of averaging by dividing by N, we divide by . The resulting formula is1N
1
11
N
xy i ii
S X X Y YN
![Page 9: Correlation and Covariance](https://reader036.fdocuments.us/reader036/viewer/2022082907/577cca071a28aba711a52fef/html5/thumbnails/9.jpg)
Calculating Covariance
Cigarettes (X) dX dXdY dY Lung
Capacity (Y)0 10 +9 45
5 5 +6 42
10 0 0 3 33
15 +5 5 31
20 +10 7 29
215
![Page 10: Correlation and Covariance](https://reader036.fdocuments.us/reader036/viewer/2022082907/577cca071a28aba711a52fef/html5/thumbnails/10.jpg)
Calculating Covariance
So we obtain
1 ( 215) 53.754xyS
![Page 11: Correlation and Covariance](https://reader036.fdocuments.us/reader036/viewer/2022082907/577cca071a28aba711a52fef/html5/thumbnails/11.jpg)
Invariance Properties of Covariance
The covariance is invariant under listwise addition, but not under listwise multiplication. Hence, it is vulnerable to changes in standard deviation of the variables, and is not scale-invariant.
![Page 12: Correlation and Covariance](https://reader036.fdocuments.us/reader036/viewer/2022082907/577cca071a28aba711a52fef/html5/thumbnails/12.jpg)
Invariance Properties of CovarianceIf , theni i
i i
L aX bdl adx
1
1 1
Let , 1Then
11 1
1 1
i i i i
N
LM i ii
N N
i i i i xyi i
L aX b M cY d
S dl dmN
adx cdy ac dx dy acSN N
![Page 13: Correlation and Covariance](https://reader036.fdocuments.us/reader036/viewer/2022082907/577cca071a28aba711a52fef/html5/thumbnails/13.jpg)
Invariance Properties of Covariance
Multiplicative constants come straight through in the covariance, so covariance is difficult to interpret – it incorporates information about the scale of the variables.
![Page 14: Correlation and Covariance](https://reader036.fdocuments.us/reader036/viewer/2022082907/577cca071a28aba711a52fef/html5/thumbnails/14.jpg)
The (Pearson) Correlation Coefficient
Like covariance, but uses Z-scores instead of deviations scores. Hence, it is invariant under linear transformation of the raw scores.
1
11
N
xy i ii
r zx zyN
![Page 15: Correlation and Covariance](https://reader036.fdocuments.us/reader036/viewer/2022082907/577cca071a28aba711a52fef/html5/thumbnails/15.jpg)
Alternative Formula for the Correlation Coefficient
xyxy
x y
sr
s s
![Page 16: Correlation and Covariance](https://reader036.fdocuments.us/reader036/viewer/2022082907/577cca071a28aba711a52fef/html5/thumbnails/16.jpg)
Computational Formulas -- Covariance
There is a computational formula for covariance similar to the one for variance. Indeed, the latter is a special case of the former, since variance of a variable is “its covariance with itself.”
1 1
1
11
N N
i iNi i
xy i ii
X Ys X Y
N N
![Page 17: Correlation and Covariance](https://reader036.fdocuments.us/reader036/viewer/2022082907/577cca071a28aba711a52fef/html5/thumbnails/17.jpg)
Computational Formula for Correlation
By substituting and rearranging, you obtain a substantial (and not very transparent) formula for
xyr
2 22 2xy
N XY X Yr
N X X N Y Y
![Page 18: Correlation and Covariance](https://reader036.fdocuments.us/reader036/viewer/2022082907/577cca071a28aba711a52fef/html5/thumbnails/18.jpg)
Computing a correlation
Cigarettes (X) XY
Lung Capacity (Y)
0 0 0 2025 455 25 210 1764 4210 100 330 1089 3315 225 465 961 3120 400 580 841 2950 750 1585 6680 180
2X 2Y
![Page 19: Correlation and Covariance](https://reader036.fdocuments.us/reader036/viewer/2022082907/577cca071a28aba711a52fef/html5/thumbnails/19.jpg)
Computing a Correlation
2 2
(5)(1585) (50)(180)
(5)(750) 50 (5)(6680) 180
7925 9000(3750 2500)(33400 32400)
1075 .96151250 (1000)
xyr