Regression and Correlation
![Page 1: Regression and Correlation](https://reader035.fdocuments.us/reader035/viewer/2022062310/568161c1550346895dd1a340/html5/thumbnails/1.jpg)
Regression and Correlation
Jake Blanchard
Fall 2010
![Page 2: Regression and Correlation](https://reader035.fdocuments.us/reader035/viewer/2022062310/568161c1550346895dd1a340/html5/thumbnails/2.jpg)
Introduction
We can use regression to find relationships between random variables
This does not necessarily imply causation
Correlation can be used to measure predictability
![Page 3: Regression and Correlation](https://reader035.fdocuments.us/reader035/viewer/2022062310/568161c1550346895dd1a340/html5/thumbnails/3.jpg)
Regression with Constant Variance
Linear regression: $E(Y \mid X = x) = \alpha + \beta x$
In general, the variance is a function of x
If we assume the variance is constant, the analysis is simplified
Define the total error as the sum of the squares of the errors
![Page 4: Regression and Correlation](https://reader035.fdocuments.us/reader035/viewer/2022062310/568161c1550346895dd1a340/html5/thumbnails/4.jpg)
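Linear Regression
Minimize the sum of squared errors:
$$\Delta^2 = \sum_{i=1}^{n} \left(y_i - \alpha - \beta x_i\right)^2$$
$$\frac{\partial \Delta^2}{\partial \alpha} = -2 \sum_{i=1}^{n} \left(y_i - \alpha - \beta x_i\right) = 0$$
$$\frac{\partial \Delta^2}{\partial \beta} = -2 \sum_{i=1}^{n} x_i \left(y_i - \alpha - \beta x_i\right) = 0$$
Solve:
$$\hat{\alpha} = \bar{y} - \hat{\beta}\,\bar{x}$$
$$\hat{\beta} = \frac{\sum_{i=1}^{n} x_i y_i - n\,\bar{x}\,\bar{y}}{\sum_{i=1}^{n} x_i^2 - n\,\bar{x}^2} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$$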
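The closed-form solution of the two normal equations can be sketched in a few lines of Python. The function name `fit_line` and the sample points are illustrative assumptions, not part of the lecture; the points were chosen to lie exactly on y = 1 + 2x so the result is easy to check by hand.

```python
def fit_line(x, y):
    """Least-squares estimates (alpha, beta) for E(Y|X=x) = alpha + beta*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Centered sums: sum (x_i - x_bar)(y_i - y_bar) and sum (x_i - x_bar)^2
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    beta = sxy / sxx
    alpha = y_bar - beta * x_bar
    return alpha, beta


# Made-up points lying exactly on y = 1 + 2x (illustration only)
alpha, beta = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(alpha, beta)
```

The centered form of the slope is used here because it is algebraically equal to the raw-sum form but less prone to round-off when the x values are large.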
![Page 5: Regression and Correlation](https://reader035.fdocuments.us/reader035/viewer/2022062310/568161c1550346895dd1a340/html5/thumbnails/5.jpg)
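Variance in Regression Analysis
Relevant variance is conditional: $\mathrm{Var}(Y \mid X = x)$
$$s_{Y|x}^2 = \frac{1}{n-2} \sum_{i=1}^{n} \left(y_i - \hat{\alpha} - \hat{\beta} x_i\right)^2 = \frac{\Delta^2}{n-2}$$
$$s_{Y|x}^2 = \frac{1}{n-2} \left[ \sum_{i=1}^{n} (y_i - \bar{y})^2 - \hat{\beta}^2 \sum_{i=1}^{n} (x_i - \bar{x})^2 \right]$$
$$r^2 = 1 - \frac{s_{Y|x}^2}{s_Y^2}$$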
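As a rough illustration of the conditional variance and r², the sketch below computes the residual sum of squares directly and divides by n - 2. The function name `regression_variance` and the four data points are made up for the example; they do not come from the lecture.

```python
def regression_variance(x, y):
    """Return (s2_cond, r2): the conditional variance s^2_{Y|x} and r^2."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = y_bar - beta * x_bar
    # Delta^2: sum of squared residuals about the fitted line
    delta2 = sum((yi - alpha - beta * xi) ** 2 for xi, yi in zip(x, y))
    s2_cond = delta2 / (n - 2)      # n - 2 dof: two coefficients were estimated
    s2_y = syy / (n - 1)            # unconditional sample variance of y
    r2 = 1.0 - s2_cond / s2_y
    return s2_cond, r2


# Made-up data (illustration only)
s2_cond, r2 = regression_variance([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 8.0])
print(s2_cond, r2)
```

An r² close to 1 means the regression line removes most of the variance in Y; a value near 0 means knowing x barely helps.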
![Page 6: Regression and Correlation](https://reader035.fdocuments.us/reader035/viewer/2022062310/568161c1550346895dd1a340/html5/thumbnails/6.jpg)
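Confidence Intervals
Regression coefficients are t-distributed with n-2 degrees of freedom (dof)
The statistic below is thus t-distributed with n-2 dof
$$\frac{\hat{y}_i - \mu_{Y|x_i}}{s_{Y|x} \sqrt{\dfrac{1}{n} + \dfrac{(x_i - \bar{x})^2}{\sum_{j=1}^{n} (x_j - \bar{x})^2}}}$$
And the confidence interval is
$$\left\langle \mu_{Y|x_i} \right\rangle_{1-\alpha} = \hat{y}_i \pm t_{1-\alpha/2,\,n-2}\; s_{Y|x} \sqrt{\frac{1}{n} + \frac{(x_i - \bar{x})^2}{\sum_{j=1}^{n} (x_j - \bar{x})^2}}$$
Here $1-\alpha$ is the confidence level; this $\alpha$ is not the regression intercept.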
![Page 7: Regression and Correlation](https://reader035.fdocuments.us/reader035/viewer/2022062310/568161c1550346895dd1a340/html5/thumbnails/7.jpg)
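Example
Example 8.1
Data for compressive strength (q) of stiff clay as a function of “blow counts” (N)
$$\bar{N} = 18.7, \qquad \bar{q} = 2.123$$
$$s_N^2 = \frac{1}{9}\left(\sum N_i^2 - n\bar{N}^2\right) = 95.12$$
$$s_q^2 = \frac{1}{9}\left(\sum q_i^2 - n\bar{q}^2\right) = 1.22$$
$$\hat{\beta} = \frac{\sum N_i q_i - n\,\bar{N}\,\bar{q}}{\sum N_i^2 - n\bar{N}^2} = 0.112$$
$$\hat{\alpha} = \bar{q} - \hat{\beta}\,\bar{N} = 0.029$$
$$s_{q|N}^2 = \frac{\Delta^2}{n-2} = \frac{0.305}{8} = 0.038$$
Confidence interval:
$$\left\langle \mu_{Y|x_i} \right\rangle_{1-\alpha} = \hat{y}_i \pm t_{1-\alpha/2,\,n-2}\; s_{Y|x} \sqrt{\frac{1}{n} + \frac{(x_i - \bar{x})^2}{\sum_{j=1}^{n} (x_j - \bar{x})^2}}$$
At N = 4:
$$\hat{y} = 0.029 + 0.112 \times 4 = 0.477, \qquad t_{0.975,\,8} = 2.306$$
$$\left\langle \mu_{q|N} \right\rangle_{0.95} = 0.477 \pm 2.306 \sqrt{0.038}\, \sqrt{\frac{1}{10} + \frac{(4 - 18.7)^2}{4353 - 10 \times 18.7^2}} = (0.21,\ 0.744)$$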
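The interval arithmetic for this example can be replayed with the summary figures that appear on the slide (n = 10, mean blow count 18.7, sum of squared blow counts 4353, conditional variance 0.038, t value 2.306, predicted strength 0.477 at N = 4). The helper name `mean_response_ci` is an assumption made for this sketch, and the t value is passed in rather than computed.

```python
import math


def mean_response_ci(y_hat, s_cond, n, x0, x_bar, sxx, t_crit):
    """Confidence interval for the mean response at x0.

    sxx is sum (x_i - x_bar)^2; t_crit is the t quantile for n - 2 dof.
    """
    half_width = t_crit * s_cond * math.sqrt(1.0 / n + (x0 - x_bar) ** 2 / sxx)
    return y_hat - half_width, y_hat + half_width


n = 10
n_bar = 18.7
sum_n_sq = 4353.0                     # sum of N_i^2 as read from the slide
sxx = sum_n_sq - n * n_bar ** 2       # equals sum (N_i - N_bar)^2
y_hat = 0.029 + 0.112 * 4             # fitted strength at N = 4
lower, upper = mean_response_ci(y_hat, math.sqrt(0.038), n, 4, n_bar, sxx, 2.306)
print(round(lower, 3), round(upper, 3))
```

Under these inputs the bounds come out close to the slide's interval of roughly 0.21 to 0.744; small differences would only reflect rounding of the published coefficients.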
![Page 8: Regression and Correlation](https://reader035.fdocuments.us/reader035/viewer/2022062310/568161c1550346895dd1a340/html5/thumbnails/8.jpg)
Plot
![Page 9: Regression and Correlation](https://reader035.fdocuments.us/reader035/viewer/2022062310/568161c1550346895dd1a340/html5/thumbnails/9.jpg)
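Correlation Estimate
$$\hat{\rho}_{x,y} = \frac{1}{n-1} \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{s_X s_Y}$$
$$\hat{\rho}_{x,y} = \frac{1}{n-1} \frac{\sum_{i=1}^{n} x_i y_i - n\,\bar{x}\,\bar{y}}{s_X s_Y}$$
$$\hat{\rho}_{x,y} = \hat{\beta}\, \frac{s_X}{s_Y}$$
$$\hat{\rho}_{x,y}^2 = 1 - \frac{n-2}{n-1} \frac{s_{Y|x}^2}{s_Y^2} \approx r^2$$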
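The equivalence between the direct correlation estimate and the slope-based form can be checked numerically. This is a minimal sketch; the function name `correlation_estimates` and the data are invented for illustration.

```python
import math


def correlation_estimates(x, y):
    """Return the correlation estimate computed two ways: directly, and from the slope."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_x = math.sqrt(sxx / (n - 1))
    s_y = math.sqrt(syy / (n - 1))
    rho_direct = sxy / ((n - 1) * s_x * s_y)
    beta = sxy / sxx                     # least-squares slope
    rho_from_slope = beta * s_x / s_y
    return rho_direct, rho_from_slope


# Made-up data (illustration only)
rho_direct, rho_from_slope = correlation_estimates(
    [0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 8.0]
)
print(rho_direct, rho_from_slope)
```

Both values agree up to floating-point round-off, which is what the identities on this slide imply.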
![Page 10: Regression and Correlation](https://reader035.fdocuments.us/reader035/viewer/2022062310/568161c1550346895dd1a340/html5/thumbnails/10.jpg)
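Regression with Non-Constant Variance
Now relax the assumption of constant variance
Assume regions with large conditional variance are weighted less
$$\mathrm{Var}(Y \mid X = x) = \sigma^2 g^2(x), \qquad E(Y \mid X = x) = \alpha + \beta x$$
Weights:
$$w_i = \frac{1}{\mathrm{Var}(Y \mid X = x_i)} = \frac{1}{\sigma^2 g^2(x_i)}$$
$$\Delta^2 = \sum_{i=1}^{n} w_i \left(y_i - \alpha - \beta x_i\right)^2$$
$$\hat{\alpha} = \frac{\sum_{i=1}^{n} w_i y_i - \hat{\beta} \sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}$$
$$\hat{\beta} = \frac{\sum_{i=1}^{n} w_i \sum_{i=1}^{n} w_i x_i y_i - \sum_{i=1}^{n} w_i y_i \sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i \sum_{i=1}^{n} w_i x_i^2 - \left(\sum_{i=1}^{n} w_i x_i\right)^2}$$
$$w_i' = \sigma^2 w_i = \frac{1}{g^2(x_i)}$$
$$s^2 = \frac{\sum_{i=1}^{n} w_i' \left(y_i - \hat{\alpha} - \hat{\beta} x_i\right)^2}{n-2}$$
$$s_{Y|x}^2 = s^2 g^2(x)$$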
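A weighted least-squares fit along these lines might look like the sketch below. It uses the relative weights 1/g(x_i)², which is enough because the unknown common factor σ² cancels in the coefficient formulas. The name `fit_weighted_line` and the sample points are assumptions for illustration only.

```python
def fit_weighted_line(x, y, g):
    """Weighted least squares for E(Y|x) = alpha + beta*x, Var(Y|x) = sigma^2 * g(x)^2.

    Returns (alpha, beta, s2), where s2 estimates sigma^2.
    """
    n = len(x)
    w = [1.0 / g(xi) ** 2 for xi in x]          # relative weights 1/g(x_i)^2
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    beta = (sw * swxy - swx * swy) / (sw * swxx - swx ** 2)
    alpha = (swy - beta * swx) / sw
    # Weighted residual sum of squares divided by n - 2 dof
    s2 = sum(wi * (yi - alpha - beta * xi) ** 2 for wi, xi, yi in zip(w, x, y)) / (n - 2)
    return alpha, beta, s2


# Made-up points lying exactly on y = 1 + 2x, with g(x) = x (illustration only)
alpha, beta, s2 = fit_weighted_line([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0], lambda v: v)
print(alpha, beta, s2)
```

The conditional standard deviation at any x then follows as sqrt(s2) * g(x). With g(x) = 1 the weights are all equal and the function reduces to ordinary least squares.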
![Page 11: Regression and Correlation](https://reader035.fdocuments.us/reader035/viewer/2022062310/568161c1550346895dd1a340/html5/thumbnails/11.jpg)
Example (8.2)
Data for maximum settlement (x) of storage tanks and maximum differential settlement (y)
From looking at the data, assume g(x) = x (that is, the standard deviation of y increases linearly with x)
$$\mathrm{Var}(Y \mid X = x) = \sigma^2 x^2$$
$$w_i' = \sigma^2 w_i = \frac{1}{x_i^2}$$
![Page 12: Regression and Correlation](https://reader035.fdocuments.us/reader035/viewer/2022062310/568161c1550346895dd1a340/html5/thumbnails/12.jpg)
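Example (8.2) continued
$$\bar{x} = 1.65, \qquad \bar{y} = 1.11, \qquad s_x = 0.923, \qquad s_y = 0.627$$
$$y = 0.045 + 0.65\,x$$
$$s^2 = 0.0589, \qquad s_{Y|x} = 0.243\,x$$
$$\hat{\rho} = 0.96$$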
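A quick consistency check on this slide's figures, reading them as slope 0.65, s_x = 0.923, s_y = 0.627 and s² = 0.0589. That pairing of numbers with labels is an assumption, since the extracted text scrambles them. The check relies on the identity that the correlation estimate equals the slope times s_x / s_y.

```python
import math

# Values as read from the slide (the label pairing is an assumption)
beta_hat = 0.65
s_x = 0.923
s_y = 0.627
s_squared = 0.0589

rho_hat = beta_hat * s_x / s_y      # correlation estimate from the slope
s = math.sqrt(s_squared)            # multiplier of x in the conditional std. dev.
print(round(rho_hat, 2), round(s, 3))
```

Under this reading both derived values land on the remaining two numbers on the slide (about 0.96 and 0.243), which supports the pairing but does not prove it.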
![Page 13: Regression and Correlation](https://reader035.fdocuments.us/reader035/viewer/2022062310/568161c1550346895dd1a340/html5/thumbnails/13.jpg)
Multiple Regression
$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_k x_{ik}$$
“Nonlinear” Regression
$$E(Y \mid x) = \alpha + \beta\, g(x)$$
Use LINEST in Excel
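Because the “nonlinear” model is still linear in its coefficients, it can be fitted outside Excel by applying the simple linear regression formulas to the transformed variable z = g(x). The sketch below assumes that approach; the function name `fit_transformed`, the choice g = ln, and the data are all illustrative.

```python
import math


def fit_transformed(x, y, g):
    """Estimate (alpha, beta) in E(Y|x) = alpha + beta*g(x) by regressing y on z = g(x)."""
    z = [g(xi) for xi in x]
    n = len(z)
    z_bar = sum(z) / n
    y_bar = sum(y) / n
    szy = sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y))
    szz = sum((zi - z_bar) ** 2 for zi in z)
    beta = szy / szz
    alpha = y_bar - beta * z_bar
    return alpha, beta


# Made-up data generated from y = 2 + 3*ln(x) (illustration only)
x = [1.0, 2.0, 4.0, 8.0]
y = [2.0 + 3.0 * math.log(xi) for xi in x]
alpha, beta = fit_transformed(x, y, math.log)
print(alpha, beta)
```

The multiple regression model above needs a matrix solve instead of these scalar formulas, which is the job LINEST does in Excel.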