Regression Analysis Part C Confidence Intervals and Hypothesis Testing Read Chapters 3, 4 and 5 of...
harry-mccarthy -
Regression Analysis
Part C: Confidence Intervals and Hypothesis Testing
Read Chapters 3, 4 and 5 of Forecasting and Time Series, An Applied Approach.
L01C MGS 8110 - Regression Inference 2
Regression Analysis Modules
Part A – Basic Model & Parameter Estimation
Part B – Calculation Procedures
Part C – Inference: Confidence Intervals & Hypothesis Testing
Part D – Goodness of Fit
Part E – Model Building
Part F – Transformed Variables
Part G – Standardized Variables
Part H – Dummy Variables
Part I – Eliminating Intercept
Part J – Outliers
Part K – Regression Example #1
Part L – Regression Example #2
Part N – Non-linear Regression
Part P – Non-linear Example
Overview of Part L01C: Confidence Intervals and Hypothesis Testing
• Confidence Intervals
  • For Yi prediction and Yi mean
    • Formulas for univariate and multivariate cases.
    • Example calculation: 1) Manual in Excel and 2) SPSS.
  • For Regression Coefficients, bi
    • Formulas for univariate and multivariate cases.
    • Example calculation: 1) Data Analysis in Excel and 2) SPSS.
• Hypothesis Testing
  • For Regression Coefficients, bi
  • For Entire Regression Model, F-test
Underlying Statistical Theory: Confidence Intervals and Hypothesis Testing

$$\frac{y_i - \mu_i}{s_e} \ \text{has a } t \text{ distribution} \sim t(n-k)$$

$$\frac{b_j - \beta_j}{s_{b_j}} \ \text{has a } t \text{ distribution} \sim t(n-k)$$
The Standard Error of a Regression Equation, single independent variable

$$s_e^2 = \frac{SSE}{n-k} = \frac{\sum (Y_i - \hat{Y}_i)^2}{n-k} = \frac{\sum Y_i^2 - \sum \hat{Y}_i^2}{n-k}$$

where
Yi is the actually observed value of the dependent variable.
Yi hat is the predicted value from the fitted regression equation.
p = 1 is the number of independent variables. k = p+1 = 2 is the number of parameters, β0, β1.
n is the sample size used when calculating s.
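The standard error formula can be checked against the deck's R&D example. The following is a minimal sketch in plain Python, assuming the 28 quarterly observations listed in the SPSS output later in this part; the function names `fit_line` and `standard_error` are illustrative and are not part of the course material.

```python
# Quarterly R&D expense (X) and sales (Y), copied from the deck's example data.
RD = [9.25, 12.50, 17.50, 20.00, 15.00, 18.00, 22.00, 25.25, 15.00, 20.25,
      24.25, 27.50, 25.00, 25.75, 29.25, 32.75, 30.00, 28.00, 33.50, 38.25,
      32.00, 25.25, 22.25, 25.00, 26.25, 31.25, 30.00, 40.50]
SALES = [40, 37, 50, 70, 60, 60, 72, 88, 101, 80, 81, 97, 110, 89, 103, 117,
         131, 98, 112, 134, 153, 145, 101, 89, 90, 105, 125, 145]


def fit_line(x, y):
    """Return the least-squares intercept a and slope b of Y = a + b*X."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    return y_bar - b * x_bar, b


def standard_error(x, y, a, b, k=2):
    """Return (s_e, SSE), where s_e = sqrt(SSE / (n - k))."""
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    return (sse / (len(x) - k)) ** 0.5, sse


a, b = fit_line(RD, SALES)
s_e, sse = standard_error(RD, SALES, a, b)
print(f"a = {a:.3f}, b = {b:.3f}, SSE = {sse:.3f}, s_e = {s_e:.3f}")
```

The printed values should land close to the SPSS output for this example (B = 9.151 and 3.459, residual sum of squares 7991.535, s = 17.53).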
Confidence Interval for Individual Prediction, single independent variable

$$\hat{Y}_f \pm t(n-k,\alpha/2)\, s_e \sqrt{1 + \frac{1}{n} + \frac{(X_f - \bar{X})^2}{\sum (X_i - \bar{X})^2}}$$

where
f denotes the future (forecasted) or predicted value.
p = 1 is the number of independent variables. k = p+1 = 2 is the number of parameters, β0, β1.
n is the sample size used when calculating s.
1-α is the confidence level, typically .95. So α/2 = .025.
Confidence Interval for Mean Prediction, single independent variable (1 of 2)

$$\hat{Y}_f \pm t(n-k,\alpha/2)\, s_e \sqrt{\frac{1}{m} + \frac{1}{n} + \frac{(X_f - \bar{X})^2}{\sum (X_i - \bar{X})^2}}$$

where
f denotes the future (forecasted) or predicted value.
p = 1 is the number of independent variables. k = p+1 = 2 is the number of parameters, β0, β1.
n is the sample size used when calculating s.
m is the sample size that is going to be used to calculate the mean value.
1-α is the confidence level, typically .95. So α/2 = .025.
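Confidence Interval for Mean Prediction, single independent variable (2 of 2)

When m = 1, the CI for the mean becomes the CI for an individual Y.

$$\hat{Y}_f \pm t(n-k,\alpha/2)\, s_e \sqrt{\frac{1}{1} + \frac{1}{n} + \frac{(X_f - \bar{X})^2}{\sum (X_i - \bar{X})^2}} = \hat{Y}_f \pm t(n-k,\alpha/2)\, s_e \sqrt{1 + \frac{1}{n} + \frac{(X_f - \bar{X})^2}{\sum (X_i - \bar{X})^2}}$$

When m = infinity, the CI for the mean becomes the CI for a general mean.

$$\hat{Y}_f \pm t(n-k,\alpha/2)\, s_e \sqrt{\frac{1}{n} + \frac{(X_f - \bar{X})^2}{\sum (X_i - \bar{X})^2}}$$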
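The role of m can be illustrated numerically. This is a small sketch, assuming the summary figures from the deck's R&D example (n = 28, X mean = 25.0536, sum of squared X deviations = 1495.92, s = 17.532, predicted sales of 78.339 at X = 20); the helper name `prediction_interval` is hypothetical, and the t value 2.0555 is the unrounded counterpart of the 2.06 that the deck obtains from =TINV(0.05,26).

```python
import math


def prediction_interval(y_hat, x_f, *, n, x_bar, sxx, s_e, t_crit, m=1.0):
    """CI for the mean of m future observations at X = x_f.

    m = 1 gives the interval for an individual Y;
    m = math.inf gives the interval for the general mean.
    """
    spread = 1.0 / m + 1.0 / n + (x_f - x_bar) ** 2 / sxx
    half_width = t_crit * s_e * math.sqrt(spread)
    return y_hat - half_width, y_hat + half_width


# Summary statistics taken from the deck's R&D / Sales example.
stats = dict(n=28, x_bar=25.0536, sxx=1495.92, s_e=17.532, t_crit=2.0555)

individual = prediction_interval(78.339, 20.0, m=1.0, **stats)
general_mean = prediction_interval(78.339, 20.0, m=math.inf, **stats)
print("individual:", [round(v, 2) for v in individual])
print("mean:      ", [round(v, 2) for v in general_mean])
```

Under these assumptions the two intervals should agree, to within rounding, with the SPSS bounds for the quarter with R&D = 20.00 (lici/uici for the individual case, lmci/umci for the mean).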
Confidence Intervals for Individual Predictions and Mean Predictions
[Figure: confidence bands for individual predictions and mean predictions; Sales (y-axis, 0 to 200) versus R/D Expense (x-axis, 0 to 45).]
CI Manual Calculations, single independent variable

$$\hat{Y}_f \pm t(n-k,\alpha/2)\, s_e \sqrt{1 + \frac{1}{n} + \frac{(X_f - \bar{X})^2}{\sum (X_i - \bar{X})^2}}$$

$$= 9.151 + 3.459(20) \pm (2.06)\sqrt{307.367}\,\sqrt{1 + \frac{1}{28} + \frac{(20 - 25.05)^2}{(7.443)^2 (27)}}$$

$$= 78.33 \pm (2.06)(17.532)\sqrt{1.0357 + 0.01705}$$

$$= 78.33 \pm (2.06)(17.532)\sqrt{1.05275} = 78.33 \pm 37.06$$

$$= (41.27,\ 115.39)$$

t(26, .025) = 2.06, from =TINV(0.05,26)

Descriptive Statistics
         Mean      Std. Deviation   N
SALES    95.82     30.968           28
R D      25.0536   7.44342          28

ANOVA(b)
Model 1      Sum of Squares   df   Mean Square   F        Sig.
Regression   17902.572        1    17902.572     58.245   .000(a)
Residual     7991.535         26   307.367
Total        25894.107        27
a. Predictors: (Constant), R D
b. Dependent Variable: SALES

Coefficients(a)
Model 1      B       Std. Error   Beta   t       Sig.
(Constant)   9.151   11.830              .774    .446
R D          3.459   .453         .831   7.632   .000
B and Std. Error are the unstandardized coefficients; Beta is the standardized coefficient.
a. Dependent Variable: SALES
CI Manual Calculations, single independent variable

$$\hat{Y}_f \pm t(n-k,\alpha/2)\, s_e \sqrt{\frac{1}{m} + \frac{1}{n} + \frac{(X_f - \bar{X})^2}{\sum (X_i - \bar{X})^2}}$$

Spreadsheet constants: a = 9.15 ("A_coef"), b = 3.46 ("B_coef"), t = 2.06 ("t_05", =TINV(0.05,26))

ORIGINAL DATA
Quarter  X (R D)  Y (Sales)  Y pred   Y2        Y2 Pred    (X-Xbar)2
1        9.25     40         41.150   1600.00   1693.34    249.75
2        12.50    37         52.393   1369.00   2745.06    157.59
3        17.50    50         69.690   2500.00   4856.76    57.06
4        20.00    70         78.339   4900.00   6137.00    25.54
5        15.00    60         61.042   3600.00   3726.11    101.07
6        18.00    60         71.420   3600.00   5100.84    49.75
7        22.00    72         85.258   5184.00   7268.90    9.32
8        25.25    88         96.501   7744.00   9312.43    0.04
9        15.00    101        61.042   10201.00  3726.11    101.07
10       20.25    80         79.204   6400.00   6273.25    23.07
11       24.25    81         93.042   6561.00   8656.73    0.65
12       27.50    97         104.285  9409.00   10875.29   5.99
13       25.00    110        95.636   12100.00  9146.26    0.00
14       25.75    89         98.231   7921.00   9649.26    0.49
15       29.25    103        110.339  10609.00  12174.62   17.61
16       32.75    117        122.447  13689.00  14993.18   59.24
17       30.00    131        112.933  17161.00  12753.91   24.47
18       28.00    98         106.014  9604.00   11239.05   8.68
19       33.50    112        125.041  12544.00  15635.30   71.34
20       38.25    134        141.473  17956.00  20014.74   174.15
21       32.00    153        119.852  23409.00  14364.52   48.25
22       25.25    145        96.501   21025.00  9312.43    0.04
23       22.25    101        86.123   10201.00  7417.12    7.86
24       25.00    89         95.636   7921.00   9146.26    0.00
25       26.25    90         99.960   8100.00   9992.08    1.43
26       31.25    105        117.257  11025.00  13749.32   38.40
27       30.00    125        112.933  15625.00  12753.91   24.47
28       40.50    145        149.257  21025.00  22277.70   238.59
Sum's =  701.50   2,683.00            282,983.0 274,991.47 1,495.92 ("X_SS")
Mean's = 25.05    95.82      95.821
Mean = Sum/28 (=B34/28). s2 = 307.37 (=(E34-F34)/26). s = 17.53 ("St_Err", =SQRT(E36)). min X = 9.25, max X = 40.50.

PREDICTED VALUES
X   (X-Xbar)2  Ret    SQRT(1+(1/n)+Ret)  SQRT((1/n)+Ret)  Delta  Delta M  Y pred  Lower   Upper   Lower (mean)  Upper (mean)
9   257.72     0.172  1.099              0.46             39.61  16.44    40.29   0.68    79.89   23.85         56.72
10  226.61     0.151  1.090              0.43             39.27  15.59    43.74   4.48    83.01   28.15         59.34
11  197.50     0.132  1.081              0.41             38.94  14.76    47.20   8.26    86.15   32.44         61.96
12  170.40     0.114  1.072              0.39             38.64  13.94    50.66   12.02   89.30   36.72         64.60
13  145.29     0.097  1.064              0.36             38.36  13.13    54.12   15.77   92.48   40.99         67.26
14  122.18     0.082  1.057              0.34             38.09  12.35    57.58   19.49   95.68   45.24         69.93
15  101.07     0.068  1.050              0.32             37.85  11.58    61.04   23.19   98.89   49.46         72.62
16  81.97      0.055  1.044              0.30             37.63  10.84    64.50   26.87   102.13  53.66         75.34
17  64.86      0.043  1.039              0.28             37.43  10.13    67.96   30.53   105.40  57.83         78.09
18  49.75      0.033  1.034              0.26             37.26  9.46     71.42   34.16   108.68  61.96         80.88
19  36.65      0.024  1.030              0.25             37.11  8.84     74.88   37.77   111.99  66.04         83.72
20  25.54      0.017  1.026              0.23             36.98  8.28     78.34   41.36   115.32  70.06         86.62
21  16.43      0.011  1.023              0.22             36.87  7.79     81.80   44.93   118.67  74.01         89.59
22  9.32       0.006  1.021              0.20             36.79  7.38     85.26   48.47   122.04  77.88         92.64
23  4.22       0.003  1.019              0.20             36.73  7.07     88.72   51.99   125.44  81.64         95.79
24  1.11       0.001  1.018              0.19             36.69  6.88     92.18   55.49   128.87  85.30         99.06
25  0.00       0.000  1.018              0.19             36.68  6.81     95.64   58.96   132.31  88.83         102.45
26  0.90       0.001  1.018              0.19             36.69  6.87     99.10   62.41   135.78  92.23         105.96
27  3.79       0.003  1.019              0.20             36.72  7.05     102.55  65.83   139.27  95.51         109.60
28  8.68       0.006  1.021              0.20             36.78  7.34     106.01  69.24   142.79  98.67         113.36
29  15.57      0.010  1.023              0.21             36.86  7.74     109.47  72.61   146.33  101.73        117.21
30  24.47      0.016  1.026              0.23             36.96  8.22     112.93  75.97   149.90  104.71        121.16
31  35.36      0.024  1.029              0.24             37.09  8.78     116.39  79.30   153.48  107.61        125.17
32  48.25      0.032  1.033              0.26             37.24  9.40     119.85  82.61   157.09  110.46        129.25
33  63.15      0.042  1.038              0.28             37.42  10.06    123.31  85.90   160.73  113.25        133.37
34  80.04      0.054  1.044              0.30             37.61  10.76    126.77  89.16   164.38  116.01        137.54
35  98.93      0.066  1.050              0.32             37.83  11.50    130.23  92.40   168.06  118.73        141.73
36  119.82     0.080  1.056              0.34             38.07  12.26    133.69  95.62   171.76  121.43        145.95
37  142.72     0.095  1.064              0.36             38.33  13.05    137.15  98.82   175.48  124.10        150.20
38  167.61     0.112  1.071              0.38             38.61  13.85    140.61  102.00  179.22  126.76        154.46
39  194.50     0.130  1.080              0.41             38.91  14.67    144.07  105.16  182.98  129.40        158.74
40  223.40     0.149  1.089              0.43             39.23  15.50    147.53  108.30  186.76  132.03        163.03
41  254.29     0.170  1.098              0.45             39.57  16.34    150.99  111.42  190.56  134.64        167.33
Cell formulas: (X-Xbar)2 =(I37-X_mean)^2; Ret =J37/X_SS; =SQRT(1+(1/28)+K37); Delta =t_05*St_Err*L37; Upper =P37+N37 (individual) and =P37+O37 (mean).
SPSS Data Analysis Calculations, single independent variable

Quarter  R_D    Sales   lmci_1  umci_1  lici_1  uici_1
1        9.25   40.00   24.93   57.37   1.63    80.67
2        12.50  37.00   38.86   65.93   13.90   90.89
3        17.50  50.00   59.90   79.48   32.35   107.03
4        20.00  70.00   70.06   86.62   41.36   115.32
5        15.00  60.00   49.46   72.62   23.19   98.89
6        18.00  60.00   61.96   80.88   34.16   108.68
7        22.00  72.00   77.88   92.64   48.47   122.04
8        25.25  88.00   89.69   103.31  59.83   133.18
9        15.00  101.00  49.46   72.62   23.19   98.89
10       20.25  80.00   71.05   87.35   42.26   116.15
11       24.25  81.00   86.19   99.89   56.36   129.72
12       27.50  97.00   97.10   111.47  67.54   141.03
13       25.00  110.00  88.83   102.45  58.96   132.31
14       25.75  89.00   91.39   105.07  61.55   134.91
15       29.25  103.00  102.49  118.19  73.46   147.22
16       32.75  117.00  112.56  132.34  85.08   159.82
17       30.00  131.00  104.71  121.16  75.97   149.90
18       28.00  98.00   98.67   113.36  69.24   142.79
19       33.50  112.00  114.63  135.45  87.53   162.55
20       38.25  134.00  127.42  155.53  102.79  180.15
21       32.00  153.00  110.46  129.25  82.61   157.09
22       25.25  145.00  89.69   103.31  59.83   133.18
23       22.25  101.00  78.83   93.42   49.35   122.89
24       25.00  89.00   88.83   102.45  58.96   132.31
25       26.25  90.00   93.06   106.86  63.27   136.65
26       31.25  105.00  108.33  126.19  80.13   154.38
27       30.00  125.00  104.71  121.16  75.97   149.90
28       40.50  145.00  133.33  165.18  109.86  188.66
The Standard Error of a Regression Equation, multivariate case

$$s_e^2 = \frac{SSE}{n-k} = \frac{Y'Y - b'X'Y}{n-k} \qquad s_e = \sqrt{\frac{Y'Y - b'X'Y}{n-k}}$$

where
Y is the actually observed values of the dependent variable, an [n x 1] vector.
X is the actually observed values of the independent variables (with a leading column of 1's for the intercept), an [n x k] matrix.
b is the calculated regression parameters, a [k x 1] vector. b = (X'X)^-1 (X'Y)
p is the number of independent variables. k = p+1 is the number of parameters, β0, β1, … βp.
n is the sample size used when calculating s.
Confidence Interval for Individual Predictions, multivariate case

$$\hat{Y}_f \pm t(n-k,\alpha/2)\, s_e \sqrt{1 + X_f'(X'X)^{-1}X_f}$$

where
Xf is a vector of specified values for the independent variables. X'f = [1, Xf,1, Xf,2, … Xf,p]
p is the number of independent variables. k = p+1 is the number of parameters, β0, β1, … βp.
n is the sample size used when calculating s.
1-α is the confidence level, typically .95. So α/2 = .025.
Confidence Interval for Mean Predictions, multivariate case

$$\hat{Y}_f \pm t(n-k,\alpha/2)\, s_e \sqrt{X_f'(X'X)^{-1}X_f}$$

where
Xf is a vector of specified values for the independent variables. X'f = [1, Xf,1, Xf,2, … Xf,p]
p is the number of independent variables. k = p+1 is the number of parameters, β0, β1, … βp.
n is the sample size used when calculating s.
1-α is the confidence level, typically .95. So α/2 = .025.
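The matrix formulas for b, s and the two prediction intervals can be traced end to end on the deck's house-price example. What follows is a rough sketch in plain Python rather than the spreadsheet's MMULT approach; the helper names `invert` and `interval` are made up for illustration, and the critical value 2.1604 is the unrounded form of the 2.16 given by =TINV(0.05,13).

```python
import math

# House price, size and age, copied from the deck's multivariate example.
PRICE = [68.70, 54.90, 51.50, 71.60, 58.40, 40.70, 51.70, 71.90,
         57.10, 58.30, 73.50, 58.50, 49.10, 67.50, 53.70, 50.00]
SIZE = [2.05, 1.70, 1.47, 1.75, 1.94, 1.19, 1.56, 1.95,
        1.60, 1.49, 1.91, 1.38, 1.55, 1.88, 1.60, 1.55]
AGE = [3.43, 11.61, 8.31, 0.00, 7.41, 31.70, 16.10, 2.05,
       1.74, 2.76, 0.00, 0.00, 12.61, 2.80, 7.08, 18.00]
T_CRIT = 2.1604  # t(n-k = 13, alpha/2 = .025)


def invert(a):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(a)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


X = [[1.0, s, a] for s, a in zip(SIZE, AGE)]  # [n x k] design matrix
n, k = len(X), len(X[0])
xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
xty = [sum(r[i] * y for r, y in zip(X, PRICE)) for i in range(k)]
xtx_inv = invert(xtx)

# b = (X'X)^-1 (X'Y)
b = [sum(xtx_inv[i][j] * xty[j] for j in range(k)) for i in range(k)]

# s_e^2 = (Y'Y - b'X'Y) / (n - k)
yty = sum(y * y for y in PRICE)
s_e = math.sqrt((yty - sum(bi * v for bi, v in zip(b, xty))) / (n - k))


def interval(x_f, individual):
    """CI at x_f = [1, size, age]; individual=True adds the extra 1 under the root."""
    y_hat = sum(bi * xi for bi, xi in zip(b, x_f))
    q = sum(x_f[i] * xtx_inv[i][j] * x_f[j] for i in range(k) for j in range(k))
    half = T_CRIT * s_e * math.sqrt(q + (1.0 if individual else 0.0))
    return y_hat - half, y_hat + half


mean_ci = interval([1.0, 1.5, 0.0], individual=False)
indiv_ci = interval([1.0, 1.5, 0.0], individual=True)
print([round(v, 3) for v in b], round(s_e, 3))
print("mean:", mean_ci, "individual:", indiv_ci)
```

For a new house with size 1.5 and age 0, the intervals should come out near the values in the deck's manual-calculation sheet, which is a useful cross-check on the hand-rolled inverse.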
CI Manual Calculations, multivariate case

$$s = \sqrt{\frac{Y'Y - b'X'Y}{n-k}}$$

Y' = [68.7 54.9 51.5 71.6 58.4 40.7 51.7 71.9 57.1 58.3 73.5 58.5 49.1 67.5 53.7 50]

Y'Y      b'X'Y   Num              Dem   s2               s
56256.6  56031   225.6 (=AT5-AW5) 13    17.35 (=AY5/AZ5) 4.166 ("St_Er")

X' =
1     1     1     1     1     1     1     1     1     1     1     1     1      1     1     1
2.05  1.7   1.47  1.75  1.94  1.19  1.56  1.95  1.6   1.49  1.91  1.38  1.55   1.88  1.6   1.55
3.43  11.61 8.31  0     7.41  31.7  16.1  2.05  1.74  2.76  0     0     12.61  2.8   7.08  18

b = (X'X)^-1 (X'Y)   X'Y        b'        (X'X)^-1
29.4198              937.1      29.42     5.751   -3.157  -0.057
20.3318              1583.523   20.33     -3.157  1.769   0.028
-0.5878              6352.303   -0.5878   -0.057  0.028   0.001
t = 2.16 ("t_.025")

Rad = X'f (X'X)^-1 Xf, computed with =MMULT(AR20:AT20,AT$13:AT$15)

Xf (1, Size, Age)  Yf     Rad    Delta   Delta_I  Lower (indiv.)  Upper (indiv.)  Lower (mean)  Upper (mean)
1  1.10  0         51.78  0.947  8.759   12.558   39.23           64.34           43.03         60.54
1  1.30  0         55.85  0.534  6.574   11.145   44.71           67.00           49.28         62.43
1  1.50  0         59.92  0.262  4.603   10.108   49.81           70.03           55.31         64.52
1  1.75  0         65     0.12   3.123   9.526    55.47           74.53           61.88         68.12
1  1.90  0         68.05  0.142  3.389   9.617    58.43           77.67           64.66         71.44
1  2.10  0         72.12  0.294  4.882   10.238   61.88           82.35           67.23         77
1  1.1   32        32.97  0.647  7.241   11.551   21.42           44.52           25.73         40.21
1  1.3   32        37.04  0.591  6.920   11.353   25.69           48.39           30.12         43.96
1  1.5   32        41.11  0.677  7.404   11.653   29.45           52.76           33.7          48.51
1  1.7   32        45.17  0.904  8.555   12.417   32.76           57.59           36.62         53.73
1  1.9   32        49.24  1.272  10.151  13.566   35.67           62.81           39.09         59.39
1  2.1   32        53.31  1.782  12.014  15.011   38.29           68.32           41.29         65.32

Xf' =
1    1    1    1     1    1    1    1    1    1    1    1
1.1  1.3  1.5  1.75  1.9  2.1  1.1  1.3  1.5  1.7  1.9  2.1
0    0    0    0     0    0    32   32   32   32   32   32
SPSS Data Analysis Calculations, multivariate case

House  Price  Size  Age    lmci_1  umci_1  lici_1  uici_1
1      68.70  2.05  3.43   64.49   73.68   58.98   79.19
2      54.90  1.70  11.61  54.42   59.90   47.75   66.57
3      51.50  1.47  8.31   51.28   57.57   44.89   63.96
4      71.60  1.75  0.00   61.88   68.12   55.47   74.53
5      58.40  1.94  7.41   60.54   68.47   54.67   74.34
6      40.70  1.19  31.70  28.05   41.91   23.62   46.34
7      51.70  1.56  16.10  48.48   54.86   42.13   61.22
8      71.90  1.95  2.05   64.24   71.49   58.16   77.56
9      57.10  1.60  1.74   57.56   64.29   51.32   70.54
10     58.30  1.49  2.76   54.09   62.09   48.24   67.94
11     73.50  1.91  0.00   64.81   71.69   58.62   77.89
12     58.50  1.38  0.00   51.73   63.22   46.80   68.16
13     49.10  1.55  12.61  50.89   56.15   44.15   62.90
14     67.50  1.88  2.80   62.88   69.12   56.47   75.52
15     53.70  1.60  7.08   55.37   60.21   48.47   67.11
16     50.00  1.55  18.00  46.75   53.95   40.66   60.05
The Standard Error of a Regression Equation
Review from previous slide.

$$s_e^2 = \frac{\sum (Y_i - \hat{Y}_i)^2}{n-k} \qquad s_e = \sqrt{\frac{Y'Y - b'X'Y}{n-k}}$$

where
Yi is the actually observed value of the dependent variable.
Yi hat is the predicted value from the fitted regression equation.
p = 1 is the number of independent variables. k = p+1 = 2 is the number of parameters, β0, β1.
n is the sample size used when calculating s.
Skip’s Quick and Dirty Method to Estimate the Confidence Interval for a Regression Line

$$\hat{Y}_{UCL} = \hat{Y} + 2 s_e = a + bX + 2 s_e$$

$$\hat{Y}_{LCL} = \hat{Y} - 2 s_e = a + bX - 2 s_e$$

Procedure:
Select a range of X values from Minimum X to Maximum X.
Calculate the corresponding predicted values for Y, Yhat.
Add and subtract 2 times the Standard Error for Regression to the predicted values.
Optional – plot the two CL lines on the scatter plot.
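The procedure above can be sketched in a few lines. This is only an illustration, assuming the fitted values from the R&D example (a = 9.151, b = 3.459, s = 17.532) and an arbitrary grid of X values from 10 to 40; the function name `quick_band` is hypothetical.

```python
def quick_band(x_values, a, b, s_e, width=2.0):
    """Return (X, LCL, Yhat, UCL) rows using Yhat +/- width * s_e."""
    rows = []
    for x in x_values:
        y_hat = a + b * x                      # predicted value on the line
        rows.append((x, y_hat - width * s_e, y_hat, y_hat + width * s_e))
    return rows


# Coefficients and standard error from the deck's R&D / Sales regression.
band = quick_band(range(10, 41, 5), a=9.151, b=3.459, s_e=17.532)
for x, lcl, y_hat, ucl in band:
    print(f"X = {x:2d}  LCL = {lcl:7.2f}  Yhat = {y_hat:7.2f}  UCL = {ucl:7.2f}")
```

Because the band has constant width, it does not widen away from the mean of X the way the exact intervals do; near X = 25 it is slightly narrower than the exact individual interval in the spreadsheet (about ±35 versus ±36.7).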
Confidence Interval for Regression Coefficients, single independent variable

$$b_1 \pm t(n-k,\alpha/2)\, s_e \sqrt{\frac{1}{\sum (X_i - \bar{X})^2}} = b_1 \pm t(n-2,\alpha/2)\, s_{b_1}$$

$$b_0 \pm t(n-k,\alpha/2)\, s_e \sqrt{\frac{\sum X_i^2}{N \sum (X_i - \bar{X})^2}} = b_0 \pm t(n-2,\alpha/2)\, s_{b_0}$$

where
p = 1 is the number of independent variables. k = p+1 = 2 is the number of parameters, β0, β1.
n is the sample size used when calculating s.
1-α is the confidence level, typically .95. So α/2 = .025.
Confidence Interval for Regression Coefficients, multivariate case

$$b_i \pm t(n-k,\alpha/2)\, s_e \sqrt{d_i} = b_i \pm t(n-k,\alpha/2)\, s_{b_i}$$

where d_i is the ith diagonal element of (X'X)^-1
and
p is the number of independent variables. k = p+1 is the number of parameters, β0, β1, … βp.
n is the sample size used when calculating s.
1-α is the confidence level, typically .95. So α/2 = .025.
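As a quick check of the formula, the interval for each coefficient can be rebuilt from its estimate and standard error. The sketch below assumes the coefficients and standard errors reported for the house-price regression in this part, with t(13, .025) taken as 2.1604; `coefficient_interval` is an illustrative name only.

```python
def coefficient_interval(b_i, s_bi, t_crit):
    """Return (lower, upper) for b_i +/- t * s_bi."""
    return b_i - t_crit * s_bi, b_i + t_crit * s_bi


T_CRIT = 2.1604  # t(n-k = 13, alpha/2 = .025)

# (coefficient, standard error) pairs from the house-price output.
estimates = {
    "Intercept": (29.420, 9.990),
    "Size": (20.332, 5.540),
    "Age": (-0.588, 0.153),
}
intervals = {name: coefficient_interval(b_i, s_bi, T_CRIT)
             for name, (b_i, s_bi) in estimates.items()}
for name, (low, high) in intervals.items():
    print(f"{name:9s} {low:8.3f} {high:8.3f}")
```

The bounds should match the Lower 95% and Upper 95% columns of the Excel and SPSS output to about two decimals; small differences come from using rounded inputs.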
Excel, Data Analysis Calculations, Multivariate Case
TOOLS / DATA ANALYSIS / Regression

House  Price  Size  Age
1      68.70  2.05  3.43
2      54.90  1.70  11.61
3      51.50  1.47  8.31
4      71.60  1.75  0.00
5      58.40  1.94  7.41
6      40.70  1.19  31.70
7      51.70  1.56  16.10
8      71.90  1.95  2.05
9      57.10  1.60  1.74
10     58.30  1.49  2.76
11     73.50  1.91  0.00
12     58.50  1.38  0.00
13     49.10  1.55  12.61
14     67.50  1.88  2.80
15     53.70  1.60  7.08
16     50.00  1.55  18.00
Excel, Data Analysis Calculations, Multivariate Case (continued)

SUMMARY OUTPUT
Regression Statistics
Multiple R          0.914
R Square            0.836
Adjusted R Square   0.810
Standard Error      4.166
Observations        16

ANOVA
            df   SS       MS      F      Significance F
Regression  2    1146.2   573.1   33.0   0.0
Residual    13   225.6    17.4
Total       15   1371.8

           Coefficients  Standard Error  t Stat  P-value  Lower 95%  Upper 95%  Lower 95.0%  Upper 95.0%
Intercept  29.42         9.99            2.94    0.011    7.84       51.00      7.84         51.00
Size       20.33         5.54            3.67    0.003    8.36       32.30      8.36         32.30
Age        -0.59         0.15            -3.85   0.002    -0.92      -0.26      -0.92        -0.26
SPSS Data Analysis Calculations, Multivariate Case

Coefficients(a)
Model 1      B        Std. Error   Beta    t        Sig.   Lower Bound   Upper Bound
(Constant)   29.420   9.990                2.945    .011   7.837         51.002
SIZE         20.332   5.540        .503    3.670    .003   8.363         32.301
AGE          -.588    .153         -.527   -3.845   .002   -.918         -.258
B and Std. Error are the unstandardized coefficients; Beta is the standardized coefficient; Lower Bound and Upper Bound give the 95% Confidence Interval for B.
a. Dependent Variable: PRICE
Hypothesis Test of Regression Coefficient

$$H_o: \beta_j = 0 \qquad H_1: \beta_j \neq 0$$

Test Statistic:

$$t_C = \frac{b_j}{s_{b_j}}$$

Reject H_o if

$$|t_C| > t(n-k,\alpha/2) = \text{TINV}(\alpha,\ n-k)$$

or Reject H_o if

$$\text{Prob-value} < \alpha, \quad \text{where Prob-value} = \text{TDIST}(|t_C|,\ n-k,\ 2)$$

where
p is the number of independent variables. k = p+1 is the number of parameters, β0, β1, … βp.
n is the sample size used when calculating s.
1-α is the confidence level, typically .95. So α/2 = .025.
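The decision rule can be sketched with the house-price coefficients. This is a minimal illustration, assuming the estimates and standard errors from the regression output in this part and the critical value t(13, .025) = 2.1604; the function name `t_test` is not from the deck, and the test is shown for the critical-value form only (no Prob-value is computed).

```python
def t_test(b_j, s_bj, t_crit):
    """Return (t_C, reject) for H_o: beta_j = 0 against H_1: beta_j != 0."""
    t_c = b_j / s_bj
    return t_c, abs(t_c) > t_crit


T_CRIT = 2.1604  # t(n-k = 13, alpha/2 = .025)

# (coefficient, standard error) pairs from the house-price output.
estimates = {"Size": (20.332, 5.540), "Age": (-0.588, 0.153)}
results = {name: t_test(b_j, s_bj, T_CRIT)
           for name, (b_j, s_bj) in estimates.items()}

for name, (t_c, reject) in results.items():
    b_j, s_bj = estimates[name]
    # Equivalent check: the 95% interval for the coefficient excludes zero.
    excludes_zero = (b_j - T_CRIT * s_bj > 0) or (b_j + T_CRIT * s_bj < 0)
    print(f"{name}: t_C = {t_c:.3f}, reject H_o = {reject}, "
          f"CI excludes 0 = {excludes_zero}")
```

Both slopes are rejected as zero here, and in each case the interval check gives the same answer as the t comparison, since the two tests are algebraically the same.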
Excel, Data Analysis Calculation, Multivariate Case

SUMMARY OUTPUT
Regression Statistics
Multiple R          0.914
R Square            0.836
Adjusted R Square   0.810
Standard Error      4.166
Observations        16

ANOVA
            df   SS       MS      F      Significance F
Regression  2    1146.2   573.1   33.0   0.000
Residual    13   225.6    17.4
Total       15   1371.8

           Coefficients  Standard Error  t Stat  P-value  Lower 95%  Upper 95%
Intercept  29.42         9.99            2.94    0.011    7.84       51.00
Size       20.33         5.54            3.67    0.003    8.36       32.30
Age        -0.59         0.15            -3.85   0.002    -0.92      -0.26

Critical value for α = 0.05: t = 2.16, from =TINV(0.05,13)
SPSS Data Analysis Calculations, Multivariate Case

Coefficients(a)
Model 1      B        Std. Error   Beta    t        Sig.   Lower Bound   Upper Bound
(Constant)   29.420   9.990                2.945    .011   7.837         51.002
SIZE         20.332   5.540        .503    3.670    .003   8.363         32.301
AGE          -.588    .153         -.527   -3.845   .002   -.918         -.258
B and Std. Error are the unstandardized coefficients; Beta is the standardized coefficient; Lower Bound and Upper Bound give the 95% Confidence Interval for B.
a. Dependent Variable: PRICE
Summary:
Never test the intercept (constant). Discussed in more detail in L01I
If sig is less than .05, keep the variable (slope not equal to zero).
If sig is greater than .05, consider eliminating the variable from the model (slope could be zero).
If you can’t remember these rules a year from now, look at the confidence interval. Does the confidence interval contain 0 (zero)? If it does, the slope could be zero.
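F-test for Overall Model

$$H_o: \beta_1 = \beta_2 = \beta_3 = \dots = \beta_p = 0 \qquad H_1: \text{at least one } \beta_j \neq 0$$

Test Statistic:

$$F_C = \frac{SS_{model}/(k-1)}{SSE/(n-k)} = \frac{\text{Explained Variation}/(k-1)}{\text{Unexplained Variation}/(n-k)} = \frac{\sum (\hat{Y}_i - \bar{Y})^2/(k-1)}{\sum (Y_i - \hat{Y}_i)^2/(n-k)} = \frac{\left(\sum \hat{Y}_i^2 - N\bar{Y}^2\right)/(k-1)}{\left(\sum Y_i^2 - \sum \hat{Y}_i^2\right)/(n-k)}$$

Reject H_o if

$$F_C > F(k-1,\ n-k,\ \alpha) = \text{FINV}(\alpha,\ k-1,\ n-k)$$

or Reject H_o if

$$\text{Prob-value} < \alpha, \quad \text{where Prob-value} = \text{FDIST}(F_C,\ k-1,\ n-k)$$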
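The F statistic is simple enough to recompute from the sums of squares. A small sketch follows, assuming the house-price ANOVA figures reported by SPSS in this part (SSR = 1146.245, SSE = 225.589, n = 16, k = 3). The critical value 3.81 for F(2, 13) at α = .05 is a standard table value supplied here as an assumption; it does not appear in the deck.

```python
def f_statistic(ssr, sse, n, k):
    """F_C = [SSR / (k - 1)] / [SSE / (n - k)]."""
    msr = ssr / (k - 1)   # mean square regression
    mse = sse / (n - k)   # mean square residual (equals s_e squared)
    return msr / mse


SSR, SSE, N, K = 1146.245, 225.589, 16, 3
F_CRIT = 3.81  # assumed table value for F(2, 13) at alpha = .05

f_c = f_statistic(SSR, SSE, N, K)
reject = f_c > F_CRIT
r_square = SSR / (SSR + SSE)
print(f"F_C = {f_c:.3f}, reject H_o = {reject}, R Square = {r_square:.3f}")
```

With F_C far above the critical value, the hypothesis that all slopes are zero is rejected, which agrees with the Sig. of .000 in the ANOVA table.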
SPSS Data Analysis Calculations, Multivariate Case

ANOVA(b)
Model 1      Sum of Squares   df   Mean Square   F        Sig.
Regression   1146.245         2    573.123       33.027   .000(a)
Residual     225.589          13   17.353
Total        1371.834         15
a. Predictors: (Constant), AGE, SIZE
b. Dependent Variable: PRICE
Review of ANOVA Analysis
Green = Residual from mean.
Blue, dashed = portion of residual explained by regression equation.
Red = portion of residual still unexplained after fitting regression equation.
[Figure: scatter plot of Sales (y-axis, 20 to 160) versus R/D Expense (x-axis, 0.00 to 50.00) with the fitted line Sales = 9.15 + 3.46(R/D Expense).]
Fundamental Concept of ANOVA Analysis

Residual Analysis
Total = Unexplained + Explained

$$(y_i - \bar{y}) = (y_i - \hat{y}_i) + (\hat{y}_i - \bar{y})$$

It can be shown (algebraically complex) that Total SS = Unexplained SS + Explained SS

$$\sum (y_i - \bar{y})^2 = \sum (y_i - \hat{y}_i)^2 + \sum (\hat{y}_i - \bar{y})^2$$
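The sum-of-squares identity is easy to confirm numerically. The sketch below assumes the 28 R&D / Sales observations from the single-variable example and fits the line by ordinary least squares; the variable names are illustrative.

```python
# Quarterly R&D expense (X) and sales (Y), copied from the deck's example data.
RD = [9.25, 12.50, 17.50, 20.00, 15.00, 18.00, 22.00, 25.25, 15.00, 20.25,
      24.25, 27.50, 25.00, 25.75, 29.25, 32.75, 30.00, 28.00, 33.50, 38.25,
      32.00, 25.25, 22.25, 25.00, 26.25, 31.25, 30.00, 40.50]
SALES = [40, 37, 50, 70, 60, 60, 72, 88, 101, 80, 81, 97, 110, 89, 103, 117,
         131, 98, 112, 134, 153, 145, 101, 89, 90, 105, 125, 145]

n = len(RD)
x_bar = sum(RD) / n
y_bar = sum(SALES) / n

# Least-squares slope and intercept.
b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(RD, SALES))
     / sum((x - x_bar) ** 2 for x in RD))
a = y_bar - b * x_bar
y_hat = [a + b * x for x in RD]

sst = sum((y - y_bar) ** 2 for y in SALES)                 # total
sse = sum((y - yh) ** 2 for y, yh in zip(SALES, y_hat))    # unexplained
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)               # explained

print(f"SST = {sst:.3f}, SSE = {sse:.3f}, SSR = {ssr:.3f}")
print(f"SSE + SSR = {sse + ssr:.3f}")
```

The three sums should be close to the Total, Residual and Regression rows of the SPSS ANOVA table for this example. The identity holds because the line is fitted by least squares with an intercept; it is not guaranteed for an arbitrary line.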
Review of ANOVA Table (1 of 3)
Terminology and Table Calculations

SS (Sum of Squares)                                        df (degrees of freedom)   MS (Mean Squares)                     F
SSR (Sum of Squares Regression)                            k-1                       SSR/(k-1) (Mean Squares Regression)   [SSR/(k-1)] / [SSE/(n-k)]
SSE (Sum of Squares Error, or Sum of Squares Residual)     n-k                       SSE/(n-k) (Mean Squares Residual)
SST (Sum of Squares Total)                                 n-1
Review of ANOVA Table (2 of 3)
Algebraic explanation of terms

              SS                                                   df    MS          F
(explained)   $SSR = \sum (\hat{y}_i - \bar{y})^2$                 k-1   SSR/(k-1)   [SSR/(k-1)] / [SSE/(n-k)]
(unexplained) $SSE = \sum e_i^2 = \sum (y_i - \hat{y}_i)^2$        n-k   SSE/(n-k)
(total)       $SST = \sum (y_i - \bar{y})^2$                       n-1
Review of ANOVA Table (3 of 3)
Calculation formulas

              SS                                           df    MS          F
(explained)   $SSR = \sum \hat{y}_i^2 - n\bar{y}^2$        k-1   SSR/(k-1)   [SSR/(k-1)] / [SSE/(n-k)]
(unexplained) $SSE = \sum y_i^2 - \sum \hat{y}_i^2$        n-k   SSE/(n-k)
(total)       $SST = \sum y_i^2 - n\bar{y}^2$              n-1

$$R^2 = SSR/SST \qquad s_e^2 = SSE/(n-k)$$
Review of ANOVA Table (1 of 3)
Matrix explanation of terms
Regression prediction compared to prediction of the mean of y

              SS                                      df    MS          F
(explained)   $SSR = b'X'Y - n\bar{y}_{original}^2$   k-1   SSR/(k-1)   [SSR/(k-1)] / [SSE/(n-k)]
(unexplained) $SSE = Y'Y - b'X'Y$                     n-k   SSE/(n-k)
(total)       $SST = Y'Y - n\bar{y}_{original}^2$     n-1
Review of ANOVA Table (2 of 3)
Alternative matrix explanation of terms
Regression prediction compared to prediction of 0 (zero)

              SS                    df    MS          F
(explained)   $SSR = b'X'Y$         k     SSR/k       [SSR/k] / [SSE/(n-k)]
(unexplained) $SSE = Y'Y - b'X'Y$   n-k   SSE/(n-k)
(total)       $SST = Y'Y$           n
Review of ANOVA Table (3 of 3)
Alternative matrix explanation of terms
Regression prediction compared to prediction of the mean of Y & 0 (zero)

                    SS                                                 df    MS          F
(explained)         $SS_{b_0} = n\bar{y}_{original}^2$                 1
(explained)         $SSR_{|b_0} = b'X'Y - n\bar{y}_{original}^2$       k-1   SSR/(k-1)   [SSR/(k-1)] / [SSE/(n-k)]
(unexplained)       $SSE = Y'Y - b'X'Y$                                n-k   SSE/(n-k)
(total)             $SST = Y'Y - n\bar{y}_{original}^2$                n-1
(uncorrected total) $SST = Y'Y$                                        n
Statistical Assumptions
0. The expected value of the residuals is zero, E(εi) = 0. The algebraic equation is the correct functional form and accurately predicts E(Yi,j) for all j.

Inference Assumptions
1. The residual variance is constant. That is, σi,j² = σ² for all Xi,j and all i and j. The variance of the observations (Yi,j) does not change as more observations are obtained and/or as different values of Xj are observed.
2. The observations are statistically independent. That is, Yi,j is statistically independent of all other Yi’,j values for all i (& j fixed). Knowing the current value of Y does not provide insights into the value of the next Y.
3. The residual errors are normally distributed. The εi,j terms are N(0, σ²).