Simple Linear Regression - Statistical Inference
Reading: Sections 12.3, 12.4, and 12.5
Learning Objectives: Students should be able to:
• Describe the relationship between two variables using plots and correlation.
• Make inference about population parameters
– Confidence intervals
– Hypothesis tests
• Make predictions for new observations of the dependent variable based on known values of the independent variable
Is the Simple Linear Regression Model Useful?
Coefficient of determination and correlation coefficient
• Coefficient of Determination (r²): the larger the r², the greater the proportion of the variation in Y that is explained by its linear relationship to x.
• Correlation coefficient (r) describes how “tight” a linear relationship is between two variables.
– Positive: larger x values tend to be associated with larger y values
– Negative: larger x values tend to be associated with smaller y values
Relationship between COD (r²) and CC (r)
Pearson correlation of y and x = 1.000
The regression equation is y = 5.00 + 1.00 x
S = 0   R-Sq = 100.0%   R-Sq(adj) = 100.0%
Source          DF      SS      MS  F  P
Regression       1  10.254  10.254  *  *
Residual Error   8   0.000   0.000
Total            9  10.254
Pearson correlation of y1 and x1 = 0.000
The regression equation is y1 = 4.75 + 0.000 x1
S = 1.13376   R-Sq = 0.0%   R-Sq(adj) = 0.0%
Source          DF      SS     MS     F      P
Regression       1   0.000  0.000  0.00  0.999
Residual Error   8  10.283  1.285
Total            9  10.283
Regression Analysis: MPG versus motorsz
The regression equation is
MPG = 33.7 - 0.0474 motorsz
S = 3.06705   R-Sq = 77.2%   R-Sq(adj) = 76.4%
Analysis of Variance
Source          DF       SS      MS       F      P
Regression       1   955.34  955.34  101.56  0.000
Residual Error  30   282.20    9.41
Total           31  1237.54
Correlations: MPG, motorsz
Pearson correlation of MPG and motorsz = -0.879
Correlations: motorsz, weight
Pearson correlation of motorsz and weight = 0.947
P-Value = 0.000
Regression Analysis: motorsz versus weight
The regression equation is
motorsz = - 135 + 0.117 weight
Predictor      Coef   SE Coef      T      P
Constant    -134.67     26.84  -5.02  0.000
weight     0.116932  0.007241  16.15  0.000
S = 38.2179   R-Sq = 89.7%   R-Sq(adj) = 89.3%
Analysis of Variance
Source          DF      SS      MS       F      P
Regression       1  380883  380883  260.77  0.000
Residual Error  30   43818    1461
Total           31  424701
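The link between the correlation coefficient and the coefficient of determination can be checked by hand from the two motor-size outputs above. The short sketch below squares the reported Pearson r and also divides the regression sum of squares by the total sum of squares; both should land on the reported R-Sq, up to rounding of the printed values. The variable names are illustrative only.

```python
# Each entry: (label, Pearson r, regression SS, total SS, reported R-Sq in %).
# All numbers are copied from the Minitab outputs above.
outputs = [
    ("MPG vs motorsz", -0.879, 955.34, 1237.54, 77.2),
    ("motorsz vs weight", 0.947, 380883.0, 424701.0, 89.7),
]

results = []
for label, r, ss_reg, ss_total, reported in outputs:
    from_r = 100 * r ** 2                # r squared, as a percentage
    from_anova = 100 * ss_reg / ss_total  # SS(Regression) / SS(Total)
    results.append((label, from_r, from_anova, reported))
    print(f"{label}: r^2 = {from_r:.1f}%, "
          f"SSReg/SSTotal = {from_anova:.1f}%, reported R-Sq = {reported}%")
```

Because r is printed to only three decimals, squaring it can differ from the reported R-Sq in the last digit. The sign of r is lost when squaring, so R-Sq alone does not say whether the relationship is positive or negative.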
Is the Simple Linear Regression Model Useful?
Inference about β1
• The larger the |r|, the larger the |β̂1|
• Small |r| may be indicative of true β1 = 0
• Will make inference about β1 using confidence intervals (CI) and hypothesis tests (HT).
If |r| or r² is reasonably large
– CI will not contain 0
– Hypothesis of β1 = 0 will be rejected
Then utility of model is confirmed
β1 is the true expected change in Y for every one-unit change in x. If β1 = 0, then changes in x do not influence changes in Y. Hence the model is not useful.
(1-α)100% Confidence Interval for β1
\[ \hat{\beta}_1 \sim N\left(\beta_1,\ \sigma^2_{\hat{\beta}_1}\right), \qquad \sigma^2_{\hat{\beta}_1} = \frac{\sigma^2}{S_{XX}} \]
Point Estimator and its distribution (t-distribution)
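As a sketch of the standard textbook result, assuming the usual simple linear regression model with independent normal errors: when σ is replaced by the residual standard deviation s (reported by Minitab as S), the standardized estimator follows a t distribution with n-2 degrees of freedom, and the interval follows directly. Here S_XX denotes the sum of squared deviations of the x values from their mean, and the symbol for the estimated standard error is a notational assumption rather than something taken from the slides.

```latex
\[
s_{\hat{\beta}_1} = \frac{s}{\sqrt{S_{XX}}}, \qquad
T = \frac{\hat{\beta}_1 - \beta_1}{s_{\hat{\beta}_1}} \sim t_{n-2}
\]
\[
\hat{\beta}_1 \;\pm\; t_{\alpha/2,\,n-2}\; s_{\hat{\beta}_1}
\]
```

In Minitab output, the estimated standard error of the slope is the "SE Coef" entry on the predictor's row.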
Hypothesis-Testing Procedure (t-test)
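For the usual two-sided test of whether the slope is zero, the procedure can be sketched as follows. This is the standard form of the test under the normal-error model; the symbols for the standard error and the critical value are assumptions about notation.

```latex
\[
H_0:\ \beta_1 = 0 \quad \text{versus} \quad H_a:\ \beta_1 \neq 0
\]
\[
t = \frac{\hat{\beta}_1}{s_{\hat{\beta}_1}}, \qquad
\text{reject } H_0 \text{ at level } \alpha \text{ if } |t| \ge t_{\alpha/2,\,n-2}
\]
```

In Minitab output, this t is the "T" entry on the predictor's row, computed as Coef divided by SE Coef.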
Hypothesis-Testing Procedure (F-test)
Source of Variation   D.F.   Sum of Squares   Mean Square        F-test
Regression            1      SSR              MSR = SSR/1        f = MSR/MSE
Error                 n-2    SSE              MSE = SSE/(n-2)
Total                 n-1    SST
Reject H0: β1 = 0 if f ≥ F_{α,1,n-2} for a level α test.
Or compute the p-value.
Example: MPG and Motorsize
Regression Analysis: MPG versus motorsz - Edited output from MINITAB
The regression equation is MPG = 33.7 - 0.0474 motorsz (n=32)
Predictor       Coef   SE Coef      T      P
Constant      33.727     1.446  23.33  0.000
motorsz    -0.047428  0.004706
S = 3.06705   R-Sq = 77.2%
Analysis of Variance
Source          DF       SS      MS  F  P
Regression       1   955.34  955.34
Residual Error       282.20    9.41
Total               1237.54
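The entries missing from the edited output can be worked out from the values that remain. The sketch below does this with the standard library only. The critical value 2.042 is the usual table value of t for 30 degrees of freedom at a two-sided 5% level; it is hard-coded here as an assumption (α = 0.05) rather than computed.

```python
# Values copied from the edited Minitab output above.
n = 32
b1, se_b1 = -0.047428, 0.004706   # slope estimate and its standard error
ss_reg, ss_err = 955.34, 282.20   # regression and residual sums of squares

# Assumption: alpha = 0.05, two-sided; t_{0.025, 30} taken from a t table.
T_CRIT = 2.042

df_error = n - 2                  # residual degrees of freedom
df_total = n - 1                  # total degrees of freedom

t_stat = b1 / se_b1               # t statistic for H0: beta1 = 0
ms_err = ss_err / df_error        # mean square error
f_stat = (ss_reg / 1) / ms_err    # F statistic with (1, n-2) df

# 95% confidence interval for the slope.
ci_slope = (b1 - T_CRIT * se_b1, b1 + T_CRIT * se_b1)

print(f"DF error = {df_error}, DF total = {df_total}")
print(f"t = {t_stat:.2f}, t^2 = {t_stat ** 2:.2f}, F = {f_stat:.2f}")
print(f"95% CI for slope: ({ci_slope[0]:.4f}, {ci_slope[1]:.4f})")
print("Reject H0: beta1 = 0" if abs(t_stat) >= T_CRIT else "Fail to reject H0")
```

The squared t statistic and the F statistic agree up to rounding of the printed inputs, which is expected in simple linear regression. The interval for the slope lies entirely below zero, so both the interval and the test point to a useful model.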
Confidence Interval for Mean Y Value
CI for \( \mu_{Y \cdot x^*} = E(Y \mid x^*) = \beta_0 + \beta_1 x^* \)
Point Estimation
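A sketch of the standard formulas for the point estimate and the confidence interval for the mean response at a given x*, assuming the normal-error simple linear regression model. Here s is the residual standard deviation (Minitab's S), x̄ is the mean of the observed x values, and S_XX is the sum of squared deviations of x from its mean; this notation is an assumption, not taken from the slides.

```latex
\[
\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x^*
\]
\[
\hat{y} \;\pm\; t_{\alpha/2,\,n-2}\; s \sqrt{\frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{XX}}}
\]
```

The quantity multiplying the t critical value is what Minitab reports as "SE Fit". The interval is narrowest at x* = x̄ and widens as x* moves away from the mean of the observed x values.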
Prediction Interval for a Future Y Value
PI for \( Y \mid x^* \)
Prediction
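A sketch of the standard prediction interval for a single future observation at x*, under the same normal-error model. The notation (s, x̄, S_XX) follows the usual textbook convention and is an assumption here.

```latex
\[
\hat{y} \;\pm\; t_{\alpha/2,\,n-2}\; s \sqrt{1 + \frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{XX}}}
\]
```

The extra 1 under the square root accounts for the variability of an individual observation around its mean, so a prediction interval is always wider than the confidence interval for the mean response at the same x*.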
Example: MPG and Motorsize
The regression equation is MPG = 33.7 - 0.0474 motorsz (n=32)
Predictor       Coef   SE Coef      T      P
Constant      33.727     1.446  23.33  0.000
motorsz    -0.047428  0.004706
S = 3.06705   R-Sq = 77.2%
Obs     Fit  SE Fit  95% CI  95% PI
1    19.499   0.547
(1) Suppose we want some idea of the MPG of cars with a motorsize of 300. Do we want a CI or a PI?
(2) Suppose you want some idea of the MPG of one particular car with a motorsize of 300 that you are considering purchasing. Do you want a CI or a PI?
CI and PI Using Minitab
Predicted Values for New Observations
NewObs Fit SE Fit 95% CI 95% PI
1 19.499 0.547 (18.382, 20.616) (13.136, 25.862)
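The two intervals in this output can be reproduced from Fit, SE Fit, and S. The sketch below assumes a 95% level and uses the table value 2.042 for t with 30 degrees of freedom, hard-coded rather than computed. The standard error for a new observation is taken as the square root of S² + (SE Fit)², which is the usual identity under the simple linear regression model.

```python
import math

# Values copied from the Minitab output (motorsz = 300, n = 32).
fit = 19.499      # fitted MPG
se_fit = 0.547    # standard error of the fitted mean
s = 3.06705       # residual standard deviation (Minitab's S)

# Assumption: 95% level; t_{0.025, 30} taken from a t table.
T_CRIT = 2.042

# Point estimate recomputed from the printed coefficients.
point = 33.727 - 0.047428 * 300

# Confidence interval for the mean MPG of all cars with motorsz = 300.
ci = (fit - T_CRIT * se_fit, fit + T_CRIT * se_fit)

# Prediction interval for the MPG of one new car with motorsz = 300.
se_pred = math.sqrt(s ** 2 + se_fit ** 2)
pi = (fit - T_CRIT * se_pred, fit + T_CRIT * se_pred)

print(f"point estimate = {point:.3f}")
print(f"95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
print(f"95% PI = ({pi[0]:.3f}, {pi[1]:.3f})")
```

Because the critical value is rounded to three decimals, the prediction interval endpoints may differ from Minitab's in the third decimal place.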
Extrapolation
Would it be wise to use our estimated simple linear regression model to predict the MPG of a car with motorsz = 600?
Extrapolation is making estimations/predictions for Y conditional on values of x outside of those observed in the data used to estimate the regression parameters.
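To see what the fitted line says when pushed to motorsz = 600, the sketch below plugs that value into the estimated equation and also finds where the line would cross zero MPG. It uses only the printed coefficients; the helper name `predict_mpg` is hypothetical. The range of motorsz actually observed in the data is not given here, so the code does not check it.

```python
# Coefficients copied from the Minitab output: MPG = 33.727 - 0.047428 motorsz.
B0 = 33.727
B1 = -0.047428

def predict_mpg(motorsz):
    """Point estimate of MPG from the fitted line (hypothetical helper)."""
    return B0 + B1 * motorsz

mpg_at_300 = predict_mpg(300)   # a value the example treats as an ordinary prediction
mpg_at_600 = predict_mpg(600)   # the value asked about in the extrapolation question

# Motor size at which the fitted line reaches 0 MPG.
zero_crossing = -B0 / B1

print(f"Fitted MPG at motorsz = 300: {mpg_at_300:.2f}")
print(f"Fitted MPG at motorsz = 600: {mpg_at_600:.2f}")
print(f"Fitted line reaches 0 MPG at motorsz = {zero_crossing:.1f}")
```

The arithmetic goes through for any input, and the line eventually predicts negative MPG, which is impossible. The formula itself gives no warning that a value lies outside the data, which is why the range of observed x values needs to be checked before trusting a prediction.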
Summary
Use the estimated linear regression model to:
• Evaluate the linear relationship between Y and x:
– Coefficient of determination
– Confidence interval for β1
– Hypothesis test for H0: β1 = 0
• Predict values of Y conditional on known x
– Point estimate
– Confidence interval or prediction interval