Topic 3: Simple Linear Regression
Outline
• Simple linear regression model
– Model parameters
– Distribution of error terms
• Estimation of regression parameters
– Method of least squares
– Maximum likelihood
Data for Simple Linear Regression
• Observe n pairs of variables (Xi, Yi), i = 1, 2, ..., n
• Each pair often called a case
• Yi = ith response variable
• Xi = ith explanatory variable
Simple Linear Regression Model
• Yi = β0 + β1Xi + εi
• β0 is the intercept
• β1 is the slope
• εi is a random error term
– E(εi) = 0 and σ2(εi) = σ2
– εi and εj (i ≠ j) are uncorrelated
Simple Linear Normal Error Regression Model
• Yi = β0 + β1Xi + εi
• β0 is the intercept
• β1 is the slope
• εi is a Normally distributed random error with mean 0 and variance σ2
• εi and εj (i ≠ j) are uncorrelated → independent (for Normal errors, uncorrelated implies independent)
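As a quick illustration of this model, here is a minimal Python simulation sketch (the slides' examples use SAS; the parameter values β0 = 2, β1 = 0.5, σ = 1 below are illustrative, not from the slides):

```python
# Minimal simulation of the Normal error regression model:
#   Yi = beta0 + beta1*Xi + eps_i,  eps_i ~ N(0, sigma^2), independent.
# Parameter values are illustrative, not taken from the slides.
import random

random.seed(1)
beta0, beta1, sigma = 2.0, 0.5, 1.0
X = [float(x) for x in range(1, 101)]
Y = [beta0 + beta1 * x + random.gauss(0.0, sigma) for x in X]

# The errors average out: the mean deviation of the Yi from the true
# line beta0 + beta1*Xi should be near zero.
mean_dev = sum(y - (beta0 + beta1 * x) for x, y in zip(X, Y)) / len(X)
```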
Model Parameters
• β0 : the intercept
• β1 : the slope
• σ2 : the variance of the error term
Features of Both Regression Models
• Yi = β0 + β1Xi + εi
• E(Yi) = β0 + β1Xi + E(εi) = β0 + β1Xi
• Var(Yi) = 0 + Var(εi) = σ2
– Mean of Yi determined by value of Xi
– All possible means fall on a line
– The Yi vary about this line
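The two moment facts above, E(Yi) = β0 + β1Xi and Var(Yi) = σ2, can be checked by Monte Carlo at a single fixed X; a Python sketch with illustrative parameter values:

```python
# Monte Carlo check of E(Y) = beta0 + beta1*X and Var(Y) = sigma^2
# at one fixed X value (parameter values are illustrative).
import random
import statistics

random.seed(2)
beta0, beta1, sigma, x = 1.0, 3.0, 2.0, 4.0
ys = [beta0 + beta1 * x + random.gauss(0.0, sigma) for _ in range(20000)]

mean_y = statistics.fmean(ys)    # should be near beta0 + beta1*x = 13
var_y = statistics.variance(ys)  # should be near sigma^2 = 4
```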
Features of Normal Error Regression Model
• Yi = β0 + β1Xi + εi
• If εi is Normally distributed then
Yi is N(β0 + β1Xi , σ2) (A.36)
• This does not imply that the Yi, taken together, follow a single Normal distribution; each Yi has a different mean, β0 + β1Xi
Fitted Regression Equation and Residuals
• Ŷi = b0 + b1Xi
– b0 is the estimated intercept
– b1 is the estimated slope
• ei : residual for ith case
• ei = Yi – Ŷi = Yi – (b0 + b1Xi)
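In code, fitted values and residuals follow directly from these definitions; a Python sketch (b0, b1 and the data are toy values assumed for illustration, not fit from real data):

```python
# Fitted values and residuals for a given fitted line Y-hat = b0 + b1*X.
# b0, b1 and the data are illustrative toy values.
b0, b1 = 1.0, 2.0
X = [1.0, 2.0, 3.0, 4.0]
Y = [3.1, 4.9, 7.2, 8.8]

Y_hat = [b0 + b1 * x for x in X]             # Y-hat_i = b0 + b1*Xi
resid = [y - yh for y, yh in zip(Y, Y_hat)]  # e_i = Yi - Y-hat_i
```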
Example: for the case with X = 82 (year 82 in the pisa data),
Ŷ82 = b0 + b1(82) and e82 = Y82 – Ŷ82

Plot the residuals (continuation of pisa.sas):

proc gplot data=a2;
  plot resid*year / vref=0;
  where lean ne .;
run;

• Uses the data set from the output statement
• vref=0 adds a horizontal reference line at zero to the plot
Least Squares
• Want to find “best” b0 and b1
• Will minimize Σ(Yi – (b0 + b1Xi) )2
• Use calculus: take the derivative with respect to b0 and with respect to b1,
set the two resulting equations equal to zero, and solve for b0 and b1
• See KNNL pgs 16-17
Least Squares Solution
• b1 = Σ(Xi – X̄)(Yi – Ȳ) / Σ(Xi – X̄)2
• b0 = Ȳ – b1X̄
• These are also the maximum likelihood estimators for the Normal error model, see KNNL pp 30-32
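The closed-form solution translates directly to code; a Python sketch on toy data (the slides' own examples use SAS):

```python
# Least-squares estimates from the closed-form solution:
#   b1 = sum((Xi - Xbar)(Yi - Ybar)) / sum((Xi - Xbar)^2)
#   b0 = Ybar - b1*Xbar
# Toy data for illustration.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.0, 4.1, 5.9, 8.2, 9.8]

n = len(X)
Xbar = sum(X) / n
Ybar = sum(Y) / n
b1 = sum((x - Xbar) * (y - Ybar) for x, y in zip(X, Y)) / \
     sum((x - Xbar) ** 2 for x in X)
b0 = Ybar - b1 * Xbar
```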
Maximum Likelihood
• Yi ~ N(β0 + β1Xi , σ2), independent
• Density for the ith observation:
fi = (1 / √(2πσ2)) exp( –(Yi – β0 – β1Xi)2 / (2σ2) )
• Likelihood function: L = f1 · f2 · ⋯ · fn
• Find β0 and β1 which maximize L
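Since log L = –(n/2) log(2πσ2) – SSE/(2σ2), maximizing L over β0 and β1 is the same as minimizing SSE, which is why the maximum likelihood and least squares estimates coincide. A Python check on toy data:

```python
# For fixed sigma^2, the Normal log-likelihood is
#   log L = -(n/2) log(2*pi*sigma^2) - SSE(beta0, beta1) / (2*sigma^2),
# so the (beta0, beta1) maximizing L also minimize SSE.
import math

X = [1.0, 2.0, 3.0, 4.0, 5.0]  # toy data for illustration
Y = [2.0, 4.1, 5.9, 8.2, 9.8]
n = len(X)

def log_lik(beta0, beta1, sigma2=1.0):
    sse = sum((y - beta0 - beta1 * x) ** 2 for x, y in zip(X, Y))
    return -(n / 2) * math.log(2 * math.pi * sigma2) - sse / (2 * sigma2)

# Closed-form least-squares solution:
Xbar, Ybar = sum(X) / n, sum(Y) / n
b1 = sum((x - Xbar) * (y - Ybar) for x, y in zip(X, Y)) / \
     sum((x - Xbar) ** 2 for x in X)
b0 = Ybar - b1 * Xbar
```

The log-likelihood at the least-squares solution beats any perturbed line, e.g. `log_lik(b0, b1) > log_lik(b0 + 0.1, b1)`.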
Estimation of σ2
• s2 = MSE = SSE / dfE = Σ(Yi – Ŷi)2 / (n – 2) = Σ ei2 / (n – 2)
• Root MSE = s, the estimate of σ
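The estimate follows mechanically from the residuals; a Python sketch continuing the same toy data:

```python
# Estimating sigma^2 by s^2 = MSE = SSE / (n - 2) after a least-squares fit.
# Toy data; b0, b1 are the least-squares estimates computed in-line.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.0, 4.1, 5.9, 8.2, 9.8]
n = len(X)
Xbar, Ybar = sum(X) / n, sum(Y) / n
b1 = sum((x - Xbar) * (y - Ybar) for x, y in zip(X, Y)) / \
     sum((x - Xbar) ** 2 for x in X)
b0 = Ybar - b1 * Xbar

resid = [y - (b0 + b1 * x) for x, y in zip(X, Y)]
sse = sum(e * e for e in resid)
mse = sse / (n - 2)       # dfE = n - 2: two parameters were estimated
root_mse = mse ** 0.5     # the "Root MSE" reported by Proc REG
```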
Standard output from Proc REG

Analysis of Variance

Source           DF   Sum of Squares   Mean Square   F Value   Pr > F
Model             1          15804           15804    904.12   <.0001
Error            11      192.28571        17.48052
Corrected Total  12          15997

Root MSE          4.18097    R-Square   0.9880
Dependent Mean  693.69231    Adj R-Sq   0.9869
Coeff Var         0.60271

• s = √MSE = Root MSE, with dfE = n – 2 = 11
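The printed quantities are tied together by a few identities (MSE = SSE/dfE, Root MSE = √MSE, F = MSM/MSE, R2 = SSM/SSTO, Coeff Var = 100 · Root MSE / Dependent Mean); a Python sketch that checks them, up to rounding, using values copied from the output above:

```python
# Checking (up to rounding) the relations among the Proc REG quantities.
# All numeric values are copied from the output table above.
import math

ss_model, ss_error, ss_total = 15804.0, 192.28571, 15997.0
df_model, df_error = 1, 11
dep_mean = 693.69231

ms_error = ss_error / df_error               # MSE = SSE / dfE
root_mse = math.sqrt(ms_error)               # s = sqrt(MSE)
f_value = (ss_model / df_model) / ms_error   # F = MSM / MSE
r_square = ss_model / ss_total               # R^2 = SSM / SSTO
coeff_var = 100 * root_mse / dep_mean        # coefficient of variation, in %
```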
Properties of Least Squares Line
• The line always goes through (X̄, Ȳ)
• The residuals always sum to zero:
ei = Yi – b0 – b1Xi
Σei = ΣYi – nb0 – b1ΣXi = nȲ – n(Ȳ – b1X̄) – b1nX̄ = 0
• Other properties on pgs 23-24
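Both properties are easy to verify numerically; a Python sketch on toy data:

```python
# Numerical check of two properties of the least-squares line:
# (1) the residuals sum to zero, (2) the line passes through (Xbar, Ybar).
X = [1.0, 2.0, 3.0, 4.0, 5.0]  # toy data for illustration
Y = [2.0, 4.1, 5.9, 8.2, 9.8]
n = len(X)
Xbar, Ybar = sum(X) / n, sum(Y) / n
b1 = sum((x - Xbar) * (y - Ybar) for x, y in zip(X, Y)) / \
     sum((x - Xbar) ** 2 for x in X)
b0 = Ybar - b1 * Xbar

sum_resid = sum(y - (b0 + b1 * x) for x, y in zip(X, Y))
line_at_xbar = b0 + b1 * Xbar  # should equal Ybar
```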
Background Reading
• Chapter 1
– 1.6 : Estimation of regression function
– 1.7 : Estimation of error variance
– 1.8 : Normal regression model
• Chapter 2
– 2.1 and 2.2 : inference concerning β's
• Appendix A
– A.4, A.5, A.6, and A.7