Notes Ch8
Chapter 8
Linear Regression
Regression Line
• Straight line that describes how a response variable y changes as an explanatory variable x changes.
• LSRL - Least Squares Regression Line
• LSRL is a predictor of y given x. It serves as a mathematical model for the data.
LSRL cont.
• Error = observed – predicted
• The LSRL makes these errors, the vertical distances of the data points from the line, as small as possible overall.
LSRL cont.
• LSRL of y on x is the line that makes the sum of the squares of the vertical distances of the data points from the line as small as possible.
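The least-squares idea can be checked numerically. The following is a minimal sketch in Python, assuming a small made-up data set (`xs`, `ys`) and two hypothetical helpers, `sse` and `least_squares_line`; it compares the sum of squared vertical distances for the fitted line against a few nudged lines.

```python
def sse(xs, ys, a, b):
    """Sum of squared vertical distances from the line y = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


def least_squares_line(xs, ys):
    """Return (intercept a, slope b) of the least-squares line of y on x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


# Hypothetical data, for illustration only (not from the notes).
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = least_squares_line(xs, ys)
best = sse(xs, ys, a, b)

# Any other line, here the fitted line nudged slightly, has a larger sum of squares.
nudged = [sse(xs, ys, a + da, b + db)
          for da, db in [(0.5, 0), (-0.5, 0), (0, 0.2), (0, -0.2)]]
print(best, min(nudged))
```

Squaring the distances keeps positive and negative errors from cancelling, which is why the criterion uses the sum of squares rather than the plain sum of errors.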
Formulas to know
$\hat{y}$ - predicted response
$y$ - observed response
$\hat{y} = a + bx$
$a = \bar{y} - b\bar{x}$
$b = r \frac{s_y}{s_x}$
Slope and Intercept
• Slope is the rate of change of y when x increases by 1 unit.
• The intercept is the value of the predicted response when x is equal to 0. The intercept is not always statistically meaningful.
Coefficient of Determination
• The coefficient of determination, r^2, gives the proportion of the variation in y that is explained by the regression line rather than left over as error.
• If error is large, r^2 will be close to 0.
• If error is small, r^2 will be close to 1.
Coefficient of Determination = (Correlation Coefficient)^2
Interpretation of r^2
• Convert r^2 to a percent and state that:
About _______% of the variation in y is explained by the LSRL on x.
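One way to see why r^2 is "the proportion that is not error" is to compute it as 1 minus (squared error around the line) / (total squared variation in y) and compare that with the square of the correlation. This is a rough sketch on made-up data; the variable names are assumptions for illustration.

```python
import math

# Hypothetical data, for illustration only (not from the notes).
xs = [1, 2, 3, 4, 5, 6]
ys = [2, 4, 3, 6, 5, 8]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
sxx = sum((x - x_bar) ** 2 for x in xs)
syy = sum((y - y_bar) ** 2 for y in ys)  # total variation in y

# Least-squares slope and intercept.
b = sxy / sxx
a = y_bar - b * x_bar

# Correlation coefficient.
r = sxy / math.sqrt(sxx * syy)

# Squared error left over after fitting the line.
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Proportion of the variation in y that is not error.
r_squared = 1 - sse / syy

print(f"About {100 * r_squared:.1f}% of the variation in y "
      f"is explained by the LSRL on x.")
```

For a least-squares line the two routes agree: `r_squared` matches `r ** 2` up to rounding.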
Calculate LSRL using formulas
• Data were collected from all 78 seventh-grade students in a rural Midwestern town. Find the equation of the LSRL for predicting GPA from IQ given the following summary statistics.
IQ scores: $\bar{x} = 108.9$, $s_x = 13.17$
Grade point averages: $\bar{y} = 7.447$, $s_y = 2.10$
The correlation between IQ and GPA is $r = 0.6337$.
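The calculation can be sketched in Python using the formulas b = r(s_y / s_x) and a = y-bar − b(x-bar) with the summary statistics above. The helper name `lsrl_from_summary` is an assumption for illustration.

```python
def lsrl_from_summary(x_bar, s_x, y_bar, s_y, r):
    """Return (intercept a, slope b) of the LSRL from summary statistics."""
    b = r * s_y / s_x        # slope: b = r * (s_y / s_x)
    a = y_bar - b * x_bar    # intercept: a = y_bar - b * x_bar
    return a, b


# Summary statistics from the notes: x is IQ, y is GPA.
a, b = lsrl_from_summary(x_bar=108.9, s_x=13.17, y_bar=7.447, s_y=2.10, r=0.6337)

print(f"predicted GPA = {a:.3f} + {b:.4f} * IQ")
```

This gives roughly predicted GPA = -3.557 + 0.1010(IQ): each additional IQ point is associated with about a 0.10 increase in predicted GPA. The intercept corresponds to an IQ of 0, far outside the data, so it has no meaningful interpretation here.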
What percent of the observed variation in these students' GPAs can be explained by the linear relationship between GPA and IQ?
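Since the coefficient of determination is the square of the correlation, the answer follows directly from r = 0.6337. A short worked calculation:

```latex
r^2 = (0.6337)^2 \approx 0.4016
```

So about 40% of the variation in GPA is explained by the LSRL on IQ.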
Do Part I of the Archaeopteryx worksheet
Residuals
• Residuals (errors) = observed – predicted
• Residual plot: a scatterplot of the points (x, residual). There should be no pattern in the plot, only random scatter.
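As a minimal sketch, residuals can be computed by fitting the line and subtracting each predicted value from the observed one. The data and the helper `least_squares_line` are hypothetical, for illustration only.

```python
def least_squares_line(xs, ys):
    """Return (intercept a, slope b) of the least-squares line of y on x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    return y_bar - b * x_bar, b


# Hypothetical data, for illustration only (not from the notes).
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b = least_squares_line(xs, ys)

# residual = observed - predicted
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

# The points of the residual plot are the pairs (x, residual).
residual_plot_points = list(zip(xs, residuals))
for x, res in residual_plot_points:
    print(f"x = {x}, residual = {res:+.2f}")
```

For a least-squares line the residuals sum to zero (up to rounding), so the residual plot is centered on the horizontal line at 0; what matters is whether the points scatter randomly around that line or follow a curve or fan shape.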
Outliers
• Points that lie outside the overall pattern of the other observations. Outliers in the y direction have large residuals, so they show up clearly in the residual plot.
Influential points
• A point is influential if removing it would markedly change the position of the regression line. Points that are outliers in the x direction are often influential.
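One rough way to check for influence is to fit the line with and without the suspect point and compare the slopes. The sketch below uses made-up data in which the last point is far from the others in the x direction; the helper `least_squares_line` is hypothetical.

```python
def least_squares_line(xs, ys):
    """Return (intercept a, slope b) of the least-squares line of y on x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    return y_bar - b * x_bar, b


# Hypothetical data: the last point (15, 12.0) is an outlier in the x direction.
xs = [1, 2, 3, 4, 5, 15]
ys = [2.0, 4.1, 5.9, 8.2, 9.8, 12.0]

# Fit with all points, then again with the last point removed.
a_with, slope_with = least_squares_line(xs, ys)
a_without, slope_without = least_squares_line(xs[:-1], ys[:-1])

print(f"slope with the point:    {slope_with:.3f}")
print(f"slope without the point: {slope_without:.3f}")
```

Here removing the single point changes the slope substantially, which is what marks it as influential. Because such a point pulls the line toward itself, its own residual can be small, so it may not stand out in a residual plot.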