Modeling a Linear Relationship Lecture 47 Secs. 13.1 – 13.3.1 Tue, Apr 25, 2006.
Bivariate Data
Data is called bivariate if each observation consists of a pair of values (x, y).
x is the explanatory variable.
y is the response variable.
x is also called the independent variable.
y is also called the dependent variable.
Scatterplots
Scatterplot – A display in which each observation (x, y) is plotted as a point in the xy plane.
Example
Draw a scatterplot of the percent on-time arrivals vs. percent on-time departures for the 22 airports listed in Exercise 4.29, p. 252, and also in Exercise 13.5, p. 822 (OnTimeArrivals.xls).
Does there appear to be a relationship?
How can we tell?
How would we describe that relationship?
Linear Association
Draw (or imagine) an oval around the data set.
If the oval is tilted, then there is some linear association.
If the oval is tilted upwards from left to right, then there is positive association.
If the oval is tilted downwards from left to right, then there is negative association.
If the oval is not tilted at all, then there is no linear association.
Positive Linear Association
[Figure: scatterplot with an oval tilted upwards from left to right; axes x and y]
Negative Linear Association
[Figure: scatterplot with an oval tilted downwards from left to right; axes x and y]
No Linear Association
[Figure: scatterplot with an oval that is not tilted; axes x and y]
Strong vs. Weak Association
The association is strong if the oval is narrow.
The association is weak if the oval is wide.
Strong Positive Linear Association
[Figure: scatterplot with a narrow oval tilted upwards; axes x and y]
Weak Positive Linear Association
[Figure: scatterplot with a wide oval tilted upwards; axes x and y]
TI-83 - Scatterplots
To set up a scatterplot,
Enter the x values in L1.
Enter the y values in L2.
Press 2nd STAT PLOT.
Select Plot1 and press ENTER.
The Stat Plot display appears.
Select On and press ENTER.
Under Type, select the first icon (a small image of a scatterplot) and press ENTER.
For XList, enter L1.
For YList, enter L2.
For Mark, select the one you want and press ENTER.
To draw the scatterplot,
Press ZOOM. The Zoom menu appears.
Select ZoomStat (#9) and press ENTER. The scatterplot appears.
Press TRACE and use the arrow keys to inspect the individual points.
Example
Use the TI-83 to draw a scatterplot of the following data.
x y
2 3
3 5
5 9
6 12
9 16
Simple Linear Regression
To quantify the linear relationship between x and y, we wish to find the equation of the line that “best” fits the data.
Typically, there will be many lines that all look pretty good.
How do we measure how well a line fits the data?
Measuring the Goodness of Fit
Start with the scatterplot.
Draw any line through the scatterplot.
Measure the vertical distances from every point to the line.
Each of these represents a deviation, called a residual e, from the line.
[Figure: scatterplot with a line drawn through it and the vertical distance e from a point to the line; axes x and y]
Residuals
The i-th residual – The difference between the observed value of y_i and the predicted value of y_i.
Use y^_i (y-hat) for the predicted y_i.
The formula for the i-th residual is
$e_i = y_i - \hat{y}_i$
Notice that the residual is positive if the data point is above the line and it is negative if the data point is below the line.
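The sign convention can be checked with a tiny sketch. The function name `residual` and the sample values below are illustrative assumptions, not part of the lecture's notation.

```python
def residual(y_observed, y_predicted):
    """Residual e = observed y minus predicted y (y-hat)."""
    return y_observed - y_predicted

# A point whose observed y (12) lies above the line's prediction (11).
e_above = residual(12, 11)
# A point whose observed y (16) lies below the line's prediction (17).
e_below = residual(16, 17)

print(e_above)  # 1  (positive: point is above the line)
print(e_below)  # -1 (negative: point is below the line)
```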
Measuring the Goodness of Fit
Find the sum of the squared residuals.
The smaller the sum of squared residuals, the better the fit.
[Figure: scatterplot with a line; for the point at x_i, the residual e is the vertical distance between the observed y_i and the predicted y^_i on the line; axes x and y]
Example
Consider the data points
x y
2 3
3 5
5 9
6 12
9 16
Example
[Figure: scatterplot of the five data points; x axis from 2 to 9, y axis marked at 5, 10, 15]
Least Squares Line
Let’s see how good the fit is for the line
y^ = -1 + 2x,
where y^ represents the predicted value of y, not the observed value.
Sum of Squared Residuals
Begin with the data set.
Compute the predicted y, using y^ = -1 + 2x.
Compute the residuals, y – y^.
Compute the squared residuals.
Compute the sum of the squared residuals.
x  y   y^  y – y^  (y – y^)^2
2  3   3   0       0
3  5   5   0       0
5  9   9   0       0
6  12  11  1       1
9  16  17  -1      1
Σ(y – y^)^2 = 2.00
Sum of Squared Residuals
Now let’s see how good the fit is for the line
y^ = -0.5 + 1.9x.
Begin with the data set.
Compute the predicted y, using y^ = -0.5 + 1.9x.
Compute the residuals, y – y^.
Compute the squared residuals.
Compute the sum of the squared residuals.
x  y   y^    y – y^  (y – y^)^2
2  3   3.3   -0.3    0.09
3  5   5.2   -0.2    0.04
5  9   9.0   0.0     0.00
6  12  10.9  1.1     1.21
9  16  16.6  -0.6    0.36
Σ(y – y^)^2 = 1.70
Sum of Squared Residuals
We conclude that y^ = -0.5 + 1.9x is a better fit than y^ = -1 + 2x, because its sum of squared residuals (1.70) is smaller than that of the other line (2.00).
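The comparison of the two candidate lines can be reproduced with a short sketch. The function name `sum_squared_residuals` is an assumption made for illustration. The data and the two lines are the ones used in the slides.

```python
def sum_squared_residuals(xs, ys, a, b):
    """Sum of (y - y^)^2 over all points, for the line y^ = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

xs = [2, 3, 5, 6, 9]
ys = [3, 5, 9, 12, 16]

ssr_line1 = sum_squared_residuals(xs, ys, -1.0, 2.0)   # y^ = -1 + 2x
ssr_line2 = sum_squared_residuals(xs, ys, -0.5, 1.9)   # y^ = -0.5 + 1.9x

print(f"{ssr_line1:.2f}")  # 2.00
print(f"{ssr_line2:.2f}")  # 1.70
```

The line with the smaller value is the better fit under this criterion.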
Sum of Squared Residuals
[Figure: scatterplot of the data with the line y^ = -1 + 2x; x axis from 2 to 9, y axis marked at 5, 10, 15]
[Figure: scatterplot of the data with the line y^ = -0.5 + 1.9x; x axis from 2 to 9, y axis marked at 5, 10, 15]
Least Squares Line
Least squares line – The line for which the sum of the squares of the vertical distances (the residuals) is as small as possible.
The least squares line is also called the line of best fit or the regression line.
Example
For all the lines that one could draw through this data set, it turns out that 1.70 is the smallest possible value for the sum of the squares of the residuals.
x y
2 3
3 5
5 9
6 12
9 16
Example
Therefore,
y^ = -0.5 + 1.9x
is the regression line for this data set.
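The slides state this result without showing how the line is found. As a hedged sketch, the standard closed-form least squares formulas (slope = S_xy / S_xx, intercept = mean of y minus slope times mean of x) recover the same line. These formulas are not given in this section, and the function name `least_squares_line` is an assumption.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least squares line y^ = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sums of cross-deviations and squared x-deviations about the means.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx          # slope
    a = y_bar - b * x_bar    # y-intercept
    return a, b

a, b = least_squares_line([2, 3, 5, 6, 9], [3, 5, 9, 12, 16])
print(f"y^ = {a:.1f} + {b:.1f}x")  # y^ = -0.5 + 1.9x
```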
Regression Line
We will write the regression line as
$\hat{y} = a + bx$
a is the y-intercept.
b is the slope.
This is the usual slope-intercept form y = mx + b with the two terms rearranged and relabeled.
TI-83 – Computing Residuals
It is not hard to compute the residuals and the sum of their squares on the TI-83. (Later, we will see a faster method.)
Enter the x-values in list L1 and the y-values in list L2.
Compute a + b*L1 and store in list L3 (y^ values).
Compute (L2 – L3)^2. This is a list of the squared residuals.
Compute sum(Ans). This is the sum of the squared residuals.
TI-83 – Computing Residuals
Enter the data set below and use the equation y^ = -0.5 + 1.9x to compute the sum of squared residuals.
x y
2 3
3 5
5 9
6 12
9 16
Prediction
Use the regression line to predict y when
x = 4
x = 7
x = 20
Interpolation – Using an x value within the observed extremes of x values to predict y.
Extrapolation – Using an x value beyond the observed extremes of x values to predict y.
Interpolation vs. Extrapolation
Interpolated values are more reliable than extrapolated values.
The farther out the values are extrapolated, the less reliable they are.
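The three predictions can be sketched as follows, assuming the regression line y^ = -0.5 + 1.9x and the observed x range of 2 to 9 from the example data. The function name `predict` and its parameters are illustrative.

```python
def predict(x, a=-0.5, b=1.9, x_min=2, x_max=9):
    """Predict y from y^ = a + b*x and label the kind of prediction.

    x_min and x_max are the smallest and largest observed x values.
    """
    y_hat = a + b * x
    kind = "interpolation" if x_min <= x <= x_max else "extrapolation"
    return y_hat, kind

for x in (4, 7, 20):
    y_hat, kind = predict(x)
    print(f"x = {x}: y^ = {y_hat:.1f} ({kind})")
# x = 4: y^ = 7.1 (interpolation)
# x = 7: y^ = 12.8 (interpolation)
# x = 20: y^ = 37.5 (extrapolation)
```

The value at x = 20 lies far beyond the observed data, so it is the least reliable of the three.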