Chapter 7 Linear Regression Day 1 – (pg 176-184).
![Page 1: Chapter 7 Linear Regression Day 1 – (pg 176-184).](https://reader030.fdocuments.us/reader030/viewer/2022020920/5697bff11a28abf838cbb6c3/html5/thumbnails/1.jpg)
Chapter 7
Linear Regression
Day 1 – (pg 176-184)
![Page 2: Chapter 7 Linear Regression Day 1 – (pg 176-184).](https://reader030.fdocuments.us/reader030/viewer/2022020920/5697bff11a28abf838cbb6c3/html5/thumbnails/2.jpg)
Correlation and Regression
• Correlation measures the strength and direction of the linear relationship between two variables.
• Summarize the relationship with a line.
• Called the regression line.
  – Explanatory variable (x)
  – Response variable (y)
![Page 3: Chapter 7 Linear Regression Day 1 – (pg 176-184).](https://reader030.fdocuments.us/reader030/viewer/2022020920/5697bff11a28abf838cbb6c3/html5/thumbnails/3.jpg)
Regression line
• Explains how the response variable (y) changes in relation to the explanatory variable (x).
• Use the line to predict the value of y for a given value of x.
  – The predicted values are called $\hat{y}$.
    • Values on the regression line.
  – The observed values are called y.
    • Points in the scatterplot.
![Page 4: Chapter 7 Linear Regression Day 1 – (pg 176-184).](https://reader030.fdocuments.us/reader030/viewer/2022020920/5697bff11a28abf838cbb6c3/html5/thumbnails/4.jpg)
Regression line
• Residual: the difference between the observed value and its associated predicted value.
• To find the residuals, we always subtract the predicted value from the observed one:
$e = y - \hat{y}$
![Page 5: Chapter 7 Linear Regression Day 1 – (pg 176-184).](https://reader030.fdocuments.us/reader030/viewer/2022020920/5697bff11a28abf838cbb6c3/html5/thumbnails/5.jpg)
Least squares regression (LSRL)
• Most commonly used regression line.
• Puts the line where the sum of the squared errors (residuals) is as small as possible.
  – Minimizes $\sum e^2$
• Based on the statistics $\bar{x}, \bar{y}, s_x, s_y, r$
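To make "as small as possible" concrete, here is a minimal Python sketch. The five data points are hypothetical (they are not from the slides), and the slope is computed with the standard sum-of-products form, which is algebraically the same as the slope given by the summary statistics. The sketch checks that nudging the intercept or slope away from the least squares values makes the sum of squared residuals larger.

```python
# Hypothetical toy data, chosen only for illustration (not from the slides).
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

def sse(b0, b1):
    """Sum of squared residuals for the line y-hat = b0 + b1*x."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Least squares slope and intercept.
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
      / sum((x - x_bar) ** 2 for x in xs))
b0 = y_bar - b1 * x_bar

best = sse(b0, b1)

# Any small change to b0 or b1 should give a larger sum of squared residuals.
nudges = [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.05), (0.0, -0.05)]
for d0, d1 in nudges:
    print(d0, d1, sse(b0 + d0, b1 + d1) > best)
```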
![Page 6: Chapter 7 Linear Regression Day 1 – (pg 176-184).](https://reader030.fdocuments.us/reader030/viewer/2022020920/5697bff11a28abf838cbb6c3/html5/thumbnails/6.jpg)
Regression line equation
$\hat{y} = b_0 + b_1 x$
where
$b_1 = r \dfrac{s_y}{s_x}$ and $b_0 = \bar{y} - b_1 \bar{x}$
![Page 7: Chapter 7 Linear Regression Day 1 – (pg 176-184).](https://reader030.fdocuments.us/reader030/viewer/2022020920/5697bff11a28abf838cbb6c3/html5/thumbnails/7.jpg)
Regression line equation
• $b_1$ = slope of line.
  – For every unit increase in x, $\hat{y}$ changes by the amount of the slope.
  – Very important for interpreting data.
• $b_0$ = y-intercept of line.
  – The value of $\hat{y}$ when x = 0.
  – Usually not important for interpreting data.
  – Values of x are usually not close to 0.
![Page 8: Chapter 7 Linear Regression Day 1 – (pg 176-184).](https://reader030.fdocuments.us/reader030/viewer/2022020920/5697bff11a28abf838cbb6c3/html5/thumbnails/8.jpg)
![Page 9: Chapter 7 Linear Regression Day 1 – (pg 176-184).](https://reader030.fdocuments.us/reader030/viewer/2022020920/5697bff11a28abf838cbb6c3/html5/thumbnails/9.jpg)
Calculating the regression line.
• Degree Days vs. Gas Usage
$\bar{x} = 22.31,\ \bar{y} = 5.31,\ s_x = 17.74,\ s_y = 3.37,\ r = 0.9953$
$b_1 = r \dfrac{s_y}{s_x}$
![Page 12: Chapter 7 Linear Regression Day 1 – (pg 176-184).](https://reader030.fdocuments.us/reader030/viewer/2022020920/5697bff11a28abf838cbb6c3/html5/thumbnails/12.jpg)
Calculating the regression line.
• Degree Days vs. Gas Usage
$\bar{x} = 22.31,\ \bar{y} = 5.31,\ s_x = 17.74,\ s_y = 3.37,\ r = 0.9953$
$b_1 = r \dfrac{s_y}{s_x} = (0.9953)\dfrac{3.37}{17.74} = 0.19$
$b_0 = \bar{y} - b_1 \bar{x} = 5.31 - 0.19(22.31) = 1.07$
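The same calculation can be sketched in Python. The function name `lsrl_from_summary` is made up for this illustration. One detail worth noticing: the slide rounds the slope to 0.19 before computing the intercept, which gives 1.07; carrying the unrounded slope through gives an intercept closer to 1.09.

```python
def lsrl_from_summary(x_bar, y_bar, s_x, s_y, r):
    """Slope and intercept of the LSRL from summary statistics."""
    b1 = r * s_y / s_x          # slope
    b0 = y_bar - b1 * x_bar     # intercept
    return b0, b1

# Summary statistics from the Degree Days vs. Gas Usage slide.
x_bar, y_bar, s_x, s_y, r = 22.31, 5.31, 17.74, 3.37, 0.9953

b0_full, b1_full = lsrl_from_summary(x_bar, y_bar, s_x, s_y, r)

# The slide rounds the slope first, then computes the intercept.
b1_slide = round(b1_full, 2)
b0_slide = y_bar - b1_slide * x_bar

print("unrounded:", round(b0_full, 2), round(b1_full, 4))
print("as on the slide:", round(b0_slide, 2), b1_slide)
```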
![Page 13: Chapter 7 Linear Regression Day 1 – (pg 176-184).](https://reader030.fdocuments.us/reader030/viewer/2022020920/5697bff11a28abf838cbb6c3/html5/thumbnails/13.jpg)
Calculating the regression line.
• Don’t forget to write the equation.
$\hat{y} = 1.07 + 0.19x$
• Where $\hat{y}$ = predicted gas usage
• x = degree days
• ALWAYS IDENTIFY THE VARIABLES
![Page 14: Chapter 7 Linear Regression Day 1 – (pg 176-184).](https://reader030.fdocuments.us/reader030/viewer/2022020920/5697bff11a28abf838cbb6c3/html5/thumbnails/14.jpg)
Interpretations
• Slope
  – For every one-unit increase in degree days, the predicted gas usage increases by 0.19.
• Intercept
  – When the degree days is 0, the predicted gas usage is 1.07.
![Page 17: Chapter 7 Linear Regression Day 1 – (pg 176-184).](https://reader030.fdocuments.us/reader030/viewer/2022020920/5697bff11a28abf838cbb6c3/html5/thumbnails/17.jpg)
Prediction
• Use the regression equation to predict y from x.
  – Ex. Predicted gas consumption when degree days = 40?
    $\hat{y} = 1.07 + 0.19(40) = 1.07 + 7.6 = 8.67$
  – Ex. Predicted gas consumption when degree days = 20?
    $\hat{y} = 1.07 + 0.19(20) = 1.07 + 3.8 = 4.87$
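As a quick check of the two predictions, a small Python sketch follows. The function name `predict_gas` is hypothetical; the coefficients 1.07 and 0.19 are the rounded values from the slides.

```python
def predict_gas(degree_days, b0=1.07, b1=0.19):
    """Predicted gas usage from the fitted line y-hat = b0 + b1*x."""
    return b0 + b1 * degree_days

for dd in (40, 20):
    print(dd, "degree days ->", round(predict_gas(dd), 2))
```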
![Page 18: Chapter 7 Linear Regression Day 1 – (pg 176-184).](https://reader030.fdocuments.us/reader030/viewer/2022020920/5697bff11a28abf838cbb6c3/html5/thumbnails/18.jpg)
Plotting the regression line
• Find two points on the line.
  – Pick two x values.
  – Find the predicted $\hat{y}$ for each x value.
  – Ex. x = 20, $\hat{y}$ = 4.87 and x = 40, $\hat{y}$ = 8.67
• Plot the two points on the graph.
• Draw the line through the two points.
• Regression line.
![Page 19: Chapter 7 Linear Regression Day 1 – (pg 176-184).](https://reader030.fdocuments.us/reader030/viewer/2022020920/5697bff11a28abf838cbb6c3/html5/thumbnails/19.jpg)
![Page 20: Chapter 7 Linear Regression Day 1 – (pg 176-184).](https://reader030.fdocuments.us/reader030/viewer/2022020920/5697bff11a28abf838cbb6c3/html5/thumbnails/20.jpg)
Finding the LSRL in the Calculator
• Stat → Edit → Enter Data in L1 and L2
• Stat → Calc → 8: LinReg (y = a + bx)
• x 16 24 42 60 75 102 120
• y 24 30 35 40 48 56 60
• LSRL:
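For readers without the calculator, the fit can be cross-checked in Python. This is a sketch using only the standard library; the helper name `lsrl` is an assumption of this example, and the slope is computed in sum-of-products form, which is equivalent to $r \cdot s_y / s_x$.

```python
from math import sqrt

x = [16, 24, 42, 60, 75, 102, 120]   # x values from the slide
y = [24, 30, 35, 40, 48, 56, 60]     # y values from the slide

def lsrl(xs, ys):
    """Return (b0, b1, r) for the least squares line y-hat = b0 + b1*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((v - x_bar) ** 2 for v in xs)
    syy = sum((v - y_bar) ** 2 for v in ys)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(xs, ys))
    b1 = sxy / sxx               # same value as r * s_y / s_x
    b0 = y_bar - b1 * x_bar
    r = sxy / sqrt(sxx * syy)
    return b0, b1, r

b0, b1, r = lsrl(x, y)
print(f"y-hat = {b0:.2f} + {b1:.3f}x, r = {r:.3f}")
```

The slope comes out near 0.34 and r near 0.99, which should agree with the calculator's a, b, and r values up to rounding.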
![Page 21: Chapter 7 Linear Regression Day 1 – (pg 176-184).](https://reader030.fdocuments.us/reader030/viewer/2022020920/5697bff11a28abf838cbb6c3/html5/thumbnails/21.jpg)
Example: A random sample of records of sales of homes from Feb. 15 to Apr. 30, 1993, from the files maintained by the Albuquerque Board of Realtors gives the price (in thousands of dollars) and size (in square feet) of 117 homes. A regression analysis gives us the model:
$\hat{y} = 47.82 + 0.061\,\text{size}$
a. What does the slope of this line say about housing prices and house size?
b. What price would you predict to pay for a 3000 square foot home?
c. A real estate agent shows a potential buyer a 1200 sq-ft home with an asking price that is $6000 less than one would expect to pay for a house of that size. What is the asking price?
![Page 22: Chapter 7 Linear Regression Day 1 – (pg 176-184).](https://reader030.fdocuments.us/reader030/viewer/2022020920/5697bff11a28abf838cbb6c3/html5/thumbnails/22.jpg)
Example
a. For each additional square foot of size, the predicted price increases by 0.061 thousand dollars (0.061 × $1000, or $61).
b. $230,820 (230.82 thousand)
c. $115,020
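The arithmetic behind answers b and c can be sketched in Python. The function name is hypothetical; the coefficients come from the model in the example, with price in thousands of dollars.

```python
def predicted_price_thousands(size_sqft):
    """Predicted price (in thousands of dollars) from y-hat = 47.82 + 0.061*size."""
    return 47.82 + 0.061 * size_sqft

# b. Predicted price of a 3000 sq-ft home.
price_b = predicted_price_thousands(3000)

# c. Asking price is $6000 (6 thousand) below the prediction for 1200 sq ft.
price_c = predicted_price_thousands(1200) - 6

print(round(price_b, 2), "thousand dollars")
print(round(price_c, 2), "thousand dollars")
```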
![Page 23: Chapter 7 Linear Regression Day 1 – (pg 176-184).](https://reader030.fdocuments.us/reader030/viewer/2022020920/5697bff11a28abf838cbb6c3/html5/thumbnails/23.jpg)
Homework – Read Chapter 7 Pg 184 – 196
Ch 7 Day 1 WS
![Page 24: Chapter 7 Linear Regression Day 1 – (pg 176-184).](https://reader030.fdocuments.us/reader030/viewer/2022020920/5697bff11a28abf838cbb6c3/html5/thumbnails/24.jpg)
Properties of regression line
• The regression line always goes through the point $(\bar{x}, \bar{y})$.
• r is connected to the value of $b_1$: $b_1 = r \dfrac{s_y}{s_x}$
  – r has the same sign as $b_1$. (If the slope is negative, the correlation is negative, and vice versa.)
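Both properties follow from the formulas for $b_0$ and $b_1$. A short derivation, using only the symbols already defined in these slides:

```latex
% The line passes through (x-bar, y-bar): substitute x = x-bar.
\hat{y}(\bar{x}) = b_0 + b_1 \bar{x}
                 = (\bar{y} - b_1 \bar{x}) + b_1 \bar{x}
                 = \bar{y}

% The slope has the same sign as r: standard deviations are positive,
% so s_y / s_x > 0 and the sign of b_1 is the sign of r.
b_1 = r \, \frac{s_y}{s_x}, \qquad \frac{s_y}{s_x} > 0
```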
![Page 25: Chapter 7 Linear Regression Day 1 – (pg 176-184).](https://reader030.fdocuments.us/reader030/viewer/2022020920/5697bff11a28abf838cbb6c3/html5/thumbnails/25.jpg)
Properties of regression line
• The values of the y variable vary.
• The regression line tries to explain the variation of y through its relationship with x.
  – Not perfectly: points are not exactly on the line.
  – Points close to line = regression explains variation of y well.
  – Points far from line = regression does not explain variation of y well.
![Page 26: Chapter 7 Linear Regression Day 1 – (pg 176-184).](https://reader030.fdocuments.us/reader030/viewer/2022020920/5697bff11a28abf838cbb6c3/html5/thumbnails/26.jpg)
Properties of regression line
• How much variation can we explain with our regression?
• Answer: $R^2$
  – Percent of variation in y that is explained by the linear relationship with x.
  – Higher values of $R^2$ mean the regression line explains more of the variation in the y variable.
![Page 27: Chapter 7 Linear Regression Day 1 – (pg 176-184).](https://reader030.fdocuments.us/reader030/viewer/2022020920/5697bff11a28abf838cbb6c3/html5/thumbnails/27.jpg)
Degree Days vs. Gas Usage
– $R^2 = r^2$
– Ex. r = 0.9953, $R^2$ = 0.9906, or 99.06% of the variation in gas usage can be explained by the linear relationship with number of degree days.
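To see why $R^2$ is described as "variation explained", it can be computed directly as 1 minus (sum of squared residuals) / (total variation in y) and compared with $r^2$. The raw gas-usage data are not reproduced in these slides, so this sketch uses the seven (x, y) pairs from the calculator slide instead; the variable names are assumptions of the example.

```python
from math import sqrt

x = [16, 24, 42, 60, 75, 102, 120]
y = [24, 30, 35, 40, 48, 56, 60]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((a - x_bar) ** 2 for a in x)
syy = sum((b - y_bar) ** 2 for b in y)   # total variation in y
sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))

b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
r = sxy / sqrt(sxx * syy)

# Variation left over after fitting the line (sum of squared residuals).
sse = sum((b - (b0 + b1 * a)) ** 2 for a, b in zip(x, y))

r_squared = 1 - sse / syy
print(round(r_squared, 4), round(r ** 2, 4))  # the two values should agree
```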
![Page 28: Chapter 7 Linear Regression Day 1 – (pg 176-184).](https://reader030.fdocuments.us/reader030/viewer/2022020920/5697bff11a28abf838cbb6c3/html5/thumbnails/28.jpg)
Example: Roller Coasters
People who responded to a July 2004 Discovery Channel poll named the 10 best roller coasters in the United States. A table in Chapter 7, Exercise 33 shows the length of the initial drop (in feet) and the duration of the ride (in seconds). A regression to predict duration from drop has $R^2$ = 12.4%.
a. What are the variables and units in this regression?
b. Write a sentence (in context) summarizing what the $R^2$ says about this regression.
c. What is the correlation between drop and duration?
![Page 29: Chapter 7 Linear Regression Day 1 – (pg 176-184).](https://reader030.fdocuments.us/reader030/viewer/2022020920/5697bff11a28abf838cbb6c3/html5/thumbnails/29.jpg)
Example: Roller Coasters
a. y: duration (in seconds); x: drop (in feet)
b. Differences in the length of the initial drop explain 12.4% of the variability in the duration of the ride… (or better)… “12.4% of the variation in y (the duration of the ride) can be explained by the least squares regression with x (length of the initial drop).”
c. 0.352, since $r = \sqrt{R^2} = \sqrt{0.124}$
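Answer c is the square root of $R^2$. A one-line check in Python is below. Note that $R^2$ alone gives only the magnitude of r; the sign has to match the sign of the slope, and the answer above takes the positive root.

```python
from math import sqrt

r_squared = 0.124            # R^2 = 12.4% from the exercise
r_magnitude = sqrt(r_squared)

print(round(r_magnitude, 3))
```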
![Page 30: Chapter 7 Linear Regression Day 1 – (pg 176-184).](https://reader030.fdocuments.us/reader030/viewer/2022020920/5697bff11a28abf838cbb6c3/html5/thumbnails/30.jpg)
Residuals
• Variation in y not measured by the regression line.
• Formula: $e = y - \hat{y}$
• There is a residual for each data point.
• Mean of residuals = 0.
![Page 33: Chapter 7 Linear Regression Day 1 – (pg 176-184).](https://reader030.fdocuments.us/reader030/viewer/2022020920/5697bff11a28abf838cbb6c3/html5/thumbnails/33.jpg)
Calculating Residuals
• Degree Days vs. Gas Usage
• Find the residual for the point (30, 6.4)
$\hat{y} = 1.07 + 0.19x$
$\hat{y} = 1.07 + 0.19(30) = 6.77$
$e = y - \hat{y} = 6.4 - 6.77 = -0.37$
![Page 36: Chapter 7 Linear Regression Day 1 – (pg 176-184).](https://reader030.fdocuments.us/reader030/viewer/2022020920/5697bff11a28abf838cbb6c3/html5/thumbnails/36.jpg)
Calculating Residuals
• Find the residual for the point (13, 4.0)
$\hat{y} = 1.07 + 0.19(13) = 3.54$
$e = y - \hat{y} = 4.0 - 3.54 = 0.46$
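Both residual calculations follow the same two steps: predict, then subtract the prediction from the observed value. A minimal Python sketch, with a hypothetical function name and the rounded coefficients from the gas-usage line:

```python
def residual(x, y_observed, b0=1.07, b1=0.19):
    """Residual e = y - y-hat for the line y-hat = b0 + b1*x."""
    y_hat = b0 + b1 * x
    return y_observed - y_hat

# A negative residual means the point lies below the line;
# a positive residual means it lies above the line.
for point in [(30, 6.4), (13, 4.0)]:
    print(point, round(residual(*point), 2))
```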
![Page 37: Chapter 7 Linear Regression Day 1 – (pg 176-184).](https://reader030.fdocuments.us/reader030/viewer/2022020920/5697bff11a28abf838cbb6c3/html5/thumbnails/37.jpg)
Residual Plots
• Scatterplot
  – Explanatory variable (x) on horizontal axis.
  – Residuals (e) on vertical axis.
  – Horizontal line at residual = 0.
• Good Residual Plot
  – No pattern or shape
  – No outliers
![Page 38: Chapter 7 Linear Regression Day 1 – (pg 176-184).](https://reader030.fdocuments.us/reader030/viewer/2022020920/5697bff11a28abf838cbb6c3/html5/thumbnails/38.jpg)
Interpreting Residual Plots
The residual plot should show about the same amount of scatter throughout!!
If it has a…
• Curved pattern
  – The relationship is not linear.
• Increasing spread about the line as x increases.
  – Predictions of y for larger x will be less accurate.
• Decreasing spread about the line as x increases.
  – Predictions of y for smaller x will be less accurate.
![Page 39: Chapter 7 Linear Regression Day 1 – (pg 176-184).](https://reader030.fdocuments.us/reader030/viewer/2022020920/5697bff11a28abf838cbb6c3/html5/thumbnails/39.jpg)
![Page 40: Chapter 7 Linear Regression Day 1 – (pg 176-184).](https://reader030.fdocuments.us/reader030/viewer/2022020920/5697bff11a28abf838cbb6c3/html5/thumbnails/40.jpg)
Example: Children’s Ages and Heights
The ages (in months) and heights (in inches) of seven children are given.
x (age, months): 16 24 42 60 75 102 120
y (height, inches): 24 30 35 40 48 56 60
Find the LSRL.
Interpret the slope and correlation coefficient in the context of the problem.
![Page 41: Chapter 7 Linear Regression Day 1 – (pg 176-184).](https://reader030.fdocuments.us/reader030/viewer/2022020920/5697bff11a28abf838cbb6c3/html5/thumbnails/41.jpg)
Correlation coefficient:
There is a strong, positive, linear association between the age and height of children.
Slope:
For an increase in age of one month, there is an approximate increase of 0.34 inches in the heights of children.
![Page 42: Chapter 7 Linear Regression Day 1 – (pg 176-184).](https://reader030.fdocuments.us/reader030/viewer/2022020920/5697bff11a28abf838cbb6c3/html5/thumbnails/42.jpg)
The ages (in months) and heights (in inches) of seven children are given.
x 16 24 42 60 75 102 120
y 24 30 35 40 48 56 60
Predict the height of a child who is 4.5 years old.
Predict the height of someone who is 20 years old.
![Page 43: Chapter 7 Linear Regression Day 1 – (pg 176-184).](https://reader030.fdocuments.us/reader030/viewer/2022020920/5697bff11a28abf838cbb6c3/html5/thumbnails/43.jpg)
Create the residual plot of the data in your calculator:
Stat → Edit → Enter Data in L1 and L2
Stat → Calc → 8: LinReg (y = a + bx)
2nd y= (stat plot) → Plot 1
Type: Scatterplot
Xlist = L1
Ylist = 2nd Stat (List) → 7: Residuals
Enter
Zoom 9
Is an LSRL appropriate here? Why?
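The numbers the calculator plots can also be listed in Python. This sketch fits the line to the ages and heights, prints each residual next to its x value (a text stand-in for the residual plot), and checks that the residuals average to zero. Variable names are assumptions of the example.

```python
ages = [16, 24, 42, 60, 75, 102, 120]    # months
heights = [24, 30, 35, 40, 48, 56, 60]   # inches

n = len(ages)
x_bar = sum(ages) / n
y_bar = sum(heights) / n
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(ages, heights))
      / sum((x - x_bar) ** 2 for x in ages))
b0 = y_bar - b1 * x_bar

# Residual = observed height minus predicted height.
residuals = [y - (b0 + b1 * x) for x, y in zip(ages, heights)]

for x, e in zip(ages, residuals):
    print(f"age {x:>3} months: residual {e:+.2f}")

mean_residual = sum(residuals) / n
print("mean residual:", round(mean_residual, 6))
```

Looking down the printed column for a curve or a change in spread is the same judgment the residual plot asks for.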
![Page 44: Chapter 7 Linear Regression Day 1 – (pg 176-184).](https://reader030.fdocuments.us/reader030/viewer/2022020920/5697bff11a28abf838cbb6c3/html5/thumbnails/44.jpg)
Extrapolation
• The LSRL should not be used to predict y for values of x outside the data set.
• It is unknown whether the pattern observed in the scatterplot continues outside this range.
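The children's data make the danger concrete. In this sketch the line is fitted to ages from 16 to 120 months, then used for 54 months (4.5 years, inside the data) and 240 months (20 years, far outside it). The function name `predict_height` is hypothetical.

```python
ages = [16, 24, 42, 60, 75, 102, 120]    # months
heights = [24, 30, 35, 40, 48, 56, 60]   # inches

n = len(ages)
x_bar = sum(ages) / n
y_bar = sum(heights) / n
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(ages, heights))
      / sum((x - x_bar) ** 2 for x in ages))
b0 = y_bar - b1 * x_bar

def predict_height(age_months):
    """Predicted height in inches from the fitted line."""
    return b0 + b1 * age_months

inside = predict_height(54)     # 4.5 years: within the observed ages
outside = predict_height(240)   # 20 years: extrapolation

print("4.5 years:", round(inside, 1), "inches")
print("20 years:", round(outside, 1), "inches (extrapolated)")
```

The 20-year prediction comes out above 100 inches, which is taller than 8 feet: the line keeps rising even though children stop growing.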
![Page 45: Chapter 7 Linear Regression Day 1 – (pg 176-184).](https://reader030.fdocuments.us/reader030/viewer/2022020920/5697bff11a28abf838cbb6c3/html5/thumbnails/45.jpg)
![Page 46: Chapter 7 Linear Regression Day 1 – (pg 176-184).](https://reader030.fdocuments.us/reader030/viewer/2022020920/5697bff11a28abf838cbb6c3/html5/thumbnails/46.jpg)
Reading Computer Output
Exercise physiologists are investigating the relationship between lean body mass (in kilograms) and the resting metabolic rate (in calories per day) in sedentary males.
What is the LSRL?
![Page 47: Chapter 7 Linear Regression Day 1 – (pg 176-184).](https://reader030.fdocuments.us/reader030/viewer/2022020920/5697bff11a28abf838cbb6c3/html5/thumbnails/47.jpg)
The correlation coefficient and the LSRL are both non-resistant measures.
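"Non-resistant" means a single unusual point can change both values a lot. The sketch below refits the ages and heights after adding one made-up outlier, (120, 20); that point is purely hypothetical and is not part of the slide data.

```python
from math import sqrt

def slope_and_r(xs, ys):
    """Return (b1, r) for the least squares line through the data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy / sxx, sxy / sqrt(sxx * syy)

ages = [16, 24, 42, 60, 75, 102, 120]
heights = [24, 30, 35, 40, 48, 56, 60]

b1, r = slope_and_r(ages, heights)

# Add one hypothetical outlier: a 120-month-old recorded as 20 inches tall.
b1_out, r_out = slope_and_r(ages + [120], heights + [20])

print("original:     slope", round(b1, 3), " r", round(r, 3))
print("with outlier: slope", round(b1_out, 3), " r", round(r_out, 3))
```

One added point roughly halves both the slope and the correlation, which is why outliers deserve a look before trusting either number.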
![Page 48: Chapter 7 Linear Regression Day 1 – (pg 176-184).](https://reader030.fdocuments.us/reader030/viewer/2022020920/5697bff11a28abf838cbb6c3/html5/thumbnails/48.jpg)
You should be able to….
• Calculate a regression line given summary statistics.
• Interpret the slope and intercept of the regression line.
• Find predictions and residuals for points.
• Interpret a residual plot.
• Interpret the $R^2$ value for a regression.
• Understand the limitations of regression.