Ch. 5 Regression Review


Page 1: Ch. 5 Regression Review

Ch. 5 Regression Review

Page 2: Ch. 5 Regression Review

Correlation

Symbol: r

Called Pearson’s Correlation Coefficient

Measure of association used only in LINEAR situations

Sign of r is same as sign of slope

Strength of correlation:

|r| < 0.5: weak

0.5 < |r| < 0.8: moderate

0.8 < |r| < 1.0: strong

r = 1 or r = -1 indicates ____________

Correlation does not imply causation. There could be a third, extraneous variable that is affecting both.
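As a rough illustration of the points on this slide, the sketch below computes r by hand and labels its strength with the cutoffs listed above. The data values and the function names `pearson_r` and `strength` are made up for this example and are not from the slides; the slides also do not say how to label values that fall exactly on a cutoff, so the boundary handling here is an assumption.

```python
import math

def pearson_r(xs, ys):
    """Pearson's correlation coefficient r for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def strength(r):
    """Label |r| with the review's cutoffs (boundary handling is assumed)."""
    a = abs(r)
    if a < 0.5:
        return "weak"
    if a < 0.8:
        return "moderate"
    return "strong"

# Made-up illustrative data (not from the slides).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
print(round(r, 4), strength(r))
```

Because r is positive here, the least squares slope for the same data is positive as well, which matches the "sign of r is same as sign of slope" point.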

Page 3: Ch. 5 Regression Review

Regression

Formulas for slope and y-intercept

Interpreting

Slope = predicted change in y for every 1-unit increase in x

Y-intercept = predicted value of y when x = 0; often useless

Symbols – yhat vs. y

Residual: y – yhat

Residuals from the LSRL always sum to 0
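The formulas from this slide did not survive in the transcript. A minimal sketch, assuming the standard least squares formulas (slope = Sxy / Sxx, intercept = ybar - slope * xbar), shows how yhat and the residuals are computed and why the residuals sum to zero. The data and the helper name `lsrl` are made up for illustration.

```python
def lsrl(xs, ys):
    """Least squares regression line: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up illustrative data (not from the slides).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

slope, intercept = lsrl(x, y)

# yhat is the predicted value; y is the observed value.
y_hat = [intercept + slope * xi for xi in x]
residuals = [yi - yh for yi, yh in zip(y, y_hat)]

print("slope:", round(slope, 4), "intercept:", round(intercept, 4))
print("sum of residuals:", round(sum(residuals), 10))
```

Here the slope says that yhat rises by about 0.6 for each 1-unit increase in x, and the intercept is the predicted y at x = 0, which lies outside the observed x values.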

Page 4: Ch. 5 Regression Review

Determining Model Fit

Residual plot – want no pattern

Nonlinear pattern in residual plot indicates data may be nonlinear

No fan

No other pattern

Coefficient of determination – r^2 – % of variability in y that can be explained by the approximately linear relationship between x and y (use CONTEXT)

Standard deviation about LSRL (s_e) = typical amount by which an observation deviates in the y direction from the least squares regression line

Page 5: Ch. 5 Regression Review

Extrapolation

Using model to predict y for x value outside range used to create LSRL

Prediction could be accurate or inaccurate; the data give no evidence that the pattern continues outside the observed x range
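One way to guard against accidental extrapolation is to flag any prediction whose x value lies outside the range used to fit the line. The sketch below assumes a line that has already been fitted; the function name `predict`, the coefficients, and the x range are hypothetical values chosen for illustration.

```python
def predict(x_new, slope, intercept, x_min, x_max):
    """Return (y_hat, is_extrapolation) for a fitted line.

    x_min and x_max are the smallest and largest x values
    in the data used to create the LSRL.
    """
    y_hat = intercept + slope * x_new
    is_extrapolation = not (x_min <= x_new <= x_max)
    return y_hat, is_extrapolation

# Hypothetical fitted line and x range (not from the slides).
slope, intercept = 0.6, 2.2
x_min, x_max = 1, 5

inside = predict(3, slope, intercept, x_min, x_max)
outside = predict(10, slope, intercept, x_min, x_max)

print(inside)   # x = 3 is within the fitted range
print(outside)  # x = 10 is flagged as extrapolation
```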

Page 6: Ch. 5 Regression Review

Polynomial Regressions

Page 7: Ch. 5 Regression Review

Regression Analysis: Profit versus Price, Price^2

The regression equation is

Profit = - 2701 + 7060 Price - 2851 Price^2

Predictor     Coef  SE Coef       T      P
Constant   -2700.6    346.5   -7.79  0.000
Price       7060.1    474.9   14.87  0.000
Price^2    -2851.3    157.7  -18.08  0.000

S = 83.4862 R-Sq = 97.6% R-Sq(adj) = 97.4%

Analysis of Variance

Source          DF        SS       MS       F      P
Regression       2  10869753  5434877  779.76  0.000
Residual Error  39    271828     6970
Total           41  11141581

Source   DF   Seq SS
Price     1  8590119
Price^2   1  2279634

Unusual Observations

Obs  Price  Profit     Fit  SE Fit  Residual  St Resid
  8   1.15  1463.4  1647.7    20.3    -184.3    -2.28R
 15   1.35  1802.6  1634.1    18.0     168.6     2.07R

R denotes an observation with a large standardized residual.
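The pieces of this output are tied together by simple arithmetic, and the sketch below recomputes them from the sums of squares and coefficients printed above. All numbers are copied from the output; the variable names and the helper `fitted_profit` are made up for this example. Small differences from the printed values are expected because the printed coefficients are rounded.

```python
import math

# Values copied from the regression output above.
ss_regression = 10869753
ss_residual = 271828
ss_total = 11141581
df_regression, df_residual, df_total = 2, 39, 41

# Mean squares are sums of squares divided by degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

f_stat = ms_regression / ms_residual                # output reports F = 779.76
s = math.sqrt(ms_residual)                          # output reports S = 83.4862
r_sq = 1 - ss_residual / ss_total                   # output reports R-Sq = 97.6%
r_sq_adj = 1 - ms_residual / (ss_total / df_total)  # output reports R-Sq(adj) = 97.4%

def fitted_profit(price):
    """Fitted quadratic, using the coefficients from the output."""
    return -2700.6 + 7060.1 * price - 2851.3 * price ** 2

# Observation 8 from the "Unusual Observations" table.
fit_8 = fitted_profit(1.15)
residual_8 = 1463.4 - fit_8

print(round(f_stat, 2), round(s, 4))
print(round(100 * r_sq, 1), round(100 * r_sq_adj, 1))
print(round(fit_8, 1), round(residual_8, 1))
```

The two sequential sums of squares (8590119 for Price and 2279634 for Price^2) add up to the regression sum of squares, and the regression and residual sums of squares add up to the total.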

Page 8: Ch. 5 Regression Review

Process:

1. Scatterplot of (x,y).

2. Transform x, y, or both (if needed) using the ladder of powers.

3. Scatterplot of transformed data.

4. Least Squares Regression.

5. Residual Plot.

6. Acceptable? Yes: go to 7. No: go to 2.

7. Solve for y.

8. Plot y= and (x,y).

Power   Transformed Value*        Name
3       (Original value)^3        Cube
2       (Original value)^2        Square
1       (Original value)          No transformation
1/2     sqrt(Original value)      Square root
1/3     cbrt(Original value)      Cube root
0       Log(original value)       Logarithm
-1      1/(Original value)        Reciprocal
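The process above can be sketched for one rung of the ladder. The example below assumes the logarithm (power 0) is applied to y only, and it skips the plotting steps. The data are hypothetical, generated from y = 3 * e^(0.5x) so that the transformed relationship is exactly linear; the helper names `lsrl` and `predict_y` are made up for illustration.

```python
import math

def lsrl(xs, ys):
    """Least squares regression line: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical data generated from y = 3 * e^(0.5 x) (not from the slides).
x = [0, 1, 2, 3, 4]
y = [3 * math.exp(0.5 * xi) for xi in x]

# Step 2: transform y with power 0 on the ladder (natural log).
log_y = [math.log(yi) for yi in y]

# Step 4: least squares regression on the transformed data (x, ln y).
slope, intercept = lsrl(x, log_y)

# Step 5: residuals on the transformed scale; look for no pattern.
residuals = [ly - (intercept + slope * xi) for xi, ly in zip(x, log_y)]

# Step 7: solve ln(y) = intercept + slope * x for y.
def predict_y(x_new):
    return math.exp(intercept + slope * x_new)

print(round(slope, 4), round(math.exp(intercept), 4))
print(round(predict_y(2), 4))
```

With real data the residuals would not be essentially zero, and a remaining pattern in them would send the process back to step 2 with a different transformation.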

Page 9: Ch. 5 Regression Review

A Few More Formulas

$r^2 = 1 - \dfrac{SSResid}{SSTo}$

$s_e = \sqrt{\dfrac{SSResid}{n - 2}}$
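A minimal sketch of these two formulas, assuming SSResid is the sum of squared residuals about the LSRL and SSTo is the sum of squared deviations of y from its mean. The data and variable names are made up for illustration.

```python
import math

# Made-up illustrative data (not from the slides).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n

# Least squares slope and intercept.
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sxx = sum((xi - mean_x) ** 2 for xi in x)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# SSResid: squared residuals about the regression line.
ss_resid = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
# SSTo: squared deviations of y about its mean.
ss_to = sum((yi - mean_y) ** 2 for yi in y)

r_squared = 1 - ss_resid / ss_to
s_e = math.sqrt(ss_resid / (n - 2))

print(round(r_squared, 4), round(s_e, 4))
```

For this data about 60% of the variability in y is explained by the linear relationship with x, and a typical observation deviates from the line by roughly 0.89 units in the y direction.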

Page 10: Ch. 5 Regression Review

A Quick Note

In any linear regression model, you may only input a value for x in the equation.

If you are asked to predict x from y, you have to do the regression again with y as the explanatory variable and x as the response.
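To see why the regression has to be redone, the sketch below fits both lines to the same made-up data and compares the x-on-y prediction with what you would get by solving the y-on-x equation backwards for x. The data and the helper name `lsrl` are hypothetical.

```python
def lsrl(explanatory, response):
    """Least squares line predicting response from explanatory.

    Returns (slope, intercept).
    """
    n = len(explanatory)
    mean_e = sum(explanatory) / n
    mean_r = sum(response) / n
    s_er = sum((e - mean_e) * (r - mean_r)
               for e, r in zip(explanatory, response))
    s_ee = sum((e - mean_e) ** 2 for e in explanatory)
    slope = s_er / s_ee
    return slope, mean_r - slope * mean_e

# Made-up illustrative data (not from the slides).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# Regression of y on x: used to predict y from x.
b_yx, a_yx = lsrl(x, y)
# Regression of x on y: used to predict x from y.
b_xy, a_xy = lsrl(y, x)

y_given = 5
x_from_new_regression = a_xy + b_xy * y_given
x_from_solving_backwards = (y_given - a_yx) / b_yx

print(round(x_from_new_regression, 4))
print(round(x_from_solving_backwards, 4))
```

The two answers differ because the y-on-x line minimizes squared errors in the y direction, while the x-on-y line minimizes squared errors in the x direction. They coincide only when r = 1 or r = -1.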