Regression. A regression line attempts to predict one variable based on the relationship with...

Chapter 7 Regression



Regression

A regression line attempts to predict one variable based on its relationship with another variable (its correlation).

The regression line is placed so that the error (the vertical distance from each data point to the line) is minimized.

The placement of the regression line minimizes the total squared predictive error. (That way there are no negative values.)
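As a rough illustration of why the errors are squared, the Python sketch below uses a small made-up data set. The numbers and the helper names `fit_line`, `errors`, and `sum_sq` are assumptions for illustration, not part of the slides. It shows that the signed errors around the least squares line cancel out, so their plain sum cannot measure fit, while the sum of squared errors is smaller for the least squares line than for two nearby alternative lines.

```python
# Hypothetical paired scores, chosen only for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]


def fit_line(xs, ys):
    """Return (slope, intercept) of the least squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    b = sp_xy / ss_x          # algebraically the same as r * sqrt(SSy / SSx)
    a = mean_y - b * mean_x
    return b, a


def errors(xs, ys, b, a):
    """Predictive errors: actual Y minus predicted Y for each point."""
    return [y - (b * x + a) for x, y in zip(xs, ys)]


def sum_sq(errs):
    """Total squared predictive error."""
    return sum(e ** 2 for e in errs)


b, a = fit_line(xs, ys)
best = errors(xs, ys, b, a)
signed_total = sum(best)       # positive and negative errors cancel
squared_total = sum_sq(best)   # squaring removes the signs

# Two alternative lines: a slightly steeper one and a slightly higher one.
steeper_total = sum_sq(errors(xs, ys, b + 0.2, a))
higher_total = sum_sq(errors(xs, ys, b, a + 0.5))

print(f"slope = {b:.2f}, intercept = {a:.2f}")
print(f"sum of signed errors           = {signed_total:.6f}")
print(f"sum of squared errors (best)   = {squared_total:.2f}")
print(f"sum of squared errors (steeper) = {steeper_total:.2f}")
print(f"sum of squared errors (higher)  = {higher_total:.2f}")
```

Only two alternative lines are tried here, so this is a spot check rather than a proof that no other line does better.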


Prediction

What is predictive error?

The amount of error associated with the placement of a best fitting regression line.

The placement of the regression line minimizes the total predictive error; more precisely, it minimizes the total squared predictive error.


Progress Check 7.1

[Figure: scatterplot of rate of inflation against unemployment rate; axis ticks at 5, 10, 15, and 20.]

a) Predict the approximate rate of inflation, given an unemployment rate of 5 percent.

b) Predict the approximate rate of inflation, given an unemployment rate of 15 percent.


Estimating regression and relationship

Witte web demonstration


Least Squares Regression Equation

$$Y' = bX + a$$

Where

Y' = the predicted value

X = a known value

b = slope of the line

a = Y-intercept

Ability to predict an outcome for a variable, given a regression line and a value of a second paired variable.


Calculating Least Squares Regression (page 159)

1. Calculate SSx, SSy, and r for the data.

2. Substitute numbers into the formula below.

$$b = r\sqrt{\frac{SS_y}{SS_x}}$$

3. Find the mean for X and the mean for Y.

4. Solve for a.

$$a = \bar{Y} - b\bar{X}$$

5. Solve for the predicted value.

$$Y' = bX + a$$
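The five steps above can be sketched in Python roughly as follows. The function names `least_squares_line` and `predict` are hypothetical, and the sum of products `sp_xy` used to obtain r is an assumption, since the slide does not show how r is calculated. The sample data are made up so that the points lie exactly on Y = 2X + 1, which makes the result easy to check by eye.

```python
import math


def least_squares_line(xs, ys):
    """Follow the five steps to get slope b and intercept a from paired scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Step 1: sums of squares and the correlation coefficient r.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    sp_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    r = sp_xy / math.sqrt(ss_x * ss_y)

    # Step 2: slope b = r * sqrt(SSy / SSx).
    b = r * math.sqrt(ss_y / ss_x)

    # Steps 3 and 4: intercept a = mean of Y minus b times mean of X.
    a = mean_y - b * mean_x
    return b, a


def predict(x, b, a):
    """Step 5: predicted value Y' = bX + a."""
    return b * x + a


# Hypothetical data lying exactly on the line Y = 2X + 1.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]

b, a = least_squares_line(xs, ys)
print(f"b = {b:.2f}, a = {a:.2f}")
print(f"predicted Y for X = 5: {predict(5, b, a):.2f}")
```

The sketch assumes that both variables actually vary; if every X or every Y were identical, SSx or SSy would be zero and the division would fail.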


Standard error of estimate

This represents a special kind of standard deviation that reflects the magnitude of predictive error.

It is based on the differences between known values and the values predicted by the regression equation.

It indicates roughly how much we overestimate or underestimate a value when using the regression equation; this amount depends on the strength of the correlation.


Calculation of standard error of estimate (page 162)

Take the square root of the following quantity: the sum of squares for Y, times one minus r squared, divided by n minus 2.

(The divisor is n - 2 because, with two paired variables, n - 2 degrees of freedom remain.)

$$s_{y|x} = \sqrt{\frac{SS_y(1 - r^2)}{n - 2}}$$
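A minimal Python sketch of this formula is given below, mainly to show how the standard error of estimate shrinks as the correlation gets stronger. The function name `standard_error_of_estimate` and the values SSy = 100 and n = 27 are hypothetical, picked only because they give round results.

```python
import math


def standard_error_of_estimate(ss_y, r, n):
    """sqrt( SSy * (1 - r^2) / (n - 2) ), as in the formula above."""
    if n <= 2:
        raise ValueError("n must exceed 2 so that n - 2 degrees of freedom remain")
    return math.sqrt(ss_y * (1 - r ** 2) / (n - 2))


# Hypothetical values, not taken from the slides.
ss_y = 100
n = 27

for r in (0.0, 0.6, 0.8, 1.0):
    s = standard_error_of_estimate(ss_y, r, n)
    print(f"r = {r:.1f} -> standard error of estimate = {s:.2f}")
```

With a perfect correlation (r = 1) the predictive error disappears, while with r = 0 the regression equation does no better than predicting the mean of Y for everyone.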


Assumptions

Use of the regression equation requires that the underlying relationship be linear.

Use of the standard error of estimate assumes that, except for chance, the dots in the original scatterplot will be dispersed equally about all segments of the regression line (homoscedasticity).


Progress Check 7.2 (pages 160-161)

a) Determine the least squares equation for predicting weekly reading time from educational level.

b) Faith’s educational level is 15. What is her predicted reading time?

c) Keegan’s educational level is 11. What is his predicted reading time?


Educational Level (X)     Weekly Reading Time (Y)

Mean of X = 13            Mean of Y = 8

SSx = 25                  SSy = 50

r = .30
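One way to work through this progress check is sketched below in Python, reading X = 13 and Y = 8 as the two means. The variable and function names are hypothetical, and the sketch carries full precision throughout, so answers obtained by rounding b and a to two decimals first may differ slightly in the second decimal place.

```python
import math

# Summary values from the progress check.
mean_x, mean_y = 13, 8     # means of educational level and weekly reading time
ss_x, ss_y = 25, 50        # sums of squares
r = 0.30                   # correlation coefficient

# a) Least squares equation: slope first, then intercept.
b = r * math.sqrt(ss_y / ss_x)
a = mean_y - b * mean_x


def predicted_reading_time(educational_level):
    """Y' = bX + a for a given educational level."""
    return b * educational_level + a


# b) and c) Predictions for educational levels 15 and 11.
faith = predicted_reading_time(15)
keegan = predicted_reading_time(11)

print(f"Y' = {b:.2f}X + {a:.2f}")
print(f"Faith (X = 15):  {faith:.2f}")
print(f"Keegan (X = 11): {keegan:.2f}")
```

Faith's educational level is above the mean of 13, so her predicted reading time (about 8.85) is above the mean of 8; Keegan's level is below the mean, so his prediction (about 7.15) is below it.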


Calculate Standard Error of Estimate

Calculate the standard error of estimate using the data in Progress Check 7.2 on page 160.


Educational Level (X)     Weekly Reading Time (Y)

Mean of X = 13            Mean of Y = 8

SSx = 25                  SSy = 50

r = .30
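The slide does not state the number of pairs n, which the formula requires. As a sketch only, the worked calculation below assumes n = 35; that value is an assumption for illustration, and the answer changes with the actual n.

```latex
s_{y|x} = \sqrt{\frac{SS_y\,(1 - r^2)}{n - 2}}
        = \sqrt{\frac{50\,(1 - .30^2)}{35 - 2}}
        = \sqrt{\frac{50\,(.91)}{33}}
        = \sqrt{\frac{45.5}{33}}
        \approx 1.17
```

Under that assumed n, predictions of weekly reading time would be off by roughly 1.17 units on average.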


Correlation, Prediction, Error

X    Y    X²   Y²   X*Y

5    4

4    6

1    4

2    3

4    5

2    4

X = mean number of cases of influenza in a month among employees

Y = mean number of bacteria (x 1 million) on a door knob on the front door
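The exercise above can be completed along the following lines. This Python sketch fills in the X², Y², and X*Y columns for the six pairs and then uses the computational formulas for the sums of squares and the sum of products; those formulas are an assumption suggested by the column headings, since the slide does not spell them out. Variable names are hypothetical.

```python
import math

# The six (X, Y) pairs from the table.
xs = [5, 4, 1, 2, 4, 2]
ys = [4, 6, 4, 3, 5, 4]
n = len(xs)

# Fill in the remaining columns of the table.
print(f"{'X':>3}{'Y':>4}{'X^2':>5}{'Y^2':>5}{'X*Y':>5}")
for x, y in zip(xs, ys):
    print(f"{x:>3}{y:>4}{x * x:>5}{y * y:>5}{x * y:>5}")

# Column totals.
sum_x, sum_y = sum(xs), sum(ys)
sum_x2 = sum(x * x for x in xs)
sum_y2 = sum(y * y for y in ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))

# Computational formulas (assumed, see the note above).
ss_x = sum_x2 - sum_x ** 2 / n
ss_y = sum_y2 - sum_y ** 2 / n
sp_xy = sum_xy - sum_x * sum_y / n

# Correlation, prediction, error.
r = sp_xy / math.sqrt(ss_x * ss_y)
b = r * math.sqrt(ss_y / ss_x)
a = sum_y / n - b * sum_x / n
s_yx = math.sqrt(ss_y * (1 - r ** 2) / (n - 2))

print(f"SSx = {ss_x:.2f}, SSy = {ss_y:.2f}, r = {r:.2f}")
print(f"Y' = {b:.2f}X + {a:.2f}")
print(f"standard error of estimate = {s_yx:.2f}")
```

For these data the correlation works out to .50 and the standard error of estimate to 1.00, with a slope of about .33 and an intercept of about 3.33.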


Regression toward the mean

Extreme scores will tend toward the mean on repeated trials.

The regression fallacy: interpreting regression toward the mean as a real effect, rather than a chance effect.


Regression toward the mean

Tversky and Kahneman

Study of Israeli Air Force pilots in 1974

Some trainees were praised after good landings, while others were reprimanded after bad landings.

On their next landings, praised trainees did more poorly and reprimanded trainees did better.

Conclusion (an example of the regression fallacy): Praise hinders but a reprimand helps performance!
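The pattern in the pilot study can be reproduced without any praise or reprimand at all. The Python simulation below is only a sketch under stated assumptions: each simulated trainee has a fixed skill level, each landing score is that skill plus independent random luck, and the sample size, the 10 percent cutoffs, and all names are hypothetical. Nobody receives feedback between the two landings.

```python
import random

random.seed(7)  # fixed seed so the sketch is repeatable

N = 10_000  # hypothetical number of trainees

# Each score = stable skill + chance variation on that landing.
skill = [random.gauss(0, 1) for _ in range(N)]
trial1 = [s + random.gauss(0, 1) for s in skill]
trial2 = [s + random.gauss(0, 1) for s in skill]

# Pick the best and worst 10 percent on the first landing.
order = sorted(range(N), key=lambda i: trial1[i])
worst = order[: N // 10]
best = order[-(N // 10):]


def mean(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


best_first = mean(trial1[i] for i in best)
best_second = mean(trial2[i] for i in best)
worst_first = mean(trial1[i] for i in worst)
worst_second = mean(trial2[i] for i in worst)

print(f"best 10% : first landing {best_first:.2f}, next landing {best_second:.2f}")
print(f"worst 10%: first landing {worst_first:.2f}, next landing {worst_second:.2f}")
```

The best first-landing group does worse on the next landing and the worst group does better, purely because extreme scores owe part of their extremity to luck that does not repeat. Attributing that shift to praise or reprimand is the regression fallacy.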