© 2004 Prentice-Hall, Inc. Chap 15-1 Basic Business Statistics (9 th Edition) Chapter 15 Multiple Regression Model Building


Basic Business Statistics (9th Edition)

Chapter 15: Multiple Regression Model Building


Chapter Topics

The Quadratic Regression Model
Using Transformations in Regression Models
Influence Analysis
Collinearity
Model Building
Pitfalls in Multiple Regression and Ethical Issues


The Quadratic Regression Model

Relationship between the Response Variable and the Explanatory Variable is a Quadratic Polynomial Function

Useful When Scatter Diagram Indicates Non-Linear Relationship

Quadratic Model:

$$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{1i}^2 + \varepsilon_i$$

The Second Explanatory Variable is the Square of the First Variable
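The idea that the second explanatory variable is simply the square of the first can be sketched in a few lines. This is a minimal illustration, not the textbook's procedure: the data are hypothetical and noise-free, generated from Y = 1 + 2 X1 + 3 X1^2, and NumPy's `lstsq` stands in for whatever regression tool is actually used.

```python
import numpy as np

# Hypothetical, noise-free data generated from Y = 1 + 2*X1 + 3*X1^2.
x1 = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = 1.0 + 2.0 * x1 + 3.0 * x1**2

# Design matrix: intercept, X1, and the square of X1 as the second
# explanatory variable.
X = np.column_stack([np.ones_like(x1), x1, x1**2])

# Ordinary least-squares estimates b0, b1, b2.
b, *_ = np.linalg.lstsq(X, y, rcond=None)
print(b)
```

Because the quadratic model is still linear in the coefficients, ordinary least squares applies without modification.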


Quadratic Regression Model (continued)

Quadratic model may be considered when a scatter diagram takes on the following shapes:

[Figure: four scatter-diagram panels of Y against X1, two with $\beta_2 > 0$ and two with $\beta_2 < 0$]

$\beta_2$ = the coefficient of the quadratic term


Testing for Significance: Quadratic Model

Testing for Overall Relationship
  Similar to test for linear model
  F test statistic = $\dfrac{MSR}{MSE}$

Testing the Quadratic Effect
  Compare quadratic model
  $$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{1i}^2 + \varepsilon_i$$
  with the linear model
  $$Y_i = \beta_0 + \beta_1 X_{1i} + \varepsilon_i$$
  Hypotheses
    $H_0: \beta_2 = 0$ (No quadratic effect)
    $H_1: \beta_2 \neq 0$ (Quadratic effect is present)


Heating Oil Example

Determine if a quadratic model is needed for estimating heating oil used for a single family home in the month of January based on average temperature and amount of insulation in inches.

Oil (Gal)   Temp (°F)   Insulation
275.30      40          3
363.80      27          3
164.30      40          10
40.80       73          6
94.30       64          6
230.90      34          6
366.70      9           6
300.60      8           10
237.80      23          10
121.40      63          3
31.40       65          10
203.50      41          6
441.10      21          3
323.00      38          3
52.50       58          10


Heating Oil Example: Residual Analysis (continued)

[Figure: Insulation Residual Plot, residuals against insulation. No discernible pattern.]

[Figure: Temperature Residual Plot, residuals against temperature. Possible non-linear relationship.]


Heating Oil Example: t Test for Quadratic Model (continued)

Testing the Quadratic Effect
  Model with quadratic insulation term
  $$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{2i}^2 + \varepsilon_i$$
  Model without quadratic insulation term
  $$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \varepsilon_i$$
  Hypotheses
    $H_0: \beta_3 = 0$ (No quadratic term in insulation)
    $H_1: \beta_3 \neq 0$ (Quadratic term is needed in insulation)


Example Solution

Is a quadratic term in insulation needed to explain monthly consumption of heating oil? Test at $\alpha = 0.05$.

$H_0: \beta_3 = 0$
$H_1: \beta_3 \neq 0$
df = 11
Critical Values: $\pm 2.2010$ (area of .025 in each rejection region)

Test Statistic:

$$t = \frac{b_3 - \beta_3}{S_{b_3}} = \frac{1.8667 - 0}{1.1238} = 1.6611$$

Decision: Do not reject $H_0$ at $\alpha = 0.05$.

Conclusion: There is not sufficient evidence for the need to include a quadratic effect of insulation on oil consumption.
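The test above can be reproduced outside PHStat. The following is a sketch under stated assumptions: it uses NumPy rather than the textbook's Excel add-in, the variable names are illustrative, and the critical value 2.2010 is copied from the slide rather than computed.

```python
import numpy as np

# Heating oil data from the example: oil (gal), temperature (deg F), insulation (in).
data = np.array([
    [275.30, 40, 3], [363.80, 27, 3], [164.30, 40, 10],
    [40.80, 73, 6], [94.30, 64, 6], [230.90, 34, 6],
    [366.70, 9, 6], [300.60, 8, 10], [237.80, 23, 10],
    [121.40, 63, 3], [31.40, 65, 10], [203.50, 41, 6],
    [441.10, 21, 3], [323.00, 38, 3], [52.50, 58, 10],
])
oil, temp, ins = data.T

# Design matrix: intercept, temperature, insulation, insulation squared.
X = np.column_stack([np.ones(len(oil)), temp, ins, ins**2])
n, p = X.shape
df = n - p  # residual degrees of freedom

# Least-squares coefficients and their standard errors.
b, *_ = np.linalg.lstsq(X, oil, rcond=None)
resid = oil - X @ b
mse = resid @ resid / df
se = np.sqrt(np.diag(mse * np.linalg.inv(X.T @ X)))

# t statistic for H0: beta_3 = 0 (no quadratic insulation effect).
t_stat = b[3] / se[3]
T_CRIT = 2.2010  # two-tail critical value quoted on the slide (df = 11, alpha = 0.05)
reject_h0 = abs(t_stat) > T_CRIT
print(f"b3 = {b[3]:.4f}, S_b3 = {se[3]:.4f}, t = {t_stat:.4f}, reject H0: {reject_h0}")
```

The printed values can be compared against the slide's $b_3$, $S_{b_3}$ and t statistic.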


Example Solution in PHStat

PHStat | Regression | Multiple Regression …

Excel spreadsheet for the heating oil example



Using Transformations

Either or Both Independent and Dependent Variables May Be Transformed

Can Be Based on Theory, Logic or Scatter Diagrams


Inherently Non-Linear Models

Non-Linear Models that Can Be Expressed in Linear Form
  Can be estimated by least squares in linear form
  Require Data Transformation


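Transformed Multiplicative Model (Log-Log)

Original: $Y_i = \beta_0 X_{1i}^{\beta_1} X_{2i}^{\beta_2} \varepsilon_i$

Transformed: $\ln Y_i = \ln \beta_0 + \beta_1 \ln X_{1i} + \beta_2 \ln X_{2i} + \ln \varepsilon_i$

[Figure: two panels of Y against X1 showing the curve shapes for different ranges of $\beta_1$]

Similarly for X2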
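A short sketch of the log-log transformation in practice. The data are hypothetical and noise-free, generated from Y = 3 · X1^0.5 · X2^(-1.2), so that least squares on the logged variables should recover the exponents; none of these numbers come from the textbook.

```python
import numpy as np

# Hypothetical, noise-free data from the multiplicative model
# Y = 3 * X1^0.5 * X2^(-1.2).
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
y = 3.0 * x1**0.5 * x2 ** (-1.2)

# Taking logs of both sides gives a model that is linear in the
# coefficients: ln Y = ln(beta0) + beta1 * ln X1 + beta2 * ln X2.
X = np.column_stack([np.ones_like(x1), np.log(x1), np.log(x2)])
b, *_ = np.linalg.lstsq(X, np.log(y), rcond=None)

beta0_hat = np.exp(b[0])  # back-transform the intercept
beta1_hat, beta2_hat = b[1], b[2]
print(beta0_hat, beta1_hat, beta2_hat)
```

Only the intercept needs back-transforming; the slope estimates are already the exponents of the original model.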


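Square Root Transformation

$$Y_i = \beta_0 + \beta_1 \sqrt{X_{1i}} + \beta_2 \sqrt{X_{2i}} + \varepsilon_i$$

[Figure: Y against X1, with curves for $\beta_1 > 0$ and $\beta_1 < 0$]

Similarly for X2

Transforms non-linear model to one that appears linear. Often used to overcome heteroscedasticity.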


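Exponential Transformation (Log-Linear)

Original Model: $Y_i = e^{\beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i}} \varepsilon_i$

[Figure: Y against X1, with curves for $\beta_1 > 0$ and $\beta_1 < 0$]

Transformed Into: $\ln Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \ln \varepsilon_i$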


Interpretation of Coefficients

Transformed Exponential Model (Y is Transformed into ln Y)
  The coefficient of the independent variable $X_k$ can be approximately interpreted as: a 1 unit change in $X_k$ leads to an estimated average rate of change of $100\,b_k$ percent in Y
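The word "approximately" matters here. In a model of the form ln Y = b0 + b1 X1 + ..., a 1 unit increase in X_k multiplies the fitted Y by e^(b_k), so the exact percentage change is 100(e^(b_k) − 1). The sketch below compares that with the 100·b_k rule of thumb; the function names and the coefficient values are hypothetical, chosen only for illustration.

```python
import math

def pct_change_approx(b_k: float) -> float:
    """Rule-of-thumb percentage change in Y per unit change in X_k."""
    return 100.0 * b_k

def pct_change_exact(b_k: float) -> float:
    """Exact percentage change in fitted Y implied by ln Y = ... + b_k * X_k."""
    return 100.0 * (math.exp(b_k) - 1.0)

# Hypothetical coefficient values, small to large.
for b_k in (0.01, 0.05, 0.20, 0.50):
    print(f"b_k = {b_k:.2f}: approx {pct_change_approx(b_k):.2f}%, "
          f"exact {pct_change_exact(b_k):.2f}%")
```

The two agree closely for small coefficients and drift apart as |b_k| grows, which is why the approximation is usually reserved for small b_k.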


Interpretation of Coefficients (continued)

Transformed Multiplicative Model
  The Dependent Variable Y is transformed to ln Y
  The Independent Variable X is transformed to ln X
  The coefficient of the independent variable $X_k$ can be approximately interpreted as: a 1 percent rate of change in $X_k$ leads to an estimated average rate of change of $b_k$ percent in Y. Therefore, $b_k$ is the elasticity of Y with respect to a change in $X_k$.


Influence Analysis

To Determine Observations that Have Influential Effect on the Fitted Model
Potentially Influential Points Become Candidates for Removal from the Model
Criteria Used are:
  The hat matrix elements $h_i$
  The studentized deleted residuals $t_i$
  Cook’s distance statistic $D_i$
All 3 Criteria are Complementary
  Only when all 3 criteria provide a consistent result should an observation be removed


The Hat Matrix Element $h_i$

$$h_i = \frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}$$

If $h_i > 2(k+1)/n$, $X_i$ is an Influential Point
  $X_i$ may be considered a candidate for removal from the model


The Hat Matrix Element $h_i$: Heating Oil Example

Oil (Gal)   Temp   Insulation   h_i
275.30      40     3            0.1567
363.80      27     3            0.1852
164.30      40     10           0.1757
40.80       73     6            0.2467
94.30       64     6            0.1618
230.90      34     6            0.0741
366.70      9      6            0.2306
300.60      8      10           0.3521
237.80      23     10           0.2268
121.40      63     3            0.2446
31.40       65     10           0.2759
203.50      41     6            0.0676
441.10      21     3            0.2174
323.00      38     3            0.1574
52.50       58     10           0.2268

$n = 15$, $k = 2$, $2(k+1)/n = 0.4$

No $h_i > 0.4$
  No observation appears to be a candidate for removal from the model
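The hat values in this example can be checked with a few lines of NumPy. One assumption to flag: with two explanatory variables, the sketch uses the general matrix definition, the diagonal of X(X'X)^(-1)X', rather than a single-variable formula. Variable names are illustrative.

```python
import numpy as np

# Heating oil data from the example: oil (gal), temperature (deg F), insulation (in).
data = np.array([
    [275.30, 40, 3], [363.80, 27, 3], [164.30, 40, 10],
    [40.80, 73, 6], [94.30, 64, 6], [230.90, 34, 6],
    [366.70, 9, 6], [300.60, 8, 10], [237.80, 23, 10],
    [121.40, 63, 3], [31.40, 65, 10], [203.50, 41, 6],
    [441.10, 21, 3], [323.00, 38, 3], [52.50, 58, 10],
])
oil, temp, ins = data.T

# Design matrix with intercept, temperature and insulation (k = 2).
X = np.column_stack([np.ones(len(oil)), temp, ins])
n = X.shape[0]
k = X.shape[1] - 1

# Hat values are the diagonal of H = X (X'X)^-1 X'.
h = np.diag(X @ np.linalg.inv(X.T @ X) @ X.T)

# Rule from the slides: flag observation i if h_i > 2(k + 1)/n.
threshold = 2 * (k + 1) / n
flagged = np.flatnonzero(h > threshold) + 1  # 1-based observation numbers

print(np.round(h, 4))
print("threshold:", threshold, "flagged:", flagged)
```

The hat values depend only on the explanatory variables, so the oil column plays no part in this calculation.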


The Studentized Deleted Residuals $t_i$

$$t_i = e_i \sqrt{\frac{n - k - 2}{SSE\,(1 - h_i) - e_i^2}}$$

$e_i$: the residual for observation i
SSE: error sum of squares
An observation is considered influential if $|t_i| > t_{n-k-2}$
  $t_{n-k-2}$ is the critical value of a two-tail test at 10% level of significance


The Studentized Deleted Residuals $t_i$: Example

Oil (Gal)   Temp   Insulation   t_i
275.30      40     3            -0.3772
363.80      27     3            0.3474
164.30      40     10           0.8243
40.80       73     6            -0.1871
94.30       64     6            0.0066
230.90      34     6            -1.0571
366.70      9      6            -1.1776
300.60      8      10           -0.8464
237.80      23     10           0.0341
121.40      63     3            -1.8536
31.40       65     10           1.0304
203.50      41     6            -0.6075
441.10      21     3            2.9674
323.00      38     3            1.1681
52.50       58     10           0.2432

$n = 15$, $k = 2$, $t_{n-k-2} = t_{11} = 1.7957$

$t_{10}$ and $t_{13}$ are influential points for potential removal from the model
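A sketch of how the studentized deleted residuals could be computed without refitting the model 15 times. It assumes the closed form t_i = e_i · sqrt((n − k − 2) / (SSE(1 − h_i) − e_i²)), which is consistent with the critical value t_{n−k−2} used on the slide; the critical value 1.7957 is copied from the slide, and the variable names are illustrative.

```python
import numpy as np

# Heating oil data from the example: oil (gal), temperature (deg F), insulation (in).
data = np.array([
    [275.30, 40, 3], [363.80, 27, 3], [164.30, 40, 10],
    [40.80, 73, 6], [94.30, 64, 6], [230.90, 34, 6],
    [366.70, 9, 6], [300.60, 8, 10], [237.80, 23, 10],
    [121.40, 63, 3], [31.40, 65, 10], [203.50, 41, 6],
    [441.10, 21, 3], [323.00, 38, 3], [52.50, 58, 10],
])
oil, temp, ins = data.T

X = np.column_stack([np.ones(len(oil)), temp, ins])
n = X.shape[0]
k = X.shape[1] - 1

# Ordinary residuals, SSE and hat values from the full fit.
b, *_ = np.linalg.lstsq(X, oil, rcond=None)
e = oil - X @ b
sse = e @ e
h = np.diag(X @ np.linalg.inv(X.T @ X) @ X.T)

# Studentized deleted residuals from the closed form (no refitting needed).
t = e * np.sqrt((n - k - 2) / (sse * (1 - h) - e**2))

T_CRIT = 1.7957  # critical value quoted on the slide (df = 11, 10% two-tail)
flagged = np.flatnonzero(np.abs(t) > T_CRIT) + 1  # 1-based observation numbers

print(np.round(t, 4))
print("flagged:", flagged)
```

The same values can be obtained by deleting each observation in turn, refitting, and standardizing the prediction error for the deleted point; the closed form is simply cheaper.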


Cook’s Distance Statistic $D_i$

$$D_i = \frac{e_i^2\, h_i}{(k+1)\, MSE\, (1 - h_i)^2}$$

$e_i$ = the residual for observation i
MSE = mean square error of the fitted regression model
$h_i$ = hat matrix element of observation i
If $D_i > F_{k+1,\,n-k-1}$, an observation is considered influential
  $F_{k+1,\,n-k-1}$ is the critical value of the F distribution at a 50% level of significance


Cook’s Distance Statistic $D_i$: Heating Oil Example

Oil (Gal)   Temp   Insulation   D_i
275.30      40     3            0.0094
363.80      27     3            0.0098
164.30      40     10           0.0496
40.80       73     6            0.0041
94.30       64     6            0.0001
230.90      34     6            0.0295
366.70      9      6            0.1342
300.60      8      10           0.1328
237.80      23     10           0.0001
121.40      63     3            0.3083
31.40       65     10           0.1342
203.50      41     6            0.0094
441.10      21     3            0.4941
323.00      38     3            0.0824
52.50       58     10           0.0062

$n = 15$, $k = 2$, $F_{k+1,\,n-k-1} = F_{3,12} = 0.835$

No $D_i > 0.835$
  No observation appears to be a candidate for removal from the model

Using the 3 criteria, there is insufficient evidence for the removal of any observation from the model.
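Cook's distance can be sketched the same way. This version uses the standard definition, which divides by the number of estimated coefficients (k + 1, including the intercept); that choice is an assumption about notation, made because it matches the F distribution's numerator degrees of freedom on the slide. The cutoff 0.835 is copied from the slide, and the variable names are illustrative.

```python
import numpy as np

# Heating oil data from the example: oil (gal), temperature (deg F), insulation (in).
data = np.array([
    [275.30, 40, 3], [363.80, 27, 3], [164.30, 40, 10],
    [40.80, 73, 6], [94.30, 64, 6], [230.90, 34, 6],
    [366.70, 9, 6], [300.60, 8, 10], [237.80, 23, 10],
    [121.40, 63, 3], [31.40, 65, 10], [203.50, 41, 6],
    [441.10, 21, 3], [323.00, 38, 3], [52.50, 58, 10],
])
oil, temp, ins = data.T

X = np.column_stack([np.ones(len(oil)), temp, ins])
n = X.shape[0]
k = X.shape[1] - 1

# Residuals, MSE and hat values from the full fit.
b, *_ = np.linalg.lstsq(X, oil, rcond=None)
e = oil - X @ b
mse = (e @ e) / (n - k - 1)
h = np.diag(X @ np.linalg.inv(X.T @ X) @ X.T)

# Cook's distance, dividing by the k + 1 estimated coefficients.
D = e**2 * h / ((k + 1) * mse * (1 - h) ** 2)

F_CRIT = 0.835  # F(3, 12) cutoff quoted on the slide
flagged = np.flatnonzero(D > F_CRIT) + 1  # 1-based observation numbers

print(np.round(D, 4))
print("flagged:", flagged)
```

Each D_i measures how far the whole coefficient vector moves when observation i is deleted, which is why it combines the residual and the hat value.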


Collinearity (Multicollinearity)

High Correlation between Explanatory Variables

Coefficient of Multiple Determination Measures Combined Effect of the Correlated Explanatory Variables

Little or No New Information Provided

Leads to Unstable Coefficients (Large Standard Error)


Venn Diagrams and Collinearity

[Figure: Venn diagram of the variation in Oil, Temp and Insulation]

Large overlap reflects collinearity between Temp and Insulation

Overlap in variation of Temp and Insulation is used in explaining the variation in Oil but NOT in estimating $\beta_1$ and $\beta_2$


Detect Collinearity (Variance Inflationary Factor)

$VIF_j$ Used to Measure Collinearity

$$VIF_j = \frac{1}{1 - R_j^2}$$

$R_j^2$ = coefficient of multiple determination from the regression of $X_j$ on all the other explanatory variables

If $VIF_j > 5$, $X_j$ is Highly Correlated with the Other Explanatory Variables
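The VIF definition translates directly into code: regress each explanatory variable on the others and plug the resulting R_j² into 1/(1 − R_j²). The sketch below applies it to the temperature and insulation columns of the heating oil data; the helper name `vif` is hypothetical, not a library function.

```python
import numpy as np

def vif(X_expl: np.ndarray) -> np.ndarray:
    """VIF_j = 1 / (1 - R_j^2), regressing column j on the remaining columns."""
    n, m = X_expl.shape
    out = np.empty(m)
    for j in range(m):
        target = X_expl[:, j]
        A = np.column_stack([np.ones(n), np.delete(X_expl, j, axis=1)])
        coef, *_ = np.linalg.lstsq(A, target, rcond=None)
        resid = target - A @ coef
        sst = np.sum((target - target.mean()) ** 2)
        out[j] = sst / (resid @ resid)  # equals 1 / (1 - R_j^2)
    return out

# Temperature (deg F) and insulation (in) from the heating oil example.
temp = np.array([40, 27, 40, 73, 64, 34, 9, 8, 23, 63, 65, 41, 21, 38, 58], dtype=float)
ins = np.array([3, 3, 10, 6, 6, 6, 6, 10, 10, 3, 10, 6, 3, 3, 10], dtype=float)

vifs = vif(np.column_stack([temp, ins]))
print(vifs)
```

With only two explanatory variables the two VIFs are necessarily equal, because each R_j² is the squared correlation between the same pair of variables.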


Detect Collinearity in PHStat

PHStat | Regression | Multiple Regression …
  Check the “Variance Inflationary Factor (VIF)” box
Excel spreadsheet for the heating oil example
  Since there are only two explanatory variables, only one VIF is reported in the Excel spreadsheet
  No VIF is > 5
    There is no evidence of collinearity


Model Building

Goal is to Develop a Good Model with the Fewest Explanatory Variables
  Easier to interpret
  Lower probability of collinearity
Stepwise Regression Procedure
  Provides limited evaluation of alternative models
Best-Subset Approach
  Uses the $r^2_{adj}$ or $C_p$ statistic
  Selects the model with the largest $r^2_{adj}$ or small $C_p$ near $k+1$
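A best-subset search can be sketched by fitting every subset of candidate variables and tabulating both criteria. The formulas used here are assumptions, since the slides do not state them: adjusted r² = 1 − (1 − r²)(n − 1)/(n − k − 1), and C_p = (1 − R_k²)(n − T)/(1 − R_T²) − (n − 2(k + 1)), where T is the number of coefficients in the full model. The heating oil data serve as the example, and all names are illustrative.

```python
from itertools import combinations

import numpy as np

# Heating oil data from the example: oil (gal), temperature (deg F), insulation (in).
data = np.array([
    [275.30, 40, 3], [363.80, 27, 3], [164.30, 40, 10],
    [40.80, 73, 6], [94.30, 64, 6], [230.90, 34, 6],
    [366.70, 9, 6], [300.60, 8, 10], [237.80, 23, 10],
    [121.40, 63, 3], [31.40, 65, 10], [203.50, 41, 6],
    [441.10, 21, 3], [323.00, 38, 3], [52.50, 58, 10],
])
oil, temp, ins = data.T
candidates = {"temp": temp, "ins": ins}
n = len(oil)
sst = np.sum((oil - oil.mean()) ** 2)

def r_squared(names):
    """r^2 of the least-squares fit of oil on the named variables."""
    X = np.column_stack([np.ones(n)] + [candidates[v] for v in names])
    b, *_ = np.linalg.lstsq(X, oil, rcond=None)
    resid = oil - X @ b
    return 1.0 - resid @ resid / sst

T = len(candidates) + 1                  # coefficients in the full model
r2_full = r_squared(tuple(candidates))

results = {}
for size in range(1, len(candidates) + 1):
    for names in combinations(candidates, size):
        k = len(names)
        r2 = r_squared(names)
        adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)
        cp = (1.0 - r2) * (n - T) / (1.0 - r2_full) - (n - 2 * (k + 1))
        results[names] = (adj_r2, cp)
        print(names, f"adj r2 = {adj_r2:.4f}, Cp = {cp:.2f}")

best = max(results, key=lambda names: results[names][0])
print("largest adjusted r2:", best)
```

By construction the full model always has C_p equal to k + 1, so C_p is informative mainly for the smaller subsets.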


Model Building Flowchart

1. Choose X1, X2, …, Xp
2. Run Regression to Find VIFs
3. Any VIF > 5?
   Yes: More than One?
     Yes: Remove Variable with Highest VIF, then return to step 2
     No: Remove this X, then return to step 2
   No: continue to step 4
4. Run Subsets Regression to Obtain “Best” Models in Terms of Cp
5. Do Complete Analysis
6. Add Curvilinear Term and/or Transform Variables as Indicated
7. Perform Predictions
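The VIF screening loop at the top of the flowchart can be sketched as follows. Everything here is illustrative: the data are synthetic (x3 is built as a near copy of x1 to force collinearity), the helper `vif` is hypothetical, and the later flowchart stages (subset selection, residual analysis, prediction) are not covered.

```python
import numpy as np

def vif(X: np.ndarray) -> np.ndarray:
    """VIF_j = 1 / (1 - R_j^2), regressing column j on the remaining columns."""
    n, m = X.shape
    out = np.empty(m)
    for j in range(m):
        A = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(A, X[:, j], rcond=None)
        resid = X[:, j] - A @ coef
        sst = np.sum((X[:, j] - X[:, j].mean()) ** 2)
        out[j] = sst / (resid @ resid)
    return out

# Synthetic explanatory variables: x3 is almost a copy of x1.
rng = np.random.default_rng(0)
n = 50
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = x1 + 0.01 * rng.normal(size=n)

names = ["x1", "x2", "x3"]
X = np.column_stack([x1, x2, x3])
dropped = []

# Repeat: compute VIFs, and while any exceeds 5, remove the variable
# with the highest VIF and rerun.
while X.shape[1] > 1:
    v = vif(X)
    worst = int(np.argmax(v))
    if v[worst] <= 5:
        break
    dropped.append(names.pop(worst))
    X = np.delete(X, worst, axis=1)

print("kept:", names, "dropped:", dropped)
```

Removing one variable at a time and recomputing matters because dropping a single collinear variable usually lowers the VIFs of the variables it was correlated with.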


Pitfalls and Ethical Issues

Fail to Understand that the Interpretation of the Estimated Regression Coefficients is Performed Holding All Other Independent Variables Constant

Fail to Evaluate Residual Plots for Each Independent Variable

Fail to Evaluate Interaction Terms


Pitfalls and Ethical Issues (continued)

Fail to Obtain VIF for Each Independent Variable and Remove Variables that Exhibit a High Collinearity with Other Independent Variables Before Performing Significance Test on Each Independent Variable

Fail to Examine Several Alternative Models

Fail to Use Other Methods When the Assumptions Necessary for Least-Squares Regression Have Been Seriously Violated


Chapter Summary

Described the Quadratic Regression Model
Discussed Using Transformations in Regression Models
Presented Influence Analysis
Described Collinearity
Discussed Model Building
Addressed Pitfalls in Multiple Regression and Ethical Issues