
1

Chapter 3 Multiple Linear Regression
Linear Regression Analysis, 5th ed., Montgomery, Peck, and Vining

2

3.1 Multiple Regression Models

• Suppose that the yield in pounds of conversion in a chemical process depends on the temperature and the catalyst concentration. A multiple regression model that might describe this relationship is

y = β0 + β1x1 + β2x2 + ε

where x1 is the temperature and x2 is the catalyst concentration.

• This is a multiple linear regression model in two variables.

3

3.1 Multiple Regression Models

Figure 3.1 (a) The regression plane for the model E(y)= 50+10x1+7x2 . (b) The contour plot.

4

3.1 Multiple Regression Models

In general, the multiple linear regression model with k regressors is

y = β0 + β1x1 + β2x2 + … + βkxk + ε

5

3.1 Multiple Regression Models

6

3.1 Multiple Regression Models

Linear regression models may also contain interaction effects, for example

y = β0 + β1x1 + β2x2 + β12x1x2 + ε

If we let x3 = x1x2 and β3 = β12, then the model can be written in the standard multiple linear regression form

y = β0 + β1x1 + β2x2 + β3x3 + ε
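A minimal sketch (simulated data; all names and coefficient values here are illustrative, not from the slides) showing that the two forms of the interaction model give identical least-squares fits in R:

# Simulated data for illustration only
set.seed(3)
x1 <- runif(50); x2 <- runif(50)
y  <- 4 + 2*x1 - 1*x2 + 3*x1*x2 + rnorm(50, sd = 0.2)
fit_inter <- lm(y ~ x1 + x2 + x1:x2)   # interaction term written directly
x3 <- x1 * x2
fit_x3    <- lm(y ~ x1 + x2 + x3)      # same model after the substitution x3 = x1*x2
cbind(coef(fit_inter), coef(fit_x3))   # identical coefficient estimates, different labels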

7

3.1 Multiple Regression Models

8

9

3.2 Estimation of the Model Parameters

3.2.1 Least Squares Estimation of the Regression Coefficients

Notation

n – number of observations available

k – number of regressor variables; p = k + 1 – number of regression coefficients

y – response or dependent variable

xij – ith observation on the jth regressor xj

10

3.2.1 Least Squares Estimation of Regression Coefficients

11

3.2.1 Least Squares Estimation of the Regression Coefficients

The sample regression model, written in terms of the n observations, is

yi = β0 + β1xi1 + β2xi2 + … + βkxik + εi,  i = 1, 2, …, n

12

3.2.1 Least Squares Estimation of the Regression Coefficients

The least-squares function is

S(β0, β1, …, βk) = Σi εi² = Σi (yi - β0 - Σj βjxij)²

The function S must be minimized with respect to the coefficients β0, β1, …, βk.

13

3.2.1 Least Squares Estimation of the Regression Coefficients

The least-squares estimates of the coefficients must satisfy

∂S/∂β0 = 0 and ∂S/∂βj = 0,  j = 1, 2, …, k,

where the partial derivatives are evaluated at the least-squares estimates β̂0, β̂1, …, β̂k.

14

3.2.1 Least Squares Estimation of the Regression Coefficients

Simplifying, we obtain the least squares normal equations:

The ordinary least squares estimators are the solutions to the normal equations.

15

3.2.1 Least Squares Estimation of the Regression Coefficients

It is more convenient to work with matrix notation when finding the estimates. Let

y = Xβ + ε

where y is an n × 1 vector of observations, X is an n × p model matrix (a column of ones for the intercept followed by the regressor levels), β is a p × 1 vector of regression coefficients, and ε is an n × 1 vector of random errors.

16

3.2.1 Least Squares Estimation of the Regression Coefficients

17

3.2.1 Least Squares Estimation of the Regression Coefficients

These are the least-squares normal equations in matrix form, X'X β̂ = X'y. Provided the inverse of X'X exists, the solution is

β̂ = (X'X)⁻¹X'y

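A minimal computational sketch (simulated data with illustrative values only) of solving the normal equations directly and checking the result against lm():

# Simulated data for illustration only
set.seed(1)
n  <- 25
x1 <- runif(n, 2, 30); x2 <- runif(n, 30, 1500)
y  <- 2.3 + 1.6*x1 + 0.014*x2 + rnorm(n, sd = 3)
X  <- cbind(1, x1, x2)                      # model matrix with a column of ones
beta_hat <- solve(t(X) %*% X, t(X) %*% y)   # solves X'X b = X'y, i.e. b = (X'X)^{-1} X'y
beta_hat
coef(lm(y ~ x1 + x2))                       # same estimates from lm()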

18

3.2.1 Least Squares Estimation of the Regression Coefficients

19

3.2.1 Least Squares Estimation of the Regression Coefficients

The n residuals can be written in matrix form as

e = y - ŷ = y - Xβ̂

There will be some situations where an alternative form will prove useful:

e = y - X(X'X)⁻¹X'y = (I - H)y

where H = X(X'X)⁻¹X' is called the hat matrix.
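A minimal sketch (toy simulated data) of the hat-matrix form of the residuals:

# Toy data for illustration only
set.seed(2)
X <- cbind(1, runif(10), runif(10))               # 10 x 3 model matrix
y <- X %*% c(1, 2, -1) + rnorm(10, sd = 0.5)
H <- X %*% solve(t(X) %*% X) %*% t(X)             # hat matrix H = X (X'X)^{-1} X'
e <- (diag(nrow(X)) - H) %*% y                    # residuals e = (I - H) y
all.equal(as.vector(e), as.vector(y - H %*% y))   # same as y minus the fitted values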

20

Example 3-1. The Delivery Time Data

The model of interest is

y = β0 + β1x1 + β2x2 + ε

where y is the delivery time, x1 is the number of cases of product stocked, and x2 is the distance walked by the route driver.
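A minimal sketch of fitting this model in R, assuming the data are stored in a data frame named delivery with columns time, cases, and distance (these object and column names are assumptions, not from the slides):

# Hypothetical data frame `delivery` with columns time, cases, distance
fit <- lm(time ~ cases + distance, data = delivery)
summary(fit)   # coefficient estimates, standard errors, t tests, R^2, and the overall F test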

21

Example 3-1. The Delivery Time Data

Figure 3.4 Scatterplot matrix for the delivery time data from Example 3.1.

The R code for this figure is in “Chapter_3_nulti_reg.txt”.

22

Example 3-1 The Delivery Time Data

Figure 3.5 Three-dimensional scatterplot of the delivery time data from Example 3.1.

23

Example 3-1 The Delivery Time Data

24

Example 3-1 The Delivery Time Data

25

Example 3-1 The Delivery Time Data

26

27

R Output

28

3.2.3 Properties of Least-Squares Estimators

• Statistical Properties

• Variances/Covariances

E(β̂) = β  (the least-squares estimator is unbiased)

Cov(β̂) = σ²(X'X)⁻¹

Writing C = (X'X)⁻¹, a p × p matrix, this means

Var(β̂j) = σ²Cjj  and  Cov(β̂i, β̂j) = σ²Cij

The diagonal elements Cjj give the variances of the individual coefficient estimates, and the off-diagonal elements Cij give the covariances between pairs of coefficient estimates.
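A minimal sketch of computing the estimated covariance matrix σ̂²(X'X)⁻¹ in R, again assuming the hypothetical delivery data frame described above:

# Hypothetical data frame `delivery` with columns time, cases, distance
fit <- lm(time ~ cases + distance, data = delivery)
X   <- model.matrix(fit)
sigma2_hat <- summary(fit)$sigma^2   # residual mean square, the estimate of sigma^2
sigma2_hat * solve(t(X) %*% X)       # p x p matrix: variances on the diagonal, covariances off it
vcov(fit)                            # the same matrix, computed by R directly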

29

3.2.4 Estimation of σ²

• The residual sum of squares can be shown to be

SSRes = y'y - β̂'X'y

• The residual mean square for the model with p parameters is

MSRes = SSRes / (n - p) = σ̂²

30

3.2.4 Estimation of σ²

• Recall that the estimator of σ² is model dependent; that is, change the form of the model and the estimate of σ² will invariably change.
  – Note that the variance estimate is a function of the residuals, the “unexplained noise about the fitted regression line.”

Which model is better? Let's compare the estimated error variances of two models (see the sketch after this list):

Model 1: use two regressors (cases and distance)

Model 2: use only the regressor cases

We would usually prefer the model with the smaller residual mean square (the estimated variance of the errors).
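A minimal sketch of this comparison in R, assuming the hypothetical delivery data frame described earlier:

# Hypothetical data frame `delivery` with columns time, cases, distance
m1 <- lm(time ~ cases + distance, data = delivery)   # Model 1: both regressors
m2 <- lm(time ~ cases, data = delivery)              # Model 2: cases only
c(MSres_model1 = summary(m1)$sigma^2,                # residual mean square = sigma-hat^2
  MSres_model2 = summary(m2)$sigma^2)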

31


32

Example 3.2 Delivery Time Data


33

Example 3.2 Delivery Time Data


34

3.2.5 Inadequacy of Scatter Diagrams in Multiple Regression

• Scatter diagrams of the regressor variable(s) against the response may be of little value in multiple regression.
  – These plots can actually be misleading.
  – If there is an interdependency between two or more regressor variables, the true relationship between xi and y may be masked.

35

Illustration of the Inadequacy of Scatter Diagrams in Multiple Regression

A scatterplot is useful if:

• there is only one (or a few) dominant regressor(s), or

• the regressors operate nearly independently.

Scatterplots can be misleading when several important regressors are themselves related. (We will discuss analytical methods for sorting out the relationships between regressors in a later chapter.)

36

37

3.3 Hypothesis Testing in Multiple Linear Regression

Once we have estimated the parameters in the model, we face two immediate questions:

1. What is the overall adequacy of the model?

2. Which specific regressors seem important?


38

3.3 Hypothesis Testing in Multiple Linear Regression

• Test for Significance of Regression (sometimes called the global test of model adequacy)

• Tests on Individual Regression Coefficients (or groups of coefficients)

Next we will consider:


39

3.3.1 Test for Significance of Regression

• The test for significance is a test to determine if there is a linear relationship between the response and any of the regressor variables

• The hypotheses are

H0: β1 = β2 = … = βk = 0

H1: βj ≠ 0 for at least one j

40

3.3.1 Test for Significance of Regression

• As in Chapter 2, the total sum of squares can be partitioned in two parts:

SST = SSR + SSRes

• This leads to an ANOVA procedure with the test (F) statistic

F0 = (SSR / k) / (SSRes / (n - p)) = MSR / MSRes

where k is the number of regressors and n - p = n - k - 1.
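A minimal sketch of the overall F test in R, using the hypothetical delivery data frame from before:

# Hypothetical data frame `delivery` with columns time, cases, distance
fit  <- lm(time ~ cases + distance, data = delivery)
null <- lm(time ~ 1, data = delivery)   # intercept-only model
anova(null, fit)                        # F0 = MS_R / MS_Res and its p-value
summary(fit)$fstatistic                 # the same statistic as reported by summary(fit)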

41

3.3.1 Test for Significance of Regression

• The standard ANOVA is conducted with the computational formulas

SST = y'y - (Σ yi)² / n

SSR = β̂'X'y - (Σ yi)² / n

SSRes = SST - SSR = y'y - β̂'X'y

42

3.3.1 Test for Significance of Regression

ANOVA Table:

Reject H0 if F0 > Fα, k, n-p (the regression and residual degrees of freedom may also be written as p - 1 and n - p).

43

3.3.1 Test for Significance of Regression

• R2

– R2 is calculated exactly as in simple linear regression

– R2 can be inflated simply by adding more terms to the model (even insignificant terms)

• Adjusted R2

– Penalizes you for adding terms to the model that are not significant

R² = SSR / SST = 1 - SSRes / SST

R²adj = 1 - (SSRes / (n - p)) / (SST / (n - 1))
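A minimal sketch of computing R² and adjusted R² from the sums of squares in R (hypothetical delivery data frame as before), checked against summary():

# Hypothetical data frame `delivery` with columns time, cases, distance
fit   <- lm(time ~ cases + distance, data = delivery)
SSres <- sum(resid(fit)^2)
SStot <- sum((delivery$time - mean(delivery$time))^2)
n <- nrow(delivery); p <- length(coef(fit))
c(R2     = 1 - SSres/SStot,
  R2_adj = 1 - (SSres/(n - p)) / (SStot/(n - 1)))
c(summary(fit)$r.squared, summary(fit)$adj.r.squared)   # same values from summary()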

44

Example 3.3 Delivery Time Data


45

Example 3.3 Delivery Time Data

To test H0: β1 = β2 = 0, we calculate the F statistic:

46

Example 3.3 Delivery Time Data

R2 = 0.9596

Adjusted R2 = 0.9559

To assess the overall significance of the regression, look at the p-value of the F test, R², and the adjusted R².

Adding a variable to the model will always increase R², even if the variable is not significant.

Our goal is to add only the regressors that genuinely reduce the residual variability, without over-fitting by adding unnecessary variables. (We will learn variable selection procedures in later chapters.)

47

48

3.3.2 Tests on Individual Regression Coefficients

• Hypothesis test on any single regression coefficient:

H0: βj = 0
H1: βj ≠ 0

• Test statistic:

t0 = β̂j / √(σ̂²Cjj) = β̂j / se(β̂j)

– Reject H0 if |t0| > tα/2, n-k-1

– This is a partial or marginal test!
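A minimal sketch of these marginal t tests in R (hypothetical delivery data frame as before):

# Hypothetical data frame `delivery` with columns time, cases, distance
fit <- lm(time ~ cases + distance, data = delivery)
summary(fit)$coefficients          # Estimate, Std. Error, t value, Pr(>|t|) for each coefficient
qt(0.975, df = df.residual(fit))   # critical value t_{alpha/2, n-p} for alpha = 0.05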

49

The Extra Sum of Squares method can also be used to test hypotheses on individual model parameters or groups of parameters

Full model: y = Xβ + ε, with the coefficient vector partitioned as β = (β1', β2')', so that the hypothesis H0: β2 = 0 can be tested by comparing the full model with the reduced model y = X1β1 + ε.
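A minimal sketch of an extra-sum-of-squares (partial F) test in R, testing whether distance contributes beyond a model that already contains cases (hypothetical delivery data frame as before):

# Hypothetical data frame `delivery` with columns time, cases, distance
reduced <- lm(time ~ cases, data = delivery)
full    <- lm(time ~ cases + distance, data = delivery)
anova(reduced, full)   # partial F test based on SS_R(beta_2 | beta_1)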


50


51


52

3.3.3 Special Case of Orthogonal Columns in X

• If the columns of X1 are orthogonal to the columns of X2, then the sum of squares due to β2 is free of any dependence on the regressors in X1.

53

Example

• Consider a dataset with four regressor variables and a single response.

• Fit the equation with all regressors and find that: y = - 19.9 + 0.0123x1 + 27.3x2 - 0.0655x3 - 0.196x4

• Looking at the t-tests, suppose that x3 is insignificant. So it is removed. What is the equation now?

• Generally, it is not

y = - 19.9 + 0.0123x1 + 27.3x2 - 0.196x4


54

Example

• The model must be refit with the insignificant regressors left out of the model.

• The regression equation is

y = - 24.9 + 0.0117x1 + 31.0x2 - 0.217x4

• The refitting must be done because the least-squares estimate of each coefficient depends on all of the regressors, xj, present in the model.


55

Example

• However, if the columns are orthogonal to each other, then there is no need to refit.

• Can you think of some situations where we would have orthogonal columns?


56

3.4.1. Confidence Intervals on the Regression Coefficients

A 100(1 - α) percent confidence interval for the regression coefficient βj is

β̂j - tα/2, n-p se(β̂j) ≤ βj ≤ β̂j + tα/2, n-p se(β̂j)

or, equivalently,

β̂j - tα/2, n-p √(σ̂²Cjj) ≤ βj ≤ β̂j + tα/2, n-p √(σ̂²Cjj)
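A minimal sketch in R (hypothetical delivery data frame as before):

# Hypothetical data frame `delivery` with columns time, cases, distance
fit <- lm(time ~ cases + distance, data = delivery)
confint(fit, level = 0.95)   # beta_j_hat +/- t_{0.025, n-p} * se(beta_j_hat) for each coefficient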

57


58

3.4.2. Confidence Interval Estimation of the Mean Response

• A 100(1 - α) percent confidence interval on the mean response at the point x0' = [1, x01, x02, …, x0k] is

ŷ0 - tα/2, n-p √(σ̂² x0'(X'X)⁻¹x0) ≤ E(y | x0) ≤ ŷ0 + tα/2, n-p √(σ̂² x0'(X'X)⁻¹x0)

where ŷ0 = x0'β̂ is the fitted value at x0.

• See Example 3-9 on page 95 and the discussion that follows.
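A minimal sketch in R (hypothetical delivery data frame; the new point below is purely illustrative, not taken from the text):

# Hypothetical data frame `delivery` with columns time, cases, distance
fit <- lm(time ~ cases + distance, data = delivery)
x0  <- data.frame(cases = 8, distance = 275)   # illustrative new point
predict(fit, newdata = x0, interval = "confidence", level = 0.95)   # CI on the mean response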

59


60


61


62

3.4.3. Simultaneous Confidence Intervals on Regression Coefficients

It can be shown that

(β̂ - β)' X'X (β̂ - β) / (p MSRes) ~ Fp, n-p

From this result, a 100(1 - α) percent joint confidence region for all of the parameters in β is

(β̂ - β)' X'X (β̂ - β) / (p MSRes) ≤ Fα, p, n-p
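A minimal sketch of checking whether a candidate coefficient vector lies inside the joint 95 percent confidence region (hypothetical delivery data frame; the candidate values are purely illustrative):

# Hypothetical data frame `delivery` with columns time, cases, distance
fit   <- lm(time ~ cases + distance, data = delivery)
X     <- model.matrix(fit); b <- coef(fit); p <- length(b)
MSres <- summary(fit)$sigma^2
beta0 <- c(2, 1.5, 0.01)                                 # candidate (beta_0, beta_1, beta_2), illustrative
lhs   <- t(b - beta0) %*% (t(X) %*% X) %*% (b - beta0) / (p * MSres)
lhs <= qf(0.95, p, df.residual(fit))                     # TRUE means beta0 lies inside the region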

63

3.5 Prediction of New Observations

• A 100(1 - α) percent prediction interval for a future observation y0 at the point x0 is

ŷ0 - tα/2, n-p √(σ̂²(1 + x0'(X'X)⁻¹x0)) ≤ y0 ≤ ŷ0 + tα/2, n-p √(σ̂²(1 + x0'(X'X)⁻¹x0))
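A minimal sketch in R (hypothetical delivery data frame and an illustrative new point, as before):

# Hypothetical data frame `delivery` with columns time, cases, distance
fit <- lm(time ~ cases + distance, data = delivery)
x0  <- data.frame(cases = 8, distance = 275)   # illustrative new point
predict(fit, newdata = x0, interval = "prediction", level = 0.95)   # wider than the CI on the mean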

64


65

3.6 Hidden Extrapolation in Multiple Regression

• In prediction, exercise care about potentially extrapolating beyond the region containing the original observations.

Figure 3.10 An example of extrapolation in multiple regression.


66

3.6 Hidden Extrapolation in Multiple Regression

• We will define the smallest convex set containing all of the original n data points (xi1, xi2, …, xik), i = 1, 2, …, n, as the regressor variable hull (RVH).

• If a point x01, x02, …, x0k lies inside or on the boundary of the RVH, then prediction or estimation involves interpolation, while if this point lies outside the RVH, extrapolation is required.


67

3.6 Hidden Extrapolation in Multiple Regression

• Diagonal elements hii of the matrix H = X(X'X)⁻¹X' can aid in determining whether hidden extrapolation exists. Let hmax denote the largest of the hii.

• The set of points x (not necessarily data points used to fit the model) that satisfy

x'(X'X)⁻¹x ≤ hmax

is an ellipsoid enclosing all points inside the RVH.

68

3.6 Hidden Extrapolation in Multiple Regression

• Let x0 be a point at which prediction or estimation is of interest. Then

h00 = x0'(X'X)⁻¹x0

• If h00 > hmax, then the point is a point of extrapolation.
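A minimal sketch of this extrapolation check in R (hypothetical delivery data frame and an illustrative new point):

# Hypothetical data frame `delivery` with columns time, cases, distance
fit  <- lm(time ~ cases + distance, data = delivery)
X    <- model.matrix(fit)
hmax <- max(hatvalues(fit))                   # largest diagonal element of H over the data
x0   <- c(1, 8, 275)                          # intercept, cases, distance: illustrative point
h00  <- t(x0) %*% solve(t(X) %*% X) %*% x0    # h00 = x0'(X'X)^{-1} x0
h00 > hmax                                    # TRUE would indicate extrapolation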

69

Example 3.13

Consider prediction or estimation at:


70

Figure 3.10 Scatterplot of cases and distance for the delivery time data.

(Points labeled a, b, c, and d and observation #9 are marked on the plot.)