INTRODUCTORY LINEAR REGRESSION 1. 3.1 SIMPLE LINEAR REGRESSION - Curve fitting - Inferences about...


INTRODUCTORY LINEAR REGRESSION


3.1 SIMPLE LINEAR REGRESSION

- Curve fitting

- Inferences about estimated parameter

- Adequacy of the models

- Linear correlation

3.2 Multiple Linear Regression


Introduction:

Regression is a statistical procedure for establishing the relationship between two or more variables.

This is done by fitting a linear equation to the observed data.

The regression line is then used by the researcher to see the trend and to predict values for the data.

There are 2 types of regression:

- Simple (2 variables)

- Multiple (more than 2 variables)


3.1 The Simple Linear Regression Model is an equation that describes a dependent variable (Y) in terms of an independent variable (X) plus random error:

$$Y = \beta_0 + \beta_1 X + \varepsilon$$

where,

$\beta_0$ = intercept of the line with the Y-axis

$\beta_1$ = slope of the line

$\varepsilon$ = random error

The random error, $\varepsilon$, is the difference between a data point and its deterministic value. The regression line is estimated from the collected data by fitting a straight line to the data set and obtaining the equation of that straight line,

$$\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X$$


Example 3.1:

1) A nutritionist studying weight loss programs might want to find out if reducing the intake of carbohydrate can help a person reduce weight.

a) X is the carbohydrate intake (independent variable).

b) Y is the weight (dependent variable).

2) An entrepreneur might want to know whether increasing the cost of packaging his new product will have an effect on the sales volume.

a) X is the cost (independent variable).

b) Y is the sales volume (dependent variable).


3.1.1 CURVE FITTING (SCATTER PLOTS)

A scatter plot is a graph of ordered pairs (x, y).

The purpose of a scatter plot is to describe, in a visual way, the nature of the relationship between the independent variable, X, and the dependent variable, Y.

The independent variable, x is plotted on the horizontal axis and the dependent variable, y is plotted on the vertical axis.


SCATTER DIAGRAM: Positive Linear Relationship

[Figure: regression line of E(y) against x with intercept $\beta_0$; the slope $\beta_1$ is positive.]


SCATTER DIAGRAM: Negative Linear Relationship

[Figure: regression line of E(y) against x with intercept $\beta_0$; the slope $\beta_1$ is negative.]


SCATTER DIAGRAM: No Relationship

[Figure: regression line of E(y) against x with intercept $\beta_0$; the slope $\beta_1$ is 0.]


LINEAR REGRESSION MODEL

A linear regression line can be developed by a freehand plot of the data.

Example 3.2: The given table contains values for 2 variables, X and Y. Plot the given data and draw a freehand estimated regression line.

X:  -3  -2  -1   0   1   2   3
Y:   1   2   3   5   8  11  12


3.1.2 INFERENCES ABOUT ESTIMATED PARAMETERS

Least Squares Method

• The least squares method is commonly used to determine values for $\hat{\beta}_0$ and $\hat{\beta}_1$ that ensure a best fit for the estimated regression line to the sample data points.

• The straight line fitted to the data set is the line:

$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$$


LEAST SQUARES METHOD

Theorem: Given the sample data $(x_i, y_i);\ i = 1, 2, \ldots, n$, the coefficients of the least squares line $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$ are:

i) y-Intercept for the Estimated Regression Equation,

$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$$

where $\bar{x}$ and $\bar{y}$ are the means of x and y respectively.


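LEAST SQUARES METHOD

ii) Slope for the Estimated Regression Equation,

$$\hat{\beta}_1 = \frac{S_{xy}}{S_{xx}}$$

where,

$$S_{xy} = \sum_{i=1}^{n} x_i y_i - \frac{1}{n}\left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)$$

$$S_{xx} = \sum_{i=1}^{n} x_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} x_i\right)^2$$

$$S_{yy} = \sum_{i=1}^{n} y_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} y_i\right)^2$$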
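The least squares formulas for the intercept and slope translate directly into a few lines of code. The following is a minimal Python sketch, not part of the original slides; the function names `s_cross` and `least_squares_line` are illustrative assumptions. It is applied to the X and Y values of Example 3.2, so the result can be compared with a freehand line.

```python
def s_cross(u, v):
    """Return S_uv = sum(u_i * v_i) - (sum u_i)(sum v_i) / n."""
    n = len(u)
    return sum(a * b for a, b in zip(u, v)) - sum(u) * sum(v) / n


def least_squares_line(x, y):
    """Return (intercept, slope) of the least squares line for x and y."""
    s_xy = s_cross(x, y)   # S_xy
    s_xx = s_cross(x, x)   # S_xx (the same formula with v = u)
    slope = s_xy / s_xx
    intercept = sum(y) / len(y) - slope * sum(x) / len(x)
    return intercept, slope


# Data from Example 3.2
x = [-3, -2, -1, 0, 1, 2, 3]
y = [1, 2, 3, 5, 8, 11, 12]

b0, b1 = least_squares_line(x, y)
print(f"y-hat = {b0:.4f} + {b1:.4f} x")
```

For this data set the sums are small enough to check by hand: $S_{xy} = 56$, $S_{xx} = 28$ and $\bar{y} = 6$, so the fitted line is $\hat{y} = 6 + 2x$.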


LEAST SQUARES METHOD

• Given any value of $x_i$, the predicted value of the dependent variable, $\hat{y}$, can be found by substituting $x_i$ into the equation

$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$$


Example 3.3: Students' scores in history

The data below represent scores obtained by ten primary school students before and after they were taken on a tour to the museum (which is supposed to increase their interest in history).

Before, x:  65  63  76  46  68  72  68  57  36  96
After, y:   68  66  86  48  65  66  71  57  42  87

a) Fit a linear regression model with “before” as the explanatory variable and “after” as the dependent variable.

b) Predict the score a student would obtain “after” if he scored 60 marks “before”.


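Solution

$$n = 10, \qquad \sum xy = 44435$$

$$\sum x = 647, \qquad \sum x^2 = 44279, \qquad \bar{x} = 64.7$$

$$\sum y = 656, \qquad \sum y^2 = 44884, \qquad \bar{y} = 65.6$$

$$S_{xy} = 44435 - \frac{(647)(656)}{10} = 1991.8$$

$$S_{xx} = 44279 - \frac{(647)^2}{10} = 2418.1$$

$$S_{yy} = 44884 - \frac{(656)^2}{10} = 1850.4$$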


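a) $$\hat{\beta}_1 = \frac{S_{xy}}{S_{xx}} = \frac{1991.8}{2418.1} = 0.8237$$

$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} = 65.6 - 0.8237(64.7) = 12.3063$$

$$\hat{Y} = 12.3063 + 0.8237x$$

b) For $x = 60$:

$$\hat{Y} = 12.3063 + 0.8237(60) = 61.7283$$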
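The worked solution can be cross-checked numerically. The short Python sketch below is an illustration rather than part of the original slides; it starts from the totals quoted in the solution, and the variable names and the helper `predict` are assumptions chosen for readability.

```python
# Totals quoted in the worked solution of Example 3.3
n = 10
sum_x, sum_y = 647, 656
sum_xy, sum_x2 = 44435, 44279

# S_xy and S_xx from the computational formulas
s_xy = sum_xy - sum_x * sum_y / n
s_xx = sum_x2 - sum_x ** 2 / n

# Least squares slope and intercept
slope = s_xy / s_xx
intercept = sum_y / n - slope * sum_x / n


def predict(x_new):
    """Predicted 'after' score for a given 'before' score."""
    return intercept + slope * x_new


print("S_xy =", round(s_xy, 1), " S_xx =", round(s_xx, 1))
print("slope =", round(slope, 4), " intercept =", round(intercept, 4))
print("prediction at x = 60:", round(predict(60), 4))
```

The results agree with the slide values to about three decimal places. Any small difference in the prediction comes from the slides rounding the slope to 0.8237 before substituting.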


3.1.3 ADEQUACY OF THE MODEL: COEFFICIENT OF DETERMINATION ($R^2$)

The coefficient of determination is a measure of the variation of the dependent variable (Y) that is explained by the regression line and the independent variable (X).

The symbol for the coefficient of determination is $r^2$ or $R^2$.

If $r = 0.90$, then $r^2 = 0.81$. It means that 81% of the variation in the dependent variable (Y) is accounted for by the variations in the independent variable (X).

The rest of the variation, 0.19 or 19%, is unexplained and is called the coefficient of nondetermination.

The formula for the coefficient of nondetermination is $1.00 - r^2$.


Relationship Among SST, SSR, SSE

$$SST = SSR + SSE$$

$$\sum (y_i - \bar{y})^2 = \sum (\hat{y}_i - \bar{y})^2 + \sum (y_i - \hat{y}_i)^2$$

where:

SST = total sum of squares

SSR = sum of squares due to regression

SSE = sum of squares due to error

COEFFICIENT OF DETERMINATION ($R^2$)

The coefficient of determination is:

$$r^2 = \frac{SSR}{SST} = \frac{(S_{xy})^2}{S_{xx} S_{yy}}$$
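The decomposition of the total sum of squares and the two expressions for the coefficient of determination can be verified on the scores from Example 3.3. This is a minimal Python sketch under stated assumptions: the variable names are illustrative, and the S quantities are computed in their centred form, which is algebraically equivalent to the computational formulas used in the slides.

```python
# Scores from Example 3.3
before = [65, 63, 76, 46, 68, 72, 68, 57, 36, 96]
after = [68, 66, 86, 48, 65, 66, 71, 57, 42, 87]

n = len(before)
x_bar = sum(before) / n
y_bar = sum(after) / n

# Centred forms of S_xy, S_xx and S_yy
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(before, after))
s_xx = sum((x - x_bar) ** 2 for x in before)
s_yy = sum((y - y_bar) ** 2 for y in after)

# Fitted values from the least squares line
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar
fitted = [intercept + slope * x for x in before]

# The three sums of squares
sst = sum((y - y_bar) ** 2 for y in after)             # total
ssr = sum((f - y_bar) ** 2 for f in fitted)            # regression
sse = sum((y - f) ** 2 for y, f in zip(after, fitted)) # error

r2_from_sums = ssr / sst
r2_from_s = s_xy ** 2 / (s_xx * s_yy)

print("SST =", round(sst, 1), " SSR + SSE =", round(ssr + sse, 1))
print("r^2 =", round(r2_from_sums, 4), "and", round(r2_from_s, 4))
```

Both routes give the same $r^2$ up to floating-point rounding, and SST equals $S_{yy}$ by definition.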


Example 3.4

1) If $r = 0.919$, find the value of $r^2$ and explain the value.

Solution: $r^2 = 0.84$. It means that 84% of the variation in the dependent variable (Y) is explained by the variations in the independent variable (X).


3.1.4 Linear Correlation (r)

Correlation measures the strength of a linear relationship between the two variables.

It is also known as Pearson's product moment coefficient of correlation.

The symbol for the sample coefficient of correlation is $r$; for the population it is $\rho$.

Formula:

$$r = \frac{S_{xy}}{\sqrt{S_{xx} \cdot S_{yy}}}$$

or

$$r = (\text{sign of } b_1)\sqrt{r^2}$$

where $b_1$ is the estimated slope $\hat{\beta}_1$.


Properties of $r$:

$$-1 \le r \le 1$$

Values of $r$ close to 1 imply that there is a strong positive linear relationship between x and y.

Values of $r$ close to -1 imply that there is a strong negative linear relationship between x and y.

Values of $r$ close to 0 imply little or no linear relationship between x and y.


Refer to Example 3.3: Students' scores in history

c) Calculate the value of r and interpret its meaning.

Solution:

$$r = \frac{S_{xy}}{\sqrt{S_{xx} \cdot S_{yy}}} = \frac{1991.8}{\sqrt{(2418.1)(1850.4)}} = 0.9416$$

Thus, there is a strong positive linear relationship between the scores obtained before (x) and after (y).
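As a quick check, the two formulas for r can be evaluated side by side. The Python sketch below is illustrative only; it takes the S values from the Example 3.3 solution, and the variable names are assumptions.

```python
import math

# S quantities from the Example 3.3 solution
s_xy, s_xx, s_yy = 1991.8, 2418.1, 1850.4

# Direct formula: r = S_xy / sqrt(S_xx * S_yy)
r = s_xy / math.sqrt(s_xx * s_yy)

# Alternative: r = (sign of the slope) * sqrt(r^2)
slope = s_xy / s_xx
r_squared = s_xy ** 2 / (s_xx * s_yy)
r_alt = math.copysign(math.sqrt(r_squared), slope)

print("r =", round(r, 4), " r (sign form) =", round(r_alt, 4))
print("r^2 =", round(r_squared, 4))
```

The sign form matters only when the slope is negative, because the square root alone always returns a non-negative value.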


Refer to Example 3.3:

$$\hat{y} = 12.3063 + 0.8237x$$

$$r = +0.9416$$

$$r^2 = +0.8866$$


Assumptions About the Error Term

1. The error $\varepsilon$ is a random variable with mean of zero.

2. The variance of $\varepsilon$, denoted by $\sigma^2$, is the same for all values of the independent variable.

3. The values of $\varepsilon$ are independent.

4. The error $\varepsilon$ is a normally distributed random variable.


3.1.5 TEST OF SIGNIFICANCE

To determine whether X provides information in predicting Y, we proceed with testing the hypothesis.

Two tests are commonly used:

i) t Test

ii) F Test


1) t-Test

1. Determine the hypotheses.

$H_0: \beta_1 = 0$ (no linear relationship)

$H_1: \beta_1 \neq 0$ (a linear relationship exists)

2. Compute the critical value / level of significance.

$t_{\alpha/2,\, n-2}$ or the p-value

3. Compute the test statistic.

$$t = \frac{\hat{\beta}_1}{\sqrt{Var(\hat{\beta}_1)}}$$

$$Var(\hat{\beta}_1) = \frac{1}{n-2} \cdot \frac{S_{yy} - \hat{\beta}_1 S_{xy}}{S_{xx}}$$


1) t-Test

4. Determine the rejection rule.

Reject $H_0$ if:

$t < -t_{\alpha/2,\, n-2}$ or $t > t_{\alpha/2,\, n-2}$

p-value $< \alpha$

5. Conclusion.

If $H_0$ is rejected, there is a significant relationship between variables X and Y.


2) F-Test

1. Determine the hypotheses.

$H_0: \beta_1 = 0$ (no linear relationship)

$H_1: \beta_1 \neq 0$ (a linear relationship exists)

2. Specify the level of significance.

$F_{\alpha,\,1,\,n-2}$, with 1 degree of freedom (df) in the numerator and $n-2$ degrees of freedom in the denominator

3. Compute the test statistic.

$$F = \frac{MSR}{MSE}$$

4. Determine the rejection rule.

Reject $H_0$ if: p-value $< \alpha$ or $F_{test} > F_{\alpha,\,1,\,n-2}$


2) F-Test

5. Conclusion.

If $H_0$ is rejected, there is a significant relationship between variables X and Y.


Refer to Example 3.3: Students' scores in history

d) Test to determine if their scores before and after the trip are related. Use $\alpha = 0.05$.

Solution:

1. $H_0: \beta_1 = 0$ (no linear relationship)

$H_1: \beta_1 \neq 0$ (a linear relationship exists)

2. $\alpha = 0.05$, $t_{0.05/2,\,8} = 2.306$

3. $$Var(\hat{\beta}_1) = \frac{1}{n-2} \cdot \frac{S_{yy} - \hat{\beta}_1 S_{xy}}{S_{xx}} = \frac{1}{8} \cdot \frac{1850.4 - (0.8237)(1991.8)}{2418.1} = 0.0108$$

$$t_{test} = \frac{\hat{\beta}_1}{\sqrt{Var(\hat{\beta}_1)}} = \frac{0.8237}{\sqrt{0.0108}} = 7.926$$


4. Rejection Rule:

$$t_{test} = 7.926 > t_{0.025,\,8} = 2.306$$

5. Conclusion:

Thus, we reject $H_0$. The score before (x) has a linear relationship with the score after (y) the trip.
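The t-test for Example 3.3 can be reproduced in a few lines. This Python sketch is an illustration under stated assumptions: the S values and the critical value 2.306 are taken from the worked solution (the critical value is not computed from the t distribution here), and the variable names are hypothetical.

```python
import math

n = 10
s_xy, s_xx, s_yy = 1991.8, 2418.1, 1850.4
t_critical = 2.306  # t_{0.025, 8}, as quoted in the solution

# Estimated slope and its estimated variance
slope = s_xy / s_xx
var_slope = (s_yy - slope * s_xy) / ((n - 2) * s_xx)

# Test statistic for H0: beta_1 = 0
t_stat = slope / math.sqrt(var_slope)

# Two-sided rejection rule
reject_h0 = abs(t_stat) > t_critical

print("Var(slope) =", round(var_slope, 4))
print("t =", round(t_stat, 3), " reject H0:", reject_h0)
```

Without intermediate rounding the statistic comes out near 7.91 rather than 7.926; the slide value results from rounding the variance to 0.0108 before taking the square root. The decision to reject $H_0$ is the same either way.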


ANALYSIS OF VARIANCE (ANOVA)

The value of the test statistic F for an ANOVA test is calculated as:

$$F = \frac{MSR}{MSE}$$

To calculate MSR and MSE, first compute the regression sum of squares (SSR) and the error sum of squares (SSE).


ANALYSIS OF VARIANCE (ANOVA)

General form of ANOVA table:

Source of Variation   Degrees of Freedom (df)   Sum of Squares   Mean Squares        Value of the Test Statistic
Regression            1                         SSR              MSR = SSR/1         F = MSR/MSE
Error                 n-2                       SSE              MSE = SSE/(n-2)
Total                 n-1                       SST

ANOVA Test

1) Hypothesis:

$H_0: \beta_1 = 0$

$H_1: \beta_1 \neq 0$

2) Select the distribution to use: F-distribution

3) Calculate the value of the test statistic: F

4) Determine the rejection and non-rejection regions

5) Make a decision: reject $H_0$ or do not reject $H_0$
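To see how the entries of the ANOVA table fit together, the sketch below fills the table for the simple regression of Example 3.3. It is an illustrative Python example, not part of the original slides. The function name `anova_simple` is an assumption, and it uses $SST = S_{yy}$ and $SSR = (S_{xy})^2 / S_{xx}$, which follow from the formula $r^2 = SSR/SST = (S_{xy})^2/(S_{xx} S_{yy})$.

```python
def anova_simple(s_xy, s_xx, s_yy, n):
    """Return the ANOVA quantities for a simple linear regression."""
    sst = s_yy                # total sum of squares
    ssr = s_xy ** 2 / s_xx    # sum of squares due to regression
    sse = sst - ssr           # sum of squares due to error
    msr = ssr / 1             # regression has 1 degree of freedom
    mse = sse / (n - 2)       # error has n - 2 degrees of freedom
    return {"SSR": ssr, "SSE": sse, "SST": sst,
            "MSR": msr, "MSE": mse, "F": msr / mse}


# S quantities and sample size from Example 3.3
table = anova_simple(s_xy=1991.8, s_xx=2418.1, s_yy=1850.4, n=10)

print(f"Regression  df=1  SS={table['SSR']:.3f}  MS={table['MSR']:.3f}  F={table['F']:.3f}")
print(f"Error       df=8  SS={table['SSE']:.3f}  MS={table['MSE']:.3f}")
print(f"Total       df=9  SS={table['SST']:.3f}")
```

In simple linear regression the F statistic equals the square of the t statistic for the slope, so the F value here is about 62.6, the square of the unrounded t of about 7.91.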


Example 3.5

The manufacturer of Cardio Glide exercise equipment wants to study the relationship between the number of months since the glide was purchased and the length of time the equipment was used last week.

1) Determine the regression equation.

2) At $\alpha = 0.01$, test whether there is a linear relationship between the variables.


Solution (1):

Regression equation:

$$\hat{Y} = 9.939 - 0.637X$$


Solution (2):

1) Hypothesis:

$H_0: \beta_1 = 0$

$H_1: \beta_1 \neq 0$

2) F-distribution table:

$F_{0.01,\,1,\,8} = 11.2586$

3) Test statistic:

F = MSR/MSE = 17.303

or, using the p-value approach:

significance value = 0.003

4) Rejection region:

Since F statistic > F table (17.303 > 11.2586), we reject $H_0$; equivalently, since the p-value is below the significance level (0.003 < 0.01), we reject $H_0$.

5) Thus, there is a linear relationship between the variables (months X and hours Y).


3.2 MULTIPLE LINEAR REGRESSION

In multiple regression, there are several independent variables ($X_1, X_2, \ldots, X_p$) and one dependent variable (Y).

The multiple regression model:

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_p X_p + \varepsilon$$

This equation describes how the dependent variable Y is related to the independent variables $X_1, X_2, \ldots, X_p$,

where:

$\beta_0, \beta_1, \beta_2, \ldots, \beta_p$ are the parameters,

$\varepsilon$ is a random variable called the error term, and

$X_1, X_2, \ldots, X_p$ are the independent variables.


MULTIPLE REGRESSION MODEL

Multiple regression analysis is used when a statistician thinks there are several independent variables contributing to the variation of the dependent variable.

This analysis then can be used to increase the accuracy of predictions for the dependent variable over one independent variable alone.


Estimated Multiple Regression Equation

$$\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X_1 + \hat{\beta}_2 X_2 + \cdots + \hat{\beta}_p X_p$$

In multiple regression analysis, we interpret each regression coefficient as follows:

$\hat{\beta}_i$ represents an estimate of the change in y corresponding to a 1-unit increase in $x_i$ when all other independent variables are held constant.


MULTIPLE COEFFICIENT OF DETERMINATION ($R^2$)

As with simple regression, $R^2$ is the coefficient of multiple determination, and it is the proportion of variation explained by the regression model.

Formula:

$$R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$$

MULTIPLE CORRELATION COEFFICIENT (R)

In multiple regression, as in simple regression, the strength of the relationship between the independent variables and the dependent variable is measured by the correlation coefficient, R.


MODEL ASSUMPTIONS

The errors are normally distributed with mean $E(\varepsilon) = 0$ and variance $Var(\varepsilon) = \sigma^2$.

The errors are statistically independent. Thus the error for any value of Y is unaffected by the error for any other Y-value.

The X-variables are linear additive (i.e., can be summed).


ANALYSIS OF VARIANCE (ANOVA)

General form of ANOVA table:

Source       Degrees of Freedom   Sum of Squares   Mean Squares          Value of the Test Statistic
Regression   p                    SSR              MSR = SSR/p           F = MSR/MSE
Error        n-p-1                SSE              MSE = SSE/(n-p-1)
Total        n-1                  SST

Excel's ANOVA Output

ANOVA        df    SS         MS         F          Significance F
Regression   2     500.3285   250.1643   42.76013   2.32774E-07
Residual     17    99.45697   5.85041
Total        19    599.7855

In this output, the Regression SS is SSR and the Total SS is SST.
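The entries of the Excel output can be rebuilt from the degrees of freedom and sums of squares alone, which is a useful sanity check when reading such a table. A minimal Python sketch, with illustrative variable names that are not part of the original slides:

```python
# Degrees of freedom and sums of squares read from the Excel ANOVA output
df_regression, ssr = 2, 500.3285
df_residual, sse = 17, 99.45697
sst = 599.7855

# Mean squares are sums of squares divided by their degrees of freedom
msr = ssr / df_regression
mse = sse / df_residual

# Overall F statistic and multiple coefficient of determination
f_stat = msr / mse
r_squared = ssr / sst

print("MSR =", round(msr, 4), " MSE =", round(mse, 5))
print("F =", round(f_stat, 4))
print("R^2 =", round(r_squared, 3))
```

The recomputed MSR, MSE and F match the MS and F columns of the output, and $R^2 = SSR/SST$ is about 0.834.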


TEST OF SIGNIFICANCE

In simple linear regression, the F and t tests provide the same conclusion. In multiple regression, the F and t tests have different purposes.

The F test is used to determine whether a significant relationship exists between the dependent variable and the set of all the independent variables. The F test is referred to as the test for overall significance.

The t test is used to determine whether each of the individual independent variables is significant. A separate t test is conducted for each of the independent variables in the model. We refer to each of these t tests as a test for individual significance.


Testing for Significance: F Test - Overall Significance

Hypotheses

$H_0: \beta_1 = \beta_2 = \cdots = \beta_p = 0$

$H_1$: One or more of the parameters is not equal to zero.

Test Statistics

$$F = \frac{MSR}{MSE}$$

Rejection Rule

Reject $H_0$ if p-value $< \alpha$ or if $F > F_{\alpha}$,

where $F_{\alpha}$ is based on an F distribution with p d.f. in the numerator and n - p - 1 d.f. in the denominator.


Testing for Significance: t Test - Individual Parameters

Hypotheses

$H_0: \beta_i = 0$

$H_1: \beta_i \neq 0$

Test Statistics

$$t = \frac{b_i}{s_{b_i}}$$

Rejection Rule

Reject $H_0$ if p-value $< \alpha$ or $t < -t_{\alpha/2}$ or $t > t_{\alpha/2}$,

where $t_{\alpha/2}$ is based on a t distribution with n - p - 1 degrees of freedom.


Example:

An independent trucking company, the Butler Trucking Company, makes deliveries throughout southern California. The managers want to estimate the total daily travel time for their drivers. They believe the total daily travel time would be closely related to the number of miles traveled in making the deliveries.

a) Determine whether there is a relationship among the variables using $\alpha = 0.05$.

b) Use the t-test to determine the significance of each independent variable. What is your conclusion at the 0.05 level of significance?


Solution:

a) Hypothesis Statement:

$H_0: \beta_1 = \beta_2 = 0$

$H_1$: One or more of the parameters is not equal to zero

Test Statistics:

$F = 32.88$

Rejection Region:

$F_{0.05,\,2,\,7} = 4.74$

Since 32.88 > 4.74, we reject $H_0$ and conclude that there is a significant relationship between travel time (Y) and the two independent variables, miles traveled and number of deliveries.


Solution:

b) Hypothesis Statement:

$H_0: \beta_1 = 0$

$H_1: \beta_1 \neq 0$

Test Statistics:

$$t = \frac{0.061135}{0.009888} = 6.18$$

Rejection Region:

$t_{0.05/2,\,7} = 2.365$

Since 6.18 > 2.365, we reject $H_0$ and conclude that there is a significant relationship between travel time (Y) and miles traveled ($X_1$).


Solution:

b) Hypothesis Statement:

$H_0: \beta_2 = 0$

$H_1: \beta_2 \neq 0$

Test Statistics:

$$t = \frac{0.9234}{0.2211} = 4.18$$

Rejection Region:

$t_{0.05/2,\,7} = 2.365$

Since 4.18 > 2.365, we reject $H_0$ and conclude that there is a significant relationship between travel time (Y) and number of deliveries ($X_2$).
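The two individual t-tests follow the same pattern, so they can be run in a single loop. The Python sketch below is illustrative only; the coefficient estimates, standard errors and the critical value 2.365 are the figures quoted in the solution, while the dictionary layout and names are assumptions.

```python
# (estimate b_i, standard error s_bi) as quoted in the solution
coefficients = {
    "miles traveled (X1)": (0.061135, 0.009888),
    "number of deliveries (X2)": (0.9234, 0.2211),
}
t_critical = 2.365  # t_{0.025, 7}, as quoted in the solution

results = {}
for name, (estimate, std_error) in coefficients.items():
    t_stat = estimate / std_error            # t = b_i / s_bi
    reject_h0 = abs(t_stat) > t_critical     # two-sided rejection rule
    results[name] = (t_stat, reject_h0)
    print(f"{name}: t = {t_stat:.2f}, reject H0: {reject_h0}")
```

Both statistics exceed the critical value, so each independent variable is individually significant at the 0.05 level, in line with the conclusions above.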


End of Chapter 3
