IS 310 Business Statistics CSU Long Beach


description

IS 310 Business Statistics CSU Long Beach. Simple Linear Regression. Often, two variables are related. Examples: Amount of advertising expenses and amount of sales. Daily temperature and daily water consumption. Undergraduate GPA and starting salary of graduates.

Transcript of IS 310 Business Statistics CSU Long Beach

Page 1: IS 310 Business Statistics CSU  Long Beach

IS 310

Business Statistics

CSU Long Beach

Page 2: IS 310 Business Statistics CSU  Long Beach

Simple Linear Regression

Often, two variables are related.

Examples:

• Amount of advertising expenses and amount of sales.
• Daily temperature and daily water consumption.
• Undergraduate GPA and starting salary of graduates.
• Weight of automobiles and miles per gallon.

Page 3: IS 310 Business Statistics CSU  Long Beach

Simple Linear Regression

When two variables are related, one can be used to predict the value of the other.

Examples:

• Knowing the amount of advertising expenses, one can predict the amount of sales.
• Knowing the daily temperature, one can predict the amount of water consumption.

Page 4: IS 310 Business Statistics CSU  Long Beach

Simple Linear Regression

Managerial decisions often are based on the relationship between two or more variables.

Regression analysis can be used to develop an equation showing how the variables are related.

The variable being predicted is called the dependent variable and is denoted by y.

The variables being used to predict the value of the dependent variable are called the independent variables and are denoted by x.

Page 5: IS 310 Business Statistics CSU  Long Beach

Simple Linear Regression

Simple linear regression involves one independent variable and one dependent variable.

The relationship between the two variables is approximated by a straight line.

Regression analysis involving two or more independent variables is called multiple regression.

Page 6: IS 310 Business Statistics CSU  Long Beach

Simple Linear Regression Model

The equation that describes how y is related to x and an error term is called the regression model.

The simple linear regression model is:

$y = \beta_0 + \beta_1 x + \epsilon$

where:

$\beta_0$ and $\beta_1$ are called parameters of the model,

$\epsilon$ is a random variable called the error term.

Page 7: IS 310 Business Statistics CSU  Long Beach

Simple Linear Regression Equation

The simple linear regression equation is:

$E(y) = \beta_0 + \beta_1 x$

• The graph of the regression equation is a straight line.
• $\beta_0$ is the y intercept of the regression line.
• $\beta_1$ is the slope of the regression line.
• $E(y)$ is the expected value of y for a given x value.

Page 8: IS 310 Business Statistics CSU  Long Beach

Simple Linear Regression Equation

Positive Linear Relationship

[Graph: E(y) plotted against x. The regression line has intercept $\beta_0$ and rises from left to right; the slope $\beta_1$ is positive.]

Page 9: IS 310 Business Statistics CSU  Long Beach

Simple Linear Regression Equation

Negative Linear Relationship

[Graph: E(y) plotted against x. The regression line has intercept $\beta_0$ and falls from left to right; the slope $\beta_1$ is negative.]

Page 10: IS 310 Business Statistics CSU  Long Beach

Simple Linear Regression Equation

No Relationship

[Graph: E(y) plotted against x. The regression line is horizontal at the intercept $\beta_0$; the slope $\beta_1$ is 0.]

Page 11: IS 310 Business Statistics CSU  Long Beach

Estimated Simple Linear Regression Equation

The estimated simple linear regression equation is:

$\hat{y} = b_0 + b_1 x$

• The graph is called the estimated regression line.
• $b_0$ is the y intercept of the line.
• $b_1$ is the slope of the line.
• $\hat{y}$ is the estimated value of y for a given x value.

Page 12: IS 310 Business Statistics CSU  Long Beach

Estimation Process

• Regression model: $y = \beta_0 + \beta_1 x + \epsilon$. Regression equation: $E(y) = \beta_0 + \beta_1 x$. Unknown parameters: $\beta_0$, $\beta_1$.
• Sample data: pairs $(x_1, y_1), \ldots, (x_n, y_n)$.
• Estimated regression equation: $\hat{y} = b_0 + b_1 x$. Sample statistics: $b_0$, $b_1$.
• $b_0$ and $b_1$ provide estimates of $\beta_0$ and $\beta_1$.

Page 13: IS 310 Business Statistics CSU  Long Beach

Least Squares Method

Least Squares Criterion

$\min \sum (y_i - \hat{y}_i)^2$

where:

$y_i$ = observed value of the dependent variable for the ith observation

$\hat{y}_i$ = estimated value of the dependent variable for the ith observation

Page 14: IS 310 Business Statistics CSU  Long Beach

Least Squares Method

Slope for the Estimated Regression Equation

$b_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$

where:

$x_i$ = value of independent variable for ith observation

$y_i$ = value of dependent variable for ith observation

$\bar{x}$ = mean value for independent variable

$\bar{y}$ = mean value for dependent variable

Page 15: IS 310 Business Statistics CSU  Long Beach

Least Squares Method

y-Intercept for the Estimated Regression Equation

$b_0 = \bar{y} - b_1 \bar{x}$

Page 16: IS 310 Business Statistics CSU  Long Beach

Simple Linear Regression

Example: Reed Auto Sales

Reed Auto periodically has a special week-long sale. As part of the advertising campaign Reed runs one or more television commercials during the weekend preceding the sale. Data from a sample of 5 previous sales are shown on the next slide.

Page 17: IS 310 Business Statistics CSU  Long Beach

Simple Linear Regression

Example: Reed Auto Sales

Number of TV Ads (x)    Number of Cars Sold (y)
1                       14
3                       24
2                       18
1                       17
3                       27

$\sum x = 10$, $\sum y = 100$

$\bar{x} = 2$, $\bar{y} = 20$

Page 18: IS 310 Business Statistics CSU  Long Beach

Estimated Regression Equation

Slope for the Estimated Regression Equation

$b_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} = \frac{20}{4} = 5$

y-Intercept for the Estimated Regression Equation

$b_0 = \bar{y} - b_1 \bar{x} = 20 - 5(2) = 10$

Estimated Regression Equation

$\hat{y} = 10 + 5x$
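The slope and intercept above can be checked with a few lines of Python. This is a minimal sketch using only the Reed Auto data from the previous slide; the function name fit_line and the variable names are illustrative choices, not part of the course material.

```python
def fit_line(x, y):
    """Return (b0, b1) for the least squares line y_hat = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Numerator and denominator of the slope formula
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


tv_ads = [1, 3, 2, 1, 3]          # x: number of TV ads
cars_sold = [14, 24, 18, 17, 27]  # y: number of cars sold

b0, b1 = fit_line(tv_ads, cars_sold)
print(b0, b1)  # 10.0 5.0
```

The same function applies to any paired sample, as long as the x values are not all equal (otherwise the denominator is zero).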

Page 19: IS 310 Business Statistics CSU  Long Beach

Scatter Diagram and Trend Line

[Figure: scatter diagram of Cars Sold (vertical axis, 0 to 30) against TV Ads (horizontal axis, 0 to 4) with the trend line y = 5x + 10.]

Page 20: IS 310 Business Statistics CSU  Long Beach

Sample Problem

Problem # 4 (10-Page 553; 11-Page 570)

Variables are height (independent variable) and weight (dependent variable) of women swimmers.

x     y     x - x̄    y - ȳ    (x - x̄)(y - ȳ)    (x - x̄)²
68    132     3       15          45              9
64    108    -1       -9           9              1
62    102    -3      -15          45              9
65    115     0       -2           0              0
66    128     1       11          11              1

$\sum (x - \bar{x})(y - \bar{y}) = 110$

$\sum (x - \bar{x})^2 = 20$

$b_1 = 110/20 = 5.5$

$b_0 = \bar{y} - b_1 \bar{x} = 117 - 5.5(65) = -240.5$

The regression equation is: $\hat{y} = -240.5 + 5.5x$
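The worked table for the swimmers can be rebuilt column by column. The sketch below assumes nothing beyond the five (height, weight) pairs on the slide; the variable names are illustrative.

```python
heights = [68, 64, 62, 65, 66]       # x
weights = [132, 108, 102, 115, 128]  # y

x_bar = sum(heights) / len(heights)  # 65.0
y_bar = sum(weights) / len(weights)  # 117.0

# One row per observation: x, y, x - x_bar, y - y_bar, product, squared deviation
rows = []
for x, y in zip(heights, weights):
    dx, dy = x - x_bar, y - y_bar
    rows.append((x, y, dx, dy, dx * dy, dx ** 2))

sum_products = sum(row[4] for row in rows)  # 110.0
sum_squares = sum(row[5] for row in rows)   # 20.0

b1 = sum_products / sum_squares  # 5.5
b0 = y_bar - b1 * x_bar          # -240.5

for row in rows:
    print(*row)
print("b1 =", b1, "b0 =", b0)
```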

Page 21: IS 310 Business Statistics CSU  Long Beach

Sample Problem Continued

The regression equation is:

$\hat{y} = -10.16 + 0.184x$

with salary x and bonus $\hat{y}$ both measured in thousands of dollars.

For x = 120, $\hat{y} = -10.16 + 0.184(120) = 11.92$

The bonus of a vice president whose salary is $120,000 is predicted to be $11,920.
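As a small sketch of how an estimated regression equation is used for prediction, the function below plugs a salary into the equation from this slide. The function name predict_bonus is hypothetical, and both quantities are taken to be in thousands of dollars, as the slide's numbers imply.

```python
def predict_bonus(salary_thousands, b0=-10.16, b1=0.184):
    """Predicted bonus (in $1000s) from the slide's estimated regression equation."""
    return b0 + b1 * salary_thousands


bonus = predict_bonus(120)
print(round(bonus, 2))  # 11.92, i.e. $11,920
```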

Page 22: IS 310 Business Statistics CSU  Long Beach

Coefficient of Determination

In regression analysis, the following natural question arises:

How well does the regression equation forecast the actual data?

This question is answered by a quantity that we call the "Coefficient of Determination".

Page 23: IS 310 Business Statistics CSU  Long Beach

Coefficient of Determination

To understand the Coefficient of Determination, one has to understand the following three quantities:

SSE = Sum of Squares due to Error = $\sum (y - \hat{y})^2$

SST = Sum of Squares Total = $\sum (y - \bar{y})^2$

SSR = Sum of Squares due to Regression = $\sum (\hat{y} - \bar{y})^2$

Page 24: IS 310 Business Statistics CSU  Long Beach

Coefficient of Determination

Relationship Among SST, SSR, SSE

SST = SSR + SSE

$\sum (y_i - \bar{y})^2 = \sum (\hat{y}_i - \bar{y})^2 + \sum (y_i - \hat{y}_i)^2$

where:

SST = total sum of squares

SSR = sum of squares due to regression

SSE = sum of squares due to error

Page 25: IS 310 Business Statistics CSU  Long Beach

Coefficient of Determination

The coefficient of determination is:

$r^2 = SSR/SST$

where:

SSR = sum of squares due to regression

SST = total sum of squares

Page 26: IS 310 Business Statistics CSU  Long Beach

Coefficient of Determination

$r^2 = SSR/SST = 100/114 = .8772$

The regression relationship is very strong; 87.7% of the variability in the number of cars sold can be explained by the linear relationship between the number of TV ads and the number of cars sold.
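The three sums of squares behind this result can be reproduced from the Reed Auto data and the estimated equation with intercept 10 and slope 5. The following is a minimal sketch; the variable names are illustrative.

```python
tv_ads = [1, 3, 2, 1, 3]
cars_sold = [14, 24, 18, 17, 27]
b0, b1 = 10.0, 5.0  # estimated regression equation from the Reed Auto example

y_bar = sum(cars_sold) / len(cars_sold)
y_hat = [b0 + b1 * x for x in tv_ads]  # fitted values

sse = sum((y - yh) ** 2 for y, yh in zip(cars_sold, y_hat))  # error
sst = sum((y - y_bar) ** 2 for y in cars_sold)               # total
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)                 # regression

r_squared = ssr / sst
print(sse, sst, ssr, round(r_squared, 4))  # 14.0 114.0 100.0 0.8772
```

Here SSR + SSE equals SST exactly (100 + 14 = 114), as the relationship among the three quantities requires.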

Page 27: IS 310 Business Statistics CSU  Long Beach

Sample Correlation Coefficient

$r_{xy} = (\text{sign of } b_1)\sqrt{\text{Coefficient of Determination}}$

$r_{xy} = (\text{sign of } b_1)\sqrt{r^2}$

where:

$b_1$ = the slope of the estimated regression equation $\hat{y} = b_0 + b_1 x$

Page 28: IS 310 Business Statistics CSU  Long Beach

Sample Correlation Coefficient

$r_{xy} = (\text{sign of } b_1)\sqrt{r^2}$

The sign of $b_1$ in the equation $\hat{y} = 10 + 5x$ is "+".

$r_{xy} = +\sqrt{.8772}$

$r_{xy} = +.9366$
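As a cross-check, the sketch below computes the correlation coefficient two ways for the Reed Auto data: from the coefficient of determination with the sign of the slope attached, and directly from the deviations using the standard sample correlation formula (which the slides do not show). Both routes are expected to agree.

```python
import math

tv_ads = [1, 3, 2, 1, 3]
cars_sold = [14, 24, 18, 17, 27]
b1 = 5.0               # slope of the estimated regression equation
r_squared = 100 / 114  # SSR / SST

# Route 1: square root of r^2, carrying the sign of b1
r_from_r2 = math.copysign(math.sqrt(r_squared), b1)

# Route 2: standard sample correlation formula
n = len(tv_ads)
x_bar = sum(tv_ads) / n
y_bar = sum(cars_sold) / n
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(tv_ads, cars_sold))
sxx = sum((x - x_bar) ** 2 for x in tv_ads)
syy = sum((y - y_bar) ** 2 for y in cars_sold)
r_direct = sxy / math.sqrt(sxx * syy)

print(round(r_from_r2, 4), round(r_direct, 4))  # 0.9366 0.9366
```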

Page 29: IS 310 Business Statistics CSU  Long Beach

Testing for Significance

To test for a significant regression relationship, we must conduct a hypothesis test to determine whether the value of $\beta_1$ is zero.

Two tests are commonly used: the t Test and the F Test.

Both the t test and F test require an estimate of $\sigma^2$, the variance of $\epsilon$ in the regression model.

Page 30: IS 310 Business Statistics CSU  Long Beach

Testing for Significance

An Estimate of $\sigma^2$

The mean square error (MSE) provides the estimate of $\sigma^2$, and the notation $s^2$ is also used.

$s^2 = MSE = SSE/(n - 2)$

where:

$SSE = \sum (y_i - \hat{y}_i)^2 = \sum (y_i - b_0 - b_1 x_i)^2$

Page 31: IS 310 Business Statistics CSU  Long Beach

Testing for Significance

An Estimate of $\sigma$

• To estimate $\sigma$ we take the square root of $s^2$, the estimate of $\sigma^2$.

• The resulting s is called the standard error of the estimate.

$s = \sqrt{MSE} = \sqrt{\frac{SSE}{n - 2}}$
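For the Reed Auto example, SSE follows from the earlier results as SST - SSR = 114 - 100 = 14, with n = 5 observations. A minimal sketch of the two estimates (variable names are illustrative):

```python
import math

sse = 14.0  # SST - SSR = 114 - 100 for the Reed Auto example
n = 5       # number of observations

mse = sse / (n - 2)  # s^2, the estimate of sigma^2
s = math.sqrt(mse)   # standard error of the estimate

print(round(mse, 4), round(s, 4))  # 4.6667 2.1602
```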

Page 32: IS 310 Business Statistics CSU  Long Beach

Testing for Significance: t Test

Hypotheses

$H_0: \beta_1 = 0$

$H_a: \beta_1 \neq 0$

Test Statistic

$t = \frac{b_1}{s_{b_1}}$

where

$s_{b_1} = \frac{s}{\sqrt{\sum (x_i - \bar{x})^2}}$

Page 33: IS 310 Business Statistics CSU  Long Beach

Testing for Significance: t Test

Rejection Rule

Reject $H_0$ if $p$-value $\leq \alpha$ or $t \leq -t_{\alpha/2}$ or $t \geq t_{\alpha/2}$

where:

$t_{\alpha/2}$ is based on a t distribution with n - 2 degrees of freedom

Page 34: IS 310 Business Statistics CSU  Long Beach

Testing for Significance: t Test

1. Determine the hypotheses.

$H_0: \beta_1 = 0$

$H_a: \beta_1 \neq 0$

2. Specify the level of significance.

$\alpha = .05$

3. Select the test statistic.

$t = \frac{b_1}{s_{b_1}}$

4. State the rejection rule.

Reject $H_0$ if $p$-value $\leq .05$ or $|t| > 3.182$ (with 3 degrees of freedom)

Page 35: IS 310 Business Statistics CSU  Long Beach

Testing for Significance: t Test

5. Compute the value of the test statistic.

$t = \frac{b_1}{s_{b_1}} = \frac{5}{1.08} = 4.63$

6. Determine whether to reject $H_0$.

t = 4.541 provides an area of .01 in the upper tail. Because the computed t = 4.63 is larger than 4.541, the upper-tail area is less than .01, and the two-tailed $p$-value is less than 2(.01) = .02. (Also, t = 4.63 > 3.182.) We can reject $H_0$.
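The six steps can be traced numerically for the Reed Auto example. This sketch takes the critical value 3.182 from the slide. For the p-value it uses a closed-form expression for the t distribution that holds only for exactly 3 degrees of freedom; that formula is an addition here, not something the slides present, and the helper name t_cdf_3df is hypothetical.

```python
import math

tv_ads = [1, 3, 2, 1, 3]
n = len(tv_ads)
b1 = 5.0    # estimated slope
sse = 14.0  # sum of squares due to error

x_bar = sum(tv_ads) / n
sxx = sum((x - x_bar) ** 2 for x in tv_ads)  # 4.0

s = math.sqrt(sse / (n - 2))  # standard error of the estimate
s_b1 = s / math.sqrt(sxx)     # estimated standard deviation of b1
t = b1 / s_b1                 # test statistic

t_critical = 3.182            # from the slide: alpha = .05, 3 degrees of freedom
reject_h0 = abs(t) > t_critical


def t_cdf_3df(value):
    """CDF of Student's t for exactly 3 degrees of freedom (closed form)."""
    u = value / math.sqrt(3)
    return 0.5 + (u / (1 + u * u) + math.atan(u)) / math.pi


p_value = 2 * (1 - t_cdf_3df(abs(t)))  # two-tailed

print(round(s_b1, 2), round(t, 2), reject_h0)  # 1.08 4.63 True
print(p_value < 0.02)                          # True
```

For other sample sizes the closed form does not apply; a t table or a statistics library would be needed instead.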

Page 36: IS 310 Business Statistics CSU  Long Beach

Sample Problem

Problem # 18 (10-Page 564; 11-Page 581)

x      y       ŷ       (y - ŷ)²    (y - ȳ)²    (ŷ - ȳ)²
2.6    3300    3301          1      122500      121801
3.4    3600    3766      27556        2500       13456
3.6    4000    3882      13924      122500       53824
3.2    3500    3650      22500       22500           0
3.5    3900    3824       5776       62500       30276
2.9    3600    3476      15376        2500       30276
Total                    85133      335000      249633

Page 37: IS 310 Business Statistics CSU  Long Beach

Sample Problem Continued

SSE = 85133

SST = 335000

SSR = 249633

Coefficient of Determination = $r^2$ = SSR/SST = 0.745

Correlation Coefficient = $r = \sqrt{r^2}$ = 0.86

There is a fairly good fit between the actual monthly salaries and the GPAs of students. There exists a strong relationship between these variables.
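The column totals and the two coefficients can be recomputed from the y and ŷ columns of the table. The sketch below uses the fitted values exactly as printed on the slide (whole dollars); the variable names are illustrative.

```python
import math

salary = [3300, 3600, 4000, 3500, 3900, 3600]  # y: monthly salary
fitted = [3301, 3766, 3882, 3650, 3824, 3476]  # y_hat as printed in the table

y_bar = sum(salary) / len(salary)  # 3650.0

sse = sum((y - yh) ** 2 for y, yh in zip(salary, fitted))
sst = sum((y - y_bar) ** 2 for y in salary)
ssr = sum((yh - y_bar) ** 2 for yh in fitted)

r_squared = ssr / sst
r = math.sqrt(r_squared)  # positive root, as on the slide

print(sse, sst, ssr)                       # 85133 335000.0 249633.0
print(round(r_squared, 3), round(r, 2))    # 0.745 0.86
```

Note that SSR + SSE = 334766 here rather than exactly 335000; the small gap presumably comes from the fitted values being rounded to whole dollars.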

Page 38: IS 310 Business Statistics CSU  Long Beach

End of Chapter 14