Post on 05-Jan-2016
Slide 1
IS 310 – Business Statistics
CSU Long Beach
Slide 2
Simple Linear Regression
Often, two variables are related.
Examples:
• Amount of advertising expenses and amount of sales.
• Daily temperature and daily water consumption.
• Undergraduate GPA and starting salary of graduates.
• Weight of automobiles and miles per gallon.
Slide 3
Simple Linear Regression
When two variables are related, one can be used to predict the value of the other.
Examples:
• Knowing the amount of advertising expenses, one can predict the amount of sales.
• Knowing the daily temperature, one can predict the amount of water consumption.
Slide 4
Simple Linear Regression
• Regression analysis can be used to develop an equation showing how the variables are related.
• Managerial decisions often are based on the relationship between two or more variables.
• The variable being predicted is called the dependent variable and is denoted by y.
• The variables being used to predict the value of the dependent variable are called the independent variables and are denoted by x.
Slide 5
Simple Linear Regression
• The relationship between the two variables is approximated by a straight line.
• Simple linear regression involves one independent variable and one dependent variable.
• Regression analysis involving two or more independent variables is called multiple regression.
Slide 6
Simple Linear Regression Model
The equation that describes how y is related to x and an error term is called the regression model.
The simple linear regression model is:
$y = \beta_0 + \beta_1 x + \varepsilon$
where:
• $\beta_0$ and $\beta_1$ are called parameters of the model,
• $\varepsilon$ is a random variable called the error term.
Slide 7
Simple Linear Regression Equation
The simple linear regression equation is:
$E(y) = \beta_0 + \beta_1 x$
• Graph of the regression equation is a straight line.
• $\beta_0$ is the y intercept of the regression line.
• $\beta_1$ is the slope of the regression line.
• $E(y)$ is the expected value of y for a given x value.
Slide 8
Simple Linear Regression Equation
Positive Linear Relationship
[Figure: regression line of E(y) against x with intercept $\beta_0$; the slope $\beta_1$ is positive.]
Slide 9
Simple Linear Regression Equation
Negative Linear Relationship
[Figure: regression line of E(y) against x with intercept $\beta_0$; the slope $\beta_1$ is negative.]
Slide 10
Simple Linear Regression Equation
No Relationship
[Figure: horizontal regression line of E(y) against x with intercept $\beta_0$; the slope $\beta_1$ is 0.]
Slide 11
Estimated Simple Linear Regression Equation
The estimated simple linear regression equation is:
$\hat{y} = b_0 + b_1 x$
• The graph is called the estimated regression line.
• $b_0$ is the y intercept of the line.
• $b_1$ is the slope of the line.
• $\hat{y}$ is the estimated value of y for a given x value.
Slide 12
Estimation Process
1. Regression model: $y = \beta_0 + \beta_1 x + \varepsilon$
   Regression equation: $E(y) = \beta_0 + \beta_1 x$
   Unknown parameters: $\beta_0$, $\beta_1$
2. Sample data: paired observations $(x_1, y_1), \ldots, (x_n, y_n)$
3. Estimated regression equation: $\hat{y} = b_0 + b_1 x$
   Sample statistics: $b_0$, $b_1$
4. $b_0$ and $b_1$ provide estimates of $\beta_0$ and $\beta_1$.
Slide 13
Least Squares Method
Least Squares Criterion
$\min \sum (y_i - \hat{y}_i)^2$
where:
$y_i$ = observed value of the dependent variable for the ith observation
$\hat{y}_i$ = estimated value of the dependent variable for the ith observation
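As a rough illustration of the criterion, the sketch below evaluates the sum of squared deviations for two candidate lines. The function name `sum_squared_errors` and the toy data are assumptions made for this example; they do not come from the course material.

```python
def sum_squared_errors(xs, ys, b0, b1):
    """Sum of (y_i - y_hat_i)^2 for the candidate line y_hat = b0 + b1 * x."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

# Toy data (hypothetical), chosen only to illustrate the criterion.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 10]

# Two candidate lines; the least squares method prefers the smaller total.
sse_a = sum_squared_errors(xs, ys, b0=1.0, b1=2.0)
sse_b = sum_squared_errors(xs, ys, b0=0.5, b1=2.3)
print(sse_a, sse_b)
```

Line b is the least squares line for this toy data, so no other choice of b0 and b1 gives a smaller total.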
Slide 14
Least Squares Method
Slope for the Estimated Regression Equation
$b_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$
where:
$x_i$ = value of independent variable for ith observation
$y_i$ = value of dependent variable for ith observation
$\bar{x}$ = mean value for independent variable
$\bar{y}$ = mean value for dependent variable
Slide 15
Least Squares Method
y-Intercept for the Estimated Regression Equation
$b_0 = \bar{y} - b_1 \bar{x}$
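The slope and intercept formulas can be sketched in a few lines of Python. The helper name `least_squares_fit` and the sample inputs are hypothetical, chosen for illustration only.

```python
def least_squares_fit(xs, ys):
    """Return (b0, b1) for the estimated regression line y_hat = b0 + b1 * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Numerator and denominator of the slope formula.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Hypothetical points that lie exactly on y = 1 + 2x.
b0, b1 = least_squares_fit([1, 2, 3, 4], [3, 5, 7, 9])
print(b0, b1)
```

The function assumes at least two distinct x values; otherwise the denominator of the slope formula is zero.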
Slide 16
Simple Linear Regression
Example: Reed Auto Sales
Reed Auto periodically has a special week-long sale. As part of the advertising campaign Reed runs one or more television commercials during the weekend preceding the sale. Data from a sample of 5 previous sales are shown on the next slide.
Slide 17
Simple Linear Regression
Example: Reed Auto Sales
Number of TV Ads (x)    Number of Cars Sold (y)
         1                        14
         3                        24
         2                        18
         1                        17
         3                        27
$\sum x = 10$    $\sum y = 100$
$\bar{x} = 2$    $\bar{y} = 20$
Slide 18
Estimated Regression Equation
Slope for the Estimated Regression Equation
$b_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2} = \frac{20}{4} = 5$
y-Intercept for the Estimated Regression Equation
$b_0 = \bar{y} - b_1 \bar{x} = 20 - 5(2) = 10$
Estimated Regression Equation
$\hat{y} = 10 + 5x$
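The Reed Auto arithmetic can be checked step by step. This is a minimal sketch using the five observations from the data slide; the variable names are assumptions for illustration.

```python
# Reed Auto sample: TV ads (x) and cars sold (y) for 5 previous sales.
ads = [1, 3, 2, 1, 3]
cars = [14, 24, 18, 17, 27]

x_bar = sum(ads) / len(ads)    # mean number of ads
y_bar = sum(cars) / len(cars)  # mean number of cars sold

# Deviations from the means for each observation.
rows = [(x - x_bar, y - y_bar) for x, y in zip(ads, cars)]
sxy = sum(dx * dy for dx, dy in rows)  # numerator of the slope formula
sxx = sum(dx ** 2 for dx, _ in rows)   # denominator of the slope formula

b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
print(f"y_hat = {b0:g} + {b1:g}x")
```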
Slide 19
Scatter Diagram and Trend Line
[Figure: scatter diagram of Cars Sold (vertical axis, 0 to 30) against TV Ads (horizontal axis, 0 to 4), with the trend line y = 5x + 10.]
Slide 20
Sample Problem
Problem # 4 (10-Page 553; 11-Page 570)
Variables are height (independent variable) and weight (dependent variable) of women swimmers.
 x     y    $x - \bar{x}$   $y - \bar{y}$   $(x - \bar{x})(y - \bar{y})$   $(x - \bar{x})^2$
68   132         3              15                  45                       9
64   108        -1              -9                   9                       1
62   102        -3             -15                  45                       9
65   115         0              -2                   0                       0
66   128         1              11                  11                       1
$\sum (x - \bar{x})(y - \bar{y}) = 110$
$\sum (x - \bar{x})^2 = 20$
$b_1 = 110/20 = 5.5$
$b_0 = \bar{y} - b_1 \bar{x} = 117 - 5.5(65) = -240.5$
The regression equation is: $\hat{y} = -240.5 + 5.5x$
Slide 21
Sample Problem Continued
The regression equation is:
$\hat{y} = -10.16 + 0.184x$
For x = 120, $\hat{y} = -10.16 + 0.184(120) = 11.92$
With salary and bonus both measured in thousands of dollars, the bonus of a vice president whose salary is $120,000 is predicted to be $11,920.
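A prediction from an estimated regression equation is a single evaluation of the line. The sketch below uses the coefficients stated on this slide; the function name `predict_bonus` is hypothetical, and the units (thousands of dollars) are inferred from the $120,000 and $11,920 figures.

```python
def predict_bonus(salary_thousands, b0=-10.16, b1=0.184):
    """Predicted bonus (in $1000s) from the line y_hat = b0 + b1 * x."""
    return b0 + b1 * salary_thousands

bonus = predict_bonus(120)  # salary of $120,000
print(round(bonus, 2))
```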
Slide 22
Coefficient of Determination
In regression analysis, the following natural question arises:
How well does the regression equation forecast the actual data?
This question is answered by a quantity that we call the "Coefficient of Determination".
Slide 23
Coefficient of Determination
To understand the Coefficient of Determination, one has to understand the following three quantities:
SSE = Sum of Squares due to Error = $\sum (y - \hat{y})^2$
SST = Sum of Squares Total = $\sum (y - \bar{y})^2$
SSR = Sum of Squares due to Regression = $\sum (\hat{y} - \bar{y})^2$
Slide 24
Coefficient of Determination
Relationship Among SST, SSR, SSE
SST = SSR + SSE
$\sum (y_i - \bar{y})^2 = \sum (\hat{y}_i - \bar{y})^2 + \sum (y_i - \hat{y}_i)^2$
where:
SST = total sum of squares
SSR = sum of squares due to regression
SSE = sum of squares due to error
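The identity can be checked numerically on the Reed Auto data with the fitted line from the earlier slides (intercept 10, slope 5). This is a small sketch; the variable names are assumptions for illustration.

```python
# Reed Auto sample and its estimated regression line y_hat = 10 + 5x.
ads = [1, 3, 2, 1, 3]
cars = [14, 24, 18, 17, 27]
y_hat = [10 + 5 * x for x in ads]
y_bar = sum(cars) / len(cars)

sse = sum((y - yh) ** 2 for y, yh in zip(cars, y_hat))  # error
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)            # regression
sst = sum((y - y_bar) ** 2 for y in cars)               # total

print(sse, ssr, sst)
```

The identity holds exactly only when the fitted values come from the least squares line; for any other line the cross term does not vanish.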
Slide 25
Coefficient of Determination
The coefficient of determination is:
$r^2 = SSR/SST$
where:
SSR = sum of squares due to regression
SST = total sum of squares
Slide 26
Coefficient of Determination
$r^2 = SSR/SST = 100/114 = .8772$
The regression relationship is very strong; 87.7% of the variability in the number of cars sold can be explained by the linear relationship between the number of TV ads and the number of cars sold.
Slide 27
Sample Correlation Coefficient
$r_{xy} = (\text{sign of } b_1)\sqrt{\text{Coefficient of Determination}}$
$r_{xy} = (\text{sign of } b_1)\sqrt{r^2}$
where:
$b_1$ = the slope of the estimated regression equation $\hat{y} = b_0 + b_1 x$
Slide 28
Sample Correlation Coefficient
$r_{xy} = (\text{sign of } b_1)\sqrt{r^2}$
The sign of $b_1$ in the equation $\hat{y} = 10 + 5x$ is "+".
$r_{xy} = +\sqrt{.8772}$
$r_{xy} = +.9366$
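As a cross-check, the sketch below computes the correlation coefficient for the Reed Auto data in two ways: from the sign of the slope and the square root of the coefficient of determination, and directly from the sums of squared deviations. The second route is the standard Pearson formula, which these slides do not show; the variable names are assumptions.

```python
import math

ads = [1, 3, 2, 1, 3]
cars = [14, 24, 18, 17, 27]
x_bar = sum(ads) / len(ads)
y_bar = sum(cars) / len(cars)

sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(ads, cars))
sxx = sum((x - x_bar) ** 2 for x in ads)
syy = sum((y - y_bar) ** 2 for y in cars)  # same quantity as SST

b1 = sxy / sxx
r_squared = (b1 ** 2 * sxx) / syy  # SSR / SST, since SSR = b1^2 * Sxx

# Route 1: sign of the slope times the square root of r^2.
r_from_r2 = math.copysign(math.sqrt(r_squared), b1)
# Route 2: Pearson formula (not shown on the slides).
r_direct = sxy / math.sqrt(sxx * syy)

print(round(r_from_r2, 4), round(r_direct, 4))
```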
Slide 29
Testing for Significance
To test for a significant regression relationship, we must conduct a hypothesis test to determine whether the value of $\beta_1$ is zero.
Two tests are commonly used: the t Test and the F Test.
Both the t test and F test require an estimate of $\sigma^2$, the variance of $\varepsilon$ in the regression model.
Slide 30
Testing for Significance
An Estimate of $\sigma^2$
The mean square error (MSE) provides the estimate of $\sigma^2$, and the notation $s^2$ is also used.
$s^2 = MSE = SSE/(n - 2)$
where:
$SSE = \sum (y_i - \hat{y}_i)^2 = \sum (y_i - b_0 - b_1 x_i)^2$
Slide 31
Testing for Significance
An Estimate of $\sigma$
• To estimate $\sigma$ we take the square root of $s^2$.
• The resulting s is called the standard error of the estimate.
$s = \sqrt{MSE} = \sqrt{\frac{SSE}{n - 2}}$
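A short sketch of the two estimates follows. The helper name is hypothetical, and the input SSE = 14 for the Reed Auto data is derived here as SST - SSR = 114 - 100 rather than quoted from a slide.

```python
import math

def standard_error_of_estimate(sse, n):
    """Return (MSE, s) for a simple linear regression with n observations."""
    mse = sse / (n - 2)  # n - 2 because two parameters, b0 and b1, are estimated
    return mse, math.sqrt(mse)

# Reed Auto: SSE = 14 (derived above), n = 5 observations.
mse, s = standard_error_of_estimate(sse=14, n=5)
print(round(mse, 3), round(s, 3))
```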
Slide 32
Testing for Significance: t Test
Hypotheses
$H_0: \beta_1 = 0$
$H_a: \beta_1 \neq 0$
Test Statistic
$t = \frac{b_1}{s_{b_1}}$
where
$s_{b_1} = \frac{s}{\sqrt{\sum (x_i - \bar{x})^2}}$
Slide 33
Testing for Significance: t Test
Rejection Rule
Reject $H_0$ if p-value $\le \alpha$ or $t \le -t_{\alpha/2}$ or $t \ge t_{\alpha/2}$
where:
$t_{\alpha/2}$ is based on a t distribution with n - 2 degrees of freedom
Slide 34
Testing for Significance: t Test
1. Determine the hypotheses.
$H_0: \beta_1 = 0$
$H_a: \beta_1 \neq 0$
2. Specify the level of significance.
$\alpha = .05$
3. Select the test statistic.
$t = \frac{b_1}{s_{b_1}}$
4. State the rejection rule.
Reject $H_0$ if p-value $\le .05$ or |t| > 3.182 (with 3 degrees of freedom)
Slide 35
Testing for Significance: t Test
5. Compute the value of the test statistic.
$t = \frac{b_1}{s_{b_1}} = \frac{5}{1.08} = 4.63$
6. Determine whether to reject $H_0$.
With 3 degrees of freedom, t = 4.541 provides an area of .01 in the upper tail. Because 4.63 > 4.541, the two-tailed p-value is less than .02. (Also, t = 4.63 > 3.182.) We can reject $H_0$.
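The six steps can be traced in code for the Reed Auto data. This is a sketch under stated assumptions: the critical value 3.182 is taken from the rejection rule on the slide rather than computed, and all variable names are illustrative.

```python
import math

ads = [1, 3, 2, 1, 3]
cars = [14, 24, 18, 17, 27]
n = len(ads)
x_bar = sum(ads) / n
y_bar = sum(cars) / n

sxx = sum((x - x_bar) ** 2 for x in ads)
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(ads, cars)) / sxx
b0 = y_bar - b1 * x_bar

# Standard error of the estimate, then standard error of the slope.
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(ads, cars))
s = math.sqrt(sse / (n - 2))
s_b1 = s / math.sqrt(sxx)

t = b1 / s_b1
T_CRITICAL = 3.182  # two-tailed, alpha = .05, 3 degrees of freedom (from the slide)
reject_h0 = abs(t) > T_CRITICAL

print(round(s_b1, 2), round(t, 2), reject_h0)
```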
Slide 36
Sample Problem
Problem # 18 (10-Page 564; 11-Page 581)
x = GPA, y = monthly salary
  x      y     $\hat{y}$   $(y - \hat{y})^2$   $(y - \bar{y})^2$   $(\hat{y} - \bar{y})^2$
 2.6   3300     3301             1              122500              121801
 3.4   3600     3766         27556                2500               13456
 3.6   4000     3882         13924              122500               53824
 3.2   3500     3650         22500               22500                   0
 3.5   3900     3824          5776               62500               30276
 2.9   3600     3476         15376                2500               30276
Total                        85133              335000              249633
Slide 37
Sample Problem Continued
SSE = 85133
SST = 335000
SSR = 249633
Coefficient of Determination = $r^2$ = SSR/SST = 0.745
Correlation Coefficient = $r = \sqrt{r^2}$ = 0.86
The regression equation fits the data fairly well: about 74.5% of the variability in monthly salaries is explained by the students' GPAs, which indicates a strong relationship between these variables.
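The totals for this problem can be reproduced from the y and ŷ columns of the table on the previous slide. A minimal sketch with illustrative variable names:

```python
import math

# Observed monthly salaries and rounded fitted values from the table.
salary = [3300, 3600, 4000, 3500, 3900, 3600]
salary_hat = [3301, 3766, 3882, 3650, 3824, 3476]
y_bar = sum(salary) / len(salary)

sse = sum((y - yh) ** 2 for y, yh in zip(salary, salary_hat))
sst = sum((y - y_bar) ** 2 for y in salary)
ssr = sum((yh - y_bar) ** 2 for yh in salary_hat)

r_squared = ssr / sst
r = math.sqrt(r_squared)  # positive root, since y_hat rises with GPA in the table

print(sse, sst, ssr, round(r_squared, 3), round(r, 2))
```

Because the fitted values in the table are rounded to whole dollars, SSR + SSE comes to 334766 rather than exactly 335000.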
Slide 38
End of Chapter 14