1
Multiple Regression Analysisand Model Building
Department of Industrial EngineeringInstitut Teknologi Sepuluh Nopember
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
2
Chapter GoalsAfter completing this chapter, you should be
able to:understand model building using multipleregression analysisapply multiple regression analysis to businessdecision-making situationstest the significance of the independentvariables in a multiple regression model
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
3
Chapter GoalsAfter completing this chapter, you should be
able to:use variable transformations to model nonlinearrelationshipsrecognize potential problems in multipleregression analysis and take the steps to correctthe problems.incorporate qualitative variables into theregression model by using dummy variables.
(continued)
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
4
The Multiple Regression ModelIdea: Examine the linear relationship between
1 dependent (y) & 2 or more independent variables (xi)
xxxy kk22110
kk22110 xbxbxbby
Population model:
Y-interceptPopulation slopes Random Error
Estimated(or predicted)value of y
Estimated slope coefficients
Estimated multiple regression model:
Estimatedintercept
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
5
Multiple Regression ModelTwo variable model
y
x1
x2
22110 xbxbby
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
6
Multiple Regression ModelTwo variable model
The best fit equation, y ,is found by minimizing thesum of squared errors, e2
<
y
x1i
x1
x2
22110 xbxbbyyi
yi
<
e = (y – y)<
x2i
Observasi sampel
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
7
Multiple Regression Assumptions
The errors are normally distributedThe mean of the errors is zeroErrors have a constant varianceThe model errors are independent
e = (y – y)
<
Errors (residuals) from the regressionmodel:
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
8
Model Specification
Decide what you want to do and selectthe dependent variable
Determine the potential independentvariables for your model
Gather sample data (observations) forall variables
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
9
The Correlation MatrixCorrelation between the dependentvariable and selected independentvariables can be found using Excel:
Tools / Data Analysis… / Correlation
Can check for statistical significance ofcorrelation with a t test
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
10
ExampleA distributor of frozen desert pies wants toevaluate factors thought to influence demand
Dependent variable: Pie sales (units perweek)Independent variables: Price (in $)
Advertising ($100’s)
Data is collected for 15 weeks
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
11
Pie Sales Model
Sales = b0 + b1 (Price)+ b2 (Advertising)
•Week•Pie
Sales•Price•($)
•Advertising•($100s)
1 350 5.50 3.3
2 460 7.50 3.3
3 350 8.00 3.0
4 430 8.00 4.5
5 350 6.80 3.0
6 380 7.50 4.0
7 430 4.50 3.0
8 470 6.40 3.7
9 450 7.00 3.5
10 490 5.00 4.0
11 340 7.20 3.5
12 300 7.90 3.2
13 440 5.90 4.0
14 450 5.00 3.5
15 300 7.00 2.7
Pie Sales Price Advertising
Pie Sales 1
Price -0.44327 1
Advertising 0.55632 0.03044 1
Correlation matrix:
Multiple regression model:
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
12
Interpretation of EstimatedCoefficients
Slope (bi)Estimates that the average value of y changes by biunits for each 1 unit increase in Xi holding all othervariables constantExample: if b1 = -20, then sales (y) is expected todecrease by an estimated 20 pies per week for each$1 increase in selling price (x1), net of the effects ofchanges due to advertising (x2)
y-intercept (b0)The estimated average value of y when all xi = 0(assuming all xi = 0 is within the range of observedvalues)
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
13
Pie Sales Correlation Matrix
Price vs. Sales : r = -0.44327There is a negative association betweenprice and sales
Advertising vs. Sales : r = 0.55632There is a positive association betweenadvertising and sales
• •Pie Sales •Price•Advertisi
ng•Pie Sales •1•Price •-0.44327 •1•Advertising •0.55632 •0.03044 •1
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
14
Scatter DiagramsSales vs. Price
0
100
200
300
400
500
600
0 2 4 6 8 10
Sales vs. Advertising
0
100
200
300
400
500
600
0 1 2 3 4 5
Sales
Sales
Price
Advertising
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
15
Multiple Regression Output•Regression Statistics
Multiple R 0.72213
R Square 0.52148
Adjusted R Square 0.44172
Standard Error 47.46341
Observations 15
ANOVA df SS MS F Significance F
Regression 2 29460.027 14730.013 6.53861 0.01201
Residual 12 27033.306 2252.776
Total 14 56493.333
Coefficients Standard Error t Stat P-value Lower 95% Upper 95%
Intercept 306.52619 114.25389 2.68285 0.01993 57.58835 555.46404
Price -24.97509 10.83213 -2.30565 0.03979 -48.57626 -1.37392
Advertising 74.13096 25.96732 2.85478 0.01449 17.55303 130.70888
ertising)74.131(Advce)24.975(Pri-306.526Sales
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
16
The Multiple Regression Equation
ertising)74.131(Advce)24.975(Pri-306.526Sales
b1 = -24.975: saleswill decrease, onaverage, by 24.975pies per week foreach $1 increase inselling price, net ofthe effects of changesdue to advertising
b2 = 74.131: sales willincrease, on average,by 74.131 pies perweek for each $100increase inadvertising, net of theeffects of changesdue to price
whereSales is in number of pies per weekPrice is in $Advertising is in $100’s.
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
17
Using The Model to MakePredictions
Predict sales for a week in which the sellingprice is $5.50 and advertising is $350:
Predicted salesis 428.62 pies
428.62
(3.5)74.131(5.50)24.975-306.526
ertising)74.131(Advce)24.975(Pri-306.526Sales
Note that Advertising isin $100’s, so $350means that x2 = 3.5
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
18
Multiple Coefficient ofDetermination
Reports the proportion of total variationin y explained by all x variables takentogether
squaresofsumTotalregressionsquaresofSum
SSTSSRR2
SSTSSER 12
atau
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
19
Regression Statistics
Multiple R 0.72213
R Square 0.52148
Adjusted R Square 0.44172
Standard Error 47.46341
Observations 15
ANOVA df SS MS F Significance F
Regression 2 29460.027 14730.013 6.53861 0.01201
Residual 12 27033.306 2252.776
Total 14 56493.333
Coefficients Standard Error t Stat P-value Lower 95% Upper 95%
Intercept 306.52619 114.25389 2.68285 0.01993 57.58835 555.46404
Price -24.97509 10.83213 -2.30565 0.03979 -48.57626 -1.37392
Advertising 74.13096 25.96732 2.85478 0.01449 17.55303 130.70888
.5214856493.329460.0
SSTSSRR2
52.1% of the variation in pie sales isexplained by the variation in priceand advertising
Multiple Coefficient ofDetermination
(continued)
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
20
Adjusted R2
R2 never decreases when a new x variable isadded to the model
This can be a disadvantage when comparingmodels
What is the net effect of adding a new variable?We lose a degree of freedom when a new xvariable is addedDid the new x variable add enoughexplanatory power to offset the loss of onedegree of freedom?
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
21
Shows the proportion of variation in y explainedby all x variables adjusted for the number of xvariables used
(where n = sample size, k = number of independent variables)
Penalize excessive use of unimportant independentvariablesSmaller than R2
Useful in comparing among models
Adjusted R2(continued)
1kn1n)R1(1R 22
A
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
22
•Regression Statistics
Multiple R 0.72213
R Square 0.52148
Adjusted R Square 0.44172
Standard Error 47.46341
Observations 15
ANOVA df SS MS F Significance F
Regression 2 29460.027 14730.013 6.53861 0.01201
Residual 12 27033.306 2252.776
Total 14 56493.333
Coefficients Standard Error t Stat P-value Lower 95% Upper 95%
Intercept 306.52619 114.25389 2.68285 0.01993 57.58835 555.46404
Price -24.97509 10.83213 -2.30565 0.03979 -48.57626 -1.37392
Advertising 74.13096 25.96732 2.85478 0.01449 17.55303 130.70888
.44172R2A
44.2% of the variation in pie sales is explainedby the variation in price and advertising, takinginto account the sample size and number ofindependent variables
Multiple Coefficient ofDetermination
(continued)
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
23
Is the Model Significant?F-Test for Overall Significance of the ModelShows if there is a linear relationship betweenall of the x variables considered together andyUse F test statisticHypotheses:
H0: 1 = 2 = … = k = 0(no linear relationship)
HA: at least one i 0
(at least one independent variable affects y)
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
24
F-Test for Overall SignificanceTest statistic:
where F has (numerator) D1 = k and(denominator) D2 = (n – k - 1)
degrees of freedom
(continued)
MSEMSR
knSSE
kSSR
F
1
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
25
6.53862252.8
14730.0MSEMSRF
•Regression Statistics
Multiple R 0.72213
R Square 0.52148
Adjusted R Square 0.44172
Standard Error 47.46341
Observations 15
ANOVA df SS MS F Significance F
Regression 2 29460.027 14730.013 6.53861 0.01201
Residual 12 27033.306 2252.776
Total 14 56493.333
Coefficients Standard Error t Stat P-value Lower 95% Upper 95%
Intercept 306.52619 114.25389 2.68285 0.01993 57.58835 555.46404
Price -24.97509 10.83213 -2.30565 0.03979 -48.57626 -1.37392
Advertising 74.13096 25.96732 2.85478 0.01449 17.55303 130.70888
(continued)F-Test for Overall Significance
With 2 and 12 degreesof freedom
P-value forthe F-Test
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
26
H0: 1 = 2 = 0HA: 1 and 2 not both zero
= .05df1= 2 df2 = 12
Test Statistic:
Decision:
Conclusion:Reject H0 at = 0.05
The regression model does explaina significant portion of the variationin pie sales
(There is evidence that at least oneindependent variable affects y)
0
= .05
F.05 = 3.885Reject H0Do not
reject H0
6.5386MSEMSRF
CriticalValue:
F = 3.885
F-Test for Overall Significance(continued)
F
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
27
Are Individual VariablesSignificant?
Use t-tests of individual variable slopes
Shows if there is a linear relationshipbetween the variable xi and y
Hypotheses:
H0: i = 0 (no linear relationship)
HA: i 0 (linear relationship does existbetween xi and y)
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
28
Are Individual VariablesSignificant?
H0: i = 0 (no linear relationship)
HA: i 0 (linear relationship does existbetween xi and y)
Test Statistic:
(df = n – k – 1)
ib
i
s0bt
(continued)
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
29
•Regression Statistics
•Multiple R •0.72213
•R Square •0.52148
•Adjusted R Square •0.44172
•Standard Error •47.46341
•Observations •15
•ANOVA •df •SS •MS •F •Significance F
•Regression •2 •29460.027•14730.01
3 •6.53861 •0.01201
•Residual •12 •27033.306 •2252.776
•Total •14 •56493.333 • • •
• •Coefficients •Standard Error •t Stat •P-value •Lower 95% •Upper 95%
•Intercept •306.52619 •114.25389 •2.68285 •0.01993 •57.58835 •555.46404
•Price •-24.97509 •10.83213 •-2.30565 •0.03979 •-48.57626 •-1.37392
•Advertising •74.13096 •25.96732 •2.85478 •0.01449 •17.55303 •130.70888
t-value for Price is t = -2.306, with p-value .0398
t-value for Advertising is t = 2.855, withp-value .0145
(continued)
Are Individual VariablesSignificant?
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
30
d.f. = 15-2-1 = 12
= .05
t /2 = 2.1788
Inferences about the Slope:t Test Example
H0: i = 0HA: i 0
The test statistic for each variable fallsin the rejection region (p-values < .05)
There is evidence that bothPrice and Advertising affectpie sales at = .05
From Excel output:
Reject H0 for each variable
Coefficients Standard Error t Stat P-value
Price -24.97509 10.83213 -2.30565 0.03979
Advertising 74.13096 25.96732 2.85478 0.01449
Decision:
Conclusion:Reject H0Reject H0
/2=.025
-t /2Do not reject H0
0 t /2
/2=.025
-2.1788 2.1788
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
31
Confidence Interval Estimatefor the Slope
Confidence interval for the population slope 1(the effect of changes in price on pie sales):
Example: Weekly sales are estimated to be reducedby between 1.37 to 48.58 pies for each increase of $1in the selling price
ib2/i stb•
•Coefficients
•StandardError •…
•Lower95% •Upper 95%
•Intercept •306.52619 •114.25389 •… •57.58835 •555.46404
•Price •-24.97509 •10.83213 •… •-48.57626 •-1.37392
•Advertising •74.13096 •25.96732 •… •17.55303 •130.70888
where t has(n – k – 1) d.f.
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
32
Standard Deviation of theRegression Model
The estimate of the standard deviationof the regression model is:
MSEkn
SSEs1
Is this value large or small? Mustcompare to the mean size of y forcomparison
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
33
•Regression Statistics
•Multiple R •0.72213
•R Square •0.52148
•Adjusted R Square •0.44172
•Standard Error •47.46341
•Observations •15
•ANOVA •df •SS •MS •F •Significance F
•Regression •2 •29460.027 •14730.013 •6.53861 •0.01201
•Residual •12 •27033.306 •2252.776
•Total •14 •56493.333 • • •
• •Coefficients •Standard Error •t Stat •P-value •Lower 95% •Upper 95%
•Intercept •306.52619 •114.25389 •2.68285 •0.01993 •57.58835 •555.46404
•Price •-24.97509 •10.83213 •-2.30565 •0.03979 •-48.57626 •-1.37392
•Advertising •74.13096 •25.96732 •2.85478 •0.01449 •17.55303 •130.70888
The standard deviation of theregression model is 47.46
(continued)
Standard Deviation of theRegression Model
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
34
The standard deviation of the regression modelis 47.46
A rough prediction range for pie sales in agiven week is
Pie sales in the sample were in the 300 to 500per week range, so this range is probably toolarge to be acceptable. The analyst may wantto look for additional variables that can explainmore of the variation in weekly sales
(continued)
Standard Deviation of theRegression Model
94.22(47.46)
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
35
Multicollinearity
Multicollinearity: High correlation existsbetween two independent variables
This means the two variables contributeredundant information to the multipleregression model
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
36
Multicollinearity
Including two highly correlated independentvariables can adversely affect the regressionresults
No new information provided
Can lead to unstable coefficients (largestandard error and low t-values)
Coefficient signs may not match priorexpectations
(continued)
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
37
Some Indications of SevereMulticollinearity
Incorrect signs on the coefficientsLarge change in the value of a previouscoefficient when a new variable is added to themodelA previously significant variable becomesinsignificant when a new independent variableis addedThe estimate of the standard deviation of themodel increases when a variable is added tothe model
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
38
Detect Collinearity(Variance Inflationary Factor)
VIFj is used to measure collinearity:
If VIFj > 5, xj is highly correlated withthe other explanatory variables
R2j is the coefficient of determination when the jth
independent variable is regressed against theremaining k – 1 independent variables
211
jj R
VIF
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
39
Qualitative VariablesCategorical explanatory variable with twoor more levels:
yes or no, on or off, male or femalecoded as 0 or 1
Regression intercepts are different if thevariable is significantAssumes equal slopes for other variablesThe number of dummy variables neededis (number of levels - 1)
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
40
Categorical-Variable ModelExample (with 2 Levels)
Let:
y = pie sales
x1 = price
x2 = holiday (X2 = 1 if a holiday occurred during the week)(X2 = 0 if there was no holiday that week)
210 xbxbby 21
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
41
Sameslope
Dummy-Variable Model Example(with 2 Levels)
(continued)
x1 (Price)
y (sales)
b0 + b2
b0
1010
12010
xbb(0)bxbbyxb)b(b(1)bxbby
121
121Holiday
No Holiday
Differentintercept
If H0: 2 = 0 isrejected, then“Holiday” has asignificant effecton pie sales
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
42
Sales: number of pies sold per weekPrice: pie price in $
Holiday:
Interpretation of the categoricalVariable Coefficient (with 2 Levels)
Example:
1 If a holiday occurred during the week0 If no holiday occurred
b2 = 15: on average, sales were 15 pies greater inweeks with a holiday than in weeks without aholiday, given the same price
)15(Holiday30(Price)-300Sales
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
43
Categorical-Variable Models(more than 2 Levels)
The number of Categorical variables is one less thanthe number of levelsExample:
y = house price ; x1 = square feet
The style of the house is also thought to matter:Style = ranch, split level, condo
Three levels, so twoCategorical variables are
needed
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
44
Dummy-Variable Models(more than 2 Levels)
notif0
levelsplitif1x
notif0
ranchif1x 32
3210 xbxbxbby 321
b2 shows the impact on price if the house is aranch style, compared to a condo
b3 shows the impact on price if the house is asplit level style, compared to a condo
(continued)
Let the default category be “condo”
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
45
Interpreting the Categorical VariableCoefficients (with 3 Levels)
With the same square feet, asplit-level will have anestimated average price of18.84 thousand dollars morethan a condo
With the same square feet, aranch will have an estimatedaverage price of 23.53thousand dollars more than acondo.
Suppose the estimated equation is
321 18.84x23.53x0.045x20.43y
18.840.045x20.43y 1
23.530.045x20.43y 1
10.045x20.43yFor a condo: x2 = x3 = 0
For a ranch: x3 = 0
For a split level: x2 = 0
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
46
The relationship between thedependent variable and an independentvariable may not be linearUseful when scatter diagram indicatesnon-linear relationshipExample: Quadratic model
The second independent variable is thesquare of the first variable
Nonlinear Relationships
xxy 2j2j10
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
47
Polynomial Regression Model
where:
0 = Population regression constant
i = Population regression coefficient for variable xj : j = 1, 2, …kp = Order of the polynomial
i = Model error
xxy 2j2j10
xxxy pjp
2j2j10
If p = 2 the model is a quadratic model:
General form:
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
48
Linear fit does not giverandom residuals
Linear vs. Nonlinear Fit
Nonlinear fit givesrandom residuals
x
resi
dual
s
x
y
x
resi
dual
s
y
x
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
49
Quadratic Regression Model
Quadratic models may be considered when scatterdiagram takes on the following shapes:
x1
y
x1x1
yyy
1 < 0 1 > 0 1 < 0 1 > 0
1 = the coefficient of the linear term2 = the coefficient of the squared term
x1
xxy 2j2j10
2 > 0 2 > 0 2 < 0 2 < 0
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
50
Testing for Significance:Quadratic Model
Test for Overall RelationshipF test statistic =
Testing the Quadratic EffectCompare quadratic model
with the linear model
Hypotheses
(No 2nd order polynomial term)
(2nd order polynomial term is needed)
xxy 2j2j10
xy j10
H0: 2 = 0
HA: 2 0
MSEMSR
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
51
Higher Order Modelsy
x
xxxy 3j3
2j2j10
If p = 3 the model is a cubic form:
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
52
Interaction EffectsHypothesizes interaction between pairs of xvariables
Response to one x variable varies at differentlevels of another x variable
Contains two-way cross product terms
221521433
212110 xxxxxxxy
Basic Terms Interactive Terms
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
53
Effect of Interaction
Given:
Without interaction term, effect of x1 on y ismeasured by 1
With interaction term, effect of x1 on y ismeasured by 1 + 3 x2
Effect changes as x2 increases
xxxxy 21322110
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
54
x2 = 1
x2 = 0
y = 1 + 2x1 + 3(1) + 4x1(1)= 4 + 6x1
y = 1 + 2x1 + 3(0) + 4x1(0)= 1 + 2x1
Interaction Example
Effect (slope) of x1 on y does depend on x2 value
x1
4
8
12
0
0 10.5 1.5
yy = 1 + 2x1 + 3x2 + 4x1x2
where x2 = 0 or 1 (Categorical variable)
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
55
Interaction Regression ModelWorksheet
Case, i yi x1i x2i x1i x2i
1 1 1 3 32 4 8 5 403 1 3 2 64 3 5 6 30: : : : :
multiply x1 by x2 to get x1x2, thenrun regression with y, x1, x2 , x1x2
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
56
xxxxy 21322110
Hypothesize interaction between pairs of independentvariables
Hypotheses:H0: 3 = 0 (no interaction between x1 and x2)HA: 3 0 (x1 interacts with x2)
Evaluating Presenceof Interaction
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
57
Model BuildingGoal is to develop a model with the best set ofindependent variables
Easier to interpret if unimportant variables are removedLower probability of collinearity
Stepwise regression procedureProvide evaluation of alternative models as variables are added
Best-subset approachTry all combinations and select the best using the highestadjusted R2 and lowest s
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
58
Idea: develop the least squares regressionequation in steps, either through forwardselection, backward elimination, or throughstandard stepwise regression
The coefficient of partial determination is themeasure of the marginal contribution of eachindependent variable, given that otherindependent variables are in the model
Stepwise RegressionClic
k to buy N
OW!PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
59
Best Subsets RegressionIdea: estimate all possible regression equationsusing all possible combinations of independentvariables
Choose the best fit by looking for the highestadjusted R2 and lowest standard error s
Stepwise regression and best subsetsregression can be performed using PHStat,
Minitab, or other statistical software packages
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
60
Aptness of the ModelDiagnostic checks on the modelinclude verifying the assumptions ofmultiple regression:
Each xi is linearly related to yErrors have constant varianceErrors are independentError are normally distributed
)yy(eiErrors (or Residuals) are given by
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
61
Residual Analysis
Non-constant variance Constant variance
x x
resi
dual
s
resi
dual
s
Not Independent Independent
x
resi
dual
s
xre
sidu
als
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
62
The Normality AssumptionErrors are assumed to be normallydistributed
Standardized residuals can be calculatedby computer
Examine a histogram or a normalprobability plot of the standardizedresiduals to check for normality
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
63
Chapter SummaryDeveloped the multiple regression modelTested the significance of the multipleregression modelDeveloped adjusted R2
Tested individual regression coefficientsUsed dummy variablesExamined interaction in a multipleregression model
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
64
Chapter SummaryDescribed nonlinear regression modelsDescribed multicollinearityDiscussed model building
Stepwise regressionBest subsets regression
Examined residual plots to checkmodel assumptions
(continued)
Click t
o buy NOW!
PDF-XChange
www.docu-track.com Clic
k to buy N
OW!PDF-XChange
www.docu-track.com
Top Related