Ken Black QA ch15


  • 8/3/2019 Ken Black QA ch15

    1/24

    Business Statistics, 5th ed.

    by Ken Black

    Chapter 15

    Multiple Regression Analysis


    PowerPoint presentations prepared by Lloyd Jaisingh, Morehead State University


    Learning Objectives

    Develop a multiple regression model.

    Understand and apply significance tests of the regression model and its coefficients.

    Compute and interpret residuals, the standard error of the estimate, and the coefficient of determination.

    Interpret multiple regression computer output.


    Regression Models

    Probabilistic Multiple Regression Model

    $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \cdots + \beta_k X_k + \epsilon$

    where
    $Y$ = the value of the dependent (response) variable
    $\beta_0$ = the regression constant
    $\beta_1$ = the partial regression coefficient of independent variable 1
    $\beta_2$ = the partial regression coefficient of independent variable 2
    $\beta_k$ = the partial regression coefficient of independent variable $k$
    $k$ = the number of independent variables
    $\epsilon$ = the error of prediction


    Estimated Regression Model

    $\hat{Y} = b_0 + b_1 X_1 + b_2 X_2 + b_3 X_3 + \cdots + b_k X_k$

    where
    $\hat{Y}$ = predicted value of $Y$
    $b_0$ = estimate of the regression constant
    $b_1$ = estimate of regression coefficient 1
    $b_2$ = estimate of regression coefficient 2
    $b_3$ = estimate of regression coefficient 3
    $b_k$ = estimate of regression coefficient $k$
    $k$ = number of independent variables


    Multiple Regression Model with Two Independent Variables (First-Order)

    Population Model

    $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \epsilon$

    where
    $\beta_0$ = the regression constant
    $\beta_1$ = the partial regression coefficient for independent variable 1
    $\beta_2$ = the partial regression coefficient for independent variable 2
    $\epsilon$ = the error of prediction

    Estimated Model

    $\hat{Y} = b_0 + b_1 X_1 + b_2 X_2$

    where
    $\hat{Y}$ = predicted value of $Y$
    $b_0$ = estimate of regression constant
    $b_1$ = estimate of regression coefficient 1
    $b_2$ = estimate of regression coefficient 2


    Response Plane for First-Order Two-Predictor Multiple Regression Model

    [Figure: response plane plotted against the $X_1$ and $X_2$ axes, with $Y$ on the vertical axis and the vertical intercept marked.]


    Least Squares Equations for k = 2

    $b_0 n + b_1 \sum X_1 + b_2 \sum X_2 = \sum Y$

    $b_0 \sum X_1 + b_1 \sum X_1^2 + b_2 \sum X_1 X_2 = \sum X_1 Y$

    $b_0 \sum X_2 + b_1 \sum X_1 X_2 + b_2 \sum X_2^2 = \sum X_2 Y$
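As an illustration of how the three least squares equations for k = 2 determine b0, b1, and b2, here is a minimal sketch that builds the sums and solves the 3x3 system with Cramer's rule. The helper names (`det3`, `fit_two_predictors`) and the four-point data set are made up for this example and are not part of the slides; the toy data are constructed so that Y = 1 + 2 X1 + 3 X2 holds exactly.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def fit_two_predictors(x1, x2, y):
    """Solve the k = 2 least squares (normal) equations for (b0, b1, b2)."""
    n = len(y)
    s1, s2, sy = sum(x1), sum(x2), sum(y)
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * v for a, v in zip(x1, y))
    s2y = sum(b * v for b, v in zip(x2, y))

    # Coefficient matrix and right-hand side of the three equations.
    lhs = [[n, s1, s2],
           [s1, s11, s12],
           [s2, s12, s22]]
    rhs = [sy, s1y, s2y]

    d = det3(lhs)
    if d == 0:
        raise ValueError("predictors are collinear; no unique solution")

    coefs = []
    for j in range(3):
        # Cramer's rule: replace column j with the right-hand side.
        m = [row[:] for row in lhs]
        for r in range(3):
            m[r][j] = rhs[r]
        coefs.append(det3(m) / d)
    return tuple(coefs)


# Hypothetical toy data (not from the slides): Y = 1 + 2*X1 + 3*X2 exactly.
x1 = [1, 2, 3, 4]
x2 = [1, 0, 2, 1]
y = [6, 5, 13, 12]
b0, b1, b2 = fit_two_predictors(x1, x2, y)
print(b0, b1, b2)
```

Cramer's rule is used only because the system is 3x3; for more predictors a general linear solver would be the usual choice.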


    Real Estate Data

    Observation   Market Price ($1,000), Y   Square Feet, X1   Age (Years), X2
    1             63.0                       1,605             35
    2             65.1                       2,489             45
    3             69.9                       1,553             20
    4             76.8                       2,404             32
    5             73.9                       1,884             25
    6             77.9                       1,558             14
    7             74.9                       1,748             8
    8             78.0                       3,105             10
    9             79.0                       1,682             28
    10            63.4                       2,470             30
    11            79.5                       1,820             2
    12            83.9                       2,143             6
    13            79.7                       2,121             14
    14            84.5                       2,485             9
    15            96.0                       2,300             19
    16            109.5                      2,714             4
    17            102.5                      2,463             5
    18            121.0                      3,076             7
    19            104.9                      3,048             3
    20            128.0                      3,267             6
    21            129.0                      3,069             10
    22            117.9                      4,765             11
    23            140.0                      4,540             8


    Real Estate Data

    [Figure: three-dimensional scatter plot of the real estate data, with axes Price, Square Feet, and Age.]


    MINITAB Output for the Real Estate Example

    The regression equation is
    Price = 57.4 + 0.0177 Sq.Feet - 0.666 Age

    Predictor   Coef       StDev      T       P
    Constant    57.35      10.01      5.73    0.000
    Sq.Feet     0.017718   0.003146   5.63    0.000
    Age         -0.6663    0.2280     -2.92   0.008

    S = 11.96   R-Sq = 74.1%   R-Sq(adj) = 71.5%

    Analysis of Variance

    Source           DF   SS        MS       F       P
    Regression       2    8189.7    4094.9   28.63   0.000
    Residual Error   20   2861.0    143.1
    Total            22   11050.7


    Predicting the Price of Home

    $\hat{Y} = 57.4 + 0.0177 X_1 - 0.666 X_2$

    For $X_1 = 2500$ and $X_2 = 12$,

    $\hat{Y} = 57.4 + 0.0177(2500) - 0.666(12) = 93.658$ thousand dollars
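The prediction above is a direct substitution into the fitted equation. A minimal sketch, using the rounded coefficients from the MINITAB regression equation; the function name `predict_price` is an assumption introduced here for illustration.

```python
# Rounded coefficients from the fitted equation on the slides.
B0 = 57.4      # regression constant
B1 = 0.0177    # coefficient for square feet
B2 = -0.666    # coefficient for age in years


def predict_price(square_feet, age_years):
    """Predicted market price in thousands of dollars."""
    return B0 + B1 * square_feet + B2 * age_years


y_hat = predict_price(2500, 12)
print(f"{y_hat:.3f} thousand dollars")  # 93.658 thousand dollars
```

With these coefficients, each additional square foot adds about 0.0177 thousand dollars to the predicted price and each additional year of age subtracts about 0.666 thousand dollars, holding the other variable fixed.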


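    Evaluating the Multiple Regression Model

    Testing the Overall Model

    $H_0: \beta_1 = \beta_2 = \beta_3 = \cdots = \beta_k = 0$
    $H_a:$ At least one of the regression coefficients is $\neq 0$

    Significance Tests for Individual Regression Coefficients

    $H_0: \beta_1 = 0$     $H_a: \beta_1 \neq 0$
    $H_0: \beta_2 = 0$     $H_a: \beta_2 \neq 0$
    $H_0: \beta_3 = 0$     $H_a: \beta_3 \neq 0$
    ...
    $H_0: \beta_k = 0$     $H_a: \beta_k \neq 0$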


    Testing the Overall Model for the Real Estate Example

    $H_0: \beta_1 = \beta_2 = 0$
    $H_a:$ At least one of the regression coefficients is $\neq 0$

    $MSR = \dfrac{SSR}{k}$     $MSE = \dfrac{SSE}{n - k - 1}$     $F = \dfrac{MSR}{MSE}$

    ANOVA
                       df   SS          MS        F       p
    Regression         2    8189.723    4094.86   28.63   .000
    Residual (Error)   20   2861.017    143.1
    Total              22   11050.74

    $F_{.01,2,20} = 5.85$

    $F_{Cal} = 28.63 > 5.85$, reject $H_0$.
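The F statistic can be recomputed from the sums of squares in the ANOVA table. A minimal sketch using only the slide's numbers; the function name `overall_f` is hypothetical, and the critical value 5.85 is copied from the slide rather than computed.

```python
def overall_f(ssr, sse, n, k):
    """Return (MSR, MSE, F) for the overall regression test."""
    msr = ssr / k                # mean square regression, df = k
    mse = sse / (n - k - 1)      # mean square error, df = n - k - 1
    return msr, mse, msr / mse


# Values from the real estate ANOVA table: n = 23 observations, k = 2 predictors.
msr, mse, f_stat = overall_f(ssr=8189.723, sse=2861.017, n=23, k=2)

F_CRITICAL = 5.85  # F(.01; 2, 20), taken from the slide
reject_h0 = f_stat > F_CRITICAL

print(f"MSR = {msr:.2f}, MSE = {mse:.2f}, F = {f_stat:.2f}, reject H0: {reject_h0}")
```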


    Significance Test of the Regression Coefficients for the Real Estate Example

    $H_0: \beta_1 = 0$     $H_a: \beta_1 \neq 0$
    $H_0: \beta_2 = 0$     $H_a: \beta_2 \neq 0$

                   Coefficients   Std Dev    t Stat   p
    x1 (Sq.Feet)   0.0177         0.003146   5.63     .000
    x2 (Age)       -0.666         0.2280     -2.92    .008

    $t_{.025,20} = 2.086$

    $t_{Cal} = 5.63 > 2.086$, reject $H_0$.
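Each t statistic in the table is the coefficient divided by its standard deviation. A minimal sketch, assuming the unrounded coefficients from the MINITAB output and a two-tailed comparison against the slide's critical value; the names `t_statistic` and `coefficients` are introduced here for illustration.

```python
T_CRITICAL = 2.086  # t(.025, 20), taken from the slide

# (coefficient, standard deviation) pairs from the MINITAB output.
coefficients = {
    "Sq.Feet": (0.017718, 0.003146),
    "Age": (-0.6663, 0.2280),
}


def t_statistic(coef, std_dev):
    """t statistic for H0: beta = 0."""
    return coef / std_dev


t_values = {name: t_statistic(b, s) for name, (b, s) in coefficients.items()}

for name, t in t_values.items():
    # Two-tailed test: compare |t| with the critical value.
    decision = "reject H0" if abs(t) > T_CRITICAL else "fail to reject H0"
    print(f"{name}: t = {t:.2f}, {decision}")
```

Under these assumptions both predictors exceed the critical value in absolute terms, which agrees with the p-values of .000 and .008 in the table.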


    Residuals and Sum of Squares Error for the Real Estate Example

    Observation   Y       Ŷ         Y - Ŷ     (Y - Ŷ)^2
    1             43.0    42.466    0.534     0.285
    2             45.1    51.465    -6.365    40.517
    3             49.9    51.540    -1.640    2.689
    4             56.8    58.622    -1.822    3.319
    5             53.9    54.073    -0.173    0.030
    6             57.9    55.627    2.273     5.168
    7             54.9    62.991    -8.091    65.466
    8             58.0    85.702    -27.702   767.388
    9             59.0    48.495    10.505    110.360
    10            63.4    61.124    2.276     5.181
    11            59.5    68.265    -8.765    76.823
    12            63.9    71.322    -7.422    55.092
    13            59.7    65.602    -5.902    34.832
    14            64.5    75.383    -10.883   118.438
    15            76.0    65.442    10.558    111.479
    16            89.5    82.772    6.728     45.265
    17            82.5    77.659    4.841     23.440
    18            101.0   87.187    13.813    190.799
    19            84.9    89.356    -4.456    19.858
    20            108.0   91.237    16.763    280.982
    21            109.0   85.064    23.936    572.936
    22            97.9    114.447   -16.547   273.815
    23            120.0   112.460   7.540     56.854

    SSE = $\sum (Y - \hat{Y})^2$ = 2861.017
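SSE is the sum of the squared residuals in the last column. A minimal sketch that recomputes it from the residual column as printed; because the printed residuals are rounded to three decimals, the result only approximates the slide's 2861.017.

```python
# Residuals (Y minus predicted Y) for observations 1..23, as printed on the slide.
residuals = [
    0.534, -6.365, -1.640, -1.822, -0.173, 2.273, -8.091, -27.702,
    10.505, 2.276, -8.765, -7.422, -5.902, -10.883, 10.558, 6.728,
    4.841, 13.813, -4.456, 16.763, 23.936, -16.547, 7.540,
]

# Sum of squares error: square each residual and add them up.
sse = sum(r * r for r in residuals)

# The residual with the largest magnitude marks the worst-fitting observation.
worst_index = max(range(len(residuals)), key=lambda i: abs(residuals[i]))

print(f"SSE is approximately {sse:.1f}")
print(f"Largest residual: observation {worst_index + 1}, {residuals[worst_index]}")
```

Observation 8 contributes more than a quarter of the total SSE, so it is a natural candidate to inspect in the residual plots.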


    MINITAB Residual Diagnostics for the Real Estate Problem

    [Figure: Residual Plots for Price, four panels: Normal Probability Plot of the Residuals (Percent versus Residual); Residuals Versus the Fitted Values; Histogram of the Residuals (Frequency versus Residual); Residuals Versus the Order of the Data (Residual versus Observation Order).]


    SSE and Standard Error of the Estimate

    $S_e = \sqrt{\dfrac{SSE}{n - k - 1}} = \sqrt{\dfrac{2861}{23 - 2 - 1}} = 11.96$

    where: $n$ = number of observations
           $k$ = number of independent variables

    ANOVA
                       df   SS        MS       F       P
    Regression         2    8189.7    4094.9   28.63   .000
    Residual (Error)   20   2861.0    143.1
    Total              22   11050.7

    SSE is the Residual (Error) sum of squares, 2861.0.
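The standard error of the estimate follows from SSE and the error degrees of freedom. A minimal sketch with the slide's values; the function name `standard_error_of_estimate` is an assumption made for this example.

```python
import math


def standard_error_of_estimate(sse, n, k):
    """Square root of SSE divided by the error degrees of freedom (n - k - 1)."""
    return math.sqrt(sse / (n - k - 1))


# Real estate example: SSE = 2861.017, n = 23 observations, k = 2 predictors.
s_e = standard_error_of_estimate(2861.017, 23, 2)
print(f"S_e = {s_e:.2f}")  # S_e = 11.96
```

Because price is measured in thousands of dollars, this value says the typical prediction error is roughly 12 thousand dollars.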


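    Coefficient of Multiple Determination ($R^2$)

    $R^2 = \dfrac{SSR}{SS_{YY}} = \dfrac{8189.723}{11050.74} = .741$

    $R^2 = 1 - \dfrac{SSE}{SS_{YY}} = 1 - \dfrac{2861.017}{11050.74} = .741$

    ANOVA
                       df   SS        MS        F       p
    Regression         2    8189.7    4094.89   28.63   .000
    Residual (Error)   20   2861.0    143.1
    Total              22   11050.7

    In the SS column, Regression is SSR, Residual (Error) is SSE, and Total is $SS_{YY}$.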
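The two forms of the formula agree because SSR + SSE = SS_YY. A minimal sketch that checks this with the slide's sums of squares; the variable names are chosen here for illustration.

```python
ssr = 8189.723    # regression sum of squares
sse = 2861.017    # error (residual) sum of squares
ss_yy = 11050.74  # total sum of squares

# Form 1: share of total variation explained by the regression.
r2_from_ssr = ssr / ss_yy

# Form 2: one minus the share left unexplained.
r2_from_sse = 1 - sse / ss_yy

print(f"R^2 = {r2_from_ssr:.3f} (from SSR), {r2_from_sse:.3f} (from SSE)")
```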


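    Adjusted $R^2$

    $\text{adj. } R^2 = 1 - \dfrac{SSE / (n - k - 1)}{SS_{YY} / (n - 1)} = 1 - \dfrac{2861.017 / (23 - 2 - 1)}{11050.74 / (23 - 1)} = 1 - .285 = .715$

    ANOVA
                       df   SS        MS       F       p
    Regression         2    8189.7    4094.9   28.63   .000
    Residual (Error)   20   2861.0    143.1
    Total              22   11050.7

    In the table, Residual (Error) gives SSE with $n - k - 1 = 20$ degrees of freedom, and Total gives $SS_{YY}$ with $n - 1 = 22$ degrees of freedom.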
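Adjusted R-squared divides each sum of squares by its degrees of freedom before taking the ratio. A minimal sketch with the slide's values; the function name `adjusted_r2` is hypothetical. It also checks the algebraically equivalent form 1 - (1 - R^2)(n - 1)/(n - k - 1), which is a standard identity rather than something stated on the slide.

```python
def adjusted_r2(sse, ss_yy, n, k):
    """Adjusted R^2 from SSE, total sum of squares, sample size, and predictor count."""
    return 1 - (sse / (n - k - 1)) / (ss_yy / (n - 1))


n, k = 23, 2
sse, ss_yy = 2861.017, 11050.74

adj = adjusted_r2(sse, ss_yy, n, k)

# Equivalent form written in terms of the unadjusted R^2.
r2 = 1 - sse / ss_yy
adj_from_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)

print(f"R^2 = {r2:.3f}, adjusted R^2 = {adj:.3f}")
```

The adjusted value (.715) is slightly below the unadjusted .741 because the degrees-of-freedom ratio 22/20 inflates the unexplained share.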


    Demonstration Problem 15.1: Freight Data

    Country   Freight Cargo Shipped by Road   Length of Roads   Number of Commercial
              (Million Short-Ton Miles)       (Miles)           Vehicles
    China     278,806                         673,239           5,010,000
    Brazil    178,359                         1,031,693         1,371,127
    India     144,000                         1,342,000         1,980,000
    Germany   138,975                         395,367           2,923,000
    Italy     125,171                         188,597           2,745,500
    Spain     105,824                         206,271           2,859,438
    Mexico    96,049                          157,036           3,758,034


    Demonstration Problem 15.1: Excel Output

    Regression Statistics
    Multiple R          0.812
    R Square            0.659
    Adjusted R Square   0.488
    Standard Error      44273.86677
    Observations        7

    ANOVA
                 df   SS            MS         F      Sig. F
    Regression   2    15148592381   7.57E+09   3.86   0.116
    Residual     4    7840701114    1.96E+09
    Total        6    22989293495

                          Coefficients    Standard Error   t Stat   P-value
    Intercept             -26425.45085    67624.93769      -0.39    0.716
    Length of Roads       0.101820862     0.043495015      2.34     0.079
    Commercial Vehicles   0.04094856      0.017121018      2.39     0.075
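One way to read this output is to rebuild the summary statistics from the ANOVA sums of squares and confirm they match the Regression Statistics block. A minimal sketch using only the numbers printed above; the variable names and the 0.05 significance level are assumptions made for illustration, since the slide does not state an alpha.

```python
import math

n, k = 7, 2                # observations and independent variables
ssr = 15148592381          # regression sum of squares
sse = 7840701114           # residual sum of squares
sst = 22989293495          # total sum of squares

r2 = ssr / sst
multiple_r = math.sqrt(r2)
adj_r2 = 1 - (sse / (n - k - 1)) / (sst / (n - 1))
std_error = math.sqrt(sse / (n - k - 1))
f_stat = (ssr / k) / (sse / (n - k - 1))

SIG_F = 0.116              # "Sig. F" as reported by Excel
ALPHA = 0.05               # assumed significance level, not from the slide
overall_significant = SIG_F < ALPHA

print(f"Multiple R = {multiple_r:.3f}, R^2 = {r2:.3f}, adj R^2 = {adj_r2:.3f}")
print(f"Standard error = {std_error:.2f}, F = {f_stat:.2f}")
print(f"Overall model significant at alpha = {ALPHA}: {overall_significant}")
```

With only seven observations, adjusted R-squared drops well below R-squared, and at the assumed 0.05 level neither the overall F test nor the individual coefficients (p = 0.079 and 0.075) reach significance.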


    Demonstration Problem 15.1: MINITAB Output


    MINITAB Residual Diagnostics for Demonstration Problem 15.1

    [Figure: Residual Plots for Freight, four panels: Normal Probability Plot of the Residuals (Percent versus Residual); Residuals Versus the Fitted Values; Histogram of the Residuals (Frequency versus Residual); Residuals Versus the Order of the Data (Residual versus Observation Order).]


    Copyright 2008 John Wiley & Sons, Inc. All rights reserved. Reproduction or translation of this work beyond that permitted in section 117 of the 1976 United States Copyright Act without express permission of the copyright owner is unlawful. Request for further information should be addressed to the Permissions Department, John Wiley & Sons, Inc. The purchaser may make back-up copies for his/her own use only and not for distribution or resale. The Publisher assumes no responsibility for errors, omissions, or damages caused by the use of these programs or from the use of the information herein.