Ken Black QA 5th chapter 12 Solution


  • 8/3/2019 Ken Black QA 5th chapter 12 Solution

    Chapter 12: Analysis of Categorical Data

    LEARNING OBJECTIVES

    This chapter presents several nonparametric statistics that can be used to analyze data

    enabling you to:

    1. Understand the chi-square goodness-of-fit test and how to use it.

    2. Analyze data using the chi-square test of independence.

    CHAPTER TEACHING STRATEGY

    Chapter 12 contains the two most prevalent chi-square tests: the chi-square
    goodness-of-fit test and the chi-square test of independence. These two techniques are
    important because they give the statistician a tool that is particularly useful for analyzing
    nominal data (even though independent variable categories can sometimes have ordinal
    or higher categories). It should be emphasized that there are many instances in business
    research where the resulting data gathered are merely categorical identification. For
    example, in segmenting the marketplace (consumers or industrial users), information is
    gathered regarding gender, income level, geographical location, political affiliation,
    religious preference, ethnicity, occupation, size of company, type of industry, etc. On
    these variables, the measurement is often a tallying of the frequency of occurrence of
    individuals, items, or companies in each category. The subject of the research is given no
    "score" or "measurement" other than a 0/1 for being a member or not of a given category.
    These two chi-square tests are perfectly tailored to analyze such data.


    The chi-square goodness-of-fit test examines the categories of one variable to
    determine if the distribution of observed occurrences matches some expected or
    theoretical distribution of occurrences. It can be used to determine if some standard or
    previously known distribution of proportions is the same as some observed distribution of
    proportions. It can also be used to validate the theoretical distribution of occurrences of
    phenomena such as random arrivals that are often assumed to be Poisson distributed.
    You will note that the degrees of freedom, k - 1 for a given set of expected values or for
    the uniform distribution, change to k - 2 for an expected Poisson distribution and to k - 3
    for an expected normal distribution. To conduct a chi-square goodness-of-fit test to
    analyze an expected Poisson distribution, the value of lambda must be estimated from the
    observed data. This causes the loss of an additional degree of freedom. With the normal
    distribution, both the mean and standard deviation of the expected distribution are
    estimated from the observed values, causing the loss of two additional degrees of freedom
    from the k - 1 value.
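
    The degrees-of-freedom rule described above amounts to subtracting one for the fixed total and one more for each parameter estimated from the sample. A minimal Python sketch of that bookkeeping follows; the function name `gof_degrees_of_freedom` and its parameter names are illustrative assumptions, not something defined in the text.

```python
def gof_degrees_of_freedom(k: int, estimated_params: int = 0) -> int:
    """Degrees of freedom for a chi-square goodness-of-fit test.

    k is the number of categories (after any collapsing);
    estimated_params is how many distribution parameters were
    estimated from the observed data.
    """
    df = k - 1 - estimated_params
    if df < 1:
        raise ValueError("need at least one degree of freedom")
    return df


# Given or uniform expected values: k - 1
print(gof_degrees_of_freedom(6))
# Poisson with lambda estimated from the data: k - 2
print(gof_degrees_of_freedom(7, estimated_params=1))
# Normal with mean and standard deviation estimated: k - 3
print(gof_degrees_of_freedom(7, estimated_params=2))
```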

    The chi-square test of independence is used to compare the observed frequencies
    along the categories of two variables to expected values to determine if the
    two variables are independent or not. Of course, if the variables are not independent,
    they are dependent or related. This allows business researchers to reach some
    conclusions about such questions as: is smoking independent of gender, or is type of
    housing preferred independent of geographic region? The chi-square test of
    independence is often used as a tool for preliminary analysis of data gathered in
    exploratory research where the researcher has little idea of what variables seem to be
    related to what variables, and the data are nominal. This test is particularly useful with
    demographic type data.

    A word of warning is appropriate here. When an expected frequency is small, the
    observed chi-square value can be inordinately large, thus yielding an increased possibility
    of committing a Type I error. The research on this problem has yielded varying results,
    with some authors indicating that expected values as low as two or three are acceptable
    and other researchers demanding that expected values be ten or more. In this text, we
    have settled on the fairly widespread accepted criterion of five or more.
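
    One common way to honor the "five or more" criterion is to fold a sparse tail category into its neighbor before computing the statistic. The sketch below is only an illustration of that idea: the helper name `collapse_tails` is hypothetical, it merges tail categories only, and the sample figures are the observed and expected frequencies that appear in Problem 12.3 of this chapter.

```python
def collapse_tails(observed, expected, minimum=5.0):
    """Merge tail categories into their neighbors until both end
    categories have an expected frequency of at least `minimum`.
    Returns new (observed, expected) lists; inputs are not modified."""
    obs, exp = list(observed), list(expected)
    # Upper tail: fold the last category into the one before it.
    while len(exp) > 1 and exp[-1] < minimum:
        last_o, last_e = obs.pop(), exp.pop()
        obs[-1] += last_o
        exp[-1] += last_e
    # Lower tail: fold the first category into the one after it.
    while len(exp) > 1 and exp[0] < minimum:
        first_o, first_e = obs.pop(0), exp.pop(0)
        obs[0] += first_o
        exp[0] += first_e
    return obs, exp


# Categories 0, 1, 2, and "3 or more" from Problem 12.3.
obs, exp = collapse_tails([28, 17, 11, 5], [24.803, 22.320, 10.047, 3.831])
print(obs, [round(e, 3) for e in exp])
```

    Because collapsing changes the number of categories k, the degrees of freedom must be computed from the collapsed table.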


    CHAPTER OUTLINE

    12.1 Chi-Square Goodness-of-Fit Test

    Testing a Population Proportion Using the Chi-square Goodness-of-Fit

    Test as an Alternative Technique to the z Test

    12.2 Contingency Analysis: Chi-Square Test of Independence

    KEY TERMS

    Categorical Data                   Chi-Square Test of Independence
    Chi-Square Distribution            Contingency Analysis
    Chi-Square Goodness-of-Fit Test    Contingency Table


    SOLUTIONS TO THE ODD-NUMBERED PROBLEMS IN CHAPTER 12

    12.1    $f_o$    $f_e$    $\frac{(f_o - f_e)^2}{f_e}$
             53       68       3.309
             37       42       0.595
             32       33       0.030
             28       22       1.636
             18       10       6.400
             15        8       6.125

    $H_0$: The observed distribution is the same as the expected distribution.
    $H_a$: The observed distribution is not the same as the expected distribution.

    Observed $\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} = 18.095$

    df = k - 1 = 6 - 1 = 5, $\alpha = .05$

    $\chi^2_{.05,5} = 11.0705$

    Since the observed $\chi^2 = 18.095 > \chi^2_{.05,5} = 11.0705$, the decision is to reject the null hypothesis.

    The observed frequencies are not distributed the same as the expected frequencies.
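
    The arithmetic in Problem 12.1 can be checked with a few lines of Python. This is a minimal sketch, assuming the six observed and expected frequencies tabulated in the solution; the function name `chi_square_gof` is an illustrative choice, and the critical value is simply the table value quoted in the solution rather than something the code derives.

```python
def chi_square_gof(observed, expected):
    """Chi-square goodness-of-fit statistic: sum of (fo - fe)^2 / fe."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))


observed = [53, 37, 32, 28, 18, 15]
expected = [68, 42, 33, 22, 10, 8]

chi2 = chi_square_gof(observed, expected)
critical = 11.0705  # table value for alpha = .05, df = 5 (quoted in the solution)

decision = "reject H0" if chi2 > critical else "fail to reject H0"
print(f"chi-square = {chi2:.3f}, decision: {decision}")
```

    The unrounded statistic agrees with the solution's 18.095 to within rounding, since the solution sums terms that were each rounded to three decimals.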

    12.2    $f_o$    $f_e$    $\frac{(f_o - f_e)^2}{f_e}$
             19       18       0.056
             17       18       0.056
             14       18       0.889
             18       18       0.000
             19       18       0.056
             21       18       0.500
             18       18       0.000
             18       18       0.000
    $\sum f_o = 144$   $\sum f_e = 144$   1.557

    $H_0$: The observed frequencies are uniformly distributed.
    $H_a$: The observed frequencies are not uniformly distributed.

    $\bar{x} = \frac{\sum f_o}{k} = \frac{144}{8} = 18$

    In this uniform distribution, each $f_e = 18$.

    df = k - 1 = 8 - 1 = 7, $\alpha = .01$

    $\chi^2_{.01,7} = 18.4753$

    Observed $\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} = 1.557$

    Since the observed $\chi^2 = 1.557 < \chi^2_{.01,7} = 18.4753$, the decision is to fail to reject the null hypothesis.

    There is no reason to conclude that the frequencies are not uniformly distributed.

    12.3    Number    $f_o$    (Number)($f_o$)
              0        28         0
              1        17        17
              2        11        22
              3         5        15
                       61        54

    $H_0$: The frequency distribution is Poisson.
    $H_a$: The frequency distribution is not Poisson.

    $\lambda = \frac{54}{61} = 0.9$

                Expected       Expected
    Number      Probability    Frequency
      0           .4066          24.803
      1           .3659          22.320
      2           .1647          10.047
    ≥ 3           .0628           3.831

    Since $f_e$ for ≥ 3 is less than 5, collapse categories 2 and ≥ 3:

    Number    $f_o$    $f_e$     $\frac{(f_o - f_e)^2}{f_e}$
      0        28      24.803     0.412
      1        17      22.320     1.268
    ≥ 2        16      13.878     0.324
               61      60.993     2.004

    df = k - 2 = 3 - 2 = 1, $\alpha = .05$

    $\chi^2_{.05,1} = 3.8415$

    Observed $\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} = 2.004$

    Since the observed $\chi^2 = 2.004 < \chi^2_{.05,1} = 3.8415$, the decision is to fail to reject the null hypothesis.

    There is insufficient evidence to reject the distribution as Poisson distributed. The conclusion is that the distribution is Poisson distributed.
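
    The expected Poisson frequencies in Problem 12.3 come from estimating lambda from the data and multiplying each Poisson probability by n. The sketch below reproduces that step under stated assumptions: the helper names are hypothetical, lambda is rounded to one decimal (0.9) as the solution does, and the last class is treated as an open "3 or more" category.

```python
import math


def poisson_pmf(x, lam):
    """Poisson probability of exactly x occurrences."""
    return math.exp(-lam) * lam ** x / math.factorial(x)


def poisson_expected(lam, n, open_class_start):
    """Expected frequencies for 0 .. open_class_start - 1, followed by
    one open-ended class for 'open_class_start or more'."""
    probs = [poisson_pmf(x, lam) for x in range(open_class_start)]
    probs.append(1.0 - sum(probs))  # remaining probability in the open class
    return [p * n for p in probs]


counts = {0: 28, 1: 17, 2: 11, 3: 5}  # number of occurrences -> observed frequency
n = sum(counts.values())
lam = round(sum(x * f for x, f in counts.items()) / n, 1)  # 54/61, rounded as in the solution

expected = poisson_expected(lam, n, open_class_start=3)
print(lam, [round(e, 2) for e in expected])
```

    Small differences from the tabulated values in the third decimal place arise because the solution multiplies probabilities that were already rounded to four decimals.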

    12.4
    Category    f (observed)    Midpt. M      fM        $fM^2$
    10-20            6             15          90        1,350
    20-30           14             25         350        8,750
    30-40           29             35       1,015       35,525
    40-50           38             45       1,710       76,950
    50-60           25             55       1,375       75,625
    60-70           10             65         650       42,250
    70-80            7             75         525       39,375
    $n = \sum f = 129$    $\sum fM = 5,715$    $\sum fM^2 = 279,825$

    $\bar{x} = \frac{\sum fM}{\sum f} = \frac{5,715}{129} = 44.3$

    $s = \sqrt{\frac{\sum fM^2 - \frac{(\sum fM)^2}{n}}{n - 1}} = \sqrt{\frac{279,825 - \frac{(5,715)^2}{129}}{128}} = 14.43$

    $H_0$: The observed frequencies are normally distributed.
    $H_a$: The observed frequencies are not normally distributed.

    For Category 10-20                                      Prob
    for x = 10, $z = \frac{10 - 44.3}{14.43} = -2.38$        .4913
    for x = 20, $z = \frac{20 - 44.3}{14.43} = -1.68$       -.4535
    Expected prob.:                                          .0378

    For Category 20-30                                      Prob
    for x = 20, z = -1.68                                    .4535
    for x = 30, $z = \frac{30 - 44.3}{14.43} = -0.99$       -.3389
    Expected prob.:                                          .1146

    For Category 30-40                                      Prob
    for x = 30, z = -0.99                                    .3389
    for x = 40, $z = \frac{40 - 44.3}{14.43} = -0.30$       -.1179
    Expected prob.:                                          .2210

    For Category 40-50                                      Prob
    for x = 40, z = -0.30                                    .1179
    for x = 50, $z = \frac{50 - 44.3}{14.43} = 0.40$        +.1554
    Expected prob.:                                          .2733

    For Category 50-60                                      Prob
    for x = 60, $z = \frac{60 - 44.3}{14.43} = 1.09$         .3621
    for x = 50, z = 0.40                                    -.1554
    Expected prob.:                                          .2067

    For Category 60-70                                      Prob
    for x = 70, $z = \frac{70 - 44.3}{14.43} = 1.78$         .4625
    for x = 60, z = 1.09                                    -.3621
    Expected prob.:                                          .1004

    For Category 70-80                                      Prob
    for x = 80, $z = \frac{80 - 44.3}{14.43} = 2.47$         .4932
    for x = 70, z = 1.78                                    -.4625
    Expected prob.:                                          .0307

    For x < 10:
    Probability between 10 and the mean, 44.3, = (.0378 + .1146 + .2210 + .1179) = .4913.
    Probability < 10 = .5000 - .4913 = .0087

    For x > 80:
    Probability between 80 and the mean, 44.3, = (.0307 + .1004 + .2067 + .1554) = .4932.
    Probability > 80 = .5000 - .4932 = .0068

    Category    Prob     Expected frequency
    < 10        .0087    .0087(129) =  1.12
    10-20       .0378    .0378(129) =  4.88
    20-30       .1146                 14.78
    30-40       .2210                 28.51
    40-50       .2733                 35.26
    50-60       .2067                 26.66
    60-70       .1004                 12.95
    70-80       .0307                  3.96
    > 80        .0068                  0.88

    Due to the small sizes of expected frequencies, category < 10 is folded into 10-20 and > 80 into 70-80.

    Category    $f_o$    $f_e$     $\frac{(f_o - f_e)^2}{f_e}$
    10-20         6       6.00      .000
    20-30        14      14.78      .041
    30-40        29      28.51      .008
    40-50        38      35.26      .213
    50-60        25      26.66      .103
    60-70        10      12.95      .672
    70-80         7       4.84      .964
                                   2.001

    Calculated $\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} = 2.001$

    df = k - 3 = 7 - 3 = 4, $\alpha = .05$

    $\chi^2_{.05,4} = 9.4877$

    Since the observed $\chi^2 = 2.001 < \chi^2_{.05,4} = 9.4877$, the decision is to fail to reject the null hypothesis. There is not enough evidence to declare that the observed frequencies are not normally distributed.
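
    The mean and standard deviation used in Problem 12.4 are the grouped-data estimates built from class midpoints. A short Python sketch of that calculation, assuming the midpoints and frequencies tabulated in the solution (the function name `grouped_mean_sd` is illustrative):

```python
import math


def grouped_mean_sd(midpoints, freqs):
    """Sample mean and standard deviation from grouped data,
    using class midpoints M and class frequencies f."""
    n = sum(freqs)
    sum_fm = sum(f * m for f, m in zip(freqs, midpoints))
    sum_fm2 = sum(f * m * m for f, m in zip(freqs, midpoints))
    mean = sum_fm / n
    variance = (sum_fm2 - sum_fm ** 2 / n) / (n - 1)
    return mean, math.sqrt(variance)


midpoints = [15, 25, 35, 45, 55, 65, 75]
freqs = [6, 14, 29, 38, 25, 10, 7]

mean, sd = grouped_mean_sd(midpoints, freqs)
print(f"mean = {mean:.2f}, s = {sd:.2f}")
```

    Because both parameters are estimated from the sample, the goodness-of-fit test for normality uses k - 3 degrees of freedom.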

    12.5    Definition               $f_o$    Exp. Prop.    $f_e$                 $\frac{(f_o - f_e)^2}{f_e}$
            Happiness                  42       .39         227(.39) = 88.53       24.46
            Sales/Profit               95       .12         227(.12) = 27.24      168.55
            Helping Others             27       .18                    40.86        4.70
            Achievement/Challenge      63       .31                    70.37        0.77
                                      227                                         198.48

    $H_0$: The observed frequencies are distributed the same as the expected frequencies.
    $H_a$: The observed frequencies are not distributed the same as the expected frequencies.

    Observed $\chi^2 = 198.48$

    df = k - 1 = 4 - 1 = 3, $\alpha = .05$

    $\chi^2_{.05,3} = 7.8147$

    Since the observed $\chi^2 = 198.48 > \chi^2_{.05,3} = 7.8147$, the decision is to reject the null hypothesis.

    The observed frequencies for men are not distributed the same as the
    expected frequencies, which are based on the responses of women.

    12.6    Age      $f_o$    Prop. from survey    $f_e$                  $\frac{(f_o - f_e)^2}{f_e}$
            10-14     22         .09               (.09)(212) = 19.08      0.45
            15-19     50         .23               (.23)(212) = 48.76      0.03
            20-24     43         .22                            46.64      0.28
            25-29     29         .14                            29.68      0.02
            30-34     19         .10                            21.20      0.23
            ≥ 35      49         .22                            46.64      0.12
                     212                                                   1.13

    $H_0$: The distribution of observed frequencies is the same as the distribution of expected frequencies.
    $H_a$: The distribution of observed frequencies is not the same as the distribution of expected frequencies.

    $\alpha = .01$, df = k - 1 = 6 - 1 = 5

    $\chi^2_{.01,5} = 15.0863$

    The observed $\chi^2 = 1.13$

    Since the observed $\chi^2 = 1.13 < \chi^2_{.01,5} = 15.0863$, the decision is to fail to reject the null hypothesis.

    There is not enough evidence to declare that the distribution of observed frequencies is different from the distribution of expected frequencies.

    12.7    Age      $f_o$     M       fM        $fM^2$
            10-20     16       15       240        3,600
            20-30     44       25     1,100       27,500
            30-40     61       35     2,135       74,725
            40-50     56       45     2,520      113,400
            50-60     35       55     1,925      105,875
            60-70     19       65     1,235       80,275
                     231       $\sum fM = 9,155$    $\sum fM^2 = 405,375$

    $\bar{x} = \frac{\sum fM}{n} = \frac{9,155}{231} = 39.63$

    $s = \sqrt{\frac{\sum fM^2 - \frac{(\sum fM)^2}{n}}{n - 1}} = \sqrt{\frac{405,375 - \frac{(9,155)^2}{231}}{230}} = 13.6$

    $H_0$: The observed frequencies are normally distributed.
    $H_a$: The observed frequencies are not normally distributed.

    For Category 10-20                                      Prob
    for x = 10, $z = \frac{10 - 39.63}{13.6} = -2.18$        .4854
    for x = 20, $z = \frac{20 - 39.63}{13.6} = -1.44$       -.4251
    Expected prob.:                                          .0603

    For Category 20-30                                      Prob
    for x = 20, z = -1.44                                    .4251
    for x = 30, $z = \frac{30 - 39.63}{13.6} = -0.71$       -.2611
    Expected prob.:                                          .1640

    For Category 30-40                                      Prob
    for x = 30, z = -0.71                                    .2611
    for x = 40, $z = \frac{40 - 39.63}{13.6} = 0.03$        +.0120
    Expected prob.:                                          .2731

    For Category 40-50                                      Prob
    for x = 50, $z = \frac{50 - 39.63}{13.6} = 0.76$         .2764
    for x = 40, z = 0.03                                    -.0120
    Expected prob.:                                          .2644

    For Category 50-60                                      Prob
    for x = 60, $z = \frac{60 - 39.63}{13.6} = 1.50$         .4332
    for x = 50, z = 0.76                                    -.2764
    Expected prob.:                                          .1568

    For Category 60-70                                      Prob
    for x = 70, $z = \frac{70 - 39.63}{13.6} = 2.23$         .4871
    for x = 60, z = 1.50                                    -.4332
    Expected prob.:                                          .0539

    For < 10:
    Probability between 10 and the mean = .0603 + .1640 + .2611 = .4854
    Probability < 10 = .5000 - .4854 = .0146

    For > 70:
    Probability between 70 and the mean = .0120 + .2644 + .1568 + .0539 = .4871
    Probability > 70 = .5000 - .4871 = .0129

    Age      Probability    $f_e$
    < 10       .0146        (.0146)(231) =  3.37
    10-20      .0603        (.0603)(231) = 13.93
    20-30      .1640                       37.88
    30-40      .2731                       63.09
    40-50      .2644                       61.08
    50-60      .1568                       36.22
    60-70      .0539                       12.45
    > 70       .0129                        2.98

    The expected frequencies for categories < 10 and > 70 are less than 5.
    Collapse the < 10 into 10-20 and > 70 into 60-70.

    Age      $f_o$    $f_e$     $\frac{(f_o - f_e)^2}{f_e}$
    10-20     16      17.30      0.10
    20-30     44      37.88      0.99
    30-40     61      63.09      0.07
    40-50     56      61.08      0.42
    50-60     35      36.22      0.04
    60-70     19      15.43      0.83
                                 2.45

    df = k - 3 = 6 - 3 = 3, $\alpha = .05$

    $\chi^2_{.05,3} = 7.8147$

    Observed $\chi^2 = 2.45$

    Since the observed $\chi^2 < \chi^2_{.05,3} = 7.8147$, the decision is to fail to reject the null hypothesis.

    There is no reason to reject that the observed frequencies are normally distributed.

    12.8    Number       f      (f)(number)
            0            18         0
            1            28        28
            2            47        94
            3            21        63
            4            16        64
            5            11        55
            6 or more     9        54
            $\sum f = 150$    $\sum (f)(\text{number}) = 358$

    $\lambda = \frac{\sum (f)(\text{number})}{\sum f} = \frac{358}{150} = 2.4$

    $H_0$: The observed frequencies are Poisson distributed.
    $H_a$: The observed frequencies are not Poisson distributed.

    Number       Probability    $f_e$
    0              .0907        (.0907)(150) = 13.61
    1              .2177        (.2177)(150) = 32.66
    2              .2613                       39.20
    3              .2090                       31.35
    4              .1254                       18.81
    5              .0602                        9.03
    6 or more      .0358                        5.36

    $f_o$    $f_e$     $\frac{(f_o - f_e)^2}{f_e}$
     18      13.61      1.42
     28      32.66      0.66
     47      39.20      1.55
     21      31.35      3.42
     16      18.81      0.42
     11       9.03      0.43
      9       5.36      2.47
                       10.37

    The observed $\chi^2 = 10.37$

    $\alpha = .01$, df = k - 2 = 7 - 2 = 5, $\chi^2_{.01,5} = 15.0863$

    Since the observed $\chi^2 = 10.37 < \chi^2_{.01,5} = 15.0863$, the decision is to fail to reject the null hypothesis.

    There is not enough evidence to reject the claim that the observed frequencies are Poisson distributed.

    12.9    $H_0$: p = .28        n = 270    x = 62
            $H_a$: p ≠ .28

                          $f_o$    $f_e$                  $\frac{(f_o - f_e)^2}{f_e}$
    Spend More              62     270(.28) =  75.6        2.44656
    Don't Spend More       208     270(.72) = 194.4        0.95144
    Total                  270                270.0        3.39800

    The observed value of $\chi^2$ is 3.398

    $\alpha = .05$ and $\alpha/2 = .025$, df = k - 1 = 2 - 1 = 1

    $\chi^2_{.025,1} = 5.02389$

    Since the observed $\chi^2 = 3.398 < \chi^2_{.025,1} = 5.02389$, the decision is to fail to reject the null hypothesis.
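
    With only two categories, the chi-square goodness-of-fit statistic is algebraically the square of the one-sample z statistic for a proportion, which is why the chapter presents it as an alternative to the z test. The following sketch checks that identity numerically with the figures from Problem 12.9; the variable names are illustrative.

```python
import math

n, x, p0 = 270, 62, 0.28  # sample size, number of successes, hypothesized proportion

# Chi-square goodness-of-fit with two categories (success / failure).
observed = [x, n - x]
expected = [n * p0, n * (1 - p0)]
chi2 = sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))

# One-sample z statistic for the same hypothesis.
p_hat = x / n
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

print(f"chi-square = {chi2:.3f}, z = {z:.3f}, z squared = {z ** 2:.3f}")
```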

    12.10   $H_0$: p = .30        n = 180    x = 42
            $H_a$: p < .30

                       $f_o$    $f_e$                $\frac{(f_o - f_e)^2}{f_e}$
    Provide              42     180(.30) =  54        2.6666
    Don't Provide       138     180(.70) = 126        1.1429
    Total               180                180        3.8095

    The observed value of $\chi^2$ is 3.8095

    $\alpha = .05$, df = k - 1 = 2 - 1 = 1

    $\chi^2_{.05,1} = 3.8415$

    Since the observed $\chi^2 = 3.8095 < \chi^2_{.05,1} = 3.8415$, the decision is to fail to reject the null hypothesis.

    12.11
                         Variable Two
    Variable One      203     326     529
                       68     110     178
                      271     436     707

    $H_0$: Variable One is independent of Variable Two.
    $H_a$: Variable One is not independent of Variable Two.

    $e_{11} = \frac{(529)(271)}{707} = 202.77$        $e_{12} = \frac{(529)(436)}{707} = 326.23$

    $e_{21} = \frac{(271)(178)}{707} = 68.23$         $e_{22} = \frac{(436)(178)}{707} = 109.77$

                         Variable Two
    Variable One      (202.77) 203     (326.23) 326     529
                       (68.23)  68     (109.77) 110     178
                               271              436     707

    $\chi^2 = \frac{(203 - 202.77)^2}{202.77} + \frac{(326 - 326.23)^2}{326.23} + \frac{(68 - 68.23)^2}{68.23} + \frac{(110 - 109.77)^2}{109.77}$
    = .00 + .00 + .00 + .00 = 0.00

    $\alpha = .01$, df = (c-1)(r-1) = (2-1)(2-1) = 1

    $\chi^2_{.01,1} = 6.6349$

    Since the observed $\chi^2 = 0.00 < \chi^2_{.01,1} = 6.6349$, the decision is to fail to reject the null hypothesis.

    Variable One is independent of Variable Two.
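
    The expected frequencies in a test of independence are each (row total)(column total)/n, and the statistic sums (observed - expected)^2 / expected over every cell. A minimal Python sketch, using the 2 x 2 table from Problem 12.11; the function names are illustrative assumptions, and the comparison with a critical value is left to the chi-square table as in the solution.

```python
def expected_frequencies(table):
    """Expected cell counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    expected = expected_frequencies(table)
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


table = [[203, 326],
         [68, 110]]

expected = expected_frequencies(table)
chi2, df = chi_square_independence(table)
print([[round(e, 2) for e in row] for row in expected])
print(f"chi-square = {chi2:.2f}, df = {df}")
```

    The same two functions apply unchanged to the larger tables in the problems that follow, since nothing in them depends on the table being 2 x 2.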

    12.12
                               Variable Two
    Variable One       24      13      47      58     142
                       93      59     187     244     583
                      117      72     234     302     725

    $H_0$: Variable One is independent of Variable Two.
    $H_a$: Variable One is not independent of Variable Two.

    $e_{11} = \frac{(142)(117)}{725} = 22.92$        $e_{12} = \frac{(142)(72)}{725} = 14.10$

    $e_{13} = \frac{(142)(234)}{725} = 45.83$        $e_{14} = \frac{(142)(302)}{725} = 59.15$

    $e_{21} = \frac{(583)(117)}{725} = 94.08$        $e_{22} = \frac{(583)(72)}{725} = 57.90$

    $e_{23} = \frac{(583)(234)}{725} = 188.17$       $e_{24} = \frac{(583)(302)}{725} = 242.85$

                               Variable Two
    Variable One     (22.92) 24    (14.10) 13     (45.83)  47     (59.15)  58    142
                     (94.08) 93    (57.90) 59    (188.17) 187    (242.85) 244    583
                            117            72             234             302    725

    $\chi^2 = \frac{(24 - 22.92)^2}{22.92} + \frac{(13 - 14.10)^2}{14.10} + \frac{(47 - 45.83)^2}{45.83} + \frac{(58 - 59.15)^2}{59.15}$
    $+ \frac{(93 - 94.08)^2}{94.08} + \frac{(59 - 57.90)^2}{57.90} + \frac{(187 - 188.17)^2}{188.17} + \frac{(244 - 242.85)^2}{242.85}$
    = .05 + .09 + .03 + .02 + .01 + .02 + .01 + .01 = 0.24

    $\alpha = .01$, df = (c-1)(r-1) = (4-1)(2-1) = 3, $\chi^2_{.01,3} = 11.3449$

    Since the observed $\chi^2 = 0.24 < \chi^2_{.01,3} = 11.3449$, the decision is to fail to reject the null hypothesis.

    Variable One is independent of Variable Two.

    12.13
                                   Social Class
                             Lower    Middle    Upper
    Number of     0             7       18         6       31
    Children      1             9       38        23       70
                  2 or 3       34       97        58      189
                  >3           47       31        30      108
                               97      184       117      398

    $H_0$: Social Class is independent of Number of Children.
    $H_a$: Social Class is not independent of Number of Children.

    $e_{11} = \frac{(31)(97)}{398} = 7.56$          $e_{31} = \frac{(189)(97)}{398} = 46.06$

    $e_{12} = \frac{(31)(184)}{398} = 14.33$        $e_{32} = \frac{(189)(184)}{398} = 87.38$

    $e_{13} = \frac{(31)(117)}{398} = 9.11$         $e_{33} = \frac{(189)(117)}{398} = 55.56$

    $e_{21} = \frac{(70)(97)}{398} = 17.06$         $e_{41} = \frac{(108)(97)}{398} = 26.32$

    $e_{22} = \frac{(70)(184)}{398} = 32.36$        $e_{42} = \frac{(108)(184)}{398} = 49.93$

    $e_{23} = \frac{(70)(117)}{398} = 20.58$        $e_{43} = \frac{(108)(117)}{398} = 31.75$

                                   Social Class
                             Lower         Middle         Upper
    Number of     0           (7.56)  7    (14.33) 18      (9.11)  6      31
    Children      1          (17.06)  9    (32.36) 38     (20.58) 23      70
                  2 or 3     (46.06) 34    (87.38) 97     (55.56) 58     189
                  >3         (26.32) 47    (49.93) 31     (31.75) 30     108
                                     97           184            117     398

    $\chi^2 = \frac{(7 - 7.56)^2}{7.56} + \frac{(18 - 14.33)^2}{14.33} + \frac{(6 - 9.11)^2}{9.11} + \frac{(9 - 17.06)^2}{17.06}$
    $+ \frac{(38 - 32.36)^2}{32.36} + \frac{(23 - 20.58)^2}{20.58} + \frac{(34 - 46.06)^2}{46.06} + \frac{(97 - 87.38)^2}{87.38}$
    $+ \frac{(58 - 55.56)^2}{55.56} + \frac{(47 - 26.32)^2}{26.32} + \frac{(31 - 49.93)^2}{49.93} + \frac{(30 - 31.75)^2}{31.75}$
    = .04 + .94 + 1.06 + 3.81 + .98 + .28 + 3.16 + 1.06 + .11 + 16.25 + 7.18 + .10 = 34.97

    $\alpha = .05$, df = (c-1)(r-1) = (3-1)(4-1) = 6

    $\chi^2_{.05,6} = 12.5916$

    Since the observed $\chi^2 = 34.97 > \chi^2_{.05,6} = 12.5916$, the decision is to reject the null hypothesis.

    Number of children is not independent of social class.

    12.14
                        Type of Music Preferred
                  Rock     R&B     Country    Classical
    Region  NE     140      32        5          18        195
            S      134      41       52           8        235
            W      154      27        8          13        202
                   428     100       65          39        632

    $H_0$: Type of music preferred is independent of region.
    $H_a$: Type of music preferred is not independent of region.

    $e_{11} = \frac{(195)(428)}{632} = 132.06$       $e_{23} = \frac{(235)(65)}{632} = 24.17$

    $e_{12} = \frac{(195)(100)}{632} = 30.85$        $e_{24} = \frac{(235)(39)}{632} = 14.50$

    $e_{13} = \frac{(195)(65)}{632} = 20.06$         $e_{31} = \frac{(202)(428)}{632} = 136.80$

    $e_{14} = \frac{(195)(39)}{632} = 12.03$         $e_{32} = \frac{(202)(100)}{632} = 31.96$

    $e_{21} = \frac{(235)(428)}{632} = 159.15$       $e_{33} = \frac{(202)(65)}{632} = 20.78$

    $e_{22} = \frac{(235)(100)}{632} = 37.18$        $e_{34} = \frac{(202)(39)}{632} = 12.47$

                        Type of Music Preferred
                  Rock            R&B           Country        Classical
    Region  NE    (132.06) 140    (30.85) 32    (20.06)  5     (12.03) 18     195
            S     (159.15) 134    (37.18) 41    (24.17) 52     (14.50)  8     235
            W     (136.80) 154    (31.96) 27    (20.78)  8     (12.47) 13     202
                           428           100            65             39     632

    $\chi^2 = \frac{(140 - 132.06)^2}{132.06} + \frac{(32 - 30.85)^2}{30.85} + \frac{(5 - 20.06)^2}{20.06} + \frac{(18 - 12.03)^2}{12.03}$
    $+ \frac{(134 - 159.15)^2}{159.15} + \frac{(41 - 37.18)^2}{37.18} + \frac{(52 - 24.17)^2}{24.17} + \frac{(8 - 14.50)^2}{14.50}$
    $+ \frac{(154 - 136.80)^2}{136.80} + \frac{(27 - 31.96)^2}{31.96} + \frac{(8 - 20.78)^2}{20.78} + \frac{(13 - 12.47)^2}{12.47}$
    = .48 + .04 + 11.31 + 2.96 + 3.97 + .39 + 32.04 + 2.91 + 2.16 + .77 + 7.86 + .02 = 64.91

    $\alpha = .01$, df = (c-1)(r-1) = (4-1)(3-1) = 6

    $\chi^2_{.01,6} = 16.8119$

    Since the observed $\chi^2 = 64.91 > \chi^2_{.01,6} = 16.8119$, the decision is to reject the null hypothesis.

    Type of music preferred is not independent of region of the country.

    12.15
                               Transportation Mode
                             Air     Train     Truck
    Industry   Publishing     32       12        41       85
               Comp.Hard.      5        6        24       35
                              37       18        65      120

    $H_0$: Transportation Mode is independent of Industry.
    $H_a$: Transportation Mode is not independent of Industry.

    $e_{11} = \frac{(85)(37)}{120} = 26.21$        $e_{21} = \frac{(35)(37)}{120} = 10.79$

    $e_{12} = \frac{(85)(18)}{120} = 12.75$        $e_{22} = \frac{(35)(18)}{120} = 5.25$

    $e_{13} = \frac{(85)(65)}{120} = 46.04$        $e_{23} = \frac{(35)(65)}{120} = 18.96$

                               Transportation Mode
                             Air           Train         Truck
    Industry   Publishing    (26.21) 32    (12.75) 12    (46.04) 41      85
               Comp.Hard.    (10.79)  5     (5.25)  6    (18.96) 24      35
                                     37            18            65     120

    $\chi^2 = \frac{(32 - 26.21)^2}{26.21} + \frac{(12 - 12.75)^2}{12.75} + \frac{(41 - 46.04)^2}{46.04}$
    $+ \frac{(5 - 10.79)^2}{10.79} + \frac{(6 - 5.25)^2}{5.25} + \frac{(24 - 18.96)^2}{18.96}$
    = 1.28 + .04 + .55 + 3.11 + .11 + 1.34 = 6.43

    $\alpha = .05$, df = (c-1)(r-1) = (3-1)(2-1) = 2

    $\chi^2_{.05,2} = 5.9915$

    Since the observed $\chi^2 = 6.43 > \chi^2_{.05,2} = 5.9915$, the decision is to reject the null hypothesis.

    Transportation mode is not independent of industry.

    12.16
                          Number of Bedrooms
                         ≤ 2      3      ≥ 4
    Number of     1      116     101      57      274
    Stories       2       90     325     160      575
                         206     426     217      849

    $H_0$: Number of Stories is independent of number of bedrooms.
    $H_a$: Number of Stories is not independent of number of bedrooms.

    $e_{11} = \frac{(274)(206)}{849} = 66.48$         $e_{21} = \frac{(575)(206)}{849} = 139.52$

    $e_{12} = \frac{(274)(426)}{849} = 137.48$        $e_{22} = \frac{(575)(426)}{849} = 288.52$

    $e_{13} = \frac{(274)(217)}{849} = 70.03$         $e_{23} = \frac{(575)(217)}{849} = 146.97$

    $\chi^2 = \frac{(116 - 66.48)^2}{66.48} + \frac{(101 - 137.48)^2}{137.48} + \frac{(57 - 70.03)^2}{70.03}$
    $+ \frac{(90 - 139.52)^2}{139.52} + \frac{(325 - 288.52)^2}{288.52} + \frac{(160 - 146.97)^2}{146.97}$

    $\chi^2$ = 36.89 + 9.68 + 2.42 + 17.58 + 4.61 + 1.16 = 72.34

    $\alpha = .10$, df = (c-1)(r-1) = (3-1)(2-1) = 2

    $\chi^2_{.10,2} = 4.6052$

    Since the observed $\chi^2 = 72.34 > \chi^2_{.10,2} = 4.6052$, the decision is to reject the null hypothesis.

    Number of stories is not independent of number of bedrooms.

    12.17
                       Mexican Citizens
                        Yes     No
    Type of    Dept.     24     17      41
    Store      Disc.     20     15      35
               Hard.     11     19      30
               Shoe      32     28      60
                         87     79     166

    $H_0$: Citizenship is independent of store type.
    $H_a$: Citizenship is not independent of store type.

    $e_{11} = \frac{(41)(87)}{166} = 21.49$        $e_{31} = \frac{(30)(87)}{166} = 15.72$

    $e_{12} = \frac{(41)(79)}{166} = 19.51$        $e_{32} = \frac{(30)(79)}{166} = 14.28$

    $e_{21} = \frac{(35)(87)}{166} = 18.34$        $e_{41} = \frac{(60)(87)}{166} = 31.45$

    $e_{22} = \frac{(35)(79)}{166} = 16.66$        $e_{42} = \frac{(60)(79)}{166} = 28.55$

                       Mexican Citizens
                        Yes           No
    Type of    Dept.    (21.49) 24    (19.51) 17      41
    Store      Disc.    (18.34) 20    (16.66) 15      35
               Hard.    (15.72) 11    (14.28) 19      30
               Shoe     (31.45) 32    (28.55) 28      60
                                87            79     166

    $\chi^2 = \frac{(24 - 21.49)^2}{21.49} + \frac{(17 - 19.51)^2}{19.51} + \frac{(20 - 18.34)^2}{18.34} + \frac{(15 - 16.66)^2}{16.66}$
    $+ \frac{(11 - 15.72)^2}{15.72} + \frac{(19 - 14.28)^2}{14.28} + \frac{(32 - 31.45)^2}{31.45} + \frac{(28 - 28.55)^2}{28.55}$
    = .29 + .32 + .15 + .17 + 1.42 + 1.56 + .01 + .01 = 3.93

    $\alpha = .05$, df = (c-1)(r-1) = (2-1)(4-1) = 3

    $\chi^2_{.05,3} = 7.8147$

    Since the observed $\chi^2 = 3.93 < \chi^2_{.05,3} = 7.8147$, the decision is to fail to reject the null hypothesis.

    Citizenship is independent of type of store.

    12.18   $\alpha = .01$, k = 7, df = 6

    $H_0$: The observed distribution is the same as the expected distribution.
    $H_a$: The observed distribution is not the same as the expected distribution.

    Use: $\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$

    critical $\chi^2_{.01,6} = 16.8119$

    $f_o$    $f_e$    $(f_o - f_e)^2$    $\frac{(f_o - f_e)^2}{f_e}$
    214      206          64              0.311
    235      232           9              0.039
    279      268         121              0.451
    281      284           9              0.032
    264      268          16              0.060
    254      232         484              2.086
    211      206          25              0.121
                                          3.100

    $\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} = 3.100$

    Since the observed value of $\chi^2 = 3.1 < \chi^2_{.01,6} = 16.8119$, the decision is to fail to
    reject the null hypothesis. The observed distribution is not different from the
    expected distribution.

    12.19
                        Variable 2
    Variable 1      12     23     21      56
                     8     17     20      45
                     7     11     18      36
                    27     51     59     137

    $e_{11} = 11.04$    $e_{12} = 20.85$    $e_{13} = 24.12$
    $e_{21} = 8.87$     $e_{22} = 16.75$    $e_{23} = 19.38$
    $e_{31} = 7.09$     $e_{32} = 13.40$    $e_{33} = 15.50$

    $\chi^2 = \frac{(12 - 11.04)^2}{11.04} + \frac{(23 - 20.85)^2}{20.85} + \frac{(21 - 24.12)^2}{24.12}$
    $+ \frac{(8 - 8.87)^2}{8.87} + \frac{(17 - 16.75)^2}{16.75} + \frac{(20 - 19.38)^2}{19.38}$
    $+ \frac{(7 - 7.09)^2}{7.09} + \frac{(11 - 13.40)^2}{13.40} + \frac{(18 - 15.50)^2}{15.50}$
    = .084 + .222 + .403 + .085 + .004 + .020 + .001 + .430 + .402 = 1.652

    df = (c-1)(r-1) = (2)(2) = 4, $\alpha = .05$

    $\chi^2_{.05,4} = 9.4877$

    Since the observed value of $\chi^2 = 1.652 < \chi^2_{.05,4} = 9.4877$, the decision is to fail to reject the null hypothesis.

    12.20
                                   Location
                              NE       W       S
    Customer   Industrial     230     115      68     413
               Retail         185     143      89     417
                              415     258     157     830

    $e_{11} = \frac{(413)(415)}{830} = 206.5$         $e_{21} = \frac{(417)(415)}{830} = 208.5$

    $e_{12} = \frac{(413)(258)}{830} = 128.38$        $e_{22} = \frac{(417)(258)}{830} = 129.62$

    $e_{13} = \frac{(413)(157)}{830} = 78.12$         $e_{23} = \frac{(417)(157)}{830} = 78.88$

                                   Location
                              NE             W               S
    Customer   Industrial     (206.5) 230    (128.38) 115    (78.12) 68     413
               Retail         (208.5) 185    (129.62) 143    (78.88) 89     417
                                      415             258           157     830

    $\chi^2 = \frac{(230 - 206.5)^2}{206.5} + \frac{(115 - 128.38)^2}{128.38} + \frac{(68 - 78.12)^2}{78.12}$
    $+ \frac{(185 - 208.5)^2}{208.5} + \frac{(143 - 129.62)^2}{129.62} + \frac{(89 - 78.88)^2}{78.88}$
    = 2.67 + 1.39 + 1.31 + 2.65 + 1.38 + 1.30 = 10.70

    $\alpha = .10$ and df = (c-1)(r-1) = (3-1)(2-1) = 2

    $\chi^2_{.10,2} = 4.6052$

    Since the observed $\chi^2 = 10.70 > \chi^2_{.10,2} = 4.6052$, the decision is to reject the null hypothesis.

    Type of customer is not independent of geographic region.

    12.21   Cookie Type        $f_o$
            Chocolate Chip      189
            Peanut Butter       168
            Cheese Cracker      155
            Lemon Flavored      161
            Chocolate Mint      216
            Vanilla Filled      165
            $\sum f_o = 1,054$

    $H_0$: Cookie sales are uniformly distributed across kind of cookie.
    $H_a$: Cookie sales are not uniformly distributed across kind of cookie.

    If cookie sales are uniformly distributed, then $f_e = \frac{\sum f_o}{\text{no. kinds}} = \frac{1,054}{6} = 175.67$

    $f_o$    $f_e$      $\frac{(f_o - f_e)^2}{f_e}$
    189      175.67      1.01
    168      175.67      0.33
    155      175.67      2.43
    161      175.67      1.23
    216      175.67      9.26
    165      175.67      0.65
                        14.91

    The observed $\chi^2 = 14.91$

    $\alpha = .05$, df = k - 1 = 6 - 1 = 5

    $\chi^2_{.05,5} = 11.0705$

    Since the observed $\chi^2 = 14.91 > \chi^2_{.05,5} = 11.0705$, the decision is to reject the null hypothesis.

    Cookie sales are not uniformly distributed by kind of cookie.

  • 8/3/2019 Ken Black QA 5th chapter 12 Solution

    30/36

    Chapter 12: Analysis of Categorical Data 30

    12.22

    Gender

    M F

    Bought

    Car

    Y 207 65 272

    N 811 984 1,7951,018 1,049 2,067

    Ho: Purchasing a car or not is independent of gender.Ha: Purchasing a car or not is not independent of gender.

    e11 =067,2

    )018,1)(272(= 133.96 e12 =

    067,2

    )049,1)(27(= 138.04

    e21 =067,2

    )018,1)(795,1(= 884.04 e22 =

    067,2

    )049,1)(795,1(= 910.96

    Gender

    M F

    Bought

    Car

    Y (133.96

    )

    207

    (138.04

    )

    65

    272

    N (884.04

    )811

    (910.96

    )984

    1,795

    1,018 1,049 2,067

    $\chi^2 = \frac{(207 - 133.96)^2}{133.96} + \frac{(65 - 138.04)^2}{138.04} + \frac{(811 - 884.04)^2}{884.04} + \frac{(984 - 910.96)^2}{910.96}$

    = 39.82 + 38.65 + 6.03 + 5.86 = 90.36

    $\alpha = .05$   df = (c - 1)(r - 1) = (2 - 1)(2 - 1) = 1

    $\chi^2_{.05,1} = 3.8415$

    Since the observed $\chi^2 = 90.36 > \chi^2_{.05,1} = 3.8415$, the decision is to reject the null hypothesis.

    Purchasing a car is not independent of gender.
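
    The expected frequencies and the chi-square sum for 12.22 follow directly from the row and column totals. The Python sketch below shows one way to reproduce them; the variable names are illustrative, and the critical value is the table value used in the solution, not something the code derives.

```python
# Sketch of the 12.22 test of independence for a 2 x 2 table.
# Names are illustrative; the critical value is copied from the solution.

observed = [[207, 65],     # bought a car:       male, female
            [811, 984]]    # did not buy a car:  male, female

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected frequency of each cell: (row total)(column total) / n
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi2 = sum((fo - fe) ** 2 / fe
           for obs_row, exp_row in zip(observed, expected)
           for fo, fe in zip(obs_row, exp_row))

df = (len(col_totals) - 1) * (len(row_totals) - 1)   # (c - 1)(r - 1)
critical = 3.8415                                    # alpha = .05, df = 1

print([[round(fe, 2) for fe in row] for row in expected])
print(f"chi-square = {chi2:.2f}, df = {df}")
print("reject Ho" if chi2 > critical else "fail to reject Ho")
```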

    12.23   Arrivals    $f_o$    $(f_o)(\text{Arrivals})$
               0          26           0
               1          40          40
               2          57         114
               3          32          96
               4          17          68
               5          12          60
               6           8          48
            $\Sigma f_o = 192$    $\Sigma (f_o)(\text{arrivals}) = 426$

    $\lambda = \frac{\Sigma (f_o)(\text{arrivals})}{\Sigma f_o} = \frac{426}{192} = 2.2$

    Ho: The observed frequencies are Poisson distributed.

    Ha: The observed frequencies are not Poisson distributed.

    Arrivals    Probability    $f_e$
       0          .1108        (.1108)(192) = 21.27
       1          .2438        (.2438)(192) = 46.81
       2          .2681        51.48
       3          .1966        37.75
       4          .1082        20.77
       5          .0476         9.14
       6          .0249         4.78

    $f_o$    $f_e$    $(f_o - f_e)^2 / f_e$
     26      21.27     1.05
     40      46.81     0.99
     57      51.48     0.59
     32      37.75     0.88
     17      20.77     0.68
     12       9.14     0.89
      8       4.78     2.17
                       7.25

    Observed $\chi^2 = 7.25$

    $\alpha = .05$   df = k - 2 = 7 - 2 = 5

    $\chi^2_{.05,5} = 11.0705$

    Since the observed $\chi^2 = 7.25 < \chi^2_{.05,5} = 11.0705$, the decision is to fail to reject
    the null hypothesis. There is not enough evidence to reject the claim that the
    observed frequency of arrivals is Poisson distributed.
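
    The Poisson fit in 12.23 can be sketched in Python as follows. Two points are assumptions rather than statements from the solution: lambda is rounded to one decimal place (2.2), as in the hand computation, and the last category is treated as "6 or more" arrivals so that the probabilities sum to 1, which is consistent with the tabled value of .0249. The function name `poisson_pmf` is hypothetical.

```python
import math

# Sketch of the Poisson goodness-of-fit test in 12.23.
arrivals = [0, 1, 2, 3, 4, 5, 6]
observed = [26, 40, 57, 32, 17, 12, 8]

n = sum(observed)
# Estimate lambda from the data and round to one decimal, as in the text.
lam = round(sum(x * fo for x, fo in zip(arrivals, observed)) / n, 1)

def poisson_pmf(x, lam):
    """Poisson probability of exactly x arrivals (hypothetical helper)."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

# Assumption: the final category collects "6 or more" arrivals.
probs = [poisson_pmf(x, lam) for x in arrivals[:-1]]
probs.append(1 - sum(probs))

expected = [p * n for p in probs]
chi2 = sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))

df = len(observed) - 2      # k - 2: one extra df lost for estimating lambda
critical = 11.0705          # table value for alpha = .05, df = 5

print(f"lambda = {lam}, chi-square = {chi2:.2f}, df = {df}")
print("reject Ho" if chi2 > critical else "fail to reject Ho")
```

    The degrees of freedom are k - 2 rather than k - 1 because lambda is estimated from the sample.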


    12.24   Ho: The distribution of observed frequencies is the same as the
                distribution of expected frequencies.
            Ha: The distribution of observed frequencies is not the same as the
                distribution of expected frequencies.

    Soft Drink      $f_o$    proportions    $f_e$                      $(f_o - f_e)^2 / f_e$
    Classic Coke     314        .179        (.179)(1726) = 308.95        0.08
    Pepsi            219        .115        (.115)(1726) = 198.49        2.12
    Diet Coke        212        .097        167.42                      11.87
    Mt. Dew          121        .063        108.74                       1.38
    Diet Pepsi        98        .061        105.29                       0.50
    Sprite            93        .057         98.38                       0.29
    Dr. Pepper        88        .056         96.66                       0.78
    Others           581        .372        642.07                       5.81
    $\Sigma f_o = 1{,}726$                                              22.83

    Observed $\chi^2 = 22.83$

    $\alpha = .05$   df = k - 1 = 8 - 1 = 7

    $\chi^2_{.05,7} = 14.0671$

    Since the observed $\chi^2 = 22.83 > \chi^2_{.05,7} = 14.0671$, the decision is to reject the null hypothesis.

    The observed frequencies are not distributed the same as the expected frequencies from the national poll.
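
    When the expected distribution is given as proportions, as in 12.24, each expected frequency is the proportion times the sample size. The Python sketch below prints one row per soft drink so the individual contributions can be compared with the hand computation; the dictionary name `poll` and its layout are illustrative choices, not from the text.

```python
# Sketch of the 12.24 goodness-of-fit test against national proportions.
# Each entry maps a soft drink to (observed count, national proportion).
poll = {
    "Classic Coke": (314, 0.179),
    "Pepsi":        (219, 0.115),
    "Diet Coke":    (212, 0.097),
    "Mt. Dew":      (121, 0.063),
    "Diet Pepsi":   (98,  0.061),
    "Sprite":       (93,  0.057),
    "Dr. Pepper":   (88,  0.056),
    "Others":       (581, 0.372),
}

n = sum(fo for fo, _ in poll.values())

chi2 = 0.0
for drink, (fo, p) in poll.items():
    fe = p * n                      # expected frequency under Ho
    term = (fo - fe) ** 2 / fe      # this category's contribution
    chi2 += term
    print(f"{drink:<13} fo = {fo:>3}  fe = {fe:7.2f}  term = {term:5.2f}")

df = len(poll) - 1                  # k - 1
critical = 14.0671                  # table value for alpha = .05, df = 7

print(f"chi-square = {chi2:.2f}, df = {df}")
print("reject Ho" if chi2 > critical else "fail to reject Ho")
```

    Because the code carries unrounded expected frequencies, its total may differ from the hand-computed sum in the second decimal place.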


    12.25
                                  Position
                                                        Systems
                    Manager   Programmer   Operator     Analyst
             0-3        6         37          11           13         67
    Years    4-8       28         16          23           24         91
             > 8       47         10          12           19         88
                       81         63          46           56        246

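    $e_{11} = \frac{(81)(67)}{246} = 22.06$      $e_{23} = \frac{(46)(91)}{246} = 17.02$

    $e_{12} = \frac{(63)(67)}{246} = 17.16$      $e_{24} = \frac{(56)(91)}{246} = 20.72$

    $e_{13} = \frac{(46)(67)}{246} = 12.53$      $e_{31} = \frac{(81)(88)}{246} = 28.98$

    $e_{14} = \frac{(56)(67)}{246} = 15.25$      $e_{32} = \frac{(63)(88)}{246} = 22.54$

    $e_{21} = \frac{(81)(91)}{246} = 29.96$      $e_{33} = \frac{(46)(88)}{246} = 16.46$

    $e_{22} = \frac{(63)(91)}{246} = 23.30$      $e_{34} = \frac{(56)(88)}{246} = 20.03$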

    Observed frequencies, with expected frequencies in parentheses:

                                  Position
                                                          Systems
                    Manager    Programmer    Operator     Analyst
             0-3    (22.06)     (17.16)      (12.53)      (15.25)
                       6          37           11           13          67
    Years    4-8    (29.96)     (23.30)      (17.02)      (20.72)
                      28          16           23           24          91
             > 8    (28.98)     (22.54)      (16.46)      (20.03)
                      47          10           12           19          88
                      81          63           46           56         246


    $\chi^2 = \frac{(6 - 22.06)^2}{22.06} + \frac{(37 - 17.16)^2}{17.16} + \frac{(11 - 12.53)^2}{12.53} + \frac{(13 - 15.25)^2}{15.25}$

    $+ \frac{(28 - 29.96)^2}{29.96} + \frac{(16 - 23.30)^2}{23.30} + \frac{(23 - 17.02)^2}{17.02} + \frac{(24 - 20.72)^2}{20.72}$

    $+ \frac{(47 - 28.98)^2}{28.98} + \frac{(10 - 22.54)^2}{22.54} + \frac{(12 - 16.46)^2}{16.46} + \frac{(19 - 20.03)^2}{20.03}$

    = 11.69 + 22.94 + .19 + .33 + .13 + 2.29 + 2.1 + .52 + 11.2 + 6.98 + 1.21 + .05 = 59.63

    $\alpha = .01$   df = (c - 1)(r - 1) = (4 - 1)(3 - 1) = 6

    $\chi^2_{.01,6} = 16.8119$

    Since the observed $\chi^2 = 59.63 > \chi^2_{.01,6} = 16.8119$, the decision is to reject the
    null hypothesis. Position is not independent of number of years of experience.
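
    For larger tables such as the 3 x 4 table in 12.25, the same steps can be wrapped in a reusable function. The sketch below is one possible arrangement, not the author's method; the function name `chi_square_independence` and the variable `years_by_position` are hypothetical, and the critical value is again the table value from the solution.

```python
# Sketch of a general r x c chi-square test of independence, applied to 12.25.

def chi_square_independence(table):
    """Return (chi-square statistic, df, expected frequencies) for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected frequency of each cell: (row total)(column total) / n
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum((fo - fe) ** 2 / fe
               for obs_row, exp_row in zip(table, expected)
               for fo, fe in zip(obs_row, exp_row))
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df, expected

# Rows: years 0-3, 4-8, > 8.
# Columns: Manager, Programmer, Operator, Systems Analyst.
years_by_position = [
    [6, 37, 11, 13],
    [28, 16, 23, 24],
    [47, 10, 12, 19],
]

chi2, df, expected = chi_square_independence(years_by_position)
critical = 16.8119                  # table value for alpha = .01, df = 6

print(f"chi-square = {chi2:.2f}, df = {df}")
print("reject Ho" if chi2 > critical else "fail to reject Ho")
```

    The function keeps unrounded expected frequencies, so its statistic can differ slightly from a hand computation that rounds each expected value to two decimals.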

    12.26   H0: p = .43          n = 315     $\alpha = .05$
            Ha: p $\neq$ .43     x = 120     $\alpha/2 = .025$

                                 $f_o$    $f_e$                     $(f_o - f_e)^2 / f_e$
    More Work, More Business      120     (.43)(315) = 135.45        1.76
    Others                        195     (.57)(315) = 179.55        1.33
    Total                         315     315.00                     3.09

    The observed value of $\chi^2$ is 3.09

    $\alpha = .05$ and $\alpha/2 = .025$   df = k - 1 = 2 - 1 = 1

    $\chi^2_{.025,1} = 5.0239$

    Since $\chi^2 = 3.09 < \chi^2_{.025,1} = 5.0239$, the decision is to fail to reject the null hypothesis.
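
    With only two categories, the chi-square statistic in 12.26 equals the square of the z statistic for a single proportion, which gives a convenient cross-check on the arithmetic. The Python sketch below illustrates that identity; it is an added check, not part of the original solution, and the variable names are illustrative.

```python
import math

# Sketch: two-category goodness-of-fit statistic for 12.26, cross-checked
# against the squared z statistic for one proportion.
n, x, p0 = 315, 120, 0.43

observed = [x, n - x]
expected = [n * p0, n * (1 - p0)]
chi2 = sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))

p_hat = x / n
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

critical = 5.0239                   # table value used in the solution

print(f"chi-square = {chi2:.2f}, z = {z:.2f}, z squared = {z * z:.2f}")
print("reject Ho" if chi2 > critical else "fail to reject Ho")
```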

    12.27
                              Type of College or University
                          Community      Large         Small
                           College     University     College
                   0          25          178            31         234
    Number of      1          49          141            12         202
    Children       2          31           54             8          93
                  >3          22           14             6          42
                             127          387            57         571

    Ho: Number of Children is independent of Type of College or University.
    Ha: Number of Children is not independent of Type of College or University.

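    $e_{11} = \frac{(127)(234)}{571} = 52.05$       $e_{31} = \frac{(127)(93)}{571} = 20.68$

    $e_{12} = \frac{(387)(234)}{571} = 158.60$      $e_{32} = \frac{(387)(93)}{571} = 63.03$

    $e_{13} = \frac{(57)(234)}{571} = 23.36$        $e_{33} = \frac{(57)(93)}{571} = 9.28$

    $e_{21} = \frac{(127)(202)}{571} = 44.93$       $e_{41} = \frac{(127)(42)}{571} = 9.34$

    $e_{22} = \frac{(387)(202)}{571} = 136.91$      $e_{42} = \frac{(387)(42)}{571} = 28.47$

    $e_{23} = \frac{(57)(202)}{571} = 20.16$        $e_{43} = \frac{(57)(42)}{571} = 4.19$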

    Observed frequencies, with expected frequencies in parentheses:

                              Type of College or University
                          Community      Large         Small
                           College     University     College
                   0       (52.05)      (158.60)      (23.36)
                              25          178            31         234
    Number of      1       (44.93)      (136.91)      (20.16)
    Children                  49          141            12         202
                   2       (20.68)       (63.03)       (9.28)
                              31           54             8          93
                  >3        (9.34)       (28.47)       (4.19)
                              22           14             6          42
                             127          387            57         571


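    $\chi^2 = \frac{(25 - 52.05)^2}{52.05} + \frac{(178 - 158.60)^2}{158.60} + \frac{(31 - 23.36)^2}{23.36}$

    $+ \frac{(49 - 44.93)^2}{44.93} + \frac{(141 - 136.91)^2}{136.91} + \frac{(12 - 20.16)^2}{20.16}$

    $+ \frac{(31 - 20.68)^2}{20.68} + \frac{(54 - 63.03)^2}{63.03} + \frac{(8 - 9.28)^2}{9.28}$

    $+ \frac{(22 - 9.34)^2}{9.34} + \frac{(14 - 28.47)^2}{28.47} + \frac{(6 - 4.19)^2}{4.19}$

    = 14.06 + 2.37 + 2.50 + 0.37 + 0.12 + 3.30 + 5.15 + 1.29 + 0.18 + 17.16 + 7.35 + 0.78 = 54.63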

    $\alpha = .05$, df = (c - 1)(r - 1) = (3 - 1)(4 - 1) = 6

    $\chi^2_{.05,6} = 12.5916$

    Since the observed $\chi^2 = 54.63 > \chi^2_{.05,6} = 12.5916$, the decision is to reject the null hypothesis.

    Number of children is not independent of type of College or University.
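
    The twelve terms of the chi-square sum in 12.27 also show which cells depart most from independence. The Python sketch below ranks the cell contributions; this ranking is an added illustration rather than part of the original solution, and the label lists `rows` and `cols` simply copy the table headings.

```python
# Sketch: per-cell contributions to the chi-square statistic in 12.27.
rows = ["0", "1", "2", ">3"]                       # number of children
cols = ["Community College", "Large University", "Small College"]
observed = [[25, 178, 31],
            [49, 141, 12],
            [31, 54, 8],
            [22, 14, 6]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# One (contribution, row label, column label) tuple per cell.
contributions = []
for i, row in enumerate(observed):
    for j, fo in enumerate(row):
        fe = row_totals[i] * col_totals[j] / n     # expected frequency
        contributions.append(((fo - fe) ** 2 / fe, rows[i], cols[j]))

chi2 = sum(term for term, _, _ in contributions)
contributions.sort(reverse=True)                   # largest contribution first

print(f"chi-square = {chi2:.2f}")
for term, r, c in contributions[:3]:
    print(f"children = {r:>2}, {c}: {term:.2f}")
```

    The two largest contributions both come from the Community College column, matching the 17.16 and 14.06 terms in the hand computation.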

    12.28 The observed chi-square is 30.18 with a p-value of .0000043. The chi-square
    goodness-of-fit test indicates that there is a significant difference between the
    observed frequencies and the expected frequencies. The distribution of responses
    to the question is not the same for adults between 21 and 30 years of age as it
    is for others. Marketing and sales people might reorient their efforts aimed at
    21- to 30-year-olds away from home improvement and pay more attention to leisure
    travel/vacation, clothing, and home entertainment.

    12.29 The observed chi-square value for this test of independence is 5.366. The
    associated p-value of .252 indicates failure to reject the null hypothesis. There is
    not enough evidence here to say that color choice is dependent upon gender.
    Automobile marketing people do not have to worry about which colors especially
    appeal to men or to women because car color is independent of gender. In
    addition, design and production people can determine car color quotas based on
    other variables.