Ken Black QA ch13


  • 8/3/2019 Ken Black QA ch13

    1/56

    Business Statistics, 5th ed. by Ken Black

    Chapter 13

    Nonparametric Statistics

    Discrete Distributions

    PowerPoint presentations prepared by Lloyd Jaisingh, Morehead State University


    Learning Objectives

    Recognize the advantages and disadvantages of nonparametric statistics.

    Understand how to use the runs test to test for randomness.

    Know when and how to use the Mann-Whitney U test, the Wilcoxon matched-pairs signed rank test, the Kruskal-Wallis test, and the Friedman test.

    Learn when and how to measure correlation using Spearman's rank correlation measurement.


    Parametric vs. Nonparametric Statistics

    Parametric statistics are statistical techniques based on assumptions about the population from which the sample data are collected.

    They assume that the data being analyzed are randomly selected from a normally distributed population.

    They require quantitative measurements that yield interval- or ratio-level data.

    Nonparametric statistics are based on fewer assumptions about the population and the parameters.

    They are sometimes called distribution-free statistics.

    A variety of nonparametric statistics are available for use with nominal or ordinal data.


    Advantages of Nonparametric Techniques

    Sometimes there is no parametric alternative to the use of nonparametric statistics.

    Certain nonparametric tests can be used to analyze nominal data.

    Certain nonparametric tests can be used to analyze ordinal data.

    The computations on nonparametric statistics are usually less complicated than those for parametric statistics, particularly for small samples.

    Probability statements obtained from most nonparametric tests are exact probabilities.


    Disadvantages of Nonparametric Statistics

    Nonparametric tests can be wasteful of data if parametric tests are available for use with the data.

    Nonparametric tests are usually not as widely available and well known as parametric tests.

    For large samples, the calculations for many nonparametric statistics can be tedious.


    Runs Test

    Test for randomness: is the order or sequence of observations in a sample random or not?

    Each sample item possesses one of two possible characteristics.

    Run: a succession of observations that possess the same characteristic.

    Example with two runs: F, F, F, F, F, F, F, F, M, M, M, M, M, M, M

    Example with fifteen runs: F, M, F, M, F, M, F, M, F, M, F, M, F, M, F
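The definition of a run can be checked mechanically. The following is a minimal sketch, not part of the original slides; the helper name `count_runs` is hypothetical, and it simply counts maximal stretches of identical adjacent items.

```python
from itertools import groupby


def count_runs(sequence):
    """Count runs: maximal stretches of identical adjacent observations."""
    # groupby yields one group per stretch of equal neighbours.
    return sum(1 for _ in groupby(sequence))


# The two example sequences from the slide.
two_runs = ["F"] * 8 + ["M"] * 7
fifteen_runs = ["F", "M"] * 7 + ["F"]

print(count_runs(two_runs))      # 2
print(count_runs(fifteen_runs))  # 15
```

Counting groups of equal neighbours keeps the sketch independent of which two characteristics the sample items carry.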


    Runs Test: Sample Size Consideration

    Sample size: n

    Number of sample members possessing the first characteristic: n1

    Number of sample members possessing the second characteristic: n2

    n = n1 + n2

    If both n1 and n2 are ≤ 20, the small-sample runs test is appropriate.


    Runs Test: Small Sample Example

    H0: The observations in the sample are randomly generated.
    Ha: The observations in the sample are not randomly generated.

    α = .05, n1 = 18, n2 = 8

    If 7 ≤ R ≤ 17, do not reject H0. Otherwise, reject H0.

    1      2      3      4      5      6      7      8      9      10     11     12
    D      CCCCC  D      CC     D      CCCC   D      C      D      CCC    DDD    CCC

    R = 12. Since 7 ≤ R = 12 ≤ 17, do not reject H0.


    Runs Test: Large Sample

    $$\mu_R = \frac{2 n_1 n_2}{n_1 + n_2} + 1$$

    $$\sigma_R = \sqrt{\frac{2 n_1 n_2 (2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 (n_1 + n_2 - 1)}}$$

    $$Z = \frac{R - \mu_R}{\sigma_R}$$

    If either n1 or n2 is > 20, the sampling distribution of R is approximately normal.


    Runs Test: Large Sample Example

    H0: The observations in the sample are randomly generated.
    Ha: The observations in the sample are not randomly generated.

    α = .05, n1 = 40, n2 = 10

    If -1.96 ≤ Z ≤ 1.96, do not reject H0. Otherwise, reject H0.

    1    2  3        4  5   6   7       8  9     10  11     12    13
    NNN  F  NNNNNNN  F  NN  FF  NNNNNN  F  NNNN  F   NNNNN  FFFF  NNNNNNNNNNNN

    R = 13


    Runs Test: Large Sample Example

    $$\mu_R = \frac{2 n_1 n_2}{n_1 + n_2} + 1 = \frac{2(40)(10)}{40 + 10} + 1 = 17$$

    $$\sigma_R = \sqrt{\frac{2 n_1 n_2 (2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 (n_1 + n_2 - 1)}} = \sqrt{\frac{2(40)(10)[2(40)(10) - 40 - 10]}{(40 + 10)^2 (40 + 10 - 1)}} = 2.213$$

    $$Z = \frac{R - \mu_R}{\sigma_R} = \frac{13 - 17}{2.213} = -1.81$$

    -1.96 ≤ Z = -1.81 ≤ 1.96, do not reject H0.
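The large-sample arithmetic can be reproduced in a few lines. This is an illustrative sketch only; the function name `runs_test_z` is an assumption, and the inputs are the slide's values R = 13, n1 = 40, n2 = 10 with the two-tailed critical value 1.96.

```python
import math


def runs_test_z(r, n1, n2):
    """Large-sample runs test: return (mu_R, sigma_R, Z) for an observed R."""
    mu = 2 * n1 * n2 / (n1 + n2) + 1
    variance = (2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)) / (
        (n1 + n2) ** 2 * (n1 + n2 - 1)
    )
    sigma = math.sqrt(variance)
    return mu, sigma, (r - mu) / sigma


mu_r, sigma_r, z = runs_test_z(r=13, n1=40, n2=10)
reject_h0 = abs(z) > 1.96  # two-tailed test at alpha = .05

print(round(mu_r, 3), round(sigma_r, 3), round(z, 2), reject_h0)
```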


    Runs Test: Large Sample Example

    MINITAB Output

    MINITAB's one-sample Z-test can be used to obtain the results for the large-sample approximation. Make sure the SE Mean works out to be the standard deviation of 2.21. To do this, multiply the standard deviation by the square root of n and use that value as the known standard deviation in the test.


    Runs Test: Large Sample Example

    MINITAB Output

    Note: The P-values are the same.


    Mann-Whitney U Test

    Nonparametric counterpart of the t test for independent samples

    Does not require normally distributed populations

    May be applied to ordinal data

    Assumptions:

    Independent samples

    At least ordinal data


    Mann-Whitney U Test: Sample Size Consideration

    Size of sample one: n1

    Size of sample two: n2

    If both n1 and n2 are ≤ 10, the small-sample procedure is appropriate.

    If either n1 or n2 is greater than 10, the large-sample procedure is appropriate.


    Mann-Whitney U Test: Small Sample Example - Demonstration Problem 13.1

    Health Service    Educational Service
    20.10             26.19
    19.80             23.88
    22.36             25.50
    18.75             21.64
    21.90             24.85
    22.96             25.30
    20.75             24.12
                      23.45

    H0: The health service population is identical to the educational service population on employee compensation.

    Ha: The health service population is not identical to the educational service population on employee compensation.


    Mann-Whitney U Test: Small Sample Example - Demonstration Problem 13.1

    α = .05

    If the final p-value < .05, reject H0.

    Compensation  Rank  Group
    18.75         1     H
    19.80         2     H
    20.10         3     H
    20.75         4     H
    21.64         5     E
    21.90         6     H
    22.36         7     H
    22.96         8     H
    23.45         9     E
    23.88         10    E
    24.12         11    E
    24.85         12    E
    25.30         13    E
    25.50         14    E
    26.19         15    E

    W1 = 1 + 2 + 3 + 4 + 6 + 7 + 8 = 31

    W2 = 5 + 9 + 10 + 11 + 12 + 13 + 14 + 15 = 89


    Mann-Whitney U Test: Small Sample

    Example-Demonstration Problem 13.1

    MINITAB Output


    Mann-Whitney U Test: Small Sample Example

    $$U_1 = n_1 n_2 + \frac{n_1(n_1 + 1)}{2} - W_1 = (7)(8) + \frac{(7)(8)}{2} - 31 = 53$$

    $$U_2 = n_1 n_2 + \frac{n_2(n_2 + 1)}{2} - W_2 = (7)(8) + \frac{(8)(9)}{2} - 89 = 3$$

    Since U2 < U1, U = 3.

    p-value = .0011 < .05, reject H0.
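The ranking and the two U values for Demonstration Problem 13.1 can be recomputed from the raw compensation figures. The sketch below is illustrative only; the helper names `average_ranks` and `mann_whitney_u` are hypothetical, and giving tied values their average rank is assumed here as the usual convention. It does not compute the p-value, which the slides take from a table or MINITAB.

```python
def average_ranks(values):
    """Map each value to its rank (1 = smallest); ties share the average rank."""
    ordered = sorted(values)
    rank_of = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        # Positions i+1 .. j hold the same value; their mean is (i + 1 + j) / 2.
        rank_of[ordered[i]] = (i + 1 + j) / 2
        i = j
    return rank_of


def mann_whitney_u(sample1, sample2):
    """Return (W1, W2, U1, U2) using the formulas on the slide."""
    rank_of = average_ranks(sample1 + sample2)
    n1, n2 = len(sample1), len(sample2)
    w1 = sum(rank_of[x] for x in sample1)
    w2 = sum(rank_of[x] for x in sample2)
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - w1
    u2 = n1 * n2 + n2 * (n2 + 1) / 2 - w2
    return w1, w2, u1, u2


health = [20.10, 19.80, 22.36, 18.75, 21.90, 22.96, 20.75]
education = [26.19, 23.88, 25.50, 21.64, 24.85, 25.30, 24.12, 23.45]

w1, w2, u1, u2 = mann_whitney_u(health, education)
u = min(u1, u2)  # the test statistic is the smaller of the two
print(w1, w2, u1, u2, u)
```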


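    Mann-Whitney U Test: Formulas for Large Sample Case

    $$U = n_1 n_2 + \frac{n_1(n_1 + 1)}{2} - W_1$$

    where: n1 = number in group 1, n2 = number in group 2, W1 = the sum of the ranks of values in group 1

    $$\mu_U = \frac{n_1 n_2}{2}$$

    $$\sigma_U = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}}$$

    $$Z = \frac{U - \mu_U}{\sigma_U}$$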


    Incomes of PBS and Non-PBS Viewers

    PBS       Non-PBS
    24,500    41,000
    39,400    32,500
    36,800    33,000
    44,300    21,000
    57,960    40,500
    32,000    32,400
    61,000    16,000
    34,000    21,500
    43,500    39,500
    55,000    27,600
    39,000    43,500
    62,500    51,900
    61,400    27,800
    53,000

    n1 = 14

    n2 = 13

    H0: The incomes for PBS viewers and non-PBS viewers are identical.

    Ha: The incomes for PBS viewers and non-PBS viewers are not identical.

    α = .05

    If Z < -1.96 or Z > 1.96, reject H0.


    Ranks of Income from Combined Groups of PBS and Non-PBS Viewers

    Income  Rank  Group      Income  Rank  Group
    16,000  1     Non-PBS    39,500  15    Non-PBS
    21,000  2     Non-PBS    40,500  16    Non-PBS
    21,500  3     Non-PBS    41,000  17    Non-PBS
    24,500  4     PBS        43,000  18    PBS
    27,600  5     Non-PBS    43,500  19.5  PBS
    27,800  6     Non-PBS    43,500  19.5  Non-PBS
    32,000  7     PBS        51,900  21    Non-PBS
    32,400  8     Non-PBS    53,000  22    PBS
    32,500  9     Non-PBS    55,000  23    PBS
    33,000  10    Non-PBS    57,960  24    PBS
    34,000  11    PBS        61,000  25    PBS
    36,800  12    PBS        61,400  26    PBS
    39,000  13    PBS        62,500  27    PBS
    39,400  14    PBS


    PBS and Non-PBS Viewers: Calculation of U

    $$W_1 = 4 + 7 + 11 + 12 + 13 + 14 + 18 + 19.5 + 22 + 23 + 24 + 25 + 26 + 27 = 245.5$$

    $$U = n_1 n_2 + \frac{n_1(n_1 + 1)}{2} - W_1 = (14)(13) + \frac{(14)(15)}{2} - 245.5 = 41.5$$


    PBS and Non-PBS Viewers: Conclusion

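    $$\mu_U = \frac{n_1 n_2}{2} = \frac{(14)(13)}{2} = 91$$

    $$\sigma_U = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}} = \sqrt{\frac{(14)(13)(28)}{12}} = 20.6$$

    $$Z = \frac{U - \mu_U}{\sigma_U} = \frac{41.5 - 91}{20.6} = -2.40$$

    Z_Cal = -2.40 < -1.96, reject H0.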
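The large-sample Z for the PBS example follows directly from the PBS rank sum. A minimal sketch, assuming the ranks listed for the PBS group on the ranking slide; the function name `mann_whitney_z` is hypothetical.

```python
import math


def mann_whitney_z(w1, n1, n2):
    """Large-sample Mann-Whitney: return (U, mu_U, sigma_U, Z) from rank sum W1."""
    u = n1 * n2 + n1 * (n1 + 1) / 2 - w1
    mu_u = n1 * n2 / 2
    sigma_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return u, mu_u, sigma_u, (u - mu_u) / sigma_u


# Ranks of the 14 PBS viewers in the combined ordering (from the slide).
pbs_ranks = [4, 7, 11, 12, 13, 14, 18, 19.5, 22, 23, 24, 25, 26, 27]

u, mu_u, sigma_u, z = mann_whitney_z(sum(pbs_ranks), n1=14, n2=13)
reject_h0 = abs(z) > 1.96  # two-tailed test at alpha = .05

print(u, mu_u, round(sigma_u, 1), round(z, 2), reject_h0)
```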


    PBS and Non-PBS Viewers: MINITAB Output

    MINITAB's one-sample Z-test can be used to obtain the results for the large-sample approximation. Make sure the SE Mean works out to be the standard deviation of 20.6. To do this, multiply the standard deviation by the square root of n and use that value as the known standard deviation in the test.


    PBS and Non-PBS Viewers:

    MINITAB Output

    Note: The P-values are approximately the same.


    Wilcoxon Matched-Pairs Signed Rank Test

    A nonparametric alternative to the t test for related samples

    Before-and-after studies

    Studies in which measures are taken on the same person or object under different conditions

    Studies of twins or other relatives


    Wilcoxon Matched-Pairs Signed Rank Test

    Differences of the scores of the two matched samples

    Differences are ranked, ignoring the sign

    Ranks are given the sign of the difference

    Positive ranks are summed

    Negative ranks are summed

    T is the smaller sum of ranks


    Wilcoxon Matched-Pairs Signed Rank Test: Sample Size Consideration

    n is the number of matched pairs.

    If n > 15, T is approximately normally distributed, and a Z test is used.

    If n ≤ 15, a special small-sample procedure is followed.

    The paired data are randomly selected.

    The underlying distributions are symmetrical.


    Wilcoxon Matched-Pairs Signed Rank Test: Small Sample Example

    Family Pair  Pittsburgh  Oakland
    1            1,950       1,760
    2            1,840       1,870
    3            2,015       1,810
    4            1,580       1,660
    5            1,790       1,340
    6            1,925       1,765

    H0: Md = 0
    Ha: Md ≠ 0

    n = 6

    α = 0.05

    If T observed ≤ 1, reject H0.


    Wilcoxon Matched-Pairs Signed Rank Test: Small Sample Example

    Family Pair  Pittsburgh  Oakland  d     Rank
    1            1,950       1,760    190   +4
    2            1,840       1,870    -30   -1
    3            2,015       1,810    205   +5
    4            1,580       1,660    -80   -2
    5            1,790       1,340    450   +6
    6            1,925       1,765    160   +3

    T = minimum(T+, T-)

    T+ = 4 + 5 + 6 + 3 = 18

    T- = 1 + 2 = 3

    T = 3

    T = 3 > Tcrit = 1, do not reject H0.
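The signed-rank steps (difference, rank the absolute differences, re-attach signs, sum each side) can be sketched as follows. This is an illustration, not the book's procedure verbatim: the function name `wilcoxon_t` is hypothetical, and dropping zero differences and averaging tied ranks are assumed conventions that do not arise in this example.

```python
def wilcoxon_t(first, second):
    """Return (T_plus, T_minus, T) for matched pairs, with d = first - second."""
    # Assumption: pairs with a zero difference are dropped before ranking.
    diffs = [a - b for a, b in zip(first, second) if a != b]
    ordered = sorted(abs(d) for d in diffs)

    def rank(value):
        # Average the 1-based positions of equal absolute differences.
        positions = [i + 1 for i, v in enumerate(ordered) if v == value]
        return sum(positions) / len(positions)

    t_plus = sum(rank(abs(d)) for d in diffs if d > 0)
    t_minus = sum(rank(abs(d)) for d in diffs if d < 0)
    return t_plus, t_minus, min(t_plus, t_minus)


pittsburgh = [1950, 1840, 2015, 1580, 1790, 1925]
oakland = [1760, 1870, 1810, 1660, 1340, 1765]

t_plus, t_minus, t = wilcoxon_t(pittsburgh, oakland)
reject_h0 = t <= 1  # critical value T = 1 from the slide (n = 6, alpha = .05)

print(t_plus, t_minus, t, reject_h0)
```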


    Wilcoxon Matched-Pairs Signed

    Rank Test: MINITAB Output

    NOTE: Differences = Pittsburgh - Oakland


    Airline Cost Data for 17 Cities,

    1979 and 2006

    City 1979 2006 d Rank City 1979 2006 d Rank

    1 20.3 22.8 -2.5 -8 10 20.3 20.9 -0.6 -1

    2 19.5 12.7 6.8 17 11 19.2 22.6 -3.4 -11.5

    3 18.6 14.1 4.5 13 12 19.5 16.9 2.6 9

    4 20.9 16.1 4.8 15 13 18.7 20.6 -1.9 -6.5

    5 19.9 25.2 -5.3 -16 14 17.7 18.5 -0.8 -2

    6 18.6 20.2 -1.6 -4 15 21.6 23.4 -1.8 -5

    7 19.6 14.9 4.7 14 16 22.4 21.3 1.1 3

    8 23.2 21.3 1.9 6.5 17 20.8 17.4 3.4 11.5

    9 21.8 18.7 3.1 10

    H0: Md = 0
    Ha: Md ≠ 0

    α = .05

    If Z < -1.96 or Z > 1.96, reject H0.


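    Airline Cost: T Calculation

    $$T = \text{minimum}(T_+, T_-)$$

    $$T_+ = 17 + 13 + 15 + 14 + 6.5 + 10 + 9 + 3 + 11.5 = 99$$

    $$T_- = 8 + 16 + 4 + 1 + 11.5 + 6.5 + 2 + 5 = 54$$

    $$T = \text{minimum}(99, 54) = 54$$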


    Airline Cost: Conclusion

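    $$\mu_T = \frac{n(n + 1)}{4} = \frac{(17)(18)}{4} = 76.5$$

    $$\sigma_T = \sqrt{\frac{n(n + 1)(2n + 1)}{24}} = \sqrt{\frac{(17)(18)(35)}{24}} = 21.1$$

    $$Z = \frac{T - \mu_T}{\sigma_T} = \frac{54 - 76.5}{21.1} = -1.07$$

    -1.96 ≤ Z_Cal = -1.07 ≤ 1.96, do not reject H0.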
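The large-sample Wilcoxon calculation is short enough to verify directly. A minimal sketch with a hypothetical function name `wilcoxon_z`, using the slide's T = 54 and n = 17:

```python
import math


def wilcoxon_z(t, n):
    """Large-sample Wilcoxon signed rank test: return (mu_T, sigma_T, Z)."""
    mu_t = n * (n + 1) / 4
    sigma_t = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return mu_t, sigma_t, (t - mu_t) / sigma_t


mu_t, sigma_t, z = wilcoxon_z(t=54, n=17)
reject_h0 = abs(z) > 1.96  # two-tailed test at alpha = .05

print(mu_t, round(sigma_t, 1), round(z, 2), reject_h0)
```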


    Airline Cost: MINITAB Output

    MINITAB's one-sample Z-test can be used to obtain the results for the large-sample approximation. Make sure the SE Mean works out to be the standard deviation of 21.1. To do this, multiply the standard deviation by the square root of n and use that value as the known standard deviation in the test.


    Airline Cost: MINITAB Output

    Observe that the P-values are approximately the same for both outputs.


    Kruskal-Wallis Test

    A nonparametric alternative to one-way analysis of variance

    May be used to analyze ordinal data

    No assumed population shape

    Assumes that the C groups are independent

    Assumes random selection of individual items


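    Kruskal-Wallis K Statistic

    $$K = \frac{12}{n(n + 1)} \left( \sum_{j=1}^{C} \frac{T_j^2}{n_j} \right) - 3(n + 1)$$

    where: C = number of groups, n = total number of items, T_j = total of ranks in a group, n_j = number of items in a group

    $$K \approx \chi^2, \text{ with df} = C - 1$$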


    Number of Patients per Day per Physician in Three Organizational Categories

    Two Partners  Three or More Partners  HMO
    13            24                      26
    15            16                      22
    20            19                      31
    18            22                      27
    23            25                      28
                  14                      33
                  17

    H0: The three populations are identical.

    Ha: At least one of the three populations is different.

    α = 0.05

    df = C - 1 = 3 - 1 = 2

    $$\chi^2_{.05,2} = 5.991$$

    If K > 5.991, reject H0.


    Patients per Day Data: Kruskal-Wallis Preliminary Calculations

    Two Partners       Three or More Partners   HMO
    Patients  Rank     Patients  Rank           Patients  Rank
    13        1        24        12             26        14
    15        3        16        4              22        9.5
    20        8        19        7              31        17
    18        6        22        9.5            27        15
    23        11       25        13             28        16
                       14        2              33        18
                       17        5
    T1 = 29            T2 = 52.5                T3 = 89.5
    n1 = 5             n2 = 7                   n3 = 6

    n = n1 + n2 + n3 = 5 + 7 + 6 = 18


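    Patients per Day Data: Kruskal-Wallis Calculations and Conclusion

    $$K = \frac{12}{n(n + 1)} \left( \sum_{j=1}^{C} \frac{T_j^2}{n_j} \right) - 3(n + 1)$$

    $$= \frac{12}{18(18 + 1)} \left( \frac{29^2}{5} + \frac{52.5^2}{7} + \frac{89.5^2}{6} \right) - 3(18 + 1)$$

    $$= \frac{12}{18(18 + 1)} (1{,}897) - 3(18 + 1) = 9.56$$

    $$\chi^2_{.05,2} = 5.991$$

    K = 9.56 > 5.991, reject H0.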
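The rank totals and K can be recomputed from the raw patient counts. The sketch below is illustrative; the function name `kruskal_wallis_k` is hypothetical, ties receive the average rank as in the slide's table (the two 22s share rank 9.5), and no tie correction is applied, matching the formula shown.

```python
def kruskal_wallis_k(groups):
    """Return (rank totals per group, K) using the uncorrected K formula."""
    pooled = sorted(x for group in groups for x in group)

    def rank(value):
        # Average the 1-based positions of tied values in the pooled ordering.
        positions = [i + 1 for i, v in enumerate(pooled) if v == value]
        return sum(positions) / len(positions)

    n = len(pooled)
    rank_totals = [sum(rank(x) for x in group) for group in groups]
    weighted = sum(t ** 2 / len(g) for t, g in zip(rank_totals, groups))
    k = 12 / (n * (n + 1)) * weighted - 3 * (n + 1)
    return rank_totals, k


two_partners = [13, 15, 20, 18, 23]
three_or_more = [24, 16, 19, 22, 25, 14, 17]
hmo = [26, 22, 31, 27, 28, 33]

rank_totals, k = kruskal_wallis_k([two_partners, three_or_more, hmo])
reject_h0 = k > 5.991  # chi-square critical value, alpha = .05, df = 2

print(rank_totals, round(k, 2), reject_h0)
```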


    Patients per Day Data: Kruskal-Wallis

    MINITAB Output


    Friedman Test

    A nonparametric alternative to the randomized block design

    Assumptions:

    The blocks are independent.

    There is no interaction between blocks and treatments.

    Observations within each block can be ranked.

    Hypotheses:

    H0: The treatment populations are equal.

    Ha: At least one treatment population yields larger values than at least one other treatment population.


    Friedman Test

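    $$\chi_r^2 = \frac{12}{bC(C + 1)} \sum_{j=1}^{C} R_j^2 - 3b(C + 1)$$

    where: C = number of treatment levels (columns), b = number of blocks (rows), R_j = total of ranks for a particular treatment level, j = particular treatment level

    $$\chi_r^2 \approx \chi^2, \text{ with df} = C - 1$$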


    Friedman Test: Tensile Strength

    of Plastic Housings

    Supplier 1 Supplier 2 Supplier 3 Supplier 4

    Monday 62 63 57 61

    Tuesday 63 61 59 65

    Wednesday 61 62 56 63

    Thursday 62 60 57 64

    Friday 64 63 58 66

    H0: The supplier populations are equal.

    Ha: At least one supplier population yields larger values than at least one other supplier population.


    Friedman Test: Tensile Strength

    of Plastic Housings

    α = 0.05

    df = C - 1 = 4 - 1 = 3

    $$\chi^2_{.05,3} = 7.81473$$

    If $\chi_r^2 > 7.81473$, reject H0.


    Friedman Test: Tensile Strength of Plastic Housings

               Supplier 1  Supplier 2  Supplier 3  Supplier 4
    Monday     3           4           1           2
    Tuesday    3           2           1           4
    Wednesday  2           3           1           4
    Thursday   3           2           1           4
    Friday     3           2           1           4
    R_j        14          13          5           18
    R_j^2      196         169         25          324

    $$\sum_{j=1}^{4} R_j^2 = 196 + 169 + 25 + 324 = 714$$


    Friedman Test: Tensile Strength

    of Plastic Housings

    $$\chi_r^2 = \frac{12}{bC(C + 1)} \sum_{j=1}^{C} R_j^2 - 3b(C + 1) = \frac{12}{(5)(4)(4 + 1)} (714) - 3(5)(4 + 1) = 10.68$$

    $$\chi_r^2 = 10.68 > \chi^2_{.05,3} = 7.81473, \text{ reject } H_0$$
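The within-block ranking and the Friedman statistic can be reproduced from the tensile strength table. This is a sketch under stated assumptions: the function name `friedman_chi2` is hypothetical, each row of the input is one block (day) with one value per treatment (supplier), and ties within a block would receive average ranks (none occur here).

```python
def friedman_chi2(blocks):
    """Return (rank totals per treatment, chi-square_r) for a blocks x treatments table."""
    b = len(blocks)       # number of blocks (rows)
    c = len(blocks[0])    # number of treatment levels (columns)
    rank_totals = [0.0] * c
    for row in blocks:
        ordered = sorted(row)
        for j, value in enumerate(row):
            # Rank within the block, 1 = smallest; ties share the average rank.
            positions = [i + 1 for i, v in enumerate(ordered) if v == value]
            rank_totals[j] += sum(positions) / len(positions)
    chi2_r = 12 / (b * c * (c + 1)) * sum(r ** 2 for r in rank_totals) - 3 * b * (c + 1)
    return rank_totals, chi2_r


# Rows are Monday..Friday; columns are Supplier 1..Supplier 4.
tensile = [
    [62, 63, 57, 61],
    [63, 61, 59, 65],
    [61, 62, 56, 63],
    [62, 60, 57, 64],
    [64, 63, 58, 66],
]

rank_totals, chi2_r = friedman_chi2(tensile)
reject_h0 = chi2_r > 7.81473  # chi-square critical value, alpha = .05, df = 3

print(rank_totals, round(chi2_r, 2), reject_h0)
```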


    Friedman Test: Tensile Strength

    of Plastic Housings

    MINITAB Output


    Spearman's Rank Correlation

    Analyze the degree of association of two variables

    Applicable to ordinal-level data (ranks)

    $$r_s = 1 - \frac{6 \sum d^2}{n(n^2 - 1)}$$

    where: n = number of pairs being correlated, d = the difference in the ranks of each pair


    Spearman's Rank Correlation for Heifer and Lamb Prices

    Year  Heifer Prices ($/100 lb)  Lamb Prices ($/100 lb)  Rank: Heifer  Rank: Lamb  d    d^2
    1995  65.46                     77.91                   8             6           -2   4
    1996  64.18                     82.00                   9             4           -5   25
    1997  65.66                     89.20                   7             3           -4   16
    1998  59.23                     74.37                   10            7           -3   9
    1999  65.68                     66.42                   6             10          4    16
    2000  69.55                     80.10                   3             5           2    4
    2001  67.81                     69.78                   4             9           5    25
    2002  67.39                     72.09                   5             8           3    9
    2003  82.06                     92.14                   2             2           0    0
    2004  84.40                     96.31                   1             1           0    0

    Σd^2 = 108


    Spearman's Rank Correlation for Heifer and Lamb Prices

    $$r_s = 1 - \frac{6 \sum d^2}{n(n^2 - 1)} = 1 - \frac{6(108)}{10(10^2 - 1)} = 0.345$$
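The coefficient can be recomputed from the price columns. A minimal sketch, assuming rank 1 goes to the highest price (as in the slide's table) and that there are no tied prices; the function name `spearman_rs` is hypothetical.

```python
def spearman_rs(x, y):
    """Return (sum of d^2, r_s) with rank 1 = largest value; assumes no ties."""

    def ranks_descending(values):
        ordered = sorted(values, reverse=True)
        return [ordered.index(v) + 1 for v in values]

    rank_x, rank_y = ranks_descending(x), ranks_descending(y)
    n = len(x)
    # d is the difference in ranks for each pair (second rank minus first).
    sum_d2 = sum((b - a) ** 2 for a, b in zip(rank_x, rank_y))
    return sum_d2, 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Prices in $/100 lb for 1995 through 2004, from the slide.
heifer = [65.46, 64.18, 65.66, 59.23, 65.68, 69.55, 67.81, 67.39, 82.06, 84.40]
lamb = [77.91, 82.00, 89.20, 74.37, 66.42, 80.10, 69.78, 72.09, 92.14, 96.31]

sum_d2, r_s = spearman_rs(heifer, lamb)
print(sum_d2, round(r_s, 3))
```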


    Copyright 2008 John Wiley & Sons, Inc. All rights reserved. Reproduction or translation of this work beyond that permitted in section 117 of the 1976 United States Copyright Act without express permission of the copyright owner is unlawful. Request for further information should be addressed to the Permissions Department, John Wiley & Sons, Inc. The purchaser may make back-up copies for his/her own use only and not for distribution or resale. The Publisher assumes no responsibility for errors, omissions, or damages caused by the use of these programs or from the use of the information herein.