Non Parametric Statistics PPT @ BEC DOMS

8/3/2019 Non Parametric Statistics PPT @ BEC DOMS


Nonparametric Statistics


Chapter Goals

After completing this chapter, you should be able to:

Recognize when and how to use the Wilcoxon signed rank test for a population median

Recognize the situations for which the Wilcoxon signed rank test applies and be able to use it for decision-making

Know when and how to perform a Mann-Whitney U-test

Perform nonparametric analysis of variance


Nonparametric methods make fewer restrictive assumptions about data levels and underlying probability distributions

Population distributions may be skewed

The level of data measurement may only be ordinal or nominal


Wilcoxon Signed Rank Test

Used to test a hypothesis about one population median

the median is the midpoint of the distribution: 50% below, 50% above

A hypothesized median is rejected if sample results vary too much from expectations

no highly restrictive assumptions about the shape of the population distribution are needed


The W Test Statistic

Performing the Wilcoxon Signed Rank Test

Calculate the test statistic W using these steps:

Step 1: collect sample data

Step 2: compute d_i = difference between each value and the hypothesized median

Step 3: convert d_i values to absolute differences


The W Test Statistic (continued)

Step 4: determine the rank of each | d_i | value

eliminate zero d_i values

Lowest | d_i | value = rank 1

For ties, assign each the average rank of the tied observations


The W Test Statistic (continued)

Step 5: Create R+ and R- columns

for data values greater than the hypothesized median, put the rank in an R+ column

for data values less than the hypothesized median, put the rank in an R- column


The W Test Statistic (continued)

Step 6: the test statistic W is the sum of the ranks in the R+ column

Test the hypothesis by comparing the calculated W to the critical value from the table in appendix P

Note that n = the number of non-zero d_i values
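The six steps can be sketched in Python as follows. This is a minimal illustration, not the deck's own code: the function names `average_ranks` and `signed_rank_w` are hypothetical, and the inputs are the eight class sizes and the hypothesized median of 40 from the worked example in this deck.

```python
def average_ranks(values):
    """Rank values from low to high (1-based); tied values share their average rank."""
    sorted_vals = sorted(values)

    def rank(v):
        first = sorted_vals.index(v) + 1   # first 1-based position of v
        count = sorted_vals.count(v)       # number of tied copies of v
        return first + (count - 1) / 2     # average of the tied positions

    return [rank(v) for v in values]


def signed_rank_w(sample, hypothesized_median):
    """Return (sum of R+ ranks, sum of R- ranks, number of non-zero differences)."""
    # Step 2: differences from the hypothesized median; Step 4: drop zeros
    diffs = [x - hypothesized_median for x in sample]
    diffs = [d for d in diffs if d != 0]
    # Steps 3-4: rank the absolute differences, averaging ties
    ranks = average_ranks([abs(d) for d in diffs])
    # Step 5: split ranks into R+ and R- columns
    r_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    r_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    # Step 6: W is the R+ sum
    return r_plus, r_minus, len(diffs)


class_sizes = [23, 45, 34, 78, 34, 66, 61, 95]
w, r_minus, n = signed_rank_w(class_sizes, 40)
print(w, r_minus, n)  # 27.0 9.0 8
```

The R+ and R- sums always add up to n(n+1)/2 (here 36), which is a quick check on the ranking.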


Example (continued)

Rank the absolute differences:

| d_i |   Rank
   5      1
   6      2.5   (tied)
   6      2.5   (tied)
  17      4
  21      5
  26      6
  38      7
  55      8


Example (continued)

Put ranks in R+ and R- columns and find sums:

Class size x_i   Difference d_i = x_i - 40   | d_i |   Rank   R+       R-
23               -17                          17       4               4
45                 5                           5       1      1
34                -6                           6       2.5             2.5
78                38                          38       7      7
34                -6                           6       2.5             2.5
66                26                          26       6      6
61                21                          21       5      5
95                55                          55       8      8
                                                              Σ = 27   Σ = 9

The three values with ranks in the R- column are below the claimed median; the others are above


Completing the Test

H0: Median = 40

HA: Median ≠ 40

Test at the α = .05 level:

This is a two-tailed test and n = 8, so find W_L and W_U in appendix P: W_L = 3 and W_U = 33

The calculated test statistic is W = ΣR+ = 27


Completing the Test (continued)

H0: Median = 40

HA: Median ≠ 40

W_L = 3 and W_U = 33

W_L < W < W_U so do not reject H0

(there is not sufficient evidence to conclude that the median class size is different from 40)

[Figure: decision regions. Reject H0 below W_L = 3 or above W_U = 33; do not reject H0 between them. W = ΣR+ = 27 falls in the do-not-reject region.]


If the Sample Size is Large

The W test statistic approaches a normal distribution as n increases

For n > 20, W can be approximated by

$$ z = \frac{W - \frac{n(n+1)}{4}}{\sqrt{\frac{n(n+1)(2n+1)}{24}}} $$

where W = sum of the R+ ranks

n = number of non-zero d_i values
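A minimal Python sketch of this large-sample approximation: it standardizes W with mean n(n+1)/4 and standard deviation sqrt(n(n+1)(2n+1)/24). The function name `signed_rank_z` and the inputs W = 100, n = 25 are hypothetical values chosen for illustration; they do not come from the deck.

```python
import math


def signed_rank_z(w, n):
    """Normal approximation for the Wilcoxon signed rank statistic W.

    w: sum of the R+ ranks
    n: number of non-zero differences (the slide suggests n > 20)
    """
    mean = n * (n + 1) / 4
    std = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return (w - mean) / std


# Hypothetical inputs, for illustration only
z = signed_rank_z(w=100, n=25)
print(round(z, 2))
```

The resulting z is then compared with standard normal critical values in place of the appendix table.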


Nonparametric Tests for Two Population Centers

Wilcoxon Matched-Pairs Signed Rank Test: small samples, large samples

Mann-Whitney U-test: small samples, large samples


Mann-Whitney U-Test

Used to compare two samples from two populations

Assumptions:

The two samples are independent and random

The value measured is a continuous variable

The measurement scale used is at least ordinal

If they differ, the distributions of the two populations will differ only with respect to the central location


Mann-Whitney U-Test (continued)

Consider two samples

combine into a single list, but keep track of which sample each value came from

rank the values in the combined list from low to high

For ties, assign each the average rank of the tied values

separate back into two samples, each value keeping its assigned ranking

sum the rankings for each sample


Mann-Whitney U-Test (continued)

If the sum of rankings from one sample differs enough from the sum of rankings from the other sample, we conclude there is a difference in the population medians


Mann-Whitney U-Test (continued)

Mann-Whitney U-Statistics

$$ U_1 = n_1 n_2 + \frac{n_1(n_1+1)}{2} - \sum R_1 $$

$$ U_2 = n_1 n_2 + \frac{n_2(n_2+1)}{2} - \sum R_2 $$

where:

n_1 and n_2 are the two sample sizes

ΣR_1 and ΣR_2 = sum of ranks for samples 1 and 2


Mann-Whitney U-Test (continued)

Claim: Median class size for Math is larger than the median class size for English

A random sample of 9 Math and 9 English classes is selected (samples do not have to be of equal size)

Rank the combined values and then split them back into the separate samples


Mann-Whitney U-Test (continued)

Suppose the results are:

Class size (Math, M)   Class size (English, E)
23                     30
45                     47
34                     18
78                     34
34                     44
66                     61
62                     54
95                     28
81                     40


Mann-Whitney U-Test (continued)

Ranking for combined samples

Size   Rank
18     1
23     2
28     3
30     4
34     6    (tied)
34     6    (tied)
34     6    (tied)
40     8
44     9
45     10
47     11
54     12
61     13
62     14
66     15
78     16
81     17
95     18


Mann-Whitney U-Test (continued)

Split back into the original samples:

Class size (Math, M)   Rank      Class size (English, E)   Rank
23                     2         30                        4
45                     10        47                        11
34                     6         18                        1
78                     16        34                        6
34                     6         44                        9
66                     15        61                        13
62                     14        54                        12
95                     18        28                        3
81                     17        40                        8
                       Σ = 104                             Σ = 67
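The combine, rank, split and sum procedure can be sketched in Python. This is an illustrative sketch only: the helper name `average_ranks` is hypothetical, and the data are the Math and English class sizes from this example.

```python
def average_ranks(values):
    """Rank values from low to high (1-based); tied values share their average rank."""
    sorted_vals = sorted(values)

    def rank(v):
        first = sorted_vals.index(v) + 1   # first 1-based position of v
        count = sorted_vals.count(v)       # number of tied copies of v
        return first + (count - 1) / 2     # average of the tied positions

    return [rank(v) for v in values]


math_sizes = [23, 45, 34, 78, 34, 66, 62, 95, 81]
english_sizes = [30, 47, 18, 34, 44, 61, 54, 28, 40]

# Combine into one list; the first len(math_sizes) entries belong to Math
combined = math_sizes + english_sizes
ranks = average_ranks(combined)

# Split the ranks back into the two samples and sum them
r_math = sum(ranks[:len(math_sizes)])
r_english = sum(ranks[len(math_sizes):])
print(r_math, r_english)  # 104.0 67.0
```

The two rank sums together must equal N(N+1)/2 for the N = 18 combined values, that is 171, which gives a quick check on the ranking.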


Mann-Whitney U-Test (continued)

Claim: Median class size for Math is larger than the median class size for English

H0: Median_M ≤ Median_E

HA: Median_M > Median_E

Math:

$$ U_1 = n_1 n_2 + \frac{n_1(n_1+1)}{2} - \sum R_1 = (9)(9) + \frac{(9)(10)}{2} - 104 = 22 $$

English:

$$ U_2 = n_1 n_2 + \frac{n_2(n_2+1)}{2} - \sum R_2 = (9)(9) + \frac{(9)(10)}{2} - 67 = 59 $$

Note: U_1 + U_2 = n_1 n_2
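As a quick arithmetic check, the two U statistics can be computed from the sample sizes and rank sums. This is a minimal sketch; the function name `mann_whitney_u` is hypothetical, and the inputs are the values from this example (n_1 = n_2 = 9, rank sums 104 and 67).

```python
def mann_whitney_u(n1, n2, r1, r2):
    """Return (U1, U2) from the sample sizes and the rank sums of each sample."""
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2
    return u1, u2


u1, u2 = mann_whitney_u(n1=9, n2=9, r1=104, r2=67)
print(u1, u2)          # 22.0 59.0
print(u1 + u2 == 9 * 9)  # True, since U1 + U2 = n1 * n2
```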


Mann-Whitney U-Test (continued)

The Mann-Whitney U tables in Appendices L and M give the lower tail of the U-distribution

For one-tailed tests like this one, check the alternative hypothesis to see if U_1 or U_2 should be used as the test statistic

Since the alternative hypothesis indicates that population 1 (Math) has a higher median, use U_1 as the test statistic


Mann-Whitney U-Test (continued)

Use U_1 as the test statistic: U = 22

Compare U = 22 to the critical value U_α from the appropriate table

For sample sizes less than 9, use Appendix L

For sample sizes from 9 to 20, use Appendix M

If U < U_α, reject H0


Mann-Whitney U-Test (continued)

Use U_1 as the test statistic: U = 22

U_α from Appendix M for α = .05, n_1 = 9 and n_2 = 9 is U_α = 7

Since U ≥ U_α, do not reject H0

[Figure: decision regions. Reject H0 below U_α = 7; do not reject H0 at or above it. U = 22 falls in the do-not-reject region.]


Mann-Whitney U-Test for Large Samples

The table in Appendix M includes U_α values only for sample sizes between 9 and 20

The U statistic approaches a normal distribution as sample sizes increase

If samples are larger than 20, a normal approximation can be used


Mann-Whitney U-Test for Large Samples (continued)

The mean and standard deviation for Mann-Whitney U Test Statistic:

$$ \mu = \frac{n_1 n_2}{2} $$

$$ \sigma = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}} $$

Where n_1 and n_2 are sample sizes from populations 1 and 2


Mann-Whitney U-Test for Large Samples (continued)

Normal approximation for Mann-Whitney U Test Statistic:

$$ z = \frac{U - \frac{n_1 n_2}{2}}{\sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}}} $$


Large Sample Example

We wish to test

H0: Median_1 ≥ Median_2

HA: Median_1 < Median_2

Suppose two samples are obtained:

n_1 = 40, n_2 = 50

When rankings are completed, the sum of ranks for sample 1 is ΣR_1 = 1475

When rankings are completed, the sum of ranks for sample 2 is ΣR_2 = 2620


Large Sample Example (continued)

Compute the U statistics:

$$ U_1 = n_1 n_2 + \frac{n_1(n_1+1)}{2} - \sum R_1 = (40)(50) + \frac{(40)(41)}{2} - 1475 = 1345 $$

$$ U_2 = n_1 n_2 + \frac{n_2(n_2+1)}{2} - \sum R_2 = (40)(50) + \frac{(50)(51)}{2} - 2620 = 655 $$

Since the alternative hypothesis indicates that population 2 has a higher median, use U_2 as the test statistic

U statistic is found to be U = 655


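Large Sample Example (continued)

H0: Median_1 - Median_2 ≥ 0

HA: Median_1 - Median_2 < 0

α = .05, so z_α = -1.645

$$ z = \frac{U - \frac{n_1 n_2}{2}}{\sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}}} = \frac{655 - 1000}{\sqrt{\frac{(40)(50)(40 + 50 + 1)}{12}}} = -2.80 $$

Since z = -2.80 < -1.645, we reject H0

[Figure: standard normal curve. Reject H0 to the left of z_α = -1.645; do not reject H0 to the right.]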
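The z value in this example can be reproduced with a few lines of Python. This is a sketch under the example's own numbers (n_1 = 40, n_2 = 50, ΣR_2 = 2620); the function name `mann_whitney_z` is hypothetical.

```python
import math


def mann_whitney_z(u, n1, n2):
    """Normal approximation for the Mann-Whitney U statistic."""
    mean = n1 * n2 / 2
    std = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (u - mean) / std


n1, n2 = 40, 50
r2 = 2620
u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2   # 655.0, the test statistic used above
z = mann_whitney_z(u2, n1, n2)
print(round(z, 2))   # -2.8
print(z < -1.645)    # True, so H0 is rejected at alpha = .05
```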


The Wilcoxon T Test Statistic

Performing the Small-Sample Wilcoxon Matched Pairs Test (for n < 25)

Calculate the test statistic T using these steps:

Step 1: collect sample data

Step 2: compute d_i = difference between the sample 1 value and its paired sample 2 value

Step 3: rank the absolute differences, and give each rank the same sign as the sign of the difference value


The Wilcoxon T Test Statistic (continued)

Performing the Small-Sample Wilcoxon Matched Pairs Test (for n < 25)

Step 4: The test statistic is the sum of the absolute values of the ranks for the group with the smaller expected sum

Look at the alternative hypothesis to determine the group with the smaller expected sum

For two-tailed tests, just choose the smaller sum


Small Sample Example

Paired samples, n = 9:

Value (before)   Value (after)
36               30
45               47
34               18
58               54
30               38
46               31
42               24
55               62
41               40

Claim: Median value is smaller after than before

H0: Median_a ≥ Median_b

HA: Median_a < Median_b


Small Sample Example (continued)

Paired samples, n = 9:

Value (before)   Value (after)   Difference d   Rank of d   Ranks with smaller expected sum
36               30               6              4
45               47              -2             -2          2
34               18              16              8
58               54               4              3
30               38              -8             -6          6
46               31              15              7
42               24              18              9
55               62              -7             -5          5
41               40               1              1
                                                            Σ = T = 13
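The T statistic for this example can be reproduced with a short Python sketch. The helper name `average_ranks` is hypothetical. Dropping zero differences before ranking is an assumption carried over from the one-sample procedure; the matched-pairs steps above do not mention it, and this data set has no zero differences.

```python
def average_ranks(values):
    """Rank values from low to high (1-based); tied values share their average rank."""
    sorted_vals = sorted(values)

    def rank(v):
        first = sorted_vals.index(v) + 1   # first 1-based position of v
        count = sorted_vals.count(v)       # number of tied copies of v
        return first + (count - 1) / 2     # average of the tied positions

    return [rank(v) for v in values]


before = [36, 45, 34, 58, 30, 46, 42, 55, 41]
after = [30, 47, 18, 54, 38, 31, 24, 62, 40]

# Step 2: paired differences (assumption: zero differences are dropped)
diffs = [b - a for b, a in zip(before, after) if b != a]
# Step 3: rank the absolute differences; the sign of d decides the group
ranks = average_ranks([abs(d) for d in diffs])
positive_sum = sum(r for d, r in zip(diffs, ranks) if d > 0)
negative_sum = sum(r for d, r in zip(diffs, ranks) if d < 0)

# Step 4: under HA (after smaller than before) the negative ranks have the
# smaller expected sum, so T is the sum of their absolute values
t_stat = negative_sum
print(t_stat, positive_sum)  # 13.0 32.0
```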


Small Sample Example (continued)

The calculated T value is T = 13

Complete the test by comparing the calculated T value to the critical T-value from Appendix N

For n = 9 and α = .025 for a one-tailed test, T_α = 6

Since T ≥ T_α, do not reject H0

[Figure: decision regions. Reject H0 below T_α = 6; do not reject H0 at or above it. T = 13 falls in the do-not-reject region.]


Wilcoxon Matched Pairs Test for Large Samples

The table in Appendix N includes T_α values only for sample sizes from 6 to 25

The T statistic approaches a normal distribution as sample size increases

If the number of paired values is larger than 25, a normal approximation can be used


Wilcoxon Matched Pairs Test for Large Samples (continued)

The mean and standard deviation for Wilcoxon T:

$$ \mu = \frac{n(n+1)}{4} $$

$$ \sigma = \sqrt{\frac{n(n+1)(2n+1)}{24}} $$

where n is the number of paired values


Wilcoxon Matched Pairs Test for Large Samples (continued)

Normal approximation for the Wilcoxon T Test Statistic:

$$ z = \frac{T - \frac{n(n+1)}{4}}{\sqrt{\frac{n(n+1)(2n+1)}{24}}} $$


Kruskal-Wallis One-Way ANOVA

Tests the equality of more than 2 population medians

Assumptions:

variables have a continuous distribution.

the data are at least ordinal.

samples are independent.

samples come from populations whose only possible difference is that at least one may have a different central location than the others.


Kruskal-Wallis Test Procedure

Obtain relative rankings for each value

In event of tie, each of the tied values gets the average rank

Sum the rankings for data from each of the k groups

Compute the H test statistic


Kruskal-Wallis Test Procedure (continued)

The Kruskal-Wallis H test statistic: (with k - 1 degrees of freedom)

$$ H = \frac{12}{N(N+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(N+1) $$

where:

N = Sum of sample sizes in all samples

k = Number of samples

R_i = Sum of ranks in the ith sample

n_i = Size of the ith sample


Kruskal-Wallis Test Procedure (continued)

Complete the test by comparing the calculated H value to a critical χ² value from the chi-square distribution with k - 1 degrees of freedom

(The chi-square distribution is Appendix G)

Decision rule

Reject H0 if test statistic H > χ²_α

Otherwise do not reject H0



Kruskal-Wallis Example

Do different departments have different class sizes?

Class size (Math, M)   Class size (English, E)   Class size (History, H)
23                     55                        30
41                     60                        40
54                     72                        18
78                     45                        34
66                     70                        44


Kruskal-Wallis Example (continued)

Do different departments have different class sizes?

Class size (Math, M)   Ranking   Class size (English, E)   Ranking   Class size (History, H)   Ranking
23                     2         55                        10        30                        3
41                     6         60                        11        40                        5
54                     9         72                        14        18                        1
78                     15        45                        8         34                        4
66                     12        70                        13        44                        7
                       Σ = 44                              Σ = 56                              Σ = 20


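Kruskal-Wallis Example (continued)

H0: Median_M = Median_E = Median_H

HA: Not all population medians are equal

The H statistic is

$$ H = \frac{12}{N(N+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(N+1) = \frac{12}{15(15+1)} \left( \frac{44^2}{5} + \frac{56^2}{5} + \frac{20^2}{5} \right) - 3(15+1) = 6.72 $$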
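The H computation can be checked with a short Python sketch that ranks the combined class sizes, sums the ranks per department, and applies the formula. The function names are hypothetical, and the sketch applies no tie correction because the slides do not describe one (this data set has no ties).

```python
def average_ranks(values):
    """Rank values from low to high (1-based); tied values share their average rank."""
    sorted_vals = sorted(values)

    def rank(v):
        first = sorted_vals.index(v) + 1   # first 1-based position of v
        count = sorted_vals.count(v)       # number of tied copies of v
        return first + (count - 1) / 2     # average of the tied positions

    return [rank(v) for v in values]


def kruskal_wallis_h(groups):
    """H = 12 / (N(N+1)) * sum(R_i^2 / n_i) - 3(N+1), without tie correction."""
    combined = [v for g in groups for v in g]
    ranks = average_ranks(combined)
    n_total = len(combined)
    total = 0.0
    start = 0
    for g in groups:
        r_i = sum(ranks[start:start + len(g)])   # rank sum for this group
        total += r_i ** 2 / len(g)
        start += len(g)
    return 12 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)


math_sizes = [23, 41, 54, 78, 66]
english_sizes = [55, 60, 72, 45, 70]
history_sizes = [30, 40, 18, 34, 44]

h = kruskal_wallis_h([math_sizes, english_sizes, history_sizes])
print(round(h, 2))  # 6.72
```

H is then compared with the chi-square critical value for k - 1 = 2 degrees of freedom.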



Since H = 6.72
