Ch11: Comparing 2 Samples


Page 1: Ch11:  Comparing 2 Samples

11.1: Introduction

This chapter deals with analyzing continuous measurements.

Later, some experimental design ideas will be introduced.

Chapter #13 will be devoted to qualitative data analysis.

Page 2: Ch11:  Comparing 2 Samples

11.2: Comparing Two Independent Samples

In a medical study, one sample X of subjects may be assigned to a control (placebo) treatment and another sample Y to the treatment under study.

This section deals with independent samples; later sections deal with dependent (paired) samples.

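$$X_1, \ldots, X_n \overset{iid}{\sim} F \quad \text{(observations from the control group)}$$

$$Y_1, \ldots, Y_m \overset{iid}{\sim} G \quad \text{(observations from the treatment group)}$$

X and Y are two independent random samples.

INFERENCE is about the difference of means or other location parameters of F and G.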

Page 3: Ch11:  Comparing 2 Samples

11.2.1: Methods based on Normal Distributions

Assumptions:

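$$X_1, \ldots, X_n \sim N(\mu_X, \sigma^2), \qquad Y_1, \ldots, Y_m \sim N(\mu_Y, \sigma^2)$$

X and Y are two independent random samples.

The difference of means $\mu_X - \mu_Y$ has the natural estimate $\bar X - \bar Y$, with

$$\bar X - \bar Y \sim N\left(\mu_X - \mu_Y,\ \sigma^2\left(\frac{1}{n} + \frac{1}{m}\right)\right)$$

If $\sigma^2$ is known: $(\bar X - \bar Y) \pm z(\alpha/2)\,\sigma\sqrt{\frac{1}{n} + \frac{1}{m}}$ is a $100(1-\alpha)\%$ CI for $\mu_X - \mu_Y$.

If $\sigma^2$ is unknown: it is estimated by the pooled sample variance

$$s_p^2 = \frac{(n-1)S_X^2 + (m-1)S_Y^2}{n + m - 2}$$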
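As a small illustration of the known-variance interval and the pooled variance on this slide, here is a minimal Python sketch. The function names `z_interval` and `pooled_variance` and the data values are made up for the example; they are not from the textbook.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

def z_interval(x, y, sigma, alpha=0.05):
    """100(1 - alpha)% CI for mu_X - mu_Y when the common sigma is known."""
    n, m = len(x), len(y)
    diff = mean(x) - mean(y)
    z = NormalDist().inv_cdf(1 - alpha / 2)      # upper alpha/2 point z(alpha/2)
    half_width = z * sigma * sqrt(1 / n + 1 / m)
    return diff - half_width, diff + half_width

def pooled_variance(x, y):
    """s_p^2 = ((n-1) S_X^2 + (m-1) S_Y^2) / (n + m - 2)."""
    n, m = len(x), len(y)
    return ((n - 1) * variance(x) + (m - 1) * variance(y)) / (n + m - 2)

# Hypothetical measurements, for illustration only
x = [5.1, 4.9, 6.0, 5.5]
y = [4.2, 4.8, 5.0, 4.4]
lo, hi = z_interval(x, y, sigma=0.5)
sp2 = pooled_variance(x, y)
print(round(lo, 3), round(hi, 3), round(sp2, 4))
```

The pooled variance is only appropriate under the slide's assumption that both populations share the same $\sigma^2$.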

Page 4: Ch11:  Comparing 2 Samples

11.2.1 : (cont’d)

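Theorem A: Let $X_1, \ldots, X_n \sim N(\mu_X, \sigma^2)$ and $Y_1, \ldots, Y_m \sim N(\mu_Y, \sigma^2)$ be two independent random samples. The statistic

$$t = \frac{(\bar X - \bar Y) - (\mu_X - \mu_Y)}{s_p\sqrt{\frac{1}{n} + \frac{1}{m}}}$$

follows a t distribution with $df = n + m - 2$ degrees of freedom.

Corollary A: Under the assumptions of Theorem A, a $100(1-\alpha)\%$ CI for $\mu_X - \mu_Y$ is

$$(\bar X - \bar Y) \pm t_{df}(\alpha/2)\, s_{\bar X - \bar Y}, \qquad s_{\bar X - \bar Y} = s_p\sqrt{\frac{1}{n} + \frac{1}{m}}$$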

Page 5: Ch11:  Comparing 2 Samples

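11.2.1 : (cont’d) Test Procedures for Normal Populations:

Null Hypothesis: $H_0: \mu_X = \mu_Y$ or $H_0: \mu_X - \mu_Y = 0$

Test Statistic:

$$t = \frac{(\bar X - \bar Y) - 0}{s_p\sqrt{\frac{1}{n} + \frac{1}{m}}}$$

There are 3 common alternative hypotheses, 2 of which are one-sided ($H_A: \mu_X > \mu_Y$ or $H_A: \mu_X < \mu_Y$) and one of which is two-sided ($H_A: \mu_X \neq \mu_Y$).

Revisit my handouts about CI and HT for reference.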
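The pooled two-sample t statistic for $H_0: \mu_X = \mu_Y$ can be sketched in a few lines of Python. This is only an illustration under the equal-variance normal assumptions of this section; the name `pooled_t` and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(x, y):
    """Return (t, df) for the pooled two-sample t-test of H0: mu_X = mu_Y."""
    n, m = len(x), len(y)
    # Pooled sample variance s_p^2
    sp2 = ((n - 1) * variance(x) + (m - 1) * variance(y)) / (n + m - 2)
    # Estimated standard error of X-bar minus Y-bar
    se = sqrt(sp2) * sqrt(1 / n + 1 / m)
    t = (mean(x) - mean(y) - 0) / se
    return t, n + m - 2

# Hypothetical measurements, for illustration only
x = [5.1, 4.9, 6.0, 5.5]
y = [4.2, 4.8, 5.0, 4.4]
t_stat, df = pooled_t(x, y)
print(round(t_stat, 3), df)
```

The returned value would then be compared with a t table at `df` degrees of freedom: against $t_{df}(\alpha/2)$ in absolute value for the two-sided alternative, or against $t_{df}(\alpha)$ for a one-sided one.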

Page 6: Ch11:  Comparing 2 Samples

11.2.2 : Power calculation

The power of the 2-sample t-test depends on:

1. $\Delta = \mu_X - \mu_Y$ (real difference): the larger $|\Delta|$, the greater the power

2. $\alpha$ (level of significance): the larger $\alpha$, the more powerful the test

3. $\sigma$ (population standard deviation): the smaller $\sigma$, the larger the power

4. n and m (sample sizes): the larger n and m, the greater the power

Page 7: Ch11:  Comparing 2 Samples

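11.2.2 : Power calculation (cont’d)

Assume that n = m (same sample size) are large enough to test $H_0: \mu_X = \mu_Y$ vs $H_A: \mu_X \neq \mu_Y$ at level $\alpha$, with test statistic Z based on $\bar X - \bar Y$,

$$Z = \frac{\bar X - \bar Y}{\sigma\sqrt{2/n}}$$

where $\alpha$, $\sigma$, and $\Delta = \mu_X - \mu_Y$ are given.

The rejection region (RR) of such a test is:

$$|Z| > z(\alpha/2) \iff |\bar X - \bar Y| > z(\alpha/2)\,\sigma\sqrt{2/n}$$

The power of a test is the probability of rejecting the null hypothesis when it is false. That is,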

Page 8: Ch11:  Comparing 2 Samples

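Power against $\Delta' = \mu_X - \mu_Y$:

$$\begin{aligned}
P(RR \mid \Delta') &= P\left(|\bar X - \bar Y| > z(\alpha/2)\,\sigma\sqrt{2/n}\right) \\
&= P\left(\bar X - \bar Y > z(\alpha/2)\,\sigma\sqrt{2/n}\right) + P\left(\bar X - \bar Y < -z(\alpha/2)\,\sigma\sqrt{2/n}\right) \\
&= P\left(\frac{\bar X - \bar Y - \Delta'}{\sigma\sqrt{2/n}} > \frac{z(\alpha/2)\,\sigma\sqrt{2/n} - \Delta'}{\sigma\sqrt{2/n}}\right) + P\left(\frac{\bar X - \bar Y - \Delta'}{\sigma\sqrt{2/n}} < \frac{-z(\alpha/2)\,\sigma\sqrt{2/n} - \Delta'}{\sigma\sqrt{2/n}}\right) \\
&= 1 - \Phi\left(z(\alpha/2) - \frac{\Delta'}{\sigma}\sqrt{\frac{n}{2}}\right) + \Phi\left(-z(\alpha/2) - \frac{\Delta'}{\sigma}\sqrt{\frac{n}{2}}\right)
\end{aligned}$$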
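The power of the two-sided z-test with n = m and known $\sigma$ is $1 - \Phi(z(\alpha/2) - (\Delta'/\sigma)\sqrt{n/2}) + \Phi(-z(\alpha/2) - (\Delta'/\sigma)\sqrt{n/2})$. A minimal Python sketch of that expression follows; the function name `power_two_sided` is an assumption made for this example.

```python
from math import sqrt
from statistics import NormalDist

def power_two_sided(delta, sigma, n, alpha=0.05):
    """Power of the two-sided z-test (n = m, known sigma) against mu_X - mu_Y = delta."""
    std_normal = NormalDist()
    z = std_normal.inv_cdf(1 - alpha / 2)        # z(alpha/2)
    shift = (delta / sigma) * sqrt(n / 2)        # (delta / sigma) * sqrt(n / 2)
    return 1 - std_normal.cdf(z - shift) + std_normal.cdf(-z - shift)

# At delta = 0 the "power" is just the level alpha
power_at_null = power_two_sided(0, 5, 100)
# Power grows with the sample size for a fixed nonzero difference
power_small_n = power_two_sided(1, 5, 50)
power_large_n = power_two_sided(1, 5, 525)
print(power_at_null, power_small_n, power_large_n)
```

Evaluating the function on a grid of `delta` values gives the usual power curve, which is symmetric about zero and has its minimum $\alpha$ there.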

Page 9: Ch11:  Comparing 2 Samples

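Application: what n is needed?

As the difference $\Delta'$ moves away from zero, one of the terms

$$1 - \Phi\left(z(\alpha/2) - \frac{\Delta'}{\sigma}\sqrt{\frac{n}{2}}\right) \quad \text{or} \quad \Phi\left(-z(\alpha/2) - \frac{\Delta'}{\sigma}\sqrt{\frac{n}{2}}\right)$$

will be negligible with respect to the other.

Problem: what n is needed to detect a difference of $\Delta' = 1$ with probability 0.9 when $\sigma = 5$ and $\alpha = 0.05$?

Solution: for $\Delta' > 0$ the second term is negligible, so

$$0.9 = 1 - \Phi\left(1.96 - \frac{1}{5}\sqrt{\frac{n}{2}}\right) \iff \Phi\left(1.96 - \frac{1}{5}\sqrt{\frac{n}{2}}\right) = 0.1$$

$$1.96 - \frac{1}{5}\sqrt{\frac{n}{2}} = -z(0.1) = -1.28 \implies \sqrt{\frac{n}{2}} = 5\,(1.96 + 1.28) = 16.2 \implies n \approx 525$$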
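Dropping the negligible term and solving $1 - \Phi(z(\alpha/2) - (\Delta'/\sigma)\sqrt{n/2}) = \text{power}$ for n gives $n = 2\,[\sigma\,(z(\alpha/2) + z(1 - \text{power}))/\Delta']^2$. The sketch below evaluates this closed form; the function name `n_per_group` is hypothetical.

```python
from statistics import NormalDist

def n_per_group(delta, sigma, power=0.9, alpha=0.05):
    """Approximate per-group n for the two-sided z-test (negligible tail dropped)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # z(alpha/2), about 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)            # z(1 - power), about 1.28 for power = 0.9
    return 2 * (sigma * (z_alpha + z_beta) / delta) ** 2

n_needed = n_per_group(delta=1, sigma=5)
print(n_needed)
```

With unrounded normal quantiles the result is slightly above 525, while the rounded values 1.96 and 1.28 give about 524.9, so both agree with a sample size of roughly 525 per group.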

Page 10: Ch11:  Comparing 2 Samples

11.2.3: The Mann-Whitney Test (a nonparametric method)

Also known as the Wilcoxon RST (Rank Sum Test).

Assume that m + n experimental units are to be assigned (at random) to a treatment group and a control group. In this specific context, n units are randomly chosen and assigned to the ctrl, and the remaining m units are assigned to the trt.

We are interested in testing the null hypothesis that the treatment has NO EFFECT.

If the null is true, then any difference in the outcomes under the 2 conditions is due to the randomization (i.e. solely to chance).

Page 11: Ch11:  Comparing 2 Samples

The Mann-Whitney Test: (cont’d)

The MW-test statistic is calculated as follows:

1. Group all m + n observations together and rank them in order of increasing size (no ties)

2. Calculate the sum of the ranks of those observations that came from the ctrl group.

3. Reject the null if the sum is too small or too large

Example: ranks are shown in parentheses

Treatment   Control
1 (1)       6 (4)
3 (2)       4 (3)

R = 3 + 4 = 7 (ctrl) and R = 1 + 2 = 3 (trt)
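The ranking steps can be sketched as follows for the 2 + 2 example (treatment values 1 and 3, control values 6 and 4). The function name `rank_sums` is made up for this illustration, and the code assumes no ties, as in step 1.

```python
def rank_sums(treatment, control):
    """Rank the pooled observations (no ties assumed) and sum the ranks per group."""
    pooled = sorted(treatment + control)
    rank = {value: i + 1 for i, value in enumerate(pooled)}   # smallest value gets rank 1
    r_trt = sum(rank[v] for v in treatment)
    r_ctrl = sum(rank[v] for v in control)
    return r_trt, r_ctrl

r_trt, r_ctrl = rank_sums([1, 3], [6, 4])
print(r_trt, r_ctrl)   # 3 7
```

The two rank sums always add up to (m + n)(m + n + 1)/2, here 10, so either one determines the other.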

Page 12: Ch11:  Comparing 2 Samples

The Mann-Whitney Test: (cont’d)

Question: Does this discrepancy provide convincing evidence of a systematic difference between trt & ctrl, or could it be due just to chance?

Answer: Consider the null hypothesis that the trt had no effect.

Under the null, every assignment (total: 4! = 24) of ranks to observations is equally likely.

In particular, each of the

$$\binom{4}{2} = \frac{4!}{2!\,(4-2)!} = 6$$

assignments of ranks to the ctrl group (shown below) is equally likely:

Rank   {1,2}  {1,3}  {1,4}  {2,3}  {2,4}  {3,4}
R        3      4      5      5      6      7

Page 13: Ch11:  Comparing 2 Samples

The Mann-Whitney Test: (cont’d)

The null distribution of the discrete r.v. R is:

r         3     4     5     6     7    Sum
P(R=r)   1/6   1/6   2/6   1/6   1/6    1

From this table, $P(R = 7) = 1/6$; that is to say, this discrepancy would occur one time out of 6 by chance.

Similar computations can be carried out for any sample sizes m and n, and can even be extended to testing

$$H_0: F = G, \quad \text{where CTRL: } X_1, \ldots, X_n \sim F \text{ and TRT: } Y_1, \ldots, Y_m \sim G$$

Read page 404 (textbook).
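For small samples the null distribution of the control rank sum R can be obtained by brute-force enumeration of all equally likely rank assignments. The sketch below does this with exact fractions; `rank_sum_null_distribution` is a hypothetical helper name, and the approach is only practical for small m and n.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

def rank_sum_null_distribution(n_ctrl, n_trt):
    """Exact null distribution of the ctrl rank sum: every rank subset is equally likely."""
    total = n_ctrl + n_trt
    counts = Counter(sum(subset) for subset in combinations(range(1, total + 1), n_ctrl))
    n_assignments = sum(counts.values())
    return {r: Fraction(k, n_assignments) for r, k in sorted(counts.items())}

dist = rank_sum_null_distribution(2, 2)
p_upper = sum(p for r, p in dist.items() if r >= 7)   # P(R >= 7) under the null
print(dist, p_upper)
```

For n = m = 2 this reproduces the table for R = 3, ..., 7 and the tail probability 1/6 for the observed R = 7.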

Page 14: Ch11:  Comparing 2 Samples

The Mann-Whitney Test: Another approach

Suppose that the X’s are sampled from F and the Y’s are sampled from G. The Mann-Whitney test can be derived from a different point of view than what was seen earlier.

We would like to estimate, as a measure of the treatment effect, the probability that an observation from F is smaller than an independent observation from G, that is, $\pi = P(X < Y)$, where X and Y are independently distributed with distribution functions F and G.

An estimate of $\pi$ can be obtained by comparing all n values of X to all m values of Y and by calculating the proportion of the comparisons for which X is less than Y.

Page 15: Ch11:  Comparing 2 Samples

The Mann-Whitney Test: Another approach (Cont’d)

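That is:

$$\hat\pi = \frac{1}{mn}\sum_{i=1}^{n}\sum_{j=1}^{m} Z_{ij} = \frac{U_Y}{mn}, \quad \text{where } U_Y = \sum_{i=1}^{n}\sum_{j=1}^{m} Z_{ij} \text{ and } Z_{ij} = \begin{cases} 1, & \text{if } X_i < Y_j \\ 0, & \text{otherwise} \end{cases}$$

Theorem A: Under the null $H_0: F = G$,

$$E(U_Y) = \frac{mn}{2} \quad \text{and} \quad Var(U_Y) = \frac{mn(m + n + 1)}{12}$$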
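The estimate of $\pi = P(X < Y)$ and the null moments $E(U_Y) = mn/2$ and $Var(U_Y) = mn(m+n+1)/12$ are easy to compute directly. The sketch below uses a hypothetical function name `mann_whitney_u` and the 2 + 2 example from the earlier slides (control X = 6, 4; treatment Y = 1, 3).

```python
def mann_whitney_u(x, y):
    """Count pairs with X_i < Y_j and return U_Y, pi-hat, and the null mean and variance."""
    n, m = len(x), len(y)
    u_y = sum(1 for xi in x for yj in y if xi < yj)   # sum of the indicators Z_ij
    pi_hat = u_y / (m * n)                            # proportion of comparisons with X < Y
    mean_u = m * n / 2                                # E(U_Y) under H0: F = G
    var_u = m * n * (m + n + 1) / 12                  # Var(U_Y) under H0: F = G
    return u_y, pi_hat, mean_u, var_u

u_y, pi_hat, mean_u, var_u = mann_whitney_u([6, 4], [1, 3])
print(u_y, pi_hat, mean_u, var_u)
```

Here no control value is below any treatment value, so $U_Y = 0$ and $\hat\pi = 0$, the most extreme outcome in that direction.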

Page 16: Ch11:  Comparing 2 Samples

11.3: Comparing Paired Samples

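Paired Design vs Unpaired Design:

CASE 1: Paired Design

$(X_i, Y_i)$ are pairs with $i = 1, \ldots, n$.

The X's have mean $\mu_X$ and variance $\sigma_X^2$.

The Y's have mean $\mu_Y$ and variance $\sigma_Y^2$.

Assume different pairs are independently distributed.

Assume pair members have $cov(X_i, Y_i) = \sigma_{XY}$, with correlation $\rho = \sigma_{XY} / (\sigma_X \sigma_Y)$.

The differences $D_i = X_i - Y_i$ are independent and $\bar D = \bar X - \bar Y$, with

$$E(D_i) = \mu_X - \mu_Y \quad \text{and} \quad Var(D_i) = \sigma_X^2 + \sigma_Y^2 - 2\sigma_{XY} = \sigma_X^2 + \sigma_Y^2 - 2\rho\,\sigma_X\sigma_Y$$

$$E(\bar D) = \mu_X - \mu_Y \quad \text{and} \quad Var(\bar D) = \frac{1}{n}\left(\sigma_X^2 + \sigma_Y^2 - 2\rho\,\sigma_X\sigma_Y\right)$$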

Page 17: Ch11:  Comparing 2 Samples

11.3: (cont’d)

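CASE 2: Unpaired Design

If the samples of X's and Y's are independent, then $\rho = 0$.

Then $\mu_X - \mu_Y$ will be estimated by $\bar X - \bar Y$, with

$$E(\bar X - \bar Y) = \mu_X - \mu_Y \quad \text{and} \quad Var(\bar X - \bar Y) = \frac{1}{n}\left(\sigma_X^2 + \sigma_Y^2\right)$$

Thus,

$$\frac{1}{n}\left(\sigma_X^2 + \sigma_Y^2 - 2\rho\,\sigma_X\sigma_Y\right) < \frac{1}{n}\left(\sigma_X^2 + \sigma_Y^2\right) \quad \text{for } \rho > 0$$

In this circumstance, PAIRING is the more effective DESIGN.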

Page 18: Ch11:  Comparing 2 Samples

11.3: (cont’d)

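What if $\sigma_X^2 = \sigma_Y^2 = \sigma^2$?

Then

$$Var(\bar X - \bar Y)_{unpaired} = \frac{2\sigma^2}{n} \quad \text{and} \quad Var(\bar D)_{paired} = \frac{2\sigma^2(1 - \rho)}{n}$$

Thus, the relative efficiency is:

$$\frac{Var(\bar D)_{paired}}{Var(\bar X - \bar Y)_{unpaired}} = 1 - \rho$$

That is, a Paired Design with n pairs will be as precise as an Unpaired Design with $n/(1 - \rho)$ subjects per treatment (for example, 2n subjects per treatment when $\rho = 1/2$).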
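A quick numerical check of the two variance formulas, $Var(\bar X - \bar Y) = (\sigma_X^2 + \sigma_Y^2)/n$ for the unpaired design and $Var(\bar D) = (\sigma_X^2 + \sigma_Y^2 - 2\rho\sigma_X\sigma_Y)/n$ for the paired design, is sketched below. The function names and the parameter values are chosen only for illustration.

```python
def var_unpaired(sigma_x, sigma_y, n):
    """Var(X-bar - Y-bar) for two independent samples of size n."""
    return (sigma_x**2 + sigma_y**2) / n

def var_paired(sigma_x, sigma_y, rho, n):
    """Var(D-bar) for n pairs whose members have correlation rho."""
    return (sigma_x**2 + sigma_y**2 - 2 * rho * sigma_x * sigma_y) / n

# Hypothetical values: equal standard deviations and rho = 0.5
sigma, rho, n = 2.0, 0.5, 10
relative_efficiency = var_paired(sigma, sigma, rho, n) / var_unpaired(sigma, sigma, n)
print(relative_efficiency)   # equals 1 - rho when sigma_X = sigma_Y
```

With a negative correlation the ratio exceeds 1, which is the case where pairing would lose precision rather than gain it.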

Page 19: Ch11:  Comparing 2 Samples

Pros & Cons Paired vs Independent Samples:

Here are 2 competing sampling schemes:

Paired Samples: n pairs (2n measurements)

Independent Samples: 2n observations (m=n)

They both give the common form:

$$\bar D \pm t_{df}(\alpha/2)\,\widehat{SE}, \quad \text{where } \bar D = \bar X - \bar Y$$

But the SE estimates and the df for t are different:

                 Independent Samples                      Paired Samples
$\widehat{SE}$   $s_p\sqrt{\frac{1}{n} + \frac{1}{n}}$    $s_D / \sqrt{n}$
df               2n - 2 = 2(n - 1)                        n - 1

Page 20: Ch11:  Comparing 2 Samples

Pros & Cons Paired vs Independent Samples:

For the same SE estimate, a loss of DF (degrees of freedom) gives a larger critical value for the t-test (example: for n = 10, $t_{9}(0.05) = 1.833 > t_{18}(0.05) = 1.734$).

A loss of DF for the t-test produces:

• C.I.: larger confidence intervals

• H.T.: loss of power to detect real differences in the population means

Such a loss of DF for Paired Samples can be compensated by a smaller variance $Var(\bar X - \bar Y)$ of Paired Samples with respect to Independent Samples, which occurs when the pair members are positively correlated.

Page 21: Ch11:  Comparing 2 Samples

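11.3.1: Parametric Methods Based on the Normal Distribution for Paired Data

Assume that $D_i = X_i - Y_i \sim N(\mu_D, \sigma_D^2)$, where $E(D_i) = \mu_D = \mu_X - \mu_Y$ and $Var(D_i) = \sigma_D^2$.

Inferences will be based on

$$t = \frac{\bar D - \mu_D}{s_{\bar D}}, \qquad s_{\bar D} = \frac{s_D}{\sqrt{n}}$$

because $\sigma_D$ is unknown in general.

t follows a t distribution with n - 1 degrees of freedom.

A $100(1-\alpha)\%$ CI for $\mu_D$ is: $\bar D \pm t_{n-1}(\alpha/2)\, s_{\bar D}$

Testing $H_0: \mu_D = 0$ (no treatment effect) vs $H_A: \mu_D \neq 0$ at level $\alpha$ has the rejection region

$$\left|\frac{\bar D}{s_{\bar D}}\right| > t_{n-1}(\alpha/2)$$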
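The paired t statistic $t = \bar D / s_{\bar D}$ with $s_{\bar D} = s_D/\sqrt{n}$ and n - 1 degrees of freedom can be sketched as below. The function name `paired_t` and the before/after values are hypothetical and serve only to show the computation.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(x, y):
    """Return (d_bar, s_dbar, t, df) for testing H0: mu_D = 0 with paired data."""
    d = [xi - yi for xi, yi in zip(x, y)]   # within-pair differences D_i = X_i - Y_i
    n = len(d)
    d_bar = mean(d)
    s_dbar = stdev(d) / sqrt(n)             # estimated standard error of D-bar
    return d_bar, s_dbar, d_bar / s_dbar, n - 1

# Hypothetical paired measurements, for illustration only
x = [10, 12, 9, 11]
y = [8, 11, 9, 9]
d_bar, s_dbar, t_stat, df = paired_t(x, y)
print(d_bar, round(s_dbar, 4), round(t_stat, 3), df)
```

The statistic would be compared with $t_{n-1}(\alpha/2)$ from a t table, and the same `d_bar` and `s_dbar` give the confidence interval for $\mu_D$.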

Page 22: Ch11:  Comparing 2 Samples

11.3.2: Nonparametric Method for Paired Data: Signed Rank Test (SRT)

The Wilcoxon SRT statistic is computed as follows:

1. Rank the absolute values of the differences (no ties), with $R_i$ = rank of $|D_i|$ for $i = 1, \ldots, n$.

2. To get the signed ranks, just restore the signs of the $D_i$ to the ranks.

3. Calculate $W_+$, the sum of those ranks that have positive (+) signs.

Example: Let the $D_i$ be -2, 4, 3, 2, -1, 5. Ordered by absolute value:

-1 (r1), -2 (r2), +2 (r3), +3 (r4), +4 (r5), +5 (r6), with 4 positive observations.

Because of the tie between ranks 2 and 3, both observations get the average rank $(2 + 3)/2 = 2.5$, so

$$W_+ = 2.5 + 4 + 5 + 6 = 17.5$$
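The signed rank computation, including the averaging of tied ranks used in the example, can be sketched as follows. The function name `signed_rank_w_plus` is made up for this illustration, and the code assumes there are no zero differences.

```python
def signed_rank_w_plus(differences):
    """Sum of the (mid)ranks of |D_i| over the positive differences."""
    abs_sorted = sorted(abs(d) for d in differences)

    def midrank(value):
        first = abs_sorted.index(value) + 1           # smallest rank in the tie group
        last = first + abs_sorted.count(value) - 1    # largest rank in the tie group
        return (first + last) / 2                     # average rank for tied values

    return sum(midrank(abs(d)) for d in differences if d > 0)

w_plus = signed_rank_w_plus([-2, 4, 3, 2, -1, 5])
print(w_plus)   # 17.5
```

Without ties every midrank is just the ordinary rank, so the same function covers the no-ties case stated in step 1.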

Page 23: Ch11:  Comparing 2 Samples

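Wilcoxon SRT (cont’d):

Theorem A: Under the null hypothesis that the $D_i$ are independent and symmetrically distributed about zero,

$$E(W_+) = \frac{n(n+1)}{4} \quad \text{and} \quad Var(W_+) = \frac{n(n+1)(2n+1)}{24}$$

Proof:

$$W_+ = \sum_{k=1}^{n} k\, I_k, \quad \text{where } I_k = \begin{cases} 1, & \text{if the } |D_i| \text{ with rank } k \text{ has } D_i > 0 \\ 0, & \text{otherwise} \end{cases}$$

Under $H_0$, the $I_k \sim \text{Bernoulli}(1/2)$ independently, so

$$E(I_k) = \frac{1}{2} \quad \text{and} \quad Var(I_k) = \frac{1}{4}$$

The result follows.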
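The moment formulas $E(W_+) = n(n+1)/4$ and $Var(W_+) = n(n+1)(2n+1)/24$ can be checked for small n by enumerating all $2^n$ equally likely sign patterns, which is exactly the Bernoulli(1/2) structure used in the proof. The helper names below are hypothetical, and the enumeration is meant only as a sanity check.

```python
from fractions import Fraction
from itertools import product

def w_plus_moments_by_enumeration(n):
    """Exact mean and variance of W+ = sum of k * I_k over all 2^n sign patterns."""
    values = [sum(k for k, i_k in zip(range(1, n + 1), signs) if i_k)
              for signs in product((0, 1), repeat=n)]
    mean = Fraction(sum(values), len(values))
    var = Fraction(sum(v * v for v in values), len(values)) - mean**2
    return mean, var

def w_plus_moments_by_formula(n):
    """Mean and variance of W+ under H0 from Theorem A."""
    return Fraction(n * (n + 1), 4), Fraction(n * (n + 1) * (2 * n + 1), 24)

enumerated = w_plus_moments_by_enumeration(6)
by_formula = w_plus_moments_by_formula(6)
print(enumerated, by_formula)
```

For n = 6 both give mean 21/2 and variance 91/4, the null moments that apply to the six-difference example when there are no ties.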

Page 24: Ch11:  Comparing 2 Samples

11.4: Experimental Design

Some basic principles of DOE (Design of Experiment) are introduced here.

Experimental Design can be viewed as a sequence of linked studies under some conditions.

Read case studies 11.4.1 through 11.4.8.