Comparison of Two Means


Transcript of Comparison of Two Means

Page 1: Comparison of Two Means
Page 2: Comparison of Two Means

11. Comparison of Two Means

Page 3: Comparison of Two Means

Tests involving two samples – comparing variances, F distribution

• TOH - $\bar{x}_A = \bar{x}_B$ ?
• Step 1 - F-test: $s_A^2 = s_B^2$ ?
• Step 2 - t-test: use a different formula for (i) $s_A^2 = s_B^2$, (ii) $s_A^2 \neq s_B^2$
• Goal – determine whether a given gene is expressed differently between patients and healthy subjects
• This involves comparing the means of the two samples
• To answer this question, one must first know whether the two samples have the same variance
• The method used to compare the variances of two samples is based on the F distribution
• Then we use a t-test to test whether the mean expression of the gene differs between patients and healthy subjects

Page 4: Comparison of Two Means

Tests involving two samples – comparing variances, F distribution

• The values measured in controls are: 10, 11, 11, 12, 15, 13, 12
• The values measured in patients are: 12, 13, 13, 15, 12, 18, 17, 16, 16, 12, 15, 10, 12. Is the variance different between the controls and the patients at the 5% significance level?
• $H_0$: $s_A^2 = s_B^2$, $H_1$: $s_A^2 \neq s_B^2$
• We need a new test statistic, the ratio of the two sample variances:

$$F = \frac{s_A^2}{s_B^2}$$

• Two-tailed test
• Notation: A = controls, B = patients in the following calculation
• Control sample A has d.o.f. = 6 and variance = 2.66
• Patient sample B has d.o.f. = 12 and variance = 5.74
• Consider the ratio F = 2.66/5.74 = 0.4634
• Significance level in each tail of the two-tailed test = 5%/2 = 2.5%
• F distribution (right tail): $F_{0.025}(6,12)$ = 3.7283 (from Excel)
• $F_{0.975}(6,12)$ = 0.1864 (from Excel)

F distribution (right tail): http://mips.stanford.edu/public/classes/stats_data_analysis/234_99.html
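As a rough cross-check of the numbers on this slide, the sample variances and the F ratio can be recomputed with Python's standard `statistics` module. This is only a sketch: the two critical values are copied from the slide (Excel) rather than computed, and all variable names are illustrative.

```python
from statistics import variance

controls = [10, 11, 11, 12, 15, 13, 12]                          # sample A
patients = [12, 13, 13, 15, 12, 18, 17, 16, 16, 12, 15, 10, 12]  # sample B

var_a = variance(controls)   # sample variance, d.o.f. = len(controls) - 1 = 6
var_b = variance(patients)   # sample variance, d.o.f. = len(patients) - 1 = 12
f_ratio = var_a / var_b      # F = s_A^2 / s_B^2

# Critical values quoted on the slide (from Excel), not computed here.
f_lower, f_upper = 0.1864, 3.7283

# H0 (equal variances) is retained when F falls between the two critical values.
same_variance_plausible = f_lower < f_ratio < f_upper
print(round(var_a, 2), round(var_b, 2), round(f_ratio, 4), same_variance_plausible)
```

With unrounded variances the ratio comes out slightly different from the slide's 0.4634, which divides the two rounded values; the conclusion is the same either way.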

Page 5: Comparison of Two Means

F distribution – right tail

0.025 see next page

Page 6: Comparison of Two Means

Tests involving two samples – comparing variances, F distribution

• $F_{0.025}(6,12)$ = 3.7283
• $F_{0.975}(6,12)$ = 0.1864

Page 7: Comparison of Two Means

Tests involving two samples – comparing variances, F-distribution

• Usually we have F-distribution tables for 0.01, 0.025 and 0.05, but not for 0.975!
• Given $F_{0.025}(6,12)$ = 3.7283, how do we find $F_{0.975}(6,12)$?
• The F distribution has the interesting property that the left-tail value for an F with $\nu_1$ and $\nu_2$ d.o.f. is the reciprocal of the right-tail value for an F with the d.o.f. reversed:
• F[left tail(A,B)] = 1/F[right tail(B,A)]

$$F_{\alpha}(\nu_1,\nu_2) = \frac{1}{F_{(1-\alpha)}(\nu_2,\nu_1)}$$

• $F_{0.975}(6,12) = 1/F_{(1-0.975)}(12,6)$
• $F_{0.975}(6,12) = 1/F_{0.025}(12,6)$ = 1/5.3662 = 0.18635
• Back to our null hypothesis test
• Since 0.18635 < 0.4634 < 3.7283, the F-statistic lies between the two critical values, so we accept (i.e. do not reject) the null hypothesis that the variances of controls and patients do not differ
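The reciprocal property follows from a short argument, sketched here under the assumption that $F_{\alpha}(\nu_1,\nu_2)$ denotes the value with right-tail area $\alpha$ (the convention the Excel values on these slides appear to follow):

```latex
\begin{aligned}
X \sim F(\nu_1,\nu_2) \;&\Longrightarrow\; \frac{1}{X} \sim F(\nu_2,\nu_1) \\
P(X > c) = 0.975 \;&\Longleftrightarrow\; P\!\left(\frac{1}{X} < \frac{1}{c}\right) = 0.975
  \;\Longleftrightarrow\; P\!\left(\frac{1}{X} > \frac{1}{c}\right) = 0.025 \\
\text{so}\quad \frac{1}{c} = F_{0.025}(\nu_2,\nu_1)
  \quad&\text{and}\quad c = F_{0.975}(\nu_1,\nu_2) = \frac{1}{F_{0.025}(\nu_2,\nu_1)}
\end{aligned}
```

With $\nu_1 = 6$ and $\nu_2 = 12$ this gives 1/5.3662 = 0.18635, the value used in the test.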

Page 8: Comparison of Two Means

Tests involving two samples – comparing variances, F-distribution

• Now, let us consider the inverse ratio

$$F = \frac{s_B^2}{s_A^2}$$

• The two choices should lead to the same conclusion, since the conclusion should not depend on which variance we put in the numerator and which in the denominator
• Control sample A has d.o.f. = 6 and variance = 2.66
• Patient sample B has d.o.f. = 12 and variance = 5.74
• F = 5.74/2.66 = 2.1579
• F distribution (right tail): $F_{0.025}(12,6)$ = 5.3662 (from Excel)
• $F_{0.975}(12,6)$ = 0.2682 (from Excel)
• Since 0.2682 < 2.1579 < 5.3662, the F-statistic lies between the two critical values, so we accept (i.e. do not reject) the null hypothesis that the variances of controls and patients do not differ

REMARK
• The two F-tests are reciprocal to each other
• That is, 0.18635 < 0.4634 < 3.7283
• Taking reciprocals: 1/0.18635 > 1/0.4634 > 1/3.7283, i.e. 5.3662 > 2.1579 > 0.2682
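The remark that both orderings of the ratio give the same decision can be checked numerically. The following is a minimal sketch using only the rounded variances and the Excel critical values quoted on the slides; the helper `within` and the variable names are hypothetical.

```python
def within(lower, value, upper):
    """Return True when value lies strictly between the two critical values."""
    return lower < value < upper

var_a, var_b = 2.66, 5.74        # variances of controls (A) and patients (B), from the slides

# Critical values quoted on the slides (from Excel), as (left tail, right tail).
crit_6_12 = (0.18635, 3.7283)    # F0.975(6,12), F0.025(6,12)
crit_12_6 = (0.2682, 5.3662)     # F0.975(12,6), F0.025(12,6)

decision_ab = within(crit_6_12[0], var_a / var_b, crit_6_12[1])   # F = s_A^2 / s_B^2
decision_ba = within(crit_12_6[0], var_b / var_a, crit_12_6[1])   # F = s_B^2 / s_A^2

# The bounds of one test are the reciprocals of the other's, in swapped order.
reciprocal_bounds = (
    abs(1 / crit_6_12[0] - crit_12_6[1]) < 1e-3
    and abs(1 / crit_6_12[1] - crit_12_6[0]) < 1e-3
)
print(decision_ab, decision_ba, reciprocal_bounds)
```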

Page 9: Comparison of Two Means

Tests involving two samples – comparing means

The expression levels of the gene AC002378 measured for the patients, P, and the controls, C, are given in the following:

geneID     P1     P2     P3     P4     P5     P6
AC002378   0.66   0.51   1.12   0.83   0.91   0.50

geneID     C1     C2     C3     C4     C5     C6
AC002378   0.41   0.57   -0.17  0.50   0.22   0.71

• F-test: $H_0$: $s_P^2 = s_C^2$, $H_1$: $s_P^2 \neq s_C^2$
• t-test: $H_0$: $\bar{x}_P = \bar{x}_C$, $H_1$: $\bar{x}_P \neq \bar{x}_C$
• Mean gene expression level of patients, $\bar{X}_P$ = 0.755
• Mean gene expression level of controls, $\bar{X}_C$ = 0.373
• $s_P^2$ = 0.059, $s_C^2$ = 0.097
• To test whether the two samples have the same variance or not, we perform the F-test at the 5% level
• F = 0.059/0.097 = 0.61, with d.o.f. = (5, 5) for the F-test (10 in total)
• $F_{0.025}(5,5)$ = 7.146, $F_{0.975}(5,5)$ = 0.1399
• F lies between 0.1399 and 7.146, so we accept (i.e. do not reject) the null hypothesis that the patients and controls have the same variance
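The summary statistics for AC002378 can be reproduced from the table with the standard `statistics` module. This is a sketch only: the critical values are the ones quoted on the slide, not computed, and the variable names are illustrative.

```python
from statistics import mean, variance

patients = [0.66, 0.51, 1.12, 0.83, 0.91, 0.50]   # P1..P6
controls = [0.41, 0.57, -0.17, 0.50, 0.22, 0.71]  # C1..C6

mean_p, mean_c = mean(patients), mean(controls)
var_p, var_c = variance(patients), variance(controls)   # sample variances, 5 d.o.f. each
f_ratio = var_p / var_c                                 # F = s_P^2 / s_C^2

# Critical values quoted on the slide: F0.975(5,5) and F0.025(5,5).
f_lower, f_upper = 0.1399, 7.146

equal_variance_plausible = f_lower < f_ratio < f_upper
print(round(mean_p, 3), round(mean_c, 3), round(var_p, 3), round(var_c, 3),
      round(f_ratio, 2), equal_variance_plausible)
```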

Page 10: Comparison of Two Means

Tests involving two samples – comparing means

• t-statistic of two independent samples with equal variances

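• The t-score is

$$t = \frac{(\bar{X}_P - \bar{X}_C) - (\mu_P - \mu_C)}{\sqrt{s_{pool}^2\left(\frac{1}{n_P} + \frac{1}{n_C}\right)}} = \frac{(0.755 - 0.373) - 0}{\sqrt{0.078\left(\frac{1}{6} + \frac{1}{6}\right)}} = 2.359$$

where

$$s_{pool}^2 = \frac{(n_P - 1)\,s_P^2 + (n_C - 1)\,s_C^2}{n_P + n_C - 2} = \frac{(6-1)\,0.059 + (6-1)\,0.097}{6 + 6 - 2} = 0.078$$

(The value 2.359 is obtained with unrounded means and variances; the rounded inputs shown give about 2.37.)

• The p-value, or the probability of obtaining such a value by chance, is 0.0400. This value is smaller than the significance level 0.05, and therefore we reject the null hypothesis: the gene AC002378 is expressed differently between cancer patients and healthy subjects.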
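The pooled-variance t-score above can be recomputed from the raw expression values as a sanity check. The sketch below uses only the standard library; the critical value 2.228 is the two-tailed 5% point of Student's t with 10 d.o.f. taken from a standard t table (an assumption, since the slide reports a p-value instead), and the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, variance

patients = [0.66, 0.51, 1.12, 0.83, 0.91, 0.50]
controls = [0.41, 0.57, -0.17, 0.50, 0.22, 0.71]
n_p, n_c = len(patients), len(controls)

# Pooled variance: d.o.f.-weighted average of the two sample variances.
s2_pool = ((n_p - 1) * variance(patients) + (n_c - 1) * variance(controls)) / (n_p + n_c - 2)

# t-score under H0 (difference of population means = 0).
t_score = (mean(patients) - mean(controls)) / sqrt(s2_pool * (1 / n_p + 1 / n_c))
dof = n_p + n_c - 2

# Two-tailed 5% critical value for 10 d.o.f. from a standard t table (not from the slides).
t_crit = 2.228
reject_h0 = abs(t_score) > t_crit
print(round(s2_pool, 3), round(t_score, 3), dof, reject_h0)
```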

Page 11: Comparison of Two Means

Tests involving two samples – comparing means

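• t-statistic of two independent samples with unequal variances
• The modified t-score is

$$t = \frac{(\bar{X}_P - \bar{X}_C) - (\mu_P - \mu_C)}{\sqrt{\frac{s_P^2}{n_P} + \frac{s_C^2}{n_C}}}$$

• The degrees of freedom need to be adjusted as

$$\nu = \frac{\left(\frac{s_P^2}{n_P} + \frac{s_C^2}{n_C}\right)^2}{\frac{\left(s_P^2/n_P\right)^2}{n_P - 1} + \frac{\left(s_C^2/n_C\right)^2}{n_C - 1}}$$

• This value is generally not an integer and needs to be rounded down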
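Purely as an illustration of the unequal-variance formulas, they can be applied to the AC002378 data from the earlier slide, even though the F-test there did not call for this variant. The sketch below uses only the standard library, and the variable names are hypothetical.

```python
from math import floor, sqrt
from statistics import mean, variance

patients = [0.66, 0.51, 1.12, 0.83, 0.91, 0.50]
controls = [0.41, 0.57, -0.17, 0.50, 0.22, 0.71]
n_p, n_c = len(patients), len(controls)

# Per-sample terms s^2 / n that appear in both formulas.
se2_p = variance(patients) / n_p
se2_c = variance(controls) / n_c

# Modified t-score under H0 (difference of population means = 0).
t_modified = (mean(patients) - mean(controls)) / sqrt(se2_p + se2_c)

# Adjusted degrees of freedom, then rounded down as the slide prescribes.
dof = (se2_p + se2_c) ** 2 / (se2_p ** 2 / (n_p - 1) + se2_c ** 2 / (n_c - 1))
dof_rounded = floor(dof)
print(round(t_modified, 3), round(dof, 2), dof_rounded)
```

Because both samples have the same size here, the modified t-score coincides with the pooled one; only the degrees of freedom change.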

Page 12: Comparison of Two Means

Chapter 11, p. 259

Page 13: Comparison of Two Means

Chapter 11, p. 264

Page 14: Comparison of Two Means

Chapter 11, p. 2268