Analysis of variance



Page 1: Analysis of variance

ONE-WAY ANALYSIS OF VARIANCE (ANOVA)

The one-way analysis of variance is used to compare the means of several groups, e.g. the mean Hb levels of patients with different types of sickle cell disease. The analysis is called one-way because the data are classified in just one way, in this case by the type of sickle cell disease. The method is based on assessing how much of the overall variation in the data is attributable to differences between the group means, and comparing this with the amount attributable to differences between individuals in the same group. Hence the name analysis of variance.
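The partition of variation described above can be written as a single identity. The notation here is an assumption introduced for illustration and does not appear in the original notes: $x_{ij}$ is observation $j$ in group $i$, $\bar{x}_i$ is the mean of group $i$, $\bar{x}$ is the overall mean, $n_i$ is the size of group $i$, and $K$ is the number of groups. The `\text` and `\underbrace` labels assume the `amsmath` package.

```latex
\[
\underbrace{\sum_{i=1}^{K}\sum_{j=1}^{n_i}\left(x_{ij}-\bar{x}\right)^2}_{\text{Total SS}}
=
\underbrace{\sum_{i=1}^{K} n_i\left(\bar{x}_i-\bar{x}\right)^2}_{\text{Between groups SS}}
+
\underbrace{\sum_{i=1}^{K}\sum_{j=1}^{n_i}\left(x_{ij}-\bar{x}_i\right)^2}_{\text{Within groups SS}}
\]
```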

Type of sickle cell disease

R     Hb SS    Hb S / beta thalassaemia    Hb SC
1      7.2      8.1                        10.7
2      7.7      9.2                        11.3
3      8.0     10.0                        11.5
4      8.1     10.4                        11.6
5      8.3     10.6                        11.7
6      8.4     10.9                        11.8
7      8.4     11.1                        12.0
8      8.5     11.9                        12.1
9      8.6     12.0                        12.3
10     8.7     12.1                        12.6
11     9.1      -                          12.6
12     9.1      -                          13.3
13     9.1      -                          13.3
14     9.8      -                          13.8
15    10.1      -                          13.9
16    10.3      -                           -

$\sum X = 430.2$

$\sum X^2 = 4651.8$

K = number of groups = 3; N = total number of observations = 41.

H0: mean1 = mean2 = mean3
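The summary quantities above can be checked directly from the data table. The following is a minimal Python sketch using only built-in functions; the variable names (`groups`, `sum_x`, `sum_x2`) are illustrative and not part of the original notes.

```python
# Hb values transcribed from the data table, one list per type of
# sickle cell disease.
groups = {
    "Hb SS": [7.2, 7.7, 8.0, 8.1, 8.3, 8.4, 8.4, 8.5,
              8.6, 8.7, 9.1, 9.1, 9.1, 9.8, 10.1, 10.3],
    "Hb S / beta thalassaemia": [8.1, 9.2, 10.0, 10.4, 10.6,
                                 10.9, 11.1, 11.9, 12.0, 12.1],
    "Hb SC": [10.7, 11.3, 11.5, 11.6, 11.7, 11.8, 12.0, 12.1,
              12.3, 12.6, 12.6, 13.3, 13.3, 13.8, 13.9],
}

K = len(groups)                                    # number of groups
N = sum(len(values) for values in groups.values()) # total observations
sum_x = sum(x for values in groups.values() for x in values)
sum_x2 = sum(x * x for values in groups.values() for x in values)

# Per-group size, sum and mean, which feed the between groups SS later.
for name, values in groups.items():
    n = len(values)
    total = sum(values)
    print(f"{name}: n = {n}, sum = {total:.1f}, mean = {total / n:.2f}")

print(f"K = {K}, N = {N}, sum X = {sum_x:.1f}, sum X^2 = {sum_x2:.1f}")
```

The totals printed on the last line should agree with the values quoted in the notes.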


Page 2: Analysis of variance

Steps to calculate F statistic

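1. Total SS = $\sum X^2 - \frac{(\sum X)^2}{N}$

2. Between groups SS = $\frac{(\sum X_1)^2}{n_1} + \frac{(\sum X_2)^2}{n_2} + \frac{(\sum X_3)^2}{n_3} - \frac{(\sum X)^2}{N}$, where $\sum X_i$ is the sum of the observations in group $i$ and $n_i$ is the number of observations in that group.

3. Within groups SS (residual, error) = Total SS − Between groups SS

4. Between groups mean square (MS) = Between groups SS / (K − 1)

5. Within groups MS = Within groups SS / (N − K)

6. F = Between groups MS / Within groups MS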
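The six steps above can be sketched in Python as follows. This is an illustrative implementation of the hand-calculation formulas, not a replacement for a statistics package; the function name `one_way_anova` and the dictionary keys are assumptions made for this example. It does not compute the P value, which requires the F distribution.

```python
def one_way_anova(groups):
    """Apply the six hand-calculation steps to a list of groups."""
    all_values = [x for g in groups for x in g]
    n_total = len(all_values)                      # N
    k = len(groups)                                # K
    correction = sum(all_values) ** 2 / n_total    # (sum X)^2 / N

    # Step 1: total SS
    total_ss = sum(x * x for x in all_values) - correction
    # Step 2: between groups SS
    between_ss = sum(sum(g) ** 2 / len(g) for g in groups) - correction
    # Step 3: within groups SS
    within_ss = total_ss - between_ss
    # Steps 4 and 5: mean squares
    between_ms = between_ss / (k - 1)
    within_ms = within_ss / (n_total - k)
    # Step 6: F statistic
    f_stat = between_ms / within_ms

    return {
        "total_ss": total_ss, "between_ss": between_ss,
        "within_ss": within_ss,
        "between_df": k - 1, "within_df": n_total - k,
        "between_ms": between_ms, "within_ms": within_ms,
        "f": f_stat,
    }


# Hb values for the three types of sickle cell disease.
hb_ss = [7.2, 7.7, 8.0, 8.1, 8.3, 8.4, 8.4, 8.5,
         8.6, 8.7, 9.1, 9.1, 9.1, 9.8, 10.1, 10.3]
hb_s_beta_thal = [8.1, 9.2, 10.0, 10.4, 10.6, 10.9, 11.1, 11.9, 12.0, 12.1]
hb_sc = [10.7, 11.3, 11.5, 11.6, 11.7, 11.8, 12.0, 12.1,
         12.3, 12.6, 12.6, 13.3, 13.3, 13.8, 13.9]

result = one_way_anova([hb_ss, hb_s_beta_thal, hb_sc])
for key, value in result.items():
    print(f"{key}: {value:.3f}")
```

Carried out without intermediate rounding, the calculation gives an F of about 50.0; a value of 49.99 results when F is computed from mean squares already rounded to 49.94 and 0.999.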

ANOVA table

Source of variation    SS        d.f.    MS = SS/d.f.    F = Between groups MS / Within groups MS
Between groups          99.89     2      49.94           49.99, P < 0.001
Within groups           37.96    38       0.999
Total                  137.85    40


Page 3: Analysis of variance

Assumptions

1. The data are normally distributed. If the observations are extremely skewed, especially in small samples, the Kruskal-Wallis nonparametric procedure should be used instead.

2. The population standard deviation between individuals is the same in each group. This common standard deviation is estimated by the square root of the within groups mean square. Moderate departures from normality may be safely ignored, but the effect of unequal standard deviations may be serious. Levene's test of homogeneity of variances can be used to check this assumption.
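As a rough illustration of the second assumption, the sketch below compares the standard deviation of each group with the common estimate, the square root of the within groups mean square. It also computes a Levene-type statistic, taken here in its mean-centred form: the one-way ANOVA F statistic applied to the absolute deviations of each observation from its own group mean. The helper name `anova_parts` is hypothetical. Only the statistic is computed; obtaining a P value would require comparing it with the F distribution on (K − 1, N − K) degrees of freedom, which is left to a statistics package.

```python
from math import sqrt
from statistics import mean, stdev

# Hb values for the three types of sickle cell disease.
hb_ss = [7.2, 7.7, 8.0, 8.1, 8.3, 8.4, 8.4, 8.5,
         8.6, 8.7, 9.1, 9.1, 9.1, 9.8, 10.1, 10.3]
hb_s_beta_thal = [8.1, 9.2, 10.0, 10.4, 10.6, 10.9, 11.1, 11.9, 12.0, 12.1]
hb_sc = [10.7, 11.3, 11.5, 11.6, 11.7, 11.8, 12.0, 12.1,
         12.3, 12.6, 12.6, 13.3, 13.3, 13.8, 13.9]
groups = [hb_ss, hb_s_beta_thal, hb_sc]


def anova_parts(groups):
    """Return (F statistic, within groups MS) for a list of groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    group_means = [mean(g) for g in groups]
    between_ss = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    within_ss = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    between_ms = between_ss / (len(groups) - 1)
    within_ms = within_ss / (len(all_values) - len(groups))
    return between_ms / within_ms, within_ms


# Standard deviation within each group, and the common (pooled) estimate.
group_sds = [stdev(g) for g in groups]
_, within_ms = anova_parts(groups)
pooled_sd = sqrt(within_ms)

# Levene-type statistic: ANOVA on absolute deviations from group means.
abs_devs = []
for g in groups:
    m = mean(g)
    abs_devs.append([abs(x - m) for x in g])
levene_w, _ = anova_parts(abs_devs)

print("group SDs:", [round(sd, 3) for sd in group_sds])
print("pooled SD:", round(pooled_sd, 3))
print("Levene W:", round(levene_w, 3))
```

A large value of the statistic relative to the F distribution would suggest that the group standard deviations differ. A common variant (Brown-Forsythe) centres on the group medians rather than the means.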

Relationship with the two sample t test

One-way ANOVA is an extension of the two-sample t test. When there are only 2 groups, it gives exactly the same results as the t test. The F value equals the square of the corresponding t value, and the percentage points of the F distribution with (1, N-2) degrees of freedom are the squares of the corresponding percentage points of the t distribution with N-2 degrees of freedom.
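The identity F = t² can be checked numerically. The sketch below is an illustration only: it takes two of the three groups from the sickle cell data (Hb SS and Hb S / beta thalassaemia), computes the two-sample t statistic with a pooled variance, and compares its square with the one-way ANOVA F statistic for the same two groups. Variable names are assumptions made for this example.

```python
from math import sqrt
from statistics import mean

hb_ss = [7.2, 7.7, 8.0, 8.1, 8.3, 8.4, 8.4, 8.5,
         8.6, 8.7, 9.1, 9.1, 9.1, 9.8, 10.1, 10.3]
hb_s_beta_thal = [8.1, 9.2, 10.0, 10.4, 10.6, 10.9, 11.1, 11.9, 12.0, 12.1]

n1, n2 = len(hb_ss), len(hb_s_beta_thal)
m1, m2 = mean(hb_ss), mean(hb_s_beta_thal)

# Sums of squared deviations about each group mean.
ss1 = sum((x - m1) ** 2 for x in hb_ss)
ss2 = sum((x - m2) ** 2 for x in hb_s_beta_thal)

# Two-sample t statistic with pooled variance on N - 2 degrees of freedom.
pooled_var = (ss1 + ss2) / (n1 + n2 - 2)
t_stat = (m1 - m2) / sqrt(pooled_var * (1 / n1 + 1 / n2))

# One-way ANOVA with K = 2: between groups d.f. is 1, so the between
# groups MS equals the between groups SS; the within groups MS is the
# pooled variance.
grand_mean = mean(hb_ss + hb_s_beta_thal)
between_ms = n1 * (m1 - grand_mean) ** 2 + n2 * (m2 - grand_mean) ** 2
f_stat = between_ms / pooled_var

print(f"t = {t_stat:.4f}, t^2 = {t_stat ** 2:.4f}, F = {f_stat:.4f}")
```

The two printed values, t² and F, should agree up to floating-point rounding.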
