
Analysis of Variance (ANOVA)

EPP 245 Statistical Analysis of Laboratory Data

November 2, 2006 EPP 245 Statistical Analysis of Laboratory Data


The Basic Idea

• The analysis of variance is a way of testing whether observed differences between groups are too large to be explained by chance variation alone.

• One-way ANOVA is used when there are k ≥ 2 groups for one factor, and no other quantitative variable or classification factor.


 A   B   C
 9  10  12
 7   9  14
 7   8  14
 9   9  12


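Data = Grand Mean + Group Deviations from the grand mean + Individual Deviations from the group mean

Are the group deviations from the grand mean too big to be accounted for by the individual deviations from the group means? Here the groups are the columns A, B, and C, and each group is one cell of the layout, so the group means are the cell means.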


 A   B   C
 9  10  12
 7   9  14
 7   8  14
 9   9  12

Data


 A   B   C
 8   9  13
 8   9  13
 8   9  13
 8   9  13

Cell Means


 A   B   C
 1   1  -1
-1   0   1
-1  -1   1
 1   0  -1

Deviations from Cell Means
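The decomposition shown in these tables can be checked numerically. The following is a minimal sketch in Python (it is not part of the original Stata session); the variable names are illustrative, and it assumes the columns A, B, and C are the three groups.

```python
# Toy data from the slides: three groups (columns A, B, C), four observations each.
data = {
    "A": [9, 7, 7, 9],
    "B": [10, 9, 8, 9],
    "C": [12, 14, 14, 12],
}

all_values = [y for ys in data.values() for y in ys]
grand_mean = sum(all_values) / len(all_values)

# Group ("cell") means, and the deviation of each observation from its group mean.
group_means = {g: sum(ys) / len(ys) for g, ys in data.items()}
residuals = {g: [y - group_means[g] for y in ys] for g, ys in data.items()}

# Sums of squares: between groups, within groups, and total.
ss_between = sum(len(ys) * (group_means[g] - grand_mean) ** 2 for g, ys in data.items())
ss_within = sum(r ** 2 for rs in residuals.values() for r in rs)
ss_total = sum((y - grand_mean) ** 2 for y in all_values)

df_between = len(data) - 1               # k - 1
df_within = len(all_values) - len(data)  # N - k

# F statistic: between-group mean square over within-group mean square.
f_stat = (ss_between / df_between) / (ss_within / df_within)

print(grand_mean, group_means)
print(ss_between, ss_within, ss_total)
print(round(f_stat, 2))
```

For these data the grand mean is 10, the between-group sum of squares is 56 on 2 degrees of freedom, and the within-group sum of squares is 10 on 9 degrees of freedom, so F = 28 / (10/9) = 25.2. The two sums of squares add up to the total of 66.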


Red cell folate data

Description:

22 rows and 2 columns. Data on red cell folate levels in patients receiving three different methods of ventilation during anesthesia.

Format:

folate a numeric vector. Folate concentration (μg/l).

ventilation a factor with levels 'N2O+O2,24h': 50% nitrous oxide and 50% oxygen, continuously for 24 hours; 'N2O+O2,op': 50% nitrous oxide and 50% oxygen, only during operation; 'O2,24h': no nitrous oxide, but 35-50% oxygen for 24 hours.


insheet using redcell.csv
summarize folate
tabulate ventilation
tabulate ventilation, summarize (folate)
graph box folate, over (ventilation)
graph export folate1.wmf
oneway folate ventilation
describe ventilation
encode ventilation, generate(dv)
describe dv


. summarize folate

    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
      folate |        22    283.2273    51.28439        206        392

. tabulate ventilation

ventilation |      Freq.     Percent        Cum.
------------+-----------------------------------
 N2O+O2,24h |          8       36.36       36.36
  N2O+O2,op |          9       40.91       77.27
     O2,24h |          5       22.73      100.00
------------+-----------------------------------
      Total |         22      100.00

. tabulate ventilation, summarize (folate)

            |          Summary of folate
ventilation |        Mean   Std. Dev.       Freq.
------------+------------------------------------
 N2O+O2,24h |     316.625   58.717088           8
  N2O+O2,op |   256.44444   37.121797           9
     O2,24h |         278   33.756481           5
------------+------------------------------------
      Total |   283.22727   51.284391          22


[Figure: box plots of folate by ventilation group (N2O+O2,24h; N2O+O2,op; O2,24h); vertical axis from 200 to 400.]


. oneway folate ventilation

                        Analysis of Variance
    Source              SS         df      MS            F     Prob > F
------------------------------------------------------------------------
Between groups      15515.7664      2   7757.88321      3.71     0.0436
 Within groups      39716.0972     19   2090.32091
------------------------------------------------------------------------
    Total           55231.8636     21   2630.08874

Bartlett's test for equal variances:  chi2(2) =   2.0951  Prob>chi2 = 0.351
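The one-way table can be reproduced from the group means, standard deviations, and counts reported by `tabulate ventilation, summarize (folate)`. The following Python sketch is a cross-check outside the Stata session, not the method Stata uses internally; the variable names are illustrative. It relies on the fact that, with 2 numerator degrees of freedom, the upper tail of the F distribution has a closed form.

```python
# Group summaries copied from the tabulate output:
# (mean, standard deviation, number of observations)
groups = {
    "N2O+O2,24h": (316.625, 58.717088, 8),
    "N2O+O2,op": (256.44444, 37.121797, 9),
    "O2,24h": (278.0, 33.756481, 5),
}

n_total = sum(n for _, _, n in groups.values())
grand_mean = sum(m * n for m, _, n in groups.values()) / n_total

# Between groups: weighted squared deviations of group means from the grand mean.
ss_between = sum(n * (m - grand_mean) ** 2 for m, _, n in groups.values())
# Within groups: pooled from the group variances.
ss_within = sum((n - 1) * s ** 2 for _, s, n in groups.values())

df_between = len(groups) - 1        # 3 groups -> 2
df_within = n_total - len(groups)   # 22 observations -> 19
f_stat = (ss_between / df_between) / (ss_within / df_within)

# Closed form valid only when the numerator has 2 degrees of freedom:
# P(F > f) = (1 + 2 f / df_within) ** (-df_within / 2)
p_value = (1 + 2 * f_stat / df_within) ** (-df_within / 2)

print(round(ss_between, 1), round(ss_within, 1))
print(round(f_stat, 2), round(p_value, 4))
```

Up to rounding of the printed summaries, this agrees with the Stata table: sums of squares of about 15515.8 and 39716.1, F = 3.71, and a p-value of 0.0436.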


. describe ventilation

              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
ventilation     str10  %10s

. encode ventilation, generate(dv)

. describe dv

              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
dv              long   %10.0g      dv

. anova folate dv

             Number of obs =      22     R-squared     =  0.2809
             Root MSE      =   45.72     Adj R-squared =  0.2052

    Source |  Partial SS    df       MS           F     Prob > F
-----------+----------------------------------------------------
     Model |  15515.7664     2  7757.88321       3.71     0.0436
           |
        dv |  15515.7664     2  7757.88321       3.71     0.0436
           |
  Residual |  39716.0972    19  2090.32091
-----------+----------------------------------------------------
     Total |  55231.8636    21  2630.08874


Two- and Multi-way ANOVA

• If there is more than one factor, the sum of squares can be decomposed according to each factor, and possibly according to interactions between factors.

• One can also have factors and quantitative variables in the same model (cf. analysis of covariance).

• All of these models are interpreted in the same way: each term's mean square is compared with the residual mean square by an F test.
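For a balanced two-way layout with one observation per cell and no interaction term, the decomposition in the first bullet can be written out as follows. The notation is introduced here for illustration and is not from the slides: the first factor has a levels, the second has b levels, and a dot in a subscript marks an index that has been averaged over.

```latex
y_{ij} = \mu + \alpha_i + \beta_j + \varepsilon_{ij},
\qquad i = 1,\dots,a, \quad j = 1,\dots,b

\underbrace{\sum_{i,j} \left(y_{ij} - \bar{y}_{\cdot\cdot}\right)^2}_{SS_{\mathrm{total}}}
= \underbrace{b \sum_{i} \left(\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot}\right)^2}_{SS_A}
+ \underbrace{a \sum_{j} \left(\bar{y}_{\cdot j} - \bar{y}_{\cdot\cdot}\right)^2}_{SS_B}
+ \underbrace{\sum_{i,j} \left(y_{ij} - \bar{y}_{i\cdot} - \bar{y}_{\cdot j} + \bar{y}_{\cdot\cdot}\right)^2}_{SS_{\mathrm{residual}}}
```

The corresponding degrees of freedom are ab - 1 for the total, a - 1 and b - 1 for the two factors, and (a - 1)(b - 1) for the residual.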


Heart rates after enalaprilat

Description:

36 rows and 3 columns. Data for nine patients with congestive heart failure before and shortly after administration of enalaprilat, in a balanced two-way layout.

Format:

hr a numeric vector. Heart rate in beats per minute.

subj a factor with levels '1' to '9'.

time a factor with levels '0' (before), '30', '60', and '120' (minutes after administration).


. drop _all

. insheet using heart.rate.csv
(4 vars, 36 obs)

. anova hr subj time

             Number of obs =      36     R-squared     =  0.9685
             Root MSE      =  3.5165     Adj R-squared =  0.9540

    Source |  Partial SS    df       MS           F     Prob > F
-----------+----------------------------------------------------
     Model |  9117.52778    11  828.866162      67.03     0.0000
           |
      subj |  8966.55556     8  1120.81944      90.64     0.0000
      time |  150.972222     3  50.3240741       4.07     0.0180
           |
  Residual |  296.777778    24  12.3657407
-----------+----------------------------------------------------
     Total |  9414.30556    35  268.980159
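Every summary figure in this output follows from the sums of squares and degrees of freedom in the table. The following Python sketch recomputes them as a cross-check; it is not part of the Stata session, and the variable names are illustrative.

```python
import math

# Sums of squares and degrees of freedom copied from the `anova hr subj time` table.
ss = {"subj": 8966.55556, "time": 150.972222, "residual": 296.777778}
df = {"subj": 8, "time": 3, "residual": 24}

ss_model = ss["subj"] + ss["time"]
ss_total = ss_model + ss["residual"]
df_total = sum(df.values())

ms_residual = ss["residual"] / df["residual"]

# Each factor's F statistic is its mean square divided by the residual mean square.
f_subj = (ss["subj"] / df["subj"]) / ms_residual
f_time = (ss["time"] / df["time"]) / ms_residual

# Fit summaries reported above the table.
r_squared = ss_model / ss_total
adj_r_squared = 1 - ms_residual / (ss_total / df_total)
root_mse = math.sqrt(ms_residual)

print(round(f_subj, 2), round(f_time, 2))
print(round(r_squared, 4), round(adj_r_squared, 4), round(root_mse, 4))
```

The results match the Stata output: F = 90.64 for subj and 4.07 for time, R-squared = 0.9685, adjusted R-squared = 0.9540, and root MSE = 3.5165.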


[Figure: heart rate by subject (1 to 9); vertical axis from 60 to 140.]


[Figure: heart rate by time (0, 30, 60, 120); vertical axis from 60 to 140.]


. anova hr subj

             Number of obs =      36     R-squared     =  0.9524
             Root MSE      = 4.07226     Adj R-squared =  0.9383

    Source |  Partial SS    df       MS           F     Prob > F
-----------+----------------------------------------------------
     Model |  8966.55556     8  1120.81944      67.59     0.0000
           |
      subj |  8966.55556     8  1120.81944      67.59     0.0000
           |
  Residual |      447.75    27  16.5833333
-----------+----------------------------------------------------
     Total |  9414.30556    35  268.980159

. predict hrhat
(option xb assumed; fitted values)

. generate hrres = hr - hrhat

. graph box hrres, over (time)

. graph export hrresxtime.wmf
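Plotting the residuals from the subject-only model against time is a way of looking for variation that the model has left in the residual. A short Python sketch, using only numbers copied from the two ANOVA tables for these data (variable names are illustrative), shows how much of that residual is attributable to time in this balanced layout.

```python
# Values copied from the ANOVA tables for the heart rate data.
ss_subj = 8966.55556
ss_time = 150.972222
ss_resid_subj_only = 447.75        # residual SS from `anova hr subj`, 27 df
ss_resid_subj_time = 296.777778    # residual SS from `anova hr subj time`, 24 df

# In this balanced layout, adding time moves its sum of squares (and its 3 df)
# out of the residual and leaves the subj sum of squares unchanged.
remaining = ss_resid_subj_only - ss_time

# The subj mean square is the same in both fits; only the denominator changes.
ms_subj = ss_subj / 8
f_subj_only = ms_subj / (ss_resid_subj_only / 27)
f_subj_time = ms_subj / (ss_resid_subj_time / 24)

print(round(remaining, 6))
print(round(f_subj_only, 2), round(f_subj_time, 2))
```

The residual sum of squares drops from 447.75 to 296.777778, exactly the 150.972222 assigned to time, and the F statistic for subj rises from 67.59 to 90.64 because the residual mean square is smaller.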


[Figure: box plots of the residuals hrres by time (0, 30, 60, 120); vertical axis from -5 to 15.]


. anova hr subj time

             Number of obs =      36     R-squared     =  0.9685
             Root MSE      =  3.5165     Adj R-squared =  0.9540

    Source |  Partial SS    df       MS           F     Prob > F
-----------+----------------------------------------------------
     Model |  9117.52778    11  828.866162      67.03     0.0000
           |
      subj |  8966.55556     8  1120.81944      90.64     0.0000
      time |  150.972222     3  50.3240741       4.07     0.0180
           |
  Residual |  296.777778    24  12.3657407
-----------+----------------------------------------------------
     Total |  9414.30556    35  268.980159

. rvfplot

. graph export hrrvf.wmf

. rvpplot subj

. graph export hrrvpsubj.wmf

. rvpplot time

. graph export hrrvptime.wmf


[Figure: residuals (vertical axis, -10 to 10) versus fitted values (horizontal axis, 60 to 140).]


[Figure: residuals (vertical axis, -10 to 10) versus subj (horizontal axis, 0 to 10).]


[Figure: residuals (vertical axis, -10 to 10) versus time (horizontal axis, 0 to 150).]