Analysis of Variance (ANOVA) Quantitative Methods in HPELS 440:210.

Analysis of Variance (ANOVA)

Quantitative Methods in HPELS

440:210

Agenda

- Introduction
- The Analysis of Variance (ANOVA)
- Hypothesis Tests with ANOVA
- Post Hoc Analysis
- Instat
- Assumptions

Introduction

Recall: There are two possible scenarios when obtaining two sets of data for comparison:

- Independent samples: The data in the first sample is completely INDEPENDENT from the data in the second sample.

- Dependent/Related samples: The two sets of data are DEPENDENT on one another. There is a relationship between the two sets of data.

Introduction

Three or more data sets?

- If the three or more sets of data are independent of one another: Analysis of Variance (ANOVA)

- If the three or more sets of data are dependent on one another: Repeated Measures ANOVA

Introduction: Terminology

- Factor: Synonym of independent variable

- Level: The treatment conditions that make up the factor or independent variable

Example: What is the effect of grade (1st, 2nd, 3rd) on IQ?

- Dependent variable: IQ
- Factor: Grade
- Levels (3): 1st, 2nd and 3rd grades

Introduction: Terminology

Between-Treatment Variance: Variance between the treatments/levels

As the between-treatment variance increases (with the within-treatment variance unchanged):

- The statistic increases
- The p-value decreases
- Greater chance of rejecting the H0

Introduction: Terminology

Within-Treatment Variance: Variance within the treatments/levels

As the within-treatment variance increases (with the between-treatment variance unchanged):

- The statistic decreases
- The p-value increases
- Lesser chance of rejecting the H0

Recall the Independent-Measures t-Test

If there was a large difference between the means (between variance), t got bigger

Why? $t = (M_1 - M_2) / s_{(M_1 - M_2)}$

The t formula can be thought of as a ratio of:

- Between variance: $M_1 - M_2$
- Within variance: $s_{(M_1 - M_2)}$

Several Scenarios can occur

-Small between variance

-Large within variance

-t = BV / WV = near zero value

Accept or reject the H0

-Large between variance

-Large within variance

-t = BV / WV = near value of 1.0

Accept or reject the H0

-Small between variance

-Small within variance

-t = BV / WV = near value of 1.0

Accept or reject the H0

-Large between variance

-Small within variance

-t = BV / WV = greater than 1.0

Accept or reject the H0

Introduction

The F-Ratio

The ANOVA test statistic (F) is a ratio of between variance to within variance

Distinction from the t-test: ANOVA can compare three or more groups

The F Distribution

- Plot all possible F-ratios: F distribution
- There is a family of F distributions
- As df increases, the distribution becomes narrower
- F-ratios are always positive in value: computed with two variances, and variances are always positive!
- F distribution is skewed
- Most values cluster around 1.0 (when the H0 is true)

Figure 13.8 (p 413)
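The properties listed above can be checked with a small simulation. This is only a sketch under stated assumptions: three groups of ten scores are drawn from one normal population (so the H0 is true), the population mean of 50 and SD of 10 are arbitrary illustrative values, and the helper name `f_ratio_equal_n` is hypothetical. It relies on the equal-sample-size shortcut in which MSbetween is n times the variance of the group means and MSwithin is the average of the group variances.

```python
import random
from statistics import mean, median, variance


def f_ratio_equal_n(groups):
    """F-ratio for groups that all have the same sample size n."""
    n = len(groups[0])
    # MS_between: n times the sample variance of the group means
    ms_between = n * variance([mean(g) for g in groups])
    # MS_within: average of the sample variances inside each group
    ms_within = mean([variance(g) for g in groups])
    return ms_between / ms_within


rng = random.Random(0)  # fixed seed so the run is repeatable
f_values = []
for _ in range(2000):
    # Every group comes from the same population, so H0 is true
    groups = [[rng.gauss(50, 10) for _ in range(10)] for _ in range(3)]
    f_values.append(f_ratio_equal_n(groups))

print("smallest F:", min(f_values))
print("median F:", median(f_values))
print("mean F:", mean(f_values))
```

All simulated F-ratios are positive, and the mean sits above the median, which is the right-hand skew described on the slide.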

ANOVA

Statistical Notation:

- $k$ = number of treatment conditions (levels)
- $n_x$ = number of scores (samples) in treatment level $x$
- $N$ = total number of scores; $N = kn$ if sample sizes are equal
- $T_x = \sum X$ for any given treatment level
- $G = \sum T$ (grand total of all scores)
- $MS$ = mean square = variance

ANOVA

Formula Considerations:

$SS_{between} = \sum (T^2/n) - G^2/N$

$SS_{within} = \sum SS_{inside\ each\ treatment}$

$SS_{total} = SS_{within} + SS_{between}$

$SS_{total} = \sum X^2 - G^2/N$

ANOVA

Formula Considerations:

$df_{total} = N - 1$

$df_{between} = k - 1$

$df_{within} = \sum (n - 1) = \sum df_{in\ each\ treatment}$

ANOVA

Formula Considerations:

$MS_{between} = s^2_{between} = SS_{between} / df_{between}$

$MS_{within} = s^2_{within} = SS_{within} / df_{within}$

$F = MS_{between} / MS_{within}$
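The SS, df, MS and F formulas above can be strung together in a few lines of Python. This is a minimal sketch, not the course's software procedure: the function name `one_way_anova`, the helper `ss_inside` and the three small data sets are hypothetical and chosen only so the arithmetic is easy to follow by hand.

```python
from statistics import mean


def ss_inside(scores):
    """Sum of squared deviations of the scores from their own mean."""
    m = mean(scores)
    return sum((x - m) ** 2 for x in scores)


def one_way_anova(groups):
    """Compute SS, df, MS and F for a list of independent samples."""
    k = len(groups)                      # number of treatment levels
    N = sum(len(g) for g in groups)      # total number of scores
    G = sum(sum(g) for g in groups)      # grand total of all scores

    # SS_between = sum(T^2 / n) - G^2 / N, with T the total of each treatment
    ss_between = sum(sum(g) ** 2 / len(g) for g in groups) - G ** 2 / N
    # SS_within = sum of the SS inside each treatment
    ss_within = sum(ss_inside(g) for g in groups)

    df_between = k - 1
    df_within = sum(len(g) - 1 for g in groups)

    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "df_between": df_between,
        "df_within": df_within,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "F": ms_between / ms_within,
    }


# Hypothetical scores for three treatment levels (not from the course text)
result = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(result)
```

With these made-up scores the treatment totals are 6, 9 and 18, so SSbetween = 26, SSwithin = 6 and F = 13 / 1 = 13.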

Independent-Measures Designs

Static-Group Comparison Design:

- Administer treatment to two or more groups and perform posttest
- Perform posttest on control group
- Compare groups

X1 O

X2 O

O

Independent-Measures Designs

Quasi-Experimental Pretest Posttest Control Group Design:

- Perform pretest on three or more groups
- Administer treatments to treatment groups
- Perform posttests on all groups
- Compare delta (Δ) scores

O X1 O Δ

O X2 O Δ

O O Δ

Independent-Measures Designs

Randomized Pretest Posttest Control Group Design:

- Randomly select subjects from three or more populations
- Perform pretest on all groups
- Administer treatments to treatment groups
- Perform posttests on all groups
- Compare delta (Δ) scores

R O X1 O Δ

R O X2 O Δ

R O O Δ

Hypothesis Test: ANOVA

Example 13.1 (p 415)

Overview:

- Researchers are interested in the effectiveness of different pain relievers (A, B and C) compared with a placebo (D)
- N = 20 subjects randomly assigned to the four treatments (n = 5 per treatment)
- Amount of time (s) each subject could withstand a painfully hot stimulus was measured

Hypothesis Test: ANOVA

Questions:

- What is the experimental design?
- What is the independent variable/factor? How many levels are there?
- What is the dependent variable?

Step 1: State Hypotheses

Non-Directional

H0: µA = µB = µC = µD

H1: At least one mean is different from the others

Directional?

Too many to list

Step 2: Set Criteria

Alpha (α) = 0.05

Critical Value:

Use F Distribution Table

Appendix B.4 (p 693)

Information Needed:

$df_{between} = k - 1 = 4 - 1 = 3$

$df_{within} = \sum (n - 1) = 4(5 - 1) = 16$

Step 3: Collect Data and Calculate Statistic

Total Sum of Squares

$SS_{total} = \sum X^2 - G^2/N$

$SS_{total} = 262 - 60^2/20$

$SS_{total} = 262 - 180$

$SS_{total} = 82$

Sum of Squares Between

$SS_{between} = \sum (T^2/n) - G^2/N$

$SS_{between} = 5^2/5 + 10^2/5 + 20^2/5 + 25^2/5 - 60^2/20$

$SS_{between} = (5 + 20 + 80 + 125) - 180$

$SS_{between} = 50$

Sum of Squares Within

$SS_{within} = \sum SS_{inside\ each\ treatment}$

$SS_{within} = 8 + 8 + 6 + 10$

$SS_{within} = 32$

Step 3: Collect Data and Calculate Statistic

Mean Square Between

$MS_{between} = SS_{between} / df_{between}$

$MS_{between} = 50 / 3$

$MS_{between} = 16.67$

Mean Square Within

$MS_{within} = SS_{within} / df_{within}$

$MS_{within} = 32 / 16$

$MS_{within} = 2$

F-Ratio

$F = MS_{between} / MS_{within}$

$F = 16.67 / 2$

$F = 8.33$
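The Step 3 arithmetic can be re-traced from the summary values alone. The sketch below uses only numbers that appear in the calculation above (the four treatment totals, ΣX² = 262, the four within-treatment SS values, n = 5); the variable names are illustrative, and the raw scores themselves are not reproduced here.

```python
# Summary values taken from the worked calculation above
treatment_totals = [5, 10, 20, 25]   # T for each of the four treatments
n = 5                                # scores per treatment
sum_x_squared = 262                  # sum of all squared scores
ss_inside_each = [8, 8, 6, 10]       # SS inside each treatment

k = len(treatment_totals)            # 4 treatment levels
N = k * n                            # 20 scores in total
G = sum(treatment_totals)            # grand total

ss_total = sum_x_squared - G ** 2 / N
ss_between = sum(t ** 2 / n for t in treatment_totals) - G ** 2 / N
ss_within = sum(ss_inside_each)

df_between = k - 1
df_within = k * (n - 1)

ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_ratio = ms_between / ms_within

print(f"SS total = {ss_total}, SS between = {ss_between}, SS within = {ss_within}")
print(f"F({df_between}, {df_within}) = {f_ratio:.2f}")
```

The run also confirms the additivity check SStotal = SSbetween + SSwithin (82 = 50 + 32).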

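Step 4: Make Decision

F = 8.33 exceeds the critical value for df = 3, 16 at α = 0.05 (3.24 in the F distribution table), so the H0 is rejected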

Post Hoc Analysis

What ANOVA tells us:

- Rejection of the H0 tells you that there is a high PROBABILITY that AT LEAST ONE difference exists SOMEWHERE

What ANOVA doesn’t tell us:

- Where the differences lie

Post hoc analysis is needed to determine which mean(s) is(are) different

Post Hoc Analysis

Post Hoc Tests: Additional hypothesis tests performed after a significant ANOVA test to determine where the differences lie.

Post hoc analysis IS NOT PERFORMED unless the initial ANOVA H0 was rejected!

Post Hoc Analysis

Type I Error

Type I error: Rejection of a true H0

Pairwise comparisons: Multiple post hoc tests comparing the means of all “pairwise combinations”

Problem: Each post hoc hypothesis test has a chance of type I error

As multiple tests are performed, the chance of type I error accumulates

Experimentwise alpha level: Overall probability of type I error that accumulates over a series of pairwise post hoc hypothesis tests

How is this accumulation of type I error controlled?

Two Methods

Bonferroni or Dunn’s Method:

- Perform multiple t-tests of desired comparisons or contrasts
- Make decision relative to α / # of tests
- This reduction of alpha will control for the inflation of type I error

Specific post hoc tests:

- Note: There are many different post hoc tests that can be used
- Our book only covers two (Tukey and Scheffe)
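To see how quickly type I error accumulates and what the Bonferroni/Dunn adjustment does about it, consider the sketch below. The function names are hypothetical, and the accumulation formula 1 − (1 − α)^c assumes the c tests are independent, which pairwise comparisons on shared data are not exactly; treat the result as an approximation.

```python
from math import comb


def pairwise_count(k):
    """Number of pairwise comparisons among k treatment means."""
    return comb(k, 2)


def experimentwise_alpha(alpha, tests):
    """Approximate overall type I error rate, ASSUMING independent tests."""
    return 1 - (1 - alpha) ** tests


def bonferroni_alpha(alpha, tests):
    """Per-test alpha after the Bonferroni/Dunn adjustment."""
    return alpha / tests


k = 4                      # e.g. four treatments
alpha = 0.05
tests = pairwise_count(k)  # every pair of means compared once

print("pairwise tests:", tests)
print("experimentwise alpha without correction:", round(experimentwise_alpha(alpha, tests), 4))
print("alpha per test with Bonferroni:", round(bonferroni_alpha(alpha, tests), 4))
```

With four treatments there are six pairwise tests, so each t-test would be judged against 0.05 / 6 instead of 0.05.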

Tukey’s Honestly Significant Difference (HSD) Test

Overview:

- Computes a single value that determines the minimum difference (HSD) between any two means necessary for rejection of the H0
- Compares the HSD value to all of the contrast results
- If the contrast result exceeds the HSD, the H0 of that particular contrast is rejected

Tukey’s HSD Calculation

Formulas:

Equal sample sizes

$HSD = q \sqrt{MS_{within} / n}$

Unequal sample sizes

$HSD = q \sqrt{(MS_{within}/2)(1/n_1 + 1/n_2)}$

Tukey’s HSD Calculations

Formula Considerations:

- q = value found in Table B.5 (p 696)
  - Left column: $df_{within}$
  - Top row: k treatments
  - Body: Regular font: α = 0.05; Bold font: α = 0.01
- $MS_{within}$ = value from ANOVA calculation
- n = number of subjects in each treatment
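Both HSD formulas are short enough to write as functions. In this sketch the function names and the input values (q = 3.0, MSwithin = 2.0, n = 8) are hypothetical and picked for easy arithmetic; in practice q still has to be looked up in the studentized range table.

```python
from math import sqrt


def tukey_hsd(q, ms_within, n):
    """HSD for equal sample sizes: q * sqrt(MS_within / n)."""
    return q * sqrt(ms_within / n)


def tukey_hsd_unequal(q, ms_within, n1, n2):
    """HSD for two groups of unequal size."""
    return q * sqrt((ms_within / 2) * (1 / n1 + 1 / n2))


# Hypothetical inputs; q must come from the studentized range table
hsd_equal = tukey_hsd(q=3.0, ms_within=2.0, n=8)
hsd_same_n = tukey_hsd_unequal(q=3.0, ms_within=2.0, n1=8, n2=8)
hsd_mixed = tukey_hsd_unequal(q=3.0, ms_within=2.0, n1=8, n2=4)

print(hsd_equal, hsd_same_n, hsd_mixed)
```

When n1 = n2 the unequal-size formula reduces to the equal-size one, and a smaller group raises the HSD, so a larger mean difference is needed for that contrast.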

Example 13.5 (p 427)

Step 1: State Hypotheses

Null

H0: µA = µB

H0: µA = µC

H0: µB = µC

Alternative

H1: µA ≠ µB

H1: µA ≠ µC

H1: µB ≠ µC

Step 2: Set Criteria

Alpha (α) = 0.05

Step 3: Calculate Statistic

Get q from Table B.5

Information needed:

$df_{within}$ = 24

k = 3

α = 0.05

q = 3.53

Calculate Tukey’s HSD Value

$HSD = q \sqrt{MS_{within} / n}$

$HSD = 3.53 \sqrt{4 / 9}$

$HSD = 2.35$

Step 4: Make Decision:

A significantly different from B: $|M_A - M_B| = 2.44 > 2.35$

A significantly different from C: $|M_A - M_C| = 4.00 > 2.35$

B not significantly different from C: $|M_B - M_C| = 1.56 < 2.35$

Table 13.6
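The decision step can be checked mechanically. This sketch plugs in the example's values (q = 3.53, MSwithin = 4, n = 9) and the three mean differences listed above; the dictionary layout and variable names are illustrative only.

```python
from math import sqrt

q, ms_within, n = 3.53, 4.0, 9
hsd = q * sqrt(ms_within / n)   # unrounded value is about 2.353

# Absolute mean differences for the three pairwise contrasts
mean_differences = {"A vs B": 2.44, "A vs C": 4.00, "B vs C": 1.56}

# A contrast is significant when its mean difference exceeds the HSD
decisions = {pair: abs(diff) > hsd for pair, diff in mean_differences.items()}

for pair, significant in decisions.items():
    verdict = "reject H0" if significant else "fail to reject H0"
    print(f"{pair}: {mean_differences[pair]:.2f} vs HSD {hsd:.3f} -> {verdict}")
```

Because the A vs B difference (2.44) is close to the HSD, small rounding changes in the HSD are worth watching, although they do not alter any of the three decisions here.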

Scheffe

Overview:

- Most conservative/cautious of all post hoc tests
- Uses an F-ratio (like ANOVA) on only two treatments

Controls for type I error:

- Uses the k value from the original ANOVA, thus the numerator df ($df_{between}$) of the F-ratio for the Scheffe test is k – 1
- Uses same critical value used for the ANOVA

Calculation of Scheffe is identical to the ANOVA, however:

- $SS_{between}$ uses only the two means (treatment totals) of interest

Example 13.6 (p 428)

Step 1: State Hypotheses

Null

H0: µA = µB

H0: µA = µC

H0: µB = µC

Alternative

H1: µA ≠ µB

H1: µA ≠ µC

H1: µB ≠ µC

Step 2: Set Criteria

Alpha (α) = 0.05

Step 3: Calculate Statistic

Sum of squares between:

$SS_{between} = \sum (T^2/n) - G^2/N$

$SS_{between} = (27^2/9 + 49^2/9) - 76^2/18$

$SS_{between} = (81 + 266.78) - 320.89$

$SS_{between} = 26.89$

$SS_{within}$ from original ANOVA = 96

Critical Value 3.40

$df_{between}$ = 2

$df_{within}$ = 24

α = 0.05

Mean square between and within

$MS_{between} = SS_{between} / df_{between}$

$MS_{between} = 26.89 / 2 = 13.45$

$MS_{within}$ from original ANOVA = 4

$F = MS_{between} / MS_{within}$

$F = 13.45 / 4$

$F = 3.36$

Step 4: Make Decision

F = 3.36 < 3.40 (critical value)

Accept or reject?

Repeat for the other two contrasts:

H0: µA = µC

H0: µB = µC

df = 2, 24; critical value = 3.40
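The Scheffe calculation for a single contrast can be sketched as a small function, assuming equal sample sizes. The function name `scheffe_f` is hypothetical. The inputs for the first contrast (T = 27 and T = 49, n = 9, k = 3, MSwithin = 4) and the critical value 3.40 are the ones used above; the totals for the remaining two contrasts would have to be taken from the example's data table.

```python
def scheffe_f(t1, t2, n, k, ms_within):
    """F-ratio for one Scheffe contrast, assuming equal sample sizes n."""
    g = t1 + t2                                   # total of the two treatments
    # SS_between from the two treatments of interest only
    ss_between = t1 ** 2 / n + t2 ** 2 / n - g ** 2 / (2 * n)
    # df_between keeps k - 1 from the ORIGINAL ANOVA (this is the conservative part)
    ms_between = ss_between / (k - 1)
    return ms_between / ms_within


critical_value = 3.40   # df = 2, 24 at alpha = 0.05, as stated above
f_first_contrast = scheffe_f(27, 49, n=9, k=3, ms_within=4.0)
reject_h0 = f_first_contrast > critical_value

print(f"F = {f_first_contrast:.2f}, reject H0: {reject_h0}")
```

Dividing by k − 1 = 2 rather than by 1 (the df a two-group comparison would normally have) halves MSbetween, which is why this contrast falls just short of the critical value.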

Instat

- Type dependent variable data from the three or more samples into one column:
  - Label column appropriately
- In a second column, type in the grouping variable (independent variable) next to each data point:
  - Label column appropriately
- Convert the grouping column into a “factor” column:
  - Highlight the grouping column
  - Choose “Manage”
  - Choose “Column Properties”
  - Choose “Factor”
  - Select the appropriate column to be converted
  - Indicate the number of levels in the factor
  - Click OK

Instat

- Choose “Statistics”
- Choose “Analysis of Variance”
- Choose “One-Way”
- Y-Variate: Choose the dependent variable
- Factor: Choose the factor column or grouping/independent variable
- Plots: Not necessary to choose any
- Click OK
- Interpret the p-value!!!

Post Hoc Analysis:

- Perform multiple Independent-Measures t-Tests with the Bonferroni/Dunn correction method
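The two-column layout described above (one column of scores, one grouping column) is the usual "long" data format. The sketch below is not Instat itself; it only illustrates, with hypothetical scores and the hypothetical helper `split_by_factor`, how that layout maps onto one list of scores per factor level.

```python
from collections import defaultdict

# Hypothetical data in the two-column layout: (dependent variable, factor level)
rows = [
    (3, "A"), (5, "A"), (4, "A"),
    (6, "B"), (7, "B"), (8, "B"),
    (9, "C"), (8, "C"), (10, "C"),
]


def split_by_factor(rows):
    """Group the scores by the level named in the grouping column."""
    groups = defaultdict(list)
    for score, level in rows:
        groups[level].append(score)
    return dict(groups)


groups = split_by_factor(rows)
print("levels in the factor:", len(groups))
for level, scores in groups.items():
    print(level, scores)
```

The number of distinct labels in the grouping column is the number of levels requested when the column is converted to a factor.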

Reporting ANOVA Results

Information to include:

- Value of the F statistic
- Degrees of freedom:
  - Between: k – 1
  - Within: Σ(n – 1)
- p-value

Example:

A significant treatment effect was observed (F(3, 16) = 8.33, p < 0.01)

Reporting ANOVA Results

An ANOVA summary table is often included

Source     SS    df    MS
Between    50     3    16.67    F = 8.33
Within     32    16     2.00
Total      82    19
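A summary table like the one above can be assembled from just the SS and df values, which also makes the additivity checks explicit. This is an illustrative sketch: the function name `anova_summary` and the row layout are assumptions, and the p-value is left out because it has to come from software or a table.

```python
def anova_summary(ss_between, ss_within, df_between, df_within):
    """Build the rows of an ANOVA summary table plus the F-ratio."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    rows = [
        ("Between", ss_between, df_between, ms_between),
        ("Within", ss_within, df_within, ms_within),
        # Totals are simply the sums of the two components
        ("Total", ss_between + ss_within, df_between + df_within, None),
    ]
    return rows, ms_between / ms_within


rows, f_ratio = anova_summary(ss_between=50, ss_within=32, df_between=3, df_within=16)

print(f"{'Source':<10}{'SS':>6}{'df':>6}{'MS':>8}")
for source, ss, df, ms in rows:
    ms_text = "" if ms is None else f"{ms:.2f}"
    print(f"{source:<10}{ss:>6}{df:>6}{ms_text:>8}")

report = f"F({rows[0][2]}, {rows[1][2]}) = {f_ratio:.2f}"
print(report)
```

The degrees of freedom in the written report should match the Between and Within rows of the table.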

Assumptions of ANOVA

- Independent Observations
- Normal Distribution
- Scale of Measurement: Interval or ratio
- Equal variances (homogeneity)

Violation of Assumptions

Nonparametric Version: Kruskal-Wallis Test (Chapter 17)

When to use the Kruskal-Wallis Test:

- Independent-Measures design with three or more groups
- Scale of measurement assumption violation: Ordinal data
- Normality assumption violation: Regardless of scale of measurement
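For orientation, the Kruskal-Wallis statistic works on ranks rather than raw scores. The sketch below uses the standard H formula, H = 12 / (N(N + 1)) · Σ(R²/n) − 3(N + 1), which is not given in these slides; the function names and data are hypothetical. Tied scores receive average ranks, but no tie correction is applied to H.

```python
def average_ranks(values):
    """Rank values from 1 to N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1          # mean of positions i+1 .. j+1
        for idx in order[i:j + 1]:
            ranks[idx] = tied_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """H statistic without tie correction (a simplifying assumption)."""
    pooled = [x for g in groups for x in g]
    ranks = average_ranks(pooled)
    N = len(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])   # R for this group
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12 / (N * (N + 1)) * total - 3 * (N + 1)


# Hypothetical ordinal scores for three independent groups
h = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(f"H = {h:.2f}")
```

For these made-up groups the rank sums are 6, 15 and 24, giving H = 7.2; H is then compared with a chi-square critical value with k − 1 degrees of freedom.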

Textbook Assignment

Problems: 3, 5, 17a, 21