General Linear Model and 1-Way ANOVA. - University of Illinois at


Multivariate Relationships and Multiple Linear Regression Slide 1 of 119

The General Linear Model & ANOVA

Edpsy 580

Carolyn J. Anderson

Department of Educational Psychology

University of Illinois at Urbana-Champaign


Outline

■ Introduction

◆ What is the General Linear Model?

◆ Explanatory variables.

◆ A couple of (quick) examples.

■ One-Factor ANOVA (fixed effects model)

◆ Introduction.

◆ As a linear model.

◆ Hypothesis testing.

◆ Example.

■ More Examples


The General Linear Model

■ A General & unifying framework.

◆ Simple linear and multiple regression.

◆ Analysis of Variance (ANOVA).

◆ Analysis of Covariance (ANCOVA).

◆ Other experimental designs.

■ Can be extended to

◆ Generalized Linear model.

◆ Multivariate general linear model.

◆ Random coefficients linear models.

◆ Random coefficients generalized linear models.


The General Linear Model

■ Basic linear form:

$Y_i = \beta_o x_{io} + \beta_1 x_{i1} + \beta_2 x_{i2} + \ldots + \epsilon_i$

■ Fixed:

◆ $x_{io}, x_{i1}, x_{i2}, \ldots$ are values of the explanatory (predictor, independent) variables for individual $i$.

◆ $\beta_o, \beta_1, \beta_2, \ldots$ are population parameters.

■ Random:

◆ $Y_i$ is the quantitative or numerical response (outcome, dependent) variable for individual $i$.

◆ $\epsilon_i$ is the “error” for individual $i$.
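As a minimal sketch of the basic linear form, the Python snippet below computes the systematic part (the sum of each parameter times its explanatory value) for a few individuals and then adds an error term. The parameter values, explanatory values, and errors are hypothetical numbers chosen only for illustration; none of them come from the slides.

```python
# Hypothetical population parameters (beta_o, beta_1, beta_2); not from the slides.
beta = [2.0, 0.5, -1.0]

# Hypothetical explanatory values; the first entry x_io = 1 is the constant.
x = [
    [1.0, 4.0, 1.0],
    [1.0, 6.0, 0.0],
    [1.0, 8.0, 2.0],
]

# Hypothetical errors, one per individual (in practice these are unobserved).
errors = [0.3, -0.2, 0.1]


def systematic_part(x_row, beta):
    """Return beta_o*x_io + beta_1*x_i1 + beta_2*x_i2 + ... for one individual."""
    return sum(b * xv for b, xv in zip(beta, x_row))


# Y_i = systematic part + error_i
y = [systematic_part(row, beta) + e for row, e in zip(x, errors)]

for i, (row, yi) in enumerate(zip(x, y), start=1):
    print(f"individual {i}: systematic = {systematic_part(row, beta):.2f}, Y = {yi:.2f}")
```

The explanatory values and parameters are treated as fixed; only the error, and therefore the response, varies from sample to sample.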


Error, $\epsilon_i$

■ $Y_i$ is random because $\epsilon_i$ is random.

■ Standard assumption:

$E(\epsilon_i) = 0$ and $\mathrm{var}(\epsilon_i) = \sigma^2_\epsilon$

and for statistical inference $\epsilon_i$ is normal.

■ Sources of variability: $\epsilon_i$ consists of effects due to

◆ Sampling.

◆ Measurement imperfections.

◆ Individual differences.

◆ Uncontrolled variability.

◆ Unsystematic error.


“Linear in the parameters”

Linear or non-linear?

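$Y_i = \beta_o + \beta_1 x_{i1} + \epsilon_i$

$Y_i = \beta_o x_{io} + \beta_1 x_{i1} + \beta_2 x_{i2}^2 + \epsilon_i$

$Y_i = \beta_o + \beta_1 \log(x_{i1}) + \epsilon_i$

$Y_i = e^{(\beta_o + \beta_1 x_{i1} + \beta_2 x_{i2} + \epsilon_i)}$

$Y_i = \beta_o + x_{i1}^{\beta_1} + \epsilon_i$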
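One way to explore the question is a numerical superposition check. For fixed explanatory values, a mean function that is linear in the parameters satisfies f(a·β + c·γ) = a·f(β) + c·f(γ). The sketch below applies that check to the systematic part of each of the five models (error term omitted). The parameter vectors, scalars, and x values are arbitrary choices, and a check at one point is only a quick diagnostic, not a proof.

```python
import math

# Mean functions (the systematic part, error omitted) of the five models above.
# b = (beta_o, beta_1, beta_2); x = (x_i1, x_i2); x_io = 1 throughout.
models = {
    "b0 + b1*x1": lambda b, x: b[0] + b[1] * x[0],
    "b0 + b1*x1 + b2*x2^2": lambda b, x: b[0] + b[1] * x[0] + b[2] * x[1] ** 2,
    "b0 + b1*log(x1)": lambda b, x: b[0] + b[1] * math.log(x[0]),
    "exp(b0 + b1*x1 + b2*x2)": lambda b, x: math.exp(b[0] + b[1] * x[0] + b[2] * x[1]),
    "b0 + x1^b1": lambda b, x: b[0] + x[0] ** b[1],
}


def is_linear_in_parameters(f, x, tol=1e-9):
    """Check superposition f(a*b + c*g) == a*f(b) + c*f(g) at one test point."""
    b, g = (1.0, 2.0, 3.0), (-0.5, 0.7, 1.5)   # arbitrary parameter vectors
    a, c = 2.0, -3.0                           # arbitrary scalars
    combo = tuple(a * bi + c * gi for bi, gi in zip(b, g))
    return abs(f(combo, x) - (a * f(b, x) + c * f(g, x))) < tol


x = (2.0, 3.0)  # arbitrary fixed values of the explanatory variables
results = {name: is_linear_in_parameters(f, x) for name, f in models.items()}
for name, linear in results.items():
    print(f"{name:28s} linear in parameters: {linear}")
```

The first three models pass even though two of them use non-linear functions of x. The exponential model fails as written, but taking logs of both sides gives log(Y_i) = β_o + β_1 x_i1 + β_2 x_i2 + ε_i, which is linear in the parameters.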


The General Linear Model

■ “Smooths” the data.

■ Summary, description.

■ Prediction.

■ Better (smaller) standard errors for means.

■ Hypothesis testing.


The Explanatory Variables

■ Quantitative, e.g.

◆ Age.

◆ Grade.

◆ Pre-test score.

■ Qualitative, e.g.

◆ Season (winter, spring, summer, fall).

◆ Teaching materials (text, web, or both).

◆ Statistics text (standard, low explanation, high explanation).

◆ Type of writing (narrative, summary, argument).


Quantitative Explanatory Variables

$y_i = \beta_o + \beta_1 x_i + \epsilon_i \longrightarrow y_i = 1.2 + 5.4 x_i$


Qualitative Explanatory Variable: hot dogs

Hot dog eaters who are also concerned with their health may prefer hot dogs that are lower in calories (and salt). The data used in this example consist of calories contained in each of 54 major hot dog brands. The hot dogs are classified by type:

■ Beef

■ Meat (mostly pork and beef, but up to 15% poultry meat)

■ Poultry

Data are from Consumer Reports, June 1986, pp. 366-367.

Summary statistics:

Type      n    Sum       Mean     Variance   Std Dev
Beef      20   3137.00   156.85   512.66     22.64
Meat      17   2698.00   158.71   636.85     25.24
Poultry   17   2019.00   118.76   508.57     22.55
Total     54   7854.00   145.44   863.38     29.38

Do different types of hot dogs differ in terms of calories?
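Before modeling, it can help to confirm how the columns of the summary table relate to one another. The sketch below recomputes each mean as sum/n and each standard deviation as the square root of the variance, using only the numbers printed in the table. The raw brand-level data are not reproduced in the slides, so nothing beyond the table is assumed.

```python
import math

# Summary statistics copied from the table above: (n, sum, mean, variance, std dev).
summary = {
    "Beef":    (20, 3137.00, 156.85, 512.66, 22.64),
    "Meat":    (17, 2698.00, 158.71, 636.85, 25.24),
    "Poultry": (17, 2019.00, 118.76, 508.57, 22.55),
}

for hot_dog_type, (n, total, mean, variance, std_dev) in summary.items():
    # The mean is the sum divided by n; the std dev is the square root of the variance.
    print(f"{hot_dog_type:8s} sum/n = {total / n:.2f} (table: {mean}), "
          f"sqrt(var) = {math.sqrt(variance):.2f} (table: {std_dev})")

# The overall mean pools all brands: total calories over total count.
n_total = sum(row[0] for row in summary.values())
grand_mean = sum(row[1] for row in summary.values()) / n_total
print(f"Total    n = {n_total}, mean = {grand_mean:.2f}")
```

Because the group sizes are unequal, the overall mean is a weighted average of the three type means rather than their simple average.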


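GLM for Hot Dogs

Definitions of variables

■ Outcome variable: $Y_i$ = calories in a hot dog.

■ Constant: $x_{io} = 1$ for all hot dogs.

■ Hot dog type:

$x_{i1} = \begin{cases} 1 & \text{if beef} \\ 0 & \text{otherwise} \end{cases}$

$x_{i2} = \begin{cases} 1 & \text{if meat} \\ 0 & \text{otherwise} \end{cases}$

The General Linear Model,

$Y_i = \beta_o + \beta_1 x_{i1} + \beta_2 x_{i2} + \epsilon_i$

When we use our definitions of the variables,

$\text{calories}_i = \beta_o + \beta_1 + \epsilon_i$ if beef

$\text{calories}_i = \beta_o + \beta_2 + \epsilon_i$ if meat

$\text{calories}_i = \beta_o + \epsilon_i$ if poultry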
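With this coding, poultry acts as the reference category: the constant carries the poultry mean, and the other two parameters are differences from it. The sketch below illustrates this by plugging the sample means from the hot dog summary table in place of the unknown population parameters. That substitution is an assumption made here for illustration; the slides do not report fitted coefficients at this point.

```python
# Sample means from the hot dog summary table, used here in place of the
# unknown population parameters (an illustrative assumption).
mean_calories = {"beef": 156.85, "meat": 158.71, "poultry": 118.76}


def dummy_code(hot_dog_type):
    """Return (x_io, x_i1, x_i2): constant, beef indicator, meat indicator."""
    return (1, int(hot_dog_type == "beef"), int(hot_dog_type == "meat"))


# Poultry is the reference category, so beta_o is its mean and the other
# coefficients are differences from the poultry mean.
beta_o = mean_calories["poultry"]
beta_1 = mean_calories["beef"] - mean_calories["poultry"]
beta_2 = mean_calories["meat"] - mean_calories["poultry"]
beta = (beta_o, beta_1, beta_2)

# Systematic part of the model for each type.
fitted = {}
for hot_dog_type in mean_calories:
    x = dummy_code(hot_dog_type)
    fitted[hot_dog_type] = sum(b * xv for b, xv in zip(beta, x))
    print(hot_dog_type, x, round(fitted[hot_dog_type], 2))
```

Each type's systematic part reproduces its own mean, which is why two indicators plus a constant are enough to describe three groups.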


Hot Dogs with Alternative Coding

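■ Outcome variable: $Y_i$ = calories in a hot dog.

■ Constant: $x_{io} = 1$ for all hot dogs.

■ Hot dog type:

$x_{i1} = \begin{cases} 1 & \text{if beef} \\ 0 & \text{otherwise} \end{cases}$

$x_{i2} = \begin{cases} 1 & \text{if meat} \\ 0 & \text{otherwise} \end{cases}$

$x_{i3} = \begin{cases} 1 & \text{if poultry} \\ 0 & \text{otherwise} \end{cases}$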


GLM for Hot Dogs

The General Linear Model,

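$Y_i = \beta_o + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_3 x_{i3} + \epsilon_i$

which when using our definitions of the variables is

$\text{calories}_i = \beta_o + \beta_1 + \epsilon_i$ if beef

$\text{calories}_i = \beta_o + \beta_2 + \epsilon_i$ if meat

$\text{calories}_i = \beta_o + \beta_3 + \epsilon_i$ if poultry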

This is the standard linear model used in ANOVA.
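One consequence of using a constant plus an indicator for every type is that the parameters are not uniquely determined: the three indicators always add up to the constant. The sketch below shows two different parameter vectors that imply exactly the same three group means. Both vectors are illustrative choices built from the sample means in the summary table; they are not estimates reported in the slides.

```python
# Rows are (x_io, x_i1, x_i2, x_i3) for a beef, a meat, and a poultry hot dog.
coding = {
    "beef":    (1, 1, 0, 0),
    "meat":    (1, 0, 1, 0),
    "poultry": (1, 0, 0, 1),
}

# The three indicators always add up to the constant x_io = 1.
for hot_dog_type, (x0, x1, x2, x3) in coding.items():
    assert x1 + x2 + x3 == x0


def group_means(beta):
    """Systematic part beta_o + beta_j for each type under this coding."""
    return {t: sum(b * x for b, x in zip(beta, row)) for t, row in coding.items()}


# Two different (beta_o, beta_1, beta_2, beta_3) vectors, built from the sample
# means in the summary table, that imply exactly the same group means.
beta_a = (0.0, 156.85, 158.71, 118.76)                  # beta_o fixed at 0
shift = 145.44                                          # overall sample mean
beta_b = (shift, 156.85 - shift, 158.71 - shift, 118.76 - shift)

means_a = group_means(beta_a)
means_b = group_means(beta_b)
for t in coding:
    print(t, round(means_a[t], 2), round(means_b[t], 2))
```

Because of this redundancy, a constraint on the parameters is usually imposed, for example setting one of them to zero or requiring the group effects to sum to zero.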


Example 2 of Qualitative Variable

■ Let $Y_i$ = test score on exam.

■ A constant: $x_{io} = 1$ for all individuals.

■ Teaching Method: web or text based.

$x_{i1} = \begin{cases} 1 & \text{if web based} \\ 0 & \text{if text based} \end{cases}$

■ Simple linear model:

$Y_i = \beta_o x_{io} + \beta_1 x_{i1} + \epsilon_i$

$Y_i = \beta_o + \beta_1 + \epsilon_i$ if web based

$Y_i = \beta_o + \epsilon_i$ if text based


Alternative Coding

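■ Let $Y_i$ = test score on exam.

■ A constant: $x_{io} = 1$ for all individuals.

■ Teaching Method: web or text based.

$x_{i1} = \begin{cases} 1 & \text{if web based} \\ 0 & \text{if text based} \end{cases}$

$x_{i2} = \begin{cases} 0 & \text{if web based} \\ 1 & \text{if text based} \end{cases}$

■ The linear model,

$Y_i = \beta_o x_{io} + \beta_1 x_{i1} + \beta_2 x_{i2} + \epsilon_i$

$Y_i = \beta_o + \beta_1 + \epsilon_i$ if web based

$Y_i = \beta_o + \beta_2 + \epsilon_i$ if text based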


1–Way Analysis of Variance

Outline of Topics covered:

■ Introduction

◆ What it is and Why we need it.

◆ Experimental and Observational designs.

◆ Vocabulary, Terminology, Abbreviations, and Notation.

■ One–Factor ANOVA (fixed effects model).

◆ Least squares estimation.

◆ F–ratio and test.

◆ Summary.

■ Effect of violations of assumptions.

■ Power and Sensitivity of ANOVA.

■ More examples and other considerations.


What & Whys of ANOVA

Situation: One quantitative variable and one qualitative (discrete) variable:

■ Way to analyze the relationship between a quantitative variable and a qualitative (discrete) variable.

■ Generalization of the two independent groups t-test.

■ The hypotheses tested in ANOVA are

$H_o: \mu_1 = \mu_2 = \ldots = \mu_J$ versus $H_a$: not all equal

where $J$ equals the number of populations or groups.


Why ANOVA is Needed

■ Why not do all possible two independent groups t-tests?

■ Consider the hot dog example; we would need 3 tests:

$H_{o1}: \mu_{beef} = \mu_{meat}$, $H_{o2}: \mu_{beef} = \mu_{poultry}$,

and $H_{o3}: \mu_{meat} = \mu_{poultry}$

■ This strategy leads to $J(J-1)/2$ t-tests,

Number of levels (J)   Number of t–tests needed
4                      4(4-1)/2 = 6
5                      5(5-1)/2 = 10
6                      6(6-1)/2 = 15
...                    ...
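The count of pairwise tests can be checked by listing the pairs directly. The sketch below enumerates every unordered pair of levels with the standard library's `itertools.combinations` and compares the count with J(J − 1)/2. The helper name `pairwise_tests` is hypothetical, introduced only for this illustration.

```python
from itertools import combinations


def pairwise_tests(levels):
    """Return every unordered pair of levels, one per two-group t-test."""
    return list(combinations(levels, 2))


pairs = pairwise_tests(["beef", "meat", "poultry"])
for first, second in pairs:
    print(f"Ho: mu_{first} = mu_{second}")

# The number of pairs matches J(J - 1)/2 for each J.
for J in (3, 4, 5, 6):
    count = len(pairwise_tests(range(J)))
    print(J, count, J * (J - 1) // 2)
```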


Familywise Error Rate

■ Problem with all possible t-tests: If $\alpha = .05$ for each test, then the probability of making a Type I error (on at least one of the t-tests) is larger than .05.

■ The familywise error rate is the probability of making at least one Type I error in a set of tests.

■ How big the problem is depends on

◆ The number of t-tests performed.

◆ Whether the t-tests are statistically independent.


Familywise Error Rate

If t-tests are independent and $\alpha = .05$ for each one, then

$P(\text{at least one Type I error}) = 1 - P(\text{no Type I errors}) = 1 - (1 - \alpha)^K$

where $K$ = the number of t-tests performed.

$J = 3 \longrightarrow K = 3 \longrightarrow 1 - (.95)^3 = .14$

$J = 4 \longrightarrow K = 6 \longrightarrow 1 - (.95)^6 = .26$

$J = 5 \longrightarrow K = 10 \longrightarrow 1 - (.95)^{10} = .40$

$J = 6 \longrightarrow K = 15 \longrightarrow 1 - (.95)^{15} = .54$
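The four values above can be reproduced with a few lines of Python. This is a minimal sketch that assumes, as the slide does, that the K tests are independent and each uses the same α. The function name `familywise_error_rate` is hypothetical.

```python
def familywise_error_rate(alpha, num_tests):
    """P(at least one Type I error) for independent tests, each at level alpha."""
    return 1 - (1 - alpha) ** num_tests


alpha = 0.05
for J in (3, 4, 5, 6):
    K = J * (J - 1) // 2           # number of pairwise t-tests
    rate = familywise_error_rate(alpha, K)
    print(f"J = {J}, K = {K:2d}, familywise error rate = {rate:.2f}")
```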


Familywise Error Rate

■ Since the same data are used in more than one test, the tests are dependent, which makes computing the actual familywise error rate very complex.

■ We do not know what the familywise error rate really equals.

■ The maximum familywise error rate is

$P(\text{at least one Type I error}) \le K\alpha$

which means that . . . The minimum and maximum familywise error rates for different numbers of t-tests are

$K = 3$: $.14 \le P(\text{at least one Type I error}) \le 3(.05) = .15$

$K = 6$: $.26 \le P(\text{at least one Type I error}) \le 6(.05) = .30$

$K = 10$: $.40 \le P(\text{at least one Type I error}) \le 10(.05) = .50$

$K = 15$: $.54 \le P(\text{at least one Type I error}) \le 15(.05) = .75$

Solution: Test the equality between all means simultaneously & set the familywise error rate.
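The upper bound Kα holds whether or not the tests are independent; it follows from Boole's inequality. A short derivation, writing $A_k$ (a symbol introduced here, not used in the slides) for the event that test $k$ makes a Type I error, with $P(A_k) = \alpha$ for each test:

```latex
\begin{align*}
P(\text{at least one Type I error})
  &= P\left(\bigcup_{k=1}^{K} A_k\right) \\
  &\le \sum_{k=1}^{K} P(A_k) \\
  &= K\alpha .
\end{align*}
```

For K = 3 and α = .05 this gives 3(.05) = .15.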


Advantages of ANOVA

■ Control the familywise error rate when testing the equality of multiple populations’ means.

■ All of the data are used (in a single test) $\Rightarrow$ better estimates of population parameters.

■ Better estimates of population variance $\Rightarrow$ ANOVA has more power than all possible t-tests.

■ Multiple factors can be included.

◆ Effects due to different factors can be teased apart.

◆ Check for interaction effects.


Example of a 2-Way ANOVA

Example of multiple factor design: Wiley & Voss (1999). Constructing arguments from multiple sources: Tasks that promote understanding and not just memory for texts. Journal of Educational Psychology, 91, 301–311.

■ Response/dependent variable = understanding as measured by a 10-item inference verification test (IVT), $Y_i = \text{IVT}_i$.

■ Factors:

◆ Format (text or web)

◆ Instructions participants received: write a Narrative (N), Summary (S), Explanation (E), Argument (A).


Example of a 2-Way ANOVA

Summary statistics: $\bar{Y}$ and $s^2$, where $n_{cell} = 8$.

                       Instructions
Format           N        S        E        A        Total
Text   mean    71.25    72.5     68.75    73.75    71.56
       s^2    126.79   164.29    69.64   141.07
Web    mean    76.25    73.75    72.5     90.0     78.12
       s^2     55.36   255.36   107.14   114.29
Totals         73.75    73.13    70.63    81.88    74.84

From Myers & Well (2003). Research Design and Statistical Analysis.
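The row and column totals in the table are marginal means. Because every cell has the same number of observations, each marginal mean is the plain average of the relevant cell means. The sketch below recomputes them from the cell means printed in the table; no raw data are assumed, and the helper `average` is a hypothetical name.

```python
# Cell means from the table above (n = 8 per cell, so the design is balanced).
instructions = ["N", "S", "E", "A"]
cell_means = {
    "Text": [71.25, 72.5, 68.75, 73.75],
    "Web":  [76.25, 73.75, 72.5, 90.0],
}


def average(values):
    """Plain arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


# With equal cell sizes, each marginal mean is the plain average of cell means.
format_means = {fmt: average(row) for fmt, row in cell_means.items()}
instruction_means = {
    name: average([cell_means[fmt][k] for fmt in cell_means])
    for k, name in enumerate(instructions)
}
grand_mean = average(list(format_means.values()))

print(format_means)
print(instruction_means)
print(grand_mean)
```

With unequal cell sizes the marginal means would need to be weighted by the cell counts instead.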


Example of a 2-Way ANOVA

Possible null hypotheses that can be tested:

■ Main effects:

◆ Effect of Instruction:

$H_{o1}: \mu_N = \mu_S = \mu_E = \mu_A$

◆ Effect of Format:

$H_{o2}: \mu_{text} = \mu_{web}$

■ Interaction of instruction and format:

$H_{o3}: \mu_{N,text} = \mu_{N,web} = \mu_{E,text} = \mu_{E,web} = \mu_{S,text} = \mu_{S,web} = \mu_{A,text} = \mu_{A,web}$


Designs

■ One Factor . . . later two or more factors, “factorial designs.”

■ Independent populations, groups, conditions, etc. versus repeated measures, which is a generalization of the paired or dependent t-test.

■ Fixed Effects versus random effects.

■ Every design has associated with it a statistical model for the response (“dependent”) variable.

■ For each design, you need to

◆ Know the assumptions.

◆ Consider whether they are reasonable.

◆ Check the assumptions by studying the data.

■ Designs for One–Factor ANOVA

◆ Experimental

◆ Observational


Experimental Designs

Completely Randomized Experimental Design

■ Subjects are randomly assigned to one and only one condition. (Note: a “subject” does not have to be a person. It could be a school, donut, etc.)

■ The different conditions are the levels of the independent variable or factor, which are discrete.

■ The dependent variable is the numerical measure on which we want to know whether the independent variable has an effect.
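Random assignment in a completely randomized design can be sketched in a few lines: shuffle the subjects, then split them into equal-sized groups so that each subject lands in exactly one condition. The subject IDs, the group size, and the seed below are hypothetical; the four condition labels echo the instruction types (N, S, E, A) from the Wiley & Voss example. Equal group sizes are a simplifying assumption of this sketch, not a requirement of the design.

```python
import random


def completely_randomized_assignment(subjects, conditions, seed=None):
    """Randomly assign each subject to exactly one condition, equal group sizes."""
    if len(subjects) % len(conditions) != 0:
        raise ValueError("number of subjects must be a multiple of the number of conditions")
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    per_group = len(shuffled) // len(conditions)
    return {
        condition: shuffled[k * per_group:(k + 1) * per_group]
        for k, condition in enumerate(conditions)
    }


# Hypothetical subject IDs; 32 subjects split across four conditions.
subjects = [f"S{k:02d}" for k in range(1, 33)]
groups = completely_randomized_assignment(subjects, ["N", "S", "E", "A"], seed=580)
for condition, members in groups.items():
    print(condition, members)
```

Fixing the seed makes the assignment reproducible for record keeping; in a real study the seed would be chosen before any subject information is examined.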


An Example of a C.R. Experimental Design

Wiley & Voss (1999)

■ Students are randomly assigned to receive one type of instruction. Type of instruction is the independent variable and it has four levels.

■ The amount learned (IVT measure) is the dependent variable.

Goal: Make causal inferences about the effects of independent variables on the dependent variable.


Random Assignment in C.R. Designs

■ The random assignment to conditions is required so that

◆ The value of the dependent variable is not related to the condition to which a person is assigned.

◆ Assignment to conditions (treatments) is not confounded with treatment.

◆ Any differences between groups (conditions) with respect to the subjects assigned to them are unsystematic.

■ Question answered by ANOVA with a completely randomized experimental design:

◆ Are differences in average (mean) responses or measures between conditions due to chance error, or are the differences large enough to indicate that there are real differences in the population?

◆ Are treatment effects different?


Observational Studies

■ Take random samples from different populations.

■ Random sampling is required to get samples that are representative of the populations that are of interest.

■ We do not want samples that differ from the population in some systematic way.

■ If samples differ systematically, then we have a confound between conditions and selection.

An observational version of the Wiley & Voss study:

■ Take a random sample of students who were required to do different types of writing assignments and have them take the IVT.

■ Type of writing assignment is the explanatory variable.

■ Obtain measure of learning (e.g., IVT).

■ IVT is the response variable.


Observational versus Experimental

■ In observational studies, the factor is called an explanatory variable rather than an independent variable.

■ In observational studies, we refer to response or criterion variables instead of dependent variables.

■ Why the difference in terminology?

◆ In observational studies, we cannot make causal inferences.

◆ In observational studies, we can make inferences about whether differences exist.


Different Designs

■ In observational studies, we should not conclude that

◆ “differences are caused by . . . ”

◆ “the effect of factor levels on . . . ”

■ Differences between experimental and observational designs $\longrightarrow$ differences in

◆ Terminology

◆ Conclusions

■ But the statistics and mathematical procedures are the same.


ANOVA Terminology and Notation

General:

■ ANOVA ≡ ANalysis Of VAriance

■ Familywise error rate ≡ Prob(Type I error in the whole set of tests)

■ Factor ≡ Explanatory or Independent variable. It’s discrete (nominal or ordinal) and observable.

■ Levels of a factor ≡ Categories of the factor

■ Dependent or Response variable ≡ What’s being measured (“construct of interest”); whether there’s a difference.

■ Effect ≡ variability due to different sources (e.g., error, treatment or group, etc.)

■ Replicates ≡ usually subjects or individuals

■ Source (of variation) ≡ error and treatment or group (for now)

■ Treatment effects (systematic) ≡ systematic variability

■ Experimental Error (unsystematic) ≡ unsystematic variability


Design/Model Specific

■ Completely randomized experimental design (CRD)

■ Observational design/study

■ Factorial design: more than one factor

■ Balanced design: $n_1 = n_2 = \ldots = n_J = n$

■ Fixed effects

■ Random effects

■ Repeated measures

■ Crossed Factors

■ Nested Factors

■ Blocking Factor(s)


Notation (some conventions specific to 1–Factor ANOVA)

■ SS (sums of squares).

■ MS (mean squares).

■ Upper case Roman letters refer to factors (e.g., A, B, . . . ).

■ S refers to subjects.

■ S|A denotes subjects within factor A.

■ y refers to observations on the dependent or response variable.

■ i indexes subjects (experimental unit, replicate, etc.).

■ j indexes levels of a factor.

■ So $Y_{ij}$ is the observation on subject i at level j of the factor.

■ J is the number of levels of a factor.


More Notation

■ $n_j$ = number of subjects assigned to (observed at) level j of the factor.

■ $N = \sum_{j=1}^{J} n_j$ = total number of subjects.

■ $n_+ = N$ (sometimes)

■ $\bar{Y}_j = (1/n_j)\sum_{i=1}^{n_j} Y_{ij}$ = sample mean for level j.

■ $\bar{Y} = (1/N)\sum_{j=1}^{J}\sum_{i=1}^{n_j} Y_{ij}$ = grand (sample) mean.

■ $\mu$ = grand mean in population.

■ $\sigma^2_\epsilon$ = variance of the experimental (unsystematic, within groups) error.

■ $\mu_j$ = population mean for level j of the factor.

■ $\alpha_j$ = population treatment effect for level j of the factor.

■ $\epsilon_{ij}$ = residual or error for subject i at level j.
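The sample quantities in this notation can be computed directly from grouped scores. The sketch below is only an illustration: the three groups and their scores are hypothetical (they are not the Wiley & Voss data), and the group sizes are deliberately unequal so that $n_j$ and $N$ differ.

```python
# Hypothetical scores for J = 3 levels of a factor (made-up data).
groups = {
    "a1": [4.0, 6.0, 8.0],
    "a2": [10.0, 12.0],
    "a3": [1.0, 2.0, 3.0, 6.0],
}

J = len(groups)                                   # number of levels
n = {j: len(y) for j, y in groups.items()}        # n_j, subjects per level
N = sum(n.values())                               # N = sum over j of n_j

# Sample mean for each level: (1/n_j) * sum over i of Y_ij
group_mean = {j: sum(y) / n[j] for j, y in groups.items()}

# Grand sample mean: (1/N) * sum over j and i of Y_ij
grand_mean = sum(sum(y) for y in groups.values()) / N

print("J =", J, " n_j =", n, " N =", N)
print("group means:", group_mean)
print("grand mean:", round(grand_mean, 4))
```

With unequal $n_j$ the grand mean is the mean of all N scores, which is generally not the simple average of the group means.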


Working Example of 1-factor ANOVA

Consider the Wiley & Voss data:

Instructions (N = Narrative, S = Summary, E = Explanation, A = Argument)

Statistic   N        S        E        A        Total
means       73.75    73.13    70.63    81.88    74.84
s_j         13.60    14.01    9.29     13.77    13.21
s_j^2       185.00   196.25   86.25    189.58   174.58
n_j         16       16       16       16       64

Define IVT = $Y_{ij}$ for student/participant i in the jth level.


The Wiley & Voss Data


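Working Example for 1-Factor ANOVA

■ Define $X_{io} = 1$ and

$$X_{i1} = \begin{cases} 1 & \text{if Narrative} \\ 0 & \text{otherwise} \end{cases} \qquad X_{i2} = \begin{cases} 1 & \text{if Summary} \\ 0 & \text{otherwise} \end{cases}$$

$$X_{i3} = \begin{cases} 1 & \text{if Explanation} \\ 0 & \text{otherwise} \end{cases} \qquad X_{i4} = \begin{cases} 1 & \text{if Argument} \\ 0 & \text{otherwise} \end{cases}$$

■ $Y_i = \beta_o x_{io} + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_3 x_{i3} + \beta_4 x_{i4} + \epsilon_i$

■ By level of instruction factor:

$$Y_i = \begin{cases} \beta_o + \beta_1 + \epsilon_i & \text{if Narrative} \\ \beta_o + \beta_2 + \epsilon_i & \text{if Summary} \\ \beta_o + \beta_3 + \epsilon_i & \text{if Explanation} \\ \beta_o + \beta_4 + \epsilon_i & \text{if Argument} \end{cases}$$

■ What do the weights (i.e., β’s) equal?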
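The indicator coding above can be sketched as a design matrix with one constant column and one 0/1 column per instruction level. This is an illustration only: the six participants and their conditions are hypothetical, and `design_matrix` is a helper name introduced here, not something from the course materials.

```python
import numpy as np

levels = ["Narrative", "Summary", "Explanation", "Argument"]

# Hypothetical condition assignments for six participants (made up).
condition = ["Narrative", "Argument", "Summary",
             "Explanation", "Argument", "Narrative"]

def design_matrix(condition, levels):
    """Column 0 is the constant x_io = 1; column j is the indicator x_ij."""
    X = np.zeros((len(condition), len(levels) + 1))
    X[:, 0] = 1.0
    for i, c in enumerate(condition):
        X[i, levels.index(c) + 1] = 1.0
    return X

X = design_matrix(condition, levels)
print(X)

# The four indicator columns add up to the constant column, so the five
# columns are linearly dependent: the matrix has rank 4, not 5.
rank = np.linalg.matrix_rank(X)
print("columns:", X.shape[1], " rank:", rank)
```

Because the columns are linearly dependent, the five weights are not uniquely determined without an additional restriction on the β’s.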


Least Squares Estimation

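■ $\epsilon_i = Y_i - (\beta_o + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_3 x_{i3} + \beta_4 x_{i4})$.

■ Want β’s that minimize the error variance,

$$\frac{1}{N}\sum_{i=1}^{N}(\epsilon_i - \bar{\epsilon})^2$$

■ Two restrictions: $\sum_{i=1}^{N}\epsilon_i = 0$ and $\sum_{j=1}^{J}\beta_j = 0$.

■ The “loss function” is then

$$\sum_{i=1}^{N}(\epsilon_i - \bar{\epsilon})^2 = \sum_{i=1}^{N}\left(Y_i - (\beta_o + \beta_1 x_{i1} + \beta_2 x_{i2} + \beta_3 x_{i3} + \beta_4 x_{i4})\right)^2 = \sum_{i=1}^{N}(Y_i - \text{guess})^2$$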


Least Squares Estimation

■ Let $Y_{ij}$ = observation for individual i from group j.

■ What minimizes this?

$$\sum_{i=1}^{n_j}(y_{ij} - \text{guess}_j)^2$$

■ The group mean, $\bar{y}_j = \sum_{i=1}^{n_j} y_{ij}/n_j$

■ Re-write the loss function as

$$\sum_{j=1}^{J}\sum_{i=1}^{n_j}(Y_{ij} - \text{guess}_j)^2 = \sum_{i=1}^{n_1}(y_{i1} - ?_1)^2 + \sum_{i=1}^{n_2}(y_{i2} - ?_2)^2 + \sum_{i=1}^{n_3}(y_{i3} - ?_3)^2 + \sum_{i=1}^{n_4}(y_{i4} - ?_4)^2$$

■ If we minimize each one of these, we minimize their sum.

■ Let’s consider the first one (i.e., for Narrative). . .


Least Squares Estimation

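$$\sum_{i=1}^{n_1}(y_{i1} - ?_1)^2$$

■ Setting $?_1 = \bar{y}_1 = \bar{y}_{\text{narrative}}$ minimizes this.

■ Filling in our linear model for Narrative:

$$\sum_{i=1}^{n_1}(y_{i1} - \bar{y}_1)^2 = \sum_{i=1}^{n_1}(y_{i1} - (\beta_o + \beta_1))^2$$

■ So, $\bar{y}_1 = \bar{y}_{\text{narrative}} = \beta_o + \beta_1$

■ For all of them

$$\bar{y}_1 = \bar{y}_{\text{narrative}} = \beta_o + \beta_1, \qquad \bar{y}_2 = \bar{y}_{\text{summary}} = \beta_o + \beta_2$$

$$\bar{y}_3 = \bar{y}_{\text{explanation}} = \beta_o + \beta_3, \qquad \bar{y}_4 = \bar{y}_{\text{argument}} = \beta_o + \beta_4$$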
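The claim that the group mean minimizes a group's sum of squared deviations can be checked numerically. The sketch below uses five hypothetical scores for a single group (not the Wiley & Voss raw data) and compares the loss at the mean with the loss at a few nearby guesses.

```python
# Hypothetical scores for one group (made-up data).
y = [62.0, 70.0, 75.0, 81.0, 87.0]

def loss(guess):
    """Sum of squared deviations of the scores from a candidate value."""
    return sum((v - guess) ** 2 for v in y)

mean = sum(y) / len(y)

# Candidate guesses: the mean itself and values on either side of it.
candidates = [mean + d for d in (-5.0, -1.0, -0.1, 0.0, 0.1, 1.0, 5.0)]
best = min(candidates, key=loss)

for g in candidates:
    print(f"guess = {g:6.1f}   loss = {loss(g):8.2f}")
print("best guess:", best, "  group mean:", mean)
```

For any guess, the loss equals the loss at the mean plus $n(\text{guess} - \bar{y})^2$, so no other value can do better than the mean.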


Least Squares Estimation

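$$\bar{y}_1 = \bar{y}_{\text{narrative}} = \beta_o + \beta_1, \qquad \bar{y}_2 = \bar{y}_{\text{summary}} = \beta_o + \beta_2$$

$$\bar{y}_3 = \bar{y}_{\text{explanation}} = \beta_o + \beta_3, \qquad \bar{y}_4 = \bar{y}_{\text{argument}} = \beta_o + \beta_4$$

■ Solution for $\beta_o$:

$$\beta_o = \bar{y} = \frac{1}{\sum_{j=1}^{J} n_j}\sum_{j=1}^{J}\sum_{i=1}^{n_j} y_{ij} = \text{grand mean}$$

■ Solution for $\beta_j$ = (group mean)$_j$ − (grand mean):

$$\beta_1 = \bar{y}_1 - \bar{y}, \qquad \beta_2 = \bar{y}_2 - \bar{y}$$

$$\beta_3 = \bar{y}_3 - \bar{y}, \qquad \beta_4 = \bar{y}_4 - \bar{y}$$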


Least Squares Estimation

■ More conventional notation is that $\beta_j = \alpha_j$, the effect of level j of the factor.

■ So our estimated model is

$$\hat{Y}_{ij} = \bar{Y} + \hat{\alpha}_j = \bar{Y} + (\bar{Y}_j - \bar{Y}) = \bar{Y}_j$$


Example: Least Squares Estimation

For the Wiley & Voss data:

Instructions (N = Narrative, S = Summary, E = Explanation, A = Argument)

Statistic   N        S        E        A        Total
means       73.75    73.13    70.63    81.88    74.84
s_j         13.60    14.01    9.29     13.77    13.21
n_j         16       16       16       16       64

$\hat{y}_{i1} = \hat{y}_{i,\text{narrative}} = 74.84 + (73.75 - 74.84) = 74.84 - 1.09$

$\hat{y}_{i2} = \hat{y}_{i,\text{summary}} = 74.84 + (73.13 - 74.84) = 74.84 - 1.71$

$\hat{y}_{i3} = \hat{y}_{i,\text{explanation}} = 74.84 + (70.63 - 74.84) = 74.84 - 4.21$

$\hat{y}_{i4} = \hat{y}_{i,\text{argument}} = 74.84 + (81.88 - 74.84) = 74.84 + 7.04$
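The fitted values above follow mechanically from the summary means. A minimal sketch, using only the rounded means reported in the table (the variable names are chosen here for illustration):

```python
# Rounded summary means from the Wiley & Voss table.
means = {"Narrative": 73.75, "Summary": 73.13,
         "Explanation": 70.63, "Argument": 81.88}
grand_mean = 74.84

# Estimated effect for each level: group mean minus grand mean.
effects = {k: round(m - grand_mean, 2) for k, m in means.items()}

# Fitted value for every participant at a level: grand mean plus effect.
fitted = {k: grand_mean + a for k, a in effects.items()}

for k in means:
    print(f"{k:12s} effect = {effects[k]:+.2f}   fitted = {fitted[k]:.2f}")

# In a balanced design the effects sum to zero; with rounded means the
# sum is only approximately zero.
print("sum of effects:", round(sum(effects.values()), 2))
```

The small nonzero sum of the effects comes from rounding the means to two decimals, not from the model.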


Population 1 Factor ANOVA Model

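■ The 1-factor ANOVA model:

$$Y_{ij} = \mu + \alpha_j + \epsilon_{ij}$$

■ $\mu$ = grand mean over all populations.

■ $\alpha_j = \mu_j - \mu$

■ $\epsilon_{ij} \sim N(0, \sigma^2_\epsilon)$ i.i.d.

■ Estimation of the model (given data):

$$Y_{ij} = \hat{\mu} + \hat{\alpha}_j + \hat{\epsilon}_{ij} = \bar{Y} + (\bar{Y}_j - \bar{Y}) + (Y_{ij} - \bar{Y}_j)$$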


Picture of ANOVA


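Partitioning Sums of Squares

■ How do we test the null hypothesis?

■ Study the variance.

■ According to our model

$$Y_{ij} = \hat{\mu} + \hat{\alpha}_j + \hat{\epsilon}_{ij} = \bar{Y} + (\bar{Y}_j - \bar{Y}) + (Y_{ij} - \bar{Y}_j)$$

or

$$\underbrace{(Y_{ij} - \bar{Y})}_{\text{deviation of score from overall mean}} = \underbrace{(\bar{Y}_j - \bar{Y})}_{\text{deviation of group mean from overall mean}} + \underbrace{(Y_{ij} - \bar{Y}_j)}_{\text{deviation of score from group mean}}$$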


Partitioning Sums of Squares

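$$(Y_{ij} - \bar{Y}) = (\bar{Y}_j - \bar{Y}) + (Y_{ij} - \bar{Y}_j)$$

■ Square both sides,

$$(Y_{ij} - \bar{Y})^2 = \left[(\bar{Y}_j - \bar{Y}) + (Y_{ij} - \bar{Y}_j)\right]^2$$

■ Sum over all groups and individuals within groups,

$$\sum_{j=1}^{J}\sum_{i=1}^{n_j}(Y_{ij} - \bar{Y})^2 = \sum_{j=1}^{J}\sum_{i=1}^{n_j}\left[(\bar{Y}_j - \bar{Y}) + (Y_{ij} - \bar{Y}_j)\right]^2$$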


Work Through the Algebra

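$$\sum_{j=1}^{J}\sum_{i=1}^{n_j}(Y_{ij} - \bar{Y})^2 = \sum_{j=1}^{J}\sum_{i=1}^{n_j}\left[(\bar{Y}_j - \bar{Y}) + (Y_{ij} - \bar{Y}_j)\right]^2$$

$$= \sum_{j=1}^{J}\sum_{i=1}^{n_j}\left[(\bar{Y}_j - \bar{Y})^2 + (Y_{ij} - \bar{Y}_j)^2 + 2(\bar{Y}_j - \bar{Y})(Y_{ij} - \bar{Y}_j)\right]$$

$$= \sum_{j=1}^{J}\sum_{i=1}^{n_j}(\bar{Y}_j - \bar{Y})^2 + \sum_{j=1}^{J}\sum_{i=1}^{n_j}(Y_{ij} - \bar{Y}_j)^2 + \sum_{j=1}^{J}\sum_{i=1}^{n_j}2(\bar{Y}_j - \bar{Y})(Y_{ij} - \bar{Y}_j)$$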


Work Through the Algebra (continued)

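$$\sum_{j=1}^{J}\sum_{i=1}^{n_j}(Y_{ij} - \bar{Y})^2 = \sum_{j=1}^{J}\sum_{i=1}^{n_j}(\bar{Y}_j - \bar{Y})^2 + \sum_{j=1}^{J}\sum_{i=1}^{n_j}(Y_{ij} - \bar{Y}_j)^2 + \sum_{j=1}^{J}\sum_{i=1}^{n_j}2(\bar{Y}_j - \bar{Y})(Y_{ij} - \bar{Y}_j)$$

$$= \sum_{j=1}^{J}n_j(\bar{Y}_j - \bar{Y})^2 + \sum_{j=1}^{J}\sum_{i=1}^{n_j}(Y_{ij} - \bar{Y}_j)^2 + 2\sum_{j=1}^{J}(\bar{Y}_j - \bar{Y})\sum_{i=1}^{n_j}(Y_{ij} - \bar{Y}_j)$$

$$= \sum_{j=1}^{J}n_j(\bar{Y}_j - \bar{Y})^2 + \sum_{j=1}^{J}\sum_{i=1}^{n_j}(Y_{ij} - \bar{Y}_j)^2$$

The cross-product term drops out because, within each group, the deviations from the group mean sum to zero: $\sum_{i=1}^{n_j}(Y_{ij} - \bar{Y}_j) = 0$.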
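The algebra can be confirmed numerically on any data set. The sketch below uses three small hypothetical groups with unequal sizes (not the Wiley & Voss data) and computes each piece of the expansion separately, including the cross-product term.

```python
# Hypothetical scores in J = 3 groups with unequal n_j (made-up data).
groups = [[4.0, 6.0, 8.0], [10.0, 12.0], [1.0, 2.0, 3.0, 6.0]]

N = sum(len(g) for g in groups)
grand = sum(sum(g) for g in groups) / N

def mean(g):
    return sum(g) / len(g)

# Left-hand side: squared deviations of every score from the grand mean.
ss_total = sum((y - grand) ** 2 for g in groups for y in g)

# First term: n_j times the squared deviation of each group mean.
ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)

# Second term: squared deviations of scores from their own group mean.
ss_within = sum((y - mean(g)) ** 2 for g in groups for y in g)

# Cross-product term, which the derivation says is zero.
cross = sum(2 * (mean(g) - grand) * (y - mean(g)) for g in groups for y in g)

print("SS total         :", round(ss_total, 4))
print("SS between       :", round(ss_between, 4))
print("SS within        :", round(ss_within, 4))
print("between + within :", round(ss_between + ss_within, 4))
print("cross product    :", round(cross, 10))
```

The identity holds for unequal group sizes as well, since the cross-product vanishes group by group.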


The ANOVA Decomposition

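$$\sum_{j=1}^{J}\sum_{i=1}^{n_j}(Y_{ij} - \bar{Y})^2 = \sum_{j=1}^{J}n_j(\bar{Y}_j - \bar{Y})^2 + \sum_{j=1}^{J}\sum_{i=1}^{n_j}(Y_{ij} - \bar{Y}_j)^2$$

$$SS_{total} = SS_{between} + SS_{within}$$

$$SS_{total} = SS_{model} + SS_{error}$$

■ $SS_{total}$ is “corrected for mean”

■ $SS_{within}$ also gets called

◆ $SS_{error}$

◆ $SS_{residual}$

◆ $SS_{S|A}$, “subjects within factor A”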


Example: Sums of Squares Decomposition

Wiley & Voss Data (note: summary statistics on pages 40 and 51)

$$SS_{total} = \sum_{j=1}^{J}\sum_{i=1}^{n_j}(Y_{ij} - \bar{Y})^2$$

$$= \sum_{j=1}^{4}\sum_{i=1}^{16}(Y_{ij} - 74.84)^2$$

$$= (N - 1)s^2$$

$$= (64 - 1)(174.578)$$

$$= 10{,}998.414$$


Example: Sums of Squares Decomposition

Wiley & Voss Data

$$SS_{within} = \sum_{j}\sum_{i}(Y_{ij} - \bar{Y}_j)^2 = \sum_{j}\sum_{i}\hat{\epsilon}_{ij}^2$$

$$= (n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + (n_3 - 1)s_3^2 + (n_4 - 1)s_4^2$$

$$= (n - 1)(s_1^2 + s_2^2 + s_3^2 + s_4^2)$$

$$= (16 - 1)(185.00 + 196.25 + 86.25 + 189.58)$$

$$= 9{,}856.250$$


Example: Sums of Squares Decomposition.

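Wiley & Voss Data

$$SS_{between} = \sum_{j=1}^{J}n_j(\bar{Y}_j - \bar{Y})^2 = \sum_{j=1}^{J}n_j\hat{\alpha}_j^2$$

$$= 16(73.75 - 74.84)^2 + 16(73.13 - 74.84)^2 + 16(70.63 - 74.84)^2 + 16(81.88 - 74.84)^2$$

$$= 16(1.1881 + 2.9241 + 17.7241 + 49.5616)$$

$$= 16(71.3867)$$

$$= 1{,}142.188$$

(The four rounded terms shown add to 71.3979; the difference from 71.3867 is rounding in the means.)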


Example: Sums of Squares Decomposition

Wiley & Voss Data:

$$SS_{total} = SS_{between} + SS_{within}$$

$$10{,}998.414 = 1{,}142.188 + 9{,}856.250$$

. . . within rounding error

The easier way to compute $SS_{between}$,

$$SS_{total} - SS_{within} = 10{,}998.414 - 9{,}856.250 = 1{,}142.164$$

which agrees with 1,142.188 within rounding error.
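The three sums of squares can be recomputed from the rounded summary statistics alone. The sketch below does that, computing $SS_{between}$ both directly and by subtraction; because the inputs are rounded to two or three decimals, the results are expected to match the reported values only approximately.

```python
# Rounded summary statistics from the Wiley & Voss table.
n = 16                                        # subjects per group (balanced)
means = [73.75, 73.13, 70.63, 81.88]          # group means
variances = [185.00, 196.25, 86.25, 189.58]   # group variances s_j^2
grand_mean = 74.84
total_variance = 174.578                      # s^2 of all 64 scores
N = 64

# SS_total = (N - 1) * s^2
ss_total = (N - 1) * total_variance

# SS_within = (n - 1) * (sum of group variances), balanced design only
ss_within = (n - 1) * sum(variances)

# SS_between directly: n times the sum of squared effects
ss_between_direct = n * sum((m - grand_mean) ** 2 for m in means)

# SS_between the easier way: by subtraction
ss_between_subtraction = ss_total - ss_within

print("SS total               :", round(ss_total, 3))
print("SS within              :", round(ss_within, 3))
print("SS between (direct)    :", round(ss_between_direct, 3))
print("SS between (subtracted):", round(ss_between_subtraction, 3))
```

The two routes to $SS_{between}$ differ slightly from each other and from 1,142.188 because each starts from differently rounded inputs.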


Brief Digression/Note

■ If we had squared and summed the equation

$$Y_{ij} = \bar{Y} + (\bar{Y}_j - \bar{Y}) + (Y_{ij} - \bar{Y}_j)$$

■ We would have ended up with

$$SS_{\text{raw scores}} = SS_{\text{overall mean}} + SS_{between} + SS_{within}$$


Decomposition of Variance

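$$\sum_{j=1}^{J}\sum_{i=1}^{n_j}(Y_{ij} - \bar{Y})^2 = \sum_{j=1}^{J}n_j(\bar{Y}_j - \bar{Y})^2 + \sum_{j=1}^{J}\sum_{i=1}^{n_j}(Y_{ij} - \bar{Y}_j)^2$$

$$\frac{\sum_{j}\sum_{i}(Y_{ij} - \bar{Y})^2}{N - 1} = \frac{\sum_{j}n_j(\bar{Y}_j - \bar{Y})^2}{N - 1} + \frac{\sum_{j}\sum_{i}(Y_{ij} - \bar{Y}_j)^2}{N - 1}$$

$$\frac{SS_{total}}{N - 1} = \frac{SS_{between}}{N - 1} + \frac{SS_{within}}{N - 1}$$

$$\text{var}(Y_{ij}) = \text{(systematic)} + \text{(unsystematic)}$$

$$\text{var}(Y_{ij}) = \text{(accounted for)} + \text{(not explained)}$$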


Proportion of Variance

$$SS_{total} = SS_{between} + SS_{within}$$

$$1 = \frac{SS_{between}}{SS_{total}} + \frac{SS_{within}}{SS_{total}}$$

$$1 = \left(\text{proportion of variance of } Y_{ij} \text{ due to treatment}\right) + \left(\text{proportion of variance of } Y_{ij} \text{ not due to treatment}\right)$$

The variance of $Y_{ij}$ is broken down into two statistically independent parts:

1. Between groups

2. Within groups


Proportion of Variance

■ In regression, the proportion of variance due to the model equals the squared correlation.

■ In ANOVA,

$$\frac{SS_{between}}{SS_{total}} = R^2 = \text{“Multiple R Squared”}$$

■ $R^2$ equals the squared correlation between $Y_{ij}$ and $\hat{Y}_{ij} = \hat{\mu} + \hat{\alpha}_j$,

$$R^2 = \frac{\text{cov}(Y_{ij}, \hat{Y}_{ij})^2}{\text{var}(Y_{ij})\,\text{var}(\hat{Y}_{ij})}$$

■ $R^2$ is just one way to measure the relative size or magnitude of treatment effects.
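The equality between $SS_{between}/SS_{total}$ and the squared correlation of observed and fitted scores can be checked numerically. The sketch below uses a small hypothetical data set (not the Wiley & Voss raw scores, which are not reproduced here); the fitted value for each observation is its group mean.

```python
import numpy as np

# Hypothetical scores and group labels for J = 3 groups (made-up data).
y = np.array([4.0, 6.0, 8.0, 10.0, 12.0, 1.0, 2.0, 3.0, 6.0])
group = np.array([0, 0, 0, 1, 1, 2, 2, 2, 2])

# Fitted value for each observation is the mean of its own group.
group_means = np.array([y[group == j].mean() for j in range(3)])
y_hat = group_means[group]

# R^2 as a ratio of sums of squares.
ss_total = ((y - y.mean()) ** 2).sum()
ss_between = ((y_hat - y.mean()) ** 2).sum()
r2_from_ss = ss_between / ss_total

# R^2 as the squared correlation between observed and fitted scores.
r = np.corrcoef(y, y_hat)[0, 1]

print("SS_between / SS_total:", round(r2_from_ss, 4))
print("squared correlation  :", round(r ** 2, 4))
```

Summing $(\hat{Y}_{ij} - \bar{Y})^2$ over all observations gives the same quantity as $\sum_j n_j(\bar{Y}_j - \bar{Y})^2$, because each group mean is repeated $n_j$ times.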


Multiple R Squared

■ In the Wiley & Voss example,

$$\frac{SS_{between}}{SS_{total}} = R^2 = \frac{1{,}142.188}{10{,}998.414} = .1039$$

and $r(Y_{ij}, \hat{Y}_{ij}) = \sqrt{.1039} = .322$

■ $1 - R^2$ is proportional to the “loss function”, which we set out to minimize,

$$1 - R^2 = 1 - .1039 = .8961$$

■ Is this statistically a good model?
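The arithmetic for this example is short enough to reproduce directly. A minimal sketch, taking the two sums of squares as reported for the Wiley & Voss data:

```python
import math

# Sums of squares as reported for the Wiley & Voss data.
ss_between = 1142.188
ss_total = 10998.414

r_squared = ss_between / ss_total        # proportion of variance explained
r = math.sqrt(r_squared)                 # correlation of observed and fitted
unexplained = 1.0 - r_squared            # proportion not explained

print("R^2     =", round(r_squared, 4))
print("r       =", round(r, 3))
print("1 - R^2 =", round(unexplained, 4))
```

Since $1 - R^2 = SS_{within}/SS_{total}$ and $SS_{total}$ is fixed by the data, minimizing the loss function is the same as maximizing $R^2$.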


Hypothesis Testing: F -test

■ Statistical Hypotheses:

$H_o: \mu_1 = \mu_2 = \ldots = \mu_J$, or all $\mu_j$’s are equal

$H_a$: at least one $\mu_j$ is not equal to the rest

or $H_a$: the means differ in the population

■ The alternative hypothesis is NOT

$$\mu_1 \neq \mu_2 \neq \ldots \neq \mu_J$$


Assumptions:

■ The dependent or response variable is normally distributed in the population.

■ The variances of scores in different populations are equal.

■ Observations are independent across (between) groups and within groups.

Succinctly: $Y_{ij} \sim N(\mu_j, \sigma^2_\epsilon)$ i.i.d.

$\sigma^2_\epsilon$, experimental error.

Distributions are the same, except the means.


Test Statistic

■ If $H_o$ is TRUE, then we expect the sample means to differ only because of unsystematic error, i.e., $\sigma^2_\epsilon$.

■ If $H_o$ is FALSE, then the differences between sample means reflect

1. Experimental or unsystematic error, i.e., $\sigma^2_\epsilon$

2. Systematic differences or true differences between population means, or “treatment effects”.


Test Statistic (continued)

■ To test $H_o$, we look at the following ratio:

$$\frac{\text{Differences among treatment means}}{\text{Differences among subjects treated alike}} = \frac{\text{Between group differences}}{\text{Within group differences}}$$

■ If $H_o$ is TRUE, then this ratio equals

$$\frac{\text{Experimental Error}}{\text{Experimental Error}} \approx 1$$

■ If $H_o$ is FALSE, then this ratio equals

$$\frac{\text{Treatment Effects} + \text{Experimental Error}}{\text{Experimental Error}} > 1$$

Multivariate Relationships and Multiple Linear Regression Slide 69 of 119

Test Statistic: Variance Estimates

■ How do we estimate $\sigma^2_\epsilon$ (the denominator of our test statistic)?

■ Use all the data and compute a pooled estimate:

$$\hat{\sigma}^2_{pool} = \frac{(n_1 - 1)s^2_1 + (n_2 - 1)s^2_2 + \ldots + (n_J - 1)s^2_J}{(n_1 - 1) + (n_2 - 1) + \ldots + (n_J - 1)} = \frac{\sum_{j=1}^{J}\sum_{i=1}^{n_j}(Y_{ij} - \bar{Y}_j)^2}{\sum_{j=1}^{J}(n_j - 1)} = \frac{SS_{within}}{\sum_{j=1}^{J}(n_j - 1)}$$

$\sum_{j=1}^{J}(n_j - 1)$ equals the degrees of freedom associated with $\hat{\sigma}^2_{pool}$; that is, $\nu_{pool}$.
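The two forms of the pooled estimate (a weighted average of the group variances, and $SS_{within}$ over its degrees of freedom) can be checked numerically. The following is a minimal Python sketch, not part of the original slides (which use SAS); the scores in `groups` are hypothetical and chosen only for illustration.

```python
from statistics import variance

# Hypothetical scores for J = 3 groups of unequal size (illustration only).
groups = [
    [12.0, 15.0, 11.0, 14.0],
    [18.0, 17.0, 21.0],
    [9.0, 13.0, 10.0, 12.0, 11.0],
]

# Degrees of freedom of the pooled estimate: sum of (n_j - 1).
df_pool = sum(len(g) - 1 for g in groups)

# Form 1: weighted average of the sample variances s_j^2.
pooled_from_variances = sum((len(g) - 1) * variance(g) for g in groups) / df_pool

# Form 2: SS_within divided by its degrees of freedom.
ss_within = 0.0
for g in groups:
    group_mean = sum(g) / len(g)
    ss_within += sum((y - group_mean) ** 2 for y in g)
pooled_from_ss = ss_within / df_pool

print(df_pool, pooled_from_variances, pooled_from_ss)
```

Both forms agree because $(n_j - 1)s^2_j$ is exactly the sum of squared deviations within group $j$.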

Multivariate Relationships and Multiple Linear Regression Slide 70 of 119

Test Statistic: Variance Estimates

■ The pooled variance, $SS_w/\nu_w = MS_w = \hat{\sigma}^2_\epsilon$, is the within groups mean squared error.

■ Degrees of freedom $= \nu_w = \sum_{j=1}^{J}(n_j - 1) = N - J$, which equals $J(n - 1)$ for a balanced design.

■ $MS_w$ is also called

◆ Error mean square

◆ Mean square error

◆ Residual mean square

■ Wiley & Voss example: $MS_w = 9,856.250/(64 - 4) = 164.271$

Multivariate Relationships and Multiple Linear Regression Slide 71 of 119

Test Statistic: Variance Estimates

$$\frac{SS_w}{\nu_w} = MS_w = \hat{\sigma}^2_\epsilon$$

■ The expected value $E(SS_w/\nu_w) = E(MS_w) = \sigma^2_\epsilon$

■ It does not depend on whether $H_o$ is true or false.

■ It is constant across groups.

■ In a good experiment or study, this should be “small” and only reflect chance error.

Multivariate Relationships and Multiple Linear Regression Slide 72 of 119

Variation Due to Treatments

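Consider the sampling distributions of the means for each of our treatments. . . .

Group   Sample mean    Population (true) mean   Variance of $\bar{Y}_j$
N       $\bar{Y}_1$    $\mu_1$                  $\sigma^2_{\bar{Y}_1} = \sigma^2_\epsilon/n_1$
E       $\bar{Y}_2$    $\mu_2$                  $\sigma^2_{\bar{Y}_2} = \sigma^2_\epsilon/n_2$
S       $\bar{Y}_3$    $\mu_3$                  $\sigma^2_{\bar{Y}_3} = \sigma^2_\epsilon/n_3$
A       $\bar{Y}_4$    $\mu_4$                  $\sigma^2_{\bar{Y}_4} = \sigma^2_\epsilon/n_4$

In the Wiley & Voss example, we have a “balanced design”; that is, $n_1 = n_2 = n_3 = n_4 = 16$ (equal sample sizes).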

Multivariate Relationships and Multiple Linear Regression Slide 73 of 119

Variation Due to Treatments (continued)

For balanced designs,

If $H_o: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu$ (a constant) is true,

Then $\bar{Y}_1, \bar{Y}_2, \ldots, \bar{Y}_J$ is a random sample of size $J$ from the population of sample means, which has mean $\mu$ and variance $\sigma^2_\epsilon/n$.

■ The grand mean

$$\bar{Y} = \frac{1}{N}\sum_{j=1}^{J}\sum_{i=1}^{n} Y_{ij}$$

where $N = \sum_{j=1}^{J} n_j$.

■ If $H_o$ is true, then the variance of the means,

$$\hat{\sigma}^2_{\bar{Y}_j} = \frac{1}{(J - 1)}\sum_{j=1}^{J}(\bar{Y}_j - \bar{Y})^2$$

Multivariate Relationships and Multiple Linear Regression Slide 74 of 119

Variation Due to Treatments (continued)

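For balanced designs, if $H_o$ is true

■ Then the variance of the means (by definition) is

$$\hat{\sigma}^2_{\bar{Y}_j} = \frac{1}{(J - 1)}\sum_{j=1}^{J}(\bar{Y}_j - \bar{Y})^2$$

■ Since $\sigma^2_{\bar{Y}_j} = \sigma^2_\epsilon/n$, the estimate of $\sigma^2_\epsilon$ is

$$\hat{\sigma}^2_\epsilon = n\hat{\sigma}^2_{\bar{Y}_j} = \frac{1}{(J - 1)}\sum_{j=1}^{J} n(\bar{Y}_j - \bar{Y})^2 = \frac{SS_{between}}{(J - 1)} = MS_{between}$$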

Multivariate Relationships and Multiple Linear Regression Slide 75 of 119

Variation Due to Treatments (continued)

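For unbalanced designs,

■ Grand mean,

$$\bar{Y} = \frac{1}{N}\sum_{j=1}^{J}\sum_{i=1}^{n_j} Y_{ij}$$

where $N = \sum_{j=1}^{J} n_j$.

■ Estimate of $\sigma^2_\epsilon$ if $H_o$ is true,

$$\hat{\sigma}^2_\epsilon = \frac{1}{(J - 1)}\sum_{j=1}^{J} n_j(\bar{Y}_j - \bar{Y})^2 = \frac{SS_{between}}{(J - 1)} = MS_{between}$$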
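The between-groups mean square can be sketched in a few lines for both the balanced and the unbalanced case. This is an illustrative Python sketch rather than anything from the slides; the function name `ms_between` and the scores are hypothetical. In the balanced case it also checks that $MS_{between}$ equals $n$ times the sample variance of the group means.

```python
from statistics import variance

def ms_between(groups):
    """SS_between / (J - 1), weighting each squared deviation by n_j."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    return ss_between / (len(groups) - 1)

# Hypothetical balanced design: J = 3 groups, n = 4 scores each.
balanced = [
    [12.0, 15.0, 11.0, 14.0],
    [18.0, 17.0, 21.0, 20.0],
    [9.0, 13.0, 10.0, 12.0],
]
group_means = [sum(g) / len(g) for g in balanced]
n = len(balanced[0])

# Balanced case: MS_between equals n times the sample variance of the means.
print(ms_between(balanced), n * variance(group_means))

# Unbalanced case: the same function applies; only the weights n_j differ.
unbalanced = [balanced[0], balanced[1][:3], balanced[2] + [11.0]]
print(ms_between(unbalanced))
```

Weighting by $n_j$ is the only change needed for unequal sample sizes, which is why a single function covers both slides.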

Multivariate Relationships and Multiple Linear Regression Slide 76 of 119

Test Statistic: Variance Estimates

■ Wiley-Voss data:

$$MS_{between} = \frac{SS_{between}}{(J - 1)} = \frac{1,142.188}{4 - 1} = 380.729$$

■ Test statistic,

$$F = \frac{SS_{between}/\nu_{between}}{SS_{within}/\nu_{within}} = \frac{MS_{between}}{MS_{error}}$$

■ If $H_o$ is true, the sampling distribution of $F$. . .

Multivariate Relationships and Multiple Linear Regression Slide 77 of 119

Sampling distribution of F

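If $H_o$ is true, the sampling distribution of $F$. . .

$$F = \frac{SS_{between}/\nu_{between}}{SS_{within}/\nu_{within}} \cdot \frac{1/\sigma^2_\epsilon}{1/\sigma^2_\epsilon} = \frac{\left[\sum_{j=1}^{J} n_j(\bar{Y}_j - \bar{Y})^2/\sigma^2_\epsilon\right] / (J - 1)}{\left[\sum_{j=1}^{J}\sum_{i=1}^{n_j}(Y_{ij} - \bar{Y}_j)^2/\sigma^2_\epsilon\right] / \sum_{j=1}^{J}(n_j - 1)} = \frac{\chi^2_b/\nu_b}{\chi^2_w/\nu_w} \sim F_{\nu_b,\nu_w}$$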
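A small simulation can make this sampling distribution concrete. The sketch below is an illustration in Python (the slides themselves use SAS); the population means, $n = 16$, $\sigma = 1$, the seed, and the helper names are all assumptions. It averages the $F$ ratio over many simulated experiments, once with $H_o$ true and once with one mean shifted. Under $H_o$ the average should sit near $\nu_w/(\nu_w - 2)$, the mean of a central $F$ distribution, which is close to 1.

```python
import random

def f_ratio(groups):
    """Between-group mean square divided by within-group mean square."""
    means = [sum(g) / len(g) for g in groups]
    n_total = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_within = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

def average_ratio(pop_means, n=16, sigma=1.0, reps=2000, seed=580):
    """Average F ratio over many simulated experiments (hypothetical settings)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        groups = [[rng.gauss(mu, sigma) for _ in range(n)] for mu in pop_means]
        total += f_ratio(groups)
    return total / reps

ratio_null = average_ratio([0.0, 0.0, 0.0, 0.0])   # Ho true: no treatment effects
ratio_alt = average_ratio([0.0, 0.0, 0.0, 1.0])    # Ho false: one mean differs
print(round(ratio_null, 2), round(ratio_alt, 2))
```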

Multivariate Relationships and Multiple Linear Regression Slide 78 of 119

ANOVA Summary Table

■ For the Wiley-Voss example

Source         df    SS           MS        F      p-value
Instructions    3     1,142.188   380.729   2.32   .08
Error          60     9,856.250   164.271
Total          63    10,998.414

■ Retain $H_o$, because p-value $= (1 - .92) = .08 > \alpha = .05$

■ If we had rejected Ho, what would you want to know?
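The $F$ statistic and p-value in the summary table can be reproduced from the sums of squares alone. The sketch below is a hedged Python illustration, not the SAS computation used in the course: `f_upper_tail` is a hypothetical helper that approximates the upper-tail probability of a central $F$ by Simpson's rule on the incomplete beta integral, and it assumes the numerator degrees of freedom are at least 2.

```python
import math

def f_upper_tail(f_stat, df1, df2, steps=20000):
    """Approximate P(F >= f_stat) for a central F(df1, df2).

    Uses the fact that df1*F / (df1*F + df2) follows a Beta(df1/2, df2/2)
    distribution, and integrates the beta density with Simpson's rule.
    Assumes df1 >= 2 so the integrand is finite at 0; steps must be even.
    """
    a, b = df1 / 2.0, df2 / 2.0
    x = df1 * f_stat / (df1 * f_stat + df2)
    h = x / steps

    def g(t):
        return t ** (a - 1) * (1 - t) ** (b - 1)

    total = g(0.0) + g(x)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * g(k * h)
    integral = total * h / 3
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return 1.0 - integral / math.exp(log_beta)

# Sums of squares and degrees of freedom from the Wiley-Voss summary table.
ss_between, df_between = 1142.188, 3
ss_error, df_error = 9856.250, 60

f_stat = (ss_between / df_between) / (ss_error / df_error)
p_value = f_upper_tail(f_stat, df_between, df_error)
print(round(f_stat, 2), round(p_value, 2))
```

In practice a statistics package would supply the tail probability; the hand-rolled integral is used here only to keep the sketch free of external dependencies.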

Multivariate Relationships and Multiple Linear Regression Slide 79 of 119

One-Factor ANOVA Summary Table

■ In general, for 1 factor ANOVA

Source     df                   SS                                       MS                         F
Factor A   $J - 1$              $\sum_j n_j(\bar{Y}_j - \bar{Y})^2$      $SS_A/\nu_A$               $MS_A/MS_{error}$
Error      $\sum_j (n_j - 1)$   $\sum_j\sum_i (Y_{ij} - \bar{Y}_j)^2$    $SS_{error}/\nu_{error}$
Total      $N - 1$              $\sum_j\sum_i (Y_{ij} - \bar{Y})^2$

■ Reject $H_o$ for “large” $F$ statistics. Compare to the $F_{\nu_1,\nu_2}$ distribution.

■ By “Total”, we mean “total corrected for mean”.

■ The terms “Factor A”, “Treatment”, “Condition”, “Between” are interchangeable.

■ The terms “within”, “residual” and “error” are interchangeable.

Multivariate Relationships and Multiple Linear Regression Slide 80 of 119

A Closer Look at MSbetween

■ If $H_o$ is TRUE, then

$$E(MS_{between}) = \sigma^2_\epsilon$$

■ If $H_o$ is FALSE (i.e., $H_a$ is true), then the expected value is

$$E(MS_{between}) = \sigma^2_\epsilon + \frac{\sum_{j=1}^{J} n_j\alpha_j^2}{J - 1}$$

(See Hayes for proof).
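The expected-value formula can be checked by simulation rather than proof. The following Python sketch is illustrative only: the population means, $n = 16$, $\sigma^2_\epsilon = 1$, and the seed are hypothetical choices, and `ms_between` is a helper written for this example. It compares the formula's value with the average $MS_{between}$ over many simulated experiments.

```python
import random

def ms_between(groups):
    """SS_between / (J - 1) for a list of groups of scores."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    return ss / (len(groups) - 1)

# Hypothetical population: J = 4 groups, n = 16 per group, sigma^2 = 1.
pop_means = [0.0, 0.0, 0.0, 1.0]
n, sigma = 16, 1.0
J = len(pop_means)
mu = sum(pop_means) / J                  # balanced design, so mu is the plain average
alphas = [m - mu for m in pop_means]     # treatment effects alpha_j

# Value given by the formula: sigma^2 + sum(n_j * alpha_j^2) / (J - 1).
expected = sigma ** 2 + sum(n * a ** 2 for a in alphas) / (J - 1)

# Monte Carlo average of MS_between over repeated experiments.
rng = random.Random(580)
reps = 4000
total = 0.0
for _ in range(reps):
    groups = [[rng.gauss(m, sigma) for _ in range(n)] for m in pop_means]
    total += ms_between(groups)
simulated = total / reps

print(expected, round(simulated, 2))
```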

Multivariate Relationships and Multiple Linear Regression Slide 81 of 119

A Closer Look at the F Statistic

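■ If $H_o$ is true:

$$F = \frac{MS_{between}}{MS_{error}}, \qquad \frac{E(MS_{between})}{E(MS_{error})} = \frac{\sigma^2_\epsilon}{\sigma^2_\epsilon}$$

■ If $H_o$ is false:

$$F = \frac{MS_{between}}{MS_{error}}, \qquad \frac{E(MS_{between})}{E(MS_{error})} = \frac{\sigma^2_\epsilon + \sum_{j=1}^{J} n_j\alpha_j^2/(J - 1)}{\sigma^2_\epsilon}$$

■ When $H_o$ is false, the test statistic $F = MS_{between}/MS_{error}$ follows a non-central $F$ distribution.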

Multivariate Relationships and Multiple Linear Regression Slide 82 of 119

A Summary of 1-Factor ANOVA

■ Statistical Hypotheses: $H_o: \mu_1 = \mu_2 = \ldots = \mu_J$ or $H_o: \alpha_1 = \alpha_2 = \ldots = \alpha_J = 0$ versus $H_a$: at least one $\mu_j$ (or $\alpha_j$) differs from the rest.

■ Assumptions: $Y_{ij} \sim N(\mu_j, \sigma^2_\epsilon)$ i.i.d. (or write out the model).

■ Test Statistic:

Source              df                          SS                                                     MS             F
Between             $(J - 1)$                   $\sum_{j=1}^{J} n_j(\bar{Y}_j - \bar{Y})^2$             $SS_b/\nu_b$   $MS_b/MS_w$
Within              $\sum_{j=1}^{J}(n_j - 1)$   $\sum_{j=1}^{J}\sum_{i=1}^{n_j}(Y_{ij} - \bar{Y}_j)^2$  $SS_w/\nu_w$
Total (corrected)   $N - 1$                     $\sum_{j=1}^{J}\sum_{i=1}^{n_j}(Y_{ij} - \bar{Y})^2$

■ Sampling distribution: If the null hypothesis is true, then $F \sim$ (central) $F_{\nu_b,\nu_w}$ distribution.

■ Decision and Conclusion: If you reject $H_o$, all we know is that at least one group has a mean that’s different from the rest.

Multivariate Relationships and Multiple Linear Regression Slide 83 of 119

ANOVA & SAS

■ ASSIST

■ Analyst

■ Program Commands

Multivariate Relationships and Multiple Linear Regression Slide 84 of 119

SAS/ASSIST

■ Put a data set in SAS working memory

■ On main toolbar: Solutions > ASSIST > Data Analysis > ANOVA > Analysis of Variance

■ In ANOVA window, fill in

◆ Table −→ data set,

◆ Dependent −→ dependent variable

◆ Classification −→ factor

■ Click on RUN.

Multivariate Relationships and Multiple Linear Regression Slide 85 of 119

SAS/Program Commands

■ In the program/text editor window:

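    PROC GLM DATA=ivt;            /* data set to analyze */
      CLASS instruct;             /* instruct is the categorical factor */
      MODEL ivt = instruct;       /* response ivt modeled by instruct */
      TITLE 'ANOVA for 1-factor Wiley & Voss';
    RUN;

■ Click on RUN on the toolbar

■ Note: This is what SAS/ASSIST does.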
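For readers without SAS, the core of what a one-factor analysis reports can be sketched in plain Python. This is not a reimplementation of PROC GLM; `one_way_anova` is a hypothetical helper, and the scores below are made up for illustration (they are NOT the Wiley & Voss data). The group labels only mimic the levels of the `instruct` factor.

```python
def one_way_anova(data):
    """One-way ANOVA quantities from a dict mapping group label -> scores."""
    groups = list(data.values())
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_model = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_total = sum((y - grand_mean) ** 2 for g in groups for y in g)
    ss_error = ss_total - ss_model
    df_model, df_error = len(groups) - 1, n_total - len(groups)
    ms_model, ms_error = ss_model / df_model, ss_error / df_error
    return {
        "df": (df_model, df_error, n_total - 1),
        "SS": (ss_model, ss_error, ss_total),
        "MS": (ms_model, ms_error),
        "F": ms_model / ms_error,
        "R2": ss_model / ss_total,
    }

# Hypothetical scores keyed by factor level; illustration only.
scores = {
    "N": [62.0, 70.0, 58.0, 66.0],
    "E": [71.0, 65.0, 74.0, 68.0],
    "S": [60.0, 64.0, 59.0, 67.0],
    "A": [75.0, 80.0, 72.0, 79.0],
}
table = one_way_anova(scores)
for key, value in table.items():
    print(key, value)
```

The returned tuples follow the row order of the summary table: model (between), error (within), corrected total.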

Multivariate Relationships and Multiple Linear Regression Slide 86 of 119

SAS/Analyst

■ Have a data set in working memory.

■ On SAS main toolbar: → Solutions → Analysis → Analyst

■ In Analyst environment, File > Open by SAS name “WORK” −→ select the one you want.

■ On Analyst toolbar: → Statistics → ANOVA → 1 Way ANOVA

■ Fill in boxes with dependent and independent variable names.

■ Request any other options you want.

■ Click “OK”.

Multivariate Relationships and Multiple Linear Regression Slide 87 of 119

Unequal Sample Sizes

■ Example: Data are from Moore & McCabe, who got them from Kirmani, A. & Wright, P. (1989). Money talks: Perceived advertising expense and expected product quality. Journal of Consumer Research, 16, 344-353.

■ $Y_{ij}$ = rating of the quality of a take-home refrigerated entree, based on an ad, from 1 (bad) to 7 (good).

■ Factor: Information included in the ad:

◆ U: Undermine quality and advertising (n1 = 55)

◆ A: Affirm quality and advertising (n2 = 36)

◆ C: Control (n3 = 36)

Multivariate Relationships and Multiple Linear Regression Slide 88 of 119

The Data

Multivariate Relationships and Multiple Linear Regression Slide 89 of 119

The Data and Means

Multivariate Relationships and Multiple Linear Regression Slide 90 of 119

Summary Statistics

Group     N    Mean        Std Dev     Variance
A         36   5.0555556   0.8261596   0.6825397
C         36   5.4166667   0.8742344   0.7642857
U         55   4.5090909   0.6904837   0.4767677
Total    127   4.9212598   0.8692845   0.7556555

$SS_{total} = (N - 1)s^2 = (127 - 1)(.7556555) = 95.2125$

$SS_{error} = (n_1 - 1)s^2_1 + (n_2 - 1)s^2_2 + (n_3 - 1)s^2_3$

$= (36 - 1)(.6825397) + (36 - 1)(.7642857) + (55 - 1)(.4767677)$

$= 76.3843$

$SS_{group} = SS_{total} - SS_{error} = 95.2125 - 76.3843 = 18.8282$
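The sums of squares above need only the group sizes and variances, so the whole ANOVA table can be rebuilt from summary statistics. The sketch below is an illustrative Python version of that arithmetic (the variable names are hypothetical); the numbers are the ones reported in the summary statistics table.

```python
# Summary statistics from the table above: group -> (n_j, sample variance s_j^2).
summary = {
    "A": (36, 0.6825397),
    "C": (36, 0.7642857),
    "U": (55, 0.4767677),
}
n_total = 127
total_variance = 0.7556555     # variance of all 127 ratings

# Sums of squares from variances: SS = (df) * (variance).
ss_total = (n_total - 1) * total_variance
ss_error = sum((n - 1) * s2 for n, s2 in summary.values())
ss_group = ss_total - ss_error

# Degrees of freedom, mean squares, and the F statistic.
df_group = len(summary) - 1
df_error = n_total - len(summary)
ms_group = ss_group / df_group
ms_error = ss_error / df_error
f_stat = ms_group / ms_error

print(round(ss_total, 3), round(ss_error, 3), round(ss_group, 3))
print(round(ms_group, 3), round(ms_error, 3), round(f_stat, 2))
```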

Multivariate Relationships and Multiple Linear Regression Slide 91 of 119

ANOVA Summary Table

Basically what you get from SAS:

                          Sum of      Mean
Source             DF     Squares     Square    F        Pr > F
Model               2     18.828      9.414     15.28    < .0001
Error             124     76.384      0.616
Corrected Total   126     95.213

R-Square    Coeff Var   Root MSE    qual Mean
0.197750    15.94832    0.784858    4.921260

Multivariate Relationships and Multiple Linear Regression Slide 92 of 119

Plot of The Means ±2sY

Multivariate Relationships and Multiple Linear Regression Slide 93 of 119

Effect Size

Effect Size: Estimates of treatment magnitude

■ The p-value of the $F$-statistic is not a measure of the size of an effect.

■ Goal: develop a comprehensive theory of a phenomenon.

◆ The importance of an experimental manipulation −→ the degree to which it can account for total variability among subjects by isolating the experimental effect.

◆ In observational studies, the importance of factors −→ the degree to which differences can be explained by the factors.

Multivariate Relationships and Multiple Linear Regression Slide 94 of 119

Effect Size

Estimates of treatment magnitude

■ Multiple $R^2 = SS_{between}/SS_{total}$

■ Omega squared, $\omega^2$.

■ Epsilon squared, $e^2$.

■ Non-centrality parameter.

Multivariate Relationships and Multiple Linear Regression Slide 95 of 119

Effect Size: Omega Squared

$$\omega^2 = \frac{\text{variance due to treatment}}{\text{total variance}} = \frac{\sum_{j=1}^{J}\alpha_j^2/J}{\sigma^2_\epsilon + \sum_{j=1}^{J}\alpha_j^2/J}$$

■ Most common effect size measure.

■ The proportion of total population variance due to (explained by) treatments.

■ Estimating $\omega^2$,

$$\hat{\omega}^2 = \frac{SS_{between} - (J - 1)MS_{error}}{SS_{total} + MS_{error}}$$

◆ $SS_{between}$ reflects treatment magnitude and $\sigma^2_\epsilon$.

◆ $MS_{error}$ only reflects $\sigma^2_\epsilon$.

◆ There are other algebraically equivalent formulas; this one is easiest to compute but obscures the logic.

◆ Quality rating data: $\hat{\omega}^2 = \frac{18.828 - (3 - 1)(.616)}{95.213 + .616} = .18$

Multivariate Relationships and Multiple Linear Regression Slide 96 of 119

Properties of Omega Squared

■ $0 < \omega^2 < 1.00$ (unless the $F$ statistic is $< 1$).

■ For social science research, Cohen (Keppel, 1982) suggests

◆ “Large” −→ $\omega^2 > .15$

◆ “Medium” −→ $\omega^2 \approx .06$

◆ “Small” −→ $\omega^2 \approx .01$

■ $\omega^2$ is not a test statistic, but a significant $F$ statistic implies that $\omega^2$ is “significantly” greater than 0.

Multivariate Relationships and Multiple Linear Regression Slide 97 of 119

Epsilon Squared, e2

$$e^2 = \frac{SS_{between} - (J - 1)MS_{error}}{SS_{total}}$$

■ $e^2 > \omega^2$ because of the difference in denominators.

■ Quality rating data:

$e^2 = (18.828 - (3 - 1)(.616))/95.213 = .185$.

■ In a “good” experiment, the difference between $e^2$ and $\omega^2$ should be small, i.e., $MS_{error} = \hat{\sigma}^2_\epsilon$ is small.

Multivariate Relationships and Multiple Linear Regression Slide 98 of 119

Notes on Effect Size Measures

■ $e^2$ comes out of the multiple regression framework.

■ $R^2 > e^2$ because $e^2$ is obtainable from “adjusted” or “shrunken” $R^2$.

■ $\omega^2 \le e^2 \le R^2$. For quality rating data:

$\omega^2 = .184 < e^2 = .185 < R^2 = .198$

■ $e^2$ is a better estimate of the strength in the population than $R^2$.

■ For simple designs, can use either $\omega^2$ or $e^2$.

■ $\omega^2$ has been extended to more complex designs.
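The three effect-size estimates for the quality rating data can be computed side by side from the ANOVA summary table. This is an illustrative Python sketch with hypothetical variable names; the inputs are the rounded values reported by SAS. It also checks that $e^2$ matches the usual adjusted $R^2$, $1 - (1 - R^2)(N - 1)/(N - J)$, up to rounding of the inputs.

```python
# Rounded values from the SAS summary table for the quality rating data.
ss_between, ss_total = 18.828, 95.213
ms_error = 0.616
J, n_total = 3, 127

r_squared = ss_between / ss_total
epsilon_sq = (ss_between - (J - 1) * ms_error) / ss_total
omega_sq = (ss_between - (J - 1) * ms_error) / (ss_total + ms_error)

# Adjusted ("shrunken") R^2 from the regression framework.
adjusted_r_sq = 1 - (1 - r_squared) * (n_total - 1) / (n_total - J)

print(round(omega_sq, 3), round(epsilon_sq, 3), round(r_squared, 3))
print(round(adjusted_r_sq, 3))
```

The small gap between `epsilon_sq` and `adjusted_r_sq` comes only from using rounded sums of squares; algebraically the two are the same quantity.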

Multivariate Relationships and Multiple Linear Regression Slide 99 of 119

Power

The decision of a hypothesis test will either be correct or incorrect.

Possible Outcomes:

                          Actual State of World
Decision       $H_o$ true                  $H_a$ true
retain $H_o$   correct, $1 - \alpha$       Type II error, $\beta$
reject $H_o$   Type I error, $\alpha$      correct, Power $= 1 - \beta$
(total)        1                           1

Multivariate Relationships and Multiple Linear Regression Slide 100 of 119

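Power

When the alternative $H_a$ is true,

■ At least one mean is not equal to the rest (i.e., at least one $\alpha_j \neq 0$).

■ The sampling distribution of the $F$ statistic is a non-central $F$, which depends on

◆ $\nu_{between}$

◆ $\nu_{error}$ (i.e., $\nu_{within}$)

◆ the non-centrality parameter,

$$\phi = \sqrt{\frac{\sum_{j=1}^{J} n_j\alpha_j^2}{J\sigma^2_\epsilon}} \quad\text{or}\quad \phi^2 = \frac{\sum_{j=1}^{J} n_j\alpha_j^2}{J\sigma^2_\epsilon}$$

where, when estimating from data,

■ $\hat{\sigma}^2_\epsilon = MS_{error}$

■ $\hat{\alpha}_j = \bar{Y}_j - \bar{Y}$.

■ The non-centrality parameter is needed to compute power.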

Multivariate Relationships and Multiple Linear Regression Slide 101 of 119

Example: Non-Centrality Parameter

Quality rating data (summary statistics on page 90, ANOVA summary table on page 91):

■ Estimated treatment effects: $\hat{\alpha}_j = \bar{Y}_j - \bar{Y}$,

$\hat{\alpha}_A = 5.0556 - 4.9213 = .1343$, $\hat{\alpha}_C = 5.4167 - 4.9213 = .4954$,

$\hat{\alpha}_U = 4.5091 - 4.9213 = -.4122$

■ Non-centrality parameter:

$$\hat{\phi}^2 = \frac{36(.1343)^2 + 36(.4954)^2 + 55(-.4122)^2}{3(.616)} = \frac{36(.01804) + 36(.2454) + 55(.1699)}{1.848} = \frac{18.8283}{1.848} = 10.188$$
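The non-centrality arithmetic above can be reproduced directly from the group sizes and means. The following Python sketch is illustrative (variable names are hypothetical); it uses the group means from the summary statistics and $MS_{error} = .616$ from the ANOVA table. Note that the weighted sum of squared effects is just $SS_{between}$.

```python
# Group sizes and means from the summary statistics; MS_error from the ANOVA table.
groups = {
    "A": (36, 5.0555556),
    "C": (36, 5.4166667),
    "U": (55, 4.5090909),
}
ms_error = 0.616
J = len(groups)
n_total = sum(n for n, _ in groups.values())

# Grand mean is the n_j-weighted average of the group means.
grand_mean = sum(n * m for n, m in groups.values()) / n_total

# Estimated treatment effects: group mean minus grand mean.
effects = {g: m - grand_mean for g, (n, m) in groups.items()}

# Weighted sum of squared effects (equals SS_between) and phi squared.
weighted_ss = sum(n * effects[g] ** 2 for g, (n, _) in groups.items())
phi_squared = weighted_ss / (J * ms_error)

print({g: round(a, 4) for g, a in effects.items()})
print(round(weighted_ss, 4), round(phi_squared, 3))
```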

Multivariate Relationships and Multiple Linear Regression Slide 102 of 119

Things that Affect Power

■ Prob(Type I Error) $= \alpha$: ↑ $\alpha$ =⇒ Power ↑.

■ Within groups degrees of freedom, $\nu_w = \nu_e$: ↑ $\nu_e$ =⇒ Power ↑.

■ Between groups, $\nu_b = J - 1$ (least effect): ↑ $\nu_b$ =⇒ Power ↑.

■ Cell sample size $n_j$ (related to $\nu_e$): ↑ $n_j$ =⇒ Power ↑.

■ Effect sizes, i.e., $\alpha_j = \mu_j - \mu$, or $\sum_j n_j\alpha_j^2$: ↑ $|\alpha_j|$ =⇒ Power ↑.

■ Variance due to unsystematic sources, $\sigma^2_\epsilon$: ↓ $\sigma^2_\epsilon$ =⇒ Power ↑.

Multivariate Relationships and Multiple Linear Regression Slide 103 of 119

Things that Affect Power (continued)

■ These things are inter-dependent (except for significance level).

■ To adjust and/or influence power during the design stage,

◆ Sample size.

◆ Error variance.

■ To compute power, we need to know:

$\alpha_j$, $\sigma^2_\epsilon$, $n_j$, and $J$.

■ Prospectively: make educated guesses.

■ Retrospectively: use data.
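One way to explore power prospectively, without non-central $F$ tables, is to simulate it from educated guesses. The sketch below is a Monte Carlo illustration in Python and not the method used in the slides: the population means, $n = 16$, $\sigma = 1$, the seed, and the helper names are assumptions, and `F_CRIT = 2.76` is the tabled upper 5% point of $F_{3,60}$, hard-coded for this $J = 4$, $n = 16$ design.

```python
import random

F_CRIT = 2.76   # tabled upper 5% point of F(3, 60); valid only for J = 4, n = 16

def f_ratio(groups):
    """Between-group mean square divided by within-group mean square."""
    means = [sum(g) / len(g) for g in groups]
    n_total = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n_total
    ss_b = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_w = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    return (ss_b / (len(groups) - 1)) / (ss_w / (n_total - len(groups)))

def simulated_power(pop_means, n=16, sigma=1.0, reps=2000, seed=580):
    """Share of simulated experiments whose F exceeds the critical value."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        groups = [[rng.gauss(mu, sigma) for _ in range(n)] for mu in pop_means]
        if f_ratio(groups) > F_CRIT:
            rejections += 1
    return rejections / reps

# Shift one population mean by increasing amounts (in sigma units).
for shift in (0.0, 0.5, 1.5):
    print(shift, simulated_power([0.0, 0.0, 0.0, shift]))
```

With a shift of 0 the rejection rate estimates the Type I error rate, so it should land near $\alpha = .05$; larger shifts give larger treatment effects and therefore higher power.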

Multivariate Relationships and Multiple Linear Regression Slide 104 of 119

Main Sources of Error Variance

■ Random variation in actual treatments (no experimental treatment is exactly the same for every subject), e.g.,

◆ calibration of equipment,

◆ environmental factors (noise, humidity, temperature, illumination, etc.),

◆ training & experience of experimenters.

■ Unanalyzed control factors or “nuisance” factors: add them to the analysis so they are not confounded with treatment.


Main Sources of Error Variance

■ Individual differences or subject variability:

◆ Select subjects who are similar with respect to important and relevant characteristics.

◆ Type of “matching”

◆ Repeated measures

◆ Analysis of covariance.


Computing Power (or Sample Size)

SAS/ANALYST (only equal sample sizes)

■ Solutions > Analysis > Analyst > Statistics > Sample size > One-way ANOVA.

■ In the window that opens up, you will need to enter (made-up numbers):

Field                Value               Notes
Calculate            power               The other option is sample size
# of treatments      3                   $J$
CSS of Means         .38375              This equals $\sum_j (\mu_j - \mu)^2$
Standard Deviation   .49551              This is the square root of $MS_{error}$
Alpha                .05                 Significance level
N per group          From 5 To 20 By 1

■ Click “OK”.


Sample Size Table: One-Way ANOVA

# Treatments = 3, CSS of Means = .38375, Standard Deviation = .49551, Alpha = 0.05

N per Group   Power
5             0.588
6             0.696
7             0.781
8             0.845
9             0.893
10            0.927
11            0.950
12            0.967
13            0.978
14            0.986
15            >.99
16            >.99
17            >.99
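Power values like these can be approximated without SAS by simulation. The sketch below is a Monte Carlo check, not the method SAS/ANALYST uses. It assumes normal errors, picks hypothetical group means (−a, 0, a) whose corrected sum of squares equals .38375, and uses a closed-form critical value for F(2, ν_e) that is valid only when J = 3. The function names, the number of replications, and the seed are arbitrary choices.

```python
import math
import random
import statistics

def f_statistic(groups):
    """One-way ANOVA F statistic; `groups` is a list of samples, one per group."""
    n_total = sum(len(g) for g in groups)
    means = [statistics.fmean(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    ms_between = ss_between / (len(groups) - 1)
    ms_within = ss_within / (n_total - len(groups))
    return ms_between / ms_within

def f_crit_df2(df_error, alpha=0.05):
    """Upper-alpha critical value of F(2, df_error); numerator df = 2 only."""
    return (df_error / 2) * (alpha ** (-2 / df_error) - 1)

def simulated_power(means, sd, n, alpha=0.05, reps=4000, seed=580):
    """Share of simulated balanced experiments in which H0 is rejected."""
    rng = random.Random(seed)
    crit = f_crit_df2(len(means) * (n - 1), alpha)
    rejections = 0
    for _ in range(reps):
        groups = [[rng.gauss(mu, sd) for _ in range(n)] for mu in means]
        if f_statistic(groups) > crit:
            rejections += 1
    return rejections / reps

a = math.sqrt(0.38375 / 2)   # hypothetical means (-a, 0, a) give CSS = .38375
means = [-a, 0.0, a]
for n in (5, 10, 15):
    print(n, simulated_power(means, sd=0.49551, n=n))
```

The estimates carry simulation error, so they should land near the tabled values rather than match them exactly. Only the corrected sum of squares of the means matters for power here, so any other set of three means with the same CSS would serve equally well.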


Power Plot [figure not reproduced]


Violations of Assumptions

■ Checking the validity of assumptions tends to be neglected.

■ Our model (and its assumptions) is

$$Y_{ij} = \mu + \alpha_j + \epsilon_{ij},$$

where it is assumed that

$$\epsilon_{ij} \sim N(0, \sigma^2_\epsilon) \quad \text{i.i.d.}$$

■ That is, the assumptions are

◆ Normality

◆ Homogeneity of variance

◆ Independence


Normality Assumption

■ The assumption: errors (and hence $Y_{ij}$) are normally distributed.

■ Detection of non-normality:

◆ Histograms, box plots, stem-and-leaf displays of $\hat{\epsilon}_{ij}$ (or $Y_{ij}$).

◆ Normal probability plot of $\hat{\epsilon}_{ij} = Y_{ij} - \bar{Y}_j$.

◆ Statistical tests of normality (Kolmogorov D, Shapiro-Wilk).
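The residuals and the coordinates of a normal probability plot are simple to compute by hand. The following is a minimal sketch using only the Python standard library; the ratings are hypothetical (not the course data), the helper names are made up, and (i − 0.5)/n is just one common choice of plotting position.

```python
from statistics import NormalDist, fmean

def residuals(groups):
    """Residuals e_ij = Y_ij - mean(Y_j), pooled over all groups."""
    return [y - fmean(g) for g in groups for y in g]

def normal_plot_points(values):
    """(theoretical quantile, ordered value) pairs for a normal probability plot."""
    n = len(values)
    std_normal = NormalDist()
    return [(std_normal.inv_cdf((i - 0.5) / n), v)
            for i, v in enumerate(sorted(values), start=1)]

# Hypothetical ratings for three groups (not the course data).
groups = [[5, 6, 4, 5, 6], [6, 5, 7, 5, 6], [4, 5, 4, 3, 5]]
for q, e in normal_plot_points(residuals(groups)):
    print(f"{q:6.2f} {e:6.2f}")
```

If the errors are roughly normal, the printed pairs should fall close to a straight line when plotted; marked curvature suggests skewness or heavy tails.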


Example: Perceived Quality [figure not reproduced]


Example: Perceived Quality (continued) [figure not reproduced]


Violation of Normality

■ Non-normality tends to have negligible effects on the probability of errors (i.e., α and β), except

1. When the $n_j$'s are small.

2. When distributions are highly skewed, e.g.,

◆ Binomial variable with tiny π.

◆ Test scores with a “ceiling” or “floor”.

■ Remedies:

1. Get larger samples.

2. Use a more appropriate statistical procedure, e.g., a non-parametric test or a generalized linear model.


Homogeneity of Variance Assumption

■ The assumption: $\sigma^2_1 = \sigma^2_2 = \ldots = \sigma^2_J$.

■ Detection of heterogeneous variances:

◆ If you have large samples from normal populations, use a statistical test of homogeneity of variance (e.g., Bartlett's, Cochran's, Scheffé's, and others' tests).

◆ Use what you know about the data, e.g., if the dependent or response variable is a count or frequency, beware.

◆ Graphical display: plot $\hat{\sigma}_j = s_j$ versus $X_j$.

■ Our data . . .


Example: Perceived Quality

Level       $n_j$   $\bar{Y}_j$   $s_j$   $s_j^2$
Affirm      36      5.056         0.826   0.683
Control     36      5.417         0.874   0.764
Undermine   55      4.509         0.690   0.477

From SAS/ANALYST:

Test                        Source   df    SS        MS       Test Statistic   p-value
Levene's test               group    2     1.8389    0.9194   F = 1.68         .19
                            Error    124   67.6690   0.5457
Brown and Forsythe's test   group    2     0.3728    0.1864   F = 0.48         .62
                            Error    124   47.7217   0.3849
Bartlett's test             group    2                        χ² = 2.67        .26
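The F statistics and p-values for Levene's and Brown and Forsythe's tests follow directly from the sums of squares and degrees of freedom in the SAS output. The sketch below recomputes them; it relies on the closed form $P(F_{2,\nu} > f) = (1 + 2f/\nu)^{-\nu/2}$, which holds only when the numerator df is 2 (three groups, as here). The function name is made up for illustration.

```python
def f_and_p_df2(ss_group, ss_error, df_error):
    """F = MS_group / MS_error for 2 numerator df, with its upper-tail p-value.

    The p-value uses P(F(2, v) > f) = (1 + 2f/v) ** (-v/2), which is valid
    only when the numerator degrees of freedom equal 2.
    """
    ms_group = ss_group / 2
    ms_error = ss_error / df_error
    f = ms_group / ms_error
    p = (1 + 2 * f / df_error) ** (-df_error / 2)
    return f, p

# SS and df values taken from the SAS output for the quality ratings.
levene = f_and_p_df2(1.8389, 67.6690, 124)
brown_forsythe = f_and_p_df2(0.3728, 47.7217, 124)
print("Levene: F = %.2f, p = %.2f" % levene)
print("Brown-Forsythe: F = %.2f, p = %.2f" % brown_forsythe)
```

Both p-values are well above .05, so neither test gives evidence against equal variances for these data.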


Effect of Heterogeneous Variance

■ Little (negligible) when nj = n (i.e., a balanced design).

■ Systematic effects

◆ When $n_1 > n_2$ and $s_1^2 > s_2^2$, the test is conservative; the actual α is smaller than desired.

◆ When $n_1 < n_2$ and $s_1^2 > s_2^2$, the test is liberal; the actual α level is larger than desired.

■ Remedies: use the same ones as recommended for the two independent groups t-test.
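The conservative and liberal patterns can be seen in a small simulation. The sketch below is an illustration under assumptions, not a result from the course data: it uses three groups rather than two (so that a closed-form critical value for F(2, ν_e) applies), hypothetical sample sizes of 5, 5, and 40, normal errors, and equal population means so that every rejection is a Type I error.

```python
import random
import statistics

def f_statistic(groups):
    """One-way ANOVA F statistic; `groups` is a list of samples, one per group."""
    n_total = sum(len(g) for g in groups)
    means = [statistics.fmean(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    return (ss_between / (len(groups) - 1)) / (ss_within / (n_total - len(groups)))

def type1_rate(sizes, sds, alpha=0.05, reps=2000, seed=580):
    """Estimated rejection rate when all population means are equal (H0 true)."""
    rng = random.Random(seed)
    df_error = sum(sizes) - len(sizes)
    # Closed-form critical value of F(2, df_error); three groups only.
    crit = (df_error / 2) * (alpha ** (-2 / df_error) - 1)
    rejections = 0
    for _ in range(reps):
        groups = [[rng.gauss(0.0, sd) for _ in range(n)]
                  for n, sd in zip(sizes, sds)]
        if f_statistic(groups) > crit:
            rejections += 1
    return rejections / reps

sizes = [5, 5, 40]  # hypothetical, deliberately unbalanced
print("larger variance in larger group:  ", type1_rate(sizes, [1, 1, 3]))
print("larger variance in smaller groups:", type1_rate(sizes, [3, 3, 1]))
print("equal variances:                  ", type1_rate(sizes, [1, 1, 1]))
```

With the larger variance in the larger group the estimated rate should fall below .05 (conservative), and with the larger variance in the smaller groups it should rise well above .05 (liberal); with equal variances it should stay near the nominal level.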


The Independence Assumption

■ The assumption: the $Y_{ij}$ are independent within and between groups.

■ Detection of violation:

◆ Hard; rely on what you know about the experimental procedures and data collection.

◆ Compute the intra-class correlation.

■ Our data on quality ratings: $r_{\text{intra}} = .32$ (not significant).


SAS/Intraclass Correlation

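proc mixed data=quality noclprint covtest method=REML;
  class group;                  /* grouping factor */
  model qual = / solution;      /* intercept-only (null) model for the ratings */
  random int / subject=group;   /* random intercept for each group */
  title 'Random effects ANOVA / random intercept HLM (null model)';
run;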

Edited SAS output:

Covariance Parameter Estimates

Cov Parm    Subject   Estimate
Intercept   group     0.1963
Residual              0.6159

$$r_{\text{intra}} = \frac{.1963}{.6159} = .3187$$
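The arithmetic can be checked directly from the two covariance parameter estimates. The sketch below reproduces the ratio shown on the slide and, for comparison, also computes the more commonly used form of the intraclass correlation for a random-intercept model, which divides the between-group variance by the total variance. That second quantity is an addition here and is not reported in the source; the variable names are made up.

```python
tau2 = 0.1963     # Intercept (between-group) variance estimate from the output
sigma2 = 0.6159   # Residual (within-group) variance estimate from the output

ratio_on_slide = tau2 / sigma2       # ratio of the two estimates, as on the slide
icc_total = tau2 / (tau2 + sigma2)   # between-group share of total variance

print(round(ratio_on_slide, 4))
print(round(icc_total, 4))
```

The two quantities answer slightly different questions, so it is worth stating which one is being reported when an intraclass correlation is quoted.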


Effect of Violating Independence

■ Effect of dependence (a serious problem).

◆ Affects the significance level and power of the F test.

◆ When observations are dependent, the actual (true) probability of a Type I error is likely to be larger than the stated one; i.e., if there is a positive association between responses, true α > desired α.

■ Remedies:

◆ If appropriate, use another model.

◆ Re-do the study/experiment using improved procedures.