Analysis of Variance (ANOVA)

Hands-on Data Analysis with R
University of Neuchatel, 10 May 2016

Bernadetta Tarigan, Dr. sc. ETHZ
Tarigan Statistical Consulting & Coaching
statistical-coaching.ch

Doctoral Program in Computer Science of the Universities of Fribourg, Geneva, Lausanne, Neuchâtel, Bern and the EPFL

Intro

1-way ANOVA & Repeated Measures

2-way ANOVA & Repeated Measures

Introduction

Variable type data

Variable of interest: guessed, nearGuessed and notGuessed

Goal: TO EXPLAIN
1. What is the distribution of the guessed variables in every approach & heuristic, and what is the best way to compare them?
2. For each heuristic, how to compare the relation between guessed package and guessed library?
3. For each heuristic, how to compare guessed, nearGuessed and notGuessed to get some measure of success?

Do the numbers in the groups come from the same population?

Response: one quantitative variable

Explanatory: one categorical variable with 4 levels

# groups = 4

Do the numbers in the groups come from the same population?

Response: one quantitative variable

Explanatory: two categorical variables with 4 levels and 2 levels

# groups = 8
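As a small illustration of how the groups arise, the sketch below crosses two factors and counts the resulting cells. The level labels are hypothetical; only the counts (4 levels and 2 levels) come from the slide.

```python
from itertools import product

# Hypothetical level labels; only the counts (4 and 2) come from the slide.
factor_1 = ["A", "B", "C", "D"]
factor_2 = ["low", "high"]

# Every combination of one level from each factor defines one group.
groups = list(product(factor_1, factor_2))
n_groups = len(groups)

print(n_groups)
for group in groups:
    print(group)
```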

ANOVA: parametric approach

• Comparing groups = comparing populations = comparing distributions
• Distribution: parametric or nonparametric?
• Assumptions of ANOVA
  – Populations are from the Normal family
  – Populations have equal variance
  – Observations across the groups are iid (independent and identically distributed)
  – Different sizes of groups are allowed
• Normal distribution: symmetric around its mean
• Comparing distributions = comparing means of the groups

ANOVA: Comparing 𝒌 groups, with 𝒌 ≥ 𝟑

ANOVA
• Response = always one quantitative variable
• Explanatory = always categorical, one or more variables
  – One way: 1 categorical variable with 𝑘 levels, #groups = 𝑘
  – Two ways: 2 categorical variables with 𝑘1 & 𝑘2 levels, #groups = 𝑘1 ∙ 𝑘2
  – ...
• Different subjects / participants in each group: between subjects
• Same subjects / participants: within subjects / repeated measures

ANOVA and its siblings

ANCOVA

Covariance -> explanatory variables are not only categorical but also quantitative

MANOVA

Multi -> there is more than one response variable

MANCOVA

ANOVA hypotheses

1-way = 1 categorical explanatory variable (levels = 𝑘)

$H_0: \mu_1 = \mu_2 = \cdots = \mu_k$
• All population means are equal
• That is, no variation in means between groups

$H_1: \mu_i \neq \mu_j$ for at least one pair $(i, j)$
• At least one population mean is different from the others
• That is, there is variation between groups
• Does not mean that all population means are different (some pairs may be the same)

Illustration of the hypotheses

All population means are the same: the null hypothesis is true (no variation between groups)

At least one mean is different: the null hypothesis is NOT true (variation is present between groups)

How to test the hypotheses of means equality?

Analyze the data variance: variability of all subjects in all groups

• large variation between groups, small variation within groups
• large variation between groups, large variation within groups: group means may look different, but large variation within groups makes the evidence weak!
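A minimal sketch of this point, in plain Python with made-up numbers: both data sets below have the same group means (10, 12 and 14), and only the spread within the groups differs. The function name `f_ratio` and the data are illustrative assumptions, not part of the course material.

```python
def f_ratio(groups):
    """One-way ANOVA F ratio: between-group variance over within-group variance."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = sum(x for g in groups for x in g) / n
    means = [sum(g) / len(g) for g in groups]
    ssb = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ssw = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ssb / (k - 1)) / (ssw / (n - k))

# Hypothetical data: identical group means, different within-group spread.
tight = [[9, 10, 11], [11, 12, 13], [13, 14, 15]]
spread = [[4, 10, 16], [6, 12, 18], [8, 14, 20]]

f_tight = f_ratio(tight)
f_spread = f_ratio(spread)
print(f"small within-group variation: F = {f_tight:.2f}")
print(f"large within-group variation: F = {f_spread:.2f}")
```

The between-group variation is the same in both cases, so the drop in the ratio comes entirely from the larger within-group variation.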

The logic of ANOVA: decompose variability

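SST (total variability in data) = SSB (between-group variability, "signal") + SSW (within-group variability, "noise")

grand mean: $\bar{x} = \frac{1}{n} \sum_{j=1}^{k} \sum_{i=1}^{n_j} x_{ij}$, where $n = \sum_{j=1}^{k} n_j$

group mean: $\bar{x}_j = \frac{1}{n_j} \sum_{i=1}^{n_j} x_{ij}$

$SST = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (x_{ij} - \bar{x})^2$

$SSB = \sum_{j=1}^{k} n_j (\bar{x}_j - \bar{x})^2$

$SSW = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (x_{ij} - \bar{x}_j)^2$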
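The decomposition can be checked numerically. The sketch below, in plain Python with hypothetical data and unequal group sizes, computes the three sums of squares from their definitions and confirms that the total equals between plus within. The function name `decompose` is an assumption made for this illustration.

```python
def decompose(groups):
    """Return (SST, SSB, SSW) for a list of groups of observations."""
    n = sum(len(g) for g in groups)
    grand = sum(x for g in groups for x in g) / n          # grand mean
    means = [sum(g) / len(g) for g in groups]              # group means
    sst = sum((x - grand) ** 2 for g in groups for x in g)
    ssb = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ssw = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return sst, ssb, ssw

# Hypothetical data: three groups with different sizes.
groups = [[4.0, 5.0, 6.0], [7.0, 9.0], [1.0, 2.0, 3.0, 6.0]]
sst, ssb, ssw = decompose(groups)

print(f"SST = {sst:.4f}")
print(f"SSB = {ssb:.4f}")
print(f"SSW = {ssw:.4f}")
print(f"SSB + SSW = {ssb + ssw:.4f}")
```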

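Test: the ratio of SSB to SSW

Under the assumptions and $H_0: \mu_1 = \mu_2 = \cdots = \mu_k$, with $\sigma^2$ the common population variance, we get that:

$\frac{SSB}{\sigma^2} = \frac{1}{\sigma^2} \sum_{j=1}^{k} n_j (\bar{x}_j - \bar{x})^2 \sim \chi^2_{k-1}$

$\frac{SSW}{\sigma^2} = \frac{1}{\sigma^2} \sum_{j=1}^{k} \sum_{i=1}^{n_j} (x_{ij} - \bar{x}_j)^2 \sim \chi^2_{n-k}$

$\frac{SSB/(k-1)}{SSW/(n-k)} \sim F_{(k-1;\, n-k)}$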
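One way to get a feel for this result is a small simulation: draw all groups from the same Normal population (so the null hypothesis holds by construction) and look at the resulting ratios. The sketch below uses only the Python standard library; the seed, group sizes, and number of repetitions are arbitrary choices for illustration. An F distribution with d2 > 2 denominator degrees of freedom has mean d2 / (d2 − 2), so with k = 3 groups of 10 observations the simulated average should land near 27 / 25 = 1.08.

```python
import random

def f_ratio(groups):
    """One-way ANOVA statistic [SSB / (k - 1)] / [SSW / (n - k)]."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = sum(x for g in groups for x in g) / n
    means = [sum(g) / len(g) for g in groups]
    ssb = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ssw = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ssb / (k - 1)) / (ssw / (n - k))

random.seed(0)           # arbitrary seed, for reproducibility only
k, n_per_group = 3, 10   # hypothetical design
sims = []
for _ in range(2000):
    # All groups share the same mean and variance, so H0 is true.
    groups = [[random.gauss(0.0, 1.0) for _ in range(n_per_group)]
              for _ in range(k)]
    sims.append(f_ratio(groups))

mean_f = sum(sims) / len(sims)
print(f"average simulated F under H0: {mean_f:.3f}")
```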

ANOVA Table

Source of variation | SS (sum of squares) | df | Mean SS (aka variance) | F ratio
Between groups | SSB | k − 1 | MSB = SSB / (k − 1) | F = MSB / MSW (signal-to-noise ratio)
Within groups | SSW | n − k | MSW = SSW / (n − k) |
Total | SST = SSB + SSW | n − 1 | |

Interpretation:
• signal ≈ noise ⇒ 𝐹 ≈ 1
• signal ≫ noise ⇒ 𝐹 ≫ 1
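To make the table concrete, here is a sketch in plain Python that fills in every cell for a small hypothetical data set. The function name `anova_table` and the data are assumptions for illustration; statistical software produces the same quantities, usually together with a p-value, which is not computed here.

```python
def anova_table(groups):
    """Return one-way ANOVA rows as (source, SS, df, MS, F) tuples."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = sum(x for g in groups for x in g) / n
    means = [sum(g) / len(g) for g in groups]
    ssb = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ssw = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    msb = ssb / (k - 1)
    msw = ssw / (n - k)
    return [
        ("Between groups", ssb, k - 1, msb, msb / msw),
        ("Within groups", ssw, n - k, msw, None),
        ("Total", ssb + ssw, n - 1, None, None),
    ]

# Hypothetical data: three groups of three observations.
groups = [[9, 10, 11], [11, 12, 13], [13, 14, 15]]
table = anova_table(groups)

print(f"{'Source':<16}{'SS':>8}{'df':>4}{'MS':>8}{'F':>8}")
for source, ss, df, ms, f in table:
    ms_txt = "" if ms is None else f"{ms:.2f}"
    f_txt = "" if f is None else f"{f:.2f}"
    print(f"{source:<16}{ss:>8.2f}{df:>4}{ms_txt:>8}{f_txt:>8}")
```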

It is all simple in R…

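Repeated Measures: reducing the noise

Between subjects: $F = \frac{MSB}{MSW}$ ("signal" over "noise")

Repeated measures: $F = \frac{MSB}{MS_{error}}$

With the same subjects in every group, the variability between subjects is removed from the within-group noise. The noise is discounted, so the F value typically increases, which increases the power of the test to detect significant differences between groups.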
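The effect can be seen on a small made-up example. In the sketch below (plain Python, hypothetical scores), subjects differ a lot from each other but each subject changes in a similar way across conditions. The repeated-measures error term is taken as the within-group sum of squares minus the between-subject sum of squares, with (subjects − 1)(k − 1) degrees of freedom; all variable names are illustrative assumptions.

```python
from statistics import mean

# Hypothetical scores: one row per subject, one column per condition (group).
scores = [
    [10, 12, 14],
    [20, 21, 25],
    [30, 33, 34],
    [40, 42, 45],
]
n_subjects = len(scores)
k = len(scores[0])
grand = mean([x for row in scores for x in row])

condition_means = [mean([row[j] for row in scores]) for j in range(k)]
subject_means = [mean(row) for row in scores]

# "Signal": variability between conditions.
ssb = n_subjects * sum((m - grand) ** 2 for m in condition_means)
# "Noise" when subjects are ignored: variability within conditions.
ssw = sum((row[j] - condition_means[j]) ** 2
          for row in scores for j in range(k))
# Part of that noise is due to stable differences between subjects.
ss_subjects = k * sum((m - grand) ** 2 for m in subject_means)
ss_error = ssw - ss_subjects

msb = ssb / (k - 1)
msw = ssw / (n_subjects * k - k)
ms_error = ss_error / ((n_subjects - 1) * (k - 1))

f_between = msb / msw
f_repeated = msb / ms_error
print(f"F ignoring subjects:  {f_between:.3f}")
print(f"F repeated measures:  {f_repeated:.3f}")
```

In this example almost all of the within-group variability comes from differences between subjects, so removing it shrinks the denominator sharply.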

Example

Model Validation

The same as what we have seen for the simple linear regression model:
1. Homoscedasticity of the residuals in each group
2. Normal distribution of the residuals in each group
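As a rough sketch of the first check, the residuals of a one-way ANOVA are the observations minus their group mean, and their spread can be compared across groups. The data and names below are hypothetical, and the variance ratio is only a crude numerical summary; in practice both checks are usually made graphically (for example residual plots and Q-Q plots), which is not shown here.

```python
from statistics import mean, variance

def residuals_by_group(groups):
    """Residuals x_ij minus the mean of group j, kept separately per group."""
    return [[x - mean(g) for x in g] for g in groups]

# Hypothetical data: three groups of four observations.
groups = [
    [4.0, 5.0, 6.0, 9.0],
    [7.0, 9.0, 8.0, 14.0],
    [1.0, 2.0, 3.0, 6.0],
]
res = residuals_by_group(groups)

# Sample variance of the residuals in each group.
group_vars = [variance(r) for r in res]
# Ratio of largest to smallest variance: close to 1 suggests similar spread.
var_ratio = max(group_vars) / min(group_vars)

for j, v in enumerate(group_vars, start=1):
    print(f"group {j}: residual variance = {v:.3f}")
print(f"largest / smallest variance = {var_ratio:.3f}")
```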

2 way: 2 factors

total variability = between-group variability of Factor 1 ("signal") + between-group variability of Factor 2 ("signal") + Residuals (of the subjects) ("noise")
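A minimal sketch of this decomposition in plain Python follows. It assumes a balanced design (the same number of observations in every cell) and a model without interaction, so that the residual sum of squares is simply what remains after both factors are accounted for. The data, the level labels, and the helper name `factor_ss` are hypothetical.

```python
from statistics import mean

# Hypothetical balanced data:
# cells[(level of Factor 1, level of Factor 2)] = observations.
cells = {
    ("a1", "b1"): [1, 3],
    ("a1", "b2"): [3, 5],
    ("a2", "b1"): [5, 7],
    ("a2", "b2"): [7, 9],
}

def factor_ss(cells, position, grand):
    """Between-group sum of squares for the factor at the given key position."""
    by_level = {}
    for key, values in cells.items():
        by_level.setdefault(key[position], []).extend(values)
    return sum(len(v) * (mean(v) - grand) ** 2 for v in by_level.values())

all_values = [x for values in cells.values() for x in values]
n = len(all_values)
grand = mean(all_values)

sst = sum((x - grand) ** 2 for x in all_values)
ss_f1 = factor_ss(cells, 0, grand)
ss_f2 = factor_ss(cells, 1, grand)
# Residuals: what neither factor explains (balanced, no-interaction model).
ss_res = sst - ss_f1 - ss_f2

levels_1 = len({key[0] for key in cells})
levels_2 = len({key[1] for key in cells})
df_res = (n - 1) - (levels_1 - 1) - (levels_2 - 1)

f_1 = (ss_f1 / (levels_1 - 1)) / (ss_res / df_res)
f_2 = (ss_f2 / (levels_2 - 1)) / (ss_res / df_res)
print(f"SST = {sst}, Factor 1 = {ss_f1}, Factor 2 = {ss_f2}, Residuals = {ss_res}")
print(f"F (Factor 1) = {f_1:.2f}, F (Factor 2) = {f_2:.2f}")
```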

Any interaction between the 2 factors?
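One way to answer this question numerically is to add an interaction term to the decomposition. The sketch below, in plain Python, assumes a balanced design with replicates in every cell: the interaction sum of squares is the variability of the cell means that the two main effects do not explain, and it is tested against the within-cell residuals. The data and names are hypothetical, chosen so that the effect of Factor 2 reverses between the two levels of Factor 1.

```python
from statistics import mean

# Hypothetical balanced data with two replicates per cell.
cells = {
    ("a1", "b1"): [1, 3],
    ("a1", "b2"): [3, 5],
    ("a2", "b1"): [5, 7],
    ("a2", "b2"): [1, 3],
}

def factor_ss(cells, position, grand):
    """Between-group sum of squares for the factor at the given key position."""
    by_level = {}
    for key, values in cells.items():
        by_level.setdefault(key[position], []).extend(values)
    return sum(len(v) * (mean(v) - grand) ** 2 for v in by_level.values())

all_values = [x for values in cells.values() for x in values]
n = len(all_values)
grand = mean(all_values)

ss_f1 = factor_ss(cells, 0, grand)
ss_f2 = factor_ss(cells, 1, grand)
# Variability of the cell means around the grand mean.
ss_cells = sum(len(v) * (mean(v) - grand) ** 2 for v in cells.values())
# Interaction: cell-mean variability not explained by the main effects.
ss_inter = ss_cells - ss_f1 - ss_f2
# Residuals: variability of observations around their own cell mean.
ss_res = sum((x - mean(v)) ** 2 for v in cells.values() for x in v)

levels_1 = len({key[0] for key in cells})
levels_2 = len({key[1] for key in cells})
df_inter = (levels_1 - 1) * (levels_2 - 1)
df_res = n - levels_1 * levels_2

f_inter = (ss_inter / df_inter) / (ss_res / df_res)
print(f"SS interaction = {ss_inter}, SS residuals = {ss_res}")
print(f"F (interaction) = {f_inter:.2f}")
```

With unbalanced data the sums of squares no longer add up this simply, so this shortcut should not be used there.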

Repeated Measures: noise reduction