Chapter 6 Hypothesis Testing · • For hypothesis testing, research questions are statements: •...

22
Chapter 6 Hypothesis Testing

Transcript of Chapter 6 Hypothesis Testing · • For hypothesis testing, research questions are statements: •...

Page 1: Chapter 6 Hypothesis Testing · • For hypothesis testing, research questions are statements: • This is the null hypothesis (assumption of “no difference”) • Statistical

Chapter 6 Hypothesis Testing

Page 2: Chapter 6 Hypothesis Testing · • For hypothesis testing, research questions are statements: • This is the null hypothesis (assumption of “no difference”) • Statistical

What is Hypothesis Testing? •  … the use of statistical procedures to answer research

questions •  Typical research question (generic):

•  For hypothesis testing, research questions are statements:

•  This is the null hypothesis (assumption of “no difference”) •  Statistical procedures seek to reject or accept the null

hypothesis (details to follow) 2

Page 3: Chapter 6 Hypothesis Testing · • For hypothesis testing, research questions are statements: • This is the null hypothesis (assumption of “no difference”) • Statistical

•  Thus far: –  You have generated a hypothesis (E.g. The mean of group A is different

than the mean of group B) –  You have collected some data (samples in group A, samples in group B) –  Now you want to know if this data supports your hypothesis –  Formally: –  H0 (null hypothesis): there is no difference in the mean values of group A

and group B –  H1 (experimental hypothesis): there is a difference in the mean of group A

and group B

3

Page 4: Chapter 6 Hypothesis Testing · • For hypothesis testing, research questions are statements: • This is the null hypothesis (assumption of “no difference”) • Statistical

A practitioner’s point of view •  Test statistic

–  Inferential statistics tell us what is the likelihood that the experimental hypothesis is true à by computing a test statistic.

–  Typically, if the likelihood of obtaining a value of a test statistic is <0.05, then we can reject the null hypothesis

–  “…significant effect of …”

•  Non-significant results –  Does not mean that the null hypothesis is true –  Interpreted to mean that the results you are getting could be a chance

finding

•  Significant result –  Means that the null hypothesis is highly unlikely

4

Page 5: Chapter 6 Hypothesis Testing · • For hypothesis testing, research questions are statements: • This is the null hypothesis (assumption of “no difference”) • Statistical

•  Errors: –  Type 1 error (False positive) : we believe that there is an

effect when there isn’t one –  Type 2 error (False negative) : we believe that there isn’t an

effect, when there is one –  If p<0.05, then the probability of a Type 1 error is < 5%

(alpha level)

•  Typically, we deal with two types of hypotheses –  The mean of group A is different from the mean of group B

(one-tailed test) –  The mean of group A is larger than the mean of group B

(two-tailed test) 5

A practitioner’s point of view

Page 6: Chapter 6 Hypothesis Testing · • For hypothesis testing, research questions are statements: • This is the null hypothesis (assumption of “no difference”) • Statistical

Statistical Procedures •  Two types:

–  Parametric •  Data are assumed to come from a distribution, such as the

normal distribution, t-distribution, etc.

–  Non-parametric •  Data are not assumed to come from a distribution

–  Lots of debate on assumptions testing and what to do if assumptions are not met (avoided here, for the most part)

–  A reasonable basis for deciding on the most appropriate test is to match the type of test with the measurement scale of the data (next slide)

6

Page 7: Chapter 6 Hypothesis Testing · • For hypothesis testing, research questions are statements: • This is the null hypothesis (assumption of “no difference”) • Statistical

Measurement Scales vs. Statistical Tests

•  Parametric tests most appropriate for… –  Ratio data, interval data

•  Non-parametric tests most appropriate for… –  Ordinal data, nominal data (although limited use for ratio and

interval data) 7

M=Male, F=Female

Preference ranking

Likert scale responses

Task completion time

Examples

Page 8: Chapter 6 Hypothesis Testing · • For hypothesis testing, research questions are statements: • This is the null hypothesis (assumption of “no difference”) • Statistical

Tests Presented Here •  Parametric

–  T-test –  Analysis of variance (ANOVA) –  Most common statistical procedures in HCI research

8

Page 9: Chapter 6 Hypothesis Testing · • For hypothesis testing, research questions are statements: • This is the null hypothesis (assumption of “no difference”) • Statistical

T-test •  Goal: To ascertain if the difference in the means of two groups is significant •  Assumptions

–  Data are normally distributed (you checked for this by looking at the histograms, reporting the mean/median/standard deviation, and by running Shapiro-Wilks)

–  If data come from different groups of people à Independent t-test (assumes scores are independent and variances in the populations are roughly equal … check your table of descriptive statistics)

–  If data come from same group of people à dependent t-test •  Practioner’s point of view: When in doubt, consult a book! Let’s do an

example in R

9

Page 10: Chapter 6 Hypothesis Testing · • For hypothesis testing, research questions are statements: • This is the null hypothesis (assumption of “no difference”) • Statistical

Tests Presented Here •  Parametric

–  Analysis of variance (ANOVA) •  Used for ratio data and interval data •  Most common statistical procedure in HCI research

•  Non-parametric –  Chi-square test

•  Used for nominal data

–  Mann-Whitney U, Wilcoxon Signed-Rank, Kruskal-Wallis, and Friedman tests

•  Used for ordinal data

10

Page 11: Chapter 6 Hypothesis Testing · • For hypothesis testing, research questions are statements: • This is the null hypothesis (assumption of “no difference”) • Statistical

Analysis of Variance •  The analysis of variance (ANOVA) is the most

widely used statistical test for hypothesis testing in factorial experiments

•  Goal à determine if an independent variable has a significant effect on a dependent variable

•  Remember, an independent variable has at least two levels (test conditions)

•  Goal (put another way) à determine if the test conditions yield different outcomes on the dependent variable (e.g., one of the test conditions is faster/slower than the other)

11

Page 12: Chapter 6 Hypothesis Testing · • For hypothesis testing, research questions are statements: • This is the null hypothesis (assumption of “no difference”) • Statistical

Why Analyse the Variance? •  Seems odd that we analyse the variance, but the

research question is concerned with the overall means:

•  Let’s explain through two simple examples (next slide)

12

Page 13: Chapter 6 Hypothesis Testing · • For hypothesis testing, research questions are statements: • This is the null hypothesis (assumption of “no difference”) • Statistical

13

Example #1 Example #2

“Significant” implies that in all likelihood the difference observed is due to the test conditions (Method A vs. Method B).

“Not significant” implies that the difference observed is likely due to chance.

File: 06-AnovaDemo.xlsx

Page 14: Chapter 6 Hypothesis Testing · • For hypothesis testing, research questions are statements: • This is the null hypothesis (assumption of “no difference”) • Statistical

Example #1 - Details

14

Error bars show ±1 standard deviation

Note: SD is the square root of the variance

Note: Within-subjects design

Page 15: Chapter 6 Hypothesis Testing · • For hypothesis testing, research questions are statements: • This is the null hypothesis (assumption of “no difference”) • Statistical

Example #1 – ANOVA1

1 ANOVA table created by StatView (now marketed as JMP, a product of SAS; www.sas.com)

Probability of obtaining the observed data if the null hypothesis is true

Reported as…

F1,9 = 9.80, p < .05

Thresholds for “p” •  .05 •  .01 •  .005 •  .001 •  .0005 •  .0001

Page 16: Chapter 6 Hypothesis Testing · • For hypothesis testing, research questions are statements: • This is the null hypothesis (assumption of “no difference”) • Statistical

How to Report an F-statistic

•  Notice in the parentheses –  Uppercase for F –  Lowercase for p –  Italics for F and p –  Space both sides of equal sign –  Space after comma –  Space on both sides of less-than sign –  Degrees of freedom are subscript, plain, smaller font –  Three significant figures for F statistic –  No zero before the decimal point in the p statistic (except in Europe)

Page 17: Chapter 6 Hypothesis Testing · • For hypothesis testing, research questions are statements: • This is the null hypothesis (assumption of “no difference”) • Statistical

Example #2 - Details

Error bars show ±1 standard deviation

Page 18: Chapter 6 Hypothesis Testing · • For hypothesis testing, research questions are statements: • This is the null hypothesis (assumption of “no difference”) • Statistical

Example #2 – ANOVA

Reported as…

F1,9 = 0.626, ns

Probability of obtaining the observed data if the null hypothesis is true

Note: For non-significant effects, use “ns” if F < 1.0,

or “p > .05” if F > 1.0.

Page 19: Chapter 6 Hypothesis Testing · • For hypothesis testing, research questions are statements: • This is the null hypothesis (assumption of “no difference”) • Statistical

Example #2 - Reporting

19

Page 20: Chapter 6 Hypothesis Testing · • For hypothesis testing, research questions are statements: • This is the null hypothesis (assumption of “no difference”) • Statistical

More Than Two Test Conditions

20

Page 21: Chapter 6 Hypothesis Testing · • For hypothesis testing, research questions are statements: • This is the null hypothesis (assumption of “no difference”) • Statistical

ANOVA

•  There was a significant effect of Test Condition on the dependent variable (F3,45 = 4.95, p < .005)

•  Degrees of freedom –  If n is the number of test conditions and m is the number of

participants, the degrees of freedom are… –  Effect à (n – 1) –  Residual à (n – 1)(m – 1) –  Note: single-factor, within-subjects design

21

Page 22: Chapter 6 Hypothesis Testing · • For hypothesis testing, research questions are statements: • This is the null hypothesis (assumption of “no difference”) • Statistical

Post Hoc Comparisons Tests •  A significant F-test means that at least one of the test

conditions differed significantly from one other test condition

•  Does not indicate which test conditions differed significantly from one another

•  To determine which pairs differ significantly, a post hoc comparisons tests is used

•  Examples: –  Fisher PLSD, Bonferroni/Dunn, Dunnett, Tukey/Kramer, Games/

Howell, Student-Newman-Keuls, orthogonal contrasts, Scheffé

22