Statistics for Linguistics Students Michaelmas 2004 Week 7 Bettina Braun bettina/teaching.html.


Statistics for Linguistics Students

Michaelmas 2004, Week 7

Bettina Braun
www.phon.ox.ac.uk/~bettina/teaching.html


Overview

• Problems from last assignment

• Correlation analyses

• Repeated measures ANOVA
– One-way (one IV)
– Two-way (two IVs)

• Transformations


Chi-square using SPSS

• Organisation of data:


Chi-square using SPSS

• Where to find it…


Chi-square using SPSS

• How to interpret the output

Table similar to ours

Result: significant interaction (χ² = 5.7, df = 1, p = 0.017)
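As a rough illustration of where such a result comes from, the sketch below computes a Pearson chi-square by hand for a 2 x 2 table. The counts are hypothetical (they are not the data from the assignment), the function names are made up for this example, and no continuity correction is applied. For df = 1 the p-value can be obtained from the complementary error function, so only the standard library is needed.

```python
import math

def chi_square_2x2(table):
    """Pearson chi-square (no continuity correction) for a 2 x 2 table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count under independence of rows and columns
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    return chi2

def p_value_df1(chi2):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))

# Hypothetical counts: rows = two speaker groups, columns = two variants
counts = [[10, 20], [20, 10]]
chi2 = chi_square_2x2(counts)
print(f"chi2 = {chi2:.2f}, df = 1, p = {p_value_df1(chi2):.3f}")

# The statistic reported on the slide, converted to a p-value
print(f"p for chi2 = 5.7: {p_value_df1(5.7):.3f}")
```

Feeding the slide's value of 5.7 into `p_value_df1` gives a p-value of about 0.017, in line with the reported result.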


More on interactions

[Figure: five interaction plots, each with region (North, South) on the x-axis and separate lines for Male and Female. Panel captions:]

• No effect of region or gender, no interaction
• Effect of region and gender, no interaction
• No effect of gender, effect of region, no interaction
• Effect of region and gender, and interaction (two panels)
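The patterns in these panels can be described numerically from the four cell means. The sketch below is only descriptive (whether an effect is significant still requires an ANOVA), the cell means are hypothetical, and `describe_2x2` is a name invented for this example. The interaction is expressed as a difference of differences: it is zero when the two lines are parallel.

```python
def describe_2x2(means):
    """means[gender][region] -> (region effect, gender effect, interaction contrast)."""
    m_n, m_s = means["male"]["north"], means["male"]["south"]
    f_n, f_s = means["female"]["north"], means["female"]["south"]
    region_effect = (m_s + f_s) / 2 - (m_n + f_n) / 2   # South minus North
    gender_effect = (f_n + f_s) / 2 - (m_n + m_s) / 2   # Female minus Male
    interaction = (f_s - f_n) - (m_s - m_n)             # difference of differences
    return region_effect, gender_effect, interaction

# Hypothetical cell means for three of the patterns
parallel = {"male": {"north": 10, "south": 20}, "female": {"north": 15, "south": 25}}
region_only = {"male": {"north": 10, "south": 20}, "female": {"north": 10, "south": 20}}
diverging = {"male": {"north": 10, "south": 20}, "female": {"north": 15, "south": 35}}

for label, cell_means in [("region + gender, no interaction", parallel),
                          ("region only", region_only),
                          ("region + gender + interaction", diverging)]:
    print(label, describe_2x2(cell_means))
```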


Correlation analyses

• Often found in exploratory research
– You do not test the effect of an independent variable on the dependent one
– But see what relationships hold between two or more variables


Correlation coefficients

• Scatterplots are helpful for seeing whether the relationship is linear

[Figure: three scatterplots]

r = -1: negative correlation

r = 0: no correlation

r = 1: positive correlation


Bivariate correlation

• Do you expect a correlation between the two variables?

• Try “line-fitting” by eye



Pearson correlation

• A t-test is used to test whether the correlation coefficient differs from 0 (so the data must be interval!)

• If the data are not interval, use Spearman's correlation (non-parametric)
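A minimal sketch of both coefficients, assuming made-up data and helper names chosen for this example. Pearson's r is computed from deviations around the means, the t statistic for testing r against 0 uses df = n - 2, and Spearman's coefficient is obtained here as Pearson's r on the ranks (ties share the mean of their ranks). Looking up the p-value for t is left out.

```python
import math

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_for_r(r, n):
    """t statistic (df = n - 2) for the null hypothesis that the correlation is 0."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

def ranks(values):
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def spearman_rho(x, y):
    return pearson_r(ranks(x), ranks(y))

# Hypothetical data: monotonic but not linear (the last value is extreme)
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 30.0]
r = pearson_r(x, y)
print(f"Pearson r = {r:.3f}, t = {t_for_r(r, len(x)):.2f}, df = {len(x) - 2}")
print(f"Spearman rho = {spearman_rho(x, y):.3f}")
```

Because the hypothetical relationship is perfectly monotonic but not linear, the rank-based coefficient comes out higher than Pearson's r.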


Pearson correlation

• Correlation coefficient
– For interval data
– For linear relationships

• r² is the proportion of variation of one variable that is “explained” by the other

• Note: even a highly significant correlation does not imply a causal relationship (e.g. there might be another variable influencing both!)
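To see what "explained" means for r², the following sketch fits a least-squares line and compares the squared correlation with 1 minus the ratio of residual to total variation; the two numbers should agree. The data (speaking rate and vowel duration) and the function names are hypothetical.

```python
import math

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def explained_proportion(x, y):
    """1 - SS_residual / SS_total for the least-squares line predicting y from x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return 1 - ss_res / ss_tot

# Hypothetical data: speaking rate (syllables/s) and vowel duration (ms)
rate = [3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0]
duration = [142.0, 130.0, 128.0, 115.0, 109.0, 104.0, 92.0]

r = pearson_r(rate, duration)
print(f"r = {r:.3f}, r squared = {r ** 2:.3f}")
print(f"proportion of variation explained by the line = {explained_proportion(rate, duration):.3f}")
```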


Repeated measures ANOVA

• Recall:
– In between-subjects designs there are large individual differences
– A repeated measures (aka within-subjects) design has all participants in all levels of all conditions

• Problems:
– Practice (carry-over) effects
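The advantage of the within-subjects design can be illustrated by hand. In the sketch below, a one-way repeated measures ANOVA removes the variation between subjects from the error term; for comparison, the same scores are also analysed as if they came from independent groups. The scores are hypothetical and the function name is made up for this example; SPSS would additionally report p-values and sphericity corrections.

```python
def rm_anova_oneway(scores):
    """scores[s][c] = score of subject s in condition c (complete data required).

    Returns (F for the repeated measures analysis,
             F if the same scores were treated as independent groups).
    """
    n, k = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    cond_means = [sum(row[c] for row in scores) / n for c in range(k)]
    subj_means = [sum(row) / k for row in scores]

    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    ss_error = ss_total - ss_cond - ss_subj   # individual differences removed

    df_cond, df_error = k - 1, (k - 1) * (n - 1)
    f_within = (ss_cond / df_cond) / (ss_error / df_error)

    # Between-subjects view: subject variation stays in the error term
    f_between = (ss_cond / df_cond) / ((ss_total - ss_cond) / (k * (n - 1)))
    return f_within, f_between

# Hypothetical scores: 3 subjects (rows) x 3 conditions (columns)
scores = [[1.0, 2.0, 3.0],
          [2.0, 4.0, 3.0],
          [3.0, 4.0, 5.0]]
f_within, f_between = rm_anova_oneway(scores)
print(f"repeated measures F(2, 4) = {f_within:.2f}")
print(f"same scores as independent groups F(2, 6) = {f_between:.2f}")
```

With these made-up numbers the repeated measures F is considerably larger, because the consistent differences between subjects no longer count as error.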


Missing data

• You need to have data for every subject in every condition

• If this is not the case, you cannot include this subject

• If your design becomes unbalanced by the exclusion of a subject, you should randomly exclude a subject from the other group as well (or run another subject for the group with the exclusion)
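The exclusion steps above can be sketched as follows. The data layout (a dictionary per subject with a `group` label and a list of `scores`, with `None` marking a missing value), the subject names, and the function names are all assumptions made for this example. A fixed seed is used so that the random exclusion is reproducible.

```python
import random

def complete_cases(data):
    """Keep only subjects that have a score in every condition."""
    return {subj: row for subj, row in data.items()
            if all(score is not None for score in row["scores"])}

def balance_groups(data, seed=0):
    """Randomly drop subjects so that every group has the size of the smallest group."""
    rng = random.Random(seed)
    by_group = {}
    for subj, row in data.items():
        by_group.setdefault(row["group"], []).append(subj)
    target = min(len(members) for members in by_group.values())
    kept = []
    for members in by_group.values():
        kept.extend(rng.sample(members, target))
    return {subj: data[subj] for subj in sorted(kept)}

# Hypothetical data: two groups of three subjects, three conditions each
data = {
    "s1": {"group": "A", "scores": [4.1, 3.9, 5.0]},
    "s2": {"group": "A", "scores": [3.8, None, 4.7]},   # missing value
    "s3": {"group": "A", "scores": [4.4, 4.0, 5.2]},
    "s4": {"group": "B", "scores": [5.1, 4.8, 6.0]},
    "s5": {"group": "B", "scores": [4.9, 4.6, 5.7]},
    "s6": {"group": "B", "scores": [5.3, 5.0, 6.1]},
}
usable = balance_groups(complete_cases(data))
print(sorted(usable))
```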


Requirements for repeated measures ANOVA

• Same as for between-subjects ANOVA

• You can have within- and between-subject factors (e.g. boys vs. girls, producing /a/ and /i/ and /u/)

• Covariates
– Factors that might have an effect on the within-subjects factor
– Note: covariates can also be specified for between-subjects designs!


Covariates: example

• You want to study French skills when using 2 different textbooks. Students are randomly assigned to 2 groups. If you have the IQ of these students, you can decrease the variability within the groups by using IQ as a covariate

• Problem: if the covariate is correlated with the between-groups factor as well, the F-value might get smaller (less significant)!

• You can also assess interaction between covariates and between-groups factors (e.g. one textbook might be better suited for smart students)
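How a covariate decreases the variability within the groups can be sketched with a small computation. The code below uses hypothetical (IQ, score) pairs for two textbook groups, estimates one pooled within-group slope of score on IQ, and compares the within-group sum of squares before and after removing the part predicted by IQ. It is only the variance-reduction step of an analysis of covariance, not a full ANCOVA, and the function name is invented for this example.

```python
def within_group_ss(groups):
    """groups: list of groups, each a list of (iq, score) pairs.

    Returns (unadjusted within-group SS of the scores,
             within-group SS after adjusting for IQ,
             pooled within-group slope of score on IQ).
    """
    sxx = sxy = syy = 0.0
    for group in groups:
        n = len(group)
        mean_iq = sum(iq for iq, _ in group) / n
        mean_score = sum(score for _, score in group) / n
        for iq, score in group:
            sxx += (iq - mean_iq) ** 2
            sxy += (iq - mean_iq) * (score - mean_score)
            syy += (score - mean_score) ** 2
    slope = sxy / sxx
    adjusted = syy - slope * sxy   # residual SS around the pooled regression line
    return syy, adjusted, slope

# Hypothetical data: (IQ, French test score) for two textbook groups
textbook_a = [(95, 60.0), (100, 64.0), (110, 73.0), (120, 80.0)]
textbook_b = [(90, 62.0), (105, 75.0), (110, 78.0), (125, 90.0)]

unadjusted, adjusted, slope = within_group_ss([textbook_a, textbook_b])
print(f"within-group SS without covariate: {unadjusted:.1f}")
print(f"within-group SS with IQ as covariate: {adjusted:.1f} (slope = {slope:.2f})")
```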


One-way repeated measures ANOVA in SPSS


1. Define new name and levels for within-subject factor


One-way repeated measures ANOVA in SPSS

• Factor name

• Four levels of the within-subjects variable

• Enter between-subjects factors and covariates (if applicable)


Post-hoc tests for within-subjects variables

• SPSS does not allow you to do post-hoc tests for within-subjects variables

• Instead do “Contrasts” and define them as “Repeated”
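A "Repeated" contrast compares each level of the within-subjects factor with the next one. The sketch below mimics that idea by computing, for each pair of adjacent levels, the mean of the per-subject differences and a paired t statistic (df = n - 1); for a single contrast like this, the square of t corresponds to an F with 1 and n - 1 degrees of freedom. The scores and the function name are hypothetical, and no p-values or corrections for multiple comparisons are computed.

```python
import math
import statistics

def repeated_contrasts(scores):
    """scores[s][c] = score of subject s at level c.

    Returns a list of (level, next level, mean difference, paired t) tuples.
    """
    n, k = len(scores), len(scores[0])
    results = []
    for c in range(k - 1):
        diffs = [row[c] - row[c + 1] for row in scores]
        mean_diff = statistics.mean(diffs)
        std_error = statistics.stdev(diffs) / math.sqrt(n)
        results.append((c + 1, c + 2, mean_diff, mean_diff / std_error))
    return results

# Hypothetical scores: 3 subjects (rows) x 3 levels (columns)
scores = [[1.0, 2.0, 3.0],
          [2.0, 4.0, 3.0],
          [3.0, 4.0, 5.0]]
for level, next_level, mean_diff, t in repeated_contrasts(scores):
    print(f"level {level} vs. level {next_level}: mean difference = {mean_diff:.2f}, t = {t:.2f}")
```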



Post-hoc tests for within-subjects variables

• You can also ask for a comparison of means


SPSS output: test of Sphericity

• Test for homogeneity of covariances among scores of within-subjects factors

• Only calculated if variable has more than 2 levels

If the test is significant, you have to reject the null hypothesis that the variances are homogeneous
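Sphericity is commonly stated as the requirement that the variances of the difference scores are equal for every pair of levels. The sketch below only computes those variances for hypothetical scores so they can be compared by eye; it does not implement the test statistic that SPSS reports. It also shows why more than 2 levels are needed: with 2 levels there is only one pair, so there is nothing to compare. The function name is made up for this example.

```python
import itertools
import statistics

def difference_variances(scores):
    """Variance of the per-subject difference scores for every pair of levels."""
    k = len(scores[0])
    variances = {}
    for a, b in itertools.combinations(range(k), 2):
        diffs = [row[a] - row[b] for row in scores]
        variances[(a + 1, b + 1)] = statistics.variance(diffs)
    return variances

# Hypothetical scores: 3 subjects (rows) x 3 levels (columns)
scores = [[1.0, 2.0, 3.0],
          [2.0, 4.0, 3.0],
          [3.0, 4.0, 5.0]]
for pair, variance in difference_variances(scores).items():
    print(f"levels {pair[0]} and {pair[1]}: variance of differences = {variance:.2f}")
```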


SPSS output: within-subjects contrasts

• Post-hoc test for within-subjects variables


3 x 3 designs

• 3 x 3 between subjects

Factor B (between)

      B1       B2       B3
A1    Group1   Group2   Group3
A2    Group4   Group5   Group6
A3    Group7   Group8   Group9


3 x 3 designs

• 3 x 3 within subjects

Factor B (within)

      B1       B2       B3
A1    Group1   Group1   Group1
A2    Group1   Group1   Group1
A3    Group1   Group1   Group1


3 x 3 designs

• 3 x 3 mixed design

Factor B (within)

                          B1       B2       B3
Factor A (between)  A1    Group1   Group1   Group1
                    A2    Group2   Group2   Group2
                    A3    Group3   Group3   Group3
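The three layouts differ only in which group of participants fills each cell. As a small sketch (the function name and the design labels are chosen for this example), the grid of group labels can be generated for each design, which also makes it easy to count how many separate groups have to be recruited.

```python
def cell_groups(design, levels_a=3, levels_b=3):
    """Return a grid (rows = levels of A, columns = levels of B) of group labels."""
    grid = []
    group = 0
    for a in range(levels_a):
        row = []
        for b in range(levels_b):
            if design == "between":
                group += 1        # a new group of participants in every cell
            elif design == "within":
                group = 1         # the same group takes part in every cell
            elif design == "mixed":
                group = a + 1     # one group per level of A, reused across B
            else:
                raise ValueError(f"unknown design: {design}")
            row.append(f"Group{group}")
        grid.append(row)
    return grid

for design in ("between", "within", "mixed"):
    grid = cell_groups(design)
    n_groups = len({label for row in grid for label in row})
    print(f"{design}: {n_groups} group(s)")
    for row in grid:
        print("   ", "  ".join(row))
```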


Data transformation

• If you want to calculate an ANOVA but your interval data is not normally distributed (i.e. skewed), you can use mathematical transformations

• The type of transformation depends on the shape of the sample distribution

• NOTE:
– After transforming data, check the resulting distribution again for normality!
– Note that your data becomes ordinal by transforming it (but you can do an ANOVA with it)


What kind of transformation?

[Figure: shapes of sample distributions with example transformations]

e.g. f(x) = x^1.5

e.g. f(x) = log(x) or f(x) = atan(x)
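Checking the distribution again after a transformation can be done with a simple skewness measure. The sketch below uses hypothetical durations with a long right tail and compares the skewness before and after a log transformation; values near 0 indicate a roughly symmetric distribution. The data and the `skewness` helper are assumptions for this example, and a proper normality check would go beyond this single number.

```python
import math

def skewness(values):
    """Moment-based sample skewness: third central moment / (second central moment)^1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical durations in ms with a long right tail (positive skew)
durations = [110, 120, 125, 130, 140, 150, 170, 200, 260, 420]
logged = [math.log(v) for v in durations]

print(f"skewness of raw data: {skewness(durations):.2f}")
print(f"skewness after log(x): {skewness(logged):.2f}")
```

In this made-up sample the log transformation reduces the skew but does not remove it, which is why the transformed data should be inspected again before running the ANOVA.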