GE5 Lecture 6

Rules of engagement
•no computer or no power → no lesson
•no SPSS → no lesson
•no homework done → no lesson

Content of this lecture

1. Quiz chapters 9 to 11 Howitt & Cramer
2. Review standard error and some other topics
3. Chapter 12 and 13 Howitt & Cramer: Related and unrelated t-test
4. SPSS workshop
5. Discussion homework next week

1. Quiz

Pwd:

2. Preparation: Review standard error and some other topics

Review

a. Standard scores / z-scores (chapter 5)
b. Standard error (chapter 11)
c. Inferential statistics (chapter 9) and the logic of hypothesis testing (chapter 10)

(a) Standard scores / z-scores

Z-score: number of standard deviations

•A z-score does not depend on the unit of measurement
•E.g. if SD = 7 and a score lies 21 above the mean, how many SDs away is it? (21 / 7 = 3)
•Big advantage: equally applicable to all types of variables
•Sometimes referred to as standard score
•z = (particular score – sample mean of scores) / (standard deviation of scores)
•Z-scores are also directly related to probability: standard normal z-distribution (page 55)
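
A minimal sketch of the z-score formula in Python, using only the standard library. The function name `z_score` and the mean of 50 are made up for illustration; only the distance of 21 and the SD of 7 come from the slide.

```python
from statistics import NormalDist


def z_score(score, mean, sd):
    """Number of standard deviations that `score` lies from `mean`."""
    return (score - mean) / sd


# Slide example: SD = 7 and a score 21 points above the mean.
# The mean of 50 is an arbitrary assumption; only the distance matters.
z = z_score(score=71, mean=50, sd=7)
print(z)  # 3.0

# Link to probability: share of the standard normal distribution below z.
share_below = NormalDist().cdf(z)
print(round(share_below, 4))
```

Because the z-score is unit-free, the same function works for grades, completion times or incomes.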

(b) The standard error

Standard error

•If we draw a random sample, the sample result will generally not be identical to the result in the population.
•The larger the sample size, the more probable it is that the sample mean will be close to the population mean.
•The greater the variance in the population, the less probable it is that the sample mean will be close to the population mean.
•The standard error is the standard deviation of many sample means (all samples with the same N).
•We did the in-class example of taking many samples from the same data.
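
The idea of "the standard deviation of many sample means" can be sketched as a small simulation. This is not the in-class data set: the population (normal, mean 80, SD 40), the sample size and the number of samples are all assumptions chosen for illustration.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

POP_MEAN, POP_SD = 80, 40  # hypothetical population
N = 4                      # every sample has the same size
NUM_SAMPLES = 5000

# Draw many samples and store the mean of each one.
sample_means = []
for _ in range(NUM_SAMPLES):
    sample = [random.gauss(POP_MEAN, POP_SD) for _ in range(N)]
    sample_means.append(statistics.mean(sample))

# The SD of the sample means should be close to SD / sqrt(N).
empirical_se = statistics.stdev(sample_means)
theoretical_se = POP_SD / N ** 0.5
print(round(empirical_se, 1), theoretical_se)
```

Raising `N` makes the sample means cluster more tightly around the population mean, which is the second bullet above.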

Standard error (2)

•The standard error is the standard deviation of many sample means.
•Normally, we have only one sample.
•How can we compute the standard error?
•The standard error can be estimated by dividing the (estimated) standard deviation of the population by the square root of N.
•(estimated) standard error = (estimated) standard deviation of the population / √n
•Or with a simpler formula: SEest = SD / √n, where SE = standard error, SD = standard deviation

Sample size & distribution

Distribution of sample means for samples of sizes 4, 10 and 100:

N = 4:    SD = 40, SE = 20
N = 10:   SD = 30, SE = 10 (approx.)
N = 100:  SD = 20, SE = 2
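
The three standard errors on this slide follow from SE = SD / √n. A short check in Python (the helper name `standard_error` is an illustrative choice, not something from the course material):

```python
import math


def standard_error(sd, n):
    """Estimated standard error of the mean: SD divided by the root of N."""
    return sd / math.sqrt(n)


# (N, SD) pairs taken from the slide.
cases = [(4, 40), (10, 30), (100, 20)]
ses = {n: standard_error(sd, n) for n, sd in cases}

for n, se in ses.items():
    print(f"N = {n}: SE = {se:.2f}")
```

For N = 10 the exact value is about 9.49, which the slide rounds to 10.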

Sample size & confidence interval (CI)

What is the 95% CI for these cases?

N = 4, SE = 20:   the 95% CI is [40; 120]
N = 10, SE = 10:  the 95% CI is [60; 100]
N = 100, SE = 2:  the 95% CI is [76; 84]

(These intervals are the sample mean ± 2 × SE, for a sample mean of 80.)
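
A sketch of how these intervals can be computed. Two assumptions are made here: the sample mean of 80 is inferred from the intervals on the slide (each is centred on 80), and the multiplier 2 is the rule of thumb those intervals match; the exact normal value is 1.96.

```python
def ci95(mean, se, multiplier=2.0):
    """Approximate 95% confidence interval: mean ± multiplier × SE."""
    return (mean - multiplier * se, mean + multiplier * se)


SAMPLE_MEAN = 80  # assumption: implied by the intervals on the slide

# (N, SE) pairs taken from the slide.
intervals = {n: ci95(SAMPLE_MEAN, se) for n, se in [(4, 20), (10, 10), (100, 2)]}

for n, (low, high) in intervals.items():
    print(f"N = {n}: [{low:g}; {high:g}]")
```

Passing `multiplier=1.96` gives slightly narrower intervals than the rounded ones shown in the lecture.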

(c) Descriptive vs inferential statistics & the logic of hypothesis testing

Descriptive versus inferential statistics

Statistics serve two general purposes:
•Descriptive statistics: Statistics is used to present data in a convenient way: tables, graphs and figures (summarize, organize and simplify the data).
•Inferential statistics: Statistics is used to generalize from the sample to the population; it uses information from a sample to draw conclusions (inferences) about the population from which the sample was taken.

Individual variables
•Descriptive statistics: frequency distribution, means, etc.
•Inferential statistics: for example public opinion polls to estimate percentages and means in the population, confidence interval

Relationships between variables
•Descriptive statistics: cross tabulation, compare means, scatter plot, correlation coefficient, etc.
•Inferential statistics: hypotheses, significance testing: statistical significance for the correlation coefficient, related and unrelated t-test, etc.

Howitt & Cramer

Populations and samples

The logic of hypothesis testing

•The null hypothesis is a statement that there is no difference or no relationship between two variables.
•The corresponding alternative hypothesis states there is a difference or relationship.
•On the basis of the results in the sample we decide whether it is likely that the sample comes from a population defined by the null hypothesis.
•If it's rather unlikely (usually: less than 5%) we reject the null hypothesis.
•A type I error occurs when we reject a null hypothesis that is actually true (in reality no difference or no relationship). If we reject the null hypothesis whenever this probability is below 5%, then a test of a true null hypothesis has a 5% probability of leading to a type I error.
•A type II error occurs when we fail to reject a null hypothesis that is actually false: i.e. we have failed to detect a real difference or relationship.
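
The 5% type I error rate can be illustrated with a simulation in which the null hypothesis is true by construction. Everything here is an assumption for illustration: the population values (mean 70, SD 10), the sample size, and the use of a z-test with a known population SD instead of a t-test.

```python
import random
import statistics

random.seed(7)  # fixed seed so the run is repeatable

MU0, SD, N = 70, 10, 25  # hypothetical population in which H0 is true
TRIALS = 10000
CRITICAL_Z = 1.96        # two-sided 5% level of the normal distribution

se = SD / N ** 0.5
rejections = 0
for _ in range(TRIALS):
    sample = [random.gauss(MU0, SD) for _ in range(N)]
    z = (statistics.mean(sample) - MU0) / se
    if abs(z) > CRITICAL_Z:
        rejections += 1  # H0 is true, so every rejection is a type I error

type1_rate = rejections / TRIALS
print(type1_rate)
```

The printed share of wrong rejections should land close to 0.05, matching the chosen significance level.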

Example question: Does a TV show get “good enough” ratings? Say more than a 7.

H1: population mean > 70
H0: population mean = 70

•N = 4 (SE = 20): The 95% CI is [40; 120]. The interval includes 70 and many values below it, all of which are consistent with H0. Do not reject H0.
•N = 100 (SE = 2): The 95% CI is [76; 84]. At the 95% level, no population mean of 70 or less is plausible. Reject H0.
•Now try this one for yourself: N = 10 (SE = 10): do or do not reject H0?

Error types

•Type 1
•Are boys and girls the same on a test?
–Formulate H1 and H0
•What you don't know
–Boys and girls are actually equally good at this test (population).
–We happen to sample five good girls and five not-so-good boys
•We find mean girls = 8.0, mean boys = 5.9 with se = 1.0
•We reject H0, although it is true in the population: a type I error.

•Type 2
•Same setup as above, but now girls are actually better than boys
–Population mean girls 8.5, population mean boys 6.1, sd = 1.0
•In our sample, we find mean girls = 7.9, mean boys = 6.9, se = 1.1
•We do not reject H0, although it is false in the population: a type II error.

Significance: Misconception 1

Misconception: The probability value is the probability that the null hypothesis is false.

Proper interpretation: The probability value is the probability of a result as extreme or more extreme given that the null hypothesis is true. It is the probability of the data given the null hypothesis. It is not the probability that the null hypothesis is false.

Significance: Misconception 2

Misconception: A non-significant outcome means that the null hypothesis is probably true.

Proper interpretation: A non-significant outcome means that the data do not conclusively demonstrate that the null hypothesis is false.

Significance: Misconception 3

Misconception: A low probability value indicates a large effect.

Proper interpretation: A low probability value indicates that the sample outcome (or one more extreme) would be very unlikely if the null hypothesis were true. A low probability value can (for example) occur with small effect sizes, if the sample size is large or the standard deviation is small.
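
A small numerical illustration of this point: the same small effect gives a very different probability value depending on the sample size. The numbers (effect of 1 point, SD of 10, sample sizes 25 and 10000) are invented, and a normal (z) approximation is assumed rather than a t-test.

```python
import math


def two_sided_p(effect, sd, n):
    """Two-sided p-value for a mean difference, using a z approximation."""
    z = effect / (sd / math.sqrt(n))
    # For a standard normal variable, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


# Same small effect (1 point on a scale with SD 10), two sample sizes.
p_small_sample = two_sided_p(effect=1, sd=10, n=25)
p_large_sample = two_sided_p(effect=1, sd=10, n=10000)

print(p_small_sample)  # clearly above 0.05: not significant
print(p_large_sample)  # far below 0.05, although the effect is unchanged
```

The effect is identical in both calls; only the sample size changes the probability value.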

3. Chapter 12 and 13 Howitt & Cramer: The t-test: Comparing two samples of related or unrelated scores

TIME TO PLAY:

http://imi.nhtv.nl/nanobots/

SPSS workshop

Class assignment

•Open the VR-Controllers.sav data set (on Moodle)
•Compare the mean scores of Task3 and Task4 completion times (TCT_TASK_3, TCT_TASK_4).
–Filter first for the controller (choose controller 0).
–Will you choose related or unrelated?
–What is the H0?
–Do you reject the H0?
•Compare the mean scores of Task3 completion times (TCT_TASK_3) for the two different controllers (CODE).
–Select all cases
–Will you choose related or unrelated?
–What is the H0?
–Do you reject the H0?
•(look at VR-Controllers.pdf paper for explanation)

BACK TO THEORY…

The t-test in general

•Until now, we have tested whether the mean was or was not equal to one value
•The t-test is used to compare the means of two groups of scores
–for example the performance before and after exercising or the income of male and female persons.
–Or the scores of boys and girls on a test
•As in chapter 10 (Statistical significance of the correlation coefficient) the significance depends on
–(1) the strength of the relationship,
–(2) the sample size
–(3) the chosen level of significance (almost always 5%).

Compare means

If we compare the means of two groups of scores the strength of the relationship depends on:
•the difference between the means
•the standard deviations within the groups.

[Figure: two groups of scores plotted on a scale from 0 to 45]

Simple t-test: Independent samples

Independent sample t-test

- Are boys (blue) better than girls (pink) on this score?
- We have seen that this depends on the difference and on the variability (standard deviation)
- How do we put this in a formula?

[Figure: scores of boys (blue) and girls (pink) on a scale from 0 to 45]

The unrelated samples t-test

•t-test value = difference between boys and girls, divided by some measure of variability
•t = (mean sample 1 – mean sample 2) / standard error of differences between sample means
•Now we have a t value, what is the chance of getting this value?
•Up to now we have only used the normal distribution
•However, the distribution of t looks like the normal distribution but is not the same
–The distribution depends on the sample size.
–For large N, the t-distribution approaches the normal distribution
–For small N, the curve of t becomes flatter and more spread out
•More precisely: the distribution depends on the degrees of freedom
•For the unrelated t-test: the degrees of freedom are the total number of scores in the two samples minus 2.
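
A minimal sketch of this calculation in Python. The slide does not spell out the standard error of the difference; the pooled-variance form of the standard Student t-test is assumed here. The function name `unrelated_t` and the scores are invented for illustration.

```python
import math
import statistics


def unrelated_t(sample1, sample2):
    """t value and degrees of freedom for two unrelated samples.

    Assumes the pooled-variance (Student) form of the standard error.
    """
    n1, n2 = len(sample1), len(sample2)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * statistics.variance(sample1)
                  + (n2 - 1) * statistics.variance(sample2)) / df
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (statistics.mean(sample1) - statistics.mean(sample2)) / se_diff
    return t, df


# Invented test scores for five girls and five boys.
girls = [8, 7, 9, 6, 10]
boys = [5, 6, 4, 7, 3]

t, df = unrelated_t(girls, boys)
print(round(t, 2), df)
```

For these made-up scores t is 3.0 with 8 degrees of freedom, which exceeds the standard two-tailed 5% critical value of 2.306 for 8 degrees of freedom.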

Example in Excel

This formula is NOT on the exam

Different designs

Related versus unrelated t-test

•The related t-test (chapter 12 Howitt & Cramer) compares the means of two related samples of scores to see whether the means differ significantly (other names for this test: 'repeated measures t-test', 'correlated measures t-test'). Examples: page 126-127.
•The unrelated t-test (chapter 13 Howitt & Cramer) compares the means of two unrelated samples of scores to see whether the means differ significantly (other names for this test: 'independent samples t-test', 'uncorrelated scores t-test', 'Student t-test'). Examples: page 138-139.

Related / repeated-measures designs

•In repeated-measures designs two separate scores are obtained for each individual in the sample. Examples:
–Same measurement at two different times
–Measurements in two different conditions (within-subjects designs)
–Comparison between two related variables measured in the same group (comparison of the evaluation of two films)
•Since the same participants are used in both measurements, there is no risk that accidental differences between groups of participants cause differences between the means.

Relationship between a dependent and an independent variable

•The means compared concern the dependent variable.
•There is another variable, the independent variable, which defines the difference between the two measurements: the variable time when results at two different times are compared, the variable light when in an experiment the results of participants in two different light conditions are compared, etc.
•The null hypothesis states that there is no relationship between the independent and the dependent variable. This implies that the two samples of scores come from the same population of scores.
•Over many pairs of samples one would expect no difference.
•For one pair of samples the means can differ, because samples tend to vary (sampling error).
•Do they differ too much to believe that the scores really belong to the same population of scores? In other words: Is the difference significant?

The related samples t-test

•The t-score is based on logic analogous to the z-score:
z = (particular score – sample mean of scores) / standard deviation of scores
t = (particular sample mean – average of sample means) / standard error of sample means
•We can calculate whether a sample with a certain mean is statistically unusual (using significance table 12.1).
•For related measures designs the two samples of scores are turned into a single sample by subtracting one set of scores from the other. This is a sample of difference scores.
•The null hypothesis is that the average difference score is 0, so the average of sample means drops out of the formula for t:
t = particular sample mean / standard error of sample means, or
t = average difference score / standard error of difference scores
•Look up the calculated t-score in significance table 12.1 for the degrees of freedom of this sample (N-1).
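
The steps above can be sketched in a few lines of Python. The function name `related_t` and the before/after scores are invented for illustration; the standard error of the difference scores is taken as their standard deviation divided by √N.

```python
import math
import statistics


def related_t(scores_a, scores_b):
    """t value and degrees of freedom for two related samples of scores."""
    # Turn the two samples into one sample of difference scores.
    diffs = [b - a for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    # Standard error of the difference scores: SD / sqrt(N).
    se = statistics.stdev(diffs) / math.sqrt(n)
    # Under H0 the average difference is 0, so t = mean difference / SE.
    return statistics.mean(diffs) / se, n - 1


# Invented scores for the same five participants, measured twice.
before = [5, 6, 7, 8, 9]
after = [7, 7, 10, 9, 12]

t, df = related_t(before, after)
print(round(t, 2), df)
```

For these made-up scores the mean difference is 2 and t is about 4.47 with 4 degrees of freedom, above the standard two-tailed 5% critical value of 2.776 for 4 degrees of freedom.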

Howitt & Cramer

Follow-up

•We will discuss chapters 14 to 16 next week.
•Exam: multiple-choice questions, comparable with the questions in the quizzes and the test exam. More exam info next week.
•Redo info in Moodle.
