MEI PowerPoint Template · differences should follow a binomial distribution 𝐵(𝑛,1 2) where...

@MEIConference #MEIConf2019



TOPICS IN FM STATISTICS – A PRACTICAL APPROACH


Practicals

Triangle Taste Test

Reaction Times

Sorting Objects

We need volunteers to conduct the experiments and the rest of us to act as participants.


Triangle Taste Test

The aim is to find out if people can tell the difference between bottled water and tap water.

Each person will be given three samples, two from one source and one from the other. They will be asked to pick out the odd sample.


Reaction times

Do people’s reaction times improve with practice?

You have to catch a ruler and will record the mark on the ruler at the point where you caught it.

This will be repeated a further 4 times, to see if you improve with practice.


Sorting objects

Does a person’s ability to sort objects blindfolded improve with practice?

There are some objects on a tray. You are blindfolded and the time taken to sort the objects is measured.

The experiment is repeated with the same volunteer and a second time is recorded.


Non-parametric tests

Many standard tests require distributional assumptions. Most commonly, the underlying distribution has to be Normal.

So, for example, the test of a product moment correlation coefficient requires underlying Normality in the population from which the sample has been drawn.

The rank correlation coefficient, however, requires no particular underlying distribution.

A test based on it is said to be a ‘non-parametric’ test, or sometimes a ‘distribution-free’ test.


Non-parametric tests

Non-parametric tests are more broadly applicable than parametric tests.

Non-parametric tests are generally not as powerful as parametric tests. (The probability of correctly rejecting the null hypothesis when it is false is lower.)

Non-parametric tests often do make some assumptions about the underlying distribution, but those assumptions are generally weaker than for parametric tests.


Analysing the data – Triangle Test

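Strictly speaking, this is not FM but it is a good place to start as the underlying analysis is not too difficult.

Assuming that there is no detectable difference in taste, each participant is effectively guessing and picks the odd sample out of three with probability $\tfrac{1}{3}$. Then $X$, the number of times that the singleton was correctly identified, should be distributed binomially: $X \sim B(n, \tfrac{1}{3})$, where $n$ is the number of tasting trials.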
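As a rough illustration of the triangle test analysis, the sketch below computes the one-tailed p-value $P(X \geq x)$ under $X \sim B(n, \tfrac{1}{3})$. The function name `binomial_upper_tail` and the class results (20 tasters, 11 correct) are made up for illustration and are not from the session.

```python
from math import comb

def binomial_upper_tail(n: int, x: int, p: float) -> float:
    """Return P(X >= x) for X ~ B(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x, n + 1))

# Hypothetical class results (assumed, not real data):
# 20 tasters, 11 of whom picked out the odd sample correctly.
n_tasters, n_correct = 20, 11

# H0: p = 1/3 (guessing); H1: p > 1/3 (people can tell the difference).
p_value = binomial_upper_tail(n_tasters, n_correct, 1 / 3)
print(f"P(X >= {n_correct}) = {p_value:.4f}")
print("Reject H0 at the 5% level" if p_value < 0.05 else "Do not reject H0 at the 5% level")
```

A one-tailed test is used here on the assumption that only a success rate above one third would count as evidence that people can tell the samples apart.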


Analysing the data – reaction times

This data can be analysed using the sign test.

The differences between two sets of paired measurements are analysed purely in terms of whether they are positive or negative.

Here we will look at the difference between the first and fifth readings for each person.


Analysing the data – reaction times

The null hypothesis is that there is no improvement in people’s ‘scores’.

If this is the case we can say that the median of the differences should be zero, as a positive improvement or negative improvement is equally likely.

$P(\text{+ve improvement}) = \tfrac{1}{2}$

$P(\text{-ve improvement}) = \tfrac{1}{2}$


Analysing the data – reaction times

The alternative hypothesis is that $P(\text{+ve improvement}) > \tfrac{1}{2}$.

Note that we need to ignore any cases where there are zero differences (and reduce $n$ accordingly).

Under $H_0$ the number of positive (or negative) differences should follow a binomial distribution $B(n, \tfrac{1}{2})$, where $n$ is the number of non-zero differences.
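The sign test described above can be sketched as follows. This is a minimal illustration, not the session's own worksheet: the function name `sign_test` and the readings are invented, and it assumes that a lower ruler mark means a quicker catch, so a positive value of (first reading minus fifth reading) counts as an improvement.

```python
from math import comb

def sign_test(first, fifth):
    """One-tailed sign test on paired readings.

    Returns (n, positives, p_value), where n is the number of non-zero
    differences and p_value = P(X >= positives) for X ~ B(n, 1/2).
    """
    diffs = [a - b for a, b in zip(first, fifth)]
    nonzero = [d for d in diffs if d != 0]       # zero differences are ignored
    n = len(nonzero)
    positives = sum(d > 0 for d in nonzero)      # count of improvements
    p_value = sum(comb(n, k) for k in range(positives, n + 1)) / 2**n
    return n, positives, p_value

# Hypothetical ruler marks in cm (assumed data, for illustration only).
first_reading = [22, 18, 25, 30, 15, 20, 27, 19]
fifth_reading = [17, 18, 21, 24, 16, 14, 20, 15]

n, positives, p_value = sign_test(first_reading, fifth_reading)
print(n, positives, p_value)  # 7 6 0.0625
```

With these made-up readings there is one zero difference, so $n$ drops from 8 to 7, and 6 of the 7 remaining differences are positive.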


Wilcoxon Signed Rank Test

Instead of just looking at the number of the positive and negative differences, a better test would take into account the magnitudes of the differences.

In the WSR test, we allocate ranks to the magnitudes of the differences. Then we find the sum of the ranks for positive values and the sum of ranks for negative values.


We call these sums of ranks $W_+$ and $W_-$.

The test statistic, written $W$ or $T$, is the smaller of these two values.

There are tables of critical values for $T$.
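The ranking procedure above can be sketched in a few lines. The function name `wilcoxon_signed_rank` and the sorting times are invented for illustration. Two conventions here are assumptions not stated on the slides: zero differences are dropped, and tied magnitudes share the average of the ranks they occupy.

```python
def wilcoxon_signed_rank(first, second):
    """Return (W_plus, W_minus, T) for paired data, with T = min(W_plus, W_minus)."""
    # Assumption: zero differences are discarded before ranking.
    diffs = [a - b for a, b in zip(first, second) if a != b]
    ordered = sorted(diffs, key=abs)

    # Rank the magnitudes; tied magnitudes get the average of their ranks (assumption).
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and abs(ordered[j]) == abs(ordered[i]):
            j += 1
        average = (i + 1 + j) / 2                # mean of ranks i+1, ..., j
        for k in range(i, j):
            ranks[k] = average
        i = j

    w_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return w_plus, w_minus, min(w_plus, w_minus)

# Hypothetical sorting times in seconds (assumed data, for illustration only).
first_time = [41, 35, 52, 38, 47, 44, 39, 50]
second_time = [36, 37, 45, 38, 40, 41, 43, 39]

print(wilcoxon_signed_rank(first_time, second_time))  # (24.0, 4.0, 4.0)
```

As a check, $W_+ + W_-$ should equal $\tfrac{1}{2}n(n+1)$, which is 28 for the 7 non-zero differences in this example.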


Assumptions

The sign test and the Wilcoxon signed rank test are both tests of the median.

In the sign test we are simply counting the number of observations that lie above and below the median.

For any continuous distribution the probability that an observation lies above the median is 0.5, so the sign test can be applied without further restriction.

(A discrete distribution may have a non-zero probability of a value equal to the median occurring. Such values are simply ignored in the sign test.)


Assumptions

In the Wilcoxon signed rank test we need to assume that the deviations above and below the median are distributed the same way. That is, we assume that the distribution is symmetrical about its median.

This assumption is equivalent to saying that a deviation of any given magnitude is equally likely to be positive or negative.


Wilcoxon critical values

The critical values for Wilcoxon tests (and for many other non-parametric tests) are obtained by computer simulation. However, for sufficiently large values of $n$ the Central Limit Theorem (CLT) applies, so a Normal approximation can be used and simulation is not required.

(The sign test does not require simulation as it is just a binomial test of $p = 0.5$. Again, a Normal approximation can be used for large $n$.)
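For small $n$ and no tied ranks, the null distribution of a Wilcoxon rank sum can also be enumerated exactly rather than simulated, because under $H_0$ (with the symmetry assumption) each rank $1, \dots, n$ independently carries a + or − sign with probability $\tfrac{1}{2}$. The sketch below uses that enumeration, which is a swapped-in alternative to the simulation mentioned above. The function name `wilcoxon_critical_value` is hypothetical.

```python
def wilcoxon_critical_value(n: int, alpha: float):
    """Largest t with P(W <= t) <= alpha under H0, or None if there is none.

    W is the sum of the ranks carrying one particular sign (W_plus and W_minus
    have the same null distribution). Assumes ranks 1..n with no ties.
    """
    max_sum = n * (n + 1) // 2
    # counts[s] = number of sign patterns whose chosen ranks sum to s.
    counts = [0] * (max_sum + 1)
    counts[0] = 1
    for rank in range(1, n + 1):
        for s in range(max_sum, rank - 1, -1):   # descending: each rank used at most once
            counts[s] += counts[s - rank]

    total = 2**n                                 # equally likely sign patterns
    cumulative, critical = 0, None
    for t, count in enumerate(counts):
        cumulative += count
        if cumulative / total <= alpha:
            critical = t
        else:
            break
    return critical

print(wilcoxon_critical_value(10, 0.05))   # 10 (one-tailed 5%)
print(wilcoxon_critical_value(10, 0.025))  # 8  (two-tailed 5%)
```

For a two-tailed test at significance level $\alpha$ using $T = \min(W_+, W_-)$, the lower tail is taken at $\alpha/2$, which is why the second call uses 0.025.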


Table of Critical Values


Analysing the data – sorting

This is paired data – we could analyse using the Sign Test, but we know the Wilcoxon Signed Rank test is more ‘powerful’.


Correlation: hypothesis testing

The pmcc:

Null hypothesis: the correlation in the underlying population is zero. That is, $\rho = 0$.

Alternative hypothesis: $\rho \neq 0$, $\rho > 0$ or $\rho < 0$, depending on the situation.

The sample pmcc is simply compared with tabulated critical values.

NB: requires underlying bivariate Normality


Top Sportspeople of all time

Michael Phelps

Roger Federer

Martina Navratilova

Muhammad Ali

Jackie Joyner-Kersee

Steve Redgrave

Michael Jordan

Pele

Michael Schumacher

Usain Bolt


Rank Correlation

The (Spearman*) rank correlation coefficient is just the pmcc applied to the ranks of the data.

Formula: $r_s = 1 - \dfrac{6 \sum D^2}{n(n^2 - 1)}$, where $D$ is the difference in ranks.

As with the pmcc, the rank correlation coefficient should only be applied to data in which both variables are random.

Unlike the pmcc, the rank correlation does not require any particular underlying distribution. It is ‘distribution free’. It is ‘non-parametric’.

*there are other correlation coefficients
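The sketch below illustrates the claim that the rank correlation coefficient is the pmcc applied to the ranks. It computes $r_s$ from the formula and compares it with the pmcc of the same ranks. The two rankings of ten sportspeople are invented, the function names are hypothetical, and the formula is assumed to be used with no tied ranks.

```python
from math import sqrt

def spearman_from_ranks(rank_x, rank_y):
    """Return (sum of squared rank differences, r_s); assumes no tied ranks."""
    n = len(rank_x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return sum_d2, 1 - 6 * sum_d2 / (n * (n**2 - 1))

def pmcc(x, y):
    """Product moment correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / sqrt(s_xx * s_yy)

# Hypothetical rankings of the same 10 sportspeople by two people (assumed data).
your_ranks = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
partner_ranks = [3, 1, 2, 6, 4, 5, 9, 10, 7, 8]

sum_d2, r_s = spearman_from_ranks(your_ranks, partner_ranks)
print(sum_d2, round(r_s, 4))                    # 28 0.8303
print(round(pmcc(your_ranks, partner_ranks), 4))  # 0.8303
```

The two values agree because, without ties, the formula is an algebraic rearrangement of the pmcc of the ranks.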


Correlation: hypothesis testing

Rank correlation:

Null hypothesis: there is no association in the underlying population between the two variables

Alternative hypothesis: there is some association, positive association or negative association, depending on the situation

The sample rank correlation coefficient is simply compared with tabulated critical values.

NB: valid for any underlying distribution


For our example

$H_0$: No agreement in the opinions of you and your partner

$H_1$: Positive agreement in the opinions of you and your partner

Test at the 5% level.

1-tailed test – why?


Critical Region

Reject $H_0$ if $r_s \geq 0.5636$

$1 - \dfrac{6 \sum d^2}{10 \times 99} \geq 0.5636$

$\dfrac{\sum d^2}{165} \leq 0.4364$

$\sum d^2 \leq 72$


Critical Region

Reject $H_0$ if $\sum d^2 \leq 72$

$\sum d^2$ large – little agreement

$\sum d^2$ small – good agreement
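The same rearrangement can be done for any sample size and tabulated critical value. The short sketch below is only a check of the arithmetic, and the function name `max_sum_d2` is hypothetical. It uses the values quoted above, $n = 10$ and a critical value of 0.5636.

```python
from math import floor

def max_sum_d2(n: int, critical_rs: float) -> int:
    """Largest whole-number sum of d^2 for which r_s >= critical_rs (upper-tail test)."""
    # Rearranged from 1 - 6 * sum_d2 / (n * (n^2 - 1)) >= critical_rs.
    return floor((1 - critical_rs) * n * (n**2 - 1) / 6)

print(max_sum_d2(10, 0.5636))  # 72
```

Working with $\sum d^2$ directly means the test can be carried out without ever computing $r_s$ itself.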


When would you use $r_s$?

If you are only given the ranks

Where you suspect there is not a linear relationship but where one variable generally increases or decreases as the other increases; in that case we are looking for association rather than correlation.


Other statistical experiments

Vehicles passing an observation point – Cars arriving at a petrol station – Customer arrivals – Telephone calls (Poisson)

Newspaper analysis: difference in word or sentence lengths between two papers – Do males and females estimate a time interval differently? – Does a person’s weight differ between the morning and evening? (Difference in means)

Lengths and widths of leaves – Heights of students and their fathers/mothers – Daily rainfall & sunshine (Correlation & Regression)


Acknowledgements

Rouncefield, M. & Holmes, P. (1989) Practical Statistics. Macmillan.

Belson, C. et al. (1992) SMP 16-19 Data collection: Student text and unit guide. CUP.


About MEI

Registered charity committed to improving mathematics education

Independent UK curriculum development body

We offer continuing professional development courses, provide specialist tuition for students and work with employers to enhance mathematical skills in the workplace

We also pioneer the development of innovative teaching and learning resources