@MEIConference #MEIConf2019
TOPICS IN FM STATISTICS – A
PRACTICAL APPROACH
Practicals
Triangle Taste Test
Reaction Times
Sorting Objects
We need volunteers to conduct the experiments
and the rest of us to act as participants.
Triangle Taste Test
The aim is to find out if people can tell the
difference between bottled water and tap water.
Each person will be given three samples, two from
one source and one from the other. They will be
asked to pick out the odd sample.
Reaction times
Do people’s reaction times improve with practice?
You have to catch a ruler and will record the mark
on the ruler at the point where you caught it.
This will be repeated a further 4 times, to see if
you improve with practice.
Sorting objects
Does a person’s ability to sort objects blindfolded
improve with practice?
There are some objects on a tray. You are
blindfolded and the time taken to sort the objects is
measured.
The experiment is repeated with the same
volunteer and a second time is recorded.
Non-parametric tests
Many standard tests require distributional assumptions. Most commonly, the underlying distribution has to be Normal.
So, for example, the test of a product moment correlation coefficient requires underlying Normality in the population from which the sample has been drawn.
A test based on the rank correlation coefficient, however, requires no particular underlying distribution.
Such a test is said to be a ‘non-parametric’ test – or sometimes a ‘distribution free’ test.
Non-parametric tests are more broadly
applicable than parametric tests.
Non-parametric tests are generally not as
powerful as parametric tests. (The probability
of correctly rejecting the null hypothesis when
it is false is lower.)
Non-parametric tests often do make some
assumptions about the underlying
distribution, but those assumptions are
generally weaker than for parametric tests.
Analysing the data – Triangle Test
Strictly speaking, this is not FM but it is a good
place to start as the underlying analysis is not too
difficult.
Assuming that there is no detectable difference in
taste, then $X$, the number of times that the
singleton was correctly identified, should be
distributed binomially: $X \sim B(n, \tfrac{1}{3})$.
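A minimal sketch of the resulting one-tailed binomial test, in Python. The counts used (30 tasters, 15 correct) are made-up illustration values, not results from the session, and the function name `binomial_upper_tail` is hypothetical.

```python
from math import comb


def binomial_upper_tail(n, x, p):
    """Return P(X >= x) for X ~ B(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x, n + 1))


# Hypothetical data: 30 tasters, 15 of whom picked out the odd sample.
n_tasters, n_correct = 30, 15

# Under H0 (no detectable difference) each taster is guessing, so p = 1/3.
p_value = binomial_upper_tail(n_tasters, n_correct, 1 / 3)
print(f"P(X >= {n_correct}) = {p_value:.4f}")
```

A small value of this probability would suggest that the tasters are doing better than guessing.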
Analysing the data – reaction times
This data can be analysed using the sign test.
The differences between two sets of paired measurements
are analysed purely in terms of whether they are
positive or negative.
Here we will look at the difference between the first and
fifth readings for each person.
The null hypothesis is that there is no improvement
in people’s ‘scores’.
If this is the case we can say that the median of
the differences should be zero, as a positive
improvement and a negative improvement are equally
likely:
$P(\text{+ve improvement}) = \tfrac{1}{2}$
$P(\text{-ve improvement}) = \tfrac{1}{2}$
The alternative hypothesis is that
$P(\text{+ve improvement}) > \tfrac{1}{2}$
Note that we need to ignore any cases where there
are zero differences (and reduce $n$ accordingly).
Under $H_0$ the number of positive (or negative)
differences should follow a binomial distribution
$B(n, \tfrac{1}{2})$, where $n$ is the number of non-zero
differences.
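The sign test on the first and fifth readings could be sketched in Python as follows. The ruler marks are invented for illustration, the function name `sign_test_p_value` is hypothetical, and the code assumes that a lower ruler mark means a quicker catch, so that "first minus fifth" is positive when someone has improved.

```python
from math import comb


def sign_test_p_value(first, fifth):
    """One-tailed sign test for H1: P(+ve improvement) > 1/2.

    ASSUMPTION: a lower mark is a quicker catch, so an improvement is a
    positive value of (first reading - fifth reading).
    """
    diffs = [a - b for a, b in zip(first, fifth)]
    nonzero = [d for d in diffs if d != 0]  # zero differences are ignored
    n = len(nonzero)  # n is reduced accordingly
    positives = sum(1 for d in nonzero if d > 0)
    # P(X >= positives) for X ~ B(n, 1/2)
    p_value = sum(comb(n, k) for k in range(positives, n + 1)) / 2**n
    return n, positives, p_value


# Hypothetical ruler marks in cm for ten people (not real data).
first = [22, 18, 25, 30, 19, 24, 21, 27, 20, 23]
fifth = [17, 18, 20, 26, 21, 19, 16, 22, 18, 20]

n, positives, p_value = sign_test_p_value(first, fifth)
print(f"n = {n}, positive differences = {positives}, p-value = {p_value:.4f}")
```

With these invented readings one difference is zero, so the test is carried out with 9 non-zero differences rather than 10.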
Wilcoxon Signed Rank Test
Instead of just looking at the numbers of
positive and negative differences, a better test
would take into account the magnitudes of the
differences.
In the WSR test, we allocate ranks to the
magnitudes of the differences. Then we find the
sum of the ranks for positive values and the sum
of ranks for negative values.
We call these sums of ranks $W_+$ and $W_-$.
The test statistic, written $W$ or $T$, is the smaller of these two values.
There are tables of critical values for $T$.
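A rough Python sketch of the ranking step. The function name `signed_rank_sums` is hypothetical and the differences are invented. Two conventions are assumed here that are not stated above: zero differences are dropped, and differences tied on magnitude share the average of the ranks they occupy.

```python
def signed_rank_sums(differences):
    """Return (W_plus, W_minus, T) for a list of paired differences."""
    nonzero = [d for d in differences if d != 0]  # ASSUMPTION: zeros dropped
    ordered = sorted(nonzero, key=abs)  # smallest magnitude first
    w_plus = w_minus = 0.0
    i = 0
    while i < len(ordered):
        # Find the block of differences tied on magnitude.
        j = i
        while j < len(ordered) and abs(ordered[j]) == abs(ordered[i]):
            j += 1
        average_rank = (i + 1 + j) / 2  # mean of ranks i+1, ..., j
        for d in ordered[i:j]:
            if d > 0:
                w_plus += average_rank
            else:
                w_minus += average_rank
        i = j
    return w_plus, w_minus, min(w_plus, w_minus)


# Hypothetical differences (first time - second time) in seconds.
differences = [8, -3, 5, 12, -1, 7, 10, 4]
w_plus, w_minus, t = signed_rank_sums(differences)
print(f"W+ = {w_plus}, W- = {w_minus}, T = {t}")
```

A useful check is that the two sums always add up to $\tfrac{1}{2}n(n+1)$, where $n$ is the number of non-zero differences.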
Assumptions
The sign test and the Wilcoxon signed rank test are both tests of the median.
In the sign test we are simply counting the number of observations that lie above and below the median.
For any continuous distribution the probability that an observation lies above the median is 0.5, so the sign test can be applied without further restriction.
(A discrete distribution may have a non-zero probability of a value equal to the median occurring. Such values are simply ignored in the sign test.)
In the Wilcoxon signed rank test we need to
assume that the deviations above and below the
median are distributed the same way. That is,
we assume that the distribution is symmetrical
about its median.
This assumption is equivalent to saying that a
deviation of any given magnitude is equally likely
to be positive or negative.
Wilcoxon critical values
The critical values for Wilcoxon tests (and for
many other non-parametric tests) are obtained
by computer simulation. However, for sufficiently
large values of n the CLT applies so simulation
is not required.
(The sign test does not require simulation as it is
just a binomial test of p = 0.5. Again, a Normal
approximation can be used for large n.)
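For small $n$ the null distribution of the signed rank sum can also be counted exactly rather than simulated, because under $H_0$ each of the $2^n$ ways of attaching signs to the ranks $1, \dots, n$ is equally likely. The Python sketch below illustrates this; the function names are hypothetical and it assumes no tied or zero differences.

```python
def signed_rank_null_counts(n):
    """counts[s] = number of sign patterns whose positive ranks sum to s."""
    max_sum = n * (n + 1) // 2
    counts = [0] * (max_sum + 1)
    counts[0] = 1
    for rank in range(1, n + 1):
        # Go downwards so that each rank is used at most once.
        for s in range(max_sum, rank - 1, -1):
            counts[s] += counts[s - rank]
    return counts


def critical_value(n, alpha):
    """Largest c with P(rank sum <= c) <= alpha under H0, or None."""
    counts = signed_rank_null_counts(n)
    total = 2**n
    cumulative = 0
    critical = None
    for s, count in enumerate(counts):
        cumulative += count
        if cumulative / total <= alpha:
            critical = s
        else:
            break
    return critical


# One-tailed 5% and two-tailed 5% (2.5% in each tail) for n = 10.
print(critical_value(10, 0.05), critical_value(10, 0.025))
```

The values produced this way can be compared with the entries in a printed table of critical values.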
Table of Critical Values
Analysing the data – sorting
This is paired data – we could analyse it using the
Sign Test, but we know the Wilcoxon Signed Rank
test is more ‘powerful’.
Correlation: hypothesis testing
The pmcc:
Null hypothesis: the correlation in the
underlying population is zero. That is, ρ = 0.
Alternative hypothesis: ρ ≠ 0, ρ > 0 or ρ < 0,
depending on the situation.
The sample pmcc is simply compared with
tabulated critical values.
NB: requires underlying bivariate Normality
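As a sketch of the calculation behind the test, the sample pmcc can be computed from the sums of squares and products. The leaf measurements below are invented and the function name `pmcc` is hypothetical; the critical value still has to be looked up in tables for the sample size and significance level in use.

```python
from math import sqrt


def pmcc(xs, ys):
    """Sample product moment correlation coefficient r = Sxy / sqrt(Sxx * Syy)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)


# Hypothetical leaf lengths and widths in cm (not real data).
lengths = [6.1, 7.4, 5.8, 8.0, 6.9, 7.7, 5.5, 6.4]
widths = [2.9, 3.6, 2.7, 3.9, 3.1, 3.8, 2.8, 3.0]

r = pmcc(lengths, widths)
print(f"r = {r:.4f}")
```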
Top Sportspeople of all time
Michael Phelps
Roger Federer
Martina Navratilova
Muhammad Ali
Jackie Joyner-Kersee
Steve Redgrave
Michael Jordan
Pele
Michael Schumacher
Usain Bolt
Rank Correlation
The (Spearman*) rank correlation coefficient is
just the pmcc applied to the ranks of the data
Formula: $r_s = 1 - \dfrac{6 \sum d^2}{n(n^2 - 1)}$, where $d$ is the difference in
ranks
As with the pmcc, the rank correlation coefficient should only be applied to data in which both variables are random
Unlike the pmcc, the rank correlation does not require any particular underlying distribution. It is ‘distribution free’. It is ‘non-parametric’.
*there are other correlation coefficients
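The formula can be tried out with a short Python sketch. The two rankings below are invented, the function name `spearman_from_ranks` is hypothetical, and the code assumes there are no tied ranks.

```python
def spearman_from_ranks(ranks_a, ranks_b):
    """Spearman's r_s = 1 - 6 * sum(d^2) / (n (n^2 - 1)), assuming no ties."""
    n = len(ranks_a)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * sum_d2 / (n * (n**2 - 1))


# Hypothetical rankings of the same ten items by two people.
ranks_a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ranks_b = [2, 1, 4, 3, 6, 5, 8, 7, 10, 9]

r_s = spearman_from_ranks(ranks_a, ranks_b)
print(f"r_s = {r_s:.4f}")
```

Identical rankings give a coefficient of 1 and exactly reversed rankings give -1.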
Correlation: hypothesis testing
Rank correlation:
Null hypothesis: there is no association in the
underlying population between the two variables
Alternative hypothesis: there is some association,
positive association or negative association,
depending on the situation
The sample rank correlation coefficient is simply
compared with tabulated critical values.
NB: valid for any underlying distribution
For our example
$H_0$: No agreement in the opinions of you and your
partner
$H_1$: Positive agreement in the opinions of you and
your partner
Test at the 5% level.
1-tailed test – why?
Critical Region
Reject $H_0$ if $r_s \ge 0.5636$
$$1 - \frac{6 \sum d^2}{10 \times 99} \ge 0.5636$$
$$\frac{\sum d^2}{165} \le 0.4364$$
$$\sum d^2 \le 72$$
Critical Region
Reject $H_0$ if $\sum d^2 \le 72$
$\sum d^2$ large – little agreement
$\sum d^2$ small – good agreement
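The boundary of this critical region can be checked numerically. The Python sketch below (function name `largest_sum_d2` hypothetical) searches for the largest whole-number value of $\sum d^2$ for which $r_s$ still reaches the critical value 0.5636 with $n = 10$.

```python
def largest_sum_d2(n, critical_rs):
    """Largest integer sum of d^2 for which r_s >= critical_rs, or None."""
    denominator = n * (n**2 - 1)
    # The sum of d^2 cannot exceed n(n^2 - 1)/3 (exactly reversed rankings).
    largest = None
    for sum_d2 in range(0, denominator // 3 + 1):
        if 1 - 6 * sum_d2 / denominator >= critical_rs:
            largest = sum_d2
    return largest


print(largest_sum_d2(10, 0.5636))  # 72
```

Working with $\sum d^2$ directly saves computing $r_s$ for every pair of rankings in the room.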
When would you use $r_s$?
If you are only given the ranks
Where you suspect there is not a linear
relationship but where one variable generally
increases or decreases as the other increases; in
that case we are looking for association rather
than correlation.
Other statistical experiments
Vehicles passing an observation point – Cars
arriving at a petrol station – Customer arrivals –
Telephone calls (Poisson)
Newspaper analysis: difference in word or
sentence lengths between two papers – Do males
and females estimate a time interval differently? –
Does a person’s weight differ between the morning
and evening? (Difference in means)
Lengths and widths of leaves – Heights of students
and their fathers/mothers – Daily rainfall &
sunshine (Correlation & Regression)
Acknowledgements
Rouncefield, M., & Holmes, P. (1989) Practical
Statistics. Macmillan.
Belson, C. et al. (1992) SMP 16-19 Data collection:
Student text and unit guide. CUP.
About MEI
Registered charity committed to improving
mathematics education
Independent UK curriculum development body
We offer continuing professional development
courses, provide specialist tuition for students
and work with employers to enhance
mathematical skills in the workplace
We also pioneer the development of innovative
teaching and learning resources