SADC Course in Statistics Comparing Means from Independent Samples (Session 12)
2To put your footer here go to View > Header and Footer
Learning Objectives
By the end of this session, you will be able to:
• explain how means from two populations may be compared
• describe the assumptions associated with the independent samples t-test
• interpret computer output from a two-sample t-test
• present and write up conclusions resulting from such tests
• explain the difference between statistical significance and an important result
An example: Comparing 2 means
Agric Non-agric
156 223
282 131
222 137
172 146
183 130
206 122
210 141
198 192
199 188
211 212
As part of a health survey, cholesterol levels of men in a small rural area were measured, including those working in agriculture and those employed in non-agricultural work.
Aim: To see if mean cholesterol levels were different between the two groups.
Summary statistics
Begin with summarising each column of data.
             Agric   Non-agric
Mean =       203.9   162.2
Std. dev. =  33.9    37.6
Variance =   1147    1412
There appears to be a substantial difference between the two means.
Our question of interest is:
Is this difference showing a real effect, or could it merely be a chance occurrence?
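The summary statistics above can be reproduced from the raw data. The following is a minimal sketch using only Python's standard library; the variable names `agric` and `non_agric` and the helper `summarise` are illustrative choices, not part of the course material.

```python
from statistics import mean, stdev, variance

# Cholesterol readings from the example data table
agric = [156, 282, 222, 172, 183, 206, 210, 198, 199, 211]
non_agric = [223, 131, 137, 146, 130, 122, 141, 192, 188, 212]

def summarise(values):
    """Return (mean, sample std. dev., sample variance), using n - 1 divisors."""
    return mean(values), stdev(values), variance(values)

for label, values in (("Agric", agric), ("Non-agric", non_agric)):
    m, sd, var = summarise(values)
    print(f"{label:10s} mean = {m:.1f}  std. dev. = {sd:.1f}  variance = {var:.0f}")
```

The sample (n - 1) forms of the standard deviation and variance are used here because they are the ones that match the figures quoted in the summary table.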
Setting up the hypotheses
To answer the question, we set up:
Null hypothesis H0:
no difference between the two groups (in terms of mean response), i.e. $\mu_1 = \mu_2$
Alternative hypothesis H1:
there is a difference, i.e. $\mu_1 \neq \mu_2$
The resulting test will be two-sided since the alternative is “not equal to”.
Test for comparing means
• Use a two-sample (unpaired) t-test
  - appropriate with 2 independent samples
• Assumptions
  - normal distributions for each sample
  - constant variance (so test uses a pooled estimate of variance)
  - observations are independent
• Procedure
  - assess how large the difference in means is, relative to the noise in this difference, i.e. the std. error of the difference.
Test Statistic
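The test statistic is:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s^2}{n_1} + \dfrac{s^2}{n_2}}} \quad \text{with } n_1 + n_2 - 2 \text{ d.f.}$$

where $s^2$, the pooled estimate of variance, is given by

$$s^2 = \frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2}$$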
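As a rough sketch, the pooled-variance t-statistic can be written as a small function of the two sample means, sample variances and sample sizes. The function name `pooled_t` and its argument order are hypothetical, chosen only for illustration.

```python
import math

def pooled_t(mean1, var1, n1, mean2, var2, n2):
    """Two-sample t-statistic using a pooled estimate of variance.

    Returns (t, degrees of freedom, pooled variance).
    """
    df = n1 + n2 - 2
    # Weighted average of the two sample variances
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    # Standard error of the difference between the two means
    se_diff = math.sqrt(pooled_var / n1 + pooled_var / n2)
    return (mean1 - mean2) / se_diff, df, pooled_var

# Made-up inputs purely to exercise the function: two groups of 8,
# each with variance 4, whose means differ by 2
t, df, s2 = pooled_t(12.0, 4.0, 8, 10.0, 4.0, 8)
print(t, df, s2)
```

With equal group sizes the pooled variance is simply the average of the two sample variances, which is why `s2` equals 4 in this made-up call.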
Numerical Results
The pooled estimate of variance is:

$$s^2 = \frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2} = 1279.5$$

Hence the t-statistic is:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s^2}{n_1} + \dfrac{s^2}{n_2}}} = \frac{41.7}{\sqrt{2 \times 1279.5 / 10}} = 2.61, \quad \text{based on 18 d.f.}$$

Comparing with tables of $t_{18}$, this result is significant at the 2% level, so reject H0.
Note: The exact p-value = 0.018
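The numerical results can be checked directly from the raw data. The sketch below uses only the standard library, so the two-sided p-value is obtained by numerically integrating the Student's t density with Simpson's rule rather than by calling a statistics package; that integration approach and the helper names `t_density` and `two_sided_p` are assumptions made for this illustration, not the method used in the course.

```python
import math
from statistics import mean, variance

agric = [156, 282, 222, 172, 183, 206, 210, 198, 199, 211]
non_agric = [223, 131, 137, 146, 130, 122, 141, 192, 188, 212]

n1, n2 = len(agric), len(non_agric)
df = n1 + n2 - 2
pooled_var = ((n1 - 1) * variance(agric) + (n2 - 1) * variance(non_agric)) / df
se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
t_stat = (mean(agric) - mean(non_agric)) / se_diff

def t_density(x, nu):
    """Probability density of Student's t distribution with nu d.f."""
    coeff = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))
    return coeff * (1 + x * x / nu) ** (-(nu + 1) / 2)

def two_sided_p(t, nu, steps=2000):
    """Two-sided p-value, 1 - P(-|t| < T < |t|), via Simpson's rule (steps must be even)."""
    upper = abs(t)
    h = upper / steps
    total = t_density(0.0, nu) + t_density(upper, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, nu)
    # The density is symmetric, so double the integral over [0, |t|]
    return 1 - 2 * total * h / 3

p_value = two_sided_p(t_stat, df)
print(f"pooled variance = {pooled_var:.1f}")
print(f"t = {t_stat:.2f} on {df} d.f., two-sided p = {p_value:.3f}")
```

In practice a statistics package would normally report the same t-statistic and p-value in a single call; the hand-rolled integration here only keeps the example self-contained.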
Presenting the results
• For comparisons, should report:
  - difference between means
  - s.e. of difference in means
  - 95% confidence interval for true diff.
• In addition, may report for each group:
  - mean
  - s.e. of each mean
  - sample size for each mean
• Conclusions will then follow…
Results and conclusions
Difference of means: 41.7
Standard error of difference: 15.99
95% confidence interval for difference in means: (8.09, 75.3).
Conclusions: There is some evidence (p=0.018) that the mean cholesterol levels differ between those working in agriculture and others. The difference in means is 42 mg/dL with 95% confidence interval (8.1, 75.3).
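The confidence interval quoted above is the difference in means plus or minus a t multiplier times its standard error. A minimal sketch follows, taking the difference and pooled variance from the slides and assuming the tabulated two-sided 5% point of the t distribution with 18 d.f. (2.101); the variable names are illustrative only.

```python
import math

diff = 41.7          # difference in means, from the slide
pooled_var = 1279.5  # pooled estimate of variance, from the slide
n1 = n2 = 10         # sample size in each group
t_crit = 2.101       # assumed table value: two-sided 5% point of t with 18 d.f.

# Standard error of the difference between the two means
se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))

lower = diff - t_crit * se_diff
upper = diff + t_crit * se_diff
print(f"s.e. of difference = {se_diff:.2f}")
print(f"95% CI for difference = ({lower:.1f}, {upper:.1f})")
```

Because the interval excludes zero, it agrees with the two-sided test rejecting H0 at the 5% level.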
Significance ideas again!
e.g. Farmers report that using a fungicide increased crop yields by 2.7 kg ha⁻¹, s.e.m. = 0.41
This gave a t-statistic of 6.6 (p-value<0.001)
Recall that the p-value is the probability of obtaining a result at least as extreme as the one observed when the null hypothesis is true.
i.e. an increase this large would be very unlikely to arise by chance alone if the fungicide had no effect!
How important are sig. tests?
In relation to the example on the previous slide, we may find one of the following situations for different crops.
Mean yields: with and without fungicide.
589.9 587.2 Not an important finding!
9.9 7.2 Very important finding!
It is likely that in the first of these results, either too much replication or the incorrect level of replication had been used (e.g. plant level variation, rather than plot level variation used to compare means).
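One way to see why the same 2.7 unit increase matters in one case and not the other is to express it relative to the untreated mean. The sketch below does this for the two pairs of yields on the slide; judging importance by relative increase is an illustrative choice here, and the names `crops` and `relative_increase` are hypothetical.

```python
# Mean yields (with fungicide, without fungicide) for the two situations on the slide
crops = {
    "first crop": (589.9, 587.2),
    "second crop": (9.9, 7.2),
}

def relative_increase(with_f, without_f):
    """Increase in yield as a percentage of the untreated mean."""
    return 100 * (with_f - without_f) / without_f

for name, (with_f, without_f) in crops.items():
    pct = relative_increase(with_f, without_f)
    print(f"{name}: increase = {with_f - without_f:.1f}, i.e. {pct:.1f}% of the untreated mean")
```

The absolute increase is identical in both cases, so a significance test on it could give the same p-value; only the context shows whether the result is important.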
What does non-significance tell us?
e.g. There was insufficient evidence in the data to demonstrate that using a fungicide had any effect on plant yields (p=0.128).
Mean yields: with and without fungicide.
157.2 89.9
This difference may be an important finding, but the statistical analysis was unable to pick up this difference as being statistically significant.
HOW CAN THIS HAPPEN?
• Too small a sample size?
• High variability in the experimental material?
• One or two outliers?
• All sources of variability not identified?
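The effect of sample size can be sketched numerically. The slide gives only the two means and the p-value, so the common standard deviation (60) and the group sizes used below are hypothetical values chosen purely for illustration; only the difference in means comes from the slide.

```python
import math

diff = 157.2 - 89.9  # difference in mean yields, from the slide
assumed_sd = 60.0    # HYPOTHETICAL common std. dev., not given in the source

def t_for(n_per_group, difference=diff, sd=assumed_sd):
    """t-statistic for two equal-sized groups with a common std. dev."""
    se_diff = sd * math.sqrt(2 / n_per_group)
    return difference / se_diff

for n in (4, 8, 16):  # HYPOTHETICAL group sizes
    print(f"n = {n:2d} per group: t = {t_for(n):.2f} on {2 * n - 2} d.f.")
```

With the difference and variability held fixed, the t-statistic grows with the square root of the group size, so the same large difference can be non-significant in a small experiment and clearly significant in a larger one.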
Significance – Key Points
• Statistical significance alone is not enough. Consider whether the result is also scientifically meaningful and important.
• When a significant result is found, report the finding in terms of the corresponding estimates, their standard errors and C.I.s