Test statistic: Group Comparison
Jobayer Hossain
Larry Holmes, Jr
Research Statistics, Lecture 5
October 30, 2008
Hypothesis Testing (Quantitative variable)
Hypothesis Testing Procedure
– One-group sample: one-sample t-test, sign test
– Two-group sample
   – Independent: two-sample t-test, Mann-Whitney U test
   – Not independent: paired t-test, Wilcoxon signed-rank test
– More than two groups sample: analysis of variance, Kruskal-Wallis test
One group sample - one sample t-test
Test for value of a single mean
E.g., test to see if mean SBP of all AIDHC employees is 120 mm Hg
Assumptions
– Parent population is normal
– Sample observations (subjects) are independent
One group sample - one sample t-test
Formula
Let $x_1, x_2, \ldots, x_n$ be a random sample from a normal population with mean µ and variance $\sigma^2$. Then the following statistic is distributed as Student's t with (n-1) degrees of freedom:
$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$
where $\bar{x}$ is the sample mean and $s$ is the sample standard deviation.
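As a rough illustration of the one-sample formula, here is a minimal Python sketch. The function name `one_sample_t` and the height values are hypothetical and are not taken from the lecture's data set; only the t statistic and its degrees of freedom are computed, not the p-value.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Return (t, degrees of freedom) for H0: population mean == mu0."""
    n = len(sample)
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    t = (mean - mu0) / (s / math.sqrt(n))
    return t, n - 1

# Hypothetical heights in inches (illustration only)
heights = [48.0, 52.0, 50.0, 54.0, 51.0]
t_stat, df = one_sample_t(heights, 50.0)
print(t_stat, df)
```

The statistic would then be compared with a Student's t distribution on `df` degrees of freedom.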
One group sample - one sample t-test
Computation in Excel:
– Excel does not have a 1-sample test, but we can fool it.
– Suppose we want to test if the mean height of pediatric patients in our data set 1 is 50 inches.
– Create a dummy column parallel to the hgt column with an equal number of cells, all set to 0.0.
– Run the Matched sample test using hgt and the dummy column and 50 as the hypothesized mean difference.
– The p-value for the two-tailed test is 0.0092.
One group sample - one sample t-test
Using SPSS:
– Analyze > Compare Means > One Sample T Test > Select variable (e.g. height) > Test value: (e.g. 50) > OK
– P-value is .009
– Interpretation: The mean height of the pediatric patients in our dataset 1 is statistically significantly different from 50 inches.
One group sample - Sign Test (Nonparametric)
Use: Compares the median of a single group with a specified value (instead of single sample t-test).
Hypothesis:
H0: Median = c
Ha: Median ≠ c
Test Statistic: We take the difference of each observation from the hypothesized median (x_i - c). Under H0, the number of positive differences follows a Binomial distribution with success probability 0.5. For large sample sizes, this distribution is approximately normal.
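The sign test can be sketched in a few lines of Python using the exact Binomial(n, 0.5) distribution. This is a minimal sketch under stated assumptions: the function name `sign_test` and the data are hypothetical, and observations equal to c are dropped, a common convention that the slides do not spell out.

```python
import math

def sign_test(sample, c):
    """Two-sided exact sign test of H0: median == c.

    Returns (number of positive differences, number of negative
    differences, two-sided p-value). Values equal to c are ignored.
    """
    pos = sum(1 for x in sample if x > c)
    neg = sum(1 for x in sample if x < c)
    n = pos + neg
    k = min(pos, neg)
    # P(X <= k) for X ~ Binomial(n, 0.5), doubled for a two-sided test
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return pos, neg, min(1.0, 2 * tail)

# Hypothetical observations (illustration only)
values = [52, 55, 49, 58, 61, 53, 57, 60]
pos, neg, p_value = sign_test(values, 50)
print(pos, neg, p_value)
```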
One group sample - Sign Test (Nonparametric)
SPSS: Analyze > Nonparametric Tests > Binomial
Two-group (independent) samples - two-sample t-statistic
Use
– Test for equality of two means
Assumptions
– Parent population is normal
– Sample observations (subjects) are independent.
Two-group (independent) samples - two-sample t-statistic
Formula (two groups)
– Case 1: Equal Population Standard Deviations:
The following statistic is distributed as t distribution with (n1+n2-2) d.f.
$$t = \frac{\bar{x}_1 - \bar{x}_2}{S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$
The pooled standard deviation,
$$S_p = \sqrt{\frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}}$$
n1 and n2 are the sample sizes and S1 and S2 are the sample standard deviations of two groups.
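The equal-variance case can be illustrated with a short Python sketch. The function name `pooled_t` and both samples are hypothetical; the code simply evaluates the pooled standard deviation and the t statistic from the formulas above.

```python
import math
import statistics

def pooled_t(x, y):
    """Return (t, degrees of freedom) for the equal-variance two-sample t-test."""
    n1, n2 = len(x), len(y)
    s1_sq = statistics.variance(x)  # sample variance of group 1
    s2_sq = statistics.variance(y)  # sample variance of group 2
    sp = math.sqrt(((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2))
    t = (statistics.mean(x) - statistics.mean(y)) / (sp * math.sqrt(1 / n1 + 1 / n2))
    return t, n1 + n2 - 2

# Hypothetical samples (illustration only)
group1 = [1.0, 2.0, 3.0, 4.0, 5.0]
group2 = [2.0, 4.0, 6.0, 8.0, 10.0]
t_pooled, df_pooled = pooled_t(group1, group2)
print(t_pooled, df_pooled)
```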
Two-group (independent) samples - two-sample t-statistic
Formula (two groups)
– Case 2: Unequal population standard deviations
The following statistic follows t distribution.
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$
The d.f. of this statistic is,
$$v = \frac{\left(s_1^2/n_1 + s_2^2/n_2\right)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}}$$
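For the unequal-variance case, the following Python sketch evaluates the statistic under the null hypothesis of equal means (so the µ1 - µ2 term is zero) together with the approximate degrees of freedom. The function name `welch_t` and the samples are hypothetical.

```python
import math
import statistics

def welch_t(x, y):
    """Return (t, approximate d.f.) for the unequal-variance two-sample t-test,
    evaluated under H0: mu1 == mu2."""
    n1, n2 = len(x), len(y)
    a = statistics.variance(x) / n1  # s1^2 / n1
    b = statistics.variance(y) / n2  # s2^2 / n2
    t = (statistics.mean(x) - statistics.mean(y)) / math.sqrt(a + b)
    v = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t, v

# Hypothetical samples (illustration only)
group1 = [1.0, 2.0, 3.0, 4.0, 5.0]
group2 = [2.0, 4.0, 6.0, 8.0, 10.0]
t_welch, df_welch = welch_t(group1, group2)
print(t_welch, df_welch)
```

Note that the degrees of freedom are generally not an integer in this case.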
Two-group (independent) samples - two-sample t-statistic
MS Excel (in Tools -> Data Analysis…)
Two Groups (Independent Samples):
– t-Test: Two-Sample Assuming Equal Variances
– t-Test: Two-Sample Assuming Unequal Variances
Two-group (independent) samples - two-sample t-statistic
Using SPSS:
– Analyze > Compare Means > Independent-Samples T-test
– Select hgt as a Test Variable
– Select sex as a Grouping Variable
– In Define Groups, type f for Group 1 and m for Group 2
– Click Continue then OK
– The output gives a p-value of 0.205 for the t-test. We can assume equal variances because the p-value of the F statistic for testing equality of variances is 0.845.
Two-group (independent) samples - Wilcoxon Rank-Sum Test (Nonparametric)
Use: Compares medians of two independent groups.
Corresponds to t-Test for 2 Independent Means
Test Statistic: Let X and Y be two samples of sizes m and n. Suppose N = m + n. Compute the rank of all N observations. Then the statistic is
W_m = Sum of the ranks of all observations of variable X.
Two-group (independent) samples - Wilcoxon Rank-Sum Test (Nonparametric)
Asthmatic score A          Asthmatic score B
Score    Rank              Score    Rank
71       1                 85       5
82       3.5               82       3.5
77       2                 94       8
92       7                 97       9
88       6                 ...      ...
Rank Sum 19.5                       25.5
The two scores of 82 tie for ranks 3 and 4, so each receives the average rank 3.5.
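The rank sums in the asthmatic score table can be reproduced with a small Python sketch. It uses only the score rows visible in the table; the helper names `average_ranks` and `rank_sums` are hypothetical, and tied values receive the average of the ranks they occupy.

```python
def average_ranks(values):
    """Rank values from 1 upward, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def rank_sums(x, y):
    """Return the rank sums of x and y after ranking the pooled observations."""
    ranks = average_ranks(list(x) + list(y))
    return sum(ranks[:len(x)]), sum(ranks[len(x):])

score_a = [71, 82, 77, 92, 88]  # visible rows of the table, group A
score_b = [85, 82, 94, 97]      # visible rows of the table, group B
w_a, w_b = rank_sums(score_a, score_b)
print(w_a, w_b)
```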
Two-group (independent) samples - Wilcoxon Rank-Sum Test (Nonparametric)
SPSS:
– Two Groups: Analyze > Nonparametric Tests > 2 Independent Samples
Two-group (matched) samples - paired t-statistic
Use: Compares equality of means of two matched or paired samples (e.g. pretest versus posttest)
Assumptions:
– Parent population is normal
– Sample observations (subjects) are independent
Two-group (matched) samples - paired t-statistic
Formula
– The following statistic follows t distribution with n-1 d.f.
$$t = \frac{\bar{d}}{s_d/\sqrt{n}}$$
Where d is the difference of two matched samples, $\bar{d}$ is its mean, and $s_d$ is the standard deviation of the variable d.
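A minimal Python sketch of the paired statistic follows. The function name `paired_t` and the pretest and posttest scores are hypothetical; the code forms the differences and applies the formula for t with n-1 d.f.

```python
import math
import statistics

def paired_t(first, second):
    """Return (t, degrees of freedom) for matched samples, using d = second - first."""
    d = [b - a for a, b in zip(first, second)]
    n = len(d)
    t = statistics.mean(d) / (statistics.stdev(d) / math.sqrt(n))
    return t, n - 1

# Hypothetical pretest and posttest scores for five subjects (illustration only)
pretest = [10.0, 12.0, 9.0, 14.0, 11.0]
posttest = [12.0, 15.0, 10.0, 17.0, 12.0]
t_paired, df_paired = paired_t(pretest, posttest)
print(t_paired, df_paired)
```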
More on test statistic
One-sided
– There can only be one direction of effect
– The investigator is only interested in one direction of effect.
– Greater power to detect difference in expected direction
Two-sided
– Difference could go in either direction
– More conservative
More on test statistic
One sided
– One group: A single mean differs from a known value in a specific direction, e.g. mean > 0 or median > 0
– Two groups: Two means differ from one another in a specific direction, e.g. mean2 < mean1 or median2 < median1
Two sided
– One group: A single mean differs from a known value in either direction, e.g. mean ≠ 0 or median ≠ 0
– Two groups: Two means are not equal. That is, mean1 ≠ mean2 or median1 ≠ median2
Two-group (matched) samples Wilcoxon Signed-Rank Test (Nonparametric)
USE:
– Compares medians of two paired samples.
Test Statistic
– Obtain Difference Scores, D_i = X_1i - X_2i
– Take Absolute Value of Differences, |D_i|
– Assign Ranks to absolute values (lower to higher), R_i
– Sum up ranks for positive differences (T+) and negative differences (T-)
Test Statistic is smaller of T- or T+ (2-tailed)
Example of Wilcoxon signed rank test (two matched samples)
Subject   Hours of Sleep (Drug)   Hours of Sleep (Placebo)   Difference   Rank Ignoring Sign
1         6.1                     5.2                        0.9          3.5
2         7.0                     7.9                        -0.9         3.5
3         8.2                     3.9                        4.3          10
4         7.6                     4.7                        2.9          7
5         6.5                     5.3                        1.2          5
6         8.4                     5.4                        3.0          8
7         6.9                     4.2                        2.7          6
8         6.7                     6.1                        0.6          2
9         7.4                     3.8                        3.6          9
10        5.8                     6.3                        -0.5         1
3rd & 4th ranks are tied hence averaged.
P-value of this test is 0.02. Hence the test is significant at any level more than 2%, indicating the drug is more effective than placebo.
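The signed-rank sums for the sleep example can be checked with a short Python sketch. It starts from the Difference column of the table; the helper names `average_ranks` and `signed_rank` are hypothetical, and dropping zero differences is an assumed convention (none occur here). Only T+ and T- are computed, not the p-value.

```python
def average_ranks(values):
    """Rank values from 1 upward, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def signed_rank(differences):
    """Return (T+, T-, test statistic) where the statistic is min(T+, T-)."""
    nonzero = [d for d in differences if d != 0]  # assumed convention: drop zeros
    ranks = average_ranks([abs(d) for d in nonzero])
    t_plus = sum(r for d, r in zip(nonzero, ranks) if d > 0)
    t_minus = sum(r for d, r in zip(nonzero, ranks) if d < 0)
    return t_plus, t_minus, min(t_plus, t_minus)

# Difference column (drug minus placebo) from the sleep table
differences = [0.9, -0.9, 4.3, 2.9, 1.2, 3.0, 2.7, 0.6, 3.6, -0.5]
t_plus, t_minus, statistic = signed_rank(differences)
print(t_plus, t_minus, statistic)
```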
Two-group (matched) samples Wilcoxon Signed-Rank Test (Nonparametric)
SPSS:
– Two Matched Groups: Analyze > Nonparametric Tests > 2 Related Samples
Comparing > 2 independent samples: F statistic (Parametric)
Use:
– Compares means of more than two groups
– Testing the equality of population variances.
Comparing > 2 independent samples: F statistic (Parametric)
Let X and Y be two independent Chi-square variables with n1 and n2 d.f. respectively, then the following statistic follows an F distribution with n1 and n2 d.f.
$$F_{n_1, n_2} = \frac{X/n_1}{Y/n_2}$$
Let X and Y be two independent normal variables with equal population variances and sample sizes n1 and n2. Then the following statistic follows an F distribution with (n1-1) and (n2-1) d.f.
$$F_{n_1 - 1,\, n_2 - 1} = \frac{s_x^2}{s_y^2}$$
Where $s_x^2$ and $s_y^2$ are sample variances of X and Y.
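The ratio of two sample variances is simple to compute directly. The Python sketch below is illustrative only: the function name `variance_ratio` and the samples are hypothetical, and it reports the numerator and denominator degrees of freedom as one less than each sample size, the standard result for sample variances of normal data.

```python
import statistics

def variance_ratio(x, y):
    """Return (F, numerator d.f., denominator d.f.) for two samples."""
    f = statistics.variance(x) / statistics.variance(y)
    return f, len(x) - 1, len(y) - 1

# Hypothetical samples (illustration only)
sample_x = [2.0, 4.0, 6.0, 8.0, 10.0]
sample_y = [1.0, 2.0, 3.0, 4.0, 5.0]
f_stat, df1, df2 = variance_ratio(sample_x, sample_y)
print(f_stat, df1, df2)
```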
Comparing > 2 independent samples: F statistic (Parametric)
Hypotheses:
H0: µ1 = µ2 = … = µn
Ha: not all of µ1, µ2, …, µn are equal
Comparison will be done using analysis of variance (ANOVA) technique.
ANOVA uses F statistic for this comparison.
The ANOVA technique will be covered in another class session.
Proportion Tests
Use
– Test for equality of two Proportions
E.g. proportions of subjects in two treatment groups who benefited from treatment.
– Test for the value of a single proportion
E.g., to test if the proportion of smokers in a population is some specified value (less than 1)
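Proportion Tests
Formula
– One Group:
$$z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1 - p_0)}{n}}}$$
– Two Groups:
$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}, \text{ where } \hat{p} = \frac{x_1 + x_2}{n_1 + n_2}.$$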
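Both z statistics can be sketched in Python with the standard library alone, using `math.erf` for the normal distribution function. The function names and the counts below are hypothetical, and the two-sided p-value relies on the large-sample normal approximation.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def one_prop_z(x, n, p0):
    """z statistic for H0: p == p0, given x successes out of n."""
    p_hat = x / n
    return (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

def two_prop_z(x1, n1, x2, n2):
    """z statistic for H0: p1 == p2, using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    p = (x1 + x2) / (n1 + n2)
    return (p1 - p2) / math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))

# Hypothetical counts (illustration only)
z_one = one_prop_z(60, 100, 0.5)
z_two = two_prop_z(30, 50, 20, 50)
p_two_sided = 2 * (1 - normal_cdf(abs(z_two)))
print(z_one, z_two, p_two_sided)
```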
Proportion Test
SPSS:
– One Group: Analyze > Nonparametric Tests > Binomial
– Two Groups?
Proportion of males in Dataset 1
SPSS:
– Recode sex as numeric: Transform > Recode > Into Different Variables > make all selections there and click Change after recoding the character variable into numeric.
– Analyze > Nonparametric Tests > Binomial > select Test variable > Test proportion
Set null hypothesis = 0.5
The p-value = 1.0
Chi-square statistic
USE
– Testing the population variance $\sigma^2 = \sigma_0^2$.
– Testing the goodness of fit.
– Testing the independence/association of attributes
Assumptions
– Sample observations should be independent.
– Cell frequencies should be >= 5.
– Total observed and expected frequencies are equal
Chi-square statistic
Formula: If $x_i$ ($i = 1, 2, \ldots, n$) are independent and normally distributed with mean µ and standard deviation σ, then
$$\sum_{i=1}^{n}\left(\frac{x_i - \mu}{\sigma}\right)^2 \text{ is a } \chi^2 \text{ distribution with } n \text{ d.f.}$$
If we don't know µ, then we estimate it using a sample mean and then,
$$\sum_{i=1}^{n}\left(\frac{x_i - \bar{x}}{\sigma}\right)^2 \text{ is a } \chi^2 \text{ distribution with } (n-1) \text{ d.f.}$$
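When the mean is estimated from the sample, the second sum can be used to test a hypothesized population variance. The Python sketch below is a rough illustration: the function name `variance_chi_square`, the data, and the hypothesized variance of 4.0 are all hypothetical.

```python
import statistics

def variance_chi_square(sample, sigma0_sq):
    """Return (chi-square statistic, d.f.) for H0: population variance == sigma0_sq."""
    mean = statistics.mean(sample)
    stat = sum((x - mean) ** 2 for x in sample) / sigma0_sq
    return stat, len(sample) - 1

# Hypothetical observations and hypothesized variance (illustration only)
values = [48.0, 52.0, 50.0, 54.0, 51.0]
chi_sq, df = variance_chi_square(values, 4.0)
print(chi_sq, df)
```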
Chi-square statistic
For a contingency table we use the following chi-square test statistic,
$$\chi^2 = \sum_{i=1}^{n}\frac{(O_i - E_i)^2}{E_i}$$
$O_i$ = Observed Frequency
$E_i$ = Expected Frequency
The statistic is distributed as $\chi^2$ with (n-1) d.f. for a goodness-of-fit test over n categories, and with (r-1)(c-1) d.f. for a contingency table with r rows and c columns.
Chi-square statistic
          Male O(E)   Female O(E)   Total
Group 1   9 (10)      9 (10)        20
Group 2   8 (10)      12 (10)       20
Group 3   11 (10)     9 (10)        20
Total     30          30            60
Chi-square statistic – calculation of expected frequency
To obtain the expected frequency for any cell, use:
Corresponding row total × column total / grand total
E.g., cell for group 1 and female, substituting: 30 × 20 / 60 = 10
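The expected-frequency rule and the chi-square statistic can be sketched together in Python. The observed counts below are hypothetical, chosen only so that every row total is 20 and every column total is 30 as in the slide's margins; the function names are also hypothetical, and the degrees of freedom use the usual (rows - 1)(columns - 1) rule for a contingency table.

```python
def expected_counts(observed):
    """Expected cell counts: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand = sum(row_totals)
    return [[r * c / grand for c in col_totals] for r in row_totals]

def chi_square(observed):
    """Return (chi-square statistic, d.f.) for a two-way table of counts."""
    expected = expected_counts(observed)
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return stat, df

# Hypothetical observed counts (rows: groups, columns: male, female)
table = [[10, 10], [8, 12], [12, 8]]
expected = expected_counts(table)
chi_sq_stat, chi_sq_df = chi_square(table)
print(expected, chi_sq_stat, chi_sq_df)
```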
Chi-square statistic
SPSS:
– Analyze > Descriptive Statistics > Crosstabs > Statistics > Chi-square
– Select variables.
– Click on Cell button to select items you want in cells, rows, and columns.
Credits
Thanks are due to all whose works have been consulted prior to the preparation of these slides.
Questions