Analysis of Categorical Data

Types of Tests
o Data in 2 X 2 Tables (covered previously)
  - Comparing two population proportions using independent samples (Fisher’s Exact Test)
  - Comparing two population proportions using dependent samples (McNemar’s Test)
  - Relative Risk (RR), Odds Ratios (OR), Risk Difference, Attributable Risk (AR), & NNT/NNH
o Data in r X c Tables
  - Tests of Independence/Association and Homogeneity
Cervical Cancer and Age at First Pregnancy – 2 X 2 Data Table
These data come from a case-control study examining the potential relationship between age at first pregnancy and cervical cancer. In this study we will be comparing the proportion of women who had their first pregnancy at or before the age of 25 among cases and controls, because researchers suspected that an early age at first pregnancy leads to increased risk of developing cervical cancer.
2 X 2 Example: Case-Control Study
Cervical Cancer and Age at 1st Pregnancy

Disease Status            Age at 1st Pregnancy   Age at 1st Pregnancy   Row Totals
                          Age ≤ 25               Age > 25
Cervical Cancer (Case)    42                     7                      49
Healthy (Control)         203                    114                    317
Column Totals             245                    121                    366
Previously
o We have compared the proportions of women with the risk factor in both groups (p1 vs. p2) using the z-test, a CI for (p1 – p2) & Fisher’s Exact Test.
o Computed the Odds Ratio (OR) and found a CI for the population OR.
Development of a Test Statistic to Measure Lack of Independence
One way to generalize the question of interest to the researchers is to think of it as follows:
Q: Is there an association between cervical cancer status and whether or not a woman had her 1st pregnancy at or before the age of 25?
If there is not an association, we say that the variables are independent.
In the probability notes we saw that two events A and B are said to be independent if P(A|B) = P(A).
In the context of our study this would mean
P(Age ≤ 25 | Cancer Status) = P(Age ≤ 25)
i.e. knowing something about disease status tells you nothing about the presence of the risk factor of having their first pregnancy at or before age 25.
When we consider this percentage conditioning on disease status, we see that the relationship for independence does not hold for these data.
P(Age ≤ 25 | Cervical Cancer) = 42/49 = .8571
P(Age ≤ 25 | Healthy Control) = 203/317 = .6404
P(Age ≤ 25) = 245/366 = .6694
In this study 66.94% of the women sampled had their first pregnancy at or before the age of 25.
Under independence, both conditional probabilities should be equal to .6694.
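The comparison above can be reproduced with a few lines of Python. This is only a minimal sketch: the dictionary layout and the function name `conditional_and_marginal` are illustrative choices, not part of the original notes; the counts come from the 2 X 2 table.

```python
# Observed counts from the 2 X 2 table (rows: disease status, columns: age group).
table = {
    "case": {"age_le_25": 42, "age_gt_25": 7},
    "control": {"age_le_25": 203, "age_gt_25": 114},
}

def conditional_and_marginal(table, column):
    """Return P(column | row) for each row and the unconditional P(column)."""
    conditional = {}
    column_total = 0
    grand_total = 0
    for row, counts in table.items():
        row_total = sum(counts.values())
        conditional[row] = counts[column] / row_total
        column_total += counts[column]
        grand_total += row_total
    return conditional, column_total / grand_total

conditional, marginal = conditional_and_marginal(table, "age_le_25")
for row, p in conditional.items():
    print(f"P(Age <= 25 | {row}) = {p:.4f}")
print(f"P(Age <= 25) = {marginal:.4f}")
```

If the two variables were independent, both conditional proportions would sit close to the marginal proportion; here they fall on opposite sides of it.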
o Of course the observed differences could be due to random variation, and in truth it may be the case that disease and risk factor status are independent.
o Therefore we need a means of assessing how different the observed results are from what we would expect to see if these two factors were independent.
Cervical Cancer and Age at 1st Pregnancy (cell counts a, b, c, d; row totals R1, R2; column totals C1, C2; grand total n)

Disease Status            Age ≤ 25       Age > 25       Row Totals
Cervical Cancer (Case)    a = 42         b = 7          R1 = 49
Healthy (Control)         c = 203        d = 114        R2 = 317
Column Totals             C1 = 245       C2 = 121       n = 366
From this table we can calculate the conditional probability of having the risk factor of early pregnancy given the disease status of the subject as follows:
\[ P(\text{Risk} \mid \text{Disease}) = \frac{a}{R_1} \]
The unconditional probability of risk presence in these data is given by:
\[ P(\text{Risk}) = \frac{C_1}{n} \]
and setting these equal we have
\[ \frac{a}{R_1} = \frac{C_1}{n} \quad\Longrightarrow\quad a = \frac{R_1 C_1}{n} \]
Thus, under independence, we expect the frequency in the a cell to be equal to:
\[ a = \frac{R_1 C_1}{n} \]
Similarly we find the following expected frequencies for the cells making up the 2 X 2 table:
\[ a = \frac{R_1 C_1}{n}, \qquad b = \frac{R_1 C_2}{n}, \qquad c = \frac{R_2 C_1}{n}, \qquad d = \frac{R_2 C_2}{n} \]
In general we denote the observed frequency in the ith row and jth column as \(O_{ij}\), or just O for short.
We denote the expected frequency for the ith row and jth column as
\[ E_{ij} = \frac{R_i C_j}{n} \]
or just E for short, where \(R_i\) is the row total for row i and \(C_j\) is the column total for column j.
o To measure how different our observed results are from what we expected to see if the two variables in question were independent, we intuitively should look at the difference between the observed (O) and expected (E) frequencies, i.e. O – E, or more specifically
\[ O_{ij} - E_{ij} \]
o However, this will give too much weight to differences where these frequencies are both large in size.
o One test statistic that addresses the “size” of the frequencies issue is Pearson’s Chi-Square:
\[ \chi^2 = \sum_{\text{all cells}} \frac{(O - E)^2}{E} = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} \;\sim\; \text{chi-squared distribution with } df = (r - 1)(c - 1) \]
Notice this test statistic still uses (O – E) as the basic building block. This statistic will be large when the observed frequencies do NOT match the expected values for independence.
Chi-square Distribution
[Figure: graph of the chi-square distribution with 4 degrees of freedom, with the p-value shown as the area to the right of the test statistic.]
The area to the right of Pearson’s chi-square statistic gives the p-value. The p-value is always the area to the right!
Cervical Cancer and Age at 1st Pregnancy (observed frequencies in O_ij notation)

Disease Status            Age ≤ 25       Age > 25       Row Totals
Cervical Cancer (Case)    O11 = 42       O12 = 7        R1 = 49
Healthy (Control)         O21 = 203      O22 = 114      R2 = 317
Column Totals             C1 = 245       C2 = 121       n = 366
Calculating Expected Frequencies
Cervical Cancer and Age at 1st Pregnancy (expected frequencies in parentheses)

Disease Status            Age ≤ 25          Age > 25          Row Totals
Cervical Cancer (Case)    42 (32.80)        7 (16.20)         R1 = 49
Healthy (Control)         203 (212.20)      114 (104.80)      R2 = 317
Column Totals             C1 = 245          C2 = 121          n = 366

E_ij = expected frequency for the ith row and jth column

\[ E_{11} = \frac{R_1 C_1}{n} = \frac{49 \times 245}{366} = 32.80 \qquad E_{12} = \frac{R_1 C_2}{n} = \frac{49 \times 121}{366} = 16.20 \]
\[ E_{21} = \frac{R_2 C_1}{n} = \frac{317 \times 245}{366} = 212.20 \qquad E_{22} = \frac{R_2 C_2}{n} = \frac{317 \times 121}{366} = 104.80 \]
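The rule E_ij = R_i C_j / n works for a table of any size, so it can be written once as a small helper. The sketch below is illustrative only; the function name `expected_counts` and the list-of-lists layout are assumptions, not something taken from the slides.

```python
def expected_counts(observed):
    """Expected counts under independence: E_ij = R_i * C_j / n."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

# Rows: case, control. Columns: age <= 25, age > 25.
observed = [[42, 7], [203, 114]]
expected = expected_counts(observed)
for row in expected:
    print([round(e, 2) for e in row])
```

The expected counts keep the same row and column totals as the observed table, which is a quick sanity check on the arithmetic.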
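Calculating the Pearson Chi-square
\[ \chi^2 = \sum_{\text{all cells}} \frac{(O - E)^2}{E} = \frac{(42 - 32.8)^2}{32.8} + \frac{(7 - 16.20)^2}{16.20} + \frac{(203 - 212.20)^2}{212.20} + \frac{(114 - 104.80)^2}{104.80} \]
\[ = 2.58 + 5.22 + .40 + .81 = 9.011 \]
\[ df = (2 - 1) \times (2 - 1) = 1, \qquad p\text{-value} = .0027 \]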
http://www.stat.tamu.edu/~west/applets/chisqdemo.html
Chi-square Probability Calculator in JMP
Enter the test statistic value and df, and the p-value is automatically calculated.
p-value = P(χ² > observed test statistic)
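For readers without JMP, the same right-tail area can be approximated with the Python standard library alone. The sketch below is not the JMP calculator; `chi2_sf` and `pearson_chi2` are hypothetical helper names, and the tail area uses the closed forms for df = 1 and df = 2 plus the usual step-up recurrence in df.

```python
import math

def chi2_sf(x, df):
    """Right-tail area P(X > x) for a chi-square variable with integer df >= 1."""
    half = x / 2.0
    if df % 2 == 1:
        tail, k = math.erfc(math.sqrt(half)), 1  # closed form for df = 1
    else:
        tail, k = math.exp(-half), 2             # closed form for df = 2
    # Raise df two at a time: Q(k + 2) = Q(k) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1)
    while k < df:
        tail += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return tail

def pearson_chi2(observed):
    """Return (statistic, df) for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    statistic = 0.0
    for row, r in zip(observed, row_totals):
        for o, c in zip(row, col_totals):
            e = r * c / n  # expected count under independence
            statistic += (o - e) ** 2 / e
    return statistic, (len(row_totals) - 1) * (len(col_totals) - 1)

chi2, df = pearson_chi2([[42, 7], [203, 114]])
p_value = chi2_sf(chi2, df)
print(f"chi-square = {chi2:.3f}, df = {df}, p-value = {p_value:.4f}")
```

A statistics package would normally be used instead; the hand-rolled tail function is only there to keep the example self-contained.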
2 X 2 Example: Case-Control Study
Cervical Cancer and Age at 1st Pregnancy
Conclusion: We have strong evidence to suggest that age at first pregnancy and cervical cancer status are NOT independent, i.e. that they are associated or related (p = .0027). In particular, we found that the proportion of women having their first pregnancy at or before the age of 25 was higher among women with cervical cancer than among those without.
Other things we could do…
o Odds Ratio (OR) and CI for OR
  - case-control study means no RR.
o Fisher’s Exact Test
  - Pearson’s chi-square is an approximation that requires “large” sample sizes
    * typically we would like all E_ij > 5
    * or at least 80% of cells should have E_ij > 5
    * thus the approximation should be good here, as both of these conditions are met for this study.
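As an illustration of the first item, the odds ratio for the cervical cancer table can be computed directly from the four cell counts. The notes do not say which interval method was used earlier in the course, so the sketch below assumes the common Woolf (log-scale) interval with z = 1.96; the function name `odds_ratio_ci` is hypothetical.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio ad/(bc) with a Woolf (log-scale) confidence interval.

    Assumption: the Woolf standard error sqrt(1/a + 1/b + 1/c + 1/d) is used;
    the original notes do not name a specific interval method.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# a = cases with age <= 25, b = cases with age > 25,
# c = controls with age <= 25, d = controls with age > 25.
odds_ratio, lower, upper = odds_ratio_ci(42, 7, 203, 114)
print(f"OR = {odds_ratio:.2f}, approximate 95% CI = ({lower:.2f}, {upper:.2f})")
```

An interval that excludes 1 points in the same direction as the chi-square test: the odds of an early first pregnancy are higher among cases than among controls.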
Example 2: Response to Treatment and Histological Type of Hodgkin’s Disease
In this study a random sample of 538 patients diagnosed with some form of Hodgkin’s Disease was taken, and the histological type, nodular sclerosis (NS), mixed cellularity (MC), lymphocyte predominance (LP), or lymphocyte depletion (LD), was recorded along with the outcome from standard treatment, which was recorded as being none, partial, or complete remission.
Q: Is there an association between type of Hodgkin’s and response to treatment? If so, what is the nature of the relationship?
Type            None      Partial   Positive   Row Totals
LD              44        10        18         72
LP              12        18        74         104
MC              58        54        154        266
NS              12        16        68         96
Column Totals   126       98        314        n = 538
Some Probabilities of Potential Interest
Probability of Positive Response to Treatment
P(positive) = 314/538 = .5836
Probability of Positive Response to Treatment Given Disease Type
P(positive|LD) = 18/72 = .2500
P(positive|LP) = 74/104 = .7115
P(positive|MC) = 154/266 = .5789
P(positive|NS) = 68/96 = .7083
Notice the conditional probabilities are not equal to the unconditional!!!
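The same comparison can be extended to the full response profile of each histological type, which is the information a mosaic plot displays. A minimal sketch follows; the dictionary layout and the helper name `row_profile` are illustrative assumptions, and the counts are those in the table.

```python
# Observed counts by histological type and response to treatment.
counts = {
    "LD": {"None": 44, "Partial": 10, "Positive": 18},
    "LP": {"None": 12, "Partial": 18, "Positive": 74},
    "MC": {"None": 58, "Partial": 54, "Positive": 154},
    "NS": {"None": 12, "Partial": 16, "Positive": 68},
}

def row_profile(row):
    """Conditional distribution of response within one histological type."""
    total = sum(row.values())
    return {response: count / total for response, count in row.items()}

profiles = {hist_type: row_profile(row) for hist_type, row in counts.items()}
n = sum(sum(row.values()) for row in counts.values())
overall_positive = sum(row["Positive"] for row in counts.values()) / n

for hist_type, profile in profiles.items():
    print(hist_type, {response: round(p, 4) for response, p in profile.items()})
print(f"P(positive) = {overall_positive:.4f}")
```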
Mosaic plot of the results: Response to Treatment vs. Histological Type
[Figure: mosaic plot of response to treatment by histological type.]
Clearly we see that LP and NS respond most favorably to treatment, with over 70% of those sampled experiencing complete remission, whereas lymphocyte depletion (LD) has a majority (61.1%) of patients having no response to treatment.
A statistical test at this point seems unnecessary, as it seems clear that there is an association between the type of Hodgkin’s disease and the response to treatment; nonetheless we will proceed…
Observed frequencies with expected frequencies in parentheses:

Type            None           Partial        Positive        Row Totals
LD              44 (16.86)     10 (13.11)     18 (42.02)      72
LP              12 (24.36)     18 (18.94)     74 (60.70)      104
MC              58 (62.30)     54 (48.45)     154 (155.25)    266
NS              12 (22.48)     16 (17.49)     68 (56.03)      96
Column Totals   126            98             314             n = 538

\[ E_{11} = \frac{R_1 C_1}{n} = \frac{72 \times 126}{538} = 16.86 \qquad E_{12} = \frac{R_1 C_2}{n} = \frac{72 \times 98}{538} = 13.11 \]
\[ E_{13} = \frac{R_1 C_3}{n} = \frac{72 \times 314}{538} = 42.02 \qquad \ldots \qquad E_{43} = \frac{R_4 C_3}{n} = \frac{96 \times 314}{538} = 56.03 \]
Calculating the Pearson chi-square from the observed and expected frequencies:
\[ \chi^2 = \sum_{\text{all cells}} \frac{(O - E)^2}{E} = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} \]
\[ = \frac{(44 - 16.86)^2}{16.86} + \frac{(10 - 13.11)^2}{13.11} + \ldots + \frac{(68 - 56.03)^2}{56.03} = 75.89 \]
\[ df = (4 - 1) \times (3 - 1) = 6, \qquad p\text{-value} < .0001 \]
We have strong evidence of an association between the type of Hodgkin’s and response to treatment (p < .0001).
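To address the second half of the question (the nature of the relationship), it can help to look at how much each cell contributes to the chi-square total. The following sketch recomputes the statistic for the Hodgkin’s table and prints the per-cell contributions; the variable names are illustrative, and the p-value uses the closed-form tail area that is available when df is even.

```python
import math

# Observed counts: rows are LD, LP, MC, NS; columns are None, Partial, Positive.
types = ["LD", "LP", "MC", "NS"]
observed = [
    [44, 10, 18],
    [12, 18, 74],
    [58, 54, 154],
    [12, 16, 68],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Per-cell contributions (O - E)^2 / E, with E_ij = R_i * C_j / n.
contributions = [
    [(o - r * c / n) ** 2 / (r * c / n) for o, c in zip(row, col_totals)]
    for row, r in zip(observed, row_totals)
]
chi2 = sum(sum(row) for row in contributions)
df = (len(row_totals) - 1) * (len(col_totals) - 1)

# For even df the right-tail area has a closed form:
# P(X > x) = exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!
half = chi2 / 2.0
p_value = math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))

print(f"chi-square = {chi2:.2f}, df = {df}, p-value = {p_value:.2e}")
for name, row in zip(types, contributions):
    print(name, [round(x, 2) for x in row])
```

The largest contribution comes from the LD / None cell, which matches the earlier observation that most lymphocyte depletion patients showed no response to treatment.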
Measures of Association Between Two Categorical Variables
Phi coefficient:
\[ \phi = \sqrt{\frac{\chi^2}{n}} \]
This can be applied to the cervical cancer case-control study:
\[ \phi = \sqrt{\frac{9.011}{366}} = .157 \]
Contingency coefficient:
\[ C = \sqrt{\frac{\chi^2}{\chi^2 + n}} \]
This can be used for general r x c tables. For the Hodgkin’s example:
\[ C = \sqrt{\frac{75.89}{538 + 75.89}} = .3516 \]
Cramér’s V:
\[ V = \sqrt{\frac{\chi^2}{n\,(\min(r, c) - 1)}} \]
For the Hodgkin’s study:
\[ V = \sqrt{\frac{75.89}{538\,(3 - 1)}} = .266 \]
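All three measures are simple functions of the chi-square statistic and the sample size, so they are easy to check numerically. The helper names below are illustrative assumptions; the inputs are the chi-square values and sample sizes from the two examples.

```python
import math

def phi_coefficient(chi2, n):
    """Phi = sqrt(chi2 / n), used here for the 2 X 2 table."""
    return math.sqrt(chi2 / n)

def contingency_coefficient(chi2, n):
    """C = sqrt(chi2 / (chi2 + n)), usable for general r x c tables."""
    return math.sqrt(chi2 / (chi2 + n))

def cramers_v(chi2, n, n_rows, n_cols):
    """V = sqrt(chi2 / (n * (min(r, c) - 1)))."""
    return math.sqrt(chi2 / (n * (min(n_rows, n_cols) - 1)))

phi = phi_coefficient(9.011, 366)            # cervical cancer study
c = contingency_coefficient(75.89, 538)      # Hodgkin's study
v = cramers_v(75.89, 538, n_rows=4, n_cols=3)

print(f"phi = {phi:.3f}, C = {c:.4f}, V = {v:.3f}")
```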
There are lots of other measures of association. When both variables are nominal the previous measures are fine, and there are certainly many more. For cases where both variables are ordinal, common measures include Kendall’s tau and Somers’ D.
In some cases we wish to measure the degree of exact agreement between two nominal or ordinal variables measured using the same levels or scales, in which case we generally use Cohen’s Kappa (κ).
Cohen’s Kappa (κ) – measures the degree of agreement between two variables on the same scales.
Example 3: Medicare Study – General health at baseline and 2-yr. follow-up, how well do they agree?
Interpreting κ, from largest to smallest values: excellent agreement, good agreement, marginal agreement (κ only slightly above 0).
There is fairly good agreement between the general assessment of overall health at baseline and at follow-up. However, there appears to be some general trend for improvement as well.
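The Medicare table itself is not reproduced in these notes, so the following sketch uses a small hypothetical 3-level table purely to show how κ is computed: the observed proportion of exact agreement (the diagonal) is compared with the agreement expected by chance from the margins. The function name `cohens_kappa` and the counts are assumptions for illustration only.

```python
def cohens_kappa(table):
    """Cohen's kappa for a square table of counts (rows and columns on the same scale)."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    # Observed agreement: proportion of subjects on the diagonal.
    p_observed = sum(table[i][i] for i in range(len(table))) / n
    # Chance agreement: sum of products of matching row and column proportions.
    p_expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    return (p_observed - p_expected) / (1 - p_expected)

# Hypothetical counts (NOT the Medicare data): rows = first rating, columns = second rating.
hypothetical = [
    [20, 5, 0],
    [5, 20, 5],
    [0, 5, 40],
]
kappa = cohens_kappa(hypothetical)
print(f"kappa = {kappa:.3f}")
```

A table with all counts on the diagonal gives κ = 1, and a table whose diagonal matches chance expectation gives κ = 0.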
Testing for Lack of Symmetry
o Bowker’s Test of Symmetry is a generalization of McNemar’s Test to r x r tables where the row and column variables are on the same scale.
o The general health of the subjects in the Medicare study is an example of where this test could be used, as both the health at baseline and at follow-up are recorded using the same 5-point ordinal scale.
Bowker’s Test of Symmetry
\[ \chi^2_B = \sum_{i<j} \frac{(O_{ij} - O_{ji})^2}{O_{ij} + O_{ji}} \;\sim\; \text{chi-squared distribution with } df = \frac{r(r - 1)}{2} \]

                 Y
X                1       2       …       r       Row Totals
1                O11     O12     …       O1r
2                O21     O22     …       O2r
…                …       …       …       …
r                Or1     Or2     …       Orr
Column Totals
The test looks for the frequencies to be generally larger on one side of the diagonal than the other.
When will this test statistic be “large”? If there were a general trend or tendency for X > Y or for X < Y, then we would expect the off-diagonal cells of the table to be larger on one side than the other. For example, if Y tended to be larger than X, perhaps indicating an improvement in health, then we would expect the frequencies above the diagonal to be larger than those below.
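A small numerical sketch may make the behaviour of the statistic concrete. The 3 x 3 table below is hypothetical (it is not the Medicare data) and was chosen so that the counts above the diagonal exceed those below it; `bowker_statistic` and `chi2_sf` are illustrative helper names, and the tail area is computed with the standard library only.

```python
import math

def chi2_sf(x, df):
    """Right-tail area P(X > x) for a chi-square variable with integer df >= 1."""
    half = x / 2.0
    if df % 2 == 1:
        tail, k = math.erfc(math.sqrt(half)), 1  # closed form for df = 1
    else:
        tail, k = math.exp(-half), 2             # closed form for df = 2
    while k < df:
        tail += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return tail

def bowker_statistic(table):
    """Sum over i < j of (O_ij - O_ji)^2 / (O_ij + O_ji), with df = r(r - 1)/2.

    Assumption: pairs with O_ij + O_ji == 0 are skipped and df is left unchanged.
    """
    r = len(table)
    statistic = 0.0
    for i in range(r):
        for j in range(i + 1, r):
            total = table[i][j] + table[j][i]
            if total > 0:
                statistic += (table[i][j] - table[j][i]) ** 2 / total
    return statistic, r * (r - 1) // 2

# Hypothetical counts: rows = X (baseline), columns = Y (follow-up).
hypothetical = [
    [20, 10, 5],
    [3, 30, 12],
    [1, 4, 15],
]
stat, df = bowker_statistic(hypothetical)
p_value = chi2_sf(stat, df)
print(f"Bowker chi-square = {stat:.3f}, df = {df}, p-value = {p_value:.4f}")
```

A perfectly symmetric table gives a statistic of 0, since every off-diagonal pair cancels.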
Symmetry of Disagreement
Bowker’s test suggests the differences are asymmetric (p < .0001). Examining the percentages suggests a majority of patients either stayed the same or improved in each group based on baseline score.
Therefore it is reasonable to state that we have evidence that, in general, subjects’ health stayed the same or, if it did change, it was generally for the better (p < .0001).
Other Approaches
o Wilcoxon Signed-Rank Test for the paired differences in the ordinal health score (p < .0001).
o Direct examination of the distribution of the changes in general health score (Follow-up – Baseline).
There is a slight advantage for improvement vs. decline in health.
The plot on the right shows the change in general health vs. baseline health. With the exception of those with the lowest health at baseline a majority (50%+) of patients stayed the same. The shading for improvement is larger than the shading for health decline.
Other Tests for Categorical Data
o Chi-square Test for Trend in Binomial Proportions – tests whether or not p1 < p2 < p3 < … < pk, where 1, 2, …, k are levels of an ordinal variable, i.e. a 2 X k table.
o Chi-square Goodness-of-Fit Tests – used to test whether observations come from some hypothesized distribution.
o Cochran-Mantel-Haenszel Test – looks at whether or not there is a relationship in a 2 X 2 table situation, adjusting for the level of a third factor. For example, is there a relationship between heavy drinking (Y or N) and lung cancer (Y or N), adjusting for smoking status?