Analysis of Categorical Data. Types of Tests o Data in 2 X 2 Tables (covered previously) Comparing...

Analysis of Categorical Data

Types of Tests

o Data in 2 X 2 Tables (covered previously)
  - Comparing two population proportions using independent samples (Fisher’s Exact Test)
  - Comparing two population proportions using dependent samples (McNemar’s Test)
  - Relative Risk (RR), Odds Ratios (OR), Risk Difference, Attributable Risk (AR), & NNT/NNH

o Data in r X c Tables
  - Tests of Independence/Association and Homogeneity.

Cervical Cancer and Age at First Pregnancy – 2 X 2 Data Table

These data come from a case-control study to examine the potential relationship between age at first pregnancy and cervical cancer. In this study we will be comparing the proportion of women who had their first pregnancy at or before the age of 25, because researchers suspected that an early age at first pregnancy leads to increased risk of developing cervical cancer.

2 X 2 Example: Case-Control Study

Cervical Cancer and Age at 1st Pregnancy

Disease Status            Age at 1st Pregnancy    Age at 1st Pregnancy    Row Totals
                          Age ≤ 25                Age > 25
Cervical Cancer (Case)    42                      7                       49
Healthy (Control)         203                     114                     317
Column Totals             245                     121                     366

Previously

o We have compared the proportions of women with the risk factor in both groups (p1 vs. p2) using the z-test, a CI for (p1 – p2) & Fisher’s Exact Test.

o Computed the Odds Ratio (OR) and found a CI for the population OR.

Development of a Test Statistic to Measure Lack of Independence

One way to generalize the question of interest to the researchers is to think of it as follows:

Q: Is there an association between cervical cancer status and whether or not a woman had her 1st pregnancy at or before the age of 25?


Independence

If there is not an association, we say that the variables are independent.

In the probability notes we saw that two events A and B are said to be independent if P(A|B) = P(A).


Independence

In the context of our study this would mean

P(Age ≤ 25|Cancer Status) = P(Age ≤ 25)

i.e. knowing something about disease status tells you nothing about the presence of the risk factor of having their first pregnancy at or before age 25.


Independence

When we consider this percentage conditioning on disease status, we see that the relationship for independence does not hold for these data.

P(Age ≤ 25|Cervical Cancer) = 42/49 = .8571

P(Age ≤ 25|Healthy Control) = 203/317 = .6404

If disease status and the risk factor were independent, both of these should be equal to

P(Age ≤ 25) = 245/366 = .6694

In this study 66.94% of the women sampled had their first pregnancy at or before the age of 25.
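The comparison of conditional and unconditional proportions can be checked numerically. The following is a minimal Python sketch using the counts from the 2 X 2 table (49 cases, 317 controls); the variable names are illustrative and are not part of the original notes.

```python
# Counts from the cervical cancer case-control table.
cases_early, cases_total = 42, 49            # cases with first pregnancy at or before 25
controls_early, controls_total = 203, 317    # controls with first pregnancy at or before 25

n = cases_total + controls_total             # 366 women in all
early_total = cases_early + controls_early   # 245 women with the risk factor

p_early_given_case = cases_early / cases_total
p_early_given_control = controls_early / controls_total
p_early = early_total / n

print(f"P(Age <= 25 | Case)    = {p_early_given_case:.4f}")
print(f"P(Age <= 25 | Control) = {p_early_given_control:.4f}")
print(f"P(Age <= 25)           = {p_early:.4f}")
```

Under independence the first two printed values would both match the third, up to sampling variation.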


Independence

o Of course the observed differences could be due to random variation and in truth it may be the case that disease and risk factor status are independent.

o Therefore we need a means of assessing how different the observed results are from what we would expect to see if these two factors were independent.




Disease Status            Age ≤ 25    Age > 25    Row Totals
Cervical Cancer (Case)    a = 42      b = 7       R1 = 49
Healthy (Control)         c = 203     d = 114     R2 = 317
Column Totals             C1 = 245    C2 = 121    n = 366


Independence

From this table we can calculate the conditional probability of having the risk factor of early pregnancy given the disease status of the subject as follows:

$$P(\text{Risk} \mid \text{Disease}) = \frac{a}{R_1}$$

The unconditional probability of risk presence for these data is given by:

$$P(\text{Risk}) = \frac{C_1}{n}$$

and setting these equal we have

$$\frac{a}{R_1} = \frac{C_1}{n} \quad \Longrightarrow \quad a = \frac{R_1 C_1}{n}$$


Independence

Thus we expect the frequency in the a cell to be equal to:

$$a = \frac{R_1 C_1}{n}$$

Similarly we find the following expected frequencies for the cells making up the 2 X 2 table:

$$a = \frac{R_1 C_1}{n}, \qquad b = \frac{R_1 C_2}{n}, \qquad c = \frac{R_2 C_1}{n}, \qquad d = \frac{R_2 C_2}{n}$$


Independence

In general we denote the observed frequency in the ith row and jth column as $O_{ij}$, or just O for short.

We denote the expected frequency for the ith row and jth column as

$$E_{ij} = \frac{R_i C_j}{n}$$

or just E for short, where $R_i$ is the row total for row i and $C_j$ is the column total for column j.


Independence

o To measure how different our observed results are from what we expected to see if the two variables in question were independent, we intuitively should look at the difference between the observed (O) and expected (E) frequencies, i.e. O – E, or more specifically $O_{ij} - E_{ij}$.

o However, this will give too much weight to differences where these frequencies are both large in size.


Independence

o One test statistic that addresses the “size” of the frequencies issue is Pearson’s Chi-Square:

$$\chi^2 = \sum_{\text{all cells}} \frac{(O - E)^2}{E} = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} \;\sim\; \text{chi-squared distribution with } df = (r-1) \times (c-1)$$

Notice this test statistic still uses (O – E) as the basic building block. This statistic will be large when the observed frequencies do NOT match the expected values for independence.

Chi-square Distribution

[Figure: graph of the chi-square distribution with 4 degrees of freedom; the shaded area in the right tail is labeled “p-value”.]

The area to the right of Pearson’s chi-square statistic gives the p-value. The p-value is always the area to the right!
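The right-tail area can be computed without statistical software. The sketch below is one possible implementation, not the method used in the notes: the function name `chi2_right_tail` is hypothetical, and it relies on the standard recurrence that steps the chi-square tail probability up two degrees of freedom at a time. The value 9.488 is used only for illustration; it is the familiar tabled 5% critical value for 4 degrees of freedom.

```python
import math

def chi2_right_tail(x, df):
    """Area to the right of x under a chi-square curve with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Base cases: df = 1 uses the normal tail, df = 2 is a plain exponential.
    if df % 2 == 1:
        tail, k = math.erfc(math.sqrt(half)), 1
    else:
        tail, k = math.exp(-half), 2
    # Step up two degrees of freedom at a time:
    # Q(k + 2) = Q(k) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1)
    while k < df:
        tail += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return tail

# Area to the right of 9.488 for the 4-df curve shown in the figure.
print(chi2_right_tail(9.488, 4))
```

The printed area should be close to 0.05, since 9.488 is the 5% critical value for that curve.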




Disease Status            Age ≤ 25     Age > 25     Row Totals
Cervical Cancer (Case)    O11 = 42     O12 = 7      R1 = 49
Healthy (Control)         O21 = 203    O22 = 114    R2 = 317
Column Totals             C1 = 245     C2 = 121     n = 366

Calculating Expected Frequencies

Cervical Cancer and Age at 1st Pregnancy (expected frequencies in parentheses)

Disease Status            Age at 1st Pregnancy    Age at 1st Pregnancy    Row Totals
                          Age ≤ 25                Age > 25
Cervical Cancer (Case)    42 (32.80)              7 (16.20)               R1 = 49
Healthy (Control)         203 (212.20)            114 (104.80)            R2 = 317
Column Totals             C1 = 245                C2 = 121                n = 366

$E_{ij}$ = expected frequency for the ith row and jth column cell

$$E_{11} = \frac{R_1 C_1}{n} = \frac{49 \times 245}{366} = 32.80 \qquad E_{12} = \frac{R_1 C_2}{n} = \frac{49 \times 121}{366} = 16.20$$

$$E_{21} = \frac{R_2 C_1}{n} = \frac{317 \times 245}{366} = 212.20 \qquad E_{22} = \frac{R_2 C_2}{n} = \frac{317 \times 121}{366} = 104.80$$
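The expected frequencies can be reproduced in a few lines. This is a minimal Python sketch of the rule "row total times column total divided by n"; the function name `expected_counts` is an illustrative choice, not something defined in the notes.

```python
def expected_counts(observed):
    """Expected frequencies E_ij = R_i * C_j / n under independence."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

# Rows: cases, controls. Columns: age <= 25, age > 25.
cervical = [[42, 7], [203, 114]]
for row in expected_counts(cervical):
    print([round(e, 2) for e in row])
```

The same function works unchanged for any r X c table, because it only uses the row and column totals.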

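Calculating the Pearson Chi-square

$$\chi^2 = \sum_{\text{all cells}} \frac{(O - E)^2}{E} = \frac{(42 - 32.8)^2}{32.8} + \frac{(7 - 16.20)^2}{16.20} + \frac{(203 - 212.20)^2}{212.20} + \frac{(114 - 104.80)^2}{104.80}$$

$$= 2.58 + 5.22 + .40 + .81 = 9.011$$

$$df = (2 - 1) \times (2 - 1) = 1, \qquad p\text{-value} = .0027$$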
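The hand calculation can be verified with a short script. The sketch below assumes nothing beyond the observed table; the function name `pearson_chi_square` is hypothetical, and the p-value line uses the fact that a chi-square variable with 1 degree of freedom is a squared standard normal, so its right-tail area can be written with the complementary error function.

```python
import math

def pearson_chi_square(observed):
    """Return (statistic, df) for an r x c table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n   # expected count under independence
            stat += (o - e) ** 2 / e
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df

stat, df = pearson_chi_square([[42, 7], [203, 114]])
# Valid for df = 1 only: the right-tail area equals erfc(sqrt(stat / 2)).
p_value = math.erfc(math.sqrt(stat / 2))
print(round(stat, 3), df, round(p_value, 4))
```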

http://www.stat.tamu.edu/~west/applets/chisqdemo.html

Chi-square Probability Calculator in JMP

Enter the test statistic value and df and the p-value is automatically calculated.

p-value = P(χ² > observed test statistic)

2 X 2 Example: Case-Control Study

Cervical Cancer and Age at 1st Pregnancy

Conclusion:
We have strong evidence to suggest that age at first pregnancy and cervical cancer status are NOT independent, and that they are associated or related (p = .0027). In particular we found that the proportion of women having their first pregnancy at or before the age of 25 was higher amongst women with cervical cancer than for those without.

Other things we could do…

o Odds Ratio (OR) and CI for OR
  - case-control study means no RR.

o Fisher’s Exact Test
  - Pearson’s chi-square is an approximation that requires “large” sample sizes
    * typically we would like all Eij > 5
    * or at least 80% of cells should have Eij > 5
    * thus the approximation should be good here as both of these conditions are met for this study.
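The two sample-size guidelines are easy to check mechanically. The following Python sketch reports the smallest expected count and the share of cells whose expected count exceeds 5; the helper name `expected_count_check` and its `threshold` parameter are illustrative assumptions, not part of the notes.

```python
def expected_count_check(observed, threshold=5):
    """Return (smallest expected count, share of cells with expected count > threshold)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    expected = [r * c / n for r in row_totals for c in col_totals]
    share = sum(e > threshold for e in expected) / len(expected)
    return min(expected), share

smallest, share = expected_count_check([[42, 7], [203, 114]])
print(f"smallest expected count = {smallest:.2f}, share above 5 = {share:.0%}")
```

If the smallest expected count falls below the threshold, Fisher’s Exact Test is the safer choice for a 2 X 2 table.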

Example 2: Response to Treatment and Histological Type of Hodgkin’s Disease

In this study a random sample of 538 patients diagnosed with some form of Hodgkin’s Disease was taken and the histological type: nodular sclerosis (NS), mixed cellularity (MC), lymphocyte predominance (LP), or lymphocyte depletion (LD) was recorded along with the outcome from standard treatment, which was recorded as being none, partial, or complete remission.

Q: Is there an association between type of Hodgkin’s and response to treatment? If so, what is the nature of the relationship?

Example 2: Response to Treatment and Histological Type of Hodgkin’s Disease

Type             None    Partial    Positive    Row Totals
LD               44      10         18          72
LP               12      18         74          104
MC               58      54         154         266
NS               12      16         68          96
Column Totals    126     98         314         n = 538

Some Probabilities of Potential Interest

Probability of Positive Response to Treatment

P(positive) = 314/538 = .5836

Probability of Positive Response to Treatment Given Disease Type

P(positive|LD) = 18/72 = .2500

P(positive|LP) = 74/104 = .7115

P(positive|MC) = 154/266 = .5789

P(positive|NS) = 68/96 = .7083

Notice the conditional probabilities are not equal to the unconditional!!!

Mosaic plot of the results

[Figure: mosaic plot, Response to Treatment vs. Histological Type]

Clearly we see that LP and NS respond most favorably to treatment, with over 70% of those sampled experiencing complete remission, whereas lymphocyte depletion has a majority (61.1%) of patients having no response to treatment.

A statistical test at this point seems unnecessary, as it seems clear that there is an association between the type of Hodgkin’s disease and the response to treatment; nonetheless we will proceed…
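The percentages read off the mosaic plot are simply row percentages of the observed table. A minimal Python sketch, with an illustrative dictionary layout and function name that are not part of the notes:

```python
# Rows: histological type. Columns: none, partial, positive response.
hodgkins = {
    "LD": [44, 10, 18],
    "LP": [12, 18, 74],
    "MC": [58, 54, 154],
    "NS": [12, 16, 68],
}

def row_percentages(table):
    """Convert each row of counts to percentages of that row's total."""
    return {
        label: [100 * count / sum(counts) for count in counts]
        for label, counts in table.items()
    }

for label, pcts in row_percentages(hodgkins).items():
    none, partial, positive = pcts
    print(f"{label}: none {none:.1f}%  partial {partial:.1f}%  positive {positive:.1f}%")
```

Each printed row corresponds to one vertical bar of the mosaic plot, so differing rows are the visual signal of association.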


Observed frequencies, with expected frequencies in parentheses

Type             None          Partial       Positive        Row Totals
LD               44 (16.86)    10 (13.11)    18 (42.02)      72
LP               12 (24.36)    18 (18.94)    74 (60.69)      104
MC               58 (62.30)    54 (48.45)    154 (155.25)    266
NS               12 (22.48)    16 (17.49)    68 (56.03)      96
Column Totals    126           98            314             n = 538

$$E_{11} = \frac{R_1 C_1}{n} = \frac{72 \times 126}{538} = 16.86 \qquad E_{12} = \frac{R_1 C_2}{n} = \frac{72 \times 98}{538} = 13.11 \qquad E_{13} = \frac{R_1 C_3}{n} = \frac{72 \times 314}{538} = 42.02$$

$$\ldots \qquad E_{43} = \frac{R_4 C_3}{n} = \frac{96 \times 314}{538} = 56.03$$


$$\chi^2 = \sum_{\text{all cells}} \frac{(O - E)^2}{E} = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} = \frac{(44 - 16.86)^2}{16.86} + \frac{(10 - 13.11)^2}{13.11} + \cdots + \frac{(68 - 56.03)^2}{56.03} = 75.89$$

$$df = 6, \qquad p\text{-value} < .0001$$

We have strong evidence of an association between the type of Hodgkin’s and response to treatment (p < .0001).
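The full 4 X 3 calculation, including the cells elided above, can be checked with a short script. This is a sketch under the assumption that the table is entered row by row as shown; the p-value line uses the closed form for the chi-square right tail that holds when df is even, which applies here because df = 6.

```python
import math

observed = [
    [44, 10, 18],    # LD
    [12, 18, 74],    # LP
    [58, 54, 154],   # MC
    [12, 16, 68],    # NS
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Sum (O - E)^2 / E over all cells, with E = row total * column total / n.
chi_square = sum(
    (o - r * c / n) ** 2 / (r * c / n)
    for row, r in zip(observed, row_totals)
    for o, c in zip(row, col_totals)
)
df = (len(row_totals) - 1) * (len(col_totals) - 1)

# For even df the right-tail area has a closed form:
# exp(-x/2) * sum over k = 0 .. df/2 - 1 of (x/2)^k / k!
half = chi_square / 2
p_value = math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))

print(f"chi-square = {chi_square:.2f}, df = {df}, p-value = {p_value:.2e}")
```

The p-value is far below .0001, in line with the conclusion stated in the notes.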

Measures of Association Between Two Categorical Variables

Phi coefficient (φ). This can be applied to the cervical cancer case-control study.

$$\phi = \sqrt{\frac{\chi^2}{n}} = \sqrt{\frac{9.011}{366}} = .157$$


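Measures of Association Between Two Categorical Variables

Contingency coefficient (C). This can be used for general r x c tables. This can be used for the Hodgkin’s example:

$$C = \sqrt{\frac{\chi^2}{n + \chi^2}} = \sqrt{\frac{75.89}{538 + 75.89}} = .3516$$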


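Measures of Association Between Two Categorical Variables

Cramér’s V. For the Hodgkin’s study

$$V = \sqrt{\frac{\chi^2}{n \cdot \min(r - 1,\, c - 1)}} = \sqrt{\frac{75.89}{538\,(3 - 1)}} = .266$$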
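All three measures are simple functions of the chi-square statistic and the sample size. The sketch below uses the standard textbook definitions; the function names are illustrative, and the min(r − 1, c − 1) form of V is assumed because it matches the (3 − 1) that appears in the Hodgkin’s calculation.

```python
import math

def phi_coefficient(chi_square, n):
    """Phi = sqrt(chi-square / n)."""
    return math.sqrt(chi_square / n)

def contingency_coefficient(chi_square, n):
    """C = sqrt(chi-square / (n + chi-square))."""
    return math.sqrt(chi_square / (n + chi_square))

def cramers_v(chi_square, n, n_rows, n_cols):
    """V = sqrt(chi-square / (n * min(r - 1, c - 1)))."""
    return math.sqrt(chi_square / (n * min(n_rows - 1, n_cols - 1)))

print(round(phi_coefficient(9.011, 366), 3))          # cervical cancer 2 X 2 table
print(round(contingency_coefficient(75.89, 538), 4))  # Hodgkin's 4 X 3 table
print(round(cramers_v(75.89, 538, 4, 3), 3))          # Hodgkin's 4 X 3 table
```

For a 2 X 2 table min(r − 1, c − 1) = 1, so Cramér’s V reduces to Phi.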


Measures of Association Between Two Categorical Variables

There are lots of other measures of association. When both variables are nominal the previous measures are fine, and there are certainly many more. For cases where both variables are ordinal, common measures include Kendall’s tau and Somers’ D.

In some cases we wish to measure the degree of exact agreement between two nominal or ordinal variables measured using the same levels or scales, in which case we generally use Cohen’s Kappa (κ).


Measures of Association Between Two Categorical Variables

Cohen’s Kappa (κ) – measures the degree of agreement between two variables on the same scales.

Example 3: Medicare Study – General health at baseline and 2-yr. follow-up, how well do they agree?

κ > …          excellent agreement
… ≤ κ ≤ …      good agreement
0 < κ < …      marginal agreement

There is fairly good agreement between the general assessment of overall health at baseline and at follow-up. However, there appears to be some general trend for improvement as well.
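The notes do not show how κ is computed, so the following Python sketch uses the standard definition, κ = (observed agreement − chance agreement) / (1 − chance agreement). The counts are HYPOTHETICAL and chosen only for illustration; they are not the Medicare data, which are not reproduced in these notes.

```python
def cohens_kappa(table):
    """Kappa = (p_observed - p_expected) / (1 - p_expected) for a square table."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    # Observed agreement: share of subjects on the diagonal.
    p_observed = sum(table[i][i] for i in range(len(table))) / n
    # Chance agreement: what the margins alone would produce.
    p_expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    return (p_observed - p_expected) / (1 - p_expected)

# HYPOTHETICAL counts, not the Medicare data: rows = baseline rating,
# columns = follow-up rating, both on the same 3-level scale.
made_up = [
    [20, 5, 0],
    [4, 30, 6],
    [1, 4, 30],
]
print(round(cohens_kappa(made_up), 3))
```

A table with every subject on the diagonal gives κ = 1, and a table whose diagonal is no fuller than chance predicts gives κ near 0.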

Testing for Lack of Symmetry

o Bowker’s Test of Symmetry is a generalization of McNemar’s Test to r x r tables where the row and column variables are on the same scale.

o The general health of the subjects in the Medicare study is an example of where this test could be used, as both the health at baseline and at follow-up are recorded using the same 5-point ordinal scale.

Bowker’s Test of Symmetry

$$\chi^2_B = \sum_{i < j} \frac{(O_{ij} - O_{ji})^2}{O_{ij} + O_{ji}} \;\sim\; \text{chi-squared distribution with } df = \frac{r(r-1)}{2}$$

X (rows) by Y (columns)

X \ Y            1      2      …     r      Row Totals
1                O11    O12    …     O1r
2                O21    O22    …     O2r
…                …      …      …     …
r                Or1    Or2    …     Orr
Column Totals

The test looks for the frequencies to be generally larger on one side of the diagonal than the other.



When will this test statistic be “large”? If there was a general trend or tendency for X > Y or for X < Y, then we would expect the off-diagonal cells of the table to be larger on one side than the other. For example, if Y tended to be larger than X, perhaps indicating an improvement in health, then we expect the frequencies above the diagonal to be larger than those below.
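The behavior described above can be seen in a small computation. This Python sketch implements the Bowker statistic as a sum over the pairs of cells mirrored across the diagonal; the function name is illustrative, the counts are HYPOTHETICAL (not the Medicare data), and skipping pairs that are both zero while keeping df = r(r − 1)/2 is a simplifying assumption.

```python
def bowker_statistic(table):
    """Return (statistic, df) for Bowker's test on a square r x r table."""
    r = len(table)
    stat = 0.0
    for i in range(r):
        for j in range(i + 1, r):
            above, below = table[i][j], table[j][i]
            if above + below > 0:   # skip empty pairs to avoid dividing by zero
                stat += (above - below) ** 2 / (above + below)
    return stat, r * (r - 1) // 2

# HYPOTHETICAL counts: rows = X (baseline), columns = Y (follow-up).
# The cells above the diagonal are deliberately larger than those below.
made_up = [
    [20, 10, 5],
    [4, 30, 12],
    [1, 3, 30],
]
stat, df = bowker_statistic(made_up)
print(round(stat, 3), df)
```

A perfectly symmetric table gives a statistic of 0, and the statistic grows as the counts on one side of the diagonal pull away from their mirror images.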


Symmetry of Disagreement

Bowker’s test suggests the differences are asymmetric (p < .0001). Examining the percentages suggests a majority of patients either stayed the same or improved in each group based on baseline score.

Therefore it is reasonable to state that we have evidence that, in general, subjects’ health stayed the same or, if it did change, it was generally for the better (p < .0001).

Other Approaches

o Wilcoxon Signed-Rank Test for the paired differences in the ordinal health score (p < .0001).

o Direct examination of the distribution of the changes in general health score (Follow-up – Baseline).

There is a slight advantage for improvement vs. decline in health.

The plot on the right shows the change in general health vs. baseline health. With the exception of those with the lowest health at baseline a majority (50%+) of patients stayed the same. The shading for improvement is larger than the shading for health decline.

Other Tests for Categorical Data

o Chi-square Test for Trend in Binomial Proportions – tests whether or not p1 < p2 < p3 < … < pk, where 1, 2, …, k are levels of an ordinal variable, i.e. a 2 X k table.

o Chi-square Goodness-of-Fit Tests – used to test whether observations come from some hypothesized distribution.

o Cochran-Mantel-Haenszel Test – looks at whether or not there is a relationship in a 2 X 2 table situation adjusting for the level of a third factor. For example, is there a relationship between heavy drinking (Y or N) and lung cancer (Y or N) adjusting for smoking status?
