2. Two-way contingency tables

2.1 Probability structure for contingency tables

Setup: Let X be a categorical variable with i=1,…,I levels. Let Y be a categorical variable with j=1,…,J levels. There are IJ different possible combinations of X and Y together. Frequency counts of these combinations can be summarized in an I×J "contingency table". These are often called "two-way" tables since there are two variables of interest.

Example: Larry Bird (data source: Wardrop, American Statistician, 1995)

Free throws are typically shot in pairs. Below is a contingency table summarizing Larry Bird’s first and second free throw attempts during the 1980-1 and 1981-2 NBA seasons. Let X=First attempt and Y=Second attempt.

                    Second
              Made   Missed   Total
First   Made   251       34     285
      Missed    48        5      53
       Total   299       39     338

Interpreting the table:
- 251 first and second free throw attempts were both made
- 34 first free throw attempts were made and the second attempts were missed
- 48 first free throw attempts were missed and the second attempts were made
- 5 first and second free throw attempts were both missed
- 285 first free throws were made regardless of what happened on the second attempt
- 299 second free throws were made regardless of what happened on the first attempt
- 338 free throw pairs were shot during these seasons

What types of questions would be of interest for this data?

Example: Field goals

Below is a two-way table summarizing field goals from the 1995 NFL season (Bilder and Loughin, Chance, 1998). The data can be considered a representative sample from the population. The two categorical variables in the table are stadium type (dome or outdoors) and field goal result (success or failure).

                      Field goal result
                   Success   Failure   Total
Stadium      Dome      335        52     387
type     Outdoors      927       111    1038
            Total     1262       163    1425

What types of questions would be of interest for this data?

Example: Salk vaccine clinical trials

From p. 186 (see also p. 192) of the S-Plus 6 Guide to Statistics, Volume I: In the Salk vaccine trials, two large groups were involved in the placebo-control phase of the study. The first group, which received the vaccination, consisted of 200,745 individuals. The second group, which received a placebo, consisted of 201,229 individuals. There were 57 cases of polio in the first group and 142 cases of polio in the second group.

            Polio   Polio free     Total
Vaccine        57      200,688   200,745
Placebo       142      201,087   201,229
Total         199      401,775   401,974

What types of questions would be of interest for this data?

Contingency tables do not have to be 2×2!

Example: #7.24

Subjects were asked whether methods of birth control should be available to teenagers between the ages of 14 and 16.

                                     Teenage birth control
                               strongly                      strongly
                                  agree    agree   disagree  disagree
Religious   Never                    49       49         19         9
attendance  <1 per year              31       27         11        11
            1-2 per year             46       55         25         8
            several times per year   34       37         19         7
            1 per month              21       22         14        16
            2-3 per month            26       36         16        16
            nearly every week         8       16         15        11
            every week               32       65         57        61
            several times per week    4       17         16        20

Notice the “total” column and row are not necessary to include with a contingency table. Also, notice that both categorical variables are ordered.


What types of questions would be of interest for this data?

In the previous examples, subjects were allowed to fall in only one cell of the contingency table. There are times when subjects may fall in more than one cell!

Example: Education and SOV

Loughin and Scherer (Biometrics, 1998) examine a sample of 262 Kansas livestock farmers who are asked, "What are your primary sources of veterinary information?" Farmers may pick as many sources as apply from (A) professional consultant, (B) veterinarian, (C) state or local extension service, (D) magazines, and (E) feed companies and representatives. Since respondents may pick any number out of the possible categorical responses, Coombs (1964) refers to this type of variable as a "pick any/c" variable ("pick any/c" is read as "pick any out of c", and c is the number of categorical responses). Farmers are also asked many demographic questions, including their highest attained level of education. Note that individual farmers may be represented more than once in the table since they may pick all sources that apply.

                                 Information source      Total       Total
                              A     B     C     D     E  responses  farmers
Education  High school       19    38    29    47    40        173       88
           Vocational school  2     6     8     8     4         28       16
           2-year college     1    13    10    17    14         55       31
           4-year college    19    29    40    53    29        170      113
           Other              3     4     8     6     6         27       14
           Total             44    90    95   131    93        453      262

Higgins (An Introduction to Modern Nonparametric Statistics, 2003) also discusses data in this format. The data is given in a multinomial format in Agresti (2002, p. 484-6).

What types of questions would be of interest for this data?

Notes:
- Unless otherwise mentioned, all of the contingency tables in this course will have subjects (or items) that fall in only one cell.
- There are many other examples of contingency tables from marketing, psychology, …
- The contingency tables presented here are called "two-way" since there are only two categorical variables. Later, we will discuss "three-way" contingency tables when there are three categorical variables. Future chapters will discuss four-way, five-way, … tables.

Probability distributions for contingency tables

Let πij = P(X=i, Y=j); i.e., the probability that category i of X and category j of Y is chosen. These probabilities can be put into a contingency table format. If I=2 and J=2, then the following table is produced:

        Y
         1     2
X  1    π11   π12
   2    π21   π22

Notes:
- π11, π12, π21, and π22 form the "joint" probability distribution for X and Y (joint since there are two random variables).
- Notice the row number goes first in the subscript for π and the column number goes second.
- π11+π12+π21+π22 = 1; thus, every item falls in one of the cells.

Suppose that only the probability distribution for Y is examined. This is called the "marginal" probability distribution for Y. It is denoted by

P(Y=1) = π+1, P(Y=2) = π+2, and π+1+π+2 = 1

The "+" in the subscript denotes that all possible values of X are being summed over. Thus,

π+1 = π11 + π21 and π+2 = π12 + π22

Equivalently, π+1 = P(Y=1) = P(Y=1, X=1) + P(Y=1, X=2). The marginal distribution of X, π1+ and π2+, can be found in a similar manner. You will often see a "•" instead of a "+" used in exactly the same way in other textbooks.

The contingency table of the probabilities can be extended to include the marginal distributions of Y and X. Notice how the "marginal" probability distribution is put in the "margins" of the table.

        Y
         1     2
X  1    π11   π12   π1+
   2    π21   π22   π2+
        π+1   π+2     1

Each of these πij's is a population parameter. These parameters can be estimated by taking a sample. Counts from the sample are summarized in a contingency table, as shown below in a general format.

        Y
         1     2
X  1    n11   n12   n1+
   2    n21   n22   n2+
        n+1   n+2     n

Thus, n11 denotes the table count for X=1 and Y=1. Also, n1+ = n11 + n12 denotes the table count for X=1 without regard to Y. Finally, n = n11+n12+n21+n22 is the total sample size. This could also be denoted by n++.

Using the contingency table counts, the parameter estimates are found using pij = nij/n, pi+ = ni+/n, and p+j = n+j/n. Note that π̂ij could also be used as notation, but Agresti prefers to use a "p". The resulting contingency table with the "sample proportions" (or "sample probabilities" or "estimated probabilities"…) is:

        Y
         1     2
X  1    p11   p12   p1+
   2    p21   p22   p2+
        p+1   p+2     1
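As a quick preview of the R work shown later in this section, here is a minimal sketch (with hypothetical 2×2 counts, not from an example in these notes) of how the sample proportions and their margins can be obtained:

# Hypothetical 2x2 counts entered as a matrix (by columns)
n.tab <- matrix(data = c(20, 10, 5, 15), nrow = 2, ncol = 2)
p.tab <- n.tab/sum(n.tab)   # pij = nij/n, computed elementwise
rowSums(p.tab)              # pi+ (marginal distribution of X)
colSums(p.tab)              # p+j (marginal distribution of Y)
addmargins(p.tab)           # table with both sets of margins attached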

2×2 contingency tables can be extended to I×J tables as shown below:

        Y
         1     2    …     J
X  1    π11   π12   …    π1J   π1+
   2    π21   π22   …    π2J   π2+
   ⋮
   I    πI1   πI2   …    πIJ   πI+
        π+1   π+2   …    π+J     1

where πi+ = Σj πij for i=1,…,I and π+j = Σi πij for j=1,…,J

        Y
         1     2    …     J
X  1    n11   n12   …    n1J   n1+
   2    n21   n22   …    n2J   n2+
   ⋮
   I    nI1   nI2   …    nIJ   nI+
        n+1   n+2   …    n+J     n

where ni+ = Σj nij for i=1,…,I and n+j = Σi nij for j=1,…,J

        Y
         1     2    …     J
X  1    p11   p12   …    p1J   p1+
   2    p21   p22   …    p2J   p2+
   ⋮
   I    pI1   pI2   …    pIJ   pI+
        p+1   p+2   …    p+J     1

where pi+ = Σj pij for i=1,…,I and p+j = Σi pij for j=1,…,J

The contingency table could also be written in terms of the expected cell counts, μij, where μij is simply E(nij). Note that the unrestricted estimate is μ̂ij = nij.

Example: Larry Bird (bird.R)

             Second
              Made      Missed      Total
First  Made   n11=251   n12=34     n1+=285
     Missed   n21=48    n22=5      n2+=53
      Total   n+1=299   n+2=39     n=338

             Second
              Made         Missed        Total
First  Made   p11=0.7426   p12=0.1006   p1+=0.8432
     Missed   p21=0.1420   p22=0.0148   p2+=0.1568
      Total   p+1=0.8846   p+2=0.1154            1

For example, p11 = 251/338 = 0.7426 and p1+ = 285/338 = 0.8432.

Make sure you can interpret the probabilities in the table!

How are the contingency tables entered into R?

Below is the code for one method.

> #Create contingency table - notice the data is entered by
> # columns
> n.table <- array(data = c(251, 48, 34, 5), dim = c(2, 2),
    dimnames = list(First = c("made", "missed"),
                    Second = c("made", "missed")))
> n.table
        Second
First    made missed
  made    251     34
  missed   48      5

> n.table[1,1]
[1] 251

> #Find the estimated proportions - note the division is
> # performed on each element
> p.table <- n.table/sum(n.table)
> p.table
        Second
First         made    missed
  made   0.7426036 0.1005917
  missed 0.1420118 0.0147929

Note the order used in the array() call: the rows (First) are listed first in dimnames, and dim = c(2, 2) gives the numbers of rows (I) and columns (J).

What if the data did not come in a contingency table format?

Suppose the data is in its “raw” form:

> all.data2
     first second
1   missed missed
2   missed missed
3   missed missed
4   missed missed
5   missed missed
6   missed   made
7   missed   made
8   missed   made
...
336   made   made
337   made   made
338   made   made

The above data is stored in a data.frame (it is constructed in bird.R). To find a contingency table for the data, use the table() or xtabs() functions.

> #Find contingency table two different ways
> bird.table1 <- table(all.data2$first, all.data2$second)
> bird.table1
         made missed
  made    251     34
  missed   48      5

> bird.table1[1, 1]
[1] 251

> bird.table2 <- xtabs(formula = ~ first + second, data = all.data2)
> bird.table2
        second
first    made missed
  made    251     34
  missed   48      5

> bird.table2[1,1]
[1] 251

Note: For those of you with SAS experience, the corresponding output is similar to the output produced from PROC FREQ in SAS.

Conditional probability distributions

Often when one categorical variable is considered a “response” or “dependent” variable and another categorical variable is considered an “explanatory” or “independent” variable, we would like to look at the probability distribution for the response variable GIVEN the level of the explanatory variable. These can be examined through conditional probability distributions.

From STAT 218:

Suppose two events are denoted by A and B. The conditional probability of A given B is

P(A|B) = P(A and B)/P(B), provided P(B) > 0

For example, A = Bird's 2nd free throw attempt outcome and B = Bird's 1st free throw attempt outcome.

For STAT 875, we can define conditional probabilities the following way.

Suppose Y (columns) is the response variable and X (rows) is the explanatory variable. Let πj|i = P(Y=j | X=i).

Note that πj|i = πij/πi+ = P(X=i and Y=j) / P(X=i).

The conditional probability distribution for row i has probabilities π1|i, π2|i, …, πJ|i, and Σj πj|i = 1.

Thus, one can think of each row of the contingency table as one conditional probability distribution.

Estimators for the conditional probabilities are pj|i = pij/pi+ = (nij/n) / (ni+/n) = nij/ni+.

Example: Larry Bird

             Second
              Made         Missed        Total
First  Made   p11=0.7426   p12=0.1006   p1+=0.8432
     Missed   p21=0.1420   p22=0.0148   p2+=0.1568
      Total   p+1=0.8846   p+2=0.1154            1

Given that Larry Bird misses the first free throw, what is the estimated probability that he will make the second?

P(2nd made | 1st missed) = π1|2

Be careful with the notation for this problem!

The corresponding estimator is p1|2 = p21/p2+ = 0.1420/0.1568 = 0.9057. You can also find this using p1|2 = n21/n2+ = 48/53. Be careful with making sure you know which variable level is represented first and which variable level is represented second in the subscript notation for p1|2.

Therefore it is still very likely that Larry Bird will make the second free throw even if the first one is missed.

Question for basketball fans: Why would this probability be important to know?

If the first free throw result is thought of as an explanatory variable and the second free throw result is thought of as a response variable, we can find the following table of conditional probabilities:

             Second
              Made           Missed        Total
First  Made   p1|1=0.8807    p2|1=0.1193       1
     Missed   p1|2=0.9057    p2|2=0.0943       1

Notice the estimated probability of making the second free throw is larger after (given) the first free throw is missed!
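In R, this entire table of estimated conditional probabilities can be obtained from the earlier n.table in one call (a minimal sketch; prop.table() is a base R function):

> round(prop.table(n.table, margin = 1), 4)  #margin = 1 conditions on rows
        Second
First      made missed
  made   0.8807 0.1193
  missed 0.9057 0.0943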

Independence

Suppose Y is a response variable and X is an explanatory variable. Also, suppose Y is independent of X. What is πj|i equal to?

Remember that πj|i = P(Y=j | X=i). Independence means that the probability of Y=j does not depend on the level of X. Therefore, the probability is the same for all levels of X; i.e.,

P(Y=j | X=i) = P(Y=j) for i=1,…,I and j=1,…,J
⇔ πj|i = π+j for i=1,…,I and j=1,…,J

This can be rewritten as

πj|1 = πj|2 = … = πj|I for j=1,…,J

Thus, there is equality across rows for the conditional probability distributions. When both categorical variables can be thought of as response variables, independence can be written without the use of conditional probability distributions. Statistical independence occurs if

πij = πi+π+j for i=1,…,I and j=1,…,J.

Thus, πij is equal to the product of the corresponding marginal probabilities.

The equivalence of the two ways to write independence can be shown as follows:

πij = πi+π+j for i=1,…,I and j=1,…,J
⇔ πij/πi+ = πi+π+j/πi+ for i=1,…,I and j=1,…,J
⇔ πj|i = π+j for i=1,…,I and j=1,…,J

Example: Larry Bird

What does independence mean in this example? (It would mean the outcome of the second free throw does not depend on the outcome of the first.)

Do you think independence occurs? (The estimated P(2nd made | 1st missed) = 0.9057 and the estimated P(2nd made) = 0.8846 are close to each other; of course, we still need a hypothesis test for the population probabilities. What do the population probabilities represent here?)
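A quick numerical check in R (a minimal sketch, reusing p.table from earlier): compare the joint sample proportions to the products of the estimated marginal probabilities, which is what the joint probabilities would equal under independence.

> p.table                                    #estimated joint probabilities
> outer(rowSums(p.table), colSums(p.table))  #pi+ * p+j

For example, p11 = 0.7426 versus p1+p+1 = 0.8432×0.8846 = 0.7459, which is informally consistent with independence.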

Poisson, binomial, and multinomial sampling


How do counts in a contingency table come about with respect to probability distributions? There are 4 ways where 3 are discussed here:

1) We can often treat each cell of an I×J contingency table as independent Poisson random variables; i.e., nij ~ ind. Poisson(μij). Thus,

P(nij) = e^(-μij) μij^(nij) / nij!  for nij = 0, 1, 2, …

When we use this distribution, we have Poisson sampling. The total sample size, n, is NOT fixed.

2) When n is fixed (or conditioned on), multinomial sampling occurs over all of the cells of the contingency table; i.e., (n11, n12, …, nIJ) ~ Multinomial(n, π11, π12, …, πIJ). A random sample of size n from one multinomial distribution is taken and summarized by the sample counts in the cells of the table.

Note that (n11, n12, …, nIJ) ~ Multinomial(n, π11, π12, …, πIJ) could also be expressed with the last cell omitted, since nIJ = n - Σ(i,j)≠(I,J) nij and πIJ = 1 - Σ(i,j)≠(I,J) πij.


3) Sometimes n1+, n2+,…, nI+ are fixed by the sampling design. For example in a clinical trial, there may be only 10 people available for the placebo group and 9 people available for the drug group. Also, suppose there are only two possible outcomes for the trial – cured and not-cured. In this case, we have binomial sampling within each row of the contingency table. This is often called “independent” binomial sampling since random variables are independent across the rows.

When more than two outcomes are possible, say cured, partially cured, and not cured, then “independent multinomial sampling” occurs within each row of the contingency table.

(n11, n12, …, n1J) ~ Multinomial(n1+, π1|1, π2|1, …, πJ|1),
(n21, n22, …, n2J) ~ Multinomial(n2+, π1|2, π2|2, …, πJ|2),
…,
(nI1, nI2, …, nIJ) ~ Multinomial(nI+, π1|I, π2|I, …, πJ|I)

Example: Independent binomial and multinomial sampling and just multinomial sampling.

Suppose n1+=50 males and n2+=60 females are wanted for a study. These males and females are randomly selected from their individual populations. Suppose there are only 2 possible outcomes – cured and not cured. This is an example of independent binomial sampling.

               Y
               Cured   Not cured
X    Male       n11       n12      n1+
     Female     n21       n22      n2+
                n+1       n+2        n

Thus, n11 ~ Binomial(n1+, π1|1) and n21 ~ Binomial(n2+, π1|2), where n11 is independent of n21.

Suppose n1+=50 males and n2+=60 females are wanted for a study. These males and females are randomly selected from their individual populations. Suppose there are now 3 possible outcomes – cured, partially cured, and not cured. This is an example of independent multinomial sampling.

               Y
               Cured   Partially cured   Not cured
X    Male       n11          n12            n13      n1+
     Female     n21          n22            n23      n2+
                n+1          n+2            n+3        n

Thus, (n11, n12, n13) ~ Multinomial(n1+, π1|1, π2|1, π3|1) and (n21, n22, n23) ~ Multinomial(n2+, π1|2, π2|2, π3|2), where the n1j's are independent of the n2j's.

Suppose n=110 subjects are wanted for a study. Males and females are randomly selected from the one population. This is an example of multinomial sampling. The n1+ and n2+ are not fixed for this study.

               Y
               Cured   Partially cured   Not cured
X    Male       n11          n12            n13      n1+
     Female     n21          n22            n23      n2+
                n+1          n+2            n+3        n

Thus, (n11, n12, n13, n21, n22, n23) ~ Multinomial(n, π11, π12, π13, π21, π22, π23).

Instead of Male and Female, we could have drug and placebo groups. Typically, the number of subjects receiving the drug and the number receiving the placebo will be fixed. Thus, independent binomial or multinomial sampling will be used. You can think of this as a completely randomized design used in ANOVA, where you fix the number of people receiving each treatment.


What about Poisson sampling? Perhaps this could occur if the study allowed anyone who volunteered (with no upper limit) to participate in it.

Notes:
- Although Poisson sampling may occur, n or the ni+ are often conditioned upon.
- For the analyses to be examined in this book, we will usually get the same results no matter what types of sampling methods are used.
- You should think about how one can simulate observations in order to form a contingency table; a sketch is given after these notes.
- See p. 40-41 of Agresti (2002) for an additional example.
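Here is a minimal R sketch (with hypothetical parameter values, not from an example in these notes) of how a 2×2 table could be simulated under each of the three sampling models:

# Hypothetical cell probabilities for a 2x2 table (entered by columns)
pi.table <- array(data = c(0.45, 0.15, 0.25, 0.15), dim = c(2, 2))

# 1) Poisson sampling: each cell is an independent Poisson count,
#    so the total sample size is not fixed (hypothetical means)
mu.table <- 100 * pi.table
n.poisson <- array(data = rpois(n = 4, lambda = mu.table), dim = c(2, 2))

# 2) Multinomial sampling: n = 338 fixed, one multinomial over all 4 cells
n.multinomial <- array(data = rmultinom(n = 1, size = 338,
    prob = pi.table), dim = c(2, 2))

# 3) Independent binomial sampling: row totals fixed at 50 and 60
n11 <- rbinom(n = 1, size = 50, prob = 0.45/(0.45 + 0.25))  #pi.1|1
n21 <- rbinom(n = 1, size = 60, prob = 0.15/(0.15 + 0.15))  #pi.1|2
n.binomial <- array(data = c(n11, n21, 50 - n11, 60 - n21), dim = c(2, 2))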


2.2 Comparing proportions in 2×2 contingency tables

Difference of proportions or differences of probabilities

Suppose we have the following 2×2 table

        Y
         1     2
X  1    n11   n12   n1+
   2    n21   n22   n2+
        n+1   n+2     n

where n1+ and n2+ are FIXED. Thus, we have independent binomial sampling. Suppose Y=1 equates to a success and Y=2 equates to a failure.

We can then write the table in terms of the conditional probability distributions.

        Y
         1=success   2=failure
X  1       π1|1         π2|1      1
   2       π1|2         π2|2      1

The sample proportions or probabilities can also be written in this format.

Note that Agresti writes the table as

        Y
         1=success   2=failure
X  1        π1          1-π1      1
   2        π2          1-π2      1

Example: Larry Bird

             Second
              Made           Missed        Total
First  Made   p1|1=0.8807    p2|1=0.1193       1
     Missed   p1|2=0.9057    p2|2=0.0943       1

Often of interest is determining if the probability of success is the same across the two levels of X. If the probabilities are equal, then π1|1-π1|2 = 0. A confidence interval can be found to examine the difference of the proportions (or probabilities).

Remember from Chapter 1 that an estimated proportion, p, can be treated as an approximate normal random variable with mean π and variance π(1-π)/n for a large sample. Using the notation in this chapter, this means that

p1|1 ~ N(π1|1, π1|1(1-π1|1)/n1+) and p1|2 ~ N(π1|2, π1|2(1-π1|2)/n2+), approximately,

for large n1+ and n2+. Note that p1|1 and p1|2 are treated as random variables here, not as the observed values in the last example.

The statistic that estimates π1|1 - π1|2 is p1|1 - p1|2. Its distribution can be approximated by

N(π1|1-π1|2, π1|1(1-π1|1)/n1+ + π1|2(1-π1|2)/n2+)

for large n1+ and n2+.

Note: Var(p1|1 - p1|2) = Var(p1|1) + Var(p1|2) since p1|1 and p1|2 are independent random variables. Some of you may have seen the following: Let X and Y be independent random variables and let a and b be constants. Then Var(aX+bY) = a²Var(X) + b²Var(Y).

Thus, an approximate (1-α)100% confidence interval for π1|1-π1|2 is

Estimator ± (distributional value)(standard deviation of estimator)

p1|1 - p1|2 ± Z(1-α/2) √[p1|1(1-p1|1)/n1+ + p1|2(1-p1|2)/n2+]

Notice how p1|1 and p1|2 replace π1|1 and π1|2 in the standard deviation of the estimator. This is another example of a Wald confidence interval.

Do you remember the problems with the Wald confidence interval in Chapter 1? Similar problems happen here.

Agresti and Caffo (2000) recommend using the "add two successes and two failures" method for an interval of ANY level of confidence.

Let p̃1|1 = (n11+1)/(n1++2) and p̃1|2 = (n21+1)/(n2++2).

The confidence interval is

p̃1|1 - p̃1|2 ± Z(1-α/2) √[p̃1|1(1-p̃1|1)/(n1++2) + p̃1|2(1-p̃1|2)/(n2++2)]

Again, Agresti and Caffo do not change the adjustment for different confidence levels!

Below are two plots from the paper comparing the Agresti and Caffo interval to the Wald interval (similar to p. 1.45). The solid line denotes the Agresti and Caffo interval. The y-axis shows the true confidence level (coverage) of the confidence intervals. The x-axis shows various values of π1|1, where π1|2 is fixed at 0.3.

To find the estimated true confidence level, 10,000 samples are simulated from a binomial probability distribution with π1|2 = 0.3 and 10,000 samples from a binomial probability distribution with π1|1 = the x-axis value. The sample size is given on the bottom of the plot. For each of the 10,000 pairs of samples from binomial #1 and binomial #2, the confidence interval is calculated. The proportion of intervals containing π1|1-0.3 is the estimated "true confidence level". In the plots, p1 represents our π1|1, and p2 represents our π1|2.
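A minimal R sketch of this simulation (with hypothetical choices π1|1 = 0.2 and n1+ = n2+ = 30), estimating the true confidence level of the 95% Wald interval:

# Estimate coverage of the Wald interval for pi.1|1 - pi.1|2
set.seed(8871)
pi.1 <- 0.2; pi.2 <- 0.3; n1 <- 30; n2 <- 30; alpha <- 0.05
p1 <- rbinom(n = 10000, size = n1, prob = pi.1)/n1  #10,000 sample proportions
p2 <- rbinom(n = 10000, size = n2, prob = pi.2)/n2
half <- qnorm(1 - alpha/2) * sqrt(p1*(1-p1)/n1 + p2*(1-p2)/n2)
covered <- (p1 - p2 - half <= pi.1 - pi.2) & (pi.1 - pi.2 <= p1 - p2 + half)
mean(covered)  #estimated true confidence level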


For the plots below, the value of π1|2 was no longer fixed at 0.3.

The Agresti and Caffo interval tends to be much better than the Wald interval.

Note that other confidence intervals can be used. Agresti and Caffo's (2000) objective was to present an interval "better" than the Wald interval which could be used in elementary statistics courses. See Newcombe (Statistics in Medicine, 1998, p. 857-872) for other intervals.

Example: Larry Bird (bird.R)

Find a (1-α)100% confidence interval for π1|1-π1|2; i.e., P(2nd made | 1st made) - P(2nd made | 1st missed).

95% Wald confidence interval: -0.1122 ≤ π1|1 - π1|2 ≤ 0.0623

95% Agresti-Caffo confidence interval: -0.1035 ≤ π1|1 - π1|2 ≤ 0.0778

There is not sufficient evidence to indicate a difference in the proportions. What does this mean in terms of the original problem?

R code and output:

> #Confidence interval for difference of proportions
> alpha <- 0.05
> p.1.1 <- p.table[1, 1]/sum(p.table[1, ])
> p.1.2 <- p.table[2, 1]/sum(p.table[2, ])
> p.1.1
[1] 0.8807018
> p.1.2
[1] 0.9056604

> #Wald
> lower <- p.1.1 - p.1.2 - qnorm(1 - alpha/2) *
    sqrt((p.1.1*(1-p.1.1))/sum(n.table[1,]) +
         (p.1.2*(1-p.1.2))/sum(n.table[2,]))
> upper <- p.1.1 - p.1.2 + qnorm(1 - alpha/2) *
    sqrt((p.1.1*(1-p.1.1))/sum(n.table[1,]) +
         (p.1.2*(1-p.1.2))/sum(n.table[2,]))
> cat("The Wald C.I. is:", round(lower, 4),
    "<= pi.1.1-pi.1.2 <=", round(upper, 4))
The Wald C.I. is: -0.1122 <= pi.1.1-pi.1.2 <= 0.0623

> #Agresti-Caffo
> p.1.1 <- (n.table[1,1]+1)/(sum(n.table[1,])+2)
> p.1.2 <- (n.table[2,1]+1)/(sum(n.table[2,])+2)
> lower <- p.1.1-p.1.2-qnorm(1-alpha/2)*
    sqrt(p.1.1*(1-p.1.1)/(sum(n.table[1,])+2) +
         p.1.2*(1-p.1.2)/(sum(n.table[2,])+2))
> upper <- p.1.1-p.1.2+qnorm(1-alpha/2)*
    sqrt(p.1.1*(1-p.1.1)/(sum(n.table[1,])+2) +
         p.1.2*(1-p.1.2)/(sum(n.table[2,])+2))
> cat("The Agresti-Caffo interval is:", round(lower,4),
    "<= pi.1.1-pi.1.2 <=", round(upper,4))
The Agresti-Caffo interval is: -0.1035 <= pi.1.1-pi.1.2 <= 0.0778

Agresti provides code for these and a few other intervals for the difference of two proportions and other measures at www.stat.ufl.edu/~aa/cda/R/two_sample/R2/index.html

Relative risk

Suppose there is independent binomial sampling.

The ratio of two probabilities may be more meaningful than their difference when the proportions are close to 0 or 1 rather than 0.5. Consider two cases examining the probabilities of people who experience adverse reactions to a drug (row 1) or a placebo (row 2):

              Adverse reactions
               Yes           No          Total
Drug        π1|1=0.510    π2|1=0.490        1
Placebo     π1|2=0.501    π2|2=0.499        1

π1|1 - π1|2 = 0.510 - 0.501 = 0.009

              Adverse reactions
               Yes           No          Total
Drug        π1|1=0.010    π2|1=0.990        1
Placebo     π1|2=0.001    π2|2=0.999        1

π1|1 - π1|2 = 0.010 - 0.001 = 0.009

In both cases, the difference in proportions is the same. However, in the second case, it is 10 times more likely to experience an adverse reaction by taking the drug!

The relative risk is the ratio of two probabilities. In the above example (2nd case), it is π1|1/π1|2 = 0.010/0.001 = 10.

Consider the table below.

        Y
         1=success   2=failure
X  1       π1|1         π2|1      1
   2       π1|2         π2|2      1

General interpretation: A Y=1 (success) is π1|1/π1|2 times more likely when X=1 rather than when X=2. Typically, it is easier to interpret this quantity when the relative risk is greater than 1. Thus, you may want to invert the ratio. Of course, "invert" your interpretation as well!!!

The sample version of the relative risk is the ratio of two sample conditional probabilities, p1|1/p1|2.

Questions: What does a relative risk of 1 mean? What is the range of the relative risk?

One version of an approximate (1-α)100% confidence interval is

exp[ log(p1|1/p1|2) ± Z(1-α/2) √( (1-p1|1)/(n1+ p1|1) + (1-p1|2)/(n2+ p1|2) ) ]

for large n1+ and n2+ (see #2.15). This is a Wald confidence interval. The estimated standard deviation used in the formula is derived using the "delta method" (see Chapter 14 of Agresti (2002) for a nice introduction).


Example: Larry Bird (bird.R)

             Second
              Made           Missed        Total
First  Made   p1|1=0.8807    p2|1=0.1193       1
     Missed   p1|2=0.9057    p2|2=0.0943       1

p1|1/p1|2 = 0.8807/0.9057 = 0.9724

If the relative risk is inverted: p1|2/p1|1 = 0.9057/0.8807 = 1.0284. Thus, a successful second free throw is estimated to be 1.0284 times more likely to occur when the first free throw is missed rather than made.

R code and output:

> ####################################################
> #Relative risk
> p.1.1 <- p.table[1,1]/sum(p.table[1,])
> n.1 <- sum(n.table[1,])
> p.1.2 <- p.table[2,1]/sum(p.table[2,])
> n.2 <- sum(n.table[2,])
> cat("The sample relative risk is", p.1.1/p.1.2, "\n \n")
The sample relative risk is 0.9724415

> alpha <- 0.05
> lower <- exp(log(p.1.1/p.1.2) - qnorm(1 - alpha/2) *
    sqrt((1-p.1.1)/(n.1*p.1.1) + (1-p.1.2)/(n.2*p.1.2)))
> upper <- exp(log(p.1.1/p.1.2) + qnorm(1 - alpha/2) *
    sqrt((1-p.1.1)/(n.1*p.1.1) + (1-p.1.2)/(n.2*p.1.2)))
> cat("The Wald interval for RR is:", round(lower, 4),
    "<= pi.1.1/pi.1.2 <=", round(upper, 4))
The Wald interval for RR is: 0.8827 <= pi.1.1/pi.1.2 <= 1.0713

> #Invert
> cat("The Wald interval for RR is:", round(1/upper, 4),
    "<= pi.1.2/pi.1.1 <=", round(1/lower, 4))
The Wald interval for RR is: 0.9334 <= pi.1.2/pi.1.1 <= 1.1329

Standard interpretation: I am approximately 95% confident that a second FT success is between 0.9334 and 1.1329 times more likely when the first FT is missed rather than made.

What else could be said here if one wanted to do a hypothesis test of Ho: π1|1/π1|2 = 1 vs. Ha: π1|1/π1|2 ≠ 1?

What if the interval was 2 ≤ π1|1/π1|2 ≤ 4?


2.3 The odds ratio (OR)

Suppose there is independent binomial sampling with the following set of conditional probabilities:

        Y
         1=success   2=failure
X  1       π1|1         π2|1      1
   2       π1|2         π2|2      1

For row 1, the "odds of a success" is odds1 = π1|1/(1-π1|1) = π1|1/π2|1.

For row 2, the "odds of a success" is odds2 = π1|2/(1-π1|2) = π1|2/π2|2.

In general, the odds of a success are P(success)/P(failure). Notice that the odds are just a rescaling of the P(success)! For example, if P(success) = 0.75, then the odds are 3, or "3 to 1 odds". The odds of a success are three times larger than for a failure.

The estimated odds are p1|1/p2|1 = n11/n12 for row 1 and p1|2/p2|2 = n21/n22 for row 2.

Notice what cells these correspond to in the contingency table.

        Y
         1     2
X  1    n11   n12   n1+
   2    n21   n22   n2+
        n+1   n+2     n

Questions: What is the range of an odds? What does it mean for an odds to be 1?

To incorporate information from both rows 1 and 2 into a single number, the ratio of the two odds is found. This is called an "odds ratio". Formally, it is defined as

θ = odds1/odds2 = [π1|1/(1-π1|1)] / [π1|2/(1-π1|2)] = (π1|1 π2|2)/(π1|2 π2|1)

"Odds ratio" is often abbreviated by "OR". ORs are VERY useful in categorical data analysis and will be used throughout this book!

ORs measure how much greater the odds of success are for one level of X than for another level of X.

Questions:
- What is the range of an OR?
- What does it mean for an OR to be 1? (The odds of a success are the same for both rows – independence!)
- What does it mean for an OR > 1? (The odds of a success are larger for X=1.)
- What does it mean for an OR < 1? (The odds of a success are larger for X=2.)

The OR can be estimated by

θ̂ = (p1|1/p2|1)/(p1|2/p2|2) = (n11 n22)/(n12 n21)

This is the maximum likelihood estimate of θ ("invariance property" of maximum likelihood estimators).

Notice how the OR is not dependent on a particular variable being called a "response" variable. If the roles of Y and X were switched, we would get the same OR! This is not true for the relative risk (try it yourself).

If there was multinomial sampling for the entire table, one could just condition on the rows to obtain the same OR. Also, note that

θ = (π11 π22)/(π12 π21)

which is the same OR as before. Also,

θ̂ = (p11 p22)/(p12 p21) = (n11 n22)/(n12 n21)


is the same estimated odds ratio as before.

Interpretation of the OR:
- The odds of Y=1 (success) are θ̂ times larger when X=1 than when X=2.
- The odds of X=1 are θ̂ times larger when Y=1 than when Y=2.

When θ̂ < 1, we will often want to invert the OR. Below is how the interpretations could change:
- The odds of Y=1 (success) are 1/θ̂ times larger when X=2 than when X=1, since 1/θ̂ = (n12 n21)/(n11 n22).
- The odds of X=1 are 1/θ̂ times larger when Y=2 than when Y=1.

Also, the interpretations could change to:
- The odds of Y=2 are 1/θ̂ times larger when X=1 than when X=2, since 1/θ̂ = (p2|1/p1|1)/(p2|2/p1|2).
- The odds of X=2 are 1/θ̂ times larger when Y=1 than when Y=2.

The table below is used a lot for the rearrangement of terms above.

        Y
         1=success   2=failure
X  1       π1|1         π2|1      1
   2       π1|2         π2|2      1

Work through these on your own to make sure you can show these relationships. You will need to become very comfortable with inverting an OR!

Confidence interval for θ

Since θ̂ is a maximum likelihood estimate, we can use the "usual" properties of MLEs to find the confidence interval. However, using log(θ̂) often works better (i.e., its distribution is closer to being a normal distribution). It can be shown that:
- log(θ̂) has an approximate normal distribution with mean log(θ) for large n.
- The "asymptotic" (for large n) standard deviation of log(θ̂) is

√(1/n11 + 1/n12 + 1/n21 + 1/n22)

This is derived using the "delta method" (see Chapter 14 of Agresti (2002) for a nice introduction).

The approximate (1-α)100% confidence interval for log(θ) is

log(θ̂) ± Z(1-α/2) √(1/n11 + 1/n12 + 1/n21 + 1/n22)

The approximate (1-α)100% confidence interval for θ is

exp[ log(θ̂) ± Z(1-α/2) √(1/n11 + 1/n12 + 1/n21 + 1/n22) ]

Lui and Lin (Biometrical Journal, 2003, p. 231) show this interval is conservative. What does “conservative” mean?

Problems with small cell counts


What happens to θ̂ if nij = 0 for some i, j? (The estimated OR is 0 or ∞!)

When there is a 0 or small cell count, the OR estimator is changed a little to help prevent problems. The adjusted OR estimator is

θ̃ = [(n11+0.5)(n22+0.5)] / [(n12+0.5)(n21+0.5)]

Thus, 0.5 is added to each cell count. The "asymptotic" standard deviation of log(θ̃) is then

√(1/(n11+0.5) + 1/(n12+0.5) + 1/(n21+0.5) + 1/(n22+0.5))

and the confidence interval for θ can be found as before.

Sometimes, a small number is just added to a cell with a 0 count instead.
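A minimal sketch of the adjusted estimator in R (this calculation is not part of the bird.R code shown below; it reuses n.table from earlier):

> theta.tilde <- (n.table[1,1]+0.5) * (n.table[2,2]+0.5) /
    ((n.table[1,2]+0.5) * (n.table[2,1]+0.5))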

Example: Larry Bird (bird.R)

             Second
              Made      Missed      Total
First  Made   n11=251   n12=34     n1+=285
     Missed   n21=48    n22=5      n2+=53
      Total   n+1=299   n+2=39     n=338

θ̂ = (n11 n22)/(n12 n21) = (251×5)/(34×48) = 0.7690

Interpretation:
- The estimated odds of a made second free throw attempt are 0.7690 times larger when the first free throw is made than when the first free throw is missed.
- The estimated odds of a made first free throw attempt are 0.7690 times larger when the second free throw is made than when the second free throw is missed. Note that this does not necessarily make sense to examine for this problem.

Often when the OR < 1, the OR is inverted and the interpretation is changed. Therefore, the estimated odds of a made second free throw attempt are 1/0.7690 = 1.3004 times larger when the first free throw is missed than when the first free throw is made.

The approximate 95% confidence interval for θ is 0.2862 ≤ θ ≤ 2.0659. If the interval is inverted, the approximate 95% confidence interval for 1/θ is 0.4841 ≤ 1/θ ≤ 3.4935.

The interpretation can be extended to: With approximately 95% confidence, the odds of a made second free throw attempt are between 0.4841 and 3.4935 times larger when the first free throw is missed than when the first free throw is made.

Since 1 is in the interval, there is not sufficient evidence to indicate that the first free throw result has an effect on the second free throw result.

R code and output:

> ####################################################
> #OR
> theta.hat <- (n.table[1,1] * n.table[2,2]) /
    (n.table[1,2] * n.table[2,1])
> theta.hat
[1] 0.7689951

> 1/theta.hat
[1] 1.300398

> alpha <- 0.05
> lower <- exp(log(theta.hat) - qnorm(1 - alpha/2) *
    sqrt(1/n.table[1,1] + 1/n.table[2,2] + 1/n.table[1,2] +
         1/n.table[2,1]))
> upper <- exp(log(theta.hat) + qnorm(1 - alpha/2) *
    sqrt(1/n.table[1,1] + 1/n.table[2,2] + 1/n.table[1,2] +
         1/n.table[2,1]))
> cat("The Wald interval for OR is:", round(lower, 4),
    "<= theta <=", round(upper, 4))
The Wald interval for OR is: 0.2862 <= theta <= 2.0659

> #Invert
> cat("The Wald interval for OR is:", round(1/upper, 4),
    "<= 1/theta <=", round(1/lower, 4))
The Wald interval for OR is: 0.4841 <= 1/theta <= 3.4935


Be careful with the inverted OR. I could have put “the Wald interval for 1/OR is:…”.

Please note that it is incorrect to replace the word "odds" with "probability". Also, a statement such as "it is 1.3 times more likely the second free throw is made when the first free throw is missed rather than made" is incorrect, because the word "likely" means probabilities are being compared.

Example: Salk vaccine clinical trials (polio.R)

            Polio   Polio free     Total
Vaccine        57      200,688   200,745
Placebo       142      201,087   201,229

R code and output:

> n.table <- array(data = c(57, 142, 200688, 201087), dim = c(2,2),
    dimnames = list(Trt = c("vaccine", "placebo"),
                    Result = c("polio", "polio free")))
> n.table
         Result
Trt       polio polio free
  vaccine    57     200688
  placebo   142     201087

> theta.hat <- (n.table[1,1] * n.table[2,2]) /
    (n.table[1,2] * n.table[2,1])
> theta.hat
[1] 0.4022065
> 1/theta.hat
[1] 2.486285

> alpha <- 0.05
> lower <- exp(log(theta.hat) - qnorm(1 - alpha/2) *
    sqrt(1/n.table[1,1] + 1/n.table[2,2] + 1/n.table[1,2] +
         1/n.table[2,1]))
> upper <- exp(log(theta.hat) + qnorm(1 - alpha/2) *
    sqrt(1/n.table[1,1] + 1/n.table[2,2] + 1/n.table[1,2] +
         1/n.table[2,1]))
> cat("The Wald interval for OR is:", round(lower, 4),
    "<= theta <=", round(upper, 4))
The Wald interval for OR is: 0.2958 <= theta <= 0.5469

> #Invert
> cat("The Wald interval for 1/OR is:", round(1/upper, 4),
    "<= 1/theta <=", round(1/lower, 4))
The Wald interval for 1/OR is: 1.8283 <= 1/theta <= 3.381

The estimated odds of getting polio are 0.4022 times higher when the vaccine is given instead of a placebo. If this OR is inverted, a more meaningful interpretation results:

The estimated odds of getting polio are 2.4863 times higher when the placebo is given instead of the vaccine.

With approximately 95% confidence, the odds of getting polio are between 1.8283 and 3.3810 times higher when the placebo is given instead of the vaccine.

The odds ratio interpretation could also be written as: The estimated odds of not getting polio are 2.4863 times higher when the vaccine is given instead of the placebo.


Would you want to receive the vaccine?

ORs can be calculated for larger contingency tables. For example, suppose the table below is of interest.

        Y
         1     2     3
X  1    n11   n12   n13   n1+
   2    n21   n22   n23   n2+
   3    n31   n32   n33   n3+
        n+1   n+2   n+3     n

Many ORs could be calculated here. For example,
- The estimated odds of Y=1 vs. Y=2 are (n11 n22)/(n12 n21) times larger when X=1 than when X=2. Also, the estimated odds of X=1 vs. X=2 are (n11 n22)/(n12 n21) times larger when Y=1 than when Y=2.
- The estimated odds of Y=1 vs. Y=2 are (n11 n32)/(n12 n31) times larger when X=1 than when X=3.
- The estimated odds of Y=1 vs. Y=3 are (n11 n33)/(n13 n31) times larger when X=1 than when X=3.
- The estimated odds of Y=2 vs. Y=3 are (n12 n33)/(n13 n32) times larger when X=1 than when X=3.

Notice how each sentence has something like “Y=1 vs. Y=2”. This is needed since we need to know which levels are being compared. Before when there was just two, we could just say “Y=1” since this implies it is being compared to the only other level.
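As a small sketch (with a hypothetical 3×3 table of counts), any of these ORs is just a 2×2 calculation on the relevant subtable:

# Hypothetical 3x3 table of counts
n.3x3 <- matrix(data = c(10, 20, 30, 40, 50, 60, 70, 80, 90),
    nrow = 3, ncol = 3)
# OR comparing Y=1 vs. Y=2 for X=1 vs. X=2: n11*n22/(n12*n21)
sub.table <- n.3x3[c(1, 2), c(1, 2)]
theta.hat <- sub.table[1,1] * sub.table[2,2] /
    (sub.table[1,2] * sub.table[2,1])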

Notes:
- One could write the odds ratio in terms of the expected cell counts, μij, as θ = (μ11 μ22)/(μ12 μ21) for a 2×2 table.

- Read on your own Section 2.3.4 (relationship between the OR and the relative risk), Section 2.3.5 (the odds ratio applies in case-control studies), and the subsection on types of observational studies.

- The Chapter 2 extra notes contain an old test problem (which you are responsible for) and other measures of association in a contingency table (which you are not responsible for).


2.4 Chi-squared tests of independence

We will be doing a variety of different hypothesis tests involving contingency tables. In order to do these hypothesis tests, we will need to find the expected cell counts under a hypothesis. These expected cell counts are denoted by μij.

Agresti's (2007) notation here is not necessarily the best to use for all situations. It may be more appropriate to use something like a superscripted μij (say, μij^(0)) to denote the expected value under a null hypothesis (Ho).

For example, the observed cell count for row i and column j of a contingency table is nij. Remember that nij is a random variable. The expected value of nij under a particular hypothesis is E(nij) = μij. Note that the estimate is μ̂ij = nij if there are no restrictions upon what μij can be.

Suppose we assume multinomial sampling (n is fixed). A common hypothesis test is a test for independence:

Ho: πij = πi+π+j for i=1,…,I and j=1,…,J
Ha: Not all equal

Under the null hypothesis independence restriction, E(nij) = μij = nπi+π+j. With no restriction (under Ho or Ha), the estimate is μ̂ij = nij.

Make sure you understand why μij = nπi+π+j under Ho!

Pearson statistic

The Pearson chi-squared statistic is

X² = Σi Σj (nij - μij)²/μij

Notes:
- The numerator measures how far the expected cell counts under Ho and the observed cell counts are from each other. Think of nij - μij as a residual, so each term is a squared residual.
- The denominator helps account for the scale of the cell count.
- The larger (nij - μij)²/μij, the more evidence that the null hypothesis is incorrect. Thus, large values of X² indicate the null hypothesis is incorrect.
- For large n, X² has an approximate χ² distribution with a particular number of degrees of freedom. The degrees of freedom are dependent on the hypotheses being tested. This is a right-tail test.
- Typical recommendations for a "large n" involve μij ≥ 5 (or nij ≥ 5).

Remember that with nij ~ Poisson(μij), (nij - μij)/√μij is an approximate standard normal value. Thus, Σi Σj (nij - μij)²/μij is an approximate χ² value. See Section 24 of Ferguson (1996) for general uses of the Pearson statistic.

Suppose we assume multinomial sampling (n is fixed). When a test for independence is done, the hypotheses are:

Ho: πij = πi+π+j for i=1,…,I and j=1,…,J
Ha: Not all equal

The Pearson statistic has nπi+π+j substituted for μij:

X² = Σi Σj (nij - nπi+π+j)²/(nπi+π+j)

Problem: Notice that the unknown parameter values appear in the statistic! Thus, this statistic cannot be calculated from the data alone.

To solve the problem, the corresponding estimators replace the parameters. The expected cell count under independence is estimated by

μ̂ij = n pi+ p+j = ni+n+j/n

The statistic becomes

X² = Σi Σj (nij - μ̂ij)²/μ̂ij

For large n, this statistic has an approximate χ² distribution with (I-1)(J-1) degrees of freedom under Ho; symbolically, X² ~ χ²(I-1)(J-1) approximately.

Where does the (I-1)(J-1) degrees of freedom come from?

In general, the degrees of freedom can be calculated as:

[# of parameters under Ha - # of restrictions under Ha]
- [# of parameters under Ho - # of restrictions under Ho]
= [# of free parameters under Ha] - [# of free parameters under Ho]

For a test of independence, the number of free parameters under Ha is IJ - 1.

Reason: There are IJ πij parameters. There is one restriction since Σi Σj πij = 1.

For a test of independence, the number of free parameters under Ho is I+J-2.

Reason: There are I πi+ parameters and J π+j parameters. There are two restrictions since Σi πi+ = 1 and Σj π+j = 1.

Thus, [IJ-1] - [I+J-2] = IJ - I - J + 1 = (I-1)(J-1).

Example: Larry Bird (bird.R)

             Second
              Made      Missed      Total
First  Made   n11=251   n12=34     n1+=285
     Missed   n21=48    n22=5      n2+=53
      Total   n+1=299   n+2=39     n=338

The estimated expected cell counts under independence, μ̂ij = ni+n+j/n, are:

             Second
              Made        Missed
First  Made   252.1154    32.8846
     Missed    46.8846     6.1154

X² = (251-252.1154)²/252.1154 + (34-32.8846)²/32.8846
   + (48-46.8846)²/46.8846 + (5-6.1154)²/6.1154
   = 0.00493 + 0.03783 + 0.02654 + 0.20343
   = 0.2727

The critical value at α = 0.05 is χ²(1, 0.95) = 3.84. The p-value for the test is 0.6015. Thus, there is not sufficient evidence to reject independence. Of course, this does not mean that the first and second attempts ARE independent!
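A minimal sketch of this by-hand calculation in R (reusing n.table for the Bird data; outer() builds the table of μ̂ij's):

> mu.hat <- outer(rowSums(n.table), colSums(n.table))/sum(n.table)
> X.sq <- sum((n.table - mu.hat)^2/mu.hat)
> round(X.sq, 4)
[1] 0.2727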

[Figure: plot of the chi-square density f(x) for 1 degree of freedom, x from 0 to 5, produced by the code below.]

par(xaxs = "i", yaxs = "i")  #Removes extra space on x and y-axis
curve(expr = dchisq(x, df = 1), col = "red", xlim = c(0, 5),
  ylab = "Chi-square f(x)", main = expression(chi[1]^2))

Note that executing demo(plotmath) at the command prompt shows more of what you can do for plotting mathematical symbols.

Below is the R code and output.

> ind.test <- chisq.test(n.table, correct = F)
> names(ind.test)
[1] "statistic" "parameter" "p.value"   "method"    "data.name" "observed"
[7] "expected"  "residuals"
> ind.test

        Pearson's Chi-squared test

data:  n.table
X-squared = 0.2727, df = 1, p-value = 0.6015

> #just p-value
> ind.test$p.value
[1] 0.6015021

> ind.test$expected
        Second
First         made    missed
  made   252.11538 32.884615
  missed  46.88462  6.115385

> #Another way using the raw data
> chisq.test(x = all.data2$first, y = all.data2$second, correct = F)

        Pearson's Chi-squared test

data:  all.data2$first and all.data2$second
X-squared = 0.2727, df = 1, p-value = 0.6015

> #critical value
> qchisq(p = 0.95, df = 1)
[1] 3.841459
> 1 - pchisq(q = ind.test$statistic, df = 1)
X-squared
0.6015021

> #Two more ways!
> bird.table2 <- xtabs(formula = ~ first + second, data = all.data2)
> summary(bird.table2)
Call: xtabs(formula = ~first + second, data = all.data2)
Number of cases in table: 338
Number of factors: 2
Test for independence of all factors:
        Chisq = 0.27274, df = 1, p-value = 0.6015

> bird.table3 <- table(all.data2$first, all.data2$second)
> summary(bird.table3)
Number of cases in table: 338
Number of factors: 2
Test for independence of all factors:
        Chisq = 0.27274, df = 1, p-value = 0.6015

Notes:
- When the sample size is small, a χ² approximation to the distribution of X² may not do a good job. The Yates continuity correction can be used to allow for a better approximation. With the correction, the Pearson statistic becomes

X² = Σi Σj (|nij - μ̂ij| - 0.5)²/μ̂ij

You can produce this statistic with the chisq.test() function by using the correct = TRUE option. We will discuss other alternatives later for when the sample size is small. Here is a quote from Agresti (1996, p. 43) regarding the use of the correction:

There is no longer any reason to use this approximation, however, since modern software makes it possible to conduct Fisher’s exact test for fairly large samples…
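As a quick check (my own addition, using the n.table object from the Larry Bird example), the two versions of the statistic can be compared; the corrected value of roughly 0.083 is my own hand calculation of the formula above:

> chisq.test(n.table, correct = FALSE)$statistic  # X-squared = 0.2727
> chisq.test(n.table, correct = TRUE)$statistic   # roughly 0.083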

The Pearson statistic can also be derived from the point of view of having independent multinomial sampling (ni+ fixed – each row of the contingency table represents a population). Instead of testing for independence as stated previously, equality of the conditional probabilities πj|i across the rows for each j=1,…,J is tested. Stated formally, the hypotheses are

   Ho: πj|1 = … = πj|I for j=1,…,J  vs.  Ha: At least one equality does not hold.

The hypotheses here are equivalent to the independence hypotheses (see p. 2.18 – 2.19). The Pearson test statistic and its asymptotic distribution are also the same. Some books go into detail explaining the differences and how they end up being equivalent. See Chapter 2 of Christensen (1990) if you are interested.

Likelihood ratio test (LRT) statistic

From Chapter 1 notes:

The LRT statistic, Λ, is the ratio of two likelihood functions. The numerator is the likelihood function maximized over the parameter space restricted under the null hypothesis. The denominator is the likelihood function maximized over the unrestricted parameter space. The test statistic is written as:

   Λ = [maximum of likelihood under Ho] /
       [maximum of likelihood over the unrestricted parameter space]

Note that the ratio is between 0 and 1 since the numerator can not exceed the denominator.

Questions:
- Why can't the numerator exceed the denominator?
- What does it mean when the ratio is close to 1?
- What does it mean when the ratio is close to 0?

The actual test statistic used for a LRT is –2log(Λ). The reason is because this statistic has an approximate χ² distribution for large n. The degrees of freedom are found the same way as for the Pearson statistic.

Assuming multinomial sampling, –2log(Λ) becomes

   G² = 2 Σi Σj nij log[nij/(nπij)]

where πij is restricted under the null hypothesis. Note that the estimate of the expected count nπij under Ha ends up being just nij. The G² notation is used throughout this book and by many other authors to denote this statistic.

Questions: What happens if nij ≈ μ̂ij in every cell? What could produce a large value of G²?

The Pearson X² and G² will often yield the same conclusions, but rarely the exact same statistic values. Both always have the same large sample (asymptotic) distribution under the null hypothesis.

Suppose we assume multinomial sampling (n is fixed). When a test for independence is done, the hypotheses are:

   Ho: πij = πi+π+j for i=1,…,I and j=1,…,J
   Ha: Not all equal

G² has nπi+π+j substituted for nπij:

   G² = 2 Σi Σj nij log[nij/(nπi+π+j)]

Problems:

1) What if nij=0? Often, 0.5 or some other small constant is added to the cell.

2) Notice the unknown parameter values in G²! As written, the statistic can not be calculated from the data alone.

To solve the second problem, the corresponding estimators replace the parameters. The expected cell count under independence is estimated by

   μ̂ij = n p̂i+ p̂+j = ni+n+j/n.

The statistic becomes

   G² = 2 Σi Σj nij log(nij/μ̂ij)

For large n, this statistic has an approximate χ² distribution with (I−1)(J−1) degrees of freedom.

Example: Larry Bird (bird.R)

From the last example,

                      Second
                 Made      Missed    Total
First   Made     n11=251   n12=34    n1+=285
        Missed   n21=48    n22=5     n2+=53
        Total    n+1=299   n+2=39    n=338

with estimated expected cell counts

                      Second
                 Made      Missed
First   Made     252.1154   32.8846
        Missed    46.8846    6.1154

G² = 2[251 log(251/252.1154) + 34 log(34/32.8846)
      + 48 log(48/46.8846) + 5 log(5/6.1154)]
   = 0.2858

The p-value is 0.5930. Thus, there is not sufficient evidence to reject independence. Remember the p-value from using the Pearson statistic was 0.6015.

For a small contingency table like this, you may have to do the calculations by hand on a test. Below is how the test can be done a few different ways in R.

> library(vcd)
Loading required package: MASS

Attaching package 'vcd':

The following object(s) are masked from package:graphics :

        barplot.default fourfoldplot mosaicplot

The following object(s) are masked from package:base :

        print.summary.table summary.table

> assocstats(n.table)
                     X^2 df P(> X^2)
Likelihood Ratio 0.28575  1  0.59296
Pearson          0.27274  1  0.60150

Phi-Coefficient   : 0.028
Contingency Coeff.: 0.028
Cramer's V        : 0.028

The vcd package contains the assocstats() function, which calculates the LRT statistic and p-value. This package is not installed by default with R. You can install the package by selecting PACKAGES > INSTALL PACKAGE(S) FROM CRAN. Select the vcd package from the list and select OK.


R may ask if you want to delete the installation files. You can type “Y” for deletion. In order to load the package (make ready for use) in any R session, use the library(vcd) code. This must be done before using any functions within the package.
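Equivalently (a command-line alternative to the menus, assuming an internet connection):

> install.packages("vcd")
> library(vcd)  # load the package in each R session that uses it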

See the Chapter 2 additional notes for how you can program the statistic itself into R.
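As a rough sketch of what those notes do (my own code, not the official version; it assumes the 2×2 n.table object from the Larry Bird example, which has no zero cell counts):

> n <- sum(n.table)
> mu.hat <- outer(rowSums(n.table), colSums(n.table))/n  # n_i+ n_+j / n
> sum((n.table - mu.hat)^2/mu.hat)    # X^2 = 0.27274, as from assocstats()
> 2*sum(n.table*log(n.table/mu.hat))  # G^2 = 0.28575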

Large n


The distributional approximations for X² and G² both rely on a "large n" for them to work. Below is a quote from Agresti (1990, p. 49) that describes the approximation in more detail:

   It is not simple to describe the sample size needed for the chi-squared distribution to approximate well the exact distribution of X² and G². For a fixed number of cells, X² usually converges more quickly than G². The chi-squared approximation is usually poor for G² when n/IJ < 5. When I or J is large, it can be decent for X² for n/IJ as small as 1, if the table does not contain both very small and moderately large expected frequencies.

P. 395-6 of Agresti (2002) contains similar information.

Example: Salk vaccine clinical trials (polio.R)

             Polio    Polio free      Total
 Vaccine        57       200,688    200,745
 Placebo       142       201,087    201,229

# Test for independence - Pearson chi-square
> ind.test <- chisq.test(n.table, correct = F)
> ind.test

        Pearson's chi-square test without Yates'
        continuity correction

data:  n.table
X-square = 36.1201, df = 1, p-value = 0

> # critical value
> qchisq(p = 0.95, df = 1)
[1] 3.841459
> 1 - pchisq(q = ind.test$statistic, df = 1)
     X-square
1.855266e-009

> ind.test$expected
         Result
Trt         polio polio free
  vaccine 99.3802   200645.6
  placebo 99.6198   201129.4

#####################################################
# Test for independence - LRT

> library(vcd)
> assocstats(n.table)
                    X^2 df   P(> X^2)
Likelihood Ratio 37.313  1 1.0059e-09
Pearson          36.120  1 1.8553e-09

Phi-Coefficient   : 0.009
Contingency Coeff.: 0.009
Cramer's V        : 0.009

There is evidence against the independence of the treatment and polio result.

Suppose subjects can pick more than one X and Y response. Below is an example of where this can happen:

[Table: counts of swine waste storage methods cross-classified with sources of veterinary information; each farmer could select more than one category for each variable.]

In this case, farmers can choose more than one type of swine waste storage method and more than one type of source of veterinary information. The previous methods for testing independence assume a subject (farmer here) is represented only once in the table. Therefore, they can not be used. As part of my research, I have derived a few different testing approaches for this. See Bilder and Loughin (Biometrics, 2004) for more information.

Residuals

Suppose the hypothesis of independence is rejected. The next step would be to determine WHY it was rejected. Summary measures like an OR can help determine what type of dependence exists. Cell residuals can also help determine where independence is a bad “fit”.

Cell deviations: nij − μ̂ij – hard to interpret because the size of a deviation depends on the sizes of the counts involved


Cell chi-square: (nij − μ̂ij)²/μ̂ij – can be "roughly" treated as a χ²₁ random variable.

Pearson residual: (nij − μ̂ij)/√μ̂ij – this is just the (signed) square root of the cell chi-square; it can be treated "roughly" as a N(0,1) random variable; use 2 or 3 as "general" guidelines to help determine what cells are "outlying" or indicate evidence against independence.

Standardized residual: (nij − μ̂ij)/√[μ̂ij(1 − pi+)(1 − p+j)] for a test of independence, where pi+ = ni+/n and p+j = n+j/n. Note that the denominator is the estimated standard deviation of nij − μ̂ij. For large n, this can be treated as an approximate N(0,1) random variable. Use 2 or 3 as guidelines to help determine what cells are "outlying" or indicate evidence against independence. Agresti (1996) called these "adjusted" residuals.

Questions:
- For the Pearson residual, why does it make sense to divide by √μ̂ij? (Hint: nij can be treated as roughly Poisson with mean and variance μij.)
- The standardized residual will change if a different hypothesis is tested.
- The Pearson residual and the standardized residual are the equivalent of the semistudentized residuals and studentized residuals typically discussed in a regression analysis course similar to STAT 870. See Section 10.2 of my STAT 870 lecture notes at


www.chrisbilder.com/stat870/schedule.htm for more information.
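As an aside (my own sketch, not from bird.R), the double for() loop used in the examples below can be replaced by a vectorized calculation, assuming the ind.test object from chisq.test() and its n.table:

> n <- sum(n.table)
> adj <- outer(1 - rowSums(n.table)/n, 1 - colSums(n.table)/n)
> ind.test$residuals/sqrt(adj)  # standardized residuals, whole table at once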

Example: Larry Bird (bird.R)

From the last example,

                      Second
                 Made      Missed    Total
First   Made     n11=251   n12=34    n1+=285
        Missed   n21=48    n22=5     n2+=53
        Total    n+1=299   n+2=39    n=338

with estimated expected cell counts

                      Second
                 Made      Missed
First   Made     252.1154   32.8846
        Missed    46.8846    6.1154

Pay close attention to how elementwise subtraction and division are being done, even though matrices are being used!

> # General way
> mu.hat <- ind.test$expected
> cell.dev <- n.table - mu.hat
> cell.dev
              second made second missed
first made      -1.115385      1.115385
first missed     1.115385     -1.115385

> pearson.res <- cell.dev/sqrt(mu.hat)
> pearson.res
              second made second missed


first made      -0.07024655     0.1945039
first missed     0.16289564    -0.4510376

> ind.test$residuals  # Pearson residuals, easier way
        Second
First          made     missed
  made  -0.07024655  0.1945039
  missed 0.16289564 -0.4510376

> stand.res <- matrix(NA, 2, 2)
> # find standardized residuals
> for(i in 1:2) {
    for(j in 1:2) {
      stand.res[i, j] <- pearson.res[i, j] /
        sqrt((1 - sum(n.table[i, ])/n) * (1 - sum(n.table[, j])/n))
    }
  }
> stand.res
           [,1]       [,2]
[1,] -0.5222416  0.5222416
[2,]  0.5222416 -0.5222416

> # Note that the Pearson residuals can also be found with:
> ind.test <- chisq.test(n.table, correct = F)
> ind.test$residuals
        Second
First          made     missed
  made  -0.07024655  0.1945039
  missed 0.16289564 -0.4510376

Notice that none of the residuals are indicating that independence provides a bad fit to the contingency table. Why does this make sense?

Example: Salk vaccine clinical trials (polio.R)

             Polio    Polio free      Total
 Vaccine        57       200,688    200,745
 Placebo       142       201,087    201,229

> n.table
          polio polio free
  vaccine    57     200688
  placebo   142     201087

> pearson.res <- ind.test$residuals
> pearson.res
         Result
Trt           polio  polio free
  vaccine -4.251215  0.09461241
  placebo  4.246099 -0.09449856

> stand.res <- matrix(data = NA, nrow = 2, ncol = 2)
> # find standardized residuals
> for(i in 1:2) {
    for(j in 1:2) {
      stand.res[i, j] <- pearson.res[i, j] /
        sqrt((1 - sum(n.table[i, ])/n) * (1 - sum(n.table[, j])/n))
    }
  }
> stand.res
          [,1]      [,2]
[1,] -6.009997  6.009997
[2,]  6.009997 -6.009997

Notice that the residuals are indicating all cells contribute to the dependence.

Example: #7.13 (birth_control.R)

This example shows what happens when a table larger than 2×2 is used. Note that it may be difficult to summarize all of the dependence with ORs since the table is 9×4 in size!

Subjects were asked whether methods of birth control should be available to teenagers between the ages of 14 and 16. Notice the ordered categorical variables!

                              Teenage birth control
Religious                strongly                      strongly
attendance                  agree   agree   disagree   disagree
Never                          49      49         19          9
<1 per year                    31      27         11         11
1-2 per year                   46      55         25          8
several times per year         34      37         19          7
1 per month                    21      22         14         16
2-3 per month                  26      36         16         16
nearly every week               8      16         15         11
every week                     32      65         57         61
several times per week          4      17         16         20

Below is the R code and output.

> n.table <- array(c(49, 31, 46, 34, 21, 26, 8, 32, 4,
    49, 27, 55, 37, 22, 36, 16, 65, 17,
    19, 11, 25, 19, 14, 16, 15, 57, 16,
    9, 11, 8, 7, 16, 16, 11, 61, 20), dim = c(9,4),
    dimnames = list(Religous.attendance =
      c("Never", "<1 per year", "1-2 per year",
        "several times per year", "1 per month", "2-3 per month",
        "nearly every week", "every week",
        "several times per week"),
      Teenage.birth.control = c("strongly agree", "agree",
        "disagree", "strongly disagree")))

> n.table
                        Teenage.birth.control
Religous.attendance      strongly agree agree disagree strongly disagree
  Never                              49    49       19                 9
  <1 per year                        31    27       11                11
  1-2 per year                       46    55       25                 8
  several times per year             34    37       19                 7
  1 per month                        21    22       14                16
  2-3 per month                      26    36       16                16
  nearly every week                   8    16       15                11
  every week                         32    65       57                61
  several times per week              4    17       16                20

#####################################################
# Test for independence - Pearson
> ind.test <- chisq.test(n.table, correct = F)
> ind.test

        Pearson's chi-square test without Yates'
        continuity correction

data:  n.table
X-square = 106.1941, df = 24, p-value = 0

> mu.hat <- ind.test$expected
> mu.hat
                        Teenage.birth.control
Religous.attendance      strongly agree    agree disagree strongly disagree
  Never                        34.15335 44.08639 26.12527         21.634989
  <1 per year                  21.68467 27.99136 16.58747         13.736501
  1-2 per year                 36.32181 46.88553 27.78402         23.008639
  several times per year       26.29266 33.93952 20.11231         16.655508
  1 per month                  19.78726 25.54212 15.13607         12.534557
  2-3 per month                25.47948 32.88985 19.49028         16.140389
  nearly every week            13.55292 17.49460 10.36717          8.585313
  every week                   58.27754 75.22678 44.57883         36.916847
  several times per week       15.45032 19.94384 11.81857          9.787257

#####################################################
# Test for independence - LRT
> # easiest way
> library(vcd)
> assocstats(n.table)
                    X^2 df   P(> X^2)
Likelihood Ratio 112.54 24 2.0284e-13
Pearson          106.19 24 2.5890e-12

Phi-Coefficient   : 0.339
Contingency Coeff.: 0.321
Cramer's V        : 0.196

#####################################################
# Find residuals
> pearson.res <- ind.test$residuals
> pearson.res
                       strongly agree      agree   disagree strongly disagree
Never                       2.5404573  0.7400280 -1.3940262       -2.71641759
<1 per year                 2.0004242 -0.1873785 -1.3719091       -0.73834198
1-2 per year                1.6058693  1.1850612 -0.5281708       -3.12893004
several times per year      1.5030986  0.5253346 -0.2480249       -2.36589885
1 per month                 0.2726315 -0.7008651 -0.2920103        0.97882315
2-3 per month               0.1031195  0.5423137 -0.7905900       -0.03494422
nearly every week          -1.5083590 -0.3573330  1.4388522        0.82410537
every week                 -3.4421839 -1.1791057  1.8603644        3.96370252
several times per week     -2.9130570 -0.6591897  1.2163031        3.26446417

> # find standardized residuals
> stand.res <- matrix(NA, 9, 4)
> for(i in 1:9) {
    for(j in 1:4) {
      stand.res[i, j] <- pearson.res[i, j] /
        sqrt((1 - sum(n.table[i, ])/n) * (1 - sum(n.table[, j])/n))
    }
  }
> stand.res
            [,1]       [,2]       [,3]        [,4]
 [1,]  3.2012973  0.9874517 -1.6845693 -3.21118144
 [2,]  2.4512975 -0.2431349 -1.6121413 -0.84876153
 [3,]  2.0337928  1.5892453 -0.6414677 -3.71746225
 [4,]  1.8606698  0.6886070 -0.2944292 -2.74746527
 [5,]  0.3327059 -0.9056755 -0.3417328  1.12058040
 [6,]  0.1274202  0.7095804 -0.9368124 -0.04050671
 [7,] -1.8164011 -0.4556525  1.6616023  0.93098781
 [8,] -4.6010650 -1.6689014  2.3846584  4.97026514
 [9,] -3.5220717 -0.8439433  1.4102460  3.70267268

There is strong evidence against independence. The deviation from independence appears to occur in the "corners" of the table. Notice the upper left and lower right have positive values, and the lower left and upper right have negative values. This could be due to the ordinal nature of the categorical variables. Models which take this into account will be discussed later.

The type of dependence here is called "positive" dependence (not "negative" dependence). The positive values in the upper left and lower right mean the (1,1), (9,4), … cells are occurring more frequently than expected under independence. Thus, low row and low column indices occur together, and high row and high column indices occur together. The negative values in the lower left and upper right mean the (9,1), (1,4), … cells are occurring less frequently than expected under independence.

If this is hard to understand, think of the positive relationship that typically occurs with high school and college GPAs. See the data set in the R Introduction notes.

Partitioning Chi-squared (p.32-3)

Read on your own

Comments on Chi-squared tests (p.33-34)

Read on your own


Note that X² and G² do not depend on the order of the rows or columns. Thus, they do not change for any ordering of the rows and columns. These tests assume the categorical variables are nominal. If the categorical variables are ordinal, the tests ignore the ordinal information.
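A quick check of this invariance (my own addition; n.table is whichever contingency table is currently in the workspace):

> chisq.test(n.table, correct = FALSE)$statistic                     # original order
> chisq.test(n.table[nrow(n.table):1, ], correct = FALSE)$statistic  # rows reversed - same value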


2.5 Testing independence for ordinal data

The previous tests for independence assumed each categorical variable was nominal. If at least one of the variables was ordinal, useful information may be ignored by using the previous tests!

Generally, tests which incorporate the ordinal information will be more POWERFUL in detecting dependence than tests which do not.

What does being more POWERFUL mean???

Linear trend alternative to independence

Suppose the row and column categorical variables are ordinal. If either of the categorical variables is nominal with only two categories, the test shown below can also be used.

Tests using the ordinal information assign "scores" to each level of the row and each level of the column categorical variables.

Let u1 ≤ u2 ≤ … ≤ uI denote the scores for the row variable, with at least one ≤ replaced with a <.

Let v1 ≤ v2 ≤ … ≤ vJ denote the scores for the column variable, with at least one ≤ replaced with a <.


Example: #7.13 (birth_control.R)

                              Teenage birth control
Religious                strongly                      strongly
attendance                  agree   agree   disagree   disagree
Never                          49      49         19          9
<1 per year                    31      27         11         11
1-2 per year                   46      55         25          8
several times per year         34      37         19          7
1 per month                    21      22         14         16
2-3 per month                  26      36         16         16
nearly every week               8      16         15         11
every week                     32      65         57         61
several times per week          4      17         16         20

Teenage birth control opinions could have scores of v1=1 (strongly agree), v2=2 (agree), v3=3 (disagree), and v4=4 (strongly disagree).

Religious attendance could have scores of u1=0 (never), u2=1 (<1 per year), …, u9=8 (several times per week). Converting the levels to numbers of times per year could instead produce the following scores: u1=0, u2=1, u3=1.5, u4=(3+12)/2 = 7.5, u5=12, u6=2.5×12=25, u7=(52+25)/2=38.5, u8=52, and u9=52×2=104.

Notice there generally is more than one way of assigning scores! One should try a few different ways to see if inferences are affected (Agresti calls this a sensitivity analysis).


Suppose each observation is replaced with their (ui, vj) pair. In the last example, there are 49 observation pairs of (u1,v1), …, 20 observation pairs of (u9, v4). Using this “new” data set, the Pearson product-moment correlation (often denoted by r) can be calculated and interpreted in its usual way!

Review from STAT 218 for a Pearson correlation:

Suppose X and Y are two variables. We observe (x1, y1), …, (xn, yn) pairs where n is the sample size. The Pearson correlation is calculated as

   r = Σk (xk − x̄)(yk − ȳ) / √[Σk (xk − x̄)² Σk (yk − ȳ)²]

r is scaleless and −1 ≤ r ≤ 1.

Since there are a number of observations with the same (ui, vj) pair, we can simplify the formula for the correlation to

   r = Σi Σj nij(ui − ū)(vj − v̄) / √[Σi ni+(ui − ū)² Σj n+j(vj − v̄)²]

where ū = Σi ni+ui/n and v̄ = Σj n+jvj/n.


Compare this formula on your own to the formula for the Pearson product-moment correlation.
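As a sketch (my own code, assuming the n.table, u, and v objects defined in the R code further below), the simplified formula can be computed directly from the table, without expanding to the raw data:

> p <- n.table/sum(n.table)
> u.bar <- sum(u * rowSums(p))
> v.bar <- sum(v * colSums(p))
> num <- sum(outer(u - u.bar, v - v.bar) * p)
> num / sqrt(sum((u - u.bar)^2 * rowSums(p)) *
    sum((v - v.bar)^2 * colSums(p)))  # matches cor() on the raw data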

Notes:
- −1 ≤ r ≤ 1
- Values close to −1 or 1 indicate strong negative or positive dependence, respectively.
- Values close to 0 indicate independence or weak dependence.
- To test

      Ho: Independence  vs.  Ha: Linear dependence,

  use M² = (n−1)r² as the test statistic. This statistic has an approximate χ²₁ distribution for large n.

Notice the null hypothesis is the same as previously used for the "test of independence" with X² and G². However, the alternative hypothesis is not the same.


This alternative hypothesis specifies the “type” of dependence. Previously, any “type” of dependence was given in the alternative hypothesis.

The alternative hypothesis here is a subset of the alternative hypothesis used with X² and G².

Example: #7.13 (birth_control.R)

Ho: Independence  vs.  Ha: Linear dependence

r = 0.3101, M² = (926−1)×0.3101² = 88.96, p-value < 0.0001

There is evidence of positive linear dependence. Notice the pattern of the residuals on p. 2.76. This is indicative of a linear relationship! The “corner” residuals are “large”. When the u and v scores are both small or large, the residuals are positive. When the u and v scores are opposite in their values (i.e. u small, v large or vice versa), the residuals are negative.

Below is the R code and output. Notice how the data is put into its “raw” form.

> #####################################################
> # ordinal measures

> # scores
> u <- 0:8
> # u <- c(0, 1, 1.5, 7.5, 12, 25, 38.5, 52, 104)  # second set of scores


> v <- 1:4

> all.data <- matrix(data = NA, nrow = 0, ncol = 2)

> # Put data in "raw" form
> for(i in 1:9) {
    for(j in 1:4) {
      all.data <- rbind(all.data, matrix(data = c(u[i], v[j]),
        nrow = n.table[i, j], ncol = 2, byrow = T))
    }
  }

> # find correlation
> r <- cor(all.data)
> r
          [,1]      [,2]
[1,] 1.0000000 0.3101243
[2,] 0.3101243 1.0000000
> M.sq <- (sum(n.table) - 1) * r[1, 2]^2
> M.sq
[1] 88.96382
> 1 - pchisq(M.sq, 1)
[1] 0

When the second set of u scores is used, r = 0.3067.

Notes:
- r and M² do not change for different sets of equally spaced scores. For example, scores of 1,2,3,4 and 0,1,2,3 give the same results.
- See the example using the data in Table 2.7 of Agresti (2007). The column variable is nominal, but one can still find r since it has only two levels.
- See Agresti's (2007) use of "midranks" to find the scores.


Model based approaches for ordinal data will be discussed later in Chapter 7. Chapter 9 of Agresti (2002) discusses these in detail.

What if one of the variables is ordinal and the other variable is nominal (with more than two categories)? One can look at mean scores across the levels of the nominal variable. For example, suppose X is nominal and Y is ordinal. Find the mean scores for Y at each level of X. See Chapter 9 of Agresti (2002) again.


2.6 Exact inference for small samples

X² and G² for a fixed n do NOT exactly have χ² distributions!!! We use a χ² distribution when n is large because the statistics "approximately" have this distribution. What happens if the sample size is not large???

A good overview of exact inference is:

Agresti, A. (1992). A survey of exact inference for contingency tables. Statistical Science 7, 131-153.

Exact inference refers to the “exact” probability distribution of the statistic being used. The Clopper-Pearson interval is an example of exact inference.

Here’s a quote from Agresti (1992) which quotes R. A. Fisher’s Statistical Methods for Research Workers 1st edition (1926) book:

… the traditional machinery of statistical processes is wholly unsuited to the needs of practical research. Not only does it take a cannon to shoot a sparrow, but it misses the sparrow! The elaborate mechanism built on the theory of infinitely large samples is not accurate enough for simple laboratory data. Only by systematically tackling


small sample problems on their merits does it seem possible to apply accurate tests to practical data.

"Small samples" here does not just mean a small n. It also means having a mix of small and large cell counts.

Hypergeometric distribution

Here’s the classic set up for a random variable with a hypergeometric probability distribution:

Suppose an urn has n balls, with a of them being red and b of them being blue. Thus a+b=n. Suppose k ≤ n balls are drawn from the urn without replacement. Let m be the number of red balls drawn out.

The random variable m has a hypergeometric distribution with density function

   P(m) = C(a,m) C(b,k−m) / C(n,k)   for m = 0, 1, …, k

subject to m ≤ a and k − m ≤ b. Note that C(e,d) = e!/[d!(e−d)!] is read "e choose d". Also, notice that a, n, b, and k are FIXED values. The only random variable is m!

Example: Let n=10, a=4, b=6, k=3, and m=2. Then

   P(2) = C(4,2) C(6,1) / C(10,3) = (6 × 6)/120 = 0.3
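The calculation can be checked in R (my own addition):

> dhyper(2, 4, 6, 3)
[1] 0.3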

Example: Urns (tea_taster.R)

Suppose there are n=8 balls in an urn with a=4 of them red and b=4 of them blue. Suppose k=4 balls are drawn from the urn. What is the probability of getting m=3 red balls?

   P(3) = C(4,3) C(4,1) / C(8,4) = (4 × 4)/70 = 0.2286

The entire probability distribution is

   m    P(m)
   0    0.0143
   1    0.2286
   2    0.5143
   3    0.2286
   4    0.0143

Is it reasonable to observe m ≥ 3?

R code and output:

> # P(3)
> dhyper(3, 4, 4, 4)
[1] 0.2285714
> # P(0),...,P(4)
> dhyper(0:4, 4, 4, 4)
[1] 0.01428571 0.22857143 0.51428571 0.22857143 0.01428571

In general, the function is dhyper(m, a, b, k).

Fisher’s exact test

The hypergeometric distribution can be used with 2×2 tables to test for independence! Below is a 2×2 table.

                 Y
              1     2
   X   1    n11   n12   n1+
       2    n21   n22   n2+
            n+1   n+2   n


Suppose n1+, n2+, n+1, n+2, and n are FIXED by the sampling design. This means before the sample is taken or the experiment is conducted, these values are KNOWN. Given these known quantities, how many of the 4 cell counts (n11, n12, n21, and n22) are needed before all of the other cell counts are known?

Since only one of the four cell counts is needed to know the rest of the table counts, n11 can be treated as the only random variable! If you know n11, you know the rest of the table!

Suppose X and Y are independent. The probabilities of observing different n11 values (and thus different 2×2 tables) can be calculated using the hypergeometric distribution:

   P(n11) = C(n1+, n11) C(n2+, n+1 − n11) / C(n, n+1).

(P. 96-7 of Agresti, 2002, gives more details about this distribution.) The probabilities are calculated under the assumption of independence between X and Y. A low probability indicates that a particular n11 is not likely to be observed. Thus, its corresponding 2×2 table is not likely under independence.


Using the hypergeometric distribution in a test for independence with 2×2 contingency tables is called Fisher's exact test. Note that the hypergeometric is the EXACT distribution for n11. Thus, this is where the name "exact inference" comes from.

Tea taster experiment

This is a common example discussed often in statistics. See p. 46 of Agresti (2007) for the set up, or "The Lady Tasting Tea: How Statistics Revolutionized Science in the Twentieth Century" book by David Salsburg. Below is a "possible" outcome of the observed data (the actual data are unknown).

                        Guess Pour First
                          Milk     Tea
 Poured First   Milk         3       1      4
                Tea          1       3      4
                             4       4      8

Before the experiment, it was decided to have 4 cups with milk poured first and 4 cups with tea poured first. Thus, the row marginal totals are FIXED.


Since the taster was told before the experiment 4 cups had milk poured first and 4 cups had tea poured first, one would think the taster would guess 4 of each type. Thus, the column totals are FIXED.

Questions:
- How likely is it to have an experiment with both row and column totals fixed?
- Suppose that the taster really can not tell the difference. What does this mean in terms of the problem?
- What is the probability that the taster would have guessed correctly three of the milk poured first?

Under the assumption that the taster can not tell the difference, the probability can be found with the hypergeometric distribution: P(3)=0.2286.

Does guessing 3 or more of the milk poured first correctly seem reasonable under the assumption that the taster can not tell the difference?

P(3)+P(4) = 0.2286 + 0.0143 = 0.2429

What is the p-value of Ho: θ = 1 (independence) vs. Ha: θ > 1 (positive dependence)?

Why is this test chosen instead of Ha: θ ≠ 1 or Ha: θ < 1?

Notice the only way to show there is some evidence that the taster can tell the difference is when n11=4.
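This matches a direct calculation in R (my own check):

> sum(dhyper(3:4, 4, 4, 4))  # P(3) + P(4)
[1] 0.2428571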


The small sample size here is the reason.

R code and output from tea_taster.R:

> n.table <- array(data = c(3, 1, 1, 3), dim = c(2,2),
    dimnames = list(Actual = c("Pour Milk", "Pour Tea"),
                    Guess = c("Pour Milk", "Pour Tea")))
> n.table
           Guess
Actual      Pour Milk Pour Tea
  Pour Milk         3        1
  Pour Tea          1        3

> fisher.test(x = n.table)

        Fisher's Exact Test for Count Data

data:  n.table
p-value = 0.4857
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
   0.2117329 621.9337505
sample estimates:
odds ratio
  6.408309

> fisher.test(n.table, alternative = "greater")

        Fisher's Exact Test for Count Data

data:  n.table
p-value = 0.2429
alternative hypothesis: true odds ratio is greater than 1
95 percent confidence interval:
 0.3135693       Inf
sample estimates:
odds ratio
  6.408309


The two-tail test p-value is given by fisher.test(). The two-tail test adds all probabilities that are ≤ P(n11); i.e., sum the table probabilities that are no more likely than the observed one. In this case, this includes P(0), P(1), P(3), and P(4).
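A direct check of this sum (my own addition; the small tolerance guards against floating-point comparisons):

> p <- dhyper(0:4, 4, 4, 4)
> sum(p[p <= p[4] + 1e-8])  # p[4] is P(n11 = 3), the observed table
[1] 0.4857143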

Larger than 2×2 tables

Fisher's exact test can be extended to tables larger than 2×2 by using the multiple hypergeometric distribution. The probability of observing a particular set of cell counts (with the margins fixed, the counts outside of row I and column J determine the rest of the table) is:

   P(table) = [Πi ni+! Πj n+j!] / [n! Πi Πj nij!]

The marginal totals of the contingency table are again assumed to be fixed. Below is the I×J table shown for review:


                               Y
           1       2      …   J-1        J
 X   1     n11     n12        n1,J-1     n1J      n1+
     2     n21     n22        n2,J-1     n2J      n2+
     ⋮
     I-1   nI-1,1  nI-1,2     nI-1,J-1   nI-1,J   nI-1,+
     I     nI1     nI2        nI,J-1     nIJ      nI+
           n+1     n+2        n+,J-1     n+J      n

For 2×2 tables, the multiple hypergeometric simplifies to the hypergeometric.

Example: Table 2.10 of Agresti (1996, p. 45) (tab2.10.R)

n.table <- array(data = c(0, 1, 0, 7, 1, 8, 0, 1, 0, 0, 1, 0,
  0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0), dim = c(3, 9))

> n.table
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9]
[1,]    0    7    0    0    0    0    0    1    1
[2,]    1    1    1    1    1    1    1    0    0
[3,]    0    8    0    0    0    0    0    0    0

> fisher.test(n.table)

        Fisher's Exact Test for Count Data

data:  n.table
p-value = 0.001505
alternative hypothesis: two.sided

> x.sq <- chisq.test(n.table, correct = F)

Warning message:
Chi-squared approximation may be incorrect in:
chisq.test(n.table, correct = F)

> x.sq

        Pearson's Chi-squared test

data:  n.table
X-squared = 22.2857, df = 16, p-value = 0.1342

Notice the difference in the p-values between the two tests.

Permutation tests

Introduction to Modern Nonparametric Statistics by James J. Higgins (2003) is a very good reference on these types of tests.

Similar to Fisher’s exact test, it would be nice if we could write out the exact probability distribution for statistics like X2 or G2 and use these distributions to judge how likely it is to observe the test statistic value under a null hypothesis. In the tea tasting experiment, there are 5 unique 22 tables under independence that produce the following probabilities:

   n11      Table      P(n11)     X²
    0       0  4       0.0143      8
            4  0

    1       1  3       0.2286      2
            3  1

    2       2  2       0.5143      0
            2  2

    3       3  1       0.2286      2
            1  3

    4       4  0       0.0143      8
            0  4

Notice that X² is the same for some tables. Taking this into account, the exact probability distribution of X² can be written as

   X²    P(X²)    CDF
   0     0.5143   0.5143
   2     0.4571   0.9714
   8     0.0286   1.0000

The CDF column represents the "cumulative distribution function." Remember that with a Pearson chi-square test for independence, we would use a χ²₁ distribution to approximate this discrete distribution. Below are a table and a plot showing how poor this approximation is:

   X²    P(X²)    CDF      CDF for χ²₁
   0     0.5143   0.5143   0.0000
   2     0.4571   0.9714   0.8427
   8     0.0286   1.0000   0.9953

[Figure: "CDFs" – step plots of the exact CDF of X² (labeled Exact) and the χ²₁ CDF, plotted against X² for 0 ≤ X² ≤ 8.]

My perm_test_motivate.R program does these calculations.


A more general way to see this same exact distribution representation is to consider all possible “permutations” of the row and column numbers. For example, we observed the table:

                        Guess Pour First
                          Milk     Tea
 Poured First   Milk         3       1      4
                Tea          1       3      4
                             4       4      8

There are 8 distinct observations the lady needs to make. We could label these as z1, z2, …, z8. Suppose we observed the following:

   Row   Column
   1     z1 = 1
   1     z2 = 1
   1     z3 = 1
   1     z4 = 2
   2     z5 = 1
   2     z6 = 2
   2     z7 = 2
   2     z8 = 2

which produces the table above and X² = 2. Under independence, these column numbers could have appeared with any of the row numbers. For example, we could have had


   Row   Column
   1     z2 = 1
   1     z1 = 1
   1     z3 = 1
   1     z4 = 2
   2     z5 = 1
   2     z6 = 2
   2     z7 = 2
   2     z8 = 2

resulting in the same 2×2 table, so that X² = 2 again. Also, we could have had

   Row   Column
   1     z1 = 1
   1     z2 = 1
   1     z7 = 2
   1     z4 = 2
   2     z5 = 1
   2     z6 = 2
   2     z3 = 1
   2     z8 = 2

resulting in a contingency table with all 2's in the cells and X² = 0. These last two examples are "permutations" of the data, and there are 8! = 40,320 permutations in total. Because of the independence assumption, each of these is equally likely to occur – i.e., each has probability 1/40,320. If we found all possible permutations, we could form a table as follows:

   X²    # of permutations    Proportion
   0     20,736               0.5143
   2     18,432               0.4571
   8      1,152               0.0286

which is the same exact distribution that we saw before! In fact, one could have found these permutation counts with

> dhyper(0:4, 4, 4, 4) * factorial(8)
[1]   576  9216 20736  9216   576

in R (combining the counts for the n11 values that give the same X²: 9,216 + 9,216 = 18,432 permutations give X² = 2, and 576 + 576 = 1,152 give X² = 8).

In order to calculate a p-value, we can use this exact distribution. With X² = 2 observed, the p-value is P(A ≥ 2) = 0.4571 + 0.0286 = 0.4857, where A is a random variable with this exact distribution (in a more mathematical statistics setting, one would write that x² = 2 is observed and the p-value is P(X² ≥ x²)).

Frequently, the number of permutations is going to be so large that we can not calculate every permutation. Instead, we will randomly select a large number, say B, and calculate the estimate of the exact distribution from those. This estimate is often referred to as the “permutation distribution.” Using this distribution to do a hypothesis test is referred to as a “permutation test.”

Below is a description of a general way to find the permutation distribution.

1) Randomly permute the column numbers. Put these back into a data set with the row numbers.

2) Calculate X². Denote this statistic by X²* to avoid confusion with the observed X².

3) Repeat 1) and 2) B times, where B is a large number (1,000 or more).

4) Plot a histogram of the X²* values. This serves as a visual estimate of the exact distribution of X².

To calculate our p-value, we can obtain an initial impression of whether it will be small or large by seeing where X² falls on this histogram. To calculate it formally, we can use step 5.

5) The p-value is (number of X²* values ≥ the observed X²)/B. Small p-values indicate the observed X² would be unusual to obtain if independence was true.

How can we do all of this in R? First, we will need to put the data into its "raw form" (this is my own term), so that every cell in the contingency table is represented by row and column numbers like on p. 2.99. We can then use the sample() function to find each permutation. The next example shows the whole process.

Example: Table 2.10 of Agresti (1996) (tab2.10-v2.R)

> n.table <- array(data = c(0, 1, 0, 7, 1, 8, 0, 1, 0, 0, 1, 0,
    0, 1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0), dim = c(3,9))
> x.sq <- chisq.test(n.table, correct = F)
Warning message:
Chi-squared approximation may be incorrect in:
chisq.test(n.table, correct = F)

> x.sq

        Pearson's Chi-squared test

data:  n.table
X-squared = 22.2857, df = 16, p-value = 0.1342

Note that X² = 22.29.

> #####################################################
> # Put data into raw form

> all.data <- matrix(data = NA, nrow = 0, ncol = 2)

> # Put data in "raw" form
> for (i in 1:nrow(n.table)) {
    for (j in 1:ncol(n.table)) {
      all.data <- rbind(all.data, matrix(data = c(i, j),
        nrow = n.table[i,j], ncol = 2, byrow = T))
    }
  }
There were 16 warnings (use warnings() to see them)
> # Note that warning messages will be generated since
> # n.table[i,j] = 0 sometimes

> all.data
      [,1] [,2]
 [1,]    1    2
 [2,]    1    2
 [3,]    1    2
 [4,]    1    2
 [5,]    1    2


 [6,]    1    2
 [7,]    1    2
 [8,]    1    8
 [9,]    1    9
[10,]    2    1
[11,]    2    2
[12,]    2    3
[13,]    2    4
[14,]    2    5
[15,]    2    6
[16,]    2    7
[17,]    3    2
[18,]    3    2
[19,]    3    2
[20,]    3    2
[21,]    3    2
[22,]    3    2
[23,]    3    2
[24,]    3    2

> save <- xtabs(~ all.data[,1] + all.data[,2])
> save
             all.data[, 2]
all.data[, 1] 1 2 3 4 5 6 7 8 9
            1 0 7 0 0 0 0 0 1 1
            2 1 1 1 1 1 1 1 0 0
            3 0 8 0 0 0 0 0 0 0
> rowSums(save)
1 2 3
9 7 8
> colSums(save)
 1  2  3  4  5  6  7  8  9
 1 16  1  1  1  1  1  1  1

This matches with the original contingency table so the raw data part worked. Note what the row and column marginal totals are!


Below is a further explanation of the code used to put the data into raw form:

The "c(i,j)" creates a vector containing the row (i) and column (j) index for the raw data format. The "matrix( ... )" part tells R to create a matrix with contents of "c(i,j)" and a number of rows of "n.table[i,j]", number of columns of "2", and do this by row (meaning, c(i,j) will be a 1x2 vector). Since c(i,j) is only one vector, R duplicates as many times as it is told to do by specifying "n.table[i,j]" as the number of rows (R calls this recycling). The "rbind( ... )" tells R to combine everything in "all.data" and "matrix( ... )" by row. Thus, everything that was in "all.data" comes first and the "matrix( ... )" is put below it. This is done for all rows and columns of the data through using the two for loops.

> #####################################################
> # Do one permutation to illustrate - i.e., find one X^2*
> set.seed(4088)
> all.data.star <- cbind(all.data[,1], sample(all.data[,2],
    replace = F))
> all.data.star
      [,1] [,2]
 [1,]    1    2
 [2,]    1    2
 [3,]    1    9
 [4,]    1    2
 [5,]    1    2
 [6,]    1    2
 [7,]    1    2
 [8,]    1    4
 [9,]    1    2


[10,]    2    2
[11,]    2    8
[12,]    2    2
[13,]    2    2
[14,]    2    2
[15,]    2    2
[16,]    2    2
[17,]    3    6
[18,]    3    2
[19,]    3    1
[20,]    3    3
[21,]    3    2
[22,]    3    2
[23,]    3    5
[24,]    3    7

> calc.stat <- chisq.test(all.data.star[,1], all.data.star[,2],
    correct = F)
Warning message:
Chi-squared approximation may be incorrect in:
chisq.test(all.data.star[, 1], all.data.star[, 2], correct = F)

> calc.stat$statistic
X-squared
 17.33036

> save.star <- xtabs(~ all.data.star[,1] + all.data.star[,2])
> save.star
                  all.data.star[, 2]
all.data.star[, 1] 1 2 3 4 5 6 7 8 9
                 1 0 7 0 1 0 0 0 0 1
                 2 0 6 0 0 0 0 0 1 0
                 3 1 3 1 0 1 1 1 0 0

> rowSums(save.star)
1 2 3
9 7 8

> colSums(save.star)
 1  2  3  4  5  6  7  8  9
 1 16  1  1  1  1  1  1  1


Notes:
- To illustrate one possible permutation of the data, the all.data.star data set is found. Notice how the column numbers are permuted using the sample() function. The row numbers are held fixed. The row and column numbers are then put back together to form a matrix. The xtabs(), rowSums(), and colSums() functions' output show the marginal totals are still the same as with the observed data. The X²* statistic is 17.33 for this permutation.

- What is the probability this one permutation would occur?

- Suppose I did a different permutation. Let set.seed(4089). For this seed, X²* = 16.46:

> save.star
                  all.data.star[, 2]
all.data.star[, 1] 1 2 3 4 5 6 7 8 9
                 1 0 6 0 0 1 1 1 0 0
                 2 1 4 1 0 0 0 0 0 1
                 3 0 6 0 1 0 0 0 1 0

> rowSums(save.star)
1 2 3
9 7 8
> colSums(save.star)
 1  2  3  4  5  6  7  8  9
 1 16  1  1  1  1  1  1  1

Now, I would like to repeat this process B = 1,000 times to get 1,000 different X²* values. These X²*'s will then represent my permutation distribution.


> #####################################################
> # A simple function and for loop to find the permutation
> # distribution.

> do.it <- function(data.set) {
    all.data.star <- cbind(data.set[,1], sample(data.set[,2],
      replace = F))
    chisq.test(all.data.star[,1], all.data.star[,2],
      correct = F)$statistic
  }

> summarize <- function(result.set, statistic, df, B) {
    par(mfrow = c(1,2))

    # Histogram
    hist(x = result.set, main = expression(paste("Histogram of ",
      X^2, " perm. dist.")), col = "blue", freq = FALSE)
    curve(expr = dchisq(x = x, df = df), col = "red", add = TRUE)
    segments(x0 = statistic, y0 = -10, x1 = statistic, y1 = 10)

    # QQ-Plot
    chi.quant <- qchisq(p = seq(from = 1/(B+1), to = 1 - 1/(B+1),
      by = 1/(B+1)), df = df)
    plot(x = sort(result.set), y = chi.quant,
      main = expression(paste("QQ-Plot of ", X^2, " perm. dist.")))
    abline(a = 0, b = 1)

    par(mfrow = c(1,1))

    # p-value
    mean(result.set >= statistic)
  }

> # Example use of do.it function
> do.it(data.set = all.data)
X-squared
 16.14286
Warning message:


Chi-squared approximation may be incorrect in:
chisq.test(all.data.star[, 1], all.data.star[, 2], correct = F)

> B <- 1000
> results <- matrix(data = NA, nrow = B, ncol = 1)

> set.seed(5333)
> for(i in 1:B) {
    results[i,1] <- do.it(all.data)
  }
There were 50 or more warnings (use warnings() to see the
first 50)

> summarize(results, x.sq$statistic,
    (nrow(n.table)-1)*(ncol(n.table)-1), B)
[1] 0.003

[Figure: histogram of the X²* permutation distribution with the χ²₁₆ density overlaid, and a QQ-plot of the sorted X²* values against χ²₁₆ quantiles.]

Notes:
- do.it() is a user written function! I have put the sampling part and the calculation of X²* inside of it. Notice the syntax used with the function. Also, notice the example where I used the function once with the all.data data set. AND, notice the last line of the function gives the X²* value. For all functions written in R, the last line defines what is returned as a result of the function. Notice here the value was printed without me asking for it to be printed!

- The for loop is used to repeat using the do.it() function B = 1,000 times. The results are then stored in a matrix called results. The warning messages just say that the χ² approximation to the distribution of each X²* is probably not appropriate.

- The set.seed() function is used before the for loop so that the results produced here can be reproduced by others. Notice that it only needs to be set once before the loop.

- summarize() is another user written function to help summarize the results in a histogram and a QQ-plot, and to find the p-value. Notice again the last line finds the p-value, and this is returned as the result of the function.

- Remember the χ²₁₆ distribution is used with X² for a "regular" Pearson chi-square test for independence here. The QQ-plot plots the quantiles of a χ²₁₆ distribution versus the sorted X²* values. If the values fell on a straight line at 45° from the origin, the X²* values would all be equal to the quantiles of a χ²₁₆ distribution, and the distribution of X² could be approximated by a χ²₁₆ distribution. As you can see, this does NOT happen


here! See qq_plot_chi.square.R for an example where a simulated sample from a chi-square distribution is used.

There is strong evidence against independence since the p-value is 0.003. Agresti (1996) found a p-value of 0.001.

Below are the actual X²* values obtained.

> table(round(results, 2))

15.66  15.8 15.89  15.9 16.14 16.19 16.33  16.4 16.46 16.71 16.74 16.75 16.83
   67   100    30   129    61     6    12    40   123    60    47     7     2
 16.9 17.19 17.33 17.71 17.83 17.89 18.02 18.05 18.24 18.48 18.66 19.23 19.69
   99     7    12     9    31    53     1    32    11     9    16     2     8
 19.9    20 20.08 20.14 20.46 21.02 21.19 22.29 22.31
    2     4     7     5     1     1     3     1     2

Again, one can think of the permutation test as a way to obtain an estimate of the probability distribution function of the discrete random variable X² under Ho. Based upon the above information, we obtain column 2 in the table below.

   X²* value   Permutation dist.        χ²₁₆ dist.
   15.66       67/1000 = 0.067          0.5231
   15.80       (67+100)/1000 = 0.167    0.5330
   15.89       0.197                    0.5393
     ⋮             ⋮                       ⋮
   21.19       0.997                    0.8287
   22.29       0.998                    0.8659
   22.31       1                        0.8665

The permutation distribution then replaces the chi-square distribution approximation for X². Below is P(X² ≤ ___ ) at each distinct X²* value, using the χ²₁₆ approximation.

> round(pchisq(q = as.numeric(names(table(round(results,2)))),
    df = (nrow(n.table)-1)*(ncol(n.table)-1)), 4)
 [1] 0.5231 0.5330 0.5393 0.5400 0.5568 0.5602 0.5698 0.5746 0.5787 0.5954
[11] 0.5974 0.5980 0.6033 0.6079 0.6266 0.6354 0.6588 0.6661 0.6696 0.6773
[21] 0.6790 0.6900 0.7035 0.7133 0.7431 0.7655 0.7752 0.7798 0.7834 0.7860
[31] 0.7998 0.8223 0.8287 0.8659 0.8665

A plot of the cumulative distribution functions is shown below (see program for code)

[Figure: "Compare CDFs" – step plot of the exact (permutation) CDF and the χ²₁₆ CDF as functions of X².]

Here’s a simpler way to get the p-value:> set.seed(7709)> chisq.test(n.table, correct = FALSE, simulate.p.value = TRUE, B = 1000)

Pearson’s Chi-squared test with simulated p-value (based on 1000 replicates)

data: n.table X-squared = 22.2857, df = NA, p-value = 0.001


Why did I show the harder way first?
- It will help you understand what the chisq.test() function is actually doing.
- You can not summarize the results from chisq.test() with a histogram or QQ-plot.
- A permutation test is a very general approach for inference. It can be used in many other settings which are not already programmed into a function like chisq.test()! A simple example: suppose you would like to use G² for the test of independence (see the sketch below).
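For instance, a do.it()-style function for G² might look like this (my own sketch, not from the course programs; it uses the raw-form all.data matrix built earlier, and cells with nij = 0 contribute 0 to the sum):

> do.it.G2 <- function(data.set) {
    all.data.star <- cbind(data.set[,1], sample(data.set[,2],
      replace = FALSE))
    tab <- table(all.data.star[,1], all.data.star[,2])
    mu.hat <- outer(rowSums(tab), colSums(tab))/sum(tab)
    2*sum(ifelse(tab > 0, tab*log(tab/mu.hat), 0))  # G^2*; 0*log(0) -> 0
  }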

Permutation tests are closely related to bootstrap hypothesis tests. See the additional Chapter 2 notes for how one can use functions in the boot package to do permutation tests.

Example: Larry Bird (bird_perm.R)

> # Create contingency table - notice the data is entered by columns
> n.table <- array(c(251, 48, 34, 5), dim = c(2,2),
    dimnames = list(First = c("made", "missed"),
                    Second = c("made", "missed")))
> n.table
        Second
First    made missed
  made    251     34
  missed   48      5

> x.sq <- chisq.test(n.table, correct = F)
> x.sq

        Pearson's Chi-squared test


data: n.table X-squared = 0.2727, df = 1, p-value = 0.6015

> #########################################################> #Find raw data

> all.data<-matrix(data = NA, nrow = 0, ncol = 2)

> #Put data in "raw" form
> for (i in 1:nrow(n.table)) {
    for (j in 1:ncol(n.table)) {
      all.data<-rbind(all.data, matrix(data = c(i,j), nrow = n.table[i,j],
        ncol = 2, byrow=T))
    }
  }

> #Check
> xtabs(~all.data[,1] + all.data[,2])

             all.data[, 2]
all.data[, 1]   1   2
            1 251  34
            2  48   5

Here’s how the test can be done using the methods demonstrated in the last example. When you do it yourself, you should only use one of these methods unless instructed to do otherwise.

Code for method #1: The same do.it() and summarize() functions are used here, so only partial results are given:

> summarize(result.set = results, statistic = x.sq$statistic,
    df = (nrow(n.table)-1)*(ncol(n.table)-1), B = B)
[1] 0.624


[Figure: "Histogram of X2 perm. dist." (x-axis: result.set, y-axis: Density) and "QQ-Plot of X2 perm. dist." (x-axis: sort(result.set), y-axis: chi.quant)]

> #Shows the different X^2* values
> table(round(results,2))

    0  0.17  0.27  0.78  0.98  1.82  2.13  3.31  3.71  5.23  5.74  7.59   8.2
  190   186   179   101   110    76    66    45    22    13     5     4     2
10.39
    1

> #chi-square app.
> round(pchisq(q = as.numeric(names(table(round(results,2)))),
    df = (nrow(n.table)-1)*(ncol(n.table)-1)), 4)
 [1] 0.0000 0.3199 0.3967 0.6229 0.6778 0.8227 0.8556 0.9311 0.9459 0.9778
[11] 0.9834 0.9941 0.9958 0.9987

Code and output for method #2:

> #Method #2
> set.seed(8912)
> chisq.test(n.table, correct = FALSE, simulate.p.value = TRUE, B = 1000)

Pearson's Chi-squared test with simulated p-value (based on 1000 replicates)

data:  n.table
X-squared = 0.2727, df = NA, p-value = 0.659

Notes: The p-value is 0.624 for method #1 and 0.659 for method #2, indicating there is not sufficient evidence against independence.

The Pearson chi-square test for independence had a p-value of 0.6015. This test and the permutation test generally agree because the sample size is large enough for the asymptotic chi-square distribution to serve as a good approximation to the distribution of X2. See the QQ-plot.

Notice the “discreteness” of the permutation distribution. Why do you think this is happening?

Below is a plot comparing the cumulative distribution functions (see program for code):


[Figure "Compare CDFs": step plot of the exact (permutation) CDF and the chi-square CDF versus X2; x-axis X2 from 0 to 10, y-axis CDF from 0.0 to 1.0; legend distinguishes the chi-square and Exact curves]

If you are interested in using exact inference for other problems outside of categorical data analysis, there is a nice software package that automates these tests even more than R does. The software, made by the Cytel Corporation, is called StatXact. Also, PROC FREQ in SAS has an EXACT option that will perform the test.


2.7 Association in three-way tables

More than two categorical variables may be of interest. In this setting, one can construct contingency tables summarizing the counts of these additional variables. Tests for independence between all of the variables, or between some of them conditional on the other variables, can be constructed. However, it is often more beneficial to look at these settings from a modeling point of view. Therefore, most of the discussion of these settings is postponed until we get to models that can handle them. Next is an introduction to what a contingency table looks like for three categorical variables and some important things to watch out for in this setting (e.g., Simpson's paradox).

In addition to the categorical variables, X and Y, suppose there is a third categorical variable, Z, with k=1,…,K levels.

Let nijk denote the cell count for the ith row, jth column, and kth layer of a “three-way” contingency table. If X has I=2 levels and Y has J=2 levels, then the following is the contingency table for the counts:

          Z=1                         Z=2                               Z=K
              Y                           Y                                 Y
            1     2                     1     2                           1     2
X  1     n111  n121  n1+1    X  1    n112  n122  n1+2    ...    X  1   n11K  n12K  n1+K
   2     n211  n221  n2+1       2    n212  n222  n2+2              2   n21K  n22K  n2+K
         n+11  n+21  n++1            n+12  n+22  n++2                  n+1K  n+2K  n++K


Notes:
 A third subscript is added to the n’s to denote the Z variable.
 There are other ways to display a “three-way” contingency table. See Table 2.10 of Agresti (2007) for an example.
 This table could easily be extended to an I×J×K table.
 The table could also have been written in terms of P(X = i, Y = j, Z = k) = πijk or pijk = nijk/n.
 μijk = E(nijk); i.e., the expected frequency for the ith row, jth column, and kth layer.
 Properties such as Σi Σj Σk πijk = 1 extend to the three-way table.
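As an aside, here is a minimal sketch of how a three-way table can be stored as an array in R and displayed in a flattened form (the counts and dimension names are hypothetical):

# A hypothetical 2x2x2 table of counts, displayed flattened with ftable()
n.3way <- array(data = c(10, 20, 30, 40, 50, 60, 70, 80), dim = c(2, 2, 2),
                dimnames = list(X = c("1", "2"), Y = c("1", "2"),
                                Z = c("1", "2")))
ftable(n.3way, row.vars = "X", col.vars = c("Z", "Y"))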

Z as the control variable

Z often plays the role of a “control” variable. In this case, the purpose is still to understand the relationship between X and Y while controlling for Z. In addition to Z being called a “layer” variable, Z is often called a “stratification” variable.

Think of this as the categorical equivalent of an analysis for a randomized complete block design. The levels of X are the treatments, Y is the response, and the levels of Z are the blocks.


Example: Salk vaccine clinical trials

We had the following contingency table set up previously for this example.

          Polio   Polio free
Vaccine
Placebo

X is the drug (vaccine, placebo) and Y is the polio result (polio, polio free). Z could denote the center where the clinical trial takes place. Thus, we could have the following table:

Omaha     Polio  Polio free    N.Y.      Polio  Polio free    L.A.      Polio  Polio free
Vaccine                        Vaccine                        Vaccine
Placebo                        Placebo                        Placebo

The table above is called a three-way table since three variables are represented in a contingency table format.

Odds ratios can also be found for a particular level of Z. Since there are three categorical variables, the variables of interest are put in the subscript along with the level of the conditioning variable. For 2×2×K tables,

θXY(k) = μ11kμ22k / (μ12kμ21k) = π11kπ22k / (π12kπ21k) for k = 1,…,K


One could also define P(X=i, Y=j, Z=k) / P(Z=k) = P(X=i, Y=j | Z=k) = πij|k and set up the odds ratios as

θXY|k = π11|kπ22|k / (π12|kπ21|k) for k = 1,…,K

Conditional and marginal associations

In the Salk vaccine clinical trial example, each individual 2×2 table that relates drug to polio result for a specific clinical trial center is called a “partial table”. This is because each table represents “part” of the 2×2×K table. The 2×2 table examined in Chapter 2 (before clinical center was known) is called a “marginal table” since it ignores clinical trial center.

Remember how the word “margins” was used earlier to denote summing over a categorical variable.

The partial table associations (relationships) between X and Y are also called “conditional associations” since they depend on the level of Z. An example of a conditional association measure is θXY|k. The marginal table associations between X and Y can be called “marginal associations”. An example of a marginal association is calculating the odds ratio in the 2×2 marginal table for the Salk vaccine clinical trial example:

θXY = π11+π22+ / (π12+π21+) and its estimate θ̂XY = n11+n22+ / (n12+n21+)

It is important to distinguish between the two types of association. The marginal association can be VERY different from the conditional associations! “Simpson’s paradox” occurs when this happens.

Example: Simpson’s paradox example

This example comes from Appleton et al. (American Statistician, 1996, p. 340-341). There were 1,314 women in the UK who participated in a survey in 1972-4 and were then followed up twenty years later. Information about their age (in 1972-4), smoking status, and survival status was recorded. Below is a marginal table summarizing the survival and smoking status.


              Survival status
              Dead   Alive
Smoker  Yes    139     443
        No     230     502

The estimated OR is 0.68, and a 95% confidence interval for the population OR is (0.54, 0.88). Therefore, the odds of being dead are between 0.54 and 0.88 times as large for smokers as for non-smokers, with 95% confidence. Alternatively, the odds of survival are between 1.14 and 1.87 times as large for smokers as for non-smokers, with 95% confidence. Given this information, which would you prefer to be: a smoker or a non-smoker?
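As a check, a minimal sketch (not from the notes) of the estimated OR and a Wald confidence interval for this table:

# Estimated OR and Wald interval for the marginal smoking/survival table
n <- matrix(data = c(139, 230, 443, 502), nrow = 2,
            dimnames = list(Smoker = c("yes", "no"),
                            Survival = c("dead", "alive")))
or.hat <- n[1,1] * n[2,2] / (n[1,2] * n[2,1])
se.log.or <- sqrt(sum(1/n))                            # SE of log(OR)
or.hat                                                 # approx. 0.68
exp(log(or.hat) + qnorm(c(0.025, 0.975)) * se.log.or)  # approx. (0.54, 0.88)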

Now, let’s take age into account.

Age = 18-24   Survival status    Age = 25-34   Survival status    Age = 35-44   Survival status
              Dead   Alive                     Dead   Alive                      Dead   Alive
Smoker  Yes     2      53        Smoker  Yes     3     121         Smoker  Yes    14      95
        No      1      61                No      5     152                 No      7     114
OR: 2.30                         OR: 0.75                          OR: 2.40

Age = 45-54   Survival status    Age = 55-64   Survival status    Age = 65-74   Survival status
              Dead   Alive                     Dead   Alive                      Dead   Alive
Smoker  Yes    27     103        Smoker  Yes    51      64         Smoker  Yes    29       7
        No     12      66                No     40      81                 No    101      28
OR: 1.44                         OR: 1.61                          OR: 1.15


Age = 75+     Survival status
              Dead   Alive
Smoker  Yes    13       0
        No     64       0
OR: 0.21
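A minimal sketch (not from the notes) that reproduces the stratum-specific ORs above and recovers the marginal table; a 0.5 continuity correction is applied to the 75+ table because of its zero cells:

# 2 x 2 x 7 array of the age-stratified counts (entered by columns)
partial <- array(data = c(2,1,53,61,    3,5,121,152,  14,7,95,114,
                          27,12,103,66, 51,40,64,81,  29,101,7,28,
                          13,64,0,0),
                 dim = c(2, 2, 7),
                 dimnames = list(Smoker = c("yes", "no"),
                                 Survival = c("dead", "alive"),
                                 Age = c("18-24", "25-34", "35-44",
                                         "45-54", "55-64", "65-74", "75+")))
or.k <- apply(partial, MARGIN = 3, FUN = function(m) {
  if (any(m == 0)) m <- m + 0.5   # continuity correction for zero cells
  m[1,1] * m[2,2] / (m[1,2] * m[2,1])
})
round(or.k, 2)                    # 2.30, 0.75, 2.40, 1.44, 1.61, 1.15, 0.21
margin.table(partial, margin = c(1, 2))   # marginal table (sums over age)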

Notice that most of these odds ratios are greater than 1, indicating the estimated odds of dying are larger for those who smoke than for those who do not smoke. For example, the estimated OR for the 35-44 age group is 2.40. This contradicts the result from the marginal association!

The most important thing to take away from this example: make sure you account for additional variables, because otherwise you could draw incorrect conclusions.

Read Agresti’s (2007) death penalty example for another illustration of Simpson’s paradox.

Conditional independence

X is independent of Y at EACH level of Z; i.e., independence in each partial table.

2010 Christopher R. Bilder

2.125

Page 126: Fsfsfsd - ChrisBilder.com€¦ · Web viewTea taster experiment This is a common example discussed often in statistics. See p. 46 of Agresti (2007) for the set up or “The Lady Tasting

More formally, conditional independence can be written as

θXY(1) = θXY(2) = … = θXY(K) = 1 for a 2×2×K table

or

πij|k = πi+|k π+j|k for each i=1,…,I, j=1,…,J, and k=1,…,K

What is πi+|k? πi+|k = Σj πij|k

Marginal independence: θXY = 1

See Agresti’s (2007) example for another reason why not to rely on the marginal table alone. There are cases where the marginal and conditional associations are the same; these are discussed in Chapter 7 with respect to loglinear models.

Homogeneous X-Y association

X and Y have the same level of association across all levels of Z.

For a 2×2×K table, this means the partial ORs are the same but not necessarily equal to 1: θXY(1) = θXY(2) = … = θXY(K). This will be important when discussing the Cochran-Mantel-Haenszel test in Chapter 4.
