Tutorial contingency table.doc

Page 1: Tutorial contingency table.doc

7/29/2019 Tutorial contingency table.doc

http://slidepdf.com/reader/full/tutorial-contingency-tabledoc 1/19

A crosstabulation displays the number of cases in each category defined by two or more grouping variables.

For example, we can display the number of sales employees in each division in each office location.

Crosstabulations are useful for summarizing categorical variables -- variables with a limited number of distinct categories.
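As a rough illustration of the idea, the sketch below counts cases for each combination of two grouping variables. The employee records, the category labels, and the `crosstab` helper are all hypothetical and are not taken from the tutorial's data set.

```python
from collections import Counter

# Hypothetical records: (division, office location) for each sales employee.
employees = [
    ("Consumer", "North"),
    ("Consumer", "North"),
    ("Consumer", "South"),
    ("Business", "North"),
    ("Business", "South"),
    ("Business", "South"),
    ("Business", "South"),
]


def crosstab(pairs):
    """Return (row_labels, col_labels, table), where table[i][j] is a case count."""
    counts = Counter(pairs)
    row_labels = sorted({row for row, _ in pairs})
    col_labels = sorted({col for _, col in pairs})
    table = [[counts[(row, col)] for col in col_labels] for row in row_labels]
    return row_labels, col_labels, table


rows, cols, table = crosstab(employees)

# Print the table with one line per division and one column per office.
print("".ljust(10) + "".join(col.ljust(8) for col in cols))
for label, row_counts in zip(rows, table):
    print(label.ljust(10) + "".join(str(n).ljust(8) for n in row_counts))
```

Each cell holds the number of cases that share one row category and one column category, which is the structure every statistic in the rest of this tutorial is computed from.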

The chi-square measures test the hypothesis that the row and column variables in a crosstabulation are independent.

A low significance value (typically below 0.05) indicates that there may be some relationship between the two variables.

While the chi-square measures may indicate that there is a relationship between two variables, they do not indicate the strength or direction of the relationship.
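The Pearson chi-square test of independence can be sketched in plain Python as follows. This is a minimal illustration, not the procedure the statistics package runs internally: the function names and the 2x2 table of counts are assumptions made for the example, and the code assumes every row and column total is greater than zero.

```python
import math


def chi_square_statistic(table):
    """Return (Pearson chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


def chi_square_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable with integer df >= 1."""
    half = x / 2.0
    if df % 2 == 0:
        q, k = math.exp(-half), 2  # closed form for df = 2
    else:
        q, k = math.erfc(math.sqrt(half)), 1  # closed form for df = 1
    # Step up two degrees of freedom at a time using the incomplete-gamma recurrence.
    while k < df:
        q += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return q


# Hypothetical 2x2 table of counts.
observed = [[20, 30], [30, 20]]
stat, df = chi_square_statistic(observed)
p_value = chi_square_sf(stat, df)
print(f"chi-square = {stat:.3f}, df = {df}, significance = {p_value:.4f}")
```

The significance value only says how surprising the table would be under independence; as noted above, it carries no information about how strong the relationship is.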

The nominal directional measures indicate both the strength and significance of the relationship between the row and column variables of a crosstabulation.

The value of each statistic can range from 0 to 1 and indicates the proportional reduction in error in predicting the value of one variable based on the value of the other variable.

For example, a test statistic value of 0.021 indicates that you have only reduced the error rate by 2.1% over what you could expect by random chance.

In this example, the low significance values for both tau and the uncertainty coefficient indicate that there is a relationship between the two variables...

But the low values for both test statistics indicate that the relationship between the two variables is a fairly weak one.

The nominal directional measures are appropriate when both variables are nominal, categorical variables.
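To make the proportional-reduction-in-error reading concrete, here is a small sketch of Goodman and Kruskal's tau with the column variable treated as the one being predicted. The function name and the table of counts are hypothetical; the formula is the standard one, but this is an illustration rather than the package's own implementation.

```python
def goodman_kruskal_tau(table):
    """Tau for predicting the column variable from the row variable."""
    n = sum(sum(row) for row in table)
    col_totals = [sum(col) for col in zip(*table)]
    # Error rate when guessing the column category from its overall distribution.
    baseline_error = 1.0 - sum((total / n) ** 2 for total in col_totals)
    # Error rate when guessing from the column distribution within each row.
    conditional_error = 1.0 - sum(
        cell ** 2 / sum(row) for row in table if sum(row) > 0 for cell in row
    ) / n
    # Proportion of the baseline error removed by knowing the row category.
    return (baseline_error - conditional_error) / baseline_error


# Hypothetical 2x2 table of counts.
observed = [[20, 30], [30, 20]]
tau = goodman_kruskal_tau(observed)
print(f"tau = {tau:.3f}")
```

A value near 0 means that knowing the row category barely improves the prediction, even when the accompanying significance value is small.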

Somers' d is an ordinal directional measure that indicates the significance, strength and direction of the relationship between the row and column variables of a crosstabulation.

A low significance value (typically less than 0.05) indicates that there is a relationship between the two variables.

The value of the statistic can range from -1 to 1.

Negative values indicate a negative relationship, and positive values indicate a positive relationship.

In this example, the low significance values for Somers' d indicate that there is a relationship between the two variables...

But the low values for the test statistic indicate that the relationship between the two variables is a fairly weak one.

Somers' d is appropriate when both variables are ordinal, categorical variables.
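Somers' d can be computed directly from an ordered crosstabulation by counting concordant and discordant pairs of cases. The sketch below treats the column variable as the dependent one; the function names and the 3x3 table are hypothetical, and rows and columns are assumed to be listed in increasing category order.

```python
def concordance_counts(table):
    """Return (concordant, discordant, tied_on_row_only, tied_on_column_only) pair counts."""
    concordant = discordant = tied_row_only = tied_col_only = 0
    n_rows, n_cols = len(table), len(table[0])
    for r1 in range(n_rows):
        for c1 in range(n_cols):
            for r2 in range(r1, n_rows):
                for c2 in range(n_cols):
                    pairs = table[r1][c1] * table[r2][c2]
                    if r2 == r1:
                        if c2 > c1:
                            tied_row_only += pairs  # same row, different column
                    elif c2 > c1:
                        concordant += pairs  # higher on both variables
                    elif c2 < c1:
                        discordant += pairs  # higher on one, lower on the other
                    else:
                        tied_col_only += pairs  # same column, different row
    return concordant, discordant, tied_row_only, tied_col_only


def somers_d_column_dependent(table):
    """Somers' d with the column variable dependent and the row variable independent."""
    concordant, discordant, _, tied_col_only = concordance_counts(table)
    # The denominator counts every pair that is not tied on the row variable.
    return (concordant - discordant) / (concordant + discordant + tied_col_only)


# Hypothetical 3x3 table of counts with ordered categories.
observed = [[10, 5, 2], [4, 8, 5], [1, 6, 9]]
d = somers_d_column_dependent(observed)
print(f"Somers' d (column dependent) = {d:.3f}")
```

Swapping the roles of the two variables changes only which kind of tie enters the denominator, which is why the measure is called directional.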

The nominal symmetric measures indicate both the strength and significance of the relationship between the row and column variables of a crosstabulation.

The value of each statistic can range from 0 to 1.

Phi is only appropriate for 2x2 tables.

In this example, the low significance values for both Cramer's V and the contingency coefficient indicate that there is a relationship between the two variables...

But the low values for the test statistics indicate that the relationship between the two variables is a fairly weak one.

The nominal symmetric measures are appropriate when both variables are nominal, categorical variables.
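Phi, Cramer's V and the contingency coefficient are all rescalings of the Pearson chi-square statistic. The following sketch shows the standard formulas; the function name and the 2x3 table of counts are hypothetical, and every row and column total is assumed to be greater than zero.

```python
import math


def nominal_symmetric_measures(table):
    """Return (phi, cramers_v, contingency_coefficient) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Pearson chi-square statistic against the independence model.
    chi2 = sum(
        (table[i][j] - row_totals[i] * col_totals[j] / n) ** 2
        / (row_totals[i] * col_totals[j] / n)
        for i in range(len(table))
        for j in range(len(table[0]))
    )
    smaller_dim = min(len(table), len(table[0])) - 1
    phi = math.sqrt(chi2 / n)
    cramers_v = math.sqrt(chi2 / (n * smaller_dim))
    contingency_coefficient = math.sqrt(chi2 / (chi2 + n))
    return phi, cramers_v, contingency_coefficient


# Hypothetical 2x3 table of counts.
observed = [[10, 20, 30], [30, 20, 10]]
phi, cramers_v, contingency = nominal_symmetric_measures(observed)
print(f"phi = {phi:.3f}, Cramer's V = {cramers_v:.3f}, C = {contingency:.3f}")
```

Dividing by the smaller table dimension is what keeps Cramer's V within 0 to 1 for tables larger than 2x2, where phi itself can exceed 1.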

The ordinal symmetric measures indicate the significance, strength and direction of the relationship between the row and column variables of a crosstabulation.

A low significance value (typically less than 0.05) indicates that there is a relationship between the two variables.

The values of the test statistics can range from -1 to 1.

Negative values indicate a negative relationship, and positive values indicate a positive relationship.

In this example, the low significance values indicate that there is a relationship between the two variables...

But the low values for the test statistics indicate that the relationship between the two variables is a fairly weak one.

The ordinal symmetric measures are appropriate when both variables are ordinal, categorical variables.
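The tutorial does not name the individual ordinal symmetric statistics in the text, so the sketch below assumes two common ones, Goodman and Kruskal's gamma and Kendall's tau-b, and computes them from raw paired ordinal scores. The function name and the data are hypothetical.

```python
import math
from itertools import combinations


def gamma_and_tau_b(x, y):
    """Return (gamma, tau_b) for paired ordinal scores x and y."""
    concordant = discordant = tied_x_only = tied_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied on both variables: ignored by both statistics
        if dx == 0:
            tied_x_only += 1
        elif dy == 0:
            tied_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    surplus = concordant - discordant
    untied = concordant + discordant
    gamma = surplus / untied  # ignores all tied pairs
    tau_b = surplus / math.sqrt((untied + tied_x_only) * (untied + tied_y_only))
    return gamma, tau_b


# Hypothetical ordinal scores (1 = low, 3 = high) for six cases.
x_scores = [1, 1, 2, 2, 3, 3]
y_scores = [1, 2, 1, 3, 2, 3]
gamma, tau_b = gamma_and_tau_b(x_scores, y_scores)
print(f"gamma = {gamma:.3f}, tau-b = {tau_b:.3f}")
```

Because gamma drops tied pairs entirely while tau-b keeps them in the denominator, gamma is never smaller in magnitude than tau-b on the same data.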

The relative risk estimate is a measure of association between the presence or absence of a factor and the occurrence of an event.

For example, you could examine the relationship between smoking and lung cancer.

In this hypothetical example, the risk of lung cancer is more than twice as high among smokers as among non-smokers.

The 95% confidence interval for the relative risk does not include 1, indicating that there is a significant difference in the occurrence of lung cancer between smokers and non-smokers.
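A relative risk and its approximate 95% confidence interval can be sketched as below. The counts are invented for illustration and are not the tutorial's figures; the interval uses the usual large-sample standard error of the log relative risk, which may differ in detail from what a given statistics package reports.

```python
import math


def relative_risk(exposed_events, exposed_total, unexposed_events, unexposed_total, z=1.96):
    """Return (relative risk, lower bound, upper bound) using a log-scale normal interval."""
    risk_exposed = exposed_events / exposed_total
    risk_unexposed = unexposed_events / unexposed_total
    rr = risk_exposed / risk_unexposed
    # Large-sample standard error of log(RR).
    se_log_rr = math.sqrt(
        1 / exposed_events - 1 / exposed_total
        + 1 / unexposed_events - 1 / unexposed_total
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical counts: 30 of 100 smokers and 12 of 100 non-smokers develop the disease.
rr, lower, upper = relative_risk(30, 100, 12, 100)
print(f"relative risk = {rr:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
print("interval excludes 1:", lower > 1 or upper < 1)
```

An interval that lies entirely above 1 corresponds to the situation described above, where the risk is significantly higher in the group with the factor present.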

The Breslow-Day and Tarone's statistics test the homogeneity of the odds ratio across categories of the layer variable.

A low significance value (typically below 0.05) indicates that the odds ratio varies across categories of the layer variable.

Cochran's and the Mantel-Haenszel statistics are designed to test for independence between a binary factor variable and a binary response variable. The statistics are adjusted for covariate patterns defined by one or more control variables.

A low significance value (typically below 0.05) indicates that there may be some relationship between the two variables.

While the measures may indicate that there is a relationship between two variables, they do not indicate the strength or direction of the relationship.

This is essentially a t-test for the value of the common odds ratio.

The estimate and natural log of the estimate of the common odds ratio are normally distributed for sufficiently large data sets.

A low significance value (typically below 0.05) indicates that the hypothesized value of the common odds ratio is probably incorrect.
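The common odds ratio discussed above is usually estimated by pooling the 2x2 tables from each category of the layer variable. A minimal sketch of the Mantel-Haenszel point estimate follows; the function name and the two strata of counts are hypothetical, and the significance test for a hypothesized value is not reproduced here.

```python
def mantel_haenszel_odds_ratio(strata):
    """Pooled odds ratio over 2x2 tables ((a, b), (c, d)), one table per layer category."""
    numerator = denominator = 0.0
    for (a, b), (c, d) in strata:
        n = a + b + c + d
        numerator += a * d / n    # weighted contribution of the concordant cells
        denominator += b * c / n  # weighted contribution of the discordant cells
    return numerator / denominator


# Hypothetical layers: rows are factor present/absent, columns are event yes/no.
strata = [
    ((10, 20), (5, 40)),   # layer 1, odds ratio 4.0
    ((15, 10), (10, 25)),  # layer 2, odds ratio 3.75
]
common_or = mantel_haenszel_odds_ratio(strata)
print(f"Mantel-Haenszel common odds ratio = {common_or:.3f}")
```

The pooled value falls between the per-layer odds ratios, which is only a meaningful summary when the homogeneity tests give no sign that the odds ratio varies across layers.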