Tutorial contingency table.doc
7/29/2019 Tutorial contingency table.doc
http://slidepdf.com/reader/full/tutorial-contingency-tabledoc 1/19
A crosstabulation displays the number of cases in each category defined by two or
more grouping variables.
For example, we can display the number of sales employees in each division in each
office location.
Crosstabulations are useful for summarizing categorical variables -- variables with a
limited number of distinct categories.
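As a minimal sketch of the idea, the counts in a crosstabulation can be built from raw records with the Python standard library. The employee records, the division and office names, and the crosstab helper below are all hypothetical and only illustrate the layout.

```python
from collections import Counter

# Hypothetical records: (division, office location) for each sales employee.
employees = [
    ("Consumer", "East"),
    ("Consumer", "East"),
    ("Consumer", "West"),
    ("Business", "East"),
    ("Business", "West"),
    ("Business", "West"),
    ("Business", "West"),
]


def crosstab(pairs):
    """Return (row_labels, col_labels, table) where table[i][j] is a case count."""
    counts = Counter(pairs)
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    # Counter returns 0 for combinations that never occur.
    table = [[counts[(r, c)] for c in cols] for r in rows]
    return rows, cols, table


rows, cols, table = crosstab(employees)
print("         ", cols)
for label, row in zip(rows, table):
    print(label.ljust(9), row)
```

Each row of the printed table is one division and each column is one office location.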
The chi-square measures test the hypothesis that the row and column variables in a
crosstabulation are independent.
A low significance value (typically below 0.05) indicates that there may be some
relationship between the two variables.
While the chi-square measures may indicate that there is a relationship between two
variables, they do not indicate the strength or direction of the relationship.
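The Pearson chi-square statistic behind this test can be sketched in a few lines. The 2x2 counts below are made up for illustration, and the p-value helper is only valid for 1 degree of freedom (a 2x2 table); larger tables need a general chi-square distribution function, which this sketch does not include.

```python
import math


def pearson_chi_square(table):
    """Return (statistic, degrees of freedom) for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df


def p_value_df1(stat):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


table = [[30, 20], [20, 30]]  # hypothetical counts
stat, df = pearson_chi_square(table)
p = p_value_df1(stat)
print(f"chi-square = {stat:.3f}, df = {df}, p = {p:.4f}")
```

The statistic grows with the sample size, which is one reason it says nothing about how strong the relationship is.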
The nominal directional measures indicate both the strength and significance of the
relationship between the row and column variables of a crosstabulation.
The value of each statistic can range from 0 to 1 and indicates the proportional
reduction in error in predicting the value of one variable based on the value of the
other variable.
For example, a test statistic value of 0.021 indicates that you have only reduced the
error rate by 2.1% over what you could expect by random chance.
In this example, the low significance values for both tau and the uncertainty
coefficient indicate that there is a relationship between the two variables...
But the low values for both test statistics indicate that the relationship between the
two variables is a fairly weak one.
The nominal directional measures are appropriate when both variables are nominal,
categorical variables.
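The proportional-reduction-in-error idea can be made concrete with a short sketch of three nominal directional measures, using their standard textbook formulas: lambda, Goodman and Kruskal's tau, and the uncertainty coefficient. Here the column variable is treated as the one being predicted, and the counts are hypothetical. This is an illustration, not the tutorial's own computation.

```python
import math


def _margins(table):
    rows = [sum(r) for r in table]
    cols = [sum(c) for c in zip(*table)]
    return rows, cols, sum(rows)


def gk_lambda(table):
    """Lambda: error reduction when always guessing the modal column category."""
    rows, cols, n = _margins(table)
    best_without = max(cols)                 # correct guesses ignoring the row
    best_with = sum(max(r) for r in table)   # correct guesses knowing the row
    return (best_with - best_without) / (n - best_without)


def gk_tau(table):
    """Goodman and Kruskal's tau: error reduction under proportional guessing."""
    rows, cols, n = _margins(table)
    within = sum(x * x / rows[i] for i, r in enumerate(table) for x in r if rows[i])
    marginal = sum(c * c for c in cols)
    return (n * within - marginal) / (n * n - marginal)


def uncertainty_coefficient(table):
    """Share of the column variable's entropy explained by the row variable."""
    rows, cols, n = _margins(table)
    h_col = -sum(c / n * math.log(c / n) for c in cols if c)
    h_col_given_row = -sum(
        x / n * math.log(x / rows[i]) for i, r in enumerate(table) for x in r if x
    )
    return (h_col - h_col_given_row) / h_col


table = [[30, 20], [20, 30]]  # hypothetical counts
lam = gk_lambda(table)
tau = gk_tau(table)
unc = uncertainty_coefficient(table)
print(f"lambda = {lam:.3f}, tau = {tau:.3f}, uncertainty = {unc:.3f}")
```

All three values lie between 0 and 1, and values near 0 correspond to the weak relationships described above.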
Somers' d is an ordinal directional measure that indicates the significance, strength
and direction of the relationship between the row and column variables of a
crosstabulation.
A low significance value (typically less than 0.05) indicates that there is a relationship
between the two variables.
The value of the statistic can range from -1 to 1.
Negative values indicate a negative relationship, and positive values indicate a
positive relationship.
In this example, the low significance values for Somers' d indicate that there is a
relationship between the two variables...
But the low values for the test statistic indicate that the relationship between the two
variables is a fairly weak one.
Somers' d is appropriate when both variables are ordinal, categorical variables.
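A sketch of how Somers' d can be computed from a table whose rows and columns are both in ascending category order. It uses the standard definition based on concordant and discordant pairs, with the column variable treated as dependent. The 3x3 counts are hypothetical.

```python
def concordant_discordant(table):
    """Count concordant (C) and discordant (D) pairs in an ordered r x c table."""
    n_rows, n_cols = len(table), len(table[0])
    concordant = discordant = 0
    for i in range(n_rows):
        for j in range(n_cols):
            for k in range(i + 1, n_rows):
                # Cells below and to the right are ordered the same way on both variables.
                concordant += table[i][j] * sum(table[k][j + 1:])
                # Cells below and to the left are ordered in opposite ways.
                discordant += table[i][j] * sum(table[k][:j])
    return concordant, discordant


def somers_d_columns_dependent(table):
    """Somers' d with the column variable dependent on the row variable."""
    c, d = concordant_discordant(table)
    row_totals = [sum(row) for row in table]
    n = sum(row_totals)
    # Number of pairs that are not tied on the row (independent) variable.
    untied_on_rows = (n * n - sum(t * t for t in row_totals)) / 2
    return (c - d) / untied_on_rows


table = [[20, 10, 5], [10, 15, 10], [5, 10, 15]]  # hypothetical counts
c, d = concordant_discordant(table)
somers = somers_d_columns_dependent(table)
print(f"C = {c}, D = {d}, Somers' d = {somers:.3f}")
```

The sign of the result gives the direction of the relationship and its magnitude gives the strength.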
The nominal symmetric measures indicate both the strength and significance of the
relationship between the row and column variables of a crosstabulation.
The value of each statistic can range from 0 to 1.
Phi is only appropriate for 2x2 tables.
In this example, the low significance values for both Cramer's V and the contingency
coefficient indicate that there is a relationship between the two variables...
But the low values for the test statistics indicate that the relationship between the two
variables is a fairly weak one.
The nominal symmetric measures are appropriate when both variables are nominal,
categorical variables.
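The three nominal symmetric measures named above are all rescalings of the Pearson chi-square statistic. The sketch below uses their standard textbook formulas and a hypothetical 2x2 table; the function names are illustrative.

```python
import math


def chi_square(table):
    """Pearson chi-square statistic for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return sum(
        (obs - row_totals[i] * col_totals[j] / n) ** 2
        / (row_totals[i] * col_totals[j] / n)
        for i, row in enumerate(table)
        for j, obs in enumerate(row)
    )


def nominal_symmetric_measures(table):
    """Return (phi, Cramer's V, contingency coefficient)."""
    chi2 = chi_square(table)
    n = sum(sum(row) for row in table)
    k = min(len(table), len(table[0]))  # smaller of the row and column counts
    phi = math.sqrt(chi2 / n)
    cramers_v = math.sqrt(chi2 / (n * (k - 1)))
    contingency = math.sqrt(chi2 / (chi2 + n))
    return phi, cramers_v, contingency


table = [[30, 20], [20, 30]]  # hypothetical counts
phi, cramers_v, contingency = nominal_symmetric_measures(table)
print(f"phi = {phi:.3f}, V = {cramers_v:.3f}, C = {contingency:.3f}")
```

For a 2x2 table phi and Cramer's V coincide, which is consistent with phi being reserved for that case.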
The ordinal symmetric measures indicate the significance, strength and direction of
the relationship between the row and column variables of a crosstabulation.
A low significance value (typically less than 0.05) indicates that there is a relationship
between the two variables.
The values of the test statistics can range from -1 to 1.
Negative values indicate a negative relationship, and positive values indicate a
positive relationship.
In this example, the low significance values indicate that there is a relationship
between the two variables...
But the low values for the test statistics indicate that the relationship between the two
variables is a fairly weak one.
The ordinal symmetric measures are appropriate when both variables are ordinal,
categorical variables.
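The text does not name the individual ordinal symmetric measures, so the sketch below assumes two common ones, gamma and Kendall's tau-b, computed from concordant and discordant pairs. The counts are hypothetical and chosen so that high row categories go with low column categories, giving a negative value.

```python
import math


def concordant_discordant(table):
    """Count concordant (C) and discordant (D) pairs in an ordered r x c table."""
    n_rows, n_cols = len(table), len(table[0])
    concordant = discordant = 0
    for i in range(n_rows):
        for j in range(n_cols):
            for k in range(i + 1, n_rows):
                concordant += table[i][j] * sum(table[k][j + 1:])
                discordant += table[i][j] * sum(table[k][:j])
    return concordant, discordant


def gamma_and_tau_b(table):
    """Return (gamma, Kendall's tau-b); both treat the two variables symmetrically."""
    c, d = concordant_discordant(table)
    n = sum(sum(row) for row in table)
    row_sq = sum(sum(row) ** 2 for row in table)
    col_sq = sum(sum(col) ** 2 for col in zip(*table))
    gamma = (c - d) / (c + d)  # ignores tied pairs entirely
    # tau-b corrects for pairs tied on the row variable or the column variable.
    tau_b = 2 * (c - d) / math.sqrt((n * n - row_sq) * (n * n - col_sq))
    return gamma, tau_b


table = [[5, 10, 15], [10, 15, 10], [20, 10, 5]]  # hypothetical counts
gamma, tau_b = gamma_and_tau_b(table)
print(f"gamma = {gamma:.3f}, tau-b = {tau_b:.3f}")
```

Gamma is never smaller in magnitude than tau-b on the same table because it leaves tied pairs out of the denominator.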
The relative risk estimate is a measure of association between the presence or absence
of a factor and the occurrence of an event.
For example, you could examine the relationship between smoking and lung cancer.
In this hypothetical example, the risk of lung cancer is more than twice as high
among smokers as among non-smokers (a relative risk above 2).
And the 95% confidence interval for the relative risk does not include 1,
indicating that there is a significant difference in the occurrence of lung cancer
between smokers and non-smokers.
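A minimal sketch of the relative risk and its 95% confidence interval, using the usual large-sample standard error of the log relative risk. The counts are invented for illustration and are not the figures from the tutorial's example.

```python
import math


def relative_risk(a, b, c, d, z=1.96):
    """Relative risk of the event for the exposed group versus the unexposed group.

    a, b: exposed cases with and without the event
    c, d: unexposed cases with and without the event
    z:    normal quantile for the interval (1.96 gives roughly 95%)
    """
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    rr = risk_exposed / risk_unexposed
    # Large-sample standard error of ln(RR).
    se_log = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


# Hypothetical counts: 30 of 100 smokers and 12 of 100 non-smokers develop the disease.
rr, lower, upper = relative_risk(30, 70, 12, 88)
print(f"RR = {rr:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

An interval that lies entirely above 1 corresponds to the significant difference described above.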
The Breslow-Day and Tarone's statistics test the homogeneity of the odds ratio across
categories of the layer variable.
A low significance value (typically below 0.05) indicates that the odds ratio
varies across categories of the layer variable.
Cochran's and the Mantel-Haenszel statistics are designed to test for independence
between a binary factor variable and a binary response variable. The statistics are
adjusted for covariate patterns defined by one or more control variables.
A low significance value (typically below 0.05) indicates that there may be some
relationship between the two variables.
While the measures may indicate that there is a relationship between two variables,
they do not indicate the strength or direction of the relationship.
This is essentially a t-test for the value of the common odds ratio.
The estimate and natural log of the estimate of the common odds ratio are normally
distributed for sufficiently large data sets.
A low significance value (typically below 0.05) indicates that the hypothesized value
of the common odds ratio is probably incorrect.
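To make the stratified analysis concrete, the sketch below computes the Mantel-Haenszel estimate of the common odds ratio and the Mantel-Haenszel test statistic (with continuity correction) for a set of 2x2 tables, one per category of the layer variable. The formulas are the standard textbook ones, the two strata are hypothetical, and the homogeneity tests and the test against a hypothesized odds ratio are not covered.

```python
import math


def mantel_haenszel(strata):
    """Return (common odds ratio estimate, MH chi-square statistic, p-value).

    strata: list of 2x2 tables [[a, b], [c, d]], one per layer category.
    """
    num = den = 0.0          # pieces of the common odds ratio estimate
    diff = variance = 0.0    # pieces of the test statistic
    for (a, b), (c, d) in strata:
        n = a + b + c + d
        num += a * d / n
        den += b * c / n
        r1, r2 = a + b, c + d    # row totals
        c1, c2 = a + c, b + d    # column totals
        diff += a - r1 * c1 / n  # observed minus expected count in cell a
        variance += r1 * r2 * c1 * c2 / (n * n * (n - 1))
    odds_ratio = num / den
    stat = (abs(diff) - 0.5) ** 2 / variance  # continuity-corrected, 1 df
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return odds_ratio, stat, p_value


# Hypothetical strata; each has an odds ratio of exactly 3.
strata = [
    [[10, 20], [5, 30]],
    [[15, 10], [10, 20]],
]
odds_ratio, stat, p_value = mantel_haenszel(strata)
print(f"common OR = {odds_ratio:.2f}, MH chi-square = {stat:.2f}, p = {p_value:.4f}")
```

Because both hypothetical strata share the same odds ratio, the pooled estimate equals that value, which is the situation the homogeneity tests are meant to check.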