Data analysis: cross-tabulation
GAP Toolkit 5 Training in basic drug abuse data management and analysis
Training session 11
Objectives
• To introduce cross-tabulation as a method of investigating the relationship between two categorical variables
• To describe the SPSS facilities for cross-tabulation
• To discuss a range of simple statistics to describe the relationship between two categorical variables
• To reinforce the range of SPSS skills learnt to date
Bivariate analysis
• The relationship between two variables
• A two-way table:
– Rows: categories of one variable
– Columns: categories of the second variable
Gender

| | | Frequency | Percent | Valid Percent | Cumulative Percent |
|---|---|---|---|---|---|
| Valid | Male | 1251 | 79.6 | 79.9 | 79.9 |
| | Female | 314 | 20.0 | 20.1 | 100.0 |
| | Total | 1565 | 99.6 | 100.0 | |
| Missing | System | 6 | .4 | | |
| Total | | 1571 | 100.0 | | |
Mode of ingestion Drug 1

| | | Frequency | Percent | Valid Percent | Cumulative Percent |
|---|---|---|---|---|---|
| Valid | Swallow | 794 | 50.5 | 51.0 | 51.0 |
| | Smoke | 634 | 40.4 | 40.7 | 91.7 |
| | Snort | 62 | 3.9 | 4.0 | 95.6 |
| | Inject | 30 | 1.9 | 1.9 | 97.6 |
| | 12.00 | 2 | .1 | .1 | 97.7 |
| | 15.00 | 1 | .1 | .1 | 97.8 |
| | 23.00 | 10 | .6 | .6 | 98.4 |
| | 24.00 | 11 | .7 | .7 | 99.1 |
| | 25.00 | 5 | .3 | .3 | 99.4 |
| | 34.00 | 4 | .3 | .3 | 99.7 |
| | 234.00 | 5 | .3 | .3 | 100.0 |
| | Total | 1558 | 99.2 | 100.0 | |
| Missing | System | 13 | .8 | | |
| Total | | 1571 | 100.0 | | |

Out-of-range values: the rows 12.00 to 234.00 (note that none of the digits are > 5)
Cleaning Mode1
• Save a copy of the original
• Recode the out-of-range values into a new value (for example, 12, 15, 23, 24, 25, 34, 234 into the value 8)
• Set the new value as a user-defined missing value (for example, 8 is declared a missing value and given the label “Out-of-range”)
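The cleaning steps above can be sketched outside SPSS in a few lines of plain Python. This is only an illustration: the numeric codes 1 to 4 for the valid categories, the function name `clean_mode` and the sample values are assumptions, not taken from the training data set.

```python
# Labels come from the frequency table; the numeric codes 1-4 are an assumption.
VALID_CODES = {1: "Swallow", 2: "Smoke", 3: "Snort", 4: "Inject"}
OUT_OF_RANGE = 8  # user-defined missing value, labelled "Out-of-range"


def clean_mode(value):
    """Keep valid codes, map anything else to 8, leave system-missing as None."""
    if value is None:
        return None
    return value if value in VALID_CODES else OUT_OF_RANGE


# Made-up sample values (hypothetical, not the real data).
mode1_original = [1, 2, 12, 4, None, 234, 3, 25]

# Recode into a new list so the original is preserved as a copy.
mode1_clean = [clean_mode(v) for v in mode1_original]

# Treat the user-defined missing value as missing when counting valid cases.
valid_cases = [v for v in mode1_clean if v is not None and v != OUT_OF_RANGE]

print(mode1_clean)       # [1, 2, 8, 4, None, 8, 3, 8]
print(len(valid_cases))  # 4
```

Recoding into a new variable rather than overwriting mirrors the first step on the slide: the original values stay available if the out-of-range codes need to be re-examined later.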
Mode of ingestion Drug 1

| | | Frequency | Percent | Valid Percent | Cumulative Percent |
|---|---|---|---|---|---|
| Valid | Swallow | 794 | 50.5 | 52.2 | 52.2 |
| | Smoke | 634 | 40.4 | 41.7 | 93.9 |
| | Snort | 62 | 3.9 | 4.1 | 98.0 |
| | Inject | 30 | 1.9 | 2.0 | 100.0 |
| | Total | 1520 | 96.8 | 100.0 | |
| Missing | Out-of-range | 38 | 2.4 | | |
| | System | 13 | .8 | | |
| | Total | 51 | 3.2 | | |
| Total | | 1571 | 100.0 | | |
Mode of ingestion Drug1 * Gender cross-tabulation

Count

| Mode of ingestion Drug1 | Male | Female | Total |
|---|---|---|---|
| Swallow | 600 | 194 | 794 |
| Smoke | 553 | 77 | 630 |
| Snort | 44 | 17 | 61 |
| Inject | 20 | 10 | 30 |
| Total | 1217 | 298 | 1515 |

Joint frequencies: the interior cells. Row totals: the Total column. Column totals: the Total row. Grand total: 1515.
Percentages
• The difference in sample size for men and women makes comparison of raw numbers difficult
• Percentages facilitate comparison by standardizing the scale
• There are three options for the denominator of the percentage:
– Grand total
– Row total
– Column total
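The three denominators can be sketched in plain Python using the joint frequencies from the Mode of ingestion Drug1 * Gender cross-tabulation. This is an illustrative sketch rather than the SPSS procedure; the names `counts`, `percentages` and so on are made up for the example.

```python
# Joint frequencies from the Mode of ingestion Drug1 * Gender cross-tabulation.
counts = {
    "Swallow": {"Male": 600, "Female": 194},
    "Smoke": {"Male": 553, "Female": 77},
    "Snort": {"Male": 44, "Female": 17},
    "Inject": {"Male": 20, "Female": 10},
}
genders = ("Male", "Female")

row_totals = {mode: sum(cells.values()) for mode, cells in counts.items()}
col_totals = {g: sum(cells[g] for cells in counts.values()) for g in genders}
grand_total = sum(row_totals.values())


def percentages(denominator):
    """Build a percentage table; denominator(mode, gender) picks the base."""
    return {
        mode: {g: round(100 * n / denominator(mode, g), 1) for g, n in cells.items()}
        for mode, cells in counts.items()
    }


pct_of_total = percentages(lambda mode, g: grand_total)          # grand total
pct_within_mode = percentages(lambda mode, g: row_totals[mode])  # row total
pct_within_gender = percentages(lambda mode, g: col_totals[g])   # column total

print(pct_of_total["Swallow"])       # {'Male': 39.6, 'Female': 12.8}
print(pct_within_mode["Swallow"])    # {'Male': 75.6, 'Female': 24.4}
print(pct_within_gender["Swallow"])  # {'Male': 49.3, 'Female': 65.1}
```

Only the choice of denominator changes between the three tables; the counts themselves are identical.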
Mode of ingestion Drug1 * Gender cross-tabulation

| Mode of ingestion Drug1 | | Male | Female | Total |
|---|---|---|---|---|
| Swallow | Count | 600 | 194 | 794 |
| | % of Total | 39.6% | 12.8% | 52.4% |
| Smoke | Count | 553 | 77 | 630 |
| | % of Total | 36.5% | 5.1% | 41.6% |
| Snort | Count | 44 | 17 | 61 |
| | % of Total | 2.9% | 1.1% | 4.0% |
| Inject | Count | 20 | 10 | 30 |
| | % of Total | 1.3% | .7% | 2.0% |
| Total | Count | 1217 | 298 | 1515 |
| | % of Total | 80.3% | 19.7% | 100.0% |

Joint distribution of Mode1 & Gender: the interior % of Total cells. Marginal distribution of Mode1: the Total column. Marginal distribution of Gender: the Total row.
Mode of ingestion Drug1 * Gender cross-tabulation

| Mode of ingestion Drug1 | | Male | Female | Total |
|---|---|---|---|---|
| Swallow | Count | 600 | 194 | 794 |
| | % within Mode of ingestion Drug1 | 75.6% | 24.4% | 100.0% |
| Smoke | Count | 553 | 77 | 630 |
| | % within Mode of ingestion Drug1 | 87.8% | 12.2% | 100.0% |
| Snort | Count | 44 | 17 | 61 |
| | % within Mode of ingestion Drug1 | 72.1% | 27.9% | 100.0% |
| Inject | Count | 20 | 10 | 30 |
| | % within Mode of ingestion Drug1 | 66.7% | 33.3% | 100.0% |
| Total | Count | 1217 | 298 | 1515 |
| | % within Mode of ingestion Drug1 | 80.3% | 19.7% | 100.0% |

The distribution of Gender conditional on Mode1
Mode of ingestion Drug1 * Gender cross-tabulation

| Mode of ingestion Drug1 | | Male | Female | Total |
|---|---|---|---|---|
| Swallow | Count | 600 | 194 | 794 |
| | % within Gender | 49.3% | 65.1% | 52.4% |
| Smoke | Count | 553 | 77 | 630 |
| | % within Gender | 45.4% | 25.8% | 41.6% |
| Snort | Count | 44 | 17 | 61 |
| | % within Gender | 3.6% | 5.7% | 4.0% |
| Inject | Count | 20 | 10 | 30 |
| | % within Gender | 1.6% | 3.4% | 2.0% |
| Total | Count | 1217 | 298 | 1515 |
| | % within Gender | 100.0% | 100.0% | 100.0% |

The distribution of Mode1 conditional on Gender
Choosing percentages
• “Construct the proportions so that they sum to one within the categories of the explanatory variable.”
Source: (C. Marsh, Exploring Data: An Introduction to Data Analysis for Social Scientists (Cambridge, Polity Press, 1988), p. 143.)
[Figure: Mode of ingestion Drug1 by Gender. Group sizes shown on the chart: n=600, n=553, n=44, n=20 and n=194, n=77, n=17, n=10, matching the Male and Female counts for Swallow, Smoke, Snort and Inject.]
Dimensions
Definitions of vertical and horizontal variables
Two-by-two tables
• Tables with two rows and two columns
• A range of simple descriptive statistics can be applied to two-by-two tables
• It is possible to collapse larger tables to these dimensions
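As an illustration of collapsing a larger table, the four-by-two Mode of ingestion Drug1 * Gender table can be reduced to a two-by-two table by keeping one category and pooling the rest. The choice of Inject as the category to keep, the helper name `collapse` and the label "Not inject" are assumptions made for this sketch.

```python
# Joint frequencies from the Mode of ingestion Drug1 * Gender cross-tabulation.
counts = {
    "Swallow": {"Male": 600, "Female": 194},
    "Smoke": {"Male": 553, "Female": 77},
    "Snort": {"Male": 44, "Female": 17},
    "Inject": {"Male": 20, "Female": 10},
}


def collapse(table, keep):
    """Keep one row as-is and pool all other rows into a single 'Not ...' row."""
    other_label = f"Not {keep.lower()}"
    collapsed = {keep: dict(table[keep]), other_label: {}}
    for row, cells in table.items():
        if row == keep:
            continue
        for column, n in cells.items():
            collapsed[other_label][column] = collapsed[other_label].get(column, 0) + n
    return collapsed


two_by_two = collapse(counts, "Inject")
print(two_by_two)
# {'Inject': {'Male': 20, 'Female': 10}, 'Not inject': {'Male': 1197, 'Female': 288}}
```

The column totals are unchanged by collapsing, so percentages within gender can still be compared with the original table.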
Gender * White pipe cross-tabulation

| Gender | | White pipe: Yes | White pipe: No | Total |
|---|---|---|---|---|
| Male | Count | 290 | 961 | 1251 |
| | % within Gender | 23.2% | 76.8% | 100.0% |
| Female | Count | 22 | 292 | 314 |
| | % within Gender | 7.0% | 93.0% | 100.0% |
| Total | Count | 312 | 1253 | 1565 |
| | % within Gender | 19.9% | 80.1% | 100.0% |
| Gender | White pipe: Yes | White pipe: No |
|---|---|---|
| Male | 0.2318 | 0.7682 |
| Female | 0.0701 | 0.9299 |
Relative risk
• Divide the probabilities for “success”:
– For example:
P(Whitpipe=Yes|Gender=Male) = 0.2318
P(Whitpipe=Yes|Gender=Female) = 0.0701
Relative risk is 0.2318/0.0701 = 3.309 (computed from the unrounded proportions)
• The proportion of males using white pipe was over three times that of females
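The relative risk can be checked directly from the counts in the Gender * White pipe cross-tabulation. A minimal sketch in Python follows; the variable names are illustrative. Working from the counts avoids the small rounding difference that arises when the four-decimal proportions are divided (0.2318/0.0701 gives about 3.307).

```python
# Counts from the Gender * White pipe cross-tabulation.
male_yes, male_total = 290, 1251
female_yes, female_total = 22, 314

p_male = male_yes / male_total        # P(Whitpipe=Yes | Gender=Male)
p_female = female_yes / female_total  # P(Whitpipe=Yes | Gender=Female)

relative_risk = p_male / p_female

print(f"{p_male:.4f} {p_female:.4f} {relative_risk:.3f}")  # 0.2318 0.0701 3.309
```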
Odds
• The odds of “success” are the ratio of the probability of “success” to the probability of “failure”
• For example:
– For males the odds of “success” are 0.2318/0.7682 = 0.302
– For females the odds of “success” are 0.0701/0.9299 = 0.075
Odds ratio
• Divide the odds of success for males by the odds of success for females
• For example: 0.302/0.075 = 4.005 (computed from the unrounded odds)
• The odds of taking white pipe as a male are four times those for a female
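The odds and the odds ratio can be reproduced from the cell counts of the Gender * White pipe cross-tabulation. This is a small Python sketch with illustrative variable names. Because the probabilities share a denominator within each gender, the odds reduce to the ratio of the Yes count to the No count.

```python
# Counts from the Gender * White pipe cross-tabulation.
male_yes, male_no = 290, 961
female_yes, female_no = 22, 292

# Odds of "success": P(Yes) / P(No), which simplifies to Yes count / No count.
odds_male = male_yes / male_no
odds_female = female_yes / female_no

odds_ratio = odds_male / odds_female

print(f"{odds_male:.3f} {odds_female:.3f} {odds_ratio:.3f}")  # 0.302 0.075 4.005
```

Dividing the rounded odds (0.302/0.075) gives about 4.03, which is why the value 4.005 should be read as coming from unrounded figures.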
Risk estimate

| | Value | 95% Confidence interval: Lower | 95% Confidence interval: Upper | Note |
|---|---|---|---|---|
| Odds ratio for Gender (Male / Female) | 4.005 | 2.547 | 6.299 | Odds ratio M/F |
| For cohort white pipe = Yes | 3.309 | 2.184 | 5.012 | Relative risk of “success” |
| For cohort white pipe = No | .826 | .791 | .862 | Relative risk of “failure” |
| N of valid cases | 1565 | | | |
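The slides do not say how the confidence limits are obtained. A common large-sample approach builds the interval on the log scale and back-transforms it, and the sketch below assumes that method. The helper name `log_ci` and the use of z = 1.96 are choices made for this example; the results agree with the Risk estimate table to within rounding.

```python
import math


def log_ci(estimate, se_log, z=1.96):
    """Approximate 95% interval built on the log scale, then back-transformed."""
    log_estimate = math.log(estimate)
    return math.exp(log_estimate - z * se_log), math.exp(log_estimate + z * se_log)


# Cell counts from the Gender * White pipe cross-tabulation.
a, b = 290, 961  # Male: Yes, No
c, d = 22, 292   # Female: Yes, No

# Odds ratio (Male / Female) and the standard error of its logarithm.
odds_ratio = (a * d) / (b * c)
or_ci = log_ci(odds_ratio, math.sqrt(1 / a + 1 / b + 1 / c + 1 / d))

# Relative risk of "success" (white pipe = Yes).
rr_yes = (a / (a + b)) / (c / (c + d))
rr_yes_ci = log_ci(rr_yes, math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d)))

# Relative risk of "failure" (white pipe = No).
rr_no = (b / (a + b)) / (d / (c + d))
rr_no_ci = log_ci(rr_no, math.sqrt(1 / b - 1 / (a + b) + 1 / d - 1 / (c + d)))

for label, value, (lower, upper) in [
    ("Odds ratio M/F", odds_ratio, or_ci),
    ("Relative risk, Yes", rr_yes, rr_yes_ci),
    ("Relative risk, No", rr_no, rr_no_ci),
]:
    print(f"{label}: {value:.3f} ({lower:.3f}, {upper:.3f})")
```

None of the three intervals contains 1, which is consistent with a real difference between males and females in white pipe use.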
Exercise 1: cross-tabulations
• Create and comment on the following cross-tabulations:
– Age vs Gender
– Race vs Gender
– Education vs Gender
– Primary drugs vs Mode of ingestion
• Suggest other cross-tabulations that would be useful
Exercise 2: cross-tabulation
• Construct a dichotomous variable for age: Up to 24 years and Above 24 years
• Construct a dichotomous variable for the primary drug of use: Alcohol and Not Alcohol
• Create a cross-tabulation of the two new variables and interpret
• Generate Relative Risks and Odds Ratios and interpret
Summary
• Cross-tabulations
• Joint frequencies
• Marginal frequencies
• Row/Column/Total percentages
• Relative risk
• Odds
• Odds ratios