Winter Electives Molecular and Genetic Epidemiology Decision and Cost-effectiveness Analysis...
Date posted: 20-Dec-2015
Winter Electives
• Molecular and Genetic Epidemiology
• Decision and Cost-effectiveness Analysis
• Grantwriting (Workshop – not for credit hours)
• Medical Informatics
• Today:
  – Lecture: Confounding & Interaction III
  – Section: 3:30 to 5 (S-18, S-22, S-22)
• Next Tuesday (12/6/05) – All at China Basin
  – 8:15 to 9:45: Journal Club
  – 10:00 to 1:00 pm: Mitch Katz
    • Note chapters in his text book
    • box lunches provided
  – 1:15 to 2:45: Last Small Group Section
    • Web-based course evaluation
    • Bring laptop
  – Distribute Final Exam
    • Exam due 12/13 (in hands of Olivia by 4 pm) by email or China Basin 5700
Confounding and Interaction: Part III
• When Evaluating Association Between an Exposure and an Outcome, the Possible Roles of a 3rd Variable are:
  – Intermediary Variable
  – Effect Modifier
  – Confounder
  – No Effect
• Forming “Adjusted” Summary Estimates to Evaluate Presence of Confounding
  – Concept of weighted average
    • Woolf’s Method
    • Mantel-Haenszel Method
  – Clinical/biological decision rather than statistical
  – Handling more than one potential confounder
• Limitations of Stratification to Adjust for Confounding
  – the motivation for multivariable regression
When Assessing the Association Between an Exposure and a Disease, What are the Possible Effects of a Third Variable?
[Diagram: the four possible roles of a third variable relative to the exposure and the disease (D)]
• Intermediary Variable (I): ON CAUSAL PATHWAY
• Effect Modifier (Interaction) (EM): MODIFIES THE EFFECT OF THE EXPOSURE
• Confounding (C): ANOTHER PATHWAY TO GET TO THE DISEASE
• No Effect
Assumption: The third variable a priori is felt to be relevant
What are the Possible Effects of a 3rd Variable?
• Intermediary Variable
• Effect Modifier (interaction)
• Confounder
• No Effect

[Flowchart, read top to bottom]
• Intermediary Variable? (conceptual decision)
  – yes: Report Crude Estimate
  – no: go on to the next question
• Effect Modifier? (numerically assess both magnitude and statistical differences)
  – yes: Report stratum-specific estimates
  – no: go on to the next question
• Confounder? (numerically assess difference between adjusted and crude; not a statistical decision)
  – yes: Report “adjusted” summary estimate
  – no: Report Crude Estimate (3rd variable has no effect)
Effect of a Third Variable: Statistical Interaction

Crude
                Delayed   Not Delayed
Smoking            26         133
No Smoking         64         601
RR crude = 1.7

Stratified
No Caffeine Use
                Delayed   Not Delayed
Smoking            15          61
No Smoking         47         528
RR no caffeine use = 2.4

Heavy Caffeine Use
                Delayed   Not Delayed
Smoking            11          72
No Smoking         17          73
RR caffeine use = 0.7

. cs delayed smoking, by(caffeine)

        caffeine |       RR       [95% Conf. Interval]   M-H Weight
-----------------+-------------------------------------------------
     no caffeine |   2.414614     1.42165     4.10112     5.486943
  heavy caffeine |     .70163    .3493615    1.409099     8.156069
-----------------+-------------------------------------------------
           Crude |   1.699096    1.114485    2.590369
    M-H combined |   1.390557    .9246598    2.091201
-----------------+-------------------------------------------------
Test of homogeneity (M-H)   chi2(1) = 7.866   Pr>chi2 = 0.0050

Declare interaction; confounding is not relevant
Statistical Tests of Interaction: Test of Homogeneity (heterogeneity)
• Null hypothesis: The individual stratum-specific estimates of the measure of association differ only by random variation
  – i.e., the strength of association is homogeneous across all strata
  – i.e., there is no interaction
• The test statistic will have a chi-square distribution with degrees of freedom of one less than the number of strata
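To make the homogeneity test concrete, here is a minimal Python sketch applied to the smoking, caffeine, and delayed conception counts from the previous slide. It uses an inverse-variance (Woolf-type) statistic on the log risk ratio, which is an assumption chosen for simplicity and is not the M-H homogeneity test that Stata prints, so the value should land near the reported 7.866 without matching it exactly. The function and variable names are illustrative.

```python
import math

# Stratum counts from the smoking / delayed conception slide:
# (exposed cases, exposed non-cases, unexposed cases, unexposed non-cases)
strata = {
    "no caffeine": (15, 61, 47, 528),
    "heavy caffeine": (11, 72, 17, 73),
}

def log_rr_and_variance(a, b, c, d):
    """Log risk ratio and its large-sample variance for one 2x2 table."""
    log_rr = math.log((a / (a + b)) / (c / (c + d)))
    variance = 1 / a - 1 / (a + b) + 1 / c - 1 / (c + d)
    return log_rr, variance

estimates = [log_rr_and_variance(*cells) for cells in strata.values()]
weights = [1 / variance for _, variance in estimates]
pooled = sum(w * lr for w, (lr, _) in zip(weights, estimates)) / sum(weights)

# Weighted squared deviations of each stratum from the pooled log RR
q_stat = sum(w * (lr - pooled) ** 2 for w, (lr, _) in zip(weights, estimates))
df = len(strata) - 1  # one less than the number of strata

# This tail-area shortcut is valid only for 1 degree of freedom
p_value = math.erfc(math.sqrt(q_stat / 2))
print(f"chi2({df}) = {q_stat:.2f}, p = {p_value:.4f}")
```

With more than two strata the p-value line would need a general chi-square tail function, which the standard library does not provide.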
Report vs Ignore Interaction? Some Guidelines

Relative Risks for a Given Exposure and Disease

Potential Effect Modifier    P value for      Report or Ignore
Present       Absent         heterogeneity    Interaction
2.3           2.6            0.45             Ignore
2.3           2.6            0.001            Ignore
2.0           20.0           0.001            Report
2.0           20.0           0.20             Report
2.0           20.0           0.40             Ignore
3.0           4.5            0.30             Ignore
3.0           4.5            0.001            +/-
0.5           3.0            0.001            Report
0.5           3.0            0.20             Report
0.5           3.0            0.30             +/-

Is an art form: requires consideration of both clinical and statistical significance
If Interaction is not Present, What Next?
• Case-control study of post-exposure AZT use in preventing HIV seroconversion after needlestick (NEJM 1997)

Crude
             HIV   No HIV   Total
AZT            8     131
No AZT        19     189
Total         27     320     347

ORcrude = 0.61 (95% CI: 0.26 - 1.4)

The goal is to combine or average the results from the different strata into one summary estimate. Any thoughts on how to do this?
After we have formed our strata and gotten rid of confounding, how do we summarize what the unconfounded estimates from the two or more strata are telling us? In the examples of last week, the measures of association from the different strata were identical. This is seldom the case.
A more realistic example is described in the Rothman chapter regarding the question of whether spermicide use might cause Down’s
Post-exposure prophylaxis with AZT after a needlestick
[Diagram relating AZT Use, HIV, and Severity of Exposure]
Evaluating for Interaction
• Potential confounder: severity of exposure

Crude
             HIV   No HIV   Total
AZT            8     131
No AZT        19     189
Total         27     320     347
ORcrude = 0.61

Stratified
Minor Severity
             HIV   No HIV   Total
AZT            0      91
No AZT         3     161
Total          3     252     255
OR = 0.0

Major Severity
             HIV   No HIV   Total
AZT            8      40
No AZT        16      28
Total         24      68      92
OR = 0.35
. cc HIV AZTuse,by(severity)

        severity |       OR       [95% Conf. Interval]   M-H Weight
-----------------+-------------------------------------------------
           minor |          0           0    2.302373     1.070588
           major |        .35    .1344565    .9144599     6.956522
-----------------+-------------------------------------------------
Test of homogeneity (B-D)   chi2(1) = 0.60   Pr>chi2 = 0.4400
To stratify the subjects into those women with maternal age less than 35 and those with maternal age >= 35, you add a “by(matage)” option. If you add a “, pool” option as I have here, the program will give you not only the default MH summary but also the Woolf estimate.
Finally, you are already familiar with this command, but for the sake of comparison let’s look at the summary estimate as obtained by logistic regression, which as you know uses the MLE approach. As you can see, the MH estimate is essentially identical to the MLE in this problem.
Is there interaction?
Is there confounding?
Assuming Interaction is not Present, Form a Summary of the Unconfounded Stratum-Specific Estimates
• Construct a weighted average
  – Assign weights to the individual strata
  – Summary Estimate = Weighted Average of the stratum-specific estimates

$$\text{Summary Estimate} = \frac{\sum_i w_i \, (\text{effect estimate in stratum } i)}{\sum_i w_i}$$

  – a simple mean is a weighted average where the weights are equal to 1

$$\text{simple mean} = \frac{1(2) + 1(4) + 1(6) + 1(8)}{(1)(4)} = 5$$

  – which weights to use depends on type of effect estimate desired (OR, RR, RD) and characteristics of the data
  – e.g.,
    • Woolf’s method
    • Mantel-Haenszel method

Right. We need to assign a weight to each stratum and then perform a weighted average.
How do we decide on a weight?
Hopefully the concept of a weighted average is understood by everyone. A simple mean is in fact a weighted average where the weights equal one. To get the average height of everyone in class, we add up everyone’s height and divide by the number of persons contributing. The weight is one.
The second approach to getting a summary estimate is actually the one used by multivariable modeling approaches and we will touch on this briefly today. It is called the maximum likelihood approach.
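As a small illustration of the weighted-average idea on this slide, the following Python sketch reproduces the simple mean of 2, 4, 6 and 8 with all weights equal to 1, and then repeats it with unequal weights. The unequal weights are made up purely for illustration.

```python
def weighted_average(estimates, weights):
    """Sum of weight * estimate divided by the sum of the weights."""
    if len(estimates) != len(weights):
        raise ValueError("need exactly one weight per estimate")
    return sum(w * x for w, x in zip(weights, estimates)) / sum(weights)

values = [2, 4, 6, 8]

# Equal weights of 1 give the ordinary mean
simple_mean = weighted_average(values, [1, 1, 1, 1])

# Hypothetical unequal weights: the last value counts three times as much
tilted_mean = weighted_average(values, [1, 1, 1, 3])

print(simple_mean, tilted_mean)
```

The summary estimators that follow differ only in how the weights are chosen.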
Forming a Summary Estimate for Stratified Data
• Goal:
  – Create a summary “adjusted” estimate for the relationship in question while adjusting for the potential confounder

Crude (HIV / No HIV): AZT 8 / 131; No AZT 19 / 189; totals 27 / 320 (N = 347); ORcrude = 0.61
Stratified, Minor Severity (HIV / No HIV): AZT 0 / 91; No AZT 3 / 161; totals 3 / 252 (N = 255); OR = 0.0
Stratified, Major Severity (HIV / No HIV): AZT 8 / 40; No AZT 16 / 28; totals 24 / 68 (N = 92); OR = 0.35

How would you weight these strata? According to sample size? No. of cases?
Summary Estimators: Woolf’s Method
• aka Directly pooled or precision estimator
• Woolf’s estimate for adjusted odds ratio

$$\log OR_{Woolf} = \frac{\sum_i w_i \, \log(OR_i)}{\sum_i w_i}$$

  – where

$$w_i = \frac{1}{\frac{1}{a_i} + \frac{1}{b_i} + \frac{1}{c_i} + \frac{1}{d_i}}$$

  – w_i is the inverse of the variance of the stratum-specific log(odds ratio)

$$OR_{Woolf} = e^{\log OR_{Woolf}}$$

             Disease   No Disease
Exposed        a_i        b_i
Unexposed      c_i        d_i

One of the first approaches developed for forming summary adjusted estimates was Woolf’s method.
This is the inverse of the variance of the log odds ratio. This makes sense: the more precise strata have the smallest variances, and the inverse of a small number is a large number.
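A minimal Python sketch of Woolf's estimator, following the formulas on this slide. Because the AZT example has a zero cell, the sketch is run on the spermicide and Down's counts that appear later in these slides, where Stata's "Pooled (direct)" row reports 3.824166. The function name and data layout are illustrative.

```python
import math

def woolf_odds_ratio(strata):
    """Woolf (inverse-variance) summary OR from (a, b, c, d) tables."""
    weighted_sum = 0.0
    total_weight = 0.0
    for a, b, c, d in strata:
        log_or = math.log((a * d) / (b * c))
        # Weight = inverse of the variance of the stratum log odds ratio
        weight = 1 / (1 / a + 1 / b + 1 / c + 1 / d)
        weighted_sum += weight * log_or
        total_weight += weight
    return math.exp(weighted_sum / total_weight)

# (exposed cases, exposed controls, unexposed cases, unexposed controls)
downs_by_age = [
    (3, 104, 9, 1059),  # maternal age < 35
    (1, 5, 3, 86),      # maternal age >= 35
]

or_woolf = woolf_odds_ratio(downs_by_age)
print(round(or_woolf, 2))
```

The sketch fails with a division by zero if any cell is 0, which is the limitation discussed on the following slides.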
Calculating a Summary Effect Using the Woolf Estimator
• e.g. AZT use, severity of needlestick, and HIV

Crude (HIV / No HIV): AZT 8 / 131; No AZT 19 / 189; totals 27 / 320; ORcrude = 0.61
Stratified, Minor Severity (HIV / No HIV): AZT 0 / 91; No AZT 3 / 161; totals 3 / 252 (N = 255); OR = 0.0
Stratified, Major Severity (HIV / No HIV): AZT 8 / 40; No AZT 16 / 28; totals 24 / 68 (N = 92); OR = 0.35

$$\log OR_{Woolf} = \frac{\left[\frac{1}{\frac{1}{0} + \frac{1}{91} + \frac{1}{3} + \frac{1}{161}}\right] \log(0) + \left[\frac{1}{\frac{1}{8} + \frac{1}{40} + \frac{1}{16} + \frac{1}{28}}\right] \log(0.35)}{\left[\frac{1}{\frac{1}{0} + \frac{1}{91} + \frac{1}{3} + \frac{1}{161}}\right] + \left[\frac{1}{\frac{1}{8} + \frac{1}{40} + \frac{1}{16} + \frac{1}{28}}\right]}$$
Summary Estimators: Woolf’s Method
• Conceptually straightforward
• Best when:
  – number of strata is small
  – sample size within each stratum is large
• Cannot be calculated when any cell in any stratum is zero because log(0) is undefined
  – 1/2 cell corrections have been suggested but are subject to bias
• Formulae for Woolf’s summary estimates for other measures (e.g., RR, RD) available in texts and software documentation
  – sensitive to small strata, cells with “0”
  – computationally messy

It seems most reasonable to weight each stratum according to how sure you are of the inference, and the variance of the estimate is the best measure we have for this.
I discuss this approach first not only because it was one of the first proposed but also because it is the most conceptually straightforward.
In the days before computers, this was considered computationally messy, such that other easier methods were sought.
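The zero-cell limitation can be seen directly with the AZT counts. The Python sketch below returns nothing for a stratum whose Woolf weight is undefined and then shows what a 1/2 cell correction does. Adding 0.5 to every cell of every stratum is only one of several conventions and is an assumption here; as the slide warns, the corrected value is subject to bias and is not something the source reports.

```python
import math

def woolf_weight_and_log_or(a, b, c, d, correction=0.0):
    """Return (weight, log OR) for one stratum, or None if undefined."""
    a, b, c, d = (x + correction for x in (a, b, c, d))
    if min(a, b, c, d) <= 0:
        return None  # would require 1/0 or log(0)
    weight = 1 / (1 / a + 1 / b + 1 / c + 1 / d)
    return weight, math.log((a * d) / (b * c))

azt_by_severity = {"minor": (0, 91, 3, 161), "major": (8, 40, 16, 28)}

# Without a correction the minor-severity stratum has no usable weight
for label, cells in azt_by_severity.items():
    print(label, woolf_weight_and_log_or(*cells))

# Illustrative 1/2 cell correction applied to every cell (an assumption)
corrected = [woolf_weight_and_log_or(*cells, correction=0.5)
             for cells in azt_by_severity.values()]
log_or = sum(w * l for w, l in corrected) / sum(w for w, _ in corrected)
or_corrected = math.exp(log_or)
print(round(or_corrected, 2))
```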
Summary Estimators: Mantel-Haenszel
• Mantel-Haenszel estimate for odds ratios

$$OR_{MH} = \frac{\sum_i \frac{a_i d_i}{N_i}}{\sum_i \frac{b_i c_i}{N_i}} = \frac{\sum_i \left( \frac{b_i c_i}{N_i} \times \frac{a_i d_i}{b_i c_i} \right)}{\sum_i \frac{b_i c_i}{N_i}}$$

$$w_i = \frac{b_i c_i}{N_i}$$

  – w_i is the inverse of the variance of the stratum-specific odds ratio under the null hypothesis (OR = 1)

             Disease   No Disease
Exposed        a_i        b_i
Unexposed      c_i        d_i

a_i + b_i + c_i + d_i = N_i

A more robust approach is the Mantel-Haenszel method.
Again, using the same cell definitions, the M-H estimate for the summary OR is the sum of a times d divided by T (the stratum total, N_i on the slide) divided by the sum of . . .
If we decompose this slightly, we can see that the weight for each stratum is actually b times c divided by T. This is actually the inverse of the . . .
And by the same logic as before, strata with the smallest variance get the most weight.
Summary Estimators: Mantel-Haenszel
• Mantel-Haenszel estimate for odds ratios
  – relatively resistant to the effects of large numbers of strata with few observations
  – resistant to cells with a value of “0”
  – computationally easy
  – most commonly used

The MH is the most commonly used estimator.
It is fairly resistant (i.e., it doesn’t blow up) . . .
Although really not a factor in the computer era, the computation of the MH estimator is a breeze.
More important, the M-H estimate closely approximates the MLE, which is generally regarded as the most accurate.
Calculating a Summary Effect Using the Mantel-Haenszel Estimator
• e.g. AZT use, severity of needlestick, and HIV

Crude (HIV / No HIV): AZT 8 / 131; No AZT 19 / 189; totals 27 / 320; ORcrude = 0.61
Stratified, Minor Severity (HIV / No HIV): AZT 0 / 91; No AZT 3 / 161; totals 3 / 252 (N = 255); OR = 0.0
Stratified, Major Severity (HIV / No HIV): AZT 8 / 40; No AZT 16 / 28; totals 24 / 68 (N = 92); OR = 0.35

$$OR_{MH} = \frac{\sum_i \frac{a_i d_i}{N_i}}{\sum_i \frac{b_i c_i}{N_i}}$$

$$OR_{MH} = \frac{\frac{0 \times 161}{255} + \frac{8 \times 28}{92}}{\frac{91 \times 3}{255} + \frac{40 \times 16}{92}} = 0.30$$
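The same hand calculation can be sketched in a few lines of Python. The function name and tuple layout are illustrative; the counts are the AZT strata from the slide, and the per-stratum weights b*c/N can be compared with the "M-H Weight" column in the Stata output (1.070588 and 6.956522).

```python
def mantel_haenszel_or(strata):
    """M-H summary odds ratio from (a, b, c, d) tables, one per stratum."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator

# (AZT & HIV, AZT & no HIV, no AZT & HIV, no AZT & no HIV)
azt_by_severity = [
    (0, 91, 3, 161),  # minor severity
    (8, 40, 16, 28),  # major severity
]

mh_weights = [b * c / (a + b + c + d) for a, b, c, d in azt_by_severity]
or_mh = mantel_haenszel_or(azt_by_severity)
print(round(or_mh, 2), [round(w, 3) for w in mh_weights])
```

Unlike Woolf's method, the zero cell in the minor-severity stratum causes no difficulty here because nothing is inverted or logged stratum by stratum.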
Calculating a Summary Effect in Stata
• epitab command - Tables for epidemiologists
  – see “Survival Analysis and Epidemiological Tables Reference Manual”
• To produce crude estimates and 2 x 2 tables:
  – For cross-sectional or cohort studies:
    • cs variable_case variable_exposed
  – For case-control studies:
    • cc variable_case variable_exposed
• To stratify by a third variable:
  – cs var_case var_exposed, by(var_third_variable)
  – cc var_case var_exposed, by(var_third_variable)
• Default summary estimator is Mantel-Haenszel
  – , pool will also produce Woolf’s method

How can we make our lives a lot easier and implement all of this on the computer?
The epitab command - Tables for Epidemiologists is quite a handy little command. Has anyone used it?
Calculating a Summary Effect Using the Mantel-Haenszel Estimator
• e.g. AZT use, severity of needlestick, and HIV

. cc HIV AZTuse,by(severity) pool

        severity |       OR       [95% Conf. Interval]   M-H Weight
-----------------+-------------------------------------------------
           minor |          0           0    2.302373     1.070588
           major |        .35    .1344565    .9144599     6.956522
-----------------+-------------------------------------------------
           Crude |   .6074729    .2638181    1.401432
 Pooled (direct) |          .           .           .
    M-H combined |     .30332    .1158571    .7941072
-----------------+-------------------------------------------------
Test of homogeneity (B-D)   chi2(1) = 0.60   Pr>chi2 = 0.4400

Test that combined OR = 1:
                   Mantel-Haenszel chi2(1) = 6.06
                                   Pr>chi2 = 0.0138

Crude (HIV / No HIV): AZT 8 / 131; No AZT 19 / 189; totals 27 / 320; ORcrude = 0.61
Stratified, Minor Severity (HIV / No HIV): AZT 0 / 91; No AZT 3 / 161; totals 3 / 252 (N = 255); OR = 0.0
Stratified, Major Severity (HIV / No HIV): AZT 8 / 40; No AZT 16 / 28; totals 24 / 68 (N = 92); OR = 0.35
Calculating a Summary Effect Using the Mantel-Haenszel Estimator
• In addition to the odds ratio, Mantel-Haenszel estimators are also available in Stata for:
– risk ratio
• “cs var_case var_exposed, by(var_third_variable)”
– rate ratio
• “ir var_case var_exposed var_time, by(var_third_variable)”
Assessment of Confounding: Interpretation of Summary Estimate
• Compare “adjusted” estimate to crude estimate
– e.g. compare ORMH (= 0.30 in the example) to ORcrude (= 0.61 in the example)
• If “adjusted” measure “differs meaningfully” from crude estimate, then confounding is present
– e.g., does ORMH = 0.30 “differ meaningfully” from ORcrude = 0.61?
• What does “differs meaningfully” mean?
  – a matter of judgement based on biologic/clinical sense rather than on a statistical test
  – no one correct answer
  – the objective is to remove bias
  – 10% change from the crude often used
  – your threshold needs to be stated a priori and included in your methods section

So, it’s in the hands of the researcher.
If the summary estimate, here an M-H OR estimator of 3.8 . . .
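The change-in-estimate comparison can be written down as a short Python sketch. Measuring the change relative to the crude estimate follows the slide's wording ("10% change from the crude"); some analysts measure it relative to the adjusted estimate or on the log scale instead, so the formula and the function names here are assumptions for illustration.

```python
def percent_change_from_crude(crude, adjusted):
    """Absolute change of the adjusted estimate relative to the crude, in %."""
    return abs(adjusted - crude) / crude * 100

def meaningful_confounding(crude, adjusted, threshold_pct=10.0):
    """Apply a threshold that was fixed before looking at the data."""
    return percent_change_from_crude(crude, adjusted) >= threshold_pct

# AZT example from the slides: crude OR 0.61 versus M-H adjusted OR 0.30
change = percent_change_from_crude(0.61, 0.30)
print(round(change), meaningful_confounding(0.61, 0.30))
```

The threshold is passed in as an argument to reflect the point that it should be chosen a priori and reported in the methods section.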
Statistical Testing for Confounding is Inappropriate
• Testing for statistically significant differences between crude and adjusted measures is inappropriate
– e.g., when examining an association for which a factor is a known confounder (say age in the association between HTN and CAD)
– if the study has a small sample size, even large differences between crude and adjusted measures will not be statistically different
• yet, we know confounding is present
• therefore, the difference between crude and adjusted measures cannot be ignored as merely chance. The difference must be reported as confounding
– the issue of confounding is one of internal validity, not of sampling error.
• we must live with whatever effects we see after adjustment for a factor for which there is an a priori belief about confounding
• we’re not concerned that sampling error is causing confounding and therefore we don’t have to worry about testing for role of chance
Confidence Interval Estimation and Hypothesis Testing for the Mantel-Haenszel Estimator
• e.g. AZT use, severity of needlestick, and HIV

. cc HIV AZTuse,by(severity) pool

        severity |       OR       [95% Conf. Interval]   M-H Weight
-----------------+-------------------------------------------------
           minor |          0           0    2.302373     1.070588
           major |        .35    .1344565    .9144599     6.956522
-----------------+-------------------------------------------------
           Crude |   .6074729    .2638181    1.401432
 Pooled (direct) |          .           .           .
    M-H combined |     .30332    .1158571    .7941072
-----------------+-------------------------------------------------
Test of homogeneity (B-D)   chi2(1) = 0.60   Pr>chi2 = 0.4400

Test that combined OR = 1:
                   Mantel-Haenszel chi2(1) = 6.06
                                   Pr>chi2 = 0.0138

• What does the p value = 0.0138 mean?
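Mantel-Haenszel Confidence Interval and Hypothesis Testing

$$95\%\ CI = e^{\log OR_{MH} \pm \left(1.96 \times SE(\log OR_{MH})\right)}$$

$$SE(\log OR_{MH}) = \sqrt{\frac{\sum_{i=1}^{k} P_i R_i}{2 \left( \sum_{i=1}^{k} R_i \right)^2} + \frac{\sum_{i=1}^{k} (P_i w_i + Q_i R_i)}{2 \sum_{i=1}^{k} R_i \sum_{i=1}^{k} w_i} + \frac{\sum_{i=1}^{k} Q_i w_i}{2 \left( \sum_{i=1}^{k} w_i \right)^2}}$$

where

$$P_i = \frac{a_i + d_i}{N_i}; \quad Q_i = \frac{b_i + c_i}{N_i}; \quad R_i = \frac{a_i d_i}{N_i}; \quad w_i = \frac{b_i c_i}{N_i}$$

$$\chi^2_{MH}(1) = \frac{\left( \left| \sum_{i=1}^{k} a_i - \sum_{i=1}^{k} E_i \right| - 0.5 \right)^2}{\sum_{i=1}^{k} \frac{m_{1i} m_{2i} n_{1i} n_{2i}}{N_i^2 (N_i - 1)}}$$

where E_i is the expected value for the a cell in each stratum, E_i = m_{1i} n_{1i} / N_i

             Disease   No Disease
Exposed        a_i        b_i        m_1i
Unexposed      c_i        d_i        m_2i
               n_1i       n_2i       N_i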
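These formulas can be checked against the Stata output for the AZT example with a short Python sketch. The names are illustrative. The chi-square is computed both with and without the 0.5 continuity correction; by my arithmetic it is the version without the correction that reproduces Stata's 6.06, so treat the comparison as a check rather than as a statement of what Stata does internally.

```python
import math

def mh_summary(strata, z=1.96):
    """M-H OR, 95% CI from the SE formula above, and M-H chi-squares."""
    sum_r = sum_w = sum_pr = sum_pw_qr = sum_qw = 0.0
    sum_a = sum_e = sum_var = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        p, q = (a + d) / n, (b + c) / n
        r, w = a * d / n, b * c / n
        sum_r += r
        sum_w += w
        sum_pr += p * r
        sum_pw_qr += p * w + q * r
        sum_qw += q * w
        m1, m2, n1, n2 = a + b, c + d, a + c, b + d
        sum_a += a
        sum_e += m1 * n1 / n  # expected value of the a cell
        sum_var += m1 * m2 * n1 * n2 / (n ** 2 * (n - 1))
    or_mh = sum_r / sum_w
    se = math.sqrt(sum_pr / (2 * sum_r ** 2)
                   + sum_pw_qr / (2 * sum_r * sum_w)
                   + sum_qw / (2 * sum_w ** 2))
    ci = (or_mh * math.exp(-z * se), or_mh * math.exp(z * se))
    chi2_plain = (sum_a - sum_e) ** 2 / sum_var
    chi2_corrected = (abs(sum_a - sum_e) - 0.5) ** 2 / sum_var
    return or_mh, ci, chi2_plain, chi2_corrected

or_mh, ci, chi2_plain, chi2_corrected = mh_summary(
    [(0, 91, 3, 161), (8, 40, 16, 28)])
print(round(or_mh, 3), [round(x, 3) for x in ci],
      round(chi2_plain, 2), round(chi2_corrected, 2))
```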
Mantel-Haenszel Techniques
• Mantel-Haenszel estimators
• Mantel-Haenszel chi-square statistic
• Mantel’s test for trend (dose-response)
Summary Effect in Stata - example
• e.g. Spermicide use, maternal age and Down’s

Crude
                      Down’s   No Down’s
Spermicide use           4        109
No spermicide use       12       1145
OR = 3.5

Stratified
Age < 35 (N = 1175)
                      Down’s   No Down’s
Spermicide use           3        104
No spermicide use        9       1059
OR = 3.4

Age >= 35 (N = 95)
                      Down’s   No Down’s
Spermicide use           1          5
No spermicide use        3         86
OR = 5.7

With this in mind, let’s consider an example using . . .
Should we pool these?
Is there confounding present?

. cc downs spermici , by(matage) pool

          matage |       OR       [95% Conf. Interval]   M-H Weight
-----------------+-------------------------------------------------
            < 35 |   3.394231    .9800358    11.80389     .7965957
           >= 35 |   5.733333           0     50.8076     .1578947
-----------------+-------------------------------------------------
           Crude |   3.501529    1.171223    10.49699
 Pooled (direct) |   3.824166    1.196437    12.22316
    M-H combined |   3.781172     1.18734    12.04142
-----------------+-------------------------------------------------
Test for heterogeneity (direct)   chi2(1) = 0.137   Pr>chi2 = 0.7109
Test for heterogeneity (M-H)      chi2(1) = 0.138   Pr>chi2 = 0.7105

Test that combined OR = 1:
                   Mantel-Haenszel chi2(1) = 5.81
                                   Pr>chi2 = 0.0159

Which answer should you report as “final”?
No Effect of Third Variable

Crude
              Lung Ca   No Lung Ca
Smoking         900        300
No Smoking      100        700
OR crude = 21.0 (95% CI: 16.4 - 26.9)

Stratified
Matches Present
              Lung Ca   No Lung Ca
Smoking         810        270
No Smoking       10         70
OR matches = 21.0

Matches Absent
              Lung Ca   No Lung Ca
Smoking          90         30
No Smoking       90        630
OR no matches = 21.0

OR adj = 21.0 (95% CI: 14.2 - 31.1)
Whether or not to accept the “adjusted” summary estimate in favor of the crude?
• Methodologic literature is inconsistent on this
• Scientifically most rigorous approach would appear to be to create two lists of potential confounders prior to the analysis:
  – A. Those factors for which you will accept the adjusted result no matter how small the difference from the crude
  – B. Those factors for which you will accept the adjusted result only if it meaningfully differs from the crude (with some pre-specified difference, e.g., 10%)
• For some analyses, may have no factors on A list. For other analyses, no factors on B list.
• Always putting all factors on A list may seem conservative, but taking the penalty in statistical imprecision is not necessarily the right thing to do
Presence or Absence of Confounding by a Third Variable?

Relative Risks
Crude   Third Factor   Third Factor   Adjusted   Adjust or
        Present        Absent                    Ignore?
4.1     1.9            2.1            2.0        Adjust
4.0     1.2            1.0            1.1        Adjust
0.2     0.7            0.9            0.8        Adjust
4.0     3.8            4.2            4.1        Ignore
4.0     8.2            7.7            7.9        Adjust
1.0     3.1            2.7            3.0        Adjust
1.9     1.6            1.9            1.8        Prob. Ignore
0.9     0.1            0.2            0.1        Adjust
4.0     0.4            0.6            0.5        Adjust
Stratifying by Multiple Potential Confounders
[Figure: the crude 2 x 2 table (Chlamydia / No chlamydia by CAD / No CAD) is stratified into six 2 x 2 tables of the same layout: <40 smokers, <40 non-smokers, 40-60 smokers, 40-60 non-smokers, >60 smokers, >60 non-smokers]
The Need for Evaluation of Joint Confounding
• Variables that evaluated alone show no confounding may show confounding when evaluated jointly

The examples I have shown thus far have just one potential confounder to worry about. What should we do when more than . . .
In this example, the crude estimate is identical to the stratum specific measures when the 2 other variables are looked at separately.

Crude
              Disease   No Disease
Exposed          12          4
Unexposed        30         22
OR = 2.2

Stratified by Factor 1 alone (Disease / No Disease)
F1+: Exposed 6 / 2; Unexposed 15 / 11; OR = 2.2
F1-: Exposed 6 / 2; Unexposed 15 / 11; OR = 2.2

Stratified by Factor 2 alone (Disease / No Disease)
F2+: Exposed 6 / 2; Unexposed 15 / 11; OR = 2.2
F2-: Exposed 6 / 2; Unexposed 15 / 11; OR = 2.2

Stratified by Factor 1 & 2 (Disease / No Disease)
F1+F2+: Exposed 1 / 1; Unexposed 10 / 10; OR = 1.0
F1-F2+: Exposed 5 / 1; Unexposed 5 / 1; OR = 1.0
F1+F2-: Exposed 5 / 1; Unexposed 5 / 1; OR = 1.0
F1-F2-: Exposed 1 / 1; Unexposed 10 / 10; OR = 1.0
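The joint confounding example can be verified by starting from the four joint strata and collapsing them. The Python sketch below uses the cell counts from the slide; the helper names are illustrative.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio of one 2x2 table."""
    return (a * d) / (b * c)

def collapse(tables):
    """Add 2x2 tables together cell by cell."""
    return tuple(sum(cells) for cells in zip(*tables))

# Joint strata keyed by (F1, F2); cells are
# (exposed diseased, exposed not, unexposed diseased, unexposed not)
joint = {
    ("+", "+"): (1, 1, 10, 10),
    ("-", "+"): (5, 1, 5, 1),
    ("+", "-"): (5, 1, 5, 1),
    ("-", "-"): (1, 1, 10, 10),
}

crude = collapse(joint.values())
by_f1 = {level: collapse([t for (f1, _), t in joint.items() if f1 == level])
         for level in ("+", "-")}
by_f2 = {level: collapse([t for (_, f2), t in joint.items() if f2 == level])
         for level in ("+", "-")}

print("crude", crude, round(odds_ratio(*crude), 1))
print("by F1", {k: round(odds_ratio(*t), 1) for k, t in by_f1.items()})
print("by F2", {k: round(odds_ratio(*t), 1) for k, t in by_f2.items()})
print("joint", {k: odds_ratio(*t) for k, t in joint.items()})
```

Stratifying on either factor alone leaves the odds ratio at 2.2, while every joint stratum has an odds ratio of 1.0, which is the point of the slide.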
Approaches for When More than One Potential Confounder is Present
• Backward versus forward confounder evaluation strategies
– relevant both for stratification and especially multivariable modeling (the heart of model selection)
• Backwards Strategy
– initially evaluate all potential confounders together (i.e., look for joint confounding)
– conceptually preferred because in nature variables are all present and act together
– Procedure:
• with all potential confounders considered, form adjusted estimate. This is the “gold standard”
• one variable can then be dropped and the adjusted estimate is re-calculated (adjusted for remaining variables)
• if the dropping of the first variable results in a non-meaningful (eg <10%) change compared to the gold standard, it can be eliminated
• procedure continues until no more variables can be dropped (i.e., all remaining variables are relevant)
– Problem:
• with many potential confounders, cells become very sparse and stratum-specific estimates very imprecise
This introduces the whole topic of . . .
I know you are learning a bit about this in biostatistics. Which is preferable, backward or forwards?
In fact, you may not even be able to get off the ground because the initial stratification is just too thin.
Example: Backwards Selection
• Research question: Is prior hospitalization associated with the presence of methicillin-resistant S. aureus (MRSA)? (from Kleinbaum 2003)
• Outcome variable: MRSA (present or absent)
• Primary predictor: prior hospitalization (yes/no)
• Potential confounders: age (<55, >55), gender, prior antibiotic use (atbxuse; yes/no)
• Assume no interaction

Factors Adjusted For                    OR (95% CI)               CI Width
none (crude)                            11.67 (5.99 to 22.77)     16.78
age, gender, atbxuse (gold standard)     4.66 (2.14 to 10.14)      8.0
gender, atbxuse                          5.04 (2.31 to 11.03)      8.72
age, atbxuse                             4.63 (2.08 to 10.29)      8.21
age, gender                             11.59 (5.91 to 22.76)     16.85
atbxuse                                  5.00 (2.26 to 11.04)      8.78
age                                     11.56 (5.87 to 22.76)     16.89
gender                                  12.06 (6.15 to 23.62)     17.47

• Which OR to report?
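One way to work through this table is to compare every reduced adjustment set with the gold standard using a pre-specified threshold. The Python sketch below copies the odds ratios from the table and lists the reduced sets whose OR stays within 10% of the gold standard. The 10% threshold is the example value used in these slides; the code is only an illustration of the bookkeeping, not a rule for which OR to report.

```python
full_set = frozenset({"age", "gender", "atbxuse"})

# Adjusted ORs copied from the MRSA table; keys are the adjustment sets
adjusted_or = {
    full_set: 4.66,  # gold standard
    frozenset({"gender", "atbxuse"}): 5.04,
    frozenset({"age", "atbxuse"}): 4.63,
    frozenset({"age", "gender"}): 11.59,
    frozenset({"atbxuse"}): 5.00,
    frozenset({"age"}): 11.56,
    frozenset({"gender"}): 12.06,
    frozenset(): 11.67,  # crude
}
gold_standard = adjusted_or[full_set]

def percent_change(estimate, reference):
    """Absolute change relative to the reference estimate, in %."""
    return abs(estimate - reference) / reference * 100

threshold_pct = 10.0  # pre-specified, as the slides recommend

# Reduced sets whose OR is within the threshold of the gold standard
acceptable = sorted(
    (sorted(factors) for factors, value in adjusted_or.items()
     if factors != full_set
     and percent_change(value, gold_standard) < threshold_pct),
    key=len,
)
for factors in acceptable:
    print(factors)
```

Every acceptable set contains atbxuse, which suggests that prior antibiotic use is the variable doing the confounding here; the CI widths in the table can then help choose among the acceptable sets.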
Approaches for When More than One Potential Confounder is Present
• Forward Strategy
  – start with the variable that has the biggest “change-in-estimate” impact
  – then add the variable with the second biggest impact
  – keep this variable if its presence meaningfully changes the adjusted estimate
  – procedure continues until no other added variable has an important impact
  – Advantage:
    • avoids the initial sparse cell problem of backwards approach
  – Problem:
    • does not evaluate joint confounding effects of many variables

In the forward selection approach, you start with . . .
Stratification to Reduce Confounding
• Advantages
  – straightforward to implement and comprehend
  – easy way to evaluate statistical interaction
• Limitations
  – Looks at only one exposure-disease assoc. at a time
  – Requires continuous variables to be discretized
    • loses information; possibly results in “residual confounding”
  – Deteriorates with multiple confounders
    • e.g. suppose 4 confounders with 3 levels
      – 3x3x3x3=81 strata needed
      – unless huge sample, many cells have “0”’s and strata have undefined effect measures
  – Solution:
    • Mathematical modeling (multivariable regression)
      – e.g.
        » linear regression
        » logistic regression
        » proportional hazards regression
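The strata count in the example above is easy to confirm by enumeration. In this Python sketch the sample size of 500 is a made-up number, used only to show how few subjects each stratum would hold on average.

```python
from itertools import product

# Four confounders with three levels each, as in the slide's example
levels_per_confounder = [3, 3, 3, 3]
strata = list(product(*(range(k) for k in levels_per_confounder)))
print(len(strata))

# Hypothetical sample size, to show how thin the strata become on average
n_subjects = 500
print(round(n_subjects / len(strata), 1))
```

Each stratum still needs its own 2 x 2 table, so an average of a handful of subjects per stratum leaves many cells empty.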
Although you are all now learning about the wonderful world of multivariable modeling, I would encourage you to examine your data whenever you can with stratification because it is the most native way to see your data and the easiest way to explain your data to others.
It does, however, have its limitations, principally that it breaks down with multiple confounders.
These approaches are the topics of Mitch Katz’s upcoming sessions and your Thursday sessions.