Causation?
Tim Wiemken, PhD MPH CIC
Assistant Professor, Division of Infectious Diseases
University of Louisville, Kentucky
Overview
1. Testing for an Association
2. Other Measures of Association
3. Confidence Intervals
Testing for Association
1. Develop hypothesis
Null hypothesis: There is no association
Alternative hypothesis: There is an association

Testing for Association
2. Choose your level of significance (α value)
What P-value will you consider statistically significant?
Usually 0.05 - arguments for bigger/smaller

Testing for Association
3. Choose Your Test
Call your statistician.
• A bad test gives bad results.
• A good test may give bad results (bad data?).
• A good statistician may tell you if the results are bad, but cannot always tell you if the data were bad.
Testing for Association
Chi-squared Test
Will tell you if there is an association between two variables
Compares observed versus expected counts in study groups
Must have adequate sample size
2x2 table – categorical data
              Outcome +   Outcome -
Predictor +
Predictor -
Example
Research question: Does HIV impact mortality in hospitalized patients with community-acquired pneumonia (CAP)?

Example
Does HIV Have an Effect on Patient In-Hospital Mortality?
Hospitalized CAP Patients
  HIV+: Dead / Alive
  HIV-: Dead / Alive
Predictor Variable: ?
Outcome Variable: ?

Example
Does HIV Have an Effect on Patient In-Hospital Mortality?
Significance Level
Null Hypothesis
What Test?

Example
Does HIV Have an Effect on Patient In-Hospital Mortality?
              Outcome +   Outcome -
Predictor +
Predictor -
+ HIV, + died:
+ HIV, - died:
- HIV, + died:
- HIV, - died:

Example
Does HIV Have an Effect on Patient In-Hospital Mortality?
How many patients died in-hospital? n=27
How many patients had HIV? n=30

Example
Does HIV Have an Effect on Patient In-Hospital Mortality?
         Dead +   Dead -   Total
HIV+                       n=30
HIV-
Total    n=27              n=100
Example
Does HIV Have an Effect on Patient In-Hospital Mortality?
How many patients with HIV died?
=countifs(b2:b101, 1, z2:z101, 1)
Count the number of deaths (column b, in_hosp_mort=1) among patients who had HIV (column z, hiv=1).
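For anyone working outside Excel, the same two-condition count can be sketched in Python. This is only an illustration: the five records below are made up, and only the column names in_hosp_mort and hiv come from the slide.

```python
# Hypothetical mini-dataset (NOT the course data). Each record mirrors two
# spreadsheet columns: in_hosp_mort (1 = died in hospital) and hiv (1 = HIV+).
records = [
    {"in_hosp_mort": 1, "hiv": 1},
    {"in_hosp_mort": 0, "hiv": 1},
    {"in_hosp_mort": 1, "hiv": 0},
    {"in_hosp_mort": 0, "hiv": 0},
    {"in_hosp_mort": 0, "hiv": 0},
]


def count_cell(rows, died, hiv):
    """Count rows that meet both conditions, like COUNTIFS with two criteria."""
    return sum(1 for r in rows if r["in_hosp_mort"] == died and r["hiv"] == hiv)


# One call per cell of the 2x2 table.
for died in (1, 0):
    for hiv in (1, 0):
        print(f"died={died}, hiv={hiv}: {count_cell(records, died, hiv)}")
```

Counting all four cells directly, rather than subtracting from the margins, gives a built-in check that the cells add up to the total.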
Example
Does HIV Have an Effect on Patient In-Hospital Mortality?
         Dead +          Dead -                   Total
HIV+     11              30 - 11 = 19             n=30
HIV-     27 - 11 = 16    100 – (11+16+19) = 54
Total    n=27                                     n=100

Check this!
=countifs(b2:b101, 0, z2:z101, 1)

Plug the data into your Excel stats program
Example
Does HIV have an effect on in-hospital mortality?
No! P=0.154 (P>0.05)
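The slides leave the calculation to a stats program. As a rough sketch of what that program may be doing, the block below runs a Pearson chi-squared test on the HIV table (11, 19, 16, 54). It assumes no continuity correction, which is an assumption on my part; with that choice the result should land close to the P=0.154 reported above.

```python
import math


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    observed = [a, b, c, d]
    row_totals = [a + b, a + b, c + d, c + d]
    col_totals = [a + c, b + d, a + c, b + d]
    chi2 = 0.0
    for obs, row, col in zip(observed, row_totals, col_totals):
        expected = row * col / n  # count expected if there were no association
        chi2 += (obs - expected) ** 2 / expected
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Rows: HIV+, HIV-. Columns: died, survived.
chi2, p = chi_squared_2x2(11, 19, 16, 54)
print(f"chi-squared = {chi2:.2f}, P = {p:.3f}")
```

The smallest expected count here is 8.1 (HIV+ and died), which speaks to the "adequate sample size" requirement mentioned earlier.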
Example
Where to publish?

Example
Maybe those without HIV are older than those with HIV, so the mortality ends up the same.
How do we check this?

Example
Is age different in patients with and without HIV?
Null Hypothesis: The ages of patients with and without HIV are NOT different.
Alternative Hypothesis: The ages of patients with and without HIV ARE different.
Example
Back to your dataset!
Total cases of HIV
Mean age, HIV
SD age, HIV
Total cases of non-HIV
Mean age, non-HIV
SD age, non-HIV

Example
Total Cases
Total cases of HIV: =countif(Z2:Z101,1)
Total cases of non-HIV: =countif(Z2:Z101,0)

Example
Average Age
HIV+: =averageif(Z2:Z101,1,AN2:AN101)
HIV-: =averageif(Z2:Z101,0,AN2:AN101)
Example
Standard Deviations… not as easy.
Need to use an Array and a nested IF
HIV+: =stdev(if(Z2:Z101=1,AN2:AN101))
DON’T HIT ENTER!!!!!!!!!
ON WINDOWS: Control+Shift+Enter
ON MAC: Command+Enter
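Outside Excel the conditional count, mean, and standard deviation need no array trick. A minimal sketch in Python follows; the seven (hiv, age) rows are invented for illustration and are not the course dataset.

```python
import statistics

# Hypothetical rows of (hiv, age). In the worksheet, hiv sits in column Z
# and age in column AN; these values are made up.
rows = [(1, 42), (1, 55), (1, 61), (0, 48), (0, 67), (0, 70), (0, 59)]


def ages_for(rows, hiv_status):
    """Keep only the ages of patients with the requested HIV status."""
    return [age for hiv, age in rows if hiv == hiv_status]


for label, status in (("HIV+", 1), ("HIV-", 0)):
    ages = ages_for(rows, status)
    # statistics.stdev is the sample standard deviation (n - 1 denominator),
    # the same definition Excel's STDEV uses.
    print(label, len(ages), round(statistics.mean(ages), 1), round(statistics.stdev(ages), 2))
```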
Example
Back to your stats program!
Total cases of HIV = 30; mean age HIV: 50.3; SD age HIV: 13.62
Total cases of non-HIV = 70; mean age non-HIV: 56.5; SD age non-HIV: 15.96

Example
Is age different?
NO! P>0.05
BUT IT IS SOOOOO CLOSE!
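The slides do not say which test the stats program ran. Assuming it was a Student's two-sample t-test with pooled variance (my assumption, not something the deck states), the summary numbers above are enough to sketch the test statistic:

```python
import math


def pooled_t_statistic(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic from summary statistics, assuming equal variances."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, df


# HIV+ group first, HIV- group second, using the values from the slide.
t, df = pooled_t_statistic(50.3, 13.62, 30, 56.5, 15.96, 70)
print(f"t = {t:.2f} on {df} degrees of freedom")
```

Under that assumption the absolute value of t is about 1.86, just short of the two-sided 5% critical value of roughly 1.98 for 98 degrees of freedom, which fits "P>0.05, but so close".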
Overview
1. Testing for an Association
2. Other Measures of Association
3. Confidence Intervals
Measures of Association
1. Risk Ratio
Used for cohort studies or clinical trials
Gold standard measure for observational studies
Answers: How much more (less) likely is this group to get an outcome versus this other group?
Example
Do those admitted to the ICU die more than those not admitted to the ICU?
Use the 2x2 Totals Tab
Total with outcome: =countif(B2:B101,1)  n=27
Total without outcome: 100 – 27  n=73

Example
Do those admitted to the ICU die more than those not admitted to the ICU?
Total with outcome in the ICU: =countifs(B2:B101,1,I2:I101,1)  n=9
Total without outcome in the ICU: =countifs(B2:B101,0,I2:I101,1)  n=7

Example
Do those admitted to the ICU die more than those not in the ICU?
         Dead +         Dead -
ICU+     9              7
ICU-     27 - 9 = 18    73 – 7 = 66
P=0.004
Example
How much more likely are those admitted to the ICU to die?
         Dead +   Dead -
ICU+     9        7
ICU-     18       66
Risk of death in ICU group: 9 / (9 + 7) = 56.3%
Risk of death in non-ICU group: 18 / (18 + 66) = 21.4%
Risk Ratio: 0.563 / 0.214 = 2.63
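The same arithmetic as a small Python sketch, using the ICU counts from the table above. The function and argument names are mine, chosen only for illustration.

```python
def risk_ratio(exposed_events, exposed_nonevents, unexposed_events, unexposed_nonevents):
    """Return (risk in exposed, risk in unexposed, risk ratio) from 2x2 counts."""
    risk_exposed = exposed_events / (exposed_events + exposed_nonevents)
    risk_unexposed = unexposed_events / (unexposed_events + unexposed_nonevents)
    return risk_exposed, risk_unexposed, risk_exposed / risk_unexposed


# ICU+ row: 9 died, 7 survived. ICU- row: 18 died, 66 survived.
risk_icu, risk_non_icu, rr = risk_ratio(9, 7, 18, 66)
print(f"ICU: {risk_icu:.1%}, non-ICU: {risk_non_icu:.1%}, risk ratio: {rr:.3f}")
```

Without intermediate rounding the ratio is 2.625, which the slide reports as 2.63.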
Example
Interpret the Risk Ratio
Who wants to interpret a risk ratio of 2.63?
Patients admitted to the ICU are 2.63 times as likely to die as patients not admitted to the ICU.
Example
Does Empiric Atypical Pathogen Coverage Have an Effect on Patient Mortality?
CAP Patients
  Empiric Atypical Pathogen Coverage: Dead / Alive
  No Empiric Atypical Pathogen Coverage: Dead / Alive

Example
Assuming a cohort study…
Do those patients who have empiric atypical pathogen coverage die less often than those without atypical coverage?
+ Atypical: 2220
- Atypical: 658
+ Atypical, died: 217
- Atypical, died: 110

              Outcome +   Outcome -
Predictor +   217         2003
Predictor -   110         548

Example
Interpret the Risk Ratio
Anyone??
Risk Ratio: (217 / 2220) / (110 / 658) = 0.58
Those with atypical coverage are 42% less likely to die as compared to those without atypical coverage.
Example
Remember your baseline risk.
What does that mean?
Assuming 8% of CAP patients die, what is the risk of death with empiric atypical pathogen coverage?
8% x 0.58 = 4.64%
Just multiply original risk by the risk ratio!
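A short sketch of that multiplication in Python, using the atypical-coverage counts given earlier (217 deaths among 2220 covered patients, 110 among 658 uncovered) and the 8% baseline from the slide. The helper name is hypothetical.

```python
def risk_with_exposure(baseline_risk, risk_ratio):
    """Scale a baseline (unexposed) risk by the risk ratio."""
    return baseline_risk * risk_ratio


# Risk ratio from the cohort counts: exposed risk divided by unexposed risk.
rr = (217 / 2220) / (110 / 658)
print(f"risk ratio = {rr:.2f}, relative reduction = {1 - rr:.0%}")

# Apply the risk ratio to the assumed 8% baseline mortality.
print(f"risk with coverage = {risk_with_exposure(0.08, rr):.2%}")
```

Using the unrounded risk ratio gives a slightly different answer than the slide's 4.64%, which comes from the rounded value 0.58.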
Example
Even Better: Number Needed to Treat
NNT = 1 / Absolute Risk Reduction (ARR)
ARR = Unexposed Risk – Exposed Risk
ARR = Risk w/out atypical coverage – Risk w/atypical coverage
16.7% = unexposed risk (110 / 658)
9.8% = exposed risk (217 / 2220)
NNT = 1 / (0.167 – 0.098) = 14.5, so 15 (round up!)
Need to treat 15 patients to save 1
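As a sketch, the number needed to treat can be computed directly from the two risks. Note that the risks go in as proportions (0.167, not 16.7), and the result is rounded up. The function name is mine.

```python
import math


def number_needed_to_treat(unexposed_risk, exposed_risk):
    """NNT = 1 / ARR, rounded up to a whole patient. Risks are proportions (0 to 1)."""
    arr = unexposed_risk - exposed_risk  # absolute risk reduction
    if arr <= 0:
        raise ValueError("no absolute risk reduction, so NNT is not defined here")
    return math.ceil(1 / arr)


# Risk without atypical coverage: 110 / 658. Risk with coverage: 217 / 2220.
nnt = number_needed_to_treat(110 / 658, 217 / 2220)
print(f"Need to treat {nnt} patients to save 1")
```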
Measures of Association
2. Odds Ratio
Used for case-control studies
Is an approximation of the risk ratio
Answers: How much more (less) likely are those with the outcome to have been in this group versus this other group?

Measures of Association
2. Odds Ratio
Only a good approximation when the outcome is rare
Can be an extremely bad approximation
Can correct with a formula
Zhang, J., & Yu, K. F. (1998). What's the relative risk? A method of correcting the odds ratio in cohort studies of common outcomes. JAMA, 280(19), 1690-1691.
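The slide cites the correction without writing it out. As it is commonly stated from that paper, the corrected risk ratio is OR / ((1 - P0) + P0 x OR), where P0 is the risk of the outcome in the unexposed group. A minimal sketch, applied to the ICU table from the risk ratio example (9, 7, 18, 66), where the outcome is common:

```python
def odds_ratio(a, b, c, d):
    """Cross-product odds ratio for the table [[a, b], [c, d]]."""
    return (a * d) / (b * c)


def corrected_risk_ratio(or_value, unexposed_risk):
    """Zhang and Yu style correction: OR / ((1 - P0) + P0 * OR)."""
    return or_value / ((1 - unexposed_risk) + unexposed_risk * or_value)


icu_or = odds_ratio(9, 7, 18, 66)
p0 = 18 / (18 + 66)  # risk of death in the non-ICU (unexposed) group
print(f"odds ratio = {icu_or:.2f}")
print(f"corrected risk ratio = {corrected_risk_ratio(icu_or, p0):.3f}")
```

Here the odds ratio (about 4.7) is far from the risk ratio of 2.63 because 27% of patients died; the correction recovers the risk ratio for this unadjusted table. It needs P0, which a cohort study provides and a case-control study does not.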
Measures of Association
Example
Acinetobacter outbreak
Need to identify the risk factors
You gather information from 100 patients with Acinetobacter and 200 patients without.

Key: Select sample based on the outcome (Acinetobacter)

Because the sample was selected based on the outcome (a subset of everyone who might get the outcome in your population), you can never know the actual incidence of the outcome in everyone who was exposed.

Cohort Study Sample
  Everyone Exposed: Outcome
  Everyone Not Exposed: Outcome

Case-Control Study Sample
  Subset with Outcome: Exposure Status
  Subset without Outcome: Exposure Status
Cannot know everyone exposed who gets the outcome

Analyze a number of risk factors to see if they are associated with Acinetobacter infection
Example
Assuming a case-control study…
Outbreak Investigation: Was having a traumatic wound associated with Acinetobacter baumannii infection?
+ Acinetobacter: 100
- Acinetobacter: 200
+ Acinetobacter, wound: 55
- Acinetobacter, wound: 10

           Acinetobacter +   Acinetobacter -
Wound +    55                10
Wound -    45                190

Example
Interpret the Odds Ratio
Anyone??
Odds Ratio: (55 x 190) / (10 x 45) = 23.2
Those with Acinetobacter have 23 times the odds of having a traumatic wound compared to those without Acinetobacter.
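A small sketch of the odds ratio calculation for the wound table, written the case-control way: the odds of exposure among cases divided by the odds of exposure among controls. The function and argument names are only illustrative.

```python
def odds_ratio(cases_exposed, controls_exposed, cases_unexposed, controls_unexposed):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    odds_in_cases = cases_exposed / cases_unexposed
    odds_in_controls = controls_exposed / controls_unexposed
    return odds_in_cases / odds_in_controls


# Cases (Acinetobacter +): 55 with a wound, 45 without.
# Controls (Acinetobacter -): 10 with a wound, 190 without.
wound_or = odds_ratio(55, 10, 45, 190)
print(f"odds ratio = {wound_or:.1f}")
```

This is the same number as the cross-product (55 x 190) / (10 x 45); writing it as a ratio of two odds makes the order of interpretation explicit.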
Example
Interpret the Odds Ratio
What?

Example
Interpret the Odds Ratio
Order of interpretation:
              Outcome +   Outcome -
Predictor +
Predictor -
Risk Versus Odds
So what’s the difference?
How you choose your population
Risk: Know the incidence of the outcome.
Odds: Don’t know the incidence of the outcome.

Risk Versus Odds
So what’s the difference?
How you choose your population
You can’t identify the likelihood of someone with a predictor getting an outcome because you don’t know everyone who had the outcome

Risk Versus Odds
Correct the Odds
Common Outcomes = Odds is a poor approximation of Risk

Risk Versus Odds
Even Chuck Norris Hates Odds.
Measures of Association
3. Hazard Ratio
Used for time-to-event data
As good as the risk ratio
Answers: How much more (less) likely are those in this group to get the outcome versus this other group at any given time?
Overview
1. Testing for an Association
2. Other Measures of Association
3. Confidence Intervals
Confidence Intervals
[Diagram: Patients in the Universe and Patients in the Sample, linked by Sampling and Generalizing]
Confidence Intervals
P-value is not good.
Uses an arbitrary cutoff (0.05)
Doesn’t give info on precision
Doesn’t help you generalize
Fix: Use Confidence Interval
Confidence Intervals
Definition – 95% CI
You are 95% confident that the risk (odds) of the patients in the universe is within that interval.
“Universe” is not everyone in the world – it is everyone you can generalize back to.
If the CI includes 1, that measure of association is not statistically significant (like a P-value >0.05)
‘Tighter’ CI = more power, more precision, larger sample

Confidence Intervals
Caveat
Since the CI gets tighter with more people in the sample, every measure of association (except exactly 1) will eventually be significant with a large enough sample size.
Confidence Intervals
Is this risk ratio statistically significant?
                Dead +   Dead -
Bacteremia +    25       100
Bacteremia -    310      1537

Confidence Intervals
Is the RR from the bacteremia example statistically significant?
Risk Ratio: 1.19
95% CI: (0.83, 1.72)
No – 95% Confidence Interval includes 1

Example
What happens as we increase the sample size?
Using the same proportions of Predictors and Outcomes
                Dead +   Dead -
Bacteremia +    200      800
Bacteremia -    2500     12400

Sample Size
Now is the RR from the bacteremia example statistically significant?
Risk Ratio: 1.19 (Same as before)
95% Confidence Interval: (1.05, 1.36)
Yes – 95% CI does not include 1.
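The slides report the intervals without showing how they were obtained. One common method, assumed here, is the Wald interval on the log scale; with it, both bacteremia tables come out close to the intervals quoted above. A minimal sketch:

```python
import math


def risk_ratio_ci(a, b, c, d, z=1.96):
    """Risk ratio and Wald (log-scale) 95% CI for [[a, b], [c, d]].

    Rows are exposed / unexposed; columns are died / survived.
    """
    rr = (a / (a + b)) / (c / (c + d))
    se_log_rr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


tables = {"smaller sample": (25, 100, 310, 1537), "larger sample": (200, 800, 2500, 12400)}
for label, counts in tables.items():
    rr, lower, upper = risk_ratio_ci(*counts)
    includes_one = lower <= 1 <= upper
    print(f"{label}: RR = {rr:.2f}, 95% CI ({lower:.2f}, {upper:.2f}), includes 1: {includes_one}")
```

The risk ratio barely moves between the two tables; only the standard error shrinks, and that is what pulls the lower bound above 1.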
Sample Size
What happens as we increase the sample size?
The confidence interval becomes tighter
Assuming the proportion of patients in each group stays the same, the risk ratio eventually becomes statistically significant.
This is because the power you have to detect that effect size has increased.

Sample Size
What happens as we increase the sample size?
The larger your sample, the closer you are to actually sampling the entire universe.
Therefore, your confidence interval is tighter and closer to “the truth in your universe.”

Sample Size
What happens as we increase the sample size?
This makes sense.
The more people in your study, the closer you are to having the universe as your sample. Therefore your statistic should be pretty close to the ‘truth in the universe’.
Confidence Intervals
[Diagram: Patients in the Universe and Patients in the Sample; Sampling (easy), Generalizing (hard)]

Confidence Intervals
[Diagram: Patients in the Universe and Patients in the Sample; Sampling (hard), Generalizing (easy)]