Bias and confounding - kupublicifsv.sund.ku.dk/~pka/epi18/MK2.pdfTypes of bias 02/02/2018 7...

Course: Epidemiological methods in medical research

Mads Kamper-Jørgensen Section of Epidemiology February 6th 2018

Bias and confounding

02/02/2018 1

The world according to an epidemiologist

• We estimate the association between an exposure and an outcome. But does the association reflect causality or is it due to error?

Today we will talk about

• Chance

• Information bias

• Selection bias

• Confounding

Exposure Outcome

Two types of error

Type I error

• We demonstrate an association, although no such association exists

• We typically accept a risk of type I error (α-level) of 5%

Type II error

• We do not demonstrate an association, although one actually exists

• We typically accept a risk of type II error (β-level) of 20%

The error rates are traded off against each other. The only way to reduce both error rates is to increase the sample size.
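
The trade-off can be illustrated with a minimal sketch, assuming a two-sided z-test that compares two group means. The function name `power_two_sample` and the standardized effect size of 0.5 are illustrative assumptions, not part of the course material. With α fixed at 5%, β falls only as the sample size grows.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution


def power_two_sample(effect_size, n_per_group, alpha=0.05):
    """Approximate power (1 - beta) of a two-sided z-test for two means.

    effect_size is the assumed true difference in standard-deviation units.
    """
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = effect_size * sqrt(n_per_group / 2)
    # Probability of landing in either rejection region
    return _STD_NORMAL.cdf(shift - z_crit) + _STD_NORMAL.cdf(-shift - z_crit)


for n in (20, 64, 200):
    beta = 1 - power_two_sample(0.5, n)
    print(f"n per group = {n:3d}: alpha = 0.05, beta = {beta:.2f}")
```

With no true effect (`effect_size=0`) the function returns α itself, which is the type I error risk.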

Type I and type II error

Result of study               The truth: association exists                   The truth: no association exists
Association demonstrated      Reject the null hypothesis (correct inference)  Reject the null hypothesis (type I error)
Association not demonstrated  Accept the null hypothesis (type II error)      Accept the null hypothesis (correct inference)

Precision and bias

Blood pressure measured once for 20 people

A) Precise, unbiased: Blood pressure meter

B) Precise, biased: Poorly calibrated blood pressure meter

C) Imprecise, unbiased: iPhone

D) Imprecise, biased: Poorly calibrated iPhone

Precision and bias

RANDOM ERROR

• Reduces the precision

• Has no direction

• Depends on sample size: Bigger is better

• Does not necessarily lead to bias

SYSTEMATIC ERROR

• Reduces the validity

• Leads to over- or underestimation

• Does not depend on sample size: Bigger is not better

• Leads to bias
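
The contrast can be sketched with a small simulation, assuming a hypothetical true blood pressure of 120 mmHg, normally distributed measurement noise, and a meter that reads 5 mmHg too high. All numbers and names are illustrative assumptions. The random error averages out as the sample grows, while the systematic error stays the same.

```python
import random

TRUE_BP = 120.0  # hypothetical true systolic blood pressure (mmHg)


def mean_measurement(n, bias, sd=10.0, seed=1):
    """Mean of n simulated readings with random noise (sd) and a fixed bias."""
    rng = random.Random(seed)
    readings = [TRUE_BP + bias + rng.gauss(0.0, sd) for _ in range(n)]
    return sum(readings) / n


for n in (20, 2000, 20000):
    unbiased = mean_measurement(n, bias=0.0)  # random error only
    biased = mean_measurement(n, bias=5.0)    # random + systematic error
    print(f"n = {n:5d}: error without bias = {unbiased - TRUE_BP:+.2f}, "
          f"error with bias = {biased - TRUE_BP:+.2f}")
```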

Types of bias

Information bias

• Has to do with the information about study participants

Selection bias

• Has to do with the selection of study participants

Confounding

• Has to do with mixing of effects because the compared study participants are not comparable

Why information bias?

• Because we can over- or underestimate frequencies or associations and draw the wrong inference if the information on participants is incorrect

• So far we have assumed correct information: (hardly) ever the case

• Pertains to exposure, covariates and/or outcome

• Due to e.g. biologic variation, poor memory, imprecise question, ignorance etc.

• Information bias is due to systematically incorrect information about participants

• You can’t undo information bias once data have been collected, so use precise instruments and questions, standardized procedures, blinding, and training

Sensitivity and specificity

• Sensitivity: the ability of a test to classify true positives (TP) as positives. Calculation: TP/(TP+FN)

• Specificity: the ability of a test to classify true negatives (TN) as negatives. Calculation: TN/(TN+FP)

• Most often related to the quality of a biologic test, but can also describe how well a question reflects ‘truth’

Test result     Truly diseased   Truly non-diseased
Diseased        TP               FP
Non-diseased    FN               TN
Total           TP+FN            FP+TN
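
The two formulas can be written out as a minimal sketch. The counts below are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def sensitivity(tp, fn):
    """Proportion of truly diseased classified as diseased: TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Proportion of truly non-diseased classified as non-diseased: TN / (TN + FP)."""
    return tn / (tn + fp)


# Hypothetical test results: 100 truly diseased, 200 truly non-diseased
tp, fn = 90, 10
tn, fp = 160, 40

print(f"Sensitivity: {sensitivity(tp, fn):.2f}")
print(f"Specificity: {specificity(tn, fp):.2f}")
```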

Misclassification

• Wrong classification of participants

• If misclassification is similar in the compared groups it’s called non-differential misclassification

• If misclassification is not similar in the compared groups it’s called differential misclassification

• Both non-differential and differential misclassification may cause information bias

Examples from own research

Ignorance

• Few adult Americans received transfusion

Culture

• Few adult Frenchmen drink alcohol

Poor question

• Few Danish children have age-appropriate motor skills

Quiz

• Visit www.madskamper.dk/epiphd

• Take only the HPV quiz

• Discuss with your neighbour

http://www.madskamper.dk/epiphd

Misclassification

• Fictitious cohort study of the association between alcohol consumption and self-perceived health using a poor measure of alcohol consumption

• True information on alcohol: RR=1.66

            Good   Bad   Total
Abstinent    236    59     295
Consumer     846   419    1265
Total       1082   478    1560

Non-differential misclassification


• 10% of consumers are misclassified: RR=1.38

            Good   Bad   Total
Consumer     761   377    1139
Total       1082   478    1560

Non-differential misclassification


• 10% of consumers are misclassified: RR=1.38

• 20% of consumers are misclassified: RR=1.27

• The association moves towards no difference between groups, i.e. towards 0 on an absolute scale and towards 1 on a relative scale

            Good   Bad   Total
Consumer     677   335    1012
Total       1082   478    1560

Differential misclassification


• 10% of consumers are misclassified, but only among those with self-perceived bad health: RR=1.03

            Good   Bad   Total
Consumer     846   377    1223
Total       1082   478    1560

Differential misclassification


• 10% of consumers are misclassified, but only among those with self-perceived bad health: RR=1.03

• 20% of consumers are misclassified, but only among those with self-perceived bad health: RR=0.75

• Can reverse the association

            Good   Bad   Total
Consumer     846   335    1181
Total       1082   478    1560
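
The relative risks on the misclassification slides can be reproduced with a short sketch. It assumes that misclassified consumers are recorded as abstinent, which is consistent with the column totals staying fixed in the tables. The function names are illustrative, and fractional persons are kept rather than rounded.

```python
# True table from the fictitious cohort: group -> (good health, bad health)
TRUE_TABLE = {"abstinent": (236, 59), "consumer": (846, 419)}


def relative_risk(table):
    """Risk of bad health among consumers divided by the risk among abstinent."""
    risk = {group: bad / (good + bad) for group, (good, bad) in table.items()}
    return risk["consumer"] / risk["abstinent"]


def misclassify(table, fraction, only_bad_health=False):
    """Move a fraction of consumers to the abstinent row.

    only_bad_health=False: non-differential (same fraction in both outcome groups).
    only_bad_health=True: differential (only consumers with bad health are moved).
    """
    abst_good, abst_bad = table["abstinent"]
    cons_good, cons_bad = table["consumer"]
    moved_good = 0.0 if only_bad_health else fraction * cons_good
    moved_bad = fraction * cons_bad
    return {
        "abstinent": (abst_good + moved_good, abst_bad + moved_bad),
        "consumer": (cons_good - moved_good, cons_bad - moved_bad),
    }


print(f"True information:      RR = {relative_risk(TRUE_TABLE):.2f}")
for fraction in (0.10, 0.20):
    non_diff = relative_risk(misclassify(TRUE_TABLE, fraction))
    diff = relative_risk(misclassify(TRUE_TABLE, fraction, only_bad_health=True))
    print(f"{fraction:.0%} non-differential: RR = {non_diff:.2f}")
    print(f"{fraction:.0%} differential:     RR = {diff:.2f}")
```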

Examples of differential misclassification

Case-control study

• Recall bias: cases remember exposures differently (often better) than controls. Not the same as poor memory!

• Interviewer bias: Interviewer asks differently (often in more detail) regarding exposures among cases compared with controls

Cohort study

• Detection bias: exposed have a different (often higher) chance of having the outcome detected compared with unexposed

• Interviewer bias: exposed are asked differently (often in more detail) about the outcome compared with unexposed

BREAK

What are the sources of information bias in your project, and is it non-differential or differential?

Why selection bias?

• Because we can over- or underestimate frequencies or associations and draw the wrong inference if the study population does not represent the target population

• So far we have assumed that participants in our study are comparable to those who do not participate: Not always the case

• Selection bias is due to systematic differences between participants and those who do not participate

• Selection into the cohort and attrition

Selection bias

Target population → Source population → Study population

Systematic differences can arise at each step

An example of selection bias

Target population

• Pregnant women in Denmark

Source population

• Pregnant women at selected GPs

Study population

• Participants in the Danish National Birth Cohort (DNBC): participation dependent on whether the woman wanted to participate

Selection bias?

• Is the study population different from the source population, and is the source population different from the target population?

It depends …

DNBC women are different

• They drink less, they are better educated, they eat healthier, they use less medication etc.

Scientific question

• How many use painkillers during pregnancy? Yes, very likely selection bias

• Is folic acid associated with neural tube defects? No, not very likely

Because

• In comparative studies, selection bias arises only if both the exposure and the outcome are associated with the likelihood of participating in the study

Validity

Internal validity

• Do the results apply to the target population?

• Threatened by selection bias, information bias and confounding

External validity

• Do results apply beyond the target population?

• Dependent on internal validity

• Qualitative statement of the direction and strength of an association

Are the results biased?

• We often do not know if the frequency or association is biased by selection because we often do not have information about non-participants

• Risk of selection bias must be considered depending on the scientific question, the study design, and the applied data

• Texan study of HIV prevalence

Matthew McConaughey in ‘Dallas Buyers Club’

What to do?

Data collection

• Maximize response rate through reminders, competitions, payment etc.

• Response rates have dropped over the last 30 years

• Snowball sampling (hard-to-reach groups)

• National registers without selection

Quiz


• Take only the hepatitis quiz


http://www.madskamper.dk/epiphd

Examples of selection bias

Intervention and cohort studies

• Generally not a problem because selection must relate to both exposure and outcome (which happens in the future)

• Attrition bias, e.g. a new anti-depressant and depression: underestimates the effect of the new anti-depressant because the most depressed participants using the old drug drop out

Case-control studies

• Poor selection of controls, e.g. pancreatic cancer and coffee: overestimates the effect of coffee because controls have been advised not to drink coffee

Examples of selection bias

Cross-sectional studies

• Survival bias, e.g. smoking and COPD: underestimates the effect of smoking because smokers with COPD are at high risk of dying

Can selection bias explain it?

• 1000 people were invited to participate in a study of the association between sex and hair loss. Of those, 650 (65%) agreed.

• OR = (100/200) / (50/300) = 3.00 (95% CI 2.04 - 4.40)

• We suspect men losing their hair to be more interested in participating than the other groups.

          + Hair loss   - Hair loss
Man               100           200
Woman              50           300
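
The odds ratio and its confidence interval can be recomputed from the four cells. This is a minimal sketch that assumes the log-based (Woolf) interval; the slide does not state which method was used, but this one gives the same rounded limits. The function name `odds_ratio_ci` is illustrative.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def odds_ratio_ci(a, b, c, d, level=0.95):
    """Odds ratio (a/b)/(c/d) with a log-based (Woolf) confidence interval."""
    odds_ratio = (a / b) / (c / d)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# a, b = men with / without hair loss; c, d = women with / without hair loss
or_, lower, upper = odds_ratio_ci(100, 200, 50, 300)
print(f"OR = {or_:.2f} (95% CI {lower:.2f} - {upper:.2f})")
```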

Can selection bias explain it?

• All men losing their hair participate, while participation in the other groups is 61%

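          + Hair loss   - Hair loss
Man        100 (100%)     200 (61%)
Woman       50 (61%)      300 (61%)

With a, b, c and d denoting the cells of the table read row by row, and part% the participation percentage in each cell:

$$\text{Observed OR} = \text{True OR} \times \frac{\text{part\%}(a) / \text{part\%}(c)}{\text{part\%}(b) / \text{part\%}(d)}$$

$$3 = \text{True OR} \times \frac{100 / 61}{61 / 61}$$

True OR = 1.83 (95% CI 1.32-2.53)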
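
The correction can be sketched as follows, assuming the participation proportions are known for each of the four cells (a, b = men with / without hair loss; c, d = women with / without hair loss). The function name `corrected_odds_ratio` is illustrative.

```python
def corrected_odds_ratio(observed_or, part_a, part_b, part_c, part_d):
    """Divide the observed OR by the selection factor (part_a/part_c) / (part_b/part_d)."""
    selection_factor = (part_a / part_c) / (part_b / part_d)
    return observed_or / selection_factor


observed_or = (100 / 200) / (50 / 300)
true_or = corrected_odds_ratio(observed_or, part_a=1.00, part_b=0.61,
                               part_c=0.61, part_d=0.61)
print(f"Observed OR = {observed_or:.2f}, corrected OR = {true_or:.2f}")
```

If participation is the same in all four cells, the selection factor is 1 and the observed odds ratio is unchanged.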

BREAK

Do you have reasons to fear selection in your studies, and can you justify it?

Confounding

What is it?

• To mix up, confuse, mistake

• Used in epidemiology to describe mixing up of causes of a given effect

• Leads to misinterpretation, wrong inference

An example

• Does birth order affect the risk of Down’s syndrome?

Birth order and Down’s syndrome

From: K Rothman: Epidemiology – An Introduction 2002

Denmark in 2005-2009: ~0.5 per 1000 births

Maternal age and Down’s syndrome


Birth order, maternal age and Down’s syndrome


Confounding

Is present when

• An observed association between exposure and outcome can be fully or partly attributed to a different distribution of risk factors for the outcome among exposed and unexposed, i.e. non-exchangeability

Criteria

• Independent risk factor for the outcome

• Associated with the exposure

• Not an intermediate step between exposure and outcome

Confounder model

Exposure Outcome

Confounder

Independent risk factor for the outcome

Associated with the exposure

Not intermediate between exposure and outcome

Quiz


• Take the last quiz


http://www.madskamper.dk/phd

Confounder identification

Methods

• Stepwise selection (forwards or backwards)

• Change-in-estimate

• Causal diagrams (DAGs)

Recommendation

• Common sense

• Do not necessarily do what others have done before

Confounder control

DESIGN

Randomization

• Not possible in observational design

Matching

• Not possible to investigate the effect of the matching variable

• May remove the effect you are interested in studying

• Twin and sibling design

ANALYSIS

Standardization

• Indirect standardization (one population is standard)

• Direct standardization (external standard population)

Stratified analysis

• Only possible to stratify according to a few variables

Multivariate analysis

• Adjust simultaneously for several variables

• Estimates from such analysis are called ‘adjusted’
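
Stratified analysis can be illustrated with a minimal sketch. The counts are entirely fictitious and constructed so that the confounder is more common among the exposed and raises the risk of the outcome. The Mantel-Haenszel estimator is one common way to pool stratum-specific risk ratios; it is used here as an assumption, not as the method prescribed in the course.

```python
def risk_ratio(cases_exp, n_exp, cases_unexp, n_unexp):
    """Risk among exposed divided by risk among unexposed."""
    return (cases_exp / n_exp) / (cases_unexp / n_unexp)


def mantel_haenszel_rr(strata):
    """Pooled risk ratio over strata of (cases_exp, n_exp, cases_unexp, n_unexp)."""
    numerator = sum(a * n0 / (n1 + n0) for a, n1, c, n0 in strata)
    denominator = sum(c * n1 / (n1 + n0) for a, n1, c, n0 in strata)
    return numerator / denominator


# Fictitious data: (cases among exposed, exposed, cases among unexposed, unexposed)
STRATA = [
    (80, 800, 20, 200),  # confounder present: risk 10% in both exposure groups
    (2, 200, 8, 800),    # confounder absent: risk 1% in both exposure groups
]

# Crude analysis ignores the confounder by collapsing the strata
totals = [sum(column) for column in zip(*STRATA)]
crude_rr = risk_ratio(*totals)
adjusted_rr = mantel_haenszel_rr(STRATA)

print(f"Crude RR = {crude_rr:.2f}, adjusted (Mantel-Haenszel) RR = {adjusted_rr:.2f}")
```

In this constructed example the crude association is produced entirely by the confounder, so it disappears within strata.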

Unmeasured vs. residual confounding

Unmeasured

• Variables which we have no data on

Residual

• If the categorization is too crude or the information regarding the confounder is imprecise

Look out for mix-ups

Design and bias

Sir Bradford Hill’s criteria of causality

Criterion: Explanation

Strength: Strength depends on the prevalence. A strong association is unlikely to be due to confounding alone

Consistency: Several investigations point in the same direction, i.e. replicated in other designs and settings

Specificity: One cause leads to one outcome

Temporality: Cause must predate effect

Dose-response: The risk of outcome increases with increasing exposure

Plausibility: Plausible biological explanation?

Experimental evidence: Designs with control of conditions (RCT or animal models)

Analogy: If some exposures are harmful, similar exposures are probably harmful too
