CPH EXAM REVIEW– EPIDEMIOLOGY

Lina Lander, Sc.D.
Associate Professor
Department of Epidemiology, College of Public Health
University of Nebraska Medical Center

January 24, 2014

• Review of basic topics covered in the epidemiology section of the exam

• Materials covered cannot replace basic epidemiology course

• This review will be archived on the NBPHE website under Study Resources: http://www.nbphe.org/

Outline

• Overview
• Terminology
• Study design
• Causation and validity
• Screening

Populations

Group of people with a common characteristic
– Place of residence, age, gender, religion
– People who live in Omaha, Nebraska in January, 2014
– Occurrence of a life event (undergoing cancer treatment, giving birth)

Populations

• Membership can be permanent or transient
– Population with permanent membership is referred to as “Fixed” or “Closed”
  • People present at Hiroshima
  • Passengers on an airplane
– Population with transient membership is referred to as “Dynamic” or “Open”
  • Population of Omaha

Measures of Morbidity

Measures of Frequency

“Count” - the most basic epidemiologic measure
– Expressed as integers (1, 2, 3, …)
– Answers the question, “How many people have this disease?”
– Often the numerator of many measures
– Important to distinguish between incident (new) and prevalent (existing) cases

Forecast of cancer deaths

Ratio

• One number (x) divided by another (y): $x / y$
• Range: zero (0) to infinity (∞)
• (x) and (y) may be related or completely independent
• Example: sex of children attending a clinic, expressed as females / males

Which of the following terms is expressed as a ratio (as distinguished from a proportion)?

(A) Male Births / (Male + Female Births)
(B) Female Births / (Male + Female Births)
(C) Male Births / Female Births
(D) Stillbirths / (Male + Female Births)

Proportion

• Ratio in which the numerator (x) is included in the denominator (x+y): $x / (x + y)$

• Range: zero (0) to one (1)

• Often expressed as a percentage (e.g., among all children who attended a clinic, what proportion was female?)

Rate

• Can be expressed as (a/T) where (a) = cases and (T) involves a component of time
• Range: zero (0) to infinity (∞)
• Measures speed at which things happen
• Example: rate of at-risk females coming to a clinic (per unit of time)

Prevalence

• Proportion
• Not a rate – no time component in the calculation
• Measures proportion of existing disease in the population at a given time
• “Snapshot”
• Dimensionless, positive number (0 to 1)

Prevalence proportion

$$\text{Prevalence} = \frac{A}{A + B} = \frac{A}{N}$$

Where:

A = number of existing cases
B = number of non-cases
N = total population

Incidence

• Measures the occurrence of new cases in a population at risk over time

• Can be measured as a proportion or a rate

• The most fundamental epidemiologic indicator
• Measures force of morbidity (as a rate)
• Measures conversion of health status (proportion / rate)

Incidence proportion

• Synonyms: incidence, cumulative incidence, risk

• Measures probability (risk) of developing disease during period of observation

• Dimensionless, positive number (0 to 1)

Incidence Proportion

$$IP = \frac{a}{N}$$

Where:

a = number of new onset cases (events)
N = population-at-risk at beginning

Incidence Proportion

• Appropriate for fixed (closed) populations and short follow-up

• Must specify time period of observation because risk changes with time

• Not appropriate for long-term follow-up due to potential loss of subjects

• Assuming: complete follow-up, same risk over time

Follow 2000 newborns at monthly intervals to measure development of respiratory infection in the first year

• Suppose 50 infants develop respiratory infection in first year of life

• The risk (probability) of developing a respiratory infection in the first year of life is 50/2000 = 2.5%

• 25 of 1000 infants in this population, or 1 in 40, will develop infection in the first year of life.
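The arithmetic in this example can be checked with a minimal sketch. The function name `incidence_proportion` is an illustrative assumption, not part of the original review; the counts (50 cases among 2000 newborns) come from the example.

```python
def incidence_proportion(new_cases, population_at_risk):
    """Cumulative incidence (risk): new cases / population at risk at the start."""
    if population_at_risk <= 0:
        raise ValueError("population_at_risk must be positive")
    return new_cases / population_at_risk


# Counts from the newborn example: 50 infections among 2000 newborns.
risk = incidence_proportion(50, 2000)

print(f"Risk in first year: {risk:.1%}")
print(f"Cases per 1000 infants: {risk * 1000:.0f}")
print(f"Roughly 1 in {1 / risk:.0f} infants")
```

The same function applies to any fixed (closed) population with complete follow-up; it is not appropriate when subjects are lost or followed for different lengths of time.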

Incidence Rate

• Measures how rapidly new cases develop during specified time period

• Cases per person-time
• Synonyms: incidence, incidence density, rate
• Follow-up may be incomplete
• Risk period not the same for all subjects

Incidence Rate

$$IR = \frac{a}{T}$$

Where:

a = number of new onset cases
T = person-time at risk during study period (follow-up)

Person-time

• Accounts for all the time each person is in the population at risk

• The length of time for each person is called person-time

• Sum of person-times is called the total person-time at risk for the population

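Person-time

[Figure: follow-up timelines for five subjects over years 1 to 5 (one subject died); person-years contributed: 1, 5, 5, 3, 2]

T = person-time at risk during study period = 1+5+5+3+2 = 16 person-years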

Person-time Assumption

• 100 persons followed 10 years = 1000 person years
• 1000 persons followed for 1 year = 1000 person years

Follow 2000 newborns at monthly intervals to measure development of respiratory infection in the first year

• 50 infants develop respiratory infection
• 1900 complete the first year disease free
• 25 complete 3 months (0.25 years) before infection
• 25 complete 6 months (0.5 years) before infection

Calculate incidence rate:

$$IR = \frac{a}{T} = \frac{50}{(1900 \times 1) + (25 \times 0.25) + (25 \times 0.5)} = \frac{50}{1918.75} \approx 2.6 \text{ per 100 person-years}$$
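As a check on the person-time calculation, here is a minimal sketch. The function name `incidence_rate` and the `(people, years)` input layout are assumptions made for illustration; the numbers are those given in the example.

```python
def incidence_rate(new_cases, person_time_groups):
    """Incidence rate = new cases / total person-time at risk.

    person_time_groups: iterable of (number_of_people, years_each_at_risk).
    """
    total_person_time = sum(n * years for n, years in person_time_groups)
    return new_cases / total_person_time


# Person-time contributions from the newborn example.
groups = [
    (1900, 1.0),   # disease free for the full year
    (25, 0.25),    # infected after 3 months
    (25, 0.5),     # infected after 6 months
]
person_years = sum(n * years for n, years in groups)
rate = incidence_rate(50, groups)

print(f"Total person-years at risk: {person_years}")
print(f"Incidence rate: {rate * 100:.1f} per 100 person-years")
```

Unlike the incidence proportion (50/2000), the denominator here counts only the time each infant was actually at risk.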

Incidence, Prevalence, Duration

• Prevalence increases as new cases added to the existing cases (i.e., incidence)

• Prevalence decreases as people are cured or die

• Prevalence ≈ Incidence * Duration (when incidence and duration are stable over time and prevalence is low)

Measures of Mortality

Mortality

• Measures the occurrence of death
• Can be measured as a proportion or a rate
• Can measure disease severity or effectiveness of treatment

Mortality Rate

• Measures rate of death in the population over a specified amount of time

• Positive number (0 to ∞)
• Can be a measure of incidence rate (risk) when disease is severe and fatal, e.g. pancreatic cancer
• Synonym: fatality rate

Mortality Rate

$$\text{Mortality rate} = \frac{d}{N \times T}$$

Where:

d = number of deaths
N = total population at mid-point of time period
T = follow-up time (usually one year)

Cancer Death Rates*, for Men, US, 1930-2003

*Age-adjusted to the 2000 US standard population.
Source: US Mortality Public Use Data Tapes 1960-1999, US Mortality Volumes 1930-1959, National Center for Health Statistics, Centers for Disease Control and Prevention, 2002.

Case Fatality Rate

• This is not a rate, this is a proportion
• Proportion of deaths from a specific illness

$$\text{Case Fatality Rate} = \frac{a}{N}$$

Where:
a = Number of deaths from an illness
N = Number of people with that illness

What percentage of people diagnosed as having a disease die within a certain time after diagnosis?

Case-fatality rate

• Case-fatality – a measure of the severity of the disease

• Case-fatality – can be used to measure benefits of a new therapy
– As therapy improves, the case-fatality rate would be expected to decline
– e.g. AIDS deaths with the invention of ARVs

Proportionate Mortality

• Of all deaths, the proportion caused by a certain disease
• Can determine the leading causes of death
• Proportion of cause-specific death is dependent on all other causes of death
• This does not tell us the risk of dying from a disease

Proportionate mortality from cardiovascular disease in the U.S. in 2013 = (number of U.S. deaths from cardiovascular diseases in 2013 / total deaths in the U.S. in 2013) x 1,000

Which measure of mortality would you calculate to determine the proportion of all deaths that is caused by heart disease?

(A) Case fatality
(B) Cause-specific mortality rate
(C) Crude mortality rate
(D) Proportionate mortality ratio
(E) Potential years of life lost

Other Mortality Rates

• Crude Mortality Rate
– Includes all deaths, total population, in a time period

• Cause-Specific Mortality Rate
– Includes deaths from a specific cause, total population, in a time period

• Age-Specific Mortality Rate
– Includes all deaths in specific age group, population in the specific age group, in a time period

Age-Specific Mortality Rates

Mortality Rates (Year = 2000)

Panama

• Population = 2,899,513
• Deaths = 13,483
• Mortality Rate = 4.65 per 1000 per year

Sweden

• Population = 8,923,569
• Deaths = 93,430
• Mortality Rate = 10.47 per 1000 per year

Why do you think Sweden has more than twice the crude mortality rate of Panama?
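The two crude rates can be reproduced with a short sketch. The function name `crude_mortality_rate` is an illustrative assumption; the populations and death counts are those listed for the year 2000.

```python
def crude_mortality_rate(deaths, population, per=1000):
    """Crude mortality rate per `per` persons per year (one-year follow-up assumed)."""
    return deaths / population * per


# Figures from the Panama and Sweden example (year 2000).
panama = crude_mortality_rate(13_483, 2_899_513)
sweden = crude_mortality_rate(93_430, 8_923_569)

print(f"Panama: {panama:.2f} per 1000 per year")
print(f"Sweden: {sweden:.2f} per 1000 per year")
print(f"Crude rate ratio (Sweden / Panama): {sweden / panama:.2f}")
```

The ratio of crude rates says nothing about age structure, which is why the comparison needs age-specific or standardized rates.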

Differences in Mortality Rates

Crude mortality rates do not take into account differences between populations such as age

Can we remove this confounding by age?

• Separate (stratify) the population into age groups and calculate rates for each age
– Compare age-specific mortality rates

• If two different populations, adjust (standardize) the mortality rates of the two populations, taking into account the age structures
– Results in comparable rates between populations or in the same population over time

Direct Standardization

• If the age composition of the populations were the same, would there be any differences in mortality rates?

• Direct age adjustment is used to remove the effects of age structure on mortality rates in two different populations

• Apply actual age-specific rates to a standard population (US population 2000)
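Direct standardization can be sketched as follows. All numbers below are hypothetical and chosen only for illustration (they are not from the review or from any real population): two populations share identical age-specific rates but differ in age structure, so their crude rates differ while their directly standardized rates agree. The function names are also assumptions.

```python
def crude_rate(deaths_by_age, population_by_age):
    """Crude rate: all deaths divided by the total population."""
    return sum(deaths_by_age) / sum(population_by_age)


def directly_standardized_rate(deaths_by_age, population_by_age, standard_population):
    """Apply the population's own age-specific rates to a standard population."""
    rates = [d / n for d, n in zip(deaths_by_age, population_by_age)]
    expected_deaths = sum(r * s for r, s in zip(rates, standard_population))
    return expected_deaths / sum(standard_population)


# Hypothetical data, two age groups: [younger, older].
pop_a, deaths_a = [900_000, 100_000], [1_800, 5_000]    # young population
pop_b, deaths_b = [600_000, 400_000], [1_200, 20_000]   # older population
standard = [800_000, 200_000]                           # hypothetical standard

crude_a = crude_rate(deaths_a, pop_a)
crude_b = crude_rate(deaths_b, pop_b)
adjusted_a = directly_standardized_rate(deaths_a, pop_a, standard)
adjusted_b = directly_standardized_rate(deaths_b, pop_b, standard)

print(f"Crude rates per 1000: A = {crude_a * 1000:.1f}, B = {crude_b * 1000:.1f}")
print(f"Age-adjusted rates per 1000: A = {adjusted_a * 1000:.1f}, B = {adjusted_b * 1000:.1f}")
```

In this constructed case the entire difference in crude rates comes from age structure, so adjustment removes it completely; with real data some difference usually remains.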

Indirect Standardization

• When age-specific rates are not available – use age-specific mortality rates from the general population to calculate expected number of deaths

Standardized mortality ratios (SMR) = observed deaths/ expected deaths

• If the age composition of the populations were the same, would there be any differences in mortality rates?

Study Design

• Experimental studies (Clinical Trial, Randomized Controlled Trial)

• Observational studies– Cohort– Case-control– Cross-sectional – Ecological

Experimental studies are characterized by:

• The population under study: who is eligible for study entry?

• The intervention(s) being used or compared: what treatment(s) are being used? Therapeutic (e.g., drug) or preventive (e.g., education)?

• The method of treatment assignment: how are subjects assigned to intervention(s)?

• The outcomes of interest: how will success be measured?

Randomized Controlled Trials

• A randomized controlled trial is a type of experimental research design for comparing different treatments, in which the assignment of treatments to patients is made by a random mechanism.

• Customary to present table of patient characteristics to show that the randomization resulted in a balance in patient characteristics.

Randomized Controlled Trials

[Figure: randomized controlled trial design, with time on the horizontal axis]

Steps in carrying out a clinical trial

1. Select a sample from the population

2. Ethical considerations

3. Measure baseline variables

4. Randomize

5. Apply interventions

6. Follow up the cohorts

7. Measure outcome variables (blindly, if possible)

Use of “Blinding”

Important when knowing treatment could influence the interpretation of results

Especially important when outcomes are subjective (pain, functional status) and/or when placebo is employed (either alone or to mask actual treatment)

Placebo: ensures control and treatment groups have the same “experience”

May not be necessary if the outcome is an objective measure (death, blood glucose)

Treat

• Blinding of the participants to which treatment was used helps avoid bias
– Single-blind: patient does not know what treatment they are receiving
– Double-blind: patient and investigator do not know what treatment is being received (cannot be used for some treatments, e.g. surgery)

It All Comes Down to…

Obtaining groups that are comparable for everything except the treatment…

So that differences in outcome can fairly be ascribed only to the difference between the groups (i.e., to the treatment).

Cohort Studies

• Definition: groups, defined on the basis of some characteristics (often exposure and non-exposure) are typically prospectively followed to see whether an outcome of interest occurs (may also be retrospective)

• Comparison of interest: Compare the proportion of persons with the disease in the exposed group to the proportion with the disease in the unexposed group.

• Motivation: If the exposure is associated with the disease, we expect that the proportion of persons with the disease in the exposed group will be greater than the proportion with disease in the unexposed group.

Cohort Studies

[Figure: cohort design. Exposed and not exposed groups are each followed to diseased or not diseased. Prospective: now → future. Retrospective: past → now.]

Prospective cohort studies

• Define sample free of the disease/outcome of interest, measure the exposure and classify to exposed vs unexposed at time zero, then follow up at fixed time point to ascertain outcome

• Measure the ratio of outcome between the exposed and unexposed (Relative Risk)

Retrospective cohort studies

• Synonyms: historical cohort study, historical prospective study, non-concurrent prospective study

• Retrospective cohort studies are not designed a priori; the research question is always posed in retrospect

• Exposures and Outcomes have already occurred - data on the relevant exposures and outcomes must have been collected

Cohort study strengths

• May be used to define incidence / natural history
• Known temporal sequence
• Efficient in investigating rare exposures
• Permits study of multiple exposures AND outcomes
• Fewer biases or bias can be limited or evaluated
– Homogeneous sample population
– Accurate measurement of important variables
– No bias in ascertainment of outcome

Cohort study limitations

• Expensive and inefficient – especially for rare diseases or outcomes (large sample size, long-term follow up)
• Associations may be due to confounding
– Can adjust statistically for any measured potential confounders
• Must exclude subjects with outcome present at onset
• Disease with long pre-clinical phase may not be detected
• Sensitive to follow-up bias (loss of diseased subjects)

Case-control Studies

• Definition: compare various characteristics (past exposure) for cases (subjects with disease) to those of controls (subjects without the disease)

• Comparison of interest: Compare the proportion with the exposure in the cases to the proportion with the exposure in the control group.

• Motivation: If the exposure is associated with the disease, we expect that the proportion of persons with the exposure in the cases will be greater than the proportion with the exposure in the control group.

Case-control Studies

[Figure: case-control design. Cases with disease and controls without disease are each classified as exposed in the past or not exposed in the past.]

Case-Control Studies

• Efficient for rare diseases
• Efficient for diseases with long latency
• Can evaluate multiple exposures
• Lower costs of exposure assessment relative to cohort studies
• Incident cases preferable for causal research
• Challenges of control selection
• Challenges of retrospective exposure assessment

Case-control studies are among the best observational designs to study diseases of:

(A) High prevalence
(B) High validity
(C) Low case fatality
(D) Low prevalence

Example

• Is smoking associated with coronary heart disease (CHD)?
• 3000 smokers and 5000 nonsmokers were followed to see if they developed CHD
• Case control or cohort study?

Exposure | Developed CHD | No CHD | Total
Smoker | 84 | 2916 | 3000
Non-smoker | 87 | 4913 | 5000
Total | 171 | 7829 | 8000

Cross-Sectional Studies

• Prevalence studies
• All measurements of exposure and outcome are made simultaneously (snapshot)
• Disease proportions are determined and compared among those with or without the exposure or at varying levels of the exposure
• Examine associations with outcomes; generate hypotheses that are the basis for further studies
• Most appropriate for studying the associations between chronic diseases and chronic exposures
• Sometimes useful for common acute diseases of short duration

Cross-Sectional Studies

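[Figure: cross-sectional design. At a single time point (T0), data on exposure (cause) and disease (effect / outcome) are gathered from a defined population, classifying people as exposed with disease, exposed without disease, not exposed with disease, or not exposed without disease.]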

Ecological

• The unit of observation is the population or community

• Disease rates and exposures are measured in each of a series of populations

• Disease and exposure information may be abstracted from published statistics and therefore does not require expensive or time consuming data collection

Study Design

Category | Type | Subtype | Characteristics
Analytic | Experimental | | Investigates prevention and treatment
 | Observational | | Investigates causes, prevention and treatment
 | Observational | Cohort | Investigates health effects of exposure
 | Observational | Cross-Sectional | Examines exposure-disease associations at a point in time
 | Observational | Case-Control | Investigates risk factors for disease
 | Observational | Ecological | Examines exposure-disease association at the population level
Descriptive | Descriptive | | Describes health of populations

Probabilities

• Denote the probability of an event by p, where p ranges from 0 to 1.

• Notation: p = probability that event occurred

1-p = probability that event did not occur

Example

• What is the probability of CHD in the smoking and CHD cohort (171 CHD cases among 8000 subjects)?
• 171/8000 = 0.02

Relative Risk

$$RR = \frac{\text{incidence among exposed}}{\text{incidence among unexposed}}$$

• Approximates how much more likely it is for the outcome to be present among a certain group of subjects than another group

• RR = 1 implies that the risk is the same in the two groups
• RR < 1 implies that the risk is higher in the unexposed
• RR > 1 implies that the risk is higher in the exposed

Example: Sleeping Position and Crib Death

Usual sleeping position | Crib death: YES | NO | TOTAL
Prone | 9 | 837 | 846
Other | 6 | 1755 | 1761
Total | 15 | 2592 | 2607

1-year cumulative incidence prone = 9/846 = 10.64 per 1000
1-year cumulative incidence other = 6/1761 = 3.41 per 1000

Risk Ratio = 10.64 per 1000 / 3.41 per 1000 = 3.1
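The risk ratio in the crib death example can be verified with a minimal sketch. The helper names `risk` and `relative_risk` are illustrative assumptions; the counts are from the sleeping position table.

```python
def risk(cases, total):
    """Cumulative incidence in one group."""
    return cases / total


def relative_risk(cases_exposed, total_exposed, cases_unexposed, total_unexposed):
    """Risk in the exposed divided by risk in the unexposed."""
    return risk(cases_exposed, total_exposed) / risk(cases_unexposed, total_unexposed)


# Crib death counts: prone (exposed) 9 of 846, other positions 6 of 1761.
risk_prone = risk(9, 846)
risk_other = risk(6, 1761)
rr = relative_risk(9, 846, 6, 1761)

print(f"Prone: {risk_prone * 1000:.2f} per 1000")
print(f"Other: {risk_other * 1000:.2f} per 1000")
print(f"Risk ratio: {rr:.1f}")
```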

Odds Ratios

• Relative risk requires an estimate of the incidence of the disease

• For case control studies, we do not know the incidence of disease because we determine the number of cases and controls when the study is designed

• For case control studies, use the odds ratio (OR)

Odds

• Odds are another way of representing a probability

• The odds are the ratio of the probability that the event of interest occurs to the probability that it does not.

• The odds are often estimated by the ratio of the number of times that the event occurs to the number of times that it does not.

Odds Ratio

$$\text{odds} = \frac{p}{1 - p}$$

$$\text{odds ratio} = \frac{p_1 / (1 - p_1)}{p_2 / (1 - p_2)} = \frac{p_1 (1 - p_2)}{p_2 (1 - p_1)}$$

Odds Ratio Example

• Case control study of 200 CHD cases and 400 controls to examine association of smoking with CHD (Note: now we are examining the probability of exposure)

• What is the probability of smoking among CHD cases? p = 112/200 = 0.56

• What are the odds of smoking among CHD cases? p/(1-p) = 0.56/0.44 = 1.27

 | CHD Cases | Controls | Total
Smokers | 112 | 176 | 288
Nonsmokers | 88 | 224 | 312
Total | 200 | 400 | 600

Odds Ratio Example

• What is the probability of smoking among controls? p = 176/400 = 0.44
• What are the odds of smoking among controls? p/(1-p) = 0.44/0.56 = 0.79
• The odds ratio is 1.27 / 0.79 = 1.62 (computed from the unrounded odds)

• Interpretation: The odds of smoking among CHD cases are 1.62 times the odds of smoking among controls
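The odds ratio in this example can be computed two equivalent ways, as this minimal sketch shows. The function names are illustrative assumptions; the counts are from the CHD case-control table (112 and 88 among cases, 176 and 224 among controls).

```python
def odds(p):
    """Convert a probability into odds."""
    return p / (1 - p)


def odds_ratio_cross_product(cases_exp, controls_exp, cases_unexp, controls_unexp):
    """Cross-product odds ratio for a case-control 2x2 table."""
    return (cases_exp * controls_unexp) / (controls_exp * cases_unexp)


# Exposure odds among cases and among controls.
odds_cases = odds(112 / 200)
odds_controls = odds(176 / 400)
or_from_odds = odds_cases / odds_controls

# The same quantity from the cross-product of the table cells.
or_cross_product = odds_ratio_cross_product(112, 176, 88, 224)

print(f"Odds of smoking, cases: {odds_cases:.2f}")
print(f"Odds of smoking, controls: {odds_controls:.2f}")
print(f"Odds ratio: {or_from_odds:.2f} (cross-product: {or_cross_product:.2f})")
```

Dividing the rounded odds (1.27 / 0.79) gives about 1.61, so the 1.62 quoted in the example reflects the unrounded values.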


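Odds Ratio vs. Relative Risk

 | Disease | No Disease | Total
Exposed | a | b | a+b
Not Exposed | c | d | c+d
Total | a+c | b+d | a+b+c+d

$$\text{Relative risk} = \frac{\text{incidence in exposed}}{\text{incidence in unexposed}} = \frac{a/(a+b)}{c/(c+d)}$$

$$\text{Odds ratio} = \frac{\dfrac{a/(a+c)}{c/(a+c)}}{\dfrac{b/(b+d)}{d/(b+d)}} = \frac{a/c}{b/d} = \frac{ad}{bc}$$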

Odds ratio

$$\text{Odds ratio} = \frac{\text{odds of exposure in cases}}{\text{odds of exposure in controls}}$$

• OR = 1: exposure is not associated with the disease
• OR > 1: exposure is positively associated with the disease
• OR < 1: exposure is negatively associated with the disease

Odds Ratio vs. Relative Risk

• Both compare the likelihood of an event between two groups

• OR compares the relative odds of an event in each group

• RR compares the probability of an event in each group
– More ‘natural’ interpretation because risk measured in terms of probability
– Cannot always be computed
– Can lead to ambiguous interpretations

A case-control study comparing ovarian cancer cases with community controls found an odds ratio of 2.0 in relation to exposure to radiation. Which is the correct interpretation of the measure of association?

(A) Women exposed to radiation had 2.0 times the risk of ovarian cancer when compared to women not exposed to radiation

(B) Women exposed to radiation had 2.0 times the risk of ovarian cancer when compared to women without ovarian cancer

(C) Ovarian cancer cases had 2.0 times the odds of exposure to radiation when compared to controls

(D) Ovarian cancer cases had 2.0 times the odds of exposure to radiation when compared to women with other cancers

Odds Ratios vs. Relative Risks

• In general, odds ratios summarize associations from case-control studies and cross-sectional studies, and relative risks can be used to summarize associations in cohort studies.

• The odds ratio can be used to estimate the relative risk in a case control study when:

1. Cases are representative of people with the disease in the population with respect to history of exposure AND

2. The controls are representative of people without the disease in the population with respect to history of exposure AND

3. The disease is rare

Odds ratio estimates relative risk when disease is rare

• When the disease is rare, the number of people with the disease (a and c) is small so that a+b≈b and c+d≈d

$$RR = \frac{a/(a+b)}{c/(c+d)} \approx \frac{a/b}{c/d} = \frac{ad}{bc} = OR$$
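To see how close the two measures are when the outcome is uncommon, the sketch below applies both formulas to the smoking and CHD cohort table shown earlier (a = 84, b = 2916, c = 87, d = 4913, where about 2% of subjects developed CHD). The function names are illustrative assumptions.

```python
def relative_risk(a, b, c, d):
    """RR = [a / (a + b)] / [c / (c + d)] for a 2x2 table."""
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """OR = ad / bc for a 2x2 table."""
    return (a * d) / (b * c)


# Smoking and CHD cohort: rows are smokers / non-smokers,
# columns are developed CHD / no CHD.
a, b, c, d = 84, 2916, 87, 4913

rr_chd = relative_risk(a, b, c, d)
or_chd = odds_ratio(a, b, c, d)

print(f"Relative risk: {rr_chd:.2f}")
print(f"Odds ratio:    {or_chd:.2f}")
```

The odds ratio is slightly farther from 1 than the relative risk, and the gap widens as the disease becomes more common.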

Odds Ratios for matched case control studies

• Often, cases are matched with a control based on age, sex, etc.

• For a matched study, describe the results for each pair

• Concordant pairs: both case and control exposed or both not exposed

• Discordant pairs: Case exposed/control unexposed or case unexposed/control exposed

Odds Ratios for matched case control studies

 | Controls: Exposed | Controls: Unexposed
Cases: Exposed | a | b
Cases: Unexposed | c | d

OR is based on the discordant pairs:

OR = b/c

Cohort study is to risk ratio as:

(A) Ecologic fallacy is to cross-sectional study
(B) Case-control study is to odds ratio
(C) Genetics is to environment
(D) Rate ratio is to ecologic study

Measures of Effect and Association

Risk

• RR and OR measure strength of the association

• How much of the disease can be attributed to the exposure? How much of the CHD risk experienced by smokers can be attributed to smoking?

• OR and RR do not address this.

Measures of Association

• Contrast measure of occurrence in two populations
– Cancer incidence rates in males and females in Canada
– Incidence rate of dental caries in children within a community before and after fluoridation
– Both of these are measures of association

Absolute Measures

• Causal rate difference
• Incidence rate difference
• Incidence density difference
• Rate difference
• Attributable rate

• Causal risk difference
• Incidence proportion difference
• Cumulative incidence difference
• Risk difference
• Excess risk
• Attributable risk

Risk Difference

• Most often referred to as “attributable risk”
– Refers to the amount of risk attributable to the exposure of interest
– For example, in the birth cohort analysis, where exposure = prenatal care in the first 5 months:

RD = R1 – R0 = Excess risk of preterm birth attributable to prenatal care

Absolute Excess Measures

[Figure: bar chart of incidence proportion (or rate) in the unexposed and exposed groups. The unexposed bar is the background risk (incidence not due to exposure); the additional height of the exposed bar is the excess risk (or rate) in the exposed (incidence due to exposure).]

If E is thought to cause D: Among persons exposed to E, what amount of the incidence of D is E responsible for?

Example: Sleeping Position and Crib Death

Risk difference = 10.64 per 1000 – 3.41 per 1000 = 7.23 per 1000 (added risk due to exposure)

Attributable Risk Percent

$$AR\% = \frac{IP_1 - IP_0}{IP_1} \times 100$$

What proportion of occurrence of disease in exposed persons is due to the exposure?

(Risk difference / Risk in Exposed) x 100

Example: Sleeping Position and Crib Death

Risk difference = 10.64 per 1000 – 3.41 per 1000 = 7.23 per 1000

Attributable risk percent = [(10.64 per 1000 – 3.41 per 1000) / 10.64 per 1000] x 100 = 68.0%
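The risk difference and attributable risk percent can be reproduced with a minimal sketch. The function names are illustrative assumptions; the counts are from the sleeping position and crib death table (9 of 846 prone, 6 of 1761 other).

```python
def risk_difference(risk_exposed, risk_unexposed):
    """Excess risk in the exposed: IP1 - IP0."""
    return risk_exposed - risk_unexposed


def attributable_risk_percent(risk_exposed, risk_unexposed):
    """Share of risk in the exposed that is due to exposure: (IP1 - IP0) / IP1 x 100."""
    return (risk_exposed - risk_unexposed) / risk_exposed * 100


ip1 = 9 / 846     # 1-year cumulative incidence, prone
ip0 = 6 / 1761    # 1-year cumulative incidence, other positions

rd = risk_difference(ip1, ip0)
ar_percent = attributable_risk_percent(ip1, ip0)

print(f"Risk difference: {rd * 1000:.2f} per 1000")
print(f"Attributable risk percent: {ar_percent:.1f}%")
```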

Population Attributable Risk

$$PAR = IP_T - IP_0$$

What is the excess risk in the population caused by exposure E?

Population Attributable Risk Percent

$$PAR\% = \frac{IP_T - IP_0}{IP_T} \times 100$$

What proportion of occurrence of disease in the population is due to the exposure?

Population Attributable Risk

[Figure: bar chart of incidence proportion (or rate) in the unexposed, the exposed, and the total population]

Should resources be allocated to controlling E or, instead, to exposures causing greater health problems in the population?

Example: Sleeping Position and Crib Death

1-year cumulative incidence total = 15/2607 = 5.75 per 1000
1-year cumulative incidence other = 6/1761 = 3.41 per 1000

Population attributable risk (PAR) = 5.75 per 1000 – 3.41 per 1000 = 2.35 per 1000 (from unrounded incidences)

Example: Sleeping Position and Crib Death

1-year cumulative incidence total = 15/2607 = 5.75 per 1000
1-year cumulative incidence other = 6/1761 = 3.41 per 1000

Population attributable risk percent (PAR%) = [(5.75 per 1000 – 3.41 per 1000) / 5.75 per 1000] x 100 = 40.8% (from unrounded incidences)

Population Attributable Risk Percent

$$PAR\% = \frac{P_E (RR - 1)}{P_E (RR - 1) + 1} \times 100$$

Affected by the prevalence of exposure in the population ($P_E$) and the relative risk ($RR$)
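The two ways of obtaining PAR% (from incidences, or from exposure prevalence and relative risk) give the same answer, as this minimal sketch illustrates with the crib death counts. The function names are illustrative assumptions; treating the prone group (846 of 2607) as the exposed fraction follows the table above.

```python
def par(ip_total, ip_unexposed):
    """Population attributable risk: IP_T - IP_0."""
    return ip_total - ip_unexposed


def par_percent(ip_total, ip_unexposed):
    """PAR% from incidences: (IP_T - IP_0) / IP_T x 100."""
    return (ip_total - ip_unexposed) / ip_total * 100


def par_percent_from_rr(prevalence_exposed, rr):
    """PAR% from exposure prevalence and relative risk."""
    excess = prevalence_exposed * (rr - 1)
    return excess / (excess + 1) * 100


ip_total = 15 / 2607
ip0 = 6 / 1761
p_e = 846 / 2607                  # proportion of infants sleeping prone
rr = (9 / 846) / (6 / 1761)       # relative risk, prone vs. other

par_value = par(ip_total, ip0)
par_pct = par_percent(ip_total, ip0)
par_pct_rr = par_percent_from_rr(p_e, rr)

print(f"PAR: {par_value * 1000:.2f} per 1000")
print(f"PAR% from incidences: {par_pct:.1f}%")
print(f"PAR% from P_E and RR: {par_pct_rr:.1f}%")
```

Because PAR% depends on the prevalence of exposure as well as the relative risk, a strong risk factor that is rare in the population can still have a small PAR%.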

Absolute Measures

Measure | Abbrev. | Formula | Helps answer the question
Risk difference (attributable risk to the exposed) | RD, AR | I1 – I0 | If E is thought to cause D: Among persons exposed to E, what amount of the incidence of D is E responsible for?
Attributable risk percent | AR% | [(I1 – I0)/I1] x 100 | What proportion of occurrence of disease in exposed persons was due to the exposure?
Attributable risk to the population | PAR | IT – I0 | Should resources be allocated to controlling E or, instead, to exposures causing greater health problems in the population?
Attributable risk to the population (%) | PAR% | [(IT – I0)/IT] x 100 | What portion of D in the population is caused by E? Should resources allocated for D be directed toward etiologic research or E?

Summary of Measures

• Absolute measures address questions about public health impact of an exposure
– Excess risk in the exposed or population attributable to the exposure
• Relative measures address questions about etiology and relations between exposure and outcome
– Relative difference in risk between exposed and unexposed populations

Causal Inference

Definition of a cause

• “That which produces an effect, result or consequence or the one such as a person, event or condition that is responsible for an action or result” American Heritage Dictionary

• Implies reason and occasion

Key Characteristics of a Cause

1. Essential attributes: association, time order and direction

2. Causes include:
– host and environmental factors
– active agents and static conditions

3. Causes may be either positive (presence induces disease) or negative (absence induces disease)

The Epidemiologic Triad

[Figure: the epidemiologic triad linking host, agent, environment, and vector; factors involved in the natural history of disease. Image: http://www.who.int/multimedia/ethiopiaweb/MALARIA/WHO-208698.jpg]

Risk factors vs. causes

• Risk factors often used in epidemiology instead of causes

• A cautious way of making causal inference

• Risk factors are not direct causes of disease

• Serve to identify proximate causes

Causal Inference

• During the 1950s and 1960s, epidemiologists developed a set of postulates for causal inferences regarding non-infectious diseases of unknown etiology

• Response to the discovery of association between smoking and lung cancer

• Debates by many epidemiologists yielded 5 criteria in the 1964 Report of the Advisory Committee to the US Surgeon General on Smoking and Health

• Sir Austin Bradford Hill came up with the best known criteria or guidelines in 1965

• In 1976 Rothman presented a view of causation now known as the “Sufficient-Component Theory of Causation”

Hill Criteria

1. Strength of Association
2. Consistency
3. Specificity of the Association
4. Temporal relationship
5. Biological gradient
6. Biologic plausibility
7. Coherence
8. Experiment
9. Analogy

Sufficient-Component Cause Model

• Sufficient cause is a complete causal mechanism that inevitably produces disease

• Sufficient cause is not a single factor but rather a minimal set of factors that inevitably produce disease
– Sufficient cause for AIDS may include: exposure (HIV infection), susceptibility, lack of preventive exposures (absence of ARVs)

• Each participating factor in a sufficient cause is termed a component cause

Disease Causation – 2 components

• Sufficient Cause
– precedes the disease
– if the cause is present, the disease always occurs

• Necessary Cause
– precedes the disease
– if the cause is absent, the disease cannot occur

Disease causation: Types of causal relationships

1. Necessary and sufficient: Without that factor, the disease never develops, and in the presence of that factor, the disease always develops

2. Necessary but not sufficient: Without that factor, the disease never develops, but other factors are needed as well

3. Sufficient but not necessary: The factor can produce the disease, but so can other factors

4. Neither sufficient nor necessary: The factor itself cannot cause the disease but plays a role; multiple factors interact to cause the disease

Sufficient-Component Cause Model - attributes

1. Blocking the action of a single component stops the completion of the sufficient cause, thus prevents the disease from occurring by that pathway

2. Completion of a sufficient cause is synonymous with the biologic onset of disease

3. Some component causes may be distant causes and others may be proximate causes

From Study to Causation

• Associations between ‘exposures’ and outcomes identified in observational studies may or may not be ‘causal’

• There is need to pay attention to valid assessment of exposure and outcome in order to think about causality
– Reliability
– Validity

• External validity
• Internal validity – three concepts are considered
– Bias
– Confounding
– Chance (Random error)

Validity

• Implies that a measure actually measures what it purports to measure:
– Appropriate
– Accurate (has same numerical value as the phenomenon being investigated, i.e. free of systematic error or bias)
– Precise (minimal variations are only because of chance or random error)

• Validity of a study implies that the findings are the “truth”

• The degree to which a measurement or study reaches a correct conclusion

• Two types of validity: Internal validity, External validity

External validity: generalizability

• The extent to which the results of a study are applicable to the general population
– Do the study results apply to other patients?

• A representative sample is drawn from the population (usually randomly)

• Individuals have equal chance to participate in the study

• Usually involves a sampling frame

• Inference is made back to the population

Internal validity

• Is the extent to which the results of the study accurately reflect the true situation of the study population

• Is influenced by:
– Chance
  • The probability that an observation occurred unpredictably, without discernible human intention or observable cause
– Bias
  • Any systematic error (not random or due to chance) in a study which leads to an incorrect estimate of the association between exposure and disease
– Confounding
  • The influence of other variables in a study which leads to an incorrect estimate of the association between exposure and disease

Random error

• Chance

• “That part of our experience that we cannot predict” (Rothman and Greenland)

• Usually most easily conceptualized as sampling variability and can be influenced by sample size

Random error can be problematic, but . . .

• Influence can be reduced– increase sample size– change design of sampling– improve precision of instrument

• Probability of some types of influence can be quantified (e.g., confidence interval width)

I. Bias - Definition

• Any systematic error (not random or due to chance) in a study which leads to an incorrect estimate of the association between exposure and disease or outcome

• Therefore:
  – Bias is a systematic error that results in an incorrect (invalid) estimate of the measure of association

I. Bias - Definition

1. Can create a spurious association when there is none (bias away from the null)
2. Can mask an association when there is one (bias towards the null)
3. Bias is primarily introduced by the investigator or study participants
4. Bias does not mean that the investigator is “prejudiced”
5. Can occur in all study types: experimental, cohort, case-control
6. Occurs in the design and conduct of a study
7. Bias can be evaluated but not “fixed” in the analysis phase
8. Two main types are selection and observation bias

• Bias towards the null – observed value is closer to 1.0 than is the true value

• Bias away from the null – observed value is farther from 1.0 than is the true value

Direction of bias

[Figure: number lines showing the positions of the observed and true values relative to the null]

Types of bias

• Selection bias
  – Refusals, exclusions, non-participants
  – Failure to enumerate the entire population
  – Loss to follow up

• Observation/Information bias
  – Diagnostic (lead time) surveillance bias
  – Interviewer bias
  – Recall bias
  – Classification of exposure and outcome

• Misclassification bias (is part of information bias)
  – Non-differential
  – Differential

II. Selection bias

• Any systematic error that occurs in the process of identifying study populations

• The error that occurs whenever the identification and selection of individual subjects for inclusion into study is not independent of outcome (cohort) or exposure (case-control)

• Error due to systematic difference between those selected for study versus those not selected for the study

II. Selection Bias

1. Results from procedures used to select subjects into a study that lead to a result different from what would have been obtained from the entire population targeted for the study

2. Most likely to occur in case-control or retrospective cohort studies because exposure and outcome have already occurred at the time of study selection

3. Selection bias can also occur in prospective cohort and experimental studies from differential loss to follow-up
  – impacts which subjects are “selected” for the analysis

II. Selection bias

• Occurs when there is a systematic difference between those selected for study versus those who were not

– Refusers, non-participants, non-response, exclusions

– Failure to enumerate the entire population

– Those lost to follow up if related to exposure or outcome

– Differential selection of exposed/unexposed groups, or cases and controls

– Volunteers

– Healthy workers - example


II. Selection bias- cohort study

• Selection bias occurs when selection of exposed and unexposed subjects is not independent of the outcome (so this type can only occur in retrospective cohort study)

• Example:
  • A retrospective study of an occupational exposure to asbestos and lung disease in a factory setting
  • The exposed and unexposed groups are enrolled on the basis of prior employment records
  • The records are old and many are lost, so the complete cohort working in the plant is not available for study
  • If people who were exposed and did not develop the disease were more likely to have their records lost, then there will be an overestimate of the association between the exposure and disease
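The lost-records scenario can be worked through with made-up numbers. The sketch below is only an illustration: the cohort sizes, disease counts, and the assumption that half of the records for exposed, non-diseased workers were lost are all hypothetical.

```python
def risk_ratio(exp_cases: float, exp_noncases: float,
               unexp_cases: float, unexp_noncases: float) -> float:
    """Risk in the exposed divided by risk in the unexposed."""
    risk_exposed = exp_cases / (exp_cases + exp_noncases)
    risk_unexposed = unexp_cases / (unexp_cases + unexp_noncases)
    return risk_exposed / risk_unexposed

# Hypothetical complete cohort: 1000 exposed (100 diseased), 1000 unexposed (50 diseased).
true_rr = risk_ratio(100, 900, 50, 950)

# Assumption: half of the records for exposed, non-diseased workers were lost.
observed_rr = risk_ratio(100, 900 * 0.5, 50, 950)

print(f"true RR = {true_rr:.2f}, observed RR = {observed_rr:.2f}")
```

With these numbers the true risk ratio is 2.0, while the available records give roughly 3.6. Losing exposed, non-diseased subjects inflates the apparent risk among the exposed.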

II. Selection bias- cohort study

Solutions:

– Increase participation

– Get relevant information on refusers

– Develop follow-up mechanisms

– Use comparable populations

– Valid assessment of outcome in prospective cohort studies (for example: blinding)

II. Selection bias: case-control study

• Sources of selection bias
  – Decisions about selecting incident or prevalent (survival) cases
  – When controls do not reflect the population that gave rise to the cases

• The selection of cases and controls must be independent of the exposure status
  – Do controls in the study have higher or lower prevalence of exposure than controls not selected for the study?
  – Cases and controls should have the same exposure opportunities (e.g., welders and general population)

II. Selection bias: case-control study

1. Occurs when controls or cases are more or less likely to be included in a study if they have been exposed; that is, inclusion in the study is not independent of exposure

2. Results: relationship between exposure and disease observed among study participants is different from relationship between exposure and disease in eligible individuals who were not included

3. The odds ratio from a study that suffers from selection bias will incorrectly represent the relationship between exposure and disease in the overall study population


II. Selection bias: cross-sectional study

• Selection bias can occur when

– Sampling frame does not represent the true underlying population of interest

  • Voter registration lists
  • Driver’s license records
  • Telephone lists

II. Selection bias: cross-sectional study

– Estimates of association do not take into account the sampling structure

– There are sufficient numbers of refusers that the underlying sampling structure is compromised

– When the “sample” is a convenience sample (sampling fraction)


II. Selection Bias: solutions?

• Little or nothing can be done to fix this bias once it has occurred in cross-sectional studies

• Need to avoid it during design and implementation:
  – using the same criteria for selecting cases and controls
  – obtaining all relevant subject records
  – obtaining high participation rates
  – taking into account diagnostic patterns of disease

III. Observation/information bias

• An error that arises from systematic differences in the way information on exposure or disease is obtained from the study groups

• Results in participants who are incorrectly classified as either exposed or unexposed or as diseased or not diseased

• Occurs after the subjects have entered the study

• Several types of observation bias: recall bias, interviewer bias, and differential and non-differential misclassification

III. Observation/Information bias

• Recall bias

• People with disease remember or report exposures differently (more/less accurately) than those without disease
  – Differential ability of subjects to remember previous activities and exposures, e.g., in serious diseases
  – Cases search their memory to understand their illness
  – E.g., birth defects

• Can result in over- or under-estimation of the measure of association


• Recall bias

• Solutions:
  – Use controls who are themselves sick
  – Use standardized questionnaires that obtain complete information
  – Mask subjects to study hypothesis


• Interviewer bias

• Systematic difference in soliciting, recording, interpreting information

• Can occur whenever exposure information is sought when outcome is known (as in case-control) or when outcome information is sought when exposure is known (as in cohort study)

III. Observation/Information bias

• Interviewer bias
  – The way the interviewer asks questions, including the possibility of probing, e.g., in in-person and telephone interviews, especially where the outcome has already occurred (case-control and retrospective cohort studies)
  – Solutions:
    • Mask interviewers to study hypothesis and disease or exposure status of subjects
    • Use standardized questionnaires, or standardized methods of outcome or exposure ascertainment
    • Use biomarkers to compare when possible
    • Surrogates tend to underreport exposures


• Classification of exposure and outcome
  – Leads to misclassification bias
  – If exposure status is known in cohort studies, or outcome status is known in case-control studies
  – Solution: blinding of data collectors to exposure/outcome status

III. Observation/Information bias – Misclassification bias

• A type of information bias

• Error arising from inaccurate measurement or classification of study subjects or variables

• Subject’s exposure or disease status is erroneously classified

• Happens at the assessment of exposure or outcome in both cohort and case-control studies

• Two types: non-differential and differential


A. Non-differential misclassification

• Inaccuracies with respect to disease classification are independent of exposure

• Inaccuracies with respect to exposure are independent of disease status

• The probability of exposure (or of outcome) misclassification is the same for cases and controls (or in study/comparison groups)

• Bias results towards the null - if the exposure has two categories, will make groups more similar (Type II error)

• Solution: Use multiple measurements, most accurate sources of information

B. Differential Misclassification

• Differential misclassification

– Probability of misclassification of disease or exposure status differs for exposed and unexposed persons (cohort) or presence or absence of exposure (case-control)

– Probability of misclassification is different for cases and controls or for levels of exposure within cases and controls

– Direction of bias is unknown, i.e. overestimation or underestimation of the true risk

– Know that the observed OR deviates from truth, but direction is unknown

B. Differential Misclassification

• Also known as systematic misclassification:
  – The probability of misclassification of disease or exposure status is correlated with presence or absence of characteristic in the study or control group.

– Thus, misclassification of presence or absence of disease differs for exposed and unexposed persons, or of presence or absence of exposure in cases and controls: it is differential
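The contrast between non-differential and differential misclassification can be sketched with expected counts in a case-control study. Everything here is hypothetical: the true counts, and the sensitivity and specificity of the exposure measurement, are assumptions chosen only for illustration.

```python
def odds_ratio(case_exp, case_unexp, ctrl_exp, ctrl_unexp):
    """Exposure odds ratio for a case-control 2x2 table."""
    return (case_exp * ctrl_unexp) / (case_unexp * ctrl_exp)

def observed_counts(exposed, unexposed, sensitivity, specificity):
    """Expected (exposed, unexposed) counts after imperfect exposure classification."""
    obs_exposed = sensitivity * exposed + (1 - specificity) * unexposed
    obs_unexposed = (1 - sensitivity) * exposed + specificity * unexposed
    return obs_exposed, obs_unexposed

# Hypothetical true counts: cases 200 exposed / 100 unexposed, controls 100 / 200.
true_or = odds_ratio(200, 100, 100, 200)

# Non-differential: the same error rates apply to cases and controls.
nondiff_or = odds_ratio(*observed_counts(200, 100, 0.80, 0.90),
                        *observed_counts(100, 200, 0.80, 0.90))

# Differential: assume cases report exposure more completely than controls.
diff_or = odds_ratio(*observed_counts(200, 100, 0.95, 0.90),
                     *observed_counts(100, 200, 0.70, 0.90))

print(f"true OR = {true_or:.2f}")
print(f"non-differential OR = {nondiff_or:.2f}")
print(f"differential OR = {diff_or:.2f}")
```

With these assumed error rates the non-differential estimate falls between 1.0 and the true odds ratio, while the differential estimate overshoots it. Swapping the sensitivities between cases and controls would push the differential estimate toward the null instead, which is why its direction cannot be assumed in advance.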

Confounding


Definition and Impact

• An alternate explanation for the observed association between exposure and disease

• “A mixing of effects”: the association between exposure and disease is distorted because it is mixed with the effects of another factor that is associated with the disease

• Result of confounding is to distort the true association toward the null (negative confounding) or away from the null (positive confounding)


Confounder

• A confounder is associated with the exposure and independently of that exposure is a risk factor for the disease

• A confounder distorts the observed effect, which can be:
  – overestimated (positive), or
  – underestimated (negative), or even
  – changed in direction (a spurious relationship)

Criteria for a variable to be a confounder

• The third variable must not be an intermediate link in the causal chain between exposure and outcome (i.e., is not an intermediate or intervening variable)

• The third variable must cause the outcome event (i.e., must be an independent predictor of disease with or without exposure)

• The third variable must be associated (correlated) with exposure (but not caused by the exposure)

[Diagram: confounding, showing the relationships among E (exposure), D (disease), and C (confounder)]

Example:
- smoking is a confounder of the effect of occupational exposures (to dyes) on bladder cancer
- age is a confounder of the effect of DDT pesticide exposure on breast cancer

Opportunities for confounding

• In experimental designs:
  – Participation differs in study and control groups
  – There is no evidence for randomization
  – There is evidence of residual confounding

• In cohort and case-control studies:
  – When selection of comparison group differs by subject characteristics
  – When risk factors other than the exposure are distributed differently between the exposed and unexposed groups
  – There is evidence of residual confounding

Controlling for confounding

In the design phase:

• Goal is to eliminate or reduce variation in the level of the confounding factor between compared groups

• Remember, a variable can only be a confounder if it is different between compared groups.

Control for confounding- design phase

– Randomization
  • With sufficient sample size, randomization is likely to control for both known and unknown confounders, but this is not guaranteed

– Restriction
  • Restrict admissibility criteria for study subjects and limit entrance to individuals who fall within a specified category of the confounder

– Matching
  • Select study subjects so that the potential confounders are distributed in an identical manner among the exposed and unexposed groups (cohort study) or among the cases and controls (case-control study)

Control for confounding- analysis phase

– Standardization: by age, race, gender, or calendar time in order to make fair comparisons between populations

– Stratified analysis: test for homogeneity between strata to check whether it is confounding or effect modification; pool for a summary measure only if there is evidence of homogeneity

– Matched analysis: implement analysis of matched design

– Restriction: Restrict during data analysis

– Multivariate analysis: To enable controlling for several potential confounders simultaneously
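A stratified analysis can be sketched with a small worked example. The slides do not name a pooling method; the code below assumes the Mantel-Haenszel pooled odds ratio, a commonly used summary estimator, and uses hypothetical counts in which the stratum-specific odds ratios are both 1.0.

```python
def odds_ratio(a, b, c, d):
    """a, b = exposed cases / exposed non-cases; c, d = unexposed cases / unexposed non-cases."""
    return (a * d) / (b * c)

def mantel_haenszel_or(strata):
    """Pooled odds ratio across strata, each given as an (a, b, c, d) table."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator

# Hypothetical data stratified by a confounder.
strata = [
    (80, 20, 40, 10),  # confounder present: mostly exposed, high disease risk
    (10, 40, 20, 80),  # confounder absent: mostly unexposed, low disease risk
]

# Collapsing the strata gives the crude (unadjusted) table.
crude_table = tuple(sum(column) for column in zip(*strata))
crude_or = odds_ratio(*crude_table)
adjusted_or = mantel_haenszel_or(strata)

print(f"crude OR = {crude_or:.2f}, adjusted OR = {adjusted_or:.2f}")
```

In this constructed example the crude odds ratio is 2.25 but the adjusted value is 1.0, so the apparent association is produced entirely by the confounder.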

Effect modification

• Interaction

• The strength of the association between an exposure and disease differs according to the level of another variable.

• Modification of the relationship between exposure and a disease by a third variable.

• If the association changes according to the level of the third variable, then effect modification is present
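Effect modification shows up as stratum-specific estimates that differ from one another. The counts below are hypothetical, and the sketch only computes the stratum-specific odds ratios; a formal test of homogeneity is not shown.

```python
def odds_ratio(a, b, c, d):
    """a, b = exposed cases / exposed non-cases; c, d = unexposed cases / unexposed non-cases."""
    return (a * d) / (b * c)

# Hypothetical 2x2 tables at each level of a third variable.
strata = {
    "third variable absent": (30, 70, 30, 70),
    "third variable present": (60, 40, 20, 80),
}

stratum_ors = {level: odds_ratio(*table) for level, table in strata.items()}
for level, value in stratum_ors.items():
    print(f"{level}: OR = {value:.2f}")
```

Here the odds ratio is 1.0 at one level and 6.0 at the other. Reporting a single pooled estimate would hide that difference, so the stratum-specific values are the ones to present.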

Measurement Error

Measurement

• Measurement of exposure, outcome, and other relevant characteristics is a key part of epidemiologic studies

• Almost all tests and measures are imperfect! Knowledge of how well a measure performs helps to:
  – Choose alternative measures
  – Interpret results of studies using a specific measure

Reliability

• How closely do duplicate measurements of the same characteristic agree with each other

• Examples:
  – Test-retest reliability: agreement between responses on a questionnaire that is administered two (or more) times to the same person
  – Intra-observer: agreement of a given interpreter with him/herself
  – Inter-observer: agreement among different interpreters

• Reliability is usually higher with more standardized or automated measurement procedures, lower when more complex judgments are required by human observers (e.g., x-ray reading)
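Inter-observer agreement can be quantified in more than one way. The sketch below computes simple percent agreement and Cohen's kappa, a chance-corrected statistic that is not mentioned in the slides and is included here as an assumption about how agreement might be summarized. The two readers' ratings are made up.

```python
def percent_agreement(ratings_1, ratings_2):
    """Proportion of subjects on which the two readers give the same rating."""
    matches = sum(x == y for x, y in zip(ratings_1, ratings_2))
    return matches / len(ratings_1)

def cohens_kappa(ratings_1, ratings_2):
    """Agreement beyond what would be expected by chance alone."""
    n = len(ratings_1)
    observed = percent_agreement(ratings_1, ratings_2)
    categories = set(ratings_1) | set(ratings_2)
    expected = sum((ratings_1.count(c) / n) * (ratings_2.count(c) / n)
                   for c in categories)
    return (observed - expected) / (1 - expected)

# Hypothetical x-ray readings (1 = abnormal, 0 = normal) by two readers.
reader_a = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
reader_b = [1, 0, 0, 0, 1, 0, 1, 1, 1, 0]

print(f"percent agreement = {percent_agreement(reader_a, reader_b):.2f}")
print(f"kappa = {cohens_kappa(reader_a, reader_b):.2f}")
```

The readers agree on 8 of 10 films. Because half of that agreement would be expected by chance given how often each reader calls a film abnormal, kappa comes out lower, at 0.6.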

Validity

• The degree to which an instrument measures what it sets out to measure

• For many epidemiologic applications, the underlying characteristic being measured is dichotomous (e.g., diseased vs. non diseased; exposed vs. not exposed). Then, validity can be regarded as having two components:
  – Sensitivity
  – Specificity

Sensitivity

• The probability of testing positive if the disease is truly present

Sensitivity = a / (a + c)

                              True Disease Status
Results of Screening Test        +        -
            +                    a        b
            -                    c        d

Specificity

• The probability of screening negative if the disease is truly absent

Specificity = d / (b + d)

                              True Disease Status
Results of Screening Test        +        -
            +                    a        b
            -                    c        d

                        Disease
Test                    PRESENT (+)    ABSENT (-)
Test positive (+)       TP             FP
Test negative (-)       FN             TN

Sens = TP / (TP + FN)

Spec = TN / (TN + FP)

PPV = TP / (TP + FP)

NPV = TN / (TN + FN)
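The four formulas can be checked against a concrete 2x2 table. The counts below are hypothetical, and the function name is illustrative only.

```python
def screening_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Sensitivity, specificity, PPV, and NPV from the cells of a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

# Hypothetical screening results for 1000 people, 100 of whom have the disease.
metrics = screening_metrics(tp=90, fp=50, fn=10, tn=850)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity are computed within the columns of the table (true disease status), while the predictive values are computed within the rows (test result).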

Relationship between Sensitivity and Specificity

• Lowering the criterion of positivity results in an increased sensitivity, but at the expense of decreased specificity

• Making the criterion of positivity more stringent increases the specificity, but at the expense of decreased sensitivity

• The goal is to have a high sensitivity and high specificity, but this is often not possible or feasible
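The trade-off can be seen by moving a cut-point across a small set of test scores. The scores below are invented, and the sketch assumes that a score at or above the cut-point counts as a positive test.

```python
def sens_spec(diseased_scores, healthy_scores, cutoff):
    """Sensitivity and specificity when scores at or above the cutoff are positive."""
    sensitivity = sum(s >= cutoff for s in diseased_scores) / len(diseased_scores)
    specificity = sum(s < cutoff for s in healthy_scores) / len(healthy_scores)
    return sensitivity, specificity

# Hypothetical test scores; the two groups overlap at scores 5 and 6.
diseased_scores = [5, 6, 7, 8, 9]
healthy_scores = [2, 3, 4, 5, 6]

for cutoff in (4, 6, 8):
    sens, spec = sens_spec(diseased_scores, healthy_scores, cutoff)
    print(f"cutoff {cutoff}: sensitivity = {sens:.1f}, specificity = {spec:.1f}")
```

As the cut-point rises from 4 to 8, sensitivity falls from 1.0 to 0.4 while specificity rises from 0.4 to 1.0. Because the score distributions overlap, no cut-point achieves 1.0 on both.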

Relationship between Sensitivity and Specificity

• The decision for the cut-point involves weighing the consequences of leaving cases undetected (false negatives) against erroneously classifying healthy persons as diseased (false positives)

• Sensitivity should be increased when the penalty associated with missing a case is high
  – When the disease can be spread
  – When subsequent diagnostic evaluations are associated with minimal cost and risk

• Specificity should be increased when the costs or risks associated with further diagnostic techniques are substantial (minimize false positives)
  – Example: positive screen requires that a biopsy be performed

A screening test is used in the same way in two similar populations, but the proportion of false-positive results among those who test positive in population B is higher than that among those who test positive in population A. What is the most likely explanation for this finding?

(A) The specificity of the test is higher in population A
(B) The specificity of the test is lower in population A
(C) The prevalence of disease is higher in population A
(D) The prevalence of disease is lower in population A

Performance Yield

• Predictive Value Positive (PV+)
  – Individuals with a positive screening test result will also test positive on the diagnostic test

• Predictive Value Negative (PV-)
  – Individuals with a negative screening test result are actually free of disease

Performance Yield

Predictive Value Positive (PV+)

• The probability that a person actually has a disease given that he/she tests positive

PV+ = a / (a + b)

Predictive Value Negative (PV-)

• The probability that a person is truly disease free given that he/she tests negative

PV- = d / (c + d)

                              True Disease Status
Results of Screening Test        +        -
            +                    a        b
            -                    c        d

Performance Yield

• Factors that influence PV+ and PV-
  1. The more specific the test, the higher the PV+
  2. The higher the prevalence of preclinical disease in the screened population, the higher the PV+
  3. The more sensitive the test, the higher the PV-
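These three relationships follow from writing the predictive values in terms of sensitivity, specificity, and prevalence. The sketch below uses that standard relationship (Bayes' theorem); the particular sensitivity, specificity, and prevalence values are hypothetical.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (PV+, PV-) for a test applied to a population with the given prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    pv_positive = true_pos / (true_pos + false_pos)
    pv_negative = true_neg / (true_neg + false_neg)
    return pv_positive, pv_negative

# Same hypothetical test (sensitivity 0.90, specificity 0.95) at three prevalences.
for prevalence in (0.01, 0.10, 0.30):
    pv_pos, pv_neg = predictive_values(0.90, 0.95, prevalence)
    print(f"prevalence {prevalence:.2f}: PV+ = {pv_pos:.3f}, PV- = {pv_neg:.3f}")
```

With test performance held fixed, PV+ climbs steeply as prevalence rises, so the same test yields a larger share of false positives among positive results in a low-prevalence population.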

Decision vs. reality

Decision                                                     Reality: treatments not different (H0 true)    Reality: treatments are different (H0 false)
Conclude treatments are not different (fail to reject H0)    Correct decision                               Type II error (β)
Conclude treatments are different (reject H0)                Type I error (α)                               Correct decision (power; probability = 1 - β)

Correct:
• Reject the null hypothesis when it is false
• Do not reject the null hypothesis when it is true

Errors:
• Reject the null hypothesis when it is true (Type I error = α)
• Do not reject the null hypothesis when it is false (Type II error = β)

Power = probability of detecting a difference if one truly exists, i.e., the probability that a study will find a statistically significant difference when a difference of a given magnitude truly exists.

Power = 1 - β, where β is the probability of declaring a difference not statistically significant when a difference truly exists.
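Power can be approximated for a simple two-group comparison. The sketch below assumes a two-sided z-test comparing two proportions with equal group sizes, uses the normal approximation, and ignores the negligible chance of rejecting in the wrong direction. The proportions and sample sizes are hypothetical.

```python
from statistics import NormalDist

def approx_power(p1: float, p2: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power (1 - beta) of a two-sided z-test for two proportions."""
    standard_normal = NormalDist()
    z_critical = standard_normal.inv_cdf(1 - alpha / 2)
    standard_error = (p1 * (1 - p1) / n_per_group
                      + p2 * (1 - p2) / n_per_group) ** 0.5
    return standard_normal.cdf(abs(p1 - p2) / standard_error - z_critical)

# Hypothetical true event proportions of 0.20 and 0.30 in the two groups.
for n in (100, 400):
    print(f"n per group = {n}: power = {approx_power(0.20, 0.30, n):.2f}")
```

Under these assumptions, 100 subjects per group gives power well below 0.5, so a true difference of this size would usually be missed (a Type II error). With 400 per group the power exceeds 0.9.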

Sampling errors

• Endemic: Habitual presence of disease within a given geographic area

• Epidemic: Occurrence in a community of a group of illnesses of similar nature in excess of what would normally be expected. Amount of disease depends on number susceptible (at risk) and number not susceptible by way of immunization, or genetics (immune).

• Pandemic: worldwide epidemic

Outbreak

Outbreak Types

• Common source: group of persons exposed to a common agent
  – Point: exposed over a brief period of time
  – Intermittent: exposed over a long period of time

• Propagated: spreads gradually from person to person

• Mixed epidemic: common source and from person to person

How do you find outbreaks?

• Surveillance
  – Track disease/injury rates over time
  – Only for reported diseases
  – Time delay: primary purpose is to examine trends over time
• Laboratory reports
• Healthcare institutions
• Public health office
• Observant healthcare personnel

Epidemic Investigation

1. Establish the presence of an epidemic – case definition

2. Communicate/Control

3. Analyze the outbreak

4. Form a hypothesis

5. Test the hypothesis

6. Complete the investigation

GOOD LUCK!!!
