Atiporn Ingsathit MD.PhD. · Clinical trials Field trials RCT. 3 Cohort study: Marching ......
Cohort study
Atiporn Ingsathit MD.PhD.Section for Clinical Epidemiology & Biostatistics
Faculty of Medicine Ramathibodi HospitalMahidol University
Outlines
Definition of cohort study
How to assess risk
Survival analysis
Why do we need observational studies?
Hypothesis generating
Risk
Prognosis
Less expensive than RCTs
Well-done observational studies can yield results similar to those of RCTs
Three questions to know the study design
Assign exposure/intervention? Experimental
Observational
Comparison groups ? Analytic
Descriptive
Start with exposure or outcome?
Case-control Cohort Cross-sectional
Clinical trials
Field trials
RCT
Cohort study: Marching towards outcomes
Population vs. cohort
Population: all people in a defined setting or with certain defined characteristics; temporal and potentially dynamic
Cohort: a population for whom membership is defined in a permanent fashion
“Cohort” Group of soldiers that marched together into battle (Roman)
A group of people who share a common experience or condition
A birth cohort shares the same year or period of birth
A cohort of smokers has the experience of smoking in common
A cohort of vegetarians shares a dietary habit
Cohort study
An analytical, observational study, based on data, usually primary, from a follow-up period of a group in which some have had, have or will have the exposure of interest, to determine the association between that exposure and an outcome.
Cohort studies do not provide empirical evidence as strong as that provided by properly executed randomized controlled clinical trials.
Cohort study
* Analytic study to find association between exposure and outcome
* Homogeneous population
Exposure (independent variable): risk factors, intervention
Outcome (dependent variable): diseases, health problems
Design
Type of cohort
Population: closed (fixed) cohort or open (dynamic) cohort
Design: retrospective, prospective, or ambidirectional
COHORT STUDIES: Fixed cohort
[Figure: follow-up lines for exposed (+) and unexposed (-) subjects; X = outcome]
Measure: cumulative incidence
Relative risk = (2/3)/(1/3) = 2.0
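COHORT STUDIES: Dynamic cohort
[Figure: follow-up lines over years for exposed (+) and unexposed (-) subjects; X = outcome]
Measure: cumulative incidence or incidence rate
Relative risk (cumulative incidence) = (2/3)/(2/3) = 1.0
or
Relative risk (incidence rate) = (2/5 py)/(2/10 py) = 2.0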
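The two ways of computing relative risk in the fixed and dynamic cohort figures can be checked with a short sketch. The function names are illustrative and not from the slides; the counts (2 of 3, 1 of 3, and 5 versus 10 person-years) are the ones shown in the figures.

```python
def cumulative_incidence(events, n_at_start):
    """Proportion of the group that develops the outcome."""
    return events / n_at_start


def incidence_rate(events, person_years):
    """Events per person-year of follow-up."""
    return events / person_years


# Fixed cohort: 2 of 3 exposed and 1 of 3 unexposed develop the outcome.
rr_fixed = cumulative_incidence(2, 3) / cumulative_incidence(1, 3)

# Dynamic cohort: 2 events in each group of 3, so cumulative incidence is equal,
# but the exposed contribute 5 person-years and the unexposed 10.
rr_dynamic = cumulative_incidence(2, 3) / cumulative_incidence(2, 3)
rate_ratio = incidence_rate(2, 5) / incidence_rate(2, 10)

print(rr_fixed, rr_dynamic, rate_ratio)
```

In the dynamic cohort the rate ratio picks up the difference in follow-up time that the cumulative-incidence ratio misses.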
Designs
Prospective
Historic
Cohort study: prospective
[Diagram: from the population, people without the disease are classified as exposed or not exposed and followed in the direction of the study to disease or no disease]
Follow-up is mandatory
Cohort study: retrospective
[Diagram: from the population, people without the disease are classified as exposed or not exposed and traced in the direction of the study to disease or no disease]
Cohort study
Prospective
Exposure and other factors can be measured
Expensive, time consuming
Retrospective
Cheaper
Less time consuming
Suitable for rare outcomes
Some data may be unavailable from historical records
Confounding factors
Study designs: example
[Diagram: study subjects from the population are classified as kidney stone or no stone and followed in the direction of the study to CKD or normal]
Cohort study
Advantages
Eligibility criteria and outcome assessment can be standardized
Can establish temporal association
Disadvantages
Usually expensive
Hard to blind
Long follow-up period needed for rare disorders
Difficult to find controls and to account for confounders
What to look for in cohort studies
Who is at risk?
Since women who have had a bilateral mastectomy have almost no risk of breast cancer, they should not be included in cohort studies of breast cancer.
Who is exposed?
Cohort studies need a clear, unambiguous definition of the exposure at the outset. This definition sometimes involves quantifying the exposure by degree, rather than just yes or no.
Who is an appropriate control (the unexposed)?
The key notion is that controls should be similar to the exposed in all important respects, except for the lack of exposure.
Have outcomes been assessed equally?
Outcomes must be defined in advance; they should be clear, specific, and measurable.
Those who judge outcomes should be kept unaware of the exposure status of participants.
Outcome measurement (1)
Subjective outcome: fever, pain
Objective outcome: hemoculture, death
Outcome measurement (2)
Surrogate outcome: low-density lipoprotein (LDL), proteinuria
Clinical outcome: death, ESRD
When should we use cohort design?
Key areas of inquiry in Clinical Epidemiology
Risk With what probability will disease occur?
Prognosis What are the outcomes from disease?
Diagnosis How good is the diagnostic tool?
Treatment How is the prognosis altered by treatment?
[Timeline: risk, then onset of acute MI, then prognosis, then death]
Risk factors: age, male, smoking, HT, LDL, inactivity
Prognostic factors: age, female, smoking, hypotension, anterior infarction, CHF, ventricular arrhythmia
Cohorts and their purposes
Characteristic in common | To assess effect of | Example
Exposure | Risk factor | Lung cancer in people who smoke
Disease | Prognosis | Survival rate for patients with breast cancer
Preventive intervention | Prevention | Reduction in incidence of pneumonia after pneumococcal vaccination
Therapeutic intervention | Treatment | Improvement in survival for patients with Hodgkin's disease given chemotherapy
How to assess risk or association?
Risk
The probability of some unexpected event.
The probability that people who are exposed to certain “risk factors” will subsequently develop a particular disease more often than similar people who are not exposed.
Multiple causes and effects
[Diagram of risk factors and diseases: hypercholesterolemia, positive family history, valvular disease, viral infection, smoking, DM, high blood pressure, congestive heart failure, coronary atherosclerosis, stroke, renal failure, myocardial infarction]
Way to express and compare risk
Expression | Question | Definition
Absolute risk | What is the incidence of disease in a group initially free of the condition? | $I = \dfrac{\text{number of new cases}}{\text{number of people in group}}$
Attributable risk (risk difference) | What is the incidence of disease attributable to exposure? | $AR = I_{E+} - I_{E-}$
Relative risk (risk ratio) | How many times more likely are exposed persons to become diseased, relative to nonexposed persons? | $RR = I_{E+} / I_{E-}$
Population-attributable risk | What is the incidence of disease in a population, associated with the prevalence of a risk factor? | $AR_p = AR \times P$, where P is the prevalence of the risk factor
Exposure | Disease | No disease | Total
+ | a | b | a+b
- | c | d | c+d
Total | a+c | b+d | n

Stone | CKD | No CKD | Total
Yes | 80 | 10 | 90
No | 20 | 90 | 110
Total | 100 | 100 | 200

Term | General | Example | Question
Risk | a/(a+b) or c/(c+d) | 80/90 = 0.89 or 20/110 = 0.18 | What is the incidence of disease in a group initially free of the condition?
Relative risk | [a/(a+b)] / [c/(c+d)] | (80/90)/(20/110) = 4.9 (about 5) | How many times more likely are exposed persons to become diseased, relative to nonexposed persons?
Way to express and compare risk
Term | General | Example | Definition
Attributable risk (AR) | a/(a+b) - c/(c+d) | 80/90 - 20/110 = 0.71 | The incidence of disease attributable to exposure
Population attributable risk (PAR) | AR x P_EX = AR x (a+b)/n | 0.71 x 90/200 = 0.32 | The incidence of disease in a population that is associated with the occurrence of a risk factor
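The measures in the stone/CKD tables can be reproduced from the four cell counts. This is a minimal sketch; the function name and dictionary keys are illustrative and not from the slides, while a, b, c, d follow the 2x2 layout above.

```python
def risk_measures(a, b, c, d):
    """a, b: exposed with/without disease; c, d: unexposed with/without disease."""
    n = a + b + c + d
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    attributable_risk = risk_exposed - risk_unexposed
    prevalence_of_exposure = (a + b) / n
    return {
        "risk_exposed": risk_exposed,
        "risk_unexposed": risk_unexposed,
        "relative_risk": risk_exposed / risk_unexposed,
        "attributable_risk": attributable_risk,
        "population_attributable_risk": attributable_risk * prevalence_of_exposure,
    }


# Kidney stone (exposure) and CKD (disease): 80, 10, 20, 90 from the table.
measures = risk_measures(80, 10, 20, 90)
for name, value in measures.items():
    print(f"{name}: {value:.2f}")
```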
Relative risk vs. attributable risk
RR
The strength of association
Causal inference
Valuable in etiologic studies
AR
Measure of how much of the disease risk is attributable to a certain exposure
Valuable in clinical practice and public health
Population attributable risk (PAR)
[Figure: incidence in the population compared with incidence in the unexposed; the difference is $I_{pop} - I_{unexposed}$]
$PAR\% = \dfrac{I_{pop} - I_{unexposed}}{I_{pop}} \times 100$
If we had an effective prevention program (stop stone) in this population, how much of a reduction in CKD incidence could we anticipate in the total population (of both stones and no stones)?
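One way to answer this question is to apply the PAR% formula to the stone/CKD counts used earlier (100 CKD cases among 200 people overall, 20 among the 110 without stones). This is a sketch that assumes those counts describe the whole population; the variable names are illustrative.

```python
# Incidence of CKD in the total population and in people without stones.
incidence_population = 100 / 200
incidence_unexposed = 20 / 110

# PAR% = (I_pop - I_unexposed) / I_pop * 100
par_percent = (incidence_population - incidence_unexposed) / incidence_population * 100

print(f"PAR% = {par_percent:.1f}")
```

Under that assumption, roughly two thirds of CKD incidence in the total population would be linked to stones.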
Incidence rate
The measure of disease occurrence in cohort studies is the incidence rate: the number of subjects who develop the disease under study per unit of person-time at risk during a specified period.
The numerator of the rate is the number of subjects who newly develop the disease.
The denominator is usually the number of person-years of observation.
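A small sketch of the person-time calculation follows. The follow-up records are hypothetical and chosen only to illustrate the arithmetic; they are not data from the slides.

```python
# Hypothetical follow-up records: (years observed, developed the disease?)
follow_up = [
    (2.0, False),
    (3.5, True),
    (1.0, False),
    (4.0, True),
    (5.5, False),
]

events = sum(1 for _, diseased in follow_up if diseased)   # numerator
person_years = sum(years for years, _ in follow_up)        # denominator
rate_per_100_person_years = 100 * events / person_years

print(events, person_years, rate_per_100_person_years)
```

Each subject contributes only the time actually observed, so subjects who enter late or leave early are still counted correctly.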
Total observed person-time 69.1 mo.
Survival analysis
The likelihood that patients with a given condition will experience an outcome at any point in time.
Used in a cohort study or a randomized controlled trial
Time to event: from origin to end point
Why survival analysis?
Investigators frequently must analyze their data before all the subjects have died or the event has occurred.
The patients do not typically enter the study at the same time.
Total observed person-time 69.1 mo.
Survival analysis
Event coding: 1 or 0
1 = event occurred: death
0 = censored observation: alive or lost to follow-up
Censored observation: an observation whose value is unknown because the subject has not been in the study long enough for the outcome of interest to occur
Methodological characteristics of survival study
The starting date for each patient must be clearly defined:
date of diagnosis
date of receiving treatment
date of operation, etc.
The end date for each patient
Patient's status at the end:
death, if death is the final outcome
recurrence, disease free
infection, non-infection
remission, non-remission
recovery, non-recovery
loss to follow-up, withdrawal
Survival probability
The proportion of a population of such people who survive a given length of time in the same circumstances.
An estimate of the survivorship function S(t) is the estimated proportion of individuals who survive longer than time t.
Life table analysis
Interval start time (mo) | No. entering interval (ni) | No. withdrawn during interval (wi) | No. of terminal events (di) | Proportion terminating, qi = di/[ni - (wi/2)] | Proportion surviving, pi = 1 - qi | Cumulative proportion surviving at end, Si = pi x S(i-1)
0 | 13 | 2 | 1 | 0.083 | 0.917 | 0.917
3 | 10 | 4 | 1 | 0.125 | 0.875 | 0.802
6 | 5 | 4 | 0 | 0 | 1.000 | 0.802
9 | 1 | 1 | 0 | 0 | 1.000 | 0.802
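The life table columns can be recomputed from the first four columns. This is a minimal sketch with an illustrative function name; it assumes, as in the formula for qi, that withdrawals count as at risk for half the interval.

```python
def life_table(intervals):
    """intervals: list of (start, n_entering, n_withdrawn, n_events)."""
    rows = []
    cumulative_survival = 1.0
    for start, n, w, d in intervals:
        effective_n = n - w / 2          # withdrawals count for half the interval
        q = d / effective_n              # proportion terminating
        p = 1 - q                        # proportion surviving the interval
        cumulative_survival *= p         # S_i = p_i * S_(i-1)
        rows.append((start, round(q, 3), round(p, 3), round(cumulative_survival, 3)))
    return rows


table = life_table([(0, 13, 2, 1), (3, 10, 4, 1), (6, 5, 4, 0), (9, 1, 1, 0)])
for row in table:
    print(row)
```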
Kaplan-Meier method
Event time (mo) | No. at risk (ni) | Censored (ci) | No. of events (di) | Mortality, qi = di/ni | Survival, pi = 1 - qi | Cumulative survival, Si = pi x S(i-1)
3 | 10 | 0 | 1 | 1/10 = 0.10 | 0.90 | 0.90
4 | 9 | 1 | 0 | 0/9 = 0 | 1.00 | 0.90 x 1 = 0.90
5.7 | 8 | 1 | 0 | 0/8 = 0 | 1.00 | 0.90 x 1 x 1 = 0.90
6.5 | 7 | 0 | 2 | 2/7 = 0.286 | 0.714 | 0.90 x 1 x 1 x 0.714 = 0.643
.
10
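The product-limit calculation can be sketched in a few lines. The first five observations follow the table rows (event = 1, censored = 0); the remaining five subjects are not shown on the slide, so they are assumed here, purely for illustration, to be censored at 10 months. The function name is also illustrative.

```python
def kaplan_meier(observations):
    """observations: list of (time, event), event 1 = occurred, 0 = censored."""
    at_risk = len(observations)
    survival = 1.0
    curve = []
    for t in sorted({time for time, _ in observations}):
        deaths = sum(1 for time, e in observations if time == t and e == 1)
        censored = sum(1 for time, e in observations if time == t and e == 0)
        if deaths:
            survival *= 1 - deaths / at_risk   # S_i = p_i * S_(i-1)
            curve.append((t, at_risk, deaths, round(survival, 3)))
        at_risk -= deaths + censored           # both leave the risk set
    return curve


data = [(3, 1), (4, 0), (5.7, 0), (6.5, 1), (6.5, 1)]
data += [(10, 0)] * 5   # assumption: remaining subjects censored at 10 months

for step in kaplan_meier(data):
    print(step)   # (event time, number at risk, events, cumulative survival)
```

Censored subjects do not change the survival estimate at their own time, but they shrink the risk set for every later event.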
Kaplan-Meier survival estimate
Hazard function
The hazard function h(t) is the instantaneous rate at which an individual who has survived to time t will die (fail) at time t; it is the death rate for an individual surviving at time t.
The cumulative hazard H(t) is the accumulated hazard up to time t. It mirrors cumulative survival S(t), the probability that an individual survives beyond time t: H(t) = -ln S(t).
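The relation between hazard, cumulative hazard, and survival can be written compactly. Here T, the survival time as a random variable, is notation introduced for this sketch; h, H, and S are the quantities named on the slide.

```latex
\begin{aligned}
h(t) &= \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t} \\
H(t) &= \int_0^t h(u)\,du = -\ln S(t) \\
S(t) &= P(T > t) = e^{-H(t)}
\end{aligned}
```

A survival curve that falls steeply over an interval therefore corresponds to a high hazard over that interval.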
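Comparing two survival curves
Logrank test
The null hypothesis for comparing survival/failure times is that the survival functions of the two groups are equal: $H_0: S_1(t) = S_2(t)$
Logrank statistic for survival, computed for groups Gr1 and Gr2 from:
No. of patients at risk
No. of observed events (O1, O2)
No. of expected events (E1, E2)
$\chi^2 = \dfrac{(O_1 - E_1)^2}{E_1} + \dfrac{(O_2 - E_2)^2}{E_2}$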
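Demonstrating calculation of log-rank statistics
Time (month) | d1j (deaths, Gr1) | n1j (at risk, Gr1) | d2j (deaths, Gr2) | n2j (at risk, Gr2) | dj | nj | e1j | e2j
0.03 | 1 | 15 | 0 | 15 | 1 | 30 | 1x15/30 = .50 | 1x15/30 = .50
0.07 | 1 | 14 | 0 | 15 | 1 | 29 | 1x14/29 = .48 | 1x15/29 = .52
0.10 | 1 | 13 | 0 | 15 | 1 | 28 | 1x13/28 = .46 | 1x15/28 = .54
0.17 | 1 | 12 | 0 | 15 | 1 | 27 | .44 | .56
0.23 | 1 | 11 | 0 | 15 | 1 | 26 | .42 | .57
0.27 | 1 | 10 | 0 | 15 | 1 | 25 | .40 | .60
0.50 | 1 | 9 | 0 | 15 | 1 | 24 | .38 | .63
3.03 | 1 | 8 | 0 | 15 | 1 | 23 | .35 | .65
5.93 | 0 | 7 | 1 | 15 | 1 | 22 | .32 | .68
9.13 | 1 | 7 | 0 | 14 | 1 | 21 | .33 | .67
9.80 | 1 | 6 | 0 | 14 | 1 | 20 | .30 | .70
13.23 | 0 | 5 | 1 | 14 | 1 | 19 | .26 | .74
15.83 | 0 | 5 | 1 | 13 | 1 | 18 | .28 | .72
Total | O1 = 10 | | O2 = 3 | | | | E1 = 4.93 | E2 = 8.07
P=0.001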
Hazard ratio
$HR = \dfrac{O_1/E_1}{O_2/E_2} = \dfrac{10/4.93}{3/8.07} = \dfrac{2.03}{0.37} = 5.5$
The risk of GF at any time in recipients older than 50 years is 5.5 times the risk in younger recipients.
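The observed and expected counts, the simplified chi-square statistic, and the hazard ratio can be recomputed from the deaths and numbers at risk in the log-rank table. This is a sketch of the slide's simplified formula, not of the variance-based statistic that statistical packages typically report; the variable names are illustrative.

```python
# One tuple per event time:
# (deaths in Gr1, at risk in Gr1, deaths in Gr2, at risk in Gr2)
rows = [
    (1, 15, 0, 15), (1, 14, 0, 15), (1, 13, 0, 15), (1, 12, 0, 15),
    (1, 11, 0, 15), (1, 10, 0, 15), (1, 9, 0, 15), (1, 8, 0, 15),
    (0, 7, 1, 15), (1, 7, 0, 14), (1, 6, 0, 14), (0, 5, 1, 14),
    (0, 5, 1, 13),
]

observed_1 = sum(d1 for d1, _, _, _ in rows)
observed_2 = sum(d2 for _, _, d2, _ in rows)

# Expected events: total deaths at each time, shared in proportion to numbers at risk.
expected_1 = sum((d1 + d2) * n1 / (n1 + n2) for d1, n1, d2, n2 in rows)
expected_2 = sum((d1 + d2) * n2 / (n1 + n2) for d1, n1, d2, n2 in rows)

chi_square = ((observed_1 - expected_1) ** 2 / expected_1
              + (observed_2 - expected_2) ** 2 / expected_2)
hazard_ratio = (observed_1 / expected_1) / (observed_2 / expected_2)

print(observed_1, observed_2, round(expected_1, 2), round(expected_2, 2))
print(round(chi_square, 2), round(hazard_ratio, 2))
```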
Faster
stsum, by(ager_gr)

         failure _d:  GF == 1
   analysis time _t:  _t
                 id:  hnr

         |               incidence     no. of    |------ Survival time -----|
ager_gr  | time at risk     rate      subjects        25%       50%       75%
---------+---------------------------------------------------------------------
     <50 |  831.5701574   .0865832        253     3.756331  6.652977         .
    >=50 |  229.7221081   .0740025         71     3.635866         .         .
---------+---------------------------------------------------------------------
   total |  1061.292266     .08386        324     3.635866         .         .

. sts test ager_gr

         failure _d:  GF == 1
   analysis time _t:  _t
                 id:  hnr

Log-rank test for equality of survivor functions

        |   Events         Events
ager_gr |  observed       expected
--------+-------------------------
    <50 |        72          69.61
   >=50 |        17          19.39
--------+-------------------------
  Total |        89          89.00

              chi2(1) =       0.38
              Pr>chi2 =     0.5391
Statistical analysis
Kaplan-Meier survival analysis was used to estimate the survival rate and median survival time of recipients.
Log-rank test was used to compare survival curves
Cox regression was used to determine factors associated with survival time
Cox regression: proportional hazards model
The Cox proportional hazards model (or Cox regression) was proposed by Cox in 1972 and has been widely used since then to investigate several variables simultaneously for time-to-event outcomes.
The model is a semi-parametric approach: no particular type of distribution is assumed for survival times.
A strong assumption: the effects of the different variables on survival are constant over time.
Benefits of the Cox model:
(i) It can perform multiple comparisons
(ii) It can adjust for confounding variables that the study design cannot control for.
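The model and its proportional hazards assumption can be stated in standard notation. The symbols below (baseline hazard $h_0$, covariates $x_1, \dots, x_p$, coefficients $\beta_1, \dots, \beta_p$) are the usual textbook ones and do not appear on the slides.

```latex
\begin{aligned}
h(t \mid x_1, \dots, x_p) &= h_0(t)\, \exp(\beta_1 x_1 + \dots + \beta_p x_p) \\
\mathrm{HR}_k &= \frac{h(t \mid x_k = 1)}{h(t \mid x_k = 0)} = e^{\beta_k}
\quad \text{(other covariates held fixed)}
\end{aligned}
```

Because $h_0(t)$ cancels in the ratio, the hazard ratio for a covariate does not depend on t, which is exactly the constant-effect assumption described above.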
Example
Materials and Methods
We examined a cohort of consecutive end-stage renal disease patients who underwent first kidney transplantation at a single-center, university-based hospital during a 6-year study period. All subjects had a follow-up of at least 6 months.
Materials & Methods
Setting
The study was conducted at Ramathibodi Hospital which is a 1200-bed university hospital in Bangkok.
Study design
An ambidirectional cohort study
[Timeline: past, present, future; start 1997, 2002, 2006]
Materials & Methods
Study population
Inclusion criteria
Medical records of patients aged at least 18 years who had undergone a first kidney transplantation at Ramathibodi Hospital
Exclusion criteria
Multi-organ transplants or dual kidney transplants
Recipients who had graft failure or death within 6 months
Recipients who had less than 6 months of follow-up
Study design
[Diagram: recipients grouped by HBV/HCV status (HBV, HCV, non-HBV/HCV) and followed over time to graft failure (GF) or no GF]
Outcome
The primary outcome was time to graft failure.
Graft failure was defined by the introduction of long-term dialysis after transplantation or by retransplantation.
Statistical analysis
Graft and patient survival were determined using the Kaplan-Meier method.
The log-rank test was used to compare survival curves.
Cox regression analysis with time-varying covariates was used to assess the effect of HBV/HCV infections, adjusting for confounders.
Data collection
[Timeline from origin to end of study]
Baseline data: time-fixed covariates
Follow-up data: time-dependent covariates
[Kaplan-Meier curves] Graft survival: P=0.001; patient survival: P=0.003
Among 353 recipients: HBV+ 6.5%, HCV+ 6.2%
Incidence of graft failure and HR
Characteristics | No. of GF | Total subjects | Time at risk (years) | Incidence (GF/100/year) | Hazard ratio (95% CI) | P-value
Recipient age, years: <50 / >50 | 9 / 20 | 264 / 77 | 272.56 / 981.12 | 3.30 / 2.01 | 1.0*** / 1.67 (0.76-3.68) | 0.20
Sex: male / female | 10 / 19 | 131 / 210 | 461.93 / 791.75 | 2.16 / 2.40 | 1.0*** / 1.10 (0.50-2.33) | 0.84
Duration of dialysis: <12 months / >12 months | 14 / 15 | 162 / 161 | 623.69 / 551.88 | 2.24 / 2.71 | 1.0*** / 1.17 (0.56-2.43) | 0.66
Anti-HCV: positive / negative | 6 / 23 | 22 / 319 | 90.33 / 1171.7 | 7.33 / 1.96 | 3.88 (1.57-9.57) / 1.0*** | 0.001
Potential bias in cohort studies
Selection bias Susceptibility bias
Migration bias
Measurement bias
Survival cohorts
Confounders
Selection or susceptibility bias
Groups being compared are not equally susceptible to the outcome of interest, for reasons other than the factor under study.
CA colon : CEA level and Relapse
Dukes classification and Relapse
Methods for controlling selection bias
Method | Phase of study: Design | Phase of study: Analysis
Randomization | + |
Restriction | + |
Matching | + |
Stratification | | +
Multivariable adjustment | | +
Propensity score matching (PSM)
A statistical matching technique that attempts to estimate the effect of a treatment, policy, or other intervention by accounting for the covariates that predict receiving the treatment.
PSM attempts to reduce the bias due to confounding variables that could be found in an estimate of the treatment effect obtained from simply comparing outcomes among units that received the treatment versus those that did not.
Propensity score matching (PSM)
For observational studies, the assignment of treatments to research subjects is typically not random.
Matching attempts to mimic randomization by creating a sample of units that received the treatment that is comparable on all observed covariates to a sample of units that did not receive the treatment.
PSM procedure
1. Run logistic regression:
Dependent variable: Y = 1 if participating; Y = 0 otherwise.
Choose appropriate confounders (variables hypothesized to be associated with both treatment and outcome).
Obtain the propensity score: predicted probability (p) or log[p/(1 − p)].
2. Check that the propensity score is balanced across treatment and comparison groups, and that covariates are balanced across treatment and comparison groups within strata of the propensity score.
3. Match each participant to one or more nonparticipants on propensity score.
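Step 3 can be sketched as greedy 1:1 nearest-neighbour matching on scores that are already estimated. Everything here is illustrative: the function name, the subject ids, the scores, and the caliper of 0.05 are hypothetical choices, and other matching schemes (with replacement, 1:k, optimal matching) are equally valid.

```python
def match_nearest(treated, controls, caliper=0.05):
    """Greedy 1:1 nearest-neighbour matching without replacement.

    treated, controls: dicts mapping a subject id to a propensity score.
    caliper: largest score difference accepted for a match (hypothetical value).
    """
    available = dict(controls)
    pairs = []
    for t_id, t_score in sorted(treated.items(), key=lambda item: item[1]):
        if not available:
            break
        # Closest remaining control by absolute difference in propensity score.
        c_id = min(available, key=lambda cid: abs(available[cid] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]          # each control is used at most once
    return pairs


# Hypothetical propensity scores from step 1.
treated = {"T1": 0.62, "T2": 0.35, "T3": 0.90}
controls = {"C1": 0.60, "C2": 0.33, "C3": 0.40, "C4": 0.10}

pairs = match_nearest(treated, controls)
print(pairs)
```

Treated subjects with no control inside the caliper (T3 here) stay unmatched, which trades sample size for comparability.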
Migration bias
[Diagram: recipients grouped by HBV/HCV status (HBV, HCV, non-HBV/HCV) and followed over time to graft failure (GF) or no GF, with drop-out and cross-over between groups]
Best-case/worst-case analysis
Measurement bias
When patients in one subgroup of a cohort stand a better chance of having their outcomes detected than another subgroup.
Ways to control
Keep the person who records outcome events unaware of exposure status
Set up strict criteria/rules for diagnosing outcome events
Apply efforts to discover outcome events equally
Survivor bias
Tracking participants over time
Have losses been minimised?
Cohort type | Assembly | Outcome measured | Observed improvement | True improvement
True cohort | Assemble cohort, N=150 | Improved: 75; not improved: 75 | 50% | 50%
Survival cohort | Begin follow-up, N=50 (not observed, N=100) | Improved: 40; not improved: 10 (dropouts: improved 35; not improved 65) | 80% | 50%
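The gap between observed and true improvement in the survival cohort follows directly from the counts in the figure. A minimal sketch with illustrative variable names:

```python
# Survival cohort: only 50 of the original 150 patients are observed.
observed = {"improved": 40, "not_improved": 10}
dropouts = {"improved": 35, "not_improved": 65}   # the 100 not observed

observed_total = sum(observed.values())
cohort_total = observed_total + sum(dropouts.values())

observed_improvement = observed["improved"] / observed_total
true_improvement = (observed["improved"] + dropouts["improved"]) / cohort_total

print(f"observed: {observed_improvement:.0%}, true: {true_improvement:.0%}")
```

The observed figure is inflated because patients who did poorly were more likely to be missing when follow-up began.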
Confounding
A factor that distorts the true relationship between the study variables of interest by being related to both the exposure and the outcome of interest.
X Y
W
Confounding
It must be a risk factor for the outcome
It must be associated with the exposure or distributed unequally between the groups
Exposure
(Coffee drinking)
Outcome
(CA stomach)
Confounding variable
(Smoking)
Features to look for in a cohort study