Network Analysis of Unstructured EHR Data for Clinical Research
Transcript of Network Analysis of Unstructured EHR Data for Clinical Research
Network analysis of unstructured EHR data for clinical research
Anna Bauer-Mehren, Shah Lab, Stanford Center for Biomedical Informatics
Networks
• Networks have successfully been used in biology and partly in medicine
  • Ataxia protein-protein interaction network (Lim et al, Cell, 2006)
  • Patient stratification using ICD9 data (Roque et al, PLoS Comp Biol, 2011)
• We can use networks from unstructured EHR data for:
  • Cohort construction
  • Clinical outcome analysis
4/2/13 2
NCBO
• More than 300 biomedical ontologies
• Tools and web services
STRIDE
• 1.8 million patients
• 18 years
• 35 million coded ICD9 diagnoses
• 11 million clinical free text notes
[Diagram labels: Data; Clinically relevant hypotheses]
Annotation Workflow
From clinical notes to patient feature matrix
• Text clinical note → Terms recognized → Negation detection
• Term recognition tool: NCBO Annotator
• NegEx patterns: Negation detection, Family History detection
• BioPortal – knowledge graph: Creating clean lexicons (Term – 1 ... Term – n)
• Ontologies provide features & dependencies: Diseases, Procedures, Drugs, Devices
• Simple text processing, Feature extraction
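The annotation workflow (term recognition, then negation detection, then a binary feature row per patient) can be illustrated with a small sketch. This is not the NCBO Annotator or NegEx: the lexicon, the trigger list, the look-back window, and the function names below are all hypothetical stand-ins chosen for illustration.

```python
import re

# Hypothetical mini-lexicon: surface term -> concept label (assumption).
LEXICON = {
    "peripheral artery disease": "PAD",
    "claudication": "PAD",
    "cilostazol": "cilostazol",
    "heart failure": "CHF",
}
# Hypothetical negation triggers; a curated resource would be far larger.
NEGATION_TRIGGERS = ("no ", "denies ", "without ", "negative for ")
WINDOW = 30  # characters before a term that are scanned for a trigger


def annotate(note):
    """Return the set of concepts mentioned without a preceding negation."""
    text = note.lower()
    found = set()
    for term, concept in LEXICON.items():
        for match in re.finditer(re.escape(term), text):
            start = max(0, match.start() - WINDOW)
            context = text[start:match.start()]
            # Keep only the part of the window inside the current sentence.
            context = re.split(r"[.;\n]", context)[-1]
            if not any(trigger in context for trigger in NEGATION_TRIGGERS):
                found.add(concept)
    return found


def feature_row(notes, concepts):
    """Binary feature row for one patient across all of their notes."""
    mentioned = set()
    for note in notes:
        mentioned |= annotate(note)
    return [1 if c in mentioned else 0 for c in concepts]


CONCEPTS = ["PAD", "CHF", "cilostazol"]
notes = [
    "Patient reports claudication. Started cilostazol.",
    "No heart failure on exam.",
]
row = feature_row(notes, CONCEPTS)
```

Stacking one such row per patient gives the patient feature matrix. Family-history detection could be handled the same way with a second trigger list, though that is not shown here.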
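Patient feature matrix (patients by concepts):
      C1    …     …     Cn
P1    1     0     1     1
:     0     1     1     0
:     0     0     0     1
Pn    0     1     0     1

Concept-concept matrix (upper triangle):
      C1    …     …     Cn
C1    1     0.6   0.5   0.6
:           1     0.2   0.3
:                 1     0.1
Cn                      1

Patient-patient matrix (upper triangle):
      P1    …     …     Pn
P1    1     0.1   0.7   0.8
:           1     0.5   0.8
:                 1     0.4
Pn                      1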
[Figure: concept network. Node labels: zolpidem, pravastatin, nifedipine, congestive heart failure, insulin glargine, cilostazol, transplantation, trimethoprim, bypass, doppler studies, surgical revision, atherectomy, bypass graft, vascular surgical procedures, ultrasound imaging, testosterone, fentanyl, amoxicillin, heart failure, coronary angiography, pantoprazole, cephalexin, hydralazine, amiodarone, obesity, wheelchair, diazepam, pneumonia, tacrolimus, sulfamethoxazole, temazepam, decompressive incision, fluoroscopic angiography, heart transplantation, revascularization, diagnostic imaging, vascular diseases, ramipril, angioplasty, doppler ultrasonography, cane, carotid endarterectomy.]
What is special about these patients?
Who is getting these drugs, conditions, etc.?
• Drug Safety
• Comparative Effectiveness
• Practice-based evidence
• Predictions
Practice-based evidence
Peripheral artery disease (PAD): obstruction of infra-renal abdominal aorta and lower extremity arteries
→ Cilostazol
Cilostazol
• Reversible selective inhibitor of phosphodiesterase (PDE) type III
• Long-term oral milrinone therapy associated with life-threatening cardiovascular events in congestive heart failure (CHF) patients
Hypothesis: Cilostazol is not associated with increased risk of major adverse cardiovascular events (MACE)
1. PAD → Cilostazol? → MACE
2. PAD/CHF → Cilostazol? → MACE
Packer et al. "Effect of oral milrinone on mortality in severe chronic heart failure. The PROMISE Study Research Group." N Engl J Med. (1991)
Dimension reduction – patient space
[Figure label: 50k patients (1.8 million)]
[Figure: patient timeline. Follow-up time runs from PAD (t_PAD) to the last note (t_last).]
[Figure: histogram "Follow up time in peripheral artery disease patients". Axes: follow-up time in 30 day intervals; frequency.]
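As a rough illustration of the follow-up computation, the sketch below measures the time from t_PAD to t_last for each patient and bins it into 30-day intervals, as in the histogram. The dates and function names are hypothetical; only the 30-day interval width comes from the slide.

```python
from collections import Counter
from datetime import date


def follow_up_days(t_pad, t_last):
    """Days between the PAD date and the date of the last note."""
    return (t_last - t_pad).days


def interval_histogram(patients, width_days=30):
    """Count patients per follow-up interval (interval 0 = first 30 days)."""
    counts = Counter()
    for t_pad, t_last in patients:
        counts[follow_up_days(t_pad, t_last) // width_days] += 1
    return dict(sorted(counts.items()))


# Hypothetical (t_PAD, t_last) pairs, one per patient.
patients = [
    (date(2010, 1, 1), date(2010, 1, 20)),  # 19 days -> interval 0
    (date(2010, 1, 1), date(2010, 3, 15)),  # 73 days -> interval 2
    (date(2011, 6, 1), date(2011, 8, 10)),  # 70 days -> interval 2
]
hist = interval_histogram(patients)
```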
5757 PAD patients
Concept
• Peripheral artery disease
• Peripheral vascular disease
• Peripheral arterial occlusive disease
• Intermittent claudication
• Claudication (finding)
• PAD
[Diagram labels: 11547 PAD patients; ?; Peripheral Artery Disease, 5757 patients]
PSim = patient-patient similarity network; PSM = propensity score matching.

Variable                                                Before matching                        PSim                    PSM
                                                        Treatment      Control        p-value  Control        p-value  Control        p-value
                                                        (n=232)        (n=5525)                (n=232)                 (n=5525)
Demographics
  Age (at indication onset), mean (sd)                  71.22 (11.02)  70.43 (12.46)  0.30     72.05 (10.62)  0.41     70.87 (11.51)  0.75
  Gender (female), n (%)                                37.22          45.94          <0.01    36.65          0.84     35.87          0.77
  Race, (%)
    Asian                                               8.52           7.41           0.56     6.33           0.37     10.31          0.51
    Black                                               2.69           3.71           0.36     3.61           0.59     0.90           0.16
    Unknown                                             22.87          26.17          0.25     22.17          0.82     20.63          0.57
    White                                               65.47          62.22          0.32     67.87          0.55     67.27          0.69
Comorbidities
  Coronary artery disease, n (%)                        5.38           6.47           0.48     4.98           0.83     6.28           0.70
  Congestive heart failure, n (%)                       25.56          22.84          0.36     20.36          0.21     30.49          0.36
  Hypertension, n (%)                                   10.76          11.31          0.80     9.50           0.75     10.31          0.88
Co-prescriptions
  Beta blocking agents, n (%)                           75.34          60.77          <0.001   69.68          0.20     74.89          0.91
  ACE inhibitors, plain, n (%)                          78.03          69.57          <0.01    67.87          0.01     78.92          0.81
  Platelet aggregation inhibitors excl. heparin, n (%)  91.93          79.00          <0.001   89.59          0.41     95.51          0.07
  Vasodilators, n (%)                                   32.29          26.36          0.06     31.67          0.84     37.22          0.29
History of
  Cardiac arrhythmia, n (%)                             32.29          32.17          0.97     23.08          0.03     33.18          0.84
  Stroke, n (%)                                         17.94          18.31          0.89     15.84          0.61     21.52          0.34
  Myocardial infarction, n (%)                          17.94          15.87          0.43     13.58          0.24     19.73          0.64
  Vascular surgical procedures, n (%)                   74.44          47.71          <0.001   65.61          0.05     74.44          1
  Bypass surgery, n (%)                                 41.70          26.56          <0.001   36.20          0.24     40.36          0.75
Cilostazol patients are sicker than other PAD patients
→ Might affect outcome analysis
→ Ideally, we want to compare to "medical twin"
Cohort construction
Propensity score (PS) matching:
1. Choose variables for matching
2. Compute PS based on variables (logistic regression)
3. Nearest neighbor matching (1:1)
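The three steps above can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the analysis code behind the slides: the logistic regression is fitted by simple batch gradient descent, matching is greedy and without replacement, and the data, learning rate, epoch count, and function names are all hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Step 2a: fit logistic regression by batch gradient descent (bias first)."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = p - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def propensity(w, x):
    """Step 2b: estimated probability of treatment given the variables."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def match_one_to_one(scores, treated):
    """Step 3: greedy 1:1 nearest-neighbor matching on the score."""
    controls = {i for i, t in enumerate(treated) if t == 0}
    pairs = []
    for i, t in enumerate(treated):
        if t == 1 and controls:
            # Ties are broken by the lower patient index.
            j = min(controls, key=lambda c: (abs(scores[c] - scores[i]), c))
            pairs.append((i, j))
            controls.remove(j)  # each control is used at most once
    return pairs


# Step 1: hypothetical binary matching variables, one row per patient.
X = [[1, 1, 0], [1, 0, 1], [0, 0, 1], [1, 1, 1], [0, 1, 0], [0, 0, 0]]
treated = [1, 0, 0, 1, 0, 0]  # 1 = on the drug, 0 = control

w = fit_logistic(X, treated)
scores = [propensity(w, x) for x in X]
pairs = match_one_to_one(scores, treated)  # list of (treated, control) indices
```

Greedy matching depends on the order in which treated patients are processed. An optimal-matching algorithm or a caliper on the score difference are common refinements, but the slides do not say whether either was used.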
Features used for matching: Drug classes, Diseases, Devices, Procedures, Demographics

Patient-patient similarity network (PSim)
• Binary patient feature matrix: 5757 patients, 1159 concepts
• Jaccard index between the feature sets A and B of two patients: $J(A,B) = \frac{|A \cap B|}{|A \cup B|}$
• Pairwise patient-patient matrix (5757 by 5757 patients), upper triangle shown:
      0     0.6   0.8   0.6   0.5   0.8
            0     0.8   0.9   0.8   0.3
                  0     0.7   0.9   0.9
                        0     0.7   0.8
                              0     0.8
                                    0
• Nearest neighbor matching (1:1): 464 patients (cilostazol and control), 1159 concepts

Propensity-score matching (PSM)
• 464 patients (cilostazol and control), 17 concepts
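A minimal sketch of similarity-based matching with the Jaccard index J(A,B) follows. Two points are assumptions rather than facts from the slides: the distance is taken as 1 - J (the zero diagonal of the pairwise matrix suggests a distance), and matching is greedy without replacement. The rows and function names are hypothetical.

```python
def jaccard(a, b):
    """Jaccard index of two binary feature rows: |A and B| / |A or B|."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    # Two all-zero rows are treated as identical (assumption).
    return both / either if either else 1.0


def distance_matrix(rows):
    """Pairwise Jaccard distance (1 - J) between all patients."""
    n = len(rows)
    return [[1.0 - jaccard(rows[i], rows[j]) for j in range(n)] for i in range(n)]


def match_nearest(dist, is_treated):
    """Greedy 1:1 matching of each treated patient to the closest unused control."""
    controls = {i for i, t in enumerate(is_treated) if not t}
    pairs = []
    for i, t in enumerate(is_treated):
        if t and controls:
            j = min(controls, key=lambda c: (dist[i][c], c))
            pairs.append((i, j))
            controls.remove(j)
    return pairs


# Hypothetical binary feature rows (one per patient).
rows = [
    [1, 0, 1],  # patient 0, on cilostazol
    [0, 0, 1],  # patient 1, control
    [1, 1, 0],  # patient 2, control
    [1, 0, 1],  # patient 3, control with the same features as patient 0
]
is_treated = [True, False, False, False]

dist = distance_matrix(rows)
pairs = match_nearest(dist, is_treated)
```

Unlike propensity-score matching, this approach needs no variable selection step, since every concept in the feature matrix contributes to the distance.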
Cohort construction
• Patient-patient similarity network: > 1000 variables
• Propensity-score matching: 17 expert-selected variables
• No difference in results
[Figure legend: control, cilostazol; matched cohorts of 464 patients]
Outcome in PAD patients
Patient-concept matrix with cilostazol indicator (CIL):
        C1    …     Cn    CIL
CIL1    1     0     1     1
CIL2    0     1     1     1
:       0     0     0     1
PAD1    0     1     0     0
:       1     0     1     0
PADn    0     1     1     0
[Figure, panels A and B: outcomes in cilostazol vs. control patients for the similarity and PSM cohorts; axis 0 to 6. Outcome groups: MALE (revascularization procedure, bypass procedure); Arrhythmias (cardiac arrhythmia, atrial fibrillation, ventricular tachycardia, ventricular fibrillation); MACE (death, stroke, myocardial infarction).]
Differences in cohorts
Concepts enriched in PSim but not in PSM cohort
• No difference in MACE among patients taking Cilostazol
• Uncovered a “natural experiment” that supports what clinicians suspect
[Figure: concept network. Legend: cilostazol, control; node types: drug, disease, procedure, device.]
Natural experiment
Outcome in CHF patients
Identified and manually confirmed 43 PAD patients with history of CHF taking Cilostazol despite the black-box warning
[Figure, three panels; x-axis: odds ratio, 0 to 4.
A, Major adverse cardiovascular events: sudden cardiac death, stroke, myocardial infarction, defibrillation events, death (SSDI), cardiac arrest, MACE.
B, Major adverse limb events: revascularization, bypass, angioplasty, amputation, MALE.
C, Arrhythmias and symptoms: palpitations, dizziness, ventricular tachycardia, ventricular fibrillation, tachycardia, conduction disease/bradyarrhythmia, atrial fibrillation, ARRHYTHMIAS.]
Confirmed results in 96 PAD/CHF patients in PAMF dataset
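The outcome comparisons are reported as odds ratios. A minimal sketch of how an odds ratio and an approximate 95% confidence interval can be computed from a 2x2 table is given below, using the standard log-odds (Woolf) interval. The counts are hypothetical and are not taken from the study. The slides do not state which interval method was used, so that choice is an assumption.

```python
import math


def odds_ratio(a, b, c, d, z=1.96):
    """Odds ratio with an approximate 95% CI from a 2x2 table.

    a: exposed with event      b: exposed without event
    c: unexposed with event    d: unexposed without event
    """
    if 0 in (a, b, c, d):
        # Haldane-Anscombe correction to avoid division by zero.
        a, b, c, d = (v + 0.5 for v in (a, b, c, d))
    estimate = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # SE of log(OR)
    lower = math.exp(math.log(estimate) - z * se)
    upper = math.exp(math.log(estimate) + z * se)
    return estimate, lower, upper


# Hypothetical counts: events among treated patients vs. matched controls.
est, lo, hi = odds_ratio(20, 212, 25, 207)
no_clear_difference = lo < 1.0 < hi  # interval spans 1
```

An interval that spans 1 is the usual reading of "no significant difference" between the treated and control cohorts.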
Conclusions
• No significant differences in MACE or cardiac arrhythmias in cilostazol vs. controls
• Uncovered "natural experiment" CHF patients
• Cilostazol has no differential effect on survival in CHF patients
• Patient-patient similarity can be used for cohort building
• Concept-concept association networks can be used to analyze outcomes, and uncover "natural experiments"
Cilostazol [is or is not] associated with increased risk of major adverse cardiovascular events (MACE)
Patient-patient matrix (upper triangle):
      P1    …     …     Pn
P1    1     0.6   0.5   0.6
:           1     0.2   0.3
:                 1     0.1
Pn                      1

Concept-concept matrix (upper triangle):
      C1    …     …     Cn
C1    1     0.6   0.5   0.6
:           1     0.2   0.3
:                 1     0.1
Cn                      1
Acknowledgements
• Shah Lab
  • Nigam Shah
  • Paea LePendu
  • Rave Harpaz
  • Srinivasan Iyer
  • Kenneth Jung
  • Amogh Vasekar
  • Sandy Huang
  • Tyler Cole
• Medical collaborator
  • Nicholas Leeper
• NCBO Team
  • Mark Musen
  • NIH funding
• STRIDE Team
  • Tanya Podchiyska
  • Todd Ferris
• PAMF Team
  • Cliff Olson
We are hiring postdocs!
[email protected]