Network Analysis of Unstructured EHR Data for Clinical Research

Network analysis of unstructured EHR data for clinical research. Anna Bauer-Mehren, Shah Lab, Stanford Center for Biomedical Informatics


2013 Summit on Clinical Research Informatics

Transcript of Network Analysis of Unstructured EHR Data for Clinical Research

Page 1: Network Analysis of Unstructured EHR Data for Clinical Research

Network analysis of unstructured EHR data for clinical research

Anna Bauer-Mehren, Shah Lab, Stanford Center for Biomedical Informatics

Page 2: Network Analysis of Unstructured EHR Data for Clinical Research

Networks

• Networks have successfully been used in biology and partly in medicine
  • Ataxia protein-protein interaction network (Lim et al., Cell, 2006)
  • Patient stratification using ICD9 data (Roque et al., PLoS Comp Biol, 2011)
• We can use networks from unstructured EHR data for:
  • Cohort construction
  • Clinical outcome analysis

4/2/13 2

Page 3: Network Analysis of Unstructured EHR Data for Clinical Research


NCBO

• More than 300 biomedical ontologies
• Tools and web services

STRIDE

• 1.8 million patients
• 18 years
• 35 million coded ICD9 diagnoses
• 11 million clinical free text notes

Data → Clinically relevant hypotheses

Page 4: Network Analysis of Unstructured EHR Data for Clinical Research

From clinical notes to patient feature matrix

Annotation Workflow

• Text clinical note
• Term recognition tool: NCBO Annotator → Terms Recognized
• NegEx patterns: Negation detection, Family History detection
• Output: Term – 1 … Term – n

BioPortal – knowledge graph

• Diseases, Procedures, Drugs, Devices
• Creating clean lexicons

Ontologies provide features & dependencies

Simple text processing, Feature extraction
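The workflow on this slide (recognize lexicon terms in a note, drop negated and family-history mentions, keep the rest as patient features) can be sketched roughly as follows. This is a toy illustration only: the lexicon, the trigger words and the helper names `annotate` and `feature_matrix` are hypothetical, and the regular expressions are a much simplified stand-in for the NCBO Annotator and NegEx rather than their actual behavior.

```python
import re

# Hypothetical toy lexicon mapping a term to a concept (a real lexicon would come from ontologies).
LEXICON = {
    "peripheral artery disease": "PAD",
    "claudication": "PAD",
    "cilostazol": "cilostazol",
    "heart failure": "CHF",
}
# Simplified trigger phrases (illustrative only, not the NegEx rule set).
NEGATION = re.compile(r"\b(no|denies|without|negative for)\b")
FAMILY = re.compile(r"\b(family history|mother|father)\b")


def annotate(note):
    """Return the concepts asserted for the patient in one note."""
    found = set()
    for sentence in re.split(r"[.;\n]", note.lower()):
        for term, concept in LEXICON.items():
            pos = sentence.find(term)
            if pos < 0:
                continue
            context = sentence[:pos]  # text before the term, same sentence
            if NEGATION.search(context) or FAMILY.search(context):
                continue  # skip negated or family-history mentions
            found.add(concept)
    return found


def feature_matrix(notes_by_patient):
    """Build binary patient rows over the sorted concept list."""
    concepts = sorted(set(LEXICON.values()))
    rows = {}
    for pid, notes in notes_by_patient.items():
        present = set()
        for note in notes:
            present |= annotate(note)
        rows[pid] = [1 if c in present else 0 for c in concepts]
    return concepts, rows


notes = {
    "P1": ["Patient with claudication. Started cilostazol."],
    "P2": ["No heart failure. Father had peripheral artery disease."],
}
concepts, matrix = feature_matrix(notes)
print(concepts)
print(matrix)
```

Each row of `matrix` corresponds to one patient and each column to one concept, which is the shape of the patient feature matrix used on the following slides.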

Page 5: Network Analysis of Unstructured EHR Data for Clinical Research

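Patient × concept matrix:

      C1   …    …    Cn
P1    1    0    1    1
:     0    1    1    0
:     0    0    0    1
Pn    0    1    0    1

Concept × concept matrix (upper triangle):

      C1   …    …    Cn
C1    1    0.6  0.5  0.6
:          1    0.2  0.3
:               1    0.1
Cn                   1

Patient × patient matrix (upper triangle):

      P1   …    …    Pn
P1    1    0.1  0.7  0.8
:          1    0.5  0.8
:               1    0.4
Pn                   1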
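One way to get from the binary patient × concept matrix to the two square matrices is to compare rows (patients) or columns (concepts) pairwise. The slide does not say which measure produced the example values; the sketch below assumes the Jaccard index, which appears on the cohort construction slide, and the function names are illustrative.

```python
def jaccard(a, b):
    """Jaccard similarity of two equal-length binary vectors."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0


def similarity_matrix(vectors):
    """Full pairwise similarity matrix for a list of binary vectors."""
    return [[jaccard(u, v) for v in vectors] for u in vectors]


# Binary patient x concept matrix copied from the slide (rows = patients).
patients = [
    [1, 0, 1, 1],
    [0, 1, 1, 0],
    [0, 0, 0, 1],
    [0, 1, 0, 1],
]
# Transpose to get one vector per concept (columns of the matrix above).
concepts = [list(col) for col in zip(*patients)]

patient_sim = similarity_matrix(patients)  # patient x patient
concept_sim = similarity_matrix(concepts)  # concept x concept
print(patient_sim[0][1], concept_sim[0][2])
```

The values computed here will not reproduce the numbers on the slide, which are themselves illustrative.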

[Figure: concept network. Node labels: zolpidem, pravastatin, nifedipine, congestive heart failure, insulin glargine, cilostazol, transplantation, trimethoprim, bypass, doppler studies, surgical revision, atherectomy, bypass graft, vascular surgical procedures, ultrasound imaging, testosterone, fentanyl, amoxicillin, heart failure, coronary angiography, pantoprazole, cephalexin, hydralazine, amiodarone, obesity, wheelchair, diazepam, pneumonia, tacrolimus, sulfamethoxazole, temazepam, decompressive incision, fluoroscopic angiography, heart transplantation, revascularization, diagnostic imaging, vascular diseases, ramipril, angioplasty, doppler ultrasonography, cane, carotid endarterectomy]

What is special about these patients?

Who is getting these drugs, conditions, etc?

Drug Safety · Comparative Effectiveness · Practice-based evidence · Predictions

Practice-based evidence

Page 6: Network Analysis of Unstructured EHR Data for Clinical Research

Practice-based evidence

Peripheral artery disease (PAD): obstruction of infra-renal abdominal aorta and lower extremity arteries

→ Cilostazol


Page 7: Network Analysis of Unstructured EHR Data for Clinical Research

Cilostazol

• Reversible selective inhibitor of phosphodiesterase (PDE) type III
• Long-term oral milrinone therapy associated with life-threatening cardiovascular events in congestive heart failure (CHF) patients

Hypothesis: Cilostazol is not associated with increased risk of major adverse cardiovascular events (MACE)

1.  PAD → Cilostazol? → MACE
2.  PAD/CHF → Cilostazol? → MACE

Packer et al. “Effect of oral milrinone on mortality in severe chronic heart failure. The PROMISE Study Research Group.” N Engl J Med. (1991)

Page 8: Network Analysis of Unstructured EHR Data for Clinical Research

Dimension reduction – patient space

• Patients (1.8 million); 50k
• Concepts mapped to PAD: Peripheral artery disease; Peripheral vascular disease; Peripheral arterial occlusive disease; Intermittent claudication; Claudication (finding)
• 11547 PAD patients
• Patient timeline: follow up time runs from PAD (t_PAD) to the last note (t_last)
• [Figure: histogram "Follow up time in peripheral artery disease patients"; x-axis: follow up time in 30 day intervals; y-axis: Frequency]
• 5757 PAD patients
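The follow up time on the patient timeline, and its binning into 30 day intervals for the histogram, could be computed along these lines. The dates are made up for illustration, and the function names are hypothetical.

```python
from collections import Counter
from datetime import date


def follow_up_days(t_pad, t_last):
    """Days between the first PAD mention and the last note."""
    return (t_last - t_pad).days


def interval_30(days):
    """Index of the 30 day interval a follow up time falls into."""
    return days // 30


# Made-up timelines: patient -> (date of first PAD mention, date of last note).
timelines = {
    "P1": (date(2005, 1, 10), date(2005, 3, 15)),
    "P2": (date(2008, 6, 1), date(2008, 6, 20)),
    "P3": (date(2010, 2, 1), date(2010, 4, 5)),
}

# Histogram: interval index -> number of patients.
histogram = Counter(
    interval_30(follow_up_days(t_pad, t_last))
    for t_pad, t_last in timelines.values()
)
print(sorted(histogram.items()))
```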

Page 9: Network Analysis of Unstructured EHR Data for Clinical Research

Peripheral Artery Disease: 5757 patients

Baseline characteristics before matching, after patient-patient similarity network matching, and after propensity score matching:

Variable | Treatment (n= 232) | Before Matching: Control (n= 5525) | p-value | Patient-patient similarity network: Control (n= 232) | p-value | Propensity score matching: Control (n= 5525) | p-value
Demographics
Age (at indication onset), mean (sd) | 71.22 (11.02) | 70.43 (12.46) | 0.30 | 72.05 (10.62) | 0.41 | 70.87 (11.51) | 0.75
Gender (female), n (%) | 37.22 | 45.94 | <0.01 | 36.65 | 0.84 | 35.87 | 0.77
Race, (%)
Asian | 8.52 | 7.41 | 0.56 | 6.33 | 0.37 | 10.31 | 0.51
Black | 2.69 | 3.71 | 0.36 | 3.61 | 0.59 | 0.90 | 0.16
Unknown | 22.87 | 26.17 | 0.25 | 22.17 | 0.82 | 20.63 | 0.57
White | 65.47 | 62.22 | 0.32 | 67.87 | 0.55 | 67.27 | 0.69
Comorbidities
Coronary artery disease, n (%) | 5.38 | 6.47 | 0.48 | 4.98 | 0.83 | 6.28 | 0.70
Congestive heart failure, n (%) | 25.56 | 22.84 | 0.36 | 20.36 | 0.21 | 30.49 | 0.36
Hypertension, n (%) | 10.76 | 11.31 | 0.80 | 9.50 | 0.75 | 10.31 | 0.88
Co-prescriptions
Beta blocking agents, n (%) | 75.34 | 60.77 | <0.001 | 69.68 | 0.20 | 74.89 | 0.91
ACE inhibitors, plain, n (%) | 78.03 | 69.57 | <0.01 | 67.87 | 0.01 | 78.92 | 0.81
Platelet aggregation inhibitors excl. heparin, n (%) | 91.93 | 79.00 | <0.001 | 89.59 | 0.41 | 95.51 | 0.07
Vasodilators, n (%) | 32.29 | 26.36 | 0.06 | 31.67 | 0.84 | 37.22 | 0.29
History of
Cardiac arrhythmia, n (%) | 32.29 | 32.17 | 0.97 | 23.08 | 0.03 | 33.18 | 0.84
Stroke, n (%) | 17.94 | 18.31 | 0.89 | 15.84 | 0.61 | 21.52 | 0.34
Myocardial infarction, n (%) | 17.94 | 15.87 | 0.43 | 13.58 | 0.24 | 19.73 | 0.64
Vascular surgical procedures, n (%) | 74.44 | 47.71 | <0.001 | 65.61 | 0.05 | 74.44 | 1
Bypass surgery, n (%) | 41.70 | 26.56 | <0.001 | 36.20 | 0.24 | 40.36 | 0.75

Cilostazol patients are sicker than other PAD patients
→ Might affect outcome analysis
→ Ideally, we want to compare to “medical twin”

[Figure: patient × concept matrix, as on Page 5]

Page 10: Network Analysis of Unstructured EHR Data for Clinical Research

Cohort construction

1.  Choose variables for matching

2.  Compute propensity score (PS) based on variables (logistic regression)

3.  Nearest neighbor matching (1:1)
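The three steps above can be sketched in plain Python as follows. This is only a minimal illustration under stated assumptions: the covariates and treatment flags are made up, the logistic regression is fitted with simple gradient descent instead of a statistics package, and the matching is greedy without replacement and without a caliper. The slides do not say which of these choices the study made.

```python
import math


def predict(w, x):
    """Propensity score: logistic function of bias plus weighted covariates."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit logistic regression by batch gradient descent; returns [bias, w1, ...]."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            err = predict(w, xi) - yi
            grad[0] += err
            for j in range(d):
                grad[j + 1] += err * xi[j]
        w = [wj - lr * g / n for wj, g in zip(w, grad)]
    return w


def match_one_to_one(scores, treated):
    """Greedy 1:1 nearest neighbor matching on the score, without replacement."""
    controls = [i for i, t in enumerate(treated) if not t]
    pairs = []
    for i in [i for i, t in enumerate(treated) if t]:
        if not controls:
            break
        best = min(controls, key=lambda c: abs(scores[c] - scores[i]))
        controls.remove(best)
        pairs.append((i, best))
    return pairs


# Step 1: chosen matching variables (made-up binary covariates per patient).
X = [[1, 1], [1, 0], [0, 1], [0, 0], [1, 1], [0, 0]]
treated = [1, 1, 0, 0, 0, 0]  # 1 = received the drug (made-up)

# Step 2: propensity score from logistic regression.
weights = fit_logistic(X, treated)
scores = [predict(weights, x) for x in X]

# Step 3: 1:1 nearest neighbor matching.
pairs = match_one_to_one(scores, treated)
print(pairs)
```

Patient 0 is matched to patient 4 because both have identical covariates and therefore identical scores.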

Patient-patient similarity network (PSim)

• Patient feature matrix: 5757 patients × 1159 concepts (drug classes, diseases, devices, procedures, demographics)
• Jaccard index: $J(A,B) = \frac{|A \cap B|}{|A \cup B|}$
• Patient × patient matrix (5757 × 5757), example values (upper triangle):

0    0.6  0.8  0.6  0.5  0.8
     0    0.8  0.9  0.8  0.3
          0    0.7  0.9  0.9
               0    0.7  0.8
                    0    0.8
                         0

• Nearest neighbor matching (1:1)
• Matched cohort: 464 patients (cilostazol and control) × 1159 concepts

Propensity-score matching (PSM)

• Input: 5757 patients × 1159 concepts
• Matched cohort: 464 patients (cilostazol and control) × 17 concepts
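Matching on patient-patient similarity can be sketched as picking, for each cilostazol patient, the control whose concept set is closest. The sketch assumes a Jaccard distance of 1 − J(A,B), which would be consistent with the zero diagonal in the example matrix, and a greedy 1:1 assignment without replacement; both are assumptions, and the patients and concept sets are made up.

```python
def jaccard_distance(a, b):
    """1 - |A ∩ B| / |A ∪ B| for two concept sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def match_by_similarity(treated, controls):
    """Greedy 1:1 matching: each treated patient gets the closest unused control."""
    available = dict(controls)
    pairs = {}
    for pid, concepts in treated.items():
        if not available:
            break
        best = min(available, key=lambda c: jaccard_distance(concepts, available[c]))
        pairs[pid] = best
        del available[best]
    return pairs


# Made-up concept sets per patient.
treated = {
    "T1": {"PAD", "bypass", "beta blocker"},
    "T2": {"PAD", "CHF"},
}
controls = {
    "C1": {"PAD", "CHF", "stroke"},
    "C2": {"PAD", "bypass", "beta blocker", "ACE inhibitor"},
    "C3": {"PAD"},
}

pairs = match_by_similarity(treated, controls)
print(pairs)
```

Unlike propensity-score matching, this uses every extracted concept and needs no expert-selected variable list.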

Page 11: Network Analysis of Unstructured EHR Data for Clinical Research

Cohort construction

[Table: baseline characteristics before matching, after patient-patient similarity network matching, and after propensity score matching; repeated from Page 9]

• Patient-patient similarity network: > 1000 variables
• Propensity score matching: 17 expert-selected variables (464 patients × 17 concepts)
• No difference in results

Page 12: Network Analysis of Unstructured EHR Data for Clinical Research

Outcome in PAD patients

Patient × concept matrix with cilostazol (CIL) column:

       C1   …    Cn   CIL
CIL1   1    0    1    1
CIL2   0    1    1    1
:      0    0    0    1
PAD1   0    1    0    0
:      1    0    1    0
PADn   0    1    1    0

[Figure, panels A and B: outcomes grouped as MACE (death, stroke, myocardial infarction), Arrhythmias (cardiac arrhythmia, atrial fibrillation, ventricular tachycardia, ventricular fibrillation) and MALE (revascularization procedure, bypass procedure); axis ticks 0 to 6; legends: similarity, psm; control, cilostazol]

Page 13: Network Analysis of Unstructured EHR Data for Clinical Research

Differences in cohorts

Concepts enriched in PSim but not in PSM cohort

Page 14: Network Analysis of Unstructured EHR Data for Clinical Research

Natural experiment

• No difference in MACE among patients taking Cilostazol
• Uncovered a “natural experiment” that supports what clinicians suspect

[Figure: outcome plot, panels A and B, repeated from Page 12]

[Figure: concept network comparing cilostazol and control; node types: drug, disease, procedure, device; node labels as in the network on Page 5]

Page 15: Network Analysis of Unstructured EHR Data for Clinical Research

Outcome in CHF patients

Identified and manually confirmed 43 PAD patients with history of CHF taking Cilostazol despite the black-box warning

[Figure: three panels; x-axis: Odds ratio, ticks 0 to 4]

• A. Major adverse cardiovascular events: sudden cardiac death, stroke, myocardial infarction, defibrillation events, death (SSDI), cardiac arrest, MACE
• B. Major adverse limb events: revascularization, bypass, angioplasty, amputation, MALE
• C. Arrhythmias and symptoms: palpitations, dizziness, ventricular tachycardia, ventricular fibrillation, tachycardia, conduction disease/bradyarrhythmia, atrial fibrillation, ARRHYTHMIAS

Confirmed results in 96 PAD/CHF patients in PAMF dataset
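For readers unfamiliar with the odds ratios plotted on this slide, here is a minimal sketch of how an odds ratio and an approximate 95% confidence interval can be computed from a 2×2 table. The slides do not state which estimator or interval was used; the sketch assumes the standard log-odds (Woolf) interval with a 0.5 continuity correction for empty cells, and the counts are made up, not study results.

```python
import math


def odds_ratio(exposed_events, exposed_total, control_events, control_total, z=1.96):
    """Odds ratio with an approximate confidence interval (Woolf method)."""
    a = exposed_events
    b = exposed_total - exposed_events
    c = control_events
    d = control_total - control_events
    # Add 0.5 to every cell if any cell is empty, to avoid division by zero.
    if 0 in (a, b, c, d):
        a, b, c, d = (v + 0.5 for v in (a, b, c, d))
    estimate = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # standard error of log(OR)
    low = math.exp(math.log(estimate) - z * se)
    high = math.exp(math.log(estimate) + z * se)
    return estimate, low, high


# Made-up counts: 12 of 40 exposed patients and 12 of 40 controls had the event.
estimate, low, high = odds_ratio(12, 40, 12, 40)
print(round(estimate, 2), round(low, 2), round(high, 2))
```

An interval that contains 1 is read as no significant difference between the two groups for that outcome.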

Page 16: Network Analysis of Unstructured EHR Data for Clinical Research

Conclusions

• No significant differences in MACE or cardiac arrhythmias in cilostazol vs. controls
• Uncovered “natural experiment” CHF patients
• Cilostazol has no differential effect on survival in CHF patients

• Patient-patient similarity can be used for cohort building
• Concept-concept association networks can be used to analyze outcomes, and uncover “natural experiments”

Cilostazol [is or is not] associated with increased risk of major adverse cardiovascular events (MACE)

[Figures repeated from Page 5: patient × patient and concept × concept matrices; concept network with control and cilostazol legend]

Page 17: Network Analysis of Unstructured EHR Data for Clinical Research

Acknowledgements

• Shah Lab
  • Nigam Shah
  • Paea LePendu
  • Rave Harpaz
  • Srinivasan Iyer
  • Kenneth Jung
  • Amogh Vasekar
  • Sandy Huang
  • Tyler Cole
• Medical collaborator
  • Nicholas Leeper
• NCBO Team
  • Mark Musen
• NIH funding
• STRIDE Team
  • Tanya Podchiyska
  • Todd Ferris
• PAMF Team
  • Cliff Olson

Page 18: Network Analysis of Unstructured EHR Data for Clinical Research

We are hiring postdocs!

[email protected]