Intelligent Clinical Decision Support with Probabilistic ...


Transcript of Intelligent Clinical Decision Support with Probabilistic ...

UC Irvine Heart Disease dataset  

Intelligent Clinical Decision Support with Probabilistic and Temporal EHR Modeling

Simulations over 5,807 IN/TN patients: >50% improvement in cost effectiveness and >30% improvement in outcomes beyond the existing fee-for-service model [Bennett and Hauser, Artificial Intelligence in Medicine 2013]

[Diagram of the Intelligent Clinical Decision Support System (ICDSS). Components shown: EHR, SRL, POMDP, disease progression model]

Domain expert: provides rules & corrective feedback; identifies clinically relevant features

Primary clinician: observations, test results, imaging, interventions; recommended treatment plans w/ human-readable rationale

Kris Hauser* (PI), Sriraam Natarajan* (Co-PI), Shaun Grannis† (Co-PI) *Indiana University School of Informatics and Computing † Indiana University School of Medicine, Regenstrief Institute

With clinical partners:

Overview and Motivation | Progress & Expected Outcomes | Existing and Ongoing Work

POMDP models in chronic depression treatment

[Figure: AUC-ROC (axis range 0.45 to 0.95) for J48, SVM, AdaBoost, Bagging, Logistic, Naïve Bayes, Single Relational Tree, and Soft Margin boosting]
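The AUC-ROC metric used in the comparison above can be computed directly from labels and scores. The following is a minimal sketch using the pairwise-ranking definition (the fraction of positive/negative pairs that the classifier orders correctly); the function name `auc_roc` and the example data are illustrative assumptions, not values from the poster.

```python
def auc_roc(labels, scores):
    """Area under the ROC curve via the pairwise-ranking (Mann-Whitney) identity."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Made-up example: 1 = high-risk patient, 0 = low-risk patient
labels = [1, 0, 1, 0, 1]
scores = [0.9, 0.4, 0.65, 0.7, 0.8]
auc = auc_roc(labels, scores)
```

The quadratic loop is fine for small examples; a sort-based variant would be preferable for datasets with thousands of patients.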

SRL for medical prediction tasks

Cardiology domain

Circulation; 92(8), 2157-62, 1995

JACC; 43, 842-7, 2004

Plaque in the left Coronary artery

Given: age, sex, cholesterol, BMI, glucose, HDL level, LDL level, exercise, triglyceride level, systolic BP and diastolic BP for all years 0 and 20

Predict: high Coronary Artery Calcification (CAC) levels at year 20

Atherosclerosis is the cause of the majority of Acute Myocardial Infarctions (heart attacks)

Model: mixtures of relational probabilistic decision trees

Learning: Relational Functional Gradient Boosting (RFGB)

§  New Soft Margin relational learning algorithm improves recall

§  Tunable penalties for false positives vs false negatives

§  State-of-the-art performance on UC Irvine Heart Disease dataset (at right) and others  
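To illustrate what tunable penalties for false positives vs false negatives mean in practice, here is a minimal sketch of an asymmetric (cost-weighted) log loss. This is a generic illustration and not the Soft Margin relational algorithm itself; the names `fn_penalty` and `fp_penalty` and the data are assumptions made for the example.

```python
import math


def weighted_log_loss(labels, probs, fn_penalty=1.0, fp_penalty=1.0):
    """Mean log loss with separate weights for positive and negative examples.

    Raising fn_penalty makes under-predicted positives more expensive
    (favoring recall); raising fp_penalty does the same for false alarms.
    """
    eps = 1e-12
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        if y == 1:
            total += -fn_penalty * math.log(p)
        else:
            total += -fp_penalty * math.log(1.0 - p)
    return total / len(labels)


# Made-up predictions for four patients
labels = [1, 0, 0, 1]
probs = [0.3, 0.2, 0.6, 0.9]
balanced = weighted_log_loss(labels, probs)
recall_oriented = weighted_log_loss(labels, probs, fn_penalty=5.0)
```

With `fn_penalty=5.0`, the first patient (a positive scored at only 0.3) dominates the loss, so a learner minimizing it is pushed toward higher recall.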

§  Predictive model: Dynamic Bayesian Network

§  Prediction horizon: 8

§  Decision variable: treat / not treat

§  Outcome variables: self-reported survey (CDOI), treatment cost

§  Objective function: tradeoff between CDOI and $$$ (clinician-selected weight)
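The treat / not treat decision over a fixed horizon with a weighted outcome-vs-cost objective can be sketched with backward induction. This is a simplified illustration under stated assumptions: it uses a small, fully observed three-state Markov model in place of a learned Dynamic Bayesian Network, and every number, state name, and the `plan` function are invented for the example.

```python
HORIZON = 8
STATES = ["poor", "fair", "good"]
ACTIONS = ["treat", "not_treat"]

# Hypothetical transition probabilities: P[action][state][next_state]
P = {
    "treat": {
        "poor": {"poor": 0.5, "fair": 0.4, "good": 0.1},
        "fair": {"poor": 0.1, "fair": 0.5, "good": 0.4},
        "good": {"poor": 0.05, "fair": 0.15, "good": 0.8},
    },
    "not_treat": {
        "poor": {"poor": 0.8, "fair": 0.15, "good": 0.05},
        "fair": {"poor": 0.3, "fair": 0.6, "good": 0.1},
        "good": {"poor": 0.1, "fair": 0.3, "good": 0.6},
    },
}
OUTCOME = {"poor": 0.0, "fair": 0.5, "good": 1.0}  # stand-in for a normalized outcome score
COST = {"treat": 1.0, "not_treat": 0.0}            # stand-in for a normalized treatment cost


def plan(weight):
    """Backward induction; weight in [0, 1] trades outcome (1) against cost (0)."""
    value = {s: 0.0 for s in STATES}
    policy = []
    for _ in range(HORIZON):
        new_value, step_policy = {}, {}
        for s in STATES:
            best_a, best_q = None, float("-inf")
            for a in ACTIONS:
                # Immediate cost plus expected (outcome + future value)
                q = -(1.0 - weight) * COST[a] + sum(
                    prob * (weight * OUTCOME[s2] + value[s2])
                    for s2, prob in P[a][s].items()
                )
                if q > best_q:
                    best_a, best_q = a, q
            new_value[s], step_policy[s] = best_q, best_a
        value = new_value
        policy.insert(0, step_policy)  # earliest decision step ends up first
    return policy, value


policy, value = plan(weight=0.7)
print(policy[0])  # recommended action per state at the first decision step
```

Sweeping `weight` from 0 to 1 shows how the recommended plan shifts from cost-minimizing to outcome-maximizing.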

[Yang et al, submitted to ICDM]

Overview of envisioned CDSS

§  Clinical decision-support systems (CDSS) have potential to exploit the wealth of clinical data in EHRs in addition to expert recommendations

§  New AI techniques needed to plan chronic and multi-stage treatments: compute patient-specific, temporal, statistically-justified treatment plans

§  Balance costs vs patient outcomes, reason with uncertainty

§  Hypothesis: CDSS can improve the state of current clinical practice by providing outcome-driven and cost-driven optimized decisions

Two promising AI techniques: SRL and POMDP

Statistical Relational Learning (SRL): learning probabilistic models from datasets with relational structure

§  Handles linked datasets, incomplete/missing data, noise

§  Excellent for EHR data

Partially Observable Markov Decision Processes (POMDP): planning multi-stage decisions under uncertainty when the underlying state is only partially observed
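A core POMDP operation is the belief update: after taking an action and receiving an observation, the probability distribution over hidden states is revised with Bayes' rule. The sketch below is a minimal illustration; the state names, action, observations, and probabilities are all assumptions made for the example and do not come from the poster.

```python
def belief_update(belief, action, observation, transition, emission):
    """b'(s') is proportional to O(o | s') * sum_s T(s' | s, a) * b(s)."""
    states = list(belief)
    # Prediction step: push the belief through the transition model
    predicted = {
        s2: sum(transition[action][s][s2] * belief[s] for s in states)
        for s2 in states
    }
    # Correction step: weight by the likelihood of the observation
    unnorm = {s2: emission[s2][observation] * predicted[s2] for s2 in states}
    total = sum(unnorm.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return {s2: v / total for s2, v in unnorm.items()}


# Hypothetical two-state patient model
transition = {
    "treat": {
        "stable": {"stable": 0.9, "worsening": 0.1},
        "worsening": {"stable": 0.4, "worsening": 0.6},
    }
}
emission = {
    "stable": {"normal_test": 0.8, "abnormal_test": 0.2},
    "worsening": {"normal_test": 0.3, "abnormal_test": 0.7},
}
prior = {"stable": 0.5, "worsening": 0.5}
posterior = belief_update(prior, "treat", "abnormal_test", transition, emission)
```

A planner then chooses actions as a function of this belief rather than of the unobservable true state.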

[Figure: Final outcome, probabilistic prediction, 10-fold cross validation. Log-likelihood (axis range -1800 to 0) by model: Trivial, Rand. Forest, Log. Reg., Handmade BN, Learned BN. Higher is better (0 is perfect prediction)]
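The log-likelihood score in the chart rewards a model for assigning high probability to the outcome that actually occurred, which is why 0 corresponds to perfect prediction. A minimal sketch follows; the outcome labels and probabilities are made up for illustration, and `log_likelihood` is a hypothetical helper name.

```python
import math


def log_likelihood(outcomes, predicted):
    """Sum of log-probabilities that the model assigned to the observed outcomes."""
    total = 0.0
    for outcome, dist in zip(outcomes, predicted):
        p = dist.get(outcome, 0.0)
        if p <= 0.0:
            return float("-inf")  # model ruled out something that happened
        total += math.log(p)
    return total


outcomes = ["recovered", "dependent", "dead"]

# A model that is certain and always right scores exactly 0
perfect = log_likelihood(outcomes, [{o: 1.0} for o in outcomes])

# A model that spreads probability evenly over three outcomes scores lower
uniform_dist = {"recovered": 1 / 3, "dependent": 1 / 3, "dead": 1 / 3}
uniform = log_likelihood(outcomes, [uniform_dist] * len(outcomes))
```

In a cross-validated comparison, this sum would be computed on each held-out fold and aggregated across folds.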

Developing SRL methods to learn likelihoods of future adverse medical events

•  CARDIA 20-year longitudinal dataset (N=5115)

•  Preliminary results suggest high predictive power @ year 20

Future: use POMDP to suggest lifestyle changes (smoking, exercise), medications

ER domain

Regenstrief PHESS: 65 million records across Indiana

Identify high utilizers: significant source of waste

Stroke domain

Int’l Stroke Trial dataset: N=19,435 patients (1991-1996) with acute stroke symptoms

3 observation points, 2 decision points:

•  Admission (t=0): demographics, symptoms observed. Medicine administered

•  Discharge (t=2 weeks): test results, diagnosis. Possible change of treatment, long-term prescription

•  Follow up (t=6 mo): death, long-term dependency

•  Current: Learning probabilistic conditional models from partial & noisy data

•  Future: optimize patient-specific plans