Computational prediction of clinical outcome of...


Poster: cs229.stanford.edu/proj2016/poster/TanigawaPfohl...

Yosuke Tanigawa, Stephen Pfohl
Biomedical Informatics Ph.D. program, School of Medicine, Stanford University

Abstract

References

Data

Future Direction

Models and Results

Computational prediction of clinical outcome of sepsis from critical care database

1. D C Angus, W T Linde-Zwirble, J Lidicker, G Clermont, J Carcillo, and M R Pinsky. Epidemiology of severe sepsis in the United States: analysis of incidence, outcome, and associated costs of care. Critical Care Medicine, 29(7):1303–1310, 2001.

2. Alistair E W Johnson, Tom J Pollard, Lu Shen, Li-Wei H Lehman, Mengling Feng, Mohammad Ghassemi, Benjamin Moody, Peter Szolovits, Leo Anthony Celi, and Roger G Mark. MIMIC-III, a freely accessible critical care database. Scientific Data, 3:160035, 2016.

3. Robert Tibshirani. Regression Shrinkage and Selection via the Lasso. Journal of the Royal Statistical Society: Series B, 58(1):267–288, 1996.

The classification algorithm would likely improve immediately with further feature engineering that better represents temporality, and with grouping of similar features by mapping them onto an ontological knowledge graph such as the UMLS Metathesaurus. However, it is likely more worthwhile to redefine the model objectives so that the risk of a septic event and of mortality can be predicted in real time.

The data of interest are contained in the MIMIC-III database [2], an electronic health record database curated by MIT. It houses de-identified demographics, vital signs, lab test results, procedures, medications, notes, imaging reports, and outcomes for about 58,000 hospital admissions between 2001 and 2012, covering 38,645 adults and 7,875 neonates at the Beth Israel Deaconess Medical Center. For classification, we label a hospital admission as a positive example only if sepsis occurs over the course of the admission according to the clinical criteria of Angus et al. [1]. For survival analysis, we consider the admissions with a positive sepsis label that ended in death in the hospital, and define the time of death as the number of days since admission.
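The survival-time definition above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the field names `hadm_id`, `admittime`, and `deathtime` follow the MIMIC-III admissions table, while the function name `survival_days` and the precomputed set of sepsis-positive admission ids are assumptions made for the example.

```python
from datetime import datetime


def survival_days(admissions, sepsis_ids):
    """Return {hadm_id: days from admission to death} for sepsis-positive
    admissions that ended in in-hospital death.

    `admissions` is a list of dicts with keys hadm_id, admittime, deathtime
    (deathtime is None when the patient did not die in hospital).
    `sepsis_ids` is a hypothetical, precomputed set of positive labels.
    """
    days = {}
    for row in admissions:
        if row["hadm_id"] in sepsis_ids and row["deathtime"] is not None:
            delta = row["deathtime"] - row["admittime"]
            days[row["hadm_id"]] = delta.total_seconds() / 86400.0
    return days


example = [
    {"hadm_id": 1, "admittime": datetime(2101, 1, 1, 0, 0),
     "deathtime": datetime(2101, 1, 4, 12, 0)},
    {"hadm_id": 2, "admittime": datetime(2101, 2, 1, 0, 0), "deathtime": None},
    {"hadm_id": 3, "admittime": datetime(2101, 3, 1, 0, 0),
     "deathtime": datetime(2101, 3, 2, 0, 0)},
]
cohort = survival_days(example, sepsis_ids={1, 2})
```

Only admission 1 enters the cohort here: admission 2 has no in-hospital death and admission 3 is not labeled as sepsis.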

Discussion

[Figure: Lasso Testing Performance. ROC curves (True Positive Rate vs. False Positive Rate) for the NoICD, AllICD, and NonAngusICD models.]

MIMIC-III events (long format)

hadm id  Item id  Date, time  Value  Flag
1        1        10:00       20
1        1        11:00       50     !
1        2        0:20        200
1        3        16:00       3.5
2        2        7:45        25     !
m        n        8:30        1

Per-admission summary features (wide format)

hadm id  Item1 mean  Item1 slope  Item1 mean diff  Item1 # flag  Item1 % flag  Item1 cnt  …  Item n cnt
1        25          2            30               1             0.50          2          …  N/A
2        N/A         0            N/A              N/A           N/A           0          …  N/A
m        30          5            4                3             0.60          5          …  3
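The long-to-wide conversion can be sketched as a group-by over (admission, item) pairs. The poster does not spell out how each summary is computed, so the definitions below are assumptions: the slope is a least-squares slope of value against time in hours, the mean difference is the average of successive differences, and undefined quantities are returned as `None`. The function name `summarize_events` and the `item<id>_<statistic>` key format are illustrative; the statistic suffixes mirror the feature names that appear on the poster.

```python
from collections import defaultdict


def summarize_events(rows):
    """rows: iterable of (hadm_id, item_id, hours, value, flagged).

    Returns {hadm_id: {feature_name: value}} with one group of summary
    features per item measured during the admission.
    """
    grouped = defaultdict(list)
    for hadm_id, item_id, hours, value, flagged in rows:
        grouped[(hadm_id, item_id)].append((hours, value, flagged))

    wide = defaultdict(dict)
    for (hadm_id, item_id), obs in grouped.items():
        obs.sort(key=lambda o: o[0])  # order measurements by time
        times = [o[0] for o in obs]
        values = [o[1] for o in obs]
        n = len(obs)
        n_flag = sum(1 for o in obs if o[2])
        mean_value = sum(values) / n

        # Average change between consecutive measurements.
        diffs = [b - a for a, b in zip(values, values[1:])]
        mean_diff = sum(diffs) / len(diffs) if diffs else None

        # Least-squares slope of value on time; undefined for one time point.
        mean_time = sum(times) / n
        sxx = sum((t - mean_time) ** 2 for t in times)
        sxy = sum((t - mean_time) * (v - mean_value)
                  for t, v in zip(times, values))
        slope = sxy / sxx if sxx > 0 else None

        prefix = f"item{item_id}"
        wide[hadm_id].update({
            f"{prefix}_meanValue": mean_value,
            f"{prefix}_valSlope": slope,
            f"{prefix}_meanDiff": mean_diff,
            f"{prefix}_flagNum": n_flag,
            f"{prefix}_flagRate": n_flag / n,
            f"{prefix}_nMeas": n,
        })
    return dict(wide)


toy_rows = [
    (1, 7, 10.0, 20.0, False),
    (1, 7, 11.0, 50.0, True),
    (1, 7, 12.0, 40.0, False),
    (2, 7, 7.75, 25.0, True),
]
features = summarize_events(toy_rows)
```

Items that an admission never recorded simply have no keys in its feature dict, which corresponds to the N/A cells of the wide table.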

Table dimensions after each processing step: (1) original, (2) wide format, (3) sparsity filter, (4) near-zero-variance (NZV) filter.

Table               (1) nrow     (1) ncol  (2) nrow  (2) ncol  (3) ncol  (4) ncol
diagnoses_icd       651,047      5         58,976    6,985     N/A       33
admissions          58,976       19        58,976    78        N/A       N/A
labevents           27,854,055   9         58,147    2,880     522       461
inputevents_cv      17,527,935   22        21,879    1,112     166       119
inputevents_mv      3,618,991    31        21,879    1,112     166       119
outputevents        4,349,218    13        51,836    4,556     18        17
procedureevents_mv  258,066      25        21,894    464       52        34
chartevents_1       38,033,561   15        28,687    268       70        61
chartevents_2       13,116,197   15        34,904    36        12        10
chartevents_3       38,657,533   15        29,085    356       108       89
chartevents_4       9,374,587    15        27,210    44        32        28
chartevents_5       18,201,026   15        27,231    168       54        49
chartevents_6       28,014,688   15        34,896    1,644     278       267
chartevents_7       255,967      15        2,030     1,488     6         5
chartevents_8       34,322,082   15        7,990     1,268     184       155
chartevents_9       1,274,692    15        7,452     404       162       156
chartevents_10      9,584,888    15        18,650    528       28        17
chartevents_11      470,141      15        8,672     996       12        10
chartevents_12      265,413      15        1,405     804       4         4
chartevents_13      39,066,570   15        56,716    500       74        53
chartevents_14      100,075,138  15        24,549    3,032     836       535

Classification AUC

Model  ICD features included  Total # of features  Training (n = 53,079)  Test (n = 5,897)
Lasso  None                   797                  0.908                  0.791
Lasso  Non Sepsis             824                  0.910                  0.817
Lasso  All                    830                  0.947                  0.900
RF     None                   797                  0.998                  0.816
RF     Non Sepsis             824                  0.999                  0.855
RF     All                    830                  1.000                  0.921
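For readers unfamiliar with the metric, the AUC reported above equals the probability that a randomly chosen positive admission receives a higher score than a randomly chosen negative one. The sketch below computes it directly from that definition; the function name `roc_auc` is illustrative, and the quadratic pairwise loop is chosen for clarity rather than speed.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for positive (sepsis), 0 for negative.
    scores: predicted risk, higher meaning more likely positive.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


auc_example = roc_auc([1, 0, 1, 0], [0.9, 0.1, 0.4, 0.6])
```

In the example, three of the four positive-negative pairs are ordered correctly, so the AUC is 0.75.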

Survival c-index

Model  ICD features included  Total # of features  Training (n = 5,827)  Test (n = 330)
Cox    None                   797                  0.92                  0.81
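The c-index plays the role of AUC for survival models: among comparable pairs of admissions, it is the fraction in which the patient who died earlier was assigned the higher predicted risk. The following is a minimal sketch of Harrell's concordance index under that definition; the function name `concordance_index` and the explicit `events` argument (which allows censored observations) are assumptions for illustration, since every admission in the survival cohort described here has an observed death.

```python
def concordance_index(times, risks, events):
    """Harrell's c-index.

    times:  observed time for each subject (days since admission).
    risks:  predicted risk score, higher meaning earlier expected death.
    events: True if death was observed, False if censored.
    A pair (i, j) is comparable when subject i has an observed death
    strictly before time j. Tied risk scores count as half concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


c_example = concordance_index(
    times=[5, 10, 15], risks=[0.9, 0.1, 0.5], events=[True, True, True]
)
```

Here two of the three comparable pairs are ordered correctly, giving a c-index of 2/3.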

The use of Electronic Health Records (EHR) over the past several years has generated a large data source that allows for the development of machine learning models for early diagnosis, risk stratification, and clinical decision support. Generating gold-standard labels for the outcome (phenotyping) is critical to developing a training cohort, but it is often labor-intensive and requires manual chart review. Sepsis affects over a million patients annually and remains one of the largest contributors to mortality in the ICU, costing the healthcare system over 14 million dollars per year. To facilitate high-throughput development of predictive models, we propose an electronic phenotyping algorithm that retrospectively identifies sepsis cases from the EHR and attains high performance without the use of ICD-9 billing codes. Additionally, we explore models that predict the risk of mortality following sepsis from the derived EHR features.

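Logistic regression with L1 regularization (Lasso)

$$L(\theta) = \sum_i \left(y^{(i)} - \theta^T x^{(i)}\right)^2 + \lambda \lVert\theta\rVert_1$$

Random Forest (RF) - Fit with 250 trees

Cox Proportional Hazards with L1 regularization

$$L(\theta) = \prod_i \frac{\exp\left(\theta^T x^{(i)}\right)}{\sum_{j:\, t^{(j)} \ge t^{(i)}} \exp\left(\theta^T x^{(j)}\right)} \quad \text{subject to } \lVert\theta\rVert_1 \le \lambda$$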
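To make the Cox objective concrete, the sketch below evaluates the negative log of the partial likelihood for a given coefficient vector. Two choices here are assumptions rather than the authors' implementation: the L1 constraint is expressed in its equivalent penalty (Lagrangian) form, and ties in death times are handled by including every subject with the same or later time in the risk set. The function names are illustrative.

```python
import math


def cox_neg_log_partial_likelihood(theta, X, times, events):
    """Negative log partial likelihood of a Cox model.

    theta:  list of coefficients.
    X:      list of feature rows, one per subject.
    times:  observed time for each subject.
    events: True if death was observed for the subject.
    """
    scores = [sum(t * x for t, x in zip(theta, row)) for row in X]
    nll = 0.0
    for i, (t_i, died) in enumerate(zip(times, events)):
        if not died:
            continue  # censored subjects only appear in risk sets
        # Risk set: everyone still under observation at time t_i.
        risk_total = sum(
            math.exp(scores[j]) for j, t_j in enumerate(times) if t_j >= t_i
        )
        nll -= scores[i] - math.log(risk_total)
    return nll


def l1_penalized_objective(theta, X, times, events, lam):
    """Penalty-form objective: negative log partial likelihood + lam * |theta|_1."""
    penalty = lam * sum(abs(t) for t in theta)
    return cox_neg_log_partial_likelihood(theta, X, times, events) + penalty


X_toy = [[1.0], [0.0], [2.0]]
times_toy = [1.0, 2.0, 3.0]
events_toy = [True, True, True]
nll_at_zero = cox_neg_log_partial_likelihood([0.0], X_toy, times_toy, events_toy)
```

With all coefficients at zero every subject in a risk set is equally likely to die next, so the three terms are log 3, log 2, and log 1, and the total is log 6.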

We successfully processed a large and diverse clinical database to classify sepsis cases retrospectively. The model's utility is limited, however: because its features summarize the whole admission, classification is valid only in retrospect, so the model cannot be used for clinical decision support or real-time prediction. Because performance remains relatively high without ICD-9 codes (test AUC 0.791 for Lasso and 0.816 for RF), the model may help build study cohorts that include patients missed by outcome definitions based only on ICD-9 codes. The same set of summary features attains modest performance in predicting the time-dependent risk of in-hospital death following sepsis (test c-index 0.81), a weaker result than in the classification case.

Feature Engineering

The raw data is sparse and temporal. We performed the following operations to extract aggregate summary features for each admission.

1. Raw Data (21 SQL Tables)
2. Convert to wide format
3. Remove variables with greater than 90% missing
4. Near-Zero-Variance filtering
5. Join Tables and define the labels
6. Split train/test (90/10); all subsequent operations are applied to each set separately
7. Near-Zero-Variance filtering
8. Log-transformation and Normalization
9. Median Imputation
10. Near-Zero-Variance filtering
11. Join ICD-codes (optional)
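The filtering and imputation steps of the pipeline can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' code: columns are plain lists with `None` for missing values, near-zero-variance filtering is approximated by a threshold on the variance of observed values (dedicated tools typically use frequency-ratio and unique-value criteria instead), and medians are learned on the training set and reused on the test set. All function names and the `min_variance` threshold are hypothetical.

```python
from statistics import median, pvariance


def drop_sparse(columns, max_missing=0.9):
    """Step 3: keep columns whose missing fraction is at most max_missing."""
    return {
        name: values
        for name, values in columns.items()
        if sum(v is None for v in values) / len(values) <= max_missing
    }


def drop_near_zero_variance(columns, min_variance=1e-8):
    """Steps 4, 7, 10 (simplified): drop columns that are almost constant."""
    kept = {}
    for name, values in columns.items():
        observed = [v for v in values if v is not None]
        if len(observed) >= 2 and pvariance(observed) > min_variance:
            kept[name] = values
    return kept


def fit_medians(train_columns):
    """Step 9, fitting: per-column median of observed training values."""
    return {
        name: median([v for v in values if v is not None])
        for name, values in train_columns.items()
    }


def impute(columns, medians):
    """Step 9, applying: fill missing values with the training medians."""
    return {
        name: [medians[name] if v is None else v for v in columns[name]]
        for name in medians
    }


train = {
    "lab_mean": [1.0, 2.0, None, 4.0, 5.0, None, 7.0, 8.0, 9.0, 10.0],
    "all_missing": [None] * 10,
    "constant": [2.0] * 10,
}
filtered = drop_near_zero_variance(drop_sparse(train))
medians = fit_medians(filtered)
test_imputed = impute({"lab_mean": [None, 3.0]}, medians)
```

Learning the medians on the training split only keeps information from the test set out of the fitted preprocessing.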

[Figure: Lasso cross-validation. AUC vs. log(Lambda); the top axis gives the number of nonzero coefficients (704 down to 4).]

[Figure: Lasso coefficient paths. Coefficients vs. L1 Norm.]

[Figure: Survival curve. Survival probability vs. Days Since Admission.]

[Figure: Cox cross-validation. Partial Likelihood Deviance vs. log(Lambda); the top axis gives the number of nonzero coefficients (740 down to 6).]

[Figure: Random forest variable importance. Variable vs. Variable Importance (MeanDecreaseGini). The variables shown are per-item summary features named by source table, statistic, and item id, for example itemidlabevents_meanValue_51250, itemidlabevents_flagRate_50893, itemid_chartevents_meanValue_224057, and itemid_inputCV_meanValue_220949, plus age. The statistics that appear are meanValue, valSlope, meanDiff, flagNum, flagRate, and nMeas.]

[Figure: Random Forest Testing Performance. ROC curves (True Positive Rate vs. False Positive Rate) for the NoICD, AllICD, and NonAngusICD models.]

• Lasso logistic regression and random forest successfully identify sepsis patients both with and without ICD code features

• Cox models predict risk of mortality following sepsis with modest performance

[Flowchart: 1. Raw Data; 2. Long to wide; 3. & 4. Filter; 5. & 6. Define Training and Test sets (Labels, Features, ICD Features); models: Lasso, Random Forest, Cox.]