Towards Reliable ARDS Clinical Decision Support: ARDS ...

10
Towards Reliable ARDS Clinical Decision Support: ARDS Patient Analytics with Free-text and Structured EMR Data Emilia Apostolova, PhD 1 , Amit Uppal, MD 2 , Jessica E. Galarraga, MD, MPH 3,4 , Ioannis Koutroulis, MD, PhD 5 , Tim Tschampel 6 , Tony Wang PhD 7 , Tom Velez, PhD, JD 6 1 Language.ai, Chicago, IL; 2 NYU School of Medicine, Bellevue Hospital Center, New York, NY; 3 MedStar Health Research Institute, Hyattsville, MD; 4 MedStar Washington Hospital Center, Georgetown University School of Medicine, Washington, DC; 5 Children’s National Health System, Washington, DC; 6 Computer Technology Associates, Ridgecrest, CA; 7 Imedacs, Ann Arbor, MI Abstract In this work, we utilize a combination of free-text and structured data to build Acute Respiratory Distress Syndrome (ARDS) prediction models and ARDS phenotype clusters. We derived ’Patient Context Vectors’ representing patient- specific contextual ARDS risk factors, utilizing deep-learning techniques on ICD and free-text clinical notes data. The Patient Context Vectors were combined with structured data from the first 24 hours of admission, such as vital signs and lab results, to build an ARDS patient prediction model and an ARDS patient mortality prediction model achieving AUC of 90.16 and 81.01 respectively. The ability of Patient Context Vectors to summarize patients’ medical history and current conditions is also demonstrated by the automatic clustering of ARDS patients into clinically meaningful phenotypes based on comorbidities, patient history, and presenting conditions. To our knowledge, this is the first study to successfully combine free-text and structured data, without any manual patient risk factor curation, to build real-time ARDS prediction models. Introduction Critical care Clinical Decision Support (CDS) systems aim at the early identification and timely treatment of rapidly progressive, life-threatening conditions. In particular, ARDS (Acute Respiratory Distress Syndrome) is a significant cause of morbidity and mortality in the USA and worldwide 1, 2 . Early recognition and evidence-based management of ARDS can limit the propagation of lung injury and significantly improve patient outcomes 3 . To date, there exists no accurate and reliable way to anticipate which patients, presenting with respiratory distress, are likely to develop ARDS. Numerous prediction scores have been developed to assess ARDS prognosis and risk of death, such as Lung Injury Score (LIS) 4 , Lung Injury Prediction Score (LIPS) 5 , APPS (Age, Plateau, PaO2/FiO2 Score) 6 , Early Acute Lung Injury (EALI) 7 , and Modified ARDS Prediction Score (MAPS) 8 . Using a consensus pro- cess, a panel of experts convened in 2011 to develop the Berlin definition, focusing on addressing a number of limita- tions of prior definitions 9 . Still, the predictive validities of these tools and definitions have proven to be moderate, for example, as measured by area under receiver operating curve (AUC). The difficulty in analyzing and predicting ARDS outcomes stems from the fact that this is both rare, and, at the same time, highly heterogeneous condition 10 . ARDS involves the interaction of multiple risk factors, past history, and current conditions, signs, and symptoms. Hospital alert systems typically rely on highly sensitive screening of structured data, such as vital signs and lab results, which, in the case of such rare conditions, are often associated with false clinical alarms resulting in “alarm fatigue” 11 . EMR data depends on what the clinician deems necessary to measure and record in the act of caring for the patient. EMR data is typically entered for the purposes of clinical documentation and billing 12 , and thus not centered around the needs of real-time surveillance-based CDS systems. The combined physician care related variables and underlying patient-related contextual factors needed for a reliable ARDS risk evaluation are typically dispersed across the patient EMR record, and available at different times throughout the patient stay. Patient demographics, past medical and visit history, chronic conditions, risk factors, current signs and symptoms can be found in diverse combinations of structured elements and clinical notes (e.g. nursing notes, radiology reports, etc.), that record diagnosis and procedure codes, vital signs, lab orders and results, ventilation parameters, etc. The challenge for real-time surveillance-based CDS systems is accommodating for the variability and the availability of real-time electronic data and enabling accurate contextual interpretation of real-time patient data. In this work, we utilize all available EMR patient information, in the form of structured data and free-text, for real- time predictive modeling. While our experiments are focused on identifying ARDS cases, the described method is

Transcript of Towards Reliable ARDS Clinical Decision Support: ARDS ...

Page 1: Towards Reliable ARDS Clinical Decision Support: ARDS ...

Towards Reliable ARDS Clinical Decision Support: ARDS Patient Analyticswith Free-text and Structured EMR Data

Emilia Apostolova, PhD1, Amit Uppal, MD2, Jessica E. Galarraga, MD, MPH3,4, IoannisKoutroulis, MD, PhD5, Tim Tschampel6, Tony Wang PhD7, Tom Velez, PhD, JD6

1Language.ai, Chicago, IL; 2NYU School of Medicine, Bellevue Hospital Center, New York,NY; 3MedStar Health Research Institute, Hyattsville, MD; 4MedStar Washington HospitalCenter, Georgetown University School of Medicine, Washington, DC; 5Children’s National

Health System, Washington, DC; 6Computer Technology Associates, Ridgecrest, CA;7Imedacs, Ann Arbor, MI

AbstractIn this work, we utilize a combination of free-text and structured data to build Acute Respiratory Distress Syndrome

(ARDS) prediction models and ARDS phenotype clusters. We derived ’Patient Context Vectors’ representing patient-specific contextual ARDS risk factors, utilizing deep-learning techniques on ICD and free-text clinical notes data. ThePatient Context Vectors were combined with structured data from the first 24 hours of admission, such as vital signsand lab results, to build an ARDS patient prediction model and an ARDS patient mortality prediction model achievingAUC of 90.16 and 81.01 respectively. The ability of Patient Context Vectors to summarize patients’ medical historyand current conditions is also demonstrated by the automatic clustering of ARDS patients into clinically meaningfulphenotypes based on comorbidities, patient history, and presenting conditions. To our knowledge, this is the firststudy to successfully combine free-text and structured data, without any manual patient risk factor curation, to buildreal-time ARDS prediction models.

IntroductionCritical care Clinical Decision Support (CDS) systems aim at the early identification and timely treatment of rapidlyprogressive, life-threatening conditions. In particular, ARDS (Acute Respiratory Distress Syndrome) is a significantcause of morbidity and mortality in the USA and worldwide1, 2. Early recognition and evidence-based management ofARDS can limit the propagation of lung injury and significantly improve patient outcomes3.

To date, there exists no accurate and reliable way to anticipate which patients, presenting with respiratory distress,are likely to develop ARDS. Numerous prediction scores have been developed to assess ARDS prognosis and riskof death, such as Lung Injury Score (LIS)4, Lung Injury Prediction Score (LIPS)5, APPS (Age, Plateau, PaO2/FiO2Score)6, Early Acute Lung Injury (EALI)7, and Modified ARDS Prediction Score (MAPS)8. Using a consensus pro-cess, a panel of experts convened in 2011 to develop the Berlin definition, focusing on addressing a number of limita-tions of prior definitions9. Still, the predictive validities of these tools and definitions have proven to be moderate, forexample, as measured by area under receiver operating curve (AUC).

The difficulty in analyzing and predicting ARDS outcomes stems from the fact that this is both rare, and, at thesame time, highly heterogeneous condition10. ARDS involves the interaction of multiple risk factors, past history,and current conditions, signs, and symptoms. Hospital alert systems typically rely on highly sensitive screening ofstructured data, such as vital signs and lab results, which, in the case of such rare conditions, are often associated withfalse clinical alarms resulting in “alarm fatigue”11.

EMR data depends on what the clinician deems necessary to measure and record in the act of caring for the patient.EMR data is typically entered for the purposes of clinical documentation and billing12, and thus not centered aroundthe needs of real-time surveillance-based CDS systems. The combined physician care related variables and underlyingpatient-related contextual factors needed for a reliable ARDS risk evaluation are typically dispersed across the patientEMR record, and available at different times throughout the patient stay. Patient demographics, past medical and visithistory, chronic conditions, risk factors, current signs and symptoms can be found in diverse combinations of structuredelements and clinical notes (e.g. nursing notes, radiology reports, etc.), that record diagnosis and procedure codes,vital signs, lab orders and results, ventilation parameters, etc. The challenge for real-time surveillance-based CDSsystems is accommodating for the variability and the availability of real-time electronic data and enabling accuratecontextual interpretation of real-time patient data.

In this work, we utilize all available EMR patient information, in the form of structured data and free-text, for real-time predictive modeling. While our experiments are focused on identifying ARDS cases, the described method is

Page 2: Towards Reliable ARDS Clinical Decision Support: ARDS ...

applicable to a variety of disease surveillance CDS use cases, needing information dispersed across the EMR patientrecord.

A second goal of this study is to utilize the combination of clinician knowledge and experience, and a data-drivenapproach to identify ARDS patients’ phenotypes and risk factors, acknowledging the need for targeted personalizedtreatments reflecting differences in treatment outcomes across patient subtypes13–15.

DatasetClinical encounter data of adult patients were extracted from the MIMIC3 Intensive Care Unit (ICU) database16.MIMIC3 consists of retrospective ICU encounter data of patients admitted into Beth Israel Deaconess Medical Centerfrom 2001 to 2012. Included ICUs are medical, surgical, trauma-surgical, coronary, cardiac surgery recovery, andmedical/surgical care units. MIMIC3 includes time series data recorded in the EMR during encounters (e.g. vitalsigns/diagnostic laboratory results, free text clinical notes, medications, procedures, etc.). The dataset contains dataassociated with over 58,000 ICU visits, including over 2 million free-text clinical notes and over 650,000 diagnosiscodes.

For this study, in accordance with previous literature17, we identified ARDS for adult patients older than 18 yearswith ICD-9 codes for severe acute respiratory failure and use of continuous invasive mechanical ventilation, excludingthose with codes for acute asthma, COPD and CHF exacerbations 1. This resulted in 4,624 ARDS cases from a totalof 48,399 adult ICU admissions. The ICU mortality rate in this population was approximately 59%, somewhat higherthan expected for ARDS18, suggesting that the algorithm used is capturing the most severe cases of ARDS, and thusintroducing some level noise with possibly containing true ARDS cases marked as negative examples.

Our ARDS predictive model utilized data in the form of free-text clinical notes, ICD codes, and structured physi-ological and ventilator data. The structured data included in this analysis consists of anion gap (aniongap), albumin,bands, bicarbonate, bilirubin, creatine, chloride, glucose, hematocrit, hemoglobin, lactate, platelet, potassium, partialthromboplastin time (ptt), international normalized ratio (inr), prothrombin time (pt), sodium, bun, white blood cellcount (wbc), heart rate (heartrate), systolic blood pressure (sysbd), diastolic blood pressure (diasbp), mean blood pres-sure (meanbp), respiratory rate (resperate), body temperature (tempc), peripheral capillary oxygen saturation (spo2),body mass index (bmi), gender, age, urine output (urine1). All variables are included as min, max, and mean valuesand are measured over the first 24 hours of ICU admission. The first 24 hour timeframe was chosen, as it has beenreported that ARDS develops at a median of 30 hours after hospital admission19. Thus, a 24-hour window providesfor the gathering of enough structured data, while at the same time is early enough for real-time CDS.

Descriptive AnalyticsARDS patient characteristics and risk factors were first gathered with the help of experienced clinicians. Expertknowledge was gathered in the form of Concept Maps (Cmaps)20. Cmaps is a tool developed at the Institute for Humanand Machine Cognition (IHMC) that enables collaborative knowledge creation, in the form of concepts, relations, andontologies, with links to external resources and publications. A snippet of the developed ARDS Cmap developed byour clinical research team is shown in Figure 1.

For example, clinicians coded ARDS precipitant causes include sepsis, aspiration, traumatic injuries, burns, anddrugs, including illicit drugs, such as cocaine, heroin, or prescription drugs, such as chemotherapeutic agents, etc.

Initially, the ARDS Cmap was used as a screening rule engine. In addition, the Cmap was used to identify riskfactors that were later used in predictive models. Although rule engines based on Cmaps are evidence-based and tendto be highly sensitive, they tend to perform with subpar specificity (e.g. the Berlin definition achieved an AUC of0.577). Such rule engines are also highly sensitive to missing data. In contrast, data-driven and Machine Learningalgorithms have the potential to improve on rule engines from training on large datasets, learning from a variety ofclinical data and response patterns, and are able to handle missing data. However, for machine learning to be effectivein predicting highly heterogeneous conditions such as ARDS, training data requires both high precision labeling andthe identification of features with adequate numbers of samples needed to separate classifications of ARDS classesand subclasses/phenotypes from non-ARDS patients.

1Inclusion ICD9 Codes: 51881, 51882, 51884, 51851, 51852, 51853, 5184, 5187, 78552, 99592, 9670, 9671, 9672; Exclusion ICD9 Codes:49391, 49392, 49322, 4280.

Page 3: Towards Reliable ARDS Clinical Decision Support: ARDS ...

Figure 1: ARDS Cmap: Clinician-coded representation of ARDS patient characteristics and risk factors.

ICD Embeddings and Patient VectorsWe then looked for a data-driven approach to provide additional insight into ARDS patient characteristics, risk factors,and phenotypes.

Intuitively, even without any additional patient EMR data, clinicians viewing properly coded patient diagnosis codes(e.g. ICD codes in a problem list) are typically able to create a mental summary of the overall patient condition, in-cluding medical history, risk factors, presenting conditions. ICD codes are used to describe both current diagnoses(e.g. Pneumonia, unspecified organism: ICD9 486 ), but also a variety of additional patient information. For example,ICD codes can describe ARDS risk factors, such as patient’s history and chronic conditions (e.g. Chronic kidney dis-ease; Personal history of malignant neoplasm; etc.); information regarding past and current treatments and procedures(e.g. Infection due to other bariatric procedure). In some cases, ICD codes contain information such as the patientage group and/or susceptibilities (e.g. Sepsis of newborn; Elderly multigravida); expected outcome (Encounter forpalliative care); patient social history (e.g. Adult emotional/psychological abuse; Cocaine dependence); the reasonfor the visit, (e.g. Railway accidents; Motor Vehicle accidents).

Using ICD codes for statistical analysis and predictive models, however, poses a series of challenges. Patient ICDcodes in EMRs tend to be sparse. There are numerous ICD codes (around 15,000 ICD9 codes and around 68,000ICD10 codes), with only a very small subset of these applicable to a particular patient (e.g. MIMIC3 admissions havean average of 11 ICD codes). ICD codes also tend to co-occur and overlap. In addition, ICD coding can be, in somecases, subjective and dependent on numerous external factors21–23.

However, the concurrence and mutual information of ICD codes over large data repositories can be utilized. Forexample, the fact that Pneumonia ICD codes are often accompanied with ICD codes describing Cough, Fever, Pleuraleffusion, etc. can be utilized to generate vector representations of ICD codes. Inspired by deep learning representation,such as word embeddings24, it has been suggested that this medical code co-occurrence can be exploited to gener-ate low-dimensional representations of ICD codes25–27 that may facilitate EMR data-based exploratory analysis andpredictive modeling28.

In this study, we utilized available MIMIC3 patient data to generate the ICD embeddings following the approach ofChoi et al.25. In our approach, we generated a low-dimensional representation of the patient history, symptoms, riskfactors, diagnosis, etc, by averaging the patient ICD code embeddings. We refer to this representation of the patient’smedical history and clinical condition as Patient Context Vectors.

To generate ARDS patients groups sharing similar characteristics and risk factors, we then clustered the ARDSPatient Context Vectors via k-means clustering29. Clinical review of the generated ARDS clusters determined theoptimal number of clusters to be 10.

Page 4: Towards Reliable ARDS Clinical Decision Support: ARDS ...

The Patient Context Vectors were able to clearly separate ARDS patient risk factors and conditions into clinicallyvalid categories, such as Malignancy or Chronic Hepatic Disease. Figure 2 shows the frequency of patients in variousclusters, sorted left-to-right according to mortality rate. Table 1 lists the 15 most representative ICD code descrip-tions (cluster centroids) for the 10 ARDS clusters. The cluster description was provided by clinicians reviewing thecorresponding cluster centroids.

Figure 2: Frequency and mortality rate of MIMIC3 ARDS patient clusters formed by clustering of averaged ICDembeddings.

Interestingly, the manually clinician-curated Cmap appears to overlap to a large extent with the automatically de-rived ARDS patient clusters. For example, clinicians listed chemotherapeutic agents as a distinct risk factor for ARDS.Our data-driven analysis showed a distinct cluster of ARDS patients with ICD codes describing malignancy. Furtheranalysis could delineate if this is a direct relationship between chemotherapeutic agents and the development of ARDS,or rather reflects an indirect relationship. For example, perhaps chemotherapy-induced immunosuppression acts as arisk factor for sepsis which, in turn, acts as a risk factor for ARDS.

Furthermore, the mortality associated with different clusters was also consistent with clinician experience. Clini-cians recognize that patients with advanced malignancy may develop severe infections and ARDS as a final commonpathway in their advanced disease. In this context, ARDS may represent a manifestation of their advanced underlyingdisease, and therapies directed at ARDS and its precipitant may not significantly impact mortality. Indeed, the clusterof malignancy-associated ARDS had a very high mortality rate of 90%.

In contrast, patients who develop ARDS as a result of trauma typically were well enough to be engaged in theactivity leading to trauma, and thus ARDS may truly be the primary disease as opposed to a symptom of advancedunderlying disease. The likelihood of mortality in this group may be more significantly influenced by the treatmentstargeted at ARDS. In our analysis, the cluster of trauma associated-ARDS had a significantly lower mortality rate of30% compared to other clusters.

Predictive AnalyticsClustering and manual evaluation by clinicians of Patient Context Vectors proved to be a useful tool in summarizingthe overall patient condition, risk factors and history. Real-time CDS systems, however, might not have access to thefull set of the patient ICD codes as they might be entered in the EMR system at a later stage. It has been observedthat clinical notes, specifically nursing and physician notes, typically contain all of the information available fromICD codes. Furthermore, while past medical history and presenting conditions might not always be ICD-coded, theyare typically available in the form of free-text notes. In previous work30, we have successfully utilized free-text forpredicting Patient Context Vectors, in the case of missing ICD codes. Additionally, we were able to successfullycombine information available in ICD codes and in nursing notes and produce an average Patient Context Vector.

A word-level Convolutional Neural Network (CNN) was trained to predict from free-text notes the Patient ContextVector (averaged ICD code embedding). Structured data, in the form of vital signs and lab results, was then combinedwith the predicted Patient Context Vector and utilized in a machine learning model trained to predict ARDS patients.

Page 5: Towards Reliable ARDS Clinical Decision Support: ARDS ...

Malignancy Chronic Hepatic DiseaseMalignant neoplasm of liver, secondary Other and unspecified coagulation defectsSecondary malignant neoplasm of bone and bone marrow Other ascitesAnemia in neoplastic disease Hepatic encephalopathySecondary malignant neoplasm of lung Thrombocytopenia, unspecifiedSecondary malignant neoplasm of other specified sites Portal hypertensionNeoplasm related pain (acute) (chronic) Alcoholic cirrhosis of liverPersonal history of irradiation, presenting hazards to health Hepatorenal syndromeSecondary and unspecified malignant neoplasm of intrathoracic lymph nodes Acquired coagulation factor deficiencyHypercalcemia Spontaneous bacterial peritonitisPersonal history of antineoplastic chemotherapy Acute and subacute necrosis of liverEncounter for palliative care Esophageal varices in diseases classified elsewhere, without mention of bleedingAntineoplastic and immunosuppressive drugs causing adverse effects in therapeutic use Cirrhosis of liver without mention of alcoholSecondary malignant neoplasm of pleura Other sequelae of chronic liver diseaseMalignant pleural effusion Other shock without mention of traumaSecondary malignant neoplasm of brain and spinal cord Portal vein thrombosisNosocomial Complications Vascular Disease ComplicationsUrinary tract infection, site not specified Acute kidney failure, unspecifiedIntestinal infection due to Clostridium difficile Congestive heart failure, unspecifiedPressure ulcer, lower back HyperpotassemiaAcute respiratory failure Anemia in chronic kidney diseaseHyperosmolality and/or hypernatremia Hypertensive chronic kidney disease, unspecified, with chronic kidney disease

stage V or end stage renal diseasePneumonitis due to inhalation of food or vomitus Candidiasis of other urogenital sitesAnemia of other chronic disease End stage renal diseaseCandidiasis of other urogenital sites Intestinal infection due to Clostridium difficilePneumonia, organism unspecified Long-term (current) use of insulinPseudomonas infection in conditions classified elsewhere and of unspecified site Below knee amputation statusAcute and chronic respiratory failure Chronic kidney disease, unspecifiedPressure ulcer, stage II Severe sepsisAlkalosis Acute kidney failure with lesion of tubular necrosisMixed acid-base balance disorder Pressure ulcer, lower backPressure ulcer, buttock Diabetes with renal manifestations, type II or unspecified type,

not stated as uncontrolledMulti-organ Failure Neurologic ComorbiditiesSevere sepsis Encephalopathy, unspecifiedSeptic shock Intracerebral hemorrhageAcute respiratory failure Cerebral artery occlusion, unspecified with cerebral infarctionUnspecified septicemia Dysphagia, unspecifiedAcute kidney failure with lesion of tubular necrosis Hemiplegia, unspecified, affecting unspecified sideAcidosis Obstructive hydrocephalusThrombocytopenia Cerebral edemaOther and unspecified coagulation defects Subdural hemorrhageStreptococcal septicemia Compression of brainIntestinal infection due to Clostridium difficile Hyperosmolality and/or hypernatremiaHyposmolality and/or hyponatremia Other encephalopathyCandidiasis of other urogenital sites Cerebral embolism with cerebral infarctionDefibrination syndrome Other disorders of neurohypophysisCandidiasis of mouth Grand mal statusAcute kidney failure Other convulsionsSurgical Source Control Complex ComorbiditiesParalytic ileus Anemia, unspecifiedUnspecified pleural effusion Unspecified acquired hypothyroidismUnspecified septicemia Diabetes mellitus without mention of complication, type II or unspecified type,

not stated as uncontrolledAcute kidney failure with lesion of tubular necrosis Chronic airway obstruction, not elsewhere classifiedAcute vascular insufficiency of intestine Atrial flutterIntestinal infection due to Clostridium difficile Mixed acid-base balance disorderPerforation of intestine Atrial fibrillationStreptococcal septicemia Acute kidney failure, unspecifiedOther postoperative infection Pneumonia, organism unspecifiedDisseminated candidiasis Delirium due to conditions classified elsewherePeritoneal abscess Do not resuscitate statusAcidosis Chronic kidney disease, unspecifiedCandidiasis of other urogenital sites Other and unspecified hyperlipidemiaOther suppurative peritonitis Personal history of tobacco useUnspecified intestinal obstruction HypoxemiaTrauma/Fx Toxic IngestionClosed fracture of dorsal [thoracic] vertebra without mention of spinal cord injury Bipolar disorder,unspecifiedTraumatic pneumothorax without mention of open wound into thorax Poisoning by tricyclic antidepressantsContusion of lung without mention of open wound into thorax Alcohol withdrawalOpen wound of scalp, without mention of complication Poisoning by benzodiazepine-based tranquilizersClosed fracture of lumbar vertebra without mention of spinal cord injury Poisoning by other antipsychotics, neuroleptics, and major tranquilizersOther closed skull fracture with cerebral laceration and contusion, with loss of Lack of housingconsciousness of unspecified durationOpen wound of forehead, without mention of complication RhabdomyolysisNontraffic accident involving other off-road motor vehicle injuring motorcyclist Other, mixed, or unspecified drug abuse. unspecifiedAccidental fall on or from other stairs or steps Poisoning by aromatic analgesics, not elsewhere classifiedMotor vehicle traffic accident of unspecified nature injuring passenger in motor vehicle Suicide and self-inflicted poisoning by analgesics, antipyretics, and antirheumaticsInjury by unspecified means, undetermined whether accidentally or purposely inflicted Substance abuse in familyTraumatic shock Suicide and self-inflicted poisoning by tranquilizers and other psychotropic agentsClosed fracture of clavicle, unspecified part Toxic effect of ethyl alcoholClosed fracture of seventh cervical vertebra Posttraumatic stress disorderClosed fracture of scapula, unspecified part Other and unspecified alcohol dependence, continuous

Table 1: The top 15 most representative ICD codes for various clusters, based on cosine similarity to the clustercentroid.

Figure 3 summarizes the system workflow during prediction time.The ARDS prediction model was trained utilizing structured data from the first 24 hours of admission, as described

above in the Dataset Section. In addition, the first half of the available patient nursing notes and the first half ofentered ICD codes were used to produce Patient Context Vectors of size 50. The patient ICD codes were used to look

Page 6: Towards Reliable ARDS Clinical Decision Support: ARDS ...

Figure 3: Real-time ARDS prediction workflow. Nursing notes available at prediction time are used to predict PatientContext Vectors. ICD codes available at prediction time are also converted to Patient Context Vectors by averaging ICDcode embeddings. Patient Context Vectors are used together with structured EMR data to predict the patient ARDSstatus.

ARDS PredictionFeatures AUC P R F1Baseline 79.74 29.77 67.80 37.17Baseline + first 90.16 49.46 78.85 48.63half of notes/ICDARDS Mortality PredictionFeatures AUC P R F1Baseline 78.26 69.19 92.60 79.20Baseline + first 82.11 73.66 89.99 81.01half of notes/ICD

Table 2: 10-fold cross-validation GBM results of predicting ARDS patients and predicting mortality among ARDSpatients. P=Precision, R=Recall, F1= F1-score for the positive class. The Baseline set of features consists of vitalsigns, lab results, Glasgow Coma Scale score, gender and age, in the form of structured data. ”Baseline + first halfof notes/ICD” includes also the average of the first half of entered visit ICD codes embeddings, and Patient ContextVectors predicted from the first half of the visit nursing notes.

up the corresponding ICD code embedding and averaged. Each nursing note was used to predict the Patient ContextVector via the trained word-level CNN. All Patient Context Vectors were then averaged and used in a addition to thestructured MIMIC3 data. As the MIMIC3 dataset ICD codes lack timestamps, we were unable to identify ICD codesavailable during the first 24 hours of admission. The MIMIC3 ICD Codes, however, are ordered, and we used the firsthalf of the ICD codes, and the first half of notes, as an approximation of data available early in the patient stay.

A Gradient Boosting Machine (GBM) model31, 32 was used to predict ARDS patients from the total population ofadult patients. A GBM model was also used to predict the mortality among all ARDS patients. Table 2 shows theresult from the experiments. All results were produced via 10-fold cross validation.

In both prediction models, the inclusion of Patient Context Vectors significantly increased the overall model perfor-mance. Intuitively, the patient medical history and overall clinical condition are important predictive factors. Resultsdemonstrate that Patient Context Vectors can be successfully utilized to represent additional knowledge of a patient’scondition. The importance of the Patient Context Vector is also demonstrated by the scaled GBM variable importanceshown in Figure 4. Together with critical vital signs measurements, various Patient Context Vector dimensions (shownwith prefix embedding) play an important role in predicting the patient’s ARDS outcome. A known limitation of

Page 7: Towards Reliable ARDS Clinical Decision Support: ARDS ...

(a) ARDS patient prediction model. (b) ARDS patient mortality prediction model.Figure 4: GBM scaled variable importance of Baseline prediction model features plus Patient Context Vectors fromfirst half of ICD codes/notes.

low-dimensionality representations is the lack of interpretably of individual dimensions (e.g. embedding 39 lacks theinterpretability of systolic blood pressure or temperature). Future work will focus on interpretability-imparted patientcontext vector embeddings, amenable to clinical interpretation.

In terms of practical application, the proposed system can be used in addition to existing high recall/low precisionhospital alert systems, and used to prioritize alerts, mitigating the effects of alert fatigue. Furthermore, the imperfectand noisy nature of the automatically created dataset is likely resulting in over-pessimistic evaluation. It has beenshown that ML classification algorithms are able to achieve high performance at relatively high levels of noise and thatML models generated from noisy datasets perform significantly better when evaluated on clean test sets (10 to 30%classification accuracy improvement at high training set noise levels)33–35. Future work will focus on creating a clean,clinician-reviewed ARDS test dataset for more precise evaluation of the proposed approach.

Related WorkNumerous prediction scores have been developed to assess ARDS prognosis. Gajic et al.5 developed a Lung InjuryPrediction Score (LIPS) formula including predisposing conditions, such as sepsis, shock, pneumonia, alcohol abuse,chemotherapy, FIO2 and respiratory rate measures, found useful in predicting ARDS and mortality in surgical criticalcare patients36. Interestingly, over 70% of score points associated with LIPS score calculation are based on patientcontext data rather than vials, labs, symptoms.

Other tools, such as Villar et al.6 base their ARDS prediction score on Age, PaO2/FIO2, and Plateau pressure score.Levitt et al.7 developed an Early Acute Lung Injury (EALI) score including risk factors, respiratory rate, and oxygenrequirement. Xie et al.8 developed a modified ARDS prediction score (MAPS) based on a hand-crafted set of riskfactors, risk modifiers, vital signs, etc.

A number of studies focus on predicting mortality among ARDS patients. For example, similar to our clusterfindings, Hyers37 reports that patients who develop bacterial sepsis and multiple organ dysfunction are at high risk ofdying and patients who develop ARDS from trauma or other noninfectious causes have a better prognosis. Navarrete-Navarro et al.38 report that mortality among ARDS patients correlates with the PaO2/FIO2 ratio on the 3rd day ofARDS, the APACHE III score, and the development of multiple system organ failure. Timmons et al.39 studiedchildren with ARDS and found significant differences between survivors and nonsurvivors based on intrapulmonaryvenous admixture, mean airway pressure, alveolar-arterial oxygen tension difference, oxygenation index, and peakinspiratory pressure. Villar et al.40 built an ARDS mortality prediction model based on tertiles of patient age, plateauairway pressure, and PaO2/FIO2 at the time the patient meets ARDS criteria. Similarly, Spicer et al.41 studied pediatricARDS patients and determined that oxygenation index and hematopoietic stem cell transplant / cancer history can beused on Day 1 or Day 3 of ARDS to predict hospital mortality without the need for more complex models.

Unlike the current work, the described previous studies utilized only structured patient data, and ICD codes/riskfactors, when used, consisted of manually crafted lists.

In a broader context, a large volume of literature on combining structured and free-text EMR data apply MedicalConcept detection on the free-text notes for manually curated list of risk factors and other disease-relevant medical

Page 8: Towards Reliable ARDS Clinical Decision Support: ARDS ...

concepts. Ford et al.42 present a review of various approaches to Medical Concept detection from free-text notes forthe purpose of detecting cases of a clinical condition, often in conjunction with structured data.

More recently, deep learning has been used to utilize free-text and structured EMR data. Shickel et al.43 presenta survey of various deep learning techniques. Miotto et al.44 build a Deep Patient representation in an unsupervisedmanner via denoising autoencoders, however, similar to previous approaches they first pre-process the free-text notesby extracting medical concepts with an off-the-shelf tool. Various studies25–27, 45 use deep learning techniques togenerate low-dimensional representations of diagnosis codes and patients utilizing structured data (diagnosis codes,medications, and procedures). Unlike previous work, we combine free-text and structured EMR data for obtaininglow-dimensional patient representations, without the use medical concept detection.

ConclusionThis work demonstrates the utility of deep learning techniques to summarize a patient’s medical history, risk factors,comorbidities, and current signs and symptoms in the form of Patient Context Vectors. Automatically generated ARDSpatients clusters agree with manually curated clinician knowledge and provide additional insight into the complexi-ties and risk factors associated with ARDS. More importantly, Patient Context Vectors, derived from available ICDcodes and nursing notes, can be easily combined with structured EMR data to build real-time ARDS CDS tools, withpotential to improve patient outcomes and reduce mortality among ARDS patients.

AcknowledgementsResearch reported in this publication was supported by a NIH SBIR award to CTA by NIH National Heart, Lung, andBlood Institute, of the National Institutes of Health under award number 1R43HL135909-01A1.

References1. Maca J, Jor O, Holub M, Sklienka P, Bursa F, Burda M, et al. Past and present ARDS mortality rates: a systematic

review. Respiratory care. 2017;62(1):113–122.

2. Bellani G, Laffey JG, Pham T, Fan E, Brochard L, Esteban A, et al. Epidemiology, patterns of care, andmortality for patients with acute respiratory distress syndrome in intensive care units in 50 countries. Jama.2016;315(8):788–800.

3. Fan E, Del Sorbo L, Goligher EC, Hodgson CL, Munshi L, Walkey AJ, et al. An official American Thoracic So-ciety/European Society of Intensive Care Medicine/Society of Critical Care Medicine clinical practice guideline:mechanical ventilation in adult patients with acute respiratory distress syndrome. American journal of respiratoryand critical care medicine. 2017;195(9):1253–1263.

4. Murray JF, Matthay MA, Luce JM, Flick MR, et al. An expanded definition of the adult respiratory distresssyndrome. Am Rev Respir Dis. 1988;138(3):720–723.

5. Gajic O, Dabbagh O, Park PK, Adesanya A, Chang SY, Hou P, et al. Early identification of patients at risk ofacute lung injury: evaluation of lung injury prediction score in a multicenter cohort study. American journal ofrespiratory and critical care medicine. 2011;183(4):462–470.

6. Villar J, Ambros A, Soler JA, Martınez D, Ferrando C, Solano R, et al. Age, PaO2/FIO2, and Plateau pressurescore: a proposal for a Simple Outcome score in patients with the Acute Respiratory distress syndrome. Criticalcare medicine. 2016;44(7):1361–1369.

7. Levitt JE, Calfee CS, Goldstein BA, Vojnik R, Matthay MA. Early acute lung injury: criteria for identifying lunginjury prior to the need for positive pressure ventilation. Critical care medicine. 2013;41(8):1929.

8. Xie J, Liu L, Yang Y, Yu W, Li M, Yu K, et al. A modified acute respiratory distress syndrome prediction score:a multicenter cohort study in China. Journal of thoracic disease. 2018;10(10):5764.

9. Force ADT, Ranieri V, Rubenfeld G, et al. Acute respiratory distress syndrome. Jama. 2012;307(23):2526–2533.

10. Shaver CM, Bastarache JA. Clinical and biological heterogeneity in acute respiratory distress syndrome: directversus indirect lung injury. Clinics in chest medicine. 2014;35(4):639–653.

Page 9: Towards Reliable ARDS Clinical Decision Support: ARDS ...

11. Sendelbach S, Funk M. Alarm fatigue: a patient safety concern. AACN advanced critical care. 2013;24(4):378–386.

12. Wasserman RC. Electronic medical records (EMRs), epidemiology, and epistemology: reflections on EMRs andfuture pediatric clinical research. Academic pediatrics. 2011;11(4):280–287.

13. Calfee CS, Delucchi K, Parsons PE, Thompson BT, Ware LB, Matthay MA, et al. Subphenotypes in acuterespiratory distress syndrome: latent class analysis of data from two randomised controlled trials. The LancetRespiratory Medicine. 2014;2(8):611–620.

14. Sinha P, Delucchi KL, Thompson BT, McAuley DF, Matthay MA, Calfee CS, et al. Latent class analysis of ARDSsubphenotypes: a secondary analysis of the statins for acutely injured lungs from sepsis (SAILS) study. Intensivecare medicine. 2018;44(11):1859–1869.

15. Zhang Z. Identification of three classes of acute respiratory distress syndrome using latent class analysis. PeerJ.2018;6:e4592.

16. Johnson AE, Pollard TJ, Shen L, Li-wei HL, Feng M, Ghassemi M, et al. MIMIC-III, a freely accessible criticalcare database. Scientific data. 2016;3:160035.

17. Bime C, Poongkunran C, Borgstrom M, Natt B, Desai H, Parthasarathy S, et al. Racial Differences in Mortalityfrom Severe Acute Respiratory Failure in the United States, 2008–2012. Annals of the American Thoracic Society.2016;13(12):2184–2189.

18. Rawal G, Yadav S, Kumar R. Acute respiratory distress syndrome: An update and review. Journal of TranslationalInternal Medicine. 2016;.

19. Shari G, Kojicic M, Li G, Cartin-Ceba R, Alvarez CT, Kashyap R, et al. Timing of the onset of acute respiratorydistress syndrome: a population-based study. Respiratory care. 2011;56(5):576–582.

20. Canas AJ, Hill G, Carff R, Suri N, Lott J, Gomez G, et al. CmapTools: A knowledge modeling and sharingenvironment. 2004;.

21. O’malley KJ, Cook KF, Price MD, Wildes KR, Hurdle JF, Ashton CM. Measuring diagnoses: ICD code accuracy.Health services research. 2005;40(5p2):1620–1639.

22. Stausberg J, Lehmann N, Kaczmarek D, Stein M. Reliability of diagnoses coding with ICD-10. Internationaljournal of medical informatics. 2008;77(1):50–57.

23. Burles K, Innes G, Senior K, Lang E, McRae A. Limitations of pulmonary embolism ICD-10 codes in emergencydepartment administrative data: let the buyer beware. BMC medical research methodology. 2017;17(1):89.

24. Mikolov T, Chen K, Corrado G, Dean J. Efficient estimation of word representations in vector space. arXivpreprint arXiv:13013781. 2013;.

25. Choi Y, Chiu CYI, Sontag D. Learning low-dimensional representations of medical concepts. AMIA Summits onTranslational Science Proceedings. 2016;2016:41.

26. Choi E, Bahadori MT, Searles E, Coffey C, Thompson M, Bost J, et al. Multi-layer representation learning formedical concepts. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discoveryand Data Mining. ACM; 2016. p. 1495–1504.

27. Kartchner D, Christensen T, Humpherys J, Wade S. Code2Vec: Embedding and Clustering Medical DiagnosisData. In: Healthcare Informatics (ICHI), 2017 IEEE International Conference on. IEEE; 2017. p. 386–390.

28. Bai T, Chanda AK, Egleston BL, Vucetic S. EHR phenotyping via jointly embedding medical concepts and wordsinto a unified vector space. BMC medical informatics and decision making. 2018;18(4):123.

Page 10: Towards Reliable ARDS Clinical Decision Support: ARDS ...

29. MacQueen J, et al. Some methods for classification and analysis of multivariate observations. In: Proceedingsof the fifth Berkeley symposium on mathematical statistics and probability. vol. 1. Oakland, CA, USA; 1967. p.281–297.

30. Apostolova E, Wang T, Tschampel T, Koutroulis I, Velez T. Combining Structured and Free-text ElectronicMedical Record Data for Real-time Clinical Decision Support. To appear in BioNLP 2019;.

31. Friedman JH. Greedy function approximation: a gradient boosting machine. Annals of statistics. 2001;p. 1189–1232.

32. h2o.ai;. Accessed: 2019-01-30. https://www.h2o.ai/.

33. Frenay B, Verleysen M. Classification in the presence of label noise: a survey. IEEE transactions on neuralnetworks and learning systems. 2013;25(5):845–869.

34. Rolnick D, Veit A, Belongie S, Shavit N. Deep learning is robust to massive label noise. arXiv preprintarXiv:170510694. 2017;.

35. Kreek RA, Apostolova E. Training and Prediction Data Discrepancies: Challenges of Text Classification withNoisy, Historical Data. In: Proceedings of the 2018 EMNLP Workshop W-NUT: The 4th Workshop on NoisyUser-generated Text; 2018. p. 104–109.

36. Bauman ZM, Gassner MY, Coughlin MA, Mahan M, Watras J. Lung injury prediction score is useful in predictingacute respiratory distress syndrome and mortality in surgical critical care patients. Critical care research andpractice. 2015;2015.

37. Hyers T. Prediction of survival and mortality in patients with adult respiratory distress syndrome. New horizons(Baltimore, Md). 1993;1(4):466–470.

38. Navarrete-Navarro P, Ruiz-Bailen M, Rivera-Fernandez R, Guerrero-Lopez F, Pola-Gallego-de Guzman MD,Vazquez-Mata G. Acute respiratory distress syndrome in trauma patients: ICU mortality and prediction factors.Intensive care medicine. 2000;26(11):1624–1629.

39. Timmons OD, Dean JM, Vernon DD. Mortality rates and prognostic variables in children with adult respiratorydistress syndrome. The Journal of pediatrics. 1991;119(6):896–899.

40. Villar J, Perez-Mendez L, Basaldua S, Blanco J, Aguilar G, Toral D, et al. A risk tertiles model for predictingmortality in patients with acute respiratory distress syndrome: age, plateau pressure, and PaO2/FiO2 at ARDSonset can predict mortality. Respiratory care. 2011;56(4):420–428.

41. Spicer AC, Calfee CS, Zinter MS, Khemani RG, Lo VP, Alkhouli MF, et al. A Simple and Robust BedsideModel for Mortality Risk in Pediatric Patients with ARDS. Pediatric critical care medicine: a journal of theSociety of Critical Care Medicine and the World Federation of Pediatric Intensive and Critical Care Societies.2016;17(10):907.

42. Ford E, Carroll JA, Smith HE, Scott D, Cassell JA. Extracting information from the text of electronic medicalrecords to improve case detection: a systematic review. Journal of the American Medical Informatics Association.2016;23(5):1007–1015.

43. Shickel B, Tighe PJ, Bihorac A, Rashidi P. Deep EHR: a survey of recent advances in deep learning techniques forelectronic health record (EHR) analysis. IEEE journal of biomedical and health informatics. 2018;22(5):1589–1604.

44. Miotto R, Li L, Kidd BA, Dudley JT. Deep patient: an unsupervised representation to predict the future of patientsfrom the electronic health records. Scientific reports. 2016;6:26094.

45. Choi E, Xiao C, Stewart W, Sun J. Mime: Multilevel medical embedding of electronic health records for predictivehealthcare. In: Advances in Neural Information Processing Systems; 2018. p. 4547–4557.