Using the Shapes of Clinical Data Trajectories to Predict Mortality in ICUs

Junchao Ma; Donald K K Lee; Michael E Perkins; Margaret A Pisani; Edieal Pinker

doi:10.1097/CCE.0000000000000010

Crit Care Explor. 2019 Apr 17;1(4):e0010. doi: 10.1097/CCE.0000000000000010. eCollection 2019 Apr.

Authors

Junchao Ma¹, Donald K K Lee², Michael E Perkins³, Margaret A Pisani⁴, Edieal Pinker¹

Affiliations

¹ School of Management, Yale University, New Haven, CT.
² Goizueta Business School, Emory University, Atlanta, GA.
³ Hartford Hospital, Hartford, CT.
⁴ Yale New Haven Hospital Pulmonary and Critical Care Medicine, New Haven, CT.

Abstract

Objectives: 1) To show how to exploit the information contained in the trajectories of time-varying patient clinical data for dynamic predictions of mortality in the ICU; and 2) to demonstrate the additional predictive value that can be achieved by incorporating this trajectory information.

Design: Observational, retrospective study of patient medical records for training and testing of statistical learning models using different sets of predictor variables.

Setting: Medical ICU at the Yale-New Haven Hospital.

Subjects: Electronic health records of 3,763 patients admitted to the medical ICU between January 2013 and January 2015.

Interventions: None.

Measurements and main results: Six-hour mortality predictions for ICU patients were generated and updated every 6 hours by applying the random forest classifier to patient time series data from the prior 24 hours. The time series were processed in different ways to create two main models: 1) manual extraction of the summary statistics used in the literature (min/max/median/first/last/number of measurements) and 2) automated extraction of trajectory features using machine learning. Out-of-sample area under the receiver operating characteristics curve and area under the precision-recall curve ("precision" refers to positive predictive value and "recall" to sensitivity) were used to evaluate the predictive performance of the two models. For 6-hour prediction and updating, the second model achieved area under the receiver operating characteristics curve and area under the precision-recall curve of 0.905 (95% CI, 0.900-0.910) and 0.381 (95% CI, 0.368-0.394), respectively. These values are statistically significantly higher than those achieved by the first model, whose area under the receiver operating characteristics curve and area under the precision-recall curve were 0.896 (95% CI, 0.892-0.900) and 0.366 (95% CI, 0.353-0.379), respectively. The superiority of the second model held true for 12-hour prediction/updating as well as for 24-hour prediction/updating.
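As an illustration of the first model's inputs, the following is a minimal sketch of how the six summary statistics might be computed from a 24-hour lookback window at each 6-hour update point. It is not the authors' code. The function names, the `(hours, value)` data layout, the half-open window convention, and the heart-rate readings are all assumptions made for this example. The random forest step and the second model's automated trajectory features are not reproduced here.

```python
from statistics import median


def summary_features(series, t_now, lookback_hours=24.0):
    """Summarize one clinical variable over the window (t_now - lookback, t_now].

    `series` is a list of (time_in_hours, value) pairs sorted by time.
    Returns None when the window contains no measurements.
    """
    window = [v for t, v in series if t_now - lookback_hours < t <= t_now]
    if not window:
        return None
    return {
        "min": min(window),
        "max": max(window),
        "median": median(window),
        "first": window[0],   # earliest measurement in the window
        "last": window[-1],   # most recent measurement in the window
        "count": len(window), # number of measurements
    }


def prediction_times(start, end, step_hours=6.0):
    """Times at which a prediction is generated or updated."""
    times = []
    t = start + step_hours
    while t <= end:
        times.append(t)
        t += step_hours
    return times


# Hypothetical heart-rate readings: (hours since ICU admission, beats per minute).
heart_rate = [(1.0, 88), (5.0, 92), (11.0, 101), (17.0, 110), (23.0, 118), (29.0, 125)]

# One feature row per update point; each row would be one classifier input.
rows = [(t, summary_features(heart_rate, t)) for t in prediction_times(0.0, 30.0)]
for t, features in rows:
    print(t, features)
```

In this made-up series the heart rate rises steadily, but the summary statistics record only the endpoints and extremes of that rise. The path taken between them is the kind of shape information the second model is meant to capture.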

Conclusions: We show that statistical learning techniques can be used to automatically extract all relevant shape features for use in predictive modeling. The approach requires no additional data and can potentially be used to improve any risk model that uses some form of trajectory information. In this single-center study, the shapes of the clinical data trajectories convey information about ICU mortality risk beyond what is already captured by the summary statistics currently used in the literature.

Keywords: hospital mortality; informatics; machine learning; prognosis; statistical models; time-dependent covariates.