Representation of time-varying and time-invariant EMR data and its application in modeling outcome prediction for heart failure patients

J Biomed Inform. 2023 Jul:143:104427. doi: 10.1016/j.jbi.2023.104427. Epub 2023 Jun 18.

Abstract

Objective: To represent a patient record containing both time-invariant and time-varying features as a single vector using an end-to-end deep learning model, and to use this representation to predict the kidney failure (KF) status and mortality of heart failure (HF) patients.

Materials and methods: The time-invariant EMR data comprised demographic information and comorbidities, and the time-varying EMR data were lab tests. We used a Transformer encoder module to represent the time-invariant data. To represent the time-varying data, we used a refined long short-term memory (LSTM) network with a Transformer encoder stacked on top, which took as inputs the original measured values and their corresponding embedding vectors, masking vectors, and two types of time intervals. The resulting patient representations, combining time-invariant and time-varying data, were used to predict KF status (949 of 5268 HF patients were diagnosed with KF) and mortality (463 in-hospital deaths) for HF patients. Comparative experiments were conducted between the proposed model and several representative machine learning models. Ablation experiments were also performed on the time-varying data representation: the refined LSTM was replaced with the standard LSTM, GRU-D, and T-LSTM in turn, and the Transformer encoder and the time-varying data representation module were each removed in turn. The attention weights of the time-invariant and time-varying features were visualized to clinically interpret the predictive performance. We used the area under the receiver operating characteristic curve (AUROC), the area under the precision-recall curve (AUPRC), and the F1-score to evaluate the predictive performance of the models.
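The time-varying inputs named above (measured values, masking vectors, and two types of time intervals) can be sketched as follows. This is a minimal illustration, not the authors' implementation. The abstract does not define the two interval types, so the sketch assumes one is the time since each lab test was last observed (in the spirit of GRU-D) and the other is the time since the previous visit (in the spirit of T-LSTM). The function name `build_inputs`, the forward filling of missing values, the 0.0 placeholder before a first observation, and the toy data are all hypothetical; the embedding vectors mentioned in the abstract are omitted.

```python
from typing import List, Optional, Tuple

Row = List[Optional[float]]


def build_inputs(
    times: List[float], values: List[Row]
) -> Tuple[List[List[float]], List[List[int]], List[List[float]], List[float]]:
    """Turn irregularly sampled lab tests into model inputs.

    times[t]     : timestamp of visit t (e.g. days since admission)
    values[t][d] : measured value of lab test d at visit t, or None if missing
    """
    if not values:
        return [], [], [], []
    n_feat = len(values[0])
    # Assumption: before a test is first observed, its "last seen" time is
    # the first visit and its filled value is 0.0.
    last_seen = [times[0]] * n_feat
    last_val = [0.0] * n_feat
    filled, masks, delta_feat, delta_visit = [], [], [], []
    for t, (now, row) in enumerate(zip(times, values)):
        # Interval type 2 (assumed): time since the previous visit.
        delta_visit.append(0.0 if t == 0 else now - times[t - 1])
        x_row, m_row, d_row = [], [], []
        for d, v in enumerate(row):
            # Interval type 1 (assumed): time since test d was last observed.
            d_row.append(now - last_seen[d])
            if v is None:
                m_row.append(0)            # mask 0 = missing
                x_row.append(last_val[d])  # carry the last value forward
            else:
                m_row.append(1)            # mask 1 = observed
                x_row.append(v)
                last_seen[d], last_val[d] = now, v
        filled.append(x_row)
        masks.append(m_row)
        delta_feat.append(d_row)
    return filled, masks, delta_feat, delta_visit


# Toy record: three visits, two lab tests, some values missing.
times = [0.0, 2.0, 5.0]
values = [[1.0, None], [None, 3.0], [2.0, None]]
filled, masks, delta_feat, delta_visit = build_inputs(times, values)
```

In a model of the kind described, these four arrays would be fed to the recurrent layer at each time step, so that the network can distinguish a measured value from a carried-forward one and can weigh older observations differently from recent ones.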

Results: The proposed model achieved superior performance, with average AUROCs, AUPRCs and F1-scores of 0.960, 0.610 and 0.759 for KF prediction and 0.937, 0.353 and 0.537 for mortality prediction, respectively. Predictive performance improved with the addition of time-varying data from longer time periods. The proposed model outperformed the comparison and ablation models in both prediction tasks.
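For readers less familiar with the three reported metrics, the following sketch computes them from scratch on a hypothetical toy example. It is not the authors' evaluation code: the helper names, the toy labels and scores, and the 0.5 decision threshold for the F1-score are assumptions, and the average-precision form of AUPRC used here ignores tied scores. The sketch also computes the positive-class prevalence from the cohort counts given in the abstract, since a random classifier's AUPRC is approximately equal to that prevalence; this assumes the evaluation data had the same class balance as the full cohort.

```python
from typing import List


def auroc(y_true: List[int], scores: List[float]) -> float:
    """Probability that a random positive is scored above a random negative."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true: List[int], scores: List[float]) -> float:
    """AUPRC estimated as the mean precision at each positive's rank.

    Assumes at least one positive label and no tied scores.
    """
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    tp, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            tp += 1
            total += tp / rank
    return total / tp


def f1_score(y_true: List[int], scores: List[float], threshold: float = 0.5) -> float:
    """Harmonic mean of precision and recall at a fixed threshold."""
    pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 0)
    return 0.0 if tp == 0 else 2 * tp / (2 * tp + fp + fn)


# Hypothetical toy predictions (not data from the study).
y = [0, 0, 1, 1]
s = [0.1, 0.4, 0.35, 0.8]
toy_auroc = auroc(y, s)
toy_auprc = average_precision(y, s)
toy_f1 = f1_score(y, s)

# Positive-class prevalence from the counts reported in the abstract.
kf_prevalence = 949 / 5268
mortality_prevalence = 463 / 5268
```

Under the stated assumption, the prevalences are about 0.180 for KF and 0.088 for mortality, which gives context for the reported AUPRCs of 0.610 and 0.353: both outcomes are imbalanced, so AUPRC values well below the AUROC values are expected.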

Conclusions: Both time-invariant and time-varying EMR data of patients could be efficiently represented by the proposed unified deep learning model, which achieved higher performance than the comparison models in both clinical prediction tasks. The approach to time-varying data used in this study may be applicable to other kinds of time-varying data and to other clinical tasks.

Keywords: Long short-term memory; Outcome prediction; Patient representation; Time-varying data; Transformer.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Comorbidity
  • Heart Failure* / diagnosis
  • Humans
  • Machine Learning*
  • Patients
  • Prognosis