Interpretable disease prediction using heterogeneous patient records with self-attentive fusion encoder

J Am Med Inform Assoc. 2021 Sep 18;28(10):2155-2164. doi: 10.1093/jamia/ocab109.

Abstract

Objective: We propose an interpretable disease prediction model that efficiently fuses multiple types of patient records using a self-attentive fusion encoder. We assessed the model performance in predicting cardiovascular disease events, given the records of a general patient population.

Materials and methods: We extracted 7981 cases and 67 623 controls from the sample cohort database and nationwide healthcare claims data of South Korea. Among the information provided, our model used the sequential records of medical codes and patient characteristics, such as demographic profiles and the most recent health examination results. These two types of patient records were combined in our self-attentive fusion module, whereas previously dominant methods aggregated them using a simple concatenation. The prediction performance was compared to state-of-the-art recurrent neural network-based approaches and other widely used machine learning approaches.
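The contrast between concatenation and attention-based fusion can be illustrated with a minimal sketch. This is not the authors' architecture: it assumes that visit embeddings and the patient-characteristic vector already share one dimension, it uses plain scaled dot-product self-attention without learned projections, and the names `concat_fusion` and `attentive_fusion` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def concat_fusion(visit_vectors, patient_vector):
    """Baseline: mean-pool the visit sequence, then append the static features."""
    dim = len(visit_vectors[0])
    n = len(visit_vectors)
    mean_visit = [sum(v[i] for v in visit_vectors) / n for i in range(dim)]
    return mean_visit + list(patient_vector)


def attentive_fusion(visit_vectors, patient_vector):
    """Assumed sketch: treat the patient vector as one extra token and let
    every token attend to every other token (no learned weights here)."""
    tokens = [list(patient_vector)] + [list(v) for v in visit_vectors]
    dim = len(patient_vector)
    scale = math.sqrt(dim)
    fused = []
    for query in tokens:
        weights = softmax([dot(query, key) / scale for key in tokens])
        fused.append(
            [sum(w * key[i] for w, key in zip(weights, tokens)) for i in range(dim)]
        )
    return fused
```

In the concatenation baseline the two record types meet only in the layers that follow, whereas in the attention sketch each visit representation is already mixed with the patient characteristics.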

Results: Our model outperformed all the other compared methods in predicting cardiovascular disease events. It achieved an area under the curve of 0.839, while the other compared methods achieved between 0.741 and 0.830. Moreover, our model consistently outperformed the other methods in a more challenging setting in which we tested the model's ability to draw an inference from more nonobvious, diverse factors.
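For readers unfamiliar with the metric, the area under the (receiver operating characteristic) curve equals the probability that a randomly chosen case receives a higher predicted score than a randomly chosen control. The sketch below computes it by pairwise comparison; the function name `roc_auc` and any data passed to it are illustrative assumptions and are unrelated to the study's results.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (case, control) pairs ranked correctly.

    labels: 1 for a case, 0 for a control.
    scores: predicted risk, higher meaning more likely to be a case.
    Ties between a case and a control count as half a correct pair.
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    correct = 0.0
    for c in cases:
        for n in controls:
            if c > n:
                correct += 1.0
            elif c == n:
                correct += 0.5
    return correct / (len(cases) * len(controls))
```

The pairwise loop is quadratic in the number of patients, so a rank-based implementation would be preferable on a cohort of this size.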

Discussion: We also interpreted the attention weights provided by our model as the relative importance of each time step in the sequence. We showed that our model reveals the informative parts of the patients' history by measuring the attention weights.
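Reading attention weights as per-time-step importance can be sketched as follows. The sketch assumes the model exposes one raw attention score per time step; the softmax normalization and the helper `rank_time_steps` are illustrative assumptions, not the authors' procedure.

```python
import math


def attention_weights(scores):
    """Normalize raw attention scores into weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def rank_time_steps(scores):
    """Return (time_step, weight) pairs, most heavily attended first."""
    weights = attention_weights(scores)
    order = sorted(range(len(weights)), key=lambda t: weights[t], reverse=True)
    return [(t, weights[t]) for t in order]
```

Under this reading, the time steps at the head of the ranking are the parts of the patient history that contributed most to the fused representation.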

Conclusion: We suggest an interpretable disease prediction model that efficiently fuses heterogeneous patient records and demonstrates superior disease prediction performance.

Keywords: attention; cardiovascular disease; deep learning; disease prediction; recurrent neural network.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Attention
  • Delivery of Health Care
  • Electronic Health Records*
  • Humans
  • Machine Learning
  • Neural Networks, Computer*