Continuous-time probabilistic models for longitudinal electronic health records

J Biomed Inform. 2022 Jun:130:104084. doi: 10.1016/j.jbi.2022.104084. Epub 2022 May 7.

Abstract

Analysis of longitudinal Electronic Health Record (EHR) data is an important goal for precision medicine. Difficulty in applying Machine Learning (ML) methods, either predictive or unsupervised, stems in part from the heterogeneity and irregular sampling of EHR data. We present an unsupervised probabilistic model that captures nonlinear relationships between variables over continuous time. This method works with arbitrary sampling patterns and captures the joint probability distribution between variable measurements and the time intervals between them. Inference algorithms are derived that can be used to evaluate the likelihood of future measurements under a trained model. As an example, we consider data from the United States Veterans Health Administration (VHA) in the areas of diabetes and depression. Likelihood ratio maps are produced showing the likelihood of risk for moderate-severe vs minimal depression as measured by the Patient Health Questionnaire-9 (PHQ-9).
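
To make the idea of a likelihood ratio over measurements and time intervals concrete, the following is a minimal sketch and not the model from the paper. It assumes a toy mixture in which each component pairs a Gaussian over one measurement value with an exponential distribution over the time gap since the previous measurement. The component families, the function names, and every parameter value in MODEL_A and MODEL_B are hypothetical and are not derived from VHA data.

```python
import math

def log_gaussian(x, mean, std):
    """Log density of a univariate Gaussian."""
    return -0.5 * math.log(2 * math.pi * std * std) - (x - mean) ** 2 / (2 * std * std)

def log_exponential(dt, rate):
    """Log density of an exponential distribution over a time gap dt >= 0."""
    return math.log(rate) - rate * dt

def log_sum_exp(values):
    """Numerically stable log of a sum of exponentials."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def mixture_log_likelihood(value, dt, components):
    """Joint log likelihood of (measurement value, time gap) under a mixture.

    Each component is (weight, mean, std, rate). Assumption: within a
    component, the value and the time gap are independent.
    """
    terms = [
        math.log(w) + log_gaussian(value, mean, std) + log_exponential(dt, rate)
        for (w, mean, std, rate) in components
    ]
    return log_sum_exp(terms)

def log_likelihood_ratio(value, dt, model_a, model_b):
    """Positive when the observation is more likely under model_a."""
    return (mixture_log_likelihood(value, dt, model_a)
            - mixture_log_likelihood(value, dt, model_b))

# Hypothetical parameters for two groups; time gaps are in days.
MODEL_A = [(0.7, 8.5, 1.0, 1 / 60), (0.3, 7.0, 0.8, 1 / 120)]
MODEL_B = [(0.6, 6.0, 0.7, 1 / 180), (0.4, 7.0, 0.8, 1 / 120)]

if __name__ == "__main__":
    # Evaluate the log ratio on a small grid of values and time gaps.
    for value in (6.0, 7.5, 9.0):
        for dt in (30, 200):
            llr = log_likelihood_ratio(value, dt, MODEL_A, MODEL_B)
            print(f"value={value:4.1f}  dt={dt:3d} days  log ratio={llr:+.2f}")
```

Evaluating the log ratio over a grid of values and time gaps, as the loop does, gives a crude analogue of a likelihood ratio map: the sign indicates which of the two hypothetical groups an observation favors, and the time gap contributes to that evidence alongside the measured value.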

Keywords: Electronic health records; Mixture models; Probabilistic models; Time-dependent modeling.

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms
  • Electronic Health Records*
  • Humans
  • Machine Learning*
  • Models, Statistical
  • Probability