Using an optimized generative model to infer the progression of complications in type 2 diabetes patients

BMC Med Inform Decis Mak. 2022 Jul 1;22(1):174. doi: 10.1186/s12911-022-01915-5.

Abstract

Background: People can live for a long time with pre-diabetes or early diabetes without a formal diagnosis or management. Heterogeneity of progression, coupled with deficiencies in electronic health records (incomplete data, discrete events, and irregular event intervals), makes it challenging to identify pre-diabetes and the critical points of diabetes progression.

Methods: We used longitudinal electronic health records, collected from 2005 to 2016, of 9298 patients with type 2 diabetes or pre-diabetes from a large regional healthcare delivery network in China. We optimized a generative Markov-Bayesian-based model and used it to generate 5000 synthetic illness trajectories. The synthetic data were manually reviewed by endocrinologists.
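The abstract does not describe the model's internals, so the following Python sketch is only a toy illustration of what sampling synthetic illness trajectories from a generative Markov model can look like. It is a plain first-order, forward-only Markov chain over stages, not the authors' optimized Markov-Bayesian model. The stage names, complication names, probabilities, and visit-gap distribution are all hypothetical assumptions chosen for readability.

```python
import random

# Hypothetical toy parameters; none of these values come from the study.
STAGES = ["prediabetes", "early T2D", "advanced T2D"]
# P(advance to the next stage) after each visit; the last stage is absorbing.
ADVANCE_PROB = [0.15, 0.10, 0.0]
# P(complication is first recorded at a visit | current stage), per complication.
ONSET_PROB = {
    "retinopathy": [0.01, 0.05, 0.15],
    "nephropathy": [0.01, 0.04, 0.12],
}
# Mean of the exponential gap between visits, mimicking irregular intervals.
MEAN_GAP_DAYS = 90.0


def sample_trajectory(n_visits, rng):
    """Sample one synthetic trajectory as a list of (day, stage, new_complications)."""
    day, stage, seen = 0.0, 0, set()
    visits = []
    for _ in range(n_visits):
        # Each complication can be newly recorded at most once per patient.
        new = [c for c, p in ONSET_PROB.items()
               if c not in seen and rng.random() < p[stage]]
        seen.update(new)
        visits.append((round(day), STAGES[stage], new))
        # The stage either stays the same or moves forward (progressive disease).
        if rng.random() < ADVANCE_PROB[stage]:
            stage += 1
        day += rng.expovariate(1.0 / MEAN_GAP_DAYS)
    return visits


def sample_cohort(n_patients, n_visits, seed=0):
    """Sample a reproducible cohort of synthetic trajectories."""
    rng = random.Random(seed)
    return [sample_trajectory(n_visits, rng) for _ in range(n_patients)]


if __name__ == "__main__":
    cohort = sample_cohort(n_patients=5000, n_visits=12)
    for visit in cohort[0]:
        print(visit)
```

In a real setting the transition and onset probabilities would be learned from the health records rather than fixed by hand, and the sampled trajectories would then be passed to clinicians for review, as the Methods describe.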

Results: We built an optimized generative progression model for type 2 diabetes that uses anchor information to reduce the number of parameters learned in the third layer of the model from [Formula: see text] to [Formula: see text], where [Formula: see text] is the number of clinical findings, [Formula: see text] is the number of complications, and [Formula: see text] is the number of anchors. Based on this model, we used electronic health records to infer the relationships between progression stages, the onset of complication categories, and the associated diagnoses over the whole progression of type 2 diabetes.

Discussion: Our findings indicate that 55.3% of single complications and 31.8% of complication patterns could be predicted early and managed appropriately, so that they are potentially delayed (as type 2 diabetes is a progressive disease) or prevented (through lifestyle modifications that keep patients from developing or triggering diabetes in the first place).

Conclusions: The full type 2 diabetes patient trajectories generated by the chronic disease progression model can compensate for the lack of real-world evidence covering the desired longitudinal timeframe while facilitating population health management.

Keywords: Computer simulation; Diabetes mellitus, type 2; Disease progression model; Electronic health records; Probabilistic generative model.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bayes Theorem
  • China / epidemiology
  • Diabetes Mellitus, Type 2* / complications
  • Humans
  • Prediabetic State* / complications
  • Prediabetic State* / epidemiology