Longitudinal deep learning clustering of Type 2 Diabetes Mellitus trajectories using routinely collected health records

J Biomed Inform. 2022 Nov:135:104218. doi: 10.1016/j.jbi.2022.104218. Epub 2022 Oct 8.

Abstract

Type 2 diabetes mellitus (T2DM) is a highly heterogeneous chronic disease with different pathophysiological and genetic characteristics affecting its progression, associated complications and response to therapies. Advances in deep learning (DL) techniques and the availability of large amounts of healthcare data allow us to investigate T2DM characteristics and evolution with a completely new approach, studying common disease trajectories rather than cross-sectional values. We used a Kernelized-AutoEncoder algorithm to map 5 years of data from 11,028 subjects diagnosed with T2DM into a latent space that embedded similarities and differences between patients in terms of the evolution of the disease. Once we obtained the latent space, we used classical clustering algorithms to create longitudinal clusters representing different evolutions of the disease. Our unsupervised DL clustering algorithm suggested seven different longitudinal clusters. Mean age differed among the clusters (ranging from 65.3±11.6 to 72.8±9.4 years). Subjects in clusters B (Hypercholesteraemic) and E (Hypertensive) had shorter diabetes duration (9.2±3.9 and 9.5±3.9 years, respectively). Subjects in cluster G (Metabolic) had the poorest glycaemic control (mean glycated hemoglobin 7.99±1.42%), while cluster E had the best (mean glycated hemoglobin 7.04±1.11%). Obesity was observed mainly in clusters A (Neuropathic), C (Multiple Complications), F (Retinopathy) and G. A dashboard is available at dm2.b2slab.upc.edu to visualize the different trajectories corresponding to the 7 clusters.
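The two-stage idea described above (embed each patient's multi-year trajectory in a latent space, then apply a classical clustering algorithm to the embeddings) can be illustrated with a minimal sketch. This is not the authors' implementation: the `encode` function below is a hand-written stand-in (mean level and linear slope of a series) used in place of the Kernelized-AutoEncoder, the k-means routine is a plain textbook version, and the six five-year HbA1c-like series are synthetic values invented for illustration only.

```python
import random
from statistics import mean


def encode(trajectory):
    """Stand-in encoder (assumption, NOT the Kernelized-AutoEncoder):
    summarise a yearly series as (mean level, least-squares slope)."""
    n = len(trajectory)
    t_mean = (n - 1) / 2
    level = mean(trajectory)
    num = sum((t - t_mean) * (y - level) for t, y in enumerate(trajectory))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return (level, num / den)


def sq_dist(a, b):
    """Squared Euclidean distance between two latent points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iter=50, seed=0):
    """Plain k-means on latent points; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids drawn from the data
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(tuple(mean(dim) for dim in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


# Synthetic five-year HbA1c-like series (%), invented for illustration.
trajectories = [
    [7.0, 7.1, 6.9, 7.0, 7.1],  # stable
    [6.8, 6.9, 7.0, 6.9, 6.8],  # stable
    [7.2, 7.1, 7.0, 7.1, 7.2],  # stable
    [7.5, 7.9, 8.3, 8.8, 9.2],  # worsening
    [7.4, 7.8, 8.1, 8.6, 9.0],  # worsening
    [7.6, 8.0, 8.5, 8.9, 9.3],  # worsening
]

latent = [encode(t) for t in trajectories]    # stage 1: trajectories -> latent space
labels, centroids = kmeans(latent, k=2)       # stage 2: classical clustering
print(labels)
```

In this toy setting the latent space has only two hand-picked dimensions; a learned encoder would instead derive the embedding from many variables at once, which is what makes trajectory-level similarity between patients comparable across the whole record.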

Keywords: AutoEncoder; Deep learning; Diabetic complications; Electronic health records; Longitudinal cluster; Type 2 diabetes.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Cluster Analysis
  • Cross-Sectional Studies
  • Deep Learning*
  • Diabetes Mellitus, Type 2* / diagnosis
  • Diabetes Mellitus, Type 2* / epidemiology
  • Glycated Hemoglobin / analysis
  • Humans

Substances

  • Glycated Hemoglobin A