Long-term prediction models for vision-threatening diabetic retinopathy using medical features from data warehouse

Sci Rep. 2022 May 19;12(1):8476. doi: 10.1038/s41598-022-12369-0.

Abstract

We sought to evaluate the performance of machine learning prediction models for identifying vision-threatening diabetic retinopathy (VTDR) in patients with type 2 diabetes mellitus using only medical data from a data warehouse. This is a multicenter electronic medical records review study. Patients with type 2 diabetes who were screened for diabetic retinopathy and followed up for 10 years were included from six referral hospitals sharing the same electronic medical record system (n = 9,102). Patient demographics, laboratory results, visual acuities (VAs), and occurrence of VTDR were collected. Prediction models for VTDR were developed using machine learning models. F1 score, accuracy, specificity, and area under the receiver operating characteristic curve (AUC) were analyzed. Machine learning models achieved F1 score, accuracy, specificity, and AUC values of up to 0.89, 0.89, 0.95, and 0.96, respectively, during training. The trained models predicted the occurrence of VTDR at 10 years with F1 score, accuracy, and specificity of up to 0.81, 0.70, and 0.66, respectively, on the test set. Important predictors included baseline VA, duration of diabetes treatment, serum levels of glycated hemoglobin and creatinine, estimated glomerular filtration rate, and blood pressure. The models could predict the long-term occurrence of VTDR with fair performance. Although the lack of funduscopic findings may be a limitation, prediction models trained using medical data can facilitate proper referral of subjects at high risk for VTDR from primary care to an ophthalmologist.
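
For readers unfamiliar with the four reported metrics, the following is a minimal sketch of how F1 score, accuracy, specificity, and AUC are conventionally computed for a binary outcome such as VTDR (1) versus no VTDR (0). It is not the authors' code: the function names `binary_metrics` and `roc_auc` are hypothetical, the labels and scores are toy values invented for illustration, and the AUC uses the standard pairwise-ranking (Mann-Whitney) formulation, which may differ from the software the study used.

```python
def binary_metrics(y_true, y_pred):
    """F1, accuracy, and specificity from binary labels (1 = event, 0 = no event)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    # Specificity: share of true negatives among all actual negatives.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # F1: harmonic mean of precision and recall, written in count form.
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    return {"f1": f1, "accuracy": accuracy, "specificity": specificity}


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    # Ties between a positive and a negative score count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (illustrative only, not from the study).
y_true = [1, 1, 0, 0]
y_pred = [1, 0, 0, 1]           # thresholded predictions
scores = [0.9, 0.4, 0.3, 0.6]   # predicted probabilities

print(binary_metrics(y_true, y_pred))
print(roc_auc(y_true, scores))
```

F1 score, accuracy, and specificity depend on the decision threshold applied to the predicted probabilities, whereas AUC summarizes ranking quality across all thresholds.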

Publication types

  • Multicenter Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Data Warehousing
  • Diabetes Mellitus, Type 2* / complications
  • Diabetes Mellitus, Type 2* / epidemiology
  • Diabetic Retinopathy* / diagnosis
  • Diabetic Retinopathy* / epidemiology
  • Glycated Hemoglobin
  • Humans
  • ROC Curve
  • Risk Factors

Substances

  • Glycated Hemoglobin A