Study on risk factors of diabetic peripheral neuropathy and establishment of a prediction model by machine learning

BMC Med Inform Decis Mak. 2023 Aug 2;23(1):146. doi: 10.1186/s12911-023-02232-1.

Abstract

Background: Diabetic peripheral neuropathy (DPN) is a common complication of diabetes. Predicting the risk of developing DPN is important for clinical decision-making and designing clinical trials.

Methods: We retrospectively reviewed the data of 1278 patients with diabetes treated in two central hospitals from 2020 to 2022. The data included medical history, physical examination, and biochemical index test results. After feature selection and data balancing, the cohort was divided into training and internal validation datasets at a 7:3 ratio. Six machine learning models were trained: logistic regression, k-nearest neighbor, decision tree, naive Bayes, random forest, and extreme gradient boosting (XGBoost). K-fold cross-validation was used for model assessment, and accuracy, precision, recall, F1-score, and the area under the receiver operating characteristic curve (AUC) were used to evaluate the models' discrimination and clinical practicality. SHapley Additive exPlanations (SHAP) was used to interpret the best-performing model.
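The five evaluation metrics named in the Methods can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the 0.5 decision threshold, and the eight-sample toy data are assumptions made only for illustration, and AUC is computed here through its rank interpretation (the probability that a randomly chosen positive case scores higher than a randomly chosen negative case).

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 from binary labels (1 = DPN, 0 = no DPN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: true labels and predicted probabilities of DPN.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Accuracy, precision, recall, and F1 depend on the chosen threshold, whereas AUC depends only on how the scores rank the cases, which is why the two groups of metrics are reported side by side.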

Results: The XGBoost model outperformed the other models, with an accuracy of 0.746, precision of 0.765, recall of 0.711, F1-score of 0.736, and AUC of 0.813. The SHAP results indicated that age, disease duration, glycated hemoglobin, insulin resistance index, 24-h urine protein quantification, and urine protein concentration were risk factors for DPN, while the ratio between 2-h postprandial C-peptide and fasting C-peptide (C2/C0), total cholesterol, activated partial thromboplastin time, and creatinine were protective factors.
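As a quick consistency check on the reported figures, the F1-score can be recomputed as the harmonic mean of the reported precision and recall. The short Python sketch below uses only the values stated in the Results; the helper name is hypothetical. The recomputed value differs from the reported 0.736 by about 0.001, which could plausibly come from rounding or from averaging F1 across cross-validation folds, although the abstract does not say which.

```python
def f1_from(precision, recall):
    """F1 as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Values as reported for the XGBoost model in the Results.
reported = {"precision": 0.765, "recall": 0.711, "f1": 0.736}

derived = f1_from(reported["precision"], reported["recall"])
gap = abs(derived - reported["f1"])
print(f"derived F1 = {derived:.4f}, gap to reported = {gap:.4f}")
```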

Conclusions: The machine learning approach helped establish a DPN risk prediction model with good performance. The model identified the factors most closely related to DPN.

Keywords: Data analysis; Diabetes; Diabetic peripheral neuropathy; Machine learning.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bayes Theorem
  • C-Peptide
  • Diabetes Mellitus*
  • Diabetic Neuropathies* / diagnosis
  • Diabetic Neuropathies* / etiology
  • Humans
  • Machine Learning
  • Retrospective Studies
  • Risk Factors

Substances

  • C-Peptide