Simple Linear Support Vector Machine Classifier Can Distinguish Impaired Glucose Tolerance Versus Type 2 Diabetes Using a Reduced Set of CGM-Based Glycemic Variability Indices

J Diabetes Sci Technol. 2020 Mar;14(2):297-302. doi: 10.1177/1932296819838856. Epub 2019 Mar 31.

Abstract

Background: Many glycemic variability (GV) indices exist in the literature. In previous work, we demonstrated that a set of GV indices, extracted from continuous glucose monitoring (CGM) data, can distinguish between stages of diabetes progression. We showed that 25 indices driving a logistic regression classifier can differentiate between healthy and nonhealthy individuals, whereas 37 GV indices and four individual parameters, feeding a polynomial-kernel support vector machine (SVM), can further distinguish between impaired glucose tolerance (IGT) and type 2 diabetes (T2D). The latter approach has limited interpretability (complex model, extensive index pool). In this article, we aim to match that performance with a simpler classifier and a parsimonious subset of indices.
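As a rough illustration of what a GV index is, the sketch below computes a few widely used summaries from a single CGM trace. The abstract does not list which indices were used, so the four shown here (mean, SD, coefficient of variation, time in range) are assumptions chosen for familiarity, as are the function name `basic_gv_indices` and the 70-180 mg/dL range limits.

```python
import statistics


def basic_gv_indices(glucose_mg_dl, low=70.0, high=180.0):
    """Return a few common GV summaries for one CGM trace (values in mg/dL).

    The choice of indices and the default range limits are illustrative
    assumptions, not the index set used in the study.
    """
    mean = statistics.fmean(glucose_mg_dl)
    sd = statistics.stdev(glucose_mg_dl)  # sample standard deviation
    in_range = sum(low <= g <= high for g in glucose_mg_dl)
    return {
        "mean": mean,
        "sd": sd,
        "cv_pct": 100.0 * sd / mean,  # coefficient of variation, in percent
        "tir_pct": 100.0 * in_range / len(glucose_mg_dl),  # time in range
    }


# Made-up readings, for illustration only (not data from the study).
trace = [100, 100, 200, 200]
indices = basic_gv_indices(trace)
```

Each subject's trace would yield one such dictionary, and the values would become that subject's feature vector for the classifier.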

Methods: We analyzed data from 62 subjects with IGT or T2D. We selected 17 interpretable GV indices and four parameters (age, sex, BMI, waist circumference). We trained an SVM on data from a baseline visit and tested it on the follow-up visit, comparing the results with state-of-the-art methods.
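The train-on-baseline, test-on-follow-up design can be sketched as follows. This is a minimal illustration under several assumptions: the data are synthetic (one class-shifted feature plus 20 noise features), the class coding and balance are arbitrary, and the linear SVM is fitted with a plain batch subgradient descent on the regularized hinge loss written from scratch, since the abstract does not say which solver or software was used. All function names are hypothetical.

```python
import random


def make_visit(rng, n=62, n_features=21):
    """Synthetic stand-in for one visit: feature 0 shifts with class, the rest is noise."""
    X, y = [], []
    for i in range(n):
        label = 1 if i % 2 == 0 else -1  # arbitrary coding, e.g. +1 = T2D, -1 = IGT
        row = [rng.uniform(-1.0, 1.0) for _ in range(n_features)]
        row[0] = 2.0 * label + rng.uniform(-0.5, 0.5)
        X.append(row)
        y.append(label)
    return X, y


def standardize(train, test):
    """Z-score both sets with the training-set mean and SD of each feature."""
    d = len(train[0])
    means = [sum(r[j] for r in train) / len(train) for j in range(d)]
    sds = [(sum((r[j] - means[j]) ** 2 for r in train) / len(train)) ** 0.5 or 1.0
           for j in range(d)]

    def scale(rows):
        return [[(r[j] - means[j]) / sds[j] for j in range(d)] for r in rows]

    return scale(train), scale(test)


def train_linear_svm(X, y, lam=0.01, lr=0.05, epochs=300):
    """Batch subgradient descent on the L2-regularized hinge loss; y in {-1, +1}."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [lam * wj for wj in w], 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # only margin violators contribute to the hinge subgradient
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, X):
    return [1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1 for x in X]


def run_demo(seed=0):
    rng = random.Random(seed)
    X_base, y_base = make_visit(rng)      # baseline visit: used for training
    X_follow, y_follow = make_visit(rng)  # follow-up visit: held out for testing
    X_base, X_follow = standardize(X_base, X_follow)
    w, b = train_linear_svm(X_base, y_base)
    y_pred = predict(w, b, X_follow)
    acc = sum(p == t for p, t in zip(y_pred, y_follow)) / len(y_follow)
    return w, b, acc


w, b, follow_up_accuracy = run_demo()
```

Scaling the follow-up features with baseline statistics keeps the test visit from leaking into training. In practice an established library implementation would normally replace the hand-written trainer.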

Results: The linear SVM fed by a reduced subset of 17 GV indices and four basic parameters achieved 82.3% accuracy, only marginally worse (4.8 percentage points) than the reference 87.1% (41-feature polynomial-kernel SVM). Cross-validation accuracies were comparable (69.6% vs 72.5%).
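To put the test accuracies in terms of subjects, the short calculation below converts each percentage into an implied number of correct classifications. It assumes that all 62 subjects contributed a follow-up recording, which the abstract does not state explicitly, so the counts are indicative only.

```python
def implied_correct(accuracy_pct, n_subjects=62):
    """Nearest whole number of correct classifications for a given accuracy."""
    return round(accuracy_pct / 100.0 * n_subjects)


linear_correct = implied_correct(82.3)   # 21-feature linear SVM
poly_correct = implied_correct(87.1)     # 41-feature polynomial-kernel SVM

print(linear_correct, round(100 * linear_correct / 62, 1))  # 51 82.3
print(poly_correct, round(100 * poly_correct / 62, 1))      # 54 87.1
print(poly_correct - linear_correct)                        # 3
```

Under that assumption, the gap between the two classifiers corresponds to three subjects.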

Conclusion: The proposed SVM fed by 17 GV indices and four parameters can differentiate between IGT and T2D. Using a simpler model and a parsimonious set of indices cost only a slight loss of accuracy while making the classifier considerably easier to interpret.
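The interpretability argument can be made concrete by comparing the standard decision functions of the two model families. The notation below is generic textbook notation, not taken from the article: p is the number of features (21 for the proposed model), S is the set of support vectors, and the kernel parameters gamma, c, and d are placeholders whose values the abstract does not report.

```latex
% Linear SVM: one weight w_j per feature, so the sign and size of each
% weight (on standardized inputs) can be read directly.
f_{\mathrm{lin}}(x) = w^{\top} x + b = \sum_{j=1}^{p} w_j x_j + b

% Polynomial-kernel SVM: the score mixes all features through products
% with every support vector, so no single per-feature weight exists.
f_{\mathrm{poly}}(x) = \sum_{i \in \mathcal{S}} \alpha_i y_i
    \left( \gamma \, x_i^{\top} x + c \right)^{d} + b
```

In both cases the predicted class is the sign of f(x). Only the linear form lets each GV index be assigned a single coefficient.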

Keywords: classification; continuous glucose monitoring; glycemic variability; impaired glucose tolerance; support vector machine; type 2 diabetes.

Publication types

  • Evaluation Study

MeSH terms

  • Adult
  • Aged
  • Algorithms
  • Blood Glucose / analysis
  • Blood Glucose / metabolism*
  • Blood Glucose Self-Monitoring / methods
  • Blood Glucose Self-Monitoring / statistics & numerical data
  • Data Interpretation, Statistical
  • Datasets as Topic / statistics & numerical data
  • Diabetes Mellitus, Type 2 / blood
  • Diabetes Mellitus, Type 2 / diagnosis*
  • Diagnosis, Differential
  • Female
  • Glucose Intolerance / blood
  • Glucose Intolerance / diagnosis*
  • Glycemic Control / methods
  • Glycemic Control / statistics & numerical data
  • Health Status Indicators*
  • Humans
  • Male
  • Middle Aged
  • Predictive Value of Tests
  • Reproducibility of Results
  • Support Vector Machine*

Substances

  • Blood Glucose