Prediction of Vitamin D Deficiency in Older Adults: The Role of Machine Learning Models

J Clin Endocrinol Metab. 2022 Sep 28;107(10):2737-2747. doi: 10.1210/clinem/dgac432.

Abstract

Context: Conventional prediction models for vitamin D deficiency have limited accuracy.

Objective: Using cross-sectional data, we developed machine learning (ML) models to predict vitamin D deficiency and compared their performance with that of a conventional approach.

Methods: Participants were 5106 community-resident adults (50-84 years; 58% male). In the randomly sampled training set (65%), we constructed 5 ML models: lasso regression, elastic net regression, random forest, gradient boosted decision tree, and dense neural network. The reference model was a logistic regression model. Outcomes were deseasonalized serum 25-hydroxyvitamin D (25(OH)D) <50 nmol/L (yes/no) and <25 nmol/L (yes/no). In the test set (the remaining 35%), we evaluated the predictive performance of each model using measures that included the area under the receiver operating characteristic curve (AUC) and net benefit (decision curves).
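
The two performance measures named above can be illustrated with a minimal Python sketch. It is not the study's code: the function names `auc` and `net_benefit`, the toy labels and the toy predicted probabilities are all assumptions made for illustration. AUC is computed here as the proportion of case/non-case pairs in which the case receives the higher predicted probability, and net benefit uses the standard decision-curve formula TP/n - (FP/n) * pt/(1 - pt) at a probability threshold pt.

```python
from typing import Sequence


def auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the share of (case, non-case) pairs ranked correctly; ties count 0.5."""
    cases = [s for y, s in zip(y_true, y_score) if y == 1]
    non_cases = [s for y, s in zip(y_true, y_score) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for n in non_cases:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(non_cases))


def net_benefit(y_true: Sequence[int], y_score: Sequence[float],
                threshold: float) -> float:
    """Net benefit at one threshold: TP/n - (FP/n) * pt / (1 - pt)."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    # Anyone with predicted probability >= threshold is classified as deficient.
    tp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


# Hypothetical test-set labels (1 = deficient) and predicted probabilities.
toy_labels = [0, 0, 1, 0, 1, 0, 1, 0]
toy_probs = [0.05, 0.10, 0.80, 0.30, 0.60, 0.20, 0.25, 0.02]

toy_auc = auc(toy_labels, toy_probs)
# A decision curve repeats the net-benefit calculation over a grid of thresholds.
toy_curve = {pt: net_benefit(toy_labels, toy_probs, pt) for pt in (0.10, 0.25, 0.50)}
```

Plotting `toy_curve` against its thresholds for each model, alongside the "treat all" and "treat none" strategies, would give a decision curve of the kind the abstract refers to.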

Results: Overall, 1270 (25%) and 91 (2%) participants had 25(OH)D <50 and <25 nmol/L, respectively. Compared with the reference model, the ML models predicted 25(OH)D <50 nmol/L with similar accuracy. However, for prediction of 25(OH)D <25 nmol/L, all ML models had higher AUC point estimates than the reference model by up to 0.14. AUC was highest for elastic net regression (0.93; 95% CI 0.90-0.96), compared with 0.81 (95% CI 0.71-0.91) for the reference model. In the decision curve analysis, ML models mostly achieved a greater net benefit across a range of thresholds.
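
To give a sense of the baseline against which net benefit is usually judged, the sketch below computes the net benefit of a "treat all" strategy from the reported prevalence of 25(OH)D <25 nmol/L (91 of 5106). It is an illustration only: the function name and the example thresholds are assumptions, and the overall prevalence is used as a stand-in because the abstract does not report the test-set prevalence.

```python
def treat_all_net_benefit(prevalence: float, threshold: float) -> float:
    """Net benefit when everyone is classified as deficient.

    With everyone treated, TP/n equals the prevalence and FP/n equals
    1 - prevalence, so the formula reduces to
    prevalence - (1 - prevalence) * pt / (1 - pt).
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Reported overall counts for 25(OH)D <25 nmol/L; assumed to approximate the test set.
prevalence_lt25 = 91 / 5106

# Illustrative thresholds (assumed, not taken from the paper).
baseline = {pt: treat_all_net_benefit(prevalence_lt25, pt) for pt in (0.01, 0.02, 0.05)}
```

The "treat all" net benefit falls to zero when the threshold equals the prevalence and is negative above it, so with an outcome this rare a model only has to beat "treat none" (net benefit 0) at most clinically plausible thresholds.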

Conclusion: Compared with conventional models, ML models predicted 25(OH)D <50 nmol/L with similar accuracy but predicted 25(OH)D <25 nmol/L with greater accuracy. The latter finding suggests a role for ML models in participant selection for vitamin D supplement trials.

Keywords: Vitamin D; machine learning; prediction; vitamin D deficiency.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Aged
  • Cross-Sectional Studies
  • Humans
  • Logistic Models
  • Machine Learning
  • Vitamin D
  • Vitamin D Deficiency* / diagnosis
  • Vitamin D Deficiency* / epidemiology

Substances

  • Vitamin D