Prediction of Vitamin D Deficiency in Older Adults: The Role of Machine Learning Models

John D Sluyter; Yoshihiko Raita; Kohei Hasegawa; Ian R Reid; Robert Scragg; Carlos A Camargo

doi:10.1210/clinem/dgac432

J Clin Endocrinol Metab. 2022 Sep 28;107(10):2737-2747. doi: 10.1210/clinem/dgac432.

Authors

John D Sluyter¹, Yoshihiko Raita², Kohei Hasegawa², Ian R Reid³, Robert Scragg¹, Carlos A Camargo²

Affiliations

¹ School of Population Health, University of Auckland, Auckland 1023, New Zealand.
² Department of Emergency Medicine, Massachusetts General Hospital, Harvard Medical School, Boston, MA 02115, USA.
³ Department of Medicine, Faculty of Medical and Health Sciences, University of Auckland, Auckland 1023, New Zealand.

PMID: 35876536

Abstract

Context: Conventional prediction models for vitamin D deficiency have limited accuracy.

Objective: Using cross-sectional data, we developed models based on machine learning (ML) and compared their performance with that of models based on a conventional approach.

Methods: Participants were 5106 community-resident adults (50-84 years; 58% male). In the randomly sampled training set (65%), we constructed 5 ML models: lasso regression, elastic net regression, random forest, gradient boosted decision tree, and dense neural network. The reference model was a logistic regression model. Outcomes were deseasonalized serum 25-hydroxyvitamin D (25(OH)D) <50 nmol/L (yes/no) and <25 nmol/L (yes/no). In the test set (the remaining 35%), we evaluated the predictive performance of each model, using measures that included the area under the receiver operating characteristic curve (AUC) and net benefit (from decision curves).
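The split-and-evaluate design described in the Methods can be illustrated with a minimal sketch. The abstract does not give implementation details, so the function names, the seed, and the rank-based AUC calculation below are assumptions for illustration only, not the authors' code.

```python
import random


def split_train_test(n, train_frac=0.65, seed=0):
    """Randomly split indices 0..n-1 into training and test sets.

    The 65% training fraction follows the abstract; the seed and the
    rounding rule are assumptions made for this sketch.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = round(n * train_frac)
    return idx[:cut], idx[cut:]


def auc(y_true, scores):
    """AUC as the probability that a randomly chosen case (y = 1)
    receives a higher score than a randomly chosen non-case (y = 0),
    with ties counted as one half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case and one non-case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

In practice, each fitted model's predicted probabilities for the test-set participants would be passed to `auc` together with the observed deficiency indicator (for example, 25(OH)D <25 nmol/L coded as 1).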

Results: Overall, 1270 (25%) and 91 (2%) had 25(OH)D <50 and <25 nmol/L, respectively. Compared with the reference model, the ML models predicted 25(OH)D <50 nmol/L with similar accuracy. However, for prediction of 25(OH)D <25 nmol/L, all ML models had higher AUC point estimates than the reference model by up to 0.14. AUC was highest for elastic net regression (0.93; 95% CI 0.90-0.96), compared with 0.81 (95% CI 0.71-0.91) for the reference model. In the decision curve analysis, ML models mostly achieved a greater net benefit across a range of thresholds.
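For readers unfamiliar with decision curve analysis, the sketch below shows the standard definition of net benefit at a single threshold probability: the true-positive rate per participant minus the false-positive rate per participant weighted by the odds of the threshold. The function names and the toy data in the usage note are hypothetical; the abstract does not state how the authors computed their curves.

```python
def net_benefit(y_true, probs, threshold):
    """Net benefit of classifying as positive everyone whose predicted
    probability is at or above `threshold`.

    NB = TP/n - (FP/n) * threshold / (1 - threshold)
    """
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, probs) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the 'treat everyone' strategy, the usual
    comparator plotted on a decision curve."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)
```

A decision curve is obtained by evaluating `net_benefit` over a grid of thresholds for each model and plotting the results alongside `net_benefit_treat_all` and the "treat none" line at zero. With toy data such as `y_true = [1, 0, 0, 1]` and `probs = [0.9, 0.6, 0.2, 0.4]`, a threshold of 0.5 gives one true positive and one false positive, so the two terms cancel and the net benefit is zero.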

Conclusion: Compared with conventional models, ML models predicted 25(OH)D <50 nmol/L with similar accuracy but predicted 25(OH)D <25 nmol/L with greater accuracy. The latter finding suggests a role for ML models in participant selection for vitamin D supplement trials.

Keywords: Vitamin D; machine learning; prediction; vitamin D deficiency.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Aged
Cross-Sectional Studies
Humans
Logistic Models
Machine Learning
Vitamin D
Vitamin D Deficiency* / diagnosis
Vitamin D Deficiency* / epidemiology

Substances

Vitamin D