Use and performance of machine learning models for type 2 diabetes prediction in clinical and community care settings: Protocol for a systematic review and meta-analysis of predictive modeling studies

Digit Health. 2021 Sep 28:7:20552076211047390. doi: 10.1177/20552076211047390. eCollection 2021 Jan-Dec.

Abstract

Objective: Machine learning uses algorithms that learn patterns from data rather than following explicitly programmed instructions. Machine learning models have recently been widely applied to the prediction of type 2 diabetes. However, no evidence synthesis of the performance of these prediction models is available. We aim to identify machine learning prediction models for type 2 diabetes in clinical and community care settings and to determine their predictive performance.

Methods: A systematic review of English-language machine learning predictive modeling studies will be conducted in 12 databases. Studies predicting type 2 diabetes in predefined clinical or community settings are eligible. Data extraction will follow the CHARMS and TRIPOD guidelines. Methodological quality will be assessed using a predefined risk of bias assessment tool. The extent of validation will be categorized by Reilly-Evans levels. Primary outcomes are model performance metrics: discrimination ability, calibration, and classification accuracy. Secondary outcomes include candidate predictors, algorithms used, level of validation, and intended use of the models. A random-effects meta-analysis of c-indices will be performed to evaluate discrimination ability. The c-indices will be pooled per prediction model, per model type, and per algorithm. Publication bias will be assessed through funnel plots and regression tests. Sensitivity analyses will be conducted to estimate the effects of study quality and missing data on the primary outcomes. Sources of heterogeneity will be assessed through meta-regression. Subgroup analyses will be performed for the primary outcomes.
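
To illustrate the pooling step, the following is a minimal sketch of a random-effects meta-analysis of c-indices. The protocol does not state which estimator or transformation will be used. The DerSimonian-Laird estimator of between-study variance, the logit transformation of the c-index, and the delta-method standard errors are assumptions made for this example. The function names and the input values are hypothetical and are not data from any included study.

```python
import math


def logit(p):
    """Map a c-index in (0, 1) to the logit scale."""
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Map a logit-scale value back to the (0, 1) scale."""
    return 1.0 / (1.0 + math.exp(-x))


def pool_c_indices(c_indices, standard_errors):
    """Pool c-indices with a DerSimonian-Laird random-effects model.

    Assumption: pooling is done on the logit scale, with the variance of
    logit(c) approximated by the delta method as (SE / (c * (1 - c)))**2.
    """
    y = [logit(c) for c in c_indices]
    v = [(se / (c * (1.0 - c))) ** 2
         for c, se in zip(c_indices, standard_errors)]
    k = len(y)

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    w = [1.0 / vi for vi in v]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))

    # DerSimonian-Laird estimate of between-study variance, truncated at 0.
    scale = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / scale) if k > 1 and scale > 0 else 0.0

    # Random-effects weights add tau2 to each within-study variance.
    w_re = [1.0 / (vi + tau2) for vi in v]
    mu = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se_mu = math.sqrt(1.0 / sum(w_re))

    # I^2: share of total variability attributed to heterogeneity.
    i2 = max(0.0, (q - (k - 1)) / q) * 100.0 if q > 0 else 0.0

    return {
        "pooled_c": inv_logit(mu),
        "ci_low": inv_logit(mu - 1.96 * se_mu),
        "ci_high": inv_logit(mu + 1.96 * se_mu),
        "tau2": tau2,
        "i2_percent": i2,
    }


# Hypothetical c-indices and standard errors, for illustration only.
example = pool_c_indices([0.78, 0.84, 0.71, 0.88], [0.02, 0.03, 0.025, 0.04])
print(example)
```

The same function could be applied separately to the c-indices grouped per prediction model, per model type, and per algorithm. Pooling on the logit scale keeps the back-transformed confidence limits inside the interval (0, 1), which is the reason it was chosen for this sketch.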

Ethics and dissemination: No ethics approval is required, as no primary or personal data are collected. Findings will be disseminated through scientific sessions and peer-reviewed journals.

PROSPERO registration number: CRD42019130886.

Keywords: Type 2 diabetes; machine learning; meta-analysis; prediction models; protocol.