Reporting of Model Performance and Statistical Methods in Studies That Use Machine Learning to Develop Clinical Prediction Models: Protocol for a Systematic Review

Colin George Wyllie Weaver; Robert B Basmadjian; Tyler Williamson; Kerry McBrien; Tolu Sajobi; Devon Boyne; Mohamed Yusuf; Paul Everett Ronksley

doi:10.2196/30956

JMIR Res Protoc. 2022 Mar 3;11(3):e30956. doi: 10.2196/30956.

Authors

Colin George Wyllie Weaver^#¹, Robert B Basmadjian¹, Tyler Williamson¹, Kerry McBrien¹ ², Tolu Sajobi¹, Devon Boyne³, Mohamed Yusuf⁴, Paul Everett Ronksley¹

Affiliations

¹ Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada.
² Department of Family Medicine, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada.
³ Department of Oncology, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada.
⁴ Faculty of Science & Engineering, Manchester Metropolitan University, Manchester, United Kingdom.

^# Contributed equally.

PMID: 35238322
PMCID: PMC8931652
DOI: 10.2196/30956

Abstract

Background: With growing excitement about the potential benefits of machine learning and artificial intelligence in medicine, the number of published clinical prediction models that use these approaches has increased. However, there is evidence (albeit limited) that the reporting of machine learning-specific aspects in these studies is poor. Further, no reviews have assessed reporting quality for these aspects, and there are no broadly accepted reporting guidelines for them.

Objective: This paper presents the protocol for a systematic review that will assess the reporting quality of machine learning-specific aspects in studies that use machine learning to develop clinical prediction models.

Methods: We will include studies that use a supervised machine learning algorithm to develop a prediction model for use in clinical practice (ie, for diagnosis or prognosis of a condition or identification of candidates for health care interventions). We will search MEDLINE for studies published in 2019, pseudorandomly sort the records, and screen until we obtain 100 studies that meet our inclusion criteria. We will assess reporting quality with a novel checklist developed in parallel with this review, which includes content derived from existing reporting guidelines, textbooks, and consultations with experts. The checklist will cover 4 key areas where the reporting of machine learning studies is unique: modelling steps (order and data used for each step), model performance (eg, reporting the performance of each model compared), statistical methods (eg, describing the tuning approach), and presentation of models (eg, specifying the predictors that contributed to the final model).
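The sampling step described in the Methods (pseudorandomly sort the records, then screen in that order until 100 studies are included) could be sketched roughly as follows. This is an illustrative sketch only, not the authors' procedure: the function name `screen_in_random_order`, the seed value, and the `is_eligible` callback (a stand-in for a reviewer's inclusion decision) are all assumptions introduced here.

```python
import random


def screen_in_random_order(record_ids, is_eligible, target=100, seed=2019):
    """Shuffle records reproducibly, then screen in order until `target` are included.

    `is_eligible` is a hypothetical stand-in for the human screening decision;
    `seed` is an arbitrary illustrative value, not taken from the protocol.
    """
    # Sort first so the result does not depend on the order of the search export.
    order = sorted(record_ids)
    # A seeded generator makes the pseudorandom order reproducible.
    random.Random(seed).shuffle(order)

    included = []
    screened = 0
    for record_id in order:
        if len(included) >= target:
            break  # stop screening once the target sample size is reached
        screened += 1
        if is_eligible(record_id):
            included.append(record_id)
    return included, screened


if __name__ == "__main__":
    # Toy example: 1000 made-up record IDs, every third one "eligible".
    sample, n_screened = screen_in_random_order(range(1000), lambda r: r % 3 == 0)
    print(len(sample), "included after screening", n_screened, "records")
```

Fixing the seed and sorting the input before shuffling means that anyone rerunning the step on the same search results would screen the records in the same order, which is one way such a sample could be made auditable.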

Results: We completed data analysis in August 2021 and are writing the manuscript. We expect to submit the results to a peer-reviewed journal in early 2022.

Conclusions: This review will contribute to more standardized and complete reporting in the field by identifying areas where reporting is poor and can be improved.

Trial registration: PROSPERO International Prospective Register of Systematic Reviews CRD42020206167; https://www.crd.york.ac.uk/PROSPERO/display_record.php?RecordID=206167.

International Registered Report Identifier (IRRID): RR1-10.2196/30956.

Keywords: artificial intelligence; clinical prediction; clinical prediction models; digital medicine; eHealth; machine learning; modeling; prediction; research methods; research reporting; statistics.

©Colin George Wyllie Weaver, Robert B Basmadjian, Tyler Williamson, Kerry McBrien, Tolu Sajobi, Devon Boyne, Mohamed Yusuf, Paul Everett Ronksley. Originally published in JMIR Research Protocols (https://www.researchprotocols.org), 03.03.2022.