Regression and Random Forest Machine Learning Have Limited Performance in Predicting Bowel Preparation in Veteran Population

Jacob E Kurlander; Akbar K Waljee; Stacy B Menees; Rachel Lipson; Alex N Kokaly; Andrew J Read; Karmel S Shehadeh; Amy Cohn; Sameer D Saini

doi:10.1007/s10620-021-07113-z

Dig Dis Sci. 2022 Jul;67(7):2827-2841. doi: 10.1007/s10620-021-07113-z. Epub 2021 Jun 24.

Authors

Jacob E Kurlander¹ ² ³, Akbar K Waljee⁴ ⁵ ⁶, Stacy B Menees⁴ ⁷, Rachel Lipson⁶, Alex N Kokaly⁸, Andrew J Read⁴ ⁵, Karmel S Shehadeh⁹, Amy Cohn¹⁰, Sameer D Saini⁴ ⁵ ⁶

Affiliations

¹ Department of Internal Medicine, University of Michigan, 3912 Taubman Center, 1500 E. Medical Center Dr., SPC 5362, Ann Arbor, MI, 48109-5362, USA. jkurland@umich.edu.
² Institute for Healthcare Policy and Innovation, University of Michigan, Ann Arbor, MI, USA. jkurland@umich.edu.
³ Veterans Affairs Ann Arbor Center for Clinical Management Research, Ann Arbor, MI, USA. jkurland@umich.edu.
⁴ Department of Internal Medicine, University of Michigan, 3912 Taubman Center, 1500 E. Medical Center Dr., SPC 5362, Ann Arbor, MI, 48109-5362, USA.
⁵ Institute for Healthcare Policy and Innovation, University of Michigan, Ann Arbor, MI, USA.
⁶ Veterans Affairs Ann Arbor Center for Clinical Management Research, Ann Arbor, MI, USA.
⁷ VA Ann Arbor Healthcare System, 2215 Fuller Road, Ann Arbor, MI, USA.
⁸ Department of Medicine, UCLA Health, 200 UCLA Medical Plaza, Suite 420, Los Angeles, 90095-1685, CA, USA.
⁹ Department of Industrial and Systems Engineering, Lehigh University, 200 West Packer Ave, Bethlehem, PA, 18015, USA.
¹⁰ Department of Industrial and Operations Engineering, University of Michigan, 2015 Beal Ave, Ann Arbor, MI, 48109-2117, USA.

PMID: 34169434
DOI: 10.1007/s10620-021-07113-z

Abstract

Background: Inadequate bowel preparation undermines the quality of colonoscopy, but patients likely to be affected are difficult to identify beforehand.

Aims: This study aimed to develop, validate, and compare prediction models for bowel preparation inadequacy using conventional logistic regression (LR) and random forest machine learning (RFML).

Methods: We created a retrospective cohort of patients who underwent outpatient colonoscopy at a single VA medical center between January 2012 and October 2015. Candidate predictor variables were chosen after a literature review. We extracted all available predictor variables from the electronic medical record, and bowel preparation from the endoscopy database. The data were split into 70% training and 30% validation sets. Multivariable LR and RFML were used to predict preparation inadequacy as a dichotomous outcome.
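The 70%/30% split described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the fixed seed, and the use of a simple random shuffle (rather than, say, a stratified or time-based split) are all assumptions, since the abstract does not say how the split was drawn.

```python
import random


def split_train_validation(records, train_percent=70, seed=0):
    """Randomly split records into training and validation lists.

    Hypothetical helper: the abstract reports only the 70/30 proportions,
    not the splitting procedure, so a plain shuffle is assumed here.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # reproducible shuffle
    # Integer arithmetic avoids floating-point rounding in the cut point.
    n_train = (len(shuffled) * train_percent) // 100
    return shuffled[:n_train], shuffled[n_train:]


# Placeholder patient identifiers standing in for the 6,885-patient cohort.
cohort_ids = list(range(6885))
train_ids, validation_ids = split_train_validation(cohort_ids)
```

Both models would then be fitted on the training list only and evaluated on the held-out validation list, so that reported performance reflects patients the models have not seen.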

Results: The cohort included 6,885 Veterans, of whom 964 (14%) had inadequate preparation. Using LR, the area under the receiver operating characteristic curve (AUC) for the validation cohort was 0.66 (95% CI 0.62, 0.69) and the Brier score, in which a lower score indicates better performance, was 0.11. Using RFML, the AUC for the validation cohort was 0.61 (95% CI 0.58, 0.65) and the Brier score was 0.12.
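To help interpret the two metrics reported in the Results, the sketch below computes the AUC and the Brier score from their standard definitions in plain Python. The function names are illustrative and this is not the authors' code. It also computes the Brier score of a model that predicts the cohort prevalence for every patient, which equals p(1 - p). That reference value uses the whole-cohort prevalence of 964/6,885, whereas the reported scores are for the validation cohort, so the comparison is approximate.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome (0/1)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auc(y_true, y_prob):
    """AUC as the probability that a randomly chosen positive case receives a
    higher predicted probability than a randomly chosen negative case
    (ties count as one half)."""
    positives = [p for y, p in zip(y_true, y_prob) if y == 1]
    negatives = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def constant_prediction_brier(prevalence):
    """Brier score of a model that predicts the prevalence for everyone."""
    return prevalence * (1 - prevalence)


# Toy example (made-up numbers, not study data).
toy_outcomes = [0, 0, 1, 1]
toy_predictions = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc(toy_outcomes, toy_predictions)
toy_brier = brier_score(toy_outcomes, toy_predictions)

# Reference value from the cohort prevalence reported in the abstract.
reference_brier = constant_prediction_brier(964 / 6885)
```

With a prevalence of about 14%, the constant-prediction Brier score is roughly 0.12, so the reported Brier scores of 0.11 and 0.12 are close to what an uninformative model would achieve. An AUC of 0.5 corresponds to chance-level discrimination.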

Conclusions: LR and RFML had similar performance in predicting bowel preparation, which was modest and likely insufficient for use in practice. Future research is needed to identify additional predictor variables and to test other machine learning algorithms. At present, endoscopy units should focus on universal strategies to enhance preparation adequacy.

Keywords: Bowel preparation; Colonoscopy; Healthcare quality; Prediction models; Random forest machine learning; Veterans health.

Publication types

Review
Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

Humans
Logistic Models
Machine Learning
Retrospective Studies
Risk Assessment
Veterans*