Performance of Machine Learning Algorithms for Predicting Disease Activity in Inflammatory Bowel Disease

Inflammation. 2023 Aug;46(4):1561-1574. doi: 10.1007/s10753-023-01827-0. Epub 2023 May 12.

Abstract

This study aimed to evaluate how effectively machine learning (ML) models predict disease activity in patients with inflammatory bowel disease (IBD). A retrospective study was conducted on IBD patients admitted to the First Affiliated Hospital of Wenzhou Medical University between September 2011 and September 2019. Data were first randomly split into training and test sets at a 3:1 ratio. The least absolute shrinkage and selection operator (LASSO) algorithm was applied to reduce the number of variables. The selected variables were used to train seven ML algorithms to predict disease activity in IBD patients: random forest (RF), adaptive boosting (AdaBoost), K-nearest neighbors (KNN), support vector machine (SVM), naïve Bayes (NB), ridge regression, and eXtreme gradient boosting (XGBoost). SHapley Additive exPlanation (SHAP) analysis was performed to rank variable importance. A total of 876 participants with IBD, comprising 275 with ulcerative colitis (UC) and 601 with Crohn's disease (CD), were retrospectively enrolled. Thirty-three variables were obtained from the participants' clinical characteristics and laboratory tests. After LASSO analysis, 11 and 5 variables were selected to construct ML models for CD and UC, respectively. All seven ML models performed well in predicting disease activity in the CD and UC test sets. SVM was the most effective model for the CD group, with an area under the curve (AUC) of 0.975, sensitivity of 0.947, specificity of 0.920, and accuracy of 0.933. AdaBoost performed best for the UC group, with an AUC of 0.911, sensitivity of 0.844, specificity of 0.875, and accuracy of 0.855. ML algorithms were capable of predicting disease activity in IBD patients. Based on clinical and laboratory variables, ML algorithms show promise for guiding physicians' decision-making.
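The random 3:1 split and the reported sensitivity, specificity, and accuracy can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `split_train_test` and `binary_metrics`, the fixed seed, the rounding rule, and the 0/1 label coding (1 = active disease) are all assumptions made for illustration. AUC is left out because it requires predicted scores rather than hard labels.

```python
import random


def split_train_test(records, train_fraction=0.75, seed=0):
    """Shuffle a copy of `records` and split it into (train, test).

    train_fraction=0.75 corresponds to a 3:1 training-to-test ratio.
    The seed is an assumption; the abstract does not report one.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def binary_metrics(y_true, y_pred):
    """Return sensitivity, specificity and accuracy for 0/1 labels.

    Assumed coding: 1 = active disease, 0 = inactive disease.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    total = tp + tn + fp + fn
    return {
        # Share of truly active cases that the model flags as active.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Share of truly inactive cases that the model flags as inactive.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
        # Share of all cases classified correctly.
        "accuracy": (tp + tn) / total if total else float("nan"),
    }


if __name__ == "__main__":
    train, test = split_train_test(range(876))
    print(len(train), len(test))

    # Toy labels, not study data.
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
    print(binary_metrics(y_true, y_pred))
```

Under this rounding rule, splitting all 876 records at 3:1 would give 657 training and 219 test records. The abstract does not state whether the split was applied to the pooled cohort or separately to the CD and UC groups.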

Keywords: Crohn’s disease; disease activity; inflammatory bowel disease; machine learning; ulcerative colitis.

MeSH terms

  • Algorithms
  • Bayes Theorem
  • Colitis, Ulcerative* / diagnosis
  • Crohn Disease* / diagnosis
  • Humans
  • Inflammatory Bowel Diseases* / diagnosis
  • Retrospective Studies