Serological Biomarker-Based Machine Learning Models for Predicting the Relapse of Ulcerative Colitis

J Inflamm Res. 2023 Aug 21:16:3531-3545. doi: 10.2147/JIR.S423086. eCollection 2023.

Abstract

Purpose: To explore whether machine learning models using serological markers can predict the relapse of ulcerative colitis (UC).

Patients and methods: This clinical cohort study included 292 UC patients; serological markers were obtained when patients were discharged from the hospital. Four machine learning models, namely the random forest (RF) model, the logistic regression (LR) model, the decision tree, and the neural network, were then compared for predicting the relapse of UC. A nomogram was constructed, and the performance of these models was evaluated by accuracy, sensitivity, specificity, and the area under the receiver operating characteristic curve (AUC).
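The four evaluation metrics named above can be illustrated with a minimal sketch. It assumes binary relapse labels (1 = relapse, 0 = no relapse), predicted probabilities, and a 0.5 decision threshold; the function names and the toy data are hypothetical and are not taken from the study.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def classification_metrics(y_true, y_score, threshold=0.5):
    """Accuracy, sensitivity, and specificity at an assumed threshold.

    Assumes both classes are present, so no denominator is zero.
    """
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


def roc_auc(y_true, y_score):
    """AUC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count half)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data for illustration only; not data from the study.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_score = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.1]

metrics = classification_metrics(y_true, y_score)
auc = roc_auc(y_true, y_score)
print(metrics, auc)
```

Unlike accuracy, sensitivity, and specificity, the AUC does not depend on the chosen threshold, which is one reason it is commonly reported alongside them when comparing classifiers.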

Results: Based on the patients' characteristics and serological markers, we selected the variables associated with relapse and developed an LR model. The novel model, which included gender, white blood cell count, percentage of leukomonocyte, percentage of monocyte, absolute value of neutrophilic granulocyte, and erythrocyte sedimentation rate, was established for predicting relapse. In addition, the average AUC of the four machine learning models was 0.828, of which the RF model was the best. In the test group, the AUC was 0.889, the accuracy was 76.4%, the sensitivity was 78.5%, and the specificity was 76.4%. There were 45 variables in the RF model, and the relative weight coefficients of these variables were determined. Age had the greatest impact on classification results, followed by hemoglobin concentration, white blood cell count, and platelet distribution width.

Conclusion: Machine learning models based on serological markers had high accuracy in predicting the relapse of UC. Such models can be used to noninvasively predict patient outcomes and can be an effective tool for determining personalized treatment plans.

Keywords: machine learning; random forest model; relapse; serological markers; ulcerative colitis.