Identifying important microbial biomarkers for the diagnosis of colon cancer using a random forest approach

Heliyon. 2024 Jan 15;10(2):e24713. doi: 10.1016/j.heliyon.2024.e24713. eCollection 2024 Jan 30.

Abstract

Colon cancer is one of the most common cancers, with 30-50% of patients experiencing recurrence or metastasis within 5 years of treatment. Researchers have increasingly highlighted the influence of microbes on the malignant behavior of cancer, yet no studies have explored the relationship between colon cancer and the microbes in tumors. Here, we used tissue and blood samples from 67 colon cancer patients to identify pathogenic microorganisms associated with the diagnosis and prediction of colon cancer. Using random forest algorithms on next-generation sequencing data, we evaluated the predictive performance of each pathogenic marker individually and in combination. We constructed a database of 13,187 pathogenic microorganisms associated with human disease and identified 2 pathogenic microorganisms (Synthetic.construct_32630 and Dicrocoelium.dendriticum_57078) associated with colon cancer diagnosis. The resulting diagnostic prediction model performed well on both tumor tissue samples and blood samples. In summary, we provide, for the first time, new molecular markers for the diagnosis of colon cancer based on the expression of pathogenic microorganisms. These markers may serve as a reference for improving the effective screening rate of colon cancer in clinical practice and for improving the personalized treatment of colon cancer patients.

Keywords: Colon cancer; Next-generation sequencing; Pathogenic microorganism; Random forest approach.
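
The abstract's idea of scoring each marker alone and in combination can be illustrated with a small, self-contained sketch. It is not the authors' pipeline. The data are synthetic, the marker names (`marker_a`, `marker_b`, `noise`) are hypothetical, and the model is a simplified random-forest-style ensemble: bootstrap-trained decision stumps, each built on one randomly chosen feature. Performance is summarized as the area under the ROC curve (AUC) on held-out synthetic samples.

```python
import random

def make_samples(n, rng):
    """Synthetic abundances (assumption): column 0 is strongly informative,
    column 1 weakly informative, column 2 is pure noise."""
    rows, labels = [], []
    for i in range(n):
        y = i % 2  # hypothetical labels: 1 = tumor, 0 = control
        rows.append([rng.gauss(2.0 * y, 1.0),
                     rng.gauss(1.0 * y, 1.0),
                     rng.gauss(0.0, 1.0)])
        labels.append(y)
    return rows, labels

def fit_stump(rows, labels, feature):
    """Pick the threshold and polarity on one feature that classify the
    most training samples correctly."""
    best = None
    for t in sorted(set(r[feature] for r in rows)):
        for polarity in (1, 0):
            correct = sum(
                1 for r, y in zip(rows, labels)
                if (polarity if r[feature] > t else 1 - polarity) == y
            )
            if best is None or correct > best[0]:
                best = (correct, feature, t, polarity)
    return best[1:]  # (feature, threshold, polarity)

def fit_forest(rows, labels, features, n_trees, rng):
    """Bagging plus random feature choice: each stump sees a bootstrap
    resample and one feature drawn at random from `features`."""
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        feature = rng.choice(features)
        forest.append(fit_stump([rows[i] for i in idx],
                                [labels[i] for i in idx], feature))
    return forest

def predict_score(forest, row):
    """Fraction of stumps voting for class 1."""
    votes = sum(pol if row[f] > t else 1 - pol for f, t, pol in forest)
    return votes / len(forest)

def auc(scores, labels):
    """AUC as the probability that a positive outranks a negative,
    counting ties as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def evaluate(features, seed=0):
    """Train on one synthetic cohort, report AUC on a separate one."""
    rng = random.Random(seed)
    train_x, train_y = make_samples(100, rng)
    test_x, test_y = make_samples(200, rng)
    forest = fit_forest(train_x, train_y, features, n_trees=15, rng=rng)
    return auc([predict_score(forest, r) for r in test_x], test_y)

marker_sets = {
    "marker_a": [0],
    "marker_b": [1],
    "marker_a+marker_b": [0, 1],
    "noise": [2],
}
results = {name: evaluate(cols) for name, cols in marker_sets.items()}
for name, value in results.items():
    print(f"{name}: AUC = {value:.3f}")
```

In practice, a full random forest from an established library (for example scikit-learn's `RandomForestClassifier`) with cross-validation would be the more usual choice. The stump ensemble here only keeps the example dependency-free.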