Machine learning for predicting colon cancer recurrence

Surg Oncol. 2024 Jun:54:102079. doi: 10.1016/j.suronc.2024.102079. Epub 2024 Apr 19.

Abstract

Introduction: Colorectal cancer (CRC) is among the most commonly diagnosed malignancies worldwide and a global public health concern. Despite advances in treatment, CRC recurrence remains a significant challenge that calls for better early detection and intervention. Machine learning offers a promising way to address this problem by providing data-driven insights that support personalized care.

Methods: This retrospective study analyzed data from 396 patients who underwent surgery for colon cancer (CC) between 2010 and 2021. Machine learning algorithms were used to predict CC recurrence from demographic, clinicopathological, and laboratory characteristics. Model performance was assessed with the area under the receiver operating characteristic curve (AUC), accuracy, recall, precision, and F1 score.
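The evaluation metrics named in the Methods can be illustrated with a minimal Python sketch. It is not the study's code: the labels and scores below are made-up toy values, the 0.5 decision threshold is an assumption, and the function names (`roc_auc`, `evaluate_binary_classifier`) are hypothetical helpers introduced only for this illustration.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels, where 1 means recurrence."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def evaluate_binary_classifier(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Compute AUC, accuracy, recall, precision, and F1 from predicted scores."""
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {
        "auc": roc_auc(y_true, y_score),
        "accuracy": (tp + tn) / len(y_true),
        "recall": recall,
        "precision": precision,
        "f1": f1,
    }


# Toy data (not from the study): 3 recurrences among 8 patients.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_score = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2, 0.1, 0.05]
report = evaluate_binary_classifier(y_true, y_score)
print(report)
```

AUC is computed from the ranking of scores and does not depend on the threshold, whereas accuracy, recall, precision, and F1 all change with it. This is one reason a single accuracy figure is usually reported alongside AUC when recurrences are a minority of cases.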

Results: Significant risk factors for CC recurrence included sex, carcinoembryonic antigen (CEA) level, tumor location, tumor depth, lymphatic and venous invasion, and lymph node involvement. The CatBoost Classifier achieved an AUC of 0.92 and an accuracy of 88% on the test dataset. Feature importance analysis highlighted CEA level, albumin level, N stage, weight, platelet count, height, neutrophil count, lymphocyte count, and sex as contributors to predicted recurrence risk.
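The abstract does not state which feature importance method was used. As a hedged illustration of the general idea, the sketch below uses permutation importance, a model-agnostic stand-in that is not necessarily what the authors or CatBoost computed: each feature column is shuffled in turn, and the resulting drop in accuracy is taken as that feature's importance. The data, the threshold rule standing in for a trained model, and all names here are hypothetical.

```python
import random
from typing import Callable, List, Sequence, Tuple

Row = Tuple[float, ...]


def accuracy(model: Callable[[Row], int], rows: Sequence[Row], labels: Sequence[int]) -> float:
    """Fraction of rows for which the model's prediction matches the label."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(labels)


def permutation_importance(
    model: Callable[[Row], int],
    rows: Sequence[Row],
    labels: Sequence[int],
    n_repeats: int = 20,
    seed: int = 0,
) -> List[float]:
    """Mean drop in accuracy after shuffling each feature column, one at a time."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    importances = []
    for j in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)  # break the link between feature j and the label
            shuffled = [r[:j] + (v,) + r[j + 1:] for r, v in zip(rows, column)]
            drops.append(baseline - accuracy(model, shuffled, labels))
        importances.append(sum(drops) / n_repeats)
    return importances


# Toy rows (not from the study): (cea_level, unrelated_noise).
rows = [(1.2, 0.3), (2.5, 0.9), (3.1, 0.1), (4.0, 0.7),
        (6.3, 0.2), (7.8, 0.8), (9.4, 0.5), (12.0, 0.6)]


def toy_model(row: Row) -> int:
    """Hypothetical stand-in for a trained classifier: flags rows with a high first feature."""
    return 1 if row[0] > 5.0 else 0


labels = [toy_model(r) for r in rows]  # labels constructed so the toy model is perfect
importances = permutation_importance(toy_model, rows, labels)
print(dict(zip(["cea_level", "unrelated_noise"], importances)))
```

In this toy setup the feature the model ignores gets an importance of exactly zero, while the feature it relies on gets a positive value. Importance scores of this kind describe what a fitted model uses for prediction; they do not by themselves establish that a variable is a causal risk factor.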

Discussion: These findings illustrate how machine learning can support personalized risk stratification and clinical decision-making. Early identification of patients at risk of CC recurrence could enable more effective therapeutic interventions and improve patient outcomes.

Conclusion: Machine learning has the potential to substantially improve CC recurrence prediction when combined with medical expertise. This study represents a step toward precision medicine in CC management and shows the value of data-driven insights in oncology.

Keywords: CatBoost classifier; Clinicopathological factors; Colon cancer recurrence; Machine learning; Predictive models.

MeSH terms

  • Adult
  • Aged
  • Aged, 80 and over
  • Colonic Neoplasms* / pathology
  • Colonic Neoplasms* / surgery
  • Female
  • Follow-Up Studies
  • Humans
  • Machine Learning*
  • Male
  • Middle Aged
  • Neoplasm Recurrence, Local* / pathology
  • Prognosis
  • Retrospective Studies
  • Risk Factors