Integrative analysis from multi-center studies identifies a consensus machine learning-derived lncRNA signature for stage II/III colorectal cancer

EBioMedicine. 2022 Jan:75:103750. doi: 10.1016/j.ebiom.2021.103750. Epub 2021 Dec 15.

Abstract

Background: Long non-coding RNAs (lncRNAs) have recently emerged as essential biomarkers of cancer progression. However, few studies have examined lncRNAs associated with recurrence and fluorouracil-based adjuvant chemotherapy (ACT) in stage II/III colorectal cancer (CRC).

Methods: A total of 1640 stage II/III CRC patients were enrolled from 15 independent datasets and a clinical in-house cohort. Ten prevalent machine learning algorithms were collected and combined into 76 combinations. In addition, 109 published transcriptome signatures were retrieved. A qRT-PCR assay was performed to verify our model.

Findings: We identified 27 lncRNAs that were stably associated with recurrence across multi-center cohorts. Based on these lncRNAs, a consensus machine learning-derived lncRNA signature (CMDLncS) that exhibited the best power for predicting recurrence risk was determined from the 76 algorithm combinations. A high CMDLncS indicated unfavorable recurrence and mortality rates. CMDLncS not only worked independently of common clinical traits (e.g., AJCC stage) and molecular features (e.g., microsatellite state, KRAS mutation), but also performed markedly better than these variables. qRT-PCR results from 173 patients further verified our in-silico findings and assessed the feasibility of the signature in different centers. Comparisons of CMDLncS with 109 published transcriptome signatures further demonstrated its predictive superiority. Additionally, patients with high CMDLncS benefited more from fluorouracil-based ACT and were characterized by activation of stroma and epithelial-mesenchymal transition, while patients with low CMDLncS appeared sensitive to bevacizumab and displayed enhanced immune activation.

Interpretation: CMDLncS provides an attractive platform for identifying patients at high risk of recurrence and could optimize precision treatment to improve clinical outcomes in stage II/III CRC.

Funding: This study was supported by the National Natural Science Foundation of China (81972663); Henan Province Young and Middle-Aged Health Science and Technology Innovation Talent Project (YXKC2020037); and Henan Provincial Health Commission Joint Youth Project (SB201902014).

Keywords: Chemotherapy; LncRNA; Machine learning; Recurrence; Stage II/III colorectal cancer.

MeSH terms

  • Adolescent
  • Biomarkers, Tumor / genetics
  • Colorectal Neoplasms* / drug therapy
  • Colorectal Neoplasms* / genetics
  • Consensus Sequence* / genetics
  • Humans
  • Machine Learning
  • Middle Aged
  • Neoplasm Recurrence, Local / genetics
  • Prognosis
  • RNA, Long Noncoding* / genetics

Substances

  • Biomarkers, Tumor
  • RNA, Long Noncoding