Machine learning-based prediction of survival prognosis in cervical cancer

BMC Bioinformatics. 2021 Jun 16;22(1):331. doi: 10.1186/s12859-021-04261-x.

Abstract

Background: Accurately forecasting prognosis could improve cervical cancer management; however, the clinical features currently in use do not provide enough information. This study aims to improve forecasting capability by developing a miRNA-based machine learning survival prediction model.

Results: The expression characteristics of miRNAs were chosen as features for model development. Cervical cancer miRNA expression data were obtained from The Cancer Genome Atlas database. Preprocessing comprised removal of unquantified data, missing value imputation, sample normalization, log transformation, and feature scaling. In total, 42 survival-related miRNAs were identified by Cox proportional-hazards analysis. Based on the top 10 survival-related miRNAs, the K-means clustering algorithm optimally clustered the patients into four groups with three distinct 5-year survival outcomes (≥ 90%, ≈ 65%, ≤ 40%). A high-performance prediction model was then established from the K-means clustering result. Pathway analysis indicated that the miRNAs used are involved in the regulation of cancer stem cells.
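The log transformation, feature scaling, and K-means steps described above can be illustrated with a minimal sketch. This is not the authors' code: the `log2(x + 1)` form of the transform, the z-score scaling, the function names, and the toy expression values are all assumptions made for illustration. The study clustered patients into four groups on the top 10 survival-related miRNAs; the toy example uses two features and two clusters to stay short, and it omits imputation, sample normalization, and the Cox analysis.

```python
import math
import random


def log2_transform(matrix):
    """Apply log2(x + 1) to every expression value (assumed transform)."""
    return [[math.log2(v + 1.0) for v in row] for row in matrix]


def zscore_columns(matrix):
    """Scale each feature (column) to zero mean and unit variance."""
    n = len(matrix)
    columns = list(zip(*matrix))
    means = [sum(col) / n for col in columns]
    stds = []
    for col, mean in zip(columns, means):
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / n)
        stds.append(sd if sd > 0 else 1.0)  # guard against constant features
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in matrix]


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each sample goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical toy data: 6 patients (rows) x 2 miRNAs (columns).
raw = [
    [1, 1], [2, 1], [1, 2],                       # low-expression patients
    [1000, 1000], [1100, 900], [900, 1100],       # high-expression patients
]
scaled = zscore_columns(log2_transform(raw))
labels, centroids = kmeans(scaled, k=2)
```

On this toy input the three low-expression patients share one cluster label and the three high-expression patients share the other. In a real analysis, the survival outcome of each cluster would then be estimated separately, and the number of clusters would be chosen by a validity criterion rather than fixed in advance.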

Conclusion: A miRNA-based machine learning survival prediction model for cervical cancer was developed that robustly stratifies patients into high (5-year survival rate ≥ 90%), moderate (5-year survival rate ≈ 65%), and low (5-year survival rate ≤ 40%) survival groups.

Keywords: Cervical cancer; Machine learning; Support-vector machines; Survival prediction; miRNAs.

MeSH terms

  • Algorithms
  • Female
  • Humans
  • Machine Learning
  • MicroRNAs* / genetics
  • Survival Rate
  • Uterine Cervical Neoplasms* / genetics

Substances

  • MicroRNAs