Identification of a novel gene signature for the prediction of recurrence in HCC patients by machine learning of genome-wide databases

Sci Rep. 2020 Mar 10;10(1):4435. doi: 10.1038/s41598-020-61298-3.

Abstract

Hepatocellular carcinoma (HCC) is a common malignant tumor in China. In the present study, we aimed to construct and verify a model for predicting recurrence in HCC patients using databases (TCGA, AMC and Inserm) and machine learning methods, and to obtain a gene signature that could predict early relapse of HCC. Statistical methods in R software, including feature selection, survival analysis and the chi-square test, were used to analyze and select mutant genes related to disease-free survival (DFS), race and vascular invasion. In addition, whole-exome sequencing was performed on 10 HCC patients recruited from our center, and the sequencing results were compared with the databases. Using the databases and machine learning methods, the recurrence prediction model was constructed and optimized, and the selected mutant genes were verified in the test group. The prediction accuracy was 74.19%. Moreover, the 10 patients from our center were used to verify these mutant genes and the prediction model, and a success rate of 80% was achieved. Collectively, we identified recurrence-related genes and established a recurrence prediction model for HCC patients, which could provide significant guidance for clinical prediction of recurrence.
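As an illustration of two steps named in the abstract, the following minimal Python sketch runs a chi-square test of association between a gene's mutation status and recurrence, then computes prediction accuracy. It is not the authors' R workflow. The function names, the contingency counts and the label vectors are all hypothetical and chosen only to make the example run; the one figure taken from the abstract is that 8 correct calls out of 10 patients correspond to the reported 80% success rate.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test (no continuity correction) for a 2x2 table.

    table is [[a, b], [c, d]], e.g. rows = mutated / wild type,
    columns = recurrence / no recurrence.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, the chi-square survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


def accuracy(predicted, actual):
    """Fraction of predictions that match the observed outcome."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical counts, NOT data from the study.
toy_table = [[12, 5], [8, 15]]
stat, p_value = chi_square_2x2(toy_table)
print(f"chi-square = {stat:.3f}, p = {p_value:.4f}")

# Hypothetical labels for 10 patients (1 = recurrence, 0 = no recurrence),
# arranged so that 8 of 10 predictions are correct.
actual = [1, 1, 0, 1, 0, 0, 1, 1, 0, 1]
predicted = [1, 1, 0, 1, 0, 1, 1, 0, 0, 1]
print(f"accuracy = {accuracy(predicted, actual):.0%}")
```

A gene would be kept as a candidate only if its p-value fell below a chosen threshold; the abstract does not state which threshold or multiple-testing correction was used, so none is assumed here.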

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Biomarkers, Tumor / genetics*
  • Carcinoma, Hepatocellular / genetics*
  • Carcinoma, Hepatocellular / pathology
  • Disease-Free Survival
  • Gene Expression Profiling
  • Gene Expression Regulation, Neoplastic*
  • Humans
  • Liver Neoplasms / genetics*
  • Liver Neoplasms / pathology
  • Machine Learning*
  • Neoplasm Recurrence, Local / genetics*
  • Neoplasm Recurrence, Local / pathology
  • Survival Rate

Substances

  • Biomarkers, Tumor