Construction and Validation of a Prognostic Gene-Based Model for Overall Survival Prediction in Hepatocellular Carcinoma Using an Integrated Statistical and Bioinformatic Approach

Int J Mol Sci. 2021 Feb 5;22(4):1632. doi: 10.3390/ijms22041632.

Abstract

Hepatocellular carcinoma (HCC) is one of the most common lethal cancers worldwide and is often associated with late diagnosis and poor survival outcomes. Growing evidence indicates that gene-based prognostic models can be used to identify high-risk HCC patients. This study therefore aimed to construct a novel model for predicting prognosis in HCC patients. We used a multivariate Cox regression model with three penalized approaches, namely the least absolute shrinkage and selection operator (Lasso), adaptive Lasso, and elastic net algorithms, to select informative prognosis-related genes. Best subset regression was then used to identify the best prognostic gene signature. The prognostic gene-based risk score was constructed using the Cox coefficients of the genes in the signature. The model was evaluated by Kaplan-Meier (KM) and receiver operating characteristic (ROC) curve analyses. A novel four-gene signature associated with prognosis was identified, and the risk score was constructed from it. The risk score effectively stratified patients, identifying a high-risk group with poor prognosis. Time-dependent ROC analysis showed that the risk model performed well in The Cancer Genome Atlas (TCGA) dataset, with areas under the curve (AUC) of 0.780, 0.732, and 0.733 for 1-, 2-, and 3-year prognosis prediction, respectively. Moreover, the risk score showed high diagnostic performance in distinguishing HCC from normal samples. The prognostic and diagnostic performance of the risk score was verified in external validation datasets. Functional enrichment analysis indicated that the four-gene signature and its co-expressed genes are involved in metabolic and cell cycle pathways. Overall, we developed a novel gene-based prognostic model to identify high-risk HCC patients, and we hope that our findings provide insight into the role of the four-gene signature in HCC and aid risk classification.
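As an illustration of how a risk score of the kind described above is typically computed, the following minimal Python sketch forms a linear combination of gene expression values weighted by Cox coefficients and then splits patients into high- and low-risk groups. The gene names, coefficients, and expression values are hypothetical placeholders, since the abstract does not report them, and the median cutoff is an assumption rather than the authors' stated threshold.

```python
from statistics import median

# Hypothetical gene names and Cox coefficients, for illustration only.
# The abstract does not list the four genes or their coefficients.
COEFFICIENTS = {"GENE_A": 0.30, "GENE_B": -0.20, "GENE_C": 0.15, "GENE_D": 0.10}


def risk_score(expression, coefficients=COEFFICIENTS):
    """Return the sum over genes of (Cox coefficient * expression level)."""
    return sum(coef * expression[gene] for gene, coef in coefficients.items())


def stratify(scores, cutoff=None):
    """Label each patient 'high' or 'low' risk.

    The median of the scores is used as the cutoff when none is given;
    this is an assumed convention, not one confirmed by the abstract.
    """
    if cutoff is None:
        cutoff = median(scores.values())
    return {pid: ("high" if score > cutoff else "low")
            for pid, score in scores.items()}


# Made-up expression profiles for four hypothetical patients.
patients = {
    "P1": {"GENE_A": 2.0, "GENE_B": 1.0, "GENE_C": 0.0, "GENE_D": 1.0},
    "P2": {"GENE_A": 0.0, "GENE_B": 2.0, "GENE_C": 1.0, "GENE_D": 0.0},
    "P3": {"GENE_A": 3.0, "GENE_B": 0.0, "GENE_C": 2.0, "GENE_D": 1.0},
    "P4": {"GENE_A": 1.0, "GENE_B": 1.0, "GENE_C": 1.0, "GENE_D": 1.0},
}

scores = {pid: risk_score(expr) for pid, expr in patients.items()}
groups = stratify(scores)

for pid in sorted(patients):
    print(f"{pid}: score={scores[pid]:.2f}, group={groups[pid]}")
```

In practice the resulting groups would then be compared with a Kaplan-Meier analysis and the continuous score assessed with time-dependent ROC curves, as the abstract describes; those steps need survival times and event indicators and are not shown here.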

Keywords: bioinformatics; biomarker; diagnosis; differentially expressed gene; hepatocellular carcinoma; prognosis; risk model; survival analysis.

MeSH terms

  • Biomarkers, Tumor / genetics
  • Carcinoma, Hepatocellular / diagnosis*
  • Carcinoma, Hepatocellular / genetics
  • Carcinoma, Hepatocellular / mortality*
  • Computational Biology / methods*
  • Databases, Genetic
  • Early Detection of Cancer
  • Gene Expression Profiling
  • Gene Expression Regulation, Neoplastic
  • Gene Regulatory Networks*
  • Genetic Predisposition to Disease / genetics
  • Humans
  • Kaplan-Meier Estimate
  • Liver Neoplasms / diagnosis*
  • Liver Neoplasms / genetics
  • Liver Neoplasms / mortality*
  • Nomograms
  • Prognosis
  • ROC Curve
  • Regression Analysis
  • Survival Analysis

Substances

  • Biomarkers, Tumor