A Recurrence-Specific Gene-Based Prognosis Prediction Model for Lung Adenocarcinoma through Machine Learning Algorithm

Biomed Res Int. 2020 Nov 7:2020:9124792. doi: 10.1155/2020/9124792. eCollection 2020.

Abstract

Background: After curative surgical resection, about 30-75% of lung adenocarcinoma (LUAD) patients suffer recurrence with dismal survival outcomes. Identifying patients at high risk of recurrence, so that they can receive more intensive therapy, is urgently needed.

Materials and methods: Gene expression data for LUAD were obtained from The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO) databases. Differentially expressed genes (DEGs) were identified by comparing recurrent and primary tumor tissues. Prognostic genes associated with recurrence-free survival (RFS) in LUAD patients were identified using univariate analysis. LASSO Cox regression and multivariate Cox analysis were then applied to select key genes and build the prediction model.
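
As an illustration of the LASSO Cox step named above, the sketch below evaluates the quantity such a fit minimizes: the negative Cox partial log-likelihood plus an L1 penalty on the gene coefficients. It is a minimal sketch under stated assumptions, not the authors' implementation. The function name, the argument layout, the absence of tied event times, and the unscaled form of the penalty are all assumptions made for illustration.

```python
import math


def penalized_neg_log_partial_likelihood(beta, X, time, event, lam):
    """L1-penalized negative Cox partial log-likelihood.

    Assumptions (not from the source): no tied event times, and the
    likelihood is left unscaled by the sample size.

    beta  : list of coefficients, one per gene
    X     : list of per-patient expression rows (same order as beta)
    time  : follow-up time to recurrence or censoring, per patient
    event : 1 if recurrence was observed, 0 if censored
    lam   : L1 penalty strength (larger values shrink more genes to zero)
    """
    # Linear predictor for each patient: sum of coefficient * expression.
    eta = [sum(b * x for b, x in zip(beta, row)) for row in X]

    nll = 0.0
    for i, (t_i, d_i) in enumerate(zip(time, event)):
        if not d_i:
            continue  # censored patients only contribute through risk sets
        # Risk set: patients still recurrence-free just before time t_i.
        log_risk = math.log(
            sum(math.exp(eta[j]) for j, t_j in enumerate(time) if t_j >= t_i)
        )
        nll -= eta[i] - log_risk

    # The L1 term is what drives uninformative coefficients to exactly zero.
    return nll + lam * sum(abs(b) for b in beta)
```

With all coefficients at zero, every patient has equal hazard, so each observed recurrence contributes the log of its risk-set size. Increasing `lam` raises the cost of every non-zero coefficient, which is how a candidate gene list can be narrowed to a smaller set of key genes.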

Results: We detected 37 DEGs between primary and recurrent LUAD tumors. In univariate analysis, 31 DEGs were significantly associated with RFS. Using LASSO Cox regression, we built an RFS prediction model comprising thirteen genes. In the training cohort, we classified patients into high- and low-risk groups; patients in the high-risk group had worse RFS than those in the low-risk group (P < 0.01). Concordant results were obtained in the internal and external validation cohorts. The model's performance was also confirmed across different clinical subgroups. In multivariate Cox analysis, high-risk group membership was a significant risk factor for recurrence in LUAD (HR = 13.37, P = 0.01). Compared with clinicopathological features, our prediction model identified patients at high risk of recurrence more accurately (AUC = 96.3%). Finally, we found that the G2M checkpoint pathway was enriched both in recurrent tumors and in primary tumors of high-risk patients.
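
To make the risk grouping and AUC in the Results concrete, here is a minimal sketch of how a gene-based risk score is commonly computed and evaluated. Everything specific in it is hypothetical: the abstract does not list the thirteen genes, their coefficients, or the cutoff used, so the gene names, weights, toy patients, and median split below are placeholders. The AUC shown is the plain binary form that ignores follow-up time and censoring, which may differ from the method used in the study.

```python
from statistics import median

# Hypothetical coefficients: placeholders only, not the model's real genes.
COEFS = {"GENE_A": 0.8, "GENE_B": -0.5, "GENE_C": 0.3}


def risk_score(expr, coefs=COEFS):
    """Weighted sum of expression values, one weight per model gene."""
    return sum(w * expr[g] for g, w in coefs.items())


def split_by_median(scores):
    """Label each patient high or low risk (median cutoff is an assumption)."""
    cutoff = median(scores)
    return ["high" if s > cutoff else "low" for s in scores]


def auc(scores, recurred):
    """Probability that a recurrent patient outscores a non-recurrent one.

    Ties count as half. This binary AUC ignores censoring and timing.
    """
    pos = [s for s, r in zip(scores, recurred) if r]
    neg = [s for s, r in zip(scores, recurred) if not r]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy cohort, invented for illustration.
patients = [
    {"GENE_A": 2.0, "GENE_B": 1.0, "GENE_C": 0.0},
    {"GENE_A": 0.5, "GENE_B": 2.0, "GENE_C": 1.0},
    {"GENE_A": 1.0, "GENE_B": 0.0, "GENE_C": 2.0},
    {"GENE_A": 0.0, "GENE_B": 1.0, "GENE_C": 1.0},
]
recurred = [True, False, True, False]

scores = [risk_score(p) for p in patients]
groups = split_by_median(scores)
print(groups, auc(scores, recurred))
```

In this toy cohort the two recurrent patients receive the two highest scores, so they fall in the high-risk group and the AUC is at its maximum. Real cohorts would not separate this cleanly.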

Conclusions: Our recurrence-specific gene-based prognostic prediction model provides additional information about the risk of recurrence in LUAD, which may help clinicians deliver individualized therapy.

MeSH terms

  • Adenocarcinoma of Lung / genetics
  • Adenocarcinoma of Lung / mortality*
  • Adenocarcinoma of Lung / pathology*
  • Aged
  • Algorithms
  • Databases, Genetic
  • Female
  • Gene Expression Regulation, Neoplastic*
  • Humans
  • Lung Neoplasms / genetics
  • Lung Neoplasms / mortality*
  • Lung Neoplasms / pathology*
  • Machine Learning
  • Male
  • Middle Aged
  • Models, Theoretical
  • Neoplasm Recurrence, Local / genetics
  • Prognosis