Predicting Deep Learning Based Multi-Omics Parallel Integration Survival Subtypes in Lung Cancer Using Reverse Phase Protein Array Data

Biomolecules. 2020 Oct 19;10(10):1460. doi: 10.3390/biom10101460.

Abstract

Lung cancer accounts for a large fraction of cancer deaths worldwide, and as mortality figures rise, accurate prediction of prognosis has become essential. In recent years, multi-omics analysis has emerged as a useful tool for survival prediction. However, the methodology for multi-omics analysis is not yet fully established, and further improvements are required for clinical application. In this study, we developed a novel method to accurately predict the survival of patients with lung cancer using multi-omics data. First, using unsupervised learning techniques, we detected survival-associated subtypes in non-small cell lung cancer from six categories of multi-omics datasets in The Cancer Genome Atlas (TCGA). These new subtypes, referred to as integration survival subtypes, clearly divided patients into longer- and shorter-surviving groups (log-rank test: p = 0.003), and we confirmed that the subtypes are independent of histopathological classification (Chi-square test of independence: p = 0.94). Next, we attempted to detect the integration survival subtypes using a dataset from only one category. A machine learning model trained only on reverse phase protein array (RPPA) data accurately predicted the integration survival subtypes (AUC = 0.99). The predicted subtypes also distinguished between high- and low-risk patients (log-rank test: p = 0.012). Overall, this study explores the potential of multi-omics analysis to accurately predict the prognosis of patients with lung cancer.
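
The log-rank comparisons reported in the abstract test whether two groups of patients differ in survival. As an illustration only, a minimal two-group log-rank test can be sketched in plain Python as follows. This is not the authors' implementation: the function name `logrank_test`, its arguments, and the toy follow-up times are all hypothetical, and the p-value uses the chi-square distribution with one degree of freedom.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p-value).

    times_*  : follow-up times for each patient in the group
    events_* : 1 if death was observed at that time, 0 if censored
    """
    # Pool the observations, tagging each with its group (0 = A, 1 = B).
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    # Distinct times at which at least one event was observed.
    event_times = sorted({t for t, e, _ in data if e})

    observed_a = 0.0  # observed events in group A
    expected_a = 0.0  # events expected in group A under the null hypothesis
    variance = 0.0    # hypergeometric variance summed over event times
    for t in event_times:
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)  # at risk, A
        n_b = sum(1 for ti, _, g in data if ti >= t and g == 1)  # at risk, B
        d_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        d_b = sum(1 for ti, e, g in data if ti == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Toy, made-up follow-up times (not data from the study).
short_times, short_events = [1, 2, 3, 4, 5], [1, 1, 1, 1, 1]
long_times, long_events = [6, 7, 8, 9, 10], [1, 1, 1, 1, 1]
chi2, p = logrank_test(short_times, short_events, long_times, long_events)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

A small p-value indicates that the two survival curves differ by more than would be expected by chance. In practice an established survival analysis library would normally be preferred over a hand-written test.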

Keywords: deep learning and machine learning; lung cancer; multi-omics analysis.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Carcinoma, Non-Small-Cell Lung / genetics*
  • Carcinoma, Non-Small-Cell Lung / pathology
  • DNA Methylation / genetics
  • Deep Learning*
  • Disease-Free Survival
  • Female
  • Genomics / statistics & numerical data
  • Humans
  • Machine Learning*
  • Male
  • Models, Theoretical
  • Prognosis*
  • Protein Array Analysis / methods
  • Proteomics / statistics & numerical data