A deep learning approach based on multi-omics data integration to construct a risk stratification prediction model for skin cutaneous melanoma

J Cancer Res Clin Oncol. 2023 Nov;149(17):15923-15938. doi: 10.1007/s00432-023-05358-x. Epub 2023 Sep 7.

Abstract

Purpose: Skin cutaneous melanoma (SKCM) is a highly aggressive melanocytic carcinoma whose high heterogeneity and complex etiology make its prognosis difficult to predict. This study aimed to construct a risk subtyping model for SKCM.

Methods: The study proposes a deep learning framework that combines an early fusion feature autoencoder (AE) and a late fusion feature AE for risk subtype prediction in SKCM. The framework integrates mRNA, miRNA, and DNA methylation data of SKCM patients from The Cancer Genome Atlas (TCGA) and clusters the screened multi-omics features associated with survival prognosis to identify risk subtypes. Differential expression analysis and functional enrichment analysis were performed between the risk subtypes. Support vector machine (SVM) classifiers were then built on differentially expressed genes (DEGs) selected by Least Absolute Shrinkage and Selection Operator (LASSO) logistic regression, using the risk subtype labels inferred from the multi-omics data as targets. The predictive robustness of the resulting risk subtype classification model was validated on two independent datasets.
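
The abstract does not describe the architecture in detail, so the following is only a minimal sketch of the data flow usually meant by the two fusion strategies: early fusion concatenates the omics layers before encoding, while late fusion encodes each layer separately and concatenates the latent features. Everything here is an assumption made for illustration: the function names, the latent sizes, the toy data, and the choice to combine the two strategies by simple concatenation. In particular, `make_encoder` is a hypothetical stand-in (a fixed random projection with tanh) for the encoder half of a trained autoencoder; no training is performed.

```python
import math
import random


def make_encoder(n_in, n_latent, seed):
    """Hypothetical stand-in for a trained AE encoder: a fixed random
    projection followed by tanh. Nothing is trained here."""
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 1.0 / math.sqrt(n_in)) for _ in range(n_in)]
               for _ in range(n_latent)]

    def encode(sample):
        return [math.tanh(sum(w * x for w, x in zip(row, sample)))
                for row in weights]

    return encode


def early_fusion(omics_blocks, n_latent):
    """Concatenate all omics features per patient, then encode once."""
    n_patients = len(omics_blocks[0])
    fused = [sum((block[p] for block in omics_blocks), [])
             for p in range(n_patients)]
    encode = make_encoder(len(fused[0]), n_latent, seed=0)
    return [encode(row) for row in fused]


def late_fusion(omics_blocks, n_latent_each):
    """Encode each omics layer separately, then concatenate the latents."""
    n_patients = len(omics_blocks[0])
    latent = [[] for _ in range(n_patients)]
    for k, block in enumerate(omics_blocks):
        encode = make_encoder(len(block[0]), n_latent_each, seed=k + 1)
        for p in range(n_patients):
            latent[p].extend(encode(block[p]))
    return latent


def combined_features(omics_blocks, n_latent_early, n_latent_each):
    """Assumed combination rule: concatenate early and late fusion latents."""
    early = early_fusion(omics_blocks, n_latent_early)
    late = late_fusion(omics_blocks, n_latent_each)
    return [e + l for e, l in zip(early, late)]


# Toy data: 4 patients with 5 mRNA, 3 miRNA and 4 methylation features.
rng = random.Random(42)
mrna = [[rng.random() for _ in range(5)] for _ in range(4)]
mirna = [[rng.random() for _ in range(3)] for _ in range(4)]
methylation = [[rng.random() for _ in range(4)] for _ in range(4)]

features = combined_features([mrna, mirna, methylation],
                             n_latent_early=2, n_latent_each=1)
print(len(features), len(features[0]))
```

In a pipeline like the one described, the per-patient feature vectors would next be screened for association with survival and clustered into risk subtypes; those steps are not shown here.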

Results: The deep learning framework combining early fusion feature AE with late fusion feature AE separated two risk subtypes better than multi-omics integration approaches using a single-strategy AE or PCA. The identified risk subtypes showed a promising C-index (0.748) and a significant difference in survival (log-rank P value = 4.61 × 10⁻⁹). The most significant DEGs, together with differentially expressed miRNAs, provided a biological interpretation of the SKCM risk subtypes. Finally, the framework was applied to predict risk subtypes in two independent test datasets of SKCM patients, both of which showed good predictive power (C-index > 0.680) and significant survival differences (log-rank P value < 0.01).
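
For readers unfamiliar with the two reported metrics, the sketch below shows one common way to compute them: Harrell's concordance index and a two-group log-rank test. It is an illustrative pure-Python version under stated assumptions, not the authors' code. The function names and toy data are hypothetical, pairs with tied survival times are skipped in the C-index for simplicity, and the P value uses the chi-square distribution with one degree of freedom.

```python
import math
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index: share of comparable patient pairs in which the
    patient with the shorter observed survival has the higher risk score."""
    concordant, comparable = 0.0, 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: ignore pairs with tied times
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored: pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, P value)."""
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    obs_minus_exp, variance = 0.0, 0.0
    for t in sorted({t for t, e, _ in data if e}):
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)
        n_b = sum(1 for u, _, g in data if u >= t and g == 1)
        d_a = sum(1 for u, e, g in data if u == t and e and g == 0)
        d_b = sum(1 for u, e, g in data if u == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n  # observed minus expected in group A
        if n > 1:
            variance += n_a * n_b * d * (n - d) / (n * n * (n - 1))
    if variance == 0:
        raise ValueError("variance is zero; test is undefined")
    chi2 = obs_minus_exp ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))


# Toy example: survival times, event flags (1 = death, 0 = censored),
# and binary risk subtype labels used as the risk score (1 = high risk).
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
subtype = [1, 1, 0, 0]
print(concordance_index(times, events, subtype))
print(logrank_test([1, 2, 3, 4], [1, 1, 1, 1], [10, 11, 12, 13], [1, 1, 1, 1]))
```

A C-index of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which gives context for the reported values of 0.748 and above 0.680.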

Conclusion: The SKCM risk subtypes identified by integrating multi-omics data based on deep learning can not only improve the understanding of the molecular mechanisms of SKCM, but also provide clinicians with assistance in treatment decisions.

Keywords: Autoencoder; Deep learning; Multi-omics data integration; Prognosis prediction; Skin cutaneous melanoma; Subtyping.

MeSH terms

  • Deep Learning*
  • Humans
  • Melanoma* / genetics
  • Melanoma, Cutaneous Malignant
  • MicroRNAs* / genetics
  • Multiomics
  • Risk Assessment
  • Skin Neoplasms* / genetics

Substances

  • MicroRNAs