Integrating Multi-Omic Data With Deep Subspace Fusion Clustering for Cancer Subtype Prediction

IEEE/ACM Trans Comput Biol Bioinform. 2021 Jan-Feb;18(1):216-226. doi: 10.1109/TCBB.2019.2951413. Epub 2021 Feb 3.

Abstract

A single type of cancer usually comprises several subtypes with distinct clinical implications, so cancer subtype prediction is an important task in disease diagnosis and therapy. Relying on a single type of data from the molecular layers of a biological system makes it difficult to bridge the cancer genome and cancer phenotypes, since the genome is neither simple nor independent but complicated and dysregulated by multiple molecular mechanisms. Similarity Network Fusion (SNF) was recently proposed to integrate diverse omics data and improve the understanding of tumorigenesis. However, SNF measures the similarity between patients with Euclidean distance, which has some limitations. In this article, we introduce a novel prediction technique that extends SNF, namely Deep Subspace Fusion Clustering (DSFC). DSFC uses an auto-encoder and data self-expressiveness to guide a deep subspace model, which yields an effective representation of the discriminative similarity between patients. As a result, inter-cluster dissimilarity is enhanced and intra-cluster compactness is improved at the same time. The validity of DSFC is examined by extensive simulations on six different cancers using three levels of omics data. Survival analysis demonstrates that DSFC delivers comparable or even better results than many state-of-the-art integrative methods.
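To make the notion of a Euclidean patient-to-patient similarity concrete, the following is a minimal sketch and not the SNF or DSFC implementation. It assumes a plain Gaussian kernel with a fixed bandwidth `sigma`, which is a simplification chosen for illustration; the function names and the toy patient profiles are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def similarity_matrix(profiles, sigma=1.0):
    """Pairwise similarity from Euclidean distance.

    Assumption: a Gaussian kernel with a fixed bandwidth `sigma`,
    used here only as a simplified stand-in for a distance-based
    patient similarity.
    """
    n = len(profiles)
    return [
        [
            math.exp(-euclidean(profiles[i], profiles[j]) ** 2 / (2 * sigma ** 2))
            for j in range(n)
        ]
        for i in range(n)
    ]

# Hypothetical toy data: three patients, two features each.
patients = [[0.0, 0.0], [0.1, 0.0], [3.0, 4.0]]
W = similarity_matrix(patients)
```

In this toy input the first two patients lie close together and so receive a higher similarity than either does with the third. The sketch only illustrates the distance-based starting point that the abstract says DSFC extends.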

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Cluster Analysis*
  • Computational Biology / methods*
  • Databases, Factual
  • Humans
  • Machine Learning*
  • Neoplasms / classification*