Multi-View Clustering for Integration of Gene Expression and Methylation Data With Tensor Decomposition and Self-Representation Learning

IEEE/ACM Trans Comput Biol Bioinform. 2023 May-Jun;20(3):2050-2063. doi: 10.1109/TCBB.2022.3229678. Epub 2023 Jun 5.

Abstract

The accumulation of DNA methylation and gene expression data provides a great opportunity to explore the epigenetic patterns of genes, which is the foundation for revealing the underlying mechanisms of biological systems. Current integrative algorithms are criticized for poor performance because they fail to address both the heterogeneity of expression and methylation data and the intrinsic relations between them. To address this issue, a novel multi-view clustering algorithm with self-representation learning and a low-rank tensor constraint (MCSL-LTC) is proposed for the integration of gene expression and DNA methylation data, which are treated as complementary views. Specifically, MCSL-LTC first learns low-dimensional features for each view by linear projection, and these features are then fused in a unified tensor space under low-rank constraints. In this way, the complementary information of the views is precisely captured while the heterogeneity of the omic data is avoided, thereby enhancing the consistency of different views. Finally, MCSL-LTC obtains a consensus clustering of genes that reflects the structure and features of the views. Experimental results demonstrate that the proposed approach outperforms state-of-the-art baselines in terms of accuracy on both the social and cancer data, providing an effective and efficient method for the integration of heterogeneous genomic data.
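
The general shape of such a pipeline (per-view linear projection, self-representation, low-rank fusion in a tensor, consensus clustering) can be illustrated with a rough NumPy sketch. This is not the MCSL-LTC algorithm, whose objective function and optimization are not given in the abstract. Everything here is an assumption made for illustration: PCA stands in for the learned linear projection, a closed-form ridge regression stands in for self-representation learning, a single singular-value shrinkage step in the Fourier domain stands in for the low-rank tensor constraint, and all function names, parameters (d, lam, tau), and the synthetic data are hypothetical.

```python
import numpy as np

def project(X, d):
    """Assumed stand-in for the linear projection: PCA scores (genes x d)."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:d].T

def self_representation(H, lam=1.0):
    """Ridge self-representation: each gene is expressed as a combination of
    the others. With D = H.T, solves min_Z ||D - D Z||_F^2 + lam ||Z||_F^2."""
    G = H @ H.T  # n x n Gram matrix of the genes
    return np.linalg.solve(G + lam * np.eye(G.shape[0]), G)

def tsvd_shrink(T, tau):
    """One low-rank step: soft-threshold the singular values of every frontal
    slice in the Fourier domain along the view axis (t-SVD style)."""
    F = np.fft.fft(T, axis=2)
    out = np.empty_like(F)
    for k in range(F.shape[2]):
        U, s, Vh = np.linalg.svd(F[:, :, k], full_matrices=False)
        out[:, :, k] = (U * np.maximum(s - tau, 0.0)) @ Vh
    return np.real(np.fft.ifft(out, axis=2))

def spectral_labels(W, k, iters=50, seed=0):
    """Normalized spectral clustering with a small built-in k-means."""
    deg = np.maximum(W.sum(axis=1), 1e-12)
    inv_sqrt = 1.0 / np.sqrt(deg)
    L = np.eye(len(W)) - inv_sqrt[:, None] * W * inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)            # eigenvalues in ascending order
    E = vecs[:, :k].copy()
    E /= np.linalg.norm(E, axis=1, keepdims=True) + 1e-12
    rng = np.random.default_rng(seed)
    centers = E[rng.choice(len(E), k, replace=False)]
    for _ in range(iters):
        dist = ((E[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = np.argmin(dist, axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = E[labels == j].mean(axis=0)
    return labels

def cluster_views(views, k, d=5, lam=1.0, tau=0.1):
    """Stack per-view self-representations into an n x n x V tensor, shrink it
    toward low rank, and cluster the averaged, symmetrized affinity."""
    Z = np.stack([self_representation(project(X, d), lam) for X in views], axis=2)
    Z = tsvd_shrink(Z, tau)
    W = np.abs(Z).mean(axis=2)
    W = 0.5 * (W + W.T)
    return spectral_labels(W, k), W

# Synthetic toy data (hypothetical): 40 genes in two groups, measured in two
# views that play the roles of expression and methylation (genes x samples).
rng = np.random.default_rng(0)
truth = np.repeat([0, 1], 20)
expression = rng.normal(size=(40, 12)) + 3.0 * truth[:, None]
methylation = rng.normal(size=(40, 12)) - 3.0 * truth[:, None]
labels, W = cluster_views([expression, methylation], k=2)
```

A real implementation would alternate updates of the projections, the representation tensor, and the consensus structure until convergence, rather than applying each step once as done here.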

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Cluster Analysis
  • DNA Methylation* / genetics
  • Gene Expression
  • Genomics