Evaluation of integrative clustering methods for the analysis of multi-omics data

Brief Bioinform. 2020 Mar 23;21(2):541-552. doi: 10.1093/bib/bbz015.

Abstract

Recent advances in sequencing, mass spectrometry and cytometry technologies have enabled researchers to collect large-scale omics data from the same set of biological samples. The joint analysis of multiple omics data sets offers the opportunity to uncover coordinated cellular processes acting across different omic layers. In this work, we present a thorough comparison of a selection of recent integrative clustering approaches, including Bayesian approaches (BCC and MDI) and matrix factorization approaches (iCluster, moCluster, JIVE and iNMF). Based on simulations, the methods were evaluated on their sensitivity and on their ability to recover both the correct number of clusters and the simulated clustering at the common and data-specific levels. Standard non-integrative approaches were also included to quantify the added value of integrative methods. For most matrix factorization methods and one Bayesian approach (BCC), the shared and specific structures were successfully recovered with high and moderate accuracy, respectively. Non-integrative approaches showed the opposite behavior, i.e. high performance on specific structures only. Finally, we applied the methods to the Cancer Genome Atlas breast cancer data set to check whether results based on experimental data were consistent with those obtained in the simulations.
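
As an illustration of what "recovering the simulated clustering" can mean in practice, the sketch below scores an estimated partition against a simulated one. The abstract does not name the agreement measure used in the study; the adjusted Rand index (ARI) is assumed here only because it is a common choice for this kind of comparison. The function name and the toy label vectors are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(truth, predicted):
    """Adjusted Rand index between two partitions of the same samples.

    Assumption: ARI is used purely for illustration; the abstract does not
    state which agreement measure the authors relied on.
    """
    if len(truth) != len(predicted):
        raise ValueError("both label vectors must cover the same samples")
    n = len(truth)
    if n < 2:
        return 1.0  # no sample pairs to compare

    # Contingency counts: samples sharing a (true label, predicted label) pair.
    cell_counts = Counter(zip(truth, predicted))
    truth_sizes = Counter(truth)
    predicted_sizes = Counter(predicted)

    # Number of sample pairs grouped together in both, or in each, partition.
    together_in_both = sum(comb(c, 2) for c in cell_counts.values())
    together_in_truth = sum(comb(c, 2) for c in truth_sizes.values())
    together_in_pred = sum(comb(c, 2) for c in predicted_sizes.values())

    # Agreement expected by chance, and the maximum attainable agreement.
    expected = together_in_truth * together_in_pred / comb(n, 2)
    maximum = (together_in_truth + together_in_pred) / 2
    if maximum == expected:
        return 1.0  # degenerate case: both partitions are trivial
    return (together_in_both - expected) / (maximum - expected)


# Hypothetical toy labels: the second vector is the first one relabelled.
simulated = [0, 0, 1, 1, 2, 2]
recovered = [1, 1, 2, 2, 0, 0]
ari_same_partition = adjusted_rand_index(simulated, recovered)
```

Because the index depends only on which samples are grouped together, the relabelled partition above scores 1.0, while a partition that splits every simulated cluster scores at or below zero. In a benchmark of this kind, such a score could be computed once for the shared structure and once per data set for the specific structures.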

Keywords: benchmark; clustering; data integration; multi-omics; unsupervised analysis.

Publication types

  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Bayes Theorem
  • Breast Neoplasms / genetics
  • Breast Neoplasms / metabolism
  • Cluster Analysis
  • Genomics / methods*
  • Humans
  • Proteomics / methods*
  • Unsupervised Machine Learning