Benchmarking joint multi-omics dimensionality reduction approaches for the study of cancer

Nat Commun. 2021 Jan 5;12(1):124. doi: 10.1038/s41467-020-20430-7.

Abstract

High-dimensional multi-omics data are now standard in biology. When effectively integrated, they can greatly enhance our understanding of biological systems. Joint Dimensionality Reduction (jDR) methods are among the most efficient approaches to such integration. However, many jDR methods are available, which calls for a comprehensive benchmark with practical guidelines. We perform a systematic evaluation of nine representative jDR methods using three complementary benchmarks. First, we evaluate their performance in retrieving ground-truth sample clustering from simulated multi-omics datasets. Second, we use TCGA cancer data to assess their strengths in predicting survival, clinical annotations and known pathways/biological processes. Finally, we assess their classification of multi-omics single-cell data. From these in-depth comparisons, we observe that intNMF performs best in clustering, while MCIA performs effectively across many contexts. The code developed for this benchmark study is implemented in a Jupyter notebook, multi-omics mix (momix), to foster reproducibility and to support users and future developers.
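The first benchmark (retrieving ground-truth sample clustering from simulated data) can be illustrated with a minimal sketch. It is not the momix code and uses none of the nine benchmarked methods: PCA on the concatenated omics blocks stands in for a jDR method, a small k-means clusters the resulting factors, and a pair-counting Jaccard index scores the agreement with the simulated labels. The simulation scheme, all function names, all parameter values and the choice of score are assumptions made for illustration.

```python
import numpy as np

def simulate_omics(labels, n_features, shift, rng):
    """One simulated omics block: unit Gaussian noise plus a per-cluster mean."""
    n_clusters = labels.max() + 1
    centers = rng.normal(0.0, shift, size=(n_clusters, n_features))
    noise = rng.normal(0.0, 1.0, size=(labels.size, n_features))
    return centers[labels] + noise

def joint_factors(blocks, n_factors):
    """Stand-in jDR (assumption): PCA via SVD on the concatenated blocks."""
    scaled = []
    for X in blocks:
        Z = (X - X.mean(axis=0)) / X.std(axis=0)   # standardize each feature
        scaled.append(Z / np.linalg.norm(Z))       # equal total weight per omics
    U, S, _ = np.linalg.svd(np.hstack(scaled), full_matrices=False)
    return U[:, :n_factors] * S[:n_factors]        # samples x factors

def kmeans(X, k, n_iter=50):
    """Lloyd's k-means with deterministic farthest-first initialization."""
    centers = [X[0]]
    for _ in range(k - 1):
        d = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = dist.argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):                # keep old center if empty
                centers[j] = X[assign == j].mean(axis=0)
    return assign

def pair_jaccard(a, b):
    """Pairs co-clustered in both partitions / pairs co-clustered in either."""
    upper = np.triu_indices(a.size, k=1)
    same_a = (a[:, None] == a[None, :])[upper]
    same_b = (b[:, None] == b[None, :])[upper]
    return np.sum(same_a & same_b) / np.sum(same_a | same_b)

rng = np.random.default_rng(0)
truth = np.repeat(np.arange(3), 20)                # 3 clusters of 20 samples
blocks = [simulate_omics(truth, 50, 2.0, rng) for _ in range(2)]
factors = joint_factors(blocks, n_factors=2)
predicted = kmeans(factors, k=3)
score = pair_jaccard(truth, predicted)
print(f"Jaccard agreement with ground truth: {score:.2f}")
```

Swapping `joint_factors` for another factorization while keeping the simulation and scoring fixed is the general pattern of a comparison like this one; the clusters here are deliberately well separated, so the sketch says nothing about how the real methods rank.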

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Benchmarking
  • Cell Line, Tumor
  • Computational Biology / methods*
  • Datasets as Topic
  • Gene Expression Regulation, Neoplastic*
  • Gene Ontology
  • Humans
  • Molecular Sequence Annotation
  • Multifactor Dimensionality Reduction
  • Neoplasm Proteins / genetics*
  • Neoplasm Proteins / metabolism
  • Neoplasms / diagnosis
  • Neoplasms / genetics*
  • Neoplasms / mortality
  • Neoplasms / pathology
  • Reproducibility of Results
  • Single-Cell Analysis
  • Survival Analysis

Substances

  • Neoplasm Proteins