Multi-omics integration in the age of million single-cell data

Nat Rev Nephrol. 2021 Nov;17(11):710-724. doi: 10.1038/s41581-021-00463-x. Epub 2021 Aug 20.

Abstract

An explosion in single-cell technologies has revealed a previously underappreciated heterogeneity of cell types and novel cell-state associations with sex, disease, development and other processes. Starting with transcriptome analyses, single-cell techniques have extended to multi-omics approaches and now enable the simultaneous measurement of data modalities and spatial cellular context. Data are now available for millions of cells, for whole-genome measurements and for multiple modalities. Although analyses of such multimodal datasets have the potential to provide new insights into biological processes that cannot be inferred with a single mode of assay, the integration of very large, complex, multimodal data into biological models and mechanisms represents a considerable challenge. An understanding of the principles of data integration and visualization methods is required to determine what methods are best applied to a particular single-cell dataset. Each class of method has advantages and pitfalls in terms of its ability to achieve various biological goals, including cell-type classification, regulatory network modelling and biological process inference. In choosing a data integration strategy, consideration must be given to whether the multi-omics data are matched (that is, measured on the same cell) or unmatched (that is, measured on different cells) and, more importantly, the overall modelling and visualization goals of the integrated analysis.
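The matched versus unmatched distinction can be made concrete with a toy sketch. This is an illustration only, not a method taken from the review: the function names, the cell barcodes, and the values are hypothetical. It also assumes that the unmatched chromatin data have already been summarized as gene-level activity scores, so that both modalities share a feature space. Matched data can be joined directly on the cell identifier. Unmatched data have no shared identifier, so here cells are paired by nearest neighbour in the shared feature space.

```python
from math import dist  # Euclidean distance, Python 3.8+


def integrate_matched(rna, atac):
    """Join two modalities measured on the same cells by shared barcode."""
    shared = sorted(rna.keys() & atac.keys())
    # Concatenate the per-cell feature vectors of both modalities.
    return {cell: rna[cell] + atac[cell] for cell in shared}


def integrate_unmatched(rna, atac_gene_activity):
    """Pair each ATAC cell with its nearest RNA cell in a shared gene space.

    Assumes both inputs are expressed over the same genes, in the same order.
    """
    return {
        a_cell: min(rna, key=lambda r_cell: dist(rna[r_cell], a_vec))
        for a_cell, a_vec in atac_gene_activity.items()
    }


# Hypothetical toy data: two genes measured by RNA in two cells.
rna = {"cell1": [5.0, 0.0], "cell2": [0.0, 4.0]}

# Matched case: chromatin features measured on the same two cells.
atac_same_cells = {"cell1": [1.0, 0.0, 1.0], "cell2": [0.0, 1.0, 0.0]}
matched = integrate_matched(rna, atac_same_cells)

# Unmatched case: different cells, summarized as gene activity scores.
atac_other_cells = {"cellA": [4.5, 0.5], "cellB": [0.2, 3.0]}
pairs = integrate_unmatched(rna, atac_other_cells)

print(matched)
print(pairs)
```

In practice the unmatched case is much harder than this sketch suggests, because the modalities rarely share a clean feature space and the pairing itself is an inference rather than an observation.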

Publication types

  • Review

MeSH terms

  • Computational Biology
  • Data Analysis
  • Data Visualization
  • Epigenomics
  • Gene Expression Profiling
  • Genomics*
  • Humans
  • Proteomics
  • Single-Cell Analysis*