Dimensionality reduction and visualization of single-cell RNA-seq data with an improved deep variational autoencoder

Brief Bioinform. 2023 May 19;24(3):bbad152. doi: 10.1093/bib/bbad152.

Abstract

Single-cell RNA sequencing (scRNA-seq) is a revolutionary breakthrough that determines the precise gene expressions on individual cells and deciphers cell heterogeneity and subpopulations. However, scRNA-seq data are much noisier than traditional high-throughput RNA-seq data because of technical limitations, leading to many scRNA-seq data studies about dimensionality reduction and visualization remaining at the basic data-stacking stage. In this study, we propose an improved variational autoencoder model (termed DREAM) for dimensionality reduction and a visual analysis of scRNA-seq data. Here, DREAM combines the variational autoencoder and Gaussian mixture model for cell type identification, meanwhile explicitly solving 'dropout' events by introducing the zero-inflated layer to obtain the low-dimensional representation that describes the changes in the original scRNA-seq dataset. Benchmarking comparisons across nine scRNA-seq datasets show that DREAM outperforms four state-of-the-art methods on average. Moreover, we prove that DREAM can accurately capture the expression dynamics of human preimplantation embryonic development. DREAM is implemented in Python, freely available via the GitHub website, https://github.com/Crystal-JJ/DREAM.

Keywords: dimensionality reduction; dropout; single-cell RNA-seq; variational autoencoder; visualization.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Cluster Analysis
  • Gene Expression Profiling / methods
  • Humans
  • RNA-Seq
  • Sequence Analysis, RNA / methods
  • Single-Cell Analysis* / methods
  • Single-Cell Gene Expression Analysis*