G2S3: A gene graph-based imputation method for single-cell RNA sequencing data

PLoS Comput Biol. 2021 May 18;17(5):e1009029. doi: 10.1371/journal.pcbi.1009029. eCollection 2021 May.

Abstract

Single-cell RNA sequencing technology provides an opportunity to study gene expression at single-cell resolution. However, prevalent dropout events result in high data sparsity and noise that may obscure downstream analyses in single-cell transcriptomic studies. We propose a new method, G2S3, that imputes dropouts by borrowing information from adjacent genes in a sparse gene graph learned from gene expression profiles across cells. We applied G2S3 and ten existing imputation methods to eight single-cell transcriptomic datasets and compared their performance. Our results demonstrated that G2S3 has superior overall performance in recovering gene expression, identifying cell subtypes, reconstructing cell trajectories, identifying differentially expressed genes, and recovering gene regulatory and correlation relationships. Moreover, G2S3 is computationally efficient for imputation in large-scale single-cell transcriptomic datasets.
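
The idea of borrowing information from adjacent genes in a sparse gene graph can be illustrated with a minimal sketch. This is not the G2S3 algorithm: the abstract does not specify how the graph is learned or how information is propagated, so the sketch swaps in a simple stand-in (a k-nearest-neighbor graph built from positive Pearson correlation, followed by one weighted averaging step). The function name `impute_with_gene_graph` and the parameters `k` and `alpha` are hypothetical, chosen only for illustration, and the input is assumed to be a genes-by-cells matrix with at least two genes.

```python
import numpy as np

def impute_with_gene_graph(X, k=2, alpha=0.5):
    """Illustrative graph-based smoothing; NOT the published G2S3 method.

    X     : genes x cells expression matrix (assumed layout, >= 2 genes)
    k     : neighbors kept per gene when sparsifying the graph (assumption)
    alpha : weight given to the neighbor average (assumption)
    """
    X = np.asarray(X, dtype=float)
    n_genes = X.shape[0]

    # Gene-gene similarity: Pearson correlation across cells.
    # Constant genes yield NaN, which is treated as no similarity.
    with np.errstate(invalid="ignore", divide="ignore"):
        S = np.corrcoef(X)
    S = np.nan_to_num(S, nan=0.0)
    S = np.clip(S, 0.0, None)   # keep only positive associations
    np.fill_diagonal(S, 0.0)    # no self-loops

    # Sparsify: keep the k strongest neighbors of each gene.
    k = min(k, n_genes - 1)
    W = np.zeros_like(S)
    for g in range(n_genes):
        nbrs = np.argsort(S[g])[-k:]
        W[g, nbrs] = S[g, nbrs]
    W = np.maximum(W, W.T)      # make the graph undirected

    # Row-normalize edge weights; isolated genes get an all-zero row.
    row_sums = W.sum(axis=1, keepdims=True)
    P = np.divide(W, row_sums, out=np.zeros_like(W), where=row_sums > 0)

    # Mix each gene's own profile with its neighbors' weighted average.
    # Isolated genes keep their original values unchanged.
    has_nbrs = (row_sums > 0).astype(float)
    return (1.0 - alpha * has_nbrs) * X + alpha * (P @ X)

# Toy data: 3 genes x 4 cells, with a zero (possible dropout) in gene 1, cell 2.
X_toy = np.array([[1, 2, 3, 4],
                  [2, 4, 0, 8],
                  [5, 1, 4, 2]], dtype=float)
Y_toy = impute_with_gene_graph(X_toy, k=1, alpha=0.5)
print(Y_toy)
```

In this toy example, the zero entry of the second gene is pulled toward the value of its most correlated neighbor, while a gene with no positively correlated neighbor is left untouched. A real method would also need to decide which zeros are dropouts rather than true absence of expression, which this sketch does not attempt.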

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Computational Biology / methods
  • Datasets as Topic
  • Gene Expression Profiling
  • Humans
  • Sequence Analysis, RNA / methods*
  • Single-Cell Analysis / methods*