SDImpute: A statistical block imputation method based on cell-level and gene-level information for dropouts in single-cell RNA-seq data

PLoS Comput Biol. 2021 Jun 17;17(6):e1009118. doi: 10.1371/journal.pcbi.1009118. eCollection 2021 Jun.

Abstract

Single-cell RNA sequencing (scRNA-seq) technologies measure gene expression at single-cell resolution and provide a tool for exploring cell heterogeneity and cell types. Because only a small number of mRNA copies can be extracted from each cell, scRNA-seq data exhibit a large number of dropouts, which hinder downstream analysis. We propose a statistical method, SDImpute (Single-cell RNA-seq Dropout Imputation), that performs block imputation for dropout events in scRNA-seq data. SDImpute automatically identifies dropout events based on gene expression levels and the variation of gene expression across similar cells and similar genes, and it imputes dropouts in blocks using expression values from similar cells that are unaffected by dropouts. Results on simulated and real datasets suggest that SDImpute effectively recovers the data while preserving the heterogeneity of gene expression across cells. Compared with state-of-the-art imputation methods, SDImpute improves the accuracy of downstream analyses, including clustering, visualization, and differential expression analysis.
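The abstract describes the approach only at a high level, so the following is a minimal Python sketch of the general idea (flag a zero as a likely dropout when similar cells express the gene, then fill it from those cells' nonzero values). It is not SDImpute's actual procedure: the function name `impute_dropouts_sketch`, the neighbor count `k`, the use of Pearson correlation for cell similarity, and the majority rule for flagging dropouts are all assumptions made for illustration, and the gene-level information used by SDImpute is not modeled here.

```python
import numpy as np

def impute_dropouts_sketch(expr, k=2):
    """Illustrative neighbor-based imputation (NOT the SDImpute algorithm).

    expr: genes x cells matrix of non-negative expression values.
    k:    number of most similar cells to consult (hypothetical parameter).
    """
    expr = np.asarray(expr, dtype=float)
    n_genes, n_cells = expr.shape

    # Cell-cell similarity: Pearson correlation between expression profiles.
    sim = np.corrcoef(expr.T)
    sim = np.nan_to_num(sim, nan=-1.0)   # constant profiles yield NaN
    np.fill_diagonal(sim, -np.inf)       # a cell is not its own neighbor

    imputed = expr.copy()
    for j in range(n_cells):
        neighbors = np.argsort(sim[j])[::-1][:k]  # k most similar cells
        for g in range(n_genes):
            if expr[g, j] != 0:
                continue  # only zeros are dropout candidates
            vals = expr[g, neighbors]
            nonzero = vals[vals > 0]
            # Assumed rule: treat the zero as a dropout only if most
            # neighbors express the gene; otherwise keep it as a true zero.
            if nonzero.size > k / 2:
                imputed[g, j] = nonzero.mean()
    return imputed

# Toy data: two cell types; gene 1 in cell 0 mimics a dropout.
toy = np.array([
    [5, 6, 5, 0, 0, 0],
    [0, 4, 4, 0, 0, 0],
    [0, 0, 0, 3, 4, 3],
    [0, 0, 0, 6, 5, 6],
])
imputed_toy = impute_dropouts_sketch(toy, k=2)
```

In this toy matrix, the zero for gene 1 in cell 0 is filled from the two most similar cells, while zeros that are shared by a cell's neighbors are left untouched, which is the heterogeneity-preserving behavior the abstract emphasizes.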

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Cluster Analysis
  • Computational Biology
  • Computer Simulation
  • Data Interpretation, Statistical
  • Data Visualization
  • Databases, Nucleic Acid / statistics & numerical data
  • Gene Expression Profiling / statistics & numerical data
  • Genetic Techniques / statistics & numerical data
  • Humans
  • RNA, Messenger / genetics
  • RNA, Messenger / isolation & purification
  • RNA-Seq / statistics & numerical data*
  • Single-Cell Analysis / statistics & numerical data*
  • Software*

Substances

  • RNA, Messenger

Grants and funding

This work was supported by the National Natural Science Foundation of China (http://www.nsfc.gov.cn/) under grant number 11971130 (awarded to SJ). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.