cnnImpute: missing value recovery for single cell RNA sequencing data

Sci Rep. 2024 Feb 16;14(1):3946. doi: 10.1038/s41598-024-53998-x.

Abstract

The advent of single-cell RNA sequencing (scRNA-seq) technology has revolutionized our ability to explore cellular diversity and unravel the mechanisms of complex diseases. However, the inherently low signal-to-noise ratio and the excessive number of missing values pose unique challenges for scRNA-seq data analysis. Here, we present cnnImpute, a novel convolutional neural network (CNN) based method designed to address the issue of missing data in scRNA-seq. Our approach first estimates the probability that each expression value is missing, then constructs a CNN-based model to recover the values with a high likelihood of being missing. In comprehensive evaluations, cnnImpute accurately imputes missing values while preserving the integrity of cell clusters in scRNA-seq data analysis, and it achieves superior performance in various benchmarking experiments. cnnImpute offers an accurate and scalable method for recovering missing values, providing a useful resource for scRNA-seq data analysis.
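
The two-stage idea in the abstract (flag the values that are likely missing, then predict only those) can be illustrated with a minimal sketch. This is not the cnnImpute implementation: the abstract does not specify how missing probabilities are estimated or how the CNN is built, so both stages here are simplified stand-ins. The function names `dropout_probability` and `impute_flagged`, the detection-rate heuristic, the 0.5 threshold, and the least-squares predictor used in place of a CNN are all assumptions made for illustration.

```python
import numpy as np

def dropout_probability(X):
    """Stage 1 (assumed heuristic, not the paper's estimator).

    X is a cells x genes expression matrix. A zero in a gene that is
    detected in most cells is treated as more likely to be a dropout.
    Nonzero entries get probability 0.
    """
    detected_frac = (X > 0).mean(axis=0)  # per-gene detection rate
    return np.where(X == 0, detected_frac[None, :], 0.0)

def impute_flagged(X, prob, threshold=0.5, n_predictors=5):
    """Stage 2 (least-squares stand-in for the CNN regressor).

    Only entries with prob > threshold are overwritten; every other
    entry is returned unchanged.
    """
    X = X.astype(float)
    flagged = prob > threshold
    out = X.copy()
    # Gene-gene correlations; constant genes yield NaN, mapped to 0.
    corr = np.nan_to_num(np.corrcoef(X, rowvar=False))
    np.fill_diagonal(corr, 0.0)
    for g in np.where(flagged.any(axis=0))[0]:
        rows_flagged = flagged[:, g]
        train = ~rows_flagged
        # Use the genes most correlated with the target as predictors.
        predictors = np.argsort(-np.abs(corr[g]))[:n_predictors]
        A = np.column_stack([X[:, predictors], np.ones(X.shape[0])])
        coef, *_ = np.linalg.lstsq(A[train], X[train, g], rcond=None)
        # Expression values cannot be negative, so clip predictions.
        out[rows_flagged, g] = np.clip(A[rows_flagged] @ coef, 0.0, None)
    return out
```

Separating the flagging step from the prediction step means that zeros judged to be genuine (for example, in a gene detected in very few cells) are left untouched, which is the property the abstract emphasizes when it restricts recovery to values with a high likelihood of being missing.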

MeSH terms

  • Cluster Analysis
  • Exome Sequencing
  • Gene Expression Profiling* / methods
  • Probability
  • RNA
  • Sequence Analysis, RNA / methods
  • Single-Cell Analysis* / methods

Substances

  • RNA