[Imputation method for dropout in single-cell transcriptome data]

Sheng Wu Yi Xue Gong Cheng Xue Za Zhi. 2023 Aug 25;40(4):778-783. doi: 10.7507/1001-5515.202301009.
[Article in Chinese]

Abstract

Single-cell transcriptome sequencing (scRNA-seq) resolves gene expression in tissues at single-cell precision, allowing researchers to quantify cellular heterogeneity within populations at higher resolution and to reveal latent heterogeneous cell populations and the dynamics of complex tissues. However, the large number of technical zeros (dropouts) in scRNA-seq data affects downstream analyses such as cell clustering, differential gene expression analysis, cell annotation, and pseudotime inference, hindering the discovery of meaningful biological signals. The main approach to this problem is to exploit the latent correlations among cells and among genes and to impute the technical zeros from the observed data. Accordingly, this paper reviews the basic methods for imputing technical zeros in scRNA-seq data and discusses the advantages and disadvantages of existing methods. Finally, it offers recommendations and perspectives on the use and development of these methods.
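The idea of imputing technical zeros from correlated cells and genes can be illustrated with a minimal low-rank sketch. This is not a method taken from the review: the function name `lowrank_impute`, the `rank` parameter, the log transform, and the toy data are all assumptions made for illustration, and the sketch treats every zero as a technical zero, whereas practical methods try to separate technical from biological zeros.

```python
import numpy as np


def lowrank_impute(counts, rank=2):
    """Fill zero entries of a cells x genes count matrix from a truncated SVD.

    Hypothetical helper for illustration only. Assumes every zero is a
    technical zero and that log-transformed expression is roughly low rank.
    """
    counts = np.asarray(counts, dtype=float)
    logged = np.log1p(counts)  # stabilise variance before factorising

    # Truncated SVD: keep the `rank` strongest cell/gene co-variation patterns.
    u, s, vt = np.linalg.svd(logged, full_matrices=False)
    approx = (u[:, :rank] * s[:rank]) @ vt[:rank]
    approx = np.clip(approx, 0.0, None)  # expression cannot be negative

    # Only zero entries are replaced; observed non-zero values stay untouched.
    imputed = logged.copy()
    zero_mask = counts == 0
    imputed[zero_mask] = approx[zero_mask]
    return np.expm1(imputed)  # back to the count scale


# Toy data (assumed): two cell types, five genes, about 30% simulated dropout.
rng = np.random.default_rng(0)
type_means = np.array([[20, 15, 1, 1, 8],
                       [1, 2, 18, 25, 8]], dtype=float)
true_counts = rng.poisson(type_means[[0, 0, 0, 1, 1, 1]])
dropout = rng.random(true_counts.shape) < 0.3
observed = np.where(dropout, 0, true_counts)

imputed = lowrank_impute(observed, rank=2)
print(np.round(imputed, 1))
```

Restricting the replacement to zero entries keeps the observed measurements intact, so downstream analyses only see changes where data were missing or unexpressed.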

Keywords: Deep learning; Dropout; Low rank matrix completion; Single-cell RNA sequencing; Statistical model.

Publication types

  • English Abstract
  • Review

MeSH terms

  • Cluster Analysis
  • Transcriptome*

Grants and funding

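National Natural Science Foundation of China, National Major Scientific Research Instrument Development Project (81827901); National Key Research and Development Program of China (2022YFF0710800)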