An End-to-End Deep Hybrid Autoencoder Based Method for Single-Cell RNA-Seq Data Analysis

IEEE/ACM Trans Comput Biol Bioinform. 2023 Nov-Dec;20(6):3889-3900. doi: 10.1109/TCBB.2023.3328029. Epub 2023 Dec 25.

Abstract

Single-cell RNA sequencing (scRNA-seq) technology provides powerful support for researchers seeking to understand the complex mechanisms of cells at single-cell resolution. Because single-cell transcriptome data are highly sparse, affected by technical noise, and computationally expensive to process, existing analysis methods cannot effectively extract the fine-grained characteristics of scRNA-seq data and consequently fail to accurately characterize the heterogeneity of individual cells within large cell mixtures. To address these shortcomings, we propose an end-to-end analysis method called dhaSCA, which integrates graph convolutional neural network (GCN) feature learning and downstream tasks such as classification and imputation into a unified deep learning framework. dhaSCA uses a hybrid GCN-MLP deep autoencoder to capture structural information between cells and to learn a low-dimensional cell representation. It also introduces the downstream tasks as constraints that guide the model toward more accurate cell features. We conducted experiments on eight real RNA-Seq datasets to evaluate the performance of dhaSCA in classification, imputation, clustering, and visualization. The results show that dhaSCA outperforms other state-of-the-art methods in these downstream tasks. dhaSCA is therefore able to obtain a richer representation of cells and provides strong support for the efficient analysis of single-cell data.
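To make the idea of a hybrid GCN-MLP autoencoder concrete, the following is a minimal NumPy sketch of a forward pass. It is not the dhaSCA implementation, and the abstract does not specify these details. The k-nearest-neighbour cell graph, the single-layer encoder and decoder, the ReLU activations, and all function names (knn_adjacency, normalize_adjacency, gcn_layer, forward) are assumptions made for illustration only. Training, and the classification and imputation constraints mentioned above, are omitted.

```python
import numpy as np

def knn_adjacency(X, k):
    """Symmetric k-nearest-neighbour adjacency over cells (rows of X).

    Assumption: the cell graph is built from Euclidean distances; the
    abstract does not say how the graph is constructed.
    """
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    # Pairwise squared Euclidean distances between cells.
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(d2, np.inf)  # a cell is not its own neighbour
    idx = np.argsort(d2, axis=1)[:, :k]
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    return np.maximum(A, A.T)  # symmetrize

def normalize_adjacency(A):
    """Return D^{-1/2} (A + I) D^{-1/2}, the usual GCN normalization."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(A_norm, H, W):
    """One graph convolution: aggregate neighbours, project, apply ReLU."""
    return np.maximum(A_norm @ H @ W, 0.0)

def forward(X, A_norm, W_enc, W_dec):
    """Hypothetical hybrid pass: GCN encoder, then a dense (MLP) decoder."""
    Z = gcn_layer(A_norm, X, W_enc)      # low-dimensional cell embedding
    X_rec = np.maximum(Z @ W_dec, 0.0)   # reconstructed expression
    return Z, X_rec

# Toy usage: 6 cells, 4 genes, 3-dimensional embedding (made-up sizes).
rng = np.random.default_rng(0)
X = rng.poisson(1.0, size=(6, 4)).astype(float)
A_norm = normalize_adjacency(knn_adjacency(X, k=2))
W_enc = rng.normal(size=(4, 3))
W_dec = rng.normal(size=(3, 4))
Z, X_rec = forward(X, A_norm, W_enc, W_dec)
```

In this sketch the graph convolution lets each cell's embedding draw on its neighbours' expression profiles, which is one way structural information between cells can enter the representation, while the dense decoder maps the embedding back to gene space.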

MeSH terms

  • Cluster Analysis
  • Data Analysis
  • Gene Expression Profiling
  • Neural Networks, Computer
  • Research Design*
  • Sequence Analysis, RNA
  • Single-Cell Gene Expression Analysis*