MulCNN: An efficient and accurate deep learning method based on gene embedding for cell type identification in single-cell RNA-seq data

Front Genet. 2023 Apr 4;14:1179859. doi: 10.3389/fgene.2023.1179859. eCollection 2023.

Abstract

Advancements in single-cell sequencing research have revolutionized our understanding of cellular heterogeneity and functional diversity through the analysis of single-cell transcriptomes and genomes. A crucial step in single-cell RNA sequencing (scRNA-seq) analysis is identifying cell types. However, scRNA-seq data are often high dimensional and sparse, and manual cell type identification can be time-consuming, subjective, and poorly reproducible. Consequently, analyzing scRNA-seq data remains a computational challenge. With the increasing availability of well-annotated scRNA-seq datasets, advanced methods are emerging that leverage these annotations to aid cell type identification. Deep learning neural networks have great potential for analyzing single-cell data. This paper proposes MulCNN, a multi-level convolutional neural network that uses a unique cell type-specific gene expression feature extraction method. This method extracts critical features through multi-scale convolution while filtering noise. Extensive testing using datasets from various species and comparisons with popular classification methods show that MulCNN has outstanding performance and offers a new and scalable direction for scRNA-seq analysis.
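
To make the idea of multi-scale convolution more concrete, the following is a minimal NumPy sketch and not the MulCNN architecture itself. It assumes a single cell's expression profile is a 1D vector and slides kernels of several widths over it, followed by a ReLU and global max pooling. The kernel sizes (3, 5, 7), the random untrained weights, the toy Poisson data, and the function names `conv1d_valid` and `multi_scale_features` are all illustrative assumptions that do not come from the paper.

```python
import numpy as np

def conv1d_valid(x, kernel):
    """Slide `kernel` over `x` (cross-correlation, no padding)."""
    # np.correlate in 'valid' mode returns len(x) - len(kernel) + 1 values.
    return np.correlate(x, kernel, mode="valid")

def multi_scale_features(expr, kernel_sizes=(3, 5, 7), seed=0):
    """Return one pooled feature per kernel width (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    feats = []
    for k in kernel_sizes:
        # Random, untrained kernel; a real model would learn these weights.
        kernel = rng.normal(size=k) / np.sqrt(k)
        # ReLU zeroes out negative responses.
        fmap = np.maximum(conv1d_valid(expr, kernel), 0.0)
        # Global max pooling keeps the strongest response at this scale.
        feats.append(fmap.max())
    return np.array(feats)

# Toy sparse, count-like expression vector for one cell (assumed data).
toy_rng = np.random.default_rng(1)
expr = np.log1p(toy_rng.poisson(0.3, size=200).astype(float))

features = multi_scale_features(expr)
print(features.shape)  # one feature per kernel size
```

In a trained network, the kernels at each scale would be learned from annotated cells and the pooled features passed to a classifier; here they only show how different kernel widths summarize the same expression vector at different resolutions.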

Keywords: cell type identification; convolutional neural networks; gene expression feature extraction; scRNA-seq; single-cell sequencing.

Grants and funding

This work was supported by the National Key Research and Development Project of China (2021YFA1000103, 2021YFA1000102), the National Natural Science Foundation of China (Grant Nos. 61873280, 61972416, 62272479, 62202498), the Taishan Scholarship (tsqn201812029), the Foundation of Science and Technology Development of Jinan (201907116), the Shandong Provincial Natural Science Foundation (ZR2021QF023), the Fundamental Research Funds for the Central Universities (21CX06018A), the Spanish project PID2019-106960GB-I00, and Juan de la Cierva IJC2018-038539-I.