scHybridBERT: integrating gene regulation and cell graph for spatiotemporal dynamics in single-cell clustering

Brief Bioinform. 2024 Jan 22;25(2):bbae018. doi: 10.1093/bib/bbae018.

Abstract

Graph learning models have received increasing attention in the computational analysis of single-cell RNA sequencing (scRNA-seq) data. Compared with conventional deep neural networks, graph neural networks and language models have exhibited superior performance by extracting graph-structured data from raw gene count matrices. Established deep neural network-based clustering approaches generally focus on temporal expression patterns while ignoring the inherent interactions at the gene level and the cell level, which can be regarded as spatial dynamics in single-cell data. Both gene-gene and cell-cell interactions can improve cell type detection within a multi-view modeling framework. In this study, spatiotemporal embeddings and cell graphs are extracted to capture spatial dynamics at the molecular level. To improve the accuracy of cell type detection, this study proposes the scHybridBERT architecture, which performs multi-view modeling of scRNA-seq data using the extracted spatiotemporal patterns. In scHybridBERT, graph learning models process the cell graphs and the Performer model processes the spatiotemporal embeddings. Experimental results on benchmark scRNA-seq datasets indicate that scHybridBERT improves the accuracy of single-cell clustering by integrating spatiotemporal embeddings and cell graphs.
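
As a rough illustration of what a "cell graph" extracted from a raw gene count matrix can look like, the sketch below builds a symmetric k-nearest-neighbour graph over cells. The abstract does not say how scHybridBERT constructs its cell graphs, so the k-NN rule, the log-normalisation step, the scale factor of 1e4 and the function name `knn_cell_graph` are all assumptions made for this example, not the authors' method.

```python
import numpy as np

def knn_cell_graph(counts, k=2):
    """Hypothetical helper: symmetric k-NN adjacency from a cells x genes count matrix."""
    counts = np.asarray(counts, dtype=float)
    # Library-size normalise each cell, then log-transform (assumed preprocessing).
    totals = counts.sum(axis=1, keepdims=True)
    totals[totals == 0] = 1.0
    logx = np.log1p(counts / totals * 1e4)
    # Squared Euclidean distance between every pair of cells.
    sq = (logx ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (logx @ logx.T)
    np.fill_diagonal(d2, np.inf)  # a cell is never its own neighbour
    # Indices of the k closest cells for each cell.
    nbrs = np.argsort(d2, axis=1)[:, :k]
    n = counts.shape[0]
    adj = np.zeros((n, n), dtype=bool)
    adj[np.repeat(np.arange(n), k), nbrs.ravel()] = True
    # Symmetrise: keep an edge if either cell lists the other as a neighbour.
    return adj | adj.T

# Toy data: six cells, four genes, two clearly separated expression profiles.
toy_counts = np.array([
    [10, 1, 0, 0],
    [9, 2, 0, 0],
    [8, 3, 0, 0],
    [0, 0, 10, 1],
    [0, 0, 9, 2],
    [0, 0, 8, 3],
])
adj = knn_cell_graph(toy_counts, k=2)
```

An adjacency matrix of this kind is the sort of input a graph learning model (for example a graph attention network, as listed in the keywords) would consume alongside per-cell features. In practice, `k` and the distance metric would need tuning for each dataset.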

Keywords: BERT architecture; cell graphs; graph attention networks; multi-view modeling; spatiotemporal embedding.

MeSH terms

  • Benchmarking*
  • Cell Communication
  • Cluster Analysis
  • Gene Expression Regulation*
  • Learning