Using DenseFly algorithm for cell searching on massive scRNA-seq datasets

BMC Genomics. 2020 Dec 16;21(Suppl 5):222. doi: 10.1186/s12864-020-6651-8.

Abstract

Background: High-throughput single-cell transcriptomic technology produces massive high-dimensional data, enabling high-resolution cell type definition and identification. To uncover the expression patterns within these data, an algorithm for searching the transcriptional landscape at single-cell resolution is desirable.

Results: We explored the feasibility of using the DenseFly algorithm for cell searching on scRNA-seq data. DenseFly is a locality-sensitive hashing algorithm inspired by the fruit fly olfactory system. Our experiments indicate that DenseFly outperforms the baseline methods FlyHash and SimHash in classification tasks, and that its performance is robust to dropout events and batch effects.
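
To make the idea of hash-based cell searching concrete, the following is a minimal sketch of a fly-inspired locality-sensitive hash. It is not the authors' implementation. It assumes a sparse binary random projection followed by thresholding at zero, and compares cells by Hamming distance. The function names, parameters, and toy expression values are all hypothetical, and the published DenseFly method includes steps this sketch omits.

```python
import random


def make_projection(n_genes, n_units, n_inputs, seed=0):
    """Sparse binary projection: each hash unit sums a small random subset of genes."""
    rng = random.Random(seed)
    return [rng.sample(range(n_genes), n_inputs) for _ in range(n_units)]


def dense_hash(expression, projection):
    """Mean-center one cell's expression vector, project it, and threshold at zero."""
    mean = sum(expression) / len(expression)
    centered = [value - mean for value in expression]
    return [1 if sum(centered[g] for g in genes) > 0 else 0 for genes in projection]


def hamming(a, b):
    """Number of positions at which two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


def search(query, reference, projection):
    """Index of the reference cell whose hash is closest to the query's hash."""
    q = dense_hash(query, projection)
    dists = [hamming(q, dense_hash(cell, projection)) for cell in reference]
    return min(range(len(reference)), key=dists.__getitem__)


# Toy data (illustrative only): two reference "cells" with opposite expression profiles.
reference_cells = [
    [0, 0, 0, 0, 10, 10, 10, 10],
    [10, 10, 10, 10, 0, 0, 0, 0],
]
projection = make_projection(n_genes=8, n_units=64, n_inputs=2)
query_cell = [9, 10, 11, 10, 0, 1, 0, 0]
best_match = search(query_cell, reference_cells, projection)
```

Because similar expression vectors tend to produce hashes that agree on many bits, the query can be compared with reference cells through inexpensive bit comparisons rather than full-dimensional distance calculations.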

Conclusion: We developed a method for mapping cells across scRNA-seq datasets based on the DenseFly algorithm. It can be an efficient tool for cell atlas searching.

Keywords: Cell searching; DenseFly; Locality sensitive hashing; scRNA-seq.

MeSH terms

  • Algorithms
  • Computational Biology
  • RNA, Small Cytoplasmic*
  • Sequence Analysis, RNA
  • Single-Cell Analysis*

Substances

  • RNA, Small Cytoplasmic