ScSmOP: a universal computational pipeline for single-cell single-molecule multiomics data analysis

Brief Bioinform. 2023 Sep 22;24(6):bbad343. doi: 10.1093/bib/bbad343.

Abstract

Single-cell multiomics techniques have been widely applied to detect the key signatures of cells. These methods have achieved single-molecule resolution and can even reveal spatial localization, providing insight into the genomic, epigenomic and transcriptomic heterogeneity of individual cells. However, they have given rise to new computational challenges in data processing. Here, we describe the Single-cell Single-molecule multiple Omics Pipeline (ScSmOP), a universal pipeline for barcode-indexed single-cell single-molecule multiomics data analysis. ScSmOP implements spaced-seed hash table-based algorithms in C to identify barcodes in ligation-based and synthesis-based barcoding data, followed by data mapping and deconvolution. We demonstrate high reproducibility of data processing between ScSmOP and published pipelines in comprehensive analyses of single-cell omics data (scRNA-seq, scATAC-seq, scARC-seq), single-molecule chromatin interaction data (ChIA-Drop, SPRITE, RD-SPRITE), single-cell single-molecule chromatin interaction data (scSPRITE) and spatial transcriptomic data from various cell types and species. Additionally, ScSmOP processes data more rapidly and is a versatile, efficient, easy-to-use and robust pipeline for single-cell single-molecule multiomics data analysis.
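
The general idea of spaced-seed hash lookup for barcode identification can be illustrated with a minimal sketch. This is not ScSmOP's implementation: it is written in Python rather than C for brevity, and the masks, the toy whitelist and the function names (spaced_key, build_index, identify) are all assumptions made for illustration. Each whitelist barcode is hashed under two complementary masks, so a read barcode with one substitution still shares at least one key with its true barcode; candidates are then verified by Hamming distance.

```python
from collections import defaultdict

def spaced_key(seq, mask):
    """Keep only the bases at positions where the mask is '1'."""
    return "".join(base for base, keep in zip(seq, mask) if keep == "1")

def build_index(whitelist, masks):
    """Map (mask index, spaced key) -> set of whitelist barcodes."""
    index = defaultdict(set)
    for barcode in whitelist:
        for i, mask in enumerate(masks):
            index[(i, spaced_key(barcode, mask))].add(barcode)
    return index

def hamming(a, b):
    """Number of mismatching positions between equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def identify(observed, index, masks, max_mismatches=1):
    """Return the unique whitelist barcode within max_mismatches, else None."""
    if any(len(observed) != len(mask) for mask in masks):
        return None
    candidates = set()
    for i, mask in enumerate(masks):
        # A substitution at a masked-out ('0') position leaves this key unchanged.
        candidates |= index.get((i, spaced_key(observed, mask)), set())
    hits = [b for b in candidates if hamming(observed, b) <= max_mismatches]
    return hits[0] if len(hits) == 1 else None

# Hypothetical 8-base whitelist and two complementary masks (assumptions).
WHITELIST = ["ACGTACGT", "TTGGCCAA", "GATTACAG"]
MASKS = ["10101010", "01010101"]
INDEX = build_index(WHITELIST, MASKS)
```

With these two masks, any single substitution falls on a position ignored by one of the masks, so a lookup costs two hash probes plus a Hamming check on the few candidates, instead of a scan over the whole whitelist. Ambiguous or unmatched barcodes return None in this sketch.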

Keywords: barcode identification; multiomics; pipeline; single cell; single molecule; spaced seed hash.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Chromatin / genetics
  • Data Analysis
  • Genomics*
  • Multiomics*
  • Reproducibility of Results

Substances

  • Chromatin