SCExecute: custom cell barcode-stratified analyses of scRNA-seq data

Bioinformatics. 2023 Jan 1;39(1):btac768. doi: 10.1093/bioinformatics/btac768.

Abstract

Motivation: In single-cell RNA-sequencing (scRNA-seq) data, stratification of sequencing reads by cellular barcode is necessary to study cell-specific features. However, apart from gene expression, the analyses of cell-specific features are not sufficiently supported by available tools designed for high-throughput sequencing data.

Results: We introduce SCExecute, which executes a user-provided command on barcode-stratified single-cell binary alignment map (scBAM) files that are extracted on the fly. SCExecute extracts the alignments with each cell barcode from aligned, pooled single-cell sequencing data. Simple commands, monolithic programs, multi-command shell scripts or complex shell-based pipelines are then executed on each scBAM file. scBAM files can be restricted to specific barcodes and/or genomic regions of interest. We demonstrate SCExecute with two popular variant callers (GATK and Strelka2) executed in shell scripts together with commands for BAM file manipulation and variant filtering, to detect single-cell-specific expressed single nucleotide variants from droplet scRNA-seq data (10X Genomics Chromium System). In conclusion, SCExecute facilitates custom cell-level analyses on barcoded scRNA-seq data using currently available tools and provides an effective solution for studying low (cellular) frequency transcriptome features.
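
The stratify-then-execute idea described in the Results can be illustrated with a minimal Python sketch. This is not SCExecute's code or command-line interface: SCExecute works on BAM files through Pysam, whereas the sketch below reads plain SAM text with the standard library only so that it stays self-contained. The function names (`cell_barcode`, `stratify`, `execute_per_cell`) and the `{file}` and `{barcode}` placeholders are hypothetical, and the sketch assumes the cell barcode is stored in the `CB` tag, as in 10x Genomics Cell Ranger output.

```python
import shlex
import subprocess
import tempfile
from collections import defaultdict
from pathlib import Path


def cell_barcode(sam_line, tag="CB"):
    """Return the string value of `tag` on a SAM record, or None if absent."""
    # Optional tags start after the 11 mandatory SAM columns.
    for field in sam_line.rstrip("\n").split("\t")[11:]:
        if field.startswith(tag + ":Z:"):
            return field[len(tag) + 3:]
    return None


def stratify(sam_lines, barcodes=None, tag="CB"):
    """Split SAM text into header lines and alignment lines per barcode.

    Records without the tag are skipped. If `barcodes` is given, only
    those barcodes are kept (a simple stand-in for barcode restriction).
    """
    header, by_cell = [], defaultdict(list)
    for line in sam_lines:
        line = line.rstrip("\n") + "\n"
        if line.startswith("@"):
            header.append(line)
            continue
        bc = cell_barcode(line, tag)
        if bc is None or (barcodes is not None and bc not in barcodes):
            continue
        by_cell[bc].append(line)
    return header, dict(by_cell)


def execute_per_cell(sam_lines, command, barcodes=None):
    """Run `command` once per barcode and collect its standard output.

    `command` is a template; the hypothetical placeholders {file} and
    {barcode} are substituted after the command is split into arguments.
    Commands containing other literal braces are not supported here.
    """
    header, by_cell = stratify(sam_lines, barcodes)
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        for bc, records in sorted(by_cell.items()):
            # Write a temporary per-cell file, analogous to an scBAM.
            path = Path(tmp) / f"{bc}.sam"
            path.write_text("".join(header + records))
            argv = [arg.format(file=str(path), barcode=bc)
                    for arg in shlex.split(command)]
            done = subprocess.run(argv, capture_output=True,
                                  text=True, check=True)
            results[bc] = done.stdout
    return results
```

Calling `execute_per_cell(open("pooled.sam"), "wc -l {file}")` would, under these assumptions, run `wc -l` once on each temporary per-cell file; `pooled.sam` is a placeholder name. Restriction to genomic regions, which SCExecute also supports, is left out of the sketch.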

Availability and implementation: SCExecute is implemented in Python 3 using the Pysam package and distributed for Linux, macOS and Python environments from https://horvathlab.github.io/NGS/SCExecute.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Genomics
  • High-Throughput Nucleotide Sequencing
  • Sequence Analysis, RNA
  • Single-Cell Analysis
  • Single-Cell Gene Expression Analysis*
  • Software*