scCancer: a package for automated processing of single-cell RNA-seq data in cancer

Brief Bioinform. 2021 May 20;22(3):bbaa127. doi: 10.1093/bib/bbaa127.

Abstract

Molecular heterogeneity and complex microenvironments pose great challenges for cancer diagnosis and treatment. Recent advances in single-cell RNA-sequencing (scRNA-seq) technology make it possible to study cancer cell heterogeneity and microenvironments at the single-cell transcriptomic level. Here, we develop an R package named scCancer, which focuses on processing and analyzing scRNA-seq data for cancer research. In addition to basic data processing steps, the package takes several cancer-specific features into account. First, it introduces comprehensive quality control metrics. Second, it uses a data-driven machine learning algorithm to accurately identify major cancer microenvironment cell populations. Third, it estimates a malignancy score to classify malignant (cancerous) and non-malignant cells. Fourth, it analyzes intra-tumor heterogeneity through key cellular phenotypes (such as cell cycle and stemness), gene signatures and cell-cell interactions. In addition, it provides multi-sample data integration analysis with different batch-effect correction strategies. Finally, it generates user-friendly graphical reports for all the analyses. By testing on 56 samples with 433 405 cells in total, we demonstrated its good performance. The package is available at: http://lifeome.net/software/sccancer/.

Keywords: cancer; pipeline; single-cell RNA-sequencing.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Databases, Nucleic Acid
  • Gene Expression Regulation, Neoplastic*
  • Humans
  • Machine Learning*
  • Neoplasms*
  • RNA, Neoplasm* / biosynthesis
  • RNA, Neoplasm* / genetics
  • RNA-Seq*
  • Single-Cell Analysis*
  • Software*

Substances

  • RNA, Neoplasm