CIARA: a cluster-independent algorithm for identifying markers of rare cell types from single-cell sequencing data

Development. 2023 Jun 1;150(11):dev201264. doi: 10.1242/dev.201264. Epub 2023 Jun 8.

Abstract

A powerful feature of single-cell genomics is the ability to identify cell types from their molecular profiles. In particular, identifying novel rare cell types and their marker genes is a key strength of single-cell RNA sequencing. Standard clustering approaches perform well in identifying relatively abundant cell types, but tend to miss rarer ones. Here, we have developed CIARA (Cluster Independent Algorithm for the identification of markers of RAre cell types), a cluster-independent computational tool designed to select genes that are likely to be markers of rare cell types. Genes selected by CIARA are subsequently integrated with common clustering algorithms to single out groups of rare cell types. CIARA outperforms existing methods for rare cell type detection, and we use it to find previously uncharacterized rare populations of cells in a human gastrula and among mouse embryonic stem cells treated with retinoic acid. Moreover, CIARA can be applied more generally to any type of single-cell omic data, thus allowing the identification of rare cells across multiple data modalities. We provide implementations of CIARA in user-friendly packages available in R and Python.
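
The abstract does not spell out how genes are scored, so the following is only a minimal sketch of how a cluster-independent marker search could work. It rests on assumptions that are not taken from the abstract: expression is binarized, each cell's neighborhood is the cell plus its k nearest neighbors, and a gene is scored by the smallest hypergeometric tail probability of its expression within any single neighborhood. All function names, the toy data, and the 0.001 cutoff are hypothetical, and this is not the API of the published R or Python packages.

```python
import math

def knn_neighborhoods(points, k):
    """For each cell, return the cell itself followed by its k nearest neighbors."""
    hoods = []
    for i, p in enumerate(points):
        # Sort key puts the cell itself first, then the others by Euclidean distance.
        order = sorted(range(len(points)),
                       key=lambda j: (j != i, math.dist(p, points[j])))
        hoods.append(order[:k + 1])
    return hoods

def hypergeom_tail(n_cells, n_expressing, hood_size, observed):
    """P(X >= observed) when hood_size cells are drawn without replacement
    from n_cells cells, n_expressing of which express the gene."""
    total = math.comb(n_cells, hood_size)
    upper = min(n_expressing, hood_size)
    hits = sum(math.comb(n_expressing, i) *
               math.comb(n_cells - n_expressing, hood_size - i)
               for i in range(observed, upper + 1))
    return hits / total

def local_enrichment_score(expressed, hoods):
    """Smallest neighborhood p-value for one binarized gene (lower = more localized)."""
    n_cells = len(expressed)
    n_expressing = sum(expressed)
    return min(
        hypergeom_tail(n_cells, n_expressing, len(hood),
                       sum(expressed[j] for j in hood))
        for hood in hoods
    )

# Toy data (hypothetical): 56 abundant cells on a grid, 4 rare cells far away.
abundant = [(float(x), float(y)) for x in range(8) for y in range(7)]
rare = [(20.0 + 0.1 * i, 20.0) for i in range(4)]
points = abundant + rare
n_abundant = len(abundant)

# Binarized expression (1 = expressed) for three hypothetical genes.
genes = {
    # Expressed only in the 4 rare cells.
    "rare_marker": [0] * n_abundant + [1] * 4,
    # Expressed in 4 abundant cells that are far apart (the grid corners).
    "scattered": [1 if i in (0, 6, 49, 55) else 0 for i in range(n_abundant)] + [0] * 4,
    # Expressed in every abundant cell.
    "abundant_marker": [1] * n_abundant + [0] * 4,
}

hoods = knn_neighborhoods(points, k=3)
scores = {name: local_enrichment_score(expr, hoods) for name, expr in genes.items()}

# Hypothetical cutoff: keep genes whose expression is concentrated in some neighborhood.
selected = [name for name, score in scores.items() if score < 0.001]

for name, score in scores.items():
    print(f"{name}: {score:.3g}")
print("selected:", selected)
```

In this toy setting only the gene confined to the four neighboring rare cells passes the cutoff. Genes selected this way would then be handed to a standard clustering step, as the abstract describes; that step is not shown here.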

Keywords: Computational method; Rare cell types; Single-cell sequencing.

MeSH terms

  • Algorithms*
  • Animals
  • Cluster Analysis
  • Gene Expression Profiling / methods
  • Humans
  • Mice
  • Sequence Analysis, RNA / methods
  • Single-Cell Analysis* / methods