BANKSY unifies cell typing and tissue domain segmentation for scalable spatial omics data analysis

Nat Genet. 2024 Mar;56(3):431-441. doi: 10.1038/s41588-024-01664-3. Epub 2024 Feb 27.

Abstract

Spatial omics data are clustered to define both cell types and tissue domains. We present Building Aggregates with a Neighborhood Kernel and Spatial Yardstick (BANKSY), an algorithm that unifies these two spatial clustering problems by embedding cells in a product space of their own and the local neighborhood transcriptome, representing cell state and microenvironment, respectively. BANKSY's spatial feature augmentation strategy improved performance on both tasks when tested on diverse RNA (imaging, sequencing) and protein (imaging) datasets. BANKSY revealed unexpected niche-dependent cell states in the mouse brain and outperformed competing methods on domain segmentation and cell typing benchmarks. BANKSY can also be used for quality control of spatial transcriptomics data and for spatially aware batch effect correction. Importantly, it is substantially faster and more scalable than existing methods, enabling the processing of datasets containing millions of cells. In summary, BANKSY provides an accurate, biologically motivated, scalable and versatile framework for analyzing spatially resolved omics data.
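
The product-space embedding described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `neighbor_augment`, the parameters `k` and `lam`, the unweighted k-nearest-neighbor mean, and the square-root weighting are all assumptions chosen for illustration. The sketch only shows the general idea of concatenating each cell's own expression with a summary of its spatial neighbors' expression, so that a standard clustering method can then be run on the augmented matrix.

```python
import numpy as np

def neighbor_augment(expr, coords, k=3, lam=0.2):
    """Hypothetical sketch of spatial feature augmentation.

    expr   : (n_cells, n_genes) expression matrix
    coords : (n_cells, n_dims) spatial coordinates
    k      : number of spatial nearest neighbors (assumed, must be < n_cells)
    lam    : mixing weight in [0, 1] between own and neighborhood
             expression (assumed parameterization)
    """
    expr = np.asarray(expr, dtype=float)
    coords = np.asarray(coords, dtype=float)

    # Pairwise squared Euclidean distances between cell positions.
    diff = coords[:, None, :] - coords[None, :, :]
    d2 = (diff ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)  # a cell is not its own neighbor

    # Indices of the k nearest spatial neighbors of each cell.
    nbr_idx = np.argsort(d2, axis=1)[:, :k]

    # Unweighted mean expression over each cell's neighbors
    # (a simplification; a distance-based kernel could be used instead).
    nbr_mean = expr[nbr_idx].mean(axis=1)

    # Concatenate own and neighborhood features, weighted by lam.
    return np.hstack([np.sqrt(1.0 - lam) * expr, np.sqrt(lam) * nbr_mean])
```

In this sketch, `lam = 0` zeroes out the neighborhood block, so clustering depends only on each cell's own expression, while larger values give the neighborhood block more weight. The dense pairwise distance matrix is used only for brevity and would not scale to large datasets; a spatial index would be needed there.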

MeSH terms

  • Algorithms*
  • Animals
  • Benchmarking*
  • Data Analysis
  • Gene Expression Profiling
  • Mice
  • RNA
  • Transcriptome

Substances

  • RNA