SOMDE: a scalable method for identifying spatially variable genes with self-organizing map

Bioinformatics. 2021 Dec 7;37(23):4392-4398. doi: 10.1093/bioinformatics/btab471.

Abstract

Motivation: Recent developments in spatial transcriptomic sequencing technologies provide powerful tools for understanding cells in the physical context of tissue microenvironments. A fundamental task in spatial gene expression analysis is to identify genes with spatially variable expression patterns, or spatially variable genes (SVgenes). Several computational methods have been developed for this task, but their high computational complexity limits their scalability to current and future large-scale spatial expression data.

Results: We present SOMDE, an efficient method for identifying SVgenes in large-scale spatial expression data. SOMDE uses a self-organizing map to cluster neighboring cells into nodes, and then uses a Gaussian process to fit the node-level spatial gene expression to identify SVgenes. Experiments show that SOMDE is about 5-50 times faster than existing methods while giving comparable results. The adjustable resolution of SOMDE makes it the only method that can give results in about 5 min on large datasets of more than 20 000 sequencing sites. SOMDE is available as a Python package on PyPI at https://pypi.org/project/somde and is free for academic use.
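
The two-stage idea described in the Results (condense sites into nodes with a self-organizing map, then score node-level expression with a Gaussian process) can be illustrated with a minimal sketch in Python using only NumPy. This is not the SOMDE package or its API. The function names (`train_som`, `condense`, `spatial_llr`), the per-node mean aggregation, the fixed RBF length scale and noise fraction, the log-likelihood-ratio score and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def train_som(coords, grid=(5, 5), n_iter=2000, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small 2-D self-organizing map on site coordinates (illustrative)."""
    rng = np.random.default_rng(seed)
    gx, gy = grid
    # Fixed position of every node on the SOM lattice, shape (gx * gy, 2).
    lattice = np.array([(i, j) for i in range(gx) for j in range(gy)], dtype=float)
    # Initialise node weights from randomly chosen sites.
    weights = coords[rng.choice(len(coords), size=gx * gy)].astype(float)
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                 # decaying learning rate
        sigma = sigma0 * (1.0 - frac) + 0.5     # shrinking neighbourhood width
        x = coords[rng.integers(len(coords))]   # one random site per step
        bmu = np.argmin(((weights - x) ** 2).sum(axis=1))  # best-matching unit
        d2 = ((lattice - lattice[bmu]) ** 2).sum(axis=1)
        weights += lr * np.exp(-d2 / (2.0 * sigma**2))[:, None] * (x - weights)
    return weights


def condense(coords, expr, weights):
    """Assign each site to its nearest node and average expression per node.

    Averaging is an assumption of this sketch; empty nodes are dropped.
    """
    d2 = ((coords[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    used = np.unique(labels)
    node_expr = np.array([expr[labels == k].mean() for k in used])
    return weights[used], node_expr, labels


def spatial_llr(node_xy, y, length_scale=0.3, noise=0.2):
    """Log-likelihood ratio: RBF-kernel Gaussian process vs. pure white noise.

    Hyperparameters are fixed here rather than fitted, to keep the sketch short.
    """
    y = (y - y.mean()) / y.std()
    n = len(y)
    d2 = ((node_xy[:, None, :] - node_xy[None, :, :]) ** 2).sum(axis=2)
    K = (1.0 - noise) * np.exp(-d2 / (2.0 * length_scale**2)) + noise * np.eye(n)
    _, logdet = np.linalg.slogdet(K)
    ll_spatial = -0.5 * (y @ np.linalg.solve(K, y) + logdet + n * np.log(2 * np.pi))
    ll_null = -0.5 * (y @ y + n * np.log(2 * np.pi))
    return ll_spatial - ll_null


# Synthetic data (hypothetical): 400 sites in the unit square, two genes.
rng = np.random.default_rng(1)
coords = rng.random((400, 2))
gene_gradient = coords[:, 0] + 0.05 * rng.standard_normal(400)  # spatial trend
gene_noise = rng.standard_normal(400)                           # no spatial trend

weights = train_som(coords)
node_xy, y_gradient, labels = condense(coords, gene_gradient, weights)
_, y_noise, _ = condense(coords, gene_noise, weights)

llr_gradient = spatial_llr(node_xy, y_gradient)
llr_noise = spatial_llr(node_xy, y_noise)
print(f"nodes used: {len(node_xy)}")
print(f"LLR gradient gene: {llr_gradient:.2f}, LLR noise gene: {llr_noise:.2f}")
```

In this sketch the Gaussian process is evaluated on at most 25 nodes instead of 400 sites, which is where the speed-up of a condensed representation would come from. Changing `grid` trades spatial resolution for run time, loosely mirroring the adjustable resolution mentioned above.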

Availability and implementation: SOMDE is available for download from PyPI, and the source code is openly available from the GitHub repository https://github.com/XuegongLab/somde.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Computational Biology* / methods
  • Normal Distribution
  • Software