Nonnegative spatial factorization applied to spatial genomics

F William Townes; Barbara E Engelhardt

doi:10.1038/s41592-022-01687-w

Nat Methods. 2023 Feb;20(2):229-238. doi: 10.1038/s41592-022-01687-w. Epub 2022 Dec 31.

Authors

F William Townes¹, Barbara E Engelhardt² ³

Affiliations

¹ Department of Statistics and Data Science, Carnegie Mellon University, Pittsburgh, PA, USA. ftownes@andrew.cmu.edu.
² Data Science and Biotechnology Institute, Gladstone Institutes, San Francisco, CA, USA. barbara.engelhardt@gladstone.ucsf.edu.
³ Department of Biomedical Data Science, Stanford University, Stanford, CA, USA. barbara.engelhardt@gladstone.ucsf.edu.

Abstract

Nonnegative matrix factorization (NMF) is widely used to analyze high-dimensional count data because, in contrast to real-valued alternatives such as factor analysis, it produces an interpretable parts-based representation. However, in applications such as spatial transcriptomics, NMF fails to incorporate known structure between observations. Here, we present nonnegative spatial factorization (NSF), a spatially aware probabilistic dimension reduction model based on transformed Gaussian processes that naturally encourages sparsity and scales to tens of thousands of observations. NSF recovers ground truth factors more accurately than real-valued alternatives such as MEFISTO in simulations, and has lower out-of-sample prediction error than probabilistic NMF on three spatial transcriptomics datasets from mouse brain and liver. Since not all patterns of gene expression have spatial correlations, we also propose a hybrid extension of NSF that combines spatial and nonspatial components, enabling quantification of spatial importance for both observations and features. A TensorFlow implementation of NSF is available from https://github.com/willtownes/nsf-paper.
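The idea of building nonnegative spatial factors from transformed Gaussian processes can be illustrated with a small simulation. The sketch below is not the authors' implementation (that is the TensorFlow code at the repository above). It assumes a squared-exponential kernel, an exponential transform, gamma-distributed loadings, and a Poisson likelihood, none of which are specified in the abstract, and all function and parameter names are hypothetical.

```python
import numpy as np

def rbf_kernel(coords, lengthscale=0.2, variance=1.0):
    """Squared-exponential covariance between all pairs of 2-D locations.

    The kernel choice is an assumption made for illustration only.
    """
    sq_dists = np.sum((coords[:, None, :] - coords[None, :, :]) ** 2, axis=-1)
    return variance * np.exp(-0.5 * sq_dists / lengthscale**2)

def simulate_nsf_like(n_obs=100, n_genes=20, n_factors=3, seed=0):
    """Draw counts from a toy model with spatially smooth, nonnegative factors."""
    rng = np.random.default_rng(seed)

    # Random spatial locations in the unit square (hypothetical layout).
    coords = rng.uniform(0.0, 1.0, size=(n_obs, 2))

    # Sample each latent factor from a zero-mean GP over the locations.
    # A small jitter on the diagonal keeps the Cholesky factorization stable.
    K = rbf_kernel(coords) + 1e-6 * np.eye(n_obs)
    chol = np.linalg.cholesky(K)
    gp_draws = chol @ rng.standard_normal((n_obs, n_factors))

    # Transform the real-valued GP draws so the factors are nonnegative
    # (exponential transform assumed here).
    factors = np.exp(gp_draws)

    # Nonnegative gene loadings; a gamma shape below 1 pushes many toward zero.
    loadings = rng.gamma(shape=0.5, scale=1.0, size=(n_genes, n_factors))

    # Poisson counts whose rate is the nonnegative low-rank product.
    rate = factors @ loadings.T
    counts = rng.poisson(rate)
    return coords, factors, loadings, counts

coords, factors, loadings, counts = simulate_nsf_like()
```

Because both the factors and the loadings are nonnegative, each gene's expected count is an additive combination of spatial patterns, which is the parts-based property the abstract attributes to NMF. The spatial smoothness comes only from the GP prior on the factors.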

Publication types

Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

Algorithms*
Animals
Gene Expression Profiling* / methods
Genomics
Mice
Models, Statistical
