Overcoming false-positive gene-category enrichment in the analysis of spatially resolved transcriptomic brain atlas data

Nat Commun. 2021 May 11;12(1):2669. doi: 10.1038/s41467-021-22862-1.

Abstract

Transcriptomic atlases have improved our understanding of the correlations between gene-expression patterns and spatially varying properties of brain structure and function. Gene-category enrichment analysis (GCEA) is a common method to identify functional gene categories that drive these associations, using gene-to-category annotation systems like the Gene Ontology (GO). Here, we show that standard GCEA methodology, when applied to spatial transcriptomic data, is affected by substantial false-positive bias: GO categories display, on average, an over 500-fold inflation of false-positive associations with random neural phenotypes in mouse and human. The estimated false-positive rate of a GO category is associated with its rate of being reported as significantly enriched in the literature, suggesting that published reports are affected by this false-positive bias. We show that within-category gene-gene coexpression and spatial autocorrelation are key drivers of the false-positive bias and introduce flexible ensemble-based null models that can account for these effects, made available as a software toolbox.
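The contrast between a conventional random-gene null and an ensemble-based null can be illustrated with a small simulation. The sketch below is not the authors' toolbox. It assumes synthetic Gaussian expression data, a single hypothetical category whose genes share a latent factor (to mimic within-category coexpression), a category score defined as the mean Pearson correlation with the phenotype, and an ensemble of spatially independent random phenotypes. It does not model spatial autocorrelation. All function names and parameter values are illustrative.

```python
import numpy as np

def category_score(expr, phenotype, gene_idx):
    """Mean Pearson correlation between a phenotype and each gene in a category."""
    sub = expr[:, gene_idx]
    sub_z = (sub - sub.mean(axis=0)) / sub.std(axis=0)
    phen_z = (phenotype - phenotype.mean()) / phenotype.std()
    gene_r = (sub_z * phen_z[:, None]).mean(axis=0)  # Pearson r per gene
    return float(gene_r.mean())

def p_value(observed, null_scores):
    """Two-sided permutation p-value with the usual +1 correction."""
    null_scores = np.asarray(null_scores)
    exceed = np.sum(np.abs(null_scores) >= abs(observed))
    return float((exceed + 1) / (len(null_scores) + 1))

def random_gene_null(expr, phenotype, set_size, n_null, rng):
    """Conventional null: scores of random gene sets for the same phenotype."""
    n_genes = expr.shape[1]
    return [
        category_score(expr, phenotype, rng.choice(n_genes, set_size, replace=False))
        for _ in range(n_null)
    ]

def ensemble_null(expr, gene_idx, n_null, rng):
    """Ensemble null: scores of the same category for random phenotypes."""
    n_regions = expr.shape[0]
    return [
        category_score(expr, rng.standard_normal(n_regions), gene_idx)
        for _ in range(n_null)
    ]

def simulate(seed=0, n_regions=40, n_genes=200, set_size=20,
             n_phenotypes=100, n_null=99, alpha=0.05):
    rng = np.random.default_rng(seed)
    expr = rng.standard_normal((n_regions, n_genes))
    # Assumption: the first `set_size` genes form one coexpressed category.
    category = np.arange(set_size)
    latent = rng.standard_normal(n_regions)
    expr[:, category] += 2.0 * latent[:, None]

    # The ensemble null depends only on the category, so it is computed once.
    ens_null = ensemble_null(expr, category, n_null, rng)

    hits_gene = hits_ens = 0
    for _ in range(n_phenotypes):
        phenotype = rng.standard_normal(n_regions)  # random, so any hit is false
        observed = category_score(expr, phenotype, category)
        gene_null = random_gene_null(expr, phenotype, set_size, n_null, rng)
        hits_gene += p_value(observed, gene_null) < alpha
        hits_ens += p_value(observed, ens_null) < alpha
    return hits_gene / n_phenotypes, hits_ens / n_phenotypes

fpr_random_gene, fpr_ensemble = simulate()
print(f"False-positive rate, random-gene null: {fpr_random_gene:.2f}")
print(f"False-positive rate, ensemble null:    {fpr_ensemble:.2f}")
```

Because the phenotypes are random, every significant result is a false positive. Under these assumptions the coexpressed category should be flagged far more often by the random-gene null than by the ensemble null, since random gene sets lack the shared variance that widens the category's score distribution.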

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Brain / metabolism*
  • Gene Expression Profiling / methods*
  • Gene Ontology*
  • Humans
  • Male
  • Mice
  • Mice, Inbred C57BL
  • Molecular Sequence Annotation / methods*
  • Reproducibility of Results
  • Software