blitzGSEA: efficient computation of gene set enrichment analysis through gamma distribution approximation

Bioinformatics. 2022 Apr 12;38(8):2356-2357. doi: 10.1093/bioinformatics/btac076.

Abstract

Motivation: The identification of pathways and biological processes from differential gene expression is central to the interpretation of data collected by transcriptomics assays. Gene set enrichment analysis (GSEA) is the most commonly used algorithm for calculating how significantly an annotated gene set is associated with a differential expression signature. To compute significance, GSEA implements permutation tests, which are slow and inaccurate when comparing many differential expression signatures to thousands of annotated gene sets.

Results: Here, we present blitzGSEA, an algorithm that is based on the same running sum statistic as GSEA, but instead of performing permutations, blitzGSEA approximates the enrichment score probabilities based on Gamma distributions. blitzGSEA achieves significant improvement in performance compared with prior GSEA implementations, while approximating small P-values more accurately.
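
The idea described above can be illustrated with a minimal sketch. It is not the blitzGSEA package code: the function names, the method-of-moments fit, the use of random gene sets as the null, and the toy signature are all assumptions made for illustration, and only positive enrichment scores are handled. It computes a GSEA-style running sum statistic, fits a Gamma distribution to a small sample of null scores, and reads the P-value from the Gamma tail rather than from the permutation counts.

```python
import math
import random


def enrichment_score(ranked_genes, weights, gene_set):
    """Signed maximum deviation of a GSEA-style weighted running sum."""
    hits = [g in gene_set for g in ranked_genes]
    hit_norm = sum(abs(w) for w, h in zip(weights, hits) if h)
    miss_step = 1.0 / (len(ranked_genes) - sum(hits))
    running = best = 0.0
    for w, h in zip(weights, hits):
        # Step up by the normalized weight at a hit, step down at a miss.
        running += abs(w) / hit_norm if h else -miss_step
        if abs(running) > abs(best):
            best = running
    return best


def fit_gamma_moments(values):
    """Method-of-moments Gamma fit (an assumed, simple fitting choice)."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / (len(values) - 1)
    return mean * mean / var, var / mean  # shape, scale


def gamma_sf(x, shape, scale):
    """Upper tail P(X > x) of Gamma(shape, scale), standard library only."""
    a, z = shape, x / scale
    if z <= 0.0:
        return 1.0
    log_prefix = a * math.log(z) - z - math.lgamma(a)
    if z < a + 1.0:
        # Power series for the lower regularized incomplete gamma function.
        n, term = a, 1.0 / a
        total = term
        for _ in range(1000):
            n += 1.0
            term *= z / n
            total += term
            if abs(term) < abs(total) * 1e-15:
                break
        return 1.0 - total * math.exp(log_prefix)
    # Continued fraction (modified Lentz) for the upper tail; this keeps
    # relative accuracy when the tail probability is very small.
    tiny = 1e-300
    b = z + 1.0 - a
    c, d = 1.0 / tiny, 1.0 / b
    h = d
    for i in range(1, 1000):
        an = -i * (i - a)
        b += 2.0
        d = an * d + b
        d = tiny if abs(d) < tiny else d
        c = b + an / c
        c = tiny if abs(c) < tiny else c
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < 1e-15:
            break
    return h * math.exp(log_prefix)


def gamma_pvalue(ranked_genes, weights, gene_set, n_null=200, seed=0):
    """Approximate P-value for a positive score from a Gamma-fitted null."""
    rng = random.Random(seed)
    observed = enrichment_score(ranked_genes, weights, gene_set)
    null = []
    for _ in range(n_null):
        random_set = set(rng.sample(ranked_genes, len(gene_set)))
        es = enrichment_score(ranked_genes, weights, random_set)
        if es > 0.0:
            null.append(es)
    shape, scale = fit_gamma_moments(null)
    return observed, gamma_sf(observed, shape, scale)


# Toy signature (hypothetical data): 1000 ranked genes, set of the top 20.
genes = [f"g{i}" for i in range(1000)]
weights = [500.0 - i for i in range(1000)]
es, p = gamma_pvalue(genes, weights, set(genes[:20]))
print(f"ES = {es:.3f}, approximate P = {p:.3g}")
```

With only 200 null samples, an empirical permutation P-value cannot fall below 1/200, whereas the fitted Gamma tail can return much smaller values. This is the property the sketch is meant to show; it makes no claim about the accuracy or speed of the published implementation.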

Availability and implementation: The data, a Python package with all source code, and a detailed user guide are available on GitHub at: https://github.com/MaayanLab/blitzgsea.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms*
  • Gene Expression Profiling
  • Probability
  • Software*