Impact of adaptive filtering on power and false discovery rate in RNA-seq experiments

BMC Bioinformatics. 2022 Sep 24;23(1):388. doi: 10.1186/s12859-022-04928-z.

Abstract

Background: In RNA-sequencing studies, a large number of hypothesis tests are performed to test genes for differential expression between several conditions. Filtering has been proposed to remove candidate genes with low expression levels, which may not be relevant and have little or no chance of showing a difference between conditions. This step may reduce the multiple testing burden and increase power.

Results: We show in a simulation study that filtering can lead to some increase in power for RNA-sequencing data; overly aggressive filtering, however, can lead to a decline. No uniformly optimal filter in terms of power exists: depending on the scenario, different filters may be optimal. We propose an adaptive filtering strategy that selects one of several filters to maximise the number of rejections. No additional adjustment for multiplicity is needed, but a rule must be specified for cases in which the number of rejections is too small.
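
The selection idea described in the Results can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes that the candidate filters are thresholds on a per-gene mean count and that rejections are determined by the Benjamini-Hochberg procedure, neither of which is stated in the abstract. The names `bh_rejections`, `adaptive_filter`, `mean_counts` and `thresholds` are hypothetical. The rule for the case of too few rejections is not specified in the abstract and is therefore left out.

```python
from typing import Dict, List, Sequence


def bh_rejections(pvalues: Sequence[float], alpha: float = 0.05) -> List[int]:
    """Return the indices rejected by the Benjamini-Hochberg step-up procedure."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k = 0  # largest rank whose p-value is at or below its BH threshold
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= alpha * rank / m:
            k = rank
    return sorted(order[:k])


def adaptive_filter(
    pvalues: Sequence[float],
    mean_counts: Sequence[float],
    thresholds: Sequence[float],
    alpha: float = 0.05,
) -> Dict[str, object]:
    """Apply each candidate filter and keep the one with the most rejections.

    Assumption: a filter keeps genes whose mean count is >= its threshold.
    Ties are resolved in favour of the threshold listed first.
    """
    if not thresholds:
        raise ValueError("at least one candidate threshold is required")
    best = None
    for t in thresholds:
        # Indices of the genes that pass this filter
        kept = [i for i, c in enumerate(mean_counts) if c >= t]
        # BH is applied only to the genes that passed, so m shrinks
        local = bh_rejections([pvalues[i] for i in kept], alpha)
        rejected = [kept[j] for j in local]  # map back to original indices
        if best is None or len(rejected) > len(best["rejected"]):
            best = {"threshold": t, "rejected": rejected}
    return best
```

Calling `adaptive_filter(pvalues, mean_counts, thresholds=[0, 5, 25])` returns the selected threshold together with the indices of the rejected genes. A threshold of 0 corresponds to no filtering under the assumption of non-negative counts.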

Conclusions: For a large range of simulation scenarios, the adaptive filter maximises the power while the simulated False Discovery Rate is bounded by the pre-defined significance level. Using the adaptive filter, it is not necessary to pre-specify a single individual filtering method optimised for a specific scenario.

Keywords: Gene expression; Gene filter; Multiple testing; Next generation sequencing.

MeSH terms

  • Computer Simulation
  • Exome Sequencing
  • RNA* / genetics
  • RNA-Seq
  • Sequence Analysis, RNA / methods

Substances

  • RNA