Post hoc power estimation in large-scale multiple testing problems

Bioinformatics. 2010 Apr 15;26(8):1050-6. doi: 10.1093/bioinformatics/btq085. Epub 2010 Feb 25.

Abstract

Background: The statistical power or multiple Type II error rate in large-scale multiple testing problems, such as gene expression microarray experiments, depends on typically unknown parameters and is therefore difficult to assess a priori. However, it has been suggested that the multiple Type II error rate can be estimated post hoc from the observed data.

Methods: We consider a class of post hoc estimators that are functions of the estimated proportion of true null hypotheses among all hypotheses. Numerous estimators for this proportion have been proposed and we investigate the statistical properties of the derived multiple Type II error rate estimators in an extensive simulation study.
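As an illustration of what an estimator in this class can look like, the following is a minimal sketch and not the authors' R code. It assumes a Storey-type estimate of the proportion of true nulls with a tuning parameter `lam`, and a simple plug-in estimate of the multiple Type II error rate at a fixed P-value cut-off `threshold`. The function names, the choice of estimator, and the default values are assumptions made only for this example.

```python
def storey_pi0(pvalues, lam=0.5):
    """Storey-type estimate of the proportion of true null hypotheses.

    Assumption: P-values above `lam` come mostly from true nulls, whose
    P-values are uniform on [0, 1].
    """
    m = len(pvalues)
    if m == 0:
        raise ValueError("pvalues must not be empty")
    if not 0.0 <= lam < 1.0:
        raise ValueError("lam must lie in [0, 1)")
    tail = sum(1 for p in pvalues if p > lam)
    # Cap at 1 because a proportion cannot exceed 1.
    return min(1.0, tail / (m * (1.0 - lam)))


def type2_error_estimate(pvalues, threshold=0.05, lam=0.5):
    """Plug-in estimate of the multiple Type II error rate (hypothetical helper).

    Estimated true rejections = observed rejections minus the expected
    number of false rejections (pi0 * m * threshold). Dividing by the
    estimated number of false nulls, m * (1 - pi0), gives an average
    power estimate; the Type II error rate is one minus that value.
    Returns None when no false nulls are estimated to exist.
    """
    m = len(pvalues)
    pi0 = storey_pi0(pvalues, lam)
    m1 = m * (1.0 - pi0)  # estimated number of false null hypotheses
    if m1 <= 0.0:
        return None
    rejections = sum(1 for p in pvalues if p <= threshold)
    true_rejections = rejections - pi0 * m * threshold
    power = min(1.0, max(0.0, true_rejections / m1))
    return 1.0 - power


# Toy input: 50 evenly spread "null" P-values plus 50 "alternative" ones.
nulls = [(i + 0.5) / 50 for i in range(50)]
alternatives = [0.001] * 30 + [0.3] * 20
example_pvalues = nulls + alternatives
example_pi0 = storey_pi0(example_pvalues)
example_type2 = type2_error_estimate(example_pvalues, threshold=0.1)
```

Because the Type II error estimate is a function of the estimated proportion of true nulls, any bias or variance in that proportion carries over directly, which is why the choice of proportion estimator matters.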

Results: The performance of the estimators in terms of the mean squared error is highly sensitive to the distributional scenario. Estimators based on empirical distributions of the null hypotheses are superior in the presence of strongly correlated test statistics.

Availability: R code to compute all considered estimators based on P-values, together with supplementary material, is available on the authors' web page http://statistics.msi.meduniwien.ac.at/index.php?page=pageszfnr.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Biomarkers, Tumor / chemistry
  • Gene Expression Profiling / methods
  • Genomics / methods*
  • Humans
  • Models, Statistical
  • Oligonucleotide Array Sequence Analysis / methods*

Substances

  • Biomarkers, Tumor