Sample size reassessment for a two-stage design controlling the false discovery rate

Stat Appl Genet Mol Biol. 2015 Nov;14(5):429-42. doi: 10.1515/sagmb-2014-0025.

Abstract

Sample size calculations for gene expression microarray and NGS-RNA-Seq experiments are challenging because the overall power depends on unknown quantities such as the proportion of true null hypotheses and the distribution of the effect sizes under the alternative. We propose a two-stage design with an adaptive interim analysis in which these quantities are estimated from the interim data. The second-stage sample size is then chosen based on these estimates to achieve a specified overall power. The proposed procedure controls the power in all considered scenarios except for very small first-stage sample sizes. The false discovery rate (FDR) is controlled despite the data-dependent choice of sample size. The two-stage design can be a useful tool for determining the sample size of high-dimensional studies when, in the planning phase, there is high uncertainty regarding the expected effect sizes and variability.
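
The two quantities at the center of the abstract, the proportion of true null hypotheses and FDR-controlled rejection, can be illustrated with a minimal sketch. The abstract does not state which estimators or which multiple testing procedure the authors use, so the code below swaps in two standard textbook choices as assumptions: a Storey-type estimate of the null proportion and the Benjamini-Hochberg step-up procedure. The function names `estimate_pi0` and `benjamini_hochberg`, the tuning parameter `lam`, and the example p-values are all hypothetical, and the second-stage sample size rule itself is not reproduced here.

```python
import numpy as np


def estimate_pi0(pvalues, lam=0.5):
    """Storey-type estimate of the proportion of true null hypotheses.

    Assumption: this estimator is an illustration, not necessarily the one
    used in the paper. Null p-values are uniform, so roughly
    pi0 * (1 - lam) of all p-values are expected to exceed lam.
    """
    p = np.asarray(pvalues, dtype=float)
    pi0 = np.mean(p > lam) / (1.0 - lam)
    return min(1.0, float(pi0))


def benjamini_hochberg(pvalues, alpha=0.05):
    """Boolean rejection mask from the Benjamini-Hochberg step-up procedure."""
    p = np.asarray(pvalues, dtype=float)
    m = p.size
    order = np.argsort(p)
    # The i-th smallest p-value is compared with alpha * i / m.
    thresholds = alpha * np.arange(1, m + 1) / m
    below = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        # Reject every hypothesis up to the largest rank meeting its threshold.
        k = np.nonzero(below)[0].max()
        reject[order[: k + 1]] = True
    return reject


# Hypothetical interim p-values, for illustration only.
interim_p = [0.9, 0.001, 0.5, 0.02, 0.011]
pi0_hat = estimate_pi0(interim_p)
rejected = benjamini_hochberg(interim_p, alpha=0.05)
```

In a two-stage design of the kind described, an interim estimate such as `pi0_hat` would be one input to the second-stage sample size calculation, alongside an estimate of the effect size distribution.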

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Data Interpretation, Statistical
  • Gene Expression Profiling
  • High-Throughput Nucleotide Sequencing*
  • Oligonucleotide Array Sequence Analysis
  • ROC Curve
  • Reproducibility of Results
  • Sample Size
  • Sequence Analysis, DNA