Symmetric Directional False Discovery Rate Control

Sarah E Holte; Eva K Lee; Yajun Mei

doi:10.1016/j.stamet.2016.08.002

Stat Methodol. 2016 Dec:33:71-82. doi: 10.1016/j.stamet.2016.08.002. Epub 2016 Aug 24.

Authors

Sarah E Holte¹, Eva K Lee², Yajun Mei²

Affiliations

¹ Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA.
² H. Milton Stewart School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332, USA.

Abstract

This research is motivated by the analysis of a real gene expression data set, with the aim of identifying a subset of "interesting" or "significant" genes for further study. When we blindly applied the standard false discovery rate (FDR) methods, our biology collaborators were suspicious or confused, as the selected list of significant genes was highly unbalanced: there were ten times more under-expressed genes than over-expressed genes. Their concerns led us to realize that the observed two-sample t-statistics were highly skewed and asymmetric, and thus the standard FDR methods might be inappropriate. To tackle this case, we propose a symmetric directional FDR control method that categorizes the genes as "over-expressed" or "under-expressed", pairs "over-expressed" genes with "under-expressed" genes, defines p-values for the gene pairs via column permutations, and then applies the standard FDR method to select "significant" gene pairs instead of "significant" individual genes. We compare our proposed symmetric directional FDR method with the standard FDR method by applying both to simulated data and several well-known real data sets.

Keywords: Column permutation; Directional FDR; False discovery rate; Multiple testing; Symmetric decision; Three-decisions.
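
The pipeline outlined in the abstract (pair genes, obtain pair p-values by column permutation, apply a standard FDR method) can be illustrated with a rough Python sketch. The abstract does not specify the pairing rule, the pair statistic, or the exact permutation scheme, so the choices below are assumptions made purely for illustration and are not taken from the paper: the k-th largest t-statistic is paired with the k-th smallest, a pair is scored by the smaller of its two opposite-direction magnitudes, the null distribution pools scores over all pairs and permutations, and Benjamini-Hochberg stands in for "the standard FDR method". The function names are hypothetical.

```python
import numpy as np

def welch_t(data, labels):
    """Two-sample (Welch) t-statistic per gene; rows are genes, columns are samples."""
    a, b = data[:, labels], data[:, ~labels]
    se = np.sqrt(a.var(axis=1, ddof=1) / a.shape[1] + b.var(axis=1, ddof=1) / b.shape[1])
    return (a.mean(axis=1) - b.mean(axis=1)) / se

def pair_scores(t):
    """ASSUMED pairing rule (not from the paper): pair the k-th largest
    t-statistic with the k-th smallest and score the pair by the smaller
    of the two magnitudes taken in opposite directions."""
    s = np.sort(t)
    n = t.size // 2
    up = s[::-1][:n]   # largest statistics first
    down = -s[:n]      # most negative statistics first, sign flipped
    return np.minimum(up, down)

def permutation_pvalues(data, labels, n_perm=200, seed=0):
    """Pair p-values from column (sample-label) permutations.
    ASSUMPTION: null scores are pooled over all pairs and permutations."""
    rng = np.random.default_rng(seed)
    obs = pair_scores(welch_t(data, labels))
    null = np.concatenate(
        [pair_scores(welch_t(data, rng.permutation(labels))) for _ in range(n_perm)]
    )
    null.sort()
    # Number of null scores greater than or equal to each observed score.
    n_ge = null.size - np.searchsorted(null, obs, side="left")
    return (1 + n_ge) / (1 + null.size)

def benjamini_hochberg(pvals, q=0.05):
    """Benjamini-Hochberg step-up procedure; returns a boolean rejection mask."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    below = p[order] <= q * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.nonzero(below)[0].max()  # largest rank meeting its threshold
        reject[order[:k + 1]] = True
    return reject

# Small simulated example: 200 genes, 6 samples per group (hypothetical data).
rng = np.random.default_rng(1)
labels = np.array([True] * 6 + [False] * 6)
data = rng.normal(size=(200, 12))
data[:10, labels] += 3.0     # 10 over-expressed genes
data[10:20, labels] -= 3.0   # 10 under-expressed genes

pair_p = permutation_pvalues(data, labels, n_perm=100)
selected_pairs = benjamini_hochberg(pair_p, q=0.05)
```

Because each selected unit is a pair containing one gene from each tail, any list built this way has equal numbers of over-expressed and under-expressed genes, which is the symmetry the abstract is after. The paper's actual construction may differ in every detail assumed here.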
