Normalizing a large number of quantitative traits using empirical normal quantile transformation

Bo Peng; Robert K Yu; Kevin L Dehoff; Christopher I Amos

doi:10.1186/1753-6561-1-s1-s156

BMC Proc. 2007;1 Suppl 1(Suppl 1):S156. doi: 10.1186/1753-6561-1-s1-s156. Epub 2007 Dec 18.

Authors

Bo Peng¹, Robert K Yu, Kevin L Dehoff, Christopher I Amos

Affiliation

¹ Department of Epidemiology, The University of Texas, M.D. Anderson Cancer Center, 1155 Pressler Boulevard, Unit 1340, Houston, Texas 77030, USA. bpeng@mdanderson.org

Abstract

Variance-components and regression-based methods are frequently used to map quantitative trait loci. These methods usually assume that trait values are normally distributed, and violation of this assumption can have a detrimental effect on the power and type I error of such analyses. Various transformations can be used, but choosing an appropriate transformation usually requires careful analysis of each individual trait, which is not feasible for data sets with a large number of traits like those in Problem 1 of Genetic Analysis Workshop 15 (GAW15). A semiparametric variance-components method can estimate the transformation along with the model parameters, but existing methods are computationally intensive. In this paper, we propose the use of an empirical normal quantile transformation, which normalizes trait values by applying an inverse normal transformation to their scaled ranks. Despite its simplicity and potential loss of information, this transformation is shown, by extensive simulations, to maintain good power and good control of type I error, even when compared with the semiparametric method. To investigate the impact of such a transformation on real data sets, we apply variance-components and variance-regression methods to the expression data of GAW15 and compare the results before and after transformation.
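
As an illustration of the transformation described above, the following is a minimal Python sketch, not the authors' implementation. The abstract specifies only that scaled ranks are passed through an inverse normal transformation. The scaling rank/(n + 1), the use of average ranks for tied values, and the function name empirical_normal_quantile_transform are assumptions made for this example.

```python
from statistics import NormalDist


def empirical_normal_quantile_transform(values):
    """Map trait values to standard normal quantiles of their scaled ranks.

    Assumptions (not taken from the abstract): ranks are scaled as
    rank / (n + 1), and tied values share the average of their ranks.
    """
    n = len(values)
    if n == 0:
        return []

    # Indices of the observations, sorted by trait value.
    order = sorted(range(n), key=lambda i: values[i])

    # Assign 1-based ranks, averaging over runs of tied values.
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    # Scale ranks into the open interval (0, 1) and apply the inverse
    # standard normal CDF.
    standard_normal = NormalDist()
    return [standard_normal.inv_cdf(r / (n + 1)) for r in ranks]


if __name__ == "__main__":
    # A strongly skewed toy trait; the output depends only on the ranks.
    skewed_trait = [0.1, 0.2, 0.4, 3.0, 250.0]
    print(empirical_normal_quantile_transform(skewed_trait))
```

Because the result depends only on ranks, any strictly increasing rescaling of a trait gives the same transformed values. This is the source of both the robustness and the potential loss of information mentioned in the abstract.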

Grants and funding

R01 ES009912/ES/NIEHS NIH HHS/United States