Assessing gene length biases in gene set analysis of Genome-Wide Association Studies

Peilin Jia; Jian Tian; Zhongming Zhao

doi:10.1504/IJCBDD.2010.038394

Int J Comput Biol Drug Des. 2010;3(4):297-310. doi: 10.1504/IJCBDD.2010.038394. Epub 2011 Feb 4.

Authors

Peilin Jia¹, Jian Tian, Zhongming Zhao

Affiliation

¹ Departments of Biomedical Informatics and Psychiatry, Vanderbilt University Medical Centre, Nashville, Tennessee 37232, USA. peilin.jia@vanderbilt.edu

PMID: 21297229

Abstract

Genome-Wide Association Studies (GWAS) have rapidly become a major genetics approach to studying complex diseases. Although many susceptibility variants and genes have been uncovered by single-marker analysis, gene set based analysis is emerging as a promising approach that aims to detect the joint association of a set of genes with disease. In the available gene set based methods, the smallest P value among the Single Nucleotide Polymorphisms (SNPs) in a gene region is often used to represent the gene-level association signal. This approach may strongly bias the association signal towards long genes. In this study, we propose a resampling strategy that randomly generates genomic intervals across the accessible genomic region to estimate the background distribution of P values at the gene level. When compared with the gene-wise P value in real data, the proportion of random intervals can be used to assess the bias that might be introduced by gene length and, in turn, to help investigators choose appropriate gene set analysis algorithms for their GWAS datasets. Our method uses only summarised GWAS data and needs no permutation, so it is computationally efficient. A computer program is freely available to users.
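
The resampling idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' program: the function names, the parameters, the choice to match each random interval to the gene's length, and the rule of counting random intervals whose smallest P value is at or below the gene's are all assumptions made for illustration. It assumes SNP positions on a single chromosome, sorted in ascending order, with one summary P value per SNP.

```python
import bisect
import random


def min_p_in_interval(positions, pvalues, start, end):
    """Return the smallest P value among SNPs with start <= position <= end.

    `positions` must be sorted ascending and aligned with `pvalues`.
    Returns None when the interval contains no SNP.
    """
    lo = bisect.bisect_left(positions, start)
    hi = bisect.bisect_right(positions, end)
    if lo == hi:
        return None
    return min(pvalues[lo:hi])


def length_bias_proportion(positions, pvalues, gene_start, gene_end,
                           region_start, region_end,
                           n_intervals=1000, rng=None):
    """Hypothetical helper: proportion of random, length-matched intervals
    whose smallest P value is <= the gene's smallest P value.

    Assumption: random intervals have the same length as the gene and are
    placed uniformly within [region_start, region_end].
    """
    rng = rng or random.Random()
    length = gene_end - gene_start
    if region_end - length < region_start:
        raise ValueError("gene is longer than the accessible region")

    gene_p = min_p_in_interval(positions, pvalues, gene_start, gene_end)
    if gene_p is None:
        return None  # no SNP mapped to the gene

    hits = 0
    used = 0
    for _ in range(n_intervals):
        start = rng.randint(region_start, region_end - length)
        p = min_p_in_interval(positions, pvalues, start, start + length)
        if p is None:
            continue  # skip random intervals without any SNP
        used += 1
        if p <= gene_p:
            hits += 1
    return hits / used if used else None
```

Under these assumptions, a proportion close to 1 for a long gene would suggest that its small gene-wise P value is largely what random intervals of that length produce anyway, whereas a small proportion would suggest a signal beyond what length alone explains. Only summary P values are read, so no genotype-level permutation is involved.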

Publication types

Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't

MeSH terms

Algorithms
Genetic Predisposition to Disease*
Genome-Wide Association Study / methods*
Genomics
Humans
Polymorphism, Single Nucleotide*
