Gene-Based Testing of Interactions Using XGBoost in Genome-Wide Association Studies

Front Cell Dev Biol. 2021 Dec 16;9:801113. doi: 10.3389/fcell.2021.801113. eCollection 2021.

Abstract

Among the many statistical methods for identifying gene-gene interactions in qualitative genome-wide association studies, gene-based methods are both statistically powerful and biologically interpretable. However, their detection power is limited by the assumptions they make about the association between traits and single nucleotide polymorphisms. We therefore propose GGInt-XGBoost, a gene-based method derived from XGBoost. Under the assumption that the log odds ratio of the disease trait is additive across a pair of genes when the genes do not interact, the difference in error between XGBoost models fitted with and without an additive constraint indicates a gene-gene interaction. We then use a permutation-based statistical test to assess this difference and obtain a p-value for the significance of the interaction. Experiments on both simulated and real data showed that our approach outperformed previous methods in detecting gene-gene interactions.

Keywords: XGBoost; additive model; gene-based testing; gene–gene interactions; genome-wide association studies.