SLINGER: large-scale learning for predicting gene expression

Kévin Vervier; Jacob J Michaelson

doi:10.1038/srep39360

Sci Rep. 2016 Dec 20:6:39360. doi: 10.1038/srep39360.

Authors

Kévin Vervier¹, Jacob J Michaelson¹

Affiliation

¹ University of Iowa, Carver College of Medicine, Department of Psychiatry, Iowa City, 52242, USA.

Abstract

Recent studies have established that single nucleotide polymorphisms (SNPs) are sufficient to build accurate predictive models of gene expression. Gamazon et al. found that gene expression values predicted from cis-neighborhood SNPs show statistical association with disease status. In this work, we remove the cis-neighborhood constraint during the learning process and propose a novel predictive approach called SLINGER. We demonstrate that models drawing from a genome-wide set of SNPs are able to predict expression for more genes than models built on the cis neighborhood only. Results indicate that these new models significantly improve accuracy for a large number of genes. Because SLINGER relies on a penalized linear model, the number of features used in our models remains comparable to that of the cis-only models. Finally, application of SLINGER to seven Wellcome Trust Case-Control Consortium genome-wide association studies demonstrates that, compared to a cis-only approach, our models lead to associations with greater fidelity to actual gene expression values.
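
The contrast between cis-only and genome-wide models can be illustrated with a minimal sketch. This is not the SLINGER implementation: the abstract says only that a penalized linear model is used, so the L1 (lasso) penalty, the coordinate-descent solver, the penalty value, and the simulated genotypes and effect sizes below are all assumptions chosen for illustration. One gene's expression is simulated from one SNP inside a hypothetical cis window and one outside it, and the same sparse model is fit on each feature set.

```python
import random


def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty (the lasso proximal step)."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def fit_lasso(rows, y, penalty, n_sweeps=100):
    """Coordinate descent for (1/2n)||y - Xb||^2 + penalty * ||b||_1."""
    n, p = len(rows), len(rows[0])
    x_means = [sum(r[j] for r in rows) / n for j in range(p)]
    x = [[r[j] - x_means[j] for j in range(p)] for r in rows]  # centered SNPs
    y_mean = sum(y) / n
    residual = [v - y_mean for v in y]  # all coefficients start at zero
    coef = [0.0] * p
    col_sq = [sum(x[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:  # monomorphic SNP carries no signal
                continue
            # Correlation of SNP j with the residual that excludes its own fit.
            rho = sum(x[i][j] * residual[i] for i in range(n)) / n
            rho += col_sq[j] * coef[j]
            new_coef = soft_threshold(rho, penalty) / col_sq[j]
            delta = new_coef - coef[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= delta * x[i][j]
                coef[j] = new_coef
    intercept = y_mean - sum(c * m for c, m in zip(coef, x_means))
    return coef, intercept


def predict(rows, coef, intercept):
    return [intercept + sum(c * g for c, g in zip(coef, r)) for r in rows]


def r_squared(y_true, y_pred):
    mean = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot


# Simulated data (all values below are illustrative assumptions).
rng = random.Random(0)
N_SAMPLES, N_SNPS, N_CIS = 200, 60, 10  # first N_CIS SNPs play the cis window
CIS_SNP, TRANS_SNP = 2, 40              # one causal SNP inside, one outside
genotypes = [[sum(rng.random() < 0.3 for _ in range(2))  # 0/1/2 allele counts
              for _ in range(N_SNPS)] for _ in range(N_SAMPLES)]
expression = [1.0 * g[CIS_SNP] + 1.5 * g[TRANS_SNP] + rng.gauss(0, 0.5)
              for g in genotypes]

train_x, test_x = genotypes[:150], genotypes[150:]
train_y, test_y = expression[:150], expression[150:]

# Same penalized model, two feature sets: cis window only vs. all SNPs.
coef_cis, b_cis = fit_lasso([g[:N_CIS] for g in train_x], train_y, penalty=0.1)
coef_all, b_all = fit_lasso(train_x, train_y, penalty=0.1)

r2_cis = r_squared(test_y, predict([g[:N_CIS] for g in test_x], coef_cis, b_cis))
r2_all = r_squared(test_y, predict(test_x, coef_all, b_all))
n_used_cis = sum(c != 0.0 for c in coef_cis)
n_used_all = sum(c != 0.0 for c in coef_all)

print(f"cis-only:    R^2 = {r2_cis:.2f}, SNPs used = {n_used_cis}")
print(f"genome-wide: R^2 = {r2_all:.2f}, SNPs used = {n_used_all}")
```

In this toy setting the genome-wide model can recover the SNP outside the cis window, while the L1 penalty keeps the number of selected SNPs small. It says nothing about the effect sizes or model sizes observed in real expression data.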

Publication types

Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't

MeSH terms

Computational Biology
Gene Expression / genetics*
Gene Expression Regulation / genetics*
Genetic Predisposition to Disease / genetics*
Genome-Wide Association Study / methods*
Humans
Models, Theoretical*
Polymorphism, Single Nucleotide / genetics
