Analysis of schizophrenia data using a nonlinear threshold index logistic model

PLoS One. 2014 Oct 17;9(10):e109454. doi: 10.1371/journal.pone.0109454. eCollection 2014.

Abstract

Genetic information, such as single nucleotide polymorphism (SNP) data, is widely recognized as useful for predicting disease risk. However, modeling genetic data, which are often categorical, for disease class prediction is complex and challenging. In this paper, we propose a novel class of nonlinear threshold index logistic models to handle the complex, nonlinear effects of categorical/discrete SNP covariates in schizophrenia class prediction. A maximum likelihood method is suggested for estimating the unknown parameters of the models. Simulation studies demonstrate that the proposed methodology works well for moderate-size samples. The approach is then applied to schizophrenia classification using a real set of SNP data from the Western Australian Family Study of Schizophrenia (WAFSS). Our empirical findings provide evidence that the proposed nonlinear models clearly outperform the widely used linear and tree-based logistic regression models in class prediction of schizophrenia risk with SNP data, in terms of both Type I/II error rates and ROC curves.
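
The abstract compares classifiers by Type I/II error rates and ROC curves. As a minimal sketch of how those two criteria can be computed for any score-based binary classifier, the Python below uses hypothetical labels and predicted probabilities; it does not reproduce the paper's threshold index model or its data, and the function names and the 0.5 cutoff are assumptions made for illustration.

```python
def error_rates(y_true, y_score, threshold=0.5):
    """Return (type_I, type_II) error rates at a given probability cutoff.

    Type I  = share of true controls (label 0) classified as cases.
    Type II = share of true cases (label 1) classified as controls.
    """
    fp = fn = n_neg = n_pos = 0
    for y, s in zip(y_true, y_score):
        predicted_case = s >= threshold
        if y == 1:
            n_pos += 1
            fn += not predicted_case
        else:
            n_neg += 1
            fp += predicted_case
    return fp / n_neg, fn / n_pos


def roc_auc(y_true, y_score):
    """Area under the ROC curve via pairwise comparison of case and control scores.

    Equals the probability that a randomly chosen case scores higher than a
    randomly chosen control, counting ties as one half.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical example: 1 = case, 0 = control, scores are predicted probabilities.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

type_i, type_ii = error_rates(labels, scores)
auc = roc_auc(labels, scores)
print(type_i, type_ii, auc)
```

A lower Type I and Type II error rate at a fixed cutoff, or a larger area under the ROC curve across all cutoffs, indicates better class prediction, which is the sense in which competing models are ranked.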

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Case-Control Studies
  • Genetic Predisposition to Disease
  • Humans
  • Logistic Models
  • Models, Genetic*
  • Polymorphism, Single Nucleotide*
  • Schizophrenia / genetics*

Grants and funding

This work was partially supported by a Discovery Project grant, and Lu's research also by a Future Fellowships grant, from the Australian Research Council. Liang's research was supported by NSF grants DMS-1207444 and DMS-1418042 and by Award Number 11228103 from the National Natural Science Foundation of China. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.