iFish: predicting the pathogenicity of human nonsynonymous variants using gene-specific/family-specific attributes and classifiers

Sci Rep. 2016 Aug 16;6:31321. doi: 10.1038/srep31321.

Abstract

Accurate prediction of the pathogenicity of genomic variants, especially nonsynonymous single nucleotide variants (nsSNVs), is essential in biomedical research and clinical genetics. Most current prediction methods build a generic classifier for all genes. However, different genes and gene families have different features. We investigated whether gene-specific and family-specific customized classifiers could improve prediction accuracy. Customized gene-specific and family-specific attributes were selected with AIC, BIC, and LASSO, and Support Vector Machine classifiers were generated for 254 genes and 152 gene families, covering a total of 5,985 genes. Our results showed that the customized attributes reflected key features of the genes and gene families, and that the customized classifiers achieved higher prediction accuracy than the generic classifier. The customized classifiers, together with the generic classifier for all other genes and families, were integrated into a new tool named iFish (integrated Functional inference of SNVs in human, http://ifish.cbi.pku.edu.cn). iFish outperformed other methods on benchmark datasets as well as on prioritization of candidate causal variants from whole exome sequencing. iFish provides a user-friendly web-based interface and supports other functionalities such as integration of genetic evidence. iFish should facilitate high-throughput evaluation and prioritization of nsSNVs in human genetics research.
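The integration described above, in which customized classifiers are used where available and the generic classifier covers the rest, can be illustrated with a minimal Python sketch. This is not the iFish implementation. The class name `ClassifierRegistry`, the gene and family labels, and the toy scoring functions are all hypothetical. The precedence order (gene-specific first, then family-specific, then generic) is an assumption, since the abstract does not state how a gene covered by both kinds of classifier is handled.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Sequence, Tuple

# A "classifier" here is any callable mapping a feature vector to a score.
# In practice this would wrap a trained model; these are toy stand-ins.
Classifier = Callable[[Sequence[float]], float]


@dataclass
class ClassifierRegistry:
    """Hypothetical lookup of the classifier to apply for a given gene."""
    generic: Classifier
    by_gene: Dict[str, Classifier] = field(default_factory=dict)
    by_family: Dict[str, Classifier] = field(default_factory=dict)
    gene_to_family: Dict[str, str] = field(default_factory=dict)

    def select(self, gene: str) -> Tuple[str, Classifier]:
        # Assumed precedence: gene-specific, then family-specific, then generic.
        if gene in self.by_gene:
            return "gene", self.by_gene[gene]
        family = self.gene_to_family.get(gene)
        if family is not None and family in self.by_family:
            return "family", self.by_family[family]
        return "generic", self.generic

    def score(self, gene: str, features: Sequence[float]) -> float:
        _, classifier = self.select(gene)
        return classifier(features)


# Placeholder identifiers and constant scores, for illustration only.
registry = ClassifierRegistry(
    generic=lambda x: 0.5,
    by_gene={"GENE_A": lambda x: 0.9},
    by_family={"FAMILY_1": lambda x: 0.7},
    gene_to_family={"GENE_B": "FAMILY_1"},
)

for gene in ("GENE_A", "GENE_B", "GENE_C"):
    level, _ = registry.select(gene)
    print(gene, level, registry.score(gene, [0.0]))
```

Here GENE_A has its own classifier, GENE_B falls back to the classifier of its family, and GENE_C, which has neither, is scored by the generic classifier.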

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computational Biology / methods*
  • Databases, Genetic
  • Gene Regulatory Networks
  • Genetic Predisposition to Disease*
  • Humans
  • Multigene Family
  • Polymorphism, Single Nucleotide*
  • Support Vector Machine