DAMpred: Recognizing Disease-Associated nsSNPs through Bayes-Guided Neural-Network Model Built on Low-Resolution Structure Prediction of Proteins and Protein-Protein Interactions

J Mol Biol. 2019 Jun 14;431(13):2449-2459. doi: 10.1016/j.jmb.2019.02.017. Epub 2019 Feb 21.

Abstract

Nearly one-third of non-synonymous single-nucleotide polymorphisms (nsSNPs) are deleterious to human health, but recognition of the disease-associated mutations remains a significant unsolved problem. We proposed a new algorithm, DAMpred, to identify disease-causing nsSNPs through the coupling of evolutionary profiles with structure predictions of proteins and protein-protein interactions. The pipeline was trained by a novel Bayes-guided artificial neural network algorithm that incorporates posterior probabilities of distinct feature classifiers into the network training process. DAMpred was tested on a large-scale data set involving 10,635 nsSNPs from 2154 ORFs in the human genome and recognized disease-associated nsSNPs with an accuracy of 0.80 and a Matthews correlation coefficient of 0.601, which is 9.1% higher than the best of other state-of-the-art methods. In the blind test on the TP53 gene, DAMpred correctly recognized the mutations causative of Li-Fraumeni-like syndrome with a Matthews correlation coefficient that is 27% higher than that of the control methods. The study demonstrates an efficient avenue to quantitatively model the association of nsSNPs with human diseases from low-resolution protein structure prediction, which should prove useful in the diagnosis and treatment of genetic diseases.
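
The abstract reports performance as accuracy and Matthews correlation coefficient (MCC). As a reminder of how these two metrics are defined for a binary classifier (disease-associated = 1, neutral = 0), here is a minimal Python sketch. The function names and the toy labels are illustrative assumptions and are not taken from the DAMpred data set or code.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = disease-associated, 0 = neutral)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Fraction of predictions that are correct."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).

    Returns 0.0 when the denominator is zero (a common convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical toy labels, NOT results from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]

counts = confusion_counts(y_true, y_pred)
print("accuracy:", accuracy(*counts))          # 0.75
print("MCC:", matthews_corrcoef(*counts))      # 0.5
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it remains informative when the disease-associated and neutral classes are imbalanced.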

Keywords: Bayes-guided artificial neural network algorithm; non-synonymous single nucleotide polymorphisms; p53 protein; protein structure prediction; protein–protein interaction.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms
  • Bayes Theorem
  • Computational Biology / methods*
  • Genetic Predisposition to Disease
  • Humans
  • Neural Networks, Computer
  • Polymorphism, Single Nucleotide*
  • Protein Interaction Maps
  • Proteins / chemistry
  • Proteins / genetics*
  • Proteins / metabolism*

Substances

  • Proteins