Contact prediction using mutual information and neural nets

Proteins. 2007;69 Suppl 8:159-64. doi: 10.1002/prot.21791.

Abstract

Prediction of protein structures continues to be a difficult problem, particularly when there are no solved structures for homologous proteins to use as templates. Local structure prediction (secondary structure and burial) is fairly reliable, but does not provide enough information to produce complete three-dimensional structures. Residue-residue contact prediction, though still not highly reliable, may provide a useful guide for assembling local structure predictions into a full tertiary prediction. We develop a neural network that is applied to pairs of residue positions and outputs a probability of contact between the positions. One of the neural net inputs is a novel statistic for detecting correlated mutations: the statistical significance of the mutual information between the corresponding columns of a multiple sequence alignment. This statistic, combined with a second statistic based on the propensity of two amino acid types to be in contact, results in a simple neural network that is a good predictor of contacts. With additional features from amino-acid distributions and local structure predictions, the final neural network predicts contacts better than the other contact predictions submitted at CASP7, including contact predictions derived from fragment-based tertiary models on free-modeling domains. It is still not known whether contact predictions can improve tertiary models on free-modeling domains. Available at http://www.soe.ucsc.edu/research/compbio/SAM_T06/T06-query.html.
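The correlated-mutation statistic described above can be illustrated with a minimal sketch. The abstract does not say how the significance of the mutual information is computed, so the permutation test below is an assumption made for illustration and is not necessarily the authors' procedure. The function names `mutual_information` and `mi_p_value`, the shuffle count, and the toy columns are likewise hypothetical, and the sketch ignores gap handling and sequence weighting.

```python
import math
import random
from collections import Counter


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two equal-length alignment columns."""
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), c in count_ab.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (count_a[a] * count_b[b]))
    return mi


def mi_p_value(col_a, col_b, n_shuffles=1000, seed=0):
    """Empirical p-value of the observed MI under column shuffling.

    Assumption: significance is estimated by permuting one column, which
    preserves each column's residue composition but destroys any pairing.
    """
    rng = random.Random(seed)
    observed = mutual_information(col_a, col_b)
    shuffled = list(col_b)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        # Small tolerance guards against floating-point noise near ties.
        if mutual_information(col_a, shuffled) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_shuffles + 1)


# Toy columns: one residue per aligned sequence (hypothetical data).
covarying_a = "A" * 10 + "C" * 10
covarying_b = "D" * 10 + "E" * 10
independent_a = "AC" * 10
independent_b = "DDEE" * 5

p_covarying = mi_p_value(covarying_a, covarying_b)
p_independent = mi_p_value(independent_a, independent_b)
```

Using a significance value rather than raw mutual information matters because raw mutual information depends on column entropy and alignment depth, so two highly variable columns can score well by chance. In a predictor of the kind the abstract describes, a value like this would be one input per residue pair alongside the contact-propensity statistic.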

Publication types

  • Evaluation Study
  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms
  • Data Interpretation, Statistical
  • Mutation
  • Neural Networks, Computer*
  • Protein Conformation*
  • Proteins / chemistry
  • Sequence Alignment

Substances

  • Proteins