Moment-based prediction of DNA-binding proteins

J Mol Biol. 2004 Jul 30;341(1):65-71. doi: 10.1016/j.jmb.2004.05.058.

Abstract

Net charge, electric dipole moment and quadrupole moment tensors were calculated for 78 amino acid sequences from 62 representative DNA-binding proteins with known structures. The magnitudes of the moments of the electric charge distribution in these chains differ significantly from those of a non-binding control data set. As single-variable predictors without cross-validation, net charge, net dipole moment and quadrupole moment distinguished binding from non-binding proteins with 82.6%, 77.4% and 73.7% accuracy, respectively. Hybrid predictors combining charge with both moments gave the best predictions: 85.6% without cross-validation and 83.9% on the cross-validated data sets. The accuracy obtained with these simple descriptors is competitive with results from more complex models that use many descriptors. Coarse-graining the atomic charges onto Cα atoms did not reduce the prediction accuracy significantly, which suggests that Cα coordinates derived from homology modeling can be used to predict DNA-binding proteins. The speed and accuracy of this method, in combination with homology-based methods of structure prediction, should enhance genome-wide recognition of DNA-binding proteins.
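The three descriptors named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes formal side-chain charges of +1 for Arg and Lys and -1 for Asp and Glu (histidine treated as neutral), places each charge on the residue's C-alpha atom, measures positions from the geometric centre of the C-alpha atoms, and uses the traceless form of the quadrupole tensor. The names `RESIDUE_CHARGE`, `charge_moments`, `vector_norm` and `tensor_norm` are hypothetical, and the abstract does not say which scalar summarises the quadrupole tensor, so the Frobenius norm here is only one possible choice.

```python
import math

# Assumed formal side-chain charges at neutral pH (histidine treated as neutral).
RESIDUE_CHARGE = {"ARG": 1.0, "LYS": 1.0, "ASP": -1.0, "GLU": -1.0}


def charge_moments(residues):
    """Return (net_charge, dipole_vector, quadrupole_tensor).

    residues: list of (three_letter_name, (x, y, z)) pairs, one per C-alpha atom.
    Positions are taken relative to the geometric centre of all C-alpha atoms
    (an assumption; the dipole of a charged molecule depends on the origin).
    """
    n = len(residues)
    centre = [sum(pos[k] for _, pos in residues) / n for k in range(3)]

    net = 0.0
    dipole = [0.0, 0.0, 0.0]
    quad = [[0.0] * 3 for _ in range(3)]
    for name, pos in residues:
        q = RESIDUE_CHARGE.get(name, 0.0)
        if q == 0.0:
            continue  # uncharged residues contribute nothing
        r = [pos[k] - centre[k] for k in range(3)]
        r2 = sum(c * c for c in r)
        net += q
        for a in range(3):
            dipole[a] += q * r[a]
            for b in range(3):
                delta = 1.0 if a == b else 0.0
                # Traceless quadrupole: Q_ab = sum_i q_i (3 r_a r_b - r^2 delta_ab)
                quad[a][b] += q * (3.0 * r[a] * r[b] - r2 * delta)
    return net, dipole, quad


def vector_norm(v):
    """Euclidean length of a 3-vector."""
    return math.sqrt(sum(c * c for c in v))


def tensor_norm(t):
    """Frobenius norm of a 3x3 tensor (one possible scalar summary)."""
    return math.sqrt(sum(c * c for row in t for c in row))


# Toy three-residue input, purely illustrative (not data from the paper).
toy = [("LYS", (1.0, 0.0, 0.0)), ("ASP", (-1.0, 0.0, 0.0)), ("ALA", (0.0, 3.0, 0.0))]
net, dipole, quad = charge_moments(toy)
```

For the toy input the net charge is zero and the dipole points along x with magnitude 2. A single-variable predictor of the kind the abstract describes would then compare one of these scalars against a threshold fitted on binding and non-binding examples.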

Publication types

  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Animals
  • DNA / metabolism*
  • DNA-Binding Proteins / metabolism*
  • Humans
  • Protein Structure, Tertiary
  • Sequence Analysis, Protein*
  • Static Electricity

Substances

  • DNA-Binding Proteins
  • DNA