Prediction of the secondary structure contents of globular proteins based on three structural classes

J Protein Chem. 1998 Apr;17(3):261-72. doi: 10.1023/a:1022588803017.

Abstract

Predicting the secondary structural contents (alpha-helix and beta-strand) of a globular protein is of great use in protein structure prediction. In this paper, a new prediction algorithm is proposed based on Chou's database [Chou (1995), Proteins 21, 319-344]. The new algorithm is an improved multiple linear regression method that takes into account nonlinear and coupling terms of the amino acid frequencies, as well as the length of the protein. The prediction is also based on the structural classes of proteins, but only three classes are considered instead of four: the alpha class, the beta class, and a combined class that merges the alpha+beta and alpha/beta classes, called simply the alphabeta class. Thus the ambiguity that usually occurs between alpha+beta proteins and alpha/beta proteins is eliminated. A resubstitution test of the algorithm gives average absolute errors of 0.040 and 0.035 for the prediction of alpha-helix content and beta-strand content, respectively. A cross-validation test, the jackknife analysis, gives average absolute errors of 0.051 and 0.045 for the prediction of alpha-helix content and beta-strand content, respectively. Both tests indicate the self-consistency and the extrapolating effectiveness of the new algorithm. Compared with other methods, ours has the merits of simplicity and ease of use, as well as high prediction accuracy. Because the prediction of the structural class is incorporated, the only inputs to our method are the amino acid composition and the length of the protein to be predicted.
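The general shape of such a method can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state which nonlinear and coupling terms are used or give the fitted coefficients, so the feature layout below (squared frequencies, caller-chosen pairwise products, and a scaled length term) is an assumption, as are the function names `design_row`, `fit`, `resubstitution_error`, and `jackknife_error` and the small ridge term added for numerical stability. In a class-based scheme, one such model would presumably be fitted separately for each of the three structural classes.

```python
def design_row(freqs, length, pairs=()):
    """Build one regression row: constant, linear, squared, coupling, length.

    The choice of terms is an assumption made for illustration only.
    """
    row = [1.0]
    row += list(freqs)                               # linear frequency terms
    row += [f * f for f in freqs]                    # nonlinear (squared) terms
    row += [freqs[i] * freqs[j] for i, j in pairs]   # coupling terms
    row.append(length / 1000.0)                      # scaled protein length
    return row


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            k = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= k * M[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit(X, y, ridge=1e-9):
    """Least-squares coefficients from the normal equations.

    The tiny ridge term is an assumption added to keep the system solvable.
    """
    p = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) + (ridge if i == j else 0.0)
          for j in range(p)] for i in range(p)]
    b = [sum(r[i] * t for r, t in zip(X, y)) for i in range(p)]
    return solve(A, b)


def predict(w, row):
    """Predicted content for one protein given coefficients w."""
    return sum(wi * xi for wi, xi in zip(w, row))


def resubstitution_error(X, y):
    """Average absolute error when predicting the training proteins themselves."""
    w = fit(X, y)
    return sum(abs(predict(w, r) - t) for r, t in zip(X, y)) / len(y)


def jackknife_error(X, y):
    """Average absolute error when each protein is left out of the fit in turn."""
    total = 0.0
    for k in range(len(y)):
        w = fit(X[:k] + X[k + 1:], y[:k] + y[k + 1:])
        total += abs(predict(w, X[k]) - y[k])
    return total / len(y)
```

Here `X` is a list of rows from `design_row` and `y` holds the observed alpha-helix (or beta-strand) content of each protein. For a least-squares fit the jackknife error is never smaller than the resubstitution error, which matches the pattern of the reported values (0.051 against 0.040, and 0.045 against 0.035).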

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Linear Models
  • Models, Chemical*
  • Protein Structure, Secondary*