QuaBingo: A Prediction System for Protein Quaternary Structure Attributes Using Block Composition

Biomed Res Int. 2016:2016:9480276. doi: 10.1155/2016/9480276. Epub 2016 Aug 17.

Abstract

Background. Quaternary structures of proteins are closely related to gene regulation, signal transduction, and many other biological functions. This study proposes a new feature extraction method, termed block composition, based on the composition of conserved protein motifs in block format. Results. QuaBingo, a prediction system for protein quaternary assembly states that combines block composition with functional domain composition, is built from three layers of classifiers and categorizes the quaternary structural attributes of proteins as monomer, homooligomer, or heterooligomer. The first layer uses support vector machines (SVMs) based on the blocks and functional domains of proteins, a second-layer SVM processes the outputs of the first layer, and a Random Forest in the third layer determines the final result. We compared the effectiveness of combining block composition, functional domain composition, and pseudo-amino acid composition in the model. Across 11 functional protein families, QuaBingo achieved a Matthews Correlation Coefficient (MCC) 23% higher than that of the existing prediction system. The results also revealed the biological characterization of the top five block compositions. Conclusions. QuaBingo provides better predictive ability for the quaternary structural attributes of proteins.
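For readers unfamiliar with the evaluation metric, the following is a minimal sketch of the standard binary MCC definition, applied one class at a time (for example, monomer versus the rest). The abstract does not state how MCC was computed or aggregated across the three classes, so the one-vs-rest treatment and the function names `mcc` and `one_vs_rest_counts` are assumptions made for illustration, not the authors' implementation.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from binary confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention: MCC is reported as 0 when a row or column
        # of the confusion matrix is empty and the ratio is undefined.
        return 0.0
    return (tp * tn - fp * fn) / denom


def one_vs_rest_counts(y_true, y_pred, positive):
    """Count TP, TN, FP, FN treating `positive` as the positive class.

    Hypothetical helper: the one-vs-rest reduction is an assumption,
    not something the abstract specifies.
    """
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, tn, fp, fn
```

As a toy usage example with made-up labels, true labels `["monomer", "homooligomer", "heterooligomer", "monomer"]` and predictions `["monomer", "monomer", "heterooligomer", "monomer"]` give counts (TP, TN, FP, FN) = (2, 1, 1, 0) for the monomer class, and `mcc(2, 1, 1, 0)` is about 0.577. MCC ranges from -1 (total disagreement) through 0 (no better than chance) to 1 (perfect prediction).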

MeSH terms

  • Algorithms*
  • Amino Acid Sequence
  • Computer Simulation
  • Models, Chemical
  • Models, Molecular*
  • Molecular Sequence Data
  • Pattern Recognition, Automated / methods
  • Protein Structure, Quaternary*
  • Proteins / chemistry*
  • Proteins / ultrastructure*
  • Sequence Analysis, Protein / methods*
  • Support Vector Machine

Substances

  • Proteins