Ensemble of Template-Free and Template-Based Classifiers for Protein Secondary Structure Prediction

Int J Mol Sci. 2021 Oct 23;22(21):11449. doi: 10.3390/ijms222111449.

Abstract

Protein secondary structures are important in many biological processes and applications. Advances in sequencing methods have produced many sequenced proteins, but far fewer proteins have secondary structures determined by laboratory methods. With the development of computer technology, computational methods are becoming the most important methodologies for predicting secondary structures. Motivated by the recent results obtained by computational methods in this task, we evaluated two different approaches to this problem: (i) template-free classifiers, based on machine learning techniques; and (ii) template-based classifiers, based on searching tools. Each approach is formed by different sub-classifiers (six for template-free and two for template-based), each with a specific view of the protein. Our results show that these ensembles improve the results of each approach individually.
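As an illustration only, the idea of combining several sub-classifiers into an ensemble can be sketched as a per-residue majority vote. The abstract does not state how the sub-classifier outputs are combined, so the voting rule, the tie-break, the three-state label alphabet (H, E, C) and the function name `ensemble_vote` are all assumptions made for this sketch, not the authors' method.

```python
from collections import Counter


def ensemble_vote(predictions):
    """Combine per-residue label strings by majority vote (hypothetical helper).

    `predictions` is a list of equal-length strings, one per sub-classifier,
    where each character is the label predicted for one residue.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predictions must have the same length")

    consensus = []
    # zip(*predictions) yields, for each residue, the labels from all sub-classifiers.
    for column in zip(*predictions):
        counts = Counter(column)
        best = max(counts.values())
        # Assumed tie-break: the earliest sub-classifier whose label has the top count wins.
        winner = next(label for label in column if counts[label] == best)
        consensus.append(winner)
    return "".join(consensus)


# Three made-up sub-classifier outputs for a six-residue sequence.
preds = ["HHHEEC", "HHCEEC", "HCCEEC"]
print(ensemble_vote(preds))  # prints "HHCEEC"
```

A weighted vote or a learned combiner over class probabilities would be equally plausible choices. Plain majority voting is used here only because it is the simplest rule that shows how several views of the same protein can be merged into one prediction.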

Keywords: BLAST; deep learning; ensemble; machine learning; protein secondary structure prediction.

MeSH terms

  • Algorithms
  • Computational Biology / methods*
  • Databases, Protein
  • Machine Learning
  • Neural Networks, Computer
  • Protein Conformation
  • Protein Structure, Secondary*
  • Proteins / chemistry*
  • Software

Substances

  • Proteins