Classifying the superfamily of small heat shock proteins by using g-gap dipeptide compositions

Int J Biol Macromol. 2021 Jan 15:167:1575-1578. doi: 10.1016/j.ijbiomac.2020.11.111. Epub 2020 Nov 17.

Abstract

Small heat shock proteins (sHSPs) are a superfamily of molecular chaperones found in organisms from archaea to humans. Recent research has demonstrated that sHSPs participate in a range of biological processes and are closely associated with serious diseases. Because sHSPs form a very large superfamily whose subfamilies exhibit distinct functions, accurate classification of sHSPs into subfamilies will help reveal their functions. In the present work, a support vector machine-based method is proposed to classify sHSPs into subfamilies. In a 10-fold cross-validation test, the method achieved an overall accuracy of 93.25%. Its superiority was also demonstrated by comparison with other methods. It is anticipated that the proposed method will become a useful tool for classifying the subfamilies of sHSPs.

Keywords: Small heat shock proteins; Superfamily; Support vector machine; g-Gap dipeptide composition.
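
The g-gap dipeptide composition named in the title and keywords is not defined in the abstract. The sketch below follows the definition commonly used in the sequence-classification literature: for a gap g, count every ordered pair of residues separated by g intervening positions and normalize the 400 counts into frequencies. The function name, the 20-letter alphabet constant, and the handling of non-standard residues are assumptions made for illustration and may differ from the authors' implementation.

```python
from itertools import product

# The 20 standard amino acids (assumed alphabet; non-standard residues are skipped).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered residue pairs, in a fixed order that defines the feature vector.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def g_gap_dipeptide_composition(sequence, g=0):
    """Return the 400-dimensional g-gap dipeptide frequency vector.

    A g-gap dipeptide pairs the residue at position i with the residue at
    position i + g + 1, so g = 0 gives the ordinary adjacent dipeptides.
    """
    seq = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - g - 1):
        pair = seq[i] + seq[i + g + 1]
        if pair in counts:  # ignore pairs containing non-standard residues
            counts[pair] += 1
            total += 1
    if total == 0:
        # Sequence too short for this gap, or no valid pairs were found.
        return [0.0] * len(DIPEPTIDES)
    # For a sequence of standard residues, total equals len(seq) - g - 1.
    return [counts[d] / total for d in DIPEPTIDES]


# Hypothetical toy sequence, used only to show the call.
features = g_gap_dipeptide_composition("MDIAIHHPWIRRPFF", g=1)
```

Vectors of this kind, computed for one or more values of g, could then serve as input features for a support vector machine; the gap values, any feature selection, and the kernel settings used in the paper are not stated in the abstract.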

MeSH terms

  • Amino Acid Sequence
  • Animals
  • Computational Biology / methods*
  • Databases, Protein
  • Dipeptides / chemistry
  • Dipeptides / classification*
  • Dipeptides / genetics
  • Heat-Shock Proteins, Small / chemistry
  • Heat-Shock Proteins, Small / classification*
  • Heat-Shock Proteins, Small / genetics
  • Humans
  • Machine Learning*
  • Proteomics / methods
  • Sequence Alignment

Substances

  • Dipeptides
  • Heat-Shock Proteins, Small