PSSNet - An Accurate Super-Secondary Structure for Protein Segmentation

Int J Mol Sci. 2022 Nov 26;23(23):14813. doi: 10.3390/ijms232314813.

Abstract

A super-secondary structure (SSS) is a spatially unique ensemble of secondary structural elements that determines the three-dimensional shape of a protein and its function, which makes SSSs attractive candidates for folding cores. Characterizing the known types of SSSs is important for a deeper understanding of the mechanisms of protein folding. Here, we propose PSSNet, a universal machine-learning method for SSS recognition and segmentation. To segment various types of SSSs, the method uses key characteristics of SSS geometry (the lengths of secondary structural elements, the distances between them, torsion angles, and the spatial positions of Cα atoms) together with primary sequences. Using four types of SSSs (βαβ-unit, α-hairpin, β-hairpin, αα-corner), we show that extensive SSS sets can be reliably selected from the Protein Data Bank and the AlphaFold 2.0 database of protein structures.
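Two of the geometric features named in the abstract, the distance between atoms and the torsion angle, can be illustrated with a minimal sketch. This is not the PSSNet implementation; the function names `atom_distance` and `torsion_angle` and the example coordinates are hypothetical, and the torsion is computed with the standard atan2 formulation for four consecutive points.

```python
import math

# Hypothetical helpers for illustration only; not taken from PSSNet.

def _sub(a, b):
    """Component-wise difference a - b of two 3D points."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    """Dot product of two 3D vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    """Cross product of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def atom_distance(p, q):
    """Euclidean distance between two atoms, e.g. two C-alpha positions."""
    d = _sub(p, q)
    return math.sqrt(_dot(d, d))

def torsion_angle(p0, p1, p2, p3):
    """Torsion (dihedral) angle in degrees defined by four consecutive points."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)  # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)  # normal of the plane through p1, p2, p3
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

# Made-up coordinates (in angstroms) chosen so the results are easy to check.
dist = atom_distance((0.0, 0.0, 0.0), (3.0, 4.0, 0.0))
cis = torsion_angle((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
right = torsion_angle((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
trans = torsion_angle((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
```

In a real pipeline the four points would typically be backbone atoms read from a structure file (for example N, Cα, C, and the next N for the ψ angle), and the distances would be taken between Cα atoms of different secondary structural elements; how PSSNet encodes these features is described in the full paper, not here.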

Keywords: AlphaFold 2.0; data bank; graph neural network; machine learning; protein features; super-secondary structure.

MeSH terms

  • Databases, Protein
  • Machine Learning
  • Protein Folding*
  • Protein Structure, Secondary
  • Proteins* / chemistry

Substances

  • Proteins