SAFoldNet: A Novel Tool for Discovering and Aligning Three-Dimensional Protein Structures Based on a Neural Network

Int J Mol Sci. 2023 Sep 22;24(19):14439. doi: 10.3390/ijms241914439.

Abstract

The development and improvement of methods for comparing and searching for three-dimensional protein structures remain pressing tasks in modern structural biology. To address this problem, we developed a new tool, SAFoldNet, which allows for searching, aligning, superimposing, and determining the exact coordinates of fragments of protein structures. The proposed search and alignment tool was built using a neural network. Specifically, we combined neural network predictions with the well-known BLAST algorithm for searching and aligning sequences. The proposed method involves multistage processing, comprising a stage for converting the geometry of protein structures into sequences of a structural alphabet using a neural network, a search stage for forming a set of candidate structures, and a refinement stage for calculating the structural alignment and superposition and evaluating the similarity with the query structure. The effectiveness and practical applicability of the proposed tool were compared with those of several widely used services for searching and aligning protein structures. The comparisons confirmed that the proposed method is effective and competitive with currently available services. Furthermore, using the proposed approach, a service with a user-friendly web interface was developed, which allows for searching, aligning, and superimposing protein structures; determining the location of protein fragments; mapping them onto a protein molecule chain; and providing structural similarity metrics (expected value and root mean square deviation).
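
The two similarity metrics named above can be illustrated with a minimal sketch. It is not SAFoldNet's implementation, which the abstract does not describe. The function names rmsd and blast_evalue are hypothetical. The RMSD function assumes the two coordinate sets are already paired and superimposed. The expected value uses the standard Karlin-Altschul form E = K * m * n * exp(-lambda * S) that underlies BLAST statistics; the values of K and lambda used in the example are placeholders, not parameters of any real scoring system.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two paired lists of (x, y, z) points.

    Assumes the structures are already aligned and superimposed; no
    rotation or translation is applied here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))


def blast_evalue(raw_score, query_len, db_len, lam, k):
    """Karlin-Altschul expected value: E = K * m * n * exp(-lambda * S).

    lam and k depend on the scoring system and must be supplied by the
    caller; the values used in the example below are placeholders.
    """
    return k * query_len * db_len * math.exp(-lam * raw_score)


# Two atoms; the second pair differs by 2 units along z.
a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
b = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
deviation = rmsd(a, b)

# Placeholder statistical parameters (assumptions, for illustration only).
e_low_score = blast_evalue(raw_score=20, query_len=100, db_len=1_000_000, lam=0.3, k=0.1)
e_high_score = blast_evalue(raw_score=60, query_len=100, db_len=1_000_000, lam=0.3, k=0.1)
```

A lower RMSD indicates a closer geometric match, and a lower expected value indicates a hit that is less likely to arise by chance, so e_high_score is smaller than e_low_score in this example.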

Keywords: neural network; protein conformation; protein domain; protein motif; protein structure; structural alphabet.

MeSH terms

  • Algorithms*
  • Databases, Protein
  • Mathematics
  • Neural Networks, Computer
  • Proteins* / chemistry
  • Sequence Alignment
  • Software

Substances

  • Proteins