Struct2GO: protein function prediction based on graph pooling algorithm and AlphaFold2 structure information

Bioinformatics. 2023 Oct 3;39(10):btad637. doi: 10.1093/bioinformatics/btad637.

Abstract

Motivation: Protein structure prediction has seen a breakthrough in recent years: the DeepMind team's AlphaFold2 model has raised prediction accuracy to the atomic level. Current deep learning-based protein function prediction models usually extract features from protein sequences and combine them with protein-protein interaction networks to achieve good results. However, such models cannot make effective predictions for newly sequenced proteins that are absent from the protein-protein interaction network. To address this, this article proposes the Struct2GO model, which combines protein structure and sequence data to improve both the precision of protein function prediction and the generality of the model.

Results: We obtain amino acid residue embeddings from the protein structure through graph representation learning, use a graph pooling algorithm based on a self-attention mechanism to obtain whole-graph structural features, and fuse them with sequence features obtained from a protein language model. The results demonstrate that the Struct2GO model achieves better results than traditional protein sequence-based function prediction models.
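The pipeline described in the Results can be illustrated with a small sketch. The abstract does not specify the architecture, so everything below is an assumption made for illustration: a single SAGPool-style scoring layer, top-k node selection, a mean-plus-max readout, and fusion by concatenation. The function names (`self_attention_pool`, `readout`, `fuse`), the toy chain-shaped contact map, and the random embeddings are hypothetical and are not taken from the Struct2GO repository.

```python
import numpy as np

def normalize_adjacency(adj):
    """Symmetric normalization with self-loops: D^-1/2 (A + I) D^-1/2."""
    a_hat = adj + np.eye(adj.shape[0])
    deg = a_hat.sum(axis=1)          # always >= 1 because of the self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def self_attention_pool(x, adj, w_score, ratio=0.5):
    """SAGPool-style pooling (assumed form): score each node with one
    graph-convolution step, keep the top-scoring fraction of nodes, and
    gate the kept features by their scores."""
    a_norm = normalize_adjacency(adj)
    scores = np.tanh(a_norm @ x @ w_score)            # one score per node
    k = max(1, int(np.ceil(ratio * x.shape[0])))      # number of nodes kept
    keep = np.sort(np.argsort(-scores)[:k])           # indices of kept nodes
    x_pooled = x[keep] * scores[keep, None]           # score-gated features
    adj_pooled = adj[np.ix_(keep, keep)]              # induced subgraph
    return x_pooled, adj_pooled, keep

def readout(x):
    """Whole-graph feature: concatenated mean and max over remaining nodes."""
    return np.concatenate([x.mean(axis=0), x.max(axis=0)])

def fuse(graph_vec, seq_vec):
    """Fuse structure and sequence features by concatenation (assumption)."""
    return np.concatenate([graph_vec, seq_vec])

# Toy inputs: 6 residues with 4-dim embeddings and a chain-shaped contact map.
rng = np.random.default_rng(0)
n_residues, d_node, d_seq = 6, 4, 8
x = rng.normal(size=(n_residues, d_node))      # stand-in residue embeddings
adj = np.zeros((n_residues, n_residues))
for i in range(n_residues - 1):
    adj[i, i + 1] = adj[i + 1, i] = 1.0        # contact between neighbours
w_score = rng.normal(size=d_node)              # stand-in for a learned weight
seq_vec = rng.normal(size=d_seq)               # stand-in language-model vector

x_pooled, adj_pooled, keep = self_attention_pool(x, adj, w_score, ratio=0.5)
fused = fuse(readout(x_pooled), seq_vec)
```

In a trained model the scoring weight would be learned and the fused vector would feed a classifier over GO terms; here it only shows how a variable-size residue graph is reduced to a fixed-length vector that can be joined with a sequence embedding.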

Availability and implementation: The data underlying this article are available at https://github.com/lyjps/Struct2GO.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Amino Acid Sequence
  • Amino Acids
  • Neural Networks, Computer*
  • Proteins* / chemistry

Substances

  • Proteins
  • Amino Acids