Protein Subcellular Localization Prediction Model Based on Graph Convolutional Network

Interdiscip Sci. 2022 Dec;14(4):937-946. doi: 10.1007/s12539-022-00529-9. Epub 2022 Jun 17.

Abstract

Protein subcellular localization prediction is an important research area in bioinformatics and plays an essential role in understanding protein function and mechanism. Many machine learning and deep learning algorithms have been applied to this task, but most of them do not use the structural information of proteins. With recent advances in protein structure research, protein contact map prediction has improved dramatically. In this paper, we present GraphLoc, a deep learning model that predicts the localization of proteins at the subcellular level. The core of the model consists of a graph convolutional network (GCN) module and a multi-head attention module. The protein topology graph is constructed from a contact map predicted from the protein sequence and is used as the input of the GCN module to take full advantage of the structural information of proteins. The multi-head attention module learns the weighted contribution of different amino acids to subcellular localization in different feature representation subspaces. Experiments on the benchmark dataset show that our model performs better than the methods it was compared with. The code can be accessed at https://github.com/GoodGuy398/GraphLoc.

The proposed GraphLoc model consists of three parts. The first part is a graph convolutional network (GCN) module, which uses the predicted contact map to construct the protein graph and thereby exploits the structural information of the protein. The second part is the multi-head attention module, which learns the weighted contribution of different amino acids in different feature representation subspaces and computes a weighted average of the feature map across all amino acid nodes. The last part is a fully connected layer that maps the flattened graph representation vector to a vector whose dimension equals the number of categories, followed by a softmax layer that predicts the protein subcellular localization.
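
The three-part pipeline described above (contact-map graph, GCN, multi-head attention pooling, fully connected layer with softmax) can be illustrated with a minimal NumPy sketch. This is not the GraphLoc implementation from the repository. Everything specific in it is an assumption made for illustration: the contact threshold of 0.5, the symmetric normalization with self-loops, the single GCN layer, the simplified attention scoring, the random placeholder node features and weights, and all sizes including the number of categories.

```python
import numpy as np

def normalized_adjacency(contact_prob, threshold=0.5):
    """Build a normalized graph from an L x L predicted contact map.
    The threshold and the D^-1/2 (A + I) D^-1/2 normalization are assumptions."""
    adj = (contact_prob >= threshold).astype(float)
    adj = np.maximum(adj, adj.T)        # make the graph undirected
    np.fill_diagonal(adj, 1.0)          # self-loops, so every degree is >= 1
    d_inv_sqrt = 1.0 / np.sqrt(adj.sum(axis=1))
    return adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(a_hat, h, w):
    """One graph convolution: aggregate neighbor features, project, apply ReLU."""
    return np.maximum(a_hat @ h @ w, 0.0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_pool(h, w_att):
    """Simplified multi-head attention pooling (hypothetical form).
    h: (L, d) node features, w_att: (d, n_heads).
    Each head scores every residue and takes a weighted average over residues."""
    alpha = softmax((h @ w_att).T, axis=-1)   # (n_heads, L), rows sum to 1
    return alpha @ h, alpha                   # pooled: (n_heads, d)

def forward(contact_prob, x, params):
    a_hat = normalized_adjacency(contact_prob)
    h = gcn_layer(a_hat, x, params["w_gcn"])
    pooled, alpha = attention_pool(h, params["w_att"])
    # Flatten the per-head vectors and map them to one score per category.
    logits = pooled.reshape(-1) @ params["w_fc"] + params["b_fc"]
    return softmax(logits), alpha

# Illustrative sizes only; none of these values come from the paper.
rng = np.random.default_rng(0)
L, d_in, d_hid, n_heads, n_classes = 30, 20, 16, 4, 10
contact = rng.random((L, L))    # stand-in for a predicted contact map
x = rng.random((L, d_in))       # stand-in for per-residue features
params = {
    "w_gcn": rng.normal(scale=0.1, size=(d_in, d_hid)),
    "w_att": rng.normal(scale=0.1, size=(d_hid, n_heads)),
    "w_fc": rng.normal(scale=0.1, size=(n_heads * d_hid, n_classes)),
    "b_fc": np.zeros(n_classes),
}
probs, alpha = forward(contact, x, params)
print(probs.shape, alpha.shape)
```

In this sketch, each row of `alpha` holds one head's weights over the residues, which is one way to read "the weighted contribution of different amino acids", and `probs` is a distribution over the hypothetical location categories.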

Keywords: Deep learning; Graph convolutional network; Multi-head attention; Protein subcellular localization.

MeSH terms

  • Amino Acids
  • Computational Biology* / methods
  • Machine Learning
  • Neural Networks, Computer*
  • Proteins / chemistry

Substances

  • Proteins
  • Amino Acids