Identifying B-cell epitopes using AlphaFold2 predicted structures and pretrained language model

Bioinformatics. 2023 Apr 3;39(4):btad187. doi: 10.1093/bioinformatics/btad187.

Abstract

Motivation: Identifying B-cell epitopes is an essential step in guiding rational vaccine development and immunotherapies. Since experimental approaches are expensive and time-consuming, many computational methods have been designed to assist B-cell epitope prediction. However, existing sequence-based methods have limited performance because they use only contextual features of sequential neighbors and neglect structural information.

Results: Building on the recent breakthrough of AlphaFold2 in protein structure prediction, we propose GraphBepi, a novel graph-based model for accurate B-cell epitope prediction. For each protein, the structure predicted by AlphaFold2 is used to construct a protein graph in which the nodes (residues) are encoded by representations learned by the ESM-2 pretrained language model. The graph is input into an edge-enhanced deep graph neural network (EGNN) to capture the spatial information in the predicted 3D structure. In parallel, a bidirectional long short-term memory network (BiLSTM) is employed to capture long-range dependencies in the sequence. The low-dimensional representations learned by the EGNN and the BiLSTM are then combined and fed into a multilayer perceptron that predicts B-cell epitopes. In comprehensive tests on the curated epitope dataset, GraphBepi outperformed the state-of-the-art methods by more than 5.5% and 44.0% in terms of AUC and AUPR, respectively. A web server is freely available at http://bio-web1.nscc-gz.cn/app/graphbepi.
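
To illustrate what constructing a protein graph from a predicted structure can look like, the following is a minimal sketch, not the GraphBepi implementation. It assumes that two residues are connected when their C-alpha atoms lie within a distance cutoff; the abstract does not state how edges are defined, so the cutoff rule, the function name build_residue_graph, and the toy coordinates are all hypothetical.

```python
import math


def build_residue_graph(ca_coords, cutoff=10.0):
    """Link residues whose C-alpha atoms lie within `cutoff` angstroms.

    ca_coords: list of (x, y, z) tuples, one per residue.
    The cutoff rule is an assumption made for illustration only.
    Returns {residue_index: [(neighbor_index, distance), ...]}.
    """
    n = len(ca_coords)
    neighbors = {i: [] for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(ca_coords[i], ca_coords[j])
            if d <= cutoff:
                # Store the distance so it can serve as a simple edge feature.
                neighbors[i].append((j, d))
                neighbors[j].append((i, d))
    return neighbors


# Hypothetical toy coordinates: three residues in a row and one far away.
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (30.0, 0.0, 0.0)]
graph = build_residue_graph(toy_coords, cutoff=5.0)
```

In a full pipeline, each node would additionally carry a per-residue embedding and each edge could carry richer geometric features; here the stored distance stands in for an edge feature.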

Availability and implementation: The datasets, pre-computed features, source code, and trained model are available at https://github.com/biomed-AI/GraphBepi.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Epitopes, B-Lymphocyte* / chemistry
  • Language
  • Neural Networks, Computer*
  • Proteins / chemistry
  • Software

Substances

  • Epitopes, B-Lymphocyte
  • Proteins