Predicting MHC class I binder: existing approaches and a novel recurrent neural network solution

Brief Bioinform. 2021 Nov 5;22(6):bbab216. doi: 10.1093/bib/bbab216.

Abstract

The major histocompatibility complex (MHC) is an important research target in the treatment of complex human diseases, and many computational tools have been developed to predict MHC class I binders. Here, we comprehensively reviewed 27 up-to-date MHC class I binding prediction tools developed over the last decade, evaluating their feature representation methods, prediction algorithms and model training strategies on a benchmark dataset from the Immune Epitope Database. The review identified a common limitation: all existing tools can handle only a fixed peptide sequence length. To overcome this limitation, we developed an approach based on a bilateral and variable long short-term memory (BVLSTM) network, named BVLSTM-MHC, which is the first variable-length MHC class I binding predictor. Compared with 10 mainstream prediction tools on an independent validation dataset, BVLSTM-MHC achieved the best performance in six of the eight evaluated metrics. A web server based on the BVLSTM-MHC model was developed to enable accurate and efficient MHC class I binder prediction in human, mouse, macaque and chimpanzee.
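
The idea of accepting peptides of any length can be illustrated with a minimal sketch. It is not the BVLSTM-MHC architecture, which the abstract does not specify. It assumes that "bilateral" means reading the sequence in both directions, and it swaps the LSTM cell for a plain tanh recurrent cell with random, untrained weights so that the example stays dependency-free. The names `one_hot`, `run_direction`, `encode` and the hidden size `HIDDEN` are hypothetical.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
HIDDEN = 4  # hypothetical hidden-state size, chosen only for illustration


def one_hot(peptide):
    """Encode a peptide as one 20-dimensional one-hot vector per residue."""
    rows = []
    for aa in peptide:
        vec = [0.0] * len(AMINO_ACIDS)
        vec[AA_INDEX[aa]] = 1.0
        rows.append(vec)
    return rows


def init_weights(seed):
    """Random, untrained weights: an input matrix and a recurrent matrix."""
    rng = random.Random(seed)
    w_in = [[rng.uniform(-0.5, 0.5) for _ in AMINO_ACIDS] for _ in range(HIDDEN)]
    w_rec = [[rng.uniform(-0.5, 0.5) for _ in range(HIDDEN)] for _ in range(HIDDEN)]
    return w_in, w_rec


def run_direction(steps, weights):
    """Fold a sequence of any length into a single HIDDEN-sized state."""
    w_in, w_rec = weights
    state = [0.0] * HIDDEN
    for x in steps:
        # Simple tanh recurrence (a stand-in for an LSTM cell).
        state = [
            math.tanh(
                sum(w * xi for w, xi in zip(w_in[j], x))
                + sum(w * hi for w, hi in zip(w_rec[j], state))
            )
            for j in range(HIDDEN)
        ]
    return state


FORWARD = init_weights(0)
BACKWARD = init_weights(1)


def encode(peptide):
    """Concatenate the final states of a forward and a backward pass."""
    steps = one_hot(peptide)
    return run_direction(steps, FORWARD) + run_direction(steps[::-1], BACKWARD)


if __name__ == "__main__":
    # Peptides of length 8, 9 and 10 all map to vectors of the same size.
    for peptide in ("SIINFEKL", "GILGFVFTL", "ELAGIGILTV"):
        print(peptide, len(peptide), len(encode(peptide)))
```

Because the recurrence consumes one residue at a time, the output size depends only on `HIDDEN`, not on peptide length, so no padding or truncation to a fixed window is needed. A real predictor would train the weights and place a classification or regression layer on top of the concatenated state.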

Keywords: MHC class I; position-specific scoring matrix; long short-term memory; variable recurrent neural network.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Binding Sites*
  • Carrier Proteins / chemistry*
  • Carrier Proteins / metabolism
  • Computational Biology / methods*
  • Databases, Factual
  • Deep Learning
  • Epitopes / chemistry
  • Epitopes / immunology
  • Epitopes / metabolism
  • Histocompatibility Antigens Class I / chemistry*
  • Histocompatibility Antigens Class I / immunology
  • Histocompatibility Antigens Class I / metabolism
  • Machine Learning
  • Neural Networks, Computer*
  • Protein Binding
  • ROC Curve
  • Reproducibility of Results
  • Software*
  • Web Browser

Substances

  • Carrier Proteins
  • Epitopes
  • Histocompatibility Antigens Class I