A framework for predicting variable-length epitopes of human-adapted viruses using machine learning methods

Brief Bioinform. 2022 Sep 20;23(5):bbac281. doi: 10.1093/bib/bbac281.

Abstract

The coronavirus disease 2019 pandemic has alerted people to the threat posed by viruses. Vaccination is the most effective way to prevent the disease from spreading. The interaction between antibodies and antigens clears infectious organisms from the host. Identifying B-cell epitopes is critical in vaccine design, development of disease diagnostics and antibody production. However, traditional experimental methods to determine epitopes are time-consuming and expensive, and the predictive performance of existing in silico methods is not satisfactory. This paper develops a general framework to predict variable-length linear B-cell epitopes specific to human-adapted viruses with machine learning approaches based on the ProtVec representation of peptides and physicochemical properties of amino acids. QR decomposition is incorporated into the embedding process, which enables our models to handle variable-length sequences. Experimental results on large immune epitope datasets show that our proposed model outperforms state-of-the-art methods in terms of area under the receiver operating characteristic curve (AUROC, 0.827) and area under the precision-recall curve (AUPR, 0.831) on the testing set. Moreover, sequence analysis also identifies the viral category of the corresponding predicted epitopes with high precision. Therefore, this framework is shown to reliably identify linear B-cell epitopes of human-adapted viruses given protein sequences and could provide assistance in potential future pandemics and epidemics.
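
To make the two reported metrics concrete, the following is a minimal sketch of how AUROC and AUPR can be computed for a binary epitope/non-epitope classifier. It is not the authors' evaluation code: the function names, the toy labels and the toy scores are all hypothetical, and AUPR is estimated here with average precision, which is one common estimator and may differ from the one used in the paper.

```python
# Illustrative only: function names and toy data are assumptions, not from the paper.

def auroc(labels, scores):
    """AUROC as the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative one
    (tied scores count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Average precision, used here as an estimator of AUPR: the mean of the
    precision values measured at the rank of each positive example.
    Tied scores are ranked in input order; no tie correction is applied."""
    n_pos = sum(1 for y in labels if y == 1)
    if n_pos == 0:
        raise ValueError("need at least one positive example")
    # Rank examples from the highest score to the lowest.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / n_pos


# Hypothetical toy data: 1 = epitope, 0 = non-epitope.
toy_labels = [1, 0, 1, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]

if __name__ == "__main__":
    print("AUROC:", round(auroc(toy_labels, toy_scores), 3))
    print("AUPR (average precision):", round(average_precision(toy_labels, toy_scores), 3))
```

Both metrics depend only on how the model ranks peptides, not on a fixed decision threshold. AUPR is the more sensitive of the two to how well the positive class is retrieved.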

Keywords: Epitope prediction; Human viruses; Machine learning; Variable-length.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acids
  • COVID-19*
  • Epitope Mapping / methods
  • Epitopes, B-Lymphocyte
  • Humans
  • Machine Learning
  • Peptides / chemistry
  • Viruses*

Substances

  • Amino Acids
  • Epitopes, B-Lymphocyte
  • Peptides