Inadequate Reference Datasets Biased toward Short Non-epitopes Confound B-cell Epitope Prediction

J Biol Chem. 2016 Jul 8;291(28):14585-99. doi: 10.1074/jbc.M116.729020. Epub 2016 May 9.

Abstract

X-ray crystallography has shown that an antibody paratope typically binds 15-22 amino acids (aa) of an epitope, of which 2-5 randomly distributed amino acids contribute most of the binding energy. In contrast, researchers typically choose short peptide antigens for B-cell epitope mapping in antibody binding assays. Furthermore, short 6-11-aa epitopes, and in particular non-epitopes, are over-represented in the published B-cell epitope datasets commonly used to develop approaches for predicting B-cell epitopes from protein antigen sequences. We hypothesized that such suboptimal-length peptides result in weak antibody binding and cause false-negative results. We tested the influence of peptide antigen length on antibody binding by analyzing data on more than 900 peptides used for B-cell epitope mapping of immunodominant proteins of Chlamydia spp. We demonstrate that short 7-12-aa peptides of B-cell epitopes bind antibodies poorly; thus, epitope mapping with short peptide antigens falsely classifies many B-cell epitopes as non-epitopes. We also show, in published datasets of confirmed epitopes and non-epitopes, a direct correlation between the length of peptide antigens and antibody binding. Eliminating short (≤11-aa) epitope and non-epitope sequences improved the datasets for evaluating in silico B-cell epitope prediction. Protein disorder tendency, which achieves up to 86% accuracy, is the best indicator of B-cell epitope regions for both the chlamydial and the published datasets. For B-cell epitope prediction, the most effective approach is to plot the disorder of protein sequences with the IUPred-L scale and then test the antibody reactivity of 16-30-aa peptides from peak regions. This strategy overcomes the well-known inaccuracy of in silico B-cell epitope prediction from primary protein sequences.
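The two computational steps named in the abstract (removing ≤11-aa sequences from a reference dataset, and picking 16-30-aa peptides from peaks of a disorder profile) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes the per-residue disorder scores have already been computed elsewhere (for example with IUPred in long-disorder mode) and are supplied as a plain list of floats. The function names, the 0.5 score threshold, the 20-aa default window, and the choice to center each window on the highest-scoring residue are all illustrative assumptions that do not come from the source.

```python
from typing import List, Sequence, Tuple

# Peptide length range recommended in the abstract for reactivity testing.
MIN_LEN = 16
MAX_LEN = 30


def drop_short_peptides(peptides: Sequence[str], max_short_len: int = 11) -> List[str]:
    """Remove peptides of max_short_len residues or fewer (<=11 aa by default)."""
    return [p for p in peptides if len(p) > max_short_len]


def peak_regions(scores: Sequence[float], threshold: float = 0.5) -> List[Tuple[int, int]]:
    """Return half-open (start, end) index pairs of contiguous runs scoring >= threshold.

    The 0.5 default is an assumed cutoff, not a value taken from the paper.
    """
    regions = []
    start = None
    for i, score in enumerate(scores):
        if score >= threshold and start is None:
            start = i
        elif score < threshold and start is not None:
            regions.append((start, i))
            start = None
    if start is not None:
        regions.append((start, len(scores)))
    return regions


def candidate_peptides(
    sequence: str,
    scores: Sequence[float],
    threshold: float = 0.5,
    length: int = 20,
) -> List[Tuple[int, str]]:
    """Return (start_index, peptide) for one fixed-length window per peak region.

    Each window is centered on the highest-scoring residue of its region and
    clamped to the sequence bounds (an assumed, hypothetical selection rule).
    """
    if len(sequence) != len(scores):
        raise ValueError("sequence and scores must have the same length")
    if not MIN_LEN <= length <= MAX_LEN:
        raise ValueError(f"length must be between {MIN_LEN} and {MAX_LEN}")
    if len(sequence) < length:
        return []
    peptides = []
    for start, end in peak_regions(scores, threshold):
        peak = max(range(start, end), key=lambda i: scores[i])
        left = min(max(0, peak - length // 2), len(sequence) - length)
        peptides.append((left, sequence[left:left + length]))
    return peptides
```

Calling `candidate_peptides(protein_sequence, disorder_scores)` returns one candidate peptide per disordered peak; those peptides would then go to antibody reactivity testing, which is the experimental step the abstract describes and which no code can replace.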

Keywords: antibody; antigen; bioinformatics; epitope mapping; immunogenicity; protein motif; protein-protein interaction.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Animals
  • Binding Sites, Antibody
  • Cattle
  • Chlamydia / immunology*
  • Chlamydia Infections / immunology*
  • Epitope Mapping / methods*
  • Epitopes, B-Lymphocyte / chemistry
  • Epitopes, B-Lymphocyte / immunology*
  • Humans
  • Machine Learning
  • Mice
  • Models, Immunological
  • Peptides / chemistry
  • Peptides / immunology

Substances

  • Epitopes, B-Lymphocyte
  • Peptides