Enhancing in silico protein-based vaccine discovery for eukaryotic pathogens using predicted peptide-MHC binding and peptide conservation scores

PLoS One. 2014 Dec 29;9(12):e115745. doi: 10.1371/journal.pone.0115745. eCollection 2014.

Abstract

Given thousands of proteins constituting a eukaryotic pathogen, the principal objective for a high-throughput in silico vaccine discovery pipeline is to select those proteins worthy of laboratory validation. Accurate prediction of T-cell epitopes on protein antigens is one crucial piece of evidence that would aid in this selection. Prediction of peptides recognised by T-cell receptors has to date proved to be of insufficient accuracy. The in silico approach is consequently reliant on an indirect method, which involves the prediction of peptides binding to major histocompatibility complex (MHC) molecules. Nevertheless, there is no guarantee that predicted peptide-MHC complexes will be presented by antigen-presenting cells and/or recognised by cognate T-cell receptors. The aim of this study was to determine if predicted peptide-MHC binding scores could provide contributing evidence to establish a protein's potential as a vaccine. Using T-cell MHC class I binding prediction tools provided by the Immune Epitope Database and Analysis Resource, peptide binding affinities to 76 common MHC I alleles were predicted for 160 Toxoplasma gondii proteins: 75, taken from published studies, represented proteins known or expected to induce T-cell immune responses, and 85 were considered less likely vaccine candidates. The results show there is no universal set of rules that can be applied directly to binding scores to distinguish a vaccine from a non-vaccine candidate. We present, however, two proposed strategies exploiting binding scores that provide supporting evidence that a protein is likely to induce a T-cell immune response: one using random forest (a machine learning algorithm), with 72% sensitivity and 82.4% specificity, and the other using amino acid conservation scores, with 74.6% sensitivity and 70.5% specificity, when applied to the 160 benchmark proteins. More importantly, the binding score strategies are valuable evidence contributors to the overall in silico vaccine discovery pool of evidence.
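
For readers less familiar with the two metrics quoted in the abstract, the following is a minimal sketch of how sensitivity and specificity are computed for a benchmark of 75 expected vaccine candidates (positives) and 85 less likely candidates (negatives). The function name and the confusion-matrix counts are hypothetical: the counts were chosen only so that the arithmetic is consistent with the reported random forest percentages, and the abstract does not state the study's actual counts.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) as fractions.

    sensitivity = TP / (TP + FN): share of true candidates correctly flagged.
    specificity = TN / (TN + FP): share of non-candidates correctly rejected.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative example")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical counts (assumption, not taken from the paper):
# 75 positives split into 54 detected and 21 missed,
# 85 negatives split into 70 rejected and 15 wrongly flagged.
tp, fn = 54, 21
tn, fp = 70, 15

sens, spec = sensitivity_specificity(tp, fn, tn, fp)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```

Because the benchmark classes are of different sizes (75 versus 85), the two metrics are computed over different denominators and should be read separately rather than averaged into a single accuracy figure.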

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Amino Acids / chemistry
  • Amino Acids / classification
  • Artificial Intelligence
  • Computational Biology
  • Computer Simulation
  • Databases, Protein
  • Epitopes, T-Lymphocyte / immunology
  • Genes, MHC Class I / immunology*
  • Humans
  • Peptides / chemistry
  • Peptides / immunology
  • Peptides / metabolism*
  • Protein Binding / immunology*
  • Proteins / immunology
  • Proteins / metabolism*
  • Protozoan Vaccines*
  • T-Lymphocytes / immunology
  • T-Lymphocytes / parasitology
  • Toxoplasma

Substances

  • Amino Acids
  • Epitopes, T-Lymphocyte
  • Peptides
  • Proteins
  • Protozoan Vaccines

Grants and funding

SJG gratefully acknowledges receipt of a PhD scholarship from Zoetis (Pfizer) Animal Health. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.