Prediction of T-cell epitopes based on least squares support vector machines and amino acid properties

Anal Chim Acta. 2007 Feb 12;584(1):37-42. doi: 10.1016/j.aca.2006.11.037. Epub 2006 Nov 19.

Abstract

The T-lymphocyte (T-cell) is a very important component of the human immune system. It possesses a receptor (TCR) that is specific for foreign epitopes, which take the form of short peptides bound to the major histocompatibility complex (MHC). When a T-cell detects a peptide bound to MHC, it activates the immune system, which results in the disposal of the immunogen. The antigenic determinant recognized and bound by the T-cell receptor is known as a T-cell epitope. The accurate prediction of T-cell epitopes is crucial for vaccine development and clinical immunology. For the first time, we developed new models using the least squares support vector machine (LSSVM) and amino acid properties for T-cell epitope prediction. A dataset of 203 short peptides (167 non-epitopes and 36 epitopes) was used as input and was randomly divided into a training set and a test set. The models based on LSSVM and amino acid properties were evaluated by leave-one-out cross-validation and by their predictive ability on the test set, giving areas under the ROC curve of 0.9875 and 0.9734, respectively. These results are more satisfactory than those reported previously. In particular, the accuracy for true positives is markedly improved.
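The evaluation described above (an LSSVM classifier scored by leave-one-out cross-validation and the area under the ROC curve) can be illustrated with a minimal sketch. It is not the authors' implementation. The RBF kernel, the values of `gamma` and `sigma`, and all function names are assumptions made for illustration, and the feature matrix is synthetic random data that only mimics the 36/167 class split; it stands in for the amino acid property descriptors, which the abstract does not specify.

```python
import numpy as np

def rbf_kernel(A, B, sigma):
    # Gaussian (RBF) kernel between the rows of A and the rows of B.
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def lssvm_fit(X, y, gamma, sigma):
    # Standard LSSVM classifier: training reduces to one linear system,
    #   [ 0   y^T             ] [ b     ]   [ 0 ]
    #   [ y   Omega + I/gamma ] [ alpha ] = [ 1 ]
    # with Omega_ij = y_i * y_j * K(x_i, x_j) and labels y in {-1, +1}.
    n = len(y)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = y
    A[1:, 0] = y
    A[1:, 1:] = np.outer(y, y) * rbf_kernel(X, X, sigma) + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], np.ones(n)))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]  # bias b, multipliers alpha

def lssvm_decision(X_train, y_train, b, alpha, X_new, sigma):
    # Real-valued decision score; its sign is the predicted class.
    return rbf_kernel(X_new, X_train, sigma) @ (alpha * y_train) + b

def loo_scores(X, y, gamma, sigma):
    # Leave-one-out: refit without sample i, then score sample i.
    n = len(y)
    scores = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        b, alpha = lssvm_fit(X[keep], y[keep], gamma, sigma)
        scores[i] = lssvm_decision(X[keep], y[keep], b, alpha, X[i:i + 1], sigma)[0]
    return scores

def roc_auc(y_true, scores):
    # AUC as the fraction of (positive, negative) pairs ranked correctly,
    # counting ties as half.
    pos = scores[y_true == 1]
    neg = scores[y_true == -1]
    diff = pos[:, None] - neg[None, :]
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / (len(pos) * len(neg))

# Synthetic stand-in data (NOT the peptide dataset): 36 "epitopes" and
# 167 "non-epitopes" described by 5 hypothetical numeric features.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(1.5, 1.0, (36, 5)), rng.normal(0.0, 1.0, (167, 5))])
y = np.concatenate([np.ones(36), -np.ones(167)])

scores = loo_scores(X, y, gamma=10.0, sigma=2.0)
print(f"Leave-one-out AUC on synthetic data: {roc_auc(y, scores):.3f}")
```

Because the AUC depends only on the ranking of the decision scores, it is not affected by the decision threshold, which makes it a reasonable summary for an imbalanced dataset such as this one. In practice `gamma` and `sigma` would be tuned, for example by a grid search over the cross-validated AUC.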

MeSH terms

  • Amino Acids / analysis*
  • Codon
  • Epitopes / analysis*
  • Epitopes / chemistry*
  • Epitopes / genetics
  • Genetic Vectors
  • Humans
  • Least-Squares Analysis
  • Molecular Weight
  • Peptides / chemistry
  • Peptides / immunology
  • Solubility
  • T-Lymphocytes / immunology*
  • T-Lymphocytes, Cytotoxic / immunology*

Substances

  • Amino Acids
  • Codon
  • Epitopes
  • Peptides