The T-lymphocyte (T-cell) is a very important component of the human immune system. T-cell epitopes can be used to accurately monitor immune responses activated by the major histocompatibility complex (MHC) and to rationally design vaccines. Therefore, accurate prediction of T-cell epitopes is crucial for vaccine development and clinical immunology. In the current study, two types of peptide features, i.e., amino acid properties and chemical molecular features, were used to represent T-cell epitope peptides. Based on these features, the random forest (RF) algorithm, a powerful machine learning algorithm, was used to classify T-cell epitopes and non-T-cell epitopes. The classification accuracy, sensitivity, specificity, Matthews correlation coefficient (MCC), and area under the curve (AUC) values for the proposed method are 97.54%, 97.22%, 97.60%, 0.9193, and 0.9868, respectively. These results indicate that the current method, based on the combined features and RF, is effective for T-cell epitope prediction.
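For readers less familiar with the evaluation metrics reported above, the following minimal sketch shows how accuracy, sensitivity, specificity, and MCC are conventionally derived from the counts of a binary confusion matrix. The function name `binary_metrics` and the counts in the usage example are hypothetical and chosen only for illustration; they are not taken from this study. AUC is omitted because it requires ranked prediction scores rather than counts.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Compute standard binary-classification metrics from confusion-matrix counts.

    tp/fn: epitopes predicted as epitope / non-epitope
    tn/fp: non-epitopes predicted as non-epitope / epitope
    (Hypothetical helper for illustration; not the authors' code.)
    """
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total if total else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # true negative rate
    # MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Illustrative counts only (assumed, not from the paper).
metrics = binary_metrics(tp=90, fn=10, tn=80, fp=20)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it remains informative when epitopes and non-epitopes are unevenly represented.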
Keywords: Amino acid properties; Chemical molecular features; Major histocompatibility complex (MHC); Random forest (RF); T-cell epitopes; T-cell receptors (TCRs).
Copyright © 2013. Published by Elsevier B.V.