Identification and characterization of B-cell epitopes in target antigens is a key step in epitope-driven vaccine design, immunodiagnostic testing, and antibody production. Experimental determination of epitopes is labor-intensive and expensive, so there is an urgent need for computational methods that identify B-cell epitopes reliably. In the current study, we propose a novel peptide feature description method that combines peptide amino acid properties with chemical molecular features. Based on these combined features, a random forest (RF) classifier was used to distinguish B-cell epitopes from non-epitopes. RF is an ensemble method that uses recursive partitioning to generate many trees and then aggregates their results; it often produces highly competitive models. The classification accuracy, sensitivity, specificity, Matthews correlation coefficient (MCC), and area under the curve (AUC) of the current method were 78.31%, 80.05%, 72.23%, 0.5836, and 0.8800, respectively. These results show that an appropriate combination of peptide amino acid features and chemical molecular features with an RF model can enhance prediction performance for linear B-cell epitopes. A free online service is available at http://sysbio.yznu.cn/Research/Epitopesprediction.aspx.
Keywords: Amino acid properties; Chemical molecular features; Computational method; Epitopes identification.
Copyright © 2014. Published by Elsevier Masson SAS.
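For readers unfamiliar with the threshold-based metrics reported in the abstract, the following minimal Python sketch shows how accuracy, sensitivity, specificity, and MCC are conventionally derived from a binary confusion matrix. It is not the authors' code: the function name `classification_metrics` and the counts used in the example are hypothetical, chosen only for illustration, and do not reproduce the study's data. AUC is omitted because it requires ranked prediction scores rather than counts.

```python
import math


def _safe_div(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary metrics from confusion-matrix counts.

    tp: epitopes predicted as epitopes
    fp: non-epitopes predicted as epitopes
    tn: non-epitopes predicted as non-epitopes
    fn: epitopes predicted as non-epitopes
    """
    accuracy = _safe_div(tp + tn, tp + fp + tn + fn)
    sensitivity = _safe_div(tp, tp + fn)  # true positive rate
    specificity = _safe_div(tn, tn + fp)  # true negative rate
    # MCC: correlation between observed and predicted labels, in [-1, 1].
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = _safe_div(tp * tn - fp * fn, mcc_denominator)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Hypothetical counts for illustration only (not taken from the study).
metrics = classification_metrics(tp=80, fp=30, tn=70, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which is why it is commonly reported alongside sensitivity and specificity when the epitope and non-epitope classes differ in size.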