Vaxign-ML: supervised machine learning reverse vaccinology model for improved prediction of bacterial protective antigens

Edison Ong; Haihe Wang; Mei U Wong; Meenakshi Seetharaman; Ninotchka Valdez; Yongqun He

doi:10.1093/bioinformatics/btaa119

Bioinformatics. 2020 May 1;36(10):3185-3191. doi: 10.1093/bioinformatics/btaa119.

Authors

Edison Ong¹, Haihe Wang²,³, Mei U Wong³, Meenakshi Seetharaman⁴, Ninotchka Valdez⁴, Yongqun He³,⁵,⁶

Affiliations

¹ Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI 48109, USA.
² Department of Pathogenobiology, Daqing Branch of Harbin Medical University, Daqing 163319, China.
³ Unit for Laboratory Animal Medicine.
⁴ College of Literature, Science, and the Arts, University of Michigan.
⁵ Department of Microbiology and Immunology.
⁶ Center of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI 48109, USA.

Abstract

Motivation: Reverse vaccinology (RV) is a milestone in rational vaccine design, and machine learning (ML) has been applied to enhance the accuracy of RV prediction. However, ML-based RV still faces challenges in prediction accuracy and program accessibility.

Results: This study presents Vaxign-ML, a supervised ML classification model for predicting bacterial protective antigens (BPAgs). To identify the best ML method and its optimal conditions, five ML methods were tested with biological and physicochemical features extracted from well-defined training data. Nested 5-fold cross-validation and leave-one-pathogen-out validation were used to obtain an unbiased performance assessment and to test the capability to predict vaccine candidates against a newly emerging pathogen. The best-performing model (eXtreme Gradient Boosting) was compared with three publicly available programs (Vaxign, VaxiJen, and Antigenic), one SVM-based method, and one epitope-based method using a high-quality benchmark dataset. Vaxign-ML showed superior performance in predicting BPAgs. Vaxign-ML is hosted on a publicly accessible web server, and a standalone version is also available.
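The two validation schemes named above can be illustrated with a minimal, standard-library-only sketch. It is not the authors' pipeline: the function names (`leave_one_pathogen_out`, `k_fold`, `nested_k_fold`), the toy pathogen labels, and the unshuffled, unstratified fold assignment are all assumptions made for illustration. The sketch shows only how the index splits are structured. In leave-one-pathogen-out validation every protein from one pathogen is held out together, and in nested cross-validation the inner folds used for tuning are drawn only from the outer training set.

```python
from collections import defaultdict
from typing import Dict, Iterator, List, Sequence, Tuple


def leave_one_pathogen_out(
    pathogens: Sequence[str],
) -> Iterator[Tuple[str, List[int], List[int]]]:
    """Yield (held_out_pathogen, train_indices, test_indices) per pathogen."""
    by_pathogen: Dict[str, List[int]] = defaultdict(list)
    for idx, name in enumerate(pathogens):
        by_pathogen[name].append(idx)
    for held_out in sorted(by_pathogen):
        test = by_pathogen[held_out]
        train = [i for i, name in enumerate(pathogens) if name != held_out]
        yield held_out, train, test


def k_fold(
    indices: Sequence[int], k: int
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists for a simple unshuffled k-fold split."""
    folds = [list(indices[i::k]) for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def nested_k_fold(n_samples: int, outer_k: int = 5, inner_k: int = 5):
    """Yield (outer_train, outer_test, inner_splits) for nested cross-validation.

    Inner splits come only from the outer training set, so the outer test
    fold never influences hyperparameter selection.
    """
    for outer_train, outer_test in k_fold(range(n_samples), outer_k):
        yield outer_train, outer_test, list(k_fold(outer_train, inner_k))


if __name__ == "__main__":
    # Hypothetical pathogen label for each protein in a toy dataset.
    labels = ["A", "A", "B", "C", "C", "C"]
    for held_out, train, test in leave_one_pathogen_out(labels):
        print(held_out, train, test)
```

In practice a model would be tuned on the inner splits, refit on each outer training set, and scored on the matching outer test fold. Shuffling and class stratification are omitted here to keep the sketch short.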

Availability and implementation: The Vaxign-ML website is at http://www.violinet.org/vaxign/vaxign-ml, a standalone Docker version of Vaxign-ML is available at https://hub.docker.com/r/e4ong1031/vaxign-ml, and the source code is available at https://github.com/VIOLINet/Vaxign-ML-docker.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

Research Support, N.I.H., Extramural

MeSH terms

Antigens, Bacterial*
Computational Biology
Machine Learning
Software
Supervised Machine Learning
Vaccinology*

Substances

Antigens, Bacterial

Grants and funding

R01 AI081062/AI/NIAID NIH HHS/United States