Ensemble-AMPPred: Robust AMP Prediction and Recognition Using the Ensemble Learning Method with a New Hybrid Feature for Differentiating AMPs

Genes (Basel). 2021 Jan 21;12(2):137. doi: 10.3390/genes12020137.

Abstract

Antimicrobial peptides (AMPs) are natural peptides possessing antimicrobial activities. These peptides are important components of the innate immune system. They are found in various organisms. AMP screening and identification by experimental techniques are laborious and time-consuming tasks. Alternatively, computational methods based on machine learning have been developed to screen potential AMP candidates prior to experimental verification. Although various AMP prediction programs are available, there is still a need for improvement to reduce false positives (FPs) and to increase the predictive accuracy. In this work, several well-known single and ensemble machine learning approaches have been explored and evaluated based on balanced training datasets and two large testing datasets. We have demonstrated that the developed program with various predictive models has high performance in differentiating between AMPs and non-AMPs. Thus, we describe the development of a program for the prediction and recognition of AMPs using MaxProbVote, which is an ensemble model. Moreover, to increase prediction efficiency, the ensemble model was integrated with a new hybrid feature based on logistic regression. The ensemble model integrated with the hybrid feature can effectively increase the prediction sensitivity of the developed program called Ensemble-AMPPred, resulting in overall improvements in terms of both sensitivity and specificity compared to those of currently available programs.

Keywords: AMP prediction; MaxProbVote; antimicrobial peptides; heterogeneous ensemble machine learning; logistic regression.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Antimicrobial Cationic Peptides / chemistry
  • Antimicrobial Cationic Peptides / pharmacology*
  • Databases, Genetic*
  • Machine Learning*
  • Reproducibility of Results
  • Sensitivity and Specificity
  • Software*

Substances

  • Antimicrobial Cationic Peptides