BBPpred: Sequence-Based Prediction of Blood-Brain Barrier Peptides with Feature Representation Learning and Logistic Regression

J Chem Inf Model. 2021 Jan 25;61(1):525-534. doi: 10.1021/acs.jcim.0c01115. Epub 2021 Jan 11.

Abstract

Blood-brain barrier peptides (BBPs) have a wide range of biomedical applications because they can cross the blood-brain barrier through different mechanisms. As experimental methods for identifying BBPs are laborious and expensive, computational approaches for predicting BBPs are needed. In this work, we describe a computational method, BBPpred (blood-brain barrier peptides prediction), that can efficiently identify BBPs using logistic regression. We investigate a wide variety of features derived from amino acid sequence information and then adopt a feature learning method to represent the informative features. To improve prediction performance, seven informative features are selected for classification by eliminating redundant and irrelevant features. In addition, we create two benchmark data sets (training and independent test), which together contain 119 BBPs from public databases and the literature. On the training data set, BBPpred shows promising performance, with an AUC score of 0.8764 and an AUPR score of 0.8757 under 10-fold cross-validation. We also test our new method on the independent test data set and obtain favorable performance. We envision that BBPpred will be a useful tool for identifying, annotating, and characterizing BBPs. BBPpred is freely available at http://BBPpred.xialab.info.
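
To make the two core ideas concrete (turning a peptide sequence into numeric features, and scoring a classifier by AUC), here is a minimal Python sketch. It is not the BBPpred implementation: the abstract does not name the individual features, so amino acid composition is used only as an assumed, commonly used example of a sequence-derived feature, and the names `aa_composition` and `roc_auc` are hypothetical.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so every peptide
# maps to a feature vector of the same length.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aa_composition(sequence):
    """Return the fraction of each standard amino acid in a peptide.

    Assumed illustrative feature; non-standard characters are ignored.
    """
    counts = Counter(sequence.upper())
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def roc_auc(labels, scores):
    """Compute AUC as the probability that a randomly chosen positive
    example receives a higher score than a randomly chosen negative one
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

In a full pipeline, vectors like those from `aa_composition` would be passed to a logistic regression model, and a function like `roc_auc` would be applied to the held-out predictions of each cross-validation fold. The pairwise AUC above is quadratic in the number of examples, which is acceptable for a data set on the scale of a few hundred peptides.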

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Blood-Brain Barrier*
  • Logistic Models
  • Peptides*

Substances

  • Peptides