ITP-Pred: an interpretable method for predicting therapeutic peptides with fused features low-dimension representation

Brief Bioinform. 2021 Jul 20;22(4):bbaa367. doi: 10.1093/bib/bbaa367.

Abstract

The peptide therapeutics market is providing new opportunities for the biotechnology and pharmaceutical industries. Therefore, identifying therapeutic peptides and exploring their properties are important. Although several studies have proposed machine learning methods for predicting whether a peptide is therapeutic, most do not explain the model's decision factors in detail. In this work, we developed an Interpretable Therapeutic Peptide Prediction (ITP-Pred) model based on efficient feature fusion. First, we proposed three kinds of feature descriptors based on sequence and physicochemical property encoding, namely amino acid composition (AAC), group AAC and coding autocorrelation, and concatenated them to obtain the feature representation of a therapeutic peptide. Then, we fed this representation into a convolutional neural network (CNN)-bidirectional long short-term memory (BiLSTM) model, which automatically learns to recognize therapeutic peptides. Results of cross-validation and independent validation experiments indicated that ITP-Pred achieves higher prediction performance on the benchmark dataset than the comparison methods. Finally, we analyzed the output of the model from two aspects, sequence order and physicochemical properties, mining important features to guide the design of better models that can complement existing methods.
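
The feature-fusion step described above can be illustrated with a minimal sketch. It computes AAC and a grouped AAC for a peptide sequence and concatenates them into one vector. The abstract does not specify the grouping scheme or the coding autocorrelation descriptor, so the five residue groups below are an assumption chosen for illustration, the function names are hypothetical, and the autocorrelation descriptor is omitted. This is not the authors' implementation.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so feature positions are stable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# ASSUMPTION: a hypothetical physicochemical grouping for illustration only;
# the abstract does not state which groups ITP-Pred uses.
GROUPS = {
    "aliphatic": "GAVLMI",
    "aromatic": "FYW",
    "positive": "KRH",
    "negative": "DE",
    "uncharged": "STCPNQ",
}


def aac(seq):
    """Amino acid composition: frequency of each standard residue (20 values)."""
    seq = seq.upper()
    counts = Counter(seq)
    return [counts[a] / len(seq) for a in AMINO_ACIDS]


def grouped_aac(seq):
    """Grouped AAC: fraction of residues falling in each group (5 values)."""
    seq = seq.upper()
    counts = Counter(seq)
    return [sum(counts[a] for a in members) / len(seq)
            for members in GROUPS.values()]


def fused_features(seq):
    """Concatenate the descriptors into a single feature vector (25 values)."""
    if not seq:
        raise ValueError("sequence must be non-empty")
    return aac(seq) + grouped_aac(seq)


if __name__ == "__main__":
    vector = fused_features("ACDKK")
    print(len(vector), vector)
```

In this sketch, residues outside the 20 standard amino acids still count toward the sequence length but contribute to no feature, so the frequencies sum to 1 only for sequences made of standard residues. A vector of this kind would then be the input to a sequence classifier such as the CNN-BiLSTM named in the abstract.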

Keywords: CNN-BiLSTM; feature fusion; interpretability analysis; therapeutic peptides prediction.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Machine Learning*
  • Models, Genetic*
  • Peptides / chemistry
  • Peptides / genetics*
  • Peptides / therapeutic use
  • Sequence Analysis, Protein*

Substances

  • Peptides