Anticancer peptides prediction with deep representation learning features

Brief Bioinform. 2021 Sep 2;22(5):bbab008. doi: 10.1093/bib/bbab008.

Abstract

Anticancer peptides constitute one of the most promising classes of therapeutic agents for combating common human cancers. Using wet-lab experiments to verify whether a peptide displays anticancer characteristics is time-consuming and costly. Hence, in this study, we propose a computational method named identify anticancer peptides via deep representation learning features (iACP-DRLF), which combines the light gradient boosting machine (LightGBM) algorithm with deep representation learning features. Two kinds of sequence embedding technologies were used, namely soft symmetric alignment (SSA) embedding and unified representation (UniRep) embedding, both of which involve deep neural network models based on long short-term memory networks and their derived networks. The results showed that the use of deep representation learning features greatly improved the capability of the models to discriminate anticancer peptides from other peptides. In addition, UMAP (uniform manifold approximation and projection for dimension reduction) and SHAP (Shapley additive explanations) analyses indicated that UniRep features have an advantage over other features for anticancer peptide identification. The Python script and pretrained models can be downloaded from https://github.com/zhibinlv/iACP-DRLF or from http://public.aibiochem.net/iACP-DRLF/.
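
The overall shape of the approach described above (per-peptide embedding features fed to a gradient-boosted classifier) can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the iACP-DRLF code: the random vectors are hypothetical stand-ins for SSA and UniRep embeddings, the two feature sets are fused by simple concatenation, and a small pure-Python ensemble of boosted decision stumps stands in for LightGBM so that the example runs without external packages.

```python
import math
import random


def make_peptide_features(rng, is_acp):
    """Return a toy 6-dimensional feature vector for one peptide (synthetic data)."""
    # Hypothetical stand-in for an SSA-style embedding: pure noise here.
    ssa_like = [rng.gauss(0.0, 1.0) for _ in range(3)]
    # Hypothetical stand-in for a UniRep-style embedding: the first dimension
    # carries the class signal, the other two are noise.
    shift = 2.0 if is_acp else 0.0
    unirep_like = [rng.gauss(shift, 0.2)] + [rng.gauss(0.0, 1.0) for _ in range(2)]
    # Assumed fusion step: concatenate the two feature sets.
    return ssa_like + unirep_like


def make_dataset(rng, n_per_class):
    X, y = [], []
    for label in (1, 0):
        for _ in range(n_per_class):
            X.append(make_peptide_features(rng, bool(label)))
            y.append(label)
    return X, y


def fit_stump(X, residuals):
    """Pick the single-feature threshold split with the lowest squared error."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [r for row, r in zip(X, residuals) if row[j] <= thr]
            right = [r for row, r in zip(X, residuals) if row[j] > thr]
            left_mean = sum(left) / len(left)
            right_mean = sum(right) / len(right)
            err = (sum((r - left_mean) ** 2 for r in left)
                   + sum((r - right_mean) ** 2 for r in right))
            if best is None or err < best[0]:
                best = (err, j, thr, left_mean, right_mean)
    return best[1:]


def stump_predict(stump, row):
    j, thr, left_value, right_value = stump
    return left_value if row[j] <= thr else right_value


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_boosting(X, y, n_rounds=20, learning_rate=0.5):
    """Simplified gradient boosting for logistic loss using decision stumps."""
    scores = [0.0] * len(X)
    stumps = []
    for _ in range(n_rounds):
        # Negative gradient of the logistic loss with respect to the raw score.
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        scores = [s + learning_rate * stump_predict(stump, row)
                  for s, row in zip(scores, X)]
    return {"stumps": stumps, "learning_rate": learning_rate}


def predict_proba(model, row):
    raw = sum(model["learning_rate"] * stump_predict(s, row)
              for s in model["stumps"])
    return sigmoid(raw)


rng = random.Random(0)
X_train, y_train = make_dataset(rng, n_per_class=20)
X_test, y_test = make_dataset(rng, n_per_class=10)

model = fit_boosting(X_train, y_train)
predictions = [1 if predict_proba(model, row) >= 0.5 else 0 for row in X_test]
accuracy = sum(p == t for p, t in zip(predictions, y_test)) / len(y_test)
print("held-out accuracy on synthetic data:", accuracy)
```

The accuracy printed here reflects only the synthetic data and says nothing about the published results. In a real setting the synthetic vectors would be replaced by embeddings computed from peptide sequences, and the stump ensemble by a LightGBM model.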

Keywords: anticancer; feature selection; light gradient boosting; peptide; representation learning.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Antineoplastic Agents / chemistry
  • Antineoplastic Agents / therapeutic use*
  • Benchmarking
  • Computational Biology / methods*
  • Computer Simulation
  • Deep Learning*
  • Drug Discovery / methods*
  • Humans
  • Memory, Short-Term
  • Neoplasms / drug therapy*
  • Peptides / chemistry
  • Peptides / therapeutic use*

Substances

  • Antineoplastic Agents
  • Peptides