Prediction of anticancer peptides based on an ensemble model of deep learning and machine learning using ordinal positional encoding

Brief Bioinform. 2023 Jan 19;24(1):bbac630. doi: 10.1093/bib/bbac630.

Abstract

Anticancer peptides (ACPs) are the types of peptides that have been demonstrated to have anticancer activities. Using ACPs to prevent cancer could be a viable alternative to conventional cancer treatments because they are safer and display higher selectivity. Due to ACP identification being highly lab-limited, expensive and lengthy, a computational method is proposed to predict ACPs from sequence information in this study. The process includes the input of the peptide sequences, feature extraction in terms of ordinal encoding with positional information and handcrafted features, and finally feature selection. The whole model comprises of two modules, including deep learning and machine learning algorithms. The deep learning module contained two channels: bidirectional long short-term memory (BiLSTM) and convolutional neural network (CNN). Light Gradient Boosting Machine (LightGBM) was used in the machine learning module. Finally, this study voted the three models' classification results for the three paths resulting in the model ensemble layer. This study provides insights into ACP prediction utilizing a novel method and presented a promising performance. It used a benchmark dataset for further exploration and improvement compared with previous studies. Our final model has an accuracy of 0.7895, sensitivity of 0.8153 and specificity of 0.7676, and it was increased by at least 2% compared with the state-of-the-art studies in all metrics. Hence, this paper presents a novel method that can potentially predict ACPs more effectively and efficiently. The work and source codes are made available to the community of researchers and developers at https://github.com/khanhlee/acp-ope/.

Keywords: Anticancer peptide; Deep learning; Feature fusion; Handcrafted feature; Machine learning; Model ensemble; Ordinal positional encoding.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Deep Learning*
  • Machine Learning
  • Neural Networks, Computer
  • Peptides / therapeutic use

Substances

  • Peptides