ACP-GBDT: An improved anticancer peptide identification method with gradient boosting decision tree

Front Genet. 2023 Mar 29;14:1165765. doi: 10.3389/fgene.2023.1165765. eCollection 2023.

Abstract

Cancer is one of the most dangerous diseases in the world, killing millions of people every year. In recent years, drugs composed of anticancer peptides have been used to treat cancer with few side effects. Identifying anticancer peptides has therefore become a focus of research. In this study, we propose ACP-GBDT, an improved anticancer peptide predictor based on a gradient boosting decision tree (GBDT) and sequence information. To encode the peptide sequences in the anticancer peptide dataset, ACP-GBDT uses a merged feature composed of AAIndex and SVMProt-188D, and a GBDT is then trained on the encoded sequences to build the prediction model. Independent testing and ten-fold cross-validation show that ACP-GBDT can effectively distinguish anticancer peptides from non-anticancer ones. Comparison results on the benchmark dataset show that ACP-GBDT is simpler and more effective than other existing anticancer peptide prediction methods.
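As a rough illustration of what a merged sequence feature and a ten-fold split look like in practice, the sketch below concatenates two simple descriptors and generates cross-validation folds. It is not the ACP-GBDT implementation. It assumes amino acid composition as a stand-in for the SVMProt-188D descriptor, the mean Kyte-Doolittle hydropathy value (AAindex entry KYTJ820101) as a stand-in for the AAIndex features, and a made-up example peptide. The function names are hypothetical, and no classifier is trained here.

```python
# Standard 20 amino acids; sequences with other symbols are not handled here.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy scale, used as one example physicochemical index.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def composition(seq):
    """Fraction of each of the 20 amino acids in the sequence (20 values)."""
    seq = seq.upper()
    return [seq.count(aa) / len(seq) for aa in AMINO_ACIDS]


def mean_index(seq, index=KYTE_DOOLITTLE):
    """Average of a per-residue physicochemical index over the sequence."""
    seq = seq.upper()
    return sum(index[aa] for aa in seq) / len(seq)


def merged_feature(seq):
    """Concatenate the two descriptors into one feature vector (21 values)."""
    return composition(seq) + [mean_index(seq)]


def k_fold_indices(n_samples, k=10):
    """Yield (train, test) index lists; each sample is tested exactly once."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test


if __name__ == "__main__":
    peptide = "FLPKLA"  # made-up sequence, for illustration only
    print(len(merged_feature(peptide)), "features")
    for train, test in k_fold_indices(25, k=10):
        print(len(train), "train /", len(test), "test")
```

In a full pipeline, each training fold's feature vectors and labels would be passed to a gradient boosting classifier and the held-out fold used for evaluation, with the per-fold scores averaged.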

Keywords: anticancer peptides; artificial intelligence; biological sequence analysis; machine learning; protein identification.

Grants and funding

This work was supported in part by the Research Start-up Funding Project of Quzhou University (BSYJ202109 and BSYJ202112), Key Program of Zhejiang Natural Science Foundation (Z23F020002), National Natural Science Foundation of China (62272267, 61901103, and 61671189), and Natural Science Foundation of Heilongjiang Province (LH2019F002).