ADP-Fuse: A novel two-layer machine learning predictor to identify antidiabetic peptides and diabetes types using multiview information

Comput Biol Med. 2023 Oct:165:107386. doi: 10.1016/j.compbiomed.2023.107386. Epub 2023 Aug 14.

Abstract

Diabetes mellitus has become a major public health concern: it is associated with high mortality and reduced life expectancy, and it can cause blindness, heart attacks, kidney failure, lower limb amputations, and strokes. A new generation of antidiabetic peptides (ADPs) that act on β-cells or T-cells to regulate insulin production is being developed to alleviate the effects of diabetes. However, the lack of effective peptide-mining tools has hampered the discovery of these promising drugs, so novel computational tools are urgently needed. In this study, we present ADP-Fuse, a novel two-layer prediction framework that accurately distinguishes ADPs from non-ADPs and then categorizes the ADPs as type 1 or type 2. First, we comprehensively evaluated 22 peptide sequence-derived features coupled with eight notable machine learning algorithms. From this evaluation, the most suitable feature descriptors and classifiers for each layer were identified. The outputs of these single-feature models, which together carry multiview information, were then used to train an appropriate classifier that provides the final prediction. Comprehensive cross-validation and independent tests show that ADP-Fuse outperforms both single-feature models and the feature fusion approach in predicting ADPs and their types. In addition, the SHapley Additive exPlanation method was used to elucidate the contributions of individual features to the prediction of ADPs and their types. Finally, a user-friendly web server for ADP-Fuse was developed and made publicly accessible (https://balalab-skku.org/ADP-Fuse), enabling the swift screening and identification of novel ADPs and their types. This framework is expected to contribute significantly to antidiabetic peptide identification.
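The two-layer, stacked design described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the two feature views (`aac`, `group_composition`), the `StackedLayer` class, the `two_layer_predict` function, the 0.5 threshold, and the choice of type 2 as the positive class in the second layer are all hypothetical, and the scorers are hand-written placeholders standing in for trained classifiers.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Coarse residue groups: an illustrative second "view", not the paper's descriptors.
GROUPS = {
    "hydrophobic": "AVLIMFWC",
    "polar": "STNQYGP",
    "charged": "DEKRH",
}

View = Callable[[str], List[float]]          # sequence -> feature vector
Scorer = Callable[[Sequence[float]], float]  # feature vector -> probability


def aac(seq: str) -> List[float]:
    """Amino acid composition: fraction of each of the 20 standard residues."""
    counts = Counter(seq)
    return [counts[a] / len(seq) for a in AMINO_ACIDS]


def group_composition(seq: str) -> List[float]:
    """Fraction of residues falling in each coarse physicochemical group."""
    return [sum(seq.count(a) for a in members) / len(seq)
            for members in GROUPS.values()]


class StackedLayer:
    """One layer: per-view base scorers feed a meta-scorer (stacking)."""

    def __init__(self, views: Dict[str, View], base: Dict[str, Scorer], meta: Scorer):
        self.views, self.base, self.meta = views, base, meta

    def predict_proba(self, seq: str) -> float:
        # Each base model sees only its own view of the sequence.
        base_scores = [self.base[name](view(seq)) for name, view in self.views.items()]
        # The meta-model sees only the base models' outputs.
        return self.meta(base_scores)


def two_layer_predict(seq: str, layer1: StackedLayer, layer2: StackedLayer,
                      threshold: float = 0.5) -> dict:
    """Layer 1: ADP vs non-ADP. Layer 2 (ADPs only): type 1 vs type 2."""
    seq = seq.strip().upper()
    if not seq or any(c not in AMINO_ACIDS for c in seq):
        raise ValueError("sequence must be non-empty and use the 20 standard residues")
    p_adp = layer1.predict_proba(seq)
    if p_adp < threshold:
        return {"label": "non-ADP", "p_adp": p_adp}
    # Assumption: layer 2 outputs the probability of the type 2 class.
    p_type2 = layer2.predict_proba(seq)
    label = "type 2 ADP" if p_type2 >= threshold else "type 1 ADP"
    return {"label": label, "p_adp": p_adp, "p_type2": p_type2}


if __name__ == "__main__":
    def mean(scores: Sequence[float]) -> float:
        return sum(scores) / len(scores)

    views = {"aac": aac, "groups": group_composition}
    # Placeholder scorers; real ones would be classifiers fitted on labeled peptides.
    layer1 = StackedLayer(views, {"aac": lambda v: 0.8, "groups": lambda v: 0.7}, mean)
    layer2 = StackedLayer(views, {"aac": lambda v: max(v), "groups": lambda v: v[0]}, mean)
    print(two_layer_predict("HGEGTFTSDLSK", layer1, layer2))
```

The second layer is consulted only when the first layer calls a peptide an ADP, which mirrors the cascade the abstract describes. In a trained system the meta-scorer would itself be a classifier fitted on cross-validated base-model outputs rather than a simple average.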

Keywords: Antidiabetic peptides; Bioinformatics; Machine learning; Multiview information; Sequence analysis; Stacking ensemble learning.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Amino Acid Sequence
  • Computational Biology
  • Diabetes Mellitus*
  • Hypoglycemic Agents*
  • Machine Learning
  • Peptides

Substances

  • Hypoglycemic Agents
  • Peptides