iAMY-RECMFF: Identifying amyloidgenic peptides by using residue pairwise energy content matrix and features fusion algorithm

J Bioinform Comput Biol. 2023 Oct;21(5):2350023. doi: 10.1142/S0219720023500233. Epub 2023 Oct 27.

Abstract

Various diseases, including Huntington's disease, Alzheimer's disease, and Parkinson's disease, have been reported to be linked to amyloid. Therefore, it is crucial to distinguish amyloid from non-amyloid proteins or peptides. While experimental approaches are typically preferred, they are costly and time-consuming. In this study, we have developed a machine learning framework called iAMY-RECMFF to discriminate amyloidgenic from non-amyloidgenic peptides. In our model, we first encoded the peptide sequences using the residue pairwise energy content matrix. We then utilized Pearson's correlation coefficient and distance correlation to extract useful information from this matrix. Additionally, we employed an improved similarity network fusion algorithm to integrate features from different perspectives. The Fisher approach was adopted to select the optimal feature subset. Finally, the selected features were inputted into a support vector machine for identifying amyloidgenic peptides. Experimental results demonstrate that our proposed method significantly improves the identification of amyloidgenic peptides compared to existing predictors. This suggests that our method may serve as a powerful tool in identifying amyloidgenic peptides. To facilitate academic use, the dataset and codes used in the current study are accessible at https://figshare.com/articles/online_resource/iAMY-RECMFF/22816916.

Keywords: Amyloidgenic peptides; Pearson’s correlation coefficient; distance correlation; residue pairwise energy content matrix; similarity network fusion algorithm.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Amino Acid Sequence
  • Machine Learning
  • Peptides* / chemistry
  • Support Vector Machine

Substances

  • Peptides