PATH - Prediction of Amyloidogenicity by Threading and Machine Learning

Sci Rep. 2020 May 7;10(1):7721. doi: 10.1038/s41598-020-64270-3.

Abstract

Amyloids are protein aggregates observed in several diseases, for example in Alzheimer's and Parkinson's diseases. An aggregate has a very regular beta structure with a tightly packed core, which spontaneously assumes a steric zipper form. Experimental methods enable the study of such peptides, but they are tedious and costly and therefore unsuitable for genome-wide studies. Several bioinformatic methods have been proposed to evaluate the propensity of a protein to form an amyloid. However, knowledge of aggregate structures is usually not taken into account. We propose PATH (Prediction of Amyloidogenicity by THreading), a novel structure-based method for predicting amyloidogenicity, and show that including available structures of amyloidogenic fragments enhances classification performance. Experimental aggregate structures were used in template-based modeling to recognize the most stable representative structural class of a query peptide. Several machine learning methods were then applied to the structural models, using their energy terms. Finally, we identified the most important terms in the classification of amyloidogenic peptides. The proposed method outperforms most of the currently available methods for predicting amyloidogenicity, with an area under the ROC curve of 0.876. Furthermore, the method gave insight into the significance of selected structural features and into the potentially most stable structural class of a peptide fragment if subjected to crystallization.
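
The two quantitative ideas in the abstract, choosing the most stable structural class for a threaded peptide and scoring a classifier by the area under the ROC curve, can be illustrated with a minimal Python sketch. It is not the PATH implementation: the class names, energy values, and function names below are hypothetical, and "most stable" is assumed here to mean the lowest total model energy. The AUC is computed with the standard pairwise (Mann-Whitney) formulation.

```python
from typing import Dict, Sequence


def most_stable_class(energies: Dict[str, float]) -> str:
    """Return the structural class whose threaded model has the lowest energy.

    Assumption: lower energy means a more stable model. The keys are
    hypothetical class labels, not the classes used in the paper.
    """
    return min(energies, key=energies.get)


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of scores.

    Equals the probability that a randomly chosen positive example
    (label 1) scores higher than a randomly chosen negative example
    (label 0); ties count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Made-up energies of one peptide threaded onto three template classes.
    example_energies = {"class_1": -12.4, "class_2": -15.1, "class_3": -9.8}
    print(most_stable_class(example_energies))

    # Made-up labels (1 = amyloidogenic) and classifier scores.
    print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

Under this formulation an AUC of 0.876 would mean that, for a randomly drawn pair of one amyloidogenic and one non-amyloidogenic peptide, the classifier ranks the amyloidogenic one higher about 87.6% of the time.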

MeSH terms

  • Algorithms
  • Alzheimer Disease / genetics
  • Alzheimer Disease / pathology
  • Amyloid / chemistry
  • Amyloid / ultrastructure*
  • Computational Biology / methods
  • Humans
  • Parkinson Disease / genetics
  • Parkinson Disease / pathology
  • Peptide Fragments / chemistry
  • Peptide Fragments / ultrastructure*
  • Protein Aggregates / genetics
  • Protein Aggregation, Pathological / genetics
  • Protein Aggregation, Pathological / pathology
  • Protein Conformation, beta-Strand / genetics*
  • Software*

Substances

  • Amyloid
  • Peptide Fragments
  • Protein Aggregates