Protein-Protein Interactions Prediction Based on Graph Energy and Protein Sequence Information

Molecules. 2020 Apr 16;25(8):1841. doi: 10.3390/molecules25081841.

Abstract

Identification of protein-protein interactions (PPIs) plays an essential role in understanding protein functions and cellular biological activities. However, traditional experiment-based methods are time-consuming and laborious, so developing reliable computational approaches has great practical significance for the identification of PPIs. In this paper, we propose a novel method, named PPI-GE, for predicting PPIs using graph energy. For feature extraction, we designed two new methods: the physicochemical graph energy, based on the ionization equilibrium constant and isoelectric point, and the contact graph energy, based on the contact information of amino acids. The dipeptide composition method was used to capture the order information of amino acids. After multi-information fusion, principal component analysis (PCA) was applied to reduce noise, and a robust weighted sparse representation-based classification (WSRC) classifier was used for sample classification. The prediction accuracies under five-fold cross-validation on the human, Helicobacter pylori (H. pylori), and yeast data sets were 99.49%, 97.15%, and 99.56%, respectively. In addition, on five independent data sets and two significant PPI networks, PPI-GE performed better than the compared methods.
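To make two of the feature ideas named in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes the standard definition of graph energy (the sum of the absolute eigenvalues of a graph's adjacency matrix) and the usual 400-dimensional dipeptide composition (normalized counts of adjacent residue pairs). The abstract does not say how the physicochemical and contact graphs are built, so the adjacency matrix used below is a toy example, and the function names are hypothetical.

```python
from itertools import product

import numpy as np

# The 20 standard amino acids and all 400 ordered residue pairs.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
DIPEPTIDE_INDEX = {dp: i for i, dp in enumerate(DIPEPTIDES)}


def dipeptide_composition(sequence):
    """Return a 400-dim vector of adjacent-residue pair frequencies.

    Pairs containing non-standard residues are skipped (an assumption
    made here; the abstract does not specify how they are handled).
    """
    vec = np.zeros(len(DIPEPTIDES))
    pairs = [sequence[i:i + 2] for i in range(len(sequence) - 1)]
    valid = [p for p in pairs if p in DIPEPTIDE_INDEX]
    for p in valid:
        vec[DIPEPTIDE_INDEX[p]] += 1
    if valid:
        vec /= len(valid)  # normalize counts to frequencies
    return vec


def graph_energy(adjacency):
    """Graph energy: sum of absolute eigenvalues of a symmetric adjacency matrix."""
    a = np.asarray(adjacency, dtype=float)
    if a.ndim != 2 or a.shape[0] != a.shape[1]:
        raise ValueError("adjacency must be a square matrix")
    if not np.allclose(a, a.T):
        raise ValueError("adjacency must be symmetric")
    # eigvalsh is appropriate for symmetric (Hermitian) matrices.
    return float(np.abs(np.linalg.eigvalsh(a)).sum())


# Toy inputs for illustration only: a path graph on three vertices
# and a short made-up sequence.
toy_adjacency = [[0, 1, 0],
                 [1, 0, 1],
                 [0, 1, 0]]
toy_energy = graph_energy(toy_adjacency)
toy_features = dipeptide_composition("ACAC")
```

In a pipeline of the kind the abstract describes, scalar energies like `toy_energy` and vectors like `toy_features` would be concatenated per protein pair before dimensionality reduction and classification. How the paper weights or combines them is not stated in the abstract.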

Keywords: WSRC classifier; contact information; graph energy; physicochemical properties; protein-protein interaction.

MeSH terms

  • Computational Biology / methods*
  • Databases, Protein
  • Helicobacter pylori / metabolism
  • Humans
  • Isoelectric Point
  • Principal Component Analysis
  • Protein Interaction Mapping / methods*
  • Protein Interaction Maps
  • Proteins / metabolism*
  • Saccharomyces cerevisiae / metabolism
  • Support Vector Machine

Substances

  • Proteins