Predicting protein complexes using a supervised learning method combined with local structural information

PLoS One. 2018 Mar 19;13(3):e0194124. doi: 10.1371/journal.pone.0194124. eCollection 2018.

Abstract

Existing protein complex detection methods can be broadly divided into two categories: unsupervised and supervised learning methods. Most unsupervised learning methods assume that protein complexes correspond to dense regions of protein-protein interaction (PPI) networks, even though many true complexes are not dense subgraphs. Supervised learning methods exploit the informative properties of known complexes: they typically extract features from existing complexes, use those features to train a classification model, and then use the trained model to guide the search for new complexes. However, insufficient extracted features, noise in the PPI data and the incompleteness of complex data make the classification model imprecise, so the model alone is not sufficient for guiding the detection of complexes. Therefore, we propose a new robust score function that combines the classification model with local structural information. Based on the score function, we provide a search method that works both forwards and backwards. Experiments on six benchmark PPI datasets and three protein complex datasets show that our approach can outperform state-of-the-art supervised, semi-supervised and unsupervised methods for protein complex detection, in some cases significantly.
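
The abstract does not give the form of the score function or the search procedure, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a hypothetical blend of a classifier output with subgraph edge density (standing in for "local structural information"), a hypothetical weight `alpha`, and a greedy search that can either add a neighbouring protein (forward step) or drop a member (backward step). The names `density`, `combined_score`, `search` and `toy_model` are all invented for this example, and `toy_model` is a stand-in for a trained classifier.

```python
from itertools import combinations


def density(graph, nodes):
    """Edge density of the subgraph induced by `nodes` (0.0 for fewer than 2 nodes)."""
    n = len(nodes)
    if n < 2:
        return 0.0
    edges = sum(1 for u, v in combinations(nodes, 2) if v in graph[u])
    return 2.0 * edges / (n * (n - 1))


def combined_score(graph, nodes, model_prob, alpha=0.5):
    """Assumed blend: alpha * classifier output + (1 - alpha) * local density."""
    return alpha * model_prob(graph, nodes) + (1 - alpha) * density(graph, nodes)


def search(graph, seed, model_prob, alpha=0.5, max_steps=100):
    """Greedy search that may grow (forward) or shrink (backward) the candidate."""
    current = frozenset(seed)
    best = combined_score(graph, current, model_prob, alpha)
    for _ in range(max_steps):
        # Forward moves: add any protein adjacent to the current candidate.
        neighbours = set().union(*(graph[u] for u in current)) - current
        candidates = [current | {v} for v in neighbours]
        # Backward moves: drop one member, keeping at least two proteins.
        if len(current) > 2:
            candidates += [current - {v} for v in current]
        scored = [
            (combined_score(graph, c, model_prob, alpha), sorted(c))
            for c in candidates
        ]
        if not scored:
            break
        top_score, top_nodes = max(scored)
        if top_score <= best:
            break  # no move improves the score
        current, best = frozenset(top_nodes), top_score
    return set(current), best


# Toy PPI network: a triangle A-B-C with a pendant protein D attached to C.
graph = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}


def toy_model(graph, nodes):
    """Stand-in for a trained classifier: favours candidates with >= 3 proteins."""
    return min(len(nodes), 3) / 3.0


grown, grown_score = search(graph, {"A", "B"}, toy_model)
pruned, pruned_score = search(graph, {"A", "B", "C", "D"}, toy_model)
print(sorted(grown), sorted(pruned))
```

In this toy network the forward step grows the seed {A, B} into the triangle, and the backward step prunes the pendant protein D from {A, B, C, D}, so both runs end at the same candidate. A real implementation would replace `toy_model` with a classifier trained on features of known complexes.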

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Machine Learning*
  • Protein Interaction Mapping / methods*
  • Protein Interaction Maps*
  • Proteomics / methods*
  • Software*

Grants and funding

This work was supported by the National Natural Science Foundation of China (www.nsfc.gov.cn) under grants No. 61572005 (YQS, CQ), No. 61272004 (YQS) and No. 61672086 (YQS), and by the Fundamental Research Funds for the Central Universities under grant K17JB00220 (CQ). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.