A survey on computational models for predicting protein-protein interactions

Brief Bioinform. 2021 Sep 2;22(5):bbab036. doi: 10.1093/bib/bbab036.

Abstract

Proteins interact with each other to play critical roles in many biological processes in cells. Laboratory experiments for detecting these interactions, although promising, are usually time-consuming and labor-intensive, and the results they produce are often not robust and carry considerable uncertainty. Owing to recent advances in high-throughput technologies, a large amount of proteomics data has been collected, which presents both a significant opportunity and a challenge: developing computational models that predict protein-protein interactions (PPIs) from these data. In this paper, we present a comprehensive survey of recent efforts towards the development of effective computational models for PPI prediction. The survey introduces the algorithms that can be used to learn computational models for predicting PPIs and classifies these models into different categories. To clarify their relative merits, the paper discusses the validation schemes and metrics used to evaluate prediction performance. Biological databases that are commonly used in experiments for performance comparison are also described, and their use in a series of extensive experiments comparing different prediction models is discussed. Finally, we present some open issues in PPI prediction for future work and explain how prediction performance can be improved if these issues are effectively tackled.
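As an illustration of the kind of evaluation metrics the abstract refers to, the following is a minimal sketch in Python of how binary PPI predictions (1 = interacting pair, 0 = non-interacting pair) might be scored. The abstract does not name specific metrics, so the choice here (accuracy, precision, recall, specificity, F1 and the Matthews correlation coefficient) is an assumption, and the function names `confusion_counts` and `ppi_metrics` are hypothetical and not taken from the paper.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 or 0)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def _safe_div(numerator, denominator):
    """Return 0.0 instead of raising when a denominator is zero."""
    return numerator / denominator if denominator else 0.0


def ppi_metrics(y_true, y_pred):
    """Compute common binary-classification metrics (assumed, not from the paper)."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    precision = _safe_div(tp, tp + fp)
    recall = _safe_div(tp, tp + fn)  # also called sensitivity
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": _safe_div(tp + tn, tp + tn + fp + fn),
        "precision": precision,
        "recall": recall,
        "specificity": _safe_div(tn, tn + fp),
        "f1": _safe_div(2 * precision * recall, precision + recall),
        "mcc": _safe_div(tp * tn - fp * fn, mcc_denominator),
    }


if __name__ == "__main__":
    # Toy labels for six candidate protein pairs (made up for illustration).
    true_labels = [1, 1, 1, 0, 0, 0]
    predicted_labels = [1, 1, 0, 0, 0, 1]
    for name, value in ppi_metrics(true_labels, predicted_labels).items():
        print(f"{name}: {value:.3f}")
```

In practice such metrics would typically be averaged over the folds of a cross-validation run; the toy labels above only exercise the arithmetic.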

Keywords: biological databases; computational prediction models; performance evaluation; protein–protein interaction.

Publication types

  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Computational Biology / methods*
  • Databases, Genetic
  • Databases, Protein
  • Gene Ontology
  • Humans
  • Models, Molecular
  • Protein Conformation
  • Protein Interaction Domains and Motifs
  • Protein Interaction Mapping / methods*
  • Protein Interaction Mapping / statistics & numerical data
  • Proteins / chemistry
  • Proteins / genetics
  • Proteins / metabolism*
  • Saccharomyces cerevisiae / genetics
  • Saccharomyces cerevisiae / metabolism
  • Software*
  • Support Vector Machine*

Substances

  • Proteins