Protein complex prediction: A survey

Genomics. 2020 Jan;112(1):174-183. doi: 10.1016/j.ygeno.2019.01.011. Epub 2019 Jan 17.

Abstract

Protein complexes are among the most important functional units driving biological processes within the cell. Experimental methods have provided valuable data for inferring protein complexes, but these methods have inherent limitations. To address these limitations, many computational methods for predicting protein complexes have been proposed over the last decade. Almost all of these in silico methods predict protein complexes from the ever-increasing body of protein-protein interaction (PPI) data. They usually take the PPI data as input in the form of a large protein-protein interaction network (PPIN) and output sub-networks of that PPIN as the predicted protein complexes. Some of these methods already detect protein complexes with promising efficiency. Nonetheless, predicting other types of protein complexes, especially sparse and small ones, remains challenging. New methods should further incorporate knowledge of the biological properties of proteins to improve performance. Several other challenges should also be addressed more effectively when designing future complex prediction algorithms. This article reviews the history of computational protein complex prediction and offers new insight for improving future methodologies. The most important computational methods for protein complex prediction are evaluated and compared, and some of the challenges in reconstructing protein complexes are discussed. Finally, various tools for protein complex prediction and PPIN analysis, as well as the current high-throughput databases, are reviewed.
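
The input/output pattern described above (a PPIN goes in, sub-networks come out as predicted complexes) can be illustrated with a minimal sketch. It is not one of the methods evaluated in the survey: it assumes an unweighted, undirected network and treats every maximal clique of at least `min_size` proteins as a candidate complex, using the basic Bron-Kerbosch recursion. The function names, the `min_size` parameter, and the toy protein labels are all hypothetical.

```python
from collections import defaultdict


def build_ppin(interactions):
    """Build an undirected adjacency map from (protein_a, protein_b) pairs."""
    adjacency = defaultdict(set)
    for a, b in interactions:
        if a != b:  # ignore self-interactions
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


def maximal_cliques(adjacency):
    """Yield every maximal clique (basic Bron-Kerbosch, no pivoting)."""

    def expand(clique, candidates, excluded):
        if not candidates and not excluded:
            yield frozenset(clique)
            return
        for v in list(candidates):
            yield from expand(
                clique | {v},
                candidates & adjacency[v],
                excluded & adjacency[v],
            )
            candidates = candidates - {v}
            excluded = excluded | {v}

    yield from expand(set(), set(adjacency), set())


def predict_complexes(interactions, min_size=3):
    """Return candidate complexes: maximal cliques with >= min_size proteins.

    Assumption for illustration only: a complex is a fully connected
    sub-network. Real predictors use richer density and biological criteria.
    """
    adjacency = build_ppin(interactions)
    cliques = (c for c in maximal_cliques(adjacency) if len(c) >= min_size)
    return sorted(sorted(c) for c in cliques)


# Toy PPIN with made-up protein labels: two triangles joined by one edge,
# plus a dangling interaction F-G.
toy_interactions = [
    ("A", "B"), ("A", "C"), ("B", "C"),
    ("C", "D"),
    ("D", "E"), ("D", "F"), ("E", "F"),
    ("F", "G"),
]

for members in predict_complexes(toy_interactions):
    print(members)
```

On the toy network this reports the two triangles and drops the pairs C-D and F-G. A strict clique criterion with a size threshold misses exactly the sparse and small complexes that the abstract identifies as difficult, which is one reason practical methods relax the density requirement or add biological information.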

Keywords: Network clustering; Protein complex; Protein interaction network; Protein–protein interaction.

Publication types

  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Computational Biology / methods
  • Multiprotein Complexes / metabolism*
  • Protein Interaction Mapping*
  • Software

Substances

  • Multiprotein Complexes