ProtCHOIR: a tool for proteome-scale generation of homo-oligomers

Brief Bioinform. 2021 Nov 5;22(6):bbab182. doi: 10.1093/bib/bbab182.

Abstract

The rapid developments in gene sequencing technologies achieved in the recent decades, along with the expansion of knowledge on the three-dimensional structures of proteins, have enabled the construction of proteome-scale databases of protein models such as the Genome3D and ModBase. Nevertheless, although gene products are usually expressed as individual polypeptide chains, most biological processes are associated with either transient or stable oligomerisation. In the PDB databank, for example, ~40% of the deposited structures contain at least one homo-oligomeric interface. Unfortunately, databases of protein models are generally devoid of multimeric structures. To tackle this particular issue, we have developed ProtCHOIR, a tool that is able to generate homo-oligomeric structures in an automated fashion, providing detailed information for the input protein and output complex. ProtCHOIR requires input of either a sequence or a protomeric structure that is queried against a pre-constructed local database of homo-oligomeric structures, then extensively analyzed using well-established tools such as PSI-Blast, MAFFT, PISA and Molprobity. Finally, MODELLER is employed to achieve the construction of the homo-oligomers. The output complex is thoroughly analyzed taking into account its stereochemical quality, interfacial stabilities, hydrophobicity and conservation profile. All these data are then summarized in a user-friendly HTML report that can be saved or printed as a PDF file. The software is easily parallelizable and also outputs a comma-separated file with summary statistics that can straightforwardly be concatenated as a spreadsheet-like document for large-scale data analyses. As a proof-of-concept, we built oligomeric models for the Mabellini Mycobacterium abscessus structural proteome database. ProtCHOIR can be run as a web-service and the code can be obtained free-of-charge at http://lmdm.biof.ufrj.br/protchoir.

Keywords: homo-oligomers; protein complexes; protein modelling; protein–protein interaction; proteome-scale modelling.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Computational Biology / methods*
  • Databases, Protein
  • Models, Molecular*
  • Promoter Regions, Genetic
  • Protein Binding
  • Protein Conformation
  • Protein Multimerization*
  • Proteome*
  • Proteomics / methods*
  • Reproducibility of Results
  • Software*
  • User-Computer Interface

Substances

  • Proteome