Predicting Functions of Uncharacterized Human Proteins: From Canonical to Proteoforms

Genes (Basel). 2020 Jun 21;11(6):677. doi: 10.3390/genes11060677.

Abstract

Despite tremendous efforts by the genomics, transcriptomics, and proteomics communities, there are still no comprehensive data on the exact number of protein-coding genes, translated proteoforms, and their functions. In addition, functional annotation is still lacking for 1193 genes whose expression has been confirmed at the proteomic level (uPE1 proteins). We re-analyzed the results of AP-MS experiments from the BioPlex 2.0 database to predict the functions of uPE1 proteins and their splice forms. By building a protein-protein interaction network for 12 thousand identified proteins encoded by 11 thousand genes, we were able to predict Gene Ontology categories for a total of 387 uPE1 genes. For four uPE1 genes, we predicted different functions for the canonical and alternatively spliced forms. In total, functional differences were revealed for 62 proteoforms encoded by 31 genes. Based on these results, it can be cautiously concluded that the dynamics and versatility of the interactome are ensured by changes in the dominant splice form. Overall, we propose that the analysis of large-scale AP-MS experiments performed for various cell lines and under various conditions is key to understanding the full range of roles that genes play in cellular processes.
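
The general idea of predicting functions from an interaction network can be illustrated with a minimal "guilt-by-association" sketch. This is not the authors' pipeline, whose details are not given in the abstract. It assumes a simple rule in which an unannotated protein inherits any Gene Ontology term carried by at least a chosen fraction of its annotated interaction partners. The protein names, the helper functions `build_network` and `predict_go_terms`, the `min_fraction` threshold, and the toy data are all hypothetical.

```python
from collections import Counter, defaultdict


def build_network(edges):
    """Build an undirected adjacency map from (bait, prey) pairs."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    return neighbors


def predict_go_terms(neighbors, annotations, protein, min_fraction=0.5):
    """Return GO terms shared by at least `min_fraction` of the annotated
    interaction partners of `protein` (assumed voting rule, for illustration)."""
    annotated = [n for n in neighbors.get(protein, ()) if n in annotations]
    if not annotated:
        return []
    counts = Counter(term for n in annotated for term in annotations[n])
    threshold = min_fraction * len(annotated)
    return sorted(term for term, c in counts.items() if c >= threshold)


# Hypothetical toy data: one uncharacterized protein with three annotated partners.
edges = [("uPE1_X", "A"), ("uPE1_X", "B"), ("uPE1_X", "C"), ("A", "B")]
annotations = {
    "A": {"GO:0006412"},
    "B": {"GO:0006412", "GO:0006457"},
    "C": {"GO:0016567"},
}

network = build_network(edges)
predicted = predict_go_terms(network, annotations, "uPE1_X")
print(predicted)
```

In this toy example, only the term carried by two of the three annotated partners passes the threshold. A real analysis would typically replace the simple vote with a statistical enrichment test and a multiple-testing correction, and would treat each splice form as a separate node in order to compare proteoforms.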

Keywords: AP-MS; BioPlex; Gene Ontology; function annotation; human interactome; protein coding genes; protein–protein interaction; proteoform; splice form; uPE1 proteins.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Alternative Splicing / genetics
  • Computational Biology
  • Databases, Protein
  • Gene Ontology
  • Humans
  • Molecular Sequence Annotation*
  • Protein Interaction Maps / genetics*
  • Proteome / classification
  • Proteome / genetics*
  • Proteomics*

Substances

  • Proteome