Exploring Potential Signals of Selection for Disordered Residues in Prokaryotic and Eukaryotic Proteins

Genomics Proteomics Bioinformatics. 2020 Oct;18(5):549-564. doi: 10.1016/j.gpb.2020.06.005. Epub 2020 Dec 18.

Abstract

Intrinsically disordered proteins (IDPs) are a functionally important class of proteins in all domains of life. However, it remains unclear how nature has shaped the disorder potential of prokaryotic and eukaryotic proteins. Randomly generated sequences are free of selective constraints and are therefore commonly used as null models. Using several types of random protein models, we ask how the disorder potential of natural eukaryotic and prokaryotic proteins differs from that of random sequences. Comparing proteome-wide disorder content between real and random sequences of 12 model organisms, we found that eukaryotic proteins are enriched in disordered regions relative to random sequences, whereas prokaryotic proteins are depleted in them. Position-wise disorder profiles show that disorder is generally higher near the N- and C-terminal regions of eukaryotic proteins than in the random models, whereas prokaryotic proteins show only a weak trend or none at all. Moreover, we show that this preference is not caused by the amino acid or nucleotide composition at the respective sites. Instead, these regions contain a higher fraction of protein-protein binding sites, suggesting that they are functionally important. We discuss several possible explanations for this pattern, such as improving the efficiency of protein-protein interaction, ribosome movement during translation, and post-translational modification. However, further studies are needed to clarify the biophysical mechanisms underlying this trend.
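
The comparison of real sequences against a random null model, summarized by a Z-score, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy window-based disorder score (a crude stand-in for a real disorder predictor), the residue set DISORDER_PROMOTING, the window and threshold values, and the choice of residue shuffling as the random model. The function names are hypothetical.

```python
import random
import statistics

# Assumption: a crude stand-in for a real disorder predictor.
# These residues are often described as disorder-promoting.
DISORDER_PROMOTING = set("ARGQSPEK")


def disorder_content(seq, window=7, threshold=0.6):
    """Fraction of residues whose local window is rich in disorder-promoting residues."""
    if not seq:
        return 0.0
    half = window // 2
    flagged = 0
    for i in range(len(seq)):
        # Windows are truncated at the sequence ends.
        win = seq[max(0, i - half): i + half + 1]
        frac = sum(r in DISORDER_PROMOTING for r in win) / len(win)
        if frac >= threshold:
            flagged += 1
    return flagged / len(seq)


def shuffled(seq, rng):
    """One possible random model: same composition, residue order randomized."""
    residues = list(seq)
    rng.shuffle(residues)
    return "".join(residues)


def disorder_z_score(seq, n_random=200, seed=0):
    """Z-score of the observed disorder content against shuffled sequences."""
    rng = random.Random(seed)
    observed = disorder_content(seq)
    null = [disorder_content(shuffled(seq, rng)) for _ in range(n_random)]
    mu = statistics.mean(null)
    sigma = statistics.pstdev(null)
    if sigma == 0:
        # Null distribution has no spread (e.g. a homopolymer); Z is undefined.
        return 0.0
    return (observed - mu) / sigma


if __name__ == "__main__":
    # Hypothetical sequence: a disorder-rich half followed by a hydrophobic half.
    example = "SPEKSPEKSPEKSPEKSPEK" + "LIVFWLIVFWLIVFWLIVFW"
    print(disorder_z_score(example))
```

Under these assumptions, a positive Z-score means the sequence has more predicted disorder than its shuffled counterparts (enrichment), and a negative one means less (depletion). Because shuffling preserves amino acid composition, this particular null model only detects effects that depend on residue order.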

Keywords: Comparative genomics; Gene function; Intrinsically disordered protein; Proteome evolution; Z-score.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computational Biology
  • Eukaryota* / genetics
  • Intrinsically Disordered Proteins* / genetics
  • Protein Conformation
  • Proteome

Substances

  • Intrinsically Disordered Proteins
  • Proteome