Intrinsic protein disorder reduces small-scale gene duplicability

DNA Res. 2017 Aug 1;24(4):435-444. doi: 10.1093/dnares/dsx015.

Abstract

Whereas the rate of gene duplication is relatively high, only certain duplications survive the filter of natural selection and can contribute to genome evolution. However, the reasons why certain genes are retained after duplication, whereas others are not, remain largely unknown. Many proteins contain intrinsically disordered regions (IDRs), whose structures fluctuate between alternative conformational states. Due to their high flexibility, IDRs often enable protein-protein interactions and are targets of post-translational modifications. Intrinsically disordered proteins (IDPs) have characteristics that might either stimulate or hamper the retention of their encoding genes after duplication. On the one hand, IDRs may enable functional diversification, thus promoting duplicate retention. On the other hand, increased IDP availability is expected to result in deleterious non-specific interactions. Here, we interrogate the proteomes of human, Drosophila melanogaster, Caenorhabditis elegans, Saccharomyces cerevisiae, Arabidopsis thaliana and Escherichia coli to ascertain the impact of intrinsic protein disorder on gene duplicability. We show that, in general, proteins encoded by duplicated genes tend to be less disordered than those encoded by singletons. The only exception is proteins encoded by ohnologs, which tend to be more disordered than those encoded by singletons or by genes resulting from small-scale duplications. Our results indicate that duplication of genes encoding IDPs outside the context of whole-genome duplication (WGD) is often deleterious, but that IDRs facilitate retention of duplicates in the context of WGD. We discuss the potential evolutionary implications of our results.

Keywords: ohnologs; protein folding; singleton; unstructured proteins; whole genome duplications.

MeSH terms

  • Animals
  • Escherichia coli / genetics
  • Escherichia coli / metabolism
  • Eukaryota / genetics*
  • Eukaryota / metabolism
  • Evolution, Molecular*
  • Genes, Duplicate*
  • Genome*
  • Humans
  • Ploidies
  • Protein Folding*
  • Protein Processing, Post-Translational
  • Proteomics