Evolutionary Forces and Codon Bias in Different Flavors of Intrinsic Disorder in the Human Proteome

J Mol Evol. 2020 Mar;88(2):164-178. doi: 10.1007/s00239-019-09921-4. Epub 2019 Dec 10.

Abstract

In this study, we systematically analyze the evolutionary forces (i.e., mutational bias and natural selection) that shape the codon usage bias of human genes encoding proteins characterized by different flavors of intrinsic disorder. Well-structured proteins are expected to be under stronger purifying selection than intrinsically disordered proteins (IDPs), because one or a few mutations (even synonymous ones) in their genes can yield a protein that no longer folds correctly. Intrinsically disordered proteins, by contrast, are thought to evolve more rapidly than well-folded proteins, owing to relaxed purifying selection and a greater role of mutational bias. Using different bioinformatic tools, we find evidence that codon usage in IDPs is not only affected by a basic mutational bias but is also more selectively constrained than in the rest of the human proteome. We speculate that IDPs have not only a high tolerance to mutations but also a selective propensity to preserve their structural disorder under physiological conditions. Additionally, we confirm that IDPs are preferentially encoded by GC-rich genes and that they have the highest fraction of CpG sites in their sequences, implying a higher susceptibility to methylation and hence to C-to-T transition mutations. Overall, our results corroborate the essential role of intrinsic disorder in the evolutionary adaptability and evolvability of proteins, offering new insight into protein evolution not only in terms of functional properties and roles in disease but also in terms of the evolutionary forces to which these proteins are subjected.
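
The two sequence-composition quantities mentioned in the abstract, GC content and the fraction of CpG sites, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the toy sequence, and the choice to normalize CpG counts by the number of dinucleotide positions are all assumptions made here for illustration, and the paper may define these quantities differently.

```python
def gc_content(seq: str) -> float:
    """Fraction of bases that are G or C (assumed definition)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_fraction(seq: str) -> float:
    """Fraction of dinucleotide positions occupied by a CpG (assumed definition).

    A sequence of length n has n - 1 overlapping dinucleotide positions.
    """
    seq = seq.upper()
    positions = len(seq) - 1
    if positions <= 0:
        return 0.0
    # "CG" cannot overlap with itself, so str.count finds every occurrence.
    return seq.count("CG") / positions


if __name__ == "__main__":
    # Hypothetical toy coding sequence, not taken from the study.
    toy_cds = "ATGGCGCGTTAA"
    print(f"GC content:   {gc_content(toy_cds):.3f}")
    print(f"CpG fraction: {cpg_fraction(toy_cds):.3f}")
```

Under these assumed definitions, a gene set with a higher CpG fraction has more cytosines sitting in the dinucleotide context where methylation occurs, which is the link the abstract draws to C-to-T transitions.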

Keywords: Codon usage bias; Human proteome; Intrinsically disordered protein; Mutational bias; Natural selection; Neutral evolution.

MeSH terms

  • Base Composition
  • Codon Usage*
  • Computational Biology
  • Evolution, Molecular*
  • Humans
  • Intrinsically Disordered Proteins / genetics*
  • Models, Genetic
  • Mutation
  • Proteome / genetics*
  • Selection, Genetic*

Substances

  • Intrinsically Disordered Proteins
  • Proteome