mRNA/protein sequence complementarity and its determinants: The impact of affinity scales

PLoS Comput Biol. 2017 Jul 27;13(7):e1005648. doi: 10.1371/journal.pcbi.1005648. eCollection 2017 Jul.

Abstract

It has recently been demonstrated that the nucleobase-density profiles of mRNA coding sequences are related in a complementary manner to the nucleobase-affinity profiles of their cognate protein sequences. Based on this, it has been proposed that cognate mRNA/protein pairs may bind in a co-aligned manner, especially if unstructured. Here, we study the dependence of mRNA/protein sequence complementarity on the properties of the nucleobase/amino-acid affinity scales used. Specifically, we sample the space of randomly generated scales by employing a Monte Carlo strategy with a fitness function that depends directly on the level of complementarity. For model organisms representing all three domains of life, we show that even short searches reproducibly converge upon highly optimized scales, implying that the topology of the underlying fitness landscape is decidedly funnel-like. Furthermore, the optimized scales, generated without any consideration of the physicochemical attributes of nucleobases or amino acids, resemble closely the nucleobase/amino-acid binding affinity scales obtained from experimental structures of RNA-protein complexes. This provides support for the claim that mRNA/protein sequence complementarity may indeed be related to binding between the two. Finally, we characterize suboptimal scales and show that intermediate-to-high complementarity can be reached by substantially diverse scales, but with select amino acids contributing disproportionally. Our results expose the dependence of cognate mRNA/protein sequence complementarity on the properties of the underlying nucleobase/amino-acid affinity scales and provide quantitative constraints that any physical scales need to satisfy for the complementarity to hold.
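The search strategy described in the abstract can be illustrated with a minimal sketch, offered under several assumptions that are not taken from the paper. It assumes a single nucleobase (G), per-codon density profiles smoothed with a short sliding window, a fitness equal to the mean Pearson correlation between mRNA density and protein affinity profiles, and a greedy accept-if-not-worse move rule. The function names, the window size, the step size, the sign convention of the fitness, and the two toy mRNA/protein pairs are all hypothetical and chosen only for illustration.

```python
import random
from statistics import mean

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Toy cognate pairs (hypothetical data, one codon per residue).
PAIRS = [
    ("ATGGGTGGCAAAGAAGGGTTTCTGGGAGCT", "MGGKEGFLGA"),
    ("ATGAAAGGGGAAGCTCTGTTTGGTAAAGGC", "MKGEALFGKG"),
]

def smooth(values, window):
    """Sliding-window average; the window shrinks to fit short inputs."""
    w = min(window, len(values))
    return [mean(values[i:i + w]) for i in range(len(values) - w + 1)]

def pearson(x, y):
    """Pearson correlation; returns 0.0 if either profile is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5

def base_density_profile(mrna, base, window):
    """Fraction of `base` in each codon, smoothed along the sequence."""
    usable = len(mrna) - len(mrna) % 3
    codons = [mrna[i:i + 3] for i in range(0, usable, 3)]
    return smooth([c.count(base) / 3 for c in codons], window)

def affinity_profile(protein, scale, window):
    """Scale value of each residue, smoothed along the sequence."""
    return smooth([scale[aa] for aa in protein], window)

def fitness(scale, pairs, base, window):
    """Assumed fitness: mean profile correlation over all cognate pairs."""
    return mean(
        pearson(base_density_profile(m, base, window),
                affinity_profile(p, scale, window))
        for m, p in pairs
    )

def monte_carlo_search(pairs, base="G", window=3, steps=300,
                       step_size=0.1, seed=0):
    """Greedy Monte Carlo search starting from a random 20-value scale."""
    rng = random.Random(seed)
    scale = {aa: rng.uniform(-1.0, 1.0) for aa in AMINO_ACIDS}
    best = fitness(scale, pairs, base, window)
    for _ in range(steps):
        aa = rng.choice(AMINO_ACIDS)       # perturb one amino acid's value
        old = scale[aa]
        scale[aa] = old + rng.gauss(0.0, step_size)
        trial = fitness(scale, pairs, base, window)
        if trial >= best:                  # keep moves that do not hurt
            best = trial
        else:                              # otherwise revert the move
            scale[aa] = old
    return scale, best

if __name__ == "__main__":
    optimized_scale, score = monte_carlo_search(PAIRS)
    print(f"final fitness: {score:.3f}")
```

Because the starting scale is random and no physicochemical information enters the search, repeating the run with different seeds and comparing the resulting scales is one simple way to probe whether independent searches converge, in the spirit of the funnel-like landscape reported above. A Metropolis acceptance criterion with a temperature parameter could replace the greedy rule if occasional downhill moves are desired.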

MeSH terms

  • Amino Acid Sequence / genetics
  • Amino Acid Sequence / physiology*
  • Base Sequence / genetics
  • Base Sequence / physiology*
  • Computational Biology
  • Escherichia coli / genetics
  • Methanocaldococcus / genetics
  • Models, Genetic
  • Monte Carlo Method
  • Proteins / chemistry*
  • Proteins / genetics
  • Proteins / metabolism*
  • RNA, Messenger / chemistry*
  • RNA, Messenger / genetics
  • RNA, Messenger / metabolism*
  • Saccharomyces cerevisiae / genetics
  • Software

Substances

  • Proteins
  • RNA, Messenger