Reversibility and efficiency in coding protein information

J Theor Biol. 2010 Dec 21;267(4):519-25. doi: 10.1016/j.jtbi.2010.09.025. Epub 2010 Sep 22.

Abstract

Why the genetic code has a fixed length? Protein information is transferred by coding each amino acid using codons whose length equals 3 for all amino acids. Hence the most probable and the least probable amino acid get a codeword with an equal length. Moreover, the distributions of amino acids found in nature are not uniform and therefore the efficiency of such codes is sub-optimal. The origins of these apparently non-efficient codes are yet unclear. In this paper we propose an a priori argument for the energy efficiency of such codes resulting from their reversibility, in contrast to their time inefficiency. Such codes are reversible in the sense that a primitive processor, reading three letters in each step, can always reverse its operation, undoing its process. We examine the codes for the distributions of amino acids that exist in nature and show that they could not be both time efficient and reversible. We investigate a family of Zipf-type distributions and present their efficient (non-fixed length) prefix code, their graphs, and the condition for their reversibility. We prove that for a large family of such distributions, if the code is time efficient, it could not be reversible. In other words, if pre-biotic processes demand reversibility, the protein code could not be time efficient. The benefits of reversibility are clear: reversible processes are adiabatic, namely, they dissipate a very small amount of energy. Such processes must be done slowly enough; therefore time efficiency is non-important. It is reasonable to assume that early biochemical complexes were more prone towards energy efficiency, where forward and backward processes were almost symmetrical.

MeSH terms

  • Amino Acids / genetics
  • Genetic Code*
  • Models, Genetic
  • Open Reading Frames / genetics*

Substances

  • Amino Acids