A novel theory on the origin of the genetic code: a GNC-SNS hypothesis

J Mol Evol. 2002 Apr;54(4):530-8. doi: 10.1007/s00239-001-0053-6.

Abstract

We previously proposed an SNS hypothesis on the origin of the genetic code (Ikehara and Yoshida 1998). The hypothesis predicts that the universal genetic code originated from the SNS code, which comprises 16 codons and encodes 10 amino acids (S denotes G or C, and N denotes any of the four bases). However, it must have been very difficult to create the SNS code in a single step at the beginning. We therefore searched for a code simpler than the SNS code that could still encode, with high probability, water-soluble globular proteins with appropriate three-dimensional structures. As criteria we used four conditions for globular protein formation (hydropathy and the propensities for alpha-helix, beta-sheet, and beta-turn formation). The four amino acids encoded by the GNC code (Gly [G], Ala [A], Asp [D], and Val [V]) satisfied the four structural conditions well, whereas the codes in the other rows and columns of the universal genetic code table did not, except for the GNG code, a slightly modified form of the GNC code. Three three-amino acid systems ([D], Leu, and Tyr; [D], Tyr, and Met; Glu, Pro, and Ile) also satisfied the four conditions. However, some amino acids in these three systems are far more complex than those encoded by the GNC code, and the amino acids in the three-amino acid systems are scattered across the universal genetic code table. We thus concluded that the universal genetic code originated not from a three-amino acid system but from a four-amino acid system, the GNC code encoding [GADV]-proteins, as the most primitive genetic code.
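The codon counts quoted in the abstract can be checked against the standard genetic code table. The following is a minimal illustrative sketch, not the authors' method: the helper names `expand` and `encoded` are hypothetical, and the sketch only enumerates codons and looks up their amino acids; it does not evaluate the four structural conditions.

```python
from itertools import product

# Standard genetic code in the conventional compact layout:
# first, second, and third base each cycle through U, C, A, G.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Degenerate base symbols used in the abstract: S = G or C, N = any base.
SYMBOLS = {"A": "A", "C": "C", "G": "G", "U": "U", "S": "GC", "N": "ACGU"}


def expand(pattern):
    """Return every concrete codon matching a pattern such as 'SNS' (hypothetical helper)."""
    return ["".join(c) for c in product(*(SYMBOLS[ch] for ch in pattern))]


def encoded(pattern):
    """Map each codon matching the pattern to its one-letter amino acid (hypothetical helper)."""
    return {codon: STANDARD_CODE[codon] for codon in expand(pattern)}


if __name__ == "__main__":
    for pattern in ("SNS", "GNC", "GNG"):
        table = encoded(pattern)
        residues = "".join(sorted(set(table.values())))
        print(pattern, len(table), "codons,", len(set(table.values())), "amino acids:", residues)
```

Under the standard table, `SNS` expands to 16 codons covering 10 amino acids, and `GNC` expands to GGC, GCC, GAC, and GUC, which encode Gly, Ala, Asp, and Val respectively.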

MeSH terms

  • Alanine
  • Amino Acids, Acidic / genetics
  • Amino Acids, Basic / genetics
  • Animals
  • Evolution, Molecular*
  • Genetic Code*
  • Humans
  • Lysine
  • Protein Structure, Secondary / genetics

Substances

  • Amino Acids, Acidic
  • Amino Acids, Basic
  • Lysine
  • Alanine