Multiple levels of meaning in DNA sequences, and one more

Ann N Y Acad Sci. 2012 Sep:1267:35-8. doi: 10.1111/j.1749-6632.2012.06589.x.

Abstract

If we define a genetic code as a widespread DNA sequence pattern that carries a message with an impact on biology, then there are multiple genetic codes. Sequences involved in these codes overlap and thus both interact with and constrain each other; examples include the triplet code, the intron-splicing code, the code for amphipathic alpha helices, and the chromatin code. Nucleosomes are preferentially located at the ends of exons, thus protecting splice junctions, with the N9 positions of guanines of the GT and AG junctions oriented toward the histones. Analysis of protein-coding sequences reveals numerous traces of tandem repeats, apparently formed by triplet expansion, which is in effect a genome inflation "code". Our data are consistent with the hypothesis that expansion of simple tandem repetition of certain aggressive triplets has been a characteristic of life from its emergence. Such expanding triplets appear to be the major factor underlying observed codon usage biases.
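
To make the idea of repeat traces in coding sequences concrete, the following is a minimal sketch, not the authors' method. It assumes an in-frame coding sequence, looks only for exact, uninterrupted runs of a single triplet, and tallies codon usage alongside them. The function names (`codons`, `tandem_triplet_runs`, `codon_usage`), the `min_copies` threshold, and the toy sequence are all hypothetical and chosen for illustration.

```python
from collections import Counter


def codons(seq):
    """Split an in-frame coding sequence into complete triplets.

    Trailing bases that do not fill a triplet are ignored.
    """
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def tandem_triplet_runs(seq, min_copies=2):
    """Return (start, triplet, copies) for each in-frame run of one triplet.

    Only exact, uninterrupted repeats are reported; diverged ("trace")
    repeats would need a mismatch-tolerant search, which is not done here.
    """
    trips = codons(seq)
    runs = []
    i = 0
    while i < len(trips):
        j = i
        # Extend the run while the next triplet matches the first one.
        while j + 1 < len(trips) and trips[j + 1] == trips[i]:
            j += 1
        copies = j - i + 1
        if copies >= min_copies:
            runs.append((i * 3, trips[i], copies))
        i = j + 1
    return runs


def codon_usage(seq):
    """Count how often each triplet occurs in frame."""
    return Counter(codons(seq))


# Toy sequence, invented for illustration only.
toy = "ATGGCCGCCGCCAAGCAGCAGTAA"
print(tandem_triplet_runs(toy))
print(codon_usage(toy).most_common(2))
```

In this toy input the repeated triplets GCC and CAG are also the most frequent codons, which illustrates, in a deliberately simplified way, how expanded triplets could skew codon counts.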

MeSH terms

  • Base Sequence
  • DNA / genetics
  • Genetic Code
  • Humans
  • Nucleic Acid Conformation
  • Nucleosomes / metabolism
  • RNA Splice Sites
  • Repetitive Sequences, Nucleic Acid
  • Sequence Analysis, DNA
  • Trinucleotide Repeat Expansion / genetics*

Substances

  • Nucleosomes
  • RNA Splice Sites
  • DNA