Self-complementary circular codes in coding theory

Theory Biosci. 2018 Apr;137(1):51-65. doi: 10.1007/s12064-018-0259-4. Epub 2018 Mar 12.

Abstract

Self-complementary circular codes are involved in pairing genetic processes. A maximal C3 self-complementary circular code X of trinucleotides was identified in genes of bacteria, archaea, eukaryotes, plasmids and viruses (Michel in Life 7(20):1-16, 2017; J Theor Biol 380:156-177, 2015; Arquès and Michel in J Theor Biol 182:45-58, 1996). In this paper, self-complementary circular codes are investigated using the graph theory approach recently formulated in Fimmel et al. (Philos Trans R Soc A 374:20150058, 2016). A directed graph G(X) associated with any code X mirrors the properties of the code. We demonstrate a necessary condition, stated in graph-theoretic terms, for the self-complementarity of an arbitrary code X. The same condition is proven to be sufficient for circular codes of large size (≥ 18 trinucleotides), in particular for maximal circular codes (20 trinucleotides). For codes of small size (≤ 16 trinucleotides), some very rare counterexamples have been constructed. Furthermore, the length and the structure of the longest paths in the graphs associated with self-complementary circular codes are investigated. It is proven that the longest paths in such graphs determine the reading frame for the self-complementary circular codes. By applying this result to the circular code X identified in genes, the reading frame in any sequence of trinucleotides is retrieved after at most 15 nucleotides, i.e., 5 consecutive trinucleotides. Thus, an X motif of at least 15 nucleotides in an arbitrary sequence of trinucleotides (not necessarily all of them belonging to X) uniquely defines the correct reading frame, an important criterion for future analyses of X motifs in genes.
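
The graph approach summarized in the abstract can be illustrated with a minimal Python sketch. It rests on two assumptions taken from Fimmel et al. (2016) rather than from this abstract: each trinucleotide b1b2b3 of a code contributes the two edges b1 → b2b3 and b1b2 → b3 to the associated graph, and a code is circular exactly when that graph has no directed cycle. The function names (`reverse_complement`, `is_self_complementary`, `build_graph`, `longest_path_length`) are hypothetical, chosen for illustration only, and the 20 trinucleotides listed for X are those reported by Arquès and Michel (1996). The sketch does not implement the necessary condition for self-complementarity proven in the paper, which the abstract does not spell out.

```python
# Watson-Crick complement of each nucleotide.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# The 20 trinucleotides of the code X (Arquès and Michel 1996).
X = frozenset("""AAC AAT ACC ATC ATT CAG CTC CTG GAA GAC
                 GAG GAT GCC GGC GGT GTA GTC GTT TAC TTC""".split())


def reverse_complement(word):
    """Complement every nucleotide and reverse the word."""
    return "".join(COMPLEMENT[b] for b in reversed(word))


def is_self_complementary(code):
    """True if the code contains the reverse complement of each of its words."""
    return all(reverse_complement(w) in code for w in code)


def build_graph(code):
    """Assumed construction: b1b2b3 yields edges b1 -> b2b3 and b1b2 -> b3."""
    graph = {}
    for b1, b2, b3 in code:
        for src, dst in ((b1, b2 + b3), (b1 + b2, b3)):
            graph.setdefault(src, set()).add(dst)
            graph.setdefault(dst, set())
    return graph


def longest_path_length(graph):
    """Number of edges on a longest directed path, or None if a cycle exists."""
    depth = {}        # vertex -> length of the longest path starting there
    on_stack = set()  # vertices on the current depth-first search branch

    def visit(v):
        if v in depth:
            return depth[v]
        if v in on_stack:
            raise ValueError("directed cycle found")
        on_stack.add(v)
        best = max((1 + visit(w) for w in graph[v]), default=0)
        on_stack.discard(v)
        depth[v] = best
        return best

    try:
        return max((visit(v) for v in graph), default=0)
    except ValueError:
        return None


if __name__ == "__main__":
    g = build_graph(X)
    print("self-complementary:", is_self_complementary(X))
    print("acyclic graph (circular code):", longest_path_length(g) is not None)
    print("longest path length (edges):", longest_path_length(g))
```

Returning `None` for a cyclic graph lets one function serve both as the circularity test and as the longest-path measurement. For example, the two-word code {ACG, CGA} yields the cycle A → CG → A and is therefore reported as not circular.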

Keywords: Genetic code; Graph properties; Reading frame; Self-complementary circular codes; Translation process.

MeSH terms

  • DNA / analysis
  • Eukaryotic Cells
  • Genes, Archaeal
  • Genes, Bacterial
  • Genes, Viral
  • Genetic Code
  • Models, Genetic*
  • Models, Theoretical
  • Nucleotides / genetics*
  • Oligonucleotides
  • Open Reading Frames
  • Plasmids
  • RNA / analysis
  • Ribosomes
  • Saccharomyces cerevisiae / genetics

Substances

  • Nucleotides
  • Oligonucleotides
  • RNA
  • DNA