Genes on the circular code alphabet

Biosystems. 2021 Aug:206:104431. doi: 10.1016/j.biosystems.2021.104431. Epub 2021 Apr 22.

Abstract

The X motifs, i.e. motifs from the circular code X, are enriched in the (protein-coding) genes of bacteria, archaea, eukaryotes, plasmids and viruses, in the minimal gene set belonging to the three domains of life, and in tRNA and rRNA sequences. They make it possible to retrieve, maintain and synchronize the reading frame in genes, and they contribute to the regulation of gene expression. These results lead here to a theoretical study of genes based on the circular code alphabet. A new occurrence relation of the circular code X is given under the hypothesis of an equiprobable (balanced) strand pairing. Surprisingly, a statistical analysis of a large set of bacterial genes retrieves this relation on the circular code alphabet, but not on the DNA alphabet. Furthermore, the circular code X has the strongest balanced circular code pairing among the 216 maximal C3 self-complementary trinucleotide circular codes, a new property of this circular code. As an application of this theory, different tRNAs studied on the circular code alphabet reveal an unexpected stem structure. Thus, the circular code X could have constructed a coding stem in tRNAs as an outline of the future gene structure and the future DNA double helix.
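
The properties named in the abstract (self-complementary, C3, reading-frame retrieval) can be illustrated with a minimal Python sketch. It assumes the 20 trinucleotides of X as they are commonly listed in the circular code literature; the abstract itself does not enumerate them. The helper names (`reverse_complement`, `shift`, `frame_counts`) and the toy sequence are hypothetical, and the sketch does not reproduce the occurrence relation or the pairing statistics of the paper.

```python
# Assumption: the 20 trinucleotides of the circular code X as commonly
# listed in the literature (not given in this abstract).
X = frozenset(
    "AAC AAT ACC ATC ATT CAG CTC CTG GAA GAC "
    "GAG GAT GCC GGC GGT GTA GTC GTT TAC TTC".split()
)

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def is_self_complementary(code):
    """True if the reverse complement of every trinucleotide is in the code."""
    return all(reverse_complement(t) in code for t in code)


def shift(code, k):
    """Circularly permute every trinucleotide of the code by k positions."""
    return frozenset(t[k:] + t[:k] for t in code)


def frame_counts(seq):
    """Count trinucleotides of X read in each of the three frames of seq."""
    counts = []
    for frame in range(3):
        words = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
        counts.append(sum(w in X for w in words))
    return counts


# Hypothetical toy sequence built from four trinucleotides of X.
toy = "GAAGACGTTATC"
print(is_self_complementary(X))
print(len(X | shift(X, 1) | shift(X, 2)))
print(frame_counts(toy))
```

In this sketch, X and its two circular permutations are disjoint and together cover the 60 trinucleotides other than AAA, CCC, GGG and TTT, which is a necessary condition for the C3 property, not a proof of it. The frame containing the most trinucleotides of X is a simple stand-in for the idea of retrieving the reading frame from X motifs.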

Keywords: Alphabet; Circular code; Pairing rule; Protein coding gene; Transfer RNA gene; Trinucleotide.

MeSH terms

  • Animals
  • Genes, Bacterial / physiology*
  • Genetic Code / physiology*
  • Humans
  • RNA, Circular / physiology*
  • RNA, Transfer / physiology*

Substances

  • RNA, Circular
  • RNA, Transfer