Three 3D graphical representations of DNA primary sequences based on the classifications of DNA bases and their applications

J Theor Biol. 2011 Jan 21;269(1):123-30. doi: 10.1016/j.jtbi.2010.10.018. Epub 2010 Oct 20.

Abstract

In this article, we introduce three 3D graphical representations of DNA primary sequences, which we call the RY-curve, MK-curve and SW-curve, based on three classifications of the DNA bases. The advantages of our representations are that (i) these 3D curves are strictly non-degenerate and no information is lost when a DNA sequence is converted to its mathematical representation, and (ii) the coordinates of every node on these 3D curves have a clear biological implication. Two applications of these 3D curves are presented: (a) a simple formula is derived to calculate the content of the four bases (A, G, C and T) from the coordinates of nodes on the curves; and (b) a 12-component characteristic vector, based on the geometrical centers of the 3D curves, is constructed to compare similarity among DNA sequences from different species. As examples, we examine similarity among the coding sequences of the first exon of the beta-globin gene from eleven species and validate the similarity of cDNA sequences of the beta-globin gene from eight species.
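The abstract does not give the construction of the RY-, MK- and SW-curves, so the following Python sketch does not reproduce them. As a stand-in, it uses a single Z-curve-style 3D walk in which each axis follows one of the three standard base classifications: purine/pyrimidine (R/Y), amino/keto (M/K) and weak/strong hydrogen bonding (W/S). It illustrates how base content can be read back from the coordinates of a node, and how a geometric center might be computed if it is taken to be the mean of the node coordinates. The function names `walk`, `base_counts` and `center`, the step encoding, the mean-of-nodes definition and the toy input string are all assumptions made for illustration. The paper's 12-component vector is not derived here.

```python
from collections import Counter

# Standard two-way groupings of the four bases (IUPAC codes).
PURINES = set("AG")  # R; the pyrimidines (Y) are C and T
AMINO = set("AC")    # M; the keto bases (K) are G and T
WEAK = set("AT")     # W; the strong bases (S) are C and G


def walk(seq):
    """Cumulative 3D nodes; each base adds a step of (+-1, +-1, +-1).

    Assumed stand-in encoding, not the paper's curves.
    """
    x = y = z = 0
    nodes = []
    for base in seq.upper():
        if base not in "ACGT":
            raise ValueError(f"unexpected base: {base!r}")
        x += 1 if base in PURINES else -1
        y += 1 if base in AMINO else -1
        z += 1 if base in WEAK else -1
        nodes.append((x, y, z))
    return nodes


def base_counts(nodes):
    """Recover the A, C, G, T counts from the last node and the length."""
    if not nodes:
        return {"A": 0, "C": 0, "G": 0, "T": 0}
    n = len(nodes)
    x, y, z = nodes[-1]
    return {
        "A": (n + x + y + z) // 4,
        "C": (n - x + y - z) // 4,
        "G": (n + x - y - z) // 4,
        "T": (n - x - y + z) // 4,
    }


def center(nodes):
    """Geometric center, assumed here to be the mean of all nodes."""
    n = len(nodes)
    return tuple(sum(p[i] for p in nodes) / n for i in range(3))


if __name__ == "__main__":
    toy = "ATGGTGCACC"  # arbitrary toy string, not data from the paper
    nodes = walk(toy)
    print(base_counts(nodes), Counter(toy))
    print(center(nodes))
```

Because the four possible steps are distinct, the sequence can be rebuilt from consecutive node differences, which is the sense in which this stand-in encoding loses no information.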

MeSH terms

  • Animals
  • Base Composition / genetics*
  • Base Sequence
  • DNA / chemistry*
  • DNA / genetics*
  • DNA, Complementary / genetics
  • Exons / genetics
  • Humans
  • Models, Molecular*
  • Molecular Sequence Data
  • Open Reading Frames / genetics
  • Sequence Homology, Nucleic Acid
  • Species Specificity
  • beta-Globins / genetics

Substances

  • DNA, Complementary
  • beta-Globins
  • DNA