Similarity analysis for DNA sequences based on chaos game representation. Case study: the albumin

J Theor Biol. 2010 Dec 21;267(4):513-8. doi: 10.1016/j.jtbi.2010.09.027. Epub 2010 Sep 28.

Abstract

Using chaos game representation (CGR), we introduce a novel and straightforward method for identifying similarities and dissimilarities between DNA sequences of the same type from different organisms. A matrix is associated with each CGR pattern, and the similarities result from comparing the matrices of the sequences of interest. Three methods of analyzing the resulting difference matrix are considered: a 3-dimensional representation giving both local and global information, a numerical characterization based on an n-letter word similarity measure, and a statistical evaluation. The method is illustrated by applying it to albumin nucleotide sequences from eight mammal species, with human albumin taken as the reference.
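The matrix comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not give the corner assignment, the grid resolution, or the definition of the n-letter word similarity measure. Here the corner layout in CORNERS, the grid parameter k, and the helper names cgr_matrix, difference_matrix and l1_distance are all hypothetical, and the sum of absolute differences is used as a simple stand-in for the paper's measure.

```python
# Assumed corner assignment (a common CGR convention, not taken from the paper).
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_matrix(seq, k=3):
    """Return a 2^k x 2^k matrix of CGR point frequencies for a DNA sequence.

    Each cell of the grid corresponds to one k-letter word, so the matrix
    is effectively a table of k-letter word frequencies.
    """
    size = 2 ** k
    counts = [[0] * size for _ in range(size)]
    # Drop symbols other than A, C, G, T (for example N) before iterating.
    bases = [b for b in seq.upper() if b in CORNERS]
    x, y = 0.5, 0.5  # start at the centre of the unit square
    total = 0
    for i, base in enumerate(bases):
        cx, cy = CORNERS[base]
        # Chaos game step: move halfway towards the corner of this base.
        x, y = (x + cx) / 2, (y + cy) / 2
        if i >= k - 1:  # the cell is fully determined after k steps
            col = min(int(x * size), size - 1)
            row = min(int(y * size), size - 1)
            counts[row][col] += 1
            total += 1
    if total == 0:
        return [[0.0] * size for _ in range(size)]
    return [[c / total for c in row] for row in counts]


def difference_matrix(m1, m2):
    """Element-wise difference between two CGR frequency matrices."""
    return [[a - b for a, b in zip(r1, r2)] for r1, r2 in zip(m1, m2)]


def l1_distance(m1, m2):
    """Sum of absolute differences; a stand-in for the paper's measure."""
    return sum(abs(d) for row in difference_matrix(m1, m2) for d in row)
```

In this sketch a reference sequence would be passed through cgr_matrix once and compared against each other sequence with difference_matrix or l1_distance. A value of 0 means the two sequences have identical k-letter word frequencies on the chosen grid, and larger values indicate greater dissimilarity.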

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Albumins / genetics*
  • Animals
  • Base Sequence
  • Game Theory*
  • Humans
  • Nonlinear Dynamics*
  • Sequence Analysis, DNA / methods*
  • Sequence Homology, Nucleic Acid

Substances

  • Albumins