Quantitative measure of randomness and order for complete genomes

Phys Rev E Stat Nonlin Soft Matter Phys. 2009 Jun;79(6 Pt 1):061911. doi: 10.1103/PhysRevE.79.061911. Epub 2009 Jun 9.

Abstract

We propose an order index, phi, which gives a quantitative measure of the randomness and order of complete genomic sequences. It maps genomes to a number from 0 (random and of infinite length) to 1 (fully ordered) and applies regardless of sequence length. The 786 complete genomic sequences in GenBank were found to have phi values in a very narrow range, phi_g = 0.031 (+0.028/-0.015). We show this implies that genomes are halfway toward being completely random, or at the "edge of chaos." We further show that artificial "genomes" converted from literary classics have phi values that almost exactly coincide with phi_g, but sequences of low information content do not. We infer that phi_g represents a high information-capacity "fixed point" in sequence space, and that genomes are driven to it by the dynamics of a robust growth and evolution process. We show that a growth process characterized by random segmental duplication can robustly drive genomes to the fixed point.
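The growth process named in the abstract, random segmental duplication, can be illustrated with a minimal sketch. This is not the authors' model: the abstract gives no parameters, so the function name `grow_by_segmental_duplication`, the seed length, the maximum segment length, and the uniform choice of segment and insertion point are all assumptions made here for illustration. The sketch also does not compute phi, whose definition is not given in the abstract.

```python
import random


def grow_by_segmental_duplication(target_len, seed_len=100, max_seg_len=50, rng=None):
    """Grow a toy "genome" by repeatedly copying a random segment and
    inserting the copy at a random position.

    All parameters are hypothetical; the abstract does not specify them.
    """
    rng = rng or random.Random()
    seed_len = max(1, min(seed_len, target_len))

    # Start from a short, uniformly random seed sequence.
    seq = [rng.choice("ACGT") for _ in range(seed_len)]

    while len(seq) < target_len:
        # Cap the segment length so the final length is exactly target_len.
        remaining = target_len - len(seq)
        seg_len = rng.randint(1, min(max_seg_len, len(seq), remaining))

        # Pick the segment to duplicate, uniformly over valid start positions.
        start = rng.randrange(len(seq) - seg_len + 1)
        segment = seq[start:start + seg_len]

        # Insert the copy at a uniformly chosen position (ends included).
        pos = rng.randrange(len(seq) + 1)
        seq[pos:pos] = segment

    return "".join(seq)


if __name__ == "__main__":
    genome = grow_by_segmental_duplication(2000, rng=random.Random(0))
    print(len(genome), genome[:60])
```

Because every insertion copies material already present, the grown sequence carries more repeated substrings than a uniformly random sequence of the same length, which is the qualitative sense in which duplication moves a sequence away from complete randomness.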

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Base Sequence
  • Computer Simulation
  • Data Interpretation, Statistical
  • Genome / genetics*
  • Models, Genetic*
  • Models, Statistical*
  • Molecular Sequence Data
  • Mutation / genetics
  • Sequence Analysis, DNA / methods*