Statistical distribution of amino acid sequences: a proof of Darwinian evolution

Bioinformatics. 2010 Dec 1;26(23):2933-5. doi: 10.1093/bioinformatics/btq571. Epub 2010 Oct 28.

Abstract

Motivation: The article presents counts of amino acids, dipeptides and tripeptides for all proteins available in the UNIPROT-TREMBL database, together with counts for selected species and enzymes. UNIPROT-TREMBL contains protein sequences associated with computationally generated annotations and large-scale functional characterization. Because amino acids differ in their biosynthetic pathways and physicochemical properties, the quantities of subpeptides in proteins vary. We have proved that the distribution of amino acids, dipeptides and tripeptides is statistical, which confirms that the evolutionary biodiversity development model is subject to the theory of independent events. Interestingly, certain short peptide combinations occur relatively rarely or not at all. First, this confirms the Darwinian theory of evolution; second, it opens up opportunities for designing pharmaceuticals among rarely represented short peptide combinations. Furthermore, an innovative approach to the mass analysis of bioinformatic data is presented.
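
The kind of listing described above can be illustrated with a minimal sketch. It is not the authors' pipeline, which the abstract does not specify. It counts overlapping peptides of length k in a set of sequences, estimates the counts that would be expected if residues were independent events (the product of single-residue frequencies), and lists combinations that never occur. The function names, the 20-letter alphabet constant and the toy sequences are assumptions made for illustration only. Ambiguity codes such as X or B are not handled specially.

```python
from collections import Counter
from itertools import product

# Assumption: the 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def count_kmers(sequences, k):
    """Count overlapping peptides of length k across all sequences."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def expected_under_independence(sequences, k):
    """Expected k-peptide counts if each residue were an independent draw.

    The probability of a peptide is taken as the product of the observed
    single-residue frequencies, scaled by the number of length-k windows.
    """
    single = count_kmers(sequences, 1)
    total_residues = sum(single.values())
    total_windows = sum(max(len(s) - k + 1, 0) for s in sequences)
    expected = {}
    for combo in product(AMINO_ACIDS, repeat=k):
        p = 1.0
        for aa in combo:
            p *= single[aa] / total_residues
        expected["".join(combo)] = p * total_windows
    return expected


def absent_kmers(sequences, k):
    """List k-peptides over the standard alphabet that never occur."""
    observed = count_kmers(sequences, k)
    all_kmers = ("".join(c) for c in product(AMINO_ACIDS, repeat=k))
    return [kmer for kmer in all_kmers if kmer not in observed]


if __name__ == "__main__":
    # Hypothetical toy sequences, not taken from UNIPROT-TREMBL.
    toy = ["MKVLAAGIVAL", "MKKLLAAGAV"]
    observed = count_kmers(toy, 2)
    expected = expected_under_independence(toy, 2)
    for dipeptide, n in observed.most_common(3):
        print(dipeptide, n, round(expected[dipeptide], 2))
    print(len(absent_kmers(toy, 2)), "dipeptides never observed")
```

On a real database, comparing the observed and expected counts (for example as a ratio per peptide) would highlight the rarely represented combinations that the abstract mentions. For a collection the size of UNIPROT-TREMBL, the sequences would need to be streamed from disk rather than held in a list.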

Contact: eitner@amu.edu.pl

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence*
  • Amino Acids / analysis
  • Amino Acids / genetics
  • Computational Biology / methods
  • Enzymes / chemistry
  • Evolution, Molecular*
  • Humans
  • Oligopeptides / chemistry*
  • Proteins / chemistry
  • Proteins / genetics
  • Sequence Analysis, Protein*
  • Statistical Distributions

Substances

  • Amino Acids
  • Enzymes
  • Oligopeptides
  • Proteins