An S-curve-based approach of identifying biological sequences

Acta Biotheor. 2010 Mar;58(1):1-14. doi: 10.1007/s10441-009-9081-1. Epub 2009 Jun 16.

Abstract

The main idea of the S-curve diagram is to assign a distinct angle (between 0 and 180 degrees) to each nucleotide or amino acid residue and then to accumulate the values cos α_j and sin α_j along the sequence, which yields an S-curve diagram in strict one-to-one correspondence with the biological sequence. The S-curve diagram is also shown to be free of degeneracy, so it addresses both the degeneracy problem of graphical representations and the problem of visualizing biological sequence data. On the basis of the S-curve diagram, a new measure for comparing biological sequences, the degree of similarity, is put forward. In detail, a straight line is first fitted to the S-curve diagram by the least squares method; then, using the point-to-line distance formula, the average ratio of the sum of squared distances between the S-curve and the straight line is calculated; finally, the similarity of the biological sequences is expressed by this new measure, the degree of similarity. The experimental results show that the S-curve diagram can better represent biological sequences (such as proteins) within a Cartesian coordinate system and can reveal the mutation points of a biological sequence. Thus, the degree of similarity proves to be a clearly advantageous new standard.
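
The construction described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the angle assignment, so the values in ANGLES are hypothetical, and the function names (s_curve, decode, fit_line, mean_sq_distance) are introduced here for illustration only. The sketch also assumes that the distance statistic is the mean of the squared point-to-line distances; the abstract does not give the exact form of the "average ratio", nor how the degree of similarity is derived from it.

```python
import math

# Hypothetical angle assignment in degrees (an assumption, not from the paper).
# Angles strictly between 0 and 180 give sin > 0, so the curve always rises.
ANGLES = {"A": 30.0, "C": 60.0, "G": 120.0, "T": 150.0}


def s_curve(seq, angles=ANGLES):
    """Accumulate (cos a_j, sin a_j) along the sequence, starting at the origin."""
    x, y = 0.0, 0.0
    points = [(x, y)]
    for residue in seq:
        a = math.radians(angles[residue])
        x += math.cos(a)
        y += math.sin(a)
        points.append((x, y))
    return points


def decode(points, angles=ANGLES):
    """Recover the sequence from the curve: each step direction names one residue."""
    out = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        step = math.degrees(math.atan2(y1 - y0, x1 - x0))
        out.append(min(angles, key=lambda r: abs(angles[r] - step)))
    return "".join(out)


def fit_line(points):
    """Ordinary least squares fit of y = k*x + b to the curve points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    if sxx == 0.0:
        raise ValueError("all x values are equal; the line y = k*x + b is undefined")
    k = sum((x - mx) * (y - my) for x, y in points) / sxx
    return k, my - k * mx


def mean_sq_distance(points, k, b):
    """Mean squared perpendicular distance from the points to y = k*x + b."""
    # Point-to-line distance for k*x - y + b = 0 is |k*x - y + b| / sqrt(k^2 + 1).
    total = sum((k * x - y + b) ** 2 for x, y in points)
    return total / ((k * k + 1.0) * len(points))


if __name__ == "__main__":
    for seq in ("ACGTACGT", "ACGTACGA"):
        pts = s_curve(seq)
        k, b = fit_line(pts)
        print(seq, decode(pts) == seq, mean_sq_distance(pts, k, b))
```

Because every step has a positive vertical component under this assumed assignment, the curve never revisits a point, which is one way to see why the representation avoids degeneracy and can be decoded back to the sequence.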

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Amino Acid Sequence
  • Animals
  • Base Sequence
  • Computational Biology / methods*
  • DNA / chemistry*
  • Genome
  • Histones / chemistry
  • Humans
  • Models, Statistical
  • Molecular Sequence Data
  • Nucleic Acid Conformation
  • Proteins / chemistry
  • RNA / chemistry
  • Sequence Alignment / methods*

Substances

  • Histones
  • Proteins
  • RNA
  • DNA