[Averaging results of site recognition can increase the accuracy of annotating the human genome]

Biofizika. 1999 Jul-Aug;44(4):649-54.
[Article in Russian]

Abstract

A systematic approach is proposed that increases the accuracy of recognizing functional sites in arbitrary DNA sequences. The approach is based on the central limit theorem and consists of averaging a large number of recognitions of a given site. To obtain a sufficiently large number of recognitions within the framework of the conventional recognition methods, consensus and frequency matrix, 20 novel oligonucleotide alphabets were used. The approach was applied to the binding sites of the GATA-1 and C/EBP transcription factors. The averaged recognition of these sites was found to be more precise than any of the individual recognitions, as expected from the central limit theorem.
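The statistical idea behind the averaging step can be illustrated with a toy simulation. This is a minimal sketch and not the authors' procedure. The function `simulate`, its parameters, the Gaussian noise model and the 0.5 decision threshold are all hypothetical. It also assumes that the recognizers make independent errors, which real recognitions of the same site in different alphabets may not satisfy. The only value taken from the abstract is the count of 20 recognizers, mirroring the 20 alphabets.

```python
import random
import statistics


def simulate(n_seqs=2000, n_recognizers=20, noise_sd=0.6, seed=0):
    """Compare error rates of individual noisy recognizers with their average.

    All parameters are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    # Hypothetical ground truth: True marks a real site, False a non-site.
    labels = [rng.random() < 0.5 for _ in range(n_seqs)]

    # scores[k][i] is the score recognizer k assigns to sequence i:
    # the true signal (1.0 or 0.0) plus independent Gaussian noise.
    scores = [
        [(1.0 if lab else 0.0) + rng.gauss(0.0, noise_sd) for lab in labels]
        for _ in range(n_recognizers)
    ]

    # Averaged recognition: mean score over all recognizers, per sequence.
    averaged = [statistics.fmean(column) for column in zip(*scores)]

    def error_rate(score_list):
        # A sequence is called a site when its score exceeds 0.5.
        wrong = sum((s > 0.5) != lab for s, lab in zip(score_list, labels))
        return wrong / n_seqs

    individual_errors = [error_rate(s) for s in scores]
    return individual_errors, error_rate(averaged)


if __name__ == "__main__":
    individual, averaged = simulate()
    print("best individual error rate:", min(individual))
    print("averaged error rate:", averaged)
```

Under the independence assumption, the noise in the averaged score shrinks roughly as one over the square root of the number of recognizers, which is why the averaged call is more reliable than any single one in this sketch. Correlated errors would reduce that gain.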

Publication types

  • English Abstract

MeSH terms

  • Base Sequence
  • Binding Sites
  • CCAAT-Enhancer-Binding Proteins
  • DNA / genetics
  • DNA / metabolism*
  • DNA-Binding Proteins / metabolism
  • Erythroid-Specific DNA-Binding Factors
  • GATA1 Transcription Factor
  • Genome, Human*
  • Humans
  • Nuclear Proteins / metabolism
  • Transcription Factors / metabolism

Substances

  • CCAAT-Enhancer-Binding Proteins
  • DNA-Binding Proteins
  • Erythroid-Specific DNA-Binding Factors
  • GATA1 Transcription Factor
  • GATA1 protein, human
  • Nuclear Proteins
  • Transcription Factors
  • DNA