Computer system "Gene Discovery" for promoter structure analysis

In Silico Biol. 2002;2(3):257-62.

Abstract

This paper presents an implementation of data mining and knowledge discovery techniques that search for regularities in tables of context features of DNA sequences involved in the regulation of transcription. The goal is to discover regularities that relate nucleotide sequences to their functional classes. The search patterns for regularities are constructed in first-order logic augmented with probabilistic estimates. To this end, the PC software system "Gene Discovery" has been designed. The system accepts molecular genetic data retrieved from a database by SQL queries. Nucleotide sequences of promoters from several functional systems were extracted from the TRRD database (http://wwwmgs.bionet.nsc.ru/mgs/gnw/trrd/) and analysed. The data include nucleotide sequences of erythroid-specific gene promoters, endocrine system gene promoters, promoter regions of genes controlling the cell cycle, promoters of genes regulating lipid metabolism, and muscle-specific gene promoters. For each functional class, several regularities were found that relate nucleotide sequences in the regulatory DNA, and their location relative to the transcription start site, to that class.
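
The kind of regularity described in the abstract can be illustrated with a minimal Python sketch. It is not the Gene Discovery system's algorithm, which the abstract does not specify. It assumes a rule of the form "if a given oligonucleotide occurs within a given window relative to the transcription start, then the promoter belongs to a given functional class", and it estimates the rule's probability as a simple conditional frequency. The names Promoter, has_motif, and rule_estimate, the motif, the window, and the four toy sequences are all hypothetical and are not taken from TRRD.

```python
from collections import namedtuple

# Hypothetical record type: sequence, index of the transcription start
# within the sequence, and the functional class label.
Promoter = namedtuple("Promoter", ["sequence", "tss_index", "functional_class"])


def has_motif(promoter, motif, start, end):
    """Return True if `motif` begins at a position in [start, end],
    counted relative to the transcription start (negative = upstream)."""
    for rel in range(start, end + 1):
        i = promoter.tss_index + rel
        if i >= 0 and promoter.sequence[i:i + len(motif)] == motif:
            return True
    return False


def rule_estimate(promoters, motif, start, end, target_class):
    """Estimate P(class = target_class | motif occurs in the window).

    Returns (support, probability); probability is None when no
    promoter satisfies the rule's condition.
    """
    matched = [p for p in promoters if has_motif(p, motif, start, end)]
    if not matched:
        return 0, None
    hits = sum(1 for p in matched if p.functional_class == target_class)
    return len(matched), hits / len(matched)


# Toy data, invented for illustration only (not real promoter sequences).
promoters = [
    Promoter("GGTATAAAGGCC", 10, "erythroid"),
    Promoter("CCTATAGGGGAA", 10, "erythroid"),
    Promoter("AATATACCCCGG", 10, "muscle"),
    Promoter("GGGCGCGCCCAA", 10, "muscle"),
]

support, probability = rule_estimate(promoters, "TATA", -10, -5, "erythroid")
print(support, probability)
```

A practical system would also need a statistical significance criterion for each candidate rule and a strategy for enumerating candidate motifs and windows; both are left out of this sketch.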

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Base Sequence
  • Computational Biology*
  • DNA / chemistry*
  • Databases, Genetic
  • Information Storage and Retrieval
  • Promoter Regions, Genetic*

Substances

  • DNA