The repetitive sequence database and mining putative regulatory elements in gene promoter regions

J Comput Biol. 2002;9(4):621-40. doi: 10.1089/106652702760277354.

Abstract

At least 43% of the human genome is occupied by repetitive elements, as is around 51% of the rice genome. Analysis of these elements suggests that they may have played an important role in genome evolution. The first part of this study describes RSDB, a database of repetitive elements. RSDB classifies repetitive elements into three categories: exact, tandem, and similar. It provides interfaces for querying and displaying results and statistical data, such as the relationship between repetitive elements and genes and cross-references of repetitive elements among different organisms. The second part of this study mines putative binding sites by examining how combinations of known regulatory sites and overrepresented repetitive elements in RSDB are distributed in the promoter regions of groups of functionally related genes. The overrepresented repetitive elements appearing in these associations are possible transcription factor binding sites. The proposed approach is applied to Saccharomyces cerevisiae, using the promoter regions of yeast ORFs. The complete contents of RSDB and a subset of the putative binding sites are publicly available at www.rsdb.csie.ncu.edu.tw, where readers may also download partial query results.
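
The idea of an element being overrepresented in the promoters of a gene group can be illustrated with a minimal sketch. It is not the method used in the study: the function names (`kmer_presence`, `overrepresented`), the fixed word length `k`, the add-one smoothing, the `min_ratio` threshold, and the toy sequences are all assumptions made for illustration. The sketch compares the fraction of group promoters that contain each exact k-mer with the corresponding fraction in a background set.

```python
from collections import Counter


def kmer_presence(seqs, k):
    """Count how many sequences contain each k-mer at least once."""
    counts = Counter()
    for s in seqs:
        s = s.upper()
        # A set ensures each k-mer is counted once per sequence.
        counts.update({s[i:i + k] for i in range(len(s) - k + 1)})
    return counts


def overrepresented(group, background, k, min_ratio=2.0):
    """Return (k-mer, ratio) pairs whose presence fraction in `group`
    is at least `min_ratio` times the smoothed background fraction."""
    in_group = kmer_presence(group, k)
    in_background = kmer_presence(background, k)
    hits = []
    for kmer, n in in_group.items():
        frac_group = n / len(group)
        # Add-one smoothing (an assumption of this sketch) avoids division
        # by zero for k-mers that never occur in the background.
        frac_bg = (in_background.get(kmer, 0) + 1) / (len(background) + 1)
        ratio = frac_group / frac_bg
        if ratio >= min_ratio:
            hits.append((kmer, ratio))
    # Highest ratio first; ties broken alphabetically.
    return sorted(hits, key=lambda t: (-t[1], t[0]))


# Toy promoter fragments (hypothetical, not taken from RSDB).
group = ["AACGCGTAA", "TTACGCGTT", "GGACGCGTC"]
background = ["AAAAAAAAA", "TTTTTTTTT", "GGGGGGGGG", "CCCCCCCCC"]
hits = overrepresented(group, background, k=6)
print(hits)
```

In this toy input only the 6-mer shared by all three group sequences passes the threshold. A realistic analysis would use a proper statistical test and a genome-wide background rather than a simple ratio.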

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Binding Sites
  • Databases, Nucleic Acid*
  • Genome
  • Genome, Fungal
  • Humans
  • Internet
  • Open Reading Frames
  • Promoter Regions, Genetic*
  • Regulatory Sequences, Nucleic Acid*
  • Repetitive Sequences, Nucleic Acid*
  • Saccharomyces cerevisiae / genetics
  • Software
  • Transcription Factors / metabolism
  • Transcription, Genetic

Substances

  • Transcription Factors