A novel method to calculate the G+C content of genomic DNA sequences

J Biomol Struct Dyn. 2001 Oct;19(2):333-41. doi: 10.1080/07391102.2001.10506743.

Abstract

The base composition of a DNA fragment or genome is usually measured by the proportion of A+T or G+C in the sequence. The G+C content along genomic sequences is usually calculated using an overlapping or non-overlapping sliding window method. The result and accuracy of such an approach depend on the window size and on the distance by which the window is moved. In this paper, a novel windowless technique to calculate the G+C content of genomic sequences is proposed. With this method, the G+C content can be calculated at different "resolutions". In the extreme case, the G+C content may be computed at a specific point, rather than in a window of finite size. This is particularly useful for analyzing fine variations of base composition along genomic sequences. As a first example, the variation of G+C content along each of the 16 yeast chromosomes is analyzed. G+C-rich regions longer than 5 kb are detected and listed in detail. Each chromosome is found to consist of alternating G+C-rich and G+C-poor regions, i.e., a mosaic structure. As a second example, the G+C content of each of the two chromosomes of the Vibrio cholerae genome is analyzed. Based on the variations of the G+C content in each chromosome, it is shown that some fragments in the Vibrio cholerae genome may have been transferred from other species. In particular, the position and size of the large integron island on the smaller chromosome were precisely predicted. This method would be a useful tool for analyzing genomic sequences.
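
To illustrate why the conventional approach is sensitive to window size and moving distance, the following is a minimal Python sketch of a sliding-window G+C calculation. It is not the windowless method proposed in the paper, whose details are not given in this abstract. The function names `gc_content` and `sliding_gc`, the parameters `window` and `step`, and the toy sequence are all illustrative assumptions.

```python
def gc_content(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T) in seq.

    Illustrative helper (not from the paper). Ambiguous symbols such as N
    are ignored; an empty or fully ambiguous sequence yields 0.0.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


def sliding_gc(seq, window, step):
    """Return a list of (start, gc_fraction) pairs, one per full window.

    step == window gives non-overlapping windows;
    step < window gives overlapping windows.
    Trailing bases that do not fill a whole window are skipped.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    return [
        (start, gc_content(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


# Toy sequence (an assumption for illustration): an A+T-only half
# followed by a G+C-only half.
toy = "ATATATATGCGCGCGC"

non_overlapping = sliding_gc(toy, window=8, step=8)
overlapping = sliding_gc(toy, window=8, step=4)
whole_sequence = sliding_gc(toy, window=16, step=16)
```

On this toy sequence, the 8-base non-overlapping windows separate the two halves cleanly, the overlapping windows add an intermediate value at the boundary, and a single 16-base window averages the two halves into one number. The same sequence therefore gives three different pictures depending on the chosen window size and step, which is the dependence the abstract describes.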

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Base Composition*
  • Chromosomes, Bacterial / genetics
  • Chromosomes, Fungal / genetics
  • DNA / chemistry*
  • DNA / genetics*
  • DNA, Bacterial / chemistry
  • DNA, Bacterial / genetics
  • DNA, Fungal / chemistry
  • DNA, Fungal / genetics
  • Genetic Techniques
  • Genome
  • Genome, Bacterial
  • Saccharomyces cerevisiae / chemistry
  • Saccharomyces cerevisiae / genetics
  • Vibrio cholerae / chemistry
  • Vibrio cholerae / genetics

Substances

  • DNA, Bacterial
  • DNA, Fungal
  • DNA