A GeneTrek analysis of the maize genome

Proc Natl Acad Sci U S A. 2007 Jul 10;104(28):11844-9. doi: 10.1073/pnas.0704258104. Epub 2007 Jul 5.

Abstract

Analysis of the sequences of 74 randomly selected BACs demonstrated that the maize nuclear genome contains approximately 37,000 candidate genes with homologues in other plant species. An additional approximately 5,500 predicted genes are severely truncated and probably pseudogenes. The distribution of genes is uneven, with approximately 30% of BACs containing no genes. BAC gene density varies from 0 to 7.9 per 100 kb, whereas most gene islands contain only one gene. The average number of genes per gene island is 1.7. Only 72% of these genes show collinearity with the rice genome. Particular LTR retrotransposon families (e.g., Gyma) are enriched on gene-free BACs, most of which do not come from pericentromeres or other large heterochromatic regions. Gene-containing BACs are relatively enriched in different families of LTR retrotransposons (e.g., Ji). Two major bursts of LTR retrotransposon activity in the last 2 million years are responsible for the large size of the maize genome, but only the more recent of these is well represented in gene-containing BACs, suggesting that LTR retrotransposons are more efficiently removed in these domains. The results demonstrate that sample sequencing and careful annotation of a few randomly selected BACs can provide a robust description of a complex plant genome.
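
The sampling logic summarized above (per-BAC gene density, the share of gene-free BACs, and scaling a sample up to a genome-wide gene count) can be sketched in a few lines of Python. This is only an illustration, not the GeneTrek procedure itself: the `Bac` record, the function names, the pooled-density scaling, and all numbers in the toy data are assumptions made for the example and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Bac:
    """Hypothetical record for one sequenced BAC."""
    length_bp: int   # sequenced length in base pairs
    gene_count: int  # number of annotated candidate genes


def gene_density_per_100kb(bac: Bac) -> float:
    """Genes per 100 kb for a single BAC."""
    return bac.gene_count * 100_000 / bac.length_bp


def fraction_gene_free(bacs: List[Bac]) -> float:
    """Share of BACs in the sample that carry no genes."""
    return sum(1 for b in bacs if b.gene_count == 0) / len(bacs)


def estimate_genome_genes(bacs: List[Bac], genome_size_bp: int) -> float:
    """Scale the pooled sample density to the whole genome.

    Assumes the BACs are an unbiased random sample of the genome.
    """
    total_genes = sum(b.gene_count for b in bacs)
    total_bp = sum(b.length_bp for b in bacs)
    return total_genes * genome_size_bp / total_bp


# Toy values for illustration only (not data from the study).
sample = [Bac(150_000, 0), Bac(100_000, 3), Bac(200_000, 2), Bac(50_000, 0)]
toy_genome_size_bp = 2_000_000

densities = [gene_density_per_100kb(b) for b in sample]
gene_free = fraction_gene_free(sample)
estimate = estimate_genome_genes(sample, toy_genome_size_bp)
```

Pooling genes and base pairs across all BACs before scaling weights each BAC by its length, which avoids letting short BACs dominate an average of per-BAC densities.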

Publication types

  • Comparative Study
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Chromosomes, Artificial, Bacterial / genetics*
  • Genes, Plant
  • Genetic Markers
  • Genome, Plant*
  • Multigene Family
  • Oryza / genetics
  • Random Allocation
  • Sequence Analysis, DNA* / methods
  • Zea mays / genetics*

Substances

  • Genetic Markers