Ustilago maydis transcript features identified through full-length cDNA analysis

Mol Genet Genomics. 2011 Aug;286(2):143-59. doi: 10.1007/s00438-011-0634-z. Epub 2011 Jul 13.

Abstract

Ustilago maydis is the model for investigating basidiomycete biotrophic plant pathogens. To further the annotation of its genome, 12,943 full-length cDNA sequences were used to construct databases for the promoter and untranslated regions of U. maydis genes. A subset of clones was sequenced to determine full cDNA sequences. These and the original ESTs were assembled into contigs representing 3,058, or 45%, of the predicted U. maydis genes. The new sequencing allowed the confirmation of 2,842 gene models, 690 of which contain an intron. The use of full-length cDNA clone sequences ensured that untranslated regions were physically linked to the open reading frames (ORFs), not merely aligned upstream of the start of transcription. Identified sequence features include: (1) over 500 potential short upstream ORFs, (2) 95 gene models that require further annotation, (3) one new potential ORF, (4) varying GC content in different gene regions, (5) a WebLogo motif for the start of translation, (6) the correlation of UTR length with transcript representation in cDNA libraries and with gene function categories, (7) a relationship between natural antisense transcripts and UTR length that differs from that of Saccharomyces cerevisiae, (8) a potential relationship between DNA replication and the control of transcription, and (9) new insights regarding mechanisms for the control of transcription and mRNA maturation in U. maydis.
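Two of the listed features, short upstream ORFs (1) and GC content by gene region (4), can be illustrated with a minimal Python sketch. This is not the authors' pipeline; the abstract does not describe their method. The function names, the `min_codons` threshold, and the working definition of an upstream ORF (an ATG in the 5' UTR followed by an in-frame stop codon that also lies within the UTR) are all assumptions made here for illustration.

```python
# Illustrative only: the definitions and thresholds below are assumptions,
# not the criteria used in the paper.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def gc_content(seq):
    """Return the fraction of G and C bases in seq (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def find_uorfs(utr5, min_codons=2):
    """Return (start, end, sequence) for each ORF fully contained in a 5' UTR.

    start and end are 0-based, end-exclusive positions in utr5.
    min_codons counts codons from the ATG up to, but not including, the stop.
    """
    utr5 = utr5.upper()
    uorfs = []
    for start in range(len(utr5) - 2):
        if utr5[start:start + 3] != "ATG":
            continue
        # Walk in-frame codons after the ATG until the first stop codon.
        for pos in range(start + 3, len(utr5) - 2, 3):
            if utr5[pos:pos + 3] in STOP_CODONS:
                n_codons = (pos - start) // 3
                if n_codons >= min_codons:
                    uorfs.append((start, pos + 3, utr5[start:pos + 3]))
                break
    return uorfs


# Toy sequence, not U. maydis data.
toy_utr = "CCATGGCTTAAGG"
print(find_uorfs(toy_utr))
print(gc_content(toy_utr))
```

Applying `gc_content` separately to the 5' UTR, ORF, and 3' UTR of each full-length cDNA would give the kind of per-region comparison the abstract refers to. Because the UTR and ORF come from the same clone, the region boundaries are taken from a single physical transcript.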

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Base Sequence
  • DNA, Complementary / genetics*
  • Databases, Genetic
  • Expressed Sequence Tags
  • Genes, Fungal / genetics
  • Genome, Fungal*
  • Introns
  • Molecular Sequence Data
  • Open Reading Frames
  • Ustilago / genetics*

Substances

  • DNA, Complementary