Genome-Wide Profiling of Polyadenylation Events in Maize Using High-Throughput Transcriptomic Sequences

G3 (Bethesda). 2019 Aug 8;9(8):2749-2760. doi: 10.1534/g3.119.400196.

Abstract

Polyadenylation is an essential post-transcriptional modification of eukaryotic transcripts that plays a critical role in transcript stability, localization, transport, and translational efficiency. About 70% of genes in plants contain alternative polyadenylation (APA) sites. Despite the availability of a vast amount of sequencing data, a comprehensive map of polyadenylation events in maize is not yet available. Here, 9.48 billion RNA-Seq reads were analyzed to characterize 95,345 poly(A) clusters (PACs) in 23,705 (51%) maize genes. Of these genes, 76% were APA genes. However, most APA genes (55%) expressed a dominant PAC rather than favoring multiple PACs equally. The lincRNA genes with PACs were significantly longer than those without any PAC, and about 48% of the lincRNA genes had APA sites. Heterogeneity was observed in 52% of the PACs, supporting the imprecise nature of the polyadenylation process. Genomic distribution revealed that the majority of the PACs (78%) were located in genic regions. Unlike in previous studies, a large number of PACs were observed in intergenic (n = 21,264), 5'-UTR (735), CDS (2,542), and intronic regions (12,841). CDSs and introns with PACs were longer than those without PACs, whereas intergenic PACs were more often associated with transcripts that lacked annotated 3'-UTRs. Nucleotide composition around PACs was AT-rich, and the common upstream motif was AAUAAA, which is consistent with other plants. According to this study, only 2,830 genes still maintained the use of the AAUAAA motif. This large-scale dataset provides useful insights into the regulation of gene expression and could be used as evidence to validate the annotation of transcript ends.
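
The distinction between APA genes with a dominant PAC and those using several PACs more evenly can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `classify_genes`, the `min_share` cutoff (a PAC counts as dominant here if it holds more than half of a gene's PAC-supporting reads), the gene names, and the read counts are all assumptions made up for illustration.

```python
from typing import Dict, List


def classify_genes(pac_reads: Dict[str, List[int]],
                   min_share: float = 0.5) -> Dict[str, str]:
    """Label each gene by its poly(A) cluster (PAC) usage.

    pac_reads maps a gene ID to the read counts of its PACs.
    min_share is an assumed cutoff, not a value taken from the study.
    """
    labels = {}
    for gene, reads in pac_reads.items():
        if not reads:
            labels[gene] = "no_pac"          # no PAC detected
        elif len(reads) == 1:
            labels[gene] = "single_pac"      # one PAC, so not an APA gene
        elif max(reads) > min_share * sum(reads):
            labels[gene] = "apa_dominant"    # one PAC holds most of the reads
        else:
            labels[gene] = "apa_balanced"    # reads spread across PACs
    return labels


def apa_fraction(labels: Dict[str, str]) -> float:
    """Fraction of PAC-containing genes that have two or more PACs."""
    with_pac = [v for v in labels.values() if v != "no_pac"]
    if not with_pac:
        return 0.0
    apa = [v for v in with_pac if v.startswith("apa_")]
    return len(apa) / len(with_pac)


# Toy input: hypothetical gene IDs and invented read counts.
toy = {
    "geneA": [120],          # single PAC
    "geneB": [90, 10],       # two PACs, one clearly dominant
    "geneC": [40, 35, 25],   # three PACs, none above the cutoff
    "geneD": [],             # no PAC
}
labels = classify_genes(toy)
```

With this toy input, `geneB` is labeled `apa_dominant` and `geneC` `apa_balanced`, and `apa_fraction(labels)` gives the share of PAC-containing genes that are APA genes, analogous in spirit to the 76% reported above.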

Keywords: 3′-UTR annotations; Alternative polyadenylation (APA); Zea mays; lncRNA; non-coding RNA.

MeSH terms

  • 3' Untranslated Regions
  • Computational Biology
  • Gene Expression Profiling*
  • Gene Expression Regulation, Plant*
  • Gene Ontology
  • Genome, Plant
  • Genome-Wide Association Study* / methods
  • Genomics / methods
  • High-Throughput Nucleotide Sequencing*
  • Polyadenylation*
  • RNA, Long Noncoding
  • Transcriptome*
  • Zea mays / genetics*

Substances

  • 3' Untranslated Regions
  • RNA, Long Noncoding