Transcriptome-guided annotation and functional classification of long non-coding RNAs in Arabidopsis thaliana

Sci Rep. 2022 Aug 18;12(1):14063. doi: 10.1038/s41598-022-18254-0.

Abstract

Long non-coding RNAs (lncRNAs) are a prominent class of eukaryotic regulatory genes. Despite the many transcriptomic datasets now available, the annotation of plant lncRNAs still rests on dated annotations that have been carried over historically. We present a substantially improved annotation of Arabidopsis thaliana lncRNAs, generated by integrating 224 transcriptomes from multiple tissues, conditions, and developmental stages. We annotate 6764 lncRNA genes, of which 3772 are novel. We characterize their tissue expression patterns and find that 1425 lncRNAs are co-expressed with coding genes, with enriched functional categories such as chloroplast organization, photosynthesis, RNA regulation, transcription, and root development. This improved transcriptome-guided annotation constitutes a valuable resource for studying lncRNAs and the biological processes they may regulate.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Arabidopsis* / metabolism
  • Molecular Sequence Annotation
  • RNA, Long Noncoding* / metabolism
  • Transcriptome / genetics

Substances

  • RNA, Long Noncoding