Identification of putative flowering genes and transcription factors from flower de novo transcriptome dataset of tuberose (Polianthes tuberosa L.)

Data Brief. 2018 Sep 22;20:2027-2035. doi: 10.1016/j.dib.2018.09.051. eCollection 2018 Oct.

Abstract

Polianthes tuberosa is commercially popular because of its economic importance in floriculture for cut and loose flowers, and in the perfume industry because of its unique fragrance. Despite this commercial importance, no ready-to-use transcript sequence information is available in public databases. We sequenced RNA obtained from tuberose flowers using the Illumina HiSeq 2000 platform and carried out a de novo analysis of the transcriptome data. The de novo assembly generated 11,100 transcripts, representing a total of 7876 unigenes that were considered for downstream analysis. These 7876 unigenes were further annotated using Blast2GO. Tuberose transcripts were also assigned to metabolic pathways using the Kyoto Encyclopedia of Genes and Genomes (KEGG) database to determine their biochemical functions. Of the tuberose transcripts, 4591 matched genes in KEGG pathways, and 66 transcripts were mapped to the flavonoid biosynthesis pathway. Twenty-one flowering genes were identified in this tuberose transcriptome. Transcription factor analysis helped identify a large number of transcripts similar to key genes in the flowering regulation network of Arabidopsis thaliana. Among the transcription factors identified, NAC, which is associated with plant stress response, was the most abundant category, followed by APETALA2 (AP2)/ethylene-responsive element binding proteins (EREBPs), which play various roles in floral organ identity and respond to different biotic and abiotic stresses.

Keywords: Amaryllidaceae; Flower specific genes; KEGG; Transcription factors; Transcriptome analysis; Tuberose.