The first Illumina-based de novo transcriptome sequencing and analysis of safflower flowers

PLoS One. 2012;7(6):e38653. doi: 10.1371/journal.pone.0038653. Epub 2012 Jun 19.

Abstract

Background: The safflower, Carthamus tinctorius L., is a worldwide oil crop, and its flowers, which have a high flavonoid content, are an important medicinal resource against cardiovascular disease in traditional medicine. Because the safflower has a large and complex genome, the development of its genomic resources has been delayed. Second-generation Illumina sequencing is now an efficient route for generating an enormous volume of sequences that can represent a large number of genes and their expression levels.

Methodology/principal findings: To investigate the genes and pathways that might control flavonoids and other secondary metabolites in the safflower, we used Illumina sequencing to perform a de novo assembly of the safflower tubular flower tissue transcriptome. We obtained a total of 4.69 Gb of clean nucleotides comprising 52,119,104 clean sequencing reads, which were assembled into 195,320 contigs and 120,778 unigenes. Based on similarity searches against known proteins, we annotated 70,342 of the unigenes (about 58% of the identified unigenes) using an E-value cut-off of 10^-5. In total, 21,943 of the safflower unigenes were found to have COG classifications, and BLAST2GO assigned 26,332 of the unigenes to 1,754 GO term annotations. In addition, we assigned 30,203 of the unigenes to 121 KEGG pathways. When we focused on genes identified as contributing to flavonoid biosynthesis and the biosynthesis of unsaturated fatty acids, which are important pathways that control flower and seed quality, respectively, we found that these genes were fairly well conserved in the safflower genome compared to those of other plants.
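The counts reported above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative and not part of the authors' pipeline. It takes the figures directly from the abstract and derives the share of unigenes covered by each annotation source, plus the mean clean read length implied by the totals. The variable and function names are hypothetical, and 4.69 Gb is read here as 4.69 × 10^9 nucleotides, which is an assumption about the rounded value.

```python
# Figures quoted in the abstract; names below are illustrative only.
clean_bases = 4.69e9        # assumption: "4.69 Gb" means 4.69 x 10^9 nucleotides (rounded)
clean_reads = 52_119_104    # clean sequencing reads
unigenes = 120_778          # assembled unigenes

# Number of unigenes annotated by each source, as reported.
annotated = {
    "similarity search": 70_342,
    "COG": 21_943,
    "GO (BLAST2GO)": 26_332,
    "KEGG": 30_203,
}


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Mean read length implied by total clean bases and read count.
mean_read_length = clean_bases / clean_reads

# Share of all unigenes covered by each annotation source.
shares = {name: percent(count, unigenes) for name, count in annotated.items()}

print(f"mean clean read length: about {mean_read_length:.0f} nt")
for name, share in shares.items():
    print(f"{name}: {share}% of unigenes")
```

Under these assumptions the similarity-search share comes out at 58.2%, consistent with the "about 58%" stated in the text, and the totals imply a mean clean read length of roughly 90 nt.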

Conclusions/significance: Our study provides abundant genomic data for Carthamus tinctorius L. and offers comprehensive sequence resources for studying the safflower. We believe that these transcriptome datasets will serve as an important public information platform to accelerate studies of the safflower genome, and may help us define the mechanisms of flower tissue-specific and secondary metabolism in this non-model plant.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Biosynthetic Pathways / genetics
  • Carthamus tinctorius / genetics*
  • Carthamus tinctorius / metabolism
  • Computational Biology / methods
  • Flavonoids / biosynthesis
  • Flowers / genetics*
  • Flowers / metabolism
  • Gene Expression Profiling*
  • High-Throughput Nucleotide Sequencing
  • Molecular Sequence Annotation
  • Phenotype
  • Transcriptome*

Substances

  • Flavonoids