Transcriptome sequencing of lentil based on second-generation technology permits large-scale unigene assembly and SSR marker discovery

BMC Genomics. 2011 May 25;12:265. doi: 10.1186/1471-2164-12-265.

Abstract

Background: Lentil (Lens culinaris Medik.) is a cool-season grain legume that provides a rich source of protein for human consumption. Compared with other Fabaceae species, the genomic resources for lentil are relatively underdeveloped, with limited data available. There is hence a significant need to enhance such resources in order to identify novel genes and alleles for molecular breeding to increase crop productivity and quality.

Results: Tissue-specific cDNA samples from six distinct lentil genotypes were sequenced using Roche 454 GS-FLX Titanium technology, generating c. 1.38 × 10^6 expressed sequence tags (ESTs). De novo assembly generated a total of 15,354 contigs and 68,715 singletons. The complete unigene set was compared against genome drafts of the model species Medicago truncatula and Arabidopsis thaliana, identifying 12,639 and 7,476 unique matches, respectively. When compared to the genome of Glycine max, a total of 20,419 unique hits were observed, corresponding to c. 31% of the known gene space. A total of 25,592 lentil unigenes were subsequently annotated from GenBank. Simple sequence repeat (SSR)-containing ESTs were identified from consensus sequences and a total of 2,393 primer pairs were designed. A subset of 192 EST-SSR markers was screened for validation across a panel of 12 cultivated lentil genotypes and one wild relative species. A total of 166 primer pairs yielded successful amplification, of which 47.5% detected genetic polymorphism.
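
The SSR identification step mentioned above can be illustrated with a minimal Python sketch. The abstract does not state which tool, motif lengths, or repeat thresholds were used, so the function name find_ssrs and the thresholds in MIN_REPEATS are hypothetical choices for illustration only. The sketch finds perfect di-, tri-, and tetranucleotide repeats in a single sequence and is not the authors' pipeline.

```python
import re

# Assumed minimum repeat counts per motif length (illustrative, not from the source).
MIN_REPEATS = {2: 6, 3: 5, 4: 4}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. not AA or ATAT)."""
    n = len(motif)
    return not any(
        n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n)
    )


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # Capture one motif, then require at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip homopolymers and motifs already counted at a shorter length.
            if not is_primitive(motif):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return sorted(hits)


if __name__ == "__main__":
    # Toy sequence containing (AT)7 and (GAA)5; start positions are 0-based.
    example = "GGC" + "AT" * 7 + "CCGT" + "GAA" * 5 + "TTC"
    for start, motif, count in find_ssrs(example):
        print(start, motif, count)
```

On the toy sequence this reports an (AT)7 repeat at position 3 and a (GAA)5 repeat at position 21. A real marker-discovery workflow would additionally handle compound and imperfect repeats and pass the flanking sequence to a primer-design tool.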

Conclusions: A substantial collection of ESTs has been developed from sequence analysis of lentil genotypes using second-generation technology, permitting unigene definition across a broad range of functional categories. As well as providing resources for functional genomics studies, the unigene set has permitted significant enhancement of the number of publicly-available molecular genetic markers as tools for improvement of this species.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Cluster Analysis
  • DNA Primers / genetics
  • DNA, Complementary / genetics
  • Expressed Sequence Tags / metabolism
  • Gene Expression Profiling / methods*
  • Genetic Markers / genetics*
  • Genotype
  • Lens Plant / genetics*
  • Lens Plant / growth & development
  • Minisatellite Repeats / genetics*
  • Molecular Sequence Annotation
  • Reproducibility of Results
  • Sequence Analysis, DNA / methods*

Substances

  • DNA Primers
  • DNA, Complementary
  • Genetic Markers