Genomic Resources for Water Yam (Dioscorea alata L.): Analyses of EST-Sequences, De Novo Sequencing and GBS Libraries

PLoS One. 2015 Jul 29;10(7):e0134031. doi: 10.1371/journal.pone.0134031. eCollection 2015.

Abstract

The falling cost of and rapid progress in next-generation sequencing techniques, coupled with high-performance computational approaches, have resulted in large-scale discovery of advanced genomic resources in several model and non-model plant species. Yam (Dioscorea spp.) is a major food and cash crop in many countries, but research efforts to understand its genetics and generate genomic information for the crop have been limited. The availability of a large number of genomic resources, including genome-wide molecular markers, will accelerate breeding efforts and the application of genomic selection in yams. In the present study, several methods, including expressed sequence tag (EST) sequencing, de novo sequencing, and genotyping-by-sequencing (GBS) profiling, were applied to two yam (Dioscorea alata L.) genotypes (TDa 95/00328 and TDa 95-310) to generate genomic resources for use in yam improvement programs. These resources include a comprehensive set of EST-SSRs, genomic SSRs, whole-genome SNPs, and reduced-representation SNPs. A total of 1,152 EST-SSRs were developed from >40,000 EST sequences generated from the two genotypes. A set of 388 EST-SSRs was validated as polymorphic, a polymorphism rate of 34%, when tested on two diverse parents targeted for anthracnose disease. In addition, approximately 40X de novo whole-genome sequence coverage was generated for each of the two genotypes, and a total of 18,584 and 15,952 genomic SSRs were identified for TDa 95/00328 and TDa 95-310, respectively. A custom-made pipeline selected 573 genomic SSRs common to the two genotypes, of which only eight failed, 478 were polymorphic, and 62 were monomorphic, indicating a polymorphism rate of 83.5%. Additionally, 288,505 high-quality SNPs were identified between these two genotypes. Genotyping-by-sequencing reads from these two genotypes also revealed 36,790 overlapping SNP positions distributed throughout the genome.
Our use of different approaches to generate genomic resources provides an unbiased glimpse into the publicly available EST sequences, the yam genome, and GBS profiles, and affirms that the genomic complexity of yam can be methodically unraveled. These resources constitute a critical foundation for future studies in linkage mapping, germplasm analysis, and predictive breeding.
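
The abstract does not say which tool or thresholds were used to identify SSRs, so the following is only a minimal sketch of how perfect microsatellite repeats can be located in a sequence. The function name `find_ssrs` and the minimum repeat counts in `MIN_REPEATS` are illustrative assumptions, not the study's pipeline or parameters.

```python
import re

# Assumed thresholds (not from the study): minimum number of tandem
# copies required for di-, tri- and tetranucleotide motifs.
MIN_REPEATS = {2: 6, 3: 5, 4: 4}


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # A motif of unit_len bases followed by at least min_rep - 1 copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are themselves a shorter repeat (e.g. "AA", "ATAT").
            if any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)


if __name__ == "__main__":
    example = "GGC" + "AT" * 7 + "CCG" + "GAA" * 5 + "TTC"
    for start, motif, count in find_ssrs(example):
        print(start, motif, count)
```

Applied to the same loci in two genotypes, a comparison of repeat counts from such a scan is one simple way to flag candidate polymorphic SSRs, although the markers reported here were validated experimentally.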

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • DNA, Plant / genetics*
  • Dioscorea / genetics*
  • Expressed Sequence Tags
  • Gene Library
  • Genetic Markers / genetics
  • Genome, Plant / genetics*
  • Genomics / methods
  • Genotype
  • High-Throughput Nucleotide Sequencing / methods
  • Polymorphism, Single Nucleotide / genetics

Substances

  • DNA, Plant
  • Genetic Markers

Grants and funding

RB thanks the USAID-Linkage grant, awarded through the International Institute of Tropical Agriculture, for sponsoring the research. The funding agency had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.