Impact of marker ascertainment bias on genomic selection accuracy and estimates of genetic diversity

PLoS One. 2013 Sep 5;8(9):e74612. doi: 10.1371/journal.pone.0074612. eCollection 2013.

Abstract

Genome-wide molecular markers are often used to evaluate genetic diversity in germplasm collections and to make genomic selections in breeding programs. To accurately predict phenotypes and assay genetic diversity, molecular markers should assay a representative sample of the polymorphisms in the population under study. Ascertainment bias arises when marker data are not obtained from a random sample of the polymorphisms in the population of interest. Genotyping-by-sequencing (GBS) is rapidly emerging as a low-cost genotyping platform, even for the large, complex, and polyploid wheat (Triticum aestivum L.) genome. With GBS, marker discovery and genotyping occur simultaneously, resulting in minimal ascertainment bias. The previous platform of choice for whole-genome genotyping in many species, including wheat, was DArT (Diversity Array Technology), which has formed the basis of most of our knowledge about genetic diversity in cereals. This study compared the GBS and DArT marker platforms for measuring genetic diversity and genomic selection (GS) accuracy in elite U.S. soft winter wheat. From a set of 365 breeding lines, 38,412 single nucleotide polymorphism (SNP) GBS markers were discovered and genotyped. The GBS SNPs gave a higher GS accuracy than 1,544 DArT markers on the same lines, despite 43.9% missing data. Using a bootstrap approach, we observed significantly more clustering of markers and ascertainment bias with DArT relative to GBS. The minor allele frequency distribution of GBS markers had a deficit of rare variants compared to DArT markers. Despite the ascertainment bias of the DArT markers, GS accuracy for three traits out of four was not significantly different when an equal number of markers was used for each platform. This suggests that the gain in accuracy observed using GBS compared to DArT markers was mainly due to a large increase in the number of markers available for the analysis.
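
As an illustration of two quantities the abstract relies on, the sketch below computes a per-marker minor allele frequency while skipping missing calls, and draws a random marker subset of a fixed size, as one might do to compare platforms on an equal number of markers. It is a minimal sketch under stated assumptions, not the authors' pipeline: the 0/1 coding for inbred lines, the use of `None` for missing calls, the function names, and the toy matrix are all hypothetical.

```python
import random


def minor_allele_frequency(column):
    """MAF for one marker. Calls are coded 0/1 (assumed inbred lines);
    None marks a missing call and is ignored."""
    observed = [g for g in column if g is not None]
    if not observed:
        return None  # marker with no calls at all
    freq = sum(observed) / len(observed)  # frequency of the allele coded 1
    return min(freq, 1.0 - freq)


def subsample_markers(n_total, n_keep, seed=0):
    """Sorted indices of a random marker subset drawn without replacement,
    e.g. to match the marker count of a smaller platform."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(n_total), n_keep))


# Toy genotype matrix (hypothetical data): rows are lines, columns are markers.
toy = [
    [0, 1, None],
    [0, 1, 1],
    [1, None, 1],
    [0, 0, 1],
]

# zip(*toy) iterates over the matrix column by column (marker by marker).
mafs = [minor_allele_frequency(col) for col in zip(*toy)]

# Draw as many marker indices as the DArT set from a GBS-sized index range.
subset = subsample_markers(n_total=38412, n_keep=1544, seed=1)
```

Repeating the subsampling with different seeds and refitting the prediction model each time would give a distribution of accuracies at a matched marker count. The specific resampling scheme used in the study is not described in the abstract.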

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Gene Frequency
  • Genetic Markers*
  • Genetic Variation*
  • Genome, Plant*
  • Genotype
  • Models, Statistical
  • Phenotype
  • Polymorphism, Single Nucleotide
  • Principal Component Analysis
  • Reproducibility of Results
  • Sequence Analysis, DNA
  • Triticum / genetics*

Substances

  • Genetic Markers

Grants and funding

This research was supported in part by the Bill and Melinda Gates Foundation (Durable Rust Resistance in Wheat), USDA-NIFA-AFRI grant award number 2011-68002-30029 and by Hatch project 149-449. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.