Evaluation of linkage disequilibrium, population structure, and genetic diversity in the U.S. peanut mini core collection

BMC Genomics. 2019 Jun 11;20(1):481. doi: 10.1186/s12864-019-5824-9.

Abstract

Background: Because peanut was domesticated recently from a single tetraploidization event, relatively little genetic diversity underlies the extensive morphological and agronomic diversity of today's peanut cultivars. To broaden the genetic variation available to future breeding programs, it is necessary to characterize germplasm accessions for new sources of variation and to leverage the power of genome-wide association studies (GWAS) to discover markers associated with traits of interest. We report an analysis of linkage disequilibrium (LD), population structure, and genetic diversity, and examine the ability of GWAS to infer marker-trait associations in the U.S. peanut mini core collection genotyped with a 58K SNP array.

Results: LD persists over long distances in the collection, with r² reaching its half-decay distance at 3.78 Mb. Structure within the collection is best explained by separation into four or five groups (K = 4 and K = 5). At K = 4 and K = 5, accessions clustered loosely according to market type and subspecies, though with numerous exceptions. Of 107 accessions, 43 clustered in correspondence with the main market type subgroup, whereas 34 did not. The remaining 30 accessions either lacked a taxonomic classification or were classified as mixed. Phylogenetic network analysis also clustered accessions into approximately five groups based on their genotypes, with loose correspondence to subspecies and market type. Genome-wide association analysis was performed on these lines for 12 seed composition and quality traits. Significant marker associations were identified for arachidic and behenic fatty acid compositions; these fatty acids, despite having low bioavailability in peanut, have been reported to raise cholesterol levels in humans. Other traits, such as blanchability, showed consistent associations in multiple tests, with plausible candidate genes.
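As an illustration of what a half-decay distance measures, the following minimal sketch computes r² between two markers and then finds the distance at which binned mean r² falls to half of its maximum. It is not the authors' pipeline: the function names, the use of genotype dosages (0/1/2) rather than phased haplotypes, the definition of "half" as half the maximum binned value, and the linear interpolation between bins are all assumptions made for this example, and the numbers in the usage lines are invented.

```python
from statistics import mean


def r_squared(g1, g2):
    """Squared Pearson correlation between two genotype dosage vectors.

    Assumption: genotypes are coded as 0/1/2 dosages; this approximates,
    but is not identical to, haplotype-based LD.
    """
    m1, m2 = mean(g1), mean(g2)
    cov = sum((a - m1) * (b - m2) for a, b in zip(g1, g2))
    var1 = sum((a - m1) ** 2 for a in g1)
    var2 = sum((b - m2) ** 2 for b in g2)
    if var1 == 0 or var2 == 0:
        return 0.0  # monomorphic marker: correlation undefined, treated as 0
    return cov * cov / (var1 * var2)


def half_decay_distance(binned):
    """Distance at which mean r2 first drops to half of its maximum.

    binned: list of (distance_mb, mean_r2) pairs sorted by distance.
    Returns None if r2 never falls to the half-maximum level.
    """
    target = max(r2 for _, r2 in binned) / 2
    for (d0, r0), (d1, r1) in zip(binned, binned[1:]):
        if r0 > target >= r1:
            # Linear interpolation between the two bins that bracket the target
            return d0 + (r0 - target) * (d1 - d0) / (r0 - r1)
    return None


# Hypothetical data, for illustration only (not values from the study)
example_bins = [(0.5, 0.8), (1.5, 0.6), (2.5, 0.5), (3.5, 0.3)]
example_distance = half_decay_distance(example_bins)
example_r2 = r_squared([0, 1, 2, 0], [0, 1, 2, 0])
```

With real data, `binned` would be built by computing `r_squared` for many marker pairs and averaging within physical-distance bins; a long half-decay distance such as the one reported here implies large LD blocks and therefore coarse mapping resolution.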

Conclusions: Based on the GWAS, the population structure analysis, and additional simulation results, we find that the primary limitations of this collection for GWAS are its small size, significant remaining structure and genetic similarity, and long LD blocks that limit the resolution of association mapping. These results can be used to improve GWAS in peanut in future studies, for example by increasing the size of, and reducing the structure in, the collections used for GWAS.

Keywords: Genetic diversity; Genome wide association; Linkage disequilibrium; Phylogenetic network tree; Population structure.

MeSH terms

  • Arachis / genetics*
  • Chromosomes, Plant / genetics
  • Gene Frequency
  • Genetic Variation*
  • Genome-Wide Association Study
  • Haplotypes
  • Linkage Disequilibrium*
  • Phylogeny
  • Polymorphism, Single Nucleotide
  • Population Dynamics