Genetic diversity of avocado (Persea americana Mill.) germplasm using pooled sequencing

BMC Genomics. 2019 May 15;20(1):379. doi: 10.1186/s12864-019-5672-7.

Abstract

Background: Discovering a genome-wide set of avocado (Persea americana Mill.) single nucleotide polymorphisms (SNPs) and characterizing the diversity of a germplasm collection are powerful tools for breeding. However, discovery is costly, because many candidate loci prove to be non-informative when the germplasm is genotyped.

Results: Our study used a collection of 100 accessions comprising the three avocado races: Guatemalan, Mexican, and West Indian. To increase the chances of discovering polymorphic loci, three pools of genomic DNA, one from each race, were sequenced and the reads were aligned to a reference transcriptome. In total, 507,917 polymorphic loci were identified in the entire collection. Of these, 345,617 were observed in all three pools, 117,692 in two pools, 44,552 in one of the pools, and only 56 (about 0.01%) were homozygous in the three pools but for different alleles. The predicted polymorphic loci were validated by genotyping the accessions within each pool at 192 randomly selected SNPs. The sensitivity of polymorphic locus prediction ranged from 0.77 to 0.94. The correlation between the allele frequency estimated from the pooled sequences and the actual allele frequency from genotype calling of individual accessions was r = 0.8. A subset of 109 SNPs was then used to evaluate the genetic relationships among avocado accessions and the genetic diversity of the collection. The three races formed distinct clusters when the genetic variation was projected onto a PCA plot. As expected, kinship coefficients estimated for all accessions showed that many of the cultivars from the California breeding program were closely related to each other, especially the Hass-like ones. The green-skin avocados, e.g., 'Bacon', 'Zutano', 'Ettinger' and 'Fuerte', were also closely related to each other.
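
The validation quantities in the Results can be illustrated with a minimal Python sketch. The four category counts are taken from the abstract. Everything else is an assumption made for illustration: the function names `sensitivity` and `pearson_r` are hypothetical, sensitivity is computed with the standard definition TP / (TP + FN), which may differ in detail from the authors' calculation, and the toy inputs in the usage note are invented, not study data.

```python
from math import sqrt

# Category counts as reported in the abstract.
counts = {
    "polymorphic in all three pools": 345_617,
    "polymorphic in two pools": 117_692,
    "polymorphic in one pool": 44_552,
    "homozygous in every pool, different alleles": 56,
}
total = sum(counts.values())  # should match the reported 507,917 loci
shares = {name: 100 * n / total for name, n in counts.items()}  # percent of all loci


def sensitivity(predicted, observed):
    """Standard sensitivity, TP / (TP + FN) (assumed definition).

    predicted[i]: locus i was called polymorphic from the pooled reads.
    observed[i]:  locus i was polymorphic when accessions were genotyped.
    """
    tp = sum(1 for p, o in zip(predicted, observed) if p and o)
    fn = sum(1 for p, o in zip(predicted, observed) if not p and o)
    return tp / (tp + fn)


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of allele frequencies."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)
```

As a usage example with invented values, `sensitivity([True, True, False, True], [True, True, True, False])` counts two true positives and one false negative, and `pearson_r(pool_freqs, genotype_freqs)` would compare pooled estimates against frequencies from individual genotype calls at the same loci.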

Conclusions: A framework for SNP discovery and genetic characterization of a breeder's accessions is described. Sequencing pools of gDNA is a cost-effective approach to creating a genome-wide stock of polymorphic loci for a breeding program. Reassessing the botanical and genetic knowledge about the germplasm accessions is valuable for future breeding. Kinship analysis may be used as a first step in finding parental candidates in parentage analyses.

Keywords: Avocado; FST; Germplasm collection; Kinship; SNP.

MeSH terms

  • DNA, Plant / genetics
  • Genetics, Population*
  • Genome, Plant*
  • High-Throughput Nucleotide Sequencing / methods*
  • Persea / classification*
  • Persea / genetics*
  • Polymorphism, Single Nucleotide*
  • Seeds / genetics*

Substances

  • DNA, Plant