Genome-wide association studies and genomic selection assays made in a large sample of cacao (Theobroma cacao L.) germplasm reveal significant marker-trait associations and good predictive value for improving yield potential

PLoS One. 2022 Oct 6;17(10):e0260907. doi: 10.1371/journal.pone.0260907. eCollection 2022.

Abstract

A genome-wide association study (GWAS) was undertaken to unravel marker-trait associations (MTAs) between SNP markers and phenotypic traits. It involved a subset of 421 cacao accessions from the large and diverse collection conserved ex situ at the International Cocoa Genebank Trinidad. A Mixed Linear Model (MLM) in TASSEL was used for the GWAS and followed by confirmatory analyses using GAPIT FarmCPU. An average linkage disequilibrium (r²) of 0.10 at 5.2 Mb was found across several chromosomes. Seventeen significant (P ≤ 8.17 × 10⁻⁵; −log₁₀(P) = 4.088) MTAs of interest, including six that pertained to yield-related traits, were identified using TASSEL MLM. The latter accounted for 5 to 17% of the phenotypic variation expressed. The highly significant association (P ≤ 8.17 × 10⁻⁵) between seed length to width ratio and TcSNP 733 on chromosome 5 was verified with FarmCPU (P ≤ 1.12 × 10⁻⁸). Fourteen MTAs were common to both the TASSEL and FarmCPU models at P ≤ 0.003. The most significant yield-related MTAs involved seed number and seed length on chromosome 7 (P ≤ 1.15 × 10⁻¹⁴ and P ≤ 6.75 × 10⁻⁵, respectively) and seed number on chromosome 1 (P ≤ 2.38 × 10⁻⁵), based on the TASSEL MLM. It was noteworthy that seed length, seed length to width ratio and seed number were associated with markers at different loci, indicating their polygenic nature. Approximately 40 candidate genes that encode embryo and seed development, protein synthesis, carbohydrate transport and lipid biosynthesis and transport were identified in the flanking regions of the significantly associated SNPs and in linkage disequilibrium with them. A significant association of fruit surface anthocyanin intensity co-localised with MYB-related protein 308 on chromosome 4.
Testing of a genomic selection approach revealed good predictive value (genomic estimated breeding values, GEBV) for economic traits such as seed number (GEBV = 0.611), seed length (0.6199), seed width (0.5435), seed length to width ratio (0.5503), seed/cotyledon mass (0.6014) and ovule number (0.6325). The findings of this study could facilitate genomic selection and marker-assisted breeding of cacao, thereby expediting improvement in the yield potential of cacao planting material.
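
The abstract reports its significance cut-off both as a P-value and on the −log₁₀ scale, the scale commonly used on the vertical axis of Manhattan plots. The short Python sketch below only illustrates that conversion and checks the P-values quoted above against the stated threshold. It is not the authors' analysis pipeline: the dictionary labels and the helper names `neg_log10` and `is_significant` are hypothetical and chosen for illustration, and the only data used are the numbers printed in the abstract.

```python
import math

# Significance threshold stated in the abstract (P <= 8.17e-5).
THRESHOLD = 8.17e-5

# P-value upper bounds quoted in the abstract. The labels are informal
# descriptions written for this example, not identifiers from the study.
REPORTED_P = {
    "seed number, chromosome 7 (TASSEL MLM)": 1.15e-14,
    "seed length, chromosome 7 (TASSEL MLM)": 6.75e-5,
    "seed number, chromosome 1 (TASSEL MLM)": 2.38e-5,
    "seed length to width ratio, TcSNP 733 (FarmCPU)": 1.12e-8,
}


def neg_log10(p):
    """Return -log10(p), the scale on which GWAS significance is often plotted."""
    return -math.log10(p)


def is_significant(p, threshold=THRESHOLD):
    """Return True when p is at or below the significance threshold."""
    return p <= threshold


if __name__ == "__main__":
    print(f"threshold on the -log10 scale: {neg_log10(THRESHOLD):.3f}")
    for label, p in REPORTED_P.items():
        status = "passes" if is_significant(p) else "fails"
        print(f"{label}: -log10(P) = {neg_log10(p):.2f} ({status} threshold)")
```

On this scale a larger value means stronger evidence, so an association is significant when its −log₁₀(P) is at least that of the threshold.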

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Anthocyanins
  • Cacao* / genetics
  • Genome-Wide Association Study*
  • Genomics
  • Genotype
  • Linkage Disequilibrium
  • Lipids
  • Phenotype
  • Plant Breeding
  • Polymorphism, Single Nucleotide

Substances

  • Anthocyanins
  • Lipids

Grants and funding

Financial support from the Government of Trinidad and Tobago and the Cocoa Research Association, UK, and CIRAD, France that facilitated collation of the phenotypic and genotypic data, respectively, is gratefully acknowledged. However, the study design, conduct of this research and preparation of the manuscript were not influenced by the funding agencies.