Domestication and improvement genes reveal the differences of seed size- and oil-related traits in soybean domestication and improvement

Comput Struct Biotechnol J. 2022 Jun 13:20:2951-2964. doi: 10.1016/j.csbj.2022.06.014. eCollection 2022.

Abstract

To study the domestication and improvement of soybean seed size- and oil-related traits, domesticated and improved regions, loci, and candidate genes were identified in 286 soybean accessions using domestication and improvement analyses, genome-wide association studies, quantitative trait locus (QTL) mapping, and bulked segregant analyses. In total, 534 candidate domestication regions (CDRs) and 458 candidate improvement regions (CIRs) were identified and integrated with those from five and three previous studies, respectively, to obtain 952 CDRs and 538 CIRs. Separately, 1469 loci for seed size- and oil-related traits were identified and integrated with those in SoyBase to obtain 433 QTL clusters. Intersecting the two results gave 245 domestication and 221 improvement loci for these traits. Around these trait-related loci, 7 domestication and 7 improvement genes were found to be truly associated with these traits, and 372 candidate domestication and 87 candidate improvement genes were identified using gene expression, genomic SNP variants, miRNA binding, KEGG pathway, DNA methylation, and haplotype analyses. These genes were then used to explain the trait changes during domestication and improvement: the changes can be explained by the frequencies of elite haplotypes, base mutations in coding regions, and three factors affecting gene expression levels. In addition, 56 domestication and 15 improvement genes may be valuable for future soybean breeding. This study provides useful gene resources for future soybean breeding and molecular biology research.
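
The intersection step described in the abstract (candidate regions against trait QTL clusters) can be illustrated with a minimal Python sketch. The abstract does not state the overlap criterion or any flanking window, so this assumes plain coordinate overlap on the same chromosome. The `Interval` type, the `overlaps` and `intersect_loci` helpers, and all coordinates are hypothetical and are not taken from the study.

```python
from collections import defaultdict
from typing import NamedTuple


class Interval(NamedTuple):
    """A genomic interval; coordinates are 1-based and inclusive (assumption)."""
    chrom: str
    start: int
    end: int
    name: str


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two intervals share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end


def intersect_loci(regions, loci):
    """Map each locus name to the names of the regions it overlaps.

    Loci that overlap no region are omitted from the result.
    """
    by_chrom = defaultdict(list)
    for region in regions:
        by_chrom[region.chrom].append(region)

    hits = {}
    for locus in loci:
        matched = [r.name for r in by_chrom[locus.chrom] if overlaps(r, locus)]
        if matched:
            hits[locus.name] = matched
    return hits


# Hypothetical toy coordinates, NOT values from the study.
cdrs = [
    Interval("Chr01", 1_000, 5_000, "CDR_a"),
    Interval("Chr02", 10_000, 20_000, "CDR_b"),
]
qtl_clusters = [
    Interval("Chr01", 4_500, 6_000, "cluster_1"),    # overlaps CDR_a
    Interval("Chr01", 7_000, 8_000, "cluster_2"),    # no overlap
    Interval("Chr03", 10_000, 20_000, "cluster_3"),  # different chromosome
]

domestication_loci = intersect_loci(cdrs, qtl_clusters)
print(domestication_loci)  # {'cluster_1': ['CDR_a']}
```

The same function applied to improvement regions would give the improvement loci. For genome-scale inputs, a sorted sweep or an interval tree would be a more efficient choice than this per-chromosome linear scan.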

Keywords: Domestication; Genome-wide association study; Improvement; Seed oil content; Seed size; Soybean.

Abbreviations: 100SW, 100-seed weight; CDGs, candidate domestication genes; CDRs, candidate domestication regions; CIGs, candidate improvement genes; CIRs, candidate improvement regions; DAF, days after flowering; LA, linoleic acid; LNA, linolenic acid; LOD, logarithm of odds; OA, oleic acid; OIL, oil content; PA, palmitic acid; PCD, potential candidate domestication; PCI, potential candidate improvement; QTL, quantitative trait locus; QTNs, quantitative trait nucleotides; SA, stearic acid; SL, seed length; SLT, seed length to thickness ratio; SLW, seed length to width ratio; ST, seed thickness; SW, seed width; SWT, seed width to thickness ratio.