Near-gapless genome assemblies of Williams 82 and Lee cultivars for accelerating global soybean research

Plant Genome. 2023 Dec;16(4):e20382. doi: 10.1002/tpg2.20382. Epub 2023 Sep 25.

Abstract

Complete, gapless telomere-to-telomere chromosome assemblies are a prerequisite for comprehensively investigating the architecture of complex regions, such as centromeres and telomeres, and for removing uncertainties in the order, spacing, and orientation of genes. Using complementary genomics technologies and assembly algorithms, we developed highly contiguous, nearly gapless genome assemblies for two economically important soybean [Glycine max (L.) Merr.] cultivars (Williams 82 and Lee). The centromeres were distinctly annotated on all chromosomes of both assemblies. We further found that the canonical telomeric repeats were present at the telomeres of all chromosomes in both the Williams 82 and Lee genomes. A total of 10 chromosomes in Williams 82 and eight in Lee were entirely reconstructed as single contigs without any gaps. Using a combination of ab initio prediction, protein homology, and transcriptome evidence, we identified 58,287 and 56,725 protein-coding genes in Williams 82 and Lee, respectively. The genome assemblies and annotations will serve as a valuable resource for studying soybean genomics and genetics and for accelerating soybean improvement.
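
The two per-chromosome properties reported above (no gaps, and telomeric repeats at both ends) can be illustrated with a minimal Python sketch. This is not the authors' pipeline. It assumes that gaps are encoded as runs of N, that the canonical repeat is the Arabidopsis-type plant motif TTTAGGG (the abstract does not name the motif), and that a fixed end window and repeat threshold are adequate. All function names and default values are hypothetical.

```python
import re

# Assumption: Arabidopsis-type plant telomeric repeat; the abstract does not name the motif.
TELOMERE_MOTIF = "TTTAGGG"

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_gaps(seq):
    """Count assembly gaps, assuming each gap is encoded as a run of N."""
    return len(re.findall(r"N+", seq.upper()))


def telomere_repeats_at_ends(seq, window=1000, motif=TELOMERE_MOTIF):
    """Count motif copies (either strand) within `window` bases of each end.

    `window` is a hypothetical default, not a value taken from the paper.
    Returns a (start_count, end_count) tuple.
    """
    seq = seq.upper()
    rc = revcomp(motif)
    head, tail = seq[:window], seq[-window:]
    return (head.count(motif) + head.count(rc),
            tail.count(motif) + tail.count(rc))


def looks_gapless_t2t(seq, window=1000, min_repeats=3):
    """Heuristic: no N runs and at least `min_repeats` motif copies at both ends."""
    start, end = telomere_repeats_at_ends(seq, window)
    return count_gaps(seq) == 0 and start >= min_repeats and end >= min_repeats


# Toy sequences for illustration only, not real soybean data.
toy_gapless = "CCCTAAA" * 5 + "ACGT" * 10 + "TTTAGGG" * 5
toy_gapped = "CCCTAAA" * 5 + "ACGT" * 10 + "NNNN" + "ACGT" * 10 + "TTTAGGG" * 5
```

Applied to each chromosome sequence of an assembly, a check like looks_gapless_t2t gives a quick first-pass tally of gap-free, telomere-capped chromosomes. A real analysis would still need to confirm the motif, window size, and repeat threshold against the data.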

MeSH terms

  • Algorithms
  • Genome*
  • Genomics
  • Glycine max* / genetics