Lifting of the 1,000 wheat exome project SNPs from Triticum aestivum cv. Chinese Spring assembly RefSeq v1.0 to RefSeq v2.1

BMC Res Notes. 2023 Sep 14;16(1):220. doi: 10.1186/s13104-023-06496-8.

Abstract

Objective: The 1,000 wheat exome project captured the single nucleotide variants in the coding regions of a diverse set of 890 wheat accessions to analyse the contribution of introgression to adaptation of wheat. However, this highly useful single nucleotide polymorphism (SNP) dataset is based on RefSeq v1.0 of the International Wheat Genome Sequencing Consortium (IWGSC) assembly of the bread wheat genome of Chinese Spring. This reference sequence has recently been updated using optical maps and long-read sequencing to produce the improved RefSeq v2.1. Our objective was to develop a reliable high-density SNP dataset positioned onto RefSeq v2.1 because it is the current standard reference sequence used by wheat researchers.

Results: The 3,039,822 SNPs originally positioned on RefSeq v1.0 were projected to v2.1 using Liftoff with four different flanking-region lengths, and 2,946,536 SNPs were consistently lifted to the same location irrespective of the flanking-region length. Of these, 2,799,166 were located on the positive ('+') strand. The distribution of the SNPs across the 21 chromosomes on RefSeq v2.1 was similar to that on RefSeq v1.0. Among the SNPs that were located on unanchored scaffolds in RefSeq v1.0, 11,938 were projected to one of the 21 pseudomolecules in the upgraded assembly. This SNP dataset constitutes a much-needed standardized resource for the wheat research community.
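
The consistency criterion described in the Results (keeping only SNPs that land at the same location for every flanking-region length) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function name `consensus_positions`, the `(chromosome, position, strand)` tuple layout, and the toy SNP identifiers are hypothetical, and the step that parses Liftoff output into these dictionaries is assumed to have happened already. Only the SNP counts at the end are taken from the abstract.

```python
def consensus_positions(runs):
    """Return SNPs placed identically in every lift-over run.

    `runs` is a list of dicts, one per flanking-region length, each mapping
    a SNP identifier to a (chromosome, position, strand) tuple.
    This data layout is an assumption made for illustration.
    """
    if not runs:
        return {}
    first, rest = runs[0], runs[1:]
    consensus = {}
    for snp_id, placement in first.items():
        # Keep the SNP only if every other run reports the same placement;
        # a SNP missing from any run is dropped because .get() returns None.
        if all(run.get(snp_id) == placement for run in rest):
            consensus[snp_id] = placement
    return consensus


# Toy data for four hypothetical flanking-region lengths.
runs = [
    {"snp1": ("chr1A", 1000, "+"), "snp2": ("chr1A", 2000, "+"), "snp3": ("chr2B", 500, "-")},
    {"snp1": ("chr1A", 1000, "+"), "snp2": ("chr1A", 2050, "+"), "snp3": ("chr2B", 500, "-")},
    {"snp1": ("chr1A", 1000, "+"), "snp2": ("chr1A", 2000, "+"), "snp3": ("chr2B", 500, "-")},
    {"snp1": ("chr1A", 1000, "+"), "snp2": ("chr1A", 2000, "+")},
]

consistent = consensus_positions(runs)

# Subset of consistently lifted SNPs that sit on the '+' strand.
plus_strand = {k: v for k, v in consistent.items() if v[2] == "+"}

# Fraction of the original SNPs retained, using the counts in the abstract.
total_snps = 3_039_822
consistent_snps = 2_946_536
retained_fraction = consistent_snps / total_snps
print(f"retained: {retained_fraction:.2%}")
```

In the toy data, `snp2` is dropped because one run places it at a different position, and `snp3` is dropped because one run fails to lift it at all, which mirrors the two ways a SNP can fail the consistency filter.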

Keywords: Chinese Spring reference genome; Liftoff; RefSeq v2.1; SNP lifting.

MeSH terms

  • Chromosome Mapping
  • Exome*
  • Polymorphism, Single Nucleotide
  • Triticum* / genetics