Taming the massive genome of Scots pine with PiSy50k, a new genotyping array for conifer research

Plant J. 2022 Mar;109(5):1337-1350. doi: 10.1111/tpj.15628. Epub 2022 Jan 16.

Abstract

Pinus sylvestris (Scots pine) is the most widespread coniferous tree in the boreal forests of Eurasia, with major economic and ecological importance. However, its large and repetitive genome presents a challenge for conducting genome-wide analyses such as association studies, genetic mapping and genomic selection. We present a new 50K single-nucleotide polymorphism (SNP) genotyping array for Scots pine research, breeding and other applications. To select the SNP set, we first genotyped 480 Scots pine samples on a 407 540 SNP screening array and identified 47 712 high-quality SNPs for the final array (called 'PiSy50k'). Here, we provide details of the design and testing, as well as allele frequency estimates from the discovery panel, functional annotation, tissue-specific expression patterns and expression level information for the SNPs or corresponding genes, when available. We validated the performance of the PiSy50k array using samples from Finland and Scotland. Overall, 39 678 SNPs (83.2% of the 47 712 on the array) showed low error rates (mean = 0.9%). Relatedness estimates based on array genotypes were consistent with the expected pedigrees, and the level of Mendelian error was negligible. In addition, array genotypes successfully discriminated between Scots pine populations of Finnish and Scottish origins. The PiSy50k SNP array will be a valuable tool for a wide variety of future genetic studies and forestry applications.
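
As an illustration of the validation ideas mentioned above, the following minimal sketch shows one common way to count Mendelian errors at a biallelic SNP in parent-offspring trios, and checks the reported passing fraction. It is not the authors' pipeline, which the abstract does not describe. The 0/1/2 genotype coding (copies of the alternate allele), the use of None for missing calls, and the function names are all assumptions made for this example.

```python
from itertools import product

# Assumed coding: genotype = number of alternate-allele copies (0, 1 or 2).
# Each genotype can transmit these alleles (0 = reference, 1 = alternate).
GAMETES = {0: (0,), 1: (0, 1), 2: (1,)}


def is_mendelian_consistent(mother, father, child):
    """Return True if the child genotype can arise from the two parents."""
    possible = {a + b for a, b in product(GAMETES[mother], GAMETES[father])}
    return child in possible


def mendelian_error_rate(trios):
    """Fraction of fully called (mother, father, child) trios that are inconsistent.

    Trios containing a missing call (None) are skipped.
    """
    called = [t for t in trios if None not in t]
    if not called:
        return float("nan")
    errors = sum(not is_mendelian_consistent(*t) for t in called)
    return errors / len(called)


# Arithmetic check of the figure quoted in the abstract:
# 39 678 of the 47 712 array SNPs showed low error rates.
passing_fraction = 39678 / 47712
print(f"{passing_fraction:.1%}")
```

In this sketch a trio such as (0, 0, 1) counts as an error because two homozygous-reference parents cannot produce a heterozygous child, whereas (1, 1, 2) is consistent.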

Keywords: Pinus sylvestris; genetic diversity; genotyping; pedigree; single-nucleotide polymorphism.

MeSH terms

  • Genome-Wide Association Study
  • Genotype
  • Pinus sylvestris* / genetics
  • Plant Breeding
  • Polymorphism, Single Nucleotide / genetics
  • Tracheophyta* / genetics