Genomic prediction using information across years with epistatic models and dimension reduction via haplotype blocks

PLoS One. 2023 Mar 31;18(3):e0282288. doi: 10.1371/journal.pone.0282288. eCollection 2023.

Abstract

The importance of accurate genomic prediction of phenotypes in plant breeding is undeniable, as higher prediction accuracy can increase selection responses. In this regard, epistasis models have shown to be capable of increasing the prediction accuracy while their high computational load is challenging. In this study, we investigated the predictive ability obtained in additive and epistasis models when utilizing haplotype blocks versus pruned sets of SNPs by including phenotypic information from the last growing season. This was done by considering a single biological trait in two growing seasons (2017 and 2018) as separate traits in a multi-trait model. Thus, bivariate variants of the Genomic Best Linear Unbiased Prediction (GBLUP) as an additive model, Epistatic Random Regression BLUP (ERRBLUP) and selective Epistatic Random Regression BLUP (sERRBLUP) as epistasis models were compared with respect to their prediction accuracies for the second year. The prediction accuracies of bivariate GBLUP, ERRBLUP and sERRBLUP were assessed with eight phenotypic traits for 471/402 doubled haploid lines in the European maize landrace Kemater Landmais Gelb/Petkuser Ferdinand Rot. The results indicate that the obtained prediction accuracies are similar when utilizing a pruned set of SNPs or haplotype blocks, while utilizing haplotype blocks reduces the computational load significantly compared to the pruned sets of SNPs. The number of interactions considered in the model was reduced from 323.5/456.4 million for the pruned SNP panel to 4.4/5.5 million in the haplotype block dataset for Kemater and Petkuser landraces, respectively. Since the computational load scales linearly with the number of parameters in the model, this leads to a reduction in computational time of 98.9% from 13.5 hours for the pruned set of markers to 9 minutes for the haplotype block dataset. We further investigated the impact of genomic correlation, phenotypic correlation and trait heritability as factors affecting the bivariate models' prediction accuracy, identifying the genomic correlation between years as the most influential one. As computational load is substantially reduced, while the accuracy of genomic prediction is unchanged, the here proposed framework to use haplotype blocks in sERRBLUP provided a solution for the practical implementation of sERRBLUP in real breeding programs. Furthermore, our results indicate that sERRBLUP is not only suitable for prediction across different locations, but also for the prediction across growing seasons.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Genome
  • Genomics / methods
  • Genotype
  • Haplotypes
  • Models, Genetic*
  • Phenotype
  • Plant Breeding*
  • Polymorphism, Single Nucleotide

Associated data

  • figshare/10.6084/m9.figshare.12137142

Grants and funding

This work was funded by the German Federal Ministry of Education and Research (BMBF) within the scope of the funding initiative “Plant Breeding Research for the Bioeconomy” (MAZE – “Accessing the genomic and functional diversity of maize to improve quantitative traits”; Funding ID: 031B0195).