Estimation of physiological genomic estimated breeding values (PGEBV) combining full hyperspectral and marker data across environments for grain yield under combined heat and drought stress in tropical maize (Zea mays L.)

PLoS One. 2019 Mar 20;14(3):e0212200. doi: 10.1371/journal.pone.0212200. eCollection 2019.

Abstract

High-throughput phenotyping technologies lag behind modern marker technology, which impairs the use of secondary traits to increase genetic gains in plant breeding. We aimed to assess whether combining hyperspectral data with modern marker technology could improve across-location pre-harvest yield predictions using different statistical models. A maize bi-parental doubled haploid (DH) population of 97 lines, derived from F1, was evaluated in testcross combination under heat stress as well as combined heat and drought stress during the 2014 and 2016 summer seasons in Ciudad Obregon, Sonora, Mexico (27°20′ N, 109°54′ W, 38 m asl). Full hyperspectral data, indicative of crop physiological processes at the canopy level, were measured repeatedly throughout the grain filling period and related to grain yield. Partial least squares regression (PLSR), random forest (RF), ridge regression (RR) and Bayesian ridge regression (BayesB) were used to assess prediction accuracies for grain yield within environments (two-fold cross-validation) and across environments (leave-one-environment-out cross-validation) using molecular markers (M), hyperspectral data (H) and the combination of both (HM). The highest prediction accuracy for grain yield averaged across within- and across-location predictions (rGP) was obtained for BayesB, followed by RR, RF and PLSR. On average, the combined use of hyperspectral and molecular marker data as input factors gave higher prediction accuracies for grain yield than hyperspectral data or molecular marker data alone. The highest prediction accuracy for grain yield across environments was obtained for BayesB when molecular marker data and hyperspectral data were used together as input factors, while the highest within-environment prediction accuracy was obtained when BayesB was used with hyperspectral data. We discuss how the combined use of hyperspectral data with molecular marker technology could be used to introduce physiological genomic estimated breeding values (PGEBV) as a pre-harvest decision support tool to select genetically superior lines.
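
The across-environment validation scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses only ridge regression (RR) in a generic closed form, and all data are synthetic. The function names (`ridge_predict`, `leave_one_environment_out`), the penalty `lam`, the number of environments, the marker and band counts, and the simulated effects are assumptions made for illustration. Prediction accuracy is taken here as the Pearson correlation between observed and predicted yield in the held-out environment, computed for markers (M), hyperspectral data (H) and both combined (HM).

```python
import numpy as np


def ridge_predict(X_train, y_train, X_test, lam=1.0):
    """Fit ridge regression in its dual form and predict the test rows."""
    mu_x, mu_y = X_train.mean(axis=0), y_train.mean()
    Xc = X_train - mu_x
    # Dual form solves an n x n system, convenient when predictors outnumber plots.
    alpha = np.linalg.solve(Xc @ Xc.T + lam * np.eye(len(y_train)), y_train - mu_y)
    return (X_test - mu_x) @ (Xc.T @ alpha) + mu_y


def leave_one_environment_out(X, y, env, lam=1.0):
    """Pearson r between observed and predicted yield for each held-out environment."""
    accuracy = {}
    for e in np.unique(env):
        held_out = env == e
        pred = ridge_predict(X[~held_out], y[~held_out], X[held_out], lam)
        accuracy[int(e)] = float(np.corrcoef(y[held_out], pred)[0, 1])
    return accuracy


# Synthetic stand-in data (NOT the study's data): all sizes and effects are arbitrary.
rng = np.random.default_rng(0)
n_lines, n_env, n_markers, n_bands = 97, 4, 200, 50
markers = rng.choice([-1.0, 1.0], size=(n_lines, n_markers))  # DH lines are homozygous
g = markers @ rng.normal(size=n_markers)  # simulated genetic value per line
band_loadings = rng.normal(scale=0.1, size=n_bands)

M_rows, H_rows, y_rows, env_rows = [], [], [], []
for e in range(n_env):
    M_rows.append(markers)  # same genotypes in every environment
    # Simulated reflectance: a genetic signal per band plus environment-specific noise.
    H_rows.append(np.outer(g, band_loadings) + rng.normal(size=(n_lines, n_bands)))
    # Simulated yield: genetic value, plot noise and an environment mean shift.
    y_rows.append(g + rng.normal(scale=5.0, size=n_lines) + 10.0 * e)
    env_rows.append(np.full(n_lines, e))

M, H = np.vstack(M_rows), np.vstack(H_rows)
y, env = np.concatenate(y_rows), np.concatenate(env_rows)
inputs = {"M": M, "H": H, "HM": np.hstack([M, H])}

results = {name: leave_one_environment_out(X, y, env) for name, X in inputs.items()}
for name, acc in results.items():
    print(name, {e: round(r, 2) for e, r in acc.items()})
```

The within-environment two-fold cross-validation follows the same pattern, with the lines of a single environment split into two folds instead of holding out a whole environment. Accuracies from this simulation say nothing about the study's results.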

MeSH terms

  • Agriculture / methods*
  • Bayes Theorem
  • Biomarkers
  • Droughts
  • Edible Grain / genetics
  • Forecasting / methods
  • Genome, Plant / genetics
  • Genomics
  • Genotype
  • Heat-Shock Response / genetics*
  • Hot Temperature
  • Mexico
  • Models, Genetic
  • Phenotype
  • Plant Breeding / methods
  • Salt Tolerance / genetics
  • Selection, Genetic / genetics
  • Zea mays / genetics*

Substances

  • Biomarkers

Grants and funding

The authors received no specific funding for this work.