Deep Kernel for Genomic and Near Infrared Predictions in Multi-environment Breeding Trials

G3 (Bethesda). 2019 Sep 4;9(9):2913-2924. doi: 10.1534/g3.119.400493.

Abstract

Kernel methods are flexible and easy to interpret and have been successfully used in genomic-enabled prediction of various plant species. Kernel methods used in genomic prediction include the linear genomic best linear unbiased predictor (GBLUP or GB) kernel and the Gaussian kernel (GK). In general, these kernels have been used with two statistical models: single-environment and genomic × environment (GE) models. Recently, near-infrared spectroscopy (NIR) has been used as an inexpensive and non-destructive high-throughput phenotyping method for predicting unobserved line performance in plant breeding trials. In this study, we used a non-linear arc-cosine kernel (AK) that emulates deep learning artificial neural networks. We compared the prediction accuracy of AK with that of the GB and GK kernel methods in four genomic data sets, one of which also includes pedigree and NIR information. Results show that for all four data sets, the AK and GK kernels achieved higher prediction accuracy than the linear GB kernel for both the single-environment and the GE multi-environment models. In addition, AK achieved similar or slightly higher prediction accuracy than GK. For all data sets, the GE model achieved higher prediction accuracy than the single-environment model. For the data set that includes pedigree, markers and NIR, results show that the NIR wavelengths alone achieved lower prediction accuracy than the genomic information alone; however, pedigree plus NIR information achieved only slightly lower prediction accuracy than markers plus NIR high-throughput data.
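
To make the three kernels concrete, the following is a minimal sketch in plain Python, not the authors' implementation. The abstract does not state the kernel formulas, so the forms used here are assumptions: GB as the scaled cross-product XX'/p, GK as exp(-h d²/p) with a hypothetical bandwidth parameter h, and AK as the order-1 arc-cosine kernel of Cho and Saul, applied recursively to emulate hidden layers. The function names, the marker coding, and the toy matrix are illustrative only.

```python
import math


def dot(u, v):
    """Inner product of two equal-length marker vectors."""
    return sum(a * b for a, b in zip(u, v))


def linear_kernel(X):
    """Assumed GB form: G = X X' / p, where p is the number of markers."""
    p = len(X[0])
    return [[dot(xi, xj) / p for xj in X] for xi in X]


def gaussian_kernel(X, h=1.0):
    """Assumed GK form: exp(-h * d_ij^2 / p); h is a hypothetical bandwidth."""
    p = len(X[0])

    def sqdist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))

    return [[math.exp(-h * sqdist(xi, xj) / p) for xj in X] for xi in X]


def arc_cosine_kernel(X, layers=1):
    """Order-1 arc-cosine kernel (assumed AK form), iterated `layers` times.

    Each pass maps a kernel K to
        (1/pi) * sqrt(K_ii * K_jj) * (sin(t) + (pi - t) * cos(t)),
    where t is the angle implied by K_ij / sqrt(K_ii * K_jj).
    """
    n = len(X)
    K = [[dot(xi, xj) for xj in X] for xi in X]  # start from raw inner products
    for _ in range(layers):
        diag = [K[i][i] for i in range(n)]
        nxt = [[0.0] * n for _ in range(n)]
        for i in range(n):
            for j in range(n):
                norm = math.sqrt(diag[i] * diag[j])
                if norm == 0.0:
                    continue  # an all-zero line contributes nothing
                # Clamp to [-1, 1] to guard against rounding error before acos.
                cos_t = max(-1.0, min(1.0, K[i][j] / norm))
                t = math.acos(cos_t)
                nxt[i][j] = norm / math.pi * (math.sin(t) + (math.pi - t) * math.cos(t))
        K = nxt
    return K


# Toy marker matrix: 3 lines x 2 markers (illustrative values only).
X_demo = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
AK_demo = arc_cosine_kernel(X_demo, layers=1)
```

Under these assumptions, any of the three matrices could serve as the covariance structure among lines in a single-environment model. In the arc-cosine sketch, the number of passes plays the role of network depth, and each pass leaves the diagonal unchanged.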

Keywords: GenPred; Genomic Best Linear Unbiased Predictor (GBLUP, GB); linear and non-linear kernel methods; Genomic Prediction; Genomic based prediction; Shared Data Resources; deep learning; genomic × environment interaction model; near infrared (NIR) high-throughput phenotype; single-environment model.

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Databases, Genetic
  • Deep Learning
  • Genomics / methods*
  • Genomics / statistics & numerical data
  • Models, Genetic*
  • Phenotype
  • Plant Breeding / methods*
  • Spectrophotometry / methods*
  • Spectrophotometry / statistics & numerical data
  • Triticum / genetics
  • Zea mays / genetics