Dissecting Breeders' Sense via Explainable Machine Learning Approach: Application to Fruit Peelability and Hardness in Citrus

Front Plant Sci. 2022 Feb 10:13:832749. doi: 10.3389/fpls.2022.832749. eCollection 2022.

Abstract

"Genomics-assisted breeding", which utilizes genomics-based methods, e.g., genome-wide association study (GWAS) and genomic selection (GS), has been attracting attention, especially in the field of fruit breeding. Low-cost genotyping technologies that support genome-assisted breeding have already been established. However, efficient collection of large amounts of high-quality phenotypic data is essential for the success of such breeding. Most of the fruit quality traits have been sensorily and visually evaluated by professional breeders. However, the fruit morphological features that serve as the basis for such sensory and visual judgments are unclear. This makes it difficult to collect efficient phenotypic data on fruit quality traits using image analysis. In this study, we developed a method to automatically measure the morphological features of citrus fruits by the image analysis of cross-sectional images of citrus fruits. We applied explainable machine learning methods and Bayesian networks to determine the relationship between fruit morphological features and two sensorily evaluated fruit quality traits: easiness of peeling (Peeling) and fruit hardness (FruH). In each of all the methods applied in this study, the degradation area of the central core of the fruit was significantly and directly associated with both Peeling and FruH, while the seed area was significantly and directly related to FruH alone. The degradation area of albedo and the area of flavedo were also significantly and directly related to Peeling and FruH, respectively, except in one or two methods. These results suggest that an approach that combines explainable machine learning methods, Bayesian networks, and image analysis can be effective in dissecting the experienced sense of a breeder. In breeding programs, collecting fruit images and efficiently measuring and documenting fruit morphological features that are related to fruit quality traits may increase the size of data for the analysis and improvement of the accuracy of GWAS and GS on the quality traits of the citrus fruits.

Keywords: Bayesian network; Grad-CAM; breeding; citrus; deep learning; image analysis; machine learning.