Statistical discrimination using different machine learning models reveals dissimilar key compounds of soybean leaves in targeted polyphenol-metric metabolomics in terms of traits and cultivation

Food Chem. 2023 Mar 15;404(Pt A):134454. doi: 10.1016/j.foodchem.2022.134454. Epub 2022 Sep 29.

Abstract

Soybean (SB) leaves (SLs) contain diverse flavonoids with health-promoting properties. To investigate the chemical constituents of SB and their correlations across phenotypes, growing periods, and environmental factors, targeted metabolomics was performed using a validated separation method with mass detection. Thirty-six polyphenols (1 coumestrol, 5 flavones, 18 flavonols, and 12 isoflavones) were identified in SLs, 31 of which were quantified. Machine learning (ML) modelling was used to discriminate SLs by variety, bean color, growing period, and cultivation area and to identify the key compounds responsible for these differences. Bootstrap forest modelling indicated that the isoflavone and flavonol profiles were influenced by the growing period and cultivation area. Among the ML models tested, the neural model showed the best predictive capacity for SL differences. Discriminant polyphenols can differ depending on the ML method applied; therefore, the outputs of statistical ML methods, including orthogonal partial least squares discriminant analysis, should be interpreted with caution.
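
The abstract's closing point, that the compounds flagged as discriminant depend on the method applied, can be illustrated with a minimal sketch. It is not the authors' workflow: the data are synthetic, the compound names (compound_A, compound_B, compound_C) are hypothetical, bagged decision stumps stand in as a stripped-down bootstrap forest, and a standardized mean difference stands in for a second scoring method. All function names are assumptions made for this example.

```python
import random
import statistics

NAMES = ["compound_A", "compound_B", "compound_C"]  # hypothetical compounds


def make_synthetic_leaves(n_per_class=60, seed=0):
    """Synthetic two-class 'polyphenol content' table; NOT data from the study."""
    rng = random.Random(seed)
    rows, labels = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            rows.append([
                rng.gauss(2.5 * label, 1.0),  # compound_A: strong class shift
                rng.gauss(0.8 * label, 1.0),  # compound_B: weak class shift
                rng.gauss(0.0, 1.0),          # compound_C: pure noise
            ])
            labels.append(label)
    return rows, labels


def best_stump_feature(rows, labels):
    """Index of the feature whose single-threshold split misclassifies least."""
    best_feature, best_errors = 0, len(rows) + 1
    for j in range(len(rows[0])):
        for threshold in sorted({r[j] for r in rows}):
            errors = sum((r[j] > threshold) != bool(y) for r, y in zip(rows, labels))
            errors = min(errors, len(rows) - errors)  # allow either split polarity
            if errors < best_errors:
                best_feature, best_errors = j, errors
    return best_feature


def bagged_stump_importance(rows, labels, n_trees=50, seed=1):
    """Fraction of bootstrap resamples in which each feature gives the best stump."""
    rng = random.Random(seed)
    n = len(rows)
    counts = [0] * len(rows[0])
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        counts[best_stump_feature([rows[i] for i in idx],
                                  [labels[i] for i in idx])] += 1
    return [c / n_trees for c in counts]


def effect_size_importance(rows, labels):
    """Absolute standardized mean difference per feature, normalized to sum to 1."""
    scores = []
    for j in range(len(rows[0])):
        g0 = [r[j] for r, y in zip(rows, labels) if y == 0]
        g1 = [r[j] for r, y in zip(rows, labels) if y == 1]
        pooled_sd = ((statistics.variance(g0) + statistics.variance(g1)) / 2) ** 0.5
        scores.append(abs(statistics.mean(g1) - statistics.mean(g0)) / pooled_sd)
    total = sum(scores)
    return [s / total for s in scores]


rows, labels = make_synthetic_leaves()
forest_imp = bagged_stump_importance(rows, labels)
effect_imp = effect_size_importance(rows, labels)
for name, f, e in zip(NAMES, forest_imp, effect_imp):
    print(f"{name}: bagged stumps={f:.2f}, effect size={e:.2f}")
```

The two scores are computed on the same table yet weight the compounds differently: the stump vote rewards only the single best splitter in each resample, whereas the effect size gives every shifted compound a graded score. A production analysis would more likely rely on an established library implementation of random forests and neural networks than on this hand-rolled stand-in.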

Keywords: Chemometrics; Machine learning; Multivariate analysis; Polyphenol; Soybean leaf; Targeted metabolomics.

MeSH terms

  • Fabaceae*
  • Flavonols
  • Glycine max
  • Isoflavones*
  • Machine Learning
  • Metabolomics / methods
  • Phenotype
  • Plant Leaves / chemistry
  • Polyphenols / analysis

Substances

  • Polyphenols
  • Isoflavones
  • Flavonols