BoostSweet: Learning molecular perceptual representations of sweeteners

Food Chem. 2022 Jul 30:383:132435. doi: 10.1016/j.foodchem.2022.132435. Epub 2022 Feb 12.

Abstract

The development of safe artificial sweeteners has attracted considerable interest in the food industry. Previous machine learning (ML) studies based on quantitative structure-activity relationships have provided some molecular principles for predicting sweetness, but these models can be improved through the chemical recognition of the factors responsible for sweetness activity. Our ML model, a soft-vote ensemble that includes a light gradient boosting machine and uses both layered fingerprints and alvaDesc molecular descriptors as features, demonstrates state-of-the-art performance, with an AUROC of 0.961. Based on an analysis of feature importance and of the dataset, we found that the number of nitrogen atoms that serve as hydrogen bond donors in a molecule can play an essential role in determining sweetness. These results potentially provide an advanced understanding of the relationship between molecular structure and sweetness, which can be used to design new sweeteners based on this structural dependence.
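
As a rough illustration of the two technical terms in the abstract, the sketch below shows soft voting (averaging the class probabilities predicted by several models) and AUROC (the fraction of positive/negative pairs that a score ranks correctly). It is not the authors' implementation: no LightGBM model, fingerprint, or alvaDesc descriptor is computed here, and the function names, labels, and probabilities are all hypothetical toy values chosen only to show the arithmetic.

```python
from statistics import mean


def soft_vote(prob_sets):
    """Average the positive-class probabilities of several models, sample by sample."""
    return [mean(sample_probs) for sample_probs in zip(*prob_sets)]


def auroc(labels, scores):
    """AUROC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy labels: 1 = sweet, 0 = not sweet.
labels = [1, 1, 1, 0, 0, 0]

# Hypothetical probabilities standing in for a model trained on fingerprint features.
p_fingerprint = [0.9, 0.4, 0.7, 0.5, 0.2, 0.1]
# Hypothetical probabilities standing in for a model trained on descriptor features.
p_descriptor = [0.8, 0.7, 0.3, 0.2, 0.4, 0.1]

# Soft vote: the ensemble score is the mean of the two probability vectors.
p_ensemble = soft_vote([p_fingerprint, p_descriptor])

print("fingerprint AUROC:", auroc(labels, p_fingerprint))
print("descriptor AUROC: ", auroc(labels, p_descriptor))
print("ensemble AUROC:   ", auroc(labels, p_ensemble))
```

In this toy case each stand-in model misranks one positive/negative pair, while their averaged score ranks every pair correctly, which is the kind of complementarity a soft-vote ensemble over different feature sets is meant to exploit. The values say nothing about the reported AUROC of 0.961.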

Keywords: Feature analysis; Ligand-binding approach; Machine learning; Quantitative structure-activity relationship; Sweetener prediction.

MeSH terms

  • Machine Learning
  • Quantitative Structure-Activity Relationship
  • Sweetening Agents* / chemistry
  • Taste*

Substances

  • Sweetening Agents