Building a Kokumi Database and Machine Learning-Based Prediction: A Systematic Computational Study on Kokumi Analysis

J Chem Inf Model. 2024 Apr 8;64(7):2670-2680. doi: 10.1021/acs.jcim.3c01728. Epub 2024 Jan 17.

Abstract

Kokumi is a subtle sensation characterized by a sense of fullness, continuity, and thickness. Traditional methods of taste discovery and analysis, including those for kokumi, are labor-intensive and costly, which has made computational methods critical strategies in molecular taste analysis and prediction. In this study, we undertook a comprehensive analysis, prediction, and screening of kokumi compounds. We categorized 285 kokumi compounds from a previously unreleased kokumi database into five groups based on their molecular characteristics. Moreover, we predicted kokumi/non-kokumi and multi-flavor compositions using six structure-taste relationship models: MLP-E3FP, MLP-PLIF, MLP-RDKFP, SVM-RDKFP, RF-RDKFP, and WeaveGNN (using atom and bond features). These six predictors exhibited diverse performance levels across the two prediction tasks. For kokumi/non-kokumi prediction, the WeaveGNN model showed an exceptional predictive AUC value (0.94), outperforming the other models (0.87, 0.90, 0.89, 0.92, and 0.78). For multi-flavor prediction, the MLP-E3FP model demonstrated higher predictive AUC and MCC values (0.94 and 0.74) than the others (0.73 and 0.33; 0.92 and 0.70; 0.95 and 0.73; 0.94 and 0.64; and 0.88 and 0.69). These results highlight the models' proficiency in accurately predicting kokumi molecules. Using these models, we identified kokumi-active compounds through a high-throughput screening of over 100 million molecules, further refined by toxicity and similarity screening. Lastly, we launched a web platform, KokumiPD (https://www.kokumipd.com/), offering a comprehensive kokumi database and online prediction services for users.
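The two metrics reported above, AUC and MCC, can be illustrated with a minimal Python sketch. It is not the authors' evaluation code. It assumes a binary kokumi/non-kokumi setting, whereas the multi-flavor task would need a multi-class variant. The labels, scores, the 0.5 decision threshold, and the function names `roc_auc` and `mcc` are all hypothetical and chosen only for illustration.

```python
from math import sqrt


def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (0 or 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Hypothetical toy data: 1 = kokumi, 0 = non-kokumi (not from the paper).
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]

# Assumed decision threshold of 0.5 to turn scores into class labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

auc_value = roc_auc(y_true, scores)
mcc_value = mcc(y_true, y_pred)
print(f"AUC = {auc_value:.3f}, MCC = {mcc_value:.3f}")
```

AUC depends only on how the scores rank positives against negatives, while MCC depends on the thresholded predictions. This is one reason a model can have a slightly higher AUC than another but a lower MCC.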

MeSH terms

  • Databases, Factual
  • Machine Learning*