Can machine learning methods accurately predict the molar absorption coefficient of different classes of dyes?

Spectrochim Acta A Mol Biomol Spectrosc. 2022 Oct 15:279:121442. doi: 10.1016/j.saa.2022.121442. Epub 2022 May 30.

Abstract

In this article, we provide a convenient tool that lets researchers predict the molar absorption coefficient of a wide range of dyes at no computational cost. The new model is based on the random forest regression (RFR) method with the ALogPS, OEstate + Fragmentor + QNPR descriptor sets and predicts the molar absorption coefficient with an accuracy (5-fold cross-validation RMSE) of 0.26 log units. This accuracy was achieved because the model was trained on data for more than 20,000 unique dye molecules. To our knowledge, this is the first model for predicting the molar absorption coefficient trained on such a large and diverse set of dyes. The model is available at https://ochem.eu/article/145413. We hope that the new model will allow researchers to identify dyes with practically significant spectral characteristics and to verify existing experimental data.
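To make the reported accuracy concrete, the following minimal sketch shows how an RMSE expressed in log units can be computed and read as a multiplicative error. It assumes the log unit is log10 of the molar absorption coefficient, which the abstract does not state explicitly. The function names and the sample values are hypothetical and are not taken from the article or from OCHEM.

```python
import math


def rmse_log10(predicted, observed):
    """RMSE between predicted and observed values on a log10 scale.

    Assumption: the coefficients are positive and compared as log10 values.
    """
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [
        (math.log10(p) - math.log10(o)) ** 2
        for p, o in zip(predicted, observed)
    ]
    return math.sqrt(sum(squared) / len(squared))


def fold_error(rmse_log_units):
    """Convert an error in log10 units into a multiplicative factor."""
    return 10 ** rmse_log_units


# Hypothetical molar absorption coefficients (M^-1 cm^-1), illustration only.
observed = [80000.0, 25000.0, 120000.0]
predicted = [60000.0, 40000.0, 150000.0]

print("RMSE (log10 units):", round(rmse_log10(predicted, observed), 3))
print("Factor for 0.26 log units:", round(fold_error(0.26), 2))
```

Under that assumption, an RMSE of 0.26 log units corresponds to a typical deviation of roughly a factor of 1.8 between predicted and measured coefficients.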

Keywords: BODIPY; Machine learning; Molar absorption coefficient; OCHEM; Random forests.

MeSH terms

  • Coloring Agents*
  • Machine Learning*

Substances

  • Coloring Agents