Insights into modeling refractive index of ionic liquids using chemical structure-based machine learning methods

Sci Rep. 2023 Jul 24;13(1):11966. doi: 10.1038/s41598-023-39079-5.

Abstract

Ionic liquids (ILs) have drawn much attention due to their extensive applications and environment-friendly nature. Refractive index prediction is valuable for ILs quality control and property characterization. This paper aims to predict refractive indices of pure ILs and identify factors influencing refractive index changes. Six chemical structure-based machine learning models called eXtreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LightGBM), Categorical Boosting (CatBoost), Convolutional Neural Network (CNN), Adaptive Boosting-Decision Tree (Ada-DT), and Adaptive Boosting-Support Vector Machine (Ada-SVM) were developed to achieve this goal. An enormous dataset containing 6098 data points of 483 different ILs was exploited to train the machine learning models. Each data point's chemical substructures, temperature, and wavelength were considered for the models' inputs. Including wavelength as input is unprecedented among predictions done by machine learning methods. The results show that the best model was CatBoost, followed by XGBoost, LightGBM, Ada-DT, CNN, and Ada-SVM. The R2 and average absolute percent relative error (AAPRE) of the best model were 0.9973 and 0.0545, respectively. Comparing this study's models with the literature shows two advantages regarding the dataset's abundance and prediction accuracy. This study also reveals that the presence of the -F substructure in an ionic liquid has the most influence on its refractive index among all inputs. It was also found that the refractive index of imidazolium-based ILs increases with increasing alkyl chain length. In conclusion, chemical structure-based machine learning methods provide promising insights into predicting the refractive index of ILs in terms of accuracy and comprehensiveness.