Random Forest Model with Combined Features: A Practical Approach to Predict Liquid-crystalline Property

Mol Inform. 2019 Apr;38(4):e1800095. doi: 10.1002/minf.201800095. Epub 2018 Dec 7.

Abstract

Quantitative structure-property relationship (QSPR) models were developed to predict the liquid-crystalline (LC) behavior of a large dataset of aromatic organic compounds using machine learning algorithms and different molecular descriptors. The aim of this study was to find appropriate models and descriptors for predicting a wide variety of LC behaviors. In addition, descriptor calculations based on LC structural templates were proposed to help understand structural effects on LC behavior. The results suggest that a random forest classifier with combined features, which include the structural-template descriptors, is suitable for LC behavior prediction. The best-performing prediction models showed high accuracy and F1 score (90 % and 93 %, respectively). Furthermore, the random forest handles large numbers of input features well, trains quickly, and is easy to tune, which makes it convenient for constructing LC prediction models. The prediction model therefore allows experimentalists to prioritize the synthesis of molecules predicted to exhibit the desired LC properties, accelerating the discovery of new LC materials.
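As a rough illustration of the workflow the abstract describes (combining descriptor sets, training a random forest classifier, and reporting accuracy and F1 score), a minimal Python sketch using scikit-learn might look as follows. Everything in it is an assumption made for illustration: the data are synthetic, the names generic_descriptors and template_descriptors are hypothetical stand-ins for the descriptor families, and the hyperparameters are arbitrary. It is not the authors' implementation and does not reproduce the reported results.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_samples = 400

# Hypothetical stand-ins for two descriptor families (synthetic values only):
# continuous general-purpose descriptors and binary structural-template flags.
generic_descriptors = rng.normal(size=(n_samples, 20))
template_descriptors = rng.integers(0, 2, size=(n_samples, 10)).astype(float)

# Synthetic binary label (1 = "shows LC behavior"), loosely tied to one
# feature from each family so the classifier has something to learn.
signal = generic_descriptors[:, 0] + 2.0 * template_descriptors[:, 0] - 1.0
is_lc = (signal + 0.3 * rng.normal(size=n_samples) > 0).astype(int)

# "Combined features": concatenate the descriptor blocks column-wise.
combined = np.hstack([generic_descriptors, template_descriptors])

# Hold out 25 % of the samples for evaluation, preserving class balance.
X_train, X_test, y_train, y_test = train_test_split(
    combined, is_lc, test_size=0.25, random_state=0, stratify=is_lc
)

# Arbitrary illustrative hyperparameters, not tuned values.
model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

pred = model.predict(X_test)
accuracy = accuracy_score(y_test, pred)
f1 = f1_score(y_test, pred)
print(f"accuracy={accuracy:.2f}, F1={f1:.2f}")
```

With real data, the two synthetic arrays would be replaced by computed descriptor matrices with one row per compound; the concatenation and evaluation steps would stay the same.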

Keywords: Cheminformatics; Liquid Crystals; QSPR; Random Forest.

MeSH terms

  • Algorithms
  • Liquid Crystals / chemistry*
  • Models, Chemical
  • Quantitative Structure-Activity Relationship