Highlighting and trying to overcome a serious drawback with QSPR studies; data collection in different experimental conditions (mixed-QSPR)

J Comput Chem. 2012 Mar 15;33(7):732-47. doi: 10.1002/jcc.22892. Epub 2012 Jan 13.

Abstract

In quantitative structure-property relationship (QSPR) studies, every entry in a dataset must be measured under the same experimental conditions if the property is to be related to the structure alone. This requirement limits QSPR studies for two reasons: (1) physicochemical data obtained under the same experimental conditions are difficult to gather, and (2) the resulting model is useful only for predicting the physicochemical property under that specific experimental condition. In this article, we highlight this shortcoming of QSPR studies for a property that was measured under different experimental conditions. In addition, we reveal the inadequacies of correlating the fluorescence properties with a descriptor of the solvent. These defects are eventually removed by taking the solvent-solute interactions into account in the descriptor calculations. Quantum chemical calculations (HF/6-31G*) were carried out to optimize the geometries and calculate the structural descriptors. A genetic algorithm combined with multiple linear regression was used to construct the linear QSPR models. Because the quantum yield of fluorescence is described better by a nonlinear than by a linear relationship with the structural descriptors, a support vector machine was used to construct the nonlinear QSPR model. Analysis of the results demonstrated that the proposed models meet this goal.
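The descriptor-selection step described above can be illustrated with a minimal sketch of a genetic algorithm wrapped around multiple linear regression. It is not the authors' implementation: the data are synthetic, the names `loo_rmse` and `ga_select` are hypothetical, and the "interaction" columns are only a stand-in for descriptors computed with the solvent taken into account. The fitness function, population size, and mutation rate are likewise assumptions chosen for brevity.

```python
import numpy as np

def loo_rmse(X, y):
    """Leave-one-out RMSE of an ordinary least-squares model with intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    H = A @ np.linalg.pinv(A)              # hat (projection) matrix
    resid = y - H @ y
    press = resid / (1.0 - np.diag(H))     # leave-one-out residuals (PRESS identity)
    return float(np.sqrt(np.mean(press ** 2)))

def ga_select(X, y, n_pop=30, n_gen=40, p_mut=0.05, seed=0):
    """Toy genetic algorithm: evolve boolean masks over descriptor columns."""
    rng = np.random.default_rng(seed)
    n_desc = X.shape[1]
    pop = rng.random((n_pop, n_desc)) < 0.3          # random initial masks

    def fitness(mask):
        # An empty descriptor set is not a usable model.
        return loo_rmse(X[:, mask], y) if mask.any() else np.inf

    for _ in range(n_gen):
        scores = np.array([fitness(m) for m in pop])
        parents = pop[np.argsort(scores)[: n_pop // 2]]  # keep the better half
        children = []
        while len(children) < n_pop - len(parents):
            a, b = parents[rng.integers(len(parents), size=2)]
            cut = rng.integers(1, n_desc)                # one-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            child ^= rng.random(n_desc) < p_mut          # bit-flip mutation
            children.append(child)
        pop = np.vstack([parents, np.array(children)])

    scores = np.array([fitness(m) for m in pop])
    return pop[np.argmin(scores)], float(scores.min())

# Synthetic "mixed" dataset (assumption): each row is one solute measured in one solvent.
rng = np.random.default_rng(1)
n = 80
solute = rng.normal(size=(n, 6))         # hypothetical solute-only descriptors
solvent = rng.normal(size=(n, 1))        # hypothetical solvent descriptor
interaction = solute[:, :2] * solvent    # stand-in for solvent-aware descriptors
y = 1.5 * solute[:, 0] - 2.0 * interaction[:, 1] + 0.1 * rng.normal(size=n)

X_plain = solute                                     # structure only
X_mixed = np.hstack([solute, solvent, interaction])  # structure + solvent + interaction

mask_plain, rmse_plain = ga_select(X_plain, y)
mask_mixed, rmse_mixed = ga_select(X_mixed, y)
print("structure-only LOO RMSE:", rmse_plain)
print("solvent-aware LOO RMSE: ", rmse_mixed)
```

In this toy setting the structure-only model cannot account for the solvent-dependent part of the property, whereas the model that may select solvent-aware columns can. The nonlinear step mentioned in the abstract would replace the least-squares fit with a support vector regressor; that part is not sketched here.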