Fuzzy QSARs for predicting logKoc of persistent organic pollutants

Chemosphere. 2004 Feb;54(6):771-6. doi: 10.1016/j.chemosphere.2003.08.023.

Abstract

Fuzzy regression methodology was employed in this study to develop a relationship for logKoc of persistent organic pollutants (POPs) using other property and molecular descriptors. Fuzzy regression is distinct from statistical regression and is used to characterize the imprecision arising from limited data and/or incomplete model descriptions. The study is based on the premise that statistically based QSARs do not fully account for all the sorbate-sorbent interactions pertinent to the partitioning of POPs, and as such these relationships have inherent fuzziness associated with them. A comparison between the statistical and fuzzy logKow-logKoc relationships indicated that the fuzzy regression model enveloped all scatter in the data and provided a tighter fit around the mid-point values (least-squares estimates). In addition, fuzzy regression was employed to characterize the imprecision associated with a three-parameter QSAR that employs molecular connectivity indices. A comparison between fuzzy and statistical regression analyses indicated that the fuzziness in this model was primarily associated with the characterization of local (atomic) scale interactions, while statistical randomness manifested at both local and global (molecular) scales. Experimental and estimation artifacts appear to have a greater impact on statistical regression than on fuzzy regression. However, the superiority of fuzzy regression seems to diminish with increasing correlation between the inputs and the output variable.
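The abstract does not give the fuzzy regression formulation, so the following is only a minimal sketch of one common variant (a Tanaka-style possibilistic linear regression), not the authors' method. It assumes a single descriptor (logKow), pins the centre line to the ordinary least-squares fit, and then finds the smallest non-negative spreads c0 and c1 such that the band centre ± (c0 + c1·|x|) envelops every data point. The function names, the level parameter h, and the data values are all hypothetical and chosen purely for illustration; the numbers are not measurements from the study.

```python
from itertools import combinations


def ols_line(xs, ys):
    """Ordinary least-squares intercept and slope for y = a0 + a1*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a1 = sxy / sxx
    return my - a1 * mx, a1


def fuzzy_spreads(xs, ys, a0, a1, h=0.0):
    """Smallest spreads (c0, c1) enveloping all points at level h.

    Solves the two-variable linear programme
        minimise   sum_i (c0 + c1*|x_i|)
        subject to c0 + c1*|x_i| >= |y_i - (a0 + a1*x_i)| / (1 - h),
                   c0 >= 0, c1 >= 0
    by enumerating the vertices of the feasible region (assumed approach,
    adequate for small data sets; requires 0 <= h < 1).
    """
    absx = [abs(x) for x in xs]
    need = [abs(y - (a0 + a1 * x)) / (1.0 - h) for x, y in zip(xs, ys)]
    # Each constraint is stored as (p, q, r), meaning p*c0 + q*c1 >= r.
    cons = [(1.0, ax, nd) for ax, nd in zip(absx, need)]
    cons += [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
    n, sum_absx = len(xs), sum(absx)
    best = None
    for (p1, q1, r1), (p2, q2, r2) in combinations(cons, 2):
        det = p1 * q2 - p2 * q1
        if abs(det) < 1e-12:
            continue  # parallel boundary lines, no vertex
        c0 = (r1 * q2 - r2 * q1) / det
        c1 = (p1 * r2 - p2 * r1) / det
        if all(p * c0 + q * c1 >= r - 1e-9 for p, q, r in cons):
            cost = n * c0 + sum_absx * c1
            if best is None or cost < best[0]:
                best = (cost, max(c0, 0.0), max(c1, 0.0))
    return best[1], best[2]


# Hypothetical (logKow, logKoc) pairs, for illustration only.
log_kow = [3.0, 3.8, 4.5, 5.2, 5.9, 6.4, 7.1]
log_koc = [2.6, 3.5, 3.9, 4.9, 5.1, 6.0, 6.3]

a0, a1 = ols_line(log_kow, log_koc)
c0, c1 = fuzzy_spreads(log_kow, log_koc, a0, a1, h=0.0)
for x, y in zip(log_kow, log_koc):
    centre, spread = a0 + a1 * x, c0 + c1 * abs(x)
    print(f"logKow={x:.1f}  logKoc={y:.1f}  "
          f"band=[{centre - spread:.2f}, {centre + spread:.2f}]")
```

In this sketch, raising h toward 1 widens the band, since each point must then lie within a smaller fraction of the spread. A fuller formulation would typically optimise the centre coefficients together with the spreads and would handle several descriptors, which calls for a general linear-programming solver rather than vertex enumeration.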

MeSH terms

  • Environmental Pollutants / analysis*
  • Environmental Pollution / analysis*
  • Environmental Pollution / statistics & numerical data
  • Forecasting
  • Fuzzy Logic
  • Organic Chemicals / analysis*
  • Quantitative Structure-Activity Relationship

Substances

  • Environmental Pollutants
  • Organic Chemicals