Data set modelability by QSAR

J Chem Inf Model. 2014 Jan 27;54(1):1-4. doi: 10.1021/ci400572x. Epub 2014 Jan 8.

Abstract

We introduce a simple MODelability Index (MODI) that estimates the feasibility of obtaining predictive QSAR models (correct classification rate above 0.7) for a binary data set of bioactive compounds. MODI is defined as an activity class-weighted ratio of the number of nearest-neighbor pairs of compounds with the same activity class to the total number of pairs. MODI values were calculated for more than 100 data sets, and a threshold of 0.65 was found to separate the nonmodelable from the modelable data sets.
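The definition above can be illustrated with a minimal sketch. It reflects one reading of "activity class-weighted": for each class, take the fraction of its compounds whose first nearest neighbor belongs to the same class, then average those fractions over the classes. The function name `modi`, its parameters, and the use of Euclidean distance on numeric descriptor vectors are assumptions made for illustration; the abstract does not specify the descriptors or the distance metric.

```python
from collections import defaultdict
from math import dist  # Euclidean distance, Python 3.8+


def modi(descriptors, labels):
    """Sketch of a modelability index (hypothetical helper, not the authors' code).

    descriptors: list of equal-length numeric tuples, one per compound.
    labels: list of activity classes (e.g. 0/1), one per compound.
    Returns the mean, over classes, of the fraction of compounds whose
    nearest neighbor (assumed Euclidean) shares their activity class.
    """
    n = len(descriptors)
    if n < 2 or n != len(labels):
        raise ValueError("need at least two compounds and one label per compound")

    same = defaultdict(int)   # per class: compounds whose nearest neighbor agrees
    total = defaultdict(int)  # per class: number of compounds

    for i in range(n):
        # First nearest neighbor of compound i, excluding itself.
        j = min(
            (k for k in range(n) if k != i),
            key=lambda k: dist(descriptors[i], descriptors[k]),
        )
        total[labels[i]] += 1
        if labels[j] == labels[i]:
            same[labels[i]] += 1

    # Averaging per-class fractions keeps a large class from dominating.
    return sum(same[c] / total[c] for c in total) / len(total)
```

For example, `modi([(0.0,), (0.1,), (0.3,), (1.0,), (1.1,)], [1, 1, 0, 0, 0])` averages a fraction of 2/2 for class 1 and 2/3 for class 0. Under the threshold reported in the abstract, a value above 0.65 would suggest that the data set is modelable. The brute-force neighbor search is quadratic in the number of compounds, which is acceptable for a sketch but not for large data sets.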

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Computational Biology
  • Databases, Chemical* / statistics & numerical data
  • Drug Design
  • Models, Chemical*
  • Quantitative Structure-Activity Relationship*