Advantages of Relative versus Absolute Data for the Development of Quantitative Structure-Activity Relationship Classification Models

Irene Luque Ruiz; Miguel Ángel Gómez-Nieto

doi:10.1021/acs.jcim.7b00492

J Chem Inf Model. 2017 Nov 27;57(11):2776-2788. doi: 10.1021/acs.jcim.7b00492. Epub 2017 Nov 8.

Authors

Irene Luque Ruiz¹, Miguel Ángel Gómez-Nieto¹

Affiliation

¹ Department of Computing and Numerical Analysis, University of Córdoba, Albert Einstein building, Campus de Rabanales, E-14071, Córdoba, Spain.

PMID: 29072460
DOI: 10.1021/acs.jcim.7b00492

Abstract

The appropriate selection of a chemical space represented by the data set, the selection of its chemical data representation, the development of a correct modeling process using a robust and reproducible algorithm, and the performance of an exhaustive training and external validation determine the usability and reproducibility of a quantitative structure-activity relationship (QSAR) classification model. In this paper, we show that the use of relative versus absolute data in the representation of the data sets produces better classification models when the other processes are not modified. Relative data considers a reference frame to measure the chemical characteristics involved in the classification model, refining the data set representation and smoothing the lack of chemical information. Three data sets with different characteristics have been used in this study, and classification models have been built applying the support vector machine algorithm. For randomly selected training and test sets, values of accuracy and area under the receiver operating characteristic curve close to 100% have been obtained for the generation of the models and external validations in all cases.
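As a rough illustration of the idea described in the abstract, the sketch below re-expresses absolute descriptor vectors relative to a reference frame and computes the two reported metrics, accuracy and area under the ROC curve. It is not the authors' procedure: the abstract does not specify how the reference frame is built, so using Euclidean distances to a set of reference molecules is an assumption, and the helper names (to_relative, accuracy, roc_auc) and the toy values are hypothetical.

```python
import math


def to_relative(rows, references):
    """Re-express each absolute descriptor vector as its Euclidean distances
    to a set of reference vectors (an assumed form of 'reference frame')."""
    return [[math.dist(row, ref) for ref in references] for row in rows]


def accuracy(y_true, y_pred):
    """Fraction of predicted labels that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive example scores higher
    than a random negative one; ties count as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: two descriptors per molecule.
absolute = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
# Hypothetical reference molecules defining the reference frame.
references = [[0.0, 0.0], [6.0, 8.0]]

# Each molecule is now described by its distance to every reference.
relative = to_relative(absolute, references)
```

The relative matrix could then be passed to any classifier in place of the absolute descriptors, for instance a support vector machine as used in the paper, with accuracy and roc_auc applied to its predictions on a held-out test set.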

MeSH terms

Antiprotozoal Agents / chemistry
Antiprotozoal Agents / pharmacology
Models, Theoretical*
Plasmodium falciparum / drug effects
Plasmodium falciparum / growth & development
Quantitative Structure-Activity Relationship*
Support Vector Machine

Substances

Antiprotozoal Agents