Beware of R²: Simple, Unambiguous Assessment of the Prediction Accuracy of QSAR and QSPR Models

D L J Alexander; A Tropsha; David A Winkler

doi:10.1021/acs.jcim.5b00206

J Chem Inf Model. 2015 Jul 27;55(7):1316-22. doi: 10.1021/acs.jcim.5b00206. Epub 2015 Jul 9.

Authors

D L J Alexander¹, A Tropsha², David A Winkler³ ⁴ ⁵ ⁶

Affiliations

¹ CSIRO Digital Productivity Flagship, Private Bag 10, Clayton South, VIC 3169, Australia.
² UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, United States.
³ CSIRO Manufacturing Flagship, Clayton, VIC 3168, Australia.
⁴ Monash Institute of Pharmaceutical Sciences, Parkville, VIC 3052, Australia.
⁵ Latrobe Institute for Molecular Science, Bundoora, VIC 3046, Australia.
⁶ School of Chemical and Physical Sciences, Flinders University, Bedford Park, SA 5042, Australia.

Abstract

The statistical metrics used to characterize the external predictivity of a model, i.e., how well it predicts the properties of an independent test set, have proliferated over the past decade. This paper clarifies some apparent confusion over the use of the coefficient of determination, R², as a measure of model fit and predictive power in QSAR and QSPR modeling. R² (or r²) has been used in various contexts in the literature in conjunction with training and test data for both ordinary linear regression and regression through the origin as well as with linear and nonlinear regression models. We analyze the widely adopted model fit criteria suggested by Golbraikh and Tropsha (J. Mol. Graphics Modell. 2002, 20, 269-276) in a strict statistical manner. Shortcomings in these criteria are identified, and a clearer and simpler alternative method to characterize model predictivity is provided. The intent is not to repeat the well-documented arguments for model validation using test data but rather to guide the application of R² as a model fit statistic. Examples are used to illustrate both correct and incorrect uses of R². Reporting the root-mean-square error or equivalent measures of dispersion, which are typically of more practical importance than R², is also encouraged, and important challenges in addressing the needs of different categories of users such as computational chemists, experimental scientists, and regulatory decision support specialists are outlined.
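As a rough illustration of why the choice of statistic matters, the sketch below computes three quantities for a hypothetical test set: the coefficient of determination in its standard form (one minus the ratio of residual to total sum of squares), the squared Pearson correlation between observed and predicted values, and the root-mean-square error. The abstract does not give the formulas used in the paper, so these definitions, the function names, and the toy data are all assumptions made for this example only.

```python
import math


def r_squared(y_obs, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (standard definition, assumed here)."""
    mean_obs = sum(y_obs) / len(y_obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(y_obs, y_pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in y_obs)
    return 1.0 - ss_res / ss_tot


def pearson_r_squared(y_obs, y_pred):
    """Squared Pearson correlation between observed and predicted values."""
    mean_obs = sum(y_obs) / len(y_obs)
    mean_pred = sum(y_pred) / len(y_pred)
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(y_obs, y_pred))
    var_obs = sum((o - mean_obs) ** 2 for o in y_obs)
    var_pred = sum((p - mean_pred) ** 2 for p in y_pred)
    return cov ** 2 / (var_obs * var_pred)


def rmse(y_obs, y_pred):
    """Root-mean-square error of the predictions."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(y_obs, y_pred)) / len(y_obs))


# Hypothetical test set: every prediction is offset from the observation by +2.
y_obs = [1.0, 2.0, 3.0, 4.0, 5.0]
y_pred = [3.0, 4.0, 5.0, 6.0, 7.0]

print("R^2 (1 - SS_res/SS_tot):", r_squared(y_obs, y_pred))
print("squared Pearson r:      ", pearson_r_squared(y_obs, y_pred))
print("RMSE:                   ", rmse(y_obs, y_pred))
```

In this toy case the predictions are perfectly correlated with the observations, so the squared Pearson correlation is 1, yet the constant offset gives a coefficient of determination of -1 and an RMSE of 2. Taking the squared correlation as evidence of predictive accuracy would therefore be misleading here, and the RMSE states the size of the error directly in the units of the property.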

MeSH terms

Quantitative Structure-Activity Relationship*
Regression Analysis
Statistics as Topic / methods*

Grants and funding

R01 GM096967/GM/NIGMS NIH HHS/United States