Comparing predictive ability of QSAR/QSPR models using 2D and 3D molecular representations

J Comput Aided Mol Des. 2021 Feb;35(2):179-193. doi: 10.1007/s10822-020-00361-7. Epub 2021 Jan 4.

Abstract

Quantitative structure-activity relationship (QSAR) and quantitative structure-property relationship (QSPR) models predict biological activities and molecular properties from the numerical relationship between chemical structures and activity (or property) values. Molecular representations are therefore central to QSAR/QSPR analysis. Representations based on the topological information of molecular structures (2D representations) are usually used for this purpose. However, conformational information is also likely to be important, because molecules exist in three-dimensional space. As a three-dimensional (3D) molecular representation applicable to diverse compounds, the similarity between a test molecule and a set of reference molecules has previously been proposed. This 3D representation was found to be effective in virtual screening for the early enrichment of active compounds. In this study, we introduced the 3D representation into QSAR/QSPR modeling (regression tasks). Furthermore, we investigated the relative merits of 3D representations over 2D representations in terms of the diversity of the training data sets. For predicting quantum mechanics-based properties, the 3D representations were superior to the 2D representations. For predicting the activity of small molecules against specific biological targets, no consistent difference in performance was observed between the two types of representations, irrespective of the diversity of the training data sets.
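
The reference-based representation described in the abstract can be illustrated with a minimal sketch: each molecule is encoded as a vector of its similarities to a fixed set of reference molecules. The abstract does not specify the similarity measure or the reference set, so everything below is an assumption made for illustration. In particular, the function names `tanimoto` and `similarity_descriptor` are hypothetical, and a Tanimoto coefficient on toy feature sets stands in for the 3D similarity used in the study.

```python
# Illustrative sketch only: a reference-based similarity descriptor.
# ASSUMPTION: molecules are modeled as sets of feature labels, and the
# Tanimoto coefficient is a stand-in for the 3D similarity measure,
# which the abstract does not specify.

def tanimoto(a, b):
    """Tanimoto (Jaccard) coefficient between two feature sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Two empty feature sets: define the similarity as 0.0.
        return 0.0
    return len(a & b) / len(union)


def similarity_descriptor(molecule, references, similarity=tanimoto):
    """Encode a molecule as its similarities to each reference molecule.

    The descriptor has one entry per reference, in the given order.
    """
    return [similarity(molecule, ref) for ref in references]


# Hypothetical toy data: two references and one test molecule.
references = [{"a", "b"}, {"b", "c", "d"}]
test_molecule = {"a", "b", "c"}
descriptor = similarity_descriptor(test_molecule, references)
print(descriptor)
```

The resulting fixed-length vectors could then be passed to any regression model as input features, with a 3D similarity function supplied through the `similarity` argument in place of the stand-in.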

Keywords: Molecular representations; Predictability of models; Quantitative structure–activity relationship; Quantitative structure–property relationship.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Databases, Factual
  • Drug Evaluation, Preclinical
  • Machine Learning
  • Models, Molecular
  • Molecular Conformation
  • Organic Chemicals / chemistry*
  • Quantitative Structure-Activity Relationship
  • Regression Analysis

Substances

  • Organic Chemicals