Molecular Similarity Perception Based on Machine-Learning Models

Int J Mol Sci. 2022 May 30;23(11):6114. doi: 10.3390/ijms23116114.

Abstract

Molecular similarity is a broad topic with implications in many areas of chemistry. It is rooted in the paradigm that 'similar molecules have similar properties'. For this reason, methods for determining molecular similarity are widely applied in pharmaceutical companies, e.g., in the study of structure-activity relationships. Similarity evaluation is also used in chemical legislation, specifically in the procedure for deciding whether a new molecule can be granted orphan drug status, with the consequent financial benefits. For this procedure, the European Medicines Agency relies on experts' judgments. Because the perception of similarity depends on the observer, models that reproduce human perception are useful. In this paper, we built models using both 2D fingerprints and 3D descriptors, i.e., molecular shape and pharmacophore descriptors. The proposed models were also evaluated on a newly constructed dataset of pairs of molecules, which was submitted to a group of experts for similarity judgment. The proposed machine-learning models can help reduce or assist human effort in future evaluations. For this reason, the new dataset of molecules and an online tool for molecular similarity estimation have been made freely available.
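
As a rough illustration of what a 2D fingerprint comparison involves, the sketch below computes a Tanimoto coefficient between two fingerprints represented as sets of on-bit indices. The abstract does not state which similarity measure or model inputs the authors used, so the choice of Tanimoto, the function name `tanimoto`, and the bit indices are all assumptions made for this example, not the paper's method.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto (Jaccard) coefficient between two fingerprints,
    each given as a collection of on-bit indices."""
    a, b = set(fp_a), set(fp_b)
    if not a and not b:
        # Assumed convention: two empty fingerprints count as identical.
        return 1.0
    shared = len(a & b)
    # Shared bits divided by the number of bits set in either fingerprint.
    return shared / (len(a) + len(b) - shared)


# Hypothetical on-bit indices for two molecules (illustrative values only).
mol_1 = {1, 4, 7, 9, 15}
mol_2 = {1, 4, 8, 9, 20, 33}

score = tanimoto(mol_1, mol_2)
print(f"Tanimoto similarity: {score:.3f}")
```

A score like this could serve as one input feature for a model trained on expert similarity judgments. In practice the fingerprints would come from a cheminformatics toolkit instead of hand-written sets.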

Keywords: chemical data set; machine learning; molecular similarity; similarity perception.

MeSH terms

  • Humans
  • Machine Learning*
  • Perception
  • Receptors, Drug*
  • Structure-Activity Relationship

Substances

  • Receptors, Drug