Comparison of Chemical Structure and Cell Morphology Information for Multitask Bioactivity Predictions

J Chem Inf Model. 2021 Mar 22;61(3):1444-1456. doi: 10.1021/acs.jcim.0c00864. Epub 2021 Mar 4.

Abstract

The understanding of the mechanism-of-action (MoA) of compounds and the prediction of potential drug targets play an important role in small-molecule drug discovery. The aim of this work was to compare chemical and cell morphology information for bioactivity prediction. The comparison was performed using bioactivity data from the ExCAPE database, image data (in the form of CellProfiler features) from the Cell Painting data set (the largest publicly available data set of cell images with ∼30,000 compound perturbations), and extended connectivity fingerprints (ECFPs) using the multitask Bayesian matrix factorization (BMF) approach Macau. We found that the BMF Macau and random forest (RF) performance were overall similar when ECFPs were used as compound descriptors. However, BMF Macau outperformed RF in 159 out of 224 targets (71%) when image data were used as compound information. Using BMF Macau, 100 (corresponding to about 45%) and 90 (about 40%) of the 224 targets were predicted with high predictive performance (AUC > 0.8) with ECFP data and image data as side information, respectively. There were targets better predicted by image data as side information, such as β-catenin, and others better predicted by fingerprint-based side information, such as proteins belonging to the G-protein-Coupled Receptor 1 family, which could be rationalized from the underlying data distributions in each descriptor domain. In conclusion, both cell morphology changes and chemical structure information contain information about compound bioactivity, which is also partially complementary, and can hence contribute to in silico MoA analysis.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Bayes Theorem
  • Computer Simulation
  • Databases, Factual
  • Drug Discovery*
  • Proteins*

Substances

  • Proteins