Illuminating the Chemical Space of Untargeted Proteins

J Chem Inf Model. 2023 May 8;63(9):2689-2698. doi: 10.1021/acs.jcim.2c01364. Epub 2023 Apr 19.

Abstract

According to the Illuminating the Druggable Genome (IDG) initiative, 90% of the proteins encoded by the human genome still lack an identified active ligand, that is, a small molecule with biologically relevant binding potency or functional activity in an in vitro assay. Under this scenario, there is an urgent need for new approaches to chemically address these as yet untargeted proteins. It is widely recognized that the best starting point for generating novel small molecules for proteins is to exploit the expected polypharmacology of known active ligands across phylogenetically related proteins, following the paradigm that similar proteins are likely to interact with similar ligands. Here, we introduce a computational strategy to identify privileged structures that, when chemically expanded, are highly likely to contain active small molecules for untargeted proteins. The protocol was first tested on a set of 576 currently targeted proteins that had at least one protein family sibling with known active ligands the year before their own first active ligand was reported. A privileged structure contained in active ligands that were identified in the following years was correctly anticipated for 214 (37%) of those targeted proteins, a lower-bound estimate of recall given data completeness issues. When applied to a set of 1184 untargeted, potentially druggable genes in cancer, the identification of privileged structures from known bioactive ligands of protein family siblings allowed for extracting a priority list of diverse commercially available small molecules for 960 of them. Assuming a minimum success rate of 37%, the chemical library selections should be able to deliver active ligands for at least 355 currently untargeted proteins associated with cancer.
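The projection at the end of the abstract can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names are hypothetical and not taken from the paper, and it assumes that the figure of 355 comes from applying the rounded 37% recall to the 960 proteins for which a priority list was obtained.

```python
def recall_percent(hits: int, total: int) -> int:
    """Recall as a whole-number percentage, rounded to the nearest integer."""
    return round(100 * hits / total)


def expected_hits(percent: int, n_targets: int) -> int:
    """Floor of percent% of n_targets, using integer arithmetic."""
    return percent * n_targets // 100


# Figures quoted in the abstract.
recall = recall_percent(214, 576)      # retrospective test set
projected = expected_hits(recall, 960)  # untargeted proteins with a priority list

print(recall, projected)  # prints "37 355"
```

Under this assumption the numbers in the abstract are mutually consistent: 214 of 576 rounds to 37%, and 37% of 960 is 355.2, which floors to 355.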

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Humans
  • Ligands
  • Polypharmacology*
  • Proteins* / chemistry

Substances

  • Ligands
  • Proteins